Save List Items and Files to Disk

Posted in SharePoint 2007, STSADM Commands

I’ve seen numerous examples of people needing to save all the files from a document library or custom list (containing attachments) to disk. I didn’t need this ability myself for the upgrade we are doing, but I did need a quick way to generate lots of different samples to make sure that my gl-addlistitem command was working correctly. So I decided to create a new command that would make my testing easier and also help the many people out there who need to save lots of files to disk.

The command I created is gl-exportlistitem2. I already had a gl-exportlistitem command which uses the deployment API, and I wasn’t feeling very creative with the name, so I just added "2" (maybe "savelistdata" is better?). The command does two key things: it saves all the files to a specified path, and it creates a Manifest.xml file that contains information about the files and any list items that were in the list. This information can then be used by the gl-addlistitem command to import the data into another list.

For this initial version I’ve kept things fairly simple: there’s no compression, no security information, and no version history. I’m only storing the file(s) (if present) and any field data (perhaps I’ll look to handle more data in the future, but for now this meets my needs). The nice thing is that if you don’t need any of the other information, then what I created actually works better than the deployment API, because mine takes folder location into account whereas the deployment API is extremely buggy when it comes to folders. I also included a simplified version of the code which simply dumps all the files to disk without the manifest information (the command does not use this, but I kept it in the source in case anyone needs it).

The code to do all of this is really straightforward. I broke it up into two chunks: the first gathers all the necessary data from the list and stores it in some custom data classes, and the second takes those classes, saves the files to disk, and creates the actual manifest file:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using Microsoft.SharePoint;

/// <summary>
/// Gets the item data.
/// </summary>
/// <param name="web">The web.</param>
/// <param name="list">The list.</param>
/// <param name="ids">The IDs of the items to export; an empty list exports every item.</param>
/// <returns>The data gathered for each matching item.</returns>
private static List<ItemInfo> GetItemData(SPWeb web, SPList list, List<int> ids)
{
    List<ItemInfo> itemData = new List<ItemInfo>();
    string rootUrl = list.RootFolder.Url;

    foreach (SPListItem item in list.Items)
    {
        // An empty ID list means "export everything"; otherwise skip unrequested items.
        if (ids.Count != 0 && !ids.Contains(item.ID))
            continue;

        ItemInfo info = new ItemInfo();
        itemData.Add(info);
        info.ID = item.ID;

        if (item.File != null)
        {
            // Document library item: the whole file is read into memory here.
            info.File = new FileDetails(item.File.OpenBinary(), item.File.Name, item.File.Author, item.File.TimeCreated);
            info.Title = item.File.Name;
        }
        else
            info.Title = item.Title;

        // Folder path relative to the list root: the part of item.Url between
        // the root folder URL and the leaf name.
        info.FolderUrl = item.Url.Substring(rootUrl.Length, item.Url.LastIndexOf("/") - rootUrl.Length);

        try
        {
            foreach (string fileName in item.Attachments)
            {
                SPFile file = web.GetFile(item.Attachments.UrlPrefix + fileName);
                info.Attachments.Add(new FileDetails(file.OpenBinary(), file.Name, file.Author, file.TimeCreated));
            }
        }
        catch (ArgumentException)
        {
            // Deliberately ignored: the item is exported without attachments.
        }

        foreach (SPField field in list.Fields)
        {
            // Skip read-only fields and the two fields that are handled separately above.
            if (field.ReadOnlyField ||
                field.InternalName == "Attachments" ||
                field.InternalName == "FileLeafRef")
                continue;

            object value = item[field.InternalName];
            if (value != null)
                info.FieldData.Add(field.InternalName, value.ToString());
        }
    }
    return itemData;
}

/// <summary>
/// Saves the item data to disk and writes the manifest file.
/// </summary>
/// <param name="itemData">The item data.</param>
/// <param name="manifestPath">The directory that receives Manifest.xml and the Data folder.</param>
private static void SaveItemData(List<ItemInfo> itemData, string manifestPath)
{
    if (string.IsNullOrEmpty(manifestPath))
        throw new ArgumentNullException("manifestPath", "No directory was specified for the manifest.");

    if (!Directory.Exists(manifestPath))
        Directory.CreateDirectory(manifestPath);

    string dataPath = Path.Combine(manifestPath, "Data");
    StringBuilder sb = new StringBuilder();

    // Disposing the writer flushes any buffered output into the StringBuilder.
    using (XmlTextWriter xmlWriter = new XmlTextWriter(new StringWriter(sb)))
    {
        xmlWriter.Formatting = Formatting.Indented;
        xmlWriter.WriteStartElement("Items");

        foreach (ItemInfo info in itemData)
        {
            // Mirror the list's folder structure under the Data folder.
            string itemFolder = Path.Combine(dataPath, info.FolderUrl.Trim('\\', '/')).Replace("/", "\\");

            xmlWriter.WriteStartElement("Item");

            if (info.File != null)
            {
                if (!Directory.Exists(itemFolder))
                    Directory.CreateDirectory(itemFolder);

                xmlWriter.WriteAttributeString("File", Path.Combine(itemFolder, info.File.Name));
                xmlWriter.WriteAttributeString("Author", info.File.Author.LoginName);
                xmlWriter.WriteAttributeString("CreatedDate", info.File.CreatedDate.ToString());
                File.WriteAllBytes(Path.Combine(itemFolder, info.File.Name), info.File.File);
            }
            xmlWriter.WriteAttributeString("LeafName", info.Title);
            xmlWriter.WriteAttributeString("FolderUrl", info.FolderUrl);

            xmlWriter.WriteStartElement("Fields");
            foreach (string key in info.FieldData.Keys)
            {
                xmlWriter.WriteStartElement("Field");
                xmlWriter.WriteAttributeString("Name", key);
                xmlWriter.WriteString(info.FieldData[key]);
                xmlWriter.WriteEndElement(); // Field
            }
            xmlWriter.WriteEndElement(); // Fields

            xmlWriter.WriteStartElement("Attachments");
            foreach (FileDetails file in info.Attachments)
            {
                // Attachments go in an "item_{ID}" sub-folder next to the item.
                string attachmentFolder = Path.Combine(itemFolder, "item_" + info.ID);
                if (!Directory.Exists(attachmentFolder))
                    Directory.CreateDirectory(attachmentFolder);

                xmlWriter.WriteElementString("Attachment", Path.Combine(attachmentFolder, file.Name));
                File.WriteAllBytes(Path.Combine(attachmentFolder, file.Name), file.File);
            }
            xmlWriter.WriteEndElement(); // Attachments

            xmlWriter.WriteEndElement(); // Item
        }

        xmlWriter.WriteEndElement(); // Items
    }

    File.WriteAllText(Path.Combine(manifestPath, "Manifest.xml"), sb.ToString());
}

#region Private Classes

// Holds the bytes and basic metadata of one file or attachment.
private class FileDetails
{
    public byte[] File = null;
    public string Name = null;
    public SPUser Author = null;
    public DateTime CreatedDate = DateTime.Now;
    public FileDetails(byte[] file, string name, SPUser author, DateTime createdDate)
    {
        File = file;
        Name = name;
        Author = author;
        CreatedDate = createdDate;
    }
}

// Holds everything exported for one list item.
private class ItemInfo
{
    public FileDetails File = null;
    public string FolderUrl = null;
    public List<FileDetails> Attachments = new List<FileDetails>();
    public Dictionary<string, string> FieldData = new Dictionary<string, string>();
    public int ID = -1;
    public string Title = null;
}
#endregion
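
One caveat with the two-step design: GetItemData reads every file into memory before SaveItemData writes anything, so a large library can run out of memory. What follows is only a minimal sketch of a streaming alternative, not the command's actual code. It uses plain .NET types in place of the SharePoint objects, and ExportItem, StreamingExporter, and SampleItems are hypothetical names invented for the illustration. The idea is that an iterator hands over one item at a time and each file is written before the next one is read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical stand-in for one exported file; not part of the original command.
public sealed class ExportItem
{
    public string FolderUrl;  // folder relative to the list root, e.g. "/Reports/2008"
    public string Name;       // file name
    public byte[] Content;    // file bytes, needed only while this item is written
}

public static class StreamingExporter
{
    // Writes each item to disk as soon as the iterator produces it, so the
    // exporter never holds more than one file's bytes at a time.
    public static int Export(IEnumerable<ExportItem> items, string manifestPath)
    {
        string dataPath = Path.Combine(manifestPath, "Data");
        Directory.CreateDirectory(dataPath);
        int count = 0;

        using (XmlWriter xml = XmlWriter.Create(Path.Combine(manifestPath, "Manifest.xml")))
        {
            xml.WriteStartElement("Items");
            foreach (ExportItem item in items)
            {
                // Mirror the folder structure under the Data folder.
                string relative = item.FolderUrl.Trim('\\', '/').Replace('/', Path.DirectorySeparatorChar);
                string folder = Path.Combine(dataPath, relative);
                Directory.CreateDirectory(folder);

                string target = Path.Combine(folder, item.Name);
                File.WriteAllBytes(target, item.Content);

                xml.WriteStartElement("Item");
                xml.WriteAttributeString("File", target);
                xml.WriteAttributeString("LeafName", item.Name);
                xml.WriteAttributeString("FolderUrl", item.FolderUrl);
                xml.WriteEndElement(); // Item
                count++;
            }
            xml.WriteEndElement(); // Items
        }
        return count;
    }

    // Lazily yields made-up items; a real version would enumerate the list
    // here and open each file only when the consumer asks for it.
    public static IEnumerable<ExportItem> SampleItems()
    {
        yield return new ExportItem { FolderUrl = "", Name = "a.txt", Content = Encoding.UTF8.GetBytes("first") };
        yield return new ExportItem { FolderUrl = "/Reports/2008", Name = "b.txt", Content = Encoding.UTF8.GetBytes("second") };
    }
}
```

Calling StreamingExporter.Export(StreamingExporter.SampleItems(), @"c:\documents") would write both sample files and a Manifest.xml. Field data and attachments are left out to keep the sketch short.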

The syntax of the command can be seen below:

C:\>stsadm -help gl-exportlistitem2

stsadm -o gl-exportlistitem2

Exports list items to disk (exported results can be used with addlistitem).

Parameters:
        -url <list view url to export from>
        -path <export path>
        [-id <list item ID (separate multiple items with a comma)>]

Here’s an example of how to export list items:

stsadm -o gl-exportlistitem2 -url "http://intranet/documents/forms/allitems.aspx" -path "c:\documents"

Note that a "Data" folder will be created under the path specified – all files will be put in this folder and the folder structure will mirror that of the list. The Manifest.xml file will be in the root of the folder specified. Attachments will be stored in sub-folders using the name "item_{ID}" where {ID} is the item ID. Once exported you could then use the gl-addlistitem command to import these items to another list:

stsadm -o gl-addlistitem -url "http://intranet/documents2/forms/allitems.aspx" -datafile "c:\documents\manifest.xml" -publish
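
For anyone who wants to consume the export without gl-addlistitem, the manifest is plain XML. Judging by SaveItemData, it has an Items root and one Item element per list item. Each Item has a Fields child holding Field elements keyed by a Name attribute, and an Attachments child holding Attachment paths. A minimal sketch of reading it with LINQ to XML might look like this. ManifestReader and its methods are hypothetical names, the sample values are made up, and this is not how gl-addlistitem itself is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; not part of the gl-* commands.
public static class ManifestReader
{
    // Returns one dictionary of field name -> value per Item element.
    public static List<Dictionary<string, string>> ReadFields(string manifestXml)
    {
        XDocument doc = XDocument.Parse(manifestXml);
        return doc.Root.Elements("Item")
            .Select(item => item.Elements("Fields").Elements("Field")
                .ToDictionary(f => (string)f.Attribute("Name"), f => f.Value))
            .ToList();
    }

    // Returns every attachment path recorded in the manifest.
    public static List<string> ReadAttachmentPaths(string manifestXml)
    {
        XDocument doc = XDocument.Parse(manifestXml);
        return doc.Root.Descendants("Attachment").Select(a => a.Value).ToList();
    }

    // Made-up manifest content shaped like the output of SaveItemData.
    public const string SampleManifest = @"<Items>
  <Item LeafName=""Task 1"" FolderUrl="""">
    <Fields>
      <Field Name=""Title"">Task 1</Field>
      <Field Name=""Status"">Open</Field>
    </Fields>
    <Attachments>
      <Attachment>c:\documents\Data\item_1\notes.txt</Attachment>
    </Attachments>
  </Item>
</Items>";
}
```

To read a real export, pass File.ReadAllText(@"c:\documents\Manifest.xml") instead of the sample string. This also gives you just the metadata without touching the files under the Data folder.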

Update 1/31/2008: I’ve modified this command so that it now also supports exporting web part pages. The resultant exported manifest file can be used in conjunction with the gl-addlistitem command so that web part pages can be properly imported using that command.

24 thoughts on “Save List Items and Files to Disk”

  1. Does this also allow the exportation of issues lists? Since you are not exporting versions as well, I would assume not.

    Now since I need to export issues lists, I have tried to just export the entire site, but of course, life cannot be that easy and consequently, for large sites, the export typically fails under the banner of something not containing unique something or another. My presumption is that something was allowed to duplicate while the site was underneath 2003.

  2. I haven’t tried it with an issues list but it may work (not sure if it uses history or folders (like discussions) to group and link information; I haven’t looked too closely). Did you try the exportlist/importlist commands with the retainobjectidentity flag? Also, I think SP1 may have fixed some issues regarding unique constraints (I seem to remember seeing something when I was looking through the release notes).

  3. Hi, Gary.

    First of all, I would just like to say I love what you’ve been doing, posting all this code and all. You’re a lifesaver!

    Anyways, I’m trying to use exportlistitem2 to export pages from the Pages library. But it fails on the web part pages. It gives me:

    WARNING: Cannot export web part. Make sure that the web part assembly is in the GAC and is registered as a safe control

    I’ve managed to debug this, and it only applies to my custom web part on the page. Funny thing is that my custom web part is indeed installed in the GAC and registered as a safe control. I also have a Telerik web part on the same page, and this gets exported just fine…

    Hope you can give me any clue as to what might be wrong.

    Once again, thanks a bunch!

    Frank

  4. Frank-Ove, the addlistitem and exportlistitem2 commands are something that I’ve been doing a lot of work on and they’re definitely not perfect (I just posted an update yesterday to address a minor issue with ListViewWebPart controls). With custom web parts it’s hard to say. Are you able to export the web part via the UI? Is the web part marked as exportable (shouldn’t matter but…)? Are you doing anything in the loading of your web part that is dependent on the existence of SPContext (shouldn’t matter if you’re using a more recent version of the code as I now export using the web part web service)?

  5. Hi again.
    Thank you for quick feedback!

    Yes, I am able to export my web part. But that aside, you were correct about me having code in the web part constructor. I had the following code:

    Guid pagesGuid = PublishingWeb.GetPagesListId(SPContext.Current.Web);
    string pagesName = PublishingWeb.GetPagesListName(SPContext.Current.Web);
    if (this.WebUrl.Equals(string.Empty) && this.ListGuid.Equals(string.Empty) && this.ListName.Equals(string.Empty))
    {
    this.WebUrl = SPContext.Current.Web.ServerRelativeUrl;
    this.ListGuid = pagesGuid.ToString();
    this.ListName = pagesName;
    }

    After commenting these lines out, and a new build and deployment, it worked.
    My web part is inheriting from the ContentByQueryWebPart. As you can see from my lines of code, I’m trying to point it to the current sites’ Pages library. Since this is not possible inside a .webpart file, I thought I might do it here.

    Frank

  6. Not sure what version of the code you’re using (I really should probably start versioning this stuff) but you may want to try the latest – I modified the code at one point so that it uses the built in web services to get the exported web part xml so as to get around the specific issue of web parts requiring SPContext.

  7. Sorry, forgot to mention in the previous post.
    I downloaded the latest version this morning (Friday).

    One other thing I ran into: when exporting from the Pages library, it’s not a given that all the items are based on a Page Layout. The page newsarchive.aspx created via the site definition SPSNHOME is one example of this. You should add a check for this case in exportlistitem2, and consequently also in addlistitem.

    Frank

  8. Is it possible to export pictures? I tried this but:
    Progress: Getting item data for item ‘7101’
    Could not load file or assembly ‘Microsoft.SharePoint.Publishing, Version=12.0.0
    .0, Culture=neutral, PublicKeyToken=71e9bce111e9429c’ or one of its dependencies
    . The system cannot find the file specified.

  9. My guess is that you are using WSS and not MOSS? I’ve only tested this command against MOSS. I have some things in there which try to address issues with publishing sites and I haven’t done testing to add the necessary error handling when not working with MOSS.

  10. Hi Gary!

    I’d just like to know if we can use it with Sharepoint 2001 Portal Server.
    In fact I have to do a migration from Sharepoint 2001 to a filesystem. I have about 7500 documents and 1000 folders to move with their metadata in a file.

    Thanks!
    Romain.

  11. I am trying to export all 3000+ documents from a document library, and I got the following error:
    Exception of type ‘System.OutOfMemoryException’ was thrown.

    The last message before this error was thrown:
    Progress: Getting item data for item ‘1534’

    Any idea? (The server has 16GB physical memory). Thanks!

  12. Yeah – it’s because I didn’t really code it very well – I originally built this to solve a quick issue and kept building on it without refactoring. Problem is that it suffers from one fundamental flaw – I store everything in memory before saving to disk so if you have a lot of stuff it will eventually run out of memory. I’d suggest you look at the gl-exportlistitem (or gl-exportlist) commands – I hope to one day rework this one but I’ve not yet had a need so haven’t really worried about it.

  13. Thanks, Gary. gl-exportlistitem works, but I like gl-exportlistitem2 as it meets my needs better. Look forward to your modified version of gl-exportlistitem2.

    I wanted to let you know that your STSADM custom extensions have been great help to me. Super work!

  14. I’ll move my post to the proper thread.

    Is there any way to just extract the metadata? I am not concerned with the files themselves, just the metadata associated with them.

  15. I installed the stsadm commands and powershell cmdlets for SharePoint 2010 and I can’t find the gl-exportlistitem2 command. Is it not available for 2010?

  16. Wouldn’t it be better to run a CAML or Linq query, or to use the SPList.GetItemById() method, than to iterate through the entire list? For large lists, it seems to me that this could be a problem.

    1. If you were exporting just a single item or a sub-set of items – but as I’m exporting all items iterating the list is the only option. That said, the way I’m storing the items in memory before saving the lot to disk is a really bad design and will cause problems for large libraries – I would most definitely do this differently were I to do it again (in fact I have done it again and in a much, much better way but that code is for 2010 and will not be open source as I may choose to either sell it directly or include it in a potential pay version of my open source product).

        1. Forgot about that. Yeah, that was something I threw in at one point to allow you to pass in a set of IDs. So, yes, for this condition it would be better to use the GetItemByID approach rather than do what I’m doing here – downfall of adding functionality to support a condition with no real time to do it properly :).
