In my last post I mentioned a project that required me to move documents from one list to another list in a different farm, one folder at a time. Along with that came a requirement to set various field values (metadata) based on patterns in the folder name and/or filename. I needed a reasonably flexible way to accomplish this, considering that the client didn’t actually have a clue as to what they really wanted the rules to be. I already had a command (gl-replacefieldvalues) which let me set the value of an existing field, but it didn’t allow me to do it based on the values of other fields and there was no real filtering capability. So I built a new command called gl-setmetadata which allows me to pass in an XML file containing various rules.
There’s really not much to the code – the bulk of it is just parsing the XML and figuring out what to do. There are two core methods. The first, ProcessFolder, is responsible for getting the collection of items that should be processed using the provided rules. This is done by using an SPQuery object and passing in the Query XML node if present. The second method, ApplyRule, is called by ProcessFolder once per item for each Rule node found in the XML, and it is responsible for setting any field data based on the rules.
public SetMetaData()
{
    SPParamCollection parameters = new SPParamCollection();
    parameters.Add(new SPParam("url", "url", true, null, new SPNonEmptyValidator(), "Please specify the url to search."));
    parameters.Add(new SPParam("quiet", "q"));
    parameters.Add(new SPParam("test", "t"));
    parameters.Add(new SPParam("inputfile", "input", true, null, new SPFileExistsValidator()));
    parameters.Add(new SPParam("logfile", "log", false, null, new SPDirectoryExistsAndValidFileNameValidator()));
    parameters.Add(new SPParam("recursefolders", "recurse"));

    StringBuilder sb = new StringBuilder();
    sb.Append("\r\n\r\nUpdates list field values based on the rules defined in the provided input file. Use -test to verify your updates before executing.\r\n\r\nParameters:");
    sb.Append("\r\n\t-url <list folder url>");
    sb.Append("\r\n\t-inputfile <input file containing meta data rules>");
    sb.Append("\r\n\t[-recursefolders]");
    sb.Append("\r\n\t[-quiet]");
    sb.Append("\r\n\t[-test]");
    sb.Append("\r\n\t[-logfile <log file>]");

    Init(parameters, sb.ToString());
}

/// <summary>
/// Gets the help message.
/// </summary>
/// <param name="command">The command.</param>
/// <returns></returns>
public override string GetHelpMessage(string command)
{
    return HelpMessage;
}

/// <summary>
/// Runs the specified command.
/// </summary>
/// <param name="command">The command.</param>
/// <param name="keyValues">The key values.</param>
/// <param name="output">The output.</param>
/// <returns></returns>
public override int Execute(string command, StringDictionary keyValues, out string output)
{
    output = string.Empty;

    string url = Params["url"].Value.TrimEnd('/');
    bool quiet = Params["quiet"].UserTypedIn;
    bool testMode = Params["test"].UserTypedIn;
    string logFile = Params["logfile"].Value;
    XmlDocument metaDataDoc = new XmlDocument();
    string inputFile = Params["inputfile"].Value;
    bool recurseFolders = Params["recursefolders"].UserTypedIn;

    Verbose = !quiet;
    LogFile = logFile;

    metaDataDoc.Load(inputFile);

    using (SPSite site = new SPSite(url))
    using (SPWeb web = site.OpenWeb())
    {
        SPFolder folder = web.GetFolder(url);

        // The null check is unnecessary but it makes me feel better. It has to come
        // first, otherwise folder.Exists would throw before the check is reached.
        if (folder == null || !folder.Exists)
            throw new SPException("The specified list folder was not found.");

        SPList list = null;
        try
        {
            list = web.Lists[folder.ParentListId];
        }
        catch (ArgumentException)
        {
            // Leave list as null; the check below reports the error.
        }
        if (list == null) // This should never happen if we found a folder but again, it makes me feel better having it.
            throw new SPException("The specified list was not found.");

        // Process the folder.
        ProcessFolder(folder, list, metaDataDoc, recurseFolders, testMode);
    }
    return OUTPUT_SUCCESS;
}

/// <summary>
/// Processes the folder.
/// </summary>
/// <param name="folder">The folder.</param>
/// <param name="list">The list.</param>
/// <param name="metaDataDoc">The meta data doc.</param>
/// <param name="recurseFolders">if set to <c>true</c> [recurse folders].</param>
/// <param name="testMode">if set to <c>true</c> [test mode].</param>
private static void ProcessFolder(SPFolder folder, SPList list, XmlDocument metaDataDoc, bool recurseFolders, bool testMode)
{
    // If we don't have any rules to process then there's no sense continuing so error out.
    if (metaDataDoc.SelectNodes("//Rule").Count == 0)
        throw new SPException("Missing \"Rule\" node(s) which should be a child of the root \"MetaData\" node.");

    // Get a namespace manager so that we can retrieve the Query element if present.
    XmlNamespaceManager nsManager = new XmlNamespaceManager(metaDataDoc.NameTable);
    nsManager.AddNamespace("sp", "http://schemas.microsoft.com/sharepoint/");

    // Look for a Query element
    XmlElement queryElement = (XmlElement)metaDataDoc.SelectSingleNode("//sp:Query", nsManager);
    SPQuery query = new SPQuery();
    if (recurseFolders)
        query.ViewAttributes = "Scope=\"Recursive\"";
    // Set the root folder to query
    query.Folder = folder;
    if (queryElement != null)
    {
        // We have a query element so do an initial filtering using the provided filter
        query.Query = queryElement.OuterXml;
    }
    // If the user didn't provide a Query element this is an empty query (no filtering)
    SPListItemCollection items = list.GetItems(query);

    Log("Beginning processing of {0} items...", items.Count.ToString());
    int modificationCount = 0;

    for (int i = 0; i < items.Count; i++)
    {
        SPListItem item = items[i];
        Log("Progress: Processing item {0}: {1}\r\n", item.ID.ToString(), item["ServerUrl"].ToString());

        if (item.FileSystemObjectType == SPFileSystemObjectType.Folder)
        {
            // Currently not handling folders - no particular reason, I just don't need this ability.
            // Commenting out this block will not hurt anything.
            Log("Progress: Item {0} is a folder - skipping.", item.ID.ToString());
            continue;
        }

        bool modified = false;

        // Loop through each rule element and apply the rule's changes
        foreach (XmlElement ruleElement in metaDataDoc.SelectNodes("//Rule"))
        {
            if (ApplyRule(item, ruleElement))
                modified = true;
        }

        if (modified)
        {
            // The rules resulted in modified data so update the item if not in test mode.
            if (!testMode)
                item.SystemUpdate();
            modificationCount++;
            Log("Progress: Item ID {0} was modified.", item.ID.ToString());
        }
        else
        {
            // There were no modifications made
            Log("Progress: Item ID {0} was NOT modified.", item.ID.ToString());
        }

        Log("Progress: Finished Processing item {0}\r\n\r\n", item.ID.ToString());
    }
    Log("Finished processing items. {0} out of {1} items were modified.\r\n", modificationCount.ToString(), items.Count.ToString());
}

/// <summary>
/// Applies the rule.
/// </summary>
/// <param name="item">The item.</param>
/// <param name="ruleElement">The rule element.</param>
/// <returns></returns>
private static bool ApplyRule(SPListItem item, XmlElement ruleElement)
{
    bool modified = false;
    string ruleName = ruleElement.GetAttribute("Name");

    XmlElement matchElement = (XmlElement)ruleElement.SelectSingleNode("Match");
    bool isMatch = true;

    // The match element is optional and just provides some additional regular expression filtering beyond what the Query element can provide
    if (matchElement != null)
    {
        bool isAnd = true;
        if (matchElement.HasAttribute("Op"))
            isAnd = matchElement.GetAttribute("Op").ToLowerInvariant() == "and";
        // For "And" operations we default our starter item to true as everything must come back as true to be a match
        // For "Or" operations we default our starter item to false as we only need one item to come back as true to
        // be a match and we don't want that one item to be the starter item.
        bool fieldMatches = isAnd;

        // If we have a Match element then we need at least one Field element otherwise what's the point.
        if (matchElement.SelectNodes("Field").Count == 0)
            throw new SPException("Missing \"Field\" node(s) which should be a child of the \"Match\" node.");

        foreach (XmlElement fieldElement in matchElement.SelectNodes("Field"))
        {
            // The Field element needs a Name attribute and a value to use as the search pattern string
            if (!fieldElement.HasAttribute("Name"))
                throw new SPException("Missing \"Name\" attribute of \"Field\" node.");
            if (string.IsNullOrEmpty(fieldElement.InnerText.Trim()))
                throw new SPException(string.Format("Missing search pattern string value for match field '{0}'", fieldElement.GetAttribute("Name")));

            // We use the internal name for all field names
            SPField field = item.Fields.GetFieldByInternalName(fieldElement.GetAttribute("Name"));

            // Treat a null field value as an empty string so that ToString() cannot throw.
            object matchValue = item[field.Id];
            string matchText = matchValue == null ? string.Empty : matchValue.ToString();

            // Determine if we have a match for this field.
            bool fieldMatch = Regex.IsMatch(matchText, fieldElement.InnerText);

            // Apply the match results to our fieldMatches variable to track the overall result
            if (isAnd)
                fieldMatches = fieldMatches && fieldMatch;
            else
                fieldMatches = fieldMatches || fieldMatch;
        }
        // Set the overall result
        isMatch = fieldMatches;
    }
    if (!isMatch)
    {
        Log("Progress: Unable to find match for rule '{0}'.", ruleName);
        return modified; // No match so evaluate the next rule
    }
    else
        Log("Progress: Found match for rule '{0}'.", ruleName);

    // Every Rule element must have one and only one Set element
    XmlElement setElement = (XmlElement)ruleElement.SelectSingleNode("Set");
    if (setElement == null)
        throw new SPException("Missing \"Set\" node.");

    // Every Set element must have at least one Field element
    if (setElement.SelectNodes("Field").Count == 0)
        throw new SPException("Missing \"Field\" node(s) which should be a child of the \"Set\" node.");

    // Loop through all the Field elements and apply the indicated values
    foreach (XmlElement fieldElement in setElement.SelectNodes("Field"))
    {
        // Every Field element must have a Name attribute - the value can be empty which is the same as setting the field to null.
        if (!fieldElement.HasAttribute("Name"))
            throw new SPException("Missing \"Name\" attribute of \"Field\" node.");

        string fieldName = fieldElement.GetAttribute("Name");
        string fieldData = fieldElement.InnerText;
        SPField field = item.Fields.GetFieldByInternalName(fieldName);

        if (field.ReadOnlyField)
        {
            // We can't update read-only fields so log a warning and move on.
            Log("WARNING: Field '{0}' is read only and will not be updated.", EventLogEntryType.Warning, field.InternalName);
            continue;
        }

        if (field.Type == SPFieldType.Computed)
        {
            // We can't update computed fields so log a warning and move on.
            Log("WARNING: Field '{0}' is a computed column and will not be updated.", EventLogEntryType.Warning, field.InternalName);
            continue;
        }

        // Read the current value once; it is null when the field has no value.
        object currentValue = item[field.Id];
        string currentText = currentValue == null ? null : currentValue.ToString();

        // If a SearchPattern attribute was provided then do a regular expression replace instead of just a straight up set.
        if (fieldElement.HasAttribute("SearchPattern"))
        {
            if (string.IsNullOrEmpty(fieldElement.GetAttribute("SearchPattern")))
                throw new SPException(string.Format("SearchPattern attribute of Field node '{0}' is empty.", fieldName));

            if (currentText == null)
            {
                // We can't do a regex on a null value so move on
                Log("Progress: Value of field '{0}' is 'null' - no replace operation will be performed.", field.InternalName);
                continue;
            }
            fieldData = Regex.Replace(currentText, fieldElement.GetAttribute("SearchPattern"), fieldData);
        }
        // If the fieldData is empty then make sure it's set to null
        if (string.IsNullOrEmpty(fieldData))
            fieldData = null;

        // Comparing the strings directly also treats "null to null" as no change.
        if (currentText != fieldData)
        {
            // The modified field data is different from the source so go ahead and apply the change
            Log("Progress: Applying modification to field '{0}' per rule '{1}'", fieldName, ruleName);
            if (field.Type == SPFieldType.URL)
                item[field.Id] = new SPFieldUrlValue(fieldData);
            else
                item[field.Id] = fieldData;

            modified = true;
        }
        else
        {
            Log("Progress: No change required for field '{0}' per rule '{1}'.", fieldName, ruleName);
        }
    }
    if (!modified)
        Log("Progress: Set rules resulted in no change from existing data for rule '{0}'.", ruleName);

    return modified;
}
The core thing to understand with this command is the structure of the input file, and this is where things get a little more complicated. I don’t currently have an XSD for this (I may create one to aid in validation but I just didn’t have the time). So, failing a good XSD, here’s a reasonably detailed example XML file with comments:
<MetaData>
    <!-- Query is an optional CAML element and is used to filter the items that are to be considered. Anything you can do with a standard CAML Query element you can put here (be sure to include the namespace attribute) -->
    <Query xmlns="http://schemas.microsoft.com/sharepoint/">
        <Where>
            <BeginsWith>
                <FieldRef Name="FileRef" />
                <Value Type="string">/Documents/Sub-Folder1/</Value>
            </BeginsWith>
        </Where>
    </Query>
    <!-- There must be at least one Rule element - multiple elements are processed in the order they appear -->
    <!-- The Rule element may contain an optional Name attribute which is a simple label used for logging -->
    <Rule Name="Set Content Type">
        <!-- Every Rule element must have one and only one Set element -->
        <Set>
            <!-- The Set element must contain one or more Field elements -->
            <!-- The Field element must have a Name attribute which corresponds to the field's internal name -->
            <!-- The value of the Field element is what will be set to the list item for that field -->
            <!-- A Field element may contain an optional SearchPattern attribute which can be used to update an existing value via a Regex.Replace() call -->
            <!-- If no SearchPattern attribute is present then existing data is ignored -->
            <Field Name="ContentType">Dublin Core Columns</Field>
        </Set>
    </Rule>
    <Rule Name="Set English Language">
        <!-- A Rule element can contain one optional Match element which is used to provide regular expression based filtering -->
        <!-- The Match element can contain an optional Op attribute used to indicate whether the match logic is "AND" or "OR" (default is "AND" if not present) -->
        <Match Op="OR">
            <!-- The Field element must have a Name attribute which corresponds to the field's internal name -->
            <!-- The value of the Field element is used in a Regex.IsMatch() call to determine whether the item should be processed -->
            <Field Name="FileLeafRef">(?i:.* Eng.*|.*ENGLISH ONLY.*|.*-EN.*)</Field>
            <Field Name="Title">(?i:.* Eng.*|.*ENGLISH ONLY.*|.*-EN.*)</Field>
        </Match>
        <Set>
            <Field Name="FileLeafRef" SearchPattern="(?i: -?Eng|ENGLISH ONLY)|-EN">-English</Field>
            <Field Name="Language">English</Field>
        </Set>
    </Rule>
    <Rule Name="Set Korean Language">
        <Match Op="And">
            <Field Name="FileLeafRef">(?i:.* Kor.*|.*KOREAN ONLY.*|.*-KO.*)</Field>
        </Match>
        <Set>
            <Field Name="FileLeafRef" SearchPattern="(?i: -?Kor|KOREAN ONLY)|-KO">-Korean</Field>
            <Field Name="Language">Korean</Field>
        </Set>
    </Rule>
</MetaData>
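The namespace attribute on the Query element matters more than it might appear, because of how the command looks the elements up: Query is found through the SharePoint namespace, while Rule is found with no namespace at all. The following is a minimal sketch of those two lookups using only System.Xml, so it runs without SharePoint. The class name, method names and sample XML strings are made up for illustration; only the two XPath expressions and the namespace URI come from the command's code.

```csharp
using System;
using System.Xml;

public static class QueryLookupSketch
{
    // Same lookup the command performs: Query must be in the SharePoint namespace.
    public static bool HasQuery(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("sp", "http://schemas.microsoft.com/sharepoint/");
        return doc.SelectSingleNode("//sp:Query", ns) != null;
    }

    // Same lookup the command performs: Rule must NOT be in any namespace.
    public static int CountRules(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("//Rule").Count;
    }

    public static void Main()
    {
        string good = "<MetaData><Query xmlns='http://schemas.microsoft.com/sharepoint/'><Where /></Query><Rule Name='r' /></MetaData>";
        string noNamespace = "<MetaData><Query><Where /></Query><Rule Name='r' /></MetaData>";
        string namespaceOnRoot = "<MetaData xmlns='http://schemas.microsoft.com/sharepoint/'><Query><Where /></Query><Rule Name='r' /></MetaData>";

        Console.WriteLine(HasQuery(good));               // True
        Console.WriteLine(CountRules(good));             // 1
        Console.WriteLine(HasQuery(noNamespace));        // False: the Query would be silently ignored
        Console.WriteLine(CountRules(namespaceOnRoot));  // 0: the command would report missing Rule nodes
    }
}
```

In other words, if the namespace attribute is left off the Query element the command does not complain; it simply processes every item in the folder. Putting the namespace on the root MetaData element instead pulls the Rule elements into that namespace, and the command then fails with the missing "Rule" error.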
Note that I don’t claim to be a regular expression expert. I’ve not extensively tested the regular expressions in the examples above and I know that there are issues with them for more complex data, but for the purpose of a simple demonstration they do well enough. The example above will return all documents in the folder “/documents/sub-folder1” and will set the content type of every item to “Dublin Core Columns”. It will then standardize the name of the file (FileLeafRef) so that it only contains *-English or *-Korean using information in the filename, and it will also set the Language field to English or Korean using this same information.
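To make the behaviour of the “Set English Language” rule concrete, here is a rough, SharePoint-free sketch that runs the same two patterns against a few file names. The class, the method names, the dictionary standing in for a list item and the sample file names are all hypothetical; the patterns, the OR logic and the Regex.IsMatch / Regex.Replace calls mirror what the command does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RuleSketch
{
    const string MatchPattern = "(?i:.* Eng.*|.*ENGLISH ONLY.*|.*-EN.*)";
    const string SearchPattern = "(?i: -?Eng|ENGLISH ONLY)|-EN";

    // Mirrors the Match element: AND starts at true, OR starts at false.
    public static bool Matches(IDictionary<string, string> item, bool isAnd, IDictionary<string, string> patterns)
    {
        bool result = isAnd;
        foreach (KeyValuePair<string, string> p in patterns)
        {
            string value;
            item.TryGetValue(p.Key, out value);
            bool fieldMatch = Regex.IsMatch(value ?? string.Empty, p.Value);
            result = isAnd ? (result && fieldMatch) : (result || fieldMatch);
        }
        return result;
    }

    // Returns the new file name, or the original name when the rule does not match.
    public static string ApplyEnglishRule(string fileName, string title)
    {
        Dictionary<string, string> item = new Dictionary<string, string>();
        item["FileLeafRef"] = fileName;
        item["Title"] = title;

        Dictionary<string, string> patterns = new Dictionary<string, string>();
        patterns["FileLeafRef"] = MatchPattern;
        patterns["Title"] = MatchPattern;

        if (!Matches(item, false, patterns)) // Op="OR"
            return fileName;

        // Mirrors <Field Name="FileLeafRef" SearchPattern="...">-English</Field>
        return Regex.Replace(fileName, SearchPattern, "-English");
    }

    public static void Main()
    {
        Console.WriteLine(ApplyEnglishRule("Annual Report Eng.doc", "Annual Report")); // Annual Report-English.doc
        Console.WriteLine(ApplyEnglishRule("Budget-EN.xls", "Budget"));                // Budget-English.xls
        Console.WriteLine(ApplyEnglishRule("Notes.doc", "Notes"));                     // Notes.doc (no match)
        Console.WriteLine(ApplyEnglishRule("Site Engineering.doc", "Site Engineering")); // Site-Englishineering.doc
    }
}
```

The last line shows the kind of issue mentioned above: “ Eng” also matches the start of “Engineering”, so real data would need tighter patterns than the ones in the example.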
Probably the most important thing to remember when constructing your XML is that you need to use the internal field name and not the display name.
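If you are not sure what the internal names are, one way to find them is to dump the fields of the target list. This is only a sketch and not part of the command: the class name is made up, the URL is the example one used elsewhere in this post, and it has to be compiled against Microsoft.SharePoint.dll and run on a server in the farm. It opens the list the same way the command does.

```csharp
using System;
using Microsoft.SharePoint;

public static class ListFieldNames
{
    public static void Main(string[] args)
    {
        // Pass a list folder URL as the first argument; the default is just the example URL.
        string url = args.Length > 0 ? args[0] : "http://portal/documents";

        using (SPSite site = new SPSite(url))
        using (SPWeb web = site.OpenWeb())
        {
            SPFolder folder = web.GetFolder(url);
            SPList list = web.Lists[folder.ParentListId];

            foreach (SPField field in list.Fields)
            {
                if (field.Hidden)
                    continue; // Skip hidden fields to keep the output readable.

                // Display name, internal name, and whether the command would skip the field.
                Console.WriteLine("{0,-30} {1,-30} {2}",
                    field.Title,
                    field.InternalName,
                    field.ReadOnlyField ? "(read only)" : string.Empty);
            }
        }
    }
}
```

The second column is the value to use in the Name attribute of a Field element. Fields flagged as read only are the ones the command will log a warning for and leave untouched.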
You can also do additional filtering using the command line parameters by restricting whether folders are recursed and by specifying a sub-folder instead of a root list folder. The syntax of the command can be seen below:
C:\>stsadm -help gl-setmetadata

stsadm -o gl-setmetadata

Updates list field values based on the rules defined in the provided input file. Use -test to verify your updates before executing.

Parameters:
        -url <list folder url>
        -inputfile <input file containing meta data rules>
        [-recursefolders]
        [-quiet]
        [-test]
        [-logfile <log file>]
Here’s an example of how you would execute this command using the XML shown above as an input:
stsadm -o gl-setmetadata -url http://portal/documents -inputfile c:\metadata.xml -recursefolders -logfile c:\metadata.log
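Like many of my commands that do batch updating, you can run this command in a test mode by passing in a “-test” parameter. In test mode the rules are still evaluated and the results are logged, but the items themselves are not updated.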