Parallel SharePoint Tasks with PowerShell

Today I was working on a deployment for a client which entailed activating a custom SharePoint Feature on about 1000 Site Collections. This Feature does a fair number of things and takes 10-15 minutes on average to complete in their test environment (which is pretty slow compared to their production environment; I’ve not yet deployed there, but I expect closer to a 5 minute run time per Site Collection once I do). Do the math and you’ll quickly see that activating one Site Collection at a time would take roughly 7 to 10 days in the test environment. That’s just unacceptable, as I personally don’t want to be monitoring a Feature activation script for that long. What’s worse is that when I look at CPU and memory utilization on the servers I can see plenty of spare resources, so the operations aren’t actually taxing the system; they’re just slow. So the solution, for me, is pretty obvious: I need to activate these Features in parallel.
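For the curious, here is the back-of-the-envelope math as a quick sketch. The site count and per-site timings come from the numbers above; the figure of 10 parallel jobs is purely an assumption for illustration, and the parallel estimate assumes ideal scaling, which real servers won't quite deliver.

```powershell
$siteCount      = 1000
$minutesPerSite = 10, 15   # low and high estimates from the test environment
$parallelJobs   = 10       # assumed job count, for illustration only

foreach ($minutes in $minutesPerSite) {
    # Total minutes converted to days
    $serialDays   = ($siteCount * $minutes) / 60 / 24
    # Ideal case: the work divides evenly across the jobs
    $parallelDays = $serialDays / $parallelJobs
    "{0} min/site: {1:N1} days serial, {2:N1} days with {3} jobs" -f $minutes, $serialDays, $parallelDays, $parallelJobs
}
```

That works out to roughly 7 to 10 days when run serially, versus about a day if ten activations really can run side by side.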

There are two ways that I can achieve this using PowerShell and they depend on which version of PowerShell you’re using. In my case I’m running SharePoint 2010 which means that I’m using PowerShell V2; because of this my only option is to use the Start-Job cmdlet with some control logic to dictate how many jobs I’m willing to run at once. If I were using SharePoint 2013 I could use the new workflow capabilities of PowerShell V3 thereby making the whole process a lot easier to understand. I’ll show both approaches but I want to first start with what you would do for SharePoint 2010 with PowerShell V2.

Using Start-Job for Parallel Operations

The trick with using the Start-Job cmdlet is knowing when to hold off on creating new jobs until existing ones have completed. The key is to use the Get-Job cmdlet, filter on the State property of each job’s JobStateInfo property and then, if you have reached your job count threshold, call the Wait-Job cmdlet to block the script until a job completes. The following script is a simple example of what I created for my client and can be used as a template for your own scripts:

$jobThreshold = 10

foreach ($site in (Get-SPSite -Limit All)) {
    # Get all running jobs
    $running = @(Get-Job | where { $_.JobStateInfo.State -eq "Running" })

    # Loop as long as our running job count is >= threshold
    while ($running.Count -ge $jobThreshold) {
        # Block until we get at least one job complete
        $running | Wait-Job -Any | Out-Null
        # Refresh the running job list
        $running = @(Get-Job | where { $_.JobStateInfo.State -eq "Running" })
    }

    Start-Job -InputObject $site.Url {
        # $input is an enumerator, so unwrap it to get the URL string
        $url = $input | %{$_}
        Write-Host "BEGIN: $(Get-Date) Processing $url..."

        # We're in a new process so load the snap-in
        Add-PSSnapin Microsoft.SharePoint.PowerShell

        # Enable the custom feature
        Enable-SPFeature -Url $url -Identity MyCustomFeature

        Write-Host "END: $(Get-Date) Processing $url."
    }
    # Dump the results of any completed jobs
    Get-Job | where { $_.JobStateInfo.State -eq "Completed" } | Receive-Job

    # Remove completed jobs so we don't see their results again
    Get-Job | where { $_.JobStateInfo.State -eq "Completed" } | Remove-Job
}

# The loop ends while the last jobs are still running, so wait for them,
# dump their results and clean up
Get-Job | Wait-Job | Receive-Job
Get-Job | Remove-Job

If you run this script and open up Task Manager you’ll see that it has created a powershell.exe process for each job. You might be able to get away with running more processes at once, but I’d recommend starting with a small threshold and raising it gradually rather than risk crippling your system.
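If you’d rather not keep Task Manager open, a rough alternative is to count the processes from a second console. This is only a sketch: it assumes the jobs run as powershell.exe (Windows PowerShell), and it will also count any other PowerShell consoles you happen to have open. The $workers variable name is just for illustration.

```powershell
# Find powershell.exe processes other than the console running this snippet
$workers = @(Get-Process -Name powershell -ErrorAction SilentlyContinue |
    where { $_.Id -ne $PID })

"{0} other powershell.exe process(es) running" -f $workers.Count
```

Run it a few times while the script is working and the count should hover around your job threshold.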

Using PowerShell V3 Workflow

With PowerShell V3 we now have the ability to create a workflow within which I can specify tasks that should be run in parallel. I actually detailed how to do this in an earlier post so I won’t spend much time on it here. I do want to show the code again for the sake of comparison as well as to point out one core difference (I recommend you read the Workflow section of my aforementioned post for more details). First though, here’s a slightly modified version of the code so you can compare it to the V2 equivalent:

workflow Enable-SPFeatureInParallel {
    param(
        [string[]]$urls,
        [string]$feature
    )

    foreach -parallel($url in $urls) {
        InlineScript {
            # Write-Host doesn't work within a workflow
            Write-Output "BEGIN: $(Get-Date) Processing $($using:url)..."

            # We're in a new process so load the snap-in
            Add-PSSnapin Microsoft.SharePoint.PowerShell

            # Enable the custom feature
            Enable-SPFeature -Identity $using:feature -Url $using:url

            Write-Output "END: $(Get-Date) Processing $($using:url)."
        }
    }
}
Enable-SPFeatureInParallel (Get-SPSite -Limit All).Url "MyCustomFeature"

The first thing you should be asking yourself when you look at this is how many Site Collections will be processed simultaneously. With the V2 version we could set the limit to whatever arbitrary value made sense for our situation. With this approach, however, we’re limited to only 5 processes. You can see this if you run the code and open up Task Manager where, as with the Start-Job approach, you’ll see a powershell.exe for each process. (Note that it’s not the workflow itself that creates the powershell.exe processes but the InlineScript activity; watching those processes simply makes it easy to see that no more than five are ever created.)

Summary

So, though we’re limited in the number of processes and there are some downsides in terms of how we output information (we can’t use Write-Host, and any output generated by one run could be intermixed with output from another run), I think the V3 approach is much cleaner and easier to use. That said, you could make the Start-Job approach generic by passing in a script to run along with an array of input values, so that it could be easily reused without having to look at the details of what’s happening.
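A minimal sketch of what that generic version might look like follows. The function name Invoke-ThrottledJob and its parameters are my own invention for illustration (they are not part of PowerShell or SharePoint), and it assumes each input value is a simple value such as a URL string that can be passed through -ArgumentList.

```powershell
# Hypothetical helper: runs $ScriptBlock once per input value,
# keeping at most $Threshold jobs running at any time
function Invoke-ThrottledJob {
    param(
        [Parameter(Mandatory = $true)] [scriptblock]$ScriptBlock,
        [Parameter(Mandatory = $true)] [object[]]$InputValues,
        [int]$Threshold = 10
    )

    $jobs = @()
    foreach ($value in $InputValues) {
        # Only look at the jobs this function started
        $running = @($jobs | where { $_.State -eq "Running" })

        # Block while we're at the threshold
        while ($running.Count -ge $Threshold) {
            $running | Wait-Job -Any | Out-Null
            $running = @($jobs | where { $_.State -eq "Running" })
        }

        # The value arrives in the script block as its first parameter
        $jobs += Start-Job -ScriptBlock $ScriptBlock -ArgumentList $value
    }

    # Wait for the stragglers, emit everything, then clean up
    $jobs | Wait-Job | Receive-Job
    $jobs | Remove-Job
}
```

To reproduce the Feature activation, you would pass the Site Collection URLs as -InputValues along with a script block that declares param($url), loads the snap-in and calls Enable-SPFeature. Unlike the template earlier, this version collects output only at the end, which keeps the function short at the cost of delayed feedback.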