Friday, August 27, 2010

SharePoint 2010 My Tasks Web Part using Search Driven Cross-Site Query with Muenchian Grouping

There always seems to be a requirement for rolling up data from all sites in one or more SharePoint solutions, such as getting a list of my tasks, a list of new documents this week, or creating a searchable news archive for publishing sites. Another typical example is creating a site map or dynamic site directory based on metadata that is collected in your site provisioning workflow and later maintained by site owners.

SharePoint has several web-parts that can do cross-list and cross-subsite queries, such as the Content Query web-part, but they are all restricted to a single site-collection. In addition, there are the classic Data View web-part and the new XSLT List View web-part that can be configured using SharePoint Designer. These web-parts can connect to a diverse set of data sources, from internal SharePoint lists to external REST and OData services.

Still, the simplest solution for cross-site/cross-solution rollups is to customize the ootb search web-parts against custom search scopes in the Search Service application. In most cases no coding will be required; pure configuration of SharePoint will go a long way. This post will show how to configure a search driven "My Tasks" web-part that will show all tasks assigned to the user across all SharePoint sites in all indexed SharePoint solutions. The unstyled cross-site task rollup web-part looks like this, including some debug info:

First you need to configure the results scope behind the search driven web-part in Central Admin. Start by adding a new scope in 'Search Service Application>Scopes' called TaskRollup using the rules as shown here:

If you can't see ContentType when adding a rule, then go to 'Search Service Application>Metadata Properties' and edit the managed property to set Allow this property to be used in scopes.

As the TaskStatus site column is not mapped to any managed property by default, you must map the crawled property ows_Status to one before it can be used. Go to 'Search Service Application>Metadata Properties' and create a managed property called TaskStatus using the mapping as shown here:

Do not get creative with the naming; stay away from spaces and special characters such as ÆØÅ - a SharePoint best practice for any artifact name used as an identifier or a URL fragment. For example, a name like "Contoso Web Ingress" first gets encoded as "Contoso_x0020_Web_x0020_Ingress" when stored, and is then encoded once more as "Contoso_x005F_x0020_Web_x005F_x0020_Ingress" in a search result XML.

A full crawl is required after adding or changing crawled or managed properties. Do a full crawl of the content source you used in the TaskRollup scope. Note that there must be some matching content stored in SharePoint for these properties to be added to the property database in the first place. Thus after provisioning new site content types or site columns, you must add some sample content and then do a full recrawl of the applicable content source.

Verifying that the full crawl of the SharePoint sites content source finished without errors completes the Central Admin configuration. Now it's time to configure the ootb Search Core Results web-part to become the customized My Tasks web-part.

Open a team-site and add the Search Core Results web-part to a page. Switch to page edit mode and select 'Edit Web Part' to open the Search Core Results settings panel. Rename the web-part 'Title' to Task Rollup (cross-site) and set the 'Cross Web-Part Query ID' to User query and 'Fixed Keyword Query' to scope:"TaskRollup" as shown here:

The Search Core Results web-part requires a user query, or a configured fixed or appended query, to actually perform a search. With neither a configured query nor a user query, it will just show a message asking for query input. The 'Cross Web-Part Query ID' setting User query is chosen here for reasons explained later.

If you want to further limit what tasks are shown in the My Tasks web-part, just add more query keywords to the 'Append Text to Query' setting as shown here:

The My Tasks web-part will show the two task fields 'Status' and 'Assigned to' in the task list. Any managed property can be added to the search results by configuring the 'Fetched Properties' setting. Add the following XML <Column Name="AssignedTo"/> <Column Name="TaskStatus"/> as shown here:
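As a rough sketch of what the resulting 'Fetched Properties' value could look like: the two added columns go inside the existing Columns element. The other column names listed here are an assumption about what a stock Search Core Results web-part fetches, shortened for readability; keep whatever your own web-part already contains and only append the two task columns.

```xml
<Columns>
  <!-- Assumed subset of the default columns; your web-part may list more -->
  <Column Name="WorkId"/>
  <Column Name="Rank"/>
  <Column Name="Title"/>
  <Column Name="Author"/>
  <Column Name="Size"/>
  <Column Name="Path"/>
  <Column Name="Description"/>
  <Column Name="Write"/>
  <Column Name="SiteName"/>
  <Column Name="HitHighlightedSummary"/>
  <Column Name="ContentClass"/>
  <Column Name="IsDocument"/>
  <!-- Added for the My Tasks web-part: managed property names, not display names -->
  <Column Name="AssignedTo"/>
  <Column Name="TaskStatus"/>
</Columns>
```

The comments are for illustration only; leave them out when pasting the value into the web-part setting.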

You need to uncheck the 'Use Location Visualization' setting to enable the controls for customizing the result set and XSL formatting. See A quick guide to CoreResultsWebPart configuration changes in SharePoint 2010 by Corey Roth to learn more about the new search location concept in SharePoint 2010. Read all his Enterprise Search posts for an excellent introduction to the improved SharePoint 2010 search services and web-parts.

After adding 'TaskStatus' and 'AssignedTo' to the fetched properties, you will also need to customize the XSL used to format and show the search results to also include your extra task fields. Click the 'XSL Editor' button in the 'Display Properties' section of the web-part settings panel, and add the fields to the match="Result" xsl:template according to your design. Note that the property names must be entered in lower case in the XSL.
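A minimal sketch of such a customized results XSL, assuming the result XML has an All_Results root with Result children and that each result carries title and url elements as used by the default results XSL. The CSS class names are hypothetical placeholders for your own design.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/">
    <xsl:apply-templates select="All_Results/Result"/>
  </xsl:template>

  <xsl:template match="Result">
    <div class="task-item">
      <!-- title and url are assumed to be present as in the default XSL -->
      <a href="{url}"><xsl:value-of select="title"/></a>
      <!-- Fetched properties show up as lower-case child elements of Result -->
      <div class="task-status">
        <xsl:text>Status: </xsl:text>
        <xsl:value-of select="taskstatus"/>
      </div>
      <div class="task-assigned-to">
        <xsl:text>Assigned to: </xsl:text>
        <xsl:value-of select="assignedto"/>
      </div>
    </div>
  </xsl:template>
</xsl:stylesheet>
```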

The astute reader will have noticed the nice grouping of the search results. This is done using the Muenchian method as SharePoint 2010 still uses XSLT 1.0, thus no simple XSLT 2.0 xsl:for-each-group. The customized "My Tasks" results XSL creates a key called 'tasks-by-status' that selects 'Result' elements and groups them on the 'taskstatus' field as shown here:
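The grouping described above can be sketched roughly as follows. This is not the exact XSL behind the web-part, just a minimal Muenchian grouping example that assumes an All_Results root with Result children, the title and url elements of the default XSL, and the lower-case taskstatus and assignedto fetched properties.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Index every Result element by the value of its taskstatus child -->
  <xsl:key name="tasks-by-status" match="Result" use="taskstatus"/>

  <xsl:template match="/">
    <!-- Keep only the first Result of each distinct taskstatus value -->
    <xsl:for-each select="All_Results/Result[count(. | key('tasks-by-status', taskstatus)[1]) = 1]">
      <xsl:sort select="taskstatus"/>
      <h3><xsl:value-of select="taskstatus"/></h3>
      <ul>
        <!-- Then list every Result that shares that taskstatus value -->
        <xsl:for-each select="key('tasks-by-status', taskstatus)">
          <li>
            <a href="{url}"><xsl:value-of select="title"/></a>
            <xsl:text> (</xsl:text>
            <xsl:value-of select="assignedto"/>
            <xsl:text>)</xsl:text>
          </li>
        </xsl:for-each>
      </ul>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The outer predicate is the heart of the method: the union of the current Result and the first Result in its key group has exactly one node only when they are the same node, so each status value yields one group header.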

Again, note the requirement for lower case names for the fetched properties when used in the XSL. Use the <xmp> trick to see the actual result XML.
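For reference, the xmp trick amounts to temporarily replacing the results XSL with a stylesheet along these lines, which copies the raw result XML into the page so the browser shows it as text rather than rendering it:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <!-- The xmp element makes the browser display the copied XML verbatim -->
    <xmp><xsl:copy-of select="*"/></xmp>
  </xsl:template>
</xsl:stylesheet>
```

Keep a copy of your customized XSL before swapping this in, and restore it once you have checked the element names.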

The final part of the puzzle is how to turn the cross-site task list into a personal task list. Unfortunately, the [Me] and [Today] filter tokens cannot be used in the enterprise search query syntax, so some coding is required to add such dynamic filter tokens. Export the customized Search Core Results web-part to disk to start packaging into a WSP solution.

Create a new TaskRollupWebPart web-part SPI in your web-parts feature in Visual Studio 2010. Make the new web-part class inherit from CoreResultsWebPart in the Microsoft.Office.Server.Search assembly. Override the methods shown here to add dynamic filtering of the query through the SharedQueryManager for the web-part page:

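using System;
using System.Web.UI.WebControls;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Search.Query;
using Microsoft.Office.Server.Search.WebControls;
namespace PuzzlepartTaskRollup.WebParts {
  public class TaskRollupWebPart : CoreResultsWebPart {
    QueryManager _queryManager;
    protected override void OnInit(EventArgs e) {
      base.OnInit(e);
      // Get the shared query manager for this web-part page (default query ID)
      _queryManager = SharedQueryManager.GetInstance(this.Page).QueryManager;
    }
    protected override System.Xml.XPath.XPathNavigator GetXPathNavigator(string viewPath) {
      // Filter on the current user; user.Name is assumed, the original argument was cut off
      SPUser user = SPContext.Current.Web.CurrentUser;
      _queryManager.UserQuery = string.Format("scope:\"TaskRollup\" AssignedTo:\"{0}\"", user.Name);
      return base.GetXPathNavigator(viewPath);
    }
    protected override void CreateChildControls() {
      base.CreateChildControls();
      //debug info
      //Controls.Add(new Label { Text = string.Format("FixedQuery: {0}<br/>AppendedQuery: {1}<br/>UserQuery: {2}", FixedQuery, AppendedQuery, _queryManager.UserQuery) });
    }
  }
}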

The code in GetXPathNavigator is what adds the current user to the QueryManager.UserQuery to filter tasks based on the assigned user by [me]. There are five query objects available on a search web-part page, where QueryId.Query1 is the default. This is also what is exposed in the web-part settings as the 'User Query' option. Use the GetInstance(Page, QueryId) overload in SharedQueryManager to get at a specific cross-page query object.

Replace the content of the TaskRollupWebPart.webpart file with the exported Search Core Results configuration. This will ensure that all the configuration done to customize the ootb web-part into the My Tasks web-part is applied to the new TaskRollupWebPart. A small change is needed in the metadata type element to load the new TaskRollupWebPart code rather than the CoreResultsWebPart code:

<webPart xmlns="http://schemas.microsoft.com/WebPart/v3">
  <metaData>
    <type name="PuzzlepartTaskRollup.WebParts.TaskRollupWebPart, $SharePoint.Project.AssemblyFullName$" />

Build the feature and deploy the package to your test site from Visual Studio 2010. Add the web-part to a page and verify that you get only your tasks as expected.

I know that this seems like a lot of work, but a search-driven web-part is easily created and tested before lunch. The inevitable styling & layout using XSL and CSS is what will burn hours, as usual.
A drawback of search driven web-parts or code is the delay before new or updated content is shown, due to the periodic crawl schedule, typically every five or ten minutes. On the positive side, the results will be automatically security trimmed for you based on the logged on user - no authentication hassles or stored username and password required as with the XSLT List View.

Note that most enterprise search classes are still sealed in SharePoint 2010 as in SharePoint 2007, except the CoreResultsWebPart and some new classes, so you're limited to the customizations that can be achieved with configuration or the SharedQueryManager. Search driven web-parts work equally well in SharePoint 2007, except that there is no SharedQueryManager, but rather the infamous search results hidden object (SRHO), which is unsupported.

Recommended: SharePoint Search XSL Samples and the Search Community Toolkit at CodePlex.

Wednesday, August 04, 2010

Five Steps to Structure Your SharePoint Sites (Part III)

At every SharePoint customer and in every SharePoint project, the issue of how to realize a future-proof, working site structure arises. I've written a few blog posts on classification and structuring of SharePoint sites before (part 1 and part 2), focusing on scalable site structure design within the SharePoint system boundaries. This post is about getting from a classic site map based approach to a methodology driven by Information Architecture and the site classification pyramid, combined with the SharePoint containment hierarchy and SharePoint governance best practices. The latter is important as the management and usage of a site, and the service level agreement for the site, strongly influence how the overall SharePoint solution should be partitioned to adhere to architecture best practices.

Most information architects come up with a content organization model like a site map with hierarchical URL scheme. "This is the draft model, and we want the URL structure to be like this in our SharePoint solution" they say. "That's a very useful mental model of the site" I respond, "but what you really want is a good, intuitive navigation concept, helping users find what they need - like an invisible guiding hand; not that specific URL structure". Having a simple and easy mental model for the site structure is important, as it makes it easy to both contribute and find content, driving findability when the amount of information stored and managed in SharePoint grows.

Alas, that is not the primary issue with the site map approach for designing SharePoint site structures. The site map is closer to defining a static IA structure than navigation. Still, it is only good as a logical IA structure and cannot be expected to be used directly as the physical IA structure in SharePoint.

Using a site map approach to design a SharePoint site structure will only work in simple scenarios with homogeneous solution domains, such as a publishing portal or a collaboration solution with limited document storage and versioning. It will fail when the SharePoint solution domains and their usage, governance and service level agreements differ, and also when the overall amount of data stored in SharePoint surpasses the 100GB limit per content database. In most cases the wanted URL structure might not be possible to realize anyhow when adhering to SharePoint architecture best practices.

A more structured approach is needed to design a SharePoint site structure according to architectural and governance best practices. My methodology involves a process with these five steps:

  1. Review your site map draft model against the SharePoint site classification pyramid and your governance plan for each of your identified site types.
  2. Partition your sites into solution domains based on site type and governance policy aspects such as business purpose, usage patterns, storage requirements, management boundaries, information security and isolation, operations and service level agreements.
  3. Adapt the site structure to the SharePoint containment hierarchy, adhering to limits and best practices based on the partitioned solution domains.
  4. Use the main SharePoint architectural components, explicit and wildcard site-collections, to ensure a scalable design that allows for future growth and expansion into new business domains. Use wildcard site-collections by default as these are super scalable. Adapt or customize the SharePoint global menu and current menu to your navigation concept, to hide the solution partitioning.
  5. Define your URL scheme using host headers, managed paths and site provisioning policies.

Following these five steps is a top-down approach that is more understandable for non-techies. However, from step 3 onwards, good technical knowledge of SharePoint architecture is required.

To successfully design a working and scalable SharePoint site structure, you need a site classification scheme. I recommend the classic site classification pyramid as it outlines typical solution domains with governance and site types, helping you classify sites and draw domain boundaries in your site model.

Then, after identifying the different solution domains that make up your overall SharePoint solution, you must map the identified solution domains to the SharePoint containment hierarchy. It is imperative that you understand the containment hierarchy and its boundaries, otherwise your solution will slowly stop working as more and more users create more and more sites and fill them with more and more information.

Site-collections are the main architectural component of SharePoint, in combination with the content database that stores the site-collection's content and documents. A web-application always has a single root (1st tier) site-collection, which is the home site of the web-application. A 2nd tier of site-collections can be added to the web-application as needed to scale the application, using SharePoint managed paths. A typical scenario is to add a new solution domain. Another typical scenario is to add room for more content and documents, because a site-collection cannot span content databases and adding more space thus requires a new site-collection. Use subsites for anything that is just a section of an existing solution domain or site. See the golf and car communities in the example, which functionally are "sites" in their own right. Read my post Classification and Structuring of SharePoint Sites (Part I) to learn more.

It's easy to choose what solution goes into the root site-collection of a web-application; typically this is the portal to the overall solution. As there is a limited set of explicit site-collections per SharePoint web-application, these should be reserved for a planned, well-known, small number of predefined sites. Use the limited set of available explicit site-collections for the top two types in the site classification pyramid. Note that root and explicit site-collection storage cannot be scaled as you cannot add another content database to a site-collection. Use as few explicit site-collections as possible and stay well below the limit of 20 managed paths. Expect everybody to require vanity URLs such as ://puzzlepart/HR/ for their site, so a strict governance policy for 'explicit' is required.

Use wildcard site-collections when the predictable number of sites is too large for explicit, typically for the bottom three types in the site classification pyramid. Also use wildcard site-collections even for central portals and division areas when there is no absolute requirement for an 'explicit' 2nd tier top-level site. See Puzzlepart TV in the example; even if the channel sites are classified as 'division area' (yellow color code) they do not need to be 'explicit'. In fact, due to their massive storage requirements for rich media content, best practices dictate the use of 'wildcard'. Always use wildcard site-collections for solutions that have a limited life span, such as projects.

Finally, after composing your scalable site structure design from root, explicit and wildcard site-collections, define the URL scheme that underlies the site structure. Use short descriptive URL elements for web-applications, managed paths and site-collections. The combination of these elements becomes the URL of the top-level sites that the users actually interact with when using SharePoint. Note how the URL structure is affected by the SharePoint containment hierarchy. Compare ://puzzlepart/pzl-tv/ and ://puzzlepart/pzltv/ in the example; the latter contains child sites of the former in the overall navigation concept. In addition, I recommend defining and enforcing policies for all URL elements that users can provision, such as subsites, lists and libraries, to avoid non-friendly URLs with anything from spaces to special characters.

It is very likely that you will need to customize the ootb SharePoint navigation provider and menu rendering to have a consistent navigation experience across sites, sections and pages. Users get very confused and annoyed when the content of the top menu changes as they navigate from site to site, which is very understandable. There are several ways to implement common menu and navigation across site-collections, but that's left as an exercise for my readers.

The capacity and planning boundaries are mostly the same in SharePoint 2010 as in SharePoint 2007 for web-applications, site-collections and content databases. The most significant change is that SharePoint 2010 now supports max 250 000 site-collections per web-application, five times the limit for 2007. The max number of content databases per web-application has increased from 100 to 300; on the other hand, the max number of site-collections per content database has been reduced from 50 000 to max 5000, with a recommendation of 2000 site-collections per content database. None of these numbers are hard limits; they are the numbers for acceptable SharePoint performance.