Thursday, May 19, 2011

New Sites and the SharePoint 2010 Content Type Hub

The SharePoint 2010 content type hub does quite a good job of managing and publishing a centrally controlled set of content types. There are a few quirks and limitations, some of them documented in Content Type Hub FAQ and Limitations by Chaks' SharePoint Corner; not to mention the content type publishing timer jobs that actually push the content types to the subscribers.

One of the less documented areas of using a content type hub (HUB) is what happens when new site-collections are provisioned. What if I have list definitions in my features; how can I be sure that their referenced content types have been provisioned at feature activation time? Can I deploy my enterprise content types feature at both the hub site-collection and at new site-collections that users create?

First, when a new site-collection is created, all the published content types from the HUB defined in the Managed Metadata Service (MMS) application connected to the parent SharePoint web-application are automatically provisioned into its local content type gallery. Note that this applies only to the content types that are actually published in the source hub; content types that are not published will not exist in your new site-collection. Also note that hub content types are not published by default; publishing must be configured for every single content type in the source hub.

So if your list definitions depend on global content types that have not been published in the HUB, your feature activation will fail. You can of course solve this by publishing the applicable global content types in the source hub and running the timer jobs first, as this ensures that new site-collections will have the enterprise content types auto-provisioned from the MMS HUB.

However, you can also deploy your enterprise content types feature to both the content type hub and to any other site-collection that you create. This works fine as the site content type definitions are identical, including the content type ID structure - after all it is the same content type CAML feature. This won't affect subscribing to the content type hub, and publishing new, updated or derived content types from the hub works just as expected.

Activate your site content type feature before activating your list definition feature, or any other feature that depends on the site content types being provisioned, to ensure that they exist locally in the new site-collection even if not yet published in the HUB.

As your taxonomy is subject to change, so are your enterprise content types. Thus, your deployment strategy for enterprise content types needs to handle change. I strongly recommend using the Open-Closed Principle for modifying and extending the enterprise content types. The Open-Closed Principle is based on using a set of immutable base content types that you derive from to make new specialized content types, inheriting fields from the base. The immutable base of the Open-Closed Principle coincides nicely with provisioning global content types through both a feature and the content type hub, as by policy any changes are made by extending the former through the latter.

Even trivial stuff, such as providing standardized company templates for Word and other Office applications, is best done by publishing new derived content types. Use the content type hub to inherit your base PzlDocument into PzlDocumentMemo and attach a template, then go to "Manage publishing for this content type" to publish the content type. Wait for, or run, the two HUB timer jobs, and then add the Word template to the applicable document libraries.

Now you're in for a surprise later on. The next time you try to create a new site-collection after publishing modifications in the HUB, you might get this "content type is read only" error:

The ULS log typically contains an exception like this:

Error code: -2146232832
The content type is read only or updateChildren is true and one of the child objects of the content type is read only.

The root cause for this is that published content types are by default read-only in the subscribers. What typically leads to this error is the need to use code when provisioning content types, e.g. when renaming and reordering fields, or when adding the enterprise keywords field to your content type. Another typical scenario where code is required is managed metadata fields; see How to provision SharePoint 2010 Managed Metadata columns by Wictor Wilén.

Making changes to the site content type definition in the FeatureActivated code and then calling SPContentType Update with updateChildren=true will work fine, until someone creates a new derived content type in the source hub and publishes it. Your carefully tested code will suddenly crash, as the published child content type is read-only! Alas, what better proof that the deployed and the published global content types are the same?

Luckily, the change is isolated to the new inherited content type, thus it can safely be ignored when deploying the base content types. Use this overloaded Update method when modifying the global content types:

public void Update(
         bool updateChildren,         // true: push the change down to derived content types
         bool throwOnSealedOrReadOnly // false: silently skip sealed or read-only children
)
The HUB change did not affect your global content type due to using the Open-Closed governance policy for enterprise content types. See my SharePoint 2010 Open-Closed Taxonomy post to learn more about this recommended policy.

The content type hub and the Managed Metadata Service are perhaps the best new features in SharePoint 2010, yet there are some uncharted areas that make developers reluctant to use the MMS HUB. There are a lot of articles on TechNet and MSDN about the architecture, but way too little about deployment scenarios and issues such as those in this post.

Saturday, May 07, 2011

Site Lifecycle Management using Retention Policies

The ootb governance tools for site lifecycle management (SLM) in SharePoint 2010 have not improved over the previous version. You're still stuck with the Site Use Confirmation and Deletion policies that just periodically e-mail site owners and ask them to confirm that their site is still in use. There is no check for the site or its content actually being used; it is just a dumb timer job. If the site is not confirmed as still being active, the site will then be deleted - even if it is still in use. As deleting a site is not covered by any SharePoint recycle bin mechanism (coming in SP1), Microsoft also provides the site deletion capture tool on CodePlex.

Wouldn't it be nice if we could apply the information management policies for retention and disposition of content also for SharePoint 2010 sites? Yes we can :) By using a content type to identify and keep metadata for a site, the standard information management policies for content expiration can be configured to implement a recurring multistage retention policy for site disposition.

Create a site information content type and bind it to a list or library in your site definition, and ensure that this list contains one SiteInfo item with the metadata of the site. Typical metadata are site created date, site contact, site type, cost center, unit and department, is restricted site flag, last review date, next review date, and last update timestamp. Restrict edit permissions for this list to just site owners or admins.
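As a sketch, the single SiteInfo item could be seeded at site provisioning time like this. The list name SiteInfo and the field names PzlSiteCreated and PzlLastUpdate are assumptions for illustration, not product names.

```csharp
using System;
using Microsoft.SharePoint;

public static class SiteInfoProvisioning
{
    public static void EnsureSiteInfoItem(SPWeb web)
    {
        SPList list = web.Lists["SiteInfo"]; // assumed list name
        if (list.ItemCount > 0) return;      // keep exactly one SiteInfo item

        SPListItem item = list.Items.Add();
        item["Title"] = web.Title;
        item["PzlSiteCreated"] = web.Created;             // assumed field names
        item["PzlLastUpdate"] = web.LastItemModifiedDate;

        // SystemUpdate avoids bumping the web's LastItemModifiedDate to now
        item.SystemUpdate(false);
    }
}
```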

Enable retention for the SiteInfo content type to configure your site lifecycle management policy as defined in your governance plan.

Add one or more retention stages for the SiteInfo content type as needed by your SLM policy. You will typically have a first stage that starts a workflow to notify the site owner of site expiration and ask for disposition confirmation. Make sure that the site owner knows about and acts on your defined governance policies for manual information management, such as sending valuable documents to records management. Then there will be a second stage for performing the site disposition steps triggered by the confirmation.

You can also implement a custom information management policy expiration formula or expiration action for use when configuring your retention policy. You typically do this when your policy requires retention events that are not based on date fields alone. See Sahil Malik's Authoring custom expiration policies and actions in SharePoint 2007, which is still valid for SharePoint 2010.
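A custom expiration formula is a class implementing the IExpirationFormula interface from the policy features API. Here is a hedged sketch that expires a SiteInfo item six months after the site's last recorded activity; the field name PzlLastUpdate and the six-month window are assumptions for illustration.

```csharp
using System;
using System.Xml;
using Microsoft.SharePoint;
using Microsoft.Office.RecordsManagement.PolicyFeatures;

// Sketch of a custom expiration formula for the SiteInfo content type.
public class SiteIdleExpirationFormula : IExpirationFormula
{
    public Nullable<DateTime> ComputeExpireDate(SPListItem item, XmlNode parametersData)
    {
        object lastUpdate = item["PzlLastUpdate"]; // assumed field name
        if (lastUpdate == null)
            return null; // no activity timestamp yet, so never expires

        // Expire six months after the last recorded site activity
        return ((DateTime)lastUpdate).AddMonths(6);
    }
}
```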

Use a custom workflow or custom expiration action to implement the site disposition steps: user removal, automated content clean-up and archiving, and finally triggering deletion of the site. Whether the site is automatically deleted by a custom workflow, marked for deletion to be processed by a custom timer job, or a custom action just sends an e-mail to the site-admin, is up to your SLM policy.

If you need to keep the site in a passive state for, say, six months before deleting it, you can use a delegate control in your site master pages to prevent access to passive sites, or you can move the site to an archive web-app that uses a "deny write" / "deny all" access policy to prevent access. Note that the former is not real security, just content targeting for the site. The latter is real security, as "deny" web-app policies override site-specific access rights granted to SharePoint groups and users. This allows for keeping the site users and groups as-is in case the site can be reactivated again according to your SLM policies. If site owners need to do housekeeping on a site while it is passive, grant them access by creating extra "steward" accounts that are not subject to being denied access.

I recommend removing all users from the default site members group before deleting the site, otherwise the site will not be removed from the site memberships list in each user's My Site.
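A minimal sketch of emptying the default members group before deletion; note that the collection must be copied before removing from it, as removing users while enumerating the live collection would invalidate the enumerator.

```csharp
using System.Linq;
using Microsoft.SharePoint;

public static class SiteDisposition
{
    public static void EmptyMembersGroup(SPWeb web)
    {
        SPGroup members = web.AssociatedMemberGroup;
        if (members == null) return;

        // Copy to an array first, then remove each user from the group
        SPUser[] users = members.Users.Cast<SPUser>().ToArray();
        foreach (SPUser user in users)
        {
            members.RemoveUser(user);
        }
    }
}
```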

The astute reader may wonder how the content type retention policy knows if the site is actually in use. The answer is quite simple: each SPWeb object provides a LastItemModifiedDate property. This timestamp is also stored in the SharePoint property bag. Use a delegate control in your site's master page to check and push the timestamp to a date-time field in the SiteInfo item, so that the retention policy can trigger on it. Remember to use SystemUpdate when updating the SiteInfo item, otherwise you will change the site's LastItemModifiedDate to now. You can also use a custom expiration formula that inspects the last modified timestamp for the site when the information management policy timer job runs.
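The push from the delegate control can be sketched like this, assuming the SiteInfo list and a PzlLastUpdate date-time field as in the earlier examples.

```csharp
using System;
using Microsoft.SharePoint;

public static class SiteActivityTracker
{
    // Call from the delegate control's OnLoad to push the site's last
    // modified timestamp into the SiteInfo item for the retention policy.
    public static void PushLastModified(SPWeb web)
    {
        SPList list = web.Lists["SiteInfo"]; // assumed list name
        if (list.ItemCount == 0) return;

        SPListItem item = list.Items[0];
        DateTime lastModified = web.LastItemModifiedDate;

        object current = item["PzlLastUpdate"]; // assumed field name
        if (current == null || (DateTime)current < lastModified)
        {
            item["PzlLastUpdate"] = lastModified;
            // SystemUpdate keeps the web's LastItemModifiedDate unchanged
            item.SystemUpdate(false);
        }
    }
}
```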

We also use the site information content type in our Puzzlepart projects to provide a search-driven site directory. It is quite simple to make a nicely categorized and searchable site catalog by simply using one or more customized search results web-parts. This search-driven catalog can of course be sorted by the 'write' managed property, which must be mapped to the crawled property that contains the LastItemModifiedDate of a site.

Using a search-driven approach makes it unnecessary to have a classic site directory list. The site metadata is simply stored directly in a list within each site, managed by the respective site owners. This is more likely to keep the site metadata up-to-date rather than going stale in a central site directory list that no one maintains.

I hope this post has given you some new ideas on how to store, manage and use site metadata, both for site lifecycle management and for providing a relevant search-driven site directory.