Showing posts with label ContentTypeHub.

Saturday, April 21, 2012

Migrate SharePoint 2010 Term Sets between MMS Term Stores

When using SharePoint 2010 managed metadata fields connected to termsets stored in the Managed Metadata Service (MMS) term store in your solutions, you should have a designated master MMS that is reused across all your SharePoint environments, such as the development, test, staging and production farms. Having a single master termstore across all farms gives you the same termsets and terms with the same identifiers everywhere, allowing you to move content and content types from staging to production without invalidating all the fields and data connected to the MMS term store.

You'll find a lot of termset tools on CodePlex: some use the standard SharePoint 2010 CSV import file format (which is without identifiers), and some do what you need on paper but don't fully work. Some of the better tools are SolidQ Managed Metadata Exporter for export and import of termsets (CSV-style), SharePoint Term Store Powershell Utilities for fixing orphaned terms, and finally SharePoint Taxonomy and TermStore Utilities for real migration.

There are, however, standard SP2010 PowerShell cmdlets that allow you to migrate the complete termstore with full fidelity between Managed Metadata Service applications across farms. The drawback is that you can't do selective migration of specific termsets; the whole term store will be overwritten by the migration.

This script exports the term store to a backup file:

# MMS Application Proxy ID has to be passed for -Identity parameter

Export-SPMetadataWebServicePartitionData -Identity "12810c05-1f06-4e35-a6c3-01fc485956a3" -ServiceProxy "Managed Metadata Service" -Path "\\Puzzlepart\termstore\pzl-staging.bak"

This script imports the backup by overwriting the term store:

# MMS Application Proxy ID has to be passed for -Identity parameter
# NOTE: overwrites all existing termsets from MMS
# NOTE: overwrites the MMS content type HUB URL - must be reconfigured on target MMS proxy after restoring

Import-SPMetadataWebServicePartitionData -Identity "53150c05-1f06-4e35-a6c3-01fc485956a3" -ServiceProxy "Managed Metadata Service" -path "\\Puzzlepart\termstore\pzl-staging.bak" -OverwriteExisting

Getting the MMS application proxy ID and the ServiceProxy object:

$metadataApp = Get-SPServiceApplication | ? { $_.TypeName -eq "Managed Metadata Service" }
$mmsAppId = $metadataApp.Id
$mmsproxy = Get-SPServiceApplicationProxy | ? { $_.TypeName -eq "Managed Metadata Service Connection" }
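
As a rough sketch, the lookup and the export can be combined so that no ID is hard-coded. This assumes the farm has exactly one MMS application and one MMS connection matching the type names above; the backup path is the one used in the export example, and the snap-in line is only needed outside the SharePoint 2010 Management Shell.

```powershell
# Sketch only: assumes exactly one MMS application and one MMS connection in this farm
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Look up the service application and its connection by type name
$metadataApp = Get-SPServiceApplication | ? { $_.TypeName -eq "Managed Metadata Service" }
$mmsproxy = Get-SPServiceApplicationProxy | ? { $_.TypeName -eq "Managed Metadata Service Connection" }

# Export the complete term store to the backup file
Export-SPMetadataWebServicePartitionData -Identity $metadataApp.Id -ServiceProxy $mmsproxy -Path "\\Puzzlepart\termstore\pzl-staging.bak"
```

The same lookup has to be repeated on the target farm before calling Import-SPMetadataWebServicePartitionData, as the IDs differ between farms.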

Tajeshwar Singh has written several posts on using these scripts, including how to solve typical issues.
In addition to those, I've run into this issue:

The Managed Metadata Service or Connection is currently not available. The Application Pool or Managed Metadata Web Service may not have been started. Please Contact your Administrator. 

The cause of this error was not that the app-pool or the 'service on server' had not been started, but that the service account used in the production farm was not available in the staging farm. Look through the user accounts listed in the ECMPermission table in the MMS database, and correct the "wrong" accounts. Note that updating the MMS database directly might not be supported.

Note that after the term store migration, the MMS content type HUB URL configuration will also have been overwritten. You may not notice for some time, but the content type HUB publishing and subscriber timer jobs will stop working. What you will notice is that if you try to click republish on a content type in the HUB, you'll get a "No valid proxy can be found to do this operation" error. See How to change the Content Type Hub URL by Michal Pisarek for the steps to rectify this.

Set-SPMetadataServiceApplication -Identity "Managed Metadata Service" -HubURI "http://puzzlepart:8181/"

After resetting this MMS configuration, you should verify that the content type publishing works correctly by republishing and running the timer jobs. Use "Site Collection Administration > Content Type Publishing" as shown on page 2 in Chris Geier's article to verify that the correct HUB is set and that HUB content types are pushed to the subscribers.

Thursday, May 19, 2011

New Sites and the SharePoint 2010 Content Type Hub

The SharePoint 2010 content type hub does quite a good job of managing and publishing a centrally controlled set of content types. There are a few quirks and limitations, some of them documented in Content Type Hub FAQ and Limitations by Chaks' SharePoint Corner; not to forget the content type publishing timer jobs that actually push the content types to the subscribers.

One of the less documented areas of using a content type hub (HUB) is what happens to new site-collections that are provisioned? What if I have list definitions in my features, how can I be sure that their referenced content types have been provisioned at feature activation time? Can I deploy my enterprise content types feature at both the hub site-collection and also at new site-collections that users create?


First, when a new site-collection is created, it will immediately have all the published content types from the HUB automatically provisioned into its local content type gallery. The HUB in question is the one defined by the Managed Metadata Service (MMS) application connected to the parent SharePoint web-application. This applies only to the published content types as configured in the source hub; content types that are not published will not exist in your new site-collection. Note that hub content types are not published by default; this must be configured for every single content type in the source hub.

So if your list definitions depend on global content types that have not been published in the HUB, your feature activation will fail. You can of course solve this by publishing the applicable global content types in the source hub and running the timer jobs first, as this will ensure that new site-collections will have the enterprise content types auto-provisioned from the MMS HUB.

However, you can also deploy your enterprise content types feature to both the content type hub and to any other site-collection that you create. This works fine as the site content type definitions are identical, including the content type ID structure - after all it is the same content type CAML feature. This won't affect subscribing to the content type hub, and publishing new, updated or derived content types from the hub works just as expected.

Activate your site content type feature before activating your list definition feature, or any other feature that depends on the site content types being provisioned, to ensure that they exist locally in the new site-collection even if not yet published in the HUB.
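
The activation order can be scripted when provisioning a new site-collection. This is a minimal sketch; the feature names and the site URL are hypothetical placeholders, not the names used in my solutions.

```powershell
# Hypothetical feature names and site-collection URL; replace with your own
$siteUrl = "http://puzzlepart/sites/newsite"

# 1. Provision the enterprise site content types locally
Enable-SPFeature -Identity "PzlContentTypes" -Url $siteUrl

# 2. Then activate the features that reference those content types
Enable-SPFeature -Identity "PzlListDefinitions" -Url $siteUrl
```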

As your taxonomy is subject to change, so are your enterprise content types. Thus, your deployment strategy for enterprise content types needs to handle change. I strongly recommend using the Open-Closed Principle for modifying and extending the enterprise content types. The Open-Closed Principle is based on using a set of immutable base content types that you derive from to make new specialized content types, inheriting fields from the base. The immutable base of the Open-Closed Principle coincides nicely with provisioning global content types through both a feature and the content type hub, as by policy any changes are made by extending the former through the latter.

Even trivial stuff, such as providing standardized company templates for Word and other Office applications, is best done by publishing new derived content types. Use the content type hub to inherit your base PzlDocument into PzlDocumentMemo and attach a template, then go to "Manage publishing for this content type" to publish the content type. Wait for, or run, the two HUB timer jobs, and then add the Word template to the applicable document libraries.

Now you're in for a surprise later on. The next time you try to create a new site-collection after publishing modifications in the HUB, you might get this "content type is read only" error:


The ULS log typically contains an exception like this:

SPContentTypeReadOnlyException
Error code: -2146232832
The content type is read only or updateChildren is true and one of the child objects of the content type is read only.


The root cause for this is that published content types are by default read-only in the subscribers. What typically leads to this error is the need to use code when provisioning content types, e.g. when renaming and reordering fields, or when adding the enterprise keywords field to your content type. Another typical scenario where code is required is managed metadata fields; see How to provision SharePoint 2010 Managed Metadata columns by Wictor Wilén.

Making changes to the site content type definition in the FeatureActivated code and then calling SPContentType Update with updateChildren=true will work fine, until someone creates a new derived content type in the source hub and publishes it. Your carefully tested code will suddenly crash, as the published child content type is read-only! Alas, what better proof that the deployed and the published global content types are the same?

Luckily, the change is isolated to the new inherited content type, thus it can safely be ignored when deploying the base content types. Use this overloaded Update method when modifying the global content types:

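// SPContentType.Update(bool updateChildren, bool throwOnSealedOrReadOnly)
contentType.Update(true,    // updateChildren: push changes to derived content types
                   false);  // throwOnSealedOrReadOnly: skip read-only children instead of throwing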

The HUB change did not affect your global content type due to using the Open-Closed governance policy for enterprise content types. See my SharePoint 2010 Open-Closed Taxonomy post to learn more about this recommended policy.

The content type hub and the Managed Metadata Service are perhaps the best new features in SharePoint 2010, but there are still some uncharted areas that make developers reluctant to use the MMS HUB. There are a lot of articles at Technet and MSDN on the architecture, but way too little about deployment scenarios and issues such as those in this post.

Friday, July 02, 2010

Scalable SharePoint 2010 Farm & Services Architecture

This post describes a proposed solution design for providing an intranet collaboration farm capable of scaling out to support a publishing portal and extranet collaboration with partners and suppliers. The design is based on a set of diverse site types identified in an Information Architecture analysis, and the general planning recommendations for these types. In addition, several non-functional requirements are accounted for in the design, such as farm security, robustness and availability.

The SharePoint 2010 farm is designed with scale out support for different future solution areas. The initial farm architecture will be designed to support collaboration for internal users only, with an option to also add an intranet publishing solution. The long term goal of the farm is to eventually support collaboration with external users also, such as partners and suppliers.

All SharePoint 2010 solutions are built from a set of elements ranging from the farm hosting the solutions down to sites that contain the actual functionality and information of the solutions.


SharePoint solutions are deployed into SharePoint web-applications in a SharePoint farm. Web-applications are management containers for site collections. SharePoint uses site collections to structure the sites and subsites that implement the actual functionality of SharePoint solutions.

Site collections bridge logical architecture and information architecture (IA). The design goals for site collections in the model are to create logical divisions of content and functionality, in addition to satisfying requirements for URL design. To satisfy the requirements for URL design, each web application includes a single root-level site collection, such as ://portal/. In addition, a set of managed paths is used to incorporate a second tier of site collections, such as ://portal/HR/ and ://portal/sites/***.

All web-applications in a farm share a group of common SharePoint 2010 service applications that provide shared services such as indexing & search to the farm. I use the term “shared services provider” (SSP) in this post for such a group of service applications. A farm can contain multiple groups of shared services, such as an intranet SSP and an extranet SSP.

The overall SharePoint farm architecture for internal collaboration can be summarized like this:


The initial SharePoint 2010 farm will be configured to support two solutions: providing team-sites for internal collaboration in groups or projects, and providing My Sites for employees. These two web-applications will share a group of shared service applications (“default group”) for search, user profiles and enterprise taxonomy.

The User Profile service is required to provide My Sites. In addition, this service application is what provides all social features such as tagging, rating and social bookmarking.

Note that a search service application is needed for more than just crawling and queries; the social tagging functionality in SP2010 also requires search. The reason is that several web-parts, such as Note Board, Tag Cloud and Tagged Items, in some modes depend on the search service to do security trimming of their content. This includes the tag profile page, which uses such web-parts. You can also use the search service and the content search, people search and social search capabilities when customizing the user experience of your solutions. A typical example is creating a tag cloud web-part for managed metadata and enterprise keywords columns for lists and libraries, not just for social bookmarking (it is not really social tagging that is implemented in SP2010) with the standard "Tags & Notes" ribbon.

If you use FAST Search for SharePoint 2010 (FS4SP) then the FAST Content SSA will provide the content sources that are crawled for search results, except for people search. The FAST Query SSA has a different set of content sources, including the one feeding the people scope. Only one Search Service Application (SSA) should be associated with your web-application, either FAST or standard SharePoint 2010 Server search. Note that FAST doesn't index the social tags, which affects those social tagging web-parts that depend on search (see above).

The intranet will use Active Directory as the identity provider for authentication in classic mode. The web-application authentication mode can later on be switched from classic to claims-based as needed. Note that it is not recommended to switch the mode from claims-based to classic.

The farm is also designed to scale-out to support future SharePoint 2010 solutions such as a publishing intranet (://puzzlepart). Note that other service applications not shown here might be required to support the functional solution design and future solutions.

The different web-applications in the farm run in separate application pools to achieve process isolation. This way an error or crash in one application will be isolated and not affect the other web-applications in the farm.

Anywhere access for mobile employees to the intranet sites is provided using Forefront Unified Access Gateway (UAG). This supports a wide range of locations such as at home or on the road, and a wide range of devices from laptops to smart phones.

The overall SharePoint farm architecture target for collaboration with external partners and suppliers can be summarized like this:


The extended 2011+ version of the farm is scaled-out to support two new solutions; one is the ://puzzlepart web-application for intranet publishing, the other is an extranet web-application for collaboration with external users such as partners and suppliers. The former connects to the intranet shared services provider (“default group”) and will as such reuse and share search, user profiles, and the enterprise taxonomy (metadata and content types) with the existing solutions for team-sites and my sites.

The extranet collaboration solution must not be connected to the intranet SSP for information security reasons. Using a separate group of shared services for all service applications that may contain or expose confidential information is strongly recommended – an extranet SSP (“custom group”). A typical service application of this kind is search. Using separate service applications prevents accidental information exposure due to e.g. misconfiguration of a service application.

The User Profile service application is included in the extranet SSP even if the extranet will not provide My Sites at all. This service is what enables users to do social tagging and bookmarking, and must thus be part of the extranet SSP to provide social features. It is not recommended to connect the extranet to the intranet User Profile service for the same reasons as for the Search service.

Still, some service applications are typically shared across both the intranet and the extranet web-applications. This includes the enterprise taxonomy – you need to provide for consistent classification and tagging of content across all solutions to drive findability. The managed metadata service, including content type hubs, is specifically built for the purpose of being both shared and syndicated across multiple solutions.

Another security aspect is access control; the extranet must use a separate claims-based authentication provider that can be federated with external identity providers. The intranet and extranet web-applications must trust different providers for authorization. Using separate providers prevents accidental information exposure due to e.g. misconfiguration of site access control and group memberships. Note that multi-mode authentication is preferred over mixed mode for extranet collaboration.

A final aspect of the extended farm design is that the extranet web-application and shared services are run in separate application pools from the intranet. This ensures process isolation to prevent errors and crashes in the extranet processes from affecting the intranet solutions. It also isolates malicious use of the extranet process resources, such as denial of service attacks, from affecting the intranet processes.

Remote access for external users to the extranet collaboration web-application is provided using Forefront Unified Access Gateway (UAG) combined with ADFS for federated identity and access management (IAM). Using UAG saves you from setting up and managing a separate extranet farm in a DMZ perimeter zone; instead you give a controlled set of external users secure anywhere access to a controlled set of internal applications hosted on the intranet farm.

Friday, June 18, 2010

Open-Closed Term Sets & Terms in SharePoint 2010

The 'Move Term Set' action for term store management, in combination with Restricted and Full permissions settings for the Managed Metadata Service, can be used to enforce the Open-Closed principle for adding new term sets to your taxonomy. Allow local taxonomy managers to add column specific term sets (open for extension) in local site-collections, but do not give them full permissions on the core term store (closed for modification). Then have a policy for periodically reviewing local site-collection term sets to incorporate useful new ones into the shared core term store.


Note that each term set also has an open-closed setting for controlling if new terms can be added to the term set or not. Users with restricted or full permissions are allowed to add new terms to open term sets; users with read permissions are not. Use the combination of open + restricted to enforce the "open for extension, closed for modification" policy for term set terms.
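
The open-closed setting of a term set can also be set from PowerShell through the taxonomy object model. This is a minimal sketch; the site URL, the term store name and the group and term set names are hypothetical and must be replaced with names from your own term store.

```powershell
# Hypothetical site URL, term store name, group name and term set name
$session = Get-SPTaxonomySession -Site "http://puzzlepart"
$termStore = $session.TermStores["Managed Metadata Service"]
$termSet = $termStore.Groups["Enterprise"].TermSets["Departments"]

# Open the term set, so that users with sufficient permissions can add new terms
$termSet.IsOpenForTermCreation = $true

# Changes are not persisted until they are committed to the term store
$termStore.CommitAll()
```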

Note that access to the term store is granted per web-application using the app-pool account, not per user or group. In addition, you can control permissions on each term group using the "contribute" and "manage" settings for granting rights to users and groups. If users are not granted contribute rights, they will be restricted to read even if the web-application's MMS connection permissions allow for more than just read.

Tuesday, June 15, 2010

Content Type Hub Publishing and Column Specific Term Sets in SharePoint 2010

In SharePoint 2010, you realize your taxonomy using the content type hub and term store provided by the Managed Metadata Service (MMS). The hub contains site content types built from site columns, and the "Managed Metadata" column type is what you use to connect term sets to a field. When adding a managed metadata column, you must choose between "Use a managed term set" defined in the MMS using Central Admin, or "Customize your term set" which allows you to create a new column specific term set on the fly.

Column specific term sets that you create will by default be assigned to the site-collection hosting the site content type gallery in which you create the site column. Such term sets will not be visible in the Term Store Management Tool in Central Admin, even if the MMS connection is set as the default storage location for column specific term sets. This also applies to column specific term sets created in the content type hub.

Do not create column specific term sets in a content type hub, as this will break the content type publishing. If you click "Manage publishing for this content type" you will get this error:
The current content type contains a managed metadata column that uses a customized term set that is not available outside the current site collection. Please change the column setting or remove the column and publish the content type again.
There is, however, a third option in addition to the two suggestions: moving the term set from the site collection term group to a shared term set group in the managed metadata service term store.
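
The move can also be scripted through the taxonomy object model instead of using the 'Move Term Set' action. This is a sketch under assumptions: the hub URL, the term store name, the term set name and the target group name are all hypothetical.

```powershell
# Hypothetical hub URL, term store name, term set name and target group name
$site = Get-SPSite "http://puzzlepart/sites/cthub"
$session = Get-SPTaxonomySession -Site $site
$termStore = $session.TermStores["Managed Metadata Service"]

# The site collection term group that holds the column specific term sets
$localGroup = $termStore.GetSiteCollectionGroup($site)
$termSet = $localGroup.TermSets["Document Category"]

# Move the term set into a shared group in the term store and persist the change
$termSet.Move($termStore.Groups["Enterprise"])
$termStore.CommitAll()

$site.Dispose()
```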


Move the term set, then verify that the content type publishing setting is Republish in the content type hub. In Central Admin, run the "Content Type Hub" job first, then all applicable "Content Type Subscriber" jobs to execute the actual publish-subscribe process. Finally, open the site content types inventory in a subscriber site-collection and verify the subscribed content type, site column and term set.
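
The timer jobs can be started from PowerShell rather than from Central Admin. This sketch assumes that the jobs can be matched on the display names quoted above; verify the names in your own farm first.

```powershell
# Assumption: the jobs are matched on the display names shown in Central Admin
$jobs = Get-SPTimerJob

# Start the hub job first
$jobs | ? { $_.DisplayName -eq "Content Type Hub" } | Start-SPTimerJob

# Then start all subscriber jobs
$jobs | ? { $_.DisplayName -eq "Content Type Subscriber" } | Start-SPTimerJob
```

As far as I know, Start-SPTimerJob only queues a job for execution, so give the hub job time to complete before starting the subscriber jobs.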

I also recommend updating the site column definition to use the "use a managed term set" setting after moving the term set into the core term store.

Saturday, June 12, 2010

SharePoint 2010 Intra-Farm Term Store Syndication

The managed metadata service in SharePoint 2010 makes it simple to realize an enterprise taxonomy across all site-collections in a farm, or even across farms. In addition, the managed metadata service (MMS) application can be syndicated, that is, you can have multiple instances of the MMS and they can all be consumed in combination as one by web-applications. You can have multiple instances of the other service application types, but only one of them can be the active default association in a consuming web-application.

The Managed metadata service application overview on Technet used to have a nice example scenario of managed metadata term store and content type syndication across multiple departments in a company, in the page published on April 16th. As the example scenario has been removed in the page published on May 12th, the following figure shows the MMS syndication example:


For those of you who speak Spanish, or want to run it through Google translate, the managed metadata syndication example is still available here: http://technet.microsoft.com/es-es/library/ee424403.aspx

The salient point of the example is that the HR, IT, Products and Legal departments all share a common base term store and content type hub, allowing for the Products department to extend the base with its own term store and content types, and allowing the Legal department to have its own private term store while syndicating both the Global and Products managed metadata services.

An important detail in the example is that the Legal department uses term sets to represent confidential information. Therefore, it requires its own term store that prohibits other departments from seeing the confidential terms, even as new terms and term sets are added. Thus, the Legal department requires its own default term set location to prevent new confidential terms being added to the Global and Products managed metadata services.


This scenario is quite possible to realize, as the two managed metadata service connection settings Default keyword location and Default term set location allow for specifying per MMS connection if this is the MMS term store where new keywords and column specific term sets will be added. However, as these options are set on the "Managed Metadata Service Connection" proxy itself rather than on the web-application "Service Application Associations" settings, there is a problem for the example scenario.

All web-applications in a farm share all service applications, connection proxies and proxy groups defined in the farm. The example scenario requires the IT, HR, Products and Legal web-applications to all be connected to the "Global Managed Metadata" service. As indicated by the red markup in the above figure, the web-apps require a different setting for default term set location - Legal requires "no", the others "yes". Thus, they cannot use the same MMS connection; a second connection is needed. However, you cannot add another local service application connection from Manage Service Applications; only cross-farm connections can be created from Central Admin.

The solution is to use PowerShell to add a new Managed Metadata Service Connection with the command New-SPMetadataServiceApplicationProxy using the -ServiceApplication parameter, rather than the -Uri parameter used for cross-farm connections.

 New-SPMetadataServiceApplicationProxy -Name "MMS_PX001" -ServiceApplication "MMS2" 

Some other parameters are:
-DefaultKeywordTaxonomy: This service application is the default storage location for keywords.
-DefaultSiteCollectionTaxonomy: This service application is the default storage location for column specific term sets.
-ContentTypeSyndicationEnabled: Publish content types from the content type hub.
-ContentTypePushdownEnabled: Push-down updates from the content type hub to subscribers.
-DefaultProxyGroup: Add this connection to the service application associations [default] proxy group.

Don't add your custom MMS syndication connections to the default proxy group as you will need to configure the service application associations per web-application that subscribes to a syndicated set of managed metadata services, typically using the [custom] proxy group. Additional custom proxy groups can be created in the farm using the New-SPServiceApplicationProxyGroup PowerShell cmdlet.
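
Creating the connection, adding it to a custom proxy group and associating the group with a web-application can be sketched like this. The proxy group name and the web-application URL are hypothetical; the connection and service application names are the ones from the example above.

```powershell
# Create a local MMS connection that is the default location for column specific term sets
$proxy = New-SPMetadataServiceApplicationProxy -Name "MMS_PX001" -ServiceApplication "MMS2" -DefaultSiteCollectionTaxonomy

# Hypothetical proxy group name
$group = New-SPServiceApplicationProxyGroup -Name "LegalProxyGroup"
Add-SPServiceApplicationProxyGroupMember -Identity $group -Member $proxy

# Hypothetical web-application URL; associate the web-application with the custom proxy group
Set-SPWebApplication -Identity "http://legal" -ServiceApplicationProxyGroup $group
```

Note that a new proxy group starts out empty, so the connections for search, user profiles and the other services that the web-application needs must be added to the group as well.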


Note that as there are two term set "default" settings, there are four possible combinations of MMS connection configurations for terms. Thus, four MMS connection variants must be created to cover all possible scenarios per managed metadata service instance. Then, each web-application can use its own service application associations [custom] proxy group to pick the applicable set of MMS connections.

Make sure that each proxy group has at most one default keyword location and at most one default term set location. Having none of either will prevent users from adding new keywords or new term sets, respectively. Having multiple defaults for either will cause an error, as SharePoint cannot know in which of the locations to add new keywords and new term sets.
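
The rule is easy to check in a script once you have listed the connections in a proxy group. The sketch below works on plain objects rather than real proxy objects; the function name and the property names are hypothetical and chosen for this example only.

```powershell
# Sketch: each input object describes one MMS connection in a proxy group.
# The property names are hypothetical, not the real proxy object properties.
function Test-MmsDefaults {
    param([object[]]$Connections)

    # Count the connections flagged as default location of each kind
    $keywordDefaults = @($Connections | Where-Object { $_.DefaultKeywordLocation }).Count
    $termSetDefaults = @($Connections | Where-Object { $_.DefaultTermSetLocation }).Count

    # More than one default of either kind is the error case
    return ($keywordDefaults -le 1) -and ($termSetDefaults -le 1)
}

# Example proxy group: only the Legal connection is a default location
$legalGroup = @(
    New-Object PSObject -Property @{ Name = "Global"; DefaultKeywordLocation = $false; DefaultTermSetLocation = $false }
    New-Object PSObject -Property @{ Name = "Legal"; DefaultKeywordLocation = $true; DefaultTermSetLocation = $true }
)
Test-MmsDefaults -Connections $legalGroup
```

The check treats zero defaults as valid, since that is a legitimate way of preventing new keywords or term sets from being added.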

A final note about term store permissions: Access to the term store is granted per web-application using the app-pool account, not per user or group. In addition, you can control permissions on each term group using the "contribute" and "manage" settings for granting rights to users and groups. If users are not granted contribute rights, they will be restricted to read even if the web-application's MMS connection permissions allow for more than just read.

[UPDATE] This scenario will make incremental crawl of friendly URLs (FURL) fail in SP2013; and if you contact Microsoft Support you will get help from people who don't understand proxy groups, default proxy/MMS connection, default keyword location and default column-specific term set location settings, who will tell you that the product group confirms that the documented scenario on Technet is in fact not supported. Stay away from using SharePoint for anything else than a single department collaboration scenario, SP2013 is not built for real-life enterprise usage.

Wednesday, May 12, 2010

Minimal SharePoint Governance Plan - Part III

This is part III in a mini-series about the Minimal SharePoint Governance Plan needed to get you started with your SharePoint governance efforts. This part gives a more detailed overview of the minimal governance plan. The overview comprises operational and functional areas from SharePoint architecture, via site, user and information lifecycle management, to realization of SharePoint solutions. As you will see, there is a multitude of governance aspects, even if just focusing on the technical aspects. Executing on all governance aspects from day one is not viable; that is why I recommend starting with simple governance.

There is no one governance plan to rule them all. Governance is too multi-faceted for a single set of policies to fit all the different site types across diverse business areas. Governance for controlled publishing sites will be very different from Enterprise 2.0 pull-style situational solutions, as will it differ for management of project sites and team sites, and as for community, social and personal sites. Adapt your governance plan according to the targeted solution.


The site classification scheme shown here is from the Technet SharePoint Governance Checklist Guide, refer to page 20 for more details.

Architectural Governance

The governance plan for the SharePoint solution must define a logical architecture model based on your Information Architecture analysis, adhering to architectural components, farm deployment, capacity and planning recommendations on Technet.

• Farm design policies for a robust and flexible platform
• Sharing and isolation policies for applications and information
• Site-collection structure policies to drive overall solution architecture & governance
• Information asset structure policies to ensure classification, management and findability

The objective of having architectural policies is to create a workable farm and solution design considering hard and soft SharePoint limits.

Site Lifecycle Management

The governance plan for site lifecycle management (SLM) must specify policies for managing sites from creation to disposition. Define a classification scheme for site types and adapt your governance plan to each site type. It is recommended to develop timer jobs to automate and enforce SLM policies.

You need a site sweeper job that disposes of expired, abandoned and useless sites from your solution to ensure that the overall effect produces useful business results. Make sure that knowledge captured in obsolete sites is retained through information management tasks before permanently disposing of the sites. Alas, don't be afraid of self-service provisioning, after all more is different.

• SLM policies must be defined and enforced
• Site Provisioning
    o Implement custom provisioning if ootb functionality is not sufficient
    o Provisioning policy per site type defines level of automation and self-service
    o Use provisioning wizard to collect data related to SLM
    o Store SLM data in site properties or a site inventory list
• Site Retention
    o Do not rely on database backup for retention, backup retention might be shorter than site retention
    o Prepare to restore sites deleted by users
• Site Disposition
    o Implement custom site sweeper if ootb functionality is not sufficient
    o Standard site sweeper is only for site-collections (site use confirmation)
    o Define a procedure for information management when disposing a site

Note that there is no ootb site directory in SharePoint 2010. Still, you can simply create a shared custom list and use it for inventory management of sites as part of your SLM implementation.
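To make the sweeper policy concrete, here is a minimal sketch of the decision logic such a job could run against the site inventory list. It is plain Python rather than SharePoint code, and everything in it is an assumption made for illustration: the `SiteRecord` fields, the 180-day threshold, and the three action names are not taken from SharePoint or from any specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical threshold: a site untouched for this long counts as abandoned.
ABANDONED_AFTER = timedelta(days=180)


@dataclass
class SiteRecord:
    """One row of a hypothetical site inventory list."""
    url: str
    site_type: str
    owner: Optional[str]   # None when the owner account no longer exists
    expires_on: date       # collected at provisioning time
    last_modified: date


def classify(site: SiteRecord, today: date) -> str:
    """Return the SLM action for a site: 'dispose', 'confirm' or 'keep'."""
    if site.expires_on < today:
        return "dispose"   # expired: retain the knowledge, then dispose
    if site.owner is None or today - site.last_modified > ABANDONED_AFTER:
        return "confirm"   # orphaned or abandoned: ask for use confirmation
    return "keep"


def sweep(inventory: list[SiteRecord], today: date) -> dict[str, list[str]]:
    """Group site URLs by the action the sweeper should take."""
    result: dict[str, list[str]] = {"dispose": [], "confirm": [], "keep": []}
    for site in inventory:
        result[classify(site, today)].append(site.url)
    return result
```

In a real farm the thresholds would probably vary per site type, which is one reason to store the site type in the inventory alongside the SLM data.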

DocAve Backup & Recovery is a recommended 3rd-party tool; it provides capabilities beyond ootb SharePoint 2010, such as item-level recovery. For site retention, the CodePlex MSIT Site Delete Capture tool is an option.

User Lifecycle Management / Identity & Access Management

The governance plan for user lifecycle management (ULM) must specify policies for managing users from onboarding to termination. ULM is directly related to information security (access and auditing), information management, and Identity & Access Management (IAM). Employees come and go, resulting in SharePoint data that nobody manages or, in the worst case, lost knowledge.

Implementing good site disposition policies and good information management policies will reduce the effort required for user lifecycle management, as obsolete sites and information will then be disposed of in a timely manner, keeping the quantity of orphaned data down.

• ULM policies must be defined and enforced
• Site memberships and permissions must be assigned for new users
• Site and information asset permissions & ownership must be handled when
    o Account is terminated
    o User transfers to another business role or department
• A policy for reassignment of ownership must be defined

Having tools for management of user permissions, ownership and lifespan is nice, but not a prerequisite. Lightning Tools DeliverPoint or Axceler ControlPoint are recommended partner solutions for user management.

Content Type Governance / Information Management

The governance plan must specify policies for content management according to your Information Architecture analysis and taxonomy. A taxonomy is realized in SharePoint using Site Content Types for information asset types and Term Sets for coherent tagging of information. Content types combined with metadata tagging are essential for information classification and for driving findability.

Content types define the static classification hierarchy of the information managed in SharePoint. A content type is built from a set of fields defining the metadata of the content type, further detailing the classification of the information. Some metadata fields require the use of a controlled vocabulary for content tagging.

Information management policies can be assigned to content types. The most important are for retention and disposition of content, helping you manage e.g. outdated content to ensure the relevance and timeliness of your information.

• Reuse the Open-Closed Enterprise Taxonomy across web-applications and site-collections
• Always use a core Content Type Hub store for the enterprise taxonomy
• The core content type store defines company specific immutable base content types
• Ensure that all additional content types derive from the core content types, by extending the immutable base
• Use few required metadata fields, max 3-5 per content type
• Use sensible default values where possible
• Define policies for
    o Reusable content (document repository)
    o Retention of outdated content / historical archive (document repository)
    o Retention of expired content (records repository or disposition)
    o Regulatory compliance records (records management)
    o Disposition of content
• Retention policies are important for driving findability, use them to prevent irrelevant search results
• Define and enforce behavior using
    o Information Management policies
    o Workflows / Event receivers
• Use Information Management policies for
    o Retention, disposition, auditing, labeling / barcodes
• Ensure that content types are evolved according to best practices

The new SharePoint 2010 multi-stage retention support, combined with workflow and document/records repositories, allows for implementing and enforcing sophisticated content management policies. Note that the new SharePoint 2010 Content Organizer only supports Document-based content types, in addition to e-mail messages.

Managed Metadata Governance

The governance plan must specify policies for management of the managed metadata used when tagging content in SharePoint. Managed metadata is a controlled vocabulary defined in the corporate taxonomy, realized in SharePoint 2010 as term sets defined in the Managed Metadata Service.

• Reuse the Open-Closed Enterprise Taxonomy across web-applications and site-collections
• Always use a core Managed Metadata Service term store for the enterprise taxonomy
• Allow local Managed Metadata Services for isolated, locally managed term stores
• Always use synonyms when defining terms, consistent content tagging is essential for content management and for driving findability
• Use term translation to support other languages for the term
• Avoid random or haphazard tagging due to unintelligible terms
• Enable managed keywords for user-driven freeform tagging of content
• Ensure that term sets are evolved according to best practices
• Define and enforce a policy for reviewing open term sets for improper usage

Note that search does not take term synonyms or translations into account when searching; it only finds the stored key term. The same applies to faceted search – or 'refinement panels' as they are called.

You can have multiple Term Set stores and Content Type Hub inventories in SharePoint 2010. This allows for combining both enterprise definitions and local definitions to support both shared and isolated taxonomy configurations. See Plan to share terminology and content types on TechNet.

Social Tagging Governance

The governance plan must specify policies for management of the social tagging features.

• Use managed keywords to enable folksonomy for content (list items)
• Use social tagging to enable folksonomy for "anything with a URL"
• Allow for managed metadata and managed keywords to be included in social tags
• Define and enforce a policy for reviewing the folksonomy tags for improper usage

Note that the social tagging of "anything with a URL" is provided by the SharePoint 2010 User Profile Service application, not the Managed Metadata Service application. Thus, social tags have no explicit relation to the term store at all. The same applies to other SharePoint 2010 social features such as ranking and social bookmarking.

Document Template Governance

The governance plan must specify policies for using Office templates in content types.

• Use a shared set of enterprise Office templates
• Manage and store templates in a SharePoint document library at a central location
• Do not store templates directly in content types, always reference the central shared templates
• Make use of the Office 2010 Backstage or the document information panel for managing metadata directly in Office

Office 2010 now has support for storing templates in a SharePoint repository. Use AD group policies to populate 'File > Save As' and to lock down storage locations such as file shares and local disk.

List & Library Definition Governance

The governance plan must specify policies for managing content in lists and libraries. It is strongly recommended to use only lists based on site content types, rather than directly customizing list definitions. Enforcement of consistent classification and information management policies depends on using site content types.

• List content
    o Use only a few content types per list
    o Content types in a list must be cohesive
    o Prefer list views over "dumb" folders
    o Use SharePoint 2010 folders when appropriate
• List permissions
    o Prefer using inherited permissions
    o Avoid user item level permissions
• Enforce content management policies using
    o Versioning, check-in/out, workflows / event receivers
• Information Rights Management (IRM)
    o Policies for document access and usage restrictions
    o IRM policies are applied when a document is downloaded from the library
    o Enable by installing Active Directory Rights Management Services (AD RMS)
• Information management policies
    o Prefer implementing IM policies on content types rather than on lists or libraries

Note that some of the new SharePoint 2010 features work only for document libraries, such as the Unique Document ID, Document Set and Content Organizer features.

Permissions Governance

The governance plan must specify policies for how to manage access to sites and information assets, including which permissions users and groups have. All experience shows that simple permission policies are more secure. The more intricate and fine-grained permissions assignments you have, the harder it is to know who has access to what – and the more likely it is that there will be information security breaches exposing confidential information.

• Use SP groups to manage user group memberships
• Build your SP groups from AD security groups
    o Management of AD group members is typically a bottleneck, thus avoid it
• Do not assign permissions to single users, always assign to SP groups
• Prefer inherited groups (role assignments)
• Prefer inherited permission levels (role definitions)
• Use unique permissions at site level (favored) or list/library level only when absolutely required
• Avoid assigning item level permissions
• Site-collections are preferred security management boundaries

The visibility into what a user has access to has improved a bit in SharePoint 2010, as have the usage reporting capabilities. Still, 3rd-party tools such as Lightning Tools DeliverPoint or Axceler ControlPoint might be required for professional permissions management beyond the built-in SharePoint 2010 Permissions Tool.
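The permission policies in this section are simple enough to check mechanically against a permissions report, whatever tool produces it. The sketch below is plain Python, not SharePoint code, and it only illustrates the rules. The `RoleAssignment` shape and its field values are hypothetical and do not mirror the output of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    """One hypothetical row of a permissions report."""
    scope: str            # 'site', 'list' or 'item'
    url: str
    principal: str
    principal_type: str   # 'sp_group', 'ad_group' or 'user'
    inherited: bool


def audit(assignments):
    """Yield (url, finding) pairs for assignments that break the policies."""
    for a in assignments:
        if a.inherited:
            continue  # inherited assignments are the preferred case
        if a.scope == "item":
            yield a.url, "unique item-level permissions"
        if a.principal_type != "sp_group":
            yield a.url, (
                f"permission assigned directly to "
                f"{a.principal_type} '{a.principal}'"
            )
```

Unique permissions at site or list level are allowed by the policy when absolutely required, so the sketch does not flag them; a stricter audit could report them for review.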

Search Governance

The governance plan must specify policies for driving findability through indexing and search. The Information Architecture analysis defines the information taxonomy and organization blueprint realized in a SharePoint site structure capable of storing and managing your content. The site structure combined with content types enables findability through consistent classification and tagging of content.

• Ensure ease of adding information assets to correct location
    o Users should not have to enter a lot of required metadata
    o Users should not have to browse/navigate extensively to store content
    o Task context should deduce location, e.g. CRM client document store
• Metadata tagging through content types for all findable assets
• Use content type retention policies to prevent irrelevant, outdated search results
• Use search scopes to provide search context: people, tasks, articles, project, archive, etc.
• Use faceted search (refinement panel)
• Ensure and enforce information isolation
    o Farm design must prevent configuration mistakes from exposing confidential information by accident
    o Use separate service application groups or even separate shared services farms

The most valuable search is the one that connects a user to other people, as people are often the best sources of information and knowledge. This is especially true for tacit knowledge: know-how that improves business performance, or novel information that generates a flow of new knowledge and, in short, ignites innovation. A former CEO of Hewlett Packard famously observed: "If HP knew what HP knows, we would be three times as profitable".

Findability is more than just search capabilities; it also includes the SharePoint 2010 social computing features such as “Tags & Notes” for tagging and social bookmarking. Tag clouds, metadata-based navigation and filtering, and even the My Site activity feed are all enablers of driving findability.

Note how I say "driving findability"; findability is not something you just enable, you have to actively manage and adapt the Search Service application settings according to your business needs. Just enabling search is just as bad as not managing your users' expectations of enterprise search.

All parts of this mini-series:
Part I - SharePoint Governance - Eating an Elephant
Part II - Start with Simple Governance
Part III - Minimal Governance Plan (this post)

Thursday, April 15, 2010

SharePoint 2010 Open-Closed Taxonomy

A taxonomy is realized in SharePoint using content types for information asset types and term sets for coherent tagging of information. Content types define the static classification hierarchy of the information managed in SharePoint. A content type is built from a set of fields defining the metadata of the content type, further detailing the classification of the information. Some metadata fields require the use of a controlled vocabulary, realized in SharePoint 2010 as term sets defined in the managed metadata service.

A central design recommendation for both content types and term sets is the Open-Closed principle: open for extension, closed for modification. The Open-Closed principle states "software entities (content types, term sets) should be open for extension, but closed for modification"; that is, such an entity can allow its configuration to be modified through extension without altering the core base entity definition. This is especially valuable in a production environment, where changes to base definitions will necessitate code reviews, unit tests, and other such procedures to qualify those changes. The principle isolates the changes to the extension entity definitions, because the core base entities are immutable. This makes it simpler to evolve the entities in a controlled manner while keeping a common, consistent core base definition.

SharePoint is restrictive on what changes can be made when evolving or specializing content types, with very limited support for modifying existing content type definitions. The recommended practice is to use a base set of core content types (the closed part) and then perform all specialization and variations on content types inheriting from the base content types (the open part). Evolving term sets is less restrictive than content types and allows for much more flexibility in changes to their definition, including renaming and merging terms.
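The closed base and open extensions can be illustrated with a small model. This is a sketch in plain Python, not the SharePoint object model. The `ContentType` class, its `extend` method and the field names are all invented for the example. The point is only that a specialization adds fields by inheriting from the base, while the base definition itself cannot be changed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentType:
    """Simplified, hypothetical model of a site content type."""
    name: str
    fields: tuple[str, ...]
    parent: Optional["ContentType"] = None

    def all_fields(self) -> tuple[str, ...]:
        """Fields inherited from the base chain, followed by the type's own."""
        inherited = self.parent.all_fields() if self.parent else ()
        return inherited + self.fields

    def extend(self, name: str, *extra_fields: str) -> "ContentType":
        """Create a derived type; the base instance is never modified."""
        return ContentType(name=name, fields=tuple(extra_fields), parent=self)


# Closed part: a core base content type (names are illustrative only).
company_document = ContentType("Company Document", ("Title", "Document Owner"))

# Open part: a local specialization that adds fields through inheritance.
contract = company_document.extend("Contract", "Counterparty", "Expiry Date")
```

Because the dataclass is frozen, an attempt to assign to `company_document.fields` raises an error, which mirrors treating the core content types as immutable.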


New in SharePoint 2010 is the capability to share content types and term sets across multiple site-collections and even across farms. These enterprise content types and term sets can be combined with local site content types and local term sets, giving a combined syndication effect. The Open-Closed principle can also be applied to this syndicated taxonomy: use the enterprise content type inventory and the enterprise term set inventory as the core taxonomy base (the closed part), and use the site inventories to extend the taxonomy with local specializations (the open part).

In technical SharePoint 2010 service application terms: create a primary term store and a primary content type hub, then extend those locally with additional local term sets and local content type inventories subscribing to the primary core taxonomy.

Applying the Open-Closed principle both for content type and term set definitions, and for content type and term set syndication, allows for defining and enforcing governance policies that make it possible to have centrally controlled lifecycle management of the core base definitions of your SharePoint taxonomy, while still allowing local extension and adaptation of the taxonomy.

See SharePoint 2010 Intra-Farm Term Store Syndication for details on realizing taxonomy syndication, including configuration of permissions per web-app, user or group for adding or modifying term sets.