Wednesday, August 04, 2010

Five Steps to Structure Your SharePoint Sites (Part III)

At every SharePoint customer and in every SharePoint project, the issue of how to realize a future-proof, working site structure arise. I've written a few blog posts on classification and structuring of SharePoint sites before (part 1 and part 2), focusing on scalable site structure design within the SharePoint system boundaries. This post is about getting from a classic site map based approach to a methodology driven by Information Architecture and the site classification pyramid, combined with the SharePoint containment hierarchy and SharePoint governance best practices. The latter is important as the management and usage of a site, and the service level agreement for the site, strongly influence how the overall SharePoint solution should be partitioned to adhere to architecture best practices.

Most information architects come up with a content organization model like a site map with hierarchical URL scheme. "This is the draft model, and we want the URL structure to be like this in our SharePoint solution" they say. "That's a very useful mental model of the site" I respond, "but what you really want is a good, intuitive navigation concept, helping users find what they need - like an invisible guiding hand; not that specific URL structure". Having a simple and easy mental model for the site structure is important, as it makes it easy to both contribute and find content, driving findability when the amount of information stored and managed in SharePoint grows.

Alas, that is not the primary issue with the site map approach for designing SharePoint site structures. The site map is closer to define statical IA structure than navigation, still it is only good for logical IA structure and cannot be expected to be used directly as the physical IA structure in SharePoint.


Using a site map approach to design a SharePoint site structure will only work in simple scenarios with homogenous solution domains, such as a publishing portal or a collaboration solution with limited document storage and versioning. It will fail when the SharePoint solution domains and their usage, governance and service level agreements differ, not to forget when the overall amount of data stored in SharePoint surpasses the 100GB limit per content database. The wanted URL structure might not be possible to realize in most cases anyhow when adhering to SharePoint architecture best practices.


A more structured approach is needed to design a SharePoint site structure according to architectural and governance best practices. My methodology involves a process with these five steps:

  1. Review your site map draft model against the SharePoint site classification pyramid and your governance plan for each of your identified site types.
  2. Partition your sites into solution domains based on site type and governance policies aspects such as business purpose, usage patterns, storage requirements, management boundaries, information security and isolation, operations and service level agreements.
  3. Adapt the site structure to the SharePoint containment hierarchy, adhering to limits and best practices based on the partitioned solution domains.
  4. Use the main SharePoint architectural components explicit and wildcard site-collections to ensure a scalable design that allows for future growth and expansion into new business domains. Use wildcard site-collections by default as these are super scalable. Adapt or customize SharePoint global menu and current menu to your navigation concept, to hide the solution partitioning.
  5. Define your URL scheme using host headers, manage paths and site provisioning policies.
Following these five steps is a top-down approach that is more understandable for non-techies. However, from step 3 onwards, good technical knowledge of SharePoint architecture is required.


To successfully design a working and scalable SharePoint site structure, you need a site classification scheme. I recommend the classic site classification pyramid as it outlines typical solution domains with governance and site types, helping you classify sites and draw domain boundaries in your site model.


Then, after identifying the different solution domains that make up your overall SharePoint solution, you must map the identified solution domains to the SharePoint containment hierarchy. It is imperative that you understand the containment hierarchy and its boundaries, otherwise your solution will slowly stop working as more and more users create more and more sites and fill them with more and more information.


Site-collections is the main architectural component of SharePoint, in combination with the content database that stores the site-collection's content and documents. A web-application always have a single root (1st tier) site-collection, which is the home site of the web-application. A 2nd tier of site-collections can be added to the web-application as needed to scale the application, using SharePoint managed paths. A typical scenario is to add a new solution domain. Another typical scenario is to add room for more content and documents; because a site-collection cannot span content databases and adding more space thus requires a new site-collection. Use subsites for anything that is just a section of an existing solution domain or site. See the golf and car communities in the example, which functionally are "sites" in their own right. Read my post Classification and Structuring of SharePoint Sites (Part I) to learn more.


It's easy to choose what solution goes into the root site-collection of a web-application, typically this is the portal to the overall solution. As there is a limited set of explicit site-collections per SharePoint web-application, these should be reserved for a planned, wellknown small number of predefined sites. Use the limited set of available explicit site-collections for the top two types in the site classification pyramid. Note that root and explicit site-collection storage cannot be scaled as you cannot add another content database to a site collection. Use as few explicit site-collection as possible, stay well below the limit of 20 managed paths. Expect everybody to require vanity URLs such as ://puzzlepart/HR/ for their site, so a strict governance policy for 'explicit' is required.

Use wildcard site-collections when the predictable number of sites is too large for explicit, typically for the bottom three types in the site classification pyramid. Also use wildcard site-collections even for central portals and division areas when there is no absolute requirement for an 'explicit' 2nd tier top-level site. See Puzzlepart TV in the example; even if the channel sites are classified as 'division area' (yellow color code) they do not need to be 'explicit'. In fact, due to their massive storage requirements for rich media content, best practices dictate the use of 'wildcard'. Always use wildcard site-collections for solutions have a limited life span such as projects.


Finally, after composing your scalable site structure design from root, explicit and wildcard site-collections, define the URL scheme that underlies the site structure. Use short descriptive URL elements for web-applications, managed paths and site-collections. The combination of these elements become the URL of the top-level sites that are what the users actually interacts with when using SharePoint. Note how the URL structure is affected by the SharePoint containment hierarchy. Compare ://puzzlepart/pzl-tv/ and ://puzzlepart/pzltv/ in the example, the latter contains child sites of the former in the overall navigation concept. In addition, I recommend defining and enforcing policies for all URL elements that users can provision, such as subsites, lists and libraries, to avoid non-friendly URLs with anything from spaces to special characters.

It is very likely that you will need to customize the ootb SharePoint navigation provider and menu rendering to have a consistent navigation experience across sites, sections and pages. Users get very confused and annoyed when the content of the top menu changes as they navigate from site to site, which is very understandable. There are several ways to implement common menu and navigation across site-collections, but that's left as an exercise for my readers.

The capacity and planning boundaries are mostly the same in SharePoint 2010 as in SharePoint 2007 for web-applications, site-collections and content databases. The most significant change is that SharePoint 2010 now supports max 250 000 site-collections per web-application, five times the limit for 2007. The max number of content databases per web-application has increased from 100 to 300, on the other hand the max number of site-collections per content database have been reduced from 50 000 to max 5000 with a recommendation of 2000 site-collections per content database. None of these numbers are hard limits, these are numbers for acceptable SharePoint performance.

5 comments:

Richard Harbridge said...

Beautiful. Just absolutely beautiful.

Anonymous said...

1st it is good non registered user can post comment here:)

then, I truly appreciate this article series. it is 1 or 2 year old but the beauty of the arch design is still uptodate!

Thank you!

Jason Brown said...

Very informative. I'm a business user that's doing IA work on our SharePoint portal.

I'm struggling with this question: should our publishing pages, which are to be used to display policies, be stored in one central location or dispersed (that is, stored within site/division sections).

Your diagram (#4 & 6) seems to suggest that centrally storing pages (explicit site collection 2nd tier TLS) is better than storing your publishing pages in subsites within the root site collection. I think you said this typical approach to site structure won't work with SharePoint.

The only thing about the central repository for pages is the change in URL when links point back to the actual page where it lives.

Can you provide some insight?

Kjell-Sverre Jerijærvi said...

The pages used to show the (elsewhere) stored content to users, should be where these readers expect to find them, such as on the intranet home page. The shown policies is typically rollup content (content by query web-part using content types, tagging and metadata) stored elsewhere, typically where maintained, close to the policy editors.

So, dispersed storage of content that can be rolled up and targeted to reader multiple places. Think of a news paper with a front (home) page and then multiple sections, such as domestic, foreign, sports, etc. The front page and section pages rollup content "teasers" and allows the user to browse into the content stored elsewhere in the paper to explore it further.

Whether you need sections (subsites) or not depends on your IA policies for information management. Think of a chest of drawers for you clothes: it makes it easier to manage different types of clothes in different ways, and allows for delegating a few drawers to be managed by your wife; maybe you even want to have some locked drawers with more privacy :)

Your content is the clothes, and depending on the variety of clothes you have, you might need quite a sophisticated chest of drawer. Throw in all your other stuff, and you might need a bigger closet or a garage!

Jason Brown said...

Thanks--that helps!