
Thursday, May 08, 2014

Getting SSL Termination to work for HNSC in SP2013

We have been struggling a bit with getting off-box SSL termination to work properly for SharePoint 2013 host-named site collections (HNSC). We had issues with the ribbon, with admin pages like "manage content and structure", and with the term picker - sure signs that some JavaScript files did not load. Users could not edit the terms in managed metadata fields; that is, terms could be selected, but clicking "ok" to save would just hang forever. A lot of scripts and links would not load, showing mixed content warnings in IE9 - and nothing at all in Chrome and Firefox, which both just block HTTP content on secure HTTPS pages.

To cut to the chase, this setup for SSL offloading is what worked for us:
  • create the web-app on port 80 (not 443), do not use the -SecureSocketsLayer switch
  • do not use the server name as the web-app name, you have a farm - don't you?
  • always extend the web-app to other zones before starting to create HNSC sites; leave one zone unextended, e.g. the "custom" zone
  • create a classic root site-collection with the same HTTP name as the web-app, do not use a HNSC for this
  • a site template is not required for the root site-collection
  • alternate access mapping (AAM) is used for load balancing even for HNSC, but HNSCs can't use AAM for host name aliases
  • create the HNSC using an internal HTTP URL in New-SPSite for the default zone, remember that crawling must always use the default zone
  • create a public URL alias for the default zone by mapping an unextended zone using a HTTPS URL in Set-SPSiteUrl, such as the "custom" zone
  • create public HNSC mappings using HTTPS URL in Set-SPSiteUrl for the other zones
  • ensure that your gateway adds the custom header "front-end-https: on" for all your public URLs secured using SSL
  • note that using just "front-end-https: on" and HTTP in the public URL will not correctly rewrite all links in the returned pages

In short, the salient point is to use HTTPS in the public URLs even if the web-app zone does not use the -SecureSocketsLayer switch or any SSL certificates. The default zone of the web-application must be configured for crawling - either no SSL, or full SSL with certificates assigned in IIS. With no SSL you have to simulate AAM by mapping two URLs to the HNSC default zone. Using Set-SPSiteUrl on an unextended zone is like creating an alias for the default zone.
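
To make the list above concrete, here is a minimal PowerShell sketch of the setup. All names and URLs are hypothetical placeholders - adjust them to your own farm and topology.

# web-app on port 80, no -SecureSocketsLayer, not named after a server
New-SPWebApplication -Name "Pzl Intranet Farm" -Port 80 -ApplicationPool "PzlIntranetAppPool" -ApplicationPoolAccount (Get-SPManagedAccount "PZL\svc-apppool")

# classic root site-collection with the same HTTP name as the web-app, no template needed
New-SPSite "http://pzl-webapp" -Name "Root" -OwnerAlias "PZL\spadmin"

# HNSC created with an internal HTTP URL in the default zone (used for crawling)
New-SPSite "http://intranet.pzl.internal" -HostHeaderWebApplication "http://pzl-webapp" -Name "Intranet" -OwnerAlias "PZL\spadmin" -Template "STS#0"

# public HTTPS alias mapped to an unextended zone; the gateway terminates SSL here
# and must add the "front-end-https: on" header for this URL
Set-SPSiteUrl (Get-SPSite "http://intranet.pzl.internal") -Url "https://intranet.puzzlepart.com" -Zone Custom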

We had to use HTTP on the default zone to crawl the content of the published pages. It seems that if the web-application does not use SSL and your site default zone uses a HTTPS host header, then only the friendly URLs (FURL) will be crawled while the content will generate a lot of "This item comprises multiple parts and/or may have attachments. Not all of these parts were indexed." warnings. The result of the warning is no metadata being indexed, thus no search results - not good for a search-driven solution.

Note that SSL is recommended for all web-applications in SP2013 also inside the firewall, especially if you use apps - as the OAuth tokens will otherwise be exposed in the HTTP traffic, just as classic IIS basic authentication is not recommended without SSL. We wanted to do SSL bridging with BigIP because of this, but could not get SSL server name indication (SNI) configured successfully in BigIP v11 to allow us to have SSL certificates bound to two different IIS web-sites, even though IIS8 supports SNI.

SNI is required when the shared wildcard certificate or SAN certificate approach cannot be used for your SP2013 web-application setup, i.e. when binding to host names in multiple IIS web-sites at the web-application level. In other words, you need SNI as soon as you have more than one web-application or more than one zone (extended web-app), even if you could bind your one-SAN-to-rule-them-all certificate to multiple IIS web-sites. IIS cannot route the request based on the host header until the request has been decrypted - SNI allows the request to be routed to the correct IIS web-site.

Remember that this is the path the HTTP(S) request travels from the browser:

 browser >
  host header >
   DNS A-record >
    virtual IP-address (VIP) in gateway > SSL off-box termination here
     load balancing >
      IIS server configured with IP-address >
       IIS web-site bound to IP-address (or host header) > normal SSL termination here
        SP web-application >
         site-collection bound to host header (HNSC)

Keeping tabs on this will help you understand the Technet guide to HNSC, which has some room for improvement. See this article by jasonth for a step-by-step guide for HNSC and SSL. Note that binding to host names in IIS rather than to IP-addresses for HNSCs at the SP2013 web-application level is supported, just as it was for SP2010.

Thursday, May 01, 2014

Managed Metadata Navigation, Anonymous Users in SP2013

The new term-driven navigation in SP2013 has some gotchas for anonymous users, resulting in them not seeing a full navigation menu. These are some things to check:
Finally, remember that you have to publish a major version of each page that you link to from a navigation node, otherwise anonymous users won't see the page, nor the term. This includes all items on the page that also require approval, such as images. An easy thing to forget, if you've been so stupid as not to use the simple publishing configuration for your site. If you as an admin or logged-in user can see terms and view a page while visitors cannot - you forgot to publish. An empty page or a missing term is a sure sign.

Related to the managed navigation is the friendly URL (FURL) mechanism, which uses the term set structure to build the FURL from the linked-to term. To prevent broken links when moving a term, SP2013 stores links using the FIXUPREDIRECT.ASPX page, with params such as the termID, which will be resolved server-side into a friendly URL when rendered (see navigation term GetResolvedDisplayUrl). Do not render RichHtmlField using the simple "SPWC:FieldValue" web-control, as this will not resolve the fixup-links. In addition, having the same control both in an edit mode panel and in a display mode panel might cause problems.

This all applies to author-in-place (AIP) usage of term-driven navigation and friendly URLs; cross-site publishing (XSP) has a different set of issues.

Note that the managed navigation term set is stored in the default MMS of the hosting web-application. It uses the local term store of the site-collection it belongs to (IsSiteCollectionGroup). This affects your backup/restore procedure: a restore needs not only the content database or site-collection backup, but also the MMS database or tenant backup. As all host-named site-collections (HNSC) share a web-application, restoring the MMS with its term stores will affect the navigation term set of all site-collections. Take care.

Saturday, March 16, 2013

Controlling Content Database Size in SharePoint


A SharePoint content database can hold up to 4TB of data (a maximum of 200GB is recommended). However, storage size is not the problem; it is the time needed to restore all that data that is the availability problem. The recovery time determines how long your business-critical solution will be down. As SharePoint can spread its content across multiple databases, it is recommended that your architecture segments different content across different databases based on your information architecture (IA) and other user experience aspects, plus business requirements for availability and recovery time. Plan for structuring your solutions with a strong focus on your IA.

Here are some options for how to control the size of the content databases, without disposing and deleting content:

A) Use an ootb Record Center as an archive for old content: The users must manually send each document to the RC using e.g. move and leave a link; note that only the latest major version with metadata is kept – all version history is lost. The information management policies supported by SharePoint for retention and disposition can be used to automate the cleanup.
As the RC has its own content databases, the live collaboration databases will grow slower or even shrink as outdated information is moved to the archive. Keeping the live databases small ensures a shorter recovery time, while the recovery time for the archived content can be considerable but is not business critical.
Search must be configured appropriately to cover both live and archived content.

B) Use a third-party archiving solution for SharePoint from e.g. MetaLogix or AvePoint. This has the same pros & cons as in option A, but the functionality is probably better in relation to keeping version history and batch management of outdated content.
Search must be configured appropriately to cover both live and archived content.

C) Use a third-party remote blob storage (RBS) solution for SharePoint, such as MetaLogix StoragePoint, so that documents are registered in the database, but not stored there. This gives smaller content databases, but more complicated backup and recovery as the content now resides both in databases and on disk. Provided that you don’t lose both at the same time, the recovery time should be shorter.
Search will work as before, as all content is still logically in the “database”.

D) Use powershell scripts or other code to implement the disposition of outdated content. The script can e.g. copy old documents to disk and delete old versions from the content database; the drawback being that all metadata will be lost and there is no link left in SharePoint.
The database size will shrink as data is actually deleted, and backup and recovery become more complicated as content is now both in the database and on disk (same as for option C).
Search can be configured to also crawl and index the files on disk, but content ranking will suffer as the valuable metadata is lost.
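
As a sketch of what such a disposition script for option D could look like, here is a minimal PowerShell example; the site URL, library name, cutoff date and archive share are hypothetical, and you should test carefully in a non-production farm before deleting anything.

# copy documents older than the cutoff to disk and trim their version history
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
$web = Get-SPWeb "http://intranet/sites/projects"
$list = $web.Lists["Documents"]
$cutoff = (Get-Date).AddYears(-3)
$archivePath = "\\fileserver\sp-archive"
foreach ($item in @($list.Items))
{
    if ($item["Modified"] -lt $cutoff -and $item.File -ne $null)
    {
        # copy the latest version to disk (metadata is lost)
        $bytes = $item.File.OpenBinary()
        [System.IO.File]::WriteAllBytes((Join-Path $archivePath $item.File.Name), $bytes)
        # delete all previous versions to shrink the content database
        $item.File.Versions.DeleteAll()
    }
}
$web.Dispose()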

My recommendation is to consider option A first, especially if you are able to define automated rules and exploit the built-in information management policies in SharePoint. The keyword is *able* - in my experience, everyone is positive about having automated retention and disposition, but no one, even at large banks and law firms, is able to come up with the policies.

Always consider using RBS for databases larger than 200GB, and note that RBS also helps you meet the disk IOPS requirements of SharePoint.


Saturday, March 02, 2013

How to Debug SharePoint Solutions in a Multi-Server Farm

Some tips on deploying and debugging code in a multi-server SharePoint 2010 farm:

When debugging on a multi-server SharePoint farm with Visual Studio on the app-server, add an AAM mapping for the app-server to the web-application you want to debug; otherwise VS can't attach to the local w3wp process.

For example, to debug the web-app hosted on
  • http://azure-sp2010web:8383/ 
you must add the app-server URL first to AAM and then use it as Site URL when debugging
  • http://azure-sp2010app:8383/
Make sure to browse the site using the added app-server URL to load the code in a local w3wp process.
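
The internal AAM URL can be added in Central Admin or with a one-line PowerShell sketch like this (URLs from the example above):

# add the app-server URL as an internal URL on the default zone of the web-app
New-SPAlternateURL -WebApplication "http://azure-sp2010web:8383/" -Url "http://azure-sp2010app:8383/" -Zone Default -Internal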

If the breakpoints have yellow warning triangles, then VS could not load the correct code into the w3wp processes attached to the debugger. Solve this by rebuilding and redeploying to get the latest bits into the 14 hive and the GAC. Note that you can't activate WSPs on deploy in a multi-server farm; set the "Active Deployment Configuration" to "no activation" in the Visual Studio project properties. If VS still can't deploy the WSP, then use PowerShell to first Add-SPSolution and then Install-SPSolution across the farm, as shown below.
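
A minimal sketch of that PowerShell fallback, with a hypothetical WSP path and the example web-app URL from above:

# add the package to the farm solution store and deploy it to the web-application
Add-SPSolution -LiteralPath "C:\Deploy\Puzzlepart.Debugging.wsp"
Install-SPSolution -Identity "Puzzlepart.Debugging.wsp" -WebApplication "http://azure-sp2010web:8383/" -GACDeployment -Force

# check the deployment status across the farm afterwards
Get-SPSolution -Identity "Puzzlepart.Debugging.wsp" | Select Name, Deployed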

Validate the WSP deployment status in “Manage farm solutions” in Central Admin first, and make sure that your feature is activated in the site or subsite you try to debug.

Happy debugging :)

Thursday, October 04, 2012

Dynamically Assign Approver for the Content Approval Workflow in SP2010

This post is a how-to guide for customizing the ootb SharePoint 2010 content approval workflow to automatically pick a user from the current list item, such as the 'content responsible' field, and assign that user as the approver of the content, using SharePoint Designer. The customization of the content approval using SPD is quite straightforward, except for some less intuitive and misleading options for editing the workflow task process. It also involves publishing and editing the XML config of the workflow to enable the "Start this workflow to approve publishing a major version of an item" option for automatically starting the approval workflow when the author checks in (submits) the page or document for approval.

To get started, follow the SharePoint Designer Walkthrough: Copy & Modify Publishing Workflow steps 1-4 to make a copy of the "Approval - SharePoint 2010" workflow in the site-collection root. Edit the workflow name to suit your needs and make sure to pick the content type that contains the user field that you will use to auto-assign as the approver in "Pick a base content type to limit this workflow to". Otherwise you won't be able to add a lookup for that content type field. Click OK and save.


Open the saved custom approval workflow and click "edit workflow". Change the name of the "Start Approval Workflow Task" action as you like. Then click on "Parameter: Approvers" to change the "with [these users]" for the workflow action to use a user from the current item that is pending approval. Now, to dynamically assign an approver, you need to click the "Enter participants manually" button.


Then in "Select task participants" click the address book button to open the "Select users" dialog box. Now select "Workflow lookup for a user", which will trigger the "Lookup for person or group" dialog.


Click OK three times, and the start approval workflow action should now look like this:

Now, as a side-effect, the comments for the task have been unbound, so you need to click on properties for the workflow action and bind the comments to the "Parameters: Request" again. This ensures that the text entered in the request field when starting the approval workflow is not missing when the approver opens the workflow task to approve or reject the pending content.


Click OK two times, and your customized approval workflow with a dynamic approver is almost ready. There are a couple of workflow parameters that are not needed when automatically assigning the approver; these can be hidden from the workflow initiation form so that the author is not confused when the workflow starting form is shown on check-in. Click "Initiation form parameters" in the ribbon and make sure that the "Approvers" and "Expand groups" parameters are not shown during workflow initiation.


Removing the parameters is also an option; it just feels safer to hide them rather than delete them. Save and publish your custom content approval workflow as a globally reusable workflow as shown in steps 13-15. You can now follow the Configuring Approval in SharePoint steps to use your workflow on a document library that requires content approval, and it will work for the content types that the custom approval workflow is associated with for the selected list.


However, the "Start this workflow to approve publishing a major version of an item" option will be missing when associating the workflow with the document library, even if it is a bonafide approval workflow and even if the list has "Require content approval for submitted items" turned on. This is caused by the workflow being a content type rather than a list workflow, and only list workflows can be configured to start content approval on check-in. Luckily, it is rather simple to change this by editing the XML config file for the published workflow XOML definition using SPD, as shown in Writing Your Own SharePoint Publishing Approval Workflow. I made these changes to the file:
  • changed the Category attribute to "List;Language:1033;#ContentType;Language:1033" 
  • changed the AllowStartOnMajorCheckin attribute to "true"
  • removed the ContentTypeId attribute completely
After directly editing and saving the workflow config file using SPD, the workflow designer will be somewhat out of sync with the XOML definition, due to the removal of the content type and the change of the association category. Still, the actual workflow logic works as expected.

You can now associate and test the custom approval workflow on your document library. These are the typical "Version settings" used for the "Pages" list in publishing sites:


Note that the user assigned as the approver must be a member of the "Approvers" group in publishing sites, or have the right to edit draft items in the document library; otherwise they cannot open the workflow task to approve or reject the content, even if they are the owner of the task.

Wednesday, September 12, 2012

Migrated Content Database gives Unexpected Error for all Publishing Pages

Today I migrated a SharePoint 2010 publishing site content database to our Azure staging farm. All went smoothly after using sp_changedbowner on the restored database before adding the content database to the web-application using Central Admin. However, when I tried to browse the publishing site, I got the "an unexpected error has occurred" message. Browsing /_layouts/settings.aspx worked fine, and so did browsing "all site content" and the /pages/ list settings.
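
For reference, the restored database can also be attached with PowerShell instead of Central Admin; a minimal sketch with hypothetical database, server and web-application names:

# attach the restored content database to the staging web-application
Mount-SPContentDatabase -Name "WSS_Content_Publishing" -DatabaseServer "AZURE-SP2010SQL" -WebApplication "http://azure-sp2010web"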

Using the correlation ID in combination with the ULS viewer led me to this infamous portal sitemap provider exception:

DelegateControl: Exception thrown while adding control 'ASP._controltemplates_publishingconsole_ascx': Object reference not set to an instance of an object.

PortalSiteMapProvider was unable to fetch current node, request URL: /Pages/Forsiden.aspx, message: Object reference not set to an instance of an object., stack trace:   
 at Microsoft.SharePoint.SPField.GetTypeOrBaseTypeIfTypeIsInvalid(SPFieldCollection fields, String strType)    
 at Microsoft.SharePoint.SPFieldCollection.GetViewFieldsForContextualListItem()    
 at Microsoft.SharePoint.SPContext.get_Item()    
 at Microsoft.SharePoint.SPContext.get_ListItem()    
 at Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapProvider.get_CurrentNode()

Googling a "PortalSiteMapProvider was unable to fetch current node" exception is no fun. You will typically get it in relation to top navigation and related site map providers. I've chased the cause of that error before, and sometimes had to resort to iisreset each night due to the publishing cache going corrupt over time.

This time, luckily, the exception details indicated a problem with the /pages/ list item definition, which led me to How to fix "System.NullReferenceException: Object reference not set to an instance of an object. at Microsoft.SharePoint.SPField.GetTypeOrBaseTypeIfTypeIsInvalid" that helped me solve my problem. By looking closer at the list settings for the pages list, I could see that a list column was flagged as invalid (look for the text "Delete this invalid field").

Trying to browse to Site Settings > Site Columns didn't work, but it gave me enough information to find out what caused the issue and helped me solve it:

Field type AdvancedCalculated is not installed properly. Go to the list settings page to delete this field.

Deploying the missing feature to the staging farm solved the problem. Now it only remains to fix all those absolute URLs entered by the content authors.

Saturday, August 25, 2012

Use Azure VMs with On Premises Gateway as SP2010 Branch Office Farm

Here are some notes from my ongoing experience of setting up a SharePoint 2010 "branch office" farm in Azure using the current preview of persistent Azure virtual machines in an Azure virtual network connected to the on premises Active Directory using an Azure gateway to a Juniper VPN device.
Useful TechEd Europe 2012 web-casts that show how things work in general:
VM+VNET in Azure: http://channel9.msdn.com/Events/TechEd/Europe/2012/AZR208
AD in Azure: http://channel9.msdn.com/Events/TechEd/Europe/2012/SIA205
SP in Azure: http://channel9.msdn.com/Events/TechEd/Europe/2012/OSP334

Step by step instructions that I followed:
How to Create a Virtual Network for Cross-Premises Connectivity using the Azure gateway.
How to Install a Replica Active Directory Domain Controller in Windows Azure Virtual Networks.

Creating a virtual network (vnet) is easy and simple using the Azure preview management portal. I recommend creating the local network first, as the vnet wizard otherwise might fail - without giving any useful exception message. We had an issue caused by naming the local network "6sixsix", which didn't work because the name starts with a digit. Also note that the VPN gateway only supports one LAN subnet in the current preview.

Plan your subnets upfront and make sure that they don't overlap with the on premises subnets. Register both the existing on premises DNS server and the planned vnet DNS server when configuring the vnet. A tip here is that the first VM created in a subnet will get .4 as the last part of the IP address, so if your ADDNSSubnet is 10.3.4.0/24, then the vnet DNS will get 10.3.4.4 as its IP address. Note that you can't change the DNS configuration after adding the first VM to the network; this includes creating the Azure gateway, which adds devices to the gateway subnet.

After creating the Azure virtual network, we created and started the Azure gateway for connecting to the on premises LAN using a Site-to-Site VPN tunnel using a secure IPSec connection. Creating the gateway takes some time as some devices or VMs are provisioned in the gateway subnet you specified. We then sent the public IP address of the gateway, plus the shared key and the configuration script for the Juniper VPN device to our network admin. The connection wouldn't come up, and to make a long story short, the VPN configuration needs the 'peerid' to be set to an IP address of a device in the gateway subnet. Our gateway subnet was 10.3.1.0/24 and after trying 10.3.1.4 first (see above tip), the network admin tried 10.3.1.5 and that worked. I'll come back to this below when telling you about our incident when our trial Azure account was deactivated by Microsoft.

With the Azure virtual network up and running and connected to the on premises LAN, I created the AD DNS virtual machine using the preview portal "create from gallery" option. As SP2010 is not supported on WS2012 yet, I decided to use the WS2008R2 server image in this Azure server farm. Note that you should use size "large" for hosting AD as you need to attach two data disks for storing the AD database, log files and system state backup.

I did not use powershell for creating this first VM, instead I manually changed the DNS setting on the network adapter (both IPv4 and IPv6) and then manually joined the to-be AD DNS VM to the existing domain. Note that while you're at it, also set the advanced DNS option "Use this connection's DNS suffix in DNS registration" for both network adapters. Otherwise you will get the "Changing the Primary Domain DNS name of this computer to "" failed" error when trying to join the domain.

Following the how-to for setting up a replica AD in Azure worked fine; we only had some minor issues due to the existing AD being on WS2003. For instance, we found no DEFAULTIPSITELINK when creating a new site in AD, so we had to create a new site link first, then create the site, and finally modify the site link so that it linked the Azure "CloudSite" and the LAN site. Then the dcpromo wizard step for AD site detection didn't manage to resolve against our WS2003 domain controller; just click "ok" on the error message and manually select the "CloudSite" in the "Select a site" page.


I really wanted to set up a read-only domain controller (RODC) to save some outgoing (egress) traffic and thus save some money, as this branch farm doesn't need a full-fidelity domain controller. However, it is not possible to create a RODC when the existing DC is on WS2003, because RODC is a WS2008 feature. So for "Additional Domain Controller Options" we went with DNS and "Global Catalog" (GC). GC isn't required, but if it is not installed then all authentication traffic on login needs to go all the way to the on premises DC. So to save some traffic (and money), and get faster authN in the branch farm, we added the GC - even if the extra data will drive up Azure storage cost.

The next servers in the farm were added using PowerShell to ensure that 1) the VM is domain joined on boot, and 2) the DNS settings for the VM are automatically configured. A sketch of the pattern follows the tips below.

Here are some tips for using New-AzureDNS, New-AzureVMConfig and New-AzureVM:
  • You can use the Azure vnet DNS server or the on premises DNS server with New-AzureDNS. I used the former.
  • The New-AzureVMConfig Name parameter is used when naming and registering the server in AD and DNS. Make sure that the full computer name is unique across the domain.
  • The New-AzureVM ServiceName parameter is used for the cloud DNS name prefix in the .cloudapp.net domain. It is also used to provision a "Cloud Service" in the Azure preview management portal. Even if multiple VMs can be added to the same service name (shared workload), I used unique names for the farm VMs (standalone virtual machines), connected using the vnet for load balancing.
  • To get the built-in Azure image names, use this powershell to go through the images in the gallery until you find the one you're looking for:
           (Get-AzureVMImage)[1].ImageName
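
Putting those tips together, the pattern I used looks roughly like this sketch; names, sizes and credentials are placeholders, and the parameters are from the 2012-era Azure service management cmdlets, so verify them against your module version:

# create a domain-joined VM with DNS settings configured on boot
$dns = New-AzureDNS -Name "pzl-addns" -IPAddress "10.3.4.4"
$vm = New-AzureVMConfig -Name "pzl-sp2010wfe" -InstanceSize "Large" -ImageName $ws2008r2ImageName |
      Add-AzureProvisioningConfig -WindowsDomain -Password $adminPwd -Domain "PZL" -DomainUserName "svc-domainjoin" -DomainPassword $joinPwd -JoinDomain "puzzlepart.local" |
      Set-AzureSubnet -SubnetNames "SharePointSubnet"
New-AzureVM -ServiceName "pzl-sp2010wfe" -VMs $vm -VNetName "pzl-vnet" -DnsSettings $dns -AffinityGroup "pzl-affinitygroup"
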
After adding the SQL Server 2012, web server and application server VMs using PowerShell, I logged in using RDP and verified that each server was up and running, domain joined and registered in the DNS. Note that the SQL Server image is not by default configured with separate data disks for data and log files. This means that the SQL Server 2012 master database etc. is stored on the OS disk in this preview. You need to add data disks and then change the SQL Server file location configuration yourself. Adding two data disks requires that the SQL Server VM is of size "large".

The next step was to install SharePoint 2010 on the farm the next day. That's when the trial account was deactivated because all the computing hours were spent. Even if you then reactivate the account, all your VM instances are deleted, keeping only the VHD disks. As Microsoft support says, it is easy to recreate the VMs, but they also tell you that the AD virtual machine needs a static IP, which you can only get in Azure if you never delete the VM. Remember to recreate the VMs in the correct order so that they get the same IP addresses as before.

What is worse is that they also delete the virtual network and the gateway. Even if it is also easy to recreate these, your gateway will get a new public IP address and a new shared key, so you need to call your network provider again to make them reconfigure the VPN device.

I strongly recommend not using a spending-capped trial account for hosting your Azure branch office farm. Microsoft deleted the VMs and the network to stop incurring costs, which was fine with non-persistent Azure VM Roles (PaaS), but not as nice for an IaaS service with a persistent server farm.

I recommend exporting your VMs using Export-AzureVM so that you don't have to recreate the VMs manually if something should happen. The exported XML will contain all the VM settings, including the attached data disks.
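
A sketch of the export, and of a later re-import if the VM instance should get deleted again (service and VM names are the hypothetical ones used above):

# export the VM settings, including attached data disks, to an XML file
Export-AzureVM -ServiceName "pzl-sp2010wfe" -Name "pzl-sp2010wfe" -Path "C:\AzureBackup\pzl-sp2010wfe.xml"

# recreate the VM from the exported settings later
Import-AzureVM -Path "C:\AzureBackup\pzl-sp2010wfe.xml" | New-AzureVM -ServiceName "pzl-sp2010wfe" -VNetName "pzl-vnet" -DnsSettings $dns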

How to detach Azure VMs to move or save cost: http://michaelwasham.com/2012/06/18/importing-and-exporting-virtual-machine-settings/

When we recreated the Azure virtual network and the gateway, the VPN connection would not come back up again. The issue was that this time the gateway devices had got different IP addresses, so now the "peerid" had to be configured as 10.3.4.4 to make things work.

Now the gateway is back up again, and next week I'll restore the VMs and continue with installing SP2010 on the Azure "branch office" farm. More notes to come if I run into other issues.

- - - continued - - -

Installing the SharePoint 2010 bits went smoothly, but running the config wizard did not. First you need to allow incoming TCP traffic on port 1433 on the Azure SQL Server. Then the creation of the SharePoint_Config database failed with:

    Could not find stored procedure 'sp_dboption'.

...even though I had downloaded and installed SP2010 with SP1 bits. The issue is caused by using SQL Server 2012, so I downloaded and installed SharePoint Server 2010 SP1 and the June 2011 CU, and that fixed the problem. Got "Configuration Successful" without any further issues.

Finally, I tested and verified it all by creating a SP2010 web-application with a team site, creating a self-signed certificate with IIS7 and adding an Azure port mapping for SSL (virtual machine endpoint, TCP 443 to 443), allowing me to login to the team site using my domain account over HTTPS from anywhere.
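
The SSL endpoint can also be added with PowerShell; a small sketch using the hypothetical VM name from above:

# open TCP 443 on the web server VM (virtual machine endpoint)
Get-AzureVM -ServiceName "pzl-sp2010wfe" -Name "pzl-sp2010wfe" | Add-AzureEndpoint -Name "https" -Protocol tcp -LocalPort 443 -PublicPort 443 | Update-AzureVM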

A note on the VM firewall config is that ping is by default blocked, thus you can't ping other machines in the vnet unless you configure the firewall to allow it. Also note that you can't ping addresses outside of the virtual network and the connected LAN anyway; even if you can browse to www.puzzlepart.com, you can't ping us.

Saturday, June 23, 2012

SharePoint Publishing Site Map Providers and Navigation

Configuring the navigation of SharePoint 2010 publishing sites and subsites can be a bit confusing, also when configuring the navigation from code or from your web templates (or even old-school site definitions). Add to that a UI that changes based on which settings you choose combined with what site or subsite context you're currently in, plus the quite large number of site map providers defined in web.config when using the PortalSiteMapProvider from code.

This post is about how the UI settings can be repeated in your site provisioning code, that is: first configure the navigation settings in your prototype to make it work according to your navigation concept, then package the settings into feature code.

The PortalSiteMapProvider works in combination with the PublishingWeb navigation settings, and of course with the top and current navigation controls used to render the navigation as HTML. The latter needs to look at the publishing web's PortalNavigation settings when querying the portal site map provider for the CurrentNode or when getting a set of navigation nodes to render. The navigation controls' code use the PortalSiteMapProvider properties IncludeSubSites, IncludePages, IncludeAuthoredLinks and IncludeHeadings to set the filtering applied to GetChildNodes when rendering nodes. These filter properties are typically set to IncludeOption.PerWeb to reflect the navigation settings of the current site or subsite.

The navigation settings UI tries to show the effects of your navigation settings (upper half) by rendering a preview of what nodes GetChildNodes would return for the *current* site (lower half) from the applicable site map provider. The PortalSiteMapProvider exposes several of the providers defined in web.config as static properties, but only two of them are typically used: CombinedNavSiteMapProvider and CurrentNavSiteMapProvider. The former is what feeds the top navigation, the latter feeds the current (left, local) navigation.


Note that when inheriting global navigation, the UI won't show the global navigation container as it only supports configuring the navigation of the current site. The term "parent site" in the UI refers to the top-level site of the site-collection, which is not the direct parent of a subsite beyond level 1 children.


Configuring the navigation settings from your site provisioning feature is quite simple once you've got a working prototype of your navigation concept. Use the mapping shown in the above figure to program the configuration settings for both global navigation (InheritGlobal, GlobalIncludeSubSites, GlobalIncludePages) and current navigation (InheritCurrent, CurrentIncludeSubSites, CurrentIncludePages, ShowSiblings).

The only little pitfall is for "Display the current site, the navigation items below the current site, and the current site's siblings" which requires a combination of InheritCurrent = false and ShowSiblings = true. Use this setting to show the same local navigation for a section of your web-site and all its child sites. A typical example is for the Quality Management section (level 1 subsite) and its QMS areas (level 2 subsites) to have a shared navigation experience. The QMS section would not use ShowSiblings while all the child areas would have ShowSiblings turned on.
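
As a sketch of how that combination maps to the PublishingWeb navigation properties, here it is set with PowerShell against a hypothetical QMS area subsite; the same properties are what you would set from your feature code:

# configure "current site, items below it, and siblings" for a level 2 area subsite
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Publishing") | Out-Null
$web = Get-SPWeb "http://intranet/quality/qms-area1"
$pubWeb = [Microsoft.SharePoint.Publishing.PublishingWeb]::GetPublishingWeb($web)
$pubWeb.Navigation.InheritCurrent = $false   # do not inherit current navigation
$pubWeb.Navigation.ShowSiblings = $true      # show the parent section's subsites as siblings
$pubWeb.Navigation.CurrentIncludeSubSites = $true
$pubWeb.Navigation.CurrentIncludePages = $false
$pubWeb.Update()
$web.Dispose()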

Implementing a custom navigation concept is as simple as writing your own navigation rendering controls, and inheriting the PortalSiteMapProvider to override the logic for CurrentNode and GetChildNodes to suit your needs by applying the applicable node filtering properties to control which nodes are returned and rendered in which context. I've also used this approach for reading the navigation items from a central SharePoint list to get common cross site-collection top-navigation.

I hope this helped you understand how to realize your navigation concept from code, and that you're not totally confused by all the available site map providers and how they are used anymore.

Monday, May 07, 2012

Exploring Search Results Step-by-Step in SharePoint 2010

Using search to provide a news archive in SharePoint 2010 is a well-known solution. Just add the core results web-part to a page and configure it to query for your news article content type and sort it in descending order. Then customize the result XSLT to tune the content and layout of the news excerpts to look like a news archive. Also add the search box, the search refiners and the results paging web-parts, and you have a functional news archive in no time.

This post is about providing contextual navigation by adding "<< previous", "next >>" and "result" links to the article pages, to allow users to explore the result set in a step-by-step manner. Norwegians will recognize this way of exploring results from finn.no.


For a user or visitor to be able to navigate the results, the result set must be cached per user. The search results are in XML format, and the XML contains a sequential id and the URL for each hit. This allows the navigation control to use XPath to locate the current result by id, and get the URLs for the previous and next results. The user query must also be cached so that clicking the "result" link will show the expected search results.

Override the CoreResultsWebPart as shown in my Getting Elevated Search Results in SharePoint 2010 post to add per-user caching of the search results. If your site allows for anonymous visitors, you need to decide on how to keep tabs on them. In the code I've used the requestor IP address, which is not 100% foolproof, but this allows me to avoid using cookies for now.

using System;
using System.ComponentModel;
using System.Web;
using System.Web.Caching;
using System.Web.UI;
using System.Xml;
using System.Xml.XPath;
using Microsoft.Office.Server.Search.Query;
using Microsoft.Office.Server.Search.WebControls;
using Microsoft.SharePoint;
 
namespace Puzzlepart.SharePoint.Presentation
{
    [ToolboxItemAttribute(false)]
    public class NewsArchiveCoreResultsWebPart : CoreResultsWebPart
    {
        public static readonly string ScopeNewsArticles 
            = "Scope=\"News Archive\"";
 
        private static readonly string CacheKeyResultsXmlDocument 
            = "Puzzlepart_CoreResults_XmlDocument_User:";
        private static readonly string CacheKeyUserQueryString 
            = "Puzzlepart_CoreResults_UserQuery_User:";
        private int _cacheUserQueryTimeMinutes = 720;
        private int _cacheUserResultsTimeMinutes = 30;
 
        protected override void CreateChildControls()
        {
            try
            {
                base.CreateChildControls();
            }
            catch (Exception ex)
            {
                var error = SharePointUtilities.CreateErrorLabel(ex);
                Controls.Add(error);
            }
        }
 
        protected override XPathNavigator GetXPathNavigator(string viewPath)
        {
            //return base.GetXPathNavigator(viewPath);
 
            SetCachedUserQuery();
            XmlDocument xmlDocument = GetXmlDocumentResults();
            SetCachedResults(xmlDocument);
 
            XPathNavigator xPathNavigator = xmlDocument.CreateNavigator();
            return xPathNavigator;
        }
 
 
        private XmlDocument GetXmlDocumentResults()
        {
            XmlDocument xmlDocument = null;
 
            QueryManager queryManager = 
            SharedQueryManager.GetInstance(Page, QueryNumber).QueryManager;
 
            Location location = queryManager[0][0];
            string query = location.SupplementaryQueries;
            if (query.IndexOf(ScopeNewsArticles, 
                StringComparison.CurrentCultureIgnoreCase) < 0)
            {
                string userQuery = 
                    queryManager.UserQuery + " " + ScopeNewsArticles;
                queryManager.UserQuery = userQuery.Trim();
            }
 
            xmlDocument = queryManager.GetResults(queryManager[0]);
            return xmlDocument;
        }
 
        private void SetCachedUserQuery()
        {
            var qs = HttpUtility.ParseQueryString
                    (Page.Request.QueryString.ToString());
            if (qs["resultid"] != null)
            {
                qs.Remove("resultid");
            }
            HttpRuntime.Cache.Insert(UserQueryCacheKey(this.Page), 
               qs.ToString(), null,
               Cache.NoAbsoluteExpiration, 
               new TimeSpan(0, 0, _cacheUserQueryTimeMinutes, 0));
        }
 
        private void SetCachedResults(XmlDocument xmlDocument)
        {
            HttpRuntime.Cache.Insert(ResultsCacheKey(this.Page), 
               xmlDocument, null,
               Cache.NoAbsoluteExpiration, 
               new TimeSpan(0, 0, _cacheUserResultsTimeMinutes, 0));
        }
 
        private static string UserQueryCacheKey(Page page)
        {
            string visitorId = GetVisitorId(page);
            string queryCacheKey = String.Format("{0}{1}",
                CacheKeyUserQueryString, visitorId);
            return queryCacheKey;
        }
 
        private static string ResultsCacheKey(Page page)
        {
            string visitorId = GetVisitorId(page);
            string resultsCacheKey = String.Format("{0}{1}",
                CacheKeyResultsXmlDocument, visitorId);
            return resultsCacheKey;
        }
 
        public static string GetCachedUserQuery(Page page)
        {
            string userQuery = 
                (string)HttpRuntime.Cache[UserQueryCacheKey(page)];
            return userQuery;
        }
 
        public static XmlDocument GetCachedResults(Page page)
        {
            XmlDocument results = 
                (XmlDocument)HttpRuntime.Cache[ResultsCacheKey(page)];
            return results;
        }
 
        private static string GetVisitorId(Page page)
        {
            //TODO: use cookie for anonymous visitors
            string id = page.Request.ServerVariables["HTTP_X_FORWARDED_FOR"]
                ?? page.Request.ServerVariables["REMOTE_ADDR"];
            if(SPContext.Current.Web.CurrentUser != null)
            {
                id = SPContext.Current.Web.CurrentUser.LoginName;
            }
            return id;
        }
    }
}

I've used sliding expiration on the cache to allow for the user to spend some time exploring the results. The result set is cached for a short time by default, as this can be quite large. The user query text is, however, small and cached for a long time, allowing the users to at least get their results back after a period of inactivity.

As suggested by Mikael Svenson, an alternative to caching would be running the query again using the static QueryManager page object to get the result set. This would require using another result key element than the dynamic <id> number to ensure that the current result lookup is not skewed by new results being returned by the search. An example would be using a content type field such as "NewsArticlePermaId" if it exists.

Overriding the GetXPathNavigator method gets you the cached results that the navigation control needs. In addition, the navigator code needs to know which is the result set id of the current page. This is done by customizing the result XSLT and adding a "resultid" parameter to the $siteUrl variable for each hit.

. . . 
 <xsl:template match="Result">
    <xsl:variable name="id" select="id"/>
    <xsl:variable name="currentId" select="concat($IdPrefix,$id)"/>
    <xsl:variable name="url" select="url"/>
    <xsl:variable name="resultid" select="concat('?resultid=', $id)" />
    <xsl:variable name="siteUrl" select="concat($url, $resultid)" />
. . . 

The result set navigation control is quite simple, looking up the current result by id and getting the URLs for the previous and next results (if any) and adding the "resultid" to keep the navigation logic going forever.

namespace Puzzlepart.SharePoint.Presentation
{
    public class NewsArchiveResultsNavigator : Control
    {
        public string NewsArchivePageUrl { get; set; }
 
        private string _resultId = null;
        private XmlDocument _results = null;
 
        protected override void CreateChildControls()
        {
            base.CreateChildControls();
 
            _resultId = Page.Request.QueryString["resultid"];
            _results = NewsArchiveCoreResultsWebPart.GetCachedResults(this.Page);
 
            if(_results == null || _resultId == null)
            {
                //render nothing
                return;
            }
 
            AddResultsNavigationLinks();
        }
 
        private void AddResultsNavigationLinks()
        {
            string prevUrl = GetPreviousResultPageUrl();
            var linkPrev = new HyperLink()
            {
                Text = "<< Previous",
                NavigateUrl = prevUrl
            };
            linkPrev.Enabled = (prevUrl.Length > 0);
            Controls.Add(linkPrev);
 
            string resultsUrl = GetSearchResultsPageUrl();
            var linkResults = new HyperLink()
            {
                Text = "Result",
                NavigateUrl = resultsUrl
            };
            Controls.Add(linkResults);
 
            string nextUrl = GetNextResultPageUrl();
            var linkNext = new HyperLink()
            {
                Text = "Next >>",
                NavigateUrl = nextUrl
            };
            linkNext.Enabled = (nextUrl.Length > 0);
            Controls.Add(linkNext);
        }
 
        private string GetPreviousResultPageUrl()
        {
            return GetSpecificResultUrl(false);
        }
 
        private string GetNextResultPageUrl()
        {
            return GetSpecificResultUrl(true);
        }
 
        private string GetSpecificResultUrl(bool useNextResult)
        {
            string url = "";
 
            if (_results != null)
            {
                string xpath = 
                    String.Format("/All_Results/Result[id='{0}']", _resultId);
                XPathNavigator xNavigator = _results.CreateNavigator();
                XPathNavigator xCurrentNode = xNavigator.SelectSingleNode(xpath);
                if (xCurrentNode != null)
                {
                    bool hasNode = false;
                    if (useNextResult)
                        hasNode = xCurrentNode.MoveToNext();
                    else
                        hasNode = xCurrentNode.MoveToPrevious();
 
                    if (hasNode && 
                        xCurrentNode.LocalName.Equals("Result"))
                    {
                        string resultId = 
                        xCurrentNode.SelectSingleNode("id").Value;
                        string fileUrl = 
                        xCurrentNode.SelectSingleNode("url").Value;
                        url = String.Format("{0}?resultid={1}"
                           fileUrl, resultId);
                    }
                }
            }
 
            return url;
        }
 
        private string GetSearchResultsPageUrl()
        {
            string url = NewsArchivePageUrl;
 
            string userQuery = 
                NewsArchiveCoreResultsWebPart.GetCachedUserQuery(this.Page);
            if (String.IsNullOrEmpty(userQuery))
            {
                url = String.Format("{0}?resultid={1}", url, _resultId);
            }
            else
            {
                url = String.Format("{0}?{1}&resultid={2}"
                    url, userQuery, _resultId);
            }
 
            return url;
        }
 
    }
}

Note how I use the "resultid" URL parameter to discern between normal navigation to a page and result set navigation between pages. If the resultid parameter is not there, then the navigation controls are hidden. The same goes for when there are no cached results. The "result" link could always be visible for as long as the user's query text is cached.

You can also provide this result set exploration capability for all kinds of pages, not just for a specific page layout, by adding the result set navigation control to your master page(s). The result set <id> and <url> elements are there for all kinds of pages stored in your SharePoint solution.

Monday, April 30, 2012

Almost Excluding Specific Search Results in SharePoint 2010

Sometimes you want to hide certain content from being exposed through search in certain SharePoint web-applications, even if the user really has access to the information in the actual content source. A scenario is intranet search that is openly used, but in which you want to prevent accidental information exposure. Think of a group working together on recruiting, where the HR manager uses the search center looking for information - you wouldn't want even excerpts of confidential information to be exposed in the search results.

So you carefully plan your content sources and crawl rules to only index the least possible amount of information. Still, even with crawl rules you will often need to tweak the query scope rules to exclude content at a more fine-grained level, or even add new scopes for providing search-driven content to users. Such configuration typically involves using exclude rules on content types or content sources. This is a story of how SharePoint can throw you a search results curveball, leading to accidental information disclosure.

In this scenario, I had created a new content source JobVault for crawling the HR site-collection in another SharePoint web-application, to be exposed only through a custom shared scope. So I tweaked the rules of the existing scopes such as "All Sites" to exclude the Puzzlepart JobVault content source, and added a new JobReqruiting scope that required the JobVault content source and included the content type JobHired and excluded the content type JobFired.

So no shared scopes defined in the Search Service Application (SSA) included JobFired information, as all scopes either excluded the HR content source or excluded the confidential content type. To my surprise our SharePoint search center would find and expose such pages and documents when searching for "you're fired!!!".

Knowing that the search center by default uses the "All Sites" scope when no specific scope is configured or defined in the keyword query, it was back to the SSA to verify the scope. It was all in order, and doing a property search on Scope:"All Sites" got me the expected results with no confidential data in them. The same result for Scope:"JobReqruiting"; no information exposure there either. It looked very much like a best bet, but there were no best bet keywords defined for the site-collection.


The search center culprit was the Top Federated Results web-part in our basic search site, by default showing results from the local search index very much like best bets. That was the same location as defined in the core results web-part, so why this difference?

Looking into the details of the "Local Search Results" federated location, the reason became clear: "This location provides unscoped results from the Local Search index". The keyword here is "unscoped".


The solution is to add the "All Sites" scope to the federated location to ensure that results that you want to hide are also excluded from the federated results web-part. Add it to the "Query Template" and optionally also to the "More Results Link Template" under the "Location Information" section in "Edit Federated Location".


Now the content is hidden when searching. Not through query security trimming, but through query filtering. Forgetting to add the filter somewhere can expose the information, but then only to users that have permission to see the content anyway. The results are still security trimmed, so there is no actual information disclosure risk.

Note that this approach is no replacement for real information security; if that is what you need, don't crawl confidential information from an SSA that is exposed through openly available SharePoint search, even on your intranet.

Saturday, April 21, 2012

Migrate SharePoint 2010 Term Sets between MMS Term Stores

When using the SharePoint 2010 managed metadata fields connected to termsets stored in the Managed Metadata Service (MMS) term store in your solutions, you should have a designated master MMS that is reused across all your SharePoint environments, such as the development, test, staging and production farms. Having a single master termstore across all farms gives you the same termsets and terms with the same identifiers all over, allowing you to move content and content types from staging to production without invalidating all the fields and data connected to the MMS term store.

You'll find a lot of termset tools on CodePlex, some that use the standard SharePoint 2010 CSV import file format (which is without identifiers), and some that on paper do what you need, but don't fully work. Some of the better tools are SolidQ Managed Metadata Exporter for export and import of termsets (CSV-style), SharePoint Term Store Powershell Utilities for fixing orphaned terms, and finally SharePoint Taxonomy and TermStore Utilities for real migration.

There are, however, standard SP2010 PowerShell cmdlets that allow you to migrate the complete termstore with full fidelity between Managed Metadata Service applications across farms. The drawback is that you can't do selective migration of specific termsets; the whole term store will be overwritten by the migration.

This script exports the term store to a backup file:

# MMS Application Proxy ID has to be passed for -Identity parameter

Export-SPMetadataWebServicePartitionData -Identity "12810c05-1f06-4e35-a6c3-01fc485956a3" -ServiceProxy "Managed Metadata Service" -Path "\\Puzzlepart\termstore\pzl-staging.bak"

This script imports the backup by overwriting the term store:

# MMS Application Proxy ID has to be passed for -Identity parameter
# NOTE: overwrites all existing termsets from MMS
# NOTE: overwrites the MMS content type HUB URL - must be reconfigured on target MMS proxy after restoring

Import-SPMetadataWebServicePartitionData -Identity "53150c05-1f06-4e35-a6c3-01fc485956a3" -ServiceProxy "Managed Metadata Service" -path "\\Puzzlepart\termstore\pzl-staging.bak" -OverwriteExisting

Getting the MMS application proxy ID and the ServiceProxy object:

$metadataApp= Get-SpServiceApplication | ? {$_.TypeName -eq "Managed Metadata Service"}
$mmsAppId = $metadataApp.Id
$mmsproxy = Get-SPServiceApplicationProxy | ?{$_.TypeName -eq "Managed Metadata Service Connection"}

Tajeshwar Singh has posted several posts on using these scripts, including how to solve typical issues:
In addition to such issues, I've run into this issue:

The Managed Metadata Service or Connection is currently not available. The Application Pool or Managed Metadata Web Service may not have been started. Please Contact your Administrator. 

The cause of this error was neither the app-pool nor the 'service on server' not being started, but the service account used in the production farm not being available in the staging farm. Look through the user accounts listed in the ECMPermission table in the MMS database, and correct the "wrong" accounts. Note that updating the MMS database directly might not be supported.

Note that after the term store migration, the MMS content type HUB URL configuration will also have been overwritten. You may not notice for some time, but the content type HUB publishing and subscriber timer jobs will stop working. What you will notice is that if you try to click republish on a content type in the HUB, you'll get a "No valid proxy can be found to do this operation" error. See How to change the Content Type Hub URL by Michal Pisarek for the steps to rectify this.

Set-SPMetadataServiceApplication -Identity "Managed Metadata Service" -HubURI "http://puzzlepart:8181/"

After resetting this MMS configuration, you should verify that the content type publishing works correctly by republishing and running the timer jobs. Use "Site Collection Administration > Content Type Publishing" as shown on page 2 in Chris Geier's article to verify that the correct HUB is set and that HUB content types are pushed to the subscribers.

Monday, April 16, 2012

Getting Elevated Search Results in SharePoint 2010

I often use the SharePoint 2010 search CoreResultsWebPart in combination with scopes, content types and managed properties defined in the Search Service Application (SSA) for having dynamic search-driven content in pages. Sometimes the users might need to see some excerpt of content that they really do not have access to, and that you don't want to grant them access to either; e.g. to show a summary to anonymous visitors on your public web-site from selected content that is really stored in the extranet web-application in the SharePoint farm.

What is needed then is to execute the search query with elevated privileges using a custom core results web-part. As my colleague Mikael Svenson shows in Doing blended search results in SharePoint–Part 2: The Custom CoreResultsWebPart Way, it is quite easy to get at the search results code and use the SharedQueryManager object that actually runs the query. Create a web-part that inherits the ootb web-part and override the GetXPathNavigator method like this:

namespace Puzzlepart.SharePoint.Presentation
{
    [ToolboxItemAttribute(false)]
    public class JobPostingCoreResultsWebPart : CoreResultsWebPart
    {
        protected override void CreateChildControls()
        {
            base.CreateChildControls();
        }
 
        protected override XPathNavigator GetXPathNavigator(string viewPath)
        {
            XmlDocument xmlDocument = null;
            QueryManager queryManager = 
              SharedQueryManager.GetInstance(Page, QueryNumber)
                .QueryManager;
            SPSecurity.RunWithElevatedPrivileges(delegate()
            {
                xmlDocument = queryManager.GetResults(queryManager[0]);
            });
            XPathNavigator xPathNavigator = xmlDocument.CreateNavigator();
            return xPathNavigator;
        }
    }
}

Running the query with elevated privileges means that it can return any content that the app-pool identity has access to. Thus, it is important that you grant that account read permissions only on content that you would want just any user to see. Remember that the security trimming is done at query time, not at crawl time, with standard SP2010 server search. It is the credentials passed to the location's SSA proxy that is used for the security trimming. Use WindowsIdentity.GetCurrent() from the System.Security.Principal namespace if you need to get at the app-pool account from your code.

You would want to add a scope and/or some fixed keywords to the query in the code before getting the results, in order to prevent malicious or accidental misuse of the elevated web-part to search for just anything in the crawled content of the associated SSA that the app-pool identity has access to. Another alternative is to run the query under another identity than the app-pool account by using real Windows impersonation in combination with the Secure Store Service (see this post for all the needed code) as this allows for using a specific content query account.

The nice thing about using the built-in query manager this way, rather than running your own KeywordQuery and providing your own result XML local to the custom web-part instance, is that the shared QueryManager's Location object will get its Result XML document populated. This is important for the correct behavior for the other search web-parts on the page using the same QueryNumber / UserQuery, such as the paging and refiners web-parts.

The result XmlDocument will also be in the correct format with lower case column names, correct hit highlighting data, correct date formatting, duplicate trimming, getting <path> to be <url> and <urlEncoded>, have the correct additional managed and crawled properties in the result such as <FileExtension> and <ows_MetadataFacetInfo>, etc, in addition to having the row <id> element and <imageUrl> added to each result. If you override by using a replacement KeywordQuery you must also implement code to apply appended query, fixed query, scope, result properties, sorting and paging yourself to gain full fidelity for your custom query web-part configuration.

If you don't get the expected elevated result set in your farm (I've only tested this on STS claims based web-apps; also see ForceClaimACLs for the SSA by my colleague Ole Kristian Mørch-Storstein), then the sure thing is to create a new QueryManager instance within the RWEP block as shown in How to: Use the QueryManager class to query SharePoint 2010 Enterprise Search by Corey Roth. This will give you correctly formatted XML results, but note that the search web-parts might set the $ShowMessage xsl:param to true, tricking the XSLT rendering into show the "no results" message and advice texts. Just change the XSLT to call either dvt_1.body or dvt_1.empty templates based on the TotalResults count in the XML rather than the parameter. Use the <xmp> trick to validate that there are results in the XML that all the search web-parts consumes, including core results and refinement panel.

The formatting and layout of the search results is as usual controlled by overriding the result XSLT. This includes the data such as any links in the results, as you don't want the users to click on links that just will give them access denied errors.

When using the search box web-part, use the contextual scope option for the scopes dropdown with care. The ContextualScopeUrl (u=) parameter will default to the current web-application, causing an empty result set when using the custom core results web-part against a content source from another SharePoint web-application.

Thursday, March 08, 2012

SharePoint 'Approve on behalf of' for Publishing Pages

Using require approval and the approval workflow in SharePoint 2010 publishing, or just approval on the pages library in the simple publishing configuration, is straightforward when using the browser, as you're only logged in as one person with a limited set of roles and thus set of rights. Typically a page author can edit a page and submit it for approval, but not actually approve the page to be published on the site. When you need to approve, you have to log on as a user with approval rights.

Sometimes you need to extend the user experience to allow an author to make simple changes to an already published page, such as extending the publishing end date, and republish it directly without having to do all the approval procedures all over again. So you create a custom "extend expiry date" ribbon button, elevate the privileges from code and call the Page ListItem File Approve method, only to get an access denied error.

In SharePoint 2010, it is not sufficient to be the app-pool identity (SHAREPOINT\System) or even a site-collection admin, you have to run the approval procedures as a user with approval privileges. So your code needs to impersonate a user in the "Approvers" group to be able to approve an item on behalf of the current user.

public static void ApproveOnBehalfOfUser(
SPListItem item, string approversGroupName, string userName, string comment)
{
    Guid siteId = item.ParentList.ParentWeb.Site.ID;
    Guid webId = item.ParentList.ParentWeb.ID;
    Guid listId = item.ParentList.ID;
    Guid itemId = item.UniqueId;
 
    SPUserToken approveUser = GetApproverUserToken(siteId, webId, approversGroupName);
    if (approveUser == null)
        throw new ApplicationException(String.Format(
"The group '{0}' has no members of type user, cannot approve item on behalf of '{1}'"
approversGroupName, userName));
 
    using (SPSite site = new SPSite(siteId, approveUser))
    {
        using (SPWeb web = site.OpenWeb(webId))
        {
            var approveItem = web.Lists[listId].Items[itemId];
            approveItem.File.Approve(comment);
        }
    }
}
 
private static SPUserToken GetApproverUserToken(
Guid siteId, Guid webId, string approversGroupName)
{
    SPUserToken token = null;
    SPSecurity.RunWithElevatedPrivileges(() =>
    {
        using (SPSite site = new SPSite(siteId))
        {
            using (SPWeb web = site.OpenWeb(webId))
            {
                var group = web.SiteGroups[approversGroupName];
                if (group != null)
                {
                    foreach (SPUser user in group.Users)
                    {
                        if (!user.IsDomainGroup && !user.IsSiteAdmin)
                        {
                            token = web.GetUserToken(user.LoginName);
                            break;
                        }
                    }
                }
            }
        }
    });
    return token;
}

The code picks a user from the approvers group, and then impersonates that user using a SPUserToken object. From within the impersonated user token scope, the given page list item is opened again with the permissions of an approver, and finally the page is approved on behalf of the given page author.