InfoWorker Solutions: 2008

Friday, December 19, 2008

REST Versioning: The Ripple Effect

A few weeks ago, I posted an illustration of the SOA service + schema versioning ripple effect for incompatible or semantic changes. Stepping out on a limb, I thought I should make a similar illustration for the artifacts in a RESTful service.

The artifacts in such a service are based on the four REST principles:

1. Identification of resources
2. Manipulation of resources through representations
3. Self-descriptive messages
4. Hypermedia as the engine of application state (HATEOAS)

Even REST needs a versioning and compatibility policy. The only thing that is not subject to change is the uniform interface. All other artifacts are subject to semantic or incompatible changes as the service evolves over time. Changes to the flow of the state machine are changed semantics, while changes to the decisions (allowable next state transitions/hyperlinks) need not be.

The ripple effect is a bottom-up effect, and an incompatible change on e.g. a customer address resource will cause an explosion of new versions of all affected representation artifacts. In the end, such a change should propagate even to the service consumers. If not, the consumers would operate on representations that have different real-world effects than they are expected to have.

Semantic changes should always be explicitly communicated to the consumers. Incompatible changes should be treated the same way for consistency, i.e. enforce a uniform versioning policy that allows consumers to be standardized.

Having a versioning strategy lets you control the effects of the inevitable changes. Using a compatibility policy will help you alleviate some of the negative effects of versioning.

Tuesday, December 16, 2008

REST: Decision Services Needed More Than Ever

You can switch from classic SOA using service interfaces expressing business capabilities and domain specific messages and data types, to REST exchanging resource representations in standardized formats over the uniform interface with hypermedia expressing the next allowable state transitions of the processes operating on the resources. But how do you put those next state transitions (links) into the representation in the first place? Where are the semantics of the business events interpreted?

The classic SOA domain specific service interfaces and schemas try to express semantics that can be used when composing processes, not just a common format and protocol. Yes, you get more technical loose coupling in your systems with REST, but you still have to do the semantic interpretation of the business messages somewhere. The huge advantage of REST is that the semantic decisions or compositions of processes are not part of the consumers, allowing applications like mashup clients to be standardized and (re)use a wide range of shared RESTful services. The only decision consumers make is which of the transitions to follow next.

The application state machine using hypermedia as the engine needs to have a mechanism for deducing which are the next allowable state transitions to emit as links into the resource representation. This is where I think that the classical decision services/business rule engine is a good fit even for REST. Granted, it is a bit unusual to calculate a set of possible decisions - allowable state transitions - beforehand, kind of a dynamic state-driven workflow. But it should be no stranger than adjusting to accepting the uniform interface. In addition, nothing is stopping you from inspecting the incoming resource representation as part of the decision making for controlling the flow of your business processes.
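The idea of a decision service that calculates the allowable transitions up front can be sketched roughly as follows. This is a minimal illustration only: the order resource, its states, the "unpaid orders cannot ship" rule and the URI layout are all assumptions made for the example, not part of any real service.

```python
# Hypothetical state machine for an order resource (illustrative assumption).
ALLOWED_TRANSITIONS = {
    "draft": ["submit", "cancel"],
    "submitted": ["approve", "reject", "cancel"],
    "approved": ["ship"],
    "shipped": [],
}


def decide_transitions(order):
    """Decision service: return the allowable next transitions for an order."""
    transitions = list(ALLOWED_TRANSITIONS.get(order["state"], []))
    # Example business rule that inspects the representation itself:
    # an unpaid order must not be offered the "ship" transition.
    if "ship" in transitions and not order.get("paid", False):
        transitions.remove("ship")
    return transitions


def to_representation(order):
    """Emit the resource representation with the decided links embedded."""
    links = [
        {"rel": rel, "href": f"/orders/{order['id']}/{rel}"}
        for rel in decide_transitions(order)
    ]
    return {"id": order["id"], "state": order["state"], "links": links}
```

The consumer never evaluates the rules; it only picks one of the links it is given, which is the point made above.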

As I’ve said before, to me REST is just another architectural approach to service-orientation, as is EDA.

Thursday, December 11, 2008

Service Compatibility: Backwards, Forwards

The definitions for backwards compatible and forwards compatible contracts are straightforward:

A new version of a contract that continues to support consumers designed to work with the old version of the contract is considered to be Backwards Compatible

A contract that is designed to support unknown future consumers is considered to be Forwards Compatible

Backwards compatibility is typically achieved by using optional schema components, while forwards compatibility is typically achieved by using schema wildcards. What can be confusing is that schema compatibility is strictly defined as being between message sender and receiver, while it now is more common to talk about service consumers and service providers.

The correct definition of forwards compatible schemas as defined by David Orchard is this: "In schema terms, this is when a schema processor with an older schema can process and validate an instance that is valid against a newer schema". Backwards compatibility means that a new version of a receiver can be rolled out so it does not break existing senders. Forwards compatibility means that an older version of a receiver can consume newer messages and not break.

The confusion is caused when applying this message based definition to services, as service operations typically are both receivers and senders of messages. Most services should be designed to support both backwards and forwards compatible messages. But are the services themselves backwards or forwards compatible?

I define service compatibility this way:

A new version of a service that continues to support consumers designed to work with the old version of the service is considered to be Backwards Compatible

A service that is designed to support unknown future consumers is considered to be Forwards Compatible

Backwards and forwards compatible services use a combined “ignore unknown/missing” strategy, that is a combination of forwards and backwards contract (schema) compatibility. The following figures illustrate the definition of forwards and backwards services.

As can be seen from the above figure, service backwards compatibility depends on the provider being able to validate the on-the-wire request XML against a newer schema version and the consumer being able to validate the response XML against an older schema version. The provider must be able to "ignore missing", while the consumer must be able to "ignore unknown".

As can be seen from the above figure, service forwards compatibility depends on the provider being able to validate the on-the-wire request XML against an older schema version and the consumer being able to validate the response XML against a newer schema version. The provider must be able to "ignore unknown", while the consumer must be able to "ignore missing".
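The "ignore unknown" and "ignore missing" behaviors can be sketched as a projection of the incoming message onto the fields the receiver knows about. This is a simplified illustration using dictionaries instead of XML and schemas; the field names and the default value are assumptions made for the example.

```python
# Receiver-side field sets are hypothetical; the value is the default
# used when the field is missing (None means "no default").
CUSTOMER_V1_FIELDS = {"name": None, "street": None}
CUSTOMER_V2_FIELDS = {"name": None, "street": None, "country": "NO"}


def read_message(message, known_fields):
    """Project a message onto the fields the receiver knows.

    Ignore unknown: fields outside known_fields are dropped.
    Ignore missing: absent fields fall back to the receiver's default.
    """
    return {
        field: message.get(field, default)
        for field, default in known_fields.items()
    }


old_message = {"name": "Kari", "street": "Storgata 1"}
new_message = {"name": "Kari", "street": "Storgata 1", "country": "SE"}

# Newer receiver, older sender: the missing country is defaulted.
upgraded_view = read_message(old_message, CUSTOMER_V2_FIELDS)
# Older receiver, newer sender: the unknown country is ignored.
legacy_view = read_message(new_message, CUSTOMER_V1_FIELDS)
```

In a request/response exchange both sides play the receiver role once, which is why the provider and the consumer each need one of the two behaviors.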

So my advice when talking about compatibility is, always make it clear if you're focusing on the contracts (message) or the services (provider). You can even talk about compatibility from the consumer perspective if you're bold enough. But please: never, ever talk about the service provider as the message consumer...

To add to the complexity of forwards compatibility, there are three types of forwards compatibility:

Schema forwards compatibility
Service forwards compatibility
Routing forwards compatibility, a.k.a. implicit versioning

In service version routing, the service endpoint accepts multiple versions of the contract (service virtualization) and then applies one of two routing policies:

Implicit version routing: forwards compatible service routing, where a message automatically is routed to the newest compatible version
Explicit version routing: traditional service routing, where each message is routed based on explicit version information in the message, typically a namespace

The implicit version routing policy is what our InfoQ article refers to as forwards compatible service versioning.
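The difference between the two routing policies can be sketched as below. The deployed version table is hypothetical, and the sketch assumes that minor versions within the same major version are backwards compatible, so "newest compatible" means the highest deployed minor version that is not older than the one requested.

```python
# Hypothetical deployment: major version -> deployed minor versions.
DEPLOYED = {1: [0, 1], 2: [0, 3]}


def route_explicit(major, minor):
    """Explicit routing: the message names the exact version it wants."""
    if minor in DEPLOYED.get(major, []):
        return (major, minor)
    raise LookupError(f"version {major}.{minor} is not deployed")


def route_implicit(major, minor):
    """Implicit routing: pick the newest compatible deployed version.

    Assumption: minors within one major are backwards compatible.
    """
    candidates = [m for m in DEPLOYED.get(major, []) if m >= minor]
    if not candidates:
        raise LookupError(f"no version compatible with {major}.{minor}")
    return (major, max(candidates))
```

With this table, a 1.0 message is handled by 1.0 under explicit routing but by 1.1 under implicit routing.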

[UPDATE] More details on service vs schema compatibility.

Tuesday, December 09, 2008

Published on InfoQ: SOA Versioning

An article about "Contract Versioning, Compatibility & Composability" in service-oriented solutions that I've written together with Jean-Jacques Dubray has been published on InfoQ. It covers a lot of themes that I've written about in this blog, and focuses on the need for having a versioned common information model for the enterprise data that comprise the messages used in your business processes.

Thursday, November 27, 2008

Service+Schema Versioning: The Ripple Effect

In my SOA versioning, compatibility & composability session at NNUG this week, I stressed the importance of recognizing the ripple effect that incompatible or semantic changes to service contract artifacts will have. This illustration captures how versioning an artifact will affect upstream artifacts:

The ripple effect is a bottom-up effect, and an incompatible change on e.g. a customer address schema will cause an explosion of new versions of all affected contract artifacts. In the end, the change will propagate even to the service consumers.
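As a rough sketch, the bottom-up ripple can be computed by walking a "used by" graph from the changed artifact. The artifact names and dependencies below are invented for illustration; a real inventory would come from your service repository.

```python
from collections import deque

# Hypothetical inventory: artifact -> artifacts that directly use it.
USED_BY = {
    "AddressSchema": ["CustomerSchema"],
    "CustomerSchema": ["GetCustomerMessage", "SubmitOrderMessage"],
    "GetCustomerMessage": ["CustomerService"],
    "SubmitOrderMessage": ["OrderService"],
    "CustomerService": ["CrmPortal"],
    "OrderService": ["WebShop"],
}


def ripple(changed):
    """Return every upstream artifact that needs a new version, breadth-first."""
    affected = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        for dependant in USED_BY.get(queue.popleft(), []):
            if dependant not in seen:
                seen.add(dependant)
                affected.append(dependant)
                queue.append(dependant)
    return affected
```

Here an incompatible change to the hypothetical AddressSchema reaches seven other artifacts, ending with the two consumers.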

Having a versioning strategy lets you control the effects of the inevitable changes. Using a compatibility policy will help you alleviate some of the negative effects of versioning.

Sunday, November 23, 2008

Never buy a DELL XPS

Now it has happened again :(

The motherboard of my two year old DELL XPS 700 has broken again - only six months after it was replaced last April. DELL's support in Norway won't help me as XPS is just a subcontractor PC branded as DELL. I had to call international support in Ireland (or was it India), where they tried to cheat me of my Norwegian consumer rights. Then I had to stay home from work for three days because InfoCare didn't come as promised the first two times, without even notifying me that they wouldn't come anyway.

No more.

The XPS is now for sale at finn.no. In fact, I will never buy DELL again.

Wednesday, November 19, 2008

SharePoint: Reference Data Lookup

In my SharePoint Common Reference Data Management post, I outlined several options for how to maintain and use common master data as reference data across site-collections:

Replicated Lookup List
Remote Lookup List
External Database Lookup
External Hybrid Lookup List

I've made some figures showing the four reference data designs:
Lookup metadata and content.pdf

Note that if you plan to use custom lookup columns in your site content types to get a controlled vocabulary for your taxonomy, you must test and verify that the lookup columns can:

be used as site columns
be filtered to show a subset of lookup values
work in the DIP panel - fully, downgraded or not at all
reference data in other site-collections
be replicated across site-collections

A lookup column that can be filtered is useful as it is better positioned to support an evolving set of lookup values, e.g. hide values that become expired. Such a lookup column can be configured to work exactly as a choice column and still contain a centrally controlled vocabulary value-set.

Tuesday, November 18, 2008

NNUG Oslo 25. November

I'll be giving a session on service+schema compatibility & composability at NNUG Oslo on 25. November: details here.

Among the topics I will cover is the "Flexible/Strict" strategy.

[UPDATE 12-DEC-2008] Download slidedeck here.

Wednesday, November 12, 2008

Service+Schema Versioning: Flexible/Strict Strategy

In a few weeks' time I will be giving a session at NNUG on SOA service and schema versioning strategies and practices. A central topic will be schema compatibility rules, where I will recommend creating a policy based on the "Strict", "Flexible" and "Loose" versioning strategies described in chapter 20.4 in the latest Thomas Erl series book Web Service Contract Design and Versioning for SOA. I guess David Orchard is the author/editor of part III in the book.

I recommend using a “Flexible/Strict” compatibility policy:

Flexible: Safe changes to schemas are backwards compatible and cause just a point version

Strict: All unsafe schema changes must cause a new schema version and thus a new service version

Do not require forwards compatible schemas (Loose, wildcard schemas) - schemas should be designed for extensibility, not to avoid versioning

Service interfaces should also have a Flexible/Strict policy

Safe changes are typically additions to schemas, while unsafe changes are typically modifications or removals of schema components.
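The "Flexible/Strict" policy can be sketched as a small version-bump rule. The change type names are made up for the example, and which changes count as safe is a policy decision you must make yourself; the sketch only assumes that optional additions are safe and everything else is not.

```python
# Hypothetical change classification: only optional additions are "safe".
SAFE_CHANGES = {"add_optional_element", "add_optional_attribute"}


def next_version(version, changes):
    """Apply the Flexible/Strict policy to a (major, minor) version.

    Flexible: only safe changes -> point (minor) version.
    Strict: any unsafe change -> new major version.
    """
    major, minor = version
    if not changes:
        return version
    if all(change in SAFE_CHANGES for change in changes):
        return (major, minor + 1)
    return (major + 1, 0)
```

A single unsafe change in a batch is enough to force a new major version, which in turn means a new service version.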

Note that forwards compatible schemas are not required to have forwards compatible services, as service compatibility is defined by the ability to validate messages. WCF uses a variant of 'validation by projection' (ignore unknown) for forwards compatibility, but also supports schema wildcards.

As Nicolai M. Josuttis shows in the book SOA in Practice (chapter 12.2.1 Trivial Domain-Driven Versioning), even simple backwards compatible changes might cause unpredicted side effects such as response times breaking SLAs and causing problems for consumers. It is much safer to provide a new service version with the new schema version, as if there is a problem, only the upgraded consumers that required the change will be affected.

Note that even adding backwards compatible schema components can be risky, but adding is typically safe. Josuttis recommends using "Strict" as it is a very simple and explicit policy, but I prefer "Flexible/Strict" as this gives more flexibility and fewer service versions to govern.

Avoid trying to implement some smart automagical mechanism for handling schema version issues in the service logic. Rather use backwards compatibility, explicit schema versions and support multiple active service versions. In addition, consider applying a service virtualization mechanism.

Friday, November 07, 2008

SharePoint: Content Type Guidelines

I've collected some recommended practices for SharePoint content types for my current customer. The guidelines are available here: ContentTypeGuidelines.pdf

The main advice is that content types must be based on your information architecture (IA) and governed, to make it easy for your users to both contribute content and to drive findability across that content.

The general guideline for evolving the content type IA is to never change or rename content types or their aspects; make new ones and hide the old ones instead.

Wednesday, November 05, 2008

SharePoint: Common Reference Data Management

There are two kinds of common reference data typically used in SharePoint:

Data that is native to the portal
Data that is provided by external systems such as CRM, ERP, SCM

The usage and maintenance of the common reference data differs by their information type classification as described in later sections. Doing a thorough information architecture analysis is important for this aspect of SharePoint also.

The sharing of common data across multiple site-collections will also affect how native data can be stored, accessed and maintained.

External Common Data

External data used in SharePoint should not be imported into SharePoint lists, but must rather be used as pure reference data and be maintained in their native systems. This is to avoid replication as much as possible due to the extra implementation and operational effort each added replication requires.

Use the web-parts provided by the external system to view the external data. Usage of the external data as lookup columns in lists is restricted to either:

Plain text fields with lookup only in InfoPath DIP form
Custom developed lookup columns
Third-party lookup columns
Use of the MOSS BDC

The “Business Data Catalog” (BDC) is the standard MOSS mechanism for utilizing external data in MOSS, but this will require MOSS enterprise edition. Note that the BDC "Business Data" column cannot be used as a site column and hence not be part of a site content type.

Suggested third-party components:

Bamboo external data sources: MashPoint
Bamboo external lookup column: Business Data Column

Always test and verify that custom columns can be used as site columns if you plan to use them in site content types. Also test and verify that custom columns will work in DIP when used in a document content type, even if only added to list content types.

Native Common Data

Native reference data should be stored and managed in SharePoint lists. The advantage of using SharePoint lists is that you get a data management UI for free: just use standard SharePoint to maintain the data. The downside is the need for replication of shared data across site-collections that use the reference data, as OOTB lookup columns are restricted to a single site-collection.

The common data lists will be used as reference data sources for site columns of type “lookup”. This will ensure that common data is based on native SharePoint aspects, possibly enhanced with third-party SharePoint components.

It is recommended to create the reference data list and the corresponding lookup site column at the root site of the site collection, as the lookup column can then be used in lists in any sub-site to lookup the reference data (cross-site lookup).

Externalized Native Common Data - External/Hybrid

Sharing common data in SharePoint lists across site-collections is not trivial. It must be considered if native common reference data should be externalized to a custom database and be treated as external data. The downside is that an application must then be implemented to allow for maintaining the externalized data. Data-aware tools such as InfoPath, Excel or Access can be utilized as the data maintenance front-end.

A variant of the external database is to use SharePoint itself as the external database, then you get the best from both alternatives (hybrid): a real master for the common data and the SharePoint UI for free. Use either a BDC lookup column or a third-party lookup column that works across site-collections.

Externalizing the native common data to a separate database will require more work on backup/restore. Using the SharePoint database avoids this.

Suggested third-party components:

Bamboo lookup column: Selector
KWizCom remote list viewer/editor: Remote List Viewer web part

Always test and verify that custom columns work across site-collections.

Replicating Common Data across Site-Collections

Native common data will be managed centrally in a master site collection, replicated to multiple target site collections. There will be one master list per target list type. Lists in sites in the target site collections can then reference the target common data lists using columns of type “lookup”.

Note: always prefer using externalized native common data instead of replication.

There are two options for how to maintain and replicate the shared data:

A) Overwrite data in targets on replication, i.e. data must only be entered in master as any updates to data in target lists will be lost on next replication. Updating target lists must be prohibited.

B) Two-way data replication to allow for updating data in targets in addition to master. Conflict resolution rules must be configured so that the master wins any auto-resolvable conflicts.
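The two options can be sketched as follows, with lists modeled as plain dictionaries. This only illustrates the policies and is not how any of the replication products mentioned in this post work; the three-way merge against the last synchronized snapshot is an assumption made for the example, as is treating None as "item not present".

```python
def replicate_overwrite(master, target):
    """Option A: the target becomes a copy of the master; target edits are lost."""
    return dict(master)


def replicate_two_way(master, target, base):
    """Option B: merge master and target against the last synced snapshot.

    The master wins whenever it has changed an item since the snapshot,
    including when both sides changed the same item.
    """
    merged = {}
    for key in set(master) | set(target) | set(base):
        in_master, in_target, in_base = master.get(key), target.get(key), base.get(key)
        if in_master != in_base:
            value = in_master  # master changed (or both did): master wins
        else:
            value = in_target  # only the target may have changed
        if value is not None:  # None means the item was deleted
            merged[key] = value
    return merged


base = {"NO": "Norway", "SE": "Sweden"}
master = {"NO": "Norge", "SE": "Sweden"}
target = {"NO": "Norway (edited)", "SE": "Sweden", "DK": "Denmark"}
```

With this data, option B keeps the new "DK" item entered in the target, while the conflicting edit of "NO" is resolved in favor of the master.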

Which option to use for a list depends on the classification of the information asset that the list contains. E.g. country and product lists need not be updated except in the master, while it is critical for the business that project information assets like contacts can be updated and used immediately.

Note that there will be latency in the replication of changes for both option A) and B). This latency can be configured. The impact of this replication latency must be analyzed based on the information types identified as common portal data. Different information types will have different tolerance for data staleness.

For option A), consider using a remote list viewer/editor in relevant sites to allow for editing master lists “remotely” from target sites.

Suggested third-party components:

KWizCom remote list viewer/editor: Remote List Viewer web part
Echo 2007 Content Manager
DocAve5 two-way Replicator

Always test and verify that custom components such as web-parts and column types can be replicated.

SharePoint Lookup Column Types

Note that the standard “lookup” column is not a many-to-many relationship; it is just a choice list with a multi-select option. There are several third-party lookup types available to alleviate this limitation.

Suggested third-party components:

KWizCom many-to-many list item relationship: Dual Lookup Field Type
Bamboo one-to-many relationship: List Integrity
Bamboo master-detail lookup column: Linked Selector

Always test and verify that custom columns can be replicated if used in lists and content types.

Codename "Bulldog" - Master Data Management

Microsoft last year acquired the Stratature +EDM master data management (MDM) platform. It will be incorporated as part of the next version of SharePoint, however only a few details are public. Read the roadmap here.

Monday, November 03, 2008

I&AM: Understanding Geneva

At PDC last week, the "Geneva" platform for federated claims-based identity and access management (I&AM) was announced. Here are some useful resources to get started:

Identity Roadmap for Software + Services: video with Kim Cameron and Vittorio Bertocci
Project Geneva part 1, part 2, part 3: Kim provides details about the roadmap
Identity @ PDC08 HOL, samples, demos, tools: Vittorio provides useful Geneva resources
Microsoft's New Identity Landscape: Vittorio explains all the Geneva features and modules

Don't let the federation part of this stop you from looking at Geneva. What you should focus on initially is utilizing claims for access management and access control, externalizing it to the identity metasystem. Let the support for distributed I&AM across the extended enterprise be a nice feature that you get for free.

Friday, October 17, 2008

SharePoint ACLs: RoleDefinitions, RoleAssignments, Inheritance

It's been a long time since my last SharePoint post, and this must be my first post on MOSS and WSS3.0. There has been a significant change from WSS2.0 in the underlying security model for access control on sites, lists, items, libraries, folders and documents - and these are some findings that should make it easier to understand the new model.

Permission Levels have been replaced by Role Definitions, and Permissions have been replaced by Role Assignments. While this is not reflected in the SharePoint UI, the authorization object model is new.

The rights are assigned to securable objects that are referred to as 'scopes' as the rights of a user or group by default are inherited throughout the contents of a site (SPWeb). Thus, if you have only read access to a site, this scope will by default give you read access to all contents of the site - unless some contents have been configured to not inherit permissions, and are thus in a separate scope.

Both Role Definitions (RD) and Role Assignments (RA) are by default inherited, but either inheritance can be broken so that the object no longer inherits from its parent. A site can break the RD inheritance to define new roles or change or delete existing roles (except 'full control' and 'limited access'). Role definitions always apply to the whole site, while permission (RA) inheritance can be broken at several securable object levels. This makes it possible to have unique permissions for e.g. lists or even list items.

Role definition inheritance in a Web site has impact upon permissions inheritance in accordance with the following prohibitions:

Cannot inherit permissions unless it also inherits role definitions.
Cannot create unique role definitions unless it also creates unique permissions.
Cannot revert to inherited role definitions unless it also reverts all unique permissions within the Web site. The existing permissions are dependent on the role definitions.
Cannot revert to inherited permissions unless it also reverts to inherited role definitions. The permissions for a Web site are always tied to the role definitions for that Web site.

Thus, in order to have unique role definitions you must have unique permissions (role assignments), but unique permissions can have either unique or inherited role definitions. Inherited permissions must have inherited roles.

If a site does not have role inheritance, then reverting to inherited roles will also revert unique site permissions into inherited permissions. Reverting to inherited permissions for a subsite discards custom permissions, permission levels, users, and groups that were created for the subsite and all its lists and contents.

What can be a bit confusing about the authorization object model is that there are two methods for breaking the inheritance of RDs and RAs respectively, but only one to revert inheritance:

SPWeb.RoleDefinitions.BreakInheritance (bool CopyRoleDefinitions, bool KeepRoleAssignments)

SPWeb/SPList/SPListItem/SP*.BreakRoleInheritance (bool CopyRoleAssignments)

SPWeb/SPList/SPListItem/SP*.ResetRoleInheritance()

This asymmetry is an effect of the above role and permission inheritance rules and prohibitions. As soon as you revert the unique permissions of a site using SPWeb.ResetRoleInheritance, you cannot have unique roles and thus inherited role definitions will be in effect.
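The asymmetry is perhaps easier to see in a toy model of the rules. The class below does not call the SharePoint object model; it is a hypothetical two-flag model of a site, written only to mirror the prohibitions listed in this post, and its method names are loosely modeled on the three methods above.

```python
class WebSecurityModel:
    """Toy model of a site's role definition and permission inheritance."""

    def __init__(self):
        self.unique_role_definitions = False
        self.unique_permissions = False

    def break_role_definition_inheritance(self):
        # Unique role definitions require unique permissions as well.
        self.unique_role_definitions = True
        self.unique_permissions = True

    def break_role_inheritance(self):
        # Unique permissions may keep using inherited role definitions.
        self.unique_permissions = True

    def reset_role_inheritance(self):
        # Inherited permissions must have inherited role definitions,
        # so a single reset reverts both.
        self.unique_permissions = False
        self.unique_role_definitions = False

    def is_valid(self):
        # The only forbidden state: unique roles with inherited permissions.
        return self.unique_permissions or not self.unique_role_definitions
```

Of the four flag combinations only three are valid, and the single reset method always returns the model to the fully inherited state.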

Thursday, October 02, 2008

Versions and Schema Semantics in SOA Composite Services

As part of your SOA governance efforts, it is not sufficient to handle just the versioning of the services and the data schemas used by the services. You must also govern the semantics of the data to enable service composability. Composite services are not possible without semantic data integration.

There are two information models in play when composing services - just as there are two different, but related, process models involved in service-oriented solutions. One is the well known common information model (CIM), the other is the related business event message model used to enable semantic business process composition.

The process composition information should be partitioned according to the business process domains to allow the business process information models (BPIM) to evolve independently of each other. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain. Your BPIMs must be federated even if evolved and versioned separately, to enable semantic data mediation and semantic business process integration.

The BPIM contains metadata that models business event messages as projections of CIM data entities. A business process message typically contains a subset of one or more domain objects. E.g. in an order handling system, the real-world event "customer submits order" is a projection of the customer, address, credit and product entities from the CIM. Think classical paper-based mail order schemas or the Excel schemas you use to get your travel expenses reimbursed, or rather the form used to apply for a vacation.
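A business event message as a projection of CIM entities can be sketched like this. The entities, field names and values are invented for the example; the point is only that the message carries a chosen subset of each entity rather than the complete aggregates.

```python
# Hypothetical CIM entities for one customer and one product.
CIM = {
    "customer": {"id": 42, "name": "Kari Nordmann", "segment": "retail"},
    "address": {"street": "Storgata 1", "city": "Oslo", "geo_code": "0155-A"},
    "credit": {"limit": 50000, "rating": "A", "last_review": "2008-06-01"},
    "product": {"sku": "X-100", "name": "Widget", "cost_price": 12.5},
}

# Hypothetical BPIM metadata: which fields of which entities make up
# the "customer submits order" business event message.
SUBMIT_ORDER_PROJECTION = {
    "customer": ["id", "name"],
    "address": ["street", "city"],
    "credit": ["limit"],
    "product": ["sku"],
}


def project(cim, projection):
    """Build a business event message as a projection of CIM entities."""
    return {
        entity: {field: cim[entity][field] for field in fields}
        for entity, fields in projection.items()
    }


submit_order_message = project(CIM, SUBMIT_ORDER_PROJECTION)
```

Because the projection is metadata, it can be versioned with the process domain while the CIM entities evolve on their own schedule.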

Creating an enterprise canonical data model might be feasible, but federated domain models are recommended. You would anyway need to mediate semantics between your system and third-party services that you involve, or on parts of your processes that you choose to outsource. Federated models are required for B2x integrations as you cannot easily enforce your model on the outside world. Using a CDM might work in a centrally controlled EAI hub, but most likely not across departments, organizational and partner boundaries in an extended enterprise.

Wednesday, October 01, 2008

WCF: WS-Security using Kerberos over SSL

WCF supports typical scenarios with the system provided bindings such as basicHttpBinding and wsHttpBinding. These bindings work very well in WCF-only environments and for interop scenarios where all parties are completely WS* standard compliant.

However, interop always seems to need some tweaking, and the ootb bindings allow you to change many aspects of the binding configuration. But there will always be scenarios where these bindings just cannot be configured to fit your needs, e.g. if you need a binding that supports custom wsHttp over SOAP1.1 using a non-negotiated/direct Kerberos WS-Security token secured by TLS. In this case you need a custom binding, because even if you can turn off spnego for wsHttpBinding, you cannot tweak it into using SOAP1.1.

WCF <customBinding> to the rescue:
<customBinding>
  <binding name="wsBindingCustom">
    <security authenticationMode="KerberosOverTransport"
              requireDerivedKeys="false"
              includeTimestamp="true">
    </security>
    <textMessageEncoding messageVersion="Soap11" />
    <httpsTransport />
  </binding>
</customBinding>

This custom binding combines HTTPS transport with direct Kerberos. The timestamp is required for security reasons when using authentication mode Kerberos over SSL. Note that using the UPN/SPN type <identity> element is only supported with NTLM or negotiated security, and cannot be used with direct Kerberos.

WCF by default requires that the response always contain a timestamp when it is in the request. This is a typical interop issue with e.g. WSS4J, Spring-WS and WebSphere DataPower.

The response with the signed security header timestamp should be like this:
<soapenv:Header>
  <wsse:Security soapenv:mustUnderstand="1" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">
    <wsu:Timestamp wsu:Id="uuid-c9f9cf30-2685-4090-911b-785d59718267" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">
      <wsu:Created>2008-09-29T12:02:59Z</wsu:Created>
      <wsu:Expires>2008-09-29T12:12:59Z</wsu:Expires>
    </wsu:Timestamp>

Note the casing of the wsse: and wsu: namespace elements and attributes. This replay attack detection timestamp can be turned off using the includeTimestamp attribute, or it can be configured using the <localClientSettings> and <localServiceSettings> security elements.
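When debugging interop problems with a non-WCF party, it can help to check the timestamp window of a captured message by hand. The sketch below does that for a standalone wsu:Timestamp element using only the Python standard library. It is a simplified illustration and not what WCF does internally: it assumes whole-second UTC timestamps in the format shown above and does not model any clock skew tolerance.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

WSU = "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"

SAMPLE = (
    f'<wsu:Timestamp xmlns:wsu="{WSU}">'
    "<wsu:Created>2008-09-29T12:02:59Z</wsu:Created>"
    "<wsu:Expires>2008-09-29T12:12:59Z</wsu:Expires>"
    "</wsu:Timestamp>"
)


def _parse_utc(text):
    """Parse a whole-second UTC timestamp such as 2008-09-29T12:02:59Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def timestamp_is_fresh(timestamp_xml, now):
    """Return True if now falls within [Created, Expires) of the timestamp."""
    element = ET.fromstring(timestamp_xml)
    created = _parse_utc(element.findtext(f"{{{WSU}}}Created"))
    expires = _parse_utc(element.findtext(f"{{{WSU}}}Expires"))
    return created <= now < expires
```

Note that the lookup uses the full namespace URI, so a message with the wrong namespace or element casing will simply not be found.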

Refer to the <security> element of the <customBinding> documentation for more details. Also review the settings for initiating a WS-SecureConversation if a session is needed.

Tuesday, September 30, 2008

Service Versions: Active, Published, Discoverable

As part of your SOA governance efforts, you should have a process for service lifecycle management. Many already use the policy that the three latest major versions of a service are supported, but at different endpoints due to the lack of service virtualization.

As using service virtualization becomes more common for doing lifecycle management, you need to categorize your services into lifecycle stages. Typical stages are

Provisioning
Active - Published
Active - Deprecated
Decommissioned

Only the latest major active version should be the "published" version, i.e. the WSDL provided by the virtual endpoint. All the other active versions are considered deprecated, and should not get any new consumers. The virtual endpoint must accept requests to all active versions, both the latest and the deprecated versions.
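These lifecycle rules can be sketched as a small catalog check. The version catalog below is hypothetical, and a real virtual endpoint would get this information from a service registry rather than a hard-coded dictionary.

```python
from enum import Enum


class Stage(Enum):
    PROVISIONING = "provisioning"
    ACTIVE_PUBLISHED = "active-published"
    ACTIVE_DEPRECATED = "active-deprecated"
    DECOMMISSIONED = "decommissioned"


ACTIVE_STAGES = {Stage.ACTIVE_PUBLISHED, Stage.ACTIVE_DEPRECATED}

# Hypothetical catalog: major version -> lifecycle stage.
CATALOG = {
    1: Stage.DECOMMISSIONED,
    2: Stage.ACTIVE_DEPRECATED,
    3: Stage.ACTIVE_DEPRECATED,
    4: Stage.ACTIVE_PUBLISHED,
    5: Stage.PROVISIONING,
}


def accepts_request(catalog, major):
    """The virtual endpoint accepts requests to all active versions."""
    return catalog.get(major) in ACTIVE_STAGES


def published_version(catalog):
    """Return the single published version; its WSDL is the one exposed."""
    published = [v for v, s in catalog.items() if s is Stage.ACTIVE_PUBLISHED]
    if len(published) != 1:
        raise ValueError("exactly one version must be published")
    return published[0]
```

Checking that exactly one version is published is a cheap governance guard to run whenever the catalog changes.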

Existing consumers of deprecated versions should as soon as possible move to a newer discoverable version (see figure), preferably the published version. Note how even if there is only one published version, the active major versions can still be discovered through a service registry.

These service versioning lifecycle policies are illustrated in the above figure. Note how only the latest major version should allow multiple minor versions to be active. A recommended practice is to have max two such active minor versions. Still, only the latest version of the service should be discoverable.

It is important to minimize the number of active and discoverable versions of the services. This makes it easier to manage the services, to communicate with the service consumers and to force them to move on before their version is decommissioned. The virtualized endpoints must be monitored, so that you know how much a deprecated service is used - and who those consumers are.

Tuesday, September 23, 2008

Information Models: Business Process or Forms-over-Data

Joe McKendrick has a post about the need for doing data governance today:

If data governance is inadequate — information is outdated, out of sync, duplicated, or plain inaccurate — SOA-enabled services and applications will be delivering garbage. That’s a formula for SOA disaster.

He links to the XML to the rescue: Data governance in SOA article by Ed Tittel, which describes how having a common model and doing data governance is imperative for SOA success. The article conveys the same concepts as chapter 4 'XML: The Foundation for Business Data Integration' in David Chappell's seminal book Enterprise Service Bus. David describes the need for having a common XML data format for "expressing data in messages as it flows through an enterprise across the ESB". This common XML format is a specialization of the canonical data model (CDM) pattern from the classic book Enterprise Integration Patterns by Gregor Hohpe/Bobby Woolf.

I've written many times about the importance of having a common information model (CIM) for the domain objects (business entities) that your data services operate on, and the need for applying master data management (MDM) on those data. David Linthicum and I agreed on this last year and still do.

What I've written even more often about, and what most canonical models do not encompass, is that the data expressed in the messages conveying business events are not simply CIM entities, but rather projections of one or more of the domain objects. In addition, the messages cover more than just activity data; they also contain queries, notifications (events) and commands.

E.g. an insurance clerk triggers a business process by registering the real world event "customer has moved" using a projection of customer and address data, not the complete customer aggregate. This action event message triggers a recalculation of the insurance premium due to the relocation of the customer, in addition to updating the customer record - i.e. an underlying business process is executed.

In a traditional forms-over-data application, the whole customer aggregate would be used - but this style would not make for services that can be easily composed to support business process management efforts. The business event data requirements must be simple enough to make composite services possible, including semantic data mediation and integration. In addition, process-tailored data make it easier for any involved human operator who has to enter or act on the data. After all, BPM is not all about automation, and human tasks will be central in driving the processes for quite a while.

We need to move to an information model that allows for semantic business process integration, not just semantic business data integration. That is why a business process information model (BPIM) is required, modeling the messages for the action, query and notification events that drive the business processes. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain.

The BPIM is based on the CIM as each service domain model is just a projection of the common data model, and the projection metadata should be governed as any other SOA artifact.

Sunday, September 07, 2008

WS* Oriented Architecture

If a company asks for a service-oriented approach for exposing business functions to external consumers using web-services, but has no explicit requirements in the RFP for service virtualization or coordinated loose coupling, nor any requirements for service and data contract versioning mechanisms; is it then sufficient to design a solution based on web-services adhering to the WS* standards, W3C MEPs and WS-I BasicProfile, plus a service registry, without any of the aforementioned mechanisms, and call it SOA?

I think not. Adding a repository and applying a process for service lifecycle management will help, but you risk getting just a glorified, governed JBOWS system. Then again, that's a start.

Saturday, September 06, 2008

How "Oslo" relates to BizTalk

The Microsoft "Oslo" initiative is to provide a modeling (UML), management, and hosting platform for delivering model-driven service-oriented composite applications and S+S solutions; plus cloud computing through modeling.

The recently published BizTalk 2009 roadmap shows how "Oslo" features can be utilized also from BizTalk:

In fact, you won’t need to upgrade BizTalk Server to take advantage of "Oslo" – current BizTalk Server 2006 R2 or BizTalk Server 2009 customers can benefit from "Oslo" by being able to leverage and compose existing services into new composite applications. BizTalk Server today provides the ability to service enable LOB systems or trading partners as web services (using WCF supported protocols), which can be composed with the "Oslo" modeling technologies.

As the roadmap shows, WCF is the central enabler for connecting services from different systems into service-oriented solutions. WCF and WF are central to the new "Oslo" modeling tool and repository, allowing users to define and execute business processes (much like BPEL, XLANG). "Oslo" has a strong focus on enabling automation of business processes with strong support for humans as central actors in the processes (like BPEL4People, WS-HumanTask). Add the BAM interceptors for WCF and WF provided by BizTalk, and you have a platform that gives repeatability, consistency and visibility to your business process management efforts.

Monday, August 11, 2008

SOA Information Models and Services Compositions

I've written quite a lot about having a model for the information needed for semantic business process composition in service-oriented solutions. Two of the most read and commented posts are SOA Canonical "Data" Model and Ontology for Business Processes, SOA and Information Models.

The process composition information model is focused on the business events, messages and documents and the semantics of the information that the model encompasses. The model is targeted at the orchestrations level in the ontology, not at the services level; as shown in the figure (click to enlarge):

The service classification scheme has the "hierarchy" taxonomy form. It is not a scientific hierarchy nor a tree form, thus there is no "is using" or "is dependent on" relation between the services - it is just a classification scheme. There can be no dependency between services in SOA except for composite services realizing a business process by consuming the actual services. Taking a dependency on another service will break the central "services are autonomous" tenet of SOA.

A note on the splitting of the service categories: this is a pure metadata ontology based on the taxonomy for categorizing services; it is not a split based on technology or teams, and surely not on layers. A service should not call on other services; making a service dependent on another service will lead to repeating the failure of DCOM/CORBA in distributed systems, and using web-services and calling it SOA solves nothing. Use the active service pattern or EDA to ensure that your services are truly autonomous and independent of the availability of other services. Isolate service dependency to the composite services, or even better to the consumer mashups, only.

The "business process information model" (BPIM) is what models the business events and documents in the composite services level (orchestration / conversations / sagas). I like to use a domain model as it by definition focuses on the concepts of the problem domain (the business process information artifacts) rather than just the data of the domain. The information model must comprise the different message types for the action, query and notification events that drive the business processes. The model must be canonical for its domain to ensure that the format and semantics of the model are consistent within the domain.

There is some confusion on the difference between the process composition BPIM and the EAI "canonical data model" (CDM), plus the similar concept Common Information Model (CIM). As can be seen from the comments on the two postings and some blog reactions like SOA doesn’t need a Common Information Model, this has made it harder to convey the difference between the process composition information model and the more data-oriented models.

A CIM should be used for the resources data model - which the BPIM is related to. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain. The data in the event document for the real-life action "CustomerHasMoved" is not a complete "Customer" resource entity, it is a message including minimal "AddressChange" data. This data will typically contain both actual event data combined with reference keys to the resources such as customers and addresses in the different service domains that the composition comprises.

The message data is of course associated with the CIM of the underlying resource services. The BPIM message types are projections of CIM objects; in fact, they are projected compositions of the referenced resources - not just simple compositions of resource objects. These projections must be kept as small as possible, preferably just the resource keys, in addition to event details such as a booking reference number. Keeping the event document data model small has the nice side-effect of making the schema versioning a bit simpler.
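To make the difference between a CIM resource entity and a BPIM event message concrete, here is a minimal sketch. All type and property names (`Customer`, `Address`, `CustomerHasMoved`, `MovedOn`) are assumptions chosen for illustration; they are not taken from any actual model.

```csharp
using System;

// CIM side: resource entities owned by their service domains (hypothetical fields).
public sealed class Address
{
    public Guid AddressId { get; set; }
    public string Street { get; set; }
    public string PostalCode { get; set; }
    public string City { get; set; }
}

public sealed class Customer
{
    public Guid CustomerId { get; set; }
    public string Name { get; set; }
    public Address CurrentAddress { get; set; }
    // A real customer aggregate would carry many more fields.
}

// BPIM side: the event document for the real-life action "customer has moved".
// It is a projection: resource reference keys plus the event details, nothing more.
public sealed class CustomerHasMoved
{
    public CustomerHasMoved(Guid customerId, Guid newAddressId, DateTime movedOn)
    {
        CustomerId = customerId;
        NewAddressId = newAddressId;
        MovedOn = movedOn;
    }

    public Guid CustomerId { get; }    // reference key into the customer domain
    public Guid NewAddressId { get; }  // reference key into the address domain
    public DateTime MovedOn { get; }   // actual event data

    // Projects the referenced resources down to the minimal event message.
    public static CustomerHasMoved From(Customer customer, Address newAddress, DateTime movedOn) =>
        new CustomerHasMoved(customer.CustomerId, newAddress.AddressId, movedOn);
}
```

Because the message carries keys rather than copies of the resources, a new optional field on `Customer` does not force a new version of the event schema.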

The CIM should be the starting point of your modeling efforts, as the documents in the process compositions are based on the service domains. Having Business Process Modeling Notation (BPMN) diagrams can help in modeling, as BPMN encompasses process, events, messages and business documents, covering actions, queries and notifications. Other process modeling diagrams can be used; just ensure that the events, messages and data are captured in the model.

It is still important to recognize that having a "one true schema" for process composition or SOA in general is not a recommended practice, hence the term "domain" in the naming. Rather than trying to enforce a common model across all service domains, federated models should be used. When composing services across the domains, mediation and transformation of messages will be required - here a service bus with such capabilities would come in handy.

A thing to recognize is that the overall model incorporates several "flows": process flows, event flows, message flows, and data state flows (resource lifecycles). Typically, only the process flow is clearly shown in diagrams, while e.g. message flow is only represented by receive and send ports. See this interview with Gregor Hohpe on Conversation Patterns about how more than just the process needs to be modeled.

For an excellent primer on architectural aspects of SOA and the proliferation of acronyms, I recommend reading Architecture requirements for Service-Oriented Business Applications by Johan den Haan. Note that he does the classical simplification of focusing on business processes without including a BPIM; it is important to separate the process composition model from the resource model, just as business processes are separated from resource lifecycle processes. The business process events, documents and messages are not just collaterals of the BPMN modeling process, they are important artifacts for semantic mediation in composite services.

[Business Process Integration: "e-Business: organizational and technical foundations", Papazoglou, M. P. & Ribbers, P. (2006)]

Friday, August 08, 2008

Business Process Modeling, Not Only For Automation

Every now and then I see statements like this and it makes me cringe:

"Unless we are only documenting the As Is, business process modeling are made to optimize the business process, notably by using automation."

Aiming for automating processes might be a good strategic goal, but it will get you into a lot of both political and technological issues. How will information workers react to a project that aims at automating their jobs? What percentage of business processes can readily be automated with the available technology today? Think of the BPMN-BPEL chasm. Also read The Case Against BPEL: Why the Language is Less Important Than You Think.

In my opinion, you should do business process modeling with the objective to document the processes, enable visibility into the process performance for operational awareness (BAM, KPI, etc) and to achieve consistency and repeatability in the way the processes are performed - even by humans. Way too many processes that are core to your business (DDD core domain) depend on human interaction, and these processes are not going to be automated using BPM any time soon - however, they might get efficacious support from BPMS solutions.

The ultimate BPM automation system might be Agent Smith: "Never send a human to do a machine's job".

Wednesday, August 06, 2008

Service Virtualization: MSE on Channel9

Service virtualization can be an important architectural mechanism for your service-oriented solutions, and the Managed Services Engine is a free WCF-based tool available at CodePlex.

There are two videos at Channel9 about MSE that I recommend watching to learn about the capabilities of MSE:

Code to live: service virtualization, versioning, etc
Talking about MSE: virtual services and endpoints, protocol adaption, policy enforcement, etc

A topic related to service virtualization is consumer-driven contracts, and you can do that with MSE, but the WCF LOB Adapter Kit might be an even better tool for that.

Monday, August 04, 2008

How "Oslo" relates to SOA

The upcoming Microsoft "Oslo" offering might be a bit vague to grasp, which I can readily understand. It has been perceived as a SOA implementation toolkit, but is rather focused on being a modeling (UML), management, and hosting platform for delivering model-driven service-oriented composite applications and S+S solutions; plus cloud computing through modeling. Read Loraine Lawson's Best Explanation of Oslo So Far for a rather typical experience.

Also look at the Oslo PDC sessions posted by Matt Winkler, on how "Oslo" is a service repository and a platform for both lifecycle management and hosting of services and processes. The service and process implementation tools in "Oslo" will be Visual Studio, WF, WCF, BizTalk, Zermatt, etc - and "Oslo" will be the one initiative to rule them all.

Directions on Microsoft on "Oslo": Messaging, Workflow Roadmap Announced (PDF)

Friday, July 04, 2008

CAB/SCSF: Designing WorkItems From Use Cases

One of the most elusive and complicated parts of CAB is the workitem dependency injection container and all its use case resource collections. SCSF improved on the CAB workitem by adding the WorkItemController as the place to put use case controller logic. The WorkItemController gets a WorkItem instance injected, and many developers never use anything but this default workitem built into the ModuleController. This might lead to a system that is hard to build and maintain due to lack of task isolation because of no good mechanisms for controlling the lifespan of use cases (workitems).

In most projects there is a strong focus on the visual elements of the solution: the views (WorkSpaces, SmartParts, Ribbons, etc) - and CAB makes it quite easy to create a composite UI application. But this doesn't necessarily make for a maintainable, pluggable and loosely coupled composite application / service-oriented business application (SOBA).

It is imperative that the workitem structure and lifespans are analyzed to be able to design the lifecycle management based on the dynamic behavior, events and state of the system's use cases. Without designing a set of workitems and their lifecycle management requirements, your application will not be able to provide users with a task-oriented composite application allowing users to start, work on, switch between, and complete a set of use cases. Without workitems, your system will be just a multi-view forms-over-data application - having no processes for resource life-cycles.

I recommend reading the 'Identifying WorkItems' section of Designing Smart Clients based on CAB and SCSF by Mario Szpuszta, a case-study from Raiffeisen Bank. The case-study shows how to analyze use case diagrams to identify workitems and modules, and provides some good rules for how to do this analysis. It shows how to use root use-cases and pure sub use-cases to identify first-level workitems and sub-workitems, and what I call hub-workitems from sub use-cases that are used from several use-cases. The hub-workitems are often entry points to use-cases such as "find customer".

What I would like to add to the procedure is that adding a dynamic modeling aspect based on the use cases will help with explaining, doing and presenting the workitem analysis.

Dynamic modelling is done using UML activity diagrams or BPMN diagrams, both showing workflow and responsible parties using swimlanes. The main help of using these dynamic diagrams is to get a dynamic (run-time) view of the static use case diagrams to be able to see the lifespan of tasks.

Create business process or workflow diagrams of the use cases. Each process will have a defined start and end, and is a good candidate for becoming a first-level workitem with zero, one or more sub-workitems. The diagram swimlanes are good candidates for identifying sub-workitems of the first-level workitem. Use the process diagrams to identify the lifecycle management requirements for workitems in your composite application.

Don't create a CAB application with only one workitem that lives as long as its module, unless your use case is as simple as that.

Tuesday, July 01, 2008

WCF, SVCUTIL Proxy Problems

We've had some rather strange proxy problems with our WCF client code lately, with some PCs not being able to connect to the services using DNS names ("There was no endpoint listening at ... that could accept the message"), but working fine using the service's IP address. This happened even if all PCs had the same MSIE "bypass proxy server for local addresses" and the same exception list in "do not use proxy server for addresses beginning with".

To make a long story short, there is a "sneaky gotcha" in the proxy exceptions logic: port numbers are not considered to be a sub-domain. So an exception like "*.blogspot.com" might not include e.g. "*.blogspot.com:12001", depending on your system setup. Luckily, you can add your own <bypasslist> directly in the <system.net/defaultProxy> element. Note that the bypass list addresses must be valid regular expressions. Read more about proxies and <system.net> at Matt Ellis' blog.

<system.net>
  <defaultProxy useDefaultCredentials="true">
    <proxy usesystemdefault="True" bypassonlocal="True"/>
    <bypasslist>
      <add address=".+\.blogspot\.com:\d{1,5}" />
    </bypasslist>
  </defaultProxy>
</system.net>
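Since the bypass list entries are regular expressions, it can be worth checking a pattern against a few sample addresses before deploying the configuration. The sketch below uses plain `Regex.IsMatch`; it assumes the proxy logic matches the pattern against an address string that includes the port number, which may differ in detail from what System.Net does internally. The `BypassListCheck` class and the sample host names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BypassListCheck
{
    // The same pattern as in the <bypasslist> element above.
    public const string Pattern = @".+\.blogspot\.com:\d{1,5}";

    // Returns true when the address would be covered by the bypass pattern.
    public static bool Matches(string address) =>
        Regex.IsMatch(address, Pattern, RegexOptions.IgnoreCase);
}

public static class Program
{
    public static void Main()
    {
        // An address with an explicit port is covered by the pattern...
        Console.WriteLine(BypassListCheck.Matches("http://kjellsj.blogspot.com:12001"));
        // ...while the same host without a port is not, as the pattern requires ":<digits>".
        Console.WriteLine(BypassListCheck.Matches("http://kjellsj.blogspot.com"));
    }
}
```

If addresses both with and without port numbers should bypass the proxy, the port part of the pattern would have to be made optional, or a second entry added to the list.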

For more info about WCF and proxies and how to do this from code, see Setting Credentials for your HTTP Proxy by Kenny Wolf.
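As a rough idea of what the code-based variant can look like, the sketch below builds a `System.Net.WebProxy` with the same bypass pattern. The proxy address is a placeholder and the `ProxySetup` class is hypothetical; this is not the approach from the linked article, just a minimal illustration of the `WebProxy` constructor that takes a bypass list.

```csharp
using System;
using System.Net;

public static class ProxySetup
{
    // Builds a proxy that bypasses local addresses and the listed regex patterns.
    // The proxyAddress argument is supplied by the caller; any value shown in
    // usage examples is a placeholder, not a real proxy server.
    public static WebProxy CreateProxy(string proxyAddress)
    {
        var proxy = new WebProxy(
            proxyAddress,
            true,                                       // bypass the proxy for local addresses
            new[] { @".+\.blogspot\.com:\d{1,5}" });    // regex bypass list, as in the config
        proxy.UseDefaultCredentials = true;             // mirrors useDefaultCredentials="true"
        return proxy;
    }
}
```

The returned proxy would typically be assigned to `WebRequest.DefaultWebProxy` during application startup, e.g. `WebRequest.DefaultWebProxy = ProxySetup.CreateProxy("http://proxy.example.local:8080");`, before the first WCF call is made.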

The <system.net> proxy configuration can also be applied to SVCUTIL.EXE when e.g. proxy authentication is required to get the WSDL from the MEX endpoint.

Create a file named svcutil.exe.config in the directory where SVCUTIL.EXE is located to apply the proxy configuration to the tool. To pass user name and password use "username:password@" in front of the URL in the proxyaddress attribute. Do not store passwords in clear text like this.

PS! note how the casing of the attributes and values in the <proxy> element differs from what's normal.

Technology & Restaurants in Trondheim, Norway

It is vacation time in Norway, so you won't see many posts on my blog in July. So this summer's post is a bit on the light side: restaurants in Trondheim that two colleagues and I have dined in this year while on a project there.

The ranking is based on the quality of the meals (food+wine), then the service and atmosphere:

1. Credo: the best overall experience, really good wine list, always full, lively place.
2. Fem Bord: maybe the best food, the smallest place of the top three.
3. Emilies: one of the top three, close race with Fem Bord, a bit too quiet.
4. Palmehaven: the best main courses in Trondheim, the best service, too few guests.
5. Rica Nidelven: based on local food, too few guests, impersonal waiters.
6. Chablis: a bit too formal/bizniz, but the food is good, quite good wine list.
7. To rom og kjøkken: one of the top ten, few tables, combination of bar and restaurant.
Non ranked, not quite up there: Jonathan, Dråpen

What has this got to do with technology? Trondheim is the technology capital of Norway, with my university NTNU, and Fast, Google and Yahoo all have research centers there. So if you, like e.g. Google's Marissa Mayer, ever go to Trondheim, you know where to dine.

Friday, June 20, 2008

CAB/SCSF: View Parameters

CAB/SCSF uses the well-known Model-View-Presenter design pattern for its views. There is, however, a design flaw in its implementation: you cannot pass parameters to the view constructor. As documented in the SCSF knowledge base, the proven practice is to pass the parameters using the WorkItem State collection.

This proven practice can be improved using generics to become type-safe, while still adhering to the MVP design rules. Add the following new method to the WorkItemController base class in the Infrastructure.Interface project:

public virtual TView ShowViewInWorkspace<TView, TParams>(string viewId, string workspaceName, TParams viewParams)
    where TParams : class
{
    string stateKey = StateItemNames.PresenterConstructorParams + typeof(TView).FullName + ":" + typeof(TParams).FullName;
    _workItem.State[stateKey] = viewParams;

    TView view = ShowViewInWorkspace<TView>(viewId, workspaceName);
    return view;
}

Then add the following new method to the Presenter base class in the Infrastructure.Interface project:

protected TParams ViewParameters<TParams>() where TParams : class
{
    //NOTE: must use _view.GetType() to get the correct run-time type of the generic view type
    string stateKey = StateItemNames.PresenterConstructorParams + _view.GetType().FullName + ":" + typeof(TParams).FullName;
    return (TParams)_workItem.State[stateKey];
}

Now you can simply pass the parameters from the module controller to the view using ShowViewInWorkspace and then get the parameters in the presenter's OnViewReady event:

_viewParameters = ViewParameters<ViewParameters>();
if (_viewParameters == null)
{
    throw new ArgumentException("Required view parameter error", "ViewParameters", null);
}

This little design improvement hides the gory details of using the untyped object State collection, e.g. generating shared keys and type casting.
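The shared-key mechanism can be tried out in isolation. The following sketch replaces the WorkItem State collection with a plain dictionary and uses made-up `CustomerView` and `CustomerViewParameters` types, so it runs without CAB/SCSF; the key prefix is likewise a stand-in for `StateItemNames.PresenterConstructorParams`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical view and parameter types, used only for this sketch.
public sealed class CustomerView { }

public sealed class CustomerViewParameters
{
    public int CustomerId { get; set; }
}

public static class ViewParameterKeys
{
    // Stand-in for StateItemNames.PresenterConstructorParams.
    private const string Prefix = "PresenterConstructorParams:";

    // Builds the shared key from the view type and the parameter type.
    public static string For(Type viewType, Type paramsType) =>
        Prefix + viewType.FullName + ":" + paramsType.FullName;
}

public static class ViewParameterDemo
{
    public static CustomerViewParameters RoundTrip(int customerId)
    {
        // Stand-in for the untyped WorkItem.State collection.
        var state = new Dictionary<string, object>();

        // Controller side: the key is built from the compile-time view type.
        string writeKey = ViewParameterKeys.For(typeof(CustomerView), typeof(CustomerViewParameters));
        state[writeKey] = new CustomerViewParameters { CustomerId = customerId };

        // Presenter side: the key is built from the run-time type of the view instance.
        object view = new CustomerView();
        string readKey = ViewParameterKeys.For(view.GetType(), typeof(CustomerViewParameters));

        return state.TryGetValue(readKey, out object value)
            ? value as CustomerViewParameters
            : null;
    }
}
```

Note that the two keys only match when the type argument used by the controller is the concrete view class. If the controller used an interface or base class as the view type, the presenter's run-time type lookup would produce a different key and the parameters would not be found.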

Wednesday, June 18, 2008

WCF: Using Shared Types

In WCF you can use shared types from existing assemblies when generating a service proxy. Use the advanced options in VS2008, otherwise use SVCUTIL.EXE with the /reference /r option. Note that you can only use /reference when your WSDL operations, types and messages adhere to the DataContractSerializer rules that I described this spring.

I'm currently trying to research why we are not able to use shared types for our wsdl:fault message parts.

<wsdl:message name="GetCustomerRequest">
  <wsdl:part element="tns:GetCustomerRequest" name="parameters" />
</wsdl:message>
<wsdl:message name="GetCustomerResponse">
  <wsdl:part element="tns:GetCustomerResponse" name="parameters" />
</wsdl:message>
<wsdl:message name="FaultDetailList">
  <wsdl:part element="fault:FaultDetailList" name="detail" />
</wsdl:message>

<wsdl:portType name="CustomerService">
  <wsdl:operation name="GetCustomer">
    <wsdl:input message="tns:GetCustomerRequest" name="GetCustomerRequest" />
    <wsdl:output message="tns:GetCustomerResponse" name="GetCustomerResponse" />
    <wsdl:fault message="tns:FaultDetailList" name="FaultDetailList" />
  </wsdl:operation>
</wsdl:portType>

<wsdl:binding name="CustomerServiceSoap11" type="tns:CustomerService">
  <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http" />
  <wsdl:operation name="GetCustomer">
    <soap:operation soapAction="" />
    <wsdl:input name="GetCustomerRequest">
      <soap:body use="literal" />
    </wsdl:input>
    <wsdl:output name="GetCustomerResponse">
      <soap:body use="literal" />
    </wsdl:output>
    <wsdl:fault name="FaultDetailList">
      <soap:fault name="FaultDetailList" use="literal" />
    </wsdl:fault>
  </wsdl:operation>
</wsdl:binding>

Even if the fault details contain only xs:string elements, we get this error when generating the proxy:

Cannot import wsdl:portTypeDetail: An exception was thrown while running a WSDL import extension: System.ServiceModel.Description.DataContractSerializerMessageContractImporterError: Referenced type 'SharedDataContracts.FaultDetailType, MyService, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' with data contract name 'FaultDetailType' in namespace 'urn:kjellsj.blogspot.com/FaultDetail/1.0' cannot be used since it does not match imported DataContract. Need to exclude this type from referenced types.

I thought that writing a unit-test to import the WSDL using the underlying SVCUTIL importer directly should help me deduce the cause of the error. You will find all the code that you need to import WSDL using DataContractSerializerMessageContractImporter in the "nV Framework for Enterprise Integration" project at CodePlex.

Add these lines to enforce the use of /namespace, /reference and /ct from code (note that these switches can be used multiple times):

xsdDataContractImporter.Options.Namespaces.Add(new KeyValuePair<string, string>("*", "MyService"));
xsdDataContractImporter.Options.ReferencedCollectionTypes.Add(typeof(List<>));
xsdDataContractImporter.Options.ReferencedTypes.Add(typeof(SharedDataContracts.FaultDetailList));
xsdDataContractImporter.Options.ReferencedTypes.Add(typeof(SharedDataContracts.FaultDetailType));

Sure enough, I get the exact same exception in my unit-test. Inspecting the imported contract shows that the operation has a fault with a "null" DetailType XML Schema type (watch: contracts[0].Operations[0].Faults[0].DetailType). I suspect that it is the "null" DetailType in the FaultDescription of the imported ServiceDescription for the WSDL that causes the error for the referenced shared types.

As it turns out this is a bug that has been fixed in .NET 3.5 SP1. The workaround for earlier versions of SVCUTIL is to add nillable="true" to all elements that are reference types (strings and complexTypes) in your wsdl:types XSD schemas. Or just drop the /reference switch and cut'n'paste the generated code to use your shared types.