Thursday, July 05, 2007

Composite Services: Information Model, Federated Information Models

A SOA solution is typically implemented by composition of services provided by different systems. The solution might comprise a mix of in-house, 3rd party and outsourced services. The solution might also involve cross-enterprise service compositions.

To be able to efficiently compose the services, a model that ensures shared semantics is needed, and in my last post I described the business process information model (BPIM). David Chappell calls this approach semantic data integration. BPIM is a similar concept to the EAI canonical data model [Hohpe/Woolf CDM pattern (355)], but it is a model with a slightly different purpose at a different architectural layer, and it is not just about the data. The BPIM is about the events, messages and data passed between services, not about having a unified superset of the entities within an enterprise.

The BPIM allows for composition of services without having to know and comply with the model of the underlying service logic; all you need to know is the message types. This is a big advantage, as it isolates the consumers from the details of the consumed services. E.g. all a service consumer needs to know is the "AddressChange" data of the "CustomerHasMoved" event, not the complete data schema of the "Customer" service.
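To make this concrete, here is a minimal sketch of a consumer that depends only on the message type. The field names (customer_id, street, postal_code, city) and the function format_shipping_address are assumptions made up for illustration; only the names "AddressChange" and "CustomerHasMoved" come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressChange:
    """Hypothetical payload: only the data the event needs to carry."""
    customer_id: str
    street: str
    postal_code: str
    city: str


@dataclass(frozen=True)
class CustomerHasMoved:
    """Hypothetical BPIM event message wrapping the AddressChange data."""
    address_change: AddressChange


def format_shipping_address(event: CustomerHasMoved) -> str:
    """A consumer that reads the message type only.

    It never touches the internal schema of the "Customer" service.
    """
    change = event.address_change
    return f"{change.street}, {change.postal_code} {change.city}"


if __name__ == "__main__":
    event = CustomerHasMoved(AddressChange("C-42", "1 Main Street", "0150", "Oslo"))
    print(format_shipping_address(event))
```

The consumer would keep working if the "Customer" service reorganized its own data schema, as long as the message type stays the same.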

BPIM is closely related to the Common Information Model (CIM) concept. Note that the BPIM message types are projected compositions of the referenced CIM resources - not just simple compositions of resource objects. In addition, the messages cover more than just activity data; they also contain queries, notifications (events) and commands. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain.
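A rough sketch of what a "projected composition" could look like follows. The CIM resources (CimCustomer, CimAddress), their fields and the projection function are all hypothetical; the point is only that the message picks selected attributes from several resources instead of embedding the resource objects whole.

```python
from dataclasses import dataclass
from enum import Enum


class MessageKind(Enum):
    """The message categories named in the text."""
    QUERY = "query"
    NOTIFICATION = "notification"
    COMMAND = "command"


@dataclass(frozen=True)
class CimCustomer:
    """Hypothetical CIM resource."""
    customer_id: str
    name: str
    credit_limit: int
    segment: str


@dataclass(frozen=True)
class CimAddress:
    """Hypothetical CIM resource."""
    street: str
    postal_code: str
    city: str
    country: str


@dataclass(frozen=True)
class CustomerHasMovedMessage:
    """Hypothetical BPIM message: a projection, not a container of resources."""
    kind: MessageKind
    customer_id: str
    postal_code: str
    city: str


def project_customer_has_moved(
    customer: CimCustomer, new_address: CimAddress
) -> CustomerHasMovedMessage:
    """Project only the attributes the process needs from both resources."""
    return CustomerHasMovedMessage(
        kind=MessageKind.NOTIFICATION,
        customer_id=customer.customer_id,
        postal_code=new_address.postal_code,
        city=new_address.city,
    )
```

A simple composition would have carried the full customer and address objects, and with them attributes such as the credit limit that the consumers of this event have no business knowing.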

The term 'domain' is borrowed from Domain Driven Design (DDD). Domain-driven design is not a technology or a methodology. It is a way of thinking and a set of priorities, aimed at accelerating software projects that have to deal with complicated domains. Read stories about how DDD can be applied to a diverse set of architectural problems: Practitioner Experience Reports.

Focusing on the bounded context when modeling the flow, events, messages, data and semantics involved in implementing the core business processes of the domain should make it easier to come up with a working model. Note again that the information model is about more than just the data, hence the name business process information model. Arvindra Sehmi and Beat Schwegler used the same term in Service-Oriented Modeling for Connected Systems, an article that also provides details about creating a service model.

Partner/ 3rd party/ outsourced services are not part of the core business domain according to DDD. If they were core processes in your business, how come they are so general that they can be outsourced or bought? Core processes are those that make your business unique and give you a competitive edge. DDD dictates using translators or an "anti-corruption layer" against services/systems that are not within the domain.

Having a canonical schema model at the service layer might be feasible within an enterprise, but should be avoided as this will cause very tight coupling to the One True Schema. Making every service depend on the One True Schema will make it impossible for the services to evolve separately; they will no longer be autonomous. If agility at the service layer is less important for you, then such a service straitjacket might initially feel good. Trying to make an enterprise data model (EDM) is not a good idea for the same reasons. Steve Jones has a good post about canonical form issues and how you cannot enforce your model upon the world: Single Canonical Form - not for SOA

Federated Business Process Information Models

DDD recommends splitting big, diverse and complex solutions into several bounded contexts. Set explicit boundaries based on e.g. organizational units and application usage. A natural boundary in SOA is partner/ 3rd party/ outsourced services. Each set of services that is not under your control, and that you cannot enforce your information model upon, is a separate bounded context (domain model).

Note that within your enterprise service model there will also be several other bounded contexts for different domains, each with its own information model that is canonical per domain.

You should identify each model in play on the project and make a context map.
A context map describes the points of contact between the domain models, in addition to outlining explicit translation for any communication between the models.

This figure shows how a context map is used to relate two domains to each other:
The figure and the definitions of "bounded context" and "context map" are taken from the 'Strategic Design' chapter of the book "DOMAIN-DRIVEN DESIGN" by Eric Evans [Addison-Wesley, 2004].

Note how not all elements of a domain model need to be mapped to other models. Only the interconnected parts need to have a translation map. E.g. the credit check process is provided by a 3rd party, thus it exists in a separate external information model. Your BPIM needs to have a translation map to the other service to be able to invoke it. Note how similar this is to the purpose of the BPIM itself: translating between business process compositions and the underlying, composed services. Thus, the context map is the basis for modelling a set of Federated Business Process Information Models. A federated BPIM system map shows the integration of multiple enterprise service models, avoiding the pitfall of designing a single canonical data model across a set of different domains.
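As a sketch of the credit check example, a context map can be thought of as a lookup of explicit translators, one per point of contact between two models. Everything named here is an assumption for illustration: the context names "order-handling" and "credit-bureau", both message shapes, and the "CUST:" reference format of the external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CreditCheckRequested:
    """Hypothetical message in our own BPIM."""
    customer_id: str
    amount_cents: int  # assumed non-negative


@dataclass(frozen=True)
class ExternalCreditQuery:
    """Hypothetical request shape in the 3rd party's information model."""
    subject_ref: str
    amount: str  # decimal string, e.g. "125.05"


def to_external_credit_query(msg: CreditCheckRequested) -> ExternalCreditQuery:
    """Explicit translation for one point of contact between the models."""
    whole, cents = divmod(msg.amount_cents, 100)
    return ExternalCreditQuery(
        subject_ref=f"CUST:{msg.customer_id}",
        amount=f"{whole}.{cents:02d}",
    )


# The context map: (source context, target context, message type) -> translator.
# Only the interconnected parts of the models get an entry.
CONTEXT_MAP: Dict[Tuple[str, str, type], Callable] = {
    ("order-handling", "credit-bureau", CreditCheckRequested): to_external_credit_query,
}


def translate(source: str, target: str, message: object) -> object:
    """Translate a message across a context boundary, or fail loudly."""
    key = (source, target, type(message))
    if key not in CONTEXT_MAP:
        raise LookupError(f"no translation mapped for {key}")
    return CONTEXT_MAP[key](message)
```

Failing on unmapped messages is a deliberate choice in this sketch: anything that crosses the boundary without an entry in the map is a point of contact nobody has modelled yet.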

Monday, July 02, 2007

SOA: Canonical "Data" Model

An important topic when designing service oriented systems is how to enable different services to share semantics to be able to be composed into working solutions. Jack van Hoof has written a good article about this: How to mediate semantics in an EDA. A few weeks later, Nick Malik posted another good read about this topic: Canonical Model, Canonical Schema, and Event Driven SOA. Read Jack's post first.

They both talk about using a canonical data model (CDM) as the Esperanto / Babel fish to map between the format and semantics of the disparate systems taking part in a SOA solution. Note that CDM is not about having a common data model (EAI CDM) or a shared database across all systems in an enterprise; don't get fooled by the "data" in the term 'canonical data model'. CDM is about not making everybody have to speak English, but rather having CDM translators for each native system.

Btw, Gregor Hohpe sometimes uses the term 'canonical domain model' on his blog, while using the term 'canonical data model' in the book "Enterprise Integration Patterns" [Hohpe/Woolf CDM pattern (355)]. I think it is better to talk about the business domain rather than about "data", as this helps keep the focus on the business processes rather than on databases and other technology. You'd be surprised how many biz people concern themselves with how the data model looks - maybe a leftover from the client-server days, to show that they know what an ER-diagram is? Focus on designing a business process information model (BPIM) for each business process domain.

Trying to enforce a One True Schema across your services (everyone has to speak English) is not a viable path, and it is also a recipe for future maintenance hell. Making every service contract depend on the One True Schema will make it impossible for the services to evolve separately; they will no longer be autonomous. A simple change to e.g. the order entity will cause a ripple effect through all referring services. This is where the business process information model comes into play: it allows you to version and evolve the services independently of each other.
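A small sketch of how the ripple effect can be avoided follows. The order entity versions, their fields and the "OrderPlaced" message are hypothetical; the idea is that the service maps its internal entity to a stable message, so an internal change means changing one mapper instead of every referring service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical BPIM message: the contract the consumers depend on."""
    order_id: str
    total_cents: int


@dataclass
class OrderEntityV1:
    """Hypothetical internal order entity, first version."""
    order_id: str
    total_cents: int


@dataclass
class OrderEntityV2:
    """Same entity after an internal change: the total is split in two."""
    order_id: str
    net_cents: int
    tax_cents: int


def order_placed_from_v1(order: OrderEntityV1) -> OrderPlaced:
    """Mapper used before the internal change."""
    return OrderPlaced(order.order_id, order.total_cents)


def order_placed_from_v2(order: OrderEntityV2) -> OrderPlaced:
    """Mapper used after the change; the published message is unchanged."""
    return OrderPlaced(order.order_id, order.net_cents + order.tax_cents)


if __name__ == "__main__":
    before = order_placed_from_v1(OrderEntityV1("O-1", 12500))
    after = order_placed_from_v2(OrderEntityV2("O-1", 10000, 2500))
    print(before == after)
```

Had the consumers referenced the order entity directly, as with a One True Schema, the split of the total would have forced a change in each of them.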

The Canonical "Data" Model concept is also sometimes referred to as a Common Information Model (CIM). Both the business process information model (BPIM) and the data focused CDM/CIM models have the same goal: mediation of semantics. However, they are not the same, as the two other models are both variations of the common data model approach. The business process information model is about semantic business process integration, not just semantic data integration.