Thursday, July 05, 2007

SOA: Canonical Domain Model, Federated Canonical Domain Models

Services can be categorized into different service layers: process services, capability/activity services, and entity services. A SOA solution is typically implemented by composing services provided by different systems. The solution might comprise a mix of in-house, 3rd party and outsourced services. In addition, the solution might also involve cross-enterprise service compositions.

To compose the services efficiently, a model that ensures shared semantics is needed, and in my last post I described the business event canonical domain model (ECDM). David Chappell calls this approach semantic data integration. ECDM is a similar concept to the EAI canonical data model [Hohpe/Woolf CDM pattern (355)], but it is a model with a slightly different purpose at a different architectural layer, and it is not just about the data. The ECDM is about the events, messages and data passed between services, not about having a unified superset of the entities within an enterprise.


The ECDM allows for composition of services without having to know and comply with the model of the underlying service logic; all you need to know is the message types. This is a big advantage, as it isolates the consumers from the details of the entity services. E.g. all a service consumer needs to know is the "AddressChange" data of the "CustomerHasMoved" event, not the complete data schema of the "Customer" entity service. ECDM is closely related to the Common Information Model (CIM) concept. Note that the ECDM message types are projected compositions of the referenced CIM resources - not just simple compositions of resource objects.
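
As a rough illustration of such a projection, here is a minimal Python sketch. Only the names "Customer", "AddressChange" and "CustomerHasMoved" come from the text above; every field name and the function project_customer_has_moved are assumptions made up for the example.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Customer:
    """Hypothetical internal schema owned by the Customer entity service."""
    customer_id: str
    name: str
    credit_limit: int
    street: str
    postal_code: str
    city: str


@dataclass(frozen=True)
class AddressChange:
    """Hypothetical ECDM message data: only what a consumer of the event needs."""
    customer_id: str
    street: str
    postal_code: str
    city: str


@dataclass(frozen=True)
class CustomerHasMoved:
    """Hypothetical ECDM business event carrying the AddressChange data."""
    event_type: str
    address_change: AddressChange


def project_customer_has_moved(customer: Customer) -> CustomerHasMoved:
    """Project the entity onto the event; internal fields are left behind."""
    return CustomerHasMoved(
        event_type="CustomerHasMoved",
        address_change=AddressChange(
            customer_id=customer.customer_id,
            street=customer.street,
            postal_code=customer.postal_code,
            city=customer.city,
        ),
    )


if __name__ == "__main__":
    moved = project_customer_has_moved(
        Customer("C-42", "Ada", 5000, "1 Main St", "0150", "Oslo")
    )
    print(asdict(moved))
```

In this sketch the entity service can add or rename internal fields such as credit_limit without the consumers of the event noticing, as long as the projection keeps producing the same message type.
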
The term 'domain' is borrowed from Domain Driven Design (DDD). Domain-driven design is not a technology or a methodology. It is a way of thinking and a set of priorities, aimed at accelerating software projects that have to deal with complicated domains. Read stories about how DDD can be applied to a diverse set of architectural problems: Practitioner Experience Reports

Focusing on the bounded context when modeling the flow, events, messages, data and semantics involved in implementing the core business processes of the domain should make it easier to come up with a working model. Note again that the ECDM is about more than just the data, hence the term canonical domain model. Arvindra Sehmi and Beat Schwegler used the same term in Service-Oriented Modeling for Connected Systems, an article that also provides details about creating a service model.

Partner, 3rd party and outsourced services are not part of the core business domain according to DDD. If they were core processes in your business, how come they are so general that they can be outsourced or bought? Core processes are those that make your business unique and give you a competitive edge. DDD dictates using translators or an "anti-corruption layer" against services/systems that are not within the domain.
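
A translator of this kind could be sketched as follows in Python. This is only an illustration under stated assumptions: the outsourced carrier service, its payload keys "ref" and "st", its status codes, and the internal message ShipmentStatusChanged are all hypothetical and do not come from any real partner API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShipmentStatusChanged:
    """Hypothetical message type in our own domain model."""
    order_id: str
    status: str


class CarrierTranslator:
    """Anti-corruption layer: the only place that knows the partner's vocabulary."""

    # Hypothetical partner status codes mapped to our own terms.
    _STATUS_BY_PARTNER_CODE = {"DLV": "delivered", "TRN": "in_transit"}

    def to_domain(self, payload: dict) -> ShipmentStatusChanged:
        code = payload["st"]
        if code not in self._STATUS_BY_PARTNER_CODE:
            # Refuse to let unknown partner concepts leak into the domain.
            raise ValueError(f"Unknown partner status code: {code!r}")
        return ShipmentStatusChanged(
            order_id=str(payload["ref"]),
            status=self._STATUS_BY_PARTNER_CODE[code],
        )


if __name__ == "__main__":
    print(CarrierTranslator().to_domain({"ref": 1001, "st": "DLV"}))
```

If the partner changes its codes or is replaced by another supplier, only the translator has to change; the domain message stays the same.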

Having a canonical schema model at the entity service layer might be feasible within an enterprise, but should be avoided as this will cause very tight coupling to the One True Schema. Making every entity service depend on the One True Schema will make it impossible for the services to evolve separately; they will no longer be autonomous. If agility at the entity service layer is less important for you, then such a service straitjacket might initially feel good. Trying to make an enterprise data model (EDM) is not a good idea for the same reasons. Steve Jones has a good post about canonical form issues and how you cannot enforce your model upon the world: Single Canonical Form - not for SOA

Federated ECDM

DDD recommends splitting big, diverse and complex solutions into several bounded contexts. Set explicit boundaries based on e.g. organizational units and application usage. A natural boundary in SOA is partner, 3rd party and outsourced services. Each set of services that is not under your control, and that you cannot enforce your ECDM upon, is a separate bounded context (domain model).

Note that within your enterprise service model there will be several other bounded contexts for different domains; after all, the ECDM is the canonical model across all the services in your enterprise.

You should identify each model in play on the project and make a context map.
A context map describes the points of contact between the domain models, in addition to outlining explicit translation for any communication between the models.

This figure shows how a context map relates two domains to each other:
The figure and the definitions of "bounded context" and "context map" are taken from the 'Strategic Design' chapter of the book "DOMAIN-DRIVEN DESIGN" by Eric Evans [Addison-Wesley, 2004].

Note how not all elements of a domain model need to be mapped to other models. Only the interconnected parts need to have a translation map. E.g. the credit check process is provided by a 3rd party, thus it exists in a separate external ECDM. Your ECDM needs to have a translation map to the other service to be able to invoke it. Note how similar this is to the purpose of the ECDM itself: translating between business process compositions and the underlying, composed services. Thus, the context map is the basis for modelling a Federated ECDM. A federated ECDM serves the same purpose as the ECDM, just between multiple enterprise service models instead of within a single service model.
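
One way to picture a context map with explicit translations is as a lookup table keyed by the two contexts and the message type. The Python sketch below assumes this; the context names, message types, field names and the translate helper are all hypothetical, and only the credit check scenario itself comes from the text above.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[dict], dict]


def to_credit_check_request(msg: dict) -> dict:
    """Our hypothetical 'CreditCheckRequested' message -> the partner's shape."""
    return {"subjectId": msg["customer_id"], "amount": msg["order_total"]}


def from_credit_check_result(msg: dict) -> dict:
    """The partner's hypothetical result -> our own message shape."""
    return {"customer_id": msg["subjectId"], "approved": msg["decision"] == "OK"}


# Context map: (source context, target context, message type) -> translator.
# Only the interconnected parts of the two models appear here.
CONTEXT_MAP: Dict[Tuple[str, str, str], Translator] = {
    ("enterprise", "credit-bureau", "CreditCheckRequested"): to_credit_check_request,
    ("credit-bureau", "enterprise", "CreditCheckResult"): from_credit_check_result,
}


def translate(source: str, target: str, message_type: str, message: dict) -> dict:
    """Translate a message across a context boundary, if a point of contact exists."""
    try:
        translator = CONTEXT_MAP[(source, target, message_type)]
    except KeyError:
        raise LookupError(
            f"No point of contact for {message_type} from {source} to {target}"
        ) from None
    return translator(message)


if __name__ == "__main__":
    request = {"customer_id": "C-42", "order_total": 1200}
    print(translate("enterprise", "credit-bureau", "CreditCheckRequested", request))
```

Message types that never cross the boundary simply have no entry in the map, which is the point made above: only the interconnected parts need a translation.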

Monday, July 02, 2007

SOA: Canonical "Data" Model

An important topic when designing service oriented systems is how to enable different services to share semantics so that they can be composed into working solutions. Jack van Hoof has written a good article about this: How to mediate semantics in an EDA. A few weeks later, Nick Malik posted another good read about this topic: Canonical Model, Canonical Schema, and Event Driven SOA. Read Jack's post first.

They both talk about using a canonical data model (CDM) as the Esperanto / Babel fish to map between the format and semantics of the disparate systems taking part in a SOA solution. Note that CDM is not about having a common data model (EAI CDM) or a shared database across all systems in an enterprise; don't be fooled by the "data" in the term 'canonical data model'. CDM is about not making everybody speak English, but rather having CDM translators for each native system.
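
To make the "translator per native system" idea concrete, here is a small Python sketch. The two systems ("crm" and "billing"), their field names, and the mediate function are assumptions invented for the example; the point is only that each system translates to and from the canonical form and never directly to another system's format.

```python
from typing import Callable, Dict

Translator = Callable[[dict], dict]


# Hypothetical CRM system: its native record uses 'CustNo' and 'Zip'.
def crm_to_canonical(record: dict) -> dict:
    return {"customer_id": record["CustNo"], "postal_code": record["Zip"]}


def canonical_to_crm(message: dict) -> dict:
    return {"CustNo": message["customer_id"], "Zip": message["postal_code"]}


# Hypothetical billing system: its native record uses 'account' and 'postcode'.
def billing_to_canonical(record: dict) -> dict:
    return {"customer_id": record["account"], "postal_code": record["postcode"]}


def canonical_to_billing(message: dict) -> dict:
    return {"account": message["customer_id"], "postcode": message["postal_code"]}


TO_CANONICAL: Dict[str, Translator] = {
    "crm": crm_to_canonical,
    "billing": billing_to_canonical,
}
FROM_CANONICAL: Dict[str, Translator] = {
    "crm": canonical_to_crm,
    "billing": canonical_to_billing,
}


def mediate(source: str, target: str, native_message: dict) -> dict:
    """Native format -> canonical form -> the target's native format."""
    canonical = TO_CANONICAL[source](native_message)
    return FROM_CANONICAL[target](canonical)


if __name__ == "__main__":
    print(mediate("crm", "billing", {"CustNo": "C-42", "Zip": "0150"}))
```

Adding a third system in this sketch means writing one more translator pair, not a new mapping to every existing system.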

Btw, Gregor Hohpe sometimes uses the term 'canonical domain model' on his blog, while using the term 'canonical data model' in the book "Enterprise Integration Patterns" [Hohpe/Woolf CDM pattern (355)]. I think it is better to talk about the business domain rather than about "data", as this helps focus on the biz processes rather than databases and other technology. You'd be surprised how many biz people concern themselves with how the data model looks - maybe a leftover from the client-server days, to show that they know what an ER-diagram is?

Trying to enforce a One True Schema across your services (everyone has to speak English) is not a viable path, and it is also a recipe for future maintenance hell. Making every service contract depend on the One True Schema will make it impossible for the services to evolve separately; they will no longer be autonomous. A simple change to e.g. the order entity will cause a ripple effect through all referring services. This is where the business CDM comes into play: it allows you to version and evolve the services independently of each other.

Note that you will still need to do master data management (MDM); do not misuse the business CDM as an excuse to slip into master data anarchy in your internal systems. CDM is about a shared meta-data model for business event and message formats; it is not about controlling entity creation and lifecycle, or about adhering to legislation such as Sarbanes-Oxley.


[UPDATE] The Canonical "Data" Model concept is also sometimes referred to as a Common Information Model (CIM). Both the business event CDM (ECDM) model and the data focused CDM/CIM models have the same goal: mediation of semantics. However, they are not the same: the latter two models are both variations of the common data model approach. ECDM is about semantic business process integration, not just semantic data integration.