Wednesday, January 30, 2008

Ontology for Business Processes, SOA and Information Models

I have long advocated applying a specific kind of information model (the business process information model - BPIM) when composing services into business processes in service-oriented solutions. A BPIM models the messages for the business events that drive the business processes. The main goal of this information model is to enable coordinated loose coupling (semantic mediation) between the services at the composition level.

There have been many discussions on how the BPIM relates to SOA services and the common information model (CIM), and how the BPM process layer relates to the SOA process layer. In the following ontology I have tried to model the relationships between the documents and the domain objects, Shy Cohen’s service taxonomy & ontology, and Jean-Jacques Dubray’s SOA+BPM ontology.

The service classification scheme has the "hierarchy" taxonomy form. It is neither a scientific hierarchy nor a tree form, so there is no "is using" or "is dependent on" relation between the services - it is just a classification scheme. There can be no dependency between services in SOA except for composite services realizing a business process by consuming the actual services. Taking a dependency on another service breaks the central "services are autonomous" tenet of SOA.

[Figure: the ontology]

Note how there are processes at two levels in the model: the event-driven resource lifecycle processes for domain objects (bottom), and the human-driven business processes realized as composed services (top). These two levels are separate bounded contexts with well-defined mappings between them: the business process domain and the resource lifecycle domain.

The BPIM is applied at the business process domain level (orchestrations / composites / sagas), while the services operate on the domain objects in the CIM to manage the lifecycle of these resources in accordance with the business events.

Note that the BPIM is related to, and is a subset of/reference to, the resource domain objects in the common information model (CIM). In addition, the messages cover more than just activity data; they also contain queries, notifications (events) and commands. This is like a mail-order paper form that contains e.g. some customer data and references to product data. The word "subset" is central: e.g. the “customer has moved” action event has a BPIM document that contains the data pertinent to that specific business process event, not the complete schema of the customer domain object. The resource process service used to act on the “customer has moved” event does of course operate on the actual domain object, but this is invisible to the human workflow process.
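The "subset of/reference to" idea can be illustrated with a minimal Python sketch. All class and field names here (Customer, CustomerHasMoved, credit_limit and so on) are hypothetical and chosen only for illustration; the point is that the event document carries a reference and the event data, while the resource service works on the full domain object.

```python
from dataclasses import dataclass, asdict
from typing import Dict


@dataclass
class Address:
    street: str
    postal_code: str
    city: str


@dataclass
class Customer:
    """Full domain object in the resource layer (hypothetical fields)."""
    customer_id: str
    name: str
    address: Address
    credit_limit: float
    segment: str


@dataclass(frozen=True)
class CustomerHasMoved:
    """BPIM document: a reference to the customer plus only the event data."""
    customer_id: str       # reference to the domain object, not a copy of it
    new_address: Address


def handle_customer_has_moved(event: CustomerHasMoved,
                              customers: Dict[str, Customer]) -> Customer:
    """Resource lifecycle service: applies the event to the actual domain object."""
    customer = customers[event.customer_id]
    customer.address = event.new_address
    return customer


customers = {
    "C-1": Customer("C-1", "Ada", Address("Old Road 1", "0150", "Oslo"), 5000.0, "retail")
}
event = CustomerHasMoved("C-1", Address("New Street 2", "5003", "Bergen"))
updated = handle_customer_has_moved(event, customers)
print(asdict(event))  # only the id and the new address travel in the message
```

The business process only ever sees CustomerHasMoved; fields such as credit_limit stay inside the resource domain.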

The business process messages (action, query, notification) driving the composite services and the resulting business events are central artifacts when designing the services and the information model. I am a strong believer in designing a service-oriented solution by applying "EDA style" thinking, so that you do not miss the events and their documents because traditional “invoke operations” SOA thinking puts too much focus on business process flow.

An aspect not shown in the model is that the need for agility increases towards the top, while the cost of change increases towards the bottom. This is an important argument for having processes at two levels; it allows you to contain the frequent changes to the business processes (mashup style compositions) and composite services rather than having to constantly make changes to the services themselves, which could become very costly as the number of service subscribers increases over time. This is something that your business people should readily appreciate.

Design your system to have flexibility in the flow of the business processes and composite services. The diamonds of your business process are the business decisions, and this is where you should implement a business rules engine (BRE) mechanism. Process flow logic is not service business logic; avoid leaking domain logic into the orchestration layer at all costs.
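As a rough sketch of keeping the decision out of the flow, the Python below passes the decision in as a function. This is not a real rules engine; the function names, the order shape and the 10,000 threshold are assumptions made up for the example.

```python
from typing import Callable, Dict, List

Order = Dict[str, float]
Decision = Callable[[Order], str]


def approval_decision(order: Order) -> str:
    """Decision service (stand-in for a BRE): the business rule lives here."""
    # Hypothetical rule: large orders need a human to look at them.
    return "manual_review" if order["amount"] > 10_000 else "auto_approve"


def run_order_process(order: Order, decide: Decision) -> List[str]:
    """Process flow: sequences the steps and branches on the outcome only."""
    steps = ["receive_order"]
    outcome = decide(order)  # the diamond in the process diagram
    steps.append(outcome)
    steps.append("notify_customer")
    return steps


small = run_order_process({"amount": 500.0}, approval_decision)
large = run_order_process({"amount": 25_000.0}, approval_decision)
print(small)
print(large)
```

Changing the rule means replacing approval_decision; run_order_process and the services it composes stay untouched.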

Finally, the model shows how claims relate to services at all levels. I have done this to show how a claims-based security model is a cross-cutting concern throughout a service-oriented solution. I strongly recommend designing in process claims right from the beginning, as this makes it easier to learn how access control relates to the business processes and solution domain.
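To make the cross-cutting nature concrete, here is a small Python sketch in which the same claims check guards both a composite process and the resource service it calls. The decorator, the claim types ("process", "role") and the service names are all hypothetical; a real solution would rely on a proper security token service rather than a plain dictionary.

```python
from functools import wraps
from typing import Callable, Dict, Set

Claims = Dict[str, Set[str]]


class AccessDenied(Exception):
    """Raised when the caller lacks a required claim."""


def requires_claim(claim_type: str, value: str) -> Callable:
    """Guard a service or process with a required claim (illustrative only)."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(claims: Claims, *args, **kwargs):
            if value not in claims.get(claim_type, set()):
                raise AccessDenied(f"{func.__name__} requires {claim_type}={value}")
            return func(claims, *args, **kwargs)
        return wrapper
    return decorator


@requires_claim("role", "customer-admin")
def update_address_service(claims: Claims, customer_id: str, city: str) -> str:
    """Resource-level service with its own claim requirement."""
    return f"{customer_id} moved to {city}"


@requires_claim("process", "relocation")
def relocation_process(claims: Claims, customer_id: str, city: str) -> str:
    """Composite process: checked itself, then passes the claims down."""
    return update_address_service(claims, customer_id, city)


full_claims = {"process": {"relocation"}, "role": {"customer-admin"}}
print(relocation_process(full_claims, "C-1", "Bergen"))
```

A caller holding only the process claim gets past the composite but is stopped at the resource service, which is the behaviour you want when claims are designed in at every level.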

Master Data Management (MDM) relates to the "Resources" in the above ontology. It is imperative that you apply an MDM strategy to your enterprise resource domain objects to avoid inconsistencies and multiple versions of the truth.

I hope this ontology makes my viewpoint on business process driven SOA clear to you. Feel free to comment on this model, and also check out the six figures in JJD’s ontology for models showing different perspectives of resource lifecycles & business events, BPEL, BPMN and human tasks.

See also this related post.


James Taylor said...

Not sure I agree with the agility/cost comments. Blogged a response here.

James Taylor
Author, with Neil Raden, of Smart (enough) Systems
blog at

Kjell-Sverre Jerijærvi said...

JT has comments on where the cutpoints for the decision service (biz rules engine) fit in my ontology, and here is my response:

Your refinement of my bullet point statement is good; we really do agree. I just see the decision service as conceptually being a part of the process, composition layer and the human task (workflow) layers, thus towards the top of my figure. I do not think Shy Cohen’s taxonomy meant for activity services to be the cutpoint for decisions, but rather the process services.

In my opinion, the business capabilities delivered by a process stay relatively fixed; it is the flow of the process that changes most often and needs to be designed for agility. It is very important that the aspects of the system that need agility most are the ones that are least expensive to change.

My main argument for the cost of change being higher for the lower layers is based on data contract (resources in JJD terms) changes; the more subscribers/consumers a set of activity and entity services has, the more impact introducing a new version of the service contract will have. And I think most solutions will involve coded compositions at the resource service layers, while the composition and business process layers will be declaratively designed.

Integral ):( Reporting said...


thank you for the references. I also wanted to point you to a discussion I had with Dave Linthicum.

where we both argue that you rather need a "Logical Data Model" and not necessarily a "Common Information Model".

The argument really goes as follows:
- if you can have a CDM great
- but over time it is likely to diverge (the effort to keep a CDM is too big), so be prepared to maintain an LDM (or RDM= Reference Data Model) and use the LDM to create direct transformations from one interface to the other.

a) transformations with a CDM when the interfaces are aligned:
Cons. | Ic ---> none ---> Ip | Provider

b) transformations with a CDM when the interfaces are not aligned:

Cons. | Ic --> X1 --> CDM ---> X2 ---> Ip | provider

c) transformations with a LDM:
Cons. | Ic ---> X ---> Ip | Provider

In general you will end up in b) or c), and b) requires 2 transformations plus the cost of maintaining a CDM + LDM.

In a given message (CDM) you typically "project" the LDM in ways that are very specific to the interaction. A message may typically involve information from more than one business entity. A purchase order is going to have customer information. In the LDM you maintain Customer and PO separately; in the CDM you are going to have a lot of messages where the two business entities are convoluted.

IMHO, you also need to consider Master Data Management alongside your architecture diagram.


Kjell-Sverre Jerijærvi said...

I have not suggested using a CIM or a Common Data Model (as in Gregor Hohpe's EAI CDM), but a canonical model of the message and documents belonging to processes and events within a business domain (plus using DDD bounded contexts and mappings when applicable). So I am with alternative C.

As the acronym CDM is so overloaded, I have previously used ECDM to differentiate this model from CIM and Common Data Model.

After the discussions with you and DaveL last year I wrote several posts to describe ECDM (LDM) and how "ask for more" + reference data is central to keeping the documents (Jack van Hoof: dossiers) as small as possible. The posts are all tagged with ECDM in my blog.

Thanks for the comment, I will add an end note to explain that the Canonical Domain Model is not CIM/EAI CDM.

Integral ):( Reporting said...

Apologies for the misunderstanding. It often takes a single word!

Kjell-Sverre Jerijærvi said...

I know, I did the same when Nick Malik first blogged about canonical models+schemas last year:

The words database and schema triggered me to think of the "one true schema" approach, which Steve Jones has disclosed properly: