Monday, August 11, 2008

SOA Information Models and Services Compositions

I've written quite a lot about having a model for the information needed for semantic business process composition in service-oriented solutions. The two most read and commented postings are SOA Canonical "Data" Model and Ontology for Business Processes, SOA and Information Models.

The process composition information model is focused on the business events, messages and documents, and on the semantics of the information that the model encompasses. The model is targeted at the orchestrations level in the ontology, not at the services level, as shown in the figure.

The service classification scheme has the "hierarchy" taxonomy form. It is neither a scientific hierarchy nor a tree, so there is no "is using" or "is dependent on" relation between the services - it is just a classification scheme. There can be no dependency between services in SOA, except for composite services that realize a business process by consuming the actual services. Taking a dependency on another service breaks the central "services are autonomous" tenet of SOA.

A note on the splitting of the service categories: this is a pure metadata ontology based on the taxonomy for categorizing services. It is not a split based on technology or teams, and surely not on layers. A service should not call on other services; making a service dependent on another service will lead to repeating the failure of DCOM/CORBA in distributed systems. Using web-services and calling it SOA solves nothing. Use the active service pattern or EDA to ensure that your services are truly autonomous and independent of the availability of other services. Isolate service dependencies to the composite services only or, even better, to the consumer mashups.
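The EDA point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: EventBus is a hypothetical in-memory stand-in for a real message broker, and CustomerService, BillingService and the payload field names are invented for the example. The thing to notice is that neither service holds a reference to the other; they share only the event type.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical in-memory stand-in for a message broker (assumption)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Deliver to whoever subscribed; the publisher never learns who that is.
        for handler in self._subscribers[event_type]:
            handler(payload)


class CustomerService:
    """Publishes a notification event instead of calling other services."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus

    def move_customer(self, customer_id: str, new_address_id: str) -> None:
        self._bus.publish(
            "CustomerHasMoved",
            {"customer_id": customer_id, "new_address_id": new_address_id},
        )


class BillingService:
    """Keeps its own copy of the keys it needs, updated from events."""

    def __init__(self, bus: EventBus) -> None:
        self.billing_address: Dict[str, str] = {}
        bus.subscribe("CustomerHasMoved", self._on_customer_moved)

    def _on_customer_moved(self, event: dict) -> None:
        self.billing_address[event["customer_id"]] = event["new_address_id"]
```

Delivery here is synchronous and in-process, which keeps the sketch short; a real broker would queue the events so that the publisher is also independent of the subscriber's availability.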

The "business process information model" (BPIM) is what models the business events and documents at the composite services level (orchestrations / conversations / sagas). I like to use a domain model, as it by definition focuses on the concepts of the problem domain (the business process information artifacts) rather than just the data of the domain. The information model must comprise the different message types for the action, query and notification events that drive the business processes. The model must be canonical for its domain to ensure that the format and semantics of the model are consistent within the domain.
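As a rough sketch of what "message types for action, query and notification events" and "canonical for its domain" could look like in code, consider the following. All names are hypothetical: the Envelope shape, the "customer-care" domain and the three message type names are assumptions made for the example, not part of any standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Mapping


class MessageKind(Enum):
    ACTION = "action"              # asks for something to be done
    QUERY = "query"                # asks for information
    NOTIFICATION = "notification"  # states that something has happened


@dataclass(frozen=True)
class Envelope:
    """Hypothetical message envelope used in one process domain."""

    domain: str
    message_type: str
    kind: MessageKind
    body: Mapping[str, str]


# Hypothetical canonical catalogue for the "customer-care" process domain.
CUSTOMER_CARE_CATALOGUE: Dict[str, MessageKind] = {
    "ChangeCustomerAddress": MessageKind.ACTION,
    "GetCustomerAddress": MessageKind.QUERY,
    "CustomerHasMoved": MessageKind.NOTIFICATION,
}


def is_canonical(envelope: Envelope, domain: str,
                 catalogue: Mapping[str, MessageKind]) -> bool:
    """True if the message type is known in this domain with this kind."""
    return (envelope.domain == domain
            and catalogue.get(envelope.message_type) is envelope.kind)
```

A composite service could run such a check at its boundary, so that only message types defined in the domain's model enter the orchestration.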

There is some confusion about the difference between the process composition BPIM and the EAI "canonical data model" (CDM), plus the similar concept Common Information Model (CIM). This can be seen from the comments on the two postings and from blog reactions such as SOA doesn’t need a Common Information Model, and it has made it harder to convey the difference between the process composition information model and the more data-oriented models.

A CIM should be used for the resource data model, to which the BPIM is related. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain. The data in the event document for the real-life action "CustomerHasMoved" is not a complete "Customer" resource entity; it is a message including minimal "AddressChange" data. This data will typically contain actual event data combined with reference keys to the resources, such as customers and addresses, in the different service domains that the composition comprises.

The message data is of course associated with the CIM of the underlying resource services. The BPIM message types are projections of CIM objects; in fact, they are projected compositions of the referenced resources - not just simple compositions of resource objects. These projections must be kept as small as possible, preferably just the resource keys, in addition to event details such as a booking reference number. Keeping the event document data model small has the nice side-effect of making the schema versioning a bit simpler.
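To make the projection idea concrete, here is a minimal Python sketch. The Customer and Address classes stand in for CIM resource entities, and CustomerHasMoved is the BPIM event document; all field names and the project_customer_has_moved helper are assumptions made for illustration. The event document carries only resource keys plus event details, not the resource data itself.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Customer:
    """Hypothetical CIM resource entity, owned by a customer service."""

    customer_id: str
    name: str
    email: str
    address_id: str


@dataclass(frozen=True)
class Address:
    """Hypothetical CIM resource entity, owned by an address service."""

    address_id: str
    street: str
    city: str
    postal_code: str


@dataclass(frozen=True)
class CustomerHasMoved:
    """BPIM event document: event details plus reference keys only."""

    event_id: str
    occurred_at: str  # ISO 8601 timestamp
    customer_id: str
    old_address_id: str
    new_address_id: str


def project_customer_has_moved(event_id: str, customer: Customer,
                               new_address: Address,
                               occurred_at: datetime) -> CustomerHasMoved:
    """Project two resources into a minimal event document."""
    return CustomerHasMoved(
        event_id=event_id,
        occurred_at=occurred_at.isoformat(),
        customer_id=customer.customer_id,
        old_address_id=customer.address_id,
        new_address_id=new_address.address_id,
    )
```

Because the name, e-mail, street and so on never enter the event document, changes to those parts of the resource schemas do not force a new version of the event schema.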

The CIM should be the starting point of your modeling efforts, as the documents in the process compositions are based on the service domains. Having Business Process Modeling Notation (BPMN) diagrams can help in modeling, as BPMN encompasses processes, events, messages and business documents, covering actions, queries and notifications. Other process modeling diagrams can be used; just ensure that the events, messages and data are captured in the model.

It is still important to recognize that having a "one true schema" for process composition or SOA in general is not a recommended practice, hence the term "domain" in the naming. Rather than trying to enforce a common model across all service domains, federated models should be used. When composing services across the domains, mediation and transformation of messages will be required - here a service bus with such capabilities would come in handy.
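The mediation step between federated domain models can be sketched as a small registry of transformations keyed by source and target domain. This is a toy illustration, not how any particular service bus works: the Mediator class, the "sales" and "billing" domain names and the field names on both sides are all assumptions.

```python
from typing import Callable, Dict, Tuple

Message = Dict[str, str]
Transform = Callable[[Message], Message]


def sales_to_billing(message: Message) -> Message:
    """Map the hypothetical sales-domain message to the billing-domain shape."""
    return {
        "accountRef": message["customer_id"],
        "addressRef": message["new_address_id"],
    }


class Mediator:
    """Toy stand-in for the mediation capability of a service bus."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], Transform] = {}

    def register(self, source: str, target: str, transform: Transform) -> None:
        self._routes[(source, target)] = transform

    def mediate(self, source: str, target: str, message: Message) -> Message:
        try:
            transform = self._routes[(source, target)]
        except KeyError:
            raise LookupError(f"no mapping from {source} to {target}") from None
        return transform(message)
```

Each domain keeps its own canonical model, and only the mappings between domains are shared, which is the opposite of forcing "one true schema" on every service.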

One thing to recognize is that the overall model incorporates several "flows": process flows, event flows, message flows, and data state flows (resource lifecycles). Typically, only the process flow is clearly shown in diagrams, while e.g. message flow is only represented by receive and send ports. See this interview with Gregor Hohpe on Conversation Patterns about how more than just the process needs to be modeled.

For an excellent primer on architectural aspects of SOA and the proliferation of acronyms, I recommend reading Architecture requirements for Service-Oriented Business Applications by Johan den Haan. Note that he makes the classical simplification of focusing on business processes without including a BPIM; it is important to separate the process composition model from the resource model, just as business processes are separated from resource lifecycle processes. The business process events, documents and messages are not just by-products of the BPMN modeling process; they are important artifacts for semantic mediation in composite services.

[Business Process Integration: "e-Business: organizational and technical foundations", Papazoglou, M. P. & Ribbers, P. (2006)]
