Joe McKendrick has a post about the need for doing data governance today:
If data governance is inadequate — information is outdated, out of sync, duplicated, or plain inaccurate — SOA-enabled services and applications will be delivering garbage. That’s a formula for SOA disaster.
He links to the article XML to the rescue: Data governance in SOA by Ed Tittel, which describes why a common model and data governance are imperative for SOA success. The article conveys the same concepts as chapter 4, 'XML: The Foundation for Business Data Integration', in David Chappell's seminal book Enterprise Service Bus. David describes the need for a common XML data format for "expressing data in messages as it flows through an enterprise across the ESB". This common XML format is a specialization of the common data model (CDM) pattern from the classic book Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf.
I've written many times about the importance of having a common information model (CIM) for the domain objects (business entities) that your data services operate on, and the need for applying master data management (MDM) to that data. David Linthicum and I agreed on this last year and still do.
What I've written about even more often, and what most canonical models do not encompass, is that the data expressed in the messages conveying business events are not simply CIM entities, but rather projections of one or more of the domain objects. In addition, the messages cover more than just activity data; they also contain queries, notifications (events) and commands.
E.g. an insurance clerk triggers a business process by registering the real world event "customer has moved" using a projection of customer and address data, not the complete customer aggregate. This action event message triggers a recalculation of the insurance premium due to the relocation of the customer, in addition to updating the customer record - i.e. an underlying business process is executed.
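The "customer has moved" example can be sketched in a few lines of Python. This is only an illustration under assumed names: the Customer aggregate, the CustomerHasMoved message and the apply_move function are all hypothetical, as are their fields. The point is that the event message carries a projection (customer id plus new address), not the whole aggregate.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Address:
    street: str
    postal_code: str
    city: str


@dataclass(frozen=True)
class Customer:
    """Hypothetical CIM aggregate: everything known about a customer."""
    customer_id: str
    name: str
    date_of_birth: str
    address: Address
    policy_ids: tuple


@dataclass(frozen=True)
class CustomerHasMoved:
    """Hypothetical action event message: a projection of customer and address data."""
    customer_id: str
    new_address: Address
    effective_date: str


def apply_move(customer: Customer, event: CustomerHasMoved) -> Customer:
    """Update the customer record from the event; all other fields are left untouched."""
    if customer.customer_id != event.customer_id:
        raise ValueError("event does not refer to this customer")
    return replace(customer, address=event.new_address)
```

In a real process the same event would also trigger the premium recalculation; that step is left out here because its rules are not part of this post.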
In a traditional forms-over-data application, the whole customer aggregate would be used, but this style does not make for services that can be easily composed to support business process management efforts. The business event data requirements must be simple enough to make composite services possible, including semantic data mediation and integration. In addition, process-tailored data makes it easier for any human operator who has to enter or act on the data. After all, BPM is not all about automation, and human tasks will be central in driving the processes for quite a while.
We need to move to an information model that allows for semantic business process integration, not just semantic business data integration. That is why a business process information model (BPIM) is required, modeling the messages for the action, query and notification events that drive the business processes. Design the BPIM based on the CIM, ensuring that the model is canonical for each process domain.
The BPIM is based on the CIM because each service domain model is just a projection of the common data model, and the projection metadata should be governed like any other SOA artifact.
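One way to picture governed projection metadata is as a declarative mapping from each BPIM message to the CIM fields it projects, together with a check that no message refers to a field the CIM does not define. The sketch below assumes a toy CIM and three hypothetical messages (one action, one query, one notification); the names and the dictionary layout are illustrative, not a standard format.

```python
# Toy CIM: entity name -> set of attribute names (all hypothetical).
CIM = {
    "Customer": {"customer_id", "name", "date_of_birth"},
    "Address": {"street", "postal_code", "city"},
    "Policy": {"policy_id", "premium"},
}

# Toy BPIM: message name -> kind of message and the (entity, attribute) pairs it projects.
BPIM = {
    "CustomerHasMoved": {
        "kind": "action",
        "fields": [("Customer", "customer_id"), ("Address", "street"),
                   ("Address", "postal_code"), ("Address", "city")],
    },
    "GetCustomerPremium": {
        "kind": "query",
        "fields": [("Customer", "customer_id"), ("Policy", "premium")],
    },
    "PremiumRecalculated": {
        "kind": "notification",
        "fields": [("Policy", "policy_id"), ("Policy", "premium")],
    },
}


def undefined_fields(cim, bpim):
    """Return every (message, entity, attribute) that is not a projection of the CIM."""
    return sorted(
        (message, entity, attribute)
        for message, spec in bpim.items()
        for entity, attribute in spec["fields"]
        if attribute not in cim.get(entity, set())
    )
```

A governance check could run undefined_fields whenever either model changes and reject the change if the result is not empty.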