Sunday, February 05, 2006

Implementing "RowState" for Entity Objects

On my current project, the DataSet clan and the POCO clan are debating "the how" of our entity objects (whose primary purpose is "the what" of the domain). Our system is quite big, and must be designed to make it easy to provide future services to other parts of the organization (including Java-based systems) and to external partners. Support for web-services is a planned feature of the system, not to mention SOA.

Personally, I always like to design a system in a way that makes the client application just another consumer of the system's services. This promotes re-usability and low coupling, and is best achieved through a TDD approach. Thus, as you may guess, I am not in favor of using DataSets as the basis for entity objects, value objects, messages, or DTOs. I recommend reading Scott Hanselman's famous blog post about using DataSets as business objects or as message elements in web-services. Also check out this MSDN Mag article, including the referenced blog posts at the end of the article.

Design-time data-binding of anything but DataSets was not simple in .NET 1.1, but with the advent of the object binding source and generics (List<T>) in .NET 2.0, it has become viable to do data-binding to entity objects and lists/collections.

The software design discussion now revolves around the RowState, which signals the state of an entity object in relation to a containing list. Having a RowState property is really useful when e.g. implementing the 'Unit of Work' pattern. I have argued that the 'cloning' and 'is dirty' mechanisms, plus a deleted flag, are enough, while the DataSet clan advocates a separate, settable row-state property. Such a property introduces ambiguity when deducing the actual state of an entity object received from a client.

I claim that the RowState need not be a separate, serializable property; it is just a read-only combination of the entity object identifier .Id, .IsDirty, and an .IsDeleted flag (the latter being the only settable property; the other two are read-only to clients). Each entity object must have a class member that contains a clone of its original state. The .Id (internal, hidden entity identifier; PK) is assumed to be e.g. -1 when the object is not an existing entity in the data store. The state of an entity object is deduced like this:

  • Added: .Id == -1 && .IsDirty == true && .IsDeleted == false
  • Deleted: .IsDeleted == true && .Id != -1 (must exist to be deleted)
  • Detached: not applicable, entities are not defined by membership in any List<T>
  • Modified: .Id != -1 && .IsDirty == true && .IsDeleted == false
  • Unchanged I: .IsDirty == false && .IsDeleted == false
  • Unchanged II: .Id == -1 && .IsDeleted == true (aborted insert, no operation on this operand)

Thus, RowState is just a read-only property on the entity object. If your system needs a publicly settable row state (like the new feature in ADO.NET 2.0), I recommend that you add an AdviceRowState property that the system clients can use to signal their intended state of the entity object.
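
The deduction rules above can be sketched as a read-only property. This is a minimal illustration, not the project's actual code: the names EntitySketch, EntityRowState and NewId are hypothetical, and .Id and .IsDirty are simply passed in through the constructor to keep the example small.

```csharp
using System;

public enum EntityRowState { Added, Modified, Deleted, Unchanged }

// Hypothetical entity; Id and IsDirty are supplied directly for illustration.
public class EntitySketch
{
    public const int NewId = -1; // assumed sentinel: not yet in the data store

    private readonly int id;
    private readonly bool isDirty;
    private bool isDeleted;

    public EntitySketch(int id, bool isDirty)
    {
        this.id = id;
        this.isDirty = isDirty;
    }

    public int Id { get { return id; } }            // read-only to clients
    public bool IsDirty { get { return isDirty; } } // read-only to clients

    // The only flag a client may set.
    public bool IsDeleted
    {
        get { return isDeleted; }
        set { isDeleted = value; }
    }

    // Deduced, never stored and never serialized.
    public EntityRowState RowState
    {
        get
        {
            bool exists = Id != NewId;
            if (IsDeleted)
            {
                // A deleted entity that was never stored is an aborted insert ("Unchanged II").
                return exists ? EntityRowState.Deleted : EntityRowState.Unchanged;
            }
            if (!IsDirty)
            {
                return EntityRowState.Unchanged; // "Unchanged I"
            }
            return exists ? EntityRowState.Modified : EntityRowState.Added;
        }
    }
}
```

Because the state is computed on every read, a client cannot send an entity whose row state contradicts its identifier and flags.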

Remember to keep any "how" properties away from the serialization of the "what" when using serialization as the basis for .IsDirty, otherwise such extra properties will cause bogus 'is dirty' logic. This includes any 'original values' clone that you may include in the entity object to support the .IsDirty check. As the 'is dirty' code uses the BinaryFormatter, use the [NonSerialized] attribute on all fields/class members that should not be covered by the .IsDirty comparison.
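
A serialization-based 'is dirty' check might look roughly like the sketch below. It rests on several assumptions: the Customer class and the AcceptChanges name are hypothetical, the 'original values' clone is kept as the serialized bytes rather than as an object, and an entity without a baseline is treated as dirty. It targets the .NET Framework, where BinaryFormatter is available; the type is obsolete in current .NET versions.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Customer
{
    public string Name; // "what": included in the comparison

    [NonSerialized]
    private byte[] originalState; // "how": excluded from the comparison

    // Serializes every field that is not marked [NonSerialized].
    private byte[] Snapshot()
    {
        using (MemoryStream stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, this);
            return stream.ToArray();
        }
    }

    // Hypothetical hook: call after loading from or saving to the data store.
    public void AcceptChanges()
    {
        originalState = Snapshot();
    }

    public bool IsDirty
    {
        get
        {
            if (originalState == null) return true; // assumption: no baseline means dirty
            byte[] current = Snapshot();
            if (current.Length != originalState.Length) return true;
            for (int i = 0; i < current.Length; i++)
            {
                if (current[i] != originalState[i]) return true;
            }
            return false;
        }
    }
}
```

Without [NonSerialized] on originalState, every snapshot would contain the previous snapshot, so the comparison would report changes that never happened.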

You may also need to exclude some properties from XML serialization, e.g. to prevent exposure in your web-services. Apply the XML serialization control attribute [XmlIgnore] to such properties. You may also conditionally serialize properties such as the AdviceRowState property to XML: add an extra control property bool AdviceRowStateSpecified and set it to false to exclude the actual property from XML serialization. Apply [XmlIgnore] to the AdviceRowStateSpecified property as well, as you do not want the control property itself to be serialized.
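
As a rough illustration of both techniques, consider the sketch below. The OrderDto type, its members other than AdviceRowState and AdviceRowStateSpecified, and the AdvisedState enum are hypothetical; public fields are used only to keep it short.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public enum AdvisedState { Added, Modified, Deleted }

// Hypothetical message element.
public class OrderDto
{
    public string OrderNumber;

    [XmlIgnore]
    public string InternalNote; // never written to XML

    public AdvisedState AdviceRowState; // written only when the flag below is true

    [XmlIgnore]
    public bool AdviceRowStateSpecified; // control member read by XmlSerializer
}

public static class XmlSketch
{
    public static string ToXml(OrderDto order)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OrderDto));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }
}
```

With AdviceRowStateSpecified left at false, the AdviceRowState element is omitted from the output; setting it to true makes the element appear, while InternalNote stays out in both cases.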

It should always be an internal aspect of your system how domain entities are stored, how you implement locking, how you implement long-running work/operations, etc. (see Data on the outside vs. Data on the inside by Pat Helland). The AdviceRowState property is just for giving the client the perception of having a say. The operations provided by your system should always make the client say please: "please update the account with this data", "please delete this order", etc.

The WinForms DataGridView still favors the use of DataSets as it has built-in support for DataTable.DataRow.RowState, e.g. automatically hiding deleted rows. This will not happen when using an object binding source bound to a generic list. Using two List<T> instances is one way of solving this: one for the deleted entities and one for the others. Use the latter as an object binding source, and the former to keep track of deleted entities. Pass both lists in the message to your service to save the changes (the lists are the 'operands' of the message).
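
The two-list approach could be sketched like this. Order and SaveOrdersMessage are hypothetical names, and the data-binding itself is left out so the example has no WinForms dependency.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity with just the members needed here.
public class Order
{
    public int Id = -1;
    public bool IsDeleted;
}

// Hypothetical message: both lists are the operands of a save request.
public class SaveOrdersMessage
{
    public List<Order> Orders = new List<Order>();        // bind the grid to this list
    public List<Order> DeletedOrders = new List<Order>(); // tracked, never displayed

    // Moves an order out of the displayed list and flags it as deleted.
    public void Delete(Order order)
    {
        if (!Orders.Remove(order)) return; // not in the active list: nothing to do
        order.IsDeleted = true;
        DeletedOrders.Add(order);
    }
}
```

A plain List<T> does not raise change notifications, so a grid bound to Orders will probably need a refresh after Delete, for example through BindingSource.ResetBindings.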

Thus, the RAD model of Visual Studio favors developer productivity over good software design. Read 'Does Visual Studio Rot the Mind?' by Charles Petzold for more on this topic.
