Tuesday, October 09, 2007

LINQ to SQL and Entity Framework XML Serialization Issues with WCF - Part 1

Setting LINQ to SQL's Serialization property value to Unidirectional in the O/R Designer's properties window makes the generated entity classes serializable by Windows Communication Foundation's default DataContractSerializer (DCS). The DCS implements the Shared Contract mode, which (in Aaron Skonnard's words) "shares the schema contract across the wire." An alternative, the NetDataContractSerializer (NDCS), shares .NET Framework types and implements Shared Type mode, which isn't interoperable. The third alternative is the XmlSerializer of .NET 2.0 and earlier.

However, the DCS doesn't maintain full object-graph schema fidelity of the original entity's properties. You lose EntityRefs for m:1 associations if your object has EntitySets for 1:n associations that share a common property. In this case, you have a circular reference, or cycle. The DCS doesn't serialize circular references by default, so these EntityRef properties lose their [DataMember] attribute on the server and the [System.Runtime.Serialization.DataMemberAttribute()] decoration on the client.

The DCS also adds an ExtensionData property of the ExtensionDataObject type to support round-tripping with new service versions that have added properties. The property stores any data from future versions of the data contract that is unknown to the current version.
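The round-trip behavior can be sketched with two hand-written versions of the same contract. This is only an illustration: the CustomerV1 and CustomerV2 class names, the Region member and the example.org namespace are assumptions for the sketch, not code that LINQ to SQL or the service proxy generates.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// "Old" version of the contract: knows nothing about Region.
[DataContract(Name = "Customer", Namespace = "http://example.org/northwind")]
public class CustomerV1 : IExtensibleDataObject
{
    [DataMember] public string CustomerID;

    // Holds any members the deserializer did not recognize.
    public ExtensionDataObject ExtensionData { get; set; }
}

// "New" version of the same contract with an added member.
[DataContract(Name = "Customer", Namespace = "http://example.org/northwind")]
public class CustomerV2 : IExtensibleDataObject
{
    [DataMember] public string CustomerID;
    [DataMember] public string Region;

    public ExtensionDataObject ExtensionData { get; set; }
}

public static class Program
{
    // Serialize with one contract type and read back with another.
    static TTarget Convert<TSource, TTarget>(TSource source)
    {
        MemoryStream stream = new MemoryStream();
        new DataContractSerializer(typeof(TSource)).WriteObject(stream, source);
        stream.Position = 0;
        return (TTarget)new DataContractSerializer(typeof(TTarget)).ReadObject(stream);
    }

    public static void Main()
    {
        CustomerV2 sent = new CustomerV2();
        sent.CustomerID = "BOGUS";
        sent.Region = "CA";

        // The V1 reader has no Region member, so the value is parked in ExtensionData.
        CustomerV1 old = Convert<CustomerV2, CustomerV1>(sent);

        // Writing the V1 object emits the parked data again for the V2 reader.
        CustomerV2 back = Convert<CustomerV1, CustomerV2>(old);
        Console.WriteLine("Region after round trip: " + back.Region);
    }
}
```

If the sketch behaves as intended, Region survives the trip through the older contract even though CustomerV1 never declares it.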

LINQ to SQL automatically adds the EntityRef value, and you can usually ignore the ExtensionData property. This article offers a link to a complex workaround that enables serializing EntityRefs for m:1 associations. Even with the workaround, the ServiceName.EntityName objects sent to and returned from service clients don't fully match the original DataContextName.EntityName objects, because you can't remove the ExtensionData property.

The Entity Framework (EF) and Entity Data Model (EDM) support only binary serialization of entities and make no provision whatsoever for XML serialization of entity relationships. Julie Lerman addresses this issue in her XML Serializing Entity Framework entities with their children for SOA post of October 2, 2007. Julie proposes constructing a facade from the original entities and decorating the entity and member with [DataContract] and [DataMember] attributes to enable WCF serialization. It remains to be seen if WCF can handle cycles in facades; this is a more serious issue for the EF, because EntityRefs replace—rather than supplement—foreign-key values.

You must construct a similar facade to retain original entity values for updates to LINQ to SQL entities that require value-based concurrency management. Mike Taulty mentions this problem in his Disconnected LINQ to SQL post:

I had to write a function to copy a customer record so that I can maintain a "current" value and an "original" value (I didn't include this code as it's tedious - sometimes I find myself wishing that ICloneable was a bit more central in the .NET world.)

Adding a timestamp property to each entity eliminates the need to construct facades to hold original entity values. A timestamp property also lets you attach an entity to a DataContext with the Table<TEntity>.Attach(entity, true) overload, whose second argument specifies that the entity is modified.
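A minimal sketch of the timestamp approach follows, assuming a hand-mapped Customer class and a RowVersion column that you have added to the table yourself (the stock Northwind Customers table has no such column). The CustomerUpdater class and SaveDetached method are hypothetical names used only for illustration.

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public string CustomerID;

    [Column]
    public string ContactName;

    // Assumed rowversion/timestamp column; not part of the stock Northwind schema.
    // The value must travel to the client and back unchanged.
    [Column(IsVersion = true, IsDbGenerated = true, AutoSync = AutoSync.Always)]
    public Binary RowVersion;
}

public static class CustomerUpdater
{
    // Hypothetical service-side helper: persists an entity edited on the client.
    public static void SaveDetached(string connectionString, Customer edited)
    {
        using (DataContext db = new DataContext(connectionString))
        {
            // 'true' marks the entity as modified; the version member
            // supplies the concurrency check, so no original copy is needed.
            db.GetTable<Customer>().Attach(edited, true);
            db.SubmitChanges();
        }
    }
}
```

The trade-off is that every table needs the extra column and every message carries its value, in exchange for not shipping a second copy of the entity.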

WCF's default configuration uses the WSHttpBinding binding to implement the WS-ReliableMessaging and WS-Security specifications for message reliability, security and authentication. HTTP is the transport; messages are encoded as Text/XML and secured by SOAP message security, and the SOAP body is encrypted and digitally signed by default. Message security and authentication are relatively fragile in development environments, especially with Windows Vista and virtual machines that you access by the Remote Desktop service.

Message and transport security should be applied after you get your app running on the server with basicHttpBinding, which implements the Web Services Interoperability (WS-I) Organization's Basic Profile 1.1 Second Edition. You can change to WSHttpBinding by entries in the host and client configuration file(s) (app.config, AppName.exe.config, or web.config).
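The same choice can also be made in code rather than configuration. The following is a rough sketch of a self-hosted service that starts out on BasicHttpBinding; the IPingService contract, the PingService class and the localhost address are assumptions made up for the example.

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IPingService
{
    [OperationContract]
    string Ping(string text);
}

public class PingService : IPingService
{
    public string Ping(string text)
    {
        return "Echo: " + text;
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed address; on Windows Vista the process may need rights
        // to listen on this HTTP namespace.
        Uri baseAddress = new Uri("http://localhost:8000/PingService");
        ServiceHost host = new ServiceHost(typeof(PingService), baseAddress);

        // Start with the unsecured, WS-I Basic Profile binding.
        // Swap in 'new WSHttpBinding()' once the service works end to end.
        host.AddServiceEndpoint(typeof(IPingService), new BasicHttpBinding(), "");

        host.Open();
        Console.WriteLine("Service running at " + baseAddress + ". Press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```

The client must use the matching binding, so the switch has to be made on both sides at the same time.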

Fighting the Attached Associated Objects Bug

There is a serious bug in LINQ to SQL Beta 2: invoking the Table<TEntity>.Attach() method or its Table<TEntity>.Attach(ModifiedEntity, OriginalEntity) overload correctly attaches the root object as unmodified but performs inserts on the persistence tables for all associated entities, even though they are equally unmodified. The bug was originally reported by an anonymous developer as Feedback 295402 Incorrect Update Behavior - Attaching an Entity after Calling a Relationship Property of 8/27/2007, which was closed as [to be] Fixed on 9/5/2007. Here's what C# program manager Alex Turner says about the bug:

Thanks for reporting this issue you've encountered with Visual Studio 2008! We were able to reproduce your issue after all! It turns out that our Attach() logic was adding the single attached object (the Product) to the object cache, but not any related objects (the Category) that had not been in the DataContext's object graph before. When SubmitChanges() found these objects, it did not know about them and thus assumed they had been added since the last query (otherwise the DataContext would have been the one to have materialized them and they would be known). We've now added an intermediate state for such objects that are related to attached objects so that they are assumed to already exist in the database (just like the attached object itself), and they will not be inserted during SubmitChanges. If you later add more related objects after Attach, they will be considered new objects as normal and queued for insertion.

Obviously there is no way that any two-level parent-child UI, such as the test harness below, could be considered close to operable with this problem.

Note: What's interesting about this third "intermediate state" is that Hibernate has three object states: Transient, Persistent, and Detached (see the diagram in section 4.1.1 transient objects on page 140 (Chapter 4) of Hibernate in Action by Christian Bauer and Gavin King (Manning, 2005)). Hibernate has an evict() method to move objects from the Persistent (attached) state to the Detached state, whereas LINQ to SQL has no Table<TEntity>.Detach() method.

This is a client test harness for a self-hosted WCF service that has a LINQ to SQL data access layer (DAL) which retrieves the Customer, Order and two Order_Details entities from the persisting store. Changes have been made to the ContactName (Bogus -> Bogosian), EmployeeID (1 -> 2), and ProductID (1 -> 3 and 2 -> 4) property values:

Here's the result of invoking SubmitChanges() with the above edits. Instead of applying changes to the OrderID 11217 Order entity and its two Order_Details entities, attaching the root entity adds a new Order 11218 with two new Order_Details entities having the modified ProductID values:

Note: Rick Strahl reported the same bug in his Complex Detached Entities in LINQ to SQL - more Nightmares post of October 1, 2007 and suggested a workaround. I intend to test Rick's approach and will update this post with the result.

Incentives for Use of WCF for Serializing Business Objects

.NET Framework 3.0 added Windows Communication Foundation (WCF), together with Windows Presentation Foundation (WPF) and Windows Workflow Foundation (WF), to .NET Fx 2.0 and Windows Vista. .NET Fx 3.5 adds a few new features to these three technologies. WCF, formerly code-named Indigo, replaces conventional ASP.NET (.asmx) Web services, Web Services Enhancements (WSE), .NET Remoting and Enterprise Services with a unified distributed communication technology.

WCF's Serialization Performance

WCF provides significant performance improvements over the technologies it replaces. So it's a good bet that all .NET developers who attempt to partition their object/relational mapping (O/RM) applications into multiple tiers will attempt to use WCF and XML serialization for messaging across process boundaries. WCF also enables more efficient (and thus better-performing) binary serialization for intranet applications that don't traverse firewalls.

The DataContractSerializer

WCF provides the DataContractSerializer (DCS) as an optional replacement for the traditional XmlSerializer of .NET Fx 2.0 and earlier. According to Microsoft UK's James World, DCS adds these features:

  • Hooks are providing for refining control of (de)serialization - particularly useful for handling versioning issues. By applying any of four special attributes to methods in the target class you can have them called either before or after (de)serialization.
  • The serializer is "opt-in" rather than "opt-out" - which makes (imho) for much cleaner code. In XmlSerializer you could use XmlIgnore to have the serializer ignore certain properties. With the DCS you explicitly mark what you want to serialize.
  • Finally, ANY field or property can be serialized - even if they are marked private.
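The three points above can be illustrated with a small sketch. The OrderLine class and its members are hypothetical; the example assumes full trust, which the DCS needs to reach private members.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class OrderLine
{
    [DataMember] private int quantity;       // private, yet serialized
    [DataMember] private decimal unitPrice;  // private, yet serialized
    private decimal total;                   // no [DataMember]: opt-in means it is not sent

    public OrderLine(int quantity, decimal unitPrice)
    {
        this.quantity = quantity;
        this.unitPrice = unitPrice;
        this.total = quantity * unitPrice;
    }

    public decimal Total
    {
        get { return total; }
    }

    // Hook called after the members are populated. The DCS does not run
    // the constructor on deserialization, so derived state is rebuilt here.
    [OnDeserialized]
    private void RecomputeTotal(StreamingContext context)
    {
        total = quantity * unitPrice;
    }
}

public static class Program
{
    public static void Main()
    {
        DataContractSerializer serializer = new DataContractSerializer(typeof(OrderLine));
        MemoryStream stream = new MemoryStream();
        serializer.WriteObject(stream, new OrderLine(3, 2.50m));

        stream.Position = 0;
        OrderLine copy = (OrderLine)serializer.ReadObject(stream);
        Console.WriteLine("Total rebuilt by the OnDeserialized hook: " + copy.Total);
    }
}
```

Without the hook, the copy's total would stay at its default value, because the unmarked field never crosses the wire.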

The "four special attributes to methods" are OnSerializing, OnSerialized, OnDeserializing and OnDeserialized. In addition, the DataContractSerializer exposes the following properties:

  • MaxItemsInObjectGraph is a configurable property that "specifies the maximum number of items allowed in an object." The default is 65,536 (0x10000). Documentation says the units are bytes, which conflicts with the term Items.
  • IgnoreExtensionDataObject is a configurable property that "gets or sets a value that specifies whether to send unknown serialization data onto the wire." The default value is false. Setting the value to true doesn't make the ExtensionData property disappear; it just prevents sending data, if any exists, to the client.
  • PreserveObjectReferences is a non-configurable property that "gets a value that specifies whether to use non-standard XML constructs to preserve object reference data," specifically EntityRefs. The default value is false.
  • DataContractSurrogate is a non-configurable property that's "designed to be used for type customization and substitution in situations where users want to change how a type is serialized, deserialized or projected into metadata." 

You can set configurable property values in a configuration file (app.config, AppName.exe.config or web.config) or code in the WCF host application; setting non-configurable properties requires code.
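As one way to set the two configurable values in host code, the ServiceBehavior attribute exposes properties with the same names. This is a sketch only; the ICustomerService contract, the CustomerService class and the chosen limit of 131,072 are assumptions for illustration.

```csharp
using System.ServiceModel;

[ServiceContract]
public interface ICustomerService
{
    [OperationContract]
    string GetContactName(string customerId);
}

// Raises the object-graph limit above the 65,536 default and stops the
// service from sending unknown (extension) data back onto the wire.
[ServiceBehavior(MaxItemsInObjectGraph = 131072, IgnoreExtensionDataObject = true)]
public class CustomerService : ICustomerService
{
    public string GetContactName(string customerId)
    {
        // Placeholder body; a real service would query the data access layer.
        return "Contact for " + customerId;
    }
}
```

The attribute affects only the service side; a client that receives large graphs needs the corresponding setting in its own configuration or code.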

Working Around the Cyclic Relationship Problem

If you must persist both m:1 and 1:n associations, you can patch your host code to set the PreserveObjectReferences property value to true, but doing so adds non-standard Id and Ref attributes (z:Id and z:Ref) to the message body. Sowmy Srinivasan provides a code example for Preserving Object Reference in WCF (March 26, 2006) that you can add to pass an instance of the DataContractSerializer with the PreserveObjectReferences property value set to true to the WCF runtime for both server and client components. Microsoft's WCF team wants to discourage users from XML-encoding cycles, so you must write a custom behavior instead of setting an attribute value for the PreserveObjectReferences property value. (According to Aaron Skonnard, the same is true for substituting the NDCS for the DCS.)
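The effect of PreserveObjectReferences can be seen outside the WCF runtime by constructing the serializer directly. The sketch below uses hand-written Customer and Order classes as stand-ins for LINQ to SQL entities; it is not Sowmy Srinivasan's custom behavior, only a demonstration of the constructor argument that his behavior sets.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Customer
{
    [DataMember] public string CustomerID;
    [DataMember] public List<Order> Orders = new List<Order>();
}

[DataContract]
public class Order
{
    [DataMember] public int OrderID;
    [DataMember] public Customer Customer;   // back-reference that creates the cycle
}

public static class Program
{
    static DataContractSerializer CreateSerializer(bool preserveObjectReferences)
    {
        // Arguments: root type, known types, MaxItemsInObjectGraph,
        // IgnoreExtensionDataObject, PreserveObjectReferences, surrogate.
        return new DataContractSerializer(
            typeof(Customer), null, int.MaxValue, false, preserveObjectReferences, null);
    }

    public static void Main()
    {
        Customer customer = new Customer();
        customer.CustomerID = "BOGUS";
        Order order = new Order();
        order.OrderID = 11217;
        order.Customer = customer;
        customer.Orders.Add(order);

        try
        {
            // Default settings: the cycle is rejected.
            CreateSerializer(false).WriteObject(new MemoryStream(), customer);
        }
        catch (SerializationException ex)
        {
            Console.WriteLine("Default DCS rejected the cycle: " + ex.Message);
        }

        // With reference preservation the graph round-trips.
        DataContractSerializer serializer = CreateSerializer(true);
        MemoryStream stream = new MemoryStream();
        serializer.WriteObject(stream, customer);
        stream.Position = 0;
        Customer copy = (Customer)serializer.ReadObject(stream);

        // Checks that the child's back-reference points at the same parent instance.
        Console.WriteLine("Back-reference restored: " +
            object.ReferenceEquals(copy, copy.Orders[0].Customer));
    }
}
```

Inside a service, the same serializer instance has to be supplied through a custom operation behavior on both host and client, which is what the linked example does.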

Stay tuned for Part 2, which will cover the code required to add, update and delete a top-level entity that might work as expected in the Visual Studio 2008 RTM version.

Updated 8/10/2007: Minor additions and clarifications.