Wednesday, January 23, 2008

Namespace Strangenesses in XML Infosets Transformed by LINQ to XML

The purpose of namespace prefixes is to provide abbreviations for global or group-level namespaces, which otherwise would bloat the already substantial overhead of XML Infosets. I've found LINQ to XML not to process namespace declarations as I expected when processing some semi-real-world documents.

Updated 1/23/2008: See end of post.

Bloating All Prefixed Elements with Duplicate Local Namespace Declarations

LINQ to XML works exclusively with expanded names (namespace name plus local name) rather than prefixes, so relatively simple documents with a few namespaces can become unwieldy to store and difficult for humans to read once they've been processed. For example, this simple Atom 1.0-formatted source Infoset returned by an ADO.NET Data Services URL query is quite easy to read:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed xml:base="http://localhost:50539/Northwind.svc/" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns="http://www.w3.org/2005/Atom">
  <entry adsm:type="NorthwindModel.Orders">
    <id>http://localhost:50539/Northwind.svc/Orders(11077)</id>
    <updated />
    <title />
    <author>
      <name />
    </author>
    <link rel="edit" href="Orders(11077)" title="Orders" />
    <content type="application/xml">
      <ads:OrderID adsm:type="Int32">11077</ads:OrderID>
      <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007-05-06T00:00:00</ads:OrderDate>
      <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007-06-03T00:00:00</ads:RequiredDate>
      <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" />
      <ads:Freight adsm:type="Nullable`1[System.Decimal]">8.5300</ads:Freight>
      <ads:ShipName>Rattlesnake Canyon Grocery</ads:ShipName>
      <ads:ShipAddress>2817 Milton Dr.</ads:ShipAddress>
      <ads:ShipCity>Albuquerque</ads:ShipCity>
      <ads:ShipRegion>NM</ads:ShipRegion>
      <ads:ShipPostalCode>87110</ads:ShipPostalCode>
      <ads:ShipCountry>USA</ads:ShipCountry>
    </content>
    <link rel="related" title="Customers" href="Orders(11077)/Customers" type="application/atom+xml;type=entry" />
    <link rel="related" title="Employees" href="Orders(11077)/Employees" type="application/atom+xml;type=entry" />
    <link rel="related" title="Order_Details" href="Orders(11077)/Order_Details" type="application/atom+xml;type=feed" />
    <link rel="related" title="Shippers" href="Orders(11077)/Shippers" type="application/atom+xml;type=entry" />
  </entry>
  <!-- ... -->
</feed>

Applying a LINQ to XML query that returns only abbreviated <entry> groups for the USA in reverse OrderDate order results in this mess:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed>
  <entry>
    <content type="application/xml" xmlns="http://www.w3.org/2005/Atom">
      <ads:OrderID adsm:type="Int32" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">11077</ads:OrderID>
      <ads:OrderDate adsm:type="Nullable`1[System.DateTime]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">
        2007-05-06T00:00:00
      </ads:OrderDate>
      <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">
        2007-06-03T00:00:00
      </ads:RequiredDate>
      <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" />
      <ads:Freight adsm:type="Nullable`1[System.Decimal]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">8.5300</ads:Freight>
      <ads:ShipName xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">Rattlesnake Canyon Grocery</ads:ShipName>
      <ads:ShipAddress xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">2817 Milton Dr.</ads:ShipAddress>
      <ads:ShipCity xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">Albuquerque</ads:ShipCity>
      <ads:ShipRegion xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">NM</ads:ShipRegion>
      <ads:ShipPostalCode xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">87110</ads:ShipPostalCode>
      <ads:ShipCountry xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">USA</ads:ShipCountry>
    </content>
  </entry>
  <!-- ... -->
</feed>

You can remove the duplicate local namespace declarations by string manipulation, but doing so results in brittle code.
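
A less brittle route than string manipulation is to edit the tree itself. The following C# sketch is my assumption of how that might look, not code from the project described here: the NamespaceHoister class and HoistPrefixedDeclarations method are hypothetical names, and the code relies only on XAttribute.IsNamespaceDeclaration, XNamespace.Xmlns and the Descendants/Attributes axes. It moves each xmlns:prefix declaration found on a descendant up to the root, provided the prefix is bound to a single namespace throughout the tree, and it deliberately leaves default (xmlns="...") declarations alone because they decide which namespace unprefixed names belong to.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceHoister
{
    // Hypothetical helper: moves prefixed namespace declarations
    // (xmlns:prefix="...") from descendants up to the root element.
    public static void HoistPrefixedDeclarations(XElement root)
    {
        // Snapshot the declarations first; the tree is mutated below.
        List<XAttribute> declarations = root.Descendants()
            .Attributes()
            .Where(a => a.IsNamespaceDeclaration
                     && a.Name.Namespace == XNamespace.Xmlns)
            .ToList();

        foreach (var group in declarations.GroupBy(a => a.Name))
        {
            // Skip any prefix bound to more than one namespace URI.
            string uri = group.First().Value;
            if (group.Any(a => a.Value != uri))
                continue;

            XAttribute onRoot = root.Attribute(group.Key);
            if (onRoot == null)
                root.Add(new XAttribute(group.Key, uri));
            else if (onRoot.Value != uri)
                continue; // root binds the prefix differently; leave as-is

            // The root now carries the binding, so the copies are redundant.
            foreach (XAttribute duplicate in group)
                duplicate.Remove();
        }
    }
}
```

Calling NamespaceHoister.HoistPrefixedDeclarations(doc.Root) before saving should leave one declaration per prefix on the root element. I haven't run it against the full Northwind feed, so treat it as a starting point.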

Bloating Some Unprefixed Elements with Duplicate Group Namespace Declarations

An alternative is to transform, rather than filter, the document, because the compiler is reported to cache the namespaces you add with code and remove the redundant declarations from the output. My Visual Basic 9.0 XML literal transform code is similar to the following abbreviated version (the three namespaces are imported with Imports directives, which aren't shown):

Private Sub TransformOrders()
    Dim xdOrders As XDocument = XDocument.Load(strDataPath & "Orders.xml", _
                                               LoadOptions.PreserveWhitespace)
    Dim xdDetails As XDocument = XDocument.Load(strDataPath & _
                           "Order_Details.xml", LoadOptions.PreserveWhitespace)

    Dim Orders As XDocument = _
    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <feed xmlns="http://www.w3.org/2005/Atom"
        xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb"
        xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"
        <%= From o In xdOrders...<content> _
            Where o...<ads:ShipCountry>.Value = "USA" _
            Order By o...<ads:OrderDate>.Value Descending _
            Select New XElement( _
        <entry>
            <content type="application/xml">
                <Order>
                    <ads:OrderID adsm:type="Int32">
                        <%= o...<ads:OrderID>.Value %>
                    </ads:OrderID>
                    <!-- ... -->
                    <ads:ShipCountry>
                        <%= o...<ads:ShipCountry>.Value %>
                    </ads:ShipCountry>
                </Order>
            </content>
        </entry>)
        %>>
    </feed>
End Sub

However, the compiler doesn't get rid of all duplicate namespaces, as illustrated by the following output Infoset:

<feed xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
  <entry xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
    <content type="application/xml">
      <Order>
        <ads:OrderID adsm:type="Int32">11006</ads:OrderID>
        <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007-04-07T00:00:00</ads:OrderDate>
        <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007-05-05T00:00:00</ads:RequiredDate>
        <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007-04-15T00:00:00</ads:ShippedDate>
        <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight>
        <ads:ShipName>Great Lakes Food Market</ads:ShipName>
        <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress>
        <ads:ShipCity>Eugene</ads:ShipCity>
        <ads:ShipRegion>OR</ads:ShipRegion>
        <ads:ShipPostalCode>97403</ads:ShipPostalCode>
        <ads:ShipCountry>USA</ads:ShipCountry>
      </Order>
    </content>
  </entry>
  <!-- ... -->
</feed>

The <entry> element repeats the full set of namespace declarations that already appears on <feed>.

Bloating More Unprefixed Elements with Duplicate Group Namespace Declarations

Adding code to insert related Order_Detail elements, as shown below, results in namespace declaration duplication in those elements.

Private Sub TransformOrderAndOrderDetails()
    Dim xdOrders As XDocument = XDocument.Load(strDataPath & "Orders.xml", _
                                               LoadOptions.PreserveWhitespace)
    Dim xdDetails As XDocument = XDocument.Load(strDataPath & _
                           "Order_Details.xml", LoadOptions.PreserveWhitespace)

    Dim Orders As XDocument = _
    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <feed xmlns="http://www.w3.org/2005/Atom"
        xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb"
        xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"
        <%= From o In xdOrders...<content> _
            Where o...<ads:ShipCountry>.Value = "USA" _
            Order By o...<ads:OrderDate>.Value Descending _
            Select New XElement( _
        <entry>
            <content type="application/xml">
                <Order>
                    <ads:OrderID adsm:type="Int32">
                        <%= o...<ads:OrderID>.Value %>
                    </ads:OrderID>
                    <!-- ... -->
                    <ads:ShipCountry>
                        <%= o...<ads:ShipCountry>.Value %>
                    </ads:ShipCountry>
                    <Order_Details
                    <%= From d In xdDetails...<content> _
                        Where d...<ads:OrderID>.Value = o...<ads:OrderID>.Value _
                        Select New XElement( _
                        <Order_Detail>
                            <ads:OrderID adsm:type="Int32">
                                <%= d...<ads:OrderID>.Value %>
                            </ads:OrderID>
                            <ads:ProductID adsm:type="Int32">
                                <%= d...<ads:ProductID>.Value %>
                            </ads:ProductID>
                            <ads:Quantity adsm:type="Short">
                                <%= d...<ads:Quantity>.Value %>
                            </ads:Quantity>
                            <ads:QuantityPerUnit>
                                <%= p...<ads:QuantityPerUnit>.Value %>
                            </ads:QuantityPerUnit>
                            <ads:UnitPrice adsm:type="Decimal">
                                <%= d...<ads:UnitPrice>.Value %>
                            </ads:UnitPrice>
                            <ads:Discount adsm:type="Single">
                                <%= d...<ads:Discount>.Value %>
                            </ads:Discount>
                        </Order_Detail>) _
                    %>>
                    </Order_Details>
                </Order>
            </content>
        </entry>)
        %>>
    </feed>
End Sub

Each Order_Detail group has its own duplicate namespace declarations:

<feed xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
  <entry xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
    <content type="application/xml">
      <Order>
        <ads:OrderID adsm:type="Int32">11006</ads:OrderID>
        <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007-04-07T00:00:00</ads:OrderDate>
        <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007-05-05T00:00:00</ads:RequiredDate>
        <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007-04-15T00:00:00</ads:ShippedDate>
        <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight>
        <ads:ShipName>Great Lakes Food Market</ads:ShipName>
        <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress>
        <ads:ShipCity>Eugene</ads:ShipCity>
        <ads:ShipRegion>OR</ads:ShipRegion>
        <ads:ShipPostalCode>97403</ads:ShipPostalCode>
        <ads:ShipCountry>USA</ads:ShipCountry>
        <Order_Details>
          <Order_Detail xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
            <ads:OrderID adsm:type="Int32">11006</ads:OrderID>
            <ads:ProductID adsm:type="Int32">1</ads:ProductID>
            <ads:Quantity adsm:type="Short">8</ads:Quantity>
            <ads:UnitPrice adsm:type="Decimal">18.0000</ads:UnitPrice>
            <ads:Discount adsm:type="Single">0</ads:Discount>
          </Order_Detail>
          <Order_Detail xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom">
            <ads:OrderID adsm:type="Int32">11006</ads:OrderID>
            <ads:ProductID adsm:type="Int32">29</ads:ProductID>
            <ads:Quantity adsm:type="Short">2</ads:Quantity>
            <ads:UnitPrice adsm:type="Decimal">123.7900</ads:UnitPrice>
            <ads:Discount adsm:type="Single">0.25</ads:Discount>
          </Order_Detail>
        </Order_Details>
      </Order>
    </content>
  </entry>
  <!-- ... -->
</feed>

Meaningless Empty Namespace Inserted with Functional Construction

C# 3.0 and VB 9.0 functional construction code doesn't repeat global namespace declarations, but it isn't immune to namespace strangeness either. This C# code (and its corresponding VB port) adds the empty namespace shown below:

private void btnOrder_DetailsLookup_Click(object sender, EventArgs e)
{
    XDocument xdOrders = XDocument.Load(strDataPath + "Orders.xml", 
                                        LoadOptions.PreserveWhitespace);
    XDocument xdDetails = XDocument.Load(strDataPath + "Order_Details.xml", 
                                         LoadOptions.PreserveWhitespace);

    XNamespace atom = "http://www.w3.org/2005/Atom";
    XNamespace ads = "http://schemas.microsoft.com/ado/2007/08/dataweb";
    XNamespace adsm = "http://schemas.microsoft.com/ado/2007/08/dataweb/metadata";

    XDocument Orders = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), 
        new XElement(atom + "feed", 
            new XAttribute("xmlns", "http://www.w3.org/2005/Atom"), 
            new XAttribute(XNamespace.Xmlns + "ads", 
                    "http://schemas.microsoft.com/ado/2007/08/dataweb"), 
                new XAttribute(XNamespace.Xmlns + "adsm", 
                    "http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"), 
                (from o in xdOrders.Descendants(atom + "content") 
                 where o.Element(ads + "ShipCountry").Value == "USA" 
                 orderby o.Element(ads + "OrderDate").Value descending 
                 select new XElement("entry", 
                     new XElement("content", new XAttribute("type", 
                         "application/xml"), 
                     new XElement("Order", new XElement(ads + "OrderID", 
                         o.Element(ads + "OrderID").Value, 
                         new XAttribute(adsm + "type", "int32")), 
                         // ...
                         new XElement(ads + "ShipCountry", 
                             o.Element(ads + "ShipCountry").Value), 
                         new XElement("Order_Details", 
                             from d in xdDetails.Descendants(atom + "content") 
                             where d.Element(ads + "OrderID").Value == 
                                 o.Element(ads +"OrderID").Value 
                             select new XElement("Order_Detail", 
                                 new XElement(ads + "OrderID", 
                                     d.Element(ads + "OrderID").Value, 
                                     new XAttribute(adsm + "type", 
                                         "System.Int32")), 
                                 new XElement(ads + "ProductID", 
                                     d.Element(ads + "ProductID").Value, 
                                     new XAttribute(adsm + "type", 
                                         "System.Int32")), 
                                 new XElement(ads + "Quantity", 
                                     d.Element(ads + "Quantity").Value, 
                                     new XAttribute(adsm + "type", 
                                         "System.Int16")), 
                                 new XElement(ads + "UnitPrice", 
                                     d.Element(ads + "UnitPrice").Value, 
                                     new XAttribute(adsm + "type", 
                                         "System.Decimal")), 
                                 new XElement(ads + "Discount", 
                                     d.Element(ads + "Discount").Value, 
                                     new XAttribute(adsm + "type", 
                                         "System.Single")))))))));
}

The resulting Infoset's <entry> group has a spurious xmlns="" attribute:

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata">
  <entry xmlns="">
    <content type="application/xml">
      <Order>
        <ads:OrderID adsm:type="int32">11006</ads:OrderID>
        <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007-04-07T00:00:00</ads:OrderDate>
        <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007-05-05T00:00:00</ads:RequiredDate>
        <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007-04-15T00:00:00</ads:ShippedDate>
        <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight>
        <ads:ShipName>Great Lakes Food Market</ads:ShipName>
        <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress>
        <ads:ShipCity>Eugene</ads:ShipCity>
        <ads:ShipRegion>OR</ads:ShipRegion>
        <ads:ShipPostalCode>97403</ads:ShipPostalCode>
        <ads:ShipCountry>USA</ads:ShipCountry>
        <Order_Details>
          <Order_Detail>
            <ads:OrderID adsm:type="System.Int32">11006</ads:OrderID>
            <ads:ProductID adsm:type="System.Int32">1</ads:ProductID>
            <ads:Quantity adsm:type="System.Int16">8</ads:Quantity>
            <ads:UnitPrice adsm:type="System.Decimal">18.0000</ads:UnitPrice>
            <ads:Discount adsm:type="System.Single">0</ads:Discount>
          </Order_Detail>
          <Order_Detail>
            <ads:OrderID adsm:type="System.Int32">11006</ads:OrderID>
            <ads:ProductID adsm:type="System.Int32">29</ads:ProductID>
            <ads:Quantity adsm:type="System.Int16">2</ads:Quantity>
            <ads:UnitPrice adsm:type="System.Decimal">123.7900</ads:UnitPrice>
            <ads:Discount adsm:type="System.Single">0.25</ads:Discount>
          </Order_Detail>
        </Order_Details>
      </Order>
    </content>
  </entry>
  <!-- ... -->
</feed>

Update 1/23/2008: After further consideration, I've concluded that the spurious xmlns="" (empty namespace) declaration isn't benign. The source document at the top of this post has <entry> and <content> groups without a namespace declaration; these groups are within the Atom namespace. Therefore, there's no justification that I can see for removing them from the Atom namespace with xmlns="".

The xmlns="" attribute probably is benign or even useful because the inferred schema for the source document doesn't include Order, Order_Details, and Order_Detail groups. However, I didn't include the XML document, or reference or infer an XML schema in the C# project because C# doesn't support IntelliSense for functional construction.

VB MVP Bill McCarthy, who has written extensively about LINQ to XML, suggested in his comment to this post:

Last one first: the xmlns="" is required because you have a default namespace at the root of the document, then you have entry, content, Order, Order_Details and Order_Detail all without a namespace (or more correctly the empty namespace. If they were meant to be in the default namespace, and you are using the explicit XElement constructors, you need to supply that namespace. VB makes it easy if you use XML literals.

Bill is correct that you must "supply that namespace." Thanks, Bill.

But the only way I can find to supply it in my sample code is to add an expanded name to each element that precedes an element in a prefixed namespace, as shown in the following snippets:

select new XElement("{http://www.w3.org/2005/Atom}entry",
    // The following doesn't solve the xmlns="" issue; it throws "The prefix '' cannot be redefined 
    // from '' to 'http://www.w3.org/2005/Atom' within the same start element tag."
    //new XAttribute("xmlns", "http://www.w3.org/2005/Atom"), 
    new XElement("{http://www.w3.org/2005/Atom}content", new XAttribute("type", "application/xml"),
    new XElement("{http://www.w3.org/2005/Atom}Order", new XElement(ads + "OrderID", o.Element(ads + 
        "OrderID").Value, new XAttribute(adsm + "type", "int32")), 

new XElement("{http://www.w3.org/2005/Atom}Order_Details", 
    from d in xdDetails.Descendants(atom + "content") 
    where d.Element(ads + "OrderID").Value == 
        o.Element(ads +"OrderID").Value
    select new XElement("{http://www.w3.org/2005/Atom}Order_Detail", 

It's interesting that attempts to add the default namespace declaration with an attribute throw the runtime exception noted in the first comment. I don't recall seeing any documentation about this peculiar expanded name syntax requirement.
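
For comparison, here is a minimal C# sketch of what "supplying the namespace" can look like with the XNamespace variables the sample already declares. As far as I can tell, atom + "entry" produces the same XName as the string "{http://www.w3.org/2005/Atom}entry", so the two spellings should be interchangeable. The AtomFeedSketch class and its Build method are hypothetical names invented for this illustration, and the element content is trimmed to a single ShipCountry value.

```csharp
using System.Xml.Linq;

public static class AtomFeedSketch
{
    // Hypothetical demo: every unprefixed element is created in the Atom
    // namespace, so no xmlns="" is needed to take it back out of it.
    public static XElement Build()
    {
        XNamespace atom = "http://www.w3.org/2005/Atom";
        XNamespace ads = "http://schemas.microsoft.com/ado/2007/08/dataweb";

        return new XElement(atom + "feed",
            new XAttribute("xmlns", atom.NamespaceName),
            new XAttribute(XNamespace.Xmlns + "ads", ads.NamespaceName),
            // Same XName as "{http://www.w3.org/2005/Atom}entry"
            new XElement(atom + "entry",
                new XElement(atom + "content",
                    new XAttribute("type", "application/xml"),
                    new XElement(ads + "ShipCountry", "USA"))));
    }
}
```

Writing AtomFeedSketch.Build().ToString() to the console should show <entry> and <content> without any xmlns="" attribute, because their names are in the namespace that <feed> already declares as the default. This also suggests why the commented-out XAttribute("xmlns", ...) throws: an element created as plain "entry" is in no namespace, so a default declaration of the Atom namespace on that same element contradicts its own name.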

Note: For the record, Order, Order_Details and Order_Detail elements aren't in the actual Atom or ads namespace. They were inserted into the document for demonstration purposes.

Bill's comment also includes the following observation about the problem with the VB literal implementation:

As to the second [to] last example, the VB code has a superfluous New XElement that is stuffing you up I think.

I don't think that either New XElement statement is superfluous. The first is required for each Order group and the second is required for each Order_Detail group in the enumeration.

So I don't believe the mystery is solved for VB's XML literal syntax.

Monday, January 21, 2008

Coming to the Entity Framework: A Serializable EntityBag for n-Tier Deployment

Danny Simmons explained in his earlier Why are data-centric web services so hard anyway? and So they're hard, but what if I need them... posts of December 20, 2007 that serializing object graphs with associations isn't a walk in the park. The LINQ to SQL team abandoned their efforts to create a serializable "mini connectionless DataContext," which led to Dinesh Kulkarni's LINQ to SQL Has No "Out-of-the-Box Multi-Tier Story" admission on October 15, 2007.

Updated 1/21/2008 and 1/22/2008: Additions/clarifications in [].

Part of the LINQ to SQL team's problem was the desire for Web service interoperability, which (in my opinion, at least) isn't in the cards for object/relational mapping (O/RM) tools. This observation is especially true for those O/RMs that support [value-based rather than timestamp] change tracking [that's independent of the domain's business objects.]

So I'm glad to see Danny taking the pragmatic approach to Web-service enabling the Entity Framework, as described in his EntityBag Part I – Goals article of January 20, 2008. He says the following about service interoperability:

While I like the simplicity of [EntityBag] interaction, it is super important to keep in mind the restrictions imposed by this approach. First off, there’s the fact that this requires us to run .Net and the EF on the client—in fact it requires that the code for your object model be available on the client, so it is certainly not interoperable with Java or something like that.

If interoperability is the Holy Grail of Web services, why do typed DataSets remain one of the most common objects serialized by .NET Web services? According to Scott Hanselman, Returning DataSets from WebServices is the Spawn of Satan and Represents All That Is Truly Evil in the World. (Scott posted this diatribe on June 1, 2004, three years before he joined Microsoft in July 2007.) An obvious answer is "because you can."

Another issue is lack of adherence to (or support for) the Web service contract's terms and conditions:

Secondly, because we are sending back and forth the entire ObjectContext, the interface of the web methods imposes no real contract on the kind of data that will travel over the wire. The retrieval method in our example is called GetRoomAndExits, but there’s absolutely no guarantee that the method might not return additional data or even that it will return a room and exits at all. This is even scarier for the update method where the client sends back an EntityBag which can contain any arbitrary set of changes and they are just blindly persisted to the database.

The lack of a service contract or its enforcement doesn't appear to deter the crowd supporting RESTful Web services, including ADO.NET Data Services.

To qualify as an enterprise-level O/RM tool, the Entity Framework must provide out-of-the-box n-tier support for WCF Web services and remoting, even if it's an add-in to v1.0. Save interoperability and contract concerns for v2.0+.

I look forward to the promised "future posts" in which "we can dig into the implementation of EntityBag."

Backstory: I discussed problems with serializing object graphs that contain cyclic references created by combinations of EntitySet and EntityRef(erence) associations in my Controlling the Depth and Order of EntitySets for 1:Many Associations post of December 20, 2007 (updated 12/23/2007), Serializing Object Graphs Without and With References of November 21, 2007 (updated 12/12/2007), and Serializing Cyclic LINQ to SQL References with WCF of October 30, 2007.

Update 1/21/2008: Frans Bouma's comment of 1/21/2008 takes Microsoft and me to task: Microsoft for not incorporating change data in the business object itself and me for not being precise regarding the difficulty of managing value-based concurrency conflicts without changing the business object's structure. Both the Entity Framework (EF) and LINQ to SQL minimize encroaching on "persistence ignorance" and plain old CLR objects (POCO). See Danny's EF Persistence Ignorance Recap of September 26, 2007 and my Persistence Ignorance Is Bliss, but Is It Missing from the Entity Framework? article of March 14, 2007 (updated 4/24/2007 and earlier).

Update 1/22/2008: In answer to a comment about EntityBag Part I – Goals of January 20, 2008, Danny says on the same day:

The EF won't have something like this built-in for v1, but we are looking hard at the topic for future releases.  I'm not 100% certain we'll do this, since there are some serious issues when it comes to interoperability, etc. (as I've noted).  I believe we made some major mistakes with the DataSet in this regard, and I don't want to repeat them.

Sunday, January 20, 2008

LINQ and Entity Framework Posts for 1/14/2008+

Danny Simmons Describes a Serializable EntityBag for n-Tier Entity Framework Deployment

I'm glad to see the ADO.NET Entity Framework team taking the pragmatic approach to Web service-enabling v1.0 with the serializable EntityBag that Danny describes in his EntityBag Part I – Goals post of January 20, 2008.

For my take on Danny's answer to the LINQ to SQL team's apparently abandoned serializable "mini-connectionless DataContext," read Coming to the Entity Framework: A Serializable EntityBag for n-Tier Deployment, which ran too long for inclusion in this post.

Added: 1/20/2008

John Papa Explains How to Design an Entity Data Model with the EDM Tools

John's February 2008 "DataPoints" column for MSDN Magazine is subtitled "Designing an Entity Data Model." Topics include:

  • Understanding the EDM
  • Building an EDM with the Wizard
  • Stored Procedures in the Entity Model
  • Windows on EDM
  • Derived Entities

This article follows up his July 2007 "ADO.NET Entity Framework Overview" column.

Added: 1/20/2008

Eric White Explains How to Extract Comments from OOXML Documents

In his How to Extract Comments from Open XML Documents of January 16, 2008, Eric uses LINQ to XML to identify and extract comments as an XML list from WordprocessingML, SpreadsheetML, and PresentationML documents in a specified folder.

Added: 1/20/2008

Danny Simmons Posts the Finale to His Extension Methods Extravaganza

Danny's final post in the series, EF Extension Methods Extravaganza Part III – Fun with LINQ to Objects of January 17, 2008, includes the following LINQ to Objects extension methods that you invoke on an ObjectContext instance:

  • ObjectContext.GetEntities(EntityState state)
  • ObjectContext.GetRelationships(EntityState state)
  • ObjectContext.GetUnchangedManyToManyRelationships()
  • ObjectContext.GetRelationshipForKey(EntityKey key, EntityState state)
  • ObjectContext.GetRelationshipsByRelatedEnd(Entity entity, EntityState state)

He adds a caveat: these extension methods operate on internal methods that are subject to change in CTP and RTM releases and that might be exposed in future releases, which could make some of these extension methods obsolete. (See Danny's comment of 1/18/2008.)

Added: 1/17/2008 Updated: 1/20/2008

Mike Taulty Produces Five ADO.NET Data Services Screencasts

His ADO.NET Data Services - Screencasts post of January 17, 2008 offers links to five video screencasts.

The "A Basic Silverlight Client" screencast is a fast-paced, 14-minute segment that shows you how to create a simple read-only Silverlight Web client with WebDataGen.exe that's indistinguishable from the Windows version. Mike says:

I particularly enjoyed the Silverlight one which I stayed up into the small hours of the morning making. Why? Because of how closely it lines up with what you do in a full .NET application.

Once Silverlight grows some controls and databinding in version 2.0 I really see Silverlight+Data Services as being a killer combination.

Added: 1/17/2008

Rob Conery Continues the Subsonic 2.1 ("Pakala") Preview

Rob's SubSonic: 2.1 (Pakala) Preview, Part 2 post of January 16, 2008 shows an example of his Query2 syntax, which has decided LINQ overtones:

ProductCollection p = Northwind.DB.Select()
        .From<Northwind.Product>()
        .InnerJoin<Northwind.Category>()
        .Where("CategoryName").Like("c%")
        .ExecuteAsCollection<Northwind.ProductCollection>();

He also provides examples of code to return typed results from stored procedures and to substitute a new Repository pattern for the traditional ActiveRecord pattern.

Note: Pakala Village is on the Island of Kauai diagonally opposite Princeville, where Rob appears to live (on Weke Road, Hanalei).

Matt Warren Resurfaces with Part IX of the Building an IQueryable Provider Series

After a three-month hiatus, Matt finally posted LINQ: Building an IQueryable Provider - Part IX on January 16, 2008. (Matt blames the TV writer strike for the long delay.)

This post, which appears to be subtitled "Cleaning up the Mess," shows prospective LINQ IQueryable providers how to eliminate SELECT statement redundancy in generated SQL statements and eliminate unused columns from the query.

Be sure to read Frans Bouma's comment. He wants an episode that explains 'Funcletization' (and so do I).

Note: Frans' Developing Linq to LLBLGen Pro, part 11 post (see the "Frans Bouma Tackles All/Any, Cast, and Contains but Abandons Aggregate and Concat" topic below) and Luke Hoban's response to a MSDN C# forum question are the only two Google hits I get on 'Funcletization':

What you describe is actually a generally useful thing to do for Expression Trees. We refer to it internally as "Funcletization" - though the name leaves something to be desired. Linq to SQL does something similar down in it's guts - turning sub-expression-trees which do not depend on any of the parameters into Constant nodes.

You can implement this yourself by:

1) Writing a visitor over the Expression Tree which recreates the tree at each node.

2) Specializing this to check at each node whether the subtree depends on any of the parameters, and if not - call .Compile on it and replace the sub tree with a Constant node with the value returned from calling the delegate returned from .Compile()

I've been meaning to post some sample code that does this - but maybe you can beat me to it :-)
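Luke's two steps can be sketched roughly as follows. This is only an illustration, not his sample code or LINQ to SQL's internals: the names Funcletizer, PartialEval, Nominator, and Evaluator are hypothetical, and the sketch assumes the public System.Linq.Expressions.ExpressionVisitor base class that ships with .NET 4.0 and later (with .NET 3.5 you would need to write your own visitor).

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical helper: replaces parameter-independent subtrees with Constant nodes.
public static class Funcletizer
{
    public static Expression PartialEval(Expression expression)
    {
        HashSet<Expression> candidates = new Nominator().Nominate(expression);
        return new Evaluator(candidates).Visit(expression);
    }

    // Pass 1 (bottom-up): collect every node whose subtree contains no parameter.
    private sealed class Nominator : ExpressionVisitor
    {
        private readonly HashSet<Expression> candidates = new HashSet<Expression>();
        private bool cannotBeEvaluated;

        public HashSet<Expression> Nominate(Expression expression)
        {
            Visit(expression);
            return candidates;
        }

        public override Expression Visit(Expression node)
        {
            if (node == null) return null;

            bool saved = cannotBeEvaluated;
            cannotBeEvaluated = false;
            base.Visit(node);                      // visit the children first

            if (!cannotBeEvaluated)
            {
                if (node.NodeType != ExpressionType.Parameter)
                    candidates.Add(node);          // no parameter anywhere below
                else
                    cannotBeEvaluated = true;      // taints every ancestor
            }
            cannotBeEvaluated |= saved;
            return node;
        }
    }

    // Pass 2 (top-down): compile each outermost candidate and swap in its value.
    private sealed class Evaluator : ExpressionVisitor
    {
        private readonly HashSet<Expression> candidates;

        public Evaluator(HashSet<Expression> candidates)
        {
            this.candidates = candidates;
        }

        public override Expression Visit(Expression node)
        {
            if (node == null) return null;
            if (!candidates.Contains(node)) return base.Visit(node);
            if (node.NodeType == ExpressionType.Constant) return node;

            // Wrap the subtree in a parameterless lambda, compile it, and call it.
            object value = Expression.Lambda(node).Compile().DynamicInvoke();
            return Expression.Constant(value, node.Type);
        }
    }
}
```

Given captured locals a = 2 and b = 3, passing the tree for x => x > a + b through Funcletizer.PartialEval should return a lambda whose right-hand operand is a single Constant node holding 5, while the x parameter on the left is left alone. Edge cases such as void-typed subtrees are not handled in this sketch.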

Luke Hoban, who's a C# compiler PM, is probably the man to ask about Funcletization, but he appears to be blogging primarily (and infrequently) about functional programming.

Added: 1/17/2008 Updated: 1/20/2008

Danny Simmons Follows Up with EF Extension Methods Extravaganza Part II

The second episode, EF Extension Methods Extravaganza part II – Relationship Entry & IRelatedEnd of January 16, 2008, provides extension methods for querying relationship entries, EntityReference<T> and EntityCollection<T>. You're likely to find the following extension methods especially useful if the ADO.NET team doesn't provide the option to make foreign key values visible:

  • ObjectStateEntry.IsRelationshipForKey(EntityKey key)
  • ObjectStateEntry.OtherEndKey(EntityKey thisEndKey)
  • ObjectStateEntry.OtherEndRole(EntityKey thisEndKey)
  • IRelatedEnd.IsEntityReference()
  • IRelatedEnd.GetEntityKey()
  • IRelatedEnd.SetEntityKey(EntityKey key)
  • IRelatedEnd.Contains(EntityKey key)

Something to think about: All primary keys aren't necessarily surrogate keys; often it's a natural key that has business meaning, such as an employee ID number. Unless natural foreign key values are visible, you must instantiate the parent object to gain access to the value.

If part II is a harbinger, this looks to be a fast-paced series.

Added: 1/17/2008

Danny Simmons Launches Entity Framework Extension Methods Extravaganza

Danny's first episode, EF Extension Method Extravaganza Part I - ObjectStateEntry, of January 15, 2008 is "the first installment in a series of posts with lots of little extension methods which seem to make my recent Entity Framework experiments easier to write."

The first set of extension methods is:

  • GetEntityState()
  • ObjectStateEntry.UsableValues()
  • ObjectStateEntry.EdmType()
  • AssociationType.IsManyToMany()

This has the makings of an interesting series of posts.

Added: 1/16/2008

Marcelo Lopez Ruiz Explains ADO.NET Data Services' Built-In $filter Operators and Functions

Marcelo's Arithmetic and built-in functions for $filter post includes details about ADO.NET Data Services'

  • add (+)
  • sub (-)
  • div (/)
  • mul (*)
  • mod (%)

operators, as well as

  • String
  • DateTime
  • Math
  • Type

functions. The December 2007 "Using ADO.NET Data Services ('Project Astoria')" white paper for CTP1 also has descriptions and examples of these operators and functions.
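As a rough illustration of how an arithmetic operator ends up in a query URL, here is a minimal sketch. The FilterUrl class and its Build method are hypothetical helpers, not part of ADO.NET Data Services; the service root, the Orders entity set, and the Freight property are borrowed from the Northwind sample, and the gt comparison operator is assumed to be available alongside the arithmetic ones.

```csharp
using System;

// Hypothetical helper for composing a $filter query URL by hand.
public static class FilterUrl
{
    public static string Build(string serviceRoot, string entitySet, string filter)
    {
        // Escape the expression so spaces and symbols survive as a query value.
        return serviceRoot.TrimEnd('/') + "/" + entitySet
             + "?$filter=" + Uri.EscapeDataString(filter);
    }
}
```

For example, FilterUrl.Build("http://localhost:50539/Northwind.svc/", "Orders", "Freight mul 2 gt 100") asks for orders whose doubled freight charge exceeds 100.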

Added: 1/16/2008

The Pfx Team Describes PLINQ's Debugger Display

The Debugger display of PLINQ queries of January 15, 2008 shows you how to get the most out of PLINQ's debugger with illustrations of the debugger display and debugger tips for simple parallel queries.

If you didn't follow early Parallel Programming with .NET posts about PLINQ, they started with little or no fanfare on November 29, 2007. Filter the posts on the PLINQ topic to see all seven related posts to date.

Added: 1/16/2008

David Hayden Delivers Illustrated ASP.NET Dynamic Data Images from Databases Demo

David's ASP.NET Dynamic Data - Displaying, Inserting and Editing Images in SQL Server Database post of January 15, 2008 says the following about Scott Hunter's Sample for Displaying Images from the Database using Dynamic Data post of January 14, 2008:

Scott's example shows off a number of really cool features and techniques:

  • Authoring Custom Dynamic Data Fields: DbImage.ascx, DbImage_Edit.ascx, and FileImage.ascx.
  • Using a Custom Attribute, ImageFormat, that contains format metadata.
  • Creating a Custom HttpHandler, ImageHandler.ashx, for displaying images from the database.
  • Generating a LINQ query on the fly, via LinqImageHelper, to get the image using LINQ To SQL.
  • Working with various Metadata and Reflection Classes in ASP.NET Dynamic Data and the .NET Framework.

His post also notes that "[t]here are numerous helper classes in a separate project that consist of the ImageFormat Attribute, ImageHandler HTTP Handler, LINQImageHelper for generating a LINQ To SQL query to get the image, etc."

See the "Scott Hunter Demos ASP.NET Dynamic Data Displaying Images from a Database" topic below for more details.

Added: 1/16/2008

Mike Taulty Continues His Reflections on Access Control for ADO.NET Data Services

His ADO.NET Data Services - More Sketchy Thoughts on Access Control post of January 15, 2008 proposes a ServiceAuthorizationManager implementation to "determine who it is that's calling my service and whether I allow them to do it based on a role."

Added: 1/16/2008

Michael Sync Demos a Databound Silverlight ListView

Michael's Consuming ADO.NET Data Service (Astoria) from Silverlight post of January 15, 2008 shows you how to access the ADO.NET Data Service (Astoria) from Silverlight with the ADO.NET Data Services Silverlight Add-On. Michael says:

This article is just an introduction of how to use Astoria in Silverlight application. The sample that I used in this article is nearly as same as the sample which is mentioned in ASP.NET 3.5 Extensions Preview (Quick Start). But I used the Astoria Silverlight Addon and create the Silverlight listview that looks cool.

Thanks to Andy Conrad for the heads-up.

Added: 1/16/2008

Alex James Explains How to Perform Bulk Updates with LINQ to Entities: Part 2

From LINQ and Entity Framework Posts for 12/3/2007+:

Entity SQL v.1.0 doesn't include DML keywords, so you must use LINQ to Entities if you want to insert, update, or delete entities rather than operate on physical rows in the persistence store (a.k.a. database) directly. To update or delete entities, you must bring the entities into memory with the appropriate LINQ query before you modify or destroy them. Alex contends that this process "seems somewhat wasteful considering you could do this in T-SQL with one very simple update statement, thereby bypassing all the query, hydration and cross process shipping costs" when performing a batch update or deletion.

His previous post of December 7, 2007, Rolling your own SQL Update on-top of the Entity Framework - Part 1, showed about a third of the process for creating a generic Update<T>() function with the following signature:

public static int Update<T>(this IQueryable<T> queryable,  
       Expression<Func<T, T>> updator) where T : EntityObject 
{ 
    SqlDmlCommandFactory<T> factory = new SqlDmlCommandFactory<T>(queryable); 
    SqlCommand updateCmd = factory.GetUpdateCommand(updator); 
    return updateCmd.ExecuteNonQuery(); 
}

It's taken Alex more than a month to deliver his Rolling your own SQL Update on top of the Entity Framework - Part 2 post of January 15, 2008, which describes creating the constructor for SqlDmlCommandFactory and its SqlDmlContext object. Both also can provide the foundation for bulk deletion and insertion operations.

Alex's next post will cover the GetUpdateCommand() method. Hopefully, it won't take another month to deliver it.
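Alex hasn't yet shown how the updator expression is consumed, so the following is only a guess at one small piece of the puzzle: reading the member assignments out of an updator of the form o => new Order { ... }. The UpdatorReader class, its ReadAssignments method, and the Order class are hypothetical, and the EntityObject constraint is dropped so the sketch stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for an entity type.
public class Order
{
    public string ShipCity { get; set; }
    public decimal Freight { get; set; }
}

// Hypothetical helper: turns an updator into property-name/value pairs,
// which a command factory could then map to SET column = @value clauses.
public static class UpdatorReader
{
    public static IDictionary<string, object> ReadAssignments<T>(
        Expression<Func<T, T>> updator)
    {
        var init = updator.Body as MemberInitExpression;
        if (init == null)
            throw new NotSupportedException(
                "Updator must have the form x => new T { ... }.");

        var result = new Dictionary<string, object>();
        foreach (MemberBinding binding in init.Bindings)
        {
            var assignment = binding as MemberAssignment;
            if (assignment == null)
                throw new NotSupportedException(
                    "Only simple member assignments are handled.");

            // Evaluates the right-hand side locally. This works only when the
            // value does not reference the lambda parameter (for example,
            // Freight = o.Freight * 2 would fail to compile here).
            object value = Expression.Lambda(assignment.Expression)
                                     .Compile()
                                     .DynamicInvoke();
            result[binding.Member.Name] = value;
        }
        return result;
    }
}
```

Calling UpdatorReader.ReadAssignments<Order>(o => new Order { ShipCity = "Reno", Freight = 10m }) yields one entry per assigned property. Assignments that depend on the current row's values would need to be translated to SQL expressions instead of being evaluated in memory.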

Added: 1/16/2008

Frans Bouma Tackles All/Any, Cast, and Contains but Abandons Aggregate and Concat

Frans is nearing completion of his LINQ to LLBLGen Pro implementation after having completed work on "all major parts of a SELECT query," including GroupJoin, GroupBy, and the aggregate functions.

In his Developing Linq to LLBLGen Pro, part 11 of January 15, 2008, he starts working on the remaining Standard Query Operators (SQOs) in alphabetical order.

Aggregate: Frans abandons his attempt to implement the VB-only Aggregate keyword because it won't compile in C# 3.0. Probably not a good idea, Frans. My LINQ to XML: Grouping and Aggregation Gotchas, Part II post of the same day has several examples of its use in conjunction with LINQ to XML documents created with VB's XML literal data type.

Update 1/20/2008: See Frans' comment of 1/18/2008 and my reply of 1/20/2008 regarding the Aggregate keyword.

All / Any: All() and Any() inside a Where clause will be supported. However, use without parameters as the final methods in the query won't be.

Cast: Enables compiling queries that otherwise would fail from missing types or type conflicts.

Concat: Frans abandons Concat because LLBLGen doesn't support UNION queries.

Contains: Contains proved difficult because it's schizophrenic: It can be "Queryable.Contains, IList.Contains, String.Contains, EntityCollection.Contains, etc." as Frans demonstrates with 12 sample queries. Contains is critical because it implements the IN () predicate, inter alia.
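To make the query shapes above concrete, here is a minimal sketch that runs against LINQ to Objects, not LINQ to LLBLGen Pro. The Customer class, its members, and the OperatorShapes method names are hypothetical; the comments describe the SQL construct a database provider would typically aim for, not what Frans's provider actually emits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory entity.
public class Customer
{
    public string Id;
    public string Country;
    public List<decimal> OrderTotals = new List<decimal>();
}

public static class OperatorShapes
{
    // Any() with a predicate inside a Where clause (the supported shape);
    // a SQL provider would usually translate this to an EXISTS subquery.
    public static List<string> WithLargeOrder(
        IEnumerable<Customer> customers, decimal threshold)
    {
        return customers
            .Where(c => c.OrderTotals.Any(t => t > threshold))
            .Select(c => c.Id)
            .ToList();
    }

    // Parameterless Any() as the final method of the query
    // (the shape Frans says won't be supported).
    public static bool HasAny(IEnumerable<Customer> customers)
    {
        return customers.Any();
    }

    // Contains on a local collection: the shape that maps to an IN () predicate.
    public static List<string> InCountries(
        IEnumerable<Customer> customers, string[] countries)
    {
        return customers
            .Where(c => countries.Contains(c.Country))
            .Select(c => c.Id)
            .ToList();
    }
}
```

The same method name, Contains, would mean something different again if it were called on c.Country with a substring argument (String.Contains), which is part of what makes the operator awkward for provider writers.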

Frans clearly is on the home stretch and I'm looking forward to test-driving LINQ to LLBLGen Pro when it's done.

LINQ to XML: Grouping and Aggregation Gotchas, Part II

This is the promised sequel to my LINQ to XML: Grouping and Aggregation Gotchas, Part I post of January 7, 2008 which contains a single, large XML Infoset that demonstrates aggregation over business documents (such as sales orders or invoices) with rollups by customer, as well as grand totals for the document as a whole.

Rob Conery Describes Subsonic's Forthcoming MVC Implementation ("Makai")

Rob's SubSonic And MVC: Introducing Makai post of January 14, 2008 describes his vision and implementation of SubSonic's Scaffold control as a controller to provide a "complete editing suite with no tools to manage." Rob illustrates each current stock page template.

Subsonic "Makai" looks very similar to ASP.NET Dynamic Data and also resembles Blinq, but its editor includes the jQuery JavaScript library and it adds a Membership controller.

Scott Hunter Demos ASP.NET Dynamic Data Displaying Images from a Database

Scott's Sample for Displaying Images from the Database using Dynamic Data post of January 14, 2008 offers:

[A] sample project that shows some custom field template controls for Dynamic Data that allow images to be viewed/edited from a database. The sample also shows how you can view images that have their filenames in the database but the images on the physical disk. You can download the sample by clicking this download link: DOWNLOAD SAMPLE.

His post explains how to add the same functionality to your own Dynamic Data project.

LINQ to Google (GLinq) Surfaces on Source Forge

The latest third-party LINQ provider appears to be the Beta 1 version of Scott Brown's LINQ to Google (GLinq) that became available for download on December 3, 2007. According to Scott:

LINQ to Google allows developers to easily query Google's Data Sources using a strongly typed syntax. LINQ to Google shows an example of implementing IQueryable and IQueryProvider.

GLinq is an implimentation of the LINQ deferred execution model for querying Google's data sources. The initial release can be used to query the Google Base. Subsequent releases will target support for YouTube, Calendar, Email, etc. The ultimate goal of this project is to create a generic enough model to allow plugin providers for any REST API.

I will be adding posts in the near future on how the code works. For now you have to play with it yourself.

The source code carries a Common Development and Distribution License (CDDL).

Note: Don't confuse LINQ to Google with Luis Diego Fallas' LINQ to Google Desktop