Thursday, February 14, 2008

Pablo Castro: ADO.NET Data Services and the Atom Publishing Protocol

Pablo explains the Project Astoria team's approach to adopting the Atom Syndication Format (Atom, RFC 4287) and the Atom Publishing Protocol (APP, RFC 5023) as data-exchange (payload) formats in his "AtomPub support in the ADO.NET Data Services Framework" post of February 13, 2008.

Pablo says the following under the heading "Why are we looking at AtomPub?":

Astoria data services can work with different payload formats and to some level different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kind of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.

If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that [it's] hard to ignore. Of course, the devil is in the details and we'll get to that later on.

Here's part of a sample query result against the Northwind.Orders EntityCollection delivered in the December 2007 CTP's Atom payload format:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed xml:base="http://localhost:50539/Northwind.svc/" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns="http://www.w3.org/2005/Atom">
  <entry adsm:type="NorthwindModel.Orders">
    <id>http://localhost:50539/Northwind.svc/Orders(11077)</id>
    <updated />
    <title />
    <author>
      <name />
    </author>
    <link rel="edit" href="Orders(11077)" title="Orders" />
    <content type="application/xml">
      <ads:OrderID adsm:type="Int32">11077</ads:OrderID>
      <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007-05-06T00:00:00</ads:OrderDate>
      <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007-06-03T00:00:00</ads:RequiredDate>
      <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" />
      <ads:Freight adsm:type="Nullable`1[System.Decimal]">8.5300</ads:Freight>
      <ads:ShipName>Rattlesnake Canyon Grocery</ads:ShipName>
      <ads:ShipAddress>2817 Milton Dr.</ads:ShipAddress>
      <ads:ShipCity>Albuquerque</ads:ShipCity>
      <ads:ShipRegion>NM</ads:ShipRegion>
      <ads:ShipPostalCode>87110</ads:ShipPostalCode>
      <ads:ShipCountry>USA</ads:ShipCountry>
    </content>
    <link rel="related" title="Customers" href="Orders(11077)/Customers" type="application/atom+xml;type=entry" />
    <link rel="related" title="Employees" href="Orders(11077)/Employees" type="application/atom+xml;type=entry" />
    <link rel="related" title="Order_Details" href="Orders(11077)/Order_Details" type="application/atom+xml;type=feed" />
    <link rel="related" title="Shippers" href="Orders(11077)/Shippers" type="application/atom+xml;type=entry" />
  </entry>
  <!-- ... -->
</feed>
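
To make the entry/link/feed mapping concrete, here is a minimal Python sketch of how a client might read such a feed with the standard library. The namespace URIs, xml:base, element names, and the ads:null attribute are taken from the sample above; the function name parse_entries and the shape of the returned dictionaries are illustrative assumptions, not part of any Astoria client library. The embedded sample is abbreviated to a few properties and links.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace URIs copied from the sample feed above.
ATOM = "http://www.w3.org/2005/Atom"
ADS = "http://schemas.microsoft.com/ado/2007/08/dataweb"
ADSM = "http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Abbreviated copy of the sample feed (one entry, a few properties).
SAMPLE = """<feed xml:base="http://localhost:50539/Northwind.svc/"
      xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb"
      xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"
      xmlns="http://www.w3.org/2005/Atom">
  <entry adsm:type="NorthwindModel.Orders">
    <id>http://localhost:50539/Northwind.svc/Orders(11077)</id>
    <link rel="edit" href="Orders(11077)" title="Orders" />
    <content type="application/xml">
      <ads:OrderID adsm:type="Int32">11077</ads:OrderID>
      <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" />
      <ads:ShipCity>Albuquerque</ads:ShipCity>
    </content>
    <link rel="related" title="Customers" href="Orders(11077)/Customers"
          type="application/atom+xml;type=entry" />
    <link rel="related" title="Order_Details" href="Orders(11077)/Order_Details"
          type="application/atom+xml;type=feed" />
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Return one dict per Atom entry: id, type, properties, and links.

    Hypothetical helper for illustration only.
    """
    feed = ET.fromstring(feed_xml)
    base = feed.get(XML_BASE, "")
    entries = []
    for entry in feed.findall(f"{{{ATOM}}}entry"):
        content = entry.find(f"{{{ATOM}}}content")
        properties = {}
        for prop in (content if content is not None else []):
            name = prop.tag.rsplit("}", 1)[-1]  # strip the namespace prefix
            is_null = prop.get(f"{{{ADS}}}null") == "true"
            properties[name] = None if is_null else prop.text
        # Resolve each relative href against the feed's xml:base.
        links = [
            (link.get("rel"), link.get("title"), urljoin(base, link.get("href")))
            for link in entry.findall(f"{{{ATOM}}}link")
        ]
        entries.append({
            "id": entry.findtext(f"{{{ATOM}}}id"),
            "type": entry.get(f"{{{ADSM}}}type"),
            "properties": properties,
            "links": links,
        })
    return entries


if __name__ == "__main__":
    for order in parse_entries(SAMPLE):
        print(order["id"], order["properties"])
        for rel, title, href in order["links"]:
            print(" ", rel, title, href)
```

Property values are left as strings here; a fuller client would presumably use the adsm:type attribute to convert them to typed values.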

Using Atom and APP as data-exchange formats has a precedent. Google was one of the early proponents of Atom as a replacement for RSS and specified Atom 0.3 as Blogger's sole syndication format. The Google data APIs (GData) are based on either Atom or RSS for responses to HTTP GET requests and APP for updating data with POST (insert), PUT (update), or DELETE requests. GData uses an Atom extension for queries, while Astoria uses an expressive Uniform Resource Identifier (URI) syntax for GET requests. (The Google Base data API has a simple URI-based GET syntax for querying items by attribute values.) A more distinguishing characteristic is ADO.NET Data Services' use of LINQ to REST to generate URI-based WebDataQueries.
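
The verb mapping described above (GET to query, POST to insert, PUT to update, DELETE to delete) can be sketched in Python as follows. The service root and the Orders(11077) key syntax come from the sample feed; the helper names entity_uri and build_request, the Accept header, and the Content-Type value are assumptions made for illustration, since the post does not spell out Astoria's content negotiation. The requests are only constructed, never sent.

```python
from urllib.request import Request

# Service root taken from the xml:base of the sample feed.
SERVICE_ROOT = "http://localhost:50539/Northwind.svc/"

# Assumed media type for a single-entry request body.
ATOM_ENTRY = "application/atom+xml;type=entry"

# Operation-to-verb mapping as described in the text.
HTTP_METHODS = {
    "query": "GET",
    "insert": "POST",
    "update": "PUT",
    "delete": "DELETE",
}


def entity_uri(entity_set, key=None):
    """Build a URI such as .../Orders or .../Orders(11077) (hypothetical helper)."""
    suffix = f"({key})" if key is not None else ""
    return SERVICE_ROOT + entity_set + suffix


def build_request(operation, entity_set, key=None, body=None):
    """Return an unsent urllib Request for the given operation.

    body, when given, must be bytes containing an Atom entry document.
    """
    headers = {"Accept": "application/atom+xml"}  # assumed header
    if body is not None:
        headers["Content-Type"] = ATOM_ENTRY
    return Request(
        entity_uri(entity_set, key),
        data=body,
        headers=headers,
        method=HTTP_METHODS[operation],
    )


if __name__ == "__main__":
    req = build_request("update", "Orders", 11077, b"<entry/>")
    print(req.get_method(), req.full_url)
```

Each call produces exactly one HTTP request, which is why sending several PUT/POST/DELETE operations in a single round trip needs something beyond plain AtomPub.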

It appears that the following issues with implementing AtomPub as an Astoria protocol are still open for discussion:

  1. How does a client send a set of PUT/POST/DELETE operations to the server in a single go (request batching)?
  2. How is metadata that describes the structure of a service's endpoints exposed?
  3. How do we deal with aspects that AtomPub does not handle by design or just because it has not been needed so far?
  4. What to do with fields that may not have a backing value in the input source (e.g. [title], updated, author)?
  5. How high-level can we make clients so they can consume AtomPub-based Astoria services but still feel that they are working against regular objects and have general integration with the development environment?

The Astoria Team plans to post its proposed extensions to and application-level features for Atom for discussion on the atom-syntax and atom-protocol mailing lists.

Update 2/16/2008:

In his "ADO.NET Data Services (Astoria) Adopts AtomPub" post of February 16, 2008, Dare Obasanjo says:

I'm glad to see Microsoft making a huge bet on standards based, RESTful protocols especially given our recent history where we foisted Snakes On A Plane on the industry. [Emphasis Dare's.]

However since AtomPub is intended to be an extensible protocol, Astoria has added certain extensions to make the service work for their scenarios while staying within the letter and spirit of the spec.

In his "Thoughts on Google's Proposal for Granular Updates in AtomPub" post of February 16, 2008, Dare then analyzes Joe Gregorio's "How to do RESTful Partial Updates" proposal of February 15, 2008, which covers updating specific properties of an entry without retrieving and returning the entire entry. (Joe Gregorio

See the "Dare Obasanjo Seconds the ADO.NET Entity Data Team's Decision to Adopt AtomPub for Updates" topic of LINQ and Entity Framework Posts for 2/11/2008+ for more details about solving the partial updates problem.
