Friday, May 18, 2007

Erik Meijer and LINQ 2.0 - Round 2

Microsoft's peripatetic LINQ theorist/evangelist Erik Meijer delivered the second edition of his "LINQ 2.0: Democratizing the Cloud" presentation to the XTech 2007 Conference in Paris on May 16, 2007. The session's overview demonstrates that LINQ 2.0 is more than Erik's vision for "simplifying the development of data-intensive distributed applications."

Erik observes:

One painful problem that is yet unsolved is the “last mile” of data programming namely the mapping between data models at the edges, in particular the mapping between relational data and objects.

My response? "Ain't that the truth!"

Jeni Tennison wrote a brief review of Erik's session to which Mike Champion replied.

Update 5/22/2007: Erik also presented XML and LINQ: What's new in Orcas and beyond at XTech 2007.

What's new in round 2 is that LINQ 2.0 will address an issue that plagues the current (delayed) version of the Entity Framework (EF) and Entity SQL (eSQL)—lack of update capability.

Object/Relational Mapping and Updatable Views

ADO.NET program manager Zlatko Michailov wrote in his Entity Client post of February 14, 2007:

Entity SQL will not support any DML constructs – INSERT, UPDATE, or DELETE in the Orcas release. Data modification could still be done through ObjectContext, or through the native SQL of the target store. It is our goal to implement DML support but for now it’s more important to verify we’ve built the right query language. [Emphasis added.]

You can read more about this issue in Bob Beauchemin's Re: UPDATE, INSERT, DELETE in eSQL? thread of September 14, 2006 in the ADO.NET Orcas forum. Pablo Castro replied on September 15, 2006:

We want to do DML for Entity SQL over the mapping layer, it's just that we'll not get to it in the upcoming release.

Just three weeks before Zlatko's belated admission, ADO.NET data architect Mike Pizzo had written in his January 23, 2007 Data Access API of the Day - Part IV (Programming to the Conceptual Model…) post:

These rich conceptual schemas are exposed and queried through an "Entity Client".  The Entity Client is an ADO.NET Data Provider that builds queries against storage-specific providers using client-side read/write "views". Queries and updates written against these conceptual views are expanded by the Entity Client and executed as queries against underlying storage-specific providers.  All the actual query execution is done in the store (not on the client), and the results are assembled into possibly hierarchical, polymorphic, results with nesting and composite members. [Emphasis added.]

Updating views isn't a walk in the park, as Microsoft Research's Sergei Melnik writes in his August 23, 2006 "Mapping-Driven Data Access" position paper:

[T]o date there is no elegant and widely accepted mechanism for specifying view update behavior and processing updates in mapping-driven data access scenarios. As a consequence, commercial database systems offer poor support for updatable views.

Sergei goes on to say, about three weeks before Pablo's announcement that updates wouldn't make it into the first EF release:

The database architects and researchers at Microsoft have been working on an innovative mapping architecture which aims to address the challenges identified in the previous sections. It exploits the following ideas:

• Mappings are specified using a declarative language that has well-defined semantics and puts a wide range of mapping scenarios within reach of non-expert users.
• Mappings are compiled into bidirectional views, called query and update views, that drive query and update processing in the runtime engine.
• Update translation is done using a general mechanism that leverages view maintenance, a mature database technology [5, 13].

The new mapping architecture enables building a powerful stack of mapping-driven technologies in a principled, future-proof way. Moreover, it opens up interesting research directions of immediate practical relevance.

It has been implemented in the upcoming release of the ADO.NET Entity Framework, which provides a mapping-driven data access layer for .NET applications. [Emphasis added.]

It appears that view maintenance turned out to be a bit harder to implement than Sergei, Mike Pizzo and the ADO.NET team estimated. LINQ to SQL now supports updates with dynamic SQL and stored procedures, but it offers only one-to-one mapping of tables to entities and has limited inheritance support. (The Beta 1 version also has a few bugs in its stored procedure support.)
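To make the "query and update views" idea in Sergei's list a bit more concrete, here's a minimal sketch in Python. It is not how the Entity Framework is implemented; the two-table store, the Person entity, and the function names query_view and update_view are all hypothetical, chosen only to show a mapping that isn't one table per entity. The sketch also recomputes whole tables, whereas the paper describes translating incremental updates with view maintenance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical entity type; dept is set only for employees."""
    id: int
    name: str
    dept: Optional[str] = None


def query_view(person_rows, employee_rows):
    """Store -> entities: join two tables into one entity set."""
    return [Person(pid, name, employee_rows.get(pid))
            for pid, name in sorted(person_rows.items())]


def update_view(people):
    """Entities -> store: split the entity set back into two tables."""
    person_rows = {p.id: p.name for p in people}
    employee_rows = {p.id: p.dept for p in people if p.dept is not None}
    return person_rows, employee_rows


# Hypothetical store contents: persons(id -> name), employees(id -> dept).
persons = {1: "Ann", 2: "Bob"}
employees = {2: "Sales"}

people = query_view(persons, employees)

# Modify an entity, then push the change back through the update view.
changed = [Person(p.id, p.name, "Research") if p.id == 1 else p for p in people]
new_persons, new_employees = update_view(changed)
```

The property that makes the pair "bidirectional" is the round trip: pushing unmodified entities back through update_view reproduces the original tables, as long as every employee row has a matching person row. Keeping that guarantee for arbitrary mappings and arbitrary updates is the hard part.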

Abandon eSQL and Move to LINQ 2.0 for Updates and Queries?

Here's Erik's grand plan for LINQ 2.0, O/R mapping, and updatable views:

Just as we provided deep support for XML in Visual Basic, in LINQ 2.0 we hope to directly support relationships and updatable views at the programming-language level. In that case, we only need a very thin layer of non-programmable default mapping at the edge between the relation and object world and allow programmers to express everything else in their own favourite LINQ-enabled programming language. The result is that just as LINQ 1.0 side-stepped the impedance mismatch “problem” with something better (monads and monad comprehensions), LINQ 2.0 will sidestep the mapping “problem” with something better (composable and programmable mapping).

If LINQ 2.0 can deliver "composable and programmable mapping," why add DML commands to eSQL, a non-standard, proprietary SQL dialect? If LINQ 2.0 is two years away, as Erik estimates, and EF's RTM has been delayed until 2008H1 (or later), why not just let EF [N]hibernate for another year, then release with LINQ 2.0 for Entities?

However, if "we only need a very thin layer of non-programmable default mapping", perhaps Erik doesn't intend LINQ 2.0 to supplement EF and LINQ to Entities, but instead to replace them. I don't consider three layers of complex XML mapping documents to be "a very thin layer."
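For readers who haven't run into the "monad comprehensions" Erik mentions, here's a rough sketch in Python rather than a LINQ-enabled language. It shows a query written as a comprehension and the same query desugared into nested calls to bind, the list-monad operation that LINQ exposes as SelectMany. The customer and order data are made up for illustration, and bind and unit are helper names I've assumed, not part of any library.

```python
def unit(x):
    """Wrap a single value in the list monad."""
    return [x]


def bind(xs, f):
    """List-monad bind: apply f to each element and flatten the results."""
    return [y for x in xs for y in f(x)]


# Hypothetical sample data: (customer_id, city) and (order_id, customer_id).
customers = [("ALFKI", "Berlin"), ("BONAP", "Marseille")]
orders = [(1, "ALFKI"), (2, "BONAP"), (3, "ALFKI")]

# Comprehension form: roughly "from c in customers from o in orders
# where o.customer_id == c.customer_id select (c.city, o.order_id)".
comprehension = [(city, oid)
                 for cid, city in customers
                 for oid, ocid in orders
                 if ocid == cid]

# The same query desugared into nested binds; the filter becomes
# "return one result or none".
desugared = bind(customers, lambda c:
            bind(orders, lambda o:
                 unit((c[1], o[0])) if o[1] == c[0] else []))
```

Both forms produce the same list of (city, order id) pairs. The point of the monadic view is that the query is defined only in terms of bind and unit, so the same comprehension can run over in-memory objects, XML, or relational rows.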

The First Round

In his original "Democratizing the Cloud" session at the QCon 2007 Conference (London) on March 15, 2007, Erik introduced the term LINQ 2.0 but concentrated on automatically refactoring data-intensive apps to n-tier projects for the Web that support all client devices. None of the bloggers who reported on Erik's session covered it thoroughly, but my "LINQ 2.0" in Early Returns from QCon 2007, London post has links to as many as I could find at the time. Tim Anderson's Getting to grips with LINQ 2.0 review is the most comprehensive.

In round 2, Erik reiterated his dynamic refactoring proposal and added data concurrency and synchronization to the mix. His overview doesn't mention how or whether PLINQ and Synchronization Services for ADO.NET fit into his concurrency and synchronization vision. However, my take is that "composable and programmable mapping" has the potential to offer more immediate benefit to data-oriented .NET developers than tool-driven n-tier/device refactoring or solving concurrency/sync problems.

Round 3

The next round of "LINQ 2.0: Democratizing the Cloud" is scheduled for June 14, 2007 at DevDays Europe 2007 in Amsterdam's RAI convention center. In addition to Erik, DevDays features a panoply of .NET heavy hitters, such as Scott Guthrie, Francesco Balena, Dino Esposito, Ingo Rammer, Gert Drapers, Jan Tielens, Daniel Moth, Tim Sneath, and Alex Thissen. Molly Holzschlag will be "Gazing into the Future of Web Development."

Here are the LINQ and Entity Framework-related sessions at DevDays Europe 2007:

Round 4

On June 15, 2007, Erik's giving the same presentation as an invited talk (keynote) at XIME-P 2007, the 4th International Workshop on XQuery Implementation Experience and Perspectives, in Beijing, the day after his June 14 Amsterdam presentation. (Erik says his 9:20 PM flight from AMS to PEK will get him to the airport at 8:55 AM. Hope he gets to the venue on time; PEKing airport is about 10-12 miles from Beijing's center. Update 6/16/2007: Erik says he "made it to his hotel before 9:00 AM, checked in, took a shower, and spent a nice day at the workshops. Both talks went very well." Apparently, his route provided a substantial tailwind.)

Here's the blurb:

The web is rocking the world of developers. Our customers love consistency. They want to have the same rich experience, anywhere, any time, on any device. Our sales people love market share. They want no platform that cannot leverage their web services. We as developers have embraced agile methods. We want to keep our options open as long as possible and create software incrementally by successive refactorings. This surely sounds like a contradiction, another impossible triangle just like infamous Relations-Objects-XML trio we tackled with LINQ 1.0. As the Dutch artist MC Escher once said "Only those who attempt the absurd will achieve the impossible". With LINQ 2.0, we are trying to stretch the .NET framework to cover the Cloud such that it will become possible to incrementally and seamlessly design, develop, and debug complex distributed applications using your favorite existing and unmodified .NET compiler and deploy these applications anywhere.

Thanks to Mike Champion (still writing for the XML Team) for bringing Erik's round 2 presentation to my attention. I believe this is the first of his recent LINQ presentations I didn't catch before Erik gave it.