Thursday, June 14, 2007

Sahil Malik Takes on LINQ, LINQ to SQL, and Entity Framework

From the "How Did I Miss This?" department. (I subscribe to Sahil's blah!bLaH!BLOG from IE7, but his June 9, 2007 post in question probably was buried under a long SharePoint entry. Thanks to Sam Gentile for yesterday's New and Notable item, which also includes a link to Ian Cooper's June 10, 2007 Being Ignorant with LINQ to SQL post, the subject of this review.)

ADO.NET and SharePoint guru Sahil Malik's My Views about the ORM space, Entity Framework and all such stuff! dissertation on Microsoft's "New Generation Data Access" efforts for ADO.NET 3.5 delivers the following conclusions:

  • LINQ is Microsoft's latest "making a molehill from a mountain" project.

"What LINQ does give me, is a way to simplify my code for about 10-20% of the use cases in my code. It will revolutionize the 10-20%, but really - in an entire project - 10-20% of a revolution that took 3 years to incubate? I'm beginning to not get impressed."

  • LINQ to SQL isn't a production-grade technology. 

"LINQ to SQL is great for scratch and sniff - concept projects. ... LINQ to SQL is to .NET 3.5, what TableAdapters were to .NET 2.0. In my honest and not so humble opinion, most production projects will and should stay away from it."

  • The Entity Framework and Entity SQL get Sahil "really excited."

"The whole concept of the Entity Data Mapper, the Mapping provider, Entity Model and the best thing around—Entity SQL—are quite awesome. ... eSQL is what differentiates Entity Framework from the rest of the ORMs. eSQL is quite kickass. eSQL + LINQ gives you the organizability(!?) of C#, and the ease of Foxpro. What is there not to like? ... 

I am quite disappointed to learn that it won't be a part of Orcas. ... I am bored of data access, and will continue to watch it from the sidelines until we have some serious progress on the only one possible MSFT winner out of the above at this time, the Entity framework."

Yet Sahil finally concludes:

Note that LINQ will also fly, but IMO is a different animal and has nothing to do with Data Access. But really, LINQ isn't that big or complex (or even impressive! :-/).

Sounds like damnation by faint praise (or weak condemnation) to me.

LINQ and Data Access

Sahil is right; LINQ has nothing to do with data access.

LINQ is an enabling technology for applying a common SQL-like query syntax to a wide variety of data domains. LINQ's strongly typed queries consist of C# 3.0 or VB 9.0 keywords so they're checked at compile time—not runtime—and provide IntelliSense and statement completion.

LINQ to SQL is simply a domain-specific LINQ implementation. LINQ to Entities is another domain-specific implementation. There are many third-party LINQ implementations in process, including Ayende's LINQ to NHibernate and Bart De Smet's LINQ to SharePoint.
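To make the "one query syntax, many data domains" idea concrete, here is a minimal sketch in Python rather than C# 3.0 or VB 9.0. It is an analogy, not LINQ: the sample data, the `customers` table, and the `berlin_names` helper are all hypothetical. Note also that the filtering here always runs on the client, whereas a real LINQ to SQL provider translates the query to SQL and runs it on the server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data, held in three different "domains".
in_memory = [("Alfreds", "Berlin"), ("Bolido", "Madrid"), ("Drachen", "Berlin")]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, city TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", in_memory)

xml_doc = ET.fromstring(
    "<customers>"
    "<customer name='Alfreds' city='Berlin'/>"
    "<customer name='Bolido' city='Madrid'/>"
    "<customer name='Drachen' city='Berlin'/>"
    "</customers>"
)

def berlin_names(rows):
    """One query shape (filter, project, order) for any iterable of (name, city)."""
    return sorted(name for name, city in rows if city == "Berlin")

# The same query text is reused unchanged against objects, SQL rows, and XML.
from_objects = berlin_names(in_memory)
from_sql = berlin_names(db.execute("SELECT name, city FROM customers"))
from_xml = berlin_names((c.get("name"), c.get("city")) for c in xml_doc.iter("customer"))
```

All three calls share one query definition; only the code that exposes each source as a sequence differs, which is roughly the role a domain-specific LINQ implementation plays.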

Note: Forgot to mention in prior posts that Bart De Smet is going to work for Microsoft's WPF team in the Developer Division. Hopefully, he'll continue working on these third-party LINQ implementations.

Update 6/16/2007: Bart says in this comment that he plans to continue work on his LINQ implementations after moving to Microsoft in October and that an update to LINQ to SharePoint is scheduled for this month.

Update 6/17/2007: True to his word, Bart posted a set of samples that illustrate how to use LINQ to SharePoint today. (I bet he'll miss the moules au vin blanc from the joints off the Grand Place like Chez Leon on the Petite rue des Bouchers or less touristy places in the outskirts of Brussels.)

LINQ also is responsible for adding many new constructs to C# 3.0 and VB 9.0, including some from functional languages, such as Haskell:

  • Local variable type inference implemented by C# 3.0’s var and VB 9.0’s Dim keywords to shorten the syntax for declaring and instantiating generic types and support anonymous types
  • Object initializers to simplify object construction and initialization with syntax similar to array initializers
  • Collection initializers to combine the concept of array initializers and object initializers and extend it to generic collections
  • Anonymous types to define inline CLR types without writing a formal class declaration for the type
  • Lambda expressions to simplify the syntax of C# 2.0’s anonymous methods, deliver inline functions to Visual Basic developers, and aid type inference and conversion for expression trees
  • Extension methods to enable chaining of extensions that add custom methods to a CLR type without the need to subclass or compile it

However, Sahil complains in his May 12, 2007 A different point of view post (linked from my New Series on Closures in Visual Basic 9.0 item) that:

It's a shame that the language is getting so complex. This is the same mistake C++ made 7 years ago, and why .NET was so successful. It's a shame that we are seeing the same mistake being made all over again.

It seems to me that LINQ and its query expressions make the language simpler by cloaking the complexity of some of these new language features with "syntactic sugar." Sahil seems to abandon his "LINQ has nothing to do with data access" point when he says:

Finally, don't forget - LINQ doesn't buy you any performance gain, or set based theory like Foxpro did, it is more or less syntactical sugar and bunch of .NET code under the scenes that you didn't have to write. [Emphasis added.]

And Sahil did say earlier that LINQ "gives you .. the ease of Foxpro," which implies the relational data domain to me.
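The "syntactic sugar" point can be illustrated with a small sketch. In C# 3.0, a query expression is translated by the compiler into calls such as Where and Select that take lambda expressions. The Python analogy below, with a hypothetical `Order` type and sample data, shows the same relationship between a query-style form and the explicit calls it stands in for.

```python
from dataclasses import dataclass

@dataclass
class Order:
    customer: str
    total: float

# Hypothetical sample data.
orders = [Order("Alfreds", 120.0), Order("Bolido", 35.5), Order("Alfreds", 80.0)]

# Query-style form: reads like "select total from orders where customer = ...".
sugared = [o.total for o in orders if o.customer == "Alfreds"]

# Explicit form: the same query written as function calls that take lambdas.
desugared = list(
    map(lambda o: o.total, filter(lambda o: o.customer == "Alfreds", orders))
)
```

Both forms produce the same result; the first simply hides the lambdas and the call chain, which is the sense in which the sugar reduces rather than adds visible complexity.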

Sahil on Entity SQL as the "Best Thing Around"

I believe Sahil waxes a bit too enthusiastic when describing Entity SQL (eSQL) as the "best thing around" in his paean to the Entity Framework. Apparently, he missed the several posts that mention the omission of Data Manipulation Language (DML) constructs (INSERT, UPDATE and DELETE) from eSQL v1. Perhaps most of his work involves read-only data access. Microsoft recommends using the ObjectContext for DML operations, presumably with LINQ to Entities for UPDATEs and DELETEs.

Erik Meijer, known as the "Creator" of LINQ, has these plans for updatable views and O/R mapping in LINQ 2.0:

Just as we provided deep support for XML in Visual Basic, in LINQ 2.0 we hope to directly support relationships and updatable views at the programming-language level. In that case, we only need a very thin layer of non-programmable default mapping at the edge between the relation and object world and allow programmers to express everything else in their own favourite LINQ-enabled programming language. The result is that just as LINQ 1.0 side-stepped the impedance mismatch “problem” with something better (monads and monad comprehensions), LINQ 2.0 will sidestep the mapping “problem” with something better (composable and programmable mapping).

It sounds to me as if LINQ 2.0 might transcend eSQL and potentially replace the Entity Framework.

Comments Tell the Tale

Sahil's post had 16 comments on June 14, 2007, many of which were from .NET luminaries, such as Don Demsak (donxml), Aaron Erickson (author of the i4o LINQ indexing extension), Ian Cooper, and Frans Bouma (lead developer of the LLBLGen Pro O/RM). I have the feeling that this post caused (or at least contributed to) Ian's Being Ignorant with LINQ to SQL post. The comments include this astounding claim by Damon:

No vendor (open source or for purchase) has anything now production quality except NHIbernate that can really call itself an ORM

Sahil's item also elicited a Sahil on O/RM response from Ayende and this comment from Sahil:

eSQL gives you runtime ability to run queries against your objects.

You might suggest that LINQ does the same, but not really. eSQL gives you the ability to truly bring set based theory into higher level programming languages. There is a query optimizer built into the eSQL framework, so the queries on your object model take advantage of db concepts.

This is something current ORMs cannot do.

Secondly, I am pretty firm on my LINQ to SQL views - I am pretty sure of that.

The Entity Framework's EntityCommand and ObjectCommand objects take eSQL strings and bring no more "set based theory into higher level programming languages" than T-SQL or PL/SQL strings do. Only LINQ incorporates a query syntax "into higher level programming languages" (e.g., C# and VB). That's LINQ's claim to fame and the objective of LINQ to Entities.
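The difference between a query held in a string and a query written in the host language's own syntax can be sketched as follows. This is a Python analogy using SQLite, not eSQL or LINQ, and the `customers` table and both deliberately broken queries are hypothetical. The Python query is kept in a string here only so that the sketch can trigger the compiler's check on purpose; in a real program it would be ordinary source code.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, city TEXT)")

# A query held in a string: the host language sees only text, so the typo
# ("SELEC") goes unnoticed until the database parses it at run time.
bad_sql = "SELEC name FROM customers"
try:
    db.execute(bad_sql)
    string_error = None
except sqlite3.OperationalError as exc:
    string_error = exc

# A query in the language's own syntax: a malformed query (dangling "if")
# is rejected by the language's compiler before any of it executes.
bad_query_source = "[name for name, city in customers if]"
try:
    compile(bad_query_source, "<query>", "eval")
    syntax_error = None
except SyntaxError as exc:
    syntax_error = exc
```

The first mistake is reported by the database engine and the second by the language itself. C# and VB go further than Python can, because their compilers also check types and member names in a LINQ query.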

The way I understand the plan is this: eSQL is a SQL dialect for querying the entities that CSDL (Conceptual Schema Definition Language) defines in the conceptual schema layer of the Entity Data Model (EDM), or for querying through the optional Object Services layer's ObjectContext. A command tree in the EntityClient's custom query pipeline for the RDBMS (limited to SQL Server 200x and SQL Server Express Edition at present) translates eSQL to the RDBMS's SQL dialect. Query optimization, if any, takes place on the database server.
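The translation step described above can be sketched in miniature. The code below is not the Entity Framework's command tree or provider API; the `Select` and `Filter` node types, the `ENTITY_TO_TABLE` mapping, and the `tblCustomers` table are all hypothetical, and SQLite stands in for the target RDBMS. It only illustrates the shape of the idea: a query tree over a conceptual entity is turned into SQL text for one dialect, and the database server does the rest.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Filter:
    column: str
    value: object

@dataclass
class Select:
    entity: str    # conceptual entity name, not a table name
    columns: list
    where: Filter

# Hypothetical mapping from a conceptual entity to its storage table.
ENTITY_TO_TABLE = {"Customer": "tblCustomers"}

def to_sql(tree):
    """Translate the tree into SQL text plus parameters for one dialect (SQLite)."""
    table = ENTITY_TO_TABLE[tree.entity]
    cols = ", ".join(tree.columns)
    # Identifiers are interpolated unchecked; acceptable only in a sketch.
    sql = f"SELECT {cols} FROM {table} WHERE {tree.where.column} = ?"
    return sql, (tree.where.value,)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tblCustomers (name TEXT, city TEXT)")
db.executemany(
    "INSERT INTO tblCustomers VALUES (?, ?)",
    [("Alfreds", "Berlin"), ("Bolido", "Madrid")],
)

tree = Select("Customer", ["name"], Filter("city", "Berlin"))
sql, params = to_sql(tree)
rows = db.execute(sql, params).fetchall()  # any optimization happens in the engine
```

Nothing in `to_sql` plans or optimizes the query; it only rewrites it, which matches the division of labor described in the paragraph above.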

I have my doubts that eSQL will become the lingua franca of "SQL for Entities" any time soon, although IBM, Oracle, MySQL and others appear to be developing custom EntityClient implementations. IBM's interest appears to be LINQ-enabling DB2; the remaining third-parties haven't stated their goals.

Update 6/15/2007: LINQ to SQL architect Matt Warren elaborates in this comment on eSQL and confirms that eSQL doesn't include a query optimizer.

Michael Pizzo offers an architect's view of the Entity Framework with emphasis on inheritance in his "An Application-Oriented Model for Relational Data" article for Microsoft's The Architectural Journal #12. He says the following about Client Views generated by eSQL:

The Entity Framework uses a Client View mechanism to expand queries and updates written against the conceptual model into queries against the storage schema. The expanded queries are evaluated entirely within the database; there is no client-side query processing. These Client Views may be compiled into your application for performance, or generated at runtime from mapping metadata provided in terms of XML files, allowing deployed applications to work against different or evolving storage schemas without recompilation.

Update 6/14/2007: Entity Framework developer Danny Simmons clarifies the relative ease with which third parties can develop EntityClients for their RDBMSs in this comment.

LINQ to Entities and LINQ to SQL are analogous implementations; their command trees translate LINQ expressions to eSQL and T-SQL respectively. From what I've seen of eSQL, I would use LINQ to Entities unless there was some eSQL construct I badly needed and LINQ to Entities couldn't translate it. (Such an issue would appear to me to qualify as a bug.)

7 comments:

Daniel Simmons said...

One point of clarification: Toward the bottom of this post you mention that "IBM, Oracle, MySQL and others appear to be developing custom EntityClient implementations." While it's certainly true that these folks are working with the ADO.Net team to create (or more often update existing) providers that work with the entity framework, the effort involved in doing this is not at all the same as a custom EntityClient implementation. That is, they don't have to write a provider that does mapping, interprets eSQL or many of the other things that EntityClient does. They just have to translate from entity framework query trees that are already in terms of their database schema to the actual query syntax their backend supports (normally a fairly straightforward task).

- Danny

--rj said...

Danny,

Thanks for the clarification. I've stricken "custom" and added a linq to your comment to make sure folks see it.

--rj

Matt Warren said...

ESQL is a SQL variant that has additional operations that are useful over the entity data domain. The Entity Framework translates ESQL into regular SQL for execution on the database server. As far as I know it is not optimized in the traditional sense, except perhaps where mapping translation comes in. Someone could of course build a ESQL query engine, or at least use it as the front end of one. Will we see a version of SQL Server someday that uses ESQL instead of TSQL? Probably not, but I'd bet that TSQL starts turning into something more like ESQL.

Bart De Smet said...

Hi Roger,

You're doing an awesome job with your blog covering various LINQ related topics (amongst others)! Thanks for linking to my projects and my blog so many times already.

Just to set things right: despite my move to the WPF team in October this year, I'll try to continue my LINQ-related projects on a regular basis. In the meantime, watch out for some LINQ to SharePoint news later this month :-).

Keep up the good work,
-Bart

--rj said...

Hi, Bart,

Thanks for the kind words on the blog and the news about continuing work on your LINQ implementations.

Cheers,

--rj

Aaron Erickson said...

Indeed - the comment that eSQL brings set related operations to the language baffles me as well. The whole point of LINQ, more than being about hitting SQL or XML or Flikr or Amazon, is to make set operations a first class part of C#, without having to write code that writes code (i.e. concatenate strings in eSQL to embed business logic).

And, by the way, I also thank you for building a great resource from which to learn about LINQ technologies!