Monday, June 11, 2007

Ian Cooper Takes on DDD, TDD and PI with LINQ to SQL

UK developer Ian Cooper posted a detailed analysis on Sunday of how LINQ to SQL fits into domain-driven design (DDD) and test-driven development (TDD), and then raised the issue of the LINQ implementation's persistence ignorance (PI). His bio says:

Ian has over 15 years of experience delivering Microsoft platform solutions in government, healthcare, and finance. During that time he has worked for the DTi, Reuters, Sungard, Misys and Beazley delivering everything from bespoke enterprise solutions to 'shrink-wrapped' products to thousands of customers. Ian is a passionate exponent of the benefits of OO and Agile [programming]. He is test-infected and contagious. When he is not writing C# code he is also the and founder of the London .NET user group.


Ian's Being Ignorant with LINQ to SQL essay starts by contrasting data-centric and domain-centric design approaches: "Data-centric designs tend to flow the relational model into the code" while "[t]hose who tend to be domain-centric flow the domain model out to their persistent store." Ian classifies LINQ to SQL "as a domain-centric tool because of its design goal of making it possible to share on[e] query syntax across many collection types and in the feature set provided by data context." The capability to employ a common set of LINQ queries over in-memory collections and the persistent store is critical to his final conclusion:

LINQ to SQL is usable with a TDD/DDD approach to development. Indeed the ability to swap between LINQ to Objects and LINQ to SQL promises to make much more of the code easily testable via unit tests than before. [Emphasis added.]

Ian goes on to analyze LINQ to SQL's feature set in terms of patterns from Martin Fowler's Patterns of Enterprise Application Architecture. This is the first example of such an analysis that I've seen for LINQ to SQL. I wouldn't characterize the ActiveRecord pattern, which Ruby on Rails and MonoRail use, as domain-centric. As Ian notes in a reply to Gregory Young's comment:

I understand some folks like ActiveRecord, but I think it has issues for PI, because it is usually intrusive into the domain classes via a template method or reference to protected variables.

Update 6/12/2007: In a 6/12/2007 reply to a comment from Gregory Young, Ian agrees that ActiveRecord is data-centric.

I've added a request that the ADO.NET team provide a similar analysis for the Entity Framework (EF) and Entity Data Model (EDM) as #17 to my suggestions for Defining the Direction of LINQ to Entities/EDM.

Persistence Ignorance

He tests LINQ to SQL against Jimmy Nilsson's eight conditions that preclude persistence ignorance and concludes:

LINQ to SQL scores pretty well against the PI checklist. As always there are trade-offs where performance can be obtained by specific features. It would be nice if we could choose to trade off lazy loading for standard collections so that we could obviate the need to use specific collection types for associations unless we needed lazy loading, but otherwise there is nothing to complain about here.

The "specific collection types" Ian refers to are EntityRef and EntitySet for associations. A reader named Wuz notes that these can be replaced by a plain object reference and a list type, if you don't mind giving up lazy loading and specifying the entities to load with DataShape [to become DataLoadOptions in Beta 2 and later].
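The trade-off between a persistence-aware collection type and a plain list can be sketched in a few lines. The following is a language-neutral illustration in Python, not LINQ to SQL code: LazyCollection is a hypothetical stand-in for the role EntitySet plays, and every class and function name here is an assumption made for the example.

```python
from typing import Callable, Generic, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class LazyCollection(Generic[T]):
    """Hypothetical persistence-aware collection: loads items on first access."""

    def __init__(self, loader: Callable[[], List[T]]) -> None:
        self._loader = loader
        self._items: Optional[List[T]] = None

    @property
    def is_loaded(self) -> bool:
        return self._items is not None

    def _ensure_loaded(self) -> List[T]:
        # Call the loader at most once, then serve the cached items.
        if self._items is None:
            self._items = self._loader()
        return self._items

    def __iter__(self) -> Iterator[T]:
        return iter(self._ensure_loaded())

    def __len__(self) -> int:
        return len(self._ensure_loaded())


class LazyCustomer:
    """Domain class that depends on the special collection type."""

    def __init__(self, customer_id: str, order_loader: Callable[[], List[str]]) -> None:
        self.customer_id = customer_id
        self.orders = LazyCollection(order_loader)


class PlainCustomer:
    """Domain class with a plain list; the orders must be loaded up front."""

    def __init__(self, customer_id: str, orders: List[str]) -> None:
        self.customer_id = customer_id
        self.orders = list(orders)


def make_counting_loader(orders: List[str]) -> Tuple[Callable[[], List[str]], List[int]]:
    """Return a fake loader plus a list that records each time it is called."""
    calls: List[int] = []

    def loader() -> List[str]:
        calls.append(1)
        return list(orders)

    return loader, calls
```

PlainCustomer knows nothing about how its orders arrive, which is the persistence-ignorant shape, but something outside the class has to decide what to load eagerly. LazyCustomer defers that cost until the orders are first touched, at the price of a dependency on the special collection type.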

There's been little or no discussion up to this point about PI in LINQ to SQL. The lack of interest on the part of the participants in the controversy over PI in EF and EDM is probably due to LINQ to SQL being joined at the hip to SQL Server 200x.

TDD and Code Generation

Ian isn't a fan of code generation for creating classes or databases:

This article is about a TDD approach to using LINQ which means that I am not using the code-generation made available through the designers in Orcas. ...

SQLMetal provides code generation support for strongly typed data-contexts in LINQ to SQL (for both mapping file and attribute based approaches); Orcas will ship with designers for people who don't like working with a command line. I prefer to avoid them for anything that is not demo based or first-cut.

He then goes on to describe his approach to TDD with LINQ to SQL and LINQ to Objects and demonstrates how to switch between an in-memory repository and the persistent store (SQL Server) for test doubling.
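The idea of swapping an in-memory repository for the persistent store can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Ian's code and not the LINQ to SQL API: in his approach the shared piece is the LINQ query itself, whereas here it is a small repository interface. The standard-library sqlite3 module stands in for SQL Server, and all class, table, and function names are hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Customer:
    customer_id: str
    city: str


class CustomerRepository(Protocol):
    """The only thing the code under test knows about persistence."""

    def add(self, customer: Customer) -> None: ...

    def in_city(self, city: str) -> List[Customer]: ...


class InMemoryCustomerRepository:
    """Test double: keeps customers in a plain list."""

    def __init__(self, customers: Iterable[Customer] = ()) -> None:
        self._customers = list(customers)

    def add(self, customer: Customer) -> None:
        self._customers.append(customer)

    def in_city(self, city: str) -> List[Customer]:
        return [c for c in self._customers if c.city == city]


class SqlCustomerRepository:
    """Persistent implementation; SQLite is an assumed stand-in for SQL Server."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(customer_id TEXT PRIMARY KEY, city TEXT NOT NULL)"
        )

    def add(self, customer: Customer) -> None:
        self._conn.execute(
            "INSERT INTO customers (customer_id, city) VALUES (?, ?)",
            (customer.customer_id, customer.city),
        )

    def in_city(self, city: str) -> List[Customer]:
        rows = self._conn.execute(
            "SELECT customer_id, city FROM customers WHERE city = ?", (city,)
        )
        return [Customer(*row) for row in rows]


def customer_ids_in(repo: CustomerRepository, city: str) -> List[str]:
    """Domain logic under test; it works with either repository."""
    return sorted(c.customer_id for c in repo.in_city(city))
```

A unit test would pass an InMemoryCustomerRepository to customer_ids_in, while an integration test or production code would pass a SqlCustomerRepository built on a real connection. The function under test does not change between the two.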

The Entity Framework versus LINQ to SQL

Ian is a proponent of LINQ to SQL and an Entity Framework detractor. In an earlier LINQ to Entities and Occam's Razor post, which I quoted in my LINQ to SQL:Entity Framework::REST:SOAP? entry, he says:

The key to most ORM toolsets adoption is the productivity benefits they bring and the clean programming model - persistence ignorance - that they support. When I look at LINQ to Entities I see the former being dragged-down by additional abstractions and in the latter case entirely absent; by contrast, LINQ to SQL hits both of these spots.

LINQ to Entities is overcomplex for many needs and its use in many scenarios defies Occam's Razor - Entities should not be multiplied beyond necessity. For simple mapping scenarios, LINQ to Entities feels bloated and I don't want to use until I have to use it. The very design goals for LINQ to Entities preclude it ever being a simple solution.

He then goes on with a plea to Microsoft to enable LINQ to SQL for databases other than SQL Server. I agree that EF and EDM are far too heavyweight for ordinary object persistence needs and that a single-file or attribute-based approach is likely to satisfy 90% of developers' needs for an object/relational mapping tool. But it won't if it's locked into SQL Server 200x. It especially surprises me that the ADO.NET team would choose EF and EDM over LINQ to SQL as the O/RM tool for their lightweight SQL Server Compact Edition.

Note: It took 13 seconds to dynamically generate a default EDM for the Northwind database in my test of Using the Entity Framework with IronPython 1.1 in Project Jasper. This is considerably better than the approximately 30 seconds it took in Sam Drucker and Shyam Pather's DEV18 - Rapidly Building Data Driven Web Pages with Dynamic ADO.NET MIX07 video (19:17 to 19:47).

You can read more about Microsoft's travels from persistence schizophrenia to persistence parsimony with LINQ to SQL here: Future LINQ to SQL Support for Multiple Databases?