Sunday, September 02, 2007

LINQ and Entity Framework Updates for 9/1/2007+

Julie Lerman with More About Entity Key Serialization Issues

Her Exploring EntityKeys, Web Services and Serialization a little further post of 9/2/2007 tests the ObjectContext.Attach(EntityWithKey) and ObjectContext.AttachTo(KeylessEntity, "EntitySetName") methods.

Note the late binding of EntitySetName; I'm underwhelmed by what appears to me to be an inauspicious trend toward late binding that's reinforced by Julie's comment that "SetModifiedProperty ... takes the property name (string) as a parameter." (Emphasis added.)

She then details the obstacles that EF places in the path to updating entity properties explicitly so they're marked modified. Not encouraging. (Update 9/3/2007.)

Julie Lerman on Errors Opening EDMX Files in the EDM Designer

Julie's Entity framework Tools- Package Load Failure when opening up EDMX in designer post has the workaround for the dreaded Package Load Failure error message that Visual Studio displays for Package 'Microsoft.Data.Entity.Design.Package.MicrosoftDataEntityDesignPackage' when you double-click an .edmx file. (Update 9/3/2007.)

Tip: Views as the Data Source for LINQ to SQL Queries Make DataGridViews Sortable

Most LINQ to SQL examples that use DataGridView control(s) to browse base and associated entity collections use LINQ queries and invoke the ToList<T> method to limit the rows returned when providing the DataSource property value to the BindingSource(s) or DataGridView(s). The generic List<T> type doesn't enable users to sort the DataGridView columns.

To enable sortable DataGridViews, drag tables to the O/R Designer as usual, then add a SELECT * view with the appropriate TOP(n) or WHERE and ORDER BY clauses to the database, open the designer's properties sheet for the view's table, and change the Source property value from dbo.TableName to dbo.ViewName. This approach uses the Table<TEntity>.GetNewBindingList() method to deliver an IBindingList type, which supports sorting, instead of a List<T> type for data binding.

Comment: It's unfortunate that the LINQ development team removed the IEnumerable<T>.ToBindingList() method from Beta 1. My complaint about this issue in the LINQ Project General forum received no response. The Entity Framework folks removed ObjectQuery<T>.ToBindingList() and EntityCollection<T>.ToBindingList() from EF Beta 2 according to Danny Simmons' August CTP [sic] of the Entity Framework released post of 8/27/2007, which applies to Beta 2.

Note: This tip doesn't apply to ASP.NET GridView/LinqDataSource combinations because the LinqDataSource provides server-side sorting capability in conjunction with its paging feature.

Julie Lerman Tests Entity Key Serialization and Persistence Ignorance

Julie's busy testing items in Danny Simmons' list of new Entity Framework features on Labor Day weekend. Her 9/1/2007 Knocking off Danny Simmons Entity Framework Beta 2 List: #3 & #4: EntityKey Serialization and new Entity interfaces post reports that EntityKey objects, which are independently serializable in Beta 2, won't serialize as a property of an entity. This bug, which is scheduled to be fixed, limits the use of Web services and WCF to transport entities that support concurrency conflict detection across process boundaries.

Dinesh Kulkarni and the Excess Queries for Eager Loading Issue

My August 17, 2007 Eager Loading Appears to Cause LINQ to SQL Entity Table Problems post reported a problem with LINQ to SQL's identity manager failing to recognize eager-loaded entities. I provided more information about the issue in Clarification of the Object Tracking Problem with LINQ to SQL's Eager Loading Feature. This problem causes repeated dynamic SQL queries for entities that are in memory already, as demonstrated by a small test harness that's available for downloading. Dinesh Kulkarni, senior program manager for LINQ to SQL, says in an 8/31/2007 Provide a DataContext.AllowDataExplosions Property for Eager Loading with Multiple 1:N Joins comment to my related suggestion:

On a related note, I want to thank you for bringing to our attention the additional queries fired for eager loaded associations when the root objects are requeried. We are working on fixing that to avoid the additional queries.

The bug report is LINQ to SQL Objects Eager Loaded with LoadOptions Aren't Recognized by Object Tracker of 8/23/2007.

Dinesh's comment indicates that there isn't much chance that my suggestion to give developers the option of eager loading multiple 1:n joins in a single query will be adopted.
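To make the tracking problem described above more concrete, here is a minimal Python sketch of an identity map: a cache keyed by entity type and primary key that should prevent a second query for an entity that's already in memory. It is an illustration only, not LINQ to SQL's implementation; the IdentityMap class, its method names, and the fetch_row callable are all hypothetical.

```python
class IdentityMap:
    """Hypothetical cache of entities keyed by (entity type, primary key)."""

    def __init__(self, fetch_row):
        # fetch_row is an assumed callable that stands in for a SQL query.
        self._fetch_row = fetch_row
        self._entities = {}
        self.queries_issued = 0

    def register(self, type_name, key, entity):
        # Record every materialized entity, including eager-loaded children.
        # If the key is already tracked, the existing instance wins.
        return self._entities.setdefault((type_name, key), entity)

    def get(self, type_name, key):
        cached = self._entities.get((type_name, key))
        if cached is not None:
            return cached  # already in memory: no round-trip
        self.queries_issued += 1
        entity = self._fetch_row(type_name, key)
        return self.register(type_name, key, entity)
```

In this sketch, an entity that arrives through eager loading and is passed to register() never triggers a later query from get(). The behavior reported above is what you'd expect to see if eager-loaded entities skipped the registration step.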

Danny Simmons and Custom Attributes for Entity Framework Classes

Danny's 8/31/2007 EF CodeGen Events for Fun and Profit (aka How to add custom attributes to my generated classes) post shows you how to add an event handler for OnTypeGenerated or OnPropertyGenerated events to an instance of the EntityClassGenerator class. You add a custom attribute to the specified class or property by invoking the TypeGeneratedEventArgs.AdditionalAttributes.Add() or PropertyGeneratedEventArgs.AdditionalAttributes.Add() method with a named CodeAttributeDeclaration as the argument value. Simple, no?

Fortunately, Danny's post includes a brief, self-contained console app to demonstrate the process. (Update 9/1/2007.)

Rico Mariani and Measuring Database Performance in Context

Rico's 8/31/2007 Database Performance, Correctness, Composition, Compromise, and Linq too post explains three "Key Factors" in obtaining the optimum combination of "solid correctness characteristics and good performance:"

  • Locality
  • Isolation
  • Unit of Work

Rico uses LINQ to SQL as the example for his "Unit of Work" (transaction) topic and concludes:

[W]hen composed with the other operations that are happening on the server you may find that making the best looking choice [between eager and lazy loading] independently results in a poor situation for the system as a whole.

Eager loading generally improves performance and lazy loading usually improves correctness at the expense of more (sometimes many more) database round-trips, as reported in my August 17, 2007 Eager Loading Appears to Cause LINQ to SQL Entity Table Problems. However, eager loading has a bug that causes repeated database queries for sets of data already in memory (see Clarification of the Object Tracking Problem with LINQ to SQL's Eager Loading Feature and the "Dinesh Kulkarni and the Excess Queries for Eager Loading Issue" section above).

Rico previously posted in late June and early July 2007 a series of five articles that compared the performance of LINQ to SQL queries with the equivalent. My links to and comments about Rico's test are here, here, and here.

Bart de Smet Adds a New VB 9.0 Language Feature

He added Relaxed Delegates on 9/2/2007 and Runtime Agility on 9/3/2007: 

I agree with Bart that "this is a HUGE feature even though it hasn't been covered that much yet in the blogosphere" and will be especially important for developers who use VB 10 (a.k.a. VBX) as a dynamic language alternative to IronPython or IronRuby. Bart's explanation is a detailed and very useful supplement to Paul Vick's original VB Runtime agility, Orcas and new platforms post of May 31, 2007.

(Updated 9/3/2007.)

2 comments:

Frans Bouma said...

I really can't understand how they messed up the eager loading system that much as they've done with linq to sql. I mean: the join system is only ok for graph paths all formed by either 1:1 or M:1 relations (== graph edges), but as soon as there's an 1:n edge or 1(pk):1(fk) edge in the graph, you can't use joins anymore: you have to use a subquery in a different, additional query.

The problem with the additional query is that you have to do the merging yourself. With the join, as you can see in the execution plan in sqlserver, the RDBMS will perform the hash creation and hash merging for you (which is the join), however with an additional query, you have to calculate PK hashes and fk hashes and merge them efficiently. This is code which isn't in Linq to Sql at the moment IMHO, otherwise they wouldn't have this problem.

It's about what the API says it will do (i.e. fetch the whole graph as efficient as possible, no matter how the graph looks like), vs. what reality says it does.

I now wonder: Do they also eager load m:n relations? (Customer - Employee in northwind for example, via order) Because you then either will have a lot of data with a join (you don't want that) or 3 queries: one for the root (e.g. customer), one for pk-pk tuples (the intermediate table data, e.g. solely the pk's of customer and employee) and the employees related to customer.
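
[Editor's illustration] The client-side merge that Frans describes for the two-query approach can be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not code from LINQ to SQL or any other O/R mapper: rows are plain dictionaries, and the merge_children function and its parameter names are hypothetical.

```python
from collections import defaultdict


def merge_children(parents, children, pk, fk, attr):
    """Attach child rows to their parents by hashing on the key columns."""
    # Build a hash table of children keyed by their foreign-key value.
    by_fk = defaultdict(list)
    for child in children:
        by_fk[child[fk]].append(child)
    # Probe the hash table once per parent; parents without children get [].
    for parent in parents:
        parent[attr] = by_fk.get(parent[pk], [])
    return parents


# Usage: rows as they might come back from two separate queries.
customers = [{"CustomerID": "ALFKI"}, {"CustomerID": "ANATR"}]
orders = [
    {"OrderID": 1, "CustomerID": "ALFKI"},
    {"OrderID": 2, "CustomerID": "ALFKI"},
]
merged = merge_children(customers, orders, "CustomerID", "CustomerID", "Orders")
```

Each list is scanned once, so the cost grows with the number of parent rows plus the number of child rows rather than with their product.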

--rj said...

Frans,

I'm in the process of doing another analysis of the queries generated and performance for 1:m:n (Customers:Orders:Order_Details) graphs, because I'm seeing blog and forum posts indicating that LINQ to SQL is 2X or more slower than Typed DataSets in databinding scenarios.

It's my understanding that LINQ to SQL doesn't support many-to-many relations directly. You end up with two 1:N associations, which requires one query per second N element.

--rj