Saturday, March 31, 2007

"Disconnected Operation" and the Entity Framework

Fabrice Marguerie alerted me this morning to a controversy about the handling of update concurrency conflicts in data-intensive, n-tier applications. Fabrice started his Change tracking, the ADO.NET Entity Framework and DataSets post with:

Andres Aguiar started an interesting discussion about disconnected operation and change tracking in the ADO.NET Entity Framework. [Emphasis added.]

Andres' ADO.NET Orcas Entity Framework and disconnected operation post says:

David [sic] Simmons explained how the 'disconnected mode' works today in Entity Framework (and as far as I know, that's the way it will work in the Orcas release).

Basically, there is no disconnected mode. You can create a context and attach objects that you kept somewhere, telling the context if it was added/deleted/modified.

This basically means that if you plan to build a multi-tier application with ADO.NET Orcas in the middle tier, you will need to hack your own change tracking mechanism in the client, send the whole changeset, and apply it in the middle tier. From this point of view, it's a huge step backwards, as that's something we already have with DataSets today. [Emphasis added.]

I had read Danny Simmons' Change Tracking Mechanisms post and added a link to it Thursday. I didn't recall any mention of disconnected operation or disconnected mode in his post.

I believe that Danny only addressed the Entity Framework's optimistic concurrency implementation for data updates in an n-tier environment, not disconnected operation.

Note: Today is Danny's 10th Anniversary as a Microsoft employee.

Defining Disconnected Operations and Occasionally Connected Systems

I view disconnected operations as an environment in which:

The client application has sporadic or unreliable access to a network to process typical CRUD data operations. Microsoft's preferred term is Occasionally Connected Systems (OCS). Typical users of OCS are sales people, construction managers, social-services caseworkers, foresters, fish and game officials, and physicians who work in the field, especially in non-urban locations.

Note: The Social Security Administration is testing a new healthcare approach that offers seriously ill patients the services of a primary care physician who makes house calls.

Architectural characteristics of the OCS scenario, in my experience, are:

  • A smart-client (WinForms) UI
  • Task-based data-entry and update forms
  • Grids for data entry used only where a single view of multiple items is required
  • Locally cached lookup (catalog or historical) data, which can be very large and thus must merge changes rather than require full-table refreshes
  • Automatic or semi-automatic push of cached data to the server (two-tier) or data service (n-tier) when connected to the network
  • A process to enable the client to resolve data update and deletion concurrency conflicts; alternatively, to notify the client of the action taken by a business rule
  • A process to enable the client to resolve insert and other conflicts with multiple rows of child tables, such as order or medication line items *
  • Higher probability of concurrency conflicts than with usually connected systems because of increased update latency
  • Increased probability of foreign key conflicts due to lookup data latency (stale data)
  • A process to test cached inserts and updates for lookup data changes.

* Note: This process isn't addressed by any built-in concurrency handling implementation that I've found, and it also influences the method of handling concurrency conflicts for updates and deletions to child tables. Disconnected or not, you must test for post-retrieval changes by other users to all dependent objects in the graph before committing updates. In many cases, business rules can't resolve these conflicts. I described the approach for DataSets in Expert One-on-One Visual Basic 2005 Database Programming, and I plan to add a blog post and write an article about the technique as it applies to other local persistence implementations.

The preceding definition assumes that, when connected, the client can rely on the data server or access services to invoke methods reliably and quickly (i.e., synchronously). Usually connected systems (UCS) that must deal with unreliable (i.e., asynchronous) data-access services are even more complex because user intervention to resolve complicated concurrency conflicts no longer is real-time. Local data caching capability ordinarily is required even for UCS to support changes to multiple objects.

Local Data Caching for OCS with DataSets

Andres observes that "[W]e already have [change tracking management] with DataSets today." Not only do we already have an optimistic concurrency implementation with DataSets, but also a local data cache to handle disconnected operations. DataSets handle change management by preserving row state as Added, Deleted, Detached, Modified, or Unchanged. Invoking the DataSet.GetChanges(Data.DataRowState) method returns a copy of a DataSet that contains rows having the specified DataRowState. The same approach applies to individual DataTables. Orcas's DataSet implementation now includes the ability to generate the DataSet code to another project in preparation for migration to n-tier SOA with the WCF Service template.
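The row-state idea described above can be sketched outside .NET. The following is a minimal Python illustration, not the DataSet implementation: RowState, TrackedRow, TrackedTable, and get_changes are hypothetical names chosen to mirror the concepts, and Python is used only to keep the sketch short.

```python
from dataclasses import dataclass, field
from enum import Enum


class RowState(Enum):
    # Hypothetical stand-in for the row states a DataSet preserves.
    UNCHANGED = "Unchanged"
    ADDED = "Added"
    MODIFIED = "Modified"
    DELETED = "Deleted"


@dataclass
class TrackedRow:
    current: dict
    original: dict = field(default_factory=dict)
    state: RowState = RowState.UNCHANGED


class TrackedTable:
    def __init__(self, rows):
        # Rows loaded from the server start out Unchanged, with a copy
        # of their original values kept for later concurrency tests.
        self.rows = [TrackedRow(dict(r), dict(r)) for r in rows]

    def add(self, values):
        self.rows.append(TrackedRow(dict(values), {}, RowState.ADDED))

    def update(self, index, **changes):
        row = self.rows[index]
        row.current.update(changes)
        # An Added row stays Added; only Unchanged rows become Modified.
        if row.state is RowState.UNCHANGED:
            row.state = RowState.MODIFIED

    def delete(self, index):
        row = self.rows[index]
        if row.state is RowState.ADDED:
            # The row never reached the server, so it is simply dropped.
            del self.rows[index]
        else:
            row.state = RowState.DELETED

    def get_changes(self, *states):
        # With no arguments, return every row that differs from the server.
        wanted = set(states) or {
            RowState.ADDED, RowState.MODIFIED, RowState.DELETED}
        return [r for r in self.rows if r.state in wanted]
```

Only the rows returned by get_changes need to travel to the middle tier, and each carries its original values, which is the property Andres is pointing to when he says DataSets already solve this.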

Note: See the "N-Tier Support in Typed Dataset" topic in The Visual Basic Team's New Data Tools Features in Visual Studio Orcas post and Steve Lasker's demo of the WCF Service template with split client and server synchronization components in this 12:52 Going N Tier with WCF, Synchronizing data using Sync Services for ADO.NET and SQL Server Compact Edition screencast.

DataSets handle the OCS scenario by persisting an Updategram to the local file system as an XML document (DataSet.WriteXml). When the client boots, it loads the Updategram into the DataSet (DataSet.ReadXml), and tests for network connectivity. If the network is alive, the client attempts to refresh lookup data and then process all saved CRUD operations against the data server or service. If not, the client continues with additions to the DataSet, which the user saves manually and the app saves periodically or when closing the main form. I described the process with ADO.NET 1.0 DataSets and SQL Server 2000 in this early "Optimize Update-Concurrency Tests" article for Visual Studio magazine.
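The save, reload, and replay cycle described above can be sketched roughly as follows. This is an illustration only, in Python with a JSON file standing in for the XML Updategram; save_pending, load_pending, start_up, and the is_connected and send callbacks are hypothetical names, not part of any Microsoft API.

```python
import json
from pathlib import Path


def save_pending(path, pending):
    # Persist the queued changes so they survive closing the app.
    Path(path).write_text(json.dumps(pending, indent=2), encoding="utf-8")


def load_pending(path):
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text(encoding="utf-8"))


def start_up(path, is_connected, send):
    """Reload queued changes and replay them if the network is available.

    is_connected() and send(change) are supplied by the caller; send
    returns True when the server accepted the change.
    """
    pending = load_pending(path)
    if not is_connected():
        # Stay offline: keep everything queued for the next attempt.
        return pending
    remaining = []
    for change in pending:
        if not send(change):
            # Rejected changes (for example, concurrency conflicts) stay
            # queued so the user can resolve them later.
            remaining.append(change)
    save_pending(path, remaining)
    return remaining
```

A real client would also refresh its lookup data before replaying, because a stale lookup row can turn a valid cached insert into a foreign key conflict.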

This is a scenario that's vastly different from editing an order or medical record in a browser-based form, clicking the Update button, and dealing with the occasional concurrency conflicts caused by other users' edits in the few seconds or minutes between data retrieval and sending updates.

Andres' second post, RE: Disconnected Problems and Solutions, responds to Udi Dahan's Entity Framework: Disconnected Problems & Solutions post. Udi says "I don’t use DataSets that much today anyway." Click here and here for some of Udi's opinions on DataSets.

Most .NET developers aren't partial to DataSets, typed or untyped -- LINQ for DataSet notwithstanding. For example, I've found that the new DataSet.UpdateAll(DataSet) shortcut method has quite poor performance compared to conventional DataSet.GetChanges(Data.DataRowState) code in the Orcas March 2007 CTP.

Update 4/1/2007: Matt Warren sets me straight in his comment on usage of the term disconnected in the LINQ to Entities context. Matt says:

Disconnected objects in LINQ to Entities are not meant to solve the disconnected application problem either. They are merely referred to as 'disconnected' as a means of distinguishing them from actively tracked objects.

Dinesh Kulkarni applies a different definition of disconnected in the LINQ to SQL context in his September 2005 Connected, Disconnected and DLinq post:

Since DLinq is a part of the next version of ADO.NET, it is natural to ask - is it connected or disconnected? After all, we have talked about connected vs disconnected components in ADO.NET quite a bit. DataReader is connected (you are using the underlying connection while consuming the data) while DataSet is disconnected. You need to use DataAdapter to bridge the two worlds. All nice and explicit.

Quite often you need to combine the two modes (as developers do with the DataAdapter + DataSet combo). Wouldn't it be nice if the data access library knew how to provide the benefits of disconnected mode while connecting as and when needed? You would not have the old ADO problems of scalability that ADO.NET solved and yet you would not have to wire all the components explicitly and do all the plumbing yourself. Well DLinq does exactly that.

In this case, disconnected means the database connection is closed while changes are made to the cached dataset but reopened in real time when updating the data store.

Update 4/2/2007: I recalled discussions in mid-2006, when the first Entity Framework white papers re-appeared, about . For example, "The ADO.NET Entity Framework: Making the Conceptual Level Real," a revised version of a presentation to the 25th International Conference on Conceptual Modeling, Tucson, AZ, USA, November 6-9, 2006, by José A. Blakeley, David Campbell, S. Muralidhar, and Anil Nori of Microsoft, contains this paragraph:

Occasionally Connected Components. The Entity Framework enhances the well established disconnected programming model of the ADO.NET DataSet. In addition to enhancing the programming experiences around the typed and un-typed DataSets, the Entity Framework embraces the EDM to provide rich disconnected experiences around cached collections of entities and entitysets.

I've found no evidence of support for the client-side persistence of the ObjectStateManager's entity state and updated and original member values in the Orcas March 2007 version of the EF.

Revised Conclusion

In my opinion, neither Andres nor Udi deals with totally disconnected operations or disconnected problems such as OCS. The issue they're addressing is handling potential short-term concurrency conflicts in fully connected systems that have untracked objects.

The Shape of Things to Come

Most folks appear to be writing serial expositions on the same or related LINQ or EF topics, so consider this post to be the first member of the "Concurrency Quartet," with apologies to Lawrence Durrell.

Part II, Change Tracking in the Entity Framework and LINQ to SQL, offers my views about the change tracking and concurrency conflict issues Andres and Udi discuss. It also takes a look at the differences between EF's and LINQ to SQL's approaches to change tracking and concurrency management in the Orcas March 2007 CTP.

Part III, Local Data Caching for OCS with SQL Server Compact Edition, relates my experiences with substituting SSCE for DataSet XML Updategram files as a local data cache.

Part IV will cover LINQ to SQL issues with SSCE in the Orcas March 2007 CTP and other trivia.


Thursday, March 29, 2007

Entity Framework Updates

Julie Lerman reports in her March 28, 2007 What to expect in next (and future) Orcas bits for Entity Framework post on Brian Dawson's two Entity Framework presentations in Orlando at DevConnections for the SQL Server track:

SMS309: Entity Framework: Part I – Code Past the Depths of ORM Let’s map objects to the database and find out what else awaits with the next generation of the Entity Framework. The Entity Framework provides more than just another ORM solution, although the Entity Framework does that pretty well. Come join a session which dives into lots of code for the sole purpose of making data access even easier. Also see how objects can become web services, and how inheritance works using the Entity Data Model (EDM).

SMS310: Entity Framework: Part II – Dive Even Deeper into Depths of the Entity Framework See how to program against the value layer and go through a more in depth look at navigating through the Entity Framework; see what CSDL, MSL, SSDL means and how LINQ works the new model. Also as a special addition, learn the business value or an answer to the question, “Why should I use the Entity Framework”, not just for programmers, but also for managers.

Brian is the author of the recent Object Services – Write less code post for the ADO.NET Team blog.


Danny Simmons has posted Change Tracking Mechanisms for the Entity Framework, which covers alternative approaches to change tracking. Change tracking is based on values designated as "concurrency tokens."
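A concurrency token can be illustrated with a small sketch. The Python below is an assumption-laden model, not the Entity Framework's mechanism: Store, ConcurrencyConflict, and the integer version field are hypothetical, and a real token could equally be a timestamp column or a set of original member values.

```python
class ConcurrencyConflict(Exception):
    """Raised when the stored token no longer matches the one read earlier."""


class Store:
    def __init__(self):
        self._rows = {}

    def insert(self, key, values):
        self._rows[key] = {"values": dict(values), "version": 1}

    def read(self, key):
        # Hand back a copy of the values plus the token seen at read time.
        row = self._rows[key]
        return dict(row["values"]), row["version"]

    def update(self, key, new_values, original_version):
        row = self._rows[key]
        # Optimistic test: succeed only if nobody changed the row since
        # the caller read it.
        if row["version"] != original_version:
            raise ConcurrencyConflict(
                f"{key}: expected version {original_version}, "
                f"found {row['version']}")
        row["values"] = dict(new_values)
        row["version"] += 1
        return row["version"]
```

If two users read the same row and both submit changes, the first update advances the token and the second raises ConcurrencyConflict, leaving the client or a business rule to decide what happens next.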

He also mentioned in A delay before I write more about persistence ignorance... that it might be a while before he responds to the issue of Persistence Ignorance in the Entity Framework.

Danny warns in my EDM Designer ETA Slipping Past Orcas RTM? thread in the ADO.NET Orcas forum not to use the Orcas March 2007 CTP as the guide to what's coming in the final product:

While it's certainly true that we are locking down, and I can't speak to this particular feature, folks should NOT use the March CTP as their guide for what's going to be in Orcas. The ADO.Net team has been working frantically to get a number of key features completed and in the product, and we already have a large list working that will show up in a future CTP or Beta (not Beta 1 because that's very close to what's in the March CTP, but after that). These features include things like referential integrity constraints, span, ipoco and others that you may have seen mentioned here or in blog postings. There's really quite a lot of great new things yet to come. Keep your eyes open.

Julie describes many of these new features, such as span and IPOCO (Interface to Plain Old CLR Objects) in her What to expect in next (and future) Orcas bits for Entity Framework.

Mark your calendars! .NET Rocks will feature "Daniel Simmons Tours the ADO.NET Entity Framework!" as the 4/5/2007 broadcast.


Julie's mention of the span parameter brought back memories of ObjectSpaces' OPath query language and Matt Warren's ten 2004 and one 2005 GotDotNet posts on the evolution and demise of that unfortunate project. The trade-off between pre-fetching and demand-loading (eager versus lazy loading) of associated (related) entities has been a continuing issue for projects that query object graphs. Matt says in his ObjectSpaces: Spanning the Matrix post of three years ago:

That’s when I hit upon the solution. Like the dot, it was the graph stupid. Object queries needed that additional bit of information that would allow the user to specify exactly what reachable parts of the network should be pre-fetched together. So I took the OPath parser and added an additional context that would allow the specification of a list of properties, and sub-properties and so on that would form a tree of access paths. Anything touched along these paths would trigger a pre-fetch of that data. With a simple list of dot expressions you could easily specify what part of the matrix you wanted to span. ...

Luca [Bolognese] was a bit dubious of the idea at first, but I wore him down. Ever since then, ObjectSpaces has had the span parameter, and yes the name derives from my own ultra-nerdiness. It refers to the span of a space as used in linear algebra. Crack a book if you don’t believe me.
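Matt's description of a span (a list of dot expressions that forms a tree of access paths to pre-fetch) lends itself to a short sketch. The Python below only illustrates that idea and is not OPath or ObjectSpaces code: build_span_tree, load_with_span, and the fetch_related callback are hypothetical names, and entities are modeled as plain dictionaries.

```python
def build_span_tree(paths):
    """Turn dot expressions such as 'Orders.Lines' into a nested dict."""
    tree = {}
    for path in paths:
        node = tree
        for name in path.split("."):
            # Shared prefixes ('Orders.Lines', 'Orders.Shipper') share a node.
            node = node.setdefault(name, {})
    return tree


def load_with_span(entity, tree, fetch_related):
    """Eagerly attach every relationship named in the span tree.

    fetch_related(entity, name) is supplied by the caller and returns a
    list of related entities. Relationships not named in the tree are
    left unloaded.
    """
    for name, subtree in tree.items():
        related = fetch_related(entity, name)
        entity[name] = related
        for child in related:
            load_with_span(child, subtree, fetch_related)
    return entity
```

For example, build_span_tree(["Orders.Lines", "Address"]) yields one branch for Orders with Lines beneath it and a second branch for Address, so a single request states exactly which reachable parts of the graph should arrive together.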

It seems like only yesterday that I was doing my best to learn the ins and outs of programming Orca (Object/Relational Component Architecture?), the code name for the first technical preview of Object Spaces introduced at PDC 2001. ObjectSpaces was Microsoft's first (failed) attempt to deliver an O/R mapper; the project ultimately lost its way after being subsumed by long-gone WinFS.


Wednesday, March 28, 2007

Beth Massi Joins Microsoft to Increase MSDN's VB Content

Visual Basic-oriented developers—including me—lament the preponderance of C#-only documentation and code samples in MSDN publications and MSDN blogs. As an example, the C# team has Charlie Calvert whose LINQ Farm and other LINQ-related posts are cited here because of the paucity of corresponding VB articles and posts. Hopefully the tide will turn shortly.

Beth Massi, a well-known East Bay solutions architect, developer and VB proponent (a.k.a. DotnetFox) has joined Microsoft, tasked with "writing content for the Visual Basic Developer Center and promoting the Visual Basic language in the community."

In response to my "Does this mean you're the VB counterpart to C#'s Charlie Calvert?" question, Beth replied:

Yep, that's right. I'll be focusing on getting some killer content up onto the Visual Basic MSDN Developer Center. Keep an eye out!

Already the VB Team blog has a new and detailed Partial Methods post by , which defines them as "a light-weight replacement for events designed primarily for use by automatic code generators."

Update 5/25/2007: Mike Taulty links to Wes Dyer's May 23, 2007 In Case You Haven't Heard post about partial methods in C#. VB developers heard about partial methods two months ago.

It's great to see Microsoft expending more resources on VB topics that appeal to a wider audience than beginning and hobbyist programmers. Real developers do write Visual Basic.

Note: Although the article mentions use of partial methods in the "DLINQ Designer" (called the O/R Mapper in Orcas), the classes I've created with the latest version of the O/R Mapper don't appear to use them. Instead, the classes fire ClassName_PropertyChanging and ClassName_PropertyChanged events:

Protected Sub OnPropertyChanging(ByVal propertyName As String)
    If (Not (Me.PropertyChangingEvent) Is Nothing) Then
        RaiseEvent PropertyChanging(Me, New _
            Global.System.ComponentModel. _
            PropertyChangedEventArgs(propertyName))
    End If
End Sub

Protected Sub OnPropertyChanged(ByVal propertyName As String)
    If (Not (Me.PropertyChangedEvent) Is Nothing) Then
        RaiseEvent PropertyChanged(Me, New _
            Global.System.ComponentModel. _
            PropertyChangedEventArgs(propertyName))
    End If
End Sub


Orcas EDM Wizard and Designer Previewed at VSLive! SFO

Last night the ADO.NET Team posted EDM Wizard and Designer Featured in VSLive! San Francisco Keynote, which briefly describes Britt Johnson's "Data Explosion: The Last and Next Decade in Data Management with the Microsoft Data Platform" keynote. Near the end of the presentation he demonstrated the Entity Data Model (EDM) Wizard and EDM Designer, neither of which made the Orcas March 2007 CTP cut, with a pair of screencasts.

See the 3/29/2007 and 5/21/2007 updates below.

A careful reading of this paragraph from the post:

Britt also featured 5 short videos or screencasts that demonstrated some of the work, specifically around Tools, that the Data Programmability team (including the ADO.NET) has been doing. These videos provide some great information and a preview of the new EDM Wizard coming in Orcas, as well as a sneak preview of a new EDM Designer that we can expect to see released after the upcoming Orcas release [Emphasis added].

indicates that the Orcas RTM bits won't include the EDM Designer. Frans Bouma and I have added questions about this to the post's comments. I also asked the same question in the ADO.NET Orcas forum.

A similar post in the Data Programmability Team blog has a slightly expanded list of new features and screencasts:

These videos provide some great information and a preview of the new XML Editor, XSLT Debugger, and EDM Wizard coming in Orcas, as well as a sneak preview of a new XSD Designer and EDM Designer that we can expect to see released after the upcoming Orcas release.

Elisa Johnson's VSLive Keynote - San Francisco post of the same date says:

During his presentation Britt talked a lot about Conceptual Data Programming and where Microsoft plans to focus on for future innovation in the Data Access space. He also gave a sneak peak at two tools that hadn't previously been seen... the much anticipated EDM Designer and the XSD Designer (you can expect to see more on these sometime after the upcoming Orcas release).

A working preview of the EDM Designer debuted in September 2006; my October 4, 2006 New Entity Data Model Graphic Designer Prototype post provided a walkthrough with the Northwind database.

Sanjay Nagamangalam, the ADO.NET Program Manager who presented the EDM Designer screencast, posted on March 3, 2007 the following message in the ADO.NET Orcas forum:

An EDM designer is foremost on our minds and we are looking to provide a tool for Beta 1. We don’t have a date yet though.

Of course, the message was accompanied by the usual "This posting is provided 'AS IS' with no warranties, and confers no rights" disclaimer.

Shades of ObjectSpaces. Déjà vu all over again. Please say it isn't so.

Update 3/28/2007: Visual Studio Magazine writer Lee Thé writes in his "Microsoft Moves DBMS into the VS Developer Mainstream" article that covered Britt Johnson's keynote:

The final demonstration featured an entity data models (EDM) Wizard that generates classes from the conceptual model. But the portion of this demo that produced spontaneous applause from the audience was an EDM designer functionality that will not be available in the Orcas beta release. This hotly anticipated functionality is a database designer and an entity modeler. These features let you map a database to a model, keeping the model in sync. The project manager leading this demonstration created a user entity type from an users' table with drag-and-drop functionality, where glyphs showed mapping of entity type to table. Sophisticated graphical representations of entities and relationships enabled the user to work at the level of the business relationships rather than programming abstractions. [Emphasis added]

It remains to be seen if Lee's report is correct. Stay tuned for Microsoft's official confirmation or denial of the EDM Designer's disappearance from Orcas.

Note: Lee's report on VSLive!'s Monday keynote, "Windows Vista, the 2007 Office system, and ASP.NET AJAX" by Prashant Sridharan is here, and his coverage of K. D. Hallman's general session address, "Visual Studio Everywhere: Tools for Office, Office Business Applications, and Custom Application Extensibility," for Redmond Developer News is here. Click here to read the March 28, 2007 VSLive! Show Daily.

InfoWorld's Paul Krill mentions the EDM Designer in his March 27, 2007 "Microsoft maps data management plans" article without discussing the Designer's future availability. He notes that the EDM Wizard, XSD Designer, and XSLT Debugger are slated for the Orcas Beta release in May. Elisa Johnson says that the XSD Designer will arrive post-Orcas.

Kevin Hoffman complains bitterly in his Microsoft finally shows off their EDM designer... but it won't ship with Orcas?!? post about the apparent demise of the EDM Designer.

Update 5/21/2007: The 1:04:00 video of Britt Johnson's keynote is available here. Click here for Prashant Sridharan's keynote video.

Update 3/29/2007: Julie Lerman, 2,500 miles from San Francisco at the DevConnections conference in Florida, reports in What to expect in next (and future) Orcas bits for Entity Framework:

Britt Johnston did a keynote and showed [a video of] the latest prototype of the EDM Modeler and also let us know that it won't be ready for Orcas but they plan to release it shortly after Orcas. This is really frustrating, but it is just the reality and as developers we know the difficulties of designing tools... so it is what it is and until we have it, I will learn a LOT with the XML and personally hold off on doing any seriously complex modeling. [Emphasis added.]

Pending official word from Microsoft, I'm inclined to believe Julie's rendition, although telepathy might be involved. I'm still waiting for someone from the ADO.NET team to post an official response, either as a blog comment or an answer to my question in the ADO.NET Orcas forum.

Note: There's more from Julie's report on Brian Dawson's two Entity Framework presentations at DevCon in her post.

Tuesday, March 27, 2007

Third-Party LINQ Providers

Following is a short list of the third-party LINQ providers I've found to date in more-or-less chronological order:

  • LINQ to WebQueries by Hartmut Maennel handles searches in the SiteSeer and MSDN Web sites. (This provider predates Fabrice's LINQ to Amazon provider by a few days.)
  • LINQ to Amazon by Fabrice Marguerie, a co-author of the forthcoming LINQ in Action book, was the first third-party LINQ provider that I know of. LINQ to Amazon returns lists of books meeting specific criteria.
  • LINQ to RDF Files by Hartmut Maennel handles queries against Resource Description Framework files' triples. Part I of the two-part post is here.
  • LINQ to MySQL by George Moudry, based on the LINQ May 2006 CTP, was in the development stage as of January 2007, but George says it's "capable of simplest queries and updates" and "now has support for most primitive joins."
  • LINQ to NHibernate by Ayende Rahien (a.k.a. Oren Eini) translates LINQ queries to NHibernate Criteria Queries and is based on the Orcas March 2007 CTP. The documentation that describes development of the provider presently is at the Part 1 stage.
  • LINQ to LDAP by Bart de Smet is a "query provider for LINQ that's capable of talking to Active Directory (and other LDAP data sources potentially) over LDAP." As of 4/11/2007, Bart's "IQueryable Tales - LINQ to LDAP" consisted of Part 0: Introduction, Part 1: Key Concepts, Part 2: Getting Started with IQueryable, Part 3: Why do we need entities?, Part 4: Parsing and executing queries, and Part 5: Supporting Updates.
  • LINQ to Flickr by Mohammed Hossam El-Din (Bashmohandes) uses the open-source FlickrNet C# library as its infrastructure.
  • LINQ to Google Desktop by Costa Rican programming language enthusiast Luis Diego Fallas supports GDFileResult and GDEmail types. A subsequent Adding support for projections to Linq to Google Desktop implements the LINQ Select expression.
  • LINQ to SharePoint by Bart de Smet supports writing LINQ queries for SharePoint lists in both C# 3.0 and Visual Basic 9.0 and communicates with SharePoint via Web services or through the SharePoint Object Model. The SpMetal command-line utility automates C# or VB class generation.
  • LINQ to Streams (SLinq, Streaming LINQ) by Oren Novotny processes continuous data streams, such as stock tickers or sensor data. The project's home page on CodePlex includes an animated GIF simulation of a stock ticker displayed in a DataGridView. The current version supports Select, Where, Order By, and Descending only.
  • LINQ to Expressions (MetaLinq) by Aaron Erickson (the developer of Indexes for Objects a.k.a i4o*) lets you query over and edit expression trees with LINQ. Like .NET strings, LINQ expression trees are immutable; the only way you can change a LINQ expression tree is to make a copy, modify the copy, and then replace the original. MetaLinq's ExpressionBuilder lets you create an "Editable Shadow of an expression tree, modify it in place, and then by calling ToExpression on the shadow tree, generate a new, normal, immutable tree." ExpressionBuilder is an analog of the .NET StringBuilder.

* i4o isn't a LINQ provider, per se, but a helper class that can increase the speed of LINQ queries against large collections by a factor of 1,000 or more. InfoQ published on June 22, 2007, Aaron Erickson on LINQ and i4o, an interview of Aaron Erickson by Jonathan Allen about i4o's purpose and background.
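The "editable shadow" idea from the MetaLinq entry above can be sketched in a few lines. This Python model is an illustration under stated assumptions, not MetaLinq's API: Expr is a hypothetical immutable node type, and ExprBuilder and to_expr are invented names that play the roles the post assigns to ExpressionBuilder and ToExpression.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expr:
    # Immutable tree node: assigning to op or args raises an error.
    op: str
    args: Tuple = ()


class ExprBuilder:
    """Mutable shadow of an immutable Expr tree."""

    def __init__(self, expr):
        self.op = expr.op
        # Child expressions get their own shadows; leaf values are kept.
        self.args = [ExprBuilder(a) if isinstance(a, Expr) else a
                     for a in expr.args]

    def to_expr(self):
        # Produce a brand-new immutable tree; the original is untouched.
        return Expr(self.op, tuple(
            a.to_expr() if isinstance(a, ExprBuilder) else a
            for a in self.args))
```

The pattern is the same one StringBuilder applies to strings: edit a mutable copy in place, then emit a fresh immutable value when the edits are finished.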

Ayende observes:

There is an appalling lack of documentation about how to implement LINQ providers. ... I decided to document what I found out while building LINQ for NHibernate.

However, the "appalling lack of documentation" hasn't thwarted the work of third-party LINQ providers for specialty data domains. A search of the CodePlex site on LINQ returned 15 projects as of 6/3/2007.

If you know of other third-party LINQ providers in development, please leave a comment.

Thanks in advance.

Update 3/29/2007: .NET Rocks features "Oren Eini On NHibernate and RhinoMocks!" as its 3/29/2007 broadcast. LINQ to NHibernate might become a much more popular LINQ provider if the ADO.NET Team doesn't finish the EDM Designer by Orcas RTM. The interview with Oren starts at 11:00. (Thanks to Danny Simmons for the heads-up.) .NET Rocks TV (dnrTV) taped an instructional video interview with Oren on January 25, 2007.

Update 4/7/2007: Added Bart de Smet's LINQ to LDAP project, which includes extensive IL code inspection of LINQ queries.

Bobby Diaz has extended Ayende's initial Part 1 From ... In ... Where ... Select implementation with Part 2: Ordering and Paging (adds Order By, Then By, OrderByDescending, ThenByDescending, Take, Skip, and Distinct) and Part 3: Aggregate and Element Operators (adds First, FirstOrDefault, Single, SingleOrDefault, Average, Count, LongCount, Max, Min, and Sum). The momentum is building behind LINQ to NHibernate.

Update 4/11/2007: Added Mohammed Hossam El-Din's LINQ to Flickr provider and updated LINQ to LDAP with parts 4 and 5.

Update 5/6/2007: Added Bart de Smet's LINQ to SharePoint project, version 0.1.2.0 Alpha release.

Update 5/12/2007: Added Hartmut Maennel's early LINQ to WebSearch and LINQ to RDF Files providers. I haven't tested these providers with later LINQ implementations.

Update 6/3/2007: Added Oren Novotny's LINQ to Streams (SLinq) and Aaron Erickson's LINQ to Expressions (MetaLinq) and Indexes for Objects (i4o).

Update 6/11/2007: Added Luis Diego Fallas' LINQ to Google Desktop of May 11 and 12, 2007, which I had missed.

Update 7/22/2007: Added link to interview with Aaron Erickson about i4o.

Update 8/7/2007: Aaron Erickson has written "Indexed LINQ" for .NET Developer's Journal. Subtitled "Optimizing the performance of LINQ queries using in-memory indexes," the article covers the theory behind creating indexes on in-memory collections and applying an extension method to the Where() standard query operator to enable speeding queries by up to a factor of 100 or so with the indexes. You can download the source code, runtime binary or both for i4o at http://www.codeplex.com/i4o.
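The general idea of an in-memory index behind an equality filter can be sketched as follows. This is a Python illustration of the technique, not i4o's code or API: IndexedCollection, where_equals, and scan_equals are hypothetical names, and no particular speed-up is claimed for the sketch.

```python
from collections import defaultdict


class IndexedCollection:
    def __init__(self, items, key):
        self._items = list(items)
        self._key = key
        # Build the index once: key value -> items having that value.
        self._index = defaultdict(list)
        for item in self._items:
            self._index[key(item)].append(item)

    def where_equals(self, value):
        # Indexed lookup: one dictionary probe instead of a full scan.
        return list(self._index.get(value, []))

    def scan_equals(self, value):
        # Unindexed equivalent, shown for comparison: tests every item.
        return [i for i in self._items if self._key(i) == value]
```

Both methods return the same items in the same order. The indexed version pays the cost of building the dictionary up front, which is worthwhile only when the collection is queried repeatedly on the same key.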

Angel Saenz-Badillos on LINQ Extensions

A search on "does not contain a definition" AsEnumerable turned up a February 23, 2007 does not contain a definition for post by Angel Saenz-Badillos, a "voice from the past." During the Whidbey beta, Angelsb had been an indispensable source of information on ADO.NET esoterica, such as Multiple Active Result Sets (MARS), connection/session pooling, asynchronous command execution, transactions, and SQLCLR user-defined types.

He and Pablo Castro saved me many hours of headscratching while writing books and magazine articles before Visual Studio 2005's RTM. Angelsb's posts and responses to forum questions have always been marvels of lucidity and completeness, despite their exclusive use of C#.

During his year or more of what I call exile in the Lotusland of JDBC, I had turned off Angelsb's RSS feed. Now I find he's working on LINQ extensions, specifically LINQ to DataSet and LINQ to Entities.

His February 23 post explains how to overcome in LINQ to DataSet code the

'System.Data.DataTable' does not contain a definition for 'AsEnumerable' and no extension method 'AsEnumerable' accepting a first argument of type 'System.Data.DataTable' could be found (are you missing a using directive or an assembly reference?)

and

The type arguments for method 'System.Linq.Enumerable.AsEnumerable<TSource>(System.Collections.Generic.IEnumerable<TSource>)' cannot be inferred from the usage. Try specifying the type arguments explicitly

compiler errors by adding using/Imports System.Linq to your class and a reference to System.Data.Entity.dll to your project.

I'm looking forward to las joyas de Saenz-Badillos and, needless to say, have added his feed to my growing list of LINQ-related blogs. Recepción a bordo.

P.S.: Apparently, Julie Lerman was having the same problem as I. Her March 5, 2007 LINQ and Entity Framework Resources for March Orcas CTP post was the second hit of my search.

Monday, March 26, 2007

Updated "Overview of Visual Basic 9.0" Stealth Post

Microsoft recently published a February 2007 update to "Overview of Visual Basic 9.0" by Erik Meijer, Amanda Silver and Paul Vick as a Visual Studio 2005 Technical Article. The update's timing obviously is tied to the Orcas March 2007 CTP release, which occurred in late February. The first reference I've seen to it was by Beth Massi (a.k.a. DotnetFox) on February 9, 2007 in the Hooked on LINQ wiki, although it's possible that the reference could be to an earlier version.

I use the term stealth post because none of the authors or the Visual Basic Team has mentioned the updated version publicly, as far as I can determine. Following are the results of searches I ran to find references to the February 2007 version:

Note: Jim and I discovered the February 2007 version from the link in Bill McCarthy's blog of yesterday.

Updated documentation for the language extensions to Visual Basic 9.0 is important, at least to VB programmers, although the Orcas March 2007 CTP includes VB-specific topics for anonymous types, object identifiers, and query expressions, as well as a substantial number of LINQ-related online help topics.

The lack of links to the Overview's February 2007 version is surprising, considering that one of its authors, Paul Vick, is an active blogger and Amanda Silver posts to the Visual Basic Team blog occasionally. Could the reason for the lack of links be that the authors aren't proud of their work?

Issues with the February 2007 Version

Jim Wooley takes issue with some of the content of the February 2007 update in today's VB 9.0 documentation post:

I took a couple minutes looking it over and it does give a quick glimpse of the basic underlying concepts that hopefully are coming. Unfortunately, there are a number of items in the documentation that don't appear to be included in the current [Orcas March 2007] CTP. I am definitely hoping that they do make it in the next drop. The features that are discussed but not yet included are: Joins, Lambdas, and [shorthand DateTime? syntax for] Nullable types. In addition, the samples seem to use an Auto-Implemented Property syntax as introduced in the current C#, but reading more closely, they are just using a pseudo code syntax. [Emphasis added].

Jim then adds details about the three missing features.

Comparison with and of Previous Versions

The February 2007 version contains the following topic list:

  • Introduction
  • Getting Started With Visual Basic 9.0
  • Implicitly Typed Local Variables
  • Object and Array Initializers
  • Anonymous Types
  • Deep XML Support
  • Query Comprehensions
  • Extension Methods and Lambda Expressions
  • Nullable Types
  • Relaxed Delegates
  • Conclusion

The previous version was first published as a Microsoft Word file by the same authors in September 2005. The LINQ May 2006 CTP contained an updated May 2006 version of the September 2005 release. The topic lists were the same for both versions:

  • [Getting Started with VB 9.0]
  • Implicitly typed local variables
  • Query comprehensions
  • Object initializers
  • Anonymous types
  • Full integration with the Linq [sic] framework
  • Deep XML support
  • Relaxed delegates
  • Nullable types
  • Dynamic interfaces
  • Dynamic identifiers
  • [Conclusion]

The primary differences between the two previous versions, determined by running a compare operation between the two .doc files with Microsoft Word, were minor syntax changes and adoption of the C# syntax/sequence (From ... Select) for LINQ queries. There were 127 deletions and 117 insertions, most of which were one or a few characters.

Jim Wooley's Converting from VB LINQ to Orcas post documents the manual changes required to make his LINQ May 2006 samples run with the Orcas March 2007 CTP LINQ implementation.

Update 3/28/2007: Beth Massi, who's taken a new job at Microsoft "writing content for the Visual Basic Developer Center and promoting the Visual Basic language in the community," notes in a comment that the What’s New in Visual Basic 9.0? link on the main Visual Basic Developer Center page links to the updated post. However, it's my recollection that this link formerly pointed to earlier version(s) and there's no indication that the information was recently updated. Beth says there will be a pointer to the doc in a future VB Team post.

Friday, March 23, 2007

A Sync Services Bidirectional Test Harness

The Microsoft Synchronization Services (Sync Services) 1.0 API (runtime) requires a substantial amount of developer-authored code to define the basic elements required to perform bidirectional synchronization between an SQL Server 2005 Compact Edition (SSCE) client and an SQL Server 2005 [Express] or other RDBMS server. The VB code (without empty lines or comments) to implement bidirectional synchronization for a pair of simple tables is about 95 lines if you take advantage of the runtime's CommandBuilder or about 165 lines if you don't.

Beta 1 Update May 15, 2007: The test harness was built and tested with the Orcas March 2007 CTP. Upgrading the project to Beta 1 exposed some new Sync Designer and SSCE v3.5 problems that prevent its operation. There is no effective workaround available at this time. (See my Sync Designer/SSCE Version Problems with Orcas Beta 1 post in The Microsoft Synchronization Services for ADO.NET forum.) The Sync Services team says that the Sync Designer problems are "fixed in a later build" and "most changes to runtime and designer are coming in beta 2.0." The SSCE team hasn't replied regarding fixes for compatibility problems with SQL Server Management Studio and Server Explorer, which another forum participant has experienced. I'll update this post when an upgraded test harness becomes available for download from the Visual Studio Magazine site.

Background

The Sync Services Designer that debuts in the Orcas March 2007 CTP greatly reduces the effort required to get a simple unidirectional service up and running. Completing a couple of simple forms generates a DatabaseName.sync XML document and a VB or C# class file with all but two lines of the code required to produce a simple one-way, download-only synchronization project. The price you pay for automating the service design process is the requirement to use SQL Server 2005 [Express] (or SQL Server/MSDE 2000) as the server RDBMS. The Sync Services 1.0 runtime itself is server-agnostic, as demonstrated by Rafik Robeal's use of Oracle 10g Express in his Demo V: Offline Application - Oracle Backend C# project.

All of Rafik's Sync Services runtime demos use simple orders and order_details tables with random primary key values. The orders table has order_id and order_date columns and order_details has order_id, order_details_id, product (name), and quantity columns. Both tables use order_id (int) as the sole primary key column, which has PK_orders or PK_order_details (PK, Unique, Non-clustered) and UQ__orders__##### or UQ__order_details__##### (Unique, Non-clustered) indexes. This selection of keys prevents establishing a one-to-many relationship between the orders and order_details tables, so there are no foreign key fields.

Update 3/24/2007: Rafik added a Deep in Sync: Handling PK-FK Constraints post to The Synchronizer blog yesterday. This post explains Rafik's reason for not including PK/FK relationships and describes the workings of, and settings for, these relationships in detail. Sync Services treats the first table added to the client-side SyncTables and server-side SyncAdapters collections as the parent table. Adding related SyncTables to the SyncGroup that's attached to the SyncAgent assures that Sync Services processes the table changes as a unit.

A Sync Services Bidirectional Test Harness (Work in Progress)

Here's a preview of my Sync Services test harness, which (as usual) uses the Northwind Orders and Order_Details tables to emulate pseudo "real world" order and line items data. There's a one-to-many relationship (FK_Order_Details_Orders) between the identity primary key (OrderID) of the Orders table and the composite primary key (OrderID, ProductID) of the Order_Details table in both the client and server databases. One of the purposes of the test harness is to determine whether it's practical to use Sync Services to replace merge replication for master/child tables. (This question remains unanswered at present, but the ability to replace Remote Data Access (RDA) seems assured.)

The test harness project is in development at present; the downloadable sample code will accompany an article for Visual Studio Magazine's May 2007 issue.

Figure 1 - SQL Server 2005 Compact Edition Client Cache Page

The Client (above) and Server (below) pages enable selecting automated UPDATE, INSERT, or DELETE operations and provide rapid comparison of the latest additions to the client and server tables. You type the number of Orders and Order Details records in the text boxes and then click the Random Insert/Update/Delete button to apply the changes to the SSCE tables. Updates randomly alter the EmployeeID, OrderDate, RequiredDate, ShippedDate, ShipVia, and Freight field values of the Orders table and the ProductID, Quantity, UnitPrice, and Discount values of the Order_Details table. Inserts add a random selection of a CustomerID value from the Customers table.

Clicking Synchronize starts the synchronization process. You can select from three methods of handling synchronization data conflicts on the client page.

Clicking the Add FK Constraints button adds a DataRelation between the Orders and Order_Details tables to the database and the NorthwindDataSet. (By design, Sync Services doesn't add DataRelation(s) during the database and table creation process, and a problem with the DataSet Designer prevents persisting changes.) Code adds or updates the LastEditDate value in the Orders table (not shown) and the Order_Details table.
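As a rough illustration of the DataSet half of that step, the following sketch adds a DataRelation at runtime with the standard ADO.NET DataRelationCollection.Add overload that takes a name, a parent column, and a child column. It is not the test harness code: it uses an untyped DataSet with only the key columns, and the module and function names are hypothetical.

```vb
Option Strict On

Imports System
Imports System.Data

Module DataRelationSketch
    ' Builds an untyped DataSet holding only the key columns of the two tables.
    Function BuildDataSet() As DataSet
        Dim ds As New DataSet("NorthwindSketch")

        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("OrderID", GetType(Integer))
        orders.PrimaryKey = New DataColumn() {orders.Columns("OrderID")}

        Dim details As DataTable = ds.Tables.Add("Order_Details")
        details.Columns.Add("OrderID", GetType(Integer))
        details.Columns.Add("ProductID", GetType(Integer))
        details.PrimaryKey = New DataColumn() _
            {details.Columns("OrderID"), details.Columns("ProductID")}

        Return ds
    End Function

    ' Adds the one-to-many relation. By default this overload also creates
    ' the matching ForeignKeyConstraint on the child table.
    Function AddOrderDetailsRelation(ByVal ds As DataSet) As DataRelation
        Return ds.Relations.Add("FK_Order_Details_Orders", _
            ds.Tables("Orders").Columns("OrderID"), _
            ds.Tables("Order_Details").Columns("OrderID"))
    End Function

    Sub Main()
        Dim ds As DataSet = BuildDataSet()
        Dim relation As DataRelation = AddOrderDetailsRelation(ds)
        Console.WriteLine(relation.RelationName)
    End Sub
End Module
```

Once the relation exists, the DataSet rejects an Order_Details row whose OrderID has no parent row in Orders, which is the behavior the button is meant to restore on the client.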

Figure 2 - SQL Server 2005 Express Server Data Source Page

Figure 3 - Client Schemas and Sync Statistics Page

The schemas and statistics page has a DataGridView control to display SSCE INFORMATION_SCHEMA "views" (actually tables). The Client ID combo box and Set button are for testing behavior of SSCE's identity feature. Text boxes display sync statistics and CommandText property values for all operations. You can copy the commands to Notepad for better visibility.

Figure 4 - Sync Payload Page

The Payload page shows the data transferred between the server and client (and vice-versa).

Figure 5 - Test Grids Page

Test grids hold snapshots of data transferred in a more readable format than the Payload page's.

Updated 3/24/2007: Added link to Rafik PK/FK posts, plus minor additions and clarifications.

Thursday, March 22, 2007

SSCE Sync Designer Q&A and Screencast

Sync Services pilgrims working with the Sync Designer preview in the Orcas March 2007 CTP had many of their questions answered by Steve Lasker's Additional Q&A on the Visual Studio Orcas Sync Designer post of March 21, 2007, which supplements Steve's original Q&A on OCS & Sync Services for ADO.NET post of March 18, 2007. First look at the Visual Studio Orcas Sync Designer and Going N Tier w/WCF, Synchronizing data using Sync Services for ADO.NET and SQL Server Compact Edition are a pair of screencast posts (dated March 22 and 23, 2007), which cover the Sync Designer that's scheduled to debut in Visual Studio Orcas.

SSCE Sync Designer Q&A

Steve answers these questions, to which I've added some related references:

  • Why does the Orcas Feb CTP Typed DataSet designer not work on Vista? I discovered this problem at the end of the aborted guided tour described in my Guided Tour of Orcas's Sync Services Designer for SSCE post of March 17, 2007.
  • Will the Sync Designer generate time based sync? I mentioned the lack of this feature in the same post.
  • Will tombstone records be automatically cleaned up? Rafik Robeal covers this topic in his Sync Services: Periodic Tombstone Cleanup post of February 16, 2006 to The Synchronizer blog.
  • How do I get my cached tables to be synchronized in a single transaction? Rafik's A nice gift from SQL Server 2005 SP2 to sync developers post discusses an SQL Server 2005 fix for potential timestamp errors with uncommitted transactions.
  • Once all the tables are placed in a single transaction, how do I control the order the tables are updated to handle parent/child relationships? Rafik discussed this issue in a "Synchronizing an 'Order'" thread in the Microsoft Synchronization Services for ADO.NET forum.
  • Does the sync runtime create relationships locally within SQLce? I mentioned this problem in conjunction with the problem of inability to save design changes to SSCE DataSets in the Guided Tour of Orcas's Sync Services Designer for SSCE post.
  • Does the sync runtime work with server side identities for PK's? The test harness I'm building has OrderID identity columns on the client and server sides and currently uses identity partitioning (similar to merge replication's approach) to identify the client machine that's the source of the update. Ultimately, the test harness will use ROWGUIDCOL columns.
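The identity partitioning mentioned in the last item can be sketched as simple arithmetic: each client receives its own non-overlapping block of identity values above the server's current maximum, so the source of a row can be recovered from its key. This is only an illustration of the idea, not the test harness code; the block size, function names, and the convention that 0 means "created on the server" are all assumptions.

```vb
Option Strict On

Imports System

Module IdentityPartitionSketch
    ' Returns the first identity value reserved for a client (hypothetical scheme).
    ' Client 1 starts just above the server's current maximum key; each
    ' subsequent client starts one block later.
    Function SeedForClient(ByVal clientId As Integer, ByVal blockSize As Integer, _
                           ByVal serverMax As Integer) As Integer
        If clientId < 1 Then Throw New ArgumentOutOfRangeException("clientId")
        If blockSize < 1 Then Throw New ArgumentOutOfRangeException("blockSize")
        Return serverMax + 1 + (clientId - 1) * blockSize
    End Function

    ' Maps a key back to the client that generated it; 0 means the row
    ' existed on the server before the ranges were handed out.
    Function OwnerOf(ByVal orderId As Integer, ByVal blockSize As Integer, _
                     ByVal serverMax As Integer) As Integer
        If blockSize < 1 Then Throw New ArgumentOutOfRangeException("blockSize")
        If orderId <= serverMax Then Return 0
        Return (orderId - serverMax - 1) \ blockSize + 1
    End Function

    Sub Main()
        ' Assumed numbers: server keys run to 20000, blocks of 100000 per client.
        Console.WriteLine(SeedForClient(2, 100000, 20000))
        Console.WriteLine(OwnerOf(120005, 100000, 20000))
    End Sub
End Module
```

A scheme like this breaks down once a client exhausts its block, which is one reason a move to ROWGUIDCOL columns is attractive.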

SSCE Sync Designer Screencast—Part 1: First look at the Visual Studio Orcas Sync Designer

Steve's first Sync Designer (a.k.a. Cache Designer) screencast (25:49) demonstrates two-tier, one-way (download-only) synchronization of updates to reference data (Customers, Employees, and Shippers) for the Northwind Orders table. Reference (also called catalog) data, such as customer, vendor, or product lists, ordinarily are quite large but usually change relatively slowly. Two-tier, one-way sync for changes only is likely to be the most common Occasionally Connected System (OCS) scenario.

These are the only two lines of code in the Synchronize button's event handler that you need to sync the client with the server tables using the defaults you set in the designer:

Dim SyncAgent As NorthwindCacheSyncAgent = New NorthwindCacheSyncAgent

Dim SyncStats As Microsoft.Synchronization.Data.SyncStatistics = SyncAgent.Synchronize

The test harness's Synchronize button's event handler has about 100 lines of code to specify sync type and conflict handling, add and remove event handlers, and display SyncStatistics.

SSCE Sync Designer Screencast—Part 2: Going N Tier w/WCF, Synchronizing data using Sync Services for ADO.NET and SQL Server Compact Edition

The second screencast covers the n-tier scenario with a Windows Communication Foundation (WCF) service as an intermediary between the client and server. The architecture is similar to the one Rafik Robeal demonstrated in his Demo III: Offline Application – WebService project.

Update 3/23/2007: My Sync Services demo project (a work in progress) has been moved to this new location: A Sync Services Bidirectional Test Harness. Added link to Part 2 of the screencast. Incorporated reference to original Sync Services Q&A in first paragraph.

Tuesday, March 20, 2007

New SQL Server Compact Edition Article

"Lighten Up Your Local Databases" from Visual Studio Magazine's March 2007 issue recommends that you "[p]ut local data storage on a resource diet and gain performance with the newly upgraded (and free) SQL Server 2005 Compact Edition." The article's sample code includes a VS 2005 application that demonstrates binding an SqlCeResultSet object to a BindingNavigator and DataGridView.

Neither SQL Server Management Studio [Express] nor Visual Studio 2005's SP2 Server Explorer lets you display and edit data in a grid. Orcas March 2007 CTP's Server Explorer SSCE table nodes have a context menu with a Show Table Data choice.

Here are a couple of minor corrections and clarifications to Table 1:

  • The "Non-destructive change to the identity property of columns" topic states "You must drop and recreate the table to change the identity property." This is true to add or remove the identity property, but you can change the seed and increment values without altering the data in the table.
  • The @@IDENTITY system function has session scope and doesn't return a value until after you've inserted a row in a table. SCOPE_IDENTITY and IDENT_CURRENT aren't supported.

Update 3/27/2007: The March 27, 2007 edition of 1105 Media's .NETInsight newsletter features "Lighten Up Your Local Databases."

Monday, March 19, 2007

Mike Taulty Dissects LINQ to SQL

Thanks to Julie Lerman's Deconstructing LINQ to SQL post, I learned about Mike Taulty's two-part series that digs into the inner workings of the LINQ to SQL API:

Deconstructing LINQ to SQL (Part 1) discusses the differences and similarities between IEnumerable<T> and IQueryable<T>. Mike's abbreviated take is:

To me, the primary difference between IQueryable and IEnumerable with respect to LINQ is that I view IQueryable as offering the potential for "capturing the whole query and executing it in one go" whereas I view IEnumerable as "executing a set of functions in sequence on lists in order to produce more lists".
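Mike's distinction can be seen without a database. In the sketch below, which uses only LINQ to Objects, the IEnumerable(Of T) version chains delegates over a list, while the IQueryable(Of T) version records the same operators as an expression tree that a provider such as LINQ to SQL could translate in one go. The module name, function names, and freight values are hypothetical, and AsQueryable() stands in for a real query provider.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module QueryableSketch
    ' IEnumerable(Of T): Where and Select are ordinary methods that wrap
    ' compiled delegates; each operator produces another in-memory sequence.
    Function RunInMemory(ByVal source As IEnumerable(Of Decimal)) As IEnumerable(Of Decimal)
        Return source.Where(Function(f) f > 50D).Select(Function(f) f * 2D)
    End Function

    ' IQueryable(Of T): the same lambdas are captured as expression trees,
    ' so the whole query is available for inspection before anything runs.
    Function CaptureQuery(ByVal source As IEnumerable(Of Decimal)) As IQueryable(Of Decimal)
        Return source.AsQueryable().Where(Function(f) f > 50D).Select(Function(f) f * 2D)
    End Function

    Sub Main()
        Dim freight As New List(Of Decimal)(New Decimal() {12.5D, 80D, 3.25D, 140D})

        ' Prints a textual form of the captured query tree.
        Console.WriteLine(CaptureQuery(freight).Expression.ToString())

        ' Both forms yield the same values once enumerated.
        For Each value As Decimal In RunInMemory(freight)
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

With an in-memory source the captured tree is simply compiled and run locally; the point is only that the IQueryable form keeps the complete query available as data.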

Deconstructing LINQ to SQL (Part 2) discusses how IEnumerable<T> and IQueryable<T> relate to LINQ to SQL. Here's Mike's conclusion:

I do now know where the T-SQL comes from and I do know that the infrastructure uses SqlCommand and so on but I still wouldn't like to say that I have this "nailed" because I think it's pretty hard to walk through without actually single-stepping the live source code and I don't have that (and, at some point, it's sensible to give up and accept that "it works.")

As I mentioned in this earlier Yet Another Primer on New Language Features in Orcas post, Mike is a Microsoft UK evangelist "involved in getting information out to developers about what's happening with the Microsoft platform through online mechanisms like newsletters, blogs, videos and through offline mechanisms such as technical sessions."

Sync Services for ADO.NET Overview

Steve Lasker's lengthy Q&A on OCS & Sync Services for ADO.NET post covers use of merge replication, Remote Data Access (RDA) and Sync Services for ADO.NET for synchronizing data between servers and clients (or publishers and subscribers) of Occasionally Connected Systems. As you'd expect, the emphasis is on Sync Services for ADO.NET with SQL Server Compact Edition (SSCE) v3.5.

Surprisingly, there's only one brief reference to the Sync Designer (in the answer to "Does Sync Services Support N Tier?"). With the Sync Designer debuting in Orcas, I've been expecting more Q&A on the Sync Designer in the Microsoft Synchronization Services for ADO.NET forum. So far, searching on "designer" returns only two hits (one on 1/24/2007 and another on 2/28/2007).

Maybe the lack of traffic is due to the strange name of the Orcas designer template: Local Database Cache (see Guided Tour of Orcas's Sync Services Designer for SSCE). Data[base] Synchronization Service makes more sense to me. As of today, Google returned relevant hits on "Local Database Cache" only for Nick Randolph's and my posts. Same for "Sync Services Designer."

Update 3/21/2007: I should have searched for "local data cache" orcas, to pick up the Visual Basic Team blog's New Data Tools Features in Visual Studio Orcas post by Young Joo (3/13/2007). The post includes a "Local Data Cache with SQL Compact Edition" topic. The author omitted "base" and called the template "Local Data Cache." The article also:

  • Describes Hierarchical Updates with the new TableAdapterManager class, which simplifies code for executing updates on all of your DataSet's table adapters with the TableAdapterManager.UpdateAll(DataSet) method
  • Previews the newly-renamed Object Relational Designer (formerly the DLinq Designer)
  • Introduces n-tier support for typed DataSets by splitting the class file into another project.

Local Database Cache also suffers from lack of any documentation whatsoever. Try searching online help for "Local Database Cache" -- nada. (Hierarchical Update's help topics appear complete and Object Relational Designer has an unfinished walkthrough.)

Get The Bits

Here are OakLeaf links to details for downloading the current SSCE CTPs, RTMs and samples:

Saturday, March 17, 2007

Guided Tour of Orcas's Sync Services Designer for SSCE

Nick Randolph, co-author of WROX's Professional Visual Studio 2005 and Microsoft Visual Developer/Device Application Development MVP, has published a two-part demonstration of the Orcas March 2007 CTP's new Sync Designer, which starts when you add a Local Database Cache (LocalDataCache1.sync) template to your project.

Part 1 stops at the Add New Item dialog that displays Local Database Cache and Service-based Database templates.

Part 2 continues with the Configure Data Synchronization and Configure Tables for Offline Use dialogs.

Note: For assistance with Sync Services and Sync Designer issues, I recommend the Microsoft Synchronization Services for ADO.NET forum. The Synchronizer blog and Rafik Robeal's SyncGuru site offer downloadable C# Sync Services sample projects, documentation, and commentary. Nick Randolph's SQL Server CE Portal site offers FAQs for SSCE and Sync Services, including a sample VB project that improves on Rafik Robeal's C# code for configuring Sync Services.

My Problems with the Sync Designer

The series is similar to a demonstration page that I put together last week for the Orders and Order_Details tables of the Northwind sample database. I didn't publish it to the blog because the Wizard failed at the last step.

As noted in steps 9 through 12 of my test drive:

9. Under Windows Vista running on Virtual Server 2003 R2 Beta 1, the Orcas Data Source Configuration Wizard creates NorthwindDataSet.xsd, .xsc, and .xss, but doesn't create the NorthwindDataSet.Designer.vb file for a typed NorthwindDataSet. (This problem doesn't occur with the Orcas Data Source Wizard running under virtualized Windows Server 2003 R2.)

10. When I attempted to generate a typed data set from Northwind.sdf with the Data Source Configuration Wizard under virtualized Vista, I received this informative error message:

This message has nothing to do with SSCE. It also occurs when attempting to create a typed DataSet from SSCE or SQL Server [Express] with the Data Source Configuration Wizard under Vista in my configuration. A search for the error message returns a link to this even less informative suggestion from the MSDN Library: "Inspect the error message and check for any errors in the Task List (Visual Studio) that can be fixed."

11. I received this error message when attempting to save any changes, such as adding the DataRelation between the Orders and Order Details tables, in the DataSet Designer for the client SSCE database's typed DataSet. This exception occurs under virtualized Windows Vista and virtualized Windows Server 2003 R2:

12. I set the sync interval to 5 minutes, but was unable to detect any evidence that synchronization was occurring between the SSCE client and SSX server.

Subsequently, I wrote a Sync Services test harness to add the DataRelation for the client and a customized SyncGroup at runtime. A future blog post and Visual Studio Magazine article will provide more details on the test harness.

I plan to redo the test drive with virtualized Windows XP SP2 after I finish the test harness studies.

Friday, March 16, 2007

New "Inheritance in the Entity Framework" Article

Erick Thompson has posted a brief article entitled "Inheritance in the Entity Framework" to the ADO.NET Team blog. Topics include:

  • Why OOP?
    • Problem Modeling
    • Application Extension
  • How the Entity Framework bridges the gap
    • Table per Hierarchy
    • Table per Type
    • Table per Concrete Type

Erick provides sample Entity Data Model (CSDL conceptual schema) XML fragments for Product, DiscontinuedProduct and SeasonalProduct entity types to illustrate how to implement inheritance in ADO.NET 2.0's Entity Framework.

P.S. The Entity Framework's Persistence Ignorance issue discussed in my earlier post is raising more controversy. Check the original post for updates.