Wednesday, April 02, 2008

LINQ and Entity Framework Posts for 3/31/2008+

Note: This post is updated daily or more frequently, depending on the availability of new posts.

Microsoft France Academic Interns Create Visual LINQ Query Designer (VLinq)

Mitsuru Furuta, Microsoft France's manager and technical lead of the VLinq project, recruited two talented students, Simon Ferquel and Johanna Piou, who was a French Imagine Cup participant, to write a Visual Studio designer template that helps LINQ naifs write LINQ queries on a graphical design surface implemented in WPF.

Mitsu's Visual Linq query builder for Linq to Sql: VLinq post of April 2, 2008, offers a brief history of the VLinq project and a step-by-step demonstration with LINQ to SQL as the data source. He says the project took "almost one year of work and organization." The student internships lasted six months.

Here's the Visual LINQ Query Designer surface running a simple query:

Do you see something missing in the generated T-SQL prepared statement?

You can download the add-in template from MSDN Code Gallery.

Added: April 2, 2008

Microsoft Data Centers Expand the Unit of Deployment from Bricks to ISO 668 20-Foot Intermodal Containers

Mary Jo Foley reports in her Microsoft builds out its first containerized datacenter post of April 2, 2008 that the new Northlake, Ill. data center will use containers holding 1,000 to 2,000 servers as the basic unit of deployment.

Mike Manos, who leads the Microsoft Global Foundations Data Center team, made the first public announcement of a containerized production data center at Data Center World. According to Microsoft Data Architect James Hamilton:

The Microsoft Chicago facility is a two-floor design where the first floor is a containerized design housing 150 to 220 40’ containers each [holding] 1,000 to 2,000 servers. Chicago is a large facility with the low end of the ranges Mike quoted yielding 150k serves and the high end running to 440k servers.
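The endpoints of that server range follow from multiplying the quoted container counts by the quoted servers per container. Here's a quick arithmetic check that uses only the figures in the quote; the variable names are mine:

```python
# Figures from the quote: 150 to 220 containers, each holding 1,000 to 2,000 servers.
containers_low, containers_high = 150, 220
servers_low, servers_high = 1_000, 2_000

# Low end: fewest containers at the lowest density.
low_estimate = containers_low * servers_low
# High end: most containers at the highest density.
high_estimate = containers_high * servers_high

print(f"{low_estimate:,} to {high_estimate:,} servers")
# prints "150,000 to 440,000 servers"
```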

The Database Bricks/Containers Backstory and Their Relation to SSDS

I wrote about bricks and containers as units of data-center deployment almost two years ago (to the day) in my Very Large Databases: Bricks, BitVault and BigTable post of April 6, 2006. That post's "Really Big Smart Bricks and Databases" topic starts with:

The ultimate brick is a Google data center in a shipping container, as postulated by PBS's Robert X. Cringely (Mark Stephens) in his 11/17/2005 column, "Google-Mart: Sam Walton Taught Google More About How to Dominate the Internet Than Microsoft Ever Did."

I concluded:

Clearly, large distributed file systems and databases must run on very big smart bricks or hardware of similar scale. Google probably runs the world's largest distributed file system.

It's a reasonably safe bet that future Midwest users of SQL Server Data Services will be accessing data in the container instead of data in the cloud.

The "VLDB Implementation/Deployment Issues" topic quoted Dare Obasanjo's "Greg Linden on SQL Databases and Internet-Scale Applications" post, which in turn quotes Greg Linden:

What I want is a robust, high performance virtual relational database that runs transparently over a cluster, nodes dropping in an out of service at will, read-write replication and data migration all done automatically. I want to be able to install a database on a server cloud and use it like it was all running on one machine.

and goes on to describe Adam Bosworth's database requirements of the time:

Adam Bosworth, who managed the initial development of Microsoft Access, went on to found CrossGain (together with Tod Nielsen, Access marketing honcho), sold CrossGain to BEA and became BEA's chief architect and senior vice president, and [was then] VP Engineering at Google, lists these three features that database users want but database vendors don't supply: Dynamic schema, dynamic partitioning of data across large dynamic numbers of machines, and modern [Googlesque] indexing. Adam wants the Open Source community to "[g]ive us systems that scale linearly, are flexible and dynamically reconfigurable and load balanced and easy to use." Adam does mean give, not sell.

It seems to me that BitVault's smart bricks with appropriate deployment and management applications would fulfill Greg's and all but Adam's economic desires for reference data, which now constitutes more than half of the data stored by North American firms.

The tragedy is that Jim Gray didn't live to witness the massive scaleup and scaleout of his original CyberBrick concept.

Added: April 2, 2008

DevExpress Implements LINQ Data Sources for ASPxGridView and XtraGrid

The Server mode using LINQ? Let's wax rhapsodic post of April 1, 2008 by DevExpress CTO Julian M. Bucknall describes how DevExpress refactored its ASP.NET and Windows Forms grid controls from dependence on eXpress Persistent Objects (XPO) to support for any LINQ provider that returns an IQueryable<T> type.

DevExpress's 2008 Volume 1 release adds LinqServerModeDataSource for pageable ASPxGridView controls for ASP.NET, which appears to emulate the ASP.NET LinqDataSource. The LinqServerModeSource supports the XtraGrid and adds similar paging capability. You can view a short screencast that demos LinqServerModeDataSource here.

The "Developer Express uses LINQ for grid controls" article for SDTimes by David Worthington became one of the magazine's Top Stories by April 2, 2008.

Added: April 2, 2008

Eugenio Pace Implements a Mock SSDS for LitwareHR with SQL Server Express

Eugenio's LitwareHR on SSDS - Part IV - Data access enhancements 2: developing offline post of April 1, 2008 describes how the new SQL Server Data Services (SSDS) version of LitwareHR implements a proxy to let developers switch between SSDS and a local SQL Server Express instance as the data source for the demo project.

Eugenio's post describes his objective in implementing the proxy:

Our goal while developing LitwareHR was to actually make the dev team as independent and autonomous as possible. Notice I say the dev team, not the application itself. We were comfortable in taking a dependency with SSDS for runtime, that is when LitwareHR would be deployed in a production data center. But we wanted developers to be able to work even if they are flying on a plane with no connectivity (like I was!). In one sentence we wanted a "mock SSDS".

With this in mind, we developed a new proxy implementing the same interface the real proxy implements, but against a local SQL Server Express database.
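The pattern Eugenio describes, two proxies behind one interface, can be sketched roughly as follows. This is only an illustration of the idea, not LitwareHR's code: the EntityStore interface, its put/get methods, and the create_store factory are names I made up, and an in-memory dictionary stands in for the local SQL Server Express database.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class EntityStore(ABC):
    """Hypothetical interface that both the real and the mock proxy implement."""

    @abstractmethod
    def put(self, container: str, entity_id: str, entity: dict) -> None: ...

    @abstractmethod
    def get(self, container: str, entity_id: str) -> Optional[dict]: ...


class LocalMockStore(EntityStore):
    """Offline stand-in; a dict plays the role of the local database."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}

    def put(self, container: str, entity_id: str, entity: dict) -> None:
        # Copy on write so callers cannot mutate stored state by accident.
        self._rows[(container, entity_id)] = dict(entity)

    def get(self, container: str, entity_id: str) -> Optional[dict]:
        row = self._rows.get((container, entity_id))
        return dict(row) if row is not None else None


class RemoteStore(EntityStore):
    """Placeholder for the proxy that would call the hosted service."""

    def put(self, container: str, entity_id: str, entity: dict) -> None:
        raise NotImplementedError("needs connectivity to the service")

    def get(self, container: str, entity_id: str) -> Optional[dict]:
        raise NotImplementedError("needs connectivity to the service")


def create_store(offline: bool) -> EntityStore:
    """Pick the proxy once; the rest of the application sees only EntityStore."""
    return LocalMockStore() if offline else RemoteStore()


store = create_store(offline=True)
store.put("positions", "42", {"title": "Developer"})
print(store.get("positions", "42"))
```

Because calling code depends only on the shared interface, switching between the local and hosted back ends comes down to a single choice, presumably driven by configuration.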

Following are previous posts in Eugenio's LitwareHR for SSDS series:

Added: April 2, 2008

Dave Robinson Writes in 20 Minutes an SSDS Client That Adds and Retrieves Red Bull Recipes to/from SSDS

Challenged to write a "full-featured" SSDS app in the 20 minutes before his CodeTrip presentation last Friday at the Red Bull headquarters in Santa Monica, Dave came up with the project he describes in his SSDS @ Red Bull post of April 1, 2008. The project uses the Excel Add-in for SSDS that Dave describes in his Using Excel with SSDS post of March 21, 2008.

[Dave's link to SSDS @ RedBull in his SSDS, The Code Trip & Red Bull post of April 1, 2008 doesn't work for me; it's fixed above.] Dave fixed his link.

Update April 3, 2008: You can watch a 09:54 TechZulu video segment that includes an interview with Dave, which begins at 02:40.

TechZulu claims to be "one of the first weblog that introduces southern California's companies to the technology community."

Added: April 1, 2008 Updated: April 2, 2008

Noam Ben-Ami Describes Entity Framework's Future Update Model from Database Feature

Noam's Update Model From DB post of April 1, 2008 describes the post-CTP2 version of the Entity Data Model (EDM) Designer's Update Model from Database feature that detects changes to the data store's schema and modifies the physical schema (SSDL) and the mapping (MSL) layer as follows:

  1. Adds EntityTypes, Properties and/or Associations for objects that have tables or columns in the database but are missing from the EDM. You can select which tables, columns, or both to add.
  2. Modifies EntityTypes, Properties and/or Associations where database tables or columns have changed from the structure represented by the current SSDL schema.
  3. Removes EntityTypes, Properties and/or Associations whose tables or columns have been deleted from the database, but from the SSDL only. This process doesn't delete the element from the model (CSDL section).

The post-CTP2 EDM Designer handles renamed entities or properties, but not renamed tables or columns.

Noam says that these are the limitations of the forthcoming designer:

  • It does not update the types of properties when the corresponding database columns change.
  • It does not “resurrect” entity types – once you have deleted a type, the only way to get it back is to recreate it manually, or delete the corresponding SSDL by hand in the XML editor.
  • It will also not “resurrect” properties.
  • If you change the keys that define your type, all current associations lose their identity and new associations will be brought in – you will need to delete old associations.
  • It cannot detect database object renaming – renames will show up as the deletion of an old object and the addition of a new one.
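The add/modify/remove classification, and the reason a rename looks like a deletion plus an addition, can be illustrated with a simple name-based comparison. This is my own sketch, not the designer's algorithm: diff_schema is a hypothetical helper, and each schema is reduced to a map of table names to column-name sets.

```python
from typing import Dict, List, Set

Schema = Dict[str, Set[str]]  # table name -> column names


def diff_schema(model: Schema, database: Schema) -> Dict[str, List[str]]:
    """Classify tables by comparing names only, with no rename detection."""
    added = sorted(t for t in database if t not in model)
    removed = sorted(t for t in model if t not in database)
    modified = sorted(
        t for t in database if t in model and database[t] != model[t]
    )
    return {"add": added, "modify": modified, "remove": removed}


# The store schema as last imported into the model.
model = {
    "Customers": {"CustomerID", "Name"},
    "Orders": {"OrderID", "Total"},
}
# The database now: Customers gained a column and Orders was renamed to Purchases.
database = {
    "Customers": {"CustomerID", "Name", "Phone"},
    "Purchases": {"OrderID", "Total"},
}

changes = diff_schema(model, database)
print(changes)
# prints {'add': ['Purchases'], 'modify': ['Customers'], 'remove': ['Orders']}
```

Because the comparison keys on names alone, the renamed Orders table surfaces as one removal and one unrelated addition, which matches the last limitation in Noam's list.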

Of all the improvements, my favorite is the ability to handle renamed entities, because I work primarily with tables having plural names, which I always singularize for entity names.

Added: April 1, 2008

Oren Eini: NHibernate 2.0 Alpha 1 Released

Ayende says in his NHibernate 2.0 Alpha is out! post of March 31, 2008:

It gives me great pleasure to announce that NHibernate 2.0 Alpha 1 was released last night and can be downloaded from this location.

We call this alpha, but many of us are using this in production, so we are really certain in its stability. The reason that this is an alpha is that we have made a lot of changes in the last nine months (since the last release), and we want to get more real world experience before we ship this. Recent estimates are of about 100,000 lines of code has changed since the last release.

NHibernate is the standard of comparison for Windows object/relational mapping (O/RM) tools. The "NHibernate Mafia" (James Kovacs, Scott Bellware, Jeffrey Palermo, and Jean-Paul Boodhoo) were responsible for making the Entity Framework folks finally understand what persistence ignorance is all about.

Added: April 1, 2008

SQL Server Data Services Team Seeks Ruby on Rails, PHP and Java Developers for April SDR

Ryan Dunn's Interested in SQL Server Data Services? post of March 31, 2008 touts early access to SSDS and a seat at an SSDS software design review (SDR) at the Microsoft Silicon Valley Campus on April 24 - 25, 2008 for developers using non-Microsoft technologies, such as Ruby on Rails, PHP and Java.

PS: Watch for my cover story about SSDS coming in the July 2008 issue of Visual Studio Magazine.

Query Caching In ADO.NET Data Services Feedback Wanted

Pablo Castro is seeking feedback from users of ADO.NET Data Services (Astoria) about the tradeoff between the performance benefit of compiling/caching data retrieval queries and the code complexity of defining the queries when taking advantage of Astoria's query interceptor feature. Pablo's Looking for feedback: query caching in data services post of March 31, 2008 dives deep into the caching issues.

Data retrieval queries involve elaborate, provider-specific composition. For example, Pablo cites this flow for Astoria with Entity Framework as the data provider:

URL -> Expression Tree -> Canonical Query Tree -> View expansion/Query simplification -> Canonical Query Tree -> SQL -> Rows (DataReader) -> Entities (DataReader) -> Objects -> Serialization (Atom/JSON)

Thus compiling and caching these queries, especially complex ones that involve deeply nested data structures resulting from use of the $expand keyword, has a very beneficial effect on performance. Undoubtedly the perf improvement will be substantially greater than that reported for LINQ to SQL by Rico Mariani in his famous five-part DLinq (LINQ to SQL) Performance series, as well as his more recent Performance Quiz #13 -- Linq to SQL compiled queries cost of January 11, 2008 and Performance Quiz #13 -- Linq to SQL compiled query cost -- solution of January 14.

Data interceptors, which enable adding a custom filter predicate to the query, are commonly used to implement data access control but complicate the compiling/caching process. Pablo wants opinions on the relative importance of query performance versus code complexity for implementing data interceptors. Without knowing how much the perf benefit is affected by the two options he offers at the end of his post's Part 2, it's difficult to make a recommendation. I'm inclined to favor performance because the interceptor code is likely to be suited to cookbook-style documentation.
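To make the tradeoff concrete, here's a toy sketch of a compiled-query cache sitting next to a per-request filter predicate. It doesn't reproduce either of Pablo's options or any Astoria API. The function names and the cache key are my assumptions, and the sketch takes the easy route of applying the predicate outside the cached plan, whereas a real interceptor is composed into the query before translation, which is where the complication comes from.

```python
from typing import Callable, Dict, List, Tuple

Row = dict
Plan = Callable[[List[Row]], List[Row]]

compile_count = 0  # how many times the expensive step has run


def compile_query(entity_set: str, order_by: str) -> Plan:
    """Stand-in for the costly URL-to-SQL translation pipeline."""
    global compile_count
    compile_count += 1
    return lambda rows: sorted(rows, key=lambda r: r[order_by])


_plan_cache: Dict[Tuple[str, str], Plan] = {}


def get_plan(entity_set: str, order_by: str) -> Plan:
    """Return a cached plan for this query shape, compiling on first use."""
    key = (entity_set, order_by)
    if key not in _plan_cache:
        _plan_cache[key] = compile_query(entity_set, order_by)
    return _plan_cache[key]


def run_query(rows: List[Row], entity_set: str, order_by: str,
              interceptor: Callable[[Row], bool]) -> List[Row]:
    """Filter with the per-request predicate, then run the shared cached plan."""
    visible = [r for r in rows if interceptor(r)]
    return get_plan(entity_set, order_by)(visible)


orders = [
    {"id": 2, "owner": "alice"},
    {"id": 1, "owner": "bob"},
    {"id": 3, "owner": "alice"},
]
alice = run_query(orders, "Orders", "id", lambda r: r["owner"] == "alice")
bob = run_query(orders, "Orders", "id", lambda r: r["owner"] == "bob")
print([r["id"] for r in alice], [r["id"] for r in bob], compile_count)
# prints [2, 3] [1] 1
```

Both requests share one compiled plan because the predicate stays out of the cache key. Folding a user-specific predicate into the compiled query would call for either parameterizing it or caching a separate plan per predicate.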

The ultimate decision is likely to affect SQL Server Data Services, also.

PS: Watch for my cover article, "Retrieve and Update Data in the Cloud with Astoria," about ADO.NET Data Services coming in the May 2008 issue of Visual Studio Magazine.

PPS: My "ASP.NET MVC: Is the new MVC pattern right for you?" TechBrief is in the March 15, 2008 issue of Redmond Developer News. Unlike on p. 27 of the print edition, the electronic version has the right blog address.

Taking Advantage of Stored Procedures in the Entity Framework

The Entity Framework (EF) defaults to generating dynamic parameterized Entity SQL (eSQL) statements to populate (hydrate) entities from tables in the underlying relational data store and parameterized data manipulation language (DML) statements to create, update and delete entities and their associated tables.

Many organizations and individual DBAs prefer or insist on the use of stored procedures rather than dynamic SQL for SELECT, INSERT, UPDATE or DELETE operations. Stored procedures eliminate the need to grant users direct access to tables, which could compromise database security. In many cases, the capability to efficiently execute stored procedures instead of dynamic SQL is the principal determining factor in choosing an object/relational mapping (O/RM) tool.

EF has built-in support for stored procedures, but there's no central source of complete, up-to-date documentation and how-to information for using stored procedures with the current Beta 3 version and its Entity Data Model (EDM) Designer CTP 2. The EF Extensions, which add flexibility in the use of custom stored procedures and speed their execution, haven't been widely used or thoroughly documented. (There were only 181 EF Extension downloads from the MSDN Code Gallery as of 3/31/2008, and Colin Meek only recently published his ADO.Entity Framework: Stored Procedure Customization article on 3/26/2008.)

The following three articles contain detailed instructions with screen captures and code examples for creating entities, changing from dynamic SQL to stored procedures, testing EDMs with stored procedures, as well as minimizing the number of stored procedure calls and improving performance with the EF Extensions:

Migrating to SQL Server Stored Procedures with the EDM Designer December CTP 2 (March 27, 2008)

Testing Stored Procedure Replacements for Entity SQL Statements (March 29, 2008)

Minimize Stored Procedure Calls and Improve Performance with EF Extensions (March 30, 2008)

Jim Zemlin: "The Linux ecosystem should pay attention to" SQL Server Data Services

Jim Zemlin is the Executive Director of the Linux Foundation, so his advice is well taken, especially by Microsoft. In his Cloud Computing - Did anybody notice Microsoft’s SQL Server Data Services Announcement? post of March 14, 2008, Jim says:

Linux has an early lead in this area with service offerings like Amazon’s Linux based S3 and IBM’s Blue Cloud which uses Xen and PowerVM virtualized Linux operating-system images.

Today Web 2.0 start ups are flocking to these services as a way to reduce their cost, have world class infrastructure, and most importantly to be able to scale up and down based on demand. It won’t be long before mainstream enterprises follow this trend. It turns out that IT operations combined with convenient tools is something that will be the core competency of companies like Google, IBM, Amazon, and Microsoft in the future. ...

A product from Microsoft that doesn’t require their tools or lock you into their platform offered as a service over the web? Time[s] are indeed a changing. The interesting thing about this is that the folks who need to adjust for this change the most are software companies that don’t have broad IT operations infrastructure and management competencies similar to the likes of Google and Amazon. Software companies like Microsoft are waking up to this. Are other going to follow suit? [Emphasis added.]

Like many others, Jim identifies the wrong services as SSDS competitors. Amazon's S3 stores raw data (blobs) and IBM touts Blue Cloud as providing on demand computing services, similar to Amazon's EC2. SSDS's only direct competitor is Amazon's SimpleDB, as far as I've been able to determine.

Thanks to the SSDS Team for the heads up.

Added: April 1, 2008 [Not an April Fool]

Video: Liam Cavanagh and Neil Padgett Demonstrate Synchronizing SSDS Data with Access and Outlook

From the Microsoft Sync Framework blog's Sync to the Web - Enable Offline Access to Web Data from any Data Store post of March 12, 2008:

At MIX 2008 Liam Cavanagh discusses how to use the Microsoft Sync Framework along with SQL Server Data Services to provide synchronization cabability across multiple applications.

In the demonstration, he synchronizes contacts between Microsoft Access, Outlook and SQL Server Data Services. However, by building new providers, any application can participate in synchronization.

Later in the presentation Neil Padgett goes through a detailed internals discussion of the Microsoft Sync Framework metadata and provider models.

You can view the WMV (high or low bitrate) version of the 41:00 informal presentation. Here's another link to Neil's formal MIX 08 session: Using Microsoft Sync Framework and FeedSync (T32).

The OakLeaf Systems blog will start carrying more Sync Framework/Services content because of its importance to SSDS and its pending integration with Astoria to create "Astoria Offline."

Here's a link to an earlier "Introducing Microsoft Sync Framework" presentation by Moe Khosravy, the lead PM for the Sync Framework.

Added: April 1, 2008