Saturday, March 28, 2009

More Confusion about Relational SQL Data Services and the Entity-Attribute-Value Data Model

Jamie Thomson’s SQL Data Services and Entity-Attribute-Value models post takes issue with Jeffrey Schwartz’s question and Niraj Nagrani’s answer in Jeff’s UPDATED: Microsoft Exec Explains SDS About-Face article for Visual Studio Magazine.

Q. Are you basically not going to be offering SDS with the EAV tables any more?

A. We are looking into our future roadmap to make sure that Astoria [ADO.NET Data Services] can be leveraged on top of SDS and Entity Data Model continues to exist, and we will continue to provide for that through Astoria. We will continue to work with the Astoria framework and figure out how SDS can support that.

Obviously, the answer was to a question that appears not to have been asked. I can’t think of any reason that relational SDS couldn’t support the server-side pieces of SQL Server Data Services (Astoria) v1.

Jamie contends:

Firstly, whilst the current incarnation of SDS is (underneath the ACE model abstraction) built upon an EAV model (which I think is what the interviewer was alluding to) I don’t believe its true to say that the ACE model with which a user interacts is an EAV model in itself.

Secondly, it is absolutely not the case that it won’t be possible to host EAV models on the future incarnation of SDS. That version of SDS will be built upon SQL Server 2008 which contains the new sparse column feature – a perfect storage mechanism for EAV models. Moreover, its more than possible to build EAV models on a traditional relational database (I’ve done it myself) and if you want to know how to do it then go and read Arnie Rowland’s excellent treatise on the subject at Through the Looking Glass: Elegant -or Not?.

My take is that SDS (and its predecessor, SQL Server Data Services) were based on the EAV data model.

Although it would be possible for the SDS team to implement an EAV version, Azure Tables use the EAV data model and will have much greater scale-out capabilities than SDS v1, which will be limited to “departmental” size databases.

Undoubtedly, SDS will be carry a surcharge over plain-vanilla (but fully elastic) Azure EAV Tables, like the SQL Server option for Windows Server running on Amazon EC2. So implementing EAV on SDS would carry a significant price penalty over native EAV with Azure Tables.

3 comments:

mamund said...

IMO, this is all about what interfaces are offered for accessing data. The original SDS offering was an attempt to provide EAV services for data stored in SQL Server. Azure Table Storage was another, slightly different attempt to do the same thing.

The question is not *can* the SQL team provide an EAV interface, the question is why and when should the SQL team do this. The message I get from the recent shift is that the SQL team has decided that will not, in the near future, provide an EAV interface. Even more to the point, I heard folks at MSFT say that they were getting feedback that a significant majority of their customers did not *want* and EAV interface for SQL data.

Finally, MSFT has decided that, if you really want an EAV interface on SQL data and you don't like what Azure Table Storage is offering, you can roll your own.

Jeff Currier said...

This is precisely the feedback we recieved from customers. While we could have continued to layer on additional relational features on a EAV model there is a point of diminishing return.

What's more, in attempting to provide additional relational features on the EAV model we were, in essences, reinventing a SQL like language. At some point one has to question that if your reinventing something that people already know and love why not just give them the original thing. So we are...

--Jeff-

ccchai said...

If SDS only offer relational database platform, and we have to use Azure to host our mid-tier or application tier. Then won't it be we are double-charged in the end??

I suspect the change is simply the shortcut way to introduce all the capabilities that we have asked for, e.g. batch processing, stored proc, joins, group by, aggregates, UDF, etc...

I believe the change has little to do with relational vs EAV model in this case. The tactic is to expose a more 'reliable' database hosting service and then we have to do the rest to make use of it e.g rolling out our own SOAP/REST interface.