OakLeaf Systems: Examples of Scaling Relational Databases Up and Out

Monday, March 30, 2009

Examples of Scaling Relational Databases Up and Out

There’s considerable doubt about the capability of enterprise-grade relational database management systems (RDBMSs) to provide sufficient elasticity for cloud-scale services while supporting most features that data-oriented developers consider de rigueur, such as:

Immediate consistency
Rich data types
Views
Indexes
Stored procedures
Triggers
Explicit ACID transactions
Inner and outer joins
Referential integrity
Cell-level encryption
Full-text search
Horizontal and vertical partitioning

Microsoft’s mid-course correction to SQL Server Data Sources (SDS, formerly SQL Server Data Services, SSDS) changed the table data model from Entity-Attribute-Value (EAV) with freeform attributes (a.k.a, flexible properties) and simple data types to fully relational with most SQL Server 2008’s features. My A Mid-Course Correction for SQL Data Services posts of 2/24/2009 and SQL Data Services Abandons REST for TDS API and Knocks My Socks Off of 3/10/2009 describe the anticipated changes.

Updated 4/3/2009 with excerpts from Simon Munro’s SQL Data Services Does Not Scale post of 3/31/2009.

“Cloud RDBMS usage looks good” according to James Hamilton, who’s currently an Amazon Web Services vice president and distinguished engineer and formerly a high-level architect for Microsoft’s data centers and SQL Server, as well as IBM’s DB2. Hamilton gave the opening keynote, Cloud Computing Economies of Scale, presented at the Self Managing Database Systems workshop, which is part of the International Conference on Data Engineering in Shanghai, China on 3/29/2009. His slides to accompany a closing panel titled Grand Challenges in Database Self-Management cite:

Microsoft’s Hotmail with more than 300 million users and more than 2 billion non-span messages per day with SQL Server on every back-end node of a 10,000-server farm.
Facebook with 1,800 MySQL copies.
Windows Live ID with 420 million IDs and well over 1 billion authentications per day

However, Hamilton notes that “all the complexity ‘is above’ the RDBMS:

Partitioning & partition management done above RDBMS
Many of these workloads could run on simple ISAMS
Many new workloads are going non-relational
For many installations, the most mission critical data-intensive applications don’t involve RDBMS

Azure covers the above issues with Azure Table services, which use a “conventional” EAV data model that’s similar to Amazon’s SimpleDB and Google’s App Engine.

Hamilton asks: “Why the RDBMS exodus?” and answers:

Failure to scale
Excess administrative complexity
Resource intensive due to monolithic delivery of un-needed features
Unpredictable response times
Opaque failure modes
Access patterns excessively random
Slow to evolve to new workload patterns

Azure and SDS incorporate many of the required features Hamilton describes in his On Designing and Deploying Internet-Scale Services paper for the 21st Large Installation System Administration Conference (LISA ’07) in November 2007.

Wai-Ming Mok, a former product-line manager at Sun Microsystems, posted Multi-tenancy @ Salesforce.com on 3/29/2009. Mok observes:

Salesforce.com supports over 55K customers, including Google and Dell. This feat is achieved by an ingenious group of Oracle database experts who have taken an enterprise class relational database and turned it into a multi-tenant system that runs customer relationship management for these many customers. BTW, this system supports close to 200M transactions each weekday, at less than 1/4 of a second response time.

Mok’s post continues:

On March 25, Craig Weissman, CTO of Salesforce.com, gave an illuminating presentation on the internal architecture at his company to a room full of attendees at the SDForum SAMsig. Some highlights:

There are 15 or so “pods”, each consisting of 50 or so servers, running a 4-way Oracle RAC and application (Java) and support servers. Each pod supports thousands of customers.
Each Oracle RAC database consists of static tables that store the data from thousands of customers all mixed together. One row of data belongs to one customer and the next row belongs to another, and the columns may have completely different types of data between the rows. Control and access to the data are managed by metadata. In essence, the Oracle database is transformed to a multi-tenant database.
Customer data in the columns are stored as human-readable strings. Some customers have requested certain data to be encrypted. Appropriate transformation functions are used to convert to the correct data types when the data are accessed.
Using Lucene, the data are all indexed.
Apex is a new language to enable customers to write data processing applications, like a new generation of 4GL. It resembles Java. Governors are deployed to prevent abuse of the system resources. [Emphasis added.]

The flexible properties of SDS’s original data model and Azure Tables have characteristics similar to those highlighted above.

Simon Munro argues that SQL Data Services Does Not Scale in this 3/31/2009 essay based on his “Comparing Azure Storage and SQL Data Services” presentation to the SQLBits conference in Manchester, UK. Simon writes:

The only (and suggested) way of getting scalability out of SDS is to scale out using partitions. Generally this is a really difficult thing for the relational model and SQL to do as a big part of the SQL model is the ability to have consistent data. Brewers CAP conjecture implies that SQL, by being Consistent and Available has to forgo Partition Tolerance – and I believe that that is the case with SDS. …

Anyone who has tried to build a solution based on partitioned databases will know that the architecture in the above case study is no simple achievement and takes a lot of work whereby an entire layer needs to exist to allow the system to first identify the correct partition for the data. In the same MIX09 presentation Nigel Ellis hints at efforts that will be made in terms of adding some partition support to SDS, which may, through the use of configuration, provide for better querying across partitions (although ACID may be more difficult).

It’s to be noted that the SDS team now refers to “departmental” sized databases. As an example, the First round of Questions and Answers post of 3/12/2009 to the SQL Data Services Team Blog says:

The database size will be capped. We are still evaluating what the cap will be, but the plan is to ensure that the allowed database size supports most, if not all, departmental and web application workloads. [Emphasis added.]

A video segment of the session might be available shortly here.

Did the SDS team make the right decision when they listened to customer feedback and abandoned the EAV data model with flexible properties for more traditional RDBMS features, Transact-SQL, and the Tabular DataStream (TDS) protocol? Only time will tell.

However, I’m betting that .NET developers, Azure’s target audience, will welcome the change despite having to roll their own support for RESTful Astoria-based data access.

Technorati Tags: Windows Azure,Azure Services Platform,Azure Storage Services,Azure Table Services,Azure Blob Services,SQL Data Services,SDS,Cloud Computing,Amazon Web Services,SimpleDB,SalesForce.com,Oracle

3 comments:

mamund said...: excellent summary. quality links.
thanks!; March 30, 2009 10:49 AM
Geoffrey Emery said...: Great rap up .. Time will tell; March 31, 2009 8:58 AM
Anonymous said...: The issue is not whether it can scale, the issue is whether you are willing to give up enough functionality to make it scale. Access patterns and pinning down query plans are key to some of these issues. But the most important one is still how much are you willing to pay to get a scalable relational database service in the cloud? I believe that there is an upper limit to how much people are willing to pay. We need to nail that number down into an acceptable range before your question can be answered. In the end it is all about how much of your workload you are willing to dump to leap across a chasm.; March 31, 2009 10:36 AM