Sunday, May 02, 2010

Windows Azure and Cloud Computing Posts for 4/30/2010+

Windows Azure, SQL Azure Database and related cloud computing topics now appear in this weekly series.

 
• Updates for articles added on 5/1/2010 were moved to Windows Azure and Cloud Computing Posts for 5/1/2010+ on 5/2/2010 due to length.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:

To use the above links, first click the post’s title to display the single article you want to navigate.

Cloud Computing with the Windows Azure Platform published 9/21/2009. Order today from Amazon or Barnes & Noble (in stock).

Read the detailed TOC here (PDF) and download the sample code here.

Discuss the book on its WROX P2P Forum.

See a short-form TOC, get links to live Azure sample projects, and read a detailed TOC of electronic-only chapters 12 and 13 here.

Wrox’s Web site manager posted on 9/29/2009 a lengthy excerpt from Chapter 4, “Scaling Azure Table and Blob Storage” here.

You can now download and save the following two online-only chapters in Microsoft Office Word 2003 *.doc format by FTP:

  • Chapter 12: “Managing SQL Azure Accounts and Databases”
  • Chapter 13: “Exploiting SQL Azure Database's Relational Features”

HTTP downloads of the two chapters are available from the book's Code Download page; these chapters will be updated for the January 4, 2010 commercial release in April 2010. 

Azure Blob, Table and Queue Services

Steve Marx (@smarx) and Ryan Dunn (@dunnry) present Cloud Cover Episode 9 - Blob API, a 00:33:36 Channel9 video segment in their Cloud Cover series:

Join Ryan and Steve each week as they cover the Microsoft cloud. You can follow and interact with the show at @cloudcovershow.
In this episode:

  • Using the StorageClient library, take a lap around the Blob API and discover the common operations
  • Hear the latest news and announcements for the platform
  • Discover a quick tip/gotcha for running the AppFabric Service Bus in Windows Azure 

Show Links:
Windows Azure self-paced training
OData under Apache 2.0 license
Filtering Diagnostic Events
New SQL Azure features are live
AppFabric Service Bus troubleshooting Tips
Blob API Upload Optimizations (via Rob Gillen)

Jai Haridas explains Protecting Your Blobs Against Application Errors in this detailed 4/30/2010 tutorial from the Windows Azure Storage blog:

A question we get asked is the following: do applications need to back up data stored in Windows Azure Storage if Windows Azure Storage already stores multiple replicas of the data? For business continuity, it can be important to protect the data against errors in the application, which may erroneously modify the data.

The replication of data in Windows Azure Storage will not protect against application errors, since these are problems at the application layer that get committed to all of the replicas that Windows Azure Storage maintains. Currently, many application developers implement their own backup strategy. The purpose of this post is to briefly cover a few strategies that one can use to back up data. We will cover backup strategies for the blob service here, and for the table service in a later post.

Backing up Blob Data

To create backups in the blob service, one first creates a snapshot of the blob. This creates a read-only version of the blob. The snapshot blob can be read, copied or deleted, but never modified. The great news here is that for a snapshot, you will be charged only for the unique blocks or pages that differ from the base blob (i.e., the blob from which the snapshot was taken). What this implies is that if the base blob is never modified, you will be charged only for a single copy of the blocks/pages.

The following code shows how to create a snapshot of a blob:

// Connect to the blob service and get a reference to the blob to protect.
CloudBlobClient blobClient = new CloudBlobClient(baseUri, credentials);
CloudBlobContainer cloudContainer = blobClient.GetContainerReference(containerName);
CloudBlob cloudBlob = cloudContainer.GetBlobReference("docs/mix2010.ppt");

// Create a read-only snapshot of the blob's current contents.
CloudBlob backupSnapshot = cloudBlob.CreateSnapshot();

When you create a snapshot, the snapshot blob gets assigned a unique timestamp. This timestamp is returned in the x-ms-snapshot response header and provides the blob snapshot name. The snapshot has the same name as the original blob, just extended with the snapshot datetime. For example, the following URI can be used to address the snapshot:

http://account.blob.core.windows.net/container/docs/mix2010.ppt?snapshot=2010-03-07T00%3A12%3A14.1496934Z

When creating a snapshot, you can store new metadata with the snapshot blob at the time the snapshot is created, but after the snapshot is created you can modify neither the metadata nor the blob. If no metadata is provided when taking the snapshot, the metadata from the base blob is copied over to the snapshot blob.

Jai continues with code for copying and deleting blobs, as well as establishing a backup strategy.
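As a rough sketch of what the copy/restore step can look like (this is not Jai's code; it reuses the cloudBlob, cloudContainer and backupSnapshot variables from the snippet above, and the listing options are my assumptions), the 1.x StorageClient lets you enumerate snapshots and copy one back over the base blob:

// List the blob together with its snapshots (flat listing plus snapshot details).
BlobRequestOptions options = new BlobRequestOptions
{
    UseFlatBlobListing = true,
    BlobListingDetails = BlobListingDetails.Snapshots
};

foreach (IListBlobItem item in cloudContainer.ListBlobs(options))
{
    CloudBlob blobOrSnapshot = (CloudBlob)item;
    // SnapshotTime is null for the base blob and set for each snapshot.
    Console.WriteLine("{0} snapshot: {1}", blobOrSnapshot.Uri, blobOrSnapshot.SnapshotTime);
}

// "Restore" by copying the snapshot's contents back over the base blob.
cloudBlob.CopyFromBlob(backupSnapshot);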

Frederico Boerr’s Escaping illegal characters for Azure row key and partition key post of 4/29/2010 explains:

I spent some hours troubleshooting the Bad Request (400) response from Azure storage when trying to add an object (context.AddObject).

The result was simple: the partition key and/or the row key contained Characters Disallowed in Key Fields.

In my case I was trying to use as the partition key the user name: "ADATUM\Mary".

Trying to find a generic solution, I URL-encoded the user name, so the partition key ended up being "ADATUM%5CMary". This value could be inserted but could not be deleted (yes, it's not mentioned as an illegal character, but still).

At this point, I had to decide whether a custom Escape method was needed or whether I would encode the partition key as Base64.

Decision table:

  • Base64 encoding. Pros: the solution is generic; it's easy and fast to implement. Cons: the output is not human-readable text.
  • Custom escaping method. Pros: total control over the output. Cons: takes time and effort; will need maintenance.

The decision was to use the Base64 encoding.

Frederico continues with the “methods for encoding/decoding and the usage.”
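For reference, here is a minimal Base64 key encode/decode sketch of my own (not Frederico's code; the KeyEncoder name is invented, and the URL-safe substitution is an added precaution because standard Base64 can emit '/', which is itself disallowed in key fields):

using System;
using System.Text;

public static class KeyEncoder
{
    // Encode an arbitrary string so it can safely be used as a partition or row key.
    public static string Encode(string value)
    {
        string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(value));
        // Swap the Base64 characters that can be problematic in key fields or URIs.
        return base64.Replace('/', '_').Replace('+', '-');
    }

    // Reverse the substitution and decode back to the original string.
    public static string Decode(string encodedKey)
    {
        string base64 = encodedKey.Replace('_', '/').Replace('-', '+');
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}

With this, KeyEncoder.Encode(@"ADATUM\Mary") produces a storable key that round-trips cleanly through Decode.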

<Return to section navigation list> 

SQL Azure Database, Codename “Dallas” and OData

Tapas Pal’s Free Microsoft Azure SQL Tools For Cloud Application Development post of 4/29/2010 to CodeGuru.com describes SQL Azure Data Sync, SQL Azure Migration Wizard, and Gem Query:

Introduction
The relational database (RDBMS) provided in Microsoft Azure is known as SQL Azure. The SQL Azure database can be easily integrated with your local SQL Server and the tools provided in Microsoft Visual Studio 2008 and 2010. Azure developers can use T-SQL scripts for querying data as they presently do for any on-premises SQL database. It's a highly available and scalable service that can be obtained by registering at the SQL.Azure.com site. Microsoft Azure doesn't provide any off-premises SQL Azure development tools or Management Studio for developers. You need to develop a local database and migrate it to SQL Azure during production deployment.
Top SQL Azure On-Premises Development Tools

Microsoft and other third-party vendors have developed a few free SQL Azure development tools for integrating your SQL Azure database with a local on-premises database. Of these, the Microsoft Sync Framework, SQL Azure Migration Wizard and Gem Query tools are the most popular and best accepted by the Azure development community. In this article we will discuss these free on-premises SQL Azure development tools and how developers can use them in their daily workflow. …

I had not heard of Gem Query before.

David Robinson explains Patching SQL Azure in this 4/30/2010 post:

You could be completely oblivious to the SQL Azure patching process and successfully use SQL Azure – however, if you are curious, here are some details. The Microsoft SQL Azure team is responsible for patching SQL Azure – you do not need to download, install, or worry about the availability of your data during patching. A SQL Server DBA usually has concerns when updating SQL Server, typically around successful installation and post-installation data accessibility. We are going to try to address these concerns and how they relate to SQL Azure in this article.

SQL Azure is a completely fault-tolerant system designed to allow for rolling updates without interruption to the service. This means that when the updates are applied there is no interruption in your ability to access your data.

We apply two types of patches to SQL Azure, operating system updates and service updates. Both types of updates can cause an established SQL connection to be dropped. For this reason, and others, you need to make sure that your code is designed to try to reestablish a connection to SQL Azure and handle the connection pool correctly on connection loss. We will be following up with another blog post on how to accomplish this. [Emphasis added.]
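Until that follow-up post appears, here is a rough sketch of one common retry pattern (illustrative only, not David's guidance; the attempt count and delay are arbitrary):

using System;
using System.Data.SqlClient;
using System.Threading;

static SqlConnection OpenWithRetry(string connectionString, int maxAttempts)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            SqlConnection connection = new SqlConnection(connectionString);
            connection.Open();
            return connection;
        }
        catch (SqlException)
        {
            if (attempt >= maxAttempts)
            {
                throw;
            }

            // A pooled connection may already have been dropped by a patch;
            // clear the pool and wait briefly before trying again.
            SqlConnection.ClearAllPools();
            Thread.Sleep(TimeSpan.FromSeconds(5 * attempt));
        }
    }
}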

Operating System Patches

Operating system updates are updates to the underlying Windows operating system. They are the same as the operating system updates issued by Microsoft worldwide via Windows/Microsoft Update.

We regularly deploy operating system updates. Critical updates are given the highest priority and are deployed immediately.

Users of SQL Azure never notice these updates because their service is never interrupted (except for established SQL connections, as mentioned before) and, unlike service updates, they do not change the SQL Azure feature set.

Service Updates

Service updates are patches, enhancements and bug fixes applied directly to the SQL Azure platform running on top of Windows. Service updates enhance and provide additional features to the SQL Azure platform.

Data Loss from Updates

There is no risk for data or schema loss for either service updates or operating system updates.

There is no need to backup your data before a scheduled service update.

You trust us to keep your data safe and we take this trust very seriously.

Long before updates are deployed into production, our dedicated testing team extensively tests the updates and their deployment. We have many dedicated testing clusters and many servers to test deploy our service release. We don’t take chances on the production servers.

SQL Azure is designed to keep your data accessible across many different kinds of downtime events like hardware failures and power loss. Included in that design is the deployment of service updates in a rolling fashion to keep your data available.

SQL Azure always keeps three replicas running to protect your data in the event of a failover (a primary replica and two secondary replicas). We never reduce the replica count to update SQL Azure. When a rolling update happens, we signal all the replicas on the server that we want to update to fail over to other servers in the datacenter. Once the replicas are safely running on other servers, we update the server and then put it back into production.

This is very different from SQL Server production environments, where SQL Server DBAs manage offline failover systems to update each server in turn. For example, a DBA with two SQL Servers with the same databases for redundancy will take one of them offline to update it, leaving themselves vulnerable to the remaining server failing during that time. With a larger budget these DBAs could replicate the redundancy features of SQL Azure by adding additional machines to handle failover during updates.

One of the great things about SQL Azure is that we maintain our failover capabilities during updates, and this is provided to you as part of the service.

Service updates do not update your data, and data availability over rolling service updates is guaranteed.

Transact-SQL

Service updates only add features to Transact-SQL. This means that if your Transact-SQL is working on SQL Azure currently, it will continue to work after a service update. Additional features, syntax and semantics could be added to the language in a service update; however, none of these will affect the execution of your stored procedures or currently running queries. In other words, we will not break you[r] code with a service update.

Dave is a technical editor of my Cloud Computing with the Windows Azure Platform book.

Roman Schindlauer suggests you Learn More about StreamInsight Complex Event Processing (CEP) and download the 32-bit and 64-bit RTW versions in this 4/30/2010 post to the SQL Server Team blog:

Hi! I'm Roman and I've been working on a really cool piece of technology in SQL Server 2008 R2. Maybe you have heard of it? It has generated a lot of buzz and interest during its CTP phase. It's called StreamInsight, Microsoft's platform for Complex Event Processing. Microsoft's information platform vision provides enterprises with a "complete approach" to managing information assets, enabling all businesses to gain strategic value from information from the desktop to the datacenter to the cloud. And StreamInsight V1 is one essential piece in this spectrum. After more than a year of blood, sweat, tears, and insane amounts of coffee we are proud to release the first version of our Complex Event Processing Framework. [Emphasis added.]

Those of you who have been following our Community Technology Previews (CTPs) throughout last year have already had the chance to familiarize yourselves with the product. Early feedback was not only incredibly positive, but also very constructive and strongly influenced the final feature set. Four notable increments over our last public CTP are:

  • Count windows
  • Non-occurrence detection (Anti-Join)
  • Dynamic query composition at runtime
  • Synchronize time across input streams

Additionally, many smaller issues and bugs were addressed. A few APIs slightly changed with respect to the November CTP, but porting your application to RTM should not require a lot of effort.

Here are the (English) bits - choosing the evaluation license during setup lets you already play with this version. Before you install, make sure to uninstall any previous CTP version:

StreamInsight X86
StreamInsight X64

Within a few days, we will update our product page and add download links and instructions there as well. The StreamInsight documentation is provided through a help file as part of the installation as well as through Books Online on MSDN. We also invite you to visit the StreamInsight Blog and the StreamInsight Forum, which is a great place to discuss questions and issues with the community and the development team.

It will be interesting to see how soon the SQL Azure team implements CEP with StreamInsight. (Comment added to Roman’s post.)

SQL Server guru Kalen Delaney asks (and answers) How Will Cloud Computing Affect Your Programming and Management Practices? in this 4/29/2010 article for the SQL Server Magazine site:

Some of you know I’ve been working with SQL Server for a long time, more than 22 years in fact, and that was after spending many years working with computers and doing programming that had nothing to do with relational databases. The first computer programming class I took used punch cards, as did the first class I taught. No knowledge of the computer infrastructure was required to submit the cards to the operator and pick up the output. It was useful to be aware of certain limitations of the system, but really, all we needed to know was how to format and submit our programs. A few years later, when we moved to terminal-based programming, we still needed no specific knowledge of the computer system itself.

We needed to know how to create and edit a file, and we needed to know the commands for compile and execute, but that was all. What kind of computer was it? Where was the processing actually taking place? Which other users were trying to run programs at the same time? I have no doubt someone knew the answers to these questions, but as programmers, we had no access to, or even interest in, this level of information.

I couldn’t help but think of programmers’ isolation from the machine when I started reading about cloud computing and Microsoft’s cloud database service SQL Azure. (You can read some basic information about SQL Azure at www.microsoft.com/windowsazure/sqlazure.) The descriptions of the service tout the fact that database programmers don’t need any knowledge of how or where the data actually exists, or who else might be using the same machine for other, possibly unrelated, purposes.

So how is cloud computing with SQL Azure different from the dumb terminal model we used so many years ago? Is it like a pendulum swinging back and forth, from programmer isolation to programmers having full knowledge and control of the physical environment and now back to programmer isolation again? Or is it more like a three-dimensional spiral; each time we come back around to a point we were at before, we’re a little higher up, with better, more powerful machines and more complex infrastructure? Plus, now we have the Internet. With SQL Azure, we aren’t limited to accessing a computer that’s in close physical proximity to the punch card stations or the dumb terminals. With SQL Azure, and cloud computing in general, you can be anywhere in the world and so can your data. In fact, all of your data doesn’t even have to be in the same physical location.

I can't begin to go into any detail here about how SQL Azure actually works and what you need to do to get set up to use it. I suggest you take a look at the Microsoft website I referenced earlier, which should give you a good start. You can even download a training kit, which is full of example scripts and hands-on labs, from the website to really get your hands dirty.

So how might working in the cloud change your practices? What would you do differently if you had no access to the physical storage mechanisms and no ability to do any kind of system configuration, monitoring, or tuning? I’m aware of the fact that there are SQL programmers not working with cloud-based data right now who have no interest in the actual physical storage or system configuration and wouldn’t know what to do with the results of any type of monitoring. But if these programmers ever did develop an interest in digging deeper, they would have the tools and resources to do that in a typical SQL Server environment.

I must admit that the thought of not having full information about my data environment available makes me uncomfortable. But maybe that’s just because I have enough experience to know what to do with that information. For users who just need to get a job done and don’t have the need to understand exactly what’s happening at every step as long as the job gets done, maybe it’s a good thing to let someone else handle the messy physical details and deal with hardware setup and SQL Server software installation and patching. It’s probably a good thing that nobody is being forced to access only data in the cloud; instead, we can choose the services and tools that work best for us. [Off-topic links removed.]

Marcello Lopez Ruiz is on an XmlReader and streaming data roll with his Layering enumerators post of 4/30/2010:

Now that I've touched upon XmlReader and how it can be used to stream data while allowing clean layering at the same time, I want to touch on the layerable interface par excellence in the .NET Framework: IEnumerable<T>.

You may have also seen the term 'composing' enumerables, but I tend to think of composing them as being able to do things with multiple enumerables at once, like joining two enumerations, and I want to stick to the simpler, one-in-one-out layering for a bit longer.

LINQ-enabled languages provide great support for layering through the use of Enumerable and Queryable classes, although their functionality is typically accessed through LINQ keywords (from, where, select, etc) or as extension methods (where customers.Select(c => c.Name) actually translates to Enumerable.Select(customers, c => c.Name)).

So let's say we have something like the following.

string[] values = new string[] { "Mary", "had", "a", "little", "lamb" };
var shortWordsInUpperCase = values.Where(v => v.Length < 4).Select(v => v.ToUpper());

What we end up with is an enumerator that does the Select on the results of the enumerator that applies a Where to an enumerator that reads from an array: a chained set of layered enumerators. The good thing is that none of these will ever be processing more than one item at a time, so if the original source were streaming them from disk, for example, we would only need enough resources for the item we're currently processing.

C# also has great support for this through the yield keyword, which lets you write methods that enumerate their results element by element.
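As a small illustration (mine, not Marcello's; the ShortWords method is hypothetical), an iterator method yields one element at a time, so it can sit anywhere in a chain of enumerators without buffering:

using System.Collections.Generic;

static IEnumerable<string> ShortWords(IEnumerable<string> source, int maxLength)
{
    foreach (string word in source)
    {
        // Each matching word is handed to the caller immediately;
        // nothing is buffered, so the chain stays streaming.
        if (word.Length <= maxLength)
        {
            yield return word;
        }
    }
}

Chaining ShortWords(values, 3).Select(v => v.ToUpper()) behaves just like the Where/Select pipeline above.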

It's often useful to think of XmlReader as a very specific enumerator that walks a document node by node; many of the same layering considerations that apply to enumerators apply to XmlReader. There is less syntactic sugar in the language for dealing with XmlReader, and typically there is more work in producing valid results so that things still work as consumers expect, with element starts balanced by element ends; but by and large, it's very straightforward to apply the design lessons from one realm to the other.

Next week, a note on streaming layered components that break streaming (yes, it may sound weird, but it's actually quite common).

Marcello Lopez Ruiz explains Catching streaming exceptions with streaming readers in this 4/29/2010 tutorial:

About two years ago, I put up a post discussing how exceptions are streamed in WCF Data Services (called ADO.NET Data Services at the time).

Today I want to continue from yesterday's post and discuss how the client implements support for this, but first a quick note.

I'd like to make sure I clarify what I mean by a "streaming server", a "streaming client", and a "streaming reader". In all cases, the idea is the same: a component doesn't need to have all data at hand to do its work - it can work as long as it has enough to make some meaningful progress.

So the server doesn't need to hold all the query results in memory before writing them out: it can start writing them out as data comes from the database. The client doesn't need to read the complete payload before processing begins: it can start materializing objects as soon as there's enough information from the network. An XmlReader component is a streaming component because it doesn't need the complete document - just enough to provide all the right information after a .Read() method call.

As we know, when the server has sent the response headers indicating success and then hits an error while sending the results, it will put the error information in the response and then terminate the connection without correctly closing out the rest of the document. This sends a very clear signal to non-streaming components that try to read the response, as the payload will fail to load in an implementation that checks for correct XML.

In the client library case, we not only want to avoid having to read the whole response before processing, but we also want to parse that last bit of XML with the error information.

Marcello continues with details of the XmlAtomErrorReader.
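Purely as an illustration of the idea (this is not the XmlAtomErrorReader implementation; the method shape is my own, though the namespace shown is the data services metadata namespace used for error payloads), a streaming consumer can watch for an in-stream <error> element as it reads:

using System;
using System.IO;
using System.Xml;

static void ReadFeedWatchingForErrors(Stream response, Action<XmlReader> processNode)
{
    using (XmlReader reader = XmlReader.Create(response))
    {
        while (reader.Read())
        {
            // The error payload is an <error> element in the data services metadata
            // namespace; surface it as soon as it shows up in the middle of the stream.
            if (reader.NodeType == XmlNodeType.Element &&
                reader.LocalName == "error" &&
                reader.NamespaceURI == "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata")
            {
                throw new InvalidOperationException(
                    "The server reported an error while streaming the response: " + reader.ReadInnerXml());
            }

            processNode(reader);
        }
    }
}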

<Return to section navigation list> 

AppFabric: Access Control and Service Bus

No significant articles today.

<Return to section navigation list>

Live Windows Azure Apps, APIs, Tools and Test Harnesses

Eugenio Pace showed Windows Azure Guidance – Part I in a single picture on 4/29/2010:

What’s covered in this first part of the Windows Azure Architecture Guide? Here’s a picture inspired by our previous guide style:

[Diagram from Eugenio's post omitted.]

And remember, this is the first guide in a series that we have planned. In the next couple of weeks we will start sharing more broadly what we are thinking for Part II.

The Windows Azure Team posted a Real World Windows Azure: Interview with Shane Leonard, Senior Director at Givedon on 4/29/2010:

As part of the Real World Windows Azure series, we talked to Shane Leonard, Senior Director at Givedon, about using the Windows Azure platform to deliver an application that helps charities raise money [URL added]. Here's what he had to say:

MSDN: Tell us about Givedon and the services you offer.

Leonard: Givedon is a not-for-profit organization based in London, England, that uses Web technologies to raise money for charities. Users can access different services on the Web, such as search, through our Web site or toolbar. We then raise money through affiliate or partnership agreements, and then we donate directly to the charities that our users choose. One hundred percent of the money we raise from search activity goes to charity.

MSDN: What was the biggest challenge Givedon faced prior to implementing the Windows Azure platform?

Shane: We needed a fast, scalable, and global platform to run our Web site and offer lightning-fast performance to users; it doesn't matter how charitable people are, if we can't offer a fast, stable service, people won't use Givedon. However, because we're a not-for-profit organization, cost-efficiency isn't an option; it's a necessity. We previously hosted the site on GoDaddy, but it was slow and lacked functionality. In particular, we were limited in our ability to offer even simple search functionality, and we also found it difficult to offer a service in multiple languages.

MSDN: Can you describe the solution you built with Windows Azure to help address your need for cost-effective scalability?

Shane: We redesigned our Web site and migrated it to the Windows Azure platform with the help of our technology partner BitStar. The architecture is simple: we use Web roles, Windows Azure Table storage to store data, and Windows Azure Blob storage to store files. We've expanded Givedon to include maps, news, images, videos, and shopping, helping ensure that we have a more complete search offering.

The interview continues with more formula Q&A.

Eugenio Pace offered Windows Azure Guidance - Additional notes on failure recovery on Windows Azure on 4/29/2010:

Things will eventually fail in your application and you need to be prepared. Most components should be designed to expect something going wrong, recover gracefully (or as gracefully as possible), and leave the system in a consistent state (eventually, in some cases).

In this post I wrote about dealing with data consistency when interacting with multiple tables in a single “unit of work”:

To summarize, step #1 and step #2 belong to a single logical unit of work. However, since there's no support for ACID transactions between two tables in Windows Azure, you might end up with "orphaned" records if something happens in between and you don't have any logic to clean things up.

Things get even more complicated as you add other resources to the unit of work. For example, in the latest version of aExpense, we also (optionally) save the images to blobs. For each blob we then write messages to a queue to notify workers that there are new images to compress. In this case, writing to the tables, writing to the blobs and writing to the queues is a single logical unit of work. And of course, there are no ACID transactions supported across all these resources. …

Eugenio continues with an approach that handles all errors in a repository.
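A bare-bones sketch of that kind of compensation logic with the 1.x StorageClient (this is not the aExpense code; the ExpenseRow type and the table, container, and queue names are invented for illustration):

using Microsoft.WindowsAzure.StorageClient;

// Hypothetical entity used only for this example.
public class ExpenseRow : TableServiceEntity
{
    public string ReceiptId { get; set; }
}

// Save the expense row, then the receipt image, then the notification message;
// if a later step fails, undo the earlier writes so no orphaned data is left behind.
static void SaveExpense(CloudTableClient tableClient, CloudBlobClient blobClient,
                        CloudQueueClient queueClient, ExpenseRow expense, byte[] receiptImage)
{
    TableServiceContext context = tableClient.GetDataServiceContext();
    CloudBlob receiptBlob = blobClient.GetContainerReference("receipts")
                                      .GetBlobReference(expense.ReceiptId);
    CloudQueue queue = queueClient.GetQueueReference("compressimages");

    context.AddObject("Expenses", expense);
    context.SaveChanges();

    try
    {
        receiptBlob.UploadByteArray(receiptImage);
        queue.AddMessage(new CloudQueueMessage(expense.ReceiptId));
    }
    catch (StorageClientException)
    {
        // Best-effort compensation: remove what was written so far, then rethrow.
        receiptBlob.DeleteIfExists();
        context.DeleteObject(expense);
        context.SaveChanges();
        throw;
    }
}

It's best-effort only (the cleanup itself can fail), which is why centralizing this handling in a repository, as Eugenio describes, matters.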

Bill Zack’s Web Application Publisher Combines Data Across Clouds post of 4/29/2010 describes the DreamFactory Suite:

DreamFactory Software is a leading publisher of rich web applications for cloud platforms. The DreamFactory Suite delivers enterprise-class project, document and data collaboration software to over 10,000 businesses using different cloud platforms.

They like to say they are “cloud agnostic” and recently they launched their suite on Microsoft Windows Azure.  DreamFactory offers the ability to run on top of a number of cloud platforms as well as to interoperate between them.

DreamFactory took less than a month to migrate its applications to Windows Azure. DreamFactory customers are able to leverage a pay-as-you-go pricing model that enables them to purchase processing and storage as needed. Read more here.

<Return to section navigation list>

Windows Azure Infrastructure

Chris Hoff (@Beaker) asks Dear SaaS Vendors: If Cloud Is The Way Forward & Companies Shouldn’t Spend $ On Privately-Operated Infrastructure, When Are You Moving Yours To Amazon Web Services? on 4/30/2010:

We’re told repetitively by Software as a Service (SaaS)* vendors that infrastructure is irrelevant, that CapEx spending is for fools and that Cloud Computing has fundamentally changed the way we will, forever, consume computing resources.

Why is it then that many of the largest SaaS providers on the planet (including firms like Salesforce.com, Twitter, Facebook, etc.) continue to build their software and choose to run it in their own datacenters on their own infrastructure?  In fact, many of them are on a tear involving multi-hundred million dollar (read: infrastructure) private datacenter build-outs.

I mean, SaaS is all about the software and service delivery, right?  IaaS/PaaS is the perfect vehicle for the delivery of scaleable software, right?  So why do you continue to try to convince *us* to move our software to you and yet *you* don’t/won’t/can’t move your software to someone else like AWS?

Hypocricloud: SaaS firms telling us we’re backwards for investing in infrastructure when they don’t eat the dog food they’re dispensing (AKA we’ll build private clouds and operate them, but tell you they’re a bad idea, in order to provide public cloud offerings to you…)

Quid pro quo, agent Starling.

Good point, Beaker, but the post’s title has more characters (164) than allowed for Tweets.

Lydia Leong said she’s willing to pay for The convenience of not coping in her 4/30/2010 post to her Cloud Pundit: Massive-Scale Computing blog:

There’s a lot to be said for the ability to get a server for less than the price of a stick of chewing gum.

But convenience has a price, and it’s sufficient that shared hosters, blog hosters, and other folks who make their daily pittance from infrastructure-plus-a-little-extra aren’t especially threatened by cloud infrastructure services.

For instance, I pay for WordPress to host a blog because, while I am readily capable of managing a cloud server and everything necessary to run WordPress, I don’t want to deal with it. I have better things to do with my time.

Small businesses will continue to use traditional shared hosting or even some control-panel-based VPS offerings, despite the much-inferior price-to-resource ratios compared to raw cloud servers, because of the convenience of not having to cope with administration.

The reason why cloud servers are not a significant cost savings for most enterprises (when running continuously, not burst or one-time capacity) is that administration is still a tremendous burden. It's why PaaS offerings will gain more and more traction over time, as the platforms mature, but also why those companies that crack the code to really automating systems administration will win over time.

I was pondering this equation while contemplating the downtime of a host that I use for some personal stuff; they’ve got a multi-hour maintenance downtime this weekend. My solution to this was simple: write a script that would, shortly before shutdown time, automatically shut down my application, provision a 1.5-cent-an-hour cloud server over on Rackspace, copy the data over, and fire up the application on its new home. (Note: This was just a couple of lines of code, taking moments to type.) The only thing I couldn’t automate was the DNS changeover, since I use GoDaddy for primary DNS and they don’t have an API available for ordinary customers. But conveniently: failover, without having to disrupt my Saturday.

But I realized that I was paying, on a resource-unit equivalent, tremendously more for my regular hosting than I would for a cloud server. Mostly, I’m paying for the convenience of not thinking — for not having to deal with making sure the OS is hardened, pay attention to security advisories, patch, upgrade, watch my logs, etc. I can probably afford the crude way of not thinking for a couple of hours — blindly shutting down all ports, pretty much — but I’m not comfortable with that approach for more than an afternoon.

This is, by the way, also a key difference between the small-business folks who have one or two servers, and the larger IT organizations with dozens, hundreds, or thousands of servers. The fewer you’ve got, the less efficient your labor leverage is. The guy with the largest scale doesn’t necessarily win on cost-efficiency, but there’s definitely an advantage to getting to enough scale.

Ellen Rubin analyzes branch offices' backhauling of cloud traffic to central corporate hubs in her Hubs, Spokes and WANs post of 4/29/2010 to the CloudSwitch blog:

Recently, we’ve had a number of discussions with enterprises about how they’d like to use the cloud. The basic use case is around capacity on-demand (not surprisingly), but the specifics have raised some interesting issues. The companies have distributed branch offices that need the capacity for a range of applications, including dev/test environments as well as back-office and web apps. Today, these distributed groups are relying on corporate IT to meet their scaling and infrastructure needs, and they are frequently bottlenecked. This is both in terms of overall challenges in getting new capacity approved in a timely way, but also from a network bandwidth perspective. At a panel this week at Interop, Riverbed noted that 2/3 of their enterprise customers have a hub and spoke model that requires the “spokes” to backhaul to the “hub” for connectivity to the internet, and thus to cloud computing services. Only the remaining 1/3 have direct connections. At the same panel, Blue Coat agreed with the stats but commented that the branch sites are trending towards a direct-connect model as new sites are added.

All this is interesting to us at CloudSwitch since we have been hearing more and more frequently from enterprises that want more “edge” computing, and to empower the branch offices to add capacity on-demand in a controlled but self-service way. This creates a set of new requirements around cloud computing, in terms of both networking and security. In the hub and spoke model, corporate IT maintains control over all access to the cloud, which has benefits on the security and permissions side, but creates potential bottlenecks – both in terms of the need for self-service management tools to increase agility, as well as in bandwidth constraints where the backhaul traffic starts to strain the corporate networks. Backhauling also creates strain on the branch offices since it often adds significant latency to their internet connections.

Most of the vendors at the Interop panel (including Akamai, Riverbed, Ipanema and Blue Coat) claimed to be developing or are already offering WAN optimization products – increasingly in the form of virtual appliances and/or software versions – to help alleviate these bottlenecks. These will surely help, but will become even more important as the branch offices start to have more direct connectivity to the cloud. WAN optimization offerings at the “edge” will be increasingly needed, and cloud service providers are focused on building out these capabilities at their end of the network. Security in a more distributed model will also require some new thinking, since users in the branches will want to maximize flexibility and agility, while corporate IT will still need a way to limit potential threats and exposure created by opening these direct connections.

Underlying all these discussions is the fundamental issue of the laws of physics. As enterprises start to embrace the cloud model, they’ve realized that the major choke-point will be their network bandwidth. Innovation around addressing these issues, especially in the virtualized world of the cloud, will definitely be required. At CloudSwitch, we’re staying closely involved in discussions around customer requirements and vendor offerings to increase performance for workloads moving to the cloud.

William Vambenepe’s Flavors of PaaS post of 4/29/2010 asks:

How many flavors of PaaS do we need?

  • PaaS for business apps
  • PaaS for toy apps (simple form-based CRUD) and simple business front ends (e.g. restaurant web sites)
  • PaaS for games, mash-ups and social apps
  • PaaS for multimedia delivery and live streams
  • PaaS for high performance and scientific computing
  • PaaS for spamming, hacking and other illegal activities (Zombies as a Service)
  • Other?

BTW, doesn’t “flavors of PaaS” sound like the name of a perfume? At least when I say it with my French accent it does.

<Return to section navigation list> 

Cloud Security and Governance

Chris Hoff (@Beaker) asserts that You Can’t Secure The Cloud… in this 4/30/2010 post to his Rational Survivability blog:

That’s right. You can’t secure “The Cloud” and the real shocker is that you don’t need to.

You can and should, however, secure your assets and the elements within your control that are delivered by cloud services and cloud service providers, assuming of course there are interfaces to do so made available by the delivery/deployment model and you’ve appropriately assessed them against your requirements and appetite for risk.

That doesn’t mean it’s easy, cheap or agile, and lest we forget, just because you can “secure” your assets does not mean you’ll achieve “compliance” with those mandates against which you might be measured.

Even if you’re talking about making investments primarily in solutions via software thanks to the abstraction of cloud (and/or virtualization) as well adjusting processes and procedures due to operational impact, you can generally effect compensating controls (preventative and/or detective) that give you security on-par with what you might deploy today in a non-Cloud based offering.

Yes, it’s true. It’s absolutely possible to engineer solutions across most cloud services today that meet or exceed the security provided within the walled gardens of your enterprise.

The realities of that statement come crashing down, however, when people confuse possibility with the capability to execute whilst not disrupting the business and not requiring wholesale re-architecture of applications, security, privacy, operations, compliance, economics, organization, culture and governance.

Not all of that is bad.  In fact, most of it is long overdue.

I think what is surprising is how many people (or at least vendors) simply suggest or expect the “platform” or service providers to do all of this for them across the entire portfolio of services in an enterprise. In my estimation that will never happen, at least not if one expects anything more than commodity-based capabilities at a cheap price while simultaneously being “secure.”

Vendors conflate the various value propositions of cloud (agility, low cost, scalability, security) and suggest you can achieve all four simultaneously and in equal proportions.  This is the fallacy of Cloud Computing.  There are trade-offs to be found with every model and Cloud is no different.

If we’ve learned anything from enterprise modernization over the last twenty years, it’s that nothing comes for free — and that even when it appears to, there’s always a tax to pay on the back-end of the delivery cycle.  Cloud computing is a series of compromises; it’s all about gracefully losing control over certain elements of the operational constructs of the computing experience. That’s not a bad thing, but it’s a painful process for many.

I really enjoy the forcing function of Cloud Computing; it makes us re-evaluate and sharpen our focus on providing service — at least it’s supposed to.  I look forward to using Cloud Computing as a lever to continue to help motivate industry, providers and consumers to begin to fix the material defects that plague IT and move the ball forward.

This means not worrying about securing the cloud, but rather understanding what you should do to secure your assets regardless of where they call home.

<Return to section navigation list> 

Cloud Computing Events

No significant articles today.

<Return to section navigation list> 

Other Cloud Computing Platforms and Services

Maureen O’Gara claims “Of course cloud-aware printers – which are not the same as web-connected printers – don’t exist yet” as a preface to her Google Wrestles with Cloud Printing post of 4/30/2010:

Google is making a provision for Chrome OS to print web, native desktop or mobile apps on any device to any printer - both cloud-aware and legacy - anywhere in the world.

Of course cloud-aware printers - which are not the same as web-connected printers - don't exist yet but Google imagines printers one day being registered with one or more cloud print services.

It's gotten as far as publishing the schematic below and the in-progress design documents and protocol specifications for a Web Service it calls Google Cloud Print.

It looks like PC-like complex print subsystems and print drivers for each platform are out and APIs are in.

It says it expects third-party cloud print services too.

It's starting with Windows and will support Mac and Linux later.

Of course legacy printers will require an Internet connection until somebody figures out a way of not having the PC on.

See http://code.google.com/apis/cloudprint/docs/faq.html [for more info].

Chris Czarnecki’s Amazon EC2 Infrastructure Fosters Platform as a Service Innovation post of 4/20/2010 to the Learning Tree blog discusses Cloud Foundry and Heroku:

Commenting last week on Steve Ballmer’s claims on the Microsoft Azure platform got me thinking about some of the differences between the main cloud providers. One of the major strengths of Amazon as a cloud provider is the wide range of services it provides, ranging from servers, storage and messaging to payment services. Coupled to this are the flexible modes of accessing these services. Take EC2: servers can be provisioned using a browser-based management tool, via command line tools and also programmatically using Web services APIs. The fact that infrastructure such as servers can be provisioned and then de-provisioned programmatically offers the scope for organisations to build high-level tools to utilise and manage the Amazon cloud infrastructure.

Two examples of how the Amazon Infrastructure as a Service (IaaS) has been used to deliver a Platform as a Service (PaaS) are Cloud Foundry and heroku. Cloud Foundry is an enterprise Java PaaS; heroku is a cloud-based Ruby PaaS. Both Cloud Foundry and heroku provide development and deployment environments for developers, together with the ability to monitor and manage applications, automatically fixing failed servers and instances and auto-scaling applications as the load varies.

The major benefit to developers using these two innovative platforms is that they can utilise all their existing skills (Java, Ruby), build applications using the tools they are familiar with and then seamlessly deploy these applications to a cloud to leverage the associated benefits transparently.

In summary, the above two examples highlight how Amazon’s flexible cloud infrastructure has enabled two organisations to build innovative PaaS businesses that will enable a wide base of developers to rapidly begin benefitting from cloud computing.

James Hamilton analyzes Yahoo’s new chicken-coop data center design in his Yahoo! Computing Coop post of 4/30/2010:

Rich Miller of Datacenter Knowledge covered this last week and it caught my interest. I’m super interested in modular data centers (Architecture for Modular Datacenters) and highly efficient infrastructure (Data Center Efficiency Best Practices) so the Yahoo! Computing Coop caught my interest.

As much as I like the cost, strength, and availability of ISO standard shipping containers, 8’ is an inconvenient width. It’s not quite wide enough for two rows of standard racks, and there are cost and design advantages in having at least two rows in a container. With two rows, air can be pulled in each side with a single hot aisle in the middle and large central exhaust fans. It’s an attractive design point, and there is nothing magical about shipping containers. What we want is commodity, prefab, and a moderate increment of growth.

The Yahoo design is a nice one. They are using a shell borrowed from a Tyson Foods design. Tyson produces a large share of North America’s chicken. These prefab facilities are essentially giant air handlers, with the shell making up a good part of the mechanical plant. They pull air in either side of the building; it passes through two rows of servers into the center of the building. The roof slopes to the center from both sides, with central exhaust fans. Each unit is 120’ x 60’ and houses 3.6 MW of critical load.

Because of the module width they have 4 rows of servers. It’s not clear whether the air from outside has to pass through both rows to get to the central hot aisle, but it sounds like that is the approach. Generally, serial cooling, where the hot air from one set of servers is routed through another, is worth avoiding. It certainly can work, but it requires more air flow than single-pass cooling using the same approach temperature.

Yahoo! believes they will be able to bring a new building online in 6 months at a cost of $5M per megawatt. In the Buffalo, New York location, they expect to use process-based cooling only 212 hours/year and have close to zero water consumption when the air conditioning is not in use. See the Data Center Knowledge article for more detail: Yahoo Computing Coop: Shape of Things to Come?

More pictures at: A Closer Look at Yahoo’s New Data Center. Nice design Yahoo.

The primary advantage of ISO containers I see is their portability and low transportation cost.

Bruce Guptill’s VMforce: Salesforce.com + VMware = Java for the Cloud? Saugatuck Research Alert of 4/29/2010 continues the analyst onslaught (site registration required):

On Tuesday April 27, 2010, Salesforce.com and VMware jointly announced VMforce, a new Cloud-based platform for Java-based enterprise application development.
As announced, VMforce will use Salesforce’s Force.com physical infrastructure to run VMware vSphere with a customized VMware vCloud layer that runs the SpringSource TC Server virtual Java stack. The resulting Spring Framework in the Cloud can be used by developers to produce Java applications that run on Force.com. According to the two companies, almost any enterprise Java-based application can be ported to VMforce via a drag-and-drop interface. VMforce thus enables Cloud deployment of Java-based applications developed traditionally or in the Cloud.

Pricing was not announced, although Salesforce CEO Marc Benioff claimed it would be priced to enable Java development “at half the cost” of traditional development environments. VMforce is scheduled to become available for what the two companies called “developer preview” late in 2010. …

Bruce continues with explanations of “Why Is It Happening” and the “Market Impact.” Hopefully, this will be the last item about the VMforce announcement until they release a beta version of the product.

Prashant Rai makes the case for use cases in his SAPs use of AWS post of 4/28/2010 to the Enterprise Irregulars blog:

One of my often-repeated complaints is the lack of sufficient focus on use cases & case studies for the “enterprise”, both large & SMB. If you go to the case studies section of the AWS site, you can see that the bulk of them are startups or ISVs; very few are “Enterprise”, and even those are not the most useful / relevant case studies.

In this regard, AWS today posted slides from their “AWS Enterprise events” held in several cities this year. I don’t see too many “Enterprise” case studies, but the presentation by SAP was pretty interesting. Below are some excerpts from the slides:

SAP

SAP-AWS

Here are some excerpts

  1. Top 3 consuming departments got an average cost saving rate of 77%
  2. SAP AWS Footprint
    1. 1100 new SAP Systems
    2. 42,086 EC2 instance hours
    3. 39 TB EBS storage
    4. 3 TB S3 Storage
  3. Primary use cases
    1. Workshops
    2. Trainings
    3. Demos / Sandboxes
    4. Testing
    5. Development
  4. Plan for Extended Clouds for SAP Customers – POTENTIAL >2,000,000 Systems
    1. Certification by SAP
    2. Support by SAP
    3. Tools & Solutions by SAP & Partners
    4. Services by SAP & Partners

SAP AWS Usage

<Return to section navigation list> 
