Wednesday, January 12, 2011

Windows Azure and Cloud Computing Posts for 1/12/2011+

A compendium of Windows Azure, Windows Azure Platform Appliance, SQL Azure Database, AppFabric and other cloud-computing articles.

Note: This post is updated daily or more frequently, depending on the availability of new articles.

Azure Blob, Drive, Table and Queue Services

Brian Swan explained Using the Windows Azure Storage Explorer in Eclipse in a 1/11/2011 post:

Some time back I wrote a post that showed how to get started with creating Windows Azure PHP projects in Eclipse. What I didn’t talk about in that post was a nifty Eclipse feature (the Windows Azure Storage Explorer) that allows you to easily manage your Windows Azure blobs, tables, and queues. That’s what I’ll look at in this post.

What is the Windows Azure Storage Explorer?

The Windows Azure Storage Explorer is a feature of the Windows Azure Tools for Eclipse that allows you to easily manage your Windows Azure blobs, queues, and tables. It simply gives you a convenient UI for inserting or deleting objects in your Windows Azure storage. I have found it very helpful for verifying scripts that interact with my Azure Storage account (Did that script actually upload the image? Was that table entry deleted?) and for cleaning things up after I’ve finished developing a script. Here’s a screenshot from my Eclipse installation (I’ve minimized the PHP Project Explorer):


Of course, to make use of the Storage Explorer, you need an Azure Storage account. Directions for creating one are in the “How do I create a storage account?” section of this post: How to Run PHP in Windows Azure.

Accessing the Windows Azure Storage Explorer

To access the Windows Azure Storage Explorer, you need to do two things:

  1. Install the Windows Azure Tools for Eclipse. Instructions for doing this are in this post: Using the Windows Azure Tools for Eclipse with PHP.
  2. Open the Windows Azure Perspective from the Window menu:

With a bit of moving windows around, you should now be able to see something like this:


Note that the storage account shown above is devstorageaccount1. This is the local storage account that simulates real Azure storage when you are running an application in the development fabric (a local environment that simulates the Azure environment, usually used for testing purposes).

To add an actual Azure storage account, click Manage. In the resulting dialog, click Add:


Now provide your storage account name and key and click OK:


You can add as many accounts as you like and toggle back and forth between them in the Storage Explorer:


Once you have selected your account, click Open and you should be ready to manage your blobs, queues, and tables.

Managing Blobs

Start by clicking on Blob below the Open button…


…which will show you the blob containers that are in your storage account. By clicking on any one of the container names, you can see the blobs in that container and metadata about the container:


By right-clicking a blob name, you have a wide variety of options:


I leave exploration of those options as an “exercise for the reader.” :-)

Managing Queues

Now click Queue to get a list of your Azure queues, then click on one of the queues to see queue properties and message information:


If you click the Metadata tab, you can see queue metadata as key-value pairs:


Finally, right-click a message to get a list of message options (which, again, are left as an exercise for the reader):


Managing Tables

Lastly, managing tables is very similar to managing blobs and queues. Click Tables to get a list of tables, and click a table name to see table items and table properties:


Click a table item and you can get item properties (you can get the same property information in text format by clicking the Text tab):


And, as with blobs and queues, right-click an item and you will get a list of options:


And that’s it. I find the UI to be fairly intuitive…enough so that perhaps the screenshots above aren’t necessary to figure things out. But, I think the Storage Explorer is a little-known feature in the Windows Azure Tools for Eclipse, and I have found it useful in many ways…worth highlighting anyway.

No significant articles today.

SQL Azure Database and Reporting

Jim Skurzynski described Delivering Geospatial Information Systems to the Mainstream in a 1/11/2011 post to the HPC in the Cloud blog:

2011 will be the year Geospatial Information Systems (GIS) in the cloud goes mainstream.

Despite the recession – or arguably because of the recession – the cloud is changing the way GIS services are developed and delivered. This year promises major and fast-paced changes: a lower-cost infrastructure for developing spatial applications; more robust, creative and sophisticated business, government and consumer uses; and a rapidly-expanding pool of spatial technology users.

At geospatial technology conferences and workshops in 2010, the conversation in the GIS world was centered upon questions of, “should we move to the cloud?” While some major industries and companies continue to struggle with this issue, for most the question has shifted to “how do we move to the cloud?”

A result of this shift was the launch of Directions Magazine’s Location Intelligence for Geospatial Cloud Computing Executive Symposium in 2009. In 2010 the conversation at the symposium focused on the opportunities SaaS offers to developers, the ROI of cloud architecture, and how new Internet-based solutions may change the face of geospatial technology delivery.

What Trends Have an Impact on Movement of GIS to the Cloud?

What has caused the shift? Several important trends, some financial and some technological, are coming together to accelerate interest in cloud-based GIS:

1. Mapping / GIS and cloud computing are converging. In late 2010, Pitney Bowes Business Insight and Microsoft announced the integration of their respective desktop GIS and mapping platforms, calling it “one more example of how divergent solutions are coming together to provide greater insight to analysts and organizations.” New cloud-based applications, fostered by wider adoption of spatial technology like Google Maps™, Bing Maps™ and Microsoft’s SQL Server, are actually pushing desktop GIS developers to integrate new cloud-type features. Even the most traditional desktop technologies like those offered by Esri are moving into the cloud, a clear indication that the cross-pollination will continue.

2. The recession has spurred adoption of the cloud. Rather than slowing down the movement of GIS into the cloud, the recession has actually accelerated the change. Providing GIS services in the cloud is actually more cost-effective than providing desktop services. While many businesses weren’t quite ready to commit to a cloud-based future, the recession put such a crimp on many budgets that lower-cost cloud services suddenly became the only option. In other cases, companies supplemented their overtaxed IT departments with some cloud-based functions during the recession. Either way, those companies are now often satisfied with the SaaS applications they’re using and less likely to backtrack to a desktop system when the economy takes off again. They also found during the recession that letting someone else handle the technical part of GIS allows them to focus on their core business.

3. Consumer demand for location technology is changing the marketplace. For many consumers, the rapid rise of mapping platforms such as Google Maps and Bing Maps was the light switch that flipped on to show the value of spatially-enabled applications. More than a billion people have used Google Maps, more and more often from a mobile phone to get directions or find local businesses. In 2010, Twitter added a geo API and Google improved its Maps API to support spatial search and search feeds, changes that further help developers bring location intelligence to any application.

As consumers see the power of location, without even knowing what GIS means, they are increasingly expecting business and government to offer them spatial tools. In response, more and more businesses are now relying on GIS to automate decision-making. For example, Computerworld reported that General Motors and other automakers used GIS tools to help figure out which auto dealerships should be closed. Government agencies are finding new uses for GIS analytics such as monitoring properties at high risk for foreclosures.


Morten Cristensen posted Umbraco on Azure Series: SQL Azure on 1/10/2011:

There has been a lot of buzz around the Azure Accelerator for Umbraco by Microsoft, but it doesn’t seem like many have tried it out yet. It might be because it is a little complicated to get up and running. But with Windows Azure Pass (which gives you 30 days of free access to Windows Azure and SQL Azure) and this blog series there shouldn’t really be any excuse.

First off I need to give some credit to Daniel Bardi for writing this 23 step guide on the wiki, “Installing Umbraco to SQL Azure”, and to Microsoft for creating the Azure Accelerator project (developed by Slalom Consulting). The guides that are available helped me successfully deploy an Umbraco database to SQL Azure and an instance to the Windows Azure Hosted Services.
So why am I writing this post if guides already exist? Well, I found a couple of gotchas along the way that I think others can benefit from. And if you have got limited to no experience with Azure or if you find the existing guides too techie or complicated this post is for you.

On a side note: On the Azure Accelerator project page on CodePlex you will find two extensive guides to deploying Umbraco to Azure, but nothing about the database. I’m not sure if this is simply because they have used an embedded database, but it confused me the first time around.

The focus of this first post is SQL Azure, because I found it the best approach to get the db up and running first. Best see the database working before we start deploying the Umbraco solution, right? (If you were using an embedded database this wouldn’t be necessary, but since SQLCE is still in beta and VistaDB is out of the picture, SQL Azure is the best option in my opinion.)


  • SQL Server 2008 – Only needed if you don’t already have an SQL Server or SQL Server Express available.
  • SQL Management Studio 2008 R2 Express – If you already have an SQL Server installed then you just need to download the SQL Server Management Studio Express (second column), which is needed to connect to SQL Azure among a couple of other things.
  • Local Internet Information Server (IIS 7.5).
  • Umbraco v.4.5.2 ASP.NET 3.5.
  • Windows Azure account with access to Hosted Services and Storage (will be used in next post), and of course access to SQL Azure.

1.) First thing you want to do is to set up the database server and database instance on SQL Azure. I’m using the new layout on the Windows Azure Platform and I recommend you do the same.
When you log in you should see a left column similar to the screenshot below:

The interesting thing here is the Database, so click it and you should see your subscriptions for SQL Azure.
From the top menu click Create Server to set up your Database Server instance. You will need to select a Region for the server, and an administrative user and password. Select a region that is close to you (e.g. I have selected North Europe).
The name of the server will be generated for you and the DNS to the server will be
When the server is up and running you need to configure Firewall Rules, otherwise you will not be able to connect to it from your local machine. So add an IP range that includes your local IP (see example below).

With the database server and firewall set up you can now create a new database, but instead of doing this through the portal we will create a .dacpac and an SQL script to create, or rather deploy, the Umbraco database to SQL Azure. The next steps will take you through the process of creating these two files, and finally deploying them to SQL Azure.

2.) Now that you have SQL Azure set up, you need to make a local installation of Umbraco. This is just a regular installation of Umbraco, so just do what you normally do to set up a site in your IIS.
One very important thing is to do a “clean” installation, which means to let Umbraco run its install script to setup the db, but don’t install runway, cws or any other starterkit. Keeping the database clean will make it easier to deploy.

A side note for installing Umbraco with regards to the upcoming post is to install it to IIS’ default site. If you have the possibility to do this it will save you from editing a couple of settings when deploying Umbraco, as the Azure Accelerator is set to the default IIS site (but it can of course be changed).

3.) Third step is to make the local database deployable – this is also step 3 in the 23-step guide. So open up Management Studio R2 and navigate to the database for your Umbraco install – I have called mine UmbracoAzure.
Expand the database, then Tables and find the table called “umbracoUserLogins”, right click and select Design from the menu. In the design view you select the two rows called contextID and userID, right click and select Set Primary Key. Now save the changes to the table and you should be ready for the next step.

4.) This step will cover steps 4-6 from the 23-step guide. The files that are generated in this step and the next are available for download at the end of this post.
Close the design view, and select your database, right click and navigate to Tasks -> Extract Data-tier Application (if you don’t have this option you probably don’t have Management Studio R2 Express installed).

A new dialog will appear, which will guide you through the creation of a .dacpac file and an sql script.
Click next and verify Application name (same as database), and make a note of the location in “Save to DAC package file” as we will need this file to deploy the local database to SQL Azure. Click next a couple of times to finish generating the file.

5.) This step will cover steps 7-13 from the 23-step guide.
Once again go back and select the database, right click and select Tasks -> Generate Scripts.

A new dialog will appear, which will guide you through the creation of an sql script with inserts for the database on SQL Azure.
Click next and change the default radio button selection to “Select specific database objects” and check the Tables checkbox, as you only want to generate a script for the tables. This is an easy step as a default Umbraco database only contains tables.

On the next screen click the Advanced-button and find the row with “Types of data to script” in the new dialog, and change it to Data only. Click OK, note the location of where the file is saved and click next, next and finally finish.

6.) Now we have the two files needed to deploy the database to SQL Azure, but you need to move a single line in the SQL script before deploying anything. This is the first gotcha!

Open up your SQL file with the data inserts and look at line 29, where the inserts for the umbracoNode table begin. If you look at the values you will notice that the ids are ascending (-92, -90, -89, etc.), but all of them have a parentID of -1. The node with id -1 is the Umbraco master root, which you need to move up so it’s the first insert in the umbracoNode table. If you don’t, you will get various insert errors while deploying. You will still be able to log in to Umbraco, but if you navigate to the Developer section and expand the DataType folder you will notice that something is missing.

The edited script is available for download at the end of this post (login is: admin and password: b).
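If you would rather not edit the script by hand, the move described in step 6 can be sketched in a few lines of C#. This is only an illustration under stated assumptions: the class and method names (UmbracoScriptFixer, MoveRootNodeFirst) are made up for this example, the generated script is assumed to hold one INSERT statement per line, and the id column is assumed to be the first value in each VALUES list. Check your own generated file before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class UmbracoScriptFixer
{
    // Assumed line shape: INSERT [dbo].[umbracoNode] ([id], ...) VALUES (-1, ...)
    private static readonly Regex NodeInsert =
        new Regex(@"^\s*INSERT\s+(\[dbo\]\.)?\[umbracoNode\]", RegexOptions.IgnoreCase);

    // Matches a VALUES list whose first value is exactly -1 (the master root id).
    private static readonly Regex RootValues =
        new Regex(@"VALUES\s*\(\s*-1\s*,", RegexOptions.IgnoreCase);

    // Returns a copy of the script in which the root node insert
    // precedes every other umbracoNode insert.
    public static List<string> MoveRootNodeFirst(IEnumerable<string> lines)
    {
        var result = lines.ToList();
        int firstInsert = result.FindIndex(l => NodeInsert.IsMatch(l));
        int rootInsert = result.FindIndex(l => NodeInsert.IsMatch(l) && RootValues.IsMatch(l));

        if (firstInsert >= 0 && rootInsert > firstInsert)
        {
            string root = result[rootInsert];
            result.RemoveAt(rootInsert);
            result.Insert(firstInsert, root);
        }
        return result;
    }

    public static void Main()
    {
        // Shortened, made-up sample lines; a real script has more columns.
        var script = new[]
        {
            "SET IDENTITY_INSERT [dbo].[umbracoNode] ON",
            "INSERT [dbo].[umbracoNode] ([id], [parentID]) VALUES (-92, -1)",
            "INSERT [dbo].[umbracoNode] ([id], [parentID]) VALUES (-90, -1)",
            "INSERT [dbo].[umbracoNode] ([id], [parentID]) VALUES (-1, -1)"
        };

        foreach (var line in MoveRootNodeFirst(script))
        {
            Console.WriteLine(line);
        }
    }
}
```

To apply it to a real file you could read the script with File.ReadAllLines, pass the lines through MoveRootNodeFirst, and write the result back with File.WriteAllLines. The method leaves the input untouched when the root insert is already first or cannot be found.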

7.) Disconnect from your local database server and connect to SQL Azure using the connection info that you got while setting up the database server in step 1 and 2.

When you are connected you can simply expand the Databases folder and see your databases if you have created any through the WindowsAzurePlatform portal. The cool thing about the R2 version of SQL Management Studio is that it allows you to connect to SQL Azure as if it were just another SQL Server. Microsoft has also created an online management tool that you can use to do a lot of the same stuff as in Management Studio. You can access this online tool from the WindowsAzurePlatform portal by clicking Manage from the top menu (in the Database section), but I prefer to use Management Studio. You will need Management Studio to deploy the .dacpac file, which creates the database with tables and constraints.

When you are logged into your SQL Azure database server, right click on the server and select “Deploy Data-tier Application” from the menu. This will open up a dialog where you simply select the .dacpac file, which was generated in a previous step, click next a couple of times and when you are done you have a new database with all the tables of a normal Umbraco install. Next step is to insert the default data.

8.) From the menu in SQL Management Studio R2 Express click File -> Open -> File and locate the sql file that was previously generated and re-organized. Make sure the newly created database is selected or that the script starts with “Use [DATABASENAME]” and click Execute in the menu (might be an idea to click Parse first to verify there are no errors in the script).
Execute should run without any errors, so if you encounter any errors you should review the previous steps, delete the database and try again.

9.) With the database set up you can now verify that it is in fact working. Go back to your local install of Umbraco and change the connection string in web.config to something like this:

<add key="umbracoDbDSN" value=";Database=UmbracoAzure;User ID=USERNAME;Password=PASSWORD;Trusted_Connection=False;Encrypt=True;" />

Note: If you can’t access the database it might be that you need to review your firewall settings for the SQL Azure database server.

And there you go, now you have deployed a standard Umbraco database to SQL Azure. Next post will go through the deployment of the Umbraco solution.

Package with the two scripts needed to deploy to SQL Azure:
Zip contains both the .dacpac and .sql files.

Morten is a danish .NET web developer working for a Copenhagen based company called Codehouse with a primary focus on Sitecore solutions (developing, upgrading and supporting).

See Morten Cristensen posted Umbraco on Azure Series: Deploying to Azure with Accelerator on 1/12/2011 in the Live Windows Azure Apps, APIs, Tools and Test Harnesses section below.

MarketPlace DataMarket and OData

Jorge Fioranelli published OData validation using DataAnnotations on 1/12/2011:

WCF DataServices (OData) becomes more popular every day. If you need to expose data between different applications or even between different tiers of the same application, it is definitely a really good option.

Unfortunately, one of the missing features of the current version (v4.0) is the validation using DataAnnotations (if you want that feature to be implemented in the next version, just vote for it here). In this post I am going to show how to implement this validation using a ChangeInterceptor. I hope it helps you.

Here we have a Customer object (POCO) that is mapped in the Entity Framework model:

public class Customer
{
    public int Id { get; set; }
    [Required]
    public string Name { get; set; }
}

As you can see, the Name property is decorated with the Required attribute (DataAnnotation).

In the following code snippet we can see the WCF DataService that is exposing the Customers EntitySet:

public class WcfDataService : DataService<DatabaseEntities>
{
    public static void InitializeService(DataServiceConfiguration config)
    {
        config.SetEntitySetAccessRule("Customers", EntitySetRights.All);
        config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
    }
}

So far nothing new. The first thing we need to do in order to add the validation logic is to create a ChangeInterceptor method:

public class WcfDataService : DataService<DatabaseEntities>
{
    [ChangeInterceptor("Customers")]
    public void ValidateCustomers(Customer customer, UpdateOperations operation)
    {
        // Validation logic
    }
}

After that, we just need to add the following validation logic:

public void ValidateCustomers(Customer customer, UpdateOperations operation)
{
    // Only validates on inserts and updates
    if (operation != UpdateOperations.Add && operation != UpdateOperations.Change)
    {
        return;
    }

    // Validation
    var validationContext = new ValidationContext(customer, null, null);
    var result = new List<ValidationResult>();
    if (!Validator.TryValidateObject(customer, validationContext, result))
    {
        throw new DataServiceException(
            result
                .Select(r => r.ErrorMessage)
                .Aggregate((m1, m2) => String.Concat(m1, Environment.NewLine, m2)));
    }
}

As you can see I am using the Validator class (System.ComponentModel.DataAnnotations namespace) and I am throwing a DataServiceException in case the validator finds any error.
If you are using the MetadataType attribute instead of having the DataAnnotation attributes in your POCO, you need to add some additional lines of code in order to support that. Check the attached source code to see how to do it.
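The validation step can also be tried on its own, outside of a data service. The following is a minimal sketch, not the author's attached code: the StringLength attribute, the custom error messages and the AnnotationValidation helper are assumptions added for illustration. It also shows one detail worth knowing: the three-argument TryValidateObject overload only checks Required attributes, so the sketch passes true for validateAllProperties to evaluate the other annotations as well.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;

public class Customer
{
    public int Id { get; set; }

    // The error messages and the StringLength rule are illustrative assumptions.
    [Required(ErrorMessage = "Name is required.")]
    [StringLength(10, ErrorMessage = "Name must be 10 characters or fewer.")]
    public string Name { get; set; }
}

public static class AnnotationValidation
{
    // Returns the error messages of all failed annotations;
    // an empty list means the entity passed validation.
    public static IList<string> GetErrors(object entity)
    {
        var context = new ValidationContext(entity, null, null);
        var results = new List<ValidationResult>();

        // true = validate every annotated property, not only [Required] ones.
        Validator.TryValidateObject(entity, context, results, true);

        return results.Select(r => r.ErrorMessage).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var candidates = new[]
        {
            new Customer(),                                     // Name missing
            new Customer { Name = "A very long customer name" }, // Name too long
            new Customer { Name = "Contoso" }                    // valid
        };

        foreach (var customer in candidates)
        {
            var errors = AnnotationValidation.GetErrors(customer);
            Console.WriteLine(errors.Count == 0
                ? "Valid"
                : String.Join(Environment.NewLine, errors));
        }
    }
}
```

Inside a ChangeInterceptor, the joined messages from a helper like this are what you would pass to the DataServiceException.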

Finally, in the client we are going to receive a DataServiceRequestException that will contain the error we sent from the server.

try
{
    var context = new DatabaseEntities(new Uri("http://localhost:4799/WcfDataService.svc"));
    var customer = new Customer();
    context.AddObject("Customers", customer);
    Console.WriteLine("Calling data service...");
    context.SaveChanges();
    Console.WriteLine("Insert successful");
}
catch (DataServiceRequestException ex)
{
    if (ex.InnerException != null && ex.InnerException.Message != null)
        Console.WriteLine(ex.InnerException.Message);
}

As you can see, our original error message is now wrapped inside an xml document (the serialized original exception). If you need to get the original message from there, you can either read the xml or use the code shown by Phani Raj in this post.
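As a rough sketch of the "read the xml" option, the message can be pulled out of the serialized error with LINQ to XML. This is not Phani Raj's code. The DataServiceErrorReader class is a name made up for this example, and the sample payload in Main is a hand-written approximation of a data service error document, so treat its exact shape as an assumption. To stay tolerant of namespace differences, the sketch matches the message element by local name only.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class DataServiceErrorReader
{
    // Returns the text of the first <message> element in the payload.
    // Falls back to the raw payload when it is empty, is not XML,
    // or contains no <message> element.
    public static string ExtractMessage(string payload)
    {
        if (String.IsNullOrEmpty(payload))
        {
            return payload;
        }

        try
        {
            var message = XDocument.Parse(payload)
                .Descendants()
                .FirstOrDefault(e => e.Name.LocalName == "message");

            return message != null ? message.Value : payload;
        }
        catch (XmlException)
        {
            return payload;
        }
    }

    public static void Main()
    {
        // Hand-written sample; a real payload comes from ex.InnerException.Message.
        const string sample =
            "<error xmlns=\"http://schemas.microsoft.com/ado/2007/08/dataservices/metadata\">" +
            "<code></code>" +
            "<message xml:lang=\"en-US\">The Name field is required.</message>" +
            "</error>";

        Console.WriteLine(ExtractMessage(sample));
    }
}
```

In the catch block above you would call ExtractMessage(ex.InnerException.Message) instead of printing the raw string.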

To download the source code, just click here.

Seth Grimes asserted “Embedded, automatic and easy new approaches meet growing demands for do-it-yourself data analysis” as a deck for his 5 Paths To The New Data Integration article of 1/11/2011 for InformationWeek’s Software blog:

Whether your interests are in business intelligence, information access, or operations, there are clear and compelling benefits in linking enterprise data -- customer profiles and transactions, product and competitive information, weblogs, -- to business-relevant content drawn from the ever-growing social/online information flood.


ETL (extraction, transform, load) to data stores, together with the younger, load-first variant ELT, will remain the leading integration approaches. But they'll be complemented by new, dynamic capabilities provided by mash-ups and by semantic integration, driven by data profiles (type, distribution, and attributes of values) rather than by rigid, application-specific data definitions.

These newer, beyond-ETL approaches constitute a New Data Integration. The approaches were developed to provide easy-to-use, application-embedded, end-user-focused integration capabilities.

The New Data Integration responds to the volume and diversity of data sources and needs and to growing demand for do-it-yourself data analysis. I explored these ideas last year in an article on 'NoETL'. In this follow-up I consider five examples, with capsule reviews of same-but-different approaches at Tableau, Attivio, FirstRain, Google, and Extractiv. Each example illustrates paths to the new data integration.

Tableau: Easy Exploration

No BI vendor better embodies the DIY spirit than Tableau Software. The company's visual, exploratory data analysis software lets end users delve into structured data sources and share and publish analyses. By "structured data sources," I mean anything ranging from Excel spreadsheets to very large databases managed with high-end data-warehousing systems. Tableau's power and ease of use has won the company an enthusiastic following.

Tableau's Data Blending capability, new in November's Tableau 6.0 release, caught my attention. The software will not only suggest joins for data fields across sources, by name and characteristics; according to Dan Jewett, Tableau VP of Product Management, it will also aggregate values, for instance rolling up months to quarters, to facilitate fusing like data stored at different aggregation levels.

The software also supports "alias values" for use in blending relationships. For instance, it can match state names to abbreviations, part numbers to part names, and coded values such as 0 and 1 for "male" and "female."

Usage scenarios include comparing budget and sales projections to actuals, where users may compare spreadsheet-held values to corporate records. The software also supports blending of external-source information into corporate data.

"Marketing organizations often get data feeds from suppliers and partners they want to join in with the in-house CRM system data," Jewett explains. "These are often ad-hoc feeds, so structured processes that IT likes to build don't support this case."

Tableau can pull data from Web sources via application programming interfaces (APIs) adhering to the Open Data Protocol (OData) standard. This capability will help users keep up with the growing volume of online data.

Tableau, like the vast majority of BI applications, does work exclusively with "structured" data. That focus must and will change as users confront an imperative to tap online and social sources, via search- and text-analytics enhanced BI.


Windows Azure AppFabric: Access Control and Service Bus

Suren Machiraju described Running .NET4 Windows Workflows in Azure in a 1/12/2011 post to the Windows Server AppFabric Customer Advisory Team blog:

This blog reviews the current (January 2011) set of options available for hosting existing .NET4 Workflow (WF) programs in Windows Azure and also provides a roadmap to the upcoming features that will further enhance support for hosting and monitoring the Workflow programs. The code snippets included below are also available as an attachment so you can download them and try them out yourself.

Workflow in Azure – Today

Workflow programs can be broadly classified as durable or non-durable (aka non-persisted Workflow Instances). Durable Workflow Services are inherently long running, persist their state, and use correlation for follow-on activities. Non-durable Workflows are stateless; effectively they start and run to completion in a single burst.

Today non-durable Workflows are readily supported by Windows Azure, with a few trivial configuration changes. Hosting durable Workflows today is a challenge, since we do not yet have a ‘Windows Server AppFabric’ equivalent for Azure which can persist, manage and monitor the Service. In brief, the big buckets of functionality required to host the durable Workflow Services are:

  • Monitoring store: There is no Event Collection Service available to gather the ETW events and write them to the SQL Azure based Monitoring database. There is also no schema that ships with .NET Framework for creating the monitoring database, and the one that ships with Windows Server AppFabric is incompatible with SQL Azure. For example, the scripts that are provided with Windows Server AppFabric make use of the XML column type, which is currently not supported by SQL Azure.
  • Instance Store: The schemas used by the SqlWorkflowInstanceStore have incompatibilities with SQL Azure. Specifically, the schema scripts require page locks, which are not supported on SQL Azure.
  • Reliability: While the SqlWorkflowInstanceStore provides a lot of the functionality for managing instance lifetimes, the lack of the AppFabric Workflow Management Service means that you need to manually implement a way to start your WorkflowServiceHosts before any messages are received (such as when you bring up a new role instance or restart a role instance), so that the contained SqlWorkflowInstanceStore can poll for workflow service instances having expired timers and subsequently resume their execution.

The above limitations make it rather difficult to run a durable Workflow Service on Azure – the upcoming release of Azure AppFabric (Composite Application) is expected to make it possible to run durable Workflow Services. In this blog, we will focus on the design approaches to get your non-durable Workflow instances running within Azure roles.

Today you can run your non-durable Workflow on Azure. What this means is that your Workflow programs cannot persist their state and wait for subsequent input to resume execution; they must complete following their initial launch. With Azure you can run non-durable Workflow programs in one of three ways:

  1. Web Role
  2. Worker Roles
  3. Hybrid

The Web Role acts very much like IIS does on premise as an HTTP server; it is easier to configure, requires little code to integrate, and is activated by an incoming request. The Worker Role acts like an on-premise Windows Service and is typically used in backend processing scenarios; it has multiple options to kick off the processing, which in turn add to the complexity. The hybrid approach, which bridges communication between Azure hosted and on-premise resources, has multiple advantages: it enables you to leverage existing deployment models and also enables use of durable Workflows on premise as a solution until the next release. The following sections provide succinct details on these three approaches, and in the ‘Conclusion’ section we will also provide pointers on the appropriateness of each approach.

Host Workflow Services in a Web Role

The Web Role is similar to a ‘Web Application’ and can also provide a Service perspective to anything that uses the http protocol - such as a WCF service using basicHttpBinding. The Web Role is generally driven by a user interface – the user interacts with a Web Page, but a call to a hosted Service can also cause some processing to happen. Below are the steps that enable you to host a Workflow Service in a Web Role.

First step is to create a Cloud Project in Visual Studio, and add a WCF Service Web Role to it. Delete the IService1.cs, Service1.svc and Service1.svc.cs added by the template since they are not needed and will be replaced by the workflow service XAMLX.

To the Web Role project, add a WCF Workflow Service. The structure of your solution is now complete (see the screenshot below for an example), but you need to add a few configuration elements to enable it to run on Azure.

Fig 1

Windows Azure does not include a section handler for XAML hosting in its machine.config, as an on-premises installation does. Therefore, the first configuration change (HTTP Handler for XAMLX and XAMLX Activation) is to add the following to the top of your web.config, within the configuration element:

<configSections>
  <sectionGroup name="" type="System.Xaml.Hosting.Configuration.XamlHostingSectionGroup, System.Xaml.Hosting, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35">
    <section name="httpHandlers" type="System.Xaml.Hosting.Configuration.XamlHostingSection, System.Xaml.Hosting, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
  </sectionGroup>
</configSections>

Next, you need to add XAML http handlers for WorkflowService and Activities root element types by adding the following within the configuration element of your web.config, below the configSection that we included above:

    <add xamlRootElementType="System.ServiceModel.Activities.WorkflowService, System.ServiceModel.Activities, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" httpHandlerType="System.ServiceModel.Activities.Activation.ServiceModelActivitiesActivationHandlerAsync, System.ServiceModel.Activation, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
    <add xamlRootElementType="System.Activities.Activity, System.Activities, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" httpHandlerType="System.ServiceModel.Activities.Activation.ServiceModelActivitiesActivationHandlerAsync, System.ServiceModel.Activation, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />

Finally, configure the WorkflowServiceHostFactory to handle activation for your service by adding a serviceActivation element to system.serviceModel\serviceHostingEnvironment element:

<serviceHostingEnvironment multipleSiteBindingsEnabled="true">
  <serviceActivations>
    <add relativeAddress="~/Service1.xamlx" service="Service1.xamlx" factory="System.ServiceModel.Activities.Activation.WorkflowServiceHostFactory" />
  </serviceActivations>
</serviceHostingEnvironment>

The last step is to deploy your Cloud Project; with that, your Workflow Service is hosted on Azure, as shown in the graphic below.

Fig 3

Note: Sam Vanhoutte from CODit also elaborates on this topic in his blog post, Hosting workflow services in Windows Azure, which focuses on troubleshooting configuration by disabling custom errors. It is worth reviewing.

Host Workflows in a Worker Role

The Worker Role is similar to a Windows Service: it starts up ‘automatically’ and runs all the time. While the Workflow programs could be initiated by a timer, the role can use other means of activation, such as a simple while (true) loop with a sleep statement. When it ‘ticks’, it performs work. This is generally the option for background or computational processing.

In this scenario you use Workflows to define Worker Role logic. Worker Roles are created by deriving from the RoleEntryPoint Class and overriding a few of its methods. The method that defines the actual logic performed by a Worker Role is the Run Method. Therefore, to get your workflows executing within a Worker Role, use WorkflowApplication or WorkflowInvoker to host an instance of your non-service Workflow (i.e., one that doesn’t use Receive activities) within this Method. In either case, you only exit the Run Method when you want the Worker Role to stop executing Workflows.

The general strategy to accomplish this is to start with an Azure Project and add a Worker Role to it. To this project you add a reference to an assembly containing your XAML Workflow types. Within the Run Method of WorkerRole.cs, you initialize one of the host types (WorkflowApplication or WorkflowInvoker), referring to an Activity type contained in the referenced assembly. Alternatively, you can initialize one of the host types by loading an Activity instance from a XAML Workflow file available on the file system. You will also need to add references to .NET Framework assemblies (System.Activities, plus System.Xaml if you wish to load XAML workflows from a file).

Host Workflow (Non-Service) in a Worker Role

For ‘non-Service’ Workflows, your Run method needs to describe a loop that examines some input data and passes it to a workflow instance for processing. The following shows how to accomplish this when the Workflow type is acquired from a referenced assembly:

public override void Run()
{
    Trace.WriteLine("WFWorker entry point called", "Information");
    while (true)
    {
        /* ...
         * ...Poll for data to hand to WF instance...
         * ...
         */
        //Create a dictionary to hold input data
        Dictionary<string, object> inputData = new Dictionary<string, object>();
        //Instantiate a workflow instance from a type defined in a referenced assembly
        System.Activities.Activity workflow = new Workflow1();
        //Execute the WF passing in parameter data and capture output results
        IDictionary<string, object> outputData =
            System.Activities.WorkflowInvoker.Invoke(workflow, inputData);
        Trace.WriteLine("Working", "Information");
    }
}

Alternatively, you could perform the above using WorkflowApplication to host the Workflow instance. In this case, the main difference is that you need a synchronization primitive (an AutoResetEvent in the code below) to control the flow of execution, because the workflow instances run on threads separate from the one executing the Run method.

public override void Run()
{
    Trace.WriteLine("WFWorker entry point called", "Information");
    while (true)
    {
        Thread.Sleep(1000);
        /* ...
         * ...Poll for data to hand to WF...
         * ...
         */
        AutoResetEvent syncEvent = new AutoResetEvent(false);
        //Create a dictionary to hold input data and declare another for output data
        Dictionary<string, object> inputData = new Dictionary<string, object>();
        IDictionary<string, object> outputData;
        //Instantiate a workflow instance from a type defined in a referenced assembly
        System.Activities.Activity workflow = new Workflow1();
        //Run the workflow instance using WorkflowApplication as the host.
        System.Activities.WorkflowApplication workflowHost =
            new System.Activities.WorkflowApplication(workflow, inputData);
        workflowHost.Completed = (e) =>
        {
            outputData = e.Outputs;
            //Signal the Run thread that this instance has finished
            syncEvent.Set();
        };
        workflowHost.Run();
        //Block until the Completed handler signals
        syncEvent.WaitOne();
        Trace.WriteLine("Working", "Information");
    }
}
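One caveat when blocking on an event such as syncEvent: Completed is not raised if the instance is aborted, so the Run loop could wait forever. The following is a minimal sketch (not the sample project's code) that also signals from the Aborted handler and bounds the wait. Echo is a hypothetical stand-in for Workflow1, RunOnce is a made-up helper name, and the 30-second timeout is an arbitrary assumption.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for Workflow1: copies its Text input to its Result output.
public sealed class Echo : CodeActivity
{
    public InArgument<string> Text { get; set; }
    public OutArgument<string> Result { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        Result.Set(context, Text.Get(context));
    }
}

public static class WorkflowApplicationDemo
{
    // Runs one instance; returns its outputs, or null if it aborted, faulted or timed out.
    public static IDictionary<string, object> RunOnce(
        Activity workflow, IDictionary<string, object> inputs, TimeSpan timeout)
    {
        IDictionary<string, object> outputs = null;
        AutoResetEvent done = new AutoResetEvent(false);

        WorkflowApplication host = new WorkflowApplication(workflow, inputs);
        host.Completed = e =>
        {
            // Only a Closed instance completed normally and has meaningful outputs.
            if (e.CompletionState == ActivityInstanceState.Closed)
            {
                outputs = e.Outputs;
            }
            done.Set();
        };
        host.Aborted = e =>
        {
            Console.WriteLine("Aborted: " + e.Reason.Message);
            done.Set();
        };

        host.Run();
        if (!done.WaitOne(timeout))
        {
            // Give up on this instance rather than blocking the Run loop indefinitely.
            host.Abort("Timed out waiting for completion");
        }
        return outputs;
    }

    public static void Main()
    {
        var inputs = new Dictionary<string, object> { { "Text", "hello" } };
        IDictionary<string, object> result = RunOnce(new Echo(), inputs, TimeSpan.FromSeconds(30));
        Console.WriteLine(result == null ? "no result" : result["Result"]);
    }
}
```

Inside a Worker Role, a helper like RunOnce would be called from the while loop in Run, so a single misbehaving instance cannot stall the role.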

Finally, if instead of loading Workflow types from a referenced assembly you want to load the XAML from a file (for example, one included with the Worker Role or stored on an Azure Drive), simply replace the line that instantiates the Workflow in the above two examples with the following, passing the appropriate path to the XAML file to XamlServices.Load:

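System.Activities.Activity workflow = (System.Activities.Activity)
    System.Xaml.XamlServices.Load(pathToXamlFile); //pathToXamlFile: placeholder for the path to your XAML file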
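For reference, System.Activities also provides ActivityXamlServices.Load, which is aimed at workflow XAML and returns an Activity directly. The sketch below is an assumption-laden illustration rather than the post's approach: it loads an inline XAML string instead of a file so that it is self-contained, and the Demo.Hello class name and the XAML content are made up.

```csharp
using System;
using System.Activities;
using System.Activities.XamlIntegration;
using System.IO;

public static class XamlWorkflowLoader
{
    // Illustrative workflow definition; a real one would come from a .xaml file.
    public const string SampleXaml =
        "<Activity x:Class=\"Demo.Hello\" " +
        "xmlns=\"http://schemas.microsoft.com/netfx/2009/xaml/activities\" " +
        "xmlns:x=\"http://schemas.microsoft.com/winfx/2006/xaml\">" +
        "<WriteLine Text=\"Hello from XAML\" />" +
        "</Activity>";

    // Builds an Activity from workflow XAML held in a string.
    public static Activity LoadFromString(string xaml)
    {
        return ActivityXamlServices.Load(new StringReader(xaml));
    }

    public static void Main()
    {
        Activity workflow = LoadFromString(SampleXaml);
        // Runs the loaded workflow synchronously on the calling thread.
        WorkflowInvoker.Invoke(workflow);
    }
}
```

ActivityXamlServices.Load also has an overload that takes a file name, which would be the closer match when the XAML ships with the Worker Role or lives on an Azure Drive.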

By and large, if you are simply hosting logic described in a non-durable workflow, WorkflowInvoker is the way to go. As it offers fewer lifecycle features than WorkflowApplication, it is also more lightweight and may help you scale better when you need to run many workflows simultaneously.
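To make the WorkflowInvoker route concrete, here is a minimal, self-contained sketch of passing inputs and reading outputs by argument name. AddNumbers is a hypothetical stand-in for Workflow1, and the 10-second timeout is an arbitrary assumption.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;

// Hypothetical stand-in for Workflow1: adds two integers.
public sealed class AddNumbers : CodeActivity
{
    public InArgument<int> Left { get; set; }
    public InArgument<int> Right { get; set; }
    public OutArgument<int> Sum { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        Sum.Set(context, Left.Get(context) + Right.Get(context));
    }
}

public static class WorkflowInvokerDemo
{
    public static int Add(int left, int right)
    {
        // Input keys must match the names of the activity's InArgument properties.
        var inputs = new Dictionary<string, object> { { "Left", left }, { "Right", right } };

        // Invoke blocks the calling thread; this overload throws TimeoutException
        // if the workflow does not finish within the given interval.
        IDictionary<string, object> outputs =
            WorkflowInvoker.Invoke(new AddNumbers(), inputs, TimeSpan.FromSeconds(10));

        // Output keys match the names of the OutArgument properties.
        return (int)outputs["Sum"];
    }

    public static void Main()
    {
        Console.WriteLine(Add(2, 3));
    }
}
```

Because Invoke runs synchronously, no event or other signalling is needed, which is part of why it is the lighter option for short, non-durable workflows.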

Host Workflow Service in a Worker Role

When you need to host a Workflow Service in a Worker Role, there are a few more steps to take. Mainly, these exist to address the fact that Worker Role instances run behind a load balancer. At a high level, hosting a Workflow Service means creating an instance of a WorkflowServiceHost based upon an instance of an Activity or WorkflowService defined either in a separate assembly or as a XAML file. The WorkflowServiceHost instance is created and opened in the Worker Role’s OnStart Method, and closed in the OnStop Method. It is important to note that you should always create the WorkflowServiceHost instance within the OnStart Method (as opposed to within Run, as was shown for non-service Workflow hosts). This ensures that if a startup error occurs, the Worker Role instance will be restarted by Azure automatically, which also means the opening of the WorkflowServiceHost will be attempted again.

Begin by defining a class-level field to hold a reference to the WorkflowServiceHost (so that you can access the instance within both the OnStart and OnStop Methods).

public class WorkerRole : RoleEntryPoint{
    System.ServiceModel.Activities.WorkflowServiceHost wfServiceHostA;

Next, within the OnStart Method, add code to initialize and open the WorkflowServiceHost, within a try block. For example:

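public override bool OnStart()
{
    Trace.WriteLine("Worker Role OnStart Called.");
    //…
    try { OpenWorkflowServiceHostWithAddressFilterMode(); }
    catch (Exception ex) { Trace.TraceError(ex.Message); throw; } //rethrow so Azure restarts the instance
    return base.OnStart();
}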

Let’s take a look at the OpenWorkflowServiceHostWithAddressFilterMode method implementation, which really does the work. Starting from the top, notice how either an Activity or WorkflowService instance can be used by the WorkflowServiceHost constructor; they can even be loaded from a XAMLX file on the file system. Then we acquire the internal instance endpoint and use it to define both the logical and physical address for adding an application service endpoint using a NetTcpBinding. When calling AddServiceEndpoint on a WorkflowServiceHost, you can specify either just the service name as a string or the namespace plus name as an XName (these values come from the Receive activity’s ServiceContractName property).

private void OpenWorkflowServiceHostWithAddressFilterMode()
{
    //workflow service hosting with AddressFilterMode approach
    //Loading from a XAMLX on the file system
    System.ServiceModel.Activities.WorkflowService wfs =
        (System.ServiceModel.Activities.WorkflowService)
        System.Xaml.XamlServices.Load("WorkflowService1.xamlx");
    //As an alternative you can load from an Activity type in a referenced assembly:
    //System.Activities.Activity wfs = new WorkflowService1();
    wfServiceHostA = new System.ServiceModel.Activities.WorkflowServiceHost(wfs);
    //Internal endpoint of this role instance (the Input Endpoint named WorkflowServiceTcp)
    IPEndPoint ip =
        RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["WorkflowServiceTcp"].IPEndpoint;
    wfServiceHostA.AddServiceEndpoint(System.Xml.Linq.XName.Get("IService", ""),
        new NetTcpBinding(SecurityMode.None),
        String.Format("net.tcp://{0}/MyWfServiceA", ip));
    //You can also refer to the implemented contract without the namespace, just passing the name as a string:
    //wfServiceHostA.AddServiceEndpoint("IService",
    //    new NetTcpBinding(SecurityMode.None),
    //    String.Format("net.tcp://{0}/MyWfServiceA", ip));
    wfServiceHostA.ApplyServiceBehaviorAttribute();
    wfServiceHostA.ApplyServiceMetadataBehavior(String.Format("net.tcp://{0}/MyWfServiceA/mex", ip));
    wfServiceHostA.Open();
    Trace.WriteLine(String.Format("Opened wfServiceHostA"));
}

In order to enable our service to be callable externally, we next need to add an Input Endpoint that Azure will expose at the load balancer for remote clients to use. This is done within the Worker Role configuration, on the Endpoints tab. The figure below shows how we have defined a single TCP Input Endpoint on port 5555 named WorkflowServiceTcp. It is this Input Endpoint, or IPEndpoint as it appears in code, that we use in the call to AddServiceEndpoint in the previous code snippet. At runtime, the variable ip provides the local instance physical address and port which the service must use, and to which the load balancer will forward messages. The port number assigned at runtime (e.g., 20000) is almost always different from the port you specify in the Endpoints tab (e.g., 5555), and the address is not the address of your application in Azure, but rather that of the particular Worker Role instance.


It is very important to know that currently, Azure Worker Roles do not support using HTTP or HTTPS endpoints (primarily due to permissions issues that only Worker Roles face when trying to open one). Therefore, when exposing your service or metadata to external clients, your only option is to use TCP.

Returning to the implementation, before opening the service we add a few behaviors. The key concept to understand is that any workflow service hosted by an Azure Worker Role will run behind a load balancer, and this affects how requests must be addressed. This results in two challenges which the code above solves:

  • How to properly expose service metadata and produce metadata which includes the load balancer’s address (and not the internal address of the service hosted within a Worker Role instance).
  • How to configure the service to accept messages it receives from the load balancer, that are addressed to the load balancer.

To reduce repetitive work, we defined a helper class containing two extension methods, ApplyServiceBehaviorAttribute and ApplyServiceMetadataBehavior, that apply the appropriate configuration to the WorkflowServiceHost and address the aforementioned challenges.

//Defines extension methods for ServiceHostBase (useable by ServiceHost & WorkflowServiceHost)
public static class ServiceHostingHelper
{
    public static void ApplyServiceBehaviorAttribute(this ServiceHostBase host)
    {
        ServiceBehaviorAttribute sba = host.Description.Behaviors.Find<ServiceBehaviorAttribute>();
        if (sba == null)
        {
            //For WorkflowServices, this behavior is not added by default (unlike for traditional WCF services).
            host.Description.Behaviors.Add(new ServiceBehaviorAttribute() { AddressFilterMode = AddressFilterMode.Any });
            Trace.WriteLine(String.Format("Added address filter mode ANY."));
        }
        else
        {
            sba.AddressFilterMode = System.ServiceModel.AddressFilterMode.Any;
            Trace.WriteLine(String.Format("Configured address filter mode to ANY."));
        }
    }
    public static void ApplyServiceMetadataBehavior(this ServiceHostBase host, string metadataUri)
    {
        //Must add this to expose metadata externally
        UseRequestHeadersForMetadataAddressBehavior addressBehaviorFix = new UseRequestHeadersForMetadataAddressBehavior();
        host.Description.Behaviors.Add(addressBehaviorFix);
        Trace.WriteLine(String.Format("Added Address Behavior Fix"));
        //Add TCP metadata endpoint. NOTE, as for application endpoints, HTTP endpoints are not supported in Worker Roles.
        ServiceMetadataBehavior smb = host.Description.Behaviors.Find<ServiceMetadataBehavior>();
        if (smb == null)
        {
            smb = new ServiceMetadataBehavior();
            host.Description.Behaviors.Add(smb);
            Trace.WriteLine("Added ServiceMetaDataBehavior.");
        }
        host.AddServiceEndpoint(ServiceMetadataBehavior.MexContractName,
            MetadataExchangeBindings.CreateMexTcpBinding(),
            metadataUri);
    }
}

Looking at how we enable service metadata in the ApplyServiceMetadataBehavior method, notice there are three key steps. First, we add the UseRequestHeadersForMetadataAddressBehavior. Without this behavior, you could only get metadata by communicating directly to the Worker Role instance, which is not possible for external clients (they must always communicate through the load balancer). Moreover, the WSDL returned in the metadata request would include the internal address of the service, which is not helpful to external clients either. By adding this behavior, the WSDL includes the address of the load balancer. Next, we add the ServiceMetadataBehavior and then add a service endpoint at which the metadata can be requested. Observe that when we call ApplyServiceMetadataBehavior, we specify a URI which is the service’s internal address with mex appended. The load balancer will now correctly route metadata requests to this metadata endpoint.
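From the client side, one way to check that metadata is reachable through the load balancer is to request it with MetadataExchangeClient. This is only a sketch: the host name myapp.cloudapp.net is a made-up placeholder for your application's address, port 5555 and the MyWfServiceA/mex path are taken from the example in this post, and MetadataProbe is a hypothetical helper.

```csharp
using System;
using System.ServiceModel.Description;

public static class MetadataProbe
{
    // Returns the number of metadata sections retrieved, or -1 if the request failed.
    public static int CountSections(Uri mexAddress)
    {
        try
        {
            // MetadataExchange mode uses WS-MetadataExchange; for a net.tcp address
            // the client selects a TCP mex binding based on the URI scheme.
            MetadataExchangeClient client =
                new MetadataExchangeClient(mexAddress, MetadataExchangeClientMode.MetadataExchange);
            MetadataSet metadata = client.GetMetadata();
            return metadata.MetadataSections.Count;
        }
        catch (Exception ex)
        {
            Console.WriteLine("Metadata request failed: " + ex.Message);
            return -1;
        }
    }

    public static void Main()
    {
        // Placeholder address: substitute your own application's DNS name.
        Uri mex = new Uri("net.tcp://myapp.cloudapp.net:5555/MyWfServiceA/mex");
        Console.WriteLine(CountSections(mex));
    }
}
```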

The rationale behind the ApplyServiceBehaviorAttribute method is similar to ApplyServiceMetadataBehavior. When we add a service endpoint by specifying only the address parameter (as we did above), the logical and physical address of the service are configured to be the same. This causes a problem when operating behind a load balancer, as messages coming from external clients via the load balancer will be addressed to the logical address of the load balancer, and when the instance receives such a message it will not accept it, throwing an AddressFilterMismatch exception. This happens because the address in the message does not match the logical address at which the endpoint was configured. With traditional code-based WCF services, we could resolve this simply by decorating the service implementation class with [ServiceBehavior(AddressFilterMode=AddressFilterMode.Any)], which allows the incoming message to have any address and port. This is not possible with Workflow Services (as there is no code to decorate with an attribute), hence we have to add it in the hosting code.
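For comparison, this is roughly what the attribute-based approach looks like on a traditional code-based WCF service. IEcho and EchoService are hypothetical names used only for illustration; the host is constructed but never opened, so no endpoint or port is required to inspect the behavior.

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IEcho
{
    [OperationContract]
    string Echo(string text);
}

// AddressFilterMode.Any: accept messages regardless of the address they were sent to.
[ServiceBehavior(AddressFilterMode = AddressFilterMode.Any)]
public class EchoService : IEcho
{
    public string Echo(string text)
    {
        return text;
    }
}

public static class AddressFilterDemo
{
    // Reads back the filter mode that the attribute placed on the service description.
    public static AddressFilterMode ConfiguredMode()
    {
        ServiceHost host = new ServiceHost(typeof(EchoService));
        ServiceBehaviorAttribute behavior =
            host.Description.Behaviors.Find<ServiceBehaviorAttribute>();
        return behavior.AddressFilterMode;
    }

    public static void Main()
    {
        Console.WriteLine(ConfiguredMode());
    }
}
```

A Workflow Service has no implementation class to carry the attribute, which is why the helper method adds or updates the same ServiceBehaviorAttribute on the host's description instead.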

If allowing any incoming address concerns you, an alternative to using AddressFilterMode is simply to specify the logical address that is to be allowed. Instead of adding the ServiceBehaviorAttribute, you open the service endpoint specifying both the logical address (namely, the port on which the load balancer receives messages) and the physical address (at which your service listens). The only complication is that your Worker Role instance does not know which port the load balancer is listening on, so you need to add this value to configuration and read it from there before adding the service endpoint. To add this to configuration, return to the Worker Role’s properties, Settings tab. Add a string setting with the value of the port you specified on the Endpoints tab, as we show here for the WorkflowServiceEndpointListenerPort.


With that setting in place, the rest of the implementation is fairly straightforward:

private void OpenWorkflowServiceHostWithoutAddressFilterMode()
{
    //workflow service hosting without AddressFilterMode
    //Loading from a XAMLX on the file system
    System.ServiceModel.Activities.WorkflowService wfs =
        (System.ServiceModel.Activities.WorkflowService)
        System.Xaml.XamlServices.Load("WorkflowService1.xamlx");
    wfServiceHostB = new System.ServiceModel.Activities.WorkflowServiceHost(wfs);
    //Pull the expected load balancer port from configuration...
    int externalPort = int.Parse(RoleEnvironment.GetConfigurationSettingValue("WorkflowServiceEndpointListenerPort"));
    IPEndPoint ip =
        RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["WorkflowServiceTcp"].IPEndpoint;
    //Use the external load balancer port in the logical address...
    wfServiceHostB.AddServiceEndpoint(System.Xml.Linq.XName.Get("IService", ""),
        new NetTcpBinding(SecurityMode.None),
        String.Format("net.tcp://{0}:{1}/MyWfServiceB", ip.Address, externalPort),
        //...and the instance's own address and port as the physical (listen) address
        new Uri(String.Format("net.tcp://{0}/MyWfServiceB", ip)));
    wfServiceHostB.ApplyServiceMetadataBehavior(String.Format("net.tcp://{0}/MyWfServiceB/mex", ip));
    wfServiceHostB.Open();
    Trace.WriteLine(String.Format("Opened wfServiceHostB"));
}

With that, we can return to the RoleEntryPoint definition of our Worker Role and override the Run and OnStop Methods. For Run, because the WorkflowServiceHost takes care of all the processing, we just need to have a loop that keeps Run from exiting.

public override void Run()
{
    Trace.WriteLine("Run - WFWorker entry point called", "Information");
    while (true) { Thread.Sleep(1000); } //keep Run from exiting; the sleep interval is arbitrary
}

For OnStop we simply close the WorkflowServiceHost.

public override void OnStop()
{
    Trace.WriteLine(String.Format("OnStop - Called"));
    if (wfServiceHostA != null)
        wfServiceHostA.Close();
}

With OnStart, Run and OnStop Methods defined, our Worker Role is fully capable of hosting a Workflow Service.
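As a variation on the keep-alive loop, Run can block on an event that OnStop signals, so the role winds down promptly when asked to stop. This is a sketch of the idea only: StoppableWorker is a plain class standing in for a RoleEntryPoint subclass so that it compiles without the Azure SDK, and the 10-second wake-up interval is arbitrary.

```csharp
using System;
using System.Threading;

// Stand-in for a RoleEntryPoint subclass; Run and OnStop mirror the role's methods.
public class StoppableWorker
{
    private readonly ManualResetEvent stopRequested = new ManualResetEvent(false);

    // Blocks until OnStop signals, waking periodically in between.
    public void Run()
    {
        while (!stopRequested.WaitOne(TimeSpan.FromSeconds(10)))
        {
            // Nothing to do here: the service host handles the actual requests.
        }
    }

    public void OnStop()
    {
        // A real role would close its WorkflowServiceHost before signalling.
        stopRequested.Set();
    }
}

public static class StoppableWorkerDemo
{
    // Starts Run on a background thread, requests a stop, and reports whether Run returned.
    public static bool RunAndStop()
    {
        StoppableWorker worker = new StoppableWorker();
        Thread runner = new Thread(worker.Run);
        runner.Start();
        worker.OnStop();
        return runner.Join(TimeSpan.FromSeconds(5));
    }

    public static void Main()
    {
        Console.WriteLine(RunAndStop());
    }
}
```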

Hybrid Approach - Host Workflow On-Premise and Reach From the Cloud

Unlike ‘pure’ cloud solutions, hybrid solutions have a set of “on-premises” components: business processes, data stores, and services. These must be on-premises, possibly due to compliance or deployment restrictions. A hybrid solution is one which has parts of the solution deployed in the cloud while some applications remain deployed on-premises.

This is a great interim approach: it exposes on-premise Workflows, hosted within on-premise Windows Server AppFabric (as illustrated in the diagram below), to various components and applications that are hosted in Azure. This approach may also be applied if stateful/durable Workflows are required to satisfy your scenarios. You can build a Hybrid solution, run the Workflows on-premise, and use either the AppFabric Service Bus or Windows Azure Connect to reach into your on-premise Windows Server AppFabric instance.

Fig 4

Source: MSDN Blog Hybrid Cloud Solutions with Windows Azure AppFabric Middleware


Conclusion

How do you choose which approach to take? The decision ultimately boils down to your specific requirements, but here are some pointers that can help.

Hosting Workflow Services in a Web Role is very easy and robust. If your Workflow is using Receive Activities as part of its definition, you should be hosting in a Web Role. While you can build and host a Workflow Service within a Worker Role, you take on the responsibility of rebuilding the entire hosting infrastructure provided by IIS in the Web Role, which is a fair amount of non-value-added work. That said, you will have to host in a Worker Role when you want to use a TCP endpoint, and in a Web Role when you want to use an HTTP or HTTPS endpoint.

Hosting non-service Workflows that poll for their tasks is most easily accomplished within a Worker Role. While you can build another mechanism to poll and then call Workflow Services hosted in a Web Role, the Worker Role is designed to support and keep a polling application alive. Moreover, if your Workflow design does not already define a Service, then you should host it in a Worker Role, as Web Role hosting would require you to modify the Workflow definition to add the appropriate Receive Activities.

Finally, if you have existing investments in Windows Server AppFabric as hosted Services that need to be called from Azure hosted applications, then taking a hybrid approach is a very viable option. One clear benefit is that you retain the ability to monitor your system’s status through the IIS Dashboard. Of course, this approach has to be weighed against the obvious trade-offs of added complexity and bandwidth costs.

The upcoming release of Azure AppFabric Composite Applications will enable hosting Workflow Services directly in Azure while providing feature parity to Windows Server AppFabric. Stay tuned for the exciting news and updates on this front.


The sample project attached provides a solution that shows how to host non-durable Workflows, in both Service and non-Service forms. For non-Service Workflows, it shows how to host using a WorkflowInvoker or WorkflowApplication within a Worker Role. For Services, it shows how to host traditional WCF services alongside Workflow Services, in both Web and Worker Roles.

It’s not evident where the sample project is “attached.”

Steve Plank (@plankytronixx) reported availability of a Video: How Windows Azure App Fab:ACS and ADFS 2.0 work together on 1/11/2011:

This is a short video that describes how an on-prem ADFS 2.0 server, Windows Azure App Fab: ACS (Access Control Service), an AD Domain Controller and a WIF (Windows Identity Foundation) based app running on Windows Azure communicate with each other. It details the protocol and the data flows and what happens with the tokens and tickets that are generated at each stage of the process.

How Windows Azure AppFab:ACS and ADFS 2.0 work together

It’s created using a tablet PC and a drawing program and has rather the feel of somebody explaining something on a whiteboard so you can follow along as they draw arrows and so on.


Windows Azure Virtual Network, Connect, RDP and CDN

No significant articles today.


Live Windows Azure Apps, APIs, Tools and Test Harnesses

Morten Cristensen posted Umbraco on Azure Series: Deploying to Azure with Accelerator on 1/12/2011:

This is the second post in the series about deploying Umbraco to Windows Azure. In the previous post we looked at deploying a clean Umbraco database to SQL Azure, so this post assumes that you have a database running on SQL Azure.

The focus of this post is to get your local Umbraco solution deployed to a Hosted Service on Windows Azure. Well, you actually need two services to deploy Umbraco, because the Hosted Service is only half of running Umbraco on Azure: you will deploy to both a Hosted Service and Storage.

During the deployment you will be creating a virtual hard drive, which will be deployed to Storage. So before getting started, please make sure you have access to both Hosted Service and Storage.

In this post the Azure Accelerator project will come into play; it is listed in the prerequisites below. The cool thing about Accelerator is that it “just” needs to be configured and then it’ll build a virtual hard drive (VHD) containing your Umbraco solution and upload it to storage for you. Aside from that, you will also be able to publish the two packages (Configuration cscfg-file and Package cspkg-file) needed to deploy the worker role, which is the Azure VM instance that keeps your site running. The binding between the hosted Azure instance and the VHD in storage is done by configuring (step 3) and running Accelerator.

The Azure VM is the equivalent of a website instance in your IIS, but since this is a cloud service the instance will live and die in the cloud. This means that if you stop the instance it will die, and if the instance crashes it will die (for a production site you would have at least two instances running, so your website doesn’t die if one instance crashes). Because of this you need your Umbraco solution persisted to storage, which is where Azure Storage and the VHD come in. So basically you will have a lot less to worry about when using the Accelerator project.


Prerequisites

  • Umbraco database running on SQL Azure from previous post (or an embedded database if you don’t want to use SQL Azure).
  • Local Umbraco (4.5.2 .NET 3.5) solution using the database on SQL Azure.
  • Keys for Azure Storage (more info will follow).
  • The two guides on The Azure Accelerators Project from codeplex.
  • Source Code for Azure Accelerator from codeplex.
  • Visual Studio 2010 with Cloud Service v.1.2 installed.
  • Windows Azure SDK version 1.2 (included in Cloud Service installation) – look for June 2010 release (version 1.2) at the bottom of the page.
  • Windows Azure AppFabric SDK version 1.0. Note that the version is very important, as the Azure Accelerator will not work with newer versions of the SDK; at least for now, as I have yet to figure out whether there is a conflict when using newer versions. Using version 1.3 of the SDK has proven troublesome for Warren Buckley, so to save yourself a headache go with version 1.2 of the SDK until it has been investigated further.

Make sure you have installed the SDKs from the list above before going further.

1.) First off I will show you how to set up Storage on Windows Azure, as you will need access keys for storage in order to deploy the VHD to Azure Storage.
Go to the WindowsAzurePlatform portal and log in. If you are prompted to choose which version of the portal you want to use, choose the new one. It’s built in Silverlight and works quite nicely.

From the left column click “Hosted Services, Storage Accounts & CDN” (see screenshot below).

Now your available services will be listed, so click “Storage Accounts” from the left column and then “New Storage Account” from the top menu.

A new dialog pops up where you have to choose your Storage Account subscription, a name that will make up the url and a region – select a region that is close to you. See screenshot below.

When you have clicked the Create button you will see the storage account appear in the main window. Click it and you will get a new right column with properties. Copy the Primary access key and your Storage Account name, as they are needed in the configuration of the Accelerator project.

Now you have successfully created a storage account to which you can deploy your VHD with the Umbraco solution. But first you need to configure the Accelerator project and build it, so you can have it create and upload the VHD for you.

2.) Unzip the Azure Accelerator solution and open it with Visual Studio 2010. Unbind the solution from (codeplex) source control and make the files writeable, as you need to update a couple of files.

First thing I recommend you do is to look through the references for the 3 projects, just to make sure there are no missing references or version conflicts.
Finally build the solution to verify it also builds.

3.) This step is probably the most complicated one, because there are a lot of settings to go through to configure the solution. Before going any further I recommend that you read the two guides from the Azure Accelerator project page if you haven’t already done so.

Locate the “AcceleratorService” project in the solution and right click Properties on the AcceleratorWorkerRole to edit the various settings. See screenshot below.

I’m not sure what the best approach for going through these settings is, so I have listed them all with their default value and written what you should change the value to, if the value needs to be changed.

Full trust – leave it set to full trust.
Instances is by default set to one instance with Medium VM size. You can leave this as is, but it is recommended to deploy production sites with a minimum of two instances. For this guide you don’t need more than one.
Note: Remember that two instances implies higher costs. Also note that a Medium VM isn’t super fast, so consider setting it to Large or Extra Large.

AcceleratorApplication – Default: Umbraco,3.x
Change to Umbraco,4.0

AcceleratorConnectionString – Default: UseDevelopmentStorage=true
Change this by clicking … and entering your storage credentials in the dialog. See screenshot below:

When changed the connectionstring should look something like this:
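As a general illustration of the format (not the actual value from the dialog), a Windows Azure Storage connection string is a semicolon-separated list of DefaultEndpointsProtocol, AccountName and AccountKey. The helper below is a hypothetical sketch; the account name and key shown are placeholders to be replaced with the values you copied in step 1.

```csharp
using System;

public static class StorageConnectionString
{
    // Assembles a connection string from its three parts.
    public static string Build(string protocol, string accountName, string accountKey)
    {
        if (protocol != "http" && protocol != "https")
        {
            throw new ArgumentException("protocol must be http or https", "protocol");
        }
        return String.Format(
            "DefaultEndpointsProtocol={0};AccountName={1};AccountKey={2}",
            protocol, accountName, accountKey);
    }

    public static void Main()
    {
        // Placeholder values only; never commit a real access key to source control.
        Console.WriteLine(Build("https", "mystorageaccount", "<primary-access-key>"));
    }
}
```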

LocalSitePath – Default: C:\inetpub\wwwroot
This is the path to the folder containing the Umbraco solution. If you have installed your Umbraco website to the default location you probably don’t need to change it, but make sure the path is correct and change it if necessary.

AcceleratorConfigBlobUri – Default: wa-accelerator-config/umbraco.config
This is the default location for the umbraco.config, which is being used by Accelerator so leave this setting as is.

AcceleratorContainerSyncUri – Default: wa-accelerator-apps/inetpub
Also a default setting that you don’t need to change.

AcceleratorMachineKey – Default: <machineKey validationKey=”A Long Key” decryptionKey=”Another Long Key” validation=”SHA1″ decryption=”AES” />
Just leave this as is.

EnableDevStorage – Default: false
Since we are deploying to Azure Hosted Service leave this as is. You only want to enable development storage when testing locally.

allowInsecureRemoteEndpoints – Default: true
The site will be running a standard HTTP connection, so leave the value set to true.

Hostheader – Default: (blank)
The hostheader is blank by default and for this guide it can be left blank as we will only deploy one Umbraco site that will respond to whatever URL is generated by Azure.

Diagnostics – Default: ApplicationName=Umbraco;EnableLogging=true;DiagnosticsConsole=false;RealtimeTracing=false;LogFilter=Verbose;LogTransferInterval=5;BufferQuotaInMB=512
This string determines how diagnostics will be handled by Accelerator when deployed to Azure. As you can see, logging is enabled, which is fine for now, so just leave it set to the default value. But consider whether the buffer should be set to something smaller than 512MB.
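The Diagnostics value is a semicolon-separated list of key=value pairs. How Accelerator itself parses it is not shown in this post; the following is only a generic sketch of reading such a string, with SettingString and Parse being names made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SettingString
{
    // Splits "Key=Value;Key=Value" into a case-insensitive dictionary.
    public static Dictionary<string, string> Parse(string setting)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string pair in setting.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, so values may themselves contain '='.
            int eq = pair.IndexOf('=');
            if (eq <= 0)
            {
                continue; // skip malformed entries
            }
            result[pair.Substring(0, eq).Trim()] = pair.Substring(eq + 1).Trim();
        }
        return result;
    }

    public static void Main()
    {
        Dictionary<string, string> diagnostics = Parse(
            "ApplicationName=Umbraco;EnableLogging=true;DiagnosticsConsole=false;" +
            "RealtimeTracing=false;LogFilter=Verbose;LogTransferInterval=5;BufferQuotaInMB=512");
        Console.WriteLine(diagnostics["BufferQuotaInMB"]); // prints 512
    }
}
```

Splitting on the first '=' only matters for settings such as storage connection strings, where an account key may end in '=' padding characters.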

DiagnosticsConnectionString – Default: UseDevelopmentStorage=true
If you want to use diagnostics you have to change this setting to an Azure Storage account. This would typically use the same credentials as entered in AcceleratorConnectionString, and when done your connection string would look similar to the one below:
Note that official documentation states that diagnostics should run on https (for DefaultEndpointsProtocol), but you will be able to deploy it with http.

DiagnosticsServiceBus – Default: ServiceNamespace=enterservicebus;ServicePath=diag/umbraco;IssuerName=owner;IssuerSecret=enterissuersecret
I must admit I’m not totally confident about this setting, but I believe the idea is to have a ServiceBus that keeps an eye on the Instance and writes errors to storage. You will probably have to set up a ServiceBus via AppFabric, but leave this for now as it’s not needed to successfully deploy the Umbraco solution to Azure. I will write a follow-up blog post where I hope to have some more answers about this setting ;)

AcceleratorDrivePageBlobUri – Default: cloud-drives/Umbraco.vhd
Also a default setting that you don’t need to change. This is the location used for the VHD in storage.

HttpIn – Default: 80
Leave this setting set to default value.

HttpInAlt – Default: 8080
Leave this setting set to default value.

Local Storage:
LocalStorage – Default: 16384
This local storage refers to the actual instance’s (Azure VM) local storage size, which will typically be used for ASP.NET caching and IIS temporary files.

CloudDriveCache – Default: 2048
This setting is the size in MB used for the common storage for mounted cloud drives.

DiagnosticLogs – Default: 16384
This setting is the size in MB used for temporary storage for trace files and event logs created by Accelerator.
All of these local storage settings can be left set to default values. But keep in mind that you might want to revise these settings for a production site.

4.) Now that the various settings have been updated, you are ready to build the solution and publish the packages. But first you want to make sure you build release assemblies – see screenshot below:

So build the solution, which produces a Publish folder in your Accelerator solution folder:

Now go back to Visual Studio, right-click the AcceleratorService project and choose Publish. This will open a new dialog like the one in the screenshot below.

Choose “Create Service Package Only” and click OK. The output folder will open automatically, so note the location of the two files (see screenshot below) as they will be used in step 6.

5.) Now it’s time to build the VHD and upload it to storage using AccelCon.exe (located in the Publish folder in your Accelerator solution folder) from the Windows Azure SDK Command Prompt.

Note that you have to run the Azure SDK Command Prompt as administrator, so select the Windows Azure SDK Command Prompt from Start -> All Programs -> Windows Azure SDK v1.2, right-click it and choose “Run as administrator”. Once you have it open, you need to navigate to the Publish folder, which contains the AccelCon.exe file.

Type in this command and follow the instructions in the Command Prompt window: AccelCon /u /w /q

This will start the creation of the VHD, copying of Umbraco solution to VHD and finally uploading it to your storage account.

6.) The final step is to set up a Hosted Service and install the two published packages (AcceleratorService.cspkg and ServiceConfiguration.cscfg).
Go to the WindowsAzurePlatform portal and navigate to “Hosted Services, Storage Accounts & CDN” -> “Hosted Services”. From the top menu click “New Hosted Service” and a new dialog will appear.

Fill out the form similar to the screen below, but note that the URL prefix has to be different from anything that is already deployed to Azure. Don’t worry, though: you’ll get an error message if you choose a URL prefix that is already in use.

You can choose to deploy to the staging environment or directly to production. For this guide you can choose either one. If you choose the staging environment, you will get a random URL that you can use to test your hosted service (Umbraco solution), and you can later choose to promote that service from staging to production.

Choose the same region as you chose for your Storage Account, then browse and locate the two packages: AcceleratorService.cspkg (package location) and ServiceConfiguration.cscfg (configuration file).

After clicking OK you will most likely get a warning message if you have only set one instance in the configuration of the AcceleratorWorkerRole. It will take some time for the Hosted Service to initialize and startup the instance with VHD attached. While waiting you will see the following in the main window of the WindowsAzurePlatform portal:

And when your solution has been successfully deployed to Azure Hosted Service it will look like this:

You should now be able to access the site via the standard URL with your selected URL prefix:

I have deployed my Umbraco solution to (with CWS installed of course – this is my special dedication to Warren Buckley for trying to deploy Umbraco to Azure).

Please note that this site will only be running for a couple of days, as I’m paying for each hour the instance is active :-S

That wasn’t too bad was it ;-)

I hope you have found these two posts about deploying Umbraco to Windows Azure useful, and if you have any additional questions or think something is missing in this guide please feel free to drop a comment.

Update: Okay, maybe I should have deployed this demo site to two large or extra-large instances because now the site just loads like crap :-)

2nd Update: Deploying with two instances seems to work a lot better.

Morten is a Danish .NET web developer working for a Copenhagen-based company called Codehouse with a primary focus on Sitecore solutions (developing, upgrading and supporting).

See Morten Cristensen posted Umbraco on Azure Series: SQL Azure on 1/10/2011 in the SQL Azure Database and Reporting section above.

David Aiken (@TheDavidAiken) explained how to add a Windows Azure Memcached plugin in a 1/11/2011 post:

Cutting a really long story short. I wanted an easy way of adding caching to a Windows Azure project. Several people I know are already using Memcached, while waiting for our own caching service to go into production. I thought, this should be a plugin.

A plugin is exactly what it sounds like: it is something that gets “plugged” into your role at build time. You can see the existing plugins you have installed by looking at C:\Program Files\Windows Azure SDK\v1.3\bin\plugins. You should see plugins for diagnostics, RemoteAccess and more.

To make your own plugin is fairly easy – if you look in any of the plugin folders you will see a .CSPLUGIN file. You should be able to work out how this works. If you cannot, you probably don’t want to be building your own.

For the Memcached plugin, all I really need is the memcached.exe file and any dependencies (I used the version from), plus a little wrapper to read the config and launch memcached with the correct parameters – and of course the CSPLUGIN file:

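<?xml version="1.0" ?>
<RoleModule xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition" namespace="Memcached64Plugin">
  <Startup>
    <Task commandLine="Memcached64Plugin.exe" taskType="background" executionContext="limited"/>
  </Startup>
  <Endpoints>
    <InternalEndpoint name="Endpoint" protocol="tcp" port="11212" />
  </Endpoints>
  <ConfigurationSettings>
    <Setting name="CacheSizeInMB"/>
  </ConfigurationSettings>
</RoleModule>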

What the above essentially does is:

  1. Adds a startup task
  2. Adds an InternalEndpoint for port 11212
  3. Adds a CacheSizeInMB setting

The little wrapper was written as a console app:

static void Main()
{
    var endpoint = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["Memcached64Plugin.Endpoint"].IPEndpoint;
    var cacheSize = RoleEnvironment.GetConfigurationSettingValue("Memcached64Plugin.CacheSizeInMB");
    var startupInfo = new ProcessStartInfo("memcached.exe", string.Format("-m {0} -p {1}", cacheSize, endpoint.Port));
    // Launch memcached with the configured cache size and port.
    Process.Start(startupInfo);
}

Note when we read the values we use the format <name of the plugin>.<key>, rather than just <key>, in this case Memcached64Plugin.Endpoint, rather than just Endpoint.

Once you build the solution and copy the files into the plugin folder, you can then use the Imports element in your ServiceDefinition.csdef file:

 <Import moduleName="Memcached64Plugin" />

This will magically add the correct settings to your ServiceConfiguration.cscfg file as shown:

      <Setting name="Memcached64Plugin.CacheSizeInMB" value="512" />

Note, you won’t see the internal port or the startup task.

When you build and deploy (or even run in the Compute Emulator) the plugin will be packaged up with your code and executed on boot.

For the client side, I used the enyim client, which you can grab from

You can grab a list of memcached servers using the RoleEnvironment, something like:

RoleEnvironment.Roles["WebRole1"].Instances.Select(instance => instance.InstanceEndpoints["Memcached64Plugin.Endpoint"].IPEndpoint)

Note the role is WebRole1 and the port we are looking for is the one we defined in the CSPLUGIN!

You should also write some code to handle a topology change, including adding more instances (or removing instances). You can do this by adding code to handle the RoleEnvironment.Changed event, and rebuilding your server list.
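One way to structure that is sketched below. MemcachedServerList is a hypothetical helper, not part of the plugin: it takes a discovery delegate (in a role, the RoleEnvironment query shown above) and swaps in a fresh snapshot whenever Rebuild is called. The wiring to RoleEnvironment.Changed appears only as a comment and assumes the Microsoft.WindowsAzure.ServiceRuntime types; treat it as a starting point rather than tested production code.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Net;

// Hypothetical helper: holds a snapshot of the memcached endpoints.
public sealed class MemcachedServerList
{
    private readonly Func<IEnumerable<IPEndPoint>> discover;
    private volatile IPEndPoint[] servers = new IPEndPoint[0];

    public MemcachedServerList(Func<IEnumerable<IPEndPoint>> discover)
    {
        if (discover == null) throw new ArgumentNullException("discover");
        this.discover = discover;
        Rebuild();
    }

    // Read-only view of the current snapshot.
    public ReadOnlyCollection<IPEndPoint> Servers
    {
        get { return Array.AsReadOnly(servers); }
    }

    // Re-run discovery and replace the snapshot in a single assignment,
    // so readers never see a half-built list.
    public void Rebuild()
    {
        servers = discover().ToArray();
    }
}

// Wiring inside a role (assumes the Windows Azure ServiceRuntime assembly):
//
// var list = new MemcachedServerList(() =>
//     RoleEnvironment.Roles["WebRole1"].Instances.Select(instance =>
//         instance.InstanceEndpoints["Memcached64Plugin.Endpoint"].IPEndpoint));
//
// RoleEnvironment.Changed += (sender, e) =>
// {
//     // Instances were added or removed: refresh the server list.
//     if (e.Changes.OfType<RoleEnvironmentTopologyChange>().Any())
//         list.Rebuild();
// };
```

Keeping discovery behind a delegate means the list logic can be exercised outside the Compute Emulator, and the memcached client can be reconfigured from Servers after each rebuild.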

You can grab my plugin from here, minus the memcached.exe and required dll. You can download those from the


PS: No, this was nothing to do with PowerShell, I hit an issue with that which I’m still working on – cross your fingers – it will be splendid.

Bruce Kyle invited everyone on 1/11/2011 to Join in Dynamics CRM 2011 Virtual Launch Event on 1/20/2011:

Mark your calendar and attend the global virtual launch event for Dynamics CRM 2011 on Thursday, January 20 at 9am Pacific Time. Register for the virtual launch event at


Steve Ballmer will introduce Microsoft Dynamics CRM 2011. With the launch of Microsoft Dynamics CRM Online 2011 in 40 markets and 41 languages, CRM Online will be more interesting than ever to all partners and your customers.

Getting Started with Dynamics 2011 CRM

For developers and partners looking to get started with CRM 2011 please be sure to check out this whitepaper on

Building Business Applications with Microsoft Dynamics CRM 2011: A guide to Independent Software Vendors and Developers

This white paper is a helpful guide for ISVs and developers to build line of business applications using Microsoft Dynamics CRM 2011 and the Microsoft platform. For technical decision makers, it is a valuable resource to understand what the xRM Framework, which underpins Microsoft Dynamics CRM 2011, has to offer.


Visual Studio LightSwitch

Robert Green (@rogreen_ms) answered Where Do I Put My Data Code In A LightSwitch Application? in a 1/10/2011 blog post:

In my previous post (see Using Both Remote and Local Data in a LightSwitch Application), I started building a demo application that a training company’s sales rep can use to manage customers and the courses they order. The application includes both remote data and local data. The remote data consists of the Courses, Customers and Orders tables in the TrainingCourses SQL Server database. The local data consists of the Visits and CustomerNotes tables in a local SQL Server Express database.


The application currently contains four screens:

  • CustomerList is the application’s main screen and shows the list of customers.
  • CustomerDetail is used for adding and viewing/editing customers
  • CourseList is used for viewing courses. I am using the default LightSwitch generated screens for adding and editing courses.
  • NewOrder is used to add a course order for a customer.

In this post, I want to show you some of the code I have written for this application and I want to discuss where the code goes. I’m not going to do a deep-dive on all the events in LightSwitch and when they execute. For that, check out Prem Ramanathan’s Overview of Data Validation in LightSwitch Applications. I’m also not going to do a deep-dive on data validation. Rather, I want to do more of an introduction to the types of things you will think about when you write code to perform specific tasks.

We are going to review the following data-related tasks:

  • Generate default values for customer region and country.
  • Calculate total price for a course order based on attendees and price.
  • Show year to date order totals for each customer.
  • Validate that courses can’t be scheduled less than 7 days in advance.
  • Retrieve the price when the user selects a course for a new order.
  • Validate that the sales rep doesn’t discount courses by more than 20%.
  • Prompt the user to confirm deletion of a customer.

Generate default values for customer region and country

When the user creates a new customer, I want to default the Region to “WA” and the Country to “USA”. You can do this at the screen level with the following code:

' VB
Public Class CustomerDetail
  Private Sub CustomerDetail_Loaded()
    If Me.CustomerId.HasValue Then
      ' Editing: load the existing customer.
      Me.Customer = Me.CustomerQuery
    Else
      ' Adding: create a customer and apply the defaults.
      Me.Customer = New Customer
      Me.Customer.Region = "WA"
      Me.Customer.Country = "USA"
    End If
  End Sub
End Class

// C#
public partial class CustomerDetail
{
  partial void CustomerDetail_Loaded()
  {
    if (this.CustomerId.HasValue)
    {
      this.Customer = this.CustomerQuery;
    }
    else
    {
      this.Customer = new Customer();
      this.Customer.Region = "WA";
      this.Customer.Country = "USA";
    }
  }
}

To do this, open the screen and select CustomerDetail_Loaded from the Write Code button’s drop-down list. This method runs after the screen is displayed and runs on the client.


This works fine if the only place the user ever adds a customer is the CustomerDetail screen. But if customers can get added in other screens or in code, and you want the region and country to default to WA and USA, then this code should be at the entity-level.

' VB
Public Class Customer
  Private Sub Customer_Created()
    Me.Region = "WA"
    Me.Country = "USA"
  End Sub
End Class

// C#
public partial class Customer
{
  partial void Customer_Created()
  {
    this.Region = "WA";
    this.Country = "USA";
  }
}

To do this, open the Customer entity and select Customer_Created from the Write Code button’s drop-down list. This method runs after the item is created and runs on the tier where the item was created.


Calculate total price for a course order based on attendees and price

To add a new order, you specify the customer and the course they are purchasing. You then specify the number of attendees and the price of the course. The total price of the order is number of attendees times the price. This calls for a computed field. I added a TotalPrice property to the Orders table and made it a Money type. I clicked Edit Method in the Properties window and added the following code:

' VB
Public Class Order
  Private Sub TotalPrice_Compute(ByRef result As Decimal)
    result = Me.Attendees * Me.Price
  End Sub
End Class

// C#
public partial class Order
{
  partial void TotalPrice_Compute(ref decimal result)
  {
    result = this.Attendees * this.Price;
  }
}

TotalPrice_Compute is a method in the Order class, so it is attached to the data. That means that whether you add a new order by hand in the UI or in code, the total price will always be calculated based on the attendees and price. This is clearly the right place for this code.

Show year to date order totals for each customer

This application is used by a sales rep, so it would be very useful to easily see year-to-date revenue for each customer. So I added a YearToDateRevenue computed property to the Customers table and used a LINQ query to retrieve the data. (If you are new to LINQ, check out the LINQ topic in the online Visual Studio documentation.)

' VB
Public Class Customer
  Private Sub YearToDateRevenue_Compute(ByRef result As Decimal)
    result = Aggregate order In Me.Orders
             Where order.OrderDate.Year = Date.Today.Year
             Into Sum(order.Attendees * order.Price)
  End Sub
End Class

// C#
public partial class Customer
{
  partial void YearToDateRevenue_Compute(ref decimal result)
  {
    result = (from order in this.Orders
              where order.OrderDate.Year == DateTime.Today.Year
              select (decimal)order.Attendees * order.Price).Sum();
  }
}

YearToDateRevenue is a property of the Customer entity, so I can simply add it to the CustomerList screen, as well as the CustomerDetail screen. And I know that any time I access a customer, I automatically retrieve the year to date revenue.


When I run this application, I notice that the customers appear first and then the revenue figures appear, one at a time, as the query and looping occur for each customer. Of course, I am using Beta 1 and it is not optimized for performance. I have no doubt that when LightSwitch releases, this task will be faster. However, what if I have dozens of customers or hundreds of orders? I may have to rethink this and not use a computed property. I may move the query into the CustomerDetail screen so the query runs once each time I view a customer.
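If the per-customer computed property proves too slow, the same calculation can be run once for the customer being viewed. The sketch below shows only the LINQ logic with plain classes; OrderRecord and RevenueCalculator are hypothetical names standing in for the LightSwitch-generated Order entity, and how the result would be bound to the CustomerDetail screen is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Order entity.
public class OrderRecord
{
    public DateTime OrderDate { get; set; }
    public int Attendees { get; set; }
    public decimal Price { get; set; }
}

public static class RevenueCalculator
{
    // Sums Attendees * Price over the orders placed in the same
    // calendar year as 'today'. Returns 0 when nothing matches.
    public static decimal YearToDate(IEnumerable<OrderRecord> orders, DateTime today)
    {
        return orders
            .Where(o => o.OrderDate.Year == today.Year)
            .Sum(o => o.Attendees * o.Price);
    }
}
```

Passing the date in, rather than reading DateTime.Today inside the method, keeps the calculation easy to check against known orders.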

Validate that courses can’t be scheduled less than 7 days in advance

In our scenario, customers buy courses for a particular date. So an order has an order date and a course date, which is when the course is scheduled. I want to enforce a rule that the sales rep can’t schedule a course less than 7 days ahead of time. There are two ways to do this: at the property level and at the entity level.

To validate this at the property level, I can write code to check the course date.

' VB
Public Class Order
  Private Sub CourseDate_Validate(ByVal results As _
    EntityValidationResultsBuilder)

    If Me.CourseDate < Me.OrderDate.AddDays(7) Then
      results.AddPropertyError(
        "Courses can't be scheduled less than 7 days in advance")
    End If
  End Sub
End Class

// C#
public partial class Order
{
  partial void CourseDate_Validate(
    EntityValidationResultsBuilder results)
  {
    if (this.CourseDate < this.OrderDate.AddDays(7))
      results.AddPropertyError(
        "Courses can't be scheduled less than 7 days in advance");
  }
}

Tip: I can get to CourseDate_Validate in a number of ways:

  • Open Orders in the Entity Designer. Select CourseDate and click Custom Validation in the Properties window.
  • Open Orders in the Entity Designer. Click the Write Code button. Then select CourseDate_Validate from the members drop-down list.
  • Right-click on Orders in the Solution Explorer and select View Table Code. Then select CourseDate_Validate from the members drop-down list.

As soon as I wrote that code, it occurred to me that I need the same code in the OrderDate_Validate method. Rather than duplicate this code, I can make this an entity level validation by moving this code to the Orders_Validate method.

' VB
Public Class TrainingCoursesDataService
  Private Sub Orders_Validate(ByVal entity As Order,
    ByVal results As EntitySetValidationResultsBuilder)

    If entity.CourseDate < entity.OrderDate.AddDays(7) Then
      results.AddEntityError(
        "Courses can't be scheduled less than 7 days in advance")
    End If
  End Sub
End Class

// C#
public partial class TrainingCoursesDataService
{
  partial void Orders_Validate(Order entity,
    EntitySetValidationResultsBuilder results)
  {
    if (entity.CourseDate < entity.OrderDate.AddDays(7))
      results.AddEntityError(
        "Courses can't be scheduled less than 7 days in advance");
  }
}

Notice that this method is in the TrainingCoursesDataService class, not the Order class. That is why I use entity instead of Me and AddEntityError instead of AddPropertyError.

That also explains why you can’t get to this method from inside Order.vb.


To get to this method, open the Orders table in the Entity Designer and select Orders_Validate from the Write Code button’s drop-down list.


What are the pros and cons of property level validations vs. entity level? Property level validations occur on the client as soon as the user moves off a control. When this code is in the CourseDate_Validate and OrderDate_Validate methods of the Order class, I see the error as soon as I tab out of the course date control.


Entity-level validations run on the server and don’t run until the user clicks Save. When this code is in the Orders_Validate method of the TrainingCoursesDataService class, I don’t see the error as it occurs. I have to wait until I click Save.


So there is a UI tradeoff. But there is also a scalability tradeoff. If I use entity level validation, the validation occurs on the middle tier. Do I really want to involve the middle tier for a simple comparison of two dates? Yes, there is a little duplication of code, but it is more efficient at runtime to have the validation occur on the client.

Retrieve the price when the user selects a course for a new order

When the user enters an order, he or she selects a course and then enters the course date, number of attendees and the price. The user should not have to enter the price. The application should retrieve the price from the course record and display it on the screen.

If you are not well versed in the multi-tier application arts, you might think about putting the code to get the price in the screen. So you go into the NewOrder screen’s code behind and start looking for events tied to the various UI elements. You won’t find them. The only thing you will find is events related to the screen and the queries it uses.


LightSwitch applications are built on the classic three-tier architecture. See The Anatomy of a LightSwitch Application Series on the LightSwitch team blog for the full story. The presentation tier handles data entry and visualization. Queries and validation occur in the logic tier. So the code to retrieve the course price won’t be in the screen. It will be part of the entity.


To get the price of the course, I add one line of code to the Order class’s Course_Changed method. Every time the course changes for an order, the order’s price is immediately updated.

' VB
Public Class Order
  Private Sub Course_Changed()
    Me.Price = Me.Course.Price
  End Sub
End Class

// C#
public partial class Order
{
  partial void Course_Changed()
  {
    this.Price = this.Course.Price;
  }
}

Validate that the sales rep doesn’t discount courses by more than 20%

When placing an order, the sales rep can apply a discount and offer a course for less than its list price. I want to make sure the discount doesn’t exceed 20%. In the Order class’s Price_Validate method I will check if the price is less than the lowest discounted price.

The first thing I need to do is calculate the lowest acceptable price. I am going to do that right after I retrieve the course’s price. So I create a lowestPrice field and calculate the lowest price after retrieving the course’s price. I then validate the price in the Order class’s Price_Validate method:

' VB
Public Class Order
  Private lowestPrice As Decimal = 0
  Private Sub Course_Changed()
    Me.Price = Me.Course.Price
    lowestPrice = Me.Course.Price * 0.8D
  End Sub

  Private Sub Price_Validate(ByVal results As _
    EntityValidationResultsBuilder)
    ' Users can set a price before selecting a course.
    ' If so, don't validate the discount.
    If lowestPrice > 0 AndAlso Me.Price < lowestPrice Then
      results.AddPropertyError(String.Format(
        "No discounts of more than 20%. " &
        "Price can't be less than {0:C}", lowestPrice))
    End If
  End Sub
End Class

// C#
public partial class Order
{
  private decimal lowestPrice = 0;

  partial void Course_Changed()
  {
    this.Price = this.Course.Price;
    lowestPrice = this.Course.Price * .8M;
  }

  partial void Price_Validate(EntityValidationResultsBuilder results)
  {
    // Users can set a price before selecting a course.
    // If so, don't validate the discount.
    if (lowestPrice > 0 && this.Price < lowestPrice)
      results.AddPropertyError(string.Format(
        "No discounts of more than 20%. " +
        "Price can't be less than {0:C}", lowestPrice));
  }
}

The user could enter a price before selecting a course, so the code first checks if lowestPrice has a positive value. It will if the user selected a course. Then the code checks for the amount of the discount.
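To make the arithmetic concrete, here is the same rule as a standalone sketch outside LightSwitch. DiscountRule and its members are hypothetical names introduced only for illustration; the 20% limit and the "skip the check until a course is selected" behavior come from the code above.

```csharp
public static class DiscountRule
{
    // Maximum discount allowed by the rule in the post: 20%.
    private const decimal MaxDiscount = 0.20m;

    // Lowest acceptable price for a given list price.
    public static decimal LowestPrice(decimal listPrice)
    {
        return listPrice * (1 - MaxDiscount);
    }

    // True when the offered price respects the discount limit.
    // A list price of zero means no course has been selected yet,
    // so nothing is checked (mirrors the lowestPrice > 0 test).
    public static bool IsAllowed(decimal listPrice, decimal offeredPrice)
    {
        decimal lowest = LowestPrice(listPrice);
        return lowest <= 0 || offeredPrice >= lowest;
    }
}
```

For a course listed at $1,000, the lowest acceptable price is $800: an offer of $800 passes, while $799.99 is rejected.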

Prompt the user to confirm deletion of a customer

The last thing I want to look at in this post is deleting. When you create a detail screen, there is no delete button by default. You can easily add one by selecting Delete from the Command Bar’s Add button’s drop-down list.


However, you can’t write your own code to run when the user presses that Delete button. Which means you can’t prompt to see if the user really wants to delete the customer or whatever. So I added my own Delete button by selecting New Button instead. I named it DeleteButton and set its Display Name to Delete. Then I right-clicked and selected Edit Execute Code. Then I wrote code to prompt and only delete if confirmed.


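' VB
Public Class CustomerDetail
  Private Sub DeleteCustomer_Execute()
    If ShowMessageBox("Do you want to delete this customer?",
                      "Customer", MessageBoxOption.YesNo) =
                      Windows.MessageBoxResult.Yes Then
      Me.Customer.Delete()  ' Delete locally
      Me.Save()             ' Make the deletion permanent
      Me.Close(False)       ' Close without prompting to save
    End If
  End Sub
End Class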

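// C#
public partial class CustomerDetail
{
  partial void DeleteCustomer_Execute()
  {
    if (this.ShowMessageBox(
      "Do you want to delete this customer?",
      "Customer", MessageBoxOption.YesNo) ==
      System.Windows.MessageBoxResult.Yes)
    {
      this.Customer.Delete();  // Delete locally
      this.Save();             // Make the deletion permanent
      this.Close(false);       // Close without prompting to save
    }
  }
}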

To delete the customer, I call the Delete method of the Customer object. That deletes the customer locally. To make this permanent I call the Save method of the screen. Then I close the CustomerDetail form. I pass False to Close indicating I do not want the user prompted to save changes.


This post explored how to accomplish in LightSwitch some of the standard things we have all done in every data application we have ever written. I love doing things like this because this is how I learn. I am a data guy and always have been. Whenever I learn a new technology like LightSwitch, I want to know how I add, edit, delete and query data. I know what I want to do. The only questions are what code do I write and where do I put it? Hopefully, the examples here will help you answer those questions in your own LightSwitch applications.

Check out Performing Data-Related Tasks by Using Code for examples of how to do many of these same tasks. Note that the examples there are purely code focused whereas the examples here are more driven from the UI.


Windows Azure Infrastructure

My (@rogerjenn) updated 15-Day Extensions Offered for Windows Azure Platform 30-Day Passes post of 1/12/2011 described what I initially assumed to be a 30-day extension to a Windows Azure benefit:

• Updated 1/12/2011 11:30 AM: Extensions are 15, not 30, days; see end of post.

On 1/12/2011 I received the following message from


I have only one Windows Azure Pass subscription in my Microsoft Online Services Customer Portal:


My other active subscriptions are Cloud Essentials for Partners and Windows Azure Platform MSDN Premium, which have a one-year duration.

Note: For more details about my Cloud Essentials for Partners subscription, see my Windows Azure Compute Extra-Small VM Beta Now Available in the Cloud Essentials Pack and for General Use post of 1/9/2011. I was surprised to find that my Windows Azure platform Cloud Essentials for Partners subscription was reduced from one-year to one-month duration:


None of my active subscriptions are set to expire on 1/12/2011, but I decided to give the extension a try.

Clicking the link at the bottom of the window opened this form with a disabled Windows Live ID text box. Clicking the Signin button inserted my current Windows Live ID:


Clicking the Extend Login button produced this page:


• Updated 1/12/2011 11:30 AM: Extensions are 15 (not 30) days, so post title changed.

After more than one hour, I received the following (disappointing) confirmation message:


I assume that the correct wording is:

Your account will now expire on 1/27/2011 11:21 PM. At that point all of your data will be erased.

Don’t the developers proofread automated emails?

It’s not clear to me if multiple extensions will be available. Stay tuned.

Adron Hall (@adronbh) explained Windows Azure Web, Worker, and CGI Roles – How They Work in this 1/12/2011 post:

This is a write up I’ve put together of how the roles in Windows Azure work. As far as I know, this is all correct – but if there are any Windows Azure Team Members out there that wouldn’t mind providing some feedback about specifics or adding to the details I have here – please do add comments! :)

Windows 2008 and Hyper-V

Windows Azure is built on top of Windows 2008 and Hyper-V. Hyper-V provides virtualization to the various instance types and allocation of resources to those instances. Windows 2008 provides the core operating system functionality for those systems and the Windows Azure Platform Roles and Storage.

The hypervisor that a Hyper-V installation implements does a few unique things compared to many of the other virtualization offerings in the industry. Xen (The Open Source Virtualization Software that Amazon Web Services use) & VMWare both use a shared resource model for utilization of physical resources within a system. This allows for more virtualized instances to be started per physical machine, but can sometimes allow hardware contention. On the other hand Hyper-V pins a particular amount of resources to a virtualized instance, which decreases the number of instances allowed on a physical machine. This enables Hyper-V to prevent hardware contention though. Both designs have their plusses and minuses and in cloud computing these design choices are rarely evident. The context however is important to know when working with high end computing within the cloud.

Windows Azure Fabric Controller

The Windows Azure Fabric Controller is kind of the magic glue that holds all the pieces of Windows Azure together. The Azure Fabric Controller automates all of the load balancing, switches, networking, and other networking configuration. Usually within an IaaS environment you’d have to set up the load balancer, static IP address, and internal DNS that allow for connection and routing by the external DNS, as well as the switch configurations, the DMZ, and a host of other configuration and ongoing maintenance. With the Windows Azure Platform and the Fabric Controller, all of that is taken care of entirely. Maintenance for these things goes to zero.

The Windows Azure Fabric Controller has several primary tasks: networking, hardware, and operating system management, service modeling, and life cycle management of systems.

The low level hardware that the Windows Azure Fabric Controller manages includes switches, load balancers, nodes, and other network elements. In addition it manipulates the appropriate internal DNS and other routing needed for communication within the cloud so that each URI is accessed seamlessly from the outside.

The service modeling that the Fabric Controller provides is to map the topology of services, port usage, and, as mentioned before, the internal communication within the cloud. All of this is done by the Fabric Controller without any interaction other than creating an instance or storage service within Windows Azure.

The operating system management from the Fabric Controller involves patching the operating system to assure that security, memory and storage, and other integral operating system features are maintained and optimized. This allows the operating system to maintain uptime and application performance characteristics that are optimal.

Finally, the Fabric Controller has the responsibility for service life cycle. This includes updates and configuration changes across upgrade domains and fault domains. The Fabric Controller does so in a way that maintains uptime for the services.

Each role instance is monitored by the Fabric Controller so that if it stops responding it is recycled and a new instance takes over. This can sometimes take several minutes, and is a core reason why the 99.95% uptime SLA requires two role instances to be running. In addition, the instance that is recycled is rebuilt from scratch, destroying any data stored on the role instance itself. This is where Windows Azure Storage plays a pivotal role in maintaining Windows Azure Cloud Applications.

Web Role

The Windows Azure Web Role is designed as a simple-to-deploy hosting platform for IIS web sites and services. It can host any .NET-related web site, such as ASP.NET, ASP.NET MVC, MonoRail, and more.

The Windows Azure Web Role provides this service hosting with a minimal amount of maintenance required. No routing or load balancing setup is needed; everything is handled by the Windows Azure Fabric Controller.

Uses: Hosting ASP.NET, ASP.NET MVC, MonoRail, or other .NET-related web sites in a managed, high uptime, highly resilient, controlled environment.

Worker Role

A worker role can be used to host any number of things that need to pull, push, or run continuously without any particular input. A worker role can be used to set up a scheduler or other type of service. This provides a role dedicated to what could closely be compared to a Windows Service. The options and capabilities of a Worker Role, however, vastly exceed those of a simple Windows Service.

CGI Role

This service role is designed to allow execution of technology stacks such as Ruby on Rails, PHP, Java, and other non-Microsoft options.

Windows Azure Storage

Windows Azure Storage is broken into three distinct features within the service: tables, blobs, and queues. Any of the Windows Azure Roles can also connect to the storage to maintain data across service lifecycle reboots, refreshes, and any temporary loss of a Windows Azure Role.

A note about Windows Azure Storage compared to most Cloud Storage Providers: None of the Azure Storage Services are “eventually consistent”. When a write is done, it is instantly visible to all subsequent readers. This simplifies coding but slows down the data storage mechanisms more than eventually consistent data architectures.

I noted the relatively slow speed of updates to the tables of my OakLeaf Systems Azure Table Services Sample Project - Paging and Batch Updates Demo in my Speed ServiceContext.SaveChanges() Execution with the SaveChangesOptions.Batch Argument post of 12/20/2010.

For more details about Windows Azure Storage, see the TechNet Wiki’s Understanding Data Storage Offerings on the Windows Azure Platform page.

The Windows Azure Storage Team’s Windows Azure Storage Architecture Overview of 12/30/2010 explains the use of the Distributed File System to maintain storage replicas:

In this posting we provide an overview of the Windows Azure Storage architecture to give some understanding of how it works. Windows Azure Storage is a distributed storage software stack built completely by Microsoft for the cloud.

Before diving into the details of this post, please read the prior posting on Windows Azure Storage Abstractions and their Scalability Targets to get an understanding of the storage abstractions (Blobs, Tables and Queues) provided and the concept of partitions.

3 Layer Architecture

The storage access architecture has the following 3 fundamental layers:

  1. Front-End (FE) layer – This layer takes the incoming requests, authenticates and authorizes the requests, and then routes them to a partition server in the Partition Layer. The front-ends know what partition server to forward each request to, since each front-end server caches a Partition Map. The Partition Map keeps track of the partitions for the service being accessed (Blobs, Tables or Queues) and what partition server is controlling (serving) access to each partition in the system.
  2. Partition Layer – This layer manages the partitioning of all of the data objects in the system. As described in the prior posting, all objects have a partition key. An object belongs to a single partition, and each partition is served by only one partition server. This is the layer that manages what partition is served on what partition server. In addition, it provides automatic load balancing of partitions across the servers to meet the traffic needs of Blobs, Tables and Queues. A single partition server can serve many partitions.
  3. Distributed and replicated File System (DFS) Layer – This is the layer that actually stores the bits on disk and is in charge of distributing and replicating the data across many servers to keep it durable. A key concept to understand here is that the data is stored by the DFS layer, but all DFS servers are (and all data stored in the DFS layer is) accessible from any of the partition servers.

These layers and a high level overview are shown in the below figure:


Here we can see that the Front-End layer takes incoming requests, and a given front-end server can talk to all of the partition servers it needs to in order to process the incoming requests. The partition layer consists of all of the partition servers, with a master system to perform the automatic load balancing (described below) and assignments of partitions. As shown in the figure, each partition server is assigned a set of object partitions (Blobs, Entities, Queues). The Partition Master constantly monitors the overall load on each partition server as well as the individual partitions, and uses this for load balancing. Then the lowest layer of the storage architecture is the Distributed File System layer, which stores and replicates the data, and all partition servers can access any of the DFS servers.
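The routing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `authorized` flag, the server names (`ps1`, `ps2`) and the use of sorted key ranges to define partitions are all hypothetical and are not taken from the Storage Team's post, which does not describe the actual data structures.

```python
from bisect import bisect_right


class PartitionMap:
    """Hypothetical partition map: sorted key-range starts -> partition server."""

    def __init__(self, range_starts, servers):
        # range_starts[i] is the lowest partition key owned by servers[i].
        if not range_starts or len(range_starts) != len(servers):
            raise ValueError("need exactly one server per key range")
        if list(range_starts) != sorted(range_starts):
            raise ValueError("range starts must be sorted")
        self.range_starts = list(range_starts)
        self.servers = list(servers)

    def lookup(self, partition_key):
        # Find the last range whose start is <= the key.
        index = bisect_right(self.range_starts, partition_key) - 1
        if index < 0:
            raise KeyError(partition_key)
        return self.servers[index]

    def reassign(self, range_start, new_server):
        # Load balancing: hand one partition to a different partition server.
        self.servers[self.range_starts.index(range_start)] = new_server


class FrontEnd:
    """Checks authorization, then routes a request using its cached map."""

    def __init__(self, partition_map):
        self.partition_map = partition_map

    def route(self, request):
        if not request.get("authorized", False):
            raise PermissionError("request rejected by front end")
        return self.partition_map.lookup(request["partition_key"])


# Three partitions; server ps1 serves two of them, ps2 serves one.
partition_map = PartitionMap(["", "h", "p"], ["ps1", "ps2", "ps1"])
front_end = FrontEnd(partition_map)
print(front_end.route({"authorized": True, "partition_key": "kiwi"}))
```

Each partition has exactly one serving partition server, while one server may serve several partitions, which matches the description of the Partition Layer. Calling `reassign` stands in for the Partition Master moving a hot partition to another server.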

The article continues with these sections:

    • Fault Domains and Server Failures
    • Upgrade Domains and Rolling Upgrade
    • DFS Layer and Replication
    • Geo-Replication
    • Load Balancing Hot DFS Servers
    • Why Both a Partition Layer and DFS Layer?

Buck Woody published Windows Azure Learning Plan – Compute as another member of his series on 1/11/2011:

This is one in a series of posts on a Windows Azure Learning Plan. You can find the main post here. This one deals with the "compute" function of Windows Azure, which includes Configuration Files, the Web Role, the Worker Role, and the VM Role. There is a general programming guide for Windows Azure that you can find here to help with the overall process.

Configuration Files

Configuration Files define the environment for a Windows Azure application, similar to an ASP.NET application. This section explains how to work with these.

General Introduction and Overview

Service Definition File Schema

Service Configuration File Schema 

Windows Azure Web Role

The Web Role runs code (such as ASP pages) that requires a User Interface.

Web Role "Boot Camp" Video

Web Role Deployment Checklist 

Using a Web Role as a Worker Role for Small Applications

Windows Azure Worker Role

The Worker Role is used for code that does not require a direct User Interface.

Worker Role "Boot Camp" Video

Worker Role versus Web Roles

Deploying other applications (like Java) in a Windows Azure Worker Role

Windows Azure VM Role

The Windows Azure VM Role is an Operating System-level mechanism for code deployment.

VM Role Overview and Details

The proper use of the VM Role

Robert Duffner posted Thought Leaders in the Cloud: Talking with Chris Auld, CTO at Intergen Limited and Windows Azure MVP to the Windows Azure Team blog on 1/11/2011:

Chris Auld [pictured at right] is a Microsoft MVP, the CTO at Intergen Limited and a director of Locum Jobs Startup MedRecruit. Trained as an attorney, Chris chose to pursue a career with emerging technologies instead of practicing law. He is widely known for his evangelical, arm-waving style, as well as for his enthusiasm and drive.

In this interview we discuss:

  • Cloud computing as a business, rather than technological, innovation
  • Scenarios that utilize the cloud's elastic capabilities
  • The red herring of security vs. the real issue of sovereignty
  • Laws are unlikely to catch up, so hybrid clouds, with things like the Azure appliance, will become the way this is navigated
  • A key challenge in porting apps to the cloud is that their data tier was architected for vertical scaling, and the cloud provides horizontal data scaling
  • The success of the cloud is "just math", as you're paying for average usage. With on-premises you're paying for peak usage
  • Azure stands out as a "platform that is designed to give you the building blocks to build elastic, massive-scale applications"

Robert Duffner: Chris, could you take a moment to introduce yourself?

Chris Auld: I am the Chief Technology Officer at a company called Intergen; we're a reasonably large Microsoft Gold partner based out of Australia and New Zealand. I've got a pretty long background with Microsoft technologies, and most recently, I have been focused quite significantly on the Windows Azure platform.

I'm one of about 25 Windows Azure MVPs worldwide, with my particular focus being on Azure architecture. MVPs are members of the community who have a lot to say about Microsoft technology and who provide support and guidance in the community. I've done a significant amount of presenting and training delivery on Windows Azure around the globe.

For example, I'm in New Zealand this week, and I will be in Australia the week after next to do some Azure training courses. Last week, I was at TechEd Europe in Berlin, and at the Oredev Conference in Malmo, Sweden, delivering talks on Windows Azure architecture.

Robert: You've said that cloud computing isn't a technological innovation as much as a business one, and that it's really a new model to procure computing. Can you expand a little bit about that?

Chris: The architectural patterns and implementation approaches that we take with Windows Azure applications are the same ones we've implemented for many, many years. And the thinking around the scale-out architectures that we're building today is the same as the thinking around what I was building back in the 'dot com' timeframe with classic ASP.

Where cloud computing is really unique is that it offers a very different way for us to be able to procure computing power. And in particular, to be able to procure computing power on an elastic basis. So there are significant new opportunities that are opened up by virtue of being able to buy very large amounts of computing resources for very short periods of time, for example.

Robert: You've also said that the cloud's unique selling proposition is elasticity. What are some of the scenarios that have highly elastic needs?

Chris: The canonical one that I always use is selling tickets to sporting events. Typically, your website may be selling a handful of tickets each and every day, but when a very popular event goes on sale, you can expect to sell hundreds of thousands of tickets over a time period as short as, say, five to ten minutes. We see similar patterns in other business scenarios as well.

Another good example would be the ability to use the cloud to spin up a super computer for a temporary load. Maybe you're a mining company or a minerals exploration company, and you get some seismic data that you need to analyze rapidly.

Being able to spin up a super computer for a couple of days and then turn it back off again is really valuable, because it means that you don't have the cost of carrying all of that capital on your balance sheet when you don't actually need to use it.

Robert: Background-wise, you come into technology with a law degree. As you look at the cloud, where the technology really is outpacing legislation, how do you think your law background informs the way you view the cloud?

Chris: Some of the legal stuff around the cloud remains somewhat intractable. I obviously do a lot of presenting around this stuff, and I usually start by asking people in the audience how many of them are concerned about cloud security, and it typically is everybody.

I'm not particularly concerned about cloud security, because there's really nobody I would trust more with my data than a really large, multinational technology company like Microsoft or some of the other major cloud vendors. The more interesting thing, in terms of the legal stuff, is data sovereignty. That's really thinking about what laws apply when we start working with cloud computing.

If my app is in Singapore, but the Singaporean datacenter is owned by a Belgian company that happens to have a sales office in Reno, what laws apply to my data? What privacy law applies? What competition law applies? What legal jurisdiction applies? Who can get search warrants to look at my data and so forth?

Those are some very hard problems, and in fact, my law degree doesn't particularly help me solve them. Indeed, the law in general really struggles to answer those sorts of questions at the moment. Those legal and sovereignty questions may be the hardest questions in cloud computing.

Robert: In Switzerland, customer financial information has to reside in the country, and moreover, only Swiss citizens can actually look at that data. So unless you have Swiss citizens in your call centers in Dublin or Mumbai, you start to see challenges.

Chris: That, in some ways, determines who can actually run your data center, who can be operating your servers. Some of those laws can become quite pervasive.

Robert: At some point, that is just going to become technologically untenable. Do you have any thoughts on that? Do you think that eventually there'll be a lot of pressure to change laws?

Chris: I think there will. Technology is outpacing the law already, and we see it across many areas. For instance, in New Zealand we have things called "name suppression orders," and there's been a whole load of issues with suppression orders. What happens with bloggers? What happens depending on where the data happens to be housed, and so forth?

So technology is massively outpacing the law at the moment. If you think about how we might handle these sorts of complex, multi-jurisdiction, conflict-of-laws kind of issues traditionally, we'd sit up and we'd put together a multilateral treaty or some sort of international treaty.

But of course, in the IT industry, we move at the sort of pace where we're shipping new functionality every couple of weeks. And specifically, cloud computing vendors are shipping new releases of their technology and platform every few months. An international treaty can take many years to negotiate.

Can you imagine the sorts of negotiations that would need to occur for various jurisdictions around the world to be prepared to cede legal sovereignty for information that might be domiciled within their country? I don't have any degree of optimism that the law will actually catch up. I think the approach that needs to be taken is this idea of a hybrid approach. You need to have a broad range of options as to what cloud computing means for you.

Cloud computing, for some customers, does mean a true public cloud, with massive-scaled, highly nested workloads. For other customers, it means a private cloud, where they are a large organization, particularly a government entity, and they want to have a private cloud.

For other customers, the cloud's just not suitable at all, particularly if they need absolute control over their data. One of the benefits of working with some of the Windows Azure stuff that we find is it's actually pretty easy to work across all of those scenarios.

To take the Microsoft Windows Azure cloud offering as an example, the option is forthcoming to drop something like a Windows Azure appliance which will let you run the same apps in your private cloud as in the public cloud. To me, that's particularly beneficial for large corporations and federal government, where they may sell it to other government departments.

At the end of the day, we're working with standard Windows technologies, which we've worked with for a long time, but we can pick up and deploy into on-premises environments just as easily.

Robert: That's a good segue, because we did announce an Azure platform appliance, primarily to give customers an on-premises solution. Where do you think this is going, Chris? Do you think this is just a short-term issue, and that once trust and legal issues are worked out, everything will go to the public cloud? Or do you think customers are always going to need private cloud options?

Chris: I think customers are always going to be interested in private cloud options, particularly in things like the public sector. And I think we need to draw a strong distinction between what is really a true cloud computing offering and what's really just virtualization in drag. To me, true cloud computing offerings require a pretty significant scale. People who look at the Windows Azure appliance need to know that it's going to be a large-scale investment and a large-scale deployment.

If you think back to what we discussed at the start, one of the key reasons you want that large scale is because you want to have, effectively, spare computing capacity that you can tap into elastically. By having a large-scale deployment shared by many, many people, the cost to carry that additional capacity is shared across all of those customers. Some of the key scenarios where I see the Windows Azure appliance really working well are things like government.

For example, you may have a national government that chooses to deploy a Windows Azure appliance, and then sells that Windows Azure appliance to other government agencies within that national government. And based on the fact that they are selling it and actually applying a true pricing model and ideally, maybe applying some sort of differential pricing, they can encourage those government agencies to move their load around based on the price.

So if it's more expensive to run computing workloads during the day than it is at night, you'd expect organizations such as a meteorological office or a big university who want to use the cloud for number crunching to move their loads into off peak time zones.

To me, one of the key things that we need to see from true private clouds is massive scale. And to me, massive scale means at least one order of magnitude larger than the largest elastic workload; that's my sort of rule of thumb.

You also need to have a suitable pricing system. I think there'd be an internal marketplace in which people would buy that computing power. If you buy a private cloud and then apply that as an overhead charge across all of your departments in your business or government, it's simply not going to work. Because it's not going to economically drive a sort of behavior that will optimize your usage of computing.

Robert: That's a very good point. James Urquhart recently put up a post entitled "Moving to Versus Building for Cloud Computing," where he says that many applications can't just be ported over to the cloud. That post really holds up Netflix as an organization that's completely architected around public cloud services. What's your advice to organizations that have lots of legacy applications on how to be competitive against startups that can fully embrace the cloud from day one?

Chris: Moving to the cloud is very hard, because historically, people have not typically architected their applications for aggressive scale-out scenarios. Typically, people would have thought of scaling out in the application tier. But often, they will not have thought of scaling out in the data tier, and that's actually something that's really important to all of the cloud platforms and Windows Azure in particular.

I think organizations that are looking at how they mature their current on-premises set need to really take a hard look at the data tier. And looking at that, they need to ask how they can partition their data tier. How can they get their data tier to scale out horizontally, rather than taking the on-premises approach, which is just buying a bigger SQL server?

When you think about scaling the database tier on premises, you just buy a bigger box. If you think about scaling a database tier in Windows Azure, you really are all about taking SQL Azure and partitioning your database.

So to me, most of the focus needs to be around the data tier for these applications. If people can solve the data tier, it's going to massively reduce the impact of trying to migrate into one of the clouds.
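A minimal sketch of what partitioning the data tier can look like, assuming a hash of a hypothetical customer ID picks the shard. The in-memory dictionaries below merely stand in for separate databases; the function and class names are illustrative and are not part of SQL Azure or any Microsoft API.

```python
import hashlib


def shard_for(customer_id, shard_count):
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy data tier: each dict stands in for one independent database."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, customer_id, record):
        index = shard_for(customer_id, len(self.shards))
        self.shards[index][customer_id] = record

    def get(self, customer_id):
        index = shard_for(customer_id, len(self.shards))
        return self.shards[index].get(customer_id)


store = ShardedStore(shard_count=4)
store.put("customer-42", {"name": "Contoso"})
print(store.get("customer-42"))
```

The design point is that every read and write touches only one shard, so capacity grows by adding shards rather than by buying a bigger box. A simple modulo scheme like this one requires moving data when the shard count changes, which is one reason the data tier is the hard part of a migration.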

Robert: From a different perspective, how should startups be looking at cloud computing and the way to enter and disrupt the industry with established players?

Chris: For startups, cloud is a total no-brainer. You've basically cloistered yourselves in a Silicon Valley garage and lived on pizza and caffeine for six months building your app. You need to hold onto your equity as tightly as possible, and the last thing you want to do is spend a whole lot of capital on hardware. There are two major reasons.

The first is that, if you're going to buy all that equipment, you've got to go and find some venture capital. And those guys are going to take a pretty penny off you in terms of your equity to give you the money to go and buy the hardware.

The second thing is that lots of startups fail. The last thing you want when you have a failed startup is to be left carrying a whole lot of hardware that you then have to get rid of to recover your cash so you can go and do your next startup. The beauty of the cloud is it's basically a scale-fast, fail-fast model. So if your startup's a dog, you can fail fast. It doesn't cost you the earth, and you don't have all that hardware hanging around.

If your startup's a wild success, and you need to add massive amounts of computing power fast, traditional infrastructures can be impossible to scale fast enough to meet the demand: you can't buy and ship the servers fast enough! That situation can turn your wildly successful startup suddenly into a complete disaster. The beauty of the cloud is that, without paying any capital costs up front, you have an effectively infinite amount of computing capacity that you can turn on as needed.

Robert: In his "Cloudonomics" work, Joe Weinman basically says there's no way that building on premises for peak usage can compare with pay per use for your average capacity. How much of the cloud adoption you're seeing is just for cost savings versus business agility? Or even building new kinds of solutions that just wouldn't be feasible without cloud capabilities?

Chris: "Cloudonomics" is based on the idea that building for the peak loads on premises is too expensive. It's not merely that we can save money by doing this in the cloud; it's that we can only do it by building it in the cloud, because it's just so economically unfeasible to do it on premises.

It is economically unfeasible to carry the hardware you need for those peak loads if you've got to have it running 365 days of the year. The cloud allows us at a business level to solve problems that we haven't been able to solve in the past.
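The "just math" argument can be worked through with a toy calculation. All numbers here are invented for illustration: a hypothetical ticketing site needs 2 servers for 23 hours of the day and 100 servers for one spike hour, and the cloud is assumed to charge three times the on-premises rate per server-hour.

```python
def provisioned_for_peak_cost(hourly_demand, rate_per_server_hour):
    """On premises: capacity is sized for the peak and paid for every hour."""
    return max(hourly_demand) * len(hourly_demand) * rate_per_server_hour


def pay_per_use_cost(hourly_demand, rate_per_server_hour):
    """Cloud: pay only for the server-hours actually consumed."""
    return sum(hourly_demand) * rate_per_server_hour


def peak_to_average_ratio(hourly_demand):
    """How much more expensive per hour the cloud can be before it loses."""
    return max(hourly_demand) * len(hourly_demand) / sum(hourly_demand)


# Hypothetical day: 23 quiet hours at 2 servers, one spike hour at 100.
demand = [2] * 23 + [100]

on_premises = provisioned_for_peak_cost(demand, rate_per_server_hour=1)
cloud = pay_per_use_cost(demand, rate_per_server_hour=3)

print(on_premises, cloud, round(peak_to_average_ratio(demand), 1))
```

Under these assumptions the peak-provisioned deployment pays for 2,400 server-hours while the pay-per-use deployment pays for 146, so the cloud stays cheaper until its per-hour premium exceeds the peak-to-average ratio of the workload. The model ignores real-world factors such as minimum billing increments and the time needed to bring instances online.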

Robert: If you could take your MVP hat off for a second, I imagine that you must have looked at other cloud offerings. You probably have some opinions where you think Azure stands out, and then where other offerings stand out. Can you comment more on that?

Chris: Azure really stands out as a platform-as-a-service offering. The thing that you have to think about with Windows Azure is that you're not just buying virtual machines. You're really buying an entire platform that is designed to give you the building blocks to build elastic, massive-scale applications.

Contrast that with something like Amazon's cloud services offering. Those guys are really mature, and they've been doing it a long time. It probably wouldn't be wrong to call them the market leaders and the innovators. It seems odd for an online bookstore to be the key innovators in cloud computing, but literally I think they just woke up one morning, and said, "Hey we're really good at building these massive scale websites. Why don't we put it in a bottle and sell it?"

But Amazon doesn't really have that platform offering. If we think about building these massive scale applications, they maybe haven't taken it to the next level, in terms of being willing to build in things like the load balancer, recovery capabilities, and other features that you get with Windows Azure. One real strength of Amazon, though, is that they really get the economic stuff.

Arguably, they're probably innovating more slowly in terms of technology than they are on the business side of things. Amazon offers things like spot pricing, which I love, because it sends economic price signals to encourage people to change their behavior. At the end of the day, that's what's going to drive Green IT: proper economic price signals driving behavior.

Amazon also has reserved instances. These things mean that we can start to look at computing far more like we might look at say, the electricity market. Amazon is really probably the market leader in infrastructure as a service, in the sense of really renting raw capacity by the hour.

Robert: In a recent interview, Accenture's Jimmy Harris said, "Cloud changes the role of IT, from a purveyor of service, to being an integrator of service." One potential challenge I see for IT is increased finger pointing. If an organization is accessing its SaaS solution through the Internet, and the SaaS solution is hosted on a public cloud, you could see finger pointing between the ISPs, the SaaS provider, and the cloud provider.

Chris: I've been presenting pretty often for audiences like CIOs, and invariably at the end of my presentation, one of them will put up their hand and very boldly ask, "Why then should I trust Microsoft to run my application?" And of course, the answer to that is, there's probably nobody I'd trust more than Microsoft to run my application. These guys are running enormous data centers, and they have the smartest possible people running them, because the smartest possible people really want to run the enormous data centers.

But I think there's still a mindset that there are benefits in being able to walk down the hallway and put a boot up someone's ass if something's broken. And you kind of lose that with the cloud, and to a degree you also lose some of the high-touch service level agreements that you might see with a typical outsourced provider.

Because to a typical outsource provider, a large enterprise workload is a very significant customer, so they're often prepared at the sale time to actually enter into detailed negotiations about service level agreements.

When you look at cloud computing, on the other hand, even large enterprise workloads are often just a drop in the ocean for the provider; remember my order-of-magnitude rule of thumb. But at the end of the day, what really matters is whether your application is up and running. And again, I come back to reinforce the point that these providers run at a massive scale, with very high levels of redundancy and reliability.

There is nobody I would feel more confident in running my technology than a large cloud provider, even though I may not be able to walk down the corridor and kick someone when it stops working.

Robert: Well Chris, thanks for your time.

Chris: Thanks Robert. Always a pleasure.

Bridget Botelho (@BridgetBotelho) posted Microsoft Azure to be in Windows shops, like it or not to the SearchCloudComputing blog on 1/11/2011:

Microsoft’s aggressive push to the cloud is destined to rock IT shops heavily invested in Windows Server. As it stands, the mission to align its Azure cloud platform with Windows has already upset the company’s own leadership.

Earlier this week, Microsoft disclosed that longtime president of the Server and Tools Business Division, Bob Muglia, would leave Microsoft this summer. Microsoft CEO Steve Ballmer said in an email to Microsoft employees, “All businesses go through cycles and need new and different talent to manage through those cycles.” He added that the new leadership will move Microsoft servers “forward into the era of cloud computing.”

“The power of the Azure environment is the ability to run apps and not think that much about how to scale it,” he said. “If you need more horse power, you just throw more servers in the rack.”

One New York-based integrator and Microsoft partner said the leadership change is “either part of a bad exodus or a healthy shedding of old guard -- or both.”

But Muglia was involved with many transformations at Microsoft. He’d been with the company since 1988 and was part of Microsoft’s technology evolution from Windows NT to mobile devices and most recently, cloud services.

The company offered no additional comments on either the Muglia transition or its Windows Server strategy, leaving IT pros to wonder what a leadership change will mean for the future of Windows Server, if anything. The company’s trajectory will continue to be a push to cloud computing. Unfortunately, the cloud is what worries people.

Microsoft’s next cycle: cloud
Microsoft will use its existing customer base to make a name for itself in the cloud market, just as it has done with virtualization (Hyper-V) and other technologies. In this case, the company will get its customers to use Azure by merging that platform with future versions of Windows Server, said Rob Horwitz, an analyst with Kirkland, Wash.-based Directions on Microsoft.

“Microsoft’s long-term vision is to have one common platform that can be leveraged in many ways, so that I can have the same application running on-premise, in the cloud or by a hosting provider,” he said.

The concept of development platforms (Windows Server and Azure) and pre-packaged applications such as Exchange will continue into the foreseeable future though. “What will change is the code used to implement the on-premises platform and apps, and the hosted platform and apps will converge,” Horwitz said.

Convergence of on-premise and cloud computing has been Microsoft’s goal since the inception of Azure, but on-premises code needed to be modified to tackle issues such as multi-tenancy and scalability, Horwitz said.

But Microsoft is working on ways to give Windows Server customers a smooth transition to Azure. The company disclosed some Azure-centric technologies at the Professional Developers Conference last fall, including Server Application Virtualization, which will let IT pros virtualize traditional apps and move them to Windows Azure without a rewrite. The final version of that and other Azure tools are due this year.

Azure vs. Windows Server
The benefit of Azure over traditional Windows Server is that the OS doesn’t have to be tied to specific functions. For instance, IT pros run Microsoft SQL Server on a database server with dedicated resources, but Windows Azure server supports all scalable apps with resources added as needed instead, Horwitz said.

That sounds convenient, but most enterprise IT pros are conservative. They are in no rush to adopt Azure because they are comfortable with traditional Windows Server if it meets their needs. Also, Azure isn’t mature.

Alan Silverman, a consultant with the IT services firm Atrion Networking Corp. in Warwick, R.I., said his small business clients have moved to cloud-based services primarily for email to reduce hardware and software costs, but the typical mid-sized customer is investing in private clouds using virtualization and centralized storage. Those private clouds offer faster and cheaper provisioning of resources and fulfill their needs, so Azure isn’t a must-have.

“With those investments taking place, it is hard to imagine that everything will be moving to the public cloud any time soon,” Silverman said. “Our mid-size customers still need the customizability of on-premise software.”

The Azure private cloud
Silverman added that IT pros are talking about hybrid public/private cloud options, “but the interoperability and especially the account synchronization is still a work in progress.”

Since companies are skittish about public clouds, Microsoft came out with a private cloud version of Azure in July that companies can run within their own datacenters, which is essentially Microsoft’s way of pushing customers to take baby steps towards public clouds.

The need for a private version of Azure shows that customers don’t trust public clouds and most of all, that Microsoft hasn’t reconciled its cloud vision with the needs of its customers, analysts said.

“Microsoft is really focused on the endgame where the real power comes from writing an app from scratch for the cloud,” said Carl Claunch, an analyst at Gartner Inc. “Very few people are really ready to do that; just the sheer investment of turning everything over to the cloud isn’t feasible.”

Some say Microsoft shouldn’t even be considered a serious player in the public cloud space yet.

“Lots of people are very confident using Microsoft technologies to build their own private clouds. But are they really confident in Microsoft to run and host all their technologies and information for them?” said Nelson Ruest, an IT consultant with Victoria, BC, Canada-based Resolutions Enterprises. “I haven’t personally seen such professionalism on the part of Microsoft’s implementation staff to suggest that.”

Full disclosure: I’m a new contributor to TechTarget’s

Josh Greenbaum asked Bob Muglia Leaves — is this the Beginning of a Major Enterprise Realignment at Microsoft? in a 1/11/2011 post to his Enterprise Irregulars blog:

image There’s lots of speculation floating around about why Bob Muglia, head of Microsoft’s Server and Tools Business, is leaving this summer.  I agree with my colleague Mary Jo Foley that is wasn’t because Muglia wasn’t all-in on software&services, he seemed to actually get it.  But I do think there may be a reason that is nonetheless tied more to S&S than it may appear: Microsoft is going to reorg around its own “stack” business, and in doing so take a major new tac[k] in the battle for the enterprise.

The new tac[k] will be Microsoft’s own version of the stack wars, in which Azure, fueled by the Dynamics ERP products and partners’ enterprise software and services, becomes the leading edge of an increasing focus on direct sales to the enterprise. This won’t obliterate the thousands of partner[s] from the mix, but it will create a major shift in how Microsoft goes to market, particularly with respect to the large enterprise: much more direct and more in line with what IBM, SAP, and Oracle are able to do with their stack offerings.

As I have said before, the synergies between Dynamics and the products that Muglia oversaw were growing significantly, and that overlap will continue as Azure moves forward to claim a significant piece of the cloud market. Indeed, the growing realization that the cloud is taking over the mindshare (though not walletshare — yet) in the enterprise has been sharpening the focus of executives across the industry. And while Muglia was great at building a strong STB and a strong partner channel for the products, would he necessarily be the right guy to help shift gears and help position Microsoft for a C-level dialogue about the new enterprise a la Microsoft? I don’t think so.

This shift is one that has to be under consideration in Redmond, for no other reason than the fact that the competitive landscape is demanding it. Oracle is pushing the envelope hardest right now, though it’s following IBM’s footsteps into the CEO’s office with a me-too stack sale. This is putting a ton of pressure on SAP to man and women-up its own efforts to sell a deeply strategic vision of software and services, and that vision is starting to look like it will include a decent amount of Azure+Dynamics-like functionality, and be highly competitive to the Microsoft offering. Then there’s Salesforce.com, chewing up mindshare across a broad swath of the market that Microsoft is targeting. And waiting in the wings is Hewlett-Packard, now in the hands of a seasoned software executive who understands these opportunities as well.

This massive market realignment leaves Microsoft out on a limb if it only relies on its fabled channel to do the heavy lifting for all the new products and services that S&S and Azure+Dynamics represents. While the partner channel has been beefed up by the presence of the global SIs in recent years, the idea that Microsoft can compete directly with IBM, Oracle, Salesforce, SAP and HP in the coming market realignment with just its channel partners must have struck Ballmer as a little iffy. And it may have seemed that, for all his talents, Muglia was not the one who could take Microsoft to this new level.

What this realignment means inside Microsoft is a shift in what STB is and does in support of the rest of the Microsoft vision: STB will have to play more of a supporting role, more the loyal commodity stack provider, in a high-stakes, high-value market where everyone (except SAP) has a rapidly commoditizing stack offering on which each vendor must position high value-add assets like enterprise software and services. The stack components don’t go away — they become an essential part of the sale — but they have to play second fiddle to a higher-value offering. That offering in Microsoft-land will not come from STB, but from — and here’s where I go out on a limb — a realigned Microsoft that is selling a much broader strategic vision of what the enterprise wants and needs: a vision much more like Oracle and IBM than ever before.

I may be wrong that Muglia’s departure is related to this new market reality, but if it isn’t, then another shoe will drop soon that defines the beginning of such a realignment. Microsoft has never been in a more precarious position in its fabled history — I haven’t even mentioned the Google threat or the Apple threat, or the smart phone and tablet challenge — and it has never been better positioned to pull ahead of its enterprise competitors with a new focus on Azure and Dynamics as essential components of Microsoft’s own stack strategy. The manner of Muglia’s departure and Ballmer’s careful wording may have led me to go out on a limb about the particulars of the cause and effect at play here, but, one way or another, Microsoft needs to shift to a more direct focus on the enterprise — direct as in direct presence in the executive suite — or watch its latest opportunity fall prey to a slew of competitors that get the enterprise opportunity in a much more direct way.


As noted in my Windows Azure and Cloud Computing Posts for 1/11/2011+ post about Dynamics AX code-named “6”:

For more background on Microsoft Dynamics AX code-named "6," see the Microsoft Previews Next-Generation ERP press release of 1/11/2011 on Microsoft PressPass. It’s surprising that there’s no mention of Windows Azure or “cloud” in either Kurt [Mackie]’s article or the official press release. Will Ballmer apply the hatchet to Dynamics AX management?

<Return to section navigation list> 

Windows Azure Platform Appliance (WAPA), Hyper-V and Private Clouds

Greg Shields wrote Defining the Microsoft Hyper-V Cloud for on 1/11/2011:

The fracturing of cloud computing into several different "types" has made it even more difficult to fully comprehend the cloud conversation. These days, the concept of "cloud computing" encompasses public cloud, private cloud and even hybrid cloud models. And within each model, various vendors and pundits supply different definitions, all of which seem designed to suit their specific needs.

This confusion is why I attempted to define private cloud in my recent book, Private Clouds: Selecting the Right Hardware for a Scalable Virtual Infrastructure. Here's how I describe it:

"While virtual machines are the mechanism in which IT services are provided, the private cloud infrastructure is the platform that enables those virtual machines to be created and managed at the speed of business."

Some IT pros might look at that statement and think, "That almost describes what my virtual environment already does today." Depending on their configurations, they're probably not far off. And that's, at least in my mind, where most of the confusion about private cloud comes from: It is in many ways less revolutionary than people think. Here's another definition:

"A private cloud at its core is little more than a virtualization technology, some really good management tools, the right set of hardware and business process integration."

This, of course, means that private cloud requires virtualization, hardware to run that virtualization and the administrative tools that link what businesses demand with what computing resources can supply.

I use this lengthy introduction as a starting point for Microsoft's entry into private cloud because its Hyper-V Cloud represents both of these definitions. While the software technologies aren't entirely new, they are being laid together in ways that better align virtualization's activities with business processes.

Explaining the Hyper-V Cloud
So what is Microsoft's Hyper-V Cloud? In one form, it's a combination of Microsoft virtualization and virtualization management software with a set of specialized hardware, which I can only describe as being "designed with virtualization in mind."

The third piece to this puzzle is the management activities that link business process integration with virtualization. These concepts will be difficult to grasp for more technical IT pros, but they should make sense to anyone who understands the business side.

More technically-minded users should think of the situation like this: Virtualization is a technology. Using virtualization, virtual machines are provisioned and managed, assigned resources, and then deprovisioned when they're no longer relevant. At the same time, however, there are business rules that define when to create, or interact with, those virtual machines. Some business driver (think: new opportunity) decides when a new SQL, Exchange or other server is needed. Another business driver (think: business expansion) decides when it's time to add more resources to augment existing virtual environments.

Traditional virtualization has never been good at translating those business drivers into actual clicks inside its virtual management platform. This translation is one of the goals of private cloud, and it's one of the primary reasons for Microsoft's Hyper-V Cloud implementation.

Now, before you quickly throw away this concept as vaporware or "just another Microsoft marketing campaign," know that within it are some real technologies that are compelling to both technologists and business-oriented types. Some of them, like the evolving hardware that drives virtualization, are here today and quite exciting in how they change our perceptions. Others, like Microsoft's virtualization management studio in System Center Virtual Machine Manager, remain unchanged but should be evolving in the near future.

Private clouds represent a new way of thinking about the resources in your virtual environment. By collecting together the right set of hardware -- that which has been designed with virtualization in mind -- a kind of "economics of resources" comes into being. Both the enabling technologies, as well as the economics itself, will be the topics of my next two tips in this series.

Until then, take a look through Microsoft's description of the Hyper-V Cloud portfolio. There are a few gems of useful information included that should get you started.

Greg is a Microsoft MVP and a partner at Concentrated Technology.

Full disclosure: I’m a new contributor to TechTarget’s

Brian Gracely described Multiple approaches to Hybrid Cloud on 12/29/2010 (missed when posted):

Most of 2010 was consumed with the religious wars between Private Cloud and Public Cloud, debates about the best ways for businesses to build (or consume) the next-generation of IT services. But with the recent announcement by Amazon Web Services which allows the import of VMware VMs, the debate is starting to move towards the concept of a Hybrid Cloud. In this case, it's specifically focused on the aspect of Hybrid Cloud that allows applications (a.k.a. "workloads") to move from one cloud to another cloud.

All this means that 2011 will be filled with talk of Hybrid Cloud. Definitions of hybrid, initial hybrid services from external providers, and lots of customers wondering how they should plan to take advantage of these new architectures.

So let's take a look at some of the approaches that businesses might take to incorporate "Hybrid" functionality into their near-term and long-term IT strategies.

"Insourced and Outsourced" Applications

At the most basic level, this approach begins by taking an inventory of existing business functionality and determining which functions might be better serviced outside of existing IT resources. For many companies, the "outsourced" list includes functions such as CRM, Email, Travel, Payroll, Disaster Recovery, Conferencing (Audio, Web or Video), and Social Media. The "insourced" list typically includes HR, ERP, Compliance-related functions, AR/PR, Ordering, and Finance.

This approach will look very familiar to most functional-leaders within the business as they have been evaluating aspects of their business through a "core vs. context" lens for many years. Core functions demand investment and internal skills development, while context functions are constantly seeking cost reductions and are frequently candidates for external services. With the advancements in public cloud services, this approach will become more prevalent within IT. It's the simplest model to implement, requires very little retraining and minimizes concerns about security. Outsourced applications move to a Software-as-a-Service (SaaS) model, and Insourced applications continue to be provided by the internal IT staff.

Gateway Clouds

A number of start-up companies are beginning to offer products that pacify security or trust concerns by placing hardware on-site but reduce CapEx and OpEx costs by leveraging public cloud services (AWS, etc.). Companies such as CloudSwitch, Cirtas, Nasuni and many others are trying to offer companies a "have your cake and eat it too" model that is relatively easy to implement and doesn't require many changes to IT skills or operations. These solutions will probably get integrated into bigger company solutions over time, but for now they offer a quick fix for companies looking to save money or streamline operations while exploring a hybrid cloud strategy.

"Internal to External Application Mobility" (and back) 

Assuming that all servers are created equal, this approach looks at ways to move selective workloads (typically in a VM) from internal servers to external servers - or put a different way - from a private cloud to a public cloud. This movement may be done to reduce costs, improve availability, use additional capacity or test new functionality. The expectation is that the movement will often be bi-directional, with many businesses looking for ways to have multiple (external) partners to facilitate movement. This will allow them to maintain competitive costs as well as leverage geographically dispersed services when needed.

In theory this approach is desirable because it probably won't require that existing applications be rewritten. They could be encapsulated in a VM and moved freely between systems, either dynamically or via offline file transfers. The challenge with this approach is in areas such as networking and security, which may not be provisioned identically between the private cloud and public cloud. In addition, applications may need to be adjusted to adapt to different amounts of bandwidth, delay or latency between the two systems. Add in multiple public clouds as mobility destinations and the model gets more and more complicated if movements are to remain seamless.
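To get a feel for the "offline file transfer" option, a back-of-the-envelope calculation helps. The sketch below is illustrative only: the function name, the 100 GB image size and the link speeds are assumptions rather than figures from any provider, and it ignores protocol overhead and compression.

```python
def transfer_hours(image_gb: float, link_mbps: float) -> float:
    """Estimate hours needed to copy a VM image over a network link.

    Assumes decimal units (1 GB = 8,000 megabits) and a fully
    utilized link with no protocol overhead.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = image_gb * 8_000
    seconds = megabits / link_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 100 GB image over a 100 Mbps WAN link vs. a 1 Gbps link.
    for mbps in (100, 1000):
        print(f"{mbps} Mbps: {transfer_hours(100, mbps):.1f} hours")
```

Even under these optimistic assumptions, the slower link needs a couple of hours per image, which is one reason bandwidth between the private and public cloud shapes how "dynamic" workload movement can really be.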

From an Infrastructure-as-a-Service (IaaS) perspective, some of these challenges will be minimized by the adoption of integrated stacks (Vblock, FlexPod, Exalogic, etc.) used by Enterprise and Service Provider customers. Other companies will look to OpenStack to provide standards. Still others will look at solutions such as VMware vCloud Director to abstract all the networking and security via software and create similar systems in the different environments.

This approach offers the benefits of limited software rewrites and could reduce overall costs if used properly. It will require some changes to IT skills (networking, security, virtualization) and IT processes (compliance, data management, cloud partner management) and could require some interoperability testing if standards aren't finalized.

"Multiple Clouds, Mobility Clouds"

As I've pointed out in previous posts, some CIOs will embrace public cloud models in a very large way. They will be faced with business problems that require new applications to be written (Platform-as-a-Service), or will combine new applications with existing applications (SaaS). With the rapid growth of public cloud platforms (Heroku, Google App Engine, Microsoft Azure, Amazon Web Services, SpringSource, etc.), as well as industry-specific mandates for "cloud first" policies, CIOs will face the challenge of managing relationships and operations of multiple clouds. This will create a hybrid external/public cloud environment with different pricing, management, security and networking models. It offers the ultimate in flexibility, but it also means that many CIOs will be treading in uncharted waters as services and applications get combined in new and interesting ways.

By no means is this an exhaustive list of Hybrid Cloud options, and I'm sure 2011 will introduce many more definitions. Hopefully this list gives you an idea of what solutions are available today or in the near future and allows you to think about the types of decisions you'll need to make as you plan to incorporate hybrid cloud technologies into your IT strategies.

How do you expect your business to adopt cloud computing in 2011? Is hybrid cloud something to consider? What other approaches are you considering?

<Return to section navigation list> 

Cloud Security and Governance

Chris Hoff (@Beaker) posted The Cloud As a (Hax0r’s) Calculator. Yawn… on 1/11/2011:

If I see another news story that talks about how a “hacker” has shown that by “utilizing the cloud” to harness compute on demand to do what otherwise one might use a botnet or specialized hardware to perform BUT otherwise suggest that it somehow compromises an entire branch of technology, I’m going to…


Yeah, cloud makes this cheap and accessible…rainbow table cracking using IaaS images via a cloud provider…passwords, wifi creds, credit card numbers, pi…

Please. See:



Chris Hoff (@Beaker) explained in an Incomplete Thought: Why Security Doesn’t Scale…Yet on 1/11/2011:

There are lots of reasons one might use to illustrate why operationalizing security — both from the human and technology perspectives — doesn’t scale.

I’ve painted numerous pictures highlighting the cyclical nature of technology transitions, the supply/demand curve related to threats, vulnerabilities, technology and compensating controls and even relevant anecdotes involving the intersection of Moore’s and Metcalfe’s laws. This really was a central theme in my Cloudinomicon presentation; “idempotent infrastructure, building survivable systems and bringing sexy back to information centricity.”

Here are some other examples of things I’ve written about in this realm.

Batting around how public “commodity” cloud solutions forces us to re-evaluate how, where, why and who “does” security was an interesting journey.  Ultimately, it comes down to architecture and poking at the sanctity of models hinged on an operational premise that may or may not be as relevant as it used to be.

However, I think the most poignant and yet potentially obvious answer to the “why doesn’t security scale?” question is the fact that security products, by design, don’t scale because they have not been created to allow for automation across almost every aspect of their architecture.

Automation and the interfaces (read: APIs) by which security products ought to be provisioned, orchestrated, and deployed are simply lacking in most security products.

Yes, there exist security products that are distributed but they are still managed, provisioned and deployed manually — generally using a management hub-spoke model that doesn’t lend itself to automated “anything” that does not otherwise rely upon bubble-gum and baling wire scripting…

Sure, we’ve had things like SNMP as a “standard interface” for “management” for a long while. We’ve had common ways of describing threats and vulnerabilities. Recently we’ve seen XML-based APIs emerge as a function of the latest generation of (mostly virtualized) firewall technologies, but most products still rely upon stand-alone GUIs, CLIs, element managers and a meat cloud of operators to push the go button (or reconfigure.)

Really annoying.

Alongside the lack of standard API-based management planes, control planes are largely proprietary and the output for correlated event-driven telemetry at all layers of the stack is equally lacking.  Of course the applications and security layers that run atop infrastructure are still largely discrete thus making the problem more difficult.

The good news is that virtualization in the enterprise and the emergence of the cultural and operational models predicated upon automation are starting to influence product roadmaps in ways that will positively affect the problem space described above but we’ve got a long haul as we make this transition.

Security vendors are starting to realize that they must retool many of their technology roadmaps to deal with the impact of dynamism and automation.  Some, not all, are discovering painfully the fact that simply creating a virtualized version of a physical appliance doesn’t make it a virtual security solution (or cloud security solution) in the same way that moving an application directly to cloud doesn’t necessarily make it a “cloud application.”

In the same way that one must often re-write or specifically design applications “designed” for cloud, we have to do the same for security.  Arguably there are things that can and should be preserved; the examples of the basic underpinnings such as firewalls that at their core don’t need to change but their “packaging” does.

I’m privy to lots of the underlying mechanics of these activities — from open source to highly-proprietary — and I’m heartened by the fact that we’re beginning to make progress.  We shouldn’t have to make a distinction between crafting and deploying security policies in physical or virtual environments.  We shouldn’t be held hostage by the separation of application logic from the underlying platforms.

In the long term, I’m optimistic we won’t have to.
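As a purely illustrative sketch of the API-driven model argued for above, the snippet below defines a policy once as data and pushes it through a single interface to every enforcement point. Everything in it is hypothetical: `Rule`, `EnforcementPoint`, `RecordingFirewall` and `deploy` are invented names, not any vendor's API, and the "firewalls" simply record what they are given.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Rule:
    """One allow rule, described independently of any product."""
    source: str
    destination: str
    port: int


class EnforcementPoint(Protocol):
    """Hypothetical interface a firewall's management API would expose."""
    def apply(self, rules: list[Rule]) -> None: ...


class RecordingFirewall:
    """Stand-in for a physical or virtual firewall; it just stores rules."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.rules: list[Rule] = []

    def apply(self, rules: list[Rule]) -> None:
        self.rules = list(rules)


def deploy(policy: list[Rule], targets: list[EnforcementPoint]) -> None:
    """Push the same policy to every target, with no per-device handling."""
    for target in targets:
        target.apply(policy)


if __name__ == "__main__":
    policy = [Rule("10.0.0.0/24", "10.0.1.10", 443)]
    physical = RecordingFirewall("edge-appliance")
    virtual = RecordingFirewall("hypervisor-vfw")
    deploy(policy, [physical, virtual])
    print(physical.rules == virtual.rules)
```

The point of the design is that `deploy` never needs to know whether a target is physical or virtual; that distinction lives behind the interface rather than in an operator's head.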



<Return to section navigation list> 

Cloud Computing Events

No significant articles today.

<Return to section navigation list> 

Other Cloud Computing Platforms and Services

Klint Finley briefly described 7 Cloud-Based Database Services in a 1/12/2011 post to the ReadWriteCloud blog:

Salesforce.com announced Database.com, its hosted relational database service, in December. Since that time it's clear that Salesforce.com is far from alone in the market for offering stand-alone, cloud-hosted databases. There are at least four other competitors, with more on the horizon. And one of those is a company all too familiar to Microsoft.

Although customers have been able to install Oracle or MySQL on commodity cloud instances for years, these services all provide databases specifically designed for the cloud.

In reverse alphabetical order:


Xeround, based in Bellevue, WA, offers its own elastic database service based on MySQL. Today it announced that its service will now be available from Amazon Web Services data centers in both Europe and North America. Customers can choose whichever location is closest.

In addition to offering multiple geographic locations, Xeround announced that over the course of the next year it will begin to offer its service on other cloud providers such as GoGrid and Rackspace. Xeround's database is host agnostic, so customers will be able to migrate freely between providers.

Microsoft SQL Azure Database

Microsoft offers SQL Azure Database as a standalone service.

In a report on SQL Azure, analyst firm Forrester wrote: "Most customers stated that SQL Azure delivers a reliable cloud database platform to support various small to moderately sized applications as well as other data management requirements such as backup, disaster recovery, testing, and collaboration."


Amazon Web Services has its own NoSQL cloud database service called SimpleDB. We've covered it occasionally, both in our "Is the Relational Database Doomed?" article and in our "3 New NoSQL Tutorials to Check Out This Weekend."

SimpleDB is really, well, simple. But it could be useful for very basic use cases. It's also free for minimal use.

Google AppEngine Data Store

As we reported, Google's own cloud database just made a new type of datastore available and revised its pricing. This was also one of the services we covered in "Is the Relational Database Doomed?"

Database.com is based on the same technology that powers Salesforce.com's flagship CRM service. That means it must be a robust and reliable database. Database.com isn't available yet, but it's already been field tested for the past decade. It's worthy of your consideration for that reason alone.


You might remember ClearDB from our "Cloud Startups to Watch in 2011" series. Like Database.com, it offers a hosted relational database.


CouchOne, the sponsor company of the NoSQL solution CouchDB, offers a free CouchDB hosting service. The service is still in beta.

We've covered CouchDB several times, notably in this article: "Why Large Hadron Collider Scientists are Using CouchDB." We don't know much about CouchOne's hosting service yet, however.

Chris Czarnecki explained Deleting Attached Amazon EC2 EBS Volumes in a 1/11/2011 post to the Learning Tree blog:

When teaching Learning Tree’s Cloud Computing Course, I demonstrate various aspects of Amazon’s Infrastructure as a Service (IaaS). As part of this, not only do I provision various machine types but also associated Elastic Block Storage (EBS) devices and attach and detach these. Since I use a demonstration account for this, one task I undertake at the end of the course is to make sure that all resources are released/removed so that no unnecessary costs are incurred. For the demonstrations, I always use the Amazon Web browser administration interface.

On a recent teach, I tidied up the account, or more accurately thought I had. When the monthly bill arrived, charges were still being incurred, albeit minor. The charges were for an EBS volume, which I thought would be straightforward to delete. However, when I tried to delete this from the browser administration interface, I received an error saying that the volume was in use and could not be deleted and should be detached using the force flag. This is a feature not available from the browser interface. Equally, there was no other resource running on my account that the volume could be attached to! I was paying for a fault in the browser interface, which had not cleaned up resources properly.

So what was the solution? To detach the volume I used the EC2 command line tools. If you are not sure of these, Kevin Kell has written a post on how to install them. The command to detach the volume is then simply:

$ ec2-detach-volume volume-id --force

The force flag is important here. Once this has completed, the volume can be deleted using the command:

ec2-delete-volume volume-id

Hopefully you will not encounter this scenario, but if you do, you now know the solution.
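For readers who would rather script this cleanup than type the commands by hand, the same three steps can be sketched in Python. This is not from the original post: it assumes the boto3 SDK (a later AWS library, not the EC2 command line tools used above), and the volume ID in the usage comment is a placeholder. The client is passed in as a parameter so the function can be exercised without touching a real account.

```python
def force_detach_and_delete(ec2, volume_id: str) -> None:
    """Force-detach an EBS volume, wait until it is free, then delete it.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # Equivalent of the forced detach above; Force=True skips the clean unmount.
    ec2.detach_volume(VolumeId=volume_id, Force=True)
    # Block until the volume reports the "available" (unattached) state.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    # Only an unattached volume can be deleted.
    ec2.delete_volume(VolumeId=volume_id)


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   force_detach_and_delete(boto3.client("ec2"), "vol-0123456789abcdef0")
```

As with the command line version, a forced detach can lose data that the instance has not yet flushed to the volume, so it is best reserved for volumes you intend to throw away.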

Matthew Aslett predicted NoSQL – consolidating and proliferating in 2011 in a 1/10/2011 post to the 451 Group’s Too Much Information blog:

Among the numerous prediction pieces doing the rounds at the moment, Bradford Stephens, founder of Drawn to Scale, suggested we could be in for continued proliferation of NoSQL database technologies in 2011, while Redmonk’s Stephen O’Grady predicted consolidation. I agree with both of them.

To understand how NoSQL could both proliferate and consolidate in 2011 it’s important to look at the small print. Bradford was talking specifically about open source tools, while Stephen was writing about commercially successful projects.

Given the levels of interest in NoSQL database technologies, the vast array of use cases, and the various interfaces and development languages – most of which are open source – I predict we’ll continue to see cross-pollination and the emergence of new projects as developers (corporate and individual) continue to scratch their own data-based itches.

However, I think we are also beginning to see a narrowing of the commercial focus on those projects and companies that have enough traction to generate significant business opportunities and revenue, and that a few clear leaders will emerge in the various NoSQL sub-categories (key-value stores, document stores, graph databases and distributed column stores).

We can see previous evidence of the dual impact of proliferation and consolidation in the Linux market. While commercial opportunities are dominated by Red Hat, Novell and Canonical, that has not stopped the continued proliferation of Linux distributions.

The main difference between NoSQL and Linux markets, of course, is that the various Linux distributions all have a common core, and the diversity in the NoSQL space means that we are unlikely to see proliferation on the scale of Linux.

However, I think we’ll see a similar two-tier market emerge with a large number of technically interesting and differentiated open source projects, and a small number of commercially-viable general-purpose category leaders.

Matthew covers data management software for The 451 Group's Information Management practice, including relational and non-relational databases, data warehousing and data caching.

Todd Hoff posted Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily to the High Scalability blog on 1/11/2011:

A giant step into the fully distributed future has been taken by the Google App Engine team with the release of their High Replication Datastore. The HRD is targeted at mission critical applications that require data replicated to at least three datacenters, full ACID semantics for entity groups, and lower consistency guarantees across entity groups.

This is a major accomplishment. Few organizations can implement a true multi-datacenter datastore. Other than SimpleDB, how many other publicly accessible database services can operate out of multiple datacenters? Now that capability can be had by anyone. But there is a price, literally and otherwise. Because the HRD uses three times the resources of Google App Engine's Master/Slave datastore, it will cost three times as much. And because it is a distributed database, with all that implies in the CAP sense, developers will have to be very careful in how they architect their applications: costs have increased, reliability has increased, complexity has increased, and performance has decreased. This is why HRD is targeted at mission critical applications. You gotta want it; otherwise the Master/Slave datastore makes a lot more sense.

The technical details behind the HRD are described in this paper, Megastore: Providing Scalable, Highly Available Storage for Interactive Services. This is a wonderfully written and accessible paper, chock-full of useful and interesting details. James Hamilton wrote an excellent summary of the paper in Google Megastore: The Data Engine Behind GAE. There are also a few useful threads in Google Groups that go into some more details about how it works, costs, and performance (the original announcement, performance comparison).

Some Megastore highlights:

  • Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS. It has been used internally by Google for several years, on more than 100 production applications, to handle more than three billion write and 20 billion read transactions daily, and store a petabyte of data across many global datacenters.
  • Megastore is a storage system developed to meet the storage requirements of today's interactive online services. It is novel in that it blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS. It uses synchronous replication to achieve high availability and a consistent view of the data. In brief, it provides fully serializable ACID semantics over distant replicas with low enough latencies to support interactive applications. We accomplish this by taking a middle ground in the RDBMS vs. NoSQL design space: we partition the datastore and replicate each partition separately, providing full ACID semantics within partitions, but only limited consistency guarantees across them. We provide traditional database features, such as secondary indexes, but only those features that can scale within user-tolerable latency limits, and only with the semantics that our partitioning scheme can support. We contend that the data for most Internet services can be suitably partitioned (e.g., by user) to make this approach viable, and that a small, but not spartan, set of features can substantially ease the burden of developing cloud applications.
  • Paxos is used to manage synchronous replication between datacenters. This provides the highest level of availability for reads and writes at the cost of higher-latency writes. Typically Paxos is used only for coordination; Megastore also uses it to perform write operations.
  • Supports 3 levels of read consistency: current, snapshot, and inconsistent reads.
  • Entity groups are now a unit of consistency as well as a unit of transactionality. Entity groups seem to be like little separate databases. Each is independently and synchronously replicated over a wide area. The underlying data is stored in a scalable NoSQL datastore in each datacenter.
  • Entities within an entity group are mutated with single-phase ACID transactions. Two-phase commit is used for cross entity group updates, which will greatly limit the write throughput when not operating on an entity group.
  • Entity groups are an a priori grouping of data for fast operations. Their size and composition must be balanced. Examples of entity groups are: an email account for a user; a blog would have a profile entity group and more groups to hold posts and metadata for each blog. Each application will have to find natural ways to draw entity group boundaries. Fine-grained entity groups will force expensive cross-group operations. Groups with too much unrelated data will cause unrelated writes to be serialized, which degrades throughput. This is a process that ironically seems a little like normalizing and will probably prove just as frustrating.
  • Queries that require strongly consistent results must be restricted to a single entity group. Queries across entity groups may return stale results. This is a major change for programmers. The Master/Slave datastore defaulted to strongly consistent results for all queries, because reads and writes were from the master replica by default. With multiple datacenters the world is a lot more complicated. This is clear from some of the Google Groups comments too. Performance will vary quite a bit depending on where entities are located and how they are grouped.
  • Applications will remain fully available during planned maintenance periods, as well as during most unplanned infrastructure issues. The Master/Slave datastore was subject to periodic maintenance windows. If availability is job one for your application, the HRD is a big win.
  • Backups and redundancy are achieved via synchronous replication, snapshots, and incremental log backups.
  • The datastore API does not change at all.
  • Writes to a single entity group are strongly consistent.
  • Writes to a single entity group are limited to about 1 per second, so HRD is not a good match when a high write rate is expected.
  • With eventual consistency, more than 99.9% of your writes are available for queries within a few seconds.
  • Only new applications can choose the HRD option. An existing application must be moved to a new application.
  • Performance can be improved at the expense of consistency by setting the read_policy to eventually consistent. This will bring performance similar to that of the Master/Slave datastore.

  • One application can't mix Master/Slave with HRD. The reasoning is HRD can serve out of multiple datacenters and Master/Slave can not, so there's no way to ensure in failure cases that apps are running in the right place. So if you planned to use an expensive HRD for critical data and the less expensive Master/Slave for less critical data, you can't do that. You might be thinking to delegate Master/Slave operations to another application, but splitting up applications that way is against the TOS. 
  • Once HRD is selected your choice can't be changed. So if you would like to start with the cheaper Master/Slave for customers who want to pay less and use HRD for those who would like to pay for a premium service, you can't do that.
  • There's no automated migration of Master/Slave data to the HRD. The application must write that code. The reasoning is the migration will require a read-only period and the application is in the best position to know how to minimize that downtime.
  • Moving to a caching-based architecture will be even more important to hide some of the performance limitations of HRD. Cache can include memcache, cookies, or state put in a URL.
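
The trade-off in drawing entity group boundaries can be sketched with a little arithmetic. The following is only a rough model, not anything from the Megastore paper or the App Engine SDK: it assumes each entity group applies about one write per second (the figure quoted in the list above), that different groups proceed fully in parallel, and it ignores the extra cost of cross-group operations. The names `seconds_to_apply` and `WRITES_PER_SECOND_PER_GROUP` are made up for illustration.

```python
from collections import Counter

# Assumption: roughly one write per second per entity group,
# as quoted in the highlights above. Not an official constant.
WRITES_PER_SECOND_PER_GROUP = 1.0


def seconds_to_apply(writes, group_of):
    """Estimate wall-clock seconds needed to apply `writes`.

    Writes that land in the same entity group are serialized;
    writes to different groups are assumed to run in parallel,
    so the busiest group determines the total time.
    """
    per_group = Counter(group_of(w) for w in writes)
    if not per_group:
        return 0.0
    return max(per_group.values()) / WRITES_PER_SECOND_PER_GROUP


# Hypothetical workload: 100 users each save 3 blog posts.
writes = [(user, post) for user in range(100) for post in range(3)]

# One coarse group for every blog: all 300 writes are serialized.
one_big_group = seconds_to_apply(writes, lambda w: "all-blogs")

# One group per user: each group only sees that user's 3 writes.
group_per_user = seconds_to_apply(writes, lambda w: w[0])

print(one_big_group, group_per_user)
```

Under these assumptions the single coarse group takes 100 times longer than per-user groups, which is the "unrelated writes get serialized" effect. The model says nothing about the opposite failure mode, where groups are so fine-grained that ordinary operations need two-phase commit across groups.
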

Klint Finley reported Google Announces High Replication Datastore for App Engine in a 1/6/2011 post to the ReadWriteHack blog (missed when posted):

It's no secret that Google App Engine has suffered from reliability issues. Google is attempting to address some of its issues by making a new datastore option available: the High Replication Datastore.

"The High Replication Datastore provides the highest level of availability for your reads and writes, at the cost of increased latency for writes and changes in consistency guarantees in the API," writes Kevin Gibbs in the announcement. "The High Replication Datastore increases the number of data centers that maintain replicas of your data by using the Paxos algorithm to synchronize that data across datacenters in real time." A detailed comparison of the two datastore options is available in the App Engine documentation.
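
To make the "changes in consistency guarantees" more concrete, here is a minimal toy sketch of majority-acknowledged writes across three replicas. It is not Paxos (there are no proposers, ballots, or failure recovery) and it is not the App Engine API; the class and method names are invented for illustration. It only shows why a read from a single lagging replica can be stale while a read that consults a majority is not.

```python
class Replica:
    """One datacenter's copy of a single value."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Only move forward; ignore older versions.
        if version > self.version:
            self.version, self.value = version, value


class ToyReplicatedValue:
    """Hypothetical majority-quorum store (an illustration, not Paxos)."""

    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.next_version = 1

    def _majority(self):
        return len(self.replicas) // 2 + 1

    def write(self, value, reachable):
        """Commit only if the reachable replicas form a majority."""
        if len(reachable) < self._majority():
            return False
        version = self.next_version
        self.next_version += 1
        for i in reachable:
            self.replicas[i].apply(version, value)
        return True

    def read_eventual(self, replica_index):
        """Fast read from one replica; may return stale data."""
        return self.replicas[replica_index].value

    def read_strong(self, reachable):
        """Read from a majority and keep the newest version seen."""
        if len(reachable) < self._majority():
            raise ValueError("need a majority of replicas for a strong read")
        newest = max((self.replicas[i] for i in reachable),
                     key=lambda r: r.version)
        return newest.value


store = ToyReplicatedValue(3)
first_ok = store.write("v1", reachable=[0, 1, 2])
second_ok = store.write("v2", reachable=[0, 1])    # replica 2 is lagging
stale = store.read_eventual(2)                     # single lagging replica
fresh = store.read_strong([1, 2])                  # majority overlaps the write
third_ok = store.write("v3", reachable=[2])        # no majority, rejected
print(first_ok, second_ok, stale, fresh, third_ok)
```

The strong read works because any two majorities of the same replica set share at least one member, so the newest committed version is always visible. The price is that every write has to wait for acknowledgements from remote datacenters, which matches the increased write latency mentioned in the announcement.
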

The price for the new datastore is starting out at three times the cost of the Master/Slave option, but the pricing will likely change in the future.

For the time being, the traditional Master/Slave datastore will remain the default configuration option. The datastore cannot be changed after an application is created, so existing applications can't be switched to the High Replication Datastore. However, Google is providing some migration tools.

There's a new option in the admin console that allows users to put their applications in read-only mode so that data can be reliably copied between applications. Google is also providing a migration tool with the Python SDK that allows data to be copied from one application to another. The documentation for the migration tools can be found here.


<Return to section navigation list>