Friday, November 18, 2011

Windows Azure and Cloud Computing Posts for 11/18/2011+

A compendium of Windows Azure, SQL Azure Database, AppFabric, Windows Azure Platform Appliance and other cloud-computing articles.


• Updated 11/18/2011 4:00 PM PST with new articles marked in the Marketplace DataMarket, Social Analytics and OData, Windows Azure AppFabric: Apps, Access Control, WIF and Service Bus and Cloud Computing Events sections.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:


Azure Blob, Drive, Table, Queue and Hadoop Services

Mary Jo Foley (@maryjofoley) asserted “Microsoft is dropping its ‘Dryad’ big-data processing work and focusing, instead, on developing a Windows Azure and Windows Server implementation of Hadoop” in a deck for her Microsoft drops Dryad; puts its big-data bets on Hadoop post of 11/16/2011 for ZDNet’s All About Microsoft blog:

Just a month after insisting there was still a place for its own Hadoop competitor, Microsoft officials have decided to discontinue work on LINQ to HPC, codenamed “Dryad.”

imageIn a November 11 post on the Windows HPC Team Blog, officials said that Microsoft had provided a minor update to the latest test build of the Dryad code as part of Windows High Performance Computing (HPC) Pack 2008 R2 Service Pack (SP) 3. But they also noted that “this will be the final (Dryad) preview and we do not plan to move forward with a production release.”

Dryad was supposed to provide a way for running big-data jobs across clusters of Windows servers. It was designed to provide a platform for developers to build applications that can process large amounts of unstructured data. Just a month ago, Microsoft updated its near-final test build of Dryad.

But it now appears Microsoft is putting all its big-data eggs in the Hadoop framework basket. Microsoft officials said a month ago that Microsoft was working with Hortonworks to develop both a Windows Azure and a Windows Server distribution of Hadoop. A Community Technology Preview (CTP) of the Windows Azure version is due out before the end of this calendar year; the Windows Server test build of Hadoop is due some time in 2012.

From the November 11 HPC Team blog post [see item below]:

“Hadoop has emerged as a great platform for analyzing unstructured data or large volumes of data at low cost, which aligns well with Microsoft’s vision for its Information Platform. It also has a vibrant community of users and developers eager to innovate on this platform. Microsoft is keen to not only contribute to this vibrant community, but also help its adoption in the Enterprise.”

Microsoft Chairman Bill Gates first publicly mentioned Dryad, a Microsoft Research project, in 2006. The company took a number of steps to move Dryad from a research effort to a commercial one.

My Google, IBM, Oracle want piece of big data in the cloud article of 11/7/2011 for SearchCloudComputing.com discussed recent Apache Hadoop-related projects of Microsoft, Google, IBM and Oracle.


Don Pattee of the Windows HPC Team posted Announcing the Windows Azure HPC Scheduler and HPC Pack 2008 R2 Service Pack 3 releases! on 11/11/2011 (missed when posted):

Once again I get the honor of announcing, on behalf of the Microsoft High Performance Computing team, that our latest releases are available immediately!

  • Our new Windows Azure HPC Scheduler development kit, and
  • the third update to the HPC Pack 2008 R2 family of software

Windows Azure HPC Scheduler

This is a great new product for us. The preview version was previously announced at the BUILD conference, and now we have officially released it for application developers to start using right now!

The Windows Azure HPC Scheduler SDK includes modules and features that developers can use to create Windows Azure deployments that support compute-intensive, parallel applications that can scale when offered more compute power. The SDK enables developers to define a Windows Azure deployment that includes built-in job scheduling and resource management; runtime support for MPI and SOA; web-based job submission interfaces; and persistent state management of the job queue and resource configuration. So, basically, you can create an application that builds an app-specific cluster with no on-premises cluster requirements - this is cool :) Of course, applications that have been built using the HPC Pack's on-premises job submission API can use very similar job submission interfaces in the Windows Azure HPC Scheduler.

Get more details, and links to docs and sample code, from our Windows Azure HPC Scheduler MSDN page.

The Windows Azure HPC Scheduler SDK works with the newest version of the Windows Azure SDK (November 2011) which includes its own cool set of features to help develop Windows Azure applications, available for installation through the Web Platform installer here.

HPC Pack 2008 R2 Service Pack 3

This update includes a number of improvements, including two frequently requested features:

  • Windows Azure bursting scenarios now require fewer ports to be opened in your firewall, with port 443 used for most communication
  • The ability to install the HPC Pack software on a server not dedicated to your cluster (e.g. a team file server) for use in a manner similar to the existing Workstation Node functionality
    • The new 'Cycle Harvesting' feature is available to anyone who has a license for the Workstation or Enterprise versions - you'll need to download the update from your VL download page, or use the SP3 Integration Pack plus your original Workstation or Enterprise media to create the new installer.

As part of this release we’ve also updated the preview version of LINQ to HPC, however, this will be the final preview and we do not plan to move forward with a production release. In line with our announcement in October at the PASS conference we will focus our effort on bringing Apache Hadoop to both Windows Server and Windows Azure. Hadoop has emerged as a great platform for analyzing unstructured data or large volumes of data at low cost, which aligns well with Microsoft’s vision for its Information Platform. It also has a vibrant community of users and developers eager to innovate on this platform. Microsoft is keen to not only contribute to this vibrant community, but also help its adoption in the Enterprise. We expect a preview version on Windows Azure available by end of the calendar year. [Emphasis added.]

For more information on those, and other, new features available in Service Pack 3 please see our documentation on TechNet.

Note: The single SP3 installer applies to all installations - Express, Workstation, and Enterprise, as well as the standalone 'Client Utilities' and 'MS-MPI' packages. You can download it from the Microsoft Download Center. Installers for the standalone Client Utilities and MS-MPI packages with the service pack already integrated are also available.

If you do not have an HPC Pack 2008 R2 cluster, you can download a free Windows HPC Server 2008 R2 Suite evaluation version. Before you install, you can try out the new Installation Preparation Wizard which can help analyze your environment for common issues and provide some best practice guidance to help ensure an easy HPC cluster setup.

Head over to the Windows HPC Discussion forums if you have any questions or comments, we'll be happy to hear from you!

The Windows Azure Team’s Now Available! Updated Windows Azure SDK & Windows Azure HPC Scheduler SDK post of 11/14/2011 didn’t mention that Microsoft was abandoning LINQ to HPC development in favor of Apache Hadoop.


<Return to section navigation list>

SQL Azure Database and Reporting

The Microsoft MVP Award Team posted SQL Azure MVP Herve Roggero on the Value of Being an MVP on 11/17/2011:

We caught up with SQL Azure MVP Herve Roggero at the recent PASS Summit in Seattle. We spoke with Herve and Product Manager for SQL Azure, Cihan Biyikoglu, about the relationship between MVPs and Microsoft in the SQL Azure community. Herve also offered to share his thoughts on the value he has found in his first year as an MVP. (Please visit the site to view this video)

Being an MVP: The Ultimate Community Reward

This post contributed by SQL Azure MVP Herve Roggero

It’s been almost a year as of this writing since I was honored with the SQL Azure MVP Award. As a first timer, it really wasn’t clear to me what being an MVP could bring me. It turns out I was looking at it the wrong way. It isn’t as much what the MVP Award brings me as it is what it allows me to bring to others. I can break down my experience into the following categories: individuals, businesses and product involvement.

Individuals

This is perhaps the most obvious aspect of becoming an MVP: the community you affect monthly, weekly and sometimes daily. MVPs get involved in various ways. Whether it is running a user group, answering questions on the MSDN Forums, writing books, helping out at SQL Saturdays, speaking at .NET User Group venues, flying to Tennessee to run an Azure Code Camp or even planning a trip to Paris to speak about SQL Azure, all these activities have one thing in common: they provide unique opportunities to help people one on one, have candid conversations about Microsoft technologies and hopefully help individuals achieve greater results with the Microsoft platform. At no charge.

Businesses

As an MVP I often reach out to the corporate world, delivering presentations, training and guidance. Some of this is done for hire, but many meetings are an extension of my community work and as a result are performed at no charge. However, the most important aspect of working with businesses is obtaining on-the-floor feedback on the realities of what companies are struggling with, in terms of process, people and technology. As a result, this category becomes an important source of information that leads to the third category…

Product Involvement

This is perhaps the area that I least expected as a new MVP: being involved in early technical previews and providing feedback to Microsoft on design features, prioritization and even early bits of upcoming features. Above anything else, I consider this the cherry on the cake, a second layer of icing, the ultimate chocolate fondue! I’ve met a lot of SQL Server/SQL Azure team members and provided feedback on early previews of upcoming features. I also met Cihan Biyikoglu (blog: http://blogs.msdn.com/b/cbiyikoglu/) in March while visiting the Microsoft Campus for the MVP Global Summit. Cihan knew that I was building a sharding library for parallel processing of SQL requests. Since Cihan was leading the SQL Azure Federation feature, he invited me to participate in the Federation Evaluation Program. Since then I have updated my sharding library on CodePlex to support Federations (http://enzosqlshard.codeplex.com/) and presented the library at the recent PASS Summit 2011. Cihan and I connected again and made this video announcing the upcoming features of SQL Azure and discussing some of the ways MVPs work with the SQL Azure team.

Being involved with the product teams completes the circle of the community picture; it is a win-win situation for all parties, from the individuals and businesses seeking advice to Microsoft obtaining feedback from the field. If you are thinking about becoming an MVP, I hope I have given you some motivation to pursue your goal. It is well worth it.

Author

Herve Roggero (http://www.herveroggero.com) is co-founder of Pyn Logic (http://www.pynlogic.com) and Blue Syntax Consulting (http://www.bluesyntax.net). Herve’s experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Over the last 15 years, Herve has worked in the Education, Financial, Health Care, Management Consulting and Database Security sectors. He holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds an MBA from Indiana University. Herve is heavily involved with the South Florida SQL Server community, speaks at multiple venues, co-authored Pro SQL Azure and runs SQL Saturday events in South Florida. Herve is a SQL Azure MVP.


<Return to section navigation list>

Marketplace DataMarket, Social Analytics and OData

I updated my (@rogerjenn) My Microsoft Codename “Social Analytics” Windows Form Client Detects Anomaly in VancouverWindows8 Dataset post of 11/17/2011 with an additional example of the precipitous drop in Tweets per day (buzz).

Microsoft’s Data Explorer Team described The Data Explorer Formula Language in an 11/17/2011 post:

We have now seen several blog posts introducing us to several feature areas of Data Explorer. One particular feature was visible in numerous screen shots, but has not yet been given specific attention: the formula bar. This post delves into some of the depths of what makes Data Explorer tick. Feel free to skim and read selectively. We will follow up with further posts covering related details and go even deeper, so taking a bit of time with today’s post should pay off in laying grounds for those future posts.

When building up tasks in Data Explorer, the formula bar tracks what each step is about. Just as in Excel, it is possible to directly edit the formula in the formula bar. To do so, we need to understand the “language” used to write a formula: the Data Explorer formula language. This language is, by design, quite close to the one found in Excel. So before taking a closer look at the Data Explorer formula language, let’s take a brief look at Excel’s first.

When working with spreadsheets, it is often necessary to enter formulas that compute a cell’s value based on other values in the spreadsheet. In Excel, the formula bar just above the current worksheet shows the formula “behind” the currently selected cell.

An Excel formula always starts with an equal sign, followed by an expression. The expression may contain simple operators familiar from basic math (+ - * /) and function invocations. For instance, in order to sum up the orders from a table containing the number of orders from each listed customer, the SUM function can be used.

Spreadsheets and Excel are very widely used. Many people have some level of understanding of the Excel formula language. To ease the adoption of Data Explorer, its usage model follows a design similar to Excel’s. In Excel, we typically start out “building” formulas in the user interface by clicking on cells and completing forms. Over time, we see how the formulas we build turn out looking in the formula bar. From there, it is a short step to begin “tweaking” formulas – by fixing a cell reference, editing a constant used in a formula, and so on. Finally, some users venture into just typing formulas into the formula bar.

Data Explorer follows that same model. A lot can be achieved by clicking and completing forms in the user interface. Behind the scenes, there is always the formula language at play. So, let’s take a closer look at the formula language and how it works. Here is a simple example of performing a calculation using the formula bar:

Consider the summation of values in a list. Lists can be written using curly braces – although it is much more common for lists of data to come from some data source.
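For instance, a small list can be entered directly in the formula bar (a minimal sketch; the literal values here are invented for illustration):

= {3, 1, 4, 1, 5}

As in Excel, the formula starts with an equal sign; the curly braces construct a list value from the comma-separated items.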

Once we have a list, we can call functions to calculate interesting properties of the list. For instance, we can sort the list in descending order.

Here, we use the function List.Sort to sort the given list. The second argument to that function, Order.Descending, is a pre-defined named value that indicates the desired sort order (descending, in this case). You may wonder where List.Sort is coming from. It is a function included in the Data Explorer standard library that today offers over 300 functions, with more to come in the future.
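Continuing the sketch above (again with invented values), sorting that list in descending order would read:

= List.Sort({3, 1, 4, 1, 5}, Order.Descending)

which evaluates to {5, 4, 3, 1, 1}, while = List.Sum({3, 1, 4, 1, 5}) evaluates to 14.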

There are a few things worth noting before we move on. As in Excel, function names can contain dots. In Data Explorer, by convention, function names are formed as pairs of nouns that name the area the function belongs to followed by the specific function. Unlike Excel, function names use lower- and upper-case characters and the casing is significant. (LIST.SUM would be a different name and will not work when the intention is to use List.Sum.) Function names are also spelled out to make them easier to read. (In Excel, many function names are abbreviated. Excel’s STDDEV is called List.StandardDeviation in Data Explorer.)

In Excel, the values are usually spread out across cells of a row or column and summation is achieved by specifying the range of cells that hold the values we want to sum up. It is also possible to name such a range of cells in Excel and then use that name to specify which values to sum up. In Data Explorer, there is no concept of cell ranges. Instead, data sets are commonly referred to by name. If we have a table of values, then we can extract the list of values found in one of the table’s columns and compute the sum of those values.

In this example, we have a list of monthly pay rates for the employees of a company. By summing up the values in that list (using the List.Sum function) and dividing by the count of employees (using the List.Count function), we get the average monthly pay rate. We can also use the function List.Average to directly compute the average value.
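Based on that description, the two calculations would read something like the following (EmployeePay is the list of pay values defined later in the post):

= List.Sum(EmployeePay) / List.Count(EmployeePay)

= List.Average(EmployeePay)

Both formulas yield the average monthly pay rate; List.Average simply folds the sum-and-divide step into a single function.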

In Data Explorer, it is possible to see all formulas behind a section of a mashup document at once. To get there, right-click a section tab and select “View Formulas”.

The structure of formulas in a section relates very closely to the structure of resources and task streams seen in the UI. In our example, if we select the first resource, “AveragePay”, we see the defining formula for that resource.

If we select the second resource, “EmployeePay”, we see a task stream, a series of formulas applied on top of each other. When selecting any one of the tasks in that stream, we see the formula for that task. In the screenshot, we selected the task “ExtractedColumn” and the formula for that task extracts the column “Pay” from the “InsertedCustom” table, returning it as a list.

Now, let’s look at the combined formulas for the (only) section of our document.

Each of the resource definitions (“AveragePay”, “EmployeePay”) appears in the section’s formulas as a name followed by an equal sign followed by the definition. If a resource is defined by a simple formula, as is the case for AveragePay in our example, then that formula appears on the right-hand side of the equal sign:

AveragePay = List.Average(EmployeePay);

For resources that are defined by a task stream, as is the case for EmployeePay in our example, we can see a further feature of the Data Explorer formula language: the “let-in” expression. As we can see, EmployeePay is defined as a sequence of named expressions, each of which corresponds to one task in the task stream. All but the first of these expressions refer to the immediate preceding task by name, thus forming the task stream. Finally, the last named expression is referenced in the “in” part of the “let-in” expression.
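Putting the pieces described in this post together, the EmployeePay definition might read roughly as follows (a reconstruction based on the task names mentioned above and the formulas shown below, not the exact generated code):

EmployeePay =
    let
        Localhost = Sql.Databases("localhost"),
        HumanResources.EmployeePayHistory =
            Localhost[M_AdventureWorks][HumanResources.EmployeePayHistory],
        InsertedCustom = Table.AddColumn(HumanResources.EmployeePayHistory, "Pay",
            each [Rate] * [PayFrequency]),
        ExtractedColumn = InsertedCustom[Pay]
    in
        ExtractedColumn;

Each named expression in the “let” part corresponds to one task in the task stream, and the last one, ExtractedColumn, is the value the resource evaluates to.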

The Localhost expression returns a view of all the databases installed on the local machine’s default database server. (We are using the Sql.Databases function, included in the Data Explorer standard library.) The value of that expression is a record – a piece of data that has named parts to it, called record fields or just fields. In this example, each field corresponds to an installed database. We also say that the Localhost expression evaluates to a record value.

Let’s take a closer look at that “let-in” expression defining EmployeePay. The first named expression in the “let” part opens the databases on the local machine’s default database server:

Localhost = Sql.Databases("localhost"),

We then extract the specific database “M_AdventureWorks” and from there the specific table “HumanResources.EmployeePayHistory”:

HumanResources.EmployeePayHistory =
    Localhost[M_AdventureWorks][HumanResources.EmployeePayHistory],

The square brackets used above indicate accessing a part of some data, selected by name. We saw that Localhost evaluates to a record value. Writing someRecord[fieldName] yields the value of the record’s field that has the name fieldName.

The result of selecting the “M_AdventureWorks” database is another record holding all the named tables in that database. So, the expression:

Localhost[M_AdventureWorks][HumanResources.EmployeePayHistory]

evaluates to the value of the “HumanResources.EmployeePayHistory” table.

A table is, essentially, a list of rows, where each row is a record of the same “shape”. Specifically, all row records of a table have the same number of like-named fields.
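To make that concrete, here is a hand-written table sketch (this assumes Table.FromRecords is among the standard library functions; the field values are invented):

= Table.FromRecords({
    [Rate = 10, PayFrequency = 4],
    [Rate = 20, PayFrequency = 2]
})

Both row records have the same two fields, Rate and PayFrequency, so together they form a table with two columns and two rows.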

Tables are such an important concept in Data Explorer that they are handled specially. A table is displayed with column headers and an open-ended list of rows. In addition, the ribbon changes to present specific table tasks.

Let’s say we want to add a new column to the table. For each row, we want the new column to hold the product of the existing “Rate” and “PayFrequency” values in that row. In Excel, we would achieve that by typing in a product expression that multiplies two cell values. We would then “fill down” that expression to get this product for each row of our table. As we saw earlier, tables in Data Explorer don’t live in a spreadsheet grid and can have huge numbers of rows. Instead of filling down a formula to every row of a table, we instead want to specify the desired behavior at table level.

By selecting “Insert Column”, “Custom Column” in the ribbon, we get to fill out a formula builder that helps us form the right task in Data Explorer.

The final formula reads:

= Table.AddColumn( HumanResources.EmployeePayHistory, "Pay",
    each [Rate] * [PayFrequency] )

This formula looks a little daunting, so let’s take it apart to understand what it actually does.

First of all, we are adding a column to an existing table. The function Table.AddColumn does just that. It takes an existing table (here the one named HumanResources.EmployeePayHistory introduced in the previous task of our task stream), the name of the new column (here “Pay”), and an expression that defines what value the new column should have in each of the table’s rows.

It is really only that last expression, the “each” expression, that introduces us to a few new concepts. We will return to the exact mechanics behind “each” expressions in a future blog post, but for now it suffices to think of “each” as saying: calculate the “each” expression separately for each row of the table that Table.AddColumn (or any other table-manipulating function) visits when building the new column. Then, for each of the rows, our expression each [Rate] * [PayFrequency] extracts the value of the “Rate” column in that row and multiplies it by the value of the “PayFrequency” column in that same row.
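One way to picture this (an assumption about the underlying mechanics, which the team says a future post will cover in detail) is that an “each” expression behaves like a one-argument function that Table.AddColumn applies to every row, with the bare field references selecting fields of that row:

each [Rate] * [PayFrequency]

reads roughly as

(row) => row[Rate] * row[PayFrequency]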

So, with a little squinting, we can actually read our entire add-column expression as: ‘Add a column to the given table, name it “Pay”, and for each row in the given table, compute a value for the new column that is the product of the given “Rate” and “PayFrequency” columns.’ With a little practice, this becomes second nature when using Data Explorer and its formula language since table-level manipulations are so very common.

At this point, we have seen a fairly large part of the Data Explorer formula language already – enough to make serious headway when using Data Explorer. We will continue visiting the formula language in future posts, delving into various more technical aspects. Our next language-related post will explain more of the basics of how things actually work, including the various shapes of data that are supported, and introduce a few new concepts such as types and custom functions. However, rest assured that the language, despite its power, is overall fairly simple and easy to learn.

These “Data Explorer” posts would be much more interesting if potential users had access to the CTP, which isn’t due for a limited, private release until the end of November 2011. Sign up for a SQL Azure Labs “Data Explorer” account here.


Glenn Gailey (@ggailey777) described Storing Complex Types, Binary Resources, and Other Tricky Things, a.k.a. “Sync’ing OData to Local Storage in Windows Phone (Part 2)” in an 11/17/2011 post:

In my previous post in this series, I described how to use a T4 template in Visual Studio to generate a hybrid proxy that supports both OData feeds and local database storage. There were also a few local storage issues that I didn’t get a chance to cover in the first post, namely what to do with binary resources (media resource streams), which can be images or audio and video files or streams, downloaded from the data service. Also, I came across a workable solution that enables you to store complex properties in the local database.

I also decided to create and publish a Windows Phone Mango app project to Code Gallery that will include my T4 template for the hybrid proxy and demonstrate how to use it. I decided to use the public Netflix feed as my data source because, a) it’s got a pretty advanced data model that includes both complex types and media link entries (MLEs) and b) I can just add this new local storage functionality to the original Netflix feed-based Windows Phone quickstart that I wrote. (I will probably also need to extend the sample to include traversing an association, but I will decide that later.)

Support for Complex Types

If you recall from my previous post, I was bemoaning the lack of complex type support in LINQ-to-SQL (L2S) and thereby also in local database. Fortunately, I came across an old blog post by Dino Esposito where he describes his solution to this problem (lack of support for complex properties in L2S). Dino’s basic solution was to create private backing properties in the main type for the properties of the complex type, and then attribute these properties for L2S storage. Fortunately, L2S supports storing of private fields. For example, here is what the definition for the complex property Title.BluRay, which returns the complex type DeliveryFormatAvailability, looks like in my T4-generated proxy:

private DeliveryFormatAvailability _bluray;

[global::System.Runtime.Serialization.DataMemberAttribute()]
public DeliveryFormatAvailability BluRay
{
    get { return _bluray; }
    set
    {
        if (_bluray != value)
        {
            _bluray = value;
            OnPropertyChanged("BluRay");
        }
    }
}
partial void OnBluRayChanging(DeliveryFormatAvailability value);
partial void OnBluRayChanged();

[global::System.Data.Linq.Mapping.ColumnAttribute(CanBeNull=false)]
private bool BluRay_Available
{
    get { return this.BluRay.Available; }
    set { this.BluRay.Available = value; }
}

[global::System.Data.Linq.Mapping.ColumnAttribute]
private DateTime? BluRay_AvailableFrom
{
    get { return this.BluRay.AvailableFrom; }
    set { this.BluRay.AvailableFrom = value; }
}

[global::System.Data.Linq.Mapping.ColumnAttribute]
private DateTime? BluRay_AvailableTo
{
    get { return this.BluRay.AvailableTo; }
    set { this.BluRay.AvailableTo = value; }
}

[global::System.Data.Linq.Mapping.ColumnAttribute]
private string BluRay_Rating
{
    get { return this.BluRay.Rating; }
    set { this.BluRay.Rating = value; }
}

[global::System.Data.Linq.Mapping.ColumnAttribute]
private int? BluRay_Runtime
{
    get { return this.BluRay.Runtime; }
    set { this.BluRay.Runtime = value; }
}

I knew that I had updated my templates to correctly generate the attributed complex property and private backing properties because L2S was able to load the model, but I was getting null exceptions when trying to add objects. Turns out L2S was doing something to try and set the backing properties before the complex type was set by the OData client. The solution was simply to instantiate the complex properties in the default constructor for the entity type:

public Title()
{
    // We need to explicitly instantiate complex properties or L2S fails.
    this.BluRay = new DeliveryFormatAvailability();
    this.BoxArt = new BoxArt();
    this.Dvd = new DeliveryFormatAvailability();
    this.Instant = new InstantAvailability();
}

At this point, I was able to get the first page of Titles from the Netflix service, materialize them, and store them in the local database. Then, the next time I started the program, I was able to get them from the local database first before trying the service again. Here’s the LoadData method that does this:

// Loads data when the application is initialized.
public void LoadData()
{
    // Instantiate the context and binding collection using the stored URI.
    this._context = new NetflixCatalogEntities();

    try
    {
        // Try to get entities from local database.
        var storedTitles = from t in localDb.Titles
                           orderby t.ReleaseYear descending, t.Name
                           select t;

        if (storedTitles != null && storedTitles.Count() == 0)
        {
            var titlesFromService = new DataServiceCollection<Title>(this._context);

            titlesFromService.LoadCompleted += this.OnTitlesLoaded;

            // Load the data from the OData service.
            titlesFromService.LoadAsync(GetQuery());
        }
        else
        {
            // Bind to the data from the local database.
            this.Titles = new ObservableCollection<Title>(storedTitles);
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show("Unable to load stored titles. " + ex.Message);
    }
}

And, here’s where we store the entities returned by the data service request:

private void OnTitlesLoaded(object sender, LoadCompletedEventArgs e)
{
    if (e.Error == null)
    {
        // Get the binding collection, which is the sender.
        DataServiceCollection<Title> loadedTitles =
            sender as DataServiceCollection<Title>;

        if (loadedTitles != null)
        {
            // Make sure that we load all pages of the Titles feed.
            if (loadedTitles.Continuation != null)
            {
                loadedTitles.LoadNextPartialSetAsync();
            }

            // Set the total page count, if we requested one.
            if (e.QueryOperationResponse.Query
                .RequestUri.Query.Contains("$inlinecount=allpages"))
            {
                _totalCount = (int)e.QueryOperationResponse.TotalCount;
            }

            try
            {
                localDb.Titles.InsertAllOnSubmit<Title>(loadedTitles);

                localDb.SubmitChanges();
            }
            catch (Exception ex)
            {
                MessageBox.Show("Titles could not be stored locally. "
                    + ex.Message);
            }

            loadedTitles.LoadCompleted -= OnTitlesLoaded;

            this.Titles = loadedTitles;

            IsDataLoaded = true;
        }

        // Update the pages loaded text binding.
        NotifyPropertyChanged("PagesLoadedText");
    }
    else
    {
        // Display the error message in the binding.
        this.Message = e.Error.Message;
    }
}

I still haven’t thought through all the ramifications of local storage with client paging, but assuming that we maintain the same sort order and the service isn’t changing much, we should be able to perform this “get it from local and if it’s not there then get it from the service” logic for each page in the feed (of 82K Netflix titles). Obviously, I will have to figure out the memory issue at some point or else a user can fill up the phone with a large feed like Netflix.
This is why you should always page in the client and not rely on the service to page!

Storing Binary Resources

The final massively tricky problem I need to deal with is getting and storing binary resources. In OData (and really it’s from AtomPub), an entity that is attributed as a media link entry (MLE) can have a related media resource (MR). This media resource, usually a BLOB like an image, audio or video file, is accessed from the data service separately from the entity itself. I’ve written a blog series on this for the OData team blog. In our case, we are storing the MLEs in the local database, but instead of filling up the database with BLOBs, it makes more sense just to put them in isolated storage. Here’s where it gets tricky… The original app simply called the GetReadStreamUri method to get the URI of the image from the data service; then, on binding, the image was requested from that URI. Easy and works great, but this means that you hit the data service for the same images over and over every time you run the app, hence the need for local storage (especially when your images are large…what a waste of bandwidth).

Instead, we need to get the images ourselves from the data service and store them locally. This involves some stream reading and writing, but nothing too hard there. (It’s also a little more cumbersome to get images from local storage than from the net, but a huge cost savings.) But what about when we have stored the entity in the local database and, for some reason, the image didn’t make it? Now we have trouble, because the entity isn’t being tracked by the DataServiceContext, and even if we attach it, the read stream URI data is missing—this info was kept in the EntityDescriptor object, which is recreated on attach but doesn’t know this URI. Looks like we will need another async call to the service to get the correct URI of the media resource for an attached entity that is an MLE (drat).

I’ll talk about my complete solution to this binary resource retrieval and storage problem in my next post.


<Return to section navigation list>

Windows Azure AppFabric: Apps, Access Control, WIF and Service Bus

The Windows Azure Platform Team sent the following End date for the Windows Azure AppFabric June CTP message to users who were on-boarded to the CTP:

In June of this year, we released the Windows Azure AppFabric June CTP to share some early ideas on possible future services and gather your feedback on the value of these service capabilities.

We appreciate your participation in the CTP and are encouraged by the feedback we have received from customers thus far.

With significant customer feedback now in hand, we will conclude the current CTP in approximately 3 weeks. Sometime on or after December 12, 2011, we will be permanently deleting all existing namespaces and data in the AppFabric LABS environment. Microsoft will not provide this data to you. If you would like to keep a backup of your work, please take the necessary steps to keep a local copy of your work before December 12, 2011. In no event shall Microsoft be liable for any damages whatsoever caused by the deletion of any namespaces, data or other information in the AppFabric LABS environment.

Thank you again for your valuable feedback. We will share more details on new preview releases on the Windows Azure blog in the future as they become available.

For any questions or to provide feedback please use our Windows Azure AppFabric CTP Forum.

Note: This is only related to the AppFabric LABS environment and does not affect any of the production services.

Windows Azure Team
Microsoft Corporation

Forewarned is forearmed.


Himanshu Singh posted Real World Windows Azure: Interview with Bert Craven, Enterprise Architect, easyJet on 11/17/2011:

The Real World Windows Azure series staff recently talked to Bert Craven, Enterprise Architect at easyJet, Europe’s leading low-fare airline, about using Windows Azure Service Bus to securely open up corporate applications to mobile devices at airports all over Europe. Let’s listen in:

MSDN: Tell us about the challenges you were trying to solve with Windows Azure Service Bus.

Craven: In most airports, we use Common Use platforms to provide departure control services such as bag drop, check-in, and boarding. We fly to more than 130 airports in Europe and pay millions of pounds annually to rent desks and Common Use equipment. These expensive, inflexible, closed systems are not well suited to easyJet’s style of rapid, low-cost innovation and adaptation. Furthermore, the contractual terms are rarely well suited to our need to flex our levels of service over seasonal peaks and troughs, operate out of airports for only part of the year, deploy and exit quickly with changes in demand, and so on.

More importantly, these terminals anchor our service agents behind desks, which is not always the best place to serve customers. We wanted our service agents to be free to roam around airport check-in areas with mobile devices and not only check in passengers but sell them additional services, such as rental cars, subway tickets, and so forth.

MSDN: What was the technical problem to doing this?

Craven: This vision of mobile airport service agents has been around for a long time, but the problem is securely opening up our back-end business systems to mobile devices. It’s too big a risk, and no airline, including easyJet, has been willing to do it.

MSDN: So how did Windows Azure Service Bus help?

Craven: Service Bus gave us a way to make our back-end, on-premises services available publicly but in a secure and flexible way. Instead of exposing endpoints in the easyJet data center, we can expose services in the Microsoft cloud where anyone can consume them. The address for the service is in the cloud and stays the same regardless of which data center I provision it from. We don’t have to build a new high-availability service platform, make firewall configuration changes, or deploy lots of new servers.

We also used Windows Azure Access Control to provide authorization services. Access Control gave us a rich, federated security model based on open standards, which was critical.

MSDN: Very cool. So, what did you actually build using Service Bus and Access Control?

Craven: We built a mobile service-delivery platform called Halo that overlays the European airports in which we operate with a secure, private communications network and local wireless endpoints. Wireless handheld devices access the communications network in a managed device layer. Halo services are exposed through Service Bus to access back-end applications such as boarding, sales, customer relationship management, and others. Eventually, Halo will also accommodate portable computers, kiosks, and any other devices that can help us serve customers better.

MSDN: How did your developers like working with Service Bus and Access Control?

Craven: It was very easy for our developers to come up to speed on these Windows Azure services. They still write .NET code in the Microsoft Visual Studio development system. The jump from consuming normal .NET services was incredibly straightforward. We had to do little more than change a configuration file to expose our services in Windows Azure. With Service Bus, we’ve been able to deliver features that previously would have required reams of code. It gave us extensive out-of-the-box functionality that enabled us to get new services to market before our competitors, using familiar development tools.

MSDN: Have you rolled out the Halo platform yet?

Craven: We have piloted Halo at select airports and given service agents access to applications that support boarding and payment. In the next phase, we’ll roll out additional functionality, including check-in, ticket purchases, and other services. Ultimately, we’re aiming for a full suite of operational, retail, and CRM applications.

MSDN: What kind of savings will easyJet realize with Halo?

Craven: Reducing our usage of and reliance on Common Use platforms whilst augmenting them with our own mobile, flexible platform could amount to multi-million pound savings annually, as well as providing a gateway to other cost reductions and new streams of revenue.

MSDN: Wow. What about the benefit to your customers?

Craven: That’s the whole point of Halo; with it, we can give customers faster service and a better airport experience by eliminating many of the lines they currently wait in. A roaming agent can triage questions a lot faster than an agent stuck behind a desk. Halo will also be of huge benefit during periods of disruption such as recent bouts with snow and volcanic ash, where traditional resources were placed under unbearable strain.

MSDN: In addition to CUTE rental savings, what are other benefits to the business?

Craven: Without Service Bus, there’s a good chance that this project simply would not have gotten off the ground. It would have cost way too much just to get to the prototype stage. I was able to create something single-handedly that was proof enough for management to proceed with the idea.

As for ongoing development, Windows Azure has become an extension of our on-premises environment and gives developers a unified experience, because it’s an extension of what they already know. It’s a low-cost sandbox in which we can cost-effectively incubate new ideas. The moment the competition catches up, we want to innovate again.

Of course, Windows Azure also gives us immense scalability, high availability, and airtight data security. We have a high level of confidence that we are doing something fundamentally safe.


Wade Wegner (@WadeWegner) described Outsourcing User Authentication in a Windows Phone Application with Access Control Services in an 11/16/2011 post:

Yesterday I shared all the NuGet packages we’re building to make it easy to build Windows Phone and Windows Azure applications. Today I wanted to share how easy it is to build a Windows Phone application that leverages the Windows Azure Access Control service.

The Phone.Identity.AccessControl.BasePage NuGet package includes a control for Windows Phone that allows your phone applications to outsource user authentication to the Windows Azure Access Control service (ACS). This service enables your users to log in by reusing their existing accounts from identity providers such as Windows Live ID, Google, Yahoo, Facebook, and even Active Directory. If you want to know more about ACS, take a look at the dedicated hands-on labs in the Windows Azure Platform Training Course.

Using this NuGet package and the included control for ACS in your Windows Phone applications takes care of all the runtime interactions with ACS. Additionally, this package provides a base login page that uses the control and is easy to setup in your phone application. All that is left for you to do is to configure your ACS namespace via the management portal (i.e. specifying your preferences such as the identity providers you want to enable in your application) and integrate the login page into your existing Windows Phone application.

For more information on setting up ACS take a look at the resources at http://acs.codeplex.com/

To help simplify the process below, I’m making the assumption you already have ACS setup and configured. I’ll be using the following values in the below sample (no guarantee that they’ll be available when you read this post but I’ll do my best):

  • namespace: watwindowsphone
  • realm: uri:watwindowsphone

Without further ado, here are the steps to build a Windows Phone application that outsources authentication to ACS:

  1. Create a new Windows Phone OS 7.1 application.
  2. From the Package Manager Console type the following to install the ACS base login page NuGet package for Windows Phone: Install-Package Phone.Identity.AccessControl.BasePage
  3. Update the AccessControlResources.xaml resources file to use your ACS namespace and the realm you have configured.
        <system:String x:Key="acsNamespace">watwindowsphone</system:String>
        <system:String x:Key="realm">uri:watwindowsphone</system:String>
  4. Update the WMAppManifest.xml file so that the default page is the LoginPage.xaml. This way the user will come to the login page before the MainPage.xaml.
  5. Update the LoginPage.xaml.cs so that the user is navigated to the MainPage.xaml upon successfully logging into the application. Make sure to update Line 23 and Line 33.
     this.NavigationService.Navigate(new Uri("/MainPage.xaml", UriKind.Relative));
  6. Let’s display some information from the Simple Web Token. Add a TextBlock control to the MainPage.xaml page.
        <!--ContentPanel - place additional content here-->
        <Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
            <TextBlock Name="DisplayLoginInfo" />
        </Grid>
  7. Add a Loaded event for the MainPage.xaml. In this event you’ll want to load the simpleWebTokenStore out of the application resources. You can then use it to grab resources like the name identifier or various other claim types (like Name). Finish by updating the DisplayLoginInfo TextBlock.
        using Microsoft.WindowsAzure.Samples.Phone.Identity.AccessControl;
    
        ...
    
        var simpleWebTokenStore = Application.Current.Resources["swtStore"]
            as SimpleWebTokenStore;
    
        var userNameIdentifier = simpleWebTokenStore.SimpleWebToken.NameIdentifier;
        var name = simpleWebTokenStore.SimpleWebToken.Claims[ClaimTypes.Name];
    
        this.DisplayLoginInfo.Text =
            "Identifier: " + userNameIdentifier + Environment.NewLine +
            "Name: " + name;
  8. Run the application. I’d recommend using Facebook, Google, or Yahoo! for the identity providers, as Live ID does not provide the name claim type in the SWT token.

And that’s it! You can now take advantage of the Identifier claim (and others) in your phone application for many things – tracking users, displaying additional user information, and so forth. Additionally, you can use these claims to authenticate against additional services running in Windows Azure – I’ll cover this topic in a future post.

The Phone.Identity.AccessControl.BasePage NuGet package makes it really easy for you to take advantage of the Windows Azure Access Control service within your applications. ACS provides a great way for you to leverage your users existing identity providers when using your application.


<Return to section navigation list>

Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

No significant articles today.


<Return to section navigation list>

Live Windows Azure Apps, APIs, Tools and Test Harnesses

No significant articles today.


<Return to section navigation list>

Visual Studio LightSwitch and Entity Framework 4.1+

No significant articles today.


<Return to section navigation list>

Windows Azure Infrastructure and DevOps

The CloudHarmony Blog asked Is Joyent Really 14X Faster than EC2 and Azure the "Fastest Cloud"? Questions to Ask About Benchmark Studies in an 11/17/2011 post:

Many are skeptical of claims that involve benchmarks. Over the years benchmarks have been manipulated and misrepresented. Benchmarks aren't inherently bad or created in bad faith. To the contrary, when understood and applied correctly, benchmarks can often provide useful insight for performance analysis and capacity planning. The problem with benchmarks is that they are often misunderstood or misrepresented, frequently resulting in bold assertions and questionable claims. Oftentimes there are also extraneous factors involved, such as agenda-driven marketing organizations. In fact, the term "benchmarketing" was coined to describe questionable marketing-driven, benchmark-based claims. This post will discuss a few questions one might consider when reading benchmark-based claims. We'll then apply these questions to two recent cloud-related, benchmark-based studies.

Questions to consider
The following are 7 questions one might ask when considering benchmark-based claims. Answering these questions will help to provide a clearer understanding of the validity and applicability of the claims.
  1. What is the claim? Typically the bold-face, attention grabbing headline like Service Y is 10X faster than Service Z
  2. What is the claimed measurement? Usually implied by the headline. For example the claim Service Y is 10X faster than Service Z implies a measurement of system performance
  3. What is the actual measurement? To answer this question, look at the methodology and benchmark(s) used. This may require some digging, but can usually be found somewhere in the article body. Once found, do some research to determine what was actually measured. For example, if Geekbench was used, you would discover the actual measurement is processor and memory performance, but not disk or network IO
  4. Is it an apples-to-apples comparison? The validity of a benchmark-based claim ultimately depends on the fairness of the testing methodology. Claims involving comparisons should compare similar things. For example, Ford could compare a Mustang Shelby GT500 (top speed 190 MPH) to a Chevy Aveo (top speed 100 MPH) and claim their cars are nearly twice as fast, but the Aveo is not a comparable vehicle and therefore the claim would be invalid. A more fair, apples-to-apples comparison would be a Mustang GT500 and a Chevy Camaro ZL1 (top speed 186).
  5. Is the playing field level? Another important question to ask is whether or not there are any extraneous factors that provided an unfair advantage to one test subject over another. For example, using the top speed analogy, Ford could compare a Mustang with 92 octane fuel and a downhill course to a Camaro with 85 octane fuel and an uphill course. Because there are extraneous factors (fuel and angle of the course) which provided an unfair advantage to the Mustang, the claim would be invalid. To be fair, the top speeds of both vehicles should be measured on the same course, with the same fuel, fuel quantity, driver and weather conditions.
  6. Was the data reported accurately? Benchmarking often results in large datasets. Summarizing the data concisely and accurately can be challenging. Things to watch out for include lack of good statistical analysis (i.e. reporting average only), math errors, and sloppy calculations. For example, if large, highly variable data is collected, it is generally a best practice to report the median value in place of mean (average) to mitigate the effects of outliers. Standard deviation is also a useful metric to include to identify data consistency.
  7. Does it matter to you? The final question to ask is, assuming the results are valid, does it actually mean anything to you? For example, purchasing a vehicle based on a top speed comparison is not advisable if fuel economy is what really matters to you.
Case Study #1: Joyent Cloud versus AWS EC2

[Excised for brevity. See site.]

Case Study #2: Microsoft Azure Named Fastest Cloud Service
In October 2011, Compuware published a blog post related to cloud performance. The post was picked up by various media outlets, resulting in a number of attention-grabbing headlines. Here's how the test worked in a nutshell:
  • Two sample e-commerce web pages were created. The first with items description and 40 thumbnails (product list page), and the second with a single 1.75 MB image (product details page)
  • These pages were made accessible using a Java application server (Tomcat 6) running in each cloud environment. The exception to this is Microsoft Azure and Google AppEngine (platform-as-a-service/PaaS environments) which required the pages to be bundled and deployed using their specific technology stack
  • 30 monitoring servers/nodes were instructed to request these 2 pages in succession every 15 minutes and record the amount of time it took to render both in their entirety (including the embedded images)
  • The 30 monitoring nodes are located in data centers in North America (19), Europe (5), Asia (3), Australia (1) and South America (2) - they are part of the Gomez Performance Network (GPN) monitoring service
  • After 1 year an average response time was calculated for each service (response times above 10 seconds were discarded)

Now let's dig a little deeper...
Questions & Answers

What is the claim? Microsoft Azure is the "fastest cloud"
 
What is the claimed measurement? Overall performance (it's fastest)
 
What is the actual measurement? Network Latency & Throughput

Rendering two HTML pages and some images is not CPU intensive and as such is not a measure of system performance. The main bottleneck is network latency and throughput, particularly to distant monitoring nodes (e.g. Australia to US).

Is it an apples-to-apples comparison? Types of services tested are different (IaaS vs PaaS) and the instance types are dissimilar
Microsoft Azure and Google AppEngine are platform-as-a-service (PaaS) environments, very different from infrastructure-as-a-service (IaaS) environments like EC2 and GoGrid. With PaaS, users must package and deploy applications using custom tools and more limited capabilities. Applications are deployed to large, clustered, multi-tenant environments. Because of the greater structure and more limited capabilities of PaaS, providers are able to better optimize and scale those applications, often resulting in better performance and availability when compared to a single-server IaaS deployment. Not much information is disclosed regarding the sizes of instances used for the IaaS services. With some IaaS providers, network performance can vary depending on instance size. For example, with Rackspace Cloud, a 256MB cloud server is capped with a 10 Mbps uplink. With EC2, bandwidth is shared across all instances deployed to a physical host. Smaller instance sizes generally have less, and more variable, bandwidth. This test was conducted using nearly the smallest EC2 instance size, an m1.small.
Is the playing field level? Services may have an unfair advantage due to network proximity and uplink performance
Because network latency is the main bottleneck for this test, and only a handful of monitoring nodes were used, the results are highly dependent on network proximity and latency between the services tested and the monitoring nodes. For example, the Chicago monitoring node might be sitting in the same building as the Azure US Central servers giving Azure and unfair advantage in the test. Additionally, the IaaS services where uplinks are capped on smaller instance types would be at a disadvantage to uncapped PaaS and IaaS environments.
Was the data reported accurately? Simple average was reported - no median, standard deviation or regional breakouts were provided
The CloudSleuth post provided a single metric only… the average response time for each service across all monitoring nodes. A better way to report this data would involve breaking the data down by region. For example, average response time for eastern US monitoring nodes. Reporting median, standard deviation and 90th percentile statistical calculations would also be very helpful in evaluating the data.
Does it matter to you? Probably not
Unless your users are sitting in the same 30 data centers as the GPN monitoring nodes, this study probably means very little. It does not represent a real world scenario where static content like images would be deployed to a distributed content delivery network like CloudFront or Edgecast. It attempts to compare two different types of cloud services, PaaS and IaaS. It may use IaaS instance types like the EC2 m1.small that represent the worst case performance scenario. The 30 node test population is also very small and not indicative of a real end user population (end users don't sit in data centers). Finally, reporting only a single average value ignores most statistical best practices.

Jay Heiser offered an interesting analogy in his SLA feather allows you to fly in the cloud post of 11/17/2011 to his Gartner blog:

In the Disney cartoon Dumbo, a misfit elephant discovers that his ears are so large that he has the ability to fly. Lacking confidence in this unusual capability, he is understandably reluctant to fully exploit it. His shrewd friends, a pack of crows, come up with an ingenious psychological ploy. They give him a single feather, explain that it has magical properties, and as long as he clutches it in his trunk, he will be able to fly. This ruse is wildly successful, enabling the young pachyderm to soar with the crows.

This story is compelling because it illustrates a common form of human behavior. Desiring to do something, and unfamiliar with the associated risks, we often latch onto talismans, superstitiously hoping that they will protect us from unforeseen disasters. We do the same thing in the IT world, vainly clutching contractual SLAs in order to gain the false confidence that it will enable us to safely fly in the public cloud.

A typical example is a security document I recently reviewed which explained how this particular enterprise could safely rely on a specific SaaS vendor to reliably protect, and if necessary recover, their data, because it was ‘protected by a high level of SLA’. AN SLA IS NO MORE THAN AN EXPRESSION OF INTENT; IT IS NOT EVIDENCE OF DELIVERABILITY.

I’m not saying that you should not seek very specific service level agreements in your contracts–by no means. What I am saying is that the mere fact that a vendor promises to do something does not mean that you as the buyer can rely on it to happen.

If your business depends upon having access to data or services that are hosted externally, then you need to either have a tested contingency plan that functions entirely independently of that provider, or you need specific evidence that the provider is not only reliably making offline backups that are consistent with your recovery point objectives, but also that they have a proven ability to restore your data within your time objectives.

I have to admit that I’m completely at a loss to come up with some sort of test which would reliably demonstrate that in the event that even a small part of their client data was lost, say a petabyte worth belonging to a couple hundred thousand digital tenants, that a service provider would be able to restore it within a few days. As a recent case in point for what was a minuscule disaster, earlier this year it took Google 4 days to restore what they described as .02% of their Gmail users.

What form of evidence could demonstrate that a provider with hundreds of thousands of tenants, if not millions, could, after some sort of data-eating disaster, reliably restore that data from offline backups and link it back into the accounts and applications such that it could be used again by their customers? How long would it take for the highly-leveraged administrative staff of a cloud services provider to complete such an operation? Where in the recovery queue would you be?
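A rough back-of-envelope calculation illustrates the scale of the problem; the figures below are assumptions for illustration, not anything a provider has published:

```python
# Hypothetical scenario: restore 1 PB of tenant data within a 3-day recovery window.
data_bytes = 1 * 10**15          # 1 PB (decimal)
window_seconds = 3 * 24 * 3600   # 3 days

required_throughput = data_bytes / window_seconds
print(f"Sustained restore rate needed: {required_throughput / 10**9:.1f} GB/s")
# Roughly 3.9 GB/s, sustained, before any time spent re-linking the restored
# data back into accounts and applications.
```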

An SLA from a public cloud service promising some sort of recoverability can be a crow feather, clutched in the trunk of the enterprise elephant, providing them the false courage to be willing to fly in the public cloud. I hope that this is not a lesson that your organization will have to learn the hard way.

Recent Gartner research on this topic includes ‘Black Swans’ Are Sure to Fly in the Public Cloud, and The Realities of Cloud Services Downtime: What You Must Know and Do.


Ernest Mueller (@ernestmueller) described Why Your Testing Is Lying To You in an 11/17/2011 post to his Agile Admin blog:

imageAs a follow-on to Why Your Monitoring Is Lying To You. How is it that you can have an application go through a whole test phase, with two-day-long load tests, and have surprising errors when you go to production? Well, here’s how… The same application I describe in the case study part of the monitoring article slipped through testing as well and almost went live with some issues. How, oh how could this happen…

I Didn’t See Any Errors!

Our developers quite reasonably said “But we’ve been developing and using this app in dev and test for months and haven’t seen this problem!” But consider the effects at work in But, You See, Your Other Customers Are Dumber Than We Are. There are a variety of levels of effect that prevent you from seeing intermittent problems, and confirmation bias ends up taking care of the rest.

The only fix here is rigor. If you hit your application and test and it errors, you can’t just ignore it. “I hit reload, it worked. Maybe they were redeploying. On with life!” Maybe it’s your layer, maybe it’s another layer, it doesn’t matter, you have to log that as a bug and follow up and not just cancel the bug as “not reproducible” if you don’t see it yourself in 5 minutes of trying. Devs sometimes get frustrated with us when we won’t let up on occurrences of transient errors, but if you don’t know why they happened and haven’t done anything to fix it, then it’s just a matter of time before it happens again, right?

We have a strict policy that every error is a bug, and if the error wasn’t detected it is multiple bugs – a bug with the monitoring, a bug with the testing, etc. If there was an error but “you don’t know why” – you aren’t logging enough or don’t have appropriate tools in place, and THAT’s a bug.

Our Load Test/Automated Tests Didn’t See Any Errors!

I’ll be honest, we don’t have much in the way of automated testing in place. So there’s that. But we have long load tests we run. “If there are intermittent failures they would have turned up over a two day load test right?” Well, not so fast. How confident are you this error is visible to and detected by your load test? I have seen MANY load test results in my lifetime where someone was happily measuring the response time of what turned out to be 500 errors. “Man, my app is a lot faster this time! The numbers look great! Wait… It’s not even deployed. I hit it manually and I get a Tomcat page.”

Often we build deliberate “lies” into our software. We throw “pretty” error pages that aren’t basic errors. We are trying not to leak information to customers so we bowdlerize failures on the front end. We retry maniacally in the face of failed connections, but don’t log it. We have to use constrained sets of return codes because the client consuming our services (like, say, Silverlight) is lobotomized and doesn’t savvy HTTP 401 or other such fancy schmancy codes.

Be careful that load tests and automated tests are correctly interpreting responses. Look at your responses in Fiddler – we had what looked to the eye to be a 401 page that was actually passing back a 200 HTTP return code.
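A minimal sketch of that kind of response check, assuming a hypothetical URL and a marker string that only appears on the known-good page; the point is to assert on both the status code and the body, not just record the timing:

```python
import requests

def check_endpoint(url, expected_marker):
    """Fail the test unless both the status code and the page content look right."""
    resp = requests.get(url, timeout=10)
    assert resp.status_code == 200, f"unexpected status {resp.status_code} from {url}"
    # A 'pretty' error page can come back as a 200, so check for known-good content too.
    assert expected_marker in resp.text, f"{url} returned 200 but the body looks like an error page"
    return resp.elapsed.total_seconds()

# Hypothetical usage inside a load or smoke test:
# latency = check_endpoint("https://example.com/catalog", expected_marker="Product Catalog")
```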

The best fix to this is test-driven development. Run the tests first so you see them fail, then write the code so you see them work! Tests are code, and if you just write them against your already-working code then you’re not really sure they’ll fail when something’s bad!

Fault Testing

Also, you need to perform positive and negative fault testing. Test failures end to end, including monitoring and logging and scaling and other management stuff. At the far end of this you have the cool if a little crazy Chaos Monkey. Most of us aren’t ready or willing to jack up our production systems regularly, but you should at least do it in test and verify both that things work when they should and that they fail and you get proper notification and information if they do.

Try this. Have someone Chaos Monkey you by turning off something random – a database, making a file system read only, a back end Web service call. If you have redundancy built in to counter this, great, try it with one and see the failover, but then have them break “all of it” to provoke a failure. Do you see the failure and get alerted? More importantly, do you have enough information to tell what they broke? “One of the four databases we connect to” is NOT an adequate answer. Have someone break it, send you all the available logs and info, and if you can’t immediately pinpoint the problem, fix that.
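A minimal, self-contained sketch of the "can you pinpoint what broke?" check; the dependency names and probes below are invented for illustration:

```python
# Hypothetical dependency probes: each returns True if the dependency responds.
def probe_orders_db():  return True
def probe_users_db():   return False   # pretend the chaos exercise took this one down
def probe_search_api(): return True

PROBES = {
    "orders-db":  probe_orders_db,
    "users-db":   probe_users_db,
    "search-api": probe_search_api,
}

def diagnose():
    """Return the names of the exact dependencies that are down, not just 'something failed'."""
    return [name for name, probe in PROBES.items() if not probe()]

failed = diagnose()
if failed:
    # An alert that says "users-db is down" is actionable; "one of the databases" is not.
    print(f"ALERT: dependencies down: {', '.join(failed)}")
```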

How Complex Systems Fail, Invisibly

In the end, a lot of this boils down to How Complex Systems Fail. You can have failures at multiple levels – and not really failures, just assumptions – that stack on top of each other to both generate failures and prevent you from easily detecting those failures.

Also consider that you should be able to see those “short of failure” errors. If you’re failing over, or retrying, or whatnot – well it’s great that you’re not down, but don’t you think you should know if you’re having to fail over 100x more this week? Log it and turn it into a metric. On our corporate Web site, there’s hundreds of thousands of Web pages, so a certain level of 404s is expected. We don’t alert anyone on a 404. But we do metricize it and trend it and take notice if the number spikes up (or down – where’d all that bad content go?).
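A minimal sketch of trending such a metric and flagging a spike (or a suspicious drop), assuming daily 404 counts; the numbers are made up:

```python
from statistics import mean, pstdev

# Hypothetical daily 404 counts for the site; the last value is today.
daily_404s = [1180, 1215, 1198, 1240, 1205, 1190, 2600]

history, today = daily_404s[:-1], daily_404s[-1]
baseline, spread = mean(history), pstdev(history)

# Flag today if it sits more than 3 standard deviations from the recent baseline,
# in either direction -- a sudden drop can mean missing content, not good news.
if abs(today - baseline) > 3 * spread:
    print(f"404 count {today} is unusual vs baseline {baseline:.0f} (+/- {spread:.0f})")
```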

Wholesale failures are easy to detect and mitigate. It's the mini-failures - things that someone could argue are not failures on any given level - that line up with the same kinds of things on all the other layers, and those lined-up holes start letting problems slip through.

http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html

imageNo significant articles today.


<Return to section navigation list>

Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

Yung Chou reported the availability of a TechNet Video: Live! IT Time: Private Cloud Chat (Episode 1) in an 11/18/2011 post:

Download
WMV Download | WMA | MP3

imageHave questions about Microsoft's Private Cloud solutions? If you attended one of the Live Private Cloud TechNet events, hopefully we’ve inspired you to build out your own test environment with our downloadable evaluation products. If you did, you may have some follow-up questions after the event. If you didn't attend an in-person event but have questions regarding Private Cloud Computing, you're welcome to join us for this fun, interactive Q&A online session.


Alan Le Marquand reported the November MVA TechNet Radio show [is] Live in an 11/17/2011 post to the Microsoft Virtual Academy blog:

imageThe November MVA TechNet Radio show is now live. This month's show covers what's happening in the Academy now and what is coming down the line.

This show focuses on the Private Cloud courses that will be releasing at the end of November and looks forward to the courses coming in 2012.

imageTo get involved, send your questions to Further Episode Questions and we will try to answer them in the show.

November Episode


<Return to section navigation list>

Cloud Security and Governance

Steve Plank (@plankytronixx) posted his analysis of Cloud Security Standards in an 11/18/2011 post to the MSDN UK Team blog:

imageThey seem to be random character strings: ISO/IEC 27001:2005, SAS 70 Type II, SSAE-16, PCI DSS, EU DPD 95/46/EC… But there is a value behind each one of them, if you understand the story.

If we discount the smaller niche cloud players who provide very specialised services – let’s concentrate on the high-volume players – there’s a reason you can buy 1 hour of compute for 5 cents: volume. It comes from having absolutely massive data-centres distributed throughout the world – Microsoft’s Chicago data-centre, for example, is more than 700,000 square feet. Let’s just put that in to perspective with a little diagram. Here is the average soccer pitch: 115 yards by 74 yards…

Total area = 76,590 square feet. So there are just under nine and a quarter of these in the Chicago Datacentre. Which is surprising – I thought the only professional team in that city was Chicago Fire…

image

Just look at the Chicago Fire soccer pitch picture above to get an idea of scale – that’s just one pitch, not nine and a quarter. Imagine being at the position of this little group of fans, looking across nine and a quarter pitches:

image

…now imagine that filled with racks of servers, shipping containers full of servers, cooling, specialised power supplies. Operating on this kind of scale, the huge cloud operators even spec their own hardware from the suppliers. These are specialised computers designed to work in cloud datacentres.

This scale brings very cheap prices. Computing at these volumes is a commodity. Imagine one of these operators had a million customers all buying their compute and storage commodity and then imagine they all wanted to perform security audits on all the datacentres they were using. Microsoft has 6 major data centres around the world, even more if you include CDN distribution points. The other large cloud operators are in vaguely similar positions. That would probably equate to multiples of millions of datacentre audit requests. Hopefully you can see that the large cloud operators would be spending all their time, money and resources with millions of audits and as a result, doing a bad job of providing compute and storage as a commodity (at commodity prices). The whole commercial model of cloud computing as a commodity would simply evaporate and there’d be no such thing as cloud computing.

So – to try and get round this essentially commercial problem, cloud operators have embraced a number of security standards. They’re saying “no, we don’t operate the kind of business where you can come in and audit our datacentres. If you need that there are some niche players, hosters and outsourcers out there who can probably help (and $$$ charge) you. However you can take advantage of audits that have been done according to a set of information security principles and criteria”. The details of these are embedded in to some of those awful numbers I quoted in the first sentence of this blog post.

So that’s why large cloud operators have embraced this group of standards, certifications, accreditations etc. I often quietly chuckle when I hear security consultants advise “negotiate the terms of the contract”, “insist on auditing the security of the cloud operator. If they say ‘no’, walk away” because it shows no appreciation of the real world. If you are about to buy a car, a house, a large piece of expensive furniture – it’s natural to negotiate. If you walk in to a convenience store and pick up a can of coke then try to negotiate the price you’ll get some very strange looks. You’ll walk out of the store either cokeless or having paid the stated price. That’s what it’s like with commodities. If you go to the supplier they’ll have a volume discount price-sheet. One coke costs one price, 1000 is a bit cheaper per can, one million gives a pretty good discount. Just like buying compute resources from large cloud operators. If you say you want to audit the production processes of the factory where the drinks are manufactured they say “no – it’s a canning factory, we don’t have an operation that is geared up for individual audits. We adhere to x and y food/drink standards and we are audited on that. You can see our certificate. If that’s not good enough, sorry, but you’ll have to buy your drinks somewhere else”.

So the standards give you something you can look at and determine if that fits your own ideas of what you’d audit, whether they are reasonable and applicable, whether they fit with compliance and legislation standards you have to fit yourself. And if they don’t? It reminds me of a joke by the late-great Tommy Cooper. “Doctor, it hurts when I do this” holding his arm out. “Well, don’t do it then” the doctor replies. If a cloud operator doesn’t meet certain security criteria you require, don’t use them for that data/application. You might decide to host it in your own datacentre or go to a niche outsourcer who will likely be more expensive but meet your stated criteria.

Let’s start with ISO/IEC 27001:2005. Anybody can look at Microsoft’s certificate on the BSI website.

image

Click the above image to see the BSI certifications.

…but what does it mean? The International Standards Organisation have created a set of control frameworks that deal with information security. The idea is that an organisation that enjoys the certification can show that the controls they have in place for information security are adequate. As long as the consumer (the customer) of the certified service agrees that each control framework provides adequate safety over their own information, it means they don’t have to perform their own audit against that organisation. You can probably start to see why this is important if you are about to use a large cloud provider’s data centre to host some of your data because you can’t audit the datacentre directly yourself.

Some of the control frameworks aren’t applicable to some organisations. The company creates a statement of applicability in which it says which frameworks apply in order to prove adequate security of information – so, for example, huge companies like Microsoft have their HR procedures audited under a whole collection of separate standards. Therefore HR processes might not appear in the statement of applicability. Of course if the cloud provider you are going to is a 10-person company, it’s fairly unlikely they’ll have internationally recognised HR procedure certifications, so for your own peace of mind, you might want to see a statement of applicability which for them includes HR processes such as the vetting of staff.

The company then engages an auditor – Microsoft uses the British Standards Institute (BSI), a widely recognised auditor of International Standards – to audit the processes against the control frameworks defined in the statement of applicability. What the auditor essentially says is “we agree there are policies and procedures in place which, when adhered to, mean these control frameworks apply”. The key point is that you can’t simply compare one cloud operator’s security certifications to another’s unless you understand their individual statements of applicability – although I suspect it’s unlikely the BSI (or indeed any other ISO auditor) would grant a certificate if the statement of applicability said no control frameworks applied! The statement of applicability really means neither the service provider nor their customer has to boil the ocean over aspects that just aren’t applicable.

One thing that seems to always be associated with ISO27001 is SAS70 Type II. This asks the question “are there controls in place and are they being operated”. So although the ISO certification determines which procedures are actually applicable, the SAS70 Type II accreditation determines that the procedures are adequate and that they are actually being performed. Hopefully you can see why both would normally appear together.

Over time we will see a gradual move away from SAS70 Type II to SSAE-16 for an attestation that procedures are adequate and are actually being performed. SSAE-16 defines two report types of interest to cloud operators: SOC1 and SOC2. SOC1 is where the cloud operator says “these are my control frameworks, policies, procedures – these are what you should audit me against”. SOC2 essentially means the auditor says “These are the things we think are important for information security and that’s what we are going to audit you against”.

Let’s move on to other standards like PCI DSS, the Payment Card Industry Data Security Standard. It’d be easy, on seeing that Microsoft’s Online services are PCI DSS compliant, to think the Windows Azure Platform itself lets you build your own payments platform and that it would therefore inherit PCI DSS benefits from the platform. Be careful here – that’s not the case. PCI DSS, when applied to Microsoft’s online services, means the payment platform that Microsoft uses for taking payments against, say, your Office 365 or Windows Azure subscription meets the standards of security and privacy they require. I’d humbly suggest that most organisations would be better off using an existing payment provider who already carries PCI DSS blessings in their service rather than trying to build one that conforms.

There is something that often gets trotted out in security conversations around the cloud – EU DPD 95/46/EC, known as the EU Data Privacy Directive. This essentially says “when I take your information, how am I going to protect you and put the necessary protections around it. Moving that data inside the EU is not a problem because in Europe we’re all part of the same club and we all agree on those standards”. Where it gets tricky is when data is taken out of Europe, because, for example, the US is not in the EU. The directive allows data to be sent wherever you want, as long as there are adequate protections around it. One of the key agreements here is “Safe Harbour”. So for example any US organisation that enjoys Safe Harbour certification can receive EU data as if it still resided in the EU, because the EU is satisfied that adequate controls are in place to protect that data. But there aren’t adequate protections for EU data in every country in the world. The trouble is that as the worldwide distribution of cloud services grows, it’s possible that data could be held in (or could transit through – such is the nature of data transfer on the Internet) such countries. EU model clauses can be applied in these cases. Obviously they aren’t necessary for organisations from, say, the US or Canada, which can enjoy Safe Harbour, but a provider in a country which doesn’t enjoy Safe Harbour can use an EU Model Clause which again says “there are adequate controls in place to protect this data”.

I hope this post has helped to explain why it is that major cloud operators can’t really have millions of data-centre audits going on and how it is that security accreditations and certifications can help you in lieu of conducting your own audits on a data centre. As we go forward it’s clear that inexpensive cloud services are going to rely more and more on security audits performed by trusted industry and government bodies. It also means cloud-providers’ customers are going to demand more and more detail in these standards.

This article is cross-posted to the Plankytronixx Blog

Unlike Office 365 and Amazon Web Services, Microsoft’s cloud security, audits and certifications apply to the data centers in which Windows Azure and SQL Azure run, not to the service platforms. This is the problem with lack of security information about Windows Azure that I complained about in my Where is Windows Azure’s Equivalent of Office 365’s “Security, Audits, and Certifications” Page? post of 11/17/2011.


Tim Greene asserted “Cloud Security Alliance creates a resource for customers of cloud products and services” in a deck for his Google, Microsoft, Intel, Verizon among new cloud-security registry members article for Network World of 11/17/2011:

imageGoogle, Verizon, Intel, McAfee, Microsoft and Savvis are joining a voluntary program set up by the Cloud Security Alliance that provides public information about whether contributors comply with CSA-recommended cloud-security practices.

imageBy reading reports submitted to CSA's Security Trust and Assurance Registry (STAR), potential customers of participating providers can more readily assess whether products and services meet their security needs.

imageTo encourage other participants, CSA is urging businesses to require that any cloud vendors they deal with submit reports to CSA STAR. …

For example, eBay is requiring the submissions from all cloud vendors it works with, says the company's CISO Dave Cullinane. He says the information will help eBay security and its customers' privacy. Similarly, Sallie Mae will look for cloud vendors to demonstrate their security via CSA STAR filings.

CSA STAR lets participants file self-assessment reports about whether they comply with CSA best practices. The registry will also list vendors whose governance, risk management and compliance (GRC) wares take the CSA STAR reports into account when determining compliance. The idea is that customers will be able to extend GRC monitoring and assessment to their cloud providers, the CSA says.

Google, Microsoft, Savvis and Verizon will submit information about their services and Intel and McAfee will file reports about security products.

CSA announced the keystone participants in its STAR program at CSA Congress 2011 in Orlando, Fla., this week.

CSA also announced it is extending its scrutiny to cloud-based security service providers -- businesses that offer security services from cloud platforms.

Customer concerns with security as a service include:

  • Systems might not be locked down properly.
  • Personnel might not be vetted thoroughly.
  • Data might leak among virtual machines within multi-tenant environments.
  • Cloud-based security services might not meet compliance standards.

"When deploying Security as a Service in a highly regulated industry or environment," says the CSA's latest Guidance for Critical Areas of Focus in Cloud Computing, "agreement on the metrics defining the service level required to achieve regulatory objectives should be negotiated in parallel with the SLA documents defining service."

These cloud-based security services are wide-ranging and include identity and access management, data loss protection, Web and email security, encryption and intrusion prevention, CSA says.

Reports to the CSA might partially overcome the issues with lack of security information about Windows Azure that I complained about in my Where is Windows Azure’s Equivalent of Office 365’s “Security, Audits, and Certifications” Page? post of 11/17/2011. Hopefully Microsoft, CSA or both will announce when and where we’ll be able to read the first reports.


Lori MacVittie (@lmacvittie) asked Who is most responsible for determining the adequacy of security in the cloud in your organization? in an introduction to her The Scariest Cloud Security Statistic You’ll See This Year post of 11/14/2011 to F5’s DevCentral blog:

imageDome9, which you may recall is a security management-as-a-service solution that aims to take the complexity out of managing administrative access to cloud-deployed servers, recently commissioned research on the subject of cloud computing and security from the Ponemon Institute and came up with some interesting results that indicate cloud chaos isn’t confined to just its definition.

The research, conducted this fall and focusing on the perceptions and practices of IT security practitioners, indicated that 54% of respondents felt IT operations and infrastructure personnel were not aware of the risks of open ports in cloud computing environments.

I found that hard to swallow. After all, we’re talking about IT practitioners. Surely these folks recognize the dangers associated with open ports on servers in general. But other data in the survey makes this a reasonable assumption, as 51% of respondents said leaving administrative server ports open in cloud computing environments was very likely or likely to expose the company to increased attacks and risks, with 19% indicating such events had already happened.

Yet 30% of those surveyed claimed it was not likely or simply would not happen. At all. I wish Ponemon had asked the same questions of the same respondents about similar scenarios in their own data center, as I’m confident the results would be very heavily weighted toward “likely or very likely to happen.” It may be time for a reminder of Hoff’s law: “If your security practices suck in the physical realm, you’ll be delighted by the surprising lack of change when you move to Cloud.”

However, digging down into the data one begins to find the real answer to this very troubling statistic in the assignment of responsibility for security of cloud-deployed servers. It is, without a doubt, the scariest statistic with respect to cloud security I’ve seen all year, and it seems to say that for some organizations, at least, the cloud of Damocles is swinging mightily.

scarycloudstat

If it doesn’t scare you that business functions are most often cited as being ultimately responsible for determining the adequacy of security controls in the cloud, it might frighten you to know that 54% of respondents indicated that IT operations and infrastructure personnel were not very knowledgeable, or were completely unknowledgeable, about the dangers inherent in open ports on servers in cloud computing environments – and that 35% of those organizations rely on IT operations to determine the adequacy of security in cloud deployments.

While IT security is certainly involved in these decisions (at least one hopes that is the case), the most responsibility is assigned to those least knowledgeable about the risks.

That 19% of respondents indicated already experiencing an attack because of open ports on cloud-deployed servers is no longer such a surprising result of the study.

CALL to ACTION

The Ponemon study is very interesting in its results, and indicates that we’ve got a ways to go when it comes to cloud and security and increasing our comfort level combining the two. Cloud is a transformational and highly disruptive technology, and at times it may be transforming organizations in ways that are perhaps troubling – such as handing responsibility of security to business or non-security practitioners. Or perhaps it’s simply exposing weaknesses in current processes that should force change. Or it may be something we have to live with.

It behooves IT security, then, to ensure it is finding ways to address the threats they know exist in the cloud through education of those responsible for ensuring it.

It means finding tools like Dome9 to assist the less-security savvy in the organization with ensuring that security policies are consistently applied in cloud environments as well as in the data center. It may require new technology and solutions that are designed with the capability to easily replicate policies across multiple environments, to ensure that a standard level of security is maintained regardless of where applications are deployed.

As cloud becomes normalized as part of the organization’s deployment options, the ability to effectively manage security, availability, and performance across all application deployments becomes critical. The interconnects (integration) between applications in an organization means that the operational risk of one is necessarily shared by others. Consistent enforcement of all delivery-related policies – security, performance, and availability – is paramount to ensuring the successful integration of cloud-based resources, systems, and applications into the IT organization’s operational processes.

You can register to read the full report on Dome9’s web site.


<Return to section navigation list>

Cloud Computing Events

Michele Leroux Bustamante posted Windows Azure Platform for Developers: A Complete Look with a Practical Spin on 11/9/2011 (missed when posted):

imageI did a tutorial on Windows Azure Platform [at CloudConnection] last week. Here is the initial post with my slides. Will post all the code shortly.

VPR03: Windows Azure Platform for Developers: A Complete Look with a Practical Spin

imageToday, every developer must know how to develop for the Windows Azure Platform. Every company is in the hosting business today, so they must either provide the infrastructure for global reach or rent it – reducing up-front capital costs and IT management overhead while gaining the ability to scale on demand. The Windows Azure Platform is Microsoft’s cloud computing initiative supplying an operating system in the cloud—hosted in Microsoft data centers—in addition to data storage and other infrastructure and application services. It provides businesses with on-demand hosting, storage and management features in the fashion of utility computing. In this workshop, developers will get a top to bottom view of the platform with a practical look at each of the platform features.

We’ll explore Windows Azure, Windows Azure Storage (tables, blobs, queues), look at Windows Azure AppFabric features and also SQL Azure. You’ll learn how to build and deploy applications and services to the cloud with familiar development tools; you’ll learn about storage options offered by Windows Azure Storage and how that compares to SQL Azure; and you’ll learn how to employ features of AppFabric including the Service Bus, Caching and Access Control. This workshop is intended to give developers a jump on Windows Azure with practical guidance and tips for each feature. It will get you up to speed with the platform and then some, while also preparing you for sessions at the conference that dive a little deeper into equally important aspects of the platform with a continued focus on practical guidance.

Slides to this session: Bustamante_CloudConnections_WindowsAzurePlatformForDevelopers_Tutorial


Ali Parker of Microsoft’s Server and Cloud Platform Team recommended that you Don’t Miss The Microsoft Management Summit 2012 – April 16-20, [2012] in Las Vegas in an 11/17/2011 post:

imageThe Microsoft Management Summit 2012 is packed with engaging ways to learn about how Microsoft management technologies enable a new breed of datacenter, private cloud and public cloud solutions, and deliver a more flexible and productive desktop infrastructure. You’ll also gain free access to valuable trial software and be able to visit with over 50 technology companies that will be exhibiting their latest innovations. [Emphasis added.]

MMS offers hundreds of hands-on learning opportunities that will help you accelerate your career and solve today’s most challenging technical problems.

We hope you will join us for this popular event designed to stimulate new thinking and forge lasting relationships among a remarkable group of IT professionals and industry leaders.

Don’t delay your registration. MMS typically sells out. For a very limited time, you can save $275 off the registration fee—just register now, and have a great time in Las Vegas!

Ali Parker
Senior Product Marketing Manager


<Return to section navigation list>

Other Cloud Computing Platforms and Services

Lydia Leong (@cloudpundit) posted Amazon and the power of default choices on 11/17/2011:

imageEstimates of Amazon’s revenues in the cloud IaaS market vary, but you could put it upwards of $1 billion in 2011 and not cause too much controversy. That’s a dominant market share, comprised heavily of early adopters but at this point, also drawing in the mainstream business — particularly the enterprise, which has become increasingly comfortable adopting Amazon services in a tactical manner. (Today, Amazon’s weakness is the mid-market — and it’s clear from the revenue patterns, too, that Amazon’s competitors are mostly winning in the mid-market. The enterprise is highly likely to go with Amazon, although it may also have an alternative provider such as Terremark for use cases not well-suited to Amazon.)

imageThere are many, many other providers out there who are offering cloud IaaS, but Amazon is the brand that people know. They created this market; they have remained synonymous with it.

That means that for many organizations that are only now beginning to adopt cloud IaaS (i.e., traditional businesses that already run their own data centers), Amazon is the default choice. It’s the provider that everyone looks at because they’re big — and because they’re #1, they’re increasingly perceived as a safe choice. And because Amazon makes it superbly easy to sign up and get started (and get started for free, if you’re just monkeying around), there’s no reason not to give them a whirl.

Default choices are phenomenally powerful. (You can read any number of scientific papers and books about this.) Many businesses believe that they’ve got a low-risk project that they can toss on cloud IaaS and see what happens next. Or they’ve got an instant need and no time to review all the options, so they simply do something, because it’s better than not doing something (assuming that the organization is one in which people who get things done are not punished for not having filled out a form in triplicate first).

Default choices are often followed by inertia. Yeah, the company put a project on Amazon. It’s running fine, so people figure, why mess with it? They’ve got this larger internal private cloud story they’re working on, or this other larger cloud IaaS deal they’re working on, but… well, they figure, they can migrate stuff later. And it’s absolutely true that people can and do migrate, or in many cases, build a private cloud or add another cloud IaaS provider, but a high enough percentage of the time, whatever they stuck out there remains at Amazon, and possibly begins to accrete other stuff.

This is increasingly leaving the rest of the market trying to pry customers away from a provider they’re already using. It’s absolutely true that Amazon is not the ideal provider for all use cases. It’s absolutely true that any number of service providers can tell me endless stories of customers who have left Amazon for them. It’s probably true, as many service providers claim, that customers who are experienced with Amazon are better educated about the cloud and their needs, and therefore become better consumers of their next cloud provider.

But it does not change the fact that Amazon has been working on conquering the market one developer at a time, and that in turn has become the bean-counters in business saying, hey, shouldn’t we be using these Amazon guys?

This is what every vendor wants: For the dude at the customer to be trying to explain to his boss why he’s not using them.

This is increasingly my client inquiry pattern: Client has decided they are definitively not using Amazon (for various reasons, sometimes emotional and sometimes well thought out) and are looking at other options, or they are looking at cloud IaaS and are figuring that they’ll probably use Amazon or have even actually deployed stuff on Amazon (even if they have done zero reading or evaluation). Two extremes.


David Linthicum (@DavidLinthicum) asserted “Seen as the cloud's Linux, the open source cloud infrastructure effort has picked up support from a lot of big companies” in a deck for his OpenStack a year later: Popular but still bare bones post of 11/17/2011 to InfoWorld’s Cloud Computing blog:

OpenStack, simply put, is a series of interrelated projects delivering components for a cloud infrastructure solution. The projects are worked on by a global collaboration of developers and cloud computing providers supporting an open source cloud computing platform for public and private clouds, and the objectives are cookie-cutter cloud computing, including massive scalability and ease of use.

The OpenStack effort was launched by Rackspace and NASA a year ago. Researchers at the NASA Ames Research Center first developed the base components of OpenStack, called Nova, to provide the U.S. space and aeronautical agency with a highly scalable private cloud. Rackspace then got involved to promote the technology commercially and later spun it out into an independent foundation.

What's the deal with OpenStack after a year? The market considers many of the existing leading public cloud computing solutions as closed and proprietary. Thus, there's a pent-up demand for an open source product, even though a few already existed -- the open source version of Eucalyptus, for example.

As such, OpenStack has momentum on its side. Since the launch, more than 100 organizations have contributed to the code base or participated in the project in some way. This includes Dell and Hewlett-Packard, who are building commercial cloud products using OpenStack as a base. Moreover, several startups, including Internap, Nebula, and Piston Cloud Computing, are basically providing OpenStack vending operations, much like Red Hat vends Linux. More recently, Rackspace has released a private cloud product based on OpenStack, as well as an architectural framework. …

Read more: 2, next page ›


Scott M. Fulton, III (@SMFulton3) described A "Carrier Cloud" - Alcatel-Lucent's Bid to Compete with Amazon in an 11/17/2011 post to the ReadWriteCloud blog:

imageWhile cloud architectures do have significant advantages in improving resource utilization and reducing costs, historically they've had two substantive drawbacks: First, they tend to reduce administrators' visibility. Second, the reliability of service cannot easily be guaranteed, especially for customers whose operations demand near-perpetual uptime.

imageThis morning, telecommunications firm Alcatel-Lucent - the inheritor of Bell Labs' intellectual property - presented a complete value proposition for a product that is still in development. Called CloudBand, it proposes a cloud provision model for carrier-grade services. Think of a cloud that provides live video for thousands of customers simultaneously, and you get the idea.

Rather than concentrating on delivering functionality, such as applications (SaaS) or computing infrastructure (IaaS), CloudBand would focus on networking. It might not be a cloud the way we currently understand it, where processing power and storage capacity are leased. But there are three components to the modern data network, not two, and the third is the interconnection layer, or what the OSI model would call "Level 3."

"The carrier cloud is based on the service provider's greatest asset: the network and its carrier-grade attributes," reads an Alcatel-Lucent white paper published today (PDF available here). "Carrier-grade networks are known for their high reliability and availability and their fast fault recovery."

The paper cites the fact that typical cloud providers cannot provide very high service-level guarantees, by virtue of the fact that cloud architectures often rely upon implementation of commodity-grade parts that are prone to failure. In one sense, the whole point of cloud architectures was to enable businesses to use the components they had, including the cheap ones, in a collective pool that minimizes the impact of low performance and faults. But when a business leases cloud services in order to provide cloud services to its own customers, it passes on those lower service level guarantees to those customers.

Citing surveys that it commissioned, A-L says that among some 3,886 influential IT employees, including executives, on six continents (Antarctica was not available), 46% of respondents report being at least annoyed by the latencies they experience from their cloud services, while 36% report that their service-level agreements (SLAs) with their cloud providers are essentially meaningless.

This is where Alcatel-Lucent is making its distinction: It seeks to build a cloud network capable of offering carrier-grade service guarantees. The company suggests that cloud service resellers can recoup the premiums they'll likely pay for this service, by offering premium-tier service with higher SLA levels to their own customers in turn.

111117 Alcatel-Lucent carrier cloud diagram.jpg

"Moving to the carrier cloud allows service providers to virtualize and transform their operations to become more efficient and agile while reducing costs," the company contends. "It also puts them in a better position to increase revenue by meeting growing enterprise demand for highly available cloud services with end-to-end performance guaranteed."


Jason Meyers reported Apprenda Platform Addresses Plug-and-Play PaaS, describing a private .NET cloud platform, in an 11/15/2011 post to the Cloud IT Pro blog:

Platform-as-a-service developer Apprenda has released Apprenda 3.0, which the company says lets enterprises take any Windows infrastructure and offer a self-service .NET cloud platform in a plug-and-play approach.

imageApprenda is touting the release as “the world’s first fully productized PaaS software layer."

“A developer is spending way too much time creating apps and then building the complex infrastructure to support them,” said Sinclair Schuller, CEO of Apprenda. “We built a data center operating system as platform-as-a-service that you can take anywhere.”

The Apprenda platform behaves like an application server, he said, running underneath the application and injecting new capabilities into apps that were not there before. The platform also offers improved manageability, more support of .NET or SOA apps and can help boost developer productivity because of features like distributed caching, publish/subscribe systems, message brokering and application metering that allow developers to concentrate primarily on code development.

“The level of productization is beyond anything you’ve seen out there,” Schuller said. “We purposely built it to be plug-and-play on top of existing Windows infrastructures.”

The result, Apprenda believes, is a quicker migration to the cloud.

“People don’t have to waste a lot of time trying to cloudify their apps,” Schuller said. “They can take all the existing apps and move them into a cloud-based model very quickly.”


<Return to section navigation list>
