OakLeaf Systems: Windows Azure and Cloud Computing Posts for 11/3/2010+

A compendium of Windows Azure, Windows Azure Platform Appliance, SQL Azure Database, AppFabric and other cloud-computing articles.

• Updated 11/4/2010 with missing OData article marked •.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:

Azure Blob, Drive, Table and Queue Services
SQL Azure Database, Azure DataMarket and OData
AppFabric: Access Control and Service Bus
Windows Azure Virtual Network, Connect, and CDN
Live Windows Azure Apps, APIs, Tools and Test Harnesses
Visual Studio LightSwitch
Windows Azure Infrastructure
Windows Azure Platform Appliance (WAPA)
Cloud Security and Governance
Cloud Computing Events
Other Cloud Computing Platforms and Services

To use the above links, first click the post’s title to display the single article you want to navigate.

Cloud Computing with the Windows Azure Platform published 9/21/2009. Order today from Amazon or Barnes & Noble (in stock.)

Read the detailed TOC here (PDF) and download the sample code here.

Discuss the book on its WROX P2P Forum.

See a short-form TOC, get links to live Azure sample projects, and read a detailed TOC of electronic-only chapters 12 and 13 here.

Wrox’s Web site manager posted on 9/29/2009 a lengthy excerpt from Chapter 4, “Scaling Azure Table and Blob Storage” here.

You can now freely download by FTP and save the following two online-only PDF chapters of Cloud Computing with the Windows Azure Platform, which have been updated for SQL Azure’s January 4, 2010 commercial release:

Chapter 12: “Managing SQL Azure Accounts and Databases”
Chapter 13: “Exploiting SQL Azure Database's Relational Features”

HTTP downloads of the two chapters are available for download at no charge from the book's Code Download page.

Tip: If you encounter articles from MSDN or TechNet blogs that are missing screen shots or other images, click the empty frame to generate an HTTP 404 (Not Found) error, and then click the back button to load the image.

Azure Blob, Drive, Table and Queue Services

No significant articles today.

SQL Azure Database, Azure DataMarket and OData

• Scott Banachowski explained how to Take Control of [Streaming] ADO.NET Serialization in a detailed 11/3/2010 tutorial:

ADO.NET data services provide a rich way to access data over the network using REST APIs. It’s easy to use because the .NET libraries allow you to hook up to many different kind of data sources, connect over networks, and use high-level query syntax such as LINQ. In this article, I describe how to take control over the serialization on the server side of an ADO.NET data service.

Background

Typically, ADO.NET is used to access data stored in relational databases or key/value stores, but it doesn’t (natively) support streaming of continuous entities. Nevertheless, with a bit of work, you can take over the connection and start streaming to clients yourself. Tom Laird-McConnell wrote an excellent article that describes this technique. I’m using the software that Tom created to stream data to hundreds of clients.

I won’t repeat why streaming from ADO.NET is an intriguing option, because the reasons are bulleted at the start of Tom’s article; the rest of my post assumes you’ve read or at least skimmed his main points. The gist is to create a streaming data service class that overrides DataService<> and implements an IHttpHandler interface. The latter allows control over the connection. You also must create a class implementing IDataServiceHost, which attaches to each DataService connection–this gives control over the connection state. And you need to implement an IQueryable<> wrapper that prevents clients from requesting actions that don’t make sense on endless streams, such as sorting. Finally, your data source will return an enumeration over the real-time stream of data by wrapping it in this IQueryable<>. An implementation of each of these classes is presented in Tom’s post.

The Problem

I have a system that takes incoming data from social networks, processes the data, and then stream the results to many different consumers (including the Bing search index, trend detectors, and maps). As soon as data is processed, it is served to clients as an endless stream of data entities. This is a classic consumer-producer system, with a single producer and many consumers.

As more and more clients connect, the CPU begins to saturate. As it turns out, the DataService serializes each entity to either XML Atoms or JSON. In our case, each client is pulling the same data from the real-time stream, so if 100 clients receive the same entity, it will be serialized 100 times, and this dominates the overall processing. As you can imagine, this is not an effective use of CPU. To remedy this, we need to serialize each entity to a string once, and reuse the string for each client. Unfortunately, as customizable as the DataService is, there doesn’t seem to be a hook to control the serialization on the server side.

From what I found by examining the DataService code in .NET Reflector, all the serialization code was private and not overridable. I did find a promising interface that allows control over the way data is serialized for WCF services, and I still haven’t really determined if (and how) this could be leveraged in my case. Perhaps it’s possible, but I already forged ahead with a different technique before discovering this.

The Approach

Below I include many code snippets that use the types introduced in Tom’s article to demonstrate how to manually controlled serialization. The key is to manually take over the processing of the request:
public class StreamingDataService : DataService, IHttpHandler where DataModelT : class
{
    public void ProcessRequest(HttpContext)
    {
        ...
        // This will send results until the client connection is dropped
        ProcessRequest();
        ...
    }
}
Becomes:
public class StreamingDataService : DataService, IHttpHandler where DataModelT : class
{
    public void ProcessRequest(HttpContext)
    {
        ...
        bool processed = false;
        if (_manualQueryProcessingEnabled)  processed = ProcessRequestManually(context, httpHost.ResponseStream);
        if (!processed)
            ProcessRequest();  // default ADO.NET handler
        ...
    }   abstract bool ProcessRequestManually(HttpContext context, Stream responseStream);
}
Notice that is possible for ProcessRequestManually to decide it can’t handle the request by returning false. In that case it falls back on the default processing instead. This is because I implemented a simple (dumb) parser that understands a subset of possible queries; it handles all the queries the clients currently use, but I didn’t want it to drop new queries it can’t yet parse, so it lets the ADO.NET take over (bypassing the serialization cache). These unhandled queries are logged; if they become frequent, the parser is extended to handle them. The _manualQueryProcessingEnabled is a runtime configuration option to easily disable the feature when I need to.

Here’s ProcessRequestManually, which is added to the QuoteStreamService class.
override bool ProcessRequestManually(HttpContext context, Stream responseStream)
{
    bool processed = false;
    if (context.Request.PathInfo.StartsWith("/Quotes"))
    {
        StockQuoteStream service = this.CreateDataSource() as StockQuoteStream;
        if (service != null)
            processed = ManualQueryExecutor.ProcessRequest(service.Quotes, context, responseStream);
    }
    return processed;
}
ProcessRequestManually creates a data source, which is the StockQuoteStream object. It is this object that the query executes against, so it passes it off to the following ProcessRequest function, that lives in a static class called ManualQueryExecutor:
static byte[] _jsonPrefix = Encoding.UTF8.GetBytes("{\n\"d\": [\n");
static byte[] _jsonPostfix = Encoding.UTF8.GetBytes("]\n}\n");   static bool ProcessRequest(IQueryable query, Stream responseStream, HttpRequest request, HttpResponse response)
{
    query = GetQuery(query, request);
    if (query == null)
        return false;   DateTime lastFlush = DateTime.UtcNow;
    int bytesOut = 0;
    bool started = false;   foreach (var i in query)
    {
        if (!started)
        {   response.ContentType = "application/json";
            response.Cache.SetCacheability(HttpCacheability.NoCache);   responseStream.Write(_jsonPrefix, 0, _jsonPrefix.Length);
             started = true;
        }
        else
        {
             responseStream.WriteByte((byte)',');
        }   bytesOut += i.ToJson(responseStream, expansions);
        DateTime now = DateTime.UtcNow;
        if ((bytesOut > flushWhen) || ((now - lastFlush) > _flushInterval))
        {
             response.Flush();
             lastFlush = now;
             bytesOut = 0;
        }
    }   if (started)
        responseStream.Write(_jsonPostfix, 0, _jsonPostfix.Length);   return true;
}
The ProcessRequest function first calls GetQuery to parse the URL to a query. Then, if the query was created successfully, the foreach loop will execute the query, which results in the production of entities from the real-time stream. At this point, we receive an unserialized entity. I extended the entity’s API to include a function called ToJson(), which returns the entity serialized as JSON in a UTF8 encoded byte array. This is because all clients are expecting JSON. If I were also supporting XML Atoms, I would need to parse the request header to determine which format is requested (by looking at the “Accept” parameter of the HTTP header). In the loop the data is just sent by reading the JSON string to the response stream.

I left out details of ToJson(), but it is a straightforward conversion from the entity to JSON. The only gotcha is that clients expect the OData metadata for each entity, so the serializer must create it (and means you can’t simply just use the DataContractJsonSerializer). This amounts to adding a JSON object as the first nested object in the entity that looks like:
"__metadata": {
    "uri": "http://yourservice/QuoteStream.ashx/Quotes",
    "type": "Quote"
}
Also, _jsonPrefix and _jsonPostfix wrap the entire sequence of entities in a json array named “d”, and the loop adds a comma between each entity as expected for a JSON array, resulting in: { "d": [ entity0, entity1 ] }. Each entity is a JSON object with OData metadata.

Flushing for the output stream is disabled, this way I can manually control when the stream gets flushed, and do so after either a certain number of bytes or elapsed time. This way performance of the stream can be tuned slightly. IIS performs HTTP chunking at every flush point.

So far this hasn’t been too much work, and pretty straightforward; the only tricky part is writing a custom JSON serializer that writes OData format (unfortunately, although .NET has one built into its DataService, it does expose it for public use). But writing JSON is not too difficult. The only detail left undone is to parse the request URL into a queryable object. Again, .NET has code to do this but has deftly kept it private.

The role of GetQuery is to convert the request from the OData URL specification to a LINQ query. Since .NET hides this functionality, you can get there by rewriting the URL into the C# equivalent, then use the Dynamic LINQ library to convert. Dynamic LINQ is a library whose source code is provided by MSDN as part of C# sample pack.

For example, suppose the client side issues the following query:
var results = ( from quote in quoteStream.Quotes.Expand("Company")
    where quote.Symbol == "IBM" && quote.Delta > 0.1
    select quote).Take(5);
The client data service converts this to a URL query:
http://yourservice/QuoteStream.ashx/Quotes()?$filter=(Symbol eq 'IBM') and (Delta gt 0.1)&$top=5&$expand=Company
In order to rebuild the query, the URL filter arguments must be coverted to C# syntax for parsing by Dynamic LINQ. In this example, the query $filter component of the argument is converted as string from "(symbol eq 'en') and (value gt 10)" to "(symbol == "en") && (value > 10)". The Dynamic LINQ library includes an extension function for Where, so that a string argument is interpreted and compiled into the a queryable object. If it fails to parse, it will throw an exception that I catch, and then rely on the default ADO.NET implementation to handle the query. The $take argument is also parsed, and may be straightforwardly applied to the query via its Take function. For now, the $expand argument is just ignored.

GetQuery, the last piece in the puzzle, is the code that handles this conversion by doing a series of string replacements on the $filter argument. The following code snippet all resides inside the ManualQueryExecutor class:
static RegexSearch[] _regexSearches = new RegexSearch[] {
    new RegexSearch { regex= new Regex(@"endswith$(?.*?),'(?.*?)'$", RegexOptions.Compiled), funcToReplace= "EndsWith" },
    new RegexSearch { regex= new Regex(@"substringof$'(?.*?)',(?.*?)$", RegexOptions.Compiled), funcToReplace= "Contains" }
};   static Regex _longIntMatch = new Regex(@"(\d+)L", RegexOptions.Compiled);   class RegexSearch
{
    public Regex regex;
    public string funcToReplace;
}   struct StringMatcher
{
    public int start;
    public int stop;
    public string replace;
}   static private void StringMatch(string str, RegexSearch rgx, List lsm)
{
    Match m = rgx.regex.Match(str);   while (m.Success)
    {
         string replace = String.Format("{0}.{1}(\"{2}\")", m.Groups["obj"], rgx.funcToReplace, m.Groups["lit"]);
         lsm.Add(new StringMatcher { start = m.Index, stop = m.Index + m.Length, replace = replace });
         m = m.NextMatch();
     }
}   static private void StringMatching(StringBuilder sb)
{
    string origStr = sb.ToString();
    List lsm = new List();   foreach (var rgxSearch in _regexSearches)
        StringMatch(origStr, rgxSearch, lsm);   if (lsm.Count > 0)
    {
        // reset the string to rebuild it with the replacements
        sb.Length = 0;
        int loc = 0;
        foreach (var lsmi in lsm)
        {
            sb.Append(origStr.Substring(loc, lsmi.start - loc));
            loc = lsmi.stop;
            sb.Append(lsmi.replace);
        }
        sb.Append(origStr.Substring(loc));
    }
}   static private IQueryable GetQuery(IQueryable query, HttpRequest request)
{
    string filterString = request.QueryString["$filter"];
    if (!String.IsNullOrEmpty(filterString))
    {
        StringBuilder sb = new StringBuilder(filterString);
        StringMatching(sb);
        sb.Replace(" eq ", " == ");
        sb.Replace(" ne ", " != ");
        sb.Replace(" ge ", " >= ");
        sb.Replace(" gt ", " > ");
        sb.Replace(" lt ", " < ");
        sb.Replace(" le ", " <= ");
        sb.Replace(" and ", " && ");
        sb.Replace(" or ", " || ");
        sb.Replace("not ", "!");
        sb.Replace(" mod ", " % ");
        sb.Replace('\'', '"');   // must replace / with . if they are not within quotes.
        if (sb.ToString().Contains('/'))
        {
            bool inQuotes = false;
            List indices = new List();
            for (int i = 0; i < sb.Length; i++)
            {
                if (sb[i] == '"')
                    inQuotes = !inQuotes;
                else if (sb[i] == '/' && !inQuotes)
                    indices.Add(i);
            }   if (indices.Count > 0)
                foreach (int i in indices)
                    sb.Replace('/', '.', i, 1);
        }   string s = _longIntMatch.Replace(sb.ToString(), "$1");   // This is using the DynamicQueryable extension to compile the query
        try
        {
            query = query.Where(s);
        }
        catch (System.Linq.Dynamic.ParseException)
        {
            Trace.TraceWarning("Manual query not handled: {0} -> {1}", request.QueryString, s);
            query = null;
        }
    }   if (query != null)
    {
        string topString = request.QueryString["$top"];
        if (!String.IsNullOrEmpty(topString))
            query = query.Take(int.Parse(topString));
    }   return query;
}
Warning, the above code is ugly (but you probably already noticed) and I’m a bit embarrassed to present it: it is error prone, probably performs poorly, and is inelegant. Nevertheless, I felt I better show it otherwise the guts of this solution would be missing. I don’t really recommend doing it this way. A better way is to write a proper parser or more sophisticated regex. However this gets the job done by brute force replacing the URL syntax until it conforms to C# syntax.

At least part of the potential performance problem is addressed by caching the computation of the query. Since many clients connect using the same queries, keeping a hash table of URLs to the resulting query for those already prevents recomputing many of the queries. The production code uses this hash, but I removed it here to make the code shorter.

The Downside

This approach of controlling the serialization of entities so that the serialized strings may be cached and reused by many clients really helped the situation with servers being overtaxed. When deployed to production I found that CPUs were no longer saturated and were able to handle more clients. However, the code as described above has a couple of limitations: it ignores and essentially disables expansion of entities and projection of properties. I’ll briefly describe these limitations.

Expansion allows you to take the nested components of an entity. In the above example, the query said quoteStream.Quotes.Expand("Company"), meaning that in addition to requesting the Quote object, it would like a Company object that is associated with that quote. The Quote entity has a nested Entity called Company. Without expansion, if you request the quote you will get a null reference for the company information. By explicitly expanding the entity the company data will come along with the quote. In my initial implementation of the JSON parser, I automatically wrote all the data, so that each pre-serialized entity already contained anything that could be expanded. This solved the problem that $expand was being ignored, because you would always get the data. The downside is that clients not needing the data received it anyway, and this increased the amount of bandwidth required for their connections. I was wasting network on data that was being tossed away.

Eventually I added expansion capability back to the system by changing the way the JSON serializer worked. Instead of writing out the whole string, I wrote chunks with breaks where entities could be expanded. In those breaks, I may choose to insert either the expanded entity (also already serialized) or a placeholder (unexpanded entities are replaced with an object to describe them–see OData spec referenced above for more info). This complicated things a bit: I must assemble a chain of buffers for each entity on a per query basis, and deal with arbitrarily deep nesting of entities, but still the bulk of the serialization work is done before the query, so it still comes out a win. This JSON serializer is probably worth a whole blog entry on it’s own if I ever get around to it.

As for projection, it is still disabled. Projection has the nice quality that you can further reduce the bandwidth by selecting only properties you are interested in. For example, the query above could be rewritten with projection so that it only sends the price.
var results = ( from quote in quoteStream.Quotes
    where quote.Symbol == "IBM" && quote.Delta > 0.1
    select new Quote { Price = quote.Price } ).Take(5);
Instead of selecting the entire quotes, it creates new ones with only a Price property. As a result, on the wire only the price data is sent. Although this saves bandwidth on the connection, we found that projection was very expensive on the server side. Clients that were trying to keep up with real-time data streams were falling behind when using projection queries. Therefore, for now I made no effort to support projection and just ignore it, sending the entire entity. This means clients will receive more data than they ask for, but can discard it.

Conclusion

In this blog entry I described a pretty hacky technique for bypassing some of the functionality of ADO.NET in order to control serialization. This allowed me to serialize each entity once, so that its cached result could be reused for multiple clients. It was fun to implement this project and was worthwhile because it improved the performance of my servers. Unfortunately, it duplicates a lot of the functionality that ADO.NET already implements such as URL parsing, and the OData JSON formatting. It would’ve been nice if ADO.NET exposed some of the functionally that did this, or provided simple hooks to do custom serialization (or just did the caching on its own!).

No significant articles today.

AppFabric: Access Control and Service Bus

Clemens Vasters (@clemensv) announced Windows Azure AppFabric October 2010 CTP - Service Bus Samples from my PDC Talk (Part 1) on 11/3/2010:

In my PDC talk I’m showing a few little samples that are not in the SDK for the CTP release. I’ve bundled them up and they’re attached to this post.

The solution contains three samples:

SBMT is the ‘Service Bus Management Tool’, a small command line utility to create/view/delete ConnectionPoints and MessageBuffers using the new management model for Service Bus. I’ll explain the options of the tool below and will drill into the inner workings of the tool and the ideas behind the new management interface in the next post of this series.

EchoService/EchoClient are versions of the meanwhile probably familiar ‘Echo’ sample for Service Bus that illustrates the new load balancing capability with session affinity.

DurableMessageBufferSample is a variation of the SDK sample for the new durable message buffer that sends a sequence of 200 messages and fetches them back.

The SBMT project/tool is the probably most interesting one, because we’re looking to change the way how connection points are managed on Service Bus and that requires a tiny bit of tooling.

In the current version of Service Bus, creating a listener is an implicit operation. The service (listener) picks a name (or address) in the namespace, binds that name to an endpoint, and opens the ServiceHost holding the endpoint. That’s extremely convenient, but what we’ve found is that this kind of implicit management makes it hard for us to create consistent behavior for eventing (NetEventRelayBinding) and for one of our most-requested ‘missing’ features, which is load balancing and/or failover. For eventing, the majority of customers that we’ve heard feedback from is expecting the the fabric will absorb sent events with no error message even if no event subscribers are present. For the new load balancing capability we’d like to appropriately return an error message in the case that no listener is connected, but differentiate that from the case where the connection point isn’t known, at all.

The implicit model causes no connection point to exist when there are no listeners, so we can’t solve either of these puzzles without making a change. The good thing is that we’ve got a model that works quite well and we’re building on that: Our message buffers already need to be created using an explicit AtomPub operation on the namespace, so what we’re doing is to make connection points explicit by requiring that you create them beforehand just as you create message buffers. We believe that’s the right model long-term, especially also because the <MessageBuffer/> and <ConnectionPoint/> descriptions are going to have more siblings over time and we’d like to have a consistent model. If you take a deeper look at the slides for the PDC talk, you can already infer a few of the siblings.

In the talk I also give a few reasons for why we’re splitting off the management and runtime namespaces as we’re doing this. It turns out that our longer term plans for how to enable rich monitoring and control for connection points and message buffers and other messaging artifacts on Service Bus will require quite a bit space in terms of protocol surface and that’s difficult to create in a single namespace. If I have a connection point at sb://clemens.servicebus.appfabriclabs.com/foo, any right-side extensions of that URI belong to the listener. If we were now looking to give you a protocol surface to enumerate the current sessions that are active and suspend/terminate the individually or summarily, for instance, it’s difficult to get that slotted into the same place without resorting to stunts like special escape characters in URIs. Instead, we’d like to provide a straightforward mechanism where you can get at the session connection with https://clemens-mgmt.servicebus.appfabriclabs.com/Resources/ConnectionPoints(foo)/Sessions/.

If that begs the question how those two addresses are related – I explain it in the talk. The resource has a projection into the runtime namespace. Resources are organized by their type (there will be other taxonomy pivots in the future) and you can project them into the runtime address space as you see it fit.

The SBMT tool allows interacting with the management namespace to create connection points and message buffers. The call structure is:

sbmt.exe –n <namespace> –a <account-name>–s <account-secret> <type> <command> <resource-name> <uri> [<args> ..]</

-n <namespace> - the Service Bus namespace created via http://portal.appfabriclabs.com

-a <account-name>– the service account name that’s set up in Access Control (defaults to ‘owner’)

-s <account-secret>– the service account key set up in Access Control (the key you find on the portal)

<type>– ‘cp’ for connection points, ‘mb’ for message buffers

<command>– ‘create’, ‘get’, or ‘delete’ (see below)

<resource-name>– a unique, friendly name for the resource (eg. ‘crm-leads’)

<uri>– the URI projection into the runtime namespace (e.g. ‘sb://<NAMESPACE>.servicebus.appfabriclabs.com/crm/leads/’)

<args>– further arguments (see below)

Examples:

Create a connection point. ‘cp create’ requires a trailing argument after the URI that indicates the number of concurrent listeners (max. 20):
sbmt.exe –n clemens –a owner –s […] cp create crm-leads sb://clemens.servicebus.appfabriclabs.com/crm/leads/ 10

Delete a connection point:
sbmt.exe –n clemens –a owner –s […] cp delete crm-leads

Show a connection point:
sbmt.exe –n clemens –a owner –s […] cp get crm-leads

Enumerate all connection points:
sbmt.exe –n clemens –a owner –s […] cp get

Create a message buffer:
sbmt.exe –n clemens –a owner –s […] mb create jobs https://clemens.servicebus.appfabriclabs.com/jobs/ 10

Delete a message buffer:
sbmt.exe –n clemens –a owner –s […] mb delete jobs

Show a message buffer:
sbmt.exe –n clemens –a owner –s […] mb get jobs

Enumerate all message buffers:
sbmt.exe –n clemens –a owner –s […] mb get

The Echo sample requires that you create a connection point in your namespace like this:

sbmt.exe –n –a owner –s […] cp create cp1 sb://<namespace>.servicebus.appfabriclabs.com/services/echo/ 10

The Message Buffer sample requires this setup, whereby the friendly name is asked for by the sample itself (‘mb1’, so you can choose that freely)

sbmt.exe –n –a owner –s […] mb create mb1 https://<namespace>.servicebus.appfabriclabs.com/mybuf/

The good news here is that you do this exactly once for your namespace. There is no renewal. A connection point that’s created sticks around like an object in the file system.

In the next post I’ll drill into what happens at the protocol level.

Since this is a CTP, we’re keenly interested in your feedback on all the things we’re doing here – making connection points explicit, splitting the namespace, what’s the tooling experience you’d like. This tool here is obviously just something I wrote to get going on demos; you can probably see that having a durable namespace is begging for tooling and it’s one of the reasons we do it.

pdc10-servicebus-oct10-ctp.zip (24.65 KB)

Dave Kearns asserted “Replication and distribution of identity services can take up a huge amount of bandwidth; partitioning breaks up the datastore into manageable parts” in a deck for his Partitioning is important component of identity services post of 11/2/2010 to NetworkWorld’s Security blog:

We've been talking about what an identity service needs in order to be considered both pervasive and ubiquitous. That is, to be "available anywhere and every time we want to use it" as well as “available everywhere and any time we want to use it." Last issue we began to explore three abilities necessary to insure the ubiquity and pervasiveness of the services -- the ability to be distributed, replicated and partitioned.

In the last issue I asserted that the identity data needs to be replicated for fault tolerance and also to move the data through the network so that its close to where it will be used. It should be distributed so that there are multiple read/write replicas (not just read-only or static catalog images) of the data. The third quality is partitioning.

If we're going to maintain multiple replicas of the data, and if we're going to allow changes from multiple replicas, which must then be synchronized to all other replicas, then we're going to create quite a bit of network traffic. Add to that the sheer quantity of data, which today's (but especially tomorrow's) identity-enabled applications and devices will be placing in the directory and other identity services, and you can see that all this replication and distribution could take up a huge amount of bandwidth.

One solution is partitioning -- breaking up the datastore into manageable parts. Then, by a well-designed placement of replicas of the partitions you can insure that data is both physically and logically stored near to the point it will be used while still minimizing the amount of traffic on the network necessary for synchronizing the data.

Also, because the datastore is distributed as well as partitioned, you can view the entire tree as if it were stored in one place -- even if there is no physical copy of the entire tree. And, of course, you can see this (and so can the identity-enabled apps and devices) from anywhere in the network because now your identity service is pervasive and ubiquitous.

Which leads us back to the original point -- application designers, programmers and coders will only use your identity service if it is both pervasive and ubiquitous.

Next week we'll take a look at some possible current solutions to this need.

Windows Azure Virtual Network, Connect, and CDN

No significant articles today.

Live Windows Azure Apps, APIs, Tools and Test Harnesses

Steve Marx (@smarx) offered Source From My PDC10 Session: Building Windows Phone 7 Applications with the Windows Azure Platform on 11/3/2010:

Last week at PDC 2010, I presented a session called “Building Windows Phone 7 Applications with the Windows Azure Platform.” Thanks to all who attended in person and virtually! As promised, I’m going to share all the source code from my session here on my blog.

To start with, I’ve just published the source code for my “DoodleJoy” application here: http://cdn.blog.smarx.com/files/DoodleJoy_source.zip. The source includes both a Windows Azure application that hosts an OData service and a Windows Phone 7 application that consumes it.

To get the application running, you’ll need the just-released OData client for Windows Phone 7, which you can find at http://odata.codeplex.com. (Click “View all downloads” to find the binaries and code gen tool.) Then open the two projects (phone and cloud) in separate Visual Studio instances. Run the cloud project first, and note the port. If it’s running at anything other than port 81, you’ll need to edit the URL in MainViewModel.cs in the phone project to point to the right location. Then launch the phone project, and things should just work.

Mary Jo Foley (@maryjofoley) posted Microsoft outlines plans for [BizTalk] 'integration as a service' on Azure to ZDNet’s All About Microsoft blog on 11/3/2010:

Microsoft officials have been talking up plans for the successor to the company’s BizTalk Server 2010 product for a while now. It wasn’t until last week’s Professional Developers Conference (PDC), however, that the cloud angle of Microsoft’s plans for its BizTalk integration server became clearer.

Based on an October 28 post to the BizTalk Server Team blog, it sounds as though Microsoft is thinking about BizTalk vNext evolving into something like a “BizTalk Azure” offering (akin, at least in concept, to Windows Azure and SQL Azure).

From the blog post:

“Our plans to deliver a true Integration service – a multi-tenant, highly scalable cloud service built on AppFabric and running on Windows Azure – will be an important and game changing step for BizTalk Server, giving customers a way to consume integration easily without having to deploy extensive infrastructure and systems integration.”

Microsoft’s goal is to deliver a preview release of (what I’m going to be calling from now on out) BizTalk Azure in calendar year 2011 and deliver updates every six months. (There’s no word in the post as to the final release target.) There’s also no word as to whether there will be both a server and a service version of BizTalk vNext — which seemed to be the plan in 2009 when Microsoft shared some very early roadmap information about its BizTalk plans. I’ve asked Microsoft for clarification on this point.

Update: There is, indeed, going to be an on-prem version of BizTalk vNext, too. And the final version is due out in 2012. From a Microsoft spokesperson:

“We will deliver new cloud-based integration capabilities both on Windows Azure (as outlined in the blog) as well as continuing to deliver the same capability on-premises. This leverages our AppFabric strategy of providing a consistent underlying architecture foundation across both services and server. This will be available to customers in the 2 year cadence that is consistent with previous major releases of BizTalk Server and other Microsoft enterprise server products.”

Microsoft released to manufacturing the latest on-premises-software version of BizTalk (BizTalk Server 2010) in September 2010. BizTalk Server 2010 is a minor release of Microsoft’s integration server that supports Visual Studio 2010, SQL Server 2008 R2, Windows Server AppFabric and Windows Server 2008 R2.

Microsoft officials are being careful in how they are positioning BizTalk Azure, as there are currently more than 10,000 BizTalk Server customers out there (who pay a pretty penny for the product). The Softies were careful to note that Microsoft will insure that existing customers can move to the Azure version “only at their own pace and on their own terms.” To make sure apps don’t break, Microsoft plans to provide side-by-side support for BizTalk Server 2010 and BizTalk Azure, officials said, as well as to provide “enhanced integration” between BizTalk and AppFabric (both Windows Server AppFabric and Windows Azure AppFabric).

Microsoft rolled out last week the first CTP (Community Technology Preview) of the Patterns and Practices Composite Application Guidance for using BizTalk Server 2010, Windows Server AppFabric and Windows Azure AppFabric together as part of an overall composite application solution. Microsoft also previewed a number of coming enhancements to Windows Azure AppFabric at last week’s confab.

It will be interesting to see how Microsoft handles the positioning of BizTalk vs. AppFabric, going forward, as well. As Daniel Probert, a consultant and blogger, noted in a recent post, while AppFabric won’t be replacing BizTalk, BizTalk will become part of AppFabric. Probert blogged, “A reasonable question to ask would be: why can’t I write an AppFabric application today which effectively replaces what BizTalk does? The answer is: you can. Sort of.”

Any BizTalk customers out there who have thoughts to share on the viability of BizTalk in the cloud?

Andy Grammuto posted Microsoft Platform Ready announces free app compat testing tool at PDC 2010 on 10/28/2010 (missed when posted):

Microsoft Platform Ready [MPR] launched in September 2010 and is designed to help software companies of any size develop, test, and market applications. The program brings together a variety of useful assets including training, support, and marketing. Current members of the Microsoft Partner Network, startups participating in Microsoft BizSpark, and established software companies with no prior association to Microsoft are welcome to participate in Microsoft Platform Ready.

Introducing the Microsoft Platform Ready Test Tool

Today, we are announcing the Microsoft Platform Ready Test Tool, delivering a simple, self-service way for software publishers to quickly verify application compatibility, fix errors, and receive benefits for innovating on the latest Microsoft platform technologies.

The tool allows a user to choose from a series of individual tests. It verifies solutions for compatibility and provides rich, actionable reports to help developers accurately identify errors and take corrective action. Successful test results also count towards Microsoft Partner Network program requirements, providing additional benefits. In addition, a completed test automatically earns selected Microsoft logos, including Powered by Windows Azure and Works with Windows Server 2008 R2.

How to Get and Use the Microsoft Platform Ready Test Tool

The Microsoft Platform Ready Test Tool is available through the Microsoft Platform Ready portal. To get it, go to microsoftplatformready.com, and download the test. First time visitors need to register.

On November 1, 2010, the available tests will include Windows Azure, Microsoft SQL Azure, Windows Server 2008 R2, Microsoft SQL Server 2008 R2, Microsoft Exchange Server 2010, Microsoft Dynamics CRM 2011, Microsoft Lync 2010, and Windows 7.

For more information about Microsoft Platform Ready, visit the site or contact mprcore@microsoft.com

Andy joined Microsoft in 1995 and has been with the Developer and Platform Evangelism group since 2003.

I just used the MPR Test Tool to verify compliance of my OakLeaf Systems Azure Table Services Sample Project with MPR’s requirements for Azure Web sites in my (@rogerjenn) Microsoft Announces Cloud Essentials for Partners with Free Windows Azure and Office 365 Benefits post (updated 11/3/2010). My sample project passed with no problem.

Visual Studio LightSwitch

Steve Lange posted Turn It On! My Visual Studio LightSwitch (Beta1) Presentation on 10/26/2010 (missed when posted):

Tonight I presented to the Denver Visual Studio .NET User Group on Visual Studio LightSwitch. Thank you to those who attended!

For those of you who missed it, or just want the content anyway, below are links to my presentation.

You can find it on SkyDrive or on my profile at SlideShare:

Visual Studio LightSwitch (Beta 1) Overview

View more presentations from Steve Lange.

Below are a few links to get you started as well:

LightSwitch Homepage

LightSwitch Developer Center

How Do I? Videos

Tutorials

Training Kit

Forums

Lastly, a few of you asked for the funny looping slide deck I used during the break. That’s on SkyDrive HERE.

Return to section navigation list>

Windows Azure Infrastructure

Chris Czarnecki posted Comparing Cloud Computing with Grid Computing to the Learning Tree blog on 11/3/2010:

On a recent teach of the Learning Tree Cloud Computing course, I was asked about the difference between Grid Computing and Cloud Computing. An excellent question, because there are similarities between the two but also differences. Lets consider these individually.

Starting with Grid computing. A grid comprises of a set of loosely coupled computers that are networked together and form what appears a single coherent whole. The magic that makes this integration happen is middleware software that manages the computers, detects failures, new computers being added and computers being removed from the logical grid. Using the middleware software, applications can be deployed to the grid and they will use whatever cpu utilisation and storage the grid can provide them for their current demands. For compute intensive tasks, the more computers in the grid the faster the tasks will complete. Grids are often on-premise and dedicated to an individual organisation or project.

Now consider Cloud Computing, which provides three levels of service. At one end is Software as a Service (SaaS), allowing a pay per use model to be applied to acquiring software with no on-premise hosting. Platform as a Service (PaaS) provides an elastically scalable compute platform including middleware for applications to be deployed to. Finally Infrastructure as a Service (IaaS) enables servers, storage, networks to be provisioned on demand and on a pay as you use basis with self administration of the infrastructure.

Comparing Grid with the Cloud Computing services, Grid computing most closely compares with PaaS in Cloud Computing. It provides a deployment environment for application software which will elastically scale its compute and storage capacity to best meet the applications immediate requirements, all autonomously. A grid may be considered as a private cloud delivering PaaS. There are however, a lot of differences between Grid and Cloud computing including:

Grids are normally on-premise, and owned by an organisation, whereas clouds are normally provided by vendors and utilised on an as needs, pay per use basis by many different organisations. Grids do not provide the ability to individually provision servers and the self administration including installing a variety of operating system and software applications on these servers like IaaS does.

In summary, Grid computing and Cloud computing have some similarities: scalable, on demand compute and storage. They also have some major differences: immediate self provisioning, pay per use and a wide variety of applications available via the cloud.

If you are interested in learning how you can maybe leverage Cloud Computing for your organisation, why not consider attending the Cloud Computing course. We explore in detail and provide hands-on exposure to a variety of Cloud Computing services, considering the benefits and risks of adopting these for your organisation.

Adron Hall (@adronbh) continues Cloud Throw Down: Part 2 on 11/3/2010:

Ok, in this edition the fight gets graphic! Let’s jump right into the bout. I’ve also been thinking about adding Rackspace or another cloud provider, so if you want me to add a comparison point, please let me know and I’ll get them added.

Deploying .NET Web Application Code into AWS and Windows Azure is done in some distinctly different ways. There are two ways to look at this measurement;

One is with a configured Web Role or VM Role in Windows Azure and already configured EC2 Instance for AWS or…

A Web Role or VM Role as is and an EC2 Instance as is.

The first scenario one can deploy .NET code directly to the Web Role by simply building the code, directly from Visual Studio and just clicking on publish. For simple web applications and even some complex ones, this is all that is required to get the app into the cloud and running. This takes about 1-5 minutes depending on your build time and bandwidth. I’ll measure this first method of deployment with the web role already started, so it is only the deployment being measured. For AWS I’m making the assumption that you’ve already got an EC2 instance running with either Linux + Mono or Windows with IIS Configured and ready. I’ll call it the…

Deploying .NET Web Applications into Ready Environments

With Windows Azure the deployment takes about 1-5 minutes from Visual Studio 2010. With AWS EC2 using the FTP deployment it takes about 1-5 minutes. In this particular situation, both cloud services are equal for deployment time and steps to deploy.

Rating & Winner: Deploying .NET Web Applications into Ready Environments is a tie.

…the second situation though is where things get tricky and Windows Azure has some startling advantages. I’ll call this deployment…

Deploying a .NET Web Application into Environments As-is

This is a trickier situation. Setup and then deployment. For a Web Role the setup is done almost entirely in the actual .NET Web Application Project. So it is done if the project builds and runs locally. Nothing available is as fast as a deployment from Visual Studio 2010 straight to a Web Role. Of course the application has to be built with this specific scenario in mind. Total deployment time for this is 1-5 minutes.

The AWS EC2 instance you have to configure the operating system and IIS. Installation of this and setup for the FTP server, etc, often takes several minutes. So the first time deployment onto an EC2 instance will take a good 5-15 minutes and often requires, as any deployment to a web server does, a few minutes just to make sure all the settings are right and setup just the way you want them. Overall, this method is not as clean as the Windows Azure Web Role deployment method.

Windows Azure however has a VM Role, similar in many ways to an AWS EC2 instance, and it has the same issues and concerns as deploying to an AWS EC2 instance for the first time. It requires manual intervention to setup, configure, and assure that all things are in order for the specific application.

Combining these facts for deploying a .NET web application as-is leaves a few odd points. The VM Role and AWS EC2 instance is definitely more time consuming and prone to human error for deployment because of the additional control one can have. However the Web Role limits the ability to control many variables of the serving of content from the Web Role, it absolutely is the fastest and cleanest way to deploy a .NET Web Application.

Rating & Winner: Deploying a .NET Web Application into Environments As-is goes to Windows Azure.

Virtual Instance Options

The next measurement is a simple one, virtual instance options. Windows Azure has the following options available for virtual instances:

Compute Instance Size CPU Memory

Extra Small 1.0 GHz 768 MB

Small 1.6 GHz 1.75 GB

Medium 2 x 1.6 GHz 3.5 GB

Large 4 x 1.6 GHz 7 GB

Extra large 8 x 1.6 GHz 14 GB

(reference: http://www.microsoft.com/windowsazure/compute/default.aspx)

That gives us 5 different compute instance sizes to choose from. Amazon Web Services provides the following compute instance sizes:

Compute Instance Size CPU Memory

Micro Up to 2 EC2 Compute Units 613 MB

Small 1 Dedicated EC2 Compute Unit 1.7 GB

Large 4 EC2 compute units 3.5 GB

Extra Large 8 EC2 compute units 7 GB

High Memory Extra Large 6.5 EC2 compute units 17.1 GB

High Memory Double Extra Large 13 EC2 compute units 34.2 GB

High Memory Quadruple Extra Large 26 EC2 compute units 68.4 GB

High CPU Medium 5 EC2 compute units 1.7 GB

High CPU Extra Large 20 EC2 compute units 7 GB

Which provides us a diversified range of 9 different instance types.

Rating & Winner: Virtual Instance Options goes to AWS.

Today’s Winners is… Windows Azure and AWS in a tie. The rest of my throw down series will be coming over the next week and few days. If you have any ideas or things I should compare the two services on.

To check out more about either cloud service navigate over to:

Amazon Web Services or

Windows Azure

Disclosure: I didn’t mention it in either of the previous throw down segments about any disclosure I need to make. I’m a .NET programmer, love and hate Microsoft at the same time, but have no real honest preference toward either cloud service. I’m just interested in and always learning more about each technology. Using either service when their respective capabilities meet the price, feature, or other combination that I can use. I also do not work directly for Microsoft or Amazon. Again, thanks for reading. :)

The Windows Azure Team (and almost every other Microsoft blogger) announced Windows Azure Platform Introductory Special Extended to March 31, 2011 on 11/3/2010:

Good news - the Windows Azure Platform Introductory Special offer (which includes the SQL Azure Free Trial) has been extended through March 31, 2011! This promotional offer enables you to try a limited amount of the Windows Azure platform at no charge. The subscription includes a base level of monthly Windows Azure compute hours, storage, data transfers, AppFabric Access Control transactions, AppFabric Service Bus connections, and a SQL Azure Database, at no charge.

Included each month at no charge:

Windows Azure

25 hours of a small compute instance

500 MB of storage

10,000 storage transactions

Windows Azure AppFabric

100,000 Access Control transactions

2 Service Bus connections

SQL Azure

1GB Web Edition database (available for first 3 months only)

Data Transfers (per region)

500 MB in

500 MB out

Any monthly usage in excess of the above amounts will be charged at the standard rates. This Introductory Special offer will end on March 31, 2011 and all usage will then be charged at the standard rates.

Please visit http://www.microsoft.com/windowsazure/offers/ to see additional details on the Introductory Special as well as other offers currently available for the Windows Azure platform.

Unfortunately, “25 hours of a small compute instance” is gone in a day.

William Vambenepe (@vambenepe) asserted Cloud management is to traditional IT management what spreadsheets are to calculators on 11/3/2010:

It’s all in the title of the post. An elevator pitch short enough for a 1-story ride. A description for business people. People who don’t want to hear about models, virtualization, blueprints and devops. But people who also don’t want to be insulted with vague claims about “business/IT alignment” and “agility”.

The focus is on repeatability. Repeatability saves work and allows new approaches. I’ve found spreadsheets (and “super-spreadsheets”, i.e. more advanced BI tools) to be a good analogy for business people. Compared to analysts furiously operating calculators, spreadsheets save work and prevent errors. But beyond these cost savings, they allow you to do things you wouldn’t even try to do without them. It’s not just the same process, done faster and cheaper. It’s a more mature way of running your business.

Same with the “Cloud” style of IT management.

Related posts:

REST in practice for IT and Cloud management (part 3: wrap-up)

Exploring “IT management in a changing IT world”

REST in practice for IT and Cloud management (part 2: configuration management)

The necessity of PaaS: Will Microsoft be the Singapore of Cloud Computing?

Cloud catalog catalyst or cloud catalog cataclysm?

Toolkits to wrap and bridge Cloud management protocols

My (@rogerjenn) Windows Azure Uptime Report: OakLeaf Table Test Harness for October 2010 (99.91%) post of 11/3/2010 reported:

Following is the Pingdom monthly report for a single instance of the OakLeaf Systems Windows Azure Table Test Harness running in Microsoft’s South Central US (San Antonio) data center:

Click here for 9 more months of tabular data.

David Linthicum Updated: 3 Reasons Clouds Fail for ebizQ on 11/3/2010:

I'm at Cloud Expo this week, and have been speaking with a great many people who are building their first clouds...public, private, and hybrid. Some are working, some are not. Here are 3 reasons that clouds are failing.

Reason 1: No governance. You need to have a good governance and management strategy in order to effectively operate clouds to support the services levels required. This means that governance software is monitoring your infrastructure as to set policies. Guys like Abiquo, and a few others, should be considered to address the management and governance requirements.

Reason 2: No performance modeling. While clouds seems like they should scale by some decree of the buzzword law, the reality is that most clouds are not optimized for performance or scaling. Make sure to model performance and scalability, and validate that the architecture and technology is optimized.

Reason 3: No talent. The largest issue around the failure of some clouds is the lack of architectural understanding of cloud computing, and the lack of skills required to scope, design, deploy, and test a cloud...private, public, or hybrid. Not sure I have a fix for this, other than make sure to get the expert help you need. It's relatively cheap considering the cost of failure.

Lori MacVittie (@lmacvittie) asserted There are many logical fallacies, some more recognizable than others. Today’s lesson is brought to you by the logical fallacy “equivocation” and the term “multi-tenant” in a preface to her Logic Clearly Dictates That Different Things are Different post of 11/3/2010 to F5’s DevCentral blog:

Definition: Equivocation is sliding between two or more different meanings of a single word or phrase that is important to the argument.

Say “cloud” and ask for a definition today and you’ll still get about 1.2 different answers for every three people in the room. It’s just a rather nebulous technology that’s hard to nail down and because it’s something that’s defined by characteristics rather than a concrete specification it’s difficult to get folks with sometimes diametrically opposed architectural approaches to agree on a definition. One of the reasons this is a problem is because you end up with a lot of equivocation when people start arguing about whether “this” is cloud or “that” is cloud or, more apropos to today’s lesson, whether “cloud” is secure.

LOGIC DICTATES YOU SHOULD BACK UP and TRY AGAIN

Security remains the biggest obstacle preventing major businesses from embracing cloud services. However, Charles Babcock of Information Week says such fears are “overblown.”

In particular, Babcock supports multi-tenant, shared cloud computing , which some executives fear has weak security. “To me, Salesforce.com and other SaaS vendors have established the legitimacy of the multi-tenant model. If it didn’t work, we’d be hearing constant complaints about compromises of data and loss of business,” wrote Babcock. “The question of whether it can be made safer than it is, however, I would answer at face value, ‘of course it can.’” [emphasis added]

-- Cloud Security – One Size Does Not Fit All

While it may be argued – and argued well – that “SaaS vendors have established the legitimacy of the multi-tenant model” they have done so only for the SaaS multi-tenant model. The (in)security of SaaS or IaaS does not imply the (in)security of multi-tenancy in other models because they may be (and often are) implemented in entirely different ways.

SaaS ≠ PaaS ≠ IaaS

If none of the “aaS” are the same (and they are not) then neither are the multi-tenant models they employ – if they even employ such a thing. The multi-tenancy requirements of infrastructure and systems – the ones that make up PaaS and IaaS – are necessarily implemented in myriad ways that do not mirror the database-configuration-driven methodology associated with SaaS vendors. Multi-tenancy in a Load balancer, for example, is not implemented using a database and it is, in part, the security of the database system in a SaaS that provides those offerings with a measure of its security.

Using SaaS as the poster-child for cloud security is, to quote Hoff, intellectually dishonest or the product of ignorance.

Almost all of these references to "better security through Cloudistry" are drawn against examples of Software as a Service (SaaS) offerings. SaaS is not THE Cloud to the exclusion of everything else. Keep defining SaaS as THE Cloud and you're being intellectually dishonest (and ignorant.)

-- Christofer Hoff, “What People REALLY Mean When They Say “THE Cloud” Is More Secure…”

Multi-tenancy in an IaaS environment is necessarily more complex than that of a SaaS environment. Unless you really believe that Salesforce.com is not only providing isolation at the application layer but also divvying up the network into VLANs and applying ACls on every router on a per customer basis. I didn’t think you did.

Yet this level of “security” is what it takes at an IaaS layer to provide a secured, multi-tenant environment. Multi-tenant means different things in different deployment models, and one cannot equate SaaS multi-tenancy to IaaS multi-tenancy. Well, you can, but you’d be very, very wrong.

IaaS and MULTI-TENANCY

Multi-tenancy is the ability to support multiple “tenants” on the same solution while providing isolation, individual configuration and security for each customer. In an IaaS environment this is not necessarily achieved on the device but is instead often realized through an architectural approach. When the network is involved isolation and security of a complete flow of data is achieved not by configuration settings in a database, but through the use of protocols designed to segment and isolate while routing data through the network. Protocols are not inherently multi-tenant; they are the means by which some forms of multi-tenancy can be (and are) implemented.

But the use of protocols and architecture to achieve multi-tenancy is in no wise related to the multi-tenancy of a SaaS environment. In an IaaS environment providers are concerned with multi-tenancy at the network and infrastructure layer. There are not required to provide this same capability for applications, except where server infrastructure is concerned. SaaS providers, on the other hand, may or may not be concerned about the multi-tenancy of the network and are instead concerned only with the application that is being delivered.

With such very different models and concerns for the provider, it is impossible to apply the (in)security of one model to another. SaaS may be in fact very secure, but that says nothing about an IaaS provider, and vice-versa.

Any such arguments attempting to imply the security of PaaS and IaaS by pointing at SaaS implementations are nothing less than equivocations, and are simply illogical.

Windows Azure Compute and SQL Azure had brief problems with service management according to these RSS messages:

[Windows Azure Compute] [North Central US] [Green(Info)] Service management issues

Nov 3 2010 6:15AM Service management operations such as creation, update and deletion of hosted services or storage accounts are failing. Running deployments are not impacted and will continue to run. Storage accounts remain accessible.
Nov 3 2010 8:32AM Full service management functionality has been restored.

[Windows Azure Compute] [South Central US] [Green(Info)] Service management issues

Nov 3 2010 6:15AM Service management operations such as creation, update and deletion of hosted services or storage accounts are failing. Running deployments are not impacted and will continue to run. Storage accounts remain accessible.
Nov 3 2010 8:32AM Full service management functionality has been restored.

[SQL Azure Database] [North Central US] [Yellow] SQL Azure Service Management Issues

Nov 3 2010 8:08AM Service management operations such as creation, update and deletion of hosted services or storage accounts are failing. All other functionality is working normally.
Nov 3 2010 8:36AM Normal service functionality is fully restored for SQL Azure Database.

[SQL Azure Database] [South Central US] [Yellow] SQL Azure Service Management Issues

Nov 3 2010 8:08AM Service management operations such as creation, update and deletion of hosted services or storage accounts are failing. All other functionality is working normally.
Nov 3 2010 8:36AM Normal service functionality is fully restored for SQL Azure Database.

Alan Le Marquand explained How to Manage Server Costs with Windows Azure on 11/2/2010:

New Applications often mean new servers. The rate of technical change in our industry means our workforce has different demands and so our “line of business” applications need to be updated. This could mean a new version or modifications that add a web service front end. Whichever way, there usually is a new hardware request in there somewhere.

What do you do if budgets are tight? Do you buy a smaller box and hope it does the job, or do you cut funding for another project? Tough decisions IT departments have to make. What is different these days is that there is a different option. You could use a platform service such as Windows Azure.

Windows Azure can lower your up-front capital costs. With consumption-based pricing, packages and partner discounts lower the barriers to entry to cloud services adoption and ensure a predictable IT spend.

Microsoft’s data centers can provide organizations with the equivalent of hundreds or even thousands of servers. Now, instead of buying new hardware, you add the resources you need and pay for only that usage. Reducing the number of physical servers in your environment slashes other costs, such as the price of power, cooling and the day-to-day maintenance physical hardware requires. This frees IT people to focus on software, and to simply carry on running and maintaining application as they do today, with the knowledge that they can improve the throughput performance in minutes.

When the need arises to increase the performance of some applications in an environment to keep up with new demand, it often means investing in a new server or undertaking a performance-tuning exercise, both of which incur costs. By contrast, moving those applications to Windows Azure provides a means to scale these applications and increase throughput at reduced costs compared to adding new hardware and software.

To take advantage of this cloud computing scenario requires some planning to transition a cloud application from your on-premise systems. After reading the above you probably have lots of questions, most everyone does when anyone mentions the cloud. To help with this the TechNet has put together a new Cloud hub. This hub contains information and resources to help you understand this option above and be able to take advantage of. Called Getting Business Done with the Cloud, the site contains 7 scenarios today; we will be adding new scenarios on regular basis, to continue to explain the cloud options and how to implement them quickly.

The particular scenario mentioned above can be found on the How to Manage Server Costs with Windows Azure section of the hub.

Phil Ferst reported The Industry Speaks about Cloud, Part I: Business execs are buying-in to Cloud even more than their IT counterparts on 11/2/2010:

HfS Research and the London School of Economics have surveyed 1053 organizations on the future of Cloud Business Services

This week, we’re beginning to unravel the colossus study we just ran with the London School of Economics delving into the future potential of Cloud Business Services.

We managed to receive 1053 participants across business buyers, IT buyers, advisors, providers and industry influencers – if anyone else in the industry has performed such as exhaustive study of Cloud business services, please enlighten us! Thank you to all our loyal readers who completed the study, and our friends at SSON who helped engage their network. A complimentary report of the study findings will be winging its way to you all very soon.

One of the unique angles of our study has been to contrast the views and intentions of the non-IT business community, and solely IT executives. And – as we suspected – the dynamics driving the future direction of Cloud adoption within the business functions is going to come from the business function leaders who “get it”.

Cloud Business Services are no longer hype – both business and IT executives are buying-into the value Cloud can bring to their jobs and their organizations. Let’s examine further:

*The ability to access business applications quicker, faster, cheaper and in a virtual business environment are the major drivers – and it’s the business side of the house which is even more engaged by the potential value than the IT-side.

*Most notably, half the business respondents seriously value the focus Cloud brings to transforming their business, as opposed to their IT. Barely a third of IT respondents were as enthralled by this.

Does this mean that the real impetus behind future adoption of Cloud Business Services is going to come from business function leaders with heavy influence over IT spending for their function? And what role will Cloud Business Services play in altering the make-up of today’s outsourcing and integrated services engagements?

Stay tuned for Part 2, and Part 3… and probably Parts 4 and 5 as well…

Windows Azure Platform Appliance (WAPA)

No significant articles today.

Cloud Security and Governance

Richard L. Santalesa reported White House CIO Council Releases Draft Guidance on U.S. Govt Cloud Computing in an 11/3/2010 post to the Information Law Group blog:

A draft release of a 90-page Proposed Security Assessment and Authorization for U.S. Government Cloud Computing was distributed by the White House CIO Council yesterday, curiously numbered a 0.96 release. A product of FedRAMP (the Federal Risk and Authorization Management Program), the guidance draft is the result of an 18-month inter-agency effort by the National Institute of Standards and Technology (NIST), General Services Administration (GSA)(see GAO-10-855T), the CIO Council and others, including state and local governments, industry, academia, and additional governmental bodies, such as the Information Security and Identity Management Committee (ISIMC). Comments on the draft can be submitted online until December 2nd here.

While we'll be posting further analysis of the cloud computing guidance draft, the three chapters of the draft's tripartite organization focus on:

Cloud Computing Security Requirements Baselines;

Continuous Monitoring; and a

Potential Assessment & Authorization Approach.

An appendix contains materials on assessment procedures and security documentation templates. While the end goal of this FedRAMP initiative is to streamline federal governmental cloud computing vetting and procurement across agencies, it clearly remains to be seen how this ultimately works out in the field. As the guidance states, on page 46, in the introduction to Chapter 3, Potential Assessment & Authorization Approach:

"the end goal is to establish an on-going A&A approach that all Federal Agencies can leverage. To accomplish that goal, the following benefits are desired regardless of the operating approach:

Inter-Agency vetted Cloud Computing Security Requirement baseline that is used across the Federal Government;

Consistent interpretation and application of security requirement baseline in a cloud computing environment;

Consistent interpretation of cloud service provider authorization packages using a standard set of processes and evaluation criteria;

More consistent and efficient continuous monitoring of cloud computing environment/systems fostering cross-agency communication in best practices and shared knowledge; and

Cost savings/avoidance realized due to the “Approve once, use often” concept for security authorization of cloud systems.

Check back for a detailed analysis of the draft Proposed Security Assessment and Authorization for U.S. Government Cloud Computing.

Chris Hoff (@Beaker) casted a baleful eye on FedRAMP. My First Impression? We’re Gonna Need A Bigger Boat… in this 11/3/2010 post:

I’m grumpy, confused and scared. Classic signs of shock. I can only describe what I’m feeling by virtue of an analog…

There’s a scene in the movie Jaws where Chief Brody, chumming with fish guts to attract and kill the giant shark from the back of the boat called “The Orca,” meets said fish for the first time. Terrified by it’s menacing size, he informs [Captain]
Quint “You’re gonna need a bigger boat.”

I felt like that today as I read through the recently released draft of the long-anticipated FedRAMP documents. I saw the menace briefly surface, grin at me, and silently slip back into the deep. Sadly, channeling Brody, I whispered to myself “…we’re gonna need something much sturdier to land this fish we call cloud.”

I’m not going to make any friends with this blog.

I can barely get my arms around all of the issues I have. There will be sequels, just like with Jaws, though unlike Roy Schneider, I will continue to be as handsome as ever.

Here’s what I do know…it’s 81 pages long and despite my unhappiness with the content and organization, per Vivek Kundra’s introduction, I can say that it will certainly “encourage robust debate on the best path forward.” Be careful what you ask for, you might just get it…

What I expected isn’t what was delivered in this document. Perhaps in the back of my mind it’s exactly what I expected, it’s just not what I wanted.

This is clearly a workstream product crafted by committee and watered down in the process. Unlike the shark in Jaws, it’s missing it’s teeth, but it’s just as frightening because its heft is scary enough. Even though all I can see is the dorsal fin cresting the water’s surface, it’s enough to make me run for the shore.

As I read though the draft, I was struck by a wave of overwhelming disappointment. This reads like nothing more than a document which scrapes together other existing legacy risk assessment, vulnerability management, monitoring and reporting frameworks and loosely defines interactions between various parties to arrive at a certification which I find hard to believe isn’t simply a way for audit companies to make more money and service providers to get rubber-stamped service ATO’s without much in the way of improved security or compliance.

This isn’t bettering security, compliance, governance or being innovative. It’s not solving problems at a mass scale through automation or using new and better-suited mousetraps to do it. It’s gluing stuff we already have together in an attempt to make people feel better about a hugely disruptive technical, cultural, economic and organizational shift. This isn’t Gov2.0 at all. It’s Gov1.0 with a patch. It’s certainly not Cloud.

Besides the Center for Internet Security reference, there’s no mention of frameworks, tools, or organizations outside of government at all…that explains the myopic focus of “what we have” versus “what we need.”

The document is organized into three chapters:

Chapter 1: Cloud Computing Security Requirement Baseline
This chapter presents a list of baseline security controls for Low and Moderate
impact Cloud systems. NIST Special Publication 800-53R3 provided the foundation
for the development of these security controls.

Chapter 2: Continuous Monitoring
This chapter describes the process under which authorized cloud computing systems
will be monitored. This section defines continuous monitoring deliverables,
reporting frequency and responsibility for cloud service provider compliance with
FISMA.

Chapter 3: Potential Assessment & Authorization Approach
This chapter describes the proposed operational approach for A&A’s for cloud
computing systems. This reflects upon all aspects of an authorization (including
sponsorship, leveraging, maintenance and continuous monitoring), a joint
authorization process, and roles and responsibilities for Federal agencies and Cloud
Service Providers in accordance with the Risk Management Framework detailed in
NIST Special Publication 800-37R1.

It’s clear that the document was written almost exclusively from the perspective of farming out services to Public cloud providers capable of meeting FIPS 199 Low/Moderate requirements. It appears to be written in the beginning from the perspective of SaaS services and the scoping and definition of cloud isn’t framed — so it’s really difficult to understand what sort of ‘cloud’ services are in scope. NIST’s own cloud models aren’t presented. Beyond Public SaaS services, it’s hard to understand whether Private, Hybrid, and Community clouds — PaaS or IaaS — were considered.

It’s like reading an article in Wired about the Administration’s love affair with Google while the realities of security and compliance are cloudwashed over.

I found the additional requirements and guidance related to the NIST 800-53-aligned control objectives to be hit or miss and some of them utterly laughable (such as SC-7 – Boundary Protection: “Requirement: The service provider and service consumer ensure that federal information (other than unrestricted information) being transmitted from federal government entities to external entities using information systems providing cloud services is inspected by TIC processes.” Good luck with that. Sections on backup are equally funny.

The “Continuous Monitoring” section requirements wherein the deliverable frequency and responsibile party is laid out engenders a response from “The Princess Bride:”

You keep using that word (continuous)…I do not think it means what you think it means…

Only 2 of the 14 categories are those which FedRAMP is required to provide (pentesting and IV&V of controls.) All others are the responsibility of the provider.

Sigh.

There’s also not a clear distinction that in a service deployed on IaaS (as an example) where anything in the workload’s VM fits into this scheme (you know…all the really important stuff like information and applications) and how agency processes intersect with the CSP, FedRAMP and the JAB.

The very dynamism and agility of cloud are swept under the rug, especially in sections discussing change control. It’s almost laughable…code changes in some “cloud” SaaS vendors every few hours. The rigid and obtuse classification of the severity of changes is absolutely ludicrous.

I’m unclear if the folks responsible for some of this document have ever used cloud based services, frankly.

“Is there anything good in the document,” you might ask? Yes, yes there is. Firstly, it exists and frames the topic for discussion. We’ll go from there.

However, I’m at a loss as how to deliver useful and meaningful commentary back to this team using the methodology they’ve constructed…there’s just so much wrong here.

I’ll do my best to hook up with folks at the NIST Cloud Workshop tomorrow and try, however if I smell anything remotely like seafood, I’m outa there.

/Hoff

Related articles

Feds Roll Out Cloud Security Guidelines (informationweek.com)

Administration Proposes Security Requirements For Cloud Computing (techdailydose.nationaljournal.com)

New Standards for Governmental Agencies on Cloud Computing Security (blogs.csoonline.com)

Feds Discover 1,000 More Data Centers (datacenterknowledge.com)

What’s The Problem With Cloud Security? There’s Too Much Of It… (rationalsurvivability.com)

Hoff’s 5 Rules Of Cloud Security… (rationalsurvivability.com)

Graphic credit: Cover of Jaws (30th Anniversary Edition)

Cloud Computing Events

The Windows Azure Team reported on 11/3/2010 New Windows Azure Firestarter Series Starts November 8, 2010 in Tampa, FL:

Is cloud computing still a foggy concept for you? Have you heard of Windows Azure, but aren't quite sure of how it applies to your development projects? If you're a developer who'd like to learn more, then join Microsoft Developer Evangelists at one of eight free, all-day Firestarter events that will combine presentations, demos and hands-on labs to help you better understand where the cloud and Windows Azure can take you. All sessions run 8:30 AM - 5:30 PM.

Click on a session to register:

Tampa, FL - Monday, November 8

Atlanta, GA - Wednesday, November 10

Charlotte, NC - Thursday, November 11

Rochester, NY - Tuesday, November 16

Waltham, MA - Tuesday, November 30

New York, NY - Wednesday, December 1

Malvern, PA - Tuesday, December 7

Chevy Chase, MD - Thursday, December 9

If you can't make it to any of the events in-person, you can watch the December 9 event live online by registering here.

Other Cloud Computing Platforms and Services

Guy Rosen (@guyro) returned with State of the Cloud – November 2010 on 11/3/2010:

After a short hiatus, State of the Cloud is back with a brand new update. Starting from this report, updates will be published every two months.

Methodology

State of the Cloud is an on-going survey of the market penetration of cloud computing. Specifically, the survey tracks publicly facing websites (i.e., www.something.com) and does not look into internal usage such as R&D, testing and enterprise use.

The technique used is as follows: I use QuantCast’s top 1M site list as a reference. taking the top half of the list (500k sites in total). Each site is queried to determine whether it is hosted on a cloud provider, and if so on which. The results can be seen below.

Snapshot for November 2010

Here are the results for this month.

The two market leaders continue to make significant gains, both gaining an average of 2.6% per month over the past two month. Both have more than doubled their footprint since I started measuring back in August 2009. All the rest but Joyent have exhibited slower – if steady – growth. We’ve been wondering for over a year if anyone will rise to challenge the dominance of Amazon and Rackspace. So far, that hasn’t happened.

Trends

Looking back over the time since the survey began, what jumps out more than anything is how the cloud as a whole has grown. The cloud has more than doubled – from 3,635 sites hosted in the cloud back in August 2009 to 7,845 today. Otherwise the picture remains pretty static, with a clear division into major league and minor league. Perhaps the only exception here is Linode – which I continue to predict will be grabbed up sooner or later by a large player looking to make a quick entry into the cloud space (much as SliceHost was acquired by Rackspace and became Rackspace Cloud Servers, measured above).

Jeff Barr reported Sauce Labs - OnDemand Testing Service on EC2 on 11/3/2010:

Late last month I spent some time on the phone with John Dunham and Steve Hazel of Sauce Labs to learn more about their Sauce OnDemand testing service. The product is built around the popular Selenium testing tool and can actually make use of existing Selenium scripts for functionality and performance testing. John and Steve said that I can think of it as "robotically controlled browsers as a service."

We talked about the fact that the cloud is the natural place for testing resources, since the amount of usage within a particular organization is subject to extreme fluctuations. John told me that the dedicated test resources in a typical organization are idle 99% of the time. This inefficient use of capital is of great concern to CIOs and CTOs, since the return on idle resources is zero.

By allocating EC2 instances on demand, their customers can now test in hours instead of days. Steve said that many customers see a 10- to 20-fold improvement (reduction) in test times because they can simply throw additional resources at the problem. He told me that this works really well until it becomes an "accidental load test" of the system under test! They perform a continuous screen capture from each test machine and store all of the video in Amazon S3. To date they have run over 3 million tests and have retained all of the videos. This feature is called Sauce TV.

Architecturally, the system uses a single Extra Large (m1.xlarge) EC2 instance to coordinate an elastic pool of High CPU Medium (c1.medium) test servers. They've built 5 or 6 different Amazon Machine Images (AMIs) with different combinations of operating systems and browsers (IE6, IE7, Firefox, and so forth -- complete list is here). They keep a few instances idle so that they can start testing as soon as the need arises. They make use of historic usage data to guide their pre-scaling process, further adjusted to reflect actual demand. They are able to run high-scale tests in short order without charging extra.

Scale-wise, the largest test has consumed about 250 instances and the overall pool has grown as large as 400 instances.

You can sign up for a free 30-day trial of Sauce OnDemand. After that, you can get 1,000 minutes of testing per month for $49 (additional minutes cost $0.05 each). There are also some enterprise plans.

Sauce Labs has implemented a number of improvements to Selenium including SSL and Unicode support, cross-browser file uploads, and an OS-level popup eliminator. They plan to make these enhancements available to the Selenium project.

John told me that The Motley Fool is really sold on the idea of cloud-based testing with Sauce OnDemand. Here's a presentation from Dave Haeffner, Quality Assurance Manager at The Motley Fool: Mountains to Molehills: A Story of QA.

View more presentations from tourdedave.

Compute Instance Size	CPU	Memory
Extra Small	1.0 GHz	768 MB
Small	1.6 GHz	1.75 GB
Medium	2 x 1.6 GHz	3.5 GB
Large	4 x 1.6 GHz	7 GB
Extra large	8 x 1.6 GHz	14 GB

Thursday, November 04, 2010

Background

The Problem

The Approach

The Downside

Conclusion

Definition: Equivocation is sliding between two or more different meanings of a single word or phrase that is important to the argument.

SaaS ≠ PaaS ≠ IaaS

IaaS and MULTI-TENANCY

Methodology

Snapshot for November 2010

Trends

0 comments:

Forbes: Who Are The Top 20 Influencers in Big Data?

TRAACKR: Who Are The Top 50 Data Science Influencers?

Blog Archive

OakLeaf Blog Curations on Curah!

Links to SQL Azure Labs and Other Big Data Articles

Windows Azure Mobile Services Preview Walkthrough for Windows Store Apps

OakLeaf's New Windows Azure WebSites

Check Out OakLeaf's New Azure DataMarket Offerings

SearchCloudComputing Articles

Articles for Red Gate Software's ACloudyPlace

Windows Azure Articles for Developer.com

DZone Syndication

Feeds

OakLeaf Systems' Listings in Microsoft Pinpoint

Links to Cover Stories for Visual Studio Magazine

The Windows Azure Team Interview, 11/30/2010

OakLeaf Blog Ranked #134 of Influential Data Blogs by eCairn

OakLeaf Systems' Windows Azure Table Services Sample Project

Check Out the OakLeaf SharePoint Online Site

Google Blog Search

Windows Azure Questions & Answers

About Me

My Latest Books

Early MiniDV and FireWire Posts

Labels

Berkeley Juneteenth Festival 1998 Historical Web Pages

Berkeley Juneteenth Festival Silver Anniversary

Miscellaneous Links

Squidoo Lenses

Terms of Use/Privacy Statement

Copyright