Wednesday, November 17, 2010

Windows Azure and Cloud Computing Posts for 11/17/2010+

A compendium of Windows Azure, Windows Azure Platform Appliance, SQL Azure Database, AppFabric and other cloud-computing articles.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:

To use the above links, first click the post’s title to display the single article you want to navigate.


Cloud Computing with the Windows Azure Platform published 9/21/2009. Order today from Amazon or Barnes & Noble (in stock).

Read the detailed TOC here (PDF) and download the sample code here.

Discuss the book on its WROX P2P Forum.

See a short-form TOC, get links to live Azure sample projects, and read a detailed TOC of electronic-only chapters 12 and 13 here.

Wrox’s Web site manager posted on 9/29/2009 a lengthy excerpt from Chapter 4, “Scaling Azure Table and Blob Storage” here.

You can now freely download by FTP and save the following two online-only PDF chapters of Cloud Computing with the Windows Azure Platform, which have been updated for SQL Azure’s January 4, 2010 commercial release:

  • Chapter 12: “Managing SQL Azure Accounts and Databases”
  • Chapter 13: “Exploiting SQL Azure Database's Relational Features”

The two chapters are also available for HTTP download at no charge from the book's Code Download page.


Tip: If you encounter articles from MSDN or TechNet blogs that are missing screen shots or other images, click the empty frame to generate an HTTP 404 (Not Found) error, and then click the back button to load the image.


Azure Blob, Drive, Table and Queue Services

Nathan Stuller (@YeahStu) posted Implications of Windows Azure Container Types on 11/17/2010:

In 7 Reasons I Used Windows Azure for Media Storage, I described the download process involved in streaming a large video through a Silverlight applet using the Microsoft cloud offering. My scenario involved the use of public containers to store large files (blobs) for a web application. Public containers are convenient because they can be accessed via a simple GET request. Unfortunately, being that simple begets some negative behavior. By being accessible via a simple URL, any user on the web can link to that file and/or download a copy for personal use.

If you are already using public containers, do not be alarmed that your storage is entirely exposed. I tested my site by typing a URL from which I removed the file name, and the result indicated that the URL could not be found. I immediately breathed a sigh of relief. In other words, even public containers do not act the same way that IIS would if the Directory Browsing setting were enabled.

Example URL: http://{ApplicationName}.blob.core.windows.net/{Container}/

Still, for cases in which public containers are not satisfactory due to their openness, the alternative is to use private containers.

Private containers are similar to public containers and remain fairly simple to use. They require the inclusion of a unique key during the GET request for stored files. This is extremely easy using the Azure SDK sample, which abstracts away the details of what must be included in the request.

Effectively, the container type determines where the request for Azure blobs comes from. For public containers, the request comes from the client, because a simple URL fetches the file. In contrast, the request for private containers must come from the server. The server-side code embeds the key in the GET request, receives the blob, and processes it or delivers it to the client accordingly.
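
As a minimal sketch of that server-side fetch (my own illustration, not Nathan's code, assuming the Microsoft.WindowsAzure.StorageClient library that ships with the Windows Azure SDK; the container, blob and connection-string values are placeholders):

using System.IO;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class PrivateBlobReader
{
    public static void CopyBlobTo(Stream output)
    {
        // The account key is used to sign the request server-side; it never appears in a URL the client sees.
        CloudStorageAccount account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>");
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlobContainer container = client.GetContainerReference("privatemedia");
        CloudBlob blob = container.GetBlobReference("episode1.wmv");
        blob.DownloadToStream(output);   // the server-side code then streams this to the client
    }
}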

The obvious benefit of private containers being accessible only to server-side code is that security logic can run there, restricting blob access to specific users based on rules. It also makes it much more difficult, but still possible, to download files stored in private containers for personal use. The drawback to this solution is that streaming now passes through the server, greatly increasing the bandwidth consumed by the application.

As described above, there are cases to be made for the use of both public and private containers. The best solution comes from weighing security requirements against bandwidth and development costs. Of course, there are ways to reap the benefits of both paradigms, but the above restrictions cover the “out of the box” web application scenario.


<Return to section navigation list> 

SQL Azure Database and Reporting

David Ramel reported SQL Server Developers Get Taste of 'Juneau' Goodies in an 11/15/2010 post to Redmond Developer News’ Data Driver blog:

Here's good news if you're a database developer who doesn't like working with SQL Server Management Studio (SSMS): You may not have to use it too much longer.

At the recent PASS Summit, Microsoft showed a packed roomful of database developers how the next version of SQL Server--code-named "Denali"--will include a unified development environment based on Visual Studio.

Officially called SQL Server Developer Tools code-named "Juneau," the new bells and whistles were demonstrated by Microsoft's Don Box during a PASS keynote address by Quentin Clark.

"What we're trying to do with Juneau is really advance the state of the art of database development," Box said as he demonstrated the new functionality in a Visual Studio shell. "What we're doing is, we're looking at all the stuff we've done in the past, all the stuff that's been done in Visual Studio, around doing things like .NET, C++, C#--we're trying to bring that goodness to the database development world," Box said.

Working in the Visual Studio shell, Box said, lets developers "take advantage of the new shell, the new WPF-based text editor, new language services. All those accrue value to the SQL Server product. And this also accrues value to business intelligence, so the BIDS assets are going to be in the same shell as our relational database assets."

With a nod to those who happen to like SSMS, Box demonstrated the "connected experience that an SSMS user is going to be used to, inside of Juneau."

Box showed how Juneau lets developers use the Server Explorer to drill down into a database and get the same preview that SSMS provides. "I can say new query and I get the new text editor with new language services on top of T-SQL based on the database I'm deployed against," he said. He showed a simple Execute command and said, "I also can do execute with debugging, so basically anywhere I see SQL text in a text editor, I can select it and either execute it directly against the database, or I can execute under debugging and just start doing step into or step over, anyplace I see text."

A new table designer was also demoed that used the Visual Studio style panes of code, design or a split between the two. When changes were made to the design pane, they were immediately reflected in the text editor pane, and vice versa.

Also, Box noted that the table designer lets developers easily see subordinate objects such as check constraints, primary constraints and indices. As he clicked through these items, the relevant T-SQL code was brought up in the text editor, with the affected columns highlighted.

And in some ways, Juneau will surpass SSMS. When using a CREATE view, Box noted that it's not idempotent, so any changes made would normally involve transcribing them into ALTER commands. "One of the things that we do in the tool which is an advance of what I have in SSMS today, is I have the ability to take all these pending changes which I've been accruing and say 'figure out the ALTER script for me.' So if I say commit all to database, we actually do analysis of all the source text that you've now accrued vs. the actual catalog in the database and we figure out what needs to be done to make this so."

Plenty of other cool things were also shown, including FileTable file storage, semantic searching, and version-awareness and edition-awareness, which means Juneau enforces language constraints depending on your target, be it Denali, SQL Server 2008 or SQL Azure. And speaking of Azure and the cloud, while Box didn't address it directly, one of the aims of Juneau "is to make developing for SQL Azure and on-site SQL exactly the same," according to a recent interview with Clark by The Register.

You can register and see the Box demo yourself--along with other presentations--at the PASS Web site.

While the first Community Technology Preview (CTP) of Denali was released at the Summit, developers will have to wait a bit to get their hands on the Juneau technology--it's scheduled to be included in the next CTP, with no expected timeline provided.

I happen to like working with SSMS for SQL Server and SQL Azure.


<Return to section navigation list> 

Marketplace DataMarket and OData

Clark Sell contributed a CVCC follow-up | Introduction to OData tutorial on 11/17/2010:

This past weekend I was able to attend and present at the Chippewa Valley Code Camp.  This was my first time at CVCC.  Just a great event and loads of great people there.  I know it's no easy feat to pull off a free community event, so a huge thanks to both Doug and Dan for all of their hard work.  I also thought it was just awesome to see all of the students who attended.

Ok, OData.  My session looked a little something like this:

OData, it’s like ATOM but with some extra stuff

You know the Open Data Protocol, or OData. It's a Web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, the Atom Publishing Protocol (AtomPub) and JSON to provide access to information from a variety of applications, services, and stores.

We are going to explore how to publish and consume an OData service and have a little fun along the way.

So what is OData?  Well OData.org defines it as such:

The Open Data Protocol (OData) is a Web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub) and JSON to provide access to information from a variety of applications, services, and stores. The protocol emerged from experiences implementing AtomPub clients and servers in a variety of products over the past several years.  OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.

I find myself talking about OData a lot.  Why?  Well, interestingly enough, it's not because of some hidden agenda; I'm just drawn to it.  Every application I have been part of has had to deal with data in some fashion.  I realize this isn't the case for *every* application; it's just my experience. For me, I think the draw comes down to this:

  • Its simplicity
  • I like REST
  • I like that in .NET it’s built on WCF
  • I like the overall industry support around the protocol
  • It takes what already works very well, HTTP, and expands on that success
  • I like how easily you can use it with things like jQuery

But enough about me…  There are really two parts to OData, or for that matter to any web service: consuming and producing, or the client side vs. the server side.  Let's start with consuming some OData.

Consuming

For the purposes of consumption, I am going to use Fiddler, Internet Explorer 9, StackOverflow and my own demo for these examples.  You can find StackOverflow's OData service at http://odata.stackexchange.com/.  Let's walk through a few queries against StackOverflow.  One more thing to point out: since OData is built on Atom, when you query in the browser you will be shown the feed reading view.  You can disable that by going to Tools –> Internet Options –> Content –> Feeds and Web Slices Settings –> uncheck Turn on feed reading view.


What entities are available to consume? http://odata.stackexchange.com/stackoverflow/atom/.  The result (below) shows you a couple of things: first, the entities you can actually query, and second, the URL where each entity lives (the href attribute of the collection element).

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<service xml:base="http://odata.stackexchange.com/stackoverflow/atom/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:app="http://www.w3.org/2007/app" xmlns="http://www.w3.org/2007/app">
  <workspace>
    <atom:title>Default</atom:title>
    <collection href="Badges">
      <atom:title>Badges</atom:title>
    </collection>
    <collection href="Comments">
      <atom:title>Comments</atom:title>
    </collection>
    <collection href="Posts">
      <atom:title>Posts</atom:title>
    </collection>
    <collection href="Tags">
      <atom:title>Tags</atom:title>
    </collection>
    <collection href="Users">
      <atom:title>Users</atom:title>
    </collection>
    <collection href="Votes">
      <atom:title>Votes</atom:title>
    </collection>
    <collection href="VoteTypes">
      <atom:title>VoteTypes</atom:title>
    </collection>
  </workspace>
</service>

Next let's query the base service to find out what technical goo we can glean.  To do so I simply ask the service for $metadata: http://odata.stackexchange.com/stackoverflow/atom/$metadata. This returns all of the geeky details you were looking for about those exposed entities.  At this point you could also use the OData Visualizer in Visual Studio to get a graphical look at the service.  After you add your service reference, just right-click on it and select View in Diagram.


Querying an entity is as simple as navigating to the URL that was provided to you.  Let's query for the Posts: http://odata.stackexchange.com/stackoverflow/atom/Posts.  *Note* the URL is case-sensitive, so posts will not work. One of my favorite parts about OData is the ability to just query it. Think about the web services of yesteryear.  As the business changed we would be asked to refactor our services.  Maybe we started with something simple like this:

public string GetCustomersByState( string state )

Things were great, and a few months later someone wanted Active Customers by state.

public string GetActiveCustomerByState ( string state )

Not a big deal, right?  Hell, we were most likely even calling into the same business services and just added some flag to query all or by "status".  Of course now the business came back and said, "I want to query customers by state and by status."  Now you might be thinking:

public string GetCustomers ( string state, string status )

But in fact you’re really just frustrated. Maybe you could have been more creative with your services? Maybe more abstract? Of course now you have services out there that will be a huge deal to change. I mean it’s in production and billions of customers are using it right?  I have been there. Ok maybe not billions of customers but all it really takes it two to cause headaches.  OData actually address this very scenario head on.  You can now query your endpoint without the need of having predetermined operations. BOOYA!! Don’t get me wrong, you will most likely still need some operations maybe even a lot, but what about that scenario above?  As developers we are horrible at estimating.  As humans it turns out we are just horrible at predicting the future.  You could never predict the future of your web services or the users who want to consume them.  Of course you still want agility and I am sure you had to have them last week, right.  So let’s query StackOverflow.

If this http://odata.stackexchange.com/stackoverflow/atom/Posts gives us all the posts. How can we get just Posts that are tagged with css? Or what if we wanted to know how many posts there were? Or the count on Posts tagged css? I am sure the ninjas at StackOverflow are worried about all the questions I have about their data.
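
As a rough illustration (these are my own examples, not links from the original post, and the Tags property name and $count support are assumptions about this particular feed), such questions map onto plain OData query options:

http://odata.stackexchange.com/stackoverflow/atom/Posts?$filter=substringof('css',Tags)
http://odata.stackexchange.com/stackoverflow/atom/Posts/$count
http://odata.stackexchange.com/stackoverflow/atom/Posts/$count?$filter=substringof('css',Tags)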

That is pretty sick. 

But now I want JSON. Simple. It might be different depending on the libraries you're using; in .NET, if you want JSON returned, you modify your HTTP request headers to include Accept: application/json.

http://odata.stackexchange.com/stackoverflow/atom/Posts

Host: odata.stackexchange.com
accept: application/json

You can find more about JSON support in OData here: http://www.odata.org/developers/protocols/json-format.  After reading that you might also notice the $format=json query parameter.  At the time of this writing, that parameter isn't supported out of the box in .NET 4; you have to change the Accept header (for out-of-the-box support) or write some server-side logic to support the parameter.
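
Here is a minimal sketch (my own, not from the session demo) of issuing such a request from .NET; it sets the Accept header so the service returns JSON instead of the default Atom feed:

using System;
using System.IO;
using System.Net;

class JsonODataRequest
{
    static void Main()
    {
        // $top=5 keeps the sample response small; any of the query options shown earlier would work here.
        string url = "http://odata.stackexchange.com/stackoverflow/atom/Posts?$top=5";

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Accept = "application/json";   // request JSON rather than Atom

        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}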

But not all of my users are geeks.  Good! Let's use Excel.  Yeah, I said it.  As it turns out, Excel is actually a great place to consume OData services too. There is an add-in for Excel called PowerPivot.


Once you select From Data Feeds, you will be prompted with a dialog box asking you for the URL.  After it inspects the endpoint, it will give you the option of which entities from the service you want to import.  After you've made your selections, it will create a worksheet per entity populated with data.  From there you can get out your Excel ninja skills and have some fun.

What about posting to the service?  Posting to the service isn't very complicated.  You will need to create an instance of DataServiceContext, found in System.Data.Services.Client. Once we create an instance of the context, we set our merge options.  Then we need to create and populate the model.  Since we used Visual Studio to add a service reference, it automatically created objects based on the entities found in the service.  Once we have added the data to the object, we add it to the context and then save it.  The save results in posting it to the service.  Here is a sample code snippet from the demo rather than StackOverflow.

DataServiceContext context = new DataServiceContext(new Uri(@"http://localhost:9998/Services/Reporting.svc/"));
context.MergeOption = MergeOption.AppendOnly;

Evangelist evangelist = new Evangelist();
evangelist.Name = "Clark Sell";
evangelist.District = "Midwest";
evangelist.State = "IL";

context.AddObject("Evangelists", evangelist);
context.SaveChanges();

In a later post I will explore posting more complicated data structures.

Producing

Up to now we have seen how to query a service. Here is the funny part: that might actually be harder than creating the service itself.  Let's start with just exposing an Entity Framework model as an OData endpoint.  First we will need to create the service itself: Add New Item, and select WCF Data Service.  The result is your service:

public class blog : DataService< /* TODO: put your data source class name here */ >
    {
        // This method is called only once to initialize service-wide policies.
        public static void InitializeService(DataServiceConfiguration config)
        {
            // TODO: set rules to indicate which entity sets and service operations are visible, updatable, etc.
            // Examples:
            // config.SetEntitySetAccessRule("MyEntityset", EntitySetRights.AllRead);
            // config.SetServiceOperationAccessRule("MyServiceOperation", ServiceOperationRights.All);
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
        }
    }

Notice our service inherits from DataService<T>. That generic type parameter is either our Entity Framework model or our own class. During InitializeService we have the chance to set things up. This is where we configure operations, entity rights and so on. You can set entity set access on all entities ('*') or on each entity individually, and you specify what can be done to each via EntitySetRights. The rights are detailed here: http://msdn.microsoft.com/en-us/library/system.data.services.entitysetrights.aspx.
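
For illustration, here is what that template might look like once filled in; the ReportingEntities context and the Evangelists entity set are hypothetical stand-ins for whatever your model exposes, not the actual demo code:

public class Reporting : DataService<ReportingEntities>   // ReportingEntities = your EF model (or your own class)
{
    public static void InitializeService(DataServiceConfiguration config)
    {
        // Everything is readable; the Evangelists set also accepts writes.
        config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
        config.SetEntitySetAccessRule("Evangelists", EntitySetRights.All);
        config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
    }
}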

Next, operations.  If you need one, you have to tell the DataService about it; SetServiceOperationAccessRule is where you do so.  Let's take my demo example.

….

  config.SetServiceOperationAccessRule("SomeOperation", ServiceOperationRights.All);
….

Now for our actual operation:

[WebGet]
public IQueryable<SnowAlert> SomeOperation(string state)
{
    var events = from p in this.CurrentDataSource.WeatherAlerts
                 where p.State == state
                 select p;

    return events;
}

Of course you will most likely want to intercept requests.  There are two interceptors: QueryInterceptor and ChangeInterceptor.  Let's look at both.  On my Reporting service I had both interceptors.  While my implementation is hokey, it makes the point.  My QueryInterceptor will only return Evangelists with the name Clark Sell; you could imagine a scenario where you filtered results based on the caller's security profile.  My ChangeInterceptor fires for the Evangelists entity and looks at the change operation.  If the Name being added is Clark Sell, it throws an exception.

[QueryInterceptor("Evangelists")]
public Expression<Func<Evangelist, bool>> FilterEvangelists()
{
    return e => e.Name == "Clark Sell";
}
[ChangeInterceptor("Evangelists")]
public void OnChangeEvangelists(Evangelist e, UpdateOperations operation)
{
    if (operation == UpdateOperations.Add || operation == UpdateOperations.Change)
    {
        if (e.Name == "Clark Sell")
        {
            throw new DataServiceException(400, "Sorry not allowed.");
        }
    }
}

So I mentioned earlier, you don’t have to expose your Entity framework but your own object. That is true, but on your object you will have to expose IQueryable and IEditable. In my sample demo you can look at the WeatherAlertRepository.cs for a simple example.  Of course this is a place where your implementation will vary drastically.

Removing the SVC extension.

Extensions in URLs are just so 2005.  Luckily it's really not a big deal to get rid of them with ASP.NET. In Application_Start in our global.asax we can register a ServiceRoute, which adds support for extension-less base addresses.

var factory = new DataServiceHostFactory();
RouteTable.Routes.Add(
    new ServiceRoute("WeatherAlerts", factory, typeof(WeatherAlerts)));

PHEWWWW, what a whirlwind. Here are some great resources to check out:

My CVCC Demo Source:

You can find my demo code at: https://bitbucket.org/csell5/demos/src/tip/cvcc/.  When I present I try to do three things. 

  1. Limit the use of slides.  Why?  Well PowerPoint just isn’t what you live in.
  2. Keep it simple.  Solutions that are too “complicated” could dilute concepts.
  3. Make it more complicated than hello world.  Hello world is great in a lot of examples but the devil is always in the details.  I try to keep it real world. 

My demo was based on something I have been thinking about a lot, reporting.  Bottom line, this is a real world thing I am building.  There are four projects:

  1. DPE.Reporting.Web.Tests, it's my test project
  2. DPE.Reporting.Web, all of the web assets.  This will include the OData endpoints
  3. DPE.Reporting.Infrastructure, glue in the middle
  4. DPE.Reporting.Core, core stuff including the models.

image

At this point my data model is pretty incomplete but it’s enough.  In the demo we were dealing with the Evangelist Entity.

image

I also showed how you could just expose any type of object, not just an Entity Framework model.  Since it snowed that day I created a Weather Alerting Service.  It just seemed appropriate. It was comprised of three things:

  1. SnowAlert.cs, found in Core and is the basic data model.
  2. WeatherAlertsRepository.cs, found in Infrastructure.  This is a very cheap repository-like pattern to retrieve data. It's also the object that exposes the IQueryable for OData.
  3. WeatherAlerts.svc, found in the web project. This is the OData endpoint.

What’s next?

Well for me, it’s looking at the following:

  • Best practices around jQuery integration
  • PUT and POST techniques for complicated data structures


Pervasive Software posted on 11/17/2010 a Pervasive Software Offers OData Connectivity for DataMarket in Windows Azure Marketplace press release:

AUSTIN, Texas - (BUSINESS WIRE) - Pervasive Software (NASDAQ: PVSW), a global leader in cloud-based and on-premises data integration, today announced it has enriched its data innovation portfolio by providing direct connectivity to the Open Data Protocol, or OData (www.odata.org). OData is a Web protocol for querying and updating data. Microsoft includes it in the DataMarket on Microsoft's Windows Azure Marketplace.

By extending its industry-leading range of connectivity to include direct connectivity to OData, Pervasive now enables customers to create automated dataflows that connect OData directly into their existing applications, whether cloud-based or on-premises. Leveraging Pervasive Data Integrator, the offering enables back-end integration connectivity and tooling to intelligently pull content from any traditional repository, whether structured or unstructured. It also provides front-end integration connectivity and tooling to pull data directly from the DataMarket OData API and populate the full range of target applications, such as Microsoft Dynamics CRM and ERP, SAP, Salesforce CRM, NetSuite, etc.

"We are on the cusp of a revolution in making data pervasively available 'as a service,' with potential industry-changing impact on traditional content and data owners," said Mike Hoskins, Pervasive CTO and general manager, Integration Products. "Initiatives like Microsoft's DataMarket and their game-changing OData open data API show that the era of data-as-a-service is upon us, and the potential is there for exciting new revenue sources for millions of content stores. The combination of Microsoft and Pervasive ensures that companies can unlock the value in their data stores and offer attractive data-by-the-drink delivery models."

Going forward, OData connectivity will be part of Pervasive Data Integrator's extensive connectivity, continuing the company's commitment to giving all users access to the full library of Pervasive integration connectors as part of its standard offering, providing comprehensive connectivity to virtually anything, from legacy to SaaS applications, data files to databases, Web to mainframe. OData connectivity also enriches the recently released Pervasive Data Integrator v10, Cloud Edition.

"With this offering, Microsoft and Pervasive continue longstanding collaboration to deliver innovative technologies and initiatives that drive better business outcomes for our joint customers," said Moe Khosravy, product unit manager for DataMarket at Microsoft."We are excited that Pervasive OData connectivity will help allow developers and information workers to easily discover, purchase, and manage premium data subscriptions in the Windows Azure Marketplace."

"Pervasive continues to deliver powerful innovation leadership, in concert with partners and customers like Melissa Data, one of the Microsoft DataMarket launch partners already providing content on DataMarket in the Windows Azure Marketplace," said Geoji George, director of product management, Integration Products for Pervasive.

About Pervasive Software

Pervasive Software (NASDAQ: PVSW) helps companies get the most out of their data investments through agile and embeddable software and cloud-based services for data management, data integration, B2B exchange and analytics. The embeddable Pervasive PSQL database engine provides robust database reliability in a near-zero database administration environment for packaged business applications. Pervasive's multi-purpose data integration platform, available on-premises and in the cloud, accelerates the sharing of information between multiple data stores, applications, and hosted business systems and allows customers to re-use the same software for diverse integration scenarios. Pervasive DataRush is an embeddable parallel dataflow platform enabling data-intensive applications such as claims processing, risk analysis, fraud detection, data mining, predictive analytics, sales optimization and marketing analytics. For more than two decades, Pervasive products have delivered value to tens of thousands of customers in more than 150 countries with a compelling combination of performance, flexibility, reliability and low total cost of ownership. Through Pervasive Innovation Labs, the company also invests in exploring and creating cutting edge solutions for the toughest data analysis and data delivery challenges. Robin Bloor, Chief Research Analyst and President, The Bloor Group and Founder, Bloor Research, recently cited Pervasive as one of the "10 IT Companies to Watch in 2010." For additional information, go to www.pervasive.com.

Cautionary Statement

This release may contain forward-looking statements, which are made pursuant to the safe harbor provisions of the Private Securities Litigation Reform Act of 1995. All forward-looking statements included in this document are based upon information available to Pervasive as of the date hereof, and Pervasive assumes no obligation to update any such forward-looking statement.

All Pervasive brand and product names are trademarks or registered trademarks of Pervasive Software Inc. in the United States and other countries. All other marks are the property of their respective owners.


Lynn Langit (@llangit) offered a First Look - Windows Azure DataMarket on 11/16/2010:

I took a look at the Windows Azure DataMarket.  To do so, you'll need to sign in with a Windows Live ID first.


Next you’ll want to browse to the currently included datasets.  Many of them have (a limited number) of free queries to try out for free.  I selected the WTO Tourism Dataset for my first exploration.MarketplaceWeb

The site includes a data viewer, so that you can take a look at the dataset that you are interested in before you decide to work with it in your application.

Because this is a newly launched service, there are only a few datasets online currently.  When I was at the SQL PASS Summit in Seattle last week, I talked to several members of the product team for the Azure DataMarket and got an idea of our direction for this service going forward.


You can work with datasets on the site itself, or you can use other methods to explore the data. I started by using the recently released PowerPivot for Excel 2010 feature.  This feature is exposed via the 'From Azure DataMarket' button, which is added to the PowerPivot ribbon after you download and install the latest (free) add-in.


You’ll need to enter the URI (endpoint) for the service which you’ve signed up to try out into the Power Pivot dialog box as shown below.  This URI is found on the DataMarket website (Details tab) for the particular DataSet.  You will also need to enter your account key to complete the connection.  You find your account key on the DataMarket website MyData>Account Keys section.


After you’ve successfully connected then you can use all the goodness that is Power Pivot to analyze your data.  I took a look at tourism in Zambia for 2005 using a Pivot Table populated with the Power Pivot (imported) Windows Azure Data Market data.


You can also develop applications programmatically.  Here’s the MSDN sample code.  Also here’s a webcast from TechEd Europe 2010  (recorded in Nov in Berlin). …
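
For the programmatic route, here is a minimal sketch of reading a DataMarket feed (my own illustration, not the MSDN sample Lynn links to; the dataset URI is a placeholder, and the convention of passing the account key as the Basic-auth password is my assumption, so check the dataset's Details page for the exact values):

using System;
using System.IO;
using System.Net;

class DataMarketSample
{
    static void Main()
    {
        string serviceUri = "https://api.datamarket.azure.com/<publisher>/<dataset>/";   // placeholder
        string accountKey = "<your-account-key>";

        var request = (HttpWebRequest)WebRequest.Create(serviceUri);
        request.Credentials = new NetworkCredential("accountKey", accountKey);   // the key acts as the password

        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd());   // an Atom service document listing the dataset's collections
        }
    }
}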


Hani AbuHuwaij’s CreatePivot.com site helps you create Microsoft Pivot sites from OData:

Welcome to CreatePivot!

CreatePivot allows you to generate your own, customized Microsoft Pivot collection, and gives you the code to easily embed the result on your website or blog.

CreatePivot is very easy to use, provides lots of features and can easily maintain and update your collection.

What is Microsoft Pivot?

Microsoft Pivot is a new technology provided by the developers at Microsoft Live Labs, and aims to revolutionize the way we visualize data. With Microsoft Pivot, you can show huge amounts of data in an informative, interactive and customizable way.  It is "an interaction model that accommodates the complexity and scale of information rather than the traditional structure of the Web" [GetPivot.com].

At the heart of Microsoft Pivot are Pivot Collections. These collections contain the data you wish to visualize, and combine large groups of information from the internet. Microsoft Pivot's power lies in its ability to visualize small and huge amounts of data with no problems.

More information at GetPivot.com.

What is CreatePivot in all this?

CreatePivot helps you create Pivot Collections in a simple and powerful way.

Just create the criteria and categories that you wish to enter about your data, then you can create a static collection based on data you enter, or if you have the data in Excel, just copy and paste it! CreatePivot will do the rest for you!

Also, if you have an Open Data feed, create a view that combines the data you want in that feed, enter it in CreatePivot, and you'll have a dynamic Pivot Collection, with data taken from this feed!

Enjoy, and CreatePivot Now!...

If the Submit button isn’t enabled after completing the Sign In dialog, click another field to enable it. (Temporary issue.)

Here’s a capture of the preview for the Sessions collection (table) from TechEd Europe 2010’s OData feed:

[Screen capture missing.]

The Pivot view would have been more interesting if each session had a link to an image of its opening PowerPoint slide:

[Screen capture missing.]


<Return to section navigation list> 

Windows Azure AppFabric: Access Control and Service Bus

No significant articles today.


<Return to section navigation list> 

Windows Azure Virtual Network, Connect, and CDN

No significant articles today.


<Return to section navigation list> 

Live Windows Azure Apps, APIs, Tools and Test Harnesses

The Windows Azure Team reported NCBI BLAST on Windows Azure Helps Power Scientific Research on 11/16/2010:

Today at Supercomputing 2010 in New Orleans, Microsoft announced the release of the National Center for Biotechnology Information (NCBI) BLAST on Windows Azure, a new application that enables a broader community of scientists to combine desktop resources with the power of cloud computing for critical biological research. NCBI Basic Local Alignment Search Tool (BLAST) on Windows Azure provides a user-friendly Web interface and access to Windows Azure cloud computing for very large BLAST computations, as well as smaller-scale operations.

Bob Muglia, president, Server and Tools at Microsoft sums up the key benefit for researchers best, "NCBI BLAST on Windows Azure shows how the platform provides the genuine Platform-as-a-Service capabilities that technical computing applications need in order to extract insights from massive data and help solve some of the world's biggest challenges across science, business and government."

The NCBI BLAST on Windows Azure software is available from Microsoft at no cost, and Windows Azure resources are available at no charge to many researchers through Microsoft's Global Cloud Computing Research Engagement Initiative. To learn more about technical computing in the cloud, you should check out the latest posts from Bill Hilf, director of Platform Strategy at Microsoft, and Dan Reed, corporate vice president, Technology Policy and Strategy and eXtreme Computing Group at Microsoft.  Dan's post also includes a great video in which he discusses the tremendous opportunity of the cloud.


Igor Papirov posted How to dynamically scale your Windows Azure application in under 10 minutes to LinkedIn’s Windows Azure User Group on 11/10/2010 (missed when posted):

Part One - Introduction

This article is written for IT professionals who are interested in optimizing their Windows Azure cloud applications by dynamically adjusting compute resources to accommodate changes in demand - in real time.

The need for dynamic scaling is great: without it, your Azure applications will perform poorly when demand is unexpectedly high and waste a lot of money when demand is low.  You are charged for all allocated compute instances even if they are underutilized or not utilized at all.

There are a number of articles and examples available that allow one to start on the road to creating their own auto-scaling engine for Windows Azure.  The most important thing to know is that you will need to hook into the Service Management REST API. If you're looking to implement this yourself, I highly recommend reading this article by Neil Mackenzie.  It summarizes all the key information as well as provides links to all known open-source examples and articles on this topic.
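
To give a flavor of what that hooking-in involves, here is a minimal sketch (not AzureWatch's code) that reads a deployment's details over the Service Management REST API; scaling is then a matter of re-posting the returned service configuration, with a larger Instances count, to the same URI with ?comp=config appended. The subscription ID, service name and certificate thumbprint are placeholders:

using System;
using System.IO;
using System.Net;
using System.Security.Cryptography.X509Certificates;

class ServiceManagementSample
{
    static void Main()
    {
        string subscriptionId = "<subscription-id>";
        string serviceName = "<hosted-service-name>";
        string uri = string.Format(
            "https://management.core.windows.net/{0}/services/hostedservices/{1}/deploymentslots/production",
            subscriptionId, serviceName);

        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Headers.Add("x-ms-version", "2009-10-01");   // API version that supports Get Deployment

        // The same management certificate must be uploaded to the subscription in the developer portal.
        var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
        store.Open(OpenFlags.ReadOnly);
        request.ClientCertificates.Add(
            store.Certificates.Find(X509FindType.FindByThumbprint, "<thumbprint>", false)[0]);
        store.Close();

        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // The response XML includes the base64-encoded ServiceConfiguration (.cscfg),
            // whose <Instances count="..." /> elements control how many instances each role runs.
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}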

However, if your company would rather concentrate on developing its core business products and leave the scaling work to others, whose core competency is dynamic scaling in Azure, then read on.

By following instructions in this article, you will be able to start monitoring and auto-scaling your Windows Azure application in about 5-10 minutes by using the AzureWatch service.  The service is currently free while in public beta and is expected to be released in the very first days of 2011.

Part Two - Installation

At its core, AzureWatch aggregates and analyzes performance metrics and matches these metrics against user-defined rules. When a rule produces a "hit", a scaling action occurs.  The process is split between your on-premises computer, which is responsible for sending raw metric data to the AzureWatch servers, and the AzureWatch servers, which are responsible for aggregating large amounts of metric data and deciding when your compute instances need to be scaled. The actual scale actions originate from your local computer.

You will need an account to install and use AzureWatch.  Follow this link to fill out a simple registration form.  After registration, download links will be provided.

Part Three - Start Control Panel

After installation is complete, start AzureWatch ControlPanel and login with your newly created account.  You will be presented with a wizard to enter your Azure connection information.

[Control Panel figure missing.]

Subscription ID can be found on your Windows Azure developer portal.  If you do not already have the X.509 certificate, AzureWatch can create one for you.  What's important to point out is that AzureWatch needs your X.509 certificate to be located in the LocalSystem\My certificate store.  If you already have a certificate on your computer, chances are that it is located in Personal\My certificate store.  In order to copy it from one store to another, you will need to export your X.509 certificate from Personal\My store into .pfx file and then import it into LocalSystem\My store.  Alternatively, you can choose to create a new certificate that will be installed into LocalSystem\My certificate store automatically, and you will only have to upload it to your Windows Azure account.

Please visit AzureWatch page to understand how your certificates and storage keys are kept secure.

After entering your account SubscriptionID and specifying a valid X.509 certificate, press Connect to Azure.  You will be presented with a list of storage accounts.  The storage account that is monitored by your Diagnostics Monitor is required.

On the next wizard page you can validate default settings for such things as throttle times, notification email, etc.

After the connection wizard is completed, AzureWatch will figure out what services, deployments and roles are present.  For each role found, you will be offered a chance to create simple predefined rules.

The two sample rules offered are simple rules that rely upon calculating a 60-minute average CPU usage across all instances within a Role.  We will come back to these rules in a short while.  For now, wizards need to be completed.

Part Four - First time in Control Panel

After wizards complete, you are presented with a dashboard screen.  It likely contains empty historical charts as no data has yet been collected.  Navigation Explorer on the left shows various parameters that can be customized, while Instructions tab on the right shows context-sensitive instructions.

It is a good idea to visit the Rules section to see the rules that have been defined by the wizard.  Two sample rules should be present and can be edited by double-clicking on each.  The Rule Edit screen is simple yet powerful.  You can specify what formula needs to be evaluated, what happens when the evaluation returns TRUE, and what time of day evaluation should be restricted to.  To make formula entry easier, a list of already-defined aggregated metrics is provided.  Hovering over the formula box will display the allowed operands.

One last place to visit before starting the monitoring process is the screen that contains safety limits for the number of instances that AzureWatch can scale up to or down to.  By clicking on the appropriate Role name in the navigation explorer, you will be presented with a chance to modify these boundaries.

This is it.  If you are ready, press "Publish Changes" button.  Provided your AzureWatch Monitor service is running, it will pick up these configuration settings in the next iteration of its internal loop and instruct Azure Diagnostics Manager to start capturing the metrics required for formulas to work.  Windows Azure will need a few minutes to instruct your instances to start capturing those metrics afterwards, and then a few minutes more before these metrics will be transferred to your storage.  Thus, give AzureWatch at least 5-10 minutes before expecting to see anything on the Dashboard screen.

Part Five - A few tips and tricks

Some things to keep in mind while using AzureWatch

If you just started using AzureWatch and have not accumulated enough metric data, evaluation of your rules may be suspect as your aggregations will lack sufficient data.  It may be prudent to disable scaling inside rules in the beginning so that scaling actions do not trigger unexpectedly.

Metric transfer occurs only while Monitor service is running.  If you stopped Monitor service for an hour and then restarted it, it does not "go back" and send the missing hour's worth of metrics to AzureWatch.

AzureWatch will always instruct your instances to capture metrics that are defined in the Raw Metrics screen.  You do not need to do anything special with existing or newly started instances.  It may be worthwhile, however, to visit the System Settings screen to further configure how metric enforcement and gathering works.

AzureWatch will send a notification when it scales your instances up or down.  In it, it will provide values for all the aggregated metrics it knows about to help you understand why the scaling event occurred.

Since AzureWatch is a hybrid SaaS, the two Windows components must always be up to date in order to properly connect to the remote service.  Therefore, both of the programs automatically self-update whenever a new version is released.

I suppose the Windows Azure team is working on something similar.


<Return to section navigation list> 

Visual Studio LightSwitch

Nilotpal Das (@_nilotpaldas) offers suggestions for improving Visual Studio LightSwitch in his As easy as switching on a light… Or is it…? post of 11/17/2010:

I have been working on a [number] of personal projects. A website for my wife's NGO xomidhan, an application that I am calling Life Manager, and a few others. All of them are a work in progress and I have to admit with a pinch of salt that the progress is quite slow and User Interface is my bane.

Now Microsoft LightSwitch was a solution to my UI problems, at least when it came to really, really small database-driven business applications. But there are still a few problems with that technology. First, let me state that it is still in beta, and all that I say after this is positive criticism. I am dead sure that the final product will be a breakthrough in the "Super Duper High Speed Small Business Applications Development" (RAD SDHSSBAD) arena.

My Issues with LightSwitch

  • If you build a database model from scratch (the alternative to attaching your application to an existing database), LightSwitch does not allow you to choose a database and creates it in SQL Express. Now I understand that Microsoft has made a disclaimer that LightSwitch is still in beta and should not be used in a production environment yet. But I still think that this is an absolutely fundamental requirement that Microsoft should have taken care of even before its alpha release (if there was one). I decide which database my tables should exist in, period.

  • If you attach your application to an existing database, there are a [number] of features that go missing, and that's not fair. First, it is not consistent behavior, and second, what if I want to build a database first and then build the application based on it? I don't want to go the model-driven development route. Why shouldn't it enable me?

  • Content-based web applications cannot be conveniently built using LightSwitch. Now I completely understand that this is a lightweight SDHSSBAD environment, and if you added everything you wanted it would become a full-fledged Visual Studio 2010 Ultimate environment. And I agree. But you also have to agree that most business applications need some static content in their screens: content that describes their product or service. So if you are keeping bare essentials, you should keep this.

  • Oh, and by the way, Beth Massi might want to revisit her tutorial videos. As amazing as they are, they cannot all be recreated after Video #6 due to the SQL Express error about not recognizing the DateTime data type.

There are a few others, but they are superficial; if Microsoft is looking for feedback, this is my 2 pence worth. I am really looking forward to it and can't wait to get my hands on the real (RTM) thing.

Nilotpal’s objections are similar to those developers raised for Entity Framework v1, which were corrected in EF v4, its second iteration. However, I doubt if the VS team will enable connections to databases other than SQL Server, SQL Azure, or Microsoft Access by a free LightSwitch framework.


Spurlock Solutions updated on 11/12/2010 the LightSwitch SQL Server Reporting Services Control project in the MSDN Code Gallery:

Resource Page Description
This user control contains a WebBrowser control. Using several properties allows for embedding access to the web pages served by the ReportServer, and pulls from properties in the ScreenObject to set parameters on a SQL Server Reporting Services report.

This is the second release of a very simple control. There will hopefully be enhancements to come in the future.
Steps and Notes

  1. Install the Extension included in the downloads.
  2. Add Group and change it to SSRS Viewer.
  3. Set the ReportServerUrl and the ReportPage properties to a report you wish to use.
  4. Add any parameters as local properties to the screen giving them the name of the parameter.
  5. The code snippet can be placed into the Button_Execute code on the screen in the LightSwitch project.
Feel free to add comments in the discussion, and if you find it helpful, please vote it as helpful on the LightSwitch General Beta Forum.


<Return to section navigation list> 

Windows Azure Infrastructure

John Treadway (@cloudbzz) asked If You’ve Never Used A Cloud, Can You Call Yourself An Expert? in this 11/17/2010 post to his CloudBzz blog:

A recurring challenge I have with a lot of enterprise vendor "cloud" solutions I get briefed on is that they seem to be designed and built without any real understanding of how and why customers are actually using the cloud today.  I suspect in most cases that this results from the fact that the people building these solutions have NEVER EVER used Amazon, Rackspace, or any other mainstream public cloud offering.

Chris Hoff points out his suspicion of this scenario in his frank assessment of the recently released FedRAMP documentation.

I’m unclear if the folks responsible for some of this document have ever used cloud based services, frankly.

When you gather together a group of product managers, architects, developers and self-styled strategists who have never used a public cloud, and ask them to design a cloud solution, more often than not their offering will not be a cloud solution (or any other kind of solution that customers want).  It’s not that these people are lacking in intelligence.  Rather, they lack the context provided through experience.  Oh, and many large enterprise vendors suck at the very basics of the “customer development process.”  So not only will their solution not be cloudy, it will be released to the market without them knowing this basic piece of information.


Simon May recommended on 11/17/2010 Mark Russinovich[‘s] Inside Windows Azure session at TechEd Europe 2010:


I was just thumbing around the Sysinternals site (for some handy tools for a deployment I'm doing) and happened across Mark's PDC presentation, which he re-ran at TechEd Europe last week and which I happened to be at.  It's a fantastic way for the IT professional to understand some of the (very) interesting back end of Windows Azure: how it works, what it does, what the fabric is, how it heals, how it deploys, how it all bolts together.  When you're done watching the vid, go get an Azure trial.

Simon May just joined Microsoft as the new IT Pro evangelist on the block, specializing in client side technologies, looking at deployment and productivity.


Nicole Hemsoth published her Easy HPC in the Big Easy: An SC10 Interview with Bill Hilf to the HPC in the Cloud blog on 11/17/2010:

During SC10 in New Orleans this week, our editor spent an hour with Bill Hilf to discuss a wide range of topics, including Microsoft's Azure cloud offering, both in terms of some recent newsworthy enhancements and the announcement of a certain other major public cloud that now boasts GPU capabilities. This led to discussions about performance, job scheduling requirements for hosting compute-intensive and HPC applications in a cloud environment and more general topics related to the company's strategy as the "other" public cloud continues to evolve, albeit via a different course. We'll be bringing more details from this chat as the week goes on...

Microsoft's Technical Computing Group, which focuses on HPC, parallel and cloud computing, has been evolving of late, a fact that is due in large part to input from its general manager, Bill Hilf, and his belief that the only way to broaden HPC access is to make access to high-performance computing applications and resources as easy as filling in rectangles in an Excel spreadsheet.

Ultimate abstraction of complexity might strike some of you as unrealistic. The thought that your applications can somehow be negotiated and abstracted to such a high level that they require little more than data entry does seem far-fetched, but clearly, for Microsoft, the effort to make this a reality is not simply a priority so they can better engage that elusive missing middle of HPC users—it's the key to their survival in the HPC space.

In Hilf’s view, technical computing users are going to form the backbone of Azure, hence the focus on HPC applications in any number of the company’s cloud-related announcements.

This includes, for example, the news today that BLAST had been ported to their cloud and was being offered "free" (which is good since it's really free to begin with) to users with Azure accounts. We'll get to that item in a moment, but for now, back to how Bill Hilf wants to destroy HPC… or at least the weight of that acronym… in other words, by making it synonymous with computing in general.

"It goes far beyond building operating systems; it’s about building end user tools; it’s about making it all seamless like we did recently with BLAST. We ported it to Azure, which was good, but there was still a lot of this that was really difficult. Like, how do you go and distribute all of this across Azure? And what is Azure then exactly? And then how do you track progress when it’s thousands and thousands of cores and any of this could be anywhere since it’s a global OS. Really, your job could be running anywhere; in Shnghai or elsewhere—so how do you track it or get one answer back across thousands of machines?"

Easing into Old Models

As Bill Hilf noted, a couple of years ago it became clear that Microsoft's effort to become a major player in the HPC server space was not working as envisioned, so a shift in ideology was necessary—a shift that actually brought Microsoft right back to where it got its start so long ago: removing complexity by taking vastly complicated programming and hiding it under a seamless veneer of usability.

That veneer has been so seamless that we can all too often forget completely what lies behind that Excel spreadsheet or, for that matter, the Word document that the first draft of this article was created on. Here's the idea, though, and it does go beyond removing complexity and adding an intuitive UI… By taking such steps to deliver complex applications to the masses via these smooth user interfaces and focusing on ease of use above all, what we consider to be powerful applications (the "we" is loose and general here) are no longer necessarily perceived as powerful because they've become ubiquitous.

So more specifically, Hilf is saying, “we want to eventually make HPC, that acronym, meaningless” in the sense that users, even highly technical users, will no longer consider their applications in the context of high-performance or general purpose—or anything. It will all simply become computation. Plain and simple.

Read more: Page 1 of 3.


<Return to section navigation list> 

Windows Azure Platform Appliance (WAPA) and Hyper-V Cloud


No significant articles today.


<Return to section navigation list> 

Cloud Security and Governance

Lori MacVittie (@lmacvittie) asserted Three shall be the number thou shalt count, and the number of the counting shall be three. If you’re concerned about maintaining application availability, then these three rules of thumb shall be the number of the counting. Any less and you’re asking for trouble in an introduction to her The Number of the Counting Shall be Three (Rules of Thumb for Application Availability) post to F5’s DevCentral blog of 11/17/2010:


(I like to glue animals to rocks and put disturbing amounts of electricity and saltwater NEXT TO EACH OTHER.)

Last week I was checking out my saltwater reef when I noticed water lapping at the upper edges of the tank. Yeah, it was about to overflow. Somewhere in the system something had failed. Not entirely, but enough to cause the flow in the external sump system to slow to a crawl and the water levels in the tank to slowly rise.

Troubleshooting that was nearly as painful as troubleshooting the cause of application downtime. As with a data center, there are ingress ports and egress ports and inline devices (protein skimmers) that have their own flow rates (bandwidth) and gallons per hour processing capabilities (capacity) and filtering (security). When any one of these pieces of the system fails to perform optimally, well, the entire system becomes unreliable, unstable, and scary as hell. Imagine a hundred or so gallons of saltwater (and all the animals inside) floating around on the floor. Near electricity.

The challenges to maintaining availability in a marine reef system are similar to those in an application architecture. There are three areas you really need to focus on, and you must focus on all three because failing to address any one of them can cause an imbalance that may very well lead to an epic fail. 

RELIABILITY

Reliability is the cornerstone of assuring application availability. If the underlying infrastructure – the hardware and software – fails, the application is down. Period. Any single point of failure in the delivery chain – from end-to-end – can cause availability issues. The trick to maintaining availability, then, is redundancy. It is this facet of availability where virtualization most often comes into play, at least from the application platform / host perspective. You need at least two instances of an application, just in case. Now, one might think that as long as you have the capability to magically create a secondary instance and redirect application traffic to it if the primary application host fails that you’re fine. You’re not. Creation, boot, load time…all impact downtime and in some cases, every second counts. The same is true of infrastructure. It may seem that as long as you could create, power up, and redirect traffic to a virtual instance of a network component that availability would be sustained, but the same timing issues that plague applications will plague the network, as well. There really is no substitute for redundancy as a means to ensure the reliability necessary to maintain application availability. Unless you find prescient, psychic components (or operators) capable of predicting an outage at least 5-10 minutes before it happens. Then you’ve got it made.

Several components are often overlooked when it comes to redundancy and reliability. In particular, internet connectivity is often ignored as a potential point of failure or, more often the case, it is viewed as one of those “things beyond our control” in the data center that might cause an outage. Multiple internet connections are expensive, understood. That’s why leveraging a solution like link load balancing makes sense. If you’ve got multiple connections, why not use them both and use them intelligently – to assist in efforts to maintain/improve application performance or prioritize application traffic in and out of the data center. Doing so allows you to assure availability in the event that one connection fails, yet the connection never sits idle when things are all hunky dory in the data center.

The rule of thumb for reliability is this: Like Sith lords, there should always be two of everything with automatic failover to the secondary if the primary fails (or is cut down by a Jedi knight).
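To make that rule concrete, here is a minimal sketch of health-checked failover between a primary and a secondary instance. The host names, ports, and thresholds are hypothetical assumptions, and a real deployment would replace the print statement with a DNS, virtual IP, or load-balancer pool update rather than this toy loop.

```python
import socket
import time

# Hypothetical endpoints: the primary and secondary instances of an
# application or infrastructure component.
PRIMARY = ("app-primary.example.com", 443)
SECONDARY = ("app-secondary.example.com", 443)
CHECK_INTERVAL = 5      # seconds between health checks
FAILURE_THRESHOLD = 3   # consecutive failures before failing over


def is_healthy(endpoint, timeout=2.0):
    """Return True if a TCP connection to the endpoint succeeds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def monitor():
    active, standby = PRIMARY, SECONDARY
    failures = 0
    while True:
        if is_healthy(active):
            failures = 0
        else:
            failures += 1
            if failures >= FAILURE_THRESHOLD:
                # Promote the standby; a real deployment would update DNS,
                # a virtual IP, or the load-balancer pool here.
                active, standby = standby, active
                failures = 0
                print(f"Failing over to {active[0]}")
        time.sleep(CHECK_INTERVAL)


if __name__ == "__main__":
    monitor()
```

The point of the sketch is simply that the second instance already exists; the monitor only redirects traffic, it never waits on creation or boot time.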

CAPACITY

The most common cause of downtime is probably a lack of capacity. Whether it’s due to a spike in usage (legitimate or not) or simply unanticipated growth over time, a lack of compute resources available across the application infrastructure tiers is usually the cause of unexpected downtime. This is certainly one of the drivers for cloud computing and rapid provisioning models – external and internal – as it addresses the immediacy of need for capacity upon availability failures. This is particularly true in cases where you actually have the capacity – it just happens to reside physically on another host. Virtualization and cloud computing models allow you to co-opt that idle capacity and give it to the applications that need it, on-demand. That’s the theory, anyway. Reality is that there are also timing issues around provisioning that must be addressed but these are far less complicated and require fewer psychic powers than predicting total failure of a component. Capacity planning is as much art as science, but it is primarily based on real numbers that can be used to indicate when an application is nearing capacity. Because of this predictive power of monitoring and data, provisioning of additional capacity can be achieved before it’s actually needed.

Even without automated systems for provisioning, this method of addressing capacity can be leveraged – the equations for when provisioning needs to begin simply change based on the amount of time needed to manually provision the resources and integrate them with the scalability solution (i.e., the load balancer or application delivery controller).
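A minimal sketch of that arithmetic follows, assuming a crude linear growth estimate from recent monitoring samples; the numbers, the linear model, and the function name are illustrative assumptions, not part of the original post.

```python
def must_provision_now(samples, capacity, lead_time_minutes):
    """
    Decide whether provisioning has to start now.

    samples: (minutes_ago, requests_per_second) monitoring points, most
             recent first, e.g. [(0, 820), (30, 760), (60, 700)].
    capacity: requests per second the current footprint can sustain.
    lead_time_minutes: how long it takes to bring new capacity online and
                       integrate it with the load balancer / ADC.
    """
    (t_new, load_new), (t_old, load_old) = samples[0], samples[-1]
    minutes = t_old - t_new
    growth_per_minute = (load_new - load_old) / minutes if minutes else 0.0

    # Projected load at the moment new capacity could actually be ready.
    projected = load_new + growth_per_minute * lead_time_minutes
    return projected >= capacity, projected


# Example: 820 req/s now, 700 req/s an hour ago, 1,000 req/s of headroom,
# and 90 minutes to provision and integrate additional capacity.
start_now, projected = must_provision_now(
    samples=[(0, 820), (30, 760), (60, 700)],
    capacity=1000,
    lead_time_minutes=90,
)
print(f"Projected load when new capacity arrives: {projected:.0f} req/s; start now: {start_now}")
```

With these illustrative numbers the answer is already “start now,” which is exactly the point of the rule of thumb that follows.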

The rule of thumb for capacity is this: Like interviews and special events, unless you’re five minutes early provisioning capacity you’re late.

SECURITY

Security – or lack thereof – is likely the most overlooked root cause of availability issues, especially in today’s hyper-connected environments. Denial of service attacks are just that, an attempt to deny service to legitimate users, and they are getting much harder to detect because they’ve been slowly working their way up the stack. Layer 7 DDoS attacks are particularly difficult to ferret out as they don’t even have to be “fast”; they just have to chew up resources.

Consider the latest twist on the SlowLoris attack; the attack takes the form of legitimate POST requests that s-l-o-w-l-y feed data to the server, in a way that consumes resources but doesn’t necessarily set off any alarm bells because it’s a completely legitimate request. You don’t even need a lot of them, just enough to consume all the resources on web/application servers such that no one else can utilize them. Leveraging a full proxy intermediary should go quite a ways to mitigate this situation because the request is being fed to the intermediary, not the web/application servers, and the intermediary generally has more resources and is already well versed in dealing with very slow clients. Resources are not consumed on the actual servers and it would take a lot (generally hundreds of thousands to millions) of such requests to consume the resources on the intermediary. Such an attack works because the miscreants aren’t using many connections; to take out a site front-ended by such an intermediary, they would likely need enough connections to trigger an alert or notification anyway.

Disclaimer: I have not tested such a potential solution so YMMV. In theory, based on how the attack works, the natural offload capabilities of ADCs should help mitigate this attack.
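For illustration only, here is one way an intermediary could flag the slow-POST pattern described above: enforce a minimum transfer rate on request bodies and drop connections that stay below it. The thresholds and the class are assumptions for this sketch, not the behavior of any particular ADC or proxy product.

```python
import time

# Hypothetical thresholds: a legitimate client posting a form body should
# comfortably exceed a few hundred bytes per second after a short grace period.
MIN_BYTES_PER_SECOND = 512
GRACE_PERIOD_SECONDS = 10


class SlowPostGuard:
    """Track per-connection upload rates and flag suspiciously slow bodies."""

    def __init__(self):
        self.connections = {}  # conn_id -> (start_time, bytes_received)

    def on_body_chunk(self, conn_id, chunk_size):
        start, received = self.connections.get(conn_id, (time.monotonic(), 0))
        received += chunk_size
        self.connections[conn_id] = (start, received)

        elapsed = time.monotonic() - start
        if elapsed <= GRACE_PERIOD_SECONDS:
            return "allow"          # too early to judge the transfer rate
        rate = received / elapsed
        return "allow" if rate >= MIN_BYTES_PER_SECOND else "drop"

    def on_close(self, conn_id):
        self.connections.pop(conn_id, None)


# Usage: the front-end proxy calls on_body_chunk() as bytes arrive and tears
# down any connection for which it returns "drop".
guard = SlowPostGuard()
print(guard.on_body_chunk("client-42", 1024))  # "allow" during the grace period
```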

But I digress; the point is that security is one of the most important facets of maintaining availability. It isn’t just about denial of service attacks, either, or even consuming resources. A well-targeted injection attack or defacement can cripple the database or compromise the web/application behavior such that the application no longer behaves as expected. It may respond to requests, but what it responds with is just as vital to “availability” as responding at all. As such, ensuring the integrity of application data and applications themselves is paramount to preserving application availability.

The rule of thumb for security is this: If you build your security house out of sticks, a big bad wolf will eventually blow it down.

Assuring application availability is a much more complex task than just making sure the application is running. It’s about ensuring enough capacity exists at the right time to scale on demand; it’s about ensuring that if any single component fails, another is in place to take over; and it’s absolutely about ensuring that a lackluster security policy doesn’t result in a compromise that leads to failure. These three components are critical to the success of availability initiatives, and failing to address any one of them can cause the entire system to fail.


Lydia Leong (@cloudpundit) posted Amazon, ISO 27001, and some conference observations on 11/16/2010:

Greetings from Gartner’s Application Architecture, Development, and Integration Summit. There are around 900 people here, and the audience is heavy on enterprise architects and other application development leaders.

One of the common themes of my interaction here has been talking to an awful lot of people who are using or have used Amazon for IaaS. They’re a different audience than the typical clients I talk to about the cloud, who are generally IT Operations folks, IT executives, or Procurement folks. The audience here is involved in assessing the cloud, and in adopting the cloud in more skunkworks ways — but they are generally not ultimately the ones making the purchasing decisions. Consequently, they’ve got a level of enthusiasm about it that my usual clients don’t share (although it correlates with the reported enthusiasm they know their app dev folks have for it). Fun conversations.

So on the heels of Amazon’s ISO 27001 certification, I thought it’d be worth jotting down a few thoughts about Amazon and the enterprise.

To start with, SAS 70 Is Not Proof of Security, Continuity or Privacy Compliance (Gartner clients only). As my security colleagues Jay Heiser and French Caldwell put it, “The SAS 70 auditing report is widely misused by service providers that find it convenient to mischaracterize the program as being a form of security certification. Gartner considers this to be a deceptive and harmful practice.” It certainly is possible for a vendor to do a great SAS 70 certification — to hold themselves to best practices and have the audit show that they follow them consistently — but SAS 70 itself doesn’t require adherence to security best practices. It just requires you to define a set of controls, and then demonstrate you follow them.

ISO 27001, on the other hand, is a security certification standard that examines the efficacy of risk management and an organization’s security posture, in the context of ISO 27002, which is a detailed security control framework. This certification actually means that you can be reasonably assured that an organization’s security controls are actually good, effective ones.

The 27001 cert — especially meaningful here because Amazon certified its actual infrastructure platform, not just its physical data centers — addresses two significant issues with assessing Amazon’s security to date. First, Amazon doesn’t allow enterprises to bring third-party auditors into its facilities and to peer into its operations, so customers have to depend on Amazon’s own audits (which Amazon does share under certain circumstances). Second, Amazon relies on a lot of security secret sauce, implementing things in ways that differ from the norm — for instance, Amazon claims to provide network isolation between virtual machines, but unlike the rest of the world, it doesn’t use VLANs to achieve this. Getting something like ISO 27001, which is prescriptive, hopefully offers some assurance that Amazon’s controls are effective and auditable.

A lot of people like to tell me, “Amazon will never be used by the enterprise!” Those people are wrong (and are almost always shocked to hear it). Amazon is already used by the enterprise — a lot. Not necessarily always in particularly “official” ways, but those unofficial ways can sometimes stack up to pretty impressive aggregate spend. (Some of my enterprise clients end up being shocked by how much they’re spending, once they total up all the credit cards.)

And here’s the important thing: The larger the enterprise, the more likely it is that they use Amazon, to judge from my client interactions. (Not necessarily as their only cloud IaaS provider, though.) Large enterprises have people who can be spared to go do thorough evaluations, and sit on committees that write recommendations, and decide that there are particular use cases that they allow, or actively recommend, Amazon for. These are companies that assess their risks, deal with those risks, and are clear on what risks they’re willing to take with what stuff in the cloud. These are organizations — some of the largest global companies in the world — for whom Amazon will become a part of their infrastructure portfolio, and they’re comfortable with that, even if their organizations are quite conservative.

Don’t underestimate the rate of change that’s taking place here. The world isn’t shifting overnight, and we’re going to be looking at internal data centers and private clouds for many years to come, but nobody can afford to sit around smugly and decide that public cloud is going to lose and that a vendor like Amazon is never going to be a significant player for “real businesses”.

One more thing, on the subject of “real businesses”: All of the service providers who keep telling me that your multi-tenant cloud isn’t actually “public” because you only allow “real businesses”, not just anyone who can put down a credit card? Get over it. (And get extra-negative points if you consider all Internet-centric companies to not be “real businesses”.) Not only isn’t it a differentiator, but customers aren’t actually fooled by this kind of circumlocution, and the guys who accept credit cards still vet their customers, albeit in more subtle ways. You’re multi-tenant, and your customers aren’t buying as a consortium or community? Then you’re a public cloud, and to claim otherwise is actively misleading.

See my comments to the Jeff Barr reported AWS Receives ISO 27001 Certification article in Windows Azure and Cloud Computing Posts for 11/16/2010+ about Microsoft’s ISO 27001 certifications for their data centers.

Lydia posted Amazon, ISO 27001, and a correction on 11/17/2010 at about 2:00 PM PST:

FlyingPenguin has posted a good critique of my earlier post about Amazon’s ISO 27001 certification.

Here’s a succinct correction:

To quote Wikipedia, ISO 27001 requires that management:

  • Systematically examine the organization’s information security risks, taking account of the threats, vulnerabilities and impacts;
  • Design and implement a coherent and comprehensive suite of information security controls and/or other forms of risk treatment (such as risk avoidance or risk transfer) to address those risks that are deemed unacceptable; and
  • Adopt an overarching management process to ensure that the information security controls continue to meet the organization’s information security needs on an ongoing basis.

ISO 27002, which details the security best practices, is not required to be used in conjunction with 27001, although this is customary. I forgot this when I wrote my post (when I was reading docs written by my colleagues on our security team, which specifically recommend the 27001 approach, in the context of 27002).

In other words: 27002 is prescriptive in its controls; 27001 is not that specific.

So FlyingPenguin is right — without the 27002, we have no idea what security controls Amazon has actually implemented.

Microsoft’s Frequently Asked Questions: Microsoft Online Services Risk Management, last updated 9/28/2009, discusses ISO 27002 directives starting on page 4:

Q: What security policies does Microsoft follow for Microsoft Online Services?

Microsoft Online Services Information Security Policy is based on ISO 27002 directives augmented with requirements specific to online services. (For example, Microsoft requires that all major Microsoft Online Services releases must undergo web penetration testing; any critical vulnerabilities discovered during such penetration testing must be resolved prior to releasing that service version to customers.) The Microsoft Online Services Information Security Policy also incorporates additional requirements derived from best-in-class security practices and a mapping of relevant international, national and state/provincial requirements.

ISO 27002 is part of the ISO/IEC 27000 family of standards, published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) and is the renamed, updated ISO 17799 standard. The full name of this international standard is, "Information technology - Security techniques - Code of Practice for Information Security Management."

The ISO 27000 standard is intentionally broad in scope, covering privacy, confidentiality and technical security issues and "established guidelines and general principles for initiating, implementing, maintaining, and improving information security management within an organization." To that end, the standard outlines hundreds of potential controls and control mechanisms. ISO 27000 was developed in the context of the following core principles:

“the preservation of confidentiality (ensuring that information is accessible only to those authorized to have access), integrity (safeguarding the accuracy and completeness of information and processing methods) and availability (ensuring that authorized users have access to information and associated assets when required).”

The topical domains addressed by ISO 27002 and the Microsoft Online Services Information Security Policy include:

  • Risk assessment
  • Security policy - management direction
  • Organization of information security - governance of information security
  • Asset management - inventory and classification of information assets
  • Human resources security - security aspects for employees joining, moving and leaving an organization
  • Physical and environmental security - protection of the computer facilities
  • Communications and operations management - management of technical security controls in systems and networks
  • Access control - restriction of access rights to networks, systems, applications, functions and data
  • Information systems acquisition, development and maintenance - building security into applications
  • Information security incident management - anticipating and responding appropriately to information security breaches
  • Business continuity management - protecting, maintaining and recovering business-critical processes and systems
  • Compliance - ensuring conformance with information security policies, standards, laws and regulations


<Return to section navigation list> 

Cloud Computing Events

Michael Krigsman posted Defrag: Cloud economics, cultural change, and other exciting stories to the Enterprise Irregulars blog on 11/16/2010:

Defrag, which runs this week near Denver, is one of the more interesting conferences on the enterprise circuit. Going far beyond bits and bytes, Defrag explores the intersection of technology with business, society, and the enterprise as a whole.

As part of the Enterprise Irregulars track at this year’s Defrag, I am honored to moderate a panel discussion on the impact of cloud computing. The panel will explore what the cloud means to various groups: ordinary end-users, the IT department, software vendors (including traditional on-premise vendors), and even large professional services organizations.

Aside from me, the panel includes the following participants:

  • Steve Mann, a strategy consultant and formerly an executive with SAP
  • John Taschek, an executive with Salesforce.com who reports directly to CEO Marc Benioff
  • Sadagopan Singam, an executive with Mahindra Satyam, one of the largest Indian IT outsource firms
  • Mitch Lieberman, an analyst / consultant and previously Vice President of Marketing at open source cloud vendor SugarCRM

The panel begins with a simple premise: to typical users, cloud is a meaningless term related to data centers. However, despite this apparent invisibility to ordinary folks, the cloud is driving profound changes in computing, the enterprise, and even the social landscape.


A recently released white paper, The Economics of the Cloud (PDF), from Microsoft’s strategy team describes the economic basis for these changes. The cloud offers economies of scale that make possible cultural impacts, such as those associated with Facebook and Twitter. And, when subscription software pricing is combined with the cloud, there is disruption to business models associated with IT vendors, services providers, and even the IT ecosystem as a whole.

The Microsoft paper connects the dots between technology, economics, and disruption:

Economics are a powerful force in shaping industry transformations. Today’s discussions on the cloud focus a great deal on technical complexities and adoption hurdles. While we acknowledge that such concerns exist and are important, historically, underlying economics have a much stronger impact on the direction and speed of disruptions, as technological challenges are resolved or overcome through the rapid innovation we’ve grown accustomed to….

The emergence of cloud services is again fundamentally shifting the economics of IT. Cloud technology standardizes and pools IT resources and automates many of the maintenance tasks done manually today. Cloud architectures facilitate elastic consumption, self-service, and pay-as-you-go pricing.

The paper goes on to discuss the broader impact of cloud economics on innovation:

Many IT leaders today are faced with the problem that 80% of the budget is spent on keeping the lights on, maintaining existing services and infrastructure. This leaves few resources available for innovation or addressing the never-ending queue of new business and user requests. Cloud computing will free up significant resources that can be redirected to innovation. Demand for general purpose technologies like IT has historically proven to be very price elastic. Thus, many IT projects that previously were cost prohibitive will now become viable thanks to cloud economics.

I spoke with Michael Yamartino, from Microsoft’s Corporate Strategy Group and one of the paper’s authors, who explained his view that cloud economics suggest a force that is “inevitable.” Perhaps needless to say, Michael and I agreed wholeheartedly on this point.

As something of a side note to the core issue of impact, but still highly relevant, fellow ZDNet blogger, Phil Wainewright, explains that economics will eventually diminish private clouds, with a corresponding increase in innovation:

There’s a virtuous cycle here, of course, in that public clouds are already more cost-effective as platforms for innovation, so there is going to be more innovation happening here than on private clouds anyway. That innovation will help to further accelerate the evolution of public clouds, thus amplifying their economic advantage more rapidly and to a greater extent than even Microsoft’s strategy team have envisaged.

What do you think about cloud economics and the impact on computing, the enterprise, and society as a whole?


<Return to section navigation list> 

Other Cloud Computing Platforms and Services

James Hamilton [pictured below] condenses a description of GPGPU computing in his GPU Clusters in 10 Minutes post of 11/17/2010:

Earlier this week Clay Magouyrk sent me a pointer to some very interesting work: A Couple More Nails in the Coffin of the Private Compute Cluster: Benchmarks for the Brand New Cluster GPU Instance on Amazon EC2.

This detailed article presents benchmark results from runs on [Amazon Web Services’] new Cluster GPU Instance type and leads in with:

During the past few years it has been no secret that EC2 has been the best cloud provider for massive scale, but loosely connected scientific computing environments. Thankfully, many workflows we have encountered have performed well within the EC2 boundaries. Specifically, those that take advantage of pleasantly parallel, high-throughput computing workflows. Still, the AWS approach to virtualization and available hardware has made it difficult to run workloads which required high bandwidth or low latency communication within a collection of distinct worker nodes. Many of the AWS machines used CPU technology that, while respectable, was not up to par with the current generation of chip architectures. The result? Certain use cases simply were not a good fit for EC2 and were easily beaten by in-house clusters in benchmarking that we conducted within the course of our research. All of that changed when Amazon released their Cluster Compute offering.

The author goes on to run the Scalable HeterOgeneous Computing (SHOC) benchmark suite, compares EC2 with native performance, and concludes:

“With this new AWS offering, the line between internal hardware and virtualized, cloud-based hardware for high performance computing using GPUs has indeed been blurred.”

Finally, a run with a Cycle Computing customer workload:

“Based on the positive results of our SHOC benchmarking, we approached a Fortune 500 Life Science client and a Finance/Insurance client who develop and use their own GPU-accelerated software, to run their applications on the GPU-enabled Cluster Compute nodes. Both applications perform a large number of Monte Carlo simulations for a given set of initial data, all pleasantly parallel. The results, similar to the SHOC result, were that the EC2 GPU-enabled Cluster Compute nodes performed as well as, or better than, the in-house hardware maintained by our clients.”

Even if you only have a second, give the results a scan: http://blog.cyclecomputing.com/2010/11/a-couple-more-nails-in-the-coffin-of-the-private-compute-cluster-gpu-on-cloud.html.
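As a toy illustration of why such Monte Carlo workloads scale so readily across cloud instances or GPU nodes, here is a minimal sketch in which every simulation task is completely independent. It uses Python’s multiprocessing purely to stay self-contained; it is not Cycle Computing’s or their clients’ code, and a GPU version would map the same independent tasks onto device threads instead of worker processes.

```python
import random
from multiprocessing import Pool


def simulate(seed, trials=1_000_000):
    """One independent Monte Carlo task: estimate pi by random sampling."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(trials)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / trials


if __name__ == "__main__":
    # Each worker gets its own seed and never talks to the others, which is
    # what makes the workload "pleasantly parallel" and a natural fit for a
    # fleet of cloud instances or GPU-enabled nodes.
    with Pool() as pool:
        estimates = pool.map(simulate, range(8))
    print(sum(estimates) / len(estimates))
```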


Dave Kearns asserted “We've tried to differentiate identity for the cloud, identity from the cloud and identity in the cloud -- but what's identity to the cloud?” as a deck for his The cloud from different angles post of 11/16/2010 to NetworkWorld’s Security blog about CA Technologies Identity Manager:

We seem to have spent an inordinate amount of time on cloud computing over the past few issues, but there's one more announcement (at least) that I want to bring to your attention before moving on.

A press release crossed my desk last week and the headline really did grab me: "…Cloud Strategy with Identity and Access Management To, For and From the Cloud." We've tried, in the past, to differentiate identity for the cloud, identity from the cloud and identity in the cloud -- but what's identity to the cloud? [Link added.]

That was intriguing enough for me to pursue CA Technologies' announcement a bit further. It turns out that what was meant is that CA Identity Manager now supports user provisioning to Google Apps. Using Identity Manager, you can now automate identity management functions, such as role-based user provisioning and de-provisioning and self-service access requests, to deliver a single, automated system for managing identities for Google Apps in the cloud as well as for existing in-house applications.

OK, but how is that different from "for the cloud"?

As it turns out, existing IAM solutions from CA Technologies can be used to help control users, their access, and how they use information in private, public, or hybrid cloud environments, delivering the same level of security found within the enterprise and addressing needs that include virtualization security, compliance, policy management and more. CA noted that one of its clients uses a combination of products to control access to its SaaS-based health-management applications, including a collaborative healthcare management platform for delivering outcome-driven case, disease and utilization management, and a collaborative healthcare decision support service that fosters better payer-patient-physician interactions, all kept HIPAA-compliant by the CA products.

Finally, there's "from the cloud". CA Technologies, it seems, is working with partners to deliver IAM as a fully managed service (i.e., "Identity as a Service") that helps strengthen, streamline and simplify how organizations approach identifying, authenticating and granting secure access to on premise and cloud applications and services. There will be more on that front early next year.

You can learn more about CA's identity products and solutions at their Web site.


<Return to section navigation list> 
