Sunday, January 25, 2009

Excel Table Analysis Tools for the Cloud (SQL Data Services)

I’ve made numerous references to “promised business intelligence (BI) features” for SQL Data Services (SDS) and its predecessor, SQL Server Data Services (SSDS), but overlooked the Microsoft’s Data Mining (DM) team’s Excel Table Analysis Tools for the Cloud preview, which include a full-featured add-in of the same name for Excel 2007 and a corresponding Web-based demonstration service that offers fewer analysis types.

The DM team demonstrated this feature for the first time on August 25, 2008 at the KDD (Knowledge Discovery and Data Mining) 2008 conference in Las Vegas. Jamie MacLennan tells the whole story in his KDD 2008 and Incredibly Awesome SQL 2008 Data Mining Demos post of the same date.

The Table Analysis Tools for the Cloud Web Service

The Web application, which runs from data pre-stored by SQL Data Services and DM code executed under Windows Azure doesn’t require downloading add-ins or other code to demonstrate the following analyses:

  • Analyze Key Influences
  • Forecasting
  • Prediction Calculator

Here’s the input data after clicking the Load Data button from the AdventureWorks.Sales.vIndividualDemographics view with an additional Purchased Bike bit (Boolean) column added:

(Click the image to display a full-size screen capture. Note that analyses marked with a barred red circle are unavailable in the current Web preview.)

Here are the major influencers on bicycle purchases by individual customers:

When you click one of the three active analysis buttons, the Windows Azure instance executes the data mining code to generate the Analysis Results table.

It’s easy to deduce Key influencers for purchasing a bicycle, such as lack of an automobile, unmarried status and location in the Asia-Pacific region where bicycle riding is more common than in North America. However, the Score column indicates the relative numerical importance of these factors for not purchasing and purchasing a bicycle.

Excel Table Analysis Tools for the Cloud Add-in

Uploading data from an Excel worksheet table, performing an analysis, and downloading the resulting Analysis Results worksheet requires downloading and installing the the Data Mining Add-Ins for Office 2007. The download page mentions SQL Server 2008, but you don’t need to install SQL Server 2008 to use the Table Analysis Tools (TAT) add-in.

The following screen capture shows the AdventureWorks DimCustomer.csv file that contains data that’s similar to the preceding example’s, except for the lack of the Purchased Bike column:

Clicking the Analyze (in the Cloud) button adds a analysis header to the worksheet. The Excel TAT add-in enables the following additional sample analyses:

  • Categorizer
  • Highlight Exceptions
  • Scenario Analysis
  • Market Basket Analysis
  • as you can see from the following screen capture:

    Clicking one of the analysis buttons sends the data to an SDS table, performs the analysis, and returns the data to Excel for conversion to a worksheet’s table.

    The Key Influencers analysis in this case analyzes the factors that favor increasing annual income rather than bicycle purchases.