Tuesday, December 27, 2011

Microsoft Codename “Data Explorer” Cloud Version Fails to Save Snapshots of Codename “Social Analytics” Data

The problem is related to differences in the snapshot database schema between desktop and cloud versions, as shown below.

Update 12/27/2011 1:30 PM PST: See my new Mashup Big Data with Microsoft Codename “Data Explorer” - An Illustrated Tutorial post of 12/27/2011 for detailed instructions for creating and downloading the mashups that underlie this post.

Update 12/26/2011 8:45 AM PST: The “Data Explorer” Team is investigating the problem. See end of post.


Microsoft described Codename “Data Explorer” in its 10/12/2011 product announcement as follows:

“Data Explorer” is a new concept which provides an innovative way to gain new insights from the data you care about. With “Data Explorer” you can discover relevant data sources, enrich your data by combining with other sources, and then publish and share your insights with others.

You can get started with “Data Explorer” by identifying the data you care about from the sources you work with (e.g. Excel spreadsheets, files, SQL Server databases, Windows Azure Marketplace, etc.). [Emphasis added.]

I’ve been working with the API for Microsoft Codename “Social Analytics” since its introduction on 10/25/2011. My “Microsoft tests Social Analytics experimental cloud” article of 12/1/2011 for SearchCloudComputing.com describes how Social Analytics datasets enable discovery of consumer interest in, and sentiment about, products (Windows 8) or individuals (Bill Gates).

My downloadable Codename “Social Analytics” WinForms Client Sample App displays individual tweets, Facebook posts, and occasional Stack Overflow questions in a DataGrid control. It also summarizes daily counts of items referring to Windows 8, along with their positive and negative sentiment (tone), in a graph, shown here for 12/25/2011 and the preceding 21 days:


While testing the capability of Desktop and Cloud implementations of Microsoft Codename “Data Explorer” to emulate the features of my sample client app, I discovered the following:

Saving a Snapshot Containing Codename “Social Analytics” Data from the Desktop Client to a Local SQL Server Instance Works as Expected

I was able to save a snapshot of a Data Explorer mashup containing rows from Social Analytics’ VancouverWindows8 ContentItems collection to a local SQL Server 2008 R2 (v10.50.2500) database. All 263,171 ContentItems rows at the time were returned, as shown in this SQL Server Management Studio (SSMS) 2008 R2 screen capture:


Here are the ContentItems columns’ specs and the Errors table’s initial rows:


Notice that Data Explorer Desktop added a row column of the int data type as the primary key.
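
The effect of a surrogate key like this can be sketched in a few lines of Python. This is only an illustration, not Data Explorer's actual implementation: it uses an in-memory SQLite database as a stand-in for SQL Server, and the save_snapshot helper, the Title column and the sample rows are hypothetical.

```python
import sqlite3


def save_snapshot(rows):
    """Store (Id, Title) pairs under a surrogate integer "row" primary key.

    Assumption: SQLite stands in for SQL Server here, and the Title column
    and sample values are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # The primary key is the generated "row" number, not the source Id.
    conn.execute(
        """
        CREATE TABLE ContentItems (
            "row" INTEGER PRIMARY KEY,
            Id    TEXT NOT NULL,
            Title TEXT
        )
        """
    )
    # Number the incoming rows 1, 2, 3, ... as they are written.
    conn.executemany(
        'INSERT INTO ContentItems ("row", Id, Title) VALUES (?, ?, ?)',
        [(n, item_id, title) for n, (item_id, title) in enumerate(rows, start=1)],
    )
    conn.commit()
    return conn


conn = save_snapshot([("a1", "First item"), ("b2", "Second item")])
stored = conn.execute(
    'SELECT "row", Id FROM ContentItems ORDER BY "row"'
).fetchall()
print(stored)
```

Because the key is generated at save time, the uniqueness of the table's primary key doesn't depend on anything in the source feed.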

Note: The full text of errors for the Feeds, Tones and WindowsAzureMarketplace1 source data tables is “The value does not have a type that is supported for sending to an external source.” These columns contain collections. Flattening the columns doesn’t appear to be practical, both for performance reasons and because the flattened columns themselves contain collection columns.
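
A rough way to spot such columns before attempting a snapshot is to scan a sample of rows for values that are collections rather than scalars. The following Python sketch assumes the rows have already been loaded as dictionaries; the collection_columns helper and the sample data are hypothetical and aren't part of Data Explorer.

```python
def collection_columns(rows):
    """Return the sorted names of columns holding a list, tuple, set or dict."""
    found = set()
    for row in rows:
        for name, value in row.items():
            # Scalars (strings, numbers, None) can map to a SQL column type;
            # nested collections cannot without flattening.
            if isinstance(value, (list, tuple, set, dict)):
                found.add(name)
    return sorted(found)


# Hypothetical sample rows; only the column names Feeds and Tones come
# from the post, the values are invented.
sample = [
    {"Id": "a1", "Title": "First item", "Tones": [{"label": "positive"}]},
    {"Id": "b2", "Title": "Second item", "Tones": [], "Feeds": ["feed-1"]},
]
flagged = collection_columns(sample)
print(flagged)
```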

Saving a Snapshot Containing Codename “Social Analytics” Data from the Cloud Implementation to an SQL Azure Instance Fails with Primary Key Constraint Violation

Specifying an SQL Azure server (v11.0.1814) instance to store snapshot data, adding a new database, specifying the schema name, and creating a snapshot results in an empty ContentItems table. The following SSMS screen displays the cause of the problem:


Notice that the cloud implementation doesn’t add the row primary key column. Renaming the Id column to ItemGuid and adding an Index (row number) column named Id doesn’t solve the problem. Doing so suppresses the error message shown above, but the snapshot still returns an empty ContentItems resultset:


Notice that Index columns you create in Data Explorer are incorrectly saved with the float data type instead of int.
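
If a table with a float index column does end up in the database, the values can be checked and converted on the client side. Here's a minimal Python sketch, assuming the index values arrive as floats that should all be whole numbers; index_as_int is a hypothetical helper, not a Data Explorer or SQL Server function.

```python
def index_as_int(values):
    """Convert float index values to int, rejecting any non-whole number."""
    result = []
    for value in values:
        as_float = float(value)
        if not as_float.is_integer():
            raise ValueError(f"index value {value!r} is not a whole number")
        result.append(int(as_float))
    return result


converted = index_as_int([0.0, 1.0, 2.0])
print(converted)
```

Raising an error instead of silently truncating makes a corrupted index easy to notice.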

The difference in behavior of desktop and cloud snapshots is surprising; my understanding is that the two versions share the same codebase.

Note: You can download empty text (CSV) or Excel files, or OData feeds, as well as the mashup itself here:


If you have a feed key for Microsoft Codename “Social Analytics” and the Codename “Data Explorer” Desktop client installed, download the Data Explorer Mashup and store a copy locally. Paste your feed key, as shown below, and click the Continue button to enable connection to the Codename “Social Analytics” VancouverWindows8 dataset:


Click the Content Items button in the data sources navigation pane and be patient while waiting for the following list to populate with live data:


Note: Tables for the other three data sources don’t appear as expected. The reason for this problem isn’t clear. Missing Tone values for recent items have been reported to the Microsoft Codename “Social Analytics” team.

The Microsoft Codename “Data Explorer” forum thread for this problem is here.

Update 12/26/2011 8:45 AM PST: The Data Explorer team’s Community Program Manager, Miguel Llopis, responded to my thread on 12/26/2011:

We are looking at the mashup that you published in order to identify and fix the exact issue.

For more information about my (@rogerjenn) Social Analytics WinForms Client, see:

Jamie Thomson (@jamiet) has posted several interesting articles about Codename “Data Explorer” here.