Monday, November 21, 2011

New Features Added to My Microsoft Codename “Social Analytics” WinForms Client Sample App

My continued analysis of the Microsoft Codename “Social Analytics” Team’s VancouverWindows8 dataset shows a substantial number of blog Posts and a few Comment, Reply, Question, Answer, and Like ContentItemTypes mixed in with a preponderance of Tweets and Retweets. As a result, I added a ContentItemTypeId column to the DataGrid control and enabled the list box to selectively display ContentItemType.Name data. (WCF Data Services doesn’t support LINQ queries with Join clauses, so the DataGrid contains numeric Id values.)

Update 11/23/2011: ContentItemTypes is an enumeration with Id and Name fields, so you can add ContentItemType.Name to the projection, as shown by the commented lines in the query below. For other details and the link to download the sample project and its source code, see my post of 11/23/2011, More Features and the Download Link for My Codename “Social Analytics” WinForms Client Sample App.

Here’s the LINQ query to return 500-row batches of OData ContentItems, where j is the number of 500-row batches already retrieved:

var contentQuery = (from c in contentItems 
                    /* Join isn't supported by WCF Data Services 
                    join t in typeItems 
                    on c.ContentItemTypeId equals t.Id 
                     */ 
                    where c.CalculatedToneId != null 
                    orderby c.PublishedOn descending 
                    select new 
                    { 
                        c.Id, 
                        c.ContentItemTypeId, 
                        // ContentItemType is an Enum member (implemented by
                        // the Entity Framework June 2011 CTP)
                        c.ContentItemType.Name,
                        c.Title, 
                        c.PublishedOn, 
                        c.CalculatedToneId, 
                        c.ToneReliability 
                    }).Skip(j * 500).Take(rowsRequested);

This query sends the following typical GET request to the data source:

GET https://api.datamarket.azure.com/Vancouver/VancouverWindows8/ContentItems()?$filter=CalculatedToneId%20ne%20null&$orderby=PublishedOn%20desc&$skip=88000&$top=500&$select=Id,ContentItemTypeId,Title,PublishedOn,CalculatedToneId,ToneReliability HTTP/1.1
User-Agent: Microsoft ADO.NET Data Services
DataServiceVersion: 2.0;NetFx
MaxDataServiceVersion: 2.0;NetFx
Accept: application/atom+xml,application/xml
Accept-Charset: UTF-8
Host: api.datamarket.azure.com
Connection: Keep-Alive
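
The Skip() and Take() operators in the query map to the $skip and $top options in the request. As a minimal sketch of that arithmetic (the PagingSketch class and PagingOptions method are hypothetical names, not part of the sample app), the $skip=88000 value above corresponds to j = 176:

```csharp
using System;

static class PagingSketch
{
    // Batch size used by Skip(j * 500) in the query above.
    const int BatchSize = 500;

    // Builds only the paging portion of the OData query string for batch j.
    // Hypothetical helper for illustration; the WCF Data Services client
    // generates the real request from the LINQ expression.
    public static string PagingOptions(int j, int rowsRequested)
    {
        return string.Format("$skip={0}&$top={1}", j * BatchSize, rowsRequested);
    }

    static void Main()
    {
        // 176 batches already retrieved, 500 rows requested.
        Console.WriteLine(PagingOptions(176, 500)); // prints $skip=88000&$top=500
    }
}
```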

LINQ queries that return custom projections with a select new {field list} clause aren’t easy to customize, for example by adding && (c.ContentItemTypeId == 5 || c.ContentItemTypeId == 6) expressions to the where clause. Therefore, the app restricts the rows added to the grid and included in calculations to Tweets and Retweets during iteration over the result set.
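
The following is a rough sketch of filtering during iteration, not the app’s actual code. The ContentRow class, the TweetFilterSketch class, and the tweetsOnly flag are hypothetical stand-ins; the type Ids 5 and 6 come from the where-clause example above, and Id 8 (Answer) comes from the list box data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one row of the anonymous-type projection.
class ContentRow
{
    public int Id { get; set; }
    public int ContentItemTypeId { get; set; }
    public string Title { get; set; }
}

static class TweetFilterSketch
{
    // Ids 5 and 6 are the values used in the where-clause example.
    public static bool IsTweetOrRetweet(ContentRow row)
    {
        return row.ContentItemTypeId == 5 || row.ContentItemTypeId == 6;
    }

    // Returns the rows that would be added to the grid. When tweetsOnly
    // is false, every row is kept.
    public static List<ContentRow> RowsForGrid(IEnumerable<ContentRow> rows, bool tweetsOnly)
    {
        var kept = new List<ContentRow>();
        foreach (var row in rows)
        {
            if (tweetsOnly && !IsTweetOrRetweet(row))
                continue; // skip blog posts, comments, answers, and so on
            kept.Add(row);
        }
        return kept;
    }

    static void Main()
    {
        var rows = new List<ContentRow>
        {
            new ContentRow { Id = 1, ContentItemTypeId = 5, Title = "sample a" },
            new ContentRow { Id = 2, ContentItemTypeId = 6, Title = "sample b" },
            new ContentRow { Id = 3, ContentItemTypeId = 8, Title = "sample c" }
        };
        Console.WriteLine(RowsForGrid(rows, true).Count);  // prints 2
        Console.WriteLine(RowsForGrid(rows, false).Count); // prints 3
    }
}
```

Filtering on the client keeps the query text fixed, at the cost of downloading rows that are then discarded.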

Here’s the app’s main form after retrieving 100,000 rows, of which 4,783 were types other than Tweets and Retweets:

[image: the app’s main form]

The List Box displays the total count of each ContentItemType in the result set. The following items are hidden:

8 - Answer - 2
9 - Like - 1
10 - DirectMessage - 0
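
A per-type tally in the “Id - Name - Count” format above could be produced along these lines. This is a sketch under assumptions: the TypeCountSketch class and CountLines method are hypothetical, and only the three types listed above are used as sample data.

```csharp
using System;
using System.Collections.Generic;

static class TypeCountSketch
{
    // Tallies rows per ContentItemTypeId and formats one line per known type,
    // including types with a zero count. Ids missing from typeNames are ignored.
    public static List<string> CountLines(IDictionary<int, string> typeNames,
                                          IEnumerable<int> typeIds)
    {
        var counts = new SortedDictionary<int, int>();
        foreach (var id in typeNames.Keys)
            counts[id] = 0; // start every known type at zero

        foreach (var id in typeIds)
        {
            if (counts.ContainsKey(id))
                counts[id]++;
        }

        var lines = new List<string>();
        foreach (var pair in counts)
            lines.Add(string.Format("{0} - {1} - {2}",
                                    pair.Key, typeNames[pair.Key], pair.Value));
        return lines;
    }

    static void Main()
    {
        var typeNames = new Dictionary<int, string>
        {
            { 8, "Answer" }, { 9, "Like" }, { 10, "DirectMessage" }
        };
        // Two Answers and one Like, matching the hidden items listed above.
        var typeIds = new[] { 8, 9, 8 };

        foreach (var line in CountLines(typeNames, typeIds))
            Console.WriteLine(line);
        // prints:
        // 8 - Answer - 2
        // 9 - Like - 1
        // 10 - DirectMessage - 0
    }
}
```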

I’ll upload the source code under an MIT license to my SkyDrive account after a few days of testing.
