Recent Articles about SQL Azure Labs and Other Added-Value Windows Azure SaaS Previews: A Bibliography
I’ve been concentrating my original articles for the past 12 months or so on SQL Azure Labs, HDInsight and Windows Azure SQL Database Federations previews, which I call added-value offerings. I use the term added-value because Microsoft doesn’t charge for their use, other than Windows Azure compute, storage and bandwidth costs or Windows Azure SQL Database monthly charges and bandwidth costs for some of the applications, such as Codename “Cloud Numerics” and SQL Azure Federations.
‡‡‡ Updated 11/1/2012 with updates and change of name from Apache Hadoop on Windows Azure to HDInsight, new Codename “Cloud Numerics” features, new Windows Azure Mobile Services tutorials, pending Project “Austin” (StreamInsight for Windows Azure) articles and brief descriptions of the added-value offerings.
‡‡ Updated 6/30/2012 with my Analyzing 'big data' with Microsoft [Codename] Cloud Numerics article for SearchCloudComputing.com and revised status of Codenames “Data Hub,” “Data Transfer” and “Social Analytics” to discontinued.
‡ Updated 6/1/2012 with Table of Contents and addition of Amazon Elastic MapReduce content.
Contents:
- Windows Azure Marketplace DataMarket plus Codenames “Data Hub” and “Data Transfer” from SQL Azure Labs
- HDInsight from the SQL Server Team ‡‡‡
- StreamInsight Project Codename “Austin” from the SQL Server Team ‡‡‡
- Codename “Cloud Numerics” from SQL Azure Labs ‡‡‡
- Codename “Social Analytics” from SQL Azure Labs
- Codename “Data Explorer” from SQL Azure Labs
- Mobile Services and Federations from the Windows Azure SQL Database Team
The following tables list my articles in reverse chronological order of their publication date on the OakLeaf, SearchCloudComputing.com or SearchSQLServer.com (marked •) and Red Gate Software’s ACloudyPlace.com (marked ••) blogs. Dates usually are the date of their last update, if updated; otherwise, the publication date. Blank dates are for articles submitted but not yet published (titles might change).
I’ll update this post as I write other articles in the same genre. Dates for updated items will be bold.
Windows Azure Marketplace DataMarket plus Codenames “Data Hub” and “Data Transfer” from SQL Azure Labs
‡‡‡ The SQL Azure Labs Team describes Microsoft Codename "Data Hub" as follows:
Data drives your business. Having the right data at the right time gives you and your business a competitive advantage. Often the data you need already exists within your enterprise. You just need to find it. "Data Hub" enables your enterprise to curate and publish its data on a private data marketplace, making it easy to discover and leverage.
‡‡ Update 6/30/2012: Shoshanna Budzianowski (@shoshe) of the Codename “Data Transfer” team reported the shutdown of this SQL Azure Labs project in a Thanks for using the Data Transfer Lab post on 6/2/2012.
HDInsight from the SQL Server Team
‡‡‡ Microsoft’s Matt Winkler described the Windows Azure HDInsight Service as follows on 10/24/2012:
This morning we made some big announcements about delivering Hadoop for Windows Azure users. Windows Azure HDInsight Service is the easiest way to deploy, manage and scale Hadoop based solutions. This release includes:
- Hadoop updates that ensure the latest stable versions of:
- HDFS and Map/Reduce
- Pig
- Hive
- Sqoop
- Increased availability of the preview service
- A local, developer installation of Microsoft HDInsight Server
- An SDK for writing Hadoop jobs using .NET and Visual Studio
Community Contributions
As part of our ongoing commitment to Apache™ Hadoop®, the team has been actively working to submit our changes to Apache™. You can follow the progress of this work by following branch-1-win for check-ins related to HDFS and Map/Reduce. We’re also contributing patches to other projects, including Hive, Pig and HBase. This set of components is just the beginning, with monthly refreshes ahead we’ll be adding additional projects, such as HCatalog.
HDInsight is available for on-premise deployment as Microsoft HDInsight Server.
StreamInsight Project Codename “Austin” from the SQL Server Team
‡‡‡ The SQL Server StreamInsight Team described StreamInsight Project Codename “Austin” as follows on 5/24/2011:
Project Codename “Austin” will make Microsoft StreamInsight’s complex event processing capabilities available as a service on the Windows Azure Platform. This allows Microsoft’s customers and partners to build event-driven applications where the analysis of the events is performed in the Cloud. Such a deployment becomes relevant in scenarios where
- event data needs to be collected from globally distributed assets or equipment such as connected cars or oil platforms
- event data is already “born” in the cloud, like clickstream data
- event-processing results need to be consolidated and made globally available.
Instead of pulling data into an on-premise analytics environment and then possibly distributing it again, it can be processed in the Cloud using StreamInsight’s event-driven computation framework, providing cloud computing benefits for many application scenarios in verticals such as manufacturing, oil & gas, utilities, health care and web analytics.
Project Codename “Austin” offers the same capabilities for declarative event processing to derive insight from real-time and historical event data as Microsoft StreamInsight does on premises. To facilitate migrations from on premise applications to the Cloud, Project Codename “Austin” adopts the existing .NET and LINQ-based development experience that Microsoft StreamInsight provides for on premise solutions. A StreamInsight instance in the Cloud should appear just like an on-premise instance. In addition, Project Codename “Austin” will adopt a cloud-based deployment and servicing experience.
The latest update to Codename “Austin” is the third CTP dated August 2012.
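To give a rough feel for the kind of event-driven computation described above, here is a minimal Python sketch of a tumbling-window count over a clickstream. It is only an illustration of the windowing idea and does not use the StreamInsight or “Austin” APIs, which are .NET and LINQ-based; the function name, the event shape and the sample data are all my own assumptions.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) tuples.
    This is a hypothetical helper, not part of any StreamInsight API.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Made-up clickstream events: (seconds since start, page name).
clicks = [(1, "home"), (4, "home"), (7, "cart"),
          (12, "home"), (14, "cart"), (19, "cart")]

# Page hits per 10-second window.
result = tumbling_window_counts(clicks, 10)
```

A real complex event processing engine evaluates such queries continuously as events arrive; the batch loop here just shows how events map to windows.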
Date | Link |
•• Pending | Move Complex Event Processing to the Cloud with the StreamInsight Service for Windows Azure CTP, Part 2 |
•• Pending | Move Complex Event Processing to the Cloud with the StreamInsight Service for Windows Azure CTP, Part 1 |
Codename “Cloud Numerics” from SQL Azure Labs
‡‡‡ The SQL Azure Labs team describes Microsoft Codename "Cloud Numerics" as follows:
The Microsoft Codename "Cloud Numerics" lab is a numerical and data analytics library for data scientists, quantitative analysts, and others who write C# applications in Visual Studio. It enables these applications to be scaled out, deployed, and run on Windows Azure.
‡‡‡ Ronnie Hoogerwerf (@rhoogerw) announced Microsoft Codename “Cloud Numerics” Lab Refresh on 10/18/2012. This post is a repeat of an 8/2/2012 post about the v0.2 August 2012 update, reported here, with minor edits that caused it to reappear with a new publish date:
We are announcing a refresh of the Microsoft Codename "Cloud Numerics" Lab. We want to thank everyone who participated in the initial lab, we amassed and used your feedback to make improvements and add exciting features. Your participation is what makes this lab a success. Thank you.
Here’s what is new in the refresh:
Improved user experience: through more actionable exception messages, a refactoring of the probability distribution function APIs, and better and more actionable feedback in the deployment utility. In addition, the deployment process time has decreased and the installer supports installation on a on-premises Windows HPC Cluster. All up, this refresh provides for a more efficient way of writing and deploying “Cloud Numerics” applications to Windows Azure. [Emphasis added.]
More scale-out enabled functions: more algorithms are enabled to work on distributed arrays. This significantly increases the breadth and depth of big data algorithms that can be developed using “Cloud Numerics” Lab. Scale-out functionality was added in the following areas: Fourier transforms, linear algebra, descriptive statistics, pattern recognition, random sampling, similarity measures, set operations, and matrix math.
Array indexing and manipulation: a large part of any data analytics application concerns handling and preparing data to be in the right shape and have the right content. With this refresh “Cloud Numerics” adds advanced array indexing enabling users to easily and efficiently set and extract subsets of arrays and to apply Boolean filters.
Sparse data structures and algorithms: much of the real-world big data sets are sparse, i.e., not every field in a table has a value. With this refresh of the lab we introduce a distributed sparse matrix structure to hold these datasets and introduce core sparse linear algebra functions enabling scenarios such as document classification, collaborative filtering, etc.
Apply/Sweep framework: in addition to the built-in parallelism the “Cloud Numerics” Lab, this refresh now exposes a set of APIs to enable embarrassingly parallel patterns. The Apply framework enables applying arbitrary serializable .NET code to each element of an array or to each row or column of an array. The framework also provides a set of expert level interfaces to define arbitrary array splits. The Sweep framework performs as its name implies —this framework enables distributed parameter sweeps across a set of nodes allowing for better execution times.
Improved IO functionality: we added more parallel readers to enable out of the box data ingress from Windows Azure storage and introduced parallel writers. [Emphasis added.]
Documentation: we introduced detailed mathematical descriptions of more than half of the algorithms using print-quality formulae and best-of-web equation rendering that help clarify algorithm mathematical definition and method behavior. In addition, we updated the “Getting Started” wiki, and we added conceptual documentation for the “Cloud Numerics” help that includes the programming model, the new Apply framework, IO, and so on.
Stay tuned for upcoming blog posts:
- F#: We’ll be distributing a F# add-in for “Cloud Numerics” soon. The add-in exposes the “Cloud Numerics” APIs in a more functional manner, introduces operators, such as matrix multiply, and F# style constructors for and indexing on “Cloud Numerics” arrays.
- Text analytics using sparse data structures
Do you want to learn more about Microsoft Codename “Cloud Numerics” Lab? Please visit us on our SQL Azure Labs home page, take a deeper look at the Getting Started material and Sign Up to get access to the installer. Let us know what you think by sending us email at cnumerics-feedback@microsoft.com.
The “Cloud Numerics” refresh depends on the newly released Azure SDK 1.7 and Microsoft HPC Server R2 SP4. It does not provide support for the Visual Studio 2012 RC. [Emphasis added.]
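The Boolean filtering, Apply and Sweep patterns mentioned in the announcement can be sketched in a few lines of plain Python. This is not the “Cloud Numerics” API, which is a .NET library; the function names below are hypothetical, and a local thread pool stands in for the set of cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_rows(matrix, func):
    """Apply `func` to each row of a list-of-lists matrix (Apply-style pattern)."""
    return [func(row) for row in matrix]


def boolean_filter(values, mask):
    """Keep only the values whose corresponding mask entry is True."""
    return [value for value, keep in zip(values, mask) if keep]


def sweep(func, parameters, max_workers=4):
    """Evaluate `func` once per parameter in parallel (Sweep-style pattern).

    Threads are only a stand-in for distributed nodes in this sketch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `parameters`.
        return list(pool.map(func, parameters))


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
row_sums = apply_rows(data, sum)               # one result per row
mask = [total > 10 for total in row_sums]      # Boolean filter condition
large_sums = boolean_filter(row_sums, mask)
squares = sweep(lambda p: p * p, [1, 2, 3])    # one evaluation per parameter
```

Each call to `func` is independent of the others, which is what makes these patterns embarrassingly parallel and easy to spread across nodes.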
Codename “Social Analytics” from SQL Azure Labs
‡‡‡ The SQL Azure Labs team reported the availability of Microsoft Codename "Social Analytics" on 10/25/2011:
As the popularity of the social web continues to grow it has become increasingly important for businesses to keep their finger on the pulse of the social web. Social information provides businesses with new insights, and the social web provides a means to connect with customers and respond quickly to customer concerns or comments. Microsoft Codename "Social Analytics" allows you to easily integrate social information into your business applications.
‡‡ Update 6/30/2012: The Codename “Social Analytics” Team reported in a Microsoft Codename "Social Analytics" - Lab Phase is Complete blog post of 6/21/2012 that the project was discontinued. The OData data source was no longer available from the Windows Azure Marketplace DataMarket as of 6/30/2012. I am in the process of modifying my Microsoft Social Analytics Windows Form Client sample project to use data saved before the data source shutdown. See my My “Big Data in the Cloud” Cover Article for Visual Studio Magazine’s July Issue post for more details.
Codename “Data Explorer” from SQL Azure Labs
‡‡‡ The SQL Azure Labs team describes Microsoft Codename "Data Explorer" as follows:
Gain new insights from your data
Have you ever had trouble finding data you needed? Or combining data from different, incompatible sources? How about sharing the results with others in a web-friendly way? If so, we want you to try Microsoft Codename “Data Explorer” Cloud service.
With "Data Explorer" you can:
Identify the data you care about from the sources you work with (e.g. Excel spreadsheets, files, SQL Server databases).
Discover relevant data and services via automatic recommendations from the Windows Azure Marketplace.
Enrich your data by combining it and visualizing the results.
Collaborate with your colleagues to refine the data.
Publish the results to share them with others or power solutions.
In short, we help you harness the richness of data on the Web to generate new insights.
Mobile Services and Federations from the Windows Azure SQL Database Team
‡‡‡ The Windows Azure Team describes Windows Azure Mobile Services as follows:
Windows Azure Mobile Services is a Windows Azure service offering designed to make it easy to create highly-functional mobile apps using Windows Azure. Mobile Services brings together a set of Windows Azure services that enable backend capabilities for your apps. Mobile Services provides the following backend capabilities in Windows Azure to support your apps:
- Simple provisioning and management of tables for storing app data.
- Integration with notification services to deliver push notifications to your app.
- Integration with well-known identity providers for authentication.
- Granular control for authorizing access to tables.
- Supports scripts to inject business logic into data access operations.
- Integration with other cloud services.
- Supports the ability to scale a mobile service instance.
- Service monitoring and logging.
and Windows Azure SQL Database Federations thusly:
Federations in SQL Database are a way to achieve greater scalability and performance from the database tier of your application through horizontal partitioning. One or more tables within a database are split by row and portioned across multiple databases (Federation members). This type of horizontal partitioning is often referred to as ‘sharding’. The primary scenarios in which this is useful are where you need to achieve scale, performance, or to manage capacity.
SQL Database can deliver scale, performance, and additional capacity through federation, and can do so dynamically with no downtime; client applications can continue accessing data during repartitioning operations with no interruption in service.
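The horizontal partitioning idea behind Federations can be illustrated with a small in-memory Python sketch. The real service is driven by T-SQL against SQL Database, not by code like this; the class, its methods and the sample rows are hypothetical and only show how range boundaries route rows to members and how a split repartitions one member.

```python
from bisect import bisect_right


class RangeFederation:
    """Toy range-sharded store: member i holds keys in [split[i-1], split[i])."""

    def __init__(self, split_points):
        self.split_points = sorted(split_points)
        # One dict per federation member; n split points give n + 1 members.
        self.members = [dict() for _ in range(len(self.split_points) + 1)]

    def member_index(self, key):
        # Number of split points <= key is the index of the owning member.
        return bisect_right(self.split_points, key)

    def insert(self, key, row):
        self.members[self.member_index(key)][key] = row

    def get(self, key):
        return self.members[self.member_index(key)].get(key)

    def split(self, new_point):
        """Split the member that owns `new_point` into two members."""
        if new_point in self.split_points:
            raise ValueError("split point already exists")
        idx = self.member_index(new_point)
        old = self.members[idx]
        low = {k: v for k, v in old.items() if k < new_point}
        high = {k: v for k, v in old.items() if k >= new_point}
        self.members[idx:idx + 1] = [low, high]
        self.split_points.insert(idx, new_point)


fed = RangeFederation([100])          # two members: keys < 100 and keys >= 100
fed.insert(10, "Alice")
fed.insert(60, "Bob")
fed.insert(150, "Carol")
fed.split(50)                         # repartition the first member at key 50
sizes = [len(member) for member in fed.members]
```

Lookups by key keep working after the split because routing depends only on the current list of split points, which mirrors the claim that clients can continue accessing data during repartitioning.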