Thursday, October 31, 2013

Uptime Report for My Live Windows Azure Web Site: October 2013 = 99.63%

image_thumbMy (@rogerjenn) Android MiniPCs and TVBoxes blog runs WordPress on WebMatrix with Super Cache in Windows Azure Web Site (WAWS) Standard tier instance in Microsoft’s West U.S. (Bay Area) data center and ClearDB’s MySQL database (Venus plan). I converted the site from the Shared Preview to the Standard tier on September 7, 2013 in order to store diagnostic data in Windows Azure blobs instead of the file system.

Service Level Agreements aren’t applicable to the Web Services’ Shared tier; only sites in the Standard tier qualify for the 99.9% uptime SLA.

  • Running a Shared Preview WAWS costs ~US$10/month plus MySQL charges
  • Running a Standard tier instance costs ~US$75/month plus MySQL charges

I use Windows Live Writer to author posts that provide technical details of low-cost MiniPCs with HDMI outputs running Android JellyBean 4.1+, as well as Google’s new Chromecast device. The site emphases high-definition 1080p video recording and rendition.


The site commenced operation on 4/25/2013. To improve response time, I implemented WordPress Super Cache on May 15, 2013. I moved the site’s diagnostic logs from the file system to Windows Azure blobs on 9/7/2013, as reported in my Storing the Android MiniPCs Site’s Log Files in Windows Azure Blobs post.

Barbara Darrow (@gigabarb) asserted “Microsoft likes to tout the fact that it runs Windows Azure at data centers worldwide. Yesterday compute instances across most of those regions were disrupted. Not good” in a summary of her Whoopsie: Windows Azure stumbles again post of 10/31/2013 to GigaOm’s Cloud Computing blog:

imageMicrosoft Windows Azure had a bad day Wednesday with compute capability severely impacted worldwide throughout the day, as first reported by The Register. Compute service was partially disrupted across nearly all regions,  according to the Azure status page,

At 2:35 a.m. UTC (7:35 p.m. PDT) the company said:

We are experiencing an issue with Compute in North Central US, South Central US, North Europe, Southeast Asia, West Europe, East Asia, East US and West US. We are actively investigating this issue and assessing its impact to our customers. Further updates will be published to keep you apprised of the impact. We apologize for any inconvenience this causes our customers.

imageA flurry of updates followed throughout the day and as of 10:45 a.m. UTC (3:45 a.m. PDT) on Thursday, the status report declared that the problem had been addressed and the company was “remediating” affected services.

Microsoft said the issue impacted the swap deployment feature of service management. As explained in PC World: Azure offers both a staging environment to let users test their systems, and a production environment separated by the virtual IP addresses used to access them. Swap deployment operations are used to turn the staging environment into the production environment.

Update: What this meant was that while existing applications continued to work, new code could not be deployed to production until the swap deployment issue could be resolved. … [Emphasis added.]

Read the full article here.

Full disclosure: I’m a registered GigaOm analyst.

The problem didn’t affect my Windows Azure demo site, as shown in the following Pingdom reports and the Windows Azure Management Portal’s Dashboard.

Here’s Pingdom’s graphical Uptime report (first page in Downtime mode) for October 2013:


Note: Downtime is exaggerated because Pingdom’s timing resolution is five minutes.

And here’s their Response Time report for the same period:


Response time is slowed by the length and number of images in this blog’s posts.

The Windows Azure Management Portal displays resource utilization for a maximum period of one prior week:


The 7.491 maximum requests per hour on 10/10/2013 is a bit above the September 2013 maximum.

I plan to report and log uptime values monthly with pages similar to OakLeaf Systems’ Uptime Report for my Live OakLeaf Systems Azure Table Services Sample Project posts, which has more that two full years of uptime and response time data as of June 2013.

Month Year Uptime Downtime Outages Response Time
October 2013 99.63% 02:40:00 23 2,061 ms
September 2013 99.81% 01:20:00 7 1,816 ms
August 2013 99.71% 02:10:00 24 1,786 ms
July 2013 99.37% 04:40:00 45 2,002 ms
June 2013 99.34% 04:45:00 30 2,431 ms
May 2013 99.58% 03:05:00 32 2,706 ms

Note: The usual 99.9% uptime Service Level Agreement (SLA) applies only to Windows Azure Web Sites running in the Standard tier. The Android MiniPCs and TVBoxes site was upgraded from the Shared to Standard tier on 9/7/2013.