Fudge Sunday - Cloud in Public: Status Dashboards
This week we take a look at public things for a public cloud.
☁️✅⚠️🛑
This issue is part 1 of a 5-part series:
- Fudge Sunday - Cloud in Public: Status Dashboards
- Fudge Sunday - Cloud in Public: Engineering SLO
- Fudge Sunday - Cloud in Public: DevCommsOps
- Fudge Sunday - Cloud in Public: Mean Time To RCA
- Fudge Sunday - Cloud in Public: Impact Mapping
Last Week
Razor Thin Margins of Error Bars
Last week we took a look at three links and the implications in the cloud ☁️, attestation🔐, and statistical journeys to here be dragons🐲.
Last Week(s)
A look back at this week in…
- 2020 - Tokyo Stock Exchange outage
- 2019 - NS1 took $33M Series C
- 2018 - M&A of Hortonworks + Cloudera for $5.2B
- 2017 - Yahoo’s 2013 breach impacted all 3B users
- 2016 - Microsoft Azure plans a data center in France
- 2015 - Google Inc. became a subsidiary of Alphabet
- 2014 - M&A: Facebook + WhatsApp for $19B (EU approved)
- 2013 - Twitter filed for IPO
- 2012 - M&A: T-Mobile USA + MetroPCS
- 2011 - M&A: HP + Autonomy for $12B
This Week
This week let’s take a look at the history and evolution of the public status dashboard and what this means for the public cloud. We’ve come a long way but there is more to come in terms of transparency, personalization, and relevance.☁️✅⚠️🛑
Weekly Inspiration and Attribution
“The cloud is NOT just someone else’s computer”
– Lydia Leong aka @cloudpundit
Lydia Leong is a Distinguished VP Analyst at Gartner with both an active, revered Twitter presence and a personal blog that is selectively syndicated to Gartner Blog properties. Regarding “media amplification of outage awareness”, it is a good reminder that there’s always an XKCD (2347).
Earlier this week, I came across this Twitter thread and blog post from @cloudpundit. In a nutshell, ☁️ service provider transparency:
Publicly👏Document👏All👏The👏Things👏
- Engineering Service Level Objectives (Engineering SLO)
- Service Architecture and Dependency Mapping (Impact Mapping)
- Status Dashboards
- Change Plans and Logs (DevCommsOps)
- Outage Root Cause Analysis (Mean Time To RCA)
This week in this issue, we’ll only be focusing on Status Dashboards as the first issue of five to come.🤓
- Summarize Dashboards ☁️✅⚠️🛑
- Contrast the past and present of Dashboards ☁️✅⚠️🛑
Status Dashboards Then ☁️✅⚠️🛑
Cloud historians point back to 2006 as the year Amazon Web Services (AWS) entered the market. By 2008, the AWS Service Health Dashboard was announced as a page that “provides access to current status and historical data about each and every Amazon Web Service”.
Originally, the AWS Service Health Dashboard almost fit within a single screen of a web browser. Today, it takes several scrolls to get through a page, and there are multiple regional tabs that each take several scrolls of their own.
The look of the 2008 AWS Service Health Dashboard was simple, clean, and brief with only nine (9!) services. (via Wayback Machine)
Today, more than 2,000 services by region are represented on the AWS Service Health Dashboard.
By 2010, a status dashboard was becoming an expectation for customers of various “as a service” companies. Customers wanted to know if something was wrong with their Internet connection or if a service was truly having issues. See also: Social Telecom (2010) and Social Telecom 2030.
Paradise by the Stashboard Light
Paradise by the Stashboard Light was one in a series of technical blog posts I authored for ReadWriteWeb (aka RWW). At the time, several online companies realized the need to offer status dashboards, and Twilio responded by offering Stashboard, which leveraged Google App Engine, an early precursor to the serverless movement.
Before the Stashboard project demo stopped working on June 20, 2017, and before the project was archived, there were more than 300 forks on GitHub.
By 2016, status dashboards were commonly referred to as status pages. You can even find GitHub repositories that simply documented all the awesome status pages in existence, and those curated list repositories also saw forks!
By 2018, DevOps was almost a 10-year-old concept. By late 2018, DevOps would come to decorate the names of status dashboards.
Public Cloud Status Dashboards Now ☁️✅⚠️🛑
Today, a public cloud status dashboard is best thought of as an exhaustive collection of status pages. Compared with a simple SaaS company status page, a hyperscale public cloud service provider can have thousands of services that each require a status report, meaning many status pages must be represented in a single status dashboard.☁️✅⚠️🛑
- AWS Service Health Dashboard
- Microsoft Azure Status
- Google Cloud Status Dashboard
- Oracle Cloud Infrastructure Status
- Alibaba Cloud condition monitoring
- IBM Cloud Status
- Cloudflare System Status
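To make the idea of many status pages rolling up into one status dashboard concrete, here is a minimal Python sketch. It is not how any of the providers listed above build their dashboards. The region names, service names, and the three-level Status scale are all hypothetical, chosen only to mirror the ✅⚠️🛑 icons used in this issue. The assumption is that each region shows the worst status reported by any of its services.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical three-level scale, ordered from best to worst."""
    OK = 0
    DEGRADED = 1
    OUTAGE = 2


ICONS = {Status.OK: "✅", Status.DEGRADED: "⚠️", Status.OUTAGE: "🛑"}


def roll_up(reports):
    """Collapse (region, service, status) rows into the worst status per region."""
    worst = {}
    for region, _service, status in reports:
        worst[region] = max(worst.get(region, Status.OK), status)
    return worst


def render(worst):
    """Return one dashboard line per region, sorted by region name."""
    return [f"{ICONS[status]} {region}" for region, status in sorted(worst.items())]


# Made-up sample data, not taken from any real dashboard.
SAMPLE_REPORTS = [
    ("region-a", "compute", Status.OK),
    ("region-a", "storage", Status.DEGRADED),
    ("region-b", "compute", Status.OK),
    ("region-b", "storage", Status.OK),
    ("region-c", "dns", Status.OUTAGE),
]

if __name__ == "__main__":
    for line in render(roll_up(SAMPLE_REPORTS)):
        print(line)
```

Taking the worst status per region is only one possible design choice. A real dashboard might instead show every service and region cell individually, which is exactly why the hyperscale dashboards have grown so long.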
As one might expect, a personalized status page approach would filter and summarize the information pertinent to a specific customer. However, a personalized status page approach is a very recent concept even in 2021.
In fact, only two (2!) hyperscale public cloud service providers offer a truly personalized status page at this time. As for the others, the approach is decidedly do it yourself (DIY).
Historically, AWS launched Personal Health Dashboard in 2016 as a premium service while Azure launched Service Health in 2017 as a preview that became generally available in 2018 – at no additional cost.
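For the do it yourself (DIY) route, the core of a personalized status page is a filter: keep only the public events that touch the services and regions a given customer actually uses. The Python sketch below illustrates that idea under stated assumptions. The Event fields, the feed contents, and the footprint set are hypothetical and do not reflect the data format or API of any provider's dashboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical shape of one entry in a public status feed."""
    service: str
    region: str
    summary: str


def personalize(events, footprint):
    """Keep only events whose (service, region) pair is in the customer's footprint."""
    return [e for e in events if (e.service, e.region) in footprint]


# Made-up public feed covering every service and region.
PUBLIC_FEED = [
    Event("compute", "region-a", "Elevated API error rates"),
    Event("storage", "region-b", "Increased latency"),
    Event("dns", "region-c", "Resolution failures"),
]

# Made-up footprint: the (service, region) pairs this customer depends on.
MY_FOOTPRINT = {("compute", "region-a"), ("dns", "region-c")}

if __name__ == "__main__":
    for event in personalize(PUBLIC_FEED, MY_FOOTPRINT):
        print(f"{event.region} {event.service}: {event.summary}")
```

The hard part in practice is not the filter but keeping the footprint accurate, since it has to track what the customer has deployed and what those deployments depend on.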
Now, to round out this issue, reflect on the five (5) areas that @cloudpundit outlined. It’s worth noting that beyond dashboard variation, there are still clear variations between hyperscale public cloud service providers today, but those can’t possibly fit into this issue, hence the remaining issues in this series.🤓
Disclosure
I am linking to my disclosure.