
Fudge Sunday - Cloud in Public: DevCommsOps

by Jay Cuthrell

Start the week more informed

This week we continue to take a look at public things for a public cloud.

☁️✅⚠️🛑

This issue is part 3 of a 5-part series

  1. Fudge Sunday - Cloud in Public: Status Dashboards
  2. Fudge Sunday - Cloud in Public: Engineering SLO
  3. Fudge Sunday - Cloud in Public: DevCommsOps
  4. Fudge Sunday - Cloud in Public: Mean Time To RCA
  5. Fudge Sunday - Cloud in Public: Impact Mapping

When I wrote about The Perfect Team, I summarized it as three roles: do it, write it down, and think ahead. We now have a historical perspective and definitions for the status dashboards and the Engineering SLO. Next, let’s talk about how “write it down” can be expressed as various forms of communication in DevOps cultures.

DevCommsOps is best described as the purposeful insertion of change management communications into a DevOps culture, and the conspicuous expression of those communications. To unpack that neologism a bit, imagine the things we want (need?) to know about change that is planned, achieved, deferred, or failed, and the outcome that results.
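One way to picture this is as a small record type for a change notice. The sketch below is only an illustration: the `ChangeStatus` and `ChangeNotice` names, their fields, and the sample service are hypothetical and are not taken from any provider's tooling.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeStatus(Enum):
    # The states of change named in the definition above.
    PLANNED = "planned"
    ACHIEVED = "achieved"
    DEFERRED = "deferred"
    FAILED = "failed"


@dataclass(frozen=True)
class ChangeNotice:
    """A hypothetical change management communication."""
    service: str
    summary: str
    status: ChangeStatus
    outcome: str = ""  # filled in once the result of the change is known

    def render(self) -> str:
        """Format the notice as a single changelog-style line."""
        line = f"[{self.status.value.upper()}] {self.service}: {self.summary}"
        if self.outcome:
            line += f" (outcome: {self.outcome})"
        return line


notice = ChangeNotice(
    service="example-storage",  # hypothetical service name
    summary="rotate TLS certificates",
    status=ChangeStatus.ACHIEVED,
    outcome="no customer impact",
)
print(notice.render())
```

Keeping the status as an explicit field means the same record can feed a Changelog, a Release Notes page, or a maintenance notice without rewording.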

Recall that Error Budgets, Uptime, and SLO are simply ways to describe the operational objective of staying up and running, balanced against the innovation demands of developing new features, functionality, and availability for services. As such, DevCommsOps provides a consistent and conspicuous account of the changes planned, taking place, and completed that draw against Error Budgets.
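The arithmetic behind that balance can be sketched in a few lines. This is a minimal illustration, assuming an availability SLO measured over a 30-day window; the `ErrorBudget` class, the 99.9% target, and the sample changes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorBudget:
    """Hypothetical ledger of downtime drawn against an availability SLO."""
    slo_target: float      # e.g. 0.999 for "three nines"
    window_minutes: int    # length of the measurement window
    draws: list = field(default_factory=list)  # (description, minutes) pairs

    @property
    def budget_minutes(self) -> float:
        # Allowed downtime is the share of the window the SLO leaves uncovered.
        return (1 - self.slo_target) * self.window_minutes

    def record(self, description: str, downtime_minutes: float) -> None:
        # Each change that causes downtime is written down as a draw.
        self.draws.append((description, downtime_minutes))

    @property
    def remaining_minutes(self) -> float:
        return self.budget_minutes - sum(m for _, m in self.draws)


budget = ErrorBudget(slo_target=0.999, window_minutes=30 * 24 * 60)
budget.record("planned database failover", 5.0)
budget.record("failed config rollout", 12.5)
print(f"budget: {budget.budget_minutes:.1f} min")
print(f"remaining: {budget.remaining_minutes:.1f} min")
```

With a 99.9% target over 30 days, the budget works out to roughly 43.2 minutes, so the two sample draws leave about 25.7 minutes. The list of draws is the "write it down" part: each entry is a change that can be communicated alongside the budget it consumed.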

Is DevCommsOps a word soup for Changelog, Release Notes, and Error Budget tracking? Perhaps! In practice, much like the growing depth of status dashboards, a single Changelog is more symbolic than practical as a single page to follow all change.

Is DevCommsOps a word soup for a post-ChatOps world within the context of an Error Budget economic policy? Perhaps! However, ChatOps definitions are likely to vary from vendor to vendor and among practitioner pioneers.

Luckily, there’s always a cat meme ready to help us better understand.

Vive La ChatOps!

Capitaine Flam 💫🚀🔥🔥🔥

@CapitaineFlam4

Rare image of a reproduction of the ChatOps pyramid https://t.co/KLEYyqyTkL

2:02 PM - 13 Jul 2021

DevCommsOps in practice

  1. Who do cloud companies “write it down” for? The public? Personalized audiences?
  2. What do cloud companies “write it down”?
  3. Where do cloud companies “write it down”?
  4. When do cloud companies “write it down”?
  5. Why do cloud companies “write it down”?

Let’s take 1-3 in this issue and leave 4-5 for our following issues in the series.

To provide examples, let’s examine where DevCommsOps is found within the hyperscale cloud service providers today using a basic search for “Release Notes,” “Changelog,” “Notices / Maintenance / Announcements,” and “Root Cause Analyses (RCAs) / Incidents.” The list is in no particular order or weighting other than shorter names to longer names.

IBM Cloud

Alibaba Cloud

Microsoft Azure

Amazon Web Services

Google Cloud Platform

Oracle Cloud Infrastructure

Notes:

  • As of this brief exercise, the only hyperscale cloud service provider that appears to have a “single page” approach to Release Notes and Changelog is Oracle Cloud Infrastructure.
  • Compared to AWS’s use of the term “major,” Google Cloud Platform “incidents,” Oracle Cloud Infrastructure “incidents,” and Microsoft Azure RCAs are more granular and historically accessible IMHO.
  • OCI Status appears to be using Atlassian Statuspage.
  • IBM Cloud publishes incident reports as PDFs.

While there are variations amongst the hyperscalers in expressing DevCommsOps, it is essential to consider that personalization is less transparent from a public perspective. Personalized views are outside of the examples above because they are not public representations.

At the same time, personalized views are unique to the customer experience, which is a topic for our next issue, related to the time to publish communications and dependency mapping.

At this point, we have established definitions for status dashboards and the Engineering SLO set against the backdrop of communications of DevOps culture in the form of DevCommsOps. Now we have a baseline to look at for comparison against timing and dependencies.

In the remaining two issues of the series, we will examine the time involved in publishing “Root Cause Analyses (RCAs) / Incidents” and the value of dependency mapping. We will also look at the increasing importance of dependency mapping for the future. The answers to “When and Why” from questions 4-5 above are coming soon.

Stay tuned!

Disclosure

I am linking to my disclosure.
