Originally published on Failure is Inevitable.
BOO! Did we scare you? We couldn’t help it, we’re just so happy it’s spooky season. Here’s the October issue of SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.
Take our survey on SRE Maturity & SLO Adoption: it takes only 5-10 minutes, and five lucky winners will each receive a $100 Amazon gift card!
Senior Oops Engineer: distributed systems is an attempt to answer the question "is it possible for something to be broken and still work at the same time"
21:11 - 22 Sep 2020
Brenda Wallace, Potato Enthusiast: My favourite metaphor for tech debt is dishes. you gotta do the dishes, not just the cooking.
If the dishes are done, then when the customer orders a new thing you can make it right away.
00:07 - 23 Sep 2020
Charity Majors: Maybe I need to write a blog post called "On Call For Managers". If you're asking engineers to be on call for their code -- and you should -- you owe in return:
- enough time to fix what's broken
- hands to do the work
- closely track how often they are interrupted/woken
- ..etc
21:24 - 25 Sep 2020
The Comprehensive Guide on SLIs, SLOs, and Error Budgets: This 27-page guide walks through how to set SLIs and SLOs that matter to make data-informed decisions.
Here's your Complete Definition of Software Reliability: In this blog post, we’ll break down what software reliability means in terms of perception, team operation, and customer happiness.
Alerting on SLOs: Glitch’s Mads Hartmann writes about the team’s progress in adopting SLOs, including the motivation behind implementation.
How to Construct a Reliability Model for your Organization: In this post, we’ll construct a basic reliability model and show you how to create one for your own organization.
Four Things I Wish I Knew as the New CTO of a Startup: Isabel Nyo writes about her experiences as CTO of a small startup and key lessons learned over the course of a year.
Blameless automates toil and creates guardrails during incidents, streamlines learning from incidents, and much more.
Try out our free sandbox today.
Blameless Bi-Weekly Demo October 20 at 8 AM PST: Check out a live demo of Blameless as we walk you through operations best practices, and get your questions answered.
Unscripted Conference October 21-22: DevOps practitioners and technology leaders come together to learn and share stories of simplified software delivery at scale, but with a twist.
Achieving Zero Downtime October 22 at 10 AM PST: Learn from Cindy Sridharan how to conduct zero-downtime deployments at the latest 99 Percent DevOps Talk from Lightstep.
If you’re looking to share your insights with the SRE and resilience engineering community, we’d love to partner with you on content. Fill out our form here and we’ll reach out!