DEV Community

Site Reliability Engineering

Top Reliability and Scaling Practices from Experts at Citrix, Greenlight Financial, and Incognia

Reactions 2
14 min read
Reliability as an Inseparable Part of Software Engineering

Reactions 3
5 min read
Kubernetes gone bust. Now what?

Reactions 4
4 min read
Localizer: An adventure in creating a reverse tunnel/tunnel manager for Kubernetes

Reactions 5
8 min read
Here are the Top Predictions for SRE in 2021

Reactions 3
6 min read
Argo CD

Reactions 6
3 min read
AWS Project: Deploying a Static Website to AWS

Reactions 4
1 min read
The Engineer's Guide to Preparing for Black Friday 2020

Reactions 2
8 min read
Choosing SLOs that users need, not the ones you want to provide

Reactions 6
6 min read
Blameless Book Club: Implementing Service Level Objectives, Part 1

Reactions 6
7 min read
Debugging incidents in Google's Distributed Systems

Reactions 1
2 min read
Resilience Engineering and Life

Reactions 3
4 min read
Testing ML incident detection using a cloud native microservices app

Reactions 11
10 min read
Operational Readiness Review Template

Reactions 6
7 min read
Google Down worldwide | Why is Google Down? Let's break it down

Reactions 15
4 min read
SREview Issue #7 November 2020

Reactions 4
2 min read
Making Instrumentation Extensible

Reactions 5
7 min read
SREview Issue #8 December 2020

Reactions 4
2 min read
Challenges with Implementing SLOs

Reactions 3
11 min read
How to SRE without an SRE on your team

Reactions 2
10 min read
Honeycomb SLO Now Generally Available: Success, Defined.

Reactions 6
7 min read
Working Toward Service Level Objectives (SLOs), Part 1

Reactions 6
5 min read
Creating Chaos and a Giveaway ⚒ 🎁

Reactions 19 Comments 6
2 min read
Top Open Source projects for SREs and DevOps

Reactions 5
7 min read
The Operational Excellence Collection

Reactions 4
1 min read
Yury Niño Roa Shares her Insights on Chaos Engineering and SRE

Reactions 2
7 min read
How an SRE became an Application Security Engineer (and you can too)

Reactions 5
8 min read
My Top 5 Books for DevOps/SRE

Reactions 3
4 min read
LitmusChaos: A Reflection On The Past Six Months

Reactions 19
15 min read
3 Ways SRE Can Boost your Business Value

Reactions 3
6 min read
Essence of Terraform

Reactions 32 Comments 1
3 min read
Building on observability

Reactions 4
2 min read
SREview Issue #6 October 2020

Reactions 4
2 min read
The Future of Ops Careers - Honeycomb

Reactions 6
8 min read
The Resilient Architecture Collection

Reactions 14
2 min read
DevOps 2021: Paving your way into SRE

Reactions 10
6 min read
Creativity in the Ops

Reactions 3 Comments 1
3 min read
Intro to o11ycast: A Human Perspective on the Role of Observability

Reactions 2
6 min read
Error Budgeting & Site Reliability Engineering

Reactions 6
5 min read
GCP DevOps Certification - Day Eight

Reactions 2
2 min read
From Sysadmin to SRE

Reactions 8
7 min read
Engineers, Stop Hoarding your Metrics

Reactions 2
5 min read
GCP DevOps Certification - Day Five

Reactions 2
2 min read
Building Reliability Through Culture with Veteran Google SRE, Steve McGhee

Reactions 4
6 min read
GCP DevOps Certification - Day Six

Reactions 3
3 min read
5 Best Practices for Nailing Incident Retrospectives

Reactions 10
6 min read
Disposable Kubernetes clusters

Reactions 14
5 min read
SRE + Honeycomb: Observability for Service Reliability

Reactions 8
11 min read
The Ultimate, Free Incident Retrospective Template

Reactions 3
6 min read
Changes are a good thing

Reactions 2
4 min read
How to Construct a Reliability Model for your Organization

Reactions 9
6 min read
GCP DevOps Certification - Day Four

Reactions 3
2 min read
Let's stop fooling ourselves. What we call CI/CD is actually only CI.

Reactions 155 Comments 32
5 min read
Learn How to Apply SRE Outside of Engineering with Dave Rensin

Reactions 2
42 min read
Can Security Teams Benefit from SRE? You bet!

Reactions 3
6 min read
Are you Great at Incident Response?

Reactions 2
5 min read
Availability, Maintainability, Reliability: What's the Difference?

Reactions 4
4 min read
SRE for Business Continuity in the Face of Uncertainty

Reactions 2
6 min read
5 On-Call Practices to Help you Sleep through the Night

Reactions 2
5 min read
Getting SRE Buy-in from a Manager or Lead for Incident Response

Reactions 2
5 min read