DEV Community

Site Reliability Engineering

Posts

Books That Helped Me Become a Tech Lead

336
Comments 32
10 min read
What is DevOps? REALLY understand it

267
Comments 4
12 min read
5 DevOps Books to Read for FREE

204
Comments 7
2 min read
#DevOps for noobs - Reverse Proxy

192
Comments 12
3 min read
Let's stop fooling ourselves. What we call CI/CD is actually only CI.

161
Comments 32
5 min read
For the Love of Bleep! Building a Scalable Monitoring System

140
Comments 12
6 min read
April Fools and the Broken Promises of One-off Hacks

129
Comments 8
4 min read
Making On-Call Not Suck

127
Comments 17
7 min read
If you’re not using SSH certificates you’re doing SSH wrong | Episode 2: Certificates improve usability, operability, & security

111
Comments 4
6 min read
#DevOps for noobs - Requests vs. limits in Kubernetes

94
Comments 15
2 min read
SRE and Tasks of an SRE explained ✅

90
Comments 2
13 min read
DevOps vs SRE: What's The Difference?

79
Comments 2
4 min read
Three Tips To Understand Chaos Engineering

77
Comments 5
5 min read
Switching From Resque to Sidekiq

77
Comments 7
7 min read
3 fundamental monitoring methods essential for every DevOps engineer 🚀💥

72
Comments
4 min read
Introduction to Thanos!

72
Comments 1
5 min read
How does deployment work at your organization?

71
Comments 73
1 min read
Have you considered Site Reliability Engineering as a path?

66
Comments 12
1 min read
What You Need to Break into DevOps and SRE

64
Comments
3 min read
4 YouTube Resources to Get Started with Kubernetes

59
Comments
2 min read
7 Site Reliability lessons from Google and Amazon

53
Comments
6 min read
The Night Before Code Freeze

53
Comments 1
4 min read
DevOps vs. Site Reliability Engineering (SRE)

52
Comments
31 min read
Facebook is down, discuss...

49
Comments 43
1 min read
How do you wrap your head around observability?

49
Comments 13
1 min read
Top 13 open source Application Performance Monitoring(APM) tools in 2021

48
Comments 1
12 min read
End-to-End Monitoring with Grafana Cloud with Minimal Effort

44
Comments
12 min read
10 GitHub Repositories That Help You Become A Better DevOps Engineer

41
Comments 3
3 min read
I’m a certified Associate Cloud Engineer!

40
Comments 5
4 min read
Why SREs Should be Responsible for Development Environments

39
Comments 13
5 min read
If you’re not using SSH certificates you’re doing SSH wrong | Episode 1: Keys versus Certificates

37
Comments
5 min read
Are you the lonely DevOps engineer doing 24/7 on-call? Change it!

36
Comments 1
3 min read
Dreams and Nightmares of Ops

34
Comments 2
10 min read
Essence of Terraform

33
Comments 1
3 min read
Kafka Chaos Engineering With Litmus

33
Comments
10 min read
If you’re not using SSH certificates you’re doing SSH wrong | Episode 3: An ideal SSH flow

31
Comments 2
5 min read
Chaos Engineering for cloud-native systems

30
Comments
4 min read
AWS VPC 101

30
Comments
10 min read
Chaos Workflows with Argo and LitmusChaos

30
Comments 1
8 min read
Deep Dive into Docker Internals - Union Filesystem

30
Comments
10 min read
Site Reliability Engineering (SRE) Best Practices

30
Comments 1
8 min read
What happens when Amazon accidentally sends all of their support traffic your way?

28
Comments 3
3 min read
Quick, Pretty and Easy Maintenance Page using Cloudflare Workers & Terraform

28
Comments
3 min read
SRE DevOps Interview Questions — Linux Troubleshooting

27
Comments 3
7 min read
Keeping the Stakes Low while Breaking Production

27
Comments 5
4 min read
#90DaysOfDevOps - Day 1

26
Comments 4
4 min read
Practical Nix Flakes

25
Comments
15 min read
Managing CNAMEs with Azure Resource Manager Templates

25
Comments
3 min read
5 Tips for Getting Alert Fatigue Under Control

25
Comments 1
9 min read
SigNoz : Open-source alternative to DataDog

24
Comments 2
3 min read
Introduction to LitmusChaos

24
Comments
11 min read
Questions To Ask Yourself Before Accepting A Software Engineering Role That Involves On Call Duties

23
Comments
3 min read
Feelings during incident response

23
Comments
3 min read
15 DevOps and SRE Tools you Should Know About in 2023

22
Comments
7 min read
10 Things I wish I’d known before building a Kubernetes CRD controller

22
Comments
8 min read
Best practices for Kubernetes security; scaling write-heavy productions; & SRE

22
Comments
2 min read
LitmusChaos: A Reflection On The Past Six Months

21
Comments
15 min read
Testing Infrastructure at ✨ Corp, a DevOps Story

20
Comments 2
6 min read
Towards Operational Excellence — Part 1

20
Comments
10 min read
Litmus 2.0 - Simplifying Chaos Engineering for Enterprises

19
Comments
3 min read