DEV Community

# reliability

Posts

Going 'reliability first' in 2020

15
Comments
4 min read
Observability is becoming mission critical, but who watches the watchmen?

14
Comments 3
6 min read
Delivering 100% of Webhooks

14
Comments
2 min read
A simple guide to addressing single point of failure (SPOF) while evaluating external tools

12
Comments
5 min read
SRE + Honeycomb: Observability for Service Reliability

12
Comments
11 min read
What Do Reliability, Scalability, and Maintainability Mean?

11
Comments 1
3 min read
Ensuring reliability: SLOs, on-call process, and postmortems

11
Comments
5 min read
Designing software for the enterprise - Pt.1 Reliability

8
Comments
4 min read
Falando sobre SRE - Parte 01 - Uma breve introdução (Talking about SRE - Part 01 - A brief introduction)

8
Comments
7 min read
Engineering Ideas #8

7
Comments
6 min read
Availability Service Level Calculation

7
Comments
5 min read
Engineering Ideas #6

6
Comments
6 min read
Reliability Restaurant – How to approach software reliability as a mindset

6
Comments 1
14 min read
Improve Resilience with Controlled Chaos Engineering

6
Comments
1 min read
ETCD Cluster and Non-voting Learners

6
Comments
3 min read
Reliability: ETCD Cluster

6
Comments
2 min read
How SLOs Help Evernote's SRE Team Manage Tech Debt

6
Comments
6 min read
How does chaos engineering relate to the mathematical definitions of chaos?

5
Comments
3 min read
How our team improved perceived reliability of Kaggle Notebooks

5
Comments 1
5 min read
3 Things You Must Do When Calling Third-Party APIs

4
Comments
3 min read
Error Economics - How to avoid breaking the budget

3
Comments
7 min read
Drift Into Failure by Sidney Dekker – notes on the book

3
Comments
14 min read
The Closed Loop

3
Comments
3 min read
Reliability Engineering: Two Mistakes High

3
Comments 1
4 min read
Bringing reliability closer to you with Reliably and DataDog

3
Comments
7 min read
What about off-grid programming?

3
Comments
2 min read
60 Years of the IBM System/360: A Legacy of Reliability and Security

2
Comments 1
2 min read
What are the most important features you need in your logging product?

2
Comments 1
1 min read
Why Elixir?

2
Comments
1 min read
Learning from other industries, part 2 of n

2
Comments
3 min read
Reliability concepts: Availability, Resiliency, Robustness, Fault-Tolerance, and Reliability

2
Comments
1 min read
System Design : Reliability

1
Comments
3 min read
Managing Reliability With SLOs and Error Budgets

1
Comments
5 min read
How to Measure System Reliability

1
Comments
4 min read
SLO Anti-Patterns: Real-World Lessons

1
Comments
3 min read
"Building Secure and Reliable Systems": How Google's Approach to Security and Reliability Can Benefit Your Organization

1
Comments
3 min read
Building Resilient Software Architecture: Lessons Learned from the Domino Game

1
Comments
2 min read
10 Most Effective Strategies to ensure reliability of the system

1
Comments
2 min read
10 most important Metrics you must know as a DevOps Engineer

1
Comments 2
2 min read
Devops Shorts 001 - Tobias Kunze

1
Comments
2 min read
Azure Site Recovery

1
Comments
2 min read
Reliability in Legacy Software

1
Comments
3 min read
Multi-Version Rollouts

Comments
7 min read
How to design Reliable Microservice Chains using the principles of Systems Thinking.

Comments
4 min read
Saving 30% on costs and improve infrastructure reliability with profiling

Comments
2 min read
How security, reliability, and design teams can get other teams to do work for them -- the Objective Expert Model

Comments
11 min read
Has Facebook outgrown "Move fast and break things"?

Comments
2 min read
Delinearized Rollouts

Comments
3 min read
PagerDuty Community Update: November 18, 2022

Comments
3 min read
5 key points about Immutable Infrastructure

Comments
1 min read
SRE book notes: Introduction to Site Reliability Engineering

Comments
3 min read
Building Resilient Systems on AWS: Avoiding Common Errors with the Well-Architected Framework

Comments
5 min read
Submitting Changes

Comments
5 min read
Lessons in Reliability: Margaret Hamilton's Software Engineering Approach

Comments
2 min read