DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Wild Ride from Raw Syscalls to Figuring Out NSS and libc

1
Comments
4 min read
Docker Monitoring Without a Platform: docker stats + cgroups (DevOps)

Comments
3 min read
Reliability vs Uptime: Why Availability Fails at Scale

5
Comments 1
3 min read
SRE is the BEST Thing Ever

Comments
4 min read
Getting Started with cURL

Comments
4 min read
How I Reduced Production Incidents as a Senior SRE (Without Slowing Releases)

Comments
2 min read
AI-Assisted Incident Triage in Large-Scale Cloud Systems: A Human-Centered Reliability Framework

Comments
3 min read
When Asynchronous Systems Fail Quietly, Reliability Teams Pay the Price

Comments
5 min read
Fallback and Resilient Degradation in APIs with Redis and Circuit Breaker

Comments
8 min read
What a 60-second war-room scan reveals

Comments
3 min read
A Measurable Snapchat Proxy Validation Mini Lab You Can Run This Week

Comments
6 min read
The "DevOps Engineer" is Dead. Long Live the Platform Architect.

5
Comments
2 min read
Debugging Missing Kubernetes Events: A Deep Dive into the Event Spam Filter

Comments
3 min read
DevOps with AI: Who Is in Control of the Pipeline?

Comments
13 min read
Spegel, Pixie, and Why :latest Is Evil

Comments
4 min read
Rotating Residential Proxy Evaluation Mini-Lab You Can Run in 90 Minutes

Comments
6 min read
Workflow Deep Dive

Comments
1 min read
Building a Config Drift Detector for AWS (with Snapshots, Lambdas, and a Next.js Dashboard)

Comments
5 min read
Running Cluster on 100% Spot Instances: How K8s Does It Better Than ECS

Comments
4 min read
Two Terraform Traps That Burned Me: Hidden Defaults & Circular Dependencies

Comments
4 min read
Why Your Engineering Wiki is a Graveyard (And How to Fix It)

Comments
3 min read
How to Make Engineering Knowledge Searchable (A Complete Guide)

1
Comments
3 min read
Shift-Left Reliability

Comments
4 min read
How to pass the CKA Exam on the first try [GUARANTEED]

Comments 1
4 min read
You’re Running EC2 Instances That Do Nothing

1
Comments
2 min read