DEV Community

# observability

Gaining deep insights into system behavior through metrics, logs, and traces.

Posts

How a Missing Trace Led Me to Build a Local Observability Stack

10 min read
SwiftUI App Health Dashboard Architecture (Internal Telemetry UI)

2 min read
How to Convert Logs to Metrics: A Practical Guide with OpenObserve Pipelines

4 min read
FastAPI + OpenTelemetry: Stop Debugging with grep (Use Distributed Tracing)

3 min read
Is Elixir’s Observability Ready for Production? A Guide for Skeptical Engineers

10 min read
Docker Monitoring Without a Platform: docker stats + cgroups (DevOps)

3 min read
Reliability vs Uptime: Why Availability Fails at Scale

3 min read
SwiftUI Crash Reporting & Incident Triage Architecture (Production Reality)

2 min read
Beyond Basic Logs: Implementing Custom Observability for n8n Workflows

6 min read
Mitigating 'Scraping Shock': Engineering Cost-Aware Data Pipelines

5 min read
A Measurable Snapchat Proxy Validation Mini Lab You Can Run This Week

6 min read
Retry Logic Is a Policy Decision, Not a Code Pattern

2 min read
Essential Strategies to Monitor and Observe Production Node.js Applications

1 min read
Designing Resilient Systems: From Failure Domains to Long-Lived Software

1 min read
The new Ably dashboard: realtime visibility in your hands

4 min read
SwiftUI Logging & Observability Architecture (Production-Grade)

2 min read
Logs, Metrics, and Traces: What They Are and When to Use Each

4 min read
Debugging Microservices Like a Pro: How Trace IDs Saved My Production Incident

1 min read
NVIDIA GPU Monitoring: Catch Thermal Throttling Before It Costs You $50k/Year

7 min read
Why your system can be 100% up and still completely broken

2 min read
🐦‍🔥 Weekly Flamehaven Patch Report — this week was a “stack alignment” week.

2 min read
Observability Isn’t Understanding — Why We Still Don’t Know Our Systems

3 min read
From Prometheus to ARMS: How We Simplified Observability for a Multi-Tier App on Alibaba Cloud

3 min read
Composite SLOs for Serverless Event-Driven Systems

5 min read
Datadog + AWS: Observability Maturity Model 2026

8 min read