Abhiram Karanth

How I built a horizontally scalable chat server in Go


It's not about sending messages fast.
It's about doing three things at once:
— deliver in real time
— preserve order + durability
— scale without sticky sessions

Project: Signal-Flow

GitHub: https://github.com/abhiram-karanth-core/signal-flow

Demo: https://global-chat-app-web.vercel.app/

Here's how I broke it down.


The key insight: stop treating a chat message as one thing.

Every message has a lifecycle with very different requirements at each stage. Collapsing them into one system is where most chat servers break down at scale.


The lifecycle:

  1. User sends a message (Next.js)
  2. Go WebSocket server receives it
  3. Made durable (Kafka)
  4. Fanned out to other users (Redis)
  5. Written to queryable history (Postgres)

Each step needs a different system. Each system does exactly one job.
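
To keep the rest concrete, here's a minimal message envelope. The field names are my own sketch of the shape, not necessarily what Signal-Flow uses:

```go
package main

import "time"

// Message is the envelope that travels through every stage of the pipeline.
// Field names are illustrative; the real schema lives in the Signal-Flow repo.
type Message struct {
	ID        string    `json:"id"`
	RoomID    string    `json:"room_id"` // Kafka partition key, Redis channel key
	SenderID  string    `json:"sender_id"`
	Content   string    `json:"content"`
	CreatedAt time.Time `json:"created_at"`
}
```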


The first big lesson: real-time delivery and durable storage have opposing needs.

Real-time = sub-millisecond latency
Durable = safety, ordering, replayability

Forcing one system to do both means compromising on both. So don't.


Kafka is the source of truth. Every message hits Kafka first — before fan-out, before anything.

But Kafka doesn't deliver messages to users. It exists so the system can survive failures and recover state.

Correctness lives here. Latency does not.
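
A rough sketch of that ingest path, assuming segmentio/kafka-go (the repo may use a different client, and the topic name is mine):

```go
package main

import (
	"context"
	"encoding/json"

	"github.com/segmentio/kafka-go"
)

// One shared writer; kafka.Hash keys messages so the same room always
// lands on the same partition.
var writer = &kafka.Writer{
	Addr:     kafka.TCP("localhost:9092"),
	Topic:    "chat-messages",
	Balancer: &kafka.Hash{},
}

// publish is the only thing the WebSocket ingest path does with a message:
// make it durable. Fan-out happens later, from a consumer.
func publish(ctx context.Context, msg Message) error {
	payload, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	return writer.WriteMessages(ctx, kafka.Message{
		Key:   []byte(msg.RoomID),
		Value: payload,
	})
}
```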


Why partition by room?

Message ordering matters — but only within a room, not globally.

Kafka guarantees ordering per partition, so messages are partitioned by room_id.

Each room maps deterministically to a single partition, preserving strict ordering for that room while allowing horizontal throughput across rooms.

Per partition → ~5k msg/sec

N partitions → ~N × 5k msg/sec across independent rooms

This keeps chat semantics correct and scales throughput without global bottlenecks.
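
The routing itself is just a stable hash of the key. Kafka's partitioner isn't literally FNV, but the idea looks like this:

```go
package main

import "hash/fnv"

// partitionFor illustrates key-based partitioning: the same room_id always
// hashes to the same partition, so per-room ordering is preserved.
// (Illustrative only; the real mapping happens inside the Kafka client.)
func partitionFor(roomID string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(roomID))
	return int(h.Sum32()) % numPartitions
}
```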


Redis makes it feel instant.

It sits on the hot path — performing room-based fan-out so that each WebSocket connection independently receives messages for its subscribed room.

Redis is intentionally ephemeral. If it drops a message: Kafka still has it. Clients recover on reconnect. History stays correct.
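
A sketch of the Kafka-to-Redis bridge, assuming go-redis and one pub/sub channel per room (channel naming and consumer group name are my own):

```go
package main

import (
	"context"
	"encoding/json"

	"github.com/redis/go-redis/v9"
	"github.com/segmentio/kafka-go"
)

// bridgeKafkaToRedis reads the durable log and re-publishes each message
// onto the room's Redis channel. The publish is fire-and-forget: if it is
// lost, Kafka and Postgres still have the message.
func bridgeKafkaToRedis(ctx context.Context) error {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "redis-fanout", // independent of the Postgres consumer group
		Topic:   "chat-messages",
	})
	defer reader.Close()

	for {
		m, err := reader.ReadMessage(ctx)
		if err != nil {
			return err
		}
		var msg Message
		if err := json.Unmarshal(m.Value, &msg); err != nil {
			continue // malformed payload: skip, the log still has the record
		}
		rdb.Publish(ctx, "room:"+msg.RoomID, m.Value)
	}
}
```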


Postgres makes it usable.

Kafka consumers write to Postgres asynchronously. This means:
— DB slowness never blocks ingestion
— State can be rebuilt from Kafka replay
— Frontend reloads always return consistent history

Postgres is the materialized view users actually interact with.
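
A sketch of that consumer, assuming database/sql and a messages table that mirrors the envelope above (schema and driver are my assumptions). The ON CONFLICT clause is what makes Kafka replays safe to re-apply:

```go
package main

import (
	"context"
	"database/sql"
	"encoding/json"

	_ "github.com/lib/pq" // driver choice is an assumption
	"github.com/segmentio/kafka-go"
)

// persistMessages writes the durable log into the queryable history.
// It runs in its own consumer group, so a slow or down database never
// blocks ingestion or real-time fan-out.
func persistMessages(ctx context.Context, db *sql.DB) error {
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "postgres-writer",
		Topic:   "chat-messages",
	})
	defer reader.Close()

	for {
		m, err := reader.FetchMessage(ctx)
		if err != nil {
			return err
		}
		var msg Message
		if err := json.Unmarshal(m.Value, &msg); err == nil {
			if _, err := db.ExecContext(ctx,
				`INSERT INTO messages (id, room_id, sender_id, content, created_at)
				 VALUES ($1, $2, $3, $4, $5)
				 ON CONFLICT (id) DO NOTHING`,
				msg.ID, msg.RoomID, msg.SenderID, msg.Content, msg.CreatedAt); err != nil {
				continue // offset not committed: the write is retried after restart
			}
		}
		reader.CommitMessages(ctx, m)
	}
}
```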


The most important architectural decision: heavy decoupling between consumers.

Kafka → Redis (real-time fan-out)
Kafka → Postgres (durable writes)

Producer publishes once. Doesn't care who consumes. Each consumer fails and restarts independently.

No cascading failures.
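
One way to picture that independence (wiring is mine; in practice these may well be separate deployments rather than goroutines):

```go
package main

import (
	"context"
	"log"
	"time"
)

// runForever restarts a consumer whenever it dies. A crash in one consumer
// never touches the other: each reads Kafka at its own pace, in its own group.
func runForever(name string, consume func(context.Context) error) {
	for {
		if err := consume(context.Background()); err != nil {
			log.Printf("%s stopped: %v, restarting", name, err)
			time.Sleep(time.Second)
		}
	}
}

// Wiring (goroutines shown for brevity; separate processes work the same way):
//   go runForever("redis-fanout", bridgeKafkaToRedis)
//   go runForever("postgres-writer", func(ctx context.Context) error {
//       return persistMessages(ctx, db)
//   })
```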

Kafka → Redis (room-based fan-out) → WebSocket connections (one per client)
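
The last hop, sketched with gorilla/websocket (library choice and route shape are my assumptions). Only the delivery direction is shown; incoming messages go through the publish path above:

```go
package main

import (
	"net/http"

	"github.com/gorilla/websocket"
	"github.com/redis/go-redis/v9"
)

var upgrader = websocket.Upgrader{CheckOrigin: func(r *http.Request) bool { return true }}

// wsHandler ties one WebSocket connection to one room's Redis channel.
// No per-user state lives on the Go server, so any instance can serve any client.
func wsHandler(rdb *redis.Client) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		roomID := r.URL.Query().Get("room")
		conn, err := upgrader.Upgrade(w, r, nil)
		if err != nil {
			return
		}
		defer conn.Close()

		sub := rdb.Subscribe(r.Context(), "room:"+roomID)
		defer sub.Close()

		for msg := range sub.Channel() {
			if err := conn.WriteMessage(websocket.TextMessage, []byte(msg.Payload)); err != nil {
				return // client gone; history in Kafka/Postgres is unaffected
			}
		}
	}
}
```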

Why it scales horizontally:

— Go servers are stateless (no sticky sessions)
— Real-time delivery is decoupled from durability
— Each component has one job
— Any component can rebuild from Kafka replay

Scaling becomes an ops concern. Not an app rewrite.


The frontend never sees any of this.

WebSockets → real-time updates via Redis
HTTP APIs → message history from Postgres
Kafka stays internal. Clean boundary.
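
The history side of that boundary, sketched as a plain HTTP handler over the same messages table (route and query are my assumptions):

```go
package main

import (
	"database/sql"
	"encoding/json"
	"net/http"
)

// historyHandler serves message history straight from Postgres.
// Kafka and Redis never leak across this boundary.
func historyHandler(db *sql.DB) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		roomID := r.URL.Query().Get("room")
		rows, err := db.QueryContext(r.Context(),
			`SELECT id, room_id, sender_id, content, created_at
			   FROM messages
			  WHERE room_id = $1
			  ORDER BY created_at
			  LIMIT 100`, roomID)
		if err != nil {
			http.Error(w, "query failed", http.StatusInternalServerError)
			return
		}
		defer rows.Close()

		history := []Message{}
		for rows.Next() {
			var m Message
			if err := rows.Scan(&m.ID, &m.RoomID, &m.SenderID, &m.Content, &m.CreatedAt); err != nil {
				continue
			}
			history = append(history, m)
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(history)
	}
}
```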


Don't build a chat server. Build three systems that each do one thing well:

Kafka → correctness
Redis → latency
Postgres → usability

Everything else is trade-offs.
