
March 2, 2026
Microservices don’t fail because of Ruby.
They fail because of architecture.
Most “microservices” I see in Ruby are:
• HTTP chains tightly coupled together
• Shared databases behind the scenes
• No tracing
• No event replay
• No contract validation
That’s not distributed architecture.
That’s a distributed monolith.
🧠 What Production-Grade Actually Means
A real microservices system must tolerate:
• Partial failures
• Event replay
• Duplicate messages
• Independent deployments
• Horizontal scaling
If you cannot replay events or trace a request across services, your system is fragile.
🏗 The Stack I Would Deploy Today

Per service:
• Rails (API mode)
• PostgreSQL (one DB per service)
• Redis
• Sidekiq
• Kafka (with Karafka)
• OpenAPI contracts
• Prometheus metrics
• OpenTelemetry tracing
• Docker + Kubernetes
That’s a serious, production-ready Ruby microservices architecture.
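As a rough illustration of the contract-validation idea, here is a plain-Ruby sketch that checks an event payload against an expected shape. The `ORDER_CREATED_CONTRACT` constant and the `contract_violations` helper are hypothetical names invented for this example. A real system would more likely lean on OpenAPI or JSON Schema tooling than on a hand-rolled check like this one.

```ruby
# Hypothetical contract for an "order.created" event:
# field name => class the value is expected to be.
ORDER_CREATED_CONTRACT = {
  "id"    => Integer,
  "total" => Numeric
}.freeze

# Returns a list of human-readable violations.
# An empty list means the payload conforms to the contract.
def contract_violations(payload, contract)
  contract.each_with_object([]) do |(field, type), errors|
    if !payload.key?(field)
      errors << "missing field: #{field}"
    elsif !payload[field].is_a?(type)
      errors << "#{field} should be #{type}, got #{payload[field].class}"
    end
  end
end

valid   = contract_violations({ "id" => 1, "total" => 9.99 }, ORDER_CREATED_CONTRACT)
invalid = contract_violations({ "id" => "1" }, ORDER_CREATED_CONTRACT)

puts valid.inspect
puts invalid.inspect
```

Running a check like this in both the producer's and the consumer's test suite is one way to catch a breaking payload change before it is deployed.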
📦 Example: Orders & Billing Microservices
Let’s make this concrete.
Imagine two services:
1️⃣ Orders Service
Responsible for:
• Creating orders
• Persisting order state
• Publishing events
rails new orders_service --api -T -d postgresql
When an order is created:
# app/services/create_order.rb
class CreateOrder
  def call(params)
    # The order and its outbox row commit or roll back together.
    Order.transaction do
      order = Order.create!(params)
      OutboxEvent.create!(
        event_type: "order.created",
        payload: order.as_json
      )
      order
    end
  end
end
A Sidekiq job publishes the outbox event to Kafka:
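class PublishOutboxJob
  include Sidekiq::Worker

  def perform(event_id)
    event = OutboxEvent.find(event_id)
    # Skip rows that an earlier attempt already published.
    return if event.published?

    Karafka.producer.produce_sync(
      topic: "orders",
      payload: event.payload.to_json
    )
    # If the job dies before this line, the retry produces the event again:
    # delivery is at-least-once, so consumers must be idempotent.
    event.update!(published: true)
  end
end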
With retries on the publishing job, this gives you:
✔ No lost events (delivery is at-least-once, so duplicates are possible)
✔ No dual-write problem
✔ Replay capability
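To see why the outbox pattern delivers at-least-once rather than exactly-once, here is a minimal in-memory sketch in plain Ruby. `OutboxRow`, `FakeBroker`, and `relay_once` are hypothetical stand-ins for the outbox table, Kafka, and the publishing job. None of this is Karafka's API.

```ruby
# In-memory stand-ins (hypothetical names): a row in the outbox table
# and a broker that simply records what it was sent.
OutboxRow = Struct.new(:id, :payload, :published)

class FakeBroker
  attr_reader :messages

  def initialize
    @messages = []
  end

  def produce(payload)
    @messages << payload
  end
end

# Publish every unpublished row in id order, marking each row only after
# the broker has accepted it. `crash_after_produce` simulates the process
# dying between the produce call and the update.
def relay_once(rows, broker, crash_after_produce: false)
  rows.reject(&:published).sort_by(&:id).each do |row|
    broker.produce(row.payload)
    raise "simulated crash" if crash_after_produce

    row.published = true
  end
end

rows   = [OutboxRow.new(1, { "id" => 42, "total" => 100 }, false)]
broker = FakeBroker.new

begin
  relay_once(rows, broker, crash_after_produce: true)
rescue RuntimeError
  # The row is still marked unpublished, so the next run picks it up again.
end

relay_once(rows, broker)

puts broker.messages.size # 2: the same event was produced twice
puts rows.first.published # true
```

The event is never lost, but it can arrive twice, which is why the consuming side has to be idempotent.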
2️⃣ Billing Service
Consumes events:
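class OrdersConsumer < ApplicationConsumer
  def consume
    messages.each do |message|
      payload = message.payload
      # Idempotency check: skip orders that were already billed.
      # A unique index on billing_records.order_id is still needed to
      # close the race between exists? and create!.
      next if BillingRecord.exists?(order_id: payload["id"])

      BillingRecord.create!(
        order_id: payload["id"],
        amount: payload["total"]
      )
    end
  end
end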
Notice:
• No HTTP call to Orders
• No shared DB
• Idempotent consumer
• Fully decoupled
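If Billing is down, Kafka retains the event for as long as the topic's retention policy allows.
When Billing comes back, its consumer group resumes from the last committed offset and processes everything it missed.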
That’s distributed resilience.
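A small plain-Ruby sketch of why idempotency makes replay safe. The `BillingLedger` class is hypothetical: its Hash stands in for a billing table with a unique constraint on the order ID, and it is not the Billing service's real code.

```ruby
# Hypothetical in-memory ledger keyed by order id. The Hash plays the role
# of a unique index: a second write for the same order is ignored.
class BillingLedger
  def initialize
    @records = {}
  end

  # Returns true if a record was created, false if it already existed.
  def record(payload)
    order_id = payload.fetch("id")
    return false if @records.key?(order_id)

    @records[order_id] = payload.fetch("total")
    true
  end

  def size
    @records.size
  end

  def total_billed
    @records.values.sum
  end
end

events = [
  { "id" => 1, "total" => 50 },
  { "id" => 2, "total" => 75 },
  { "id" => 1, "total" => 50 } # duplicate delivery of order 1
]

ledger = BillingLedger.new
events.each { |event| ledger.record(event) }
events.each { |event| ledger.record(event) } # full replay of the topic

puts ledger.size         # 2
puts ledger.total_billed # 125
```

However many times the same events are delivered, the ledger ends in the same state, so a replay after an outage cannot double-bill an order.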
🔎 Observability Is Not Optional
Every service must include:
• Structured JSON logging
• Correlation IDs
• Prometheus metrics
• OpenTelemetry tracing
Without tracing, debugging microservices becomes archaeology.
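The first two items on that list can be sketched with nothing but Ruby's standard library. The `build_json_logger` and `correlation_id_from` helpers are hypothetical names, and `X-Correlation-ID` is a common convention rather than a standard. Treat this as a starting point, not a replacement for a logging gem or OpenTelemetry.

```ruby
require "json"
require "logger"
require "securerandom"
require "stringio"
require "time"

# Build a logger that writes one JSON object per line.
def build_json_logger(io, service:)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, time, _progname, msg|
    fields = msg.is_a?(Hash) ? msg : { message: msg.to_s }
    base   = { ts: time.utc.iso8601(3), level: severity, service: service }
    JSON.generate(base.merge(fields)) + "\n"
  end
  logger
end

# Reuse the caller's correlation ID if one was sent, otherwise start a new one.
def correlation_id_from(headers)
  headers["X-Correlation-ID"] || SecureRandom.uuid
end

io     = StringIO.new
logger = build_json_logger(io, service: "orders")

incoming_headers = { "X-Correlation-ID" => "abc-123" }
cid = correlation_id_from(incoming_headers)

logger.info({ correlation_id: cid, event: "order.created", order_id: 42 })

line = JSON.parse(io.string.lines.last)
puts line["correlation_id"] # abc-123
puts line["level"]          # INFO
```

If the same ID is forwarded on every outgoing HTTP call and copied into every published event, the logs of all services can be joined on a single field.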
🔥 When I Would Choose Something Else
If I need:
• Ultra-low latency
• Millions of requests per second
• Heavy compute workloads
I’d choose Go or Rust.
But for complex business-domain systems?
Ruby + Rails is still one of the most productive and balanced choices available.
🎯 Final Thought
Microservices are not about splitting code.
They’re about managing failure.
Rails is not too slow. Kafka is not overkill. Observability is not optional.
The real question is:
Are you building services — or are you building systems that survive failure?
