Engineering Blog

War stories from production, deep dives into infrastructure, and things we learned the hard way.

$ cat /var/log/wisdom | grep --lessons-learned
KUBERNETES · FINOPS · AWS
2026-04-05 | 7 min read

Karpenter + Spot Instances + Scale-to-Zero: How We Cut EKS Costs by 70%

We replaced Cluster Autoscaler with Karpenter, moved 80% of workloads to Spot, and implemented scale-to-zero for non-critical services. Our monthly bill dropped from $47K to $14K.

AI-OPS · SRE · CLAUDE
2026-03-28 | 5 min read

Building an AI Incident Responder That Actually Works

We built an AI agent that reads logs, correlates traces, and suggests fixes before the on-call engineer finishes their coffee. Here's exactly how we did it.

FINOPS · OBSERVABILITY · KUBERNETES
2026-03-15 | 6 min read

The FinOps Dashboard That Stopped Our Cloud Bill From Bleeding

We built a real-time cost visibility dashboard with Grafana, Prometheus, and custom exporters. Now every team sees exactly what it spends, and teams have started to care.

AWS · FINOPS · KUBERNETES
2026-02-20 | 6 min read

Case Study: $240K/Year AWS Savings for a Healthcare SaaS

A healthcare SaaS company was spending $38K/month on AWS with no idea where the money went. We audited everything, implemented 12 changes, and brought the bill down to $18K/month. Here's the full breakdown.