Saad Shahd — Principal Engineer

Systems will always fail. The work is making failure survivable.

For fifteen years I chased the dream of systems that couldn't break. They broke anyway—at 2am, mid-match, under load no one modelled. The pattern that held wasn't prevention; it was recovery. So the question I keep asking teams is this: when this fails, and it will, how fast can we see it, name it, and bring it back? Reliability isn't the absence of failure. It's how gracefully you carry it.

Selected Work

Where the failure became the design.

One system, told honestly—origins, the bottleneck, the 2am test, and what I'd change.

Case Study · 2018–2022

Statsbomb: Real-Time Data Collection

How do you let thousands of collectors capture a live match concurrently without choosing between speed and correctness? We took it from week-one sketches with Ali to the collection infrastructure Hudl bought in 2024.

Real-Time Data Collection · State Machines · Stream Processing · Distributed Teams

Writing

Questions worth sitting with.

All writing →

What does failure look like in your systems?

If you're wrestling with reliability, resilience, or the architecture underneath both, I'd like to hear the shape of it. No pitch—just the question first.

Start a Conversation