Spotify’s outage of 8.3.2022, explained
Spotify had one of its most disruptive outages in recent history on the evening of 8.3.2022 (CET), which resulted in over an hour of downtime and users being logged out. As luck would have it, I was on call for the very first time for the User platform tribe. I helped where I could, but mostly watched in awe as a flurry of teams came online after hours and worked together to debug and mitigate the issue. Here, I walk you through the storm of incident-20220308, including symptoms, root causes, aftereffects, and takeaways.
Kat is a Senior Software Engineer with about 8 years of industry experience. She is currently at Spotify, working on the user and account platform. She has worked extensively with microservice architectures, relational and non-relational databases, and synchronous and asynchronous messaging patterns. Given the industry’s push for a DevOps culture, she has also done some work configuring deployment pipelines, provisioning machines and applications, and setting up monitoring. When she is not at work, she’s probably training mixed martial arts, playing guitar, or cooking!