Monday, 18 May 2026 · Today · official
System Design · Difficulty ★★★★ · 2026-05-18

Gmail and multiple services fail after deployment

Your team operates a suite of internal platform services that underpin Gmail and dozens of other Google products. Over a two-hour window, on-call engineers begin receiving alerts: Gmail is partially unavailable for a subset of users, and several other unrelated-seeming products start surfacing elevated error rates simultaneously. The platform services themselves report healthy uptime on their internal dashboards. No code changes were deployed to Gmail or the affected products in the hours before the incident. User-facing error rates climb to roughly 15–20% across affected surfaces. The incident starts resolving itself before the team has manually intervened on any of the directly affected services, and the blast radius appears disproportionately large relative to any single point of failure the team can identify.

Skills: Scalability, Resilience, Observability, Deployment

System Design — Gmail and multiple services fail after deployment · Archdle