Amazon Web Services: Post-Mortem
Amazon has finally released a detailed post-mortem that describes what happened during last week's outage at their Northern Virginia data center (US-EAST) and why it took so long to fix (short version: we screwed up and things snowballed out of control). It makes for a long, technical, but interesting read.
Complex systems make for complex failures.
Complex systems make for complex failures.
Comments