How to Turn Disaster Into Gold
may 2011 by jpcody
We're very careful about how we explain downtime and other glitches. We don't beat around the bush. We don't try to hedge. We don't pass the blame to a vendor or another party. When our customers are affected, it's on us.
failure
customer-service
Coding Horror: Working with the Chaos Monkey
may 2011 by jpcody
One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
programming
failure
performance
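The idea in the Chaos Monkey excerpt above (randomly kill parts of the system, then check that the service still works) can be sketched in a few lines. This is only an illustrative toy, not the tool the quoted engineers built: `Instance`, `unleash_monkey`, `service_is_up`, and the `kill_probability` parameter are all hypothetical names, and "killing" here just flips a flag in memory rather than calling any cloud API.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for a running server or service."""
    name: str
    alive: bool = True


def unleash_monkey(instances, rng, kill_probability=0.5):
    """Kill at most one randomly chosen live instance.

    Returns the instance that was killed, or None if nothing was killed.
    """
    live = [inst for inst in instances if inst.alive]
    if not live or rng.random() >= kill_probability:
        return None
    victim = rng.choice(live)
    # Assumption: a real system would issue a terminate request here.
    victim.alive = False
    return victim


def service_is_up(instances, minimum_live=1):
    """Toy health check: the service survives if enough instances remain."""
    return sum(inst.alive for inst in instances) >= minimum_live


if __name__ == "__main__":
    fleet = [Instance("web-1"), Instance("web-2"), Instance("web-3")]
    rng = random.Random()
    victim = unleash_monkey(fleet, rng)
    print("killed:", victim.name if victim else "nothing")
    print("service up:", service_is_up(fleet))
```

Running something like this on a schedule against a redundant fleet is one way to exercise the failure path continuously, so that recovery is tested before an unexpected outage rather than during it.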