One of the hardest, but most important things to do when building your cloud architecture, is to eliminate Single Point of Failures (SPoF). What this means is that every mission critical service should be able to survive an outage of any given server. Some companies, like Netflix, have taken this to an extreme and created a service called Chaos Monkey. …