In this post I'll go over various options for scaling your business web platform. We'll take a look at five different approaches. There is no right or wrong approach; it is just a matter of which aspects you want to emphasize and what your real-world needs are. I'll be using the Amazon stack in the examples, as it is my preferred stack, but the strategies shown here apply to competing stacks as well.
First, let's go over two concepts: regions and availability zones.
Amazon Availability Zones are distinct physical locations that have low-latency network connectivity between them, are located inside the same region, and are engineered to be insulated from failures that afflict other AZs. Each availability zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable; each has independent power, cooling, network, and security. Common points of failure like generators and cooling equipment are not shared across Availability Zones. Additionally, they are physically separate, such that even extremely uncommon disasters like fires, tornadoes, or flooding would only affect a single Availability Zone.[^1]
If your platform serves mostly one area of the world, it makes sense to put your servers in that region. Each region contains multiple "Availability Zones". This means you can put redundant servers in different zones within the same region and, as a result, get better availability. The important twist here is that network latency within the same region is minimal, so we have separate facilities with good interconnectedness. Here is an image of the available regions along with the number of availability zones on AWS. The two green circles are new regions that are opening soon (in Paris and Ningxia).
Things are simple here: you have one machine that serves all your traffic. When you notice that the server can't handle the traffic, you simply shut the machine down, upgrade the CPU, RAM, and storage, and run it again. This approach is the cheapest and is ideal for the MVP stage. Don't be fooled, though: it can still get you very far, and I would most definitely begin with this approach.
This is similar to having a single machine in the sense that if our availability zone goes down, production goes down too. Our server(s) live in a single availability zone inside a single region; we are merely adding an Elastic Load Balancer that distributes traffic across multiple servers within that availability zone.
Approach #3, with multiple availability zones, is much better. This setup is mostly useful when the load is so low that it requires only one server (which by definition sits in a single availability zone), as a stepping stone in the right direction.
Amazon EC2/RDS instances have an uptime guarantee of 99.95% on a monthly basis. The maximum permissible downtime roughly equates to 22 minutes per month (assuming 30 days per month).[^2]
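As a quick sanity check, the 22-minute figure follows directly from the 99.95% guarantee:

```python
# Downtime allowed by a 99.95% monthly uptime SLA, assuming a 30-day month.
minutes_per_month = 30 * 24 * 60                    # 43,200 minutes
allowed_downtime = minutes_per_month * (1 - 0.9995)
print(round(allowed_downtime, 1))                   # 21.6 minutes, i.e. roughly 22
```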
Combining multiple availability zones makes an outage very unlikely. The Elastic Load Balancer detects problems in each zone and redirects traffic to healthy instances.
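The idea behind ELB health checks can be sketched in a few lines. This is a simplified illustration, not the ELB's actual implementation; the instance names and health map are hypothetical stand-ins for what a real load balancer learns via periodic health checks:

```python
import itertools

# Hypothetical health map: a real ELB discovers this via periodic health checks.
instances = {
    "i-zone-a": True,    # healthy
    "i-zone-b": False,   # failed its health check
    "i-zone-c": True,    # healthy
}

def healthy_targets(health):
    """Return only the instances that should still receive traffic."""
    return [name for name, ok in health.items() if ok]

# Distribute incoming requests over healthy instances only, round-robin style.
rotation = itertools.cycle(healthy_targets(instances))
routed = [next(rotation) for _ in range(4)]
print(routed)  # ['i-zone-a', 'i-zone-c', 'i-zone-a', 'i-zone-c']
```

The unhealthy instance in zone B simply drops out of the rotation; traffic keeps flowing to the remaining zones.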
This combination is a sweet spot of reasonable reliability and cost.
Although it is very rare for an entire AWS region to go down, it does happen. Many enterprises therefore replicate their databases across regions, so that when a catastrophe does occur and the primary region goes down, infrastructure can be quickly set up in another region.[^3]
Such a setup requires the database to be synced across regions. Total time from endpoint failure to DNS failover is about 3 minutes, so we can have a backup server running soon after, preventing a prolonged outage.
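The failover decision itself is simple: serve the primary endpoint while its health check passes, otherwise answer with the secondary. A minimal sketch of what a Route 53 failover record effectively does (the hostnames here are made up for illustration):

```python
# Hypothetical endpoints for an active/passive setup.
PRIMARY = "db.us-east-1.example.com"
SECONDARY = "db.eu-west-1.example.com"   # passive standby in another region

def resolve(primary_healthy: bool) -> str:
    """Mimic a DNS failover record: return the primary while its health
    check passes, otherwise fail over to the secondary endpoint."""
    return PRIMARY if primary_healthy else SECONDARY

print(resolve(True))    # db.us-east-1.example.com
print(resolve(False))   # db.eu-west-1.example.com
```

In practice Route 53 handles this for you; the ~3-minute window comes from health-check intervals plus DNS TTL propagation.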
One way to cut costs is to use the passive setup as a staging area for testing prior to production rollout.
When your platform handles lots of customers across multiple regions, it makes sense to keep both regions active. Under normal circumstances you might use Amazon Route 53 Latency Based Routing (LBR) or Weighted Round Robin (WRR) to distribute the load. In an emergency, when an entire region goes down, you transfer the traffic over to the working region. This means you get slower responses, but it certainly beats suffering complete downtime. The configuration is exactly the same as #4 Active/Passive Failover, except we use both regions and distribute the load between them at all times, not just when one region goes down.
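To make WRR concrete, here is a sketch of the weighted-choice idea behind Route 53 weighted records. The region names and the 70/30 split are arbitrary examples, and this is a simulation of the concept, not Route 53's actual code:

```python
import random

# Hypothetical weights: send ~70% of traffic to us-east-1, ~30% to eu-west-1.
weights = {"us-east-1": 70, "eu-west-1": 30}

def pick_region(weights, rng=random.random):
    """Weighted random choice: the idea behind a WRR DNS record set."""
    point = rng() * sum(weights.values())
    for region, weight in weights.items():
        point -= weight
        if point < 0:
            return region
    return region  # fallback for floating-point edge cases

counts = {"us-east-1": 0, "eu-west-1": 0}
for _ in range(10_000):
    counts[pick_region(weights)] += 1
print(counts)  # roughly {'us-east-1': 7000, 'eu-west-1': 3000}
```

To simulate the emergency case, you would set the failed region's weight to 0 and all traffic would flow to the surviving one.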
For a big system, the major problem is almost always the database, so you do everything you can to remove the burden from it:
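One of the most common ways to take load off the database is a read-through cache. A minimal sketch, where `slow_db_fetch` is a stand-in for a real query and the in-memory dict stands in for something like Redis or Memcached:

```python
# Read-through cache sketch; `slow_db_fetch` stands in for a real DB query.
cache = {}
db_hits = 0

def slow_db_fetch(key):
    global db_hits
    db_hits += 1               # count how often the database is actually hit
    return f"value-for-{key}"

def get(key):
    """Serve from cache when possible; only a miss reaches the database."""
    if key not in cache:
        cache[key] = slow_db_fetch(key)
    return cache[key]

get("user:1"); get("user:1"); get("user:2")
print(db_hits)  # 2 -- the repeated read never touched the database
```

Three reads, but only two database hits; at production scale with hot keys, this ratio gets far more favorable.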
Another good tip is to protect your web servers from extra load by using a CDN for static content delivery and streaming.
DDOS protection is another valid concern.
Congratulations on making it all the way here. If you just jumped here, shame on you, otherwise I hope you found this useful :)
If you are in search of an awesome RoR team, or you need help with setting up your project you can ping us here.
New AWS Feature: Amazon RDS now supports cross-region replication