Let's face it: outages are a pain. They're also unavoidable. Whether your application is in the cloud or on physical hardware, it's bound to encounter failures eventually.
To help mitigate these failures, we recommend using a High Availability (HA) configuration for mission-critical applications. While requirements and types of High Availability setups vary widely, here are some best practices through the lens of applications deployed in the cloud.
Before we begin, let's discuss what High Availability actually means. For our purposes, HA means that your application is configured to handle different types of failures with minimal to zero downtime. Components of an HA setup can include servers in different geographic locations, database redundancy and data replication. We will discuss these, and more, later.
From a technology perspective, it's easy to justify reasons for creating an HA setup. But justifying the costs from the business side? That's another story.
Let's say you're the CTO of a company. If you approach your CEO and say you need to double the cost of your infrastructure, she's going to want to know why. And really, it boils down to how much your uptime is worth in dollars. If your application being down for hours (or potentially days) will not significantly affect your company's bottom line, then maybe you don't need to explore High Availability.
More often than not, though, that downtime will severely impact your business. In those cases, the cost of running a separate backup environment is easy to justify from a business standpoint.
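If you want to put rough numbers on that conversation, the comparison can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the revenue figure, the downtime estimates and the extra infrastructure cost are placeholders you would replace with your own.

```python
def annual_downtime_cost(revenue_per_hour, downtime_hours_per_year):
    """Revenue lost per year while the application is down."""
    return revenue_per_hour * downtime_hours_per_year


def ha_is_justified(revenue_per_hour, downtime_hours_without_ha,
                    downtime_hours_with_ha, extra_ha_cost_per_year):
    """True when the downtime HA avoids is worth more than HA costs."""
    avoided_loss = (
        annual_downtime_cost(revenue_per_hour, downtime_hours_without_ha)
        - annual_downtime_cost(revenue_per_hour, downtime_hours_with_ha)
    )
    return avoided_loss > extra_ha_cost_per_year


# Hypothetical inputs: $5,000/hour in revenue, 24 hours of downtime a year
# without HA versus 1 hour with it, and $60,000/year of extra infrastructure.
print(ha_is_justified(5000, 24, 1, 60000))
```

This only counts lost revenue. Lost customers and reputation are harder to price, so treat the result as a lower bound on what downtime costs you.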
Now that you've convinced your boss, let's look at some requirements of High Availability configurations.
Multiple application servers
Any serious application needs more than one server behind it. These servers should be load-balanced so that traffic is split between them. If you run only a single application server and a traffic spike crashes it, you have to wait until it is rebooted before you can begin processing requests again. In more extreme cases, the server keeps crashing until the traffic dies down.
Engine Yard Cloud can automatically fail over to another application server if the master crashes. It then automatically creates a new application server to take the place of the one that was promoted.
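To make the idea of splitting traffic concrete, here is a minimal round-robin sketch that skips servers marked as down. It is an illustration only, not how any particular load balancer is implemented; the class name `RoundRobinBalancer` and the server names are made up for the example.

```python
class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._index = 0

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._index]
            self._index = (self._index + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy application servers")


balancer = RoundRobinBalancer(["app1", "app2"])  # hypothetical server names
first = balancer.next_server()
second = balancer.next_server()
```

With two servers, losing one means the other takes all of the traffic. With one server, losing it means `next_server` has nothing left to return, which is the outage described above.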
Orchestra PHP Cloud offers Elastic environments that start with a minimum of two servers and scale up and down automatically. If your application gets hit with a traffic spike, it automatically adds servers to handle that traffic and then kills them off after the traffic decreases.
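The scale-up-and-down decision can be pictured as a simple calculation. The sketch below is an assumption about how such a rule might look, not Orchestra's actual algorithm; the function name, the per-server capacity and the upper limit of ten servers are all hypothetical. Only the minimum of two servers comes from the description above.

```python
import math


def desired_server_count(requests_per_second, capacity_per_server,
                         min_servers=2, max_servers=10):
    """Servers needed for the current load, clamped to a fixed range."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never drop below the minimum (so one crash is survivable)
    # and never grow past the maximum (so costs stay bounded).
    return max(min_servers, min(max_servers, needed))
```

Calling this on a schedule with the current request rate gives the number of servers to run: more during a spike, and back down to the minimum once the traffic decreases.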
Database redundancy
The database is the heart of a data-driven application, yet it rarely receives the love it deserves. Many people take it for granted that the database will always be around, but it is still a server and can fail at any point. It may even be more important to have multiple database servers than multiple application servers. If your single database crashes, you risk losing your customers' data. In this day and age, that is not acceptable, and it's likely to cost you a lot of customers and revenue.
At Engine Yard, we encourage you to create your environments with at least one database slave right off the bat. This not only lets you fail over your database if needed, but the slave(s) can also take on some of the work, such as backups.
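Both uses of a slave, failover and offloading work, can be shown with a small model. This is a sketch under assumptions, not Engine Yard's implementation: the `DatabaseCluster` class and the host names are invented, and real promotion also has to deal with replication lag and with repointing your application at the new master.

```python
class DatabaseCluster:
    """Tracks which database takes writes and which can serve reads."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def write_target(self):
        # Writes always go to the master.
        return self.master

    def read_target(self):
        # Offload reads (backups, reports) to a slave when one exists.
        return self.slaves[0] if self.slaves else self.master

    def fail_over(self):
        """Promote the first slave to master and return the failed host."""
        if not self.slaves:
            raise RuntimeError("no slave available to promote")
        failed = self.master
        self.master = self.slaves.pop(0)
        return failed


cluster = DatabaseCluster("db-master", ["db-slave-1"])  # hypothetical hosts
backup_source = cluster.read_target()
```

After a failover in this model there are no slaves left, so a new one should be created right away to restore redundancy.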
Diversified geographical locations
Setting up your application's environment with multiple application and database servers is a great first line of defense. The problem comes when all of those servers are in the same location and something, such as a natural disaster, severely affects the datacenter.
If you do not already have a second environment running to handle these situations, then you either need to wait for the datacenter to come back up or manually re-create everything in another datacenter. The potential for excessive downtime in either situation is huge.
With Engine Yard Cloud, you only need to create a new environment in a different geographical region. For best results, make sure your database and any application data are replicated to the new location. We also recommend storing assets separately, using Amazon S3, for example, to keep them in sync.
Once you have everything replicated to a different location, failing over is as easy as updating your DNS settings to point to the new environment's IP address. Keep the TTL (time to live) on that DNS record low ahead of time, because resolvers that have already cached the old record will keep using it until its TTL expires. When an event affects your primary location, you simply change the IP.
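To see why the TTL matters, here is a rough sketch of the failover decision and of how long clients might keep hitting the old location. The function names, the health-check interval and the IP addresses (taken from the ranges reserved for documentation) are all assumptions for illustration. Some resolvers do not strictly honor TTLs, so the estimate is approximate.

```python
def choose_active_ip(primary_ip, standby_ip, primary_healthy):
    """IP address the DNS record should point to right now."""
    return primary_ip if primary_healthy else standby_ip


def worst_case_failover_seconds(ttl_seconds, check_interval_seconds,
                                failures_before_switch):
    """Rough upper bound on how long clients still reach the old IP.

    Detection takes several failed health checks, and after the record
    changes, cached copies can live for up to one full TTL.
    """
    detection = check_interval_seconds * failures_before_switch
    return detection + ttl_seconds


# Hypothetical setup: checks every 30 seconds, switch after 3 failures.
low_ttl = worst_case_failover_seconds(60, 30, 3)
high_ttl = worst_case_failover_seconds(86400, 30, 3)
active = choose_active_ip("203.0.113.10", "198.51.100.20", False)
```

With a 60-second TTL, the estimate is a couple of minutes. With a one-day TTL, it is more than a day, no matter how quickly you update the record.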
We've touched on the basic requirements for most High Availability setups, but no two are going to be the same.
The things you need to consider when researching HA are:
- Multiple application servers
- Database redundancy
- Diversified geographic locations
I hope you've enjoyed this blog post! Feel free to leave comments below and visit the following links for further information.