What is elasticity?
The focus for many applications is scalability: the ability to scale up so that your application can handle bursts of traffic or resource-heavy jobs. You handle these by scaling up your architecture. As a rule of thumb, the more resources you provision, the more traffic you can handle.
There are two ways to scale: vertically, by moving to more powerful machines, or horizontally, by adding more machines.
A typical case where scalability matters is when your web application gets featured on a site like Hacker News or Product Hunt. When this happens, you’re likely to get a sudden rush of traffic. If you cannot scale up, your application is likely to buckle under the load. The results can be incredibly damaging to your reputation: if people can’t use your site, they can’t see what you have to offer.
Elasticity covers the ability to scale up but also the ability to scale down. The idea is that you can quickly provision new infrastructure to handle a high load of traffic, like the example above. But what happens after that rush? If you leave all of these new instances running, your bill will skyrocket because you will be paying for unused resources. In the worst-case scenario, the cost of these resources can even cancel out the revenue from the sudden rush. An elastic system prevents this from happening. After a scaled-up period, your infrastructure scales back down, so you pay only for your usual resource usage plus some extra for the high-traffic period.
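To make the cost argument concrete, here is a rough back-of-the-envelope sketch. Every number in it (the hourly price, the instance counts, the length of the rush) is a made-up assumption for illustration, not real pricing from any provider, and the function names are hypothetical. Prices are kept in whole cents to avoid rounding noise.

```python
HOURS_IN_MONTH = 720  # assumption: a 30-day month


def static_cost_cents(peak_instances: int, rate_cents: int) -> int:
    """Cost if the peak-sized fleet is left running all month."""
    return peak_instances * rate_cents * HOURS_IN_MONTH


def elastic_cost_cents(
    baseline_instances: int,
    peak_instances: int,
    peak_hours: int,
    rate_cents: int,
) -> int:
    """Cost if the extra instances run only during the rush."""
    baseline = baseline_instances * rate_cents * HOURS_IN_MONTH
    extra = (peak_instances - baseline_instances) * rate_cents * peak_hours
    return baseline + extra


# Hypothetical scenario: 2 instances normally, 20 during a 6-hour rush,
# at 10 cents per instance-hour.
static = static_cost_cents(peak_instances=20, rate_cents=10)
elastic = elastic_cost_cents(
    baseline_instances=2, peak_instances=20, peak_hours=6, rate_cents=10
)

print(f"left running: ${static / 100:.2f}")
print(f"elastic:      ${elastic / 100:.2f}")
```

Under these assumed numbers, the fleet that never scales back down costs roughly nine times as much for the month as the elastic one. The exact ratio depends entirely on how short the rush is relative to the billing period.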
The key is that this all happens automatically. When resource needs cross a certain threshold (usually measured by traffic), the system “knows” that it needs to provision or de-provision a certain amount of infrastructure, and does so. With a couple of hours of training, anyone can use the AWS web console to manually add or remove instances. But it takes a true Solutions Architect to set up monitoring, account for provisioning time, and configure a system for maximum elasticity.
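The threshold idea can be sketched in a few lines. This is a minimal illustration, not how AWS or any other provider implements autoscaling: the ScalingPolicy class, the desired_instances function, and all of the threshold values are hypothetical names and numbers chosen for this example. A production setup would also need cooldown periods and would have to account for the time new instances take to start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy: thresholds are requests/sec per instance."""
    scale_up_above: int
    scale_down_below: int
    min_instances: int
    max_instances: int
    step: int = 1


def desired_instances(current: int, total_rps: int, policy: ScalingPolicy) -> int:
    """Return the instance count the fleet should move to next."""
    if current < 1:
        raise ValueError("current must be at least 1")

    load_per_instance = total_rps / current
    if load_per_instance > policy.scale_up_above:
        target = current + policy.step      # overloaded: provision more
    elif load_per_instance < policy.scale_down_below:
        target = current - policy.step      # mostly idle: de-provision
    else:
        target = current                    # within the comfortable band

    # Never go below the floor or above the ceiling.
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ScalingPolicy(
    scale_up_above=100, scale_down_below=30, min_instances=2, max_instances=10
)

# Simulated traffic: quiet, a sudden rush, then quiet again.
fleet = 2
history = []
for rps in [50, 400, 900, 900, 300, 60, 60, 60]:
    fleet = desired_instances(fleet, rps, policy)
    history.append(fleet)
print(history)
```

The gap between the two thresholds is deliberate: if the scale-up and scale-down limits were the same value, the fleet would flip back and forth every time the load hovered near it.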