We’ve been hosting PagerDuty on AWS for about the last year. One of the biggest draws to the platform for us was the promise of ready-built components — on AWS there’s no need to run your own redundant DB setup or load balancer, since Amazon provides them: pre-built and professionally managed.
Well, that’s the theory, anyway. Unfortunately, every time I’ve evaluated any AWS service beyond their simple EC2 hosting, AWS has come up short. Perhaps most frustrating, their services cover 95% of what we need. But without fail, they are lacking some small but critical piece of functionality.
Consider AWS’s Elastic Load Balancer (ELB), for example. It provides an easy way to distribute traffic fairly over all of your front-end instances. It can automatically stop routing requests to failed instances, hiding most network and instance failures from the user. Paired with Auto Scaling, it can even spin up new instances in response to traffic spikes. All of this would take some serious engineering effort to replicate on your own.
Unfortunately, it’s totally unusable in many real-world deployments. The problem is that Amazon doesn’t assign static IPs to their load balancers. Instead, you get a hostname and are told to set up CNAME records aliasing www.yourdomain.com to the ELB’s name. This approach has three serious problems.
First, you can’t use a CNAME for the root of a domain. This is because a CNAME record can’t coexist with a SOA record at the same point in the DNS hierarchy. As a result, if your site is hosted at yourdomain.com, you’ll need to move it to www.yourdomain.com. Of course, even with redirects in place at the original domain, this sort of branding change is going to be unacceptable to many businesses.
Second, you can’t properly accept email to a domain hosted by an ELB. This too is due to a DNS limitation: you can’t have an MX and a CNAME record at the same point in the DNS hierarchy. While you might be able to accept mail if you run an SMTP server on the machines behind the ELB, this is far from a typical configuration. At PagerDuty, this is a showstopper, since we need to be able to both host a site and accept mail at yoursubdomain.pagerduty.com.
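To make the first two problems concrete, here is a minimal sketch of the underlying DNS rule: a name that carries a CNAME may not carry any other record type. The function name `find_cname_conflicts` and the sample zone are hypothetical, made up for illustration, and the check is a simplification (it ignores the DNSSEC record types that are allowed alongside a CNAME).

```python
from collections import defaultdict


def find_cname_conflicts(records):
    """Return {name: [other record types]} for names where a CNAME
    coexists with any other record type.

    `records` is an iterable of (name, record_type) pairs.
    Simplified: DNSSEC-related exceptions are not modeled.
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        # Normalize case and the trailing dot so equivalent names compare equal.
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())

    conflicts = {}
    for name, rtypes in types_by_name.items():
        others = rtypes - {"CNAME"}
        if "CNAME" in rtypes and others:
            conflicts[name] = sorted(others)
    return conflicts


# Hypothetical zone data mirroring the two cases described above.
zone = [
    ("yourdomain.com.", "SOA"),                  # every zone apex has SOA and NS
    ("yourdomain.com.", "NS"),
    ("yourdomain.com.", "CNAME"),                # apex aliased to the ELB: invalid
    ("www.yourdomain.com.", "CNAME"),            # fine: nothing else at this name
    ("yoursubdomain.pagerduty.com.", "CNAME"),   # site aliased to the ELB
    ("yoursubdomain.pagerduty.com.", "MX"),      # mail at the same name: invalid
]

print(find_cname_conflicts(zone))
```

Only the www name passes the check; the apex and the mail-receiving subdomain are both flagged, which is exactly where the CNAME approach breaks down.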
Finally, you have no “out” if the ELB blows up, short of adjusting your DNS records and waiting for cached records to expire. This is a big problem for us, since we’re very hesitant to introduce components into PagerDuty’s infrastructure that we can’t quickly swap out in the event of a problem.
The solution to this problem is simple: it should be possible to map an Amazon Elastic IP to an ELB. Since the ELB would now have a static IP, the DNS issues would be solved. And if the ELB blew up, you could simply provision another and remap the IP, with no DNS changes required. I realize that ELB’s “no static IP” architecture is probably a deeply baked-in design decision, but unfortunately, a load balancer without a static IP isn’t really usable for us.