We're building a Cloud… We're building it bigger

For going on seven years I've been with Curse, and we've managed to do many wild and diverse things; it's been a great ride so far. I've gotten to work on one of the largest Django websites on the internet (at the time), I worked on a desktop application that is in use by two million people worldwide (I later wrote the Mac version from scratch), and for the last few years my main effort has been production-level IT work.

I want to make it clear: this isn't your corporate stooge, PC Load Letter style of IT problem. We run one of the largest website networks in the country. Our monthly dynamic requests number in the billions, we've broken twenty-five million uniques, and we are continually reminded by our vendors and partners that the challenges we face are in no way standard. Despite all that, we truly knew we'd arrived when we started having significant daily DDoS attacks (and there was much rejoicing...).

In those last seven years we've gone from fully managed dedicated servers, to colocated servers, to a private cage with some managed services, and now we're planning what the next iteration is going to look like. As much as I dislike buzzwords, it's going to look like a cloud.

We started virtualizing about two years ago in response to security and scalability concerns that only isolation could really fix. It's worked: we now run more than 800 VMs in production, and we're adding more all the time. The problem we ran into was that the hardware we had in place wasn't really meant for virtualization. Case in point: our LAN switch at the time. It had 192 ports, and it was fine pre-virtualization. What we didn't realize was that its MAC address table only had room for about 200 entries. The new virtualized environment grew, and we ended up with more than 500 MAC addresses in fairly short order. We decided to replace the old switch when we saw it was dropping more than 35 million packets a day on certain ports.
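
To give a sense of how easy this is to miss, here's a minimal sketch of the kind of check that would have caught it early: count the MAC addresses a host can see and compare them against the switch's forwarding-table capacity. This isn't the monitoring we actually ran; the `ip neigh` approach and the 200-entry figure are illustrative assumptions.

    #!/usr/bin/env python3
    """Sketch: compare the number of visible MAC addresses against an
    assumed switch forwarding-table capacity. Illustrative only."""

    import re
    import subprocess

    # Assumed capacity of the old switch's MAC address table (~200 entries).
    MAC_TABLE_CAPACITY = 200
    WARN_THRESHOLD = 0.8  # warn at 80% utilization

    def visible_macs():
        """Return the set of MAC addresses in the local neighbor (ARP) table."""
        output = subprocess.run(
            ["ip", "neigh", "show"], capture_output=True, text=True, check=True
        ).stdout
        # Lines look like: "10.0.0.5 dev br0 lladdr 52:54:00:ab:cd:ef REACHABLE"
        return set(re.findall(r"lladdr\s+([0-9a-f:]{17})", output, re.IGNORECASE))

    if __name__ == "__main__":
        macs = visible_macs()
        usage = len(macs) / MAC_TABLE_CAPACITY
        print(f"{len(macs)} MACs visible ({usage:.0%} of assumed table capacity)")
        if usage >= WARN_THRESHOLD:
            print("WARNING: MAC table nearly full; expect flooding and drops")

Once the table overflows, a switch falls back to flooding unknown-destination frames out every port, which is exactly the behavior that showed up as millions of dropped packets a day.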

I point all of that out to illustrate that we didn't really know what we were getting into. We were growing organically and reactively, and that's a problem. It hasn't really stopped yet. Every time I've turned around for the last few years, that bare-metal hardware stack has been pushed to its limits and forced to grow in ways it was never meant to.

So here we are, going on three years later, with more than a few battle scars but a lot wiser for the wear. We started planning for the new hosting more than six months ago, and during that time I've had one overriding mantra: planned, flexible growth is a must. Never grow organically or reactively. My second mantra has been: everything fails; don't let anyone notice. I honestly don't know that I'll be able to fully achieve those lofty goals. Eventually something we couldn't possibly imagine will rear its ugly head and force me to react. I'll damn sure try to make it three years before that day comes though.

So what am I building? Here are a few bullet points.

  • Three distinct availability zones. Each zone will have dedicated and isolated networking, compute, and management servers, as well as its own SAN.
  • Capacity will be planned so that a whole zone can be taken offline for maintenance or, in a more spectacular moment, can crash without noticeably impacting uptime or performance (a rough sketch of that math follows this list).
  • Avoid 1Gbit networking wherever possible and prefer 10Gbit instead; it should last longer.
  • Avoid spinning disks. Use SSD storage for all critical applications.
  • Use the best hypervisor tech. That's a powder keg of a statement, but it's important. We're reevaluating our original hypervisor choice (Citrix's XenServer) and trying out some new things.
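
To make the capacity bullet concrete, here's a back-of-the-envelope sketch of the rule we're planning around: with N zones, normal peak load has to fit on N-1 of them so any single zone can go away without anyone noticing. The VM counts below are made up for illustration, not our actual capacity plan.

    """Sketch of the N-1 zone capacity rule. Numbers are hypothetical."""

    def max_safe_load(zone_capacity: float, zones: int) -> float:
        """Total load the cluster can carry while still surviving one zone loss."""
        return zone_capacity * (zones - 1)

    def utilization_ceiling(zones: int) -> float:
        """Per-zone utilization ceiling under normal operation."""
        return (zones - 1) / zones

    if __name__ == "__main__":
        ZONES = 3
        ZONE_CAPACITY_VMS = 400   # hypothetical VMs each zone can run
        PEAK_LOAD_VMS = 800       # roughly where we are today

        ceiling = max_safe_load(ZONE_CAPACITY_VMS, ZONES)
        print(f"Survivable peak load: {ceiling:.0f} VMs "
              f"({utilization_ceiling(ZONES):.0%} per-zone utilization)")
        if PEAK_LOAD_VMS > ceiling:
            print("Over budget: a zone failure would be noticed.")
        else:
            print("Within budget: one zone can go down for maintenance.")

With three zones that works out to keeping normal utilization at or below roughly two-thirds of total capacity, which is the planning target behind "everything fails, don't let anyone notice."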

We've made good progress so far, and we're now in the prototyping stage. I'm planning to blog about a lot of the details as we go forward. What I'll say for now is that the bleeding edge can be painful, but it's definitely exciting.