Why Amazon’s Elastic Block Store Matters

August 20, 2008

On the technical side, Amazon’s EBS service may look like “just” another great new feature of the Elastic Compute Cloud, but on the business side it enables a whole slew of new customers. I won’t pretend that I understand all the new uses, but I can talk about those we see and are supporting.

First a couple of words about what EBS is. In short, it’s a SAN (Storage Area Network) in the cloud. You can allocate a disk volume of 1GB to 1TB in size from what is now an endless SAN in the cloud and attach it to an instance of yours running in EC2. The volume is stored on redundant disks (i.e. with some form of RAID) and has a lifetime separate from any instance on which it is mounted, so you can unmount it and later remount it on another instance. You can also perform a snapshot backup of a volume to S3, where it is stored with the redundancy and durability of all objects on S3. Moreover, successive snapshots are incremental, providing a very powerful and efficient incremental backup capability for volumes.

All this and much more is explained in detail in my other post and there’s yet more detailed EBS information on our support site. The official EBS announcement is on the EC2 detail page, Werner Vogels provides some background, and Jeff Barr’s blog entry has links to many other related announcements.

The RightScale dashboard supports all the features of EBS and offers a number of additional goodies such as configuring volumes to automatically be attached to servers when these launch and keeping track of the ancestry of a volume or snapshot.

What does EBS enable? In short: traditional processing on large datasets and reliable storage for many servers. Let’s look at these two areas one-by-one.

Large datasets

Amazon Web Services are designed for scale. EC2, S3, SQS, and SDB are ideally suited for building large systems that process huge data volumes. The catch has been that they are geared towards modern service oriented systems that can use storage accessed via HTTP PUTs and GETs (Amazon S3), can work using a non-relational database like Amazon SDB, and thrive on large numbers of simple servers (EC2). Users that have more traditional applications, such as relational databases, that require large datasets stored in a file system with a POSIX interface have had difficulties in meeting all their requirements for operating in AWS. While an EC2 X-large instance comes with about 1.4TB of local disk it is rather difficult to actually use this disk space in a production system. Populating the disk with data at boot time can take hours and backups, replication and restoring the data in case of an instance failure are all sore points. For up to 100GB the timescales are all workable, but beyond that it gets difficult.

With EBS the processing of large datasets contained within a file system becomes easily accessible. First of all, volumes can be up to 1TB in size and beyond that it is possible to mount multiple volumes on the same instance such that file systems of 10TB are practical. The volumes can further be backed-up to S3 using the snapshots and they can be replicated by creating new volumes from the snapshots. What is particularly nice is that a volume can be created in any availability zone (think datacenter) of a region from a snapshot, so copying a large volume across datacenters can be off-loaded to EBS and is done very efficiently.

Many virtual appliance servers

EBS also really enables SaaS vendors that use a single-tenant “virtual appliance” model. Many software vendors have approached us with use-cases where they would like to run individual servers on behalf of their customers. Often these servers are co-managed between customer and software vendor or have other properties that make the service inappropriate for multi-tenant SaaS implementation. In these use-cases the end-customer is storing important data on these servers and requires a robust data safeguarding architecture, in particular for database storage. While we today have a very effective mysql replication and backup solution, it is really geared at multi-server set-ups and doesn’t fit the price and complexity budget of cookie-cutter single-server virtual appliances. For those use-cases EBS brings the desired performance and reliability and drops the complexity and price.

With EBS the canonical reliable single-server virtual appliance can be implemented with the following architecture: an EC2 instance whose type is chosen for the cpu and memory required, an EBS volume sized appropriately for the data set, a revolving set of frequent snapshots providing disaster recovery backups, and periodic application-level “export” of backups to S3 for archiving and off-cloud backups. In case of a total failure of the EC2 instance and the EBS volume (e.g. datacenter fire) a new instance and volume can be allocated in another availability zone from the last revolving snapshot.
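The "revolving set of frequent snapshots" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names and the idea of keeping the newest `keep` snapshots are hypothetical, and the snapshots are represented by their timestamps rather than by any real EC2 API objects.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshot_times, keep):
    """Return the timestamps that fall outside the revolving set.

    The newest `keep` snapshots are retained; everything older is
    returned (newest first) so the caller can delete it.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    newest_first = sorted(snapshot_times, reverse=True)
    return newest_first[keep:]


def pick_recovery_snapshot(snapshot_times):
    """After a total failure, recover from the most recent snapshot."""
    return max(snapshot_times)


# Example: six hourly snapshots, keep the newest four.
base = datetime(2008, 8, 20, 0, 0)
times = [base + timedelta(hours=h) for h in range(6)]
expired = snapshots_to_delete(times, keep=4)
latest = pick_recovery_snapshot(times)
```

How many snapshots to keep, and how often to take them, depends on how much recent data the appliance can afford to lose.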

When it comes time to upgrade the virtual appliance to a new software version it becomes relatively easy for the software vendor to spin-up a second instance and volume with the upgraded software for important customers so they can test-drive the new version on their data and train their internal users before committing to the upgrade.

Try it out for yourself!

We’ve been busy integrating support for this new storage system for months so that you can start using it immediately. And our RightScale Dashboard support for EBS is available as part of our free Developer Edition! To learn more about EBS and RightScale’s support for it, check out my detailed technical review, read our EBS tutorials at wiki.rightscale.com, or register for our upcoming RightScale EBS Webinar. Or just drop us a line at sales@rightscale.com.


Amazon’s Elastic Block Store explained

August 20, 2008

Now that Amazon’s Elastic Block Store is live I thought it’d be helpful to explain all the ins and outs as well as how to use them. The official information about EBS is found on the AWS site. I’ve written about the significance of EBS before and I’ll follow up with a post about some new use-cases it enables.

The Basics

EBS starts out really simple: you create a volume from 1GB to 1TB in size and then you mount it on a device (like /dev/sdj) on an instance, format it, and off you go. Later you can detach it, let it sit for a while, and then reattach it to a different instance. You can also snapshot the volume at any time to S3, and if you want to restore your snapshot you can create a fresh volume from the snapshot. Sounds simple, eh? It is but the devil is in the detail!

Amazon Elastic Block Store features

Reliability

EBS volumes have redundancy built-in, which means that they will not fail if an individual drive fails or some other single failure occurs. But they are not as redundant as S3 storage which replicates data into multiple availability zones: an EBS volume lives entirely in one availability zone. This means that making snapshot backups, which are stored in S3, is important for long-term data safeguarding.

I know that folks at Amazon have thought long and hard how to characterize the reliability of EBS volumes, so here’s their explanation taken from the EC2 detail page:

Amazon EBS volumes are designed to be highly available and reliable. Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% - 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives.

From a practical point of view what this means is that you should expect the same type of reliability you get from a fully redundant RAID storage system. While it may be technically possible to increase the reliability by, for example, mirroring two EBS volumes in software on one instance, it is much more productive to rely on EBS directly. Focus your efforts on building a good snapshot strategy that ensures frequent and consistent snapshots, and build good scripts that allow you to recover from many types of failures using the snapshots and fresh instances and volumes.

Volume performance

Our performance observations are based on the pre-release EBS volumes, thus some variations on the production systems should be expected. On the one hand our pre-release tests were probably running on a small infrastructure with fewer users, but on the other hand many of these users were also running stress tests, so it’s really hard to tell how all this will carry over. Only time will tell.

EBS volumes are network attached disk storage and thus take a slice off the instance’s overall network bandwidth. The speed of light here is evidently 1Gbps (gigabit Ethernet), which means that the peak sequential transfer rate is about 120MBytes/sec. “Any number larger than that is an error in your math.” We see over 70MB/sec using sysbench on an m1.small instance, which is hot! Presumably we didn’t get much network contention from other small instances on the same host when running the benchmarks. For random access we’ve seen over 1000 I/O ops/sec, but it’s much more difficult to benchmark those types of workloads. The bottom line though is that performance exceeds what we’ve seen for filesystems striped across the four local drives of x-large instances.

With EBS it is possible to increase the I/O transaction rate further by mounting multiple EBS volumes on one instance and striping filesystems across them. For streaming performance this doesn’t seem worthwhile as the limit of the available instance network bandwidth is already reached with one volume, but it can increase the performance of random workloads as more heads can be seeking at a time.

Snapshot backups

Snapshot backups are simultaneously the most useful and the most difficult to understand feature of EBS. Let me try to explain. A snapshot of an EBS volume can be taken at any time; it causes a copy of the data in the volume to be written to S3 where it is stored redundantly in multiple availability zones (like all data in S3). The first peculiarity is that snapshots do not appear in your S3 buckets, thus you can’t access them using the standard S3 API. You can only list the snapshots using the EC2 API and you can restore a snapshot by creating a new volume from it. The second peculiarity is that snapshots are incremental, which means that in order to create a subsequent snapshot, EBS only saves the disk blocks that have changed since previous snapshots to S3.

How the incremental snapshots work conceptually is depicted in the diagram below. Each volume is divided up into blocks. When the first snapshot of a volume is taken all blocks of the volume that have ever been written are copied to S3, and then a snapshot table of contents is written to S3 that lists all these blocks. Now, when the second snapshot is taken of the same volume only the blocks that have changed since the first snapshot are copied to S3. The table of contents for the second snapshot is then written to S3 and lists all the blocks on S3 that belong to the snapshot. Some are shared with the first snapshot, some are new. The third snapshot is created similarly and can contain blocks copied to S3 for the first, second and third snapshots.

Illustration of EBS snapshots showing the incremental storage of snapshot blocks in Amazon S3

There are two nice things about the incremental nature of the snapshots: it saves time and space. Taking subsequent snapshots can be very fast because only changed blocks need to be sent to S3, and it saves space because you’re only paying for the storage in S3 of the incremental blocks. What is difficult to answer is how much space a snapshot uses. Or, to put it differently, how much space would be saved if a snapshot were deleted. If you delete a snapshot, only the blocks that are only used by that snapshot (i.e. are only referenced by that snapshot’s table of contents) are deleted.
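To make the table-of-contents idea concrete, here is a toy Python model of incremental snapshots. It is a sketch of the concept only, not how EBS is implemented: the class and method names are made up, and it detects changed blocks by comparing data with the previous snapshot, whereas a real system would presumably track which blocks were written.

```python
class SnapshotStore:
    """Toy model: blocks live in one shared store (think S3), and each
    snapshot is a table of contents mapping block number -> stored block id."""

    def __init__(self):
        self.blocks = {}      # stored block id -> data
        self.snapshots = {}   # snapshot name -> {block number: block id}
        self._next_id = 0

    def take_snapshot(self, name, volume, previous=None):
        """Snapshot `volume` (block number -> data for every block ever
        written). Returns how many blocks had to be copied to the store."""
        prev_toc = self.snapshots.get(previous, {})
        toc = {}
        copied = 0
        for number, data in volume.items():
            old_id = prev_toc.get(number)
            if old_id is not None and self.blocks[old_id] == data:
                toc[number] = old_id          # unchanged: share the block
            else:
                block_id = self._next_id      # changed or new: copy it
                self._next_id += 1
                self.blocks[block_id] = data
                toc[number] = block_id
                copied += 1
        self.snapshots[name] = toc
        return copied

    def delete_snapshot(self, name):
        """Delete a snapshot; only blocks that no other table of contents
        references are freed. Returns the number of blocks freed."""
        toc = self.snapshots.pop(name)
        still_referenced = {
            block_id
            for other in self.snapshots.values()
            for block_id in other.values()
        }
        freed = [b for b in toc.values() if b not in still_referenced]
        for block_id in freed:
            del self.blocks[block_id]
        return len(freed)


store = SnapshotStore()
volume = {0: "a", 1: "b", 2: "c"}
first = store.take_snapshot("s1", volume)                   # copies 3 blocks
volume[1] = "B"
second = store.take_snapshot("s2", volume, previous="s1")   # copies 1 block
volume[2] = "C"
volume[3] = "d"
third = store.take_snapshot("s3", volume, previous="s2")    # copies 2 blocks
freed_by_s2 = store.delete_snapshot("s2")   # every s2 block is still shared
freed_by_s1 = store.delete_snapshot("s1")   # frees the old blocks 1 and 2
```

In this run deleting the middle snapshot frees nothing, because each of its blocks is still referenced by the first or third snapshot. That is exactly why "how much space does this snapshot use" has no simple answer.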

Something to be very careful about with snapshots is consistency. A snapshot is taken at a precise moment in time even though the blocks may trickle out to S3 over many minutes. But in most situations you will really want to control what’s on disk vs. what’s in-flight at the moment of the snapshot. This is particularly important when using a database. We recommend you freeze the database, freeze the file system, take the snapshot, then unfreeze everything. At the file system level we’ve been using xfs for all the large local drives and EBS volumes because it’s fast to format and supports freezing. Thus when taking a snapshot we perform an xfs freeze, take the snapshot, and unfreeze. When running mysql we also “flush all tables with read lock” to briefly halt writes. All this ensures that the snapshot doesn’t contain partial updates that need to be recovered when the snapshot is mounted. It’s like USB dongles: if you pull the dongle out while it’s being written to “your mileage may vary” when you plug it back into another machine…
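The freeze, snapshot, unfreeze ordering can be sketched as a small Python context manager. This is an illustration under assumptions, not our production script: `frozen` and `consistent_snapshot` are hypothetical names, and the database lock, unlock, and snapshot steps are passed in as callables because they depend on your setup. The only real command used is `xfs_freeze` with its `-f` (freeze) and `-u` (unfreeze) flags.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mount_point, lock_db, unlock_db, run=subprocess.check_call):
    """Hold the database lock and the file system freeze for the duration
    of the `with` block, releasing both even if the block raises."""
    lock_db()  # e.g. issue "FLUSH TABLES WITH READ LOCK" on an open session
    try:
        run(["xfs_freeze", "-f", mount_point])
        try:
            yield
        finally:
            run(["xfs_freeze", "-u", mount_point])
    finally:
        unlock_db()  # e.g. "UNLOCK TABLES" on the same session


def consistent_snapshot(mount_point, lock_db, unlock_db, take_snapshot,
                        run=subprocess.check_call):
    """Request the snapshot while nothing is in flight and return whatever
    `take_snapshot` returns (for example a snapshot id)."""
    with frozen(mount_point, lock_db, unlock_db, run):
        return take_snapshot()
```

One detail worth keeping in mind: a MySQL read lock is released when the session that took it ends, so `lock_db` and `unlock_db` need to share one open connection rather than each starting a fresh client. Since the snapshot is taken at a precise moment, the freeze only has to last for the snapshot request itself, not for the minutes it takes the blocks to trickle out to S3.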

Snapshot performance appears to be pretty much gated by the performance of S3, which is around 20MBytes/sec for a single stream. The three big bonuses here are that the snapshot is incremental, that the data is compressed, and that all this is performed in the background by EBS without affecting the instance on which the volume is mounted much. Obviously the data needs to come off the disks, so there is some contention to be expected, but compared to having to do the transfer from disk through the instance to S3 it is like night and day.

Availability Zones

EBS volumes can only be mounted on an instance in the same availability zone, which makes sense when you think of availability zones as being equivalent to datacenters. It would probably be technically possible to mount volumes across zones, but from a network latency and bandwidth point of view it doesn’t make much sense.

The way you get a volume’s data from one zone into another is through a snapshot: you snapshot one volume and then immediately create a new volume in a different zone from the snapshot. We have really gotten away from the idea that we’re unmounting a volume from one instance and then remounting it on the next one: we always go through a snapshot for a variety of reasons. The way we think and operate is as follows:

  • You create a volume, mount it on an instance, format it, and write some data to it.
  • Then you periodically snapshot the volume for backup purposes.
  • If you don’t need the instance anymore, you may terminate it and, after unmounting the volume, you always take a final snapshot. If the instance crashes instead of properly terminating, you also always take a final snapshot of the volume as it was left.
  • When you launch a new instance on which you want the same data, you create a fresh volume from your snapshot of choice. This may be the last snapshot, but it could also be a prior one if it turns out that the last one is corrupt (e.g. in the case of an instance crash or of some software failure).

By creating a volume from the snapshot you achieve two things: first, you are independent of the availability zone of the original volume, and second, you have a repeatable process in case mounting the volume fails, which can easily happen, especially if the unmount wasn’t clean.

Now, of course, in some situations you can directly remount the original volume instead of creating a new volume from a snapshot as an optimization. This applies if the new instance is in the same availability zone, the volume corresponds to the snapshot that we’d like to mount, and the volume is guaranteed not to have been modified since (e.g. by a failed prior mount). The best is to think of the volume as a high-speed cache for the snapshot.
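The three conditions for the remount optimization fit in one small predicate. This is a sketch only: the `Volume` record and its field names are hypothetical bookkeeping that you would have to maintain yourself, not something the EC2 API hands you in this form.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    zone: str                      # availability zone the volume lives in
    source_snapshot: str           # snapshot the volume's contents correspond to
    modified_since_snapshot: bool  # True if anything was written since then


def can_remount_directly(volume, wanted_snapshot, instance_zone):
    """True only if reusing the volume is equivalent to creating a fresh
    volume from `wanted_snapshot` in the new instance's zone."""
    return (
        volume.zone == instance_zone
        and volume.source_snapshot == wanted_snapshot
        and not volume.modified_since_snapshot
    )


vol = Volume(zone="us-east-1a", source_snapshot="snap-final",
             modified_since_snapshot=False)
same_zone = can_remount_directly(vol, "snap-final", "us-east-1a")
other_zone = can_remount_directly(vol, "snap-final", "us-east-1b")
```

If any of the three checks fails, fall back to the general path and create a fresh volume from the snapshot.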

Price

Estimating the costs of EBS is really quite tricky. The easy part is the storage cost of $0.10 per GB per month. Once you create a volume of a certain size you’ll see the charge. The $0.10 per million I/O transactions are much harder to estimate. To get a rough estimate you can look at /proc/diskstats on your servers. This will include something like this:

   8  160 sdk 9847 77 311900 56570 1912664 3312437 160672914 211993229 0 1597261 212049797
   8  176 sdl 333 86 4561 1538 895 51 19002 20131 0 4043 21669

which is just a pile of numbers. Following the explanation for the columns you should sum the first number (reads completed) and the fifth number (writes completed) to arrive at the number of I/O transactions (9847+1912664 for /dev/sdk above). This is not 100% accurate but should be close (I believe subtracting the 2nd and 6th numbers gets you closer yet, but I prefer an over-estimate). As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful.
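The arithmetic above is easy to script. The Python sketch below parses a /proc/diskstats line as described (reads completed plus writes completed) and turns a sustained transaction rate into a monthly charge at $0.10 per million I/Os. The function names are my own, and a 30-day month is assumed.

```python
PRICE_PER_MILLION_IOS = 0.10          # dollars, as quoted above
SECONDS_PER_MONTH = 30 * 24 * 3600    # assumes a 30-day month


def io_transactions(diskstats_line):
    """Return (device, reads completed + writes completed) for one line."""
    fields = diskstats_line.split()
    device = fields[2]                       # after major and minor numbers
    counters = [int(value) for value in fields[3:]]
    reads_completed = counters[0]            # 1st number after the device
    writes_completed = counters[4]           # 5th number after the device
    return device, reads_completed + writes_completed


def monthly_io_cost(transactions_per_second):
    """Dollars per month for a sustained I/O transaction rate."""
    per_month = transactions_per_second * SECONDS_PER_MONTH
    return per_month / 1_000_000 * PRICE_PER_MILLION_IOS


sdk = "   8  160 sdk 9847 77 311900 56570 1912664 3312437 160672914 211993229 0 1597261 212049797"
device, total = io_transactions(sdk)     # ("sdk", 1922511)
busy_db = monthly_io_cost(17)            # about 4.41 dollars
hammered = monthly_io_cost(1000)         # about 259.20 dollars
```

The counters in /proc/diskstats are cumulative since boot, so to get a rate you read the file twice and divide the difference by the time between the two readings.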

The cost of snapshots is harder to estimate due to their incremental nature. First of all, only the blocks written are captured on S3 (i.e. blocks on the volume that have never been written are not stored on S3). Second, it’s tricky to talk about the cost of a single snapshot because successive snapshots share blocks.

Summing it up

All in all it’s amazing how simple EBS is, yet how complex a universe of options it opens. Between snapshots, availability zones, pricing, and performance there are many options to consider and a lot of automation to provide. Of course at RightScale we’re busy working out a lot of these for you, but beyond that it is not an overstatement to say that Amazon’s Elastic Block Store brings cloud computing to a whole new level. I’ll repeat what I’ve said before: if you’re using traditional forms of hosting it’s gonna get pretty darn hard for you to keep up with the cloud, and you’ve probably already fallen behind at this point!


Cloud Computing wouldn’t exist without Open Source

July 23, 2008

I’m at OSCON this week drinking from the open source that made RightScale possible. In talking to Tim O’Reilly I noticed that he hadn’t realized how integral Open Source is to the cloud. So maybe this isn’t as obvious as I thought and worth writing a blog entry about.

Cloud Computing is all about the flexibility to launch and terminate servers on demand, or more generally, to acquire and release resources on demand. This can help solve many tricky problems, from reliability, scaling, development, testing, to business flexibility needs. Where open source comes into the picture is when you think about the software licenses for the software stacks you’re running on all the servers you’re launching. If you are normally running 2 servers but today you need 10 did you consider whether you have licenses for all the software on the additional 8 servers? Most commercial software seems to be licensed by the server or by the cpu, and obviously this just doesn’t cut it in the cloud. If it weren’t for open source stacks no production service would be operating in the cloud today; everyone would still be waiting for software vendors to ‘get it’ and change their licenses to enable efficient use in the cloud (yeah, right…).

But all this is starting to change. The vast majority of software vendors we talk to are in the process of trying to figure out how they can sell their software in the cloud: what technical changes are necessary to enable their customers to deploy their software into the cloud environment, and what business model changes are necessary to offer frictionless sales into the cloud. Of course deploying software on the RightScale platform offers a number of benefits, including some new features we’re currently adding to support publishing and charging by the use. But the bottom line really is that without open source we wouldn’t have cloud computing today.


Cloud Computing vs. Grid Computing

July 7, 2008

Recently Rich Wolski (UCSB Eucalyptus project) and I were discussing grid computing vs. cloud computing. An observation he made makes a lot of sense to me. Since he doesn’t blog [...], let me repeat here what he said. Grid computing has been used in environments where users make few but large allocation requests. For example, a lab may have a 1000 node cluster and users make allocations for all 1000, or 500, or 200, etc. So only a few of these allocations can be serviced at a time and others need to be scheduled for when resources are released. This results in sophisticated batch job scheduling algorithms of parallel computations.

Cloud computing really is about lots of small allocation requests. The Amazon EC2 accounts are limited to 20 servers each by default and lots and lots of users allocate up to 20 servers out of the pool of many thousands of servers at Amazon. The allocations are real-time and in fact there is no provision for queueing allocations until someone else releases resources. This is a completely different resource allocation paradigm, a completely different usage pattern, and all this results in a completely different method of using compute resources.

I always come back to this distinction between cloud and grid computing when people talk about “in-house clouds.” It’s easy to say “ah, we’ll just run some cloud management software on a bunch of machines,” but it’s a completely different matter to uphold the premise of real-time resource availability. If you fail to provide resources when they are needed, the whole paradigm falls apart and users will start hoarding servers, allocating for peak usage instead of current usage, and so forth.


Define Cloud Computing

May 26, 2008

It looks like pretty soon all computing will be called cloud computing, just because the cloud is “in.” Fortunately most computer savvy folks actually have a pretty good idea of what the term ‘cloud computing’ means: outsourced, pay-as-you-go, on-demand, somewhere in the internet, etc. What is still confusing to many is how the different offerings compare from Amazon Web Services to Google App Engine and Force.com. I recently heard a characterization of three different levels of clouds which really helps put the various offerings into perspective. Here’s my rephrasing:

Applications in the cloud: this is what almost everyone has already used in the form of gmail, yahoo mail, wordpress.com (hosting this blog), the rest of google apps, the various search engines, wikipedia, encyclopedia britannica, etc. Some company hosts an application in the internet that many users sign-up for and use without any concern about where, how, by whom the compute cycles and storage bits are provided. The service being sold (or offered in ad-sponsored form) is a complete end-user application. To me all this is SaaS, Software as a Service, looking to join the ‘cloud’ craze.

Platforms in the cloud: this is the newest entry where an application platform is offered to developers in the cloud. Developers write their application to a more or less open specification and then upload their code into the cloud where the app is run magically somewhere, typically being able to scale up automagically as usage for the app grows. Examples are Mosso, Google App Engine, and Force.com. The service being sold is the machinery that funnels requests to an application and makes the application tick.

Infrastructure in the cloud: this is the most general offering that Amazon has pioneered and where RightScale offers its management platform. Developers and system administrators obtain general compute, storage, queueing, and other resources and run their applications with the fewest limitations. This is the most powerful type of cloud in that virtually any application and any configuration that is fit for the internet can be mapped to this type of service. Of course it also requires more work on the part of the buyer, which is where RightScale comes in to help with set-up and automation.

Looking at these different types of clouds it’s pretty clear that they are geared toward different purposes and that they all have a reason for being. The platforms in the cloud are a very interesting offering in that they promise to take some of the mundane pain away from dealing with the raw infrastructure. But it’s not at all clear to me that the vendors can live up to the promise of managing everything seamlessly and that the functional constraints won’t cause applications to have to move up to the infrastructure clouds as they mature and gain complexity. It would not be good if toy apps started on the platform clouds and then moved to the infrastructure clouds as they gain adoption. One possible outcome is a hybrid model where the canonical application core remains in the platform cloud and the odd pieces of functionality and/or the parts that need to scale the most drastically move off to infrastructure clouds.


MySQL comes to the Amazon EC2 Cloud

May 9, 2008

I’m sure you’ve seen the announcement that MySQL/Sun now supports the MySQL Enterprise Server product on Amazon EC2. Of course the MySQL community edition has been in the cloud for a very long time and we have engineered and supported it for many months. But it is nevertheless exciting to see another vendor (after RedHat) embrace the cloud and offer support for its software there.

Interestingly MySQL hasn’t announced any new product or any new pricing. I suppose this just means they don’t really care whether the copy of MySQL Enterprise you bought is running on your server or on Amazon’s server, and why would they as long as you pay. I also suppose this means that if you call for support you won’t get a blanket “sorry, we don’t support this configuration” when you mention that it’s running in EC2. We’re trying to find out whether there is anything additional happening, like special training for the support reps on EC2-specific issues.

No doubt the next step will be for the folks at MySQL to go a step further and offer more flexible pricing in keeping with the variable nature of cloud usage. As I’ve written before, I am convinced that cloud computing will dramatically change processes around managing databases for the better. All DBAs that have worked for me have hated change. If the database is running don’t touch it! If it’s broken don’t touch it until all the causes are figured out! As a result the most common request I’ve gotten from my DBAs is “Thorsten, I really need another box”. If the database is running fine my DBAs always wanted another box to test out the changes I requested before committing them for real. Or they wanted to run that killer reporting job where it’s guaranteed not to impact production. If something broke and we failed-over to the slave I was breathing down their neck to get the broken box back together asap so we had a replicated database again. But they always wanted to take their time and analyze the cause for failure and perhaps even try to reproduce it. Agonizing tugs of war!

I’m sure you got it: the cloud changes all that. In all the above cases the answer becomes trivial: just launch another instance. Another slave machine with the database replicating in sync is one click away (at least with RightScale’s automation). And it only costs a few bucks in server charges to run it for the couple of days most of the above scenarios take. Once we also have by-the-hour MySQL Enterprise Server licensing to match, we’ll be in database heaven. Mårten, I know you will get there soon and we’ll be waiting!


RightScale featured on Mashable [podcast]

April 30, 2008

Hear our CEO, Michael Crandell, in a featured interview with Mark Hopkins on his show, “Mashable Conversations.” Michael gives a great overview of cloud computing and explains the growing interest among start-ups, mid-size, and even enterprise level businesses. The interview explains why cloud computing is such a disruptive technology, as well as RightScale’s role in this rapidly evolving industry. Michael talks about some hot topics in cloud computing such as data security issues and the overall stability of web infrastructures built on the cloud. He also provides a realistic comparison between hosting websites or applications on the cloud vs. hosting them on traditional hosting infrastructures.

Tracking changes to your deployment

April 30, 2008

I’m sure you’ve had this experience: your site worked great yesterday, and now it’s acting up. Something’s clearly broken. “Who changed what when !?!?” [expletive deleted] Of course many of our users have turned to us for help in those situations and we’re listening (oh, and we have shouted the above words ourselves). Yesterday we released part 1 of the answer, which is to track changes to a number of the objects that you can configure and manage via the RightScale dashboard. Here’s how this looks for a deployment where Eugene made a number of changes to quickly test the feature:

More along these lines is forthcoming, especially more rigorous version control for RightScripts and Server Templates. We’re as desperate for these features ourselves as you are. This, by the way, can be a great asset when doing software as a service, especially when one engages with the customer as much as we do: we are the most demanding and advanced users of our own system and as such we feel the pain just as intensely as many of our users. This allows us to close the loop on product priorities much more rapidly than we otherwise could. But I’m sure we’re also failing to see things that we should be adding, so if you have a suggestion, please do take a couple of minutes and send me email or post a comment here!


Animoto’s Facebook scale-up

April 23, 2008

The Animoto guys did hit the jackpot on Facebook this past week. Jeff Barr mentioned a few of the stats on his blog: Animoto ramped from 25,000 users to 250,000 users in three days, signing up 20,000 new users per hour at peak. The system they run using RightScale is quite complicated with the www.animoto.com web site, then a separate site for the facebook app run by Hungry Machines, both of these feeding into a back-end web services site which then orchestrates uploads, and, most importantly, the render farm which creates the cool videos.

The upshot is that there are a lot of moving parts! Each one of the subsystems consists of many servers and everything needs to scale up as the load increases. What Animoto CTO Stevie Clifton did really well is to connect all the operations using queues, many of them in SQS. One queue contains work items that list photo URLs to fetch from other sites, such as Facebook, Flickr, etc., and that is processed by one array of worker instances. Another queue has the list of render jobs and each work item in there points to the set of photos sitting at the ready in S3 and to the music files also on S3. All of these queues are held in Amazon SQS and the arrays of worker instances are managed by RightScale. This allows the monitoring part of our service to detect when the queue gets too large and more instances need to be launched. What’s nice about using queues is that it decouples the various parts of the site, so if the renderers get backlogged the queue simply builds up and users have to wait a little longer for their video to be produced. Waiting is not good, but dropping requests on the floor is much worse!
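The queue-plus-worker-array pattern can be sketched in Python with the standard library `queue.Queue` standing in for SQS. Everything specific here is an assumption for illustration: the function names, the sizing rule (one instance per `jobs_per_instance` queued items), and the cap are hypothetical, not the rule the RightScale monitoring actually applies.

```python
import math
import queue


def instances_to_launch(queue_length, running, jobs_per_instance, max_instances):
    """How many extra workers to start so the backlog per worker stays
    at or below `jobs_per_instance`, never exceeding `max_instances`."""
    wanted = math.ceil(queue_length / jobs_per_instance)
    wanted = min(max(wanted, 1), max_instances)   # keep at least one worker
    return max(wanted - running, 0)


def drain(jobs, handle):
    """Worker loop: process items until the queue is empty. A backlog only
    makes users wait; no request is dropped."""
    done = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return done
        handle(job)
        done += 1


# The front end only enqueues; it never talks to the renderers directly.
render_jobs = queue.Queue()
for n in range(100):
    render_jobs.put({"photos": f"photos/{n}", "music": f"music/{n}"})

extra = instances_to_launch(render_jobs.qsize(), running=2,
                            jobs_per_instance=10, max_instances=50)
rendered = []
processed = drain(render_jobs, rendered.append)
```

Because producers and consumers share nothing but the queue, each side can be scaled, restarted, or replaced without the other noticing.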

Producing the videos takes 8-9 minutes on average and at peak Animoto has pumped over 450 render requests per minute into the queue. Last week we ended up with just under 3500 instances in the various Animoto deployments and tonight it was more than 4000 and it looks like it will not drop under 2000 instances through the night. Yikes! At peak RightScale was launching and configuring 40 new instances per minute pretty much sustained to handle the injection of thousands of render jobs that needed special handling. Mind-boggling stuff…
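As a rough sanity check on those numbers, Little's law says the average number of jobs in progress equals the arrival rate times the time each job takes. The sketch below assumes one render per instance at a time and a sustained request rate, neither of which is stated above; the function name is made up for illustration.

```python
def renders_in_flight(requests_per_minute, minutes_per_render):
    """Little's law: jobs in progress = arrival rate x time per job."""
    return requests_per_minute * minutes_per_render

low = renders_in_flight(450, 8)   # 8 minutes per video
high = renders_in_flight(450, 9)  # 9 minutes per video
print(low, high)
```

At 450 requests per minute and 8 to 9 minutes per video this gives 3600 to 4050 renders in progress, the same order of magnitude as the instance counts quoted above.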

Lessons learned… First of all, when you scale 10x and then 10x again to run on thousands of servers every little problem turns into a large one. That insignificant error rate of 0.1% now applies to 1000 operations per second and you end up with an error a second, and actually, the error rate itself typically increases too because of the added load on the system. So suddenly it’s not something you can ignore anymore. An example of this was having exponential backoff for uploads to S3 when using curl, but forgetting that the 5th retry exceeds the S3 connection timeout. Normally, this happens only once in a blue moon, but when tens of uploader instances are banging hard on one S3 bucket the S3 error rate goes up a bit and suddenly uploads are failing left and right. Once we changed this to a constant retry timeout it all went smoothly again.
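The backoff trap is easy to see once the delays are written out. The sketch below is only an illustration: the 2 second base delay, the doubling factor, and the 20 second timeout are assumed values, not the ones Animoto or S3 actually used, and the helper names are hypothetical.

```python
def exponential_delays(base, factor, retries):
    """Wait before each retry when the delay grows by `factor` every time."""
    return [base * factor ** i for i in range(retries)]

def constant_delays(delay, retries):
    """Wait before each retry when every retry waits the same time."""
    return [delay] * retries

def first_retry_over(delays, timeout):
    """1-based number of the first retry whose wait exceeds `timeout`, or None."""
    for attempt, delay in enumerate(delays, start=1):
        if delay > timeout:
            return attempt
    return None

TIMEOUT = 20  # seconds, assumed for illustration

print(exponential_delays(2, 2, 5))                             # [2, 4, 8, 16, 32]
print(first_retry_over(exponential_delays(2, 2, 5), TIMEOUT))  # 5
print(first_retry_over(constant_delays(2, 5), TIMEOUT))        # None
```

With these assumed values the first four retries are harmless and only the fifth one waits longer than the timeout, which is why the problem stays hidden until the error rate goes up enough for the fifth retry to be reached regularly.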

Now does this mean that you should fix all the little issues before going live? Of course not: you can’t! What I’ve found to be most effective is to think about every little problem that you come across for a few minutes. Don’t just brush it aside as being insignificant. It is now, but it *will* trip you up tomorrow or the day after. So spend 5 minutes to troubleshoot and hypothesize as far as you can get. You don’t have to solve it immediately. Think up a work-around or how you would troubleshoot further, or perhaps how you’d fix it. Then move on. Come tomorrow, when and if the issue becomes big, you will have an invaluable head-start. Instead of being caught off-guard you’ll be able to immediately kick into action and solve the issue.

Another lesson learned is not to forget the manual overrides. Yup, I know, we have this super smart auto-scaling algorithm. But we also have manual overrides and when Animoto went from about 50 instances to 4000 instances we used them. We wanted to make sure that the extra instances didn’t overload the database or the queue and that everything was running smoothly (and, yes, to take a pause and fix some issues before scaling up further). Stevie and the Hungry Machines guys had also put in some overrides to queue up automatically generated videos and let manually requested ones zip through. This was essential in keeping the active users happy when everything first exploded and the system had trouble keeping up with the load. A lot of the queued videos were processed a bit later when the load went back down. Automation is cool for the daily routine events but for something like this you want the overrides.
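One simple way to picture the video override is two queues plus a switch. This is a guess at the shape of the idea, not Animoto's code: the class, its methods, and the hold_automatic flag are all hypothetical.

```python
from collections import deque

class RenderQueues:
    """Two-lane render queue with a manual override (illustrative only)."""

    def __init__(self):
        self.manual = deque()        # videos a user explicitly asked for
        self.automatic = deque()     # automatically generated videos
        self.hold_automatic = False  # the override: park automatic work

    def submit(self, job, manual):
        (self.manual if manual else self.automatic).append(job)

    def next_job(self):
        # Manually requested videos always go first.
        if self.manual:
            return self.manual.popleft()
        # Automatic videos run only while the override is off.
        if self.automatic and not self.hold_automatic:
            return self.automatic.popleft()
        return None

queues = RenderQueues()
queues.submit("auto-1", manual=False)
queues.submit("user-1", manual=True)
queues.hold_automatic = True    # load spike: hold back automatic videos
first = queues.next_job()       # the manual request zips through
second = queues.next_job()      # nothing else may run while held
queues.hold_automatic = False   # load is back down
third = queues.next_job()       # the parked automatic video runs now
print(first, second, third)
```

Nothing is dropped while the override is on: the automatic jobs simply wait in their queue until someone flips the switch back.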

Animoto is a great example of leveraging the cloud for its strengths of instant availability and virtually limitless scope. Of course, most sites don’t need to launch 4000 servers in one go, but it’s nice to know you can if you need to. Whether the number is 4 or 40 or 4000, getting the resources you need at the time you need them is a key benefit of “right-scaling” your deployment using the cloud. Looking at our database today I noticed that RightScale has launched, configured, and managed over 200,000 instances to date! That’s an impressive number, but as the Animoto scale-up proves, we’re only just beginning…

Animoto AutoScaling Graphs


Amazon takes EC2 to the next level with persistent storage volumes

April 13, 2008

The Amazon folks have gone public today with the next EC2 feature: persistent storage. The official information is found in Jeff Barr’s blog entry and in Matt’s forum post. Calling the persistent storage a “feature” is actually quite an understatement: it really revolutionizes EC2 and enables usage patterns that any big-iron SAN user would die for.

The basics

What does this persistent storage look like? We’ve been testing it for a while and are thoroughly impressed. The Amazon folks are clearly still fine-tuning a lot of the details, but basically you can create storage volumes in the cloud next to the server instances you launch in the cloud. Think of having a really big SAN in the cloud in which you can create volumes of up to 1TB each with a single API call, or with a simple click in the RightScale UI (yes, of course we’ll have nice support for the storage volumes on our site coupled with some neat automation and an array of pre-packaged solutions). You can mount one or multiple volumes on an instance and they appear just like the other local drives, so you can format them as you like, set up striping and do other useful things.

The feature that really makes the storage volumes sizzle is the ability to snapshot them to S3 and then create new volumes from the snapshots. The snapshots are great for durability: once a snapshot is taken it is stored in S3 with all the reliability attributes of S3, namely redundant storage in multiple availability zones. This essentially solves the whole backup issue with one simple API call or click in the RightScale UI. You can also easily restore a snapshot by creating a fresh volume from it. This feature is useful beyond just restoring a backup: you may restore to another instance where you now have a clone of the data and can do whatever you want to it. Wow!

The cool stuff

There are so many great uses for the storage volumes that it’s impossible to write them all up in a single blog post, and we obviously haven’t thought of them all (or even close). The first usage scenario we looked into is running a database. Up to today the only setup for a mission critical database we recommend is using two instances with real-time database replication and frequent backups to S3. We’ve now installed our Manager for MySQL replicated set-up for many, many customers and it works very well. In short, we use MySQL replication for redundancy and frequent (like every 10 minutes) backups to S3 on the slave to guard against the unlikely event of simultaneous failure of both instances located in different availability zones.

With the storage volumes the Manager for MySQL set-up works even better. Instead of having to tar-up the database files and upload them to S3 we can just take a snapshot. And in order to initialize a slave we simply create a volume for it from the last snapshot of the master and launch the replication: no more rsync of the data is necessary. It’s really nice to see how all the automation we’ve built stays in place with the new Amazon capabilities and saves just as many headaches as before, it just gets turbocharged by the storage volumes!

In addition, the storage volumes enable slightly lower-end database offerings. Since the storage volumes are more durable than local instance storage a lot of the risk of losing it all if the instance dies goes away. It is now possible to run a single instance with the database data living on a storage volume and to take frequent snapshots to get backups onto S3. Should the instance die, it is very simple to launch a fresh one using the same storage volume. Typically it would take only a few minutes for the new instance to come up and take off where the old one stopped! Of course this set-up has more downtime when compared to the redundant database set-up, and one has to be really careful in setting everything up to minimize the time it takes to mount the volume and to ensure a successful database recovery.

Just as the storage volumes enable the reliable use of single-instance databases they also enable single-tenant appliances in EC2. It is now possible to host the data for a single-tenant virtual appliance on a storage volume and mount it on an instance. What’s really cool is the decoupling of the data from the instance. It means that you can start a customer on a small instance and if they outgrow it, you can migrate them almost seamlessly to a large and later an x-large instance, all using the same storage volume. Beyond an x-large a couple of interesting options are possible to increase performance further, such as striping multiple storage volumes. EC2 really brings virtual appliances to the next level!

The S3 snapshots enable some completely different and very intriguing usage scenarios. Suppose you’re doing some DNA matching against a genome data set on 1000 instances. In addition to firing up 1000 instances on a whim you can, also on a whim, clone a nicely prepared snapshot of the data set 1000 times to create 1000 volumes, one for each instance. BANG! This way they can all independently crawl over the data set. This type of massive (essentially read-only) cloning really opens up new possibilities in running such large computations in a cost effective manner.
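The semantics that make this work are that a snapshot is a frozen point-in-time copy and that every volume created from it is independent of the others. The toy model below shows just that, entirely in memory. It is not the EC2 API: the Volume and Snapshot classes and their methods are hypothetical stand-ins.

```python
class Snapshot:
    """Frozen point-in-time copy of a volume's blocks (toy model)."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)  # copy, so later writes cannot leak in

    def create_volume(self):
        return Volume(self._blocks)

class Volume:
    """Block device reduced to a dict of block index -> data (toy model)."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})  # each volume owns its own copy

    def write(self, index, data):
        self.blocks[index] = data

    def snapshot(self):
        return Snapshot(self.blocks)

master = Volume()
master.write(0, "genome data")
snap = master.snapshot()
master.write(0, "edited after the snapshot")  # does not affect snap

clones = [snap.create_volume() for _ in range(1000)]  # one per instance
clones[0].write(1, "scratch results")                 # stays local to clone 0

print(len(clones), clones[1].blocks)
```

Each of the 1000 clones starts from the same prepared data and can then be read, or even scribbled on, without touching the snapshot or its siblings.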

Summing it up

I’ll stop here, but clearly the cloud has just squared in size! Two years ago, when I started on EC2, there were only small instances available and the sentiment was that in order to get the horizontal scalability and pricing of the cloud you had to accept inferior features. In the meantime we’ve gotten multiple instance sizes plus recently the remappable IP addresses and availability zones. That already indicated that computing in the cloud would soon surpass computing in traditional colos or in your own datacenter not just in scale and price, but also in feature set. With the addition of the storage volumes with all the cool snapshot features it’s now a fait accompli: the cloud adopters will have much more computing horsepower and flexibility at their fingertips than those who are still racking their own machines. It’s going to be like agile software development: if you want to survive as an internet/web service you will have to compute in the cloud or your competitors will leave you in the dust by being able to deploy faster, better, and cheaper.

Update: Werner Vogels, Amazon’s CTO also blogs about the storage volumes in all-things-distributed with a little more background perspective. The Amazon folks are getting pretty coordinated with news appearing at the same time on their blogs and the forums. Maybe I missed it, but I don’t think they even press release this stuff…