Computer Vision in the Elastic Compute Cloud

In a datacenter somewhere on the other side of the planet, a rack-mounted computer is busy hunting for patterns in photographs of Oxford.  It is doing this for 10 cents an hour, with more RAM and more horsepower than I can muster on my local machine. This delightful arrangement is made possible by Amazon’s Elastic Compute Cloud.

For the decreasing number of people who haven’t heard of EC2, it’s a pretty simple idea. Via a command line interface you can “create” a server running in Amazon’s datacenter. You pick a hardware configuration and OS image, send the request and voilà: about 30 seconds later you get back a response with the IP address of the machine, to which you now have root access and sole use. You can customize the software environment to your heart’s content and then save the disk image for future use. Of course, now that you can create one instance you can create twenty. Cluster computing on tap.

This is an absolutely fantastic resource for research. I’ve been using it for about six months now, and have very little bad to say about it. Computer vision has an endless appetite for computation. Most groups, including our own, have their own computing cluster, but demand for CPU cycles typically spikes around paper deadlines, so being able to instantly double or triple the size of your cluster is very nice indeed.
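To put rough numbers on that kind of burst, here is a minimal back-of-the-envelope sketch. It assumes the flat 10 cents per instance-hour quoted in this post; real pricing depends on the instance type, and the function name and the deadline scenario are made up for illustration.

```python
# Assumed flat rate taken from this post; actual EC2 pricing varies by instance type.
HOURLY_RATE_USD = 0.10


def rental_cost(instances: int, hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Cost in USD of running `instances` machines for `hours` hours each."""
    return instances * hours * rate


# Hypothetical deadline crunch: 20 extra machines for three days.
burst = rental_cost(instances=20, hours=3 * 24)

# For comparison: a single machine left running all year.
always_on = rental_cost(instances=1, hours=24 * 365)

print(f"burst: ${burst:.2f}, always-on: ${always_on:.2f}")
```

Under these assumptions the three-day burst costs about as much as two months of one always-on machine, which is why the model suits spiky demand better than permanent hosting.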

Amazon also have some high-spec machines available. I recently ran into trouble when I needed about 10GB of RAM for a large learning job. Our cluster is 32-bit, so 4GB of addressable RAM is the limit. What might have been a serious headache was solved with a few hours and $10 on Amazon EC2.

The one limitation I’ve found is that disk access on EC2 is a shared resource, so bandwidth to disk tends to be about 10MB/s, as opposed to, say, 70MB/s on a local SATA hard drive. Disk bandwidth tends to be a major factor in running time for very big out-of-core learning jobs. Happily, Amazon very recently released a new service called Elastic Block Store which offers dedicated disks, though the pricing is a little hard to figure out.
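To see why the bandwidth gap matters for out-of-core jobs, here is a rough sketch of how long a single sequential pass over a dataset takes at each rate. The 100GB dataset size is a hypothetical figure, not one from my own jobs, and the calculation ignores seek time, caching and CPU work.

```python
def read_time_hours(dataset_gb: float, bandwidth_mb_per_s: float) -> float:
    """Hours needed to stream `dataset_gb` gigabytes once at the given bandwidth."""
    megabytes = dataset_gb * 1000  # decimal units: 1 GB = 1000 MB
    return megabytes / bandwidth_mb_per_s / 3600


DATASET_GB = 100  # hypothetical dataset size, for illustration only

ec2_shared = read_time_hours(DATASET_GB, 10)  # ~10MB/s shared disk on EC2
local_sata = read_time_hours(DATASET_GB, 70)  # ~70MB/s local SATA drive

print(f"EC2 shared disk: {ec2_shared:.1f} h per pass")
print(f"local SATA disk: {local_sata:.1f} h per pass")
print(f"slowdown: {ec2_shared / local_sata:.0f}x")
```

An algorithm that makes many passes over the data pays that difference on every pass, so an I/O-bound job can easily end up several times slower on the shared disk.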

I should mention that for UK academics there is a free service called the National Grid Service, though personally I’d rather work with Amazon.

Frankly, the possibilities opened up by EC2 just blow my mind. Every coder in a garage now potentially has access to Google-level computation. For tech startups this is a dream. More traditional companies are playing too. People have been talking about this idea for a long time, but it’s finally here, and it rocks!

Update: Amazon are keen to help their scientific users. Great!

  • Is this system exclusively for those needing more processing power, or can it be used as a terminal server, as well?

    It would be useful for me to be able to access a virtual desktop through an encrypted channel.

  • You can use it for anything, more or less.

    Any web hosting company will give you a shell account on a shared Unix machine for much less. Amazon is almost $900 a year if you leave the machine running 24/7.

    What’s really different is that you get sole use of the machine, and it’s very easy to get a machine or bank of machines for a few hours at a time. It’s great if your computing needs are “bursty”.

  • I would only need to use it once in a while, mostly to get through firewalls.

    My Oxford terminal server account played the role perfectly, before it was cruelly disconnected.

  • Extremely interesting blog post, thank you for writing it. I have added your website to my bookmarks and will check back :) By the way, this is a little off topic, but I really like your blog’s layout.

  • I looked into this a bit last year. My main concern was the cost of sending image data up to the cloud; data transfer costs seemed very high to me.
