Who's afraid of AWS? (2)
I promised a second entry so here we go.
Searchstorage have an article up covering user issues with AWS. It's interesting to see the negative feedback from SmugMug since, if I recall, they were quoted singing its praises earlier in its lifecycle. In general, however, the consensus is in: perfectly fine for small businesses with little to no infrastructure, not bad for backups, but more than likely unacceptable for anyone with real scale.
These are all first generation approaches to storage from a socket so we'll see this idea continue to be developed as time goes on.
Let's talk about their other offering, the Elastic Compute Cloud. This is server virtualization, nothing more, nothing less. You buy compute time on non-persistent Xen virtual machines running Linux on x86-64 processors (2GB of RAM, 150GB of storage per instance).
The machines can be created and destroyed by the user on the fly, which is useful the next time you need a QUAKE server for internet gaming. As I mentioned, they're not persistent, so when you shut one down its virtual volumes are nuked and reclaimed, but you can export the state of the volumes to S3 as a machine image which you can then load into any number of instances you care to create. You can also forget your plans to use these instances as servers out on the net for the long term, as IP assignment is done by DHCP. Reboot the server or renew the lease and who knows what address you'll end up with.
What EC2 is good for is short-term ad-hoc work; it's not designed to be a long-term thing.
Now pricing is where things get a bit sticky. You pay 10 cents per instance hour, which means it doesn't matter if the virtual machine is 98% idle; you're paying the exact same price as the person who's running it at 100% utilization. You also pay 20 cents per GB of data transferred, unless that data is being transferred between EC2 and S3.
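To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. It assumes only the two rates quoted above; the function names, parameters and the choice to work in whole cents are mine for illustration and are not part of any Amazon API.

```python
# Rates as quoted in the post; everything else here is illustrative.
INSTANCE_HOUR_CENTS = 10   # per instance hour, busy or idle
TRANSFER_GB_CENTS = 20     # per GB transferred, excluding EC2 <-> S3 traffic


def bill_cents(instances, hours, billable_transfer_gb=0):
    """Total bill in cents for a number of instances run for a number of hours."""
    return instances * hours * INSTANCE_HOUR_CENTS + billable_transfer_gb * TRANSFER_GB_CENTS


def cents_per_busy_hour(utilization_pct):
    """Effective price of one hour of actual work at a given utilization (1-100)."""
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in the range 1-100")
    # The hourly rate is flat, so idle time inflates the cost of the useful part.
    return INSTANCE_HOUR_CENTS * 100 / utilization_pct


if __name__ == "__main__":
    # One instance left up for a 30-day month (720 hours), pushing 50GB to the net.
    print(bill_cents(1, 720, 50) / 100)      # dollars for the month
    # The 98% idle machine from the paragraph above versus a fully loaded one.
    print(cents_per_busy_hour(2) / 100)      # dollars per hour of real work
    print(cents_per_busy_hour(100) / 100)
```

The second function is the point of the complaint: the bill is the same either way, so the lower the utilization, the more each hour of real work costs.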
So what have we learned from this quick look at Amazon Web Services?
-Sweet damn all.
Oh wait...we've learned that none of this is rocket science and you can build the same stuff on your own, with better response times, in your own data center using off-the-shelf components. You just might not be able to buy the hardware and bandwidth in as much bulk as Amazon can.
Smugmug still loves S3 :)
http://blogs.smugmug.com/onethumb/2007/03/08/amazon-s3-the-speed-of-light-problem/
Could you build your own S3/EC2 if you had unlimited capital? Well, sure. But most companies don't. The ability to leverage someone else's infrastructure is a pretty significant value proposition.
BTW, if you're looking to set up a utility computing service in your own data center, check out http://www.3tera.com (disclaimer: I'm on their advisory board). 3Tera's software allows you to package a complex app into one single self-contained entity, which can be scaled from a fraction of a server to dozens of machines (I think 64 is the max they've tested) without making any changes to the code. Packaged apps can also be dragged and dropped between grids, or repeatedly deployed as additional instances.
Posted by:Isabel Wang | March 09, 2007 at 05:13 AM
Thanks for the clarification Isabel; it seemed a bit unusual given the praise SmugMug had showered on S3 previously.
3tera looks interesting; I know EMC has a large GRID project for Enterprise Applications so I'll be waiting to see the level of overlap when EMC ships their offering. ;)
Posted by:Storagezilla | March 09, 2007 at 11:18 AM
As Isabel already mentioned, we still love S3. :)
In fact, I don't think I said anything negative in my entire interview. I was factually accurate - Amazon has problems just like Google, eBay, Yahoo, etc do. But S3 is extremely fast and reliable for us, and we're saving a lot of money with it.
Oh, and I'd certainly consider us to be "anyone with real scale." With 200TB of data, tens of millions of images served per day, I'd call that scale.
Don
Posted by:Don MacAskill | March 09, 2007 at 09:50 PM
Glad to hear things are going well for you and S3, Don. I wasn't being disparaging when I used the term "real scale"; I just deal with orders of magnitude higher measurements of data capacity on a day-to-day basis.
I do work for EMC Corp after all. ;)
Posted by:Storagezilla | March 09, 2007 at 10:46 PM
To me, the most useful thing about S3 is that it sets a benchmark for storage pricing: everyone now knows they should be aiming to pay around $0.15 per GB per month for bulk storage, which works out over a year at $1800 per TB (taking 1TB as 1,000GB).
It's a tough target to meet while retaining decent performance, but it's certainly achievable.
Posted by:Ewan | March 09, 2007 at 11:59 PM
Hmmmmmm, but we are talking DAS storage here with what's probably IP-based replication. Maybe even using rsync, I don't know.
How that pricing would transfer to an intelligent array with snapshots, FC or iSCSI replication, multipath support, blah, blah, blah, I don't know either.
Posted by:Storagezilla | March 10, 2007 at 02:24 AM