With all the talk of Gmail going offline today you'd swear it was the first time it had ever happened. Here's something you might not have noticed: Google have been having trouble with Gmail for a long time now.
I can recall four or five times over the past 12 months when I've gone to log in to the web interface and, for some reason, couldn't get past the front page. You'd think this would be a case of fat-fingering a password, but a check of my desktop email client showed that it couldn't log in at that moment either. Wait a few seconds, try again, and it works fine. Maybe it's my net connection, you think, and move on. Or perhaps Gmail burps every now and then and you pass it off as your net connection?
It is possible that we now have unrealistic expectations of availability coming from our "five nines you're paying me if you don't meet the service level agreement" data centre world. But what's Google's issue?
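To put a number on "five nines", here's a quick back-of-the-envelope sketch. The function name and the 365.25-day year are my own choices for illustration, not anything taken from a real SLA; the point is only how little downtime each extra nine leaves you.

```python
# Rough downtime budget implied by "N nines" of availability.
# Assumption: an average year of 365.25 days; real SLAs define their own
# measurement windows and exclusions.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Return the allowed downtime in minutes per year for `nines` nines.

    Three nines means 99.9% availability, five nines means 99.999%.
    """
    unavailable_fraction = 10 ** -nines
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
```

Five nines works out to a little over five minutes of downtime a year, so by that yardstick a handful of login failures lasting a few seconds each barely registers. None of which tells us what's actually going wrong at Google.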
We don't know for sure, but I have a theory. Might it be Google's mythical infrastructure? Something that, if you listen to the hype, every company should aspire to, even if their infrastructure needs are relatively modest (if not fractional) in comparison.
Google's bluster around their infrastructure conveniently ignores that it's optimised for one hyperscale application. Their search application.
That's the cash cow and that's what they've tuned the hell out of their infrastructure for. Tuned the block size, tuned for the I/O patterns unique to that application and tuned for the read/write frequency of that workload. Everything they can optimise to increase the performance of their search application means more money in the bank.
But then they come along and try to put Gmail on top of that, and it's been stuttering away in the background ever since. It's also funny to note that Gmail, which one assumes is sharing -cough cough- Google's general-purpose (search-optimised) infrastructure, has issues, and yet Google.com carries on without a blip.
The same way that Amazon.com coasts away just fine any time S3 is having a nervous breakdown.
Funny that.
Now, while Google suffers from trying to run things on an infrastructure optimised for one hyperscale application, you could say that S3 suffers from the opposite problem: it isn't optimised for any workload at all. It's so general purpose that it's cheap to get started on, but if you're very successful you'll quickly want to get off it because you can't optimise any of it.
The only things which are not commodities in infrastructure designed to support hyperscale applications are the performance optimisations.
Everything else you buy cheap and rack high.
