To start with

  • Disclaimer
    The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC. This is my blog, it is not an EMC blog.

Storagezilla’s blog

May 13, 2008

The Avamar Client explained. (Quickly)

As the question was asked, it's probably worthwhile to dig deeper into how Avamar does what it does.

Since Avamar is both a standalone backup product and a component of NetWorker, on each host to be backed up you'd install either the Avamar client or the NetWorker client.

The two important components in this discussion are the host-based Avamar Client, which does all the de-dup work, and the Avamar Data Store. The Data Store acts as the orchestration/policy engine if you're using Avamar on its own (NetWorker takes over those functions if used with NetWorker) and ensures the protection and integrity of the de-duped data after it receives it from a client across the network.

For those of you not up on NetWorker speak, feel free to skip to the section beyond the screenshot while I spend a moment speaking to NW Admins.

Still here? Fine. We'll drop talk of NetWorker after this section, but for those that are interested, the Avamar de-dup technology exists as a NetWorker ASM (Application Specific Module) in the NetWorker client. You get full indexing, browse, client config recovery through the NW CLI and GUI and all the usual stuff, while the Data Store shows up in the NetWorker Management Console as a De-Duplication Node.

image

Here's an oversimplified run-through of what the Avamar Client does when it backs up a host.

-It first walks the file system to identify modified files.

-The modified files are chunked using the Avamar sub-file variable length de-dup algorithm.

-The resulting chunks are compressed using a high speed compression algorithm.

-The compressed chunks are hashed to generate unique values.

-Those hash values are checked against a local hash cache. The cache is one or two small files containing the hash values of data which has previously been backed up by that client. Hash values, mind you, not the data itself.

-If values aren't in the local cache, the Data Store is queried to see if data with that unique value has already been stored there. (Queries, like all transmissions, are done in bursts, so before anyone asks, there's no "slow drip" across the network.)

If data with that hash value is present in the Data Store (It was backed up already by another backup client) the data is not transmitted across the LAN/WAN.

If data with that value is not present in the Data Store it is transmitted across the LAN/WAN.
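The steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real chunking is variable length and proprietary, so fixed-size chunks stand in for it here; zlib and SHA-1 are stand-ins for whatever compression and hash the client actually uses; and `DataStore`, `chunk` and `backup` are hypothetical names, not Avamar APIs. Burst transmission is not modelled.

```python
import hashlib
import zlib


class DataStore:
    """Hypothetical stand-in for the Data Store: maps hash -> compressed chunk."""

    def __init__(self):
        self.chunks = {}

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, payload):
        self.chunks[digest] = payload


def chunk(data, size=4096):
    """Fixed-size chunking (assumption: a simplification of variable-length chunking)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, local_cache, store):
    """Back up one modified file; return the number of bytes sent over the network."""
    sent = 0
    for piece in chunk(data):
        payload = zlib.compress(piece)               # compress the chunk
        digest = hashlib.sha1(payload).hexdigest()   # hash the compressed chunk
        if digest in local_cache:                    # this client already backed it up
            continue
        if not store.has(digest):                    # no client has stored it yet
            store.put(digest, payload)               # so it crosses the LAN/WAN
            sent += len(payload)
        local_cache.add(digest)                      # remember the hash, not the data
    return sent


store = DataStore()
laptop_cache, other_cache = set(), set()
data = b"".join(bytes([i]) * 4096 for i in range(4))   # four distinct chunks

print(backup(data, laptop_cache, store))   # first backup sends every chunk
print(backup(data, laptop_cache, store))   # repeat backup: local cache hits only
print(backup(data, other_cache, store))    # second client: Data Store already has it
```

Only the first call puts data on the wire. The repeat backup is answered from the local cache, and the second client's chunks are already in the Data Store.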

And that's why at EMC World last year in Orlando I wasn't worried when, after a week of being on the road, Avamar backed up my laptop to a Data Store located in a lab in Ireland via the incredibly crummy 802.11b WiFi connection in the hotel lobby.

And did so in record time.

At the time I still had renegade PST files (EMC archives email with EmailXtender to Centera, but I did have pre-email-archiving PSTs from years ago). Typical backup software would attempt to drag the entire 2GB PST files back just because I had opened them up and looked at some attachments, thereby modifying the PST files themselves.

Avamar, on the other hand, sent back a fraction of that data across the Atlantic, as only what had changed in the files was shot out of my wireless card.

And that's your 10 second look at how the Avamar Client does what it does.

May 12, 2008

Data Domain freak out

It's that time of the year again. Last week we had HP with a bulk storage product running warmed-over PolyServe that you can't actually buy, and this week we have Data Domain showing that they're either not paying attention or too dumb to figure out how things work.

Yes, it must be time for EMC World.

In Data Domain's case it was this nugget of nose gold from today's DD690 press briefing which I found hilarious.

The identified inline deduplication competition is the IBM Diligent product, said to be about a third as fast per controller with the Diligent 1000E 4-socket server, and EMC's Avamar RAIN Grid, which is 17 times slower than the DD690.

So Data Domain's box is faster at de-dup than the Avamar back end which doesn't do any de-dup.

Since the de-dup is host based and only globally unique data leaves the NIC, do I get to count the aggregate de-dup performance of all the hosts being backed up?

Yes, I do!

And someone else saved me the trouble of having to break out the calculator and run the back end numbers.

Good going, Data Domain.

/Golf Clap

You're looking pretty twitchy right now and the fun hasn't even started from a product and marketing standpoint. But of course, were I about to start fighting EMC, IBM and NetApp all at the same time, I'd be twitchy too.

(Beth P. I expect my VendorFights title belt to be in the post by now)

BMR (P2P, P2V and even V2P)

When not stealth launching products, something EMC tends to do more and more frequently, the company makes stealth acquisitions. Indigo Stone, developer of HomeBase, was one of those.

What I'm considering doing during my frequent commutes is spending a couple of paragraphs putting a spotlight on some of the less well known products I work with. HomeBase, though not new to a ton of larger organizations, might be new to you, so it's not a bad place to start.

So what is EMC HomeBase?

HomeBase is a heterogeneous Bare Metal Restore product which uses profile capture instead of image capture. We all know of BMR products where you juggle full system images for different hardware configs. HomeBase does away with that by splitting the installation into logical layers and those layers into unique elements. The three main layers are the Application Layer (where apps and app/user data reside), the Root Layer (where the OS binaries and drivers are located), and the Configuration Layer (where hardware, software and other system config settings are stored).

Getting more granular, the Configuration Layer can be broken down into Functional Elements (security, performance etc.) and Support Elements (storage/network configurations etc.).

This system state information is captured on a scheduled basis, usually before a backup, and packaged into what's called a Profile. This profile is then encrypted and transmitted to the HomeBase server where, upon arrival, the HomeBase Differential Factoring Engine compares this latest profile to previous profiles and can flag any systems which have been modified into a non-compliant state against whatever corporate standards might be in place. So you also have change management and other such functions in there.

Picture. Thousand words.

homebase
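For those who prefer code to pictures, here's a rough sketch of the idea of comparing a latest profile against a previous one and against a corporate standard. It is not how the HomeBase Differential Factoring Engine is implemented. The flat key/value profile layout, the setting names and the functions `diff_profiles` and `non_compliant` are all hypothetical, made up purely for illustration.

```python
def diff_profiles(previous, latest):
    """Return {setting: (old, new)} for every setting added, removed or changed."""
    changes = {}
    for key in previous.keys() | latest.keys():
        old, new = previous.get(key), latest.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def non_compliant(profile, standard):
    """Return the settings in `profile` that differ from the values `standard` requires."""
    return {
        key: profile.get(key)
        for key, required in standard.items()
        if profile.get(key) != required
    }


# Hypothetical profiles captured before two successive backups.
previous = {"os.kernel": "2.6.18", "network.dns": "10.0.0.1", "security.firewall": "on"}
latest = {"os.kernel": "2.6.18", "network.dns": "10.0.0.9", "security.firewall": "off"}
standard = {"security.firewall": "on"}   # hypothetical corporate standard

print(diff_profiles(previous, latest))   # what changed since the last capture
print(non_compliant(latest, standard))   # what breaks the standard
```

The first call is the change management view and the second is the compliance view. Both fall out of the same captured profiles.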

By breaking the configuration information out, you now have the ability to perform BMR to dissimilar hardware without requiring a pre-built image to support that hardware. Blast down a standard image, apply the profile, restore your data and apps, and you're good to go.

But it gets more interesting. Using the profile model you can now go Physical to Virtual if you're consolidating physical servers onto something like ESX, but you can also go Virtual to Physical if for whatever reason you need to move out of a Virtual Machine and onto physical hardware.

Maybe test and dev is in a VM but you plan on running production on physical hardware.

There's more, but my train has just arrived at the station.

May 09, 2008

Steve Jobs gets more of my money

After ten years of being a Mac user (and before that it was all about NeXT) I kicked the habit when their desktop hardware turned out to be junkier than even I had expected. But after a few months on the wagon I had a relapse.

mbp1.jpg

The more things change the more they stay the same.

Profitability

Fortune lists EMC as the 14th most profitable tech company.

What's interesting (besides the fact the hammer has dropped so hard on expenses like travel that a finance bod follows me around the place saying "no" to everything which doesn't involve flapping my arms really hard and pretending I can fly) is that the company is selling more lower priced products and services than ever before. This year alone we'll probably introduce more lower priced offerings than we have in our entire 30-year history.

If we can drive prices down even further I'm convinced we can steadily move towards breaking into the top 10.

May 02, 2008

When Open isn't.

I wrote two entries about Sun Open Storage which I nuked as I didn't want it to appear unseemly that I was kicking a competitor when they were down..

(Well, I'm not sure if I could even class Sun as a competitor anymore. Sorry folks, but it's true. We see Sun sales guys swearing blind that only Sun storage is supported with Sun hardware a couple of times per year, and every now and then someone buys a Thumper box and tells me they have to rack it at the bottom of the frame for fear it'll topple over and kill someone, but that's about it)

..and then Dave Raffo is more savage than both of them combined.

To summarise my thoughts: Open Storage is an admission that Sun can't invest in this business to the same level that their competitors can. It's more about defending Solaris from being eaten alive by Linux and speeding up the company's retreat out of the storage hardware business than it is about "revolutionising" anything.

One wonders if Symantec are now rethinking their use of the term OpenStorage for their similar marketing spin job.

With one you have open source but not open development, while the other isn't open in any shape or form at all.

It's all gone Nirvana

I just had to repost this. Six Apart co-founder and President Mena Trott posts a video on YouTube showing what she would have been doing if YouTube had existed 14 years ago.

Be sure to read the credits and notice that she even put her braces back in to get the lisp right.

May 01, 2008

Fortress EMEA

It didn't leak, since Vance Checketts told Chris Mellor up front, but it did get out sooner than I'd have liked it to.

Yes, Fortress has risen in Ireland and it will be available to customers in EMEA soon. I was sitting on this until testing was complete and it was open for business, but who am I to argue with someone with hair as good as Vance's?

Stay tuned for further details, but there is one thing I will add some colour to..

"The technology in the back-end systems is very scalable and very reliable." He said 3X mirroring is common in large service providers such as Yahoo!, with three different copies of data in three separate data centres so the data is always available, no matter what. "We don't have the same overhead but offer the same or better reliability. Multi-petabyte storage systems are our speciality."

Okay, so now you're asking how EMC could offer better reliability with much less overhead?

Distributed Reed-Solomon encoding + mathematical Secret Sauce

It does a body good.
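To put rough numbers on the overhead claim, here's a small sketch comparing 3X mirroring with a Reed-Solomon style erasure code. The 9 data + 3 parity layout is purely an assumption picked for illustration. The actual Fortress parameters (and the Secret Sauce) aren't given here, and `mirroring` and `reed_solomon` are hypothetical helper names.

```python
from fractions import Fraction


def mirroring(copies):
    """Return (raw-to-usable storage ratio, number of lost copies survivable)."""
    return Fraction(copies), copies - 1


def reed_solomon(data_fragments, parity_fragments):
    """Return (raw-to-usable ratio, lost fragments survivable) for an MDS erasure code."""
    total = data_fragments + parity_fragments
    return Fraction(total, data_fragments), parity_fragments


# 3X mirroring: three full copies of everything.
print(mirroring(3))

# Assumed layout: 9 data fragments + 3 parity fragments spread across nodes/sites.
print(reed_solomon(9, 3))
```

With these assumed parameters the erasure code stores 4/3 of the usable data and survives the loss of any three fragments, while 3X mirroring stores three times the data and survives the loss of two copies.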

Larry Ellison wants to be Iron Man

Proving Oracle can't get on with anyone, they had Marvel send Webby 2.0 website TechCrunch a cease and desist letter to get TechCrunch to cancel their perfectly legal Iron Man screening.

It turns out Oracle were having their own screening in the same cinema at a different time and didn't fancy the competition. We know Larry thinks he's Tony Stark (Though according to Forbes Larry would be richer. Bruce Wayne probably has enough fictional comic billions to bury them both, as guess who ends up with the defence and reconstruction contracts any time some threat stomps all over Gotham City/Metropolis or Earth? That's right, wholly owned subsidiaries of Wayne Enterprises), but the C&D was taking things too far.

It'll be interesting to watch Mike Arrington's monstrously oversized ego seek petty retribution over the coming months, but since this is Oracle we're talking about it'll be like water off a duck's back.

Unless you're named Microsoft or SAP it's hard to get a reaction from the green glass towers of Oz.

Pi Smart Desktop and VDI Vs SaaS.

Steve Todd is mentioning product code names like there's no tomorrow. Not to feel left behind, it's about time I pre-announce something.

Nah! ;)

But now, for your perusal, the other application from P(ersonal) i(nformation) Corporation, and this one is all about adding Intelligence to Information.

Smart Desktop.

project-Center

Again, it's in Beta so you may or may not get an invite, but what Smart Desktop does is analyse how you interact with the information on your computer, then automatically generate metadata for your information and organise it accordingly into a project view. You can then get a breakdown of where you've been spending your time, and it can advise you as to what you might consider working on next.

Give the website a look.   

Moving on somewhat, I received an interesting comment from a colleague asking if VDI threatens SaaS. To be honest, while I can see the overlap in some places, for the most part I don't see it as an "Or", I see it as an "And".

Let me explain..

I was speaking with a customer recently who were considering both, but for completely different reasons. They had hundreds of desktops on their production floor, all of which had to be locked down physically, imaged, maintained, upgraded and then retired when they aged. VDI makes perfect sense in their case, but as VDI is a centralisation play they were considering SaaS for offsite backup copies of all the new data which was now going to land in their data center, instead of scaling an onsite backup solution to deal with it.

VDI, while a new implementation, isn't a new idea. We've had remote desktops, thin clients and dumb terminals going back to whenever IBM first invented the idea (Obligatory "IBM created everything" statement put here to ensure I don't get notes from IBM people telling me such). I use VDI as part of my job nearly every day, as I did with Telnet/SSH/XTerm sessions before that, to connect into pre-configured systems which I don't have to worry about maintaining or patching and which are located thousands of miles away. That doesn't mean I don't need the services SaaS can provide, either right to the fat client I'm using or in the data centre(s) where the VDI resides.

That's my burned-out-from-travel take on that question. Now it's your turn.

VDI: is it a threat to SaaS?