Allegedly I have NetApp’s dedupe architecture all wrong when it comes to running dedupe with Snapshots.
But hey, I’m only reading it from NetApp’s own materials.
Let’s read along together!
Snapshot copies lock blocks on disk that cannot be freed until the Snapshot copy expires or is deleted. On any volume, once a Snapshot copy of data is made, any subsequent changes to that data temporarily require additional disk space, until the Snapshot copy is deleted or expires. The same is true with deduplication-enabled volumes.
Volumes using deduplication do not see the savings initially if the blocks are locked by Snapshot copies.
This is because the blocks are not freed until the lock is removed.
Some best practices to achieve the best space savings from deduplication-enabled volumes that contain Snapshot copies include:
• Run deduplication before creating new Snapshot copies.
• Limit the number of Snapshot copies you maintain.
• If possible, reduce the retention duration of Snapshot copies.
• Schedule deduplication only after significant new data has been written to the volume.
• Configure appropriate reserve space for the Snapshot copies.
Publication date for this? Q1 of this year.
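To make the mechanism in that passage concrete, here is a toy model in Python. It is a sketch under my own assumptions, not ONTAP's actual implementation: the `ToyVolume` class, its methods and the four-block workload are all hypothetical and exist only for illustration. It treats a physical block as "used" while either the active file system or any Snapshot copy still points at it, and compares deduplicating before versus after taking a Snapshot.

```python
class ToyVolume:
    """Toy model (hypothetical): a physical block stays allocated while the
    active file system or any snapshot still references it."""

    def __init__(self):
        self.next_id = 0
        self.content = {}    # physical block id -> block content
        self.active = []     # physical block ids referenced by the active file system
        self.snapshots = {}  # snapshot name -> tuple of physical block ids

    def write(self, data):
        # Every write allocates a fresh physical block (no inline dedupe).
        pid = self.next_id
        self.next_id += 1
        self.content[pid] = data
        self.active.append(pid)

    def snapshot(self, name):
        # A snapshot pins whatever physical blocks the active file system uses now.
        self.snapshots[name] = tuple(self.active)

    def delete_snapshot(self, name):
        del self.snapshots[name]

    def dedupe(self):
        # Post-process dedupe: repoint active blocks with identical content
        # at the first physical block holding that content.
        first = {}
        self.active = [first.setdefault(self.content[pid], pid) for pid in self.active]

    def used_blocks(self):
        # Count physical blocks still referenced by anything.
        live = set(self.active)
        for pinned in self.snapshots.values():
            live.update(pinned)
        return len(live)


def demo(dedupe_first):
    """Return (blocks used while the snapshot exists, blocks used after deleting it)."""
    vol = ToyVolume()
    for data in ["A", "A", "A", "B"]:  # three duplicate blocks plus one unique block
        vol.write(data)
    if dedupe_first:
        vol.dedupe()
        vol.snapshot("nightly")
    else:
        vol.snapshot("nightly")
        vol.dedupe()
    held = vol.used_blocks()
    vol.delete_snapshot("nightly")
    return held, vol.used_blocks()


print(demo(dedupe_first=True))   # (2, 2)
print(demo(dedupe_first=False))  # (4, 2)
```

In this toy, deduplicating first means the Snapshot pins only the two deduplicated blocks. Taking the Snapshot first pins all four original blocks, so the savings show up only once the Snapshot is deleted, which is exactly the behaviour the quoted best practices are working around.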
And then, weeks later, the NetApp PR wagon rides in calling Snapshots “backups”. So if you’re looking for a solution where deduplication requires you to limit the number of Snapshot “backups” you keep and, if possible, reduce how long you keep them, then SnapProtect is the product for you.
Of course, then we have the whole Snapshot replication thing going on. Since the Snapshots are intertwined with the primary data, if you delete them on the source they go away on the replica too. But think of it this way: your storage administrator has just reclaimed 2x the storage by deleting all that worthless stuff the backup administrator said she or he just had to hold on to.
But the Backup Admin doesn’t want that, so the answer you’ll get here is SnapVault! Except SnapVault requires that all the data be copied to the destination at native size and then deduplicated all over again.
There are better ways.
They involve heterogeneous backup applications and real protection storage with deduplication, compression and bandwidth-optimised replication at all times.
Snapshots aren’t backups. They never were. And the more they try to retrofit modern functionality onto them, the bigger the mess becomes for the Backup Admin.
