The inexorable increase in data growth will drive the ascension of “client-free” driven backup flows. The “backup client” approach is reaching its apex with source-side deduplicated backups (e.g., via Avamar or Data Domain Boost) that are stored as deduped full backups on disk. What happens on a heavily loaded ESX server that lacks the CPU cycles and I/O bandwidth to scan for changed blocks? When can you pummel the mission-critical Oracle database to identify the changed data? Will the client scan for changed files on the NAS server ever complete?
The answer is simple: The backup application must depend on the data owner to tell it what data needs protection. After all, who better than the VMware, Oracle, or the NAS server to efficiently identify the data that has changed since the last backup?
To scale with the environment, backup applications and primary data-owning applications/systems must collaborate: data owners need to efficiently identify changed data and backup applications need to turn that changed data into first-class backups. The partnership between primary data owners and backup applications has already begun.
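To make that division of labour concrete, here is a minimal sketch in Python. It assumes a hypothetical `DataOwner` interface; the names `changed_blocks`, `current_generation`, `TrackedVolume` and `BackupApp` are invented for illustration and are not the API of VMware, Oracle, Avamar, Data Domain Boost or any NAS product. The data owner tracks its own writes and reports only what changed, and the backup application folds that delta into its previous full copy to produce a new full backup without scanning the source.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class DataOwner(Protocol):
    """Hypothetical interface a primary system (hypervisor, database, NAS) might expose."""

    def current_generation(self) -> int: ...

    def changed_blocks(self, since_generation: int) -> Dict[int, bytes]: ...


@dataclass
class TrackedVolume:
    """Toy data owner that remembers the generation in which each block was last written."""

    blocks: Dict[int, bytes] = field(default_factory=dict)
    written_at: Dict[int, int] = field(default_factory=dict)
    generation: int = 0

    def write(self, index: int, data: bytes) -> None:
        self.generation += 1
        self.blocks[index] = data
        self.written_at[index] = self.generation

    def current_generation(self) -> int:
        return self.generation

    def changed_blocks(self, since_generation: int) -> Dict[int, bytes]:
        # The owner answers from its own bookkeeping; no scan of the data is needed.
        return {
            index: self.blocks[index]
            for index, gen in self.written_at.items()
            if gen > since_generation
        }


@dataclass
class BackupApp:
    """Toy backup application that keeps the latest full copy and the generation it reflects."""

    last_generation: int = 0
    latest_full: Dict[int, bytes] = field(default_factory=dict)

    def backup(self, owner: DataOwner) -> int:
        generation = owner.current_generation()
        delta = owner.changed_blocks(self.last_generation)
        # Merge the delta into the previous full copy to get a new full backup.
        self.latest_full = {**self.latest_full, **delta}
        self.last_generation = generation
        return len(delta)  # number of blocks actually transferred


volume = TrackedVolume()
volume.write(0, b"a")
volume.write(1, b"b")

app = BackupApp()
first = app.backup(volume)   # both blocks are new to the backup application
volume.write(1, b"B")
second = app.backup(volume)  # only the rewritten block is transferred
```

The sketch ignores deleted blocks, concurrent writes during a backup, and deduplication. Its only point is that the cost of each backup follows the amount of change the owner reports rather than the size of the data set.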
In the Backup space, 75% of development effort goes into Application integration, because more often than not the Application has little awareness of when would be a good time to back something up or what exactly should be backed up.
But new developments bring new opportunities. I'd like to see development get off the hamster wheel of Application interpretation and focus on where we can go next. As Stephen says in the post linked above, it looks like we're getting there.
