Modern applications built on non-relational databases like Cassandra and MongoDB and Big Data platforms like Cloudera and Hortonworks are here to stay, increasing in popularity and rate of adoption (running in the cloud or on-premises) within enterprises everywhere. In response, traditional backup vendors, including Commvault, are rushing to market with solutions that purport to provide backup and recovery for these next-generation workloads, and we are frequently asked, ‘Why can’t I just use Commvault to back up my modern applications?’ My response is always the same: while backup is not a new problem, modern applications running in hybrid cloud environments are new and require a completely new approach to backup and recovery. Let’s explore why, and then specifically answer the question, ‘Why Not Commvault?’

 

We’re Not In Kansas Anymore
The key to understanding the challenge of backup and recovery for modern applications is to first understand the fundamental differences between the architectures of traditional and modern applications. The table below summarizes the contrast:

Traditional Applications                      Modern Applications
Monolithic, scale-up architecture             Distributed, scale-out architecture
Relational, strongly consistent databases     Non-relational, eventually consistent databases (Cassandra, MongoDB)
Dedicated on-premises servers and storage     Elastic cloud and hybrid cloud infrastructure
Uncompressed data suited to block dedup       Natively compressed data
The reality is that Commvault, with an architecture that is 17 years old, like any traditional backup vendor with a decades-old architecture, is built for the traditional applications in the left column of the table. It is not, however, architected to protect modern applications.


Commvault’s Challenges in Protecting Modern Cloud Applications
With that as a backdrop, let’s look at the specific areas in which Commvault comes up short for protecting modern applications.  


No Solution for Eventually Consistent Databases
Commvault’s method for protecting Cassandra has no application-aware capability to create point-in-time-consistent versions of backed-up Cassandra data. Restores from these backups contain multiple, inconsistent copies of data. The result is that a “repair” must be run after a restore to reconcile the data restored to a Cassandra cluster. Repair can take hours or days and significantly degrades Cassandra performance until it completes. Commvault has no capability for making Cassandra data consistent before restoring it from backups.
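To make that operational cost concrete, here is a minimal sketch of the reconciliation step a file-level restore leaves to the operator. The keyspace names are illustrative, and this is standard Cassandra tooling, not any vendor’s product:

```python
# Post-restore reconciliation: Cassandra's anti-entropy repair must run
# before restored data can be treated as consistent. Keyspace names are
# hypothetical; nodetool is Cassandra's standard administration CLI.
import subprocess

KEYSPACES = ["orders", "customers"]  # illustrative application keyspaces

for keyspace in KEYSPACES:
    # Repair streams and merges divergent replicas across the cluster; on
    # large clusters it can run for hours or days, competing with
    # production traffic for I/O and CPU the whole time.
    subprocess.run(["nodetool", "repair", keyspace], check=True)
```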


A Monolithic Backup Server Architecture for Traditional Applications
Commvault recently announced its HyperScale backup infrastructure solution, acknowledging that its legacy architecture of individual, dedicated backup servers with dedicated storage is “not suited to adapt to new cloud centric infrastructure and operations.” HyperScale simply replaces a scale-up backup infrastructure with a scale-out backup infrastructure; it is still a legacy backup architecture focused on protecting traditional applications. Commvault has said HyperScale is designed for on-premises environments, to address legacy applications and data that cannot be migrated to and run in the cloud.

This is no surprise: as a legacy, infrastructure-centric solution, it is not architected for the cloud. It requires dedicated compute and storage resources to create a backup infrastructure, which is expensive and impractical to deploy in the cloud.


It’s Not Elastic
Commvault’s legacy architecture is also resource-inefficient in the cloud. It does not utilize elastic cloud services, instead requiring expensive, non-elastic, dedicated storage and compute resources that must be provisioned for peak backup load.


It Relies Upon Traditional Deduplication
Additionally, next-generation data sources compress data on disk to reduce local storage footprint, rendering traditional block-level deduplication ineffective at delivering backup storage savings. Commvault deduplication, based on traditional block-fingerprinting methods, cannot achieve significant footprint reduction for retained backups of modern, cloud applications.
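A short, self-contained illustration of why this happens (synthetic data and sizes, not a benchmark of any product): two backup generations that differ by a single record share almost all of their fixed-size chunks in raw form, but almost none once each generation is compressed, because the change perturbs the entire downstream compressed stream.

```python
# Why block fingerprinting fails on compressed sources: a one-record
# change leaves raw 4 KB chunks mostly identical, but the compressed
# streams diverge almost everywhere after the change.
import hashlib
import zlib

def chunk_hashes(data, size=4096):
    """Fingerprint fixed-size chunks the way block dedup engines do."""
    return {hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)}

v1 = b"".join(b"record-%06d\n" % i for i in range(100_000))
v2 = v1.replace(b"record-000500", b"record-999999")  # one changed record

raw_shared = chunk_hashes(v1) & chunk_hashes(v2)
zip_shared = chunk_hashes(zlib.compress(v1)) & chunk_hashes(zlib.compress(v2))

# Raw generations share nearly every chunk; compressed ones share ~none.
print(len(raw_shared), len(zip_shared))
```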


Limited Support for Protecting MongoDB Data
Commvault has no application agent for backing up MongoDB, relying instead on file-system backups of MongoDB “dumps.” mongodump is only recommended for backing up small amounts of data from small, non-sharded clusters; dumps are not usable for backing up and restoring sharded clusters or large clusters with multiple replicas. As a result of this limited MongoDB protection, Commvault cannot provide Any Point in Time (APIT) data restore.

Limited to file-system restores of a dump file back into the source cluster, Commvault’s MongoDB restore is a non-scalable, multi-step procedure. Because dumps cannot be restored to a cluster with a different topology than the original source, there is no restore mobility: Commvault’s MongoDB protection cannot restore data to alternative MongoDB clusters for test, dev, or QA use cases.
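Reduced to its essentials, the dump-based approach looks like the sketch below. The URI and paths are illustrative, and this is the generic mongodump workflow, not Commvault’s actual product; note that mongodump’s --oplog option, which provides a point-in-time image, works only against a single replica set and is rejected when connecting through mongos, which is why sharded clusters cannot be captured consistently this way.

```python
# The dump-based MongoDB "backup" in essence. URI and paths are
# illustrative. --oplog yields a point-in-time image of one replica set
# only; mongodump rejects it when connected through mongos, so sharded
# clusters get no consistent image.
import subprocess

SOURCE_URI = "mongodb://backup-host:27017"  # hypothetical replica-set member
DUMP_DIR = "/backups/mongo/latest"          # hypothetical dump location

# Take the dump (this is what a file-system backup of "dumps" wraps).
subprocess.run(["mongodump", "--uri", SOURCE_URI, "--oplog",
                "--out", DUMP_DIR], check=True)

# Restore replays the dump back into a cluster with the same topology as
# the source; there is no supported path to a differently sharded target.
subprocess.run(["mongorestore", "--uri", SOURCE_URI, "--oplogReplay",
                DUMP_DIR], check=True)
```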

The bottom line is that next-generation workloads need next-generation backup and recovery.   

Effective protection of modern applications and data in a hybrid cloud environment requires a next-generation data protection and data management architecture that:

  • Scales by using the cloud’s inherent elastic resources and economy of services.
  • Eliminates the bottleneck and chokepoint of backup servers by separating the control plane (backup software) from the data plane (purpose-built backup file systems and appliances), eliminating any requirement for media servers.
  • Uses application-aware intelligence and semantic deduplication to deliver significant (70%+) backup footprint reduction for the compressed, eventually consistent data inherent in next-generation applications and cloud-native workloads (see the sketch after this list).
  • Uses application-aware intelligence to reconcile multiple copies of inconsistent data into point-in-time-consistent versions.
  • Is agnostic of underlying infrastructure, enabling protection of data wherever it resides, as well as seamless data mobility for moving workloads across on-premises and cloud boundaries.
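To make “semantic deduplication” concrete, here is a minimal, hypothetical sketch of record-level deduplication, an illustration of the idea rather than Datos IO’s actual implementation: because each record is fingerprinted after it has been parsed and canonicalized, backups deduplicate even when the underlying files are compressed or reordered between generations.

```python
# Hypothetical record-level ("semantic") deduplication sketch.
# Records are canonicalized (sorted keys) before hashing, so formatting,
# ordering, or on-disk compression differences no longer defeat dedup.
import hashlib
import json

stored = {}  # content-addressed record store: fingerprint -> record

def record_key(record):
    canonical = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def ingest(records):
    """Store only records not already present; return the count of new ones."""
    new = 0
    for rec in records:
        key = record_key(rec)
        if key not in stored:  # only genuinely new records consume space
            stored[key] = rec
            new += 1
    return new

gen1 = [{"id": i, "qty": 1} for i in range(1000)]  # first backup generation
gen2 = gen1[:-1] + [{"id": 999, "qty": 2}]         # one changed record
print(ingest(gen1), ingest(gen2))                  # -> 1000 1
```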

 

That’s exactly what Datos IO RecoverX does!  

Our revolutionary application-centric approach delivers dramatic backup storage efficiency, fully automated backup and recovery at a granular level, and the ability to move applications back and forth between public clouds or across cloud environments. Customers who have standardized on on-premises VMware-based environments, hybrid cloud environments, or public clouds can now deploy their modern applications with confidence, knowing their data is protected across all cloud boundaries.

Click here to learn more about RecoverX