Hosting traditional or next-generation applications built on scale-out NoSQL databases and big-data file systems in public and hybrid cloud environments requires data protection that scales effectively and economically. Yet many organizations consider protecting modern, next-generation data sources in the cloud with legacy architectures. In this week’s Tech Talk we’ll explore the inherent shortcomings of traditional media-server-based architectures for protecting data in modern multi-cloud environments.

Traditional backup architectures are single-vendor, end-to-end designs that centralize both control of the backup process (the “control plane”) and the movement and storage of backed-up data (the “data plane”). In this architecture, one or more “media servers” act as both the control plane and the data plane, as shown below:

The downside of this architecture is that the media server is an inevitable and expensive bottleneck. Getting more performance to process and manage more data means adding more media servers. Providing more capacity as the backed-up data grows means adding more storage to each media server, up to the maximum capacity it supports. The inescapable result is that the more you scale, the more cost and complexity you introduce.
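To make the scaling argument concrete, here is a rough sizing sketch. Every number and name in it (per-server throughput, per-server capacity, the backup window) is a hypothetical assumption chosen for illustration, not a figure from any vendor.

```python
import math


def media_servers_needed(daily_backup_tb, backup_window_hours,
                         per_server_tb_per_hour, protected_tb,
                         per_server_capacity_tb):
    """Estimate how many media servers satisfy BOTH throughput and capacity.

    All parameters are hypothetical sizing inputs for illustration only.
    """
    # Throughput the backup window demands, in TB per hour.
    required_tb_per_hour = daily_backup_tb / backup_window_hours
    # Servers needed to move the data within the window.
    for_throughput = math.ceil(required_tb_per_hour / per_server_tb_per_hour)
    # Servers needed to hold the data, given a per-server storage ceiling.
    for_capacity = math.ceil(protected_tb / per_server_capacity_tb)
    # The tighter of the two constraints decides the server count.
    return max(for_throughput, for_capacity, 1)


if __name__ == "__main__":
    today = media_servers_needed(40, 8, 2, 500, 200)
    doubled = media_servers_needed(80, 8, 2, 1000, 200)
    print("servers today:", today)
    print("servers after data doubles:", doubled)
```

Under these assumed inputs, whichever constraint is tighter (throughput or capacity) sets the server count, so growth in either dimension forces a hardware purchase.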


In an attempt to reduce complexity and increase scalability, some backup vendors introduced a new class of storage, the Purpose Built Backup Appliance (PBBA). PBBAs separated the storage used for backups from the media server host and introduced on-disk block/segment-level deduplication.

While PBBAs delivered better backup storage performance and efficiency, they actually increased cost and complexity. Organizations had to deal with an architecture made up of two sets of silos: media servers continue to be silos of control and backup I/O performance, while PBBAs create multiple backup storage silos, each with its own performance and capacity limitations. Not only do you have to worry about media server sprawl, you also need to worry about PBBA sprawl. To address this challenge, some backup vendors introduced integrated solutions that consolidate media server and PBBA functionality into a single appliance. The result, however, is still increased cost, complexity, and server sprawl. In addition, modern data sources keep data in compressed format, which defeats capacity reduction through block/segment deduplication.
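The point about compressed data can be demonstrated with a small experiment. The sketch below is an illustration under stated assumptions, not a model of any particular appliance: it uses fixed-size 4 KB blocks, SHA-256 block fingerprints, zlib as the compressor, and synthetic records. It builds two versions of a dataset that differ in a single record near the start, then measures how many blocks of the second version already exist in the first, before and after compression.

```python
import hashlib
import random
import zlib

CHUNK = 4096  # fixed block size; an arbitrary choice for this illustration


def chunk_hashes(data: bytes) -> set:
    """Fingerprint every fixed-size block of the data."""
    return {hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)}


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of the new version's blocks that already exist in the old one."""
    new_hashes = chunk_hashes(new)
    return len(chunk_hashes(old) & new_hashes) / len(new_hashes)


def compare_dedup():
    """Return (raw_shared, compressed_shared) for two near-identical datasets."""
    rng = random.Random(0)
    # Synthetic fixed-width records, 21 bytes each.
    rows = [f"user{i:06d},{rng.randint(0, 10**9 - 1):09d}\n".encode()
            for i in range(50000)]
    v1 = b"".join(rows)
    # Change one record near the start, keeping the same width so that
    # raw block boundaries do not move.
    rows[10] = b"USER000010,999999999\n"
    v2 = b"".join(rows)
    raw = shared_fraction(v1, v2)
    # In the compressed stream, everything downstream of the change is
    # encoded differently, so the blocks no longer line up.
    compressed = shared_fraction(zlib.compress(v1), zlib.compress(v2))
    return raw, compressed


if __name__ == "__main__":
    raw, compressed = compare_dedup()
    print(f"blocks shared, uncompressed: {raw:.1%}")
    print(f"blocks shared, compressed:   {compressed:.1%}")
```

Because the edit sits near the start of the stream, nearly every compressed block after it changes, even though the logical data is almost identical. An edit later in the stream would leave the compressed blocks before it intact, so the effect scales with how much data follows the change.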

Nevertheless, backup infrastructure sprawl has been accepted as a necessary evil in on-premises data centers, and the majority of traditional and new backup vendors, including Veritas, CommVault, Dell EMC, Rubrik, and Cohesity, rely on media server architectures.

Ironically, in the world of public and hybrid clouds, the need for the media server is essentially eliminated. As a result, solutions based on media server architectures are antiquated legacies. Instead, the future of data protection focuses on the control plane to automate and orchestrate protection while leveraging the compute and storage resources inherent in the cloud. The result is faster backup and restore, more efficient backup storage, increased agility to scale, and dramatically reduced cost and complexity.

Not surprisingly, this is exactly the architecture Datos IO has championed.  RecoverX is elastic, scale-out software purpose-built for multi-cloud, hyperscale environments.  The patented Datos IO architecture focuses 100% on the control plane:

  • RecoverX itself is elastic, scalable software
  • RecoverX does not use media servers and instead lets you use the backup storage of your choice.
  • During backup, data is streamed directly from source data nodes to cost-effective, elastic cloud secondary storage (e.g. S3 in AWS) and is stored in native format.
  • Using patented Semantic Deduplication, RecoverX deduplicates the compressed data, without interrupting the backup streamed from the nodes.
  • Completed backups use a fraction of the space of the total data backed up.  
  • After the first backup, subsequent backups stream only new, incremental data to the secondary storage.

The result is application-consistent backups, a greater than 70% reduction in backup storage requirements, and flexible recovery point objectives (RPO), all at dramatically lower cost relative to media-server-based solutions.
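The general flow described in the list above (nodes streaming directly to object storage, with only new data written on later runs) can be sketched as follows. This is a toy illustration and not RecoverX's implementation or its patented Semantic Deduplication: the `ObjectStore` class, the `backup` function, and the assumption that each record carries a key and a version are all hypothetical, and an in-memory dictionary stands in for a service such as S3.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int, bytes]  # (key, version, value); a hypothetical layout


class ObjectStore:
    """In-memory stand-in for elastic secondary storage such as an S3 bucket."""

    def __init__(self) -> None:
        self.objects: Dict[Tuple[str, int], bytes] = {}

    def put(self, key: str, version: int, value: bytes) -> None:
        self.objects[(key, version)] = value


def backup(nodes: List[Iterable[Record]], store: ObjectStore) -> int:
    """Stream records from every node straight to the store, with no
    intermediate media server. Returns the number of records written."""
    written = 0
    for node in nodes:
        for key, version, value in node:
            if (key, version) in store.objects:
                # Already stored: either a replica held by another node or
                # a record that has not changed since the previous backup.
                continue
            store.put(key, version, value)
            written += 1
    return written


if __name__ == "__main__":
    store = ObjectStore()
    # Three nodes, each record replicated on two of them.
    node_a = [("k1", 1, b"a"), ("k2", 1, b"b")]
    node_b = [("k2", 1, b"b"), ("k3", 1, b"c")]
    node_c = [("k3", 1, b"c"), ("k1", 1, b"a")]
    first = backup([node_a, node_b, node_c], store)

    # One record is updated before the second backup.
    node_a[1] = ("k2", 2, b"b2")
    node_b[0] = ("k2", 2, b"b2")
    second = backup([node_a, node_b, node_c], store)
    print("records written:", first, "then", second)
```

In this toy model the first run reads six records but stores each logical record once, and the second run writes only the updated record. Working at the record level rather than on compressed blocks is what makes the replicas recognizable as duplicates here.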

The cloud and next-generation data sources change everything, including your backup strategy. Media-server-based solutions, whether from ‘traditional’ or ‘new’ backup vendors, are not architected for the cloud. Make sure you don’t get stuck with a legacy architecture, using yesterday’s solutions to solve today’s challenges.