In the movie “Other People’s Money”, Danny DeVito’s character declares:

“You know, at one time there must’ve been dozens of companies makin’ buggy whips. And I’ll bet the last company around was the one that made the best damn buggy whip you ever saw.”

Once the de facto standard for backup storage, purpose-built backup appliances (PBBAs) are on their way to joining other legacy, media-server-based backup architectures as buggy whips in the new, cloud-centric world of modern data. According to IDC's Worldwide Quarterly Purpose-Built Backup Tracker, the PBBA market posted a 16.2% year-over-year decline in Q2 2017, affecting all vendors but most notably the market share leader, Dell EMC Data Domain (Data Domain). Why the decline? Because what originally made Data Domain so compelling is exactly what now makes it an increasingly less attractive solution.

Purpose-built backup appliances were originally created as standalone hardware disk servers to replace backing up to tape, initially emulating tape libraries as "virtual" tape libraries. Data Domain, arguably the pioneering PBBA vendor, even used the phrase "tape sucks" in its branding and marketing. To make disk storage cost-effective compared to tape, PBBAs combined data compression with a then-new technology called in-line, block-based deduplication, whereby data was deduplicated as it was received and then compressed in real time before being stored on disk. Specifically, in-line block deduplication works by "segmenting" incoming data streams and comparing each segment to previously stored data. If a segment is unique, it is stored on disk; if an incoming segment duplicates one already stored, only a reference to it is created and the segment is not stored again.
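The mechanism can be illustrated with a toy Python sketch. The fixed-size segments and SHA-256 fingerprints here are illustrative assumptions; production appliances such as Data Domain use variable-length segmentation and compress unique segments before writing them to disk:

```python
import hashlib

def inline_dedup(stream, segment_size=4096):
    """Toy in-line block deduplication: fingerprint each fixed-size
    segment as it arrives, store only unique segments, and keep an
    ordered list of references that describes the full stream."""
    store = {}   # fingerprint -> segment bytes (the deduplicated pool)
    index = []   # ordered fingerprints (references that reconstruct the stream)
    for offset in range(0, len(stream), segment_size):
        segment = stream[offset:offset + segment_size]
        fp = hashlib.sha256(segment).hexdigest()
        if fp not in store:          # unique segment: write it once
            store[fp] = segment
        index.append(fp)             # duplicate or not: record a reference
    return store, index

# Backing up two copies of the same data stores each unique segment once:
# six logical segments are referenced, but only two are physically stored.
data = b"A" * 8192 + b"B" * 4096
store, index = inline_dedup(data + data)
```

Restoring a backup is then just a matter of walking the index and concatenating the referenced segments.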

Data Domain excelled at delivering value for several reasons, including:

  • It was delivered as an integrated, scale-up hardware appliance that was simple to deploy in a data center.  When customers needed more performance and capacity they could purchase a bigger, more powerful box.
  • It supported a broad ecosystem of data sources including SQL databases (e.g. Oracle, DB2), email/messaging servers, and unstructured file systems, all of which were well-suited to Data Domain’s block-based deduplication.
  • It reduced backup storage requirements with block-based deduplication.

Data Domain was a great product for its time. However, the times have changed in the 13 years since the introduction of the first PBBA, and the very tenets that made Data Domain successful are now its greatest weaknesses. The cause lies in the new, modern cloud-native IT stack, which is fundamentally different from the traditional stack in numerous ways, including:

  • Monolithic, on-premises applications are being replaced with microservices-based applications, hosted across hybrid clouds (public and private) with workloads potentially distributed across multiple locations and geographies.  While Data Domain was ideal for consolidating backups within a data center, its monolithic, scale-up hardware architecture is not well suited for protecting highly distributed workloads.
  • Traditional scale-up compute and storage architectures are transitioning to elastic compute (e.g. EC2) and storage (e.g. S3) resource architectures.  The ready availability of elastic compute and storage resources removes the need for a monolithic hardware appliance (PBBA) in hybrid cloud environments.
  • New cloud-native applications (hosted on-premises or in the cloud) are increasingly built upon modern, distributed NoSQL databases such as Apache Cassandra, DataStax Enterprise (DSE), and MongoDB.  In most cases data is compressed to reduce storage required on the database nodes, and it is often encrypted for security and privacy reasons.  Because compression and encryption change the block structure of data, traditional block-based deduplication does not achieve meaningful reduction of compressed and/or encrypted data.  Additionally, modern databases distribute data across multiple nodes for resilience and performance.  The resulting data copies are not identical at the byte level, because the ordering and flushing of updates differs across nodes, further negating the potential benefits of block-based deduplication.
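The compression effect described in the last bullet is easy to demonstrate. The Python sketch below (the 4 KiB segment size, the synthetic payload, and zlib as the compressor are all illustrative assumptions) fingerprints fixed-size segments the way a block-based deduplicator would: two raw copies that differ by a single byte share nearly all of their segments, while independently compressed copies of the same data share almost none, because the one-byte change ripples through the entire compressed stream:

```python
import hashlib
import random
import zlib

def segment_fingerprints(data, size=4096):
    """Hash fixed-size segments, as a block-based deduplicator would."""
    return {hashlib.sha256(data[i:i + size]).digest()
            for i in range(0, len(data), size)}

# Two near-identical copies of a structured, compressible payload:
# node_b differs from node_a in exactly one byte.
random.seed(7)
node_a = b"".join(b"user-%06d,balance=%08d\n" % (i, random.randrange(10**8))
                  for i in range(8192))
node_b = bytearray(node_a)
node_b[100] ^= 1
node_b = bytes(node_b)

# Raw copies: only the segment containing the changed byte differs,
# so block-based dedup stores almost nothing twice.
raw_shared = len(segment_fingerprints(node_a) & segment_fingerprints(node_b))
raw_total = len(segment_fingerprints(node_a))

# Independently compressed copies: the compressed streams diverge after
# the change, the segments no longer line up, and dedup finds almost
# nothing in common. Encryption has the same effect, by design.
comp_a, comp_b = zlib.compress(node_a), zlib.compress(node_b)
comp_shared = len(segment_fingerprints(comp_a) & segment_fingerprints(comp_b))
comp_total = len(segment_fingerprints(comp_a))
```

In this sketch the raw copies dedupe in all but one segment, while the shared fraction of the compressed copies collapses toward zero, which is why PBBA-style deduplication yields so little on compressed or encrypted database files.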

The bottom line is that hybrid cloud and new, modern applications are increasingly relegating PBBAs, led by Data Domain, to becoming buggy whip solutions for protecting legacy, on-premises workloads.

To address the requirements of hybrid cloud and modern application backup and recovery, we built Datos IO RecoverX upon the core tenets of application centricity; elastic cloud services for databases, compute, and storage; and highest-order backup storage efficiency. RecoverX delivers rich functionality, including application-consistent, any-point-in-time (APIT) backup; support for cluster topology changes; incremental and queryable restore for sub-table-level recovery; industry-first semantic deduplication for unmatched backup storage efficiency; and support for geo-distributed clusters. It is the only solution that can protect modern applications running in hybrid cloud environments while reducing cost and eliminating backup complexity and risk, regardless of where customers' data resides.

Click here to learn more about RecoverX