Backup, business continuity (BC), and disaster recovery (DR) have been a critical part of IT for over 30 years, ever since we began relying on technology to run our businesses. Traditional solutions were designed for a world of on-premise infrastructure, structured applications, and relational databases. But the world is changing. In the last blog, I talked about the era of digital transformation and how it is driving us to rethink and reinvent the fundamental backup and recovery architecture for workloads moving to the cloud and applications born in the cloud.
What Changed? The Motivations for Reinventing Backup and Recovery
Application and data platforms are undergoing the biggest transformation since the dawn of computing. There are several forces at work:
- New applications. Third-generation applications are geo-distributed, scale across multiple systems, are always on, and are typically deployed in a cloud-first model.
- Existing applications are moving to the cloud. They aren’t going away, but companies are moving some or all of them to the cloud. They still need backup and recovery.
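- RPO and RTO windows are shrinking. Recovery point objectives (RPO) and recovery time objectives (RTO) keep getting tighter: enterprises want “always-on” applications, and the days when nightly backups were enough are gone.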
- Smaller companies will go all-in on public cloud. SMBs don’t want to be in the business of IT. They’ve been driving the rapid growth of cloud applications and platforms.
- Enterprises will build hybrid clouds. Enterprises will deploy applications and data across on-premise and public cloud environments. Scale, compliance and other factors mean they’ll need to keep some systems on-premise.
- Everyone will use multiple clouds. No one will bet their business on one cloud or one provider. Even now, enterprises are splitting workloads across clouds or cloud and on-premise. Development and test may use one cloud, while the same application might be deployed in a private cloud or a different public cloud.
Cloud’s Impact on Backup, Recovery and Continuity
Cloud gives organizations much more agility, operational savings, and a pay-as-you-go model. Public cloud providers can also build much more resilient infrastructure. Amazon guarantees 99.95% availability for EC2 and 99.99% for S3; S3 is designed for eleven nines (99.999999999%) of data durability, with data stored across multiple availability zones. Because the cloud is so reliable and cheap, it’s quickly becoming a backup target for on-premise data. But that shouldn’t trick us into believing backup and recovery are “built-in” when we run applications in the cloud. Even Amazon itself recommends backup services for all AWS-native applications and cloud databases.
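To put the availability figures above in perspective, here is a rough back-of-the-envelope calculation of how much downtime each percentage still allows. It assumes availability is measured over a 365-day year, which is a simplification (actual SLAs define their own measurement periods), and the function name is just illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def max_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service can be down and still meet the availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Availability percentages quoted in the text above.
for name, pct in [("EC2", 99.95), ("S3", 99.99)]:
    hours = max_downtime_hours(pct)
    print(f"{name}: {pct}% availability allows about {hours:.2f} hours "
          f"({hours * 60:.0f} minutes) of downtime per year")
```

Under these assumptions, 99.95% still leaves room for a little over four hours of downtime a year, and 99.99% for a little under an hour. Neither number says anything about recovering data that was deleted or corrupted by mistake.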
While service availability and data resiliency address infrastructure-level business continuity and disaster recovery, they don’t provide point-in-time recovery or application-level intelligence for backup and recovery. As good as the cloud platforms are, they don’t protect against logical errors. And research shows 8 out of 10 errors are logical errors, such as data corruption and user errors.
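The difference between resiliency and point-in-time recovery can be shown with a toy model. The sketch below is purely illustrative: `ReplicatedStore` and `VersionedBackup` are hypothetical classes, not any cloud provider's or vendor's API. It shows that replication faithfully copies a logical error to every replica, while a versioned backup lets you roll back to the state before the error.

```python
import copy


class ReplicatedStore:
    """Toy store that applies every write to all replicas (hypothetical)."""

    def __init__(self, replicas: int = 3):
        self.replicas = [{} for _ in range(replicas)]

    def write(self, key, value):
        # Replication protects against losing a replica, not against bad writes.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        return self.replicas[0].get(key)


class VersionedBackup:
    """Toy point-in-time backup: keeps timestamped snapshots (hypothetical)."""

    def __init__(self):
        self.versions = []  # (timestamp, snapshot), appended in time order

    def take(self, timestamp, store):
        self.versions.append((timestamp, copy.deepcopy(store.replicas[0])))

    def restore(self, store, as_of):
        # Pick the newest snapshot taken at or before the requested time.
        candidates = [snap for ts, snap in self.versions if ts <= as_of]
        if not candidates:
            raise LookupError(f"no backup at or before t={as_of}")
        store.replicas = [copy.deepcopy(candidates[-1]) for _ in store.replicas]


store = ReplicatedStore()
backup = VersionedBackup()

store.write("balance", 100)
backup.take(timestamp=1, store=store)

store.write("balance", -1)  # logical error, e.g. a buggy job or user mistake
corrupted_everywhere = all(r["balance"] == -1 for r in store.replicas)

backup.restore(store, as_of=1)  # roll back to the state before the error
print(corrupted_everywhere, store.read("balance"))
```

All three replicas hold the bad value after the erroneous write, so the infrastructure is perfectly healthy and the data is still wrong. Only the versioned copy brings the correct value back.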
But Won’t Existing Backup Products Work in the Cloud?
As we noted above, traditional backup and recovery products don’t meet the needs of cloud applications, or even of existing applications that have moved to the cloud, and not just because they were built in a different era. Cloud and distributed architectures present several other challenges:
- Cloud breaks the media-server-based architecture of traditional solutions. Cloud applications and data don’t reside on a specific array or disk, and you can’t easily back up what you can’t see. Traditional backups also don’t capture configuration data in the cloud, like AWS CloudFormation templates.
- Cloud doesn’t speak the same language. Legacy solutions talk to tape, disk or virtual disks. Backup and recovery in the cloud means integrating with the right interfaces, such as the S3 API or the Google Cloud Storage API.
- Backup appliances can’t be moved to the cloud. Existing backup appliances such as EMC Data Domain or NetBackup that work extremely well on-premise can’t be picked up and moved to the cloud.
- Traditional backup agents won’t scale. Even if you could get a backup agent running in the cloud, it wouldn’t scale well across dozens or possibly hundreds of nodes.
- A VM is not the right layer of abstraction. Traditional de-duplication techniques treat data as an opaque object, such as a VM or a LUN. This is why the core principle of the Datos IO CODR architecture is a scalable, application-centric view of data management and data protection, which distinguishes it from conventional approaches. The CODR architecture introspects application data and uses global semantic de-duplication to achieve storage efficiencies. The benefit of this approach is fine-grained and highly space-efficient data protection that can span clouds over network links.
- Cloud gateways and migration services are unidirectional only. They move data in one direction, so they aren’t a substitute for backup and recovery.
Datos IO Reinvents Data Protection In the Cloud with CODR Architecture
The problem of backup and recovery of cloud applications is novel and Datos IO is charting the path here. There are three critical things a cloud backup and recovery architecture should have:
- Elastic compute only. The architecture should scale efficiently on elastic compute instances. There shouldn’t be any CapEx costs for servers or appliances.
- No media servers. Backing up large scale-out databases requires a direct, parallel streaming architecture for data movement between the database and secondary storage. Legacy backup architectures rely on media servers that would quickly become choke points. Direct parallel streaming also allows the data to remain available in native formats.
- Semantic deduplication. Scale-out application databases typically have a 3x replication factor. If you back up individual nodes, or even manage to snapshot the whole database, two-thirds of the backup data is redundant. Over time, backups will be 75 to 80 percent inefficient without semantic deduplication that works across a distributed architecture.
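The last two points can be sketched together in a few lines. This is a simplified illustration under stated assumptions, not Datos IO's actual algorithm: it assumes every node holds a full copy of the data (a 3-node cluster with a replication factor of 3), uses a plain content hash of each row as a stand-in for semantic deduplication, and uses an in-memory dictionary in place of secondary storage. All function names are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def row_fingerprint(key, value) -> str:
    """Content hash of a logical row, independent of which node stored it."""
    return hashlib.sha256(repr((key, value)).encode()).hexdigest()


def stream_node(node_rows: dict) -> list:
    """Read one node's rows directly, with no media server in between."""
    return [(row_fingerprint(k, v), (k, v)) for k, v in node_rows.items()]


def backup_cluster(nodes: list) -> tuple:
    """Stream all nodes in parallel and keep each logical row only once."""
    store = {}      # stand-in for secondary storage, keyed by fingerprint
    rows_seen = 0
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        for batch in pool.map(stream_node, nodes):
            for fingerprint, row in batch:
                rows_seen += 1
                store.setdefault(fingerprint, row)  # skip rows already stored
    return store, rows_seen


# Assumption: 3 nodes, replication factor 3, so each node holds every row.
rows = {f"user:{i}": f"name-{i}" for i in range(1000)}
nodes = [dict(rows) for _ in range(3)]

store, rows_seen = backup_cluster(nodes)
redundant = 1 - len(store) / rows_seen
print(f"rows read: {rows_seen}, rows stored: {len(store)}, "
      f"redundant: {redundant:.0%}")
```

Reading 3,000 rows and storing 1,000 is the two-thirds redundancy described above. In a real cluster each node holds only a subset of the data and replicas may briefly disagree, so a production system has to do considerably more than hash rows, but the storage arithmetic is the same.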
Datos IO: Data Management for the Cloud
Datos IO RecoverX is founded upon the Consistent Orchestrated Distributed Recovery (CODR®) architecture, a next-generation scale-out data protection architecture that no longer depends on media servers and transfers data in parallel to and from file-based and object-based secondary storage. CODR allows RecoverX to provide scalable versioning that enables enterprises to protect and back up their data at any interval and granularity, one-click recovery in minutes (not hours) for both operational recovery and test/dev use cases, and an industry-first semantic de-duplication capability that allows customers to save up to 70 percent on secondary storage costs.
The cloud doesn’t eliminate the need for BCDR. It changes how we need to think about architecting it. We need the same level of enterprise backup, recovery and continuity for applications and data that we run in the cloud. And that’s what we’ve built at Datos IO. In the follow-on blog, I will talk about why traditional backup and recovery tools won’t and can’t make the leap to cloud.