Rapid proliferation of social, mobile, cloud, and Internet-of-Things technologies is driving enterprises to deploy hyper-scale, distributed, data-centric applications. Applications and use cases such as customer analytics, e-commerce, security, surveillance, and business intelligence are driving increasing requirements for scale of data and speed of transaction processing. Enterprises are rapidly adopting massively scalable, non-relational databases, such as Apache Cassandra and MongoDB, to service the data requirements of these high-volume, high-ingestion-rate, real-time applications. Simultaneously, they are using VMware-based virtual infrastructures to increase their agility in deploying scalable, non-relational databases. Compared with installing and configuring database nodes on individual bare-metal machines, database nodes can be installed, configured, and deployed rapidly in a VMware environment. Installation and configuration can even be orchestrated using VMware orchestration tools, resulting in “turn on a dime” deployment and expansion agility in private and hybrid clouds.
Because they are used for business-critical applications, these databases require the robust data protection capabilities that are standard for enterprise data but not built into new-generation scale-out database environments. These include application-consistent backup, near-zero RPO and RTO, granular, repair-free recovery, failure handling, backup storage efficiency, and software-only deployment for cloud-first environments.
Because of the hyper-scale, distributed nature of these databases, traditional backup and recovery products don’t support these requirements, leaving a critical data protection gap. Likewise, VMware data protection methods don’t deliver the tools to meet these protection requirements for scale-out databases.
The combination of VMware, scale-out databases, and Datos IO RecoverX delivers an agile environment for deploying resilient, elastic scale-out database transaction processing with enterprise-caliber, consistent data protection.
VMware is the industry-leading private cloud solution for virtualizing computing, from the data center to the cloud, enabling enterprises to be more agile, responsive, and profitable. The Datos IO RecoverX solution is certified with VMware Ready for Application Software™ status and is supported on VMware vSphere® 6 for production environments.
Apache Cassandra and MongoDB are modern scale-out databases that deliver high rates of transaction processing with robust resilience and easy scalability.
RecoverX is an industry-first scale-out, software-only data protection solution that is purpose-built for next-generation applications deployed on scale-out databases such as Apache Cassandra, DataStax Enterprise, and MongoDB.
Datos IO RecoverX is built on the foundation of Consistent Orchestrated Distributed Recovery (CODR), Datos IO’s cloud-first, scale-out data management architecture that enables customers to harness the cloud for both next-generation data protection and management. With 16 patents approved or pending, CODR uses elastic compute services that can be auto-scaled with load, and it removes the dependency on silos of media servers found in legacy data protection solutions. CODR also transfers data in parallel to and from file-based and object-based secondary storage for multiple workloads, including data protection and testing and development. This powerful architecture delivers the following:
By using native application intelligence, RecoverX creates an application-consistent point-in-time backup of Apache Cassandra or MongoDB databases at user-specified intervals, a concept that we call cluster-consistent versioning. A cluster-consistent version contains all the records that have achieved user-specified consistency. As a result, no repairs are needed when a version is restored back to the cluster, significantly reducing RTO. The backup operation is also highly parallel in nature; RecoverX acts only as a control plane that orchestrates data movement from the data source cluster to version (secondary) storage. This approach allows RecoverX to handle large scale-out database and application workloads.
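To make the idea of a cluster-consistent version more concrete, here is a minimal sketch. It is not RecoverX's actual implementation. It assumes each replica reports the (key, write timestamp) pairs it holds, and that "user-specified consistency" means a write is present on at least a required number of replicas (a quorum in this example). The names `quorum` and `consistent_version` are hypothetical.

```python
from collections import Counter


def quorum(replication_factor: int) -> int:
    """Majority of replicas, for example 2 of 3."""
    return replication_factor // 2 + 1


def consistent_version(replica_writes, required_acks):
    """Return the writes present on at least `required_acks` replicas.

    replica_writes: list of sets, one per replica, each holding
    (key, write_timestamp) tuples. This is an illustrative model only.
    """
    counts = Counter()
    for writes in replica_writes:
        # Each replica is a set, so it counts a given write at most once.
        counts.update(writes)
    return {write for write, seen in counts.items() if seen >= required_acks}


# Hypothetical three-replica cluster (RF = 3).
replicas = [
    {("user:1", 100), ("user:2", 105)},
    {("user:1", 100), ("user:2", 105), ("user:3", 110)},
    {("user:1", 100)},
]
version = consistent_version(replicas, quorum(3))
```

In this model the write for `user:3` exists on only one replica, so it is left out of the version; every write that is included already meets the required consistency, which is why a restore would not need a repair pass.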
By allowing administrators to generate database backups at any user-specified time interval and at any granularity (table-level or entire database), RecoverX also simplifies operational use. Overall, versioning helps reduce data loss risk and minimizes capital and operating expenditures for an enterprise.
Datos IO RecoverX provides single-click, fully orchestrated, any-point-in-time recovery. With RecoverX, customers can restore data directly into the same database cluster (operational recovery) or into a different cluster (testing-and-development refresh) with a different topology (number of database nodes). This flexibility reduces the operational burden of refreshing testing-and-development clusters for continuous-development DevOps environments. Further, the recovery process deals only with the logical data, making it 3 times faster than traditional approaches. During recovery, the data is transferred directly (with no reliance on media servers) from secondary storage into target databases, resulting in a very low RTO.
Semantic deduplication is an industry-first capability that Datos IO has developed to reduce the cost of storing backups of scale-out non-relational databases over their retention period. Most scale-out non-relational databases keep multiple copies of the primary data, called replicas. As part of versioning, Datos IO RecoverX deduplicates all the replicas of a primary dataset, thus greatly increasing storage savings without losing native formats. For example, if the database uses a replication factor of 3 (RF = 3), RecoverX saves up to 70% in secondary storage costs. By using its application awareness, RecoverX optimizes the backup of Cassandra compacted SSTables and MongoDB oplogs, resulting in significant additional secondary storage savings.
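The effect of deduplicating replicas can be illustrated with a small sketch. It is an idealized model, not the RecoverX algorithm: it assumes a row is identified by its key and write timestamp, that every replica holds an identical copy, and it models replica removal only, ignoring the SSTable and oplog optimizations. The function names are hypothetical.

```python
def deduplicate_replicas(replicas):
    """Keep one copy of each row seen across all replicas.

    replicas: list of lists of (key, write_timestamp, value) rows.
    """
    unique = {}
    for rows in replicas:
        for key, timestamp, value in rows:
            # The first copy wins; later identical replicas are skipped.
            unique.setdefault((key, timestamp), value)
    return unique


def storage_savings(replicas, unique):
    """Fraction of rows that no longer need to be stored."""
    total = sum(len(rows) for rows in replicas)
    return 1 - len(unique) / total if total else 0.0


rows = [("k1", 1, "a"), ("k2", 2, "b"), ("k3", 3, "c")]
replicas = [list(rows) for _ in range(3)]  # RF = 3, fully replicated
unique = deduplicate_replicas(replicas)
savings = storage_savings(replicas, unique)
```

With three identical replicas, nine stored rows collapse to three, so this model removes two of every three copies.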
Datos IO RecoverX is a software-only data protection product that can be deployed on virtual machines.
It communicates with Apache Cassandra and MongoDB scale-out databases through a Secure Shell (SSH) connection that forms a control plane to orchestrate data movement. Data is backed up locally to an NFS target. In addition to CLIs and RESTful APIs, customers can use the RecoverX UI to manage their data protection environment.
Hosting RecoverX on VMware to protect scale-out databases hosted on VMware enables rapid and flexible deployment.
There is no need to skimp on the suggested memory, CPU, and disk capacity requirements.
Datos IO RecoverX is a software solution with specific machine prerequisites for memory, CPU, and provisioned disk. In actual use, it frequently doesn’t consume all the required memory or CPU resources and may not consume all the recommended disk capacity.
Memory, CPU, and disk resources are virtual for VMs running on VMware. Customers deploying RecoverX in VMware environments don’t have to dedicate these provisioned resources.
RecoverX can be deployed as a cluster for resiliency and for increased backup orchestration and processing resources. Use clones to simplify deploying a clustered RecoverX data protection solution. Create a VM (CentOS or RHEL) for the first RecoverX node and configure it with the prerequisites (SSH configuration settings, specific Linux utilities, specific users and assigned GIDs, etc.). Then create 2 clones of that VM, giving the 3 VMs needed for a 3-node RecoverX cluster. The RecoverX installer, run on the first VM, will automatically install RecoverX on all 3 VMs.
vSwitch VLANs make it easy to place Datos IO RecoverX and the clustered databases it protects on the same logical network. RecoverX and the clusters it protects need to be located on the same network.
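A quick pre-deployment check of this requirement might look like the sketch below. It assumes that "same network" can be approximated as "all node addresses fall inside one IP subnet", which may not match every VLAN design. The addresses, the CIDR block, and the helper name `on_same_network` are hypothetical.

```python
import ipaddress


def on_same_network(addresses, cidr):
    """Return True if every address belongs to the given subnet."""
    network = ipaddress.ip_network(cidr)
    return all(ipaddress.ip_address(addr) in network for addr in addresses)


# Hypothetical addresses for a 3-node RecoverX cluster and a 3-node database.
recoverx_nodes = ["10.0.20.11", "10.0.20.12", "10.0.20.13"]
database_nodes = ["10.0.20.21", "10.0.20.22", "10.0.20.23"]

ok = on_same_network(recoverx_nodes + database_nodes, "10.0.20.0/24")
```

A node provisioned on a different subnet, such as `10.0.21.5`, would make the check return `False` and flag the VM for a network change before installation.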