Enterprise applications that require scale and availability are being migrated to non-relational cloud databases (such as Apache HBase, Apache Cassandra, DataStax, MongoDB, Couchbase, Amazon DynamoDB, Microsoft DocumentDB, et al.). At the same time, enterprises are onboarding next-generation applications (analytics, e-commerce, IoT, etc.) natively on non-relational databases. Because these databases are distributed, traditional backup and recovery solutions cannot meet their data protection requirements: application-consistent, online backups (versioning); granular and orchestrated recovery; restore to a different topology for staging and test/dev use cases; any point-in-time recovery for near-zero RTOs; and so on. Enterprises value their applications and data, however, and are struggling to find a next-generation, horizontally scalable data protection solution to help them recover from data loss. In this blog, I will dig deeper into one such cloud database: MongoDB. I will analyze 3 existing backup and recovery solutions for MongoDB and compare their value and deployment costs.

The 3 Solutions

Solution #1: Hidden Secondary based Scripted Solution

This is a manually scripted solution that requires a hidden secondary node for backup purposes. At regular intervals, the node is locked (i.e., quiesced), a file system snapshot is taken, and the snapshot is transferred to a secondary storage repository. Alternatively, customers may use the mongodump tool to dump the data in binary format before transferring it to secondary storage. This is the most rudimentary solution: it is not scalable and it is error-prone. Above all, this method doesn’t provide the consistent point-in-time backup of sharded clusters that is required to recover cleanly from data loss.
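The lock, snapshot, unlock sequence described above can be sketched as follows. This is a minimal illustration, not a production script: `backup_hidden_secondary`, `admin_command` and `take_snapshot` are hypothetical names, and the snapshot step is left as a callable because it depends on the storage layer in use. `fsync` with `lock=True` and `fsyncUnlock` are the MongoDB admin commands that block and resume writes on a node.

```python
from typing import Any, Callable


def backup_hidden_secondary(
    admin_command: Callable[..., Any],
    take_snapshot: Callable[[], str],
) -> str:
    """Quiesce the node, snapshot its files, and always unlock afterwards."""
    # Flush pending writes to disk and block new writes on this node.
    admin_command("fsync", lock=True)
    try:
        # Hypothetical hook: LVM, EBS or any other file system snapshot.
        return take_snapshot()
    finally:
        # Release the lock even if the snapshot step fails.
        admin_command("fsyncUnlock")


# Dry run with stand-ins, so the call order can be checked without a database.
calls = []
snapshot_id = backup_hidden_secondary(
    lambda name, **kwargs: calls.append((name, kwargs)),
    lambda: "snapshot-001",
)
print(calls, snapshot_id)
```

With PyMongo, `admin_command` could be `client.admin.command` for a client connected directly to the hidden node. The sketch handles a single node only, which is exactly why it says nothing about consistency across shards.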

Solution #2: MongoDB Ops Manager based Backup and Recovery

Customers with a MongoDB Enterprise subscription license get access to a DIY backup solution. However, the solution architecture is extremely complex as mentioned here. This results in massive costs to purchase servers and storage, and to deploy and maintain the solution.

Solution #3: Datos IO RecoverX Solution

Datos IO has redefined data protection with its innovative CODR™ architecture, which simplifies the data protection architecture considerably. In addition, users can achieve continuous data protection at collection-level granularity.

Comparison Of The Basics

Recovery Time: This is the most important metric for any enterprise. The scripted solution (#1) requires a large amount of manual effort, typically several hours to days, to restore the last backup copy and apply operational logs to it. There is also a hidden cost of restore: it is not easy to restore granularly at the collection level, so the restore operation involves more data than necessary. With Datos IO RecoverX, you restore just the collection that you need, and it is as simple as one-click recovery.

Data Consistency: For sharded MongoDB clusters, getting consistent backups is a big challenge. The scripted solution (#1) works on a per-replica-set basis and hence does nothing to ensure consistency across shards. RecoverX ensures that consistency is maintained across the shards before taking backups.

The Value Comparison

There are multiple elements that make up the cost of a data protection solution: software licensing, infrastructure costs and operational costs. We have used AWS as a proxy to get infrastructure costs, but the same arguments hold true for any other on-premises or cloud environment. Let’s take a real customer scenario:

MongoDB Environment

  • A sharded 3×4 MongoDB cluster (3 replica sets, each with 1 primary + 2 secondary + 1 hidden node)
    Database Size: 3TB (primary on-disk data set)
  • A sharded 1×4 MongoDB cluster (1 replica set, 1 primary + 2 secondary + 1 hidden node)
    Database Size: 1TB (primary on-disk data set)

Daily change rate: 5%
Backup Retention Time: 14 days
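
The storage estimates for all three solutions start from two derived numbers: the total data to protect and the volume that changes each day. A small sketch of that arithmetic, using only the figures from the scenario above (the variable names are illustrative):

```python
# Primary on-disk data set per cluster, in TB (figures from the scenario above).
cluster_size_tb = {"3x4 sharded cluster": 3.0, "1x4 sharded cluster": 1.0}
daily_change_rate = 0.05  # 5% of the data changes per day
retention_days = 14

total_tb = sum(cluster_size_tb.values())
daily_change_tb = total_tb * daily_change_rate

print(f"Data to protect: {total_tb:.1f} TB")
print(f"Daily change:    {daily_change_tb:.1f} TB/day")
```

This gives the 4TB full-backup size and the 0.2TB incremental size used in the calculations for each solution.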

Solution #1: Hidden secondary solution

Assuming that a weekly full and daily incremental strategy is used, below is the total storage required.

  • Total data size = [ 4TB x 2 + 0.2TB x 12 ] = 10.4TB
    Storage Type = AWS EBS
    Secondary storage costs = $12.5K/year
  • MongoDB licensing for hidden secondary ~ $10K x2 = $20K/year
  • Cost of deployment (including developing and maintaining scripts)
    Assuming 2 weeks to develop and 3 weeks/year to maintain at $1K/day rate
    Total operational costs = $25K

Total cost of the solution (first year): ~$57.5K/year
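
The Solution #1 numbers can be reproduced with a short calculation. It is a sketch under stated assumptions: `retained_tb` is a hypothetical helper that counts one full backup per week and an incremental on every other day of the retention window, and a 5-day working week is assumed to turn weeks of effort into days. The storage price is taken as quoted rather than derived.

```python
def retained_tb(full_tb, incr_tb, retention_days, full_every_days=7):
    """Storage held with periodic fulls and incrementals on all other days."""
    fulls = retention_days // full_every_days
    incrementals = retention_days - fulls
    return fulls * full_tb + incrementals * incr_tb


# 14 days of retention: 2 weekly fulls + 12 daily incrementals.
storage_tb = retained_tb(full_tb=4.0, incr_tb=0.2, retention_days=14)

WORKDAYS_PER_WEEK = 5  # assumption, not stated in the article
DAY_RATE_K = 1.0       # $1K/day

solution1_costs_k = {
    "secondary storage (EBS)": 12.5,  # quoted figure for 10.4TB
    "licensing for hidden secondaries": 10.0 * 2,
    "scripting and maintenance": (2 + 3) * WORKDAYS_PER_WEEK * DAY_RATE_K,
}
solution1_total_k = sum(solution1_costs_k.values())

print(f"Storage: {storage_tb:.1f} TB, first-year cost: ${solution1_total_k:.1f}K")
```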

Solution #2: Ops Manager Based Backup Solution

This solution is hard to understand and even harder to implement as listed here. Looking at the minimum configuration required for this solution, the cost breakdown is listed below:

  • Servers = 2x 4-core (backup daemon) + 3x 4-core (backup metadata and oplog stores)
    Assuming m4.xlarge instances ($1.8K/yr)
    Total server costs = $9K/yr
  • Storage = 2.5 x 4TB (backup daemon) + 3 x 4TB (backup and oplog storage)
    Assuming EBS storage ($1.2K/TB/yr) for the backup daemon and S3 storage ($0.5K/TB/yr) for the rest
    Total storage costs = $17.5K/yr
  • MongoDB licensing for data bearing nodes ~ $10K x3 = $30K/yr
  • Cost of deployment
    Assuming 3 weeks to deploy and make this solution work at $1K/day rate
    Total deployment cost = $15K

Total cost of the solution (first year): ~$71.5K/year

There may be additional licensing costs for the Head DB and storage that are not included in this analysis. Further, maintaining this complex solution will incur ongoing operational costs.
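
The Solution #2 total is a sum of four components, sketched below with the quoted figures. One caveat: if the EBS and S3 prices are read as per-TB yearly rates (an assumption), the storage line comes to about $18K rather than the quoted $17.5K, so the sketch keeps the quoted figure for the total and shows the per-TB estimate separately. A 5-day working week is also assumed.

```python
WORKDAYS_PER_WEEK = 5  # assumption, not stated in the article

# 2 backup daemon servers + 3 metadata/oplog store servers, m4.xlarge each.
server_cost_k = (2 + 3) * 1.8

# Storage volumes from the breakdown above, in TB.
ebs_tb = 2.5 * 4  # backup daemon
s3_tb = 3 * 4     # backup and oplog storage

# Cross-check under the per-TB reading of the quoted prices (an assumption).
per_tb_storage_estimate_k = ebs_tb * 1.2 + s3_tb * 0.5

solution2_costs_k = {
    "servers": server_cost_k,
    "storage": 17.5,  # quoted figure, used as-is
    "licensing for data bearing nodes": 10.0 * 3,
    "deployment": 3 * WORKDAYS_PER_WEEK * 1.0,
}
solution2_total_k = sum(solution2_costs_k.values())

print(f"Per-TB storage estimate: ${per_tb_storage_estimate_k:.1f}K")
print(f"First-year cost: ${solution2_total_k:.1f}K")
```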

Solution #3: Datos IO RecoverX

Datos IO has developed a continuous data protection solution for MongoDB that provides massive cost benefits given its next-generation CODR architecture. Users can recover their data at collection-level granularity up to the last minute. In addition, deploying RecoverX requires only a single server and 1x the storage. The calculations below show the cost savings that users can achieve with RecoverX.

  • Server = 1x 8-core
    Assuming m4.2xlarge instances ($3.6K/yr)
    Total server costs = $3.6K/yr
  • Total data size = [ 4TB + 0.2TB x 13 ] = 6.6TB
    Storage Type = AWS S3
    Secondary storage costs = $3K/yr
  • RecoverX licensing cost estimate ~ $6K/TB x 4TB = $24K/yr
  • Cost of deployment
    Assuming 1 day to deploy at $1K/day rate
    Total deployment cost = $1K

Total cost of the RecoverX solution (TCO): ~$31.6K/year (~56% less than Solution #2)
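
To check the RecoverX total and put it next to the other two, the sketch below sums the quoted components and computes the relative saving against each first-year total. `saving_pct` is a hypothetical helper, and the baseline totals are the ones stated for Solutions #1 and #2 above.

```python
def saving_pct(baseline_k, new_k):
    """Relative saving of new_k against baseline_k, in percent."""
    return 100.0 * (1.0 - new_k / baseline_k)


# One full copy plus 13 daily incrementals over the 14-day retention window.
recoverx_storage_tb = 4.0 + 0.2 * 13

recoverx_costs_k = {
    "server (1x m4.2xlarge)": 3.6,
    "secondary storage (S3)": 3.0,  # quoted figure for 6.6TB
    "licensing": 6.0 * 4,           # $6K per TB of protected data
    "deployment": 1 * 1.0,          # 1 day at $1K/day
}
recoverx_total_k = sum(recoverx_costs_k.values())

baselines_k = {"Solution #1": 57.5, "Solution #2": 71.5}
for name, baseline in baselines_k.items():
    print(f"vs {name}: {saving_pct(baseline, recoverx_total_k):.1f}% less")
```

On these figures the saving works out to roughly 56% against Solution #2 and roughly 45% against Solution #1.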

Summary

To recap, you should be aware of the value you are getting with each solution, and of the hidden costs of infrastructure, deployment and operations. Datos IO RecoverX not only provides unique data protection features, such as any point-in-time recovery up to the last minute, but also reduces your operational and deployment costs significantly. If you are struggling with high operational and deployment costs, please feel free to reach out to our backup and recovery specialists to learn how we can provide you with maximum benefits at the lowest cost.

Comparison of 3 Solutions for Backup and Recovery of MongoDB

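| Criterion                            | Hidden Secondary Backup    | Native Database Backup | Datos IO RecoverX       |
|--------------------------------------|----------------------------|------------------------|-------------------------|
| Recovery Time                        | Very large (hours to days) | Low-Mid                | Low (minutes to hours)  |
| Recovery Point Objective (data loss) | High                       | Mid-High               | Low (up to last minute) |
| Data Source Failure Handling         | Poor                       | Good                   | Good                    |
| Operational Efficiency               | Poor                       | Good                   | Good                    |
| Storage Costs                        | High                       | High                   | Low                     |
| Vendor Lock-in                       | No                         | Yes                    | No                      |
| Overall solution cost                | High                       | Very High              | Low                     |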