This post is a contributed blog post by Eric Lubow, who was most recently the CTO of SimpleReach, the measurement layer for content. He has also been a Cassandra MVP since the beginning and is the co-author of Practical Cassandra.
Many people believe that replicating a distributed database is a replacement for having a solid disaster recovery (DR) plan or backup system. While there are certainly elements of truth to this, it misses the mark. Replication is a complement to a well-thought-out DR plan, not a replacement.
What is Replication?
Replication is the act of copying data from one database server or data center to another database server or data center. In other words, replication creates an exact copy of the data in a different storage array. Replication always copies exactly what is in your database. This is important because if your data is corrupt or somehow bad data has gotten into your system, then you are basically just replicating garbage.
What is Backup?
A backup, on the other hand, is a point-in-time snapshot of the operational state, architecture, and stored data in the database. This distinction is incredibly important given that a backup could be the difference between restoring good data or bad data into your database to recover from an outage (or even just bad data). Backup is also important from a regulatory standpoint. Many organizations have contractual clauses in their MSAs (master service agreements) that require a vendor to have backups. While this is certainly in the best interest of the vendors to do anyway, it is also a function of how organizations attempt to protect themselves from externalities that they have little control over. Many of the backups requested fall under regulations such as HIPAA or Sarbanes-Oxley, which impose important requirements such as the following:
- Data must be backed up offsite frequently and be fully recoverable at a point in time
- Backed up data must be encrypted at rest and transported in an encrypted fashion
- Backups must be stored in multiple locations and each location must be deemed physically safe
- Backups must be tested regularly for validity
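One small piece of testing a backup for validity can be sketched in code. The Python example below is a minimal illustration, not a complete validation procedure: it assumes a digest was recorded when the backup was taken and checks that the stored payload still matches it. The function names and sample data are hypothetical, and a real validity test would also include an actual restore.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a backup payload."""
    return hashlib.sha256(data).hexdigest()


def verify_backup(payload: bytes, recorded_digest: str) -> bool:
    """Return True if the payload still matches the digest recorded at backup time."""
    return sha256_of(payload) == recorded_digest


# Hypothetical backup contents and the digest stored alongside them.
backup = b"user:1,name:alice\nuser:2,name:bob\n"
digest = sha256_of(backup)

# An untouched backup verifies; a silently altered copy does not.
intact = verify_backup(backup, digest)
damaged = verify_backup(backup.replace(b"bob", b"b0b"), digest)
```

A check like this only shows that the bytes have not changed since the backup was written. It says nothing about whether the data was good when it was captured, which is why periodic restore tests still matter.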
What is the difference between replication and backup?
Replication can and sometimes does handle parts of the backup solution. Data is stored and transported in an encrypted fashion between physical locations. The data is spread across locations, and your servers are likely in locations that are deemed physically safe. Most importantly, with replication, your data is available in another location almost instantly. However, replication also means that any errors or corruption are propagated across all replicas. Thus, replication cannot be used to hedge against issues such as logical corruption, application schema corruption, ransomware, fat-finger mistakes, operator errors, compliance gaps, and so on.
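The difference can be illustrated with a toy model. The Python sketch below is purely hypothetical (the `ReplicatedStore` class is not any real database API): every write is applied to all replicas, so a bad write lands everywhere, while a snapshot taken earlier still holds the good state and can be restored.

```python
import copy


class ReplicatedStore:
    """Toy key-value store that applies every write to all replicas (hypothetical)."""

    def __init__(self, replica_count: int) -> None:
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value) -> None:
        # Replication copies exactly what was written, good or bad.
        for replica in self.replicas:
            replica[key] = value

    def snapshot(self) -> dict:
        # A backup is an independent point-in-time copy of the data.
        return copy.deepcopy(self.replicas[0])

    def restore(self, snap: dict) -> None:
        # Restoring replaces every replica with the snapshot contents.
        self.replicas = [copy.deepcopy(snap) for _ in self.replicas]


store = ReplicatedStore(3)
store.write("balance", 100)
backup = store.snapshot()

# A fat-finger write is faithfully replicated to every copy.
store.write("balance", -999999)
all_bad = all(r["balance"] == -999999 for r in store.replicas)

# Only the earlier snapshot can bring the good value back.
store.restore(backup)
all_restored = all(r["balance"] == 100 for r in store.replicas)
```

In this model, adding more replicas does nothing to protect against the bad write. Only the independent copy taken before the mistake does.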
After all this, there is a caveat: one form of replication comes very close to backups, namely CDP, or continuous data protection. This is where backups and replication meet. CDP logs the changes in the data and ships them off to a secondary storage location in near real time. This allows for point-in-time recovery, one of the reasons that backups are so important.
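The idea behind point-in-time recovery from a change log can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the log format (timestamp, key, value) and the `recover_to` function are hypothetical and do not reflect any particular CDP product.

```python
from typing import Any, Dict, List, Tuple

# Each entry is (timestamp, key, new_value); the format is an assumption.
ChangeLog = List[Tuple[int, str, Any]]


def recover_to(log: ChangeLog, point_in_time: int) -> Dict[str, Any]:
    """Rebuild state by replaying changes up to and including point_in_time."""
    state: Dict[str, Any] = {}
    for timestamp, key, value in sorted(log, key=lambda entry: entry[0]):
        if timestamp > point_in_time:
            break  # ignore everything after the recovery point
        state[key] = value
    return state


change_log: ChangeLog = [
    (1, "name", "alice"),
    (2, "email", "alice@example.com"),
    (3, "email", "corrupted!!"),  # bad change shipped like any other
]

# Recovering to timestamp 2 skips the bad change; timestamp 3 includes it.
before_corruption = recover_to(change_log, 2)
latest = recover_to(change_log, 3)
```

Because the log keeps every change rather than only the latest state, recovery can stop just before the bad write, which plain replication cannot do.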
Backups and replication are not interchangeable.
They are complementary processes that are components of having a complete disaster recovery plan. Knowing when to take proper advantage of each type is important when setting up your application and your organization for success.