Consistency & Consistency Levels: Distributed Data Stores
Prerequisites — Quorum
In this article, we are going to talk about consistency in distributed systems. Consistency in a distributed context is not the same as consistency in a single-server RDBMS. In a traditional RDBMS, consistency is about ensuring data correctness through constraints. You can find out more about consistency in RDBMS through this blog.
Coming back to distributed data stores, consistency is about keeping the replica nodes in sync with the leader & with each other.
What If I Do Not Have A Consistent System
If you’re using a distributed data store without consistency guarantees, a write might be applied to the leader, but because the replica nodes are not yet in sync with the leader, subsequent reads served by those replicas might not show the latest write.
Similarly, if some replicas are in sync but others are not, some users might see the latest write while others would not. This too leads to inconsistent behaviour from the system.
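The stale read described above can be reproduced with a small in-memory simulation. This is only an illustrative sketch: the `Node` class, the `write` and `read` helpers, and the `sync_to` parameter are hypothetical names made up for this example, not the API of any real data store.

```python
class Node:
    """A single node holding an in-memory key-value copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


leader = Node("leader")
replicas = [Node("replica-1"), Node("replica-2")]


def write(key, value, sync_to):
    """Apply the write on the leader, then copy it only to the replicas in sync_to."""
    leader.data[key] = value
    for replica in sync_to:
        replica.data[key] = value


def read(node, key):
    """Return whatever this node currently holds for the key."""
    return node.data.get(key)


# First write reaches every replica, so all nodes are in sync.
write("balance", 100, sync_to=replicas)

# Second write reaches only replica-1; replica-2 is lagging behind.
write("balance", 250, sync_to=[replicas[0]])

# Two users reading from different replicas now see different values.
results = {r.name: read(r, "balance") for r in replicas}
print(results)  # {'replica-1': 250, 'replica-2': 100}
```

A user routed to replica-2 still sees the old value, even though the leader & replica-1 already hold the latest write.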
Consistency Level is the number of replica nodes that must successfully acknowledge a read/write request for the entire request to be considered successful.
Notice that I mentioned both read & write requests in the definition. That’s because a consistency level can be defined separately for read requests & for write requests.
Read CL: Read Consistency Level can be defined as the number of replica nodes that must respond to a read request with their copy of the data for its partition before the read is considered successful & the latest copy is returned to the user.
Write CL: Write Consistency Level can be defined as the number of replica nodes that must successfully acknowledge a write of the latest data to its partition.
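To make the two definitions concrete, here is a minimal sketch of a coordinator that enforces a write CL & a read CL over a set of replicas. Everything in it is an assumption made for illustration: the `Replica` class, the `write_with_cl` and `read_with_cl` functions, and the use of an integer version to decide which copy is the latest. Real data stores handle timeouts, retries & repair in far more detail.

```python
class Replica:
    """In-memory replica; `up=False` simulates a node that does not respond."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.store = {}  # key -> (version, value)

    def write(self, key, version, value):
        if not self.up:
            return False  # no acknowledgement
        self.store[key] = (version, value)
        return True

    def read(self, key):
        if not self.up:
            return None  # no response
        return self.store.get(key, (0, None))


def write_with_cl(replicas, key, version, value, write_cl):
    """Send the write to every replica; succeed if at least write_cl acknowledge it."""
    acks = sum(1 for r in replicas if r.write(key, version, value))
    return acks >= write_cl


def read_with_cl(replicas, key, read_cl):
    """Collect read_cl responses and return the value with the highest version."""
    responses = []
    for r in replicas:
        response = r.read(key)
        if response is not None:
            responses.append(response)
        if len(responses) == read_cl:
            break
    if len(responses) < read_cl:
        raise RuntimeError("not enough replicas responded for the requested read CL")
    return max(responses, key=lambda pair: pair[0])[1]


nodes = [Replica("r1"), Replica("r2"), Replica("r3", up=False)]

# Two of three replicas are up, so a write CL of 2 can be satisfied.
ok_two = write_with_cl(nodes, "user:1", 1, "alice", write_cl=2)

# A write CL of 3 cannot be satisfied while r3 is down.
# Note: this sketch does not roll back the copies already written to r1 and r2.
ok_three = write_with_cl(nodes, "user:2", 1, "bob", write_cl=3)

value = read_with_cl(nodes, "user:1", read_cl=2)
print(ok_two, ok_three, value)  # True False alice
```

With one replica down, requests at CL 2 still succeed, while a write at CL 3 is reported as failed & a read at CL 3 would raise an error.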
CL Values: The Consistency Level for read/write requests in your data store can be set to values such as the following.
- ONE: A single node must acknowledge a read/write request.
- TWO, THREE, …: N (2, 3, …) nodes must acknowledge a read/write request.
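The values listed above map naturally to a count of required acknowledgements. The sketch below shows one way that mapping could look. The `ConsistencyLevel` enum and the `required_acks` and `request_succeeded` helpers are hypothetical names for this example, and the check against the number of replicas is an assumption about sensible behaviour rather than something a specific data store is known to do in this form.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    """Named levels; the enum value is the number of acknowledgements required."""

    ONE = 1
    TWO = 2
    THREE = 3


def required_acks(level, replica_count):
    """Return how many acknowledgements the level needs, given the replica count."""
    if level.value > replica_count:
        raise ValueError(
            f"{level.name} needs {level.value} replicas, only {replica_count} exist"
        )
    return level.value


def request_succeeded(level, acks_received, replica_count):
    """A request succeeds once enough replicas have acknowledged it."""
    return acks_received >= required_acks(level, replica_count)


print(request_succeeded(ConsistencyLevel.ONE, 1, 3))    # True
print(request_succeeded(ConsistencyLevel.THREE, 2, 3))  # False
```

A higher level asks for more acknowledgements, so the same two responses that satisfy ONE or TWO are not enough for THREE.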