In my last article, we saw how distributed data stores use version vectors to identify concurrent updates to data records. We looked at one technique for identifying concurrent updates (conflicts), using the ClientId as an Actor, along with its advantages and disadvantages. In this article, we’ll look at another approach for identifying concurrent updates.
Server As An Actor
The problem with Client as an Actor is Actor Explosion: the number of clients can grow very large, and every client that writes adds an entry to the version vector. To solve that, we can use servers as actors instead.
But, you might ask, clusters can also be very large and span multiple regions. Wouldn’t they face the same problem of Actor Explosion?
Yes, you’re right! Hence, the only servers that act as actors for a record are the nodes that hold its replicas, and their number is set by the replication factor. If you remember, each data record has its own version vector, so the maximum size of a record’s version vector is the replication factor for that data in the cluster.
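To make the size bound concrete, here is a minimal sketch in Python. It assumes a version vector is a plain dictionary from replica ID to counter; the names `increment`, `descends`, `concurrent`, and `REPLICAS` are hypothetical and chosen only for illustration, not taken from any particular data store.

```python
from typing import Dict

# Assumption: a version vector maps a replica (server) ID to a counter.
VersionVector = Dict[str, int]

def increment(vv: VersionVector, replica_id: str) -> VersionVector:
    """Return a copy of vv with replica_id's counter bumped by one."""
    new_vv = dict(vv)
    new_vv[replica_id] = new_vv.get(replica_id, 0) + 1
    return new_vv

def descends(a: VersionVector, b: VersionVector) -> bool:
    """True if a has seen every update that b has seen."""
    return all(a.get(replica, 0) >= count for replica, count in b.items())

def concurrent(a: VersionVector, b: VersionVector) -> bool:
    """True if neither vector descends the other, i.e. a conflict."""
    return not descends(a, b) and not descends(b, a)

# Hypothetical replica set for one key, with a replication factor of 3.
REPLICAS = ["A", "B", "C"]

vv: VersionVector = {}
for i in range(1000):
    # Any number of clients may write, but only the coordinating
    # replica's counter changes, so no client IDs enter the vector.
    vv = increment(vv, REPLICAS[i % len(REPLICAS)])

print(len(vv))  # never exceeds the replication factor
```

After 1,000 writes the vector still has only three entries, one per replica, no matter how many clients issued those writes.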
Let’s try to understand what’s happening in the above diagram -
- Let’s assume we have a key K with value U, and that the version vector is empty to begin with. Clients C2 and C3 sync the same state from the replica (assuming all clients are interacting with the same replica), which implements version vectors.
- C2 updates the value to W and sends a PUT command along with its local copy of the version vector (an empty VV).
- C3 updates the value to V and sends a PUT command along with its local copy of the version vector (an empty VV).
- Replica A receives the request from C3 first (C2’s request might be delayed because of network latency). Replica A compares the version vector it received with its local state and sees that they match. So it increments its own counter in the version vector to 1 and updates the value to V.
- The request from C2 finally arrives at Replica A. Replica A compares the version vector it received with its local state and sees that the vector it received does not match its state. The following approaches can be taken:
- Replica A can ignore the request from C2, as its local version vector is higher than the incoming version vector from C2.
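The walkthrough so far, including this first approach of ignoring the stale request, can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Replica` class and its `get` and `put` methods are hypothetical names, there is a single replica, and a write is accepted only when the client's vector exactly matches the replica's local vector.

```python
class Replica:
    """Hypothetical single replica that uses its own ID as the actor."""

    def __init__(self, replica_id, value):
        self.replica_id = replica_id
        self.value = value
        self.vv = {}  # empty version vector to begin with

    def get(self):
        # Clients receive the value plus a copy of the version vector.
        return self.value, dict(self.vv)

    def put(self, value, client_vv):
        # Accept the write only if the client saw the replica's latest state.
        if client_vv == self.vv:
            self.vv[self.replica_id] = self.vv.get(self.replica_id, 0) + 1
            self.value = value
            return True
        # The local vector is ahead of the client's: ignore the request.
        return False

replica_a = Replica("A", "U")

# C2 and C3 sync the same state (value U, empty version vector).
_, c2_vv = replica_a.get()
_, c3_vv = replica_a.get()

# C3's PUT arrives first and is accepted; C2's delayed PUT is ignored.
c3_accepted = replica_a.put("V", c3_vv)
c2_accepted = replica_a.put("W", c2_vv)

print(c3_accepted, c2_accepted, replica_a.value, replica_a.vv)
```

Note that with this choice C2's value W is silently dropped, which is the trade-off of ignoring the request.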