CRDTs: Strong Eventual Consistency without concurrency control
One of the major benefits of having a NoSQL database is that it can scale very easily, these databases take care of many things like data replication across the cluster, concurrent updates, data consistency, message redelivery, conflict resolution, etc. However, all these things come at a cost, for instance, you may be able to scale up nodes in your database cluster, but you might have to compromise on consistency (which means that all nodes in your database may not return the same result) or there might substantial performance implications for resolving conflicts in data and not to mention the CAP theorem, which states that you can only choose only two out of consistency, availability and partition tolerance.
You might be wondering, where does CRDTs fit into all of this? well, for starter CRTDs is a solution to C in CAP theorem where C stands for consistency.
Consistency: C in CAP
Usually when people say consistency, what they actually mean is Strong consistency. How can we define Strong consistency? well if you have a series of updates, and if every node in your cluster sees the updates in the same order, it is called Strong consistency. But unfortunately, it is very hard to achieve when you a have a system which allows concurrent updates (which ideally you should be doing).
while Eventual consistency is when you allow replicas to diverge for a while on updated but they eventually converge to the same value.
In order to allow concurrent updates we often have to give up on strong consistency. There are many flavour of consistency in any distributed systems that you can opt for, like strong consistency, eventual consistency, strong eventual consistency, optimistic consistency, etc. Out of these flavors CRDTs allows you to have provable strong eventual consistency (key here is provable )this means that it guarantees a strong eventual consistency (SEC) and if you can settle for SEC than you can have all three from CAP theorem.
Consistency and Replication Strategies:
Scalability is the biggest feature of any NoSQL database, which means that you usually have more that one node in DB cluster, As a result, if any update goes to one node, that update needs to be replicated to other nodes to maintain consistency of data.
To put it simple words, Data replication strategies can be grouped into two main categories.
- Synchronous
- Asynchronous
Synchronous Replication:
Synchronous data replication is straight forward, wherein if an update request comes to one node, then we propagate the update to all of the other nodes and wait for acknowledgment from them, once it gets all acknowledgments, it replies back to the client saying write was successful.
This solution has the data replication in its critical path as it waits for data to be replicated across the cluster before returning a response back to the client.
Asynchronous Replication:
In this strategy, data replication is not in the critical path which means that if a node receives an update request, it updates it’s local copy and immediately return a response to the client, In the background, it tries to propagate the updates to other nodes.
Well, this sounds like a solution !!!!!!
Sorry to disappoint but this path is full of perils, In order to explain it better let me take Shopping cart as an example wherein user can add or remove an item.
Problem 1: Order Independence
- Let’s say, user tries to add item A to the cart, this request goes to the first node(let’s call it R1) of your DB cluster, after returning a response to User, it successfully propagates the updated to R2 (another replication Node in the same cluster).
- R2 receives the update and updates its local copy.
- The user again adds item B to the cart, update again goes to R1 and R1 tries to propagate the same update to the R2, but this time R2 could not receive the update because of some network issues, as a result, R2 has only item A in its shopping cart.
- Unfortunately, User changes his mind and decides to remove Item B from the cart. Again request goes to R1 which propagates it to R2 but now R2 is in dilemma since R2 never added item B in its Cart, how is it supposed to remove it.
- Now R2 diverges and there is a conflict which must be resolved.
Problem 2: Duplication and Redelivery
- Let’s say we try to solve the above problem by requiring every replication node to send an acknowledgment for each update so that we could attempt a redelivery if any update fails to propagate.
- R1 tells R2 to add item A to the cart, and R2 after adding item A to the cart sends an acknowledgment to the R1, however, this acknowledgment message is lost in the network.
- Since the acknowledgment is lost, R1 attempts redelivery of update (add(A)). R2 not knowing what is going on add again Item A to the cart, And Now R2 has two item A in its Cart.
- Now again R2 diverges and we have a conflict in our cluster.
Conflict Resolution: diverge → Rollback → Converge
Most of the distributed databases use the divergence, rollback and converge cycle, which means that whenever a node diverges or has a conflict, the system will rollback changes in the conflicting node and try the bring it in the same state as other nodes.
In order to achieve this, nodes have to resort to something called as consensus algorithm (for example Paxos or Raft). In this algorithm, each node talks to another node and tries to reconcile the conflict. For example if the majority of nodes agrees that Item A was added only once, then all the nodes having two Item A, will roll back their changes and take changes that were decided by the consensus.
It might sound a fairly good solution except the fact the these consensus algorithm are notoriously hard to implement. And Even if we get it right, there are substantial performance penalties that need to be paid for every conflict reconciliation because now for every conflict reconciliation each node needs to talk to another node to form a consensus on overall system state and thus not scalable.
Enter CRDT:
According to Wikipedia:
In distributed computing, a conflict-free replicated data type (abbreviated CRDT) is a type of specially-designed data structure used to achieve strong eventual consistency (SEC) and monotonicity (absence of rollbacks)[1] . As their name indicates, a CRDT instance is distributed into several replicas; each replica can be mutated promptly and concurrently; the potential divergence between replicas is however guaranteed to be eventually reconciled through downstream synchronisation (off the critical path);[1] consequently CRDTs are known to be highly available.
Don’t worry if you don’t understand anything. Keywords here are, Strong eventual consistency, absence of rollback and synchronization off the critical path.
This means that if you are using CRDT data structures than you get following benefits out of the box.
- Strong Eventual consistency without Consensus or concurrency control which leads to high performing and scalable systems.
- Solves CAP theorem: if you accept SEC instead of Strong consistency than you can have all three.
- Any kind of update is allowed with no conflict because CRDTs guarantees that these update will eventually converge.
How do CRDTs achieve this awesomeness?
CRDTs are specially designed data structure which follows some mathematical properties like,
Commutative: Order Independence
Taking the example of a shopping cart, this means that in CRDT following holds true:
add(item A) + add (item B) + remove (item B) = add(item A) + remove (item B) + add (item B)
this means that even if a node receives a delete operation before an add operation, it does not have have to resort to conflict reconciliation, every update(add or remove) will eventually converge.
Idempotent: Immune to duplication and redelivery
CRDTs can handle duplication and redelivery of the same update, which means following holds true
add(item A) + add(item A) = add(A)
this means that CRDTs are immune to redelivery of message/update (if you are wondering how it does this then you can read about vector clocks and timestamps)
How does it work
- Whenever a node receives an update, it updates the local copy and propagates the same update to other replicas.
- Any kind of update is allowed (for example deleting an item even it hasn’t been added) because we know that all of these updates will eventually converge though they may seem to be conflicting right now (remember !!! those magical mathematical properties).
I do not want to go into details of there implementation. But if you are interested you can read about them in this paper.
So, What’s the catch? Why don’t everybody uses them !!!
That is because of those magical mathematical properties, not every operation or data structure can be modeled in this way. There are many CRDT data types but I’ll only talk about the types supported by Riak KV and these are:
- Counters : simple counter which can be decreased or increased.
- Flags : bolean flag
- Maps: Maps are the most versatile of the Riak data types because all other data types can be embedded within them, including maps themselves.
- Registers: just like string or a variable.
- Sets: Collection of unique value such as String.