CRDTs are a way of handling replicated or distributed data. What is distributed data? It just means data that is copied to many machines. As soon as we have such a distributed system we have to think about what happens when our data changes. We can decide that all machines will be aware of all changes. That is, we can maintain consistency. This is nice because it means we never deal with out-of-date data, but it requires every change to be sent to every machine before it is considered complete. If a machine (or the network) goes down we must refuse updates because we can’t ensure everyone has seen every update, and thus we can’t maintain consistency .
We can instead prefer availability, meaning we’ll just soldier on if machines go down, but this does mean we will end up with the same piece of data having different values on different machines. In other words our data will become inconsistent. CRDTs allow us to recover a particular type of consistency, called eventual consistency, without a great deal of work. With CRDTs we will be able to merge different copies of our data together without issue, and if we merge all copies together we are guaranteed to arrive at the same value everywhere.
YOU MAY ALSO LIKE:
Convergent Replicated Data Types
Noel Welsh
Noel has been interested in computers for a long time, particularly the leverage that computers give to people. He followed this interest to a PhD in machine learning, focusing on Bayesian nonparametrics and reinforcement learning. He still finds machine learning very interesting, but right now is more involved with programming and programming languages. A large part of his work is helping people become more effective with functional programming.