🔄 Conflict-free Replicated Data Types (CRDTs)
Distributed data structures with guaranteed convergence
Overview
CRDTs are data structures designed for distributed systems where replicas update independently without coordination. They guarantee Strong Eventual Consistency (SEC): any two replicas that have received the same set of updates will be in the same state, regardless of message ordering, duplication, or delays.
This is achieved through merge operations satisfying three algebraic properties: commutativity, associativity, and idempotency — forming a join-semilattice.
Concepts Demonstrated
- Distributed systems — replicas, eventual consistency, partition tolerance
- Lattice theory — join-semilattices, monotonic growth
- State-based replication — full state exchange with deterministic merge
- Algebraic data types — modular CRDT building blocks
- Higher-order functions — composable merge operations
Implemented CRDTs
| Type | Description | Use Case |
|---|---|---|
GCounter | Grow-only counter | Page views, likes |
PNCounter | Increment/decrement counter | Stock levels, votes |
GSet | Grow-only set | Tag collections, seen-message IDs |
TwoPhaseSet | Add-once, remove-once set | User ban lists |
LWWRegister | Last-writer-wins register | User profile fields |
ORSet | Observed-remove set | Shopping carts, collaborative editing |
MVRegister | Multi-value register | Conflict-preserving fields |
How It Works
(* Each replica maintains local state *)
(* Updates are applied locally — always succeed *)
(* Replicas periodically exchange state and merge *)
(* G-Counter example: *)
let c = GCounter.create ()
|> GCounter.increment "nodeA" 3
|> GCounter.increment "nodeB" 5
(* GCounter.value c = 8 *)
(* Merge two diverged replicas: *)
let merged = GCounter.merge replica1 replica2
(* Takes pointwise maximum → convergence guaranteed *)
(* Key insight: merge is commutative, associative, idempotent *)
(* merge(A,B) = merge(B,A) — order doesn't matter *)
(* merge(merge(A,B),C) = merge(A,merge(B,C)) — grouping doesn't matter *)
(* merge(A,A) = A — duplicates are harmless *)
CRDTs vs STM
Compare with Software Transactional Memory:
- CRDTs: Eventual consistency, no aborts, every operation succeeds locally
- STM: Strong consistency via optimistic transactions that abort on conflict
CRDTs trade consistency strength for availability (AP in CAP theorem). STM trades availability for consistency (CP).
Key Takeaways
- Algebraic properties (commutativity, associativity, idempotency) can replace coordination
- State-based CRDTs are simpler but transfer more data than operation-based
- OCaml's module system maps cleanly to CRDT type hierarchies
- Immutable data + pure merge functions = easy reasoning about correctness