An introduction to Conflict-Free Replicated Data Types

Part 3: Lattices

Previous part Next part

This is a series about Conflict-Free Replicated Data Types, or CRDTs for short. Their purpose is to allow seamless replication of data on different nodes in a distributed system. Merging is by construction always possible, without any conflicts. This series assumes no knowledge about CRDTs, but be prepared to learn a thing or two about algebras. All code samples on this page are interactive and executed in your browser. Understanding the code is necessary for understanding the concepts, so you should be familiar with JavaScript. If you notice any bugs on this page, please let me know!


A tasty lattice

What’s happening in this episode

So far, we’ve learned about partial orderings. Those are essentially functions that can tell whether or not a thing is smaller than another thing. This is essential for defining monotonic operations, for example inserting an element into a set, which never makes the set smaller.

But one important piece of the puzzle is still missing. After all, we don’t just want to compare data, we also want to merge data that has diverged on two different machines.

Consider the following chain of events.

  1. Alice and Bob start with the set containing just 1.
  2. Internet connection fails.
  3. Alice adds a 2.
  4. Bob adds a 3.
  5. Internet connection is restored.

Clearly, the set after merging should be 1, 2, 3. That’s what you’d expect! But how can this notion be described mathematically? The answer is lattices.


Hasse diagram of the powerset of {x, y, z}

Lattices? Lattices.

Remember this diagram from the previous episode? I told you it shows a partial ordering of sets. There’s an arrow pointing from {x} to {x, z} because the former is a subset of the latter. But the diagram actually shows a different algebraic structure (they just happen to coincide): a lattice. We know this word from real life, but what does it mean in this context?

Just like a partial ordering, a lattice is a structure that has an operation (here, it’s called join) that has to satisfy some properties. The join operation is used to merge two values and produce a “bigger” value.

For sets, the join operation is defined to be set union. Intuitively, this makes sense: we can always merge them, order doesn’t matter, and the result is always a superset of both constituent sets. All of these are properties that are crucial for CRDTs. It shouldn’t matter whether Bob pulls Alice’s data first or vice versa; they should always end up with the same common state.

Let’s go through the laws that a lattice has to satisfy one by one.

  1. Associativity states that the parentheses don’t matter. Formally speaking, x ∨ (yz) = (xy) ∨ z must hold (mathematicians usually write ∨ for join). You know associativity from adding and multiplying numbers.
  2. Commutativity states that the order of joining doesn’t matter. Formally, xy must be equal to yx. Adding and multiplying numbers is also commutative.
  3. Idempotence might seem a little odd, but it’s nonetheless an important property. It states that joining a value to itself always returns the value itself. This is not something that numbers typically do. Unless you consider taking the maximum of two numbers.

To implement this operation for sets, once again we’ll need to monkey patch (yolo) the set union operation. I’m also starting to get annoyed by the verbose Set constructor, so I’ll define my own, concise version.

set = (...elems) => new Set(elems);

Set.prototype.union = function (that) {
  return set(...this.values(), ...that.values());
}

assert.deepEqual(set(1).union(set(2)), set(1, 2));

Cool.1

Enough with the paperwork, let’s define the lattice for sets together with its laws.

lattices = {};

lattices.set = {
  join: (x, y) => x.union(y)
};

contracts.lattice = (instance, gen) => ({
  assoc:
    fc.property(gen, gen, gen, (x, y, z) => {
      const x_yz = instance.join(x, instance.join(y, z));
      const xy_z = instance.join(instance.join(x, y), z);
      assert.deepEqual(x_yz, xy_z);
    }),
  commute:
    fc.property(gen, gen, (x, y) => {
      const xy = instance.join(x, y);
      const yx = instance.join(y, x);
      assert.deepEqual(xy, yx);
    }),
  idem:
    fc.property(gen, x => {
      const xx = instance.join(x, x);
      assert.deepEqual(xx, x);
    })
});

const intSetGen = fc.array(fc.integer()).map(entries => new Set(entries));

checkAll(contracts.lattice(lattices.set, intSetGen));

This follows the exact same pattern as the partial ordering contract we saw in the previous part.

A semi-lettuce

At this point, I have to tell you that I lied to you again. The structure that’s defined above is actually not a lattice, but only a semilattice; more precisely, a join-semilattice. The reason is that a lattice also needs another operation: meet, the opposite of join. For sets, that would be intersection. But this is not relevant for now. I’ll just keep on writing lattice.


There …

Partial ordering with lub and join semilattice? They're the same picture.

You may have noticed that the above definitions do not require any ordering. That’s no accident. It’s possible to define a lattice without worrying about the ordering of the elements. But wait, you might say, this subset diagram is exactly the same as in the previous episode! And you’d be right. So what is the connection between lattices and partial orderings?

Intuitively speaking, when we have two values and join them together, we should obtain something greater-than-or-equal-to the original values. And that’s precisely the trick. We can define what less-than-or-equal-to means based just on the join operation!

This is a pattern that is applied throughout algebra. We have two seemingly different abstract structures and can then define one in terms of the other.

You may already know an example of this: subtraction. We can describe x - y equivalently as x + (- y). It turns out that mathematicians prefer the latter representation because it makes talking about the properties of - and + easier. But both representations can be transformed into one another.

What we have seen above is one possible definition of a semilattice. Here’s another one.

Assume we have a partial ordering ≤ for a set of values a, b, … Let’s call that set M. If we additionally know that for each pair of values in M, there is a least upper bound2 that’s also in M, then we can construct a semilattice. The least upper bound of a and b is defined to be a value c where ac and bc (so much is obvious since it’s an upper bound) and there’s no other element d that’s closer to a and b than c. Formally: c is least upper bound of a and b if

  1. c is in M, and
  2. ac, and
  3. bc, and
  4. for all d in M:
    • if ad and bd (“d is another candidate for an upper bound”),
    • then cd (“c is a better upper bound than d”)

Let’s make this concrete. Consider the natural numbers 0, 1, 2, … We pick a = 2 and b = 3. Naturally, c = 5 would be an upper bound of both 2 and 3, since clearly 2 ≤ 5 and 3 ≤ 5. But unfortunately, there’s another number that’s closer: 3. So, c = 5 violates the last constraint.

The magic of this definition is that the join operation falls out of it: it is the least upper bound. When joining two values, you’ll get exactly the smallest possible value that’s just larger-than-or-equal-to the inputs. We can also apply that to sets. It wouldn’t make sense if the set union would just add extra elements, right?

Now we have constructed a lattice from a partial ordering.

… and back again

But can we also go back? Yes, we can! Any lattice can also be used to define a partial ordering. Buy a lattice, get a partial ordering for free!

The construction is kind of funny. Again, consider two values a and b. Let’s also assume that ab. What is the join of these two elements? As an example, take the sets a = {1, 2} and b = {1, 2, 3}. The answer is that the union of a and b is b itself because a is already included in b.

Turns out, this also holds in general. We can define that ab holds precisely when ab = b.

Let’s try it for another case: a = {1} and b = {2}. Their union is {1, 2}. So, neither is a subset of the other. Neat, right?

We can also write this down in code:

const partialOrderingOfLattice = lattice => ({
  isLeq: (x, y) => deepEqual(lattice.join(x, y), y)
});

const smallSetGen = gen => fc.array(gen, 5).map(elems => new Set(elems));

checkAll(
  contracts.partialOrdering(
    partialOrderingOfLattice(lattices.set),
    smallSetGen(fc.integer())
  )
);

This uses the deepEqual function from Chai because == won’t give the results we want when comparing sets.

But now you may ask, why haven’t you shown the code for the opposite direction? They want to see how to implement a lattice based on a partial ordering, a voice says, the people are becoming restless!

To which I’ll have to say: Sorry, no can do. That tiny little “for all d in M …” in the construction? Hard to implement. Especially because there might be infinitely many values in M.

Now what?

Depending on the context, it may be more convenient to talk about one representation or the other one. And when implementing the algebras, it may yet be more convenient to talk about both of them at the same time! It is completely fine to implement both the join and the ≤ operations for a particular type of values, maybe because that’s more efficient. In the case of sets, both are easily implemented.

But we should now look at some more complicated examples of CRDTs.

What’s in a number?

Let’s consider a counter.

Didn’t I just say something more complicated than sets?

I believe I did. And a distributed counter is more complicated, trust me!

Why? Well, let’s try to figure it out. We have Alice and Bob, as usual. The goal is to track the global value of a counter, starting at the initial value of zero. Both of them can increment, but not decrement.

  1. Alice and Bob both start with the value 0.
  2. Internet connection fails.
  3. Alice increments.
  4. Bob increments.
  5. Internet connection is restored.

Both Alice and Bob have a local value of 1. How are they gonna merge? Just take the maximum? No good, because that means one increment went missing.

The problem is that a single number has not enough structure to contain information about the whereabouts of the increments. We need a way to distinguish Alice’s and Bob’s increments.

Luckily, we can assume that Alice and Bob both have a unique identity. Let’s say Alice is labelled with the string "alice" and Bob with the string "bob". Instead of just tracking a simple number, both of them will now track a map from label to number. The above scenario now reads as follows:

  1. Alice and Bob both start with the value {"alice" → 0, "bob" → 0}.
  2. Internet connection fails.
  3. Alice increments. Her state is {"alice" → 1, "bob" → 0}.
  4. Bob increments. His state is {"alice" → 0, "bob" → 1}.
  5. Internet connection is restored.

Now we can merge the two maps by taking the maximum of each key-value pair. The total value of the counter can be computed by taking the sum of all values in the map. Of course, this only works if all participants agree that they will never increment someone else’s counter directly.

It turns out that we can even simplify this further: there’s no need for each involved party to know about everyone else. They just have to know about their own label:

  1. Alice starts with the value {"alice" → 0}. Bob with "bob", respectively.
  2. Internet connection fails.
  3. Alice increments. Her state is {"alice" → 1}.
  4. Bob increments. His state is {"bob" → 1}.
  5. Internet connection is restored.

When merging a foreign map, we can check for keys that are only present in the other one and copy them unchanged into our map.

Cornelius points out that apart from connection losses, another problem in distributed systems are retransmissions. He writes:

Let’s consider the following scenario:

  1. Alice starts with a counter of value 0.
  2. Alice increments her counter.
  3. Alice sends the increment to Bob.
  4. Internet connection fails and Alice does not receive an acknowledgement from Bob.
  5. Since Alice does not receive an acknowledgement, Alice retransmits the increment to Bob.

Now imagine the failed Internet connection only lost Bob’s acknowledgement and he receives Alice’s increment twice. Will Bob think Alice’s counter is 1 or 2?

Since Alice is sending the message {"alice" → 1} twice, Bob knows that the intended counter value is 1.

This behaviour is a direct consequence of idempotence: receiving Alice’s message twice is the same as receiving it just once.

What’s next?

John “Hannibal” Smith saying: “I love it when an algebraic construction comes together”

We now know how to track a grow-only counter across different machines. This is one of the most basic, but also most useful CRDTs. A possible use case is tracking outgoing or incoming network traffic in a data centre, where each server would keep track of its own traffic and any server can be asked for the total.

However, I haven’t shown you yet how to actually implement the lattice and ordering operations for this data structure. I promise to show you in the next episode. But I can already tell you that I lied to you again: the CRDT literature actually models G-Counters differently, by storing an array of values instead of a map of values. Instead of any old label, each node must have a (positive) integer identity. I don’t like this for two reasons:

  1. new servers always need to be added “at the end” of the current list of servers (which may be hard)
  2. it doesn’t demonstrate how to compose CRDTs with other data structures to form larger CRDTs

The latter is actually what happens here and simplifies the implementation greatly. But it requires a lot more prose, so it’s reserved for the next episode.

References

  1. If you don’t like this monkey business, then tough luck. 

  2. Scala programmers talk about least upper bounds all the time. That’s why their favourite music genre is lubstep. 


Thanks to the people who've read drafts of this series and provided valuable feedback: Andrea, Clement Delafargue, Heiko Seeberger, Hillel Wayne, Johannes Link, Matthew Weidner, Princess.