Garrett Mills

Posted on Dec 9, 2022 • Originally published at garrettmills.dev

Generalized Commutative Data-Types

#parallel #programming #distributedsystems

Disclaimer: after I started writing about this, I found this paper from the Hydro project which presents a formulation of this idea using lattices & morphisms. What follows is my derivation of a similar technique, albeit significantly less formal. As far as I can tell, the Hydro paper does not separate out "pseudo-commutative operations", instead opting to form reactive values which are re-computed as the pseudo-commutative (PC) operations are applied.

A "commutative data type" is one whose value is modified by a set of operations whose execution order is irrelevant. Such data types are useful in distributed & parallel systems which employ accumulator-style execution (i.e. many jobs perform a calculation then merge their result into a single, shared value).

What follows is a formulation of such a data type along with the structures required to define various operations over it.

Naïve Commutative Data Types: A First Draft

Begin with a base value v of some type T.

There are n many jobs which act in parallel to perform commutative operations on v.

A set of operations on v is commutative if, for all operations a and b of type (T -> T) in that set, a (b v) = b (a v).

Each of the jobs produces an operation of type (T -> T), and these operations are collected.

The result is a value (v : T) and a set of commutative operations (list (T -> T)).

The list of operations is applied to the value, chained, producing a final value v'.

For example, if we have the list ((v -> v+1) :: (v -> v+4) :: nil) and a base value of 0, resolving the value gives (v -> v+4) ((v -> v+1) 0) = 5.

Importantly, because the order of the operations is irrelevant, we can apply the operations as they are received by the reducer (the piece of software accumulating the result), rather than collecting them all at once.

This allows for efficient reduction of a shared result variable by many distributed parallel jobs.
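As a minimal sketch of this naïve scheme (written in Python rather than the notation above, with `resolve` and `ops` as hypothetical names), the reducer below folds opaque `T -> T` operations into the base value one at a time, and checks that every arrival order resolves to the same result:

```python
from functools import reduce
from itertools import permutations


def resolve(base, operations):
    """Apply each opaque (T -> T) operation to the accumulator in turn."""
    return reduce(lambda acc, op: op(acc), operations, base)


# Three additive operations, as three parallel jobs might produce them.
ops = [lambda v: v + 1, lambda v: v + 4, lambda v: v + 10]

# Because the operations commute, every arrival order gives the same value.
results = {resolve(0, order) for order in permutations(ops)}
print(results)  # {15}
```

Since the order never matters here, the reducer does not need to buffer anything: it can apply each operation the moment it arrives.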

Pseudo-Commutative Operations

Some operations, however, are not purely commutative. An example of this is multiplication.

If we introduce a job which produces a multiply operation into the above example, the list of operations is no longer commutative (herein referred to as asymmetric, or inconsistent).

However, the operation of multiplication is distributive over addition, in the sense that the following are equivalent (TFAE):

c * (a + b)
(c * a) + (c * b)

Or, perhaps more interestingly for our case, TFAE:

d * (a + b + c)
(d * c) + (d * (a + b))

For example, say we receive the following operations in the following order. A C: prefix denotes a commutative operation, and a P: prefix denotes a pseudo-commutative operation:

C: (v -> v + 1)
C: (v -> v + 2)
P: (v -> v * 2)
C: (v -> v + 3)

If the base value of v is 0, we find that the "consistent" result should be (0 + 1 + 2 + 3) * 2 = 12.

(v -> v + 1) 0  => 1
(v -> v + 2) 1  => 3
(v -> v * 2) 3  => 6

(v -> v + 3) 6  => 9   (incorrect)
(v -> (v + 3) * 2) 6  => 18   (incorrect)
(v -> v + (3 * 2)) 6  => 12   (correct)

Because the commutative operation is opaque, there is no way of "pushing" the pseudo-commutative operation into the subsequent commutative operations, resulting in asymmetric results.
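To make the inconsistency concrete, here is a small Python sketch (the names `opaque_ops` and `resolve` are illustrative, not from any library) that applies the same four opaque operations in every possible arrival order and collects the distinct results:

```python
from itertools import permutations

# The four operations from the example, as opaque closures.
opaque_ops = [
    lambda v: v + 1,
    lambda v: v + 2,
    lambda v: v * 2,
    lambda v: v + 3,
]


def resolve(base, operations):
    """Apply the operations to the accumulator strictly in arrival order."""
    acc = base
    for op in operations:
        acc = op(acc)
    return acc


# Only additions that arrive before the multiply get doubled,
# so the result depends on where the multiply lands.
outcomes = sorted({resolve(0, order) for order in permutations(opaque_ops)})
print(outcomes)  # [6, 7, 8, 9, 10, 11, 12]
```

Only the orderings in which the multiplication arrives last produce the consistent value of 12.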

To address this, we re-define our naïve commutative operations like so:

A commutative operation is a pair of the form (T, T -> T -> T) where the first element is the right-operand to a commutative binary operation. The second element is a function which takes the current accumulated value and the right operand and returns the new accumulated value.

This structure removes the opacity of the right operand in the operation, allowing us to push the pseudo-commutative operation into subsequent commutative operations.

We similarly re-define pseudo-commutative (PC) operations to have the form (T, T -> T -> T).

Now, the same example using the new structure:

C: (1, l -> r -> l + r)
C: (2, l -> r -> l + r)
P: (2, l -> r -> l * r)
C: (3, l -> r -> l + r)

This results in:

(l -> r -> l + r) 0 1  => 1
(l -> r -> l + r) 1 2  => 3
(l -> r -> l * r) 3 2  => 6
(l -> r -> l + r) 6 ((l -> r -> l * r) 3 2)  => 12

This is the fundamental insight of pseudo-commutative operations: if they are folded into the operands of all subsequent operations applied to the accumulator, the relative ordering of commutative and pseudo-commutative operations is irrelevant (provided that the correct pseudo-commutative folds are performed).
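One way this folding could be implemented is sketched below in Python. The `Reducer` class, its method names, and the `"C"`/`"P"` tags are assumptions made for illustration. The reducer remembers every pseudo-commutative operation it has applied and pushes each later commutative operand through them before combining it with the accumulator:

```python
from itertools import permutations


class Reducer:
    def __init__(self, base):
        self.acc = base
        self.pc_seen = []  # (operand, fn) of PC operations applied so far

    def commutative(self, operand, fn):
        # Push every earlier PC operation into the incoming right operand.
        for pc_operand, pc_fn in self.pc_seen:
            operand = pc_fn(operand, pc_operand)
        self.acc = fn(self.acc, operand)

    def pseudo(self, operand, fn):
        # Apply to the accumulator now, and remember it for later operands.
        self.acc = fn(self.acc, operand)
        self.pc_seen.append((operand, fn))


def add(l, r):
    return l + r


def mul(l, r):
    return l * r


stream = [("C", 1, add), ("C", 2, add), ("P", 2, mul), ("C", 3, add)]


def run(ops, base=0):
    reducer = Reducer(base)
    for kind, operand, fn in ops:
        if kind == "C":
            reducer.commutative(operand, fn)
        else:
            reducer.pseudo(operand, fn)
    return reducer.acc


print(run(stream))  # 12
print({run(order) for order in permutations(stream)})  # {12}
```

With a single kind of pseudo-commutative operation (multiplication), every arrival order of the example resolves to 12.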

Pseudo-commutative operations can even be chained to arrive at similarly-consistent results:

C: (2, l -> r -> l + r)
P: (2, l -> r -> l * r)
P: (3, l -> r -> l * r)
C: (2, l -> r -> l + r)

The expected result here is (0 + 2 + 2) * 2 * 3 = 24, and is computed as:

(l -> r -> l + r) 0 2  => 2
(l -> r -> l * r) 2 2  => 4
(l -> r -> l * r) 4 3  => 12
(l -> r -> l + r) 12 ((l -> r -> l * r) ((l -> r -> l * r) 2 2) 3)  => 24

Another added benefit of this representation is the lack of specialization of the operations. Both commutative and pseudo-commutative operations can be represented as generic functions over two parameters, and those functions reused for each operation.
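As an illustration of that reuse, the Python sketch below represents every operation in the chained example with just two generic functions from the standard `operator` module. The fold-based `step` and `resolve` helpers and the tuple layout are hypothetical choices, not part of the formulation above:

```python
import operator
from functools import reduce


def step(state, op):
    """Advance (accumulator, applied PC operations) by one received operation."""
    acc, pcs = state
    kind, operand, fn = op
    if kind == "P":
        return fn(acc, operand), pcs + [(operand, fn)]
    # Fold all earlier PC operations into the commutative operand first.
    folded = reduce(lambda r, pc: pc[1](r, pc[0]), pcs, operand)
    return fn(acc, folded), pcs


def resolve(base, ops):
    return reduce(step, ops, (base, []))[0]


# operator.add and operator.mul are shared by every operation of their kind.
chained = [
    ("C", 2, operator.add),
    ("P", 2, operator.mul),
    ("P", 3, operator.mul),
    ("C", 2, operator.add),
]

print(resolve(0, chained))                  # 24
print(resolve(0, list(reversed(chained))))  # 24
```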

Another GCDT: Sets

We will further formulate a GCDT over sets. A value of type (set T) is a collection of distinct, non-ordered values of type T.

A set has a characteristic commutative operation: append (or, more generally, union). Because sets have no order, the order in which unions are applied is irrelevant.

We use the ∪ operator to represent set union. So, A ∪ B is the union of sets A and B. For unions, the right operand is clear.

An operation, therefore, may be something like:

(B, l -> r -> l ∪ r)

Sets also have a clear pseudo-commutative operation: map (or set comprehension, if you prefer). This is the operation of applying a function (T1 -> T2) to every element in a set, resulting in a set of type set T2.

We represent set comprehension with the map function, which is of the form: map :: (T1 -> T2) -> set T1 -> set T2.

Here's an example, assuming we start with a base value v = {} (the empty set):

C: ({1}, l -> r -> l ∪ r)
C: ({1, 2}, l -> r -> l ∪ r)
P: ({}, l -> _ -> map (* 2) l)
C: ({3, 4}, l -> r -> l ∪ r)

Interestingly, map is a pseudo-commutative operation, but it is unary. To fit the structure, we implement it as a binary operation, but ignore the right operand, since it is always the one specified by the PC operation itself.

The expected result here is map (* 2) ({1} ∪ {1, 2} ∪ {3, 4}) = {2, 4, 6, 8}, and is computed as:

(l -> r -> l ∪ r) {} {1}  => {1}
(l -> r -> l ∪ r) {1} {1, 2}  => {1, 2}
(l -> _ -> map (* 2) l) {1, 2} {}  => {2, 4}
(l -> r -> l ∪ r) {2, 4} ((l -> _ -> map (* 2) l) {3, 4} {})  => {2, 4, 6, 8}
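The same trace can be reproduced with a short Python sketch over `frozenset` values. The helper names (`union`, `double_all`, `resolve`) are assumptions for illustration; as in the text, the map is written as a binary function that ignores its right operand:

```python
from itertools import permutations


def union(l, r):
    return l | r


def double_all(l, _):
    # map (* 2): binary to fit the structure, but the right operand is unused.
    return frozenset(x * 2 for x in l)


def resolve(base, ops):
    acc, pcs = frozenset(base), []
    for kind, operand, fn in ops:
        operand = frozenset(operand)
        if kind == "P":
            acc = fn(acc, operand)
            pcs.append((operand, fn))
        else:
            # Push earlier maps into the incoming set before the union.
            for pc_operand, pc_fn in pcs:
                operand = pc_fn(operand, pc_operand)
            acc = fn(acc, operand)
    return acc


ops = [
    ("C", {1}, union),
    ("C", {1, 2}, union),
    ("P", set(), double_all),
    ("C", {3, 4}, union),
]

print(sorted(resolve(set(), ops)))  # [2, 4, 6, 8]
# Every arrival order yields the same set.
print(len({resolve(set(), order) for order in permutations(ops)}))  # 1
```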

Pseudo-Commutative Operation Precedence

Now, let's introduce another pseudo-commutative operation over sets: filter (or set subtraction). Set subtraction removes all elements in the right operand from the left operand. For example, {1, 2, 3} - {2} = {1, 3}.

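This can similarly be implemented using a predicate of type (T -> bool): an element is removed from a set of type (set T) if the predicate returns true for it. In the examples below, filter (< 5) therefore removes every element less than 5.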

Based on the properties defined above, we can apply set subtraction in an example:

C: ({1, 4, 7}, l -> r -> l ∪ r)
P: ({}, l -> _ -> filter (< 5) l)
C: ({2, 5, 8}, l -> r -> l ∪ r)

The expected result here is filter (< 5) ({1, 4, 7} ∪ {2, 5, 8}) = {5, 7, 8}, and is computed as:

(l -> r -> l ∪ r) {} {1, 4, 7}  => {1, 4, 7}
(l -> _ -> filter (< 5) l) {1, 4, 7} {}  => {7}
(l -> r -> l ∪ r) {7} (filter (< 5) {2, 5, 8})  => {5, 7, 8}
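Pushing the filter into a later operand works because this kind of filter distributes over union. A minimal Python check follows, assuming the semantics used by the worked example (elements matching the predicate are removed) and the hypothetical helper names `remove_if` and `below_five`:

```python
def remove_if(pred, values):
    """Drop every element for which pred is true, as in the worked example."""
    return frozenset(x for x in values if not pred(x))


def below_five(x):
    return x < 5


a = frozenset({1, 4, 7})
b = frozenset({2, 5, 8})

# Filtering the merged set...
all_at_once = remove_if(below_five, a | b)
# ...matches filtering the accumulator and the late operand separately.
pushed = remove_if(below_five, a) | remove_if(below_five, b)

print(sorted(all_at_once))  # [5, 7, 8]
print(all_at_once == pushed)  # True
```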

Something problematic happens when we combine the two pseudo-commutative operators, however:

C: ({1, 4, 7}, l -> r -> l ∪ r)
P: ({}, l -> _ -> filter (< 5) l)
P: ({}, l -> _ -> map (* 2) l)
C: ({2, 5, 8}, l -> r -> l ∪ r)

Depending on whether we filter then map or map then filter, we arrive at {10, 14, 16} or {8, 10, 14, 16}, an asymmetric result. Unlike commutative operations, pseudo-commutative operations are not necessarily commutative with each other. Thus, the order in which pseudo-commutative operations are applied matters a great deal.
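The two outcomes can be checked directly. In this Python sketch (function names are illustrative), the filter removes elements less than 5, matching the example, and the two pseudo-commutative operations are composed in both orders over the merged set:

```python
def remove_below_five(values):
    # filter (< 5), with the removal semantics used by the example.
    return frozenset(x for x in values if not x < 5)


def double_all(values):
    # map (* 2)
    return frozenset(x * 2 for x in values)


merged = frozenset({1, 4, 7}) | frozenset({2, 5, 8})

filter_then_map = double_all(remove_below_five(merged))
map_then_filter = remove_below_five(double_all(merged))

print(sorted(filter_then_map))  # [10, 14, 16]
print(sorted(map_then_filter))  # [8, 10, 14, 16]
```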

To resolve this inconsistency, we can require pseudo-commutative operations to be orderable: for a set of pseudo-commutative operations s1, there exists a list s2 of those same operations which has the form { s_i | s_i in s1 and forall j < i, s_i > s_j }.

This gives precedence to pseudo-commutative operations, allowing their order to be resolved when they are "pushed" into subsequent commutative operands, but how do we handle the case when a greater PC operation is received after a lesser PC operation is processed?

One approach to this is to specify the inverse of an operation, allowing it to be efficiently re-ordered.

For example, say we have an initial value v0 and a PC operation ({}, pc1, pc1') (where pc1' inverts pc1). If a subsequent PC operation with a greater precedence is applied, ({}, pc2, pc2'), we compute the accumulator like so:

v = v0
v = pc1 v {}
v = pc1 (pc2 (pc1' v {}) {}) {}

This approach has a few benefits:

First, by inverting and re-applying operations on-the-fly, we avoid the need to re-compute the accumulator all the way from the initial value. Instead, we only need to re-compute the operations which were PC and of a lower priority.
Second, because of this rewinding approach, you will never have to rewind a PC operation of equal or greater precedence, as the operations of lesser precedence will always be "closest" to the end of the chain.
Finally, commutative operations need not be re-applied during a rewind. Instead, the resultant value is treated as a pre-existing member of the set to be re-computed, since the commuted operation is preserved through the inverse of the PC operations.
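The rewinding scheme could look roughly like the Python sketch below. Everything named here (`PCOp`, `RewindingReducer`, integer precedences) is hypothetical. It assumes, following the pc1/pc2 example, that a PC operation of greater precedence must be applied before one of lesser precedence, and it uses two injective maps so that explicit inverses exist:

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable


@dataclass(frozen=True)
class PCOp:
    precedence: int  # assumption: greater precedence is applied earlier
    apply: Callable[[frozenset], frozenset]
    invert: Callable[[frozenset], frozenset]


class RewindingReducer:
    def __init__(self, base=()):
        self.value = frozenset(base)
        self.applied = []  # PC ops in application order (precedence descending)

    def commutative(self, operand):
        # Push every applied PC op into the operand, then union it in.
        folded = frozenset(operand)
        for op in self.applied:
            folded = op.apply(folded)
        self.value |= folded

    def pseudo(self, op):
        # Rewind only the lesser-precedence ops at the end of the chain.
        rewound = []
        while self.applied and self.applied[-1].precedence < op.precedence:
            lesser = self.applied.pop()
            self.value = lesser.invert(self.value)
            rewound.append(lesser)
        self.value = op.apply(self.value)
        self.applied.append(op)
        # Re-apply the rewound ops in their original order.
        for lesser in reversed(rewound):
            self.value = lesser.apply(self.value)
            self.applied.append(lesser)


double = PCOp(1, lambda s: frozenset(x * 2 for x in s),
              lambda s: frozenset(x // 2 for x in s))
increment = PCOp(2, lambda s: frozenset(x + 1 for x in s),
                 lambda s: frozenset(x - 1 for x in s))

events = [("C", {1, 2}), ("P", double), ("P", increment), ("C", {3})]


def run(order):
    reducer = RewindingReducer()
    for kind, payload in order:
        if kind == "C":
            reducer.commutative(payload)
        else:
            reducer.pseudo(payload)
    return reducer.value


print(sorted(run(events)))  # [4, 6, 8]
print(len({run(order) for order in permutations(events)}))  # 1
```

Here the consistent result is "increment, then double" over {1, 2, 3}. Note that commutative operands are never replayed: when `increment` arrives late, only `double` is inverted and re-applied.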

However, there are a few drawbacks:

Depending on the order in which the PC operations are received, the reducer may be forced to perform a suboptimal number of re-computations.
Fundamentally, some PC operations will lack easily computable inverses (for example, map sqrt).

This last case is perhaps the most serious drawback to this approach, but it also has a fairly simple solution.

Because the entire domain of a PC operation is known when the operation is applied, we can trivially define an inversion of the operation by building a map from the range -> domain and storing that after the PC is applied (we call this "auto-inversion").

This will require updating the mapping as the PC is applied to subsequent commutative operations, but such updates are considered relatively minor overhead.

This allows us to auto-invert any PC operation. The trade-off here is between time and space complexity.
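A minimal Python sketch of auto-inversion for a set map is shown below. The `AutoInvertedMap` class is a hypothetical name, and storing a set of preimages per image is an added assumption so that non-injective functions (such as squaring) can still be rewound:

```python
class AutoInvertedMap:
    """Wraps a mapped function and records image -> preimages as it is applied."""

    def __init__(self, fn):
        self.fn = fn
        self.preimages = {}  # image value -> set of domain values seen so far

    def apply(self, values):
        out = set()
        for x in values:
            y = self.fn(x)
            # Update the stored inverse mapping on every application,
            # including applications to later commutative operands.
            self.preimages.setdefault(y, set()).add(x)
            out.add(y)
        return frozenset(out)

    def invert(self, values):
        out = set()
        for y in values:
            out |= self.preimages[y]
        return frozenset(out)


square = AutoInvertedMap(lambda x: x * x)

acc = square.apply({-2, 2, 3})  # applied to the accumulator
late = square.apply({5})        # pushed into a later commutative operand

print(sorted(acc | late))                 # [4, 9, 25]
print(sorted(square.invert(acc | late)))  # [-2, 2, 3, 5]
```

The stored dictionary grows with the number of distinct domain elements seen, which is the space cost weighed against an explicit inverse function.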

In cases where the domain operand is small, but the inverse operation complex or impossible to define, defining the inverse as a mapping is more efficient.

However, in cases where the domain operand is large, the resultant auto-inverse may require a large amount of memory. In these cases, if the inverse operation is efficiently computable, defining an inverse function is more efficient.

Applications

The motivation for this thought exercise came from Swarm: a modular & massively-parallel distributed programming language I've been building w/ Ethan Grantz for the past year.

Swarm provides set-enumeration constructs which are natively parallelized and shared variables whose synchronization is handled by the runtime.

However, the language still relies on the developer to avoid asymmetric operations. For example:

enumeration<number> e = [1, 2, 3, 4, 5];
shared number acc = 0;

enumerate e as n {
    if ( n % 2 == 0 ) {
        acc += n;
    } else {
        acc *= n;
    }
}

This example is somewhat contrived, but it is easy to see that the order in which the enumerate body executes for each element of e determines the value of acc.

This example could be made consistent by treating acc as the initial value of a GCDT of type number, with each execution of the body submitting one of two operations:

-- If n % 2 == 0:
C: (n, l -> r -> l + r)

-- Else:
P: (n, l -> r -> l * r)

Then, using the method described above, this result is always consistent, regardless of the order in which the jobs are executed.
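To illustrate, here is a Python simulation of that idea (not Swarm code; `naive` and `run` are hypothetical names). `naive` mimics the original loop body applied in a given job order, while `run` treats even elements as commutative additions and odd elements as pseudo-commutative multiplications that are folded into later operands:

```python
from itertools import permutations


def naive(order, base=0):
    """The original loop body, applied in whatever order the jobs finish."""
    acc = base
    for n in order:
        acc = acc + n if n % 2 == 0 else acc * n
    return acc


def run(order, base=0):
    """The same jobs, submitted as C (add) and P (multiply) operations."""
    acc, multipliers = base, []
    for n in order:
        if n % 2 == 0:
            operand = n
            for m in multipliers:  # fold earlier multiplies into the operand
                operand *= m
            acc += operand
        else:
            acc *= n
            multipliers.append(n)
    return acc


e = [1, 2, 3, 4, 5]

print(naive(e), naive(list(reversed(e))))        # 50 14
print({run(order) for order in permutations(e)})  # {90}
```

Under this formulation the consistent value is (0 + 2 + 4) * 1 * 3 * 5 = 90, whichever order the enumeration jobs complete in.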

This post originally appeared on my blog, here.
