Today's challenge is an interesting one, and an opportunity to use TensorFlow!

Things mentioned in this post: **cellular automata, tensor, tensorflow, linear algebra, convolution**

# The Challenge Part 1

Link to challenge on Advent of Code 2020 website

The challenge is one in which you make a cellular automata, and let it run until a steady state. Cellular automata are systems made up of individual cells, each of which run a simple set of rules that determine their state for the next "round". The rules tend to involve counting the state of its neighbours. Conway's Game of Life is possibly the most famous example of cellular automata.

## Importing data

We need to import the data, I'm going to use numpy for this, as we did last time, but I'm going to let numpy convert the characters for me:

```
import numpy as np
data = np.loadtxt("input.txt", dtype='U')
bool_data = (data.view('U1') == 'L').reshape((data.size, -1))
```

It's compact, but it's loading the data, converting the `L`

to True (everything else False), and re-shaping the data from lines of strings into a grid.

## TensorFlow

Cellular automata involve running a bit of code across every cell in the grid. What better way to do it than in parallel, on a GPU, a device designed to run some calculation on every pixel in a texture in parallel.

As a python developer, TensorFlow is the obvious choice here, even though we're not going to be training any neural networks (for which it is famous for).

The challenge involves counting neighbours of a cell. A simple convolution with a 3x3 kernel consisting of the eight neighbours will achieve this, but we need to do some padding of the data to account for the edge effects.

```
import tensorflow as tf
height, width = bool_data.shape
padding = tf.constant([[1, 1,], [1, 1]])
neighbors = tf.constant([
[1, 1, 1],
[1, 0, 1],
[1, 1, 1],
])
conv_filt = tf.reshape(neighbors, [3, 3, 1, 1])
init_data = tf.zeros([height, width], dtype=tf.bool)
valid_seats = tf.constant(bool_data, dtype=tf.bool)
```

Here, the padding is created, our neighborhood convolution filter is created, the initial data is created (all zeroes), and also a tensor containing our valid seats as well, which is needed as part of the game rules.

Next, the main loop is implemented:

```
@tf.function
def generate(this_gen):
padded = tf.pad(this_gen, padding)
padded = tf.reshape(padded, [1, height + 2, width + 2, 1])
convolved = tf.nn.convolution(tf.cast(padded, tf.int32), conv_filt)
filled_seats = tf.reshape(convolved, [height, width])
adjacent_empty = tf.math.equal(filled_seats, 0)
less_four_occupied = tf.math.less(filled_seats, 4)
next_gen = tf.math.logical_or(this_gen, adjacent_empty)
next_gen = tf.math.logical_and(next_gen, less_four_occupied)
next_gen = tf.math.logical_and(next_gen, valid_seats)
return next_gen
```

The first part of the function takes the data, and pads it as needed for tf.nn.convolution, while the next one applies the convolution filter, and the returned value, once cleaned up, is a tensor containing the number of filled seats for each cell.

This leaves the rest of the function to apply some math to implement the rules of the game - we calculate a boolean for whether all adjacent cells are empty, and a boolean that indicates whether less than four surrounding seats are occupied. The game's rules simply become: `next_gen = (this_gen | adjacent_empty) & less_four_occupied`

However, this will also create invalid filled seats where there are none, so the final result need to be filtered using `valid_seats`

calculated previously using another AND function.

Finally, we can loop this, checking for when the system reaches a steady state, and count the seats as needed by the challenge:

```
generation = init_data
while True:
next_generation = generate(generation)
if tf.math.reduce_all((tf.equal(generation, next_generation))):
break
generation = next_generation
print("total", tf.math.reduce_sum(tf.cast(generation, tf.int32)))
```

The full code for this part:

```
import numpy as np
import tensorflow as tf
data = np.loadtxt("input.txt", dtype='U')
bool_data = (data.view('U1') == 'L').reshape((data.size, -1))
height, width = bool_data.shape
padding = tf.constant([[1, 1,], [1, 1]])
neighbors = tf.constant([
[1, 1, 1],
[1, 0, 1],
[1, 1, 1],
])
conv_filt = tf.reshape(neighbors, [3, 3, 1, 1])
init_data = tf.zeros([height, width], dtype=tf.bool)
valid_seats = tf.constant(bool_data, dtype=tf.bool)
@tf.function
def generate(this_gen):
padded = tf.pad(this_gen, padding)
padded = tf.reshape(padded, [1, height + 2, width + 2, 1])
convolved = tf.nn.convolution(tf.cast(padded, tf.int32), conv_filt)
filled_seats = tf.reshape(convolved, [height, width])
adjacent_empty = tf.math.equal(filled_seats, 0)
less_four_occupied = tf.math.less(filled_seats, 4)
next_gen = tf.math.logical_or(this_gen, adjacent_empty)
next_gen = tf.math.logical_and(next_gen, less_four_occupied)
next_gen = tf.math.logical_and(next_gen, valid_seats)
return next_gen
generation = init_data
while True:
next_generation = generate(generation)
if tf.math.reduce_all((tf.equal(generation, next_generation))):
break
generation = next_generation
print("total", tf.math.reduce_sum(tf.cast(generation, tf.int32)))
```

## The Challenge Part 2

The next part of the challenge unfortunately requires us to radically rethink our approach - the previous part can be achieved with a simple convolution kernel. But in this part, the kernel for each position looks different.

So we are going to have to pre-compute a kernel for every location. This results in a big 4D stack of kernels, but we can unroll this to regular old matrix-vector multiplication.

The first thing we're going to do is re-import the data, but unroll it to 1D. (Actually thys method was already 1D, we simply don't do the reshape operation)

```
data = np.loadtxt("input.txt", dtype='U')
bool_data= data.view('U1') == 'L'
```

Next, re-compute width/height properly, and change the init data also to 1D:

```
height = data.size
flat_dim = bool_data.size
width = flat_dim//height
init_data = tf.zeros(flat_dim, dtype=tf.bool)
valid_seats = tf.constant(bool_data, dtype=tf.bool)
```

Next, we compute every kernel. This is done by looping over the space, and crawling in a particular direction until we find a valid seat. For each seat position, this effectively creates a matrix in which there is a 1 for the first seat along any of the 8 directions. Each kernel will have up to eight 1s in it, and can have fewer if there aren't any seats in a particular direction.

And we create one of these kernels per position, and they're unrolled for ease of calculation later.

```
kernel_stack = []
for ystart in range(height):
for xstart in range(width):
kernel = np.zeros(flat_dim)
for dx, dy in [[0, 1], [1, 1], [1, 0], [1, -1], [0, -1], [-1, -1], [-1, 0], [-1, 1]]:
x = xstart + dx
y = ystart + dy
while 0 <= x < width and 0 <= y < height:
if valid_seats[y*width+x]:
kernel[y*width+x] = True
break
x += dx
y += dy
kernel_stack.append(kernel)
kernels = tf.constant(np.stack(kernel_stack), dtype=tf.int32)
```

Each kernel, when multiplied with the data, returns a matrix containing just the occupied seats we care about. And in fact, a cross-product between one of the kernels (a row vector) and the data (a column vector) directly yields the number of occupied seats surrounding a given seat.

By creating one kernel per position, and stacking them up into a matrix, we can do a matrix-vector cross product to return a list of occupied neighbours for each cell! This is now the new `generate()`

function:

```
@tf.function
def generate(this_gen):
filled_seats = tf.linalg.matvec(kernels, tf.cast(this_gen, tf.int32))
adjacent_empty = tf.math.equal(filled_seats, 0)
less_five_occupied = tf.math.less(filled_seats, 5)
next_gen = tf.math.logical_or(this_gen, adjacent_empty)
next_gen = tf.math.logical_and(next_gen, less_five_occupied)
next_gen = tf.math.logical_and(next_gen, valid_seats)
return next_gen
```

The call to `tf.linalg.matvec()`

between the kernels and this generation's data returns the number of filled seats surrounding each position. The remaining code is the same as before, generating a couple of booleans for the two game rules, and applying them to create the next generation. The generation data is now in 1D instead of 2D, but TensorFlow doesn't really care.

Finally, to run the code, it's the same as before:

```
generation = init_data
while True:
next_generation = generate(generation)
if tf.math.reduce_all((tf.equal(generation, next_generation))):
break
generation = next_generation
print("total", tf.math.reduce_sum(tf.cast(generation, tf.int32)))
```

The final code for this part:

```
import numpy as np
import tensorflow as tf
data = np.loadtxt("input.txt", dtype='U')
bool_data= data.view('U1') == 'L'
height = data.size
flat_dim = bool_data.size
width = flat_dim//height
init_data = tf.zeros(flat_dim, dtype=tf.bool)
valid_seats = tf.constant(bool_data, dtype=tf.bool)
kernel_stack = []
for ystart in range(height):
for xstart in range(width):
kernel = np.zeros(flat_dim)
for dx, dy in [[0, 1], [1, 1], [1, 0], [1, -1], [0, -1], [-1, -1], [-1, 0], [-1, 1]]:
x = xstart + dx
y = ystart + dy
while 0 <= x < width and 0 <= y < height:
if valid_seats[y*width+x]:
kernel[y*width+x] = True
break
x += dx
y += dy
kernel_stack.append(kernel)
kernels = tf.constant(np.stack(kernel_stack), dtype=tf.int32)
@tf.function
def generate(this_gen):
filled_seats = tf.linalg.matvec(kernels, tf.cast(this_gen, tf.int32))
adjacent_empty = tf.math.equal(filled_seats, 0)
less_five_occupied = tf.math.less(filled_seats, 5)
next_gen = tf.math.logical_or(this_gen, adjacent_empty)
next_gen = tf.math.logical_and(next_gen, less_five_occupied)
next_gen = tf.math.logical_and(next_gen, valid_seats)
return next_gen
generation = init_data
while True:
next_generation = generate(generation)
if tf.math.reduce_all((tf.equal(generation, next_generation))):
break
generation = next_generation
print("total", tf.math.reduce_sum(tf.cast(generation, tf.int32)))
```

The great thing about this is, it's happy running on either a GPU or a CPU

Onward!

## Discussion (0)