Yuan Gao

Posted on Dec 4, 2021

Advent of Code 2021: Day 04 with Python, and numpy masked arrays

#python #adventofcode

Link to challenge on Advent of Code 2021 website

Loading the data

The data comes as a text file with two distinct chunks. The first line is a comma-separated list of numbers that will be used for bingo number draws. The rest of the file contains several bingo grids.

An example of the data is:

7,4,9,5,11,17,23,2,0,14,21,24,10,16,13,6,15,25,12,22,18,20,8,19,3,26,1

22 13 17 11  0
 8  2 23  4 24
21  9 14 16  7
 6 10  3 18  5
 1 12 20 15 19

 3 15  0  2 22
 9 18 13 17  5
19  8  7 25 23
20 11 10 24  4
14 21 16 12  6

14 21 17 24  4
10 16 15  9 19
18  8 23 26 20
22 11 13  6  5
 2  0 12  3  7

The draws can be read easily using a single fp.readline() which reads a single line of text from an open file. We can .split(",") this string to chop it up into individual values, and then we can apply int() to each of those elements (using map()) to turn it back into a number. (the extra list() is needed to turn the iterator returned by map() into an actual list, since we have to use this list twice.

with open("day_4.txt") as fp:
    draws = list(map(int, fp.readline().split(",")))

The bingo boards themselves, we can read in using numpy's np.mat() function, which takes a space/semicolon separated string and turns it into a numpy matrix. For example: np.mat("1 2 3; 4 5 6; 7 8 9") returns a 3x3 matrix with those numbers in it.

So we use fp.read() which by default reads all the lines of the open file as an array of strings, and split it using double newlines: fp.read()[1:-1].split("\n\n") (we have to ignore the first and last entry of the file to avoid the empty lines

This will return an array of board strings, for example:

[
  '90  8  2 34 41\n11 67 74 71 62\n47 42 44  1 17\n21 55 12 91  6\n60 69 75 92 56',
'49 29 60 45 31\n94 51 73 33 67\n21 92 53 95 96\n 2 55 52  8 87\n 4 36 76 83 42', '23 66 50 84 58\n62 98 81 76 57\n24  2 56 79  6\n55  0 16 64 38\n12 67  5 97 60',

From there we can replace the newlines with semicolons, and put it through np.mat() to retrieve a numpy matrix of the data.

The final import code is:

with open("day_4.txt") as fp:
    draws = list(map(int, fp.readline().split(",")))
    boards = [np.mat(board.replace("\n", ";")) for board in fp.read()[1:-1].split("\n\n")]

Part 1

Part 1 asks which board will win at bingo. So we will play bingo, iterating over each draw, and each board, and "marking" the matching values.

Fortunately, numpy has a useful feature called masked_array() which will maintain a matrix or array and a mask, allowing you to easily mask the data or get the sum of unmasked data.

We can construct a masked array using np.ma.masked_array(board) which initializes the mask to all False. We'll use False as meaning "not yet drawn", and True as "drawn".

To work out whether a drawn number draw is inside the mask, we can do board.data == draw (board.data in a masked array is the original unmasked data) this will return a 5x5 matrix of False or True if that element matches. So something like:

matrix([[False, False, False, False, False],
        [False, False, False, False, False],
        [False,  True, False, False, False],
        [False, False, False, False, False],
        [False, False, False, False, False]])

We can use this to directly update the mask, using a boolean operation: board.mask |= board.data == draw. This will perform a binary OR operation between the above, and the current mask. In effect, it'll turn the mask to True for any value matching the draw. As the game progresses, more and more of the mask will be turned to True.

To work out whether we have a win, we can use numpy's sum(axis) method to sum the mask along certain axes. It'll do all five rows or columns at once. so, board.mask.sum(0) will give us how many masked values are in each column. If the mask was the above, then:

board.mask.sum(0):

matrix([[0, 1, 0, 0, 0]])

board.mask.sum(1):

matrix([[0],
        [0],
        [1],
        [0],
        [0]])

It can be seen that if any of these row or column sums reaches 5, then there is a bingo. So to test for both, we can check if any of the values returned is a 0: if np.any(board.mask.sum(0) == 5) or np.any(board.mask.sum(5) == 0):

The code is therefore:

for draw, board in product(draws, [np.ma.masked_array(board) for board in boards]):
    board.mask |= board.data == draw
    if np.any(board.mask.sum(0) == 5) or np.any(board.mask.sum(1) == 5):
        result = board.sum()*draw
        break

Here, we're using itertools.product to combine the for loops for both the draw and the board, this makes it a bit easier to break out of the loop.

The question calls for the sum of the remaining undrawn numbers. multiplied by the last drawn number. Numpy's masked arrays have a sum() method which will sum unmasked numbers directly, so the result can be simply calculated above as board.sum()*draw

Part 2

In part 2, it asks which is the board that will win last if the game continues even if boards have a bingo. To this we can reuse the above, but track which boards have won, and pull out the last board to win.

Only a few modifications are needed:

won_boards = set()
for draw, (idx, board) in product(draws, enumerate([np.ma.masked_array(board) for board in boards])):
    board.mask |= board.data == draw
    if np.any(board.mask.sum(0) == 5) or np.any(board.mask.sum(1) == 5):
        if idx not in won_boards and len(won_boards) == len(boards) -1:
            result = board.sum()*draw
            break
        won_boards.add(idx)

Firstly, boards that have won are tracked in the won_boards set. Next, we also fetch the index number of the board using an enumerate() on the boards. This index is what is added to won_boards if the board wins. Then, the last win can be detected using if idx not in won_boards and len(won_boards) == len(boards) -1:, giving us the result of the last win.

Full Code

import numpy as np
from itertools import product

with open("day_4.txt") as fp:
    draws = list(map(int, fp.readline().split(",")))
    boards = [np.mat(board.replace("\n", ";")) for board in fp.read()[1:-1].split("\n\n")]

for draw, board in product(draws, [np.ma.masked_array(board) for board in boards]):
    board.mask |= board.data == draw
    if np.any(board.mask.sum(0) == 5) or np.any(board.mask.sum(1) == 5):
        result = board.sum()*draw
        break
print("Part 1 result:", result)

won_boards = set()
for draw, (idx, board) in product(draws, enumerate([np.ma.masked_array(board) for board in boards])):
    board.mask |= board.data == draw
    if np.any(board.mask.sum(0) == 5) or np.any(board.mask.sum(1) == 5):
        if idx not in won_boards and len(won_boards) == len(boards) -1:
            result = board.sum()*draw
            break
        won_boards.add(idx)
print("Part 2 result:", result)

DEV Community

Advent of Code 2021: Day 04 with Python, and numpy masked arrays

Loading the data

Part 1

Part 2

Full Code

Top comments (0)

Read next

2024 Developer Advent Calendars

FETCHING ADVENT OF CODE INPUT DYNAMICALLY IN GO

How to Use PySpark for Machine Learning

ChatWithSQL — Secure, Schema-Validated Text-to-SQL Python Library, Eliminating Arbitrary Query Risks from LLMs