Advent of Code 2020 - Day 16

In the spirit of the holidays (and programming), I’ll be posting my solutions to the Advent of Code 2020 puzzles here, at least one day after they’re posted (no spoilers!). I’ll be implementing the solutions in R because, well, that’s what I like! What I won’t be doing is posting any of the actual answers, just the reasoning behind them.

Also, as a general convention, whenever the puzzle has downloadable input, I’m saving it in a file named input.txt.

Day 16 - Ticket Translation

Find the problem description HERE.

Part One - No Money? No Ticket!

This puzzle got me right in my comfort zone: manipulating tabular data! Granted, that’s probably not the only (or even necessarily best) way to solve this puzzle, but at this point I’m frankly incapable of not wanting to use dplyrfor one of these. Now’s my chance!

Also a first for me in these puzzles (I think): a closure! If you’re not familiar with closures in R, it’s what you get when you have a function that returns a function. The returned function takes the environment of the function that produced it along for the ride. In this case, each function produced by field_test_function() carries the vectors for range_one andrange_two with it in its enclosing environment, which is mighty convenient.

library(stringr)
library(tidyr)
library(dplyr)
library(dequer)

test_input <- c(
  "class: 1-3 or 5-7",
  "row: 6-11 or 33-44",
  "seat: 13-40 or 45-50",
  "", 
  "your ticket:",
  "7,1,14",
  "", 
  "nearby tickets:",
  "7,3,47",
  "40,4,50",
  "55,2,20",
  "38,6,12"
)

real_input <- readLines('input.txt')

# Given a line from the input for a field test (i.e., in the format of 
# "class: 1-3 or 5-7"), returns a closure that accepts a number and 
# returns TRUE if the number matches one of the defined ranges from the
# input and FALSE if it does not.
field_test_function <- function(line) {
  nums <- as.numeric(str_extract_all(line, '\\d+', simplify = TRUE))
  range_one <- nums[1]:nums[2]
  range_two <- nums[3]:nums[4]

  function(n) {
    (n %in% range_one) | (n %in% range_two)
  }
}

# Given a list of input lines `input`, returns an environment containing
# a list of functions for testing fields `field_tests`, a numeric vector
# containing the values from your ticket `my_ticket`, and a list containing
# the values from all the other tickets `nearby_tickets`
parse_input <- function(input) {
  env <- new.env() # Container environment
  env$field_tests <- list() # List of functions for testing fields
  env$my_ticket <- numeric(0) # Vector for 'my ticket' numbers
  nearby_tickets <- stack() # Stack to hold other ticket numbers
  mode <- 'tests' # Indicates what the for loop does with current line

  # For each line in the input lines...
  for (line in input) {
    if (line == "") { next } # Skip blank lines

    # Set `mode` to parse my ticket data
    if (line == 'your ticket:') { mode <- 'my ticket'; next }

    # Set `mode` to parse numbers from other tickets
    if (line == 'nearby tickets:') { mode <- 'other tickets'; next }

    # The default, for the first set of input lines create a function for
    # each that will test a number to see if it falls within the given ranges
    if (mode == 'tests') { 
      name <- str_extract(line, '^[\\w\\s]+(?=:)')
      env$field_tests[[name]] <- field_test_function(line)
    }

    # For 'my ticket', just get the numbers
    if (mode == 'my ticket') {
      env$my_ticket <- as.numeric(unlist(strsplit(line, ',')))
    }

    # For 'other tickets', get the numbers and push them onto the 
    # `nearby_tickets` stack
    if (mode == 'other tickets') {
      ticket_nums <- as.numeric(unlist(strsplit(line, ',')))
      push(nearby_tickets, ticket_nums)
    }
  }

  env$nearby_tickets <- as.list(nearby_tickets) # Stack to list
  env # Return the environment
}

# Given a list of vectors containing the field data from nearby tickets 
# `nearby_tickets` and a list of tests to determine whether a number satisfies
# the ranges for a given field `field_tests`, builds and returns a data frame
# where each row represents the results of testing single field in a single 
# ticket for a match against the rules for a named field. So, one row per test, 
# per field, per ticket. Includes columns that represent a unique identifier for 
# the ticket `ticket_no`, the order in which that field appears on the ticket 
# `field_no`, the name of the field being tested for `field`, whether the 
# value of that field matched the test for the named field `match`, and the
# value of the field being examined `field_val`
get_field_matches <- function(nearby_tickets, field_tests) {

  # Structure for the resulting data frame
  all_results <- data.frame(
    ticket_no = numeric(0) , field_no = numeric(0),
    field = character(0), match = logical(0),
    field_val = numeric(0)
  )

  # For each vector in `nearby_tickets`...
  for (ticket_no in 1:length(nearby_tickets)) {
    # Shorthand reference to ticked field values
    field_vals <- nearby_tickets[[ticket_no]]  

    # Check each field value against each field test function, convert the
    # results into a data frame
    matches <- sapply(field_tests, function(f) { f(field_vals) }) %>% 
      as.data.frame() %>% 
      mutate(
        ticket_no = ticket_no,
        field_no = row_number(),
        field_val = field_vals
      ) %>% 
      pivot_longer(
        -c('ticket_no', 'field_no', 'field_val'), 
        names_to = 'field', 
        values_to = 'match'
      )

    all_results <- bind_rows(all_results, matches) # Append to our data frame
  }

  all_results # Return the data frame
}

input_environment <- parse_input(real_input) # Unpack the input

# Parse the input data into a data frame indicating matches against the field
# test functions
field_matches <- get_field_matches(
  input_environment$nearby_tickets, 
  input_environment$field_tests
) 

# Determine the answer:
# - Group the data frame by `ticket_no`, `field_no`, and `field_val`
# - For each group, sum the number of fields where `match` == TRUE
# - Keep only records where no matches were found
# - Extract the `field_val` column as a vector
# - Sum the contents of the `field_val` column
answer1 <- field_matches %>% 
  group_by(ticket_no, field_no, field_val) %>% 
  summarise_at('match', sum) %>% 
  filter(match == 0) %>% 
  pull(field_val) %>% 
  sum()

Ah… Just look at those magrittr pipes… Lovely. Just in case you haven’t seen them before (unlikely if you’ve used R for very long), the %>%s are ‘pipe’ operators. They pass the result from their left-hand side as the first argument to the function (or expression) on their right-hand side.

Part Two - There Can Be Only One

This one took me a bit to realize that it wasn’t because I’d done something wrong that many of the fields appeared to satisfy more than one validation rule. Once I realized that was on purpose (and that initially one of the fields satisfied only a single validation rule), I was off to the races! So, identify the field(s) that satisfy only a single rule, assign that field the name of the field represented by that validation rule, remove that field name from the options, then do it again.

source('exercise_1b.R')

test_input <- c(
  "class: 0-1 or 4-19",
  "row: 0-5 or 8-19",
  "seat: 0-13 or 16-19",
  "",
  "your ticket:",
  "11,12,13",
  "",
  "nearby tickets:",
  "3,9,18",
  "15,1,5",
  "5,14,9"
)

input_environment <- parse_input(real_input) # Unpack the input

# Parse the input data into a data frame indicating matches against the field
# test functions
field_matches <- get_field_matches(
  input_environment$nearby_tickets, 
  input_environment$field_tests
) 

# Get a data frame containing only valid tickets:
# - Group by `ticket_no`, `field_no`, and `field_val`
# - Mark any fields containing invalid data with `invalid_field` = TRUE
# - Group by `ticket_no`
# - Mark any tickets containing invalid fields with `exclude` = TRUE
# - Ungroup and remove any tickets containing invalid fields
valid_ticket_data <- field_matches %>% 
  group_by(ticket_no, field_no, field_val) %>% 
  mutate(invalid_field = !any(match)) %>% 
  group_by(ticket_no) %>% 
  mutate(exclude = any(invalid_field)) %>% 
  ungroup() %>% 
  filter(!exclude)

# Get a data frame consisting of one row per `field_no`, per `field` name where
# all fields in that number matched a particular field test (for example, all
# values in `field_no` 1 matched the 'arrival_platform' field):
# - Group by `field_no` and `field`
# - Identify groups where the `field_no` matched a particular field test every time
# - Filter, keeping only those groups
identify_fields <- valid_ticket_data %>% 
  group_by(field_no, field) %>% 
  summarise(all_matched = all(match)) %>% 
  filter(all_matched)

# Create an empty data frame with columns for `field_no` and `field`
confirmed_fields <- data.frame(field_no = numeric(0), field = character(0))

# Until all records in `identify_fields` have been transferred to 
# `confirmed_fields`
while (nrow(confirmed_fields) < length(unique(identify_fields$field))) {

  # Identify fields that only match one set of field rules, sequentially. On the
  # first pass, many of the `field_no`s will satisfy more than one field test, 
  # so we start by transferring the `field_no`s that only match one test to
  # `confirmed_fields`:
  # - Starting with `identify_fields`
  # - Remove any records containing a `field_no` that has already been confirmed
  # - Group by `field`
  # - Calculate the number of records in the group as `group_size`
  # - Filter, keeping only `field`s where the `group_size` is 1, indicating that
  # that `field_no` only matches that field's requirements
  # - Select the `field_no` and `field` columns
  newly_confirmed <- identify_fields %>% 
    filter(!(field_no %in% confirmed_fields$field_no)) %>% 
    group_by(field) %>% 
    mutate(group_size = n()) %>% 
    filter(group_size == 1) %>% 
    select(field_no, field)

  # Add `newly_confirmed` rows to `confirmed_fields`
  confirmed_fields <- bind_rows(confirmed_fields, newly_confirmed)
}

# Starting with the data frame of confirmed field names/numbers, filter the 
# data frame to only 'departure' fields and pull out the `field_no` column
# as a vector
departure_field_nos <- confirmed_fields %>% 
  filter(str_detect(field, 'departure')) %>% 
  pull(field_no)

# Now that we know which `field_no`s (i.e., ticket vector indices) belong to 
# the 'departure' fields, just select those values from `my_ticket` and 
# multiply together
answer2 <- prod(input_environment$my_ticket[departure_field_nos])

Wrap-Up

This one was fun! It was a bit more code than some of the others, but it was so worth it. Plus, I got to use a bit of ‘functional programming’ with my closure (that’s my story and I’m sticking to it). I normally try not to lean too heavily on external packages for these, but I happily relied on dplyr andmagrittr.

If you found a different solution (or spotted a mistake in one of mine), please drop me a line!