Ben Steadman

Posted on Apr 21, 2019 • Edited on Jun 17, 2019

Advent of Code 2018 Day 4: Repose Record

#aoc2018 #python #puzzle #challenge

This post originally appeared on steadbytes.com

Complete solution can be found on GitHub

Part One

The puzzle input contains records of the sleep patterns of each guard during their midnight hour shifts. We are asked to find the ID of the guard who is asleep the most, the minute of the hour (0->59) which the guard is asleep and submit the product of these values as the answer.

We are told that the records are not in chronological order. After sorting, however, the input has a repeating structure of blocks of records for each guard:

[<timestamp>] Guard <ID> begins shift
<---------repeat------------>
[<timestamp>] falls asleep
[<timestamp>] wakes up
<-------------------------->

Utilising this, we obtain the following steps for a solution:

Sort the records chronologically
Iterate over each record
Each time a 'begins shift' record is encountered:
- Extract the guard ID
- For each subsequent pair of 'falls asleep', 'wakes up' records:
  - Extract the minute values from the timestamps
  - Increment a counter for the current guard by the total number of minutes between the two values
  - Increment a counter for each minute value for the current guard for each minute in the range of the two minute values
Find the guard ID with the maximum total minutes asleep counter
Find the maximum minute counter for that guard
Return the product of the ID and minute

Parsing The Input

Our solution above requires chronologically sorted records. This sounds more complex than it actually is due to a few key properties:

Each record starts with a timestamp
Timestamps are unique
Timestamps follow the same [YYYY-MM-DD HH:MM] format

These mean that chronological order, in this case, is the same as lexicographic order. In python, calling list.sort() on a list of strings will achieve this. So, to initially parse our input:

Split the puzzle input on newlines into a list of strings
Sort the list lexicographically

with open("input.txt") as f:
    records = f.read().split("\n") # split into list of strings
    records.sort() # sort lexicographically

Counting Minutes Asleep

First, we need to identify each 'type' of record - 'begins shift', 'falls asleep' and 'wakes up':

if "begins shift" in record:
    # extract guard ID
elif "falls asleep" in record:
    # extract timestamp etc
elif "wakes up" in record:
    # extract timestamp etc

Extracting Guard ID's

Once a 'begins shift' record has been identified, we can extract the ID of the guard using a regular expression:

import re

id_re = re.compile(r"#(\d+)")

def parse_id(record):
    m = id_re.search(record)
    guard_id = int(m.group(1))

parse_id("[1518-11-01 00:00] Guard #10 begins shift")
# 10

Interactive demo and explanation of this regex

Extracting Minutes from Timestamps

Each timestamp has a [YYYY-MM-DD HH:MM] format. We only need the MM (minutes) portion of this. Handily (and I suspect deliberately), there is a single : separator between the minutes and the rest of the timestamp and a single ] denoting the end of the timestamp. Therefore, to extract the minutes we need to split the record on a ] to get the timestamp and action ('falls asleep' etc) and then split the timestamp on the : character:

def parse_minute(record):
    timestamp, action = record.split("] ")
    minute = int(timestamp.split(":")[1]

parse_minute("[1518-11-01 00:05] falls asleep")
# 5

Implementing the Algorithm

Now we can retrieve the pieces of information needed from the records, it's time to implement the previously outlined algorithm. The first part involves building two data structures:

Total number of minutes each guard spent asleep
Number of times each guard was asleep during a particular minute value (0->59)

from collections import defaultdict

# {guard_id: {minute_value: count, ...}, ...}
guard_minute_asleep_counts = defaultdict(lambda: defaultdict(int))
# {guard_id: total_minutes_asleep, ...}
guards_asleep_totals = defaultdict(int)

current_guard = None
asleep_minute = None
for r in records:
    minute = parse_minute(r)
    if "begins shift" in r: # new guard begins shift
        current_guard = parse_id(r)
        asleep_minute = None
    elif "falls asleep" in r:
        asleep_minute = minute # start of a sleep cycle
    elif "wakes up" in r:
        for m in range(asleep_minute, minute): # end of a sleep cycle
            guard_minute_asleep_counts[current_guard][m] += 1
            guards_asleep_totals[current_guard] += 1

The guard with the most minutes asleep is now the guard ID corresponding to the maximum value in the guard_asleep_totals dictionary:

most_asleep_guard = max(guards_asleep_totals, key=guards_asleep_totals.get)

The minute for which this guard was most aslep in is the minute value corresponding to the maximum value of the guard_asleep_minute_counts for the guard:

most_asleep_minute = max(
    guard_minute_asleep_counts[most_asleep_guard],
    key=guard_minute_asleep_counts[most_asleep_guard].get,
)

Finally, our answer is simply the product of these two values:

answer = most_asleep_guard * most_asleep_minute

For my puzzle input, the answer is 151754.

Part Two

Next, we are asked to find the guard which is most frequently asleep on the same minute and again provide the product of the guard ID and the minute value.

Sort the records chronologically
Iterate over each record
Each time a 'begins shift' record is encountered:
- Extract the guard ID
- For each subsequent pair of 'falls asleep', 'wakes up' records:
  - Extract the minute values from the timestamps
  - Increment a counter for each minute value for the current guard for each minute in the range of the two minute values
Find the guard ID with the maximum minute value counter
Return the product of the ID and minute value

Notice how similar this algorithm is to part one? The record iteration and data extraction logic is exactly the same, essentially all that has changed is the action taken when a 'wakes up' record is found. Your DRY sense should be tingling here as this looks like a great opportunity to abstract out some repeated logic. First, I will show the algorithm implementation and then go through refactoring it.

from collections import defaultdict

# {(guard_id, minute_value): count, ...}
guard_minute_asleep_counts = defaultdict(int)

current_guard = None
asleep_minute = None
for r in records:
    minute = parse_minute(r)
    if "begins shift" in r: # new guard begins shift
        current_guard = parse_id(r)
        asleep_minute = None
    elif "falls asleep" in r:
        asleep_minute = minute # start of a sleep cycle
    elif "wakes up" in r:
        for m in range(asleep_minute, minute): # end of a sleep cycle
            guard_minute_asleep_counts[(current_guard, m)] += 1
guard_id, minute = max(
    guard_minute_asleep_counts, key=guard_minute_asleep_counts.get
)
answer = guard_id * minute

For my puzzle input, the answer is 819896.

DRY Refactoring

The logic for iterating records and extracting guard IDs and timestamps is the same between both parts. As mentioned earlier, the only difference is the action taken when a 'wakes up' record is found:

# part 1
for m in range(asleep_minute, minute):
    guard_minute_asleep_counts[current_guard][m] += 1
    guards_asleep_totals[current_guard] += 1

# part 2
for m in range(asleep_minute, minute):
    guard_minute_asleep_counts[(current_guard, m)] += 1

Both require the ID of the current guard that has woken up and the range of minutes which the guard was asleep for and nothing else! To refactor this, we can use a callback function executed when a 'wakes up' record is found which receives the current guard ID and range of minutes asleep as arguments:

def process_records(records, on_wake_up):
    current_guard = None
    asleep_minute = None
    for r in records:
        if not r:
            continue
        minute = parse_minute(r)
        if "begins shift" in r:
            current_guard = parse_id(r)
            asleep_minute = None
        elif "falls asleep" in r:
            asleep_minute = minute
        elif "wakes up" in r:
            on_wake_up(current_guard, range(asleep_minute, minute)) # callback

Part one and two both utilise this process_records function, providing their own callbacks to implement their required logic:

def part_1(records):
    guard_minute_asleep_counts = defaultdict(lambda: defaultdict(int))
    guards_asleep_totals = defaultdict(int)

    # callback
    def on_wake_up(current_guard, asleep_minutes_range):
        for m in asleep_minutes_range:
            guard_minute_asleep_counts[current_guard][m] += 1
            guards_asleep_totals[current_guard] += 1

    process_records(records, on_wake_up)

    most_asleep_guard = max(guards_asleep_totals, key=guards_asleep_totals.get)
    most_asleep_minute = max(
        guard_minute_asleep_counts[most_asleep_guard],
        key=guard_minute_asleep_counts[most_asleep_guard].get,
    )
    return most_asleep_guard * most_asleep_minute

def part_2(records):
    guard_minute_asleep_counts = defaultdict(int)

    # callback
    def on_wake_up(current_guard, asleep_minutes_range):
        for m in asleep_minutes_range:
            guard_minute_asleep_counts[(current_guard, m)] += 1

    process_records(records, on_wake_up)

    guard_id, minute = max(
        guard_minute_asleep_counts, key=guard_minute_asleep_counts.get
    )
    return guard_id * minute

DEV Community