This is my third year participating in Advent of Code, but the first using Rust! Since I’m new to the Rust ecosystem, I’ve been dependent on others to steer my third-party library selections. As an example, Day 15 (like most days) presented some interesting string parsing requirements. Luckily, I was guided toward an excellent parser combinator library, affectionately named nom, via Chris Biscardi.
Beacon exclusion zone
The Day 15 challenge requires you to track sensors, beacons, and their coordinates. The raw input for this looks like:
Sensor at x=2, y=18: closest beacon is at x=-2, y=15
Sensor at x=9, y=16: closest beacon is at x=10, y=16
Sensor at x=13, y=2: closest beacon is at x=15, y=3
Sensor at x=12, y=14: closest beacon is at x=10, y=16
Sensor at x=10, y=20: closest beacon is at x=10, y=16
Sensor at x=14, y=17: closest beacon is at x=10, y=16
Sensor at x=8, y=7: closest beacon is at x=2, y=10
Sensor at x=2, y=0: closest beacon is at x=2, y=10
Sensor at x=0, y=11: closest beacon is at x=2, y=10
Sensor at x=20, y=14: closest beacon is at x=25, y=17
Sensor at x=17, y=20: closest beacon is at x=21, y=22
Sensor at x=16, y=7: closest beacon is at x=15, y=3
Sensor at x=14, y=3: closest beacon is at x=15, y=3
Sensor at x=20, y=1: closest beacon is at x=15, y=3
While this text is parsable with regular expressions, or a combination of well-placed string splits, using a parsing library helps break things down in a structured way (which can sometimes be beneficial for part 2 challenges).
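For comparison, here is a rough sketch of what the string-split route might look like using only the standard library. The names Point, parse_xy, parse_line, and parse_all are hypothetical helpers made up for this illustration, and the sketch returns plain coordinate tuples rather than any particular struct.

```rust
/// Hypothetical alias for an (x, y) coordinate pair.
type Point = (i64, i64);

/// Parses a fragment like "x=2, y=18" into (2, 18).
fn parse_xy(s: &str) -> Option<Point> {
    let (x, y) = s.split_once(", ")?;
    let x = x.strip_prefix("x=")?.parse::<i64>().ok()?;
    let y = y.strip_prefix("y=")?.parse::<i64>().ok()?;
    Some((x, y))
}

/// Parses one report line into (sensor, beacon) coordinates.
fn parse_line(line: &str) -> Option<(Point, Point)> {
    let rest = line.strip_prefix("Sensor at ")?;
    let (sensor, beacon) = rest.split_once(": closest beacon is at ")?;
    Some((parse_xy(sensor)?, parse_xy(beacon)?))
}

/// Parses every line of the input; returns None if any line is malformed.
/// `str::lines` splits on both "\n" and "\r\n".
fn parse_all(input: &str) -> Option<Vec<(Point, Point)>> {
    input.lines().map(parse_line).collect()
}
```

This gets the job done for the input above, but each delimiter is handled by hand and a failure only tells you that something, somewhere, didn’t match.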
Presuming we have structs for Sensor and Beacon that look like the ones below, we can start building out the parsing logic.
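// Debug is derived so the parsed values can be printed with dbg! later on.
#[derive(Debug)]
struct Sensor {
    x: i64,
    y: i64,
}

#[derive(Debug)]
struct Beacon {
    x: i64,
    y: i64,
}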
Parsing with Nom
First, we’ll parse out each line of input, along with the part of the line relevant to either a Sensor or a Beacon. Second, we’ll parse out the coordinates and populate them into instances of Sensor and Beacon.
For the first part, everything is contained in a function that takes the raw input as a string slice (&str) and returns an IResult. An IResult is nom’s Result alias for the outcome of a parsing function. On success, the string slice component of an IResult is the remaining unparsed input, and the Vec<(Sensor, Beacon)> is our expected parsing result.
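// Imports shared by `map` and `position` (this assumes nom 7).
use nom::{
    bytes::complete::tag,
    character::complete::{self, line_ending},
    multi::separated_list1,
    sequence::{preceded, separated_pair},
    IResult, Parser,
};

fn map(input: &str) -> IResult<&str, Vec<(Sensor, Beacon)>> {
    let (input, reports) = separated_list1(
        line_ending,
        preceded(
            tag("Sensor at "),
            separated_pair(
                position.map(|(x, y)| Sensor { x, y }),
                tag(": closest beacon is at "),
                position.map(|(x, y)| Beacon { x, y }),
            ),
        ),
    )(input)?;
    Ok((input, reports))
}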
Inside the map function, we start off with separated_list1, which helps us break up the input into lines. The first argument is line_ending, which matches line endings of both the \n and \r\n variety. The second argument starts with preceded, which matches and discards the "Sensor at " tag, then hands the rest of the line to separated_pair. separated_pair in turn parses out what is on either side of the ": closest beacon is at " tag. In this case, those are the coordinate pairs for Sensor and Beacon, respectively. To parse them, we’ll define another function called position.
The position function helps extract the values of coordinate pairs. As you can see, it takes the same &str argument as map and also returns an IResult. However, the types in the IResult are a bit different here. The second type parameter is a tuple for the x and y coordinates, both i64.
fn position(input: &str) -> IResult<&str, (i64, i64)> {
    separated_pair(
        preceded(tag("x="), complete::i64),
        tag(", "),
        preceded(tag("y="), complete::i64),
    )(input)
}
Right away, we jump into separated_pair again. This parses out both sides of the ", " separator, while preceded isolates the value after either x= or y=. The second argument of preceded is another parsing function, character::complete::i64, which matches the coordinate integer value (including a leading minus sign).
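To make the "remaining input plus parsed value" convention more concrete, here is a hand-rolled sketch of the same coordinate parsing using only the standard library. This is not how nom is implemented; ParseResult, literal, integer, and position_by_hand are hypothetical names invented for this illustration, and the error type is just a String.

```rust
/// Hypothetical stand-in for nom's IResult: on success, (remaining input, value).
type ParseResult<'a, T> = Result<(&'a str, T), String>;

/// Matches a fixed prefix (similar in spirit to a tag) and returns the rest.
fn literal<'a>(expected: &str, input: &'a str) -> ParseResult<'a, ()> {
    input
        .strip_prefix(expected)
        .map(|rest| (rest, ()))
        .ok_or_else(|| format!("expected {:?}", expected))
}

/// Parses an optionally negative integer from the front of the input.
fn integer(input: &str) -> ParseResult<'_, i64> {
    // The number ends at the first character that is neither a digit
    // nor a '-' in the very first position.
    let end = input
        .char_indices()
        .find(|&(i, c)| !(c.is_ascii_digit() || (i == 0 && c == '-')))
        .map_or(input.len(), |(i, _)| i);
    let value = input[..end].parse::<i64>().map_err(|e| e.to_string())?;
    Ok((&input[end..], value))
}

/// Mirrors the shape of `position`: "x=2, y=18" becomes (2, 18).
/// Each step consumes some input and passes the remainder to the next.
fn position_by_hand(input: &str) -> ParseResult<'_, (i64, i64)> {
    let (input, _) = literal("x=", input)?;
    let (input, x) = integer(input)?;
    let (input, _) = literal(", ", input)?;
    let (input, _) = literal("y=", input)?;
    let (input, y) = integer(input)?;
    Ok((input, (x, y)))
}
```

Threading the leftover input through every step is exactly the bookkeeping that combinators like separated_pair and preceded take care of for us.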
Coming back to the map function, we (somewhat confusingly) call the map method on the position parser. It comes from nom’s Parser trait and transforms the parser’s output. That allows us to destructure the tuple and use the values to construct the Sensor and Beacon struct literals.
Now, if we use the dbg! macro on the result of a call to map with test input, we should see something like:
map = [
    (
        Sensor {
            x: 2,
            y: 18,
        },
        Beacon {
            x: -2,
            y: 15,
        },
    ),
    (
        Sensor {
            x: 9,
            y: 16,
        },
        Beacon {
            x: 10,
            y: 16,
        },
    ),
    // . . .
]
Look at that beautifully structured data!
Conclusion
Reasonably painless, and well-structured—that’s parsing data with Rust and Nom! If you’re interested in taking a closer look at Nom, I encourage you to review this handy list of its available parsers and combinators.