The second challenge of the Advent of Code 2022 is pretty straightforward with SQL. In summary the task is to use some starting values and transform those into a numeric value using a lookup table, and then calculate the sum of all the values you get. If you are familiar with relational databases this should sound like a
JOIN operation to get the lookup value and a
GROUP BY to get the results.
The background story is that you are playing Tic-Tac-Toe with the elves. You are given an encrypted strategy guide that you have to follow if you want to win.
Let's start importing the input. As yesterday I've pasted the input values in a query and then I'm using
STRING_SPLIT to move everything into a more comfortable table:
declare @input varchar(max) = 'B Y A Y B Z ... A Y'; drop table if exists dbo.ch02_input; with cte as ( select replace(value, char(10), '') as [round] from string_split(@input, char(13)) ) select identity(int, 1, 1) as id, left([round], 1) as [opponent], right([round], 1) as [player] into [ch02_input] from cte;
Full script is available on GitHub here: day-02/00-setup.sql
In part one you have to assign to each shape a value. I built the set on the fly using the Row Constructors
select * from (values ('A', 'Rock', 1), ('Y', 'Paper', 2), ('B', 'Paper', 2), ('X', 'Rock', 1), ('C', 'Scissors', 3), ('Z', 'Scissors', 3) ) decode(code, [shape], [value])
and then all I had to do was join the above set with the input table, to convert the shapes into the associated value. I stored the result into the
#result temporary table.
The last step to complete the task is to calculate if I won, tied, or lost each round. While I'm sure there are better ways to do that, given that the number of combinations is extremely limited, I went for a super simple solution, using the
CASE statement (I'm really all in for KISS approach):
select *, case when opponent_shape = player_shape then 3 -- Tie when opponent_shape = 'Rock' and player_shape = 'Paper' then 6 -- Won when opponent_shape = 'Rock' and player_shape = 'Scissors' then 0 -- Lost when opponent_shape = 'Paper' and player_shape = 'Rock' then 0 -- Lost when opponent_shape = 'Paper' and player_shape = 'Scissors' then 6 -- Won when opponent_shape = 'Scissors' and player_shape = 'Paper' then 0 -- Lost when opponent_shape = 'Scissors' and player_shape = 'Rock' then 6 -- Won end as outcome from #rounds;
Now, to calculate the overall score I did I just need to sum all my games:
select sum(player_value + outcome) from #results
Part 1, done. Find the full script here: day-02/01-part1.sql
In part two you discover that you didn't really decoded the original encrypted strategy guide. If fact, the X, Y and Z letter tells you not which shape you should play, but what should be the outcome of that game: "X means you need to lose, Y means you need to end the round in a draw, and Z means you need to win."
To solve the challenge then, I only needed to transform the X, Y and Z into the related A, B and C, based on the given logic. Again, with a CASE statement is pretty easy:
select e.g.*, case when player = 'Y' then opponent -- Must tie when player = 'X' then -- Must lose case opponent when 'A' then 'C' when 'B' then 'A' when 'C' then 'B' end when player = 'Z' then -- Must win case opponent when 'A' then 'B' when 'B' then 'C' when 'C' then 'A' end end as player_decoded from dbo.ch02_input as eg order by eg.id;
With these results I can just then apply the same queries used in Part One to calculate the round results and then get the overall result points. The full script for part two is here: day-02/02-part2.sql
With such small datasets performances are almost never an issue. If you were to use a much larger dataset, say 100 times bigger than this, I would suggest three things to make sure you'll get the best performances possible
- If you can, use numbers - integers - as identifiers. Those are much faster when aggregations are required. Strings are really expensive from a CPU perspective.
- If you cannot use numbers as identifiers for any reason, make sure you create columns or operate on string using a binary collation. That will make string comparisons (and thus aggregations) much faster as the engine doesn't have to take into account casing, accents and so on. A collation like
Latin1_General_BIN2is your friend when a string is used as id. (Binary Collations, Colummn-level Collations, Expression-level Collations)
- Use the columnstore indexes whenever you need to boost the aggregation performance.