Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the first puzzle.
Pre-work
Before actually solving any puzzles, there's a mini warm-up puzzle that lets you access the fictional data set. With that done, it's helpful to explore the data first just to get a feel for the shape/size/relationships. I'm not going to document that here, but in my case the gist was "open with VisiData and poke around a bit."
Warning: From here on, this post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).
Thinking
The first daily puzzle involves finding a customer whose phone number matches their last name as typed on a phone's keypad:
“They came in awhile ago and showed me their business card, and that’s what it said. Skilled Private Investigator. And their phone number was their last name spelled out. I didn’t know what that meant, but apparently before there were smartphones, people had to remember phone numbers or write them down. If you wanted a phone number that was easy-to-remember, you could get a number that spelled something using the letters printed on the phone buttons: like 2 has “ABC”, and 3 “DEF”, etc. And I guess this person had done that, so if you dialed the numbers corresponding to the letters in their name, it would call their phone number!
So...
- Define a phone letter --> number translation
- Find cases where the phone number shows up in the customer name
Doing (VisiData)
Since this was the first puzzle and I already had the data set open in VisiData, I stayed there. I wrote a little plugin a while back that lets me drop into a ptipython REPL that preserves VisiData context and keeps a VisiData-specific history. I find it super helpful for quick code explorations that I intend to throw away - if something is useful, I can adapt bits from the saved history into .visidatarc
snippets or plugins later. So my actual work started there...
- From a REPL, define a translation table and save it to the global
vd
object so it's available in future vd expressions:
phone_pad = {
"abc": 2,
"def": 3,
"ghi": 4,
"jkl": 5,
"mno": 6,
"pqrs": 7,
"tuv": 8,
"wxyz": 9,
}
vd.phone_lookup = str.maketrans(
{letter: str(number) for letters, number in phone_pad.items() for letter in letters}
)
- Add a new column (
=
oraddcol-expr
) based on this expression:
name.lower().replace(' ', '').translate(vd.phone_lookup)
Rename the new column (
^
orrename-col
) to a friendly name likename_translated
Find rows where the phone number appears in the translated name, using
z|
orselect-expr
with this expression:
phone.replace('-', '') in name_translated
- Open a new sheet with only the selected row, using
"
ordup-selected
Refining / Reusing
While I find this "one step at a time" iterative approach very friendly for ad-hoc exploration, sometimes "just this once..." operations keep coming back. So how would I clean this up just enough so that it was more easily reusable?
- First, turn those custom REPL bits into a custom VisiData command, and include it in my
.visidatarc
- either directly or as part of an isolated plugin file. - Then, save my manual steps using VisiData's command log and adapt/replay it in the future.
There's sample code for that in this gist. With the Python bits included or imported into .visidatarc
and a noahs-customers.csv
file in the current directory, vd -p puzzle1.vdj
finds the solution.
Solutions from Elsewhere
I love Dian Fey's approach to this puzzle using Postgres!
Dolmen's solution here leaned into the SQLite approach I snuck up on and rejected in favor of VisiData. It's good to see it up in GitHub for the sake of comparison - pure SQLite is a feature!
Top comments (0)