DEV Community

AJ Kerrigan
AJ Kerrigan

Posted on • Updated on

Hanukkah of Data 2022 - Puzzle 1

Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the first puzzle.

Pre-work

Before actually solving any puzzles, there's a mini warm-up puzzle that lets you access the fictional data set. With that done, it's helpful to explore the data first just to get a feel for the shape/size/relationships. I'm not going to document that here, but in my case the gist was "open with VisiData and poke around a bit."

Warning: From here on, this post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).

Thinking

The first daily puzzle involves finding a customer whose phone number matches their last name as typed on a phone's keypad:

“They came in awhile ago and showed me their business card, and that’s what it said. Skilled Private Investigator. And their phone number was their last name spelled out. I didn’t know what that meant, but apparently before there were smartphones, people had to remember phone numbers or write them down. If you wanted a phone number that was easy-to-remember, you could get a number that spelled something using the letters printed on the phone buttons: like 2 has “ABC”, and 3 “DEF”, etc. And I guess this person had done that, so if you dialed the numbers corresponding to the letters in their name, it would call their phone number!

So...

  • Define a phone letter --> number translation
  • Find cases where the phone number shows up in the customer name

Doing (VisiData)

Since this was the first puzzle and I already had the data set open in VisiData, I stayed there. I wrote a little plugin a while back that lets me drop into a ptipython REPL that preserves VisiData context and keeps a VisiData-specific history. I find it super helpful for quick code explorations that I intend to throw away - if something is useful, I can adapt bits from the saved history into .visidatarc snippets or plugins later. So my actual work started there...

  • From a REPL, define a translation table and save it to the global vd object so it's available in future vd expressions:
phone_pad = {
    "abc": 2,
    "def": 3,
    "ghi": 4,
    "jkl": 5,
    "mno": 6,
    "pqrs": 7,
    "tuv": 8,
    "wxyz": 9,
}

vd.phone_lookup = str.maketrans(
    {letter: str(number) for letters, number in phone_pad.items() for letter in letters}
)
Enter fullscreen mode Exit fullscreen mode
  • Add a new column (= or addcol-expr) based on this expression:
name.lower().replace(' ', '').translate(vd.phone_lookup)
Enter fullscreen mode Exit fullscreen mode
  • Rename the new column (^ or rename-col) to a friendly name like name_translated

  • Find rows where the phone number appears in the translated name, using z| or select-expr with this expression:

phone.replace('-', '') in name_translated
Enter fullscreen mode Exit fullscreen mode
  • Open a new sheet with only the selected row, using " or dup-selected

Refining / Reusing

While I find this "one step at a time" iterative approach very friendly for ad-hoc exploration, sometimes "just this once..." operations keep coming back. So how would I clean this up just enough so that it was more easily reusable?

  • First, turn those custom REPL bits into a custom VisiData command, and include it in my .visidatarc - either directly or as part of an isolated plugin file.
  • Then, save my manual steps using VisiData's command log and adapt/replay it in the future.

There's sample code for that in this gist. With the Python bits included or imported into .visidatarc and a noahs-customers.csv file in the current directory, vd -p puzzle1.vdj finds the solution.

Solutions from Elsewhere

I love Dian Fey's approach to this puzzle using Postgres!

Dolmen's solution here leaned into the SQLite approach I snuck up on and rejected in favor of VisiData. It's good to see it up in GitHub for the sake of comparison - pure SQLite is a feature!

Top comments (0)