AJ Kerrigan

Posted on

# Hanukkah of Data 2022 - Puzzle 6

Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the sixth puzzle.

Warning: This post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).

## Thinking

Day 6 brings us a cheapskate!

“She refused to buy a new rug for herself–she said they were way too expensive. She’s always been very frugal, and she clips every coupon and shops every sale at Noah’s Market. In fact I like to tease her that Noah actually loses money whenever she comes in the store.

So we're looking for someone who's buying things on sale...

## Doing (Pandas)

We have the wholesale cost of items, along with the unit price and quantity of each order. That means we can define the profit for each as:

``````df['profit'] = df.qty * (df.unit_price - df.wholesale_cost)
``````

Which means we can see which customer represented the lowest profit (or greatest loss) in aggregate:

``````df.groupby(
['name','phone']
).agg(
{'profit':'sum'}
).sort_values(
by='profit'
``````

Emily, you're a thief!

## Doing (Datasette)

Having a `noahs_order_detail` view with the same structure as my pandas DataFrame makes this conversion pretty smooth:

``````select
name,
phone,
sum(qty * (unit_price - wholesale_cost)) as profit
from
noahs_order_detail
group by
name,
phone
order by
profit
``````