Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the sixth puzzle.
Warning: This post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).
Thinking
Day 6 brings us a cheapskate!
“She refused to buy a new rug for herself–she said they were way too expensive. She’s always been very frugal, and she clips every coupon and shops every sale at Noah’s Market. In fact I like to tease her that Noah actually loses money whenever she comes in the store.
So we're looking for someone who's buying things on sale...
Doing (Pandas)
We have the wholesale cost of items, along with the unit price and quantity of each order. That means we can define the profit for each as:
df['profit'] = df.qty * (df.unit_price - df.wholesale_cost)
Which means we can see which customer represented the lowest profit (or greatest loss) in aggregate:
df.groupby(
['name','phone']
).agg(
{'profit':'sum'}
).sort_values(
by='profit'
).head()
Emily, you're a thief!
Doing (Datasette)
Having a noahs_order_detail
view with the same structure as my pandas DataFrame makes this conversion pretty smooth:
select
name,
phone,
sum(qty * (unit_price - wholesale_cost)) as profit
from
noahs_order_detail
group by
name,
phone
order by
profit
Top comments (0)