DEV Community

AJ Kerrigan
AJ Kerrigan

Posted on

Hanukkah of Data 2022 - Puzzle 6

Hanukkah of Data is a series of data-themed puzzles, where you solve puzzles to move your way through a holiday-themed story using a fictional dataset. See the introductory post for a bit more detail, but the pitch in my head is "Advent of Code meets SQL Murder Mystery". This post walks through my approach to the sixth puzzle.

Warning: This post unavoidably contains spoilers. If you'd rather do the puzzles on your own first, please close this browser tab and run away :).

Thinking

Day 6 brings us a cheapskate!

“She refused to buy a new rug for herself–she said they were way too expensive. She’s always been very frugal, and she clips every coupon and shops every sale at Noah’s Market. In fact I like to tease her that Noah actually loses money whenever she comes in the store.

So we're looking for someone who's buying things on sale...

Doing (Pandas)

We have the wholesale cost of items, along with the unit price and quantity of each order. That means we can define the profit for each as:

df['profit'] = df.qty * (df.unit_price - df.wholesale_cost)
Enter fullscreen mode Exit fullscreen mode

Which means we can see which customer represented the lowest profit (or greatest loss) in aggregate:

df.groupby(
  ['name','phone']
).agg(
  {'profit':'sum'}
).sort_values(
  by='profit'
).head()
Enter fullscreen mode Exit fullscreen mode

Emily, you're a thief!

Doing (Datasette)

Having a noahs_order_detail view with the same structure as my pandas DataFrame makes this conversion pretty smooth:

select
  name,
  phone,
  sum(qty * (unit_price - wholesale_cost)) as profit
from
  noahs_order_detail
group by
  name,
  phone
order by
  profit
Enter fullscreen mode Exit fullscreen mode

Top comments (0)