DEV Community

loading...

Advent Of Code 2020 in SQL – Day 7: “Handy Haversacks”

pesse profile image Samuel Nitsche Originally published at cleandatabase.wordpress.com on ・3 min read

Day 7’s puzzle comes with a bunch of rules about bags that contain other bags – and a lot of different colors (very cool puzzle setting btw!)

As always, the first thing we want to do is to normalize the input in a way that gives us a workable result set. In this case I aim to have a row for each child-rule, groupable by the bag’s name.

If we want to split a column into several rows, one of best approaches (if we cannot easily unpivot) is to use recursive subqueries. They seemed complicated to me at first, but once I used them several times I really get to like them:

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  )
select * from children_in_rows;
Enter fullscreen mode Exit fullscreen mode

It’s important we normalize the “bags” in the bagcolumn to “bag”, because we have rules that say “contain 1 other bag.” and we won’t be able to match that with the “bags”.

The first task now is to wander upwards and count how many different (parent) bag-colors we have.

This might be possible with connect by prior, too, but I noticed I have way more control about how exactly my recursion behaves if I use recursive subquery again:

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  ),
  bag_hierarchy( bag, child, hierarchy_level, path ) as (
    select
      bag,
      child,
      1,
      bag
    from children_in_rows
    where child like '%shiny gold bag%'
    union all
    select
      cur_bag.bag,
      cur_bag.child,
      parent_bag.hierarchy_level+1,
      cur_bag.bag || '/' || parent_bag.path
    from children_in_rows cur_bag
      inner join bag_hierarchy parent_bag on cur_bag.child like '%'||parent_bag.bag||'%'
  )
select count(distinct bag)
from bag_hierarchy
Enter fullscreen mode Exit fullscreen mode

Part 2 now wants us to walk down the tree and count all the bags inside of the shiny gold bag – and the bags inside the bags etc.

Therefore we need to extract the number from the text – easy with regular expressions. We also replace “no other bags.” with 0 so we can calculate properly.

I found it easier to calculate the cumulated children of each bag (on parent-level) rather than the cumulated number on child-level.

with
  base_data as (
    select
      column_value line
    from table(
      aoc_file_loader.file_as_stringlist(
        aoc_file_loader.local_url('day_7_input.txt')
      )
    )
  ),
  bag_children as (
    select
      replace(
        regexp_replace(line, '^(.+) contain (.+)$', '\1'),
        'bags',
        'bag'
        ) bag,
      regexp_replace(line, '^(.+) contain (.+)$', '\2') children
    from base_data
  ),
  children_in_rows (bag, child, children) as (
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from bag_children
    union all
    select
      bag,
      nvl(substr(children, 1, instr(children, ', ')-1), children),
      case when instr(children, ', ') > 0 then
        substr(children, instr(children, ', ')+2)
      else null end
    from children_in_rows where children is not null
  ),
  children_with_count as (
    select
      bag,
      to_number(
        replace(
          regexp_replace(child, '^([0-9]+) (.*)$', '\1'),
          'no other bags.',
          '0'
          )
        )child_count,
      regexp_replace(child, '^([0-9]+) (.*)$', '\2') child_name
    from children_in_rows
  ),
  bag_hierarchy( bag, child_name, child_count, cumulated_child_count, hierarchy_level, path ) as (
    select
      bag,
      child_name,
      child_count,
      child_count,
      1,
      bag
    from children_with_count
    where bag like '%shiny gold bag%'
    union all
    select
      cur_bag.bag,
      cur_bag.child_name,
      cur_bag.child_count,
      parent_bag.cumulated_child_count*cur_bag.child_count,
      parent_bag.hierarchy_level+1,
      parent_bag.path || '/' || cur_bag.bag
    from children_with_count cur_bag
      inner join bag_hierarchy parent_bag on parent_bag.child_name like '%'||cur_bag.bag||'%'
  )
select sum(cumulated_child_count)
from bag_hierarchy
Enter fullscreen mode Exit fullscreen mode

This was quite a challenge and I had several errors (mainly from the missing “one some-colored bag.” connection) – hierarchical queries are something I don’t do that often. But it was very funny and I learned and refreshed a lot!

You can find the whole code on my github-repository.

Discussion (0)

Forem Open with the Forem app