Robert Mion

Posted on

# Allergen Assessment

## Task: Solve for X where...

``````X = the number of times that each ingredient provably unable to contain any allergens appears amongst all ingredient lists
``````

### Example input

``````mxmxvkd kfcds sqjhc nhms (contains dairy, fish)
trh fvjkl sbzzf mxmxvkd (contains dairy)
sqjhc fvjkl (contains soy)
sqjhc mxmxvkd sbzzf (contains fish)
``````

It represents:

• Several lists of cryptographically encoded ingredients (with some - maybe all - identifiable allergens present in one or more ingredients)

## Before attempting Part 1, I need to understand how the sample answer was determined

Given the example input, it is stated that `kfcds, nhms, sbzzf` and `trh` are the target ingredients: them being the ones that can't possibly contain any of the allergens in any food in the list.

### But why? Let's try to find it.

The instructions state:

• Each allergen is found in exactly one ingredient
• Each ingredient contains zero or one allergen

Ok. Let's target `fish`:

``````mxmxvkd kfcds sqjhc nhms (contains dairy, fish)
sqjhc mxmxvkd sbzzf (contains fish)
``````
• The ingredient that contains fish could be `mxmxvkd` or `sqjhc` because those appear in both lists
• `sbzzf`, `kfcds`, and `nhms` are definitely not fish

How about `dairy`:

``````mxmxvkd kfcds sqjhc nhms (contains dairy, fish)
trh fvjkl sbzzf mxmxvkd (contains dairy)
``````
• The ingredient that contains dairy must be `mxmxvkd`, because it is the only one that appears in both lists

Back to fish:

• If the ingredient can't be `mxmxvkd` - since it must be dairy - then `sqjhc` must contain fish

How about `soy`:

``````sqjhc fvjkl (contains soy)
``````
• If `sgjhc` must be fish, we know fvjkl must be soy

Now we know:

• Which ingredients contain fish, dairy and soy

Looking at the list again, we can identify ingredients that we know have no allergen, and thus are irrelevant

``````Input
mxmxvkd kfcds sqjhc nhms (contains dairy, fish)
trh fvjkl sbzzf mxmxvkd (contains dairy)
sqjhc fvjkl (contains soy)
sqjhc mxmxvkd sbzzf (contains fish)

Knowledge
mxmxvkd sqjhc (are dairy, fish)
fvjkl mxmxvkd (are soy and dairy)
fvjkl (is soy)
sqjhc (is fish)

Process of elimination
kfcds nhms (are irrelevant)
trh sbzzf (are irrelevant)
...
sbzzf (is irrelevant again)
``````

Count up each appearance of the irrelevant ingredients to get 5, just like in the puzzle's description.

Now, how could I describe that process programmatically?

## Part 1 - it's gross, but it works!

``````Step 1: Parse the input
Split the input into an array of strings
For each string in the array
Split the string at the phrase ' (contains ' to create an array with two elements, each a string
Split each string at the space character to create an array of single-word strings
By now, the first array contains each ingredient, and the second array contains each allergen, but with a few quirks:
If there were more than one allergen, the all but the last has a , as the last character
The last allergen as a ) as the last character
Update each string in the second array to remove any , and ) characters from the end

Step 2: Aggregate each ingredient list grouped with an individual allergen
Create an object - called mappings - that will map allergens with ingredient lists
For each parsed ingredients and allergens list from the input
For each allergen
Create a key in the object whose value is an empty array
Add the ingredients list to the allergen key's array

Step 3: Do an initial pass to determine which ingredients are in every list associated with any single allergen
Create an object with keys for each allergen and values initialized as empty arrays
For each allergen key in the object
Set the value equal to the result of the following operations:
For each string in the first ingredient list associated with this allergen, accumulate an array - starting as empty -  of strings that can be found in all ingredient lists associated with the current allergen
Create an empty array that will contain booleans representing whether the current string is in any subsequent ingredient lists associated with the current allergen
For each ingredients list - excluding the first one - associated with the current allergen
If the list contains the current string
Add the boolean true to the array
Else, add the boolean false to the array
If the array contains all true booleans, add the current string to the accumulating array

Step 4: Process of elimination to determine each ingredient-to-allergen pairing
Assuming that Step 3 resulted in at least one ingredient-to-allergen match - meaning one of the resulting arrays contains only one value...
...and assuming that each iteration of the following loop generates a new single-ingredient allergen:
Do the following operations until every allergen's associated array contains only one value:
For each single-ingredient allergen newly discovered after a prior iteration
For all allergens
If their associated array of possible matching ingredients has two or more items remaining, and it includes the ingredient associated with the current single-ingredient allergen
Update that array of that allergen so that the current ingredient is removed from that array

By now, the program should have an object associating each known allergen to a single ingredient

Step 5: Calculate the count of irrelevant ingredient instances across all ingredient list
For each identified ingredient associated with each allergen
Accumulate a shrinking list of ingredients - starting with a combined and flattened list of each ingredient list
Return a filtered list such that all instances of the current ingredient have been removed

Return the length of the accumulated, filtered list
``````

## Building the simulator

• I really wanted to see how my algorithm worked
• It wasn't too hard making the simulator - again, I pulled bits and pieces from previous ones
• The fun part was in rendering each slowly shrinking list of ingredients associated with each allergen

### Try the simulator yourself!

Here's the algorithm working on the example input

Here it is working on my input

## Part 2: easier than expected

• Thankfully, my algorithm from Part 1 set me up for a quick correct answer for Part 2
``````Generate an array of each allergen
Sort it alphabetically
Change each allergen to its matching ingredient
Concatenate each string in the array using a , character
Return the full string
``````

Here's the simulator showing my correct answers to Part 1 and 2

## In summary: 'I did it?!'

Given how confused I was about this challenge's initial set of instructions, I'm beyond impressed with myself that I built a working algorithm and simulator that generate the correct answers for both parts.