Frank Blaauw

Advent of Code, but differently

Advent of Code, or AoC, is a well-known event among software engineers. It takes place annually, with participants solving a series of programming puzzles each day from December 1st to December 25th. These puzzles can be solved in any programming language, making it a great opportunity for developers of all levels to practice their skills and try out new languages. Each day's puzzle builds upon the previous one, creating a challenging and cohesive experience for participants.

This year, we (at Researchable) held an internal competition in which extra points were awarded to those who completed a puzzle using COBOL. COBOL, or Common Business-Oriented Language, is a high-level programming language designed for business applications. It was one of the first programming languages to be widely used, and many systems still rely on it.

But that wasn't all. Solutions were also submitted in some unconventional languages, such as HTML and CSS (yes, HTML and CSS, not JavaScript, not SCSS). In this blog post, we will showcase two interesting solutions to some of the first AoC problems: one in HTML and CSS, and the other in COBOL.

As not-so-experienced COBOL developers, we were only able to complete the first few puzzles in COBOL. For the later ones, we used other, though still cool, languages such as Rust, Ruby, Go, and C++. These will not be featured in this blog post.

Solving AoC day 1a in HTML

The day 1 problem of Advent of Code (AoC) gives you a list of groups of numbers, and asks for the sum of the group with the largest total. This problem is surprisingly solvable in just HTML and CSS (no JavaScript, no SCSS or other computation-capable languages). The approach is to first apply a little bit of preprocessing (in the form of find-and-replace commands) to turn the input into valid HTML, and then to apply some CSS styling rules so that the answer can be read in the inspect element devtools of a browser.

Preprocessing the input to HTML

The end goal is to turn each number into a div that we can apply styling to, and to wrap each group of numbers (separated by empty lines) into other divs that we can style. Concretely, I want to turn a list that looks like this:

4235
342
6564

234534
4234

Into this:

<body>
<div class="block">
  <div class="num" style="--hght: 4235px"></div>
  <div class="num" style="--hght: 342px"></div>
  <div class="num" style="--hght: 6564px"></div>
</div>
<div class="block">
  <div class="num" style="--hght: 234534px"></div>
  <div class="num" style="--hght: 4234px"></div>
</div>
</body>

To do this, I am going to use regular expressions in my editor (vim) to first substitute all empty lines with </div><div class="block">:

:%s/^$/<\/div><div class="block">/g

Next, I replace all numbers with <div class="num" style="--hght: __px"></div>, substituting the correct number for the underscores. I used the following substitution command for this:

:%s/\(\d\+\)/<div class="num" style="--hght: \1px"><\/div>/g

What’s left to do is add the rest of the HTML at the beginning and end of the file, such as wrapping everything in the enclosing tags (the <body> element in the example above), and adding the opening tag of the first block and the closing tag of the last block. After this, the preprocessing step is done.
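For readers who do not use vim, the same transformation can be sketched in Python. This is only an illustration of the preprocessing described above, not the method I actually used; the function name `to_html` is made up for this example, and it assumes that groups are separated by exactly one empty line.

```python
def to_html(puzzle_input: str) -> str:
    """Wrap every number in a .num div and every group of numbers in a .block div."""
    lines = ["<body>"]
    # Groups of numbers are separated by an empty line in the puzzle input.
    for group in puzzle_input.strip().split("\n\n"):
        lines.append('<div class="block">')
        for number in group.split():
            # The number becomes the value of the --hght custom property, in pixels.
            lines.append(f'<div class="num" style="--hght: {number}px"></div>')
        lines.append("</div>")
    lines.append("</body>")
    return "\n".join(lines)


sample = "4235\n342\n6564\n\n234534\n4234\n"
print(to_html(sample))
```

Unlike the two substitution commands, this sketch also emits the opening and closing tags for the first and last block, so no manual cleanup is needed afterwards.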

Using CSS to “calculate” the solution

If we use the above HTML without any styling applied, we end up with all divs appearing underneath each other (as is the default for block-level elements). This is almost what we want, as we want the divs within each block div to add up in height. However, we do not want to add up the heights of all block elements. We are only interested in the height of the largest block element. As such, we position the blocks of numbers next to each other. To do so, we make their parent (the body) a flex container, and set the correct height of the number elements as well. Additionally, we make sure that the height of everything grows according to its contents and is not limited to the browser’s viewport:

<style>
  body {
    margin: 0px;
    display: flex;
    flex-direction: row;
    height: fit-content;
  }
  .block {
    height: fit-content;
  }
  .num {
    width: 1px;
    height: var(--hght);
  }
</style>

With this done, we can now read the answer in our browser’s dev tools, as the body height: 68787 pixels, meaning the sum of the largest block is 68787.

[Image: Browser dev tools]
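The reason this works is that the browser is doing two operations for us: the heights of the number divs inside a block stack up (a sum), and a row of blocks is as tall as its tallest block (a maximum). A quick way to cross-check the value read from the dev tools is to compute the same thing directly. The sketch below assumes the same blank-line-separated input; the function name `tallest_block` is hypothetical.

```python
def tallest_block(puzzle_input: str) -> int:
    """Return the largest group total, i.e. the height the body would get."""
    groups = puzzle_input.strip().split("\n\n")
    # Stacked divs add up within a block; the flex row takes the tallest block.
    return max(sum(int(number) for number in group.split()) for group in groups)


sample = "4235\n342\n6564\n\n234534\n4234\n"
print(tallest_block(sample))  # 238768 for the small example list from earlier
```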

Solving AoC day 4 in COBOL

On day 4, the AoC challenge is to determine, for each pair of number ranges, whether the two ranges overlap. The input to this puzzle has the following shape:

5-8,6-7
24-54,8-14
.. many more ..

One problem that COBOL users might be familiar with is that this form of input does not have a fixed width (the number of digits for each number can be different), which makes it significantly more difficult for an uninitiated COBOL user such as myself to parse the input. Again, I resorted to a bit of preprocessing, this time done in Python, to solve my dynamic width problem.

Preprocessing all numbers to have a fixed width

Upon a brief inspection of the problem input, I saw that none of the numbers were extremely large, and that using three digits per number ought to be sufficient to fit everything. To accommodate this, I wrote a small Python script that rewrote my entire input to use three digits.

[Image: COBOL fixed width numbers]

Note that there is actually a small mistake on line 12, which causes the second range to be printed with a comma instead of a dash. This does not affect the COBOL parsing, as COBOL assumes every number to be in a specific location, and ignores all other text. Additionally, let’s forget about the fact that in the same amount of Python code I could also have solved the entire problem :D

After running this script on the input, the resulting preprocessed input (should) have the following shape:

005-008,006-007
024-054,008-014
.. many more ..

This is much simpler to read in a COBOL program. The reason for this will become evident later.
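As a rough illustration of this preprocessing step, here is a minimal sketch that pads every number to three digits. It is not the script described above (and does not reproduce its comma mistake); the function name `pad_numbers` and the default width are assumptions made for this example.

```python
import re


def pad_numbers(line: str, width: int = 3) -> str:
    """Left-pad every run of digits in the line with zeros up to `width` digits."""
    return re.sub(r"\d+", lambda match: match.group().zfill(width), line)


for raw in ["5-8,6-7", "24-54,8-14"]:
    print(pad_numbers(raw))
```

Because only the digit runs are rewritten, the dashes and the comma stay where they are, and every output line ends up exactly 15 characters long.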

The structure of COBOL programs

COBOL programs are split up into several sections and divisions. I don’t exactly know what the difference between a section and a division is. Every COBOL program starts with an identification division, which contains the name of the program, the name of the author, and other useful information that I did not fill in.
The next division is the environment division, which contains sections that describe the IO behavior of the program. This is where I specify that my program reads standard input. Next is the data division, which is where the memory layout of the program is specified. This is where every variable that is used in the program must be declared. Additionally, the file descriptors are described here, which is how the structure of the file that is going to be read is specified. Crucially, this structure must be known at compile time (which was not the case for the unprocessed input). In summary, these divisions combined look like this:

[Image: The structure of COBOL programs]

Note that some things are called student, because I copied a sample COBOL program from a tutorial that was processing student data.

Our program’s memory layout

In COBOL, you have to define the structure of all of your variables ahead of time, which admittedly is quite nice. To accommodate the input of this day’s problem, I created file descriptors and variables of the following form:

[Image: Our program's memory layout]

This roughly means that each line of the input STUDENT-FILE (our standard input) will contain a PIC 9(3) (three digits), followed by a single PIC X (text character), followed by three digits, a text character, three digits, another character, and three more digits. This exactly matches the shape of the 123-456,234-567 inputs that were created earlier. I also create a variable called WS-LINE that has the exact same structure, except that this time the structure describes a variable instead of a file descriptor. In addition to declaring WS-LINE, I also declare WS-COUNT (Working Storage COUNT), WS-EOF (a flag that records whether the input has reached EOF, End Of File), and WS-SURFACETOTAL, which is a helper variable for part 2.
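To make the fixed-width idea concrete, here is what the same layout looks like as a Python sketch: four three-character number fields at fixed positions, with one ignored character between them. The type name `RangePair` and the function name `parse_fixed_width` are made up for this illustration; the field names num1 to num4 follow the formula used later in this post.

```python
from typing import NamedTuple


class RangePair(NamedTuple):
    num1: int  # low end of the first range
    num2: int  # high end of the first range
    num3: int  # low end of the second range
    num4: int  # high end of the second range


def parse_fixed_width(line: str) -> RangePair:
    """Read four 3-digit numbers from fixed columns, skipping the filler characters."""
    # Columns 0-2, 4-6, 8-10 and 12-14 hold the numbers (like four PIC 9(3) fields).
    # Columns 3, 7 and 11 hold one filler character each (like PIC X) and are never inspected.
    return RangePair(int(line[0:3]), int(line[4:7]), int(line[8:11]), int(line[12:15]))


print(parse_fixed_width("005-008,006-007"))
print(parse_fixed_width("024-054,008,014"))  # a wrong delimiter does not matter
```

The second call shows why a comma in place of a dash is harmless: only the positions of the digits matter, never the characters in between.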

The program itself

The program itself consists of a procedure that repeats until we reach EOF, after which a simple count is printed. This is written using the following COBOL code:

[Image: The program itself]

As you can see, COBOL reads just like English which must mean that the code is self-documenting.

The interesting part of COBOL is the READ statement, which can do different things depending on whether the end of the file being read has been reached. In this case, at the end of the file, the WS-EOF flag is set to “Y” (yes), which causes the loop to terminate. If we have not reached the end of the file yet, the CountOverlap procedure is called (which is actually the implementation for part 2). Additionally, note that the line that is being read will be written to the variable WS-LINE, which is where it will be destructured into our 4 numbers and 3 delimiter (placeholder) characters.
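The control flow of this loop can be sketched in Python as follows. This only mirrors the shape of the logic described above (a flag that flips at end of file, and a procedure that runs for every other record); it is not a translation of the actual COBOL code, and the names `process_records` and `handle_record` are hypothetical.

```python
import io
from typing import Callable, TextIO


def process_records(stream: TextIO, handle_record: Callable[[str], None]) -> None:
    """Read records until end of file, calling handle_record for each one."""
    eof = "N"  # plays the role of the WS-EOF flag
    while eof != "Y":
        line = stream.readline()
        if line == "":
            # End of file reached: set the flag so the loop terminates.
            eof = "Y"
        else:
            # Not at the end yet: hand the record to the counting procedure.
            handle_record(line.rstrip("\n"))


seen = []
process_records(io.StringIO("005-008,006-007\n024-054,008-014\n"), seen.append)
print(seen)
```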

Counting included pairs

For part 1 of this AoC, we are going to count the number of range pairs that are included in one another. To do this in our COBOL program, we simply use a few if-statements on our four numbers to see if one range is included in the other.

[Image: Counting included pairs]

Note that COBOL does not have an else-if statement, so I had to nest another IF statement inside of the else branch. Later I learned that COBOL does have an EVALUATE statement which can take multiple branches (like a switch statement in modern languages).
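For comparison, the containment check boils down to two comparisons per direction. The sketch below shows the logic in Python rather than COBOL; the function name `fully_contains` is made up, and num1 to num4 stand for the four numbers on a line (first range num1 to num2, second range num3 to num4).

```python
def fully_contains(num1: int, num2: int, num3: int, num4: int) -> bool:
    """True if one of the two closed ranges lies completely inside the other."""
    first_contains_second = num1 <= num3 and num4 <= num2
    second_contains_first = num3 <= num1 and num2 <= num4
    return first_contains_second or second_contains_first


pairs = [(5, 8, 6, 7), (24, 54, 8, 14), (6, 6, 4, 6)]
count = sum(1 for pair in pairs if fully_contains(*pair))
print(count)  # 2: the first and the third pair
```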

Counting overlapping ranges

For part 2, we have to count the number of ranges that simply overlap a little bit with another range. This can be solved in many ways, but I did not want to think about all of the boundary checks too much. Instead I resorted to a technique used in proofs called the Pigeonhole Principle. I calculated the total size of the range from the lowest low boundary to the highest high boundary, and compared this to the size of the two individual ranges. If the two individual ranges do not fit in the total area of the outer range, then that means that they are “sharing” some space, meaning that they overlap. To do this “elegantly” I use MIN and MAX functions. In a normal programming language this would look something like this: surface = max(num2, num4) - min(num1, num3) + 1. However, in COBOL it looks like the following monstrosity:

[Image: Counting overlapping ranges]

Unfortunately, any function invocation must be prefixed with FUNCTION. Note that the +1 is there because we are dealing with closed intervals, and a range of [6, 6] covers 1 space. For the same reason, the calculation for the individual range size has a +1 on each range (so a +2).
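As a worked example, take the line 5-8,6-7. The outer span is max(8, 7) - min(5, 6) + 1 = 4, while the two ranges together cover (8 - 5 + 1) + (7 - 6 + 1) = 6 spaces. Six spaces do not fit in a span of four, so the ranges must overlap. The Python sketch below shows the same check; the function name `ranges_overlap` is hypothetical, and it assumes that each range is given low end first.

```python
def ranges_overlap(num1: int, num2: int, num3: int, num4: int) -> bool:
    """True if the closed ranges [num1, num2] and [num3, num4] share at least one value."""
    # Size of the span from the lowest low boundary to the highest high boundary.
    surface = max(num2, num4) - min(num1, num3) + 1
    # Combined size of the two ranges; the +1 on each accounts for closed intervals.
    combined = (num2 - num1 + 1) + (num4 - num3 + 1)
    # If the ranges do not fit side by side in the span, they must share some space.
    return combined > surface


print(ranges_overlap(5, 8, 6, 7))     # True
print(ranges_overlap(24, 54, 8, 14))  # False
```

Note that two adjacent ranges such as 2-4 and 5-8 fill the span exactly (7 versus 7), so they are correctly reported as not overlapping.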

Also, note that I have not yet learned how to write functions in COBOL (I don’t even know if they exist), so using return statements is out of the question. Instead everything is done by mutating global variables, and making very sure that they are initialized to something sensible before they are used.
