Kirk Shillingford

Posted on Dec 24, 2022

From Script to Scaffold in F#

#fsharp #beginners #adventofcode #showdev

Introduction

This year I've been attempting Advent of Code in my favourite programming language, F#. This is a beginner(ish)-focused post about making incremental changes, from the smallest possible solution to something more robust.

What is F# (F Sharp)

If you've never heard of it before, F# is a functional-first programming language in the Microsoft .NET ecosystem, used for general-purpose programming, web development, data analysis, and everything in between. It compiles to the same intermediate language and runs on the same Common Language Runtime as all other .NET languages, meaning it can interop with any C# package or library and make full use of the .NET ecosystem.

More importantly (to me), it's a delightfully expressive language that has just about every semantic pattern I enjoy in a modern programming language, like algebraic data types, powerful pattern matching capabilities, immutability as a default, and a generally terse, lightweight syntax, which makes both functional and object-oriented patterns a delight to use.

What is Advent of Code

If you're unfamiliar with it, Advent of Code (or AOC as we'll be referring to it from now on) is a yearly programming challenge that takes place from the first to the 25th of December. Each day a themed coding puzzle is released, in increasing order of difficulty. Programmers from all over the world attempt to solve these puzzles, with varying degrees of competitiveness and rigour. For example, as of writing this in 2022, over two hundred thousand participants have taken part.

I have zero interest in the hyper-competitive aspect of AOC; my current life does not afford me the opportunity to do programming challenges in the wee hours of the night. However, this year, after beginning the way I normally do AOC, writing one-off scripts each day, I began feeling frustrated with how much time I spent erecting structures to make solutions possible, vs actually solving problems.

Earlier this year, I wrote about scripting and automation in VSCode using TypeScript as part of an ongoing project to reduce time-consuming repetitive actions in my professional life. So this seemed like a perfect opportunity to do something similar in a more targeted domain. Additionally, I'd seen a few examples of AOC runners floating around (pre-built libraries and templates that aim to streamline the mechanics of participating in AOC). There are a few solutions in the .NET ecosystem, but they're all in C#!

There's nothing wrong with C#, mind you, but I'm here this year to write F#, not C#. So without any more preamble, I'd like to share some of the things I've implemented so far and the trajectory for my AOC runner and solution.

Warning: I've used snippets from past and current solutions as examples for the article. It shouldn't be enough to provide meaningful spoilers for anyone, but if your goal is to try the problems and you're worried about any sort of outside input, you've been warned.

Step 0 - The current state of things

One of the things I love about F# is its power as a scripting language, and with the start of this year's AOC, I took full advantage of this. With access to the F# interactive REPL, my initial pattern was to simply:

Create each day as a .fsx file called DayXX.fsx.
Each AOC problem is split into two parts, so after completing each portion, I would simply print my solution to console.

let data = """

...

"""

let day10part1solution =
    data
    |> Seq.map (Seq.fold tokenize (List.singleton Empty))
    |> Seq.choose (Seq.tryFindBack isInvalid)
    |> Seq.sumBy invalidValue

...

let day10part2solution =
    data
    |> Seq.map (Seq.fold tokenize (List.singleton Empty))
    |> Seq.filter (Seq.forall (not << isInvalid))
    |> Seq.choose (Seq.tryFindBack isIncomplete)
    |> Seq.map ((Seq.unfold deTokenize >> Seq.map toNum2) >> Seq.fold score 0L)
    |> middle

printfn $"{day10part1solution}, {day10part2solution}"

You might be able to see the obvious problems with this approach over time.

I manually copy and paste all inputs from AOC into the file itself, which is clunky and error-prone. It also means more hot swapping of comments when doing tests.
Toggling the output I focus on involves modifying an interpolated string
Because the solutions are values and not functions, unless commented out, they're both calculated anyway, which is an issue for solutions involving a heavier amount of calculations.
You can't see it, but because I just make a new script for each day, I'm doing a lot of rewriting of useful patterns. I could make yet another script for those things I want to repeat, but at that point, I may as well just overhaul the entire solution, which is exactly what we're gonna do.
Lastly, .fsx files aren't ideal for using official testing frameworks like xUnit, FsUnit, or Unquote, etc.
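The point about values versus functions can be seen in a small sketch. Everything here (the expensive function, the counter, the sample data) is hypothetical and only stands in for a heavy solution; it is not taken from my actual scripts.

```fsharp
// Hypothetical stand-in for an expensive solution; the counter records how often it runs.
let mutable evaluations = 0

let expensive (input: int list) =
    evaluations <- evaluations + 1
    List.sum input

let data = [ 1; 2; 3 ]

// A value: the right-hand side runs once, as soon as this binding is reached.
let part1Value = expensive data

// A function: nothing runs until it is called with an argument.
let part1 input = expensive input

printfn "evaluations after definitions: %d" evaluations // 1 (only the value ran)
printfn "part1 data = %d" (part1 data)                   // 6
printfn "evaluations after the call: %d" evaluations     // 2
```

Defining the parts as functions means a slow part only costs time when it is explicitly asked for.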

Step 1 - Switching to solutions

Firstly we can easily use the dotnet command line interface to scaffold a solution with a new project for all our solutions.

dotnet new sln -o AOC
cd AOC
dotnet new console -lang "F#" -o AOC2022
dotnet sln add AOC2022/AOC2022.fsproj

mkdir AOC2022.Tests
cd AOC2022.Tests
dotnet new xunit -lang "F#"
dotnet add reference ../AOC2022/AOC2022.fsproj
cd ..
dotnet sln add AOC2022.Tests/AOC2022.Tests.fsproj

There's a little bit going on but essentially these commands just:

Make a new solution called AOC
Add a project called AOC2022 that contains our code for our actual solutions
Make a project called AOC2022.Tests that'll consume our AOC code and run tests on our solutions.

Now for our .fsx files. First, we'll place them all in our new AOC2022 folder as part of the project and we'll change them to .fs files (which is the file type dotnet can compile as part of a proper project). Additionally, we'll make a few tweaks.

module Day03

... 

let part1 data =
    data
    |> Seq.map (Seq.fold tokenize (List.singleton Empty))
    |> Seq.choose (Seq.tryFindBack isInvalid)
    |> Seq.sumBy invalidValue

...

let part2 data =
    data
    |> Seq.map (Seq.fold tokenize (List.singleton Empty))
    |> Seq.filter (Seq.forall (not << isInvalid))
    |> Seq.choose (Seq.tryFindBack isIncomplete)
    |> Seq.map ((Seq.unfold deTokenize >> Seq.map toNum2) >> Seq.fold score 0L)
    |> middle

Each file now has a unique module, which is the primary way of organizing code in .NET. Think Solution -> Project -> Namespace (Optional) -> Module -> Value. Instead of the parts being values, they're now functions that accept the data as input and perform the appropriate calculations. We can call them by saying DayXX.part1 input... for example. And that makes it a lot easier to use and test them.
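As a rough illustration of calling into a module, here is a self-contained sketch. Day00 and its two functions are made up for the example (they just sum and take the maximum of some numbers) and are written as a nested module so the snippet runs on its own as a script.

```fsharp
// Hypothetical module; in the real project this would live in its own Day00.fs file.
module Day00 =
    // Each part is a function of the input rather than a precomputed value.
    let part1 (data: seq<string>) = data |> Seq.sumBy int
    let part2 (data: seq<string>) = data |> Seq.map int |> Seq.max

// Sample input, standing in for the lines of a puzzle input file.
let input = [ "1"; "2"; "3" ]

printfn "%d, %d" (Day00.part1 input) (Day00.part2 input) // 6, 3
```

Because the functions take their input as an argument, a test can pass in the small example input while the runner passes in the real one.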

Yay, more structure! There's just one problem! We now have no way of actually running our problem files. Let's fix that with some more CLI work!

Step 2 - Passing Arguments through the Command Line

Now that our code is in .fs files, we have no way of calling it directly from the command line. In the entrypoint of our application, Program.fs, we could simply call and return every value in our files:


let day1input = ...
let day2input = ...
let day3input = ...
...
// printfn $"{Day01.part1 day1input}, {Day01.part2 day1input}"
// printfn $"{Day02.part1 day2input}, {Day02.part2 day2input}"
printfn $"{Day03.part1 day3input}, {Day03.part2 day3input}"
...

but this still means we're doing control flow with comments. We can do better.

One of the things I've been experimenting with a bit more recently is Computation Expressions, which for the purposes of this post we can just treat as ways of safely transforming data in context while keeping a convenient procedural syntax. Using one little CE for the Option type, we can put together a crude mechanism to capture user input on the solutions we want to run.

open System.IO

let getInput dayNumber =
    let fileName = $"./data/day{dayNumber}.txt"

    // Return the file's lines if it exists, otherwise None.
    if File.Exists(fileName) then
        Some(File.ReadAllLines fileName)
    else
        None

let getSolution =
    function
    | "1" -> Some(Day01.part1, Day01.part2)
    | "2" -> Some(Day02.part1, Day02.part2)
    | "3" -> Some(Day03.part1, Day03.part2)
    | _ -> None

type MaybeBuilder() =
    member this.Bind(x, f) =
        match x with
        | Some x -> f x
        | _ -> None

    member this.Zero() = None

let maybe = new MaybeBuilder()

[<EntryPoint>]
let main args =
    maybe {
        let! day = Seq.tryHead args
        let! input = getInput day
        let! (part1, part2) = getSolution day

        // The part is optional: with no second argument, run both parts.
        match Seq.tryItem 1 args with
        | Some "1" -> printfn "%A" (part1 input)
        | Some "2" -> printfn "%A" (part2 input)
        | _ ->
            printfn "%A" (part1 input)
            printfn "%A" (part2 input)
    }
    |> ignore

    0

Again, let's take a step back and talk about what we're doing here.

We've made a function getInput that accepts a string and tries to find a txt file based on that string. We're going to start keeping our input data in a folder called data, and keep our .fs files containing just our solution code. The function will check if the file exists, and if it does, read all the lines into an array of strings.
We've also made a getSolution function that attempts to get a solution based on an input string. If we try to call a solution we haven't created yet, it'll just return a value of None instead of crashing.
Our last special value is a MaybeBuilder computation expression. Rather than explain it here, we'll just talk through how it's used below.

In our main function, we put it all together. We try to read two values from the incoming arguments to the program, day and part, representing which day's solution we want to run and which part of it. Each let! takes a value that would normally be an option and lets us use it as if it were just a normal value. Our MaybeBuilder does the work of chaining the valid values in the background. If any let! binding produces None, the whole thing gracefully returns None and won't crash. It's not particularly informative, but this is just for me, and all I really want is to not handle crazy error messages every time I mistype a day or part name.
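To make the chaining a little less magical, here is a minimal sketch of how a let! roughly maps onto calls to Bind. It assumes a slightly different builder from the one above: it adds a Return member (not in my builder) so the block can produce a value, and tryParse, addCE, and addBind are hypothetical names used only for this illustration.

```fsharp
// A variant of the builder with Return added, so the block can yield a value.
type MaybeBuilder() =
    member this.Bind(x, f) = Option.bind f x
    member this.Return(x) = Some x

let maybe = MaybeBuilder()

// Hypothetical helper: parse an int, or None if the text is not a number.
let tryParse (s: string) =
    match System.Int32.TryParse s with
    | true, n -> Some n
    | _ -> None

// Using the computation expression syntax.
let addCE a b =
    maybe {
        let! x = tryParse a
        let! y = tryParse b
        return x + y
    }

// Roughly the same logic written by hand: each let! becomes a Bind call
// whose second argument is "the rest of the block".
let addBind a b =
    let afterX x =
        let afterY y = maybe.Return(x + y)
        maybe.Bind(tryParse b, afterY)
    maybe.Bind(tryParse a, afterX)

printfn "%A" (addCE "1" "2")                          // Some 3
printfn "%b" (addCE "1" "2" = addBind "1" "2")        // true
printfn "%b" (addCE "1" "oops" = None)                // true: the failed parse short-circuits
```

The first None stops the chain because Option.bind never calls the continuation when it is handed None.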

Mind you, a construct like MaybeBuilder is absolutely unnecessary. This was just my small attempt at IO safety. I may switch to a more formal method using async expressions, or just handle the logic using regular, practical if expressions. Whatever works.
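For comparison, one possible shape for the plainer alternative is sketched below using ordinary match expressions and a Result, which also leaves room for a more informative message. This is not what the runner currently does: tryInput and trySolution are hypothetical stand-ins for getInput and getSolution with hard-coded data, and run returns its results instead of printing them.

```fsharp
// Hypothetical stand-ins for getInput and getSolution, with hard-coded data for day "3".
let tryInput (day: string) : string[] option =
    if day = "3" then Some [| "ab"; "cde" |] else None

let trySolution (day: string) : ((string[] -> int) * (string[] -> int)) option =
    if day = "3" then Some(Array.length, Array.sumBy String.length) else None

// Plain pattern matching instead of a computation expression.
let run (args: string[]) =
    match Array.tryHead args with
    | None -> Error "usage: <day> [part]"
    | Some day ->
        match tryInput day, trySolution day with
        | Some input, Some(part1, part2) ->
            match Array.tryItem 1 args with
            | Some "1" -> Ok [ part1 input ]
            | Some "2" -> Ok [ part2 input ]
            | _ -> Ok [ part1 input; part2 input ]
        | None, _ -> Error $"no input file for day {day}"
        | _, None -> Error $"no solution for day {day}"

printfn "%A" (run [| "3"; "1" |])
printfn "%A" (run [| "3" |])
printfn "%A" (run [| "9" |])
```

The trade-off is more nesting in exchange for knowing which step failed.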

Anyways, we did all of this just so in our terminal we can run our solutions individually.

dotnet run 3 1 

// part 1 results

dotnet run 3 2

// part 2 results

dotnet run 3

// part 1 results
// part 2 results

Great! This is looking way more organized! And most of our earlier hiccups are gone. All we need now are some tests!

Step 3 - All the Tests

We've done a bit of setup so far, so we just need a few things to get our tests working.

First, we need some test files. We'll mirror the project setup in our AOC2022 folder, one file/module per day, and a data file for test inputs.

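module Day03Tests

open Xunit
// Assumption: getInput lives in Program.fs with no explicit module declaration,
// so it sits in the implicit Program module. Adjust to wherever yours is defined.
open Program

let input =
    match getInput 3 with
    | Some input -> input
    | _ -> failwith "Input cannot be found"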

[<Fact>]
let ``Part 1`` () = Assert.Equal(157, Day03.part1 input)

[<Fact>]
let ``Part 2`` () = Assert.Equal(70, Day03.part2 input)

I've kept the tests pretty simple for now. We reuse the getInput function from our AOC2022 project, but this time we deliberately throw an error if the input is missing, since I'm not doing any manual typing with these tests. Then we simply call our module functions in our tests and compare them to our expected values. We can run these tests from our solution folder just by saying:

dotnet test --filter Day03Tests

Note here the built-in functionality to filter tests to one specific module. (And we can of course omit the filter and run all the tests.)

The last thing we need is just a little tweak to our AOC2022.Tests.fsproj file so those data files are included in our output.

<ItemGroup>
  <Compile Include="*.fs" />
  <Content Include="data\**">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>

This little Content tag in our XML ensures all the data successfully makes it to our build directory.

Conclusion and Next Steps

If you've made it here, you've seen all I wanted to share in this post. This project is far from over, and I almost didn't share this post because I wanted to add a lot more features first, but I revised that after thinking about how rare it is to see conversations about work in progress. I'm hoping that seeing a project go from nothing to something could provide value for some people. Also, this is a living project and I am using it to solve Advent of Code, so my time is split between doing more scaffolding and, well, solving problems.

Things coming soon: