Matt Thornton for Symbolica

Posted on Nov 28, 2021 • Edited on Nov 29, 2021

Typesafe F# configuration binding

#dotnet #fsharp #showdev

At Symbolica we're building a symbolic execution service that explores every reachable state of a user's program and verifies assertions at each of these states to check that the program is correct. By default it checks for common undefined behaviours, such as out-of-bounds memory reads or division by zero, but it can also be used with custom, application-specific assertions, just like the kind you'd write in a unit test. Seen from this perspective it's a bit like FsCheck (or Haskell's QuickCheck or Python's Hypothesis), but much more exhaustive and without the randomness.

As much as we like finding bugs with Symbolica, we prefer to not write any in the first place. Our first line of defence is a strong type system, so that we can try to design types that make invalid states impossible and let the compiler tell us off when we make a mistake. For that reason we've opted to build our service using F# as it also interops nicely with the core part of our symbolic executor which is written in C#.

One of the many things we love about F# is that, by default, it doesn't permit null as a regular value. This feature eliminates a whole class of errors caused by null values, most notably the cursed NullReferenceException. Another feature that we like is having access to the vast wealth of .NET libraries that exist. However, many of these are written in C# and so they are often places where null values can sneak into an F# program through the backdoor at runtime.

One area where this was frequently biting us was the binding of configuration data using the Microsoft.Extensions.Configuration library. Due to this and other problems that we'll go into below, we created a safer alternative for configuration binding for F# projects called Symbolica.Extensions.Configuration.FSharp and open-sourced it on GitHub.

The problems with `Microsoft.Extensions.Configuration.Binder`

The best way to highlight the shortcomings of the de facto config binder is with an example. Let's say we want to model some logging options in our code. We might start out with a simple record type like this to represent the options.

type LoggingOptions = 
    { Level: string
      Sink: string }

Here Level represents how verbose we want the logging output to be, e.g. "Debug" or "Error", and Sink is where we want to send the logs, e.g. "Console" or "File".

Let's test this out with a little fsx script that we can run with FSI.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

type LoggingOptions = { Level: string; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

Here we're just seeding the in memory configuration provider with a dictionary of config data and then attempting to retrieve and bind the LoggingOptions.

Problem 1: Mutable data and `null` values

If we run the above script you might be expecting it to print out a LoggingOptions with a Level of "Debug" and a Sink of "Console". However, we actually hit a different problem. The above script throws the following exception.

System.InvalidOperationException: Cannot create instance of type 'FSI_0008+LoggingOptions' because it is missing a public parameterless constructor.

That's because an F# record doesn't have a parameterless constructor: all of the record’s properties must be properly initialised and null isn't an allowed value. To make matters worse, the de facto binder requires the properties of the type being bound to be settable too, breaking immutability and making the use of a record to model options kind of pointless.

There are two typical workarounds to this:

Define a mutable class instead of a record for the options type, like we would in C#.
Add the [<CLIMutable>] attribute to the LoggingOptions record.

Neither of these is particularly pleasing. The first means we have to give up on having immutable options types, and the rest of the code base has to deal with the added complexity of potential mutability. The second is basically a hack which provides a mutable backdoor at runtime to our immutable type.

Using [<CLIMutable>] actually opens up a can of worms because our types are now deceiving us. Our simple record purports to be immutable and never contain null values and so in the rest of the code base we program as if this is the case. On the other hand the config binder isn’t abiding by these compile time invariants and may in fact initialise the record’s properties as null at runtime.

To see this in action, let's rerun the above example, but this time with the [<CLIMutable>] attribute added to the LoggingOptions and a missing value for the Level in the raw config. The modified script looks like this.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

[<CLIMutable>]
type LoggingOptions = { Level: string; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection([ "Logging:Sink", "Console" ] |> Map.ofList)
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

Running it produces this output.

val it: LoggingOptions = { Level = null
                           Sink = "Console" }

We see that the type system has lied to us because the value of Level was actually null at runtime. In this case it's relatively harmless, but in a real application it's likely that we'll have a more complex hierarchy of option types and so we'd end up trying to dereference a potentially null object leading to the dreaded NullReferenceException.
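The same effect can be reproduced without the configuration library at all. The following is a minimal sketch using only reflection from the BCL; the type names PlainOptions and MutableOptions are made up for illustration. It assumes that constructing the record through its parameterless constructor is a fair stand-in for what a reflection-based binder does when a key is missing.

```fsharp
open System

// Two records with the same shape; only the second is marked CLIMutable.
// Both type names are hypothetical and exist only for this sketch.
type PlainOptions = { Level: string; Sink: string }

[<CLIMutable>]
type MutableOptions = { Level: string; Sink: string }

// A plain record only exposes the constructor that takes every field.
let plainHasDefaultCtor =
    not (isNull (typeof<PlainOptions>.GetConstructor(Type.EmptyTypes)))

// CLIMutable adds a public parameterless constructor (and property setters).
let mutableHasDefaultCtor =
    not (isNull (typeof<MutableOptions>.GetConstructor(Type.EmptyTypes)))

// Create an instance through that constructor, without setting any property.
let created =
    Activator.CreateInstance(typeof<MutableOptions>) :?> MutableOptions

// The field was never assigned, so the supposedly non-null string is null.
let levelIsNull = isNull created.Level

// Dereferencing it type-checks but fails at runtime.
let threwNullReference =
    try
        created.Level.Length |> ignore
        false
    with :? NullReferenceException -> true
```

Nothing in the signature of MutableOptions warns the caller about this, which is why the failure tends to show up far away from the binding code.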

When working in F# we'd rather the config binder returned a Result if the config couldn't be parsed and allow us to use an Option type for config data that is, well, optional. Which leads us to the next problem.

Problem 2: No native support for binding DUs

As the de facto binder uses reflection to bind the raw config to "strongly typed objects", it only has support for a limited set of types. This includes all the primitive types, like int and string, and a few of the common BCL collection types like List and Dictionary. This is frustrating for both C# and F# developers who wish to use more complex types to model their options.

Particularly frustrating for F# developers, though, is that this means it doesn't support discriminated unions (DUs) and therefore doesn't support types like Option. To highlight this, let's imagine we wanted to improve our LoggingOptions so that the Level was restricted to a discrete set of values. To do this we'll create a DU called LogLevel and use it as the type for the Level property.

#r "nuget: Microsoft.Extensions.Configuration.Binder"

open Microsoft.Extensions.Configuration

[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

[<CLIMutable>]
type LoggingOptions = { Level: LogLevel; Sink: string }

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

config.GetSection("Logging").Get<LoggingOptions>()

We're now supplying a config dictionary that looks correct: it has values for both "Logging:Level" and "Logging:Sink". Let's run it and see what the output is.

val it: LoggingOptions = { Level = null
                           Sink = "Console" }

So we can see here that the binder has silently failed to bind the Level property now that its type is LogLevel.

If we want to bind a more complex type, we first have to bind to a simple type, like a string, and then write a parser ourselves to turn that into a LogLevel. That’s a slippery slope because it probably means having something like a ParsedLoggingConfig which we create from the more loosely typed LoggingConfig, so we end up defining a fair amount of config parsing “boilerplate” anyway.
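To make that concrete, here is a rough sketch of what the two-stage approach tends to look like. It reuses the LoggingConfig and ParsedLoggingConfig names from the paragraph above; the parseLevel and parseConfig functions and their error messages are assumptions for illustration, not code from any library.

```fsharp
[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

// Loosely typed shape that a reflection-based binder is able to populate.
type LoggingConfig = { Level: string; Sink: string }

// Strongly typed shape that the rest of the application wants to consume.
type ParsedLoggingConfig = { Level: LogLevel; Sink: string }

// Hand-written parser for the DU, including the null the binder may leave behind.
let parseLevel (raw: string) : Result<LogLevel, string> =
    match raw with
    | null -> Error "Level is missing"
    | s ->
        match s.ToLowerInvariant() with
        | "debug" -> Ok LogLevel.Debug
        | "info" -> Ok LogLevel.Info
        | "warning" -> Ok LogLevel.Warning
        | "error" -> Ok LogLevel.Error
        | _ -> Error (sprintf "'%s' is not a valid LogLevel" s)

// Second stage: turn the raw record into the parsed one.
// Note that this version stops at the first problem it finds.
let parseConfig (raw: LoggingConfig) : Result<ParsedLoggingConfig, string> =
    if isNull raw.Sink then
        Error "Sink is missing"
    else
        parseLevel raw.Level
        |> Result.map (fun level -> { Level = level; Sink = raw.Sink })
```

Every options type needs a pair of records and a function like parseConfig, and nothing forces callers to go through it rather than use the raw LoggingConfig directly.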

Problem 3: Parse, don't validate

The de facto binder doesn't give us much help when our configuration is faulty. We can write some options validators and wire these up with DI, but as Alexis King has taught us: parse, don't validate.

In short, "parse, don't validate" tells us that it's better to parse data into a type, that once constructed must be valid, than it is to read the data into a more loosely typed object and then run some post-validation actions over the values to make sure they're correct. The primary reason being that if we know that our type only permits valid values, then we no longer have to wonder whether or not it's already been validated.

The de facto configuration binder doesn't make it easy to adhere to this. It's easy to forget to register a validator for the options, and then when they're accessed at runtime we get a rather unhelpful null value, like we observed earlier. What we'd prefer is for the compiler to prevent us from making such a mistake, by enforcing validation through the type system.

To give a specific example, let's imagine we want to be able to restrict the logging level to only have the values, "Info", "Debug", "Warning" and "Error". We've already seen we can't use a DU to model this. So we have no way of knowing whether or not Level is valid when we come to use it, all we know is that it's a string. So if we want to be sure, we're forced to keep validating the logging level at every point of use.
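The difference can be sketched in a few lines. The function names below (severityOfString, parseLevel, severity) and the numeric severities are hypothetical and only there to contrast the two styles.

```fsharp
[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

// Validate style: Level stays a string, so this consumer (and every other one)
// has to cope with the possibility that it is not one of the known values.
let severityOfString (level: string) : int option =
    match level with
    | "Debug" -> Some 0
    | "Info" -> Some 1
    | "Warning" -> Some 2
    | "Error" -> Some 3
    | _ -> None

// Parse style: the string is converted once, at the edge of the application.
let parseLevel (raw: string) : LogLevel option =
    match raw with
    | "Debug" -> Some LogLevel.Debug
    | "Info" -> Some LogLevel.Info
    | "Warning" -> Some LogLevel.Warning
    | "Error" -> Some LogLevel.Error
    | _ -> None

// Consumers then take a LogLevel: the match is total and checked by the compiler,
// so there is no invalid case left to handle here.
let severity (level: LogLevel) : int =
    match level with
    | LogLevel.Debug -> 0
    | LogLevel.Info -> 1
    | LogLevel.Warning -> 2
    | LogLevel.Error -> 3
```

With the second style the "what if it's invalid?" question is answered exactly once, in parseLevel, instead of in every function that reads the level.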

A better binder for F#

Given these shortcomings we decided to write our own config binder with the following design goals in mind:

Binding failures should be expected and be reflected in the type returned from the binder. We should be made to deal with the unhappy path.
Binding should not break immutability.
Binding should work for all types including complex user defined types.
Binding should be composable, such that if I can bind a type X which is then later used within Y, I should be able to reuse the binder for X when defining the binder for Y.
Error reporting should be greedy and descriptive so that developers can quickly fix as many errors as possible when binding fails.

To that end we opted to write a binder that didn't use any reflection. The trade-off we're making here is that we're forced to be much more explicit when we bind a type and so we end up with what some people might consider to be boilerplate. However, we'd personally rather have code that is explicit than have to read through documentation to discover the implicit behaviours of something magic, because when the magic thing breaks we usually spend more time debugging that than we would have spent writing the explicit "boilerplate" to begin with.

Also, thanks to the composable nature of functional programming languages and the power of F#'s computation expressions it's possible to be both explicit and terse. It's probably best appreciated with an example. So let's see how we'd bind the above LoggingOptions using our new approach.

#r "nuget: Symbolica.Extensions.Configuration.FSharp"

open Microsoft.Extensions.Configuration
open Symbolica.Extensions.Configuration.FSharp

[<RequireQualifiedAccess>]
type LogLevel =
    | Debug
    | Info
    | Warning
    | Error

module LogLevel =
    let bind =
        Binder(
            (fun (s: string) -> s.ToLowerInvariant())
            >> (function
            | "info" -> Success LogLevel.Info
            | "debug" -> Success LogLevel.Debug
            | "warning" -> Success LogLevel.Warning
            | "error" -> Success LogLevel.Error
            | _ -> Failure ValueError.invalidType<LogLevel>)
        )

type LoggingOptions = { Level: LogLevel; Sink: string }

let bindConfig =
    Bind.section
        "Logging"
        (bind {
            let! level = Bind.valueAt "Level" LogLevel.bind
            and! sink = Bind.valueAt "Sink" Bind.string
            return { Level = level; Sink = sink }
         })

let config =
    ConfigurationBuilder()
        .AddInMemoryCollection(
            [ "Logging:Level", "Debug"
              "Logging:Sink", "Console" ]
            |> Map.ofList
        )
        .Build()

bindConfig
|> Binder.eval config
|> BindResult.mapFailure(fun e -> e.ToString())

Running this script produces the following output.

val it: BindResult<LoggingOptions,string> = 
    Success { Level = Debug
              Sink = "Console" }

From this example we can see that it's successfully bound our more complex LoggingOptions type that contains a DU. There's also zero magic, the binding process is clear to see and simple to customise. Let's check that it's met our design goals.

Failures are expected - We can see this by the fact that right at the end, after we've called eval on the Binder, it's produced a BindResult.
Binding doesn't break immutability - No [<CLIMutable>] required here.
Binding works for complex types - Binding a DU was no problem. We were also able to make it case insensitive just through a little function composition with ToLowerInvariant.
Binding is composable - We defined the binder for the LogLevel in isolation to the overall config binder.
Error reporting is greedy and informative - Let's simulate some failures and see what happens.

Let's run the script again but this time with the following input config.

[ "Logging:Level", "Critical" ]

So that the Level is invalid and the Sink is missing. We get the following output.

Failure(
  "@'Logging':
    all of these:
      @'Level':
        Value: 'Critical'
        Error:
          Could not parse value as type 'LogLevel'.
      @'Sink':
        The key was not found.")

It's shown us all of the paths in the config for which it found errors and what those errors are.

The Implementation Details

At the heart of all of this is a Binder<'config, 'value, 'error> type. This type is just a wrapper around a function of the form 'config -> BindResult<'value, 'error>. For the category theory inclined, it's just a reader monad whose return type has been specialised to a BindResult.
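As a rough illustration of the reader idea, here is a toy version built from scratch. It is not the library's implementation: it uses the standard Result type in place of BindResult, a Map<string, string> in place of IConfiguration, and the valueAt helper is a simplified stand-in written for this sketch.

```fsharp
// Toy stand-in for the library's type: a wrapped function from config to a result.
type Binder<'config, 'value, 'error> =
    | Binder of ('config -> Result<'value, 'error>)

module Binder =
    // Running a binder just means supplying the config to the wrapped function.
    let eval (config: 'config) (Binder f) = f config

    // Transform the bound value while leaving the config plumbing untouched.
    let map (g: 'a -> 'b) (Binder f) =
        Binder(fun config -> f config |> Result.map g)

    // Monadic bind: the same config is threaded to both binders.
    let bind (g: 'a -> Binder<'config, 'b, 'error>) (Binder f) =
        Binder(fun config ->
            match f config with
            | Ok a -> eval config (g a)
            | Error e -> Error e)

// A binder that reads a single key from a Map-based "config".
let valueAt (key: string) : Binder<Map<string, string>, string, string> =
    Binder(fun config ->
        match Map.tryFind key config with
        | Some v -> Ok v
        | None -> Error (sprintf "The key '%s' was not found." key))

let config = Map.ofList [ "Level", "Debug"; "Sink", "Console" ]

// Binders are only descriptions until eval supplies the config.
let sinkLength =
    valueAt "Sink" |> Binder.map String.length |> Binder.eval config

let missing = valueAt "Missing" |> Binder.eval config
```

Because a binder is just a function waiting for its config, binders can be defined once, combined freely and only run at the very end.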

The BindResult type is very similar to a regular F# Result except that its applicative instance will accumulate errors, whereas the regular Result will typically short-circuit on the first error it encounters.
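The difference between accumulating and short-circuiting can be shown with a small stand-alone sketch. The Checked type and its zip function below are invented for this illustration and only mimic the behaviour described above; they are not the library's BindResult.

```fsharp
// Hypothetical result type whose failure case carries a list of errors.
type Checked<'a> =
    | Good of 'a
    | Bad of string list

// Applicative-style combination: when both sides fail, both error lists are kept.
let zip (a: Checked<'a>) (b: Checked<'b>) : Checked<'a * 'b> =
    match a, b with
    | Good x, Good y -> Good (x, y)
    | Bad e1, Bad e2 -> Bad (e1 @ e2)
    | Bad e, Good _ -> Bad e
    | Good _, Bad e -> Bad e

let levelChecked : Checked<string> = Bad [ "Level: could not parse 'Critical'" ]
let sinkChecked : Checked<string> = Bad [ "Sink: the key was not found" ]

// Both problems are reported together.
let accumulated = zip levelChecked sinkChecked

// The same two failures chained with Result.bind: the second is never looked at.
let levelResult : Result<string, string> = Error "Level: could not parse 'Critical'"
let sinkResult : Result<string, string> = Error "Sink: the key was not found"

let shortCircuited =
    levelResult
    |> Result.bind (fun level ->
        sinkResult |> Result.map (fun sink -> level, sink))
```

Here accumulated carries both messages, while shortCircuited only carries the one about Level.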

Binder and BindResult are defined generically to keep them as flexible as possible. However at some point we want to provide some specialisations for the common binding scenarios. There are really two primary specialisations to consider; one for binding sections and another for binding values.

Section binders are of the form Binder<#IConfiguration, 'a, Error> and value binders are of the form Binder<string, 'a, ValueError>. By fixing 'error to the custom types Error and ValueError it's easy to compose Binders and also ensure that the errors can be properly accumulated in both applicative and alternative computations.

One of the primary specialisations comes from the bind applicative computation expression. We saw in the example above how bind lets us compose a Binder for an IConfigurationSection by binding its properties using existing Binders and at the same time ensures all binding errors from this section are accumulated. The bind CE gives us a declarative looking DSL for defining new binders for our application specific config objects.
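For readers who haven't written an applicative computation expression before, the sketch below shows the general mechanism in F# 5 or later: and! is translated into a call to MergeSources on the builder, and let! followed by return into BindReturn. The AccumulatingBuilder, the require helper and the error strings are all assumptions made for this example, and it uses Result with a list of errors rather than the library's own types.

```fsharp
// Minimal applicative builder over Result<'a, string list>.
type AccumulatingBuilder() =
    // `let! ... return ...` desugars to BindReturn, which is just a map.
    member _.BindReturn(r: Result<'a, string list>, f: 'a -> 'b) : Result<'b, string list> =
        Result.map f r

    // `and!` desugars to MergeSources; this is where errors are accumulated.
    member _.MergeSources(a: Result<'a, string list>, b: Result<'b, string list>) : Result<'a * 'b, string list> =
        match a, b with
        | Ok x, Ok y -> Ok (x, y)
        | Error e1, Error e2 -> Error (e1 @ e2)
        | Error e, Ok _ -> Error e
        | Ok _, Error e -> Error e

let accumulate = AccumulatingBuilder()

// Hypothetical lookup against a Map standing in for a config section.
let require (key: string) (config: Map<string, string>) : Result<string, string list> =
    match Map.tryFind key config with
    | Some v -> Ok v
    | None -> Error [ sprintf "@'%s': The key was not found." key ]

type LoggingOptions = { Level: string; Sink: string }

let bindLogging (config: Map<string, string>) =
    accumulate {
        let! level = require "Level" config
        and! sink = require "Sink" config
        return { Level = level; Sink = sink }
    }

// Both keys are missing, so both errors come back.
let bothMissing = bindLogging Map.empty

let bound = bindLogging (Map.ofList [ "Level", "Debug"; "Sink", "Console" ])
```

Because the builder never defines a monadic Bind, each and! clause has to be independent of the others, which is what makes it possible to evaluate all of them and collect every failure.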

In the Bind module the library also provides various combinators for building new Binders, such as Bind.section and Bind.valueAt, which take an existing Binder and bind it to a section or to a value at a particular key; these are typically used inside a bind CE. The module also contains binders for types like int, bool, System.DateTime and System.Uri as well as more complex structures like List and IDictionary.

Try it out

The code is available on GitHub and you can install the library via NuGet. If you want to see more sophisticated examples that show how to handle optional values, deal with alternatives and bind units of measure, check out the IntegrationTests. Of course, if there's something that you think is missing then open an issue or a pull request. I'm sure there are plenty of other Binders that we can add to the Bind module to cover other common .NET types.

Future Improvements

If you want to use things like IOptionsSnapshot then it requires interaction with the IServiceCollection and a call to Configure<MyOptionsType>(configureAction). Unfortunately the way that Microsoft have designed this means that a parameterless public constructor is required on the options type being configured so that an instance can be passed to configureAction, which goes against our design principles here. So currently this library won't play nicely with things like reactive options updates. If this is something that you'd like then it should be possible to provide a way around this by providing an alternative IOptionsFactory, so please open an issue and let us know. See the README for more details.

Oldest comments (4)

Kirk Shillingford • Nov 29 '21

Lovely post. I really enjoyed it, and I think it's given me some ideas for improvements to a few tools of my own.

jkone27 • Dec 9 '21

An alternative I like in F# is to use **FSharp.Data.JsonProvider** and just register a singleton for the F# provided type of your settings.

Then configuration can be read all around your app via standard DI, and it always takes the correct environment configuration values, with the proper typings inferred from appsettings.json (a similar approach is described below).

jkone27-3876.medium.com/comparing-...

Matt Thornton Symbolica • Dec 10 '21 • Edited

That's a cool approach, I hadn't thought of using type providers for config. I guess that works quite well providing there is a type provider for the config source you're using. Are you able to deal with environment variables using that approach too, as I suspect they are about as common as JSON based configuration in most apps?

I have played with type providers for something else briefly and one thing I'm curious about is how they handle changes to the data post-compilation. Does it fail at runtime then, or does it still return you some kind of Result so that you're forced to check for errors at runtime too?

jkone27 • Dec 10 '21 • Edited

If you try out the linked repo you should see one possible way of safely using type providers. Type providers usually define a type (a class), not an object (an instance); the instance can be bound at run time (with the Load function in the JsonProvider case).

let settingsFile = "appsettings." + 
     Environment.GetEnvironmentVariable("ASPNETCORE_ENVIRONMENT") + 
     ".json"
let config = AppSettingsProvider.Load(settingsFile)

services.AddSingleton<AppSettings>(config)

If you're talking about changing configuration once the app has started (a feature unique to IOptions), then no, that doesn't happen, but we happily lived without it before .NET Core; in most use cases I generally don't need that "options" functionality for apps.

weblog.west-wind.com/posts/2017/de...

For environment variables, they are automatically overridden in the json file by aspnetcore, if they have a matching pattern, so that is also already covered by aspnet and works fine as the file is loaded for each environment.

That's how you would inject secrets and sensitive config, for example; that also works fine with type providers.