If you asked me what the biggest game changer for me in 2019 was for developing quality software, I would point to Scientist .NET without hesitation.
Scientist .NET lets you test experimental code safely without exposing end users to new errors or defects.
In this article, I’ll go in depth into the Scientist .NET library, explore what it is and when you should use it. I’ll highlight its benefits and drawbacks so you can make an informed decision as to whether Scientist is something you should consider for your team.
Note: This post was originally created as part of the collaborative C# Advent series on C# and .NET Development. This article is also an expanded and more in-depth version of an article I wrote on Scientist in late August.
So, what is Scientist .NET and what problem does it exist to solve?
Scientist .NET is part of the Scientist family of libraries, popularized by GitHub’s original Scientist library in Ruby but available in most modern programming languages.
Scientist is a library designed to perform experiments comparing multiple implementations of a solution to each other without introducing adverse effects to end users.
So, what does that mean?
Imagine you had a critical section of code that was steeped in complexity and years of technical debt. You’d want to pay down that debt and roll out a new or otherwise improved version of it, but you’d need to make sure you weren’t accidentally taking a step backwards.
The traditional solution might be to have a beta user flag where beta users get to interact with the new version of a feature while old users use the traditional version.
That works, but it does have a few risks. First of all, you never really want to expose bugs to users – even beta users. Secondly, your beta users might not be the ones with the problematic data or scenarios, so your beta test could miss broken scenarios.
What you really want to do is test your new logic against all types of users, but remove the risk of any bugs from that logic ever reaching them.
This is where Scientist comes in.
With Scientist, your code calls to the Scientist .NET library to perform an experiment. You provide it a legacy implementation to use as well as one or more experimental methods to try.
Scientist will then execute all provided implementations and return only the result of the legacy implementation.
This effectively isolates the calling code from any exceptions or inaccuracies in the experimental implementations and allows you to run these pieces of code in production or other environments you might not normally try.
So how is this valuable if we’re only returning the result of the legacy version?
Scientist .NET has something called a results publisher which lets us collect the results of individual experiments and store or send them on for analysis in whatever way we want.
This results publisher is where the value comes in because it can be used for things like:
- Recording if new implementations encounter exceptions
- Recording instances where the new implementation returned a different result than the old one
- Measuring the result of different implementations against large sets of data over time
- Comparing the duration of old and new implementations
Essentially, Scientist .NET gives us the data we need to make informed decisions about whether new code should see the light of day.
Best of all, Scientist .NET lets us do so without endangering the end users by pushing potentially dangerous new implementations out into production before they’ve been proven to work.
Now that we know what scientist is and what problems it helps with, let’s get into the details.
We’ll start by installing the Scientist .NET package in NuGet into whatever library we want to be able to run the experiments from.
Do this by right clicking on a project in solution explorer in Visual Studio and then selecting Manage NuGet Packages…
Once open, go to the Browse tab and type in
Scientist .NET to find the library:
Click install and the dependency will be added to your project.
Now that we have access to Scientist, let’s run an experiment.
In this example, we’ll be looking at a method that works with a problematic legacy component and trying out an experiment.
Since the implementation details of the legacy code aren’t relevant to discussing Scientist .NET, just pretend that the
LegacyAnalyzer file is thousands of lines long, rife with code smells, and needs to be burned to the ground.
Let’s start replacing it by incorporating Scientist .NET:
Holy nested lambda expressions! Okay, don’t panic, let’s talk about this code.
Line 5 is what kicks off the Scientist process. We call to
Scientist.Science method specifying the generic result type that will be returned and providing a name for our experiment. This name is provided to the result publisher and can be used to distinguish between results if you are running multiple experiments on the same result publisher.
experiment parameter to the
Science method allows us to configure Scientist .NET inside of a lambda expression. This is where the nuts and bolts of configuring our experiment are accomplished. We’ll go over these line by line.
experiment.Use method tells Scientist that when it runs, it should perform the provided action and return its result. This is where we’ll call out to our legacy implementation for the moment. You can have one and only one
Scientist allows us to specify zero to many
experiment.Try statements to run comparisons that will be provided to the results publisher. This is where you try your experimental logic since its results will not go back to the calling method.
You can use
experiment.Compare to tell Scientist how to compare experiments to each other and determine if they came to the consensus. This is used by the results publisher for logging differences. For us, we’ll just look at the
Score property on our
We can also add context to the results for the result publisher in order to help distinguish one result from another. In our simple example, we record the resume’s name to help differentiate one resume from another. If we didn’t do this, we would know that there’s an issue comparing some resumes, but we wouldn’t have enough information to recreate it later, making it less useful.
Take a look at the full documentation for Scientist .NET for more ways you can configure the experiment.
If you run this now, Scientist will run the experimental and legacy version in random order and then return the result of the legacy version, leaving our behavior unchanged.
Let’s look at how to make this useful by incorporating result publishers.
By default, Scientist uses an internal in-memory result publisher which doesn’t do much for us to get usable results out of it.
You’ll need to implement
IResultPublisher to store whatever results you want. Thankfully, it’s not too hard:
Ignore the scary
<T, TClean> syntax – it’s necessary for Scientist’s internals, but not relevant to what we really need to do.
What we really want to do is loop over the
MismatchedObservations collection. This collection contains any experiments that threw exceptions or resulted in a value that didn’t pass the
Compare check we defined earlier.
In this case we’re just logging to the console for simplicity.
Typically you’d log results to some form of consumable event source – either a log file, the Windows Event Viewer, add entries in a database somewhere, or track them on some form of exception logging service.
In order to use our new result publisher, we have to tell Scientist about it. We do this by setting a property on
Line 8 is what really matters here – we tell Scientist which
ResultPublisher to use for experiments.
Note that I’m using static fields to track whether we’ve set the result publisher for Scientist. This is because Scientist will error if you try to change the result publisher while a test is running if you use Scientist within your unit test suite.
Personally I would rather work with a non-static version of Scientist where I didn’t need to worry about this, but this is one of the few drawbacks of the Scientist .NET library.
Scientist can be made to work in asynchronous mode as well, removing some of the performance drawbacks for executing results in parallel. All that changes is that you call an async method from your
Try methods and you call to
ScienceAsync instead of
I’m a huge advocate for Scientist and the ways it can allow you to optimize your algorithms for performance and best results as well as its hand in helping safely pay down technical debt.
That said, Scientist can have some side effects if applied without thought.
Let’s say you have an algorithm that sends out an E-Mail with results of applying for a mortgage, for example. You could provide a legacy and new implementation of this algorithm, but if Scientist .NET runs both of these implementations, two E-Mails will be sent – possibly with different contents!
Obviously this is not what you want to have happening.
Because of this, you should isolate dependencies with side effects via dependency injection and pass them in to your actual algorithm but a dummy object (or null) into your experimental implementation.
Here’s a sample to help illustrate what I’m talking about:
Note: Instead of relying on dependency injection, you could alternatively add properties or parameters to allow certain behavior to be disabled. For example, a
ShouldSendEmail property would accomplish a similar result.
This should help illustrate that if you adopt Scientist, you’re going to start naturally decoupling your complex logic from their dependencies such as database operations, working with disk, network calls, or sending E-Mails.
Ultimately, this change in behavior is going to make your code easier to change, maintain, and test, so this is an added benefit, despite being additional work in the short term.
Scientist .NET doesn’t just have to work in production. In fact, I’ve had phenomenal success integrating it into my unit tests.
When you think about it, all you want to happen is for a mismatch to fail your unit tests.
I used to advocate for writing a custom
IResultPublisher that fails unit tests on mismatches, but Scientist .NET now supports a property out of the box for unit test integration:
true (see line 9 above) is all you need to do to incorporate comparison tests via Scientist.
Note that for simple comparisons in unit tests you may also want to consider Jest style snapshot tests via Snapper.
Generally, when incorporating Scientist into my workflow, the process goes something like this:
- Wrap up the existing implementation into a single method or class
- Write unit tests around the existing behavior as needed
- Introduce a new method / class that improves on the existing implementation
- Use Scientist in Unit Tests to compare the old implementation to the new implementation
- Send a Scientist build with the new logic as an experiment to Quality Assurance
- Release the scientist build into production
- After a fixed period of time without issues, remove the legacy implementation and only use the new implementation (following the full testing and release process)
The drawbacks of this workflow are:
- It takes longer for code following this flow to reach production in an active state
- Code must be tested and deployed multiple times due to the phased rollout approach
- If bugs are found in the legacy implementation, they must be fixed in both the old and new implementation during the transition period
However, the benefits are considerable:
- The code has been vetted in production against actual production data
- Depending on what was logged in your result publisher, you now have detailed information to support the performance of your new implementation
- If you follow this process, you should not be introducing any observable defects from the end user’s perspective
- You now have detailed tests covering your new implementation
- Your architecture is going to be more extensible and testable in the long term
Scientist .NET isn’t for every team, and it’s not for every change, but I’ve been able to use this technique to great effect when paying down technical debt.
As you can see, Scientist .NET offers a polished wrapper around a simple but revolutionary concept: test new implementations in production without exposing any of their new defects or weak areas to end users.
Ultimately, everything we do in software development is to serve some user – whether it’s someone using a web application or an automated system consuming an API. Software that is worth writing is worth writing in such a way that we don’t risk harming the end users.
Scientist .NET is an extremely powerful and capable tool for protecting the users as we pay down technical debt, improve quality, and add additional functionality.
Give it a try. Your users and codebase will thank you for it.
Cover Photo by Gabriel Gurrola on Unsplash