MichaelFack1 for IT Minds

Writing Good Tests (Part 1) - What Makes a Test Bad

Unless you're some kind of demi-god, I believe you too have tried writing fresh new code that didn't execute as you expected at first. Perhaps you forgot a guard clause, some underlying dependency didn't behave as anticipated, or, for some other reason, a bug was introduced into your precious code. What is more important (at least with regard to this blog post) is that you discovered this by compiling the code, running the software and seeing it fail.

In this blog post, I hope to indoctrinate you into believing in the higher powers: automated testing!

Don't just write tests

Let me know if you've ever been in this situation before: you've just opened your favorite IDE to work on a large (and old) project, you see the infamous area of the project aptly named "Tests", and you shudder a little bit?
[Image: an example of a folder named "Tests"]
I have, and I can tell you why I shuddered - I knew I was the one delegated the task of fixing the tests that no longer passed. I was the designated Bughunter (trademark pending), and I knew a bug could take days to resolve, even for my more experienced coworkers who had been working on the project for years.

You know why? The tests were bad!

What makes a test bad?

Just like a bowl of cereal, a test can go bad for many reasons, and it is not always clear why it has gone bad. However, once you've had a spoonful of it - you know!

In this short series, I am first going to supply you with a non-exhaustive list of things to check for when suspecting a test of being bad, and then later on in the series, I will discuss how to write good tests.

I just happened to have a CmsService lying around which implements read/write access to a string mapping in a database, and I need to test it - so below, I'll use it as an example of how NOT to write a test.

Try to spot a few things you find obscene in the following .NET code segment.

public class UnitTest1
{
    [Theory]
    [MemberData(nameof(TestData))]
    public async Task Test1(List<Tuple<int, CmsDbEntry>> entries, string expected, bool boolResult)
    {
        var repoMock = new Mock<ICmsWriteRepository>();
        var dbContextMock = new Mock<CmsDbContext>();
        var repo = new CmsRepository(null, dbContextMock.Object);
        var service = new CmsService(new LoggerFactory().CreateLogger<CmsService>(), repo, repoMock.Object);
        var entriesMock = new Mock<DbSet<CmsDbEntry>>();
        foreach (var (key, entry) in entries)
        {
            entriesMock.Setup(m => m.FindAsync(key)).ReturnsAsync(entry);
        }
        dbContextMock.Setup(m => m.Entries).Returns(entriesMock.Object);
        var result = await service.ReadSectionAsync("some arbitrary constant");
        Assert.Equal(result, expected, ignoreCase: true);
        if (boolResult) {
            repoMock.Setup(m => m.WriteAsync("some arbitrary constant", "Content")).ReturnsAsync(boolResult);
            var boolResult0 = await service.WriteSectionAsync("some arbitrary constant", "Content");
            Assert.Equal(boolResult, boolResult0);
        }
    }

    public static object[][] TestData = new object[][]
    {
       ...
    };
}

Here are a few things I like to pay attention to:

Is the test readable?

At a quick glance, the test should convey what is being tested. If it fails to do so, at best it becomes harder to update as the target code changes; at worst, it sabotages development by misleading the developer.

If you cannot discern what a test tests, how can you know whether it is correct? A test whose correctness you cannot judge, because its purpose or meaning is unclear, adds no value. You are better off without it.

Is the test well named?

A well-named test might convey the entire purpose of a test, much like how a well-named method conveys its functionality.

Formatting test names in a consistent way helps convey purpose. Additionally, in conjunction with other tests, well-named tests can actually help you pinpoint exactly where you should look for bugs.

public class CmsServiceUnitTests {
  [Fact]
  public async Task ReadSectionAsync_ValidKeyOfExistingEntry_ReturnsEntry() {
    ...
  }
}
  • Other naming conventions also exist - I can recommend the {Given}_{When}_{Then} format since user stories can be formulated in this format as well.
public class CmsServiceUnitTests {
  [Fact]
  public async Task GivenValidKeyEntryPairInDb_WhenFetchingEntryOfTheKey_ThenEntryIsReturned() {
    ...
  }
}

Does it test more than one thing?

A test that tests more than one thing at a time can be difficult to derive meaning from and becomes less useful with respect to pinpointing where a bug lives in the code.

One thing does not mean one field or just one value, but rather one resulting state. A rule of thumb: primitive fields and the like can be checked together as a way of validating the state of an object, but do make sure to add informative messages when you have more than one assertion in a test (a sketch of this follows after the example below).

[Fact]
public async Task ReadSectionAsync_ValidKeyOfExistingEntry_ReturnsEntry() {
  var expected = dbKeyEntryPair.Entry;
  ...
  var actual = await service.ReadSectionAsync(validKey);
  ...
  Assert.Equal(expected, actual);
}
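
If the resulting state you validate is an object with several primitive fields, each assertion can carry a message so a failure points at the offending field. Here is a minimal sketch of that idea - it assumes a hypothetical GetEntryAsync method returning a CmsDbEntry with Key and Content properties, which is not part of the code above:

[Fact]
public async Task GetEntryAsync_ValidKeyOfExistingEntry_ReturnsMatchingEntry() {
  // Arrange (service, repository and dbKeyEntryPair set up as in the example above)
  var expected = dbKeyEntryPair.Entry;
  ...
  // Act
  var actual = await service.GetEntryAsync(validKey);
  // Assert - one resulting state, checked field by field, each with an informative message
  Assert.True(expected.Key == actual.Key,
    $"Key mismatch: expected '{expected.Key}' but got '{actual.Key}'");
  Assert.True(expected.Content == actual.Content,
    $"Content mismatch: expected '{expected.Content}' but got '{actual.Content}'");
}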

Is it well arranged?

A well-arranged test does not configure some state, execute some code, configure some more, execute some more, assert something, and keep on mixing these steps. In the abstract, a test does one thing, and one thing only: it validates that the code under test, starting from some configured state, performs some execution, returns something expected, and ends up in some expected state.

  • Microsoft advises adding the three A's as code comments:
[Fact]
public async Task {TestName}(){
  // Arrange
  {Configuration}
  // Act
  var actual = {subject}.{Method}({args});
  // Assert
  {Assertion(s)}
}
  • There is some disagreement out there in the Great Ether about where one should put shared static configuration, such as the initialization of the test subject(s) and mocks, as well as their common configuration. In my opinion, initialization can occur in the constructor of the test class, since it will be the same for all tests in the class. Configuring the subject(s) and mocks, on the other hand, belongs to the arrangement step of each test, as it is likely to vary.
public class CmsServiceUnitTests {
  public Mock<ICmsWriteRepository> writeRepoMock;
  public Mock<ICmsReadRepository> readRepoMock;
  public CmsService subject;

  public CmsServiceUnitTests() {
    readRepoMock = new Mock<ICmsReadRepository>();
    writeRepoMock = new Mock<ICmsWriteRepository>();
    subject = new CmsService(Mock.Of<ILogger<CmsService>>(), readRepoMock.Object, writeRepoMock.Object);
  }

  const string validKey = {key of valid format};
  const string entry = {content of entry};

  [Fact]
  public async Task ReadSectionAsync_ValidKeyOfEntry_ReturnsEntry() {
    // Arrange
    readRepoMock.Setup(m => m.ReadAsync(validKey)).ReturnsAsync(entry);
    var expected = entry;
    // Act
    var actual = await subject.ReadSectionAsync(validKey);
    // Assert
    Assert.Equal(expected, actual);
  }
}

Does logic occur?

A great way to detect whether a test does only one thing is to check whether it contains logic, such as a switch, an if or a for statement. These statements make the test a lot less readable, since the state of the subject and the input at the time of the method call become something computed, rather than something set (which can simply be read).

  • I must admit that I sin in this respect when compressing test cases - sometimes I find it easier to write a lot of test cases using one parameterized test, where some of the arguments tell the test how to configure itself, instead of splitting the test cases into multiple separate parameterized tests (a sketch of the split-up alternative follows below). However, just because something is easy doesn't mean you should do it!
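
To illustrate the alternative: instead of one [Theory] whose arguments steer an if-statement inside the test, the cases can be split into separate parameterized tests, each with a flat, branch-free body. A sketch, reusing the readRepoMock and subject fields from the arrangement example above (that ReadSectionAsync returns null for a missing entry is my assumption, not something the real CmsService necessarily does):

[Theory]
[InlineData("header", "Welcome!")]
[InlineData("footer", "Goodbye!")]
public async Task ReadSectionAsync_ValidKeyOfExistingEntry_ReturnsEntry(string key, string content) {
  // Arrange - no branching: every case runs the exact same steps
  readRepoMock.Setup(m => m.ReadAsync(key)).ReturnsAsync(content);
  // Act
  var actual = await subject.ReadSectionAsync(key);
  // Assert
  Assert.Equal(content, actual);
}

[Theory]
[InlineData("header")]
[InlineData("footer")]
public async Task ReadSectionAsync_KeyOfMissingEntry_ReturnsNull(string key) {
  // Arrange - the "missing entry" case gets its own test instead of an if-branch
  readRepoMock.Setup(m => m.ReadAsync(key)).ReturnsAsync((string)null);
  // Act
  var actual = await subject.ReadSectionAsync(key);
  // Assert (assumes ReadSectionAsync maps a missing entry to null)
  Assert.Null(actual);
}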

Is it a typical test?

Here, typical refers to whether or not the test follows some sacred dogma allowing it to be categorized easily into some preconceived type, such as end-to-end tests, integration tests or unit tests - types that fit into the overall test strategy.

If the test does not neatly fit into a category, consider refactoring it so that it does. When a test straddles two or more categories, one way to do this is to clone it, so that each category gets its own version of the test.

Does it follow Clean Code Principles?

In spite of popular belief, test code is actually also code, and it should in general receive the same care you give any other code. There are great benefits to this, such as readability, reusability and flexibility.

  • To demonstrate this, consider a test class containing a large number of tests where the subject has some internal state that may vary. One state can be considered both initial and valid, and all other states can abstractly be thought of as mutations of this first state. We can use this way of thinking when coding our tests: in each test, following // Arrange, we call a Setup(); method. Each test then deviates from this initial state by mutating it, perhaps with a SetSubjectDependency(s => s.{dep}, null) in a test validating how the subject handles internal null references (a sketch of such helpers follows below).
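
A minimal sketch of what such helpers could look like for the CmsService tests. The generic SetSubjectDependency from the bullet above is replaced here with a simpler, purpose-built SetNullReadRepository; that helper, Setup() and the expected exception type are my own assumptions, not part of the original code:

public class CmsServiceUnitTests {
  private Mock<ICmsReadRepository> readRepoMock;
  private Mock<ICmsWriteRepository> writeRepoMock;
  private CmsService subject;

  // Rebuilds the subject in its initial, valid state.
  private void Setup() {
    readRepoMock = new Mock<ICmsReadRepository>();
    writeRepoMock = new Mock<ICmsWriteRepository>();
    subject = new CmsService(Mock.Of<ILogger<CmsService>>(), readRepoMock.Object, writeRepoMock.Object);
  }

  // Mutation helper: deviates from the initial state by dropping one dependency.
  private void SetNullReadRepository() {
    subject = new CmsService(Mock.Of<ILogger<CmsService>>(), null, writeRepoMock.Object);
  }

  [Fact]
  public async Task ReadSectionAsync_ReadRepositoryIsNull_Throws() {
    // Arrange
    Setup();
    SetNullReadRepository();
    // Act + Assert - the exception type is an assumption about how CmsService guards its dependencies
    await Assert.ThrowsAsync<InvalidOperationException>(() => subject.ReadSectionAsync("header"));
  }
}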

Is the test independent?

Tests which share state and space with other tests are at risk of becoming buggy themselves, because the order and timing of test execution may affect their outcome. Avoid this by instantiating dependent objects for each test.

  • If you want to increase the readability of a test by reducing bloat - for instance by moving some variables to class scope or refactoring away some magic constants - make sure they are immutable in the outer scope (see the sketch below). If that is not an option, consider not moving them to the outer scope at all. Otherwise, you are vulnerable to undesirable behavior in your tests.
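
As a small sketch of what "immutable in the outer scope" can look like (the values are made up):

public class CmsServiceUnitTests {
  // Immutable at class scope: safe to share between tests.
  private const string ValidKey = "header";
  private static readonly string EntryContent = "Welcome!";

  // Risky: mutable state shared between tests (for instance a static field) lets the
  // order or timing of test execution change the outcome.
  // private static string lastReadContent;

  ...
}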

Is the test timely (relatively)?

A test that is never run is a bad test, as it adds no value. A test that motivates the developer not to run it, perhaps because it takes a lot of time to set up or it simply runs slowly, is a bad test.

  • A test is only slow relative to how often you'd like to run it. If a test suite is run once a month, because the code it touches is very static or because the suite is overly thorough relative to everyday development, it is acceptable that it takes longer to run. Test suites that are expected to run on a daily, perhaps even hourly, basis should be quick, though! Otherwise, you will end up getting more coffee than is healthy for you, with all those coffee breaks (one way to keep the everyday run quick is sketched below).
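
One pragmatic way of keeping the everyday run quick is to tag the slow, thorough tests so they can be filtered out of frequent runs, for example with an xUnit trait (the test and category names here are made up):

[Fact]
[Trait("Category", "Slow")]
public async Task CmsService_FullDatabaseRoundTrip_PersistsEntries() {
  // Thorough (and slow) - run by the nightly pipeline, not on every local build.
  ...
}

The frequent runs can then exclude these tests, for instance with dotnet test --filter "Category!=Slow".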

Is the test repeatable?

A test which has stochastic behavior - such as depending on real time, pseudo-random number generators or thread timing - is inherently a bad test, because a pass does not guarantee anything (it may be a false positive), and a failure can be hard to reproduce.

Believe me! In my more inexperienced days, I once spent two days trying to reproduce a test failure which turned out to only occur when the clock on the machine had gone past 19:00.

These qualities make the test unreliable. To fix this, you fix the uncertainties: mock the problematic dependencies - possibly introducing wrappers around them - in order to gain control over what they return (see the sketch after the bullets below). That way, the test result becomes reliable, although it is less general and more specific - but do not fool yourself! This was already the case; now we just know the specifics.

  • This also implies that you should not use a PRNG or the clock yourself to set up or run your tests!
  • Sometimes, uncertainty in your tests cannot be avoided, for instance when you are integration testing against some third-party code. In that case, there are strategies for avoiding false positives, such as repeating the run x times, but you should be aware that there are no absolutes in the realm of the probabilistic.
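
As an example of such a wrapper, the current time can be hidden behind a small interface: production code resolves it to the system clock, while tests resolve it to a fixed value. A minimal sketch, assuming a (hypothetical) CmsService that takes an IClock dependency:

public interface IClock {
  DateTime UtcNow { get; }
}

public class SystemClock : IClock {
  public DateTime UtcNow => DateTime.UtcNow;
}

[Fact]
public async Task WriteSectionAsync_AfterSevenPm_StillWritesEntry() {
  // Arrange - the clock is frozen at 19:30, so the test behaves the same on every run
  var clockMock = new Mock<IClock>();
  clockMock.Setup(c => c.UtcNow).Returns(new DateTime(2022, 1, 1, 19, 30, 0, DateTimeKind.Utc));
  ...
}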

What makes a test good?

Having uncovered what makes a test bad, the next logical step is to explore how to identify a good test. Rather than letting this blog post go on forever, I'll let you ponder the matter until the release of the second half of this short series on testing, wherein I'll elaborate further on testing and how to spot good tests - so watch out for part 2!

Conclusion

To sum up this blog post, we've now compiled a list of things to check for when you suspect a test of being bad and need to identify what needs to be changed.

In case you already forgot, here's a shorter version:

  1. Know what you are testing.
  2. Name the test accordingly.
  3. Test one thing at a time.
  4. Arrange your tests.
  5. Make your test readable.
  6. Make your test simple.
  7. Make your test timely.
  8. Make your test independent.
  9. Make your test repeatable.

... and if the test is without purpose, get rid of it.

Remark

If there is anything you wonder about, disagree with or would like added to this blog post - leave a comment!

References

My personal experience
https://docs.microsoft.com/en-us/dotnet/core/testing/unit-testing-best-practices
https://martinfowler.com/bliki/GivenWhenThen.html
https://gist.github.com/wojteklu/73c6914cc446146b8b533c0988cf8d29 (Clean Code by Robert C. Martin summary)
https://docs.microsoft.com/en-us/aspnet/core/test/integration-tests?view=aspnetcore-6.0
