We've all been there. By "all" I mean developers. You've finished your unit tests and now it's time to check code coverage. Nice. Above 80%, the results are looking good… But are they? You say to yourself: alright, I've reached the industry-standard target I read about somewhere, now I can run all those fancy tests which will be our guardians for future refactors, and everybody will be happy that we have them.
But, what if instead you asked yourself this: "Did I create tests just for the sake of the coverage numbers or are those tests really testing what matters?"
Let's talk about unit testing
Let's talk about unit testing of frontend applications and discover why code coverage can lead to a false feeling that your unit tests are good enough in terms of tested functionality. This piece is not about the quality of test code or anything like that. It is a discussion about a psychological switch in how we write our unit tests.
This whole philosophy comes down to how users use your application and the components you just wrote. Use cases will reveal errors that could happen when a user interacts with your app/components, or when external entities other than users interact with them (e.g. subscriptions over a WebSocket).
Let's take as an example a Todos application in a GitHub repository. Besides the main branch, it contains 2 additional branches: coupled-tests and decoupled-tests.
When you look into code coverage in both branches, you see that the percentage is pretty high.
The only difference between the branches, based on the code coverage reports, is that the decoupled-tests branch has lower coverage and fewer tests.
Now let me tell you why I consider almost all of the tests in the coupled-tests branch useless and why the code coverage is misleading in this case.
If you open the repository on the coupled-tests branch, you will find that every production code file has a corresponding file with tests.
A question comes to mind: why is there 1 test file for every component file? Someone might argue that these are unit tests: one component file represents one unit, and that unit is tested in the test file next to it. Yep, I've heard it many times. But is it the right reasoning? Remember what I said and try to think in terms of real end-user use cases. You will quickly figure out that one use case can span multiple production code files.
So, what does it mean that tests are coupled to production code? It means that the structure of your tests mirrors the structure of your production code, as in the example above. When that happens, tests become sensitive to changes in the production code, and if the production code is refactored, the tests will most probably fail. This is not good, as the point of refactoring is to alter the code's internal structure without changing its external behavior.
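To make this concrete, here is a minimal, framework-free sketch. The names `formatTodo`, `renderTodoListBefore` and `renderTodoListAfter` are hypothetical and not taken from the repository; they only stand in for a component before and after a refactor that keeps the output identical.

```typescript
type Todo = { title: string; done: boolean };

// Before the refactor: the list delegates to a small helper.
function formatTodo(todo: Todo): string {
  return `${todo.done ? "[x]" : "[ ]"} ${todo.title}`;
}
function renderTodoListBefore(todos: Todo[]): string[] {
  return todos.map(formatTodo);
}

// After the refactor: the helper is inlined, the output is identical.
function renderTodoListAfter(todos: Todo[]): string[] {
  return todos.map((t) => (t.done ? "[x] " : "[ ] ") + t.title);
}

// Decoupled check: it only knows the public entry point and what the user sees.
function userSeesTodos(render: (todos: Todo[]) => string[]): boolean {
  const lines = render([
    { title: "Write tests", done: true },
    { title: "Refactor", done: false },
  ]);
  return lines.join("\n") === "[x] Write tests\n[ ] Refactor";
}

// Coupled check: it reaches into the helper, so it cannot survive the
// refactor even though the external behaviour did not change.
function helperFormatsDoneTodo(): boolean {
  return formatTodo({ title: "Write tests", done: true }) === "[x] Write tests";
}
```

Both versions satisfy `userSeesTodos`, while `helperFormatsDoneTodo` has to be deleted or rewritten as soon as the helper disappears.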
So when we think about it, these tests become useless because they don't protect us against making mistakes when code is refactored or when we add new features. With every refactoring we will also need to refactor the tests, which increases both the risk of errors and the amount of maintenance.
How to decouple the tests from the production code?
We can design this test structure to be contra-variant with the production code. The best friend here is thinking in use cases. So if we take our Todos app, we can think of these use cases:
- User can view Todos
- User can add new Todo
- User can remove Todo
- User can mark Todo as done
- some error use cases: viewing, adding, removing, updating can fail
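These use cases translate almost directly into test names. The sketch below is only an illustration of that shape: it assumes a hypothetical in-memory `createTodosApp` and a tiny hand-rolled `useCase` runner instead of the real React components and test framework, and the "empty title" rule in the error case is my own assumption.

```typescript
type Todo = { id: number; title: string; done: boolean };

// Hypothetical stand-in for the application under test.
function createTodosApp() {
  let todos: Todo[] = [];
  let nextId = 1;
  return {
    view: (): Todo[] => todos.map((t) => ({ ...t })),
    add(title: string): void {
      if (title.trim() === "") throw new Error("Title is required"); // assumed rule
      todos.push({ id: nextId++, title, done: false });
    },
    remove(id: number): void {
      todos = todos.filter((t) => t.id !== id);
    },
    markDone(id: number): void {
      todos = todos.map((t) => (t.id === id ? { ...t, done: true } : t));
    },
  };
}

// Minimal runner: each entry is named after a use case, not after a file.
const results: Record<string, boolean> = {};
function useCase(name: string, check: () => boolean): void {
  try {
    results[name] = check();
  } catch {
    results[name] = false;
  }
}

useCase("User can view Todos", () => {
  const app = createTodosApp();
  app.add("Buy milk");
  app.add("Walk dog");
  return app.view().map((t) => t.title).join(",") === "Buy milk,Walk dog";
});
useCase("User can add new Todo", () => {
  const app = createTodosApp();
  app.add("Buy milk");
  return app.view()[0].title === "Buy milk";
});
useCase("User can remove Todo", () => {
  const app = createTodosApp();
  app.add("Buy milk");
  app.remove(app.view()[0].id);
  return app.view().length === 0;
});
useCase("User can mark Todo as done", () => {
  const app = createTodosApp();
  app.add("Buy milk");
  app.markDone(app.view()[0].id);
  return app.view()[0].done === true;
});
useCase("Adding can fail", () => {
  const app = createTodosApp();
  try {
    app.add("   ");
    return false; // the add should have been rejected
  } catch {
    return app.view().length === 0;
  }
});
```

Notice that nothing in the list of use cases says how many files or helpers sit behind `createTodosApp`, which is exactly the point.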
When the number of use cases is low, we can design the structure of unit tests as a part of one file. Based on how the application will grow, we can then split use cases into multiple files. The point is that test files will not mirror our production code.
Now, let’s have a look into the decoupled-tests branch.
As you will immediately notice, there are no more test files next to the production code; all of our tests are inside one test file, Todos.test.tsx, which contains all the mentioned use cases. The tests exercise only the TodoList.tsx component, and if we refactor TodoItem.tsx or AddTodo.tsx, the tests will still pass, as we are not changing external behaviour (which in this case is in TodoItem.tsx).
When we look again at the coupled-tests branch and its component tests, we notice that we are mocking the todos.ts service.
Now we are going to remove all the mocks and leave only the ones that are necessary. Ah, I am hearing a question! What are necessary mocks? Well, now we are getting into the difference between integration tests and unit tests. Necessary mocks are those that mock an integration with another system. In our example it is the communication with the server through Ajax calls made with the Fetch API*. So the Fetch API is our integration point with a different system, and this is the point where we introduce a mock in our tests. This is exactly what you can find in the decoupled-tests branch.
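As a rough sketch of the idea (not the code from the repository), assume a hypothetical `loadTodos` service and a `renderTodos` function standing in for a component. The `FetchLike` type, the `/api/todos` URL and passing the fetch function as a parameter are all assumptions made to keep the example self-contained; in a real test you would more likely replace the global `fetch` with your test framework's mocking tools.

```typescript
type Todo = { id: number; title: string; done: boolean };

// Minimal shape of the part of fetch this sketch relies on (an assumption,
// kept small so the example does not depend on DOM typings).
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Hypothetical service: the only place that talks to the server.
async function loadTodos(fetchFn: FetchLike): Promise<Todo[]> {
  const response = await fetchFn("/api/todos"); // hypothetical URL
  if (!response.ok) throw new Error("Loading todos failed");
  return (await response.json()) as Todo[];
}

// Hypothetical "component": real code running on top of the real service.
async function renderTodos(fetchFn: FetchLike): Promise<string[]> {
  try {
    const todos = await loadTodos(fetchFn);
    return todos.map((t) => `${t.done ? "[x]" : "[ ]"} ${t.title}`);
  } catch {
    return ["Could not load todos"];
  }
}

// Test doubles exist only at the integration point.
const serverReturns = (todos: Todo[]): FetchLike => async () => ({
  ok: true,
  json: async () => todos,
});
const serverFails: FetchLike = async () => ({
  ok: false,
  json: async () => ({}),
});
```

A "User can view Todos" test would call `renderTodos(serverReturns([...]))`, and the matching error use case would call `renderTodos(serverFails)`. In both cases the service and the rendering code run for real; only the server is faked.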
Maybe one could say that this is becoming an integration test. Is it? If it were, we would not even mock the Fetch API and would let our components do real communication with an external system. So, from my point of view, this is still a unit test.
And what's wrong with mocking non-integration points? Basically, you won't verify whether your code works correctly together, and you can always make mistakes in your mocks. That can cause a false feeling that everything is OK.
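Here is a small, deliberately contrived sketch of that false feeling. Everything in it is hypothetical: the `ServerTodo` wire format, the `parseTodos` service with an inverted flag, and the `openCount` function standing in for the code under test.

```typescript
type Todo = { title: string; done: boolean };
type ServerTodo = { title: string; completed: boolean }; // assumed wire format

// Real service, deliberately buggy for this sketch: the flag is inverted.
function parseTodos(payload: ServerTodo[]): Todo[] {
  return payload.map((raw) => ({ title: raw.title, done: !raw.completed }));
}

// Code under test, with the service passed in so a test can replace it.
function openCount(
  payload: ServerTodo[],
  parse: (p: ServerTodo[]) => Todo[],
): number {
  return parse(payload).filter((todo) => !todo.done).length;
}

const payload: ServerTodo[] = [
  { title: "Buy milk", completed: true },
  { title: "Walk dog", completed: false },
  { title: "Pay rent", completed: false },
];

// Over-mocked test: the hand-written mock behaves the way we wish the
// service did, so the bug in parseTodos is never executed.
const mockedParse = (): Todo[] => [
  { title: "Buy milk", done: true },
  { title: "Walk dog", done: false },
  { title: "Pay rent", done: false },
];
const mockedTestPasses = openCount(payload, mockedParse) === 2;

// Same expectation with the real service: only the server payload is faked.
const integratedTestPasses = openCount(payload, parseTodos) === 2;
```

The mocked check is green while the one that runs the real service is red, which is the only one of the two telling the truth about the code.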
* We don't have any backend for our simple app, but we mock it in the app with mock.ts, which represents a backend application. This mock has nothing to do with mocking in tests; it exists just to demonstrate async communication.
This is related to coupling tests to production code. If we manage to decouple tests from the production code, it rarely happens that implementation details are being tested. But what are implementation details? One can think of them as all the supporting code behind the main code. When a big component or class is refactored into small pieces, those pieces are usually implementation details. But they can also be the lower layers of a multilayered application. In a ReactJS app, these can be the Redux store, Sagas, services, etc. Those are also implementation details that users don't care about.
In our example, the todos.ts service and the TodoItem.tsx and AddTodo.tsx components are implementation details we don't want to test individually, as is done in the coupled-tests branch. Instead, all those files can be tested as a part of testing the TodoList.tsx component, as is done in the decoupled-tests branch. And as you can see in the code coverage above, those files are fully covered even though they are not tested explicitly. This allows us to refactor those internals without failing tests, and it requires less code, which means less maintenance.
And why does the decoupled-tests branch have lower test coverage than the coupled-tests branch? It's because in the decoupled-tests branch we don't test App.tsx. But if we really wanted 100% coverage in the decoupled-tests branch as well, it's easy to do. We can just replace the tested component TodoList in Todos.test.tsx with the App component, and the tests will show that everything is fine.
These are supporting tests written during development, before the whole feature is completed, e.g. when you write code with some algorithm and you want to verify that the algorithm works correctly. I call them development tests because they are really needed only during development. Once we verify that our algorithm works correctly, those tests become redundant, and we can safely remove them. Why? Try to answer the following questions:
- Is there any chance that those tests will ever fail?
- Is there any chance that I will need to update the algorithm with more features?
- Is there any chance that the algorithm will be changed in the future with a different implementation?
If the answer to those questions is "no", you can remove the test because it will show that everything is perfect all of the time, and that's a waste of your resources. And most probably this algorithm will be tested anyway as a part of some use case once you finish with the unit tests. So don't be afraid to remove tests!
None of my thoughts in this post are new. For example, Kent C. Dodds came up with the idea of the Testing Trophy instead of the test pyramid. The idea behind it is that most of the tests should be integration tests, not unit tests. Of course, this depends on how you define unit and integration tests. I lean toward calling them unit tests because we are just integrating our own components together in our tests, not external systems.
The term contra-variance is also widely known, especially in the TDD community, but I think it can be applied generally, even if you don't use the TDD approach. The way code coverage can give you a false feeling of having good tests is perfectly explained by Martin Fowler on his blog.
If you've reached this paragraph, I suppose I got you interested in this topic. I would like to encourage you to go into your unit test codebase and check whether you really test what matters. I would also like to point out that the thoughts in this post can be applied to any application layer, not just the frontend. Now let's end this post with a couple of statements. Do you agree? Let's continue this discussion in the comments section!
“Testing is not about test coverage numbers but rather about use cases verification”
“Use test coverage only as a guide to choose the next use case you will test”
“Having 0 tests is better than having a lot of bad tests”
“Tests are more code you need to take care of”
“Don't be afraid to remove tests”
“A couple of decoupled tests can cover more code than tens of coupled tests”