Kieran Bond

Integration tests are king

Stop writing so many unit tests and join me on the dark side. Focus on integration testing instead of unit testing, and bask in the saved time and superiority.

Originally posted on my Substack here.

[Image: the Testing Diamond model]

A caveat before we get into the meat of it:

Most of the test suites I work in are for Web APIs and deployed systems - especially recently - so I am biased. Writing integration tests for Web APIs is more accessible than for other ‘systems’. When building other systems, such as libraries, consider ‘big unit’ tests, but know that unit tests still have their place.

Contents

  • Test suite composition
  • Why integration tests are king
  • The downfalls of integration testing
  • What makes an excellent integration test

Test suite composition

The ‘Testing Pyramid’ concept advises that most of your automated tests should be the ones that are quickest to write and execute - usually your unit tests. You have fewer tests as you move up the pyramid, but they are more integrated with the whole service. The three tiers, from bottom to top, are Unit testing, Integration testing, and E2E testing. Most people cite the testing pyramid as the reason you should have more unit tests than anything else. However, there are other options.

Unit tests are, generally, very targeted tests. They’ll often test a specific function with a particular input, expecting an output that exhibits a behaviour - or they should. This style of unit testing is a ‘small unit’ test. 

These ‘small unit’ tests are perfect for critical behaviours/functions but quickly become cumbersome for comprehensive coverage. Because they struggle to avoid implementation details, most functional changes force you to update the tests as well - at which point, why bother with them? You’ve fallen into an anti-pattern. Consider something better if you find yourself in this situation. Enter ‘large unit’ tests and integration tests.

One model called the ‘Testing Diamond’ or ‘Testing Honeycomb’ is more powerful (see the photo at the top!). Spotify highlights it as particularly great for testing Microservices. The main difference between the diamond and the pyramid is that Integration testing is where you focus most of your tests. 

I believe the testing diamond model is simply better than the testing pyramid - even outside microservices. The critical reason is that it moves you away from implementation details and towards ‘bigger’ tests: behaviour testing at the most significant scale you can still control, interacting with the system at the same boundaries your users do.

What is integration testing?

The key focus in integration testing is to check how the system-under-test (SUT) behaves with the systems it integrates with. For example, my service may interact with another service - I want to test some behaviours dependent on this other service without it being there. If I rely on the service being there, I may end up with flaky tests that randomly fail when it is offline or when it has been broken. Creating a stub you can control and inspect for the other service is standard - a fake upstream system.
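
To make that concrete, here’s a minimal sketch of such a fake upstream in .NET using the WireMock.Net library - my choice for illustration, not something this article prescribes. It assumes the SUT reads the upstream’s base URL from configuration, so tests can point it at the stub:

// Sketch only: WireMock.Net stands in for the real upstream service.
using WireMock.Server;
using WireMock.RequestBuilders;
using WireMock.ResponseBuilders;

var upstream = WireMockServer.Start();

// Serve a canned response for the endpoint the SUT depends on.
// The path and payload here are illustrative assumptions.
upstream
    .Given(Request.Create().WithPath("/prices/VOD").UsingGet())
    .RespondWith(Response.Create()
        .WithStatusCode(200)
        .WithBodyAsJson(new { ask = 101.5m, bid = 101.0m }));

// The SUT is then configured with upstream.Url as the dependency's address,
// and tests can inspect upstream.LogEntries to assert on outgoing calls.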

Integration testing allows us to test user flow through our system within our control bounds - at a high level. We can focus on how they interact with the system, from input to output, thus allowing us to determine if the behaviours we have built are correct without knowing implementation details. These tests have the added benefit of enabling us to assert there have been no side effects on other systems, as we usually have control over the ‘integrated’ systems via stubs/mocks. 

Why integration tests are king

Integration tests make it much harder to couple your testing to implementation details. They focus on input and output, so they are much less prone to breaking and are easier to keep up-to-date.

You see, you’re doomed to fail when your tests involve implementation details. Implementation details require constant upkeep and lead to biased tests. Whenever the related feature changes, you need to update your tests, and it’s all too easy to ‘cheat’ and do the easiest thing (such as faking behaviour) rather than testing the flow thoroughly.

By testing at a larger scale, you don’t mock out everything that makes the function behave the way it does - you get a true reflection of your system’s behaviour. You write fewer mocks, so there are fewer mocks to maintain. You write tests against genuine entry points to your system from a user’s perspective and assert against the same output your user would receive.

Another bonus of writing tests at the integration level is that they are generally ‘cleaner’. There’s rarely much setup because a stub already has defined behaviour and doesn’t need configuring for specific tests, so it’s simply a case of calling the entry point to your service and asserting on the result. I’ve had some tests be as small as two simple lines at this level.
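
At that size, a test reads almost like the behaviour description itself. For illustration, here’s a hypothetical one - assuming a shared fixture that exposes a ready-made Client, an assumed /ping endpoint, and a FluentAssertions-style assertion matching the example later in this post:

// Hypothetical example: Client comes from shared fixture setup.
// Requires: using System.Net; and an assertion library such as FluentAssertions.
public async Task Ping_Returns_Ok()
{
    var response = await Client.GetAsync("/ping");
    response.StatusCode.Should().Be(HttpStatusCode.OK);
}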

If you’re writing behaviour tests at this scale, you’ll ideally need fewer of these tests than you do ‘small unit’ tests. This can be easier to manage than the (probably) thousands of ‘small unit’ tests you write in every project. It’s also quicker to get off the ground.

A great, overlooked advantage of integration testing is that you have service stubs. You need them for your tests, but have you considered that you can also host them in lower (pre-production) environments?

Hosted stubs allow you to have development environments that don’t depend on other teams or systems. You can release more quickly into these environments and perform any manual testing you fancy in a controlled manner. This control is also great for introducing chaos engineering or performance testing. There’s no risk of breaking other systems; you can alter your stubs to suit the situation, you don’t need to coordinate with other teams when experimenting, and you can observe these stand-in dependencies directly to rule them out as the cause of whatever trouble you’re investigating in higher environments.

The downfalls of integration testing

As always, it’s not all roses. When is it? If you find some silver bullets, let me know.

Integration tests can be, and often are, slower to run. You need to start systems before testing, which is much more work than exercising a small unit. As the test suite grows, so does the execution time - not that unit testing is exempt from that. I’m keen to discover whether this becomes negligible at a big enough test-suite scale; as mentioned above, you don’t need as many integration tests as you do unit tests. The most considerable slowdown in integration tests is the startup process.
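
One way to soften that startup cost - my suggestion, not something this article mandates - is to pay it once and share the running SUT across tests, e.g. with an xUnit collection fixture:

// Sketch: SutFixture (hypothetical) boots the SUT and its stubs once;
// every test class in the "sut" collection reuses that same instance.
[CollectionDefinition("sut")]
public class SutCollection : ICollectionFixture<SutFixture> { }

[Collection("sut")]
public class OrderTests
{
    private readonly SutFixture _sut;

    public OrderTests(SutFixture sut) => _sut = sut;

    // Tests here hit the already-running system via _sut.Client.
}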

The beauty of being a ‘big unit’ is also a downfall. You’re not testing things at such a granular level; thus, it’s easier for tiny bugs to slip through the net. Arguably, they may not be bugs if they’re not affecting the overall behaviour - or your tests don’t cover enough behaviours.

Sometimes, depending on how your suite runs, integration tests can be harder to write and debug - especially without clear thought about the Developer Experience (DevXP) when building the framework. The difficulty usually comes from running the SUT in a Docker container to emulate a genuine service (depending on your architecture, etc.). You’ll either need tools to attach a debugger to this container or to spend the time ensuring you can run the SUT locally when testing. This ability is essential; otherwise, writing tests is slow, bugs are hard to investigate, and you’ll be limited in your ability to use TDD.

The most considerable downside of integration testing is that you now need system-level stubs. In most testing frameworks, creating mocks for ‘small unit’ tests is simple; thus, you don’t need to invest much time in getting them working for your tests. However, creating system-level stubs takes more time - often considerably more.

These stubs are more complex, and ideally, you want them to replicate the other systems’ behaviours, which means they take time to build - even if simplified. You can mitigate this by making an integration test framework one of your top priorities when starting a project - you can then evolve it over time with your system as needed. 
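
As a sketch of what that early investment could look like in .NET - assuming xUnit, ASP.NET Core’s WebApplicationFactory, and WireMock.Net, none of which are mandated here:

// Illustrative base class: boots the SUT in-memory, wires the stub in,
// and exposes Client/MarketDataStub to every test that inherits it.
public abstract class IntegrationTestBase : IAsyncLifetime
{
    protected HttpClient Client = default!;
    protected WireMockServer MarketDataStub = default!;

    public Task InitializeAsync()
    {
        MarketDataStub = WireMockServer.Start();

        // "MarketData:BaseUrl" is an assumed configuration key in the SUT.
        var factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(b =>
                b.UseSetting("MarketData:BaseUrl", MarketDataStub.Url!));
        Client = factory.CreateClient();
        return Task.CompletedTask;
    }

    public Task DisposeAsync()
    {
        MarketDataStub.Stop();
        return Task.CompletedTask;
    }
}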

Another downside to needing a stub - how do you know it behaves the same as the system it’s pretending to be? You can quickly instil false confidence in your system if you’re not careful. Mitigate this using E2E tests in your suite, potentially even having some that validate your stubs.

One problem you can sometimes experience when creating integration tests for a Web API is static typing. If you build your test suite in a separate repository, it can be hard to reference the static types you use in the API. It’s best to keep the tests colocated with the API so they’re easy to work with as part of the development process; otherwise, people won’t contribute.

What makes an excellent integration test

Short and sweet.

I’ve seen tests be as small as two lines (alongside some setup used for all integration tests). Keeping your test small and focused on a target behaviour is essential - it keeps the test readable and easy to comprehend. You can take a quick look at it and understand exactly which behaviour it covers and whether that behaviour is even being tested correctly (very useful for peer reviews). Make the call to your user-facing interface and validate the output/behaviour. That’s it.

Try to do most of the data seeding in your framework before running any tests. Ideally, the SUT is already configured before any tests run, so it behaves more like a real system - the closer to realism, the better. How often does your system get called without any data set up, really?
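
A sketch of that seeding, extending the kind of shared fixture sketched earlier (the endpoints and data are illustrative assumptions):

// Run once during fixture setup, before any test executes.
private async Task SeedAsync()
{
    // The stub answers price lookups for any symbol the SUT asks about.
    MarketDataStub
        .Given(Request.Create().WithPath("/prices/*").UsingGet())
        .RespondWith(Response.Create()
            .WithBodyAsJson(new { ask = 101.5m, bid = 101.0m }));

    // Seed reference data through the SUT's own API, so its state
    // resembles a real, already-running system.
    await Client.PostAsJsonAsync("/instruments", new { symbol = "VOD", currency = "GBP" });
}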

How I like to create them

I like to write integration tests first as part of TDD (and often skip unit tests altogether), so next time you’ve got a new feature to build or a bug to fix - give it a go. Before creating your new feature, define the behaviour in an integration test, specify the user interface in the test, and assert on the output from that interface. Then, write your feature. The idea is that you focus on the interactivity that matters - the user’s. If you concentrate on that up front, I find it makes the implementation more straightforward and gives you an easy starting point.

I like using the Arrange, Act, Assert methodology in my tests to keep them focused and clear - as I do in my unit testing (see this post for more).

Once that test passes, you can refactor as much as you like. You can then decide what might also be worthwhile unit tests to write. Don’t forget that you can always delete tests - you can use them to reassure yourself without adding them permanently to the test suite.

An example

Suppose you have a trading system that depends on the market’s stock prices (market data). When an ‘order’ (trade) is created in this system, we want to annotate the order with the current market price alongside the price it was executed at - this might be done to see the ‘improvement’ that the seller or buyer has achieved, aka profit.

So, the behaviour I want to test is: An order is annotated with market data prices, and the user receives the data correctly in the output.

public async Task CreateOrder_Annotates_With_MarketData_Touch()
{
    // Arrange
    var orderRequest = CreateOrderRequest(price: 42);

    // Act
    var order = await Client.PostCreateOrder(orderRequest);    

    // Assert
    order.TouchPrice.Should().Be(MarketDataStub.AskTouchPrice);
}

You would probably need a few unit tests to cover the same scenario, as it’s likely to encompass considerable parts of the system. However, you do need a stub to represent the market data integration, and depending on how you integrate with that system, it can be simple or tricky to create. In the example above, there’s a presumption that the market data stub has already seeded the trading system with prices.
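
For completeness, here’s one hypothetical shape for that stub - not the author’s actual implementation. Serving canned prices over HTTP while exposing the same values to test code is what lets the test above assert against MarketDataStub.AskTouchPrice:

// Hypothetical stub: one source of truth for the prices it serves,
// so tests can assert against the exact values the SUT consumed.
public sealed class MarketDataStub : IDisposable
{
    public const decimal AskTouchPrice = 101.5m;
    public const decimal BidTouchPrice = 101.0m;

    private readonly WireMockServer _server = WireMockServer.Start();

    public string Url => _server.Url!;

    public MarketDataStub()
    {
        // "/touch/*" is an assumed endpoint shape for the market data feed.
        _server
            .Given(Request.Create().WithPath("/touch/*").UsingGet())
            .RespondWith(Response.Create()
                .WithBodyAsJson(new { ask = AskTouchPrice, bid = BidTouchPrice }));
    }

    public void Dispose() => _server.Stop();
}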

You would want one (or many) unit tests to verify that market data updates are fed correctly through the system from ingress; you would want another (or more) to confirm that an order is annotated with market data. You would want at least one more to ensure the API returns an order object correctly.

Because that’s a lot of tests to write, I generally prefer integration tests when using TDD; I find they help me move more quickly while staying just as accurate, without compromise.

In Summary

If you want to feel closer to the authentic experience of using your service, write more integration/‘big unit’ tests. Do them first with TDD, and you’ll make the system better for users.

Try replacing some of your unit test suites with integration tests - you’ll hopefully see reduced coupling (to implementation behaviour) in your tests and a significant reduction in the number of tests that need to be written.

If you’re not creating a web API, you can still write larger unit tests focused on functional/behaviour flow. Try moving some code towards that instead of small, focused units.

Just don’t forget that other tests are still necessary; a well-rounded suite helps avoid the pitfalls of each test type.
