Problems In Integration Testing
Writing API integration test scripts can easily turn into a mess. Two problems are, in my opinion, the root of the chaos: the first is how to prepare test data, and the second is how to see and validate side effects that are not immediately observable through the output of the API.
It is common for a non-trivial system to integrate with many external dependencies. For example, a monolithic API service might have a minimal architecture like this.
Test engineers often ask me how they can write a script that connects to the database to modify some data before running the test and to validate states in the database that are not observable through the output of the API. This situation is unavoidable. I know this because I once made a failed attempt to force the integration test scripts to use only the APIs available to the client applications. For example, to test the product checkout API of an e-commerce system, the test script had to use existing APIs to create users, products (and their categories), a shopping cart, and so on, before the actual test logic could be executed. The script turned out to be too complex, time-consuming, slow, unmaintainable, and fragile.
The script can be a lot simpler if we just allow it to modify and query the database to validate the result, thanks to the universal and simple nature of SQL. But when more than a database is involved, the script becomes a mess again: for example, it has to connect to a file storage service to upload a prerequisite file and then test whether the file has been processed as expected. Sometimes the script has to call third-party API services to check the result. You can imagine the even more complex cases as the system grows bigger. We have to "hack" the infrastructure and security of the system to make testing possible at all, and the test script becomes dependent on this hacking as well as on all the dependencies behind the API service under test. In the worst case, you end up duplicating all the middleware and integration logic of the API service itself, and how do you know that such logic inside the test script is less buggy than that of the service?
Manual Operations
That said, what I have seen people achieve so far is to rely on manual operations to prepare all the prerequisite data in every dependency before running the test. Some have good strategies, such as creating a pile of database dumps and labeling each of them for a specific test case. Either way, it is not easy to automate the integration test script in a CI/CD pipeline.
Insight From IoT
This problem had been stuck in my head for some time, until recently I started learning IoT development. At the time I had zero knowledge of electronics but somehow managed to buy an ESP8266 microcontroller and a Raspberry Pi on a whim.
My first experience with the Pi was interesting. The seller told me I should buy a monitor together with the Pi, but I refused, thinking that as a programmer I didn't need to waste my money on a monitor since I could SSH into the Pi from my Mac. I followed all the instructions to boot the Pi with a Wi-Fi connection, but I got stuck because I could not SSH into it for some unknown reason. I had to go back to the store to buy a monitor; then I could see what was going on inside the Pi and fix every problem in no time. That is when I understood the power of a monitor. Without it, I would have to "hack" the Pi somehow (if I even could) to be able to understand the system. Then I realized this is similar to the problem in integration testing. If we have a "monitor" that shows all the states related to the API under test, we have enough information to validate the system without having to sneak into it. So I had an idea: what if we architect our system such that when developers build an API, they have to build a monitor together with it?
But that only solves half of the problem, because a monitor cannot be used to prepare test data. So I extended my idea and introduced a "flasher" counterpart to the monitor. A monitor shows the "state" of the system related to the API under test; a flasher, on the other hand, takes an expected "state" and then "flashes" the system into that state before the test runs. I use the term "flash" as a metaphor because it is similar to flashing my ESP8266 microcontroller with a specific version of my program before running my test script for a specific case.
The following diagram depicts my idea.
Please notice that, with this strategy, the test script does not have to care about the external dependencies. The system under test is responsible for querying the states of all dependencies and combining them with its own state to produce the final information for the monitor. To prepare the test data, the test script defines the expected state of the whole integrated system and puts it to the flasher; the system under test is, again, responsible for updating its own state and all dependencies according to that expected state. I think this makes a lot of sense, because the system under test already has all the infrastructure needed to interact with its dependencies.
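In TypeScript terms, the contract I have in mind looks roughly like the sketch below; the names are only illustrative, and the concrete version for my project appears in the next section.
// Illustrative contract only -- these names are not part of any framework.
interface MonitorFlasher<State> {
  // Combine the service's own state with the state of its dependencies
  // into one snapshot that the test script can assert against.
  monitor(): Promise<State>;
  // Drive the service and all of its dependencies into the expected
  // state before a test case runs.
  flash(expected: State): Promise<void>;
}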
An API with built-in monitors and flashers
I have experimented with this idea in my IoT project, which uses the Serverless Framework to create Lambda functions for an OTA (over-the-air update) feature. But I will show only general code to exemplify the idea of the monitor and flasher; you don't need to know anything about the Serverless Framework or OTA, and I hope you can apply the idea to any case you wish.
The functionality to be tested here is an API to publish a firmware version. Imagine you have built a program for your microcontroller in the my-iot project as a .bin file, e.g. firmware-v99.bin. You have uploaded the file to a storage service; I will use AWS S3 here. Now you are going to publish the just-uploaded firmware file so that every device knows which is the latest firmware version to use. Though the firmware files are kept in S3, the data about the project and its latest firmware version is kept in a database table. The following picture depicts the relation between them.
Notice that latest_version for project my-iot in the table is pointing to firmware-v98.bin, and you are going to update it to the just-uploaded firmware my-iot/firmware-v99.bin. To publish the firmware version, you have to call a REST API like this.
curl -X POST "https://my-api-url/firmwares/publish?projectKey=my-iot&firmwareKey=firmware-v99.bin" -H "x-api-key: xyz"
If the request has been processed successfully, the response should have status 200 with an empty body, and latest_version in the database table should become firmware-v99.bin for project my-iot. But the request can also fail with response status 400 when projectKey or firmwareKey does not exist in the system.
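Before we look at the test script, here is a rough sketch of the business logic such a publish handler might contain. The helpers findProject, firmwareExists, and setLatestVersion are hypothetical stand-ins for the real database and S3 access code, not part of my actual service.
// Hypothetical data-access helpers standing in for the real database and S3 code.
declare function findProject(projectKey: string): Promise<object | null>;
declare function firmwareExists(projectKey: string, firmwareKey: string): Promise<boolean>;
declare function setLatestVersion(projectKey: string, firmwareKey: string): Promise<void>;

interface PublishResult {
  status: number;
  body: { errorCode?: string };
}

// Sketch of the publish logic: validate both keys, then move latest_version.
async function handlePublish(projectKey: string, firmwareKey: string): Promise<PublishResult> {
  if (!(await findProject(projectKey))) {
    return { status: 400, body: { errorCode: 'UNKNOWN_PROJECT_KEY' } };
  }
  if (!(await firmwareExists(projectKey, firmwareKey))) {
    return { status: 400, body: { errorCode: 'UNKNOWN_FIRMWARE_KEY' } };
  }
  // Point latest_version of the project to the newly uploaded firmware file.
  await setLatestVersion(projectKey, firmwareKey);
  return { status: 200, body: {} }; // success: empty body
}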
Let's start with the test script for the case when projectKey does not exist. Here I use Jest as the test engine and Axios as the HTTP client; the language is TypeScript.
import axios from 'axios';

const apiKey = 'xyz';
const apiUrl = 'https://my-api-url';

// The combined state of the system under test and its dependencies, keyed by project.
interface State {
  [key: string]: {
    latest: string | null;
    firmwares: string[];
  };
}

// Read the current state from the monitor of the API under test.
async function fetchMonitor(): Promise<State> {
  const response = await axios.get(`${apiUrl}/firmwares/publish`, {
    headers: {
      'x-api-key': apiKey,
      'x-mode': 'monitor',
    },
  });
  expect(response.status).toBe(200);
  expect(response.headers['x-schema']).toBe('MONITOR');
  return response.data;
}

// Flash the system under test into the expected state.
async function flash(state: State): Promise<void> {
  const response = await axios.put(
    `${apiUrl}/firmwares/publish`,
    {
      ...state,
    },
    {
      headers: {
        'x-api-key': apiKey,
        'x-mode': 'flash',
      },
    }
  );
  expect(response.status).toBe(200);
  expect(response.headers['x-schema']).toBe('FLASH');
}

// An empty state means we expect no project to exist in the system.
async function flashRemoveAllProjects(): Promise<void> {
  await flash({});
}
it(`should reply 400 with body { "errorCode": "UNKNOWN_PROJECT_KEY" } when input projectKey does not exist in the system`,
  async () => {
    const mockProjectKey = 'mockProjectKey';
    await flashRemoveAllProjects();
    const state = await fetchMonitor();
    expect(Object.keys(state)).not.toContain(mockProjectKey);

    // The request is expected to be rejected, so capture the error.
    let error: any;
    try {
      await axios.post(`${apiUrl}/firmwares/publish`, null, {
        headers: {
          'x-api-key': apiKey,
        },
        params: {
          projectKey: mockProjectKey,
          firmwareKey: 'anything',
        },
      });
    } catch (err) {
      error = err;
    }
    expect(error).toBeDefined();
    expect(error.response.status).toBe(400);
    expect(error.response.headers['x-schema']).toBe('ERROR');
    expect(error.response.data).toHaveProperty('errorCode', 'UNKNOWN_PROJECT_KEY');
  }
);
Please first notice the declaration of the State interface. It defines the structure of the output of our monitor, as well as the input of our flasher. As per the theory, the state should show all information behind the system related to the API under test, which in this case is the name of every project, the latest firmware version of each project, and the list of available firmware files in the S3 bucket for each project. An example of State data is as follows.
{
  "my-iot": {
    "latest": "firmware-v98.bin",
    "firmwares": [
      "firmware-v1.bin",
      "firmware-v2.bin",
      "firmware-v98.bin"
    ]
  },
  "my-smartfarm": {
    "latest": "firmware-v2.bin",
    "firmwares": [
      "firmware-v1.bin",
      "firmware-v2.bin"
    ]
  }
}
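How this state gets assembled is up to the service. As a rough sketch (not my actual implementation), a service-side monitor handler backed by a DynamoDB table and an S3 bucket could look like the following; the table name projects, the bucket name my-firmware-bucket, and the attribute names are all assumptions for illustration.
import { S3Client, ListObjectsV2Command } from '@aws-sdk/client-s3';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, ScanCommand } from '@aws-sdk/lib-dynamodb';

// Same shape as the State interface used by the test script.
interface State {
  [key: string]: { latest: string | null; firmwares: string[] };
}

// Hypothetical resource names -- adjust to your own table and bucket.
const PROJECTS_TABLE = 'projects';
const FIRMWARE_BUCKET = 'my-firmware-bucket';

const s3 = new S3Client({});
const db = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export async function monitor(): Promise<State> {
  const state: State = {};
  // Read every project row, e.g. { projectKey: 'my-iot', latest_version: 'firmware-v98.bin' }.
  const projects = await db.send(new ScanCommand({ TableName: PROJECTS_TABLE }));
  for (const project of projects.Items ?? []) {
    // List the firmware files stored under the project's prefix in the bucket.
    const objects = await s3.send(
      new ListObjectsV2Command({ Bucket: FIRMWARE_BUCKET, Prefix: `${project.projectKey}/` })
    );
    state[project.projectKey] = {
      latest: project.latest_version ?? null,
      firmwares: (objects.Contents ?? []).map((obj) => obj.Key!.split('/').pop()!),
    };
  }
  return state;
}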
Following the State declaration are the functions to fetchMonitor and to flash the system. The former sends a GET request with the header x-mode: monitor to the same endpoint as the API under test. The latter sends a PUT request with the header x-mode: flash. I have set up middleware that enforces that every API includes handlers for these requests, which I will talk about shortly.
The function flashRemoveAllProjects does just what its name implies. It invokes the function flash() with the empty state {}, which indicates that we expect no project to exist in the system. In the test function, we use it to establish our assumption before executing the actual test.
const mockProjectKey = 'mockProjectKey';
await flashRemoveAllProjects();
const state = await fetchMonitor();
expect(Object.keys(state)).not.toContain(mockProjectKey);
Next is the example code to test the case of an unknown firmwareKey.
async function flashSetProject(projectKey: string): Promise<void> {
  const state: State = {};
  state[projectKey] = {
    firmwares: [],
    latest: null,
  };
  await flash(state);
}
it(`should reply 400 with { errorCode: "UNKNOWN_FIRMWARE_KEY" } when input firmwareKey does not exist in the system`,
  async () => {
    const mockProjectKey = 'mockProjectKey';
    const mockFirmwareKey = 'mockFirmwareKey';
    await flashSetProject(mockProjectKey);
    const state = await fetchMonitor();
    expect(Object.keys(state)).toContain(mockProjectKey);
    expect(state[mockProjectKey].firmwares).not.toContain(mockFirmwareKey);

    // Again, the request is expected to be rejected, so capture the error.
    let error: any;
    try {
      await axios.post(`${apiUrl}/firmwares/publish`, null, {
        headers: {
          'x-api-key': apiKey,
        },
        params: {
          projectKey: mockProjectKey,
          firmwareKey: mockFirmwareKey,
        },
      });
    } catch (err) {
      error = err;
    }
    expect(error).toBeDefined();
    expect(error.response.status).toBe(400);
    expect(error.response.headers['x-schema']).toBe('ERROR');
    expect(error.response.data).toHaveProperty('errorCode', 'UNKNOWN_FIRMWARE_KEY');
  }
);
In this case, we define the function flashSetProject to set up an empty project in the system: a project with no latest firmware version and no firmware files in its list. Then we use it in the test to ensure our assumption that the input projectKey exists but the firmwareKey doesn't.
const mockProjectKey = 'mockProjectKey';
const mockFirmwareKey = 'mockFirmwareKey';
await flashSetProject(mockProjectKey);
const state = await fetchMonitor();
expect(Object.keys(state)).toContain(mockProjectKey);
expect(state[mockProjectKey].firmwares).not.toContain(mockFirmwareKey);
The final test case is the happy path. We have to ensure that both of our inputs, projectKey and firmwareKey, exist before running the test logic. Here we define another function, flashSetProjectAndUploadFirmwareKey, for this.
async function flashSetProjectAndUploadFirmwareKey(projectKey: string, uploadedFirmwareKey: string): Promise<void> {
  const state: State = {};
  state[projectKey] = {
    latest: null,
    firmwares: [uploadedFirmwareKey],
  };
  await flash(state);
}
Please notice that we don't actually upload any files to S3 from the test script. It is up to the system under test to implement this flashing logic; it may upload some dummy file behind the scenes to the real S3 service or mock the S3 system entirely, whichever is more convenient. And here is how we use it in the main test script.
it(`should be able to publish the firmware successfully.`,
  async () => {
    const mockProjectKey = 'my-iot';
    const mockFirmwareKey = 'firmware-v99.bin';
    await flashSetProjectAndUploadFirmwareKey(mockProjectKey, mockFirmwareKey);
    const state = await fetchMonitor();
    expect(Object.keys(state)).toContain(mockProjectKey);
    expect(state[mockProjectKey].firmwares).toContain(mockFirmwareKey);
    expect(state[mockProjectKey].latest).toBe(null);

    await axios.post(`${apiUrl}/firmwares/publish`, null, {
      headers: {
        'x-api-key': apiKey,
      },
      params: {
        projectKey: mockProjectKey,
        firmwareKey: mockFirmwareKey,
      },
    });

    const stateAfter = await fetchMonitor();
    expect(Object.keys(stateAfter)).toContain(mockProjectKey);
    expect(stateAfter[mockProjectKey].firmwares).toContain(mockFirmwareKey);
    expect(stateAfter[mockProjectKey].latest).toBe(mockFirmwareKey);
  }
);
Here, after publishing the firmware, I fetch the stateAfter from the monitor and check that stateAfter[mockProjectKey].latest actually becomes mockFirmwareKey as expected.
With all these examples, we can see the power of our "flasher" and "monitor". Without them, we would have to write code in our test script to clean up and upload files to S3, and write some SQL to update the database table. We would also have the bothersome task of setting up infrastructure to allow the test script to access the S3 bucket and the database, which should be private to the API service in production. We might also need some middleware that knows how the firmware files of each project are structured in S3 (because S3 doesn't have a concept of a directory). And if the API service decided to switch its database or file storage service to another vendor, the test script would have to be updated too. All of this makes the test script unmaintainable.
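To see where that delegated responsibility ends up, here is a rough sketch of what a service-side flasher could do with the expected state, using the same hypothetical DynamoDB table and S3 bucket names as in the monitor sketch above; a real implementation may differ, for example by mocking S3 entirely.
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, ScanCommand, DeleteCommand, PutCommand } from '@aws-sdk/lib-dynamodb';

// Same shape as the State interface used by the test script.
interface State {
  [key: string]: { latest: string | null; firmwares: string[] };
}

// Same hypothetical resource names as in the monitor sketch.
const PROJECTS_TABLE = 'projects';
const FIRMWARE_BUCKET = 'my-firmware-bucket';

const s3 = new S3Client({});
const db = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export async function flasher(expected: State): Promise<void> {
  // Remove every existing project row so that only the expected state remains
  // (cleaning up stale S3 objects is omitted here for brevity).
  const existing = await db.send(new ScanCommand({ TableName: PROJECTS_TABLE }));
  for (const item of existing.Items ?? []) {
    await db.send(new DeleteCommand({ TableName: PROJECTS_TABLE, Key: { projectKey: item.projectKey } }));
  }
  // Recreate the expected projects and their firmware files.
  for (const [projectKey, { latest, firmwares }] of Object.entries(expected)) {
    await db.send(new PutCommand({ TableName: PROJECTS_TABLE, Item: { projectKey, latest_version: latest } }));
    for (const firmwareKey of firmwares) {
      // A dummy payload is enough: the tests only care that the object exists.
      await s3.send(
        new PutObjectCommand({ Bucket: FIRMWARE_BUCKET, Key: `${projectKey}/${firmwareKey}`, Body: 'dummy' })
      );
    }
  }
}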
Finally, let me show you the middleware of the API that forces developers to always include a flasher and a monitor handler in every API.
The following code is the entry point of the Lambda function; it is like a controller in other frameworks, if you are not familiar with Lambda. Anyway, please focus on Middleware.integrate(...), which is my custom middleware.
import { APIGatewayProxyHandler } from 'aws-lambda';
import Middleware from 'my-middleware';
import handler from 'publish-firmware/handler';
import { monitor, flasher } from 'common/handlers';

export const publishFirmware: APIGatewayProxyHandler = async (event) => {
  return Middleware.integrate(
    'POST',
    event,
    handler,
    monitor,
    flasher
  );
};
I won't show the code inside Middleware.integrate for brevity. Here I would like to exemplify how you can architect your code so that every API controller returns a Middleware.integrate(...), which, in turn, is responsible for routing the request to either the handler, the monitor, or the flasher, depending on your strategy. For mine, every request with method PUT and header x-mode: flash goes to the flasher, every request with method GET and header x-mode: monitor goes to the monitor, and a request whose method matches the first parameter, i.e. POST in this case, goes to the main handler. I also have a configuration to disable this behavior in production. Please also notice that the monitor and flasher do not have to be specific to a single API handler; you can have a shared monitor and flasher for a group of APIs, which is why I put them in common/handlers instead of publish-firmware/handler.
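To make these routing rules concrete, here is a simplified sketch of how such a middleware could dispatch requests. It is not my actual Middleware.integrate; error handling and the x-schema response headers are left out, and the STAGE environment variable used to disable the test endpoints in production is an assumption.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

type LambdaHandler = (event: APIGatewayProxyEvent) => Promise<APIGatewayProxyResult>;

const Middleware = {
  async integrate(
    method: string,
    event: APIGatewayProxyEvent,
    handler: LambdaHandler,
    monitor: LambdaHandler,
    flasher: LambdaHandler
  ): Promise<APIGatewayProxyResult> {
    const mode = event.headers['x-mode'];

    // The monitor and flasher are only reachable outside production.
    if (process.env.STAGE !== 'production') {
      if (event.httpMethod === 'GET' && mode === 'monitor') {
        return monitor(event);
      }
      if (event.httpMethod === 'PUT' && mode === 'flash') {
        return flasher(event);
      }
    }

    // Everything else must match the method declared for the main handler.
    if (event.httpMethod === method) {
      return handler(event);
    }
    return { statusCode: 405, body: '' };
  },
};

export default Middleware;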
Conclusion
In this post, we have seen the idea of making every API have a built-in monitor and flasher. Some may argue that this takes time and may not be worth the effort. Anyway, I have found it very useful as a developer, because this strategy leads me to think more thoroughly about all the possible states affected by my API. I have uncovered bugs that I could not have seen if I were not writing the code to handle the monitor and flasher.
I think the greatest benefit of this method is that it enables automated integration testing in the CI/CD pipeline. Integration testing is the real proof that our system works as per the specifications or requirements, but most of the time we have to rely only on the coverage of unit testing, because some cases are too complex to test at the integration level. Unit testing, however, is more or less an art: you cannot enforce a deterministic strategy for how and where things should be tested. On the contrary, it is easy to set up a rule that every API should have a corresponding integration test script covering at least what is in the specification. And now, with the monitor and flasher strategy, we eliminate the nondeterministic factors from the integration test script. Every developer knows how to write the test script and follows the same pattern. All the complex responsibilities of dealing with dependencies are delegated to the API under test, which already has the full infrastructure to handle them.