If you're writing an application with a database of some kind - and most of them do these days - and you're writing acceptance tests for the application, at some point you are going to come up against the conundrum of test data.
There are two different ways that this problem comes up, and both are important.
Firstly, there is the case of populating the database so that your tests work. Assuming you are following proper test principlesÂ¹, you have problems when it comes to testing any read-only actions. You need to be able to have some data in the system to actually retrieve - or else be able to guarantee the absence of this data.
The way I see it, there are 4 ways that you can solve this, but with different levels of viability:
- Use the existing APIs for populating the data. For example, if you are testing retrieving a user by ID, then use the APIs for Delete User and Create User to ensure that the database is in the correct state first. This does assume that such APIs exist, and it does muddy the testing a bit, since you are using something that might have failed testing to set up this test.
- Create a new API purely for seeding the database. You can have an endpoint that you pass an XML or JSON document to describing the database state, and have the backend set up the data correctly. This means more work to create this endpoint, but it is a lot more flexible than the above in the long run. It does open a potential security risk if you're not careful though
- Access the database directly from the tests. This is one that I see fairly often, and it works well enough, but it does mean duplication of knowledge between the code and the tests, and it does mean that the tests can only ever run from an environment that can access the database.
- Pre-seed the database with all of the required test data. This is the easiest option in some ways, because you have a single setup script and you're done. However, maintenance is a real pain.
My personal preferences are #2 or #3, depending on the size of the application under test.
The other problem that you have is the opposite - reading from the database to ensure that tests have run correctly. Again, there are 4 options that I can think of here, with varying levels of viability.
- Use the existing APIs. If you've just created a user, use the Retrieve User API to check the data. This will only ever show data that you can retrieve from the APIs, so will not be perfect.
- Create a new API purely for retrieving the database. Essentially a no-op wrapper around database access. This ends up being a lot of duplication around your real endpoints though
- Access the database directly from the tests.
- Don't bother about it.
Technically, #4 is the more correct answer for Acceptance Testing. After all, as long as the system does the correct thing from the outside it shouldn't matter how the insides work. That's what Unit and Integration tests are for. However, #4 does sometimes fall down when there are changes that don't have a visible outside effect but still need testing. Those cases are sufficiently rare that that's when I would be tempted to fall back on one of the earlier options.
1) By "Proper Test Principles", I mean that each test should perform exactly one action, and that the order of tests should be completely independent. Performing multiple actions in a test means you lose clarity of what is and isn't working, and tests being inter-dependent makes them brittle, and again means that one failing test can cause a cascade onto other tests.