How to create an automated test suite - get from 0 to 90% code coverage with a single command and GPT-4

Creating a test suite from scratch can certainly seem like a daunting task, especially if you have an already-developed codebase with no tests. However, it is crucial to implement automated tests if you are looking to scale up your software project.

In this blog post, you'll see how to use the power of GPT-4 to create your test suite from scratch in just a couple of minutes with Pythagora. You can reach about 90% code coverage with essentially a single command.

If you are familiar with unit tests and other types of automated tests, feel free to skip to "How to kickstart a test suite" to see how to generate the entire test suite for your repo.

What type of tests to start with?

There are many types of tests, but the main ones are unit tests, integration tests, and end-to-end (E2E) tests.

  • Unit tests: the simplest type of test. They usually test just a single function. They are also the fastest type of test, and they can show exactly where a bug is, whereas E2E tests in most cases don't show its location. Because they are so fast, you can run them with a watcher (on every code change) and get feedback almost immediately.

  • Integration tests: these test a bigger part of the codebase, usually multiple functions combined in some way. They show how the codebase behaves in a real-world scenario, whereas unit tests only test functions in isolation from the rest of the system. They are slower than unit tests but still fast enough that a developer can run them before a commit or after a bigger change in the codebase.

  • E2E (end-to-end) tests: these mimic user behaviour and test the entire system, from the user's action all the way to the resulting reaction in the UI. That means they test the frontend, the backend, the database, and everything in between. Their main downside is that they are very slow to run, since they need the entire UI spun up (e.g. a headless browser or a mobile app emulator), so they usually run in the CI/CD pipeline before code is deployed to production rather than on a developer's local machine.

Which tests you should write, and in what proportion relative to the other types, is still a heavily debated topic. Some say there should be the most unit tests, then integration tests, and the fewest E2E tests. This is called the Testing Pyramid - you can read more about it here. In that light, we'll focus on creating unit tests, since they are the simplest to make but still provide many benefits.
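
To make the difference between the first two types concrete, here is a minimal sketch in plain Node.js. The functions `parsePrice`, `applyDiscount`, and `checkoutTotal` are hypothetical and made up for this example. The checks use Node's built-in `assert` module so the snippet runs without any setup; in a Jest suite, each check would typically become an `expect(...)` call inside a `test(...)` block.

```javascript
const { strictEqual, throws } = require('node:assert');

// Hypothetical functions under test (not taken from any real codebase).
function parsePrice(text) {
  const value = Number.parseFloat(String(text).replace('$', ''));
  if (Number.isNaN(value)) {
    throw new TypeError(`Not a price: ${text}`);
  }
  return value;
}

function applyDiscount(price, percent) {
  // Work in cents to avoid most floating point surprises.
  return Math.round(price * (100 - percent)) / 100;
}

// Combines both functions: the kind of code an integration test targets.
function checkoutTotal(priceText, percent) {
  return applyDiscount(parsePrice(priceText), percent);
}

// Unit tests: one function at a time, so a failure points at that function.
strictEqual(parsePrice('$20.00'), 20);
throws(() => parsePrice('free'), TypeError);
strictEqual(applyDiscount(20, 25), 15);

// Integration-style test: checks that the pieces work together.
strictEqual(checkoutTotal('$20.00', 25), 15);
```

If the last check failed while the first three passed, you would know the problem is in how the functions are combined, not in either function itself.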

What to consider when writing automated tests?

If you have never written tests, you may believe that automated tests are only used to verify that a part of your code works as expected. While this is indeed one of the primary goals of automated tests, it is definitely not the only one.

  • Documentation: Automated tests serve as hands-on documentation for your codebase. When a new developer starts working on a part of the codebase that they've never seen before, if there are automated tests for that part of the codebase, they can run the tests and understand how the code works.
  • Debugging: Tests, especially unit tests, are important for identifying where a bug is in the code. This is one of the reasons why having only end-to-end or integration tests is not enough. Usually, all test types are run together: end-to-end and integration tests check how the system works in the real world, whereas unit tests pinpoint where a bug might be occurring and cover different edge cases.
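
Both points can be illustrated with a small sketch. The `chunk` helper below is a hypothetical implementation written only for this example (it is not lodash's code), and the checks again use Node's built-in `assert` module in place of Jest. Each named case doubles as a line of documentation, and if one fails, its name tells you which behavior broke.

```javascript
const { deepStrictEqual } = require('node:assert');

// Hypothetical helper written for this example; not lodash's implementation.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    return [];
  }
  const result = [];
  for (let i = 0; i < items.length; i += size) {
    result.push(items.slice(i, i + size));
  }
  return result;
}

// Each case names one behavior, including the edge cases.
const cases = [
  { name: 'splits evenly', input: [[1, 2, 3, 4], 2], expected: [[1, 2], [3, 4]] },
  { name: 'keeps a shorter last chunk', input: [[1, 2, 3], 2], expected: [[1, 2], [3]] },
  { name: 'returns [] for an empty array', input: [[], 3], expected: [] },
  { name: 'returns [] for a size below 1', input: [[1, 2], 0], expected: [] },
];

for (const { name, input, expected } of cases) {
  // The case name is used as the failure message.
  deepStrictEqual(chunk(...input), expected, name);
}
```

A new developer reading the `cases` list learns what `chunk` does with empty input or an invalid size without having to read the implementation.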

How to kickstart a test suite

Getting started is always the hardest part, especially when you have an entire codebase but no tests. If this is the case, you will likely have to write hundreds or thousands of tests. This is where Pythagora comes in. With it, you can run a single command, take a break for lunch, and return to a fully written test suite. However, this is not the end of the process, as you still need to spend a few hours reviewing the tests. Nonetheless, with Pythagora, you can have an entire test suite ready in just one day. Let's see how it works.

1. First, install Pythagora with:

npm i pythagora

2. You will need either an OpenAI API key (with GPT-4 access) or a Pythagora API key, which you can get here. Once you have one of these, add it to the config with the matching command below:

npx pythagora --config --pythagora-api-key <PYTHAGORA_API_KEY>


npx pythagora --config --openai-api-key <OPENAI_API_KEY>

3. Finally, run the following command from the root of your repo:

npx pythagora --unit-tests --path ./

That's all for now. You can take a break and return once the test generation is complete. As a rough estimate, generating 100 tests will take approximately 30 minutes, so depending on the size of your project, it may take a bit longer.

Once you have the tests, it's important to review them. I recommend starting with the failed tests. You can run all of the generated tests with the following command:

npx jest ./pythagora_tests

This command will run all tests saved in the pythagora_tests folder, where Pythagora stores them. Check any failed tests to determine whether the failure is due to a syntax mistake (which GPT sometimes makes) or a bug in the code.
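
If there are many failures, a small script can help with this first pass. The sketch below is only one possible approach, not part of Pythagora. It assumes the report shape that Jest writes with its `--json` flag (`testResults`, `status`, `assertionResults`, `fullName`), and it assumes that a test file which could not run at all (for example because of a syntax mistake) shows up as failed with no assertion results. The `triageFailures` function and the sample data are hypothetical.

```javascript
const { deepStrictEqual } = require('node:assert');

// Sorts failures from a Jest `--json` style report into two buckets.
// Assumption: a file that could not run has status "failed" and no assertionResults.
function triageFailures(report) {
  const couldNotRun = [];
  const failedAssertions = [];
  for (const file of report.testResults ?? []) {
    if (file.status !== 'failed') continue;
    const assertions = file.assertionResults ?? [];
    if (assertions.length === 0) {
      // Likely a syntax or import problem in the generated test file.
      couldNotRun.push(file.name);
      continue;
    }
    for (const result of assertions) {
      if (result.status === 'failed') {
        // The test ran but an expectation failed: possibly a real bug.
        failedAssertions.push(`${file.name}: ${result.fullName}`);
      }
    }
  }
  return { couldNotRun, failedAssertions };
}

// Hand-written sample in the assumed report shape (not real Pythagora output).
const sampleReport = {
  testResults: [
    { name: 'pythagora_tests/a.test.js', status: 'failed', assertionResults: [] },
    {
      name: 'pythagora_tests/b.test.js',
      status: 'failed',
      assertionResults: [
        { fullName: 'add handles negative numbers', status: 'failed' },
        { fullName: 'add sums two numbers', status: 'passed' },
      ],
    },
    { name: 'pythagora_tests/c.test.js', status: 'passed', assertionResults: [] },
  ],
};

deepStrictEqual(triageFailures(sampleReport), {
  couldNotRun: ['pythagora_tests/a.test.js'],
  failedAssertions: ['pythagora_tests/b.test.js: add handles negative numbers'],
});
```

For a real run, you could write the report with `npx jest ./pythagora_tests --json --outputFile=report.json` and pass the parsed file to the function. The first bucket is where GPT's syntax mistakes are most likely to be found, and the second is where to look for actual bugs in the code.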

After reviewing any failed tests, evaluate all tests to determine whether they are meaningful enough to keep. It's not ideal to have too many tests, so feel free to delete any that you think are not necessary.

Example repo

I gotta say, I was pretty surprised at how well GPT performed. I ran tests on a bunch of different repositories, and it was able to find some pretty tricky edge cases that would've been tough to find otherwise. The tests it generated were on point, and it found bugs right off the bat.

Here's a fork of the lodash repo where I ran the command above to generate tests with Pythagora. It took a minute (well, actually, like four hours), but the results were pretty impressive.

Pythagora unit tests results on Lodash repo

It generated 1604 tests that get code coverage to 90% and among those, GPT was able to catch 3 edge case bugs that might've flown under the radar otherwise. Now, those bugs might not seem like a big deal, but it's pretty impressive that GPT was able to catch them at all. Plus, it found 10 more regular bugs (thankfully, these are not in the live lodash version but are in the master branch).

Feel free to clone the demo repo and check out the tests yourself. If you want to see the tests in action, just do npm i and then run them as above with npx jest ./pythagora_tests.


In conclusion, this blog post showed how to kickstart a test suite from scratch, from understanding why the different types of tests matter to leveraging the power of Pythagora.

Pythagora, powered by GPT-4, helps automate the creation of your unit tests and substantially improves your code coverage in a fraction of the time it would take manually. It's an exciting tool that has already proven its worth by catching edge-case bugs in public repos like Lodash. If you're in the web dev field, I would encourage you to give Pythagora a try. With it, you can create an entire test suite with a single command.

If you found this post valuable, it would mean the world to me if you could support us by starring the Pythagora GitHub repo.

And if you try it out, please let me know what you think. What do the generated tests look like? Will you be committing the tests to your repo?
