Leonardo Montini for This is Learning

Posted on May 18, 2023 • Edited on Apr 29 • Originally published at leonardomontini.dev

Copilot Chat writes Unit Tests for you!

#github #testing #ai #githubcopilot

We don't write tests because we don't have time.

How many times have you heard that? Or maybe you said it yourself? I know you did, we all do at some point.

The thing is, you probably also know it's not a valid reason. The time you usually spend manually testing your code (for example, by running the app and clicking around), plus all the time spent fixing bugs, is way more than the time you'd spend writing the tests.

Oh, and imagine that one day you have to edit that part of the code again. You've forgotten what a specific method does, and a tiny change causes a bug that the client finds a week later. Great, you now have an angry client and you're also in a hurry to fix it.

Still not having time to write tests?

Copilot Chat

One of the coolest features of Copilot Chat, one of the tools in the Copilot X suite, is the ability to generate unit tests for you. You just need to select a block of code and ask Copilot to write the tests for you.

Cool, it will save you and me so much time!

But... is it reliable? Let's find out.

Points of attention

Yeah sure, in a single click, you have a bunch of tests written for you. But you should pay attention to some details.

I'll further explore the following topics in the video:

Copilot tries to guess the logic of your code - If it's right, it will help you find bugs. Is it wrong? Well, you'll have a bunch of tests that don't make sense.
Copilot doesn't know what you're testing - It will generate tests for the code you selected, but it doesn't know what you're trying to test. In some cases the result might be more noise than signal.
Copilot doesn't know your business logic - If you wrote code that actually makes sense, Copilot will generate tests that make sense. But what if your business logic is not what the client asked? The generated tests will validate the wrong logic.
The scope is limited to the selected code - If the method you're testing calls other methods in other files, Copilot doesn't know what's inside them and will try to guess.
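To make the "wrong business logic" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function name, the discount rule, and both tests are made up for illustration, and none of it is actual Copilot output. It assumes the client asked for 10% off on orders of 100 or more, while the code only applies the discount above 100.

```python
# Hypothetical example: the rule and all names are assumptions for illustration.
# Amounts are in cents, so the arithmetic stays in integers.

def apply_discount(total_cents: int) -> int:
    """Assumed requirement: orders of 100.00 or more get 10% off.

    Bug: the comparison uses > instead of >=, so an order of exactly
    100.00 gets no discount.
    """
    if total_cents > 10_000:
        return total_cents * 90 // 100
    return total_cents


def test_mirrors_implementation() -> bool:
    # Expectations derived by reading the code, which is all a tool
    # sees when you only select the function. The bug is baked in.
    return (
        apply_discount(10_000) == 10_000
        and apply_discount(20_000) == 18_000
    )


def test_follows_requirement() -> bool:
    # Expectation taken from the (assumed) requirement instead.
    return apply_discount(10_000) == 9_000


if __name__ == "__main__":
    print("mirrors implementation:", test_mirrors_implementation())
    print("follows requirement:", test_follows_requirement())
```

The first test passes and the second one fails, even though they exercise the same function. A green test suite built only from the code tells you the code does what it does, not that it does what the client asked.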

Demo

If you're curious and you want to see it in action, check out the video below:

I might sound boring at this point, but the closing chapter of all of my Copilot/AI posts is pretty much always the same.

These are amazing tools. They speed up our work a lot, giving us more time to deliver quality products and more value to our clients. BUT we should always be careful, keep our eyes open, and make sure we understand what we're doing and what the tools are doing for us.

Will I use it to generate tests? Probably yes. Will I use the generated tests as they are? Probably not.

What do you think? Let me know!

Thanks for reading this article, I hope you found it interesting!

I recently launched my Discord server to talk about Open Source and Web Development, feel free to join: https://discord.gg/bqwyEa6We6

Do you like my content? You might consider subscribing to my YouTube channel! It means a lot to me ❤️

Feel free to follow me to get notified when new articles are out ;)


Top comments (10)

Ingo Steinke, web developer • May 18 '23

This sounds like the opposite of test-driven development, but it might be a good start to introduce tests, and maybe that's still better than having no tests at all.

Leonardo Montini • May 18 '23

ai-generated-test-development 😂

"better than no tests at all" is also an interesting point of view. As long as the generated tests are decent and reviewing them takes little time, it might seem like a good tradeoff.

The interesting part comes later, when you have to change the code and suddenly some tests fail. Are they failing because you actually broke some logic, or because they weren't testing the right logic in the first place?

This opens up some new scenarios worth thinking about.

Levi Schouten • May 19 '23

Really cool demo!

Leonardo Montini • May 19 '23

Thank you! :D

Jan Wedel • May 20 '23 • Edited

Phew, what a bad idea.

I would rather write the tests and let Copilot implement code to make it green. This actually sounds like a plausible approach.

When you actually embrace and understand TDD, you’ll know a couple of things:

You write tests to design your code. To make it look, behave and feel the way you want it, plus making it modular and - of course - testable. If you don’t write tests first, you’ll automatically get less maintainable code. I would even rather delete the tests after making them green (some people do that) than write tests afterwards and let Copilot do that.
Tests are not boring, they’ll cover your ass when things get nasty (that very important bugfix), so it’s worth spending some time on them.
Tests have bugs, too. If you let Copilot write the tests, how do you find out if the tests are correct or buggy? (Spoiler: you’ll do this by writing a red one first ;) ) As an alternative, you would need to intentionally change the tests and the implementation to check that the tests fail when the implementation changes and fail when the test changes. The time you’d spend there could easily be used to just write the tests yourself.

Bottom line: The act of writing tests first has a much higher value than your production code.

Todd Bradley • May 21 '23

I don't know why more people still don't understand this. The industry switched to TDD 20 years ago and still half the "AI is gonna write your code for you" articles I read are by people who just don't get it. This is no better than outsourcing your unit testing to another company; it totally misses the point and we've proven it doesn't work.

Greg Wright • May 22 '23

I completely agree - great points Jan! If you use Copilot to write your tests test-last, there is a significant danger that you will entomb your bugs in the tests.
We need to be testing behaviour, not code. That means specifying behaviour in tests up front. This has many advantages; one is that it makes sure that test code is decoupled from implementation code.

The age of the software craftsman is rapidly coming to an end. Software developers will become people who specify software - with a good specification the AI will do the rest.

Ole Petersen • May 19 '23

Actually this sounds like a terrible idea to me. Copilot only knows the code you wrote, not what you were trying to do with it. Either Copilot will understand the logic of your code and generate a test that passes, or it won't understand it and will generate garbage. But if there is a logical error in your code, Copilot may still generate a test case that passes against your wrong code.