Peter Strøiman

Posted on Jul 14

My three epiphanies of TDD

#tdd #testing

I once had a talented developer join my team. He was quick to understand the sometimes odd idioms used in the code base. But he wasn't that comfortable writing tests. When talking about his past experiences he said, "we often didn't have time to write tests".

What strikes me as odd is that writing the right tests first makes you faster. Of course, if you write the code first, and only write tests after you have already fixed all issues and established that the code works as intended, then yes, writing tests is an investment of time with a questionable return. But if you write the tests before you write the code, the tests help you implement the code faster, with fewer bugs, and in a more maintainable way.

In my career, I've had three epiphanies while practicing TDD. The first epiphany was:

Writing tests first makes you faster; not slower

Let me write about the background story of all three events, and how they helped me become a better programmer.

The first epiphany

This was the first project where I consistently wrote tests before I wrote code. Prior to that, I had from time to time added tests to existing code. But on this particular project, there was a shared understanding that the code should be covered by tests. However, apart from me, only one other developer consistently wrote code test-first. A few would sometimes write the test first, but most wrote the tests after writing production code (often resulting in complex tests that themselves had bugs, so they didn't actually test anything).

It was of course a learning process for me. I didn't get good at writing tests immediately, and today I would question whether I was even practicing TDD.

I wrote tests under the assumption that it was an investment. I would spend a little bit more time developing a feature. The payoff would be reduced maintenance costs. It turns out I was wrong in a way I didn't expect.

One day, something marvellous happened

One day I was working on a feature that required modifications to ALL the layers of the application. The application was a Windows application built using WPF following an MVVM pattern, communicating with a .NET server using WCF. This is a list of the layers I needed to modify:

WPF Views
View Models
Internal model in the frontend
Communication layer in the frontend
WCF contracts
Communication layer in the backend
Domain model in the backend
Entity framework mappings
Database migrations.

With the exception of the WPF views, which were basically untestable, I had implemented everything I needed in every layer by writing tests first, starting from the UI (that's what I thought outside-in meant back then). I had been coding for quite a while, probably a few hours, when I was finally ready to launch the application and test everything manually before handing it over to testers for further scrutiny. Then something totally unexpected happened:

It worked the first time!

Did I not make any mistakes during several hours of coding? Of course I had. I had made many mistakes, but they had all been caught early by the tests.

Now I started thinking: how much time would I have spent debugging if those errors had not been caught early? At that time we didn't have a reliable mechanism for hot code reloading, so every code change would require a relaunch of the application. For a proper stateless web application that's not a big deal; your state is in the already open browser window. But for a Windows application, I needed to shut down the application, relaunch it, and navigate through the menus to get it into the desired state, just to get to the point where I could start debugging. Combined with the build times, that meant about 1-2 minutes from making a code change to the point where I could reproduce the unexpected behaviour. Add to this that debugging was not trivial, as I would initially have no way of knowing whether it was the server or the client that had the bug.

The time saved by not debugging was enormous!

I had no doubt at all: the time necessary to find the bugs in debugging sessions would far exceed the relatively minor amount of time I had spent writing the tests.

The pattern continued; from time to time, I would work on a feature that worked the first time. It wasn't always the case, far from it. But when a feature didn't work initially, the problem was often trivial to find, typically a bad binding in the WPF views, as they were virtually impossible to test.

Writing tests first made me faster; not slower

These days, more than 10 years later, I almost never use a debugger. I don't even have one configured in neovim, my editor of choice.

The second epiphany

A few years later, I was working on a web application. This was at a time when SPAs were still not that common, and our application was delivering server-rendered HTML, with a wee bit of JavaScript. All backend code was developed using TDD, as this time all team members wrote the tests first.

The tooling for testing JavaScript was far from what we have today, and neither was the editor support. To test code that manipulates the DOM, you basically needed to run it in a browser. For that purpose we had an HTML page that would load QUnit, our test code, and our production code.

Previously, I had been using various Visual Studio plugins that allowed me to run tests without leaving Visual Studio. Now, when a file was saved, I needed to switch focus to the browser, and refresh the page. Fortunately, there was a tool called LiveReload that could monitor the file system for changes, and trigger the browser to reload. Now, every time I saved the file, the tests would run. And they were FAST.

I didn't even know what fast meant before this. I thought the C# test suite was fast. Maybe we could run the test suite with thousands of tests in 10 seconds, or maybe even just a second. That is excellent for a CI server. But add compilation and startup time to that, and the fastest possible feedback loop would still be measured in seconds.

But the browser updated instantaneously. My guess is that it was less than 200 milliseconds from saving a file to seeing the result on screen.

Sometimes the HTML page would not show any test results at all. In that case, the JavaScript would have syntax errors, e.g. mismatched parentheses or braces, an easy mistake to make as our JavaScript code relied heavily on callback functions, and there wasn't a lot of help from the editor.

This experience profoundly changed the way I wrote code. I would no longer write the code to make a test pass and then save. I would save my changes every time the code was syntactically valid; typically every 5-20 seconds. Did I need to add a new UI event handler? First, add an empty event handler, save, BAM! Immediate feedback on a misplaced closing brace.

In case of an error, I wouldn't bother trying to find out what I had done wrong; I would just undo my change, and try again. Quite simply, spending 30 seconds trying to identify which parenthesis or bracket was misplaced would be silly when it takes 10 seconds to rewrite the intended code.

Having instantaneous feedback while writing code profoundly changed the way I worked.

Today, I will go to great lengths to set up my tooling to ensure the fastest possible feedback. Why do I still use Mocha for testing JavaScript code when more modern alternatives exist? Because none of them can match the speed of Mocha (despite at least one of them laying claim to speed).

The third epiphany

A few years later, I needed to connect to RabbitMQ from a Node.js project. I already had a little experience with RabbitMQ, but in Go, not Node.js.

But what I realised here was that there was a series of problems that needed to be solved individually. The first problem was just establishing a connection. So I wrote something like this:

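const amqplib = require("amqplib");
// Helper (assumed; not shown in the original): resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

describe.only("Rabbitmq communication", () => {
  it("can connect", async () => {
    await amqplib.connect("amqp://guest:guest@localhost");
    // Keep the connection open for inspection; assumes mocha's timeout is raised above 10 s (the default is 2 s).
    await sleep(10000);
  });
});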

This would give me the feedback I needed. First, were any errors thrown? Second, had a connection been established? I would verify the latter manually in the RabbitMQ management UI, which lists active connections.

This was the first time I wrote a "test" that wasn't a test, i.e. I didn't even try to think about what kind of verification would make sense here. I didn't write it for verification; I needed a way to quickly execute bits of code to get feedback, and that is exactly the functionality that mocha provided, making it the perfect tool for the job.

In essence, the typical "unit test tools" basically just allow you to write a lot of small tasks, or pieces of code, that can be executed individually, and for which errors are reported. In addition to that, they also often provide an easy way to be selective about which tasks to run, allowing you to pick what is relevant for the problem you are working on, improving the feedback loop, and possibly reducing noise in the output.
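To make that description concrete, here is a toy sketch of such a runner. It is not how mocha is implemented; `createRunner`, `task` and `run` are names invented for this illustration, and the two registered tasks are placeholders.

```javascript
// A toy runner (hypothetical, for illustration only): named pieces of code
// that can be executed selectively, with errors reported per task.
function createRunner() {
  const tasks = [];
  return {
    // Register a named piece of code.
    task(name, fn) {
      tasks.push({ name, fn });
    },
    // Run every task whose name contains `filter`.
    // Errors are collected instead of stopping the run.
    run(filter = "") {
      const results = [];
      for (const { name, fn } of tasks) {
        if (!name.includes(filter)) continue;
        try {
          fn();
          results.push({ name, ok: true });
        } catch (error) {
          results.push({ name, ok: false, message: error.message });
        }
      }
      return results;
    },
  };
}

const runner = createRunner();
runner.task("connect: parses the config", () => {
  JSON.parse('{"host":"localhost"}');
});
runner.task("publish: not written yet", () => {
  throw new Error("not implemented");
});

// Only run what is relevant to the problem at hand.
console.log(runner.run("connect"));
```

Calling `runner.run()` without a filter executes both tasks and reports the second one as failed, which is roughly the role `describe.only` and friends play in a real framework.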

After I was able to successfully connect, I wrote the code to create a channel over the connection, and then the code to send a message to a queue. Once again, I would manually inspect in the RabbitMQ management console that it had indeed arrived at the queue, which I had also created manually in the management UI. Then I would proceed to writing code to receive messages, and now I had the first meaningful assertion to write: verify that the content of the message received is identical to the content sent.

After that, I no longer depended on manual inspection. I was now in a state with feedback measured in milliseconds and the process started to speed up dramatically; I wrote code to create queues programmatically, create exchanges and bindings¹, and send the message through the exchange. After each code change, within milliseconds, I knew if the message was still coming through.

Eventually, I had uncovered the unknowns about how to use the library, and the "test file" contained all pieces of code necessary to set up the infrastructure. At that point, I would start extracting the code into a class, gradually moving the different parts of the code from the test file to a production file. Still, with millisecond feedback, I would know if a message would still pass through RabbitMQ after every small change to the code.

The code that started its life inline in a single test function eventually evolved into a production-mature module for communicating with RabbitMQ.

The Third Epiphany: TDD is NOT about writing unit tests, it is about feedback.

I had long known about the idea that TDD is the "red, green, refactor" cycle. But this was the first time that refactoring was an essential part of the process, and it was the feedback that allowed refactoring.

Eventually, the tests changed character. Rather than describing RabbitMQ interaction, they described how certain domain events should trigger other business processes, such as, "When a new user has registered, a welcome email should be sent". The tests did not actually call the business logic; neither the "register user" nor the "send welcome email" code was called from these tests. They had the responsibility of ensuring that a well-defined event would eventually trigger a call to a use case, which in turn had the responsibility of sending a mail. Both of these functions had been developed using different sets of tests.

Eventually RabbitMQ was reduced to an implementation detail from the perspective of the test that verified the behaviour of a Publisher and Subscriber. In fact, RabbitMQ could potentially be replaced with another messaging technology without modifying the tests. The test suite would then be able to verify that business processes that required a temporal decoupling would still run as intended after replacing the messaging infrastructure. Other tests verified behaviour from a technical perspective rather than from the business rule side, e.g. how message retry would work in the case of a failure during message processing, such as a broken SMTP connection.
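A minimal sketch of what such a test might look like, under stated assumptions: `InMemoryBus`, `wireWelcomeEmail` and the event name `user.registered` are hypothetical names, not the ones from the project, and an in-memory bus stands in for the RabbitMQ-backed implementation.

```javascript
// Hypothetical in-memory stand-in for the messaging infrastructure.
// A RabbitMQ-backed class exposing the same publish/subscribe methods
// could be swapped in without changing the check at the bottom.
class InMemoryBus {
  constructor() {
    this.handlers = new Map();
  }
  subscribe(eventName, handler) {
    const list = this.handlers.get(eventName) || [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }
  publish(eventName, payload) {
    for (const handler of this.handlers.get(eventName) || []) {
      handler(payload);
    }
  }
}

// The wiring under test: a domain event should trigger a use case.
function wireWelcomeEmail(bus, sendWelcomeEmail) {
  bus.subscribe("user.registered", (event) => sendWelcomeEmail(event.email));
}

// The use case is replaced by a recording stub, so no real business
// logic and no real mail sending is involved.
const sentTo = [];
const bus = new InMemoryBus();
wireWelcomeEmail(bus, (email) => sentTo.push(email));

bus.publish("user.registered", { email: "new.user@example.com" });
console.log(sentTo); // [ 'new.user@example.com' ]
```

The design choice worth noting is that the check only mentions the event and the use case; which broker carries the message is invisible to it.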

Authoritative sources to back my conclusions

Before publishing this article with claims about what TDD is really about, I decided to see if there are any sources where Kent Beck himself opines on the matter. I learned that Kent Beck has a YouTube channel. Most of the content is really old, but I was delighted to learn that he expresses exactly the same point of view as I do: he repeatedly states that the goal is to get fast feedback.

Building a Tokyo Tyrant client library

This video from 2010 is a small fraction of what was a longer video course on TDD by Kent Beck. Unfortunately, the comments suggest that the rest is lost. Only the first 10 minutes are now available, but during those, Kent Beck specifically mentions that the primary reason for TDD is to get feedback.

The actual use case is creating a Java library for connecting to Tokyo Tyrant. Early on, he expects to end up with a class TyrantClient with which the test will eventually interact. That's where he imagines that the test will end up, but he doesn't even try to write that test. Rather, he writes a "test" that tries to open a TCP socket just to explore how to connect with the server. Next, he writes code, still in "the test", that transmits binary data over the socket directly, at which point the video ends.

But the very short video is enough to show that his way of approaching the problem is almost identical to how I started using RabbitMQ from Node.js: let's just get feedback on whether or not we can establish a connection, without worrying about how the "test" will look when completed.

Use TCR to implement a rope data structure

More recently, Kent Beck has been experimenting with a process called TCR, Test && Commit || Revert. The idea is that after every file save, tests are automatically executed. If they pass, all changes are automatically committed to source control; otherwise, all changes are reverted, i.e. your working directory is reverted to the previous green state.

This process encourages very small changes, and you basically end up writing code in a way similar to what I did during my second epiphany.

The only important difference is that in TCR, the undo is automatic rather than manual. I would work in just as small batches as Kent Beck shows. One minor difference is that in my case, a failing test could be the expected outcome, assuming it failed with the expected error. But apart from that, the process was quite similar. If I didn't see the expected outcome, I'd more often undo than diagnose, and potentially work in even smaller steps during the next attempt.
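The control flow of a single TCR step can be sketched as below. This is an illustration only, not Kent Beck's tooling: the `tcr` function and its three callbacks are hypothetical stand-ins for "run the test suite", "commit all changes" and "discard uncommitted changes", and here they merely record what would have happened.

```javascript
// One TCR step: test && commit || revert.
// The callbacks are hypothetical stand-ins for real test and
// source-control commands.
function tcr({ test, commit, revert }) {
  if (test()) {
    commit();
    return "committed";
  }
  revert();
  return "reverted";
}

// Simulate two file saves: one with passing tests, one with failing tests.
const log = [];
const step = (testsPass) =>
  tcr({
    test: () => testsPass,
    commit: () => log.push("commit"),
    revert: () => log.push("revert"),
  });

step(true); // green: the change is kept
step(false); // red: the change is thrown away
console.log(log); // [ 'commit', 'revert' ]
```

Because a red result costs you everything since the last green state, the only sensible strategy is to keep each change tiny.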

Interview with Dan North

The third source is an interview with Dan North, the "inventor" of BDD. Originally, BDD was exactly the same as TDD was supposed to be. But in one particular project, the word "test" gave the wrong impression. Testers objected that programmers would take their jobs (and would be bad at it), and programmers objected that they weren't testers. As the focus of the programmers was different from that of the testers, who would not be replaced, changing the term "test" to "behaviour" made all the difference.

Since then, a lot has happened in the BDD community, and today BDD is more often considered an acceptance-test-driven, outside-in TDD approach, with more tools surrounding and supporting the process, such as Cucumber. But just as TDD has been misinterpreted, so has the intention of Cucumber.

In the interview, Dan states that BDD was originally intended to be exactly what TDD was about: a process to iteratively work on the design, and he states directly, "TDD ... has nothing to do with testing".

Why and when is TDD the most efficient development method?

I believe that TDD is the most efficient method for developing the vast majority of your code base, as the fast feedback encourages you to solve the problems one at a time; potentially trying to identify the smallest possible solvable problem. You can iteratively work on the design of your code, and extract abstractions when you discover them; or remove abstractions when you realise they are no longer helpful. You can make these changes in very small steps, immediately validating your assumptions, truncating long unproductive paths.

When is TDD not the right choice?

TDD isn't helpful when executing small pieces of code doesn't provide meaningful feedback. The most obvious example is UI code when the focus is not behaviour, but style, colours and layout. In this case, the only meaningful feedback is visually inspecting the result. Does everything look as intended? Does it look good?

Fast feedback is still important for efficiency, so be sure to get the fastest possible feedback. If you are writing web applications, you are probably used to hot code reload in the browser, providing near-instantaneous feedback. What about desktop applications? Automated e-mail? PDF generation?²

But for the vast majority of the code, a good TDD process provides the fastest possible meaningful feedback in the development process.

Do I have a good test suite?

You can find a lot of literature about which properties a test suite should exhibit. Much of it is BS, and much describes properties that are often observed in a good test suite, not the properties that make the test suite good.

In this article I have tried to describe, both through my own experiences, and by referencing videos with some of the pioneers of modern TDD³, that TDD is really about the feedback, allowing you to solve one small problem at a time, and facilitating the ability to refactor to patterns as you discover them.

So in essence, to identify if you have a good test suite, ask yourself the following two questions:

Do my tests provide fast feedback when implementing new behaviour?
Do my tests allow me to refactor safely?

If the answer to both questions is "yes", you have a good test suite.

¹ Exchanges and bindings are part of the RabbitMQ infrastructure and control how published messages are distributed to different queues owned by different consumers, respecting specific routing rules.
² In one project I needed to write code for PDF generation. I invested time into finding the tools that would automatically create and reload a PDF file in a viewer after each code change. It took some time, but the time saved by the fast feedback was worth it, allowing a fast iterative approach to writing the code. Again, the tool I used to write the code was my test framework, and once the PDF was correct, the output file was copied to an "expected" folder, used in snapshot testing to prevent a regression. I would generally advise against snapshot testing, but for this purpose it was perfect.
³ Kent Beck only claims to have rediscovered TDD, not invented it.

Top comments (1)

Damian Cyrus • Jul 15

Thank you for sharing this inspiring experience!

From all the TDD/BDD methodologies, it seems we can add another one: Feedback Driven Development 🤣

Most developers I know follow a workflow like this:

Think about the task.
Write the code.
Test the code manually.
Write automated tests for it.
Done.

However, what I find lacking in this approach is thorough preparation. A more structured workflow could look like this:

Think about the task.
Plan it out (using paper, whiteboard, descriptions/definitions, etc.).
Write/prepare automated tests (Unit, Integration, E2E) based on the plan and expectations (and acceptance criteria).
Check quality gates as helpers (local, pipeline, etc.), if not already set up.
Write the code.
Done.

The issue isn't the time spent writing tests, but rather the time invested in planning. Shifting one's mindset to prioritize planning requires significant effort and a change in habits. Initially, this can be challenging and time-consuming, but it becomes easier with practice. As developers get accustomed to thorough planning, their testing and coding quality improves. They can focus on writing good code because they already have a solid plan and clear outcomes.

I enjoy seeing developers go through this transformation. The time invested in planning is something you and your clients will appreciate because it results in quality at every layer.
