I'm sure you spend a lot of time thinking about how to write better software just like I do. And if you've dipped your toes into the waters of unit testing best practices, you will have undoubtedly asked this question at some point: how simple is too simple to test?
Well, today I found an error in one of my projects that suggests that there might not be a lower bound. Check this out:
This method does some checks on the contents of $pcIDs and throws an exception if something goes wrong. Simple, right? Of course. It's only 15 lines of code, including the signature and parentheses.
Would you believe this code contains an error? Let me give you a hint:
Do you see the problem (take your time, I'll wait).
waiting...
waiting...
waiting...
waiting...
waiting...
Figure it out yet? Need a more obvious hint?
What happened
See what happened there? We want $pcID[0] and $pcID[1] to be the only values in the input array. They should be integers > 0 and not equal to each other. At first glance the method appears to work correctly. But upon closer inspection you probably realized that ValidateID::run() returns a bool but this code doesn't check the return value. Instead it assumes that ValidateID::run() will throw an exception if the value is invalid.
So, validateProductCategoryIDs(array('cat', 'dog')); will not throw an exception, which is not the behavior we want.
Unit testing best practices
So, how simple is too simple to test? I'm starting to come around to the point of view that almost nothing is too simple to test.
There's a quote I like but I can't remember who wrote/said it:
Write tests for any behavior you care about.
Indeed. This fits in nicely with a lecture on Computer Bugs in Hospitals (great talk--by the way). The lecturers argue that doctors and nurses enter incorrect data into computer systems all the time. And some of those errors kill patients. But here's the part that's important for us--they don't know they made a mistake. A nurse entering drug doses into an infusion pump with a numeric keypad makes a mistake that they do not catch about 4% of the time! Doesn't that seem like a high error rate?
So, if nurses are messing up simple number entry tasks without noticing, how often are programmers inserting silly errors into their code without noticing? I think the answer is probably a lot.
The code I'm showing you in this post was written with care. The author wrote a comprehensive manual testing plan for it but no unit tests. The code was reviewed by the author using our code review checklist. And then I also reviewed the code using our code review checklist and executed the manual testing plan. And, after all of that QA, we still missed this pretty obvious error.
So how did we catch this error today?
My colleague wrote unit/integration tests for a method that calls the validateProductCategoryIDs() method. He actually didn't notice the problem but when I was doing a code review on his new tests, I noticed that array('cat', 'dog') types of inputs were not resulting in exceptions and I thought that was suspicious.
To summarize, we needed automated testing combined with code reviews to catch this error. Even though this was a simple error, neither manual testing, unit testing, nor code reviews alone were enough to bring this error to our attention. We only discovered this error by combining several kinds of quality assurance activities.
By the way, this is exactly what Steve McConnell recommends in Code Complete--multiple kinds of QA because each kind catches different kinds of errors.
A couple of thoughts about clean coding practices
We'll never know exactly what happened in the mind of the programmer who originally wrote this code but I can tell you with confidence that this was not an intentional error.
If you do a little root cause analysis you'll notice that we have two validation methods in the code I've shown you. One returns void or throws an exception and the other returns bool. And that's confusing as hell.
Don't use a confusingly similar naming system for methods that return incompatible types
Having validateXXX methods that return incompatible types is probably a code smell. And I'd be willing to bet you that it contributed to us not catching the error during code review.
We could probably reduce the potential for the same kind of errors in the future by changing some of the names of the validateXXX methods to "verifyXXX." That would make it clear that validate methods return bool and verify methods throw exceptions if they encounter data they don't like. Our IDE has strong refactoring support so we can make that kind of change quickly and safely.
Catch this kind of potential error with a static analysis inspection
My preferred solution is to use a static analysis inspection that's integrated into our IDE to help us find all instances of this class of problems while we are coding. I'm not sure how easy it would be to turn on that kind of inspection in our project but I'm sure you can see the appeal.
Takeaways
Even the best programmers make mistakes. You might be able to catch some of them with careful code reviews and good clean code practices. But you need to combine multiple QA methods if you want really low defect rates. Your unit testing best practices should include testing any behavior you care about, even for code that might appear to be too simple to contain an error.
Agree or disagree? I'd love to hear your thoughts in the comments.
Enjoy this post? Please "like" it below.
Top comments (22)
Great post. I've seen the cycle of misery a thousand times:
Tests support many development goals. Verifying that the code works correctly now is just one. It's arguably more important to have tests that verify that the code continues to work as it changes over time.
I love it! The "cycle of misery" is the perfect term to describe it.
Another error is when you send an array with count(array) <= 1, because the count(arr) is NOT the first check, as it should be. Depending on the language also NULL can do the same error. (null[0] or null[1] would not work).
Good catch. That actually can't happen because if the way the data is generated but I agree the size check should be before the equality check. I've updated the production code to match.
Thanks for your comment.
Yes, I am completely with you on this. Manual code reviews must not be skipped.
A well-known technique is also en.wikipedia.org/wiki/QuickCheck (you may find it under different names depending on the language).
The goal is to automatically generate random values to pass in a function and see how it reacts.
Yes. Quickcheck is an implementation of property-based testing. We do property-based testing mostly for reversible operations like encode-decode functionality when we want to make sure the data is preserved. But that's really a niche for us. The base of our testing pyramid is unit tests.
I'm not familiar with the language, but it looks kinda strange that validation function has
void
type. Wouldn't it be better to returntrue
orfalse
instead of throwing exceptions?Is there any reason for passing an array of values instead of 2 params?
Yes. That's what I was trying to discuss towards the end of the post. It is confusing.
Passing an array instead of two params was the author's choice. It meets our coding standard and it makes a fair bit of sense in the larger context of where that data comes from. Two params would have been perfectly fine in this context too.
Great article!
I agree that nothing is too simple to test.
I've been writing unit tests for a while now and I can say that you need to unit tests every method you have. Even the simplest ones can create bugs sometimes...
Thanks. Carry on...
Great post and very interesting discussion/code review.
I've seen an even eviler variation of the example you've shown us. Someone has called a validation method and stored its result in a boolean variable. After the iteration result stored in the method, the variable has been returned to a caller. It could end with false positive or false negative result. The result depends only on the last partial validation in the foreach loop, because of the overwriting the method variable in each iteration.
Luckily we've found the issue during code review proces.
...thereby proving that WTFs/minute is still the only true measure of code quality.
(I've seen something like your example as well.)
You should watch Uncle Bob (Martin) on Unit Testing. He doesn't believe anything is too small to test. In fact you should test that you can create an object. I tend to agree with him.
Also 15 lines of code in a method is a lot. Anything above 10 in my opinion is a lot if you're trying to follow single responsibility.
Yes. I've watched uncle Bob's stuff on unit testing. TDD is a little different than getting tests around existing code. But, yes, I agree. I'd never fail a PR because the unit tests were testing things that were simple.
I'm actually with Phil Koopman on the issue of method complexity/size. There's a sweet spot and I think it's bigger than uncle Bob recommends for my projects. That doesn't mean I like huge methods. But when I've chopped my code up into methods like uncle Bob recommends, I end up with really deep call chains and it becomes harder to understand what's happening.
I don't know the language. What happens if you compare the first category to the second category but there's only one category? Will it get to the second check which checks to see if there are two?
Good question! I actually had to run some code to make sure it behaves as I expected.
It will get to the check for the number of elements in the collection unless the first element in the array is null. If that happens the first "if" returns true and you'll get an exception, which is what we want but the error message will be deceptive. That's a real edge case though.
I was hesitant to mention it because I don't details of the language. I was just curious because I didn't know if you could compare the first and second elements without an exception if there were fewer than two elements. In .NET the check itself would throw an exception, so I'd check the number of elements first.
No worries. We're just talking. Bring up anything you like.
I swapped the if statements this morning in the actual code (not updated in the blog post). So the size check is now before the equality check.
If you turn up the warnings enough in PHP you'll get a notice about accessing a nonexistent array element but it won't throw an exception on its own.
The thing that immediately caught my eye was the way the variable naming changed from using "ID" to "Id" half way through, which made me expect it to be an error around that.
Good eye. It's so interesting to hear all these comments and suggested improvements for what amounts to a pretty trivial and boring piece of code. My thanks to everybody who commented.
I'm curious how many of you run your code reviews at this level of detail? Are the things you are mentioning typical things you'd pickup and mention in a regular code review at your job? Or is this a special level of feedback because of the circumstances?
I caught the naming issue too and fixed it in the production code before I released the blog post.
Great post!
Thanks, Ben. I enjoyed writing it.