As a quick note, I released this post (well, a version with slightly different editing) on my blog, Wednesday morning, so it can get to be (as I tend to be) a bit rambling in places. However, the original text is on GitHub (licensed CC-BY-SA), so if anything seems muddy, by all means:
- Leave a comment here,
- Leave a comment on the blog,
- File an issue on GitHub, or
- Add a pull request!
Every once in a while, I’m reminded that estimating projects is one of the most important tasks in software development, and one that most developers never learn to do. Interestingly, rather than drawing from other disciplines to teach people how to produce better estimates, the industry has largely shifted to management techniques that assume that estimates are going to be embarrassingly wrong.
They don't need to be.
So, since the topic has come up a couple of times in the past few weeks, I wanted to centralize my thoughts, in case they’re useful to someone else.
To start off, let’s be clear that a project estimate isn’t hearing a quick description of the finished project and picking a number that “sounds about right.” That’s a guess. There are certainly cases where guesses are acceptable, and your guess might sometimes be right, but estimates are something different.
Let’s start with the dumb “the dictionary defines…” trope, here, borrowing from Wiktionary.
estimate (third-person singular present estimates, present participle estimating, simple past and past participle estimated)
To calculate roughly, often from imperfect data.
To judge and form an opinion of the value of, from imperfect data.
Those are basically the same thing in different contexts, and I hope it shows the distinction between estimating and guessing. Your estimate would be the correct answer, if you had better information. Your guess would probably not change, however, if more information was available.
While, again, guessing is fine in certain environments, it has a huge weakness: To try not to over-promise, workers (especially software developers, in my experience) tend to try to find some magic multiplier that will turn their fantasy numbers into valid estimates. It doesn’t work, being a perfect example of "garbage in, garbage out," but that doesn’t stop people from hoping to make the proposed time-frame long enough to sound credible, but short enough for the investment to sound like a good bargain.
In my experience, these factors come in a small handful of varieties, some more legitimate than others.
- After reviewing the plan, you can point to the places where the project has a lot of uncertainty that could go horribly wrong, and don’t think that there’s any way to measure that impact without doing the work. For example:
  - If you need to integrate with a third-party service that’s still under development, you’re going to spend an unknown amount of time testing with your counterparts, discussing options in conference calls, and rewriting code as the target moves.
  - If you need to optimize an algorithm to trim requests below a certain threshold, you can (and should) calculate a lower-bound for the run-time, but time spent actually trimming time can easily hit a point of diminishing returns, dragging out the time required.
- You’re including overhead of work that you don’t personally do, but is an unavoidable part of the project. I’ll talk about this more, later.
- You haven’t the foggiest idea of what’s involved in this project, so you picked a number out of a hat and feel that you need a fudge factor, just in case you’re wrong; you’re probably also planning to work late as the deadline approaches in order to meet it. This is the situation that I described earlier, where developers tend to just find a compromise between sounding appealing and sounding possible.
- You don’t trust the other party to uphold their end of your agreement, so you’re just flat-out assuming that you’re going to need to do everything multiple times.
Uncertainty in the plan (the first item) is almost always a legitimate reason to “pad” a schedule. The padding is localized on the schedule and each instance has specific justifications, so it’s possible to isolate that part of the project.
Overhead—design, management, testing, debugging—is also generally going to be legitimate, and probably even good to consider, as long as you understand the different roles involved in the project and which need to be part of your estimate.
You see the “fudge factor” in cases where the developer doesn’t understand how to build a real estimate. You’ll see people talk about “rules” of multiplying all initial guesses by two or three, and we all start our careers thinking that's how you do the job. But it’s basically an admission of being far too optimistic, as well as showing ignorance of scheduling, so it’s probably best to steer clear of doing this, unless you can be specific about it, as I'll get to later. Basically, this multiplier is a step away from “it’ll get done when it’s done.”
Finally, trust is, as they say, a two-way street. In one direction, you have developers padding schedules because they believe that management or the customer is going to change the scope or even the nature of the project, meaning that someone will need to scrap or rework a fair amount of existing work. In the other direction, management or customers might push to pad the schedule (or might treat the budget for the schedule as if it was several times larger), because they don’t trust the developer(s) to do the right work. If you’re adjusting the schedule because of a trust problem, you need to build trust, rather than tweak the schedule, or your schedules are always going to leave someone looking foolish.
Note, by the way, that Agile development processes have tried to take trust out of the process, by shifting the focus of planning to “sprints,” when developers just try to get through as many tickets as possible, and then use that data point to predict how many "points" the team will complete during the next sprint. In doing that, however, they also mostly eliminate the idea of rigorous estimates in favor of gut reactions.
As I said about guessing above, there are contexts where guessing is fine, and Agile development processes are (usually) one of them. Just be mindful about what you’re doing and the risk that’s introduced, compared with how that risk is contained.
Now that we better understand what not to do and why we might care about getting accurate estimates, let’s talk about how to put a credible estimate together. Improved estimates are important, because reliably getting work done on time helps to fix the trust issue discussed above.
That all said, the way I build my own estimates, to increase the chances of them being correct, is to recursively break tasks down with best-case and worst-case estimates, until the worst-case estimate of each piece is no more than a few hours long. I pick that size, since just about anybody is going to get that scale right, even if it’s just a guess.
That maximum size (of a few hours) is based on projects that are likely to take an on-one-hand number of worker-months, and so should probably be adjusted if that’s not the likely scale. Otherwise, you’re mapping out individual hours on a twenty worker-year project or half-days on a weekend hack, which is either a waste of time or too abstract to help. To adjust, if we figure that three months have around five hundred work hours in them, you’re talking about your smallest unit being a bit less than 1% of the guessed-at size of the whole project. If you get that guess wrong, no harm done if your minimum size gives you three times as many sub-tasks.
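The scaling rule above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the default fraction of 0.8% is my reading of "a bit less than 1%," so adjust it to taste.

```python
def max_leaf_hours(guessed_project_hours: float, fraction: float = 0.008) -> float:
    """Largest worst-case estimate a task may have before it should be split again.

    The 0.008 default is an assumed stand-in for "a bit less than 1%".
    """
    return guessed_project_hours * fraction


def needs_split(worst_case_hours: float, guessed_project_hours: float) -> bool:
    """True when a task's worst case exceeds the limit for this project size."""
    return worst_case_hours > max_leaf_hours(guessed_project_hours)


# Three worker-months at roughly 500 work hours gives a limit of about 4 hours.
limit = max_leaf_hours(500)
print(f"Split any task whose worst case exceeds {limit:.1f} h")
print(needs_split(16, 500))  # a two-day task is too coarse at this scale
print(needs_split(2, 500))   # a two-hour task is small enough
```

The point of computing the limit from a rough project size, rather than hard-coding "a few hours," is that the same rule then scales to a weekend hack or a multi-year effort.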
By contrast, looking for the worst case for each sub-task keeps you honest and reminds you about all the little tasks and pitfalls that you were going to just assume happened magically. Don’t skip that part.
A plan like this might look something like the following excerpt. Let’s say that the project is a traditional blog. Ignore the fact that the estimates assume a multi-month project, when the reality is that you'll probably be done in about a week.
| Task Name | Best Case | Worst Case | Actual | Ratio |
|---|---|---|---|---|
| Index | 👇2 h | 👇5 h | | |
| 👆 List Posts | 30 m | 1 h | | |
| 👆 Paginate | 1.5 h | 4 h | | |
| Create Post | 👇21 h | 👇26 h | | |
| 👆 Implement Default Editor (Scaffolding) | 0 h | 0 h | | |
| 👆 Evaluate Rich-Text Editors | 16 h | 16 h | | |
| 👆 Replace Default Editor with Rich Text | 2 h | 3 h | | |
| 👆 Validate Input | 2 h | 4 h | | |
| 👆 Preview Post | 1 h | 1 h | | |
| 👆 Save Post | 0 h | 2 h | | |
| Edit Post | 👇1 h | 👇6 h | | |
| 👆 Load Existing Post to Create Page | 1 h | 4 h | | |
| 👆 Overwrite Existing Post on Save | 0 h | 2 h | | |
| Display Post | 👇2 h | 👇7 h | | |
| 👆 Convert Internal Representation to HTML | 0 h | 1 h | | |
| 👆 Insert CSS Classes into Post | 1 h | 4 h | | |
| 👆 Show Post in Context | 1 h | 2 h | | |
| Add Comments | 👇3.5 h | 👇6 h | | |
| 👆 Show Existing Comments for Post (No Threading) | 1 h | 1 h | | |
| 👆 Create Comment Form on Post Page | 2 h | 4 h | | |
| 👆 Save Comment Associated with Post | 30 m | 1 h | | |
I didn’t bother to break up Evaluate Rich-Text Editors, because (a) this isn’t a real project, (b) breaking it down would probably just mean listing the candidates, and (c) it’s not out of the question to say “after I’ve spent two days on this, I’m just going to pick whatever I’ve seen that looks best, and forget the rest.”
Anyway, this project outline gives us a fifty-hour prediction. As tasks are completed, fill out the actual time spent in the fourth column, as honestly as possible.
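If you would rather let a script do the adding, the roll-up in the plan can be sketched as a small tree of tasks. The `Task` class and its method names are hypothetical, made up for this illustration; the hours are the ones from the excerpt above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    best: float = 0.0   # best-case hours, used only when there are no sub-tasks
    worst: float = 0.0  # worst-case hours, used only when there are no sub-tasks
    subtasks: list[Task] = field(default_factory=list)

    def best_total(self) -> float:
        """Sum best-case hours over all leaf tasks beneath this one."""
        if not self.subtasks:
            return self.best
        return sum(t.best_total() for t in self.subtasks)

    def worst_total(self) -> float:
        """Sum worst-case hours over all leaf tasks beneath this one."""
        if not self.subtasks:
            return self.worst
        return sum(t.worst_total() for t in self.subtasks)


blog = Task("Blog", subtasks=[
    Task("Index", subtasks=[
        Task("List Posts", 0.5, 1),
        Task("Paginate", 1.5, 4),
    ]),
    Task("Create Post", subtasks=[
        Task("Implement Default Editor (Scaffolding)", 0, 0),
        Task("Evaluate Rich-Text Editors", 16, 16),
        Task("Replace Default Editor with Rich Text", 2, 3),
        Task("Validate Input", 2, 4),
        Task("Preview Post", 1, 1),
        Task("Save Post", 0, 2),
    ]),
    Task("Edit Post", subtasks=[
        Task("Load Existing Post to Create Page", 1, 4),
        Task("Overwrite Existing Post on Save", 0, 2),
    ]),
    Task("Display Post", subtasks=[
        Task("Convert Internal Representation to HTML", 0, 1),
        Task("Insert CSS Classes into Post", 1, 4),
        Task("Show Post in Context", 1, 2),
    ]),
    Task("Add Comments", subtasks=[
        Task("Show Existing Comments for Post (No Threading)", 1, 1),
        Task("Create Comment Form on Post Page", 2, 4),
        Task("Save Comment Associated with Post", 0.5, 1),
    ]),
])

for group in blog.subtasks:
    print(f"{group.name}: {group.best_total():.1f} h to {group.worst_total():.1f} h")
print(f"Whole plan: {blog.best_total():.1f} h to {blog.worst_total():.1f} h")
```

The worst-case total is the fifty hours quoted above; the best-case total works out to 29.5 hours.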
Next, take your development-time estimate and put together a 2:1:3 schedule of design/architecture time (2x), development time (your actual estimate), and quality assurance/debug time (3x). I forget where I got that guideline from, honestly, but it’s basically fixed overhead that you don’t want to think about up-front.
I’ve never seen anybody successfully cheat that ratio, by the way. I've seen many attempts, but they invariably slow the project down, rather than speeding it up. You can push the work around (like having the development team use test-driven development), and many projects can often afford to ship incomplete. But for a sellable product, you need that design and test time, even though it might feel wasteful.
That is, for the blog plan described above that’s planned to take fifty hours, the design time (all choices, coordination, and prototyping) is going to take approximately one hundred hours, and testing/debugging time is going to come in at around one hundred fifty hours, for a total of three hundred worker-hours to make this a polished product.
Well, that’s not actually entirely true. That sixteen-hour step to evaluate rich-text editors isn’t a development task, even though a developer is going to do that work. Instead, it’s a design task that almost certainly incurs no QA time. So as you get better at this, you might say that the real development schedule is thirty-four hours, with sixty-eight plus sixteen hours of design time, and about a hundred hours of testing and debugging.
That's still a lot, but...well, I'll get to that later.
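The arithmetic behind both versions of that schedule is easy to sketch. The `schedule` function and its `extra_design_hours` parameter are hypothetical names for this illustration; the second parameter is just my way of moving the editor evaluation out of development time and into design time.

```python
from __future__ import annotations


def schedule(dev_hours: float, extra_design_hours: float = 0.0) -> dict[str, float]:
    """Apply the 2:1:3 design/development/QA guideline to a development estimate.

    extra_design_hours covers design-only work (assumed to need no QA time).
    """
    design = 2 * dev_hours + extra_design_hours
    qa = 3 * dev_hours
    return {
        "design": design,
        "development": dev_hours,
        "qa": qa,
        "total": design + dev_hours + qa,
    }


# First pass: treat all fifty hours as development time.
naive = schedule(50)
# Second pass: the sixteen-hour editor evaluation counts as design, not development.
refined = schedule(50 - 16, extra_design_hours=16)

for label, plan in (("naive", naive), ("refined", refined)):
    print(label, plan)
```

The first pass reproduces the three hundred worker-hours above. The second pass gives thirty-four hours of development, eighty-four of design, and 102 of testing, or 220 worker-hours in all.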
As you work, you then track real time elapsed against your low-level estimates, in order to forecast how you should adjust estimates in the future. For each completed task, divide the worst-case estimate by the actual time spent, which fills in the Ratio column. Take the average of those ratios (that’s not quite mathematically valid, but it’s close enough for our purposes) and use that final ratio to adjust the sub-task estimates on the next project.
Note that this isn’t a “fudge factor” like the above. This is a recognition that you’re still learning to get the small estimates right, so this is deliberately correcting for your bias. As you go through more projects, you’ll notice that the ratio gets close enough to one-to-one that you won’t need to correct for bias.
Don’t stop tracking the bias, though. If it starts drifting, again, you’re going to want to know that and start accounting for it. In some extreme cases, you might want to make sure that you're not burning out or otherwise unhappy, too.
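The ratio tracking described above might look something like this sketch. The function names and the sample hours are invented for illustration, not data from a real project. Two things are my assumptions rather than anything spelled out above: tasks with no recorded time are skipped (to avoid dividing by zero), and the correction is applied by dividing the next estimate by the average ratio, since the ratio is estimate over actual.

```python
from statistics import mean


def bias_ratio(records: list[tuple[float, float]]) -> float:
    """Average of (worst-case estimate / actual hours) over completed tasks.

    Tasks with zero actual time are skipped; that is an assumption of this sketch.
    """
    ratios = [worst / actual for worst, actual in records if actual > 0]
    return mean(ratios)


def adjust(worst_case_hours: float, ratio: float) -> float:
    """Correct a new worst-case estimate using the ratio from earlier work.

    A ratio below one means past tasks ran over, so dividing raises the estimate.
    """
    return worst_case_hours / ratio


# Hypothetical (worst-case estimate, actual) pairs in hours, made up for this example.
completed = [(1, 2), (4, 4), (2, 4), (0, 0)]

ratio = bias_ratio(completed)
print(f"Average ratio: {ratio:.2f}")
print(f"A 3-hour worst case becomes {adjust(3, ratio):.1f} h")
```

As the ratio settles near one, `adjust` stops changing the numbers, which matches the idea that the correction eventually becomes unnecessary.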
Yes, you’re thinking that this is much more work for a proposal than “eh, sounds like a nine-month job to me,” but it’s work that makes it easy to get estimates pretty close to the actual work that’ll be done, and also quickly see how any scope changes are going to impact the schedule. That builds everybody’s confidence, which is helpful.
Because this gives you a good view on the cost of features, a process like this also helps argue for and against work based on the return on investment, which is more useful than just arguing that a project will be “good” for the product. For example, our estimate for that blog came to hundreds of hours of work, which would easily cost the organization tens of thousands of dollars. Is that money going to bring the organization more value than installing WordPress?
However, like I hinted a couple of times, this process doesn’t mesh particularly well with Agile-style project planning, because Agile is built around the idea that long-term estimates are going to be wrong, so take all of this with a grain of salt for a modern office. I suspect that there’s probably a way to fit the two worldviews together, but I haven’t gotten it to work, yet, beyond mapping out individual tickets.
Credits: The header image is untitled by an unknown PxHere photographer, released under the terms of the Creative Commons CC0 1.0 Universal Public Domain Dedication.