Tomasz Łakomy

Posted on Oct 5, 2020 • Edited on Sep 30, 2021

Why I don't like story-point-driven estimates

#agile #career #estimation #scrum

Before we start - I'm working on https://cloudash.dev, a brand new way of monitoring serverless apps 🚀. Check it out if you're tired of switching between 50 CloudWatch tabs when debugging a production incident.

Disclaimer: all of this is my personal opinion, you are more than welcome to disagree - let's chat, I'd love to learn your perspective!

Software engineering estimates are ridiculously hard.

Let's face it, when a developer tells you on Monday that:

Oh yeah, I'm finishing that feature, it'll be merged to main branch today!

you can translate that to:

There's a non-zero chance that it'll be pushed to code review this week.

Bear in mind that it's not because developers are slacking at work (even though the more senior you are, the more time you spend on Slack, but that's a discussion for another time).

There are multiple reasons why this happens - unforeseen requirements popping up in the middle of a sprint, tests that take a little longer to write due to legacy code, a tricky manual testing flow, deployment issues, just to name a few.

Unfortunately an ever-changing (or dare I say - agile?) environment is a part of our job. Some people think that's exciting, some folks thrive in managing growing complexity of software, some just want to merge this feature and be done with it.

Here's the thing though - other teams, stakeholders, and your manager are probably not going to be very happy if every single ticket you're assigned takes somewhere between an hour and 40 months.

That's why we're constantly asked to estimate our work, which is the bane of our existence. I'm yet to meet anyone who enjoys estimating their work (although I've noticed that some consider themselves to be very good at estimating other people's work).

The art of estimation

Back in 2013, I had my very first full-time job as a Junior Software Engineer. At some point I was assigned my very first large feature to implement and... estimate.

So, Tomasz - when is it going to be ready?

Here's what I thought:

crap, crap, crap, I have no idea?! Nobody taught me how to estimate, what do I say?! I don't want to appear slow, what if they find out that I'm an imposter?

Here's what I said:

I think it'll be ready in 2 weeks!

(It wasn't)

7 years later, estimation is still... tricky, to say the least.

In 2015 I was introduced to the concept of story points. Since everyone [citation needed] uses JIRA, let me quote what Atlassian has to say about story points:

Many agile teams, however, have transitioned to story points. Story points rate the relative effort of work in a Fibonacci-like format: 0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100. It may sound counter-intuitive, but that abstraction is actually helpful because it pushes the team to make tougher decisions around the difficulty of work.

(I wonder what a 100 story points ticket looks like, I have a feeling that webpack is somehow involved)

In essence the idea is to stop for a while before jumping in to implement a feature and think.

How difficult is this?

Is it more difficult to implement than the stuff we did before?

Is it possible to split the ticket into multiple smaller ones?

(I've worked with teams that refused to take on anything larger than 8 or 13 points, on the grounds that it should be split, and that works well.)

Asking those questions is a great idea and you should be doing that. The questions are not why I'm writing this post.

Assigning an arbitrary number to a JIRA ticket is.

After the team finishes discussing a feature they have to estimate it (usually using something called planning poker - to make sure that team members are not influencing each other's estimates).

If a ticket is small, then the whole team will (usually) gues...estimate it at 1 or 2 story points and they get to continue.

The fun part begins when there's a range of estimates. I've personally been involved in (too) many discussions about whether X is a 3 or an 8 story point ticket. Look, having more opportunities for discussion is not a bad idea, but those conversations would often drag on for ages.

I've even personally witnessed engineers implementing the feature being discussed during the meeting.

Perhaps establishing a rule of simply selecting a larger estimate would be a good idea?

The numbers, what do they mean?

Okay, but why am I complaining about assigning (seemingly) harmless numbers to JIRA tickets?

The problem lies in what development teams are doing with story points and how they change the perception of our own work.

Many Agile/Scrum teams measure their velocity (the number of story points they usually deliver within a single sprint).

Let's assume that there are two teams contributing to the same codebase - Alpha and Beta.

The Beta team is seemingly beta (hah!) than the other team: their velocity is 60 story points, whereas the Alpha team usually delivers around 35 points per sprint.

Even though you've never met those developers (mostly because they don't exist) you've most likely already established an unconscious bias regarding their performance. The idea behind velocity is not to do that, obviously, since every team has a different way of estimating, but we're only human - given two numbers, we will compare them.

That's not the worst part, so let's zoom in on the perspective of a single team.

Why do we even measure velocity?

To optimize for predictability.

Velocity does not help you optimize for user experience, accessibility, performance, value provided, $$ - the only thing that it cares about is:

Given 100 story points, how long will it take the team to implement them?

Which would be absolutely fantastic if it worked, since business needs to understand when/if features will be shipped to prod. The problem is that, in my experience, it rarely does.
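To make the arithmetic behind that question concrete, here is a minimal sketch of the naive forecast that velocity invites. The function name `sprints_needed` is made up for illustration, and the velocities are the ones of the imaginary Alpha and Beta teams. It assumes every estimate is accurate and velocity is constant, which is exactly the assumption this post argues rarely holds.

```python
import math


def sprints_needed(backlog_points: int, velocity: float) -> int:
    """Naive forecast: whole sprints needed to burn down a backlog.

    Assumes estimates are accurate and velocity never changes.
    """
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


# Hypothetical teams from the example above
alpha_sprints = sprints_needed(100, 35)
beta_sprints = sprints_needed(100, 60)
print(f"Alpha: {alpha_sprints} sprints, Beta: {beta_sprints} sprints")
```

The division is trivial. The hard part is that both inputs are guesses.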

Software engineers struggle to accurately estimate single tickets and now you want us to take the sum of our wildly inaccurate estimates and make decisions based on this? Good luck.

Notice how assigning those numbers changes how we perceive work being done by the team.

They're asked to estimate (guess) how much work is required to implement a large collection of features and that changes the whole discussion.

Instead of celebrating the amazing work done by them every sprint, the discussion shifts towards:

Are we on track with the estimates we've provided a quarter ago when we barely understood the problem we're trying to solve?

This is not healthy.

A process like this places unreasonable expectations on the team and may lead to longer hours, weekend work, eventual burnout and that 1-on-1 "I'm leaving" meeting.

What should we do instead?

In larger software projects there are three factors that shape the final product:

the deadline (when do we need to ship it?)
the scope (what do we need to ship?)
the size of the team (who is going to ship it?)

Imagine that your startup absolutely has to ship a large feature next quarter.

Your JIRA backlog can barely contain all the tickets, product managers can barely contain their excitement, and developers can barely ship their code to prod because of legacy prod pipelines (but I digress!)

Out of those three factors: deadline, scope, size of the team, I propose we set one of them "in stone":

The deadline.

Wait, are you seriously suggesting that we should work with fixed deadlines? I thought that estimating software is difficult/impossible?

Yes. Exactly.

Since given a scope (a list of tickets) we cannot tell for sure when they're going to be done, let's set the when in place and modify only the what.

What do we need to solve our users' needs? Can we solve that particular problem without all those bells and whistles?

In other words - given a deadline of 15 November, take a good look at what is absolutely necessary to ship this feature and throw away everything else.

And then throw away even more.

Notice how this technique will allow the team to have a laser focus on the problem they're trying to solve. Smaller scope usually results in more resilient code because there's simply more time to consider how it should be implemented (not to mention adding tests!)

Shifting the focus from:

What do we need to do in order to finish all those tickets before the deadline?

to:

What more can we cut in order to solve our users' problem before the deadline?

helps establish a healthier relationship between business and developers. The deadlines are met and, unfortunately (?), not every idea gets implemented, which is not always a bad thing.

Imagine building something that your users absolutely don't want for a year.

Isn't it better to ship something meaningful (even an MVP?) in a quarter and validate if their problems/needs are addressed?

Isn't it better to stop guessing and start focusing on our users and solving their problems in a predictable fashion?

I honestly think so, but I'd love to hear your perspective 🥳

BTW: If you've enjoyed this post, perhaps you'd like to buy me a beer?

Top comments (24)

Dave • Oct 6 '20

It seems to me that your issue isn't story points themselves, but the usage of them.

Story Points (in combination with Team Velocity) allow the PMO to forecast delivery, to know if the set of requirements is achievable by the deadline or not.

Story Points aren't, and shouldn't ever be, a way to measure developer productivity (either within a Team or between Teams). There are better tools for that job (such as Git Quick Stats).

hidden_dude • Oct 6 '20

I like a power of 2 point system: 1, 2, 4, 8
If it's larger than 4 it must be split.

But we've done fine by asking people to estimate features in days, and if it takes more than 5 days split it.

Then once you have everyone's estimates, multiply by 2.

It can seem pessimistic, but I've found that it is a very good estimate of ship dates.

We recently had a very high pressure project where we were able to measure the actual multiplier for the project. And because the team was very good, it was 1.51.

So this works well.
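As a rough sketch of the arithmetic described here (the function names and the sample numbers are made up for illustration, only the 2x rule and the 1.51 figure come from the comment):

```python
def padded_estimate(estimates_days, multiplier=2.0):
    """Sum the raw per-feature estimates (in days) and pad them by a multiplier."""
    return sum(estimates_days) * multiplier


def actual_multiplier(estimated_days, actual_days):
    """How much longer the work really took compared with the raw estimate."""
    if estimated_days <= 0:
        raise ValueError("estimated_days must be positive")
    return actual_days / estimated_days


# Hypothetical raw estimates for three features, each under the 5-day cap
raw = [3, 5, 2]
print(padded_estimate(raw))

# Hypothetical project: 100 estimated days that took 151 days in reality
print(actual_multiplier(100, 151))
```

Measuring the second number after each project is what lets a team replace the default 2 with its own multiplier.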

I do like the idea of being able to measure if we are speeding up or slowing down with velocity. And velocity helps normalize time to make up for different speeds from different team members. But working off days works well as well.

BUT ONE THING IS CLEAR: this is something that should be taught in Universities.

Cliff • Oct 9 '20

This approaches Joel Spolsky's estimation adjustment technique, though his involves measuring it for every developer over time:
joelonsoftware.com/2007/10/26/evid...

I like the estimate in days approach for some teams. However, I always insist on the rule that no story can be less than 1/2 day unless it's an extremely trivial reminder (e.g. log into production system so your account doesn't expire), and after that it's whole days only. It's not that no story will take less than half a day, it's that by the time you consider interruptions and other aspects of reality, that tends to give you the right amount of padding on average.

hidden_dude • Oct 9 '20

I agree. I always count it as 0.5 days no matter what minimum they put.

An important principle in software estimation is that it's better to overestimate than underestimate.

Steve McConnell's book on the Art of Software Estimation (amazon.com/Software-Estimation-Dem...) covers this point well.

All work needs to be estimated, not just coding time. Even talking about it beforehand is part of the work. But generally many things get overestimated and many things get underestimated. But if you're in about the middle of that estimation error you can get a pretty good estimate.

Lately we've been able to do work in LESS time than estimated. But it's taken many years of polish to get there.

Cliff • Oct 9 '20 • Edited

I'm convinced that the real drivers of success, whether a team is using Scrum or any other system -- or no "system" to speak of -- are trust and communication. If you have those things on a team that is at least competent on the average, which means you can even handle some incompetence if you have some rock stars, you'll find a way to execute that is effective for your team.

I'd bet if we looked at how your team got to the point that they can finish under their estimates, we'd find that not only does your team trust each other and communicate well, your team has gained the trust of the rest of the organization/customer and communicates your progress effectively to your external stakeholders. This results in fewer interruptions from outside.

hidden_dude • Oct 9 '20

Yeah.. I can agree with that.

Also, I don't really believe in "rock stars".. I believe teams are a mix of talents that complement each other. When a team does this well.. the results are good. Even if some are inevitably faster than others.

hidden_dude • Oct 9 '20

But I like that FogBugz has the estimation process incorporated. Maybe we should have adopted it instead of JIRA.

Cliff • Oct 9 '20

Yeah, I said "rock stars", but I really just meant programmers who are more than 1x, either because they are just crazy good or more often because they amplify the abilities of the team by being excellent library writers and support for other devs. A close friend of mine is one of the crazy good types, both skillful and fast. I'm a good programmer with a solid theoretical background, but I try to be much more in the second category of providing good support, because I'm just never going to be that fast. Sure, I can find a bug and get a patch out quickly, but I take forever to develop good, strong abstractions and libraries. But when I'm done, they'll be well-made.

Adrián Norte • Oct 6 '20

dev.to/anortef/explaining-scrum-st...

Story points are very useful to predict how much complexity a team can handle, and for that you apply a rule of thumb. In my workplace it goes about like this:
1 point - almost done, there is no unknown complexity or significant amount of grind to do.
2 points - A story with very little complexity.
3 points - A story with little complexity.
5 points - A story that has little uncertainty but still needs some investigation.
8 points - A story that has a lot of uncertainty.
13 points - Break that story because it has way too much complexity.

Now, the part about the discussions tells me two things: your team needs to work way more on sharing domain knowledge, and you do not have someone in the role of Scrum master to say "enough pointless discussion, we go with X".

And the final part about story points is that it allows upper management to have some sort of quick feedback about the team performance and when something will be done because a team that cannot give feedback or realistic estimations is a useless one for management.

Cliff • Oct 9 '20

This is pretty similar to how my team(s) has/have bid, though we haven't written it out as such. And we almost always split a story that requires investigation before implementation into 2 stories. And if a story requires multiple people, that'll add 1.5x to 2x the points, depending on how involved they must be.

That said, I think if you didn't need to put stories "on sprint" based on a nebulous velocity number, it would be easier to just use words to describe a story's size or complexity: simple todo/reminder, trivial, small, medium, large, jumbo (aka SYNS: see you next sprint), all hands, or split

Francisco Quintero 🇨🇴 • Oct 6 '20

I have to disagree that I agree(?)

While I was reading I was just crossing my fingers hoping you weren't going to favor estimating by hours.

I'm relieved.

I agree that point estimation might be bad (most probably used very badly, like everything in the software development life cycle) but hour estimation is worse.

I agree that point estimation and velocity will be misused by PMs, POs, and everyone who is managing the project.

I agree that setting a deadline and scoping work to meet the deadline is ideal. This is the whole stuff Basecamp sells in their books: Getting Real, It Doesn't Have to Be Crazy at Work, Shape Up.

Define work for a given amount of time/weeks/date and set in stone the things the team could be able to work on AND remove stuff that isn't important or that won't be shipped.

Chop that scope!

On the other hand, the thing with estimations is that people do not really understand what they are. According to the "Software Estimation" book, there's a confusion between "Estimate" and "Commitment".

When PMs ask for an estimation of when something will be done, they do not want to know the effort but when it's going to be delivered.

Estimation: this will take 3 months to be done.
Commitment: but if we only work on these two features (nothing more), we can release it in two months.

In the end, story points are a way to measure effort which will never say when something will be done but how complicated something might be.

Aurelio • Oct 6 '20 • Edited

This is funny. I swear I fell asleep yesterday night right after reading this article titled "Stop using story points".

And now I wake up, open dev.to without even thinking right before breakfast and this is the first article in my feed.

Looks like someone out there really thinks I have to question story points (I agree tbh).
By the way, the gist of the article I linked is

We have been counting items done. Each week we just choose the most important items and sign up for them up to the number from last week. It turns out that we get about the same number of them done regardless of estimated effort. We have 1 week iterations so we tend to break things down a bit at the iteration planning meeting.

Perhaps the effect is that we have learned how to break things down to the right size. I don’t know yet, but the point is we get about 8 things done each week, no estimation required.

Jason C. McDonald • Oct 6 '20

This is a problem I've solved. You'd appreciate these:

Three Ground Rules for Sane Project Planning (Jason C. McDonald, Oct 1 '19)
Gallifreyan Software Project Management (Jason C. McDonald, Jan 2 '18)
Quantified Task Management Standard

FJones • Oct 6 '20

I have to very strongly disagree on cutting features to maintain the deadline.
We're service providers - to stakeholders first, downstream customers second. Our job is to fulfill the business dream. Discussions about non-mandatory aspects are fine - and I would encourage everyone to question the why and what that's being asked of a development team, but by the time a ticket is considered for estimation, these discussions should already have been had. What ends up anywhere near the sprint is a promise to deliver. Setting a hard deadline is a good means of giving stability to the business expectation, but features should not be cut to meet that deadline - unless that is roadmapped precisely. Delaying the bells and whistles for a later product iteration is good, but estimation is not when this should happen.

Instead - and this is where many teams use pre-refinement and story point caps - tickets need to be refined to a workable state early-on. Call it a 100 if you think this isn't a shippable iteration within a certain timeframe. Tell business to resize the iteration - and help them sort through the bells and whistles, providing product value in a staircase.

Francisco Quintero 🇨🇴 • Oct 6 '20

I see your point. The thing with chopping things off is that software projects tend to always add more work to the already established time.

With this "chop it off" mindset, what we want instead is to stop adding unnecessary BS to the scope and keep removing unnecessary BS that might have leaked in when estimating.

I agree that the work to do has to be well defined, but that barely happens in many teams.

Rafael Nunes • Oct 6 '20

Great post! I've actually been thinking a lot about story points, I was hoping to write down my thoughts as well hahaha. I do relate with a lot of points you highlighted.

I do like the idea of throwing away everything that is not absolutely necessary, but the team needs to have a strong maturity to be able to follow up on these "nice to have" things that ultimately can make the difference to the end user. Otherwise, that will always produce "half ready" projects.

One thing I've been thinking about a lot is that we do story points to estimate scope and predict velocity, but ultimately the constraints are about time. The product, management, or any client team is not interested in how many points that feature is. They are interested in when that thing will be ready.

We were taught that estimating using time/hours is bad, and that the good way is to estimate complexity. But my impression is that this fixation on complexity just adds one more layer to the problem. That layer being to translate the "complexity points" to "time estimates" so you can communicate the when to other stakeholders. And that is yet another place where we increase the error margin.

I've worked with various models and teams, the conclusion for me is the same: Software engineering estimates are ridiculously hard.

Dave • Oct 6 '20

Or as I like to repeat to our Project Management Office - estimates are always wrong. When was the last time you got a taxi (not Uber etc), and the estimated fare is what you paid? When was the last time you called a plumber, and the quoted estimate on the phone is what you paid?

They're called estimates, because we're guessing.

To me, a developer measuring complexity is a good thing. Project Management usually aren't technical, so wouldn't have a clue how complex a bug was to fix. Planning poker solves that.

Story Points then allow a Team to build a Velocity (number of points burnt down, across a number of sprints, gives average velocity per sprint).

Knowing the Team velocity, and having a fully estimated backlog, allows a Project Manager to simply drag/drop stories into Sprints 6 months in advance if they like. Of course, the difficulty then is how to manage the fact that estimates can, and should change (as we know more about the problem later down the line).
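A minimal sketch of that drag-and-drop planning, assuming a greedy fill in backlog order (the story names, point values and both function names are hypothetical, and real planning tools may behave differently):

```python
def average_velocity(points_per_sprint):
    """Average number of story points burnt down per sprint."""
    if not points_per_sprint:
        raise ValueError("need at least one completed sprint")
    return sum(points_per_sprint) / len(points_per_sprint)


def plan_sprints(backlog, velocity):
    """Greedily fill sprints in backlog order without exceeding the velocity.

    backlog is a list of (story_name, points) tuples. A story larger than
    the velocity still gets a sprint of its own.
    """
    sprints, current, used = [], [], 0
    for name, points in backlog:
        if current and used + points > velocity:
            sprints.append(current)
            current, used = [], 0
        current.append(name)
        used += points
    if current:
        sprints.append(current)
    return sprints


# Hypothetical history and backlog
velocity = average_velocity([30, 40, 35])
backlog = [("login", 13), ("search", 20), ("export", 8), ("billing", 13)]
for number, stories in enumerate(plan_sprints(backlog, velocity), start=1):
    print(f"Sprint {number}: {stories}")
```

Any change to a single estimate reshuffles every sprint after it, which is the difficulty mentioned above.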

Our software engineering problem is: How do we define Done? (I'd love it to be, it's in the Production environment, and users can hit it with a hammer... but honestly, we're not there yet for a number of reasons.)

Phippsy • Oct 6 '20

Under-estimating the complexity of a task that seems simple is my constant source of suffering.

George Mauer • Oct 15 '20 • Edited

The thing is that story points are a tool for helping to make exactly the sorts of determinations that you propose. They get abused to high-heaven, but tell me a single project management thing that doesn't.

The main thing about agile is that you're supposed to do the retrospective and in the retrospective you're supposed to evaluate your process and come up with a plan for how to improve it. Which means that we need to understand why we do the things that we do so we can reason on first principles.

So first off, I feel like no one actually covers why the hell story points recommend the Fibonacci sequence to begin with. Understanding this has some implications. The premise is that human beings are pretty bad at estimating fixed points, but pretty good at estimating relations. And - because it is everywhere in nature - there is something about human physiology that makes us particularly good at estimating relations on the golden ratio of 1.618. Which is the ratio that Fibonacci happens to have between its numbers.

Yes, this means that Jira's description of the concept is full of shit - who would have guessed? It also means that starting at 1 is dumb because that is the part of Fibonacci Sequence that doesn't follow that ratio. Have a small task be an 8 or a 13 and you're at least sticking with the intent of the technique and can evaluate its effectiveness on fair grounds.

Also note that looking at first principles, velocity makes no sense if you don't have a good sprint cadence. If that is the case and sprints are all over the place and no one has a good feel for what unit of work a sprint is, then sure, story points are only going to be useful in terms of being a framework by which you can talk about complexity. That in itself can be useful, but isn't always, it's absolutely something to be considered in your retrospectives. Also, for many projects I run more Kanban style - it's not terribly useful there at all.

The bit about comparing teams - it's a fair point, but I don't know how often that realistically happens. Velocity is meant to be used within the project planning process as administered by the team. No one who has oversight over multiple teams needs to see it. I have no clue why they would want to. But even if they did - let's say they wanted to have a dashboard of whether teams are slowing down or not. Understanding the underlying theory lets us propose an alternative. Don't show the higher ups your velocity, show them what they actually want to see - show them the sprint-over-sprint percentage of change. Now they're comparing apples to apples and problem solved.
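A minimal sketch of that sprint-over-sprint percentage, assuming velocity is simply a list of points delivered per sprint (the function name and the two sample teams are made up for illustration):

```python
def sprint_over_sprint_change(velocities):
    """Percentage change in velocity from each sprint to the next."""
    changes = []
    for previous, current in zip(velocities, velocities[1:]):
        if previous == 0:
            raise ValueError("cannot compute a change from a zero-point sprint")
        changes.append((current - previous) * 100 / previous)
    return changes


# Two hypothetical teams with very different absolute velocities
print(sprint_over_sprint_change([35, 35, 42]))
print(sprint_over_sprint_change([60, 60, 72]))
```

Both teams show the same relative trend even though their raw numbers are not comparable.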

As for fixing the deadline, etc. That's all well and good. It's a fine way to view and structure the definition of done. But so long as you work in sprints, you are still going to need a way to know what work you should schedule in your sprint planning. This still means making estimates, viewing dependencies and priorities, and organizing things properly. Story points are a technique for one part of that. Throwing them out and replacing them with nothing might not harm you but certainly doesn't leave you any better off.

Also, don't just start friggin work. People should plan their projects. "Oh we're so bad at estimating and nothing ever gets done on time". Well how much time did you spend planning your project? Drawing out dependencies? Did you revisit your assumptions often and address issues? Hardly anyone ever does that part.
