As the product owner (PO) for a small software team, I have read countless blogs, training manuals, books, how-to guides, and more about how to order items in my team’s backlog. I’ve gotten tons of great advice and worked with my team to implement strategies that help us decide which features will deliver the most value to the customer. There is a near endless world of information on the subject and for that I am grateful.
However, my team is not just a development team. We are a DevOps team (or DevSecOps, if you prefer). This means our backlog is not just full of new features and improvements, it also contains bug fixes, infrastructure management, security tasks, and more. With such diverse tickets, how is a PO to compare and sort them in order of importance? What’s a higher priority: a patch fix to a production server, or creating that new front-end component a stakeholder asked for? It’s comparing apples and oranges (cookies and muffins? wine and beer?).
Before I jump into it, let me give you some quick background on my team. We are a small team (<10 people), embedded in an established engineering consulting company (>150 people), building a product for a small, but rapidly growing, market. We follow the Agile Principles and are leaning towards Scrum, with no illusions that we are a proper Scrum Team. By number, our largest user base is the company’s clients. By time spent in the application, our largest user base is internal to the company.
Note: I am not a software developer! My background is in civil engineering and I have just enough knowledge to say the right words in the right situations and mostly understand what’s going on in the code. Any technical bits in this post are meant for background only, so please forgive any mistakes or unelaborated, non-specific statements.
First, let’s talk about what hasn’t worked for us in the past. A couple of years ago we embarked on a long project to rebuild our data pipeline. It changed not only how data is processed when it arrives in our system, but how that data gets transformed and delivered to the front end. The entire team worked on this for a full year. The outcome was a nearly 10-fold increase in front-end performance, and dramatic improvements in reliability and stability on the back end. It was a huge win!
Except we also delivered exactly zero front-end features for a full year. Yes, users were happy with the speed of the application, but they also remembered 11 months of silence. What we delivered did not make up for the goodwill that we lost over that time, and it took close to another year before we were in good standing with our core internal users.
Fast forward a few years as we start to replace one of our long-in-the-tooth front-end modules. This was another major project with months of planning, user interviews, design iterations, etc. The importance of delivering our first “new page” since the initial release got us excited and we wanted to knock it out of the park. So again, everyone was working on one thing. You can imagine where this is going: we delivered a great product and our users were happy! But we also started running into the limits of our system architecture as more users switched to the new module. Time for another all-hands-on-deck infrastructure project!
At this point we know that going all-in for big projects doesn’t work. But what is the ideal split between ensuring the system is secure, developing new features, and managing operational tasks?
Here are some of the strategies that have worked for our team to order backlog items. We’ve been refining this system for about a year, and in true Scrum fashion, will continue to refine it every cycle going forward. Some of it may work for you, and some may not.
In the words of Amazon Web Services, security is “job zero”. If our system is compromised, that puts our customers’ data at risk. That is a level of trust that we will never win back. Critical security issues are always the priority!
But not every task related to the security of the system is critical. There are some changes that will improve security but are not required for the application to be secure. Others may be associated with a particular compliance or contract rule but are not for security reasons alone. What is important in situations like this is to work with your decision-makers to ensure that any deadlines give you sufficient time to implement the necessary measures without stressing your resources.
It can help to explain the trade-offs of setting a shorter project timeline. For example, if a proposed contract would require implementing changes within three months, and you need three months of all-in work to complete them, let the decision-makers know that the next few features in the backlog will not be released to the existing user base during this time. If the contract can be negotiated to six months, you can deliver new features and still meet the requirements.
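To make that trade-off concrete for decision-makers, the arithmetic can be sketched in a few lines. This is only an illustration: the function name `feature_capacity_fraction` is hypothetical, and it assumes the team's capacity is constant, the required work can be spread evenly across the timeline, and switching between tasks costs nothing, none of which is guaranteed in practice.

```python
def feature_capacity_fraction(work_months: float, deadline_months: float) -> float:
    """Share of team capacity left for other backlog items.

    work_months: estimated all-in effort for the required changes.
    deadline_months: time allowed by the contract.
    Assumes (hypothetically) constant capacity and evenly spread work.
    """
    if deadline_months <= 0:
        raise ValueError("deadline_months must be positive")
    if work_months > deadline_months:
        raise ValueError("the work does not fit in the deadline, even all-in")
    return 1 - work_months / deadline_months


# The two scenarios from the example above.
three_month_deadline = feature_capacity_fraction(3, 3)  # no room for features
six_month_deadline = feature_capacity_fraction(3, 6)    # half the capacity left

print(f"3-month deadline: {three_month_deadline:.0%} of capacity for features")
print(f"6-month deadline: {six_month_deadline:.0%} of capacity for features")
```

Under those assumptions, the three-month contract leaves nothing for new features, while the six-month contract leaves roughly half of the team's time for them.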
The other type of task in this category is bugs that break functionality. Anything that prevents a user from doing what they need to do is important and should make its way to the top of the backlog. That could mean sliding into the current sprint, if a large number of users are affected, or the next, if there is a reasonable workaround.
Borrowing a term from my civil engineering days, I’m using “functionally obsolete” to describe a part of the system that typically runs fine, but at times of high traffic may start returning errors or becoming overloaded. There are often band-aid solutions, like increasing processing power or available memory for a service, that can be applied easily, but the larger solution would take more development resources. Even if the band-aid could get by for a year, you’ll need to rebuild that part of the system eventually.
One key for us with projects like this is to spend the time to plan out, roughly, what needs to be done and estimate the size of the undertaking. This doesn’t have to be a full set of tickets, individually sized, with definition of done and acceptance criteria, but more of a bulleted list of how the team would go through the process. The brainstorming session could help determine if it’s a 6-week project or a 6-month project, which lets you fit it into the roadmap with more accuracy. It may also reveal discrete, independent tasks that move toward the proposed solution little by little rather than all at once. From there, you need to determine how other factors, such as the risk of downtime or loss of data, customer frustration, or increased server cost, weigh into the decision on when to move forward.
If a project turns out to be smaller, or if you are close to finishing a larger project, we’ve found that going all-in for a single sprint can actually increase the sense of accomplishment for the team. Coming together and working towards a single goal, with a defined ending, can be a boost, especially when it feels like lots of projects are floating around half-complete. Better yet, end the sprint with a team lunch or happy hour!
Now it’s getting a little tougher! This category has no direct effect on end users, but the indirect value is hard to ignore. If a task will increase the velocity of the development team, it’s worth doing; that much is clear. But how do you fit it into the backlog?
To start with, it can help to quantify how much time or effort the task would save, and how often it’s done. If you can save an hour every week, that’s a great opportunity! If it’s 5 minutes once per quarter, that has less value. Time isn’t everything, though. A task that reduces the complexity or likelihood of human error in the team’s work should also be higher value.
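As a rough sketch of that comparison, the two examples above can be put on the same yearly scale. The function name `annual_hours_saved` is hypothetical, and the calculation assumes 52 weeks and 4 quarters per year. It only captures time, not the reduction in complexity or human error mentioned above.

```python
def annual_hours_saved(minutes_saved_per_occurrence: float,
                       occurrences_per_year: float) -> float:
    """Hours saved per year by automating or simplifying a recurring task."""
    return minutes_saved_per_occurrence * occurrences_per_year / 60


# An hour saved every week (assuming 52 weeks per year).
weekly_task = annual_hours_saved(60, 52)

# Five minutes saved once per quarter (assuming 4 quarters per year).
quarterly_task = annual_hours_saved(5, 4)

print(f"Weekly task:    {weekly_task:.1f} hours per year")
print(f"Quarterly task: {quarterly_task:.1f} hours per year")
```

Seen this way, the weekly task saves about 52 hours a year, while the quarterly one saves about 20 minutes, which makes it easier to weigh each against the effort of the change itself.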
The most common way we’ve found to weave tasks like this into the backlog is to pick them up when that part of the code base is touched for another reason. For example, we recently implemented automated deployment for all our services. Rather than do this all at once, we went repo-by-repo as other backlog tickets touched the services. While it took half a year to get to each repo, we were able to balance it with continued delivery of new front-end features.
This is a practice that can be controversial with some in the Agile community. If a task is small, let’s say only about a half-day of work for a developer, but it isn’t going to displace other tasks of higher value, we add it to the candy jar. Opponents argue that the reason the backlog is ordered is so that when a developer is finished with one task, they can move on to the next most important. However, if it’s the last day of the sprint, and all other committed stories are complete, a candy jar ticket fills in that space nicely. It also provides an outlet for a task that may not necessarily be important now but might be nice to finish before it becomes a problem somewhere down the line.
While I don’t think any of us have the resources to implement Google’s “20% of time to side projects” policy, it is important to allow some time for experimentation. Especially during large, multi-sprint epics that can get monotonous, developer burn-out is a big concern. Making sure your team is happy and engaged is essential to a healthy workplace!
This is one area in which we have not made great progress. Every few months we sit down as a team and talk about how to start working on a fun team side project, but regular sprint deliverables always seem to get prioritized. One next step would be to meet with management to discuss the importance of “extracurricular” projects and budget time to support them.
A way to make these projects more attractive is to use them as a test bed for new tools or technologies that will eventually become part of your product. That way the team can iron out the kinks and make the newbie mistakes outside of the main code base, then implement them with more refined knowledge later.
Of course, there are items that don’t fit into any of the categories above. If the backlog is always ordered by value added, they may never get done.
Okay, here’s the big secret: I don’t have the answer!
We tend to pick up tasks when they feel convenient, or when a developer gets frustrated enough with something on a given day that they just make the change (or convince me that it’s important enough to add to the sprint).
Sometimes a task can be added to an existing story with a small increase in scope, such as adding half a day to a three-day ticket. The important note for this approach is that you need to make that decision before the start of the sprint, not during the sprint. If it happens during the sprint, then it’s scope creep and your sprint could be in danger of going off track. But if you make the decision before the sprint starts, the team can size the ticket appropriately for the larger scope and still commit an appropriate amount of work.
Let’s face it: for a PO, ordering items in the backlog is hard when it’s a bunch of new features. It’s even harder when it’s new features, security updates, operational tasks, and bug fixes! I provided a few strategies for approaching this challenge that have worked for us, but I can say, with high certainty, that there is no one right way to do this. What has worked for us may not work for your team.
Fortunately, following Scrum practices gives us frequent feedback on what is working and what is not. This is really the key to everything! Just as important is sharing this feedback with decision-makers and stakeholders to ensure that they are bought in to whatever strategies you decide to use.
Lastly, I would like to invite you all to share what strategies you use to order your diverse DevSecOps backlogs. I’m interested to learn what other teams are doing, and hopefully bring new ideas into my repertoire.