On August 9, 2000, Joel Spolsky published the Joel Test, 12 yes-or-no questions used to score a software development team or organization. Joel described a score of 12 as “perfect”, 11 as “tolerable”, and 10 and lower as indicative of “serious problems” in the organization.
20 years later, how has the Joel Test aged? Is it still useful to assess development teams? Or has the passage of time caused the test to become irrelevant?
The 12 questions of the Joel Test are:
- Do you use source control?
- Can you make a build in one step?
- Do you make daily builds?
- Do you have a bug database?
- Do you fix bugs before writing new code?
- Do you have an up-to-date schedule?
- Do you have a spec?
- Do programmers have quiet working conditions?
- Do you use the best tools money can buy?
- Do you have testers?
- Do new candidates write code during their interview?
- Do you do hallway usability testing?
The benefits of source control, for individuals and teams, are well known. Given the number of options for source control, from services that offer managed repositories to commercial and open-source software to create self-managed repositories, there’s no reason for a team not to have some kind of source control tooling in place.
As recently as 2010, however, there were still teams making limited or no use of source control tools. The low barrier to entry for adopting source control makes this question less interesting on its own. Still, looking at a team’s source control tooling is a good starting point for examining its workflows as well as its continuous integration, continuous delivery, or continuous deployment pipelines.
The one-step build question is about how many steps it takes to start from an arbitrary snapshot of source code, preferably from a version control system, and turn it into a deliverable or executable product. The original description focused on compiling code, producing executables and installation packages, and producing media, but this isn’t always relevant today. The idea of a build now often includes static analysis, automated testing, and packaging for deployment.
Regardless of the steps necessary to prepare software for deployment, automating the process reduces errors. There are also implications for getting the software running in a development environment for testing and debugging. A one-step build is another step closer to having a CI/CD pipeline. There are few reasons not to have a scripted process for building software.
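Whatever the stack, the one-step idea can be sketched as a small driver script that runs each stage in order and stops at the first failure. This is only a sketch: the step commands below are stand-ins that merely print, where a real project would invoke its own linter, test runner, and packaging tool.

```python
import subprocess
import sys

# Stand-in build steps so the sketch runs anywhere; a real project would
# invoke its own compiler, linter, test runner, and packaging tool here.
BUILD_STEPS = [
    ("static analysis", [sys.executable, "-c", "print('lint ok')"]),
    ("tests", [sys.executable, "-c", "print('all tests passed')"]),
    ("package", [sys.executable, "-c", "print('wrote dist/app.tar.gz')"]),
]

def build(steps=BUILD_STEPS):
    """Run every step in order; stop and report at the first failure."""
    for name, cmd in steps:
        print(f"==> {name}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"build failed at step: {name}")
            return result.returncode
    print("build succeeded")
    return 0
```

Checking a script like this into the repository means anyone can go from a fresh checkout to a finished build with a single command, and a CI pipeline can invoke the exact same entry point.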
Daily builds are about making sure that errors are detected quickly. If code does not compile, pass static analysis, and pass tests, the team should find out soon after the failure is introduced so it can be corrected and the team can continue developing against a stable system.
Today, teams should be moving beyond daily builds and towards building on each commit. Although it may not be feasible to run every single check or test on every commit, depending on the technology in use, there are ways to run a meaningful subset on each commit. Shortening the feedback loop makes it easier to find and fix issues before the complexity around them grows.
A bug database or issue tracker lets teams record reported issues, along with the steps to reproduce each bug and the expected behavior, and prioritize the work to fix them. Beyond bugs, having a tool to track all necessary work and its current state is essential as team size and system complexity grow.
It’s becoming harder for organizations not to have a bug database. The leading managed source code repository services provide integrated issue tracking. There are various open-source and commercial offerings for self-hosting on-premises, and for the most basic tracking, G Suite and Microsoft Office 365 offer collaborative spreadsheets in the cloud. However, the advantage of issue trackers integrated with source code repositories is traceability to commits, code reviews, and the steps of the CI/CD pipeline.
Fixing bugs before writing new code is known as a “zero bugs policy” or “zero defects policy”. There are benefits to fixing bugs quickly. They are cheaper to fix while they are still well understood and the system has less complexity around them. Schedule estimates are more reliable, since it’s easier to estimate the development of new functionality than to estimate how long it will take to reproduce, find, and fix a bug. With fewer known defects, the software is more likely to be in a demonstrable, if not releasable, state.
For some software systems, a zero defects approach may be viable, but this isn’t always true. There are differences between typographical errors, nuisances that have workarounds, and significant issues that prevent users from accomplishing their goals. A strict policy also neglects the intended use of the software, as some environments have a higher tolerance for issues than others. It overlooks the distribution method as well: it’s harder to fix problems post-release in software that must be packaged, delivered, and installed by end users than it is in software provided as a service.
An up-to-date schedule may or may not mean firm, fixed delivery dates. Minimally, teams should understand when there will be demonstrations of pre-release software, trade shows, or contractual or regulatory obligations to make certain features and functionality available. Development teams must be aware of these dates and track progress toward what is required, and what should be done, before them.
Not all organizations are date-driven, and some are even tolerant of the work being done whenever it’s done. However, if there are key dates, events, or milestones, collaboration with the development teams on defining what is in-scope for those milestones and how progress is tracked can be crucial.
A specification is a document that describes what a system must do or what a system does. It comes in different forms, from user requirements specifications in the domain language of users and other stakeholders to technical specifications in software developers’ language.
Understanding what needs to be built doesn’t have to come from a formal specification. There are other ways to communicate the needs and expectations of stakeholders to a development team. Rather than focusing on the existence of a specification, focusing on how the team learns what they need to build and ensuring that there is appropriate domain knowledge accessible to the team can be more valuable.
There’s plenty of discussion about what kinds of working environments are best for software developers. There’s also information about the effect of interruptions on productivity. It’s safe to say that people need to have an environment conducive to focus and productivity.
The description in Joel’s original blog post advocates for private offices. This is not the only way to achieve “quiet working conditions”. There needs to be some balance between individuals having quiet working conditions and teams having the necessary space for collaboration. Even in Peopleware: Productive Projects and Teams, DeMarco and Lister recognize the value of offices for teams. Although the office environment is important for anyone spending time there, the general culture around meetings, email, instant messaging, and calls is just as influential as the office floor plan, the types of rooms available, and general noise levels in different areas.
In this case, tools refer to everything that a developer uses, including their computer and peripherals, editors and IDEs, software tools, and even the supporting IT infrastructure that may be managed outside of the development teams. Nearly all of these tools are cheaper over time than a developer, so no expense should be spared.
Rather than merely providing the best tools, it may be better to focus on having the right tools. Depending on the environment, there may be many free and open-source tools widely used and supported within the community, and it can be easier to get support for these than for a less popular commercial alternative. Look at what the “standard” development environment tooling is and at what it takes to get the necessary approvals to buy other vital tools and equipment.
The original test calls for one dedicated tester for every two or three developers on the team, claiming that without such dedicated testers, the product is defective or money is wasted by having programming specialists perform testing.
There is a place for manual testers on a team, but manual testing doesn’t scale, especially with regard to regression testing. To perform regression testing rapidly, automation is almost always necessary, and automation experts are not “$30/hour testers”. Testing, when done right, is a collaboration between manual and automated approaches, and the whole team takes responsibility and accountability for delivering software that avoids defects. Having test specialists, both manual and automation, “level up” developers in designing and running tests can add value to the team and the whole organization. Understanding how the team maintains the skills and knowledge to ensure product quality is more important than ensuring that there are dedicated testers on the team, especially in a particular ratio.
Instead of just reviewing a resumé, asking some programming trivia questions, or chatting with the candidate, make sure that the candidate writes some code. This could be a solo activity, perhaps as a take-home activity before the interview, or it could be a pair programming session during the interview itself.
The intent is good, but it is challenging to implement. Depending on the candidate’s life circumstances, asking them to write code at home may remove good candidates from the pool; asking someone to pair program with unfamiliar development environments and tools may do the same. There is value in working with candidates on software design, implementation, and testing in a collaborative setting to ensure that the candidate works well with the team. Today, there are other ways to assess a candidate’s knowledge and skills, such as checking their GitHub or Stack Overflow profiles to see if they contribute to the community. Still, even this shouldn’t disqualify someone who does not have the time outside of work to participate in these activities.
Hallway usability testing is where the developer asks a random person passing by to attempt to use the software to learn about usability problems and fix them. In theory, a small number of people will be able to identify the biggest challenges that users may face before the software is released.
Carrying out a valid hallway usability test requires that the people doing the testing are somewhat representative of the users. When the software is highly specialized or the user base has a particular background that may not be represented in the development organization, hallway usability testing may not guide the user experience in the right direction. Rather than spending time on hallway usability testing, consider having staff with expertise in user experience design and usability testing, with direct access to end users, carry out demonstrations and tests and handle feedback so that it’s actionable by development teams.
Of the 12 questions on the original Joel Test, over half are still relevant today. Questions about the use of source control, one-step builds, bug databases, and quiet working conditions are just as relevant today as they were in 2000. Other questions about the use of daily builds, specifications, and testers have seen alternative techniques or new technologies that render some of the thinking from 2000 obsolete. Other questions, such as fixing bugs before writing new code, maintaining a schedule, interviewing processes, hallway usability testing, and using the best tools are context-sensitive.
However, there’s more to consider than just the questions that are part of the test. Several topics aren’t addressed at all, including onboarding, career development, cross-functionality or access to cross-functional skills and knowledge, the diversity of the development organization and inclusive practices, and continuous improvement. These are just as relevant for assessing a development organization’s maturity and capability as questions about how the day-to-day work gets done.
One of the flaws with the Joel Test is the emphasis on yes/no questions. Such questions provide a useful checklist of sorts. Still, they may not be the best to have a conversation around, whether that’s in the context of an organization performing a self-assessment or a candidate assessing an organization during the hiring process.
Even after 20 years, the Joel Test does a decent job of providing a starting point for an organization to evaluate its current maturity or for someone to assess a prospective employer’s development team. However, it’s just that: a starting point. One good measure of maturity is identifying areas for improvement and improving the results over time.