DEV Community

detoix

How many Harry Potters can your team maintain?

Talking to the business is hard. Convincing them that your application requires time for refactoring or maintenance is even harder. I've been there, and I want to share an interesting metric that bridges the gap between business people and engineers.

A couple of months ago I was assigned the task of explaining why we need time for refactoring and why we can't just remove the code we don't need. It was a struggle because I didn't know how to quantify the system we had. How do I tell the business people how much cognitive effort we put into just understanding what's under the hood of a running system?

So I came up with the idea of comparing it to a book. Simple as that: every line of code takes time to read and understand, just like every line of a book.

How much is a Harry Potter?

I chose the Harry Potter series because it's well known (and it turned out to be a bull's-eye). I found a couple of sources claiming that

the Harry Potter books contain 1,084,170 words

Yeah, maybe. What I needed, though, was the line count.

Why lines?

A line of code is the basic unit a developer works with. It is also easy to imagine a line of written text. My assumption here is that every word roughly corresponds to a single token in code (variable, assignment, invocation), because each represents a single thing to understand in the bigger context.

So I decided to find out for myself how many lines there are in the Harry Potter books. The results are as follows.

| Title | Word count | Line count |
| --- | --- | --- |
| Harry Potter and the Philosopher's Stone | 76,944 | 7,890 |
| Harry Potter and the Chamber of Secrets | 85,141 | 8,695 |
| Harry Potter and the Prisoner of Azkaban | 107,253 | 11,190 |
| Harry Potter and the Goblet of Fire | 190,637 | 19,077 |
| Harry Potter and the Order of the Phoenix | 257,045 | 23,004 |
| Harry Potter and the Half-Blood Prince | 168,923 | 13,454 |
| Harry Potter and the Deathly Hallows | 198,227 | 20,630 |
| Total | 1,084,170 | 103,940 |
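The post doesn't say how the lines were counted, so the following is only a minimal sketch of one way to get comparable numbers from a plain-text copy of a book. The function name `count_words_and_lines` and the optional `width` parameter are hypothetical; the point of `width` is that a book's line count depends on how the text is wrapped, so any figure is tied to a particular layout.

```python
import re
import textwrap
from typing import Optional, Tuple


def count_words_and_lines(text: str, width: Optional[int] = None) -> Tuple[int, int]:
    """Return (word_count, line_count) for a plain-text book.

    Assumption: words are whitespace-separated chunks. If `width` is None,
    lines are the non-empty lines of the text as given; otherwise every
    paragraph is re-wrapped to `width` characters before counting.
    """
    words = len(text.split())
    if width is None:
        lines = sum(1 for line in text.splitlines() if line.strip())
    else:
        lines = 0
        # Paragraphs are assumed to be separated by at least one blank line.
        for paragraph in re.split(r"\n\s*\n", text):
            normalized = " ".join(paragraph.split())
            lines += len(textwrap.wrap(normalized, width=width))
    return words, lines


sample = "one two three\nfour five\n\nsix seven eight nine\n"
print(count_words_and_lines(sample))
print(count_words_and_lines(sample, width=10))
```

Running it with two different widths on the same text is a quick way to see how sensitive the line count is to page layout.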

I double-checked and confirmed these numbers: roughly 100k lines in the whole series. Now, how does it compare to actual code? I used some of my one-liners to gather data about popular repositories.

| Repository | Token count | Line count |
| --- | --- | --- |
| Autofac | 152,249 | 42,137 |
| three.js | 1,614,528 | 272,856 |
| pandas | 1,804,215 | 457,886 |
| Total | 3,570,992 | 772,879 |
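The one-liners behind these numbers aren't shown, so here is a rough sketch of how tokens and lines could be counted for any source file. It assumes a deliberately crude definition of a token (an identifier or keyword, a run of digits, or a single punctuation character); a real lexer for each language would give somewhat different counts. The names `TOKEN_PATTERN` and `count_tokens_and_lines` are made up for this example.

```python
import re
from typing import Tuple

# Assumption: a token is an identifier/keyword, a run of digits (optionally
# with a decimal part), or one punctuation character. This is language-agnostic
# and only approximates what a real lexer would produce.
TOKEN_PATTERN = re.compile(r"[^\W\d]\w*|\d+(?:\.\d+)?|[^\s\w]")


def count_tokens_and_lines(source: str) -> Tuple[int, int]:
    """Return (token_count, line_count) for the text of one source file."""
    tokens = len(TOKEN_PATTERN.findall(source))
    lines = len(source.splitlines())  # blank lines and comments are included
    return tokens, lines


snippet = "x = foo(1, 2)\nreturn x\n"
print(count_tokens_and_lines(snippet))
```

To cover a whole repository, the same function could be applied to each file (for example those found with `pathlib.Path(repo).rglob("*.py")`) and the results summed.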

As you can see, the ratios differ considerably. A typical novel has about ten words per line, whereas the code I pulled has fewer than five tokens per line overall.

These two numbers are difficult to compare, but I would argue that software code requires significantly more cognitive effort to understand. Remember that we're doing this to establish some common ground between business and software people; these numbers don't have to be equal.
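The ratios can be checked directly from the totals in the two tables. The snippet below is just that arithmetic; the helper names `units_per_line` and `combined` are illustrative.

```python
# (units, lines) pairs copied from the tables above:
# words for the books, tokens for the repositories.
BOOK_SERIES = (1_084_170, 103_940)
REPOSITORIES = {
    "Autofac": (152_249, 42_137),
    "three.js": (1_614_528, 272_856),
    "pandas": (1_804_215, 457_886),
}


def units_per_line(units: int, lines: int) -> float:
    """Average number of words (or tokens) per line."""
    return units / lines


def combined(counts):
    """Sum several (units, lines) pairs into one aggregate pair."""
    counts = list(counts)
    total_units = sum(units for units, _ in counts)
    total_lines = sum(lines for _, lines in counts)
    return total_units, total_lines


print(f"Harry Potter series: {units_per_line(*BOOK_SERIES):.2f} words per line")
for name, counts in REPOSITORIES.items():
    print(f"{name}: {units_per_line(*counts):.2f} tokens per line")
all_repos = combined(REPOSITORIES.values())
print(f"All repositories: {units_per_line(*all_repos):.2f} tokens per line")
```

This gives roughly 10.4 words per line for the books and about 4.6 tokens per line for the three repositories combined. Note that three.js on its own comes out closer to 5.9, so the under-five figure holds for the combined total rather than for every repository.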

This is the part where you can disagree, but for the sake of simplicity of calculations I claim the following.

1 Harry Potter (the whole series) ~ 100k lines of code

With this in mind I literally put it on a PowerPoint slide and said that
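As a small sketch of how the conversion might be applied, the functions below turn a line count into "Harry Potters" and spread it over a team. The function names and the team size of six are hypothetical; the 772,879 lines are simply the combined total of the three repositories from the table above.

```python
# The post's rounded figure: one full Harry Potter series is about 100k lines.
LINES_PER_HARRY_POTTER = 100_000


def harry_potters(lines_of_code: int) -> float:
    """Convert a line count into the equivalent number of Harry Potter series."""
    return lines_of_code / LINES_PER_HARRY_POTTER


def harry_potters_per_engineer(lines_of_code: int, engineers: int) -> float:
    """Harry Potter series each engineer would have to keep in their head."""
    if engineers <= 0:
        raise ValueError("engineers must be a positive number")
    return harry_potters(lines_of_code) / engineers


codebase_lines = 772_879  # Autofac + three.js + pandas, from the table above
team_size = 6             # hypothetical team size, for illustration only

print(f"{harry_potters(codebase_lines):.2f} Harry Potters in total")
print(f"{harry_potters_per_engineer(codebase_lines, team_size):.2f} per engineer")
```

Under these assumptions the three repositories together would amount to nearly eight Harry Potter series, or a bit more than one full series per engineer.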

we currently maintain an equivalent of X Harry Potter series with a team of Y engineers and that's why we have to simplify the system

and I think I made my point. If you're reading this for the first time and the idea is fresh in your mind, try to imagine being a business professional hearing this from a software engineer. Is it intelligible? Does it help you understand the situation?

Such comparisons, the Harry Potter Metric in particular, can be a novel approach to bridging the gap between business professionals and engineers; it's simple to calculate and easy to understand for both parties.

So, how many Harry Potters can your team maintain?
