Let’s assume you need to build a code readability meter. How would you do it?
What is code readability?
Readability is the ease with which a reader can understand a written text.
Readability is what makes some texts easier to read than others.
Many factors play into the readability of text. For example, contrast, font, choice of words and many more. (See: readability for natural languages)
Some factors are subjective, i.e. they depend on the reader: whether the reader knows the given programming language, knows the context, is familiar with the jargon, etc. There is a great talk on the subject: Laura Savino - Talk Session: Readable Code. She says that “readability needs a reader”.
Some factors are objective, i.e. they don’t depend on the reader: length of the text, indirection in code, etc.
Subjective factors
In my opinion, the only way to control the subjective factors of readability is with code reviews, i.e. by checking that every team member can read and understand the given code.
Objective factors
On the other hand, objective factors can be measured by a program. For example, we can build a graph of function calls, or of dependencies between variables, types, and classes, and measure the depth of that graph.
So the question is: which objective factors can we measure to assess code readability?
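As a rough illustration of the "graph of function calls" idea, here is a minimal sketch in Python using the standard `ast` module. It only looks at top-level functions called by plain name within one file, and the names `build_call_graph`, `call_depth`, and `SAMPLE` are made up for this example. A real tool would also need to handle methods, imports, and dynamic calls.

```python
import ast


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the top-level functions it calls by name."""
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    graph: dict[str, set[str]] = {}
    for name, node in functions.items():
        callees = set()
        for child in ast.walk(node):
            # Only direct calls such as `foo(...)`; `obj.foo(...)` is ignored.
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                if child.func.id in functions:
                    callees.add(child.func.id)
        graph[name] = callees
    return graph


def call_depth(graph: dict[str, set[str]], name: str, visiting=frozenset()) -> int:
    """Length of the longest call chain starting at `name`; recursion is cut off."""
    if name in visiting:
        return 0
    visiting = visiting | {name}
    return 1 + max((call_depth(graph, c, visiting) for c in graph[name]), default=0)


SAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return tokenize(text)

def tokenize(text):
    return text.split()
'''

graph = build_call_graph(SAMPLE)
depths = {name: call_depth(graph, name) for name in graph}
print(depths)
```

Here `load` has the deepest chain (`load` calls `parse`, which calls `tokenize`), which matches the intuition that a reader has to follow three hops to see what it really does.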
Goodhart’s law
When a measure becomes a target, it ceases to be a good measure
I want to think about possible metrics of readability to understand it better. I don’t plan to turn it into one of those target metrics, like test coverage or so-called code quality.
Top comments (13)
Various measures of complexity come to mind, as they're often used as guideposts for when to sub-divide functions for readability.
Cyclomatic complexity is the most commonly used one, and is essentially a measure of possible paths through the control flow graph of the code, typically measured per function/subroutine. Higher cyclomatic complexity usually translates to the code being more difficult to understand, and always translates to it being harder to test.
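To make that concrete, here is a minimal sketch of a per-function approximation in Python using the standard `ast` module. It counts one path plus one for each branching construct, which is a common simplification and not an exact control flow graph analysis. The set of node types counted and the names `cyclomatic_complexity` and `complexity_per_function` are assumptions of this sketch; comprehensions and `match` statements are ignored, and nested functions are also counted toward their enclosing function.

```python
import ast

# Constructs that add one extra path each (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit decision points.
            complexity += len(node.values) - 1
    return complexity


def complexity_per_function(source: str) -> dict[str, int]:
    """Approximate complexity for every function definition in `source`."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"

def identity(x):
    return x
'''

print(complexity_per_function(SAMPLE))
```

In the sample, `classify` scores 5 (base path, two `if`s, one `for`, one `and`) while `identity` scores 1.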
The other big one is the Halstead complexity measures, which reason about various properties of the code based on the number of operators and operands (both total and distinct values). They're somewhat harder to calculate in a lot of languages than the cyclomatic complexity, though the 'difficulty' measure is arguably more useful for deciding on readability than the cyclomatic complexity is.
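For comparison, here is a minimal sketch of the Halstead counts in Python using the standard `tokenize` module. The usual definitions are difficulty D = (n1 / 2) * (N2 / n2) and volume V = (N1 + N2) * log2(n1 + n2), where n1 and n2 are the distinct operators and operands and N1 and N2 are their totals. What counts as an operator varies between tools; treating punctuation and keywords as operators and names, numbers, and strings as operands is an assumption of this sketch, as is the function name `halstead`.

```python
import io
import keyword
import math
import tokenize


def halstead(source: str) -> dict[str, float]:
    """Halstead counts for `source` under a simple token classification."""
    operators, operands = [], []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        is_keyword = tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
        if tok.type == tokenize.OP or is_keyword:
            operators.append(tok.string)
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            operands.append(tok.string)
    n1, n2 = len(set(operators)), len(set(operands))
    total_operators, total_operands = len(operators), len(operands)
    vocabulary = n1 + n2
    length = total_operators + total_operands
    return {
        "distinct_operators": n1,
        "distinct_operands": n2,
        "total_operators": total_operators,
        "total_operands": total_operands,
        # Guard against empty input to avoid division by zero and log2(0).
        "difficulty": (n1 / 2) * (total_operands / n2) if n2 else 0.0,
        "volume": length * math.log2(vocabulary) if vocabulary else 0.0,
    }


metrics = halstead("def area(w, h):\n    return w * h\n")
print(metrics)
```

Most of the extra effort compared to cyclomatic complexity is in deciding, per language, which tokens are operators and which are operands.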
Not readability metrics
Length of text
I agree that 1,000 LOC is easier to read than 100,000 LOC. But if the code samples are of comparable size, then length says nothing. I can easily imagine 500 LOC that are harder to read than 1,000 LOC 🤷♀️.
Nobody reads code from cover to cover. Most of the time people read the relevant pieces in small chunks, and from that point of view length doesn't help to measure readability. Structure and "predictability" are more important.
Code formatting
I agree that homogeneously formatted code is easier to read. On the other hand, I don't think we need to measure it; instead we can use an auto-formatter like Prettier.
Static analysis tools like Sonar can do some of this already, definitely the complexity measures Austin mentioned. Some other things that come to mind are:
Everyone working on the same pieces of code should definitely share a common format that’s applied to every file. Variations from the standard format could be measured.
There’s the idea that you can follow a simple set of (arguably measurable by code) rules and you’ll end up with much more readable code. This topic always makes me think of this talk—following small rules can add up to big readability improvements.
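On the first point, one way to measure variation from a standard format is to run the formatter and compare its output with the original file. Here is a minimal sketch in Python using the standard `difflib` module. The `normalize` function is a made-up stand-in for a real formatter (it only expands tabs and strips trailing whitespace), and `format_deviation` is likewise a name invented for this example.

```python
import difflib


def normalize(source: str) -> str:
    """Hypothetical stand-in formatter: tabs become 4 spaces, trailing spaces go."""
    lines = [line.expandtabs(4).rstrip() for line in source.splitlines()]
    return "\n".join(lines) + "\n"


def format_deviation(source: str, formatter=normalize) -> float:
    """Roughly the share of lines the formatter would change (0.0 = conforming)."""
    original = source.splitlines()
    formatted = formatter(source).splitlines()
    matcher = difflib.SequenceMatcher(a=original, b=formatted, autojunk=False)
    return 1.0 - matcher.ratio()


tabbed = "def f(x):\n\treturn x\n"
spaced = "def f(x):\n    return x\n"
print(format_deviation(tabbed), format_deviation(spaced))
```

With a real formatter plugged in as `formatter`, a score near zero would mean the file already follows the team's format.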
I don't think that size (LOC or the number of something) is a readability metric. See my answer here dev.to/stereobooster/comment/m56f
I agree, your team could be filled with geniuses that write 1 liners that are "non-trivial and elegant". Or your team could be filled with random people with minimal code experience.
Slap on as many metrics as you want, it will just give you a number that ultimately means nothing.
I'd grab the "weakest programmer" and ask them to review some given code for readability. Worst case they learn something, best case the code should become more readable if it isn't simple.
The best kind of code should be easy to read even for a "beginner", but what a beginner is depends on the company, culture and use case. :)
This approach kind of stops the conversation. My idea is to understand what contributes to code readability, not to make it an ultimate measure (I mentioned Goodhart’s law).
- self-explanatory, which means minimal comments.
- code can be read, not investigated.
- methods are split by functional purpose and each has a single purpose; the same applies to classes.
- code is written for a human, not for a machine :)
Yes, but what does it mean? How to measure it?
Sometimes those super tiny functions are hard to read: you constantly need to jump back and forth to see what each function does, and you lose the context.
I would not! Because there are no metrics that work.
I would define a code style, and that's the rule you live by. End of line!
I prefer tabs over spaces. If your project uses spaces, and I want to submit a patch, I use spaces. If your project puts curly braces on new lines, and I want to submit a patch, I put curly braces on new lines.
Yes, yes, that is why I mentioned Goodhart’s law. This is an exercise for the imagination.
From my point of view, code readability is not something quantifiable but qualifiable. It's hard to measure code readability based on how many lines, functions, indentations and so on there are. Code readability is more about gathering characteristics, standards and any other tools to use as a reference for how code should be written, so that it is easier to understand for everybody and not only the author.
It would be cool if I could select a block of code in a repo or some code review app and press ? to indicate that I do not understand.