There are two main types of tools and techniques for automatically analyzing the quality of a codebase: those that examine your code without running it (static) and those that test your program while it's running (dynamic).
In this post, I'll be talking about the importance of static program analysis and how I set up appropriate tooling for it in a Python project.
What is Static Code Analysis
Static code analysis is a software testing technique that analyzes source code or compiled code without executing it.
The primary goal of static code analysis is to find potential issues, vulnerabilities, and maintainability problems in a program's source code before it is executed or compiled. This analysis helps identify defects early in the development process, leading to more robust and maintainable software.
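To make "analyzing code without executing it" concrete, here is a minimal sketch of a static check built on Python's standard `ast` module. It reports functions and classes that lack a docstring. The sample source and the helper name `find_missing_docstrings` are made up for illustration; real linters work on the same principle but are far more thorough.

```python
import ast

# Hypothetical sample source to analyze. It is parsed, never executed.
SOURCE = '''
def documented(x):
    """Return x doubled."""
    return x * 2

def undocumented(x):
    return x + 1
'''


def find_missing_docstrings(source: str) -> list[str]:
    """Return the names of functions and classes that have no docstring."""
    tree = ast.parse(source)  # builds a syntax tree without running the code
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and ast.get_docstring(node) is None
    ]


if __name__ == "__main__":
    for name in find_missing_docstrings(SOURCE):
        print(f"missing docstring: {name}")
```

Because the source is only parsed, the check works even on code that would crash or have side effects when run.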
Significance of such tooling
Beginners, including me, tend to overlook the importance of static analysis tools and take them for granted. But it's important to realize how the rise of such tools, especially code formatters, has revolutionized the way contributions are made to projects and greatly boosted development productivity.
The following interview of my open source professor, with the early developers of prettier, does a great job at highlighting the importance of such formatters, and how blessed we should feel to have them at our disposal.
There are various types of such tools focusing on different areas, such as code formatters, linters, security scanners, and many more. I'll only be focusing on the first two in this post.
Integrating Static Analysis in my Project
The first thing I had to do was decide which code formatter and linter I would use in my Python project.
I researched online and chose the most popular ones I could find.
Adding Black
Black is a highly opinionated code formatter with few configuration options, widely used by major companies and projects.
I started out by reading the official documentation to see steps for getting started, and surprisingly, it was pretty simple to use.
Run the following command to format a file, or all the files within a folder:
black {source_file_or_directory}...
However, this is not enough if you would like to set up some custom configuration, like excluding some files or setting a max line length of your choice.
For this, you need to add a pyproject.toml file at the root of your project with the following configuration to get started:
[tool.black]
line-length = 88
target-version = ['py37']
include = '\.pyi?$'
# 'exclude' replaces Black's default exclusions
exclude = '^/src/version.py'
# 'extend-exclude' excludes files or directories in addition to the defaults
extend-exclude = '''
# A regex preceded with ^/ will apply only to files and directories
# in the root of the project.
(
^/foo.py # exclude a file named foo.py in the root of the project
| .*_pb2.py # exclude autogenerated Protocol Buffer files anywhere in the project
)
'''
Refer to "Configuration via a file" in the Black documentation for more options and details.
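If you are unsure which paths a pattern like the extend-exclude one above would catch, you can try it out with Python's `re` module. This is only a rough sketch: it assumes, as Black's documentation describes, that multi-line patterns are treated as verbose regexes and are searched against paths relative to the project root that start with `/`. The helper `is_excluded` is a made-up name and not part of Black.

```python
import re

# The extend-exclude pattern from the configuration above.
EXTEND_EXCLUDE = r'''
# A regex preceded with ^/ will apply only to files and directories
# in the root of the project.
(
  ^/foo.py    # exclude a file named foo.py in the root of the project
  | .*_pb2.py  # exclude autogenerated Protocol Buffer files anywhere in the project
)
'''


def is_excluded(path: str, pattern: str = EXTEND_EXCLUDE) -> bool:
    """Approximate the exclusion check for a root-relative path such as '/src/app.py'."""
    # Assumption: verbose mode, so whitespace and '#' comments in the pattern are ignored.
    compiled = re.compile(pattern, re.VERBOSE)
    return compiled.search(path) is not None


if __name__ == "__main__":
    for candidate in ["/foo.py", "/src/foo.py", "/src/api_pb2.py", "/src/main.py"]:
        status = "excluded" if is_excluded(candidate) else "formatted"
        print(f"{candidate}: {status}")
```

Note that the dots in the pattern are unescaped, so they match any character. Writing `\.py` instead is stricter if that matters for your file names.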
Now that the setup was done, it was time to run the formatter on my code.
I was surprised to see the massive formatting changes to all my files, and the entire codebase instantly became way more readable.
Adding Pylint
From the Pylint documentation:
Pylint is a tool that checks for errors in Python code, tries to enforce a coding standard and looks for bad code smells. This is similar but nevertheless different from what pychecker provides, especially since pychecker explicitly does not bother with coding style. The default coding style used by Pylint is close to Guido's style guide. For more information about code smells, refer to Martin Fowler's refactoring book.
This process was pretty similar to configuring Black. The first thing I did was learn how to invoke pylint from the command line.
pylint [options] module_or_package
Unlike black, which is an opinionated tool, pylint is highly configurable with a whole lot of options that can be found here:
https://docs.pylint.org/features.html#pylint-global-options-and-switches
To configure these options, I had to create a .pylintrc file, again in the root of my project, with the following command.
pylint --generate-rcfile > ./.pylintrc
This produced a long file with all the supported options set to their default values.
Now that the configuration was set up, it was time to run the tool on my project.
As expected, there were dozens of issues in my project even though it's very young. There were various types of issues, like missing docstrings, dangerous default values, overly general exceptions, long lines, too few public methods, and the list goes on.
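As an illustration of one of these categories, here is the kind of code that triggers Pylint's dangerous-default-value warning, next to a common fix. The functions are made-up examples and not code from my project.

```python
def append_item_buggy(item, items=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `items` shares and mutates the same list.
    items.append(item)
    return items


def append_item(item, items=None):
    """Append item to items, creating a fresh list when none is given."""
    if items is None:
        items = []  # a new list for every call
    items.append(item)
    return items
```

Calling `append_item_buggy` twice without a second argument returns the very same list object both times, with items from both calls in it, while `append_item` starts from an empty list each time.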
One thing I did was disable any formatting-related checks that Black was already handling for me, to avoid conflicts. This part is really important to prevent unexpected behavior.
disable=raw-checker-failed,
        bad-inline-option,
        locally-disabled,
        file-ignored,
        suppressed-message,
        useless-suppression,
        deprecated-pragma,
        use-symbolic-message-instead,
        use-implicit-booleaness-not-comparison-to-string,
        use-implicit-booleaness-not-comparison-to-zero,
        # From here
        line-too-long,
        missing-module-docstring,
        consider-using-from-import,
        unspecified-encoding,
        import-error
After this, I added some missing docstrings and fixed most of the linting issues present. This took me about 3 hours because, while going through the code, I also moved some methods to the classes where they belong, which led to a better score.
Now mostly some missing docstrings were left, for which I opened an issue.
Integrating tools in the IDE
To get the benefits of these tools while you are writing code, it is recommended to integrate them into your code editor as well. This gives you suggestions as you type, so you can fix your code while developing.
This was pretty simple with VS Code, as all you need to do is install the black-formatter and pylint extensions.
The corresponding configuration needs to be set in a settings.json file (workspace settings) in the .vscode directory.
{
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter",
    "editor.formatOnSave": true
  },
  "editor.codeActionsOnSave": {
    "source.organizeImports.ruff": true
  },
  "pylint.ignorePatterns": [
    "version.py"
  ],
  "pylint.args": [
    "--rcfile=./.pylintrc"
  ]
}
This will auto-format the code with Black whenever a file is saved, and you'll constantly receive warnings from Pylint. These settings build on what we set up earlier for the command-line versions.
For more options, feel free to explore the official documentation of these extensions.
I also added an extensions.json file so that any new contributors are automatically prompted to install these extensions when they first open my project.
{
  // See https://go.microsoft.com/fwlink/?LinkId=827846 to learn about workspace recommendations.
  // Extension identifier format: ${publisher}.${name}. Example: vscode.csharp
  // List of extensions which should be recommended for users of this workspace.
  "recommendations": [
    "ms-python.pylint",
    "ms-python.black-formatter"
  ]
}
And with that, I was all set.
Git Hooks
Many projects also add pre-commit git hooks that run some code or script on every commit. However, I don't want to be too strict with my contributors. I'd rather let them write in their own style while still maintaining these standards in my codebase.
I am planning to do this by configuring a GitHub Actions CI workflow that runs these checks on every pull request. The issue has already been opened, so feel free to follow up if you would like to contribute. I will be adding more details soon!
Conclusion
In this post, I discussed the importance of static code analysis and how to add linting and code formatting tools to a Python project.
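I haven't written the CI setup yet, but the check it runs could look roughly like the sketch below: a small Python script that runs Black in check mode and then Pylint, and reports failure if either one complains. The `src` directory, the `CHECKS` list, and the `run_checks` helper are assumptions for illustration, not the final implementation.

```python
import subprocess
import sys

# Assumed project layout: sources live in "src". Adjust to match your project.
CHECKS = [
    ["black", "--check", "--diff", "src"],    # report formatting issues without rewriting files
    ["pylint", "--rcfile=.pylintrc", "src"],  # lint using the project's configuration
]


def run_checks(checks=CHECKS, runner=subprocess.run) -> int:
    """Run every command and return 0 only if all of them exit with status 0."""
    failed = 0
    for command in checks:
        result = runner(command, check=False)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(command)}", file=sys.stderr)
            failed += 1
    return 1 if failed else 0
```

A CI job could then call `sys.exit(run_checks())` so that the pull request is marked as failing whenever a check does. Running every check instead of stopping at the first failure means contributors see all the problems in one go.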
These were all the changes I had to make to accomplish this.
Here's a link to the squashed commit if you're interested.
Hope this helped!