DEV Community

Pavlo Lozovikov

Static Code Analysis - deep dive

Static Code Analysis. What is it about? Why and when is it important?

What do you need to pay attention to before integrating Static Code Analysis (hereafter SCA) tools into your workflow?

And, most importantly, how does it contribute to the overall quality culture in your team and your company?

To cover these questions, we will walk through the concept of quality, look into the domains SCA can take care of, briefly dive into a real-life example that could happen in any organization, and get familiar with good (or even the best) SCA tools for different programming languages.

SCA and its role in the quality concept

Before we start, we need to familiarize ourselves with the broader notion of a system’s quality: a clear definition and an understanding of its parts let us pinpoint the areas where SCA lives.

The quality of a system is a sum of its measurements describing to what extent the system is:

  • Maintainable;
  • Reliable;
  • Secure;
  • Performant.

These are the factors that determine how trustworthy, dependable, and resilient a software system is. Based on them, we can tailor the behavior and other characteristics of our system in the most optimal way to the environment where it operates.

All of these measurements can be derived from different levels of a system quality analysis:

  • Unit level — single unit of code, like a function or class;
  • Technology level — integrated collection of units written in the same language;
  • System level — the whole technology stack + execution context in business reality to get a holistic view.

[Figure: Microservices with SCA coverage areas]

All of the above is captured in requirements, functional and non-functional, whose details depend on the environment where the system will operate.

When choosing tools for your project, understand their coverage level and look for ones that go beyond the unit level.

Where does SCA contribute the most?

Static code analysis is the process of examining your code without executing it to help you find and fix issues early in the development cycle. Usually, such SCA tools are restricted to the unit and partially technology levels. They are good for targeted “microscope” analysis but are quite weak when it comes to the integration level, especially with third-party tools or libraries.

[Figure: SCA contribution layers]

These tools capture unused variables, overcomplicated expressions, and similar items useful for the hygiene of the code. Lack of context prevents prioritizing those issues well, but they ensure a decent level of readability and tidiness of the code, which is important for its maintainability, and to some extent for code reliability and performance efficiency.

Benefits: It provides a narrow-focused analysis that helps with niche technology problems; usually offers fast feedback to developers with low performance overhead.

Downside: It does not give us a holistic view of the whole system, including all of its integration behavior in its operational state.

According to the paper “How to Deliver Resilient, Secure, Efficient, and Easily Changed IT Systems in Line with CISQ Recommendations”, there is a strong imbalance between the number of findings at each quality level and their impact:

  • 92% of the total errors are in the source code (Unit level), but they account for only 10% of the defects in production.
  • 8% of total defects reside at the Technology and System levels but lead to 90% of the serious reliability, security, and efficiency issues in production and consume over half the effort spent on fixing problems.

As we see, static code analysis falls into the first, seemingly “unimportant”, category. Does that mean we should pay less attention to it?

No, it does not. For two reasons:

Technology aspect

Our systems should adhere to non-functional requirements defined by stakeholders, yet we still need to measure the internal structure of a software product: absence of hardcoded passwords/tokens in our code, elimination of dead unused variables, avoidance of overcomplicated expressions, and other factors contributing to the hygiene of the code. This is crucial for the project’s maintainability and, to some extent, for its reliability, security, and performance efficiency.

Exposed code metrics assist executives in determining the risk that can be tolerated from each business or mission-critical system. This enables the setting of risk tolerance thresholds necessary to effectively allocate resources.

Human aspect

Broken windows theory

In general, the theory states that visible signs of disorder encourage further crime, because they signal that such a state is acceptable, that the area is not monitored, and that criminal behavior carries little risk of detection. In contrast, an orderly, clean, maintained environment signals that the area is monitored and that criminal behavior is not tolerated. The theory assumes that the landscape communicates to people. It’s not the broken window itself that does the damage, but the message it sends to society.

A successful strategy for preventing vandalism is to address the problems when they are small.

Now, draw the analogy to the software development lifecycle 😁

Normalization of deviance

Normalization of deviance is a phenomenon in which an organization comes to accept risky anomalies as normal because nothing bad has happened so far. When the system signals issues but they are not addressed immediately, a cognitive bias starts building around the idea that this is not an issue at all. Over time, what was once considered deviant becomes normalized, potentially leading to dangerous or harmful consequences as individuals no longer recognize the risk involved. This concept is often observed in organizations where repeated deviations from safety procedures or standards gradually become an accepted practice.
Also, over time, subsequent system parts are built on top of flawed components, making it difficult to implement easy fixes in the future.

No SCA in your toolchain

Let’s go to a real-life disaster situation.

Imagine you have a project written in Python with a codebase exceeding 10,000 lines, actively maintained by multiple developers. Initially, everyone agreed to use Python version 3.8 locally. However, there was little attention paid to the Python version installed on the virtual machines (VMs) where the code would be deployed. For convenience, Python 3.8 was installed everywhere, at least for the time being.

Following a quality initiative enforced in your company, it was decided that the latest Python 3.12 version should be used on all VMs (actually a very good decision). The upgrade was implemented, but soon issues started arising. Stack traces began appearing from your project installed on the VMs, caused by the backward incompatibility of Python features between your local 3.8 environment and the VMs’ 3.12 environment.

In response to these failures, developers rushed to make quick fixes because the issues were mission-critical for the business. The stack traces disappeared, but the quick fixes unintentionally introduced some new bugs…

Of course, such a situation is not normal and should never happen under sane circumstances. There are better, more thought-through ways to avoid it, but this is just an example 😉

The right SCA tools are integrated

Now, let’s revisit the same situation, but this time our CI/CD pipeline has a properly chosen and well-configured SCA integration.

Sysadmins update the Python version on VMs, while developers add this Python version to the “python compatibility”-like field in the SCA tool. They then execute the pipeline to deliver the new artifact.

The pipeline fails because it detects that some Python features used are incompatible with the desired 3.12 version, where they have been changed or even removed.

Developers promptly address this issue at their normal pace. A new stable artifact version is then delivered to the VMs without any unexpected complications.
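To make the idea tangible, here is a minimal sketch of the kind of check such a tool performs. It is not how any particular SCA tool is implemented; it only walks the syntax tree, without running the code, and reports imports of a few standard-library modules that were removed in Python 3.12. The names `REMOVED_IN_312`, `find_removed_imports` and `LEGACY_CODE` are made up for this illustration, and the module list is deliberately incomplete.

```python
import ast

# A few standard-library modules removed in Python 3.12 (not an exhaustive list).
REMOVED_IN_312 = {"asynchat", "asyncore", "distutils", "imp", "smtpd"}


def find_removed_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, module) pairs for imports of modules removed in 3.12."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import, not a relative one.
            names = [node.module]
        else:
            continue
        for name in names:
            if name.split(".")[0] in REMOVED_IN_312:
                findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical legacy module that worked fine on Python 3.8.
LEGACY_CODE = """\
import os
import imp
from distutils.version import LooseVersion
"""

for line, module in find_removed_imports(LEGACY_CODE):
    print(f"line {line}: '{module}' is not available in Python 3.12")
```

The important part is that the source is only parsed, never executed, so the problem shows up in the pipeline rather than as a stack trace on the VM.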

SCA coverage areas

Before we dive into the specifics of how SCA tools can effortlessly protect your company from potential disasters, it’s important to acknowledge that they are not a silver bullet. While they can significantly contribute to improving code quality and security, maintaining code quality still requires continuous effort.

Despite your effort and experience, it’s challenging to track all the hidden pitfalls that may be present in your code, especially if you’re working with projects written in different technologies. Even with decent knowledge of the programming language, there can still be hidden issues that are difficult to uncover without the assistance of tools like SCA.

Many SCA tools are developed and continuously evolved by some of the best developers in their respective programming languages: creating algorithms that effectively detect flaws requires a deep understanding of the technology. So feel free to treat the SCA tools integrated into your pipeline as a private high-class specialist working tirelessly for you, without weekends 😄

SCA tools can be divided into two groups: formatters and linters.


Formatters are tools that address styling issues in code, without checking code logic.

As Guido van Rossum said, “Code is read much more often than it’s written.”


Linters are tools that address logical issues, such as naming consistency and bug detection.

More specifically, each formatter and linter covers a specific domain of issues.

Formatters:

  • Code style;
  • Code documentation style.

Linters:

  • Cognitive complexity;
  • Code duplications;
  • Unused code parts;
  • Performance;
  • Best practices;
  • Security issues;
  • Deprecations;
  • and more 😉

Code style </>

Code style is all about your code’s presentation. While inconsistent code style may not directly impact your application’s logic, it significantly affects the potential for your application to evolve effectively.

Code is executed by machines but written for humans. When others read our code, we need to consider the cognitive load we impose on those people and how our changes will affect code review. Consistent code style dramatically reduces git diffs, making code reviews easier and more efficient. The harder the code review process, the fewer people can help you to identify potential flaws or improvements you can make.

Our brains naturally grasp visual information by blocks, first seeing the bigger picture and only then focusing on details. The more we can keep our focus on the application’s logic and not on deciphering code content, the more effective our changes are.

However, if our brains have to fight a code style that differs from place to place for the same things, reading such code becomes trickier.

Imagine a book where each sentence of a single chapter uses different fonts, space widths, colors, and is full of typos. Finishing reading the first page would already be an achievement, right?

In our daily business, we primarily communicate through code. Having consistent style habits helps our brains relax and focus on the important parts. It’s important to remember that our communication happens not only in the present but also in the future.
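A small sketch can show why style is purely about presentation. The two snippets below are invented for this illustration; they differ only in spacing, indentation and redundant parentheses. Comparing their syntax trees with the standard `ast` module suggests that, as far as the interpreter is concerned, they are the same program, while a human reader clearly has an easier time with one of them.

```python
import ast

TIDY = '''
def total(prices, tax_rate):
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)
'''

MESSY = '''
def total( prices,tax_rate ):
        subtotal=sum( prices )
        return (subtotal*( 1+tax_rate ))
'''


def same_logic(first: str, second: str) -> bool:
    """Compare two sources by their syntax trees, ignoring layout."""
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_logic(TIDY, MESSY))
```

This is also why a formatter can rewrite layout safely: the tree that gets executed is meant to stay untouched.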

Code documentation style 📖

Code documentation, especially for public APIs, is crucial. When a function or module has a proper text description, it allows other developers, as well as yourself in the future, to quickly understand the bigger picture and keep focus on the things you want to.

The better you understand the code you are reading, the more effective your further decisions are, and the faster and more pleasant your development workflow becomes.

Another role of code documentation is auto-generating application documentation, often as part of the application’s release cycle. Imagine that one step of your automated release generates a new HTML page on your beautiful company site for the new version of your application. This is typically done by a third-party tool that expects inline documentation in a format it can parse. Unsurprisingly, if the code documentation format is broken, the generator will produce something unexpected, or even publish a broken HTML page. Is that good for your business reputation?

Even if you don’t currently have documentation generation in place, having automated checks will help you be prepared for when such an integration is needed.
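As an illustration of what an automated documentation check can look like, here is a minimal sketch that lists public functions and classes without a docstring. It is a simplified stand-in for what dedicated docstring linters do, not their actual implementation; `missing_docstrings` and the `EXAMPLE` source are hypothetical names used only here.

```python
import ast


def missing_docstrings(source: str) -> list[str]:
    """Return names of public functions and classes without a docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        if node.name.startswith("_"):
            continue  # treat underscore-prefixed names as private
        if ast.get_docstring(node) is None:
            missing.append(node.name)
    return sorted(missing)


EXAMPLE = '''
def documented():
    """Return a greeting."""
    return "hello"

def undocumented():
    return "hello"

def _private_helper():
    return "hello"
'''

print(missing_docstrings(EXAMPLE))
```

A real tool would additionally validate the docstring format itself, which is exactly what a documentation generator depends on.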

Cognitive complexity 𖡎

We’ve already discussed the importance of reducing cognitive load in the code style section, emphasizing the need to lower the mental effort required to process information.

However, cognitive complexity is often more inherent to application logic. While not all instances of cognitive complexity can be easily identified by static code analysis tools, some of them can. For further details and to avoid duplicating already well-articulated content, I recommend reading this article: Clean Code: Cognitive Complexity by SonarQube
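To give a rough feeling for how a tool can put a number on this, below is a deliberately simplified sketch. It is not the metric described in the linked article; it only assumes that every branching construct costs one point plus one extra point per level of nesting. The function `nesting_score` and both sample snippets are invented for this illustration.

```python
import ast

BRANCHING = (ast.If, ast.For, ast.While, ast.ExceptHandler)


def nesting_score(node: ast.AST, depth: int = 0) -> int:
    """Sum (1 + nesting depth) for every branching construct under node."""
    score = 0
    for child in ast.iter_child_nodes(node):
        if isinstance(child, BRANCHING):
            score += 1 + depth
            score += nesting_score(child, depth + 1)
        else:
            score += nesting_score(child, depth)
    return score


# Two hypothetical functions with the same behavior.
FLAT = '''
def check(order):
    if not order.items:
        return "empty"
    if not order.paid:
        return "unpaid"
    return "ok"
'''

NESTED = '''
def check(order):
    if order.items:
        if order.paid:
            return "ok"
        else:
            return "unpaid"
    else:
        return "empty"
'''

print("flat:", nesting_score(ast.parse(FLAT)))
print("nested:", nesting_score(ast.parse(NESTED)))
```

Both versions return the same results, but the nested one scores higher because the reader has to keep the outer condition in mind while reading the inner one. One quirk of this sketch: an `elif` is represented as a nested `if` in the syntax tree, so it would be counted as nesting too.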

Code duplications🔁

Another enemy of code quality is code duplication. Each instance of code duplication increases the accidental complexity of your code.

But what is accidental complexity?

When writing software, our goal is to solve a business problem, which inherently has its own complexity, known as essential complexity.

However, developers and software architects often add another type of complexity through the design and implementation of the software. This additional complexity is known as accidental complexity, and while it’s inevitable, its amount should be minimized.

Code duplication is the antithesis of this practice. It not only increases accidental complexity but also introduces risks during bug fixes: a developer may fix a bug in one duplicated code block but leave another untouched. The longer duplicated code is left in the codebase, the more instances of code duplication accumulate, increasing the risks to your application.

Pay double attention to duplicated code by extracting it to a function or module, covering it with tests, and reusing it. This approach will establish the next “best practice” in your codebase, adhering to the principle of doing “one thing and doing it well” as outlined in the Basics of the Unix Philosophy.
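Detecting exact duplicates statically can be sketched in a few lines. The example below is an assumption-laden toy, not the algorithm of any real duplication detector: it fingerprints each function by its arguments and body and groups functions with identical fingerprints. The function `duplicated_functions` and the pricing helpers are hypothetical.

```python
import ast
from collections import defaultdict


def duplicated_functions(source: str) -> list[list[str]]:
    """Group functions whose arguments and bodies are structurally identical."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # Dump only arguments and body, so the function name is ignored.
            fingerprint = ast.dump(node.args) + "".join(ast.dump(stmt) for stmt in node.body)
            groups[fingerprint].append(node.name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


EXAMPLE = '''
def net_price_eu(price, discount):
    reduced = price - price * discount
    return round(reduced, 2)

def net_price_us(price, discount):
    reduced = price - price * discount
    return round(reduced, 2)

def gross_price(price, tax):
    return round(price * (1 + tax), 2)
'''

print(duplicated_functions(EXAMPLE))
```

Real tools go further and also catch near-duplicates, for example blocks that differ only in variable names, but the principle of comparing normalized structure rather than raw text is the same.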

Unused code parts🗑️

Just a list of why unused code is bad:

  • Needs maintenance, which wastes resources without giving value.
  • Slows down performance by doing work that adds no business value, and sometimes that work is very expensive.
  • Imposes cognitive complexity on your developers.
  • Adds potential security vulnerabilities to your application.

And now, one by one, real-life examples…

All of us have been in a situation where we spent an hour trying to understand and debug a piece of code, only to realize that it does nothing and is a leftover from prototyping, refactoring, application decoupling or feature removal.

Sometimes, such debugging is only possible in production environments, so these activities also pose security risks.

Unused code still requires maintenance. Migration to a new platform or language version sometimes requires changes in the code itself. Findings reported by SCA tools like SonarQube take effort to fix. Why spend this effort on code that does nothing?

And here’s a big performance trouble situation: in my practice there was a noticeable performance degradation caused by a leftover heavyweight function, which was called hundreds of times and only showed up in special cases at runtime, unfortunately on the client’s side… It took a few days to find it and consumed many hours of the clients’ time and of our hotline to analyze the issue and trigger the necessary conversations, whereas a single pipeline execution would have been enough for a static code analysis tool to find the leftover and suggest a way to fix it.

Maintain only code that is used. Maintenance is effort, and that effort should add value to your business.
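The sketch below shows the basic idea behind dead code detection within a single module: collect every top-level function and subtract every name that is referenced anywhere. It is an intentionally naive illustration (real tools also look across modules and know about public APIs and dynamic access); `unused_functions` and the sample module are hypothetical.

```python
import ast


def unused_functions(source: str) -> list[str]:
    """Return top-level functions that are never referenced in the module."""
    tree = ast.parse(source)
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            used.add(node.id)
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)
    return sorted(defined - used)


EXAMPLE = '''
def load(path):
    with open(path) as handle:
        return handle.read()

def legacy_export(data):
    return str(data)

def main():
    return load("data.txt")

main()
'''

print(unused_functions(EXAMPLE))
```

Here `legacy_export` is reported because nothing in the module refers to it, which is precisely the kind of leftover that otherwise survives for years.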


Performance

Okay, we talked about performance in the context of unused code leftovers (a really special case, I agree), but now let’s take a look at more common scenarios.

Such cases might involve operations on arrays or other simple operations that can sometimes be done in a more performant way. Do not neglect such cases, because at scale even the smallest inefficiency becomes very visible.
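A classic example of such a “simple operation” is a membership test. The sketch below, with made-up function names and data, compares looking values up in a list (each lookup scans the list) with looking them up in a set built once (each lookup is a hash lookup on average). The timings are printed only for orientation and will vary from machine to machine.

```python
import timeit


def find_known_slow(order_ids, known_ids):
    """Membership test against a list: every lookup scans the list."""
    return [order_id for order_id in order_ids if order_id in known_ids]


def find_known_fast(order_ids, known_ids):
    """Same result, but lookups go through a set that is built once."""
    known = set(known_ids)
    return [order_id for order_id in order_ids if order_id in known]


ORDER_IDS = list(range(0, 2000, 2))
KNOWN_IDS = list(range(1000))

slow = timeit.timeit(lambda: find_known_slow(ORDER_IDS, KNOWN_IDS), number=5)
fast = timeit.timeit(lambda: find_known_fast(ORDER_IDS, KNOWN_IDS), number=5)
print(f"list lookup: {slow:.4f}s, set lookup: {fast:.4f}s")
```

With a thousand elements the difference is already measurable; with millions of elements in a hot path it can dominate the runtime.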

Best practices🏅

Every goal can be reached in different ways, but only a few of them are the ones you and your project need.

Typical findings include variable shadowing, variable reassignment inside functions, inefficient methods, risky APIs invoked within code blocks, poor exception handling, or APIs used where better native alternatives exist. I could continue the list, but the main point is that each change you make should keep “evolution doors” open, allowing for further refactoring or module decoupling while keeping the intention of the change clear to other developers. SCA tools provide good assistance in achieving this.


Security issues

Hardcoded passwords or tokens? Script injection risk? Overlooked outer scope variable overriding? Risky regular expressions? Those, and many other issues, are very easy to overlook even for experienced developers. Leaving them as-is will, in the best-case scenario, result in frustrating work in the future.

Consider the security of your application as a must-have feature 😉
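To illustrate the first question, here is a minimal sketch of a hardcoded-secret check. It is a toy heuristic and not how any specific security linter works: it flags assignments of non-empty string literals to variables whose names look sensitive. The pattern `SUSPICIOUS_NAME`, the function `hardcoded_secrets` and the sample source are assumptions made for this example.

```python
import ast
import re

SUSPICIOUS_NAME = re.compile(r"password|passwd|secret|token|api_key", re.IGNORECASE)


def hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line, variable) pairs where a string literal looks like a secret."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str) and value.value):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and SUSPICIOUS_NAME.search(target.id):
                findings.append((node.lineno, target.id))
    return sorted(findings)


EXAMPLE = '''\
import os
DB_PASSWORD = "hunter2"
API_TOKEN = os.environ["API_TOKEN"]
timeout = "30"
'''

print(hardcoded_secrets(EXAMPLE))
```

Only the literal assigned to `DB_PASSWORD` is reported; the token read from the environment and the harmless `timeout` string are left alone.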


Deprecations

Moving to a new language version may result in some of the APIs you’re using becoming deprecated. Not adapting to these changes before making the version jump can lead to trouble.

The kind of trouble will vary depending on the language. Some languages will catch deprecated API usage at compile time, but in languages like JavaScript or Python, if the code is not covered by tests, deprecated APIs may slip into production, causing failures at runtime. If such a situation occurs, I hope the observability of your infrastructure allows you to catch it at an early stage, or you’re fortunate enough to realize it quickly and address all the consequences.
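Static analysis is one safety net here; tests are the other. As a complementary sketch (this part is a runtime technique, not static analysis), Python’s standard `warnings` module can turn deprecation warnings into errors, so a test that exercises deprecated code fails loudly. `old_discount` is a hypothetical helper standing in for a deprecated library API.

```python
import warnings


def old_discount(price):
    """Hypothetical deprecated helper, standing in for a deprecated library API."""
    warnings.warn("old_discount() is deprecated, use new_discount()", DeprecationWarning, stacklevel=2)
    return price * 0.9


def new_discount(price):
    """Hypothetical replacement without a deprecation warning."""
    return price * 0.9


def uses_deprecated_api(func, *args) -> bool:
    """Return True if calling func(*args) triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Escalate deprecation warnings to exceptions inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func(*args)
        except DeprecationWarning:
            return True
    return False


print(uses_deprecated_api(old_discount, 100))
print(uses_deprecated_api(new_discount, 100))
```

This only helps for code paths that tests actually execute, which is why a linter that flags deprecated usage without running anything remains valuable.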

To use or not to use?

And now, the question: should we apply static code analysis everywhere?
It depends 😂

It depends on many factors. For instance, if there is a simple Python script that only needs to copy some folders in the pipeline or be called on demand, there is no need to overdo things here. However, if there is a promising project that multiple teams rely on, then definitely yes, you need thorough SCA tools integration, for all the good reasons mentioned above.

It also depends on the policies in your company. Blindly including static code analysis without agreed plans to fix found issues, when the results are integrated into quality reports, will only break your KPIs, and no one would be thankful for that.

But again, if your team sees all the reasons why such tools should be integrated, you’d better do it as early as possible.
Now it will cost you time, but later it will cost you work.

Let’s go to concrete examples!

If you want to do your own research, Analysis-Tools is a great resource for that! It lists, I think, most of the linters available on the market.

But for now, let’s stick with concrete examples derived from my own practical experience. In my opinion, those tools represent the bare minimum for covering most of the topics discussed in this article in the most efficient and user-friendly manner.

The tools listed below focus on source code quality rather than on security (which is the domain of tools like Black Duck), simply because I have more experience with them at the moment. The topic of security is vast and open for further articles 😎



Python

Ruff

Ruff is a revelation: an all-in-one solution for both formatting and linting in Python.

It handles code style (naming conventions, indents, etc.), enforces Python rules, automatically upgrades syntax for newer language versions, resolves deprecations, identifies dead code, locates TODOs, highlights security issues, promotes Python best practices such as exception handling, suggests performance improvements (sometimes yielding drastic results), ensures docstring conformity, and automates fixes in your code to minimize manual work. And it does all of this blazingly fast.

Yes, it boasts remarkable speed, outperforming other tools by one to two orders of magnitude (see the benchmarks on its main page), and covers the majority of commonly used Python lint rules. It reimplements, close to 1-to-1, rules from numerous other tools such as Pylint, isort, Bandit, Flake8 (and its plugins), Black, …

From my personal experience

I had always used Pylint in my pipeline and local Git hooks. Then I tried out Ruff, and this was the outcome on a codebase of a few thousand lines:

Pylint found ~120 errors and was slow to show results (about 7–10 seconds).
Ruff found ~471 errors almost instantly (about 1 second or less).

Ruff brought to light numerous issues, including:

  • Import problems;
  • Bad practices in exception handling and naming conventions;
  • Various deviations from PEP standards;
  • Inefficient usage of ‘for’ loops;
  • Inefficient usage of Python built-ins;
  • Usage of deprecated features from older Python versions, along with recommended alternatives;
  • Unclear function APIs;
  • Risky utilization of built-in functionality;
  • Suggestions for improving complex code parts.

For each finding, Ruff offers helpful fix suggestions accompanied by a rule number. You can reference the documentation to gain a comprehensive understanding, complete with code examples, of each issue.
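One finding of this kind, as an example: a mutable default argument, which to my knowledge Ruff reports as rule B006 (inherited from flake8-bugbear). The functions below are made up for this illustration; they show why the pattern is a likely bug and what the usual fix looks like.

```python
def add_tag_buggy(tag, tags=[]):
    """The default list is created once and shared between all calls."""
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    """A fresh list is created on every call that omits `tags`."""
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


first = add_tag_buggy("a")
second = add_tag_buggy("b")
# Both names point to the same shared list object.
print(first is second, second)

print(add_tag_fixed("a"), add_tag_fixed("b"))
```

The buggy version leaks state between unrelated calls, which is very hard to spot in a code review but trivial for a linter.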

Below is the Ruff config I arrived at (pyproject.toml format); I believe it is a good candidate for a bare minimum in a Python project.

[tool.ruff]
src = ["src"]
fix = true
show-fixes = true
output-format = "concise"
line-length = 170
# python standard indent type and width
indent-width = 4
# Python version the code must stay compatible with
target-version = "py312"
include = ["*.py"]
# exclude the same files as in .gitignore
respect-gitignore = true

[tool.ruff.lint]
select = [
    "A", # flake8-builtins. Check for python builtins being used as variables or parameters.
    "ARG", # flake8-unused-arguments. Checks for unused arguments.
    "ASYNC", # flake8-async
    "B", # flake8-bugbear. Finding likely bugs and design problems in your program.
    "BLE", # flake8-blind-except. Checks for blind exception catching.
    "C4", # flake8-comprehensions. Write better list/set/dict comprehensions.
    "D", # pydocstyle. Checking compliance with Python docstring conventions.
    "DTZ", # flake8-datetimez. Ban the usage of unsafe naive datetime class.
    "E", # pycodestyle. Python style guide checker - errors.
    "ERA", # eradicate. Finds commented out code.
    "EM", # flake8-errmsg. Improves your stacktraces.
    "F", # pyflakes. Checks Python source files for errors.
    "FLY", # flynt. Search for 'join' which can be replaced by f-strings.
    "FA", # flake8-future-annotations. Helps pyupgrade for better code autofixes.
    "FBT", # flake8-boolean-trap. Avoiding the confusing antipattern.
    "FURB", # refurb. A tool for refurbishing and modernizing Python codebases.
    "I", # isort. Helps you to see imports issues.
    "ISC", # flake8-implicit-str-concat. Helps for better strings concatenation.
    "ICN", # flake8-import-conventions. Improves your imports.
    "INP", # implicit-namespace-package. Checks for packages that are missing an `__init__.py` file.
    "UP", # pyupgrade. A tool to automatically upgrade syntax for newer versions.
    "LOG", # flake8-logging. Checks for issues using the standard library logging module.
    "PIE", # flake8-pie. Implements misc. lints.
    "PL", # pylint.
    "PTH", # flake8-use-pathlib. Pathlib provides an easier method to interact with the filesystem no matter what the operating system is.
    "PERF", # Perflint. Search for performance antipatterns.
    "RSE", # flake8-raise. Finds improvements for raise statements.
    "RET", # flake8-return. Finds improvements for return statements.
    "RUF", # Ruff-specific rules.
    "N", # pep8-naming.
    "S", # bandit. Automated security testing built right into your workflow!
    "SLF", # flake8-self. Checks for private members access.
    "SIM", # flake8-simplify. Checks for code that can be simplified.
    "T20", # flake8-print. Checks for 'print' and 'pprint'.
    "TID", # flake8-tidy-imports. Helps you write tidier imports.
    "TRY", # tryceratops. Helps you to improve try/except blocks.
    "TCH", # flake8-type-checking. Lets you know which imports to move in or out of type-checking blocks.
    "TD003", # flake8-todos. Check for TODO to have issue link.
    "W", # pycodestyle. Python style guide checker - warnings.
    "YTT", # flake8-2020. Checks for misuse of `sys.version` or `sys.version_info`.
]
ignore-init-module-imports = true
# Allow fixes for all enabled rules (when `--fix` is provided).
fixable = ["ALL"]

[tool.ruff.lint.isort]
force-single-line = true
order-by-type = false

[tool.ruff.lint.pydocstyle]
convention = "pep257"

[tool.ruff.format]
# Enable reformatting of code snippets in docstrings.
docstring-code-format = true

To sort imports (a safe fix) and format the code:

ruff check --select I --fix
ruff format

For unsafe fixes, which require your own review afterwards, execute:

ruff check --fix --unsafe-fixes


MyPy

Static type checking is crucial for promising Python projects that are evolving beyond simple scripting. To achieve this, you need more control over Python’s dynamic nature. With stricter typing, you can catch type-related errors before they show up at runtime, including after language upgrades.

The more you annotate, the more effective MyPy will be.

To help yourself and annotate your code automatically, use one of MonkeyType, autotyping or PyAnnotate.

An excellent goal to aim for is to have your codebase pass with:

mypy --strict
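As a small illustration of what annotations buy you, consider the sketch below. The functions and data are hypothetical. The point is the `Optional[str]` return type: a type checker is expected to reject passing such a value directly to a function that requires `str`, which forces the `None` case to be handled explicitly instead of surfacing as a runtime error.

```python
from typing import Optional


def find_user_email(users: dict[int, str], user_id: int) -> Optional[str]:
    """Return the email for user_id, or None when the user is unknown."""
    return users.get(user_id)


def email_domain(email: str) -> str:
    """Return the part of the address after the '@'."""
    return email.split("@")[1]


USERS = {1: "ann@example.com"}

email = find_user_email(USERS, 1)
# Calling email_domain(email) right away is the kind of call a type checker
# should flag, because `email` may be None. Narrow the type first:
if email is not None:
    print(email_domain(email))
```

Without the annotations, the same mistake would only show up as an `AttributeError` for the first unknown user in production.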


Another useful formatter, which can format your code according to your current code style. Useful to avoid formatting overhead in big projects.

JavaScript & TypeScript


Prettier

Prettier is a dedicated formatter for styling your code. You can also format with a linter, if such a configuration is available, but it is better to avoid that and use Prettier, which was designed for this purpose. More about it here.

Prettier is pretty fast at formatting huge codebases, and it does so in a safe way.

The first requirement of Prettier is to output valid code that has the exact same behavior as before formatting.

# Format everything
npx prettier . --write
# Check everything
npx prettier . --check

Some useful integrations:

If you use ESLint, add eslint-config-prettier to disable all ESLint formatting rules so they don’t conflict with Prettier. Also mentioned here.


ESLint

The most popular JavaScript linter.

It helps you find and fix problems in your JavaScript code. Problems can be anything from potential runtime bugs to not following best practices and styling issues (use Prettier for that!).

See the rules page for the available ESLint checks.

The configuration can get pretty vast, up to 1000 lines, so my advice is to look at popular open-source JS projects and see how they configured it.


typescript-eslint

The same as ESLint, but for TypeScript.

My preferred way is to use linting with type information and the stylistic-type-checked rules.

For more, refer to the rules page.


Java

Let’s start with formatters. The most prominent ones are Checkstyle and Spotless.

Spotless is the current choice for many companies, and a common migration target away from Checkstyle, because:

  • Its configuration is much easier.
  • It can format many languages, which is very beneficial when a Java project also has XML, Groovy, Kotlin, JSON and Markdown files.
  • It can fix style issues across the whole project with one command, which allows for Git hook integration.

As for linters, there is a good list to choose from.


PMD

Checks for miscellaneous Java code flaws. Has a CPD module to check for code duplication. Has both Gradle and Maven integration. For details, refer to the Java rules and Kotlin rules documentation.


NullAway

NullAway alerts you about possible NPEs (NullPointerExceptions) by leveraging @Nullable annotations.


Go

Go is a beautiful programming language in the sense that strict code quality checks are in its DNA.

golangci-lint

A 1-to-1 analogy of the earlier discussed Ruff for Python, in the sense that it combines a huge number of linters in one tool.

At the moment, it includes a huge number of linters (115 at the time of writing!!). To list all of them, first install the tool and then execute the command below.

golangci-lint help linters

Output in common formats such as JSON, JUnit, and Checkstyle makes integration with CI/CD reporting dashboards straightforward. Additionally, HTML output is convenient for local Git workflows. The tool also offers native integration with GitHub Actions.


Simply checks for misspellings in your code.


goreporter

Not a static code analysis tool, but rather a smart integration point. It runs the desired linters and unit tests and generates a beautiful HTML/JSON/text report for all of them!

The HTML report can also be customized with your own templates.

Online example of HTML report.

My personal workflow for linters would be like:

golangci-lint run --fix
staticcheck -checks all ./...
identypo ./...
# and locally or for pipeline reports
goreporter -p ./.. -r ./reports/report.html -f html

Multi-language SCA tools

There are also tools on the market that encapsulate all of the checks, hiding them from the developer behind an all-in-one solution. The most popular are undoubtedly SonarQube, Qodana and MegaLinter.

I would use them as a last resort in our pipelines and first try to catch issues with more niche tools, starting with Git hooks, then more heavyweight checks in the CI/CD pipeline, and only then applying a multi-language SCA tool.


To summarize, static code analysis helps cover some of the quality aspects of your projects and makes it a part of your daily routine, which inevitably becomes a part of developer’s learning process. It certainly improves code review by elevating it to a new level, allowing developers to focus more on design aspects and not be overwhelmed by minor issues. Since none of us has a compiler in our heads, some problems can be really tricky to grasp on your own, so feel free to outsource it to some nice SCA tool 😉

The first thing that SCA integration improves is maintainability, which is crucial for a successful evolution process.

And it will definitely reduce the amount of “what the fucks” in your projects, though some of them will probably stay ;)

But despite all the benefits, your quality journey only begins here. SCA tools, despite their versatility, cover only a small part of possible issues. The tricky ones you will have to find on your own. None of the integrated SCA tools will stop you from investing your own effort into the quality of your projects. Make such investments consistently, rather than waiting for big-bang “changes”. Make it a habit.

Real-life experience has taught me a lot, summarized well in the following:

“Don’t cut corners on quality, otherwise quality will cut corners on you.”

In any case, clients in production are always ready to assist you in this endeavor 😉
