When I discuss the topic of dependencies with other developers, I often find that we approach the conversation with differing definitions of “a dependency.” Given the various operating systems, build systems, package management systems, common practices, and common paradigms, this does seem reasonable to me. It’s like the term “unit test” insomuch as the definition is subjective and has many possible correct answers based on context. For the purposes of secure dependency management, however, we need a more expansive definition.
There are three common sources of dependencies, and I like to use travel as a metaphor:
Operating system and virtualization dependencies — These are your program’s accommodations as it travels.
Runtime dependencies — These are the amenities available to your program at those accommodations.
Program dependencies — This is the luggage your program travels with.
Let’s take a trip and add some context here.
For the purposes of this section, we’re going to treat an operating system as one of four things:
A bare-metal operating system running on a server or a laptop, including drivers, or any other system package.
A virtualized operating system running within a hypervisor.
The hypervisor application or virtualization runtime itself.
A serverless compute resource.
Many of us have received findings from security teams telling us that a kernel, OpenSSH, or other system package version was out of date. These are frequently detected by scanners such as Nexpose, CrowdStrike, or even something as simple as apt or yum. Excellent automation already exists to patch and update system packages, and to report on which packages and versions are installed.
On a local development environment, it’s important for the developer to keep their machine up to date. Whether it be Homebrew, apt, yum, dpkg, Chocolatey, or any of the other common desktop package management tools, it is very important that folks understand and use these tools well. In an enterprise environment, it becomes important to catalog the packages on each machine so that, when vulnerabilities are reported, notifications can be sent out telling folks to apply updates.
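To make "catalog packages on machines" a little more concrete, here is a minimal sketch in Python. It assumes a Debian or Ubuntu host where dpkg-query is available; the function names are my own, and a real inventory would more likely come from your endpoint management tooling than from a script like this.

```python
import subprocess


def parse_package_list(text: str) -> dict[str, str]:
    """Turn lines of 'name<TAB>version' into a {name: version} mapping."""
    packages = {}
    for line in text.splitlines():
        name, sep, version = line.partition("\t")
        if sep and name.strip():
            packages[name.strip()] = version.strip()
    return packages


def installed_debian_packages() -> dict[str, str]:
    """Ask dpkg-query for installed packages (assumes a Debian-family host)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "--showformat=${Package}\t${Version}\n"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_package_list(result.stdout)
```

Calling installed_debian_packages() on such a host returns a mapping you could ship to a central catalog. The parsing is kept separate so that the same function can read output collected from other machines.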
Virtual machines are a special case for dependencies because they come in so many varieties:
VMs running in bare-metal hypervisors (these aren’t as common as they used to be)
VMs running in cloud hypervisors (these are very common)
Containerized VMs running container images (for example, Docker)
For the purposes of dependency management, let’s break these down into two groups: container images and whole VMs.
For a container image, the definition of a dependency is pretty straightforward: the image itself is the dependency. A container image is a versioned collection of layers which, when loaded, produces a running VM. That makes the container image a dependency, but that’s not all! A container image can also be derived from other images, or be a composition of multiple images, and it can nest very deeply. Each of the images involved in creating the image you’re using may have its own set of issues. There is a good deal of tooling out there, between cloud vendor tools such as Amazon Inspector and Docker’s docker scan functionality (since superseded by Docker Scout), so it’s not difficult to keep yourself covered.
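As a rough illustration of "an image derived from other images," the sketch below pulls the external base images out of a Dockerfile's FROM instructions. The function name and the sample Dockerfile are made up for this example, and it only sees one level: each base image may itself be built from others, which you would need to inspect separately.

```python
def base_images(dockerfile_text: str) -> list[str]:
    """Return the external images named by FROM instructions, in order."""
    stage_names: set[str] = set()
    images: list[str] = []
    for raw_line in dockerfile_text.splitlines():
        parts = raw_line.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Ignore flags such as --platform=linux/amd64
        args = [part for part in parts[1:] if not part.startswith("--")]
        if not args:
            continue
        image = args[0]
        # A FROM that names an earlier build stage is not an external image
        if image.lower() not in stage_names and image not in images:
            images.append(image)
        # "FROM image AS name" declares a stage name for later reference
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2].lower())
    return images


# Illustrative Dockerfile, not taken from a real project
SAMPLE_DOCKERFILE = """\
FROM node:20-slim AS build
RUN npm ci
FROM build AS test
FROM debian:bookworm-slim
COPY --from=build /app /app
"""

print(base_images(SAMPLE_DOCKERFILE))
```

Here the sample depends on two external images even though it contains three FROM lines, because the middle one refers back to the build stage.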
For non-containerized VMs, there are two common ways I’ve seen folks manage these: persistent, and disposable.
A persistent VM is one that you expect to behave like a traditional bare-metal server. You set up patching schedules, you use yum or apt (or other package manager) directly, and you patch running VMs in place.
A disposable VM is one where your patching strategy is to create a new copy of the VM’s disk image, and then destroy the out-of-date one while you deploy the new one. This is very common in cloud environments as it makes patching much quicker, and also makes it easier to take advantage of popular cloud functionality such as auto-scaling.
For a persistent VM, dependencies are the same as they would be if the VM were a traditional server, and regular operating system packages are your dependencies. For a disposable VM, however, the VM image becomes the dependency more along the lines of a containerized VM image. These can frequently be version-controlled, or at least produced using tooling which is in version control. The operating system packages inside of these images are super important, but the “dependency” would be the versioned image itself.
This may seem like splitting hairs, but it makes a big difference when you’re talking about the volume of findings, workflows for maintaining running systems, and communicating to the business what needs to be done. In my [REDACTED] years in the software field, I have had it happen more than once where I got a giant list of “findings” from a scan which all turned out to belong to a versioned image.
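Here is a small sketch of why that distinction matters for volume. The findings, host names, and image tag below are invented for illustration; the point is only that many per-host findings can collapse into a single "rebuild this image" work item once you group them by the image they came from.

```python
from collections import defaultdict


def group_by_image(findings: list[dict[str, str]]) -> dict[str, list[str]]:
    """Collapse per-host findings into one entry per versioned image."""
    grouped: dict[str, set[str]] = defaultdict(set)
    for finding in findings:
        grouped[finding["image"]].add(finding["package"])
    return {image: sorted(packages) for image, packages in grouped.items()}


# Hypothetical scanner output: four findings across two hosts
FINDINGS = [
    {"host": "web-1", "image": "base-image:2024-01", "package": "openssl"},
    {"host": "web-1", "image": "base-image:2024-01", "package": "curl"},
    {"host": "web-2", "image": "base-image:2024-01", "package": "openssl"},
    {"host": "web-2", "image": "base-image:2024-01", "package": "curl"},
]

work_items = group_by_image(FINDINGS)
print(f"{len(FINDINGS)} findings, {len(work_items)} image(s) to rebuild")
```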
For serverless compute resources, such as AWS Lambda or Azure Functions, you would usually think that there’s no need to consider dependencies outside of your own code and its libraries, but any program or library that you deploy alongside your serverless resource counts. If you’re using a layer in an AWS Lambda to add ClamAV, or if you’re packaging a clamav.exe executable for your Azure Function, then you’ve made ClamAV (in the relevant form) a dependency. It’ll be important to keep that updated as well.
A runtime dependency is just that: a dependency of the runtime. This could be an optional feature or a standard one, such as OpenSSL in Node.js or a Zend module like PCRE in PHP. It could be a server runtime like Tomcat, or even a virtualized runtime like Node.js for an AWS Lambda or Azure Function. Most Docker containers would fit in here, too. In my experience, runtime dependencies are the most forgotten type of dependency.
Runtime dependencies are important to remember, though, because they tend to be the most disruptive to patch or update. That is largely because they are also the most difficult to detect:
These dependencies are not in your program’s package manager.
These dependencies are likely considered part of your server or environment, not your application.
These dependencies are not part of the normal functioning of the operating system, but are only installed for your application, so they may not be detected by tools which report on out-of-date packages.
Many GNU/Linux distributions do not keep up with new versions of these dependencies well, or pose a greater risk for breaking changes, so it can be common to find them installed in either source or binary form outside of normal operating system package managers.
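Since these dependencies don't show up in your program's package manager, one option is to ask the runtime itself what it was built with. The sketch below does that for Python, which I'm using here as a stand-in for whatever runtime you actually run (the examples in the text are Node.js and PHP). The function name is my own; the attributes it reads are part of Python's standard library.

```python
import platform
import sqlite3
import ssl
import zlib


def runtime_report() -> dict[str, str]:
    """Report versions of libraries bundled into or linked by the interpreter."""
    return {
        "python": platform.python_version(),
        "openssl": ssl.OPENSSL_VERSION,
        "sqlite": sqlite3.sqlite_version,
        "zlib": zlib.ZLIB_RUNTIME_VERSION,
    }


for name, version in runtime_report().items():
    print(f"{name}: {version}")
```

None of these versions appear in a requirements file or lock file, which is exactly why they're easy to miss.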
Take all of this, then add layers in Lambdas and Docker images, and now it is possible to grok in fullness the picture of runtime dependencies. Runtime dependencies are big, they’re layered, and they take special attention. (Everybody who had to chase down log4j across dozens of Docker images is nodding their head with me right now.)
When we consider runtime dependencies, we need to consider all of the different times when our application can run:
During a build
During development
During testing
During deployment
In production
These are all different areas where runtime dependencies can be environment-specific. For example, your debugger may be a runtime dependency for development and testing, but not in production. Your testing framework(s) will be useful from development through deployment, but you probably don’t even want the tests deployed to production at all.
Some of these runtime dependencies will perform tasks locally, others will call out to third party services. Some of these dependencies may interact with our infrastructure or even stand up virtualized instances for testing. There really is a lot of work done by these dependencies and it is very important to have a good hold on which of them are in use and what they do.
There are two ways in which a dependency can be classified as a program dependency:
A managed program dependency is one that the program installs through a package manager during its build or install process.
An unmanaged program dependency is one that is expected to be installed independently of the build or install process used by the program itself.
Package management is a familiar practice across many languages, but the tools have some key differences. Most package managers have a few things in common:
There’s a file which lays out which dependencies are needed.
There’s frequently a lock file which contains the source, as well as some sort of validation hash to verify the integrity when installing dependencies in other environments.
The files used are intended to be version-controlled with the program to help make sure that the dependency versions will be kept stable, resulting in consistent functionality and performance of the program.
Most of these package management schemes have a program to maintain them, such as npm, yarn, carton, cargo, composer, pip, or mvn.
These tools will frequently let you specify dependencies for development in addition to dependencies needed to run the program.
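The "validation hash" idea from the lock file can be sketched in a few lines. This is a simplified model, not any particular package manager's format: the lock entry layout, the package name, and the function name are hypothetical, and real tools differ in which hash algorithm and encoding they record.

```python
import hashlib


def matches_lock_entry(artifact: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against the hash recorded at lock time."""
    actual = hashlib.sha256(artifact).hexdigest()
    return actual == expected_sha256.lower()


# At lock time: record the hash of the artifact you resolved (made-up data)
artifact = b"example package contents"
lock_entry = {
    "name": "example-package",
    "version": "1.0.0",
    "sha256": hashlib.sha256(artifact).hexdigest(),
}

# At install time, possibly in another environment: verify before using
print(matches_lock_entry(artifact, lock_entry["sha256"]))
print(matches_lock_entry(artifact + b"tampered", lock_entry["sha256"]))
```

The first check passes and the second fails, which is the behavior that lets a lock file catch a package that changed between environments.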
Managed dependencies are “managed” because the package management tool helps you maintain your dependencies. This maintenance includes installing dependencies, removing dependencies, and upgrading dependencies. Many package managers also include scanning capabilities for security vulnerabilities.
For compiled programs, or for languages where package management is less common (like C or C++), there may be tooling which automatically detects the presence or absence of dependencies. Two examples of this are the GNU Autotools suite, and CMake. While these tools detect the presence or absence, they do not install or manage dependencies; these programs verify that a program can be built, installed, or run.
In the dynamic language space there are a number of examples of unmanaged program dependencies. Perl, Python, and Ruby frequently depend on globally-installed dependencies. Bash programs (yes, Bash is a programming language) often assume commands like wc, grep, jq, among others, are available.
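A presence check like the ones these tools perform can be approximated with Python's shutil.which, which looks a command up on the PATH. This is only a sketch under that assumption: the function name is my own, and like Autotools or CMake it detects a missing dependency without installing anything.

```python
import shutil


def missing_commands(required: list[str]) -> list[str]:
    """Return the commands from `required` that are not found on the PATH."""
    return [command for command in required if shutil.which(command) is None]


# Commands the text mentions Bash programs often assuming
missing = missing_commands(["wc", "grep", "jq"])
if missing:
    print("Missing unmanaged dependencies:", ", ".join(missing))
else:
    print("All required commands were found.")
```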
Key distinguishing characteristics for unmanaged program dependencies are:
The program directly uses the dependency, and will be unable to install or run without it. Common examples include libc, CPAN, setuptools, or openssl.
There is no package manager used, even though there may be a script distributed with the program which assists in the otherwise manual process of package installation and verification.
Unmanaged dependencies can be installed in a “local” manner, meaning that they’re installed alongside the program in a place like ./lib/ or ~/.local/, or “globally,” meaning that they’re installed in a common place such as /usr/local/lib/ or /usr/local/include/.
Some dependencies are indirect, also sometimes referred to as “dependencies of dependencies.” For example, if you’re using express as a dependency, you’re using escape-html indirectly because express uses escape-html. It’s important to keep this in mind, as it may not always be clear which things you’re using if you stop at your own dependencies.
Just like your direct dependencies, these can be managed or unmanaged. Common examples of unmanaged indirect dependencies would be your database client library, ImageMagick, or OpenSSL.
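Finding indirect dependencies amounts to walking a graph. The sketch below uses a tiny hand-written graph based on the express example; the my-app name and the graph itself are assumptions for illustration, and in practice you would read this information from your package manager or lock file rather than typing it out.

```python
def all_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Collect every direct and indirect dependency of `package`."""
    seen: set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        dependency = stack.pop()
        if dependency in seen:
            continue  # already visited; also guards against cycles
        seen.add(dependency)
        stack.extend(graph.get(dependency, []))
    return seen


# Hand-written, deliberately incomplete graph for illustration
GRAPH = {
    "my-app": ["express"],
    "express": ["escape-html"],
}

direct = set(GRAPH["my-app"])
indirect = all_dependencies("my-app", GRAPH) - direct
print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

Subtracting the direct set from the full set leaves the "dependencies of dependencies," which are the ones you never asked for by name.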
Now that we’ve defined dependencies, we can find them and maintain them; simple, right? Not exactly. Software dependencies are living, breathing things, and they will change over time. It’s important to keep up with how you, and others in your organization, maintain dependencies. If you find that you aren’t addressing dependencies yet, this may seem daunting.
I’ve been in the software game for a long time, and the one thing I’ve learned is that nobody feels like they’ve got it perfect. I’ve heard teams tell me that their application is “in maintenance mode,” and that’s why they won’t update their dependencies. I’ve heard folks say that the next version up has breaking changes, and they don’t have time to upgrade. Sometimes folks will say that their dependencies release new versions too often (think React and AWS SDK), so it’s impossible to keep up! All of these folks are right, and all of them are wrong. It is hard, but there are ways to make it easier on yourself.
Make maintenance a habit. Just like oil disposal is a normal cost for Jiffy Lube, and changing the oil is a cost of doing business for Wendy’s, updating dependencies is the cost of doing business in software. Look for hooks in your workflows and SDLC where you can shim dependency-related maintenance in. A key benefit of habitually updating dependencies is that you’ll be more in tune with changes, resulting in fewer surprises when you do find those breaking changes.
Prioritize automated tests, no matter where. If you have good test coverage — unit tests, integration tests, anything — then you might even be able to automate some of your dependency updates.
Buckle down, and just do it. Breaking changes are nobody’s idea of fun, but you can’t really avoid them. Put an item on your to-do list, find time, make it happen. Just like a difficult workout, it won’t be fun and it’ll be difficult to get started, but you’ll be glad you did when you’re done.
Build relationships with key maintainers. If you’re using a dependency which is niche, or which is crucial to your application, reach out to the maintainer and get to know them. Don’t forget that they’re human, they’re also busy, and they also have different priorities competing for their attention.
Sponsor the open source dependencies you can’t live without. Open source isn’t parade candy, it takes a bunch of work. It’s just code, like any other code. If you depend deeply on an open source project then don’t leave it to chance! Make sure that maintainer knows that you support them, that you value their work, and that you’re giving them incentive to keep working on the thing you need.
Find a bug? Submit a PR! Maintainers only have so much time between whatever their for-pay work is and their personal lives. As a maintainer it is a joy when I receive not only a bug report but also a PR to address it. Just don’t forget to read the CONTRIBUTING.md file, if there is one.
I’ll close this with the same advice I give to everyone: keep moving the ball forward. Slow progress is better than no progress, and just having a good inventory of which dependencies are in use, or a Software Bill of Materials (SBOM), can be very helpful.
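If you want a starting point for that inventory, here is a minimal sketch that lists the Python packages installed in the current environment using the standard library's importlib.metadata. The function names are my own, and the output is a plain JSON list rather than a real SBOM format such as SPDX or CycloneDX. It also only covers managed Python packages, so the operating system and runtime dependencies discussed earlier would need their own inventory.

```python
import json
from importlib import metadata


def python_inventory() -> list[dict[str, str]]:
    """List installed Python distributions as name/version records."""
    items: dict[str, dict[str, str]] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions with unreadable metadata
        version = dist.metadata.get("Version") or "unknown"
        items[name.lower()] = {"name": name, "version": version}
    return [items[key] for key in sorted(items)]


def inventory_as_json() -> str:
    """Serialize the inventory so it can be stored or compared over time."""
    return json.dumps(python_inventory(), indent=2)
```

Running print(inventory_as_json()) in each environment and committing the result somewhere is a small, slow-progress step that still beats having no inventory at all.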
I’ve been working on this article off and on for four months, and I’d like to thank Jason Taylor and Bola Adebesin for motivating me to write and helping me to edit this article. It’s better for their help, I promise.