The real reason companies do software inventory exercises is to have a sense of control over their application portfolio. Because you need to KNOW that everything is tested, monitored and patched. That feeling of being certain, rather than ‘fairly confident’, is the difference between knowing that you are doing a good job of securing your software and just hoping that you are.
Most enterprise-level companies have hundreds or even thousands of assets, and do not have an accurate inventory of them. This happens for many reasons: developers don’t always report their work, no one is assigned to keep track of application updates, shadow IT, people not realizing that an API or SaaS (Software as a Service) qualifies as an ‘app’, APIs being split into multiple APIs and the new one(s) not being recorded, lost Excel spreadsheets, other priorities getting in the way, employee turnover, etc. There are too many reasons to list, and the cause of the problem doesn’t matter as much as what can be done about it.
Why is having an incomplete inventory of your web assets a problem? Because you can’t protect something if you don’t know you have it.
You might be saying, “All of our applications are released via our standard DevOps Pipeline.” Sure, all the ones you are aware of go through it. What about the ones that you don’t know about?
Buying application security tools and then using them on only 60%-90% of your apps means many of your web assets are not hardened, protected, monitored and logged. This presents a potentially large hole in your web-armour. It's almost as if there's a good reason that the NIST Cybersecurity Framework has asset management (Identify) in the #1 spot, SC Magazine lists improper inventory as one of the Top 5 risks to application security, IBM lists it as #1 on Five Steps to Achieve Risk-Based Application Security Management, Contrast Security lists it as the #1 step of The 4 Dimensions of a Sound Application Security Strategy, and Veracode lists it as a precursor to even starting the Five Best Practices of Vendor Application Security Management. Ensuring you have a current, complete, and accurate inventory is the foundational step for every application security program.
As you might have guessed by this point, when I wrote this article I was working for a company that did application inventory. :-D But trust me, it's still very important.
Story from Tanya:
Back in the day when I was a software developer, one of the government departments that I worked for hired a consultant to do ‘application portfolio management’. His name was Sanjeev. Sanjeev was the coolest guy in our office: well-dressed, charming, always in the right meetings with the powerful people. As you can guess, Sanjeev was paid a lot more than I was. His entire project, for the whole year, was to interview my 14 team members and me about which apps we had, to get links to where they were, and to gather any documentation we had about them. He was a patient guy, talking to developers all day and asking us for information that we just didn’t have. Also, none of us thought what he was doing was important, and he was sometimes treated as a pest.
Long story short, we thought we had 200 apps, but he only found evidence of 72. Knowing what I know now, I wonder how many we really had.
In order to perform a manual application inventory you could interview members of your dev teams and ask them what apps they have. Although having good communication with the dev team is always a good idea, this would be a long and tedious process if you are at a large company. If you are reading this blog, you are likely to be looking for a more technical solution, so let’s look at how we can try to automate this.
For a DevOps shop (those using automated pipelines for releases), inventorying all of the apps and APIs released from various pipelines would be a good starting point. For example: add triggers to the git pre-push hook or to a Jenkins pipeline. Any that are not released via a pipeline would not be included, but ideally this would cover the majority of your newer applications.
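As a rough illustration of such a trigger, here is a minimal Python sketch that a hook or pipeline step could call on every release. Everything in it is an assumption made for the example: the function name `record_release`, the fields it stores, and the choice of a local JSON file (a real setup would more likely write to a shared database or asset-management system).

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_release(inventory_path, app_name, repo_url, environment):
    """Add or update one application entry in a JSON inventory file.

    Hypothetical helper: the file format and field names are illustrative.
    """
    path = Path(inventory_path)
    # Load the existing inventory, or start a new one on first use.
    inventory = json.loads(path.read_text()) if path.exists() else {}
    # Keyed by app name, so repeated releases update rather than duplicate.
    inventory[app_name] = {
        "repo_url": repo_url,
        "environment": environment,
        "last_released": datetime.now(timezone.utc).isoformat(),
    }
    path.write_text(json.dumps(inventory, indent=2, sort_keys=True))
    return inventory
```

A pipeline stage could call this with whatever job name and repository URL the pipeline exposes. Because entries are keyed by name, the inventory also ends up recording when each app was last released, which helps spot apps that have gone quiet.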
From there you could look at this from a network angle. If you have the IP range(s) for your network and you have permission to scan them, you could run a network scanning tool such as nmap or masscan against the range and look for hosts with TCP port 80 (HTTP) or 443 (HTTPS) open. If either of these ports is open, the host is likely running something web-related. From there you could use an automated tool such as EyeWitness to visit each of those IP addresses, one at a time, and take a screenshot of the page to see what’s there. Or you could hire someone and have them visit each one manually, to find out what is there.
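To make the idea concrete, here is a small Python sketch of the same check using only the standard library. It is not a substitute for nmap or masscan (it is sequential and far slower); the function names are made up for this example, and it should only ever be pointed at ranges you have permission to scan.

```python
import ipaddress
import socket

WEB_PORTS = (80, 443)  # HTTP and HTTPS


def scan_targets(cidr, ports=WEB_PORTS):
    """Yield every (ip, port) pair to check in the given CIDR range."""
    for ip in ipaddress.ip_network(cidr).hosts():
        for port in ports:
            yield str(ip), port


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not open".
        return False


def find_web_hosts(cidr, ports=WEB_PORTS):
    """Return the (ip, port) pairs in the range that accept connections."""
    return [(ip, port) for ip, port in scan_targets(cidr, ports)
            if is_port_open(ip, port)]
```

Calling `find_web_hosts` with one of your own ranges in CIDR notation returns the candidate addresses that a screenshot tool, or a person, would then visit.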
There are a few problems with the network scanning approach.
Web applications are not restricted to running only on port 80 or 443 (ask most developers where they test locally, and they’ll probably tell you port 8080). Web servers are capable of having several APIs and apps on them as well; they could be separated by ports or through the host name sent in the request. This means that you may not even see all the applications or APIs at a given IP address with a simple network scan.
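The host-name point can be checked with a short Python sketch: send the same request to one IP address several times, changing only the `Host` header, and compare the answers. The function name and return shape are assumptions for illustration, and the sketch covers plain HTTP only; HTTPS sites selected by TLS server name would need a different approach.

```python
import http.client


def probe_host_names(ip, port, host_names, timeout=3.0):
    """Request "/" from one IP address once per candidate Host header.

    Returns {host_name: (status_code, body_length)}, or None for a name
    that got no usable answer. Differing results for the same address
    hint at separate apps sharing it.
    """
    results = {}
    for name in host_names:
        conn = http.client.HTTPConnection(ip, port, timeout=timeout)
        try:
            # Supplying our own Host header replaces the default one.
            conn.request("GET", "/", headers={"Host": name})
            response = conn.getresponse()
            results[name] = (response.status, len(response.read()))
        except (OSError, http.client.HTTPException):
            results[name] = None
        finally:
            conn.close()
    return results
```

Note the catch: you can only probe host names you already know to try, which is exactly why a plain IP and port scan undercounts.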
You may not even be able to reach all your IP addresses due to firewalls, VLANs, or physical network separation.
What if your applications are hosted in the cloud? In cloud environments, IP addresses change frequently, and applications cycle rapidly (for example – serverless applications). You’re certainly not going to scan all of your cloud provider’s network hosts; you neither have the time (that’s a lot of IP addresses) nor the legal permission (remember – the “cloud” is just someone else’s computer) to do so.
Another approach could be to do an external scan of your domain name(s). If you know all of your domain names, you could use a tool like OWASP ZAP or Burp Suite to crawl all the links on your domain and have it find as many pages as it can.
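At its core, a crawl of this kind is repeated link extraction: fetch a page, collect the links that stay on your domain, and repeat for each one. The Python sketch below shows only that extraction step on HTML you have already fetched. It is a simplified illustration with made-up names, not how ZAP or Burp Suite work internally.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(html, page_url):
    """Return the sorted, absolute links on a page that stay on its host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    # Resolve relative links against the page URL and de-duplicate.
    links = {urljoin(page_url, href) for href in collector.hrefs}
    return sorted(link for link in links if urlparse(link).netloc == host)
```

Only URLs that appear as links are returned. An endpoint that is referenced solely from JavaScript, or not referenced anywhere, never shows up in the result.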
There are issues with this method too, mostly around visibility.
Crawling software can only see what it has access to. If there is no link on a page to an API call, it will never see that API. Are you using MFA (multi-factor authentication)? If so, the crawler won’t be able to get around that (it’s literally what MFA was designed for). Do you have authentication or authorization in your applications? If so, you probably don’t want to write evergreen login credentials into an application (this is also known as a “backdoor”) and then give those credentials to a crawler. That crawler may also hit some sensitive parts of your application (e.g. admin functionality) and wreak wanton destruction on your database, users, or critical business functions.
An external network agent will only be able to access what an external network user will be able to access. This means that it does not give a complete and accurate view of your applications.
If your domain list is incomplete, or there are subdomains you didn’t know about, again – the crawler will completely miss these too.
These approaches all come down to two main problems:
- These actions require significant overhead to continuously update your targets and initiate scans.
- You will never be able to get a complete view of your applications and their functionality.
In the meantime, I ask you to ponder the following question:
How do you create and maintain your application inventory?