TL;DR: GitHub has launched push protection, a new feature that scans for highly identifiable secrets before they are pushed to public repositories. The feature is now free for all public repositories and should be activated on the dashboard without delay. However, there are some limitations to this feature, including the detection of only a few types of secrets and the absence of historical scanning. To make an informed decision about whether this protective layer is sufficient for your needs, we have provided a detailed breakdown of these limitations. We believe that transparency is key when it comes to security, and we hope that this information will help you make the best decision for your organization.
Since its inception in 2017, GitGuardian has been advocating for improved code security, particularly for open-source code, which is highly vulnerable due to its exposure to the public.
GitHub, home to the largest open-source community, is also where the GitGuardian story started. The GitGuardian app steadily gained popularity and became the top security app on the GitHub Marketplace.
Preventing secret leaks on GitHub's massive scale is a significant challenge with serious security implications for individuals and organizations across the world. We send over 5,000 daily emails to alert contributors after detecting a hard-coded secret in their patches.
A few weeks ago, GitHub stepped up in the protection of open source by making secrets protection available and free for all public repositories (private repositories already benefited from this feature), which is a significant milestone for open-source security. 🎉
In this blog post, we summarize what to expect from this feature and where its limits lie. Our aim is to help you determine whether this protective layer is sufficient for your specific situation.
Push protection prevents secret leaks by scanning for highly identifiable secrets before they are pushed.
“When a secret is detected in code, developers are prompted directly in their IDE or command line interface with remediation guidance to ensure that the secret is never exposed.”
In practice, this means that GitHub blocks any push to a remote branch that contains a commit with a detected secret, which keeps the leaked credentials from spreading on the Git server. The remote refuses the push and responds with information about the leak.
It is important to note that this protection is opt-in, meaning that you need to activate it through the dashboard for it to be effective.
This option can be enabled at the repository, organization, and enterprise level, and also for internal or private repositories when the enterprise or organization has GitHub Advanced Security enabled.
To enable push protection in a repository, organization, or enterprise, go to your “Code security and analysis” settings and scroll down to the secret scanning section. You can enable both “Secret scanning” and its subset, “Push protection”, by selecting the “Enable all” button.
GitHub gives the contributor the option to bypass push protection by following a URL. If a contributor bypasses a push protection block for a secret, GitHub:
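For teams that manage many repositories, the same setting can also be switched on programmatically. The following is a minimal sketch, assuming the GitHub REST API's "update a repository" endpoint and its `security_and_analysis` object; the helper name `build_enable_request` and the owner, repository, and token values are hypothetical. The function only builds the request and sends nothing.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_enable_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that enables secret scanning
    and push protection on one repository.

    Assumption: the token belongs to a user with admin rights on the repository.
    """
    payload = {
        "security_and_analysis": {
            "secret_scanning": {"status": "enabled"},
            "secret_scanning_push_protection": {"status": "enabled"},
        }
    }
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{owner}/{repo}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The token should come from an environment variable or a secrets manager, never from the source file itself.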
- creates an alert in the Security tab of the repository in the state described in the table below,
- adds the bypass event to the audit log,
- sends an email alert to the organization or personal account owners, security managers, and repository administrators who are watching the repository, with a link to the secret and the reason why it was allowed.
When the push protection triggers, the secret needs to be removed from all the commits it appears in. From the developer's perspective, this creates a lot of friction, as rewriting the local commit history is not a trivial task.
The best way to keep the developer experience frictionless is to introduce secret scanning at the pre-commit stage. This stops the secret from entering the VCS (Git in this case) in the first place and spares the developer tedious history rewriting, which can even make the situation worse.
Push protection is a real-time detection mechanism that can effectively reduce the likelihood of a secret entering a remote Git branch. However, it can give your organization a false sense of security, because it does not detect secrets that were hard-coded in the past. This is a real exposure: it's not uncommon to see a historical scan surface hundreds or even thousands of incidents, with many of them still exploitable.
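To illustrate the idea of a pre-commit check, here is a minimal sketch of a hook that inspects staged changes for a few highly identifiable patterns. It is not how ggshield or GitHub work internally. The three patterns are a small illustrative subset, and the names `PATTERNS`, `find_secrets`, and `main` are hypothetical.

```python
import re
import subprocess
import sys

# Illustrative subset of "highly identifiable" secrets (fixed prefixes or headers).
PATTERNS = {
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "GitHub personal access token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def find_secrets(diff_text: str) -> list[tuple[str, str]]:
    """Return (pattern name, line content) for every added line that matches."""
    findings = []
    for line in diff_text.splitlines():
        # Only added lines matter; "+++ b/file" is a file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((name, line[1:].strip()))
    return findings


def main() -> int:
    """Scan the staged diff; a non-zero return value makes Git abort the commit."""
    staged = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    ).stdout
    findings = find_secrets(staged)
    for name, _ in findings:
        # Print only the pattern name so the secret is not echoed into logs.
        print(f"Possible secret detected: {name}", file=sys.stderr)
    return 1 if findings else 0
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(main())`, this would reject the commit before the secret ever reaches the local history, so no rewriting is needed afterwards.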
To avoid any loophole in your security posture, we encourage you to assess your repositories' health by performing a historical scan of your repositories at least once, which is done automatically when integrating the GitGuardian platform with your sources.
According to GitHub,
“This feature proactively prevents leaks by scanning for secrets before 'git push' operations are accepted, and it works with 69 token types (API keys, private keys, secret keys, authentication tokens, access tokens, management certificates, credentials, and more) detectable with a low "false positive" detection rate.”
The list of detected token types contains popular service tokens such as AWS, Azure, and Stripe. But is it enough? If we just look at the number of web APIs, it is estimated that in 2022 there were more than 24,000 in the world, and this number is growing very fast. New services appear every day, and some of them gain popularity at an explosive rate. Take, for example, the number of OpenAI API keys found in public commits in 2022:
Developer adoption is fast-paced, which makes it almost impossible to predict which service will be popular six months from now. In fact, detecting new, previously unlisted tokens is a strong predictor of the emerging popularity of a given service provider.
At first, limiting the detection capability to credentials with “a low false positive detection rate” sounds like a good idea. But there is a caveat. It means that many credentials are going to fall through the cracks. Why? Because not all credentials are easily identifiable.
In fact, in 2022, what we refer to as generic secrets accounted for no less than two-thirds (67%) of the secrets detected (from the State of Secrets Sprawl 2023).
In short, limiting secret detection to a set of secrets that are almost 100% certain to be secrets leaves room for undetected leaks. It's important to keep this in mind as it's worse to assume complete immunity against a vulnerability only to realize later that it was only partially the case.
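To make the difference concrete, here is a minimal sketch of one common heuristic for generic secrets: look for an assignment to a sensitive-sounding variable and measure the Shannon entropy of the value. This is an illustration only, not the detection logic of GitHub or GitGuardian. The keyword list, the 8-character minimum, and the 3.0-bit threshold are arbitrary assumptions.

```python
import math
import re
from collections import Counter

# Assumption: a "generic secret" is a quoted value assigned to a suspicious name.
ASSIGNMENT = re.compile(
    r"""(?i)(?:password|passwd|secret|token|api_?key)\w*\s*[:=]\s*["']([^"']{8,})["']"""
)


def shannon_entropy(value: str) -> float:
    """Average number of bits per character, based on character frequencies."""
    if not value:
        return 0.0
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


def looks_like_generic_secret(line: str, min_entropy: float = 3.0) -> bool:
    """Heuristic: suspicious variable name plus a sufficiently random value."""
    match = ASSIGNMENT.search(line)
    return bool(match) and shannon_entropy(match.group(1)) >= min_entropy
```

Unlike a fixed prefix such as `AKIA`, this kind of rule has no unambiguous signal to rely on. Raising the threshold misses weak but real passwords, and lowering it flags placeholders. That trade-off is why detectors restricted to a low false positive rate tend to leave generic secrets out.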
Activating GitHub's push protection is essential, but it only works within the enterprise or organization's perimeter. This means that developers outside of this perimeter won't be covered by this protective layer and may inadvertently leak corporate secrets or source code on their personal GitHub repositories. This is a significant security concern for organizations since 85% of leaks occur in developers' personal repositories.
Because leaks mostly happen where organizations have no authority to implement security policies (such as push protection), other tools are required to cover the full GitHub attack surface.
Finally, an often overlooked type of leak occurs when a private or internal repository is intentionally or unintentionally made public. This is a sensitive event as it exposes the entire repository history where credentials could be discovered. Since it's disconnected from the git flow, only a historical analysis of all the commits from all the branches of a repository can detect secret leaks.
It's important to note that push protection is ineffective if a private repository is made public.
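The sketch below shows why the history matters: a secret removed in a later commit is still present in the earlier one. It assumes the output of `git log --all -p` with a custom `commit <hash>` header per commit. The single AWS pattern and the names `scan_history` and `read_full_history` are hypothetical, and a real historical scan covers far more detectors than this.

```python
import re
import subprocess

AWS_KEY = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scan_history(log_text: str) -> list[str]:
    """Return the hashes of commits that added a line matching AWS_KEY."""
    flagged: list[str] = []
    current = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            current = line.split()[1]  # header produced by --format below
        elif line.startswith("+") and not line.startswith("+++"):
            if AWS_KEY.search(line) and current not in flagged:
                flagged.append(current)
    return flagged


def read_full_history(repo_path: str) -> str:
    """Dump the patches of every commit reachable from any ref in the repository."""
    return subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--format=commit %H"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    ).stdout
```

Calling `scan_history(read_full_history("."))` on a local clone lists the commits to investigate. Any credential found this way should be treated as compromised and revoked, since deleting it in a later commit does not remove it from the history.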
Although push protection is a great step towards securing open-source repositories, it's important to acknowledge the gaps that remain uncovered. It's crucial to avoid assuming complete protection at the click of a button.
A multi-layered approach to security is always beneficial, so we encourage combining push protection with a pre-commit hook, for example by installing ggshield. History scanning is a separate concern that deserves its own monitoring for threat-hunting purposes. For a more comprehensive comparison between GitHub Advanced Security and GitGuardian's enterprise-level capabilities, check out this page: