This guide explains how to remove sensitive text from your Git repo. It requires Java and BFG Repo-Cleaner, which is endorsed by GitHub.
- Close/merge all pull requests.
- Make sure all developers' local branches are clean.
- Clone your repo to create an emergency backup.
- Download the `bfg` JAR file from BFG Repo-Cleaner, for example `bfg-1.14.0.jar`.
- Go to a temporary working folder.
- Clone a bare mirror of your repo, for example: `git clone --mirror git@gitlab.com:SomeUser/myrepo.git`
- Create a text file named `sensitive.txt` listing the text to replace, one entry per line. Entries are matched literally by default; prefix an entry with `regex:` to use a regular expression. For example, this text file will replace all occurrences of `password123` with `***REMOVED***` and all occurrences of `abc123` with `samplePassword`:
password123
abc123==>samplePassword
- Copy in the `bfg-1.14.0.jar` file.
- Execute this command to replace the sensitive text: `java -jar bfg-1.14.0.jar --no-blob-protection --replace-text sensitive.txt myrepo.git`
- Go into the mirror repo: `cd myrepo.git`
- Execute `git reflog expire --expire=now --all && git gc --prune=now --aggressive`
- Push to your remote branch: `git push`. Note: If this fails, you may need to unprotect the branch in the remote Git server.
- Ask all developers to re-clone the repo to get the rewritten Git histories.
- Verify that the repo looks correct, then delete the local backup repo and temporary working folder.
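The steps above can be sketched as a small shell script. Treat it as a sketch under assumptions, not a finished tool: the function names `write_rules` and `scrub_repo` are made up for illustration, the repo URL and JAR path are whatever you pass in, and it assumes `git` and `java` are on your PATH.

```shell
# Write the example replacement rules from this post to the file named in $1.
# A bare line is replaced with ***REMOVED***; "old==>new" replaces old with new.
write_rules() {
    printf '%s\n' 'password123' 'abc123==>samplePassword' > "$1"
}

# Hypothetical helper: $1 = repo URL, $2 = path to the BFG JAR, $3 = rules file.
# Run it from the temporary working folder.
scrub_repo() {
    repo_url=$1
    bfg_jar=$2
    rules=$3
    mirror=$(basename "$repo_url")   # e.g. myrepo.git

    # Clone a bare mirror, then rewrite history with BFG.
    git clone --mirror "$repo_url" "$mirror" || return 1
    java -jar "$bfg_jar" --no-blob-protection --replace-text "$rules" "$mirror" || return 1

    # Expire old refs, garbage-collect the replaced objects, and push.
    (
        cd "$mirror" || exit 1
        git reflog expire --expire=now --all &&
            git gc --prune=now --aggressive &&
            git push
    )
}
```

For example: `write_rules sensitive.txt && scrub_repo git@gitlab.com:SomeUser/myrepo.git bfg-1.14.0.jar sensitive.txt`. Each step only runs if the previous one succeeded, so a failed BFG run is never pushed.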
The sensitive text is now gone from the repo's Git history.
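One way to double-check the result is to search every commit reachable from any ref for the old text. The sketch below assumes `git` is installed; `history_contains` is a hypothetical helper name, and it only looks at file contents, not commit messages.

```shell
# Hypothetical helper: succeeds if the fixed string $2 appears in any file
# of any commit reachable from any ref in the repo at $1 (bare or not).
history_contains() {
    for rev in $(git -C "$1" rev-list --all); do
        # -F matches a literal string, -q stops at the first hit.
        if git -C "$1" grep -q -F -e "$2" "$rev"; then
            return 0
        fi
    done
    return 1
}
```

For example, `history_contains myrepo.git password123 && echo "still present"` should print nothing after a successful cleanup. Walking every commit is slow on large repos, but it is simple and works in a bare mirror.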
Thanks for reading!
Follow me on Twitter @realEdwinTorres for programming tips, software engineering content, and career advice.
The content in this blog post is publicly available at Git, GitHub, GitLab, and BFG Repo-Cleaner.
Oldest comments (2)
Is this process necessary if my PR (which contains sensitive data) is still open (or closed) and is not merged yet in main repo?
In my case, if I simply remove the commits that contain sensitive data, will it suffice?