DEV Community

Adem Kouki

📜 RepoList - A tool to generate wordlists based on GitHub repositories

Hello everyone, I am back with another tool. This time it is a tool that generates wordlists based on GitHub repositories. I have named it Repolist.

It is a simple tool written in Python. The code is available on GitHub and the package is available on PyPI.

The story behind Repolist

I was pentesting a website and trying to brute-force its directories and files. The common wordlists from SecLists didn't help much, so I thought of creating a custom wordlist.

I knew for a fact that the website used an open source e-commerce platform called PrestaShop for its backend, so I decided to base the wordlist on the files and directories of PrestaShop.

I didn't want to manually copy the files and directories. So I thought of creating a tool that would do it for me.

I'm sure there are other tools that do the same thing. But I wanted to create my own tool just for fun. Python is not my primary language for development. So I thought it would be a good opportunity to use Python for this project.

What is Repolist?

Repolist is a tool that generates wordlists based on GitHub repositories. It uses the GitHub API to fetch the files and directories of a repository, then saves their paths in a text file.

To install Repolist, run the following command:

pip3 install repolist

To generate a wordlist, run the following command:

repolist -u "https://github.com/username/repository"

Options

Arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     Github repository URL (required)
  -o OUTPUT, --output OUTPUT
                        Output file (optional)
  -b BRANCH, --branch BRANCH
                        Use a specific branch (optional)
  -t TOKEN, --token TOKEN
                        Github token (optional)
  -p PREFIX, --prefix PREFIX
                        Prefix (optional)
  -s SUFFIX, --suffix SUFFIX
                        Suffix (optional)
  -f, --files           Get only files (optional)
  -d, --directories     Get only directories (optional)
  -v, --verbose         Verbose mode (optional)
  --proxy PROXY         Proxy (optional)
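
To make the -f, -d, -p and -s options more concrete, here is a minimal sketch of how tree entries could be filtered and decorated before being written out. The helper name build_wordlist and its parameters are hypothetical and are not taken from the Repolist source; the only assumption about GitHub is that the Git Trees API labels files as "blob" and directories as "tree".

```python
def build_wordlist(entries, only_files=False, only_directories=False,
                   prefix="", suffix=""):
    """Turn tree entries into wordlist lines (hypothetical helper)."""
    words = []
    for entry in entries:
        # The Git Trees API marks files as "blob" and directories as "tree".
        if only_files and entry["type"] != "blob":
            continue
        if only_directories and entry["type"] != "tree":
            continue
        # Prefix and suffix are plain string concatenation around each path.
        words.append(f"{prefix}{entry['path']}{suffix}")
    return words


# Sample data shaped like the "path"/"type" pairs collected from the API.
entries = [
    {"path": "admin", "type": "tree"},
    {"path": "admin/index.php", "type": "blob"},
]
print("\n".join(build_wordlist(entries, only_files=True, prefix="/")))
```

With neither filter set, both files and directories are kept, which matches the default behaviour described in the options above.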

Combining Repolist with other tools

Using RepoList with tools like ffuf, httpx and gobuster can be very useful for penetration testing and bug bounty programs.

For example, you can use ffuf to bruteforce the files and directories of a website using the wordlist generated by Repolist.

repolist -u "https://github.com/WordPress/WordPress" | ffuf -u "http://example.com/FUZZ" -w -

If you have other tools in mind, please let me know in the comments below.

How I made Repolist

I used Python with Poetry to create Repolist. Poetry is fairly new to me, and it was a great experience using it: easy setup and simple dependency management. With a few commands, I was able to create the project and publish it to PyPI. I will definitely use it for my future projects.

Argparse is used to parse the command line arguments. Requests is used to make the HTTP requests to the GitHub API.
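
As a rough illustration of the argparse side, the options listed earlier could be declared along these lines. This is a sketch reconstructed from the help output, not the actual Repolist source; the function name build_parser and the empty-string defaults for prefix and suffix are assumptions.

```python
import argparse


def build_parser():
    """Declare the CLI options shown in the help output (illustrative only)."""
    parser = argparse.ArgumentParser(prog="repolist")
    parser.add_argument("-u", "--url", required=True, help="Github repository URL")
    parser.add_argument("-o", "--output", help="Output file")
    parser.add_argument("-b", "--branch", help="Use a specific branch")
    parser.add_argument("-t", "--token", help="Github token")
    parser.add_argument("-p", "--prefix", default="", help="Prefix")
    parser.add_argument("-s", "--suffix", default="", help="Suffix")
    # Flags without a value are modelled as booleans.
    parser.add_argument("-f", "--files", action="store_true", help="Get only files")
    parser.add_argument("-d", "--directories", action="store_true",
                        help="Get only directories")
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose mode")
    parser.add_argument("--proxy", help="Proxy")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-u", "https://github.com/username/repository", "-f", "-s", ".php"]
)
print(args.url, args.files, args.suffix)
```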

The code behind Repolist

The code is fairly simple. A single request to the GitHub Git Trees API with recursive=1 returns the files and directories of a repository, and their paths are then saved in a text file.

Here is a small snippet of how it works:

    def _get_files_and_directories(self, username="", repo="", branch="main"):
        """
        Get files and directories from a repository (recursive)
        https://docs.github.com/en/rest/reference/git#trees
        """
        url = "https://api.github.com/repos/{}/{}/git/trees/{}?recursive=1".format(
            username, repo, branch)
        r = self._make_request(url) # add headers if token is specified
        if r.status_code == 200:
            for file in r.json()["tree"]:
                self.repo_content.append({
                    "path": file["path"],
                    "type": file["type"]
                })
        else:
            self._log_error(type=r.status_code, msg=r.text)
            exit(1)
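
The snippet above expects a username and a repo, while the CLI takes a full URL. A minimal sketch of that conversion might look like the following; parse_repo_url and tree_url are hypothetical helpers written for this post, and the real tool may handle the URL differently.

```python
from urllib.parse import urlparse


def parse_repo_url(url):
    """Split a GitHub repository URL into (username, repo). Hypothetical helper."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"Not a repository URL: {url}")
    username, repo = parts[0], parts[1]
    # Tolerate clone-style URLs that end in ".git".
    if repo.endswith(".git"):
        repo = repo[:-4]
    return username, repo


def tree_url(username, repo, branch="main"):
    """Build the same recursive Git Trees API URL used in the snippet above."""
    return "https://api.github.com/repos/{}/{}/git/trees/{}?recursive=1".format(
        username, repo, branch)


username, repo = parse_repo_url("https://github.com/WordPress/WordPress")
print(tree_url(username, repo, branch="master"))
```

Note that the default branch differs between repositories (for example main or master), which is one reason the -b option is handy.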

Using Poetry to publish to PyPI

Poetry makes it very easy to build and publish the package to PyPI. For those who are new to Poetry, here is how you can do it:

poetry new repolist
poetry build
poetry install
poetry publish

You can read more about it here.

Rate limit and proxies

The GitHub API is rate limited, and unauthenticated requests get a much smaller hourly quota than authenticated ones. So I added options to specify a token (-t) and a proxy (--proxy). You can also choose a specific branch with -b.
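
Here is a small sketch of how a token and a proxy could be turned into keyword arguments for requests.get, plus a way to read the remaining quota from a response. The helper names, the timeout value and the choice of the Bearer scheme are my assumptions for illustration; Repolist's own _make_request may differ.

```python
def build_request_options(token=None, proxy=None):
    """Build keyword arguments for requests.get (hypothetical helper)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Authenticated requests are counted against the token's own quota.
        headers["Authorization"] = f"Bearer {token}"
    options = {"headers": headers, "timeout": 10}
    if proxy:
        # Requests expects one proxy URL per scheme.
        options["proxies"] = {"http": proxy, "https": proxy}
    return options


def remaining_requests(response_headers):
    """Read GitHub's X-RateLimit-Remaining header, or None if it is absent."""
    value = response_headers.get("X-RateLimit-Remaining")
    return int(value) if value is not None else None


# Usage with Requests would look like:
#   r = requests.get(url, **build_request_options(token, proxy))
#   print(remaining_requests(r.headers))
print(build_request_options(proxy="http://127.0.0.1:8080"))
```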

Conclusion

If you read this far, thank you for reading. I hope you find RepoList useful. If you have any suggestions or feedback, please let me know in the comments below.

Ademking / repolist

Generate wordlists from Github repositories

RepoList - Generate Wordlists from GitHub Repositories

Repolist is a command-line interface (CLI) tool designed to generate wordlists from GitHub repositories. It simplifies the process of extracting files and directories from GitHub repos, enabling the creation of custom wordlists for penetration testing and bug bounty programs.

You can read more about it in this blog: https://ademkouki.tech/posts/repolist

Features

  • Wordlist Generation: Easily create wordlists from GitHub repositories. Choose between generating a wordlist of files, directories, or both.
  • Customization: Add custom prefixes and suffixes to the generated wordlists, such as appending .php to each word.
  • Support for Private Repositories: Access and generate wordlists from both private and public repositories by providing a GitHub token using the -t option.
  • Branch Selection: Specify a different branch using the -b option.
  • Proxy Support: Utilize a proxy by using the --proxy option.
…

Top comments (1)

Adem Kouki

I hope you find RepoList useful. If you have any suggestions or feedback, please let me know in the comments below.