Let's talk about Git and the two main ways of hosting and managing code: monorepo and multi-repo. Both have their ups and downs, and the right approach depends on the specific project and team.
Whether you're working on a small project with just a few libraries or a massive codebase with thousands, or whether you're hosting private or open-source code, both monorepo and multi-repo can work.
But what are the pros and cons of each? Let's dive in and find out!
What Is a Monorepo?
The monorepo approach uses a single repository to host all the code for the multiple libraries or services composing a company’s projects. At its most extreme, the whole codebase from a company — spanning various projects and coded in different languages — is hosted in a single repository.
Benefits of Monorepo
Hosting the whole codebase on a single repository provides the following benefits.
Lowers Barriers of Entry
When new staff members start working for a company, they need to download the code and install the required tools to begin working on their tasks. If the project is scattered across many repositories, each with its own installation instructions and required tooling, the initial setup will be complex. More often than not, the documentation will be incomplete, and these new team members will have to reach out to colleagues for help.
A monorepo simplifies matters. Since there is a single location containing all code and documentation, you can streamline the initial setup.
Centrally Located Code Management
Having a single repository gives visibility of all the code to all developers. It simplifies code management since we can use a single issue tracker to watch all issues throughout the application’s life cycle.
For instance, these characteristics are valuable when an issue spans two or more libraries and the bug is located in a library that another one depends on. With multiple repositories, it may be challenging to find the piece of code where the problem originates.
Painless Application-Wide Refactorings
An application-wide refactoring affects multiple libraries. A monorepo makes it easy to apply the modifications to every affected library and submit them as a single pull request.
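As a rough illustration of why a single tree helps, here is a minimal sketch that renames a symbol in every library under one root directory. The function name rename_symbol and the file pattern are hypothetical, and this is a plain textual rename, not a syntax-aware refactoring tool.

```python
import re
from pathlib import Path


def rename_symbol(root, old, new, pattern="*.py"):
    """Rename whole-word occurrences of `old` to `new` under `root`.

    Returns the paths (relative to `root`) of the files that were changed.
    """
    word = re.compile(rf"\b{re.escape(old)}\b")
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids treating `new` as a regex template.
        updated = word.sub(lambda _match: new, text)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.relative_to(root).as_posix())
    return changed
```

Because every library lives in the same checkout, the list of changed files can go into one commit and one pull request. With multiple repositories, the same rename would have to be repeated and coordinated per repository.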
More Difficult To Break Adjacent Functionality
With the monorepo, we can set up all tests for all libraries to run whenever any single library is modified. As a result, a change in one library is less likely to have adverse effects on other libraries.
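Running every test on every change is the simplest setup. A common refinement, sketched below under assumptions, is to run the tests for the modified library plus everything that depends on it. The dependency map and library names are hypothetical; a real monorepo would derive them from its build configuration.

```python
from collections import deque

# Hypothetical dependency map: library -> libraries it depends on.
DEPENDS_ON = {
    "api": ["auth", "utils"],
    "auth": ["utils"],
    "billing": ["api"],
    "utils": [],
}


def libraries_to_test(changed, depends_on):
    """Return the changed library plus all of its direct and transitive dependents."""
    # Invert the map so we can walk from a library to the libraries that use it.
    dependents = {lib: [] for lib in depends_on}
    for lib, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(lib)

    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)
```

In this sketch, a change to utils selects all four libraries, while a change to billing selects only billing, since nothing depends on it.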
Teams Share Development Culture
With a monorepo, it is harder (though not impossible) for teams to develop their own separate subcultures. Since they share the same repository, they will most likely share the same programming and management methodologies and use the same development tools.
Issues With the Monorepo Approach
Using a single repository for all our code has several drawbacks.
Slower Development Cycles
When the code for a library contains breaking changes that make the tests for dependent libraries fail, the dependent code must also be fixed before the changes can be merged.
If those dependent libraries belong to other teams who are busy with other tasks and are unable (or unwilling) to adapt their code so that the tests pass, development of the new feature may stall.
Requires Download of Entire Codebase
When the monorepo contains all the code for a company, it can be huge, containing gigabytes of data. To contribute to any library hosted within it, a developer would typically need to download the whole repository.
Unmodified Libraries May Be Newly Versioned
When we tag the monorepo, all code within is assigned the new tag. If this action triggers a new release, then all libraries hosted in the repository will be newly released with the version number from the tag, even though many of those libraries may not have had any change.
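The effect can be sketched in a few lines. The function release_monorepo and the library names are hypothetical, and the sketch assumes the release version is simply the tag with an optional leading "v" removed.

```python
def release_monorepo(libraries, changed, tag):
    """Build one release record per library; every library gets the tag's version."""
    # Assumption: tags look like "v2.0.0" and the version is the part after "v".
    version = tag[1:] if tag.startswith("v") else tag
    return [
        {"library": name, "version": version, "changed": name in changed}
        for name in sorted(libraries)
    ]


releases = release_monorepo(["auth", "utils", "api"], changed={"api"}, tag="v2.0.0")
# Libraries that were released under the new version without any change.
unchanged = [r["library"] for r in releases if not r["changed"]]
```

Here only api was modified, yet auth and utils are also released as 2.0.0, which is the situation described above.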
Forking Is More Difficult
Open source projects must make it as easy as possible for contributors to become involved. With multiple repositories, contributors can head directly to the specific repository for the project they want to contribute to. With a monorepo hosting various projects, though, contributors must first navigate their way into the right project and will need to understand how their contribution may affect all other projects.
What Is Multi-Repo?
The multi-repo approach uses several repositories to host the multiple libraries or services of a project developed by a company. At its most extreme, it hosts every minimal unit of reusable code or standalone functionality (such as a microservice) in its own repository.
Benefits of Multi-Repo
Hosting every library independently of all others provides a plethora of benefits.
Independent Library Versioning
When tagging a repository, its whole codebase is assigned the new tag. Since the repository contains only the code for a specific library, that library can be tagged and versioned independently of all other libraries hosted elsewhere.
Independent Service Releases
Since the repository contains the code for a single service and nothing else, that service can have its own deployment cycle, independent of any progress made on the applications accessing it.
Helps Define Access Control Across the Organization
Only the team members involved with developing a library need to be added to the corresponding repository and download its code. As a result, there’s an implicit access control strategy for each layer in the application. Those involved with the library will be granted editing rights, and everyone else may get no access to the repository. Or they may be given reading but not editing rights.
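A minimal sketch of this per-repository access model follows. The repository names, users, and role names are hypothetical; real hosting platforms define their own permission levels.

```python
# Hypothetical access table: repository -> {user: role}.
ACCESS = {
    "payments-lib": {"alice": "write", "bob": "read"},
    "frontend-app": {"carol": "write"},
}

# Higher rank means more rights; "none" is the default for unlisted users.
RANK = {"none": 0, "read": 1, "write": 2}


def role_for(user, repo, access=ACCESS):
    """Return the user's role on a repository, or "none" if they were never added."""
    return access.get(repo, {}).get(user, "none")


def is_allowed(user, repo, needed, access=ACCESS):
    """Check whether the user's role on the repository covers the needed right."""
    return RANK[role_for(user, repo, access)] >= RANK[needed]
```

In this sketch, alice can edit payments-lib, bob can only read it, and carol has no access to it at all because she was never added to that repository.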
Allows Teams To Work Autonomously
Team members can design the library’s architecture and implement its code working in isolation from all other teams. They can make decisions based on what the library does in the general context without being affected by the specific requirements from some external team or application.
Issues With the Multi-Repo Approach
Using multiple repositories can give rise to several issues.
Libraries Must Constantly Be Resynced
When a new version of a library containing breaking changes is released, libraries depending on this library will need to be adapted to start using the latest version. If the release cycle of the library is faster than that of its dependent libraries, they could quickly become out of sync with each other.
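To make the drift concrete, here is a minimal sketch that lists the dependents still pinned to an older major version of a library. It assumes versions follow a numeric MAJOR.MINOR.PATCH scheme where a major bump signals breaking changes; the function and library names are hypothetical.

```python
def parse_version(text):
    """Turn "2.4.0" into (2, 4, 0). Assumes purely numeric, dot-separated parts."""
    return tuple(int(part) for part in text.split("."))


def out_of_sync(latest, pinned):
    """Return dependents whose pinned major version is behind the latest release."""
    latest_major = parse_version(latest)[0]
    return sorted(
        name
        for name, version in pinned.items()
        if parse_version(version)[0] < latest_major
    )


# Hypothetical dependents and the library version each one currently pins.
pinned_versions = {"api": "3.0.2", "billing": "2.4.0", "reports": "1.9.9"}
stale = out_of_sync("3.1.0", pinned_versions)
```

With the latest release at 3.1.0, billing and reports are flagged as out of sync, while api only lags by a minor version and is not.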
May Fragment Teams
When different teams don’t need to interact, they may work in their own silos. In the long term, this could result in teams developing their own subcultures within the company, such as employing different programming or management methodologies or using different sets of development tools.
Summary
When it comes to hosting and managing code, there are two main ways to do it: monorepo and multi-repo. With monorepo, all the code from different libraries, projects, or even an entire company is stored in one single repo. Multi-repo, on the other hand, separates the code into different units, like libraries or services, and each has its own repository.
The approach you choose will depend on a bunch of factors. Both have their own pros and cons, which we just covered in detail. If you want to learn more tech tips, check out our library at https://kinsta.com/topic/tech-tips/.