Chintan Soni

Posted on Jul 1 • Edited on Jul 7

GitLab CI/CD Pipelines: Best Practices for Monorepos

#gitlab #cicd #pipeline

Hello everyone! This article is for those who want to optimize their CI/CD pipelines using best practices in a monorepo setup.

To provide a clear walkthrough, let’s consider the following example:

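Project structure: a monorepo containing three projects (project-a, project-b, and project-c), with a single .gitlab-ci.yml at the root.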

Initial .gitlab-ci.yml:

stages:
  - build
  - test
  - deploy

build-a:
  stage: build
  script:
    - ...

test-a:
  stage: test
  script:
    - ...

deploy-a:
  stage: deploy
  script:
    - ...

build-b:
  stage: build
  script:
    - ...

test-b:
  stage: test
  script:
    - ...

deploy-b:
  stage: deploy
  script:
    - ...

build-c:
  stage: build
  script:
    - ...

test-c:
  stage: test
  script:
    - ...

deploy-c:
  stage: deploy
  script:
    - ...

The above configuration can quickly become unmanageable as the number of projects in the monorepo increases.

Why is this a problem?

Unnecessary Job Triggers: Every commit triggers all jobs, regardless of the scope of the change. For instance, a commit that only touches project-a also triggers the jobs for project-b and project-c, which is inefficient.

Reduced Readability: The CI/CD configuration becomes less readable and harder to maintain, especially with environment-specific jobs for dev, QA, UAT, and prod.

Increased Complexity: The setup becomes fragile, making it easy for anyone to inadvertently break the pipeline. Understanding the scope and impact of a change, and the dependencies between jobs, requires more expertise.

How to solve this?

We will perform a series of steps to optimize the above pipeline. Let’s start.

Parent-Child Pipelines Architecture

With this approach, each project gets its own child pipeline, defined in a separate CI/CD file inside that project’s directory. Move the project’s jobs into that file. Below is the example for project-a; project-b and project-c follow the same pattern:

project-a/.gitlab-ci.yml:

stages:
  - build
  - test
  - deploy

build-a:
  stage: build
  script:
    - ...

test-a:
  stage: test
  script:
    - ...

deploy-a:
  stage: deploy
  script:
    - ...

Then, link the child pipeline to the parent as below:

Root .gitlab-ci.yml:

stages:
  - triggers

trigger-project-a:
  stage: triggers
  trigger:
    include: project-a/.gitlab-ci.yml

trigger-project-b:
  stage: triggers
  trigger:
    include: project-b/.gitlab-ci.yml

trigger-project-c:
  stage: triggers
  trigger:
    include: project-c/.gitlab-ci.yml
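
If the parent pipeline should wait for a child pipeline and reflect its result, GitLab’s trigger keyword accepts a strategy option. A minimal sketch for project-a, assuming you want the trigger job to fail when the child pipeline fails (without it, the trigger job is marked successful as soon as the child pipeline is created):

```yaml
trigger-project-a:
  stage: triggers
  trigger:
    include: project-a/.gitlab-ci.yml
    # Assumption: the parent should reflect the child's status.
    strategy: depend
```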

With this simple refactor, the pipeline structure becomes more manageable.

Use rules: changes

To run a child pipeline only when its own project changes, add rules: changes to each trigger job.

Root .gitlab-ci.yml:

stages:
  - triggers

trigger-project-a:
  stage: triggers
  trigger:
    include: project-a/.gitlab-ci.yml
  rules:
    - changes:
      - project-a/**/*

trigger-project-b:
  stage: triggers
  trigger:
    include: project-b/.gitlab-ci.yml
  rules:
    - changes:
      - project-b/**/*

trigger-project-c:
  stage: triggers
  trigger:
    include: project-c/.gitlab-ci.yml
  rules:
    - changes:
      - project-c/**/*
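
Projects in a monorepo often depend on shared code. As a sketch, assuming a hypothetical shared/ directory that project-a uses (it is not part of the example repository), the trigger job can list more than one path so that a change in either location starts the child pipeline:

```yaml
trigger-project-a:
  stage: triggers
  trigger:
    include: project-a/.gitlab-ci.yml
  rules:
    - changes:
        paths:
          - project-a/**/*
          # Hypothetical shared directory; adjust to your repository layout.
          - shared/**/*
```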

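If you see duplicate pipelines (for example, a push to a branch with an open merge request starts both a branch pipeline and a merge request pipeline), you can exclude merge request pipelines by adding a rule before the changes rule:

trigger-project-a:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      when: never
    - changes:
      - project-a/**/*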
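
Instead of repeating that rule in every trigger job, the same effect can likely be achieved once for the whole pipeline with a top-level workflow block. A minimal sketch, assuming you only want branch pipelines and never merge request pipelines:

```yaml
# Root .gitlab-ci.yml
workflow:
  rules:
    # Skip merge request pipelines entirely.
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      when: never
    # Run the pipeline for every other source (pushes, schedules, and so on).
    - when: always
```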

Use YAML Anchors and extends

YAML anchors and GitLab’s extends keyword both let you reuse common configuration blocks, which reduces redundancy, especially when targeting multiple environments like dev, QA, staging, and prod. The example below defines a hidden job, .base-build, and reuses it with extends.

project-a/.gitlab-ci.yml:

.base-build:
  stage: build
  image: node:22-alpine
  variables: ...
  before_script:
    - cd project-a

build-a-dev:
  extends: .base-build
  script:
    - export ENV="dev"
    # build steps for dev

build-a-qa:
  extends: .base-build
  script:
    - export ENV="qa"
    # build steps for qa

build-a-staging:
  extends: .base-build
  script:
    - export ENV="staging"
    # build steps for staging

build-a-prod:
  extends: .base-build
  script:
    - export ENV="prod"
    # build steps for prod
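
For comparison, the same reuse can be written with plain YAML anchors and the merge key. This is a sketch of an equivalent dev job, not the author’s configuration. Note that anchors only work within a single file, while extends also works across files pulled in with include:

```yaml
# Define the hidden job and mark it with an anchor named base-build.
.base-build: &base-build
  stage: build
  image: node:22-alpine
  before_script:
    - cd project-a

build-a-dev:
  # Merge every key of .base-build into this job.
  <<: *base-build
  script:
    - export ENV="dev"
    # build steps for dev
```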

If you want to reuse only specific sections of a hidden job, you can use !reference as below:

build-a-dev:
  before_script: !reference [.base-build, before_script]
  script:
    - export ENV="dev"
    # build steps for dev
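
A !reference tag can also sit inside a list, which helps when a job needs the shared steps plus a few of its own. A sketch, where the echo line is a placeholder for whatever extra setup the job requires:

```yaml
build-a-dev:
  stage: build
  image: node:22-alpine
  before_script:
    # Reuse the shared steps from the hidden job...
    - !reference [.base-build, before_script]
    # ...then add job-specific setup (placeholder command).
    - echo "extra setup for dev"
  script:
    - export ENV="dev"
    # build steps for dev
```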

Using needs for Proper Job Chaining

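We can create dependencies between jobs using needs. A job with needs starts as soon as the jobs it lists have finished, instead of waiting for the whole previous stage, while the execution order stays guaranteed.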

build-a:
  stage: build
  script:
    - ...

test-a:
  stage: test
  needs: [build-a]
  script:
    - ...

deploy-a:
  stage: deploy
  needs: [test-a]
  script:
    - ...
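
needs also controls which artifacts a job downloads. A minimal sketch, assuming build-a writes its output to a hypothetical dist/ directory:

```yaml
build-a:
  stage: build
  script:
    - ...
  artifacts:
    paths:
      # Hypothetical output directory of the build.
      - dist/

test-a:
  stage: test
  needs:
    - job: build-a
      # Download build-a's artifacts before running (this is the default).
      artifacts: true
  script:
    - ...
```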

Parallel Job Execution

Setting needs: [] lets a job start immediately, without waiting for earlier stages. For example, if a check stage precedes the build stage and a check-a job performs static code analysis, lint checks, etc., you can run check-a and build-a in parallel as below:

stages:
  - check
  - build
  - ...

check-a:
  stage: check
  needs: []
  script:
    - ...

build-a:
  stage: build
  needs: []
  script:
    - ...

test-a:
  stage: test
  needs: [build-a]
  script:
    - ...

deploy-a:
  stage: deploy
  needs: [test-a]
  script:
    - ...
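
Because a job with needs ignores stage order, deploy-a in the example above waits only for test-a (and, through it, build-a), not for check-a. If deployments should also wait for the checks to pass, one option is to list both jobs. A sketch of that variation:

```yaml
deploy-a:
  stage: deploy
  # Start only after both the checks and the tests have succeeded.
  needs: [check-a, test-a]
  script:
    - ...
```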

Source Code

You can find the source code here: https://gitlab.com/iChintanSoni/learning-ci-cd/

Conclusion

Optimizing CI/CD pipelines in a monorepo can significantly improve the efficiency, readability, and maintainability of your projects. Parent-child pipelines keep each project’s configuration separate, rules: changes limits pipelines to the projects that actually changed, YAML anchors and extends remove duplicated configuration, and needs makes job ordering explicit while letting independent jobs run in parallel.

Together, these techniques minimize unnecessary job executions and keep complex projects manageable as the monorepo grows.

I hope this guide helps you in refining your GitLab CI/CD pipelines. If you have any questions or additional tips, feel free to share them in the comments below. Happy coding!

Top comments (1)

hashilbasheer • Sep 25

I have an issue with a monorepo: a race condition. It arises because two pipelines try to push changes to the same branch of the Git repository simultaneously. When the first pipeline pushes its changes, the second pipeline tries to push but fails because it’s not aware of the new commit made by the first pipeline. Is there any way to solve this in CI/CD?
