How to set up monorepo build in GitLab CI


I'm a JS developer with 13 years of professional experience. I'm always happy to teach my craft.

If you are migrating from a multirepo to a monorepo, or if your project has grown big enough that running the whole continuous integration (CI) pipeline on every change is wasteful, it can make sense to run only those parts of CI that could have been affected by the change. This article shows how to achieve that on the GitLab platform, using a simple repository as an example.

The approach

The project is split into folders. To determine whether a given part of the project was modified, we use rules:changes, so every part of the project that we want to be able to run separately from the rest should be placed in its own folder.

My assumptions are as follows:

  • you want to run only the changed sections of CI in merge requests (MRs), mainly to save CI resources. This way, we avoid running a whole 30 minutes of jobs for a change that we know is unlikely to affect them. This is especially important if the repo contains pieces of code that do not depend on each other: for example, our main application in one folder and some landing pages in others.

  • after changes are merged to master/main, we want to build everything, whether it was changed or not. This way, our main branch is truly continuously integrated, and we keep checking even the less commonly changed parts of the project.

Configuration

My project structure is simple:

$ git ls-files
.gitlab-ci.yml
README.md
backend/README.md
frontend/README.md

I have 2 folders, backend & frontend. Each hosts the files of one part of the project. This approach scales to any number of sub-projects: we could have company-website, slack-bot, or whatnot inside.

.gitlab-ci.yml step by step

The configuration starts with defining the stages:

stages:
  - build
  - test
  - deploy

This one is copied from GitLab's starting CI template. We can customize it by adding or removing stages. As we define needs:, there is no speed penalty for adding more stages: each job is executed as soon as the requirements defined in its needs: are met. For example, in my project, I ended up adding a pre-build stage to run some preparation scripts before building Docker images.
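
As a rough sketch of the pre-build idea mentioned above: the job name prepare-base-image and its script are placeholders I made up for illustration, not taken from the example repository. The point is only that a build job listing the preparation job in needs: starts right after it finishes, regardless of what else sits in the pre-build stage.

```yaml
stages:
  - pre-build
  - build
  - test
  - deploy

# Hypothetical preparation job; name and script are placeholders
prepare-base-image:
  stage: pre-build
  needs: []                 # no requirements, so it can start immediately
  script:
    - echo "Preparing the base image..."

# Starts as soon as prepare-base-image succeeds,
# without waiting for any other job in the pre-build stage
backend-build:
  stage: build
  needs: ["prepare-base-image"]
  script:
    - echo "Compiling the backend code..."
```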

variables:
  RULES_CHANGES_PATH: "**/*"

This is the default value for our changes configuration: by default, a job that extends our base config will be executed for any change.

.base-rules:
  rules:
    ...

Our base config. I define it in a way that requires us to add it with extends: .base-rules. We could probably define those rules at the top level, but it's something of a headache to configure everything so that it works as expected in every case. I found it easier to control, job by job, whether the .base-rules are applied or not.

.base-rules:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: always
  ...

The rules are checked in order, and the first one that matches decides what happens with the job. Our first rule: if it's master/main, CI should always run the job.

    - if: '$CI_PIPELINE_SOURCE == "push"'
      when: never

Here, we avoid duplicated jobs for merge requests. Without this rule, GitLab would create one pipeline for the branch and a second, "detached" pipeline for the MR. As the branch pipeline doesn't seem to support changes:, we disable the branch one and delay starting CI until an MR is created.

    - if: $CI_COMMIT_TAG
      when: never

Similarly, we don't need CI running for a tag.

    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - $RULES_CHANGES_PATH

In MRs, we start jobs when there are changes in the path defined in the RULES_CHANGES_PATH variable.
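
To make the substitution concrete, here is a sketch of what this rule amounts to for a job whose RULES_CHANGES_PATH is "backend/**/*", assuming GitLab expands the variable before matching the changed paths. The job name backend-lint is hypothetical and exists only to make the fragment self-contained.

```yaml
# Hypothetical job with the path written out instead of read from the variable
backend-lint:
  script:
    - echo "Linting the backend code..."
  rules:
    # Matches only in MR pipelines that touch at least one file under backend/
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - "backend/**/*"
```

Writing the path out like this works for a single job, but repeating it in every job is the duplication that the variable is meant to avoid.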

    - when: manual
      allow_failure: true

Otherwise, we allow the job to be triggered manually.

.frontend & .backend

Now, we define 2 more jobs to be extended from:

.backend:
  extends: .base-rules
  variables:
    RULES_CHANGES_PATH: "backend/**/*"

.frontend:
  extends: .base-rules
  variables:
    RULES_CHANGES_PATH: "frontend/**/*"

In this way, we avoid duplicating the same path definition in every job that belongs to one part or the other, which would be a possible source of errors.
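
Adding another sub-project should follow the same pattern. The sketch below assumes a company-website folder (one of the examples named earlier, not present in the sample repository) and relies on .base-rules being defined as above. The job name and script are placeholders.

```yaml
# Hypothetical third sub-project living in company-website/
.company-website:
  extends: .base-rules
  variables:
    RULES_CHANGES_PATH: "company-website/**/*"

company-website-build:
  stage: build
  extends: .company-website
  needs: []
  script:
    - echo "Building the company website..."
```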

Example jobs

On top of all that, we can define our jobs as:

backend-build:
  stage: build
  extends: .backend
  needs: []
  script:
    - echo "Compiling the backend code..."

frontend-build:
  stage: build
  extends: .frontend
  needs: []
  script:
    - echo "Compiling the frontend code..."
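
A job that is not tied to one folder can extend .base-rules directly. In this sketch, the repo-lint job is a made-up example. Because it does not override RULES_CHANGES_PATH, the global default of "**/*" applies, so in an MR it should run for a change anywhere in the repository.

```yaml
# Hypothetical repo-wide job: no RULES_CHANGES_PATH override,
# so the global "**/*" default from the variables: section is used
repo-lint:
  stage: build
  extends: .base-rules
  needs: []
  script:
    - echo "Linting the whole repository..."
```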

Complete .gitlab-ci.yml

So, in the end, the complete config file looks like this:

stages:
  - build
  - test
  - deploy

variables:
  RULES_CHANGES_PATH: "**/*"

.base-rules:
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: always
    - if: '$CI_PIPELINE_SOURCE == "push"'
      when: never
    - if: $CI_COMMIT_TAG
      when: never
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        - $RULES_CHANGES_PATH
    - when: manual
      allow_failure: true

.backend:
  extends: .base-rules
  variables:
    RULES_CHANGES_PATH: "backend/**/*"

.frontend:
  extends: .base-rules
  variables:
    RULES_CHANGES_PATH: "frontend/**/*"

backend-build:
  stage: build
  extends: .backend
  needs: []
  script:
    - echo "Compiling the backend code..."

frontend-build:
  stage: build
  extends: .frontend
  needs: []
  script:
    - echo "Compiling the frontend code..."

backend-test:
  stage: test
  extends: .backend
  needs: ["backend-build"]
  script:
    - echo "Testing the backend code..."

frontend-test:
  stage: test
  extends: .frontend
  needs: ["frontend-build"]
  script:
    - echo "Testing the frontend code..."

backend-deploy:
  stage: deploy
  extends: .backend
  needs: ["backend-test"]
  script:
    - echo "Deploying the backend code..."

frontend-deploy:
  stage: deploy
  extends: .frontend
  needs: ["frontend-test"]
  script:
    - echo "Deploying the frontend code..."

Working CI

With a setup like this, you can have your backend CI run for a backend MR:

backend-mr.png

frontend CI, for an MR with frontend changes:

frontend-mr.png

and each commit will trigger all CI jobs once it's merged to the main branch:

merged-backend-2.png

References

You can find the repo I used to write this article here.

Are you interested in learning why monorepos can be a good idea? I discuss pros & cons in another article.

Summary

In this article, we have seen how to set up partially split CI for branches in GitLab. If you are maintaining or building CI in GitLab, I'm interested in learning about your use case. Please write me an e-mail, or propose a meeting by scheduling here.

Honsemiro, 2y ago

This was exactly the explanation I needed right now!

D

Hi. Thanks for this article. I've always had a big question about this workflow.

What happens if I merge into main a super feature that works and runs perfectly on the PR/MR, but when it runs on main there is a small error in a yml file or some misconfigured file? The code is already merged, but none of the changes were deployed. So I fix the yml, but none of the paths defined in rules:changes were touched, so nothing will be deployed… We can't just re-run the last failed pipeline, because it always checks out the old commit.

There is no way to run the same projects that failed unless I touch some files (with empty spaces, a newline, or whatever change). This is ugly to me.

One idea came up: what if I always build all projects on main, but at deploy time the docker images that haven't changed are not deployed?

Any ideas? Thanks

M

Hey!

Yes, this sounds like a problem that could be very annoying. I think I managed to avoid this kind of situation with:

  1. CI that will definitely fail if anything is as wrong as a broken YML file,
  2. Rebasing feature branches all the time

"There is no way to run the same projects that failed unless I touch some files (with empty spaces, a newline, or whatever change). This is ugly to me."

Triggering jobs manually should do the trick. It's annoying, as you have to trigger them in the right order, but you can get everything built in the end.

"One idea came up: what if I always build all projects on main, but at deploy time the docker images that haven't changed are not deployed?"

Yes, I would always run everything on the main branch. It's already merged, so it doesn't matter if it takes very long, and it verifies all the code, even the parts that are hardly ever touched.

D

Hi Marcin Wosinek

Yes, I think building everything on main is the real continuous integration. It ensures that everything works, no matter whether it was changed or not.

Thank you very much

R

I really like your approach. Just have one question:

How can I manually trigger one specific sub-project with this? For example, I want to run only backend-build from within the GitLab interface, or just frontend-deploy, without triggering, let's say, 100 other jobs.

M

I think you could achieve it by adding some variable/flag that introduces another level of control over what and when runs. It could complicate your set-up though, so I'm uncertain if it's a good idea.

I'm interested in learning more about GitLab CI cases that people try building. You can write a description of what you have now and how you would like to change it, mail it to me at marcin.wosinek@gmail.com, and then we can meet for a half-hour call.

R

I switched our microservice infrastructure to a monorepo, and it contains a websocket server, an API gateway, a bunch of backend services, and two frontends for now. I asked because at some point you might want to build only specific jobs or pipelines, e.g. when creating a new version / tag. In that case, I don't want to run all the code quality stuff over and over again, just the build jobs to tag a new Docker image version of all services and frontends. Marcin Wosinek

J

This is a very nice solution, much more elegant than something I've had on the go for the past year. Thanks for sharing!

