The goal of this tutorial is to give a high-level introduction to GitLab CI/CD that helps people get started in 30 minutes without having to read all of GitLab's documentation. It is geared toward beginners who want to tinker with CI/CD tools like GitLab CI/CD. I will briefly go over what CI/CD is, why I decided to go with GitLab's tool, and walk through how to create a .gitlab-ci.yml file for an example application.
CI/CD
CI/CD is short for Continuous Integration / Continuous Delivery / Continuous Deployment. It enables teams to build, test, and release software at a faster rate. CI/CD removes manual human interaction wherever possible: with Continuous Delivery, everything up to a final manual production deployment is automated, and with Continuous Deployment, even that last step happens automatically. One of the challenges of implementing this practice is integrating the various tools and systems required to build a CI/CD pipeline. For example, you might store your code in Bitbucket, run automated test suites on private infrastructure, and deploy your application to AWS or Microsoft Azure. Complicated applications spread across multiple systems are one reason not every organization has a seamless CI/CD pipeline.
Why GitLab CI/CD?
I use GitLab CI/CD for three reasons: I can build a complete CI/CD pipeline with one tool, it's fast, and it's open source. With GitLab CI/CD in the same place as my repository, I can create tickets, open merge requests, write code, and set up CI/CD without a separate application. It's essentially a one-stop shop. GitLab CI/CD runs builds on GitLab Runners: isolated virtual machines that execute predefined steps through the GitLab CI API. Because jobs can run on multiple Runners in parallel, pipelines complete faster than they would on a single instance. You can learn more about GitLab Runners in GitLab's documentation. Finally, it's open source, so I can always contribute to the codebase and open an issue when a problem arises.
Scenario
Let's say we have a Node.js API that retrieves a list of books from a database. We can create a pipeline that pushes our code through three phases: build, test, and deploy. A pipeline is a group of steps grouped by similar characteristics. With those phases, our pipeline breaks down into three types:
- Project Pipeline
- Continuous Integration Pipeline
- Deploy Pipeline
The Project Pipeline installs dependencies and runs linters and any scripts that deal with the code. The Continuous Integration Pipeline runs automated tests and builds a distributable version of the code. Finally, the Deploy Pipeline deploys the code to a designated cloud provider and environment.
The steps that the three pipelines execute are called jobs; a series of jobs grouped by those characteristics is called a stage. Jobs are the basic building block of pipelines: they are grouped into stages, and stages are grouped into pipelines. Here's an example hierarchy of jobs, stages, and pipelines:
A.) Build
i. Install NPM Dependencies
ii. Run ES-Linter
iii. Run Code-Minifier
B.) Test
i. Run unit, functional, and end-to-end tests.
ii. Run pkg to compile Node.js application
C.) Deploy
i. Production
1.) Launch EC2 instance on AWS
ii. Staging
1.) Launch on local development server
In this hierarchy, the three top-level components correspond to the three pipelines described above. The main bullets, build, test, and deploy, are stages, and each item under them is a job. Let's break this out into a GitLab CI/CD YAML file.
Using GitLab CI/CD
To use GitLab CI/CD, create a file called .gitlab-ci.yml at the root of the project in your GitLab repository and add the following YAML:
image: node:10.5.0
stages:
- build
- test
- deploy
before_script:
- npm install
As I mentioned earlier, GitLab CI/CD uses Runners to execute pipelines. We can define which operating system and preinstalled libraries our Runner should be based on by using the image directive. In our case, we pin our Runner to a specific Node.js image (node:10.5.0). The stages directive lets us predefine the stages for the entire configuration; jobs are executed in the order their stages are listed. You can learn more about stages in GitLab's documentation. The before_script directive runs commands before every job.
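To illustrate stage ordering with a minimal sketch (the job names here are hypothetical, not part of the example app): jobs that share a stage can run in parallel, while stages run one after another.

```yaml
stages:
  - build
  - test

compile:           # the build stage runs first
  stage: build
  script:
    - npm install

lint:              # lint and unit-test share the test stage,
  stage: test      # so they can run in parallel once build completes
  script:
    - npm run lint

unit-test:
  stage: test
  script:
    - npm run test
```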
Now let's start with the job dedicated to the Build stage. We are going to call this job build-min-code. We want it to install dependencies and minify the code, so we use the script directive, which defines the shell commands executed within the Runner. Then we assign the job to the build stage using the stage directive.
build-min-code:
stage: build
script:
- npm install
- npm run minifier
Now that we have a job associated with our Build stage, let's do the same for our Test stage. Our test job is called run-unit-test, and it uses the npm script defined in our API to run the tests.
run-unit-test:
stage: test
script:
- npm run test
Finally, we are going to add two jobs to handle our Deploy stage: deploy-staging and deploy-production, one for each environment. These jobs follow the same layout as the previous jobs, with one small change. Currently, all of our jobs are triggered automatically on any code push to any branch. We don't want that when deploying to staging and production, so we use the only directive, which restricts a job to the named branches and tags. The jobs look like the following:
deploy-staging:
stage: deploy
script:
- npm run deploy-stage
only:
- develop
deploy-production:
stage: deploy
script:
- npm run deploy-prod
only:
- master
The Runner only executes the deploy-staging job if there was a change to the develop branch, and deploy-production only on a change to the master branch. Here is a screenshot below that shows a code push made to the master branch.
In this image, all three stages and their jobs are triggered with the exception of deploy-staging, since the code push was to the master branch. GitLab CI/CD comes with an intuitive interface that shows which jobs and stages are running and what errors occur during the build. Below is the final version of the .gitlab-ci.yml file. If you wish to test this out yourself, here is the link to the example application.
image: node:10.5.0
stages:
- build
- test
- deploy
before_script:
- npm install
build-min-code:
stage: build
script:
- npm install
- npm run minifier
run-unit-test:
stage: test
script:
- npm run test
deploy-staging:
stage: deploy
script:
- npm run deploy-stage
only:
- develop
deploy-production:
stage: deploy
script:
- npm run deploy-prod
only:
- master
Conclusion
The items covered above are a high-level overview of what GitLab CI/CD can offer. GitLab CI/CD supports much deeper control over the automation of codebases, from building and publishing Docker images to integrating with third-party tools. I hope that you found this tutorial helpful. Thanks for reading!
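As a small taste of that, here is a minimal sketch of a job that builds and publishes a Docker image to GitLab's built-in container registry. It uses GitLab's predefined CI variables and the Docker-in-Docker service; your project would need a Dockerfile and a Runner configured to support dind, so treat this as a starting point rather than a drop-in job.

```yaml
build-image:
  stage: build
  image: docker:latest
  services:
    - docker:dind    # Docker-in-Docker service so the job can run docker commands
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```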
Top comments (15)
Great post Zuri, congrats =)
I'd like to suggest 2 things:
First, you could remove npm install in the job build-min-code:, because you added a before_script that already runs it. This is useful if you don't want to apply the second suggestion. As you can see in the screenshot.
Second, you could use a cache + Git-strategy approach to save time and processing. You could use a cache in the first step to fetch the code and run npm install; the next jobs can then reuse those files.
Short explanation: the cache stores the code and the files generated by npm install (so it only needs to run once), and the Git strategy lets other jobs skip git clone and use the cached folder.
Please let me know if you have questions about that; I can explain or send you a Merge Request to show my suggestion.
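For readers curious what this suggestion could look like, here is a rough sketch (the cache key, paths, and Git strategy choice are assumptions, not from the comment above): cache node_modules per branch so npm install only does real work once, and use a faster Git strategy in later jobs.

```yaml
cache:
  key: "$CI_COMMIT_REF_SLUG"    # one cache per branch
  paths:
    - node_modules/             # reused by every job in the pipeline

build-min-code:
  stage: build
  script:
    - npm install               # populates node_modules/, which then gets cached
    - npm run minifier

run-unit-test:
  stage: test
  variables:
    GIT_STRATEGY: fetch         # reuse the existing working copy instead of a full clone
  script:
    - npm run test
```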
It really depends on your GitLab setup. If you are using dynamic runners, the cost of compressing and decompressing a distributed shared cache is about the same. We have large projects with 600 MB to 1 GB of dependencies, and the distributed cache shaves about 50 to 60 seconds off a yarn install. Caching is not a small topic. Would love to see more advanced articles on it.
Yes, you are right. The most important thing is knowing when to use a cache and when not to. In your case a cache isn't a good fit.
Really well done! The info is clear and helpful. What I appreciated most was the structured approach you took to explaining a rather complex operation.
Congrats on your first tech post, too. Looking forward to seeing more. ☺💪👏💯
Thanks for writing about GitLab!
I learned CI/CD with GitLab. What a pleasure! Recently I tried Travis, and if I could sum up the experience in one sentence, it would be "why did this have to be so hard?"
I thoroughly recommend GitLab.
Well done on the article, @zurihunter
Hmm, it depends on the version of npm. I would use yarn; it is still faster. And, like the first comment, with a cache. npm ci is very much abandoned; the idea was good but the speed is not there.
Thanks for the post !
Another great feature with GitLab CI is the ability to use templates.
By the way, a good option if you want to build a state-of-the-art DevOps pipeline with GitLab CI is to use advanced templates such as to be continuous (to-be-continuous.gitlab.io/doc/).
It provides many ready-to-use and composable templates (Node, Python, Maven, Docker, AWS, Kubernetes, S3 and many more).
It's open-source ;)
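Template reuse like the comment above describes is built on GitLab's include keyword. The sketch below pulls in one of GitLab's bundled templates plus a file from another project; the project and file paths in the second entry are purely illustrative.

```yaml
include:
  # a template bundled with GitLab
  - template: Security/SAST.gitlab-ci.yml
  # a file from another project on the same GitLab instance (hypothetical path)
  - project: my-group/ci-templates
    file: /templates/node.gitlab-ci.yml
```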
Nice Article. I would love to see some best practices around the actual deployment. Most examples just have fluff in those sections, similar to your gitlab (below).
It feels like people are in two camps around deployment. Many just use Heroku or other services for the actual deployment and the rest are just scp'ing files around. I would love to better understand best practices in the non-heroku-ish case.
Looks perfect for beginners
Thank you
Cool post, thanks for sharing :)