The goal of this tutorial is to give a high-level introduction to GitLab CI/CD that helps people get started in 30 minutes without having to read all of GitLab's documentation. It is geared toward beginners who want to tinker with CI/CD tools like GitLab CI/CD. I will briefly go over what CI/CD is, why I decided to go with GitLab's tool, and walk through how to create a .gitlab-ci.yml file for an example application.
CI/CD
CI/CD is short for Continuous Integration / Continuous Delivery / Continuous Deployment. It enables teams to build, test, and release software at a faster rate. CI/CD removes manual human interaction wherever possible: with Continuous Delivery, everything up to a final manual production deployment is automated, and with Continuous Deployment, even that last step happens automatically. One of the challenges of implementing this practice is integrating the various tools and systems required to build a CI/CD pipeline. For example, you might store your code in Bitbucket, run automated test suites on private infrastructure, and deploy your application to AWS or Microsoft Azure. Complicated applications spread across multiple systems are one reason not every organization has a seamless CI/CD pipeline.
Why GitLab CI/CD?
I use GitLab CI/CD for three reasons: I can build a complete CI/CD pipeline with one tool, it's fast, and it's open source. With GitLab CI/CD in the same place as my repository, I can create tickets, open merge requests, write code, and set up CI/CD without a separate application. It's essentially a one-stop shop. GitLab CI/CD runs builds on GitLab Runners: isolated virtual machines that execute predefined steps through the GitLab CI API. Because jobs can run on multiple Runners in parallel, pipelines complete faster than they would on a single instance. You can learn more about GitLab Runners in GitLab's documentation. Finally, it's open source, so I can always contribute to the codebase and open an issue when a problem arises.
Scenario
Let's say we have a Node.js API that retrieves a list of books from a database. We can create a pipeline that pushes our code through three phases: build, test, and deploy. A pipeline is a group of steps grouped by similar characteristics. With those phases, our pipeline breaks down into three types:
- Project Pipeline
- Continuous Integration Pipeline
- Deploy Pipeline
The Project Pipeline installs dependencies and runs linters and any scripts that deal with the code. The Continuous Integration Pipeline runs automated tests and builds a distributable version of the code. Finally, the Deploy Pipeline deploys the code to a designated cloud provider and environment.
The steps that the three pipelines execute are called jobs; a series of jobs grouped by those characteristics is called a stage. Jobs are the basic building block of pipelines: they are grouped into stages, and stages are grouped into pipelines. Here's an example hierarchy of jobs, stages, and pipelines:
A.) Build
i. Install NPM Dependencies
ii. Run ES-Linter
iii. Run Code-Minifier
B.) Test
i. Run unit, functional, and end-to-end tests.
ii. Run pkg to compile Node.js application
C.) Deploy
i. Production
1.) Launch EC2 instance on AWS
ii. Staging
1.) Launch on local development server
In this hierarchy, the three top-level components correspond to the three pipelines described above. The main bullets, build, test, and deploy, are stages, and each item under them is a job. Let's break this out into a GitLab CI/CD YAML file.
Using GitLab CI/CD
To use GitLab CI/CD, create a file called .gitlab-ci.yml at the root of the project in your GitLab repository and add the following YAML:
image: node:10.5.0
stages:
- build
- test
- deploy
before_script:
- npm install
As I mentioned earlier, GitLab CI/CD uses Runners to execute pipelines. We can define which operating system and preinstalled libraries our Runner should be based on by using the image directive. In our case, we pin our Runner to a specific Node.js image (node:10.5.0). The stages directive lets us predefine the stages for the entire configuration; jobs are executed in the order their stages are listed. You can learn more about stages in GitLab's documentation. The before_script directive runs commands before every job.
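To illustrate stage ordering with a minimal sketch (the job names here are hypothetical, not part of the example app): jobs that share a stage can run in parallel, while stages run one after another.

```yaml
stages:
  - build
  - test

compile:           # the build stage runs first
  stage: build
  script:
    - npm install

lint:              # lint and unit-test share the test stage,
  stage: test      # so they can run in parallel once build completes
  script:
    - npm run lint

unit-test:
  stage: test
  script:
    - npm run test
```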
Now let's start with the job dedicated to the Build stage. We are going to call this job build-min-code. We want it to install dependencies and minify the code, so we use the script directive, which defines the shell commands executed within the Runner. Then we assign the job to the build stage using the stage directive.
build-min-code:
stage: build
script:
- npm install
- npm run minifier
Now that we have a job associated with our Build stage, let's do the same for our Test stage. Our test job is called run-unit-test, and it uses the npm script defined in our API to run the tests.
run-unit-test:
stage: test
script:
- npm run test
Finally, we are going to add two jobs to handle our Deploy stage: deploy-staging and deploy-production, one for each environment. These jobs follow the same layout as the previous jobs, with one small change. Currently, all of our jobs are triggered automatically on any code push to any branch. We don't want that when deploying to staging and production, so we use the only directive, which restricts a job to the named branches and tags. The jobs look like the following:
deploy-staging:
stage: deploy
script:
- npm run deploy-stage
only:
- develop
deploy-production:
stage: deploy
script:
- npm run deploy-prod
only:
- master
The Runner only executes the deploy-staging job if there was a change to the develop branch, and deploy-production only on a change to the master branch. Here is a screenshot below that shows a code push made to the master branch.
In this image, all three stages and their jobs are triggered with the exception of deploy-staging, since the code push was to the master branch. GitLab CI/CD comes with an intuitive interface that shows which jobs and stages are running and what errors occur during the build. Below is the final version of the .gitlab-ci.yml file. If you wish to test this out yourself, here is the link to the example application.
image: node:10.5.0
stages:
- build
- test
- deploy
before_script:
- npm install
build-min-code:
stage: build
script:
- npm install
- npm run minifier
run-unit-test:
stage: test
script:
- npm run test
deploy-staging:
stage: deploy
script:
- npm run deploy-stage
only:
- develop
deploy-production:
stage: deploy
script:
- npm run deploy-prod
only:
- master
Conclusion
The items covered above are a high-level overview of what GitLab CI/CD can offer. GitLab CI/CD supports much deeper control over the automation of codebases, from building and publishing Docker images to integrating with third-party tools. I hope that you found this tutorial helpful. Thanks for reading!
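As a small taste of that, here is a minimal sketch of a job that builds and publishes a Docker image to GitLab's built-in container registry. It uses GitLab's predefined CI variables and the Docker-in-Docker service; your project would need a Dockerfile and a Runner configured to support dind, so treat this as a starting point rather than a drop-in job.

```yaml
build-image:
  stage: build
  image: docker:latest
  services:
    - docker:dind    # Docker-in-Docker service so the job can run docker commands
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```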
Top comments (15)
Great post Zuri, congrats =)
I'd like to suggest 2 things:
First, you could remove npm install in the job build-min-code:, because you added a before_script that already runs it. This is useful if you don't want to apply the second suggestion. As you can see in the screenshot.
Second, you could use a cache + Git-strategy approach to save time and processing. You could use a cache in the first step to fetch the code and run npm install; the next jobs can then reuse those files.
Short explanation: the cache stores the code and the files generated by npm install (so it only needs to run once), and the Git strategy lets other jobs skip git clone and use the cached folder.
Please let me know if you have questions about that; I can explain or send you a Merge Request to show my suggestion.
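For readers curious what this suggestion could look like, here is a rough sketch (the cache key, paths, and Git strategy choice are assumptions, not from the comment above): cache node_modules per branch so npm install only does real work once, and use a faster Git strategy in later jobs.

```yaml
cache:
  key: "$CI_COMMIT_REF_SLUG"    # one cache per branch
  paths:
    - node_modules/             # reused by every job in the pipeline

build-min-code:
  stage: build
  script:
    - npm install               # populates node_modules/, which then gets cached
    - npm run minifier

run-unit-test:
  stage: test
  variables:
    GIT_STRATEGY: fetch         # reuse the existing working copy instead of a full clone
  script:
    - npm run test
```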
It really depends on your GitLab setup. If you are using dynamic runners, the cost of compressing and decompressing a distributed shared cache is about the same. We have large projects with 600 MB to 1 GB of dependencies, and the distributed cache shaves about 50 to 60 seconds off a yarn install. Caching is not a small topic. Would love to see more advanced articles on it.
Yes, you are right. The most important thing is knowing when to use a cache and when not to. In your case a cache isn't a good fit.
Really well done! The info is clear and helpful. What I appreciated most was the structured approach you took to explaining a rather complex operation.
Congrats on your first tech post, too. Looking forward to seeing more. ☺💪👏💯
Thanks for writing about GitLab!
I learned CI/CD with GitLab. What a pleasure! Recently I tried Travis, and if I could sum up the experience in one sentence, it would be "why did this have to be so hard?"
I thoroughly recommend GitLab.
Well done on the article, @zurihunter
Hmm, it depends on the version of npm. I would use yarn; it is still faster. And, like the first comment, with a cache. npm ci is very much abandoned; the idea was good but the speed is not there.
Thanks for the post !
Another great feature with GitLab CI is the ability to use templates.
By the way, a good option if you want to build a state-of-the-art DevOps pipeline with GitLab CI is to use advanced templates such as to be continuous (to-be-continuous.gitlab.io/doc/).
It provides many ready-to-use and composable templates (Node, Python, Maven, Docker, AWS, Kubernetes, S3 and many more).
It's open-source ;)
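Template reuse like the comment above describes is built on GitLab's include keyword. The sketch below pulls in one of GitLab's bundled templates plus a file from another project; the project and file paths in the second entry are purely illustrative.

```yaml
include:
  # a template bundled with GitLab
  - template: Security/SAST.gitlab-ci.yml
  # a file from another project on the same GitLab instance (hypothetical path)
  - project: my-group/ci-templates
    file: /templates/node.gitlab-ci.yml
```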
Nice Article. I would love to see some best practices around the actual deployment. Most examples just have fluff in those sections, similar to your gitlab (below).
It feels like people are in two camps around deployment. Many just use Heroku or other services for the actual deployment and the rest are just scp'ing files around. I would love to better understand best practices in the non-heroku-ish case.
Looks perfect for beginners
Thank you
Cool post, thanks for sharing :)