DEV Community

Pavan Belagatti

What Are Multi-Stage Docker Builds

In software development and deployment, time spent waiting on slow builds adds up quickly. When you use Docker to build and deploy your software, you want the build process to be as lean as possible so that your developers are not held up by lengthy builds. That is why multi-stage builds are such a helpful feature of Docker.

In software development, there is rarely one right way to do things; it comes down to finding the best solution for your situation and organization. The same is true of building container images efficiently, where a number of techniques exist. One of them is the multi-stage build, which can help reduce the size of your images. This article explains what multi-stage builds are and how they can help speed up your development process.

Dockerfile

Containers allow you to package an application with everything it needs, such as libraries and other dependencies, and ship it all as one unit. The application is built into an image, which can be pushed to an image registry such as DockerHub. To create an image, you need a Dockerfile: a plain text document listing the instructions Docker follows to assemble the image. A Dockerfile normally starts with a FROM instruction that names a base image, an existing image that your own image builds on. After this, you add your own customizations, depending on the application you are working on. The Docker Engine reads the Dockerfile and executes the instructions in order. A good Dockerfile produces an image that builds and deploys quickly and carries as few dependencies as possible.
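
As a rough illustration of the instructions described above, here is a minimal annotated Dockerfile. It is a sketch rather than something taken from a real project: the alpine:3.19 base image is just one possible choice, and hello.sh is a hypothetical shell script (starting with #!/bin/sh) assumed to sit next to the Dockerfile.

```dockerfile
# Base image that this image builds on (alpine is a small Linux distribution)
FROM alpine:3.19
# Working directory for the instructions that follow
WORKDIR /app
# Copy a file from the build context into the image (hello.sh is hypothetical)
COPY hello.sh .
# Run a command at build time; the result is stored in the image
RUN chmod +x hello.sh
# Default command executed when a container starts from this image
CMD ["./hello.sh"]
```

Docker executes these instructions from top to bottom, and the filesystem changes made by each COPY or RUN are stored as a layer of the image.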

Docker Build

Docker is a containerization platform that lets developers create portable, self-sufficient containers. A Docker build starts from a base image, which forms the bottom layers of the final image. The base image typically contains a minimal operating system userland and the packages needed to execute commands. Each subsequent instruction, such as copying files or installing packages, adds a layer on top of it. A Dockerfile specifies all these steps in detail and serves as the input to the docker build command, which creates an image from it. The -t option of docker build gives the resulting image a name and, optionally, a tag such as a version.

docker build is a single command that generates an image with the configuration and dependencies specified in the Dockerfile.

Multi-Stage Docker Builds

Every microservice should run in its own container, and each of those containers needs an image. If you only use single-stage Docker builds, you are missing out on some powerful features of the build process. A multi-stage Docker build has several advantages over a single-stage build when deploying microservices.

A multi-stage build lets you split the steps of building a Docker image into multiple stages within one Dockerfile. Each stage begins with its own FROM instruction and can use a different base image. Typically, an early stage contains everything needed to build the application, such as compilers, build tools and development dependencies. A later stage then starts from a smaller base image and copies in only the build output it needs, leaving the build tooling behind. The final image therefore contains only what is required to run the application, which is why multi-stage builds are widely used to make container images smaller.

As mentioned above, multi-stage builds let you create optimized Docker images that contain only the dependencies necessary to run your application. Combined with Docker’s layered images and build cache, this can save significant space on your Docker host and in your registry. Smaller images are also quicker to push, pull and deploy than images that carry all the tooling needed to build your application.
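
The split between build and runtime is perhaps easiest to see with a compiled language. The sketch below assumes a hypothetical single-file Go program, main.go, that uses only the standard library; the golang:1.22 image and the file and stage names are illustrative choices, not part of the examples in this article.

```dockerfile
# Stage 1: full Go toolchain, used only to compile the program
FROM golang:1.22 AS build
WORKDIR /src
COPY main.go .
# CGO_ENABLED=0 produces a statically linked binary that needs no system libraries
RUN CGO_ENABLED=0 go build -o app main.go

# Stage 2: empty base image; only the compiled binary is copied in
FROM scratch
COPY --from=build /src/app /app
ENTRYPOINT ["/app"]
```

The final image contains the binary and nothing else; the Go toolchain stays in the build stage and never reaches the image you ship.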

Maintaining two Dockerfiles, one for development and one for production, is not considered ideal in the DevOps world. This is where multi-stage Docker builds come in handy: one optimized Dockerfile can serve all environments (Dev, Staging and Production).

Multi-Stage Docker Build Examples

Java Example:

To understand the concept of Multi-stage Docker builds better, let us consider a simple Java Hello World application.

Add the following code in a file named HelloWorld.java

class HelloWorld {
   public static void main(String[] a) {
       System.out.println("Hello world!");
   }
}

Then, create a Dockerfile with the following content in it,

FROM openjdk:11-jdk
COPY HelloWorld.java .
RUN javac HelloWorld.java
CMD java HelloWorld

Build the image with the following command,
docker build -t helloworld:huge .

Let’s modify our Dockerfile with the following content to show how multi-stage Docker build works.

FROM openjdk:11-jdk AS build
COPY HelloWorld.java .
RUN javac HelloWorld.java

FROM openjdk:11-jre AS run
COPY --from=build HelloWorld.class .
CMD java HelloWorld

Build the image with the following command,

docker build -t helloworld:small .

Now, let’s compare both images. Check the images created with the following command,

docker images

You should see a clear difference in size between the two images, because the second image is based on the JRE rather than the full JDK. This way, you can separate the build and runtime environments in the same Dockerfile. The run stage pulls in only the compiled class file from the build stage [COPY --from=build HelloWorld.class .], so the compiler and other build tooling never reach the final image. This helps minimize the size of Docker images.

Node.js Example:

Let’s look at a simple Node.js application with the following Dockerfile.

FROM node:14-alpine
WORKDIR /app
# Copy the manifest first so the dependency layer is cached between builds
COPY package.json .
RUN npm install --production
# Copy the application source
COPY . .
EXPOSE 3002
CMD [ "node", "app.js" ]

Let’s build the image with the following command (the trailing dot is the build context),
docker build -t [DockerHub username]/[image name]:[tag] .

Push the image to Docker Hub with the command,
docker push [DockerHub username]/[image name]:[tag]

I pushed the image to DockerHub, where the image and its size are listed.

Now, let’s try using the concept of multi-stage Docker build and modify our existing Dockerfile.

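FROM node:14-alpine AS base
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .

FROM alpine:latest
# alpine does not include Node.js; install the runtime so the CMD below can run.
# Its version may differ from the Node.js 14 used in the build stage.
RUN apk add --no-cache nodejs
# Copy the application and its installed dependencies from the build stage
COPY --from=base /app /app
WORKDIR /app
EXPOSE 3002
CMD [ "node", "app.js" ]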

Let’s build and push the image with commands similar to those used above. Just make sure to give the image a different name.

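Now, compare the image sizes. The image built from the usual Dockerfile is 48.81 MB, while the one created with the multi-stage Docker build is 7.12 MB. The multi-stage image is far smaller because its final stage starts from plain alpine and leaves the node base image behind. Note that alpine itself does not ship Node.js, so the final stage must also provide a Node.js runtime for the CMD to work, which adds to the size.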

Multi-stage Docker builds are also useful when you want to split a Dockerfile into stages for different environments.

A normal Dockerfile looks as below,

FROM node:14-alpine

WORKDIR /src
COPY package.json package-lock.json /src/
RUN npm install --production

COPY . /src

EXPOSE 3000

CMD ["node", "bin/www"]

We will create 3 simple stages from the above Dockerfile.

  1. Base stage: This stage contains the instructions that all environments share with the original Dockerfile
  2. Production stage: This stage adds what the production environment needs
  3. Dev stage: This stage adds what the Dev environment needs

The modified Dockerfile looks as below,

FROM node:14-alpine AS base

WORKDIR /src
COPY package.json package-lock.json /src/
EXPOSE 3000

FROM base AS production
ENV NODE_ENV=production
RUN npm ci
COPY . /src
CMD ["node", "bin/www"]

FROM base AS dev
ENV NODE_ENV=development
RUN npm install -g nodemon && npm install
COPY . /src
CMD ["nodemon", "bin/www"]
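
With several named stages in one Dockerfile, you choose which one to build with the --target option of docker build; without it, Docker builds the last stage in the file. The following self-contained sketch shows the mechanics with placeholder stages. The alpine:3.19 base and the demo image names in the comments are hypothetical.

```dockerfile
# Build a single stage by name:
#   docker build --target production -t demo:prod .
#   docker build --target dev        -t demo:dev  .
FROM alpine:3.19 AS base
WORKDIR /app

# Stage selected by --target production
FROM base AS production
CMD ["echo", "production stage"]

# Stage selected by --target dev (also the default, since it is the last stage)
FROM base AS dev
CMD ["echo", "dev stage"]
```

Applied to the Node.js Dockerfile above, building with --target production gives the production image and building with --target dev gives the development one. Because dev is the last stage there, production builds should name their target explicitly.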

Some notable advantages of using a multi-stage build:

  • Reduces the overall size of the Docker image
  • Removes the burden of maintaining multiple Dockerfiles for different environments
  • Makes it easy to debug a particular build stage
  • Lets one stage serve as the base for another, for example for a different environment
  • Reuses cached layers to make the overall process quicker
  • Reduces the risk of vulnerabilities, since a smaller image contains fewer packages

Deploying Applications with Harness

Sign up for a free trial of Harness and select the TryNextGen tab for a seamless experience. Create a new project and select the Continuous Delivery module. Start creating a new pipeline, and add all the details that your pipeline needs.

Note: For Harness to do its magic, you need something called a ‘Delegate’ to be running on your Kubernetes cluster.

What is Harness Delegate?
The Harness Delegate is a service/software you need to install/run on the target cluster [Kubernetes cluster in our case] to connect your artifacts, infrastructure, collaboration, verification and other providers with the Harness Manager. When you set up Harness for the first time, you install a Harness Delegate.

We will not dig deeper into the Delegate in this article, as it deserves a separate blog post of its own. For now, just know that the Delegate performs all deployment operations for you. If you want to know more about the Delegate, see the Harness documentation.

Next, specify the service, infrastructure and deployment strategy for your application. Once everything is set, save the configuration and run to deploy the application.

There you go! Once the pipeline runs successfully, you should see your application deployed on the specified Kubernetes cluster. You can verify this with the kubectl command ‘kubectl get pods’. Would you like to try Harness CD? Sign up for the Harness CD free trial.

Conclusion

Multi-stage builds help you build optimized Docker images that can run anywhere. If streamlining software delivery is one of your goals, it is worth understanding how multi-stage Docker builds work. Smaller images make deployments faster, and stages and cached layers can be reused to save time and effort. Multi-stage builds are a great way to simplify your image creation process and save developers time.

In the cloud-native world, security is of high importance. One excellent benefit of multi-stage Docker builds is that they reduce the number of dependencies and unnecessary packages in the image, which reduces the attack surface. They also keep the image clean and lean, with only the things required to run your application in production. Otherwise, we end up building and pushing large images with vulnerabilities that give attackers an easy way into our applications. Try using multi-stage Docker builds for optimized, more secure images. I hope this article helped you learn more about multi-stage Docker builds and why we should use them.