One of the advantages of using containers is the ability to have images with our environment preconfigured and ready to go with our application code. These images are what allow us to run a container on our laptop and then run the same image in a container in the cloud. But building these images is not a trivial task, and we tend to add useless steps or overhead to the images we create. This entry shares some of the best practices that have worked for me in the past when building Docker images.
First of all, we need to define what a Docker image is. A Docker image is a unit of packaging that may include operating system constructs or packages, application dependencies and libraries, and application code. If you come from the virtualization world, think of the image as a VM template, and of containers as instances created from this template. Otherwise, if you are a developer, you can think of an image as a class definition, and of containers as instances of that class.
Docker images must be stored somewhere, right? These places are called image registries. A registry is like a repository where you store your images. Some examples are:
- Docker Hub
- AWS ECR
- Oracle Container Registry
- Azure Container Registry
- Google Container Registry
- Red Hat Quay
Note that you will need an account on these registries to push images, and on some of them also to pull. You need to pull an image to your local computer to use it.
Note that you can also build a Docker image locally without a registry, but if you want to use that image on another computer, server, or cloud, the usual way is to push it to a registry.
Images are made up of multiple layers that are presented as a single object. One thing that is not part of the image is the kernel, because containers use the host's kernel.
Example of this:
☁ docker [master] ⚡ docker image pull mongo:latest
latest: Pulling from library/mongo
7b1a6ab2e44d: Already exists
90eb44ebc60b: Pull complete
5085b59f2efb: Pull complete
c7499923d022: Pull complete
019496b6c44a: Pull complete
c0df4f407f69: Pull complete
351daa315b6c: Pull complete
5b6df31e95f8: Pull complete
e82745116109: Pull complete
98e820b4cad7: Pull complete
Digest: sha256:cf9f5df5419319390cc3b5d9abfc2d0d0b149b3e9e3e29b579
Status: Downloaded newer image for mongo:latest
docker.io/library/mongo:latest
Here we can see another interesting characteristic of Docker images: when you pull an image, Docker only downloads the layers that changed or that are new to your local system, which reduces network traffic.
- This layer was already on my system:
7b1a6ab2e44d: Already exists
- This one was downloaded:
90eb44ebc60b: Pull complete
Images can be really small, let's see the following example:
☁ docker [master] ⚡ docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
59bf1c3509f3: Pull complete
Digest: sha256:21a3deaa0d32a8057914f36584b5288d2e0118285c70fa8c9300
Status: Downloaded newer image for alpine:latest
docker.io/library/alpine:latest
As you may have noticed, there are fewer layers in the alpine image. Just to be sure, we are going to list the local images:
☁ docker [master] ⚡ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
alpine latest c059bfaa849c 8 days ago 5.59MB
mongo latest 4253856b2570 2 weeks ago 701MB
5MB vs 700MB. In my experience, most images land somewhere around 40-200MB.
Another aspect to consider is image naming. If we look at the previous commands, images are referenced as repository:tag (this short form is for official images), with latest being the default tag if you do not specify one.
There is a way to query if an image is an official one, by using:
☁ docker [master] ⚡ docker search ubuntu --filter "is-official=true"
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
ubuntu Ubuntu is a Debian-based Linux operating sys… 13244 [OK]
websphere-liberty WebSphere Liberty multi-architecture images … 282 [OK]
ubuntu-upstart DEPRECATED, as is Upstart (find other proces… 112 [OK]
ubuntu-debootstrap DEPRECATED; use "ubuntu" instead 45 [OK]
NOW, after the images 101, we need to move on to some of the best practices for using or creating images, let's get to it...
OH WAIT!!
I almost forgot to talk about Dockerfile ...
Basically:
- Text document that contains the commands to build an image
- Yeap, by default docker build looks for a file named Dockerfile, with uppercase at the beginning
- Nop, names like docker-file or docker file are not picked up automatically; any other name has to be passed with the -f flag
This is a small example of a Dockerfile from the official Docker documentation:
FROM ubuntu:18.04
COPY . /app
RUN make /app
CMD python /app/app.py
AND now, we can move on with the recommendations...
1.- Use Official Docker images as base Image
In the previous Dockerfile example, the base image is the one used in the FROM statement:
FROM ubuntu:18.04
COPY . /app
RUN make /app
CMD python /app/app.py
BUT, BUT, BUT
There is a caveat: instead of using a base OS image and then adding RUN statements to install dependencies, use the official image that already has the dependencies installed.
Example:
This is OK, but it is not the best approach:
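FROM ubuntu:18.04
# wget is not part of the base image, so install it together with gnupg
RUN apt-get update && apt-get install -y gnupg wget
# Builds already run as root, and sudo is not installed in the base image
RUN wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | apt-key add -
COPY ./mongodb-org-5.0.list /etc/apt/sources.list.d/mongodb-org-5.0.list
# Refresh the package index so the new MongoDB source is picked up
RUN apt-get update && apt-get install -y mongodb-org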
This is a better approach:
FROM mongo:4.4.11-rc0
It will simplify your Dockerfiles and make your life easier...
2.- Avoid adding your code at the beginning
As I mentioned before, Docker images are layered, and some of these layers can be cached on your local system. YOU WANT to have cached layers: builds and pulls take less time when most of the layers are already there.
Take the following Dockerfile:
FROM ubuntu
COPY . /app
CMD ["java", "-jar", "/app/target/app.jar"]
When you make a change to your code, the second line, COPY . /app, is invalidated, so that layer and every layer after it are rebuilt; only FROM ubuntu will come from the cache.
Put the COPY of your code at the end.
Also, as a bonus, copy just the jar file to the container; you do not need all the files. For example, you do not need the README file:
FROM ubuntu
RUN apt-get update && apt-get install -y --no-install-recommends \
openjdk-8-jdk ssh vim
# The destination path must match the path used in CMD
COPY target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
And if we remember the first recommendation, someone already built an image with OpenJDK installed:
FROM openjdk
# The destination path must match the path used in CMD
COPY target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
And it is an official one:
☁ docker [master] ⚡ docker search openjdk --filter "is-official=true"
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
openjdk OpenJDK is an open-source implementation of … 3046 [OK]
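The same ordering idea also helps when dependencies are installed during the build. The following is only a minimal sketch, assuming a Python application with a requirements.txt and an app.py (both file names are hypothetical and not part of this entry); the base tag is one of the python tags listed later in this post.

```dockerfile
FROM python:3.7.12-alpine3.15
WORKDIR /app
# Dependency manifest first: this layer and the install below are rebuilt
# only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Application code last: editing it invalidates only the layers from here on.
COPY . .
# app.py is a hypothetical entry point used for illustration.
CMD ["python", "app.py"]
```

With this layout, a code-only change should reuse the cached dependency layer instead of reinstalling everything.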
3.- Add a specific version for your Base Image
If you remember, at the beginning we saw that if you do not specify a version, latest will be used...
Right?
OK, you do not want to use latest...
- latest is unpredictable
- latest can change between pulls
- latest can break your code
- latest is not love!!
Use a specific image version:
FROM mongo:4.4.11-rc0
and
FROM openjdk:slim
# The destination path must match the path used in CMD
COPY target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
You can check the available image versions under Tags on Docker Hub.
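Keep in mind that a variant tag such as slim carries no version number, so it can move between pulls much like latest does. As a hedged sketch, this is the same Dockerfile pinned to a versioned tag; 18-jdk-alpine3.15 is simply the openjdk tag that shows up in the image listing later in this post, not a recommendation of that particular release.

```dockerfile
# Versioned tag: the JDK release and the Alpine release are both spelled out.
FROM openjdk:18-jdk-alpine3.15
COPY target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
```

For stricter reproducibility, a base image can also be referenced by its digest instead of a tag.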
TIP (not really relevant for this entry)
You can also pull a repository with all its tags:
☁ docker [master] ⚡ docker pull --all-tags alpine
#output too long to be shown
☁ docker [master] ⚡ docker image ls | grep alpine
alpine 3 c059bfaa849c 9 days ago 5.59MB
alpine latest c059bfaa849c 9 days ago 5.59MB
ansible-base-lab_managed-host-alpine latest 77f2f125fa50 6 weeks ago 80.8MB
alpine 20210804 4e873038b87b 4 months ago 5.59MB
alpine 20210730 8fd5af68fdb2 4 months ago 5.59MB
alpine 3.10 e7b300aee9f9 7 months ago 5.58MB
alpine 20210212 b0da5d0678e7 8 months ago 5.62MB
alpine 20201218 430cc6504dbd 11 months ago 5.61MB
alpine 20200917 003bcf045729 14 months ago 5.62MB
alpine 20200626 3c791e92a856 17 months ago 5.57MB
alpine 20200428 5737d7d248e9 19 months ago 5.6MB
4.- Decouple your applications into different containers
This is going to be a short one: a container should have only one concern. A web application may consist of 3 containers (the web app code, the database, the cache) instead of a single one doing it all.
This helps you scale each part on its own and make atomic changes.
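As a rough sketch of what "one concern" can look like, the image below packages only the web application; the database and the cache would run from their own images. The variable names DATABASE_HOST and CACHE_HOST and their values are hypothetical, chosen only to show the app being pointed at the other containers instead of bundling them.

```dockerfile
# Web application only: no database or cache is installed in this image.
FROM openjdk:18-jdk-alpine3.15
# Hypothetical settings the application would read to locate the other containers.
ENV DATABASE_HOST=db CACHE_HOST=cache
COPY target/app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
```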
5.- Use leaner official images, also referred to as minimal flavors
Official OS images like ubuntu may contain packages or services that we do not need.
Remember, the idea of a container is to provide just the software your application needs to run as expected. You may not even need to enter the container, which is why some images do not have a shell installed.
Smaller flavors also improve security, because there are fewer services to attack and fewer services to update.
AND smaller flavors are easier to transfer and store.
Let's see an example with openjdk, the slim tag vs an Alpine-based JDK tag:
☁ docker [master] ⚡ docker image ls | grep openjdk
openjdk slim 8b0ead3b8172 33 hours ago 407MB
openjdk 18-jdk-alpine3.15 c89120dcca4c 3 days ago 329MB
Another example:
☁ docker [master] ⚡ docker image ls | grep python
python latest 47ebea899258 20 hours ago 917MB
python 3.7.12-alpine3.15 a1034fd13493 3 days ago 41.8MB
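Switching to a minimal flavor is mostly a matter of changing the FROM line, although Alpine-based images use apk instead of apt-get. A minimal sketch, assuming a single-file app.py (hypothetical) and using curl only as a placeholder for whatever system package your application needs:

```dockerfile
# Alpine-based variant from the listing above (41.8MB vs 917MB for python:latest).
FROM python:3.7.12-alpine3.15
# apk is Alpine's package manager; --no-cache avoids storing the package index in the layer.
RUN apk add --no-cache curl
# app.py is a hypothetical application file used for illustration.
COPY app.py /app/app.py
CMD ["python", "/app/app.py"]
```

Some packages behave differently on Alpine because it ships musl instead of glibc, so it is worth testing your application before committing to the smaller base.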
Comments
This is not intended to be an extensive list of best practices; these are the easiest steps you can start with. In following entries I will write about more advanced topics like multistage builds and all the beautiful things we can do.