DEV Community

igaurab

Dockerfile good practices [NOTE]

These are my notes from this talk on Dockerfile good practices.

Areas of improvement

  • Incremental build time
  • Image size
  • Maintainability
  • Security
  • Consistency / Repeatability

Incremental build time

Making build cache your friend

  • Order is important: if you change any line of your Dockerfile, the cache for every subsequent step is busted. So order your steps from least to most frequently changing to get the most out of caching.

Let's look at this sample Dockerfile:

FROM ubuntu:18.04
COPY . /app
RUN apt-get update
RUN apt-get -y install openjdk-8-jdk

This Dockerfile copies the application files right at the start. Because each step's cache depends on the steps before it, any change to the copied content invalidates every step after the COPY, and those steps have to run again.

That is a problem: whenever you change your application code and rebuild the image to reflect the change, every command below the COPY command has to run again.
A better approach is to run the COPY command last.

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get -y install openjdk-8-jdk
COPY . /app

  • Only copy what's needed: Avoid COPY . if possible. When you copy files into your image, be very specific about what you copy, because any change to those files will bust the cache.

  • Identify cacheable units: Sometimes you want steps to be cached together. For example, change this

RUN apt-get update
RUN apt-get -y install openjdk-8-jdk

to

RUN apt-get update \
 && apt-get -y install \
     openjdk-8-jdk

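Combining the two commands into one RUN step makes them a single cacheable unit, so apt-get install never runs against an outdated package index left behind by a cached apt-get update step.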

  • Fetch dependencies in a separate step: This is also about identifying the cacheable units.
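
A minimal sketch of the last three points, assuming a Maven project with a pom.xml and a src directory. The base image tag and the project layout are assumptions for illustration, not something these notes prescribe:

```dockerfile
# Assumed base image; any image that provides Maven and a JDK would do.
FROM maven:3.6-jdk-8
WORKDIR /app
# Copy only the dependency manifest first; this file changes rarely.
COPY pom.xml .
# Resolve dependencies in their own step, so the result stays cached
# until pom.xml itself changes.
RUN mvn -e -B dependency:resolve
# Copy the sources last; editing code only invalidates the steps below.
COPY src ./src
RUN mvn -e -B package
```

With this ordering, a change under src reruns only the final COPY and the package step, while the dependency download is served from the cache.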

Reduce Image size

  • Remove unnecessary dependencies: Don't install debugging tools and other unnecessary dependencies. You can also use the --no-install-recommends flag. You don't want to deploy your build tools into production, as you will not need them at runtime.

  • Remove package manager cache: You don't need the package cache once the packages are installed, so remove it in the same RUN step.

RUN apt-get update \
    && apt-get -y install --no-install-recommends \
    openjdk-8-jdk \
    && rm -rf /var/lib/apt/lists/*

Maintainability

  • Use official images where possible: Official images are pre-configured for container use and built by smart people. It can save you a lot of time in maintenance. This also allows you to share layers between images, as they use exactly the same base image.
    For the sample Dockerfile above, instead of starting from a plain ubuntu image and installing the JDK yourself, simply use FROM openjdk:8 as your base image.

  • Use more specific tags: The latest tag is a rolling tag. Be specific to prevent unexpected changes in your base image.

  • Look for minimal flavors: Maybe you don't need everything that ships in the bigger variants.

REPOSITORY  TAG             SIZE
-----------------------------------------
openjdk     8               624MB
openjdk     8-jre           443MB
openjdk     8-jre-slim      204MB
openjdk     8-jre-alpine    83MB
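
As a sketch of how a specific tag and a minimal flavor combine, assuming the application only needs a JRE at runtime and that a prebuilt target/app.jar exists (a hypothetical path used for illustration):

```dockerfile
# Use a specific, minimal variant instead of a rolling tag such as latest.
FROM openjdk:8-jre-alpine
# target/app.jar is a hypothetical artifact path, not taken from the notes.
COPY target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```

Alpine-based images use musl instead of glibc, so it is worth checking that the application and any native libraries it loads run on that variant before switching.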


Reproducible

The Dockerfile is the blueprint of your image; the source code is the source of truth for your application.

Make the dockerfile your blueprint

  • It describes the build environment
  • Correct versions of build tools installed
  • Prevent inconsistencies between environments
  • There may be system dependencies

Multi-stage builds

Use Cases

  • Separate build from runtime environment
  • Slight variations on images (DRY)
  • Build dev/test/lint-specific environments

    • builder: all build dependencies
    • build: builder + build artifacts
    • cross: same as build but for different envs
    • dev: builder + dev/debug tools
    • lint: minimal lint dependencies
    • test: all test dependencies + build artifacts to be tested
    • release: final minimal image with build artifacts
  • Delinearizing your dependencies (concurrency)

  • Platform specific stages

When you name a stage, you can build just that stage by passing its name to --target.

FROM image_or_stage AS stage_name

$ docker build --target stage_name .
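
A minimal sketch of separating the build from the runtime environment with two named stages. The image tags, the stage names builder and release, and the app.jar artifact name are assumptions for illustration:

```dockerfile
# Build stage: contains Maven and the JDK, which are not needed at runtime.
FROM maven:3.6-jdk-8 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package

# Release stage: a small runtime image that receives only the build artifact.
# The jar name depends on the project's pom.xml; app.jar is assumed here.
FROM openjdk:8-jre-alpine AS release
COPY --from=builder /app/target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```

Running docker build --target builder . would stop after the first stage, which can be handy for debugging the build environment, while a plain docker build . produces the final release image.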

Further Resources

  1. Dockerfile best practices

  2. Docker's official best practices

Top comments (3)

Alastair Measures • Edited

Good article - thank you.

Am a bit focussed on reducing image sizes as generally a very good thing.

The image size may have only a limited effect on the RAM usage footprint of a running container. Doubtless Alpine images are smaller and build faster and load faster but do they actually cost less to host in the cloud?

Am partly looking to see why Ubuntu and Debian based remain so popular relative to Alpine?

igaurab • Edited

Hey Alastair, thanks for the comment. I have no idea about the costs.

Ubuntu and Debian may be preferable due to several reasons:

  1. Familiarity with the system / large pre-existing community / packages
  2. Security

Also, these threads might be relevant:

  1. news.ycombinator.com/item?id=11044980
  2. turnkeylinux.org/blog/alpine-vs-de...
  3. reddit.com/r/docker/comments/77zor...
 
Alastair Measures • Edited

Hey, Thank you for taking a moment to respond.

Familiarity seems a reason for inertia; when actually in most cases the effort to transition is a modest one off cost. If the Alpine RAM footprint is really half the size (of Ubuntu/ Debian) then the reduction in deployment hosting costs is recurring and therefore significant - especially for small startups.

There is also a relationship between executable size and CPU cache size that influences performance quite markedly.

Concerning security, everyone should always be "all ears" and the picture evolves - just ask OpenBSD about their SSL hiccup. Also given that your links are between 3 and 5 years old, it would be interesting to know how this has evolved.

Currently seeing little reason to switch focus away from Alpine; and if the moment comes, my familiarity with Ubuntu and Debian is still alive on my desktop.

Thanks again for your response.