Jensen Jose

Posted on Aug 1

CK 2024 Blog Series: Understanding Docker Storage

#kubernetes #aws #docker

Welcome back to the CK 2024 blog series! Today we're diving into Docker storage: how storage works in Docker and how to make it persistent. These basics are crucial before we move on to discussing Kubernetes storage in the next post.

Introduction to Docker Storage

Where a container keeps its data determines whether that data survives. In this post, we will cover the basics of Docker storage: image layers, the container's writable layer, volumes, storage drivers, and bind mounts. Understanding these concepts will set the stage for our upcoming discussion on Kubernetes storage.

Cloning a Repository and Creating a Dockerfile

To start, let's clone a GitHub repository. You can use any project, but for this example, we'll use a simple to-do app that we used in our Day 2 video. Here are the steps:

Clone the repository:

git clone https://github.com/your-repo/todo-app.git
cd todo-app

Create a Dockerfile with the following instructions:

FROM node:18-alpine
WORKDIR /app
COPY . .
RUN yarn install
EXPOSE 3000
CMD ["yarn", "start"]

Build the Docker image:

docker build -t todo-app .
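To confirm that the build produced an image, you can list it by name. The snippet below is a minimal sketch: image_summary is a hypothetical helper name chosen for this post, while docker image ls and its --format placeholders are part of the Docker CLI.

```shell
# image_summary is an illustrative helper name, not a Docker command.
image_summary() {
  # List only the given repository, one "name:tag size" line per image.
  docker image ls --format '{{.Repository}}:{{.Tag}} {{.Size}}' "$1"
}
```

Call it as image_summary todo-app after the build finishes.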

Understanding Docker Image Layers

When you build a Docker image, it is composed of multiple layers. Instructions that change the filesystem, such as RUN, COPY, and ADD, each create a new layer, while instructions such as EXPOSE and CMD only record metadata. These layers are read-only and form the base of your container. When a container runs, its changes go into a thin writable layer on top of these read-only layers.

For example:

node:18-alpine creates a base layer of about 6.48 MB.
Additional layers are created for the WORKDIR, COPY, and RUN instructions.

If you change the Dockerfile (or the files it copies) and rebuild the image, Docker reuses cached layers up to the first instruction that changed, then rebuilds that layer and every layer after it. This reuse is a key benefit of Docker's layered architecture.
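You can inspect the layers of an image yourself with docker history. The sketch below wraps it in a small function; show_layers is a hypothetical name used only for this example, and the exact sizes you see will depend on your build.

```shell
# show_layers is an illustrative helper name; docker history is the real command.
show_layers() {
  # One line per layer: its size, then the instruction that created it.
  # Instructions that only set metadata typically show up as 0B.
  docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "$1"
}
```

Call it as show_layers todo-app. The newest layer is listed first, so the base image's layers appear at the bottom.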

Making Data Persistent with Volumes

By default, data within a Docker container is ephemeral. It lives in the container's writable layer, which survives a stop and restart but is deleted when the container is removed. To make data persistent, we use Docker volumes. Volumes store data outside the container's writable layer, in storage that Docker manages on the host, so the data persists across container restarts and removals.
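A quick experiment makes the lifetime of the writable layer visible. This is a sketch under a few assumptions: it uses the public alpine image, and both the helper name demo_ephemeral and the container name scratchpad are made up for this example.

```shell
# demo_ephemeral and the container name "scratchpad" are illustrative choices.
demo_ephemeral() {
  # 1. Write a file into a container's writable layer; the container then exits.
  docker run --name scratchpad alpine sh -c 'echo hello > /note.txt'
  # 2. The stopped container still holds the file, so it can be copied out.
  docker cp scratchpad:/note.txt ./note-from-stopped.txt
  # 3. Removing the container discards its writable layer.
  docker rm scratchpad
  # 4. A fresh container from the same image starts without the file,
  #    so this cat is expected to fail.
  docker run --rm alpine cat /note.txt || echo "note.txt is gone"
}
```

Run demo_ephemeral in an empty directory; the only copy of the file left afterwards is the one that step 2 copied to the host.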

Create a volume:

docker volume create data-volume

Run a container with the volume mounted at /app (a new, empty volume is first populated with the image's existing /app contents):

docker run -d -p 3000:3000 --name todo-app -v data-volume:/app todo-app

Verify the volume:

docker volume ls
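To check that the volume really outlives the container, you can remove the container and start a new one on the same volume. The function name below is hypothetical; the container, image, and volume names are the ones used in this post.

```shell
# volume_survives_removal is an illustrative helper name.
volume_survives_removal() {
  # Remove the running container; the named volume is left untouched.
  docker rm -f todo-app
  # Print the host directory where Docker keeps the volume's data.
  docker volume inspect --format '{{.Mountpoint}}' data-volume
  # A new container that mounts the same volume sees the earlier data.
  docker run -d -p 3000:3000 --name todo-app -v data-volume:/app todo-app
}
```

The volume is only deleted when you remove it explicitly, for example with docker volume rm data-volume once no container uses it.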

Storage Drivers

Docker uses storage drivers to manage how image layers and the container's writable layer are stored on the host. On Linux, overlay2 is the default and recommended driver. Older drivers such as aufs and devicemapper are deprecated and should not be used for new setups. Storage drivers handle only the read-only layers and the writable container layer; volumes and bind mounts bypass them.
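To see which storage driver your own Docker installation uses, you can ask docker info for just that field. The helper name current_storage_driver is an assumption made for this example; the value it prints depends on your platform and Docker configuration.

```shell
# current_storage_driver is an illustrative helper name.
current_storage_driver() {
  # Print only the storage driver field from the daemon's info.
  docker info --format '{{.Driver}}'
}
```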

Persistent Data with Bind Mounts

Another way to persist data is by using bind mounts, which map a directory on the host machine to a directory in the container.

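Run a container with a bind mount (remove the earlier todo-app container first, since container names must be unique):

docker rm -f todo-app
docker run -d -p 3000:3000 --name todo-app -v /path/on/host:/app todo-app

This binds the host directory /path/on/host to the container directory /app, so anything written there is stored on the host machine. Unlike a new named volume, a bind mount hides whatever the image had at /app, so the host directory must already contain the application code.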
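A small, self-contained way to watch a bind mount at work is to let a throwaway container write a file and then look for it on the host. This sketch assumes the public alpine image; the helper name demo_bind_mount, the bind-demo directory, and the /data target are all hypothetical. It also uses --mount, the more explicit spelling of what -v does above.

```shell
# demo_bind_mount, bind-demo, and /data are illustrative names.
demo_bind_mount() {
  host_dir="$(pwd)/bind-demo"
  mkdir -p "$host_dir"
  # Mount the host directory at /data and write a file through the mount.
  docker run --rm --mount "type=bind,source=$host_dir,target=/data" alpine \
    sh -c 'echo "written from the container" > /data/hello.txt'
  # The file was written through the mount, so it should now be on the host.
  ls -l "$host_dir"
}
```

On Linux, the file is usually owned by root, because the container process runs as root unless told otherwise.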

Conclusion

Understanding Docker storage and making data persistent are critical skills for managing containerized applications. By using Docker volumes and bind mounts, you can ensure your data is safe and available even if containers are stopped or removed.

In the next post, we'll dive into Kubernetes storage, including persistent volumes and persistent volume claims. Stay tuned and happy coding!

For further reference, check out the detailed YouTube video that accompanies this post.
