
Parvathi


Beginner's Guide to Docker (Part 2) - Manage data in Docker

We have already learned a few basic things about Docker, such as how to build an image using a Dockerfile and create a container from that image. Continuing from there, let's see how to manage data in Docker.

Why do we need to manage data?

Depending on its functionality, our dockerized application might need to read from or write to the file system. Some of that data needs to be persisted and some does not.

By default all files created inside a container are stored on a writable container layer. This means that:

1) The data doesn’t persist when that container no longer exists, and it can be difficult to get the data out of the container if another process needs it.

2) A container’s writable layer is tightly coupled to the host machine where the container is running. You can’t easily move the data somewhere else.

3) Writing into a container’s writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance compared to using volumes, which write directly to the host filesystem.
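
To see the first point in practice, the sketch below writes a file into a container's writable layer, removes the container, and then looks for the file in a fresh container. The `alpine` image, the container name `scratch-demo`, and the function name are assumptions made for illustration; they are not part of the todo app used later. The commands are wrapped in a function so that nothing runs until you call it.

```shell
# Sketch: data in the writable layer disappears with the container.
# Assumes the public `alpine` image; `scratch-demo` is a made-up container name.
writable_layer_demo() {
  # Write a file into the container's writable layer, then let the container exit.
  docker run --name scratch-demo alpine sh -c 'echo "buy milk" > /todo.txt'

  # Removing the container also discards its writable layer.
  docker rm scratch-demo

  # A fresh container starts from the image contents only,
  # so the file is expected to be missing here.
  docker run --rm alpine cat /todo.txt || echo "todo.txt is gone"
}
```

Calling `writable_layer_demo` on a machine with Docker installed should end with the fallback message, because the second container never saw the file.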


How can we persist data?

Docker has two options for containers to store files on the host machine, so that the files are persisted even after the container stops: volumes and bind mounts. If you’re running Docker on Linux you can also use a tmpfs mount. If you’re running Docker on Windows you can also use a named pipe.


Volumes

Volumes are stored in a part of the host filesystem which is created and managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem.

When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container.

A given volume can be mounted into multiple containers simultaneously. When no running container is using a volume, the volume is still available to Docker and is not removed automatically.
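
As a rough illustration of sharing, the sketch below mounts one volume into two containers. For simplicity the containers run one after the other, but the same `-v` flag works for containers that run at the same time. The volume name `shared-data`, the `alpine` image, and the function name are assumptions, not part of the todo example.

```shell
# Sketch: two containers mounting the same volume.
# `shared-data` is a hypothetical volume name; `alpine` is an assumed image.
share_volume_demo() {
  docker volume create shared-data

  # The first container writes a file into the volume and exits.
  docker run --rm -v shared-data:/data alpine sh -c 'echo "hello" > /data/note.txt'

  # A second container mounts the same volume and reads the file back.
  docker run --rm -v shared-data:/data alpine cat /data/note.txt

  # The volume stays on the host until it is removed explicitly.
  docker volume rm shared-data
}
```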

We can create volumes in two ways:

1) Anonymous volumes - These are not given an explicit name when they are first mounted into a container, so Docker gives them a random name that is guaranteed to be unique within a given Docker host. An anonymous volume is removed automatically only when we run the container with the --rm option. Otherwise it stays on the host, and a new anonymous volume is created each time we re-create the container.

2) Named volumes - As the name suggests, we assign a name to the volume. These volumes are not deleted when we stop or remove the container. Because the name stays the same, a re-created container reuses the existing volume instead of getting a new one.
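
The difference between the two kinds can be sketched as follows. Both containers are started with `--rm`, so the anonymous volume should disappear with its container while the named one should remain. The `alpine` image and the function name are assumptions; `todo-db` and `/app/todos` follow the todo example used in this post.

```shell
# Sketch: anonymous vs. named volumes under --rm.
# Assumes the public `alpine` image instead of the real todo app.
anonymous_vs_named_demo() {
  # Anonymous volume: only a container path is given. With --rm, Docker
  # removes the anonymous volume together with the container.
  docker run --rm -v /app/todos alpine sh -c 'echo "task" > /app/todos/todo.txt'

  # Named volume: created on first use and kept after the container is removed.
  docker run --rm -v todo-db:/app/todos alpine sh -c 'echo "task" > /app/todos/todo.txt'

  # todo-db should be listed; the anonymous volume from the first run should not.
  docker volume ls
}
```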

Volumes are often a better choice than persisting data in a container’s writable layer, because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.

Volumes also support the use of volume drivers, which allow you to store your data on remote hosts or cloud providers, among other possibilities. To specify volume driver options, we need to use --mount.

Let us take a simple todo application that saves todo list items in /app/todos/todo.txt. If we keep this file in the container filesystem, our todo list will be wiped clean every time the container is removed and re-created. So we are going to use volumes instead.

docker volume create todo-db

Creates a volume named todo-db.

docker volume ls

Lists all created/available volumes.

docker volume inspect todo-db

Shows the details of the volume, such as its driver and its mount point on the host.

docker volume rm todo-db

Removes the given volume.

docker volume prune

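Removes all unused volumes, that is, volumes not referenced by any container. Docker asks for confirmation first. In recent Docker versions only unused anonymous volumes are removed unless the --all option is given.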
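
Before removing anything, it can help to look at what is there. The sketch below prints the host directory behind `todo-db` and lists the volumes that no container references. The function name is made up for illustration; `--format` and `--filter dangling=true` are standard options of the `docker volume` commands.

```shell
# Sketch: look before you clean up.
volume_cleanup_preview() {
  # Print only the host directory that backs the volume.
  docker volume inspect --format '{{ .Mountpoint }}' todo-db

  # List volumes that are not referenced by any container.
  docker volume ls --filter dangling=true
}
```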

Bind mounts

Bind mounts may be stored anywhere on the host system, and we manage them ourselves. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.

When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its absolute path on the host machine.

Using -v or --volume option

docker run -d --name devtest \
-v todo-db:/app/todos todo-app:latest
- Named volume

All files that are created in /app/todos will be captured in todo-db volume

docker run -d --name devtest \
-v /app/todos todo-app:latest
- Anonymous volume

All files that are created in /app/todos will be captured in an anonymous volume created by Docker.

We can also create a volume using the VOLUME instruction in the Dockerfile, for example VOLUME ["/app/todos"]. Volumes created this way are always anonymous; we cannot create a named volume with this instruction.

docker run -d --name devtest \
-v "$(pwd)"/todo-mount:/app/todos todo-app:latest
- Bind mount

-v or --volume: Consists of up to three fields, separated by colon characters (:). The fields must be in the correct order.

  • In the case of named volumes, the first field is the name of the volume, which is unique on a given host machine. For bind mounts, it is the path to the file or directory on the host. For anonymous volumes, the first field is omitted.
  • The second field is the path where the file or directory is mounted in the container.
  • The third field is optional, and is a comma-separated list of options, such as ro. For example, `-v todo-db:/app/todos:ro` mounts the volume as read-only (by default the volume has read and write access).

If you use -v or --volume to bind-mount a file or directory that does not yet exist on the Docker host, -v creates the endpoint for you. It is always created as a directory.
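
To make the field order concrete, here is a small helper that splits a `-v` value the way the list above describes. It is a simplified sketch, not how Docker itself parses the flag: the function name `describe_volume_spec` is made up, a first field starting with `/` is assumed to be a bind mount, and Windows-style paths with drive letters are not handled.

```shell
# Sketch: split a -v / --volume value into its fields (POSIX sh).
# Hypothetical helper for illustration; Docker's own parsing handles more cases.
describe_volume_spec() {
  spec=$1

  # No colon at all: only a container path was given, so it is an anonymous volume.
  case "$spec" in
    *:*) ;;
    *) printf 'anonymous volume -> %s\n' "$spec"; return 0 ;;
  esac

  src=${spec%%:*}   # first field: volume name or host path
  rest=${spec#*:}   # everything after the first colon

  # The optional third field holds comma-separated options such as ro.
  case "$rest" in
    *:*) target=${rest%%:*}; options=${rest#*:} ;;
    *) target=$rest; options=none ;;
  esac

  # Assumption: an absolute host path means a bind mount, anything else a named volume.
  case "$src" in
    /*) kind="bind mount" ;;
    *) kind="named volume" ;;
  esac

  printf '%s: %s -> %s (%s)\n' "$kind" "$src" "$target" "$options"
}
```

For example, `describe_volume_spec todo-db:/app/todos:ro` prints `named volume: todo-db -> /app/todos (ro)`, and `describe_volume_spec /app/todos` prints `anonymous volume -> /app/todos`.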

Using --mount option

docker run -d --name devtest \
--mount type=volume,source=todo-db,target=/app/todos \
todo-app:latest
- Volume

docker run -d --name devtest \
--mount type=bind,source="$(pwd)"/todo-mount,target=/app/todos \
todo-app:latest
- Bind

--mount: Consists of multiple key-value pairs, separated by commas, each written as <key>=<value>.

  • The type of the mount, which can be bind, volume, or tmpfs.
  • The source of the mount. For named volumes, this is the name of the volume; for bind mounts, it is the path to the file or directory on the Docker daemon host. May be specified as source or src.
  • The destination takes as its value the path where the file or directory is mounted in the container. May be specified as destination, dst, or target.
  • The readonly option, if present, causes the bind mount to be mounted into the container as read-only.
  • The bind-propagation option, if present, changes the bind propagation. May be one of rprivate, private, rshared, shared, rslave, slave.

If you use --mount to bind-mount a file or directory that does not yet exist on the Docker host, Docker does not automatically create it for you, but generates an error.
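
One way to avoid that error is to create the host directory before starting the container. The sketch below does this for the `todo-mount` directory from the earlier example; the function name and the `host_dir` variable are assumptions made for illustration.

```shell
# Sketch: make sure the host path exists before bind-mounting it with --mount.
run_with_bind_mount() {
  host_dir="$(pwd)/todo-mount"

  # --mount does not create a missing host path, so create it up front.
  mkdir -p "$host_dir"

  docker run -d --name devtest \
    --mount type=bind,source="$host_dir",target=/app/todos \
    todo-app:latest
}
```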

tmpfs mount

As opposed to volumes and bind mounts, a tmpfs mount is temporary and is stored only in the host's memory. When the container stops, the tmpfs mount is removed, and files written there won’t be persisted. It can be used by a container during the lifetime of the container, to store non-persistent state or sensitive information.

This is useful to temporarily store sensitive files that you don’t want to persist in either the host or the container writable layer.

This is available only if you’re running Docker on Linux.

This can be achieved using --mount or --tmpfs.

docker run -d -it --name devtest \
--mount type=tmpfs,destination=/app/todos todo-app:latest

docker run -d -it --name devtest \
--tmpfs /app/todos todo-app:latest
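
With --mount, a tmpfs mount can also be given a size limit and a file mode. The following sketch, which only works on Linux, caps the mount at 64 MiB, writes a file to it, and prints the filesystem usage. The `alpine` image, the function name, and the chosen size and mode values are assumptions for illustration.

```shell
# Sketch: a size-limited tmpfs mount (Linux only).
# Assumes the public `alpine` image; size and mode values are arbitrary examples.
tmpfs_demo() {
  # tmpfs-size is given in bytes (64 MiB here); tmpfs-mode is an octal file mode.
  docker run --rm \
    --mount type=tmpfs,destination=/app/todos,tmpfs-size=67108864,tmpfs-mode=1770 \
    alpine sh -c 'echo "secret" > /app/todos/token && df /app/todos'
}
```

Because the container is started with `--rm`, both the container and the in-memory files are gone as soon as the command finishes.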

Source: Docker

To learn about Docker communication and networking, check out Networking.

Thanks for reading!!
