K8s Pods: Image tags vs. Digest

#kubernetes #k8s #pods #devops

Image tags are a common source of frustration when managing Kubernetes pods. I have lost count of how many times I have seen a question like this:

I am unable to see my changes, even though I am referencing the image using the correct tag.

Or even worse:

I haven't changed the reference for the image, but its behavior changed.

These are normal problems to face, due to the basic design of Docker's image management.

But before diving in, let's go over the terminology that we'll be using.

Image Tag

A human-readable reference to an image, used to convey information about that variant. An image can carry multiple tags and can be referenced using any of them.

Properties:

Mutable
Unique within a repository at any given time (i.e., a tag points to only one image; if the same tag is applied to several images, it references the last image that was tagged)
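
To make the two properties concrete, here is a minimal sketch that models a repository's tags as a plain dictionary. The tag_image helper and the digest values are made up for illustration; a real registry does much more, but the pointer-like behavior is the same idea.

```python
# Toy model of a single image repository: tags are just movable pointers.
# The digests below are made-up placeholders, not real image digests.
repository_tags = {}

def tag_image(tags, tag, digest):
    """Point `tag` at `digest`, silently replacing any earlier target."""
    tags[tag] = digest

tag_image(repository_tags, "latest", "sha256:aaa111")  # first build
tag_image(repository_tags, "v1", "sha256:aaa111")      # same image, second tag
first_seen = repository_tags["latest"]

tag_image(repository_tags, "latest", "sha256:bbb222")  # a newer build reuses the tag
second_seen = repository_tags["latest"]

print(first_seen == second_seen)  # False: same tag, different image
print(repository_tags["v1"])      # sha256:aaa111, this tag was never moved
```

Nothing warns the reader of latest that the pointer moved, which is exactly the surprise described in the questions above.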

Image Digest

Hash reference to an Image.

Properties:

Immutable
Created when pushing an Image to the Repository for the first time.
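
In Docker and OCI registries, the digest is a SHA-256 hash computed over the image manifest, which is why it cannot be moved the way a tag can. The sketch below shows the idea with Python's hashlib; the manifests are hypothetical and heavily simplified (real manifests carry more fields), and manifest_digest is an illustrative helper, not a registry API.

```python
import hashlib
import json

def manifest_digest(manifest_bytes: bytes) -> str:
    """Return a digest string in the `sha256:<hex>` form used for images."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()

# Hypothetical, heavily simplified manifests: real ones carry more fields.
manifest_v1 = json.dumps({"layers": ["layer-a", "layer-b"]}, sort_keys=True).encode()
manifest_v2 = json.dumps({"layers": ["layer-a", "layer-c"]}, sort_keys=True).encode()

digest_v1 = manifest_digest(manifest_v1)
digest_v2 = manifest_digest(manifest_v2)

print(digest_v1 == manifest_digest(manifest_v1))  # True: same bytes, same digest
print(digest_v1 == digest_v2)                     # False: changed content, new digest
```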

General Practice

Programmers tend to reference images by tag, for the obvious reason of better readability, but in doing so they miss an important blind spot: a tag that referenced image I1 at time T1 may reference image I2 at time T2.

This generally happens when the programmer relies on generic tags such as latest, or through human error, when an image ends up tagged with a tag that is already in use.

Use Cases

Using latest to always point to the latest version. Other self-explanatory tags like qa / dev / prod, and versioning tags like x.y.z for maintaining versions.

Non-deterministic Deployment

When an image is referenced by tag, there is no guarantee that the tag refers to the expected image, because tags are mutable.

Another drawback of non-deterministic deployment is that infrastructure tools that depend on state changes cannot detect a change when a generic, unversioned tag is used, or when a tag has been mutated through human error.
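
The state-change problem can be sketched in a few lines. The needs_update function below is a hypothetical stand-in for how a state-based tool decides whether to act (it is not taken from any specific tool), and the image names and digests are placeholders.

```python
def needs_update(current_spec: dict, desired_spec: dict) -> bool:
    """A state-based tool only acts when the two specs differ."""
    return current_spec != desired_spec

# Tag-only reference: the image behind `dev` was rebuilt, but the spec text
# is identical, so no change is detected.
current = {"image": "example/app:dev"}
desired = {"image": "example/app:dev"}
tag_only_detected = needs_update(current, desired)

# Digest-pinned reference: a rebuilt image has a new digest, so the spec
# itself changes. Both digests are made-up placeholders.
current_pinned = {"image": "example/app:dev@sha256:aaa111"}
desired_pinned = {"image": "example/app:dev@sha256:bbb222"}
pinned_detected = needs_update(current_pinned, desired_pinned)

print(tag_only_detected)  # False
print(pinned_detected)    # True
```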

To avoid non-deterministic deployment, we should use an immutable image identifier: the image digest. But this way we lose human readability. To get both human readability and deterministic deployment, we can refer to the image using both the image tag and the image digest. Docker lets us combine the two in a single image reference.

Image Reference: <image_name>:<image_tag>@<image_digest>
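
As a rough illustration of that format, the sketch below splits a reference into its name, tag, and digest. The parser is deliberately simplified and the ImageReference type is introduced here only for illustration; it is not the grammar Docker itself uses. To the best of my knowledge, when both a tag and a digest are present the image is pulled by digest, so the tag mainly serves as documentation for the reader.

```python
from typing import NamedTuple, Optional

class ImageReference(NamedTuple):
    name: str
    tag: Optional[str]
    digest: Optional[str]

def parse_reference(reference: str) -> ImageReference:
    """Split `<name>[:<tag>][@<digest>]` into its parts (simplified)."""
    name_and_tag, _, digest = reference.partition("@")
    # A colon only marks a tag if it comes after the last slash; otherwise it
    # belongs to a registry port such as `localhost:5000/app`.
    last_slash = name_and_tag.rfind("/")
    colon = name_and_tag.rfind(":")
    if colon > last_slash:
        name, tag = name_and_tag[:colon], name_and_tag[colon + 1:]
    else:
        name, tag = name_and_tag, None
    return ImageReference(name, tag, digest or None)

# Placeholder image name and digest, used only to exercise the parser.
parsed = parse_reference("example/app:dev@sha256:aaa111")
print(parsed.name)    # example/app
print(parsed.tag)     # dev
print(parsed.digest)  # sha256:aaa111
```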

Deployment Image with Tag vs Digest

Different ways of referencing

Accessing a resource from a collection of similar resources is always easier when there is versioning attached to it. It gives users better control over the specific resource they are using. This works well when the versions attached to the resources are immutable. But at times, situations arise where one wants even more readability, for example, always using the latest version of the resource, or shipping a small patch without changing the version. In such cases, we need the tags to be mutable.

This is the usability around which K8s Pods were built. Mutable tags make named versions possible, which makes a lot of sense because it keeps things easier in the pipelines: take the latest image, use the dev image, or ship a fix so small that it doesn’t require a version update.

So the versions, a.k.a. tags, are mutable. Does that mean there is nothing immutable that can be used to get the same image again and again? This is where the digest comes into the picture. It is a hash computed from the image's content (its manifest, which in turn identifies the configuration and layers that make up the image), so it is unique for every different image. The digest can be used to reference the same image again and again.

Now let’s get into some techie stuff

Why are tags important?

Tags are human-readable references to an image, used to convey information about that variant. An image can carry multiple tags and can be referenced using any of them. Importantly, a tag can be moved to point to a different image.

What is Non-deterministic Deployment?

When an image is referenced by tag, there is no guarantee that the tag refers to the expected image, because tags are mutable.

This is a drawback for infrastructure tools that depend on state changes: when the reference is unchanged, the tools cannot tell whether the tag now refers to a different image.

So, how do we make this deterministic?

To get the best of both worlds, that is, a human-readable reference as well as deterministic deployments, there needs to be a dynamic mapping between image tag and image digest that can be used when deploying images. When pushing to a repository, we tag the image and then, once the digest has been created, use that digest to deploy the image.
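
One way to picture that mapping is a small step that runs at deploy time: look up the digest the tag currently points to, then write the pinned reference into the Pod manifest. The sketch below assumes a hypothetical in-memory registry; pin_image, pod_manifest, resolve_from_fake_registry, the image name, and the digest are all illustrative, and a real pipeline would query its registry instead.

```python
import json
from typing import Callable

def pin_image(name: str, tag: str, resolve: Callable[[str, str], str]) -> str:
    """Return `<name>:<tag>@<digest>` using the digest the tag points to now."""
    return f"{name}:{tag}@{resolve(name, tag)}"

def pod_manifest(pod_name: str, image: str) -> dict:
    """Build a minimal Pod manifest that runs a single container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"containers": [{"name": pod_name, "image": image}]},
    }

# Hypothetical stand-in for a registry lookup; the digest is a placeholder.
fake_registry = {("example/app", "dev"): "sha256:aaa111"}

def resolve_from_fake_registry(name: str, tag: str) -> str:
    return fake_registry[(name, tag)]

pinned = pin_image("example/app", "dev", resolve_from_fake_registry)
manifest = pod_manifest("app", pinned)
print(json.dumps(manifest, indent=2))
```

The tag stays in the reference for the humans reading the manifest, while the digest fixes which image actually runs. Since kubectl accepts JSON as well as YAML, the printed manifest could be saved to a file and applied as-is.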

How can Opta help?

What is Opta?

Opta is a platform for running containerised workloads in the cloud. It abstracts away the complexity of networking, IAM, K8s, and various other components - giving you a clean, cloud-agnostic interface to deploy and run your containers. It’s all configuration-driven, so you always get a repeatable copy of your infrastructure.

How does Opta help?

Opta's deploy option, given an image tag as input, automatically creates the dynamic mapping for the image uploaded to the repository and uses the image digest to deploy it.

For more information on how to use Opta, use this: https://docs.opta.dev/getting-started/