The Docker ecosystem is filled with leaky abstractions. The utopian vision of Docker containers is a world where a developer can grab a base container for a language, copy in their code with a minimal Dockerfile, and be ready to develop and deploy instantly.
Unfortunately, this landscape is filled with per-language gotchas that make this world a far cry from reality. Here are some of the wonky things I've run into when working with containers:
- When working with Python, you really can't use the Alpine-based images. Prebuilt Python wheels target glibc, while Alpine uses musl, so many packages fail to install or must be slowly compiled from source. For a lightweight container, you need to use a slim Debian image instead. This means you must already know that "Buster" and "Stretch" are Debian release names (because Debian is not in the tag name), and you must know which release is more recent.
- When developing using the official Python images, you can't get good autocomplete in your editor without extra steps, because the packages aren't in the mounted volume. You can use the VS Code Remote extension to actually edit files within the container, or you could use `venv` to colocate the dependencies with the code. However, using `venv` within the container won't give you proper autocomplete either, because the Python interpreter is inside the container itself.
- The official PHP image won't let you install PHP extensions from the Debian repos because its PHP executable is compiled from source. You must instead use the container's own `docker-php-ext-install` helper.
- The official PHP image does not come with PHP's Composer package manager, yet the Python, Node, and Ruby images all come with their respective package managers out of the box. You must find a way to programmatically install Composer.
- The official Node image comes preconfigured with a default unprivileged user. This is an incredible feature that I wish all images had; however, it is an image-specific feature that you must keep in mind when building a production Dockerfile.
- You cannot run as an unprivileged user with the official Nginx image. Nginx publishes a separate image that uses an unprivileged user; however, it is not marked with the official tag.
- The official MySQL image (and countless others) doesn't have a version for ARM processors, meaning it can't run on common inexpensive devices like the Raspberry Pi.
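As a concrete sketch of the slim-Debian workaround from the first bullet, a minimal Dockerfile might look like this (the Python version, tag, and app layout here are assumptions, not a prescription):

```dockerfile
# Slim Debian-based image: avoids Alpine's musl/wheel compilation issues
# while staying much smaller than the full python:3.9 image.
FROM python:3.9-slim-buster

WORKDIR /app

# Copy and install dependencies first so this layer is cached
# between code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "main.py"]
```

Note that "buster" in the tag is the Debian release name, which is exactly the kind of distro knowledge the tag scheme quietly assumes.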
A few months ago I had a goal of "learning Docker," but I've found that I've instead spent most of my time learning the individual base images and their quirks. This isn't totally surprising, since every language requires lots of domain-specific knowledge: it took me two weeks to figure out how to deploy my first Rails app on an Ubuntu VM. Still, Docker's infrastructure-as-code paradigm offers enormous advantages over configuring each development and deployment environment manually, which is what I did prior to adopting containerization.
However, it's worth acknowledging the complexities and pain points of working with Docker images. One does not "just containerize" an app: it's a process that involves a lot of learning for each language and base image.
Top comments (19)
I wouldn't call it a lot. What you mention about PHP, for example, is on the first Docker Hub page of PHP's official image.
Also, Docker (and most cloud native technologies) exists to enable DevOps, so that development teams can deliver their applications. If you're a Ruby developer, you sure need to learn the quirks of the official Ruby images (and perhaps how to build one from scratch), but definitely not those of every language's images.
Finally, there are many other base images depending on how you deliver applications: for Rails, for Symfony, for RoadRunner. One does not preemptively learn them all.
Those are all fair points.
Despite my rant in this post, I actually like Docker quite a bit.
At my last job, we had a 40-page server setup document in Google Docs with instructions on how to install all the dependencies the server needed. It was error-prone and disheartening, and onboarding new developers was an absolute nightmare.
Docker could have made that process a lot smoother. It just takes upfront time and investment, plus a LOT of fiddling. For simpler apps it can be overkill, though.
I have to try Docker. I've been avoiding it because of a sour experience with a Snap image when I tried to install an Ubuntu server with their Nextcloud image. It was trouble from the word go: it presented a read-only file system in which I could not even configure the web server it included, so I dumped it, installed Ubuntu cleanly, and installed Nextcloud the normal way ;-).
I run into more and more things I want that come in docker images (like the Joplin Server) and so I've also been wanting to learn some more about it. But to be honest many of your points make little sense to me. I had to Google "Alpine-based image" as that meant zero to me.
The reservation I have had with Docker images to date is that they seem, invariably and understandably, to include too much of the stack, notably a webserver and a database engine. And it makes zero sense to me to replicate those on a server that already has a webserver and a database engine running. So I've sort of parked Docker as suitable for a standalone server (and the Pi is a great platform for that, but again ARM compatibility must be there), or for an environment where the server is grunty and resourced enough to be running VMs for specific apps. Though that is a false economy anyhow, as it kind of downstreams the issue of needless replication: suddenly you're replicating not only the webserver and database engine but the OS too. Which is of course why containers rose in popularity, dumping at least the OS from the replication.
Thanks for reading my article, Bernd!
There's a lot to learn with Docker, and Docker gives you a lot of opportunities to shoot yourself in the foot along the way. I tried and gave up on Docker twice before it finally stuck.
Using a Dockerized database and webserver can be nice for development, but for a serious app I would always prefer using a managed database in production. For a recent app I just deployed, I do a reverse proxy from the server's Nginx to a containerized Nginx that has reverse proxies to Django and Next.js containers. It feels a little silly, so I may change that at some point 😅
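For illustration, the double-proxy setup described in that comment might look roughly like this Compose sketch (all service names, build paths, images, and ports are assumptions, not the commenter's actual configuration):

```yaml
# docker-compose.yml: a containerized Nginx fronting the app containers.
# The host machine's own Nginx would reverse-proxy to port 8080.
services:
  nginx:
    image: nginx:stable
    ports:
      - "8080:80"          # host Nginx forwards traffic here
    depends_on:
      - django
      - nextjs
  django:
    build: ./backend       # assumed Django app directory
    expose:
      - "8000"             # reachable only by other services
  nextjs:
    build: ./frontend      # assumed Next.js app directory
    expose:
      - "3000"
```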
Thanks for being brave enough to put this out there Tyler. I hope other developers take courage from this and honestly assess their tool chain this way. It seems that there's a heavy dose of evangelism that surrounds everybody's favorite tech stack, when of course, in reality, there's often just as many demerits as merits.
My approach to new tools, protocols, languages, frameworks, etc. is to begin with the simplest of questions: what problem does it solve, and do I have that problem?
Thanks for reading, Joe! I'm always happy to share my experiences. I agree that there's a heavy dose of evangelism around toolchains.
I'm personally a big fan of Docker. However, I had honestly never heard any concrete examples of downsides to Docker. I'd just occasionally hear "it can be complicated."
My hope is that posts like this can surface what kinds of concrete challenges developers may run into when they start containerizing their applications.
Using Docker as a development environment has me intrigued, but it also seems a little overkill. At one point before Docker, I was given a VM with a dev environment.
As a tool for a build machine (pipeline), docker is awesome. I also recently set up a Jenkins server using docker and I think this will make backups much easier. (ultimately using Linux over Windows is a huge improvement).
One thing that was really nice was that I could practice setting up a Jenkins server in Docker on a Windows machine. I could then take that knowledge and stand it up on the real server.
There are definitely challenges creating a good docker infrastructure, but having it seems to provide great power.
Docker definitely has its pros and cons. I'm generally a fan: it solves real problems I've encountered. But it front-loads a ton of configuration work at the beginning of a project that you normally wouldn't need to think about until it was time to deploy an app.
It does have its upsides. Where it really shines is the ability to add services to the app. Installing ElasticSearch locally looks painful, but it looks okay with Docker. I've built a few apps that have an accompanying Node app to generate images; Docker makes bringing all of that up at once really easy.
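As a sketch of the Elasticsearch-under-Docker convenience mentioned above, a development setup might look something like this Compose fragment (the image tag and settings are assumptions; single-node mode is only appropriate for local development):

```yaml
# docker-compose.yml: run Elasticsearch for local development
# without installing it on the host.
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    environment:
      - discovery.type=single-node   # skip cluster bootstrapping
    ports:
      - "9200:9200"                  # REST API on the host
```

Compared to a native install (JVM, config files, service management), `docker compose up` brings the whole thing up in one command.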
I do wish I could jump into the coding parts of projects with Docker faster though...
This is a great article ... BTW, to nitpick, it was "one does not simply ..." in LOTR ;)
Thank you for the kind words!
"One does not simply express one's appreciation" ... but then again, one does!
I basically agree with what you're saying.
Containers were never meant to make installing dependencies easier on the first go. It definitely requires domain-specific knowledge of the application code you are putting into containers, some Linux and dependency-related knowledge, and, most importantly, an understanding of each and every dependency you define. I have observed developers installing ngrok, puppeteer, pm2, and nodemon all together as Node packages, which is not right. With a well-defined runtime configuration, containers make it easy to package and ship applications, or just the runtime environment, for other developers. And to save yourself from all this, try Cloud Native Buildpacks.
Hey Dishant, thanks for reading my post.
I agree with everything you're saying. I wrote this because I heard a lot of developers talking about containerization like it was a technology that could solve all development and deployment problems. It's certainly solved several of my problems, but it's created a new class of problems in the process.
I wrote this post to highlight some of those problems for anyone who is just getting into containers so they know what kinds of challenges they might face.
I may check out Cloud Native Buildpacks, though.