Docker, and containers more broadly, have made executing workloads simpler and more portable. In orchestration systems like Kubernetes, abstractions like the CronJob resource make scheduling containerized tasks familiar and highly observable, with all the strengths of the orchestrator's scheduling and retry behavior applied as they would be to any other resource type (in this case, the Pod resources created by the CronJob's underlying Job).
Let's take a fairly straightforward example of something Docker allows us to do without installing additional software on the host machines running Docker. Assume you have a directory of video files on a volume, and you want to generate, at random intervals, screenshots from those videos. The volume is accessible through some method (for our purposes, a locally accessible volume at `/mnt/videos`).
A task like this requires the `ffmpeg` package, and we can accomplish it by running a command like:
```shell
for v in `ls $DATA_PATH`; do
  ffmpeg -i $DATA_PATH/$v -ss 00:$MINUTE:$SECOND.000 -vframes 1 \
    $CAPTURE_PATH/`head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13 ; echo ''`.jpg
done
```
Here we loop over the files in `$DATA_PATH`, take a screenshot `$MINUTE:$SECOND.000` into each video, and write it to a randomly named file in `$CAPTURE_PATH`.
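The backticked pipeline in that command is what produces the random filename; as a standalone sketch (using the same 13-character length as above):

```shell
# /dev/urandom supplies random bytes; tr keeps only alphanumerics;
# the final head trims the result to 13 characters.
name=$(head /dev/urandom | tr -dc 'A-Za-z0-9' | head -c 13)
echo "$name.jpg"
```

Each run yields a different 13-character alphanumeric name, so repeated captures won't overwrite one another.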
Because we want the task to execute in a container, we need to pass those values in and install the `ffmpeg` package in the image:
```dockerfile
FROM alpine:3.4

RUN apk update ; \
    apk add ffmpeg

ENV DATA_PATH ""
ENV CAPTURE_PATH ""
ENV MINUTE ""
ENV SECOND ""

ENTRYPOINT for v in `ls $DATA_PATH`; do ffmpeg -i $DATA_PATH/$v -ss 00:$MINUTE:$SECOND.000 -vframes 1 $CAPTURE_PATH/`head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13 ; echo ''`.jpg ; done
```
Build the image:
```shell
docker build -t capture-create .
```
and for a one-off task, we could run a command like this:
```shell
docker run -d \
  -v /mnt/videos/:/media \
  -v /mnt/videos/captures:/caps \
  -e MINUTE=$((10 + RANDOM % 40)) \
  -e SECOND=$((10 + RANDOM % 50)) \
  -e DATA_PATH=/media \
  -e CAPTURE_PATH=/caps \
  capture-create:latest
```
`$MINUTE` and `$SECOND` are arbitrarily set at the time of container creation (you could generate them inside the container on each run, but for our purposes, setting them on the Docker CLI is easiest), and they are used to build the timecode at which the screenshot is taken.
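The arithmetic here relies on bash's `$RANDOM` builtin; a quick sketch of how the modulus bounds the values (note that using a modulus of 50 for the seconds keeps that field at or below 59, a valid timecode):

```shell
# $RANDOM is a bash builtin: a pseudo-random integer in 0..32767.
MINUTE=$((10 + RANDOM % 40))   # always lands in 10..49
SECOND=$((10 + RANDOM % 50))   # always lands in 10..59
echo "timecode: 00:$MINUTE:$SECOND.000"
```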
You might even wrap this command in a bash script and have your system execute it like any other cron job, or, if you're running Docker in Swarm mode, use one of the extensions that provide this behavior across a cluster.
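As a sketch of that wrapper approach (the script path, schedule, and volume paths here are illustrative assumptions, not anything prescribed above):

```shell
# Write a hypothetical wrapper script that picks a random timecode
# and launches the one-off capture container.
cat > /tmp/capture.sh <<'EOF'
#!/bin/bash
docker run -d \
  -v /mnt/videos/:/media \
  -v /mnt/videos/captures:/caps \
  -e MINUTE=$((10 + RANDOM % 40)) \
  -e SECOND=$((10 + RANDOM % 50)) \
  -e DATA_PATH=/media \
  -e CAPTURE_PATH=/caps \
  capture-create:latest
EOF
chmod +x /tmp/capture.sh

# A crontab entry running it every 8 hours would then look like:
#   0 */8 * * * /tmp/capture.sh
```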
So, let's say you're doing this on Kubernetes: you might have volumes provisioned by a cloud provider, etc., but again, for our purposes, I'll just use hostPath volumes to demonstrate the video and captures directories.
```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: capture-create
spec:
  schedule: "0 */8 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: capture-create
            image: capture-create:latest
            env:
            - name: DATA_PATH
              value: /media
            - name: CAPTURE_PATH
              value: /captures
            volumeMounts:
            - name: content-volume
              mountPath: /media
            - name: caps-volume
              mountPath: /captures
          volumes:
          - name: content-volume
            hostPath:
              path: /mnt/videos
              type: Directory
          - name: caps-volume
            hostPath:
              path: /mnt/videos/captures
              type: Directory
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: myregistrykey
```
So, for a job like this, you're setting it to run every 8th hour (the `0 */8 * * *` schedule) and using the same variables as before (your host-connected data and capture volume paths) to mount those directories into the container. In Kubernetes, both the job execution and the cron behavior are managed for you, with the added benefit of retry behavior and the suite of troubleshooting methods Kubernetes makes available to you.
One benefit I particularly enjoy about Kubernetes is that, because every workload ultimately runs as a Pod in one form or another, you can, in the event of a CronJob failure, create a one-off Job from that failed job to troubleshoot further.
If a scheduled CronJob in Kubernetes has failed, or has no Pods available to complete the task, you can run it as a Job instead.
To create one from the manifest of the existing CronJob, you can use the `--from` flag with kubectl to define the source job to be templated for the new Job:
```shell
kubectl create job my-job --from=cronjob/my-cron-job
```
which will proceed to schedule the Pods for this task. You can also just create a fresh environment to test the contents of the job by creating a Pod that looks like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod
spec:
  containers:
  - name: debug-shell
    image: alpine:3.4
    command:
    - "/bin/sh"
    - "-c"
    - "sleep 36000"
  volumes:
  ...
```
and configure the volumes in it as we did in the CronJob definition. You can then just connect to this container and run your commands step-by-step:
```shell
kubectl exec -ti debug-pod -c debug-shell -- /bin/sh
```
More on troubleshooting these sorts of workloads, and on scheduling jobs in Kubernetes more broadly, can be found here: