Docker, and containers more broadly, have made executing workloads simpler and more portable. In orchestration systems like Kubernetes, abstractions like the Job and CronJob resources make scheduling containerized tasks familiar and highly observable, with all the strengths of the orchestrator's scheduling applied to them as to any other resource type (in this case, the Pod resources created by the Job manifest).
Let's take a fairly straightforward example of something Docker lets us do without installing additional software on the host machines running Docker. Assume you have a directory of video files on a volume, and you want to generate screenshots from those videos at random timestamps. The volume is accessible through some method (for our purposes, it's a locally accessible path at /mnt/videos).
A task like this requires the ffmpeg package, and we can accomplish it with a command like:

```shell
for v in `ls $DATA_PATH`; do ffmpeg -i $DATA_PATH/$v -ss 00:$MINUTE:$SECOND.000 -vframes 1 $CAPTURE_PATH/`head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13 ; echo ''`.jpg ; done
```
Here we loop over the files in $DATA_PATH, take a screenshot $MINUTE:$SECOND.000 into each video, and write it to a randomly named file in $CAPTURE_PATH.
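The random-filename piece of that pipeline can be tried on its own; a minimal sketch:

```shell
# Draw random bytes, keep only alphanumeric characters, truncate to 13
name=$(head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13)
echo "${name}.jpg"
```

Each run produces a different 13-character name, which is what keeps successive captures from overwriting each other.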
Because we want the task to execute in a container, we need to pass in those values and install the ffmpeg package in the image:

```dockerfile
FROM alpine:3.4

RUN apk update && \
    apk add ffmpeg

ENV DATA_PATH ""
ENV CAPTURE_PATH ""
ENV MINUTE ""
ENV SECOND ""

ENTRYPOINT for v in `ls $DATA_PATH`; do ffmpeg -i $DATA_PATH/$v -ss 00:$MINUTE:$SECOND.000 -vframes 1 $CAPTURE_PATH/`head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13 ; echo ''`.jpg ; done
```
build the image:

```shell
docker build -t capture-create .
```
and for a one-off task, we could run a command like this:

```shell
docker run -d \
  -v /mnt/videos/:/media \
  -v /mnt/videos/captures:/caps \
  -e MINUTE=$((10 + RANDOM % 40)) \
  -e SECOND=$((10 + RANDOM % 50)) \
  -e DATA_PATH=/media \
  -e CAPTURE_PATH=/caps \
  capture-create:latest
```
where $MINUTE and $SECOND are set arbitrarily at container creation time (you could also generate them inside the container on each run, but for our purposes, setting them on the Docker CLI is easiest); they form part of the timecode used to take the screenshot.
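To sanity-check the timestamp those variables produce, here's the generation step on its own (a sketch using bash's RANDOM; the second is kept below 60 so the timecode stays valid):

```shell
# Random minute between 10 and 49, random second between 10 and 59
MINUTE=$((10 + RANDOM % 40))
SECOND=$((10 + RANDOM % 50))
echo "00:$MINUTE:$SECOND.000"
```

Both values start at 10, so the resulting timecode never needs zero-padding.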
You might even wrap this command in a bash script and have your system execute that script like any other cron job, or, if you're running Docker in Swarm mode, use one of the extensions that provide this behavior across a cluster.
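For the host-cron route, the entry might look something like this (the script path is hypothetical; the script would contain the docker run command above):

```
# m h dom mon dow  command
0 */8 * * *  /usr/local/bin/capture-create.sh
```

This gets you scheduling, but none of the retry or observability behavior an orchestrator provides, which is where Kubernetes comes in.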
So, let's say you're doing this on Kubernetes. You might have volumes provisioned on a cloud provider, etc., but again, for our purposes, I'll just use a hostPath volume to expose the video and captures directories.
```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: capture-create
spec:
  schedule: "0 */8 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: capture-create
            image: capture-create:latest
            env:
            - name: DATA_PATH
              value: /media
            - name: CAPTURE_PATH
              value: /captures
            volumeMounts:
            - name: content-volume
              mountPath: /media
            - name: caps-volume
              mountPath: /captures
          volumes:
          - name: content-volume
            hostPath:
              path: /mnt/videos
              type: Directory
          - name: caps-volume
            hostPath:
              path: /mnt/videos/captures
              type: Directory
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: myregistrykey
```
So, for a job like this, you're setting it to run every 8th hour (0 */8 * * *) and using the same variables as before (your host-connected data and capture volume paths) to mount the volumes into the container. Kubernetes manages both the job execution and the cron behavior, with the added benefits of retry behavior and the suite of troubleshooting methods available to you in Kubernetes.
One benefit I particularly enjoy about Kubernetes is that, because every resource ultimately runs as a Pod in one form or another, you can, in the event of a CronJob failure, create a one-off Job from that failed job to troubleshoot further.
If, in this case, a scheduled CronJob in Kubernetes has failed, or has no Pods available to complete the task, you can run the task as a Job instead. To create it from the manifest of the existing CronJob, use the --from flag with kubectl to define the source job to be templated into the new Job:

```shell
kubectl create job my-job --from=cronjob/my-cron-job
```

which will proceed to schedule the pods for this task.
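From there, the usual inspection commands apply; a few I reach for, using the hypothetical job name from above:

```shell
# Watch the one-off Job and the Pods it creates
kubectl get job my-job
kubectl get pods --selector=job-name=my-job

# Read the task's output, and the Job's event history
kubectl logs job/my-job
kubectl describe job my-job
```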
Or you can create a fresh environment to test the contents of the job by creating a Pod that looks like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod
spec:
  containers:
  - name: debug-shell
    image: alpine:3.4
    command:
    - "/bin/sh"
    - "-c"
    - "sleep 36000"
  volumes:
  ...
```
and configure the volumes in it as we did in the CronJob definition. You can then connect to this container and run your commands step by step:

```shell
kubectl exec -ti debug-pod -c debug-shell -- /bin/sh
```
More on troubleshooting these sorts of workloads, and on scheduling jobs in Kubernetes more broadly, can be found here: