Database (Schema) migration to Kubernetes - initContainers vs k8s jobs -

Ahmed Hanafy

Introduction

At trackxy we were migrating our microservices, each previously managed by its own team, onto our Kubernetes cluster, where all of our services are managed in one place.

Problem

  • First, we need to know what schema migration is

schema migration (also database migration, database change management) refers to the management of incremental, reversible changes and version control to relational database schemas. A schema migration is performed on a database whenever it is necessary to update or revert that database's schema to some newer or older version.
Migrations are performed programmatically by using a schema migration tool. When invoked with a specified desired schema version, the tool automates the successive application or reversal of an appropriate sequence of schema changes until it is brought to the desired state.
source

  • So, as mentioned above, schema migration is a good thing, right? Yes it is. When we have a single instance of the program running the schema migration, we are free to migrate whenever we need to, and we don't have to worry about syncing or about which instance migrates first. But when there is more than one instance, we face a race condition and the problem of deciding which instance should run the migration.

Brainstorming

In this section, we'll walk you through our thinking process, the pros and cons of each solution, and why we chose the one we did.

Encapsulating the migration inside the image

  • We first thought about encapsulating the migration process inside the image itself, but there are three reasons why we didn't go for this option (a rough sketch of this rejected approach follows the list):
    1. SoC (Separation of Concerns): it is not the image's duty to migrate the database.
    2. Race condition: several replicas of the image may need to migrate at the same time, and the order in which Kubernetes creates pods can't be controlled.
    3. Ambiguous behavior: multiple replicas trying to migrate at the same time lead to behavior we couldn't track. The Deployment controller creates pods in arbitrary order, so every pod can start the migration process and leave the database inconsistent.
  • For the reasons above, we decided not to go forward with this solution.

(Diagram: schema migration with multiple replicas reaching the same database. It shows the race condition that happens when three replicas try to migrate the schema at the same time.)
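As an illustration only, here is a minimal sketch of what this rejected option would have looked like: a hypothetical entrypoint script baked into the image that runs the migration before starting the app (the script and the start command are placeholders, not our actual setup).

```sh
#!/bin/sh
# entrypoint.sh (hypothetical): every replica runs this on startup,
# so every pod races to apply the same migrations against the shared database.
set -e

php artisan migrate --force   # apply pending migrations
exec php-fpm                  # then start the application server
```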

InitContainers

  • An initContainer is part of the pod spec and runs before the app container in every pod of the Deployment, so each replica would still try to run the migration. It is nearly the same as the previous solution and has the same cons (a minimal sketch follows).
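For reference, a minimal sketch (not our production manifest; the image name and values are placeholders) of what the initContainer variant would look like inside a Deployment's pod template:

```yaml
# Sketch only: with 3 replicas, the initContainer runs in each of the 3 pods.
spec:
  replicas: 3
  template:
    spec:
      initContainers:
        - name: database-migration
          image: my-laravel-app:latest        # placeholder image
          args: ["php", "artisan", "migrate", "--force"]
      containers:
        - name: app
          image: my-laravel-app:latest        # placeholder image
```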

Kubernetes job

A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete. Deleting a Job will clean up the Pods it created.

source

So, as mentioned above, a Kubernetes Job creates a pod that does a specific task and then terminates without restarting. That's exactly what we needed: run our migration once, and in one pod only.

Our Solution

Prerequisites

  • This solution assumes you have a good background in Docker, Kubernetes, and Helm.
  • You have kubectl on your local machine, configured against your cluster.
  • You have a Kubernetes cluster up and running (configuring a k8s cluster is not discussed here).
  • We will use PHP/Laravel to demonstrate our solution:
    • We have a PHP/Laravel pod.
    • We configured our database connection in the Laravel app.
  • Our database is up and running (either inside or outside the cluster).

Use case

  • We need to add a new table to our database:

```
php artisan make:migration create_flights_table
```
  • The migration should be applied and the PHP/Laravel image redeployed; this is where our solution comes in.

Configuration

  • We eventually went with the Kubernetes Job solution.
  • This is an example of what we did to migrate the database.
  • You need a values.yaml file containing the values we reference here through Helm.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "APPLICATION.fullname" . }}-database-migration-job
spec:
  template:
    spec:
      containers:
        - name: {{ include "APPLICATION.fullname" . }}-database-migration-job
          # Image and tag come from values.yaml
          image: "{{ .Values.databaseMigration.image }}:{{ .Values.databaseMigration.imageTag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          # Run the Laravel migration command instead of the normal app entrypoint
          args:
            - php
            - artisan
            - migrate
          env:
            - name: ENV_VARIABLE
              value: "false"
          # Reuse the application's ConfigMap and the Laravel secrets so the Job
          # connects to the database with the same settings as the app
          envFrom:
            - configMapRef:
                name: {{ include "APPLICATION.fullname" . }}
            - secretRef:
                name: {{ .Values.laravel.secrets_names.laravel_secrets }}
      # Don't restart the pod; the Job controller tracks completion
      restartPolicy: Never
```
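Optionally, you may also want to bound retries and let Kubernetes clean up finished Jobs; these fields are not part of our original manifest, but the batch/v1 Job spec supports them:

```yaml
spec:
  backoffLimit: 3               # retry a failed migration pod at most 3 times
  ttlSecondsAfterFinished: 600  # garbage-collect the Job 10 minutes after it finishes
```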

Run

  • To run the job and deploy using Helm, run this in the chart's root directory:
```
helm upgrade --install -n [NAMESPACE] api .
```
  • To run the job using kubectl, you have to fill in the values yourself instead of using values.yaml:
```
kubectl apply -f job.yaml
```

Testing

  • You can see the Job and the pod it created like so:
```
kubectl get jobs
```


```
kubectl get pods
```
  • We can see that the Job's pod was created, ran, and stopped after completing.


  • We can check the pod's logs to make sure the migration ran successfully:

```
kubectl logs [pod-name]
```


Invoking the job

First, we need to know that Kubernetes Jobs don't rerun by default: we found that the Job has to be redeployed every single time you deploy. So we decided to redeploy it on every release through our CI/CD pipeline (outside the scope of this article; you can contact me for help setting up your pipeline). A rough sketch of such a step is shown below.
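As an illustration only (the job name, namespace, and chart path are placeholders, not our actual pipeline), such a CI step could delete the previous Job and redeploy the chart so the migration runs again; Kubernetes won't update an existing Job's pod template, so the old Job has to be removed first:

```sh
# Hypothetical CI step (names are placeholders)
kubectl delete job api-database-migration-job -n my-namespace --ignore-not-found
helm upgrade --install -n my-namespace api .
```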

Author

Written by Ahmed Hanafy. Feel free to contact me if you have any questions.

Discussion (7)

pjotre86

Hey!
Glad you found a mechanism which works for you. I know almost nothing about PHP/Laravel, but isn't there some kind of library which does migrations in a concurrency-safe manner, using exclusive locks and such? I would be surprised if not. If so, then I would suggest starting the migration script during startup of your application (not as a decoupled k8s job/init container, and also not as a start script in the Docker image). That means your application would also work everywhere without any extra setup script.
In case you have a migration which breaks backwards compatibility, you need downtime: tear down all the running instances and then spin up the new version which does the migration. But your approach has the same problem; there's no way around that.

Ahmed Hanafy (Author)

We assumed we won’t have breaking changes in the database and that older versions of services will use the newer version of the database without problems.
I have no idea if there are libraries that do so; if you can point me to one it’ll be very helpful.
Tbh I don’t understand your point about the setup script, can you explain it more for me?

pjotre86

Regarding the setup script:
The k8s job you created needs to run before you can start the application, so it's effectively a setup script. If you want to run your app somewhere else, let's say directly on your machine or with docker-compose, you need to run similar scripts there as well in order to make it work. That also means that your application is not self-sufficient. If you manage to do the db migrations within the startup of your application it becomes much more flexible, easier to use, less error-prone (because you don't need to maintain scripts for different runtimes), and easier to test (both manual and automated end-to-end tests are easier to set up when there's not this extra step of running the migration script every time). Just point the app to a db server and let it do the rest.

Regarding libraries:
As already mentioned, I'm not an expert on PHP/Laravel at all. But I saw in your profile that you have a Java background. In Java the de-facto standard library which does exactly this kind of stuff would be Liquibase. Maybe someone else here knows of an equivalent for this in the PHP world, or maybe you'll find something if you research a bit. I would be highly surprised if there's nothing like that, given the maturity of PHP...

Hope I could clarify my points a bit...

Ahmed Hanafy (Author)

You did clarify, thank you.
Regarding the db migration, we limited it to the production db only.
Our dev env has its migration managed another way (not sure how, but I think it’s handled using docker-compose).
Will check the Java library, thank you very much.
This is my first article and I’m still a junior SWE learning DevOps, so I hope my article (in which I put a lot of effort) helped you a bit.
Thanks

pjotre86

Thanks for the article. It made me think, that's always good. I'm not saying your approach is bad. In my mind I just applied it to the projects I'm working on and then those concerns came to my mind. But you're working on a different thing on a different stack so it might be no problem for you after all. Anyway, I guess most companies would be happy for a junior dev who comes up with such a solution!

Tarek N. Elsamni

You can even run the job as part of your Helm chart: set the job as a pre-hook in Helm, which will make sure the job runs before anything else and only goes forward with the deployment when the job succeeds.

NB: remember to create any secrets, configmaps, etc. that the job needs before the job itself runs (using a pre-hook with a lower hook weight than the job).
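For reference, a minimal sketch of the Helm hook annotations such a pre-hook Job would carry (not from the article's chart); Helm runs pre-install/pre-upgrade hooks before the rest of the release, ordered by ascending hook weight:

```yaml
metadata:
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "5"    # secrets/configmaps the job needs would use a lower weight
    "helm.sh/hook-delete-policy": before-hook-creation
```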

Ahmed Hanafy (Author)

Yes, I found this solution too, but we preferred to invoke the job from the CI pipelines instead.