DEV Community

Cover image for How to deploy a high availability (HA) Postgres cluster in Kubernetes?
Dmitry Romanoff
Dmitry Romanoff

Posted on

How to deploy a high availability (HA) Postgres cluster in Kubernetes?

Here's an example of using the PostgreSQL Operator to deploy a high availability PostgreSQL cluster in Kubernetes. In this example, I'll be using the PostgreSQL Operator from Zalando [https://github.com/zalando/postgres-operator] to deploy a PostgreSQL cluster with two nodes.

The Postgres Operator can be installed by using the provided Helm chart which saves you the manual steps.

# add repo for postgres-operator
helm repo add postgres-operator-charts https://opensource.zalando.com/postgres-operator/charts/postgres-operator

# install the postgres-operator
helm install postgres-operator postgres-operator-charts/postgres-operator
Enter fullscreen mode Exit fullscreen mode

Starting the operator may take a few seconds. Check if the operator pod is running before applying a Postgres cluster manifest.

kubectl get pod -l app.kubernetes.io/name=postgres-operator

dmi@dmi-laptop:~/pg_ha_k8s$ kubectl get pod -l app.kubernetes.io/name=postgres-operator
NAME                                 READY   STATUS    RESTARTS   AGE
postgres-operator-6c5657ccd6-zx6dq   1/1     Running   0          61s
dmi@dmi-laptop:~/pg_ha_k8s$ 
Enter fullscreen mode Exit fullscreen mode

If the operator pod is running it listens to new events regarding postgresql resources. Now, it's time to submit your first Postgres cluster manifest.

Clone the repository zalando/postgres-operator and change to the directory

git clone https://github.com/zalando/postgres-operator.git
cd postgres-operator

cat manifests/minimal-postgres-manifest.yaml

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
spec:
  teamId: "acid"
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "15"

# create a Postgres cluster
kubectl create -f manifests/minimal-postgres-manifest.yaml

Enter fullscreen mode Exit fullscreen mode

After the cluster manifest is submitted and passed the validation the operator will create Service and Endpoint resources and a StatefulSet which spins up new Pod(s) given the number of instances specified in the manifest. All resources are named like the cluster. The database pods can be identified by their number suffix, starting from -0. They run the Spilo container image by Zalando. As for the services and endpoints, there will be one for the master pod and another one for all the replicas (-repl suffix). Check if all components are coming up. Use the label application=spilo to filter and list the label spilo-role to see who is currently the master.

# check the deployed cluster
kubectl get postgresql

kubectl get postgresql
NAME                   TEAM   VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE   STATUS
acid-minimal-cluster   acid   15        2      1Gi                                     78s   Creating
Enter fullscreen mode Exit fullscreen mode

Wait a few minutes...

kubectl get postgresql
NAME                   TEAM   VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE     STATUS
acid-minimal-cluster   acid   15        2      1Gi                                     4m12s   Running

# check created database pods
kubectl get pods -l application=spilo -L spilo-role

kubectl get pods -l application=spilo -L spilo-role
NAME                     READY   STATUS    RESTARTS   AGE     SPILO-ROLE
acid-minimal-cluster-0   1/1     Running   0          5m21s   master
acid-minimal-cluster-1   1/1     Running   0          3m20s   replica

# check created service resources
kubectl get svc -l application=spilo -L spilo-role

kubectl get svc -l application=spilo -L spilo-role
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE    SPILO-ROLE
acid-minimal-cluster          ClusterIP   10.100.2.8      <none>        5432/TCP   6m9s   master
acid-minimal-cluster-config   ClusterIP   None            <none>        <none>     4m3s   
acid-minimal-cluster-repl     ClusterIP   10.110.12.149   <none>        5432/TCP   6m9s   replica
Enter fullscreen mode Exit fullscreen mode

With a port-forward on one of the database pods (e.g. the master) you can connect to the PostgreSQL database from your machine. Use labels to filter for the master pod of our test cluster.

# get name of master pod of acid-minimal-cluster
export PGMASTER=$(kubectl get pods -o jsonpath={.items..metadata.name} -l application=spilo,cluster-name=acid-minimal-cluster,spilo-role=master -n default)

# set up port forward
kubectl port-forward $PGMASTER 6432:5432 -n default
Enter fullscreen mode Exit fullscreen mode

Open another CLI and connect to the database using e.g. the psql client. When connecting with a manifest role like foo_user user, read its password from the K8s secret which was generated when creating acid-minimal-cluster. As non-encrypted connections are rejected by default set SSL mode to require:

export PGPASSWORD=$(kubectl get secret postgres.acid-minimal-cluster.credentials.postgresql.acid.zalan.do -o 'jsonpath={.data.password}' | base64 -d)
export PGSSLMODE=require
psql -U postgres -h localhost -p 6432
psql (12.12 (Ubuntu 12.12-0ubuntu0.20.04.1), server 15.1 (Ubuntu 15.1-1.pgdg22.04+1))
WARNING: psql major version 12, server major version 15.
         Some psql features might not work.
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=# select version();
                                                             version                                                              
-----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15.1 (Ubuntu 15.1-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, 64-bit
(1 row)

postgres=# 

postgres=# create database my_test_db;
CREATE DATABASE
postgres=# 
postgres=# \c my_test_db
You are now connected to database "my_test_db" as user "postgres".
my_test_db=# create table my_table_1(a varchar(100), b timestamp);
CREATE TABLE
my_test_db=# create table my_table_2(a varchar(100), b timestamp);
CREATE TABLE
my_test_db=# create table my_table_3(a varchar(100), b timestamp);
CREATE TABLE
my_test_db=# insert into my_table_1 values('One One One', '2001-01-01 01:01:01');
INSERT 0 1
my_test_db=# insert into my_table_1 values('One One One', '2001-01-01 01:01:02');
INSERT 0 1
my_test_db=# insert into my_table_1 values('One One One', '2001-01-01 01:01:03');
INSERT 0 1
my_test_db=# insert into my_table_1 values('One One One', '2001-01-01 01:01:04');
INSERT 0 1
my_test_db=# insert into my_table_2 values('Two Two Two', '2002-02-02 01:01:04');
INSERT 0 1
my_test_db=# 

my_test_db=# \d
                  List of relations
 Schema |          Name           | Type  |  Owner   
--------+-------------------------+-------+----------
 public | my_table_1              | table | postgres
 public | my_table_2              | table | postgres
 public | my_table_3              | table | postgres
 public | pg_stat_kcache          | view  | postgres
 public | pg_stat_kcache_detail   | view  | postgres
 public | pg_stat_statements      | view  | postgres
 public | pg_stat_statements_info | view  | postgres
(7 rows)

my_test_db=#
Enter fullscreen mode Exit fullscreen mode

Let's delete the pod in the K8S cluster that is master, and verify that the failover procedure is working.

How to deploy a high availability (HA) Postgres cluster in Kubernetes?

kubectl get pods
NAME                                 READY   STATUS    RESTARTS   AGE
acid-minimal-cluster-0               1/1     Running   0          22m
acid-minimal-cluster-1               1/1     Running   0          20m
postgres-operator-6c5657ccd6-zx6dq   1/1     Running   0          27m

kubectl get pods
NAME                                 READY   STATUS    RESTARTS   AGE
acid-minimal-cluster-0               1/1     Running   0          2m45s
acid-minimal-cluster-1               1/1     Running   0          30m
postgres-operator-6c5657ccd6-zx6dq   1/1     Running   0          37m
Enter fullscreen mode Exit fullscreen mode

Let's examine log files of the pods after the failover: the pod acid-minimal-cluster-0 became secondary and the acid-minimal-cluster-1 became the leader.

kubectl logs acid-minimal-cluster-0

...
2023-04-12 13:07:36,610 INFO: no action. I am (acid-minimal-cluster-0), a secondary, and following a leader (acid-minimal-cluster-1)
2023-04-12 13:07:46,608 INFO: no action. I am (acid-minimal-cluster-0), a secondary, and following a leader (acid-minimal-cluster-1)
2023-04-12 13:07:56,611 INFO: no action. I am (acid-minimal-cluster-0), a secondary, and following a leader (acid-minimal-cluster-1)
2023-04-12 13:08:06,612 INFO: no action. I am (acid-minimal-cluster-0), a secondary, and following a leader (acid-minimal-cluster-1)
Enter fullscreen mode Exit fullscreen mode

kubectl logs acid-minimal-cluster-1

...
2023-04-12 13:08:34.726 33 LOG {ticks: 0, maint: 0, retry: 0}
2023-04-12 13:08:36,604 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
2023-04-12 13:08:46,604 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
2023-04-12 13:08:56,604 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
2023-04-12 13:09:04.731 33 LOG {ticks: 0, maint: 0, retry: 0}
2023-04-12 13:09:06,608 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
2023-04-12 13:09:16,608 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
2023-04-12 13:09:26,619 INFO: no action. I am (acid-minimal-cluster-1), the leader with the lock
Enter fullscreen mode Exit fullscreen mode

Let's connect to the PostgreSQL after failover and ensure that the data populated before is persistent.

psql -U postgres -h localhost -p 6432

postgres=# \c my_test_db
psql (12.12 (Ubuntu 12.12-0ubuntu0.20.04.1), server 15.1 (Ubuntu 15.1-1.pgdg22.04+1))
WARNING: psql major version 12, server major version 15.
         Some psql features might not work.
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
You are now connected to database "my_test_db" as user "postgres".
my_test_db=# 
my_test_db=# \d
                  List of relations
 Schema |          Name           | Type  |  Owner   
--------+-------------------------+-------+----------
 public | my_table_1              | table | postgres
 public | my_table_2              | table | postgres
 public | my_table_3              | table | postgres
 public | pg_stat_kcache          | view  | postgres
 public | pg_stat_kcache_detail   | view  | postgres
 public | pg_stat_statements      | view  | postgres
 public | pg_stat_statements_info | view  | postgres
(7 rows)

my_test_db=# select * from my_table_1;
      a      |          b          
-------------+---------------------
 One One One | 2001-01-01 01:01:01
 One One One | 2001-01-01 01:01:02
 One One One | 2001-01-01 01:01:03
 One One One | 2001-01-01 01:01:04
(4 rows)

my_test_db=#
Enter fullscreen mode Exit fullscreen mode

We can enter a shell into one of the Pods and check the cluster status there:

kubectl exec -it acid-minimal-cluster-0 -- /bin/bash

 ____        _ _
/ ___| _ __ (_) | ___
\___ \| '_ \| | |/ _ \
 ___) | |_) | | | (_) |
|____/| .__/|_|_|\___/
      |_|

This container is managed by runit, when stopping/starting services use sv

Examples:

sv stop cron
sv restart patroni

Current status: (sv status /etc/service/*)

run: /etc/service/patroni: (pid 24) 992s
run: /etc/service/pgqd: (pid 25) 992s
root@acid-minimal-cluster-0:/home/postgres# 


root@acid-minimal-cluster-0:/home/postgres# patronictl list
+ Cluster: acid-minimal-cluster (7221139180597178431) ----+----+-----------+
| Member                 | Host       | Role    | State   | TL | Lag in MB |
+------------------------+------------+---------+---------+----+-----------+
| acid-minimal-cluster-0 | 172.17.0.4 | Replica | running |  2 |         0 |
| acid-minimal-cluster-1 | 172.17.0.5 | Leader  | running |  2 |           |
+------------------------+------------+---------+---------+----+-----------+
root@acid-minimal-cluster-0:/home/postgres# 

root@acid-minimal-cluster-0:/home/postgres# su - postgres
postgres@acid-minimal-cluster-0:~$

postgres@acid-minimal-cluster-0:~$ psql -d postgres
psql (15.1 (Ubuntu 15.1-1.pgdg22.04+1))
Type "help" for help.

postgres=# 

postgres=# \l
                                                  List of databases
    Name    |   Owner   | Encoding |   Collate   |    Ctype    | ICU Locale | Locale Provider |   Access privileges   
------------+-----------+----------+-------------+-------------+------------+-----------------+-----------------------
 bar        | bar_owner | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 foo        | zalando   | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 my_test_db | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 postgres   | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 template0  | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | =c/postgres          +
            |           |          |             |             |            |                 | postgres=CTc/postgres
 template1  | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | =c/postgres          +
            |           |          |             |             |            |                 | postgres=CTc/postgres
(6 rows)


postgres=# \l
                                                  List of databases
    Name    |   Owner   | Encoding |   Collate   |    Ctype    | ICU Locale | Locale Provider |   Access privileges   
------------+-----------+----------+-------------+-------------+------------+-----------------+-----------------------
 bar        | bar_owner | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 foo        | zalando   | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 my_test_db | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 postgres   | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | 
 template0  | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | =c/postgres          +
            |           |          |             |             |            |                 | postgres=CTc/postgres
 template1  | postgres  | UTF8     | en_US.utf-8 | en_US.utf-8 |            | libc            | =c/postgres          +
            |           |          |             |             |            |                 | postgres=CTc/postgres
(6 rows)

postgres=# \c my_test_db
You are now connected to database "my_test_db" as user "postgres".
my_test_db=# \d
                  List of relations
 Schema |          Name           | Type  |  Owner   
--------+-------------------------+-------+----------
 public | my_table_1              | table | postgres
 public | my_table_2              | table | postgres
 public | my_table_3              | table | postgres
 public | pg_stat_kcache          | view  | postgres
 public | pg_stat_kcache_detail   | view  | postgres
 public | pg_stat_statements      | view  | postgres
 public | pg_stat_statements_info | view  | postgres
(7 rows)

my_test_db=# select * from my_table_1;
      a      |          b          
-------------+---------------------
 One One One | 2001-01-01 01:01:01
 One One One | 2001-01-01 01:01:02
 One One One | 2001-01-01 01:01:03
 One One One | 2001-01-01 01:01:04
(4 rows)

my_test_db=#
Enter fullscreen mode Exit fullscreen mode

Top comments (0)