DEV Community

Mehdi Pourfar
Mehdi Pourfar

Posted on

Backup and replicate Postgresql 12 using pgBackRest in Ubuntu 20.04

In this post, I will guide you through the process of setting up a continuous backup system and a replication server using pgBackRest.

For this guide, we need to have 3 servers: pg-master, pg-replica and pg-backup.

On all of these servers, we need to add these lines to /etc/hosts file and put the correct ips there:

{master_server_ip} pg-master
{replica_server_ip} pg-replica
{backup_server_ip} pg-backup

On pg-master:

sudo apt install postgresql-12 pgbackrest

On pg-replica:

sudo apt install postgresql-12 pgbackrest

On pg-backup:

sudo apt install pgbackrest

Since our servers should communicate with each other using ssh connection, we create ssh key pairs on each of them:

On pg-master, pg-replica and pg-backup run these commands:

sudo su postgres
ssh-keygen -t rsa -b 4096 -N ""

And then share all of the public keys inside /var/lib/postgresql/.ssh/authorized_keys file on each server. Then we need to check that all connections can be established without any problem:

On pg-master:

sudo -u postgres ssh postgres@pg-backup
[exit]
sudo -u postgres ssh postgres@pg-replica
[exit]

On pg-replica:

sudo -u postgres ssh postgres@pg-backup
[exit]
sudo -u postgres ssh postgres@pg-master
[exit]

On pg-backup:

sudo -u postgres ssh postgres@pg-master
[exit]
sudo -u postgres ssh postgres@pg-replica
[exit]

If everything works correctly, it's time to configure pgBackRest.

On pg-master, change the content of /etc/pgbackrest.conf with these lines:

[my_cluster]
pg1-path=/var/lib/postgresql/12/main

[global]
repo1-host=pg-backup
repo1-host-user=postgres

[global:archive-push]
compress-level=3

And then, open /etc/postgresql/12/main/postgresql.conf and change these variables:

archive_command = 'pgbackrest --stanza=my_cluster archive-push %p'
archive_mode = on
listen_addresses = '*'
max_wal_senders = 3
wal_level = replica

And then restart postgresql:

sudo systemctl restart postgresql

After that, we will create some test data:

sudo -u postgres psql

psql (12.4 (Ubuntu 12.4-1.pgdg18.04+1))
Type "help" for help.

postgres=# CREATE DATABASE my_db;
CREATE DATABASE
postgres=# \c my_db
You are now connected to database "my_db" as user "postgres".
my_db=# CREATE TABLE nums(val int);
CREATE TABLE
my_db=# INSERT INTO nums VALUES(1);
INSERT 0 1
my_db=# INSERT INTO nums VALUES(2);
INSERT 0 1
my_db=# INSERT INTO nums VALUES(3);
INSERT 0 1

Now we have a database named my_db with a table named nums with three values.

It's time to create our backup on pg-backup server. First edit /etc/pgbackrest.conf on pg-backup as below:

[my_cluster]
pg1-host=pg-master
pg1-host-user=postgres
pg1-path=/var/lib/postgresql/12/main                                                                                                               

[global]                                                                                                                                  
process-max=2
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
repo1-retention-diff=1
repo1-host-user=postgres
start-fast=y

Then run this command to create a backup stanza on pg-backup server:

sudo -u postgres pgbackrest --stanza=my_cluster --log-level-console=info stanza-create

Then, run this command on both pg-master and pg-backup to check everything is okay:

sudo -u postgres pgbackrest --stanza=my_cluster --log-level-console=info check

If everything works good, it's time to perform our first backup on pg-backup server:

sudo -u postgres pgbackrest --stanza=my_cluster --log-level-console=info backup

By default, pgBackRest works by copying new archive files to backup server but if we let it be like that, the hard disk will be full very soon. So we need to create new backups based on some routines and pgBackRest will remove old backups for us based on retention policies that we have defined above. To do this, we will use Cronjob.

On pg-backup server:

sudo -u postgres
crontab -e

And add these lines at the end of the file and press enter to create a new line. Then save and exit.

30 06  *   *   0     pgbackrest --type=full --stanza=my_cluster backup
30 06  *   *   1-6   pgbackrest --type=diff --stanza=my_cluster backup

You can read more about full backup, diff backup and retention policies in pgBackRest documentation.

Now that our backup server has been configured correctly, it's time to configure our replication server. Since we want to have a streaming replication, we should first create a replication user on pg-master:

sudo -u postgres psql
CREATE USER replicator PASSWORD '123123' REPLICATION";

On pg-master, open /etc/postgresql/12/main/pg_hba.conf and add this line and change pg_replicate_ip to the ip of pg-replica server:

host    replication     replicator      {pg_replicate_ip}/32       md5

And then restart postgresql

systemctl restart postgresql

Now, on pg-replica open the file /etc/pgbackrest.conf and change the content with these lines:

[my_cluster]
pg1-path=/var/lib/postgresql/12/main
recovery-option=primary_conninfo=host=pg-master port=5432 user=replicator password=123123 application_name=replica1

[global]
repo1-host=pg-backup
repo1-host-user=postgres

Don't worry if you don't know what application_name is. We will talk about it later.
On pg-replica, run these commands to fetch master database data:

systemctl stop postgresql
sudo -u postgres pgbackrest --stanza=my_cluster --delta --type=standby --log-level-console=info restore
systemctl start postgresql

To check that everything works correctly, on pg-replica go to psql and check that my_db database exists:

sudo -u postgres psql -d my_db
SELCET * FROM nums;

You should see these values as result:

 val 
-----
   1
   2
   3
(3 rows)

Now, we add some data on pg-master to check changes will be reflected on replica:

sudo -u postgres psql -d my_db
INSERT INTO nums VALUES (4);

On, pg-replica:

SELECT * FROM nums;

Now you should see these values as result:

 val 
-----
   1
   2
   3
   4
(3 rows)

If you don't see 4 values, look at the pg-replica's log file to find out the problem:

tail /var/log/postgresql/postgresql-12-main.log

As you saw earlier, we have set application_name=replica1 in our connection info, but what is that? By default, Postgresql will use asynchronous replication, meaning that, replication will be done with a lag (usually just a few milliseconds) but if you want to have synchronous replication (meaning that commits will be blocked until replication is done) then you can set this application_name in pg-master's postgresql.conf.
If you want to do this, on pg-master, open the file /ect/postgresql/12/main/postgresql.conf and change this value:

synchronous_standby_names = 'replica1'

And then restart postgresql. Now the replication should be synchronous. To test it, on pg-master go to sql shell and add a new row:

sudo -u postgres psql -d my_db
INSERT INTO nums VALUES(5);

If it doesn't hang, that means everything is okay. Buy if you stop postgresql replica server, command will hang and client will be blocked.

On pg-replica:

systemctl stop postgresql

On pg-master:

INSERT INTO nums VALUES(6);

Now as you see, pg-master's psql client is hanged but as soon as you start pg-replica postgresql server, it will be okay.

systemctl start postgresql

Now it's time to test recovery process. For the test purpose, we truncate nums table and recover the data.
On pg-master, run the command below and copy the time value:

sudo -u postgres psql -Atc  "select current_timestamp"
2020-09-24 14:17:23.105194+03:30

Open psql and truncate nums table:

sudo -u postgres psql -d my_db
TRUNCATE nums;
SELECT * FROM nums;
 val 
-----
(0 rows)

Stop Postgresql servers on pg-master and pg-replica

systemctl stop postgresql

On pg-master run the command below:

sudo -u postgres pgbackrest --stanza=my_cluster --delta --type=time "--target=2020-09-24 14:17:23.105194+03:30" --target-action=promote restore

And then start postgresql server:

systemctl start postgresql

Check the nums table:

sudo -u postgres psql -d my_db

 val 
-----
   1
   2
   3
   4
   5
   6
(6 rows)

As you can see, we have restored our truncated table on pg-master. Now we need to restore it on pg-replica. First, go to pg-backup server and get a new backup:

sudo -u postgres pgbackrest --stanza=my_cluster --log-level-console=info backup

Then, on pg-replica server, restore the data:

sudo -u postgres pgbackrest --stanza=my_cluster --delta --type=standby restore
systemctl start postgresql

Now for testing that everything works correctly add a value to nums table in pg-master and check that it will reflect on pg-replica.

That's all.

Discussion (0)