DEV Community

Cover image for Summary of Chapter# 10 : "Base Backup & Point-in-Time Recovery" from the book "The Internals of PostgreSQL"
Vinay Kumar Talreja
Vinay Kumar Talreja

Posted on • Updated on

Summary of Chapter# 10 : "Base Backup & Point-in-Time Recovery" from the book "The Internals of PostgreSQL"

This blog aims to assist you in understanding the initial concepts of Chapter:10 [Base Backup & Point-in-Time Recovery] from the book The Internals of PostgreSQL.

Note: Ensure that you have a thorough understanding of
Chapter 9 Part-4 and basics of PostreSQL before we proceed to Chapter 10 Part-1, as it forms the foundation for our exploration.

So, Let's Start:

Introduction to chapter

  • Online database backup can be roughly classified into two categories: logical and physical backups.

  • One disadvantage of logical backups is that they can be very time-consuming. In particular, it can take a long time to make a backup of a large database, and even longer to restore the database from the backup data.

  • Meanwhile physical backups can be made and restored much more quickly, making them a very important and useful feature in practical systems.

  • In PostgreSQL, online physical full backups have been available since version 8.0.

  • A snapshot of a running whole database cluster (i.e., physical backup data) is known as a base backup.

  • Point-in-Time Recovery (PITR), which has also been available since version 8.0, is the feature to restore a database cluster to any point in time using a base backup and archive logs created by the continuous archiving feature.

Bonus Point:-

  • In versions 7.4 or earlier, PostgreSQL had supported only logical backups (logical full and partial backups, and data exports).

Base Backup

  • First of all, the standard procedure to make a base backup using the low-level commands is as follows:

  • (1) Issue the pg_backup_start command (versions 14 or earlier, pg_start_backup).

  • (2) Take a snapshot of the database cluster using the archiving command of your choice.

  • (3) Issue the pg_backup_stop command (versions 14 or earlier, pg_stop_backup).

  • Simple procedure for database administrators.

  • No need for special tools; uses common tools like cp or archiving tools.

  • Backup process doesn't require table locks.

  • Allows users to continue issuing queries during the backup.

  • Offers a significant advantage over other major open-source RDBMSs.

Making a base backup in PostgreSQL is depicted in figure below:

Image description

pg_backup_start (Ver.14 or earlier, pg_start_backup)

  • The pg_backup_start command prepares for making a base backup.

  • The recovery process starts from a REDO point, so the pg_backup_start command must do a checkpoint to explicitly create a REDO point at the start of making a base backup.

  • The third and fourth operations are the heart of this command.

  • The first and second operations are performed to recover a database cluster more reliably.

  • A backup_label file contains the following six items (versions 11 or later, seven items):

  • CHECKPOINT LOCATION – This is the LSN location where the checkpoint created by this command has been recorded.

  • START WAL LOCATION – This is not used with PITR, but used with the streaming replication, which is described in Chapter 11. It is named 'START WAL LOCATION' because the standby server in replication-mode reads this value only once at initial startup.

  • BACKUP METHOD – This is the method used to make this base backup.

  • BACKUP FROM – This shows whether this backup is taken from the primary or standby server.

  • START TIME – This is the timestamp when the pg_backup_start command was executed.

  • LABEL – This is the label specified at the pg_backup_start command.

  • START TIMELINE – This is the timeline that the backup started. This is for a sanity check and has been introduced in version 11.

  • The pg_backup_start performs the following four operations:

  • (1) Force the database into full-page write mode.

  • (2) Switch to the current WAL segment file (versions 8.4 or later).

  • (3) Do a checkpoint.

  • (4) Create a backup_label file – This file, created in the top level of the base directory, contains essential information about base backup itself, such as the checkpoint location of this checkpoint.

pg_backup_stop (Ver.14 or earlier, pg_stop_backup)

  • The pg_backup_stop command performs the following five operations to complete the backup:

  • (1) Reset to non-full-page writes mode if it has been forcibly changed by the pg_backup_start command.

  • (2) Write a XLOG record of backup end.

  • (3) Switch the WAL segment file.

  • (4) Create a backup history file. This file contains the contents of the backup_label file and the timestamp that the pg_backup_stop command was executed.

  • (5) Delete the backup_label file. The backup_label file is required for recovery from the base backup, but once copied, it is not necessary in the original database cluster.


How Point-in-Time Recovery Works

  • In PITR mode, PostgreSQL replays the WAL data of the archive logs on the base backup, from the REDO point created by the pg_backup_start up to the point you want to recover.

  • In PostgreSQL, the point to be recovered is referred to as a recovery target.

Basic concept of PITR in PostgreSQL is depicted in figure below:

Image description

  • At first, you need to set the command of the restore_command parameter, and also set the time of the recovery_target_time parameter to the point you made the mistake (in this case, 12:05 GMT) in a postgresql.conf (versions 12 or later) or recovery.conf (versions 11 or earlier).

I hope, this blog has helped you in understanding the initial concepts of Base Backup & Point-in-Time Recovery in PostreSQL.

Check out summary of Chapter : 10 Part-2

If you want to understand PostgreSQL In-Depth.

Top comments (0)