A Deep Dive into Base Backup and Point-in-time Recovery in PostgreSQL

#apache #agedb #database #postgres

Welcome to the ninth article of my blog series, in this blog I will discuss Base Backup and Point-in-time Recovery in PostgreSQL.

There are two types of online database backups in PostgreSQL: logical backups and physical backups. Physical backups involve capturing an image of the actual physical files that compose the database, while logical backups involve creating a backup by exporting the logical structure and data of the database.

Although logical backups offer advantages such as flexibility in extracting and modifying data, they can be time-consuming, particularly for large databases. In contrast, physical backups are more efficient and faster for backing up and restoring extensive databases.

Base Backup

A base backup is the technical term for the creation of a physical backup in PostgreSQL. The steps below make up the normal process for making a basic backup:

To initiate the backup process in PostgreSQL, you can use the pg_backup_start command (or pg_start_backup in previous versions). This command ensures that the database is prepared for backup by switching to the current Write-Ahead Log (WAL) segment file and creating a checkpoint. It also generates a backup_label file that contains essential information about the base backup, such as the checkpoint location.

Once the backup process is complete, you can use the pg_backup_stop command (or pg_stop_backup in earlier versions) to finalize the backup. This command performs several actions, including switching the WAL segment file, writing a backup end XLOG record, resetting the full-page write mode, and creating a backup history file. The backup history file includes the contents of the backup_label file and additional data.

With a base backup and the archive logs created through continuous archiving, you can utilize PostgreSQL's Point-in-Time Recovery (PITR) functionality to restore the database cluster to any desired point in time. This feature is useful for resolving critical errors or reverting the database to a specific state.

During the recovery process, PostgreSQL reads the checkpoint location from the backup_label file to determine the starting point. It then reads the archive logs from the archival path specified in the archive_command option and replays the WAL data from those logs.

Upon completion of the recovery operation, a timeline history file is created in the pg_xlog (or pg_wal in version 10 or later) subfolder. This file records the Log Sequence Number (LSN) where the WAL segment switches occurred, the cause of the timeline change, and the timelineId of the restored database cluster. The timeline history file serves as a vital reference for future PITR procedures and provides a history of recovered clusters.

References:
If you want to read it further, please click the link below.
Chapter-9 The Internals of PostgreSQL

DEV Community

A Deep Dive into Base Backup and Point-in-time Recovery in PostgreSQL

Base Backup

Top comments (0)

Read next

Just-in-Time Database Access

The Most Popular Database Caching Strategies Explained

Why we Built a Mini-Language for a Golang Hackathon

10 Best Developer Tools to Make Backend Development Easier 👩‍💻 🚀