Vinay Kumar Talreja

Posted on Sep 10, 2023 • Edited on Sep 15, 2023

Summary of Chapter# 9 : "Write Ahead Logging (WAL)" from the book "The Internals of PostgreSQL" Part-4

#postgres #apacheage #database #agenssql

This blog aims to assist you in understanding the final concepts of Chapter:9 [Write Ahead Logging (WAL)] from the book The Internals of PostgreSQL.

Note: Ensure that you have a thorough understanding of
Chapter 9 Part-3 and basics of PostreSQL before we proceed to Chapter 9 Part-4, as it forms the foundation for our exploration.

So, Let's Continue:

WAL Writer Process

WAL writer is a background process to check the WAL buffer periodically and write all unwritten XLOG records into the WAL segments.
The purpose of this process is to avoid burst of writing of XLOG records. If this process has not been enabled, the writing of XLOG records might have been bottlenecked when a large amount of data committed at one time.
WAL writer is working by default and cannot be disabled. Check interval is set to the configuration parameter wal_writer_delay, default value is 200 milliseconds.

Checkpoint Processing in PostgreSQL

In PostgreSQL, the checkpointer (background) process performs checkpoints.
It starts when one of the following occurs:
(1) The interval time set for checkpoint_timeout from the previous checkpoint has been elapsed (the default interval is 300 seconds).
(2) In version 9.4 or earlier, the number of WAL segment files set for checkpoint_segments has been consumed since the previous checkpoint (the default number is 3).
(3) In version 9.5 or later, the total size of the WAL segment files in the pg_wal directory has exceeded the value of the parameter max_wal_size (the default value is 1GB (64 files)).
(4) The PostgreSQL server stops in smart or fast mode.
(5) A superuser issues the CHECKPOINT command manually.

Database Recovery in PostgreSQL

PostgreSQL implements redo log-based recovery.
If the database server crashes, PostgreSQL can restore the database cluster by sequentially replaying the XLOG records in the WAL segment files from the REDO point.
When PostgreSQL starts up, it first reads the pg_control file.

Details of the recovery process in PostgreSQL is depicted in figure below:

(1) PostgreSQL reads all items in the pg_control file when it starts. If the state item is 'in production', PostgreSQL enters recovery-mode because this means that the database was not shut down normally. If it is 'shut down', PostgreSQL enters normal startup-mode.
(2) PostgreSQL reads the latest checkpoint record, the location of which is written in the pg_control file, from the appropriate WAL segment file. It then gets the REDO point from the record. If the latest checkpoint record is invalid, PostgreSQL reads the one prior to it. If both records are unreadable, it gives up recovering by itself.
(3) The appropriate resource managers read and replay XLOG records in sequence from the REDO point until they reach the end of the latest WAL segment. When an XLOG record is replayed and if it is a backup block, it is overwritten on the corresponding table page regardless of its LSN. Otherwise, a (non-backup block) XLOG record is replayed only if the LSN of the record is greater than the pd_lsn of the corresponding page.

Insertion operations during the background writer working in PostgreSQL is depicted in figure below:

(1) PostgreSQL inserts a tuple into the TABLE_A, and writes an XLOG record at LSN_1.
(2) The background-writer process writes the TABLE_A page to storage. At this point, this page's pd_lsn is LSN_1.
(3) PostgreSQL inserts a new tuple into the TABLE_A, and writes a XLOG record at LSN_2. The modified page is not written into the storage yet.

Database Recovery in PostgreSQL is depicted in figure below:

(1) PostgreSQL loads the first XLOG record and the TABLE_A page, but does not replay it because the LSN of the record (LSN_1) is not larger than the LSN of the page (also LSN_1). In fact, it is clear that there is no need to replay it.
(2) Next, PostgreSQL replays the second XLOG record because the LSN of the record (LSN_2) is greater than the current LSN of the TABLE_A page (LSN_1).

WAL Segment Files Management

PostgreSQL writes XLOG records to one of the WAL segment files stored in the pg_wal subdirectory. A new WAL segment file is switched in if the old one has been filled up. The number of WAL files varies depending on several configuration parameters, as well as server activity. In addition, the management policy for WAL segment files has been improved in version 9.5.

WAL Segment Switches

WAL segment switches occur when one of the following events happens:
(1) WAL segment is filled up.
(2) The function pg_switch_xlog is called.
(3) The archive_mode parameter is enabled and the archive_timeout parameter has expired.
When a WAL segment file is switched, it is usually recycled for future use. However, it may be removed later if it is not needed.

WAL Segment Management

Whenever a checkpoint starts, PostgreSQL estimates and prepares the number of WAL segment files that will be needed for the next checkpoint cycle. This estimate is based on the number of WAL segment files that were consumed in previous checkpoint cycles.
The number of WAL segment files is counted from the segment that contains the prior REDO point, and the value must be between the min_wal_size parameter (which default to 80 MB, or 5 files) and the max_wal_size parameter (which default to 1 GB, or 64 files).
If a checkpoint starts, PostgreSQL will keep or recycle the necessary WAL segment files, and remove any unnecessary files.

Continuous Archiving and Archive Logs

Continuous Archiving is a feature that copies WAL segment files to archival area at the time when WAL segment switches, and is performed by the archiver (background) process.
The copied file is called an archive log.
This feature is usually used for hot physical backup and PITR (Point-in-Time Recovery).
The path of archival area is set to the configuration parameter archive_command.
For Example: archive_command = 'cp %p /home/postgres/archives/%f', where, placeholder %p is copied WAL segment, and %f is archive log.

Continuous archiving in PostgreSQL is depicted in figure below:

The parameter archive_command can be set any Unix commands and tools, so you can transfer the archive logs to other host by setting the scp command or any file backup tools instead of ordinary copy command.

I hope, this blog has helped you in understanding the final concepts of Write Ahead Logging (WAL) in PostreSQL.

Check out summary of Chapter : 10 Part-1

If you want to understand PostgreSQL In-Depth.

DEV Community

Summary of Chapter# 9 : "Write Ahead Logging (WAL)" from the book "The Internals of PostgreSQL" Part-4

WAL Writer Process

Checkpoint Processing in PostgreSQL

Database Recovery in PostgreSQL

WAL Segment Files Management

Continuous Archiving and Archive Logs

Top comments (0)

Read next

All About Parquet Part 02 - Parquet's Columnar Storage Model

Master Bidirectional One-to-One Relations in 5 Steps: Boost Spring Data JPA Efficiency

GitCone.com: Chat with all the repositories from this challenge, or with any other repository

Local AI Knowledge Base with Next.js, Ollama, and PostgreSQL