This blog aims to assist you in understanding the medial concepts of Chapter:9 [Write Ahead Logging (WAL)] from the book The Internals of PostgreSQL.
Note: Ensure that you have a thorough understanding of
Chapter 9 Part-1 and basics of PostreSQL before we proceed to Chapter 9 Part-2, as it forms the foundation for our exploration.
So, Let's Continue:
Full-Page Writes
Suppose that the TABLE_A's page-data on the storage is corrupted, because the operating system has failed while the background-writer process has been writing the dirty pages. As XLOG records cannot be replayed on the corrupted page, we would need an additional feature that is full-page writes to deal with such failures.
In PostgreSQL, such a XLOG record containing the entire page is referred to as backup block or full-page image.
Full page writes in PostgreSQL is depicted in figure below:
(1) The checkpointer starts a checkpoint process.
(2) In the insertion of the first INSERT statement, though PostgreSQL operates in the almost the same manner as in the previous subsection, this XLOG record is the backup block of this page (i.e. it contains the page entirety), because this is the first writing of this page after the latest checkpoint.
(3) As this transaction commits, PostgreSQL operates in the same manner as in the previous subsection.
(4) In the insertion of the second INSERT statement, PostgreSQL operates in the same manner as in the previous subsection since this XLOG record is not a backup block.
(5) When this statement's transaction commits, PostgreSQL operates in the same manner as in the previous subsection.
(6) To demonstrate the effectiveness of full-page writes, here we consider the case in which the TABLE_A's page on the storage has been corrupted due to the operating system failure occurred while the background-writer has been writing it into the HDD.
Database recovery with backup block in PostgreSQL is depicted in figure below:
(1) PostgreSQL reads the XLOG record of the first INSERT statement and loads the corrupted TABLE_A's page from the database cluster into the shared buffer pool. In this example, the XLOG record is a backup block, because the first XLOG record of each page is always its backup block according to the writing rule of full-page writes.
(2) When a XLOG record is its backup block, another rule of replaying is applied: the record's data-portion (i.e. the page itself) is to be overwritten onto the page regardless of the values of both LSNs, and the page's LSN updated to the XLOG record's LSN.
For Example, PostgreSQL overwrites the data-portion of the record to the corrupted page, and updates the TABLE_A's LSN to LSN_1. In this way, the corrupted page is restored by its backup block.
(3) Since the second XLOG record is a non-backup block, PostgreSQL operates in the same manner as the instruction in the previous subsection.
In this way, PostgreSQL can recover the database even if some data write errors occur due to a process or operating system down.
Transaction Log and WAL Segment Files
Logically, PostgreSQL writes XLOG records into the transaction log which is a virtual file 8-byte long (16 ExaByte).
A transaction log capacity is effectively unlimited and so can be said that 8-byte address space is vast enough, it is impossible for us to handle a file with the capacity of 8-byte length. So, a transaction log in PostgreSQL is divided into files of 16-Mbyte, by default, each of which known as WAL segment.
Transaction log and WAL segment files in PostgreSQL is depicted in figure below:
Using the built-in function pg_xlogfile_name (version 9.6 or earlier) or pg_walfile_name (versoin 10 or later), we can find the WAL segment file name that contains the specified LSN.
For Example:
testdb=# SELECT pg_xlogfile_name('1/00002D3E'); # In version 10 or later, "SELECT pg_walfile_name('1/00002D3E');"
pg_xlogfile_name
--------------------------
000000010000000100000000
(1 row)
Internal Layout of WAL Segment
A WAL segment is a 16 MB file, by default, and it is internally divided into pages of 8192 bytes (8 KB). The first page has a header-data defined by the structure XLogLongPageHeaderData, while the headings of all other pages have the page information defined by the structure XLogPageHeaderData.
Following the page header, XLOG records are written in each page from the beginning in descending order.
Internal layout of a WAL segment file in PostgreSQL is depicted in figure below:
I hope, this blog has helped you in understanding the medial concepts of Write Ahead Logging (WAL) in PostreSQL.
Check out summary of Chapter : 9 Part-3
If you want to understand PostgreSQL In-Depth.
Top comments (0)