DEV Community

Vinay Kumar Talreja

Summary of Chapter# 9 : "Write Ahead Logging (WAL)" from the book "The Internals of PostgreSQL" Part-3

This blog aims to help you understand the intermediate concepts of Chapter 9 [Write Ahead Logging (WAL)] from the book The Internals of PostgreSQL.

Note: Make sure you have a thorough understanding of Chapter 9 Part-2 and the basics of PostgreSQL before proceeding to Chapter 9 Part-3, as they form the foundation for this part.

So, let's continue:

Internal Layout of XLOG Record

Header Portion of XLOG Record

  • All XLOG records have a general header portion defined by the structure XLogRecord.

  • The structure of XLogRecord, as defined in version 9.4 or earlier, is given below (version 9.5 removed the xl_len field):



typedef struct XLogRecord
{
   uint32          xl_tot_len;   /* total len of entire record */
   TransactionId   xl_xid;       /* xact id */
   uint32          xl_len;       /* total len of rmgr data */
   uint8           xl_info;      /* flag bits, see below */
   RmgrId          xl_rmid;      /* resource manager for this record */
   /* 2 bytes of padding here, initialize to zero */
   XLogRecPtr      xl_prev;      /* ptr to previous record in log */
   pg_crc32        xl_crc;       /* CRC for this record */
} XLogRecord;



  • Both xl_rmid and xl_info are variables related to resource managers. A resource manager is a collection of operations associated with the WAL feature, such as writing and replaying XLOG records.
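
To make the header layout concrete, here is a minimal sketch that rebuilds the structure outside PostgreSQL. The typedefs are stand-ins assumed for this example (PostgreSQL defines its own in its headers, and uses uint32/uint8 rather than the C99 names), and make_header() is a hypothetical helper, not a PostgreSQL function. It also assumes a record without backup blocks, so the total length is simply the header plus the resource manager data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in typedefs: assumptions for this sketch only. */
typedef uint32_t TransactionId;
typedef uint8_t  RmgrId;
typedef uint64_t XLogRecPtr;
typedef uint32_t pg_crc32;

/* Same field order as the 9.4-or-earlier structure shown above. */
typedef struct XLogRecord
{
    uint32_t      xl_tot_len;   /* total len of entire record */
    TransactionId xl_xid;       /* xact id */
    uint32_t      xl_len;       /* total len of rmgr data */
    uint8_t       xl_info;      /* flag bits */
    RmgrId        xl_rmid;      /* resource manager for this record */
    /* 2 bytes of padding here */
    XLogRecPtr    xl_prev;      /* ptr to previous record in log */
    pg_crc32      xl_crc;       /* CRC for this record (left at 0 here) */
} XLogRecord;

/* Hypothetical helper: fill a header for a record that carries
 * rmgr_len bytes of resource manager data and no backup blocks. */
XLogRecord make_header(TransactionId xid, RmgrId rmid, uint8_t info,
                       uint32_t rmgr_len, XLogRecPtr prev)
{
    XLogRecord rec;

    assert(rmgr_len <= UINT32_MAX - sizeof(XLogRecord));
    memset(&rec, 0, sizeof(rec));   /* also zeroes the padding bytes */
    rec.xl_tot_len = (uint32_t) sizeof(XLogRecord) + rmgr_len;
    rec.xl_xid = xid;
    rec.xl_len = rmgr_len;
    rec.xl_info = info;
    rec.xl_rmid = rmid;
    rec.xl_prev = prev;
    return rec;
}
```

Calling make_header(100, 10, 0x00, 32, 0) would describe a record of transaction 100 with 32 bytes of resource manager data; the rmid and info values are only illustrative. The two padding bytes after xl_rmid are what push xl_prev to offset 16.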

Data Portion of XLOG Record (version 9.4 or earlier)

  • The data portion of an XLOG record is classified as either a backup block (an entire page) or a non-backup block (data that differs by operation).

Figure: Examples of XLOG records (version 9.4 or earlier) in PostgreSQL.

Data Portion of XLOG Record (version 9.5 or later)

  • In version 9.4 or earlier, there was no common format for XLOG records, so each resource manager had to define its own format. As a result, it became increasingly difficult to maintain the source code and to implement new WAL-related features.

  • To deal with this issue, a common structured format that does not depend on resource managers was introduced in version 9.5.

  • The data portion of an XLOG record can be divided into two parts: header and data.

Figure: Common XLOG record format in PostgreSQL.

  • The header part contains zero or more XLogRecordBlockHeaders and zero or one XLogRecordDataHeaderShort (or XLogRecordDataHeaderLong).

  • The header part must contain at least one of these. When a record stores a full-page image (i.e., a backup block), XLogRecordBlockHeader includes XLogRecordBlockImageHeader, and it also includes XLogRecordBlockCompressHeader if the block is compressed.

  • The data part is composed of zero or more block data and zero or one main data, which correspond to the XLogRecordBlockHeader(s) and to the XLogRecordDataHeader respectively.
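
The rules above can be sketched as a small size calculation. This is a simplified model, not PostgreSQL's code: the helper names are hypothetical, the constants are assumptions believed to mirror xlogrecord.h (a 4-byte block header, a 2-byte short data header used when the main data is at most 255 bytes, and a 5-byte long data header otherwise), and image headers, compress headers, and relation or block identifiers are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for this sketch (see the hedges in the text above). */
#define SIZE_BLOCK_HEADER      4  /* id, fork_flags, 2-byte data_length */
#define SIZE_DATA_HEADER_SHORT 2  /* id, 1-byte data_length */
#define SIZE_DATA_HEADER_LONG  5  /* id, 4-byte data_length */

/* Zero or one main-data header: none when there is no main data,
 * short when the length fits in one byte, long otherwise. */
size_t main_data_header_size(uint32_t main_len)
{
    if (main_len == 0)
        return 0;
    return (main_len <= 255) ? SIZE_DATA_HEADER_SHORT
                             : SIZE_DATA_HEADER_LONG;
}

/* The header part needs at least one block header or a data header. */
int header_part_is_valid(unsigned nblocks, uint32_t main_len)
{
    return nblocks > 0 || main_len > 0;
}

/* Size of the header part under the simplifications stated above. */
size_t header_part_size(unsigned nblocks, uint32_t main_len)
{
    assert(header_part_is_valid(nblocks, main_len));
    return (size_t) nblocks * SIZE_BLOCK_HEADER
           + main_data_header_size(main_len);
}
```

For example, a record with one block header and 10 bytes of main data has a header part of 4 + 2 = 6 bytes in this model, while a record with no blocks and 300 bytes of main data needs only the 5-byte long data header.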

Figure: Examples of XLOG records (version 9.5 or later) in PostgreSQL.


Writing of XLOG Records

  • First, issue the following statement to explore the PostgreSQL internals:

testdb=# INSERT INTO tbl VALUES ('A');

  • This statement invokes the internal function exec_simple_query(), whose pseudo-code is shown below:


exec_simple_query() @postgres.c

(1) ExtendCLOG() @clog.c                  /* Write the state of this transaction
                                           * "IN_PROGRESS" to the CLOG.
                                           */
(2) heap_insert() @heapam.c               /* Insert a tuple, create an XLOG record,
                                           * and invoke the function XLogInsert.
                                           */
(3)   XLogInsert() @xlog.c (9.5 or later, xloginsert.c)
                                          /* Write the XLOG record of the inserted tuple
                                           *  to the WAL buffer, and update page's pd_lsn.
                                           */
(4) finish_xact_command() @postgres.c     /* Invoke commit action. */
      XLogInsert() @xlog.c  (9.5 or later, xloginsert.c)
                                          /* Write an XLOG record of this commit action
                                           * to the WAL buffer.
                                           */
(5)   XLogWrite() @xlog.c                 /* Write and flush all XLOG records on 
                                           * the WAL buffer to WAL segment.
                                           */
(6) TransactionIdCommitTree() @transam.c  /* Change the state of this transaction 
                                           * from "IN_PROGRESS" to "COMMITTED" on the CLOG.
                                           */


  • Explanation of the above pseudo-code:

  • (1) The function ExtendCLOG() writes the state of this transaction 'IN_PROGRESS' in the (in-memory) CLOG.

  • (2) The function heap_insert() inserts a heap tuple into the target page on the shared buffer pool, creates this page's XLOG record, and invokes the function XLogInsert().

  • (3) The function XLogInsert() writes the XLOG record created by heap_insert() to the WAL buffer at LSN_1, and then updates the modified page's pd_lsn from LSN_0 to LSN_1.

  • (4) The function finish_xact_command(), which is invoked to commit this transaction, creates this commit action's XLOG record, and then the function XLogInsert() writes this record into the WAL buffer at LSN_2.

  • (5) The function XLogWrite() writes and flushes all XLOG records on the WAL buffer to the WAL segment file. If the parameter wal_sync_method is set to 'open_sync' or 'open_datasync', the records are written synchronously, because the function opens the file using the open() system call with the O_SYNC or O_DSYNC flag. If the parameter is set to 'fsync', 'fsync_writethrough' or 'fdatasync', the respective system call (fsync(), fcntl() with the F_FULLFSYNC option, or fdatasync()) is executed. In either case, all XLOG records are guaranteed to be written to storage.

  • (6) The function TransactionIdCommitTree() changes the state of this transaction from 'IN_PROGRESS' to 'COMMITTED' on the CLOG.
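
Step (5) distinguishes two styles of making a write durable: opening the file with a synchronous flag, or writing normally and then calling a flush system call. The sketch below shows both styles using only POSIX calls. The function write_durably() and its use_open_sync parameter are hypothetical names for this example, not part of PostgreSQL, and only the 'open_sync' and 'fsync' variants are modelled.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path and make them durable.
 * use_open_sync != 0 : open() with O_SYNC, so write() itself is synchronous
 *                      (roughly the 'open_sync' style).
 * use_open_sync == 0 : plain write() followed by fsync()
 *                      (roughly the 'fsync' style).
 * Returns 0 on success, -1 on any failure (a short write counts as failure). */
int write_durably(const char *path, const void *buf, size_t len,
                  int use_open_sync)
{
    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    int fd;
    int ok;
    ssize_t written;

    if (use_open_sync)
        flags |= O_SYNC;

    fd = open(path, flags, 0600);
    if (fd < 0)
        return -1;

    written = write(fd, buf, len);
    ok = (written == (ssize_t) len);

    /* Without O_SYNC the data may still sit in the OS cache. */
    if (ok && !use_open_sync)
        ok = (fsync(fd) == 0);

    if (close(fd) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Either call, write_durably("wal.tmp", "abc", 3, 1) or write_durably("wal.tmp", "abc", 3, 0), aims for the same outcome described in step (5): the data has reached storage by the time the function returns successfully.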

Figure: Write-sequence of XLOG records in PostgreSQL.

Figure: Write-sequence of XLOG records (continued) in PostgreSQL.
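
The write sequence in steps (1) to (6) can also be traced with a toy model. Everything here is hypothetical and heavily simplified: the WAL buffer is a small byte array, a record is just a string, an LSN is modelled as the byte position where a record ends, and "flushing" only advances a counter. The point is the ordering, which mirrors the explanation above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ToyLSN;

enum ToyXactStatus { TOY_IN_PROGRESS, TOY_COMMITTED };

typedef struct ToyState
{
    char   wal_buffer[256];     /* stands in for the WAL buffer */
    ToyLSN insert_pos;          /* end of the last record in the buffer */
    ToyLSN flushed_pos;         /* records before this are "on storage" */
    ToyLSN page_lsn;            /* stands in for the page's pd_lsn */
    enum ToyXactStatus clog;    /* stands in for the CLOG entry */
} ToyState;

void toy_init(ToyState *s)
{
    memset(s, 0, sizeof(*s));
    s->clog = TOY_IN_PROGRESS;
}

/* Append a record to the buffer and return its (toy) LSN. */
ToyLSN toy_xlog_insert(ToyState *s, const char *record)
{
    size_t len = strlen(record);

    assert(s->insert_pos + len <= sizeof(s->wal_buffer));
    memcpy(s->wal_buffer + s->insert_pos, record, len);
    s->insert_pos += len;
    return s->insert_pos;
}

/* Walk through steps (1) to (6) for a single INSERT and commit. */
void toy_insert_and_commit(ToyState *s)
{
    ToyLSN lsn_1;

    s->clog = TOY_IN_PROGRESS;                   /* (1) */
    lsn_1 = toy_xlog_insert(s, "INSERT A");      /* (2), (3) */
    s->page_lsn = lsn_1;                         /* (3) pd_lsn := LSN_1 */
    (void) toy_xlog_insert(s, "COMMIT");         /* (4) record at LSN_2 */
    s->flushed_pos = s->insert_pos;              /* (5) write and flush */
    s->clog = TOY_COMMITTED;                     /* (6) */
}
```

After toy_init() and toy_insert_and_commit(), the page's LSN is no greater than the flushed position, and the transaction is marked committed only after its commit record has been flushed.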


I hope this blog has helped you understand the intermediate concepts of Write Ahead Logging (WAL) in PostgreSQL.

Check out the summary of Chapter 9 Part-4 if you want to understand PostgreSQL in depth.
