This blog aims to help you understand the intermediate concepts of Chapter 9, Write Ahead Logging (WAL), from the book The Internals of PostgreSQL.
Note: Make sure you have a thorough understanding of Chapter 9 Part-2 and the basics of PostgreSQL before proceeding to Chapter 9 Part-3, as it forms the foundation for our exploration.
So, let's continue:
Internal Layout of XLOG Record
Header Portion of XLOG Record
All XLOG records have a general header portion defined by the structure XLogRecord.
The structure of XLogRecord in version 9.4 or earlier is given below (version 9.5 removed the xl_len field):
typedef struct XLogRecord
{
    uint32        xl_tot_len; /* total len of entire record */
    TransactionId xl_xid;     /* xact id */
    uint32        xl_len;     /* total len of rmgr data */
    uint8         xl_info;    /* flag bits, see below */
    RmgrId        xl_rmid;    /* resource manager for this record */
    /* 2 bytes of padding here, initialize to zero */
    XLogRecPtr    xl_prev;    /* ptr to previous record in log */
    pg_crc32      xl_crc;     /* CRC for this record */
} XLogRecord;
- Both xl_rmid and xl_info are variables related to resource managers, which are collections of operations associated with the WAL feature such as writing and replaying of XLOG records.
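To make the header layout a bit more concrete, here is a minimal sketch that mirrors the structure above with stand-in typedefs. The typedef widths are my assumptions for illustration (the real definitions live in PostgreSQL's own headers), and the helpers init_xlog_record() and print_xlog_header_layout() are hypothetical names, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in typedefs with assumed widths; PostgreSQL defines the real ones. */
typedef uint32_t TransactionId;
typedef uint8_t  RmgrId;
typedef uint64_t XLogRecPtr;
typedef uint32_t pg_crc32;

typedef struct XLogRecord
{
    uint32_t      xl_tot_len; /* total len of entire record */
    TransactionId xl_xid;     /* xact id */
    uint32_t      xl_len;     /* total len of rmgr data */
    uint8_t       xl_info;    /* flag bits */
    RmgrId        xl_rmid;    /* resource manager for this record */
    /* 2 bytes of padding end up here so that xl_prev is aligned */
    XLogRecPtr    xl_prev;    /* ptr to previous record in log */
    pg_crc32      xl_crc;     /* CRC for this record */
} XLogRecord;

/* Hypothetical helper: zero the whole header, padding bytes included,
 * as the "initialize to zero" comment in the structure asks. */
void init_xlog_record(XLogRecord *rec)
{
    assert(rec != NULL);
    memset(rec, 0, sizeof(*rec));
}

/* Hypothetical helper: print where each header field sits in memory. */
void print_xlog_header_layout(void)
{
    printf("xl_tot_len at %zu\n", offsetof(XLogRecord, xl_tot_len));
    printf("xl_xid     at %zu\n", offsetof(XLogRecord, xl_xid));
    printf("xl_len     at %zu\n", offsetof(XLogRecord, xl_len));
    printf("xl_info    at %zu\n", offsetof(XLogRecord, xl_info));
    printf("xl_rmid    at %zu\n", offsetof(XLogRecord, xl_rmid));
    printf("xl_prev    at %zu\n", offsetof(XLogRecord, xl_prev));
    printf("xl_crc     at %zu\n", offsetof(XLogRecord, xl_crc));
    printf("sizeof(XLogRecord) = %zu\n", sizeof(XLogRecord));
}
```

Calling print_xlog_header_layout() from a small main() shows the gap between xl_rmid and xl_prev, which is the padding the comment in the structure refers to. The total size can differ between platforms because of trailing alignment.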
Data Portion of XLOG Record (version 9.4 or earlier)
- The data portion of an XLOG record is classified as either a backup block (an entire page) or a non-backup block (different data depending on the operation).
Examples of XLOG records (version 9.4 or earlier) in PostgreSQL are depicted in the figure below:
Data Portion of XLOG Record (version 9.5 or later)
In version 9.4 or earlier, there was no common format for XLOG records, so each resource manager had to define its own format. As a result, it became increasingly difficult to maintain the source code and to implement new features related to WAL.
To deal with this issue, a common structured format that does not depend on resource managers was introduced in version 9.5.
The data portion of an XLOG record can be divided into two parts: header and data.
The common XLOG record format in PostgreSQL is depicted in the figure below:
The header part contains zero or more XLogRecordBlockHeaders and zero or one XLogRecordDataHeaderShort (or XLogRecordDataHeaderLong).
It must contain at least one of those. When a record stores a full-page image (i.e. a backup block), its XLogRecordBlockHeader includes an XLogRecordBlockImageHeader, and also an XLogRecordBlockCompressHeader if the block is compressed.
The data part is composed of zero or more block data and zero or one main data, which correspond to the XLogRecordBlockHeader(s) and to the XLogRecordDataHeader, respectively.
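As a rough illustration of the "zero or one" main data header, the sketch below picks between the short and the long form based on the length of the main data. The two block-ID constants and the 255-byte threshold reflect my understanding of PostgreSQL's xlogrecord.h, so treat them as assumptions and check the headers of your version. The function write_main_data_header() is a hypothetical helper, not a PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, meant to mirror the constants in xlogrecord.h. */
#define XLR_BLOCK_ID_DATA_SHORT 255
#define XLR_BLOCK_ID_DATA_LONG  254

/*
 * Hypothetical helper: write the main data header for `len` bytes of main
 * data into `out` (which must have room for at least 5 bytes).
 * Returns the number of header bytes written:
 *   0 when there is no main data (no header at all),
 *   2 for the short form (id byte + 1-byte length),
 *   5 for the long form (id byte + 4-byte length).
 */
size_t write_main_data_header(uint8_t *out, uint32_t len)
{
    assert(out != NULL);

    if (len == 0)
        return 0;                       /* zero main data: no header */

    if (len > 255)
    {
        out[0] = XLR_BLOCK_ID_DATA_LONG;
        memcpy(out + 1, &len, sizeof(len));
        return 1 + sizeof(len);
    }

    out[0] = XLR_BLOCK_ID_DATA_SHORT;
    out[1] = (uint8_t) len;
    return 2;
}
```

For example, a 3-byte main data gets the 2-byte short header, while a 1000-byte main data needs the 5-byte long header.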
Examples of XLOG records (version 9.5 or later) in PostgreSQL are depicted in the figure below:
Writing of XLOG Records
- First, issue the following statement to explore the PostgreSQL internals:
testdb=# INSERT INTO tbl VALUES ('A');
- The pseudo-code of exec_simple_query() is shown below:
exec_simple_query() @postgres.c

(1) ExtendCLOG() @clog.c                  /* Write the state of this transaction
                                           * "IN_PROGRESS" to the CLOG.
                                           */
(2) heap_insert() @heapam.c               /* Insert a tuple, create an XLOG record,
                                           * and invoke the function XLogInsert.
                                           */
(3)   XLogInsert() @xlog.c                /* Write the XLOG record of the inserted tuple
       (9.5 or later, xloginsert.c)        * to the WAL buffer, and update the page's pd_lsn.
                                           */
(4) finish_xact_command() @postgres.c     /* Invoke the commit action. */
      XLogInsert() @xlog.c                /* Write an XLOG record of this commit action
       (9.5 or later, xloginsert.c)        * to the WAL buffer.
                                           */
(5)   XLogWrite() @xlog.c                 /* Write and flush all XLOG records in
                                           * the WAL buffer to the WAL segment.
                                           */
(6) TransactionIdCommitTree() @transam.c  /* Change the state of this transaction
                                           * from "IN_PROGRESS" to "COMMITTED" on the CLOG.
                                           */
Explanation of the above pseudo-code:
(1) The function ExtendCLOG() writes the state of this transaction 'IN_PROGRESS' in the (in-memory) CLOG.
(2) The function heap_insert() inserts a heap tuple into the target page on the shared buffer pool, creates this page's XLOG record, and invokes the function XLogInsert().
(3) The function XLogInsert() writes the XLOG record created by heap_insert() to the WAL buffer at LSN_1, and then updates the modified page's pd_lsn from LSN_0 to LSN_1.
(4) The function finish_xact_command(), which is invoked to commit this transaction, creates this commit action's XLOG record, and then the function XLogInsert() writes this record into the WAL buffer at LSN_2.
(5) The function XLogWrite() writes and flushes all XLOG records in the WAL buffer to the WAL segment file. If the parameter wal_sync_method is set to 'open_sync' or 'open_datasync', the records are written synchronously, because the function writes all records through a file opened with the open() system call and the O_SYNC or O_DSYNC flag. If the parameter is set to 'fsync', 'fsync_writethrough' or 'fdatasync', the respective system call (fsync(), fcntl() with the F_FULLFSYNC option, or fdatasync()) is executed. In any case, all XLOG records are guaranteed to be written to storage.
(6) The function TransactionIdCommitTree() changes the state of this transaction from 'IN_PROGRESS' to 'COMMITTED' on the CLOG.
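To see how the wal_sync_method choices in step (5) map onto system calls, here is a small POSIX sketch. It is not how XLogWrite() is implemented. It only shows the two styles: syncing through open() flags versus calling fsync() or fdatasync() after the write. The SyncMethod enum and the function write_and_sync() are hypothetical names, and 'fsync_writethrough' is left out because F_FULLFSYNC is platform specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical enum standing in for some wal_sync_method settings. */
typedef enum
{
    SYNC_OPEN_SYNC,      /* open() with O_SYNC */
    SYNC_OPEN_DATASYNC,  /* open() with O_DSYNC */
    SYNC_FSYNC,          /* write(), then fsync() */
    SYNC_FDATASYNC       /* write(), then fdatasync() */
} SyncMethod;

/* Append `len` bytes to `path` and make them durable with the chosen
 * method. Returns 0 on success, -1 on any error. */
int write_and_sync(const char *path, SyncMethod method,
                   const void *buf, size_t len)
{
    int flags = O_WRONLY | O_CREAT | O_APPEND;
    int fd;
    int rc = 0;

    assert(path != NULL && buf != NULL);

    /* With the open_* methods, every write() is synchronous by itself. */
    if (method == SYNC_OPEN_SYNC)
        flags |= O_SYNC;
    else if (method == SYNC_OPEN_DATASYNC)
        flags |= O_DSYNC;

    fd = open(path, flags, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, buf, len) != (ssize_t) len)
        rc = -1;
    else if (method == SYNC_FSYNC)
        rc = fsync(fd);          /* flush data and metadata */
    else if (method == SYNC_FDATASYNC)
        rc = fdatasync(fd);      /* flush data (and only needed metadata) */

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Whichever branch is taken, the function returns only after the operating system has been asked to push the bytes to storage, which is the guarantee step (5) relies on.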
The write sequence of XLOG records in PostgreSQL is depicted in the figure below:
The write sequence of XLOG records (continued) in PostgreSQL is depicted in the figure below:
I hope this blog has helped you understand the intermediate concepts of Write Ahead Logging (WAL) in PostgreSQL.
Check out the summary of Chapter 9 Part-4 if you want to understand PostgreSQL in depth.