
Ahmad Tashfeen


Understanding the Internal Layout of a Heap Table File

Overview of Pages

A heap table file, like an index file, is divided into fixed-length pages (also called blocks), 8192 bytes (8 KB) by default. Pages within a file are numbered sequentially from 0, and these numbers are called block numbers. When all pages in a file are full, PostgreSQL appends a new empty page to the end of the file, which increases the file size.
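Because pages have a fixed length and are numbered from 0, a block number maps directly to a byte position in the file. The following is a minimal sketch of that arithmetic, assuming the default 8192-byte page size; the helper names `block_offset` and `block_count` are hypothetical and are not part of PostgreSQL.

```python
PAGE_SIZE = 8192  # default page size in bytes; assumed here, it is fixed when PostgreSQL is built


def block_offset(block_number: int, page_size: int = PAGE_SIZE) -> int:
    """Return the byte offset in the file at which the given block starts."""
    if block_number < 0:
        raise ValueError("block numbers start at 0")
    return block_number * page_size


def block_count(file_size: int, page_size: int = PAGE_SIZE) -> int:
    """Return how many whole pages fit in a file of the given size."""
    return file_size // page_size


if __name__ == "__main__":
    print(block_offset(3))     # start of the fourth page (block number 3)
    print(block_count(16384))  # a 16 KB file holds two pages
```

Appending one empty page to a file therefore grows it by exactly one page size and makes the next block number available.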

Contents of a Page

A page within a table contains three kinds of data: heap tuples, line pointers, and header data. Heap tuples are the actual data records; they are stacked in order from the bottom of the page. Line pointers are 4 bytes long, and each one points to a heap tuple. Together they form a simple array that serves as an index to the tuples. Header data, defined by the structure PageHeaderData, holds general information about the page, such as the LSN of the XLOG record written by the most recent change to the page and the page's checksum.
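To make the layout concrete, here is a rough sketch that builds a synthetic page in memory and reads it back. It rests on several assumptions that should be checked against the PostgreSQL source for your version: a 24-byte PageHeaderData with the field order shown in the comment, a little-endian build, and a line pointer packed as a 15-bit offset, a 2-bit flag, and a 15-bit length. The functions `build_demo_page`, `parse_header`, and `parse_line_pointers` are hypothetical helpers, and the demo ignores the alignment padding that real heap tuples carry.

```python
import struct

PAGE_SIZE = 8192

# Assumed PageHeaderData layout (little-endian build):
#   pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
#   pd_special, pd_pagesize_version (6 x uint16), pd_prune_xid (uint32)
HEADER = struct.Struct("<IIHHHHHHI")  # 24 bytes in total
LINE_POINTER_SIZE = 4
LP_NORMAL = 1  # assumed flag value for a line pointer that is in use


def build_demo_page(tuples):
    """Build a synthetic page: line pointers grow down from the header,
    tuple data is stacked upward from the bottom of the page."""
    page = bytearray(PAGE_SIZE)
    lower = HEADER.size  # end of the line pointer array
    upper = PAGE_SIZE    # start of the newest tuple
    for data in tuples:
        upper -= len(data)
        page[upper:upper + len(data)] = data
        raw = upper | (LP_NORMAL << 15) | (len(data) << 17)
        struct.pack_into("<I", page, lower, raw)
        lower += LINE_POINTER_SIZE
    # LSN and checksum are left at 0 in this demo.
    HEADER.pack_into(page, 0, 0, 0, 0, 0, lower, upper, PAGE_SIZE, PAGE_SIZE | 4, 0)
    return bytes(page)


def parse_header(page):
    """Decode the header fields this sketch cares about."""
    (xlogid, xrecoff, checksum, _flags, lower, upper,
     special, _size_version, _prune_xid) = HEADER.unpack_from(page, 0)
    return {
        "lsn": (xlogid << 32) | xrecoff,
        "checksum": checksum,
        "lower": lower,
        "upper": upper,
        "special": special,
    }


def parse_line_pointers(page):
    """Return (offset, flags, length) for each line pointer on the page."""
    header = parse_header(page)
    pointers = []
    for pos in range(HEADER.size, header["lower"], LINE_POINTER_SIZE):
        (raw,) = struct.unpack_from("<I", page, pos)
        pointers.append((raw & 0x7FFF, (raw >> 15) & 0x3, (raw >> 17) & 0x7FFF))
    return pointers


if __name__ == "__main__":
    demo = build_demo_page([b"first", b"second"])
    for off, flags, length in parse_line_pointers(demo):
        print(off, flags, demo[off:off + length])
```

The line pointer array sits right after the header and grows toward the end of the page, while tuple data grows toward the start, so the free space is the gap between the two.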

Tuple Identifier

To identify a tuple within a table, PostgreSQL internally uses a tuple identifier (TID). A TID consists of a pair of values: the block number of the page that contains the tuple and the offset number of the line pointer that points to the tuple. A typical use of TIDs is in indexes, where each index entry stores the TID of the heap tuple it refers to.
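As an illustration, a TID can be modeled as a (block, offset) pair and resolved to the position of its line pointer. This is only a sketch under stated assumptions: the 24-byte header and 4-byte line pointer sizes, offset numbers that start at 1, and a table stored in a single file. The `TID` class and the functions `parse_tid` and `line_pointer_position` are hypothetical names for this example.

```python
from typing import NamedTuple

PAGE_SIZE = 8192         # assumed default page size
PAGE_HEADER_SIZE = 24    # assumed size of PageHeaderData
LINE_POINTER_SIZE = 4    # each line pointer is 4 bytes long


class TID(NamedTuple):
    block: int   # block number of the page, counted from 0
    offset: int  # offset number of the line pointer, counted from 1


def parse_tid(text: str) -> TID:
    """Parse the textual form '(block,offset)', for example '(0,1)'."""
    block, offset = text.strip().strip("()").split(",")
    return TID(int(block), int(offset))


def line_pointer_position(tid: TID) -> int:
    """Byte position in the file of the line pointer the TID refers to."""
    if tid.block < 0 or tid.offset < 1:
        raise ValueError("block numbers start at 0 and offset numbers at 1")
    return (tid.block * PAGE_SIZE
            + PAGE_HEADER_SIZE
            + (tid.offset - 1) * LINE_POINTER_SIZE)


if __name__ == "__main__":
    tid = parse_tid("(2,3)")
    print(tid, line_pointer_position(tid))
```

Resolving a TID is a two-step lookup: the block number selects the page, and the offset number selects the line pointer, which in turn gives the location of the tuple inside that page.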

Helpful Sources:
https://github.com/apache/age
https://www.interdb.jp/pg/pgsql01.html
