DEV Community

Muhammad Adil Shahid

Understanding the Internal Layout of the Heap Table in PostgreSQL

Each data file, whether it stores a heap table or an index, is divided into fixed-length pages (also called blocks), 8 KB each by default. When all existing pages are full, PostgreSQL appends a new empty page to the end of the file.
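Because every page has the same length, locating a page inside a data file is simple arithmetic. The snippet below is a minimal sketch of that idea, assuming the default 8192-byte page size; the helper names page_count and page_offset are made up for illustration and are not PostgreSQL functions.

```python
BLCKSZ = 8192  # assumed default page size in bytes (it is a compile-time option)


def page_count(file_size: int, block_size: int = BLCKSZ) -> int:
    """Return the number of whole pages in a data file of the given size."""
    if file_size % block_size != 0:
        raise ValueError("file size is not a multiple of the page size")
    return file_size // block_size


def page_offset(block_number: int, block_size: int = BLCKSZ) -> int:
    """Return the byte offset of a page; pages are numbered from 0."""
    return block_number * block_size


print(page_count(3 * BLCKSZ))  # 3
print(page_offset(2))          # 16384
```

Appending a new empty page therefore grows the file by exactly one page size, and the new page gets the next block number.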

A page within a table contains three kinds of data:

  • Heap tuple(s): A heap tuple is the record data itself. Heap tuples are stacked in order from the bottom (the end) of the page.

  • Line pointer(s): A line pointer, also known as an item pointer, is 4 bytes long and points to one heap tuple. Line pointers form an array that serves as an index to the tuples in the page.

  • Header data: Located at the beginning of the page, it contains general information about the page. It is defined by the structure PageHeaderData and is 24 bytes long.
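To make the 24-byte header concrete, here is a rough sketch that decodes it with Python's struct module. It assumes the usual PageHeaderData field order and a little-endian machine, and it runs on a synthetic page built in the script rather than a real data file. The helper names parse_page_header and line_pointer_count are hypothetical.

```python
import struct

# Assumed layout of PageHeaderData (little-endian, no padding):
#   pd_lsn              2 x uint32
#   pd_checksum, pd_flags, pd_lower, pd_upper,
#   pd_special, pd_pagesize_version   6 x uint16
#   pd_prune_xid        uint32
PAGE_HEADER = struct.Struct("<IIHHHHHHI")
LINE_POINTER_SIZE = 4  # each line pointer is 4 bytes


def parse_page_header(page: bytes) -> dict:
    """Decode the header fields found at the start of a page."""
    (lsn_hi, lsn_lo, checksum, flags, lower, upper,
     special, pagesize_version, prune_xid) = PAGE_HEADER.unpack_from(page, 0)
    return {
        "pd_lsn": (lsn_hi << 32) | lsn_lo,
        "pd_checksum": checksum,
        "pd_flags": flags,
        "pd_lower": lower,
        "pd_upper": upper,
        "pd_special": special,
        "pd_pagesize_version": pagesize_version,
        "pd_prune_xid": prune_xid,
    }


def line_pointer_count(header: dict) -> int:
    """Line pointers fill the space between the header and pd_lower."""
    return (header["pd_lower"] - PAGE_HEADER.size) // LINE_POINTER_SIZE


# Synthetic page: a header describing two line pointers, then zero bytes.
raw_header = PAGE_HEADER.pack(0, 0, 0, 0, 24 + 2 * 4, 8192 - 64, 8192, 8192 | 4, 0)
page = raw_header + bytes(8192 - PAGE_HEADER.size)

header = parse_page_header(page)
print(PAGE_HEADER.size)            # 24
print(line_pointer_count(header))  # 2
```

On a real installation the pageinspect extension is the safer way to look at page headers; this sketch is only meant to show how the pieces fit into 24 bytes.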

Writing Heap Tuples

When writing heap tuples, two fields of the page header matter most:

  • pd_lower points to the end of the line pointer array

  • pd_upper points to the beginning of the newest heap tuple. The gap between pd_lower and pd_upper is the free space of the page
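The way these two offsets move toward each other can be sketched with a toy model. The ToyPage class below is hypothetical and deliberately simplified: it ignores tuple headers, alignment padding, and the special space, and it assumes an 8192-byte page with a 24-byte header.

```python
HEADER_SIZE = 24        # size of the page header
LINE_POINTER_SIZE = 4   # size of one line pointer
PAGE_SIZE = 8192        # assumed default page size


class ToyPage:
    """Simplified page: line pointers grow forward, tuples grow backward."""

    def __init__(self, page_size: int = PAGE_SIZE) -> None:
        self.data = bytearray(page_size)
        self.pd_lower = HEADER_SIZE   # end of the line pointer array
        self.pd_upper = page_size     # start of the newest tuple
        self.line_pointers = []       # (offset, length) per tuple

    def free_space(self) -> int:
        return self.pd_upper - self.pd_lower

    def add_tuple(self, payload: bytes) -> int:
        """Store a tuple and return its 1-based offset number."""
        if len(payload) + LINE_POINTER_SIZE > self.free_space():
            raise ValueError("not enough free space in this page")
        # The tuple is written just below the previous one.
        self.pd_upper -= len(payload)
        self.data[self.pd_upper:self.pd_upper + len(payload)] = payload
        # A new line pointer is appended after the existing ones.
        self.line_pointers.append((self.pd_upper, len(payload)))
        self.pd_lower += LINE_POINTER_SIZE
        return len(self.line_pointers)


page = ToyPage()
print(page.add_tuple(b"x" * 100), page.pd_lower, page.pd_upper)  # 1 28 8092
print(page.add_tuple(b"y" * 50), page.pd_lower, page.pd_upper)   # 2 32 8042
print(page.free_space())                                          # 8010
```

Each insert moves pd_lower forward by one line pointer and pd_upper backward by the tuple length, so the page is full once the two would cross.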

Reading Heap Tuples

PostgreSQL typically reads heap tuples in one of two ways:

  • In a sequential scan, all tuples in all pages are read in order by scanning every line pointer in each page.

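  • In a B-tree index scan, PostgreSQL searches an index file that contains index tuples. Each index tuple is composed of an index key and a TID (tuple identifier) that points to the target heap tuple.
    Once the index tuple with the matching key is found, PostgreSQL reads the heap tuple using the TID, which consists of the page's block number and the line pointer's offset number.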
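The difference between the two access paths can be illustrated with a small toy model. Everything here is an assumption made for the example: the heap is a list of pages, each page is a list of rows, and a plain dict stands in for the B-tree (a real index is a tree, not a hash map). Block numbers start at 0 and offset numbers at 1.

```python
from typing import Dict, Iterator, List, Optional, Tuple

TID = Tuple[int, int]    # (block number, 1-based offset number)
Row = Tuple[str, int]    # toy row: (name, age)

# Toy heap: two pages holding three rows in total.
heap: List[List[Row]] = [
    [("alice", 30), ("bob", 25)],
    [("carol", 41)],
]

# Toy index: key -> TID. A dict stands in for the B-tree here.
index: Dict[str, TID] = {
    "alice": (0, 1),
    "bob": (0, 2),
    "carol": (1, 1),
}


def sequential_scan(pages: List[List[Row]]) -> Iterator[Tuple[TID, Row]]:
    """Visit every line pointer of every page, in order."""
    for block_number, page in enumerate(pages):
        for offset_number, row in enumerate(page, start=1):
            yield (block_number, offset_number), row


def index_scan(idx: Dict[str, TID], pages: List[List[Row]], key: str) -> Optional[Row]:
    """Find the TID for a key, then fetch only that heap tuple."""
    tid = idx.get(key)
    if tid is None:
        return None
    block_number, offset_number = tid
    return pages[block_number][offset_number - 1]


print(len(list(sequential_scan(heap))))  # 3
print(index_scan(index, heap, "carol"))  # ('carol', 41)
```

The sequential scan touches every page, while the index scan goes straight to one page and one line pointer, which is why the TID needs both a block number and an offset number.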

References:

https://www.interdb.jp/pg/pgsql01.html
