Structure of Postgres Database

PostgreSQL is a powerful, open-source object-relational database system that uses and extends the SQL language. It combines many features to safely store and scale even the most complicated data workloads. It is a highly reliable, stable, scalable, and secure system that has been around for more than two decades.

Logical Structure

A database cluster is a collection of databases that is managed by a single instance of a running PostgreSQL server. It does not mean "a group of database servers".

A database is a collection of database objects, where a database object is a data structure used either to store or to reference data. In PostgreSQL, databases themselves are also database objects and are logically separated from each other. All other database objects, for example tables and indexes, belong to their respective databases.
Database objects are managed internally by their object identifiers (OIDs), which are unsigned 4-byte integers. The system catalogs store the relation between database objects and their OIDs. For example, the OIDs of databases are stored in pg_database.
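As a rough illustration of what "unsigned 4-byte integer" means for OIDs, the Python sketch below checks whether a number fits in that range and packs it into four bytes. The function names, the sample value, and the little-endian byte order are choices made for this sketch only; none of them come from PostgreSQL itself.

```python
import struct

# Largest value an unsigned 4-byte (32-bit) integer can hold.
OID_MAX = 2**32 - 1


def fits_in_oid(value: int) -> bool:
    """Return True if value can be represented as an unsigned 4-byte integer."""
    return 0 <= value <= OID_MAX


def pack_oid(value: int) -> bytes:
    """Pack value into exactly four bytes (little-endian chosen arbitrarily)."""
    if not fits_in_oid(value):
        raise ValueError(f"{value} does not fit in an unsigned 4-byte integer")
    return struct.pack("<I", value)


# Hypothetical OID value, used only to exercise the helpers.
sample_oid = 16384
packed = pack_oid(sample_oid)
print(len(packed))  # number of bytes used to store the value
```

The point of the sketch is only the size limit: any identifier above 4,294,967,295 cannot be stored in this type.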

Physical Structure

A database cluster is a single directory, called the base directory, which contains further subdirectories and files. The initdb utility initializes a new database cluster.
A database is a subdirectory under the base subdirectory (the directory named base inside the base directory), and tables and indexes are files stored under the subdirectory of the database they belong to.
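The directory nesting described above can be sketched as a small path builder. This assumes the usual layout in which each database subdirectory is named after the database's OID and each table or index file is named after a number (the relfilenode); the function name, the example directory, and the numeric values are hypothetical.

```python
from pathlib import PurePosixPath


def relation_file_path(base_dir: str, database_oid: int, relfilenode: int) -> PurePosixPath:
    """Build <base_dir>/base/<database_oid>/<relfilenode>.

    Assumption: directories and files are named after numeric identifiers,
    as described in the lead-in. Nothing is read from disk here.
    """
    return PurePosixPath(base_dir) / "base" / str(database_oid) / str(relfilenode)


# Hypothetical values, for illustration only.
path = relation_file_path("/var/lib/pgdata", 16384, 18740)
print(path)
```

On a live server, the built-in function pg_relation_filepath reports the actual path for a given relation, which is the reliable way to find a file.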

A tablespace in PostgreSQL is an additional data area outside the base directory. The term means something different in PostgreSQL than in other RDBMSs. A tablespace is created in the directory specified in the CREATE TABLESPACE command, and under that directory PostgreSQL creates a version-specific subdirectory.

Heap Table File

A data file is divided into fixed-length pages (also called blocks), 8192 bytes (8 KB) by default. The pages within a file are numbered sequentially from 0, and these numbers are called block numbers. The layout of a page depends on the type of data file.
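Because pages have a fixed length and are numbered from 0, the position of a page inside its file follows from simple arithmetic. The sketch below assumes the default 8192-byte page size; the helper names are invented for this example.

```python
PAGE_SIZE = 8192  # default page size in bytes, as stated in the text


def block_byte_range(block_number: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return the (start, end) byte offsets of a block; end is exclusive."""
    if block_number < 0:
        raise ValueError("block numbers start at 0")
    start = block_number * page_size
    return start, start + page_size


def block_count(file_size_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Number of whole pages that fit in a file of the given size."""
    return file_size_bytes // page_size


print(block_byte_range(3))   # byte range occupied by block number 3
print(block_count(81920))    # pages in an 80 KB file
```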

A page within a table contains three kinds of data: heap tuples, line pointers, and header data. A heap tuple is the record data itself; heap tuples are stacked in order from the bottom of the page. A line pointer holds a pointer to a heap tuple; the line pointers form a simple array. The header data contains general information about the page.
The empty space between the end of the line pointers and the beginning of the newest tuple is called free space or a hole. A tuple identifier (TID) identifies a tuple within the table. It comprises the block number of the page that contains the tuple and the offset number of the line pointer that points to the tuple.
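The page layout and the TID can be modeled with a toy class. This is a simplified sketch, not PostgreSQL's actual on-disk format: the 24-byte header and 4-byte line pointer sizes, the 1-based offset numbers, and all class and method names are assumptions of this example, and alignment padding and tuple headers are ignored.

```python
from dataclasses import dataclass, field
from typing import NamedTuple

PAGE_SIZE = 8192       # default page size from the text
HEADER_SIZE = 24       # assumed header size for this sketch
LINE_POINTER_SIZE = 4  # assumed size of one line pointer for this sketch


class TID(NamedTuple):
    block_number: int   # which page holds the tuple
    offset_number: int  # which line pointer on that page (1-based here)


@dataclass
class ToyPage:
    block_number: int
    # Each line pointer is modeled as (start offset, length) of a tuple.
    line_pointers: list = field(default_factory=list)
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_SIZE))

    @property
    def lower(self) -> int:
        """End of the line pointer array, which grows from the top of the page."""
        return HEADER_SIZE + LINE_POINTER_SIZE * len(self.line_pointers)

    @property
    def upper(self) -> int:
        """Start of the newest tuple; tuples are stacked from the bottom."""
        return min((start for start, _ in self.line_pointers), default=PAGE_SIZE)

    def free_space(self) -> int:
        """Size of the hole between the line pointers and the tuples."""
        return self.upper - self.lower

    def insert(self, tuple_bytes: bytes) -> TID:
        """Store a tuple below the existing ones and append a line pointer."""
        if len(tuple_bytes) + LINE_POINTER_SIZE > self.free_space():
            raise ValueError("not enough free space on this page")
        start = self.upper - len(tuple_bytes)
        self.data[start:start + len(tuple_bytes)] = tuple_bytes
        self.line_pointers.append((start, len(tuple_bytes)))
        return TID(self.block_number, len(self.line_pointers))

    def fetch(self, tid: TID) -> bytes:
        """Follow the line pointer named by the TID to the tuple data."""
        start, length = self.line_pointers[tid.offset_number - 1]
        return bytes(self.data[start:start + length])


page = ToyPage(block_number=7)
first = page.insert(b"hello")
second = page.insert(b"world!!")
print(first, second, page.free_space())
```

Each insert shrinks the hole from both sides: the tuple takes space from the bottom and its line pointer takes space from the top, which is why the free space sits in the middle of the page.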
