DEV Community

Ahmad Tashfeen
Ahmad Tashfeen

Posted on

Understanding the Physical Structure of Database Cluster

Introduction to Database Clusters

A database cluster is a collection of directories and files that contain various subdirectories. By running the initdb utility, a new database cluster can be initialized, resulting in the creation of a base directory. Typically, the base directory is designated by the environment variable PGDATA.

Layout of Databases

Each database is stored in a subdirectory within the base directory, with the subdirectory names matching the respective OIDs. For instance, if the OID of the database "sampledb" is 16384, its subdirectory name would be 16384.

Layout of Files Associated with Tables and Indexes

Tables and indexes with a size of less than 1GB are stored as single files under the database directory. Tables and indexes are internally managed by individual OIDs, while the data files are managed by the variable relfilenode. The relfilenode values of tables and indexes usually match the respective OIDs, but not always. For tables and indexes that exceed 1GB in file size, PostgreSQL creates new files named relfilenode.1, relfilenode.2, and so on, as necessary.

Tablespaces

A tablespace in PostgreSQL is an additional data area that exists outside of the base directory. This feature was introduced in version 8.0. When a tablespace is created using the CREATE TABLESPACE statement, it is created under the directory specified, with a version-specific subdirectory (e.g., PG_14_202011044). The tablespace directory is addressed using a symbolic link from the pg_tblspc subdirectory, with the link name being the same as the OID value of the tablespace. If a new database (with an OID of 16387) is created under the tablespace, its directory is created under the version-specific subdirectory.

Helpful Sources

https://github.com/apache/age
https://www.interdb.jp/pg/pgsql01.html

Top comments (0)