DEV Community

Umairius's  Repo
Umairius's Repo

Posted on

Chapter 3. Query Processing

Introduction

Picture this: You're a developer, tasked with building a database management system from scratch. Sounds daunting, right? Well, that's exactly what the developers of PostgreSQL did back in the 1980s. And boy, did they succeed in creating something truly remarkable.

Logical Structure of a Database Cluster:

PostgreSQL's query processing system starts with the logical structure of a database cluster. At the top level, there is the cluster itself, which contains multiple databases. Each database object can be considered a separate database, with its own multiple tables, indexes, free space volumes, and visibility maps.

Physical Structure of a Database Cluster:

The physical structure of a database cluster refers to the actual location of the data on disk. The cluster consists of a collection of files that are organized into subdirectories. Each file represents a database object, and the subdirectories contain additional files that are used to manage the database objects.

Layout of a Database Cluster:

The layout of a database cluster is organized into three main areas: the global area, the database area, and the table area. The global area contains information that is shared across all databases in the cluster, such as user and group information. The database area contains information specific to a particular database, such as the database schema. The table area contains information about the tables and indexes within a particular database.

Tablespaces in a Database Cluster:

Tablespaces are an optional feature of PostgreSQL that allows you to store database objects in a location outside of the default data directory. This can be useful if you have a large database or if you want to store data on a different disk or file system.

Query Processing System:

Now that we have an understanding of the physical and logical structure of a database cluster, let's dive into how PostgreSQL processes queries. When a client sends a query to the server, a new backend process is created to handle the query. This backend process is responsible for executing the query and returning the results to the client.

Query Parsing:

The first step in query processing is query parsing. During this step, the query is broken down into its component parts and a parse tree is created. This parse tree represents the structure of the query and is used by the query planner to generate an execution plan.

Query Planning:

The next step in query processing is query planning. The query planner uses the parse tree to generate an execution plan. This plan outlines how the query will be executed and includes information about which tables and indexes will be accessed, what sort of join algorithm will be used, and what sort of sorting or grouping will be performed.

Query Execution:

Finally, we get to the meat of query processing: query execution. During this step, the backend process executes the query according to the execution plan generated by the query planner. As the query is executed, data is retrieved from the database and processed. Once all of the data has been retrieved and processed, the results are sent back to the client.

Conclusion:

So there you have it – a brief overview of how PostgreSQL processes queries. It's truly remarkable how much goes on behind the scenes to make database management systems work. Without the hard work and dedication of the developers who built PostgreSQL, we wouldn't have the robust and reliable database management systems that we have today.

Top comments (0)