From console to rows - the processing of a query in Postgres

#apacheage #postgres #sql

The journey of a query is quite exciting and complicated. As always, this will be a beginner-friendly post so everyone can follow. I would also attach some material that you can look after for an even more in-depth understanding of the topic.

Whenever you type a query in the console, it goes through the following steps for execution -

Parser
Analyzer
Rewriter
Planner
Executor

We will be looking at these from a high level.

1. Parser

For any language, the parser has only one functionality and that is to parse the tokens (or generate some meaning from the statements). That is exactly what the parser does in Postgres as well. It creates a parse tree from the query.

You might ask, what is a parse tree? Well, it is just a tree that contains several fields of a base statement. That base statement could be anything like SELECT, UPDATE, ALTER, etc. Let's take the SELECT statement as an example. For a SELECT statement, you have to specify a table in the FROM clause. This is stored in a list named List *fromClause. Similarly, other parameters are stored respectively.

A very important note to mention here is that the parser only checks for any syntactical mistakes and not semantic mistakes. Let's take an example.

Consider there is a table named students. Then, the following query would return an error in parsing stage -

SELECT * students;

But, the following query won't -

SELECT * FROM studentsz;

All the semantic checking happens in the Analyzer stage.

2. Analyzer

The main purpose of the analyzer is to semantically check the parse tree and generate the query tree. This structure also contains the metadata regarding the query. Each leaf forms a list or a tree that specifies the other values in that particular clause. It converts the tree from kind of a statement tree to a query tree. That's pretty much what it does. Let's move on to the next one.

3. Rewriter

Rewriter as the name suggests rewrites the query tree according to the rules defined in the pg_rules system catalog. This mechanism is complicated as well as interesting. So, if you want to further read, I will give some links at the end. VIEWS are generated and checked through the rewriter system.

4. Planner and Executor

The planner makes a query plan tree from the query tree and the executor executes it. The planner is a very complicated system on its own. It uses cost-based optimization for generating the query plan tree. Also, no planner hints are supported in Postgres. Again, this is a very vast topic that can be explored deeply. For a beginner, only the former information is sufficient to know how a query is executed.

In the end, I would like to add that these subsystems are no different than compilers. All of the processes that are performed in a compiler are also performed here.

For further reading, I have attached the materials below -

Thanks for reading and until then, see ya!

DEV Community

From console to rows - the processing of a query in Postgres

1. Parser

2. Analyzer

3. Rewriter

4. Planner and Executor

Top comments (0)

Read next

Connecting SQL Databases to the Cloud: PostgreSQL, MySQL, SQLite, and Cloud Integration Explained

Technical Interview Questions - Part 4 - Git + SQL vs noSQL

How to import Excel into PostgreSQL only 1 step

On-Demand Refresh | Materialized Views (Mviews) in ORACLE SQL