DEV Community

Haseeb Ashraf
Haseeb Ashraf

Posted on

Query Processing in PostgreSQL

Query processing ins PostgreSQL is the most complicated system and it efficiently processes the supported SQL. In this article, we will look at query processing, in particular focusing on query optimising.

In postgreSQL, a parallel query implemented in version 9.6 uses multiple background worker processes, all queried issued by a connected client are handled by a backend process. This backend process consists of five subsystems as shown below.

  • Parser
    This helps to generate a parse tree from the SQL statements that were written in plain text.
    The root node of this parse tree is the SelectStmt structure that is defined in the parsenodes.h. The SELECT query and the corresponding elements of the parse tree have the same numbering. Since the parser only checks the input syntax when generating the parse tree, it only generates an error when there is a syntax error in the query.

  • Analyser
    Semantic analysis of the parse tree is carried out and a query tree is generated.
    The query structure defined in the parsenodes.h is the root of the query tree. This structure also includes the metadata of its corresponding query. A query tree includes things such as a targetlist which is a list of columns that are the result of the query. The join tree saves the FROM clause and the WHERE clauses.

  • Rewriter
    Given that such rules exist, the rewriter uses the rule system to transform the query tree.
    The rewriter has a system that generates the rule system. It looks at the rules stored in the pg_rules system catalog and rewrites the query tree. The rule system itself is quite intriguing but we will omitt it from our short summary to prevent it from becoming too long.

  • Planner
    The plan tree that can most effectively be executed from the query tree is generated by the planner.
    The planner in the PostgreSQL is a purely cost-based optimisation system and it doesn't support rule based optimisation and hints. Consequently, this planner is the most complicated system in an RDBMS. A plan tree consists of elements called plan nodes and is connected to the plantree list. Each of the plan tree consists of information that the executor requires for processing.

  • Executor
    This executes the query by accessing the tables and indices that were created by the plan tree in order.
    The executor reads and writes indexes and tables in the database cluster with the help of the buffer manager that will be further described in upcoming summaries. The executor occupies some memory areas, line the temp_buffers and work_mem are allocated in advance and temporary files are created if necessary.

Top comments (0)