PostgreSQL and YugabyteDB client execution profiling

#productivity

Performance is, and always has been an hot topic with databases, and a bit surrounded by mystery, because some specialists would advice changing something, and then all of a sudden the database runs much faster. Probably lots of people have experienced this.

It is my believe that this shouldn't be a mystery, but instead an act of understanding an execution path and identifying the statistics that tell the properties of the execution path. These statistics should be the basis for changing the "something", making it a predictable and carefully measured change.

One the principal basic things to measure is the statement execution latency in PostgreSQL, and YugabyteDB, because YugabyteDB reuses the PostgreSQL source code for its YSQL API.

How does statement execution work?

In PostgreSQL, and therefore in YugabyteDB, the execution of a statement has to go through the 4 principal execution phases:

Parse
Rewrite
Planning
Execution

Even with prepared statements, where you can skip some steps, which are steps 1 and 2, you will have to perform steps 1 and 2 to create the named prepared statement, after which you can use it to only execute steps 3 and 4, meaning the statement still has logically passed all phases.

pg_stat_statements

The standard way of looking at historical statement execution information is to use pg_stat_statements to look at the latency of the execution. pg_stat_statements contains the field total_time (and min/max/mean/stddev _time) up to version 12, and total_plan and total_exec time starting from version 13 on (plus the statistical ones for these.

This has a profound meaning that is missed by lot of people. Up to version 12, "total_time" means the time spent in execution phase 4: execution. From version 13 on, the naming is more clear on what the time column contains: "total_plan" is the time spent in phase 3: planning, and "total_exec" is the time spent in phase 4: execution.

In all cases it means pg_stat_statements does not contain the full execution time of a statement.

log_min_duration_statement

The question that begs to be answered based on the pg_stat_statement finding is if there is a way to see the full execution time. The good news is there is...but not in the database.

When the parameter log_min_duration_statement is set higher than its default value of -1 (which means 'disabled'), the PostgreSQL engine will log a line for each of the grouped execution phases that are executed against the database. In YugabyteDB, this parameter is set with the tablet server that starts the YSQL instance and is called ysql_log_min_duration_statement. As you guessed from the naming, it will only log a line if the execution exceeded the duration set with log_min_duration_statement, which is set in milliseconds. If set to 0, it will log all statements.

What are these 'grouped execution phases' (which is a naming that I came up with, it's not official naming)?

mapping client execution to server execution

It seems that there is different naming for the steps in execution between the PostgreSQL client and the server, for which the client terminology is used in the server too.

What are these names?

client 'Parse' -> server 'Parse' + 'Rewrite'.
client 'Bind' -> server 'Planner'.
client 'Execute' -> server 'Execute'.

If you enable log_min_duration_statement, it will log the client grouped phases in the postgresql server log, such as:

2022-12-21 15:18:42.192 UTC [7714] LOG:  duration: 0.933 ms  statement: select now();

postgres communication protocols

PostgreSQL contains two essential protocols with which a client can communicate with the server. The first protocol is the "simple query protocol".

simple query protocol

The "simple query protocol" is the protocol used by psql (and ysqlsh) and performs it, as the name does suggest, in the simplest possible way, which means it just everything (all 4 steps) in one go. This is reflected in the logging, which shows the client side perspective of it: one step, which inside the one step still does contain parse, rewrite, plan and execute.

extended protocol

To be able to take advantage of using the different steps in execution, the extended protocol was created. The extended protocol does break down the execution into different steps, which are the client steps: parse, bind and execute.

This does not mean these steps require separate, independent network requests to be sent. Otherwise there would be little situations where there would be an advantage of the simple query protocol. The client is free to use the extended protocol and group one or more steps in the same request.

If the extended protocol is used for the same query as executed above (select now()) then it would look like this:

2022-12-21 15:56:18.781 UTC [7845] LOG:  duration: 0.386 ms  parse s0: select now()
2022-12-21 15:56:18.783 UTC [7845] LOG:  duration: 0.207 ms  bind s0: select now()
2022-12-21 15:56:18.783 UTC [7845] LOG:  duration: 0.162 ms  execute s0: select now()

measuring execution from the client: connection-profiler

In order to see the client side of things, we created a PostgreSQL and YSQL connection profiler: connection-profiler. Currently it's available as rust source code.

Once connection-profiler is compiled, it can measure the time spent in the phases that the client uses. This is the simple query protocol:

$ ./target/release/connection-profiler -u "host=$HOSTNAME port=5433 user=yugabyte password=$PASSWORD sslmode=require" -q "select now()" -p simple -r 5
create_connection                             23425 us
------------------------------------------------------------
run nr: 1
total simple query                             3421 us
run nr: 2
total simple query                              145 us
run nr: 3
total simple query                              141 us
run nr: 4
total simple query                              120 us
run nr: 5
total simple query                              109 us
============================================================

(I use the shell to set $HOSTNAME and $PASSWORD)

I use a very simple query: select now(), to just see the basic behaviour of YugabyteDB. The basic working is the same in PostgreSQL, although less high because obtaining the data does not require network RPC requests.

The creation of the connection takes 23.4ms.
The first query takes 3.4ms.
The second to fifth query take 0.1ms.

Why does the first query suffer from higher latency? We can't see anything more specific with the simple query protocol. Let's switch to the extended query protocol!

$ ./target/release/connection-profiler -u "host=$HOSTNAME port=5433 user=yugabyte password=$PASSWORD sslmode=require" -q "select now()" -p extended -r 5
create_connection                             24960 us
------------------------------------------------------------
run nr: 1
parse+describe+sync        3295 us
bind+execute+sync           336 us
total extended protocol                        3662 us
run nr: 2
parse+describe+sync         110 us
bind+execute+sync           152 us
total extended protocol                         289 us
run nr: 3
parse+describe+sync          99 us
bind+execute+sync           127 us
total extended protocol                         251 us
run nr: 4
parse+describe+sync          95 us
bind+execute+sync           118 us
total extended protocol                         237 us
run nr: 5
parse+describe+sync         104 us
bind+execute+sync           126 us
total extended protocol                         254 us
============================================================

This does tell more. First of all see that the connection creation latency is in the same ballpark.

Now using the extended protocol, the requests to the database are done in two requests: the parse step, which performs the parsing and rewriting (parse+describe+sync), and the bind and execute steps.

However, the two distinct steps allow us to drill a little deeper into how the latency is divided:

The first query takes higher latency than the other ones.
The main part of higher first query response time is in the parse phase, which is not visible in explain analyze, nor pg_stat_statements.
Another part of the higher first query response time is bind+execute phase. This is caused by the planner requesting additional objects to optimise the execution.
The extra work done in the first query parse and bind phases is cached in the local/private backend cache.
The second to fifth execution can take advantage of the work done by the first statement, and performs at a roughly equal response time.

Please mind that for very simple queries, the two step execution actually slightly increases the response time, because it has to perform two steps instead of one.

This can be further optimised by using prepared statements. A prepared statement "materializes" or simply "stores" the work done in the parse/parse+rewrite phases, so that the client only has to perform the bind/plan and execution work. This can be seen by using connection-profiler in 'prepared' mode:

$ ./target/release/connection-profiler -u "host=$HOSTNAME port=5433 user=yugabyte password=$PASSWORD sslmode=require" -q "select now()" -p prepared -r 5
create_connection                             25127 us
parse+describe+sync        3343 us
prepare statement                              3371 us
------------------------------------------------------------
run nr: 1
bind+execute+sync           306 us
total extended protocol                         324 us
run nr: 2
bind+execute+sync            92 us
total extended protocol                         101 us
run nr: 3
bind+execute+sync            85 us
total extended protocol                          94 us
run nr: 4
bind+execute+sync            77 us
total extended protocol                          86 us
run nr: 5
bind+execute+sync            77 us
total extended protocol                          86 us
============================================================

In essence, all the steps for the statement have been executed: parse, rewrite, plan and execute. However, because the result of the plan and rewrite steps has been saved, it can be reused.

The above timing once again shows that the first parse will suffer the higher latency.

The bind and execute request for the first run still has higher latency, but once a full cycle of parse, rewrite, plan and execute has been performed for that specific statement, it can perform bind/plan and execute only using the prepared statement.