
Franck Pachot for YugabyteDB


EXPLAIN from pg_stat_statements normalized queries: how to always get the generic plan in 🐘&🚀

Every PostgreSQL user should use pg_stat_statements to collect executed queries with some basic statistics, in PostgreSQL (where the extension can be installed) and in YugabyteDB (where it is installed by default). But can you EXPLAIN them?
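If the extension is not yet available in your PostgreSQL database, here is the usual way to enable it (this assumes you can edit postgresql.conf and restart the server; YugabyteDB ships with it already loaded):

```sql
-- pg_stat_statements must be loaded at server start:
--   shared_preload_libraries = 'pg_stat_statements'   -- in postgresql.conf, then restart
-- then create the extension in the database you want to query it from:
create extension if not exists pg_stat_statements;
```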

Let's run the following example:


drop table if exists demo;
deallocate all;

select pg_stat_statements_reset();
create table demo (i bigint primary key, t text, d date not null, b boolean);
select * from demo where i=42 and t='x' and d is not null and b;


Here is what I can see from pg_stat_statements:

yugabyte=> select query, calls, total_time from pg_stat_statements;

                           query                                     | calls | total_time
------------------------------------------------------------------------------+-------+-------------
 select * from demo where i=$1 and t=$2 and d is not null and b               |     1 |    1.006883
 select pg_stat_statements_reset()                                            |     1 |    0.848043
 create table demo (i bigint primary key, t text, d date not null, b boolean) |     1 | 4403.048289
(3 rows)

The queries are normalized here, with $1, $2 parameters replacing the literals. This is great because storing each query with its specific values would be too expensive.

But the problem is that I cannot get the execution plan from it:

yugabyte=> explain select * from demo where i=$1 and t=$2 and d is not null and b;

ERROR:  there is no parameter $1
LINE 1: explain select * from demo where i=$1 and t=$2 and d is not...

If I know some values to be passed, then I can test with those. But if I don't, I want to see the generic plan. It may not be the same as the plan at execution time, but it at least gives an idea of the possible access paths for this query.

I can prepare the normalized query with parameters:

yugabyte=# prepare test1 as
           select * from demo where i=$1 and t=$2 and d is not null and b;
PREPARE

I can even get the number of parameters and their expected data types:

yugabyte=# select name, parameter_types, statement from pg_prepared_statements;

 name  | parameter_types |                            statement
-------+-----------------+-----------------------------------------------------------------
 test1 | {bigint,text}   | prepare test1 as                                               +
       |                 | select * from demo where i=$1 and t=$2 and d is not null and b;
(1 row)

But which value to use? I can use 42 for the numbers, 'forty-two' for the text, and so on, but I can also use null, which works with any datatype:

yugabyte=# explain execute test1(null,null);

                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=45)
   One-Time Filter: false
(2 rows)

Did you spot the problem? With null for the primary key, the query planner knows that there are no rows in the result, without even looking at the table (a primary key can never be null). This happened with null here, but it can also happen with any specific value I use, because the query planner is very clever with check constraints, partial indexes, and so on.
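For comparison, executing with actual values gives a custom plan where the literals are folded into the plan. A sketch with the values from the example (the exact output depends on your version and statistics, so the plan shown in the comments is only indicative):

```sql
explain execute test1(42, 'x');
-- typically shows the literals inlined in a custom plan, for example:
--  Index Scan using demo_pkey on demo
--    Index Cond: (i = 42)
--    Filter: ((d IS NOT NULL) AND b AND (t = 'x'))
```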

What I want is the generic plan, the one that is valid for any parameter value. One solution may be to run it 5 times in the hope that it switches from the custom plan to the generic plan:

...

yugabyte=# explain execute test1(null,null);
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=45)
   One-Time Filter: false
(2 rows)

yugabyte=# explain execute test1(null,null);
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=45)
   One-Time Filter: false
(2 rows)

yugabyte=# explain execute test1(null,null);
                              QUERY PLAN
-----------------------------------------------------------------------
 Index Scan using demo_pkey on demo  (cost=0.00..4.12 rows=1 width=45)
   Index Cond: (i = $1)
   Filter: ((d IS NOT NULL) AND b AND (t = $2))
(3 rows)

But that's not sufficient: you may be lucky or not. I explained this in the past in a short tweet.

There is a solution which depends on the PostgreSQL compatibility version.

PostgreSQL version 12 and above

In PostgreSQL, since version 12, you can control this with plan_cache_mode. The default (auto) uses 5 custom plans before considering a generic one. But you can directly get a generic plan with force_generic_plan:

postgres=# \c

postgres=# prepare test1 as
           select * from demo where i=$1 and t=$2 and d is not null and b;
PREPARE
postgres=# set plan_cache_mode = force_generic_plan;
SET
postgres=# explain execute test1(null,null);

                              QUERY PLAN
-----------------------------------------------------------------------
 Index Scan using demo_pkey on demo  (cost=0.15..8.17 rows=1 width=45)
   Index Cond: (i = $1)
   Filter: ((d IS NOT NULL) AND b AND (t = $2))
(3 rows)


This is obviously the simplest way to get a generic plan if you are compatible with PostgreSQL 12 or above.
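Since plan_cache_mode is a regular session parameter, you can revert it once you have the plan (the statement name below is the one from the example above):

```sql
reset plan_cache_mode;  -- back to the default 'auto'
deallocate test1;       -- optionally drop the prepared statement as well
```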

PostgreSQL before version 12

If you are not yet on version 12, you can't set plan_cache_mode. The solution is to execute the prepared statement 5 times so that the 6th execution can get a generic plan. However, this is not guaranteed, because there's another condition: the generic plan is chosen only when its estimated cost is lower than the average cost of the previous executions. This is actually the reason for the 5 runs: gathering enough data for a relevant average. The idea is that using a generic plan to save planning time is good only if the execution of the generic plan is faster than the planning plus execution of custom plans. But those estimations are really basic: the estimated planning cost is proportional to the number of relations in the query, multiplied by 1000*cpu_operator_cost.
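As a rough sketch of that estimation (the exact formula lives in PostgreSQL's plancache.c; the value below assumes the default cpu_operator_cost of 0.0025):

```sql
-- planning_cost ≈ 1000 * cpu_operator_cost * (number_of_relations + 1)
-- with the default cpu_operator_cost (0.0025) and a single-table query:
select 1000 * 0.0025 * (1 + 1) as estimated_planning_cost;  -- 5 cost units
```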

This basic formula gives me a way to guarantee that the 6th execution will be generic: artificially increase the planning cost of the first five by setting a high cpu_operator_cost:

\c
prepare test1 as 
 select * from demo where i=$1 and t=$2 and d is not null and b;
begin transaction;
set local cpu_operator_cost=1e42;  -- very high value
explain execute test1(null,null);  -- 1st execution
explain execute test1(null,null);  -- 2nd execution
explain execute test1(null,null);  -- 3rd execution
explain execute test1(null,null);  -- 4th execution
explain execute test1(null,null);  -- 5th execution
rollback;                          -- set back to default
explain execute test1(null,null);  -- 6th execution: generic plan

Here, the average planning cost of the first 5 executions is artificially high, which makes any generic plan the winner, as it avoids further planning overhead. I've used a transaction with set local so that cpu_operator_cost reverts to its default value on rollback.

YugabyteDB with PG11 compatibility

YugabyteDB is currently compatible with PostgreSQL 11.2, which doesn't have this control. However, there is a way to avoid those 5 executions, which I don't need anyway since I force the cost estimation. We have introduced yb_test_planner_custom_plan_threshold to control this, especially for our regression tests (hence the "test" in the name). I like it because the magic number 5 hardcoded in PostgreSQL is now a parameter (5 is the default value of yb_test_planner_custom_plan_threshold).

I use this to do the same as before, simplified with only one pre-execution of the custom plan:

\c
prepare test1 as 
 select * from demo where i=$1 and t=$2 and d is not null and b;
begin transaction;
set local cpu_operator_cost=1e42;               -- very high value
explain execute test1(null,null);               -- 1st execution (custom plan)
rollback;                                       -- revert to default
begin transaction;
set local yb_test_planner_custom_plan_threshold = 1;
explain execute test1(null,null);               -- 2nd execution (generic plan)
rollback;                                       -- revert to default


Here, the second execution has the generic plan, and, given the extremely high planning cost estimated for the custom plan, I'm guaranteed that it is chosen.

You may wonder why to look at generic plans when the application probably uses custom plans. First, if an access path is good for a generic plan, it should also be good for specific values. When looking at the plan, with a little knowledge of your application, you should spot the specific values that deserve a custom plan (like the famous is_deleted column for logical deletion, or special business values used as a discriminator). From the generic plans you also get the list of tables that are queried together, with the columns used by the predicates. Of course, the cost-based optimization will not be accurate, as the selectivity cannot be based on statistics for the actual values, but you already have good information to start with. And remember that most of your users will prefer performance predictability over getting the fastest plan for specific values.

YugabyteDB will upgrade the PostgreSQL code to PG13 soon, and next versions will follow.
