Do you see "sequential scan" in your EXPLAIN output while hoping for an index scan? This post dives into one reason why, as a quick tour with SQL you can run in your local psql.
You're a good candidate for this blog post if you know that ANALYZE is a vital feature of PostgreSQL but can't quite say why, or haven't seen a case where it matters.
We'll go about this in steps:
- Set up the schema + index up front
- Test out fast queries on the data
- Create a slowdown
- See the pg_stats
1. The setup
This is the only section where we're changing the table schema. This is important: The planner is going to ruin our day, not a schema change.
create table dogs_seen (id serial, num_seen bigint);
create index dogs_idx on dogs_seen (num_seen);
-- inserts 1mil rows with num_seen values
-- that are weighted towards low, positive integers
insert into dogs_seen (num_seen) select (1/random())
from generate_series(1,1000000);
-- update pg_stats to reflect our table state
analyze dogs_seen;
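Before moving on, it may help to see why `(1/random())` is weighted towards small values. Here's a quick Python sketch of the same weighting (my own simulation, not something from psql, and it assumes PostgreSQL rounds rather than truncates when casting to bigint):

```python
import random

random.seed(42)  # reproducible
counts = {}
for _ in range(1_000_000):
    # 1/random() rounds to 1 whenever random() > 2/3,
    # so roughly a third of the rows end up with num_seen = 1
    v = round(1 / random.random())
    counts[v] = counts.get(v, 0) + 1

print(counts[1], counts[2], counts[6])
```

You should see roughly 333k ones, 267k twos, and 28k sixes, which lines up with the row counts in the queries below.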
2. Initial testing
The test query we will focus on is this:
select count(*) from dogs_seen where num_seen = 1;
-- this should be around 300k-400k
You'd expect that query to use the index we created during setup, and it does:
explain analyze select count(*) from dogs_seen where num_seen = 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=6320.28..6320.29 rows=1 width=8) (actual time=25.344..26.535 rows=1 loops=1)
-> Gather (cost=6320.07..6320.28 rows=2 width=8) (actual time=25.296..26.527 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=5320.07..5320.08 rows=1 width=8) (actual time=19.418..19.418 rows=1 loops=3)
-> Parallel Index Only Scan using dogs_idx on dogs_seen (cost=0.42..4973.54 rows=138611 width=0) (actual time=0.062..11.770 rows=111153 loops=3)
Index Cond: (num_seen = 1)
Heap Fetches: 0
Planning Time: 0.292 ms
Execution Time: 26.595 ms
(10 rows)
Yup, it shows an Index Only Scan. One thing to note here is that this query has pretty bad performance, relatively speaking. During setup, I mentioned that we generated num_seen values weighted towards one, so there are way more rows matching our condition than, say, this one:
explain analyze select count(*) from dogs_seen
where num_seen = 6;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=650.42..650.43 rows=1 width=8) (actual time=5.709..5.710 rows=1 loops=1)
-> Index Only Scan using dogs_idx on dogs_seen (cost=0.42..580.67 rows=27900 width=0) (actual time=0.040..3.502 rows=28235 loops=1)
Index Cond: (num_seen = 6)
Heap Fetches: 0
Planning Time: 0.233 ms
Execution Time: 5.754 ms
(6 rows)
I draw a comparison here as a hint to say: This query is about as bad as it can get for the planner. It wants to use the index to reduce the amount of data it is scanning, but we just made it scan hundreds of thousands of entries.
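To put rough numbers on "about as bad as it can get": the fraction of the table each predicate matches falls straight out of the `1/random()` weighting. A back-of-envelope sketch (my own arithmetic, again assuming the cast rounds):

```python
def match_fraction(k: int) -> float:
    """Fraction of rows where round(1/random()) == k: random() must land
    in (1/(k+0.5), 1/(k-0.5)], capped at 1 since random() < 1."""
    return min(1.0, 1 / (k - 0.5)) - 1 / (k + 0.5)

print(f"num_seen = 1 matches {match_fraction(1):.1%} of rows")  # 33.3%
print(f"num_seen = 6 matches {match_fraction(6):.1%} of rows")  # 2.8%
```

Roughly a third of the table for `num_seen = 1` versus under 3% for `num_seen = 6`, which is why the second query is so much cheaper.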
What happens if we make things even worse for the planner? What if an even larger chunk of this table matched our query?
3. Create a slowdown
Again, we are going to create a slowdown purely by inserting more rows, but we want to make things worse for our planner, so we'll weight these rows differently:
-- insert 1mil more rows, even more weight on low values
-- (0.5 instead of 1 is the difference)
insert into dogs_seen (num_seen) select (0.5/random()) from generate_series(1,1000000);
-- don't forget to analyze
analyze dogs_seen;
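Why does `0.5/random()` make things worse? A hedged sketch of the arithmetic (my own, outside psql, same rounding assumption as before):

```python
# 1/random()   rounds to 1 when random() > 2/3 -> ~1/3 of the first batch
# 0.5/random() rounds to 1 when random() > 1/3 -> ~2/3 of the second batch
batch1_ones = 1 - 2 / 3
batch2_ones = 1 - 1 / 3
combined = (batch1_ones + batch2_ones) / 2
print(f"~{combined:.0%} of the 2M rows should now have num_seen = 1")
```

So about half the table now matches our test query's condition.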
And now explain that original 25ms SQL:
explain analyze select count(*) from dogs_seen where num_seen = 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=23271.91..23271.92 rows=1 width=8) (actual time=76.779..77.476 rows=1 loops=1)
-> Gather (cost=23271.69..23271.90 rows=2 width=8) (actual time=76.730..77.471 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=22271.69..22271.70 rows=1 width=8) (actual time=70.811..70.811 rows=1 loops=3)
-> Parallel Seq Scan on dogs_seen (cost=0.00..21227.67 rows=417611 width=0) (actual time=0.024..57.560 rows=333558 loops=3)
Filter: (num_seen = 1)
Rows Removed by Filter: 333109
Planning Time: 0.599 ms
Execution Time: 77.524 ms
(10 rows)
The same SQL switched to a sequential scan! Here's the dialogue I imagine going on in the PG planner's head:
- We made the index less valuable for this SQL: we inserted a ton of rows with duplicate values, which just happen to match our condition.
- If Postgres thinks it needs to scan a large fraction of a table, it may skip the index-scanning work and scan the table directly instead.
But how does postgres decide the above?
4. The pg_stats
When we ran ANALYZE dogs_seen above, PostgreSQL updated our pg_stats rows for this table. These columns are all worth knowing, so feel free to read up on them. The ones we care about here are most_common_vals and its companion most_common_freqs:
select left(most_common_vals::text, 10) as most_common_vals,
left(most_common_freqs::text, 50) as most_common_freqs
from pg_stats
where tablename = 'dogs_seen' and attname = 'num_seen';
most_common_vals | most_common_freqs
------------------+----------------------------------------------------
{1,2,3,4,5 | {0.5011333,0.19793333,0.08653333,0.047066666,0.029
(1 row)
This reads as:
- the value 1 is seen in 50% of sampled rows
- the value 2 is seen in 20% of sampled rows
- and so on
By inserting lots of rows with the same value, we changed what ANALYZE put in this list. And the planner adjusted: it now sequentially scans for the value 1 instead of using the index.
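Here's a sketch of how that decision falls out of these stats (my simplified understanding of the planner's selectivity math, not its actual code): for an equality predicate on a most-common value, the row estimate is roughly the table's row count times that value's frequency.

```python
reltuples = 2_000_000   # rows in dogs_seen after the second insert
freq_of_1 = 0.5011333   # most_common_freqs entry for the value 1

estimate = reltuples * freq_of_1
print(f"estimated matches: {estimate:,.0f}")

# The Parallel Seq Scan above showed rows=417611 per worker, i.e. this
# estimate split across 2 workers + leader (a divisor of ~2.4):
print(f"per-worker: {estimate / 2.4:,.0f}")
```

An estimated million matching rows out of two million: at that selectivity, walking the index buys nothing over just reading the table.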
Just to drive this home, let's delete a bunch of 1s and look at things again:
-- delete 500k rows, make "1" great again
delete from dogs_seen where id in (select id from dogs_seen where num_seen = 1 limit 500000);
-- re-analyze
analyze dogs_seen;
-- check out the new stats
select left(most_common_vals::text, 10) as most_common_vals,
left(most_common_freqs::text, 50) as most_common_freqs
from pg_stats
where tablename = 'dogs_seen' and attname = 'num_seen';
most_common_vals | most_common_freqs
------------------+----------------------------------------------------
{1,2,3,4,5 | {0.32923332,0.27103335,0.11643333,0.0607,0.0405666
(1 row)
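That new ~0.329 frequency for the value 1 lines up with simple arithmetic on what we deleted (my own sanity check):

```python
# Roughly 1M of the 2M rows had num_seen = 1; we deleted 500k of them.
ones_left = 1_000_000 - 500_000
total_left = 2_000_000 - 500_000
print(f"expected frequency of 1: {ones_left / total_left:.2%}")  # 33.33%
```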
-- did we switch it back onto the index?
explain analyze select count(*) from dogs_seen where num_seen = 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=9520.65..9520.66 rows=1 width=8) (actual time=33.478..34.535 rows=1 loops=1)
-> Gather (cost=9520.44..9520.65 rows=2 width=8) (actual time=33.384..34.527 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=8520.44..8520.45 rows=1 width=8) (actual time=28.603..28.603 rows=1 loops=3)
-> Parallel Index Only Scan using dogs_idx on dogs_seen (cost=0.43..8006.01 rows=205771 width=0) (actual time=0.265..17.126 rows=166891 loops=3)
Index Cond: (num_seen = 1)
Heap Fetches: 0
Planning Time: 0.417 ms
Execution Time: 34.570 ms
(10 rows)
It's back to an Index Only Scan! There you have it: without touching the schema, we changed the query plan just by inserting (and deleting) rows and running ANALYZE. If a sequential scan is haunting you, maybe this is why!