DEV Community

Discussion on: SQL query builders

Collapse
 
darkain profile image
Vincent Milum Jr

One point against that final query you show there. Using "OR" anywhere in a query generally destroys performance entirely, usually resulting in a full table scan. If you're dealing with a couple hundred or maybe a thousand rows, that's fine. In my line of work, a billion rows per table is quite common, and that would literally take hours to execute the query. Not entirely ideal on the DBA side, just to save some cleanliness on the developer's side.

Collapse
 
mark_saward profile image
Mark Saward • Edited

Edit: I should have used an earlier date for my tests, and maybe a second clause. A quick test with a second where clause (not included below) seems to show exactly the same plan is used both with and without the OR parts.

The largest tables I deal with are less than 1 million rows, so my pressures are definitely different to yours.

I did some tests in postgesql, and I can't see any difference in the query plan that the database uses, whether I include OR's or not. I'm far from a database expert, so please let me know if I've misunderstood something.

There are two cases here: a date is given to be filtered on or isn't. There are two approaches for each of these cases -- using an OR, and not using one.

postgres=# CREATE TABLE town (
postgres(#   id SERIAL UNIQUE NOT NULL,
postgres(#   name TEXT NOT NULL,
postgres(#   founded date NOT NULL
postgres(# );
CREATE TABLE
postgres=# insert into town (
postgres(#     name, founded
postgres(# )
postgres-# select
postgres-#     md5(random()::text),
postgres-#     ('01-01-0001'::date + (random() * interval '2000 year'))::date
postgres-# from generate_series(1, 10000000) s(i)
postgres-# ;
INSERT 0 10000000
postgres=# PREPARE dateplan_or(date) AS
postgres-#   SELECT * FROM town WHERE ($1 IS NULL OR founded >= $1);
PREPARE
postgres=#
postgres=# PREPARE dateplan(date) AS
postgres-#   SELECT * FROM town WHERE founded >= $1;
PREPARE
postgres=#
postgres=# PREPARE dateplan_all(date) AS
postgres-#   SELECT * FROM town;
PREPARE
Enter fullscreen mode Exit fullscreen mode

dateplan and dateplan_all are used for the cases where date is filtered on or not, respectively, with no OR. dateplan_or includes a where clause, whether we filter on date or not.

The two plans that are actually used when filtering on date, using dateplan_or and dateplan:

postgres=# EXPLAIN ANALYZE EXECUTE dateplan_or('2000-01-01'::date);
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..147082.16 rows=5408 width=41) (actual time=21.513..296.753 rows=5029 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on town  (cost=0.00..145541.36 rows=2253 width=41) (actual time=7.801..227.530 rows=1676 loops=3)
         Filter: (founded >= '2000-01-01'::date)
         Rows Removed by Filter: 3331657
 Planning Time: 1.879 ms
 JIT:
   Functions: 6
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 1.934 ms, Inlining 0.000 ms, Optimization 13.887 ms, Emission 7.058 ms, Total 22.880 ms
 Execution Time: 298.389 ms
(12 rows)

postgres=# EXPLAIN ANALYZE EXECUTE dateplan('2000-01-01'::date);
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..147082.16 rows=5408 width=41) (actual time=6.009..252.737 rows=5029 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on town  (cost=0.00..145541.36 rows=2253 width=41) (actual time=3.297..211.336 rows=1676 loops=3)
         Filter: (founded >= '2000-01-01'::date)
         Rows Removed by Filter: 3331657
 Planning Time: 0.529 ms
 JIT:
   Functions: 6
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 1.504 ms, Inlining 0.000 ms, Optimization 0.825 ms, Emission 7.730 ms, Total 10.059 ms
 Execution Time: 253.955 ms
(12 rows)
Enter fullscreen mode Exit fullscreen mode

And the two plans for dateplan_or and dateplan_all when no date is given:

postgres=# EXPLAIN ANALYZE EXECUTE dateplan_or(NULL);
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Seq Scan on town  (cost=0.00..193458.06 rows=10000006 width=41) (actual time=0.093..593.871 rows=10000000 loops=1)
 Planning Time: 0.234 ms
 Execution Time: 932.317 ms
(3 rows)

postgres=# EXPLAIN ANALYZE EXECUTE dateplan_all(NULL);
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Seq Scan on town  (cost=0.00..193458.06 rows=10000006 width=41) (actual time=0.051..599.485 rows=10000000 loops=1)
 Planning Time: 0.106 ms
 Execution Time: 936.658 ms
(3 rows)
Enter fullscreen mode Exit fullscreen mode

If I create an index, it gets used by both queries when filtering by date:

postgres=# CREATE INDEX founded_idx ON town (founded);
CREATE INDEX
postgres=# EXPLAIN ANALYZE EXECUTE dateplan_or('2000-01-01'::date);
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on town  (cost=66.78..17575.03 rows=5464 width=41) (actual time=3.113..19.385 rows=5029 loops=1)
   Recheck Cond: (founded >= '2000-01-01'::date)
   Heap Blocks: exact=4891
   ->  Bitmap Index Scan on founded_idx  (cost=0.00..65.42 rows=5464 width=0) (actual time=1.499..1.500 rows=5029 loops=1)
         Index Cond: (founded >= '2000-01-01'::date)
 Planning Time: 0.727 ms
 Execution Time: 19.813 ms
(7 rows)

postgres=# EXPLAIN ANALYZE EXECUTE dateplan('2000-01-01'::date);
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on town  (cost=66.78..17575.03 rows=5464 width=41) (actual time=1.522..14.834 rows=5029 loops=1)
   Recheck Cond: (founded >= '2000-01-01'::date)
   Heap Blocks: exact=4891
   ->  Bitmap Index Scan on founded_idx  (cost=0.00..65.42 rows=5464 width=0) (actual time=0.752..0.753 rows=5029 loops=1)
         Index Cond: (founded >= '2000-01-01'::date)
 Planning Time: 0.295 ms
 Execution Time: 15.169 ms
(7 rows)
Enter fullscreen mode Exit fullscreen mode