André · Originally published at andreligne.io

Counting large tables in PostgreSQL

Imagine you have a really large table (more than 100 million rows). How do you figure out how many rows it contains within a reasonable time?

If you try to run SELECT COUNT(id) FROM my_large_table, it will take a long time, since the database has to scan through all the rows to count them.

$ heroku pg:ps
 pid | state | source | running_for | transaction_start | waiting | query
------+--------+------------------+-----------------+-------------------------------+---------+--------------------------------------------------
 6926 | active | psql interactive | 00:10:12.549096 | 2020-10-01 08:22:18.090989+00 | t | SELECT COUNT(id) FROM pageviews;
(1 row)
I gave up after waiting for 10 minutes...
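
You can see where the time goes by asking the planner for its plan. The query has to visit every live row, and even an index-only scan still needs to check each row's visibility, so the work stays proportional to the table size. The plan shape looks roughly like this (illustrative output, costs and row counts elided):

DATABASE=> EXPLAIN SELECT COUNT(id) FROM pageviews;
                       QUERY PLAN
---------------------------------------------------------
 Aggregate  (cost=... rows=1 ...)
   ->  Seq Scan on pageviews  (cost=... rows=... ...)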

A faster way is to get an approximate value from the statistics PostgreSQL keeps in the catalog table pg_catalog.pg_class. Its reltuples column holds the planner's estimate of the number of live rows in the table, which is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.

DATABASE=> SELECT reltuples::bigint FROM pg_catalog.pg_class WHERE relname = 'pageviews';
 reltuples
-----------
 136032896
(1 row)
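
One caveat: relname is not unique across schemas, so the query above can return multiple rows if another schema has its own pageviews table. Matching on the table's OID through a regclass cast avoids the ambiguity (the public schema here is an assumption):

DATABASE=> SELECT reltuples::bigint FROM pg_catalog.pg_class WHERE oid = 'public.pageviews'::regclass;

If the estimate looks stale, a plain ANALYZE pageviews; refreshes it cheaply, since ANALYZE only samples the table rather than scanning all of it.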

This took less than a millisecond, compared to the COUNT query that still hadn't finished after 10 minutes. It may not be exact, but it's accurate enough to set expectations before adding new indices to the table or running other large operations on it.
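
The same idea extends to arbitrary queries: the planner's row estimate for any statement can be read out of EXPLAIN, a technique described on the PostgreSQL wiki. Here is a minimal sketch (the function name count_estimate is my own, and the path column in the usage example is made up):

CREATE OR REPLACE FUNCTION count_estimate(query text) RETURNS bigint AS $$
DECLARE
  plan json;
BEGIN
  -- Ask the planner for its estimate instead of running the query.
  EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;
  RETURN (plan -> 0 -> 'Plan' ->> 'Plan Rows')::bigint;
END;
$$ LANGUAGE plpgsql;

DATABASE=> SELECT count_estimate('SELECT 1 FROM pageviews WHERE path = ''/home''');

Like the reltuples lookup, this returns in milliseconds regardless of table size, because it never executes the query.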
