DEV Community

Dmitry Romanoff
Dmitry Romanoff

Posted on

PostgreSQL. How do you find potentially ineffective indexes?

While indexes in PostgreSQL can significantly improve query performance, especially for read-heavy operations, they are not universally beneficial in every situation.

Having too many indexes on a PostgreSQL table can have several negative impacts on database performance and resource consumption.

The following query helps you find potentially ineffective indexes.

SELECT
 relid::regclass AS table,
 indexrelid::regclass AS index,
 idx_scan,
 idx_tup_read,
 idx_tup_fetch,
 pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size
FROM
 pg_stat_user_indexes
JOIN
 pg_index USING (indexrelid)
WHERE
 idx_scan < 50 AND idx_tup_read > 1000
ORDER BY
 pg_relation_size(indexrelid::regclass) DESC;
Enter fullscreen mode Exit fullscreen mode

Low idx_scan combined with high idx_tup_read and significant index_size might indicate an index that’s not very useful.

Also, a significant gap between idx_tup_read and idx_tup_fetch can indicate inefficiencies too. It suggests that the index is being scanned, but many of the scanned entries don’t lead to relevant rows.

The idx_tup_read counter measures the number of index entries retrieved by index scans. This metric counts every entry retrieved as a result of an index scan, which includes both entries that eventually lead to fetching a table row (a successful match) and entries that don’t meet the query’s criteria upon further inspection.

For example, consider an index scan that looks for rows where the value is greater than 10. idx_tup_read will count all the index entries scanned during this operation, not just the entries that point to rows satisfying the condition.

The idx_tup_fetch counter, on the other hand, measures the number of live table rows fetched after scanning the index. This count reflects the number of times an index entry led to successfully fetching a row from the table that also met any additional filtering criteria. In other words, it’s a measure of how many of the index entries read (idx_tup_read) were used to fetch rows that are relevant to the query.

Using the same example as above, if an index scan examines 100 entries (idx_tup_read) but only 60 of those entries lead to rows that meet the query’s condition (>10), then idx_tup_fetch would be 60.

This situation can happen with non-selective indexes, where the indexed column(s) have a lot of duplicate values or the query’s criteria are not highly selective.

Top comments (0)