Postgres, one of the most widely used relational database management systems, has been broadly adopted for its ability to handle different workloads such as web services and data warehouses.
Fun Fact: The name Postgres comes from its predecessor, the Ingres database (INteractive Graphics REtrieval System) developed at UC Berkeley; Postgres simply means Post-Ingres.
Sometimes good performance comes out of the box. At other times the expected performance is not met, and the database needs some tweaking in the form of structural changes to the tables, query tuning, configuration improvements, etc.
This article will provide some useful pointers and action plans to become a power-user in optimizing Postgres.
What to do when a query is slow?
In most cases, a query is slow because there are no indexes on the fields used in its WHERE clause.
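As a minimal sketch, assuming a hypothetical orders table that is filtered by customer_id (neither name comes from the original article), adding the missing index might look like this:

```sql
-- Hypothetical table and column names, used only for illustration.
-- The slow query filters on customer_id:
SELECT * FROM orders WHERE customer_id = 42;

-- Create a B-tree index (the default index type) on the filtered column:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
```

On a busy production table, CREATE INDEX CONCURRENTLY is worth considering: it takes longer to build, but it does not block writes to the table while the index is being created.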
That should have solved the problem, right? RIGHT?
I hear you; Life ain’t Fair, or Is it?
Not every index on a field in the WHERE clause is helpful; it all depends on the query plan the optimizer prepares. Prepend EXPLAIN ANALYZE to the query and run it to see the query plan.
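In practice this could look like the sketch below. The orders table and its columns are hypothetical names chosen for illustration. Keep in mind that EXPLAIN ANALYZE really executes the statement, so anything that modifies data is best wrapped in a transaction that gets rolled back.

```sql
-- Hypothetical query; EXPLAIN ANALYZE runs it and reports the actual
-- plan along with real timings and row counts.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;

-- For data-modifying statements, roll back so the change is not kept:
BEGIN;
EXPLAIN ANALYZE
UPDATE orders SET status = 'shipped' WHERE customer_id = 42;
ROLLBACK;
```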
Pro Tip: Use https://explain.depesz.com/ to visualize and analyze your query plan. The color coding makes it straightforward to spot the reason for the slowness.
The query plan itself can tell you a whole lot about where the time and resources are going. Below are a few of the keywords you will find in a query plan, and what they mean for you and your query's performance.
Sequential Scan: Yes, you read that right. The scan occurs sequentially: the filter runs over the whole table and returns the rows that match the condition, which can be very expensive. For a single-page or otherwise small table, sequential scans are pretty fast.
For larger tables, though, the sequential scan usually needs to become an Index Scan to speed up the query. This can be done by creating indexes on the columns that appear in the WHERE clause.
Index Scans / Index Only Scans: These denote that the indexes are being used properly. Just make sure that VACUUM and ANALYZE run once in a while: VACUUM clears the dead tuples out of the way, and ANALYZE refreshes the statistics that let the optimizer choose the right index for the scan.
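A minimal sketch of that maintenance step, again using a hypothetical orders table, could be:

```sql
-- Reclaim dead tuples and refresh planner statistics in one pass.
-- "orders" is a hypothetical table name.
VACUUM (ANALYZE) orders;

-- Check how many dead tuples remain and when maintenance last ran,
-- whether manually or via autovacuum:
SELECT relname,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';
```

If autovacuum is enabled (it is by default), the last_autovacuum and last_autoanalyze columns show whether it is keeping up with the table on its own.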
Bitmap Index Scan: And this right here is the bummer. Bitmap Index Scans come with a Bitmap Heap Scan on top. These scans mostly occur when a query retrieves many rows, but not all of them, often based on multiple logical conditions in the WHERE clause.
Postgres basically uses the index to build a bitmap of the table pages that match the condition, and then visits those pages (hence the Bitmap Heap Scan on top). The query can often be sped up by creating a composite index, a.k.a. a multicolumn index, which may let the planner switch to a plain Index Scan.
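Caution: The order of the columns in the composite index matters. The index works best when the query constrains its leading (leftmost) columns, so put the columns that the WHERE clause always filters on, ideally with equality conditions, first.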
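To make the column-order point concrete, here is a sketch on a hypothetical orders table with customer_id and status columns (illustrative names only). What the planner cares about is which index columns the query constrains, with the leading column being the most important; the order in which the conditions are written in the WHERE clause does not matter.

```sql
-- Hypothetical composite index: customer_id is the leading column.
CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);

-- Good candidates for the index: the leading column is constrained.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 AND status = 'shipped';
EXPLAIN SELECT * FROM orders WHERE status = 'shipped' AND customer_id = 42;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Poor candidate: only the second column is constrained, so the planner
-- is much less likely to find this index useful.
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
```

Comparing the four plans on your own data shows whether the chosen column order matches the way the table is actually queried.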
Indexes are good; unused indexes are bad, because they still take up space and slow down writes.
Having many indexes is OK, as long as they are all being used at some point.
More RAM for the DB is good. VACUUM & ANALYZE of tables is even better!!!
Archiving old data → You are being a good citizen and you are awesome!!
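One way to spot unused indexes is to look at the scan counters Postgres keeps for every index. The query below is a sketch rather than a definitive audit: the counters only cover the period since statistics were last reset, and indexes backing primary keys or unique constraints still do useful work even if they are never scanned.

```sql
-- List indexes that have never been scanned, largest first.
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```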
For optimal performance, the following settings need to be changed in the postgresql.conf file (a server restart is required; running SHOW config_file; tells you where the file lives):
|75% of RAM
|25% of RAM
|Min: 256MB; Max:512MB
Consider a scenario where the Postgres server has 160 GB of RAM:
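For that 160 GB server, the percentages work out as sketched below. The parameter names did not survive in the list above, so this assumes the 75% row refers to effective_cache_size and the 25% row to shared_buffers, which is the usual pairing in Postgres tuning guidance; treat that mapping as an assumption. The sketch uses ALTER SYSTEM, which writes the values to postgresql.auto.conf instead of requiring a hand edit of postgresql.conf.

```sql
-- Assumed mapping (not stated in the original list):
--   25% of RAM -> shared_buffers
--   75% of RAM -> effective_cache_size

-- 25% of 160 GB = 40 GB
ALTER SYSTEM SET shared_buffers = '40GB';

-- 75% of 160 GB = 120 GB
ALTER SYSTEM SET effective_cache_size = '120GB';

-- After restarting the server, confirm the active values:
SHOW shared_buffers;
SHOW effective_cache_size;
```

Of these two, shared_buffers only takes effect after a server restart, which is consistent with the restart requirement mentioned above.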
1) Run EXPLAIN ANALYZE on your query; if that takes too long, run plain EXPLAIN instead, which shows the estimated plan without executing the query.
2) Copy the output and paste it into the text box at https://explain.depesz.com/
3) Check the Stats of your query:
Index Scans / Index Only Scans are the best, and no changes need to be made.
Sequential Scans can be converted into Index Scans by creating an index on the particular column in the WHERE clause.
Bitmap Heap Scans can often be converted into Index Scans by creating composite indexes, a.k.a. multicolumn indexes, with the columns the WHERE clause always filters on placed first, as:
CREATE INDEX $indexName ON $tableName ($Field1, $Field2);
Note to Self: Index & Optimize!!
Originally published at https://www.datawrangler.in on August 10, 2019.