SQL 201: Have you VACUUMed your tables lately?

Helen Anderson on March 04, 2019

No, not that kind of vacuuming. Dishing the dirt on VACUUM If you work in a Postgres database there are tasks your DBA need to pe... [Read Full]
markdown guide
 

Autovacuum is nice, but it is not the best solution to rely upon once things start to grow.

You need to do two things:

  1. tune the parameters per table
  2. manually vacuum for some tables

Tuning Parameters Per Table

As you mentioned the auto vacuum kicks in when 20% + 50 tuples have changes. This is a bad setting for small tables which change a lot, as it will result in a lot of vacuum activity for that table.

For example a user table is generally not that big, but it might change a lot if you keep track of a last activity of the user.

So for small tables with a lot of changes you should increase the scale factor, or maybe the base threshold.

Manually vacuum for some tables

For large tables which do see quite a bit of changes you might not want to rely on auto vacuum too much. Even though the overhead is small you probably do not want large tables to be vacuumed during "business hours".

It is best to create a cron job which explicitly calls the vacuum on these tables outside the business hours. This would result in more vacuum actions on the table than the autovacuum would generally consider. But given the size of these tables it is not really a bad thing.

ps, you might want to monitor the n_dead_tup column of pg_stat_user_tables. As this gives an indication as how much garbage there is for that table. If you plot this value together with the vacuum timestamps and n_live_tup column, you can get an idea how to tune vacuuming for that table.

 

That’s great advice, thank you for such a thoughtful comment.

I’ve been really enjoying learning all the tricks from our DBA to keep things running smoothly

 

Good to see more database-savvy entries here ;-)
Many developers stop by writing some SQL, but it is necessary to understand
why things behave as they do.

Some notes.
If you think you should do a vacuum full to a table or database: think twice.
As you mentioned, it creates full-table locks and copies the data to new pages. So you have to be aware of the additional space you need to do so.

The PostgreSQL guys state:

FULL:
Selects “full” vacuum, which can reclaim more space, but takes much longer and >exclusively locks the table. This method also requires extra disk space, since it >writes a new copy of the table and doesn't release the old copy until the >operation is complete. Usually, this should only be used when a significant >amount of space needs to be reclaimed from within the table.

Also check the notes on the same page:

We recommend that active production databases be vacuumed frequently (at least >nightly), in order to remove dead rows. After adding or deleting a large number >of rows, it might be a good idea to issue a VACUUM ANALYZE command for the >affected table. This will update the system catalogs with the results of all >recent changes, and allow the PostgreSQL query planner to make better choices in >planning queries.

The FULL option is not recommended for routine use, but might be useful in >special cases. An example is when you have deleted or updated most of the rows in >a table and would like the table to physically shrink to occupy less disk space >and allow faster table scans. VACUUM FULL will usually shrink the table more than >a plain VACUUM would.

Most of the time PostgreSQL does a good job on its own, the rest is carefull monitoring, reasoning and handling.

 

We just set up a PostGres DB in AWS for our test monitoring and reporting, while we don't have a large table its going to get activity and we were thinking about administrative tasks. Good to know about this one, not sure how well it will work for us with just one thing, but its nice to know what is available for the other stuff we will be putting up there.

Loved the Storm Trooper Maintenance worker image....now I know how the Death Star kept floors that shiny!

 

I love that image too, it was such a good find. I couldn't ditch Mrs Doubtfire, who featured in a post from Ron Soak on our internal work blog, so had to include her too.

 

Very interesting topic, I did not know the database engine would perform in a such way! Gonna definitively check this for MySQL (as I use it on my day-to-day job). Thank you for sharing :)

 

For MySQL, the Vacuum operation is not needed because it was implemented differently. So Don't worry about that unles you are using PotgreSQL. ;)

 

This post sucks.

Surprised no one said it yet...

Thanks @helenanders26 ! Something I didn't know about!

Any idea how this works in other databases like SQL Server (yes...I know...🤪)?

 

Drat.

That would have made a much better clickbait-y name for the post.

 

Ha... great ! I was soing this vacuum thing regularly on my PostgreSQL DBs without really know what it is or how it used to work, i just knew that it was a db maintenace operation. Thank you for this.

 

That's really great to hear!

I starting digging into it after our DBA said she had to the vacuum the tables each morning this week. Such an interesting, yet accurate, name for the process.

code of conduct - report abuse