DEV Community

Arunesh Choudhary
Arunesh Choudhary

Posted on

06 - Vacuum Processing

By deleting dead tuples and freezing transaction IDs, vacuum processing is a crucial PostgreSQL maintenance procedure that contributes to the database's dependable functioning. Dead tuples are rows in databases that may be securely deleted since they are no longer required, but freezing transaction IDs guards against data tampering. There are two types of vacuum processing: concurrent vacuum and full vacuum. Other transactions can read the table while concurrent VACUUM is removing dead tuples from each page of the table file. In contrast, Full VACUUM prevents other transactions from accessing tables while it is executing and eliminates dead tuples and defragments live tuples for the whole file.

Vacuum processing has been developed and improved more slowly than other functions, despite its importance to PostgreSQL's functionality. Vacuum processing had to be done by hand up to version 8.0, but in 2005 the autovacuum daemon was implemented, ushering in automation. Vacuum processing can be expensive since it necessitates scanning whole tables. To increase the effectiveness of eliminating dead tuples, PostgreSQL version 8.4 included the Visibility Map (VM), which keeps track of pages containing only active tuples and enables the vacuum process to skip pointless searches.

Tasks of Vacuum Processing

  • Defragmenting living tuples on a page-by-page basis while removing dead tuples from databases. Additionally, index tuples that lead to defunct tuples are removed.

  • When necessary, it freezes older transaction IDs (txids) for tuples. This entails maintaining data integrity and updating system catalogs connected to frozen txids.

  • The commit log (also known as the "clog") is cleaned up by removing any extraneous sections wherever feasible.

Visibility Map

In order to lower the cost of vacuum processing, PostgreSQL 8.4 introduced the Visibility Map (VM). A table-specific map called the VM records page visibility and shows whether or not pages have dead tuples. The vacuum process can optimize efficiency and use less resources by skipping pages without dead tuples by leveraging the VM.

The VM is kept in a separate file as one or more 8 KB pages. It gives useful data on page visibility and supports the Free Space Map (FSM). This increases the effectiveness of vacuum processing in PostgreSQL by enabling it to efficiently identify and clean up pages that need attention.

Freeze Processing

Lazy and eager freeze processing modes are provided by PostgreSQL. Using the visibility map (VM) of the target tables, the freeze process scans just the pages with dead tuples while it is in lazy mode. This method skips pages without dead tuples effectively. On the other hand, eager mode reads every page, even if there are dead tuples present. It performs a comprehensive freeze procedure, refreshes pertinent system catalogs, removes unused files and pages from the commit log. With lazy mode being the default and eager mode being activated under specified situations, the mode selection is dependent on a number of factors.

Closing Remarks

In conclusion, PostgreSQL's vacuum processing is crucial for preserving database speed, space management, and precise statistics. Vacuum assists in ensuring the efficient operation of the database system by recovering space, updating statistics, and avoiding transaction ID wraparound.

Top comments (0)