DEV Community

Aniketh Deshpande
Aniketh Deshpande

Posted on

Change Data Capture - PostgreSQL

Change Data Capture is the concept of recording the changes in the database table fields.

  • It is very helpful in use cases where we want to track creation, updation, deletion of records in the table.

  • We might want to use this information to make changes in other databases or notify customers or notify other services.

  • Example:

    1. Save a copy of this data in a warehouse post transform.
    2. Trigger notification service to notify users about this change.
    3. Cache the data.

CDC In Postgres

  • Using Notify/Listen
    • NOTIFY provides a mechanism for interprocess communication between the database and the service that is LISTENing to this notification channel.
    • One or more services could be listening to this notification channel.
    • Name of this channel is usually the name of the database. However the user is free to set suitable names for these channels.
    • Any change in the table is captured by the DB and a trigger is initiated, which calls a function that formats the message to notify.
    • This usually contains the table name and the payload string.
    • The listening server registers to the channel and gets the message from the DB.
    • The service can then use this message and perform operations on it.

Pros:

  • Simple to implement. Use a trigger and a function to notify. Implement a listen service.

Cons:

  • Weak reliability. There is always a risk of loss of message especially when the listening service is down. Messages in the queue do not persist.

  • Using Debezium

    • Debezium is an open source tool used for capturing changes in the database tables based on the WAL (write ahead log).
    • The tool provides connectors to connect to a variety of databases.
    • The source connector is used to capture changes in the source database.
    • The sync connector is used to sync data directly in the destination database.

Pros:

  • The changes are persistent as they can be streamed to kafka. Hence highly reliable.

Cons:

  • Debezium does not take into account changes in the schema, users need to update the schema changes. Otherwise there would be data loss.

Thank you
Aniketh Deshpande

Top comments (0)