Discussion on: Tutorial: Set up a Change Data Capture architecture on Azure using Debezium, Postgres and Kafka

Denis A • Edited

Many thanks, Abhishek!

A small (potential) missing step: at least for the pgoutput plugin (which Debezium recommends for PostgreSQL 10+), a publication named specifically dbz_publication must be created on the PostgreSQL database for all the tables that are to be tracked, with a SQL command like create publication dbz_publication for table table1, ..., tableN;.
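
For concreteness, a minimal sketch of that step (the schema and table names here are hypothetical placeholders):

```sql
-- Run as the database admin user on the Azure PostgreSQL database.
-- Debezium's pgoutput mode looks for a publication named dbz_publication by default.
-- List only the tables the connector should capture; FOR ALL TABLES would
-- need superuser privileges, which Azure does not grant.
CREATE PUBLICATION dbz_publication
    FOR TABLE inventory.customers, inventory.orders;
```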

Indeed, when the Debezium Kafka connector starts, it checks whether such a publication exists; if not, it tries to create one tracking all the tables (create publication dbz_publication for all tables;), but that requires superuser privileges on the PostgreSQL database, which Azure does not provide (the database admin roles are limited to pg_admin only, not superuser, which Microsoft reserves for its own administrators).
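
To confirm the publication is in place before the connector starts, queries like these work (standard PostgreSQL 10+ catalog views):

```sql
-- Check that the publication exists and whether it is in "all tables" mode.
SELECT pubname, puballtables FROM pg_publication;

-- List the tables the publication actually covers.
SELECT * FROM pg_publication_tables WHERE pubname = 'dbz_publication';
```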

Abhishek Gupta • Edited

Thank you for sharing, Denis! I used wal2json in the blog, but this is relevant when using pgoutput with managed offerings (such as Azure Database for PostgreSQL). I'll try to capture this information in the form of another blog post... thanks again :)

You might also want to take a look at the publication.autocreate.mode property
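
For anyone landing here, a hedged sketch of where that property fits in the connector config (the connector name, host, credentials, and table list are placeholders; "filtered" tells Debezium to auto-create the publication only for the tables matched by table.include.list, which requires ownership of those tables but not superuser):

```json
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "<server>.postgres.database.azure.com",
    "database.port": "5432",
    "database.user": "<user>",
    "database.password": "<password>",
    "database.dbname": "<db>",
    "database.server.name": "<logical-server-name>",
    "table.include.list": "inventory.customers,inventory.orders",
    "publication.autocreate.mode": "filtered"
  }
}
```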

Abhishek Gupta

It is now documented - debezium.io/documentation/referenc...

vinit patel

Abhishek,
We use PostgreSQL as a service on Azure. Is it mandatory to install Kafka? Can we just use Azure Event Hubs as a replacement? If Kafka is really required, what is the best approach to run Kafka in Azure?

Abhishek Gupta

the "best" approach is purely based on requirements, so the answer is "it depends". there are multiple options including HD Insight, Confluent Cloud via Azure Marketplace, Kafka on AKS, Event Hubs.