DEV Community

Vladyslav Len

Posted on • Originally published at Medium

Logical Replication is easy with DataBrew

In the ever-evolving landscape of data management, one of the most pressing challenges organizations face is ensuring the seamless and real-time replication of data across their systems. This critical process underpins a myriad of operations, from maintaining data consistency to enabling data analytics and business intelligence. However, selecting the optimal approach for data replication can be a formidable task, as it necessitates a careful evaluation of the available options. Navigating this intricate terrain demands a comprehensive understanding of the nuances and trade-offs associated with all methods, as organizations strive to make informed decisions to meet their specific data replication needs.

Logical Replication in PostgreSQL

Logical replication in PostgreSQL enables the selective replication of data changes at a logical level. Publishers define what data to replicate through publications, and subscribers receive these changes.
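One prerequisite before creating any publications: the publisher must run with `wal_level = logical`. A quick check and change (assuming superuser access; the change requires a server restart to take effect):

```sql
-- Check the current WAL level; it must be 'logical' for logical replication.
SHOW wal_level;

-- If it is 'replica' (the default), raise it. Requires superuser
-- privileges and a server restart to take effect.
ALTER SYSTEM SET wal_level = logical;
```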

To create a publication, you specify the tables and types of changes to replicate. For instance:

CREATE PUBLICATION my_pub FOR TABLE my_table WITH (publish = 'insert, update');

Subscribers express interest in specific publications:

CREATE SUBSCRIPTION my_sub
  CONNECTION 'dbname=remote_db host=remote_host user=replication_user password=secret'
  PUBLICATION my_pub;

PostgreSQL then streams changes (e.g., INSERT, UPDATE) from the publisher to subscribers, facilitating real-time data replication. Managing this process requires careful consideration of data integrity and schema changes.
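Once the subscription is active, you can verify that changes are actually flowing using the built-in statistics views (the first query runs on the subscriber, the second on the publisher):

```sql
-- On the subscriber: per-subscription worker status and last received LSN.
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;

-- On the publisher: connected walsenders and their replication state.
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;
```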

Ideally, the two commands above would be enough to build fully fledged data replication for your instance. I wish that were the case :)

As soon as you dive into this, you realize that you need far more than PostgreSQL can provide out of the box. Let’s walk through some of it:

  • Monitoring, because you want to know what is happening right now with your replication.
  • Alerting, because you want to react to incidents immediately as they occur.
  • Visibility, because once you have more than two databases you may have many logical replication pipelines set up, and it gets incredibly hard to keep an eye on all of them.
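To get even basic alerting yourself, you would need to poll something like replication-slot lag on the publisher and notify someone when it keeps growing. A minimal sketch of such a check:

```sql
-- On the publisher: bytes of WAL retained for each logical slot that the
-- subscriber has not yet confirmed. Alert when this value keeps growing.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

You would then need to run this on a schedule, wire it into an alerting system, and repeat it for every publisher. That is exactly the kind of plumbing a dedicated tool can take off your hands.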

That’s why you need a tool to solve all of these and even more. That’s one of the reasons why we created DataBrew.

We wanted to give developers a way to deploy, observe, and control their data pipelines.

Replication with DataBrew

DataBrew — Next-generation data platform

DataBrew is a cloud-based data platform that lets you work with different data sources: combine them, stream them, and merge them. It currently supports Change Data Capture (CDC) for PostgreSQL and MySQL.

Setting up replication for PostgreSQL (and even MySQL) with DataBrew is easy.

First, create an account at https://databrew.tech and verify your email.

DataBrew provides a free tier for all new accounts, so you can experiment with data replication for free, or even stay on the free tier for as long as you want.

After logging in, you will be able to create PostgreSQL and MySQL services.

The drivers are compatible with each other, which means you can copy data from PostgreSQL tables into MySQL and vice versa.

Register your service with DataBrew. Step 1

Register your service with DataBrew. Step 2

Register your service with DataBrew. Step 3

After you create your services (source and target databases), it’s time to create DataFlows.

In DataBrew’s vocabulary, a DataFlow is a connection between two services: essentially, a data replication pipeline. It can be in several states, such as starting, creating, and stopping.

To create a new DataFlow, simply open the service page and press the “+” button to select the other service and the direction of the DataFlow.

Once the DataFlow is created, you can start it to begin replicating data.

Press the “Play” button to start your DataFlow

When you run logical replication on DataBrew, we take care of everything for you. You no longer have to write code to maintain your replication.

You can provide us with a webhook URL, and DataBrew will send updates to your system as soon as events happen, such as a DataFlow failure, start, or stop.

If you have to manage many replications, meaning you have many DataFlows, you get a real-time visualization of the state of your system.

There are a lot of features coming to DataBrew in the following months, so make sure you create an account or follow us on social media to stay updated.

Soon we are going to release advanced data flow transformations, integration with blockchain data streams, and more.

Thanks for reading the article! We hope we’ve sparked your interest in giving DataBrew a shot.

Useful links

  1. Website: https://databrew.tech
  2. DataBrew Documentation — https://docs.databrew.tech/
  3. Twitter: https://twitter.com/@usedatabrew
  4. LinkedIn: https://www.linkedin.com/company/databrewinc/
  5. Email: contact@databrew.tech
