DEV Community

Cover image for 🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together
adriens
adriens

Posted on

🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together

❔ About

So you're both a - maybe early - PostgreSQL fan... and a recent duckdb adopter.
You like both databases for their strengths and ecosystems... and wonder how it would be

possible to seamlessly send data from/to each other databases... without having to code anything, I mean nothing more that playing sql shell commands in a terminal

: no Python, no Java,...

👉 Well this is exactly what I will talk about in this post thanks to :

🤔 ... but why this post ?

The 3 main reasons of this article and why to pull/put from PostgreSQL/duckdb at this point are :

DuckDB PostgreSQL
Database Model Columnar database Relational database
License MIT BSD like
Serverless Yes No

🎯 What we'll do

We will, only from terminal :

  1. 🐋 Install & boot a containerized PostgreSQL database (with Podman)
  2. 🐘 Create a database
  3. 🔁 Create and feed a little table
  4. 🪄 Read the psql table from duckdb
  5. 🗜️ Export the psql table to a parquet file
  6. 🔬 Inspect parquet file with parquet-cli

Also we will do the reverse move :

  1. Create a table in PostgreSQL from within duckdb
  2. Test table contents from sql

🍿 Demo

📜 shell scripts

Install & boot a PostgreSQL instance:

export PGPASSWORD=mysecretpassword

# Boot a postgresql instance
podman run --name postgres -e POSTGRES_PASSWORD=$PGPASSWORD -d\
    -p 5432:5432 docker.io/library/postgres

# Check container status
podman ps -a
Enter fullscreen mode Exit fullscreen mode

Install psql so we can reach PostgrSQL from outside the contenair:

# Install `psql` on the host so the database can be accessed
# from outside de container
sudo apt install -y postgresql-client
Enter fullscreen mode Exit fullscreen mode

Now, create some PostgreSQL objects:

# Create a demo database
psql -h localhost -p 5432 -U postgres -c "CREATE DATABASE demo;"

# Create a table
psql -h localhost -p 5432 -U postgres -d demo\
    -c "CREATE table customers(id varchar primary key);"

# Feed the PostgreSQL table with some data
psql -h localhost -p 5432 -U postgres -d demo\
    -c "insert into customers values \
    ('Duffy duck'),\
    ('Daisy Duck'),\
    ('Donald Duck'),\
    ('Ludwig Von Drake');"
Enter fullscreen mode Exit fullscreen mode

Install duckdb :

# (Quick and dirty) duckdb install
wget https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_cli-linux-amd64.zip
unzip duckdb_cli-linux-amd64.zip
cp duckdb /usr/bin/
rm duckdb duckdb_cli-linux-amd64.zip
Enter fullscreen mode Exit fullscreen mode

Now do the fun stuff...

Reach PostgreSQL database from duckdb

Let's reach postgres database from duckdb :

duckdb -c "ATTACH 'dbname=demo user=postgres password=mysecretpassword host=127.0.0.1'\
    AS db (TYPE POSTGRES, READ_ONLY);
show all tables;
select * from db.customers;
COPY db.customers TO 'db.customers.parquet' (FORMAT PARQUET);"
Enter fullscreen mode Exit fullscreen mode

... then check the output parquet file:

ls -ltr
file db.customers.parquet
Enter fullscreen mode Exit fullscreen mode

... and read the resulting parquet file from duckdb :

duckdb -c "select * from 'db.customers.parquet';"
Enter fullscreen mode Exit fullscreen mode

Test resulting parquet file with parquet-cli :

pip install parquet-cli
parq -h

parq db.customers.parquet --count
parq db.customers.parquet --head
parq db.customers.parquet --tail
Enter fullscreen mode Exit fullscreen mode

duckdb ➡️ PostgreSQL

Let's:

  1. "Attach" the remote PostgreSQL instance from duckdb runtime
  2. Create a table
  3. Feed the table
  4. Select table contents from psql
duckdb -c "ATTACH 'dbname=demo user=postgres password=mysecretpassword host=127.0.0.1'\
    AS db (TYPE POSTGRES);\
    create table db.heroes(name varchar primary key);\
    insert into db.heroes values\
        ('Dumbo'),\
        ('Man-Elephant'),\
        ('Tantra'),\
        ('Elephant Man'),\
        ('The Elephantmen'),\
        ('Mammomax') ;
"

psql -h localhost -p 5432 -U postgres -d demo\
    -c "select * from heroes;"
Enter fullscreen mode Exit fullscreen mode

⚖️ More about DuckDB vs PostgreSQL

See below this very synthetic breakdown from influxdata:

Image description

Top comments (2)

Collapse
 
adriens profile image
adriens

Collapse
 
adriens profile image
adriens