Franck Pachot for YugabyteDB Distributed PostgreSQL Database

Posted on Dec 22, 2022 • Edited on May 18, 2023

Install extensions from PGDG repo to YugabyteDB - example with sequential_uuids

#yugabytedb #postgres #extension #compatibility

In the previous post, I installed a PostgreSQL extension to YugabyteDB by building it. Some extensions are available in the PostgreSQL Global Development Group YUM repository. Here is how to get it from there into a YugabyteDB.

I'm using YugabyteDB 2.17, which is compatible with PostgreSQL 11.2, on Centos7. Here are the available .rpm:
https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/

For Centos8, pleas look at the next post:

YugabyteDB Distributed PostgreSQL Database

Install extensions from PGDG repo to YugabyteDB - example with sequential_uuids

Franck Pachot for YugabyteDB Distributed PostgreSQL Database ・ Dec 22 '22

#yugabytedb #postgres #extension #compatibility

Example with sequential_uuids

Vlad Mihalcea asked if we have a Time-Sorted Unique Identifier in YugabyteDB. I answered that we probably don't need it but that there is probably a PostgreSQL extension that works with YugabyteDB:
https://twitter.com/FranckPachot/status/1600743186326245376

And, yes, there is, available from the YUM repository. I'll take this as an example.

I do all that in a temporary directory to avoid overwriting existing files

Download the RPM

I can download the .rpm from the online repo:

yum install -y wget
wget -c https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/sequential_uuids_11-1.0.2-1.rhel7.x86_64.rpm

or though YUM after installing the repo:

yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install --downloadonly -y --downloaddir=$PWD sequential_uuids_11

Dependencies

If downloaded directly, I can check the dependencies with rpm -qRp:

If downloaded with yum, the dependencies have been downloaded. I can also I can get them with yum -q deplist and they were displayed when downloading:

Note that I don't need the postgresql11 ones because I'll use YugabyteDB instead.

Extract the extension to YugabyteDB

My YugabyteDB installation dir is /home/yugabyte and the extension files (share/extensions/*.{control,sql} and libs/*.so) must go into /home/yugabyte/postgres.

Because the RPM installs it in /usr/pgsql-11, I create a symbolic link in my temporary directory and them with a relative path:

mkdir -p ./usr &&
ln -Ts /home/yugabyte/postgres ./usr/pgsql-11 &&
rpm2cpio ./sequential_uuids_11-*.rpm | 
 cpio -idmv --no-absolute-filenames

Using -v, it displays the extracted files:

Thanks to the symbolic link, they are in the YugabyteDB postgres directory.

The temporary directory can be removed.

Testing the extension

You must install the extension on all nodes of your YugabyteDB server.

Then, when connected to any node, load the extension and play with it:

yugabyte=# create extension sequential_uuids;
CREATE EXTENSION

yugabyte=# create table demo ( id uuid, primary key(id asc) );

yugabyte=# insert into demo
             select uuid_time_nextval(5,x'ffff'::int)
             from generate_series(1,5)
            returning *;

yugabyte=# \watch
Thu 22 Dec 2022 08:24:15 AM UTC (every 2s)

                  id
--------------------------------------
 9da30789-478e-4abd-857e-92c60dfbd375
 9da3dd77-0f14-46ae-90ce-5ffb40531bae
 9da34a5d-f400-4496-a0f4-7db59c8790cb
 9da3f9c6-acb9-4f10-a9e8-498767290ca6
 9da35df5-1698-4371-88cb-077d10c678db
(5 rows)

Thu 22 Dec 2022 08:24:17 AM UTC (every 2s)

                  id
--------------------------------------
 9da3b4a3-448c-4f5f-b70c-d420175fc335
 9da39dac-494a-458a-a797-61d6c30a53ca
 9da30ae4-ecbe-4ab6-a795-3c56b1fe46ba
 9da3e66a-510e-45bd-a8b7-f1afaa84865c
 9da32dca-4a4d-48fb-ba20-b4c2f367fed2
(5 rows)

Thu 22 Dec 2022 08:24:19 AM UTC (every 2s)

                  id
--------------------------------------
 9da31ec6-78aa-493e-b0a9-382a5d92a54c
 9da3b406-1968-4b43-8fd7-a575e051d4f7
 9da34ea9-b6cd-4d17-8c88-f080ecf89cdd
 9da3f783-8cff-42bc-9ee9-98d264780718
 9da35c39-0c86-42fa-a33f-e66e47a11068
(5 rows)

Thu 22 Dec 2022 08:24:21 AM UTC (every 2s)

                  id
--------------------------------------
 9da4f46a-e427-45f2-a313-42f0fba948a2
 9da48fdd-fa08-42ac-9e5f-3413a4f5317d
 9da494bc-ec6b-493b-ac8f-d2f83a8f7be3
 9da4561f-568e-4713-b87e-2af989377c09
 9da477ac-02f8-46d9-b84b-3a8f54b31d98
(5 rows)

Thu 22 Dec 2022 08:24:23 AM UTC (every 2s)

You see that the first 4 bytes (because 0xffff, the default, and the maximum) are the same within a 5 second interval (the first parameter which I've set to 5 instead of the default 60 seconds) with the others random.

Note that I have defined the primary key sharding as ASC rather than the default HASH because, if the goal is to colocate the rows inserted at the same time, you don't want to apply hash sharding.

I also verify that from concurrent sessions the prefix is the same:

for i in {1..10}
do
 ysqlsh -tAq -h $(hostname) -c 'insert into demo select uuid_time_nextval() returning *,now()'&
done

Validating the compatibility with PostgreSQL

The best way to compare the behavior in YugabyteDB is to run the PostgreSQL regression test and compare with the expected output:


yum install -y git
git clone https://github.com/tvondra/sequential-uuids.git


ysqlsh -h $(hostname) --echo-all --quiet         \
 --file sequential-uuids/test/sql/uuids.sql      |
  sdiff sequential-uuids/test/expected/uuids.out -

The only difference is because the regression test has no ORDER BY and the order of rows from a hash partitioned table (the default) in a distributed database is different than from the single-node heap table:

But the result is the same.

The full PostgreSQL-compatibility of the extension is confirmed by the regression tests.

About Sequential UUID in YugabyteDB

⚠️ You must think about the consequence in a Distributed SQL database before using a time-based UUID. Thanks to the LSM-Tree storage, YugabyteDB doesn't have the problems that sequential_uuids tries to solve (WAL write amplification, B-Tree fragmentation and clustering factor). If you want a UUID, then the one from pgcrypto (already installed in YugabyteDB) gen_random_uuid() is probably the right one.

In addition to that, having concurrent sessions touching the same key range will create a hotspot on one tablet. However, if there are not too many concurrent sessions, and the load is distributed with other tables, then maybe this UUID can benefit from colocation in DocDB and filesystem caches. But a cached sequence may be better as it keeps rows together from the same session, but distributes those from concurrent sessions.

Anyway, I've run the above for a while, with more rows inserted (from generate_series(1,100000)) and checked that tablet auto-split occurs:

As you see, one is still at Split Depth 1 because no Sequential UUID went into this range, and another has been split 7 times. So YugabyteDB still balanced the storage.

Summary on the extension installation

If you think a PostgreSQL extension is necessary for your application, you can use this example to test it. The same can be used to deploy it. But to be sure that it is supported, do not hesitate to open a git issue of ask in the slack channel.

DEV Community

Install extensions from PGDG repo to YugabyteDB - example with sequential_uuids

Install extensions from PGDG repo to YugabyteDB - example with sequential_uuids

Franck Pachot for YugabyteDB Distributed PostgreSQL Database ・ Dec 22 '22

Example with sequential_uuids

Download the RPM

Dependencies

Extract the extension to YugabyteDB

Testing the extension

Validating the compatibility with PostgreSQL

About Sequential UUID in YugabyteDB

Summary on the extension installation

Top comments (0)

Read next

Part 2: Implementing Vector Search with OpenAI

Deploying a Rails API-Only App with Postgres using Kamal 2

VACUUM In Postgres Demystified

Database Setup, JWT Implementation & API