DEV Community

Franck Pachot for YugabyteDB

Posted on

PGLOADBALANCEHOSTS

With YugabyteDB you can scale out your PostgreSQL application, which can connect to any node for consistent reads and writes. Here is an example using the basic load balancing provided by the PostgreSQL client library. I'll run PgBench with connections to multiple nodes and without an additional proxy or driver.

For this example, I've installed the latest client, in beta, as in the previous post, and I will connect to a 3 nodes YugabyteDB cluster that I've started, for this lab, in a single VM:

psql (16beta1, server 11.2-YB-2.17.3.0-b0)
Type "help" for help.

yugabyte=# select public_ip, port, region, node_type from yb_servers();

 public_ip  | port | region | node_type
------------+------+--------+-----------
 10.0.0.142 | 5433 | north  | primary
 10.0.0.143 | 5433 | south  | primary
 10.0.0.141 | 5433 | west   | primary

(3 rows)

yugabyte=#
Enter fullscreen mode Exit fullscreen mode

I set the PostgreSQL environment variables to connect to my YugabyteDB cluster:

export PGHOST=10.0.0.141,10.0.0.142,10.0.0.143
export PGPORT=5433
export PGUSER=yugabyte
export PGDATABASE=yugabyte
Enter fullscreen mode Exit fullscreen mode

The list in PGHOST is used for High Availability. The hosts are take in order so that if the first one is down, it can connect to the next one. In PostgreSQL, this is used to connect to read replicas. In YugabyteDB, as you can connect to any node for consistent reads and writes. Note that I could have used a single hostname with multiple IP addresses in the DNS.

PgBench

I create the pgbench tables:


$ pgbench -iIdtp

dropping old tables...
creating tables...
WARNING:  storage parameter fillfactor is unsupported, ignoring
WARNING:  storage parameter fillfactor is unsupported, ignoring
WARNING:  storage parameter fillfactor is unsupported, ignoring
creating primary keys...
done in 3.01 s (drop tables 0.77 s, create tables 0.70 s, primary keys 1.54 s).

$ pgbench -iIGf

generating data (server-side)...
creating foreign keys...
done in 4.17 s (server-side generate 2.24 s, foreign keys 1.94 s).

Enter fullscreen mode Exit fullscreen mode

The warning can be ignored (YugabyteDB has no need for FILLFACTOR as it is not using Heap Tables). For the same reason (tables being stored in their primary key) I create the empty tables and the primary key immediately (tp initialization steps) and then ingest data and create the foreign keys (Gf steps).

host load balancing disabled

I run pgbench workload from 12 clients (-c 12) with the settings above (a list in PGHOST but PGLOADBALANCEHOST is not set and defaults to disable):

$ pgbench -T60 -c 12 -nN

transaction type: <builtin: simple update>
scaling factor: 1
query mode: simple
number of clients: 12
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 19596
number of failed transactions: 0 (0.000%)
latency average = 36.682 ms
initial connection time = 134.319 ms
tps = 327.133784 (without initial connection time)
Enter fullscreen mode Exit fullscreen mode

I can see the all 12 connections connected to 10.0.0.141, the first one in PGHOST:
disable

host load balancing random

I run the same but setting PGLOADBALANCEHOSTS to random:

$ export PGLOADBALANCEHOSTS=random

$ pgbench -T30 -c 12 -nN

pgbench (16beta1, server 11.2-YB-2.17.3.0-b0)
transaction type: <builtin: simple update>
scaling factor: 1
query mode: simple
number of clients: 12
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 9619
number of failed transactions: 0 (0.000%)
latency average = 37.310 ms
initial connection time = 130.484 ms
tps = 321.626355 (without initial connection time)
Enter fullscreen mode Exit fullscreen mode

Now, the connections are randomly distributed:
random

This is a very simple way to balance the connections to a YugabyteDB cluster. Note that, even if all were connected to one node, the data is sharded and balanced with remote read and write operations to the DocDB layer of the tablet servers. However, the SQL processing itself happens in YSQL, the PostgreSQL backend on the node you are connected to. Fur the best scalability, connections should also be balanced.

Host discovery and Smart Driver

Because yb_servers() is updated when nodes are added or removed in the YugabyteDB cluster, you can set PHGOST from it. You can even restrict to a specific cloud, region or zone:

export PGHOST=$(psql -tAc "
 select public_ip from yb_servers()
 where region in ('north','west')
" | paste -sd,)
Enter fullscreen mode Exit fullscreen mode

The same can be done in the application where the host information of the connection pool could be refreshed regularly. This is actually what is done automatically by the YugabyteDB Smart Drivers and is the preferred method for client-side load-balancing.

The PGLOADBALANCEHOSTS=random solution presented here is convenient when there is no smart driver, as it works with any LibPQ client, like PgBench here.

Top comments (0)