DEV Community

ScaleGrid for ScaleGrid

Posted on • Updated on • Originally published at scalegrid.io

Managing High Availability in PostgreSQL – Part I

Managing High Availability in PostgreSQL – Part I - ScaleGrid Blog

Managing high availability in your PostgreSQL hosting is very important to ensuring your clusters maintain exceptional uptime and strong operational performance so your data is always available to your application. In an earlier blog post, we introduced you to configuring high availability for PostgreSQL using streaming replication, and now we're going to show you how to best manage PostgreSQL high availability.

There are multiple tools available for managing the high availability of your PostgreSQL clusters using streaming replication. These solutions offer automatic failover capabilities, monitoring, replication, and other useful administrative tasks. Some of the prominent open source solutions include:

  1. PostgreSQL Automatic Failover by ClusterLabs

  2. Replication Manager for PostgreSQL Clusters by repmgr (2ndQuadrant)

  3. Patroni by Zalando

Each of these tools provides their own way of managing the clusters. In our three-part series of posts on high availability for PostgreSQL, we'll share an overview, the prerequisites, and the working and test results for each of these three tools. Here in Part 1, we'll deep dive into the PostgreSQL Automatic Failover (PAF) solution by ClusterLabs.

PostgreSQL Automatic Failover

PostgreSQL Automatic Failover (PAF) is a high availability management solution for PostgreSQL by ClusterLabs. PAF makes use of the popular, industry-standard Pacemaker and Corosync stack. With Pacemaker and Corosync together, you'll be able to detect failures in the system and act accordingly.

Pacemaker is capable of managing many resources, and does so with the help of their resource agents. Resource agents then have the responsibility of handling a specific resource, how they should behave, and inform Pacemaker of their results.

Your resource agent implementation must comply to the Open Cluster Framework (OCF) specification. This specification defines resource agents’ behavior and implementation of methods like stop, start, promote, demote and interaction with Pacemaker.

PAF is an OCF resource agent for PostgreSQL written in Perl. Once your PostgreSQL cluster is built using internal streaming replication, PAF is able to expose to Pacemaker the current status of the PostgreSQL instance on each node: master, slave, stopped, catching up, etc.

How it Works

PAF communicates with Pacemaker regarding the cluster status and monitors the PostgreSQL functioning. In the event of a failure, it informs Pacemaker, and if there's no chance of the current master being recovered, it will trigger an election between the current standby servers. With the robust Pacemaker in place, PAF will perform management actions like start, stop, monitor, and failover on all the PostgreSQL nodes.

Are There Any Setup Requirements?

  • PAF supports PostgreSQL version 9.3 and higher.
  • PAF is not responsible for PostgreSQL master/standby creation or its setup - you must create and setup streaming replication before using PAF.
  • PAF doesn’t edit any configuration of Postgres. However, it requires users to follow a few prerequisites like:
    • Slave must be configured as hot standby.
    • A recovery template file (default: <postgresql_data_location>/recovery.conf.pcmk) has to be provided with below parameters:
      • standby_mode = on
      • recovery_target_timeline = 'latest'
      • primary_conninfo must have the application_name parameter defined and set to local node name as in Pacemaker.
  • PAF exposes multiple parameters related to the management of a PostgreSQL resource. This can be configured to suit one’s requirement. Below are the parameters:
    • bindir: location of the PostgreSQL binaries (default: /usr/bin)
    • pgdata: location of the PGDATA of your instance (default: /var/lib/pgsql/data)
    • datadir: path to the directory set in data_directory from your postgresql.conf file
    • pghost: the socket directory or IP address to use to connect to the local instance (default: /tmp)
    • pgport: the port to connect to the local instance (default: 5432)
    • recovery_template: the local template that will be copied as the PGDATA/recovery.conf file. This template file must exists on all node (default: $PGDATA/recovery.conf.pcmk)
    • start_opts: Additional arguments given to the Postgres process on startup. See "postgres --help" for available options. Useful when the postgresql.conf file is not in the data directory (PGDATA), eg.: -c config_file=/etc/postgresql/9.3/main/postgresql.conf
    • system_user: the system owner of your instance's process (default: postgres)
    • maxlag: maximum lag allowed on a standby before we set a negative master score on it

PAF Pros

  • PAF provides the user a free hands-on configuration and setup of PostgreSQL.
  • PAF can handle node failures and trigger elections when the master goes down.
  • Quorum behavior can be enforced in PAF.
  • It will provide a complete high availability management solution for the resource, including start, stop, and monitor, and handle network isolation scenarios.
  • It's a distributed solution, which enables the management of any node from another node.

PAF Cons

  • PAF doesn’t detect if a standby is misconfigured with an unknown or non-existent node in recovery configuration. Node will be shown as slave, even if standby is running without connecting to the master/cascading standby node.
  • Requires an extra port (Default 5405) to be opened for the Pacemaker and Corosync components’ communication using UDP.
  • Does not support NAT-based configuration.
  • No pg_rewind support.

High Availability Test Scenarios

We conducted a few tests to determine the capability of the PostgreSQL HA management using PAF. All of these tests were run while the application was running and inserting data to the PostgreSQL database. The application was written using PostgreSQL Java JDBC Driver leveraging the connection failover capability.

Standby Server Tests

Sl. No Test Scenario Observation
1 Kill the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. There was no disruption in writer application.
2 Stop the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. There was no disruption in writer application.
3 Reboot the server Standby server was marked offline initially. Once the server came up after reboot, PostgreSQL was started by Pacemaker and the server was marked as online. If fencing was enabled, the node wouldn’t have been added automatically to cluster. There was no disruption in writer application.
4 Stop the Pacemaker process It will stop the PostgreSQL process also, and the server will be marked offline. There was no disruption in writer application.

Master/Primary Server Tests

Sl. No Test Scenario Observation
1 Kill the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. Primary was recovered within the threshold time and, hence, election was not triggered. The writer application was down for about 26 seconds.
2 Stop the PostgreSQL process Pacemaker brought the PostgreSQL process back to running state. Primary was recovered within the threshold time and, hence, election was not triggered. There was a downtime in writer application for about 26 seconds.
3 Reboot the server Election was triggered by Pacemaker after the threshold time for which master was not available. The most eligible standby server was promoted as the new master. Once the old master came up after reboot, it was added back to the cluster as a standby. If fencing was enabled, the node wouldn’t have been added automatically to cluster. The writer application was down for about 26 seconds.
4 Stop the Pacemaker process It will stop the PostgreSQL process also and server will be marked offline. Election will be triggered and new master will be elected. There was downtime in writer application.

Network Isolation Tests

Sl. No Test Scenario Observation
1 Network isolate the standby server from other servers Corosync traffic was blocked on the standby server. The server was marked offline and PostgreSQL service was turned off due to quorum policy. There was no disruption in the writer application.
2 Network isolate the master server from other servers (split-brain scenario) Corosync traffic was blocked on the master server. PostgreSQL service was turned off and master server was marked offline due to quorum policy. A new master was elected in the majority partition. There was a downtime in the writer application.

Miscellaneous Tests

Sl. No Test Scenario Observation
1 Degrade the cluster by turning off all the standby servers. When all the standby servers went down, PostgreSQL service on master was stopped due to quorum policy. After this test, when all the standby servers was turned on, a new master was elected. There was a downtime in the writer application.
2 Randomly turn off all the servers one after the other, starting with the master, and bring them all back at same time All the servers came up and joined the cluster. New master was elected. There was a downtime in the writer application.

Inference

PostgreSQL Automatic Failover provides several advantages in handling PostgreSQL high availability. PAF uses IP address failover instead of rebooting the standby to connect to the new master during a failover event. This proves advantageous in scenarios where the user does not want to restart the standby nodes. PAF also needs very little manual intervention and manages the overall health of all the resources. The only case where manual intervention is a requirement is in the event of a timeline divergence where the user can elect to use pg_rewind.

In Part 1, we've discussed the capabilities and workings of PostgreSQL Automatic Failover (PAF) by ClusterLabs, and in Part 2, we'll discuss the same high availability aspects using the Replication Manager for PostgreSQL clusters (repmgr) by 2ndQuadrant. Make sure to check back for Part 3, where we'll also cover Patroni by Zalando and compare all three open source solutions to help you determine the best fit for your application.

Top comments (0)