DEV Community

Cong Li

GBase Database | GBase 8c 500 Primary-Standby Maintenance: Common Issues and Solutions

The GBase database supports several deployment configurations: standalone, primary-standby, and distributed.

This article provides examples of common issues encountered in primary-standby cluster maintenance. Each case includes environment details, error messages, troubleshooting steps, and solutions. Please exercise caution in production environments, and consult with technical staff when needed.


Issue 1

  • Environment: 500 primary-standby setup. Installation directory, logs, and data directory are all located under /home/gbase, and no VIP is configured in the cluster.
  • Issue: The application reported that it could not connect to the primary node database, with the error message: error: Unsupported or unrecognized SSL message.
  • Troubleshooting: The data node logs contained the error message DETAIL: Permissions should be u=rw (0600) or less. Inspection showed that chmod -R 777 /home had been run as root, which also changed the permissions of the database directory at /home/gbase/database.

  • Solution:
    1) Restore Service: Perform a manual primary-standby switchover to make node 13.3 the primary so that the application can connect to it. The application confirmed normal connectivity.
    2) Fix Permissions:

     find /home/gbase/ -perm 777 -type d -exec chmod 700 {} +
     find /home/gbase/ -perm 777 -type f -exec chmod 600 {} +

    3) Adjust Command Permissions: Restore permissions for commands such as app/bin/gs_om and python to 500, because the previous step also removes their execute bit.
    4) Correct Certificate Permissions: Set permissions for the server.crt and server.key files under the data directory to 400.
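Before and after the fix, it can help to list which files are still more permissive than the 0600 that the data node log asks for. The following is a minimal sketch, assuming GNU find (for the `-perm /mode` test); the function name `list_loose_perms` is hypothetical and not part of the GBase tooling.

```shell
# list_loose_perms DIR
# Prints every regular file under DIR whose mode has any bit set beyond
# u=rw (0600): user execute, or any group/other permission.
# Assumes GNU find, where "-perm /177" matches if ANY of those bits is set.
list_loose_perms() {
    find "$1" -type f -perm /177 -print
}

# Example (path taken from the environment described above):
# list_loose_perms /home/gbase/database
```

An empty result suggests nothing under the directory is looser than 0600. Executables that legitimately need the execute bit will also be listed, so the output is a review list rather than something to pipe straight into chmod.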



Issue 2

  • Requirement: Configure firewall rules for a 500 primary-standby database (default port 15400).
  • Solution:
  systemctl start firewalld
  firewall-cmd --zone=public --add-port=15400/tcp --permanent
  firewall-cmd --zone=public --add-port=15300/tcp --permanent
  firewall-cmd --zone=public --add-port=15301/tcp --permanent
  firewall-cmd --zone=public --add-port=15302/tcp --permanent
  firewall-cmd --zone=public --add-port=15405/tcp --permanent
  systemctl stop firewalld
  systemctl start firewalld

  firewall-cmd --zone=public --query-port=15400/tcp
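The repeated `--add-port` lines above can also be generated from a single port list, which makes the list easier to review before anything is applied. This is a rough sketch only: the helper name `print_firewall_cmds` is hypothetical, the ports are simply the ones used in the commands above, and it reloads with `firewall-cmd --reload` instead of stopping and starting the service.

```shell
# Ports copied from the commands above; adjust to your deployment.
PORTS="15400 15300 15301 15302 15405"

# print_firewall_cmds ZONE PORT...
# Dry run: prints the firewall-cmd invocations instead of executing them,
# so the rule set can be checked first.
print_firewall_cmds() {
    zone="$1"
    shift
    for port in "$@"; do
        echo "firewall-cmd --zone=$zone --add-port=$port/tcp --permanent"
    done
    # --reload loads the permanent rules into the running firewall.
    echo "firewall-cmd --reload"
}

# Review the output, then pipe it to "sh" as root to apply it:
# print_firewall_cmds public $PORTS
```

Printing the commands rather than running them keeps the sketch safe to try on a machine where firewalld is not installed.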

Issue 3

  • Environment: 500 primary-standby database.
  • Issue: Changing the hostname or IP address caused the database to throw an error: xxxx list index out of range.
  • Solution:

1) Locate the Configuration XML:

find / -name '*.xml'

2) Replace Hostname in Configuration:

sed -i 's/old_hostname/new_hostname/g' /home/opt/gbase_package/cluster_config.xml

3) Reload Configuration with gs_om:

  • Update hostname:
gs_om -t generateconf -X /home/opt/gbase_package/cluster_config.xml --distribute
  • Update IP and Port:
gs_om -t generateconf --old-values=2345,192.x.x.100 --new-values=15400,192.x.x.100 --distribute

4) Restart Cluster:

gs_om -t stop; gs_om -t start
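Because the hostname replacement in step 2 edits the cluster configuration in place, it may be worth keeping a backup and confirming that no occurrence of the old name remains before running gs_om. A minimal sketch under those assumptions follows; the function name `replace_hostname` is hypothetical, and it assumes the hostnames contain no characters special to sed (such as `/` or `&`) and that the new name does not itself contain the old one.

```shell
# replace_hostname FILE OLD NEW
# Saves the original as FILE.bak, replaces every OLD with NEW in FILE,
# then prints the number of lines that still contain OLD
# (0 means the replacement is complete).
replace_hostname() {
    file="$1"
    old="$2"
    new="$3"
    sed -i.bak "s/$old/$new/g" "$file"
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c "$old" "$file" || true
}

# Example (path taken from step 2):
# replace_hostname /home/opt/gbase_package/cluster_config.xml old_hostname new_hostname
```

If the printed count is not 0, the .bak copy can be restored and the substitution checked by hand before the configuration is distributed.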

This guide offers practical solutions for common issues encountered during primary-standby maintenance of the GBase database. Following these steps carefully can help keep operations smooth and troubleshooting effective.
