DEV Community

leo

openGauss Primary/Standby Shared Storage

Primary/Standby Shared Storage
Feature Introduction
This feature lets the primary and standby nodes share a single copy of storage, enabling an HA deployment based on primary/standby shared storage. Optionally, OCK RDMA can be used to improve the real-time consistent read capability of the standby node.

Figure 1: Architecture of primary/standby shared storage

Customer Value
Traditional HA deployment doubles the storage footprint of a stand-alone deployment; shared storage eliminates this duplication, reducing storage capacity requirements and saving disk array equipment. OCK RDMA can optionally be used to improve the real-time consistent read capability of the standby node.

Feature Description
Shared storage between the primary and standby nodes is built on two self-developed common components:

Distributed Storage Service (DSS)
DSS is an independent process that directly manages raw disk array devices and exposes capabilities similar to a distributed file system. Through shared memory and a client API dynamic library, it provides the database with the ability to create, delete, extend, shrink, read, and write files.
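As a rough illustration of the file-operation surface DSS exposes, here is a toy in-memory model. All class and method names below are hypothetical; the real DSS client is a C dynamic library reached through shared memory, not a Python object:

```python
# Toy in-memory model of a DSS-style file API (hypothetical names,
# not the real openGauss DSS client interface).
class DssVolume:
    def __init__(self):
        self.files = {}          # filename -> bytearray

    def create(self, name, size=0):
        self.files[name] = bytearray(size)

    def delete(self, name):
        del self.files[name]

    def resize(self, name, new_size):          # extend or shrink a file
        data = self.files[name]
        if new_size >= len(data):
            data.extend(b"\x00" * (new_size - len(data)))
        else:
            del data[new_size:]

    def write(self, name, offset, payload):
        self.files[name][offset:offset + len(payload)] = payload

    def read(self, name, offset, length):
        return bytes(self.files[name][offset:offset + length])


vol = DssVolume()
vol.create("pg_control", 16)
vol.write("pg_control", 0, b"openGauss")
print(vol.read("pg_control", 0, 9))   # b'openGauss'
vol.resize("pg_control", 4)           # shrink the file
print(len(vol.files["pg_control"]))   # 4
```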

Distributed Memory Service (DMS)
DMS is a dynamic library integrated into the database. It transmits page content over a TCP or RDMA network, unifies primary and standby memory into a pooled resource, and thereby enables the real-time consistent read function of the standby node.

Shared storage uses the DSS component to share storage between the primary and standby nodes. Compared with a traditional deployment, a shared-storage deployment based on a disk array divides database directories into three types: exclusive to an instance and not shared, exclusive to an instance but placed on shared storage, and shared by all instances. Directories that need to be shared must reside on the disk array device, while unshared directories reside on local disks. In addition, when setting up a standby node, only its own directories need to be created; the directory structure shared by all instances does not need to be created again. GUC parameters related to primary/standby shared storage have been added, and system table storage has been switched from page mode to segment-page mode.

Shared storage uses the DMS component to exchange pages between the primary and standby in real time, giving the standby real-time consistency: once a transaction commits on the primary, its results can be read immediately on the standby, with no delayed reads (at the Read Committed transaction isolation level).
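A toy contrast (all names hypothetical) between conventional WAL-shipping replication, where a standby read can lag the primary's commit until replay catches up, and shared-storage page fetching, where the committed page is visible immediately:

```python
# Hypothetical model contrasting WAL-replay replication (reads can lag
# the primary's commit) with shared-storage page fetching (the committed
# page is fetched on demand, so it is visible immediately).
class ReplayStandby:
    def __init__(self):
        self.applied = {}          # page_id -> value, after replay
        self.pending_wal = []      # received but not yet replayed

    def receive_wal(self, page_id, value):
        self.pending_wal.append((page_id, value))

    def replay_all(self):
        for page_id, value in self.pending_wal:
            self.applied[page_id] = value
        self.pending_wal.clear()

    def read(self, page_id):
        return self.applied.get(page_id)


class SharedStorageStandby:
    def __init__(self, primary_pages):
        self.primary_pages = primary_pages   # fetched on demand

    def read(self, page_id):
        return self.primary_pages.get(page_id)


primary_pages = {}
replay = ReplayStandby()
shared = SharedStorageStandby(primary_pages)

# The primary commits a transaction touching page 7.
primary_pages[7] = "v2"
replay.receive_wal(7, "v2")        # WAL sent but not yet replayed

print(replay.read(7))   # None -> delayed read under log shipping
print(shared.read(7))   # 'v2' -> immediately visible after commit
```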

Shared storage can use OCK RDMA to reduce the latency of DMS primary/standby page exchange. Compared with the standby's consistent-read latency over TCP, enabling OCK RDMA reduces that latency by at least 20%.

Feature Constraints
The primary/standby shared storage solution depends on a disk array device. The disk array's LUNs must support the SCSI-3 PR (Persistent Reservation) protocol, including PR OUT ("PERSISTENT RESERVE OUT"), PR IN ("PERSISTENT RESERVE IN"), and INQUIRY, which is used to implement cluster IO fencing. They must also support the SCSI-3 CAW (COMPARE AND WRITE) command, which is used to implement the shared disk lock. The Dorado 5000 V3 disk array is an example of a qualifying device.
The HA deployment based on primary/standby shared storage supports only the one-primary-one-standby and one-primary-two-standby scenarios. Other scenarios are untested trial configurations with no support commitment.
Because primary/standby shared storage relies on distributed-file-system-like functionality to provide real-time consistent reads on the standby, changes to file metadata must be kept to a minimum. For performance reasons, only segment-page tables are supported.
Only primary/standby deployment on the same disk array device is supported; disaster recovery deployment is not supported, and mixed deployments (for example, the primary and standby on different disk array devices) are not supported.
RDMA acceleration of primary/standby page exchange depends on the CX5 network card and the OCK RDMA dynamic library.
Standby rebuild, node replacement, node repair, and similar capabilities are not currently supported.
Upgrading from a traditional HA deployment to a deployment based on primary/standby shared storage is not supported.
The gs_xlogdump_xid, gs_xlogdump_lsn, gs_xlogdump_tablepath, gs_xlogdump_parsepage_tablepath, pg_create_logical_replication_slot, gs_verify_and_tryrepair_page, gs_repair_page, and gs_repair_file functions are not supported in shared storage mode.
The publication/subscription features (T_CreatePublicationStmt, T_AlterPublicationStmt, T_CreateSubscriptionStmt, T_AlterSubscriptionStmt, T_DropSubscriptionStmt) are not supported in shared storage mode.
Global temporary tables are not supported in shared storage mode.
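The shared disk lock mentioned in the constraints, built on the SCSI-3 COMPARE AND WRITE (CAW) command, can be sketched with a toy model. CAW atomically compares a block on the LUN against an expected image and writes a new image only if they match; the Python names below are hypothetical, and the dict merely stands in for a LUN block:

```python
# Toy model of a shared disk lock built on SCSI-3 COMPARE AND WRITE
# (hypothetical names; real CAW operates on a LUN block, not a dict).
UNLOCKED = 0

def compare_and_write(disk, addr, expected, new):
    """Atomic CAW: write `new` only if the current image equals `expected`."""
    if disk.get(addr, UNLOCKED) == expected:
        disk[addr] = new
        return True
    return False

def try_lock(disk, addr, owner_id):
    # Acquire the lock by atomically replacing UNLOCKED with our id.
    return compare_and_write(disk, addr, UNLOCKED, owner_id)

def unlock(disk, addr, owner_id):
    # Only the current owner's image matches, so only it can release.
    return compare_and_write(disk, addr, owner_id, UNLOCKED)


lun = {}                     # stands in for the shared LUN
print(try_lock(lun, 0, 1))   # True  -> node 1 acquires the lock
print(try_lock(lun, 0, 2))   # False -> node 2 sees it held
unlock(lun, 0, 1)
print(try_lock(lun, 0, 2))   # True  -> node 2 acquires after release
```

Because the compare and the write happen as one atomic disk command, two nodes racing for the lock cannot both succeed, which is exactly the property the cluster needs from the disk array.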

Correspondingly, the installation tooling has also added support for shared storage scenarios.
