Among the GBase database (GBase数据库) product series, GBase 8c is a multi-model, distributed transactional database. This article provides operational guidance for users and developers and is a reference guide for GBase 8c system tuning.
System tuning involves optimizing both the operating system and database levels to make better use of machine resources like CPU, memory, I/O, and network, avoid resource conflicts, and enhance the overall system's query throughput. The following sections will cover tuning at the operating system and database levels.
1. Operating System Parameter Tuning
In performance tuning, you can adjust critical operating system (OS) configuration parameters based on actual business needs to improve GBase 8c database performance.
Prerequisites
Use gs_check to verify whether the OS parameters match the recommended values. If they differ, modify them manually based on business requirements.
Memory-related Parameter Settings
Edit the sysctl.conf file to adjust memory parameters, such as setting vm.extfrag_threshold to the recommended value of 1000. If a memory parameter is not present in the file, add it manually.
vim /etc/sysctl.conf
After making changes, apply them with the following command:
sysctl -p
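As a concrete sketch of the steps above, the entry can also be appended and applied from the command line (run as root; the value is the recommendation from this guide, so verify it suits your workload):

```shell
# Append the memory tuning entry to /etc/sysctl.conf, then apply it.
echo "vm.extfrag_threshold = 1000" >> /etc/sysctl.conf
sysctl -p
```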
Network-related Parameter Settings
Configure network-related parameters in the sysctl.conf file. If these parameters are not present, add them manually.
vim /etc/sysctl.conf
After making changes, apply them with:
sysctl -p
Refer to the table below for detailed parameter descriptions.
Parameter Name | Reference Value | Description |
---|---|---|
net.ipv4.tcp_timestamps | 1 | Enables quick recovery of TIME-WAIT sockets in TCP connections. Default is 0 (disabled); 1 enables it. |
net.ipv4.tcp_mem | 94500000 915000000 927000000 | 1. When the pages used by TCP are below the first value (94500000), the kernel does not intervene. 2. When the pages used by TCP exceed the second value (915000000), the kernel enters "memory pressure" mode. 3. When the pages used by TCP exceed the third value (927000000), an "out of socket memory" alert is triggered. |
net.ipv4.tcp_max_orphans | 3276800 | Maximum number of orphan sockets (unattached sockets). |
net.ipv4.tcp_fin_timeout | 60 | Default TIMEOUT value for the system. |
net.ipv4.ip_local_port_range | 26000 65535 | Range of ports available for use by TCP and UDP. |
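Taken together, the entries in the table above would be appended to /etc/sysctl.conf as follows (a sketch using the reference values; adjust the port range and memory thresholds for your environment, and run as root):

```shell
# Append the network tuning entries from the table to /etc/sysctl.conf,
# then apply them.
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_fin_timeout = 60
net.ipv4.ip_local_port_range = 26000 65535
EOF
sysctl -p
```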
Set the maximum transmission unit (MTU) for a 10GE network card using the ifconfig
command. It is recommended to set the MTU for a 10GE network card to 8192 to improve network bandwidth utilization. Example:
- ethx is the network card used for internal database communication over 10GE.
- The first command sets the MTU; the second verifies that the new MTU took effect.
- Root privileges are required to change this setting.
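A sketch of the two commands described above, assuming the 10GE card is named ethx (substitute your actual interface name, and run as root):

```shell
# Set the MTU on the 10GE card to 8192, then display the card's
# configuration to confirm the new MTU.
ifconfig ethx mtu 8192
ifconfig ethx
```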
Set the receive (rx) and transmit (tx) queue lengths for a 10GE network card using the ethtool
tool. It is recommended to set the queue length to 4096 for a 10GE network card to improve network bandwidth utilization.
Example:
- ethx is the network card used for internal database communication over 10GE.
- The first command sets the card's receive and transmit queue lengths; the second verifies that the setting took effect.
- Root privileges are required to change this setting.
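A sketch of the corresponding ethtool invocations, again assuming the interface name ethx (run as root):

```shell
# Set the receive (rx) and transmit (tx) ring sizes to 4096, then print
# the current ring settings to confirm.
ethtool -G ethx rx 4096 tx 4096
ethtool -g ethx
```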
I/O-related Parameter Settings
Set the hugepage attributes by disabling transparent hugepages with the following commands:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Then, reboot to apply the changes:
reboot
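Note that values written under /sys do not persist across reboots on their own. One common approach, sketched here under the assumption that your distribution runs /etc/rc.d/rc.local at boot, is to reapply the settings from a startup script:

```shell
# Reapply the transparent hugepage settings at every boot (run as root).
cat >> /etc/rc.d/rc.local <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
EOF
chmod +x /etc/rc.d/rc.local
```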
2. Database System Parameter Tuning
To ensure optimal database performance, configure GUC parameters based on hardware resources and business requirements.
2.1 Database Memory Parameter Tuning
Complex database queries heavily depend on memory configuration parameters, including logical memory management and parameters for spilling operations to disk.
Logical Memory Management Parameters
The max_process_memory parameter controls the peak memory available to a database node. The formula for setting this parameter is as follows:
Effective Memory = max_process_memory - shared memory (including shared_buffers) - cstore_buffers
Key parameters that influence effective memory are shared_buffers and cstore_buffers.
To monitor memory usage in major memory areas, use the views pg_total_memory_detail (for a single node) and pgxc_total_memory_detail (for the main node).
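As a sketch, the single-node view could be inspected through gsql like this (the database name and port are assumptions; check them against your installation):

```shell
# Inspect per-area memory usage on the current node
# (connection details are illustrative).
gsql -d postgres -p 5432 -c "SELECT * FROM pg_total_memory_detail;"
```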
Set work_mem based on query characteristics and concurrency. When work_mem is insufficient, operators spill data to disk, causing a 5-10x performance drop.
- For complex, single-threaded queries with 5-10 joins, set work_mem = 50% of memory / 10.
- For simple, single-threaded queries with 2-5 joins, set work_mem = 50% of memory / 5.
- For concurrent queries, set work_mem = serial work_mem / concurrency.
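The sizing rules above are simple arithmetic; a sketch with illustrative numbers (the node size and concurrency below are assumptions, not recommendations):

```shell
# Derive work_mem starting points (in MB) from the guidelines above.
total_mem_mb=65536                           # illustrative: a 64 GB node
avail_mb=$((total_mem_mb / 2))               # 50% of memory for query work areas
complex_wm=$((avail_mb / 10))                # complex queries, 5-10 joins
simple_wm=$((avail_mb / 5))                  # simple queries, 2-5 joins
concurrency=16                               # illustrative concurrency level
concurrent_wm=$((complex_wm / concurrency))  # serial work_mem / concurrency
echo "complex=${complex_wm}MB simple=${simple_wm}MB concurrent=${concurrent_wm}MB"
```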
Parameters for Disk Spilling Operators
The work_mem parameter determines whether a spillable operator writes data to disk during execution, based on memory usage. Currently there are six categories of spillable operators (ten kinds across their vectorized and non-vectorized forms): Hash (VecHashJoin), Agg (VecAgg), Sort (VecSort), Material (VecMaterial), SetOp (VecSetOp), and WindowAgg (VecWindowAgg). This parameter is typically set as a trade-off between concurrency throughput and single-query performance, and adjustments should be based on actual execution behavior (with reference to Explain Performance output).
2.2 Database Concurrency Queue Parameter Tuning
The database provides two mechanisms for controlling concurrency queues: global concurrency queue and local concurrency queue.
Global Concurrency Queue
The global concurrency queue uses the GUC parameter max_active_statements to control the number of concurrent tasks on the main database node. This mechanism manages all tasks executed by regular users without distinguishing complexity; each statement is treated as a single execution unit. When the number of concurrent tasks reaches this threshold, new tasks enter a queue to wait. Tasks executed by administrators are exempt from global concurrency control. When setting this GUC parameter, consider system capacity, memory usage, and I/O utilization. If regular users are linked to resource pools with different priority levels, the global concurrency queue uses a two-dimensional queue: tasks are ordered first by priority level and then within the same level, and higher-priority tasks are awakened first.
In transaction-intensive, high-concurrency scenarios, it is recommended to set max_active_statements to -1, which imposes no global concurrency limit.
For analytical query scenarios, set max_active_statements to the number of CPU cores divided by the number of database nodes, typically 5-8.
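A sketch of applying the transactional-scenario recommendation with gs_guc (the cluster-wide options follow common openGauss-style usage and are assumptions; verify them against your deployment):

```shell
# Remove the global concurrency limit for high-concurrency transactional loads.
gs_guc reload -N all -I all -c "max_active_statements=-1"
```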
Local Concurrency Queue
The local concurrency control mechanism, implemented via resource pools, limits the number of concurrent tasks within the same resource pool on the main database node. It restricts the concurrency of complex queries based on each task's cost; the parctl_min_cost parameter determines whether a task is considered complex.
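Local concurrency is configured per resource pool. A hedged sketch, assuming gsql connection details and using illustrative pool and user names:

```shell
# Create a resource pool that allows at most 4 concurrent complex statements,
# then attach a user to it (pool and user names are illustrative).
gsql -d postgres -p 5432 -c \
  "CREATE RESOURCE POOL batch_pool WITH (active_statements = 4);"
gsql -d postgres -p 5432 -c \
  "ALTER USER report_user RESOURCE POOL 'batch_pool';"
```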
2.3 Configuring SMP
This section discusses the limitations and applicable scenarios of the SMP module, along with configuration instructions.
2.3.1 Background Information
SMP (Symmetric Multiprocessing) improves performance through operator parallelization, using additional system resources like CPU, memory, and I/O. Essentially, SMP is a method of trading resources for time. In suitable scenarios with ample resources, it can effectively boost performance; however, in unsuitable scenarios or resource-constrained environments, it may degrade performance. SMP is ideal for analytical queries, characterized by long individual query times and low business concurrency. SMP parallelism can reduce query latency and enhance throughput. However, in transaction-heavy, high-concurrency scenarios, the overhead of multithreading can increase latency and reduce throughput.
Applicable Scenarios
Supported parallel operators include:
- Scan: Sequential scans of row-store regular tables and partitions, and column-store regular tables and partitions.
- Join: HashJoin, NestLoop.
- Agg: HashAgg, SortAgg, PlainAgg, WindowAgg (supports partition by, not order by).
- Stream: Local Redistribute, Local Broadcast.
- Others: Result, Subqueryscan, Unique, Material, Setop, Append, VectoRow.
SMP-specific operators are added for data exchange between parallel threads, such as:
- LocalGather: Aggregates data within parallel threads.
- LocalRedistribute: Redistributes data across threads based on the distribution key.
- LocalBroadcast: Broadcasts data to each thread.
- LocalRoundRobin: Distributes data across threads in a round-robin manner.
Non-applicable Scenarios
Parallel execution is not supported for:
- Index scans.
- MergeJoin.
- WindowAgg with order by.
- Cursor operations.
- Queries within stored procedures or functions.
- Queries with subplan or initplan, or operators containing subqueries.
- Queries with median operations.
- Queries involving global temporary tables.
- Materialized view updates.
2.3.2 Impact of Resources on SMP Performance
SMP architecture uses surplus resources to reduce time, leading to increased consumption of CPU, memory, and I/O. As parallelism increases, resource consumption rises correspondingly. When these resources become bottlenecks, SMP may fail to enhance performance and could even degrade overall performance.
- CPU Resources: If CPU utilization is low, SMP can improve system performance by using CPU resources more effectively. However, if CPU utilization is already high or cores are limited, enabling SMP might not yield significant benefits and could lead to performance degradation due to resource contention.
- Memory Resources: Parallel queries increase memory usage, although each operator's memory remains capped by parameters such as work_mem. For instance, if work_mem is set to 4GB and the parallelism is 2, each parallel thread is allocated 2GB. If work_mem is low or memory resources are constrained, SMP can cause disk spillage and worsen performance.
- I/O Resources: Parallel scans inherently increase I/O consumption, so sufficient I/O resources are necessary for parallelism to improve scanning performance.
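The memory split described above is straightforward division; a tiny sketch with illustrative numbers:

```shell
# Under SMP, each parallel thread gets work_mem / parallelism.
work_mem_mb=4096   # illustrative work_mem of 4 GB
query_dop=2        # illustrative degree of parallelism
per_thread_mb=$((work_mem_mb / query_dop))
echo "${per_thread_mb}MB per parallel thread"
```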
2.3.3 Other Factors Affecting SMP Performance
Apart from resource factors, others, such as partition data skew and system concurrency, may also impact SMP performance.
- Data Skew: When data skew is severe, parallelism may be less effective. For example, if one join column value contains significantly more data than others, a long-tail issue may arise, where one parallel thread processes more data than others, resulting in poorer parallel performance.
- System Concurrency: SMP increases resource usage, which can lead to severe resource contention in high-concurrency environments, lowering overall performance. Thus, enabling SMP in such scenarios may not improve performance and could cause degradation.
2.3.4 SMP Usage Recommendations
Usage Restrictions
To leverage SMP for query performance improvement, the following conditions must be met:
- Ample CPU, memory, I/O, and network bandwidth. SMP is a resource-for-time approach, increasing resource consumption with parallelism. If resources are limited, SMP may degrade performance, and it is advisable to disable it in such cases.
Precautions
- When resources allow, a higher parallelism level yields better performance.
- SMP parallelism supports session-level settings. It’s recommended to enable SMP for specific queries that meet the requirements, then disable it afterward to avoid impacts during peak business hours.
Configuration Steps
1) Check the current system load. If resources are sufficient (resource utilization below 50%), proceed to step 2; otherwise, exit.
2) With query_dop=1 (the default), use explain to generate an execution plan and verify whether the plan matches the conditions outlined in the SMP applicable scenarios. If so, proceed to step 3.
3) Set query_dop=value. This forces a degree of parallelism of either 1 or the specified value, regardless of resource availability and plan characteristics.
4) Before executing queries that meet the requirements, set an appropriate query_dop value, and reset it after execution to avoid impact during peak business hours.
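The steps above can be sketched for a single session as follows (the connection details and the query are illustrative assumptions):

```shell
# Enable SMP for one session, inspect the plan, run the query, then restore
# the default so later sessions are unaffected.
gsql -d postgres -p 5432 <<'EOF'
SET query_dop = 4;                    -- session-level parallelism
EXPLAIN SELECT count(*) FROM sales;   -- sales is an illustrative table
SELECT count(*) FROM sales;
SET query_dop = 1;                    -- restore the default
EOF
```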
2.4 Configuring LLVM
LLVM (Low-Level Virtual Machine) dynamic compilation generates customized machine code for each query to replace generic interpreted functions. It reduces redundant conditional checks and virtual function calls, and improves data locality, enhancing query performance. However, for small data volumes or short-running queries, LLVM may degrade performance because of the extra time spent generating the IR intermediate representation and compiling it to machine code.
2.4.1 LLVM Applicable Scenarios and Limitations
Supported Expressions for LLVM Optimization
LLVM optimizations apply to the following expressions in queries:
- Case…when… expressions
- In expressions
- Boolean expressions (And/Or/Not)
- BooleanTest expressions (IS_NOT_KNOWN/IS_UNKNOWN/IS_TRUE/IS_NOT_TRUE/IS_FALSE/IS_NOT_FALSE)
- NullTest expressions (IS_NOT_NULL/IS_NULL)
- Operator expressions
- Function expressions (e.g., lpad, substring, btrim, rtrim, length)
- Nullif expressions
Supported data types include bool, tinyint, smallint, int, bigint, float4, float8, numeric, date, time, timetz, timestamp, timestamptz, interval, bpchar, varchar, text, and oid. Only the following are considered for LLVM optimization: expressions in the filter of the Scan node in the vectorized execution engine; the complicate hash condition, hash join filter, and hash join target in the HashJoin node; filters in the Nested Loop node; merge join filters and targets in the Merge Join node; and filters in the Group node.
Supported LLVM Operators
- Join: HashJoin
- Agg: HashAgg
- Sort
The HashJoin operator only supports HashInnerJoin, and the corresponding hash condition only supports comparisons of int4, bigint, and bpchar types. The HashAgg operator only supports sum and avg operations on bigint and numeric types; its group by clause only supports int4, bigint, bpchar, text, varchar, and timestamp types, and the count(*) aggregation is also supported. The Sort operator only supports comparison operations on int4, bigint, numeric, bpchar, text, and varchar types. Outside these cases, LLVM dynamic compilation optimization cannot be used; whether it has been applied can be verified with the explain performance tool.
Inapplicable Scenarios
- Queries over small data volumes do not benefit from LLVM dynamic compilation optimization.
- Queries that generate non-vectorized execution paths are not supported.
2.4.2 Other Factors Affecting LLVM Performance
The effectiveness of LLVM optimization depends not only on the database's internal implementation but also on the hardware environment in use.
Number of C-function Calls in Expressions
The database does not implement full code generation for expression calculations; some expressions are code-generated, while others directly call the original C functions. If most expressions fall into the latter category, LLVM dynamic compilation may degrade performance. By setting log_min_messages to DEBUG1, users can see which expressions directly call C code.
Memory Resources
A key aspect of the LLVM feature is ensuring data locality, meaning data should ideally reside in registers. Data loading should be minimized, so it is recommended to set a sufficiently large work_mem
when using LLVM optimization to ensure that the entire execution process for optimized code occurs in memory; otherwise, performance degradation may occur.
Optimizer Cost Estimation
The LLVM feature includes a simple cost estimation model, which determines whether to use LLVM dynamic compilation optimization based on the size of the tables involved in the current node's operation. If the optimizer underestimates the actual number of rows processed, it may miss optimization opportunities that could improve performance. Conversely, overestimation could also lead to inefficiencies.
2.4.3 Recommendations for Using LLVM
LLVM is enabled by default in the database kernel. Users can configure it based on the analysis above, with the following general recommendations:
1) Set an appropriate work_mem
value; where possible, increase work_mem
to avoid excessive disk writes. If excessive disk writes occur, it is recommended to disable LLVM dynamic compilation optimization (set enable_codegen=off
).
2) Set a suitable codegen_cost_threshold
(default value is 10000) to avoid using LLVM dynamic compilation optimization in scenarios with small data volumes. If performance degradation is observed due to LLVM optimization, it is recommended to increase the codegen_cost_threshold
value.
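Both recommendations can be applied at the session level; a sketch with illustrative values (connection details assumed):

```shell
# Raise work_mem for LLVM-optimized queries, raise the cost threshold so
# small queries skip code generation, or disable codegen entirely if needed.
gsql -d postgres -p 5432 <<'EOF'
SET work_mem = '1GB';                 -- illustrative value
SET codegen_cost_threshold = 50000;   -- raised from the default of 10000
-- SET enable_codegen = off;          -- uncomment to disable LLVM entirely
EOF
```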
Continuously tuning the GBase 8c system is crucial. Every adjustment contributes to improved performance and availability. Through meticulous parameter tuning, appropriate resource allocation, and in-depth performance analysis, we can unlock greater potential in GBase database (GBase数据库), ensuring data flows smoothly and stably to meet business demands.