Apache ShardingSphere

Posted on Feb 21, 2022

Apache ShardingSphere 5.1.0 Now Available

#database #sql #postgres #devops

Apache ShardingSphere 5.1.0 is officially released and available. The previous 5.0.0 GA version was launched in November last year and marked ShardingSphere’s evolution from middleware to an ecosystem.

This meant gaining the power to transform any database into a distributed database system and enhance it with features such as data sharding, distributed transactions, data encryption, SQL audit, database gateway, and more.

Over the past three months, the ShardingSphere community has received a lot of feedback from developers, partners, and users across different industries. We’d like to extend our gratitude for that feedback; without it, this update would not have been possible.

Our community author and Apache ShardingSphere PMC, Meng Haoran, shares with you in detail what’s new in Apache ShardingSphere version 5.1.0.

Based on user feedback from the 5.0.0 GA version, we also decided to commit our efforts to improve ShardingSphere’s ecosystem, kernel and feature modules:

Kernel

Building a powerful and stable kernel has always been the purpose of ShardingSphere.

In the new version we fixed a large number of issues to better support parsing of PostgreSQL and openGauss SQL, and the parser now supports function parsing and binlog statement parsing.

We also optimized the rewriter engine and improved the efficiency of loading massive numbers of single tables, to further improve overall kernel performance. Moreover, ShardingSphere now adds the SQL hint function, which lets users apply forced routing more conveniently.

Access Terminal

For ShardingSphere-Proxy, we fixed issues in parsing the MySQL and PostgreSQL protocols. We also added the SCRAM SHA-256 authentication mode to support openGauss and optimized the openGauss batch insert protocol to improve data insert performance.

For ShardingSphere-JDBC, we removed the check for null values in rules, so users can still use JDBC even when no rules are configured. We also optimized metadata loading so that the logical database loads only the specified schemaName, which accelerates startup.

Elastic Scale-Out

We made many adjustments to elastic scale-out in this version.

First, the original scaling module has been moved to the data-pipeline module under the kernel. In the future, this module will provide more data processing capabilities in addition to data migration.

Second, scaling configuration has been moved from server.yaml to the config-sharding.yaml configuration file. Together with data sharding, elastic scale-out will provide users with better data sharding services.

DistSQL

Many practical DistSQL statements have been added, giving users more tools to manage the ShardingSphere distributed database ecosystem.

Some distributed cluster governance capabilities are optimized as well. For example, instances can now be enabled or disabled by instanceId, and when a read/write-splitting rule has only one secondary (read) database, users are told that it cannot be disabled, which improves the user experience.
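
The single-read-source guard described above can be illustrated with a small sketch. This is not ShardingSphere's code: the class `ReadDataSourceGroup`, its methods, and the data source names are hypothetical, chosen only to show the rule that the last enabled read data source cannot be disabled.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch of the guard; not ShardingSphere's actual classes. */
class ReadDataSourceGroup {
    private final Set<String> enabled = new LinkedHashSet<>();

    ReadDataSourceGroup(String... names) {
        for (String name : names) {
            enabled.add(name);
        }
    }

    /** Disables a read data source, refusing if it is the last enabled one. */
    void disable(String name) {
        if (!enabled.contains(name)) {
            throw new IllegalArgumentException("Unknown or already disabled: " + name);
        }
        if (enabled.size() == 1) {
            throw new IllegalStateException(
                "Cannot disable the only enabled read data source: " + name);
        }
        enabled.remove(name);
    }

    /** Re-enables a read data source. */
    void enable(String name) {
        enabled.add(name);
    }

    /** Returns a read-only snapshot of the enabled read data sources. */
    Set<String> enabledNames() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(enabled));
    }

    public static void main(String[] args) {
        ReadDataSourceGroup group = new ReadDataSourceGroup("read_ds_0", "read_ds_1");
        group.disable("read_ds_0"); // allowed: one read source remains
        try {
            group.disable("read_ds_1"); // rejected: it is the last one
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Rejecting the request up front, rather than letting reads fail later, is the design point the release note makes.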

Read/Write Splitting and High Availability

The APIs of read/write splitting and high availability have both been optimized. Read/write splitting now supports both static and dynamic configurations, and the dynamic configuration needs to be used together with high availability.

The high availability configuration and algorithm are now separated, making the configuration more unified and concise. Additionally, Spring Boot and Spring Namespace now support high availability configuration, and the high availability feature is now implemented for openGauss.

Shadow Database

The shadow database feature has been partly optimized. It now supports logical data source transfer, validates data types that column matching shadow algorithms do not support, renames the Note shadow algorithm to the HINT shadow algorithm, removes the enable attribute from the configuration, and optimizes the shadow algorithm's decision logic to improve performance.

This post only covers a part of the updates we made to some functions. While developing version 5.1.0, we merged 1000+ PRs from the community. Based on version 5.0.0 GA, version 5.1.0 has been significantly improved in terms of its kernel capabilities, core functions, and performance to deliver a better user experience.

Here are the details of the release of version 5.1.0:

New Features

Support SQL hint
New DistSQL syntax: SHOW AUTHORITY RULE
New DistSQL syntax: SHOW TRANSACTION RULE
New DistSQL syntax: ALTER TRANSACTION RULE
New DistSQL syntax: SHOW SQL_PARSER RULE
New DistSQL syntax: ALTER SQL_PARSER RULE
New DistSQL syntax: ALTER DEFAULT SHARDING STRATEGY
New DistSQL syntax: DROP DEFAULT SHARDING STRATEGY
New DistSQL syntax: CREATE DEFAULT SINGLE TABLE RULE
New DistSQL syntax: SHOW SINGLE TABLES
New DistSQL syntax: SHOW SINGLE TABLE RULES
New DistSQL syntax: SHOW SHARDING TABLE NODES
New DistSQL syntax: CREATE/ALTER/DROP SHARDING KEY GENERATOR
New DistSQL syntax: SHOW SHARDING KEY GENERATORS
New DistSQL syntax: REFRESH TABLE METADATA
New DistSQL syntax: PARSE SQL, which outputs the abstract syntax tree obtained by parsing SQL
New DistSQL syntax: SHOW UNUSED SHARDING ALGORITHMS
New DistSQL syntax: SHOW UNUSED SHARDING KEY GENERATORS
New DistSQL syntax: CREATE/DROP SHARDING SCALING RULE
New DistSQL syntax: ENABLE/DISABLE SHARDING SCALING RULE
New DistSQL syntax: SHOW SHARDING SCALING RULES
New DistSQL syntax: SHOW INSTANCE MODE
New DistSQL syntax: COUNT SCHEMA RULES
Scaling: Add rateLimiter configuration and QPS TPS implementation
Scaling: Add DATA_MATCH data consistency check
Scaling: Add batchSize configuration to avoid possible OOME
Scaling: Add streamChannel configuration and MEMORY implementation
Scaling: Support MySQL BINARY data type
Scaling: Support MySQL YEAR data type
Scaling: Support PostgreSQL BIT data type
Scaling: Support PostgreSQL MONEY data type
Database discovery adds support for JDBC Spring Boot
Database discovery adds support for JDBC Spring Namespace
Database discovery adds support for openGauss
Shadow DB adds support for logical data source transfer
Add data type validator for column matching shadow algorithm
Add support for xa start/end/prepare/commit/recover in encrypt case with only one data source
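
To make the rateLimiter entry in the list above more concrete, here is a minimal sketch of capping queries per second with a fixed one-second window. It is an assumption-laden illustration, not how ShardingSphere implements its QPS limiter: the class `QpsLimiter`, the injected clock, and the fixed-window strategy are all hypothetical.

```java
import java.util.function.LongSupplier;

/** Hypothetical fixed-window QPS limiter; not ShardingSphere's rateLimiter. */
class QpsLimiter {
    private final int qps;
    private final LongSupplier clockMillis; // injected so tests can control time
    private long windowStart;
    private int usedInWindow;

    QpsLimiter(int qps, LongSupplier clockMillis) {
        if (qps <= 0) {
            throw new IllegalArgumentException("qps must be positive");
        }
        this.qps = qps;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    /** Returns true if one more query fits into the current one-second window. */
    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= 1000L) {
            // A full second has passed: start a new window.
            windowStart = now;
            usedInWindow = 0;
        }
        if (usedInWindow < qps) {
            usedInWindow++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        QpsLimiter limiter = new QpsLimiter(3, System::currentTimeMillis);
        int allowed = 0;
        for (int i = 0; i < 10; i++) {
            if (limiter.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("allowed " + allowed + " of 10 immediate attempts");
    }
}
```

A migration job would call `tryAcquire()` before each read and back off when it returns false. Limiting the read rate this way protects the source database, and a bounded batchSize keeps memory use predictable.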

API Changes

Redesign the database discovery related DistSQL syntax
In DistSQL, the keyword GENERATED_KEY is adjusted to KEY_GENERATE_STRATEGY
Native authority provider is marked as deprecated and will be removed in a future version
Scaling: Move scaling configuration from server.yaml to config-sharding.yaml
Scaling: Rename clusterAutoSwitchAlgorithm SPI to completionDetector and refactor method parameter
Scaling: Data consistency check API method rename and return type change
Database discovery module API refactoring
Read/write-splitting supports static and dynamic configuration
Shadow DB removes the enable configuration
Shadow algorithm type modified

Enhancements

Improve load multi single table performance
Remove automatically added order by primary key clause
Optimize binding table route logic without sharding column in join condition
Support updating the sharding key when the sharding routing result stays the same
Optimize rewrite engine performance
Support select union/union all … statements by federation engine
Support insert on duplicate key update of the sharding column when the route context stays the same
Use union all to merge sql route units for simple select to improve performance
Supports autocommit in ShardingSphere-Proxy
ShardingSphere openGauss Proxy supports SHA-256 authentication method
Remove property java.net.preferIPv4Stack=true from Proxy startup script
Remove the verification of null rules for JDBC
Optimize performance of executing openGauss batch bind
Disable Netty resource leak detector by default
Supports describe prepared statement in PostgreSQL / openGauss Proxy
Optimize performance of executing PostgreSQL batched inserts
Add instance_id to the result of SHOW INSTANCE LIST
Support using instance_id when enabling/disabling a proxy instance
Support automatically creating the algorithm in CREATE SHARDING TABLE RULE, reducing the steps needed to create a rule
Support specifying an existing KeyGenerator when CREATE SHARDING TABLE RULE
DROP DATABASE supports IF EXISTS option
DATANODES in SHARDING TABLE RULE supports enumerated inline expressions
CREATE/ALTER SHARDING TABLE RULE supports complex sharding algorithm
SHOW SHARDING TABLE NODES supports non-inline scenarios (range, time, etc.)
When there is only one read data source in the read/write-splitting rule, it is not allowed to be disabled
Scaling: Add basic support of chunked streaming data consistency check
Shadow algorithm decision logic optimization to improve performance
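
The DATANODES entry in the list above refers to inline expressions. As a rough illustration of how such an expression turns into concrete data nodes, the sketch below expands the two placeholder forms, a range such as `${0..1}` and an enumeration such as `${[0, 1]}`. It is a simplified, hypothetical expander (`InlineExpressionSketch` is not a ShardingSphere class), and it does not evaluate arbitrary expressions the way the real engine does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified expander for ${a..b} and ${[x, y]} placeholders. */
class InlineExpressionSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    /** Expands every placeholder and returns the cartesian product of the results. */
    static List<String> expand(String expression) {
        List<String> results = new ArrayList<>();
        results.add("");
        Matcher matcher = PLACEHOLDER.matcher(expression);
        int last = 0;
        while (matcher.find()) {
            String literal = expression.substring(last, matcher.start());
            List<String> values = valuesOf(matcher.group(1).trim());
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                for (String value : values) {
                    next.add(prefix + literal + value);
                }
            }
            results = next;
            last = matcher.end();
        }
        String tail = expression.substring(last);
        List<String> finished = new ArrayList<>();
        for (String prefix : results) {
            finished.add(prefix + tail);
        }
        return finished;
    }

    /** Turns a placeholder body into its list of values. */
    private static List<String> valuesOf(String body) {
        List<String> values = new ArrayList<>();
        if (body.startsWith("[") && body.endsWith("]")) {
            // Enumeration: ${[a, b, c]}; surrounding quotes are stripped.
            String inner = body.substring(1, body.length() - 1);
            for (String item : inner.split(",")) {
                values.add(item.trim().replaceAll("^['\"]|['\"]$", ""));
            }
        } else if (body.contains("..")) {
            // Integer range: ${begin..end}, both ends inclusive.
            String[] bounds = body.split("\\.\\.");
            int begin = Integer.parseInt(bounds[0].trim());
            int end = Integer.parseInt(bounds[1].trim());
            for (int i = begin; i <= end; i++) {
                values.add(String.valueOf(i));
            }
        } else {
            throw new IllegalArgumentException("Unsupported placeholder: " + body);
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [ds_0.t_order_0, ds_0.t_order_1, ds_1.t_order_0, ds_1.t_order_1]
        System.out.println(expand("ds_${0..1}.t_order_${[0, 1]}"));
    }
}
```

The enumerated form is useful when the suffixes are not a contiguous range, for example `${['online', 'offline']}`.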

Refactoring

Refactor federation engine scan table logic
Avoid duplicated TCL SQL parsing when executing prepared statement in Proxy
Scaling: Add pipeline modules to redesign scaling
Scaling: Refactor several job configuration structure
Scaling: Precalculate tasks splitting and persist in job configuration
Scaling: Add basic support of pipeline-core code reuse for encryption job
Scaling: Add basic support of scaling job and encryption job combined running
Scaling: Add input and output configuration, including workerThread and rateLimiter
Scaling: Move blockQueueSize into streamChannel
Scaling: Change jobId type from integer to text
Optimize JDBC to load only the specified schema
Optimize meta data structure of the registry center
Rename Note shadow algorithm to HINT shadow algorithm

Bug Fixes

Support parsing function
Fix alter table drop constraint
Fix optimize table route
Support Route resource group
Support parsing binlog
Support PostgreSQL/openGauss ‘&’ and ‘|’ operators
Support parsing openGauss insert on duplicate key
Support parsing PostgreSQL/openGauss union
Support queries on tables whose column names contain keywords
Fix missing parameter in function
Fix sub query table with no alias
Fix utc timestamp function
Fix alter encrypt column
Support alter column with position encrypt column
Fix delete with schema for PostgreSQL
Fix wrong route result caused by Oracle parser ambiguity
Fix projection count error when use sharding and encrypt
Fix npe when using shadow and readwrite_splitting
Fix wrong metadata when actual table is case insensitive
Fix encrypt rewrite exception when execute multiple table join query
Fix encrypt rewrite wrong result with table level queryWithCipherColumn
Fix parsing Chinese
Fix encrypt exists sub query
Fix full route caused by the MySQL BINARY keyword in the sharding condition
Fix getResultSet method empty result exception when using JDBCMemoryQueryResult processing statement
Fix incorrect shard table validation logic when creating stored function/procedure
Fix null charset exception that occurs when connecting to Proxy with some PostgreSQL clients
Fix incorrect transaction status caused by executing commit in a prepared statement in MySQL Proxy
Fix client connected to Proxy possibly getting stuck if an error occurs in PostgreSQL with a non-English locale
Fix file not found when path of configurations contains blank character
Fix possibly incorrect transaction status caused by early flush
Fix the unsigned datatype problem when querying with PreparedStatement
Fix protocol violation in implementations of prepared statement in MySQL Proxy
Fix caching too many connections in openGauss batch bind
Fix the problem of missing data in SHOW READWRITE_SPLITTING RULES when db-discovery and readwrite-splitting are used together
Fix the problem of missing data in SHOW READWRITE_SPLITTING READ RESOURCES when db-discovery and readwrite-splitting are used together
Fix the NPE when the CREATE SHARDING TABLE RULE statement does not specify the sub-database and sub-table strategy
Fix NPE when PREVIEW SQL by schema.table
Fix DISABLE statement could disable readwrite-splitting write data source in some cases
Fix DISABLE INSTANCE could disable the current instance in some cases
Fix the issue that user may query the unauthorized logic schema when the provider is SCHEMA_PRIVILEGES_PERMITTED
Fix NPE when authority provider is not configured
Scaling: Fix DB connection leak on XA initialization which triggered by data consistency check
Scaling: Fix PostgreSQL replication stream exception on multiple data sources
Scaling: Fix migrating updated record exception on PostgreSQL incremental phase
Scaling: Fix MySQL 5.5 check BINLOG_ROW_IMAGE option failure
Scaling: Fix PostgreSQL xml data type consistency check
Fix database discovery failed to modify cron configuration
Fix error when a single read data source uses the weight load-balance algorithm
Fix creation of redundant data source without memory mode
Fix column value matching shadow algorithm data type conversion exception

Apache ShardingSphere Open Source Project Links:
ShardingSphere GitHub
ShardingSphere Twitter
ShardingSphere Slack Channel
Contributor Guide

Author

Haoran Meng

SphereEx Senior Development Engineer &
Apache ShardingSphere PMC

Previously responsible for database product R&D at JingDong Technology, he is passionate about open source and database ecosystems. Currently, he focuses on the development of the ShardingSphere database ecosystem and open source community building.
