DEV Community

Cong Li

GBase 8s JDBC Character Set Parameters Explained

In database application development, configuring character sets correctly is essential for storing and retrieving data accurately. GBase 8s V8.8 provides several JDBC parameters that control character handling. This article covers four of them: CLIENT_LOCALE, DB_LOCALE, NEWCODESET, and NEWLOCALE. It explains what each one controls so that developers and database administrators can apply them correctly.

1. Detailed Explanation of the CLIENT_LOCALE Parameter

The CLIENT_LOCALE parameter defines the client-side character set and affects the language of error messages, date and time formats, and more.

For example, zh_CN.utf8 is divided into two parts by a period (.):

  • Left part (zh_CN):

    • Determines the language of error messages when an exception is thrown:
      • zh_CN: Chinese (not supported in JDBC 3.3.0 and earlier versions)
      • en_US: English
    • This feature requires server support, and not every character set allows the left part to be changed.
  • Right part (utf8):

    • Defines the client-side character set:
      • For char type columns, this encoding is used when calling rs.getBytes().
      • For clob type columns, this encoding is used when calling ps.setBinaryStream().
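
The two-part structure described above can be sketched in plain Java. This is only an illustration of the split at the period, not the driver's own parsing code; the class name LocaleParts and the handling of a value without a period are assumptions made for this example.

```java
/** Splits a locale string such as "zh_CN.utf8" at the first period. */
final class LocaleParts {
    final String language; // left part, e.g. "zh_CN"
    final String codeSet;  // right part, e.g. "utf8"; empty when no period is present

    private LocaleParts(String language, String codeSet) {
        this.language = language;
        this.codeSet = codeSet;
    }

    static LocaleParts parse(String locale) {
        int dot = locale.indexOf('.');
        if (dot < 0) {
            // Assumption for this sketch: a value without a period has only a left part.
            return new LocaleParts(locale, "");
        }
        return new LocaleParts(locale.substring(0, dot), locale.substring(dot + 1));
    }
}
```

Calling LocaleParts.parse("zh_CN.utf8") yields language "zh_CN" (the message language) and codeSet "utf8" (the client-side character set).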

2. The Importance of the DB_LOCALE Parameter

The DB_LOCALE parameter defines the database character set, ensuring that data is correctly encoded before being sent to the server and properly decoded when retrieved from the server.

Like CLIENT_LOCALE, it is also divided into two parts by a period:

  • Left part: Usually not needed.
  • Right part:
    • Determines how characters are encoded before being sent to the server.
    • Defines how byte arrays retrieved from the server are decoded into characters.
    • Specifies the character set of the created database.
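
To see why the right part of DB_LOCALE has to match the character set the data was actually encoded with, the following sketch encodes a string with one charset and decodes the bytes with another. It uses only standard JDK classes and involves no GBase driver code; the class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CodeSetRoundTrip {
    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] wire = text.getBytes(encodeWith); // bytes as they would travel to the server
        return new String(wire, decodeWith);     // characters as the reader interprets them
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character e-acute
        // Same charset on both sides: the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // Mismatched charsets: the two UTF-8 bytes are read as two Latin-1 characters.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

When the encoding and decoding charsets differ, the text comes back garbled even though no byte was lost, which is the typical symptom of a mismatched locale setting.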

3. Application of the NEWCODESET Parameter

The NEWCODESET parameter allows mapping the GBase encoding name to the JDK encoding name, providing flexibility in character set mapping.

The URL can carry multiple mapping sets in one NEWCODESET value, separated by colons (:). Each set contains three elements, separated by commas (,).

Use Cases:

1) Mapping GBase encoding to JDK encoding:

  • For example, NEWCODESET=GB18030,GB18030-2000,5488:
    • 1st element: GB18030 (jdkEnc)
    • 2nd element: GB18030-2000 (gbaseEncName)
    • 3rd element: 5488 (gbaseEncNumber)

When using:

   DB_LOCALE=zh_CN.GB18030-2000;CLIENT_LOCALE=zh_CN.GB18030-2000;

or

   DB_LOCALE=zh_CN.5488;CLIENT_LOCALE=zh_CN.5488;

The driver uses the mapped JDK encoding name GB18030 during encoding and decoding.
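
The NEWCODESET format described above (sets separated by colons, three comma-separated elements per set) can be parsed with a few lines of Java. This is a minimal sketch for illustration, not the driver's implementation; the class name NewCodesetEntry and the decision to reject malformed sets with an exception are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** One NEWCODESET set: jdkEnc, gbaseEncName, gbaseEncNumber. */
final class NewCodesetEntry {
    final String jdkEncoding; // 1st element, e.g. "GB18030"
    final String gbaseName;   // 2nd element, e.g. "GB18030-2000"
    final String gbaseNumber; // 3rd element, e.g. "5488"

    NewCodesetEntry(String jdkEncoding, String gbaseName, String gbaseNumber) {
        this.jdkEncoding = jdkEncoding;
        this.gbaseName = gbaseName;
        this.gbaseNumber = gbaseNumber;
    }

    /** Parses a NEWCODESET value such as "GB18030,GB18030-2000,5488:UTF-8,cp1252,819". */
    static List<NewCodesetEntry> parseAll(String value) {
        List<NewCodesetEntry> entries = new ArrayList<>();
        for (String set : value.split(":")) {
            String[] parts = set.split(",");
            if (parts.length != 3) {
                // Assumption for this sketch: malformed sets are rejected outright.
                throw new IllegalArgumentException("Expected 3 elements in: " + set);
            }
            entries.add(new NewCodesetEntry(parts[0].trim(), parts[1].trim(), parts[2].trim()));
        }
        return entries;
    }
}
```

For the value "GB18030,GB18030-2000,5488:UTF-8,cp1252,819", parseAll returns two entries, the first mapping the GBase name GB18030-2000 and number 5488 to the JDK encoding GB18030.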

2) Mapping to other character sets:

  • For example:

     DB_LOCALE=en_US.819;CLIENT_LOCALE=en_US.819;NEWCODESET=UTF-8,cp1252,819;
    

    Here, the created database character set is 819, but it can store and retrieve UTF-8 characters.

If the character set specified in the right part of DB_LOCALE and CLIENT_LOCALE is neither 819 nor cp1252, the NEWCODESET parameter will be ignored.
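
When these settings are assembled in application code, it can help to build the property string from a map so that the three parameters stay together. The sketch below only produces the KEY=value; portion used in the example above; the connection URL prefix is not covered in this article and is deliberately left out. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LocalePropertiesSketch {
    /** Joins properties as KEY=value; pairs, preserving insertion order. */
    static String joinProperties(Map<String, String> props) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    /** Builds the property string for the 819 / UTF-8 mapping example. */
    static String exampleProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("DB_LOCALE", "en_US.819");
        props.put("CLIENT_LOCALE", "en_US.819");
        // Maps GBase name cp1252 and number 819 to the JDK encoding UTF-8.
        props.put("NEWCODESET", "UTF-8,cp1252,819");
        return joinProperties(props);
    }
}
```

Because the right part of both locales is 819, it matches the third element of the NEWCODESET set, so the mapping to UTF-8 takes effect.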

Mapping Logic:

1) Check if the encoding string matches the 2nd element (GBase encoding name) in NEWCODESET. If yes, map it to the 1st element, otherwise proceed to step 2.
2) Check if the encoding string matches a key in the built-in driver hash table (table1). If yes, map it to the value, otherwise proceed to step 3.
3) Check if the encoding string matches the 3rd element (GBase encoding number) in NEWCODESET. If yes, map it to the 1st element, otherwise proceed to step 4.
4) Check if the encoding string matches a key in another hash table (table2). If yes, map it to the value.

Example hash tables:

table1 = new Hashtable<String,String>();
table1.put("8859-1", "ISO8859_1");
table1.put("utf8", "UTF8");
table1.put("GB18030", "GB18030");

table2 = new Hashtable<String,String>();
table2.put("819", "ISO8859_1");
table2.put("57372", "UTF8");
table2.put("5488", "GB18030");
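
The four-step mapping logic can be sketched as a single lookup method. This is an illustrative reimplementation based only on the steps and example tables above, not the driver's source: the tables are abridged, matching is assumed to be exact and case-sensitive, and returning null when no step matches is an assumption, since the article does not say what the driver does in that case.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EncodingResolver {
    // Abridged copies of the example tables, not the driver's full built-in tables.
    private static final Map<String, String> TABLE1 = new HashMap<>();
    private static final Map<String, String> TABLE2 = new HashMap<>();
    static {
        TABLE1.put("utf8", "UTF8");
        TABLE1.put("GB18030", "GB18030");
        TABLE2.put("819", "ISO8859_1");
        TABLE2.put("57372", "UTF8");
        TABLE2.put("5488", "GB18030");
    }

    /**
     * @param encoding   right part of DB_LOCALE or CLIENT_LOCALE
     * @param newCodeset NEWCODESET sets, each {jdkEnc, gbaseEncName, gbaseEncNumber}
     * @return the mapped JDK encoding name, or null when no step matches (assumption)
     */
    static String resolve(String encoding, List<String[]> newCodeset) {
        // Step 1: match the GBase encoding name (2nd element) in NEWCODESET.
        for (String[] set : newCodeset) {
            if (set[1].equals(encoding)) return set[0];
        }
        // Step 2: match a key in the first built-in table.
        String mapped = TABLE1.get(encoding);
        if (mapped != null) return mapped;
        // Step 3: match the GBase encoding number (3rd element) in NEWCODESET.
        for (String[] set : newCodeset) {
            if (set[2].equals(encoding)) return set[0];
        }
        // Step 4: match a key in the second built-in table.
        return TABLE2.get(encoding);
    }
}
```

With the set {"UTF-8", "cp1252", "819"}, resolving "819" returns "UTF-8" because step 3 runs before table2 is consulted; with no NEWCODESET sets, the same input falls through to table2 and returns "ISO8859_1".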

4. The NEWLOCALE Parameter

The NEWLOCALE parameter is used to map locales between the client and the database. While NEWCODESET maps the right side of CLIENT_LOCALE and DB_LOCALE, NEWLOCALE maps the left side. This parameter is used less frequently, so it will not be covered in detail here.


This article has provided a detailed explanation of the configuration and application of JDBC character set parameters in GBase 8s. We hope this helps you manage character sets more effectively in your daily work and optimize your database operations. Thank you for reading.
