DEV Community

Cover image for What's New in PostgreSQL 14?
Chris Castle for Render

Posted on

What's New in PostgreSQL 14?

PostgreSQL 14, released on September 30, 2021, introduces many new features and improvements. Two I find particularly interesting are JSON syntax conveniences and the multirange data type. But there are also several noticeable performance improvements that many applications will benefit from without any code changes.

JSON Conveniences

PostgreSQL can store and manipulate JSON data. There are JSON and JSONB data types. They are similar, but most of the time you probably want to use JSONB. The JSON data type is stored as a string whereas JSONB is stored in a decomposed binary format that allows PostgreSQL to query and manipulate nested strucutres inside the JSON more efficiently. JSONB also allows you to create indexes based on nested values. I'll be using JSONB exclusively here.

Let's create a table and put some JSON data in it.

CREATE TABLE jsonb_example (
  id bigserial,
  data JSONB
);

INSERT INTO
  jsonb_example (data)
VALUES
  ('{"a": {"b": ["foo", "hello"]}}'),
  ('{"a": {"b": ["chris", "render"]}}');
Enter fullscreen mode Exit fullscreen mode

Here's the table we just created.

SELECT * from jsonb_example;

 id |               data
----+-----------------------------------
  1 | {"a": {"b": ["foo", "hello"]}}
  2 | {"a": {"b": ["chris", "render"]}}
(2 rows)
Enter fullscreen mode Exit fullscreen mode

Cool. But what if we want to query nested values within the JSON? Here's how you used to have to do it with PostgreSQL 13.

SELECT data #> '{a,b,1}'
FROM jsonb_example;

 ?column?
----------
 "hello"
 "render"
(2 rows)
Enter fullscreen mode Exit fullscreen mode

The #> is an operator that tells PostgreSQL to return a JSON sub-object. PostgreSQL has a lot of other operators for working with JSON like #>>, ->, and ->>.

The {a,b,1} is a path into the JSON object that says

  1. Grab the value of property a: {"b": ["foo", "hello"]}
  2. Within that, give me the value of property b: ["foo", "hello"]
  3. Within that array, give me the 2nd element: "hello"

With PostgreSQL 14 you can now access JSON sub-objects in a way that more closely matches how other programming languages reference JSON values. This new functionality is called jsonb subscripting. Here's an example returning the same data as above.

SELECT data['a']['b'][1]
FROM jsonb_example;

   data
----------
 "hello"
 "render"
(2 rows)
Enter fullscreen mode Exit fullscreen mode

We can use the same syntax to filter data with a WHERE clause.

SELECT *
FROM jsonb_example
WHERE data['a']['b'][1] = '"render"';

 id |               data
----+-----------------------------------
  2 | {"a": {"b": ["chris", "render"]}}
(1 row)
Enter fullscreen mode Exit fullscreen mode

And we can update values within the JSON object using the same syntax.

UPDATE jsonb_example SET data['a']['b'][0] = '"bar"';

SELECT *
FROM jsonb_example;

 id |              data
----+---------------------------------
  1 | {"a": {"b": ["bar", "hello"]}}
  2 | {"a": {"b": ["bar", "render"]}}
(2 rows)
Enter fullscreen mode Exit fullscreen mode

This new syntax closely matches how you might access nested values in a JSON object with JavaScript.

const obj = {"a": {"b": ["foo", "hello"]}};
obj['a']['b'][1]; // evaluates to 'hello'
Enter fullscreen mode Exit fullscreen mode

Multirange Data Type

PostgreSQL has had range data types since version 9.2. Now there are multirange data types.

A range data type is useful when you want a single value to represent a range of dates, time, or numbers. For example, you might use a range data type to represent meeting schedules or radio spectrum allocation. With a range data type, you only need a single column to store what would otherwise require two columns--start of range and end of range.

Prior to PostgreSQL 14, that range could only be a single contiguous range. Now with multiranges, multiple discontiguous (but not overlapping) ranges can be stored as a single value.

Let's look at a meeting scheduling example derived from the PostgreSQL documentation.

CREATE TABLE reservation (room int, during tsrange);

INSERT INTO reservation VALUES
    (1108, tsrange('2021-01-01 14:30', '2021-01-01 15:30'));

SELECT * FROM reservation;

 room |                    during
------+-----------------------------------------------
 1108 | ["2021-01-01 14:30:00","2021-01-01 15:30:00")
Enter fullscreen mode Exit fullscreen mode

This row represents a room reservation for a meeting from 14:30 to 15:30 on January 1, 2021. But what if instead of a one hour meeting, we were planning an all-day meeting with a lunch break in the middle? What would a row for that reservation look like?

Prior to PostgreSQL 14's multirange support, it would have to be stored as two rows.

INSERT INTO reservation VALUES
    (1108, tsrange('2021-01-01 09:00', '2021-01-01 12:00')),
    (1108, tsrange('2021-01-01 13:00', '2021-01-01 17:00'));

SELECT * FROM reservation;

 room |                    during
------+-----------------------------------------------
 1108 | ["2021-01-01 09:00:00","2021-01-01 12:00:00")
 1108 | ["2021-01-01 13:00:00","2021-01-01 17:00:00")
(2 rows)
Enter fullscreen mode Exit fullscreen mode

But then we'd need to add a meeting_id column so that we know these two rows represent the same meeting.

With multiranges we only need one row to represent this meeting.

DROP TABLE reservation;
CREATE TABLE reservation (room int, during tsmultirange);

INSERT INTO reservation VALUES
    (1108, tsmultirange(
      tsrange('2021-01-01 09:00', '2021-01-01 12:00'),
      tsrange('2021-01-01 13:00', '2021-01-01 17:00')
    ));

SELECT * FROM reservation;

 room |                                            during
------+-----------------------------------------------------------------------------------------------
 1108 | {["2021-01-01 09:00:00","2021-01-01 12:00:00"),["2021-01-01 13:00:00","2021-01-01 17:00:00")}
(1 row)
Enter fullscreen mode Exit fullscreen mode

Improved Handling of Concurrent Connections

A lot has happened behind the scenes to improve PostgreSQL performance. However, PostgreSQL 14's handling of multiple client connections will likely be the performance improvement noticeable to the most people. The folks at pganalyze did some great performance testing using pgbench to verify this. At 5000 active connections, they saw about a 20% improvement in throughput over PostgreSQL 13. At 10,000 connections, that improvement went up to 50%. Of course, these results are dependent on your hardware, schema, data, and queries, so you may not see the same results, but the consensus seems to be that PostgreSQL's handling of concurrent connections has improved significantly.

And Much More

There are many more new features and improvements, which you can read about in the PostgreSQL 14 release announcement or compare across versions in the feature matrix (deselect all versions except 14 and the one you want to compare to and select "Hide unchanged features").

Render Supports PostgreSQL 14

I'd be remiss not to mention that Render was the first cloud provider I know to support PostgreSQL 14, making it available on the same day PostgreSQL 14 was released and giving our customers access to all these exciting improvements in just a few clicks. Try it out and see how it works for you.

A screenshot of the Render dashboard while creating a new PostgreSQL 14 database

Top comments (0)