DEV Community

John Carrot
John Carrot

Posted on

Manipulating arrays with PostgreSQL

Before we move on to manipulating array values stored in the database, it is useful to first learn the basics.

Representing a list

SELECT '{a,b,c}'::text[];

   text 
----------
 {a,b,c}
(1 row)

We have to specify the type. Else postgres would not know the type since it looks like a string.

Or we could also use the ARRAY[] style

# SELECT ARRAY['a','b','c'];

 array 
-------
 {a,b,c}

SELECTing FROM an array

So why doesn’t the following work?

# SELECT * FROM ARRAY['a','b','c']

ERROR:  42601: syntax error at or near "ARRAY"
LINE 1: SELECT * FROM ARRAY['a','b','c'];
                      ^
LOCATION:  scanner_yyerror, scan.l:1134

SELECT FROM expects that the data source is:

  • a set of rows
  • OR points to a set of word (table)

ARRAY[] returns an atomic value. Which is why it works with just SELECT but fails when FROM is used.

Convert arrays into rows using UNNEST

# SELECT UNNEST(ARRAY['a','b']);

 unnest 
--------
 a
 b
(2 rows)

Convert rows into arrays using ARRAY()

Watch out. We are not using ARRAY[]. We are using array().

SELECT array(
  SELECT UNNEST(ARRAY['a','b'])
);

 array 
-------
 {a,b}
(1 row)

To summarize:

  • UNNEST converts array into a series of rows.
  • array() converts a bunch of rows into an array.
  • ARRAY[] used to represent/define an array.

Concatenate arrays using array_cat()

# SELECT array_cat('{a,b}'::text[], '{b,c}'::text[]);

 array_cat 
-----------
 {a,b,b,c}
(1 row)

We coerced the argument into text arrays to indicate to array_cat what data types it was dealing with. If we used the ARRAY[] initializer, this would not be required. Example below.

SELECT array_cat(ARRAY['a','b'], ARRAY['b', 'c']);

Notice that the final array contains b twice. We would love to remove duplicates. There are two ways to do it and we'll explore both below.

Removing duplicates from an array using DISTINCT

This method works with integers, text, float, etc

DISTINCT helps remove duplicates from all data types. But it requires rows as input. Not arrays.

  • So first convert an array to rows using UNNEST
  • Pass it to DISTINCT
  • And then if you still require an array as output, use array()
SELECT DISTINCT UNNEST(
  '{apple, banana, cat, apple}'::text[]
);

 unnest 
--------
 banana
 cat
 apple
(3 rows)
SELECT DISTINCT UNNEST(
  '{1.245, 1.3, 1.5, 1.2}'::float[]
) ORDER BY 1;

And we have columns. But we need an array back as output. So convert using array()

SELECT array(
  SELECT DISTINCT UNNEST(
    '{apple, banana, cat, apple}'::text[]
  )
);

       array        
--------------------
 {banana,cat,apple}
(1 row)

What if we want it sorted? Use ORDER BY 1

SELECT array(
  SELECT DISTINCT UNNEST(
    '{apple, banana, cat, apple}'::text[]
  ) ORDER BY 1
);

ORDER BY 1 would sort by the first column (We could starting with 1 here). In this case we are working with on-the-fly arrays, without giving them column names. So we use column number references.

Want to try this with a float array? Let's do it.

SELECT array(
  SELECT DISTINCT UNNEST(
    '{1.2, 1.245, 2.3, 1.245, 1.246}'::text[]
  )
);

         array         
-----------------------
 {1.2,1.246,2.3,1.245}
(1 row)

Removing duplicates from integer arrays using sort and uniq

Caveat: This method only works for integer arrays (yeah so no floats too).

sort and uniq are functions available in the intarray extension. Enable the extension in your database by running the following statement.

CREATE EXTENSION IF NOT EXISTS intarray;

Try using the uniq function.

SELECT uniq(
  '{1,2,1,3}'::int[]
);

   uniq    
-----------
 {1,2,1,3}
(1 row)

Oops. Does not work. uniq only works on a sorted list. So use the sort function to sort it first.

SELECT uniq(
  sort(
    '{1,2,1,3}'::int[]
  )
);

  uniq   
---------
 {1,2,3}
(1 row)

I haven't yet benchmarked in detail which method works the fastest for sorting integers. I used both methods as a part of another benchmarking to update 100k rows (coming soon). For now, all I can say is, both aren't slow. I'll try a more detailed benchmarking another day.

What else?

These are essentials that I find useful to know. There are a lot more array functions that Postgres supports. Go check it out.

I'll be sharing more from my notes on PostgreSQL. Subscribe to get updates.

Top comments (2)

Collapse
 
gevera profile image
Denis Donici

thanks for sharing. Good stuff

How would I go about creating a unique set out a query like this

select tags from blog.posts;

 tags          
------------------------
 {mind}
 {thoughts,energy,mind}
 {postgres,"data base"}
(3 rows)

Enter fullscreen mode Exit fullscreen mode

the expected result would be {thoughts,energy,mind, postgres,"data base"}

Collapse
 
lukzard profile image
lukzard

lovely article