DEV Community

Dmitry Daw
Dmitry Daw

Posted on

How to query Ancestry fast

Ancestry is a great library to organize models in a tree structure.

For example, if you have Location model, which may be City, State, and Country, and you want to easily organize and query between all those entities.

By default Ancestry makes requests with LIKE query, e.g.

SELECT
  locations.*
FROM 
  locations 
WHERE
  "locations"."ancestry" LIKE '4a5b7a5d-1147-4d09-8a5d-a1a476c5be50/88fd88b7-eecd-4fc0-b18d-e2b6a077ab66/550f9809-e875-4fab-8cac-db8e6d59e0e5/%' 
OR 
  "locations"."ancestry" = '4a5b7a5d-1147-4d09-8a5d-a1a476c5be50/88fd88b7-eecd-4fc0-b18d-e2b6a077ab66/550f9809-e875-4fab-8cac-db8e6d59e0e5'
Enter fullscreen mode Exit fullscreen mode

Sometimes you need to write a query to check all descendants, like

SELECT
  locations.*, count(distinct cities.id)
FROM 
  locations 
LEFT JOIN locations as cities ON 
  cities.ancestry LIKE '%' || locations.id || '%'
WHERE
  cities.location_type = 'CITY'
GROUP BY
  locations.id
Enter fullscreen mode Exit fullscreen mode

(in this example locations.id is uuid, so we can use simplified LIKE query with '%' || locations.id || '%'. With int id it requires a different request)

But this request is running too long.
PostgreSQL does have indexes that may speed up LIKE queries, for example GIN and GiST.

But before using index, we can change that query to this

SELECT
  locations.*, count(distinct cities.id)
FROM 
  locations 
LEFT JOIN locations as cities ON 
  locations.id = ANY((string_to_array(cities.ancestry, '/')::uuid[]))
WHERE
  cities.location_type = 'CITY'
GROUP BY
  locations.id
Enter fullscreen mode Exit fullscreen mode

And it runs MUCH faster. On our data, we get an improvement from 5 seconds to 6 ms(!).

And it's possible to use it with int ids, just change uuid[] to int[].

After we can also add GIN index to it, if we want to.

add_index :locations, "(string_to_array(ancestry, '/')::uuid[])", using: :gin
Enter fullscreen mode Exit fullscreen mode

But it already runs much faster.

Another way to speed up Ancestry queries is to use ltree index, but it requires a diffent syntax.

Sources:

string_to_array: https://github.com/stefankroes/ancestry/issues/466
GIN and GiST index documentation: https://www.postgresql.org/docs/9.1/textsearch-indexes.html

ltree: https://github.com/stefankroes/ancestry/issues/102
ltree index documentation: https://www.postgresql.org/docs/9.1/ltree.html

Top comments (0)