loading...
Cover image for Postgres Trigram indexes VS Algolia
SaaSHub

Postgres Trigram indexes VS Algolia

stanbright profile image Stan Bright ・3 min read

I want to share my experience of using the native "trigram indexes" of Postgres to achieve a robust "typeahead/autocomplete/fuzzy search" functionality. I am using Algolia as an example, as I have noticed that a lot of people are using it precisely for that. Yet, I believe that you could do perfectly fine by using only Postgres in many situations.

What are trigram indexes? They are a special type of index in Postgres that is available by default; however, they have to be explicitly enabled. Here it is an extract from the official docs:

"The pg_trgm module provides functions and operators for determining the similarity of alphanumeric text based on trigram matching, as well as index operator classes that support fast searching for similar strings.", "A trigram is a group of three consecutive characters taken from a string."

For example, the set of trigrams in the string "cat" is " c", " ca", "cat", and "at ". The set of trigrams in the string "foo|bar" is " f", " fo", "foo", "oo ", " b", " ba", "bar", and "ar ".

Now let's follow what's necessary to build our autocomplete engine.

1) Enable the Postgres extension

CREATE EXTENSION IF NOT EXISTS pg_trgm

2) Index the columns you are going to query

Once you have enabled pg_trgm you have to index varchar and text columns to make use of it. e.g. CREATE INDEX index_fuzzy_searches_on_name_trgm ON fuzzy_searches USING gin(name gin_trgm_ops). Notice USING gin(name gin_trgm_ops)

Now we have very fast LIKE queries with leading and trailing wild cards (%).

3) Prepare the query

The next step is splitting the user query by spaces and adding a separate WHERE clause per name. For example, if a user searches for "sales cloud". We'd end up with a query like SELECT * FROM mytable WHERE name LIKE '%sales%' AND name LIKE '%cloud%'

A simplified implementation in Ruby on Rails could look like (in this case ServiceSearch would be MATERIALIZED VIEW that I will explain down the post):

relation = ServiceSearch.popular_first.limit(limit).includes(:service)
query.split("\s").each do |word|
  relation = relation.where("content LIKE ?", "%#{word}%")
end
relation.map(&:service)

4) Search for different types of records simultaneously

One of the use cases of using a dedicated full-text search engine is querying all your records simultaneously. That could be achieved through a MATERIALIZED VIEW. Moreover, you can easily give priority to some records. For example, showing Products before Categories. I am managing my views by the awesome scenic gem, but you don't need to.

Here it is the body of an exemplary materialized view:

SELECT
  id AS searchable_id,
  'Service' AS searchable_type,
  1 AS priority,
  post_services_count AS weight,
  LOWER(name) AS name
FROM services
WHERE state IN ('approved', 'closed')

UNION

SELECT
  categories.id AS searchable_id,
  'Category' AS searchable_type,
  2 AS priority,
  COUNT(*) AS weight,
  LOWER(name) AS name
FROM categories
JOIN categorizations cats ON cats.category_id = categories.id
WHERE state = 'approved'
GROUP BY categories.id, categories.name

By having a view like that one, we can query multiple records with something as simple as:

SELECT * FROM fuzzy_searches WHERE name LIKE '%test%' ORDER BY priority, weight DESC

Real-world examples

One of the top competitors of SaaSHub - alternativeto.net is using Algolia for their search autocomplete. I'd suggest testing their search and SaaSHub's. You won't find many differences in the UX or speed.

In the end, if you need a simple autocomplete functionality for your project, and you are using Postgres, my suggestion is to consider its trigram indexes + materialized views. Reaching out to external services may not be necessary and could be an overkill. However, I'm sure that there will be many cases in which it will make sense to pay for a professional SaaS product.


p.s. you can find more Algolia alternatives on SaaSHub 🙈.

Posted on by:

stanbright profile

Stan Bright

@stanbright

Founder of SaaSHub & LibHunt. Having more ideas than time. Ruby on Rails enthusiast. 🥑👉🍞 p.s. Sweet dreams are made of hard work. Who am I to disagree?

SaaSHub

SaaSHub is an independent software marketplace. Our goal is to be objective, simple and your first stop when researching for a new service to help you grow your business or be more productive.

Discussion

pic
Editor guide
 

Usually such ngram-search approach has 3 limitations:

  • doesn't scale very well as indexing all those ngrams will drastically increase the DB size
  • there won't be any typo-tolerance
  • hard to get great relevance because no real way to weight differently matches using ngram starting words, in the middle of the words or at the end of the words. Also, very hard to combine this when searching across multiple attributes or having business popularity/ranking to take into account.
 

I'm glad you wrote this up, Stan.

We discussed Algolia as a search option for a RoR app a while ago, but I couldn't see any benefits over PG search, either using raw queries (I'm a fan of MiniSQL for that) of the pg_search gem.