Content-Based recommendations using sentence embeddings and Elasticsearch.

#machinelearning #elasticsearch #recommendations

Content-based recommendations

Description

In this article, I try to clarify how we can produce content-based recommendations using sentence embeddings and Elasticsearch.
I will try to be as clear as I can, giving an overall view of how we've been doing this at Jumpseller.

A definition

Content-based recommendations use a set of attributes that characterize what the recommender-systems literature calls an item (a song, film, product) to build a profile that represents it. The same can be done for users, who are the people registered in the system.

An example

A common example is a movie system: we could use movie names, descriptions, categories, cast, and other attributes to build a profile according to some defined heuristic. Then we could recommend movies with similar descriptions and similar names.
For example, Star Wars (Episode IV – A New Hope) and Star Wars (Episode V – The Empire Strikes Back) are very much alike in terms of description, name, and category.

How do we build an item profile?

Item profiles can be built in multiple ways. One choice is to use the sentences that characterize the items in our system. These sentences can, in turn, be used to produce sentence embeddings, which are vectors that represent sentences in a text corpus.
These sentence embeddings can be produced in multiple ways. I'll list three:

BERT - Bidirectional Encoder Representations from Transformers. A pre-trained language representation model designed and published by Google, which can be used to produce sentence embeddings.
Doc2Vec - An extension of Word2Vec that learns embeddings for whole sentences, paragraphs, or documents.
Word2Vec - An NLP algorithm that uses a neural network to output word embeddings. We can then combine these word embeddings into a sentence-level embedding. A simple trick is to average all the word embeddings in a sentence, producing a single final vector.
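
The averaging trick mentioned for Word2Vec can be sketched in a few lines. This is only an illustration: the three-dimensional `word_vectors` table holds made-up values (a real system would load vectors from a trained model), and the name `sentence_embedding` is hypothetical.

```python
from typing import Dict, List

# Toy word embeddings with made-up values (assumption for illustration);
# in practice these would come from a trained Word2Vec model.
word_vectors: Dict[str, List[float]] = {
    "blue": [1.0, 0.0, 2.0],
    "cotton": [0.0, 2.0, 0.0],
    "shirt": [2.0, 1.0, 1.0],
}


def sentence_embedding(sentence: str, vectors: Dict[str, List[float]]) -> List[float]:
    """Average the embeddings of the known words in a sentence."""
    # Naive whitespace tokenization; words without an embedding are skipped.
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        raise ValueError("sentence contains no word with a known embedding")
    dims = len(known[0])
    # Component-wise mean over all word vectors found in the sentence.
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


print(sentence_embedding("Blue cotton shirt", word_vectors))  # [1.0, 1.0, 1.0]
```

Averaging ignores word order, so it is a rough baseline rather than a replacement for models that produce sentence embeddings directly.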

Kenter, Tom. (2017). Text Understanding for Computers.

How do we persist these item profiles?

A good choice of storage for operations like these is Elasticsearch. With Elasticsearch's dense_vector mapping type we can index our documents (items) with a vector field whose number of dimensions we choose, up to the limit supported by the Elasticsearch version in use.
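
As a rough sketch, an index mapping with a dense_vector field could look like the body built below. The field names `name` and `profile_vector`, the helper names, and the dimension count of 3 are assumptions made for this example, not the setup we use in production; the resulting dictionaries would be sent to Elasticsearch when creating the index and indexing each item.

```python
import json
from typing import Any, Dict, List

# Assumed embedding size for this example; it must match the length of
# every vector that gets indexed.
VECTOR_DIMS = 3


def build_index_body(dims: int) -> Dict[str, Any]:
    """Build an index-creation body with a dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "name": {"type": "text"},
                # Hypothetical field name holding the item profile.
                "profile_vector": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def build_item_document(name: str, vector: List[float], dims: int) -> Dict[str, Any]:
    """Build the document for one item, checking the vector length first."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    return {"name": name, "profile_vector": vector}


print(json.dumps(build_index_body(VECTOR_DIMS), indent=2))
print(json.dumps(build_item_document("Blue cotton shirt", [1.0, 1.0, 1.0], VECTOR_DIMS)))
```

Checking the vector length before indexing avoids a rejected document later, since the mapping fixes the number of dimensions.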

How do we perform recommendations?

Recommendations are done by computing the nearest neighbors of each item. We start by choosing a similarity measure, e.g. cosine similarity. Elasticsearch makes this easier since it has a built-in cosineSimilarity function that can be used when scoring search results.
Since we decided to store our items in Elasticsearch, we can use this function as described in its documentation.
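
To make this concrete, here is a minimal sketch of what cosine similarity computes, together with a search body that asks Elasticsearch to rank items with cosineSimilarity inside a script_score query. The field name `profile_vector` and the helper names are assumptions for this example. The script refers to the field by name, which is the form documented for recent Elasticsearch versions; older 7.x releases expect `doc['profile_vector']` instead. The `+ 1.0` keeps scores non-negative, which Elasticsearch requires.

```python
import math
from typing import Any, Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def build_similar_items_query(item_id: str, query_vector: List[float], size: int = 5) -> Dict[str, Any]:
    """Build a search body returning the items most similar to one item."""
    return {
        "size": size,
        "query": {
            "script_score": {
                # Exclude the item itself from its own recommendations.
                "query": {"bool": {"must_not": {"ids": {"values": [item_id]}}}},
                "script": {
                    # 'profile_vector' is an assumed field name.
                    "source": "cosineSimilarity(params.query_vector, 'profile_vector') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print(build_similar_items_query("42", [1.0, 1.0, 1.0]))
```

The hits come back sorted by score, so the first results are the nearest neighbors of the item whose vector was passed in.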