Sentence Similarity With Transformers and PyTorch

James Briggs ・1 min read

All we ever seem to talk about nowadays is BERT this, BERT that. I want to talk about something else, but BERT is just too good - so this video will be about BERT for sentence similarity.

A big part of NLP relies on similarity in high-dimensional spaces. Typically an NLP solution will take some text, process it to create a big vector/array representing said text - then perform several transformations.

It's high-dimensional magic.

Sentence similarity is one of the clearest examples of how powerful high-dimensional magic can be.

The logic is this:

  • Take a sentence, convert it into a vector.
  • Take many other sentences, and convert them into vectors.
  • Find sentences that have the smallest distance (Euclidean) or smallest angle (highest cosine similarity) between them - more on that here.
  • We now have a measure of semantic similarity between sentences - easy!
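
As a rough illustration of the steps above, here is a minimal sketch in plain Python. It does not use BERT: the `embed` function is a hypothetical stand-in that just counts words over a shared vocabulary, so it only captures word overlap, not real semantics. The example sentences and all function names are assumptions made for this sketch. The comparison step (cosine similarity or Euclidean distance between vectors) is the same idea you would apply to real sentence embeddings.

```python
import math
from collections import Counter


def embed(sentence, vocabulary):
    """Toy stand-in for a sentence encoder: word counts over a fixed vocabulary.

    A real setup would replace this with a BERT-based model that returns
    a dense vector for each sentence.
    """
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up example sentences, purely for illustration.
query = "the cat sat on the mat"
candidates = [
    "a cat sat on a mat",
    "stock prices fell sharply today",
]

# Build one shared vocabulary so every vector has the same dimensions.
vocabulary = sorted({w for s in [query] + candidates for w in s.lower().split()})

# Step 1 and 2: convert the query and the other sentences into vectors.
query_vec = embed(query, vocabulary)
candidate_vecs = [embed(s, vocabulary) for s in candidates]

# Step 3: compare vectors by angle (cosine) and by distance (Euclidean).
scores = [cosine_similarity(query_vec, v) for v in candidate_vecs]
distances = [euclidean_distance(query_vec, v) for v in candidate_vecs]

# Step 4: the most similar sentence has the highest cosine similarity.
best = candidates[scores.index(max(scores))]
print(best)
```

With these toy vectors, the sentence that shares words with the query scores higher on cosine similarity and sits at a smaller Euclidean distance than the unrelated one. Swapping `embed` for a real sentence encoder is what turns word overlap into semantic similarity.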

At a high level, there's not much else to it. But of course, we want to understand what is happening in a little more detail and implement this in Python too.

Medium article

Easy mode
