DEV Community

NeuML

Similarity search with images

David Mezzetti ・ 2 min read

This article is part of a tutorial series on txtai, an AI-powered search engine.

txtai, as the name implies, works with text and AI. But that doesn't mean it can't work with other types of content. For example, an image can be described with words, and that description can be used to compare the image to a query or to other documents. This article shows how images and text can be embedded into the same vector space to support similarity search.

A future version of txtai will add support for image captioning, which will enable images, audio, documents and text to all live in the same embeddings index. The approach in this article keeps images in a separate embeddings index. Stay tuned for more on image captioning!

Install dependencies

Install txtai and all dependencies.

pip install txtai torchvision ipyplot

# Get test data
wget -N https://github.com/neuml/txtai/releases/download/v2.0.0/tests.tar.gz
tar -xvzf tests.tar.gz

Create an Embeddings model

sentence-transformers recently added support for the OpenAI CLIP model. This model embeds text and images into the same space, enabling image similarity search. txtai can directly utilize these models through sentence-transformers. Check out the sentence-transformers link above for additional examples on how to use this model.
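Conceptually, ranking in a shared embedding space is just nearest-neighbor search by cosine similarity: encode the query text, encode each image, and return the closest match. A minimal sketch in plain Python (the file names and vectors below are made up for illustration; real CLIP embeddings are 512-dimensional):

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy embeddings standing in for CLIP output
text_vec = [0.9, 0.1, 0.0, 0.2]  # e.g. encoding of a text query
image_vecs = {
    "dog.jpg": [0.8, 0.2, 0.1, 0.1],
    "car.jpg": [0.0, 0.9, 0.3, 0.0],
}

# Rank images by similarity to the text query
best = max(image_vecs, key=lambda name: cosine(text_vec, image_vecs[name]))
print(best)  # dog.jpg
```

This is exactly what the embeddings index below does, except with real CLIP vectors and an efficient approximate nearest-neighbor index instead of a linear scan.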

This section builds an embeddings index over a series of images.

import glob

from PIL import Image

from txtai.embeddings import Embeddings

def images():
  # Generate (id, data, tags) tuples, using the file path as the id
  for path in glob.glob('txtai/*jpg'):
    yield (path, Image.open(path), None)

embeddings = Embeddings({"method": "transformers", "path": "clip-ViT-B-32", "modelhub": False})
embeddings.index(images())
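Note the shape of what the generator yields: `index` consumes `(id, data, tags)` tuples, and here the file path doubles as the id, so search results can be opened directly with PIL later. A quick sketch of that contract, with hypothetical paths standing in for the glob results:

```python
def rows(paths):
    # Mirror the images() generator: yield (id, data, tags) tuples,
    # with the file path as the id; the string here stands in for
    # the PIL Image object that would normally be the data element
    for path in paths:
        yield (path, f"image-data:{path}", None)

# Hypothetical paths for illustration
items = list(rows(["txtai/books.jpg", "txtai/buildings.jpg"]))
print(items[0][0])  # txtai/books.jpg
```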

Search the index

Now that we have an index, let's search it! This section runs a list of queries against the index and shows the top result for each. Have to say, this is pretty 🔥🔥🔥

import ipyplot
from PIL import Image

images, labels = [], []
for query in ["Walking into the office", "Saturday cleaning the yard", "Working on the latest report", "Working on my homework", "Watching an exciting race",
              "The universe is massive", "Time lapse video of traffic", "Relaxing Thanksgiving day"]:
  index, _ = embeddings.search(query, 1)[0]

  images.append(Image.open(index))
  labels.append(query)

ipyplot.plot_images(images, labels, img_width=425, force_b64=True)
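For reference, `embeddings.search(query, 1)` returns a list of `(id, score)` tuples, best match first. Since each image was indexed with its file path as the id, the top result's id can go straight to `Image.open`. A sketch of that result handling with mocked results (the paths and scores are made up):

```python
# Mocked results in the (id, score) shape that embeddings.search returns
results = [("txtai/books.jpg", 0.29), ("txtai/sitting.jpg", 0.17)]

# Top hit: the id is a file path, ready for Image.open(path)
path, score = results[0]
print(path)  # txtai/books.jpg
```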

(Image grid: the top matching image for each query)
