Aleksander Obuchowski

Posted on Nov 4

MedImageInsight: Open-Source Medical Image Embedding Model - now on HuggingFace

#ai #computervision #healthcare

TLDR: check out the model at https://huggingface.co/lion-ai/MedImageInsights

Making Medical Image AI Actually Accessible

If you've ever tried implementing a medical imaging model, you know the drill: promising papers, complicated setup processes, and documentation that assumes you are a certified Azure developer. When I came across the MedImageInsight model, it was the same story - great potential buried under layers of enterprise infrastructure.

It took me 6 hours to access the model. First I had to register on Azure, set up payment, an organization, etc. Then, although the code repository was there, there was no option to clone or download it, so I had to manually copy the contents of each file! Not to mention that the model was shared as an MLflow artifact, so there was a ton of unnecessary code. But since the model is shared under the MIT license, I decided to share my custom implementation on HuggingFace so you don't have to go through the same hell as I did.

What's MedImageInsight Anyway?

At its core, MedImageInsight is a dual-encoder model (think CLIP, but for medical images) that can:

Convert medical images into meaningful embeddings
Match images with text descriptions
Perform zero-shot classification on medical conditions
Handle multiple medical imaging modalities (X-rays, CT scans, etc.)

The model was trained on a massive dataset of medical images and their descriptions, learning to create a shared embedding space for both images and text. This means you can throw new medical conditions at it without retraining, and it'll do a decent job at identifying them.

Why Another Implementation?

The original implementation required:

An Azure account
MLflow setup
Multiple enterprise-level configurations
Dealing with undocumented dependencies
Coffee. Lots of coffee.

After spending way too much time setting it up, I decided to strip it down to its essentials. No shade to the original authors - they created an amazing model. But not everyone needs enterprise-grade MLflow pipelines to run a few predictions.

How It Actually Works

At its heart, MedImageInsight uses a technique called contrastive learning to create a shared understanding between medical images and their descriptions. Think of it as teaching the model to speak two languages fluently: the language of images and the language of medical terminology.
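To make the idea of contrastive learning a bit more concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. This is an illustration under assumptions, not the actual MedImageInsight training code: the exact loss the original authors used may differ, and the function names, the temperature value, and the toy 2-dimensional vectors are all made up for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Pair k (image k, text k) is the positive; every other combination
    in the batch is treated as a negative. The temperature is an
    assumed, illustrative value.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j] = scaled cosine similarity between image i and text j
    sim = [[dot(i, t) / temperature for t in txts] for i in imgs]
    # Image-to-text: each image should pick its own text (row-wise).
    loss_i2t = sum(cross_entropy(sim[k], k) for k in range(n)) / n
    # Text-to-image: each text should pick its own image (column-wise).
    loss_t2i = sum(
        cross_entropy([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2

# Toy batch: in "aligned" each image points the same way as its text,
# in "shuffled" the texts are swapped so the pairs no longer match.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.1, 1.0], [1.0, 0.1]])
print(f"aligned loss:  {aligned:.4f}")
print(f"shuffled loss: {shuffled:.4f}")
```

The loss is low when matching image/text pairs sit close together in the shared space and high when they don't, which is the signal that pushes both encoders toward a common "language" during training.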

The Power of Zero-Shot Learning

Traditional machine learning models are like students who can only answer questions they've seen before. Zero-shot learning models, on the other hand, are like students who can apply their knowledge to entirely new situations.

MedImageInsight achieves this through a clever architectural design:

One part of the model learns to understand medical images
Another part learns to understand medical terminology
Both parts are trained to translate their understanding into the same "language" (a shared vector space)

This means if you show the model a chest X-ray and ask "Is there pneumonia?", it doesn't need a dedicated classifier trained on labeled pneumonia examples. Instead, it embeds the image and the text description into the same space and checks how closely they match.
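Here is a rough sketch of how zero-shot classification works once you have embeddings from both encoders: compare the image embedding with the embedding of each candidate label and turn the similarities into probabilities. The vectors below are hand-written placeholders rather than real model outputs, and `zero_shot_classify` and the temperature value are hypothetical names and settings for this illustration, not part of the repository's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_emb, label_embs, temperature=0.07):
    """Return a probability per label from image/text similarity.

    label_embs maps a label string to its text embedding. The
    temperature is an assumed value that sharpens the softmax.
    """
    labels = list(label_embs)
    logits = [cosine(image_emb, label_embs[l]) / temperature for l in labels]
    # Numerically stable softmax over the scaled similarities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {l: e / total for l, e in zip(labels, exps)}

# Placeholder embeddings: in practice these would come from the image
# encoder and the text encoder respectively.
image_emb = [0.9, 0.1, 0.2]
label_embs = {
    "pneumonia": [1.0, 0.0, 0.1],
    "normal": [0.0, 1.0, 0.0],
    "fracture": [0.1, 0.2, 1.0],
}

probs = zero_shot_classify(image_emb, label_embs)
best = max(probs, key=probs.get)
print(best, probs)
```

Adding a new condition is just adding another entry to `label_embs`, which is why no retraining is needed to try new labels.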

Getting Started

Clone the repo:

git clone https://huggingface.co/lion-ai/MedImageInsights

Install dependencies (we use uv because it's fast and deterministic):

uv sync

Run the example:

uv run example.py

That's it. No Azure setup, no MLflow, no enterprise infrastructure required.

What's Next?

We're working on:

Better explainability (what is the model actually looking at?)
HuggingFace's transformers library compatibility
More example notebooks for specific use cases
Performance optimizations

Contributing

Found a bug? Have an improvement in mind? The repository is actually open source (imagine that!), and we welcome contributions.

Resources

Original paper: arXiv:2410.06542
Code: https://huggingface.co/lion-ai/MedImageInsights
Example notebooks: In the examples directory
FastAPI service: In fastapi_app.py
