Chloe Williams for Zilliz

Posted on • Originally published at zilliz.com

How to Connect to Milvus Lite Using LangChain and LlamaIndex

Milvus Lite, released just one week ago on May 31, is now the default method for third-party connectors like LangChain and LlamaIndex to connect to Milvus, the popular open-source vector database.

| Method          | Control level for retrieval process | Time (seconds) |
| LlamaIndex      | No control                          | 2156           |
| LangChain       | Full control                        | 8              |
| Milvus Lite API | Full control                        | 28             |

Table: Timings using the same HuggingFace embedding model (BAAI/bge-large-en-v1.5) and the same HTML data files.
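
As a rough illustration of how wall-clock timings like these could be taken, here is a minimal sketch. The `timed` helper and the `build_collection` placeholder are hypothetical names, not taken from the original timing code, which may measure things differently.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def build_collection() -> str:
    # Hypothetical stand-in for the real work, e.g. connecting to
    # Milvus Lite and creating a collection with one of the three methods.
    time.sleep(0.01)
    return "collection-ready"


result, seconds = timed(build_collection)
print(f"Created {result!r} in {seconds:.2f} seconds")
```

Wrapping the whole connect-and-create step in one callable keeps the comparison fair, since each method is then timed over the same span of work.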

The result? If you’re looking for the best balance between high control over Milvus settings and fast setup, using the Milvus Lite APIs directly is the optimal choice. The full code and timings are available on my GitHub.

In the following sections, we’ll cover:

  1. Connecting to Milvus Lite using LlamaIndex

  2. Connecting to Milvus Lite using LangChain

  3. Connecting to Milvus Lite using Milvus Lite APIs

Connecting to Milvus Lite Using LlamaIndex

It’s easy to get started using LlamaIndex, but it is slow: in the timings above, it took about 2,156 seconds (roughly 36 minutes) to connect and create a collection.

from pymilvus import MilvusClient
from llama_index.core import (
   Settings,
   ServiceContext,
   StorageContext,
   VectorStoreIndex,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.milvus import MilvusVectorStore


# 1. Define the embedding model.
service_context = ServiceContext.from_defaults(
   # LlamaIndex local: translates to the same location as default HF cache.
   embed_model="local:BAAI/bge-large-en-v1.5")
# LlamaIndex hides this but we need it to create the vector store!
EMBEDDING_DIM = 1024


# 2. Create a Milvus collection from the documents and embeddings.
# Passing a local file path as the URI selects Milvus Lite.
vector_store = MilvusVectorStore(
   uri="./milvus_demo.db",
   dim=EMBEDDING_DIM,
   overwrite=True
)
storage_context = StorageContext.from_defaults(
   vector_store=vector_store
)
llamaindex = VectorStoreIndex.from_documents(
   # Chunk, embed, insert too slow!  Just use one document.
   docs[:1],
   storage_context=storage_context,
   service_context=service_context
)

Connecting to Milvus Lite Using LangChain

It’s easy to get started in LangChain. It takes about 8 seconds to connect and create a collection.

import time

from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter


# 1. Define the embedding model.
model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
embed_model = HuggingFaceEmbeddings(
   model_name=model_name,
   model_kwargs=model_kwargs,
   encode_kwargs=encode_kwargs)
EMBEDDING_DIM = embed_model.dict()['client'].get_sentence_embedding_dimension()


# 2. Create a Milvus collection from the documents and embeddings.
start_time = time.time()
vectorstore = Milvus.from_documents(
   documents=docs,
   embedding=embed_model,
   connection_args={
       "uri": "./milvus_demo.db",},
   # Override LangChain default values for Milvus.
   consistency_level="Eventually",
   drop_old=True,
   index_params = {
       "metric_type": "COSINE",
       "index_type": "AUTOINDEX",
       "params": {}}
)

Connecting to Milvus Lite Using Milvus Lite APIs

But what's happening behind the scenes? Let’s break down the actual steps and make the default values more explicit:

  1. Start the Milvus Lite server and connect.

  2. Select an embedding model.

  3. Create a Milvus database collection.

    1. Define a schema.
    2. Choose an index (data structure for Approximate Nearest Neighbor search).
    3. Choose a distance metric (definition of “close” in vector space).
    4. Choose the consistency level for inserting data.
  4. Select a chunking strategy.

  5. Transform chunks of data into vectors using the embedding model inference.

  6. Insert vector data into Milvus.
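
Step 3.3, the distance metric, decides which stored vectors count as “close” to a query. As a rough illustration only, the sketch below computes cosine similarity (the `COSINE` metric used in the LangChain example) and finds the nearest vector with a brute-force scan. The toy 3-dimensional vectors and the function names are assumptions for illustration; a Milvus index (step 3.2) exists precisely to avoid scanning every vector like this.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float], vectors: List[Sequence[float]]) -> int:
    """Return the index of the stored vector most similar to the query."""
    scores = [cosine_similarity(query, v) for v in vectors]
    return max(range(len(vectors)), key=scores.__getitem__)


# Toy data: real embeddings from BAAI/bge-large-en-v1.5 have 1024 dimensions.
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
best = nearest(query, stored)
print(f"Nearest stored vector: index {best}")
```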

Here is the Python code using the Milvus Lite API directly. It takes about 28 seconds to connect and create a collection.

import pymilvus


# STEP 1. CONNECT A CLIENT TO THE MILVUS LITE PYTHON SERVER.
from pymilvus import MilvusClient
mc = MilvusClient("milvus_demo.db")


# STEP 2. DOWNLOAD AN OPEN SOURCE EMBEDDING MODEL.
from sentence_transformers import SentenceTransformer
model_name = "BAAI/bge-large-en-v1.5"
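encoder = SentenceTransformer(model_name, device='cpu')
# The collection in step 3 needs the vector dimension (1024 for this model).
EMBEDDING_DIM = encoder.get_sentence_embedding_dimension()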


# STEP 3. CREATE A MILVUS COLLECTION AND DEFINE THE DATABASE INDEX.
# Uses Milvus AUTOINDEX, which defaults to HNSW.
COLLECTION_NAME = "MilvusDocs"
mc.create_collection(COLLECTION_NAME,
       EMBEDDING_DIM,
       consistency_level="Eventually",
       auto_id=True, 
       overwrite=True,)


# STEP 4. SPLIT THE DOCUMENTS INTO CHUNKS.
from langchain_community.document_transformers import BeautifulSoupTransformer
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Define chunk size and overlap.
chunk_size = 512
chunk_overlap = int(chunk_size * 0.10)  # 10% overlap, as an integer
# Split the documents into recursive, overlapping chunks.
child_splitter = RecursiveCharacterTextSplitter(
   chunk_size = chunk_size,
   chunk_overlap = chunk_overlap,
   length_function = len,  # use built-in Python len function
)
chunks = child_splitter.split_documents(docs)


# STEP 5. TRANSFORM CHUNKS INTO VECTORS USING EMBEDDING MODEL INFERENCE.
list_of_strings = [doc.page_content for doc in chunks if hasattr(doc, 'page_content')]
# encode() returns one dense vector (a NumPy array row) per input string.
embeddings = encoder.encode(list_of_strings)


# STEP 6. INSERT CHUNK LIST INTO MILVUS.
# First, build one dict per chunk: its text, its source, and its dense vector.
dict_list = []
for chunk, vector in zip(chunks, embeddings):
   chunk_dict = {
       'chunk': chunk.page_content,
       'source': chunk.metadata.get('source', ""),
       'vector': vector.tolist()
   }
   dict_list.append(chunk_dict)
mc.insert(
   COLLECTION_NAME,
   data=dict_list,
   progress_bar=True)
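
The chunk size and overlap from step 4 may be easier to picture with a simplified sliding window. This sketch is a stand-in under stated assumptions: it cuts fixed-size character windows, whereas `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences. The function name `sliding_chunks` is hypothetical.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int = 512,
                   overlap_ratio: float = 0.10) -> List[str]:
    """Cut text into fixed-size character windows that overlap."""
    overlap = int(chunk_size * overlap_ratio)  # 51 characters for 512 at 10%
    step = chunk_size - overlap                # how far each window advances
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy input: 1,200 characters of repeating digits.
text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text)
print([len(c) for c in chunks])
```

The overlap means the tail of each chunk is repeated at the head of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.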

Choosing the Right Milvus Lite Method

While the LlamaIndex and LangChain connectors offer convenience, they come with trade-offs in speed and in control over retrieval and chunking methods.

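Using the Milvus Lite APIs directly provides the highest control over Milvus retrieval settings along with fast collection creation: about 28 seconds in the timings above, behind LangChain's 8 seconds but far ahead of LlamaIndex's 2,156 seconds.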

Resources and Further Reading

Milvus Lite docs

Milvus Lite LlamaIndex docs

Milvus Lite LangChain docs

LangChain Milvus docs

LlamaIndex Milvus docs
