Tutorial 10: Milvus - Distributed Vector Database for AI

Table of Contents

Introduction

Prerequisites

Milvus Architecture

Installation and Setup

Collection, Partition, and Index Management

Vector Indexing Strategies

Hybrid Search: Vector + Scalar Filtering

Batch Operations

Scaling with Kubernetes

Comparison vs Qdrant, ChromaDB, and Pinecone

Integration with LangChain

Production Deployment

Best Practices

Conclusion

Introduction

As AI applications increasingly rely on semantic search, recommendation systems, and retrieval-augmented generation (RAG), the need for efficient vector storage and retrieval has become critical. Milvus is an open-source, distributed vector database purpose-built for handling billion-scale vector data with millisecond latency.

Unlike general-purpose databases that bolt on vector search as an afterthought, Milvus was designed from the ground up for similarity search. It supports multiple index types (IVF_FLAT, HNSW, IVF_PQ, and more), hybrid search combining vector similarity with scalar filtering, horizontal scaling across clusters, and integration with ML frameworks.
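To make the idea of similarity search concrete, here is a minimal brute-force sketch in plain Python. The toy vectors, their labels, and the top_k_cosine helper are made up for illustration and are not part of Milvus. A vector index such as HNSW exists to avoid scanning every vector the way this sketch does.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_cosine(query, vectors, k=2):
    """Score every stored vector against the query and return the k best (label, score) pairs."""
    scored = [(label, cosine(query, vec)) for label, vec in vectors.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k_cosine(query, vectors, k=2)
for label, score in results:
    print(f"{label}: {score:.3f}")
```

The scan costs one similarity computation per stored vector, which is fine for three vectors and impractical for billions. That gap is what approximate indexes are meant to close.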

This tutorial provides a comprehensive, hands-on guide to Milvus, from basic setup to production deployment.

Prerequisites

Python 3.9 or higher
Docker and Docker Compose (for local Milvus deployment)
Basic understanding of vector embeddings and similarity search
Familiarity with Python data structures

Install the required packages:

pip install pymilvus langchain langchain-openai numpy pandas

Milvus Architecture

Milvus uses a cloud-native, disaggregated architecture with four key layers:

Access Layer - Stateless proxy nodes that handle client connections, request routing, and result aggregation. These nodes are horizontally scalable and sit behind a load balancer.

Coordinator Service - The brain of the cluster, responsible for metadata management, query coordination, and data coordination. It manages collection schemas, index building tasks, and query routing.

Worker Nodes - Divided into three types:

Query Nodes: Execute search and query operations on loaded segments
Data Nodes: Handle data insertion, deletion, and compaction
Index Nodes: Build vector indexes in the background

Storage Layer - Uses object storage (MinIO/S3) for persistent data and etcd for metadata. This separation allows independent scaling of compute and storage.

Client Applications
        |
   [Access Layer - Proxy Nodes]
        |
   [Coordinator Service]
    /       |       \
[Query]  [Data]  [Index]
 Nodes   Nodes    Nodes
        |
   [Object Storage + etcd]
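As a mental model only, the division of labor among worker nodes can be written down as a small lookup table. This is not Milvus code: the operation names, the WORKER_RESPONSIBILITIES mapping, and the worker_for helper are hypothetical and simply restate the responsibilities listed above.

```python
# Which worker node type handles which kind of operation,
# following the architecture description in this tutorial.
WORKER_RESPONSIBILITIES = {
    "query_node": {"search", "query"},
    "data_node": {"insert", "delete", "compaction"},
    "index_node": {"build_index"},
}

def worker_for(operation: str) -> str:
    """Return the worker node type responsible for a given operation name."""
    for node_type, operations in WORKER_RESPONSIBILITIES.items():
        if operation in operations:
            return node_type
    raise ValueError(f"Unknown operation: {operation!r}")

print(worker_for("search"))       # query_node
print(worker_for("insert"))       # data_node
print(worker_for("build_index"))  # index_node
```

Because each responsibility lives on a separate node type, a read-heavy workload can add query nodes without also adding data or index nodes.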

Installation and Setup

Local Setup with Docker Compose

# Download the docker-compose file
wget https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml -O docker-compose.yml

# Start Milvus
docker compose up -d

# Verify it is running
docker compose ps
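Containers can be listed as running a little before the service starts accepting connections. The following standard-library sketch polls a TCP port until it opens. The wait_for_port helper and its default timeout are assumptions made for illustration, and an open port only shows that something is listening, not that Milvus has finished initializing.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, or False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

# Example call for a local standalone deployment on the default Milvus port:
#   wait_for_port("localhost", 19530)
```

Running a check like this before the first client call avoids confusing connection errors during startup.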

Connecting from Python

from pymilvus import connections, utility

# Connect to Milvus
connections.connect(
    alias="default",
    host="localhost",
    port="19530"
)

# Verify connection
print(f"Connected to Milvus. Server version: {utility.get_server_version()}")

# List existing collections
collections = utility.list_collections()
print(f"Existing collections: {collections}")
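The connection snippet above hardcodes the host and port. One possible way to keep those values out of the code is to read them from environment variables, as in this sketch. The variable names MILVUS_HOST and MILVUS_PORT and the milvus_connection_params helper are assumptions chosen for illustration, not a Milvus convention.

```python
import os

def milvus_connection_params(env=None):
    """Build keyword arguments for connections.connect from environment variables.

    Falls back to the local defaults used in this tutorial when a variable is unset.
    """
    env = os.environ if env is None else env
    return {
        "alias": "default",
        "host": env.get("MILVUS_HOST", "localhost"),   # hypothetical variable name
        "port": env.get("MILVUS_PORT", "19530"),       # hypothetical variable name
    }

print(milvus_connection_params({}))

# Usage (requires pymilvus and a running Milvus instance):
#   from pymilvus import connections
#   connections.connect(**milvus_connection_params())
```

With this approach the same script can point at a local container during development and at a remote cluster in production without code changes.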
