Introduction
In the world of AI, understanding high-dimensional data is critical for recommendation systems, NLP, and image processing. Milvus is an open-source vector database built to store and retrieve vectorized data. It is optimized for similarity search and high-performance analytics, making it a natural fit for AI applications. Running AI vector search on Kubernetes with Milvus gives you the high-dimensional data processing capabilities that mission-critical, AI-driven applications depend on.
In this post, we'll cover how to deploy Milvus on a Kubernetes cluster using Helm, set up a basic vector collection, and demonstrate how to perform a similarity search, showcasing why Milvus is used today for AI workloads on Kubernetes. We used our testing Kubernetes cluster for this article. If you're interested in deploying your own Kubernetes cluster, see How to Deploy Kubernetes Using Kubespray.
Procedure
Deploy Milvus on a Kubernetes cluster
Let's add the Milvus repo using Helm with the following commands:
$ helm repo add milvus https://zilliztech.github.io/milvus-helm/
$ helm repo update
Create the namespace and make it the default for the current context:
$ kubectl create ns milvus-system
$ kubectl config set-context --current --namespace=milvus-system
Let’s install Milvus using the helm install command:
$ helm install my-milvus --namespace milvus-system milvus/milvus
After deployment, verify that the Milvus pods are running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-milvus-datanode-777cd55cb5-wxs2l 1/1 Running 5 (14m ago) 25m
my-milvus-etcd-0 1/1 Running 0 14m
my-milvus-etcd-1 1/1 Running 0 14m
my-milvus-etcd-2 1/1 Running 0 14m
my-milvus-indexnode-56bdfb7ff6-qq2nl 1/1 Running 6 (14m ago) 25m
my-milvus-minio-0 1/1 Running 0 25m
my-milvus-minio-1 1/1 Running 0 25m
my-milvus-minio-2 1/1 Running 0 25m
my-milvus-minio-3 1/1 Running 0 25m
my-milvus-mixcoord-8644f9c9f9-jh4m5 1/1 Running 5 (14m ago) 25m
my-milvus-proxy-c5df8d6bc-62szp 1/1 Running 5 (14m ago) 25m
my-milvus-pulsarv3-bookie-0 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-1 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-2 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-init-jm57r 0/1 Completed 0 25m
my-milvus-pulsarv3-broker-0 1/1 Running 0 14m
my-milvus-pulsarv3-broker-1 1/1 Running 0 14m
my-milvus-pulsarv3-proxy-0 1/1 Running 0 14m
my-milvus-pulsarv3-proxy-1 1/1 Running 0 14m
my-milvus-pulsarv3-pulsar-init-r5d6r 0/1 Completed 0 25m
my-milvus-pulsarv3-recovery-0 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-0 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-1 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-2 1/1 Running 0 14m
my-milvus-querynode-575597dd66-d8tlz 1/1 Running 6 (14m ago) 25m
NOTE: If your cluster uses a custom DNS domain (not the default cluster.local), etcd will crash on bootstrap unless you override the domain in values.yaml:
values.yaml:
etcd:
  host: "milvus-etcd.<namespace>.custom.domain"
  clusterDomain: "custom.domain"
global:
  clusterDomain: "custom.domain"
  etcd:
    host: "milvus-etcd.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
  minio:
    host: "my-milvus-minio.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
    port: 9000
    bucket: milvus-bucket
    accessKey: my-access-key
    secretKey: my-secret-key
  dataCoord:
    host: "milvus-datacoord.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
$ helm upgrade --install my-milvus milvus/milvus -f values.yaml
Set Up a Basic Vector Collection
Connect to Milvus and define a Collection
Verify that the Milvus service is running and exposed on port 19530:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-milvus ClusterIP 10.101.22.241 <none> 19530/TCP,9091/TCP 44m
my-milvus-datanode ClusterIP None <none> 9091/TCP 44m
my-milvus-etcd ClusterIP 10.101.33.251 <none> 2379/TCP,2380/TCP 44m
my-milvus-etcd-headless ClusterIP None <none> 2379/TCP,2380/TCP 44m
my-milvus-indexnode ClusterIP None <none> 9091/TCP 44m
my-milvus-minio ClusterIP 10.101.30.39 <none> 9000/TCP 44m
my-milvus-minio-svc ClusterIP None <none> 9000/TCP 44m
my-milvus-mixcoord ClusterIP 10.101.11.160 <none> 9091/TCP 44m
my-milvus-pulsarv3-bookie ClusterIP None <none> 3181/TCP,8000/TCP 44m
my-milvus-pulsarv3-broker ClusterIP None <none> 8080/TCP,6650/TCP 44m
my-milvus-pulsarv3-proxy ClusterIP 10.101.29.5 <none> 80/TCP,6650/TCP 44m
my-milvus-pulsarv3-recovery ClusterIP None <none> 8000/TCP 44m
my-milvus-pulsarv3-zookeeper ClusterIP None <none> 8000/TCP,2888/TCP,3888/TCP,2181/TCP 44m
my-milvus-querynode ClusterIP None <none> 9091/TCP 44m
NOTE: Since this is a PoC, we will simply port-forward our way into Milvus instead of exposing it externally:
$ kubectl port-forward --address 0.0.0.0 service/my-milvus 27018:19530
Install the required Python libraries:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install pymilvus gensim
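Optionally, run a quick sanity check that the port-forward works before creating any collections. This is a minimal sketch that assumes the port-forward from the previous step (localhost:27018) is still running:
from pymilvus import connections, utility

# Connect through the kubectl port-forward opened earlier
connections.connect("default", host="localhost", port="27018")

# Print the Milvus server version to confirm connectivity
print(utility.get_server_version())

connections.disconnect("default")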
Now run the following to set up the basic vector collection:
03_define_collection.py:
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection

# Connect to Milvus (here via the kubectl port-forward opened earlier)
connections.connect("default", host="localhost", port="27018")

# Define the schema for storing word embeddings
fields = [
    FieldSchema(name="word_id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=300)  # 300D GloVe embeddings
]
schema = CollectionSchema(fields, "Word Embeddings Collection")

# Create the collection
collection = Collection("word_embeddings", schema)
Run the define collection script:
$ python3 03_define_collection.py
If there are no issues, you should get a new prompt without any specific output.
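To confirm the collection was actually created, you can ask Milvus to list its collections (a small optional check, again assuming the port-forward on localhost:27018):
from pymilvus import connections, utility

connections.connect("default", host="localhost", port="27018")

# Should print True, and the list should include "word_embeddings"
print(utility.has_collection("word_embeddings"))
print(utility.list_collections())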
Download Pre-trained Word Vectors
To show the power of a vector database, I wanted to demonstrate how arithmetic can be done with words. For that I selected the GloVe collection. From GloVe's site:
GloVe is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
Download it from the GloVe website. I selected glove.6B.zip, which is trained on Wikipedia 2014 + Gigaword 5 (6B tokens, 400K vocabulary, uncased, with 50d, 100d, 200d, and 300d vectors) and weighs 822 MB.
$ wget https://nlp.stanford.edu/data/glove.6B.zip
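The archive bundles the 50d, 100d, 200d, and 300d files; extract it and use the 300d file, since that matches the dim=300 we set in the collection schema:
$ unzip glove.6B.zip
$ ls glove.6B.*.txt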
Load and Insert Pre-trained Word Embeddings (GloVe)
We will now load GloVe embeddings and insert them into Milvus.
04_load_pretrained_word_embeddings.py:
from pymilvus import connections, Collection

# Connect to Milvus (via the port-forward) and open the collection created earlier
connections.connect("default", host="localhost", port="27018")
collection = Collection("word_embeddings")

def load_glove_embeddings(file_path, limit=None):
    embeddings = []
    words = []
    ids = []
    with open(file_path, 'r', encoding='utf-8') as f:
        for idx, line in enumerate(f):
            values = line.split()
            words.append(values[0])
            word_vector = [float(x) for x in values[1:]]
            embeddings.append(word_vector)
            ids.append(idx)
            if limit and idx >= limit - 1:
                break
    return words, ids, embeddings

# Load GloVe embeddings (limit to first 10,000 words for efficiency);
# the 300d file matches the dim=300 defined in the collection schema
words, ids, embeddings = load_glove_embeddings("path/to/glove.6B.300d.txt", limit=10000)

# Insert into Milvus (one list per field, in schema order: word_id, embedding)
data = [ids, embeddings]
collection.insert(data)
print(f"Inserted {len(ids)} word embeddings into Milvus.")

# Build an index on the embedding field so the collection can be searched later
# (these IVF_FLAT / L2 parameters are a reasonable default, not the only option)
print("Creating index...")
collection.create_index(
    field_name="embedding",
    index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 128}}
)
print("Index created successfully")
Run the script to insert the words into the database:
$ python3 04_load_pretrained_word_embeddings.py
Inserted 10000 word embeddings into Milvus.
Creating index...
Index created successfully
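As a quick check that the data landed, you can ask the collection how many entities it holds (an optional snippet, assuming the same connection settings as above):
from pymilvus import connections, Collection

connections.connect("default", host="localhost", port="27018")
collection = Collection("word_embeddings")

# flush() seals any pending inserts before counting
collection.flush()
print(collection.num_entities)  # expected: 10000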
How to Perform a Similarity Search
Perform Vector Arithmetic (king - man + woman = queen)
05_vector_arithmetic1.py:
from pymilvus import connections, Collection
import numpy as np

def load_word_mapping(glove_path):
    """Load just the words from GloVe file to map IDs to words"""
    words = []
    print("Loading word mapping from GloVe...")
    with open(glove_path, 'r', encoding='utf-8') as f:
        for line in f:
            word = line.split()[0]
            words.append(word)
    return words

def get_word_embedding(collection, word, words_list):
    """Get the embedding for a word from Milvus"""
    try:
        word_id = words_list.index(word)
        res = collection.query(
            expr=f"word_id == {word_id}",
            output_fields=["word_id", "embedding"],
            consistency_level="Strong"
        )
        if res:
            return np.array(res[0]['embedding'])
        print(f"Warning: No embedding found for '{word}'")
        return None
    except ValueError:
        print(f"Error: Word '{word}' not found in vocabulary")
        return None

def vector_arithmetic(collection, word1, word2, word3, words_list):
    """Perform vector arithmetic and search"""
    print(f"\nComputing: {word1} - {word2} + {word3}")

    # Get embeddings for the words
    vec1 = get_word_embedding(collection, word1, words_list)
    vec2 = get_word_embedding(collection, word2, words_list)
    vec3 = get_word_embedding(collection, word3, words_list)

    if vec1 is None or vec2 is None or vec3 is None:
        print("Cannot complete operation - missing embeddings")
        return

    # Perform the arithmetic with raw vectors
    query_vector = vec1 - vec2 + vec3

    search_params = {
        "metric_type": "L2",
        "params": {"nprobe": 16}
    }
    results = collection.search(
        data=[query_vector.tolist()],
        anns_field="embedding",
        param=search_params,
        limit=10,
        output_fields=["word_id"]
    )

    # Filter out input words and common words
    common_words = {
        'likewise', 'similarly', 'instance', 'moreover', 'presumably',
        'nevertheless', 'however', 'therefore', 'thus', 'hence',
        'particularly', 'especially', 'specifically', 'generally',
        'usually', 'typically', 'occasionally', 'sometimes'
    }
    for hit in results[0]:
        word = words_list[hit.id]
        if word not in {word1, word2, word3} and word not in common_words:
            print(f"Result: {word}")
            return
    print("No meaningful result found")

def main():
    print("Initializing...")
    connections.connect("default", host="localhost", port="27018")

    words = load_word_mapping("glove.6B.300d.txt")
    collection = Collection("word_embeddings")
    collection.load()

    # Example usage
    vector_arithmetic(collection, "king", "man", "woman", words)
    vector_arithmetic(collection, "paris", "france", "italy", words)
    vector_arithmetic(collection, "bigger", "big", "small", words)

    # Cleanup
    collection.release()
    connections.disconnect("default")

if __name__ == "__main__":
    main()
Let’s run the script:
$ python3 05_vector_arithmetic1.py
Initializing...
Loading word mapping from GloVe...
Computing: king - man + woman
Result: queen
Computing: paris - france + italy
Result: rome
Computing: bigger - big + small
Result: larger
Amazing! We can now do arithmetic with words using Milvus.
Explanation
- We retrieve the embeddings for king, man, and woman.
- We compute the vector arithmetic: king - man + woman.
- We query Milvus for the vectors most similar to this computed vector.
- We filter out the input words and a small set of common filler words.
- The expected closest word in the results is queen.
Why Use Milvus for AI Applications?
Milvus is particularly effective for AI workloads due to:
- Optimized Similarity Search: Handles millions of high-dimensional vectors efficiently.
- Scalability on Kubernetes: Can scale up to support increasing AI workloads dynamically.
- AI Integration: Works well with models from TensorFlow, PyTorch, and Hugging Face.
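As an illustration of that last point, here is a minimal sketch of pushing embeddings from a Hugging Face model into Milvus. It assumes the sentence-transformers package, the all-MiniLM-L6-v2 model, and a hypothetical doc_embeddings collection created with an INT64 primary key and a dim=384 vector field, analogous to the word_embeddings collection above:
from pymilvus import connections, Collection
from sentence_transformers import SentenceTransformer

# Encode a couple of documents with a Hugging Face model (384-dimensional vectors)
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["vector search on kubernetes", "similarity search with milvus"]
vectors = model.encode(docs)

# Insert into the hypothetical doc_embeddings collection (ids column, then vectors column)
connections.connect("default", host="localhost", port="27018")
collection = Collection("doc_embeddings")
collection.insert([[0, 1], vectors.tolist()])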
Summary
Milvus provides a scalable and efficient vector database for AI applications. By deploying Milvus on Kubernetes, we gain a powerful system for handling high-dimensional embeddings, allowing us to perform similarity searches and vector arithmetic effortlessly. This guide has shown how to install Milvus using Helm, connect via Python, insert embeddings, and perform computations like king - man + woman = queen using pre-trained word embeddings.
By leveraging the above, organizations can run AI vector search on Kubernetes with Milvus, optimize AI-driven applications, and efficiently manage vast amounts of vector data. If you're working with embeddings, recommendation engines, or AI search applications, Milvus is a game-changer.
Octopus Computer Solutions delivers scalable AI solutions, leveraging vector search, automation, and Kubernetes to optimize data-driven decision-making.