Introduction
In the world of AI, understanding high-dimensional data is critical for recommendation systems, NLP, and image processing. Milvus is an open-source vector database built to store and retrieve vectorized data. It is optimized for similarity search and high-performance analytics, making it a natural fit for AI applications. Running AI vector search on Kubernetes with Milvus gives you the high-dimensional data processing capabilities that mission-critical, AI-driven applications depend on.
In this post, we'll cover how to deploy Milvus on a Kubernetes cluster using Helm, set up a basic vector collection, and demonstrate how to perform a similarity search, showcasing why Milvus is used today for AI workloads on Kubernetes. We used our testing Kubernetes cluster for this article. If you're interested in deploying your own Kubernetes cluster, see How to Deploy Kubernetes Using Kubespray.
Procedure
Deploy Milvus on a Kubernetes cluster
Let's add the Milvus repo using Helm with the following commands:
$ helm repo add milvus https://zilliztech.github.io/milvus-helm/
$ helm repo update
Create the namespace and make it the default for the current context:
$ kubectl create ns milvus-system
$ kubectl config set-context --current --namespace=milvus-system
Let’s install Milvus using the helm install command:
$ helm install my-milvus --namespace milvus-system milvus/milvus
After deployment, verify that the Milvus pods are running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-milvus-datanode-777cd55cb5-wxs2l 1/1 Running 5 (14m ago) 25m
my-milvus-etcd-0 1/1 Running 0 14m
my-milvus-etcd-1 1/1 Running 0 14m
my-milvus-etcd-2 1/1 Running 0 14m
my-milvus-indexnode-56bdfb7ff6-qq2nl 1/1 Running 6 (14m ago) 25m
my-milvus-minio-0 1/1 Running 0 25m
my-milvus-minio-1 1/1 Running 0 25m
my-milvus-minio-2 1/1 Running 0 25m
my-milvus-minio-3 1/1 Running 0 25m
my-milvus-mixcoord-8644f9c9f9-jh4m5 1/1 Running 5 (14m ago) 25m
my-milvus-proxy-c5df8d6bc-62szp 1/1 Running 5 (14m ago) 25m
my-milvus-pulsarv3-bookie-0 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-1 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-2 1/1 Running 0 14m
my-milvus-pulsarv3-bookie-init-jm57r 0/1 Completed 0 25m
my-milvus-pulsarv3-broker-0 1/1 Running 0 14m
my-milvus-pulsarv3-broker-1 1/1 Running 0 14m
my-milvus-pulsarv3-proxy-0 1/1 Running 0 14m
my-milvus-pulsarv3-proxy-1 1/1 Running 0 14m
my-milvus-pulsarv3-pulsar-init-r5d6r 0/1 Completed 0 25m
my-milvus-pulsarv3-recovery-0 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-0 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-1 1/1 Running 0 14m
my-milvus-pulsarv3-zookeeper-2 1/1 Running 0 14m
my-milvus-querynode-575597dd66-d8tlz 1/1 Running 6 (14m ago) 25m
NOTE: If your cluster uses a custom DNS domain (not the default cluster.local), etcd will crash on bootstrap unless you override the domain in values.yaml:
values.yaml:
etcd:
  host: "milvus-etcd.<namespace>.custom.domain"
  clusterDomain: "custom.domain"
global:
  clusterDomain: "custom.domain"
  etcd:
    host: "milvus-etcd.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
  minio:
    host: "my-milvus-minio.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
    port: 9000
    bucket: milvus-bucket
    accessKey: my-access-key
    secretKey: my-secret-key
  dataCoord:
    host: "milvus-datacoord.<namespace>.custom.domain"
    clusterDomain: "custom.domain"
$ helm upgrade --install my-milvus milvus/milvus -f values.yaml
Set Up a Basic Vector Collection
Connect to Milvus and define a Collection
Verify that the Milvus service is running and exposed on port 19530:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-milvus ClusterIP 10.101.22.241 <none> 19530/TCP,9091/TCP 44m
my-milvus-datanode ClusterIP None <none> 9091/TCP 44m
my-milvus-etcd ClusterIP 10.101.33.251 <none> 2379/TCP,2380/TCP 44m
my-milvus-etcd-headless ClusterIP None <none> 2379/TCP,2380/TCP 44m
my-milvus-indexnode ClusterIP None <none> 9091/TCP 44m
my-milvus-minio ClusterIP 10.101.30.39 <none> 9000/TCP 44m
my-milvus-minio-svc ClusterIP None <none> 9000/TCP 44m
my-milvus-mixcoord ClusterIP 10.101.11.160 <none> 9091/TCP 44m
my-milvus-pulsarv3-bookie ClusterIP None <none> 3181/TCP,8000/TCP 44m
my-milvus-pulsarv3-broker ClusterIP None <none> 8080/TCP,6650/TCP 44m
my-milvus-pulsarv3-proxy ClusterIP 10.101.29.5 <none> 80/TCP,6650/TCP 44m
my-milvus-pulsarv3-recovery ClusterIP None <none> 8000/TCP 44m
my-milvus-pulsarv3-zookeeper ClusterIP None <none> 8000/TCP,2888/TCP,3888/TCP,2181/TCP 44m
my-milvus-querynode ClusterIP None <none> 9091/TCP 44m
NOTE: Since this is a PoC, we will simply port-forward our way into Milvus instead of exposing it externally:
$ kubectl port-forward --address 0.0.0.0 service/my-milvus 27018:19530
Install the required Python libraries:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install pymilvus gensim
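Optionally, run a quick sanity check that the port-forward works before creating any collections. This is a minimal sketch that assumes the port-forward from the previous step (localhost:27018) is still running:
from pymilvus import connections, utility

# Connect through the kubectl port-forward opened earlier
connections.connect("default", host="localhost", port="27018")

# Print the Milvus server version to confirm connectivity
print(utility.get_server_version())

connections.disconnect("default")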
Now run the following to set up the basic vector collection:
03_define_collection.py:
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection

# Connect to Milvus (here via the kubectl port-forward opened earlier)
connections.connect("default", host="localhost", port="27018")

# Define the schema for storing word embeddings
fields = [
    FieldSchema(name="word_id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=300)  # 300D GloVe embeddings
]
schema = CollectionSchema(fields, "Word Embeddings Collection")

# Create the collection
collection = Collection("word_embeddings", schema)
Run the define collection script:
$ python3 03_define_collection.py
If there are no issues, you should get a new prompt without any specific output.
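To confirm the collection was actually created, you can ask Milvus to list its collections (a small optional check, again assuming the port-forward on localhost:27018):
from pymilvus import connections, utility

connections.connect("default", host="localhost", port="27018")

# Should print True, and the list should include "word_embeddings"
print(utility.has_collection("word_embeddings"))
print(utility.list_collections())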
Download Pre-trained Word Vectors
To show the power of a vector database, I wanted to demonstrate how arithmetic can be done with words. For that I selected the GloVe collection. From GloVe's site:
GloVe is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
Download it from the GloVe website. I selected glove.6B.zip, which is trained on Wikipedia 2014 + Gigaword 5 (6B tokens, 400K vocabulary, uncased, with 50d, 100d, 200d, and 300d vectors) and weighs 822 MB.
$ wget https://nlp.stanford.edu/data/glove.6B.zip
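The archive bundles the 50d, 100d, 200d, and 300d files; extract it and use the 300d file, since that matches the dim=300 we set in the collection schema:
$ unzip glove.6B.zip
$ ls glove.6B.*.txt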
Load and Insert Pre-trained Word Embeddings (GloVe)
We will now load GloVe embeddings and insert them into Milvus.
04_load_pretrained_word_embeddings.py:
from pymilvus import connections, Collection

# Connect to Milvus (via the port-forward) and open the collection created earlier
connections.connect("default", host="localhost", port="27018")
collection = Collection("word_embeddings")

def load_glove_embeddings(file_path, limit=None):
    embeddings = []
    words = []
    ids = []
    with open(file_path, 'r', encoding='utf-8') as f:
        for idx, line in enumerate(f):
            values = line.split()
            words.append(values[0])
            word_vector = [float(x) for x in values[1:]]
            embeddings.append(word_vector)
            ids.append(idx)
            if limit and idx >= limit - 1:
                break
    return words, ids, embeddings

# Load GloVe embeddings (limit to first 10,000 words for efficiency);
# the 300d file matches the dim=300 defined in the collection schema
words, ids, embeddings = load_glove_embeddings("path/to/glove.6B.300d.txt", limit=10000)

# Insert into Milvus (one list per field, in schema order: word_id, embedding)
data = [ids, embeddings]
collection.insert(data)
print(f"Inserted {len(ids)} word embeddings into Milvus.")

# Build an index on the embedding field so the collection can be searched later
# (these IVF_FLAT / L2 parameters are a reasonable default, not the only option)
print("Creating index...")
collection.create_index(
    field_name="embedding",
    index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 128}}
)
print("Index created successfully")
Run the script to insert the words into the database:
$ python3 04_load_pretrained_word_embeddings.py
Inserted 10000 word embeddings into Milvus.
Creating index...
Index created successfully
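As a quick check that the data landed, you can ask the collection how many entities it holds (an optional snippet, assuming the same connection settings as above):
from pymilvus import connections, Collection

connections.connect("default", host="localhost", port="27018")
collection = Collection("word_embeddings")

# flush() seals any pending inserts before counting
collection.flush()
print(collection.num_entities)  # expected: 10000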
How to Perform a Similarity Search
Perform Vector Arithmetic (king - man + woman = queen)
05_vector_arithmetic1.py:
from pymilvus import connections, Collection
import numpy as np

def load_word_mapping(glove_path):
    """Load just the words from GloVe file to map IDs to words"""
    words = []
    print("Loading word mapping from GloVe...")
    with open(glove_path, 'r', encoding='utf-8') as f:
        for line in f:
            word = line.split()[0]
            words.append(word)
    return words

def get_word_embedding(collection, word, words_list):
    """Get the embedding for a word from Milvus"""
    try:
        word_id = words_list.index(word)
        res = collection.query(
            expr=f"word_id == {word_id}",
            output_fields=["word_id", "embedding"],
            consistency_level="Strong"
        )
        if res:
            return np.array(res[0]['embedding'])
        print(f"Warning: No embedding found for '{word}'")
        return None
    except ValueError:
        print(f"Error: Word '{word}' not found in vocabulary")
        return None

def vector_arithmetic(collection, word1, word2, word3, words_list):
    """Perform vector arithmetic and search"""
    print(f"\nComputing: {word1} - {word2} + {word3}")

    # Get embeddings for the words
    vec1 = get_word_embedding(collection, word1, words_list)
    vec2 = get_word_embedding(collection, word2, words_list)
    vec3 = get_word_embedding(collection, word3, words_list)

    if vec1 is None or vec2 is None or vec3 is None:
        print("Cannot complete operation - missing embeddings")
        return

    # Perform the arithmetic with raw vectors
    query_vector = vec1 - vec2 + vec3

    search_params = {
        "metric_type": "L2",
        "params": {"nprobe": 16}
    }
    results = collection.search(
        data=[query_vector.tolist()],
        anns_field="embedding",
        param=search_params,
        limit=10,
        output_fields=["word_id"]
    )

    # Filter out input words and common words
    common_words = {
        'likewise', 'similarly', 'instance', 'moreover', 'presumably',
        'nevertheless', 'however', 'therefore', 'thus', 'hence',
        'particularly', 'especially', 'specifically', 'generally',
        'usually', 'typically', 'occasionally', 'sometimes'
    }
    for hit in results[0]:
        word = words_list[hit.id]
        if word not in {word1, word2, word3} and word not in common_words:
            print(f"Result: {word}")
            return
    print("No meaningful result found")

def main():
    print("Initializing...")
    connections.connect("default", host="localhost", port="27018")

    words = load_word_mapping("glove.6B.300d.txt")
    collection = Collection("word_embeddings")
    collection.load()

    # Example usage
    vector_arithmetic(collection, "king", "man", "woman", words)
    vector_arithmetic(collection, "paris", "france", "italy", words)
    vector_arithmetic(collection, "bigger", "big", "small", words)

    # Cleanup
    collection.release()
    connections.disconnect("default")

if __name__ == "__main__":
    main()
Let’s run the script:
$ python3 05_vector_arithmetic1.py
Initializing...
Loading word mapping from GloVe...
Computing: king - man + woman
Result: queen
Computing: paris - france + italy
Result: rome
Computing: bigger - big + small
Result: larger
Amazing! We can now do arithmetic with words using Milvus.
Explanation
- We retrieve the embeddings for king, man, and woman.
- We compute the vector arithmetic: king - man + woman.
- We query Milvus for the vectors most similar to this computed vector.
- We filter out the input words and a small set of common filler words.
- The expected closest word in the results is queen.
Why Use Milvus for AI Applications?
Milvus is particularly effective for AI workloads due to:
- Optimized Similarity Search: Handles millions of high-dimensional vectors efficiently.
- Scalability on Kubernetes: Can scale up to support increasing AI workloads dynamically.
- AI Integration: Works well with models from TensorFlow, PyTorch, and Hugging Face.
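As an illustration of that last point, here is a minimal sketch of pushing embeddings from a Hugging Face model into Milvus. It assumes the sentence-transformers package, the all-MiniLM-L6-v2 model, and a hypothetical doc_embeddings collection created with an INT64 primary key and a dim=384 vector field, analogous to the word_embeddings collection above:
from pymilvus import connections, Collection
from sentence_transformers import SentenceTransformer

# Encode a couple of documents with a Hugging Face model (384-dimensional vectors)
model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["vector search on kubernetes", "similarity search with milvus"]
vectors = model.encode(docs)

# Insert into the hypothetical doc_embeddings collection (ids column, then vectors column)
connections.connect("default", host="localhost", port="27018")
collection = Collection("doc_embeddings")
collection.insert([[0, 1], vectors.tolist()])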
Summary
Milvus provides a scalable and efficient vector database for AI applications. By deploying Milvus on Kubernetes, we gain a powerful system for handling high-dimensional embeddings, allowing us to perform similarity searches and vector arithmetic effortlessly. This guide has shown how to install Milvus using Helm, connect via Python, insert embeddings, and perform computations like king - man + woman = queen using pre-trained word embeddings.
By leveraging the above, organizations can run AI vector search on Kubernetes with Milvus, optimize AI-driven applications, and efficiently manage vast amounts of vector data. If you're working with embeddings, recommendation engines, or AI search applications, Milvus is a game-changer.
Octopus Computer Solutions delivers scalable AI solutions, leveraging vector search, automation, and Kubernetes to optimize data-driven decision-making.