Vector Similarity Search on Kubernetes with Milvus

Question

Pulumi · Accepted Answer

To run vector similarity search on Kubernetes with Milvus, you need a Kubernetes cluster with Milvus installed on it. Milvus is an open-source vector database built for similarity search and AI applications.

We will walk through the following steps in this program:

1. Deploy a Kubernetes cluster using Pulumi.
2. Install Milvus on the Kubernetes cluster.

We use a managed Kubernetes service to simplify operations. The choice of cloud provider is arbitrary: Amazon EKS, Google GKE, Azure AKS, or any other managed Kubernetes service would work. This example uses Google Kubernetes Engine (GKE).

After setting up the cluster, we use Helm, a package manager for Kubernetes, to install Milvus.

Let's assume that you have the following prerequisites ready:
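- Pulumi CLI installed and configured with your cloud provider
- Google Cloud SDK installed and configured (if you're using GKE)
- A GCP project and a default zone or region set in your Pulumi configuration (for example `gcp:project` and `gcp:zone`), because the cluster below does not set a `location`
- Python 3 with the `pulumi`, `pulumi-gcp`, and `pulumi-kubernetes` packages installed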

Here's a Pulumi program in Python to deploy a Kubernetes cluster and install Milvus:

```python
import pulumi
import pulumi_gcp as gcp
import pulumi_kubernetes as k8s
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts

# Create a GKE cluster
gke_cluster = gcp.container.Cluster("milvus-cluster",
    initial_node_count=3,
    node_version="latest",
    min_master_version="latest",
    node_config=gcp.container.ClusterNodeConfigArgs(
        machine_type="e2-standard-4",
        oauth_scopes=[
            "https://www.googleapis.com/auth/compute",
            "https://www.googleapis.com/auth/devstorage.read_only",
            "https://www.googleapis.com/auth/logging.write",
            "https://www.googleapis.com/auth/monitoring"
        ],
    ),
)

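# Build a kubeconfig from the cluster outputs. get_cluster() does not return
# one, so it is assembled here. Assumption: gke-gcloud-auth-plugin is
# installed on the machine that runs Pulumi.
kubeconfig = pulumi.Output.all(
    gke_cluster.name,
    gke_cluster.endpoint,
    gke_cluster.master_auth.cluster_ca_certificate,
).apply(lambda args: f"""apiVersion: v1
kind: Config
clusters:
- name: {args[0]}
  cluster:
    server: https://{args[1]}
    certificate-authority-data: {args[2]}
contexts:
- name: {args[0]}
  context:
    cluster: {args[0]}
    user: {args[0]}
current-context: {args[0]}
users:
- name: {args[0]}
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true
""")

# Kubernetes provider that targets the new cluster
k8s_provider = k8s.Provider("k8s-provider", kubeconfig=kubeconfig)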

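# Install Milvus using Helm
milvus_chart = Chart("milvus",
    ChartOpts(
        chart="milvus",
        version="2.0.0",
        fetch_opts=k8s.helm.v3.FetchOpts(
            repo="https://milvus-io.github.io/milvus-helm/"
        ),
        # Assumption: the chart's Service defaults to ClusterIP, so request a
        # LoadBalancer to get an external IP.
        values={"service": {"type": "LoadBalancer"}},
    ),
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)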

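# Export the Milvus endpoint. pulumi.export() returns None, so its result is
# not assigned. The "milvus/milvus" key depends on the namespace and Service
# name the chart renders; adjust it if your release differs.
milvus_service = milvus_chart.get_resource("v1/Service", "milvus/milvus")
pulumi.export("milvus_endpoint", milvus_service.status.apply(lambda status: status.load_balancer.ingress[0].ip))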
```

### Explanation

1. **Create GKE Cluster**: We define a Google Kubernetes Engine cluster named `milvus-cluster` with an initial node count of 3. Each node uses the `e2-standard-4` machine type with appropriate OAuth scopes. This will be our underlying infrastructure to run Milvus.

2. **Configure the Kubernetes Provider**: For Pulumi to communicate with our Kubernetes cluster, the K8s Provider needs a kubeconfig for the GKE cluster we created. The kubeconfig is passed to the provider; it is not exported as a stack output.

3. **Install Milvus with Helm**: We declare a Helm chart for Milvus. Helm charts are bundles of pre-configured Kubernetes resources. The `Chart` class from the Pulumi Kubernetes SDK represents such a Helm chart. We pass the chart name, version and repository.

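4. **Export Milvus Endpoint**: At the end of the program, we export the Milvus service's endpoint as a stack output so that it can be accessed. The `get_resource` method looks up a resource deployed by the Helm chart so we can read its status and get the load balancer IP address. An IP appears there only when the Service is of type `LoadBalancer` and the cloud provider has finished provisioning it.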
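
The endpoint lookup in step 4 indexes straight into `ingress[0].ip`, which fails while the load balancer is still pending and on providers that report a hostname instead of an IP. The following is a minimal sketch of a more tolerant lookup. The helper name `load_balancer_address` is hypothetical, and the `SimpleNamespace` objects are stand-ins shaped like the Service status Pulumi passes to `apply`; they are not real Pulumi objects.

```python
from types import SimpleNamespace
from typing import Optional


def load_balancer_address(status) -> Optional[str]:
    """Return the first external IP or hostname in a Service status, or None."""
    load_balancer = getattr(status, "load_balancer", None)
    ingress = getattr(load_balancer, "ingress", None) or []
    for entry in ingress:
        # Prefer an IP, fall back to a hostname.
        address = getattr(entry, "ip", None) or getattr(entry, "hostname", None)
        if address:
            return address
    return None


# Stand-in statuses: one with an IP, one with only a hostname, one still pending.
with_ip = SimpleNamespace(load_balancer=SimpleNamespace(
    ingress=[SimpleNamespace(ip="203.0.113.10", hostname=None)]))
with_hostname = SimpleNamespace(load_balancer=SimpleNamespace(
    ingress=[SimpleNamespace(ip=None, hostname="lb.example.com")]))
pending = SimpleNamespace(load_balancer=SimpleNamespace(ingress=None))

print(load_balancer_address(with_ip))
print(load_balancer_address(with_hostname))
print(load_balancer_address(pending))
```

In the Pulumi program, a helper like this could be passed to `status.apply(...)` in place of the inline lambda.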

Running `pulumi up` on this program provisions a Kubernetes cluster on GCP and then deploys Milvus to it using Helm. After deployment, `pulumi stack output milvus_endpoint` prints the IP address for reaching Milvus, and you can start indexing and searching vectors.
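
To make "searching vectors" concrete, here is a small pure-Python sketch of what a top-k similarity search returns. It is a brute-force illustration using cosine similarity, not how Milvus implements search (a vector database uses index structures to avoid scanning every vector). The function names and sample data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          vectors: Dict[str, Sequence[float]],
          k: int) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Three toy 3-dimensional embeddings keyed by a document id.
catalog = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

for key, score in top_k([1.0, 0.0, 0.0], catalog, k=2):
    print(key, round(score, 3))
```

The query points in the same direction as `doc-a`, so that entry ranks first, followed by the nearly parallel `doc-b`; the orthogonal `doc-c` is cut off by `k=2`.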