1. Decentralized AI Model Serving using Kubernetes with KubeDB


    Deploying a decentralized AI model serving solution on Kubernetes with KubeDB involves setting up a Kubernetes cluster and then deploying a database using KubeDB to store and manage the model's data. The AI model itself would be containerized and served using a deployment that allows for scalable access to the model's API.

    Below, I will guide you through a Pulumi program that accomplishes this task. The program will involve the following steps:

    1. Provisioning a Kubernetes cluster.
    2. Installing KubeDB on the cluster.
    3. Creating a database instance for our AI models using KubeDB.
    4. Deploying the AI models as Kubernetes deployments with associated services for access.
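Before looking at the full program, it helps to see that the KubeDB Postgres object in step 3 is just a Kubernetes custom resource whose manifest can be built as a plain dictionary. The sketch below assumes the v1alpha1 field layout used later in this guide (`storageType` and `storage` under `spec`); verify the exact schema against the KubeDB version you install:

```python
def postgres_manifest(name: str, namespace: str = "default",
                      version: str = "11.1-v2", storage: str = "10Gi") -> dict:
    """Build a KubeDB Postgres custom-resource manifest as a plain dict."""
    return {
        "apiVersion": "kubedb.com/v1alpha1",
        "kind": "Postgres",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "version": version,
            "storageType": "Durable",  # persist data on a PersistentVolume
            "storage": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": storage}},
            },
        },
    }

manifest = postgres_manifest("ai-model-db")
```

Keeping the manifest in a helper like this makes it easy to stamp out additional databases (for example, one per model family) with different names and storage sizes.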

    For this example, I will use the Google Kubernetes Engine (GKE) for the Kubernetes cluster and a generic deployment for the AI model. Please note that you may need to adjust configurations like the model container image, database specifications, and resource allocations based on the specific requirements of your AI model serving solution.

```python
import pulumi
import pulumi_google_native as google_native
import pulumi_kubernetes as k8s
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts, FetchOpts

project_id = "<your-project-id>"  # Replace with your Google Cloud project ID

# Provision a Google Kubernetes Engine (GKE) Autopilot cluster.
cluster = google_native.container.v1.Cluster(
    "ai-model-cluster",
    autopilot=google_native.container.v1.AutopilotArgs(enabled=True),
    location="us-central1",
    name="ai-model-cluster",
    project=project_id,
)

# Render a kubeconfig for the new cluster so subsequent Kubernetes
# resources can target it.
def cluster_to_kubeconfig(args) -> str:
    name, endpoint, ca_cert, location = args
    context = f"gke_{project_id}_{location}_{name}"
    return f"""apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: {ca_cert}
    server: https://{endpoint}
  name: {context}
contexts:
- context:
    cluster: {context}
    user: {context}
  name: {context}
current-context: {context}
kind: Config
preferences: {{}}
users:
- name: {context}
  user:
    auth-provider:
      config:
        cmd-args: config config-helper --format=json
        cmd-path: gcloud
        expiry-key: '{{.credential.token_expiry}}'
        token-key: '{{.credential.access_token}}'
      name: gcp
"""

kubeconfig = pulumi.Output.all(
    cluster.name,
    cluster.endpoint,
    cluster.master_auth.cluster_ca_certificate,
    cluster.location,
).apply(cluster_to_kubeconfig)

# Use the GKE cluster as the provider for all subsequent resources.
k8s_provider = k8s.Provider("gke-k8s", kubeconfig=kubeconfig)

# Deploy KubeDB via its Helm chart.
kubedb_chart = Chart(
    "kubedb",
    ChartOpts(
        chart="kubedb",
        version="<desired-chart-version>",  # Specify the version you wish to use
        namespace="kube-system",
        fetch_opts=FetchOpts(repo="https://charts.kubedb.com/stable/"),
    ),
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

# Create a PostgreSQL database through KubeDB's Postgres custom resource.
postgres_db = k8s.apiextensions.CustomResource(
    "postgres-db",
    api_version="kubedb.com/v1alpha1",
    kind="Postgres",
    metadata={"name": "ai-model-db", "namespace": "default"},
    spec={
        "version": "11.1-v2",
        "storageType": "Durable",
        "storage": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": "10Gi"}},
        },
    },
    opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[kubedb_chart]),
)

# Deploy the AI model as a Kubernetes Deployment.
ai_model_deployment = k8s.apps.v1.Deployment(
    "ai-model-deployment",
    spec=k8s.apps.v1.DeploymentSpecArgs(
        selector=k8s.meta.v1.LabelSelectorArgs(match_labels={"app": "ai-model"}),
        replicas=2,
        template=k8s.core.v1.PodTemplateSpecArgs(
            metadata=k8s.meta.v1.ObjectMetaArgs(labels={"app": "ai-model"}),
            spec=k8s.core.v1.PodSpecArgs(
                containers=[
                    k8s.core.v1.ContainerArgs(
                        name="ai-model",
                        image="<your-model-container-image>",  # Replace with your AI model's container image
                    )
                ]
            ),
        ),
    ),
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

# Expose the AI model with a LoadBalancer Service.
ai_model_service = k8s.core.v1.Service(
    "ai-model-service",
    spec=k8s.core.v1.ServiceSpecArgs(
        selector={"app": "ai-model"},
        ports=[k8s.core.v1.ServicePortArgs(port=80)],
        type="LoadBalancer",
    ),
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

# Export the externally reachable service endpoint.
pulumi.export(
    "ai_model_service_endpoint",
    ai_model_service.status.apply(lambda s: s.load_balancer.ingress[0].ip),
)
```

    This program will create a GKE cluster, install KubeDB using a Helm chart, create a PostgreSQL database for the AI models, and deploy the models themselves as a Kubernetes deployment. It will also create a Kubernetes service of type LoadBalancer to expose the AI models' API externally.
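The exported endpoint comes from the Service's load-balancer status. The extraction logic can be isolated as a small pure function and tested against a plain dict (a sketch only; in the Pulumi program the status arrives as an output object, not a dict, and some clouds report a hostname rather than an IP):

```python
from typing import Optional

def ingress_address(status: dict) -> Optional[str]:
    """Return the first load-balancer ingress IP or hostname, if present."""
    ingress = (status.get("load_balancer") or {}).get("ingress") or []
    if not ingress:
        return None  # LoadBalancer not yet provisioned
    first = ingress[0]
    return first.get("ip") or first.get("hostname")

# GKE populates `ip`; e.g. AWS load balancers populate `hostname` instead.
addr = ingress_address({"load_balancer": {"ingress": [{"ip": "34.1.2.3"}]}})
```

Handling the "not yet provisioned" case explicitly matters in practice, because the ingress list is empty for a short window after the Service is created.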

    Be sure to replace <your-project-id> with your actual Google Cloud project ID and <your-model-container-image> with the container image URL of your AI model, and specify the KubeDB Helm chart version you wish to use.

    Once this program is applied using Pulumi, it will provision all the resources, and you can interact with your AI model through the provided service endpoint.
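Once the LoadBalancer address is known, clients can call the model's API over plain HTTP. The following standard-library sketch builds such a request; note that the /predict path and the JSON payload shape are assumptions for illustration, since your model server defines its own API:

```python
import json
import urllib.request

def build_predict_request(host: str, payload: dict,
                          path: str = "/predict") -> urllib.request.Request:
    """Build a POST request carrying a JSON payload for the model service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request("34.1.2.3", {"inputs": [1.0, 2.0]})
# Send with urllib.request.urlopen(req) once the service is reachable.
```

Separating request construction from sending keeps the client logic testable without a live cluster.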