Multi-Model Deployment Patterns with Seldon Core

Question

Pulumi · Accepted Answer

To deploy multiple models using Seldon Core with Pulumi, you would typically install Seldon Core on a Kubernetes cluster, such as one managed by Amazon EKS, Google GKE, or Azure AKS. Seldon Core is an open-source platform for deploying machine learning models on Kubernetes, offering features like A/B testing, canary rollouts, and multi-model serving.

To achieve this, you would:

1. Set up a Kubernetes cluster on your chosen cloud provider.
2. Install Seldon Core to your cluster using Pulumi's Kubernetes provider.
3. Define your machine learning model deployments using Seldon's Custom Resource Definitions (CRDs).
4. Expose your models for inference via an ingress or load balancer.
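Step 2 above can be sketched as a Helm release managed by Pulumi. This is a minimal sketch, not a definitive install: the chart name and repository follow the Seldon Core install guide, while the `seldon-system` namespace, the Istio setting, and the `install_seldon_core` helper name are assumptions chosen for illustration.

```python
# Chart coordinates from the Seldon Core install guide.
SELDON_CHART = "seldon-core-operator"
SELDON_REPO = "https://storage.googleapis.com/seldon-charts"

# Assumptions: a dedicated namespace, and Istio as the ingress integration.
SELDON_NAMESPACE = "seldon-system"
SELDON_VALUES = {"istio": {"enabled": True}}


def install_seldon_core():
    """Declare a Helm release for the Seldon Core operator.

    Hypothetical helper: call it from inside a Pulumi program.
    """
    # Imported lazily so this module still loads where Pulumi is not installed.
    from pulumi_kubernetes.helm.v3 import Release, ReleaseArgs, RepositoryOptsArgs

    return Release(
        "seldon-core",
        ReleaseArgs(
            chart=SELDON_CHART,
            repository_opts=RepositoryOptsArgs(repo=SELDON_REPO),
            namespace=SELDON_NAMESPACE,
            create_namespace=True,  # let Helm create the namespace if it is missing
            values=SELDON_VALUES,
        ),
    )
```

Calling `install_seldon_core()` from your Pulumi program registers the release. If you use Ambassador instead of Istio, the values would need to change accordingly.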

Below is a Pulumi program in Python that declares a Seldon Core deployment containing two models. It assumes Pulumi can already reach your Kubernetes cluster (for example, through the same kubeconfig context that `kubectl` uses) and that the Seldon Core operator and its CRDs are already installed in the cluster.

```python
import pulumi
from pulumi_kubernetes.apiextensions import CustomResource

# Assumes the Seldon Core operator and its CRDs are already installed on the cluster
# and that Pulumi is configured to interact with your Kubernetes cluster.

# Pod spec overrides for each model (container images, resource requests, etc.).
# They are left empty here because they are strongly use case specific. Fill them in
# before deploying, or remove the "componentSpecs" entries below if you rely on a
# prepackaged model server and need no overrides.
model_a = {
    # Configuration for model A
}

model_b = {
    # Configuration for model B
}

# pulumi_kubernetes ships no typed SeldonDeployment class, so the resource is
# declared as a generic CustomResource with Seldon Core's apiVersion and kind.
seldon_deployment = CustomResource(
    "multi-model-deployment",
    api_version="machinelearning.seldon.io/v1",
    kind="SeldonDeployment",
    spec={
        "name": "multi-model-example",
        "predictors": [
            {
                "name": "model-a-predictor",
                "graph": {
                    "name": "model-a",
                    # Replace with your model server, e.g. SKLEARN_SERVER
                    "implementation": "MODEL_A_IMPLEMENTATION",
                    "modelUri": "gs://path-to-model-a",  # Replace with your model storage path
                },
                "replicas": 1,
                # Traffic weights across predictors are expected to sum to 100.
                "traffic": 50,
                "componentSpecs": [{
                    "spec": model_a
                }],
            },
            {
                "name": "model-b-predictor",
                "graph": {
                    "name": "model-b",
                    # Replace with your model server, e.g. SKLEARN_SERVER
                    "implementation": "MODEL_B_IMPLEMENTATION",
                    "modelUri": "gs://path-to-model-b",  # Replace with your model storage path
                },
                "replicas": 1,
                "traffic": 50,
                "componentSpecs": [{
                    "spec": model_b
                }],
            },
        ]
    }
)

# Pulumi now tracks the SeldonDeployment resource through subsequent updates.
# The exported value is the Kubernetes object name, including Pulumi's auto-generated suffix.
pulumi.export("seldon_deployment_name", seldon_deployment.metadata["name"])
```

In this code, the `SeldonDeployment` resource defines how the machine learning models are deployed. It contains two separate predictors, each serving a different model. Seldon Core splits incoming requests between the predictors of one deployment according to each predictor's `traffic` weight, which is how A/B tests and canary rollouts are expressed.
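To keep the predictor definitions consistent, the spec can be assembled with small helpers. The following is a sketch in plain Python: `build_predictor` and `build_spec` are hypothetical names, `SKLEARN_SERVER` stands in for whichever model server you use, and the check that weights sum to 100 reflects my understanding of Seldon Core v1's validation rather than something stated in the original answer.

```python
def build_predictor(name, implementation, model_uri, traffic, replicas=1):
    """Return one predictor entry for a SeldonDeployment spec."""
    return {
        "name": f"{name}-predictor",
        "graph": {
            "name": name,
            "implementation": implementation,
            "modelUri": model_uri,
        },
        "replicas": replicas,
        "traffic": traffic,
    }


def build_spec(deployment_name, predictors):
    """Combine predictors into a spec, checking that traffic weights add up."""
    total = sum(p["traffic"] for p in predictors)
    if len(predictors) > 1 and total != 100:
        raise ValueError(f"traffic weights sum to {total}, expected 100")
    return {"name": deployment_name, "predictors": predictors}


# Example: send 75% of requests to model A and 25% to model B.
spec = build_spec(
    "multi-model-example",
    [
        build_predictor("model-a", "SKLEARN_SERVER", "gs://path-to-model-a", traffic=75),
        build_predictor("model-b", "SKLEARN_SERVER", "gs://path-to-model-b", traffic=25),
    ],
)
```

The resulting `spec` dictionary can be passed as the `spec` argument of the resource in the program above.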

This is a basic example and does not include creating the Kubernetes cluster or further details of your machine learning models, such as their specific container images or computational resource needs.

Moreover, to reach the models from outside the cluster you need to expose them, typically through an ingress gateway that Seldon Core integrates with (such as Istio or Ambassador) or through a LoadBalancer Service.
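Once an ingress is in place, requests are usually routed by namespace and deployment name. The sketch below builds the prediction URL and request body for Seldon Core's v1 REST protocol. The ingress address (`203.0.113.10`), the `default` namespace, and the helper names are assumptions for illustration. The deployment name must be the Kubernetes object name, which is the value exported as `seldon_deployment_name`.

```python
import json


def prediction_url(ingress_host, namespace, deployment_name):
    """Build the REST prediction URL exposed through an Istio or Ambassador ingress."""
    return (
        f"http://{ingress_host}/seldon/{namespace}/{deployment_name}"
        "/api/v1.0/predictions"
    )


def prediction_payload(rows):
    """Encode feature rows in Seldon's `ndarray` request format."""
    return json.dumps({"data": {"ndarray": rows}})


# Hypothetical ingress address and a single feature row.
url = prediction_url("203.0.113.10", "default", "multi-model-example")
body = prediction_payload([[1.0, 2.0, 3.0]])
```

You would then POST `body` to `url` with a `Content-Type: application/json` header, using whichever HTTP client you prefer.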

Please note that this code is meant to be run where Pulumi is already configured to communicate with your Kubernetes cluster (the same cluster your `kubectl` context points to) and where the Seldon Core operator has been installed.

This Pulumi code establishes the basics of deploying a multi-model pattern with Seldon Core, but remember to replace placeholders with your actual model details and storage paths. Be sure to check Seldon Core's documentation for specifics on configuring your models, including how to set up the predictive services and manage model storage.