1. Automated A/B Testing for ML Models with Seldon


    To automate A/B testing for machine learning (ML) models with Seldon, you'll typically deploy multiple versions of your ML model to a Kubernetes cluster, then configure a SeldonDeployment to split traffic between those model versions for A/B testing. For this purpose, we'll be using Pulumi with the Kubernetes provider.

    Here's the plan for setting up an automated A/B testing environment with Seldon, which the Pulumi Python program below implements:

    1. Set up a Kubernetes cluster.
    2. Install Seldon Core on that cluster.
    3. Define two different versions of a Seldon Deployment that correspond to the two ML models you want to test.
    4. Route traffic between the models for A/B testing.
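    The predictor entries that steps 3 and 4 describe can be sketched as plain Python dictionaries before handing them to Pulumi. The helpers below are a hypothetical convenience, not part of Seldon or Pulumi; they build one predictor entry per model and check up front that the traffic percentages sum to 100:

```python
# Hypothetical helpers for building the "predictors" list of a SeldonDeployment
# spec. They keep the spec DRY and validate the traffic split before deploying.

def make_predictor(name, model_uri, traffic, implementation="SOME_IMPLEMENTATION"):
    """Return one predictor entry routing `traffic` percent of requests."""
    return {
        "name": name,
        "graph": {
            "children": [],
            "implementation": implementation,  # Replace with your actual implementation.
            "modelUri": model_uri,
            "name": name,
        },
        "traffic": traffic,
    }

def make_predictors(*specs):
    """Build the predictors list and ensure the split sums to 100%."""
    predictors = [make_predictor(*spec) for spec in specs]
    total = sum(p["traffic"] for p in predictors)
    if total != 100:
        raise ValueError(f"traffic percentages must sum to 100, got {total}")
    return predictors

predictors = make_predictors(
    ("model-a", "gs://your-bucket/model-a", 50),
    ("model-b", "gs://your-bucket/model-b", 50),
)
```

    The resulting list can be dropped into the `spec.predictors` field of the SeldonDeployment shown later.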

    Please note that this assumes you have Pulumi and kubectl installed, and that your cloud account and Kubernetes cluster are properly configured. If you are running this locally, ensure kubectl can communicate with your Kubernetes cluster and that you have the necessary permissions to deploy resources to it.

    Below is the Pulumi program that accomplishes the above steps:

    import pulumi
    import pulumi_kubernetes as k8s

    # Step 1: A Kubernetes cluster is needed. The example below assumes that you have
    # already provisioned a Kubernetes cluster and that `kubectl` is configured to
    # connect to your cluster. This step is cloud-provider and infrastructure
    # specific, so it's been left out of the example.

    # Step 2: Install Seldon Core on the Kubernetes cluster. For production use,
    # consider specifying namespace, version, and other configuration details.
    seldon_operator = k8s.helm.v3.Chart(
        "seldon-core-operator",
        k8s.helm.v3.ChartOpts(
            chart="seldon-core-operator",
            version="1.11.0",
            fetch_opts=k8s.helm.v3.FetchOpts(
                repo="https://storage.googleapis.com/seldon-charts",
            ),
            namespace="seldon-system",  # Assuming 'seldon-system' namespace is used for Seldon Core.
            values={
                "usageMetrics": {
                    "enabled": True,  # Enables usage metrics if you wish to collect them.
                },
            },
        ),
    )

    # Step 3: Define the SeldonDeployment resource for A/B testing.
    a_b_test_deployment = k8s.apiextensions.CustomResource(
        "seldon-ab-test-deployment",
        api_version="machinelearning.seldon.io/v1",
        kind="SeldonDeployment",
        metadata={
            "name": "seldon-ab-test",
            "namespace": "test-namespace",  # Replace with the namespace you are deploying to.
        },
        spec={
            "predictors": [
                {
                    "name": "model-a",
                    "graph": {
                        "children": [],
                        "implementation": "SOME_IMPLEMENTATION",  # Replace with your actual implementation.
                        "modelUri": "gs://your-bucket/model-a",  # Replace with the URI of your model A.
                        "name": "model-a",
                    },
                    "componentSpecs": [{
                        "spec": {
                            "containers": [
                                {
                                    "name": "model-a",
                                    "resources": {
                                        "requests": {
                                            "memory": "1Gi",
                                            "cpu": "0.5",
                                        },
                                    },
                                },
                            ],
                        },
                    }],
                    "traffic": 50,  # 50% of the traffic goes to model A.
                },
                {
                    "name": "model-b",
                    "graph": {
                        "children": [],
                        "implementation": "SOME_IMPLEMENTATION",  # Replace with your actual implementation.
                        "modelUri": "gs://your-bucket/model-b",  # Replace with the URI of your model B.
                        "name": "model-b",
                    },
                    "componentSpecs": [{
                        "spec": {
                            "containers": [
                                {
                                    "name": "model-b",
                                    "resources": {
                                        "requests": {
                                            "memory": "1Gi",
                                            "cpu": "0.5",
                                        },
                                    },
                                },
                            ],
                        },
                    }],
                    "traffic": 50,  # 50% of the traffic goes to model B.
                },
            ],
        },
        opts=pulumi.ResourceOptions(depends_on=[seldon_operator]),  # Ensure Seldon Core is installed first.
    )

    # Export the name of the SeldonDeployment.
    pulumi.export("seldon_ab_test_deployment", a_b_test_deployment.metadata["name"])

    Now, let me walk you through what this Pulumi program does:

    • Install Seldon Core Operator: First, we install the Seldon Core Operator on our Kubernetes cluster using a Helm chart. This operator is responsible for managing the lifecycle of our machine learning deployments.
    • SeldonDeployment CustomResource: We define a SeldonDeployment custom resource for our A/B testing. This resource specifies the two model versions we are testing, along with the traffic split we want for each version. The traffic field specifies what percentage of the requests should be routed to each model. In this example, traffic is split 50/50 between model-a and model-b.
    • Export: Finally, we export the name of the SeldonDeployment so that you can easily retrieve it later.
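    To verify the split in practice, you can send a batch of prediction requests to the deployment's endpoint (for Seldon Core v1, typically http://<ingress>/seldon/<namespace>/seldon-ab-test/api/v1.0/predictions) and count which predictor handled each response; Seldon Core records the serving graph node under meta.requestPath in each response. The tally helper below is a hypothetical sketch, demonstrated on synthetic responses rather than live calls:

```python
from collections import Counter

def tally_predictors(responses):
    """Count how many responses each predictor served, based on the
    meta.requestPath field Seldon Core attaches to each response."""
    counts = Counter()
    for resp in responses:
        # requestPath maps each graph node that handled the request to its image.
        for node in resp.get("meta", {}).get("requestPath", {}):
            counts[node] += 1
    return counts

# Synthetic responses standing in for real calls made with e.g.:
#   requests.post(endpoint, json={"data": {"ndarray": [[1.0, 2.0]]}}).json()
responses = (
    [{"meta": {"requestPath": {"model-a": "img:1"}}}] * 48
    + [{"meta": {"requestPath": {"model-b": "img:1"}}}] * 52
)
print(tally_predictors(responses))  # Close to 50/50 over enough requests.
```

    Over a large enough sample, the observed counts should converge on the configured 50/50 split.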

    Please replace the placeholders (like "SOME_IMPLEMENTATION" and "gs://your-bucket/model-a") with the specifics of your machine learning models and the environment they are stored in. The resources section within each container allows you to specify the amount of memory and CPU allocated to that model.

    Once you run this Pulumi program, it will deploy the Seldon Core Operator and create a SeldonDeployment configured for A/B testing in your Kubernetes cluster. You can then measure and compare the performance of your two models based on the traffic they receive.
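    The 50/50 split is just a starting point. If one model wins the test, a common follow-up is to shift traffic toward it in increments by updating the traffic values and re-running pulumi up. A hypothetical helper for generating such a ramp schedule (this is not a Seldon or Pulumi feature, just a sketch):

```python
def ramp_schedule(start=50, end=100, step=10):
    """Yield (traffic_a, traffic_b) pairs that shift traffic toward model B
    in fixed increments; every pair sums to 100."""
    b = start
    while b <= end:
        yield (100 - b, b)
        b += step

# Each pair would become the two `traffic` values for one `pulumi up` run.
schedule = list(ramp_schedule())
print(schedule)  # [(50, 50), (40, 60), (30, 70), (20, 80), (10, 90), (0, 100)]
```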