Rolling Updates for Machine Learning Models in Kubernetes

Pulumi · Accepted Answer

Rolling updates let you deploy new versions of a machine learning model in Kubernetes with little or no downtime. A rolling update incrementally replaces pods running the previous version of your deployment with pods running the new version, without taking down your service. By limiting how many old pods are taken down at once and how many new pods are started at once, Kubernetes keeps enough pods serving traffic throughout the update.

To perform a rolling update of a machine learning model in Kubernetes using Pulumi, you need a model packaged into a container image, a Kubernetes Deployment resource that specifies the desired state of your application, and a strategy for updating the pods.

Here is how you would do a rolling update using Pulumi:

1. **Define the New Model Version**: Package your new machine learning model into a container image and push it to a container registry.
2. **Update the Deployment**: Modify your Kubernetes Deployment resource to use the new container image.
3. **Rolling Update Strategy**: Specify the `strategy.type` as `RollingUpdate` and define `rollingUpdate` parameters such as `maxUnavailable` and `maxSurge` in your Deployment resource.
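
To make the effect of `maxUnavailable` and `maxSurge` concrete, here is a minimal sketch in plain Python; it does not talk to a cluster. The helper names `resolve` and `rollout_bounds` are hypothetical and exist only for this illustration. The rounding rule it uses (percentages round up for `maxSurge` and down for `maxUnavailable`) is the one described in the Kubernetes Deployment documentation.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if round_up:
            # Integer ceiling division avoids floating-point rounding issues
            return -(-replicas * percent // 100)
        return replicas * percent // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return the pod-count limits that hold while a rolling update runs."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # With no room to add or remove pods, the rollout could not progress
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return {
        "min_available": replicas - unavailable,  # pods that must stay available
        "max_total": replicas + surge,            # old and new pods combined
    }


# Two replicas with maxSurge=1 and maxUnavailable=1
print(rollout_bounds(2, 1, 1))
# Ten replicas with both parameters set to 25%
print(rollout_bounds(10, "25%", "25%"))
```

With two replicas and both parameters set to 1, at least one pod stays available and at most three pods exist at any moment. Setting `maxUnavailable` to 0 keeps full serving capacity during the update, at the cost of needing room in the cluster for the extra surge pod.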

The following program defines a single Kubernetes Deployment for the model and sets a rolling update strategy. Pulumi requires every resource in a program to have a unique name, so the update is not written as a second `Deployment` resource with the same name. Instead, you change the image on the existing resource and run `pulumi up` again:

```python
import pulumi
import pulumi_kubernetes as k8s

# Configuration for the deployment
app_labels = {"app": "ml-model"}
model_image_version_1 = "myregistry/model:v1"
model_image_version_2 = "myregistry/model:v2"
deployment_name = "ml-model-deployment"

# Image to deploy. Start with version 1. To roll out the new model, switch this
# to model_image_version_2 and run `pulumi up` again.
model_image = model_image_version_1

# Create a Kubernetes Deployment for the model with a rolling update strategy
ml_model_deployment = k8s.apps.v1.Deployment(deployment_name,
    metadata={
        "name": deployment_name,
    },
    spec={
        "selector": {
            "matchLabels": app_labels
        },
        "replicas": 2,
        "template": {
            "metadata": {
                "labels": app_labels
            },
            "spec": {
                "containers": [{
                    "name": "ml-model",
                    "image": model_image,  # Changing this triggers the rolling update
                }]
            }
        },
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {
                "maxUnavailable": 1,  # Maximum number of pods that can be unavailable during the update
                "maxSurge": 1,  # Maximum number of pods that can be created over the desired number of pods
            },
        },
    })

# Export the Deployment name
pulumi.export("deployment_name", ml_model_deployment.metadata["name"])
```

Here's a step-by-step explanation of the code above:

- We import the necessary Pulumi packages for Kubernetes.
- We define configuration variables such as app labels, the container images for both model versions, and the deployment name.
- We select the image to deploy in `model_image`, starting with the first version.
- We create a Kubernetes `Deployment` that runs two replicas of that image.
- We define our `RollingUpdate` strategy, specifying the `maxUnavailable` and `maxSurge` parameters.
- To trigger a rolling update, we set `model_image` to the second version of the container image and run `pulumi up` again.

An important aspect of using Pulumi is that it manages the state of your infrastructure. When you change the container image in your Pulumi program and run `pulumi up`, Pulumi compares the program with the recorded state, shows the planned change, and updates the existing Deployment in place. Kubernetes then replaces the pods according to the rolling update strategy specified on the Deployment.
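
Not every change to a Deployment replaces pods. Kubernetes starts a rollout only when the pod template (`spec.template`) changes; changing `replicas`, for example, scales the Deployment without a rolling update. The sketch below illustrates that rule on plain dictionaries. The function `triggers_rollout` is a hypothetical helper written for this example, not part of Pulumi or Kubernetes, and it does not reproduce Pulumi's own diff logic.

```python
import copy


def triggers_rollout(old_spec, new_spec):
    """Return True if the pod template differs between two Deployment specs."""
    return old_spec.get("template") != new_spec.get("template")


# A trimmed-down Deployment spec, using the image names from this answer
base_spec = {
    "replicas": 2,
    "template": {
        "metadata": {"labels": {"app": "ml-model"}},
        "spec": {
            "containers": [{"name": "ml-model", "image": "myregistry/model:v1"}]
        },
    },
}

# Changing the image modifies the pod template, so pods are replaced
new_image_spec = copy.deepcopy(base_spec)
new_image_spec["template"]["spec"]["containers"][0]["image"] = "myregistry/model:v2"

# Changing only the replica count leaves the pod template untouched
scaled_spec = copy.deepcopy(base_spec)
scaled_spec["replicas"] = 4

print(triggers_rollout(base_spec, new_image_spec))
print(triggers_rollout(base_spec, scaled_spec))
```

One practical consequence: if you push a new model under the same image tag, the pod template does not change and no rolling update starts, which is a good reason to give each model version its own tag.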

For this code to work, you need a running Kubernetes cluster and a kubeconfig that Pulumi can use to reach it (by default, the same one `kubectl` uses). Ensure you have pushed both model images (`model_image_version_1` and `model_image_version_2`) to your container registry before deploying them.

Remember, in practice, model updates might also require changes to other Kubernetes resources, such as services or ingress rules, depending on your architecture. Pulumi can manage the entire lifecycle of all these resources in a coordinated way.
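
As one illustration of how a Service interacts with a rolling update: a Service picks its pods by label, not by image. As long as the pod labels stay the same, ready pods of both versions receive traffic during the rollout and the Service itself needs no change. The sketch below models equality-based label matching in plain Python. The function `selects` and the `pod-template-hash` values are hypothetical and only for illustration.

```python
def selects(selector, pod_labels):
    """Return True if every key/value pair in the selector appears in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


service_selector = {"app": "ml-model"}

# Pods from the old and new ReplicaSets share the app label. Kubernetes adds a
# pod-template-hash label to tell them apart (the hash values here are made up).
old_pod_labels = {"app": "ml-model", "pod-template-hash": "aaaa1111"}
new_pod_labels = {"app": "ml-model", "pod-template-hash": "bbbb2222"}
unrelated_pod_labels = {"app": "feature-store"}

print(selects(service_selector, old_pod_labels))
print(selects(service_selector, new_pod_labels))
print(selects(service_selector, unrelated_pod_labels))
```

If a model update also changes the pod labels or the container port, the Service would need a matching change in the same Pulumi program.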