Blue-Green Deployments for ML Models using Kubernetes Ingress

Pulumi · Accepted Answer

Blue-green deployment reduces downtime and risk by running two identical production environments, called Blue and Green. At any time only one of them is live and serves all production traffic. In Kubernetes, you can implement this with one Service per environment and an Ingress that routes traffic to whichever Service is live.

We will write a Pulumi program that accomplishes the following:
1. Set up two identical services in Kubernetes to represent the Blue and Green environments.
2. Create a Kubernetes Ingress resource to manage traffic routing with a simple rule that directs traffic to the Blue service.
3. Illustrate how to switch traffic from the Blue service to the Green service by updating the Ingress rule.

I'll walk you through this pattern for ML models. The Ingress controller routes external traffic to whichever service the Ingress rule names, so switching environments means changing that one backend reference.

First, set up your Kubernetes cluster and point your `kubectl` command-line tool at the cluster's configuration. I'm going to assume you're already familiar with setting up a Kubernetes cluster. The program also assumes an NGINX Ingress controller is installed and registered under the `nginx` Ingress class.

Here's a Pulumi Python program that defines a blue-green deployment:

```python
import pulumi
import pulumi_kubernetes as k8s

# Use an existing Kubernetes cluster context
kubeconfig = pulumi.Config("kubernetes").get("kubeconfig")

# Define provider to use our specified kubeconfig context
provider = k8s.Provider(resource_name="provider", kubeconfig=kubeconfig)

# Define the Blue version of our ML model service
blue_service = k8s.core.v1.Service("blue-service",
    spec=k8s.core.v1.ServiceSpecArgs(
        selector={
            "app": "ml-model",
            "version": "blue",
        },
        ports=[k8s.core.v1.ServicePortArgs(
            port=80,
            target_port=8080,
        )],
    ), opts=pulumi.ResourceOptions(provider=provider))

# Define the Green version of our ML model service
green_service = k8s.core.v1.Service("green-service",
    spec=k8s.core.v1.ServiceSpecArgs(
        selector={
            "app": "ml-model",
            "version": "green",
        },
        ports=[k8s.core.v1.ServicePortArgs(
            port=80,
            target_port=8080,
        )],
    ), opts=pulumi.ResourceOptions(provider=provider))

# Define the Ingress resource to route traffic initially to the Blue service
ingress = k8s.networking.v1.Ingress("ml-model-ingress",
    metadata=k8s.meta.v1.ObjectMetaArgs(
        # Controller-specific annotations (for example, traffic weights) go here;
        # a plain all-or-nothing switch does not need any.
        annotations={},
    ),
    spec=k8s.networking.v1.IngressSpecArgs(
        # Select the controller with ingressClassName, which supersedes the
        # deprecated "kubernetes.io/ingress.class" annotation.
        ingress_class_name="nginx",
        rules=[
            k8s.networking.v1.IngressRuleArgs(
                host="mlmodels.example.com",
                http=k8s.networking.v1.HTTPIngressRuleValueArgs(
                    paths=[k8s.networking.v1.HTTPIngressPathArgs(
                        path="/",
                        path_type="Prefix",
                        backend=k8s.networking.v1.IngressBackendArgs(
                            service=k8s.networking.v1.IngressServiceBackendArgs(
                                # Use green_service.metadata.name here to switch traffic
                                name=blue_service.metadata.name,
                                port=k8s.networking.v1.ServiceBackendPortArgs(number=80),
                            ),
                        ),
                    )],
                ),
            ),
        ],
    ), opts=pulumi.ResourceOptions(provider=provider))

# Export the Ingress endpoint (a hostname or an IP, depending on the load balancer)
pulumi.export("ingress_endpoint", ingress.status.load_balancer.ingress[0].apply(
    lambda lb: lb.hostname or lb.ip))
```

This program describes two Kubernetes services, one for the blue version and one for the green version of an ML model application. Each service selects pods by their `app` and `version` labels, so the Deployments that run your ML models must put matching labels on their pods. The program defines only the services; the Deployments themselves are not shown.
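To make the selector behaviour concrete, here is a small plain-Python model of equality-based label selection. It is an illustration only, not Kubernetes code, and the pod label sets (including the `pod-template-hash` values) are made up for the example.

```python
def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """Toy model of a Service's equality-based selector.

    A pod is selected only if it carries every key/value pair in the selector.
    An empty selector selects nothing here, mirroring the fact that a Service
    without a selector gets no automatically managed endpoints.
    """
    return bool(selector) and all(pod_labels.get(k) == v for k, v in selector.items())


blue_selector = {"app": "ml-model", "version": "blue"}

# Hypothetical pod label sets, as a Deployment's pod template might define them.
blue_pod = {"app": "ml-model", "version": "blue", "pod-template-hash": "5d8c7"}
green_pod = {"app": "ml-model", "version": "green", "pod-template-hash": "9f1b2"}

print(selector_matches(blue_selector, blue_pod))   # True
print(selector_matches(blue_selector, green_pod))  # False
```

Extra labels on a pod do not prevent a match, but a missing or different `version` label does, which is what keeps Blue and Green traffic separate.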

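The `Ingress` resource handles incoming traffic for the specified host (here, `mlmodels.example.com`). Initially it routes all traffic to the Blue service by referencing `blue_service.metadata.name` in `backend.service.name`. Because the services set no explicit `metadata.name`, Pulumi auto-names them (the logical name `blue-service` plus a random suffix), which is why the Ingress references the output rather than a hard-coded string. This represents the live production environment at the start.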

To switch traffic from the Blue to the Green environment, change the Ingress backend's `name` from `blue_service.metadata.name` to `green_service.metadata.name` and apply the change with `pulumi up`. Both services stay deployed, so rolling back is the same one-line change in reverse.
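Rather than editing the program for every switch, the live colour could be read from configuration. The sketch below shows only the selection logic in plain Python; `SERVICES`, `active_backend`, and the `activeColor` key are hypothetical names introduced for illustration, and the plain strings stand in for the `blue_service` and `green_service` resources.

```python
# Hypothetical mapping from colour to the Service the Ingress should target.
SERVICES = {"blue": "blue-service", "green": "green-service"}


def active_backend(color: str) -> str:
    """Return the Service for the requested colour, rejecting typos early."""
    key = color.strip().lower()
    if key not in SERVICES:
        raise ValueError(f"unknown colour {color!r}; expected one of {sorted(SERVICES)}")
    return SERVICES[key]


# In the Pulumi program the colour could come from stack configuration, e.g.
#   color = pulumi.Config().get("activeColor") or "blue"  # assumed config key
# and the Ingress backend would then use the chosen Service's metadata.name.
print(active_backend("blue"))
```

With that wiring in place, a switch would be `pulumi config set activeColor green` followed by `pulumi up`. Failing on an unknown colour avoids silently pointing the Ingress at a service that does not exist.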

A note on annotations: the annotations in the Ingress metadata are a key-value map that passes extra configuration to the Ingress controller. The valid keys depend on which controller you run (NGINX, HAProxy, Traefik, and so on). A plain blue-green switch needs none, but gradual traffic shifting usually does; ingress-nginx, for example, uses its `nginx.ingress.kubernetes.io/canary` and `nginx.ingress.kubernetes.io/canary-weight` annotations for weighted routing.
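As a sketch of what weight-based annotations might look like, the helper below builds the annotation map for an ingress-nginx canary split. It assumes the ingress-nginx controller; `canary_annotations` is a hypothetical helper name, and the annotations would go on a second Ingress for the same host whose backend is the Green service. Check your controller's documentation before relying on the exact keys.

```python
def canary_annotations(weight: int) -> dict[str, str]:
    """Build ingress-nginx canary annotations for a percentage-based split.

    `weight` is the share of requests (0-100) sent to the canary backend.
    Annotation values must be strings, so the weight is converted.
    """
    if not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    return {
        "nginx.ingress.kubernetes.io/canary": "true",
        "nginx.ingress.kubernetes.io/canary-weight": str(weight),
    }


# A possible rollout schedule: shift 10%, then 50%, then all traffic to Green.
for step in (10, 50, 100):
    print(step, canary_annotations(step))
```

The returned dictionary could be passed as `annotations` in the canary Ingress's `ObjectMetaArgs`, with the weight raised between `pulumi up` runs.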

For both the initial deployment and the later switch, run `pulumi up`. The program declares the desired state, and `pulumi up` updates the cluster so that its actual state matches.
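The desired-state idea can be illustrated with a toy comparison. This is not how Pulumi's engine is implemented; `changed_fields` and the flattened field names are hypothetical, and the point is only that a blue-to-green switch changes a single field.

```python
def changed_fields(actual: dict, desired: dict) -> dict:
    """Return {field: (actual_value, desired_value)} for every field that differs."""
    return {
        key: (actual.get(key), desired.get(key))
        for key in sorted(actual.keys() | desired.keys())
        if actual.get(key) != desired.get(key)
    }


# Flattened, simplified view of the Ingress rule before and after the switch.
actual = {
    "host": "mlmodels.example.com",
    "backend.service.name": "blue-service",
    "backend.service.port": 80,
}
desired = {**actual, "backend.service.name": "green-service"}

print(changed_fields(actual, desired))
# {'backend.service.name': ('blue-service', 'green-service')}
```

Running `pulumi preview` before `pulumi up` shows the planned changes for the real program without applying them.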

Please replace `mlmodels.example.com` with your actual domain and update the service selectors to match your application's pod labels. Also, make sure the DNS record for your domain points to your Ingress controller's external IP or hostname.