Knative Managed Model Rollouts with Canary Releases
To implement canary releases in a Kubernetes cluster using Knative, you can leverage the traffic management capabilities that Knative Serving provides. These let you direct a percentage of traffic to different revisions of your service, which is the core of a canary release strategy.
In this scenario, you have a Knative service, and you update it to roll out to a "canary" set of users before rolling out to everyone. Essentially, you will have two revisions of your service: the current stable one and a new one under test. You'll initially send a small percentage of traffic to the new revision and monitor its performance. If everything goes well, you increase that percentage gradually until the new revision is handling all traffic.
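The gradual shift described above can be sketched as a small helper that runs without a cluster. The step sizes and the `healthy` flag are assumptions for illustration, not Knative defaults; in practice the health signal would come from your own monitoring.

```python
from typing import Sequence

# Hypothetical promotion schedule: share of traffic sent to the canary at each stage.
DEFAULT_STEPS: Sequence[int] = (10, 25, 50, 100)


def next_canary_percent(current: int, healthy: bool,
                        steps: Sequence[int] = DEFAULT_STEPS) -> int:
    """Return the canary traffic share to apply next.

    An unhealthy canary is rolled back to 0%. A healthy one advances to the
    next larger step; once no larger step remains, it is fully promoted.
    """
    if not healthy:
        return 0
    for step in steps:
        if step > current:
            return step
    return 100


if __name__ == "__main__":
    percent = 0
    while percent < 100:
        percent = next_canary_percent(percent, healthy=True)
        print(f"canary: {percent}%, stable: {100 - percent}%")
```

Each value returned here would become the canary's share in the service's traffic settings, with the remainder going to the stable revision.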
Pulumi does not have a native resource for Knative, as Knative is implemented as a set of custom Kubernetes resources. Instead, we use pulumi_kubernetes to deploy the necessary resources.
Below is a Pulumi Python program that demonstrates how a canary release could be orchestrated using the Knative Service custom resource. Please note that this assumes you have Knative installed on your Kubernetes cluster.
First, we'll define our stable and canary revisions, then we’ll set the traffic percentage that each revision receives.
import pulumi
from pulumi_kubernetes import Provider
from pulumi_kubernetes.apiextensions import CustomResource
from pulumi_kubernetes.core.v1 import ServiceAccount
from pulumi_kubernetes.rbac.v1 import Role, RoleBinding

# You need to set up your Kubernetes provider configuration and context where Knative is installed.
k8s_provider = Provider("k8s")

# Assumption: everything is deployed into the "default" namespace.
NAMESPACE = "default"

# This service account, role, and role binding setup is simplified and intended for demonstration.
# Your actual setup might vary with different permissions and more complex role configurations.
service_account = ServiceAccount(
    "knative-service-account",
    metadata={"name": "knative-deployer", "namespace": NAMESPACE},
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

role = Role(
    "knative-role",
    metadata={"name": "knative-deployer-role", "namespace": NAMESPACE},
    rules=[{
        # Knative Services belong to the serving.knative.dev API group, not the core ("") group.
        "api_groups": ["serving.knative.dev"],
        "resources": ["services"],
        "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
    }],
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

role_binding = RoleBinding(
    "knative-role-binding",
    metadata={"name": "knative-deployer-role-binding", "namespace": NAMESPACE},
    role_ref={"api_group": "rbac.authorization.k8s.io", "kind": "Role", "name": role.metadata["name"]},
    # ServiceAccount subjects must name their namespace.
    subjects=[{"kind": "ServiceAccount", "name": service_account.metadata["name"], "namespace": NAMESPACE}],
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

# Stable configuration: the template produces revision v1, which receives all traffic.
stable_spec = {
    "template": {
        "metadata": {
            "name": "my-model-service-v1",
            "annotations": {"autoscaling.knative.dev/minScale": "1"},
        },
        "spec": {
            "containers": [{
                "image": "docker.io/my-model:stable",
                # Define resource requests and limits for your container.
            }]
        },
    },
    "traffic": [
        {"tag": "current", "revisionName": "my-model-service-v1", "percent": 100},
        # Every traffic entry needs a target; with 0% this only creates a tagged URL.
        {"tag": "candidate", "latestRevision": True, "percent": 0},
    ],
}

# Canary configuration: the template produces revision v2 alongside the existing v1.
canary_spec = {
    "template": {
        "metadata": {
            "name": "my-model-service-v2",
            "annotations": {"autoscaling.knative.dev/minScale": "1"},
        },
        "spec": {
            "containers": [{
                "image": "docker.io/my-model:canary",
                # Define resource requests and limits for your container.
            }]
        },
    },
    # Start with 10% of traffic to canary, 90% to current. Adjust as necessary.
    "traffic": [
        {"tag": "current", "revisionName": "my-model-service-v1", "percent": 90},
        {"tag": "candidate", "revisionName": "my-model-service-v2", "percent": 10},
    ],
}

# Both configurations describe the same Knative Service, so they are applied to a single
# resource in two steps rather than declared as two resources with the same name.
# "canary" is a hypothetical config key: run `pulumi up` with the stable spec first so that
# revision v1 exists, then `pulumi config set canary true` and `pulumi up` again.
config = pulumi.Config()
deploy_canary = config.get_bool("canary") or False

knative_service = CustomResource(
    "model-service",
    api_version="serving.knative.dev/v1",
    kind="Service",
    metadata={"name": "my-model-service", "namespace": NAMESPACE},
    spec=canary_spec if deploy_canary else stable_spec,
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)

# Export the service URL so you can access it. Whether `status` is exposed on a
# CustomResource may depend on the provider version, so fall back to an empty string.
status = getattr(knative_service, "status", None)
ingress_url = status.apply(lambda s: (s or {}).get("url", "")) if status is not None else ""
pulumi.export("ingress_url", ingress_url)
This program sets up a service account, role, and role binding for deploying Knative services, then declares the Knative service for the stable release and for the canary release. Note how the traffic field controls how much traffic each revision receives. We start with 100% of traffic routed to the stable version and 0% to the canary; the canary configuration then sends 10% to the new revision and 90% to the stable one. Over time, as you grow confident in the canary's performance, you can shift the percentages.
Please ensure that Pulumi is set up and configured to interact with your Kubernetes cluster and that you have the correct permissions to deploy resources. Also, adjust the traffic percentages based on your own canary testing policies.
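As a sketch of how the traffic list could be generated for a given canary share, the hypothetical helper below builds the two-entry list used in the program above. The function name `build_traffic` is an assumption for illustration; the tags and revision names mirror those in the program.

```python
from typing import Dict, List, Union

TrafficTarget = Dict[str, Union[str, int]]


def build_traffic(stable_revision: str, canary_revision: str,
                  canary_percent: int) -> List[TrafficTarget]:
    """Build a traffic list that splits requests between two named revisions.

    The two percentages always add up to 100, which a Knative traffic block requires.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return [
        {"tag": "current", "revisionName": stable_revision, "percent": 100 - canary_percent},
        {"tag": "candidate", "revisionName": canary_revision, "percent": canary_percent},
    ]


if __name__ == "__main__":
    for target in build_traffic("my-model-service-v1", "my-model-service-v2", 10):
        print(target)
```

The returned list could be assigned to the traffic key of the service spec, so that changing a single number shifts the split without editing both entries by hand.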