1. Serverless Model Serving with Knative and Kubernetes Operators


    Serverless model serving with Knative and Kubernetes Operators requires setting up a Kubernetes cluster and then deploying Knative Serving on top of it. Knative Serving lets you run serverless containers on Kubernetes with ease: it scales workloads up and down automatically, including down to zero, so idle models consume no compute. Kubernetes Operators are application-specific controllers that extend the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user.
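    To make the scale-to-zero behavior concrete, here is a minimal sketch of the kind of object Knative Serving manages: a Knative Service manifest expressed as a Python dict. The `apiVersion` and the `autoscaling.knative.dev` annotations are real Knative names, but the image reference and scale bounds are illustrative placeholders, not part of the program later in this answer.

```python
def knative_service_manifest(name, image, min_scale=0, max_scale=5):
    """Build a minimal Knative Service manifest as a plain dict.

    `image` is a placeholder; point it at your model server's container.
    """
    return {
        'apiVersion': 'serving.knative.dev/v1',
        'kind': 'Service',
        'metadata': {'name': name},
        'spec': {
            'template': {
                'metadata': {
                    'annotations': {
                        # min-scale of 0 enables scale-to-zero when idle.
                        'autoscaling.knative.dev/min-scale': str(min_scale),
                        'autoscaling.knative.dev/max-scale': str(max_scale),
                    }
                },
                'spec': {'containers': [{'image': image}]},
            }
        },
    }

# Hypothetical model-server image, for illustration only.
manifest = knative_service_manifest('model-server',
                                    'example.com/model-server:latest')
```

    A dict in this shape could be handed to `pulumi_kubernetes` (for example via `apiextensions.CustomResource`) once Knative Serving is installed.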

    Here's a breakdown of the steps we're going to take within our Pulumi program:

    1. Set up a Kubernetes cluster: You can use various cloud providers such as AWS, Azure, GCP, or DigitalOcean to create a Kubernetes cluster. For this example, I'm going to use the pulumi_eks module to deploy an EKS (Elastic Kubernetes Service) cluster on AWS, since it's a high-level component that simplifies cluster creation.

    2. Install Knative Serving: After setting up the cluster, we'll apply the Knative Serving manifests to it, which installs all the components Knative needs to run.

    3. Deploy a Kubernetes Operator: This step depends on the operator you want to use for managing your machine learning model. If there's a specific operator for your use case, you would deploy it as you would deploy any other workload on Kubernetes, using the pulumi_kubernetes provider to apply the operator's YAML manifest.
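    To make step 3 concrete: once an operator is installed, you manage models by creating instances of the custom resource that operator registers with the Kubernetes API. The resource below is purely hypothetical; the API group, kind, and spec fields are invented for illustration, standing in for whatever schema your chosen operator (for example, KServe) actually defines.

```python
# Hypothetical custom resource for a model-serving operator.
# Every name below ('example.com/v1alpha1', 'ModelServer', 'modelUri')
# is invented for illustration; consult your operator's documentation
# for its real schema.
model_resource = {
    'apiVersion': 'example.com/v1alpha1',
    'kind': 'ModelServer',
    'metadata': {'name': 'sklearn-iris'},
    'spec': {
        'modelUri': 's3://my-bucket/models/iris',  # hypothetical field
        'replicas': 1,
    },
}
```

    With `pulumi_kubernetes`, such a resource would typically be created via `apiextensions.CustomResource` after the operator itself is deployed.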

    Below is the Pulumi Python program that demonstrates the creation of an EKS cluster and the deployment of Knative Serving. Note that deploying a specific Kubernetes Operator for model serving depends on the operator you choose, and may not be covered in this example.

```python
import pulumi
import pulumi_eks as eks
import pulumi_kubernetes as k8s

# Create an EKS cluster.
cluster = eks.Cluster('eks-cluster',
    # Kubernetes version pinned in this example; EKS versions age out,
    # so pick a currently supported one when you deploy.
    version='1.21',
    # Instance size for the EKS worker nodes.
    instance_type='t3.medium',
    # Desired number of worker nodes.
    desired_capacity=3,
    # Minimum number of worker nodes.
    min_size=1,
    # Maximum number of worker nodes.
    max_size=5,
)

# Once the EKS cluster is up and running, install Knative Serving.
# The standalone Knative Serving distribution used here relies on Kourier
# as a lightweight ingress, so no Istio installation is required.
# The equivalent kubectl commands would be:
#   kubectl apply -f https://github.com/knative/serving/releases/download/v0.25.0/serving-crds.yaml
#   kubectl apply -f https://github.com/knative/serving/releases/download/v0.25.0/serving-core.yaml
#   kubectl apply -f https://github.com/knative/net-kourier/releases/download/v0.25.0/kourier.yaml

# Deploy the Knative Serving CRDs.
knative_crds = k8s.yaml.ConfigGroup('knative-crds',
    files=['https://github.com/knative/serving/releases/download/v0.25.0/serving-crds.yaml'],
    opts=pulumi.ResourceOptions(provider=cluster.provider),
)

# Deploy the Knative Serving core components once the CRDs exist.
knative_core = k8s.yaml.ConfigGroup('knative-core',
    files=['https://github.com/knative/serving/releases/download/v0.25.0/serving-core.yaml'],
    opts=pulumi.ResourceOptions(
        provider=cluster.provider,
        depends_on=[knative_crds],
    ),
)

# Deploy Kourier as the networking layer for Knative Serving.
kourier = k8s.yaml.ConfigGroup('kourier',
    files=['https://github.com/knative/net-kourier/releases/download/v0.25.0/kourier.yaml'],
    opts=pulumi.ResourceOptions(
        provider=cluster.provider,
        depends_on=[knative_core],
    ),
)

# Deploy a Kubernetes Operator for model serving.
# This assumes an operator is available that can be deployed from a custom
# YAML file. Replace 'path/to/model-serving-operator.yaml' with the path to
# the operator's YAML manifest.
model_serving_operator = k8s.yaml.ConfigFile('model-serving-operator',
    file='path/to/model-serving-operator.yaml',
    opts=pulumi.ResourceOptions(provider=cluster.provider),
)

# Export the cluster's kubeconfig.
pulumi.export('kubeconfig', cluster.kubeconfig)
```

    Let me explain the components of the above program:

    • eks.Cluster: This creates a managed Kubernetes cluster on AWS EKS. Adjust the parameters such as instance_type, desired_capacity, min_size, and max_size according to your needs.

    • k8s.yaml.ConfigGroup: This component deploys a set of Kubernetes resources defined in one or more YAML files. In our case, we're deploying three different sets of Knative configurations for CRDs, core components, and Kourier as a lightweight ingress.

    • k8s.yaml.ConfigFile: Similar to ConfigGroup, but for deploying a single Kubernetes YAML manifest. We're using it to deploy the Kubernetes Operator for model serving. This is a placeholder for your specific operator's YAML manifest.
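    The three manifest URLs passed to ConfigGroup differ only in repository and file name, so pinning them all to one Knative release can be factored out. The helper below is a small sketch, hardcoding the v0.25.0 release used in this example; the function name is my own.

```python
# Release pinned in this example; bump this to upgrade Knative everywhere.
KNATIVE_VERSION = 'v0.25.0'

def knative_manifest_url(repo, filename, version=KNATIVE_VERSION):
    """Build the GitHub release URL for a Knative YAML manifest."""
    return (f'https://github.com/knative/{repo}/releases/download/'
            f'{version}/{filename}')

# The same three manifests the program applies, derived from one version.
urls = [
    knative_manifest_url('serving', 'serving-crds.yaml'),
    knative_manifest_url('serving', 'serving-core.yaml'),
    knative_manifest_url('net-kourier', 'kourier.yaml'),
]
```

    Passing these generated URLs as the `files` arguments keeps the CRDs, core components, and Kourier on the same release, which matters because Knative's components are versioned together.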

    Remember to replace 'path/to/model-serving-operator.yaml' with the actual path to your model serving operator's YAML file. Furthermore, you will need to configure your AWS credentials and Pulumi stack before you can deploy this code.
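    For reference, configuring credentials and the stack typically looks like the following; the stack name and region are placeholders you should change to suit your setup.

```shell
# Configure AWS credentials interactively
# (or export AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY instead).
aws configure

# Create a Pulumi stack, set the AWS region, and deploy the program.
pulumi stack init dev
pulumi config set aws:region us-west-2
pulumi up
```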

    This program sets up the foundational infrastructure for serverless machine learning model serving on Kubernetes using Knative. The specific details of deploying and configuring the model serving operator would depend on the operator you choose, and you'd need to follow the operator's documentation for concrete steps.