1. ML Model Serving with Kubernetes and KFServing


    To serve a machine learning model on Kubernetes using KFServing, you'll need a Kubernetes cluster with KFServing installed, as well as a containerized machine learning model that you can deploy. Pulumi's infrastructure as code approach can be used to set up the entire workflow, from provisioning a Kubernetes cluster to deploying a KFServing InferenceService.

    KFServing, part of Kubeflow, is a serverless framework for deploying machine learning models on Kubernetes. Its features include auto-scaling and canary rollouts.

    Below I'll provide a brief explanation and then a Pulumi program that sets up a Kubernetes cluster, installs KFServing, and deploys a simple InferenceService with a prebuilt model.

    First, we create a Kubernetes cluster using Pulumi's pulumi_eks module, which simplifies the setup of an Amazon EKS cluster. After the cluster is provisioned, we install KFServing using the pulumi_kubernetes provider by applying the necessary Kubernetes manifests.
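
    One detail in this step is easy to miss: the EKS component exposes its kubeconfig as a structured object, while the Kubernetes provider's kubeconfig argument expects a string. The following is a minimal sketch of that conversion, assuming the value is either already a string or a JSON-serializable mapping. The helper name kubeconfig_to_string and the sample kubeconfig are hypothetical and exist only for illustration.

```python
import json


def kubeconfig_to_string(kubeconfig):
    """Return a kubeconfig as a string.

    Accepts either an already-serialized string or a dict-like structure;
    strings are passed through unchanged, anything else is dumped as JSON.
    """
    if isinstance(kubeconfig, str):
        return kubeconfig
    return json.dumps(kubeconfig)


# Hypothetical, heavily trimmed kubeconfig used only to exercise the helper.
sample = {"apiVersion": "v1", "kind": "Config", "clusters": []}
serialized = kubeconfig_to_string(sample)
print(serialized)
```

    In a Pulumi program, a helper like this would typically be applied to the cluster output, for example eks_cluster.kubeconfig.apply(kubeconfig_to_string), before the result is handed to the provider.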

    Finally, we define an InferenceService resource that tells KFServing how to serve the model. For this example, we'll use KFServing's built-in sklearn predictor with a sample model loaded from cloud storage. In a production environment, you would point it at your own model, or supply your own predictor image.

    import json

    import pulumi
    import pulumi_eks as eks
    import pulumi_kubernetes as k8s

    # Create an EKS cluster
    eks_cluster = eks.Cluster('my-eks-cluster')

    # The Kubernetes provider expects the kubeconfig as a string, so serialize
    # the cluster's kubeconfig to JSON if it is not a string already
    kubeconfig = eks_cluster.kubeconfig.apply(
        lambda kc: kc if isinstance(kc, str) else json.dumps(kc))

    # Set up the Kubernetes provider using the kubeconfig from the EKS cluster
    k8s_provider = k8s.Provider('k8s-provider', kubeconfig=kubeconfig)

    # Install KFServing (Knative and Cert Manager are prerequisites).
    # The manifests are examples and should be replaced by the actual URLs of the
    # YAML files, usually obtained from the official KFServing GitHub repository.
    kfserving_namespace = k8s.core.v1.Namespace(
        'kfserving-namespace',
        metadata={'name': 'kfserving-system'},
        opts=pulumi.ResourceOptions(provider=k8s_provider))

    cert_manager_yaml = k8s.yaml.ConfigFile(
        'cert-manager',
        file='https://github.com/jetstack/cert-manager/releases/download/v1.0.4/cert-manager.yaml',
        opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[kfserving_namespace]))

    # Note: serving-crds.yaml contains only the Knative CRDs; a working Knative
    # install also needs serving-core.yaml and a networking layer.
    knative_serving_yaml = k8s.yaml.ConfigFile(
        'knative-serving',
        file='https://github.com/knative/serving/releases/download/v0.18.0/serving-crds.yaml',
        opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[cert_manager_yaml]))

    kfserving_yaml = k8s.yaml.ConfigFile(
        'kfserving',
        file='https://github.com/kubeflow/kfserving/releases/download/v0.5.0/kfserving.yaml',
        opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[knative_serving_yaml]))

    # Define an InferenceService from a local manifest that uses the sklearn predictor
    sklearn_inference_service = k8s.yaml.ConfigGroup(
        'sklearn-inferenceservice',
        files=['./sklearn-inferenceservice.yaml'],
        opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[kfserving_yaml]))

    # Export the cluster's kubeconfig
    pulumi.export('kubeconfig', kubeconfig)

    Replace './sklearn-inferenceservice.yaml' with the actual path to your InferenceService manifest file. The manifest should look something like this:

    apiVersion: "serving.kubeflow.org/v1alpha2"
    kind: "InferenceService"
    metadata:
      name: "sklearn-iris"
      namespace: "kfserving-system"
    spec:
      default:
        predictor:
          sklearn:
            storageUri: "gs://kfserving-samples/models/sklearn/iris"

    In this InferenceService manifest, storageUri points to a Google Cloud Storage bucket containing the trained machine learning model. KFServing will pull this model and serve it. You will need to change this URI to point to your model's storage location.
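
    If you deploy several models, it can help to generate this manifest rather than edit it by hand. The sketch below builds the same v1alpha2 structure as a plain Python dict with only storageUri, name, and namespace as inputs. The function name build_inference_service and the s3:// location are hypothetical, and the output is JSON, which YAML parsers generally accept.

```python
import json


def build_inference_service(name, namespace, storage_uri):
    """Build a v1alpha2 sklearn InferenceService manifest as a plain dict."""
    return {
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "default": {
                "predictor": {
                    "sklearn": {"storageUri": storage_uri},
                }
            }
        },
    }


manifest = build_inference_service(
    name="sklearn-iris",
    namespace="kfserving-system",
    storage_uri="s3://my-bucket/models/sklearn/iris",  # hypothetical location
)

# Serialize the manifest; the text can be written to the manifest file.
manifest_text = json.dumps(manifest, indent=2)
print(manifest_text)
```

    The resulting text could be saved as the manifest file that the program reads, or, as an untested assumption about your setup, passed to k8s.yaml.ConfigGroup through its yaml argument instead of files.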

    Before running this program, make sure your AWS credentials are configured for Pulumi, since the program provisions the EKS cluster itself. The trained model must be stored in a location that KFServing can read. A containerized model pushed to a container registry is only needed if you serve it through your own predictor image instead of the built-in sklearn predictor.

    After running this program with Pulumi, you should have an EKS cluster running with KFServing installed, ready to serve your machine learning model. You can use Kubernetes tools like kubectl to check the status of the InferenceService and find its URL, then send inference requests to that URL over HTTP to receive predictions.
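
    As an illustration of what an inference request looks like, here is a minimal sketch that builds (but does not send) a prediction request using the V1 prediction protocol, where the path is /v1/models/<name>:predict and the body carries an "instances" list. The ingress address, the host name, and the helper build_predict_request are hypothetical; substitute the values reported for your own InferenceService.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances, host_header=None):
    """Build a POST request for the V1 predict endpoint of a served model."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if host_header:
        # When calling through an ingress gateway, the service host name is
        # usually passed in the Host header.
        headers["Host"] = host_header
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_predict_request(
    base_url="http://203.0.113.10",  # hypothetical ingress address
    model_name="sklearn-iris",
    instances=[[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]],
    host_header="sklearn-iris.kfserving-system.example.com",  # hypothetical
)
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a live cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

    Keeping request construction separate from sending makes it easy to inspect the URL and payload before pointing the code at a real endpoint.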