1. Scalable Model Serving with Custom CRDs


    In Kubernetes, Custom Resource Definitions (CRDs) allow you to define custom resources that behave like built-in resources. They are commonly used to extend Kubernetes with new kinds of API objects that are specific to a project or to an organization. For scalable model serving, you might create a CRD to represent a machine learning model and then deploy instances of that custom resource to serve predictions.

    In a Pulumi Python program, both pieces come from Pulumi's Kubernetes provider. The CustomResourceDefinition class (k8s.apiextensions.v1.CustomResourceDefinition) defines the shape and schema of the new resource type, and the CustomResource class (k8s.apiextensions.CustomResource) creates and manages instances of that type.

    Let's walk through a simple program to deploy a CRD for a model serving resource, and subsequently create an instance of this custom resource:

    1. Define a CRD with the necessary schema that Kubernetes will recognize.
    2. Create a custom resource from this CRD which will represent our model serving.
    3. Deploy and potentially scale this model serving by manipulating this custom resource using Pulumi.
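
    The three steps can be pictured with plain data before any Pulumi code is involved. The following is a minimal sketch, assuming the ModelServing kind, the ai.example.com/v1 API version, and the image and replicas fields used later in this walkthrough. The helper names model_serving_manifest and scaled are hypothetical and are not part of Pulumi or Kubernetes.

```python
import copy


def model_serving_manifest(name: str, image: str, replicas: int) -> dict:
    """Hypothetical helper: the manifest for one ModelServing object (step 2)."""
    return {
        "apiVersion": "ai.example.com/v1",
        "kind": "ModelServing",
        "metadata": {"name": name},
        "spec": {"image": image, "replicas": replicas},
    }


def scaled(manifest: dict, replicas: int) -> dict:
    """Hypothetical helper: 'scaling' (step 3) is only a change to spec.replicas."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    updated = copy.deepcopy(manifest)  # leave the original manifest untouched
    updated["spec"]["replicas"] = replicas
    return updated


manifest = model_serving_manifest(
    "example-model-serving", "example-model-image:v1", 3
)
bigger = scaled(manifest, 5)
print(bigger["spec"]["replicas"])
```

    In a Pulumi program the equivalent change is made by editing the replicas value passed to the custom resource and running pulumi up again.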

    Before we dive into the code, ensure you have the following prerequisites met:

    • Pulumi CLI installed and configured with the appropriate Kubernetes context.
    • Access to a Kubernetes cluster where you have permission to create CRDs and custom resources.

    Here's a Pulumi program in Python that demonstrates these steps:

    import pulumi
    import pulumi_kubernetes as k8s

    # Define the CustomResourceDefinition for model serving.
    # The CRD name must have the form "<plural>.<group>".
    model_serving_crd = k8s.apiextensions.v1.CustomResourceDefinition(
        "model-serving-crd",
        metadata=k8s.meta.v1.ObjectMetaArgs(name="modelservings.ai.example.com"),
        spec=k8s.apiextensions.v1.CustomResourceDefinitionSpecArgs(
            group="ai.example.com",
            versions=[k8s.apiextensions.v1.CustomResourceDefinitionVersionArgs(
                name="v1",
                served=True,
                storage=True,
                schema=k8s.apiextensions.v1.CustomResourceValidationArgs(
                    # The Python SDK spells openAPIV3Schema in snake_case.
                    open_apiv3_schema=k8s.apiextensions.v1.JSONSchemaPropsArgs(
                        type="object",
                        properties={
                            "spec": k8s.apiextensions.v1.JSONSchemaPropsArgs(
                                type="object",
                                properties={
                                    "image": k8s.apiextensions.v1.JSONSchemaPropsArgs(type="string"),
                                    "replicas": k8s.apiextensions.v1.JSONSchemaPropsArgs(type="integer"),
                                },
                            ),
                        },
                    ),
                ),
            )],
            scope="Namespaced",
            names=k8s.apiextensions.v1.CustomResourceDefinitionNamesArgs(
                plural="modelservings",
                singular="modelserving",
                kind="ModelServing",
                short_names=["ms"],
            ),
        ),
    )

    # Create an instance of the custom ModelServing resource.
    model_serving_instance = k8s.apiextensions.CustomResource(
        "model-serving-instance",
        api_version="ai.example.com/v1",
        kind="ModelServing",
        metadata=k8s.meta.v1.ObjectMetaArgs(name="example-model-serving"),
        spec={
            "image": "example-model-image:v1",
            "replicas": 3,
        },
        # Wait for the CRD to exist before creating an object of its kind.
        opts=pulumi.ResourceOptions(depends_on=[model_serving_crd]),
    )

    # Export the name of the model serving instance.
    pulumi.export("model_serving_name", model_serving_instance.metadata["name"])


    • We begin by importing the necessary Pulumi modules. The pulumi_kubernetes module, imported as k8s, contains the types needed to interact with Kubernetes.
    • The model_serving_crd resource defines the schema for a new resource type named ModelServing. It includes the group ai.example.com, the version v1, and a short name ms.
    • The schema within the CRD outlines the structure of a ModelServing object: its spec holds an image field of type string and a replicas field of type integer. The schema as written does not mark either field as required. This is akin to defining the columns of a database table.
    • After defining the CRD, we create an object of the ModelServing kind with model_serving_instance. This represents a specific model serving object that we want to deploy in our cluster.
    • The depends_on option ensures that the custom resource is not created until the CRD is successfully applied to the Kubernetes cluster.
    • Finally, we export the name of our model serving instance as an output for easy access.
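
    To make the schema's role concrete, here is a minimal sketch of the kind of type checking it describes, written in plain Python. It is an illustration only and not how the Kubernetes API server is implemented. It covers just the three types used in this CRD, and the names SPEC_SCHEMA, crd_name, and type_errors are hypothetical.

```python
# The shape of the CRD's spec schema, written as a plain dict.
SPEC_SCHEMA = {
    "type": "object",
    "properties": {
        "image": {"type": "string"},
        "replicas": {"type": "integer"},
    },
}


def crd_name(plural: str, group: str) -> str:
    """A CRD's metadata.name has the form <plural>.<group>."""
    return f"{plural}.{group}"


def type_errors(value, schema, path="spec"):
    """Return type mismatches for the small schema subset used above."""
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        errors = []
        for key, sub_schema in schema.get("properties", {}).items():
            # Absent fields pass: this schema marks nothing as required.
            if key in value:
                errors.extend(type_errors(value[key], sub_schema, f"{path}.{key}"))
        return errors
    if expected == "string" and not isinstance(value, str):
        return [f"{path}: expected string"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if expected == "integer" and (isinstance(value, bool) or not isinstance(value, int)):
        return [f"{path}: expected integer"]
    return []


print(crd_name("modelservings", "ai.example.com"))
print(type_errors({"image": "example-model-image:v1", "replicas": "3"}, SPEC_SCHEMA))
```

    A spec with replicas given as the string "3" is reported as a mismatch, while the spec used in the program above passes.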

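    This Pulumi program, when run with pulumi up, applies the CRD to your Kubernetes cluster and creates one ModelServing object named example-model-serving with the image example-model-image:v1 and replicas set to 3. The custom resource itself only records this desired state: a controller or operator that watches ModelServing objects is what would turn it into running pods. With such a controller in place, changing replicas in the Pulumi program and running pulumi up again is how you would request a different scale.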
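
    To show how the image and replicas fields could be put to use, here is a minimal sketch of how a controller might translate a ModelServing object into a standard apps/v1 Deployment manifest. No such controller is part of the program above. The function name deployment_for, the "app" label convention, and the container name "model-server" are all assumptions made for illustration.

```python
def deployment_for(model_serving: dict) -> dict:
    """Hypothetical mapping from a ModelServing object to a Deployment manifest."""
    name = model_serving["metadata"]["name"]
    spec = model_serving["spec"]
    labels = {"app": name}  # assumed labeling convention
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            # The custom resource's replicas field drives the pod count.
            "replicas": spec["replicas"],
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        # The custom resource's image field picks the container image.
                        {"name": "model-server", "image": spec["image"]}
                    ]
                },
            },
        },
    }


example = {
    "apiVersion": "ai.example.com/v1",
    "kind": "ModelServing",
    "metadata": {"name": "example-model-serving"},
    "spec": {"image": "example-model-image:v1", "replicas": 3},
}
deployment = deployment_for(example)
print(deployment["spec"]["replicas"])
```

    A real controller would also watch for changes and update or delete the Deployment accordingly; this sketch covers only the one-way translation.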