1. Kubernetes Ingress for Distributed Model Serving


    In Kubernetes, an Ingress is an API object that manages external access to the services in a cluster, typically HTTP. For your use case of distributed model serving, the Ingress will route external traffic to the appropriate model-serving service based on the request path or host.
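
    The program in this answer routes by path only. As a rough illustration of the host-based alternative, the sketch below builds one Ingress rule per hostname as a plain Python dict in the shape the Ingress spec expects. The helper name host_rule and the example.com hostnames are made up for this example.

```python
def host_rule(host: str, service_name: str, port: int = 80) -> dict:
    """Build one Ingress rule that sends all traffic for `host` to a Service."""
    return {
        "host": host,
        "http": {
            "paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {
                    "service": {"name": service_name, "port": {"number": port}}
                },
            }]
        },
    }


# Hypothetical hostnames; one rule per model-serving service.
rules = [
    host_rule("model-a.example.com", "model-a-service"),
    host_rule("model-b.example.com", "model-b-service"),
]
```

    A list like rules could be passed as the "rules" entry of an Ingress spec in place of a path-based rule.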

    To configure an Ingress for distributed model serving, you will need:

    1. A Kubernetes cluster with an Ingress controller installed (the NGINX Ingress controller, for example, which the program below assumes).
    2. One or more model-serving services deployed in the cluster.
    3. An Ingress resource defining the access rules.

    Below, you will find a Pulumi program written in Python that demonstrates how to set up an Ingress resource to distribute traffic to two different model-serving services based on the request path.

    First, I will include the necessary import statements:

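    • pulumi provides core SDK features such as ResourceOptions and stack exports.
    • Provider (from pulumi_kubernetes) tells Pulumi which cluster to deploy to.
    • Ingress (from pulumi_kubernetes.networking.v1) creates the Ingress resource.
    • Service (from pulumi_kubernetes.core.v1) defines the backend services that handle the requests.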

    Here is the Pulumi program that demonstrates this:

    import pulumi
    from pulumi_kubernetes import Provider
    from pulumi_kubernetes.networking.v1 import Ingress
    from pulumi_kubernetes.core.v1 import Service

    # Assume that you have a Kubernetes Provider configured for Pulumi.
    k8s_provider = Provider(resource_name='k8s')

    # Define a service for Model A
    model_a_service = Service(
        'model-a-service',
        metadata={"name": "model-a-service"},
        spec={
            "selector": {"app": "model-a"},
            # Service port 80 forwards to container port 8080
            "ports": [{"port": 80, "targetPort": 8080}],
        },
        opts=pulumi.ResourceOptions(provider=k8s_provider),
    )

    # Define a service for Model B
    model_b_service = Service(
        'model-b-service',
        metadata={"name": "model-b-service"},
        spec={
            "selector": {"app": "model-b"},
            "ports": [{"port": 80, "targetPort": 8080}],
        },
        opts=pulumi.ResourceOptions(provider=k8s_provider),
    )

    # Create the Ingress resource
    ingress = Ingress(
        'model-serving-ingress',
        metadata={
            "annotations": {
                # NGINX-specific: treat the paths below as regular expressions
                "nginx.ingress.kubernetes.io/use-regex": "true",
                # NGINX-specific: forward only the second capture group
                "nginx.ingress.kubernetes.io/rewrite-target": "/$2",
            }
        },
        spec={
            # Assumption: the controller's IngressClass is named "nginx";
            # adjust or remove this if your cluster differs.
            "ingressClassName": "nginx",
            "rules": [{
                "http": {
                    "paths": [
                        {
                            "path": "/model-a(/|$)(.*)",
                            # Regex paths need ImplementationSpecific, not Prefix
                            "pathType": "ImplementationSpecific",
                            "backend": {
                                "service": {
                                    "name": model_a_service.metadata["name"],
                                    "port": {"number": 80},
                                }
                            },
                        },
                        {
                            "path": "/model-b(/|$)(.*)",
                            "pathType": "ImplementationSpecific",
                            "backend": {
                                "service": {
                                    "name": model_b_service.metadata["name"],
                                    "port": {"number": 80},
                                }
                            },
                        },
                    ]
                }
            }],
        },
        opts=pulumi.ResourceOptions(provider=k8s_provider),
    )

    # Export the Ingress name
    pulumi.export('ingress_name', ingress.metadata['name'])

    In this program:

    • Two Service resources are defined, one for each model-serving service (model-a and model-b).
    • An Ingress resource is created that defines rules for routing traffic to these services.
      • Requests to /model-a will be routed to model-a-service.
      • Requests to /model-b will be routed to model-b-service.
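    • For simplicity, we assume each model server listens on container port 8080. Each Service exposes it on port 80 (via targetPort), and the Ingress backends reference Service port 80.
    • The path fields use regular expressions with two capture groups: (/|$) matches a slash or the end of the path, and (.*) captures any sub-path. With the NGINX Ingress controller, regex paths should use pathType ImplementationSpecific, because Prefix and Exact paths are matched literally.
    • The nginx.ingress.kubernetes.io/rewrite-target annotation is specific to the NGINX Ingress controller and controls how the URL path is rewritten before the request is forwarded to the backend service. With the value /$2, only the second capture group is kept, so a request to /model-a/predict reaches the backend as /predict.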
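
    To make the capture groups and the rewrite concrete, here is a small simulation in plain Python. It only imitates the matching and rewriting with the re module and is not how the controller is implemented. The function name rewrite and the sample request paths are illustrative assumptions.

```python
import re
from typing import Optional

# The same path expressions used in the Ingress rules above.
ROUTES = {
    r"/model-a(/|$)(.*)": "model-a-service",
    r"/model-b(/|$)(.*)": "model-b-service",
}


def rewrite(path: str) -> Optional[tuple]:
    """Return (service, rewritten_path) for a request path, or None if no rule matches."""
    for pattern, service in ROUTES.items():
        match = re.match(pattern, path)
        if match:
            # A rewrite-target of "/$2" keeps only the second capture group.
            return service, "/" + match.group(2)
    return None


print(rewrite("/model-a/predict"))  # ('model-a-service', '/predict')
print(rewrite("/model-b"))          # ('model-b-service', '/')
print(rewrite("/model-abc"))        # None
```

    The last call shows why the (/|$) group is there: without it, /model-abc would also be sent to model-a-service.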

    Ensure that you have installed the NGINX Ingress controller in your cluster (other controllers need their own rewrite configuration) and that the pods for model-a and model-b are deployed and properly labeled. Adjust the selector fields to match the labels of your model-serving pods.
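
    A Service selector matches a pod when every key/value pair in the selector also appears in the pod's labels. The check can be sketched in a few lines of Python. The function name and the sample label sets are hypothetical, and the real matching is done by Kubernetes itself.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True if every selector key/value pair is present in the pod's labels.

    Assumes a non-empty selector, as in the Services defined above.
    """
    return all(pod_labels.get(key) == value for key, value in selector.items())


selector = {"app": "model-a"}  # same selector as model-a-service

# Extra labels on the pod do not prevent a match.
print(selector_matches(selector, {"app": "model-a", "version": "v2"}))  # True
# A different value for "app" means the Service will not pick up the pod.
print(selector_matches(selector, {"app": "model-b"}))  # False
```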

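    Finally, I have exported the name of the Ingress resource so that you can easily find it within your cluster after deployment. Because the Ingress metadata sets no explicit name, Pulumi generates one from the resource name plus a random suffix, so the export is the simplest way to look it up. This is a standard practice for tracking generated resources in Pulumi.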

    Deploy the program using Pulumi CLI commands:

    • Run pulumi up to preview and deploy the changes.
    • Confirm the deployment by selecting 'yes'.
    • After successful deployment, you can check the Ingress resource using kubectl (for example, kubectl get ingress) or access the model services through the Ingress's external endpoint.
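
    Once the Ingress has an external address, a request to one of the models could look like the sketch below, which uses only the Python standard library. The address 203.0.113.10, the /predict endpoint, and the JSON payload are placeholders, since the actual API depends on your model server.

```python
import json
import urllib.request


def build_predict_request(ingress_address: str, model: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for http://<ingress_address>/<model>/predict."""
    url = f"http://{ingress_address}/{model}/predict"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address and payload; substitute the values for your deployment.
request = build_predict_request("203.0.113.10", "model-a", {"inputs": [1.0, 2.0]})
print(request.get_method(), request.full_url)

# To actually send it once the endpoint is reachable:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status, response.read().decode("utf-8"))
```

    With the rewrite annotation in place, the model-a backend would see this request path as /predict.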