1. Kubernetes Ingress Routing for Model Serving Endpoints.


    To route external traffic to model serving endpoints, we'll create a Kubernetes Ingress resource. An Ingress is a set of rules that allow inbound connections to reach services in the cluster. It can be configured to give services externally reachable URLs, load balance traffic, terminate SSL/TLS, and offer name-based virtual hosting.

    An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.

    In this scenario, you likely have a model serving service deployed on Kubernetes, which you want to expose to external traffic via an HTTP endpoint. To accomplish this, your Ingress needs to define how to route the traffic to your model serving endpoints.

    Below is a Pulumi program written in Python to create an Ingress resource for a Kubernetes cluster. This example assumes you have a model serving application running within your cluster and that you want to route traffic to it based on the incoming request's host or path.

    Before the code block, let's explain the resources and their usage:

    • Provider: If the target Kubernetes cluster is not the one selected by your default kubeconfig context, you must first create a Provider resource that points at the right kubeconfig.

    • Ingress: The Ingress resource is where the routing rules are defined. You specify which host and path are directed to which service. You'll need to adjust the backend service name and port number to match your Kubernetes service details.

    • pulumi.export: At the end of the Pulumi program, this call exports the URL of the model serving endpoint as a stack output, so that you can easily look up the public URL of your application.
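
    To make the field mapping concrete, here is a minimal sketch in plain Python (no Pulumi required) of the manifest shape that a networking.k8s.io/v1 Ingress rule takes. The helper name build_ingress_manifest is hypothetical; the host, path, and service values mirror the placeholders used in this article.

```python
import json


def build_ingress_manifest(host, path, service_name, service_port):
    """Return a networking.k8s.io/v1 Ingress manifest as a plain dict.

    Hypothetical helper for illustration only; it is not part of Pulumi
    or of any Kubernetes client library.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": "model-serving"},
        "spec": {
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": path,
                                "pathType": "ImplementationSpecific",
                                # In the v1 API the backend is nested as
                                # service.name and service.port.number.
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }


# Placeholder values taken from this article; adjust them to your service.
manifest = build_ingress_manifest(
    "modelserving.example.com", "/predictions", "model-serving-service", 80
)
print(json.dumps(manifest, indent=2))
```

    The Pulumi argument classes used in the program follow the same nesting, with snake_case names (for example path_type instead of pathType).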

    Let's see how to code this in Pulumi:

    import pulumi
    import pulumi_kubernetes as kubernetes

    # Required only if your cluster is not the one in your default kubeconfig context.
    # provider = kubernetes.Provider('provider', kubeconfig='path-to-kubeconfig')

    # Host and path under which the model serving endpoint is exposed
    host = "modelserving.example.com"  # Your domain here
    path = "/predictions"  # URL path for accessing the model serving endpoint

    # Create an Ingress to route traffic to the model serving service
    model_serving_ingress = kubernetes.networking.v1.Ingress(
        "model-serving-ingress",
        metadata=kubernetes.meta.v1.ObjectMetaArgs(
            name="model-serving",
            annotations={
                # If using an ingress controller like nginx, put the relevant ingress class here.
                # This annotation is deprecated in Kubernetes; newer setups can set
                # ingress_class_name="nginx" in IngressSpecArgs instead.
                "kubernetes.io/ingress.class": "nginx"
            },
        ),
        spec=kubernetes.networking.v1.IngressSpecArgs(
            rules=[
                kubernetes.networking.v1.IngressRuleArgs(
                    host=host,
                    http=kubernetes.networking.v1.HTTPIngressRuleValueArgs(
                        paths=[
                            kubernetes.networking.v1.HTTPIngressPathArgs(
                                path=path,
                                path_type="ImplementationSpecific",
                                backend=kubernetes.networking.v1.IngressBackendArgs(
                                    service=kubernetes.networking.v1.IngressServiceBackendArgs(
                                        name="model-serving-service",  # Name of the model serving Kubernetes service
                                        port=kubernetes.networking.v1.ServiceBackendPortArgs(
                                            number=80  # Port where the model serving service is exposed
                                        ),
                                    ),
                                ),
                            ),
                        ],
                    ),
                ),
            ],
            # Uncomment and define the tls section if TLS is needed
            # tls=[
            #     kubernetes.networking.v1.IngressTLSArgs(
            #         hosts=[host],
            #         secret_name="model-serving-tls-secret",  # Secret with TLS certificate and key
            #     ),
            # ],
        ),
        # Uncomment the next line if you created an explicit provider above
        # opts=pulumi.ResourceOptions(provider=provider)
    )

    # Export the public URL of the model serving endpoint.
    # It is built from the routed host and path, not from the Ingress metadata name.
    pulumi.export("model_serving_url", f"http://{host}{path}")

    In this program:

    • We create the Ingress under the domain modelserving.example.com, and we define a route for the path /predictions to a service called model-serving-service. Make sure that the service name and port match the Kubernetes service that's serving your model.

    • Annotations can be added to the metadata section for additional configuration parameters depending on your environment or ingress controller requirements.

    • The provider setup and the TLS configuration are commented out. Uncomment and adjust them if your environment needs them.

    If you have configured your application to use HTTPS and have a TLS certificate, you would also include the tls section in the IngressSpecArgs with a reference to the Kubernetes Secret that stores your certificate and key.
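
    As a rough sketch of what enabling TLS changes, the plain-Python helper below adds a tls entry to an Ingress manifest dict. The name with_tls is hypothetical, and the secret name is the placeholder from the commented-out section; the Secret itself is assumed to already exist in the same namespace as the Ingress.

```python
import copy


def with_tls(manifest, hosts, secret_name):
    """Return a copy of an Ingress manifest dict with a TLS section added.

    Hypothetical helper for illustration; the input manifest is left untouched.
    """
    updated = copy.deepcopy(manifest)
    updated["spec"]["tls"] = [{"hosts": list(hosts), "secretName": secret_name}]
    return updated


# Minimal stand-in manifest; a real one would also carry the full routing rules.
ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "model-serving"},
    "spec": {"rules": [{"host": "modelserving.example.com"}]},
}

secured = with_tls(ingress, ["modelserving.example.com"], "model-serving-tls-secret")
print(secured["spec"]["tls"])
```

    In the Pulumi program, the same information goes into IngressTLSArgs, with secret_name in place of secretName.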

    Remember to adjust the domain name, service name, and service port to fit your specific use case. After you deploy this with Pulumi, external traffic hitting modelserving.example.com/predictions should be directed to your model serving service, provided that DNS for that host resolves to your ingress controller's external address.
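
    The routing decision can be sketched in plain Python. The program above uses ImplementationSpecific, whose exact matching is left to the ingress controller, so this sketch instead illustrates the Prefix path type, which matches element by element on "/"-separated path segments. The names path_matches_prefix and route_request, and the rule table, are hypothetical and not part of any controller's API.

```python
def path_matches_prefix(rule_path, request_path):
    """Element-wise prefix match on '/'-separated path segments."""
    rule_parts = [part for part in rule_path.split("/") if part]
    request_parts = [part for part in request_path.split("/") if part]
    return request_parts[: len(rule_parts)] == rule_parts


def route_request(rules, host, path):
    """Return (service, port) for the first matching rule, or None."""
    for rule in rules:
        if rule["host"] == host and path_matches_prefix(rule["path"], path):
            return rule["service"], rule["port"]
    return None


# Rule table mirroring the placeholder values used in this article.
rules = [
    {
        "host": "modelserving.example.com",
        "path": "/predictions",
        "service": "model-serving-service",
        "port": 80,
    }
]

print(route_request(rules, "modelserving.example.com", "/predictions/v1"))
# → ('model-serving-service', 80)
print(route_request(rules, "modelserving.example.com", "/predictionsv1"))
# → None (segment "predictionsv1" is not equal to "predictions")
print(route_request(rules, "other.example.com", "/predictions"))
# → None (host does not match)
```

    This sketch returns the first match for simplicity; when several paths match, Kubernetes gives precedence to the longest matching path.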

    Deploying this Pulumi program (for example, with pulumi up) creates the routing rules in your Kubernetes cluster, allowing external users to reach your model serving endpoints.
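
    Once the Ingress is live, a client could call the endpoint along these lines. This is a sketch using only the Python standard library; the JSON payload shape is an assumption, since the request format depends on your model server, and the function names are hypothetical.

```python
import json
import urllib.request


def build_prediction_request(host, path, payload, scheme="http"):
    """Build a JSON POST request for the model serving endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{scheme}://{host}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prediction_request(request, timeout=10):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# The payload shape is an assumption; adapt it to your model server's API.
request = build_prediction_request(
    "modelserving.example.com", "/predictions", {"instances": [[1.0, 2.0, 3.0]]}
)
print(request.get_method(), request.full_url)
```

    Calling send_prediction_request(request) performs the actual HTTP call; it only succeeds once DNS and the Ingress are in place. Pass scheme="https" if you enabled the TLS section.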