Kubernetes Ingress Routing for Model Serving Endpoints

Question

Pulumi · Accepted Answer

To route external traffic to model serving endpoints on Kubernetes, we'll create an `Ingress` resource. An `Ingress` is a set of rules that allow inbound HTTP and HTTPS connections to reach Services in the cluster. It can give Services externally reachable URLs, load-balance traffic, terminate SSL/TLS, and provide name-based virtual hosting.

An `Ingress` controller is responsible for fulfilling the `Ingress`, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.

In this scenario, you likely have a model serving service deployed on Kubernetes, which you want to expose to external traffic via an HTTP endpoint. To accomplish this, your `Ingress` needs to define how to route the traffic to your model serving endpoints.
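
To make host and path routing concrete, here is a simplified pure-Python model of how an Ingress rule could be selected for a request. It sketches the `Exact` and `Prefix` path types as the Kubernetes documentation describes them. It is not the code any ingress controller actually runs. The `Rule` class and the `route` function are hypothetical names used only for illustration, and wildcard hosts are ignored. With the `ImplementationSpecific` path type, matching is left to the controller, so it is not modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical, flattened view of one Ingress host/path rule."""
    host: Optional[str]  # None means the rule applies to any host
    path: str
    path_type: str       # "Exact" or "Prefix"
    service: str
    port: int


def path_matches(rule_path: str, path_type: str, request_path: str) -> bool:
    if path_type == "Exact":
        return request_path == rule_path
    if path_type == "Prefix":
        # Prefix matching compares whole path segments, so
        # /predictions matches /predictions/v1 but not /predictionsX.
        rule_parts = [p for p in rule_path.split("/") if p]
        request_parts = [p for p in request_path.split("/") if p]
        return request_parts[: len(rule_parts)] == rule_parts
    raise ValueError(f"path type not modelled in this sketch: {path_type}")


def route(rules: List[Rule], host: str, request_path: str) -> Optional[Rule]:
    """Return the best matching rule, or None if nothing matches."""
    candidates = [
        r for r in rules
        if r.host in (None, host) and path_matches(r.path, r.path_type, request_path)
    ]
    if not candidates:
        return None
    # The longest matching path wins; on a tie, Exact beats Prefix.
    return max(candidates, key=lambda r: (len(r.path), r.path_type == "Exact"))


rules = [
    Rule("modelserving.example.com", "/predictions", "Prefix", "model-serving-service", 80),
]
print(route(rules, "modelserving.example.com", "/predictions/v1"))
print(route(rules, "modelserving.example.com", "/predictionsX"))
```

The first call returns the rule for `model-serving-service`; the second returns `None` because `/predictionsX` is a different path segment, not a sub-path of `/predictions`.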

Below is a Pulumi program written in Python to create an `Ingress` resource for a Kubernetes cluster. This example assumes you have a model serving application running within your cluster and that you want to route traffic to it based on the incoming request's host or path.

Before getting to the code, here are the resources it uses:

- `Provider`: If you're working with a Kubernetes cluster that is not the default configured on your machine, you must first create a `Provider` resource.

- `Ingress`: The `Ingress` resource is where the routing rules are defined. You specify which host and path are directed to which service. You'll need to adjust the backend service `name` and port `number` to match your Kubernetes Service.

- `pulumi.export`: At the end of the Pulumi program, this line exports the ingress's URL so that you can easily access the public URL of your model serving application.

Let's see how to code this in Pulumi:

```python
import pulumi
import pulumi_kubernetes as kubernetes

# Required if your Kubernetes cluster is not the default configured on your machine.
# provider = kubernetes.Provider('provider', kubeconfig='path-to-kubeconfig')

# Create an Ingress to route traffic to the model serving service
model_serving_ingress = kubernetes.networking.v1.Ingress(
    "model-serving-ingress",
    metadata=kubernetes.meta.v1.ObjectMetaArgs(
        name="model-serving",
    ),
    spec=kubernetes.networking.v1.IngressSpecArgs(
        # Selects the ingress controller; "nginx" assumes an IngressClass with that name exists.
        # This field replaces the deprecated "kubernetes.io/ingress.class" annotation.
        ingress_class_name="nginx",
        rules=[
            kubernetes.networking.v1.IngressRuleArgs(
                host="modelserving.example.com",  # Your domain here
                http=kubernetes.networking.v1.HTTPIngressRuleValueArgs(
                    paths=[
                        kubernetes.networking.v1.HTTPIngressPathArgs(
                            path="/predictions",  # URL path for accessing the model serving endpoint
                            path_type="ImplementationSpecific",
                            backend=kubernetes.networking.v1.IngressBackendArgs(
                                service=kubernetes.networking.v1.IngressServiceBackendArgs(
                                    name="model-serving-service",  # Name of the model serving Kubernetes service
                                    port=kubernetes.networking.v1.ServiceBackendPortArgs(
                                        number=80  # Port where the model serving service is exposed
                                    ),
                                ),
                            ),
                        ),
                    ],
                ),
            ),
        ],
        # Uncomment and define the tls section if TLS is needed
        # tls=[
        #     kubernetes.networking.v1.IngressTLSArgs(
        #         hosts=["modelserving.example.com"],
        #         secret_name="model-serving-tls-secret",  # Secret with TLS certificate and key
        #     ),
        # ],
    ),
    # Uncomment the next line if you created an explicit provider above;
    # leave it commented out to use the default Kubernetes provider.
    # opts=pulumi.ResourceOptions(provider=provider)
)

# Export the URL of the routed endpoint, built from the host defined in the Ingress rule
pulumi.export(
    "model_serving_url",
    model_serving_ingress.spec.apply(lambda spec: f"http://{spec.rules[0].host}/predictions"),
)
```

In this program:

- We create the Ingress under the domain `modelserving.example.com`, and we define a route for the path `/predictions` to a service called `model-serving-service`. Make sure that the service name and port match the Kubernetes service that's serving your model.

- Annotations can be added to the metadata section for additional configuration parameters depending on your environment or ingress controller requirements.

- We have commented out the sections for the provider setup and TLS configuration. You would need to uncomment and adjust those to reflect your environment.

If you want the Ingress to terminate HTTPS and you have a TLS certificate, include the `tls` section in `IngressSpecArgs` with a reference to the Kubernetes Secret (type `kubernetes.io/tls`) that stores your certificate and key.

Remember to adjust the domain name, service name, and service ports to fit your specific use case. After deploying this with Pulumi, external traffic hitting `modelserving.example.com/predictions` should be directed to your model serving service.

Deploying this Pulumi program with `pulumi up` creates the routing rules in your cluster. They only take effect once an ingress controller is running in the cluster and DNS for your domain points at it.
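
To check the route before DNS is in place, one option is to send a request straight to the ingress controller's external address while presenting the Ingress host in the `Host` header. The sketch below uses only the Python standard library. The function names, the `203.0.113.10` address (a documentation-only IP), and the JSON payload shape are assumptions for illustration; your model server defines the real request format.

```python
import json
import urllib.request


def build_prediction_request(ingress_address, host, payload, path="/predictions"):
    """Build a POST aimed at the ingress address that presents `host` as the Host header."""
    return urllib.request.Request(
        url=f"http://{ingress_address}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Host": host, "Content-Type": "application/json"},
        method="POST",
    )


def send_prediction_request(request, timeout=5):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical values: replace the address with your controller's external IP
# and the payload with whatever your model server expects.
request = build_prediction_request(
    "203.0.113.10",
    "modelserving.example.com",
    {"instances": [[1.0, 2.0]]},
)
```

Calling `send_prediction_request(request)` then performs the actual HTTP call. A 404 from the controller usually means that no rule matched the host or path, whereas a 502 or 503 suggests the rule matched but the backend service is unreachable.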