1. Scalable AI Workload Isolation with Cilium Network Policies


    To achieve scalable AI workload isolation in a Kubernetes cluster, we can employ Cilium Network Policies, which provide powerful, flexible control over network traffic between pods and services based on criteria such as labels, Kubernetes namespaces, or pod IP addresses.

    Cilium Network Policies overview: Cilium is a CNI (Container Network Interface) plugin for Kubernetes that provides comprehensive networking and security capabilities. It uses a Linux kernel technology called eBPF to enable dynamic insertion of powerful security visibility and control logic within Linux itself.

    A Cilium Network Policy is a Kubernetes resource that defines how groups of pods are allowed to communicate with each other and with other network endpoints. Cilium enforces the standard Kubernetes NetworkPolicy resource kind and also offers its own CiliumNetworkPolicy custom resource with extended capabilities; in both cases, enforcement is performed by the Cilium agent running on each node. These policies allow you to specify ingress (incoming) and egress (outgoing) rules.
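    Before looking at the full Pulumi program below, it may help to see the core shape of a standard NetworkPolicy spec as a plain Python dictionary: a podSelector choosing the pods the policy governs, plus ingress and/or egress rule lists. The label values and port here are illustrative placeholders, not taken from a real cluster:

```python
# Minimal sketch of the shape of a Kubernetes NetworkPolicy spec.
# All label values and the port number are illustrative assumptions.
policy_spec = {
    "podSelector": {"matchLabels": {"workload": "ai"}},  # pods this policy governs
    "policyTypes": ["Ingress", "Egress"],                # which rule lists are enforced
    "ingress": [{                                        # allowed inbound traffic
        "from": [{"podSelector": {"matchLabels": {"role": "data-collector"}}}],
        "ports": [{"port": 80, "protocol": "TCP"}],
    }],
    "egress": [{                                         # allowed outbound traffic
        "to": [{"namespaceSelector": {"matchLabels": {"team": "data-science"}}}],
    }],
}
```

    A pod that matches podSelector has all other traffic denied once any policy selects it; only traffic matching one of the listed rules is allowed through.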

    Below is a Pulumi Python program that creates a Kubernetes NetworkPolicy resource, which you could adapt to use with Cilium. This will serve as an example of workload isolation, where we ensure that only pods with certain labels are able to access a specific pod.

    Pulumi Program for Creating a Cilium Network Policy:

    import pulumi
    import pulumi_kubernetes as kubernetes

    # Example labels that identify the group of AI workload pods.
    ai_workload_selector = {
        "workload": "ai",
        "team": "data-science",
    }

    # This Namespace could be where your AI workloads reside.
    # Replace the metadata with what suits your Kubernetes cluster setup.
    ai_namespace = kubernetes.core.v1.Namespace(
        "ai-namespace",
        metadata={"name": "ai-workloads"},
    )

    # Creating a NetworkPolicy to isolate the AI workloads.
    # Pods matching 'ai_workload_selector' are governed by this policy; only the
    # sources listed in the ingress rule below are allowed to reach them.
    network_policy = kubernetes.networking.v1.NetworkPolicy(
        "ai-workload-policy",
        metadata={
            "namespace": ai_namespace.metadata["name"],
            "name": "ai-workload-isolation",
        },
        spec={
            "podSelector": {"matchLabels": ai_workload_selector},
            "policyTypes": ["Ingress"],
            "ingress": [{
                # This example allows access from any pod in the same namespace
                # with the 'role: data-collector' label.
                "from": [{
                    "podSelector": {"matchLabels": {"role": "data-collector"}},
                }],
                # Example port that the AI workload might be listening on.
                "ports": [{"port": 80}],
            }],
        },
    )

    # Export the name of the namespace and network policy.
    pulumi.export("namespace", ai_namespace.metadata["name"])
    pulumi.export("network_policy_name", network_policy.metadata["name"])

    In this program, we start by defining label selectors for our AI workloads and creating a dedicated Kubernetes namespace for them. We then define a NetworkPolicy that specifies which source pods (identified by labels) and which ports are allowed to communicate with the pods that match our AI workload labels (ai_workload_selector). The policyTypes field with the value ["Ingress"] specifies that the rules apply to inbound (ingress) traffic to the selected pods.

    The from field within the ingress spec provides the source criteria for allowed traffic. In this example, any pod within the same namespace (the ai-namespace we created) that has the label role: data-collector is allowed to access the AI workload pods on port 80.
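    To make the selector semantics concrete, here is a small self-contained sketch (plain Python, not Pulumi) of how matchLabels selection works: a pod satisfies a selector only if its labels contain every key/value pair the selector lists, while extra labels on the pod are ignored. The label values are illustrative:

```python
def matches(selector: dict, pod_labels: dict) -> bool:
    """True if every key/value pair in the matchLabels selector appears in the pod's labels."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

# The 'from' clause in our ingress rule, expressed as a selector.
ingress_from_selector = {"role": "data-collector"}

# A pod carrying the required label (plus unrelated extras) is admitted by the rule...
print(matches(ingress_from_selector, {"role": "data-collector", "app": "etl"}))  # True
# ...while a pod without it is not matched by this rule.
print(matches(ingress_from_selector, {"role": "dashboard"}))                     # False
```

    The same subset-matching logic applies to the policy's own podSelector, which decides which pods the policy protects in the first place.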

    Finally, we export the names of the created namespace and network policy, which can be useful for querying these resources from the command line or integrating with other Pulumi stacks or systems.

    To adapt this program to your specific AI workload isolation needs, modify the selectors, ports, and namespaces as required by your application's architecture and security requirements. If you are using Cilium's enhanced Network Policies, you might include additional Cilium-specific features such as HTTP-aware rules, which are not covered in the standard Kubernetes NetworkPolicy resources.
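    If you opt into Cilium's own custom resource, an HTTP-aware rule could look like the manifest below, sketched here as a plain Python dict. The selectors, port, and path are illustrative assumptions; with Pulumi, you would typically pass such a manifest to kubernetes.apiextensions.CustomResource to create it:

```python
# Sketch of a CiliumNetworkPolicy (apiVersion cilium.io/v2) with an L7 HTTP rule.
# Label values, port, and the /predict path are illustrative assumptions.
cilium_policy = {
    "apiVersion": "cilium.io/v2",
    "kind": "CiliumNetworkPolicy",
    "metadata": {"name": "ai-workload-l7", "namespace": "ai-workloads"},
    "spec": {
        # Pods this policy protects.
        "endpointSelector": {"matchLabels": {"workload": "ai"}},
        "ingress": [{
            # Only these source pods may connect...
            "fromEndpoints": [{"matchLabels": {"role": "data-collector"}}],
            "toPorts": [{
                "ports": [{"port": "80", "protocol": "TCP"}],
                # ...and only GET requests to /predict pass the L7 filter.
                "rules": {"http": [{"method": "GET", "path": "/predict"}]},
            }],
        }],
    },
}
```

    Unlike a standard NetworkPolicy, the toPorts rules section lets Cilium filter at layer 7, so even an allowed peer can only issue the HTTP methods and paths you list.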

    Make sure to check Cilium's documentation for advanced policy configurations and capabilities as well as Cilium-specific annotations that may be added to the Kubernetes NetworkPolicy resource.