1. Self-Healing Inference Services with DaemonSets


    In Kubernetes, a DaemonSet ensures that all (or some) nodes run a copy of a pod. As nodes are added to the cluster, pods are added to them. As nodes are removed from the cluster, those pods are garbage collected. Deleting a DaemonSet will clean up the pods it created.

    DaemonSets are particularly useful for creating self-healing infrastructure, where you want to guarantee that your service runs on all nodes, automating the deployment and operation of the necessary components. This is often used for services that provide monitoring, logging, or infrastructure-related services at the node level.

    For instance, suppose you are running an inference service that must execute on every node to perform computations on data local to that node. You want it to be self-healing: if a node fails and is replaced, or a new node joins the cluster, Kubernetes automatically schedules a pod of your service there without manual intervention.

    Using Pulumi, we can define a DaemonSet as infrastructure as code, which provides a convenient way to codify, version-control, and automate the deployment of such services.

    Here's how you can define a DaemonSet with Pulumi in Python:

    import pulumi
    import pulumi_kubernetes as k8s

    # Define a namespace to group the inference service's resources
    namespace = k8s.core.v1.Namespace(
        "inference-service-namespace",
        metadata={
            "name": "inference-services",
        },
    )

    # Define the DaemonSet to deploy
    daemonset = k8s.apps.v1.DaemonSet(
        "inference-daemonset",
        metadata=k8s.meta.v1.ObjectMetaArgs(
            name="inference-daemon",
            namespace=namespace.metadata["name"],
        ),
        spec=k8s.apps.v1.DaemonSetSpecArgs(
            selector=k8s.meta.v1.LabelSelectorArgs(
                match_labels={"app": "inference"},
            ),
            template=k8s.core.v1.PodTemplateSpecArgs(
                metadata=k8s.meta.v1.ObjectMetaArgs(
                    labels={"app": "inference"},
                ),
                spec=k8s.core.v1.PodSpecArgs(
                    containers=[
                        k8s.core.v1.ContainerArgs(
                            name="inference-container",
                            image="inference-service-image:v1.0.0",  # Replace with your actual image
                            resources=k8s.core.v1.ResourceRequirementsArgs(
                                requests={
                                    "cpu": "200m",
                                    "memory": "512Mi",
                                },
                                limits={
                                    "cpu": "1",
                                    "memory": "1Gi",
                                },
                            ),
                            # Define any necessary env vars, ports, volume mounts, etc.
                        ),
                    ],
                    # Optional: restrict scheduling to nodes matching a label.
                    # Without any affinity, the DaemonSet targets every node.
                    affinity=k8s.core.v1.AffinityArgs(
                        node_affinity=k8s.core.v1.NodeAffinityArgs(
                            required_during_scheduling_ignored_during_execution=k8s.core.v1.NodeSelectorArgs(
                                node_selector_terms=[
                                    k8s.core.v1.NodeSelectorTermArgs(
                                        match_expressions=[
                                            k8s.core.v1.NodeSelectorRequirementArgs(
                                                key="kubernetes.io/arch",  # the "beta." prefix is deprecated
                                                operator="In",
                                                values=["amd64"],
                                            ),
                                        ],
                                    ),
                                ],
                            ),
                        ),
                    ),
                ),
            ),
        ),
    )

    # Export the name and namespace of the DaemonSet for later references or queries
    pulumi.export("daemonset_name", daemonset.metadata["name"])
    pulumi.export("namespace", namespace.metadata["name"])

    In this example:

    • We've defined a Namespace to logically group our DaemonSet, which helps with access control, resource organization, and limiting the impact of changes.
    • We've created a DaemonSet resource with a single container pod, with CPU and memory requests and limits to ensure it has the necessary resources.
    • The affinity setting is optional; here it restricts scheduling to nodes whose architecture label matches amd64. Omit it to target every node in the cluster.
    • The image should be replaced with the container image of your inference service application.
    • You can customize the resources field based on the requirements of your specific service.
    • Additional configurations like env variables, ports, volumeMounts, etc., can be added as needed.
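    As a sketch of those additional configurations: pulumi_kubernetes accepts plain Python dicts interchangeably with the typed *Args classes, so env vars, a port, and a volume mount could look like the following. The MODEL_PATH variable, port 8080, and "model-cache" volume are illustrative placeholders, not part of the example above.

```python
# Extra container settings expressed as plain dicts; pulumi_kubernetes
# accepts these in place of the typed *Args classes. All names and
# values below (MODEL_PATH, port 8080, "model-cache") are illustrative.
container_extras = {
    "name": "inference-container",
    "image": "inference-service-image:v1.0.0",
    "env": [
        {"name": "MODEL_PATH", "value": "/models/current"},
        {"name": "LOG_LEVEL", "value": "info"},
    ],
    "ports": [
        {"name": "http", "container_port": 8080},
    ],
    "volume_mounts": [
        {"name": "model-cache", "mount_path": "/models"},
    ],
}

# The volume itself is declared once at the pod-spec level and
# referenced by name from the mount above.
pod_volumes = [
    {"name": "model-cache", "host_path": {"path": "/var/cache/models"}},
]
```

    These dicts would be merged into the ContainerArgs and PodSpecArgs of the program above; dict keys use the Python SDK's snake_case naming.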

    When deployed with Pulumi, this program sets up the defined DaemonSet in your Kubernetes cluster, making sure a pod of the service runs on each eligible node.
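    One related knob: the DaemonSet controller automatically tolerates built-in node conditions such as node.kubernetes.io/not-ready, but custom taints on dedicated nodes still require explicit tolerations in the pod spec. A sketch, again as plain dicts, where the "dedicated" taint key and "inference" value are hypothetical examples:

```python
# Tolerations allow the DaemonSet's pods onto nodes carrying a matching
# taint (e.g. nodes reserved for inference workloads). The taint key
# "dedicated" and value "inference" are hypothetical; this list would be
# passed as tolerations=[...] on the pod spec.
tolerations = [
    {
        "key": "dedicated",
        "operator": "Equal",
        "value": "inference",
        "effect": "NoSchedule",
    },
]
```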

    To run this Pulumi program, save it to a file, such as main.py, and run pulumi up. Pulumi will perform the necessary actions to achieve the desired state defined by your program. If you haven't logged in to Pulumi or selected a stack, you'll need to perform those steps first.

    Remember, Pulumi keeps track of your resources and the desired state you've defined in your program, so if there's drift (for example, if someone manually deletes the DaemonSet), Pulumi can detect it (e.g. via pulumi refresh) and restore the desired state on the next pulumi up.