How Do I Scale Kubernetes Pods Based on CPU Utilization?

In this guide, we will demonstrate how to automatically scale Kubernetes pods based on CPU utilization using Pulumi. We will create a Kubernetes Deployment and a Horizontal Pod Autoscaler (HPA) that scales the number of pods based on the average CPU usage.

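The HorizontalPodAutoscaler resource watches the average CPU utilization of the Deployment's pods, measured as a percentage of each container's CPU request, and adjusts the replica count to keep that average near the target. This is why the container in the program sets a CPU request: without one, the HPA has no baseline to compute utilization against. The cluster also needs a metrics source such as metrics-server for the HPA to read CPU usage.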
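To make the scaling behavior concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: the desired count is the current count scaled by the ratio of observed to target utilization, rounded up, then clamped to the configured bounds. The function name `desiredReplicas` is hypothetical and exists only for illustration. The sketch deliberately ignores details the real controller applies, such as the tolerance band, stabilization windows, and pods that are not yet ready.

```typescript
// Hypothetical helper for illustration; not part of Pulumi or Kubernetes.
// Approximates the HPA rule:
//   desired = ceil(currentReplicas * currentUtilization / targetUtilization)
// and clamps the result to [minReplicas, maxReplicas].
function desiredReplicas(
    currentReplicas: number,
    currentUtilization: number, // average CPU utilization, percent of request
    targetUtilization: number,  // e.g. 50 for targetCPUUtilizationPercentage: 50
    minReplicas: number,
    maxReplicas: number,
): number {
    const raw = Math.ceil(currentReplicas * (currentUtilization / targetUtilization));
    return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

// Two pods averaging 100% utilization against a 50% target scale to four pods.
console.log(desiredReplicas(2, 100, 50, 1, 10)); // 4
```

With the settings used in this guide (target 50%, bounds 1 to 10), sustained load above the target adds pods until the average falls back toward 50% or the count reaches 10.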

Here’s the complete Pulumi program in TypeScript:

import * as k8s from "@pulumi/kubernetes";

// Create a Kubernetes namespace
const namespace = new k8s.core.v1.Namespace("app-ns");

// Create a Kubernetes Deployment
const appLabels = { app: "my-app" };
const deployment = new k8s.apps.v1.Deployment("my-app-deployment", {
    metadata: {
        namespace: namespace.metadata.name,
    },
    spec: {
        selector: { matchLabels: appLabels },
        replicas: 1, // Start with a single replica
        template: {
            metadata: { labels: appLabels },
            spec: {
                containers: [{
                    name: "my-app",
                    image: "nginx", // Use a simple nginx image for demonstration
                    // A CPU request is required: the HPA computes utilization
                    // as a percentage of this value.
                    resources: {
                        requests: {
                            cpu: "100m",
                        },
                    },
                }],
            },
        },
    },
});

// Create a Horizontal Pod Autoscaler
const hpa = new k8s.autoscaling.v1.HorizontalPodAutoscaler("my-app-hpa", {
    metadata: {
        namespace: namespace.metadata.name,
    },
    spec: {
        scaleTargetRef: {
            apiVersion: "apps/v1",
            kind: "Deployment",
            name: deployment.metadata.name,
        },
        minReplicas: 1,
        maxReplicas: 10,
        targetCPUUtilizationPercentage: 50, // Target 50% average utilization of the requested CPU
    },
});

// Export the namespace and deployment name
export const nsName = namespace.metadata.name;
export const deploymentName = deployment.metadata.name;

Key Points

Namespace: We create a namespace to logically isolate our resources.
Deployment: A Kubernetes Deployment is created with an initial replica count of 1 and a simple container running the nginx image.
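Horizontal Pod Autoscaler (HPA): The HPA is configured to scale the deployment between 1 and 10 replicas, targeting 50% average CPU utilization. Utilization is measured against the container's CPU request, so with a request of 100m the target corresponds to an average of about 50m of CPU per pod.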
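The relationship between the CPU request and the utilization percentage can be sketched as follows. Both function names are hypothetical, and the sketch assumes every pod has a single container with the same request, as in the Deployment above. It handles only the two most common CPU quantity forms (millicores such as "100m" and whole or fractional cores such as "0.5").

```typescript
// Hypothetical helpers for illustration; not part of Pulumi or Kubernetes.

// Convert a CPU quantity such as "100m" or "0.5" into millicores.
function parseCpuMillicores(quantity: string): number {
    return quantity.endsWith("m")
        ? Number(quantity.slice(0, -1))
        : Number(quantity) * 1000;
}

// Average utilization across pods, as a percentage of the CPU request.
// Assumes all pods share the same request.
function averageCpuUtilization(usageMillicores: number[], request: string): number {
    const requestMillicores = parseCpuMillicores(request);
    const totalUsage = usageMillicores.reduce((sum, u) => sum + u, 0);
    return (totalUsage / (requestMillicores * usageMillicores.length)) * 100;
}

// Two pods using 80m and 120m against a 100m request average 100% utilization,
// which is double the 50% target used in this guide.
console.log(averageCpuUtilization([80, 120], "100m")); // 100
```

Because the percentage is relative to the request, raising the request without changing the target makes the HPA scale out later, and lowering it makes the HPA scale out sooner.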

Summary

In this guide, we set up a Kubernetes Deployment and a Horizontal Pod Autoscaler using Pulumi. The HPA adjusts the number of pods in the deployment between 1 and 10 to keep average CPU utilization near the 50% target, adding pods under load and removing them when demand drops.
