Implementing Horizontal Pod Autoscaling for services in EKS
Horizontal Pod Autoscaling (HPA) is a Kubernetes feature that automatically scales the number of pods in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization or other selected metrics. To implement HPA in an Amazon EKS cluster, we need to set up a few things:
- An EKS Cluster: A Kubernetes cluster on Amazon EKS to run our services.
- Metrics Server: For HPA to function, the metrics server should be deployed in the cluster to provide resource utilization metrics to the HPA controller.
- Horizontal Pod Autoscaler Resource: The HPA resource targets a Kubernetes deployment or other scalable resources and defines the scaling behavior.
Let's start by defining an EKS Cluster using Pulumi. We'll create a simple cluster and then add an HPA resource targeting a deployment. I'll include comments explaining each step of the process.
```typescript
import * as eks from "@pulumi/eks";
import * as k8s from "@pulumi/kubernetes";

// Create an EKS cluster with the default settings.
// This provisions the EKS control plane and worker nodes; unless configured
// otherwise, the cluster is placed in the account's default VPC.
const cluster = new eks.Cluster("my-cluster", {
    /* Specify additional options as needed; for example, you may want to provide
       a list of subnets or configure the desired capacity of the Auto Scaling
       Group for worker nodes. */
});

// After the cluster is initialized, we can create a Kubernetes Provider that uses the kubeconfig
// from the newly created cluster. This provider is responsible for deploying resources to the EKS cluster.
const provider = new k8s.Provider("eks-k8s", {
    kubeconfig: cluster.kubeconfig.apply(JSON.stringify),
});

// Deploy the metrics server to the EKS cluster.
// The metrics server is tasked with aggregating resource usage data, which the HPA uses.
const metricsServer = new k8s.yaml.ConfigFile("metrics-server", {
    file: "https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml",
}, { provider });

// Now, let's define a Horizontal Pod Autoscaler for a sample deployment.
// This Deployment will be the target for the HPA.
// Usually, the deployment would already exist or be created elsewhere in your Pulumi program.
const appName = "my-app";
const appLabels = { app: appName };
const deployment = new k8s.apps.v1.Deployment(appName, {
    metadata: { labels: appLabels },
    spec: {
        selector: { matchLabels: appLabels },
        replicas: 2,
        template: {
            metadata: { labels: appLabels },
            spec: {
                containers: [{
                    name: appName,
                    image: "nginx",
                    // A CPU request is required for utilization-based scaling:
                    // utilization is measured as a percentage of this value.
                    // "100m" is a placeholder; size it for your workload.
                    resources: { requests: { cpu: "100m" } },
                }],
            },
        },
    },
}, { provider });

// Create a HorizontalPodAutoscaler resource targeting the deployment defined above.
// autoscaling/v2 is the stable API; v2beta2 was removed in Kubernetes 1.26.
const appHpa = new k8s.autoscaling.v2.HorizontalPodAutoscaler(appName, {
    metadata: { labels: appLabels },
    spec: {
        maxReplicas: 10,
        minReplicas: 2,
        scaleTargetRef: {
            apiVersion: "apps/v1",
            kind: "Deployment",
            // Use the generated name: Pulumi appends a random suffix to
            // resource names unless metadata.name is set explicitly.
            name: deployment.metadata.name,
        },
        metrics: [{
            type: "Resource",
            resource: {
                name: "cpu",
                target: {
                    type: "Utilization",
                    averageUtilization: 50, // Target CPU utilization of 50%
                },
            },
        }],
    },
}, { provider });

// Export the cluster's name and kubeconfig.
export const eksClusterName = cluster.eksCluster.name;
export const kubeconfig = cluster.kubeconfig;
```
This program creates an EKS cluster and a Horizontal Pod Autoscaler that will scale the specified deployment based on CPU utilization. The `metrics-server` component is necessary for the HPA to work, as it provides the CPU metrics that the HPA needs to make scaling decisions.
The `HorizontalPodAutoscaler` resource has a `spec` section that defines the scaling behavior:
- The `scaleTargetRef` specifies the target resource to scale. In the example, it is the `Deployment` called `my-app`.
- The `maxReplicas` and `minReplicas` define the upper and lower bounds for the number of pods.
- The `metrics` specify what metrics to use to determine if scaling is needed. In this case, we are targeting 50% CPU utilization.
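To make the effect of the 50% target and the replica bounds concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The function and type names are illustrative, not part of any Kubernetes or Pulumi API, and the real controller applies further rules (pod readiness, stabilization windows) that are left out here.

```typescript
// Hypothetical helper types and names, for illustration only.
interface HpaBounds {
    minReplicas: number;
    maxReplicas: number;
}

function desiredReplicas(
    currentReplicas: number,
    currentUtilization: number, // observed average CPU utilization, in percent
    targetUtilization: number,  // averageUtilization from the HPA spec, in percent
    bounds: HpaBounds,
    tolerance = 0.1,            // assumed default controller tolerance of 10%
): number {
    const ratio = currentUtilization / targetUtilization;
    // Close enough to the target: keep the current replica count.
    if (Math.abs(ratio - 1) <= tolerance) {
        return currentReplicas;
    }
    const raw = Math.ceil(currentReplicas * ratio);
    // Clamp the result to the minReplicas/maxReplicas bounds.
    return Math.min(bounds.maxReplicas, Math.max(bounds.minReplicas, raw));
}

// Bounds and target taken from the example HPA above.
const bounds: HpaBounds = { minReplicas: 2, maxReplicas: 10 };
console.log(desiredReplicas(2, 90, 50, bounds));  // 4  (scale up)
console.log(desiredReplicas(4, 20, 50, bounds));  // 2  (scale down, held at minReplicas)
console.log(desiredReplicas(8, 100, 50, bounds)); // 10 (capped at maxReplicas)
```

With the settings from the example, two pods running at 90% average utilization would be scaled to four, while no amount of load pushes the deployment beyond ten pods.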
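Adjust the CPU target utilization to suit your application's needs. Utilization is calculated as a percentage of each container's CPU request, so the pods being scaled must declare CPU requests. The application does not need to send anything to the metrics server itself: the metrics server collects resource usage from the kubelets, and it must be running and reachable within the cluster for the HPA to function correctly.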
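The relationship between CPU requests and the utilization figure can be sketched as follows. This is a simplified illustration, assuming every pod shares the same CPU request (as pods created from one Deployment template do); the helper names are hypothetical, and the parser only handles the millicore (`m`) suffix and plain decimal cores, not the full Kubernetes quantity syntax.

```typescript
// Parse a CPU quantity such as "250m" or "0.5" into millicores.
// Simplified: other Kubernetes quantity suffixes are not handled.
function parseCpuMillicores(quantity: string): number {
    return quantity.endsWith("m")
        ? Number(quantity.slice(0, -1))
        : Number(quantity) * 1000;
}

// Average CPU utilization across pods, as a percentage of the CPU request.
// Assumes all pods have the same request (hypothetical simplification).
function averageCpuUtilization(usages: string[], requestPerPod: string): number {
    const request = parseCpuMillicores(requestPerPod);
    if (usages.length === 0 || !(request > 0)) {
        throw new Error("need at least one pod and a positive CPU request");
    }
    const totalUsage = usages.reduce((sum, u) => sum + parseCpuMillicores(u), 0);
    return (totalUsage / (request * usages.length)) * 100;
}

// Two pods using 80m and 120m against a 100m request average out to 100%.
console.log(averageCpuUtilization(["80m", "120m"], "100m")); // 100
// The same kind of load against a larger request reads as lower utilization.
console.log(averageCpuUtilization(["25m", "75m"], "200m"));  // 25
```

This is why the choice of CPU request matters as much as the target percentage: a request set far above real usage keeps utilization low and the HPA never scales up, while a request set too low triggers scaling early.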
To apply this Pulumi program, run it through the Pulumi CLI and use `pulumi up` to deploy the resources. You'll need to have AWS credentials configured and the Pulumi CLI installed on your machine.