How do I upgrade AKS clusters with zero downtime using multiple node pools?

This guide upgrades an Azure Kubernetes Service (AKS) cluster without downtime by using multiple node pools. The strategy is to create a new node pool running the updated Kubernetes version, migrate workloads from the old node pool to the new one, and then delete the old node pool once it is no longer needed.

The following Pulumi program (TypeScript) sets up and manages multiple node pools for safe upgrades.

import * as pulumi from "@pulumi/pulumi";
import * as azure from "@pulumi/azure";

const aksRg = new azure.core.ResourceGroup("aks_rg", {
    name: "exampleResourceGroup",
    location: "East US",
});
const aksCluster = new azure.containerservice.KubernetesCluster("aks_cluster", {
    name: "exampleAKSCluster",
    location: aksRg.location,
    resourceGroupName: aksRg.name,
    dnsPrefix: "exampleakscluster",
    defaultNodePool: {
        name: "default",
        nodeCount: 1,
        vmSize: "Standard_DS2_v2",
    },
    identity: {
        type: "SystemAssigned",
    },
});
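// Blue pool: only pods that tolerate the "special=true" taint can be scheduled here.
// Note: no orchestratorVersion is set on either pool in this example; pin it on the
// new pool when performing an actual version upgrade.
const bluePool = new azure.containerservice.KubernetesClusterNodePool("blue_pool", {
    name: "bluepool",
    kubernetesClusterId: aksCluster.id,
    vmSize: "Standard_DS2_v2",
    nodeCount: 2,
    nodeTaints: ["special=true:NoSchedule"],
});
// Green pool: only pods that tolerate the "upgrade=true" taint can be scheduled here.
const greenPool = new azure.containerservice.KubernetesClusterNodePool("green_pool", {
    name: "greenpool",
    kubernetesClusterId: aksCluster.id,
    vmSize: "Standard_DS2_v2",
    nodeCount: 2,
    nodeTaints: ["upgrade=true:NoSchedule"],
});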
export const resourceGroupName = aksRg.name;
export const kubernetesClusterName = aksCluster.name;
export const bluePoolName = bluePool.name;
export const greenPoolName = greenPool.name;

Key Points:

  • @pulumi/azure: The Pulumi Azure Classic provider package that supplies the resources used in this program.
  • azure.core.ResourceGroup: Creates a resource group to hold the AKS cluster resources.
  • azure.containerservice.KubernetesCluster: Provisions an AKS cluster with a default node pool.
  • azure.containerservice.KubernetesClusterNodePool: Adds two node pools (bluepool and greenpool) to manage the AKS upgrade process.
  • Node Taints: Keep pods without a matching toleration off the tainted pools, which controls where workloads are scheduled during an upgrade.
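
To make the effect of the node taints concrete, here is a minimal sketch of how a taint string such as "upgrade=true:NoSchedule" interacts with pod tolerations. It is a simplified model of the Kubernetes matching rules written for illustration, not the scheduler's actual code; the names parseTaint, tolerates, and canSchedule are hypothetical helpers, not part of Pulumi or Kubernetes.

```typescript
// Simplified, illustrative model of Kubernetes taint/toleration matching.
// All type and function names here are hypothetical.
interface Taint {
  key: string;
  value: string;
  effect: string;
}

interface Toleration {
  key?: string;
  operator?: "Equal" | "Exists";
  value?: string;
  effect?: string;
}

// Parse the "key=value:Effect" form used in the nodeTaints arrays.
function parseTaint(spec: string): Taint {
  const colon = spec.lastIndexOf(":");
  if (colon <= 0 || colon === spec.length - 1) {
    throw new Error("Invalid taint: " + spec);
  }
  const effect = spec.slice(colon + 1);
  const keyValue = spec.slice(0, colon);
  const eq = keyValue.indexOf("=");
  const key = eq === -1 ? keyValue : keyValue.slice(0, eq);
  const value = eq === -1 ? "" : keyValue.slice(eq + 1);
  return { key, value, effect };
}

// A toleration matches a taint when effect and key agree and either the
// operator is "Exists" or the values are equal.
function tolerates(toleration: Toleration, taint: Taint): boolean {
  // An unset effect matches every effect.
  if (toleration.effect && toleration.effect !== taint.effect) return false;
  const operator = toleration.operator ?? "Equal";
  // An unset key is only meaningful with "Exists", where it matches every key.
  if (!toleration.key) return operator === "Exists";
  if (toleration.key !== taint.key) return false;
  return operator === "Exists" || (toleration.value ?? "") === taint.value;
}

// A pod can be scheduled on a node only if every hard taint is tolerated.
// PreferNoSchedule is a soft preference, so it does not block scheduling.
function canSchedule(tolerations: Toleration[], taintSpecs: string[]): boolean {
  return taintSpecs
    .map(parseTaint)
    .filter((t) => t.effect === "NoSchedule" || t.effect === "NoExecute")
    .every((taint) => tolerations.some((t) => tolerates(t, taint)));
}
```

Under this model, a pod with no tolerations cannot be scheduled on either bluepool or greenpool, so workloads need a matching toleration before they can move onto a tainted pool.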

Summary

In this example, we created two additional node pools (bluepool and greenpool) alongside the default pool in an AKS cluster. The program only provisions the pools; the rolling upgrade itself is carried out by migrating workloads from the default pool to the new pools, with node taints controlling which pods may be scheduled on each pool. Downtime is avoided as long as each workload keeps replicas running elsewhere while the old nodes are drained. The outputs expose the resource group, cluster, and node pool names for easy reference.
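
The migration step happens outside the Pulumi program. As a rough sketch, the helper below builds the command sequence that could move workloads off an old pool and then remove it. The function name migrationCommands is hypothetical, and the sketch assumes that nodes carry the agentpool=<pool name> label that AKS applies to its nodes; verify the label on your cluster before relying on it.

```typescript
// Hypothetical helper: returns the CLI steps for retiring an old node pool.
// Assumes nodes are labeled "agentpool=<pool name>" (check with `kubectl get nodes --show-labels`).
function migrationCommands(
  resourceGroup: string,
  cluster: string,
  oldPool: string
): string[] {
  const selector = "agentpool=" + oldPool;
  return [
    // 1. Stop new pods from being scheduled on the old nodes.
    "kubectl cordon -l " + selector,
    // 2. Evict running pods so they are rescheduled onto the other pools.
    "kubectl drain -l " + selector + " --ignore-daemonsets --delete-emptydir-data",
    // 3. Delete the old pool once its workloads run elsewhere.
    "az aks nodepool delete --resource-group " + resourceGroup +
      " --cluster-name " + cluster + " --name " + oldPool,
  ];
}

// Example using the names from the program above.
const steps = migrationCommands("exampleResourceGroup", "exampleAKSCluster", "bluepool");
console.log(steps.join("\n"));
```

Running the steps one at a time, and checking that pods are healthy on the new pool before the delete, keeps the process reversible until the last command.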
