How do I upgrade AKS clusters with zero downtime using multiple node pools?
In this guide, we will upgrade an Azure Kubernetes Service (AKS) cluster without incurring any downtime by leveraging multiple node pools. The strategy involves creating a new node pool with the updated version, migrating workloads from the old node pool to the new node pool, and then deleting the old node pool if necessary.
Here’s how to set up and manage multiple node pools for safe upgrades.
import * as pulumi from "@pulumi/pulumi";
import * as azure from "@pulumi/azure";

// Resource group that holds the cluster and its node pools.
const aksRg = new azure.core.ResourceGroup("aks_rg", {
    name: "exampleResourceGroup",
    location: "East US",
});

// AKS cluster with a single-node default pool and a system-assigned identity.
const aksCluster = new azure.containerservice.KubernetesCluster("aks_cluster", {
    name: "exampleAKSCluster",
    location: aksRg.location,
    resourceGroupName: aksRg.name,
    dnsPrefix: "exampleakscluster",
    defaultNodePool: {
        name: "default",
        nodeCount: 1,
        vmSize: "Standard_DS2_v2",
    },
    identity: {
        type: "SystemAssigned",
    },
});

// "Blue" pool: the pool that currently serves workloads.
// The NoSchedule taint keeps out pods that lack a matching toleration.
const bluePool = new azure.containerservice.KubernetesClusterNodePool("blue_pool", {
    name: "bluepool",
    kubernetesClusterId: aksCluster.id,
    vmSize: "Standard_DS2_v2",
    nodeCount: 2,
    nodeTaints: ["special=true:NoSchedule"],
});

// "Green" pool: the replacement pool that workloads move to during an upgrade.
const greenPool = new azure.containerservice.KubernetesClusterNodePool("green_pool", {
    name: "greenpool",
    kubernetesClusterId: aksCluster.id,
    vmSize: "Standard_DS2_v2",
    nodeCount: 2,
    nodeTaints: ["upgrade=true:NoSchedule"],
});

// Stack outputs for easy reference.
export const resourceGroupName = aksRg.name;
export const kubernetesClusterName = aksCluster.name;
export const bluePoolName = bluePool.name;
export const greenPoolName = greenPool.name;
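The program above does not pin a Kubernetes version on either pool, so both are created at the same version. For an actual upgrade, the replacement pool would normally request a newer version. The following is a minimal sketch, assuming the `orchestratorVersion` input of `KubernetesClusterNodePool` is used for that. The helper `upgradedPoolArgs` and the version string `"1.29.2"` are hypothetical placeholders; the real value has to be a version AKS offers in your region, and the control plane generally needs to be at that version or newer first.

```typescript
// Plain-object sketch of the inputs for a replacement ("green") node pool.
// Only the fields used in this guide are modelled here.
interface PoolArgs {
    name: string;
    vmSize: string;
    nodeCount: number;
    orchestratorVersion: string;
    nodeTaints: string[];
}

// Hypothetical helper: same shape as the pools in this guide, plus a pinned version.
function upgradedPoolArgs(name: string, version: string): PoolArgs {
    return {
        name,
        vmSize: "Standard_DS2_v2",
        nodeCount: 2,
        orchestratorVersion: version, // target Kubernetes version for this pool
        nodeTaints: ["upgrade=true:NoSchedule"],
    };
}

// "1.29.2" is a placeholder, not a recommendation.
const greenArgs = upgradedPoolArgs("greenpool", "1.29.2");
```

In the Pulumi program, these fields would be spread into the pool definition, for example `new azure.containerservice.KubernetesClusterNodePool("green_pool", { ...greenArgs, kubernetesClusterId: aksCluster.id })`.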
Key Points:
- @pulumi/azure: The Azure provider package that supplies the resources used here.
- azure.core.ResourceGroup: Creates a resource group to hold the AKS cluster resources.
- azure.containerservice.KubernetesCluster: Provisions an AKS cluster with a default node pool.
- azure.containerservice.KubernetesClusterNodePool: Adds two more node pools (blue_pool and green_pool) to manage the AKS upgrade process.
- Node Taints: Control scheduling so that only pods with a matching toleration land on a given pool, which keeps workload placement deliberate during upgrades.
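Because both pools carry a `NoSchedule` taint, a pod is only scheduled onto them if its spec includes a matching toleration. As a rough illustration, the sketch below converts a taint string in the `key=value:Effect` form used by `nodeTaints` into the corresponding Kubernetes toleration object. The helper name `tolerationForTaint` is hypothetical and not part of any Pulumi or Kubernetes library.

```typescript
// Shape of a Kubernetes pod toleration (only the fields needed here).
interface Toleration {
    key: string;
    operator: "Equal" | "Exists";
    value?: string;
    effect: string;
}

// Hypothetical helper: parse "key=value:Effect" (or "key:Effect") into a toleration.
function tolerationForTaint(taint: string): Toleration {
    const match = /^([^=:]+)(?:=([^:]*))?:(NoSchedule|PreferNoSchedule|NoExecute)$/.exec(taint);
    if (!match) {
        throw new Error(`Unrecognized taint: ${taint}`);
    }
    const key = match[1];
    const value: string | undefined = match[2];
    const effect = match[3];
    // A taint without a value is matched with "Exists"; otherwise match key and value.
    return value === undefined
        ? { key, operator: "Exists", effect }
        : { key, operator: "Equal", value, effect };
}

// Toleration that lets a pod run on the green pool defined in this guide.
const greenToleration = tolerationForTaint("upgrade=true:NoSchedule");
```

The resulting object can be placed in the `tolerations` list of a pod spec. A toleration only permits scheduling onto the tainted pool; steering pods to that pool specifically would additionally need a node selector or node affinity.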
Summary
In this example, we created two additional node pools (blue_pool and green_pool) alongside the default pool in an AKS cluster. To upgrade, move workloads off the pool being retired and onto the pool running the target version, using node taints to control where pods are scheduled while the move is in progress. Downtime is avoided only if each workload keeps enough replicas running on the remaining nodes while the old nodes are drained. The stack outputs expose the resource group, cluster, and node pool names for easy reference.
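The migration step itself happens with `kubectl` rather than in the Pulumi program. The sketch below builds the commands that would cordon and drain every node in the old pool. It assumes AKS labels each node with `agentpool=<pool name>`, and the helper `drainCommands` is hypothetical; treat the output as a starting point, not a complete runbook.

```typescript
// Hypothetical helper: kubectl commands to move workloads off one node pool.
function drainCommands(oldPool: string): string[] {
    // Assumption: AKS nodes carry the label agentpool=<pool name>.
    const selector = `agentpool=${oldPool}`;
    return [
        // Stop new pods from being scheduled onto the old pool.
        `kubectl cordon -l ${selector}`,
        // Evict running pods so they are rescheduled onto other pools.
        `kubectl drain -l ${selector} --ignore-daemonsets --delete-emptydir-data`,
    ];
}

// Commands for retiring the blue pool from this guide.
const commands = drainCommands("bluepool");
commands.forEach((command) => console.log(command));
```

Once the drain has completed and the workloads are healthy on the new pool, the old pool can be deleted by removing its resource from the Pulumi program and running `pulumi up`.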