The azure-native:containerservice:AgentPool resource, part of the Pulumi Azure Native provider, defines an AKS agent pool: the compute nodes that run containerized workloads within a managed Kubernetes cluster. This guide focuses on five capabilities: spot instances and autoscaling, ephemeral disks and storage options, kubelet and kernel tuning, GPU partitioning, and snapshots and lifecycle management.
Agent pools belong to an existing AKS cluster and may reference snapshots, capacity reservation groups, or other Azure compute infrastructure. The examples are intentionally small. Combine them with your own cluster configuration, networking, and security policies.
Create a basic agent pool with spot instances
Most AKS deployments start with a user-mode agent pool that runs application workloads. Spot instances reduce compute costs by using Azure’s spare capacity, making them suitable for interruptible workloads like batch processing or dev/test environments.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
mode: azure_native.containerservice.AgentPoolMode.User,
nodeLabels: {
key1: "val1",
},
nodeTaints: ["Key1=Value1:NoSchedule"],
orchestratorVersion: "",
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
scaleSetEvictionPolicy: azure_native.containerservice.ScaleSetEvictionPolicy.Delete,
scaleSetPriority: azure_native.containerservice.ScaleSetPriority.Spot,
tags: {
name1: "val1",
},
vmSize: "Standard_DS1_v2",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
mode=azure_native.containerservice.AgentPoolMode.USER,
node_labels={
"key1": "val1",
},
node_taints=["Key1=Value1:NoSchedule"],
orchestrator_version="",
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
scale_set_eviction_policy=azure_native.containerservice.ScaleSetEvictionPolicy.DELETE,
scale_set_priority=azure_native.containerservice.ScaleSetPriority.SPOT,
tags={
"name1": "val1",
},
vm_size="Standard_DS1_v2")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
Mode: pulumi.String(containerservice.AgentPoolModeUser),
NodeLabels: pulumi.StringMap{
"key1": pulumi.String("val1"),
},
NodeTaints: pulumi.StringArray{
pulumi.String("Key1=Value1:NoSchedule"),
},
OrchestratorVersion: pulumi.String(""),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
ScaleSetEvictionPolicy: pulumi.String(containerservice.ScaleSetEvictionPolicyDelete),
ScaleSetPriority: pulumi.String(containerservice.ScaleSetPrioritySpot),
Tags: pulumi.StringMap{
"name1": pulumi.String("val1"),
},
VmSize: pulumi.String("Standard_DS1_v2"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
Mode = AzureNative.ContainerService.AgentPoolMode.User,
NodeLabels =
{
{ "key1", "val1" },
},
NodeTaints = new[]
{
"Key1=Value1:NoSchedule",
},
OrchestratorVersion = "",
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
ScaleSetEvictionPolicy = AzureNative.ContainerService.ScaleSetEvictionPolicy.Delete,
ScaleSetPriority = AzureNative.ContainerService.ScaleSetPriority.Spot,
Tags =
{
{ "name1", "val1" },
},
VmSize = "Standard_DS1_v2",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.mode("User")
.nodeLabels(Map.of("key1", "val1"))
.nodeTaints("Key1=Value1:NoSchedule")
.orchestratorVersion("")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.scaleSetEvictionPolicy("Delete")
.scaleSetPriority("Spot")
.tags(Map.of("name1", "val1"))
.vmSize("Standard_DS1_v2")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
mode: User
nodeLabels:
key1: val1
nodeTaints:
- Key1=Value1:NoSchedule
orchestratorVersion: ""
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
scaleSetEvictionPolicy: Delete
scaleSetPriority: Spot
tags:
name1: val1
vmSize: Standard_DS1_v2
The scaleSetPriority property sets the pool to use spot instances, while scaleSetEvictionPolicy determines what happens when Azure reclaims capacity. The mode property distinguishes user pools (application workloads) from system pools (cluster services). Node taints prevent regular pods from scheduling on spot nodes unless they explicitly tolerate the taint.
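For reference, a workload meant to run on these spot nodes must tolerate that taint. The sketch below is a hypothetical batch Deployment, assuming the @pulumi/kubernetes provider is configured against this cluster; it combines a toleration for Key1=Value1:NoSchedule with a nodeSelector on the key1: val1 label so its pods land only on the spot pool.
import * as k8s from "@pulumi/kubernetes";
// Hypothetical batch workload: tolerates the spot pool's taint and targets its label.
const spotBatch = new k8s.apps.v1.Deployment("spotBatch", {
    spec: {
        replicas: 2,
        selector: { matchLabels: { app: "spot-batch" } },
        template: {
            metadata: { labels: { app: "spot-batch" } },
            spec: {
                nodeSelector: { key1: "val1" },
                tolerations: [{
                    key: "Key1",
                    operator: "Equal",
                    value: "Value1",
                    effect: "NoSchedule",
                }],
                containers: [{
                    name: "worker",
                    image: "busybox", // illustrative image
                    command: ["sh", "-c", "sleep 3600"],
                }],
            },
        },
    },
});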
Enable autoscaling with min and max node counts
Production workloads often need to scale automatically based on demand. Autoscaling adjusts the node count within defined boundaries, balancing cost and availability.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
enableAutoScaling: true,
maxCount: 2,
minCount: 2,
nodeTaints: ["Key1=Value1:NoSchedule"],
orchestratorVersion: "",
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
scaleSetEvictionPolicy: azure_native.containerservice.ScaleSetEvictionPolicy.Delete,
scaleSetPriority: azure_native.containerservice.ScaleSetPriority.Spot,
vmSize: "Standard_DS1_v2",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
enable_auto_scaling=True,
max_count=2,
min_count=2,
node_taints=["Key1=Value1:NoSchedule"],
orchestrator_version="",
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
scale_set_eviction_policy=azure_native.containerservice.ScaleSetEvictionPolicy.DELETE,
scale_set_priority=azure_native.containerservice.ScaleSetPriority.SPOT,
vm_size="Standard_DS1_v2")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
EnableAutoScaling: pulumi.Bool(true),
MaxCount: pulumi.Int(2),
MinCount: pulumi.Int(2),
NodeTaints: pulumi.StringArray{
pulumi.String("Key1=Value1:NoSchedule"),
},
OrchestratorVersion: pulumi.String(""),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
ScaleSetEvictionPolicy: pulumi.String(containerservice.ScaleSetEvictionPolicyDelete),
ScaleSetPriority: pulumi.String(containerservice.ScaleSetPrioritySpot),
VmSize: pulumi.String("Standard_DS1_v2"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
EnableAutoScaling = true,
MaxCount = 2,
MinCount = 2,
NodeTaints = new[]
{
"Key1=Value1:NoSchedule",
},
OrchestratorVersion = "",
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
ScaleSetEvictionPolicy = AzureNative.ContainerService.ScaleSetEvictionPolicy.Delete,
ScaleSetPriority = AzureNative.ContainerService.ScaleSetPriority.Spot,
VmSize = "Standard_DS1_v2",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.enableAutoScaling(true)
.maxCount(2)
.minCount(2)
.nodeTaints("Key1=Value1:NoSchedule")
.orchestratorVersion("")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.scaleSetEvictionPolicy("Delete")
.scaleSetPriority("Spot")
.vmSize("Standard_DS1_v2")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
enableAutoScaling: true
maxCount: 2
minCount: 2
nodeTaints:
- Key1=Value1:NoSchedule
orchestratorVersion: ""
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
scaleSetEvictionPolicy: Delete
scaleSetPriority: Spot
vmSize: Standard_DS1_v2
When enableAutoScaling is true, AKS adjusts the node count between minCount and maxCount based on pod scheduling pressure (pending pods and their resource requests) rather than raw CPU or memory metrics. The count property only sets the initial size and should normally fall within the min/max range.
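Because the cluster autoscaler changes the live node count outside of Pulumi, a common pattern is to tell Pulumi to ignore drift on count so later updates don't scale the pool back to its initial size. A minimal sketch, shown with a wider 2 to 6 range for illustration:
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
    agentPoolName: "agentpool1",
    count: 3,
    enableAutoScaling: true,
    minCount: 2,
    maxCount: 6,
    osType: azure_native.containerservice.OSType.Linux,
    resourceGroupName: "rg1",
    resourceName: "clustername1",
    vmSize: "Standard_DS1_v2",
}, {
    // The autoscaler owns the live node count; don't reconcile it on updates.
    ignoreChanges: ["count"],
});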
Use ephemeral OS disks for faster node operations
Ephemeral OS disks store the operating system on the VM’s local cache or temporary storage, eliminating network latency and reducing costs. This works best for stateless workloads.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
orchestratorVersion: "",
osDiskSizeGB: 64,
osDiskType: azure_native.containerservice.OSDiskType.Ephemeral,
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
vmSize: "Standard_DS2_v2",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
orchestrator_version="",
os_disk_size_gb=64,
os_disk_type=azure_native.containerservice.OSDiskType.EPHEMERAL,
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
vm_size="Standard_DS2_v2")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
OrchestratorVersion: pulumi.String(""),
OsDiskSizeGB: pulumi.Int(64),
OsDiskType: pulumi.String(containerservice.OSDiskTypeEphemeral),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
VmSize: pulumi.String("Standard_DS2_v2"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
OrchestratorVersion = "",
OsDiskSizeGB = 64,
OsDiskType = AzureNative.ContainerService.OSDiskType.Ephemeral,
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
VmSize = "Standard_DS2_v2",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.orchestratorVersion("")
.osDiskSizeGB(64)
.osDiskType("Ephemeral")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.vmSize("Standard_DS2_v2")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
orchestratorVersion: ""
osDiskSizeGB: 64
osDiskType: Ephemeral
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
vmSize: Standard_DS2_v2
The osDiskType property controls where the OS disk lives. Ephemeral disks use the VM’s cache or temporary storage, which is faster and cheaper than managed disks but requires the VM size to have sufficient local storage. The osDiskSizeGB property must fit within the VM’s cache capacity.
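Because ephemeral disks only work when the VM size has enough cache or temporary storage, one option is to make the disk type a per-stack decision. A minimal sketch, assuming the illustrative config keys vmSize and useEphemeralOsDisk (not provider-defined):
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const config = new pulumi.Config();
// Fall back to a managed OS disk on stacks whose VM size lacks sufficient local storage.
const useEphemeral = config.getBoolean("useEphemeralOsDisk") ?? false;
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
    agentPoolName: "agentpool1",
    count: 3,
    osDiskSizeGB: 64,
    osDiskType: useEphemeral
        ? azure_native.containerservice.OSDiskType.Ephemeral
        : azure_native.containerservice.OSDiskType.Managed,
    osType: azure_native.containerservice.OSType.Linux,
    resourceGroupName: "rg1",
    resourceName: "clustername1",
    vmSize: config.get("vmSize") ?? "Standard_DS2_v2",
});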
Tune kubelet and kernel settings for performance
High-performance workloads often require custom kubelet and kernel configurations. These settings control CPU management, garbage collection, swap behavior, and network tuning.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
kubeletConfig: {
allowedUnsafeSysctls: [
"kernel.msg*",
"net.core.somaxconn",
],
cpuCfsQuota: true,
cpuCfsQuotaPeriod: "200ms",
cpuManagerPolicy: "static",
failSwapOn: false,
imageGcHighThreshold: 90,
imageGcLowThreshold: 70,
topologyManagerPolicy: "best-effort",
},
linuxOSConfig: {
swapFileSizeMB: 1500,
sysctls: {
kernelThreadsMax: 99999,
netCoreWmemDefault: 12345,
netIpv4IpLocalPortRange: "20000 60000",
netIpv4TcpTwReuse: true,
},
transparentHugePageDefrag: "madvise",
transparentHugePageEnabled: "always",
},
orchestratorVersion: "",
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
vmSize: "Standard_DS2_v2",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
kubelet_config={
"allowed_unsafe_sysctls": [
"kernel.msg*",
"net.core.somaxconn",
],
"cpu_cfs_quota": True,
"cpu_cfs_quota_period": "200ms",
"cpu_manager_policy": "static",
"fail_swap_on": False,
"image_gc_high_threshold": 90,
"image_gc_low_threshold": 70,
"topology_manager_policy": "best-effort",
},
linux_os_config={
"swap_file_size_mb": 1500,
"sysctls": {
"kernel_threads_max": 99999,
"net_core_wmem_default": 12345,
"net_ipv4_ip_local_port_range": "20000 60000",
"net_ipv4_tcp_tw_reuse": True,
},
"transparent_huge_page_defrag": "madvise",
"transparent_huge_page_enabled": "always",
},
orchestrator_version="",
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
vm_size="Standard_DS2_v2")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
KubeletConfig: &containerservice.KubeletConfigArgs{
AllowedUnsafeSysctls: pulumi.StringArray{
pulumi.String("kernel.msg*"),
pulumi.String("net.core.somaxconn"),
},
CpuCfsQuota: pulumi.Bool(true),
CpuCfsQuotaPeriod: pulumi.String("200ms"),
CpuManagerPolicy: pulumi.String("static"),
FailSwapOn: pulumi.Bool(false),
ImageGcHighThreshold: pulumi.Int(90),
ImageGcLowThreshold: pulumi.Int(70),
TopologyManagerPolicy: pulumi.String("best-effort"),
},
LinuxOSConfig: &containerservice.LinuxOSConfigArgs{
SwapFileSizeMB: pulumi.Int(1500),
Sysctls: &containerservice.SysctlConfigArgs{
KernelThreadsMax: pulumi.Int(99999),
NetCoreWmemDefault: pulumi.Int(12345),
NetIpv4IpLocalPortRange: pulumi.String("20000 60000"),
NetIpv4TcpTwReuse: pulumi.Bool(true),
},
TransparentHugePageDefrag: pulumi.String("madvise"),
TransparentHugePageEnabled: pulumi.String("always"),
},
OrchestratorVersion: pulumi.String(""),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
VmSize: pulumi.String("Standard_DS2_v2"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
KubeletConfig = new AzureNative.ContainerService.Inputs.KubeletConfigArgs
{
AllowedUnsafeSysctls = new[]
{
"kernel.msg*",
"net.core.somaxconn",
},
CpuCfsQuota = true,
CpuCfsQuotaPeriod = "200ms",
CpuManagerPolicy = "static",
FailSwapOn = false,
ImageGcHighThreshold = 90,
ImageGcLowThreshold = 70,
TopologyManagerPolicy = "best-effort",
},
LinuxOSConfig = new AzureNative.ContainerService.Inputs.LinuxOSConfigArgs
{
SwapFileSizeMB = 1500,
Sysctls = new AzureNative.ContainerService.Inputs.SysctlConfigArgs
{
KernelThreadsMax = 99999,
NetCoreWmemDefault = 12345,
NetIpv4IpLocalPortRange = "20000 60000",
NetIpv4TcpTwReuse = true,
},
TransparentHugePageDefrag = "madvise",
TransparentHugePageEnabled = "always",
},
OrchestratorVersion = "",
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
VmSize = "Standard_DS2_v2",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import com.pulumi.azurenative.containerservice.inputs.KubeletConfigArgs;
import com.pulumi.azurenative.containerservice.inputs.LinuxOSConfigArgs;
import com.pulumi.azurenative.containerservice.inputs.SysctlConfigArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.kubeletConfig(KubeletConfigArgs.builder()
.allowedUnsafeSysctls(
"kernel.msg*",
"net.core.somaxconn")
.cpuCfsQuota(true)
.cpuCfsQuotaPeriod("200ms")
.cpuManagerPolicy("static")
.failSwapOn(false)
.imageGcHighThreshold(90)
.imageGcLowThreshold(70)
.topologyManagerPolicy("best-effort")
.build())
.linuxOSConfig(LinuxOSConfigArgs.builder()
.swapFileSizeMB(1500)
.sysctls(SysctlConfigArgs.builder()
.kernelThreadsMax(99999)
.netCoreWmemDefault(12345)
.netIpv4IpLocalPortRange("20000 60000")
.netIpv4TcpTwReuse(true)
.build())
.transparentHugePageDefrag("madvise")
.transparentHugePageEnabled("always")
.build())
.orchestratorVersion("")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.vmSize("Standard_DS2_v2")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
kubeletConfig:
allowedUnsafeSysctls:
- kernel.msg*
- net.core.somaxconn
cpuCfsQuota: true
cpuCfsQuotaPeriod: 200ms
cpuManagerPolicy: static
failSwapOn: false
imageGcHighThreshold: 90
imageGcLowThreshold: 70
topologyManagerPolicy: best-effort
linuxOSConfig:
swapFileSizeMB: 1500
sysctls:
kernelThreadsMax: 99999
netCoreWmemDefault: 12345
netIpv4IpLocalPortRange: 20000 60000
netIpv4TcpTwReuse: true
transparentHugePageDefrag: madvise
transparentHugePageEnabled: always
orchestratorVersion: ""
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
vmSize: Standard_DS2_v2
The kubeletConfig block adjusts how the kubelet manages containers: cpuManagerPolicy set to static reserves exclusive cores for eligible pods, imageGcHighThreshold sets the disk-usage percentage at which image garbage collection begins, and allowedUnsafeSysctls lists kernel parameters that pods are allowed to set. The linuxOSConfig block tunes the node OS itself through sysctls, swap file size, and transparent huge pages. These settings are applied when nodes are created and persist across reboots.
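For context, the static CPU manager policy only grants exclusive cores to pods in the Guaranteed QoS class that request whole CPUs. The hypothetical pod below (assuming @pulumi/kubernetes is configured for this cluster) qualifies because its requests equal its limits and the CPU value is an integer:
import * as k8s from "@pulumi/kubernetes";
// Guaranteed-QoS pod with an integer CPU request, eligible for exclusive cores
// under cpuManagerPolicy: static.
const pinnedWorker = new k8s.core.v1.Pod("pinnedWorker", {
    spec: {
        containers: [{
            name: "worker",
            image: "busybox", // illustrative image
            command: ["sh", "-c", "sleep 3600"],
            resources: {
                requests: { cpu: "2", memory: "2Gi" },
                limits: { cpu: "2", memory: "2Gi" },
            },
        }],
    },
});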
Configure GPU partitioning for multi-instance workloads
GPU-accelerated workloads can use Multi-Instance GPU (MIG) to partition a single GPU into multiple isolated instances, allowing better resource utilization for smaller AI/ML jobs.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
gpuInstanceProfile: azure_native.containerservice.GPUInstanceProfile.MIG2g,
kubeletConfig: {
allowedUnsafeSysctls: [
"kernel.msg*",
"net.core.somaxconn",
],
cpuCfsQuota: true,
cpuCfsQuotaPeriod: "200ms",
cpuManagerPolicy: "static",
failSwapOn: false,
imageGcHighThreshold: 90,
imageGcLowThreshold: 70,
topologyManagerPolicy: "best-effort",
},
linuxOSConfig: {
swapFileSizeMB: 1500,
sysctls: {
kernelThreadsMax: 99999,
netCoreWmemDefault: 12345,
netIpv4IpLocalPortRange: "20000 60000",
netIpv4TcpTwReuse: true,
},
transparentHugePageDefrag: "madvise",
transparentHugePageEnabled: "always",
},
orchestratorVersion: "",
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
vmSize: "Standard_ND96asr_v4",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
gpu_instance_profile=azure_native.containerservice.GPUInstanceProfile.MIG2G,
kubelet_config={
"allowed_unsafe_sysctls": [
"kernel.msg*",
"net.core.somaxconn",
],
"cpu_cfs_quota": True,
"cpu_cfs_quota_period": "200ms",
"cpu_manager_policy": "static",
"fail_swap_on": False,
"image_gc_high_threshold": 90,
"image_gc_low_threshold": 70,
"topology_manager_policy": "best-effort",
},
linux_os_config={
"swap_file_size_mb": 1500,
"sysctls": {
"kernel_threads_max": 99999,
"net_core_wmem_default": 12345,
"net_ipv4_ip_local_port_range": "20000 60000",
"net_ipv4_tcp_tw_reuse": True,
},
"transparent_huge_page_defrag": "madvise",
"transparent_huge_page_enabled": "always",
},
orchestrator_version="",
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
vm_size="Standard_ND96asr_v4")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
GpuInstanceProfile: pulumi.String(containerservice.GPUInstanceProfileMIG2g),
KubeletConfig: &containerservice.KubeletConfigArgs{
AllowedUnsafeSysctls: pulumi.StringArray{
pulumi.String("kernel.msg*"),
pulumi.String("net.core.somaxconn"),
},
CpuCfsQuota: pulumi.Bool(true),
CpuCfsQuotaPeriod: pulumi.String("200ms"),
CpuManagerPolicy: pulumi.String("static"),
FailSwapOn: pulumi.Bool(false),
ImageGcHighThreshold: pulumi.Int(90),
ImageGcLowThreshold: pulumi.Int(70),
TopologyManagerPolicy: pulumi.String("best-effort"),
},
LinuxOSConfig: &containerservice.LinuxOSConfigArgs{
SwapFileSizeMB: pulumi.Int(1500),
Sysctls: &containerservice.SysctlConfigArgs{
KernelThreadsMax: pulumi.Int(99999),
NetCoreWmemDefault: pulumi.Int(12345),
NetIpv4IpLocalPortRange: pulumi.String("20000 60000"),
NetIpv4TcpTwReuse: pulumi.Bool(true),
},
TransparentHugePageDefrag: pulumi.String("madvise"),
TransparentHugePageEnabled: pulumi.String("always"),
},
OrchestratorVersion: pulumi.String(""),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
VmSize: pulumi.String("Standard_ND96asr_v4"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
GpuInstanceProfile = AzureNative.ContainerService.GPUInstanceProfile.MIG2g,
KubeletConfig = new AzureNative.ContainerService.Inputs.KubeletConfigArgs
{
AllowedUnsafeSysctls = new[]
{
"kernel.msg*",
"net.core.somaxconn",
},
CpuCfsQuota = true,
CpuCfsQuotaPeriod = "200ms",
CpuManagerPolicy = "static",
FailSwapOn = false,
ImageGcHighThreshold = 90,
ImageGcLowThreshold = 70,
TopologyManagerPolicy = "best-effort",
},
LinuxOSConfig = new AzureNative.ContainerService.Inputs.LinuxOSConfigArgs
{
SwapFileSizeMB = 1500,
Sysctls = new AzureNative.ContainerService.Inputs.SysctlConfigArgs
{
KernelThreadsMax = 99999,
NetCoreWmemDefault = 12345,
NetIpv4IpLocalPortRange = "20000 60000",
NetIpv4TcpTwReuse = true,
},
TransparentHugePageDefrag = "madvise",
TransparentHugePageEnabled = "always",
},
OrchestratorVersion = "",
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
VmSize = "Standard_ND96asr_v4",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import com.pulumi.azurenative.containerservice.inputs.KubeletConfigArgs;
import com.pulumi.azurenative.containerservice.inputs.LinuxOSConfigArgs;
import com.pulumi.azurenative.containerservice.inputs.SysctlConfigArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.gpuInstanceProfile("MIG2g")
.kubeletConfig(KubeletConfigArgs.builder()
.allowedUnsafeSysctls(
"kernel.msg*",
"net.core.somaxconn")
.cpuCfsQuota(true)
.cpuCfsQuotaPeriod("200ms")
.cpuManagerPolicy("static")
.failSwapOn(false)
.imageGcHighThreshold(90)
.imageGcLowThreshold(70)
.topologyManagerPolicy("best-effort")
.build())
.linuxOSConfig(LinuxOSConfigArgs.builder()
.swapFileSizeMB(1500)
.sysctls(SysctlConfigArgs.builder()
.kernelThreadsMax(99999)
.netCoreWmemDefault(12345)
.netIpv4IpLocalPortRange("20000 60000")
.netIpv4TcpTwReuse(true)
.build())
.transparentHugePageDefrag("madvise")
.transparentHugePageEnabled("always")
.build())
.orchestratorVersion("")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.vmSize("Standard_ND96asr_v4")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
gpuInstanceProfile: MIG2g
kubeletConfig:
allowedUnsafeSysctls:
- kernel.msg*
- net.core.somaxconn
cpuCfsQuota: true
cpuCfsQuotaPeriod: 200ms
cpuManagerPolicy: static
failSwapOn: false
imageGcHighThreshold: 90
imageGcLowThreshold: 70
topologyManagerPolicy: best-effort
linuxOSConfig:
swapFileSizeMB: 1500
sysctls:
kernelThreadsMax: 99999
netCoreWmemDefault: 12345
netIpv4IpLocalPortRange: 20000 60000
netIpv4TcpTwReuse: true
transparentHugePageDefrag: madvise
transparentHugePageEnabled: always
orchestratorVersion: ""
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
vmSize: Standard_ND96asr_v4
The gpuInstanceProfile property specifies the MIG partition size; MIG2g, for example, divides each physical GPU into instances with two compute slices each. This requires a MIG-capable GPU VM size, such as the A100-based Standard_ND96asr_v4 used here. The kubeletConfig and linuxOSConfig blocks tune the node for GPU workloads, adjusting CPU management and kernel parameters to reduce interference with GPU operations.
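Pods consume MIG instances through extended resources advertised by the NVIDIA device plugin, and the resource name depends on the plugin's MIG strategy. A hedged sketch that requests one instance, assuming the device plugin is installed with the single strategy (under the mixed strategy the resource name would include the profile, e.g. nvidia.com/mig-2g.10gb):
import * as k8s from "@pulumi/kubernetes";
// Hypothetical job that claims one MIG instance exposed as nvidia.com/gpu.
const migSmokeTest = new k8s.core.v1.Pod("migSmokeTest", {
    spec: {
        restartPolicy: "Never",
        containers: [{
            name: "smoke-test",
            image: "nvidia/cuda:12.2.0-base-ubuntu22.04", // illustrative image
            command: ["nvidia-smi", "-L"],
            resources: {
                limits: { "nvidia.com/gpu": "1" },
            },
        }],
    },
});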
Clone configuration from an agent pool snapshot
Agent pool snapshots capture the configuration of an existing pool, allowing you to replicate settings across clusters or restore previous configurations.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
count: 3,
creationData: {
sourceResourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1",
},
enableFIPS: true,
orchestratorVersion: "",
osType: azure_native.containerservice.OSType.Linux,
resourceGroupName: "rg1",
resourceName: "clustername1",
vmSize: "Standard_DS2_v2",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
count=3,
creation_data={
"source_resource_id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1",
},
enable_fips=True,
orchestrator_version="",
os_type=azure_native.containerservice.OSType.LINUX,
resource_group_name="rg1",
resource_name_="clustername1",
vm_size="Standard_DS2_v2")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
Count: pulumi.Int(3),
CreationData: &containerservice.CreationDataArgs{
SourceResourceId: pulumi.String("/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1"),
},
EnableFIPS: pulumi.Bool(true),
OrchestratorVersion: pulumi.String(""),
OsType: pulumi.String(containerservice.OSTypeLinux),
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
VmSize: pulumi.String("Standard_DS2_v2"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
Count = 3,
CreationData = new AzureNative.ContainerService.Inputs.CreationDataArgs
{
SourceResourceId = "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1",
},
EnableFIPS = true,
OrchestratorVersion = "",
OsType = AzureNative.ContainerService.OSType.Linux,
ResourceGroupName = "rg1",
ResourceName = "clustername1",
VmSize = "Standard_DS2_v2",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import com.pulumi.azurenative.containerservice.inputs.CreationDataArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.count(3)
.creationData(CreationDataArgs.builder()
.sourceResourceId("/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1")
.build())
.enableFIPS(true)
.orchestratorVersion("")
.osType("Linux")
.resourceGroupName("rg1")
.resourceName("clustername1")
.vmSize("Standard_DS2_v2")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
count: 3
creationData:
sourceResourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/snapshots/snapshot1
enableFIPS: true
orchestratorVersion: ""
osType: Linux
resourceGroupName: rg1
resourceName: clustername1
vmSize: Standard_DS2_v2
The creationData property references a snapshot by its resource ID. AKS copies the snapshot’s configuration (VM size, OS settings, Kubernetes version) to the new pool. The enableFIPS property additionally provisions the nodes with a FIPS-enabled OS image on top of the cloned settings.
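The snapshot itself can also be managed with Pulumi. A minimal sketch of capturing one from an existing pool; the source agent pool ID and location are illustrative and should point at a real pool and its region in your subscription:
import * as azure_native from "@pulumi/azure-native";
// Capture an existing agent pool's configuration. The resulting snapshot's ID is
// what creationData.sourceResourceId on a new pool refers to.
const snapshot = new azure_native.containerservice.Snapshot("snapshot1", {
    resourceGroupName: "rg1",
    resourceName: "snapshot1",
    location: "westus2", // illustrative; match the source pool's region
    creationData: {
        sourceResourceId: "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg1/providers/Microsoft.ContainerService/managedClusters/clustername1/agentPools/agentpool1",
    },
});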
Start a stopped agent pool
Stopped agent pools preserve configuration while eliminating compute costs. Starting a pool brings nodes back online without recreating the pool.
import * as pulumi from "@pulumi/pulumi";
import * as azure_native from "@pulumi/azure-native";
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
agentPoolName: "agentpool1",
powerState: {
code: azure_native.containerservice.Code.Running,
},
resourceGroupName: "rg1",
resourceName: "clustername1",
});
import pulumi
import pulumi_azure_native as azure_native
agent_pool = azure_native.containerservice.AgentPool("agentPool",
agent_pool_name="agentpool1",
power_state={
"code": azure_native.containerservice.Code.RUNNING,
},
resource_group_name="rg1",
resource_name_="clustername1")
package main
import (
containerservice "github.com/pulumi/pulumi-azure-native-sdk/containerservice/v3"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
_, err := containerservice.NewAgentPool(ctx, "agentPool", &containerservice.AgentPoolArgs{
AgentPoolName: pulumi.String("agentpool1"),
PowerState: &containerservice.PowerStateArgs{
Code: pulumi.String(containerservice.CodeRunning),
},
ResourceGroupName: pulumi.String("rg1"),
ResourceName: pulumi.String("clustername1"),
})
if err != nil {
return err
}
return nil
})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using AzureNative = Pulumi.AzureNative;
return await Deployment.RunAsync(() =>
{
var agentPool = new AzureNative.ContainerService.AgentPool("agentPool", new()
{
AgentPoolName = "agentpool1",
PowerState = new AzureNative.ContainerService.Inputs.PowerStateArgs
{
Code = AzureNative.ContainerService.Code.Running,
},
ResourceGroupName = "rg1",
ResourceName = "clustername1",
});
});
package generated_program;
import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.azurenative.containerservice.AgentPool;
import com.pulumi.azurenative.containerservice.AgentPoolArgs;
import com.pulumi.azurenative.containerservice.inputs.PowerStateArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class App {
public static void main(String[] args) {
Pulumi.run(App::stack);
}
public static void stack(Context ctx) {
var agentPool = new AgentPool("agentPool", AgentPoolArgs.builder()
.agentPoolName("agentpool1")
.powerState(PowerStateArgs.builder()
.code("Running")
.build())
.resourceGroupName("rg1")
.resourceName("clustername1")
.build());
}
}
resources:
agentPool:
type: azure-native:containerservice:AgentPool
properties:
agentPoolName: agentpool1
powerState:
code: Running
resourceGroupName: rg1
resourceName: clustername1
The powerState property controls whether the pool is running or stopped. Setting code to Running starts the pool’s VMs and resumes compute billing; setting it to Stopped deallocates them while preserving the pool’s configuration.
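A minimal sketch of the complementary stop operation, reusing the same pool identifiers:
import * as azure_native from "@pulumi/azure-native";
// Deallocate the pool's nodes while keeping its configuration for a later restart.
const agentPool = new azure_native.containerservice.AgentPool("agentPool", {
    agentPoolName: "agentpool1",
    powerState: {
        code: azure_native.containerservice.Code.Stopped,
    },
    resourceGroupName: "rg1",
    resourceName: "clustername1",
});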
Beyond these examples
These snippets focus on specific agent pool features: spot instances and cost optimization, autoscaling and node count management, ephemeral disks and storage configuration, kubelet and kernel tuning, GPU partitioning and specialized hardware, and snapshots and pool lifecycle management. They’re intentionally minimal rather than full cluster deployments.
The examples may reference pre-existing infrastructure such as AKS clusters (the resourceName values), resource groups, snapshots for cloning, and related compute resources like capacity reservation groups, dedicated host groups, or proximity placement groups. They focus on configuring the agent pool rather than provisioning the surrounding cluster infrastructure.
To keep things focused, common agent pool patterns are omitted, including:
- Availability zones and multi-zone placement
- Network configuration (vnetSubnetID, podSubnetID)
- Upgrade settings and maintenance windows
- Security profiles and encryption settings
- Windows-specific configuration (windowsProfile)
- Virtual machine pool types and heterogeneous sizing
These omissions are intentional: the goal is to illustrate how each agent pool feature is wired, not to provide drop-in cluster modules. See the AgentPool resource reference for all available configuration options.
Frequently Asked Questions
Pool Configuration & Scaling
- Does a cluster need a system pool? Every cluster must keep at least one pool with mode set to System at all times. System pools host critical system pods.
- How do I enable autoscaling? Set enableAutoScaling to true and specify minCount and maxCount to define scaling limits.
- How do I use spot instances? Set scaleSetPriority to Spot and scaleSetEvictionPolicy to Delete or Deallocate. Spot pools use lower-cost VMs that can be evicted.
Immutability & Lifecycle
- Which properties are immutable after creation? agentPoolName, resourceGroupName, resourceName, vmSize, and gpuInstanceProfile.
- How do I stop an agent pool? Set powerState.code to Stopped. Stopped agent pools don’t accrue billing charges but remain configured for restart.
Kubernetes & OS Configuration
- How do I pin the Kubernetes version? Set orchestratorVersion to a full version like 1.20.13, or just 1.20 to auto-select the latest GA patch version.
- Which OS disk type is used by default? The OS disk defaults to Ephemeral if the VM supports it and has a cache disk larger than the requested osDiskSizeGB. Otherwise, it defaults to Managed.
- How do I set a node message of the day? Set messageOfTheDay to a base64-encoded string. This must not be specified for Windows nodes and must be a static string (not a script).
Networking
- How do node and pod subnets work? vnetSubnetID specifies the subnet for agent pool nodes. podSubnetID specifies a separate subnet for pods. If podSubnetID is omitted, pod IPs are statically assigned on the node subnet.
- How do I disable outbound NAT for Windows pools? Set windowsProfile.disableOutboundNat to true for Windows agent pools.
Advanced Features
- How do I run WebAssembly workloads? Set workloadRuntime to WasmWasi to enable Krustlet and the WASI runtime for WebAssembly workloads.
- Can I change the GPU partition profile later? No; gpuInstanceProfile is immutable and cannot be changed after the agent pool is created.