Configuring a GKE Cluster with Mixed Nodes and an A100 GPU

How do I configure a GKE cluster with medium and large nodes plus an A100 GPU?

In this guide, we will configure a Google Kubernetes Engine (GKE) cluster with mixed node sizes and an A100 GPU using Pulumi. We will create a GKE cluster with three node pools: one with medium-sized nodes, one with large-sized nodes, and one whose nodes have A100 GPUs.

import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";

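// Define the GKE cluster
const cluster = new gcp.container.Cluster("my-gke-cluster", {
    // A region (rather than a zone) makes this a regional cluster,
    // so node counts below apply per zone.
    location: "us-central1",
    // The cluster keeps its default node pool, sized by these settings.
    initialNodeCount: 1,
    nodeConfig: {
        machineType: "n1-standard-1",
    },
    // GKE retires old versions over time; if 1.21 is rejected,
    // use a version currently offered in your region.
    minMasterVersion: "1.21",
});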

// Medium-sized node pool
const mediumNodePool = new gcp.container.NodePool("medium-node-pool", {
    cluster: cluster.name,
    location: cluster.location,
    initialNodeCount: 1,
    nodeConfig: {
        machineType: "n1-standard-4",
        oauthScopes: [
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    },
    autoscaling: {
        minNodeCount: 1,
        maxNodeCount: 3,
    },
    management: {
        autoUpgrade: true,
        autoRepair: true,
    },
});

// Large-sized node pool
const largeNodePool = new gcp.container.NodePool("large-node-pool", {
    cluster: cluster.name,
    location: cluster.location,
    initialNodeCount: 1,
    nodeConfig: {
        machineType: "n1-standard-8",
        oauthScopes: [
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    },
    autoscaling: {
        minNodeCount: 1,
        maxNodeCount: 3,
    },
    management: {
        autoUpgrade: true,
        autoRepair: true,
    },
});

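// Node pool with A100 GPU
// A100 capacity is offered only in some zones; if a zone in the region lacks it,
// restrict this pool to suitable zones with nodeLocations.
const gpuNodePool = new gcp.container.NodePool("gpu-node-pool", {
    cluster: cluster.name,
    location: cluster.location,
    initialNodeCount: 1,
    nodeConfig: {
        // A100 GPUs attach only to A2 machine types; a2-highgpu-1g carries one A100.
        machineType: "a2-highgpu-1g",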
        oauthScopes: [
            "https://www.googleapis.com/auth/cloud-platform",
        ],
        guestAccelerators: [
            {
                type: "nvidia-tesla-a100",
                count: 1,
            },
        ],
    },
    autoscaling: {
        minNodeCount: 1,
        maxNodeCount: 3,
    },
    management: {
        autoUpgrade: true,
        autoRepair: true,
    },
});

// Export the cluster name and endpoint
export const clusterName = cluster.name;
export const clusterEndpoint = cluster.endpoint;
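
The three node pools above differ only in machine type and, for the GPU pool, the accelerator. As a minimal sketch, the shared settings could be factored into a helper that returns a plain arguments object. The names poolArgs and PoolArgs are hypothetical and not part of Pulumi; the values mirror the ones used in this guide.

```typescript
// Plain-object shape mirroring the node pool arguments used in this guide.
interface PoolArgs {
    initialNodeCount: number;
    nodeConfig: {
        machineType: string;
        oauthScopes: string[];
        guestAccelerators?: { type: string; count: number }[];
    };
    autoscaling: { minNodeCount: number; maxNodeCount: number };
    management: { autoUpgrade: boolean; autoRepair: boolean };
}

// Hypothetical helper: returns the settings shared by all pools,
// varying only the machine type and an optional accelerator.
function poolArgs(
    machineType: string,
    accelerator?: { type: string; count: number },
): PoolArgs {
    return {
        initialNodeCount: 1,
        nodeConfig: {
            machineType,
            oauthScopes: ["https://www.googleapis.com/auth/cloud-platform"],
            // Add guestAccelerators only when an accelerator is requested.
            ...(accelerator ? { guestAccelerators: [accelerator] } : {}),
        },
        autoscaling: { minNodeCount: 1, maxNodeCount: 3 },
        management: { autoUpgrade: true, autoRepair: true },
    };
}

const mediumArgs = poolArgs("n1-standard-4");
const largeArgs = poolArgs("n1-standard-8");
// a2-highgpu-1g is the A2 machine type that carries a single A100.
const gpuArgs = poolArgs("a2-highgpu-1g", { type: "nvidia-tesla-a100", count: 1 });
```

Each result could then be spread into a node pool definition alongside the cluster and location fields, for example { cluster: cluster.name, location: cluster.location, ...mediumArgs }.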

Key Points

  • We defined a GKE cluster with an initial node configuration.
  • We created three node pools: one with medium-sized nodes, one with large-sized nodes, and one with A100 GPUs.
  • Each node pool is configured with autoscaling, management settings, and appropriate machine types.
  • The GPU node pool includes an A100 GPU accelerator.
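
On Google Cloud, the nvidia-tesla-a100 accelerator (A100 40 GB) is available only on A2 machine types, and each A2 shape comes with a fixed number of GPUs. The sketch below shows one way to sanity-check a GPU pool's machine type and accelerator count before deploying. The function name checkA100Pool is hypothetical, and the table should be verified against current Google Cloud documentation.

```typescript
// A100 (40 GB) GPUs bundled with each A2 machine type.
const a100PerMachineType: Record<string, number> = {
    "a2-highgpu-1g": 1,
    "a2-highgpu-2g": 2,
    "a2-highgpu-4g": 4,
    "a2-highgpu-8g": 8,
    "a2-megagpu-16g": 16,
};

// Hypothetical check: returns an error message, or undefined when the pairing is valid.
function checkA100Pool(machineType: string, count: number): string | undefined {
    const expected: number | undefined = a100PerMachineType[machineType];
    if (expected === undefined) {
        return `${machineType} is not an A2 machine type, so it cannot host nvidia-tesla-a100`;
    }
    if (expected !== count) {
        return `${machineType} carries ${expected} A100 GPU(s), but count is ${count}`;
    }
    return undefined;
}
```

Calling checkA100Pool with an N1 machine type returns a message, while a2-highgpu-1g with a count of 1 passes.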

Summary

We configured a GKE cluster with mixed node sizes and an A100 GPU using Pulumi. The cluster includes node pools for medium and large nodes, as well as a specialized node pool for GPU workloads. This setup allows for flexible scaling and resource allocation based on different workload requirements.
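
To place a workload on the GPU pool rather than the medium or large pools, a pod typically requests the nvidia.com/gpu resource and selects nodes by the accelerator label that GKE applies to GPU nodes. The sketch below builds such a manifest as a plain object. The helper name gpuPodManifest, the pod name, and the image are hypothetical placeholders, and it assumes the NVIDIA drivers are already installed on the GPU nodes.

```typescript
// Hypothetical helper: builds a minimal Pod manifest that requests A100 GPUs.
function gpuPodManifest(name: string, image: string, gpus: number) {
    return {
        apiVersion: "v1",
        kind: "Pod",
        metadata: { name },
        spec: {
            // GKE labels GPU nodes with their accelerator type.
            nodeSelector: { "cloud.google.com/gke-accelerator": "nvidia-tesla-a100" },
            containers: [
                {
                    name,
                    image,
                    // GPUs are requested through limits on the nvidia.com/gpu resource.
                    resources: { limits: { "nvidia.com/gpu": String(gpus) } },
                },
            ],
        },
    };
}

// Placeholder name and image, for illustration only.
const manifest = gpuPodManifest("a100-job", "example.com/trainer:latest", 1);
```

Workloads without a GPU request are kept off the GPU nodes by the taint GKE adds to them, so they land on the medium or large pools.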
