Deploy the kserve-inference helm chart on Google Kubernetes Engine (GKE)

Question

Pulumi · Accepted Answer

To deploy the `kserve-inference` Helm chart on Google Kubernetes Engine (GKE), we would take the following steps:

1. Create a GKE cluster
2. Configure `kubectl` to connect to the GKE cluster
3. Deploy the `kserve-inference` Helm chart to the GKE cluster

Below is a Pulumi program written in TypeScript that carries out these steps. This program assumes that you have Pulumi installed and configured with the necessary GCP credentials.

### Pulumi Program Explanation:

- **Creating a GKE cluster**: We'll use the `gcp.container.Cluster` Pulumi resource to create a new GKE cluster. Only a few basic properties are set: the node count of the default node pool, the Kubernetes version, and the location (a zone in this example).

- **Configuring `kubectl`**: Once the cluster is created, we assemble a kubeconfig from the cluster's outputs (its name, endpoint, and CA certificate). The same kubeconfig lets both `kubectl` and Pulumi's Kubernetes provider talk to the cluster.

- **Deploying the `kserve-inference` Helm Chart**: Pulumi's Kubernetes provider (`@pulumi/kubernetes`) includes a Helm `Chart` resource for deploying Helm charts. We point it at the chart repository and pass the chart version along with any values that `kserve-inference` requires.
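The kubeconfig assembly described above can be sketched as a plain function, which makes it easy to test outside of Pulumi. This is a minimal sketch, not the only way to do it: the `ClusterInfo` interface and `buildKubeconfig` name are hypothetical, and the `users` section assumes the `gke-gcloud-auth-plugin` binary is installed, which is the mechanism recent `kubectl` versions use for GKE authentication.

```typescript
// Hypothetical input shape: the plain values behind the cluster's outputs.
interface ClusterInfo {
    name: string;
    endpoint: string;       // cluster API server IP or hostname
    caCertificate: string;  // base64-encoded cluster CA certificate
    project: string;
    zone: string;
}

// Builds a kubeconfig YAML document for a single GKE cluster.
// Assumption: authentication goes through the gke-gcloud-auth-plugin exec plugin.
function buildKubeconfig(info: ClusterInfo): string {
    // The context name is only a label; this mirrors the gcloud naming style.
    const context = `${info.project}_${info.zone}_${info.name}`;
    return `apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ${info.caCertificate}
    server: https://${info.endpoint}
  name: ${context}
contexts:
- context:
    cluster: ${context}
    user: ${context}
  name: ${context}
current-context: ${context}
kind: Config
preferences: {}
users:
- name: ${context}
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true
`;
}

// Example with made-up values.
const sampleKubeconfig = buildKubeconfig({
    name: "kserve-cluster",
    endpoint: "203.0.113.10",
    caCertificate: "Q0E=",
    project: "my-project",
    zone: "us-west1-a",
});
console.log(sampleKubeconfig);
```

Inside a Pulumi program, the same function would be called from `pulumi.all([cluster.name, cluster.endpoint, cluster.masterAuth]).apply(...)`, because the cluster's values are only known after it has been created.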

### Prerequisites:
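- Ensure that the Pulumi CLI is installed and configured for access to your GCP account.
- Install the `gcloud` CLI (recent `kubectl` versions also need its `gke-gcloud-auth-plugin` component), since the generated kubeconfig authenticates through it.
- A local `helm` CLI is optional: the `k8s.helm.v3.Chart` resource renders the chart through the Kubernetes provider, but `helm` is handy for inspecting the chart's available values.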

### Pulumi Program (TypeScript):

```typescript
import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";
import * as k8s from "@pulumi/kubernetes";

// Step 1: Create the GKE cluster
const cluster = new gcp.container.Cluster("kserve-cluster", {
    initialNodeCount: 2,
    minMasterVersion: "latest",
    nodeVersion: "latest",
    location: "us-west1-a", // Set the zone or region where you want your cluster to be created.
});

// Step 2: Configure kubectl to connect to the new GKE cluster
const kubeconfig = pulumi.
    all([cluster.name, cluster.endpoint, cluster.masterAuth]).
    apply(([name, endpoint, masterAuth]) => {
        const context = `${gcp.config.project}_${gcp.config.zone}_${name}`;
        return `apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ${masterAuth.clusterCaCertificate}
    server: https://${endpoint}
  name: ${context}
contexts:
- context:
    cluster: ${context}
    user: ${context}
  name: ${context}
current-context: ${context}
kind: Config
preferences: {}
users:
- name: ${context}
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      installHint: Install gke-gcloud-auth-plugin for use with kubectl
      provideClusterInfo: true
`;
    });

// Step 3: Create a Kubernetes Provider instance using the kubeconfig
const k8sProvider = new k8s.Provider("k8s-provider", {
    kubeconfig: kubeconfig,
});

// Step 4: Deploy the kserve-inference Helm Chart
const kserveInferenceChart = new k8s.helm.v3.Chart("kserve-inference", {
    // The chart name, version, and repo below are placeholders: confirm them
    // against the repository that actually hosts the chart you want to deploy.
    chart: "kserve",
    version: "0.7.0", // chart version to deploy
    fetchOpts: {
        repo: "https://kserve.github.io/charts", // Helm chart repository URL
    },
    // Override chart values as needed, or remove the `values` field if none are required.
    values: {
        // values you want to set for the kserve-inference Helm chart
    },
}, { provider: k8sProvider });

// Export the Kubeconfig to access the cluster with kubectl
export const kubeconfigOut = kubeconfig;

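// The Chart component has no `status` output, so export the names of the
// Kubernetes resources it created instead.
export const kserveResources = kserveInferenceChart.resources.apply(r => Object.keys(r));
```

### Explanation of Exports:
- `kubeconfigOut` is exported so you can save it to a file and point `kubectl` at the cluster, for example via `pulumi stack output kubeconfigOut`.
- `kserveResources` lists the Kubernetes resources the chart created, which is a quick way to confirm what was deployed.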

### Running the Pulumi Program:
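- Install the `@pulumi/pulumi`, `@pulumi/gcp`, and `@pulumi/kubernetes` packages (for example with `npm install`), then run `pulumi up` to preview the changes and, after confirming, deploy them.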

This Pulumi program sets up a GKE cluster, generates a kubeconfig for it, and deploys the `kserve-inference` Helm chart to the cluster. Replace the placeholder values (location, chart name, version, and repository) with ones that match your environment. Once the deployment completes, the `kserve-inference` services should be running on your GKE cluster.
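One placeholder worth checking is `location`, because it accepts either a zone or a region, and in a regional cluster `initialNodeCount` applies per zone. The helper below is a rough sketch for telling the two apart before deploying. The function names are hypothetical, the patterns only check the usual shape of GCP location names, and the default of three zones per region is an assumption based on GKE's default for regional clusters.

```typescript
type LocationKind = "zone" | "region" | "unknown";

// Classifies a GCP location string by its shape, e.g. "us-west1-a" vs "us-west1".
// This is a naming-pattern check only; it does not verify the location exists.
function classifyLocation(location: string): LocationKind {
    if (/^[a-z]+-[a-z]+[0-9]+-[a-z]$/.test(location)) return "zone";
    if (/^[a-z]+-[a-z]+[0-9]+$/.test(location)) return "region";
    return "unknown";
}

// Estimates the total node count of the default node pool.
// Assumption: a regional cluster spans `zonesPerRegion` zones (three by default).
function estimateDefaultPoolNodes(
    location: string,
    initialNodeCount: number,
    zonesPerRegion: number = 3,
): number {
    const kind = classifyLocation(location);
    if (kind === "zone") return initialNodeCount;
    if (kind === "region") return initialNodeCount * zonesPerRegion;
    throw new Error(`Unrecognized location format: ${location}`);
}

console.log(classifyLocation("us-west1-a"));            // zonal location
console.log(estimateDefaultPoolNodes("us-west1-a", 2)); // nodes for the zonal example
console.log(estimateDefaultPoolNodes("us-west1", 2));   // nodes if a region were used
```

With the program's settings (`location: "us-west1-a"`, `initialNodeCount: 2`), the cluster is zonal and gets two nodes. Switching `location` to `us-west1` would multiply that by the number of zones in use.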