Deploy the kserve-inference Helm chart on Google Kubernetes Engine (GKE)
To deploy the kserve-inference Helm chart on Google Kubernetes Engine (GKE), we take the following steps:
- Create a GKE cluster
- Configure kubectl to connect to the GKE cluster
- Deploy the kserve-inference Helm chart to the GKE cluster
Below is a Pulumi program written in TypeScript that carries out these steps. This program assumes that you have Pulumi installed and configured with the necessary GCP credentials.
Pulumi Program Explanation:
- Creating a GKE cluster: We use the gcp.container.Cluster Pulumi resource to create a new GKE cluster, with reasonable defaults for properties such as the number of nodes in the default node pool and the zone.
- Configuring kubectl: Once the cluster is created, we need a kubeconfig file to interact with the cluster using kubectl. The program assembles it from outputs of the cluster resource: its name, endpoint, and CA certificate.
- Deploying the kserve-inference Helm chart: pulumi-kubernetes, the Kubernetes provider for Pulumi, includes Helm resources that can deploy Helm charts. We pass it the necessary parameters for the chart, including the chart version and any values required by kserve-inference.
Prerequisites:
- Make sure helm is installed locally, as Pulumi will use it to deploy the Helm chart.
- Ensure that the Pulumi CLI is installed and configured for access to your GCP account.
Pulumi Program TypeScript:
import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";
import * as k8s from "@pulumi/kubernetes";

// Step 1: Create the GKE cluster
const cluster = new gcp.container.Cluster("kserve-cluster", {
    initialNodeCount: 2,
    minMasterVersion: "latest",
    nodeVersion: "latest",
    location: "us-west1-a", // Set the zone or region where you want your cluster to be created.
});

// Step 2: Build a kubeconfig so kubectl and Pulumi can connect to the new GKE cluster.
// Authentication uses the gke-gcloud-auth-plugin exec plugin, because the legacy
// "gcloud" auth-provider was removed from kubectl in v1.26.
const kubeconfig = pulumi
    .all([cluster.name, cluster.endpoint, cluster.masterAuth])
    .apply(([name, endpoint, masterAuth]) => {
        const context = `${gcp.config.project}_${gcp.config.zone}_${name}`;
        return `apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ${masterAuth.clusterCaCertificate}
    server: https://${endpoint}
  name: ${context}
contexts:
- context:
    cluster: ${context}
    user: ${context}
  name: ${context}
current-context: ${context}
kind: Config
preferences: {}
users:
- name: ${context}
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true
`;
    });

// Step 3: Create a Kubernetes Provider instance using the kubeconfig
const k8sProvider = new k8s.Provider("k8s-provider", {
    kubeconfig: kubeconfig,
});

// Step 4: Deploy the kserve-inference Helm chart.
// helm.v3.Release is used rather than helm.v3.Chart because Release exposes
// the `status` output exported below; Chart has no such property.
const kserveInferenceChart = new k8s.helm.v3.Release("kserve-inference", {
    chart: "kserve",
    version: "0.7.0", // specify the version of the KServe chart you wish to deploy
    repositoryOpts: {
        repo: "https://kserve.github.io/charts", // specify the Helm chart repository
    },
    // Define values for the Helm chart as needed, or remove the `values` field if not needed
    values: {
        // values you want to set for the kserve-inference Helm chart
    },
}, { provider: k8sProvider });

// Export the kubeconfig to access the cluster with kubectl
export const kubeconfigOut = kubeconfig;

// Export other outputs as needed, such as the Helm release status
export const kserveStatus = kserveInferenceChart.status;
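The kubeconfig template from Step 2 can also be factored into a plain function, which makes the file's structure easy to inspect without creating a cluster. The following is a minimal sketch, not part of the original program: the names ClusterInfo and buildKubeconfig are hypothetical, and it assumes authentication through the gke-gcloud-auth-plugin exec plugin.

```typescript
// Inputs needed to describe one GKE cluster in a kubeconfig file.
interface ClusterInfo {
    name: string;
    endpoint: string;      // API server address, without the https:// scheme
    caCertificate: string; // base64-encoded cluster CA certificate
}

// Hypothetical helper: renders a single-cluster kubeconfig as YAML text.
// Assumes kubectl authenticates via the gke-gcloud-auth-plugin exec plugin.
function buildKubeconfig(project: string, zone: string, cluster: ClusterInfo): string {
    // One identifier is reused for the cluster, user, and context entries.
    const context = `${project}_${zone}_${cluster.name}`;
    const lines = [
        "apiVersion: v1",
        "kind: Config",
        "preferences: {}",
        "clusters:",
        "- cluster:",
        `    certificate-authority-data: ${cluster.caCertificate}`,
        `    server: https://${cluster.endpoint}`,
        `  name: ${context}`,
        "contexts:",
        "- context:",
        `    cluster: ${context}`,
        `    user: ${context}`,
        `  name: ${context}`,
        `current-context: ${context}`,
        "users:",
        `- name: ${context}`,
        "  user:",
        "    exec:",
        "      apiVersion: client.authentication.k8s.io/v1beta1",
        "      command: gke-gcloud-auth-plugin",
        "      provideClusterInfo: true",
    ];
    return lines.join("\n") + "\n";
}
```

Inside the Pulumi program, a function like this would be called from the apply callback in Step 2, passing the resolved cluster name, endpoint, and masterAuth.clusterCaCertificate values.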
Explanation of Exports:
- kubeconfigOut is exported so you can easily use kubectl with this cluster.
- kserveStatus is exported to view the status of the Helm chart deployment.
Running the Pulumi Program:
- Use the command pulumi up to preview and deploy the changes.
This Pulumi program sets up a GKE cluster, configures kubectl, and deploys the kserve-inference Helm chart to the cluster. Make sure to replace placeholders with actual values as per your requirements. After the deployment completes, the kserve-inference services should be running on your GKE cluster.