Self-hosted AI Model CI/CD on Kubernetes Runner

Self-Hosted AI Model CI/CD in GKE with GitLab

Self-hosting CI/CD (Continuous Integration/Continuous Delivery) for AI models on Kubernetes requires an environment where your code can be built, tested, and deployed automatically. You need the following components:

A Kubernetes cluster to run your workloads.
CI/CD tools such as GitLab Runner that will execute your pipelines.
Proper configuration to integrate your CI/CD tool with the Kubernetes cluster.

Pulumi can set up and configure these components. The following program provisions a Kubernetes cluster (Google Kubernetes Engine here for demonstration, though any Kubernetes provider would work), installs a GitLab Runner in the cluster, and registers it with your GitLab instance so that it picks up and runs CI/CD jobs from your projects.

The program will be defined in several steps:

Creation of the Kubernetes cluster.
Installation and configuration of GitLab Runner on the cluster.
Associating the GitLab Runner with a GitLab project.

The complete program below covers all of these steps, starting with the Kubernetes cluster:

import pulumi
import pulumi_gcp as gcp
from pulumi_kubernetes import Provider
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts, FetchOpts

# Step 1: Create a Kubernetes Cluster
# Here we create a GKE (Google Kubernetes Engine) cluster. The project and location are taken from
# the stack's GCP configuration (gcp:project and gcp:zone) unless you set them on the resource.
cluster = gcp.container.Cluster('ai-model-cicd-cluster',
    initial_node_count=3,
    node_config={
        "machine_type": "n1-standard-1",
        "oauth_scopes": [
            "https://www.googleapis.com/auth/compute",
            "https://www.googleapis.com/auth/devstorage.read_only",
            "https://www.googleapis.com/auth/logging.write",
            "https://www.googleapis.com/auth/monitoring"
        ],
    },
)

# Step 2: Create a Kubernetes Provider
# The provider connects Pulumi to the cluster we just created. Its kubeconfig is built from the
# cluster's endpoint and CA certificate once both values are known.
# Authentication uses gke-gcloud-auth-plugin, which must be installed alongside gcloud; the legacy
# "gcp" auth-provider was removed from Kubernetes clients in v1.26.
kubeconfig = pulumi.Output.all(cluster.endpoint, cluster.master_auth.cluster_ca_certificate).apply(
    lambda args: f'''
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: {args[1]}
    server: https://{args[0]}
  name: gke-cluster
contexts:
- context:
    cluster: gke-cluster
    user: gke-user
  name: gke-context
current-context: gke-context
kind: Config
preferences: {{}}
users:
- name: gke-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true
''')
k8s_provider = Provider('k8s-provider', kubeconfig=kubeconfig)

# Step 3: Deploy the GitLab Runner Helm Chart
# We deploy the official GitLab Runner Helm Chart to provide a Runner that will run the jobs.
runner_chart = Chart('gitlab-runner', ChartOpts(
    chart='gitlab-runner',
    version='0.37.1',
    fetch_opts=FetchOpts(
        repo='https://charts.gitlab.io/',
    ),
    values={
        "gitlabUrl": "https://YOUR_GITLAB_INSTANCE/", # Replace with your GitLab instance URL
        "runnerRegistrationToken": "YOUR_REGISTRATION_TOKEN", # Replace with your GitLab runner registration token
        "runners": {
            "privileged": True,
            "tags": "ai-model, cicd",
        }
    }
), opts=pulumi.ResourceOptions(provider=k8s_provider))

# Export important cluster information
pulumi.export('cluster_name', cluster.name)
pulumi.export('cluster_endpoint', cluster.endpoint)

Replace https://YOUR_GITLAB_INSTANCE/ with the URL of your GitLab instance and YOUR_REGISTRATION_TOKEN with your GitLab runner registration token. You can obtain the token from your GitLab instance under the "Runners" settings of the project you are setting up CI/CD for.

Explanation of key parts of the program:

GCP Kubernetes Cluster: The gcp.container.Cluster resource provisions a Kubernetes cluster on Google Cloud Platform. The program specifies the initial node count, the machine type, and the OAuth scopes granted to the cluster nodes.
Kubernetes Provider: The pulumi_kubernetes.Provider resource sets up the connection to the Kubernetes cluster. It builds a kubeconfig from the cluster's endpoint and CA certificate, which Pulumi then uses to interact with your cluster.
Helm Chart for GitLab Runner: The pulumi_kubernetes.helm.v3.Chart resource deploys the GitLab Runner using a Helm chart from GitLab's Helm chart repository. Once registered, the runner polls the specified GitLab instance for pipeline jobs and runs them inside the cluster.
Configuration Values: In values, gitlabUrl is set to your GitLab instance URL, and runnerRegistrationToken is set to a runner registration token obtained from your GitLab instance. The runners section enables privileged mode and assigns the tags that jobs use to select this runner.
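The kubeconfig step can also be sketched as a plain function, which makes it easier to inspect and test outside of Pulumi. This is a minimal sketch, not the program's own method: the function name build_kubeconfig and the entry name "gke-cluster" are hypothetical, it emits JSON (which kubeconfig parsers accept, since JSON is valid YAML), and it assumes gke-gcloud-auth-plugin is installed for authentication.

```python
import json


def build_kubeconfig(endpoint: str, ca_certificate: str, name: str = "gke-cluster") -> str:
    """Return a kubeconfig document as a JSON string.

    `endpoint` is the cluster's API server address and `ca_certificate`
    is the base64-encoded cluster CA certificate. `name` is a hypothetical
    label reused for the cluster, context, and user entries.
    """
    config = {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": name,
            "cluster": {
                "certificate-authority-data": ca_certificate,
                "server": f"https://{endpoint}",
            },
        }],
        "contexts": [{"name": name, "context": {"cluster": name, "user": name}}],
        "current-context": name,
        "users": [{
            "name": name,
            "user": {
                # Assumption: gke-gcloud-auth-plugin is on the PATH.
                "exec": {
                    "apiVersion": "client.authentication.k8s.io/v1beta1",
                    "command": "gke-gcloud-auth-plugin",
                    "provideClusterInfo": True,
                },
            },
        }],
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_kubeconfig("203.0.113.10", "BASE64_CA_CERT"))
```

Inside the Pulumi program, the two inputs are outputs of the cluster resource, so the function would be called from within an apply (or pulumi.Output.all(...).apply) callback rather than directly.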

For security reasons, do not hardcode the registration token in your Pulumi program. Consider using Pulumi secrets or environment variables to handle sensitive information.
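One way to keep the token out of the source is to read it from the environment when the Helm values are assembled. The following is a minimal sketch under stated assumptions: the helper name runner_values and the variable name GITLAB_RUNNER_TOKEN are hypothetical, and the returned dictionary mirrors the values used in the program above.

```python
import os


def runner_values(gitlab_url: str, token_env: str = "GITLAB_RUNNER_TOKEN") -> dict:
    """Build the Helm values for the GitLab Runner chart.

    The registration token is read from the environment variable named by
    `token_env` (a hypothetical name) instead of being written in the source.
    """
    token = os.environ.get(token_env)
    if not token:
        # Fail early rather than deploying a runner that cannot register.
        raise RuntimeError(f"Set {token_env} before running 'pulumi up'")
    return {
        "gitlabUrl": gitlab_url,
        "runnerRegistrationToken": token,
        "runners": {
            "privileged": True,
            "tags": "ai-model, cicd",
        },
    }
```

The result can be passed as the values argument of ChartOpts. Note that a value read this way is still stored in plain text in the Pulumi state unless it is marked as a secret, for example by wrapping it with pulumi.Output.secret or by storing it with pulumi config set --secret and reading it through pulumi.Config().require_secret.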

This program assumes that the Pulumi command-line tool is installed and configured to use your GCP credentials. To deploy the stack, run pulumi up in the program's directory. After the deployment, Pulumi prints the exported values, the cluster name and the cluster endpoint, which you can use to interact with your Kubernetes cluster.