1. Kubernetes Managed Database Clusters for ML Applications


    To set up managed database clusters on Kubernetes for ML (Machine Learning) applications, we need to create a Kubernetes cluster and then deploy a managed database service into it. Such database services are typically provided either by Kubernetes operators running inside the cluster or by the cloud provider's managed offerings when your Kubernetes cluster runs in a cloud environment.

    In a typical setup, an ML application requires a robust data store that can scale alongside the ML workloads. Popular databases for such use cases include PostgreSQL, MongoDB, and MySQL. On cloud providers like AWS, GCP, or Azure, you could leverage their respective managed database services (RDS on AWS, Cloud SQL on GCP, Azure Database on Azure) or deploy a Kubernetes operator that manages the lifecycle of the database inside your Kubernetes cluster.
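
    If you choose the cloud-managed route instead of running the database inside the cluster, the same Pulumi program can provision the database service directly. Below is a minimal, illustrative sketch of a Cloud SQL for PostgreSQL instance using the pulumi_gcp package; the resource names, database version, and machine tier are assumptions you would adjust for your own workload:

    ```python
    import pulumi
    import pulumi_gcp as gcp

    # A Cloud SQL PostgreSQL instance managed by GCP rather than by the cluster.
    # The instance name, tier, and database version here are illustrative values.
    sql_instance = gcp.sql.DatabaseInstance(
        "ml-postgres",
        region=gcp.config.region,
        database_version="POSTGRES_15",
        settings=gcp.sql.DatabaseInstanceSettingsArgs(
            tier="db-f1-micro",  # pick a tier sized for your workload
        ),
        deletion_protection=False,  # set True for anything beyond experimentation
    )

    # A logical database inside the instance for the ML application's data
    ml_db = gcp.sql.Database("ml-features", instance=sql_instance.name)

    pulumi.export("cloudsql_connection_name", sql_instance.connection_name)
    ```

    Workloads in the GKE cluster would then connect to the instance over a private IP or through the Cloud SQL Auth Proxy rather than to an in-cluster service.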

    Below is a Pulumi program in Python showing how to create a managed Kubernetes cluster on Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE), and then an example of how you might deploy a PostgreSQL database using the pulumi_kubernetes package.

    Before running the following program, ensure you have the Pulumi CLI installed, access to a Google Cloud Platform account, and the Google Cloud SDK (gcloud) configured on your local machine, including the gke-gcloud-auth-plugin component, which the generated kubeconfig uses to authenticate against the cluster.

    Here is an illustration of how you might write such a program:

    ```python
    import pulumi
    import pulumi_gcp as gcp
    import pulumi_kubernetes as kubernetes

    # Define the GCP project and region where we will deploy the resources
    project = gcp.config.project
    region = gcp.config.region

    # Create a GKE cluster
    # Docs: https://www.pulumi.com/registry/packages/gcp/api-docs/container/cluster/
    cluster = gcp.container.Cluster(
        "ml-cluster",
        initial_node_count=3,
        node_version="latest",
        min_master_version="latest",
        location=region,
        node_config={
            "machine_type": "n1-standard-1",
            "oauth_scopes": [
                "https://www.googleapis.com/auth/cloud-platform",
            ],
        },
    )

    # Export the cluster name
    pulumi.export('cluster_name', cluster.name)

    # Now that we've created a GKE cluster, build a kubeconfig for it so that
    # 'pulumi_kubernetes' knows where to deploy our resources. Authentication is
    # delegated to the gke-gcloud-auth-plugin, which must be installed locally.
    kubeconfig = pulumi.Output.all(cluster.name, cluster.endpoint, cluster.master_auth).apply(
        lambda info: """apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: {2}
        server: https://{1}
      name: {0}
    contexts:
    - context:
        cluster: {0}
        user: {0}
      name: {0}
    current-context: {0}
    kind: Config
    preferences: {{}}
    users:
    - name: {0}
      user:
        exec:
          apiVersion: client.authentication.k8s.io/v1beta1
          command: gke-gcloud-auth-plugin
          provideClusterInfo: true
    """.format(info[0], info[1], info[2]["cluster_ca_certificate"]))

    k8s_provider = kubernetes.Provider("k8s-provider", kubeconfig=kubeconfig)

    # Deploy a PostgreSQL database onto our GKE cluster using the Bitnami Helm chart
    postgres_chart = kubernetes.helm.v3.Chart(
        "pg-chart",
        config=kubernetes.helm.v3.ChartOpts(
            chart="postgresql",
            version="9.1.4",  # Use an appropriate version of the chart
            fetch_opts=kubernetes.helm.v3.FetchOpts(
                repo="https://charts.bitnami.com/bitnami",
            ),
        ),
        opts=pulumi.ResourceOptions(provider=k8s_provider),
    )

    # Export the PostgreSQL service endpoint for easy access.
    # The chart creates a ClusterIP service named "<release>-postgresql" by default;
    # check the actual service name and type for the chart version you install.
    postgres_service = postgres_chart.get_resource("v1/Service", "pg-chart-postgresql")
    pulumi.export('postgres_endpoint', postgres_service.spec.cluster_ip)
    ```

    In the above program:

    1. We define the GCP project and region variables to specify where our resources will be deployed.
    2. We create a GKE cluster with the gcp.container.Cluster resource. The example starts with three n1-standard-1 nodes, but in a production environment you'd likely want a larger node pool and different machine configurations for high availability and resilience.
    3. We set up the Kubernetes provider with a kubeconfig built from the GKE cluster's endpoint and CA certificate so that Pulumi knows where to deploy Kubernetes resources.
    4. We install a PostgreSQL Helm chart onto our cluster with the kubernetes.helm.v3.Chart resource. Helm charts are packages of predefined Kubernetes resources. We're using the PostgreSQL chart from the Bitnami repository, but other databases and charts could be substituted here; see the sketch just after this list for overriding the chart's default values.
    5. We export the name of the cluster and the PostgreSQL service's address so we can read them from the command line once Pulumi has finished deploying our resources. The program looks up the Service created by the chart with get_resource and exports its cluster IP, which applications running inside the cluster can use to connect; to reach the database from outside the cluster you would expose the service (for example as a LoadBalancer) or use port forwarding.
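
    In practice you will usually want to override some of the chart's defaults, most commonly the database credentials and the size of the persistent volume. The sketch below (reusing the imports and k8s_provider from the program above) shows how overrides can be passed through ChartOpts; note that the value keys differ between chart versions, for example postgresqlPassword in the older 9.x Bitnami charts versus auth.postgresPassword in newer ones, so check the values file for the version you install:

    ```python
    # Sketch of overriding chart defaults; assumes the imports and k8s_provider
    # defined in the main program above. The value keys shown match the 9.x
    # Bitnami chart series used in this example.
    cfg = pulumi.Config()
    db_password = cfg.require_secret("dbPassword")  # set with: pulumi config set --secret dbPassword ...

    postgres_chart = kubernetes.helm.v3.Chart(
        "pg-chart",
        config=kubernetes.helm.v3.ChartOpts(
            chart="postgresql",
            version="9.1.4",
            fetch_opts=kubernetes.helm.v3.FetchOpts(
                repo="https://charts.bitnami.com/bitnami",
            ),
            values={
                "postgresqlPassword": db_password,   # key name varies by chart version
                "persistence": {"size": "20Gi"},     # size the volume for your dataset
            },
        ),
        opts=pulumi.ResourceOptions(provider=k8s_provider),
    )
    ```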

    To run this Pulumi program:

    1. Save the code to a file with a .py extension.
    2. Use the Pulumi CLI to create a new stack (pulumi stack init) and set the GCP config values (pulumi config set gcp:project <your_project> and pulumi config set gcp:region <your_region>).
    3. Run pulumi up to preview and deploy the changes. Pulumi will report the progress and any errors that may occur during the deployment.
    4. Use the exported endpoint to integrate with your ML applications, allowing them to make use of the deployed PostgreSQL database, as sketched below.
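
    As a sketch of that last step, the snippet below shows how an application running inside the cluster might connect to the deployed database. It assumes the psycopg2 client library and that connection details are supplied via environment variables; the host, database, and user shown are the chart's defaults and are placeholders you should verify for your own deployment:

    ```python
    import os

    import psycopg2  # assumed PostgreSQL client library; any driver works

    # Host can be the exported cluster IP, or the chart service's in-cluster DNS
    # name as shown here. User, database, and password are placeholders; supply
    # the real values for your deployment via environment variables or a secret.
    conn = psycopg2.connect(
        host=os.environ.get("POSTGRES_HOST", "pg-chart-postgresql.default.svc.cluster.local"),
        port=int(os.environ.get("POSTGRES_PORT", "5432")),
        dbname=os.environ.get("POSTGRES_DB", "postgres"),
        user=os.environ.get("POSTGRES_USER", "postgres"),
        password=os.environ["POSTGRES_PASSWORD"],
    )

    with conn, conn.cursor() as cur:
        # e.g. store feature rows or training metadata for the ML pipeline
        cur.execute("SELECT version();")
        print(cur.fetchone())

    conn.close()
    ```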