Stateful AI Application Backup with Velero

Pulumi · Accepted Answer

Velero is an open-source tool for backing up and restoring Kubernetes cluster resources and persistent volumes, for disaster recovery, and for migrating workloads between clusters. To create backups with Velero, you must first install the Velero server in your Kubernetes cluster. You then create Backup and Restore resources as needed.

Although Pulumi does not provide dedicated Velero resource types, you can manage Velero backups using Pulumi's Kubernetes provider. The provider lets you declare arbitrary Kubernetes objects in your Pulumi program, including custom resources such as Velero's `Backup`, whose types are defined by CRDs (Custom Resource Definitions).

Below is an example Pulumi program that creates a Velero backup in a Kubernetes cluster, assuming the Velero server and its CRDs are already installed. The program does not install Velero itself; it defines a Velero `Backup` resource that backs up all resources in a given namespace.

Please note that you need to have Velero running in your Kubernetes cluster, and your `kubeconfig` file correctly set up in the environment where you run Pulumi. You'll also need access privileges to create resources in your cluster.

### Detailed Explanation and Program

1. We begin by importing the necessary Pulumi and Kubernetes libraries.

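2. We then declare the namespace that Velero runs in. Keep this resource only if the namespace does not exist yet; if Velero is already installed, the namespace exists and Pulumi should not try to create it again.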

3. After that, we define the backup resource using Velero's Backup custom resource. In this example, we're backing up all resources in the `default` namespace. You will need to adjust the `includedNamespaces` parameter to target the namespace that contains your stateful AI application.

4. Lastly, we export the backup name, which you can use to reference the backup operation, for example, when configuring your restores or querying Velero status.

Here is the Pulumi program in Python:

```python
import pulumi
import pulumi_kubernetes as k8s

# This assumes that the Velero CRDs and Velero server are already installed on your cluster.
# "velero.io/v1" is the apiVersion (API group plus version) that Velero uses for Backup objects;
# confirm it for your Velero version with `kubectl api-resources --api-group=velero.io`.
VELERO_API_VERSION = "velero.io/v1"
VELERO_BACKUP_KIND = "Backup"

# Namespace the Velero server runs in. Backup objects must be created in this namespace.
velero_namespace_name = "velero"

# Keep this resource only if the namespace does not exist yet. If Velero is already installed
# there, remove it (and the `depends_on` below) so that Pulumi neither tries to re-create the
# namespace nor deletes it on `pulumi destroy`.
velero_namespace = k8s.core.v1.Namespace(
    "velero-namespace",
    metadata={"name": velero_namespace_name},
)

# Define the Backup resource for the stateful AI application.
stateful_app_backup = k8s.apiextensions.CustomResource(
    "stateful-app-backup",
    api_version=VELERO_API_VERSION,
    kind=VELERO_BACKUP_KIND,
    metadata={"name": "stateful-ai-app-backup", "namespace": velero_namespace_name},
    spec={
        # Namespaces to back up. Change 'default' to the namespace where your
        # stateful AI application is running.
        "includedNamespaces": ["default"],
        # Optionally restrict the backup to resources carrying a specific label:
        # "labelSelector": {"matchLabels": {"app": "your-app-label"}},
        # Name of an existing BackupStorageLocation object (backed by AWS S3, GCS,
        # Azure Blob Storage, etc.). The bucket and credentials are configured as part
        # of the Velero installation, not here.
        "storageLocation": "your-backup-storage-location",
        # For a more complex stateful app, you may also need included/excluded resources,
        # hooks, and other settings.
    },
    opts=pulumi.ResourceOptions(depends_on=[velero_namespace]),
)

# Export the backup resource name to allow for easy queries with `kubectl` or Velero CLI.
pulumi.export("backup_name", stateful_app_backup.metadata["name"])
```
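If you back up several applications, the `spec` dictionary can be generated by a small helper instead of being written out each time. The following is a minimal sketch in plain Python; `build_backup_spec` and its parameters are hypothetical names introduced here, the `"default"` storage location assumes Velero's usual default `BackupStorageLocation` name, and rejecting an empty namespace list is a design choice of this sketch (Velero itself treats an empty list as "all namespaces").

```python
from typing import Dict, List, Optional


def build_backup_spec(
    namespaces: List[str],
    storage_location: str = "default",
    match_labels: Optional[Dict[str, str]] = None,
    ttl: Optional[str] = None,
) -> dict:
    """Return a dict usable as the `spec` of a Velero Backup custom resource."""
    if not namespaces:
        # Guard against accidentally backing up every namespace in the cluster.
        raise ValueError("at least one namespace is required")
    spec: dict = {
        "includedNamespaces": list(namespaces),
        "storageLocation": storage_location,
    }
    if match_labels:
        # Only resources carrying all of these labels are included.
        spec["labelSelector"] = {"matchLabels": dict(match_labels)}
    if ttl is not None:
        # Retention period as a duration string, e.g. "72h0m0s".
        spec["ttl"] = ttl
    return spec


# "ml-training" and "your-app-label" are placeholder values.
example_spec = build_backup_spec(
    ["ml-training"],
    match_labels={"app": "your-app-label"},
    ttl="72h0m0s",
)
print(example_spec)
```

The returned dictionary can be passed as the `spec=` argument of `k8s.apiextensions.CustomResource` in the program above.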

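When you run this program with `pulumi up`, it creates a Velero `Backup` resource in your Kubernetes cluster. The Velero server picks up this resource and performs the backup of the stateful application. A `Backup` object describes a single, one-off backup; for recurring backups, Velero provides a separate `Schedule` resource.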
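Note that Pulumi reports success as soon as the `Backup` object exists, not when Velero has finished the backup. Velero records progress in the object's `status.phase` field, which you can read with, for example, `kubectl get backups.velero.io stateful-ai-app-backup -n velero -o json`. The sketch below interprets that JSON; `summarize_backup` is a hypothetical helper, and the phase names listed are the common ones, so treat the sets as an assumption to check against your Velero version.

```python
import json

# Phases that mean the backup finished, successfully or not. Anything else
# (for example "New" or "InProgress", or no status yet) is treated as pending.
SUCCEEDED_PHASES = {"Completed"}
FAILED_PHASES = {"Failed", "PartiallyFailed", "FailedValidation"}


def summarize_backup(backup: dict) -> str:
    """Map a Backup object (parsed JSON) to 'succeeded', 'failed', or 'pending'."""
    phase = (backup.get("status") or {}).get("phase", "")
    if phase in SUCCEEDED_PHASES:
        return "succeeded"
    if phase in FAILED_PHASES:
        return "failed"
    return "pending"


# Hand-written sample standing in for the output of `kubectl ... -o json`.
sample = json.loads(
    '{"metadata": {"name": "stateful-ai-app-backup"}, "status": {"phase": "InProgress"}}'
)
print(summarize_backup(sample))
```

A script could poll this until the result is no longer `pending` before treating the backup as usable for a restore.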

Keep in mind that Velero must be installed and correctly configured, including access to the object storage where the backups will be saved, and that you need permission to create resources in your Kubernetes cluster. To install Velero and its CRDs, you would typically use the Helm chart, the `velero install` CLI command, or manifests applied with `kubectl`. Follow the official Velero documentation for the procedure that best fits your operational requirements.

For more detailed and specific configuration, such as setting hooks for pre- and post-backup actions, including or excluding specific resources by label, and configuring the storage location, please refer to Velero's documentation: [Velero Docs](https://velero.io/docs/).
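As one illustration of pre- and post-backup hooks: Velero can run commands inside a pod before and after backing it up when the pod carries `pre.hook.backup.velero.io/...` and `post.hook.backup.velero.io/...` annotations, with the command given as a JSON array string. The following sketch builds such an annotation dictionary; `backup_hook_annotations` is a hypothetical helper, and the `fsfreeze` commands, the `/data` mount path, and the `trainer` container name are placeholders, not part of the program above.

```python
import json
from typing import Dict, List, Optional


def backup_hook_annotations(
    pre_command: List[str],
    post_command: Optional[List[str]] = None,
    container: Optional[str] = None,
    timeout: Optional[str] = None,
) -> Dict[str, str]:
    """Build pod annotations that ask Velero to run hooks around a backup."""
    # Velero expects the command as a JSON array encoded in a string.
    annotations = {"pre.hook.backup.velero.io/command": json.dumps(pre_command)}
    if timeout:
        annotations["pre.hook.backup.velero.io/timeout"] = timeout
    if post_command:
        annotations["post.hook.backup.velero.io/command"] = json.dumps(post_command)
    if container:
        # Without a container annotation, Velero uses the pod's first container.
        annotations["pre.hook.backup.velero.io/container"] = container
        if post_command:
            annotations["post.hook.backup.velero.io/container"] = container
    return annotations


# Placeholder example: freeze a data volume during the backup, then unfreeze it.
hooks = backup_hook_annotations(
    pre_command=["/sbin/fsfreeze", "--freeze", "/data"],
    post_command=["/sbin/fsfreeze", "--unfreeze", "/data"],
    container="trainer",
    timeout="60s",
)
for key, value in hooks.items():
    print(f"{key}: {value}")
```

These annotations belong in the pod template metadata of your application's Deployment or StatefulSet, not on the `Backup` resource itself.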