
How do I configure environment variables for Databricks clusters globally?

In this guide, we will demonstrate how to configure environment variables for Databricks clusters using Pulumi. By defining the variables once and reusing that definition in every cluster you create, you can make them effectively global across your Databricks workspace, ensuring consistency and simplifying configuration management.

Key Points

  • We will use the databricks.Cluster resource to define a Databricks cluster.
  • We will configure the sparkEnvVars property to set environment variables on every node of the cluster.
  • The example will include all necessary entities to ensure the environment variables are applied correctly.
import * as pulumi from "@pulumi/pulumi";
import * as databricks from "@pulumi/databricks";

// Define a Databricks cluster
const cluster = new databricks.Cluster("exampleCluster", {
    clusterName: "example-cluster",
    sparkVersion: "7.3.x-scala2.12",
    nodeTypeId: "i3.xlarge",
    numWorkers: 2,
    sparkEnvVars: {
        "ENV_VAR_1": "value1",
        "ENV_VAR_2": "value2",
    },
    autoterminationMinutes: 30,
    awsAttributes: {
        availability: "SPOT",
        zoneId: "us-west-2a",
    },
});

// Export the cluster ID
export const clusterId = cluster.id;

Summary

In this example, we created a Databricks cluster and configured it with global environment variables using the sparkEnvVars property. These environment variables will be available across all nodes in the cluster, ensuring a consistent configuration. This approach simplifies the management of environment variables in your Databricks environment.
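Because sparkEnvVars is set per cluster, one way to make the variables effectively global is to keep them in a single shared object and merge in per-cluster overrides wherever a cluster is defined. The sketch below shows this merge pattern in plain TypeScript; the globalSparkEnvVars constant and withGlobalEnvVars helper are hypothetical names for illustration, not part of the Pulumi SDK.

```typescript
// Shared base set of environment variables, defined once and
// reused by every cluster definition in the program.
const globalSparkEnvVars: Record<string, string> = {
    ENV_VAR_1: "value1",
    ENV_VAR_2: "value2",
};

// Merge per-cluster overrides on top of the shared defaults.
// The result can be passed directly as a cluster's sparkEnvVars.
function withGlobalEnvVars(
    overrides: Record<string, string> = {}
): Record<string, string> {
    return { ...globalSparkEnvVars, ...overrides };
}

// Example: one cluster overrides a single variable, the rest inherit.
const perClusterVars = withGlobalEnvVars({ ENV_VAR_2: "cluster-specific" });
console.log(JSON.stringify(perClusterVars));
```

Each cluster would then use sparkEnvVars: withGlobalEnvVars(...) instead of repeating the variable map inline, so updating the shared object updates every cluster on the next deployment.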

Deploy this code

Want to deploy this code? Sign up for a free Pulumi account to deploy in a few clicks.
