
databricks.Cluster

Databricks v1.27.0 published on Tuesday, Dec 5, 2023 by Pulumi

    Import

    The resource cluster can be imported using the cluster ID:

     $ pulumi import databricks:index/cluster:Cluster this <cluster-id>
    

    Create Cluster Resource

    new Cluster(name: string, args: ClusterArgs, opts?: CustomResourceOptions);
    @overload
    def Cluster(resource_name: str,
                opts: Optional[ResourceOptions] = None,
                apply_policy_default_values: Optional[bool] = None,
                autoscale: Optional[ClusterAutoscaleArgs] = None,
                autotermination_minutes: Optional[int] = None,
                aws_attributes: Optional[ClusterAwsAttributesArgs] = None,
                azure_attributes: Optional[ClusterAzureAttributesArgs] = None,
                cluster_id: Optional[str] = None,
                cluster_log_conf: Optional[ClusterClusterLogConfArgs] = None,
                cluster_mount_infos: Optional[Sequence[ClusterClusterMountInfoArgs]] = None,
                cluster_name: Optional[str] = None,
                custom_tags: Optional[Mapping[str, Any]] = None,
                data_security_mode: Optional[str] = None,
                docker_image: Optional[ClusterDockerImageArgs] = None,
                driver_instance_pool_id: Optional[str] = None,
                driver_node_type_id: Optional[str] = None,
                enable_elastic_disk: Optional[bool] = None,
                enable_local_disk_encryption: Optional[bool] = None,
                gcp_attributes: Optional[ClusterGcpAttributesArgs] = None,
                idempotency_token: Optional[str] = None,
                init_scripts: Optional[Sequence[ClusterInitScriptArgs]] = None,
                instance_pool_id: Optional[str] = None,
                is_pinned: Optional[bool] = None,
                libraries: Optional[Sequence[ClusterLibraryArgs]] = None,
                node_type_id: Optional[str] = None,
                num_workers: Optional[int] = None,
                policy_id: Optional[str] = None,
                runtime_engine: Optional[str] = None,
                single_user_name: Optional[str] = None,
                spark_conf: Optional[Mapping[str, Any]] = None,
                spark_env_vars: Optional[Mapping[str, Any]] = None,
                spark_version: Optional[str] = None,
                ssh_public_keys: Optional[Sequence[str]] = None,
                workload_type: Optional[ClusterWorkloadTypeArgs] = None)
    @overload
    def Cluster(resource_name: str,
                args: ClusterArgs,
                opts: Optional[ResourceOptions] = None)
    func NewCluster(ctx *Context, name string, args ClusterArgs, opts ...ResourceOption) (*Cluster, error)
    public Cluster(string name, ClusterArgs args, CustomResourceOptions? opts = null)
    public Cluster(String name, ClusterArgs args)
    public Cluster(String name, ClusterArgs args, CustomResourceOptions options)
    
    type: databricks:Cluster
    properties: # The arguments to resource properties.
    options: # Bag of options to control resource's behavior.
    
    
    name string
    The unique name of the resource.
    args ClusterArgs
    The arguments to resource properties.
    opts CustomResourceOptions
    Bag of options to control resource's behavior.
    resource_name str
    The unique name of the resource.
    args ClusterArgs
    The arguments to resource properties.
    opts ResourceOptions
    Bag of options to control resource's behavior.
    ctx Context
    Context object for the current deployment.
    name string
    The unique name of the resource.
    args ClusterArgs
    The arguments to resource properties.
    opts ResourceOption
    Bag of options to control resource's behavior.
    name string
    The unique name of the resource.
    args ClusterArgs
    The arguments to resource properties.
    opts CustomResourceOptions
    Bag of options to control resource's behavior.
    name String
    The unique name of the resource.
    args ClusterArgs
    The arguments to resource properties.
    options CustomResourceOptions
    Bag of options to control resource's behavior.
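
    As a quick illustration, a minimal TypeScript sketch of constructing the resource might look like the following. The runtime and node type literals are hypothetical placeholders; in practice they are usually resolved with the databricks.getSparkVersion and databricks.getNodeType data sources, as in the fuller example shown under the inputs below.

    import * as databricks from "@pulumi/databricks";

    // Minimal sketch: a small fixed-size cluster (placeholder values, not a definitive setup).
    const example = new databricks.Cluster("example", {
        clusterName: "minimal-example",
        sparkVersion: "13.3.x-scala2.12", // placeholder runtime id
        nodeTypeId: "i3.xlarge",          // placeholder node type
        numWorkers: 1,
        autoterminationMinutes: 20,
    });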

    Cluster Resource Properties

    To learn more about resource properties and how to use them, see Inputs and Outputs in the Architecture and Concepts docs.

    Inputs

    The Cluster resource accepts the following input properties:

    SparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    ApplyPolicyDefaultValues bool

    Whether to use policy default values for missing cluster attributes.

    Autoscale ClusterAutoscale
    AutoterminationMinutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    AwsAttributes ClusterAwsAttributes
    AzureAttributes ClusterAzureAttributes
    ClusterId string
    ClusterLogConf ClusterClusterLogConf
    ClusterMountInfos List<ClusterClusterMountInfo>
    ClusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    CustomTags Dictionary<string, object>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.
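
    For example, a hedged sketch of adding custom tags (the tag keys and values are placeholders):

    import * as databricks from "@pulumi/databricks";

    // Hypothetical tags; Databricks propagates them to the underlying cloud
    // resources (e.g., EC2 instances and EBS volumes) alongside default_tags.
    const tagged = new databricks.Cluster("tagged", {
        clusterName: "tagged-example",
        sparkVersion: "13.3.x-scala2.12", // placeholder runtime id
        nodeTypeId: "i3.xlarge",          // placeholder node type
        numWorkers: 1,
        customTags: {
            team: "data-platform", // placeholder tag
            cost_center: "12345",  // placeholder tag
        },
    });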

    DataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this setting has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but this resource continues to use the original terms.
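
    As an illustration, a hedged TypeScript sketch of a Unity Catalog single-user cluster sets data_security_mode together with single_user_name (the user shown is a placeholder):

    import * as databricks from "@pulumi/databricks";

    // Hypothetical single-user cluster for Unity Catalog workloads.
    const singleUser = new databricks.Cluster("singleUser", {
        clusterName: "single-user-example",
        sparkVersion: "13.3.x-scala2.12",      // placeholder runtime id
        nodeTypeId: "i3.xlarge",               // placeholder node type
        numWorkers: 1,
        dataSecurityMode: "SINGLE_USER",
        singleUserName: "someone@example.com", // placeholder user name
    });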

    DockerImage ClusterDockerImage
    DriverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    DriverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    EnableElasticDisk bool

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure you have the autotermination_minutes and autoscale attributes set. More documentation is available on the cluster configuration page.

    EnableLocalDiskEncryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    GcpAttributes ClusterGcpAttributes
    IdempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster but will instead return the existing running cluster's ID. If you specify the idempotency token and the request fails, you can retry until it succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.
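
    For example, a hedged sketch of passing an idempotency token (the token string is an arbitrary placeholder):

    import * as databricks from "@pulumi/databricks";

    // Hypothetical cluster with an idempotency token; retrying the same request
    // after a failure will not create a second cluster.
    const idempotent = new databricks.Cluster("idempotent", {
        clusterName: "idempotent-example",
        sparkVersion: "13.3.x-scala2.12",           // placeholder runtime id
        nodeTypeId: "i3.xlarge",                    // placeholder node type
        numWorkers: 1,
        idempotencyToken: "my-team-etl-cluster-01", // placeholder token (at most 64 chars)
    });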

    InitScripts List<ClusterInitScript>
    InstancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.
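
    For instance, a hedged sketch of attaching a cluster to existing instance pools (the pool IDs are placeholders; node_type_id can be omitted when instance_pool_id is set):

    import * as databricks from "@pulumi/databricks";

    // Hypothetical pool IDs; in practice these would come from databricks.InstancePool
    // resources or from pools that already exist in your workspace.
    const workerPoolId = "0123-456789-pool1"; // placeholder pool id
    const driverPoolId = "0123-456789-pool2"; // placeholder pool id

    const pooled = new databricks.Cluster("pooled", {
        clusterName: "pooled-example",
        sparkVersion: "13.3.x-scala2.12",   // placeholder runtime id
        instancePoolId: workerPoolId,       // worker nodes come from this pool
        driverInstancePoolId: driverPoolId, // driver node comes from this pool
        numWorkers: 2,
    });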

    IsPinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The maximum number of pinned clusters is limited to 100, so apply may fail if you have more than that (this limit may change over time; check the Databricks documentation for the current value).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    Libraries List<ClusterLibrary>
    NodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    NumWorkers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    PolicyId string
    RuntimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    SingleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    SparkConf Dictionary<string, object>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    SparkEnvVars Dictionary<string, object>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.
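
    For example, a hedged sketch of combining spark_conf and spark_env_vars (the keys and values are placeholders):

    import * as databricks from "@pulumi/databricks";

    // Hypothetical tuning values; env vars are exported as X='Y' on the driver and workers.
    const tuned = new databricks.Cluster("tuned", {
        clusterName: "tuned-example",
        sparkVersion: "13.3.x-scala2.12", // placeholder runtime id
        nodeTypeId: "i3.xlarge",          // placeholder node type
        numWorkers: 1,
        sparkConf: {
            "spark.sql.shuffle.partitions": "200", // placeholder setting
        },
        sparkEnvVars: {
            MY_FEATURE_FLAG: "true", // placeholder environment variable
        },
    });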

    SshPublicKeys List<string>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.
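
    As a hedged illustration, keys are passed as plain public key strings; the file path below is a placeholder:

    import * as fs from "fs";
    import * as databricks from "@pulumi/databricks";

    // Hypothetical example: read a local OpenSSH public key and attach it to the cluster.
    const publicKey = fs.readFileSync("/home/me/.ssh/id_rsa.pub", "utf8").trim();

    const sshCluster = new databricks.Cluster("sshCluster", {
        clusterName: "ssh-example",
        sparkVersion: "13.3.x-scala2.12", // placeholder runtime id
        nodeTypeId: "i3.xlarge",          // placeholder node type
        numWorkers: 1,
        sshPublicKeys: [publicKey],       // up to 10 keys; login user is "ubuntu" on port 2200
    });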

    WorkloadType ClusterWorkloadType
    SparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    ApplyPolicyDefaultValues bool

    Whether to use policy default values for missing cluster attributes.

    Autoscale ClusterAutoscaleArgs
    AutoterminationMinutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    AwsAttributes ClusterAwsAttributesArgs
    AzureAttributes ClusterAzureAttributesArgs
    ClusterId string
    ClusterLogConf ClusterClusterLogConfArgs
    ClusterMountInfos []ClusterClusterMountInfoArgs
    ClusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    CustomTags map[string]interface{}

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    DataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this setting has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but this resource continues to use the original terms.

    DockerImage ClusterDockerImageArgs
    DriverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    DriverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    EnableElasticDisk bool

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure you have the autotermination_minutes and autoscale attributes set. More documentation is available on the cluster configuration page.

    EnableLocalDiskEncryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    GcpAttributes ClusterGcpAttributesArgs
    IdempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster but will instead return the existing running cluster's ID. If you specify the idempotency token and the request fails, you can retry until it succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    InitScripts []ClusterInitScriptArgs
    InstancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    IsPinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The maximum number of pinned clusters is limited to 100, so apply may fail if you have more than that (this limit may change over time; check the Databricks documentation for the current value).

    Libraries []ClusterLibraryArgs
    NodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    NumWorkers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    PolicyId string
    RuntimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    SingleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    SparkConf map[string]interface{}

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    SparkEnvVars map[string]interface{}

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    SshPublicKeys []string

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.

    WorkloadType ClusterWorkloadTypeArgs
    sparkVersion String

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    applyPolicyDefaultValues Boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscale
    autoterminationMinutes Integer

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    awsAttributes ClusterAwsAttributes
    azureAttributes ClusterAzureAttributes
    clusterId String
    clusterLogConf ClusterClusterLogConf
    clusterMountInfos List<ClusterClusterMountInfo>
    clusterName String

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags Map<String,Object>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    dataSecurityMode String

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this setting has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but this resource continues to use the original terms.

    dockerImage ClusterDockerImage
    driverInstancePoolId String

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId String

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk Boolean

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure you have the autotermination_minutes and autoscale attributes set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption Boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcpAttributes ClusterGcpAttributes
    idempotencyToken String

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster but will instead return the existing running cluster's ID. If you specify the idempotency token and the request fails, you can retry until it succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    initScripts List<ClusterInitScript>
    instancePoolId String

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    isPinned Boolean

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The maximum number of pinned clusters is limited to 100, so apply may fail if you have more than that (this limit may change over time; check the Databricks documentation for the current value).

    libraries List<ClusterLibrary>
    nodeTypeId String

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers Integer

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policyId String
    runtimeEngine String

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    singleUserName String

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    sparkConf Map<String,Object>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars Map<String,Object>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    sshPublicKeys List<String>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.

    workloadType ClusterWorkloadType
    sparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    applyPolicyDefaultValues boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscale
    autoterminationMinutes number

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    awsAttributes ClusterAwsAttributes
    azureAttributes ClusterAzureAttributes
    clusterId string
    clusterLogConf ClusterClusterLogConf
    clusterMountInfos ClusterClusterMountInfo[]
    clusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags {[key: string]: any}

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    dataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this setting has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but this resource continues to use the original terms.

    dockerImage ClusterDockerImage
    driverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk boolean

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure you have the autotermination_minutes and autoscale attributes set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcpAttributes ClusterGcpAttributes
    idempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster but will instead return the existing running cluster's ID. If you specify the idempotency token and the request fails, you can retry until it succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    initScripts ClusterInitScript[]
    instancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    isPinned boolean

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The maximum number of pinned clusters is limited to 100, so apply may fail if you have more than that (this limit may change over time; check the Databricks documentation for the current value).

    libraries ClusterLibrary[]
    nodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers number

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.
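    For example, a minimal fixed-size cluster (a Python sketch; the cluster name and worker count are illustrative) sets num_workers instead of an autoscale block:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    # One driver plus two workers: three Spark nodes in total.
    fixed_size = databricks.Cluster("fixedSize",
        cluster_name="Fixed Size",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=2,
        autotermination_minutes=20)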

    policyId string
    runtimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    singleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    sparkConf {[key: string]: any}

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars {[key: string]: any}

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.
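    For example (a Python sketch; the variable names and values are illustrative placeholders), environment variables are passed as a plain map:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    with_env = databricks.Cluster("withEnvVars",
        cluster_name="With Env Vars",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        # Each pair is exported as X='Y' on the driver and every worker.
        spark_env_vars={
            "MY_PIPELINE_ENV": "staging",
            "JAVA_TOOL_OPTIONS": "-Dfile.encoding=UTF-8",
        })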

    sshPublicKeys string[]

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. You can specify up to 10 keys.

    workloadType ClusterWorkloadType
    spark_version str

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    apply_policy_default_values bool

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscaleArgs
    autotermination_minutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    aws_attributes ClusterAwsAttributesArgs
    azure_attributes ClusterAzureAttributesArgs
    cluster_id str
    cluster_log_conf ClusterClusterLogConfArgs
    cluster_mount_infos Sequence[ClusterClusterMountInfoArgs]
    cluster_name str

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    custom_tags Mapping[str, Any]

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.
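    A minimal Python sketch (the tag keys and values are illustrative):

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    tagged = databricks.Cluster("tagged",
        cluster_name="Team Analytics",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        # Propagated to the underlying cloud resources in addition to default_tags.
        custom_tags={
            "CostCenter": "analytics",
            "Owner": "data-platform",
        })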

    data_security_mode str

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for a passthrough cluster and LEGACY_TABLE_ACL for a Table ACL cluster. If omitted, no security features are enabled. In the Databricks UI this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but use the values listed here.
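    For example, a Unity Catalog cluster dedicated to one principal combines this field with single_user_name (documented below); a Python sketch with a placeholder user name:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    single_user = databricks.Cluster("singleUser",
        cluster_name="Single User",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        data_security_mode="SINGLE_USER",        # Unity Catalog, one principal only
        single_user_name="someone@example.com")  # placeholder user name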

    docker_image ClusterDockerImageArgs
    driver_instance_pool_id str

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    driver_node_type_id str

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enable_elastic_disk bool

    If you don't want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster's Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance's local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enable_local_disk_encryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcp_attributes ClusterGcpAttributesArgs
    idempotency_token str

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster, but it will return the existing running cluster's ID instead. If you specify the idempotency token, upon failure, you can retry until the request succeeds. Databricks platform guarantees to launch exactly one cluster with that idempotency token. This token should have at most 64 characters.

    init_scripts Sequence[ClusterInitScriptArgs]
    instance_pool_id str

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.
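    The sketch below attaches a cluster to a pool. It assumes the provider's databricks.InstancePool resource and the argument names shown for it (instance_pool_name, min_idle_instances, idle_instance_autotermination_minutes), so treat those as illustrative and check the InstancePool documentation:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    # Assumed InstancePool arguments; verify against the InstancePool resource docs.
    pool = databricks.InstancePool("sharedPool",
        instance_pool_name="Shared Pool",
        node_type_id=smallest.id,
        min_idle_instances=0,
        idle_instance_autotermination_minutes=10)

    pooled = databricks.Cluster("pooled",
        cluster_name="Pooled",
        spark_version=latest_lts.id,
        instance_pool_id=pool.id,  # node_type_id can be omitted when a pool is used
        num_workers=1,
        autotermination_minutes=20)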

    is_pinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you exceed that limit (the limit may change over time; check the Databricks documentation for the current value).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries Sequence[ClusterLibraryArgs]
    node_type_id str

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    num_workers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policy_id str
    runtime_engine str

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    single_user_name str

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    spark_conf Mapping[str, Any]

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    spark_env_vars Mapping[str, Any]

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    ssh_public_keys Sequence[str]

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. You can specify up to 10 keys.
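    A minimal Python sketch adding a single key (the key material is a truncated placeholder):

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    ssh_cluster = databricks.Cluster("sshEnabled",
        cluster_name="SSH Enabled",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        # Up to 10 keys; log in as user "ubuntu" on port 2200 with the matching private key.
        ssh_public_keys=["ssh-rsa AAAA... user@example.com"])  # placeholder key material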

    workload_type ClusterWorkloadTypeArgs
    sparkVersion String

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    applyPolicyDefaultValues Boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale Property Map
    autoterminationMinutes Number

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.
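    For example, an always-on cluster disables the timeout by setting the value to 0; the sketch below (Python, illustrative names) also pins the cluster, which is described further down:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    always_on = databricks.Cluster("alwaysOn",
        cluster_name="Always On",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=0,  # disable automatic termination
        is_pinned=True)             # requires Databricks administrator rights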

    awsAttributes Property Map
    azureAttributes Property Map
    clusterId String
    clusterLogConf Property Map
    clusterMountInfos List<Property Map>
    clusterName String

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags Map<Any>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    dataSecurityMode String

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for a passthrough cluster and LEGACY_TABLE_ACL for a Table ACL cluster. If omitted, no security features are enabled. In the Databricks UI this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but use the values listed here.

    dockerImage Property Map
    driverInstancePoolId String

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId String

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk Boolean

    If you don't want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster's Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance's local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption Boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.
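    A minimal Python sketch enabling both autoscaling local storage and local disk encryption (whether the encryption feature is available depends on your subscription, as noted above):

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    encrypted = databricks.Cluster("encrypted",
        cluster_name="Encrypted Scratch Disks",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        enable_elastic_disk=True,           # autoscaling local storage
        enable_local_disk_encryption=True)  # encrypt shuffle/temp data at rest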

    gcpAttributes Property Map
    idempotencyToken String

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster, but it will return the existing running cluster's ID instead. If you specify the idempotency token, upon failure, you can retry until the request succeeds. Databricks platform guarantees to launch exactly one cluster with that idempotency token. This token should have at most 64 characters.

    initScripts List<Property Map>
    instancePoolId String

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    isPinned Boolean

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you exceed that limit (the limit may change over time; check the Databricks documentation for the current value).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries List<Property Map>
    nodeTypeId String

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers Number

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policyId String
    runtimeEngine String

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.
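    For example, to request the Photon engine explicitly (a Python sketch; Photon availability depends on the workspace and node type):

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    photon = databricks.Cluster("photon",
        cluster_name="Photon",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        runtime_engine="PHOTON")  # or "STANDARD"; omit to let Databricks infer it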

    singleUserName String

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    sparkConf Map<Any>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars Map<Any>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    sshPublicKeys List<String>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. You can specify up to 10 keys.

    workloadType Property Map

    Outputs

    All input properties are implicitly available as output properties. Additionally, the Cluster resource produces the following output properties:

    DefaultTags Dictionary<string, object>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    Id string

    The provider-assigned unique ID for this managed resource.

    State string

    (string) State of the cluster.

    Url string
    DefaultTags map[string]interface{}

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    Id string

    The provider-assigned unique ID for this managed resource.

    State string

    (string) State of the cluster.

    Url string
    defaultTags Map<String,Object>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    id String

    The provider-assigned unique ID for this managed resource.

    state String

    (string) State of the cluster.

    url String
    defaultTags {[key: string]: any}

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    id string

    The provider-assigned unique ID for this managed resource.

    state string

    (string) State of the cluster.

    url string
    default_tags Mapping[str, Any]

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    id str

    The provider-assigned unique ID for this managed resource.

    state str

    (string) State of the cluster.

    url str
    defaultTags Map<Any>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    id String

    The provider-assigned unique ID for this managed resource.

    state String

    (string) State of the cluster.

    url String
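    For example, these extra properties can be exported as stack outputs; a Python sketch reusing the cluster arguments from the examples above:

    import pulumi
    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20)

    # Output properties produced by the resource, in addition to the echoed inputs.
    pulumi.export("cluster_state", shared_autoscaling.state)
    pulumi.export("cluster_url", shared_autoscaling.url)
    pulumi.export("cluster_default_tags", shared_autoscaling.default_tags)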

    Look up Existing Cluster Resource

    Get an existing Cluster resource’s state with the given name, ID, and optional extra properties used to qualify the lookup.

    public static get(name: string, id: Input<ID>, state?: ClusterState, opts?: CustomResourceOptions): Cluster
    @staticmethod
    def get(resource_name: str,
            id: str,
            opts: Optional[ResourceOptions] = None,
            apply_policy_default_values: Optional[bool] = None,
            autoscale: Optional[ClusterAutoscaleArgs] = None,
            autotermination_minutes: Optional[int] = None,
            aws_attributes: Optional[ClusterAwsAttributesArgs] = None,
            azure_attributes: Optional[ClusterAzureAttributesArgs] = None,
            cluster_id: Optional[str] = None,
            cluster_log_conf: Optional[ClusterClusterLogConfArgs] = None,
            cluster_mount_infos: Optional[Sequence[ClusterClusterMountInfoArgs]] = None,
            cluster_name: Optional[str] = None,
            custom_tags: Optional[Mapping[str, Any]] = None,
            data_security_mode: Optional[str] = None,
            default_tags: Optional[Mapping[str, Any]] = None,
            docker_image: Optional[ClusterDockerImageArgs] = None,
            driver_instance_pool_id: Optional[str] = None,
            driver_node_type_id: Optional[str] = None,
            enable_elastic_disk: Optional[bool] = None,
            enable_local_disk_encryption: Optional[bool] = None,
            gcp_attributes: Optional[ClusterGcpAttributesArgs] = None,
            idempotency_token: Optional[str] = None,
            init_scripts: Optional[Sequence[ClusterInitScriptArgs]] = None,
            instance_pool_id: Optional[str] = None,
            is_pinned: Optional[bool] = None,
            libraries: Optional[Sequence[ClusterLibraryArgs]] = None,
            node_type_id: Optional[str] = None,
            num_workers: Optional[int] = None,
            policy_id: Optional[str] = None,
            runtime_engine: Optional[str] = None,
            single_user_name: Optional[str] = None,
            spark_conf: Optional[Mapping[str, Any]] = None,
            spark_env_vars: Optional[Mapping[str, Any]] = None,
            spark_version: Optional[str] = None,
            ssh_public_keys: Optional[Sequence[str]] = None,
            state: Optional[str] = None,
            url: Optional[str] = None,
            workload_type: Optional[ClusterWorkloadTypeArgs] = None) -> Cluster
    func GetCluster(ctx *Context, name string, id IDInput, state *ClusterState, opts ...ResourceOption) (*Cluster, error)
    public static Cluster Get(string name, Input<string> id, ClusterState? state, CustomResourceOptions? opts = null)
    public static Cluster get(String name, Output<String> id, ClusterState state, CustomResourceOptions options)
    Resource lookup is not supported in YAML
    name
    The unique name of the resulting resource.
    id
    The unique provider ID of the resource to lookup.
    state
    Any extra arguments used during the lookup.
    opts
    A bag of options that control this resource's behavior.
    resource_name
    The unique name of the resulting resource.
    id
    The unique provider ID of the resource to lookup.
    name
    The unique name of the resulting resource.
    id
    The unique provider ID of the resource to lookup.
    state
    Any extra arguments used during the lookup.
    opts
    A bag of options that control this resource's behavior.
    name
    The unique name of the resulting resource.
    id
    The unique provider ID of the resource to lookup.
    state
    Any extra arguments used during the lookup.
    opts
    A bag of options that control this resource's behavior.
    name
    The unique name of the resulting resource.
    id
    The unique provider ID of the resource to lookup.
    state
    Any extra arguments used during the lookup.
    opts
    A bag of options that control this resource's behavior.
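    For example, in Python an already-running cluster can be referenced read-only by its ID (the ID below is a placeholder):

    import pulumi
    import pulumi_databricks as databricks

    # Look up an existing cluster by its ID instead of creating a new one.
    existing = databricks.Cluster.get("existing", "1234-567890-abcde123")  # placeholder cluster ID

    pulumi.export("existing_cluster_state", existing.state)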
    The following state arguments are supported:
    ApplyPolicyDefaultValues bool

    Whether to use policy default values for missing cluster attributes.

    Autoscale ClusterAutoscale
    AutoterminationMinutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    AwsAttributes ClusterAwsAttributes
    AzureAttributes ClusterAzureAttributes
    ClusterId string
    ClusterLogConf ClusterClusterLogConf
    ClusterMountInfos List<ClusterClusterMountInfo>
    ClusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    CustomTags Dictionary<string, object>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    DataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for a passthrough cluster and LEGACY_TABLE_ACL for a Table ACL cluster. If omitted, no security features are enabled. In the Databricks UI this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but use the values listed here.

    DefaultTags Dictionary<string, object>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    DockerImage ClusterDockerImage
    DriverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    DriverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    EnableElasticDisk bool

    If you don't want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster's Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance's local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    EnableLocalDiskEncryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    GcpAttributes ClusterGcpAttributes
    IdempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster, but it will return the existing running cluster's ID instead. If you specify the idempotency token, upon failure, you can retry until the request succeeds. Databricks platform guarantees to launch exactly one cluster with that idempotency token. This token should have at most 64 characters.
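    A minimal Python sketch; the token value is an arbitrary caller-chosen placeholder:

    import pulumi_databricks as databricks

    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)

    idempotent = databricks.Cluster("idempotent",
        cluster_name="Idempotent",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        num_workers=1,
        autotermination_minutes=20,
        idempotency_token="shared-autoscaling-v1")  # placeholder; at most 64 characters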

    InitScripts List<ClusterInitScript>
    InstancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    IsPinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you exceed that limit (the limit may change over time; check the Databricks documentation for the current value).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    Libraries List<ClusterLibrary>
    NodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    NumWorkers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    PolicyId string
    RuntimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    SingleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not a high-concurrency cluster).

    SparkConf Dictionary<string, object>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    SparkEnvVars Dictionary<string, object>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    SparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    SshPublicKeys List<string>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. You can specify up to 10 keys.

    State string

    (string) State of the cluster.

    Url string
    WorkloadType ClusterWorkloadType
    ApplyPolicyDefaultValues bool

    Whether to use policy default values for missing cluster attributes.

    Autoscale ClusterAutoscaleArgs
    AutoterminationMinutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    AwsAttributes ClusterAwsAttributesArgs
    AzureAttributes ClusterAzureAttributesArgs
    ClusterId string
    ClusterLogConf ClusterClusterLogConfArgs
    ClusterMountInfos []ClusterClusterMountInfoArgs
    ClusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    CustomTags map[string]interface{}

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    DataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for a passthrough cluster and LEGACY_TABLE_ACL for a Table ACL cluster. If omitted, no security features are enabled. In the Databricks UI this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but use the values listed here.

    DefaultTags map[string]interface{}

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    DockerImage ClusterDockerImageArgs
    DriverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If omitted while instance_pool_id is specified, the driver will be allocated from that pool.

    DriverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    EnableElasticDisk bool

    If you don't want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster's Spark workers. If a worker begins to run too low on disk, Databricks automatically attaches a new EBS volume to the worker before it runs out of disk space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance's local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    EnableLocalDiskEncryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    GcpAttributes ClusterGcpAttributesArgs
    IdempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request does not create a new cluster but returns the existing running cluster's ID instead. If you specify the idempotency token, you can retry upon failure until the request succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.
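
    A minimal TypeScript sketch of passing an idempotency token; the token value, runtime id, and node type below are placeholders:

    import * as databricks from "@pulumi/databricks";

    // Retries of this cluster creation reuse the same token, so at most one
    // cluster is ever launched for it. All literal values here are placeholders.
    const idempotent = new databricks.Cluster("idempotent", {
        clusterName: "Idempotent create",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 2,
        idempotencyToken: "my-team-etl-cluster",
    });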

    InitScripts []ClusterInitScriptArgs
    InstancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.
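
    A minimal TypeScript sketch of a pool-backed cluster; the pool IDs are placeholders for pools assumed to already exist, and node_type_id is omitted because the pool determines it:

    import * as databricks from "@pulumi/databricks";

    const pooled = new databricks.Cluster("pooled", {
        clusterName: "Pool-backed cluster",
        sparkVersion: "14.3.x-scala2.12",               // placeholder runtime id
        instancePoolId: "0123-456789-pool0-pool",       // placeholder worker pool id
        driverInstancePoolId: "0123-456789-pool1-pool", // placeholder driver pool id
        autoscale: { minWorkers: 1, maxWorkers: 10 },
    });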

    IsPinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you have more than that (the limit may change over time, so check the Databricks documentation for the current number).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    Libraries []ClusterLibraryArgs
    NodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    NumWorkers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    PolicyId string
    RuntimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    SingleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not high-concurrency clusters).
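
    A minimal TypeScript sketch of a single-user, Unity Catalog-capable cluster; the user name, runtime id, and node type are placeholders:

    import * as databricks from "@pulumi/databricks";

    const singleUser = new databricks.Cluster("singleUser", {
        clusterName: "Single-user cluster",
        sparkVersion: "14.3.x-scala2.12",       // placeholder runtime id
        nodeTypeId: "i3.xlarge",                // placeholder node type
        numWorkers: 1,
        dataSecurityMode: "SINGLE_USER",
        singleUserName: "someone@example.com",  // placeholder user
    });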

    SparkConf map[string]interface{}

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    SparkEnvVars map[string]interface{}

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.
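
    A minimal TypeScript sketch of setting environment variables on the driver and workers; the variable names and values are illustrative only:

    import * as databricks from "@pulumi/databricks";

    const withEnv = new databricks.Cluster("withEnv", {
        clusterName: "Env vars",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 2,
        sparkEnvVars: {
            MY_APP_ENV: "staging",
            JAVA_TOOL_OPTIONS: "-Duser.timezone=UTC",
        },
    });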

    SparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    SshPublicKeys []string

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.
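
    A minimal TypeScript sketch of attaching an SSH public key read from Pulumi config; the config key name is an assumption for this example:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";

    const config = new pulumi.Config();
    const withSsh = new databricks.Cluster("withSsh", {
        clusterName: "SSH access",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 2,
        sshPublicKeys: [config.require("sshPublicKey")],  // up to 10 keys allowed
    });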

    State string

    (string) State of the cluster.

    Url string
    WorkloadType ClusterWorkloadTypeArgs
    applyPolicyDefaultValues Boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscale
    autoterminationMinutes Integer

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    awsAttributes ClusterAwsAttributes
    azureAttributes ClusterAzureAttributes
    clusterId String
    clusterLogConf ClusterClusterLogConf
    clusterMountInfos List<ClusterClusterMountInfo>
    clusterName String

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags Map<String,Object>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.
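
    A minimal TypeScript sketch of adding custom tags that are propagated alongside default_tags; the tag names and values are illustrative only:

    import * as databricks from "@pulumi/databricks";

    const tagged = new databricks.Cluster("tagged", {
        clusterName: "Tagged cluster",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 2,
        customTags: {
            CostCenter: "analytics",        // applied to EC2 instances, EBS volumes, etc.
            Owner: "data-platform",
        },
    });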

    dataSecurityMode String

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but keep using the original terms here.

    defaultTags Map<String,Object>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    dockerImage ClusterDockerImage
    driverInstancePoolId String

    Similar to instance_pool_id, but for the driver node. If this is omitted and instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId String

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk Boolean

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run low on disk, Databricks automatically attaches a new EBS volume to it before it runs out of space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption Boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.
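
    A minimal TypeScript sketch of opting into local disk encryption (subject to the subscription availability noted above); the runtime id and node type are placeholders:

    import * as databricks from "@pulumi/databricks";

    const encrypted = new databricks.Cluster("encrypted", {
        clusterName: "Encrypted local disks",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "Standard_DS3_v2",      // placeholder Azure node type
        numWorkers: 2,
        enableLocalDiskEncryption: true,
    });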

    gcpAttributes ClusterGcpAttributes
    idempotencyToken String

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request does not create a new cluster but returns the existing running cluster's ID instead. If you specify the idempotency token, you can retry upon failure until the request succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    initScripts List<ClusterInitScript>
    instancePoolId String

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    isPinned Boolean

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you have more than that (the limit may change over time, so check the Databricks documentation for the current number).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries List<ClusterLibrary>
    nodeTypeId String

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers Integer

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policyId String
    runtimeEngine String

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.
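
    A minimal TypeScript sketch of requesting the Photon engine explicitly rather than letting it be inferred from the Spark version; the runtime id and node type are placeholders:

    import * as databricks from "@pulumi/databricks";

    const photon = new databricks.Cluster("photon", {
        clusterName: "Photon cluster",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 4,
        runtimeEngine: "PHOTON",
    });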

    singleUserName String

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not high-concurrency clusters).

    sparkConf Map<String,Object>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars Map<String,Object>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    sparkVersion String

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    sshPublicKeys List<String>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.

    state String

    (string) State of the cluster.

    url String
    workloadType ClusterWorkloadType
    applyPolicyDefaultValues boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscale
    autoterminationMinutes number

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    awsAttributes ClusterAwsAttributes
    azureAttributes ClusterAzureAttributes
    clusterId string
    clusterLogConf ClusterClusterLogConf
    clusterMountInfos ClusterClusterMountInfo[]
    clusterName string

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags {[key: string]: any}

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    dataSecurityMode string

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but keep using the original terms here.

    defaultTags {[key: string]: any}

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    dockerImage ClusterDockerImage
    driverInstancePoolId string

    Similar to instance_pool_id, but for the driver node. If this is omitted and instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId string

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk boolean

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run low on disk, Databricks automatically attaches a new EBS volume to it before it runs out of space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcpAttributes ClusterGcpAttributes
    idempotencyToken string

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request does not create a new cluster but returns the existing running cluster's ID instead. If you specify the idempotency token, you can retry upon failure until the request succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    initScripts ClusterInitScript[]
    instancePoolId string

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    isPinned boolean

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you have more than that (the limit may change over time, so check the Databricks documentation for the current number).
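
    A minimal TypeScript sketch of pinning a cluster (administrator rights required); the runtime id and node type are placeholders:

    import * as databricks from "@pulumi/databricks";

    const pinned = new databricks.Cluster("pinned", {
        clusterName: "Pinned cluster",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 1,
        isPinned: true,
    });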

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries ClusterLibrary[]
    nodeTypeId string

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers number

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.
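
    A minimal TypeScript sketch of a fixed-size cluster: with numWorkers set to 3 the cluster runs 3 executors plus 1 driver, i.e. 4 Spark nodes in total (the runtime id and node type are placeholders):

    import * as databricks from "@pulumi/databricks";

    const fixedSize = new databricks.Cluster("fixedSize", {
        clusterName: "Fixed size",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 3,                      // 3 executors + 1 driver
    });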

    policyId string
    runtimeEngine string

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    singleUserName string

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not high-concurrency clusters).

    sparkConf {[key: string]: any}

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars {[key: string]: any}

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    sparkVersion string

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    sshPublicKeys string[]

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.

    state string

    (string) State of the cluster.

    url string
    workloadType ClusterWorkloadType
    apply_policy_default_values bool

    Whether to use policy default values for missing cluster attributes.

    autoscale ClusterAutoscaleArgs
    autotermination_minutes int

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.

    aws_attributes ClusterAwsAttributesArgs
    azure_attributes ClusterAzureAttributesArgs
    cluster_id str
    cluster_log_conf ClusterClusterLogConfArgs
    cluster_mount_infos Sequence[ClusterClusterMountInfoArgs]
    cluster_name str

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    custom_tags Mapping[str, Any]

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    data_security_mode str

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but keep using the original terms here.

    default_tags Mapping[str, Any]

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    docker_image ClusterDockerImageArgs
    driver_instance_pool_id str

    Similar to instance_pool_id, but for the driver node. If this is omitted and instance_pool_id is specified, the driver will be allocated from that pool.

    driver_node_type_id str

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enable_elastic_disk bool

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run low on disk, Databricks automatically attaches a new EBS volume to it before it runs out of space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enable_local_disk_encryption bool

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcp_attributes ClusterGcpAttributesArgs
    idempotency_token str

    An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request does not create a new cluster but returns the existing running cluster's ID instead. If you specify the idempotency token, you can retry upon failure until the request succeeds; the Databricks platform guarantees that exactly one cluster is launched with that idempotency token. The token should have at most 64 characters.

    init_scripts Sequence[ClusterInitScriptArgs]
    instance_pool_id str

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.

    is_pinned bool

    Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you have more than that (the limit may change over time, so check the Databricks documentation for the current number).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
    			SparkVersion:           pulumi.String(latestLts.Id),
    			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries Sequence[ClusterLibraryArgs]
    node_type_id str

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    num_workers int

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policy_id str
    runtime_engine str

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    single_user_name str

    The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER, or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not high-concurrency clusters).

    spark_conf Mapping[str, Any]

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    spark_env_vars Mapping[str, Any]

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.

    spark_version str

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    ssh_public_keys Sequence[str]

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. You can specify up to 10 keys.

    state str

    (string) State of the cluster.

    url str
    workload_type ClusterWorkloadTypeArgs
    applyPolicyDefaultValues Boolean

    Whether to use policy default values for missing cluster attributes.

    autoscale Property Map
    autoterminationMinutes Number

    Automatically terminate the cluster after being inactive for this time in minutes. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination. Defaults to 60. We highly recommend having this setting present for Interactive/BI clusters.
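
    A minimal TypeScript sketch of explicitly disabling automatic termination by setting the value to 0 (the runtime id and node type are placeholders):

    import * as databricks from "@pulumi/databricks";

    const alwaysOn = new databricks.Cluster("alwaysOn", {
        clusterName: "Always on",
        sparkVersion: "14.3.x-scala2.12",   // placeholder runtime id
        nodeTypeId: "i3.xlarge",            // placeholder node type
        numWorkers: 1,
        autoterminationMinutes: 0,          // 0 disables autotermination; omitting the field defaults to 60
    });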

    awsAttributes Property Map
    azureAttributes Property Map
    clusterId String
    clusterLogConf Property Map
    clusterMountInfos List<Property Map>
    clusterName String

    Cluster name, which doesn’t have to be unique. If not specified at creation, the cluster name will be an empty string.

    customTags Map<Any>

    Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS EC2 instances and EBS volumes) with these tags in addition to default_tags. If a custom cluster tag has the same name as a default cluster tag, the custom tag is prefixed with an x_ when it is propagated.

    dataSecurityMode String

    Select the security features of the cluster. Unity Catalog requires SINGLE_USER or USER_ISOLATION mode. Use LEGACY_PASSTHROUGH for passthrough clusters and LEGACY_TABLE_ACL for Table ACL clusters. If omitted, no security features are enabled. In the Databricks UI, this field has recently been renamed Access Mode and USER_ISOLATION has been renamed Shared, but keep using the original terms here.

    defaultTags Map<Any>

    (map) Tags that are added by Databricks by default, regardless of any custom_tags that may have been added. These include: Vendor: Databricks, Creator: <username_of_creator>, ClusterName: <name_of_cluster>, ClusterId: <id_of_cluster>, Name: , and any workspace and pool tags.

    dockerImage Property Map
    driverInstancePoolId String

    Similar to instance_pool_id, but for the driver node. If this is omitted and instance_pool_id is specified, the driver will be allocated from that pool.

    driverNodeTypeId String

    The node type of the Spark driver. This field is optional; if unset, the API sets the driver node type to the same value as node_type_id defined above.

    enableElasticDisk Boolean

    If you don’t want to allocate a fixed number of EBS volumes at cluster creation time, use autoscaling local storage. With autoscaling local storage, Databricks monitors the amount of free disk space available on your cluster’s Spark workers. If a worker begins to run low on disk, Databricks automatically attaches a new EBS volume to it before it runs out of space. EBS volumes are attached up to a limit of 5 TB of total disk space per instance (including the instance’s local storage). To scale down EBS usage, make sure the autotermination_minutes and autoscale attributes are set. More documentation is available on the cluster configuration page.

    enableLocalDiskEncryption Boolean

    Some instance types you use to run clusters may have locally attached disks. Databricks may store shuffle data or temporary data on these locally attached disks. To ensure that all data at rest is encrypted for all storage types, including shuffle data stored temporarily on your cluster’s local disks, you can enable local disk encryption. When local disk encryption is enabled, Databricks generates an encryption key locally unique to each cluster node and uses it to encrypt all data stored on local disks. The scope of the key is local to each cluster node and is destroyed along with the cluster node itself. During its lifetime, the key resides in memory for encryption and decryption and is stored encrypted on the disk. Your workloads may run more slowly because of the performance impact of reading and writing encrypted data to and from local volumes. This feature is not available for all Azure Databricks subscriptions. Contact your Microsoft or Databricks account representative to request access.

    gcpAttributes Property Map
    idempotencyToken String

An optional token to guarantee the idempotency of cluster creation requests. If an active cluster with the provided token already exists, the request will not create a new cluster, but it will return the existing running cluster's ID instead. If you specify the idempotency token, upon failure, you can retry until the request succeeds. The Databricks platform guarantees to launch exactly one cluster with that idempotency token. This token should have at most 64 characters.

    initScripts List<Property Map>
    instancePoolId String

    To reduce cluster start time, you can attach a cluster to a predefined pool of idle instances. When attached to a pool, a cluster allocates its driver and worker nodes from the pool. If the pool does not have sufficient idle resources to accommodate the cluster’s request, it expands by allocating new instances from the instance provider. When an attached cluster changes its state to TERMINATED, the instances it used are returned to the pool and reused by a different cluster.
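
    A brief sketch of attaching a cluster to a pool; the pool ID and runtime id are placeholders (in practice the ID would usually come from an instance pool resource defined elsewhere). Since a pool is used, node_type_id is not needed:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: allocate driver and worker nodes from an existing pool.
    // The pool id and runtime id are illustrative placeholders.
    const pooled = new databricks.Cluster("pooled", {
        clusterName: "Pooled Cluster",
        sparkVersion: "13.3.x-scala2.12",
        instancePoolId: "1234-567890-pool-abcdefgh",
        numWorkers: 2,
        autoterminationMinutes: 20,
    });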

    isPinned Boolean

Boolean value specifying whether the cluster is pinned (not pinned by default). You must be a Databricks administrator to use this. The number of pinned clusters is limited to 100, so apply may fail if you have more than that (this limit may change over time, so check the Databricks documentation for the current number).

    The following example demonstrates how to create an autoscaling cluster with Delta Cache enabled:

    import * as pulumi from "@pulumi/pulumi";
    import * as databricks from "@pulumi/databricks";
    

    const smallest = databricks.getNodeType({
        localDisk: true,
    });
    const latestLts = databricks.getSparkVersion({
        longTermSupport: true,
    });
    const sharedAutoscaling = new databricks.Cluster("sharedAutoscaling", {
        clusterName: "Shared Autoscaling",
        sparkVersion: latestLts.then(latestLts => latestLts.id),
        nodeTypeId: smallest.then(smallest => smallest.id),
        autoterminationMinutes: 20,
        autoscale: {
            minWorkers: 1,
            maxWorkers: 50,
        },
        sparkConf: {
            "spark.databricks.io.cache.enabled": true,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        },
    });

    import pulumi
    import pulumi_databricks as databricks
    
    smallest = databricks.get_node_type(local_disk=True)
    latest_lts = databricks.get_spark_version(long_term_support=True)
    shared_autoscaling = databricks.Cluster("sharedAutoscaling",
        cluster_name="Shared Autoscaling",
        spark_version=latest_lts.id,
        node_type_id=smallest.id,
        autotermination_minutes=20,
        autoscale=databricks.ClusterAutoscaleArgs(
            min_workers=1,
            max_workers=50,
        ),
        spark_conf={
            "spark.databricks.io.cache.enabled": True,
            "spark.databricks.io.cache.maxDiskUsage": "50g",
            "spark.databricks.io.cache.maxMetaDataCache": "1g",
        })
    
    using System.Collections.Generic;
    using System.Linq;
    using Pulumi;
    using Databricks = Pulumi.Databricks;
    
    return await Deployment.RunAsync(() => 
    {
        var smallest = Databricks.GetNodeType.Invoke(new()
        {
            LocalDisk = true,
        });
    
        var latestLts = Databricks.GetSparkVersion.Invoke(new()
        {
            LongTermSupport = true,
        });
    
        var sharedAutoscaling = new Databricks.Cluster("sharedAutoscaling", new()
        {
            ClusterName = "Shared Autoscaling",
            SparkVersion = latestLts.Apply(getSparkVersionResult => getSparkVersionResult.Id),
            NodeTypeId = smallest.Apply(getNodeTypeResult => getNodeTypeResult.Id),
            AutoterminationMinutes = 20,
            Autoscale = new Databricks.Inputs.ClusterAutoscaleArgs
            {
                MinWorkers = 1,
                MaxWorkers = 50,
            },
            SparkConf = 
            {
                { "spark.databricks.io.cache.enabled", true },
                { "spark.databricks.io.cache.maxDiskUsage", "50g" },
                { "spark.databricks.io.cache.maxMetaDataCache", "1g" },
            },
        });
    
    });
    
    package main
    
    import (
    	"github.com/pulumi/pulumi-databricks/sdk/go/databricks"
    	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    )
    
    func main() {
    	pulumi.Run(func(ctx *pulumi.Context) error {
    		smallest, err := databricks.GetNodeType(ctx, &databricks.GetNodeTypeArgs{
    			LocalDisk: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		latestLts, err := databricks.GetSparkVersion(ctx, &databricks.GetSparkVersionArgs{
    			LongTermSupport: pulumi.BoolRef(true),
    		}, nil)
    		if err != nil {
    			return err
    		}
    		_, err = databricks.NewCluster(ctx, "sharedAutoscaling", &databricks.ClusterArgs{
    			ClusterName:            pulumi.String("Shared Autoscaling"),
			SparkVersion:           pulumi.String(latestLts.Id),
			NodeTypeId:             pulumi.String(smallest.Id),
    			AutoterminationMinutes: pulumi.Int(20),
    			Autoscale: &databricks.ClusterAutoscaleArgs{
    				MinWorkers: pulumi.Int(1),
    				MaxWorkers: pulumi.Int(50),
    			},
    			SparkConf: pulumi.Map{
    				"spark.databricks.io.cache.enabled":          pulumi.Any(true),
    				"spark.databricks.io.cache.maxDiskUsage":     pulumi.Any("50g"),
    				"spark.databricks.io.cache.maxMetaDataCache": pulumi.Any("1g"),
    			},
    		})
    		if err != nil {
    			return err
    		}
    		return nil
    	})
    }
    
    package generated_program;
    
    import com.pulumi.Context;
    import com.pulumi.Pulumi;
    import com.pulumi.core.Output;
    import com.pulumi.databricks.DatabricksFunctions;
    import com.pulumi.databricks.inputs.GetNodeTypeArgs;
    import com.pulumi.databricks.inputs.GetSparkVersionArgs;
    import com.pulumi.databricks.Cluster;
    import com.pulumi.databricks.ClusterArgs;
    import com.pulumi.databricks.inputs.ClusterAutoscaleArgs;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.Map;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    
    public class App {
        public static void main(String[] args) {
            Pulumi.run(App::stack);
        }
    
        public static void stack(Context ctx) {
            final var smallest = DatabricksFunctions.getNodeType(GetNodeTypeArgs.builder()
                .localDisk(true)
                .build());
    
            final var latestLts = DatabricksFunctions.getSparkVersion(GetSparkVersionArgs.builder()
                .longTermSupport(true)
                .build());
    
            var sharedAutoscaling = new Cluster("sharedAutoscaling", ClusterArgs.builder()        
                .clusterName("Shared Autoscaling")
                .sparkVersion(latestLts.applyValue(getSparkVersionResult -> getSparkVersionResult.id()))
                .nodeTypeId(smallest.applyValue(getNodeTypeResult -> getNodeTypeResult.id()))
                .autoterminationMinutes(20)
                .autoscale(ClusterAutoscaleArgs.builder()
                    .minWorkers(1)
                    .maxWorkers(50)
                    .build())
                .sparkConf(Map.ofEntries(
                    Map.entry("spark.databricks.io.cache.enabled", true),
                    Map.entry("spark.databricks.io.cache.maxDiskUsage", "50g"),
                    Map.entry("spark.databricks.io.cache.maxMetaDataCache", "1g")
                ))
                .build());
    
        }
    }
    
    resources:
      sharedAutoscaling:
        type: databricks:Cluster
        properties:
          clusterName: Shared Autoscaling
          sparkVersion: ${latestLts.id}
          nodeTypeId: ${smallest.id}
          autoterminationMinutes: 20
          autoscale:
            minWorkers: 1
            maxWorkers: 50
          sparkConf:
            spark.databricks.io.cache.enabled: true
            spark.databricks.io.cache.maxDiskUsage: 50g
            spark.databricks.io.cache.maxMetaDataCache: 1g
    variables:
      smallest:
        fn::invoke:
          Function: databricks:getNodeType
          Arguments:
            localDisk: true
      latestLts:
        fn::invoke:
          Function: databricks:getSparkVersion
          Arguments:
            longTermSupport: true
    
    libraries List<Property Map>
    nodeTypeId String

    Any supported databricks.getNodeType id. If instance_pool_id is specified, this field is not needed.

    numWorkers Number

    Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes.

    policyId String
    runtimeEngine String

    The type of runtime engine to use. If not specified, the runtime engine type is inferred based on the spark_version value. Allowed values include: PHOTON, STANDARD.

    singleUserName String

The optional user name of the user to assign to an interactive cluster. This field is required when data_security_mode is set to SINGLE_USER or when using AAD Passthrough for Azure Data Lake Storage (ADLS) with a single-user cluster (i.e., not high-concurrency clusters).

    sparkConf Map<Any>

    Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration.

    sparkEnvVars Map<Any>

    Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers.
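
    A small sketch showing environment variables exported on the driver and every worker at launch; the variable names, values, runtime id, and node type are illustrative placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: these variables are exported (X='Y') when the driver and workers launch.
    // Names, values, runtime id, and node type are illustrative placeholders.
    const withEnvVars = new databricks.Cluster("withEnvVars", {
        clusterName: "With Env Vars",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        sparkEnvVars: {
            MY_APP_ENV: "staging",
            MY_APP_LOG_LEVEL: "INFO",
        },
    });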

    sparkVersion String

    Runtime version of the cluster. Any supported databricks.getSparkVersion id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control.

    sshPublicKeys List<String>

    SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. You can specify up to 10 keys.

    state String

    (string) State of the cluster.

    url String
    workloadType Property Map

    Supporting Types

    ClusterAutoscale, ClusterAutoscaleArgs

    maxWorkers Integer
    minWorkers Integer
    maxWorkers number
    minWorkers number
    maxWorkers Number
    minWorkers Number

    ClusterAwsAttributes, ClusterAwsAttributesArgs

    ClusterAzureAttributes, ClusterAzureAttributesArgs

    ClusterClusterLogConf, ClusterClusterLogConfArgs

    ClusterClusterLogConfDbfs, ClusterClusterLogConfDbfsArgs

    ClusterClusterLogConfS3, ClusterClusterLogConfS3Args

    Destination string
    CannedAcl string
    EnableEncryption bool
    EncryptionType string
    Endpoint string
    KmsKey string
    Region string
    Destination string
    CannedAcl string
    EnableEncryption bool
    EncryptionType string
    Endpoint string
    KmsKey string
    Region string
    destination String
    cannedAcl String
    enableEncryption Boolean
    encryptionType String
    endpoint String
    kmsKey String
    region String
    destination string
    cannedAcl string
    enableEncryption boolean
    encryptionType string
    endpoint string
    kmsKey string
    region string
    destination String
    cannedAcl String
    enableEncryption Boolean
    encryptionType String
    endpoint String
    kmsKey String
    region String
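
    A minimal sketch of delivering cluster logs to S3, using only the fields listed above (the s3 block name mirrors the ClusterClusterLogConfS3 type); the bucket, region, runtime id, and node type are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: deliver cluster logs to an S3 prefix.
    // Bucket, region, runtime id, and node type are illustrative placeholders.
    const logged = new databricks.Cluster("logged", {
        clusterName: "Logged Cluster",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        clusterLogConf: {
            s3: {
                destination: "s3://my-log-bucket/cluster-logs",
                region: "us-east-1",
                enableEncryption: true,
            },
        },
    });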

    ClusterClusterMountInfo, ClusterClusterMountInfoArgs

    ClusterClusterMountInfoNetworkFilesystemInfo, ClusterClusterMountInfoNetworkFilesystemInfoArgs

    ClusterDockerImage, ClusterDockerImageArgs

    ClusterDockerImageBasicAuth, ClusterDockerImageBasicAuthArgs

    Password string
    Username string
    Password string
    Username string
    password String
    username String
    password string
    username string
    password String
    username String
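
    A sketch of running the cluster from a custom container image with registry credentials. Only the basic auth fields are listed above; the url field name on the docker image block is an assumption here, and the registry, image, and credentials are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: custom container image with registry credentials.
    // The `url` field name is assumed; the image and credentials are placeholders.
    const containerized = new databricks.Cluster("containerized", {
        clusterName: "Containerized",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        dockerImage: {
            url: "my-registry.example.com/my-image:latest",
            basicAuth: {
                username: "registry-user",
                password: "registry-password", // in practice, use a Pulumi secret
            },
        },
    });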

    ClusterGcpAttributes, ClusterGcpAttributesArgs

    Availability string
    BootDiskSize int
    GoogleServiceAccount string
    LocalSsdCount int
    UsePreemptibleExecutors bool

    Deprecated:

    Please use 'availability' instead.

    ZoneId string
    Availability string
    BootDiskSize int
    GoogleServiceAccount string
    LocalSsdCount int
    UsePreemptibleExecutors bool

    Deprecated:

    Please use 'availability' instead.

    ZoneId string
    availability String
    bootDiskSize Integer
    googleServiceAccount String
    localSsdCount Integer
    usePreemptibleExecutors Boolean

    Deprecated:

    Please use 'availability' instead.

    zoneId String
    availability string
    bootDiskSize number
    googleServiceAccount string
    localSsdCount number
    usePreemptibleExecutors boolean

    Deprecated:

    Please use 'availability' instead.

    zoneId string
    availability str
    boot_disk_size int
    google_service_account str
    local_ssd_count int
    use_preemptible_executors bool

    Deprecated:

    Please use 'availability' instead.

    zone_id str
    availability String
    bootDiskSize Number
    googleServiceAccount String
    localSsdCount Number
    usePreemptibleExecutors Boolean

    Deprecated:

    Please use 'availability' instead.

    zoneId String
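
    A brief sketch for GCP workspaces using a subset of the fields listed above; the zone, boot disk size, service account, runtime id, and node type are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: GCP-specific attributes on a cluster.
    // Zone, disk size, service account, runtime id, and node type are placeholders.
    const gcpCluster = new databricks.Cluster("gcpCluster", {
        clusterName: "GCP Cluster",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "n2-standard-4",
        numWorkers: 1,
        gcpAttributes: {
            zoneId: "us-central1-a",
            bootDiskSize: 100,
            googleServiceAccount: "my-sa@my-project.iam.gserviceaccount.com",
        },
    });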

    ClusterInitScript, ClusterInitScriptArgs

    abfss Property Map
    dbfs Property Map

    Deprecated:

    For init scripts use 'volumes', 'workspace' or cloud storage location instead of 'dbfs'.

    file Property Map
    gcs Property Map
    s3 Property Map
    volumes Property Map
    workspace Property Map

    ClusterInitScriptAbfss, ClusterInitScriptAbfssArgs

    ClusterInitScriptDbfs, ClusterInitScriptDbfsArgs

    ClusterInitScriptFile, ClusterInitScriptFileArgs

    ClusterInitScriptGcs, ClusterInitScriptGcsArgs

    ClusterInitScriptS3, ClusterInitScriptS3Args

    Destination string
    CannedAcl string
    EnableEncryption bool
    EncryptionType string
    Endpoint string
    KmsKey string
    Region string
    Destination string
    CannedAcl string
    EnableEncryption bool
    EncryptionType string
    Endpoint string
    KmsKey string
    Region string
    destination String
    cannedAcl String
    enableEncryption Boolean
    encryptionType String
    endpoint String
    kmsKey String
    region String
    destination string
    cannedAcl string
    enableEncryption boolean
    encryptionType string
    endpoint string
    kmsKey string
    region string
    destination String
    cannedAcl String
    enableEncryption Boolean
    encryptionType String
    endpoint String
    kmsKey String
    region String
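
    A minimal sketch of attaching an init script from S3, using the destination and region fields listed above; the bucket, script path, runtime id, and node type are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: run an init script from S3 on every node at startup.
    // Bucket, script path, runtime id, and node type are illustrative placeholders.
    const withInitScript = new databricks.Cluster("withInitScript", {
        clusterName: "With Init Script",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        initScripts: [{
            s3: {
                destination: "s3://my-bucket/scripts/setup.sh",
                region: "us-east-1",
            },
        }],
    });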

    ClusterInitScriptVolumes, ClusterInitScriptVolumesArgs

    ClusterInitScriptWorkspace, ClusterInitScriptWorkspaceArgs

    ClusterLibrary, ClusterLibraryArgs

    ClusterLibraryCran, ClusterLibraryCranArgs

    Package string
    Repo string
    Package string
    Repo string
    package_ String
    repo String
    package string
    repo string
    package str
    repo str
    package String
    repo String

    ClusterLibraryMaven, ClusterLibraryMavenArgs

    Coordinates string
    Exclusions List<string>
    Repo string
    Coordinates string
    Exclusions []string
    Repo string
    coordinates String
    exclusions List<String>
    repo String
    coordinates string
    exclusions string[]
    repo string
    coordinates str
    exclusions Sequence[str]
    repo str
    coordinates String
    exclusions List<String>
    repo String

    ClusterLibraryPypi, ClusterLibraryPypiArgs

    Package string
    Repo string
    Package string
    Repo string
    package_ String
    repo String
    package string
    repo string
    package str
    repo str
    package String
    repo String
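
    A short sketch of installing libraries on the cluster, combining the pypi and maven blocks whose fields are listed above; the package name, Maven coordinate, runtime id, and node type are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: install a PyPI package and a Maven coordinate on the cluster.
    // Package names, runtime id, and node type are illustrative placeholders.
    const withLibraries = new databricks.Cluster("withLibraries", {
        clusterName: "With Libraries",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        libraries: [
            { pypi: { package: "fbprophet" } },
            { maven: { coordinates: "com.example:my-lib:1.0.0" } },
        ],
    });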

    ClusterWorkloadType, ClusterWorkloadTypeArgs

    ClusterWorkloadTypeClients, ClusterWorkloadTypeClientsArgs

    Jobs bool
    Notebooks bool
    Jobs bool
    Notebooks bool
    jobs Boolean
    notebooks Boolean
    jobs boolean
    notebooks boolean
    jobs bool
    notebooks bool
    jobs Boolean
    notebooks Boolean
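
    A small sketch restricting a cluster to notebook workloads only, using the clients block fields listed above; the runtime id and node type are placeholders:

    import * as databricks from "@pulumi/databricks";
    
    // Sketch: allow notebooks but disallow jobs on this cluster.
    // Runtime id and node type are illustrative placeholders.
    const notebooksOnly = new databricks.Cluster("notebooksOnly", {
        clusterName: "Notebooks Only",
        sparkVersion: "13.3.x-scala2.12",
        nodeTypeId: "i3.xlarge",
        numWorkers: 1,
        workloadType: {
            clients: {
                jobs: false,
                notebooks: true,
            },
        },
    });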

    Package Details

    Repository
    databricks pulumi/pulumi-databricks
    License
    Apache-2.0
    Notes

    This Pulumi package is based on the databricks Terraform Provider.
