This page documents the language specification for the gcp package. If you're looking for help working with the inputs, outputs, or functions of gcp resources in a Pulumi program, please see the resource documentation for examples and API reference.

dataproc

This provider is a derived work of the Terraform Provider distributed under MPL 2.0. If you encounter a bug or missing feature, first check the pulumi/pulumi-gcp repo; however, if that doesn’t turn up anything, please consult the source terraform-providers/terraform-provider-google repo.

class pulumi_gcp.dataproc.AutoscalingPolicy(resource_name, opts=None, basic_algorithm=None, location=None, policy_id=None, project=None, secondary_worker_config=None, worker_config=None, __props__=None, __name__=None, __opts__=None)

Describes an autoscaling policy for the Dataproc cluster autoscaler.

import pulumi
import pulumi_gcp as gcp

asp = gcp.dataproc.AutoscalingPolicy("asp",
    policy_id="dataproc-policy",
    location="us-central1",
    worker_config={
        "max_instances": 3,
    },
    basic_algorithm={
        "yarn_config": {
            "gracefulDecommissionTimeout": "30s",
            "scaleUpFactor": 0.5,
            "scaleDownFactor": 0.5,
        },
    })

basic = gcp.dataproc.Cluster("basic",
    region="us-central1",
    cluster_config={
        "autoscaling_config": {
            "policyUri": asp.name,
        },
    })
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • basic_algorithm (pulumi.Input[dict]) – Basic algorithm for autoscaling. Structure is documented below.

  • location (pulumi.Input[str]) – The location where the autoscaling policy should reside. The default value is global.

  • policy_id (pulumi.Input[str]) – The policy id. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with underscore or hyphen. Must consist of between 3 and 50 characters.

  • project (pulumi.Input[str]) – The ID of the project in which the resource belongs. If it is not provided, the provider project is used.

  • secondary_worker_config (pulumi.Input[dict]) – Describes how the autoscaler will operate for secondary workers. Structure is documented below.

  • worker_config (pulumi.Input[dict]) – Describes how the autoscaler will operate for primary workers. Structure is documented below.

The basic_algorithm object supports the following:

  • cooldownPeriod (pulumi.Input[str]) - Duration between scaling events. A scaling period starts after the update operation from the previous event has completed. Bounds: [2m, 1d]. Default: 2m.

  • yarnConfig (pulumi.Input[dict]) - YARN autoscaling configuration. Structure is documented below.

    • gracefulDecommissionTimeout (pulumi.Input[str]) - Timeout for YARN graceful decommissioning of Node Managers. Specifies the duration to wait for jobs to complete before forcefully removing workers (and potentially interrupting jobs). Only applicable to downscaling operations. Bounds: [0s, 1d].

    • scaleDownFactor (pulumi.Input[float]) - Fraction of average pending memory in the last cooldown period for which to remove workers. A scale-down factor of 1 will result in scaling down so that there is no available memory remaining after the update (more aggressive scaling). A scale-down factor of 0 disables removing workers, which can be beneficial for autoscaling a single job. Bounds: [0.0, 1.0].

    • scaleDownMinWorkerFraction (pulumi.Input[float]) - Minimum scale-down threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2 worker scale-down for the cluster to scale. A threshold of 0 means the autoscaler will scale down on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

    • scaleUpFactor (pulumi.Input[float]) - Fraction of average pending memory in the last cooldown period for which to add workers. A scale-up factor of 1.0 will result in scaling up so that there is no pending memory remaining after the update (more aggressive scaling). A scale-up factor closer to 0 will result in a smaller magnitude of scaling up (less aggressive scaling). Bounds: [0.0, 1.0].

    • scaleUpMinWorkerFraction (pulumi.Input[float]) - Minimum scale-up threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-up for the cluster to scale. A threshold of 0 means the autoscaler will scale up on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

The secondary_worker_config object supports the following:

  • max_instances (pulumi.Input[float]) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (pulumi.Input[float]) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (pulumi.Input[float]) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.

The worker_config object supports the following:

  • max_instances (pulumi.Input[float]) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (pulumi.Input[float]) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (pulumi.Input[float]) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.
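
As a concrete illustration of the weight semantics above, here is a minimal sketch (the policy name and bounds are hypothetical) of a policy that aims for roughly two primary workers per secondary worker:

import pulumi_gcp as gcp

# Hypothetical policy: primary workers carry twice the weight of secondary
# workers, so the autoscaler targets ~2 primary workers per secondary
# worker, within the configured size bounds of each group.
weighted = gcp.dataproc.AutoscalingPolicy("weighted",
    policy_id="weighted-policy",
    location="us-central1",
    basic_algorithm={
        "yarn_config": {
            "gracefulDecommissionTimeout": "30s",
            "scaleUpFactor": 0.5,
            "scaleDownFactor": 0.5,
        },
    },
    worker_config={
        "min_instances": 2,
        "max_instances": 10,
        "weight": 2,
    },
    secondary_worker_config={
        "max_instances": 5,
        "weight": 1,
    })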

basic_algorithm: pulumi.Output[dict] = None

Basic algorithm for autoscaling. Structure is documented below.

  • cooldownPeriod (str) - Duration between scaling events. A scaling period starts after the update operation from the previous event has completed. Bounds: [2m, 1d]. Default: 2m.

  • yarnConfig (dict) - YARN autoscaling configuration. Structure is documented below.

    • gracefulDecommissionTimeout (str) - Timeout for YARN graceful decommissioning of Node Managers. Specifies the duration to wait for jobs to complete before forcefully removing workers (and potentially interrupting jobs). Only applicable to downscaling operations. Bounds: [0s, 1d].

    • scaleDownFactor (float) - Fraction of average pending memory in the last cooldown period for which to remove workers. A scale-down factor of 1 will result in scaling down so that there is no available memory remaining after the update (more aggressive scaling). A scale-down factor of 0 disables removing workers, which can be beneficial for autoscaling a single job. Bounds: [0.0, 1.0].

    • scaleDownMinWorkerFraction (float) - Minimum scale-down threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2 worker scale-down for the cluster to scale. A threshold of 0 means the autoscaler will scale down on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

    • scaleUpFactor (float) - Fraction of average pending memory in the last cooldown period for which to add workers. A scale-up factor of 1.0 will result in scaling up so that there is no pending memory remaining after the update (more aggressive scaling). A scale-up factor closer to 0 will result in a smaller magnitude of scaling up (less aggressive scaling). Bounds: [0.0, 1.0].

    • scaleUpMinWorkerFraction (float) - Minimum scale-up threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-up for the cluster to scale. A threshold of 0 means the autoscaler will scale up on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

location: pulumi.Output[str] = None

The location where the autoscaling policy should reside. The default value is global.

name: pulumi.Output[str] = None

The “resource name” of the autoscaling policy.

policy_id: pulumi.Output[str] = None

The policy id. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with underscore or hyphen. Must consist of between 3 and 50 characters.

project: pulumi.Output[str] = None

The ID of the project in which the resource belongs. If it is not provided, the provider project is used.

secondary_worker_config: pulumi.Output[dict] = None

Describes how the autoscaler will operate for secondary workers. Structure is documented below.

  • max_instances (float) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (float) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (float) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.

worker_config: pulumi.Output[dict] = None

Describes how the autoscaler will operate for primary workers. Structure is documented below.

  • max_instances (float) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (float) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (float) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.
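
The attributes above are pulumi.Output values that resolve once the policy has been created. A minimal sketch of consuming one, assuming the asp policy from the example above:

import pulumi

# Export the server-assigned resource name, e.g. for use as a cluster's
# autoscaling policyUri.
pulumi.export("policy_name", asp.name)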

static get(resource_name, id, opts=None, basic_algorithm=None, location=None, name=None, policy_id=None, project=None, secondary_worker_config=None, worker_config=None)

Get an existing AutoscalingPolicy resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • basic_algorithm (pulumi.Input[dict]) – Basic algorithm for autoscaling. Structure is documented below.

  • location (pulumi.Input[str]) – The location where the autoscaling policy should reside. The default value is global.

  • name (pulumi.Input[str]) – The “resource name” of the autoscaling policy.

  • policy_id (pulumi.Input[str]) – The policy id. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with underscore or hyphen. Must consist of between 3 and 50 characters.

  • project (pulumi.Input[str]) – The ID of the project in which the resource belongs. If it is not provided, the provider project is used.

  • secondary_worker_config (pulumi.Input[dict]) – Describes how the autoscaler will operate for secondary workers. Structure is documented below.

  • worker_config (pulumi.Input[dict]) – Describes how the autoscaler will operate for primary workers. Structure is documented below.

The basic_algorithm object supports the following:

  • cooldownPeriod (pulumi.Input[str]) - Duration between scaling events. A scaling period starts after the update operation from the previous event has completed. Bounds: [2m, 1d]. Default: 2m.

  • yarnConfig (pulumi.Input[dict]) - YARN autoscaling configuration. Structure is documented below.

    • gracefulDecommissionTimeout (pulumi.Input[str]) - Timeout for YARN graceful decommissioning of Node Managers. Specifies the duration to wait for jobs to complete before forcefully removing workers (and potentially interrupting jobs). Only applicable to downscaling operations. Bounds: [0s, 1d].

    • scaleDownFactor (pulumi.Input[float]) - Fraction of average pending memory in the last cooldown period for which to remove workers. A scale-down factor of 1 will result in scaling down so that there is no available memory remaining after the update (more aggressive scaling). A scale-down factor of 0 disables removing workers, which can be beneficial for autoscaling a single job. Bounds: [0.0, 1.0].

    • scaleDownMinWorkerFraction (pulumi.Input[float]) - Minimum scale-down threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2 worker scale-down for the cluster to scale. A threshold of 0 means the autoscaler will scale down on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

    • scaleUpFactor (pulumi.Input[float]) - Fraction of average pending memory in the last cooldown period for which to add workers. A scale-up factor of 1.0 will result in scaling up so that there is no pending memory remaining after the update (more aggressive scaling). A scale-up factor closer to 0 will result in a smaller magnitude of scaling up (less aggressive scaling). Bounds: [0.0, 1.0].

    • scaleUpMinWorkerFraction (pulumi.Input[float]) - Minimum scale-up threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-up for the cluster to scale. A threshold of 0 means the autoscaler will scale up on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.

The secondary_worker_config object supports the following:

  • max_instances (pulumi.Input[float]) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (pulumi.Input[float]) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (pulumi.Input[float]) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.

The worker_config object supports the following:

  • max_instances (pulumi.Input[float]) - Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum secondary instances is set. Bounds: [minInstances, ). Defaults to 0.

  • minInstances (pulumi.Input[float]) - Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.

  • weight (pulumi.Input[float]) - Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.
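
A minimal sketch of adopting an existing policy's state; the project, location, and policy id in the ID string are hypothetical placeholders:

import pulumi
import pulumi_gcp as gcp

# Look up an existing policy by its fully qualified ID.
existing = gcp.dataproc.AutoscalingPolicy.get("existing",
    id="projects/my-project/locations/us-central1/autoscalingPolicies/dataproc-policy")

pulumi.export("existing_policy_name", existing.name)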

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str
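
Both translate methods map between the snake_case property names used by this Python SDK and the camelCase names used by the Pulumi engine. A minimal sketch, reusing the asp policy from the first example; the mapping table is generated per resource, so treat the results as illustrative:

# Engine-side camelCase names become Python snake_case on the way out...
asp.translate_output_property("workerConfig")   # -> "worker_config"
# ...and Python snake_case names become camelCase on the way in.
asp.translate_input_property("worker_config")   # -> "workerConfig"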

class pulumi_gcp.dataproc.Cluster(resource_name, opts=None, cluster_config=None, labels=None, name=None, project=None, region=None, __props__=None, __name__=None, __opts__=None)

Manages a Cloud Dataproc cluster resource within GCP. For more information see the official dataproc documentation.

Warning: Due to limitations of the API, all arguments except labels, cluster_config.worker_config.num_instances, and cluster_config.preemptible_worker_config.num_instances are non-updatable. Changing any other argument will cause recreation of the whole cluster!

import pulumi
import pulumi_gcp as gcp

simplecluster = gcp.dataproc.Cluster("simplecluster", region="us-central1")

import pulumi
import pulumi_gcp as gcp

mycluster = gcp.dataproc.Cluster("mycluster",
    cluster_config={
        "gceClusterConfig": {
            "serviceAccountScopes": [
                "https://www.googleapis.com/auth/monitoring",
                "useraccounts-ro",
                "storage-rw",
                "logging-write",
            ],
            "tags": [
                "foo",
                "bar",
            ],
        },
        "initializationAction": [{
            "script": "gs://dataproc-initialization-actions/stackdriver/stackdriver.sh",
            "timeout_sec": 500,
        }],
        "masterConfig": {
            "diskConfig": {
                "bootDiskSizeGb": 15,
                "bootDiskType": "pd-ssd",
            },
            "machine_type": "n1-standard-1",
            "numInstances": 1,
        },
        "preemptibleWorkerConfig": {
            "numInstances": 0,
        },
        "softwareConfig": {
            "imageVersion": "1.3.7-deb9",
            "overrideProperties": {
                "dataproc:dataproc.allow.zero.workers": "true",
            },
        },
        "stagingBucket": "dataproc-staging-bucket",
        "worker_config": {
            "diskConfig": {
                "bootDiskSizeGb": 15,
                "numLocalSsds": 1,
            },
            "machine_type": "n1-standard-1",
            "min_cpu_platform": "Intel Skylake",
            "numInstances": 2,
        },
    },
    labels={
        "foo": "bar",
    },
    region="us-central1")
import pulumi
import pulumi_gcp as gcp

accelerated_cluster = gcp.dataproc.Cluster("acceleratedCluster",
    cluster_config={
        "gceClusterConfig": {
            "zone": "us-central1-a",
        },
        "masterConfig": {
            "accelerators": [{
                "acceleratorCount": "1",
                "accelerator_type": "nvidia-tesla-k80",
            }],
        },
    },
    region="us-central1")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster_config (pulumi.Input[dict]) – Allows you to configure various aspects of the cluster. Structure defined below.

  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to be applied to instances in the cluster. GCP generates some itself, including goog-dataproc-cluster-name, which is the name of the cluster.

  • name (pulumi.Input[str]) – The name of the cluster, unique within the project and zone.

  • project (pulumi.Input[str]) – The ID of the project in which the cluster will exist. If it is not provided, the provider project is used.

  • region (pulumi.Input[str]) – The region in which the cluster and associated nodes will be created. Defaults to global.

The cluster_config object supports the following:

  • autoscalingConfig (pulumi.Input[dict]) - The autoscaling policy config associated with the cluster. Structure defined below.

    • policyUri (pulumi.Input[str]) - The autoscaling policy used by the cluster.

  • bucket (pulumi.Input[str])

  • encryptionConfig (pulumi.Input[dict]) - The Customer managed encryption keys settings for the cluster. Structure defined below.

    • kms_key_name (pulumi.Input[str]) - The Cloud KMS key name to use for PD disk encryption for all instances in the cluster.

  • endpointConfig (pulumi.Input[dict]) - The config settings for port access on the cluster. Structure defined below.

    • enableHttpPortAccess (pulumi.Input[bool]) - The flag to enable http access to specific ports on the cluster from external sources (aka Component Gateway). Defaults to false.

    • httpPorts (pulumi.Input[dict])

  • gceClusterConfig (pulumi.Input[dict]) - Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster. Structure defined below.

    • internalIpOnly (pulumi.Input[bool]) - By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. If set to true, all instances in the cluster will only have internal IP addresses. Note: Private Google Access (also known as privateIpGoogleAccess) must be enabled on the subnetwork that the cluster will be launched in.

    • metadata (pulumi.Input[dict]) - A map of the Compute Engine metadata entries to add to all instances (see Project and instance metadata).

    • network (pulumi.Input[str]) - The name or self_link of the Google Compute Engine network the cluster will be part of. Conflicts with subnetwork. If neither is specified, this defaults to the “default” network.

    • service_account (pulumi.Input[str]) - The service account to be used by the Node VMs. If not specified, the “default” service account is used.

    • serviceAccountScopes (pulumi.Input[list]) - The set of Google API scopes to be made available on all of the node VMs under the service_account specified. These can be either FQDNs, or scope aliases. Some scopes are necessary to ensure the correct functioning of the cluster and are set automatically by the API if any other scopes are set.

    • subnetwork (pulumi.Input[str]) - The name or self_link of the Google Compute Engine subnetwork the cluster will be part of. Conflicts with network.

    • tags (pulumi.Input[list]) - The list of instance tags applied to instances in the cluster. Tags are used to identify valid sources or targets for network firewalls.

    • zone (pulumi.Input[str]) - The GCP zone where your data is stored and used (i.e. where the master and the worker nodes will be created). If region is set to ‘global’ (default) then zone is mandatory, otherwise GCP is able to make use of Auto Zone Placement to determine this automatically for you. Note: This setting additionally determines and restricts which computing resources are available for use with other configs such as cluster_config.master_config.machine_type and cluster_config.worker_config.machine_type.

  • initializationActions (pulumi.Input[list]) - Commands to execute on each node after config is completed. You can specify multiple versions of these. Structure defined below.

    • script (pulumi.Input[str]) - The script to be executed during initialization of the cluster. The script must be a GCS file with a gs:// prefix.

    • timeout_sec (pulumi.Input[float]) - The maximum duration (in seconds) which the script is allowed to take to execute its action. GCP will default to a predetermined computed value if not set (currently 300).

  • lifecycleConfig (pulumi.Input[dict]) - The settings for auto deletion cluster schedule. Structure defined below.

    • autoDeleteTime (pulumi.Input[str]) - The time when cluster will be auto-deleted. A timestamp in RFC3339 UTC “Zulu” format, accurate to nanoseconds. Example: “2014-10-02T15:01:23.045123456Z”.

    • idleDeleteTtl (pulumi.Input[str]) - The duration to keep the cluster alive while idling (no jobs running). After this TTL, the cluster will be deleted. Valid range: [10m, 14d].

    • idleStartTime (pulumi.Input[str])

  • masterConfig (pulumi.Input[dict]) - The Google Compute Engine config settings for the master instances in a cluster. Structure defined below.

    • accelerators (pulumi.Input[list]) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (pulumi.Input[float]) - The number of the accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (pulumi.Input[str]) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each node in this group, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each node in this group. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The number of local SSD disks that will be attached to each node in this group. Defaults to 0.

    • imageUri (pulumi.Input[str]) - The URI for the image to use for the instances in this group. See the guide for more information.

    • instanceNames (pulumi.Input[list])

    • machine_type (pulumi.Input[str]) - The name of a Google Compute Engine machine type to create for the master nodes. If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (pulumi.Input[str]) - The name of a minimum generation of CPU family for the master. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (pulumi.Input[float]) - Specifies the number of master nodes to create. If not specified, GCP will default to a predetermined computed value (currently 1).

  • preemptibleWorkerConfig (pulumi.Input[dict]) - The Google Compute Engine config settings for the additional (aka preemptible) instances in a cluster. Structure defined below.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each preemptible worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each preemptible worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The amount of local SSD disks that will be attached to each preemptible worker node. Defaults to 0.

    • instanceNames (pulumi.Input[list])

    • numInstances (pulumi.Input[float]) - Specifies the number of preemptible nodes to create. Defaults to 0.

  • securityConfig (pulumi.Input[dict]) - Security related configuration. Structure defined below.

    • kerberosConfig (pulumi.Input[dict]) - Kerberos Configuration

      • crossRealmTrustAdminServer (pulumi.Input[str]) - The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustKdc (pulumi.Input[str]) - The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustRealm (pulumi.Input[str]) - The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust.

      • crossRealmTrustSharedPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship.

      • enableKerberos (pulumi.Input[bool]) - Flag to indicate whether to Kerberize the cluster.

      • kdcDbKeyUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database.

      • keyPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc.

      • keystorePasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, the password is generated by Dataproc.

      • keystoreUri (pulumi.Input[str]) - The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

      • kmsKeyUri (pulumi.Input[str]) - The URI of the KMS key used to encrypt various sensitive files.

      • realm (pulumi.Input[str]) - The name of the on-cluster Kerberos realm. If not specified, the uppercased domain of hostnames will be the realm.

      • rootPrincipalPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the root principal password.

      • tgtLifetimeHours (pulumi.Input[float]) - The lifetime of the ticket granting ticket, in hours.

      • truststorePasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc.

      • truststoreUri (pulumi.Input[str]) - The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

  • softwareConfig (pulumi.Input[dict]) - The config settings for software inside the cluster. Structure defined below.

    • imageVersion (pulumi.Input[str]) - The Cloud Dataproc image version to use for the cluster - this controls the sets of software versions installed onto the nodes when you create clusters. If not specified, defaults to the latest version. For a list of valid versions see Cloud Dataproc versions

    • optionalComponents (pulumi.Input[list]) - The set of optional components to activate on the cluster. Accepted values are:

      • ANACONDA

      • DRUID

      • HBASE

      • HIVE_WEBHCAT

      • JUPYTER

      • KERBEROS

      • PRESTO

      • RANGER

      • SOLR

      • ZEPPELIN

      • ZOOKEEPER

    • overrideProperties (pulumi.Input[dict]) - A list of override and additional properties (key/value pairs) used to modify various aspects of the common configuration files used when creating a cluster. For a list of valid properties please see Cluster properties

    • properties (pulumi.Input[dict])

  • stagingBucket (pulumi.Input[str]) - The Cloud Storage staging bucket used to stage files, such as Hadoop jars, between client machines and the cluster. Note: If you don’t explicitly specify a staging_bucket then GCP will auto create / assign one for you. However, you are not guaranteed an auto generated bucket which is solely dedicated to your cluster; it may be shared with other clusters in the same region/zone also choosing to use the auto generation option.

  • worker_config (pulumi.Input[dict]) - The Google Compute Engine config settings for the worker instances in a cluster. Structure defined below.

    • accelerators (pulumi.Input[list]) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (pulumi.Input[float]) - The number of the accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (pulumi.Input[str]) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The number of local SSD disks that will be attached to each worker node. Defaults to 0.

    • imageUri (pulumi.Input[str]) - The URI for the image to use for this worker. See the guide for more information.

    • instanceNames (pulumi.Input[list])

    • machine_type (pulumi.Input[str]) - The name of a Google Compute Engine machine type to create for the worker nodes. If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (pulumi.Input[str]) - The name of a minimum generation of CPU family for the worker. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (pulumi.Input[float]) - Specifies the number of worker nodes to create. If not specified, GCP will default to a predetermined computed value (currently 2).
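
To make a couple of the sub-objects above concrete, here is a minimal sketch (the cluster name and TTL are hypothetical) that enables the Component Gateway and auto-deletes the cluster after ten minutes of idling:

import pulumi_gcp as gcp

ephemeral = gcp.dataproc.Cluster("ephemeral",
    region="us-central1",
    cluster_config={
        # endpointConfig: expose the cluster's web UIs via the Component Gateway.
        "endpointConfig": {
            "enableHttpPortAccess": True,
        },
        # lifecycleConfig: delete the cluster after 10 minutes of no running jobs.
        "lifecycleConfig": {
            "idleDeleteTtl": "600s",
        },
    })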

cluster_config: pulumi.Output[dict] = None

Allows you to configure various aspects of the cluster. Structure defined below.

  • autoscalingConfig (dict) - The autoscaling policy config associated with the cluster. Structure defined below.

    • policyUri (str) - The autoscaling policy used by the cluster.

  • bucket (str)

  • encryptionConfig (dict) - The Customer managed encryption keys settings for the cluster. Structure defined below.

    • kms_key_name (str) - The Cloud KMS key name to use for PD disk encryption for all instances in the cluster.

  • endpointConfig (dict) - The config settings for port access on the cluster. Structure defined below.

    • enableHttpPortAccess (bool) - The flag to enable http access to specific ports on the cluster from external sources (aka Component Gateway). Defaults to false.

    • httpPorts (dict)

  • gceClusterConfig (dict) - Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster. Structure defined below.

    • internalIpOnly (bool) - By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. If set to true, all instances in the cluster will only have internal IP addresses. Note: Private Google Access (also known as privateIpGoogleAccess) must be enabled on the subnetwork that the cluster will be launched in.

    • metadata (dict) - A map of the Compute Engine metadata entries to add to all instances (see Project and instance metadata).

    • network (str) - The name or self_link of the Google Compute Engine network the cluster will be part of. Conflicts with subnetwork. If neither is specified, this defaults to the “default” network.

    • service_account (str) - The service account to be used by the Node VMs. If not specified, the “default” service account is used.

    • serviceAccountScopes (list) - The set of Google API scopes to be made available on all of the node VMs under the service_account specified. These can be either FQDNs, or scope aliases. Some scopes are necessary to ensure the correct functioning of the cluster and are set automatically by the API if any other scopes are set.

    • subnetwork (str) - The name or self_link of the Google Compute Engine subnetwork the cluster will be part of. Conflicts with network.

    • tags (list) - The list of instance tags applied to instances in the cluster. Tags are used to identify valid sources or targets for network firewalls.

    • zone (str) - The GCP zone where your data is stored and used (i.e. where the master and the worker nodes will be created). If region is set to ‘global’ (default) then zone is mandatory, otherwise GCP is able to make use of Auto Zone Placement to determine this automatically for you. Note: This setting additionally determines and restricts which computing resources are available for use with other configs such as cluster_config.master_config.machine_type and cluster_config.worker_config.machine_type.

  • initializationActions (list) - Commands to execute on each node after config is completed. You can specify multiple versions of these. Structure defined below.

    • script (str) - The script to be executed during initialization of the cluster. The script must be a GCS file with a gs:// prefix.

    • timeout_sec (float) - The maximum duration (in seconds) which the script is allowed to take to execute its action. GCP will default to a predetermined computed value if not set (currently 300).

  • lifecycleConfig (dict) - The settings for auto deletion cluster schedule. Structure defined below.

    • autoDeleteTime (str) - The time when cluster will be auto-deleted. A timestamp in RFC3339 UTC “Zulu” format, accurate to nanoseconds. Example: “2014-10-02T15:01:23.045123456Z”.

    • idleDeleteTtl (str) - The duration to keep the cluster alive while idling (no jobs running). After this TTL, the cluster will be deleted. Valid range: [10m, 14d].

    • idleStartTime (str)

  • masterConfig (dict) - The Google Compute Engine config settings for the master instances in a cluster. Structure defined below.

    • accelerators (list) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (float) - The number of the accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (str) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (dict) - Disk Config

      • bootDiskSizeGb (float) - Size of the primary disk attached to each node in this group, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (str) - The disk type of the primary disk attached to each node in this group. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (float) - The number of local SSD disks that will be attached to each node in this group. Defaults to 0.

    • imageUri (str) - The URI for the image to use for the instances in this group. See the guide for more information.

    • instanceNames (list)

    • machine_type (str) - The name of a Google Compute Engine machine type to create for the master nodes. If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (str) - The name of a minimum generation of CPU family for the master. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (float) - Specifies the number of master nodes to create. If not specified, GCP will default to a predetermined computed value (currently 1).

  • preemptibleWorkerConfig (dict) - The Google Compute Engine config settings for the additional (aka preemptible) instances in a cluster. Structure defined below.

    • diskConfig (dict) - Disk Config

      • bootDiskSizeGb (float) - Size of the primary disk attached to each preemptible worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (str) - The disk type of the primary disk attached to each preemptible worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (float) - The amount of local SSD disks that will be attached to each preemptible worker node. Defaults to 0.

    • instanceNames (list)

    • numInstances (float) - Specifies the number of preemptible nodes to create. Defaults to 0.

  • securityConfig (dict) - Security related configuration. Structure defined below.

    • kerberosConfig (dict) - Kerberos Configuration

      • crossRealmTrustAdminServer (str) - The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustKdc (str) - The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustRealm (str) - The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust.

      • crossRealmTrustSharedPasswordUri (str) - The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship.

      • enableKerberos (bool) - Flag to indicate whether to Kerberize the cluster.

      • kdcDbKeyUri (str) - The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database.

      • keyPasswordUri (str) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc.

      • keystorePasswordUri (str) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, the password is generated by Dataproc.

      • keystoreUri (str) - The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

      • kmsKeyUri (str) - The URI of the KMS key used to encrypt various sensitive files.

      • realm (str) - The name of the on-cluster Kerberos realm. If not specified, the uppercased domain of hostnames will be the realm.

      • rootPrincipalPasswordUri (str) - The Cloud Storage URI of a KMS encrypted file containing the root principal password.

      • tgtLifetimeHours (float) - The lifetime of the ticket granting ticket, in hours.

      • truststorePasswordUri (str) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc.

      • truststoreUri (str) - The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

  • softwareConfig (dict) - The config settings for software inside the cluster. Structure defined below.

    • imageVersion (str) - The Cloud Dataproc image version to use for the cluster - this controls the sets of software versions installed onto the nodes when you create clusters. If not specified, defaults to the latest version. For a list of valid versions see Cloud Dataproc versions

    • optionalComponents (list) - The set of optional components to activate on the cluster. Accepted values are:

      • ANACONDA

      • DRUID

      • HBASE

      • HIVE_WEBHCAT

      • JUPYTER

      • KERBEROS

      • PRESTO

      • RANGER

      • SOLR

      • ZEPPELIN

      • ZOOKEEPER

    • overrideProperties (dict) - A list of override and additional properties (key/value pairs) used to modify various aspects of the common configuration files used when creating a cluster. For a list of valid properties please see Cluster properties

    • properties (dict)

  • stagingBucket (str) - The Cloud Storage staging bucket used to stage files, such as Hadoop jars, between client machines and the cluster. Note: If you don’t explicitly specify a staging_bucket then GCP will auto create / assign one for you. However, you are not guaranteed an auto generated bucket which is solely dedicated to your cluster; it may be shared with other clusters in the same region/zone also choosing to use the auto generation option.

  • worker_config (dict) - The Google Compute Engine config settings for the worker instances in a cluster. Structure defined below.

    • accelerators (list) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (float) - The number of the accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (str) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (dict) - Disk Config

      • bootDiskSizeGb (float) - Size of the primary disk attached to each worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (str) - The disk type of the primary disk attached to each worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (float) - The number of local SSD disks that will be attached to each worker node. Defaults to 0.

    • imageUri (str) - The URI for the image to use for this worker. See the guide for more information.

    • instanceNames (list)

    • machine_type (str) - The name of a Google Compute Engine machine type to create for the worker nodes. If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (str) - The name of a minimum generation of CPU family for the worker. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (float) - Specifies the number of worker nodes to create. If not specified, GCP will default to a predetermined computed value (currently 2).

labels: pulumi.Output[dict] = None

The list of labels (key/value pairs) to be applied to instances in the cluster. GCP generates some itself, including goog-dataproc-cluster-name, which is the name of the cluster.

name: pulumi.Output[str] = None

The name of the cluster, unique within the project and zone.

project: pulumi.Output[str] = None

The ID of the project in which the cluster will exist. If it is not provided, the provider project is used.

region: pulumi.Output[str] = None

The region in which the cluster and associated nodes will be created. Defaults to global.
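
Because cluster_config is a pulumi.Output[dict], nested fields must be read with apply. A minimal sketch, assuming the mycluster resource from the example above:

import pulumi

# Export the staging bucket GCP assigned (or the one explicitly configured).
pulumi.export("bucket", mycluster.cluster_config.apply(lambda c: c["bucket"]))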

static get(resource_name, id, opts=None, cluster_config=None, labels=None, name=None, project=None, region=None)

Get an existing Cluster resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster_config (pulumi.Input[dict]) – Allows you to configure various aspects of the cluster. Structure defined below.

  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to be applied to instances in the cluster. GCP generates some itself, including goog-dataproc-cluster-name, which is the name of the cluster.

  • name (pulumi.Input[str]) – The name of the cluster, unique within the project and zone.

  • project (pulumi.Input[str]) – The ID of the project in which the cluster will exist. If it is not provided, the provider project is used.

  • region (pulumi.Input[str]) – The region in which the cluster and associated nodes will be created. Defaults to global.

The cluster_config object supports the following:

  • autoscalingConfig (pulumi.Input[dict]) - The autoscaling policy config associated with the cluster. Structure defined below.

    • policyUri (pulumi.Input[str]) - The autoscaling policy used by the cluster.

  • bucket (pulumi.Input[str])

  • encryptionConfig (pulumi.Input[dict]) - The Customer managed encryption keys settings for the cluster. Structure defined below.

    • kms_key_name (pulumi.Input[str]) - The Cloud KMS key name to use for PD disk encryption for all instances in the cluster.

  • endpointConfig (pulumi.Input[dict]) - The config settings for port access on the cluster. Structure defined below.

    • enableHttpPortAccess (pulumi.Input[bool]) - The flag to enable http access to specific ports on the cluster from external sources (aka Component Gateway). Defaults to false.

    • httpPorts (pulumi.Input[dict])

  • gceClusterConfig (pulumi.Input[dict]) - Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster. Structure defined below.

    • internalIpOnly (pulumi.Input[bool]) - By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. If set to true, all instances in the cluster will only have internal IP addresses. Note: Private Google Access (also known as privateIpGoogleAccess) must be enabled on the subnetwork that the cluster will be launched in.

    • metadata (pulumi.Input[dict]) - A map of the Compute Engine metadata entries to add to all instances (see Project and instance metadata).

    • network (pulumi.Input[str]) - The name or self_link of the Google Compute Engine network the cluster will be part of. Conflicts with subnetwork. If neither is specified, this defaults to the “default” network.

    • service_account (pulumi.Input[str]) - The service account to be used by the Node VMs. If not specified, the “default” service account is used.

    • serviceAccountScopes (pulumi.Input[list]) - The set of Google API scopes to be made available on all of the node VMs under the service_account specified. These can be either FQDNs, or scope aliases. Some scopes are necessary to ensure the correct functioning of the cluster and are set automatically by the API if any other scopes are set.

    • subnetwork (pulumi.Input[str]) - The name or self_link of the Google Compute Engine subnetwork the cluster will be part of. Conflicts with network.

    • tags (pulumi.Input[list]) - The list of instance tags applied to instances in the cluster. Tags are used to identify valid sources or targets for network firewalls.

    • zone (pulumi.Input[str]) - The GCP zone where your data is stored and used (i.e. where the master and the worker nodes will be created). If region is set to ‘global’ (default) then zone is mandatory, otherwise GCP is able to make use of Auto Zone Placement to determine this automatically for you. Note: This setting additionally determines and restricts which computing resources are available for use with other configs such as cluster_config.master_config.machine_type and cluster_config.worker_config.machine_type.

  • initializationActions (pulumi.Input[list]) - Commands to execute on each node after config is completed. You can specify multiple versions of these. Structure defined below.

    • script (pulumi.Input[str]) - The script to be executed during initialization of the cluster. The script must be a GCS file with a gs:// prefix.

    • timeout_sec (pulumi.Input[float]) - The maximum duration (in seconds) which the script is allowed to take to execute its action. GCP will default to a predetermined computed value if not set (currently 300).

  • lifecycleConfig (pulumi.Input[dict]) - The settings for auto deletion cluster schedule. Structure defined below.

    • autoDeleteTime (pulumi.Input[str]) - The time when cluster will be auto-deleted. A timestamp in RFC3339 UTC “Zulu” format, accurate to nanoseconds. Example: “2014-10-02T15:01:23.045123456Z”.

    • idleDeleteTtl (pulumi.Input[str]) - The duration to keep the cluster alive while idling (no jobs running). After this TTL, the cluster will be deleted. Valid range: [10m, 14d].

    • idleStartTime (pulumi.Input[str])

  • masterConfig (pulumi.Input[dict]) - The Google Compute Engine config settings for the master instances in a cluster. Structure defined below.

    • accelerators (pulumi.Input[list]) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (pulumi.Input[float]) - The number of accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (pulumi.Input[str]) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each master node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each master node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The number of local SSD disks that will be attached to each master node. Defaults to 0.

    • imageUri (pulumi.Input[str]) - The URI for the image to use for this master. See the guide for more information.

    • instanceNames (pulumi.Input[list])

    • machine_type (pulumi.Input[str]) - The name of a Google Compute Engine machine type to create for the master node(s). If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (pulumi.Input[str]) - The name of a minimum generation of CPU family for the master. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (pulumi.Input[float]) - Specifies the number of master nodes to create. If not specified, GCP will default to a predetermined computed value (currently 1).

  • preemptibleWorkerConfig (pulumi.Input[dict]) - The Google Compute Engine config settings for the additional (aka preemptible) instances in a cluster. Structure defined below.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each preemptible worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each preemptible worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The amount of local SSD disks that will be attached to each preemptible worker node. Defaults to 0.

    • instanceNames (pulumi.Input[list])

    • numInstances (pulumi.Input[float]) - Specifies the number of preemptible nodes to create. Defaults to 0.

  • securityConfig (pulumi.Input[dict]) - Security related configuration. Structure defined below.

    • kerberosConfig (pulumi.Input[dict]) - Kerberos Configuration

      • crossRealmTrustAdminServer (pulumi.Input[str]) - The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustKdc (pulumi.Input[str]) - The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship.

      • crossRealmTrustRealm (pulumi.Input[str]) - The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust.

      • crossRealmTrustSharedPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship.

      • enableKerberos (pulumi.Input[bool]) - Flag to indicate whether to Kerberize the cluster.

      • kdcDbKeyUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database.

      • keyPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc.

      • keystorePasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, the password is generated by Dataproc.

      • keystoreUri (pulumi.Input[str]) - The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

      • kmsKeyUri (pulumi.Input[str]) - The URI of the KMS key used to encrypt various sensitive files.

      • realm (pulumi.Input[str]) - The name of the on-cluster Kerberos realm. If not specified, the uppercased domain of hostnames will be the realm.

      • rootPrincipalPasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the root principal password.

      • tgtLifetimeHours (pulumi.Input[float]) - The lifetime of the ticket granting ticket, in hours.

      • truststorePasswordUri (pulumi.Input[str]) - The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc.

      • truststoreUri (pulumi.Input[str]) - The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate.

  • softwareConfig (pulumi.Input[dict]) - The config settings for software inside the cluster. Structure defined below.

    • imageVersion (pulumi.Input[str]) - The Cloud Dataproc image version to use for the cluster - this controls the sets of software versions installed onto the nodes when you create clusters. If not specified, defaults to the latest version. For a list of valid versions see Cloud Dataproc versions.

    • optionalComponents (pulumi.Input[list]) - The set of optional components to activate on the cluster. Accepted values are:

      • ANACONDA

      • DRUID

      • HBASE

      • HIVE_WEBHCAT

      • JUPYTER

      • KERBEROS

      • PRESTO

      • RANGER

      • SOLR

      • ZEPPELIN

      • ZOOKEEPER

    • overrideProperties (pulumi.Input[dict]) - A mapping of override and additional properties (key/value pairs) used to modify various aspects of the common configuration files used when creating a cluster. For a list of valid properties please see Cluster properties.

    • properties (pulumi.Input[dict])

  • stagingBucket (pulumi.Input[str]) - The Cloud Storage staging bucket used to stage files, such as Hadoop jars, between client machines and the cluster. Note: If you don’t explicitly specify a staging_bucket, GCP will auto-create one for you. However, you are not guaranteed an auto-generated bucket solely dedicated to your cluster; it may be shared with other clusters in the same region/zone that also choose the auto-generation option.

  • worker_config (pulumi.Input[dict]) - The Google Compute Engine config settings for the worker instances in a cluster. Structure defined below.

    • accelerators (pulumi.Input[list]) - The Compute Engine accelerator configuration for these instances. Can be specified multiple times.

      • acceleratorCount (pulumi.Input[float]) - The number of accelerator cards of this type exposed to this instance. Often restricted to one of 1, 2, 4, or 8.

      • accelerator_type (pulumi.Input[str]) - The short name of the accelerator type to expose to this instance. For example, nvidia-tesla-k80.

    • diskConfig (pulumi.Input[dict]) - Disk Config

      • bootDiskSizeGb (pulumi.Input[float]) - Size of the primary disk attached to each worker node, specified in GB. The smallest allowed disk size is 10GB. GCP will default to a predetermined computed value if not set (currently 500GB). Note: If SSDs are not attached, it also contains the HDFS data blocks and Hadoop working directories.

      • bootDiskType (pulumi.Input[str]) - The disk type of the primary disk attached to each worker node. One of "pd-ssd" or "pd-standard". Defaults to "pd-standard".

      • numLocalSsds (pulumi.Input[float]) - The number of local SSD disks that will be attached to each worker node. Defaults to 0.

    • imageUri (pulumi.Input[str]) - The URI for the image to use for this worker. See the guide for more information.

    • instanceNames (pulumi.Input[list])

    • machine_type (pulumi.Input[str]) - The name of a Google Compute Engine machine type to create for the worker nodes. If not specified, GCP will default to a predetermined computed value (currently n1-standard-4).

    • min_cpu_platform (pulumi.Input[str]) - The name of a minimum generation of CPU family for each worker node. If not specified, GCP will default to a predetermined computed value for each zone. See the guide for details about which CPU families are available (and defaulted) for each zone.

    • numInstances (pulumi.Input[float]) - Specifies the number of worker nodes to create. If not specified, GCP will default to a predetermined computed value (currently 2).
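As a consolidated illustration of the cluster_config blocks above, the following minimal sketch wires several of them together. All values (bucket, network, tags, image version, script URI) are placeholders rather than working values, and the key spelling follows the mixed convention used in this reference (snake_case block names, camelCase leaf fields).

import pulumi
import pulumi_gcp as gcp

# Placeholder values throughout; adjust for your project before use.
mycluster = gcp.dataproc.Cluster("mycluster",
    region="us-central1",
    cluster_config={
        "staging_bucket": "your-staging-bucket",
        "gce_cluster_config": {
            "network": "default",
            "tags": ["foo", "bar"],
        },
        "master_config": {
            "numInstances": 1,
            "machine_type": "n1-standard-4",
            "diskConfig": {
                "bootDiskType": "pd-ssd",
                "bootDiskSizeGb": 15,
            },
        },
        "worker_config": {
            "numInstances": 2,
            "machine_type": "n1-standard-4",
        },
        "preemptible_worker_config": {
            "numInstances": 0,
        },
        "software_config": {
            "imageVersion": "1.3.7-deb9",
            "optionalComponents": ["JUPYTER", "ZOOKEEPER"],
            "overrideProperties": {
                "dataproc:dataproc.allow.zero.workers": "true",
            },
        },
        "initialization_actions": [{
            "script": "gs://dataproc-initialization-actions/stackdriver/stackdriver.sh",
            "timeout_sec": 500,
        }],
    })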

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.ClusterIAMBinding(resource_name, opts=None, cluster=None, condition=None, members=None, project=None, region=None, role=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc clusters. Each of these resources serves a different use case:

  • dataproc.ClusterIAMPolicy: Authoritative. Sets the IAM policy for the cluster and replaces any existing policy already attached.

  • dataproc.ClusterIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the cluster are preserved.

  • dataproc.ClusterIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the cluster are preserved.

Note: dataproc.ClusterIAMPolicy cannot be used in conjunction with dataproc.ClusterIAMBinding and dataproc.ClusterIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the cluster as dataproc.ClusterIAMPolicy replaces the entire policy.

Note: dataproc.ClusterIAMBinding resources can be used in conjunction with dataproc.ClusterIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.ClusterIAMPolicy("editor",
    project="your-project",
    region="your-region",
    cluster="your-dataproc-cluster",
    policy_data=admin.policy_data)
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMBinding("editor",
    cluster="your-dataproc-cluster",
    members=["user:jane@example.com"],
    role="roles/editor")
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMMember("editor",
    cluster="your-dataproc-cluster",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following (see the sketch after this list):

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])
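Where a conditional grant is needed, a condition block can be attached directly to the binding. A minimal sketch, assuming a CEL expression on request time; the title, description, and expression values here are illustrative only:

import pulumi_gcp as gcp

# Grant roles/editor only until the start of 2021 (placeholder values).
conditional = gcp.dataproc.ClusterIAMBinding("conditional-editor",
    cluster="your-dataproc-cluster",
    members=["user:jane@example.com"],
    role="roles/editor",
    condition={
        "title": "expires-end-of-2020",
        "description": "Grant expires at the start of 2021",
        "expression": "request.time < timestamp(\"2021-01-01T00:00:00Z\")",
    })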

cluster: pulumi.Output[str] = None

The name or relative resource id of the cluster to manage IAM policies for.

etag: pulumi.Output[str] = None

(Computed) The etag of the cluster’s IAM policy.

project: pulumi.Output[str] = None

The project in which the cluster belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the cluster belongs. If it is not provided, the provider will use a default.

role: pulumi.Output[str] = None

The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

static get(resource_name, id, opts=None, cluster=None, condition=None, etag=None, members=None, project=None, region=None, role=None)

Get an existing ClusterIAMBinding resource’s state with the given name, id, and optional extra properties used to qualify the lookup. A lookup sketch follows the parameter list below.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • etag (pulumi.Input[str]) – (Computed) The etag of the cluster’s IAM policy.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])
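A minimal lookup sketch follows. The ID format shown is an assumption for illustration; use the provider ID recorded in your stack’s state for the actual binding.

import pulumi
import pulumi_gcp as gcp

# Hypothetical provider ID; check your stack state for the real value.
existing = gcp.dataproc.ClusterIAMBinding.get("existing-editor",
    id="your-project/your-region/your-dataproc-cluster/roles/editor")

pulumi.export("binding_etag", existing.etag)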

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.ClusterIAMMember(resource_name, opts=None, cluster=None, condition=None, member=None, project=None, region=None, role=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc clusters. Each of these resources serves a different use case:

  • dataproc.ClusterIAMPolicy: Authoritative. Sets the IAM policy for the cluster and replaces any existing policy already attached.

  • dataproc.ClusterIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the cluster are preserved.

  • dataproc.ClusterIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the cluster are preserved.

Note: dataproc.ClusterIAMPolicy cannot be used in conjunction with dataproc.ClusterIAMBinding and dataproc.ClusterIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the cluster as dataproc.ClusterIAMPolicy replaces the entire policy.

Note: dataproc.ClusterIAMBinding resources can be used in conjunction with dataproc.ClusterIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.ClusterIAMPolicy("editor",
    project="your-project",
    region="your-region",
    cluster="your-dataproc-cluster",
    policy_data=admin.policy_data)
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMBinding("editor",
    cluster="your-dataproc-cluster",
    members=["user:jane@example.com"],
    role="roles/editor")
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMMember("editor",
    cluster="your-dataproc-cluster",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

cluster: pulumi.Output[str] = None

The name or relative resource id of the cluster to manage IAM policies for.

etag: pulumi.Output[str] = None

(Computed) The etag of the cluster’s IAM policy.

project: pulumi.Output[str] = None

The project in which the cluster belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the cluster belongs. If it is not provided, the provider will use a default.

role: pulumi.Output[str] = None

The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

static get(resource_name, id, opts=None, cluster=None, condition=None, etag=None, member=None, project=None, region=None, role=None)

Get an existing ClusterIAMMember resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • etag (pulumi.Input[str]) – (Computed) The etag of the cluster’s IAM policy.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.ClusterIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.ClusterIAMPolicy(resource_name, opts=None, cluster=None, policy_data=None, project=None, region=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc clusters. Each of these resources serves a different use case:

  • dataproc.ClusterIAMPolicy: Authoritative. Sets the IAM policy for the cluster and replaces any existing policy already attached.

  • dataproc.ClusterIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the cluster are preserved.

  • dataproc.ClusterIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the cluster are preserved.

Note: dataproc.ClusterIAMPolicy cannot be used in conjunction with dataproc.ClusterIAMBinding and dataproc.ClusterIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the cluster as dataproc.ClusterIAMPolicy replaces the entire policy.

Note: dataproc.ClusterIAMBinding resources can be used in conjunction with dataproc.ClusterIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.ClusterIAMPolicy("editor",
    project="your-project",
    region="your-region",
    cluster="your-dataproc-cluster",
    policy_data=admin.policy_data)
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMBinding("editor",
    cluster="your-dataproc-cluster",
    members=["user:jane@example.com"],
    role="roles/editor")
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.ClusterIAMMember("editor",
    cluster="your-dataproc-cluster",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • policy_data (pulumi.Input[str]) – The policy data generated by an organizations.getIAMPolicy data source.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

cluster: pulumi.Output[str] = None

The name or relative resource id of the cluster to manage IAM policies for.

etag: pulumi.Output[str] = None

(Computed) The etag of the cluster’s IAM policy.

policy_data: pulumi.Output[str] = None

The policy data generated by an organizations.getIAMPolicy data source.

project: pulumi.Output[str] = None

The project in which the cluster belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the cluster belongs. If it is not provided, the provider will use a default.

static get(resource_name, id, opts=None, cluster=None, etag=None, policy_data=None, project=None, region=None)

Get an existing ClusterIAMPolicy resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • cluster (pulumi.Input[str]) – The name or relative resource id of the cluster to manage IAM policies for.

  • etag (pulumi.Input[str]) – (Computed) The etag of the cluster’s IAM policy.

  • policy_data (pulumi.Input[str]) – The policy data generated by an organizations.getIAMPolicy data source.

  • project (pulumi.Input[str]) – The project in which the cluster belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the cluster belongs. If it is not provided, the provider will use a default.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.Job(resource_name, opts=None, force_delete=None, hadoop_config=None, hive_config=None, labels=None, pig_config=None, placement=None, project=None, pyspark_config=None, reference=None, region=None, scheduling=None, spark_config=None, sparksql_config=None, __props__=None, __name__=None, __opts__=None)

Manages a job resource within a Dataproc cluster on Google Compute Engine. For more information see the official dataproc documentation. An example job definition appears after the configuration reference below.

Note: This resource does not support ‘update’; changing any attribute will cause the resource to be recreated.

Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • force_delete (pulumi.Input[bool]) – By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.

  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to add to the job.

  • project (pulumi.Input[str]) – The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.

  • region (pulumi.Input[str]) – The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.

  • scheduling (pulumi.Input[dict]) – Optional. Job scheduling configuration.

The hadoop_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Hadoop driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainClass (pulumi.Input[str]) - The name of the driver’s main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri

  • mainJarFileUri (pulumi.Input[str]) - The HCFS URI of the jar file containing the main class. Examples: ‘gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar’ ‘hdfs:/tmp/test-samples/custom-wordcount.jar’ ‘file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar’. Conflicts with main_class

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Hadoop. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

The hive_config object supports the following:

  • continueOnFailure (pulumi.Input[bool]) - Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATH of the Hive server and Hadoop MapReduce (MR) tasks.

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Hive. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains Hive queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of Hive queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Hive command: SET name="value";).

The pig_config object supports the following:

  • continueOnFailure (pulumi.Input[bool]) - Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATH of the Pig Client and Hadoop MapReduce (MR) tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Pig. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains the Pig queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of Pig queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Pig command: name=[value]).

The placement object supports the following:

  • clusterName (pulumi.Input[str])

  • clusterUuid (pulumi.Input[str])

The pyspark_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Python drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainPythonFileUri (pulumi.Input[str]) - The HCFS URI of the main Python file to use as the driver. Must be a .py file.

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • pythonFileUris (pulumi.Input[list]) - HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

The reference object supports the following:

  • job_id (pulumi.Input[str])

The scheduling object supports the following:

  • maxFailuresPerHour (pulumi.Input[float])

The spark_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Spark drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Spark driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainClass (pulumi.Input[str]) - The name of the driver’s main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri

  • mainJarFileUri (pulumi.Input[str]) - The HCFS URI of the jar file containing the main class. Examples: ‘gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar’ ‘hdfs:/tmp/test-samples/custom-wordcount.jar’ ‘file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar’. Conflicts with main_class

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

The sparksql_config object supports the following:

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to be added to the Spark CLASSPATH.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Spark SQL’s SparkConf. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains SQL queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of SQL queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).
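As noted above, here is a minimal sketch of a job definition using pyspark_config. The cluster name and gs:// script URI are placeholders, and the camelCase dict keys follow the reference lists above:

import pulumi
import pulumi_gcp as gcp

# Submit a PySpark job to an existing cluster; placeholder values throughout.
pyspark = gcp.dataproc.Job("pyspark",
    region="us-central1",
    force_delete=True,
    placement={
        "clusterName": "your-dataproc-cluster",
    },
    pyspark_config={
        "mainPythonFileUri": "gs://your-bucket/wordcount.py",
        "properties": {
            "spark.logConf": "true",
        },
    })

pulumi.export("pyspark_driver_output", pyspark.driver_output_resource_uri)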

driver_controls_files_uri: pulumi.Output[str] = None

If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driver_output_uri.

driver_output_resource_uri: pulumi.Output[str] = None

A URI pointing to the location of the stdout of the job’s driver program.

force_delete: pulumi.Output[bool] = None

By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.

labels: pulumi.Output[dict] = None

The list of labels (key/value pairs) to add to the job.

project: pulumi.Output[str] = None

The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.

region: pulumi.Output[str] = None

The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.

scheduling: pulumi.Output[dict] = None

Optional. Job scheduling configuration.

  • maxFailuresPerHour (float)

static get(resource_name, id, opts=None, driver_controls_files_uri=None, driver_output_resource_uri=None, force_delete=None, hadoop_config=None, hive_config=None, labels=None, pig_config=None, placement=None, project=None, pyspark_config=None, reference=None, region=None, scheduling=None, spark_config=None, sparksql_config=None, status=None)

Get an existing Job resource’s state with the given name, id, and optional extra properties used to qualify the lookup. A lookup sketch follows the configuration reference below.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • driver_controls_files_uri (pulumi.Input[str]) – If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driver_output_uri.

  • driver_output_resource_uri (pulumi.Input[str]) – A URI pointing to the location of the stdout of the job’s driver program.

  • force_delete (pulumi.Input[bool]) – By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.

  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to add to the job.

  • project (pulumi.Input[str]) – The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.

  • region (pulumi.Input[str]) – The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.

  • scheduling (pulumi.Input[dict]) – Optional. Job scheduling configuration.

The hadoop_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Hadoop drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Hadoop driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainClass (pulumi.Input[str]) - The name of the driver’s main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri

  • mainJarFileUri (pulumi.Input[str]) - The HCFS URI of the jar file containing the main class. Examples: ‘gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar’ ‘hdfs:/tmp/test-samples/custom-wordcount.jar’ ‘file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar’. Conflicts with main_class

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Hadoop. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

The hive_config object supports the following:

  • continueOnFailure (pulumi.Input[bool]) - Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATH of the Hive server and Hadoop MapReduce (MR) tasks.

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Hive. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains Hive queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of Hive queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Hive command: SET name="value";).

The pig_config object supports the following:

  • continueOnFailure (pulumi.Input[bool]) - Whether to continue executing queries if a query fails. Setting this to true can be useful when executing independent parallel queries. Defaults to false.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATH of the Pig Client and Hadoop MapReduce (MR) tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Pig. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains the Pig queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of Pig queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Pig command: name=[value]).

The placement object supports the following:

  • clusterName (pulumi.Input[str])

  • clusterUuid (pulumi.Input[str])

The pyspark_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Python drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Python driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainPythonFileUri (pulumi.Input[str]) - The HCFS URI of the main Python file to use as the driver. Must be a .py file.

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • pythonFileUris (pulumi.Input[list]) - HCFS file URIs of Python files to pass to the PySpark framework. Supported file types: .py, .egg, and .zip.

The reference object supports the following:

  • job_id (pulumi.Input[str])

The scheduling object supports the following:

  • maxFailuresPerHour (pulumi.Input[float])

The spark_config object supports the following:

  • archiveUris (pulumi.Input[list]) - HCFS URIs of archives to be extracted into the working directory. Supported file types: .jar, .tar, .tar.gz, .tgz, and .zip.

  • args (pulumi.Input[list]) - The arguments to pass to the driver. Do not include arguments, such as -libjars or -Dfoo=bar, that can be set as job properties, since a collision may occur that causes an incorrect job submission.

  • fileUris (pulumi.Input[list]) - HCFS URIs of files to be copied to the working directory of Spark drivers and distributed tasks. Useful for naively parallel tasks.

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to add to the CLASSPATHs of the Spark driver and tasks.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • mainClass (pulumi.Input[str]) - The name of the driver’s main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris. Conflicts with main_jar_file_uri

  • mainJarFileUri (pulumi.Input[str]) - The HCFS URI of the jar file containing the main class. Examples: ‘gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar’ ‘hdfs:/tmp/test-samples/custom-wordcount.jar’ ‘file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar’. Conflicts with main_class

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

The sparksql_config object supports the following:

  • jarFileUris (pulumi.Input[list]) - HCFS URIs of jar files to be added to the Spark CLASSPATH.

  • loggingConfig (pulumi.Input[dict])

    • driverLogLevels (pulumi.Input[dict])

  • properties (pulumi.Input[dict]) - A mapping of property names to values, used to configure Spark SQL’s SparkConf. Properties that conflict with values set by the Cloud Dataproc API may be overwritten.

  • queryFileUri (pulumi.Input[str]) - The HCFS URI of the script that contains SQL queries. Conflicts with query_list

  • queryLists (pulumi.Input[list]) - The list of SQL queries or statements to execute as part of the job. Conflicts with query_file_uri

  • scriptVariables (pulumi.Input[dict]) - Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";).

The status object supports the following:

  • details (pulumi.Input[str])

  • state (pulumi.Input[str])

  • stateStartTime (pulumi.Input[str])

  • substate (pulumi.Input[str])
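A minimal lookup sketch for an already-submitted job follows. The ID format is an assumption for illustration; use the provider ID recorded in your stack’s state.

import pulumi
import pulumi_gcp as gcp

# Hypothetical provider ID; check your stack state for the real value.
existing = gcp.dataproc.Job.get("existing-job",
    id="projects/your-project/regions/your-region/jobs/your-job-id")

pulumi.export("existing_driver_output", existing.driver_output_resource_uri)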

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.JobIAMBinding(resource_name, opts=None, condition=None, job_id=None, members=None, project=None, region=None, role=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc jobs. Each of these resources serves a different use case:

  • dataproc.JobIAMPolicy: Authoritative. Sets the IAM policy for the job and replaces any existing policy already attached.

  • dataproc.JobIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the job are preserved.

  • dataproc.JobIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the job are preserved.

Note: dataproc.JobIAMPolicy cannot be used in conjunction with dataproc.JobIAMBinding and dataproc.JobIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the job as dataproc.JobIAMPolicy replaces the entire policy.

Note: dataproc.JobIAMBinding resources can be used in conjunction with dataproc.JobIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.JobIAMPolicy("editor",
    project="your-project",
    region="your-region",
    job_id="your-dataproc-job",
    policy_data=admin.policy_data)
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.JobIAMBinding("editor",
    job_id="your-dataproc-job",
    members=["user:jane@example.com"],
    role="roles/editor")
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.JobIAMMember("editor",
    job_id="your-dataproc-job",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

etag: pulumi.Output[str] = None

(Computed) The etag of the job’s IAM policy.

project: pulumi.Output[str] = None

The project in which the job belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the job belongs. If it is not provided, the provider will use a default.

role: pulumi.Output[str] = None

The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

static get(resource_name, id, opts=None, condition=None, etag=None, job_id=None, members=None, project=None, region=None, role=None)

Get an existing JobIAMBinding resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • etag (pulumi.Input[str]) – (Computed) The etag of the job’s IAM policy.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.JobIAMMember(resource_name, opts=None, condition=None, job_id=None, member=None, project=None, region=None, role=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc jobs. Each of these resources serves a different use case:

  • dataproc.JobIAMPolicy: Authoritative. Sets the IAM policy for the job and replaces any existing policy already attached.

  • dataproc.JobIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the job are preserved.

  • dataproc.JobIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the job are preserved.

Note: dataproc.JobIAMPolicy cannot be used in conjunction with dataproc.JobIAMBinding and dataproc.JobIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the job as dataproc.JobIAMPolicy replaces the entire policy.

Note: dataproc.JobIAMBinding resources can be used in conjunction with dataproc.JobIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.JobIAMPolicy("editor",
    project="your-project",
    region="your-region",
    job_id="your-dataproc-job",
    policy_data=admin.policy_data)
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.JobIAMBinding("editor",
    job_id="your-dataproc-job",
    members=["user:jane@example.com"],
    role="roles/editor")
import pulumi
import pulumi_gcp as gcp

editor = gcp.dataproc.JobIAMMember("editor",
    job_id="your-dataproc-job",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

etag: pulumi.Output[str] = None

(Computed) The etag of the job’s IAM policy.

project: pulumi.Output[str] = None

The project in which the job belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the job belongs. If it is not provided, the provider will use a default.

role: pulumi.Output[str] = None

The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

static get(resource_name, id, opts=None, condition=None, etag=None, job_id=None, member=None, project=None, region=None, role=None)

Get an existing JobIAMMember resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • etag (pulumi.Input[str]) – (Computed) The etag of the job’s IAM policy.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.

  • role (pulumi.Input[str]) – The role that should be applied. Only one dataproc.JobIAMBinding can be used per role. Note that custom roles must be of the format [projects|organizations]/{parent-name}/roles/{role-name}.

The condition object supports the following:

  • description (pulumi.Input[str])

  • expression (pulumi.Input[str])

  • title (pulumi.Input[str])

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

class pulumi_gcp.dataproc.JobIAMPolicy(resource_name, opts=None, job_id=None, policy_data=None, project=None, region=None, __props__=None, __name__=None, __opts__=None)

Three different resources help you manage IAM policies on dataproc jobs. Each of these resources serves a different use case:

  • dataproc.JobIAMPolicy: Authoritative. Sets the IAM policy for the job and replaces any existing policy already attached.

  • dataproc.JobIAMBinding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the job are preserved.

  • dataproc.JobIAMMember: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the job are preserved.

Note: dataproc.JobIAMPolicy cannot be used in conjunction with dataproc.JobIAMBinding and dataproc.JobIAMMember or they will fight over what your policy should be. In addition, be careful not to accidentally unset ownership of the job as dataproc.JobIAMPolicy replaces the entire policy.

Note: dataproc.JobIAMBinding resources can be used in conjunction with dataproc.JobIAMMember resources only if they do not grant privilege to the same role.

import pulumi
import pulumi_gcp as gcp

# dataproc.JobIAMPolicy: authoritative; replaces the job's entire IAM policy.
admin = gcp.organizations.get_iam_policy(bindings=[{
    "role": "roles/editor",
    "members": ["user:jane@example.com"],
}])
editor = gcp.dataproc.JobIAMPolicy("editor",
    project="your-project",
    region="your-region",
    job_id="your-dataproc-job",
    policy_data=admin.policy_data)

import pulumi
import pulumi_gcp as gcp

# dataproc.JobIAMBinding: authoritative for the given role only.
editor = gcp.dataproc.JobIAMBinding("editor",
    job_id="your-dataproc-job",
    members=["user:jane@example.com"],
    role="roles/editor")

import pulumi
import pulumi_gcp as gcp

# dataproc.JobIAMMember: non-authoritative; adds a single member to the role.
editor = gcp.dataproc.JobIAMMember("editor",
    job_id="your-dataproc-job",
    member="user:jane@example.com",
    role="roles/editor")
Parameters
  • resource_name (str) – The name of the resource.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • job_id (pulumi.Input[str]) – The ID of the dataproc job.

  • policy_data (pulumi.Input[str]) – The policy data generated by an organizations.getIAMPolicy data source.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.

etag: pulumi.Output[str] = None

(Computed) The etag of the job’s IAM policy.

policy_data: pulumi.Output[str] = None

The policy data generated by an organizations.getIAMPolicy data source.

project: pulumi.Output[str] = None

The project in which the job belongs. If it is not provided, the provider will use a default.

region: pulumi.Output[str] = None

The region in which the job belongs. If it is not provided, the provider will use a default.

static get(resource_name, id, opts=None, etag=None, job_id=None, policy_data=None, project=None, region=None)

Get an existing JobIAMPolicy resource’s state with the given name, id, and optional extra properties used to qualify the lookup.

Parameters
  • resource_name (str) – The unique name of the resulting resource.

  • id (str) – The unique provider ID of the resource to lookup.

  • opts (pulumi.ResourceOptions) – Options for the resource.

  • etag (pulumi.Input[str]) – (Computed) The etag of the job’s IAM policy.

  • job_id (pulumi.Input[str]) – The ID of the dataproc job.

  • policy_data (pulumi.Input[str]) – The policy data generated by an organizations.getIAMPolicy data source.

  • project (pulumi.Input[str]) – The project in which the job belongs. If it is not provided, the provider will use a default.

  • region (pulumi.Input[str]) – The region in which the job belongs. If it is not provided, the provider will use a default.
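
As a hedged sketch, rehydrating an existing JobIAMPolicy from state might look like the following; the ID string is an assumed placeholder for the provider-assigned resource ID, not a documented format.

import pulumi
import pulumi_gcp as gcp

# Minimal sketch: look up an existing JobIAMPolicy by provider ID.
# The ID below is illustrative only.
policy = gcp.dataproc.JobIAMPolicy.get(
    "existing-policy",
    "your-project/your-region/your-dataproc-job")
pulumi.export("policy_etag", policy.etag)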

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters

prop (str) – A property name.

Returns

A potentially transformed property name.

Return type

str