Fine-Grained Data Permissions with GCP Policy Tags

Pulumi · Accepted Answer

In Google Cloud Platform (GCP), access to BigQuery data is most commonly granted at the dataset or table level. For column-level control, you can use Policy Tags. Policy Tags are part of the Data Catalog service: you define a taxonomy of tags, attach those tags to columns in BigQuery tables, and then control who can read the tagged columns.

Here's a breakdown of the steps you would take to implement fine-grained data permissions using Policy Tags in GCP:

1. **Create a Taxonomy**: A taxonomy is a group of Policy Tags. It acts like a container or a namespace for the Policy Tags.
2. **Create Policy Tags**: Within your taxonomy, create Policy Tags corresponding to the different levels of data access that you require.
3. **Attach Policy Tags to BigQuery Columns**: Assign the created Policy Tags to specific columns within your BigQuery tables.
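4. **Control Access with IAM policies**: Apply Identity and Access Management (IAM) policies at the policy tag level to grant or restrict user access to columns marked with specific tags. Reading a tagged column requires the Fine-Grained Reader role (`roles/datacatalog.categoryFineGrainedReader`) on that tag.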

The following Pulumi program uses the GCP provider to create a taxonomy, add Policy Tags to it, and set an IAM policy on one of the tags:

```python
import pulumi
import pulumi_gcp as gcp

# First, initialize a new GCP Data Catalog Taxonomy.
taxonomy = gcp.datacatalog.Taxonomy("my-taxonomy",
    activated_policy_types=["FINE_GRAINED_ACCESS_CONTROL"],
    description="A taxonomy for controlling fine-grained data access.",
    # Display name of our taxonomy.
    display_name="Data Access Taxonomy")

# Create a Policy Tag for personally identifiable information (PII).
pii_policy_tag = gcp.datacatalog.PolicyTag("pii-policy-tag",
    taxonomy=taxonomy.id,
    display_name="PII",
    description="Policy tag for PII data.")

# Create a Policy Tag for financial information.
financial_policy_tag = gcp.datacatalog.PolicyTag("financial-policy-tag",
    taxonomy=taxonomy.id,
    display_name="Financial",
    description="Policy tag for financial data.")

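# Build an IAM policy that grants the Fine-Grained Reader role to a specific user.
# This is the role that allows reading columns protected by a policy tag.
pii_reader_policy = gcp.organizations.get_iam_policy(bindings=[
    gcp.organizations.GetIamPolicyBindingArgs(
        role="roles/datacatalog.categoryFineGrainedReader",
        members=["user:someone@example.com"],
    ),
])

# Attach the policy to the PII policy tag.
# Note: PolicyTagIamPolicy is authoritative and replaces any existing policy on the tag.
pii_policy_tag_iam_policy = gcp.datacatalog.PolicyTagIamPolicy("pii-policy-tag-iam-policy",
    policy_tag=pii_policy_tag.id,
    policy_data=pii_reader_policy.policy_data)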

# Export the IDs of the taxonomy and policy tags for use in other parts of our application or reference.
pulumi.export("taxonomy_id", taxonomy.id)
pulumi.export("pii_policy_tag_id", pii_policy_tag.id)
pulumi.export("financial_policy_tag_id", financial_policy_tag.id)
```

This program creates the taxonomy and policy tags needed to manage fine-grained access to your data. It also sets an IAM policy on the PII policy tag for a specific user. For that user to read columns tagged as PII, the binding must carry the Fine-Grained Reader role (`roles/datacatalog.categoryFineGrainedReader`).

After running this Pulumi program, you would then use the created Policy Tags to tag BigQuery table columns and use IAM policies to enforce who has access to data marked with those tags.
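The program above does not cover the tagging step itself, so here is a minimal sketch of one way to do it. BigQuery reads column-level policy tags from the table schema, where a field carries a `policyTags.names` list holding the tag's full resource name. The helper `build_schema`, the column names, and the placeholder resource name are all hypothetical and are only there to illustrate the schema shape.

```python
import json

def build_schema(columns, policy_tags):
    """Return a BigQuery schema as a JSON string.

    columns: list of (column_name, bigquery_type) tuples.
    policy_tags: dict mapping a column name to a policy tag resource name.
    """
    fields = []
    for name, column_type in columns:
        field = {"name": name, "type": column_type, "mode": "NULLABLE"}
        if name in policy_tags:
            # BigQuery expects the tag's full resource name under policyTags.names.
            field["policyTags"] = {"names": [policy_tags[name]]}
        fields.append(field)
    return json.dumps(fields)

# Placeholder resource name (an assumption for illustration only).
example_tag = "projects/my-project/locations/us/taxonomies/123/policyTags/456"

schema_json = build_schema(
    [("customer_id", "STRING"), ("email", "STRING")],
    {"email": example_tag},
)
print(schema_json)
```

In a Pulumi program, the resulting string could be passed as the `schema` argument of a `gcp.bigquery.Table`, resolving the tag name first with something like `pii_policy_tag.name.apply(lambda tag: build_schema(columns, {"email": tag}))`.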

Remember, this is an infrastructure-as-code approach, which means you can version control your fine-grained data permission settings and have a history and audit log of changes over time, aiding in governance and compliance efforts.