Controlling Access to Databricks SQL Endpoints

Question

Pulumi · Accepted Answer

Controlling access to Databricks SQL endpoints is an important part of running a secure, well-governed data platform. A Databricks SQL endpoint (called a SQL warehouse in current Databricks terminology) is a compute resource that runs SQL workloads in your Databricks workspace. In Pulumi, managing access involves two primary resources, `SqlEndpoint` and `Permissions`. The `SqlEndpoint` resource creates or manages a SQL endpoint, while the `Permissions` resource defines access control for that endpoint.

Below you will find a Pulumi program written in Python that demonstrates how to create a Databricks SQL endpoint and then control access to that endpoint by setting up permissions for a specific user or group.

In this example, we are doing the following:

1. Importing the necessary Databricks provider to manage Databricks resources.
2. Creating a new SQL endpoint with a specified cluster size, auto-stop timeout, and other configuration.
3. Granting permission to a specific user or group to access the endpoint.

Suppose we want to create a SQL endpoint and give access to a user with the username `data-scientist`. This user will be granted `CAN_USE` access to interact with the endpoint. In a real-world scenario, you would replace these with the actual usernames (typically email addresses in a Databricks workspace), groups, and access levels that your organization requires.
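Because the real principals and levels differ per organization, it can help to keep them as plain data and generate the access-control entries from it. The following is a minimal sketch under stated assumptions: `build_access_controls` and the group name `platform-admins` are hypothetical, and the set of allowed levels contains only the two levels named in this answer, so check the Databricks documentation for the levels your workspace supports.

```python
from typing import Dict, List

# Only the levels named in this answer; this set is an assumption, not the
# full list supported by Databricks.
ALLOWED_LEVELS = {"CAN_USE", "CAN_MANAGE"}


def build_access_controls(users: Dict[str, str], groups: Dict[str, str]) -> List[dict]:
    """Return one access-control entry per principal, sorted by name within each kind."""
    entries = []
    for key, principals in (("user_name", users), ("group_name", groups)):
        for name, level in sorted(principals.items()):
            if level not in ALLOWED_LEVELS:
                raise ValueError(f"unsupported permission level {level!r} for {name!r}")
            entries.append({key: name, "permission_level": level})
    return entries


access_controls = build_access_controls(
    users={"data-scientist": "CAN_USE"},
    groups={"platform-admins": "CAN_MANAGE"},  # hypothetical group name
)
```

Each entry uses the same field names (`user_name` or `group_name`, plus `permission_level`) as the access-control arguments in the program, so an entry could be unpacked with `**entry` into the provider's access-control args class.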

Here is the Pulumi program:

```python
import pulumi
import pulumi_databricks as databricks

# Create a new SQL endpoint
sql_endpoint = databricks.SqlEndpoint("sql-endpoint",
    name="my-endpoint",
    cluster_size="Medium",
    auto_stop_mins=30,
    enable_serverless_compute=False,
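    # The channel is a nested object, not a plain string
    channel=databricks.SqlEndpointChannelArgs(name="CHANNEL_NAME_CURRENT"),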
    # You can add more configuration for the SQL endpoint as needed
)

# Define access control for the SQL endpoint
permissions = databricks.Permissions("sql-endpoint-permissions",
    sql_endpoint_id=sql_endpoint.id,
    access_controls=[
        databricks.PermissionsAccessControlArgs(
            user_name="data-scientist",
            permission_level="CAN_USE",
            # There are different permission levels such as "CAN_MANAGE", "CAN_USE", etc.
            # Assign permission levels based on the user’s role and responsibilities.
        ),
    ],
    # You can add more access configurations for different users or groups as per your needs
)

# Export the JDBC URL of the SQL endpoint
pulumi.export("jdbc_url", sql_endpoint.jdbc_url)
```

In the above program:

- We import the `pulumi` and `pulumi_databricks` modules. The Pulumi Databricks provider allows us to create and manage Databricks-related resources programmatically.
- We create a SQL endpoint using the `databricks.SqlEndpoint` class ([docs](https://www.pulumi.com/registry/packages/databricks/api-docs/sqlendpoint/)). We specify several parameters, such as the cluster size and an auto-stop duration, so that the endpoint shuts down when idle instead of incurring unnecessary costs.
- We control the access to the SQL endpoint by creating a `Permissions` resource using the `databricks.Permissions` class ([docs](https://www.pulumi.com/registry/packages/databricks/api-docs/permissions/)). Here we grant the `CAN_USE` permission level to a user named `data-scientist`.
- Finally, we export the JDBC URL of the SQL endpoint as an output of our Pulumi stack. This URL can be used to connect to the SQL endpoint from your applications or data tools.
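Once the stack is deployed, the exported value can be read with `pulumi stack output jdbc_url`. As a rough sketch of how a client script might inspect that value, the helper below splits a JDBC URL into its base and its properties. It assumes the URL uses a semicolon-delimited `key=value` layout; `parse_jdbc_url` and the sample URL are hypothetical and only illustrate the idea.

```python
from typing import Dict, Tuple


def parse_jdbc_url(url: str) -> Tuple[str, Dict[str, str]]:
    """Split 'base;key=value;...' into the base URL and a property dict."""
    base, *parts = url.rstrip(";").split(";")
    props: Dict[str, str] = {}
    for part in parts:
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed JDBC property: {part!r}")
        props[key] = value
    return base, props


# Hypothetical URL for illustration only; a real one comes from the stack output.
sample = (
    "jdbc:spark://example.cloud.databricks.com:443/default;"
    "transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/abc123"
)
base, props = parse_jdbc_url(sample)
```

Here `props["httpPath"]` holds the path portion that identifies the endpoint, which is usually the piece a driver or BI tool asks for alongside the host name.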

This program is a starting point for controlling SQL endpoint access in Databricks using Pulumi, and additional configurations can be added as needed for your specific use case.