1. Multi-Tenancy AI Environment Setup in Databricks Workspace

    To set up a multi-tenancy AI environment within a Databricks workspace, you will need to create and configure several resources. These include the Databricks workspace itself, as well as the necessary configurations within that workspace to support multi-tenancy, such as different folders or permissions for each tenant.

    First, we'll use the azure-native.databricks.Workspace resource to create a new Databricks workspace in Azure, specifying details such as the name, location, SKU, and managed resource group. Workspace-level objects such as directories and users are then managed through the Databricks provider (pulumi_databricks), configured to point at the new workspace. In practice, multi-tenancy also tends to involve separate user accounts, per-tenant permissions, and possibly separate metastore configurations for isolation. For simplicity, this example demonstrates creating the workspace and some basic structure, such as directories for different projects or tenants.

    In the following program, we'll:

    1. Create a Databricks workspace.
    2. Create directories within that workspace that could represent different tenants or projects.
    3. Sketch (in a commented example) how users could be created and mapped to different tenants for access management.

    Make sure you've logged in to Pulumi and set up your Azure credentials before running the following program.

    import pulumi
    import pulumi_azure_native as azure_native
    import pulumi_databricks as databricks

    # Look up the current Azure subscription so we can build the managed
    # resource group ID that a Databricks workspace requires.
    client_config = azure_native.authorization.get_client_config()

    # Creating a Databricks workspace
    workspace = azure_native.databricks.Workspace("myDatabricksWorkspace",
        resource_group_name="myResourceGroup",  # replace with your existing resource group
        location="West US 2",
        sku=azure_native.databricks.SkuArgs(
            name="standard"  # Choose the appropriate SKU for your needs
        ),
        # Azure creates a dedicated managed resource group for every Databricks
        # workspace; the name below is a placeholder you can change.
        managed_resource_group_id=f"/subscriptions/{client_config.subscription_id}/resourceGroups/myDatabricksWorkspace-managed-rg",
        tags={
            "Environment": "Development"
        }
    )

    # Configure the Databricks provider against the new workspace so that
    # workspace-level objects (directories, users, and so on) can be managed.
    databricks_provider = databricks.Provider("databricksProvider",
        host=workspace.workspace_url.apply(lambda url: f"https://{url}"),
        azure_workspace_resource_id=workspace.id,
    )

    # Creating directories within the workspace for different projects or tenants
    # Directory for project A
    project_a_directory = databricks.Directory("projectADirectory",
        path="/ProjectA",
        opts=pulumi.ResourceOptions(provider=databricks_provider),
    )

    # Directory for project B
    project_b_directory = databricks.Directory("projectBDirectory",
        path="/ProjectB",
        opts=pulumi.ResourceOptions(provider=databricks_provider),
    )

    # If we had user information, we could create users for each tenant/project like so:
    # user_for_project_a = databricks.User("userForProjectA",
    #     user_name="user.a@example.com",   # user names are normally email addresses
    #     display_name="User for Project A",
    #     active=True,
    #     opts=pulumi.ResourceOptions(provider=databricks_provider),
    # )

    # pulumi.export() is used to output the ID and URL of the Databricks workspace.
    # These can be used to access the workspace or within other systems that need this information.
    pulumi.export('databricks_workspace_id', workspace.id)
    pulumi.export('databricks_workspace_url', workspace.workspace_url)

    In this program:

    • We create a Databricks workspace in Azure with azure_native.databricks.Workspace. Replace "myResourceGroup" with the name of your existing Azure resource group, and adjust the location and tags as needed. The managed_resource_group_id points at the dedicated managed resource group Azure creates alongside every Databricks workspace; it is built here from your current subscription ID.
    • We then create directories for two hypothetical projects, ProjectA and ProjectB, using the databricks.Directory resource from the Databricks provider (pulumi_databricks), which manages folders within the Databricks workspace. An explicit databricks.Provider, pointed at the new workspace, is passed to each of these resources.
    • We include a comment block for creating a user; it is not executed here, but it indicates how you could add users to the Databricks workspace. A hedged sketch of per-tenant groups and directory permissions follows this list.
    • We are exporting the workspace ID and URL, which you can use to access the workspace from the Databricks portal or APIs.
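
    To go a step further on isolation, each tenant can get its own group and scoped permissions on its directory. The following is a minimal sketch, assuming the databricks_provider and the /ProjectA directory (project_a_directory) from the program above; the group name, user email, and permission level are illustrative choices, not fixed requirements.

    import pulumi
    import pulumi_databricks as databricks

    # Group representing tenant A; the display name is a hypothetical choice.
    project_a_group = databricks.Group("projectAGroup",
        display_name="project-a-tenant",
        opts=pulumi.ResourceOptions(provider=databricks_provider),
    )

    # A user for tenant A (user_name is normally an email address).
    project_a_user = databricks.User("projectAUser",
        user_name="user.a@example.com",
        display_name="User for Project A",
        opts=pulumi.ResourceOptions(provider=databricks_provider),
    )

    # Map the user into the tenant group.
    project_a_membership = databricks.GroupMember("projectAMembership",
        group_id=project_a_group.id,
        member_id=project_a_user.id,
        opts=pulumi.ResourceOptions(provider=databricks_provider),
    )

    # Restrict management of the /ProjectA directory to the tenant group.
    project_a_permissions = databricks.Permissions("projectAPermissions",
        directory_path="/ProjectA",
        access_controls=[databricks.PermissionsAccessControlArgs(
            group_name=project_a_group.display_name,
            permission_level="CAN_MANAGE",
        )],
        opts=pulumi.ResourceOptions(
            provider=databricks_provider,
            depends_on=[project_a_directory],
        ),
    )

    Repeating this pattern per tenant keeps each project's folder manageable only by that tenant's group, which is the core of folder-level isolation inside a single workspace.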

    To apply the above Pulumi program:

    1. Save the program as __main__.py inside a Pulumi Python project (for example, one created with pulumi new python), and make sure the pulumi-azure-native and pulumi-databricks packages are installed in the project's virtual environment.
    2. Run pulumi up in the project directory.

    Pulumi will prompt you for any required actions and will show you a preview of the resources that will be created. After reviewing the changes, you can choose to apply them. Once applied, Pulumi will output the exported values, such as the workspace ID and URL, which you can use to access your new workspace.
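
    If other Pulumi programs need these values, they can consume them through a stack reference rather than hard-coding them. The snippet below is a minimal sketch, assuming the stack above is named my-org/databricks-infra/dev; replace that name with your own organization, project, and stack.

    import pulumi

    # Reference the stack that created the Databricks workspace.
    # "my-org/databricks-infra/dev" is a hypothetical stack name.
    infra = pulumi.StackReference("my-org/databricks-infra/dev")

    workspace_id = infra.get_output("databricks_workspace_id")
    workspace_url = infra.get_output("databricks_workspace_url")

    # Use or re-export the values in the downstream program.
    pulumi.export("upstream_workspace_url", workspace_url)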