1. Onboarding Data Scientists to a Unified MongoDB Atlas Data Platform


    MongoDB Atlas is a fully managed cloud database built by the same team that develops MongoDB. Pulumi lets you manage MongoDB Atlas resources as infrastructure as code. Onboarding data scientists to a unified MongoDB Atlas data platform involves creating several kinds of resources, including user accounts, clusters, databases, and access permissions.

    The following Pulumi program in Python sets up a basic MongoDB Atlas environment: it creates a project, adds a cluster to that project, creates a team for the data scientists, and grants that team access to the project.

    The code includes comments that explain each step. After the code, we'll go over it more generally so that the whole process is clear.

    import pulumi
    import pulumi_mongodbatlas as mongodbatlas

    # Placeholders: replace these with your own values.
    # The Atlas API key pair (public and private key, e.g. mongodbatlas:privateKey)
    # is not set here; the provider reads it from its own configuration.
    organization_id = 'pulumi_mongodbatlas:orgId'
    project_name = 'data-science-project'

    # Create a team dedicated to your data scientists and add its members.
    # The usernames are the email addresses of MongoDB Atlas users who already
    # belong to the organization.
    team = mongodbatlas.Team("team",
        org_id=organization_id,
        name="Data Scientists",
        usernames=["user1@example.com", "user2@example.com"])

    # Create a new MongoDB Atlas project where your clusters and users will live,
    # and grant the team read/write data access to it.
    # Teams take project roles (GROUP_*), not database roles such as readWriteAnyDatabase.
    project = mongodbatlas.Project("project",
        org_id=organization_id,
        name=project_name,
        teams=[mongodbatlas.ProjectTeamArgs(
            team_id=team.team_id,
            role_names=["GROUP_DATA_ACCESS_READ_WRITE"])])

    # Define a cluster where your databases will be stored.
    # M10 is the smallest dedicated tier; choose a different size based on your needs.
    # cloud_backup replaces the legacy backup_enabled option.
    cluster = mongodbatlas.Cluster("cluster",
        project_id=project.id,
        name="data-science-cluster",
        disk_size_gb=10,
        num_shards=1,
        provider_name="AWS",
        provider_region_name="US_EAST_1",
        provider_instance_size_name="M10",
        cloud_backup=True,
        auto_scaling_disk_gb_enabled=True)

    # Export the IDs of the created resources as final output.
    pulumi.export('project_id', project.id)
    pulumi.export('cluster_id', cluster.id)
    pulumi.export('team_id', team.team_id)

    Let's walk through the code:

    1. Team Setup: We create a team of data scientists in the MongoDB Atlas organization using the mongodbatlas.Team resource. A team groups users so that permissions can be granted to all members at once.

    2. Team Membership: Members are added through the usernames argument of mongodbatlas.Team; the provider has no separate team-membership resource in this program. Replace the example email addresses with the real email addresses of your data scientists. Each address must belong to a user who has already joined the organization.

    3. Project Creation: We create a new MongoDB Atlas project using the mongodbatlas.Project resource. The project is the container for your clusters and the common place your data science team works from.

    4. Assign Roles: The teams argument of mongodbatlas.Project attaches the team to the project with the GROUP_DATA_ACCESS_READ_WRITE project role, which gives members read and write access to the data in the project's clusters through Atlas. Database roles such as readWriteAnyDatabase are not valid here; they apply to database users, which are needed separately for connecting to the cluster with a driver.

    5. Cluster Configuration: Finally, we create a MongoDB Atlas cluster with the mongodbatlas.Cluster resource. The cluster holds the databases your data scientists will query and write to. Its backup setting, disk size, and instance size are set explicitly and can be adjusted to the project's needs.
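
    Atlas keeps project roles (granted to teams and Atlas users) separate from database roles (granted to database users). The sketch below shows one way to keep the two apart in a Pulumi program. The access levels "read", "read_write", and "admin" and both function names are hypothetical helpers, not part of the provider; the GROUP_* strings are Atlas project role names. The prefix check is a cheap guard, not a full validation.

```python
# Hypothetical access levels mapped to Atlas project role names.
PROJECT_ROLES_BY_ACCESS = {
    "read": ["GROUP_DATA_ACCESS_READ_ONLY"],
    "read_write": ["GROUP_DATA_ACCESS_READ_WRITE"],
    "admin": ["GROUP_DATA_ACCESS_ADMIN"],
}


def project_roles_for(access):
    """Return the project role names for a hypothetical access level."""
    try:
        # Copy the list so callers cannot mutate the shared mapping.
        return list(PROJECT_ROLES_BY_ACCESS[access])
    except KeyError:
        raise ValueError(f"unknown access level: {access!r}") from None


def check_team_roles(role_names):
    """Reject names that cannot be project roles (those start with GROUP_)."""
    bad = [name for name in role_names if not name.startswith("GROUP_")]
    if bad:
        raise ValueError(f"not project roles: {bad}")
```

    Calling check_team_roles(["readWriteAnyDatabase"]) raises ValueError, which catches the common mistake of passing a database role where a project role is expected before the deployment reaches Atlas.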

    At the end of the program, we export project_id, cluster_id, and team_id as stack outputs. Other Pulumi stacks can read them through stack references, and scripts can read them with the pulumi stack output command.

    With this basic setup, your data scientists should have access to a shared MongoDB Atlas platform. You can further customize the resources and permissions as needed by adding more clusters, fine-tuning user roles, or integrating with third-party services such as data visualization tools.
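
    One such customization is giving each data scientist a database user, which is what a driver or notebook needs in order to connect to the cluster. The following is a minimal sketch, not a definitive implementation. DbUserSpec, specs_from_emails, and create_database_users are hypothetical names; deriving the username from the email's local part and sharing one password across users are assumptions made only to keep the example short. The mongodbatlas.DatabaseUser resource and its readWriteAnyDatabase role on the admin database come from the provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbUserSpec:
    """Plain description of one database user (hypothetical helper type)."""
    username: str
    role_name: str = "readWriteAnyDatabase"
    database_name: str = "admin"


def specs_from_emails(emails):
    """Derive one database username per email address (local part only)."""
    return [DbUserSpec(username=email.split("@", 1)[0]) for email in emails]


def create_database_users(project_id, specs, password):
    """Declare one mongodbatlas.DatabaseUser per spec.

    Must be called from inside a Pulumi program. `password` is assumed to be
    a secret value, for example pulumi.Config().require_secret("dbPassword").
    """
    # Imported lazily so the planning helpers above work without Pulumi installed.
    import pulumi_mongodbatlas as mongodbatlas

    return [
        mongodbatlas.DatabaseUser(
            f"db-user-{spec.username}",
            project_id=project_id,
            username=spec.username,
            password=password,
            auth_database_name="admin",
            roles=[mongodbatlas.DatabaseUserRoleArgs(
                role_name=spec.role_name,
                database_name=spec.database_name)],
        )
        for spec in specs
    ]
```

    Inside the program this could be called as create_database_users(project.id, specs_from_emails(["user1@example.com", "user2@example.com"]), pulumi.Config().require_secret("dbPassword")). In practice you would likely want a distinct password per user.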

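    Remember to replace the placeholders, such as pulumi_mongodbatlas:orgId and the example user emails, with your actual organization ID and email addresses. No resource in the program consumes the API key directly: the MongoDB Atlas provider reads the public/private key pair from its own configuration (mongodbatlas:publicKey and mongodbatlas:privateKey) or from environment variables, so the key should not be hard-coded in the program.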

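    To execute this Pulumi program, make sure that the Pulumi CLI is installed, that your MongoDB Atlas account has permission to create projects, clusters, and teams in the organization, and that the Atlas API key pair is stored securely, for example as encrypted stack configuration set with pulumi config set mongodbatlas:privateKey <value> --secret. Then run pulumi up to preview and apply the changes.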
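
    If you supply the API key pair through environment variables instead of stack configuration, a small pre-flight check can fail fast before Pulumi contacts Atlas. The sketch below assumes the provider's MONGODB_ATLAS_PUBLIC_KEY and MONGODB_ATLAS_PRIVATE_KEY variables; the function name is hypothetical, and the check does not apply when the keys live in Pulumi configuration.

```python
import os

# Environment variables the MongoDB Atlas provider can read its API key pair from.
REQUIRED_ENV_VARS = ("MONGODB_ATLAS_PUBLIC_KEY", "MONGODB_ATLAS_PRIVATE_KEY")


def missing_atlas_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

    A wrapper script could call missing_atlas_credentials() and stop with a clear message when the returned list is not empty, rather than letting the deployment fail on the first API call.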