Managed ML Environments with AWS Service Catalog Artifacts

Pulumi · Accepted Answer

To manage machine learning (ML) environments on AWS, you can use AWS Service Catalog together with Amazon SageMaker. AWS Service Catalog lets organizations create and manage catalogs of IT services that are approved for use on AWS. This gives you a standardized environment for your ML workflow, which can include resources such as SageMaker notebooks, training jobs, and model hosting.

As part of managing ML environments with Service Catalog Artifacts, you'd typically:

1. **Define Products**: Products are the IT services you make available for deployment on AWS. In an ML context these could be notebook instances, training jobs, inference endpoints, or complete SageMaker projects.

2. **Create Provisioning Artifacts**: Provisioning artifacts are the versions of a product. Each one points to its own template and carries its own configuration.

3. **Set up Portfolios**: These are collections of products that can be managed and maintained together. You can set constraints and define roles that have access to the portfolio.

4. **Associate a SageMaker Project**: SageMaker projects are provisioned through Service Catalog, so you enable the Service Catalog portfolio status within SageMaker before managing projects this way.

The AWS Provider for Pulumi provides the necessary resources for creating and managing these Service Catalog artifacts.
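As a rough sketch of how steps 2 to 4 might look in Pulumi, the function below declares a portfolio, attaches an existing product to it, grants one IAM principal access, adds a second product version, and enables the SageMaker Service Catalog portfolio status. The function name `declare_ml_portfolio`, the resource names, the portfolio name, and the `v1.1` version label are all assumptions made for illustration; the caller supplies the product ID, principal ARN, and template URL.

```python
def declare_ml_portfolio(product_id, principal_arn, template_url):
    """Declare portfolio-related resources for an existing Service Catalog product.

    Must be called from inside a Pulumi program (for example from __main__.py).
    All names below are illustrative placeholders, not required values.
    """
    # Imported here so the module can be loaded without the Pulumi SDK or engine.
    import pulumi_aws as aws

    # Step 3: a portfolio that groups the approved ML products.
    portfolio = aws.servicecatalog.Portfolio("mlPortfolio",
        name="ml-environments",
        description="Approved ML environments",
        provider_name="dataScienceTeam")

    # Put the product into the portfolio.
    aws.servicecatalog.ProductPortfolioAssociation("mlProductInPortfolio",
        portfolio_id=portfolio.id,
        product_id=product_id)

    # Grant one IAM principal (role, user, or group) access to the portfolio.
    aws.servicecatalog.PrincipalPortfolioAssociation("mlPortfolioAccess",
        portfolio_id=portfolio.id,
        principal_arn=principal_arn)

    # Step 2: an additional version (provisioning artifact) of the product.
    aws.servicecatalog.ProvisioningArtifact("mlProductV1_1",
        product_id=product_id,
        name="v1.1",
        type="CLOUD_FORMATION_TEMPLATE",
        template_url=template_url)

    # Step 4: enable Service Catalog for SageMaker projects in this account and region.
    aws.sagemaker.ServicecatalogPortfolioStatus("sagemakerProjects",
        status="Enabled")

    return portfolio
```

In a real program you would call this with the product's `id` output and the ARN of the role your data scientists assume. Constraints such as launch roles are left out here to keep the sketch short.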

Below is a Pulumi program written in Python that illustrates how you might set up a product in your service catalog that corresponds to a SageMaker Notebook instance. This product can then be provisioned, which enables users to spin up pre-configured SageMaker Notebooks as per the organization's policies and standards.

```python
import pulumi
import pulumi_aws as aws

# Define the Service Catalog product, which could represent a SageMaker Notebook.
product = aws.servicecatalog.Product("mlProduct",
    owner="dataScienceTeam",
    # The product type is passed through the `type` argument.
    type="CLOUD_FORMATION_TEMPLATE",
    tags={
        "Environment": "ManagedML",
    },
    description="SageMaker Notebook Product for ML Workflows",
    distributor="Internal",
    support_description="Contact the Data Science team for any issues",
    support_email="datascience@example.com",
    support_url="https://internal.example.com/support",
    # Hypothetical URL of a CloudFormation template for a SageMaker notebook instance.
    # In practice the template is usually hosted in Amazon S3.
    provisioning_artifact_parameters=aws.servicecatalog.ProductProvisioningArtifactParametersArgs(
        template_url="https://example.com/sagemaker-notebook-cf-template.yml",
        name="v1.0",
        type="CLOUD_FORMATION_TEMPLATE",
    ))

# Any user with access to this product (granted through a portfolio) can then provision it.
# The provisioning parameters must match the parameters declared in the template.
provisioned_product = aws.servicecatalog.ProvisionedProduct("mlProvisionedProduct",
    product_id=product.id,
    # Select the product version by the artifact name defined on the product.
    provisioning_artifact_name="v1.0",
    provisioning_parameters=[
        {
            "key": "InstanceType",
            "value": "ml.t2.medium",
        },
        {
            "key": "VolumeSize",
            "value": "5",
        },
    ])

# Export the Product ID and Provisioned Product ARN for future use or reference.
pulumi.export("productId", product.id)
pulumi.export("provisionedProductArn", provisioned_product.arn)
```

In this example:
- We create a `Product` that represents our ML environment (a SageMaker Notebook instance). We've tagged it, provided support contact information, and pointed it at a CloudFormation template that defines how the Notebook instance is actually provisioned.
- Then, we provision an instance of this product using `ProvisionedProduct`, specifying the parameters of the Notebook such as instance type and volume size.
- Finally, we export the `Product ID` and `Provisioned Product ARN` so that we can easily reference them later, for example in a CI/CD pipeline or in administrative scripts.
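The template URL in the example is hypothetical, so it may help to see what such a template could contain. The sketch below builds one as a Python dictionary and serializes it to JSON, which CloudFormation accepts alongside YAML. The parameter names `InstanceType` and `VolumeSize` mirror the provisioning parameters used above; the logical IDs `NotebookRole` and `Notebook`, the defaults, and the bare IAM role with no permissions attached are assumptions you would adapt to your own policies.

```python
import json


def notebook_template() -> dict:
    """Build an illustrative CloudFormation template for one SageMaker notebook."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "SageMaker notebook instance (illustrative sketch)",
        "Parameters": {
            # These names must match the keys passed as provisioning parameters.
            "InstanceType": {"Type": "String", "Default": "ml.t2.medium"},
            "VolumeSize": {"Type": "Number", "Default": 5, "MinValue": 5},
        },
        "Resources": {
            # Execution role the notebook assumes; attach permissions as needed.
            "NotebookRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [{
                            "Effect": "Allow",
                            "Principal": {"Service": "sagemaker.amazonaws.com"},
                            "Action": "sts:AssumeRole",
                        }],
                    },
                },
            },
            "Notebook": {
                "Type": "AWS::SageMaker::NotebookInstance",
                "Properties": {
                    "InstanceType": {"Ref": "InstanceType"},
                    "VolumeSizeInGB": {"Ref": "VolumeSize"},
                    "RoleArn": {"Fn::GetAtt": ["NotebookRole", "Arn"]},
                },
            },
        },
    }


# Serialize the template so it can be uploaded to wherever the product URL points.
template_json = json.dumps(notebook_template(), indent=2)
```

Because this template creates an IAM role, the launch would likely need to run under a role that is allowed to create IAM resources, for example through a Service Catalog launch constraint.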

By managing your ML environments through AWS Service Catalog using Pulumi, you get Infrastructure as Code benefits, such as versioning, auditing, and easy replication of environments, which provides a solid foundation for building a robust ML workflow.
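One way to consume the exported outputs from a CI/CD step or an administrative script is to read them with `pulumi stack output --json`. The helper below is a minimal sketch under that assumption: the function names are hypothetical, and the script assumes it runs in the project directory with the right stack already selected.

```python
import json
import subprocess

# Output names exported by the Pulumi program above.
EXPECTED_OUTPUTS = ("productId", "provisionedProductArn")


def parse_stack_outputs(raw_json: str) -> dict:
    """Parse the JSON printed by `pulumi stack output --json` and check the keys."""
    outputs = json.loads(raw_json)
    missing = [key for key in EXPECTED_OUTPUTS if key not in outputs]
    if missing:
        raise KeyError(f"missing stack outputs: {missing}")
    return outputs


def read_stack_outputs() -> dict:
    """Run the Pulumi CLI for the currently selected stack and return its outputs."""
    result = subprocess.run(
        ["pulumi", "stack", "output", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_stack_outputs(result.stdout)
```

Keeping the parsing separate from the CLI call makes the key check easy to exercise without a deployed stack.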

For full details about the resources used in the example, check out the Pulumi documentation:
- [`aws.servicecatalog.Product`](https://www.pulumi.com/registry/packages/aws/api-docs/servicecatalog/product/)
- [`aws.servicecatalog.ProvisionedProduct`](https://www.pulumi.com/registry/packages/aws/api-docs/servicecatalog/provisionedproduct/)