Securely Connect to SageMaker Endpoints within a VPC

Question

Pulumi · Accepted Answer

To securely connect to SageMaker endpoints within a Virtual Private Cloud (VPC), deploy the endpoint so that the model containers behind it run inside your VPC's subnets and security groups. To achieve this with Pulumi, you would follow these steps:

1. Set up a VPC with the necessary subnets and security groups.
2. Create a SageMaker model whose VPC configuration references those subnets and security groups.
3. Create a SageMaker endpoint configuration that selects the model and the instance type.
4. Create a SageMaker endpoint using the endpoint configuration created in the previous step.

Here's a Pulumi program in Python that demonstrates these steps. It assumes you already have a trained model ready to deploy. It sets up a new VPC (you could use an existing one instead), creates a security group that allows traffic within the VPC, and then configures and deploys a SageMaker endpoint whose model containers are attached to that VPC.

import pulumi
import pulumi_aws as aws

# Create a new VPC with specified CIDR block
vpc = aws.ec2.Vpc("sagemaker_vpc",
                  cidr_block="10.0.0.0/16",
                  enable_dns_hostnames=True)

# Create subnets within the VPC
subnet = aws.ec2.Subnet("sagemaker_subnet",
                         vpc_id=vpc.id,
                         cidr_block="10.0.1.0/24",
                         availability_zone="us-west-2a")  # Change this to your desired AWS region & AZ

# Create a security group for the SageMaker endpoint within the VPC
security_group = aws.ec2.SecurityGroup("sagemaker_sg",
                                       vpc_id=vpc.id,
                                       description="Allow internal sagemaker traffic",
                                       tags={
                                           "Name": "sagemaker_vpc_sg"
                                       })

# Allow all traffic within the VPC on the Security Group
all_traffic_within_vpc = aws.ec2.SecurityGroupRule("ingress",
                                                   type="ingress",
                                                   from_port=0,
                                                   to_port=0,
                                                   protocol="-1",  # This means all protocols
                                                   cidr_blocks=[vpc.cidr_block],
                                                   security_group_id=security_group.id)

# Create a SageMaker model. The VPC configuration belongs on the model,
# not on the endpoint configuration.
model = aws.sagemaker.Model("sagemaker-model",  # SageMaker names cannot contain underscores
                            execution_role_arn="<YOUR-SAGEMAKER-EXECUTION-ROLE-ARN>",  # Replace with actual ARN
                            primary_container=aws.sagemaker.ModelPrimaryContainerArgs(
                                image="<YOUR-MODEL-IMAGE>",  # Replace with the container image URL
                                model_data_url="<YOUR-MODEL-DATA-URL>",  # Replace with the S3 URL to your trained model data
                            ),
                            vpc_config=aws.sagemaker.ModelVpcConfigArgs(
                                subnets=[subnet.id],
                                security_group_ids=[security_group.id],
                            ))

# Create the SageMaker endpoint configuration
endpoint_config = aws.sagemaker.EndpointConfiguration("sagemaker-endpoint-config",
                                                      production_variants=[
                                                          aws.sagemaker.EndpointConfigurationProductionVariantArgs(
                                                              variant_name="default",
                                                              model_name=model.name,
                                                              initial_instance_count=1,
                                                              instance_type="ml.m4.xlarge",
                                                          ),
                                                      ])

# Deploy the SageMaker endpoint
endpoint = aws.sagemaker.Endpoint("sagemaker-endpoint",  # SageMaker names cannot contain underscores
                                  endpoint_config_name=endpoint_config.name,
                                  tags={
                                      "Name": "sagemaker_endpoint"
                                  })

# Export the generated endpoint name (the resource exposes it as `name`)
pulumi.export('sagemaker_endpoint_name', endpoint.name)

This program sets up a new VPC and subnet where the SageMaker model containers will run. It also creates a security group that controls the traffic allowed to and from those containers. It then sets up a SageMaker model that references the subnet and security group, plus an endpoint configuration that selects the model and instance type. Finally, it creates the SageMaker endpoint itself.

Remember to replace placeholders like <YOUR-SAGEMAKER-EXECUTION-ROLE-ARN>, <YOUR-MODEL-IMAGE>, and <YOUR-MODEL-DATA-URL> with your actual values.
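One way to avoid hard-coding these values is to read them from Pulumi stack configuration. The sketch below assumes three config keys (executionRoleArn, modelImage, modelDataUrl); those key names are illustrative choices, not something the program above defines.

```python
import pulumi

# Reads values from the current stack's configuration (Pulumi.<stack>.yaml).
# The key names below are hypothetical; pick whatever names suit your project.
config = pulumi.Config()

execution_role_arn = config.require("executionRoleArn")  # IAM role SageMaker assumes
model_image = config.require("modelImage")               # inference container image URI
model_data_url = config.require("modelDataUrl")          # S3 URL of the trained model artifacts

# These variables can then be passed to aws.sagemaker.Model in place of the
# "<YOUR-...>" placeholder strings.
```

Each key would be set once per stack, for example with `pulumi config set executionRoleArn <your-role-arn>`; `config.require` fails the deployment early if a key is missing.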

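Once this Pulumi program is run, it outputs the name of the SageMaker endpoint it created. You can then use this endpoint for inference from within your VPC. Note that callers reach it through the SageMaker Runtime API with IAM-signed requests; to keep that invocation traffic off the public internet, route it through an interface VPC endpoint (AWS PrivateLink) for SageMaker Runtime.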
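To make the inference step concrete, here is a minimal sketch of calling the endpoint with boto3 from a client running inside the VPC. It assumes the model container accepts and returns JSON, which depends on your container image; the helper name and the payload shape are illustrative only.

```python
import json


def invoke_endpoint(endpoint_name, payload, client=None):
    """Send a JSON payload to a SageMaker endpoint and return the decoded JSON reply."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client in tests.
        import boto3
        client = boto3.client("sagemaker-runtime")

    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumption: the container expects JSON
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a streaming object; read it fully before decoding.
    return json.loads(response["Body"].read())
```

A call might look like `invoke_endpoint(name, {"instances": [[1.0, 2.0]]})`, where `name` is the value of the `sagemaker_endpoint_name` stack output and the payload format is whatever your model expects. The caller's IAM identity needs permission for the `sagemaker:InvokeEndpoint` action.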

Let's go through the important parts of the configuration:

VPC and Subnet: These resources create a network-isolated environment for your resources. Each needs a CIDR block for IP addressing, and the subnet's CIDR block must fall within the VPC's.
Security Group: This acts like a virtual firewall that controls the traffic allowed to and from the resources attached to it, in this case the network interfaces of the SageMaker model containers.
SageMaker Model: The model pairs the inference container image with your trained model artifacts. It also carries the VPC configuration (subnets and security groups) defined earlier.
SageMaker Endpoint Configuration: This configuration specifies how SageMaker should deploy the model, including the instance type and initial instance count for each production variant.

Running this program with the Pulumi CLI (pulumi up) provisions these resources in your AWS account, giving your SageMaker endpoint's model containers an isolated network environment inside your VPC.
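If invocation traffic should also stay on the AWS network, one common approach is an interface VPC endpoint for the SageMaker Runtime service. The following is a sketch under stated assumptions rather than part of the original program: the config keys vpcId, subnetIds, and securityGroupIds are hypothetical, and when appending this to the program above you could pass vpc.id, [subnet.id], and [security_group.id] directly instead.

```python
import pulumi
import pulumi_aws as aws

# Hypothetical config keys; in the program above you could use the
# vpc, subnet, and security_group resources directly instead.
config = pulumi.Config()
vpc_id = config.require("vpcId")
subnet_ids = config.require_object("subnetIds")                 # list of subnet IDs
security_group_ids = config.require_object("securityGroupIds")  # list of security group IDs

# Region configured for the AWS provider (set via `pulumi config set aws:region ...`).
region = pulumi.Config("aws").require("region")

# Interface endpoint (AWS PrivateLink) for the SageMaker Runtime API, which
# serves InvokeEndpoint calls.
runtime_vpc_endpoint = aws.ec2.VpcEndpoint(
    "sagemaker-runtime-vpce",
    vpc_id=vpc_id,
    service_name=f"com.amazonaws.{region}.sagemaker.runtime",
    vpc_endpoint_type="Interface",
    subnet_ids=subnet_ids,
    security_group_ids=security_group_ids,
    # Resolve the regular SageMaker Runtime hostname to private IPs in the VPC.
    # Requires DNS support and DNS hostnames to be enabled on the VPC.
    private_dns_enabled=True,
)

pulumi.export("sagemaker_runtime_vpce_id", runtime_vpc_endpoint.id)
```

With private DNS enabled, clients in the VPC can keep using the default SageMaker Runtime hostname. The security groups attached to the interface endpoint need to allow inbound HTTPS (port 443) from those clients, which the allow-all-within-VPC rule in the program above would cover.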