Machine Learning Model Serving via AWS API Gateway

Pulumi · Accepted Answer

To serve a machine learning model via AWS API Gateway, you'll need to set up a few AWS services:

1. **Amazon S3**: You will use this to store your machine learning model.
2. **AWS Lambda**: This will contain the logic to load the model from S3 and perform inference.
3. **Amazon API Gateway**: It will act as the front door to your Lambda function, allowing clients to invoke your model via HTTP.

Here's how the process works:

- API Gateway exposes an HTTP endpoint that clients can invoke.
- When a client hits the API Gateway endpoint, API Gateway triggers a Lambda function.
- The Lambda function loads the machine learning model from an S3 bucket and performs inference.
- Finally, the inference results are sent back to the client through API Gateway.

Let's break down the Pulumi program in Python to set this up:

- **AWS Lambda Function**: Your Lambda will include code to load the machine learning model and make predictions. For this Pulumi program, we will assume that you already have a Python script ready that can be packed into a ZIP file.
- **API Gateway**: We will create an HTTP API (API Gateway v2) with a route that triggers the Lambda function.
- **IAM Role**: We will create an IAM role that lets the Lambda function read the model from S3 and write logs to CloudWatch.

Here is the Pulumi program to set up Machine Learning Model Serving via AWS API Gateway:

```python
import pulumi
import pulumi_aws as aws

# Assume you have already zipped your Lambda function code into `lambda_function.zip`
# and have uploaded your trained model to an S3 bucket.

# Create an IAM role that will be used by your Lambda Function
lambda_role = aws.iam.Role("lambdaRole",
    assume_role_policy="""{
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            }
        }]
    }"""
)

# Attach a policy to the role that allows Lambda to access S3 resources and CloudWatch logs
s3_access_policy = aws.iam.RolePolicy("s3AccessPolicy",
    role=lambda_role.id,
    policy="""{
        "Version": "2012-10-17",
        "Statement": [{
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::name-of-your-model-bucket/*"],
            "Effect": "Allow"
        },{
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:*",
            "Effect": "Allow"
        }]
    }"""
)

# Create a Lambda function
lambda_function = aws.lambda_.Function("modelServeLambdaFunction",
    code=pulumi.FileArchive("./lambda_function.zip"),
    role=lambda_role.arn,
    handler="lambda_function.handler", # Replace `lambda_function.handler` with the appropriate handler
    runtime="python3.12", # Match the runtime to your Lambda function's language and version
    timeout=30, # The default of 3 seconds is often too short to load a model; adjust as needed
    memory_size=512 # Illustrative value; size this to fit your model
)

# Create an API Gateway HTTP API (API Gateway v2)
api_gateway = aws.apigatewayv2.Api("apiGateway",
    protocol_type="HTTP",
    route_selection_expression="$request.method $request.path"
)

# Create an Integration for your Lambda to connect it with API Gateway
integration = aws.apigatewayv2.Integration("lambdaIntegration",
    api_id=api_gateway.id,
    integration_type="AWS_PROXY",
    integration_uri=lambda_function.invoke_arn,
    payload_format_version="2.0"
)

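# Define the route for your API Gateway
route = aws.apigatewayv2.Route("apiRoute",
    api_id=api_gateway.id,
    route_key="POST /predict", # Adjust the method and route as needed
    target=pulumi.Output.concat("integrations/", integration.id)
)

# Allow API Gateway to invoke the Lambda function; without this, requests fail with a 500 error
invoke_permission = aws.lambda_.Permission("apiGatewayInvokePermission",
    action="lambda:InvokeFunction",
    function=lambda_function.name,
    principal="apigateway.amazonaws.com",
    source_arn=pulumi.Output.concat(api_gateway.execution_arn, "/*/*")
)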

# Deploy your API Gateway
deployment = aws.apigatewayv2.Deployment("apiDeployment",
    api_id=api_gateway.id,
    # NOTE: A deployment snapshots the API's routes, so it must be created after the route exists
    opts=pulumi.ResourceOptions(depends_on=[route])
)

# Create a Stage, which is required to make the deployment accessible
stage = aws.apigatewayv2.Stage("apiStage",
    api_id=api_gateway.id,
    deployment_id=deployment.id,
    name="prod" # You can customize the stage name
)

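# Export the invoke URL of the HTTP API stage; append the route path (e.g. /predict) when calling it.
# Output.concat is used because stage.name is an Output and cannot be interpolated in an f-string.
pulumi.export('api_url', pulumi.Output.concat(api_gateway.api_endpoint, "/", stage.name))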
```

In this program:

- We start by creating an IAM role that grants our Lambda function permissions to access S3 and create logs in CloudWatch.
- Next, we create a Lambda function, specifying the IAM role created earlier, the handler, and the runtime.
- Then, we set up API Gateway as an HTTP API and create an integration between API Gateway and Lambda.
- We define a route so that any POST request to `/predict` is handled by our integration.
- We then deploy our API Gateway and specify a stage called `prod` for the deployment.
- Finally, we export the URL of the API Gateway endpoint so we can use it to make predictions.

Running `pulumi up` on this program creates the IAM role, the Lambda function, and the API Gateway resources. Remember to replace `"name-of-your-model-bucket"` with the actual name of your S3 bucket, and `"lambda_function.handler"` with the correct handler in your Lambda code. Also, adjust the `route_key` for the API Gateway to match the desired endpoint.
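One way to avoid editing the bucket name by hand inside the policy string is to build the policy document from a variable. The helper below is a sketch using only the standard library; `build_lambda_policy` is a hypothetical name, and the statements mirror the inline policy in the program above.

```python
import json


def build_lambda_policy(bucket_name):
    """Return the Lambda role policy as a JSON string for the given model bucket."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only access to objects in the model bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
            {
                # Permissions the function needs to write its CloudWatch logs.
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    })


# In the Pulumi program, the result would be passed as the `policy` argument of
# aws.iam.RolePolicy, with the bucket name read from stack configuration, e.g.
# pulumi.Config().require("modelBucket") (the key "modelBucket" is an assumption).
print(build_lambda_policy("name-of-your-model-bucket"))
```

Generating the JSON with `json.dumps` also rules out syntax slips that are easy to make when editing a multi-line string literal.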

By setting up the API like this, clients can make HTTP POST requests to the API Gateway endpoint containing the input data for predictions, and the Lambda function will process those requests and return the results.
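A client call might look like the sketch below, which uses only the Python standard library. It assumes the exported `api_url` already includes the stage (for example `https://<api-id>.execute-api.<region>.amazonaws.com/prod`), that the route is `POST /predict`, and that the Lambda expects a JSON body of the form `{"features": [...]}`; the body shape and both function names are illustrative assumptions, not something the program defines.

```python
import json
import urllib.request


def build_predict_request(api_url, features):
    """Build a POST request for the /predict route without sending it."""
    url = api_url.rstrip("/") + "/predict"
    data = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(api_url, features, timeout=30):
    """Send the request and decode the JSON response from the Lambda function."""
    request = build_predict_request(api_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

After deployment, `pulumi stack output api_url` prints the base URL to pass as `api_url`.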