1. API Gateway for Model Serving Endpoints

    API Gateway is a cloud service for creating, publishing, maintaining, monitoring, and securing APIs at any scale. It routes requests from clients to backend services, manages multiple API versions, handles authentication and authorization, and controls traffic.

    Here, we are going to create an API Gateway that will act as an interface for serving machine learning model endpoints. This is a common setup for enabling machine learning models to be used by client applications through a secure and scalable API.

    For our purposes, we will use AWS as the cloud provider and build an API Gateway REST API. This REST API will consist of resources and methods to invoke our machine learning model endpoints. We will use AWS Lambda functions as our backend service, which will contain the logic to interact with our machine learning models (which could be hosted on Amazon SageMaker, for example).

    Below is a Pulumi program in Python that sets up an API Gateway REST API with a single resource and method to invoke a Lambda function which will serve as our model endpoint.

```python
import json

import pulumi
import pulumi_aws as aws

# An IAM role that the Lambda function assumes, with basic logging permissions
iam_role = aws.iam.Role("modelLambdaRole",
    assume_role_policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
        }],
    }))

aws.iam.RolePolicyAttachment("modelLambdaLogs",
    role=iam_role.name,
    policy_arn="arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole")

# Create a Lambda function that invokes our machine learning model
model_lambda = aws.lambda_.Function("modelLambda",
    runtime="python3.12",
    code=pulumi.AssetArchive({
        ".": pulumi.FileArchive("./model-serving-lambda")  # Assuming your code is in this directory
    }),
    handler="handler.main",  # Assuming your handler function is called 'main' in 'handler.py'
    role=iam_role.arn)

# Create the API Gateway REST API
rest_api = aws.apigateway.RestApi("modelServingApi",
    description="API Gateway for model serving endpoints")

# Create a resource representing our model endpoint
model_resource = aws.apigateway.Resource("modelResource",
    rest_api=rest_api.id,
    parent_id=rest_api.root_resource_id,
    path_part="model")  # The path part for the resource, e.g., /model

# Create a method for the resource. Here it's assumed to be a GET method.
model_method = aws.apigateway.Method("modelMethod",
    http_method="GET",
    authorization="NONE",  # No authorization; configure this based on your requirements
    resource_id=model_resource.id,
    rest_api=rest_api.id)

# Create an integration to connect the GET method to the Lambda function
model_integration = aws.apigateway.Integration("modelIntegration",
    http_method=model_method.http_method,
    resource_id=model_resource.id,
    rest_api=rest_api.id,
    integration_http_method="POST",  # Lambda functions are always invoked via POST
    type="AWS_PROXY",  # Lambda proxy integration: the function shapes the HTTP response itself
    uri=model_lambda.invoke_arn)  # The ARN used to invoke the Lambda function

# Grant API Gateway permission to invoke the Lambda function
aws.lambda_.Permission("modelLambdaPermission",
    action="lambda:InvokeFunction",
    function=model_lambda.name,
    principal="apigateway.amazonaws.com",
    source_arn=rest_api.execution_arn.apply(lambda arn: f"{arn}/*/*"))

# Deploy the API Gateway to a stage named 'prod'
deployment = aws.apigateway.Deployment("modelDeployment",
    rest_api=rest_api.id,
    description="Deployment for the model serving API",
    stage_name="prod",
    # Ensure the integration exists before the deployment is created
    opts=pulumi.ResourceOptions(depends_on=[model_integration]))

# Export the invoke URL to access our model serving endpoint
pulumi.export("invoke_url",
    pulumi.Output.concat(deployment.invoke_url, "/", model_resource.path_part))
```

    Here's what each part of the code does:

    1. We create an AWS Lambda function (model_lambda) that will handle requests to our model endpoint. The function's code is in the model-serving-lambda directory, and its handler is main in handler.py.

    2. We then define an API Gateway REST API (rest_api) to expose our Lambda function as an HTTP endpoint.

    3. A resource (model_resource) is created to represent our model endpoint, with its path set to /model.

    4. We add an HTTP method (model_method) to the model resource, in this case a GET method with no authorizer attached. Because we use a Lambda proxy integration, the Lambda function itself returns the HTTP status code, headers, and body of the response.

    5. An integration (model_integration) connects the GET method to the Lambda function, allowing API Gateway to invoke the Lambda function when the method is called.

    6. Finally, we deploy our API Gateway with a Deployment resource (deployment) to a stage named prod, and export the invoke URL, which the clients will use to interact with our model endpoint.
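
    The Lambda code itself lives outside the Pulumi program, in ./model-serving-lambda/handler.py. Below is a sketch of what it might contain, assuming the model is hosted on a SageMaker endpoint whose name arrives via a MODEL_ENDPOINT environment variable (both are assumptions, not part of the program above). With AWS_PROXY, API Gateway hands the whole HTTP request to the function as a JSON event and expects the function to return the status code, headers, and body itself.

```python
import json
import os


def make_response(status_code, body):
    """Build a response in the shape the Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def main(event, context):
    """Handle a proxied API Gateway request by invoking a SageMaker endpoint."""
    import boto3  # imported lazily; bundled with the Lambda Python runtime

    # MODEL_ENDPOINT and the fallback name are assumptions for this sketch
    endpoint_name = os.environ.get("MODEL_ENDPOINT", "my-model-endpoint")
    runtime = boto3.client("sagemaker-runtime")
    try:
        result = runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=event.get("body") or "{}",
        )
        prediction = json.loads(result["Body"].read())
        return make_response(200, {"prediction": prediction})
    except Exception as exc:  # surface model errors to the client as a 500
        return make_response(500, {"error": str(exc)})
```

    Returning the dict produced by make_response is what lets API Gateway translate the Lambda result back into a proper HTTP response for the caller.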

    The above Pulumi program creates an API Gateway for model serving endpoints that clients can invoke to receive predictions from your deployed machine learning models.
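
    Once the stack is up, any HTTP client can call the exported URL. Here is a minimal sketch using only Python's standard library; the URL below is a placeholder for your stack's actual invoke_url output, and the features query parameter is an assumption about what the Lambda handler accepts.

```python
import json
import urllib.request


def build_model_url(invoke_url, path_part):
    """Join the stage invoke URL and the resource path, avoiding duplicate slashes."""
    return invoke_url.rstrip("/") + "/" + path_part.lstrip("/")


def call_model(invoke_url, features):
    """GET the /model endpoint with a comma-separated feature list."""
    url = build_model_url(invoke_url, "model") + "?features=" + ",".join(map(str, features))
    with urllib.request.urlopen(url) as resp:  # network call; requires the deployed stack
        return json.loads(resp.read())


# Placeholder for the URL printed by `pulumi stack output invoke_url`
invoke_url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod"
# call_model(invoke_url, [1.5, 2.0])  # uncomment once the stack is deployed
```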