Public Endpoints for OCI Model Deployment

Question

Pulumi · Accepted Answer

To deploy a machine learning model on Oracle Cloud Infrastructure (OCI) and expose it through a public endpoint, you will typically go through several stages: setting up your data science environment, training your model, deploying the model, and finally setting up the networking that exposes it publicly. Below, I'll detail the last two stages using Pulumi's OCI provider for Python (`pulumi_oci`).

Let's start by assuming that you have already trained a machine learning model and are ready to deploy it. We'll define two main resources for the model deployment:

1. `oci.DataScience.ModelDeployment`: This is used to create a model deployment with OCI's Data Science service. A model deployment hosts the trained model and serves predictions.

2. `oci.ApiGateway.Deployment`: The API Gateway in OCI is used to define a public endpoint that forwards incoming traffic to a backend service, in this case our model deployment. You'll need to configure a route that maps incoming requests to the model deployment's prediction endpoint.

Here's a Pulumi program in Python that demonstrates how to deploy a model and set up a public endpoint:

```python
import pulumi
import pulumi_oci as oci

# Placeholder OCIDs: replace these with the values for your own tenancy
compartment_id = "ocid1.compartment.oc1..exampleuniqueID"
project_id = "ocid1.datascienceproject.oc1..exampleuniqueID"  # Data Science project that owns the model
subnet_id = "ocid1.subnet.oc1..exampleuniqueID"  # public subnet that will host the gateway
model_id = "ocid1.datasciencemodel.oc1..exampleuniqueID"

# Free-form tags help with organizing and identifying resources.
# (Defined tags would need "namespace.key" keys from an existing tag namespace.)
freeform_tags = {"Owner": "DataScienceTeam", "Environment": "Production"}

# Create a model deployment in OCI.
# The nested settings are passed as plain dicts, which Pulumi's Python SDK
# accepts in place of the generated *Args classes.
model_deployment = oci.datascience.ModelDeployment("myModelDeployment",
    compartment_id=compartment_id,
    project_id=project_id,
    display_name="MyModelDeployment",
    model_deployment_configuration_details={
        "deployment_type": "SINGLE_MODEL",
        "model_configuration_details": {
            "model_id": model_id,
            "instance_configuration": {
                "instance_shape_name": "VM.Standard2.1",
            },
            "scaling_policy": {
                "policy_type": "FIXED_SIZE",
                "instance_count": 1,
            },
        },
    },
    freeform_tags=freeform_tags,
)

# The resource exposes the deployment's base URL; predictions are served under /predict
predict_url = model_deployment.model_deployment_url.apply(lambda url: f"{url}/predict")

# Define a public API Gateway to permit access to the model from the internet
api_gateway = oci.apigateway.Gateway("myApiGateway",
    compartment_id=compartment_id,
    display_name="MyApiGateway",
    endpoint_type="PUBLIC",
    subnet_id=subnet_id,
    freeform_tags=freeform_tags,
)

# Deploy our API Gateway configuration to expose the ML model deployment.
# You will likely have to adjust the specification based on your actual model's API.
api_deployment = oci.apigateway.Deployment("apiDeployment",
    compartment_id=compartment_id,
    display_name="apiDeployment",
    gateway_id=api_gateway.id,
    path_prefix="/model",
    specification=oci.apigateway.DeploymentSpecificationArgs(
        routes=[
            oci.apigateway.DeploymentSpecificationRouteArgs(
                methods=["POST"],
                path="/predict",
                backend=oci.apigateway.DeploymentSpecificationRouteBackendArgs(
                    type="HTTP_BACKEND",
                    url=predict_url,
                ),
            )
        ]
    ),
    freeform_tags=freeform_tags,
)

# Export the URLs for the model deployment and the API Gateway deployment
pulumi.export("model_deployment_url", model_deployment.model_deployment_url)
pulumi.export("api_gateway_url", api_deployment.endpoint)
```

In this code, we've defined three resources: `model_deployment`, `api_gateway`, and `api_deployment`. The `model_deployment` resource creates an OCI model deployment where our trained model can serve predictions. The `api_gateway` resource creates the gateway itself, and the `api_deployment` resource adds a deployment to it that routes external `POST` requests on `/model/predict` to the prediction endpoint of our model deployment.

Make sure to replace the placeholder OCIDs at the top of the program with the actual values for your OCI tenancy.

The exact details of your model's API (like URI paths and HTTP methods) will depend on how your model is configured and how you want to serve predictions. Adjust the `routes` configuration in the `api_deployment` accordingly.
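As a rough sketch of what adjusting the routes could look like, the helper below builds route entries as plain dictionaries in the same shape as the `routes` list above. The `make_route` function, the `/v1/predict` alias, and the backend URL are all hypothetical names introduced for illustration. Passing dictionaries instead of the `*Args` classes assumes your version of the provider accepts them.

```python
from typing import Any, Dict, List


def make_route(path: str, methods: List[str], backend_url: Any) -> Dict[str, Any]:
    """Build one route entry (hypothetical helper, not part of pulumi_oci)."""
    if not path.startswith("/"):
        raise ValueError("route paths must start with '/'")
    return {
        "path": path,
        "methods": [m.upper() for m in methods],
        # backend_url may be a plain string or, inside a Pulumi program, an Output
        "backend": {"type": "HTTP_BACKEND", "url": backend_url},
    }


# Placeholder URL for illustration only; in the Pulumi program this would be
# the prediction URL derived from the model deployment.
PREDICT_BACKEND = "https://modeldeployment.example.invalid/predict"

routes = [
    make_route("/predict", ["post"], PREDICT_BACKEND),
    make_route("/v1/predict", ["post"], PREDICT_BACKEND),  # hypothetical versioned alias
]
```

A list built this way could then be supplied as the `routes` value of the deployment's `specification`. Keeping route construction in one small function makes it easier to add or rename paths without repeating the backend details.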

Finally, we export the URLs for the model deployment and the API gateway so you can easily access this information after the deployment is complete.

Run this code with the `pulumi up` command after you have [set up Pulumi with your OCI account](https://www.pulumi.com/registry/packages/oci/installation-configuration/). The output will give you URLs that you can use to interact with your deployed model through the public internet.
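Once the stack is up, a client call might look roughly like the sketch below, which uses only the Python standard library. It assumes the exported `api_gateway_url` already includes the `/model` path prefix and that your model accepts a JSON body. The gateway hostname and the payload shape are placeholders, and any authentication your setup requires is not shown.

```python
import json
import urllib.request


def build_predict_request(gateway_endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for the /predict route behind the gateway deployment."""
    url = gateway_endpoint.rstrip("/") + "/predict"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint and payload; substitute the value of the
# `api_gateway_url` stack output and the input format your model expects.
request = build_predict_request(
    "https://example-gateway.invalid/model",
    {"data": [[1.0, 2.0, 3.0]]},
)

# Sending is left commented out because the placeholder host does not exist:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the URL and body before any traffic reaches the gateway.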