1. Event-Driven Autoscaling for ML Inference with Azure Event Grid


    To create an event-driven autoscaling solution for machine learning (ML) inference on Azure using Pulumi, we need to provision several resources. The core idea is to use Azure Event Grid to subscribe to events that trigger an Azure Function (or another compute service) to scale our ML inference service up or down.

    Here's what we need to create:

    1. Azure Event Grid Topic: A custom topic to which events are sent.
    2. Azure Function: A serverless function that gets triggered by the Event Grid Topic and performs the scaling.
    3. Azure Machine Learning Workspace: The workspace that hosts our ML models and inference services.
    4. Event Grid Subscription: A subscription that filters specific events and directs them to the Azure Function.
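
    The Azure Function in step 2 holds the actual scaling logic. As a hedged sketch (the `pendingRequests` key, thresholds, and capacity figure below are illustrative assumptions, not part of the infrastructure program), the core decision can be kept as a pure function that maps an event's payload to a target replica count:

```python
# Illustrative scaling policy -- the key name, thresholds, and per-instance
# capacity are assumptions you would tune for your own workload.
MIN_INSTANCES = 1
MAX_INSTANCES = 10
REQUESTS_PER_INSTANCE = 100  # approximate capacity of one inference replica


def target_instance_count(event_data: dict) -> int:
    """Map a custom Event Grid event's payload to a desired replica count.

    Expects a payload like {"pendingRequests": 350}; the key name is an
    assumption about how you structure your custom events.
    """
    pending = int(event_data.get("pendingRequests", 0))
    # Ceiling division: one instance per full batch of pending requests.
    desired = -(-pending // REQUESTS_PER_INSTANCE) if pending else MIN_INSTANCES
    return max(MIN_INSTANCES, min(MAX_INSTANCES, desired))


def handle_event(event: dict) -> int:
    """Simplified handler shape for an Event Grid-triggered function.

    In a real Azure Function you would receive an
    azure.functions.EventGridEvent and then call the Azure ML SDK to update
    the online deployment's instance count to the returned value.
    """
    count = target_instance_count(event.get("data", {}))
    print(f"Scaling inference deployment to {count} instance(s)")
    return count
```

    Keeping the policy separate from the Azure SDK calls makes it easy to unit-test the scaling decision without any cloud credentials.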

    Below is a Pulumi Python program to set up event-driven autoscaling for ML inference:

```python
import pulumi
import pulumi_azure_native as azure_native

# Note: Some details such as resource group, function app details, machine
# learning workspace configuration and specific event filtering are assumed
# to be predefined or use sensible defaults for brevity.

# Resource Group for all of our resources
resource_group = azure_native.resources.ResourceGroup("resource_group")

# Create an Event Grid Topic
event_grid_topic = azure_native.eventgrid.Topic(
    "eventGridTopic",
    resource_group_name=resource_group.name,
    location=resource_group.location,
)

# Deploy an Azure Function which will handle the scale events.
# It's assumed that the function code is already packaged into a blob, and the URL is known.
# The code should contain logic to authenticate to the ML workspace and update the scale settings.
function_app_service_plan = azure_native.web.AppServicePlan(
    "functionAppServicePlan",
    resource_group_name=resource_group.name,
    location=resource_group.location,
    sku=azure_native.web.SkuDescriptionArgs(
        name="Y1",
        tier="Dynamic",
    ),
)

function_app_storage_account = azure_native.storage.StorageAccount(
    "functionAppStorageAccount",
    resource_group_name=resource_group.name,
    location=resource_group.location,
    sku=azure_native.storage.SkuArgs(
        name=azure_native.storage.SkuName.STANDARD_LRS,
    ),
    kind=azure_native.storage.Kind.STORAGE_V2,
)

function_app = azure_native.web.WebApp(
    "functionApp",
    resource_group_name=resource_group.name,
    location=resource_group.location,
    server_farm_id=function_app_service_plan.id,
    kind="functionapp",
    site_config=azure_native.web.SiteConfigArgs(
        app_settings=[
            azure_native.web.NameValuePairArgs(
                name="FUNCTIONS_EXTENSION_VERSION", value="~3"
            ),
            azure_native.web.NameValuePairArgs(
                name="WEBSITE_RUN_FROM_PACKAGE", value="<BLOB_URL>"
            ),
            # In practice the function app also needs an AzureWebJobsStorage
            # connection string pointing at the storage account above.
        ],
    ),
)

# Machine Learning Workspace where the inference service is hosted
ml_workspace = azure_native.machinelearningservices.Workspace(
    "mlWorkspace",
    resource_group_name=resource_group.name,
    location=resource_group.location,
)

# Event Grid Subscription to subscribe to our custom topic and trigger the Azure Function
event_subscription = azure_native.eventgrid.EventSubscription(
    "eventSubscription",
    scope=event_grid_topic.id,
    destination=azure_native.eventgrid.WebHookEventSubscriptionDestinationArgs(
        endpoint_type="WebHook",
        endpoint_url=function_app.default_host_name.apply(
            lambda endpoint: f"https://{endpoint}/runtime/webhooks/EventGrid"
            f"?functionName=<FUNCTION_NAME>&code=<FUNCTION_KEY>"
        ),
    ),
)

# Export the endpoint URL of the Event Grid Topic
pulumi.export("eventGridTopicEndpoint", event_grid_topic.endpoint)

# Export the Function App default host name
pulumi.export("functionAppHostName", function_app.default_host_name)
```
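
    To exercise the pipeline, a producer posts events to the exported topic endpoint. Below is a minimal, hedged sketch of building a payload in the Event Grid event schema; the `subject`, `eventType`, and `pendingRequests` values are illustrative assumptions, and Event Grid only requires non-empty strings you can filter on in a subscription:

```python
import json
import uuid
from datetime import datetime, timezone


def make_scale_event(pending_requests: int) -> dict:
    """Build one event in the Event Grid event schema.

    The eventType/subject strings below are placeholders; pick values that
    your Event Grid subscription filters can match on.
    """
    return {
        "id": str(uuid.uuid4()),
        "subject": "ml/inference/load",
        "eventType": "Contoso.ML.ScaleSignal",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "dataVersion": "1.0",
        "data": {"pendingRequests": pending_requests},
    }


# Event Grid expects a JSON array of events in the request body.
body = json.dumps([make_scale_event(350)])
# POST `body` to the topic endpoint with an "aeg-sas-key" header,
# e.g. via requests.post(endpoint, data=body, headers=...).
```

    The same payload shape works with the `azure-eventgrid` SDK's publisher client if you prefer it over raw HTTP.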