1. ML Batch Processing with Azure Autoscale Settings


    To set up ML batch processing with Azure autoscale settings, we will use a few Azure services and the corresponding Pulumi resources. Specifically, we need a batch processing service that can run machine learning jobs and an autoscale setting that scales compute resources with the workload.

    Azure Machine Learning supports batch processing, and we can use it with the Azure Batch service, which allows for large-scale parallel and high-performance computing batch jobs. To automatically scale our processing capabilities, we'll configure the Autoscale settings provided by Azure Monitor.

    Here's an overview of the Pulumi resources we'll use:

    1. azure-native.batch.BatchAccount: Creates an Azure Batch account, which provides a managed service for Batch processing and job scheduling.
    2. azure-native.insights.AutoscaleSetting: Configures the autoscaling settings for a specified resource. This will ensure that your compute resources are scaled automatically based on predefined rules and load patterns.
    3. azure-native.machinelearningservices.BatchEndpoint: Sets up a batch endpoint for the machine learning services, which allows you to submit batch inference jobs.
    4. azure-native.machinelearningservices.BatchDeployment: Allows for the deployment of models and the setup of compute to process batch inference jobs.

    Below is the Pulumi program written in Python that creates an Azure Batch account for ML batch processing with autoscale settings configured. The program assumes you have already set up and authenticated Pulumi with Azure.

```python
import pulumi
import pulumi_azure_native.batch as azure_batch
import pulumi_azure_native.insights as azure_insights
import pulumi_azure_native.machinelearningservices as azure_mls

# The name of your Azure resource group and target region
resource_group = 'myResourceGroupName'
location = 'East US'

# Create an Azure Batch account
batch_account = azure_batch.BatchAccount("batchAccount",
    resource_group_name=resource_group,
    location=location)

# Set up autoscale settings targeting the Batch account
autoscale_setting = azure_insights.AutoscaleSetting("autoscaleSetting",
    resource_group_name=resource_group,
    target_resource_uri=batch_account.id,
    location=location,
    profiles=[{
        "name": "AutoscaleProfile",
        # Minimum, maximum, and default number of compute nodes
        "capacity": {"minimum": "1", "maximum": "10", "default": "1"},
        "rules": [{
            # Scale up when average CPU usage exceeds 75% over a 5-minute window
            "metricTrigger": {
                "metricName": "CPUUsage",
                "metricNamespace": "Microsoft.Batch/batchAccounts",
                "metricResourceUri": batch_account.id,
                "timeGrain": "PT1M",
                "statistic": "Average",
                "timeWindow": "PT5M",
                "timeAggregation": "Average",
                "operator": "GreaterThan",
                "threshold": 75,
            },
            # Add one node, then wait one minute before scaling again
            "scaleAction": {
                "direction": "Increase",
                "type": "ChangeCount",
                "value": "1",
                "cooldown": "PT1M",
            },
        }],
    }])

# Create a Machine Learning workspace to host the batch endpoint
ml_workspace = azure_mls.Workspace("mlWorkspace",
    resource_group_name=resource_group,
    location=location,
    sku={"name": "Basic"},  # the workspace SKU is an object, not a plain string
    identity={"type": "SystemAssigned"})  # a managed identity is needed for workspace operations

# Create a batch endpoint for submitting ML batch inference jobs
batch_endpoint = azure_mls.BatchEndpoint("batchEndpoint",
    resource_group_name=resource_group,
    workspace_name=ml_workspace.name,
    location=location)

# Create a batch deployment under the batch endpoint.
# This requires a pre-existing model and compute target.
batch_deployment = azure_mls.BatchDeployment("batchDeployment",
    resource_group_name=resource_group,
    workspace_name=ml_workspace.name,
    endpoint_name=batch_endpoint.name,
    location=location,
    properties={
        # Here you would typically specify the model, compute target,
        # and other deployment properties for your workload.
    })

# Export the Batch account name and the batch endpoint name
pulumi.export('batch_account_name', batch_account.name)
pulumi.export('batch_endpoint_name', batch_endpoint.name)
```

    This program sets up a Batch account and configures autoscaling based on CPU usage. When average CPU usage exceeds 75% over a 5-minute window, the rule increases the node count by 1 and then waits through a 1-minute cooldown before it can scale again. The autoscale profile also defines the minimum (1), maximum (10), and default (1) number of compute nodes.
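    The evaluation logic of that rule can be sketched in plain Python, with no Azure calls. This is purely illustrative; the function name and signature are our own, but the threshold, window, and step values mirror the rule in the program above:

```python
def evaluate_scale_rule(cpu_samples, threshold=75.0, current_nodes=1, max_nodes=10, step=1):
    """Return the new node count after applying the rule to one time window."""
    average = sum(cpu_samples) / len(cpu_samples)    # "statistic": "Average" over the window
    if average > threshold:                          # "operator": "GreaterThan", "threshold": 75
        return min(current_nodes + step, max_nodes)  # "ChangeCount" by 1, capped at profile maximum
    return current_nodes

# Five one-minute samples (a PT5M window at PT1M grain):
print(evaluate_scale_rule([80, 82, 78, 90, 76], current_nodes=1))  # 2 — sustained load above 75%
print(evaluate_scale_rule([60, 70, 72, 68, 74], current_nodes=1))  # 1 — average below threshold
```

    In the real service, Azure Monitor performs this evaluation and additionally enforces the cooldown period between consecutive scale actions.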

    Please note that for the BatchDeployment, you would typically specify the model and the compute target, which are specific to your ML workload. These details would depend on the actual ML model you plan to deploy and the expected workload characteristics. For example, you might define a Kubernetes Service for a compute target or a registered model within your Azure Machine Learning workspace. These specifications have been omitted in the example for brevity.
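    As a rough illustration of what those omitted details might look like, here is one hypothetical shape for the BatchDeployment `properties` argument. Every value below is a placeholder, not a tested configuration: the model asset ID, compute cluster name, and settings would come from your own Azure Machine Learning workspace, and the exact property names should be checked against the Pulumi azure-native API reference for your provider version:

```python
# Hypothetical sketch only — all identifiers below are placeholders.
batch_deployment_properties = {
    "model": {
        "referenceType": "Id",
        # Placeholder ARM ID of a model registered in the workspace:
        "assetId": "/subscriptions/<sub-id>/resourceGroups/myResourceGroupName"
                   "/providers/Microsoft.MachineLearningServices/workspaces"
                   "/mlWorkspace/models/my-model/versions/1",
    },
    "compute": "my-batch-cluster",   # name of a compute cluster in the workspace (placeholder)
    "maxConcurrencyPerInstance": 2,  # parallel scoring processes per node
    "errorThreshold": 10,            # abort the job after this many failed items
    "outputAction": "AppendRow",     # append all predictions to a single output file
}
```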

    Running this Pulumi program provisions the defined cloud resources in Azure. You can then submit batch processing jobs for your machine learning workload, confident that compute will scale according to the configured autoscale settings.