Gradual Rollout of Updated Language Models on Azure App Service Slots

Pulumi · Accepted Answer

When managing web applications in Azure App Service, "deployment slots" are a powerful feature that allows you to deploy the application into a "staging" environment before it is swapped into production. This staging environment is practically a clone of your production setting and can be used to validate application changes with a subset of live traffic or your own tests. This concept is especially beneficial when you want to perform a gradual rollout of new features, ensuring minimal impact on your production users if something goes wrong.

To achieve a gradual rollout of updated language models on Azure App Service, the general steps are:

1. Deploy your updated language model to a staging slot.
2. Test the staging deployment with the expected load, ensuring that everything functions as expected.
3. Swap the staging slot with the production slot, so that users begin to use the new model.
4. Monitor the performance of the language model in production.
5. Roll back by swapping slots again if necessary.
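The steps above can be sketched as a small driver function. This is only an illustrative skeleton: the function name `gradual_rollout` and its four callback parameters are hypothetical, and each callback stands in for whatever deployment, testing, swap, and monitoring tooling you actually use.

```python
from typing import Callable


def gradual_rollout(
    deploy_to_staging: Callable[[], None],
    staging_is_valid: Callable[[], bool],
    swap_slots: Callable[[], None],
    production_is_healthy: Callable[[], bool],
) -> str:
    """Run the rollout steps in order and report the outcome.

    All four callables are hypothetical hooks supplied by the caller.
    """
    # Step 1: push the updated model to the staging slot.
    deploy_to_staging()

    # Step 2: validate staging; stop before touching production on failure.
    if not staging_is_valid():
        return "aborted"

    # Step 3: swap staging into production.
    swap_slots()

    # Step 4: monitor production after the swap.
    if production_is_healthy():
        return "released"

    # Step 5: roll back by swapping the slots again.
    swap_slots()
    return "rolled_back"
```

Keeping the steps behind callbacks means the same control flow can be exercised with fakes in a test and with real Azure CLI calls in a pipeline.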

Below is a program using Pulumi's Python SDK that sets up an Azure App Service with two slots: one for production and one for staging. We start by creating an App Service plan, which defines the underlying VM that hosts our web applications. Then, we create the web app associated with this plan and the staging slot.

```python
import pulumi
import pulumi_azure_native as azure_native

# Create an Azure Resource Group
resource_group = azure_native.resources.ResourceGroup('my-resource-group')

# Create a Linux App Service Plan
app_service_plan = azure_native.web.AppServicePlan('my-app-service-plan',
    resource_group_name=resource_group.name,
    kind='linux',  # Matches reserved=True below; use 'app' with reserved=False for Windows
    location=resource_group.location,
    sku=azure_native.web.SkuDescriptionArgs(
        name='S1',  # Deployment slots require the Standard tier or higher; Basic does not support them
        tier='Standard',
        size='S1',
        family='S',
        capacity=1
    ),
    reserved=True  # True means a Linux App Service Plan. False would mean Windows.
)

# Create a Web App within the App Service Plan
app_service = azure_native.web.WebApp('my-web-app',
    resource_group_name=resource_group.name,
    location=resource_group.location,
    server_farm_id=app_service_plan.id
)

# Create a Deployment Slot called "staging" for the Web App
staging_slot = azure_native.web.WebAppSlot('my-staging-slot',
    name=app_service.name,
    resource_group_name=resource_group.name,
    location=resource_group.location,
    slot="staging",  # The name of the slot, 'staging' in this case
    server_farm_id=app_service_plan.id
)

# Swap actions are not directly represented as Pulumi resources,
# so the swap action would typically be performed outside of this deployment script,
# possibly as a manual step in the Azure Portal or using Azure CLI or PowerShell commands.

# Export the URLs of both slots (default_host_name is a bare hostname, so add the scheme)
pulumi.export('production_url', pulumi.Output.concat('https://', app_service.default_host_name))
pulumi.export('staging_url', pulumi.Output.concat('https://', staging_slot.default_host_name))
```

This program sets up the infrastructure for the production and staging slots. A swap is an operation rather than a resource, so Pulumi does not model it directly; perform it manually or automate it with a script that runs Azure CLI commands.

To swap staging into production, you can use the Azure CLI. Pulumi appends a random suffix to resource names by default, so replace the placeholder resource group and app names below with the actual names from your stack:

```sh
az webapp deployment slot swap --resource-group my-resource-group --name my-web-app --slot staging --target-slot production
```

Run this command once you are confident that the staging deployment behaves as expected, usually after automated testing or manual validation. To roll back, run the same swap again: the previous production build now sits in the staging slot, so a second swap restores it.
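If you want to drive the swap from a script rather than typing it by hand, a minimal Python sketch might look like the following. The helper names `build_swap_command` and `swap` are hypothetical, and the resource group and app names are placeholders; the sketch assumes the `az` CLI is installed and logged in. It defaults to a dry run so nothing is executed unless you opt in.

```python
import subprocess
from typing import List


def build_swap_command(resource_group: str, app_name: str,
                       slot: str = "staging",
                       target_slot: str = "production") -> List[str]:
    """Build the argument list for the `az webapp deployment slot swap` command."""
    return [
        "az", "webapp", "deployment", "slot", "swap",
        "--resource-group", resource_group,
        "--name", app_name,
        "--slot", slot,
        "--target-slot", target_slot,
    ]


def swap(resource_group: str, app_name: str, dry_run: bool = True) -> List[str]:
    """Swap staging and production; with dry_run=True only return the command."""
    cmd = build_swap_command(resource_group, app_name)
    if not dry_run:
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder names; substitute the real names from your Pulumi stack.
    print(" ".join(swap("my-resource-group", "my-web-app")))
```

Calling `swap(...)` a second time with `dry_run=False` performs the rollback, since the two slots simply trade places again.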

A swap moves all production traffic to the new deployment at once, after Azure warms up the instances in the source slot. For a truly gradual rollout, use App Service's traffic routing to send a percentage of production traffic to the staging slot before the swap, and raise that percentage as confidence in the new model grows. If any issues arise after the swap, swap back to return traffic to the last known good deployment.
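As a rough sketch of percentage-based routing, the helpers below build `az webapp traffic-routing set` commands for a stepwise ramp. The function names and the ramp percentages (10, 25, 50) are illustrative assumptions, not recommendations, and the resource names are placeholders. Confirm the flags against `az webapp traffic-routing --help` for your CLI version before relying on them.

```python
from typing import Iterable, List


def build_routing_command(resource_group: str, app_name: str,
                          slot: str, percent: int) -> List[str]:
    """Build a command that routes `percent` of production traffic to `slot`."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        "az", "webapp", "traffic-routing", "set",
        "--resource-group", resource_group,
        "--name", app_name,
        "--distribution", f"{slot}={percent}",
    ]


def ramp_commands(resource_group: str, app_name: str,
                  slot: str = "staging",
                  steps: Iterable[int] = (10, 25, 50)) -> List[List[str]]:
    """Return one routing command per ramp step, in increasing order of exposure."""
    return [build_routing_command(resource_group, app_name, slot, p) for p in steps]


if __name__ == "__main__":
    # Placeholder names; print the commands instead of running them.
    for cmd in ramp_commands("my-resource-group", "my-web-app"):
        print(" ".join(cmd))
```

Between steps you would check your monitoring before moving on. Once the ramp looks healthy, complete the rollout with a slot swap and reset the routing rules (for example with `az webapp traffic-routing clear`) so production traffic is no longer split.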