1. Deployment of AI Model Serving with Azure Cache for Redis

    To deploy an AI model serving infrastructure with Azure Cache for Redis using Pulumi, you will need to create a few components in Azure:

    1. An Azure Container Instance (ACI) to serve the AI model using a Docker container.
    2. An Azure Cache for Redis instance to store frequently accessed data with low latency.
    3. An Azure Storage Account to store any persistent data needed by your AI model.

    The following Pulumi program demonstrates how to set up these components. This program does the following:

    • Sets up a new resource group for hosting all the resources.
    • Creates an Azure Cache for Redis instance to be used as a fast in-memory data store.
    • Deploys an Azure Container Instance with your AI model, which can leverage Azure Cache for Redis.

    Make sure you have a Docker image with your AI model ready to be deployed. This image must be pullable by Azure Container Instances, for example from Azure Container Registry (ACR) or another registry it can reach. If the registry is private, the container group will also need pull credentials, as sketched below.
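    For a private registry such as ACR, credentials are passed through the container group's image_registry_credentials argument. A minimal sketch, where the registry server and username are placeholders and the password comes from Pulumi secret config:

    import pulumi
    from pulumi_azure_native import containerinstance

    config = pulumi.Config()
    # Set beforehand with: pulumi config set --secret registryPassword <value>
    registry_password = config.require_secret('registryPassword')

    # Pass this list as the image_registry_credentials argument of the
    # ContainerGroup in the program below (server/username are placeholders)
    registry_credentials = [containerinstance.ImageRegistryCredentialArgs(
        server='myregistry.azurecr.io',
        username='myregistry',
        password=registry_password,
    )]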

    import pulumi
    from pulumi_azure_native import cache, containerinstance, resources, storage

    # Create an Azure Resource Group to hold all of the resources
    resource_group = resources.ResourceGroup('ai_model_serving_group')

    # Create an Azure Cache for Redis instance
    redis_cache = cache.Redis(
        'redisCache',
        resource_group_name=resource_group.name,
        location=resource_group.location,
        sku=cache.SkuArgs(
            name='Basic',  # Choose the appropriate tier (Basic, Standard, Premium)
            family='C',    # The SKU family to use ('C' for Basic/Standard, 'P' for Premium)
            capacity=0     # The size of the Redis cache to deploy (0 is the smallest, C0)
        ),
        enable_non_ssl_port=False,
        minimum_tls_version='1.2'
    )

    # Create an Azure Storage Account for persistent data (if needed)
    storage_account = storage.StorageAccount(
        'storageaccount',  # storage account names must be lowercase alphanumeric
        resource_group_name=resource_group.name,
        location=resource_group.location,
        sku=storage.SkuArgs(
            name=storage.SkuName.STANDARD_LRS
        ),
        kind=storage.Kind.STORAGE_V2
    )

    # Deploy an Azure Container Instance (ACI) to serve the AI model
    container_group = containerinstance.ContainerGroup(
        'containerGroup',
        resource_group_name=resource_group.name,
        location=resource_group.location,
        os_type=containerinstance.OperatingSystemTypes.LINUX,
        containers=[containerinstance.ContainerArgs(
            name='ai-model-container',
            image='your-docker-image',  # Replace with your Docker image name
            resources=containerinstance.ResourceRequirementsArgs(
                requests=containerinstance.ResourceRequestsArgs(
                    cpu=1.0,
                    memory_in_gb=1.5,
                ),
            ),
            ports=[containerinstance.ContainerPortArgs(port=80)],
            environment_variables=[
                containerinstance.EnvironmentVariableArgs(
                    name='REDIS_HOST',
                    # host_name is already fully qualified, e.g. <name>.redis.cache.windows.net
                    value=redis_cache.host_name
                ),
                containerinstance.EnvironmentVariableArgs(
                    name='REDIS_PASSWORD',
                    # secure_value keeps the key out of logs and the container group's properties
                    secure_value=redis_cache.access_keys.apply(lambda keys: keys.primary_key)
                )
            ],
        )],
        # Expose the container group on a public IP so the model endpoint is reachable
        ip_address=containerinstance.IpAddressArgs(
            type='Public',
            ports=[containerinstance.PortArgs(port=80, protocol='TCP')]
        )
    )

    # Export the container group name, its public IP, and the Redis connection details
    pulumi.export('container_group_name', container_group.name)
    pulumi.export('container_ip_address', container_group.ip_address.apply(lambda ip: ip.ip if ip else None))
    pulumi.export('redis_cache_host_name', redis_cache.host_name)
    pulumi.export('redis_cache_primary_key', redis_cache.access_keys.apply(lambda keys: keys.primary_key))

    Replace 'your-docker-image' with the fully qualified name of the image that serves your AI model (for example, myregistry.azurecr.io/ai-model:latest).

    In this program, environment variables are passed to the container instance so that it knows how to reach the Redis cache. REDIS_HOST is set from the cache's host_name output, which is already a fully qualified domain name (for example, <name>.redis.cache.windows.net), and REDIS_PASSWORD is supplied as a secure value taken from the cache's access keys so the key is not exposed in logs or in the container group's properties.
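    Inside the container, your serving code can pick these variables up and talk to the cache. Below is a minimal sketch using the redis-py client with a cache-aside pattern; the redis package, the predict() function, and the key scheme are illustrative assumptions, not part of the Pulumi program:

    import json
    import os
    import redis

    # Connect using the environment variables injected by the container group.
    # Azure Cache for Redis requires TLS on port 6380 when the non-SSL port is disabled.
    r = redis.Redis(
        host=os.environ['REDIS_HOST'],
        port=6380,
        password=os.environ['REDIS_PASSWORD'],
        ssl=True,
    )

    def cached_predict(features):
        """Cache-aside lookup: return a cached prediction or compute and store one."""
        key = 'prediction:' + json.dumps(features, sort_keys=True)
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        result = predict(features)              # hypothetical model inference call
        r.set(key, json.dumps(result), ex=300)  # cache the result for 5 minutes
        return result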

    The setup above assumes the Basic tier (C0) for Azure Cache for Redis and a small container size for serving the AI model. You may need to adjust these settings to match your specific requirements, such as the expected load and the AI model's resource needs.
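    For instance, moving up to a Standard-tier C1 cache (which adds a replica and an SLA) only requires changing the SKU arguments; the sizing below is illustrative:

    # Illustrative sizing: Standard tier, C1 (1 GB) with replication
    redis_cache = cache.Redis(
        'redisCache',
        resource_group_name=resource_group.name,
        location=resource_group.location,
        sku=cache.SkuArgs(name='Standard', family='C', capacity=1),
        enable_non_ssl_port=False,
        minimum_tls_version='1.2'
    )

    The container's cpu and memory_in_gb requests can be raised the same way for a heavier model.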

    Remember that you will need both the Azure CLI (installed and logged in with az login) and the Pulumi CLI to deploy this stack. Once both CLIs are set up and this program is saved as __main__.py inside a Pulumi project directory, run pulumi up to create all the resources described.
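    If you are starting the project from scratch rather than from a template, the project directory also needs a Pulumi.yaml declaring the Python runtime; a minimal one looks like this (the project name is an example):

    name: ai-model-serving        # example project name
    runtime: python
    description: AI model serving with Azure Cache for Redis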

    After deployment, you can reach your AI model through the container group's public IP address (exported above as container_ip_address). Your model's application can then use the Redis cache whenever it needs low-latency access to frequently used data.
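    As a quick smoke test, you can call the endpoint from any HTTP client. The sketch below assumes the container serves an HTTP API on port 80 with a /predict route; the route and payload shape are hypothetical and depend on your serving framework:

    import requests

    # Substitute the value of the container_ip_address stack output
    ip = '20.0.0.1'  # placeholder IP
    response = requests.post(
        f'http://{ip}/predict',
        json={'features': [1.0, 2.0, 3.0]},  # hypothetical payload shape
        timeout=10,
    )
    print(response.json())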