Throttling AI Service Requests using Azure API Management Policies
Throttling requests to an AI service in Azure is a common way to protect an API's performance and ensure fair usage among clients. In Azure, you achieve this by applying policies to an API Management (APIM) service. Two such policies are rate-limit, which caps the number of calls allowed within a short renewal period, and quota, which caps the total number of calls over a longer renewal period.
To apply a throttle policy, you create an instance of the azure-native.apimanagement.Policy resource. Policies let you define the behavior of your API Management service at global scope, product scope, or API scope; the Policy resource used here holds the service-wide (global) policy. Throttling is configured by specifying limits within the policy definition, which is written in XML.
Here is a detailed walkthrough of how to create an API Management service in Azure and apply a throttle policy using Pulumi with Python.
Throttle Policy Creation with Pulumi
First, we import the Pulumi libraries. Then we declare a resource group and an API Management service instance; Pulumi creates them on the first deployment and leaves them unchanged on later runs if nothing differs. Finally, we define a policy with throttle rules and attach it to the service.
import pulumi
import pulumi_azure_native as azure_native

# Create a resource group to hold the API Management service
resource_group = azure_native.resources.ResourceGroup("my-resource-group")

# Create an API Management service instance within the resource group
api_management_service = azure_native.apimanagement.ApiManagementService(
    "my-api-management-service",
    resource_group_name=resource_group.name,
    publisher_name="My Company",
    publisher_email="contact@mycompany.com",
    sku=azure_native.apimanagement.ApiManagementServiceSkuPropertiesArgs(
        name=azure_native.apimanagement.SkuType.DEVELOPER,  # Developer SKU, for example purposes
        capacity=1,
    ),
)

# Define the throttling policy in XML format.
# Note: Azure's policy reference documents rate-limit and quota for product scope
# (rate-limit also for API and operation scope); if the service rejects them at
# global scope, attach the same XML at product scope instead.
throttle_policy_xml = """<policies>
    <inbound>
        <rate-limit calls="10" renewal-period="60" />
        <quota calls="100" renewal-period="86400" />
    </inbound>
    <backend>
        <forward-request />
    </backend>
    <outbound />
</policies>"""

# Apply the policy to the API Management service (global scope).
# policy_id="policy" is the identifier used in the provider's documented examples.
api_policy = azure_native.apimanagement.Policy(
    "my-api-throttle-policy",
    resource_group_name=resource_group.name,
    service_name=api_management_service.name,
    policy_id="policy",
    value=throttle_policy_xml,
    format="xml",
)

# Export the gateway URL of the API Management instance as a stack output
pulumi.export("api_management_endpoint", api_management_service.gateway_url)
In this program:
- We use the azure_native.resources.ResourceGroup class to create a new resource group with the logical name "my-resource-group".
- We then create an API Management service using azure_native.apimanagement.ApiManagementService and specify basic properties such as the publisher name and contact email. For this example, we use the "Developer" SKU, which is suitable for development and testing.
- Next, we define a throttling policy in XML format. Here, the rate-limit policy allows 10 calls per minute (60 seconds) and the quota policy allows 100 calls per day (86400 seconds, the number of seconds in a day).
- This policy is applied to the API Management service instance using the azure_native.apimanagement.Policy class, with the value parameter containing our XML policy and the format parameter set to "xml".
- Finally, we export the gateway URL of the API Management service as a stack output using pulumi.export.
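To change the limits without hand-editing the XML, the policy document described above can be generated from parameters. The following is a minimal sketch that uses only Python's standard library; the helper name build_throttle_policy and its parameter names are hypothetical and are not part of Pulumi or Azure.

```python
import xml.etree.ElementTree as ET


def build_throttle_policy(rate_calls: int, rate_period_s: int,
                          quota_calls: int, quota_period_s: int) -> str:
    """Return a policy document with rate-limit and quota rules in <inbound>.

    Hypothetical helper for illustration; all four values must be positive.
    """
    settings = {
        "rate_calls": rate_calls,
        "rate_period_s": rate_period_s,
        "quota_calls": quota_calls,
        "quota_period_s": quota_period_s,
    }
    for name, value in settings.items():
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    policies = ET.Element("policies")
    inbound = ET.SubElement(policies, "inbound")
    # Short-term limit: calls allowed per renewal period (seconds)
    ET.SubElement(inbound, "rate-limit", {
        "calls": str(rate_calls),
        "renewal-period": str(rate_period_s),
    })
    # Long-term limit: total calls allowed per renewal period (seconds)
    ET.SubElement(inbound, "quota", {
        "calls": str(quota_calls),
        "renewal-period": str(quota_period_s),
    })
    backend = ET.SubElement(policies, "backend")
    ET.SubElement(backend, "forward-request")
    ET.SubElement(policies, "outbound")
    return ET.tostring(policies, encoding="unicode")


# Same limits as the walkthrough: 10 calls per minute, 100 calls per day
throttle_policy_xml = build_throttle_policy(10, 60, 100, 86400)
print(throttle_policy_xml)
```

The returned string could be passed as the value argument of the Policy resource in place of the hard-coded XML.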
This Pulumi program creates an Azure API Management service and applies the policy to it, throttling the number of requests that clients can make according to the limits defined in the policy. This example is for demonstration purposes and should be adapted to your actual rate-limit and quota requirements.
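To see how the two limits interact, the sketch below models each one as a simple fixed-window counter. This is a simplified illustration under that assumption, not APIM's actual implementation, and the FixedWindow class and simulate function are hypothetical names. The status codes follow Azure's policy reference: 429 Too Many Requests when the rate limit is exceeded and 403 Forbidden when the quota is exceeded.

```python
from dataclasses import dataclass


@dataclass
class FixedWindow:
    """Counter that resets once its window has expired (a simplified model)."""
    limit: int
    period_s: int
    window_start: float = 0.0
    used: int = 0

    def roll(self, now: float) -> None:
        # Start a fresh window when the current one is older than the period
        if now - self.window_start >= self.period_s:
            self.window_start = now
            self.used = 0

    def try_consume(self) -> bool:
        # Count the call only if the window still has capacity
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def simulate(call_times, rate=(10, 60), quota=(100, 86400)):
    """Return one HTTP status per call time (seconds), checking rate-limit first."""
    rate_window = FixedWindow(*rate)
    quota_window = FixedWindow(*quota)
    statuses = []
    for now in call_times:
        rate_window.roll(now)
        quota_window.roll(now)
        if not rate_window.try_consume():
            statuses.append(429)  # rate limit exceeded
        elif not quota_window.try_consume():
            statuses.append(403)  # quota exceeded
        else:
            statuses.append(200)
    return statuses


burst = simulate([0] * 12)                      # 12 calls at the same instant
steady = simulate([6 * i for i in range(120)])  # one call every 6 s for 12 min
print(burst.count(200), burst.count(429))       # 10 2
print(steady.count(200), steady.count(403))     # 100 20
```

With the walkthrough's settings, a client sending a steady 10 calls per minute never hits the rate limit in this model but uses up the daily quota of 100 calls after 10 minutes.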
You can find more details in the Azure API Management policy documentation.
Deploy this Pulumi program using the Pulumi CLI by running pulumi up inside the directory where this code is saved; this provisions your API Management service with the throttle policy in Azure.