Throttling AI API Requests with Azure APIM Policies

Question

Pulumi · Accepted Answer

Throttling limits how many requests an API accepts in a given period so that backend resources are not overwhelmed. In Azure API Management (APIM), policies enforce these limits on the number of API calls allowed within specified intervals. To implement throttling with Azure APIM, I'll provide a Pulumi program that creates the following resources:

1. An **API Management Service** – the central point where all APIs are managed.
2. A **Product** – groups APIs together and is the unit of API access control.
3. A **Policy** – defines the throttling rules that apply to the product.

### Explanation of Resources

- **API Management Service**: The main resource that represents Azure API Management within your subscription. You create it to organize and manage your APIs centrally.
- **Product**: In Azure APIM, a product contains one or more APIs. You expose APIs to developers through products, and developers must subscribe to a product to access the APIs it contains.
- **Policy**: Policies let you change the behavior of an API through configuration. A policy is a collection of statements that are executed sequentially on the request or response of an API. Here, the policy throttles the number of API calls.

Below is a Pulumi Python program that sets up throttling using Azure APIM policies.

```python
import pulumi
import pulumi_azure_native as azure_native

# Name of an existing resource group and the name to give the API Management service.
# In a real scenario, these come from your environment, config files, or direct inputs.
resource_group_name = 'my-resource-group'
apim_service_name = 'my-apim-service'

# Create an API Management service.
apim_service = azure_native.apimanagement.ApiManagementService('myApimService',
    resource_group_name=resource_group_name,
    service_name=apim_service_name,
    location='West US',
    publisher_name='My Company',
    publisher_email='contact@mycompany.com',
    sku=azure_native.apimanagement.ApiManagementServiceSkuPropertiesArgs(
        name='Developer',  # The by-key policies below are not supported on 'Consumption'
        capacity=1
    )
)

# Create a Product within the API Management service.
product = azure_native.apimanagement.Product('myProduct',
    product_id='throttled-product',
    display_name='Throttled Product',
    description='A product whose calls are rate limited and capped by an hourly quota',
    resource_group_name=resource_group_name,
    service_name=apim_service.name,
    subscription_required=True,  # Require a subscription to access this product
    approval_required=False,  # No approval required for subscription requests
    state=azure_native.apimanagement.ProductState.PUBLISHED
)

# Create a Policy and apply it to the Product for Throttling.
# The XML sets the inbound rules: 5 calls per 15 seconds and 100 calls per hour.
# Assumption: calls are counted per subscription; adjust counter-key as needed.
policy = azure_native.apimanagement.ProductPolicy('myPolicy',
    resource_group_name=resource_group_name,
    service_name=apim_service.name,
    product_id=product.name,
    policy_id='policy',  # APIM accepts only 'policy' as the policy identifier
    format='xml',
    value="""<policies>
    <inbound>
        <base />
        <rate-limit-by-key calls="5" renewal-period="15" counter-key="@(context.Subscription.Id)" />
        <quota-by-key calls="100" renewal-period="3600" counter-key="@(context.Subscription.Id)" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>"""
)

# Export the URL of the API Management service.
pulumi.export('apim_service_url', apim_service.gateway_url)
```

In this program:

- We create an instance of `ApiManagementService`, which is required to manage our APIs.
- We construct a `Product` named `myProduct`, which groups the APIs that we will manage.
- We define a product-scoped policy named `myPolicy` with the throttling rules and attach it to the `Product`. The limits are written as XML in the `value` property of the policy.
- The policy limits calls to 5 every 15 seconds and sets an hourly quota of 100 calls. Both limits are tracked per `counter-key`, which can be adjusted as needed; the per-subscription key used here is an assumption.
- At the end of the program, we export the API Management gateway URL, the endpoint that client applications use to reach the managed APIs.

Remember to replace `my-resource-group` with the name of your existing resource group and `my-apim-service` with the name you want for the API Management service.

The `sku` parameter should be set according to your needs. It is set to `Developer` here because the `rate-limit-by-key` and `quota-by-key` policies are not available on the serverless `Consumption` tier; on `Consumption`, the subscription-scoped `rate-limit` and `quota` policies are the alternative. The Developer tier has no SLA and is meant for non-production use.

This program is a starting point for managing APIs and enforcing throttling policies that keep usage of API resources fair.
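
Since the limits live in the XML passed to the policy's `value` property, it can help to generate that XML from parameters instead of editing the string by hand. The following is a minimal sketch, not part of the Pulumi SDK: `build_throttle_policy` is a hypothetical helper, the default `counter-key` expression (one counter per subscription) is an assumption, and the defaults mirror the 5 calls per 15 seconds and 100 calls per hour described above.

```python
from xml.sax.saxutils import quoteattr


def build_throttle_policy(
    calls: int = 5,
    renewal_period: int = 15,
    quota_calls: int = 100,
    quota_period: int = 3600,
    counter_key: str = "@(context.Subscription.Id)",  # assumption: count per subscription
) -> str:
    """Return APIM policy XML with a short-term rate limit and a longer-term quota.

    Hypothetical helper for illustration; periods are in seconds.
    """
    if min(calls, renewal_period, quota_calls, quota_period) <= 0:
        raise ValueError("all limits and periods must be positive integers")

    # quoteattr escapes &, < and > and adds the surrounding quotes, so policy
    # expressions that contain double quotes still produce well-formed XML.
    key = quoteattr(counter_key)

    return f"""<policies>
    <inbound>
        <base />
        <rate-limit-by-key calls="{calls}" renewal-period="{renewal_period}" counter-key={key} />
        <quota-by-key calls="{quota_calls}" renewal-period="{quota_period}" counter-key={key} />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>"""


if __name__ == "__main__":
    # Example: 10 calls per minute, keeping the default hourly quota.
    print(build_throttle_policy(calls=10, renewal_period=60))
```

The returned string can be passed as the `value` of the policy resource with `format="xml"`. To count calls per client rather than per subscription, pass a different `counter_key` expression, for example one based on a request header of your choosing.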