1. LLM Inference Endpoint Resolution with Azure Private DNS A Records


    A common task when managing applications on Azure is setting up private domain name resolution for services inside your virtual network. For example, if a machine learning model is hosted as an Inference Endpoint, you may want that endpoint's name to resolve only within your virtual network so that traffic stays private.

    Azure provides the Azure Private DNS Zone service for this purpose, which allows you to use your own domain names, rather than the Azure-provided names. Typically, you would create A records within your private DNS zone to map domain names to IP addresses for your services.

    Let's walk through the code required to set up a private DNS zone and create an A record that points to the IP address of your LLM Inference Endpoint in Azure using Pulumi.

    First, we will need to create a private DNS zone. A DNS zone is a container for DNS records, and in this context, it will contain the records for our private domain. We will then create an A record within this zone. An A record maps a domain name to an IP address, which in this case will be the IP address of our inference endpoint.

    Below, you will find a Pulumi program written in Python that sets up a private DNS zone and an A record pointing to a hypothetical LLM Inference Endpoint:

    import pulumi
    import pulumi_azure_native as azure_native

    # Assume the resource group and virtual network already exist.
    resource_group_name = 'my-resource-group'
    # Private DNS zones are global resources, so the location must be 'global'
    # rather than a region such as 'westus'.
    location = 'global'

    # Create the private DNS zone 'inference.private' in the resource group.
    private_dns_zone = azure_native.network.PrivateZone(
        'inference-private-dns-zone',
        resource_group_name=resource_group_name,
        location=location,
        private_zone_name='inference.private',
    )

    # IP address of the inference endpoint. It is left blank here; set it to
    # your endpoint's IPv4 address before deploying.
    inference_endpoint_ip = ''

    # Create an A record in the private zone that maps
    # 'endpoint.inference.private' to the endpoint's IP address.
    # Records in a private zone use PrivateRecordSet; RecordSet targets public DNS zones.
    a_record = azure_native.network.PrivateRecordSet(
        'inference-endpoint-a-record',
        resource_group_name=resource_group_name,
        private_zone_name=private_dns_zone.name,
        relative_record_set_name='endpoint',
        record_type='A',
        ttl=300,
        a_records=[azure_native.network.ARecordArgs(ipv4_address=inference_endpoint_ip)],
    )

    # Export the hostname of the inference endpoint.
    pulumi.export('inference_endpoint_hostname',
                  pulumi.Output.concat('endpoint.', private_dns_zone.name))

    In this program:

    • We first define shared configuration: the name of an existing resource group and the location. Azure Private DNS zones are global resources, so the zone's location must be 'global' rather than a region such as 'westus'.
    • We create a PrivateZone resource, which is equivalent to creating a private DNS zone in the Azure Portal. We name it 'inference.private' for the purpose of this example.
    • Given the IP address of the LLM Inference Endpoint (left blank in the example, so supply your own), we create an A record that maps a subdomain of our private zone (endpoint.inference.private) to that IP.
    • The ttl (time to live) of the DNS record is set to 300 seconds. The TTL tells a resolver how long it may cache the answer before it must query again.
    • Lastly, we export the fully qualified domain name (FQDN) of our inference endpoint as an output of our program.
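
    To make the TTL behaviour concrete, here is a minimal sketch of a resolver-style cache that reuses an answer until its TTL has elapsed. It is an illustration only and not how Azure DNS or any real resolver is implemented; the TtlCache class and the address 10.0.0.4 used in the usage note are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: an answer is reused until its TTL has elapsed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the expiry logic can be exercised without sleeping.
        self._clock = clock
        # hostname -> (ip_address, expiry_time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, hostname: str, ip_address: str, ttl_seconds: int) -> None:
        """Store an answer together with the time at which it expires."""
        self._entries[hostname] = (ip_address, self._clock() + ttl_seconds)

    def get(self, hostname: str) -> Optional[str]:
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip_address, expires_at = entry
        if self._clock() >= expires_at:
            # The TTL has elapsed, so a real resolver would query the zone again.
            del self._entries[hostname]
            return None
        return ip_address
```

    With ttl=300, a call such as cache.put('endpoint.inference.private', '10.0.0.4', 300) would be served from the cache for up to five minutes, which is also roughly how long clients may keep using an old address after you change the record.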

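    Note that the specifics of your setup, such as the IP address of your inference endpoint and the DNS zone name, depend on your Azure environment and configuration. Also keep in mind that a private DNS zone only resolves inside virtual networks that are linked to it, so the zone needs a virtual network link to your VNet before names such as endpoint.inference.private will resolve there.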
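
    Because the endpoint IP address is something you supply yourself, it can help to validate it before handing it to the A record. The following is a minimal sketch using only Python's standard ipaddress module; the validate_endpoint_ip helper is hypothetical and not part of Pulumi or the Azure provider.

```python
import ipaddress


def validate_endpoint_ip(value: str) -> str:
    """Return the normalized IPv4 address, or raise ValueError if it is unusable."""
    if not value or not value.strip():
        raise ValueError("inference endpoint IP address is empty")
    # ip_address() raises ValueError for anything that is not a valid IP address.
    address = ipaddress.ip_address(value.strip())
    if address.version != 4:
        # A records hold IPv4 addresses; IPv6 addresses belong in AAAA records.
        raise ValueError(f"expected an IPv4 address, got IPv{address.version}: {address}")
    return str(address)
```

    Calling this helper on the value assigned to inference_endpoint_ip would fail fast on a blank or malformed address instead of surfacing the problem during deployment.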

    To run this Pulumi program:

    1. Ensure you have Pulumi installed and configured for use with your Azure account.
    2. Save the script as __main__.py within a Pulumi project directory.
    3. Run pulumi up from the command line within the directory to provision the resources.

    The console output shows the changes Pulumi is going to apply to your Azure environment. After you confirm, Pulumi makes those changes and prints any exported outputs, such as 'inference_endpoint_hostname'.
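
    Once the resources are deployed, you may want to confirm that the name actually resolves. The sketch below is one way to check this from a machine inside a virtual network linked to the zone, using Python's standard socket module. The resolves_to helper is hypothetical, and the check will fail from anywhere outside a linked network because the zone is private.

```python
import socket
from typing import Callable


def resolves_to(hostname: str, expected_ip: str,
                resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname resolves to expected_ip, False otherwise."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror, a subclass of OSError, signals a failed lookup.
        return False
```

    The resolver argument defaults to the system resolver and is injectable only so the logic can be exercised without network access. In practice you would pass the exported hostname and the IP address you configured for the A record.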