1. LLM Inference Service Updates in a Controlled Window with AWS SSM


    The goal is to update an LLM (large language model) inference service within a controlled window using AWS Systems Manager (SSM). A controlled window is a scheduled period during which updates or maintenance tasks are allowed to run, chosen so that any disruption falls at a time when its impact on service availability is lowest. To achieve this, combine an SSM Maintenance Window with an SSM Document that carries out the update.

    AWS SSM Maintenance Windows let you define a schedule for potentially disruptive actions on your instances, such as software updates. SSM Documents define the actions that Systems Manager performs on your managed instances. They come in several types, including Command documents (the type used here, executed through Run Command) and Automation documents.

    Here's how we can use Pulumi to create these resources:

    1. Define a Maintenance Window to specify when the updates can occur.
    2. Create an SSM Document that describes the steps to update the Inference Service.
    3. Register Targets (the EC2 instances or other resources that make up the Inference Service) with the Maintenance Window.
    4. Assign a Task to the Maintenance Window which refers to the SSM Document to perform the updates.

    Now, let's create a Pulumi program to automate the creation of these resources.

    import pulumi
    import pulumi_aws as aws

    # Create a new SSM Maintenance Window to define when updates should occur.
    maintenance_window = aws.ssm.MaintenanceWindow("maintenanceWindow",
        # A CRON or rate expression to define the schedule
        schedule="cron(0 2 ? * SUN *)",  # e.g., Run every Sunday at 2:00 AM
        # The duration of the window in hours.
        duration=4,  # 4 hour window
        # The amount of time before the end of the window to stop scheduling new tasks.
        cutoff=1,  # 1 hour before the end of the window.
        # Ensure the Maintenance Window is enabled.
        enabled=True,
    )

    # Create a new SSM Document that describes the update process for the LLM Inference Service.
    update_document = aws.ssm.Document("updateDocument",
        # The content of the document in JSON format.
        # Replace the content with the actual commands to update the LLM service.
        content="""{
            "schemaVersion": "2.2",
            "description": "Update LLM Inference Service",
            "mainSteps": [
                {
                    "action": "aws:runShellScript",
                    "name": "updateService",
                    "inputs": {
                        "runCommand": [
                            "sudo systemctl stop llm-inference-service",
                            "sudo yum update -y",
                            "sudo systemctl start llm-inference-service"
                        ]
                    }
                }
            ]
        }""",
        document_type="Command",
    )

    # Register targets with the Maintenance Window.
    # Targets can be specified using Key=tag:Name, Values=LLM-Inference-Service
    # to target instances with a specific tag.
    maintenance_window_target = aws.ssm.MaintenanceWindowTarget("target",
        window_id=maintenance_window.id,
        resource_type="INSTANCE",
        targets=[aws.ssm.MaintenanceWindowTargetTargetArgs(
            key="tag:Name",
            values=["LLM-Inference-Service"],
        )],
    )

    # Assign the update task to the Maintenance Window.
    maintenance_window_task = aws.ssm.MaintenanceWindowTask("task",
        window_id=maintenance_window.id,
        targets=[aws.ssm.MaintenanceWindowTaskTargetArgs(
            key="WindowTargetIds",
            values=[maintenance_window_target.id],
        )],
        # This ARN refers to the SSM Document that we created earlier.
        task_arn=update_document.arn,
        task_type="RUN_COMMAND",
        # Define the maximum number of targets this task can be run for in parallel.
        max_concurrency="2",
        # Define the number of errors allowed before this task stops being scheduled.
        max_errors="1",
        # Service Role ARN to allow SSM to perform actions on your behalf.
        # Replace with the actual role ARN for SSM.
        service_role_arn="arn:aws:iam::123456789012:role/SSMTasksRole",
    )

    # Export the Maintenance Window ID and Document ARN for reference
    pulumi.export("maintenance_window_id", maintenance_window.id)
    pulumi.export("update_document_arn", update_document.arn)

    Let's break down each section of the program:

    • Maintenance Window: A cron expression defines the recurring schedule; here the window opens every Sunday at 2:00 AM (UTC by default). The window lasts 4 hours, and SSM stops scheduling new tasks 1 hour before the window closes (the cutoff), so tasks that are already running have time to finish.
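
    To make the duration and cutoff arithmetic concrete, here is a minimal sketch in plain Python. The helper name window_bounds and the example date are illustrative assumptions, not part of the Pulumi program or the AWS API, and the sanity check on the cutoff is a guard for this sketch only.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(start: datetime, duration_hours: int, cutoff_hours: int):
    """Return (window_end, last_task_start) for one window occurrence.

    Hypothetical helper for illustration only; SSM computes this itself.
    """
    # Sanity guard for this sketch: the cutoff must fit inside the window.
    if not 0 <= cutoff_hours < duration_hours:
        raise ValueError("cutoff must be non-negative and shorter than the duration")
    window_end = start + timedelta(hours=duration_hours)
    # New tasks are no longer scheduled once this point is reached.
    last_task_start = window_end - timedelta(hours=cutoff_hours)
    return window_end, last_task_start


# A Sunday at 02:00 UTC, matching cron(0 2 ? * SUN *).
start = datetime(2024, 1, 7, 2, 0, tzinfo=timezone.utc)
window_end, last_task_start = window_bounds(start, duration_hours=4, cutoff_hours=1)
print(window_end.isoformat())       # 2024-01-07T06:00:00+00:00
print(last_task_start.isoformat())  # 2024-01-07T05:00:00+00:00
```

    With duration=4 and cutoff=1, tasks can start during the first three hours of the window, and the final hour is left for in-flight tasks to complete.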

    • SSM Document: This JSON document lays out the commands necessary to update the LLM Inference Service. In a real-world scenario, these commands would be replaced by the specific steps needed to update your inference service.
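
    Hand-written JSON inside a Python string is easy to break when the commands change. One option, sketched below, is to build the document as a Python dict and serialize it with json.dumps. The function name build_update_document and its parameters are hypothetical; the document structure mirrors the one used in the program above.

```python
import json


def build_update_document(service_name: str, update_commands: list) -> str:
    """Build a Command document (schema 2.2) as a JSON string.

    Hypothetical helper: wraps the given update commands between a
    stop and a start of the named systemd service.
    """
    commands = [
        f"sudo systemctl stop {service_name}",
        *update_commands,
        f"sudo systemctl start {service_name}",
    ]
    document = {
        "schemaVersion": "2.2",
        "description": "Update LLM Inference Service",
        "mainSteps": [
            {
                "action": "aws:runShellScript",
                "name": "updateService",
                "inputs": {"runCommand": commands},
            }
        ],
    }
    return json.dumps(document, indent=2)


content = build_update_document("llm-inference-service", ["sudo yum update -y"])
print(content)
```

    The resulting string could then be passed as the content argument of aws.ssm.Document in place of the inline JSON literal.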

    • Maintenance Window Target: The target defines which instances the task runs on. It selects instances by tag: any EC2 instance tagged Name: LLM-Inference-Service is targeted, provided the instance is managed by Systems Manager.

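    • Maintenance Window Task: The task ties the SSM Document to the registered targets and runs the document's commands during the maintenance window. It runs on at most 2 targets in parallel (max_concurrency), stops being scheduled once the allowed number of errors is reached (max_errors), and uses the given service role to act on your behalf.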
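
    The program above uses a placeholder for service_role_arn. As a rough sketch of what such a role needs, the snippet below builds a trust policy that lets the Systems Manager service assume the role. The function name ssm_trust_policy is hypothetical, and pairing the role with the AWS managed policy AmazonSSMMaintenanceWindowRole is an assumption about a typical setup; check the permissions your own tasks require.

```python
import json

# AWS managed policy commonly attached to maintenance window service roles
# (assumption: adequate for a simple Run Command task).
MAINTENANCE_WINDOW_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AmazonSSMMaintenanceWindowRole"
)


def ssm_trust_policy() -> str:
    """Return a trust policy allowing ssm.amazonaws.com to assume the role."""
    return json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "ssm.amazonaws.com"},
                    "Action": "sts:AssumeRole",
                }
            ],
        }
    )


# In a Pulumi program, the string could be passed as assume_role_policy to
# aws.iam.Role, with the managed policy attached via
# aws.iam.RolePolicyAttachment; the role's arn output would then replace
# the hard-coded service_role_arn.
print(ssm_trust_policy())
```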

    • Exports: The last two lines export the maintenance window ID and the update document ARN so they can be referred to elsewhere.

    To execute this Pulumi program, you need the Pulumi CLI installed, appropriate AWS credentials configured, and a Pulumi project set up. Replace the placeholder service role ARN with a role from your own account first. When you run pulumi up, Pulumi provisions the AWS resources defined in the program.