Auto-scaling RDS Instances for AI Workloads

Pulumi · Accepted Answer

Auto-scaling automatically adjusts the number of compute instances in response to changing load on your application. This gives the application the resources it needs during peak times and saves cost when demand is lower.

When running AI workloads against AWS RDS, we want database capacity to follow workload demand. Pulumi's AWS provider exposes Application Auto Scaling through two resources, `aws.appautoscaling.Target` and `aws.appautoscaling.Policy`. Together they define the auto-scaling parameters for a specific AWS service, in our case RDS.
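A scalable target is identified by three strings: the service namespace, the scalable dimension, and a resource ID. For Aurora read replicas these are `rds`, `rds:cluster:ReadReplicaCount`, and `cluster:<cluster-identifier>`. The sketch below shows how those values fit together in plain Python. The function `aurora_scaling_target_args` is a hypothetical helper for illustration, not part of Pulumi or the AWS SDK, and the cluster name is a placeholder.

```python
def aurora_scaling_target_args(cluster_identifier: str,
                               min_replicas: int,
                               max_replicas: int) -> dict:
    """Hypothetical helper: build keyword arguments for a scalable target."""
    if not cluster_identifier:
        raise ValueError("cluster_identifier must not be empty")
    if min_replicas < 0 or max_replicas < min_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    return {
        "service_namespace": "rds",
        "scalable_dimension": "rds:cluster:ReadReplicaCount",
        # Application Auto Scaling expects "cluster:<cluster-identifier>".
        "resource_id": f"cluster:{cluster_identifier}",
        "min_capacity": min_replicas,
        "max_capacity": max_replicas,
    }


# "my-db-cluster" is a placeholder name.
print(aurora_scaling_target_args("my-db-cluster", 1, 10)["resource_id"])
# prints: cluster:my-db-cluster
```

A dictionary like this could be unpacked into `aws.appautoscaling.Target(...)` when the cluster identifier is a plain string; when it is a Pulumi output, the resource ID needs to be built with `pulumi.Output.concat` instead.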

Here's a Pulumi program written in Python that creates an auto-scaling configuration for the read replicas of an Aurora cluster. Application Auto Scaling's `rds` namespace only supports the `rds:cluster:ReadReplicaCount` dimension, so the scalable target has to be an Aurora DB cluster rather than a standalone RDS instance:

```python
import pulumi
import pulumi_aws as aws

# Read the master password from Pulumi config as a secret instead of hard-coding it.
# Set it first with: pulumi config set --secret dbPassword <value>
config = pulumi.Config()
db_password = config.require_secret("dbPassword")

# Create an Aurora MySQL cluster. Auto-scaling applies to the cluster's read replicas.
cluster = aws.rds.Cluster("myAuroraCluster",
    engine="aurora-mysql",
    database_name="mydb",
    master_username="root",
    master_password=db_password,
    skip_final_snapshot=True,
)

# Create the primary (writer) instance. Auto-scaling adds reader instances alongside it.
# Check that the instance class is available for your engine version and region.
writer = aws.rds.ClusterInstance("myAuroraWriter",
    cluster_identifier=cluster.id,
    engine=cluster.engine,
    engine_version=cluster.engine_version,
    instance_class="db.t3.medium",
)

# Register the cluster as a scalable target with AWS Application Auto Scaling.
scaling_target = aws.appautoscaling.Target("rdsAutoScalingTarget",
    max_capacity=10,  # Maximum number of Aurora Replicas.
    min_capacity=1,   # Minimum number of Aurora Replicas.
    resource_id=pulumi.Output.concat("cluster:", cluster.id),  # Format: cluster:<cluster-identifier>.
    scalable_dimension="rds:cluster:ReadReplicaCount",  # The scalable dimension for the rds namespace.
    service_namespace="rds",
    opts=pulumi.ResourceOptions(depends_on=[writer]),  # Wait until the writer instance exists.
)

# Define a target tracking scaling policy for the registered target.
scaling_policy = aws.appautoscaling.Policy("rdsAutoScalingPolicy",
    policy_type="TargetTrackingScaling",
    resource_id=scaling_target.resource_id,  # Reference the scaling target by its resource ID.
    scalable_dimension=scaling_target.scalable_dimension,  # Reuse the dimension defined on the target.
    service_namespace=scaling_target.service_namespace,  # Reuse the namespace defined on the target.
    target_tracking_scaling_policy_configuration=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationArgs(
        predefined_metric_specification=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationPredefinedMetricSpecificationArgs(
            predefined_metric_type="RDSReaderAverageCPUUtilization",
        ),
        target_value=50.0,  # Desired average CPU utilization (percent) across readers.
    ),
)

# Export the cluster's writer and reader endpoints
pulumi.export("rds_endpoint", cluster.endpoint)
pulumi.export("rds_reader_endpoint", cluster.reader_endpoint)

# Export the auto-scaling target ID
pulumi.export("scaling_target_id", scaling_target.id)

# Export the auto-scaling policy ARN
pulumi.export("scaling_policy_arn", scaling_policy.arn)
```

In this program, we perform the following steps:

1. Create an `aws.rds.Cluster` and an `aws.rds.ClusterInstance`, which together form the Aurora database whose read replicas will auto-scale.
2. Define an `aws.appautoscaling.Target` to register the cluster as a scalable target with AWS Application Auto Scaling, with between 1 and 10 replicas.
3. Create an `aws.appautoscaling.Policy` that defines a scaling policy for the target. Here it is a target tracking policy based on the average CPU utilization of the reader instances.

Before you run this program, set up AWS credentials on your machine or configure the AWS provider with the required access keys, and store the database password with `pulumi config set --secret dbPassword <value>`. Once the policy is active, Application Auto Scaling adds Aurora Replicas when average reader CPU utilization stays above the 50% target and removes them when it stays below, within the configured minimum and maximum. This keeps read performance steady under load and reduces cost when demand drops.
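To build intuition for how a CPU target turns into a replica count, here is a minimal sketch of a proportional scaling rule in plain Python. It is a simplified model and an assumption for illustration, not the algorithm AWS runs: the real service acts through CloudWatch alarms, applies cooldown periods, and scales in more conservatively than it scales out. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current: int,
                     observed_cpu: float,
                     target_cpu: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Simplified proportional model of target tracking (not AWS's actual algorithm)."""
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    # Scale the replica count by the ratio of observed load to target load.
    baseline = max(current, 1)
    raw = math.ceil(baseline * observed_cpu / target_cpu)
    # Clamp to the capacity range configured on the scalable target.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(2, 80.0, 50.0, 1, 10))   # prints 4: load above target, scale out
print(desired_replicas(4, 20.0, 50.0, 1, 10))   # prints 2: load below target, scale in
print(desired_replicas(2, 400.0, 50.0, 1, 10))  # prints 10: capped at max_replicas
```

The clamp in the last step mirrors `min_capacity` and `max_capacity` on the scalable target: whatever the metric suggests, the replica count stays inside that range.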