Analyzing AI Workload Expenses with AWS CUR
To analyze AI workload expenses, one efficient approach is to use the AWS Cost and Usage Report (CUR), which provides comprehensive data about your AWS costs and usage. The CUR lets you dive into the details of your AWS expenses, giving you line-item insight that can help optimize spending.
Here's a high-level overview of the steps we'll take in the program using Pulumi with Python to set up an infrastructure for analyzing AI workload expenses using AWS CUR:
- Define a Report Definition resource using AWS CUR to specify the data about AWS usage and costs you want to collect.
- Configure the necessary S3 bucket to store the report data.
- Set up permissions for AWS to write the CUR data to your S3 bucket.
The following Pulumi Python program sets up the AWS CUR for analyzing AI workload expenses:
```python
import pulumi
import pulumi_aws as aws

# Define the S3 bucket where we'd like our usage reports to be delivered.
report_bucket = aws.s3.Bucket("report-bucket", acl="private")

# Create a policy that allows the AWS Cost and Usage Reports service to
# verify the bucket configuration and write report files into it.
policy_document = pulumi.Output.all(report_bucket.arn).apply(lambda args: f"""{{
    "Version": "2012-10-17",
    "Statement": [
        {{
            "Effect": "Allow",
            "Principal": {{"Service": "billingreports.amazonaws.com"}},
            "Action": ["s3:GetBucketAcl", "s3:GetBucketPolicy"],
            "Resource": "{args[0]}"
        }},
        {{
            "Effect": "Allow",
            "Principal": {{"Service": "billingreports.amazonaws.com"}},
            "Action": "s3:PutObject",
            "Resource": "{args[0]}/*"
        }}
    ]
}}""")

policy = aws.s3.BucketPolicy("bucket-policy",
    bucket=report_bucket.id,
    policy=policy_document,
)

# Define an AWS Cost and Usage Report Definition.
# Note: the CUR API is only available in us-east-1, so the AWS provider
# must be configured for that region.
report_definition = aws.cur.ReportDefinition("report-definition",
    report_name="ai_workload_usage_report",
    time_unit="HOURLY",
    format="textORcsv",
    compression="GZIP",
    additional_schema_elements=["RESOURCES"],
    s3_bucket=report_bucket.id,
    s3_prefix="reports",
    s3_region="us-east-1",
    additional_artifacts=["REDSHIFT", "QUICKSIGHT"],
    refresh_closed_reports=True,
    report_versioning="CREATE_NEW_REPORT",
)

# Export the S3 bucket name where the reports will be stored.
pulumi.export("report_bucket_name", report_bucket.id)
```
Now, let's break down the code:
- We create an S3 bucket, `report_bucket`, which will be used to store the CUR data.
- We then build a policy document, `policy_document`, that explicitly allows the AWS Cost and Usage Reports service to deliver report files into our bucket, and attach it to the bucket using the `BucketPolicy` resource.
- Next, we set up the `ReportDefinition`. We specify a `report_name` that identifies the report within AWS, a `time_unit` of "HOURLY" for detailed granularity, and the `format`, set to "textORcsv" (the CUR value for text/CSV output). The `compression` option is set to "GZIP", and `additional_schema_elements` includes "RESOURCES", which adds resource IDs to each line item and is useful for attributing costs to specific resources.
- We define the S3 delivery details for the report: the `s3_bucket`, an `s3_prefix` to organize the reports within the bucket, and the `s3_region`, which should match the region of the S3 bucket.
- `additional_artifacts` includes "REDSHIFT" and "QUICKSIGHT", making the report data compatible with these additional AWS analysis services.
- We enable `refresh_closed_reports` to let AWS refresh the data after a billing period closes, and set `report_versioning` to "CREATE_NEW_REPORT" so each update is delivered as a new report rather than overwriting the previous one.
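Building the policy document as an f-string works, but it is easy to introduce JSON syntax errors that way. As a sketch of an alternative, the same document can be built with `json.dumps`, using the service principal AWS documents for CUR delivery (`billingreports.amazonaws.com`); the helper name here is just for illustration:

```python
import json


def cur_bucket_policy(bucket_arn: str) -> str:
    """Build the CUR delivery bucket policy as a JSON string.

    json.dumps guarantees the output is well-formed JSON, so there is
    no risk of a stray brace or comma breaking the policy.
    """
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                # Let the CUR service check the bucket's ACL and policy
                # before delivering reports.
                "Effect": "Allow",
                "Principal": {"Service": "billingreports.amazonaws.com"},
                "Action": ["s3:GetBucketAcl", "s3:GetBucketPolicy"],
                "Resource": bucket_arn,
            },
            {
                # Let the CUR service write report objects into the bucket.
                "Effect": "Allow",
                "Principal": {"Service": "billingreports.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    })
```

In the Pulumi program this could then be wired up as `policy_document = report_bucket.arn.apply(cur_bucket_policy)`.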
Finally, we export the S3 bucket name to make it accessible and easy to reference.
This only sets up the Cost and Usage Report; to fully analyze the data, you may need to set up additional resources or perform the analysis in an AWS service such as QuickSight, or with another analytics tool that can consume the CUR data.
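As a minimal sketch of consuming the delivered data directly, the following reads one gzipped CUR CSV file (as produced by the "textORcsv" + "GZIP" settings above) and sums cost per service. It assumes the standard CUR column headers `lineItem/ProductCode` and `lineItem/UnblendedCost`:

```python
import csv
import gzip
from collections import defaultdict


def cost_by_service(report_path: str) -> dict:
    """Sum unblended cost per product code from one gzipped CUR CSV file."""
    totals = defaultdict(float)
    with gzip.open(report_path, "rt", newline="") as f:
        for row in csv.DictReader(f):
            # Empty cost cells are treated as zero.
            totals[row["lineItem/ProductCode"]] += float(
                row["lineItem/UnblendedCost"] or 0
            )
    return dict(totals)
```

Pointing this at a report file downloaded from the `reports/` prefix of the bucket yields a per-service breakdown, e.g. SageMaker versus EC2 charges for your AI workloads.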