1. Analyzing AI Workload Expenses with AWS CUR


    To analyze AI workload expenses, one efficient approach is to use the AWS Cost and Usage Report (CUR), which provides comprehensive data about your AWS costs and usage. The report breaks costs down by service, usage type, and (optionally) individual resource, which is the level of detail you need to see where AI workloads spend money and where spending can be reduced.

    Here's a high-level overview of what the Pulumi Python program does to set up AWS CUR for analyzing AI workload expenses:

    1. Configure the S3 bucket that stores the report data.
    2. Set up permissions that let AWS write the CUR data to that bucket.
    3. Define a Report Definition resource that specifies which AWS usage and cost data to collect and where to deliver it.

    The following Pulumi Python program sets up the AWS CUR for analyzing AI workload expenses:

        import json

        import pulumi
        import pulumi_aws as aws

        # S3 bucket that receives the Cost and Usage Report files.
        report_bucket = aws.s3.Bucket("report-bucket", acl="private")

        # Bucket policy that lets the AWS billing reports service check the
        # bucket's ACL and policy and write report objects into the bucket.
        policy_document = report_bucket.arn.apply(lambda arn: json.dumps({
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "billingreports.amazonaws.com"},
                    "Action": ["s3:GetBucketAcl", "s3:GetBucketPolicy"],
                    "Resource": arn,
                },
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "billingreports.amazonaws.com"},
                    "Action": "s3:PutObject",
                    "Resource": f"{arn}/*",
                },
            ],
        }))

        policy = aws.s3.BucketPolicy("bucket-policy",
            bucket=report_bucket.id,
            policy=policy_document,
        )

        # Cost and Usage Report definition. The CUR API is only served from
        # us-east-1, so the AWS provider must target that region.
        report_definition = aws.cur.ReportDefinition("report-definition",
            report_name="ai_workload_usage_report",
            time_unit="HOURLY",
            format="textORcsv",
            compression="GZIP",
            additional_schema_elements=["RESOURCES"],
            s3_bucket=report_bucket.id,
            s3_prefix="reports",
            s3_region="us-east-1",  # must match the region of the bucket
            additional_artifacts=["REDSHIFT", "QUICKSIGHT"],
            refresh_closed_reports=True,
            report_versioning="CREATE_NEW_REPORT",
            # Create the report only after the bucket policy is in place.
            opts=pulumi.ResourceOptions(depends_on=[policy]),
        )

        # Export the name of the S3 bucket where the reports are stored.
        pulumi.export("report_bucket_name", report_bucket.id)

    Now, let's break down the code:

    • We create an S3 bucket named report_bucket which will be used to store the CUR data.
    • We then build a policy document, policy_document, that explicitly allows the AWS Cost and Usage Report service to write report files into our bucket.
    • After that, we attach this policy to our bucket using the BucketPolicy resource.
    • Next, we set up the ReportDefinition. report_name identifies the report within AWS, and time_unit is set to "HOURLY" for the finest granularity. format is set to "textORcsv", a single literal value that selects CSV output (the alternative is "Parquet"). compression is set to "GZIP", and additional_schema_elements includes "RESOURCES", which adds individual resource IDs to the report so that costs can be traced to specific resources rather than only to services.
    • We define the S3 bucket details for the report to be written into, specifying the s3_bucket, s3_prefix to organize the reports within the bucket, and s3_region which should match the region of the S3 bucket.
    • additional_artifacts lists "REDSHIFT" and "QUICKSIGHT", so AWS also delivers the supporting files needed to load the report into Amazon Redshift and Amazon QuickSight for analysis.
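    • We enable refresh_closed_reports so that AWS updates reports for already-closed billing periods when it later applies refunds, credits, or support fees, and we set report_versioning to "CREATE_NEW_REPORT" so that each update is delivered as a new report version instead of overwriting the previous one.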
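
    The report options described above are not independent of one another. As a minimal sketch, assuming the constraints described in the AWS CUR API reference (Parquet has to be chosen for both format and compression or for neither, and the "ATHENA" artifact cannot be combined with other artifacts and requires "OVERWRITE_REPORT"), a helper like the one below could catch invalid combinations before running pulumi up. The function name check_report_options and its messages are hypothetical and not part of Pulumi or AWS:

```python
from typing import List

# Values accepted by the CUR report definition, as used in the program above.
VALID_FORMATS = {"textORcsv", "Parquet"}
VALID_COMPRESSIONS = {"GZIP", "ZIP", "Parquet"}
VALID_ARTIFACTS = {"REDSHIFT", "QUICKSIGHT", "ATHENA"}
VALID_VERSIONING = {"CREATE_NEW_REPORT", "OVERWRITE_REPORT"}


def check_report_options(
    report_format: str,
    compression: str,
    additional_artifacts: List[str],
    report_versioning: str,
) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []

    if report_format not in VALID_FORMATS:
        problems.append(f"unknown format: {report_format!r}")
    if compression not in VALID_COMPRESSIONS:
        problems.append(f"unknown compression: {compression!r}")
    if report_versioning not in VALID_VERSIONING:
        problems.append(f"unknown report_versioning: {report_versioning!r}")
    for artifact in additional_artifacts:
        if artifact not in VALID_ARTIFACTS:
            problems.append(f"unknown artifact: {artifact!r}")

    # Parquet must be selected for both format and compression, or for neither.
    if (report_format == "Parquet") != (compression == "Parquet"):
        problems.append("format and compression must both be 'Parquet' or neither")

    # ATHENA cannot be mixed with other artifacts and needs overwrite versioning.
    if "ATHENA" in additional_artifacts:
        if len(set(additional_artifacts)) > 1:
            problems.append("ATHENA cannot be combined with other artifacts")
        if report_versioning != "OVERWRITE_REPORT":
            problems.append("ATHENA requires report_versioning='OVERWRITE_REPORT'")

    return problems


if __name__ == "__main__":
    # The combination used in the program above.
    print(check_report_options(
        "textORcsv", "GZIP", ["REDSHIFT", "QUICKSIGHT"], "CREATE_NEW_REPORT"
    ))
```

    The check only covers the rules listed in the lead-in; AWS validates the full definition when the report is created.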

    Finally, we export the S3 bucket name to make it accessible and easy to reference.

    This program only sets up the Cost and Usage Report. To analyze the data, you need additional resources or an analytics tool that can consume the CUR data, such as Amazon QuickSight.
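
    As a starting point for that analysis, the sketch below sums the cost of AI-related line items in a CUR CSV file using only the Python standard library. It assumes the CSV column names lineItem/ProductCode, lineItem/ResourceId, and lineItem/UnblendedCost, and it treats the product codes "AmazonSageMaker" and "AmazonBedrock" as the AI workloads of interest; both the product-code set and the function names are assumptions to adapt to your own account and report:

```python
import csv
import gzip
import io
from collections import defaultdict
from decimal import Decimal

# Assumption: the product codes that count as "AI workloads" in this account.
AI_PRODUCT_CODES = frozenset({"AmazonSageMaker", "AmazonBedrock"})


def summarize_ai_costs(csv_file, product_codes=AI_PRODUCT_CODES):
    """Sum unblended cost per (product code, resource ID) for the given codes."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        code = row.get("lineItem/ProductCode", "")
        if code not in product_codes:
            continue
        # Resource IDs are present only when "RESOURCES" is enabled in the
        # report definition, and some line items have none.
        resource = row.get("lineItem/ResourceId") or "(no resource id)"
        totals[(code, resource)] += Decimal(row.get("lineItem/UnblendedCost") or "0")
    return dict(totals)


def summarize_report_file(path):
    """Open a GZIP-compressed CUR CSV file downloaded from the report bucket."""
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        return summarize_ai_costs(handle)


if __name__ == "__main__":
    # Tiny in-memory sample with made-up values, in place of a real report file.
    sample = io.StringIO(
        "lineItem/ProductCode,lineItem/ResourceId,lineItem/UnblendedCost\n"
        "AmazonSageMaker,example-endpoint,1.50\n"
        "AmazonEC2,i-0example,9.00\n"
    )
    for (code, resource), cost in summarize_ai_costs(sample).items():
        print(code, resource, cost)
```

    Decimal is used instead of float so that summing many small line items does not accumulate rounding error. For large reports, loading the data into Amazon Redshift or QuickSight, as the report definition above prepares for, is likely a better fit than a local script.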