Create and Configure CloudWatch Alarms

The aws:cloudwatch/metricAlarm:MetricAlarm resource, part of the Pulumi AWS provider, defines CloudWatch metric alarms that evaluate time-series data and trigger actions when metrics breach thresholds or deviate from learned patterns. This guide focuses on four capabilities: static threshold monitoring, Auto Scaling integration, metric math and anomaly detection, and Metrics Insights queries.

A metric alarm doesn’t operate in isolation. It references metrics from existing AWS resources and triggers actions on SNS topics, Lambda functions, or scaling policies. The examples are intentionally small and show how each evaluation mode is configured. Combine them with your own infrastructure, notification targets, and operational thresholds.

Monitor a single metric against a static threshold

The simplest alarms track a single AWS metric, such as CPU utilization or request counts, against a fixed threshold.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const foobar = new aws.cloudwatch.MetricAlarm("foobar", {
    name: "test-foobar5",
    comparisonOperator: "GreaterThanOrEqualToThreshold",
    evaluationPeriods: 2,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    period: 120,
    statistic: "Average",
    threshold: 80,
    alarmDescription: "This metric monitors ec2 cpu utilization",
    insufficientDataActions: [],
});
import pulumi
import pulumi_aws as aws

foobar = aws.cloudwatch.MetricAlarm("foobar",
    name="test-foobar5",
    comparison_operator="GreaterThanOrEqualToThreshold",
    evaluation_periods=2,
    metric_name="CPUUtilization",
    namespace="AWS/EC2",
    period=120,
    statistic="Average",
    threshold=80,
    alarm_description="This metric monitors ec2 cpu utilization",
    insufficient_data_actions=[])
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := cloudwatch.NewMetricAlarm(ctx, "foobar", &cloudwatch.MetricAlarmArgs{
			Name:                    pulumi.String("test-foobar5"),
			ComparisonOperator:      pulumi.String("GreaterThanOrEqualToThreshold"),
			EvaluationPeriods:       pulumi.Int(2),
			MetricName:              pulumi.String("CPUUtilization"),
			Namespace:               pulumi.String("AWS/EC2"),
			Period:                  pulumi.Int(120),
			Statistic:               pulumi.String("Average"),
			Threshold:               pulumi.Float64(80),
			AlarmDescription:        pulumi.String("This metric monitors ec2 cpu utilization"),
			InsufficientDataActions: pulumi.Array{},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var foobar = new Aws.CloudWatch.MetricAlarm("foobar", new()
    {
        Name = "test-foobar5",
        ComparisonOperator = "GreaterThanOrEqualToThreshold",
        EvaluationPeriods = 2,
        MetricName = "CPUUtilization",
        Namespace = "AWS/EC2",
        Period = 120,
        Statistic = "Average",
        Threshold = 80,
        AlarmDescription = "This metric monitors ec2 cpu utilization",
        InsufficientDataActions = new string[] {},
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var foobar = new MetricAlarm("foobar", MetricAlarmArgs.builder()
            .name("test-foobar5")
            .comparisonOperator("GreaterThanOrEqualToThreshold")
            .evaluationPeriods(2)
            .metricName("CPUUtilization")
            .namespace("AWS/EC2")
            .period(120)
            .statistic("Average")
            .threshold(80.0)
            .alarmDescription("This metric monitors ec2 cpu utilization")
            .insufficientDataActions()
            .build());

    }
}
resources:
  foobar:
    type: aws:cloudwatch:MetricAlarm
    properties:
      name: test-foobar5
      comparisonOperator: GreaterThanOrEqualToThreshold
      evaluationPeriods: 2
      metricName: CPUUtilization
      namespace: AWS/EC2
      period: 120
      statistic: Average
      threshold: 80
      alarmDescription: This metric monitors ec2 cpu utilization
      insufficientDataActions: []

When the metric breaches the threshold for enough consecutive periods, the alarm enters ALARM state. The comparisonOperator sets the direction of the comparison (for example, greater-than or less-than); evaluationPeriods controls how many periods must breach before the alarm triggers. Here, the alarm fires when average CPU is at or above 80% for two consecutive 2-minute periods. The metricName and namespace identify which AWS service metric to monitor; period (in seconds) and statistic define how CloudWatch aggregates the data points.
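The evaluation rule described above can be sketched in plain Python. This is a simplified model rather than CloudWatch's implementation: it assumes one aggregated datapoint per period, no missing data, and that every one of the most recent evaluationPeriods datapoints must breach. The function name breaches_threshold and the sample values are hypothetical.

```python
import operator

# Maps CloudWatch comparisonOperator names to Python comparisons.
COMPARISONS = {
    "GreaterThanOrEqualToThreshold": operator.ge,
    "GreaterThanThreshold": operator.gt,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def breaches_threshold(datapoints, comparison_operator, threshold, evaluation_periods):
    """Return True when the most recent `evaluation_periods` datapoints all breach.

    Assumption: `datapoints` holds one aggregated value per period, oldest first.
    """
    compare = COMPARISONS[comparison_operator]
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return False  # not enough periods observed yet
    return all(compare(value, threshold) for value in recent)
```

With the settings from the example (threshold 80, two evaluation periods), breaches_threshold([50, 80, 91], "GreaterThanOrEqualToThreshold", 80, 2) returns True, because a datapoint equal to the threshold counts as breaching under this operator.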

Trigger Auto Scaling capacity changes from alarm state

When alarms detect sustained load, they invoke scaling policies to adjust Auto Scaling group capacity.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const bat = new aws.autoscaling.Policy("bat", {
    name: "foobar3-test",
    scalingAdjustment: 4,
    adjustmentType: "ChangeInCapacity",
    cooldown: 300,
    autoscalingGroupName: bar.name,
});
const batMetricAlarm = new aws.cloudwatch.MetricAlarm("bat", {
    name: "test-foobar5",
    comparisonOperator: "GreaterThanOrEqualToThreshold",
    evaluationPeriods: 2,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    period: 120,
    statistic: "Average",
    threshold: 80,
    dimensions: {
        AutoScalingGroupName: bar.name,
    },
    alarmDescription: "This metric monitors ec2 cpu utilization",
    alarmActions: [bat.arn],
});
import pulumi
import pulumi_aws as aws

bat = aws.autoscaling.Policy("bat",
    name="foobar3-test",
    scaling_adjustment=4,
    adjustment_type="ChangeInCapacity",
    cooldown=300,
    autoscaling_group_name=bar["name"])
bat_metric_alarm = aws.cloudwatch.MetricAlarm("bat",
    name="test-foobar5",
    comparison_operator="GreaterThanOrEqualToThreshold",
    evaluation_periods=2,
    metric_name="CPUUtilization",
    namespace="AWS/EC2",
    period=120,
    statistic="Average",
    threshold=80,
    dimensions={
        "AutoScalingGroupName": bar["name"],
    },
    alarm_description="This metric monitors ec2 cpu utilization",
    alarm_actions=[bat.arn])
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/autoscaling"
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		bat, err := autoscaling.NewPolicy(ctx, "bat", &autoscaling.PolicyArgs{
			Name:                 pulumi.String("foobar3-test"),
			ScalingAdjustment:    pulumi.Int(4),
			AdjustmentType:       pulumi.String("ChangeInCapacity"),
			Cooldown:             pulumi.Int(300),
			AutoscalingGroupName: pulumi.Any(bar.Name),
		})
		if err != nil {
			return err
		}
		_, err = cloudwatch.NewMetricAlarm(ctx, "bat", &cloudwatch.MetricAlarmArgs{
			Name:               pulumi.String("test-foobar5"),
			ComparisonOperator: pulumi.String("GreaterThanOrEqualToThreshold"),
			EvaluationPeriods:  pulumi.Int(2),
			MetricName:         pulumi.String("CPUUtilization"),
			Namespace:          pulumi.String("AWS/EC2"),
			Period:             pulumi.Int(120),
			Statistic:          pulumi.String("Average"),
			Threshold:          pulumi.Float64(80),
			Dimensions: pulumi.StringMap{
				"AutoScalingGroupName": pulumi.Any(bar.Name),
			},
			AlarmDescription: pulumi.String("This metric monitors ec2 cpu utilization"),
			AlarmActions: pulumi.Array{
				bat.Arn,
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var bat = new Aws.AutoScaling.Policy("bat", new()
    {
        Name = "foobar3-test",
        ScalingAdjustment = 4,
        AdjustmentType = "ChangeInCapacity",
        Cooldown = 300,
        AutoscalingGroupName = bar.Name,
    });

    var batMetricAlarm = new Aws.CloudWatch.MetricAlarm("bat", new()
    {
        Name = "test-foobar5",
        ComparisonOperator = "GreaterThanOrEqualToThreshold",
        EvaluationPeriods = 2,
        MetricName = "CPUUtilization",
        Namespace = "AWS/EC2",
        Period = 120,
        Statistic = "Average",
        Threshold = 80,
        Dimensions = 
        {
            { "AutoScalingGroupName", bar.Name },
        },
        AlarmDescription = "This metric monitors ec2 cpu utilization",
        AlarmActions = new[]
        {
            bat.Arn,
        },
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.autoscaling.Policy;
import com.pulumi.aws.autoscaling.PolicyArgs;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var bat = new Policy("bat", PolicyArgs.builder()
            .name("foobar3-test")
            .scalingAdjustment(4)
            .adjustmentType("ChangeInCapacity")
            .cooldown(300)
            .autoscalingGroupName(bar.name())
            .build());

        var batMetricAlarm = new MetricAlarm("batMetricAlarm", MetricAlarmArgs.builder()
            .name("test-foobar5")
            .comparisonOperator("GreaterThanOrEqualToThreshold")
            .evaluationPeriods(2)
            .metricName("CPUUtilization")
            .namespace("AWS/EC2")
            .period(120)
            .statistic("Average")
            .threshold(80.0)
            .dimensions(Map.of("AutoScalingGroupName", bar.name()))
            .alarmDescription("This metric monitors ec2 cpu utilization")
            .alarmActions(bat.arn())
            .build());

    }
}
resources:
  bat:
    type: aws:autoscaling:Policy
    properties:
      name: foobar3-test
      scalingAdjustment: 4
      adjustmentType: ChangeInCapacity
      cooldown: 300
      autoscalingGroupName: ${bar.name}
  batMetricAlarm:
    type: aws:cloudwatch:MetricAlarm
    name: bat
    properties:
      name: test-foobar5
      comparisonOperator: GreaterThanOrEqualToThreshold
      evaluationPeriods: 2
      metricName: CPUUtilization
      namespace: AWS/EC2
      period: 120
      statistic: Average
      threshold: 80
      dimensions:
        AutoScalingGroupName: ${bar.name}
      alarmDescription: This metric monitors ec2 cpu utilization
      alarmActions:
        - ${bat.arn}

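The alarmActions property lists ARNs to invoke when the alarm enters ALARM state. Here it points to an Auto Scaling policy, so the alarm triggers a capacity change: the policy adds four instances (scalingAdjustment of 4 with the ChangeInCapacity adjustment type) and applies a 300-second cooldown. The dimensions property scopes the metric to a specific Auto Scaling group; the examples assume that bar is an Auto Scaling group defined elsewhere in your program. High CPU across the group triggers scale-out, distributing load across more instances.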
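The arithmetic behind a ChangeInCapacity adjustment can be sketched as follows. This is a minimal illustration, assuming the group's minimum and maximum sizes bound the result; min_size, max_size, and the function name are hypothetical and do not appear in the example above.

```python
def apply_change_in_capacity(current_capacity, scaling_adjustment, min_size, max_size):
    """Return the capacity after a ChangeInCapacity adjustment.

    Assumption: the result is clamped to the group's [min_size, max_size] range.
    """
    desired = current_capacity + scaling_adjustment
    return max(min_size, min(desired, max_size))
```

For a group running 2 instances with limits of 1 and 10, apply_change_in_capacity(2, 4, 1, 10) gives 6; starting from 8 instances, the same adjustment is capped at 10.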

Calculate derived metrics with metric math expressions

Sometimes you need to monitor values that aren’t available as single metrics, like error rates derived from request and error counts.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const foobar = new aws.cloudwatch.MetricAlarm("foobar", {
    name: "test-foobar",
    comparisonOperator: "GreaterThanOrEqualToThreshold",
    evaluationPeriods: 2,
    threshold: 10,
    alarmDescription: "Request error rate has exceeded 10%",
    insufficientDataActions: [],
    metricQueries: [
        {
            id: "e1",
            expression: "m2/m1*100",
            label: "Error Rate",
            returnData: true,
        },
        {
            id: "m1",
            metric: {
                metricName: "RequestCount",
                namespace: "AWS/ApplicationELB",
                period: 120,
                stat: "Sum",
                unit: "Count",
                dimensions: {
                    LoadBalancer: "app/web",
                },
            },
        },
        {
            id: "m2",
            metric: {
                metricName: "HTTPCode_ELB_5XX_Count",
                namespace: "AWS/ApplicationELB",
                period: 120,
                stat: "Sum",
                unit: "Count",
                dimensions: {
                    LoadBalancer: "app/web",
                },
            },
        },
    ],
});
import pulumi
import pulumi_aws as aws

foobar = aws.cloudwatch.MetricAlarm("foobar",
    name="test-foobar",
    comparison_operator="GreaterThanOrEqualToThreshold",
    evaluation_periods=2,
    threshold=10,
    alarm_description="Request error rate has exceeded 10%",
    insufficient_data_actions=[],
    metric_queries=[
        {
            "id": "e1",
            "expression": "m2/m1*100",
            "label": "Error Rate",
            "return_data": True,
        },
        {
            "id": "m1",
            "metric": {
                "metric_name": "RequestCount",
                "namespace": "AWS/ApplicationELB",
                "period": 120,
                "stat": "Sum",
                "unit": "Count",
                "dimensions": {
                    "LoadBalancer": "app/web",
                },
            },
        },
        {
            "id": "m2",
            "metric": {
                "metric_name": "HTTPCode_ELB_5XX_Count",
                "namespace": "AWS/ApplicationELB",
                "period": 120,
                "stat": "Sum",
                "unit": "Count",
                "dimensions": {
                    "LoadBalancer": "app/web",
                },
            },
        },
    ])
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := cloudwatch.NewMetricAlarm(ctx, "foobar", &cloudwatch.MetricAlarmArgs{
			Name:                    pulumi.String("test-foobar"),
			ComparisonOperator:      pulumi.String("GreaterThanOrEqualToThreshold"),
			EvaluationPeriods:       pulumi.Int(2),
			Threshold:               pulumi.Float64(10),
			AlarmDescription:        pulumi.String("Request error rate has exceeded 10%"),
			InsufficientDataActions: pulumi.Array{},
			MetricQueries: cloudwatch.MetricAlarmMetricQueryArray{
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id:         pulumi.String("e1"),
					Expression: pulumi.String("m2/m1*100"),
					Label:      pulumi.String("Error Rate"),
					ReturnData: pulumi.Bool(true),
				},
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id: pulumi.String("m1"),
					Metric: &cloudwatch.MetricAlarmMetricQueryMetricArgs{
						MetricName: pulumi.String("RequestCount"),
						Namespace:  pulumi.String("AWS/ApplicationELB"),
						Period:     pulumi.Int(120),
						Stat:       pulumi.String("Sum"),
						Unit:       pulumi.String("Count"),
						Dimensions: pulumi.StringMap{
							"LoadBalancer": pulumi.String("app/web"),
						},
					},
				},
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id: pulumi.String("m2"),
					Metric: &cloudwatch.MetricAlarmMetricQueryMetricArgs{
						MetricName: pulumi.String("HTTPCode_ELB_5XX_Count"),
						Namespace:  pulumi.String("AWS/ApplicationELB"),
						Period:     pulumi.Int(120),
						Stat:       pulumi.String("Sum"),
						Unit:       pulumi.String("Count"),
						Dimensions: pulumi.StringMap{
							"LoadBalancer": pulumi.String("app/web"),
						},
					},
				},
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var foobar = new Aws.CloudWatch.MetricAlarm("foobar", new()
    {
        Name = "test-foobar",
        ComparisonOperator = "GreaterThanOrEqualToThreshold",
        EvaluationPeriods = 2,
        Threshold = 10,
        AlarmDescription = "Request error rate has exceeded 10%",
        InsufficientDataActions = new string[] {},
        MetricQueries = new[]
        {
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "e1",
                Expression = "m2/m1*100",
                Label = "Error Rate",
                ReturnData = true,
            },
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "m1",
                Metric = new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryMetricArgs
                {
                    MetricName = "RequestCount",
                    Namespace = "AWS/ApplicationELB",
                    Period = 120,
                    Stat = "Sum",
                    Unit = "Count",
                    Dimensions = 
                    {
                        { "LoadBalancer", "app/web" },
                    },
                },
            },
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "m2",
                Metric = new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryMetricArgs
                {
                    MetricName = "HTTPCode_ELB_5XX_Count",
                    Namespace = "AWS/ApplicationELB",
                    Period = 120,
                    Stat = "Sum",
                    Unit = "Count",
                    Dimensions = 
                    {
                        { "LoadBalancer", "app/web" },
                    },
                },
            },
        },
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import com.pulumi.aws.cloudwatch.inputs.MetricAlarmMetricQueryArgs;
import com.pulumi.aws.cloudwatch.inputs.MetricAlarmMetricQueryMetricArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var foobar = new MetricAlarm("foobar", MetricAlarmArgs.builder()
            .name("test-foobar")
            .comparisonOperator("GreaterThanOrEqualToThreshold")
            .evaluationPeriods(2)
            .threshold(10.0)
            .alarmDescription("Request error rate has exceeded 10%")
            .insufficientDataActions()
            .metricQueries(            
                MetricAlarmMetricQueryArgs.builder()
                    .id("e1")
                    .expression("m2/m1*100")
                    .label("Error Rate")
                    .returnData(true)
                    .build(),
                MetricAlarmMetricQueryArgs.builder()
                    .id("m1")
                    .metric(MetricAlarmMetricQueryMetricArgs.builder()
                        .metricName("RequestCount")
                        .namespace("AWS/ApplicationELB")
                        .period(120)
                        .stat("Sum")
                        .unit("Count")
                        .dimensions(Map.of("LoadBalancer", "app/web"))
                        .build())
                    .build(),
                MetricAlarmMetricQueryArgs.builder()
                    .id("m2")
                    .metric(MetricAlarmMetricQueryMetricArgs.builder()
                        .metricName("HTTPCode_ELB_5XX_Count")
                        .namespace("AWS/ApplicationELB")
                        .period(120)
                        .stat("Sum")
                        .unit("Count")
                        .dimensions(Map.of("LoadBalancer", "app/web"))
                        .build())
                    .build())
            .build());

    }
}
resources:
  foobar:
    type: aws:cloudwatch:MetricAlarm
    properties:
      name: test-foobar
      comparisonOperator: GreaterThanOrEqualToThreshold
      evaluationPeriods: 2
      threshold: 10
      alarmDescription: Request error rate has exceeded 10%
      insufficientDataActions: []
      metricQueries:
        - id: e1
          expression: m2/m1*100
          label: Error Rate
          returnData: true
        - id: m1
          metric:
            metricName: RequestCount
            namespace: AWS/ApplicationELB
            period: 120
            stat: Sum
            unit: Count
            dimensions:
              LoadBalancer: app/web
        - id: m2
          metric:
            metricName: HTTPCode_ELB_5XX_Count
            namespace: AWS/ApplicationELB
            period: 120
            stat: Sum
            unit: Count
            dimensions:
              LoadBalancer: app/web

Metric math expressions combine multiple metrics into derived values. The metricQueries array defines both the input metrics and the expressions that reference them by id. Set returnData to true for the expression you alarm on; leave it unset or false for intermediate inputs such as m1 and m2. The expression “m2/m1*100” calculates the error rate as a percentage; the alarm fires when this derived value is at or above the threshold of 10 for two consecutive periods.
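To make the expression concrete, the per-period calculation can be reproduced by hand. The sketch below mirrors m2/m1*100 for a single period; the function name and sample counts are hypothetical, and returning None for a period with no requests is a choice made for this sketch, not a statement about how CloudWatch handles division by zero.

```python
def error_rate_percent(request_count, error_count):
    """Mirror the metric math expression m2/m1*100 for one period.

    request_count stands in for m1 (RequestCount, Sum) and
    error_count for m2 (HTTPCode_ELB_5XX_Count, Sum).
    """
    if request_count == 0:
        return None  # sketch-only convention for "no requests this period"
    return error_count / request_count * 100


def breaches(rate, threshold=10):
    """Apply GreaterThanOrEqualToThreshold to one derived datapoint."""
    return rate is not None and rate >= threshold
```

A period with 200 requests and 50 errors yields 25.0, which breaches the threshold of 10; a period with 400 requests and 25 errors yields 6.25, which does not.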

Detect anomalies using machine learning bands

Anomaly detection uses CloudWatch’s machine learning models to establish normal metric behavior, then alerts when values deviate from the expected pattern instead of comparing them to a fixed threshold.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const xxAnomalyDetection = new aws.cloudwatch.MetricAlarm("xx_anomaly_detection", {
    name: "test-foobar",
    comparisonOperator: "GreaterThanUpperThreshold",
    evaluationPeriods: 2,
    thresholdMetricId: "e1",
    alarmDescription: "This metric monitors ec2 cpu utilization",
    insufficientDataActions: [],
    metricQueries: [
        {
            id: "e1",
            returnData: true,
            expression: "ANOMALY_DETECTION_BAND(m1)",
            label: "CPUUtilization (Expected)",
        },
        {
            id: "m1",
            returnData: true,
            metric: {
                metricName: "CPUUtilization",
                namespace: "AWS/EC2",
                period: 120,
                stat: "Average",
                unit: "Percent",
                dimensions: {
                    InstanceId: "i-abc123",
                },
            },
        },
    ],
});
import pulumi
import pulumi_aws as aws

xx_anomaly_detection = aws.cloudwatch.MetricAlarm("xx_anomaly_detection",
    name="test-foobar",
    comparison_operator="GreaterThanUpperThreshold",
    evaluation_periods=2,
    threshold_metric_id="e1",
    alarm_description="This metric monitors ec2 cpu utilization",
    insufficient_data_actions=[],
    metric_queries=[
        {
            "id": "e1",
            "return_data": True,
            "expression": "ANOMALY_DETECTION_BAND(m1)",
            "label": "CPUUtilization (Expected)",
        },
        {
            "id": "m1",
            "return_data": True,
            "metric": {
                "metric_name": "CPUUtilization",
                "namespace": "AWS/EC2",
                "period": 120,
                "stat": "Average",
                "unit": "Percent",
                "dimensions": {
                    "InstanceId": "i-abc123",
                },
            },
        },
    ])
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := cloudwatch.NewMetricAlarm(ctx, "xx_anomaly_detection", &cloudwatch.MetricAlarmArgs{
			Name:                    pulumi.String("test-foobar"),
			ComparisonOperator:      pulumi.String("GreaterThanUpperThreshold"),
			EvaluationPeriods:       pulumi.Int(2),
			ThresholdMetricId:       pulumi.String("e1"),
			AlarmDescription:        pulumi.String("This metric monitors ec2 cpu utilization"),
			InsufficientDataActions: pulumi.Array{},
			MetricQueries: cloudwatch.MetricAlarmMetricQueryArray{
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id:         pulumi.String("e1"),
					ReturnData: pulumi.Bool(true),
					Expression: pulumi.String("ANOMALY_DETECTION_BAND(m1)"),
					Label:      pulumi.String("CPUUtilization (Expected)"),
				},
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id:         pulumi.String("m1"),
					ReturnData: pulumi.Bool(true),
					Metric: &cloudwatch.MetricAlarmMetricQueryMetricArgs{
						MetricName: pulumi.String("CPUUtilization"),
						Namespace:  pulumi.String("AWS/EC2"),
						Period:     pulumi.Int(120),
						Stat:       pulumi.String("Average"),
						Unit:       pulumi.String("Percent"),
						Dimensions: pulumi.StringMap{
							"InstanceId": pulumi.String("i-abc123"),
						},
					},
				},
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var xxAnomalyDetection = new Aws.CloudWatch.MetricAlarm("xx_anomaly_detection", new()
    {
        Name = "test-foobar",
        ComparisonOperator = "GreaterThanUpperThreshold",
        EvaluationPeriods = 2,
        ThresholdMetricId = "e1",
        AlarmDescription = "This metric monitors ec2 cpu utilization",
        InsufficientDataActions = new string[] {},
        MetricQueries = new[]
        {
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "e1",
                ReturnData = true,
                Expression = "ANOMALY_DETECTION_BAND(m1)",
                Label = "CPUUtilization (Expected)",
            },
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "m1",
                ReturnData = true,
                Metric = new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryMetricArgs
                {
                    MetricName = "CPUUtilization",
                    Namespace = "AWS/EC2",
                    Period = 120,
                    Stat = "Average",
                    Unit = "Percent",
                    Dimensions = 
                    {
                        { "InstanceId", "i-abc123" },
                    },
                },
            },
        },
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import com.pulumi.aws.cloudwatch.inputs.MetricAlarmMetricQueryArgs;
import com.pulumi.aws.cloudwatch.inputs.MetricAlarmMetricQueryMetricArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var xxAnomalyDetection = new MetricAlarm("xxAnomalyDetection", MetricAlarmArgs.builder()
            .name("test-foobar")
            .comparisonOperator("GreaterThanUpperThreshold")
            .evaluationPeriods(2)
            .thresholdMetricId("e1")
            .alarmDescription("This metric monitors ec2 cpu utilization")
            .insufficientDataActions()
            .metricQueries(            
                MetricAlarmMetricQueryArgs.builder()
                    .id("e1")
                    .returnData(true)
                    .expression("ANOMALY_DETECTION_BAND(m1)")
                    .label("CPUUtilization (Expected)")
                    .build(),
                MetricAlarmMetricQueryArgs.builder()
                    .id("m1")
                    .returnData(true)
                    .metric(MetricAlarmMetricQueryMetricArgs.builder()
                        .metricName("CPUUtilization")
                        .namespace("AWS/EC2")
                        .period(120)
                        .stat("Average")
                        .unit("Percent")
                        .dimensions(Map.of("InstanceId", "i-abc123"))
                        .build())
                    .build())
            .build());

    }
}
resources:
  xxAnomalyDetection:
    type: aws:cloudwatch:MetricAlarm
    name: xx_anomaly_detection
    properties:
      name: test-foobar
      comparisonOperator: GreaterThanUpperThreshold
      evaluationPeriods: 2
      thresholdMetricId: e1
      alarmDescription: This metric monitors ec2 cpu utilization
      insufficientDataActions: []
      metricQueries:
        - id: e1
          returnData: true
          expression: ANOMALY_DETECTION_BAND(m1)
          label: CPUUtilization (Expected)
        - id: m1
          returnData: true
          metric:
            metricName: CPUUtilization
            namespace: AWS/EC2
            period: 120
            stat: Average
            unit: Percent
            dimensions:
              InstanceId: i-abc123

The ANOMALY_DETECTION_BAND expression generates upper and lower bounds based on the metric's historical data. The thresholdMetricId property points the alarm at the band expression instead of a static threshold, and comparisonOperator specifies which boundary to check (GreaterThanUpperThreshold detects spikes above the band). Both queries set returnData to true, so the alarm receives the metric and the band it is compared against. CloudWatch adjusts the band over time as usage patterns change.
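The boundary check itself is simple once a band exists. The sketch below takes the band as given inputs, since CloudWatch's model for computing it is not reproduced here; the function name and the sample band values are hypothetical, while the three operator names are the ones CloudWatch accepts for anomaly detection alarms.

```python
def outside_band(value, lower, upper, comparison_operator):
    """Return True when `value` breaches the band under the given operator.

    `lower` and `upper` stand in for the bounds produced by
    ANOMALY_DETECTION_BAND for the same period (assumed inputs).
    """
    if comparison_operator == "GreaterThanUpperThreshold":
        return value > upper
    if comparison_operator == "LessThanLowerThreshold":
        return value < lower
    if comparison_operator == "LessThanLowerOrGreaterThanUpperThreshold":
        return value < lower or value > upper
    raise ValueError(f"unsupported operator: {comparison_operator}")
```

With a band of 20 to 90, a CPU reading of 95 breaches under GreaterThanUpperThreshold, while a reading of 10 breaches only under LessThanLowerThreshold or the two-sided operator.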

Query metrics with SQL-style Metrics Insights syntax

Metrics Insights provides SQL-style queries to aggregate, filter, and sort metrics across multiple resources.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.cloudwatch.MetricAlarm("example", {
    name: "example-alarm",
    alarmDescription: "Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold",
    comparisonOperator: "GreaterThanThreshold",
    evaluationPeriods: 1,
    threshold: 0.6,
    treatMissingData: "notBreaching",
    metricQueries: [{
        id: "q1",
        expression: `SELECT
  MAX(DBLoadRelativeToNumVCPUs)
FROM SCHEMA("AWS/RDS", DBInstanceIdentifier)
WHERE DBInstanceIdentifier != 'example-rds-instance'
GROUP BY DBInstanceIdentifier
ORDER BY MIN() ASC
LIMIT 1
`,
        period: 60,
        returnData: true,
        label: "Max DB Load of the Least-Loaded RDS Instance",
    }],
});
import pulumi
import pulumi_aws as aws

example = aws.cloudwatch.MetricAlarm("example",
    name="example-alarm",
    alarm_description="Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold",
    comparison_operator="GreaterThanThreshold",
    evaluation_periods=1,
    threshold=0.6,
    treat_missing_data="notBreaching",
    metric_queries=[{
        "id": "q1",
        "expression": """SELECT
  MAX(DBLoadRelativeToNumVCPUs)
FROM SCHEMA(\"AWS/RDS\", DBInstanceIdentifier)
WHERE DBInstanceIdentifier != 'example-rds-instance'
GROUP BY DBInstanceIdentifier
ORDER BY MIN() ASC
LIMIT 1
""",
        "period": 60,
        "return_data": True,
        "label": "Max DB Load of the Least-Loaded RDS Instance",
    }])
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := cloudwatch.NewMetricAlarm(ctx, "example", &cloudwatch.MetricAlarmArgs{
			Name:               pulumi.String("example-alarm"),
			AlarmDescription:   pulumi.String("Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold"),
			ComparisonOperator: pulumi.String("GreaterThanThreshold"),
			EvaluationPeriods:  pulumi.Int(1),
			Threshold:          pulumi.Float64(0.6),
			TreatMissingData:   pulumi.String("notBreaching"),
			MetricQueries: cloudwatch.MetricAlarmMetricQueryArray{
				&cloudwatch.MetricAlarmMetricQueryArgs{
					Id: pulumi.String("q1"),
					Expression: pulumi.String(`SELECT
  MAX(DBLoadRelativeToNumVCPUs)
FROM SCHEMA("AWS/RDS", DBInstanceIdentifier)
WHERE DBInstanceIdentifier != 'example-rds-instance'
GROUP BY DBInstanceIdentifier
ORDER BY MIN() ASC
LIMIT 1
`),
					Period:     pulumi.Int(60),
					ReturnData: pulumi.Bool(true),
					Label:      pulumi.String("Max DB Load of the Least-Loaded RDS Instance"),
				},
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var example = new Aws.CloudWatch.MetricAlarm("example", new()
    {
        Name = "example-alarm",
        AlarmDescription = "Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold",
        ComparisonOperator = "GreaterThanThreshold",
        EvaluationPeriods = 1,
        Threshold = 0.6,
        TreatMissingData = "notBreaching",
        MetricQueries = new[]
        {
            new Aws.CloudWatch.Inputs.MetricAlarmMetricQueryArgs
            {
                Id = "q1",
                Expression = @"SELECT
  MAX(DBLoadRelativeToNumVCPUs)
FROM SCHEMA(""AWS/RDS"", DBInstanceIdentifier)
WHERE DBInstanceIdentifier != 'example-rds-instance'
GROUP BY DBInstanceIdentifier
ORDER BY MIN() ASC
LIMIT 1
",
                Period = 60,
                ReturnData = true,
                Label = "Max DB Load of the Least-Loaded RDS Instance",
            },
        },
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import com.pulumi.aws.cloudwatch.inputs.MetricAlarmMetricQueryArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var example = new MetricAlarm("example", MetricAlarmArgs.builder()
            .name("example-alarm")
            .alarmDescription("Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold")
            .comparisonOperator("GreaterThanThreshold")
            .evaluationPeriods(1)
            .threshold(0.6)
            .treatMissingData("notBreaching")
            .metricQueries(MetricAlarmMetricQueryArgs.builder()
                .id("q1")
                .expression("""
SELECT
  MAX(DBLoadRelativeToNumVCPUs)
FROM SCHEMA(\"AWS/RDS\", DBInstanceIdentifier)
WHERE DBInstanceIdentifier != 'example-rds-instance'
GROUP BY DBInstanceIdentifier
ORDER BY MIN() ASC
LIMIT 1
                """)
                .period(60)
                .returnData(true)
                .label("Max DB Load of the Least-Loaded RDS Instance")
                .build())
            .build());

    }
}
resources:
  example:
    type: aws:cloudwatch:MetricAlarm
    properties:
      name: example-alarm
      alarmDescription: Triggers if the smallest per-instance maximum load during the evaluation period exceeds the threshold
      comparisonOperator: GreaterThanThreshold
      evaluationPeriods: 1
      threshold: 0.6
      treatMissingData: notBreaching
      metricQueries:
        - id: q1
          expression: |
            SELECT
              MAX(DBLoadRelativeToNumVCPUs)
            FROM SCHEMA("AWS/RDS", DBInstanceIdentifier)
            WHERE DBInstanceIdentifier != 'example-rds-instance'
            GROUP BY DBInstanceIdentifier
            ORDER BY MIN() ASC
            LIMIT 1
          period: 60
          returnData: true
          label: Max DB Load of the Least-Loaded RDS Instance

The expression property contains the SQL-style query, built from SELECT, FROM SCHEMA, WHERE, GROUP BY, ORDER BY, and LIMIT clauses. This query computes MAX(DBLoadRelativeToNumVCPUs) separately for each RDS instance (GROUP BY DBInstanceIdentifier), excludes one instance by name, sorts the per-instance results by their minimum value in ascending order, and keeps only the first with LIMIT 1, so the alarm evaluates the least-loaded instance. Setting treatMissingData to notBreaching treats periods in which the query returns no data as within the threshold.
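If the query text is assembled in code rather than written inline, a small builder keeps the quoting in one place. This is a sketch under stated assumptions: build_insights_query is a hypothetical helper, it only formats a string, and it performs no validation or escaping of its inputs, so it should only be fed trusted values.

```python
def build_insights_query(namespace, metric, dimension, excluded_value, aggregate="MAX"):
    """Format a Metrics Insights query like the one in the example above.

    The namespace is wrapped in plain double quotes and the excluded
    dimension value in single quotes. No escaping is applied.
    """
    return (
        f"SELECT {aggregate}({metric})\n"
        f'FROM SCHEMA("{namespace}", {dimension})\n'
        f"WHERE {dimension} != '{excluded_value}'\n"
        f"GROUP BY {dimension}\n"
        "ORDER BY MIN() ASC\n"
        "LIMIT 1"
    )


query = build_insights_query(
    "AWS/RDS", "DBLoadRelativeToNumVCPUs", "DBInstanceIdentifier", "example-rds-instance"
)
print(query)
```

The resulting string would be assigned to the expression field of the metric query.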

Track load balancer target health across groups

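Load balancer health monitoring tracks healthy target counts, enabling alerts when capacity drops below operational thresholds. The example below assumes that logstashServersCount (the expected number of targets), the sns topic, the lb load balancer, and the lb_tg target group are defined elsewhere in your program.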

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const nlbHealthyhosts = new aws.cloudwatch.MetricAlarm("nlb_healthyhosts", {
    name: "alarmname",
    comparisonOperator: "LessThanThreshold",
    evaluationPeriods: 1,
    metricName: "HealthyHostCount",
    namespace: "AWS/NetworkELB",
    period: 60,
    statistic: "Average",
    threshold: logstashServersCount,
    alarmDescription: "Number of healthy nodes in Target Group",
    actionsEnabled: true,
    alarmActions: [sns.arn],
    okActions: [sns.arn],
    dimensions: {
        TargetGroup: lb_tg.arnSuffix,
        LoadBalancer: lb.arnSuffix,
    },
});
import pulumi
import pulumi_aws as aws

nlb_healthyhosts = aws.cloudwatch.MetricAlarm("nlb_healthyhosts",
    name="alarmname",
    comparison_operator="LessThanThreshold",
    evaluation_periods=1,
    metric_name="HealthyHostCount",
    namespace="AWS/NetworkELB",
    period=60,
    statistic="Average",
    threshold=logstash_servers_count,
    alarm_description="Number of healthy nodes in Target Group",
    actions_enabled=True,
    alarm_actions=[sns["arn"]],
    ok_actions=[sns["arn"]],
    dimensions={
        "TargetGroup": lb_tg["arnSuffix"],
        "LoadBalancer": lb["arnSuffix"],
    })
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/cloudwatch"
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/sns"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := cloudwatch.NewMetricAlarm(ctx, "nlb_healthyhosts", &cloudwatch.MetricAlarmArgs{
			Name:               pulumi.String("alarmname"),
			ComparisonOperator: pulumi.String("LessThanThreshold"),
			EvaluationPeriods:  pulumi.Int(1),
			MetricName:         pulumi.String("HealthyHostCount"),
			Namespace:          pulumi.String("AWS/NetworkELB"),
			Period:             pulumi.Int(60),
			Statistic:          pulumi.String("Average"),
			Threshold:          pulumi.Any(logstashServersCount),
			AlarmDescription:   pulumi.String("Number of healthy nodes in Target Group"),
			ActionsEnabled:     pulumi.Bool(true),
			AlarmActions: pulumi.Array{
				sns.Arn,
			},
			OkActions: pulumi.Array{
				sns.Arn,
			},
			Dimensions: pulumi.StringMap{
				"TargetGroup":  pulumi.Any(lb_tg.ArnSuffix),
				"LoadBalancer": pulumi.Any(lb.ArnSuffix),
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var nlbHealthyhosts = new Aws.CloudWatch.MetricAlarm("nlb_healthyhosts", new()
    {
        Name = "alarmname",
        ComparisonOperator = "LessThanThreshold",
        EvaluationPeriods = 1,
        MetricName = "HealthyHostCount",
        Namespace = "AWS/NetworkELB",
        Period = 60,
        Statistic = "Average",
        Threshold = logstashServersCount,
        AlarmDescription = "Number of healthy nodes in Target Group",
        ActionsEnabled = true,
        AlarmActions = new[]
        {
            sns.Arn,
        },
        OkActions = new[]
        {
            sns.Arn,
        },
        Dimensions = 
        {
            { "TargetGroup", lb_tg.ArnSuffix },
            { "LoadBalancer", lb.ArnSuffix },
        },
    });

});
package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.cloudwatch.MetricAlarm;
import com.pulumi.aws.cloudwatch.MetricAlarmArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var nlbHealthyhosts = new MetricAlarm("nlb_healthyhosts", MetricAlarmArgs.builder()
            .name("alarmname")
            .comparisonOperator("LessThanThreshold")
            .evaluationPeriods(1)
            .metricName("HealthyHostCount")
            .namespace("AWS/NetworkELB")
            .period(60)
            .statistic("Average")
            .threshold(logstashServersCount)
            .alarmDescription("Number of healthy nodes in Target Group")
            .actionsEnabled(true)
            .alarmActions(sns.arn())
            .okActions(sns.arn())
            .dimensions(Map.ofEntries(
                Map.entry("TargetGroup", lb_tg.arnSuffix()),
                Map.entry("LoadBalancer", lb.arnSuffix())
            ))
            .build());

    }
}
resources:
  nlbHealthyhosts:
    type: aws:cloudwatch:MetricAlarm
    name: nlb_healthyhosts
    properties:
      name: alarmname
      comparisonOperator: LessThanThreshold
      evaluationPeriods: 1
      metricName: HealthyHostCount
      namespace: AWS/NetworkELB
      period: 60
      statistic: Average
      threshold: ${logstashServersCount}
      alarmDescription: Number of healthy nodes in Target Group
      actionsEnabled: true
      alarmActions:
        - ${sns.arn}
      okActions:
        - ${sns.arn}
      dimensions:
        TargetGroup: ${["lb-tg"].arnSuffix}
        LoadBalancer: ${lb.arnSuffix}

The dimensions property scopes the HealthyHostCount metric to a specific target group and load balancer using their arnSuffix values. With LessThanThreshold, the alarm enters the ALARM state and runs alarmActions when the average healthy host count falls below the threshold (here, the expected number of servers); okActions runs when the count recovers and the alarm returns to OK. Because both lists point at the same SNS topic, subscribers are notified of the failure and of the recovery.
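To make the dimension values concrete, the sketch below derives them from full ARNs in plain Python. It assumes the usual Elastic Load Balancing ARN layout, where the resource part is loadbalancer/net/NAME/ID for a Network Load Balancer and targetgroup/NAME/ID for a target group; the helper name and the sample ARNs are hypothetical. In a Pulumi program the arnSuffix output already provides these values, so this is only an illustration of what they contain.

```python
def arn_suffix(arn):
    """Return the part of an ELBv2 ARN used as a CloudWatch dimension value."""
    # An ARN has six colon-separated fields; the last one is the resource part.
    resource = arn.split(":", 5)[5]
    prefix = "loadbalancer/"
    if resource.startswith(prefix):
        # Load balancers drop the "loadbalancer/" prefix: net/NAME/ID
        return resource[len(prefix):]
    # Target groups keep the whole resource part: targetgroup/NAME/ID
    return resource


# Sample ARNs for illustration only.
lb_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/my-nlb/50dc6c495c0c9188"
tg_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/73e2d6bc24d8a067"

dimensions = {
    "LoadBalancer": arn_suffix(lb_arn),
    "TargetGroup": arn_suffix(tg_arn),
}
print(dimensions)
```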

Beyond These Examples

These snippets focus on specific alarm-level features: static threshold and anomaly detection evaluation, metric math and Metrics Insights queries, and Auto Scaling and SNS notification integration. They’re intentionally minimal rather than full monitoring solutions.

The examples may reference pre-existing infrastructure such as EC2 instances, Auto Scaling groups, load balancers, Auto Scaling policies, and SNS topics for notifications. They focus on configuring the alarm rather than provisioning everything around it.

To keep things focused, common alarm patterns are omitted, including:

  • Missing data strategies beyond notBreaching (the other treatMissingData values)
  • Flapping prevention (datapointsToAlarm)
  • Percentile-based alarms (extendedStatistic, evaluateLowSampleCountPercentiles)
  • Composite alarms for multi-condition logic
  • Unit specification for metric normalization

These omissions are intentional: the goal is to illustrate how each alarm feature is wired, not provide drop-in monitoring modules. See the CloudWatch Metric Alarm resource reference for all available configuration options.

Frequently Asked Questions

Configuration Errors & Mutual Exclusions
What's the difference between statistic and extendedStatistic?
You must choose one or the other; they cannot be used together. Use statistic for standard statistics (SampleCount, Average, Sum, Minimum, Maximum) or extendedStatistic for percentiles (p0.0 to p100).
What happens if I specify both metricQueries and metricName?
This configuration is invalid. When using metricQueries, you cannot also specify the top-level metricName, namespace, period, or statistic properties. Use either metricQueries for complex expressions or the basic metric properties for simple alarms.
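The two exclusions above can be checked before deployment with a few lines of plain Python. This is a sketch, not the provider's own validation: find_conflicts is a hypothetical helper, it takes the alarm arguments as a dictionary keyed by the Python (snake_case) property names, and it only covers the rules stated in these two answers.

```python
BASIC_METRIC_KEYS = ("metric_name", "namespace", "period", "statistic")


def find_conflicts(args):
    """Return human-readable problems for mutually exclusive alarm settings."""
    problems = []
    # Rule 1: statistic and extended_statistic cannot be used together.
    if args.get("statistic") is not None and args.get("extended_statistic") is not None:
        problems.append("statistic and extended_statistic are mutually exclusive")
    # Rule 2: metric_queries replaces the basic single-metric properties.
    if args.get("metric_queries"):
        clashing = [key for key in BASIC_METRIC_KEYS if args.get(key) is not None]
        if clashing:
            problems.append("metric_queries cannot be combined with: " + ", ".join(clashing))
    return problems


print(find_conflicts({"metric_queries": [{"id": "e1"}], "metric_name": "CPUUtilization"}))
```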
Metric Configuration
When should I use metricQueries instead of basic metric properties?
Use metricQueries when you need metric math expressions (like calculating error rates), anomaly detection, Metrics Insights SQL queries, or combining multiple metrics. Use basic properties (metricName, namespace, period, statistic) for simple single-metric alarms.
How do I create an alarm using metric math expressions?
Use metricQueries with an expression property to calculate values from multiple metrics. For example, m2/m1*100 calculates an error rate percentage. Set returnData: true on the query that provides the alarm value.
How do I set up anomaly detection alarms?
Use metricQueries with ANOMALY_DETECTION_BAND(m1) as the expression, set thresholdMetricId to match the expression’s ID, and use anomaly-specific comparison operators like GreaterThanUpperThreshold or LessThanLowerThreshold.
Can I use SQL queries with CloudWatch alarms?
Yes, use Metrics Insights queries by setting a metricQueries entry with a SQL-like SELECT statement in the expression field. This allows complex metric aggregations like SELECT MAX(...) with WHERE clauses and GROUP BY.
Alarm Behavior & Thresholds
How does the alarm handle missing data points?
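Configure this with treatMissingData. Options are missing (the default; if every data point in the evaluation range is missing, the alarm moves to INSUFFICIENT_DATA), ignore (maintains the current alarm state), breaching (treats missing data as breaching the threshold), or notBreaching (treats missing data as within the threshold).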
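The four options can be pictured as a rule for what a single empty period counts as. The following is a deliberately simplified model in plain Python, assuming a GreaterThanThreshold alarm and using None for a period with no data; classify_datapoint is a hypothetical name, and the real evaluation in CloudWatch considers the whole evaluation range rather than one period at a time.

```python
def classify_datapoint(value, threshold, treat_missing_data="missing"):
    """Classify one period; value is None when no data was reported."""
    if value is not None:
        return "breaching" if value > threshold else "not_breaching"
    if treat_missing_data == "breaching":
        return "breaching"
    if treat_missing_data == "notBreaching":
        return "not_breaching"
    if treat_missing_data in ("missing", "ignore"):
        # "missing" leaves the gap as a gap; "ignore" keeps the current state.
        return treat_missing_data
    raise ValueError(f"unknown treat_missing_data value: {treat_missing_data}")


for option in ("missing", "ignore", "breaching", "notBreaching"):
    print(option, "->", classify_datapoint(None, 0.6, option))
```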
What are the valid period values for alarms?
Valid values are 10, 20, 30, or any multiple of 60 seconds.
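The rule above is easy to encode as a pre-deployment check. A minimal sketch in plain Python, with is_valid_period as a hypothetical helper name:

```python
def is_valid_period(seconds):
    """True for 10, 20, 30, or any positive multiple of 60."""
    return seconds in (10, 20, 30) or (seconds > 0 and seconds % 60 == 0)


for candidate in (10, 45, 60, 90, 120):
    print(candidate, is_valid_period(candidate))
```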
When do I use threshold vs thresholdMetricId?
Use threshold for alarms with static values. Use thresholdMetricId for anomaly detection alarms, where it should match the ID of the ANOMALY_DETECTION_BAND function in your metricQueries.
How do I control alarm behavior with low sample counts?
For percentile-based alarms, use evaluateLowSampleCountPercentiles. Set it to ignore to prevent state changes when there are too few data points to be statistically significant, or to evaluate (the default) to always evaluate the alarm regardless of sample count.
Actions & Integration
What's the difference between alarmActions, okActions, and insufficientDataActions?
These define actions for different state transitions: alarmActions executes when entering ALARM state, okActions when entering OK state, and insufficientDataActions when entering INSUFFICIENT_DATA state. Each accepts a list of ARNs (SNS topics, Auto Scaling policies, etc.).
How do I trigger Auto Scaling policies from alarms?
Include the Auto Scaling policy ARN in alarmActions. The policy will execute when the alarm transitions to ALARM state.
Can I disable alarm actions temporarily?
Yes, set actionsEnabled to false. This prevents all actions from executing during state changes. The default is true.
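The relationship between the three action lists and actionsEnabled can be summarized in a few lines of plain Python. This is an illustrative model only: actions_for_state is a hypothetical helper and the topic ARN is a made-up sample value.

```python
def actions_for_state(new_state, alarm_actions=(), ok_actions=(),
                      insufficient_data_actions=(), actions_enabled=True):
    """Return the ARNs that would run when the alarm enters new_state."""
    if not actions_enabled:
        # With actions disabled, state changes still happen but nothing runs.
        return []
    by_state = {
        "ALARM": alarm_actions,
        "OK": ok_actions,
        "INSUFFICIENT_DATA": insufficient_data_actions,
    }
    return list(by_state[new_state])


topic = "arn:aws:sns:us-east-1:123456789012:hypothetical-topic"  # sample ARN
print(actions_for_state("ALARM", alarm_actions=[topic], ok_actions=[topic]))
print(actions_for_state("ALARM", alarm_actions=[topic], actions_enabled=False))
```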
Naming & Immutability
Can I rename a CloudWatch alarm after creation?
No, the name property is immutable. Changing it will force resource replacement (deletion and recreation).
