Add production observability on GCP with Pulumi

Provision Cloud Monitoring dashboards, log collection, alerting, and email notification wiring for a production-ready observability baseline on GCP.

Download blueprint

Get this GCP blueprint project as a zip. Downloads are available for TypeScript, Python, and Go; each contains the matching Pulumi program, dependency files, and README.

This guide builds a small production observability baseline with Pulumi. It creates cloud-native dashboards, log or metric sources, alert rules, and email notification wiring for GCP without introducing another monitoring platform.

Use it when you already have a service and need a repeatable first layer of visibility: where errors are counted, where latency is visible, who gets notified, and what minimal trace hook every service should expose.

Architecture

  • Cloud Logging captures or derives service health signals.
  • Cloud Monitoring dashboards display the health view operators open first.
  • Cloud Monitoring alert policies raise error and latency alerts where the platform supports the metric directly.
  • Monitoring notification channels send alert notifications to the notificationEmail Pulumi config value.
  • Cloud Trace is wired through a minimal trace-ready hook so your application can emit traces without changing the infrastructure shape later.

GCP observability shape

This variant uses a Cloud Logging log-based metric, Cloud Monitoring dashboard, AlertPolicy, email notification channel, and Cloud Trace-ready environment output.
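
The link between the log-based metric and the alert policy or dashboard is a Cloud Monitoring filter string. As a minimal sketch, the helper below builds that filter from a metric name. The function name logMetricFilter and the optional resourceType parameter are illustrative assumptions, not part of the blueprint; "cloud_run_revision" is only an example monitored resource type.

```typescript
// Hypothetical helper (not part of the blueprint): builds the Cloud Monitoring
// filter that selects a user-defined log-based metric by name.
function logMetricFilter(metricName: string, resourceType?: string): string {
    const clauses = [`metric.type="logging.googleapis.com/user/${metricName}"`];
    if (resourceType) {
        // Optional restriction to one monitored resource type.
        clauses.push(`resource.type="${resourceType}"`);
    }
    return clauses.join(" AND ");
}

console.log(logMetricFilter("dev-production-observability-errors"));
console.log(logMetricFilter("dev-production-observability-errors", "cloud_run_revision"));
```

Cloud Monitoring may reject an alert condition filter that has no resource.type restriction, so passing the resource type of your workload is a reasonable first thing to try if the alert policy fails to deploy.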

Prerequisites

You need:

  • a Pulumi account and the Pulumi CLI
  • a Google Cloud project where you can create Cloud Monitoring, Cloud Logging, and notification channel resources
  • an email address or distribution list owned by your team for alert notifications
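  • the runtime for your blueprint language: Node.js with npm for TypeScript, Python 3 for Python, or Go 1.23 or newer for Go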

Download the blueprint

Use the Download blueprint button at the top of this page to grab the GCP zip for the language selected in the chooser. Each zip contains the files for its language.

TypeScript:

  • index.ts as the Pulumi entrypoint
  • components/observability.ts as the reusable component
  • package.json and tsconfig.json for the Pulumi project

Python:

  • __main__.py as the Pulumi entrypoint
  • components/observability.py as the reusable component
  • requirements.txt for the Pulumi project

Go:

  • main.go as the Pulumi entrypoint
  • observability/observability.go as the reusable component
  • go.mod for the Pulumi project

Unzip, change into the directory, and continue with the quickstart below.

Quickstart

Install dependencies, configure the alert recipient, and deploy.

TypeScript

# 1. Install Pulumi project dependencies
npm install

# 2. Initialize and configure the stack
pulumi stack init dev
pulumi config set gcp:project <your-gcp-project-id>
pulumi config set notificationEmail <team-email-address>

# 3. Deploy
pulumi up

Python

# 1. Install Pulumi project dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 2. Initialize and configure the stack
pulumi stack init dev
pulumi config set gcp:project <your-gcp-project-id>
pulumi config set notificationEmail <team-email-address>

# 3. Deploy
pulumi up

Go

# 1. Install Pulumi project dependencies
go mod tidy

# 2. Initialize and configure the stack
pulumi stack init dev
pulumi config set gcp:project <your-gcp-project-id>
pulumi config set notificationEmail <team-email-address>

# 3. Deploy
pulumi up

GCP may require the email notification channel to be verified before alerts can notify it.

What Pulumi creates

The stack provisions Cloud Monitoring dashboards, Cloud Logging resources, Cloud Monitoring alert policies, and Monitoring notification channels. The sample service hook is small: it exists only to show how traces and alert dimensions attach to a real workload boundary.

For production rollout, keep the component shape but replace the sample function or trace environment values with your real service, route names, and SLO thresholds.
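
The blueprint's log-based metric counts every entry matching severity>=ERROR in the project. One way to scope it to a real service is to build a narrower Cloud Logging filter and pass it as the metric's filter. The sketch below is an assumption about how you might do that: the ServiceLogScope type, the serviceErrorFilter function, and the example resource type and label values are all hypothetical.

```typescript
// Hypothetical description of the workload whose errors should be counted.
interface ServiceLogScope {
    resourceType: string;            // monitored resource type, e.g. "cloud_run_revision"
    labels?: Record<string, string>; // resource labels, e.g. { service_name: "checkout" }
    minSeverity?: string;            // defaults to "ERROR", matching the blueprint
}

// Builds a Cloud Logging filter limited to one resource type and its labels.
function serviceErrorFilter(scope: ServiceLogScope): string {
    const clauses = [`resource.type="${scope.resourceType}"`];
    for (const [key, value] of Object.entries(scope.labels ?? {})) {
        clauses.push(`resource.labels.${key}="${value}"`);
    }
    clauses.push(`severity>=${scope.minSeverity ?? "ERROR"}`);
    return clauses.join(" AND ");
}

// Example values are placeholders; substitute your own service.
console.log(serviceErrorFilter({
    resourceType: "cloud_run_revision",
    labels: { service_name: "checkout" },
}));
```

The returned string can replace the "severity>=ERROR" filter in the component, which keeps the rest of the component shape unchanged.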

Operate it

After pulumi up, use the stack outputs to find the resources operators need first.

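Use the dashboardId output to find the dashboard in Cloud Monitoring, and pass the environment values from the traceHook output into the service you want to trace. The notificationTarget output identifies the notification channel that receives alerts.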
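
The traceHook output is a single string of semicolon-separated key=value pairs, as the component code later on this page shows. A small parser like the one below could turn it into a map before you hand the values to a service runtime. This is a sketch: parseTraceHook is a hypothetical helper, and the sample project and policy ID are made-up placeholders.

```typescript
// Hypothetical helper: splits the traceHook output into key/value pairs.
function parseTraceHook(hook: string): Record<string, string> {
    const values: Record<string, string> = {};
    for (const part of hook.split(";")) {
        const trimmed = part.trim();
        const eq = trimmed.indexOf("=");
        if (eq <= 0) {
            continue; // skip empty or malformed segments
        }
        // Split on the first "=" only, so values may contain "=".
        values[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
    }
    return values;
}

// Placeholder value in the same shape the component emits.
const sampleHook = "GOOGLE_CLOUD_TRACE_ENABLED=true; alertPolicy=projects/example-project/alertPolicies/123";
const hookValues = parseTraceHook(sampleHook);
console.log(hookValues.GOOGLE_CLOUD_TRACE_ENABLED); // prints "true"
console.log(hookValues.alertPolicy);
```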

Start with the default thresholds, then adjust them after the service has enough traffic to show normal error and latency patterns. Keep notification recipients in Pulumi config so the starter never hardcodes personal addresses.
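
One way to make the thresholds adjustable is to resolve them from optional overrides, falling back to the blueprint's current values (a threshold of 0 and a 60s window). The sketch below is an assumption, not part of the blueprint: AlertThresholds, resolveThresholds, and toDuration are hypothetical names, and the multiple-of-60 check is a conservative rule chosen for this example.

```typescript
// Hypothetical threshold settings; defaults mirror the blueprint's alert policy.
interface AlertThresholds {
    errorCount: number;    // alert when error entries in a window exceed this value
    windowSeconds: number; // used for both the alignment period and the duration
}

const defaultThresholds: AlertThresholds = { errorCount: 0, windowSeconds: 60 };

// Merges optional overrides with the defaults and validates the result.
function resolveThresholds(overrides: Partial<AlertThresholds> = {}): AlertThresholds {
    const resolved: AlertThresholds = {
        errorCount: overrides.errorCount ?? defaultThresholds.errorCount,
        windowSeconds: overrides.windowSeconds ?? defaultThresholds.windowSeconds,
    };
    if (resolved.errorCount < 0) {
        throw new Error("errorCount must be zero or greater");
    }
    // Assumption for this sketch: keep windows at whole minutes.
    if (resolved.windowSeconds < 60 || resolved.windowSeconds % 60 !== 0) {
        throw new Error("windowSeconds must be a positive multiple of 60");
    }
    return resolved;
}

// Formats seconds the way the alert policy expects durations, e.g. "60s".
function toDuration(seconds: number): string {
    return `${seconds}s`;
}

const tuned = resolveThresholds({ errorCount: 5 });
console.log(tuned.errorCount, toDuration(tuned.windowSeconds)); // prints: 5 60s
```

In a Pulumi program the overrides could come from optional config values, so thresholds change per stack without editing the component.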

Blueprint Pulumi program

The entrypoint reads the notification email from Pulumi config, creates the observability component, and exports operator-facing resources.

index.ts

import * as pulumi from "@pulumi/pulumi";
import { Observability } from "./components/observability";

// Read the alert recipient from stack config; fails fast if it is missing.
const config = new pulumi.Config();
const notificationEmail = config.require("notificationEmail");

const observability = new Observability("observability", {
    notificationEmail,
    namePrefix: `${pulumi.getStack()}-production-observability`,
    tags: {
        environment: pulumi.getStack(),
        "solution-family": "production-observability",
        cloud: "gcp",
        language: "typescript",
    },
});

// Operator-facing stack outputs.
export const dashboardId = observability.dashboardId;
export const notificationTarget = observability.notificationTarget;
export const traceHook = observability.traceHook;

__main__.py

import pulumi
from components.observability import Observability

# Read the alert recipient from stack config; fails fast if it is missing.
config = pulumi.Config()
notification_email = config.require("notificationEmail")

observability = Observability(
    "observability",
    notification_email=notification_email,
    name_prefix=f"{pulumi.get_stack()}-production-observability",
    tags={
        "environment": pulumi.get_stack(),
        "solution-family": "production-observability",
        "cloud": "gcp",
        "language": "python",
    },
)

# Operator-facing stack outputs.
pulumi.export("dashboardId", observability.dashboard_id)
pulumi.export("notificationTarget", observability.notification_target)
pulumi.export("traceHook", observability.trace_hook)

main.go

package main

import (
    "fmt"

    "github.com/pulumi/pulumi/sdk/v3/go/pulumi"
    "github.com/pulumi/pulumi/sdk/v3/go/pulumi/config"

    "production-observability-gcp/observability"
)

// Program creates the observability component and exports its outputs.
func Program(ctx *pulumi.Context) error {
    // An empty namespace reads keys from the project's own config namespace.
    cfg := config.New(ctx, "")
    baseline, err := observability.NewObservability(ctx, "observability", &observability.ObservabilityArgs{
        NotificationEmail: cfg.Require("notificationEmail"),
        NamePrefix:        fmt.Sprintf("%s-production-observability", ctx.Stack()),
        Tags: pulumi.StringMap{
            "environment":     pulumi.String(ctx.Stack()),
            "solution-family": pulumi.String("production-observability"),
            "cloud":           pulumi.String("gcp"),
            "language":        pulumi.String("go"),
        },
    })
    if err != nil {
        return err
    }
    // Operator-facing stack outputs.
    ctx.Export("dashboardId", baseline.DashboardID)
    ctx.Export("notificationTarget", baseline.NotificationTarget)
    ctx.Export("traceHook", baseline.TraceHook)
    return nil
}

func main() {
    pulumi.Run(Program)
}

Reusable observability component

The component provisions the dashboard, log or metric source, alert rules, notification target, and minimal trace-ready service hook for GCP.

components/observability.ts

Creates the Cloud Monitoring dashboards, Cloud Logging wiring, alert rules, notification target, and Cloud Trace hook.

import * as gcp from "@pulumi/gcp";
import * as pulumi from "@pulumi/pulumi";

export interface ObservabilityArgs {
    notificationEmail: string;
    namePrefix: string;
    tags: Record<string, string>;
}

export class Observability extends pulumi.ComponentResource {
    public readonly dashboardId: pulumi.Output<string>;
    public readonly notificationTarget: pulumi.Output<string>;
    public readonly traceHook: pulumi.Output<string>;

    constructor(name: string, args: ObservabilityArgs, opts?: pulumi.ComponentResourceOptions) {
        super("guides:productionObservability:Gcp", name, {}, opts);
        const child = { parent: this };

        // Log-based metric counting entries at severity ERROR or higher.
        const metric = new gcp.logging.Metric(`${name}-errors`, {
            name: `${args.namePrefix}-errors`,
            filter: "severity>=ERROR",
            metricDescriptor: { metricKind: "DELTA", valueType: "INT64", displayName: "Application errors" },
        }, child);

        // Monitoring filter shared by the alert condition and the dashboard chart.
        // Cloud Monitoring may also require a resource.type restriction here;
        // add one for your workload if the API rejects the filter.
        const metricFilter = pulumi.interpolate`metric.type="logging.googleapis.com/user/${metric.name}"`;

        // Email channel that receives alert notifications.
        const channel = new gcp.monitoring.NotificationChannel(`${name}-email`, {
            displayName: `${args.namePrefix} email`,
            type: "email",
            labels: { email_address: args.notificationEmail },
            userLabels: args.tags,
        }, child);

        // Alert when the error count stays above zero for 60 seconds.
        const policy = new gcp.monitoring.AlertPolicy(`${name}-error-policy`, {
            displayName: `${args.namePrefix} errors`,
            combiner: "OR",
            conditions: [{
                displayName: "Error log entries",
                conditionThreshold: {
                    filter: metricFilter,
                    duration: "60s",
                    comparison: "COMPARISON_GT",
                    thresholdValue: 0,
                    aggregations: [{ alignmentPeriod: "60s", perSeriesAligner: "ALIGN_DELTA" }],
                },
            }],
            notificationChannels: [channel.name],
            userLabels: args.tags,
        }, child);

        // Dashboard with an error chart and a reminder about the trace hook.
        const dashboard = new gcp.monitoring.Dashboard(`${name}-dashboard`, {
            dashboardJson: pulumi.jsonStringify({
                displayName: `${args.namePrefix} health`,
                gridLayout: {
                    columns: "2",
                    widgets: [
                        {
                            title: "Logged errors",
                            xyChart: {
                                dataSets: [{
                                    timeSeriesQuery: { timeSeriesFilter: { filter: metricFilter, aggregation: { perSeriesAligner: "ALIGN_DELTA" } } },
                                    plotType: "LINE",
                                }],
                            },
                        },
                        {
                            title: "Trace hook",
                            text: { content: "Set GOOGLE_CLOUD_TRACE_ENABLED=true in the service runtime.", format: "MARKDOWN" },
                        },
                    ],
                },
            }),
        }, child);

        this.dashboardId = dashboard.id;
        this.notificationTarget = channel.name;
        this.traceHook = pulumi.interpolate`GOOGLE_CLOUD_TRACE_ENABLED=true; alertPolicy=${policy.name}`;
        this.registerOutputs({
            dashboardId: this.dashboardId,
            notificationTarget: this.notificationTarget,
            traceHook: this.traceHook,
        });
    }
}

components/observability.py

Creates the Cloud Monitoring dashboards, Cloud Logging wiring, alert rules, notification target, and Cloud Trace hook.

import json

import pulumi
import pulumi_gcp as gcp


class Observability(pulumi.ComponentResource):
    def __init__(self, name, notification_email, name_prefix, tags, opts=None):
        super().__init__("guides:productionObservability:Gcp", name, None, opts)
        child = pulumi.ResourceOptions(parent=self)

        # Log-based metric counting entries at severity ERROR or higher.
        metric = gcp.logging.Metric(
            f"{name}-errors",
            name=f"{name_prefix}-errors",
            filter="severity>=ERROR",
            metric_descriptor={"metric_kind": "DELTA", "value_type": "INT64", "display_name": "Application errors"},
            opts=child,
        )

        # Monitoring filter shared by the alert condition and the dashboard chart.
        # Cloud Monitoring may also require a resource.type restriction here;
        # add one for your workload if the API rejects the filter.
        metric_filter = metric.name.apply(lambda n: f'metric.type="logging.googleapis.com/user/{n}"')

        # Email channel that receives alert notifications.
        channel = gcp.monitoring.NotificationChannel(
            f"{name}-email",
            display_name=f"{name_prefix} email",
            type="email",
            labels={"email_address": notification_email},
            user_labels=tags,
            opts=child,
        )

        # Alert when the error count stays above zero for 60 seconds.
        policy = gcp.monitoring.AlertPolicy(
            f"{name}-error-policy",
            display_name=f"{name_prefix} errors",
            combiner="OR",
            conditions=[{
                "display_name": "Error log entries",
                "condition_threshold": {
                    "filter": metric_filter,
                    "duration": "60s",
                    "comparison": "COMPARISON_GT",
                    "threshold_value": 0,
                    "aggregations": [{"alignment_period": "60s", "per_series_aligner": "ALIGN_DELTA"}],
                },
            }],
            notification_channels=[channel.name],
            user_labels=tags,
            opts=child,
        )

        # Dashboard with an error chart and a reminder about the trace hook.
        def dashboard_body(flt):
            return json.dumps({
                "displayName": f"{name_prefix} health",
                "gridLayout": {
                    "columns": "2",
                    "widgets": [
                        {
                            "title": "Logged errors",
                            "xyChart": {"dataSets": [{
                                "timeSeriesQuery": {"timeSeriesFilter": {"filter": flt, "aggregation": {"perSeriesAligner": "ALIGN_DELTA"}}},
                                "plotType": "LINE",
                            }]},
                        },
                        {
                            "title": "Trace hook",
                            "text": {"content": "Set GOOGLE_CLOUD_TRACE_ENABLED=true in the service runtime.", "format": "MARKDOWN"},
                        },
                    ],
                },
            })

        dashboard = gcp.monitoring.Dashboard(f"{name}-dashboard", dashboard_json=metric_filter.apply(dashboard_body), opts=child)

        self.dashboard_id = dashboard.id
        self.notification_target = channel.name
        self.trace_hook = policy.name.apply(lambda p: f"GOOGLE_CLOUD_TRACE_ENABLED=true; alertPolicy={p}")
        self.register_outputs({
            "dashboard_id": self.dashboard_id,
            "notification_target": self.notification_target,
            "trace_hook": self.trace_hook,
        })

observability/observability.go

Creates the Cloud Monitoring dashboards, Cloud Logging wiring, alert rules, notification target, and Cloud Trace hook.

package observability

import (
    "encoding/json"
    "fmt"

    "github.com/pulumi/pulumi-gcp/sdk/v9/go/gcp/logging"
    "github.com/pulumi/pulumi-gcp/sdk/v9/go/gcp/monitoring"
    "github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

// ObservabilityArgs configures the observability baseline.
type ObservabilityArgs struct {
    NotificationEmail string
    NamePrefix        string
    Tags              pulumi.StringMap
}

// Observability groups the metric, alert policy, notification channel, and dashboard.
type Observability struct {
    pulumi.ResourceState
    DashboardID        pulumi.StringOutput
    NotificationTarget pulumi.StringOutput
    TraceHook          pulumi.StringOutput
}

func NewObservability(ctx *pulumi.Context, name string, args *ObservabilityArgs, opts ...pulumi.ResourceOption) (*Observability, error) {
    component := &Observability{}
    if err := ctx.RegisterComponentResource("guides:productionObservability:Gcp", name, component, opts...); err != nil {
        return nil, err
    }
    child := pulumi.Parent(component)

    // Log-based metric counting entries at severity ERROR or higher.
    metric, err := logging.NewMetric(ctx, name+"-errors", &logging.MetricArgs{
        Name:   pulumi.String(args.NamePrefix + "-errors"),
        Filter: pulumi.String("severity>=ERROR"),
        MetricDescriptor: &logging.MetricMetricDescriptorArgs{
            MetricKind:  pulumi.String("DELTA"),
            ValueType:   pulumi.String("INT64"),
            DisplayName: pulumi.String("Application errors"),
        },
    }, child)
    if err != nil {
        return nil, err
    }

    // Email channel that receives alert notifications.
    channel, err := monitoring.NewNotificationChannel(ctx, name+"-email", &monitoring.NotificationChannelArgs{
        DisplayName: pulumi.String(args.NamePrefix + " email"),
        Type:        pulumi.String("email"),
        Labels:      pulumi.StringMap{"email_address": pulumi.String(args.NotificationEmail)},
        UserLabels:  args.Tags,
    }, child)
    if err != nil {
        return nil, err
    }

    // Alert when the error count stays above zero for 60 seconds.
    // Cloud Monitoring may also require a resource.type restriction in the filter;
    // add one for your workload if the API rejects it.
    policy, err := monitoring.NewAlertPolicy(ctx, name+"-error-policy", &monitoring.AlertPolicyArgs{
        DisplayName: pulumi.String(args.NamePrefix + " errors"),
        Combiner:    pulumi.String("OR"),
        Conditions: monitoring.AlertPolicyConditionArray{&monitoring.AlertPolicyConditionArgs{
            DisplayName: pulumi.String("Error log entries"),
            ConditionThreshold: &monitoring.AlertPolicyConditionConditionThresholdArgs{
                Filter:         pulumi.Sprintf("metric.type=\"logging.googleapis.com/user/%s\"", metric.Name),
                Duration:       pulumi.String("60s"),
                Comparison:     pulumi.String("COMPARISON_GT"),
                ThresholdValue: pulumi.Float64(0),
                Aggregations: monitoring.AlertPolicyConditionConditionThresholdAggregationArray{
                    &monitoring.AlertPolicyConditionConditionThresholdAggregationArgs{
                        AlignmentPeriod:  pulumi.String("60s"),
                        PerSeriesAligner: pulumi.String("ALIGN_DELTA"),
                    },
                },
            },
        }},
        NotificationChannels: pulumi.StringArray{channel.Name},
        UserLabels:           args.Tags,
    }, child)
    if err != nil {
        return nil, err
    }

    // Dashboard with an error chart and a reminder about the trace hook.
    body := metric.Name.ApplyT(func(metricName string) (string, error) {
        filter := fmt.Sprintf("metric.type=\"logging.googleapis.com/user/%s\"", metricName)
        data, err := json.Marshal(map[string]interface{}{
            "displayName": args.NamePrefix + " health",
            "gridLayout": map[string]interface{}{
                "columns": "2",
                "widgets": []interface{}{
                    map[string]interface{}{
                        "title": "Logged errors",
                        "xyChart": map[string]interface{}{"dataSets": []interface{}{map[string]interface{}{
                            "timeSeriesQuery": map[string]interface{}{"timeSeriesFilter": map[string]interface{}{
                                "filter":      filter,
                                "aggregation": map[string]interface{}{"perSeriesAligner": "ALIGN_DELTA"},
                            }},
                            "plotType": "LINE",
                        }}},
                    },
                    map[string]interface{}{
                        "title": "Trace hook",
                        "text":  map[string]interface{}{"content": "Set GOOGLE_CLOUD_TRACE_ENABLED=true in the service runtime.", "format": "MARKDOWN"},
                    },
                },
            },
        })
        return string(data), err
    }).(pulumi.StringOutput)
    dashboard, err := monitoring.NewDashboard(ctx, name+"-dashboard", &monitoring.DashboardArgs{DashboardJson: body}, child)
    if err != nil {
        return nil, err
    }

    component.DashboardID = dashboard.ID().ToStringOutput()
    component.NotificationTarget = channel.Name
    component.TraceHook = policy.Name.ApplyT(func(v string) string {
        return fmt.Sprintf("GOOGLE_CLOUD_TRACE_ENABLED=true; alertPolicy=%s", v)
    }).(pulumi.StringOutput)

    // Register outputs so the component is marked complete, as the other languages do.
    if err := ctx.RegisterResourceOutputs(component, pulumi.Map{
        "dashboardId":        component.DashboardID,
        "notificationTarget": component.NotificationTarget,
        "traceHook":          component.TraceHook,
    }); err != nil {
        return nil, err
    }
    return component, nil
}

Frequently asked questions

Does this deploy an application?
It deploys only the smallest service hook needed to demonstrate log, metric, and trace wiring. Bring your real service names, metric filters, and alert thresholds before using the blueprint for production traffic.

Where does the notification email come from?
Each starter reads a Pulumi config value named notificationEmail. Set it to the address or distribution list your team controls before running pulumi up.

Does this include incident management or on-call rotation?
No. The blueprint stops at cloud-native email notification targets so you can connect your own incident workflow later without adding another platform to the starter.

What should I tune first?
Tune the error threshold, latency threshold, evaluation window, and dashboard widgets to match your service baseline after the first few deploys.

How do I clean it up?
Run pulumi destroy from the same stack, then remove any email subscription confirmation or notification channel that your cloud provider leaves pending outside Pulumi state.