Tracking AI Inference Latency with Datadog SLOs
If you aim to track AI inference latency with Datadog Service Level Objectives (SLOs), you're essentially interested in monitoring the performance of AI inference workloads and ensuring they meet the latency standards you've set.
In Datadog, an SLO is composed of a target and a time window, applied to a specific metric you're tracking. For AI inference latency, you would typically measure the time it takes for an inference request to be processed and a response returned. You would then set a target for acceptable latency: say, 95% of requests served in under 100ms over a 30-day window.
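As a quick sketch of that shape (a target and a time window applied directly to a metric), here is what a metric-based SLO could look like in Pulumi. The metric names `ai.inference.latency.under_100ms` and `ai.inference.requests.total` are hypothetical placeholders, not metrics that exist in your account; substitute the count metrics your service actually emits.

```python
import pulumi_datadog as datadog

# A metric-based SLO: the ratio of "good" events to total events must stay
# above the target over the timeframe. Both metric names are hypothetical.
metric_based_slo = datadog.ServiceLevelObjective(
    "aiInferenceLatencyMetricSlo",
    name="AI Inference Latency (metric-based)",
    type="metric",
    query=datadog.ServiceLevelObjectiveQueryArgs(
        # Requests that completed in under 100ms (hypothetical metric).
        numerator="sum:ai.inference.latency.under_100ms{*}.as_count()",
        # All inference requests (hypothetical metric).
        denominator="sum:ai.inference.requests.total{*}.as_count()",
    ),
    thresholds=[
        datadog.ServiceLevelObjectiveThresholdArgs(
            timeframe="30d",
            target=95,
        ),
    ],
    tags=["ai", "inference", "latency"],
)
```

The full program below takes the other route: a monitor-based SLO that reuses an existing Datadog monitor.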
Let's put together a Pulumi program in Python that creates a Datadog SLO to track the latency of a theoretical AI inference service. We will use the `datadog.ServiceLevelObjective` resource from the Datadog provider package, which supports setting up SLOs.

```python
import pulumi
import pulumi_datadog as datadog

# The ID of the monitor that checks the AI inference latency.
# This would typically be set up in Datadog to track the inference endpoint's response time.
inference_latency_monitor_id = 123456789

# Service Level Objective definition for AI inference latency.
# Here we set up an SLO that targets 95% of requests having a latency below the
# monitor's threshold over a 30-day period (timeframe "30d").
ai_inference_latency_slo = datadog.ServiceLevelObjective(
    "aiInferenceLatencySlo",
    name="AI Inference Latency",
    description="Latency SLO for AI inference requests",
    tags=["ai", "inference", "latency"],
    type="monitor",
    monitor_ids=[inference_latency_monitor_id],
    thresholds=[
        datadog.ServiceLevelObjectiveThresholdArgs(
            timeframe="30d",
            target=95,
            warning=97,
        ),
    ],
)

# Export the SLO ID so it can be used elsewhere or looked up in Datadog.
pulumi.export("slo_id", ai_inference_latency_slo.id)
```
In the program above, we've defined a `ServiceLevelObjective` resource that specifies our SLO. We've given it a name, a description, and tags to help identify and categorize the SLO within Datadog.

We set `type` to `"monitor"`, which means this SLO is based on a Datadog monitor that tracks the target performance metric, in our case the AI inference latency. The `monitor_ids` array contains the IDs of those monitors; replace `123456789` with the actual ID of the monitor you set up in Datadog.
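If you'd rather not hard-code that ID, the provider exposes a `datadog.get_monitor` lookup you can use to resolve an existing monitor by name. A minimal sketch, assuming a monitor with the hypothetical name "AI Inference Latency Monitor" already exists in your account and matches the filter uniquely:

```python
import pulumi_datadog as datadog

# Look up an existing monitor by name instead of hard-coding its ID.
# The name filter must match exactly one monitor; the name is a placeholder.
existing_monitor = datadog.get_monitor(name_filter="AI Inference Latency Monitor")

# The lookup returns the monitor ID as a string; monitor_ids expects integers.
inference_latency_monitor_id = int(existing_monitor.id)
```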
The `thresholds` argument is a list of `ServiceLevelObjectiveThresholdArgs` describing the SLO targets and timeframes. Here we aim for 95% of inference responses over the trailing 30 days to fall below the latency threshold, with a warning level at 97% to give an early signal before the target itself is at risk.
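An SLO can carry more than one threshold, one per timeframe, so the same target can be reported over several rolling windows at once. A small sketch of what that could look like, using the timeframes Datadog supports for rolling SLO windows ("7d", "30d", and "90d"):

```python
import pulumi_datadog as datadog

# Track the same 95% target over 7-day, 30-day, and 90-day rolling windows.
multi_window_thresholds = [
    datadog.ServiceLevelObjectiveThresholdArgs(timeframe="7d", target=95, warning=97),
    datadog.ServiceLevelObjectiveThresholdArgs(timeframe="30d", target=95, warning=97),
    datadog.ServiceLevelObjectiveThresholdArgs(timeframe="90d", target=95, warning=97),
]
```

Passing `multi_window_thresholds` as the `thresholds` argument of the resource above would give you short-term burn visibility alongside the longer-term objective.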
Lastly, we export the `slo_id`, which is the ID of the SLO we've created in Datadog, so it can be referenced elsewhere if needed.

Please note that you will need to set up a monitor in Datadog that accurately represents the service metric you want to apply the SLO to. You can do this within the Datadog platform itself, or through Pulumi if the monitor does not yet exist.
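If you want to manage that monitor from Pulumi as well, a minimal sketch could look like the following. The metric name `ai.inference.request.latency`, its tags, and the notification handle are hypothetical placeholders; substitute the timing metric your inference service actually reports.

```python
import pulumi_datadog as datadog

# A metric alert that fires when average inference latency over the last
# 5 minutes exceeds 100ms (0.1s). The metric name and tag are hypothetical.
latency_monitor = datadog.Monitor(
    "aiInferenceLatencyMonitor",
    name="AI Inference Latency Monitor",
    type="metric alert",
    query="avg(last_5m):avg:ai.inference.request.latency{service:inference} > 0.1",
    message="AI inference latency is above 100ms. Notify: @pagerduty",
    monitor_thresholds=datadog.MonitorMonitorThresholdsArgs(
        critical="0.1",
    ),
    tags=["ai", "inference", "latency"],
)

# Monitor IDs arrive as string outputs, so convert to int for monitor_ids.
monitor_id_as_int = latency_monitor.id.apply(int)
```

With that in place, you could pass `monitor_ids=[monitor_id_as_int]` to the SLO resource, so a single `pulumi up` creates the monitor and the SLO together in the right order.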
For more information on using the Datadog Pulumi provider and setting up service level objectives, refer to the Datadog Pulumi documentation.