AI Engineering Lessons from Building Pulumi Copilot

Building AI-powered developer tools comes with unique challenges, and now that we’ve launched our REST API, we want to share some lessons we’ve learned building Pulumi Copilot, an AI assistant for cloud infrastructure.
One of the big challenges was determining what ‘working’ really meant. So when, after months of rigorous testing, a message landed in our feedback channel saying ‘Your tool doesn’t know anything!’, it caused some mild panic. We’d just made some changes, so we braced for the worst. But our evals were still looking strong, so what was going on?