Skip to main content
Pulumi logo
Engin Diri

Engin Diri

Senior Solutions Architect

Five Stacks Before Lunch: The Parallel Coding Playbook for Pulumi

Five Stacks Before Lunch: The Parallel Coding Playbook for Pulumi

AI coding has two shapes right now. One agent in a loop, sequential work, you babysitting the chat window. Call that 2x. Most teams live here. Five agents in worktrees, parallel work, fresh-context review on every change. Call that 10x. The trick: 2x is mostly prompting, 10x is mostly plumbing.

The parallel coding playbook is a five-pattern setup for running multiple AI coding agents at the same time without them stepping on each other: an issue used as the spec, a plan/build/validate loop, parallel git worktrees, fresh-session review, and a self-healing layer. The whole thing targets application code. The interesting question, and the one I keep ending up at, is what changes when the five agents are touching infrastructure.

Read more →

Stop Tuning Prompts. Build a Harness.

Stop Tuning Prompts. Build a Harness.

Anthropic shipped a piece earlier this month called How Claude Code Works in Large Codebases. I have not read anything more useful about coding agents this year. The core claim, in their words: “the ecosystem built around the model—the harness—determines how Claude Code performs more than the model alone.” In my phrasing: in a real codebase, the model is the smaller variable. The layer of context and tooling you wire around the agent matters more than which version of Sonnet or Opus is behind it.

The post stays high-level, which is the right move for a launch piece. What I want to do here is land it. Same seven pieces, but with the wiring you would actually put in a repo, in the order I would put it.

Read more →

How Building AI Agents Has Changed in 2026

How Building AI Agents Has Changed in 2026

Twelve months ago, building an AI agent meant picking a framework, defining your tools, standing up a RAG pipeline, and writing a stack of glue code to wire it all together. That was the default playbook. The post-mortem on six months of work usually went the same way: half the time went into infrastructure that had nothing to do with the agent’s actual job.

That isn’t where the work is anymore. Most of the middle layer is gone. The SDKs ship with the tools, the skills system replaced the upfront tool registry, and longer context windows pushed vector search out of the default slot it held all of last year.

The shape is the same as a lot of infrastructure shifts before it. The hard thing got cheap, the cheap thing got expected, and the question moved up a level.

Read more →

The Dark Factory Pattern for Infrastructure: Running Pulumi Lights-Out

The Dark Factory Pattern for Infrastructure: Running Pulumi Lights-Out

The original dark factory was Fanuc’s robotics plant in Oshino, Japan, where the lights are off because nobody is on the floor. Robots build robots. Parts move through the line for weeks at a time without a person walking past them.

The same pattern is now showing up in software. Three engineers at StrongDM shipped roughly 32,000 lines of production code without writing or reviewing any of it. Stripe’s “Minions” agent system merges over a thousand pull requests every week. In January, Dan Shapiro of Glowforge published a five-level autonomy ladder that landed cleanly enough to become the shorthand most people now use, and BCG put out a piece calling it the dark software factory.

Almost every public writeup so far is about application code. The harder question is what this looks like for infrastructure.

Read more →

Agent Sprawl Is Here. Your IaC Platform Is the Answer.

Agent Sprawl Is Here. Your IaC Platform Is the Answer.

Somewhere in your company right now, a developer is building an AI agent. Maybe it’s a release agent that cuts tags when tests pass. Maybe it’s a cost agent that shuts down idle EC2 overnight. It’s running, it’s in production, and there’s a decent chance the platform team doesn’t know it exists.

This isn’t a thought experiment. OutSystems just surveyed 1,900 IT leaders and the numbers are rough: 96% of enterprises run AI agents in production today, 94% say the sprawl is becoming a real security problem, and only 12% have any central way to manage it. Twelve percent. You can read the full report here.

The real question is where those agents run. Inside the platform you’ve already built, or somewhere off to the side where nobody on the platform team can see them.

Read more →

Superpowers, GSD, and GSTACK: Picking the Right Framework for Your Coding Agent

Superpowers, GSD, and GSTACK: Picking the Right Framework for Your Coding Agent

Three community frameworks have emerged that fix the specific ways AI coding agents break down on real projects. Superpowers enforces test-driven development. GSD prevents context rot. GSTACK adds role-based governance. All three started with Claude Code but now work across Cursor, Codex, Windsurf, Gemini CLI, and more.

Pulumi uses general-purpose programming languages to define infrastructure. TypeScript, Python, Go, C#, Java. Every framework that makes AI agents write better TypeScript also makes your pulumi up better. After spending a few weeks with each one, I have opinions about when to use which.

Read more →

Token Efficiency vs Cognitive Efficiency: Choosing IaC for AI Agents

Token Efficiency vs Cognitive Efficiency: Choosing IaC for AI Agents

When an AI agent writes infrastructure code, two things matter: how compact the output is (token efficiency) and how well the model actually reasons about what it’s writing (cognitive efficiency). HCL produces fewer tokens for the same resource. But does that make it the better choice when agents need to refactor, debug, and iterate? We ran a benchmark across Claude Opus 4.6 and GPT-5.2-Codex to find out.

Read more →

GitOps Best Practices I Wish I Had Known Before

GitOps Best Practices I Wish I Had Known Before

Getting started with GitOps can feel like trying to herd cats through a YAML factory while the factory is on fire. It’s one of those things that seems like it ought to be simple (just use Git!), but in practice is much more complex — and you may not realize how much more complex until you’re weeks or more into a project. After years of running GitOps workflows in production across dozens of clusters, I’ve collected a list of best practices that I’m hoping can save you from having to make many of the mistakes I’ve made. Think of it as the GitOps cheat sheet I wish I’d had from Day 1.

Read more →

The Claude Skills I Actually Use for DevOps

The Claude Skills I Actually Use for DevOps

When Claude Code first released skills, I ignored them. They looked like fancy prompts, another feature to add to the pile of things I would get around to learning eventually. Then I watched a few engineers demonstrate what skills actually do, and something clicked. By default, language models do not write good code. They write plausible code based on what they have read. Plausible code turns into bugs, horrible UX, and infrastructure that breaks at 3am.

Read more →

The infrastructure as code platform for any cloud.