LOBUS.WORKS

The Delegation Problem: Why AI Made Intent the New Bottleneck

1 Jan 2026 · 9 min

Every engineering leader I talk to has the same experience: AI coding assistants made their developers faster. The pull requests come in quicker. The code volume is up. The velocity charts look great.

And yet.

Review times are increasing. Defect rates are climbing. Code complexity is rising while refactoring drops to historic lows. The team is producing more and understanding less.

Welcome to the delegation problem.

What actually got cheaper

AI coding assistants made one thing dramatically cheaper: translating intent into code. If you know exactly what you want, a well-prompted AI will produce it faster than any human.

But "knowing exactly what you want" was always the hard part. The bottleneck was never translation; it was intent itself: the clarity of requirements, the precision of specifications, the shared understanding of what "done" means in context.

AI didn't remove the bottleneck. It moved it upstream.

Insight

When production gets cheap, the constraints shift to intent, validation, and accountability. These are organisational design problems, not tooling problems.

The four rungs of the delegation ladder

Not all work can be delegated to AI equally. The level of specification rigour required depends on four variables: risk, novelty, team maturity, and codebase condition.

Rung 1: Full autonomy. Low risk, familiar domain, mature team, clean codebase. AI generates, human reviews briefly. Examples: boilerplate, test stubs, documentation, well-understood CRUD operations.

Rung 2: Guided generation. Medium risk, partially novel. AI generates from a detailed specification. Human reviews thoroughly, validates edge cases. Examples: feature implementations with clear acceptance criteria.

Rung 3: Collaborative drafting. High risk or significant novelty. Human and AI work iteratively. Human provides intent and constraints at each step. AI generates options. Human validates and steers. Examples: architectural changes, performance-critical code, security boundaries.

Rung 4: Human-only. Novel domain, no existing patterns, safety-critical, regulatory. AI assists with research and exploration but does not generate production code. Examples: cryptographic implementations, compliance-critical business logic, novel algorithms.
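The ladder can be sketched as a decision heuristic. This is a minimal illustration, not a prescription: the article names the four variables but not their thresholds, so the cutoffs below (and the coarse low/medium/high scale) are assumptions.

```python
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def delegation_rung(risk: Level, novelty: Level,
                    team_mature: bool, codebase_clean: bool) -> int:
    """Map the four variables to a rung (1 = full autonomy, 4 = human-only).

    Thresholds are illustrative assumptions, not part of the source article.
    """
    # Rung 4: safety-critical work in a novel domain stays human-only.
    if risk is Level.HIGH and novelty is Level.HIGH:
        return 4
    # Rung 3: high risk OR significant novelty means collaborative drafting.
    if risk is Level.HIGH or novelty is Level.HIGH:
        return 3
    # Rung 1: low risk, familiar domain, mature team, clean codebase.
    if (risk is Level.LOW and novelty is Level.LOW
            and team_mature and codebase_clean):
        return 1
    # Everything else: guided generation from a detailed specification.
    return 2
```

The point of writing it down, even this crudely, is that the rung becomes an explicit decision made before delegation, rather than a default that every change silently inherits.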

The mistake most organisations make is treating all work as Rung 1. Everything gets the same level of AI delegation. The result: fast production of code that nobody fully understands, with review becoming the new bottleneck because reviewers can't keep pace with generation.

The governance gap

Here's the uncomfortable truth: most organisations adopted AI tools without redesigning governance.

They didn't change code review processes to account for higher volume and lower author comprehension. They didn't adjust their specification practices to provide the intent clarity that AI delegation requires. They didn't redefine accountability for code that was generated rather than written.

The result is a governance gap — the space between how fast code is produced and how well the organisation can validate what was produced. This gap grows wider with every efficiency improvement that doesn't come with a corresponding governance improvement.

Five governance layers

Closing the governance gap requires five layers, each addressing a different aspect of the delegation problem:

  1. Context layer. What is this change for? What problem does it solve? What constraints apply? Without context, AI generates syntactically correct code that is semantically wrong.

  2. Specification layer. What exactly should the code do? What are the boundaries, edge cases, and acceptance criteria? The more you delegate to AI, the more precise your specifications must be. This is counterintuitive — people expect AI to reduce specification effort, but it actually increases it.

  3. Proof layer. How do we know the code does what the specification says? Tests, characterisation tests, property-based tests, contract tests. The proof packet travels with the code.

  4. Review layer. Who validates that the proof matches the intent? Review in an AI-augmented world is not line-by-line code reading. It's intent validation: does this implementation match what we actually wanted?

  5. Monitoring layer. How do we know the code continues to behave correctly in production? Observability, canary deployments, feature flags, rollback capability.
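The proof layer in step 3 can be made concrete with a property-based check: instead of asserting a few hand-picked outputs, the test asserts that properties from the specification hold across many random inputs. A minimal sketch, hand-rolled with the standard library rather than a framework like Hypothesis; the function under test and its stated properties are hypothetical examples, not from the article.

```python
import random


def normalise_tags(tags):
    """Hypothetical generated function: strip, lowercase, dedupe, sort tags."""
    return sorted({t.strip().lower() for t in tags})


def check_properties(trials: int = 500) -> bool:
    """Properties the specification would state: output is sorted, unique,
    and normalised, and does not depend on input order or duplicates."""
    alphabet = ["API", "api", " Api", "db", "DB ", "ui"]
    for _ in range(trials):
        tags = random.choices(alphabet, k=random.randint(0, 10))
        out = normalise_tags(tags)
        assert out == sorted(set(out)), "output must be sorted and unique"
        assert all(t == t.strip().lower() for t in out), "tags must be normalised"
        # Order-insensitivity: shuffling the input must not change the output.
        shuffled = tags[:]
        random.shuffle(shuffled)
        assert normalise_tags(shuffled) == out
    return True
```

A proof packet like this travels with the generated code: a reviewer who cannot read every line can still verify that the asserted properties are the ones the specification actually demands.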

Most organisations have fragments of these layers. Almost none have all five, and fewer still have them connected into a coherent governance system.

The organisational design question

The delegation problem is ultimately an organisational design question: how do you structure teams, processes, and accountability chains for a world where production is cheap but intent is expensive?

This means:

  • Product roles become more important, not less. Clear specifications require product thinking. If your product owners write vague tickets, AI amplifies the vagueness.

  • Review capacity must scale with generation capacity. You can't 10x code generation and keep the same review bandwidth. Either invest in review or accept that quality will degrade.

  • Accountability must be redefined. When AI generates the code, who is accountable for its behaviour? The person who prompted? The person who reviewed? The person who merged? Without clear accountability, nobody owns the outcome.

  • Team structures may need to change. The ratio of senior to junior developers, the role of architects, the function of tech leads — all of these are affected by AI delegation. Organisations that don't redesign team structures will find that AI amplifies their existing dysfunction.

AI didn't remove the bottleneck. It moved it upstream into intent, validation, and accountability. That's an organisational design challenge, not a tooling one.

The organisations that capture real value from AI are the ones that treat the delegation problem as a governance redesign, not a tool rollout. The rest will produce more code, faster, with less understanding — and wonder why outcomes aren't improving.


This article is adapted from the forthcoming book on governing AI-assisted software delivery at enterprise scale.