The prompt-action gap: why your agent does the right thing for the wrong reason
Every agent framework has the same blind spot. Your agent completes the task — the email gets sent, the data gets updated, the report gets generated. You look at the output and think: great, it works.
But if you look at the reasoning trace — the chain of thought between prompt and action — you'll often find the agent arrived at the right answer for entirely the wrong reason.
The Gap
I call this the prompt-action gap. It's the distance between what your prompt intended and what the agent actually reasoned about before taking action.
Here's a simple example. You prompt an agent to "find the highest-priority support ticket and draft a response." The agent:
- Queries the ticket system
- Sorts by creation date (not priority)
- Picks the newest ticket
- Writes a reasonable response
The output looks correct because the newest ticket happened to also be the highest priority. But the agent's reasoning was wrong — it sorted by date, not priority. Next time, when those don't align, it'll respond to the wrong ticket.
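The coincidence is easy to reproduce. Here is a minimal sketch of the ordering gap, with a hypothetical `Ticket` shape (the field names are illustrative, not any real ticketing API): the wrong sort key and the right one agree on this data, until a newer, lower-priority ticket arrives.

```typescript
// Hypothetical ticket shape; fields are illustrative, not a real API.
interface Ticket {
  id: string;
  priority: number; // higher = more urgent
  createdAt: Date;
}

// What the agent actually did: pick the newest ticket.
const newest = (ts: Ticket[]): Ticket =>
  [...ts].sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime())[0];

// What the prompt intended: pick the highest-priority ticket.
const highestPriority = (ts: Ticket[]): Ticket =>
  [...ts].sort((a, b) => b.priority - a.priority)[0];

const tickets: Ticket[] = [
  { id: "T-1", priority: 1, createdAt: new Date("2024-01-01") },
  { id: "T-2", priority: 3, createdAt: new Date("2024-01-03") },
];

// Coincidence: the newest ticket is also the highest priority,
// so the wrong sort key still yields the right answer.
console.log(newest(tickets).id === highestPriority(tickets).id); // true

// Add a newer, lower-priority ticket and the coincidence breaks.
tickets.push({ id: "T-3", priority: 2, createdAt: new Date("2024-01-05") });
console.log(newest(tickets).id === highestPriority(tickets).id); // false
```

An outcome check passes on the first dataset and fails on the second; only a reasoning check catches the bug in both.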
Why It Matters at Scale
When you're running one agent, you can spot-check. When you're running twenty, you can't. The prompt-action gap is invisible in outcomes until it isn't.
At ButterGrow, we've found three categories of gap:
Semantic gaps — the agent interprets a term differently than you intended. "High-value leads" means different things to your sales team and to an LLM trained on general text.
Ordering gaps — the agent does the right steps in the wrong order, which happens to work in most cases but fails in edge cases (like our priority/date example).
Scope gaps — the agent goes beyond or falls short of the intended action boundary. It updates the CRM and sends a Slack notification when you only asked for the CRM update.
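Scope gaps are the most mechanical of the three to detect: diff the actions the agent executed against the set the prompt authorized. This sketch assumes a simple `{ tool, target }` action log; the shape and tool names are made up for illustration.

```typescript
// Assumed action record; adapt to whatever your framework actually logs.
type Action = { tool: string; target: string };

// Return every executed action whose tool was outside the intended scope.
function scopeGap(intended: Set<string>, executed: Action[]): Action[] {
  return executed.filter((a) => !intended.has(a.tool));
}

const intended = new Set(["crm.update"]);
const executed: Action[] = [
  { tool: "crm.update", target: "lead-42" },
  { tool: "slack.notify", target: "#sales" }, // not asked for
];

// Flags the out-of-scope slack.notify action.
console.log(scopeGap(intended, executed));
```

Semantic and ordering gaps need fuzzier checks, but the same principle applies: make the intent explicit as data, then compare.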
How to Close It
You can't fully close the prompt-action gap. But you can measure it:
- Log reasoning traces, not just outcomes
- Compare intent to execution at the step level
- Test with adversarial cases where the "right answer for wrong reason" would fail
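The second point, comparing intent to execution step by step, can be sketched as a trace scan: flag any step whose reasoning never mentions the concept the prompt was actually about. The trace format here is an assumption; adapt it to whatever your framework logs.

```typescript
// Assumed trace format: one reasoning string per step.
interface Step {
  action: string;
  reasoning: string;
}

// Flag steps whose reasoning mentions none of the intent keywords.
function stepsMissingIntent(trace: Step[], intentKeywords: string[]): Step[] {
  return trace.filter(
    (s) => !intentKeywords.some((k) => s.reasoning.toLowerCase().includes(k))
  );
}

const trace: Step[] = [
  { action: "query", reasoning: "Fetch open tickets to find the highest-priority one" },
  { action: "sort", reasoning: "Sort by creation date, newest first" }, // the gap
  { action: "draft", reasoning: "Draft a response to the top-priority ticket" },
];

// The prompt asked about priority; only the sort step never mentions it.
console.log(stepsMissingIntent(trace, ["priority"])); // flags the sort step
```

Keyword matching is crude and will miss paraphrases, but it turns "the reasoning looked wrong" into a check you can run on every trace.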
This is fundamentally what we built AgentScore to do. But even without a dedicated tool, you can start by adding assertions to your agent loops:
// After each agent step, verify the reasoning mentions the intended concept.
// `agent.lastReasoning` stands in for however your framework exposes the trace.
import assert from "node:assert";

assert(
  agent.lastReasoning.includes("priority"),
  "Agent should reason about priority, not just recency"
);
It's crude, but it catches the most obvious gaps.
The prompt-action gap is the agent equivalent of a test that passes for the wrong reason. You wouldn't ship code with coincidentally passing tests. Don't ship agents with coincidentally correct behavior.