Over-Prompting Is Hurting Your Agent Performance
Your detailed specification is making the agent worse.
You wrote the context doc. You listed every edge case. You specified the exact file structure, the naming conventions, the error handling patterns. You gave the agent everything it needed to succeed.
And now it's generating a 47-item to-do list instead of writing code.
The overhead trap
You've seen the advice: give the model more context. Write detailed system prompts. Document your codebase. The more information the agent has, the better it performs.
Except when it doesn't.
The problem shows up when the agent starts treating your documentation as a constraint instead of a guide. You wanted it to write a utility function. Instead, it's parsing your specification line by line, building an elaborate plan, and generating scaffolding for edge cases you mentioned once in passing.
The overhead compounds. More context means more tokens spent reading. More reading means more opportunities for the model to latch onto irrelevant details. And when your specification conflicts with what the model already knows about writing code, you get output that follows your rules but ignores best practices.
"Sometimes that can go on a downside because I feel like overprompting is actually a thing. When you try to give way too much details, all these documents, and then it tries to read and create a super-detailed to-do list. I think it adds too much overhead into the process."
Melo
The mandate pattern
The alternative sounds too simple to work: give the agent a mandate instead of a manual.
A mandate is a clear outcome with minimal prescription. "Add retry logic to the API client" instead of "Add retry logic using exponential backoff with base delay of 100ms, max delay of 5000ms, jitter of ±10%, max attempts of 3, and custom error classification for 429 and 503 status codes."
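To see why the long version adds so little, here is a minimal sketch of the kind of code most models produce from the plain mandate alone. It is illustrative only; the function name fetchWithRetry and the default values are assumptions, not anything the article prescribes.

```typescript
// Hypothetical sketch: a typical retry wrapper generated from the bare
// mandate "Add retry logic to the API client". Names and defaults are
// illustrative assumptions, not prescribed values.
async function fetchWithRetry(
  url: string,
  init: RequestInit = {},
  maxAttempts = 3,
): Promise<Response> {
  let lastError: unknown;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const response = await fetch(url, init);
      // Only transient failures (rate limiting, service unavailable) are retried.
      if (response.status !== 429 && response.status !== 503) {
        return response;
      }
      lastError = new Error(`HTTP ${response.status}`);
    } catch (err) {
      lastError = err; // network errors are also worth retrying
    }

    // Exponential backoff with a cap and roughly ±10% jitter.
    const base = Math.min(100 * 2 ** attempt, 5000);
    const jitter = base * 0.1 * (Math.random() * 2 - 1);
    await new Promise((resolve) => setTimeout(resolve, base + jitter));
  }

  throw lastError;
}
```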
The model already knows what retry logic looks like. It's seen thousands of implementations. When you over-specify, you're not giving it information; you're overriding information it already has with your particular preferences.
Sometimes that's what you want. If your codebase has unusual conventions, you need to specify them. But if you're just restating standard patterns, you're adding noise.
"Sometimes we actually get better results by using a simpler prompt and just letting the model do its thing. Giving it a mandate instead of micromanaging it."
Roo Code Team
When models know more than your spec
The models keep getting better at code. The training data includes years of open source, documentation, and real implementations. When you write a detailed specification for a standard pattern, you're competing with that knowledge.
If your spec matches what the model knows, the extra context is redundant. If your spec diverges from what the model knows, the model has to reconcile the conflict. Neither outcome justifies the token cost.
"The models are getting so good at doing everything code related. So sometimes by using a prompt that's too specific, it might go against what they know already about code."
Melo
The tradeoff: simpler prompts require more trust. You're betting that the model's default behavior is close enough to what you want. If it isn't, you'll need to iterate. But iteration on a short prompt is cheaper than debugging output from a conflicted long prompt.
Why this matters for your workflow
For an engineer running 10-15 agent tasks per day, prompt overhead adds up. Each extra paragraph in your system prompt is tokens spent on reading instead of generating. Each detailed specification is a potential conflict with the model's training.
The compounding effect is subtle. You don't notice the overhead on any single task. But across a week, you're spending hours watching the agent build elaborate plans for work that should take minutes.
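The overhead is easy to quantify with a back-of-the-envelope sketch. The numbers below are illustrative assumptions, not measurements:

```typescript
// Illustrative assumptions: a 2,000-token spec prepended to every task,
// 12 agent tasks per day, 5 working days.
const specTokens = 2_000;
const tasksPerDay = 12;
const workDays = 5;

// 2,000 × 12 × 5 = 120,000 extra input tokens per week,
// before the agent has generated a single line of code.
const weeklyOverhead = specTokens * tasksPerDay * workDays;
console.log(`${weeklyOverhead.toLocaleString()} extra input tokens per week`);
```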
The calibration
Start with a mandate. Watch what the agent does. If it misses something important, add that specific constraint. If it follows a pattern you don't like, override that pattern.
Build your context incrementally based on what actually fails, not what might fail. If the retry mandate above comes back retrying on every error, including ordinary 400s, the fix is one added sentence about which status codes count - not another page of spec.
The prompt that works is often shorter than the one you started with.
How Roo Code closes the loop on prompt calibration
Roo Code lets you iterate on prompts without leaving your editor. Because it closes the loop - running commands, executing tests, and iterating on the results - you see immediately whether your mandate worked or needs refinement.
With BYOK (bring your own key), you control your token spend directly. There's no markup obscuring the cost of over-prompting. When a bloated specification burns through tokens on planning instead of coding, you see it in real time.
Roo Code's approval system lets you catch over-specification before it compounds. When the agent generates an elaborate 47-item plan for a simple task, you can reject it, simplify your prompt, and try again - all within the same workflow.
Over-prompting vs. mandate-based prompting
| Dimension | Over-prompting | Mandate-based prompting |
|---|---|---|
| Token cost | High - context parsing adds overhead on every task | Low - minimal input, maximum output |
| Conflict risk | High - spec may contradict model's training | Low - leverages model's existing knowledge |
| Iteration speed | Slow - debugging long prompts is tedious | Fast - short prompts are easy to refine |
| Output quality | Variable - model may follow rules but miss best practices | Consistent - model applies learned patterns |
| Trust required | Low - you specify everything | Higher - you trust the model's defaults |
Stop being the human glue between PRs
Cloud Agents review code, catch issues, and suggest fixes before you open the diff. You review the results, not the process.