Context Compression Creates a Photocopy Problem
Saving tokens costs you tokens.
Compressing context to fit more into the window sounds like discipline. In practice, it's a trap.
The photocopy loop
You're on iteration twelve of a refactoring task. The model proposed a solid approach eight iterations ago. Now it's suggesting changes that contradict its earlier reasoning.
You scroll back. The context that explained why this approach was chosen is gone. Compressed. Summarized into a sentence that lost the nuance.
So you re-explain. The model incorporates your re-explanation, proposes a fix, and the cycle continues. You're spending tokens to recover information you already had.
This is the photocopy problem. Each compression pass loses fidelity. The summary of a summary of a decision no longer contains the constraints that made the decision make sense.
"You create a new task, you squish it down, but we find that actually becomes a bit of a photocopy of a photocopy situation and we're not interested in getting lesser results like that."
Hannes Rudolph,
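The fidelity loss can be illustrated with a toy simulation. This is a minimal sketch, not how any real summarizer works: `compress` is a hypothetical stand-in that keeps only the leading half of the notes, and the `history` entries are invented. It assumes a summary ranks headline decisions ahead of the constraints that justified them.

```python
def compress(notes: list[str], keep_ratio: float = 0.5) -> list[str]:
    """Toy stand-in for a summarizer: keep only the leading share of the notes."""
    keep = max(1, int(len(notes) * keep_ratio))
    return notes[:keep]


def compress_repeatedly(notes: list[str], passes: int) -> list[str]:
    """Summarize the previous summary, `passes` times in a row."""
    for _ in range(passes):
        notes = compress(notes)
    return notes


# Hypothetical task history, ordered the way this sketch assumes a summary
# ranks it: headline decisions first, supporting constraints last.
history = [
    "decision: wrap the legacy client in an adapter",
    "outcome: module A tests pass",
    "outcome: module B migrated",
    "decision: keep the old config format",
    "constraint: option B breaks the public API",
    "edge case: empty payloads must round-trip",
    "reason: adapter chosen so callers stay unchanged",
    "constraint: no new runtime dependencies",
]

for passes in range(4):
    print(passes, compress_repeatedly(history, passes))
```

In this toy run, the constraints are gone after the first pass, and by the third pass only the headline decision survives. The decision is still on record, but nothing remains to explain why it was made.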
Why teams compress
The intuition makes sense. Context windows have limits. Tokens cost money. If you can summarize the last ten iterations into a paragraph, you free up space for new work.
The problem is what gets lost in the summary:
- The specific constraint that ruled out option B
- The edge case that made the current approach necessary
- The reason the model proposed this structure instead of that one
When that context disappears, the model starts from a weaker foundation. It re-proposes ideas that were already rejected. It misses the nuance that made the current approach work. You spend time re-establishing what was already established.
The token savings from compression get eaten by the tokens spent recovering lost context.
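The accounting behind that claim can be written out as simple arithmetic. The function name and every number below are assumptions chosen for illustration, not measurements; the point is only that recovery spend is subtracted from the headline savings.

```python
def net_token_savings(
    tokens_saved_per_pass: int,
    compression_passes: int,
    recovery_tokens_per_incident: int,
    recovery_incidents: int,
) -> int:
    """Tokens saved by compressing minus tokens spent re-explaining lost context."""
    saved = tokens_saved_per_pass * compression_passes
    recovered = recovery_tokens_per_incident * recovery_incidents
    return saved - recovered


# Illustrative numbers only, not measurements.
net = net_token_savings(
    tokens_saved_per_pass=6_000,
    compression_passes=3,
    recovery_tokens_per_incident=5_000,
    recovery_incidents=4,
)
print(net)  # -2000
```

With these made-up inputs, three compression passes save 18,000 tokens, but four rounds of re-explanation cost 20,000, so the net result is negative.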
The orchestration alternative
The alternative is task orchestration: a parent task spawns focused child tasks. Each child task operates with fresh context, does its specific job, and returns only the necessary result to the parent.
"The child task then returns just the necessary context to the parent task, thus keeping your context clean."
Hannes Rudolph,
The parent's context stays clean because it never ingested the full working state of each child. It only receives what it needs: the outcome, the decision, the artifact.
This is different from compression. Compression takes everything and makes it smaller. Orchestration keeps contexts separate and passes only the relevant handoff.
The mental model shift: instead of one long task that periodically squishes its own history, think of a coordinator that delegates to specialists. The coordinator doesn't need to know every line the specialist considered. It needs to know what the specialist concluded.
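The coordinator-and-specialist pattern can be sketched in a few lines. This is a hypothetical illustration of the idea, not Roo Code's API: `Handoff`, `ChildTask`, `ParentTask`, and `rename_helper` are all invented names, and the scratch list stands in for a child's working context.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Handoff:
    """What a child returns to the parent: the result, not the working state."""
    task: str
    outcome: str
    decisions: list[str] = field(default_factory=list)


@dataclass
class ChildTask:
    name: str
    work: Callable[[list[str]], Handoff]

    def run(self) -> Handoff:
        scratch: list[str] = []  # fresh context for this child only
        return self.work(scratch)  # scratch is dropped once the child returns


@dataclass
class ParentTask:
    context: list[str] = field(default_factory=list)

    def delegate(self, child: ChildTask) -> Handoff:
        handoff = child.run()
        # Only the handoff enters the parent's context.
        self.context.append(f"{handoff.task}: {handoff.outcome}")
        self.context.extend(f"decided: {d}" for d in handoff.decisions)
        return handoff


def rename_helper(scratch: list[str]) -> Handoff:
    """Hypothetical child job; the scratch notes stand in for its working context."""
    scratch.append("read utils.py and its importers")
    scratch.append("tried regex rename, rejected: misses aliased imports")
    scratch.append("applied AST-based rename")
    return Handoff(
        task="rename helper",
        outcome="renamed in 3 files",
        decisions=["AST-based rename, because regex misses aliased imports"],
    )


parent = ParentTask()
parent.delegate(ChildTask(name="rename", work=rename_helper))
print(parent.context)
```

The parent ends up with two lines: the outcome and the decision with its reason. The child's three scratch notes never reach it. Note that the handoff carries the reason along with the decision, which is exactly the detail a lossy summary tends to drop.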
Context management: compression vs. orchestration
| Dimension | Compression approach | Orchestration approach |
|---|---|---|
| Context handling | Summarize accumulated state into smaller form | Keep contexts separate, pass only results |
| Fidelity over iterations | Degrades with each pass | Preserved within each focused task |
| Token efficiency | Upfront savings, hidden recovery costs | Higher coordination overhead, lower rework |
| When it fails | Around iteration 8-12 as nuance is lost | When task boundaries are poorly defined |
| Best for | Short, single-focus tasks | Multi-step refactors, debugging sessions, cross-file changes |
The tradeoff
Orchestration requires upfront structure. You have to define what the child tasks are, what they return, and how the parent incorporates their results. That's more setup than "keep going and compress when you hit the limit."
For short tasks, compression might never hurt you. The photocopy problem surfaces in longer multi-step work: refactors that span multiple files, debugging sessions that iterate through hypotheses, feature builds that require coordinated changes across layers.
If your tasks regularly exceed ten iterations, the fidelity loss from compression will start showing up as rework.
Why this matters for your team
For a five-person engineering team running multiple parallel workstreams, the compounding effect is significant. Each developer hitting the photocopy loop loses an hour here, thirty minutes there. The model confidently proposes something that contradicts a constraint from six iterations ago. The developer catches it, re-explains, watches the context get compressed again.
Multiply that across a week. Across a sprint.
The shift is structural: use task orchestration to keep context clean instead of using compression to make dirty context fit. The parent task stays focused. The child tasks stay focused. The handoffs carry only what matters.
How Roo Code closes the loop on context management
Roo Code's Orchestrator mode addresses the photocopy problem directly. Instead of compressing context within a single runaway task, Orchestrator spawns focused subtasks that each operate with clean context and return only the necessary results to the parent.
This approach embodies the "close the loop" principle: the agent proposes changes, executes them, observes results, and iterates, all while maintaining context integrity through task boundaries rather than lossy compression.
With BYOK (bring your own key), teams control their token spend directly, without markup, which makes the orchestration overhead predictable and the rework savings measurable. The result: tokens go toward outcomes rather than toward recovering from compression artifacts.
When to notice the problem
If outputs start degrading around iteration ten, check whether context compression is the cause. The symptom: the model contradicts its own earlier reasoning, or re-proposes approaches that were already rejected.
The fix: break the long task into parent and child tasks. Let each child return a clean result. Keep the parent's context window free of accumulated working state.
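One way to catch the symptom early is to keep an explicit log of rejected approaches and check new proposals against it. The sketch below is an assumption for illustration: `find_reproposals` is a hypothetical helper that uses a naive case-insensitive substring match, and the log entries and proposal text are invented.

```python
def find_reproposals(rejected: dict[str, str], proposal: str) -> list[str]:
    """Return 'approach: reason' for each rejected approach named in the proposal.

    Naive case-insensitive substring match; an assumption for illustration.
    """
    text = proposal.lower()
    return [
        f"{approach}: {reason}"
        for approach, reason in rejected.items()
        if approach.lower() in text
    ]


# Hypothetical log of approaches ruled out earlier in the task.
rejected = {
    "global cache": "breaks test isolation",
    "regex rename": "misses aliased imports",
}

proposal = "We could simplify this by adding a global cache for parsed configs."
print(find_reproposals(rejected, proposal))
# ['global cache: breaks test isolation']
```

A match does not prove that compression caused the regression, but it is a cheap signal that the reasoning behind an earlier rejection is no longer in context.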
Context is not infinite. How you manage it matters more than how much you compress it.