Let the Agent Wait for CI Before Moving On
$ git push origin feature-branch
$ gh pr create --fill
Done? Not even close.
The premature celebration
You open a PR. You write a description. You request review. You move on to the next task.
Thirty minutes later, Slack lights up. CI failed. The tests you assumed would pass hit an edge case you forgot about. Now you're context-switching back, re-reading your own diff, trying to remember what you were thinking when you wrote it.
This is the gap between "I think I fixed it" and "the build actually proves I fixed it." Most agent workflows treat the PR submission as the finish line. The agent proposes changes, you approve, it opens the PR, and then it's done.
But submitting is not shipping. CI is the remaining verification step. And if the agent walks away before CI finishes, you're the one who has to pick up the pieces.
The watch loop
Claude Code introduced a pattern worth stealing: after submitting a PR, it doesn't stop. It runs a GitHub check watch function that polls every 10 seconds until CI completes.
"When it submitted the PR, it ran a GitHub check watch function and pinged every 10 seconds. Well, the checks the CI checks did their thing."
Hannes Rudolph
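A minimal version of that watch function can be sketched in shell. The function name, status values, and poll budget here are illustrative, not a fixed Claude Code API; in practice the status command would wrap something like `gh pr checks`, which reports check state for the current branch's PR.

```shell
# Sketch of a 10-second polling loop. `status_cmd` is any command that
# prints "pending", "success", or "failure" -- in a real setup it would
# wrap something like `gh pr checks`. Names and defaults are illustrative.
watch_checks() {
  local status_cmd=$1 interval=${2:-10} max_polls=${3:-60}
  local i status
  for i in $(seq 1 "$max_polls"); do
    status=$("$status_cmd")
    case "$status" in
      # Terminal states: report them and stop polling.
      success|failure) echo "$status"; return 0 ;;
    esac
    sleep "$interval"
  done
  # CI never reached a terminal state within the poll budget.
  echo "timeout"
  return 1
}
```

On "success" the agent confirms and moves on; on "failure" it reads the logs and starts fixing.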
If the tests pass, the agent confirms success and moves on. If the tests fail, it reads the failure output and immediately starts fixing them.
"Once we submit a fix, a PR, it waits, checks, it keeps checking it, and then once it's done, it goes, 'Hey, tests aren't passing.' And then it just starts fixing them."
Hannes Rudolph
No human intervention required. The agent incorporates the CI output into its next iteration, pushes a fix, and watches the tests again. The loop closes without you becoming the message bus between GitHub Actions and your coding tool.
Why this works
The pattern works because it treats CI output as first-class context. Most agent workflows either:
- Ignore CI entirely (assume the code works because the model said so)
- Require you to paste CI logs back into chat when something fails
Both approaches make you the intermediary. You're reading the failure, copying the relevant lines, explaining what went wrong, and hoping the model interprets your summary correctly.
The watch loop skips all of that. The agent reads the actual failure output, not your summary of it. It sees the exact assertion that failed, the exact line number, the exact stack trace.
"Sometimes it just gets the hint. Sonnet's pretty smart that way or Opus and it just submits the changes and watches the tests again."
Hannes Rudolph
The tradeoffs
This isn't free. Polling CI for 10 minutes while tests run costs tokens. If your CI pipeline takes 30 minutes, the agent is sitting there, waiting, burning context window on status checks.
The pattern works best when:
- CI is reasonably fast (under 10 minutes)
- Failures are actionable (not flaky tests or infrastructure issues)
- The agent has approval to push follow-up commits
If your CI takes an hour, you probably don't want an agent polling the whole time. If your tests are flaky, the agent might chase phantom failures. And if every push requires manual approval, the loop can't actually close.
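One way to enforce that budget is to wrap the blocking watch (for example `gh pr checks --watch`, a real GitHub CLI flag) in a hard time limit, assuming GNU coreutils' `timeout` is available. The wrapper name and the budget are illustrative.

```shell
# Run a blocking watch command under a hard time budget.
# `timeout` exits with status 124 when the budget is exhausted.
bounded_watch() {
  local budget=$1; shift
  timeout "$budget" "$@"
  local rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "watch timed out after ${budget}s; escalate instead of polling on"
    return 1
  fi
  return "$rc"
}
```

Called as, say, `bounded_watch 600 gh pr checks --watch`: a green run returns 0, a red run returns the checks' failure status, and a stalled pipeline surfaces as a timeout the agent can escalate rather than sit on.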
Why this matters for your workflow
For an engineer shipping 2-3 PRs a day, the context-switch tax adds up. Each time you have to come back to a failed CI run, you're re-loading the mental state of what you were trying to do. That's 10-15 minutes of re-orientation per failure.
An agent that watches CI and iterates on failures removes that re-orientation cost. When you come back, the PR is either green, or the agent has already made three attempts and flagged something it can't solve alone.
How Roo Code closes the loop on CI failures
Roo Code is an AI coding agent that closes the loop: it proposes diffs, runs commands and tests, and iterates based on results. The CI watch pattern extends this capability beyond local execution to the full CI pipeline.
With Roo Code's BYOK (bring your own key) model and configurable approvals, you control exactly how much autonomy the agent has. You can allowlist specific commands like git push and test runners, enabling the agent to iterate on CI failures without requiring manual approval at each step. The agent reads actual CI output directly rather than relying on your summary, which means it sees the exact failure context needed to propose accurate fixes.
Roo Code transforms the PR-to-merge workflow from a series of manual handoffs into a continuous feedback loop where the agent monitors CI, interprets failures, and iterates until the build passes or escalates issues it cannot resolve.
Traditional workflow vs. CI watch loop
| Dimension | Traditional workflow | CI watch loop |
|---|---|---|
| CI monitoring | Manual - you watch for notifications | Automated - agent polls until complete |
| Failure response | Context switch back, re-read diff, debug | Agent reads logs and iterates immediately |
| Human role | Message bus between CI and coding tool | Reviewer of final result or escalations |
| Time to green build | Includes re-orientation overhead per failure | Continuous iteration without context loss |
| Token cost | Lower (agent stops at PR) | Higher (agent polls and iterates) |
The implementation path
If you're using Roo Code with GitHub integration, this pattern is within reach. The key pieces:
- After PR submission, query the GitHub checks API
- Poll until checks complete (with a timeout)
- If checks fail, read the failure logs
- Iterate on the fix without prompting the user
The constraint that matters: approvals. If every command requires manual approval, the agent can't iterate autonomously. Consider allowlisting git push and test commands if you want the loop to close without intervention.
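Put together, the outer loop looks roughly like this sketch. `watch_cmd` stands in for the check-watching step (e.g. `gh pr checks --watch`) and `fix_cmd` for the agent's read-logs, edit, commit, and push step; both names and the attempt budget are assumptions for illustration, not a Roo Code API.

```shell
# Sketch of the fix-and-rewatch loop: watch CI, and on failure let the
# agent attempt a fix, up to a fixed budget before escalating to a human.
iterate_until_green() {
  local max_attempts=$1 watch_cmd=$2 fix_cmd=$3
  local attempt
  for attempt in $(seq 1 "$max_attempts"); do
    if "$watch_cmd"; then
      echo "green after $attempt attempt(s)"
      return 0
    fi
    # Failure: the agent reads the failing logs (e.g. via
    # `gh run view --log-failed`), edits, commits, and pushes --
    # all folded into fix_cmd for this sketch.
    "$fix_cmd" "$attempt"
  done
  echo "still red after $max_attempts attempts; escalating to a human"
  return 1
}
```

The escalation branch is the part that keeps this safe: when the budget runs out, the agent hands back a summary of what it tried instead of pushing indefinitely.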
Build the watch loop into your workflow. Let the agent wait for CI before moving on.
Stop being the human glue between PRs
Cloud Agents review code, catch issues, and suggest fixes before you open the diff. You review the results, not the process.