Evals for Orchestration, Not Just Code Generation
Why coding benchmarks miss the failure modes that matter in agentic systems, and how to build orchestration evals that measure task handoffs, feedback loops, and recovery behavior.
How teams use agents to iterate, review, and ship PRs with proof.
Why feedback loops, not model selection, determine success in agentic coding systems, and how the close-the-loop principle transforms AI-assisted development.
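A minimal sketch of the close-the-loop idea: run the tests, feed failures back to the model, repeat until green. `generate_patch` and `apply_patch` are hypothetical stand-ins for your model client and patch tooling.

```python
# Sketch: test failures become the next iteration's context.
import subprocess

def generate_patch(task: str, feedback: str) -> str:
    raise NotImplementedError  # your model call goes here (assumed)

def apply_patch(patch: str) -> None:
    raise NotImplementedError  # e.g. write files or `git apply` (assumed)

def run_tests() -> tuple[bool, str]:
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def close_the_loop(task: str, max_iters: int = 5) -> bool:
    feedback = ""
    for _ in range(max_iters):
        apply_patch(generate_patch(task, feedback))
        passed, output = run_tests()
        if passed:
            return True
        feedback = output  # failures become next-iteration context
    return False
```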
Why averaging UX for different user types fails, and how shipping two experiences with a shared core serves both vibe coders and tinkerers effectively.
Learn why AI coding assistants default to popular frameworks and how providing concrete code examples in your context window steers output toward your actual stack.
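One way to picture the technique, as a hedged sketch (file paths and prompt wording are illustrative):

```python
# Sketch: steer the assistant toward your actual stack by putting real
# snippets from your repo in the context window.
from pathlib import Path

def build_prompt(task: str, example_paths: list[str]) -> str:
    """Prepend concrete code examples so the model imitates them
    instead of defaulting to the most popular framework."""
    examples = "\n\n".join(
        f"# File: {p}\n{Path(p).read_text()}" for p in example_paths
    )
    return (
        "Follow the conventions shown in these examples from our codebase:\n\n"
        f"{examples}\n\nTask: {task}"
    )

prompt = build_prompt(
    "Add a healthcheck endpoint",
    ["src/api/routes/users.py"],  # illustrative path showing our router style
)
```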
Vibe coding delivers speed but creates blind spots around security. Learn why new builders accidentally expose API keys and how guardrails can catch mistakes before they become incidents.
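A guardrail can be as simple as a pre-commit scan of staged changes. A rough sketch, with illustrative (not exhaustive) credential patterns:

```python
# Sketch: block commits whose staged diff matches known key patterns.
import re
import subprocess
import sys

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                          # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                             # AWS access key IDs
    re.compile(r"(?i)api[_-]?key\s*=\s*['\"][^'\"]{16,}['\"]"),  # hard-coded api_key=...
]

def main() -> int:
    diff = subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True
    ).stdout
    hits = [p.pattern for p in SECRET_PATTERNS if p.search(diff)]
    if hits:
        print("Possible secrets in staged changes; aborting commit:", *hits, sep="\n  ")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Wired in as a git pre-commit hook, this catches the exposed-key mistake before it ever reaches a remote.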
Discover why XML tag-based tool definitions may outperform native function calling for AI agents, with practical guidance on when to swap formats for better reliability.
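The gist of the XML-tag approach, as a sketch: describe tools in the prompt and parse `<tool_call>` blocks out of the raw completion instead of relying on the provider's native function-calling API. The tag names here are assumptions.

```python
# Sketch: extract tool calls from XML tags in model output.
import re
import xml.etree.ElementTree as ET

TOOL_CALL_RE = re.compile(r"<tool_call>.*?</tool_call>", re.DOTALL)

def parse_tool_calls(completion: str) -> list[dict]:
    calls = []
    for block in TOOL_CALL_RE.findall(completion):
        root = ET.fromstring(block)
        args_el = root.find("args")
        calls.append({
            "name": root.findtext("name"),
            "args": {} if args_el is None else {p.tag: p.text for p in args_el},
        })
    return calls

completion = """I'll look that up.
<tool_call>
  <name>search_docs</name>
  <args><query>rate limits</query></args>
</tool_call>"""
print(parse_tool_calls(completion))
# [{'name': 'search_docs', 'args': {'query': 'rate limits'}}]
```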
AI memory features create hidden switching costs that undermine multi-model strategies. Learn how to evaluate memory as infrastructure and why portable memory matters for engineering teams.
How OpenRouter's team audited error shapes, message IDs, and tokenization patterns to ship an unreleased OpenAI model without fingerprinting the provider.
Rate limit errors on experimental AI models like Gemini 2.5 Pro aren't billing issues; they're supply problems. Learn why adding credits won't help and how to build fallback routing strategies.
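A sketch of the fallback-routing idea: when a 429 means "no capacity" rather than "no credits", retry briefly, then move down a ranked chain. `call_model` is a hypothetical stand-in for your provider client, and the chain is illustrative.

```python
# Sketch: ranked fallback chain for capacity-limited experimental models.
import time

FALLBACK_CHAIN = [
    "google/gemini-2.5-pro",    # experimental: scarce capacity
    "google/gemini-2.0-flash",  # stable fallback
    "openai/gpt-4o",            # cross-provider last resort
]

class RateLimited(Exception):
    pass

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError  # your provider client goes here (assumed)

def complete(prompt: str, retries_per_model: int = 2) -> str:
    for model in FALLBACK_CHAIN:
        for attempt in range(retries_per_model):
            try:
                return call_model(model, prompt)
            except RateLimited:
                time.sleep(2 ** attempt)  # brief backoff, then fall through
    raise RuntimeError("Every model in the fallback chain is saturated.")
```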
The model exists but the endpoint doesn't. Learn why announced context windows don't match API reality and how cluster economics block long-context inference.
Learn why local AI models excel at scoped code edits but fail at greenfield generation, and how to build a hybrid workflow that balances privacy requirements with agentic coding capability.
Stale code comments confuse AI coding agents by providing contradictory context. Learn how to audit comment freshness and practice context hygiene for better agent output.
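A heuristic sketch of a freshness audit: flag comments whose last edit predates the code directly below them by a wide margin, using `git blame`. The 180-day threshold and the `#` comment marker are assumptions.

```python
# Sketch: comments last touched long before the code beneath them
# are likely stale context for an agent.
import subprocess
from pathlib import Path

def blame_epochs(path: str) -> list[int]:
    """Unix timestamp of the last edit to each line in the file."""
    out = subprocess.run(
        ["git", "blame", "--line-porcelain", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return [int(l.split()[1]) for l in out.splitlines() if l.startswith("author-time ")]

def stale_comments(path: str, max_lag_days: int = 180) -> list[int]:
    epochs = blame_epochs(path)
    lines = Path(path).read_text().splitlines()
    stale = []
    for i in range(len(lines) - 1):
        if lines[i].lstrip().startswith("#"):  # Python comments; adjust per language
            if epochs[i + 1] - epochs[i] > max_lag_days * 86400:
                stale.append(i + 1)            # 1-indexed line of the stale comment
    return stale
```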
Cloud Agents review code, catch issues, and suggest fixes before you open the diff. You review the results, not the process.