TokenBrake
TokenBrake
A circuit breaker for AI coding agents — wraps Claude Code, Aider, Cursor CLI, and the OpenAI/Anthropic Agents SDKs, watches every run for "spinout" (token spend with no fresh file diffs or tool outputs), and auto-kills the agent before it blows your monthly budget.
Problem
Indie devs and small teams running long-horizon coding agents — Claude Code on a 4-hour refactor, Aider chewing through a migration, an OpenAI Agents SDK swarm doing overnight cleanup — wake up to $50–$200 bills and a workspace that wasn't touched. The agent looped on the same plan, kept re-reading the same files, or got stuck negotiating with a flaky tool, and nobody noticed for hours. Existing observability stacks (Langfuse, Helicone, OpenLLMetry) show the spend after the fact; the only real-time guardrail aimed at this problem is Microsoft Agent 365, which launched May 1, 2026 at $15/user/mo with SSO and SOC2 attached — a complete non-starter for the solo-dev market that is actually feeling the pain.
Target user
Solo devs, indie hackers, and 2–10-person product teams running autonomous coding agents on their own boxes or in their own CI. The job-to-be-done: let my agent run overnight without waking up to a useless bill, and tell me exactly where the tokens went.
MVP scope
- Single Go binary CLI:
tokenbrake -- <agent-command>transparently wraps any agent invocation and proxies its outbound LLM API calls. - SDK-agnostic interception: ships a local HTTPS proxy plus thin shims for the
anthropicandopenaiPython/Node SDKs (env-var swap, no code change inside the wrapped agent). - Spinout detector tracks three signals — (a) repeated tool-call signature within a sliding window, (b) bytes-changed in the workspace per 10K tokens, (c) growth in the set of unique files touched — and trips if all three flatline for N minutes.
- Hard spend cap with auto-
SIGTERMand a graceful "stop here, dump state" hook for agents that support cooperative interruption (Claude Code, Aider). - Post-run forensic report (markdown + JSON): timeline of tokens vs files-touched, top 5 wasted tool-call signatures, suggested config tweaks for next run.
- Local-only by default — proxy logs, run history, and reports stay in
~/.tokenbrake/, no signup required to use the CLI.
Monetization
Freemium. The MIT-licensed CLI is free forever — it's the community wedge and the moat. TokenBrake Cloud at ~$12/mo per user adds a hosted dashboard with shared team budgets, multi-machine roll-up, alert routing (Slack / Discord / PagerDuty), and historical trend lines across runs. Team tier at ~$40/mo per seat (5-seat min) adds policy enforcement (no model X above $Y per run, no bash tool calls in /etc/), audit log export, and per-project caps. Sits an order of magnitude below Microsoft Agent 365 ($15/user with full enterprise overhead) and is complementary to eval platforms like Patronus and Galileo — those grade outputs after the fact, TokenBrake intervenes in real time.
Why now
Microsoft launched Agent 365 on May 1, 2026 as a dedicated governance plane for enterprise AI agents at $15/user/mo — a clear public validation of the "agents need brakes" thesis from the biggest possible vendor, while leaving the entire SMB and indie segment uncovered. May 2026 community engagement on r/AI_Agents, Hacker News, and the n8n forums is dominated by horror stories of agents looping silently for hours, and both Anthropic and OpenAI shipped Agents SDK updates in Q1 2026 specifically targeting "uncontrolled execution" — confirming this is the #1 operational pain. As Claude Sonnet 4.8 and Kimi K2.6 push autonomous coding runs into the 4–8 hour range, the cost of a silent failure is roughly 10× what it was a year ago.
Risks & open questions
- SDK shim approach may break for users on bleeding-edge SDK versions or custom HTTP clients — need a pure-proxy fallback that requires zero code awareness of the wrapped agent.
- Spinout detection is heuristic — false positives that kill productive agents mid-thought will be a fast trust killer. Need adjustable sensitivity, a default "alert-only" dry-run mode for the first week of any user's adoption, and a per-agent profile (Claude Code's "plan" phase looks like a spinout to a naive detector).
- Anthropic, OpenAI, or Cursor could bundle a free entry-level version into their own dashboards and crush the wedge — defensive moat has to be cross-agent, cross-model coverage and a real community presence around the OSS CLI.
- Some agents (Claude Code, Aider) already have built-in spend caps — need to clearly position TokenBrake as the unifying layer that works regardless of which agent or model the user runs today.
- Privacy: the Cloud tier sees tool-call metadata, prompts, and partial responses. Teams in regulated industries will demand a self-hosted backend on day one — needs to be in the roadmap, not an enterprise-tier afterthought.
Next step
Promote to a weekly prototype: build a minimal tokenbrake CLI that wraps a single Anthropic SDK call, implements the three-signal spinout heuristic against a recorded Claude Code trace, and emits the post-run markdown report end-to-end.
Sources
- https://turion.ai/blog/ai-agent-platform-updates-may-2026/ — Documents Microsoft Agent 365's May 1, 2026 launch at $15/user/mo as enterprise agent governance, validating the "brakes for agents" thesis at the top of the market.
- https://machinelearningmastery.com/7-agentic-ai-trends-to-watch-in-2026/ — May 2026 agentic-AI trend roundup; production reliability, sandboxing, and operational maturity are the headline themes.
- https://coasty.ai/blog/ai-agent-benchmark-results-2026-who-actually-wins-20260507 — May 7, 2026 computer-use benchmark write-up finding that most agents silently overstate progress — precisely the failure mode TokenBrake catches.
- https://blog.n8n.io/we-need-re-learn-what-ai-agent-development-tools-are-in-2026/ — n8n team's May 2026 argument that agent dev tools must now treat monitoring, debugging, and cost control as first-class concerns rather than afterthoughts.
- https://medium.com/@visrow/the-biggest-ai-trends-and-tools-emerging-in-april-2026-8a491e6d546f — April 2026 trend roundup naming silent failures and cost control as the top pain points every team running agent swarms reports.