App Idea Cards 2026-06-02

PromptFence

A Manifest V3 extension that sits on your AI chat and agent tabs and acts as a client-side firewall — it shows every other extension that can read the conversation, and it intercepts page-origin scripts that try to inject prompts or forge clicks into your AI agent.

Problem

Agentic browsing pushed the trust boundary into the page itself, and attackers walked right in. The ClaudeBleed flaw (disclosed early May 2026) let any installed extension — even one with zero special permissions — send arbitrary prompts to Claude's in-browser agent through an unauthenticated message handler, and forge user approvals by spamming confirmation messages and mutating the DOM so Claude misread what it was approving. The exfiltration rode the victim's own authenticated AI session. It is not isolated: a Chrome flaw let extensions hijack the new Gemini-in-Chrome panel's camera, mic, and file access, and ~900,000 users installed AI "assistant" extensions that quietly siphoned their ChatGPT and DeepSeek conversations to a C2 server. The user has no in-browser signal answering the two questions that now matter every time they open an AI tab: who else can read this conversation, and is anything injecting instructions into my agent?

Target user

Two segments. First, privacy- and security-conscious AI power users who live in ChatGPT, Claude, Gemini, and Perplexity all day and paste sensitive code, contracts, and strategy into them. Second — the real wallet — security and IT teams staring at "shadow AI" extension sprawl: employees install AI sidebars from the Web Store with no vetting, and one bad update turns into IP exfiltration across thousands of seats. Today their only options are blunt enterprise extension allowlists or nothing. PromptFence gives both segments a per-tab, real-time view instead of a quarterly audit.

MVP scope

  • Manifest V3 extension with content scripts scoped to known AI domains (chatgpt.com, claude.ai, gemini.google.com, perplexity.ai, copilot.microsoft.com, chat.deepseek.com).
  • MAIN-world guard injected into AI tabs that hooks window.postMessage and incoming message events, and flags the ClaudeBleed pattern: arbitrary prompt strings forwarded to the agent, repeated confirmation/approval messages, and synthetic events hitting the input box with isTrusted === false (no real user gesture).
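  • "Who can see this chat" panel: uses chrome.management to enumerate installed extensions whose host permissions cover the current AI domain, notes which of them also hold broad API permissions, and assigns each a risk score (host breadth × sensitive APIs × age/popularity). chrome.management does not report install dates or user counts, so the age/popularity factor needs a separate data source.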
  • New-extension alert — notifies when a newly installed or newly updated extension gains access to an AI domain (the 900K-user supply-chain pattern), so a malicious update is caught the day it ships.
  • Event log of injection attempts and DOM-scraping heuristics (foreign nodes copying conversation text, unexpected clipboard reads), with per-extension block/mute toggles.
  • Local-only by design: no server, no account, no telemetry in the free tier — the irony of a security extension phoning home is the thing to avoid.
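
The risk score behind the "who can see this chat" panel could be sketched roughly as follows. This is a minimal illustration, not a settled design: the `InstalledExtension` shape mirrors only the subset of `chrome.management.ExtensionInfo` fields read here, and `SENSITIVE_APIS`, the 1-to-3 breadth scale, and the `breadth * (1 + sensitive)` formula are all assumptions made for the example. The age/popularity factor is omitted because the fields assumed here carry no such data.

```typescript
// Minimal shape: the subset of chrome.management.ExtensionInfo this sketch reads.
interface InstalledExtension {
  id: string;
  name: string;
  enabled: boolean;
  permissions: string[];     // API permissions, e.g. "scripting"
  hostPermissions: string[]; // match patterns, e.g. "https://claude.ai/*"
}

// Hypothetical set of API permissions treated as sensitive on an AI tab.
const SENSITIVE_APIS: ReadonlySet<string> = new Set([
  "scripting", "tabs", "webRequest", "cookies", "clipboardRead", "debugger",
]);

// Breadth of one match pattern relative to an https host (illustrative scale):
// 0 = no access, 1 = exact host, 2 = wildcard subdomain, 3 = every site.
function hostBreadth(pattern: string, host: string): number {
  if (pattern === "<all_urls>") return 3;
  const parts = /^(\*|https):\/\/([^/]+)\//.exec(pattern);
  if (parts === null) return 0;
  const patternHost = parts[2] ?? "";
  if (patternHost === "*") return 3;
  if (patternHost.startsWith("*.")) {
    const base = patternHost.slice(2);
    return host === base || host.endsWith("." + base) ? 2 : 0;
  }
  return patternHost === host ? 1 : 0;
}

interface Exposure {
  id: string;
  name: string;
  score: number;
}

// Rank enabled extensions that can reach `host`, highest score first.
function whoCanSee(installed: InstalledExtension[], host: string): Exposure[] {
  const exposures: Exposure[] = [];
  for (const ext of installed) {
    if (!ext.enabled) continue;
    const breadth = Math.max(0, ...ext.hostPermissions.map((p) => hostBreadth(p, host)));
    if (breadth === 0) continue; // no host access to this AI domain
    const sensitive = ext.permissions.filter((p) => SENSITIVE_APIS.has(p)).length;
    exposures.push({ id: ext.id, name: ext.name, score: breadth * (1 + sensitive) });
  }
  return exposures.sort((a, b) => b.score - a.score);
}
```

In the extension itself, the `installed` array would presumably come from `chrome.management.getAll()`, which requires the `management` permission; keeping the scoring as a pure function lets it be unit-tested without a browser.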

Monetization

Freemium. Free tier: single-device personal audit, the "who can see this chat" panel, and live injection alerts. PromptFence Pro at $5/month adds continuous background monitoring, multi-device sync, and full searchable event history. A Teams / Enterprise tier at $7/seat/month is the real business: managed policy push, allowlist enforcement, CSV/SIEM export of injection and access events, and a central dashboard so security teams can quantify shadow-AI extension risk across the fleet — the same buyers already paying for browser-security and DLP tooling.

Why now

The threat class went from theoretical to headline in a single quarter. LayerX disclosed ClaudeBleed in early May 2026; Anthropic shipped extension v1.0.70 on May 6 but left the externally_connectable message handler in place, only adding approval flows — so the underlying design pattern persists across vendors. Palo Alto Unit 42 published the Gemini-in-Chrome panel hijack in March 2026, Microsoft's security team documented malicious AI-assistant extensions harvesting LLM chat histories the same month, and roughly 900,000 users were caught by chat-stealing AI extensions earlier in 2026. Every major browser is racing to ship a built-in AI agent, each one widening the same trust boundary, and there is still no consumer-grade defensive primitive that lives in the tab where the damage happens.

Risks & open questions

  • MV3 isolation ceiling. A content script can't read another extension's JavaScript. PromptFence infers risk from observable DOM mutations, MAIN-world postMessage, and chrome.management metadata — strong heuristics, not ground truth. A passive scraper that only reads the conversation DOM (no mutation) is genuinely hard to catch and should be framed honestly.
  • Alert fatigue / false positives. Password managers, Grammarly, and translators legitimately touch the same DOM and message channels. Without tight default allowlisting the panel becomes noise and users disable it.
  • Platform risk cuts both ways. Vendors patch their own handlers (Anthropic already did), which narrows the headline exploit — so the durable product is general extension-permission hygiene for AI tabs, not a one-CVE point fix.
  • Trust paradox. A security extension that itself requests management and broad host permissions has to over-justify every grant and should ship open-source and auditable to earn installs.
  • No network-layer block. MV3 declarativeNetRequest can't intercept another extension's traffic; detection and warning are in-page only, and the remediation is "disable that extension," not "block that request."
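
The default allowlisting mentioned under alert fatigue might look something like the sketch below. It assumes an alert can be attributed to an extension ID at all, which the isolation-ceiling item suggests will not always be possible. The `PanelAlert` shape, the severity scale, and the IDs in `DEFAULT_ALLOWLIST` are hypothetical placeholders.

```typescript
// Hypothetical alert record produced by the panel's detectors.
interface PanelAlert {
  extensionId: string;
  flag: string;        // e.g. "dom-read", "approval-burst"
  severity: 1 | 2 | 3; // 3 = most severe (illustrative scale)
}

// Placeholder IDs standing in for vetted tools such as a password manager.
const DEFAULT_ALLOWLIST: ReadonlySet<string> = new Set([
  "example-password-manager-id",
  "example-grammar-checker-id",
]);

// Hide low-severity alerts from allowlisted extensions, but never hide
// alerts at or above `alwaysShowFrom`, whoever raised them.
function alertsToShow(
  alerts: PanelAlert[],
  allowlist: ReadonlySet<string>,
  alwaysShowFrom: 1 | 2 | 3 = 3,
): PanelAlert[] {
  return alerts.filter(
    (a) => !allowlist.has(a.extensionId) || a.severity >= alwaysShowFrom,
  );
}
```

Letting high-severity alerts through even for allowlisted extensions is a deliberate choice in this sketch: a trusted extension that turns malicious after an update should not be silenced by its earlier good standing.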

Next step

Build a 24-hour proof-of-concept against claude.ai and chatgpt.com: inject a MAIN-world listener that logs every postMessage and every isTrusted === false event reaching the AI input box, plus a chrome.management enumeration of extensions with host access to the current domain. Validate it by writing a benign test extension that reproduces the ClaudeBleed message pattern, and prove two UX moments work — the "who can see this chat" panel renders the test extension as high-risk, and a live "injection attempt blocked" alert fires the instant the test script forwards a prompt. If both land cleanly, promote to the weekly prototype.
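
The detection half of that proof-of-concept could start from a pure classifier like the one below, fed by whatever the MAIN-world listener records. This is a sketch under stated assumptions: the `Observation` shape, the `APPROVAL_PATTERN` regex, and the burst thresholds (three approvals within two seconds) are invented for illustration and would need tuning against the real ClaudeBleed message pattern.

```typescript
// One observation recorded by the (assumed) MAIN-world listener.
interface Observation {
  kind: "message" | "input"; // window "message" event vs. event on the prompt box
  isTrusted: boolean;        // copied from Event.isTrusted
  text: string;              // stringified payload or inserted text
  at: number;                // timestamp in milliseconds, ascending
}

type Flag = "synthetic-input" | "approval-burst";

// Hypothetical pattern for confirmation-style payloads; real agents will differ.
const APPROVAL_PATTERN = /\b(approve|confirm|allow)\b/i;

// Scan a time-ordered log and report which suspicious patterns occurred.
function detectFlags(log: Observation[], burstSize = 3, windowMs = 2000): Flag[] {
  const flags = new Set<Flag>();
  let recentApprovals: number[] = [];
  for (const obs of log) {
    // Input with no real user gesture behind it.
    if (obs.kind === "input" && !obs.isTrusted) flags.add("synthetic-input");
    // Sliding window over approval-like messages.
    if (obs.kind === "message" && APPROVAL_PATTERN.test(obs.text)) {
      recentApprovals = recentApprovals.filter((t) => obs.at - t <= windowMs);
      recentApprovals.push(obs.at);
      if (recentApprovals.length >= burstSize) flags.add("approval-burst");
    }
  }
  return Array.from(flags);
}
```

Separating classification from event capture means the benign test extension can be replayed as a recorded log, so the heuristics can be checked without loading a browser.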
