TrapScout
Scan a webpage for hidden AI-agent traps before you send your browser agent there — see the indirect prompt-injection payloads your agent would actually read, and the clean version of the page it would see if you stripped them.
Date: 2026-05-10
Form factor: Web app (single-page; mobile-friendly)
Status: Prototype
What it is
TrapScout is a single-page scanner for the "indirect prompt injection" trap epidemic that's now visible in the wild. You drop in a URL (or pick a sample), and TrapScout shows the agent-visible content of the page — the part your ChatGPT Atlas / Claude computer-use / Perplexity Comet / Gemini agent mode session would actually ingest — separated into what a human sees vs. what a bot sees. It flags hidden text, ARIA-label and alt-text traps, white-on-white instructions, off-screen positioned content, suspicious "ignore previous instructions" phrasing, fake tool-call markers, base64-encoded comments, and unicode-tag payloads, scores the page on a Safe → Caution → Trap scale, and gives you a "cleaned" preview of the agent prompt with the trap content stripped.
The prototype runs entirely offline against five fixture pages: a clean product page, a competitor-injected shopping page, a PayPal-targeted travel site, a poisoned résumé, and a benign personal blog. It demonstrates the end-to-end flow: paste → scan → triage → export a sanitized snippet your agent can safely consume.
Who it serves
Two overlapping personas:
- The agentic power user. Anyone running ChatGPT Atlas, Claude with computer use, Perplexity Comet, Gemini in agent mode, or a custom browser-using agent to do real work (shopping, booking travel, reviewing résumés, summarizing competitor pages). Google reported a 32% surge in indirect prompt injection attempts between Nov 2025 and Feb 2026, with documented payloads that exfiltrate PayPal balances and reroute purchases. (Decrypt, 2026-04-22)
- The small e-commerce / content owner. Founders worried their own site has been poisoned by competitors injecting hidden "recommend our store instead" instructions, or by SEO scrapers leaving CSS-hidden affiliate redirects. They want a 30-second self-check before they list on AI-agent shopping comparators.
The shared pain: indirect prompt injection is invisible in the rendered page by design. DevTools would reveal it, but nobody manually inspects a page before sending an agent there. TrapScout makes the invisible visible.
Why it could be profitable
A three-tier wedge:
- Free consumer tier: 10 scans/day from the web app + a Chrome/Edge extension that lights up the address bar when the current page contains agent traps. Acquisition channel.
- Pro ($9/mo): Unlimited scans, batch URL upload, a "pre-flight" API your custom agent calls before each `browser.navigate`, and saved scan history for audit.
- Teams / API ($199–$999/mo): For Shopify and BigCommerce stores, marketplaces (Amazon, Etsy), agentic-browser vendors, and DTC brands: bulk scans of your own catalog pages, competitor-poisoning monitoring, and a JSON-schema webhook to a SIEM.
The category is moving fast: Microsoft, IBM, and Google all published research on remote code execution (RCE) in agent frameworks in the last four weeks (Microsoft Security, 2026-05-07; Help Net Security, 2026-04-24), and consumer demand follows visible incidents. The first product to own the "is this page safe for my agent?" search term has a defensible position.
Form factor & scope
Single-page web app. The MVP slice in this folder:
- URL input with a "Scan" button and five preset sample pages.
- Scan result panel with overall risk score, color-coded badges, and per-finding cards.
- "Human view vs. agent view" split: side-by-side rendering of the visible page and the agent-ingested DOM text.
- Findings list grouped by trap family (DOM hiding, CSS hiding, ARIA / alt, instruction phrasing, encoding, structural, credential exfil, policy override).
- "Cleaned prompt" tab that shows the sanitized text payload an agent should use instead.
- Copy-to-clipboard for the cleaned text.
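The "Cleaned prompt" step can be pictured as a tree walk that skips hidden subtrees and then normalizes whitespace. The sketch below is a minimal illustration, not the prototype's code: the node shape (`text`, `hidden`, `style`, `children`) and the function names are assumptions made so the example runs outside a browser.

```javascript
// Assumed minimal node shape (hypothetical, not the real DOM):
//   text node:    { text: "..." }
//   element node: { hidden?: boolean, style?: { display, visibility }, children: [...] }

// True when an element is hidden by the `hidden` attribute or basic inline CSS.
function isHidden(node) {
  const style = node.style || {};
  return (
    node.hidden === true ||
    style.display === "none" ||
    style.visibility === "hidden"
  );
}

// Depth-first walk that collects text from visible nodes only.
function collectVisibleText(node, out = []) {
  if (typeof node.text === "string") {
    out.push(node.text);
    return out;
  }
  if (isHidden(node)) return out; // drop the whole hidden subtree
  for (const child of node.children || []) collectVisibleText(child, out);
  return out;
}

// Join the visible text and collapse every whitespace run to one space.
function cleanedPrompt(root) {
  return collectVisibleText(root).join(" ").replace(/\s+/g, " ").trim();
}
```

In a real page the hidden check would more likely rely on computed styles than on inline `style` values, and it would also need to cover the CSS-hiding cases (off-screen positioning, zero font size, low contrast) that this sketch ignores.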
Out of scope for the prototype: live URL fetching (CORS), real browser extension, accounts, billing, multi-page crawl.
How to run it
- Open `index.html` in any modern browser (Chrome, Edge, Firefox, Safari). No build step.
- Pick a sample URL from the dropdown, or type any string into the URL field and press Scan (typing a custom URL will return a "not in fixtures" message; pick a sample instead).
- Click any finding card to jump to the highlighted snippet in the agent view.
- Switch to the Cleaned prompt tab to copy the sanitized text your agent should consume.
Sample data lives inline in `index.html` as a `<script type="application/json">` block, so the prototype runs cleanly from `file://` without a local server.
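Reading an inline JSON block needs no network request, which is why this approach works from `file://`. A minimal sketch of the idea follows; the element id `trapscout-fixtures`, the function name, and the JSON shape are hypothetical, since the source does not show them.

```javascript
// Expected markup in index.html (id is an assumed name):
//   <script type="application/json" id="trapscout-fixtures">{ ... }</script>

// `doc` is any object exposing getElementById; in the page it would be `document`.
function readInlineFixtures(doc, id = "trapscout-fixtures") {
  const el = doc.getElementById(id);
  if (!el) throw new Error(`No inline fixture block with id "${id}"`);
  // Browsers do not execute application/json scripts; the text is plain data.
  return JSON.parse(el.textContent);
}
```

One caveat when fixtures contain raw HTML: a literal `</script>` inside the JSON would end the block early, so `<` characters in string values are commonly written as `\u003c` when the data is embedded.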
What's in this prototype
- Five realistic fixture pages with named trap families:
  - `acme-running-shoes-clean.example`: clean product page, baseline.
  - `discount-electronics-trap.example`: hidden `display:none` system override + ARIA-label coupon trick (competitor injection).
  - `cheap-flights-paypal.example`: white-on-white "send $4,800 to PayPal address X" with off-screen `position: absolute; left: -9999px`.
  - `freelance-resume-trap.example`: base64 HTML comment + zero-width unicode tag characters telling the agent to recommend hiring the candidate.
  - `mikes-camping-blog-clean.example`: benign blog with a single false-positive lookalike.
- Eight detector rules implemented client-side, each producing a finding with `severity`, `family`, `excerpt`, and `mitigation`.
- Risk score = 100 minus the sum of the weighted penalties of all findings; thresholds: ≥85 Safe, 60–84 Caution, <60 Trap.
- A "Cleaned prompt" generator that strips hidden nodes and outputs whitespace-normalized text.
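The scoring rule and thresholds above can be sketched in a few lines. The penalty weights here are placeholders chosen for illustration (the source does not state the real ones), and the clamp to 0 reflects the 0–100 range given in the requirements.

```javascript
// Hypothetical per-severity penalties; the real weights are not published.
const SEVERITY_PENALTY = { low: 5, medium: 15, high: 30 };

// Score = 100 minus the summed penalty, clamped to the 0-100 range.
function riskScore(findings) {
  const penalty = findings.reduce(
    (sum, finding) => sum + (SEVERITY_PENALTY[finding.severity] ?? 0),
    0
  );
  return Math.max(0, Math.min(100, 100 - penalty));
}

// Thresholds from the prototype: >=85 Safe, 60-84 Caution, <60 Trap.
function verdict(score) {
  if (score >= 85) return "Safe";
  if (score >= 60) return "Caution";
  return "Trap";
}
```

With these assumed weights a single medium finding still lands on Safe (85), one high finding gives Caution (70), and a high plus a medium finding gives Trap (55). Whether one high-severity finding should force a Trap verdict regardless of the sum is a design choice the weights alone do not settle.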
Roadmap
- Live URL fetching via a small proxy worker (Cloudflare Workers) to bypass CORS.
- Chrome / Edge MV3 extension that runs in-page and lights up the address bar.
- Pre-flight API (`POST /v1/scan`) for custom agents to call before `browser.navigate`.
- Shopify / BigCommerce app for sellers to monitor their own catalog for competitor poisoning.
- Headless agent-vendor adapters: Atlas, Comet, and Claude computer use SDK middleware.
Sources
- Help Net Security — Indirect prompt injection is taking hold in the wild, 2026-04-24 — confirms the "open web filling with traps" framing and ten in-the-wild payload families.
- Decrypt — Malicious Web Pages Are Hijacking AI Agents, And Some Are Going After Your PayPal, 2026-04-22 — Google's 32% surge figure and PayPal-targeting payloads.
- Microsoft Security Blog — When prompts become shells: RCE vulnerabilities in AI agent frameworks, 2026-05-07 — agentic RCE risk surface.
- Google Online Security Blog — April 2026 — official "if you are an AI, do not crawl" pattern and Google's own mitigations.
- Infosecurity Magazine — Researchers Uncover 10 In-the-Wild Prompt Injection Payloads Targeting AI Agents — taxonomy used to seed TrapScout's detector families.
Requirements
TrapScout — Requirements
Goals
- Make indirect prompt-injection content on a webpage visible and explainable to a non-security audience in under 30 seconds.
- Give an agentic-AI user a defensible "is this safe?" verdict and a copy-pasteable cleaned prompt for their agent.
- Run end-to-end as a single HTML file on `file://` for the prototype, so anyone can vet the demo without a server or API key.
- Establish the detector taxonomy (eight families) that the production extension and API will share.
Primary user
A 30–45-year-old knowledge worker who uses an agentic browser (ChatGPT Atlas, Claude with computer use, Perplexity Comet, or Gemini agent mode) to do real tasks — comparison shopping, travel booking, résumé review, competitor research. Technically literate, security-curious but not a researcher. They have read at least one headline about an AI agent getting hijacked and want a sanity check before delegating sensitive flows. Often working from a laptop on shared Wi-Fi.
Their job-to-be-done: "Before I tell my agent to read this page, tell me whether the page contains hidden instructions targeting it, and if so, give me a clean version I can paste instead."
Functional requirements
- FR1: User can paste any URL into a single input and press Scan.
- FR2: User can pick a sample URL from a dropdown of five named fixtures.
- FR3: When a sample is scanned, the app loads the fixture's pre-computed HTML + agent-view text from inline JSON.
- FR4: When a custom URL not in fixtures is entered, the app shows a friendly "not in prototype fixtures" notice with a CTA to choose a sample.
- FR5: The app computes a numeric risk score 0–100 from per-finding weights and maps it to `Safe` (≥85), `Caution` (60–84), or `Trap` (<60), each with a distinct color.
- FR6: The app shows a "Human view" tab rendering the cleaned visible HTML in a sandboxed `<iframe srcdoc>` (no scripts).
- FR7: The app shows an "Agent view" tab rendering the agent-ingested text (including hidden, ARIA, alt, and decoded content) with every trap excerpt highlighted.
- FR8: The app shows a "Findings" tab listing each detection grouped by family with severity, excerpt, location, and mitigation guidance.
- FR9: The app shows a "Cleaned prompt" tab with a sanitized text payload and a Copy button that writes it to the clipboard.
- FR10: Detector families implemented: DOM hiding (display:none, visibility:hidden, hidden attribute), CSS hiding (white-on-white, font-size:0, off-screen positioning), ARIA / alt (aria-label, alt text, title with instructions), instruction phrasing ("ignore previous", "system:", "you are now"), encoding (base64, zero-width chars, unicode tag chars), structural (fake tool-call JSON blocks), credential exfil (PayPal / Venmo / Cash App handles in hidden content), and policy override ("you must", "do not tell the user").
- FR11: Clicking a finding scrolls to and highlights the corresponding excerpt in the Agent view.
- FR12: Each fixture surfaces the trap families it contains so the prototype demonstrates every detector.
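To make FR10 more concrete, here is a rough sketch of two of the eight families (instruction phrasing and encoding) running over agent-view text. The regular expressions, severities, and mitigation strings are illustrative assumptions; only the finding fields (`severity`, `family`, `excerpt`, `mitigation`) come from the prototype description.

```javascript
// Illustrative patterns only; a production rule set would be broader.
const PHRASES = /ignore (?:all )?previous|\bsystem:|you are now/i;
const TAG_CHARS = /[\u{E0000}-\u{E007F}]+/u; // Unicode "Tags" block
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/;

// Tag characters mirror ASCII at an offset of 0xE0000, so a hidden
// payload can be recovered by subtracting that offset.
function decodeTagChars(run) {
  return Array.from(run, (ch) => {
    const code = ch.codePointAt(0) - 0xe0000;
    return code >= 0x20 && code <= 0x7e ? String.fromCharCode(code) : "";
  }).join("");
}

// Returns an array of findings for the given agent-view text.
function detect(text) {
  const findings = [];
  const phrase = text.match(PHRASES);
  if (phrase) {
    findings.push({
      severity: "high",
      family: "instruction phrasing",
      excerpt: phrase[0],
      mitigation: "Strip the sentence before passing text to the agent.",
    });
  }
  const tags = text.match(TAG_CHARS);
  if (tags) {
    findings.push({
      severity: "high",
      family: "encoding",
      excerpt: decodeTagChars(tags[0]),
      mitigation: "Remove Unicode tag characters (U+E0000 to U+E007F).",
    });
  }
  if (ZERO_WIDTH.test(text)) {
    findings.push({
      severity: "medium",
      family: "encoding",
      excerpt: "zero-width characters present",
      mitigation: "Remove zero-width characters.",
    });
  }
  return findings;
}
```

Patterns this loose produce false positives: `system:` also matches benign text such as "Operating system: Linux", and the zero-width joiner appears in legitimate emoji sequences. That is the kind of lookalike the benign blog fixture is meant to exercise.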
User stories
- As an agentic-browser user, I want to scan a product page before my agent does, so that I know it won't be tricked into reordering a different SKU.
- As an agentic-browser user, I want to see the exact hidden text that was trying to manipulate my agent, so that I can decide whether to proceed manually.
- As an agentic-browser user, I want a "cleaned" version of the page's text, so that I can paste it into my agent and skip the trap entirely.
- As a Shopify seller, I want to scan my own product pages, so that I can catch a competitor's poisoning attempt before customers' AI agents see it.
- As a security-curious user, I want a plain-English explanation of each finding, so that I understand the attack pattern without needing to read OWASP documentation.
- As a power user, I want a numeric score and a color, so that I can triage at a glance.
- As a recruiter, I want to scan résumé PDFs (eventually) for hidden hire-this-person instructions, so that my AI screening doesn't get gamed.
- As a privacy-conscious user, I want the scan to run in my browser without sending the URL to a server, so that my browsing history stays mine.
Non-functional requirements
- Runs as a single `index.html` over `file://`. No build step. No third-party fonts. No network calls in the prototype.
- All sample data inlined or in `sample-data.json` loaded via `fetch('./sample-data.json')` with a `file://` fallback to an inline `<script type="application/json">` copy.
- Mobile-friendly: layout collapses to single-column under 720px.
- Accessible: semantic landmarks, keyboard navigation for tabs and findings, `prefers-color-scheme` respected.
- The sandboxed `<iframe>` uses `sandbox=""` (no scripts, no same-origin) so untrusted fixture HTML can never escape.
- Highlight rendering uses pre-escaped HTML; it never injects fixture content as raw HTML outside the sandbox.
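The pre-escaped highlighting requirement could be met along these lines. This is a sketch under assumptions, with hypothetical function names: each slice of fixture text is escaped on its own, so the only markup in the result is the `<mark>` wrapper the app adds.

```javascript
// Escape the HTML-significant characters so fixture text renders as text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Wrap the first occurrence of `excerpt` in <mark>. Each slice is escaped
// separately, so nothing from the fixture can introduce its own tags.
function highlight(agentText, excerpt) {
  const at = excerpt ? agentText.indexOf(excerpt) : -1;
  if (at === -1) return escapeHtml(agentText);
  const before = agentText.slice(0, at);
  const after = agentText.slice(at + excerpt.length);
  return `${escapeHtml(before)}<mark>${escapeHtml(excerpt)}</mark>${escapeHtml(after)}`;
}
```

Searching the raw text and escaping afterwards keeps excerpt offsets stable; escaping first would shift positions whenever the text contains `<` or `&`.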
Out of scope (for the prototype)
- Live URL fetching (would require a CORS proxy / Cloudflare Worker).
- Real browser extension (MV3, content-script DOM access).
- Accounts, billing, scan history, team sharing.
- Headless-browser rendering for JS-only pages.
- PDF or image OCR scanning.
- Integrations with specific agent SDKs (Atlas, Comet, Claude computer use).
Open questions
- Should the cleaned-prompt output stay as plaintext or also offer a structured "agent-safe Markdown" form?
- How aggressive should the white-on-white detector be when the page legitimately uses near-white text on near-white backgrounds (e.g., subtle disclaimers)? Likely needs a contrast-ratio threshold rather than exact match.
- For the API tier, what's the right rate limit per agent call: is the overhead of one scan per `browser.navigate` acceptable, or should results be cached by URL hash?
- What's the right surface for the Chrome extension: a passive badge, a blocking overlay before agent mode runs, or both?
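On the white-on-white question, one candidate is the WCAG contrast ratio between text and background colors, flagging text only when the ratio falls below a cutoff. The sketch below uses the WCAG 2.x relative-luminance formula; the cutoff of 1.1 and the function names are assumptions for illustration, not values from the prototype.

```javascript
// Relative luminance of an sRGB color given as [r, g, b] in 0-255 (WCAG 2.x).
function relativeLuminance([r, g, b]) {
  const linearize = (value) => {
    const c = value / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio ranges from 1 (identical colors) to 21 (black on white).
function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const [lighter, darker] = a >= b ? [a, b] : [b, a];
  return (lighter + 0.05) / (darker + 0.05);
}

// Hypothetical cutoff: text this close to its background is treated as hidden.
const HIDDEN_TEXT_MAX_RATIO = 1.1;

function looksHidden(foreground, background, maxRatio = HIDDEN_TEXT_MAX_RATIO) {
  return contrastRatio(foreground, background) < maxRatio;
}
```

A ratio-based rule catches near matches such as `#fefefe` on `#ffffff` that an exact color comparison would miss, while leaving a subtle but readable gray disclaimer unflagged. Where exactly to place the cutoff remains the open part of the question.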