Shannon Lite — an autonomous AI pentester that runs real exploits
Keygraph's open pentester reads source code white-box and fires real payloads — SQLi, XSS, SSRF, auth bypass with PoCs. 96.15% on the XBOW benchmark.

Shannon Lite passed 100 of 104 challenges on the XBOW benchmark, a 96.15% score that sets a new state of the art (SOTA) in autonomous pentesting. Released by Keygraph under AGPL-3.0, the open-source variant collected 1,400 stars in its first 24 hours, has since reached 31K in total, and took the top spot on GitHub trending.
The pitch: a full pentest in 30 minutes to 1.5 hours, averaging $50 in API spend.
Why Keygraph built it
Existing pentest tools split into two camps. On one side, interactive tools driven by humans — Burp Suite, OWASP ZAP. On the other, signature-based scanners like Nuclei. The first is expensive and slow; the second lacks payload diversity.
Shannon plants an LLM in between. White-box (source-access) mode reads the codebase, infers attack vectors, and fires actual payloads to produce PoCs. Not simulated — executed.
[IMG#1]
The four-stage pipeline
recon → parallel analysis → parallel exploit → report
- Recon — auto-map target routes, auth flows, external API calls.
- Parallel analysis — multiple LLM instances scan code patterns concurrently.
- Parallel exploit — fires real payloads in an isolated environment.
- Report — packages successful exploits with PoC video, payload, and reproduction.
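The four stages above can be sketched as a small async pipeline. This is only an illustration of the fan-out/fan-in shape, not Shannon's actual code: the `Finding` and `ExploitResult` types, `runPipeline`, and the stub stage functions are all hypothetical names invented here, and the stubs return canned data instead of calling an LLM or sending payloads.

```typescript
// Hypothetical types and stage signatures; Shannon's real internals may differ.
interface Finding {
  route: string;
  vulnClass: string;
}

interface ExploitResult {
  finding: Finding;
  succeeded: boolean;
}

type Recon = (target: string) => Promise<string[]>;
type Analyzer = (route: string) => Promise<Finding[]>;
type Exploiter = (finding: Finding) => Promise<ExploitResult>;

// recon -> parallel analysis -> parallel exploit -> report
async function runPipeline(
  target: string,
  recon: Recon,
  analyze: Analyzer,
  exploit: Exploiter
): Promise<ExploitResult[]> {
  // Stage 1: map the target's routes.
  const routes = await recon(target);

  // Stage 2: analyze every route concurrently, then merge the findings.
  const perRoute = await Promise.all(routes.map((route) => analyze(route)));
  const findings: Finding[] = [];
  for (const group of perRoute) {
    findings.push(...group);
  }

  // Stage 3: attempt every candidate finding concurrently.
  const results = await Promise.all(findings.map((f) => exploit(f)));

  // Stage 4: the report keeps only exploits that actually succeeded.
  return results.filter((r) => r.succeeded);
}

// Stub stages with canned data, for demonstration only.
const stubRecon: Recon = async () => ["/login", "/search"];
const stubAnalyze: Analyzer = async (route) =>
  route === "/search" ? [{ route, vulnClass: "SQLi" }] : [];
const stubExploit: Exploiter = async (finding) => ({
  finding,
  succeeded: finding.vulnClass === "SQLi",
});
```

Calling `runPipeline("https://your.app", stubRecon, stubAnalyze, stubExploit)` resolves to the confirmed results only. Filtering at the report stage mirrors the article's point that the report covers executed exploits, not suspected ones.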
Tech stack
- Language: TypeScript / Node.js
- Bundler: tsdown (ESM)
- Isolation: Docker worker image (~1GB)
- LLM: Claude-API-tuned
- Run: a single command, npx @keygraph/shannon or docker pull
Claude-tuning is the key design call. Other LLMs work, but exploit-code generation and tool-call accuracy are most stable on Claude 4.5 Opus per the README.
Repo comparison
| Repo | Stars | License | Position |
|---|---|---|---|
| KeygraphHQ/shannon | 31K | AGPL-3.0 | Autonomous AI pentester, white-box |
| xbow-engineering/xbow | 18K | Apache-2.0 | Autonomous pentester, owns benchmark |
| ProjectDiscovery/nuclei | 21K | MIT | Signature scanner |
| OWASP ZAP | 13K | Apache-2.0 | Interactive + auto scanner |
Shannon now leads the "AI autonomous pentester" category by stars. SOTA on XBOW's own benchmark adds credibility.
[IMG#2]
Why now — ecosystem context
Four trends converge. (1) Claude 4.5 Opus crossed the threshold for white-box exploit reasoning. (2) MCP makes plugging vuln databases and security tools into LLMs straightforward. (3) Commercial autonomous pentest services (XBOW, PortSwigger AI) validated demand, leaving room for an OSS option. (4) AGPL's strong copyleft deters companies from rebranding the code as a closed SaaS, which builds trust in the OSS edition.
Top Hacker News comment: "Could be the next Recon-ng/Nmap baseline."
Getting started
# Node 18+ required
npx @keygraph/shannon
# or Docker
docker pull keygraphhq/shannon-worker
docker run keygraphhq/shannon-worker --target https://your.app
Common gotchas — set ANTHROPIC_API_KEY first. No free-tier path; expect ~$50 per run. Only target assets you own or have explicit written authorization to test.
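The gotchas above lend themselves to a preflight check before any run. The sketch below is a hypothetical wrapper, not a feature of Shannon's CLI: `preflight`, `hostOf`, and the `authorizedHosts` allowlist are assumptions introduced for illustration. It only verifies that the API key is present and that the target host is on a list you maintain of assets you are authorized to test.

```typescript
// Hypothetical preflight helper; not part of Shannon's actual CLI.
type Env = Record<string, string | undefined>;

// Extract the lowercase host from an http(s) URL, or null if it is not one.
function hostOf(target: string): string | null {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(target);
  return match?.[1]?.toLowerCase() ?? null;
}

// Return a list of human-readable problems; an empty list means "ok to run".
function preflight(env: Env, target: string, authorizedHosts: string[]): string[] {
  const problems: string[] = [];

  // The article notes that ANTHROPIC_API_KEY must be set first.
  const key = env["ANTHROPIC_API_KEY"];
  if (key === undefined || key.trim() === "") {
    problems.push("ANTHROPIC_API_KEY is not set");
  }

  // Only allow targets on an explicit allowlist of authorized hosts.
  const host = hostOf(target);
  if (host === null) {
    problems.push(`target is not an http(s) URL: ${target}`);
  } else if (!authorizedHosts.includes(host)) {
    problems.push(`host is not in the authorized scope: ${host}`);
  }

  return problems;
}
```

In a Node script one would pass `process.env` as the first argument and refuse to start the run if the returned list is non-empty.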
Limits and outlook
Two current limits. (1) Roughly 8% false positives in real-world deployments — better than Burp's ~12% but not zero. (2) AGPL friction for in-product embedding — Shannon Pro (SaaS) is the alternative.
Outlook — next six months likely brings (a) non-Claude LLM adapters, (b) better black-box (no source) mode, (c) MCP integrations into SIEM and ticketing. The team has a defensive (blue-team) variant on the roadmap.
[IMG#3]
3-Line Summary
- Shannon Lite hit XBOW 96.15% — new autonomous-pentest SOTA, 1,400 stars day one.
- Four-stage pipeline (recon, analysis, exploit, report) wraps a full pentest in 30-90 min.
- AGPL keeps SaaS rebranders out; Pro SaaS plus OSS sustains the business model.
References
- GitHub — KeygraphHQ/shannon
- Keygraph — official site
- XBOW benchmark — GitHub
- Nuclei — scanner repo
- Hacker News — Shannon discussion