Shannon Lite — an autonomous AI pentester that runs real exploits

96.15%

Shannon Lite passed 100 of 104 challenges on the XBOW benchmark, a 96.15% score that sets a new state of the art (SOTA) in autonomous pentesting. Released by Keygraph under AGPL-3.0, the open-source edition collected 1,400 GitHub stars in its first 24 hours and has since reached 31K, taking the top spot on GitHub trending.

The pitch: a full pentest in 30 minutes to 1.5 hours, averaging $50 in API spend.

Why Keygraph built it

Existing pentest tools split into two camps. On one side are interactive tools driven by humans, such as Burp Suite and OWASP ZAP. On the other are signature-based scanners like Nuclei. The first camp is expensive and slow; the second lacks payload diversity.

Shannon puts an LLM between the two. In white-box (source-access) mode it reads the codebase, infers attack vectors, and fires actual payloads to produce proofs of concept (PoCs). The exploits are executed, not simulated.

[IMG#1]

The four-stage pipeline

recon → parallel analysis → parallel exploit → report

Recon — auto-map target routes, auth flows, external API calls.
Parallel analysis — multiple LLM instances scan code patterns concurrently.
Parallel exploit — fires real payloads in an isolated environment.
Report — packages successful exploits with PoC video, payload, and reproduction.
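The four stages can be sketched as a fan-out/fan-in flow. The following is a minimal illustration under stated assumptions, not Shannon's actual code: the type names, the Stages interface, and the stub stage functions are all hypothetical, and the stubs return canned data instead of touching a network.

```typescript
// Hypothetical types for illustration; none of these names come from Shannon's codebase.
type Route = { path: string; method: string };
type Hypothesis = { route: Route; vulnClass: string };
type ExploitResult = { hypothesis: Hypothesis; succeeded: boolean; payload: string };
type Report = { target: string; confirmed: ExploitResult[] };

interface Stages {
  recon(target: string): Promise<Route[]>;
  analyze(route: Route): Promise<Hypothesis[]>;
  exploit(h: Hypothesis): Promise<ExploitResult>;
}

async function runPipeline(target: string, stages: Stages): Promise<Report> {
  // Stage 1: recon runs once and maps the attack surface.
  const routes = await stages.recon(target);

  // Stage 2: analysis fans out, one task per route, then waits for all of them.
  const perRoute = await Promise.all(routes.map((r) => stages.analyze(r)));
  const hypotheses = ([] as Hypothesis[]).concat(...perRoute);

  // Stage 3: every hypothesis is attempted concurrently.
  const results = await Promise.all(hypotheses.map((h) => stages.exploit(h)));

  // Stage 4: the report keeps only the exploits that actually succeeded.
  return { target, confirmed: results.filter((r) => r.succeeded) };
}

// Stub stages with canned data, standing in for LLM calls and real payload execution.
const stubStages: Stages = {
  recon: async () => [
    { path: "/login", method: "POST" },
    { path: "/search", method: "GET" },
  ],
  analyze: async (route) =>
    route.path === "/search"
      ? [
          { route, vulnClass: "sqli" },
          { route, vulnClass: "xss" },
        ]
      : [],
  exploit: async (h) => ({
    hypothesis: h,
    succeeded: h.vulnClass === "sqli",
    payload: `stub-payload-for-${h.vulnClass}`,
  }),
};
```

Calling runPipeline("https://staging.example.test", stubStages) resolves to a report with a single confirmed entry (the stubbed "sqli" hypothesis). The design point is that only recon is sequential; analysis and exploitation are independent per item, which is why they parallelize well.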

Tech stack

Language: TypeScript / Node.js
Bundler: tsdown (ESM)
Isolation: Docker worker image (~1GB)
LLM: Claude-API-tuned
Run: single npx @keygraph/shannon or docker pull

Tuning for Claude is the key design decision. Other LLMs work, but per the README, exploit-code generation and tool-call accuracy are most stable on Claude Opus 4.5.

Repo comparison

Repo	Stars	License	Position
KeygraphHQ/shannon	31K	AGPL-3.0	Autonomous AI pentester, white-box
xbow-engineering/xbow	18K	Apache-2.0	Autonomous pentester, owns benchmark
ProjectDiscovery/nuclei	21K	MIT	Signature scanner
OWASP ZAP	13K	Apache-2.0	Interactive + auto scanner

Shannon now leads the "AI autonomous pentester" category by stars. SOTA on XBOW's own benchmark adds credibility.

[IMG#2]

Why now — ecosystem context

Four trends are converging. (1) Claude Opus 4.5 crossed the threshold for white-box exploit reasoning. (2) MCP makes it straightforward to plug vulnerability databases and security tools into LLMs. (3) Commercial autonomous pentest services (XBOW, PortSwigger AI) validated demand, leaving room for an OSS option. (4) AGPL's strong copyleft blocks companies that would rebrand the code as a closed SaaS, which builds trust in the OSS edition.

Top Hacker News comment: "Could be the next Recon-ng/Nmap baseline."

Getting started

# Node 18+ required
npx @keygraph/shannon

# or Docker
docker pull keygraphhq/shannon-worker
docker run keygraphhq/shannon-worker --target https://your.app

Common gotchas: set ANTHROPIC_API_KEY before the first run. There is no free-tier path, so expect about $50 in API spend per run. Only target assets you own or have explicit written authorization to test.
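Both gotchas (the missing API key and the authorization rule) can be checked before a run starts. The guard below is a hypothetical wrapper, not part of Shannon's CLI: the preflight function, its parameters, and the allowlist of authorized hosts are assumptions made for illustration. Only the ANTHROPIC_API_KEY variable name comes from the text above.

```typescript
// Hypothetical pre-run guard; Shannon itself is not known to ship this check.
type PreflightResult = { ok: true } | { ok: false; reason: string };

// Extract the host from an http(s) URL, or null if the string is not one.
function hostOf(target: string): string | null {
  const m = /^https?:\/\/([^\/:?#]+)/i.exec(target);
  return m ? m[1].toLowerCase() : null;
}

function preflight(
  env: Record<string, string | undefined>,
  target: string,
  authorizedHosts: string[],
): PreflightResult {
  // Gotcha 1: the API key must be set before anything else happens.
  const key = env["ANTHROPIC_API_KEY"];
  if (!key || key.trim() === "") {
    return { ok: false, reason: "ANTHROPIC_API_KEY is not set" };
  }

  // Gotcha 2: refuse any target that is not on an explicit allowlist.
  const host = hostOf(target);
  if (host === null) {
    return { ok: false, reason: "target is not an http(s) URL" };
  }
  const allowed = authorizedHosts.map((h) => h.toLowerCase());
  if (allowed.indexOf(host) === -1) {
    return { ok: false, reason: `no written authorization on file for ${host}` };
  }
  return { ok: true };
}
```

Passing the environment in as a parameter (for example process.env in Node) keeps the function easy to test. The host comparison is an exact match, so a lookalike such as your.app.evil.example is rejected when only your.app is on the list.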

Limits and outlook

Two limits stand out today. (1) The false-positive rate in real-world deployments is roughly 8%, better than Burp's ~12% but not zero. (2) AGPL creates friction for embedding Shannon inside a commercial product; Shannon Pro (SaaS) is the alternative.
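To see what those percentages mean for triage effort, here is a small arithmetic sketch. It assumes the quoted rate is the share of reported findings that turn out to be false, which the article does not spell out. The function name and the count of 50 findings are hypothetical.

```typescript
// Hypothetical helper: expected split of a findings list, given the share that is false.
type TriageEstimate = { expectedFalse: number; expectedTrue: number };

function estimateTriage(findings: number, falseShare: number): TriageEstimate {
  if (!Number.isInteger(findings) || findings < 0) {
    throw new RangeError("findings must be a non-negative integer");
  }
  if (!(falseShare >= 0 && falseShare <= 1)) {
    throw new RangeError("falseShare must be between 0 and 1");
  }
  // Round to two decimals so floating-point noise does not leak into the estimate.
  const expectedFalse = Math.round(findings * falseShare * 100) / 100;
  const expectedTrue = Math.round((findings - expectedFalse) * 100) / 100;
  return { expectedFalse, expectedTrue };
}

// With 50 reported findings (an illustrative number, not a measured one):
const atEightPercent = estimateTriage(50, 0.08);  // { expectedFalse: 4, expectedTrue: 46 }
const atTwelvePercent = estimateTriage(50, 0.12); // { expectedFalse: 6, expectedTrue: 44 }
```

Under that assumption, the gap between 8% and 12% is two fewer dead-end findings to review per 50 reported.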

Outlook: the next six months will likely bring (a) non-Claude LLM adapters, (b) a better black-box (no source) mode, and (c) MCP integrations into SIEM and ticketing systems. The team also has a defensive (blue-team) variant on the roadmap.

[IMG#3]

3-Line Summary

Shannon Lite scored 96.15% on XBOW, a new autonomous-pentest SOTA, and drew 1,400 stars on day one.
A four-stage pipeline (recon, analysis, exploit, report) completes a full pentest in 30 to 90 minutes.
AGPL keeps SaaS rebranders out; the Pro SaaS alongside the OSS edition sustains the business model.

