Ghost Hacker is a fully autonomous AI pentester that finds real vulnerabilities (SQL injection, auth bypass, XSS, SSRF) and proves they're exploitable, using 13 AI agents with a real browser. No false positives. No theory. Just working exploits with copy-paste proof-of-concept commands.
Achieves a 96.15% XBOW Benchmark success rate, enhanced with adversarial dual-agent exploitation (CHAOS vs ORDER), cross-scan collective memory that gets smarter with every scan, and breakthrough capabilities including blind SQLi extraction, adaptive WAF bypass, and CVE variant hunting.
Your team ships code every day. Your pentest happens once a year. Ghost Hacker closes that gap.
Most security tools give you a list of possible issues. Ghost Hacker proves they're exploitable by executing live attacks and extracting real data.
| Tool | Type | Price | Output | False Positives |
|---|---|---|---|---|
| Ghost Hacker | AI Pentester | Free + ~$3-15 API cost/scan | Working exploits with PoC | Near zero |
| Burp Suite Pro | Scanner + Manual | $449/yr | Alerts + suspected vulns | Medium |
| Snyk | SAST/SCA | $25-98/dev/mo | Code issues (no runtime proof) | High |
| Qualys WAS | Cloud Scanner | ~$2,000+/yr | Vulnerability reports | Medium |
| HackerOne Pentest | Human Pentesters | $10,000-50,000/engagement | Expert findings | Low |
| Acunetix | DAST Scanner | $4,500+/yr | Scan reports | Medium-High |
Bottom line: Ghost Hacker gives you pentest-grade results for the cost of a few API calls.
Ghost Hacker discovered and proved 20+ critical vulnerabilities, including:
- SQL Injection Auth Bypass: Admin access with a single payload, zero credentials needed
- UNION Injection: Full database dump (usernames, password hashes, roles, emails)
- NoSQL Operator Injection: Mass-modified 28 records via the `$ne` operator
- XXE File Disclosure: Read arbitrary server files through XML entity injection
- SSRF: Internal service access + cloud metadata endpoint exposure (AWS keys, tokens)

Every finding includes reproducible commands: not theory, proof. See the full sample report in `sample-reports/`.

    Phase 1: PRE-RECON ────── nmap + subfinder + whatweb + source code analysis
    Phase 2: RECON ────────── Attack surface mapping, API inventory, input vectors
    Phase 3: VULN ─────────── 5 agents IN PARALLEL (injection, XSS, auth, authz, SSRF)
    Phase 4: EXPLOIT ──────── 5 agents IN PARALLEL (prove each vuln is exploitable)
    Phase 5: REPORT ───────── Executive-level security assessment with evidence

Each vuln→exploit pair runs independently. The XSS exploit starts while the injection agent is still running. No "wait for all" bottleneck.
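The per-pair pipelining described above can be sketched roughly as follows. This is a minimal illustration, not Ghost Hacker's actual workflow code: `runVulnAgent`, `runExploitAgent`, `runPair`, and the `Category` type are hypothetical stand-ins for the real agent activities.

```typescript
// Hypothetical sketch: all names here are illustrative assumptions.
type Category = "injection" | "xss" | "auth" | "authz" | "ssrf";

const CATEGORIES: Category[] = ["injection", "xss", "auth", "authz", "ssrf"];

// Stand-in for a vulnerability-analysis agent; returns candidate finding IDs.
async function runVulnAgent(category: Category): Promise<string[]> {
  return [`${category.toUpperCase()}-001`];
}

// Stand-in for an exploit agent that tries to prove each candidate finding.
async function runExploitAgent(category: Category, findings: string[]): Promise<string> {
  return `${category}: attempted ${findings.length} finding(s)`;
}

// One independent vuln -> exploit chain. The exploit step awaits only its own
// category's analysis, never the other categories.
async function runPair(category: Category): Promise<string> {
  const findings = await runVulnAgent(category);
  return runExploitAgent(category, findings);
}

// Start all five chains at once. Promise.all rejects on the first failure;
// a real orchestrator would probably isolate failures per pair instead.
async function runPipeline(): Promise<string[]> {
  return Promise.all(CATEGORIES.map((category) => runPair(category)));
}
```

Because each chain is its own promise, a fast category can finish exploitation while a slow one is still in analysis; the only join point is the final report.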
Ghost Hacker analyzes defenses before exploitation and routes intelligently:
| Defense Signal | Score | Detection |
|---|---|---|
| Prepared statements | +40 | Source code analysis |
| WAF middleware | +25 | Response pattern matching |
| Input validation | +20 | Code path tracing |
| Output encoding | +15 | Template analysis |
| Silent error handling | +15 | Error response fingerprinting |
Routing decisions:
| Score | Difficulty | Strategy | Timeout |
|---|---|---|---|
| < 15 | Trivial | Quick confirm, 5 attempts | 30 min |
| 15-40 | Standard | Methodical OWASP workflow | 2 hours |
| 40-70 | Hardened | Bypass-heavy or adversarial | 2-4 hours |
| 70+ | Fortress | Adversarial dual-agent (CHAOS vs ORDER) | 4 hours |
Don't waste 2 hours on a trivial reflected XSS. Don't give up after 15 attempts on a fortress-level blind SQLi behind a WAF.
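A minimal sketch of how the scoring and routing tables above could combine. The weights and thresholds come from the tables; the identifier names, the de-duplication of signals, and the choice to assign boundary scores (15, 40, 70) to the higher tier are assumptions, since the tables leave those details open.

```typescript
// Hypothetical sketch: names and boundary handling are assumptions.
type DefenseSignal =
  | "preparedStatements"
  | "wafMiddleware"
  | "inputValidation"
  | "outputEncoding"
  | "silentErrorHandling";

type Difficulty = "trivial" | "standard" | "hardened" | "fortress";

// Weights from the "Defense Signal" table.
const SIGNAL_SCORES: Record<DefenseSignal, number> = {
  preparedStatements: 40,
  wafMiddleware: 25,
  inputValidation: 20,
  outputEncoding: 15,
  silentErrorHandling: 15,
};

// Sum the weight of each distinct signal detected for one vulnerability.
function defenseScore(signals: DefenseSignal[]): number {
  const unique = signals.filter((s, i) => signals.indexOf(s) === i);
  return unique.reduce((sum, s) => sum + SIGNAL_SCORES[s], 0);
}

// Map a score to a tier. Assumption: a boundary score belongs to the higher tier.
function routeDifficulty(score: number): Difficulty {
  if (score < 15) return "trivial";
  if (score < 40) return "standard";
  if (score < 70) return "hardened";
  return "fortress";
}
```

Under these assumptions, a target with WAF middleware and input validation scores 45 and routes to `hardened`; adding prepared statements raises it to 85 and routes to `fortress`.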
For hardened targets, Ghost Hacker deploys two competing agents in parallel:
CHAOS Agent: the creative rule-breaker

    Start with the WEIRDEST payloads, not the obvious ones.
    Combine techniques in unexpected ways.
    Exploit parser differentials and edge cases.
    If the obvious path is blocked, that's where you thrive.

ORDER Agent: the methodical professional

    Follow proven patterns, highest success rate first.
    OWASP methodology step by step.
    Escalate systematically: manual → sqlmap → custom scripts.
    The best exploit is one that works consistently.

A JUDGE evaluates both results: Did they extract real data? How many attempts? Novel bypass? Reproducible PoC? Winner's techniques get stored in collective memory.
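One way the JUDGE's four questions could be turned into a comparison is sketched below. The `AgentResult` shape, the numeric weights, and the tie-breaking rule are all assumptions made for illustration; the source only names the criteria, not how they are weighted.

```typescript
// Hypothetical sketch: the result shape and all weights are assumptions.
interface AgentResult {
  agent: "CHAOS" | "ORDER";
  extractedRealData: boolean;
  attempts: number;
  novelBypass: boolean;
  reproduciblePoc: boolean;
}

// Assumed weighting: real data dominates, a reproducible PoC and a novel
// bypass add to the score, and each attempt costs a little.
function judgeScore(result: AgentResult): number {
  let score = 0;
  if (result.extractedRealData) score += 100;
  if (result.reproduciblePoc) score += 30;
  if (result.novelBypass) score += 20;
  score -= Math.min(result.attempts, 50) * 0.5;
  return score;
}

// Returns the winning result, or null when neither agent extracted real data.
// Assumption: an exact tie goes to CHAOS.
function pickWinner(chaos: AgentResult, order: AgentResult): AgentResult | null {
  if (!chaos.extractedRealData && !order.extractedRealData) return null;
  return judgeScore(chaos) >= judgeScore(order) ? chaos : order;
}
```

Returning `null` when no data was extracted mirrors the rule that theory without proof does not count as a win for either side.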
Ghost Hacker remembers what works. After each adversarial battle, winning techniques are stored:

    {
      "time_based_blind": {
        "successRate": 0.82,
        "avgTimeMs": 45000,
        "sampleSize": 47,
        "bestAgainstStack": ["django", "postgres"]
      },
      "unicode_normalization": {
        "successRate": 0.73,
        "avgTimeMs": 8000,
        "sampleSize": 12,
        "bestAgainstStack": ["express", "mongodb"]
      }
    }

Next scan against Django/Postgres? Start with time-based blind (82% success rate). Tried UNION injection 5 times against Express? Memory says it fails 90% of the time, so skip it. The more you scan, the smarter Ghost Hacker gets.
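The sketch below shows how a memory file in the format above could be updated and queried. The field names come from the JSON sample; `recordAttempt`, `rankTechniques`, and the ranking rule (stack overlap first, then success rate) are assumptions, not the project's actual logic.

```typescript
// Field names follow the JSON sample; function names and ranking are assumptions.
interface TechniqueStats {
  successRate: number;
  avgTimeMs: number;
  sampleSize: number;
  bestAgainstStack: string[];
}

type CollectiveMemory = Record<string, TechniqueStats>;

// Fold one new attempt into the running averages (incremental mean).
function recordAttempt(stats: TechniqueStats, succeeded: boolean, timeMs: number): TechniqueStats {
  const n = stats.sampleSize + 1;
  return {
    ...stats,
    successRate: (stats.successRate * stats.sampleSize + (succeeded ? 1 : 0)) / n,
    avgTimeMs: (stats.avgTimeMs * stats.sampleSize + timeMs) / n,
    sampleSize: n,
  };
}

// Count how many components of the target stack a technique is known to work against.
function stackOverlap(stats: TechniqueStats, stack: string[]): number {
  return stats.bestAgainstStack.filter((tech) => stack.indexOf(tech) >= 0).length;
}

// Order technique names for a target: best stack overlap first, then success rate.
function rankTechniques(memory: CollectiveMemory, stack: string[]): string[] {
  return Object.keys(memory).sort(
    (a, b) =>
      stackOverlap(memory[b], stack) - stackOverlap(memory[a], stack) ||
      memory[b].successRate - memory[a].successRate
  );
}
```

With the two entries from the sample, a Django/Postgres target ranks `time_based_blind` first, while an Express/MongoDB target ranks `unicode_normalization` first.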

    # Clone
    git clone https://github.com/itsjwill/ghosthacker.git
    cd ghosthacker

    # Configure
    cp .env.example .env
    # Edit .env: ANTHROPIC_API_KEY=your-key

    # Run (adversarial mode is default)
    ./ghosthacker start URL=https://your-app.com REPO=/path/to/source

    # Monitor progress
    ./ghosthacker logs
    ./ghosthacker query ID=ghosthacker-1234567890
    open http://localhost:8233  # Temporal UI

Prerequisites: Docker and an Anthropic API key.

    # Adversarial scan (default: CHAOS vs ORDER on hard targets)
    ./ghosthacker start URL=https://app.example.com REPO=/path/to/code

    # Classic single-agent scan (faster, cheaper, good for CI/CD)
    ./ghosthacker start URL=https://app.example.com REPO=/path/to/code CLASSIC=true

    # With custom auth config (2FA, SSO, form login)
    ./ghosthacker start URL=https://app.example.com REPO=/path/to/code CONFIG=./configs/my-config.yaml

    # View real-time logs
    ./ghosthacker logs

    # Query workflow progress
    ./ghosthacker query ID=ghosthacker-1234567890

    # View collective memory stats
    ./ghosthacker memory

    # Stop (preserves data)
    ./ghosthacker stop

    # Stop and clean everything
    ./ghosthacker stop CLEAN=true

| Situation | Mode | Why |
|---|---|---|
| Quick security check | `CLASSIC=true` | Faster, single agent, $2-5 |
| Heavily defended app | Default (adversarial) | CHAOS finds what ORDER misses |
| CI/CD pipeline | `CLASSIC=true` | Predictable timing and cost |
| Bug bounty hunting | Default + review memory | Memory learns from past scans |
| Same stack, repeat scans | Default | Memory knows what works |
| Pre-production audit | Default | Full source code + live browser analysis |

    GHOST HACKER - Adversarial AI Pentester
    [PRE-RECON] Scanning target with nmap, subfinder, whatweb...
    [PRE-RECON] Analyzing source code for vulnerability patterns...
    [RECON] Mapping attack surface: 47 endpoints, 23 input vectors...
    [ROUTER] Analyzing vulnerability difficulty...
      INJ-001: trivial → quick_confirm (5 attempts, 30min)
      INJ-002: standard → methodical (15 attempts, 2hr)
      INJ-003: fortress → adversarial (50 attempts, 4hr)
      XSS-001: hardened → bypass_heavy (30 attempts, 4hr)
      AUTH-001: standard → methodical (15 attempts, 2hr)
    [EXPLOIT] Running single-agent on 2 trivial/standard vulns...
      INJ-001: EXPLOITED in 2 attempts (UNION injection, 12s)
      AUTH-001: EXPLOITED in 8 attempts (JWT confusion, 45s)
    [ADVERSARIAL] CHAOS vs ORDER: INJ-003
      Difficulty: fortress | Tech stack: django/postgres
      CHAOS: unicode_normalization, parser_differential, polyglot_payload...
      ORDER: time_based_blind, boolean_blind, error_based...
    [JUDGE] Winner: CHAOS
      unicode_normalization bypass succeeded where methodical approach failed
    [MEMORY] Saved 4 technique updates to collective memory
    [REPORT] Generating enhanced security assessment...
    SCAN COMPLETE
      Vulns found: 5 | Exploited: 4 | False positive: 1
      Adversarial battles: CHAOS 1 | ORDER 1
      Memory updates: 4 new techniques
      Duration: 47 minutes | Cost: $3.42

| Category | What Ghost Hacker Finds & Exploits |
|---|---|
| Injection | SQL injection (UNION, blind, error-based, time-based), Command injection, NoSQL injection, XXE, YAML injection, SSTI |
| XSS | Reflected, Stored, DOM-based, JSONP callback injection, Angular security bypass, HTML injection |
| Auth | Auth bypass, brute force, MD5 cracking, OAuth attacks, token replay, JWT confusion, weak credentials |
| AuthZ | IDOR, role injection, horizontal/vertical privilege escalation, business logic bypass |
| SSRF | Internal service access, cloud metadata (AWS/GCP/Azure), HTTP method bypass |
| Level | Evidence Required | Classification |
|---|---|---|
| 1 | Error messages, timing differences | POTENTIAL (Low) |
| 2 | Boolean-blind working, UNION succeeds | POTENTIAL (Medium) |
| 3 | Actual data extracted from database | EXPLOITED |
| 4 | Admin creds, PII, or RCE achieved | EXPLOITED (CRITICAL) |
Must reach Level 3+ with real data to mark as exploited. Theory without proof = failure.
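The evidence ladder above maps naturally onto a small lookup. This is an illustrative sketch only: the labels are copied from the table, while `EvidenceLevel`, `CLASSIFICATION`, and `isExploited` are hypothetical names.

```typescript
// Labels come from the evidence table; identifier names are assumptions.
type EvidenceLevel = 1 | 2 | 3 | 4;

const CLASSIFICATION: Record<EvidenceLevel, string> = {
  1: "POTENTIAL (Low)",
  2: "POTENTIAL (Medium)",
  3: "EXPLOITED",
  4: "EXPLOITED (CRITICAL)",
};

// A finding counts as exploited only from Level 3 (real data extracted) upward.
function isExploited(level: EvidenceLevel): boolean {
  return level >= 3;
}

// Human-readable label for a finding's evidence level.
function classify(level: EvidenceLevel): string {
  return CLASSIFICATION[level];
}
```

For example, a working boolean-blind injection that has not yet returned any rows is Level 2: `classify(2)` gives `POTENTIAL (Medium)` and `isExploited(2)` is `false`.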

    GHOST HACKER PIPELINE

    Phase 1: PRE-RECON (sequential)
      └─ nmap + subfinder + whatweb + source code analysis

    Phase 2: RECON (sequential)
      └─ Attack surface mapping, API inventory, input vector identification

    Phase 3-4: ADAPTIVE EXPLOITATION
      │
      ├── DIFFICULTY ROUTER analyzes each vuln from Phase 3
      │     │
      │     ├── TRIVIAL + STANDARD ───→ Single agent (parallel, fast)
      │     │
      │     └── HARDENED + FORTRESS ──→ ADVERSARIAL MODE
      │           │
      │           ├── CHAOS Agent ──┐
      │           │                 ├──→ JUDGE
      │           └── ORDER Agent ──┘      │
      │                                    ▼
      │                           COLLECTIVE MEMORY
      │                        (persists across scans)
      │
    Phase 5: ENHANCED REPORT
      └─ Evidence chain + adversarial insights + difficulty analysis + memory stats

| Component | Technology |
|---|---|
| AI Engine | Claude Sonnet 4.5 via Claude Agent SDK (10,000 max turns) |
| Browser Automation | Playwright MCP (real Chrome, headless) |
| Workflow Orchestration | Temporal (crash recovery, smart retry, queryable progress) |
| Security Tools | nmap, subfinder, whatweb, sqlmap, schemathesis |
| Container | Docker + Chainguard Wolfi (minimal attack surface) |
| Language | TypeScript |
| MCP Tools | save_deliverable, generate_totp (2FA support) |
- Crash Recovery: Temporal workflows resume automatically after restart
- Smart Retry: 50 attempts with 5-30 min backoff for billing/rate limits
- Git Checkpoints: Every agent creates a rollback point before running
- Parallel Execution: 5 concurrent agents in vuln/exploit phases
- 10,000 Autonomous Turns: Each agent gets extensive operation time
- Full Browser Control: Navigate, fill forms, handle 2FA/TOTP, execute attacks
- Queryable Progress: Real-time status via CLI or Temporal Web UI
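To make the "50 attempts with 5-30 min backoff" figure concrete, here is a minimal sketch of one possible delay schedule. The attempt limit and the 5 and 30 minute bounds come from the list above; the doubling growth curve and the `retryDelayMs` helper are assumptions, and this does not use or represent Temporal's retry policy API.

```typescript
// Bounds come from the feature list; the doubling schedule is an assumption.
const MAX_ATTEMPTS = 50;
const MIN_DELAY_MS = 5 * 60 * 1000; // 5 minutes
const MAX_DELAY_MS = 30 * 60 * 1000; // 30 minutes

// Delay to wait after a failed attempt (1-based), or null when retries are exhausted.
function retryDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > MAX_ATTEMPTS) return null;
  const delay = MIN_DELAY_MS * Math.pow(2, attempt - 1);
  return Math.min(delay, MAX_DELAY_MS);
}
```

Under this assumed schedule the waits are 5, 10, and 20 minutes, then a flat 30 minutes for every later attempt, which suits billing or rate-limit errors that clear on their own.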
Environment variables (`.env`):

    ANTHROPIC_API_KEY=sk-ant-...          # Required
    CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000   # Recommended (prevents truncation)

YAML config (passed with `CONFIG=`):

    authentication:
      type: form
      loginUrl: https://app.example.com/login
      credentials:
        username: test@example.com
        password: testpass123
      mfa:
        type: totp
        secret: JBSWY3DPEHPK3PXP

    adversarial:
      enabled: true
      minDifficultyForAdversarial: hardened
      memoryPath: ./collective-memory.json

Supports form login, Google SSO, API keys, basic auth, and 2FA/TOTP, all with JSON Schema validation.

    # OpenAI
    OPENAI_API_KEY=sk-your-key
    ROUTER_DEFAULT=openai,gpt-5.2

    # Google Gemini via OpenRouter
    OPENROUTER_API_KEY=sk-or-your-key
    ROUTER_DEFAULT=openrouter,google/gemini-3-flash-preview

Project structure:

    ghosthacker/
    ├── ghosthacker                 # CLI entry point (ASCII banner + commands)
    ├── docker-compose.yml          # Temporal + worker containers
    ├── Dockerfile                  # Chainguard Wolfi (minimal attack surface)
    ├── configs/
    │   ├── config-schema.json      # YAML config validation
    │   ├── example-config.yaml     # Template config
    │   └── router-config.json      # Multi-model routing
    ├── prompts/                    # AI prompt templates (444+ lines each)
    │   ├── pre-recon-code.txt      # Source code analysis
    │   ├── recon.txt               # Attack surface mapping
    │   ├── vuln-{injection,xss,auth,authz,ssrf}.txt
    │   ├── exploit-{injection,xss,auth,authz,ssrf}.txt
    │   ├── report-executive.txt    # Final report generation
    │   └── shared/                 # Shared prompt partials
    ├── src/
    │   ├── intelligence/           # Ghost Hacker additions
    │   │   ├── difficulty-router.ts          # Vulnerability difficulty analysis
    │   │   └── adversarial-exploitation.ts   # CHAOS vs ORDER + collective memory
    │   ├── temporal/               # Workflow orchestration
    │   │   ├── workflows.ts            # Classic pipeline workflow
    │   │   ├── enhanced-workflow.ts    # Adversarial workflow
    │   │   ├── enhanced-activities.ts  # Routing + adversarial activities
    │   │   ├── enhanced-worker.ts      # Worker with all activities
    │   │   ├── activities.ts           # Core agent activities
    │   │   └── shared.ts               # Types + query handlers
    │   ├── ai/
    │   │   ├── claude-executor.ts      # Claude Agent SDK integration
    │   │   ├── message-handlers.ts     # Stream message processing
    │   │   └── router-utils.ts         # Multi-model routing
    │   ├── audit/                  # Crash-safe logging system
    │   └── prompts/
    │       └── prompt-manager.ts   # Template variable injection
    ├── mcp-server/                 # Custom MCP tools
    │   └── src/tools/
    │       ├── save-deliverable.ts # Evidence file saving
    │       └── generate-totp.ts    # 2FA token generation
    ├── sample-reports/             # Real scan outputs (20+ vulns found)
    └── collective-memory.json      # Cross-scan learning (generated after scans)

Ghost Hacker's base engine achieves a 96.15% success rate on the hint-free, source-aware XBOW Benchmark, the industry standard for measuring AI pentesting capability.
The adversarial mode (CHAOS vs ORDER) pushes this higher on hardened targets where a single agent's methodology isn't enough.
| Mode | Typical Cost | Best For |
|---|---|---|
| Classic (`CLASSIC=true`) | $2-5 per scan | CI/CD, quick checks |
| Adversarial (default) | $5-15 per scan | Thorough audits, hardened targets |
| Adversarial + Memory | Decreasing over time | Repeat scans on same stack |

Built-in spending cap detection: Ghost Hacker won't silently burn through your API budget.

Ghost Hacker is for authorized security testing only. This is a white-box tool: it requires access to application source code.
Only test systems you own or have explicit written permission to test.
The tool is designed for:
- Vulnerability assessment and security audits
- Pre-production security verification
- Bug bounty research (with authorization)
- Security report generation
- Developer self-testing ("hack yourself before someone else does")
Ghost Hacker provides a complete pentesting pipeline with Claude Agent SDK integration, Temporal workflow infrastructure, and audit system.
Key capabilities:
- 16 specialized AI agents running in parallel
- Difficulty-based vulnerability routing (smart resource allocation)
- Adversarial dual-agent exploitation (CHAOS vs ORDER competing agents)
- Cross-scan persistent intelligence (gets smarter every scan)
- Blind Oracle Constraint Solver (blind SQLi data extraction)
- Adaptive Payload Evolution Engine (genetic WAF bypass)
- CVE Variant Hunter (finds undiscovered vulnerability variants)
- Enhanced CLI with `./ghosthacker` commands
Ghost Hacker is open source under AGPL-3.0. Contributions welcome:
- Fork the repo
- Create a feature branch
- Submit a PR with tests
AGPL-3.0: Free to use, modify, and distribute. Derivative works must also be open source.