pentest-ai

Turn Claude Code into your offensive security research assistant

17 specialized AI subagents for every phase of authorized penetration testing, from scoping to reporting

17 Specialist Agents
25+ Tool Integrations
100% MITRE ATT&CK Mapped
MIT Open Source

What is pentest-ai?

pentest-ai turns Claude Code into a full offensive security research environment. Instead of one general-purpose assistant, you get specialized subagents, each an expert in a specific phase of penetration testing. Ask Claude anything security-related and it automatically routes your request to the right specialist. Whether you are scoping your first engagement or writing a final report, every response is mapped to MITRE ATT&CK and paired with defensive guidance.

Automatic Routing

Claude delegates to the right specialist based on your task. No manual agent selection needed. Just describe what you need.

MITRE ATT&CK Mapped

Every technique is cross-referenced with ATT&CK IDs. Know exactly where each finding sits in the adversary framework.

Dual Perspective

Offensive methodology paired with defensive detection in every response. Attack and defend in a single workflow.

Adapts to Your Level

From explaining what Kerberoasting is to providing exact Impacket command syntax, every agent meets you where you are.

Plain Markdown

Agents are simple Markdown files. No dependencies, no build tools, no lock-in. Fork, modify, and extend freely.

Full Engagement Coverage

OSINT, planning, recon, exploitation, cloud, mobile, wireless, social engineering, forensics, compliance, and reporting.

The Agents

17 specialists, each tuned for a distinct phase of the engagement lifecycle.

Offensive Operations

engagement-planner

Offensive

Plans penetration tests with phased methodology, MITRE ATT&CK mapping, and rules of engagement templates.

Plan an internal network pentest for a 500-endpoint AD environment

osint-collector

Offensive

Open source intelligence gathering. Domain recon, email harvesting, social media profiling, breach data analysis.

Build an OSINT profile on this target domain

recon-advisor

Offensive

Parses output from Nmap, Nessus, BloodHound, and 20+ other tools. Prioritizes targets and maps CVEs.

Analyze this Nmap scan and tell me what to hit first

exploit-guide

Offensive

Exploitation methodology covering AD attacks, web apps, cloud, and post-exploitation with defensive perspective.

Walk me through AS-REP Roasting and how defenders detect it

privesc-advisor

Offensive

Systematic Linux and Windows privilege escalation. SUID abuse, token impersonation, GTFOBins, and LOLBAS.

Here's my linpeas output. What's the fastest path to root?

cloud-security

Offensive

AWS, Azure, and GCP penetration testing. IAM privilege escalation, container escape, and cloud-native attack paths.

I have read-only AWS access. Find privilege escalation paths

api-security

Offensive

REST, GraphQL, and WebSocket testing. OWASP API Top 10, JWT attacks, OAuth exploitation, BOLA/BFLA testing.

Test this API for BOLA. Here's the Swagger doc and a valid JWT

mobile-pentester

Offensive

Android and iOS app security. APK/IPA analysis, Frida hooking, SSL pinning bypass, OWASP MASTG/MASVS.

Decompile this APK and check for hardcoded secrets

wireless-pentester

Offensive

WiFi and Bluetooth penetration testing. WPA2/WPA3 attacks, evil twin, rogue AP, and BLE security.

Capture a WPA2 handshake and set up an evil twin

social-engineer

Offensive

Phishing campaigns, pretexting, vishing, and physical social engineering for authorized red team engagements.

Design a phishing campaign using GoPhish for this engagement

Defense & Analysis

detection-engineer

Defense

Produces deployment-ready detection rules in Sigma, Splunk SPL, and Elastic KQL, with false-positive tuning.

Create a detection rule for DCSync with Sigma and Splunk SPL

threat-modeler

Defense

STRIDE/DREAD threat modeling, attack tree construction, data flow analysis, and architecture-specific threat enumeration.

Build a STRIDE threat model for our API gateway

forensics-analyst

Defense

Digital forensics and incident response. Memory forensics, disk analysis, timeline construction, evidence handling.

Walk me through a Volatility 3 workflow for this memory dump

malware-analyst

Defense

Binary analysis, reverse engineering, sandbox methodology, YARA rule writing, and IOC extraction.

Analyze this suspicious PE file with static analysis then Ghidra

stig-analyst

Defense

DISA STIG compliance analysis with GPO remediation paths, risk scores, and keep-open justification templates.

Analyze V-220768 and write a keep-open justification

Reporting & Learning

report-generator

Reporting

Professional pentest reports with executive summaries, CVSS scoring, evidence formatting, and remediation roadmaps.

Compile these findings into a professional report

ctf-solver

Learning

Methodical CTF challenge partner for HackTheBox, TryHackMe, and competitions. Web, pwn, rev, crypto, forensics.

I have a low-priv shell on this HTB machine. Help me enumerate

See It in Action

Describe your task naturally. Claude routes to the right agent automatically.

$ claude

You: Analyze this Nmap scan and prioritize targets for our internal pentest

▸ Routing to recon-advisor agent...

## Prioritized Target Summary

### Critical Priority (Exploit Immediately)
| Host           | Port  | Service        | Finding                    |
|----------------|-------|----------------|----------------------------|
| 10.10.1.5      | 445   | SMB            | SMBv1 enabled, MS17-010    |
| 10.10.1.20     | 3389  | RDP            | BlueKeep (CVE-2019-0708)   |

### High Priority (Investigate Next)
| Host           | Port  | Service        | Finding                    |
|----------------|-------|----------------|----------------------------|
| 10.10.1.100    | 88    | Kerberos       | Domain Controller          |
| 10.10.1.15     | 1433  | MSSQL          | sa login, weak password    |

### Recommended Next Steps
1. Run: nmap -sV --script "smb-vuln*" 10.10.1.5
2. Run: crackmapexec smb 10.10.1.0/24
3. Enumerate AD: bloodhound-python -d corp.local

Engagement Workflow

A natural pipeline from scoping through delivery. Each phase maps to a dedicated agent.

1. OSINT: osint-collector
2. Scope: engagement-planner
3. Recon: recon-advisor
4. Exploit: exploit-guide
5. Escalate: privesc-advisor
6. Detect: detection-engineer
7. Report: report-generator

Before & After

What changes when you add pentest-ai to your Claude Code workflow.

| Task               | Without pentest-ai                        | With pentest-ai                                              |
|--------------------|-------------------------------------------|--------------------------------------------------------------|
| Engagement scoping | Manual checklist, easy to miss items      | Structured plan with RoE, scope boundaries, and methodology  |
| Recon analysis     | Read raw tool output yourself             | Prioritized targets with attack surface mapping              |
| Exploit research   | Search CVE databases manually             | Curated exploit chains mapped to ATT&CK with PoC guidance    |
| Detection rules    | Write Sigma/YARA from scratch             | Deployment-ready rules with detection logic explained        |
| STIG compliance    | Cross-reference configs vs. PDF checklists | Automated check with fix commands and rationale              |
| Reporting          | Start from a blank document               | Professional findings with CVSS, evidence, and remediation   |

Frequently Asked Questions

Common questions about pentest-ai.

Do I need security certifications to use pentest-ai?

No certifications are required to install and explore the agents. For real engagements, however, you should have proper training and authorization. The agents adapt to your skill level: beginners get explanations, and experienced operators get exact command syntax.

Does pentest-ai execute attacks or access systems?

No. pentest-ai agents provide methodology guidance, analysis, and report generation. They do not execute exploits, access remote systems, or generate functional exploit code. You remain in full control of all actions.

Which Claude subscription do I need?

You need Claude Pro or Max with Claude Code enabled. The agents use the Sonnet model by default, which is included with both subscription tiers.

Can I customize the agents or create new ones?

Yes. Agents are plain Markdown files with YAML frontmatter. You can modify existing agents, change their model, add tools, or create entirely new agents. See the Customization guide for details.

How does automatic agent routing work?

Claude Code reads the description field in each agent's YAML frontmatter. When you describe a task, Claude matches your intent to the most relevant agent and delegates automatically. You can also invoke agents by name for direct control.

Can I use pentest-ai for bug bounty programs?

Yes, within the program's scope and rules. Use the recon-advisor and exploit-guide agents for vulnerability research, the api-security agent for web/API testing, and the report-generator for professional write-ups. Always operate within the program's rules of engagement.
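
Since agents are plain Markdown files with YAML frontmatter, and routing keys off the description field, a custom agent can be sketched in a few lines of shell. This is a minimal illustration, not a file from the project: the agent name `dns-auditor`, its description, and its body text are all hypothetical, and the `name` field is an assumption about the frontmatter layout. Check the Customization guide for the fields the shipped agents actually use.

```shell
#!/usr/bin/env bash
# Sketch: create a hypothetical custom agent and read back its description.
# The agent name, description, and body below are made up for illustration.
set -euo pipefail

agent_dir="${AGENT_DIR:-.claude/agents}"   # project-level agent directory
agent_file="$agent_dir/dns-auditor.md"     # hypothetical agent name

mkdir -p "$agent_dir"

# Frontmatter sits between the first pair of '---' lines; the rest is the prompt.
cat > "$agent_file" <<'EOF'
---
name: dns-auditor
description: Reviews DNS zone data for misconfigurations during authorized assessments.
---

You are a DNS security specialist. Explain each finding and pair it with
defensive guidance.
EOF

# Print the value of the description field from a file's frontmatter.
agent_description() {
  awk '
    /^---[[:space:]]*$/ { fence++; next }
    fence == 1 && /^description:/ { sub(/^description:[[:space:]]*/, ""); print; exit }
    fence >= 2 { exit }
  ' "$1"
}

agent_description "$agent_file"
```

Running `agent_description` on each file in an agents directory is a quick way to review the text Claude has to match against. A vague description gives it less to work with.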

Installation

Three commands. No dependencies. No build tools.

New to Claude? No problem. The setup guide walks you through creating an account, installing the CLI, and running your first agent in about 5 minutes.

Option 1: Clone and install globally
git clone https://github.com/0xSteph/pentest-ai.git
mkdir -p ~/.claude/agents/
cp pentest-ai/agents/*.md ~/.claude/agents/
Option 2: Install for a specific project
mkdir -p .claude/agents/
cp pentest-ai/agents/*.md .claude/agents/
Then open Claude Code and try
# Just describe your task naturally
"Plan an internal pentest for a mid-size company with Active Directory"
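
If routing does not seem to pick up an agent, it can help to confirm that the copy actually landed. The sketch below counts Markdown files in the global and project-level agent directories used in the two options above. The helper name `count_agents` is made up for this example, and it assumes the agents are stored as top-level `.md` files in those directories.

```shell
#!/usr/bin/env bash
# Sketch: count agent files in the global and project-level directories.
set -u

# Print how many .md files sit directly inside the given directory
# (prints 0 if the directory does not exist).
count_agents() {
  local dir="$1"
  if [ -d "$dir" ]; then
    find "$dir" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
  else
    echo 0
  fi
}

for dir in "$HOME/.claude/agents" ".claude/agents"; do
  echo "$dir: $(count_agents "$dir") agent file(s)"
done
```

A count of 0 in both locations usually means the `cp` step was run from a different directory than the clone.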

See INSTALL.md for detailed instructions and troubleshooting.