NUMASEC(1)

NAME

numasec - AI agent for penetration testing. Like Claude Code, but for security. Open source, MCP-native, works with any LLM.

SYNOPSIS

$ npm install -g numasec

INFO

95 stars
15 forks
0 views
TypeScript · AI & LLM

DESCRIPTION

AI agent for penetration testing. Like Claude Code, but for security. Open source, MCP-native, works with any LLM.

README

numasec

The AI agent for cyber security. Like Claude Code, but for cyber security.

37 native security tools · 48 templates, payload packs, and playbooks · evidence graph + attack paths

Quickstart

npm install -g numasec
numasec

Connect a provider, choose a model, type pentest https://yourapp.com, and it starts.

Why numasec

Coding has Claude Code, Copilot, Cursor. Cyber security has nothing.

Until now.

  • Built for cyber security from the ground up. Not a wrapper around ChatGPT. 37 native security tools, 48 templates, payload packs, and playbooks, a stateful browser runtime, and an evidence graph that turns proof into attack paths.
  • Recon. Exploit. Chain vulnerabilities. Generate reports. Default credentials → admin access → user enumeration. SQLi → token issuance → account takeover. IDOR → data exposure → business impact.
  • Single binary, no Python tax. Pure TypeScript. No Docker required. bun build produces a single executable.
  • Attack paths, not isolated findings. Every serious run becomes graph nodes and edges — evidence, hypotheses, findings, resources, attack paths — not a pile of disconnected scanner output.
  • Works with hosted and local LLM providers. Use the provider you already trust for reasoning and orchestration; numasec still executes the scanning, evidence capture, chaining, and reporting locally.

If numasec is useful to you, a star helps more people find it.

What it finds

Injection

  • SQL injection (blind, time-based, union, error-based)
  • NoSQL injection
  • OS command injection
  • Server-Side Template Injection
  • XXE injection
  • GraphQL introspection & injection
  • CRLF injection

Authentication & Access

  • JWT attacks (alg:none, weak HS256, kid traversal)
  • OAuth misconfiguration
  • Default credentials & password spray
  • IDOR
  • CSRF
  • Privilege escalation
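
As an illustration of the first JWT item above, the sketch below inspects a token header and reports whether it declares `alg: none`. The real weakness is a server that accepts such a token; this only shows the header check. The helper names `base64UrlDecode` and `declaresAlgNone` are hypothetical and are not taken from the numasec codebase.

```typescript
// Hypothetical helpers for illustration; not taken from the numasec codebase.
const BASE64URL_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Decode a base64url segment (RFC 4648, section 5) into a byte string.
function base64UrlDecode(segment: string): string {
  const input = segment.replace(/=+$/, ""); // padding is optional in JWTs
  let buffer = 0;
  let bitCount = 0;
  let output = "";
  for (let i = 0; i < input.length; i++) {
    const value = BASE64URL_ALPHABET.indexOf(input.charAt(i));
    if (value < 0) {
      throw new Error("invalid base64url character: " + input.charAt(i));
    }
    buffer = ((buffer << 6) | value) & 0xfff; // keep at most 12 pending bits
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      output += String.fromCharCode((buffer >> bitCount) & 0xff);
    }
  }
  return output;
}

// Return true when the JWT header declares "alg": "none" (case-insensitive).
// Throws if the header segment is not valid base64url-encoded JSON.
function declaresAlgNone(token: string): boolean {
  const headerSegment = token.split(".")[0];
  const header = JSON.parse(base64UrlDecode(headerSegment));
  return typeof header.alg === "string" && header.alg.toLowerCase() === "none";
}
```

A scanner would typically follow this check by replaying the unsigned token against the target to see whether it is still accepted.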

Client & Server Side

  • XSS (reflected, stored, DOM)
  • SSRF with cloud metadata detection
  • CORS misconfiguration
  • Path traversal / LFI
  • Open redirect
  • Race conditions
  • File upload bypass
  • Mass assignment

Every finding includes CWE ID, CVSS 3.1 score, OWASP Top 10 category, MITRE ATT&CK technique, and remediation steps.
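
The metadata listed above could be modelled roughly as follows. This is a minimal sketch: the `Finding` interface and its field names are assumptions, not numasec's actual schema. The score-to-severity bands follow the qualitative rating scale in the CVSS v3.1 specification.

```typescript
// Hypothetical shape; numasec's real finding schema may differ.
interface Finding {
  title: string;
  cweId: string;           // e.g. "CWE-89" (SQL injection)
  cvssScore: number;       // CVSS 3.1 base score, 0.0 to 10.0
  owaspCategory: string;   // e.g. "A03:2021 Injection"
  attackTechnique: string; // MITRE ATT&CK technique ID, e.g. "T1190"
  remediation: string;
}

type Severity = "none" | "low" | "medium" | "high" | "critical";

// Map a CVSS 3.1 base score to its qualitative severity rating.
function severityFromCvss(score: number): Severity {
  if (isNaN(score) || score < 0 || score > 10) {
    throw new RangeError("CVSS score must be between 0.0 and 10.0");
  }
  if (score === 0) return "none";
  if (score < 4.0) return "low";
  if (score < 7.0) return "medium";
  if (score < 9.0) return "high";
  return "critical";
}

// Example record with illustrative values only.
const exampleFinding: Finding = {
  title: "SQL injection in login form",
  cweId: "CWE-89",
  cvssScore: 9.8,
  owaspCategory: "A03:2021 Injection",
  attackTechnique: "T1190",
  remediation: "Use parameterized queries for all database access.",
};
const exampleSeverity = severityFromCvss(exampleFinding.cvssScore);
```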

How it works

graph TD
    A["pentest https://app.com"] --> B
    B["🗺️ Planner + Playbooks\n48 templates, payload packs, and playbooks\nChooses what to hit next from the live surface"]
    B --> C
    C["⚔️ Stateful Runtime\nBrowser actors · shared auth · working memory\nRecovery, replay, resource inventory"]
    C --> D
    D["🔧 37 Native Security Tools\nRecon · auth · injection · browser · replay\nBuilt to keep pushing, not just probe once"]
    D --> E
    E["🧠 Evidence Graph\nNodes + edges for evidence, hypotheses,\nfindings, resources, and attack paths"]
    E --> F
    F["📄 Report\nSARIF · HTML · Markdown"]

    style B fill:#1a1a2e,color:#e0e0e0,stroke:#DC143C
    style C fill:#1a1a2e,color:#e0e0e0,stroke:#DC143C
    style D fill:#1a1a2e,color:#e0e0e0,stroke:#DC143C
    style E fill:#1a1a2e,color:#e0e0e0,stroke:#DC143C
    style F fill:#1a1a2e,color:#e0e0e0,stroke:#DC143C

Every serious run now leaves behind graph nodes and edges — evidence, hypotheses, findings, resources, attack paths — so numasec remembers what it proved instead of rediscovering the same app every turn.
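
One way to picture the graph described above is a small node-and-edge store with a walk that derives chains from it. The sketch below is an assumption about the general idea, not numasec's internal data model: the `EvidenceGraph` class, its method names, and the node IDs are all hypothetical. The example chain reuses the default-credentials path mentioned earlier.

```typescript
// Hypothetical model of an evidence graph; not numasec's actual implementation.
type NodeKind = "evidence" | "hypothesis" | "finding" | "resource" | "attack_path";

interface GraphNode { id: string; kind: NodeKind; label: string; }
interface GraphEdge { from: string; to: string; }

class EvidenceGraph {
  private nodes: { [id: string]: GraphNode } = Object.create(null);
  private edges: GraphEdge[] = [];

  addNode(node: GraphNode): void {
    this.nodes[node.id] = node;
  }

  addEdge(from: string, to: string): void {
    if (!this.nodes[from] || !this.nodes[to]) {
      throw new Error("unknown node in edge " + from + " -> " + to);
    }
    this.edges.push({ from, to });
  }

  // Depth-first walk returning every maximal chain of labels reachable from
  // startId. Nodes already on the current path are skipped to avoid cycles.
  chainsFrom(startId: string): string[][] {
    if (!this.nodes[startId]) throw new Error("unknown node " + startId);
    const walk = (id: string, visited: string[]): string[][] => {
      const path = visited.concat(id);
      const next = this.edges.filter(
        (e) => e.from === id && path.indexOf(e.to) < 0
      );
      if (next.length === 0) return [path.map((n) => this.nodes[n].label)];
      let chains: string[][] = [];
      for (const edge of next) chains = chains.concat(walk(edge.to, path));
      return chains;
    };
    return walk(startId, []);
  }
}

const graph = new EvidenceGraph();
graph.addNode({ id: "f1", kind: "finding", label: "Default credentials" });
graph.addNode({ id: "f2", kind: "finding", label: "Admin access" });
graph.addNode({ id: "f3", kind: "finding", label: "User enumeration" });
graph.addEdge("f1", "f2");
graph.addEdge("f2", "f3");
const chains = graph.chainsFrom("f1");
```

Keeping proven steps as nodes means a later turn can extend an existing chain instead of re-deriving it from scratch.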

Reports include executive summary, risk score, OWASP coverage matrix, attack paths, and per-finding remediation. SARIF plugs into GitHub Code Scanning and GitLab SAST.
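
For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log from a list of findings. The `ReportFinding` type, the severity-to-level mapping, and the use of a target URL as the artifact location are assumptions made for illustration; numasec's real exporter may structure its output differently.

```typescript
// Hypothetical input type; not numasec's actual report model.
type ReportSeverity = "low" | "medium" | "high" | "critical";

interface ReportFinding {
  ruleId: string; // e.g. a CWE identifier
  title: string;
  severity: ReportSeverity;
  uri: string;    // where the issue was observed (assumed: the target URL)
}

// SARIF result levels are "none", "note", "warning", and "error".
// This particular mapping from severity is an assumption.
function toSarifLevel(severity: ReportSeverity): "note" | "warning" | "error" {
  if (severity === "low") return "note";
  if (severity === "medium") return "warning";
  return "error";
}

function toSarif(findings: ReportFinding[]) {
  // Each rule is listed once, even when several findings share it.
  const seen: { [id: string]: boolean } = {};
  const rules: { id: string; shortDescription: { text: string } }[] = [];
  for (const f of findings) {
    if (seen[f.ruleId]) continue;
    seen[f.ruleId] = true;
    rules.push({ id: f.ruleId, shortDescription: { text: f.title } });
  }
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "numasec", rules } },
        results: findings.map((f) => ({
          ruleId: f.ruleId,
          level: toSarifLevel(f.severity),
          message: { text: f.title },
          locations: [
            { physicalLocation: { artifactLocation: { uri: f.uri } } },
          ],
        })),
      },
    ],
  };
}

const sarifLog = toSarif([
  {
    ruleId: "CWE-89",
    title: "SQL injection in login form",
    severity: "critical",
    uri: "https://yourapp.com/login",
  },
]);
```

Serializing `sarifLog` with `JSON.stringify` gives a file of the general shape that SARIF consumers expect.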

LLM Providers

All 37 tools execute inside numasec. The model decides what to do next; numasec performs the recon, testing, evidence capture, and reporting.
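
The split described above, where the model chooses and the agent executes, can be sketched as a simple loop. Everything here is hypothetical: `Decide` stands in for a call to an LLM provider, the tool registry is a plain object, and real provider calls would be asynchronous. It is meant only to show which side does what.

```typescript
// Hypothetical types; numasec's real orchestration is not shown in this document.
interface ToolCall { tool: string; args: { [key: string]: string }; }

// Stand-in for the LLM: given the history so far, pick the next tool or stop.
type Decide = (history: string[]) => ToolCall | null;

// A tool runs locally and returns its output as text.
type Tool = (args: { [key: string]: string }) => string;

function runLoop(
  decide: Decide,
  tools: { [name: string]: Tool },
  maxSteps: number = 10
): string[] {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const call = decide(history); // the model only chooses the next action
    if (call === null) break;     // the model signals it is done
    const known = Object.prototype.hasOwnProperty.call(tools, call.tool);
    const output = known
      ? tools[call.tool](call.args) // execution happens on the agent side
      : "unknown tool";
    history.push(call.tool + ": " + output);
  }
  return history;
}

// Scripted decision function used in place of a real model.
const scriptedDecide: Decide = (history) =>
  history.length === 0
    ? { tool: "http_probe", args: { url: "https://yourapp.com" } }
    : null;

const demoHistory = runLoop(scriptedDecide, {
  http_probe: (args) => "probed " + args.url,
});
```

Because tools run inside the loop rather than inside the provider, swapping a hosted model for a local one changes only the `decide` function.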

Use case          Provider examples                                                   Notes
Hosted reasoning  Anthropic, OpenAI, xAI, Google, OpenRouter, Bedrock, GitHub Models  Best when you want stronger reasoning on harder chains and longer investigations
Local / private   Ollama                                                              Best when you want local execution and no external model API spend

Installation

npm (recommended)

npm install -g numasec
numasec

For browser automation, install Chromium once:

npx playwright install chromium

From source (local build)

git clone https://github.com/FrancescoStabile/numasec.git
cd numasec
bash install.sh

This installs a local build from your current checkout. Pull updates in the repo, then rerun bash install.sh to refresh it.

Or manually:

cd numasec/agent
bun install
cd packages/numasec
NUMASEC_CHANNEL=local NUMASEC_VERSION=local bun run build
# Binary at dist/numasec-<platform>-<arch>/bin/numasec

Optional: external tools

numasec works standalone, but external probes get much better with this setup:

# Recommended
apt install nmap

# Optional
apt install sqlmap
apt install ffuf

Chromium is what unlocks the full browser side of numasec: login flows, SPA work, authenticated replay, and browser-driven attack paths.

Usage

numasec                  # Launch the TUI

Agent modes

Mode         What it does
🔴 pentest   Full PTES methodology: recon → vuln testing → exploitation → report
🔵 recon     Reconnaissance only, no exploitation
🟠 hunt      Systematic OWASP-style vulnerability hunting
🟡 review    Secure code review, no network scanning
🟢 report    Findings, attack paths, and deliverables

Canonical workflow commands

Command                          Description
/scope set <target>              Set engagement scope and begin reconnaissance
/scope show                      Show current scope and latest observed surface
/hypothesis list                 List evidence-graph hypotheses
/verify next                     Plan the next verification primitive
/evidence list                   List findings with available evidence
/evidence show <id-or-title>     Show full evidence for one finding
/chains list                     List derived attack paths
/finding list                    List findings by severity
/finding finalize <id-or-title>  Finalize one provisional finding through the closure path
/remediation plan                Generate prioritized remediation actions
/retest run [filter]             Replay and retest saved findings
/report status                   Show report readiness, blockers, and whether final export is currently possible
/report generate [format] [--out <path>] [--final] [--note <text>]
                                 Generate report (markdown, html, sarif); default is a working report, --final enforces readiness
/report finalize [format] [--out <path>] [--working] [--note <text>]
                                 Run the closure-aware report path; blocks with exact blocker commands instead of drifting

Legacy aliases still supported in v1.x:

Legacy command           Current replacement
/target <url>            /scope set <url>
/findings                /finding list
/report <format>         /report generate <format>
/evidence                /evidence list
/evidence <id-or-title>  /evidence show <id-or-title>
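
The alias mapping above amounts to a small rewrite step before command dispatch. The function below is a sketch of that idea under stated assumptions: `resolveLegacyCommand` is a hypothetical name, and treating `list`, `show`, `status`, `generate`, and `finalize` as already-current subcommands is a guess at how ambiguity might be handled, not a description of numasec's parser.

```typescript
// Hypothetical resolver; numasec's actual command parser is not shown here.
function resolveLegacyCommand(input: string): string {
  const trimmed = input.trim();
  const parts = trimmed.split(/\s+/);
  const command = parts[0];
  const rest = parts.slice(1);
  const arg = rest.join(" ");

  switch (command) {
    case "/target":
      // /target <url> becomes /scope set <url>
      return arg ? "/scope set " + arg : trimmed;
    case "/findings":
      return "/finding list";
    case "/report":
      // Current subcommands pass through unchanged (assumed rule).
      if (!arg || ["status", "generate", "finalize"].indexOf(rest[0]) >= 0) {
        return trimmed;
      }
      return "/report generate " + arg;
    case "/evidence":
      if (!arg) return "/evidence list";
      if (rest[0] === "list" || rest[0] === "show") return trimmed;
      return "/evidence show " + arg;
    default:
      return trimmed;
  }
}
```

For instance, `resolveLegacyCommand("/report sarif")` yields `/report generate sarif`, while `/report status` is returned unchanged.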

Development

cd agent
bun install

# Type check
bun typecheck

# Tests
cd packages/numasec && bun test --timeout 30000

# Runtime validation
cd packages/numasec && bun run test:runtime

# Benchmark proof pack (runtime eval + live fixture + optional local Juice Shop)
cd packages/numasec && bun run test:benchmark-proof

# Build
cd packages/numasec && bun run build

Contributing

Issues, PRs, and ideas are welcome.

  • Found a bug? Open an issue with steps to reproduce.
  • Want to contribute code? Fork, branch from dev, open a PR.

Built by Francesco Stabile.

MIT License

SEE ALSO

clihub    4/21/2026    NUMASEC(1)