NAME
roam-code — Architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent…
SYNOPSIS
pip install roam-code
DESCRIPTION
Architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping. 140 commands, 102 MCP tools, 27 languages, 100% local.
README
roam-code
The architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping, runtime analysis -- one CLI, zero API keys.
140 commands · 102 MCP tools · 27 languages · 100% local
What is Roam?
Roam is a structural intelligence engine for software. It pre-indexes your codebase into a semantic graph -- symbols, dependencies, call graphs, architecture layers, git history, and runtime traces -- stored in a local SQLite DB. Agents query it via CLI or MCP instead of repeatedly grepping files and guessing structure.
Unlike LSPs (editor-bound, language-specific) or Sourcegraph (hosted search), Roam provides architecture-level graph queries -- offline, cross-language, and compact. It goes beyond comprehension: Roam governs architecture through budget gates, simulates refactoring outcomes, orchestrates multi-agent swarms with zero-conflict guarantees, maps vulnerability reachability paths, and enables graph-level code editing without syntax errors.
Codebase ──> [Index] ──> Semantic Graph ──> 140 Commands ──> AI Agent
                │               │                 │
           tree-sitter      symbols          comprehend
           27 languages     + edges          govern
           git history      + metrics        refactor
           runtime traces   + architecture   orchestrate
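To make "a semantic graph stored in a local SQLite DB" concrete, here is a minimal Python sketch of a symbol/edge store with a caller lookup. The table names, columns, and the `build_demo_graph` and `callers_of` helpers are assumptions made for illustration; this README does not document Roam's actual schema.

```python
import sqlite3

# Hypothetical two-table schema for illustration only; Roam's real schema may differ.
SCHEMA = """
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT, line INTEGER);
CREATE TABLE edges (src INTEGER REFERENCES symbols(id),
                    dst INTEGER REFERENCES symbols(id),
                    kind TEXT);
"""


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph with one class and two callers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO symbols (id, name, kind, file, line) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Flask", "cls", "src/flask/app.py", 76),
            (2, "FlaskClient", "cls", "src/flask/testing.py", 22),
            (3, "test_app_factory", "fn", "tests/test_basic.py", 12),
        ],
    )
    # Each edge points from the caller (src) to the callee (dst).
    conn.executemany(
        "INSERT INTO edges (src, dst, kind) VALUES (?, ?, ?)",
        [(2, 1, "call"), (3, 1, "call")],
    )
    return conn


def callers_of(conn: sqlite3.Connection, name: str) -> list:
    """Return (name, file, line) for every symbol with an edge into `name`."""
    return conn.execute(
        """
        SELECT caller.name, caller.file, caller.line
        FROM edges
        JOIN symbols AS callee ON callee.id = edges.dst
        JOIN symbols AS caller ON caller.id = edges.src
        WHERE callee.name = ?
        ORDER BY caller.file
        """,
        (name,),
    ).fetchall()
```

A single join answers "who calls Flask?" without rescanning source files, which is the kind of question a pre-built index is meant to make cheap.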
The problem
Coding agents explore codebases inefficiently: dozens of grep/read cycles, high token cost, no structural understanding. Roam replaces this with one graph query:
$ roam context Flask
Callers: 47  Callees: 3  Affected tests: 31

Files to read:
  src/flask/app.py:76-963        # definition
  src/flask/__init__.py:1-15     # re-export
  src/flask/testing.py:22-45     # caller: FlaskClient.__init__
  tests/test_basic.py:12-30      # caller: test_app_factory
  ...12 more files
Core commands
$ roam understand # full codebase briefing
$ roam context <name> # files-to-read with exact line ranges
$ roam preflight <name> # blast radius + tests + complexity + architecture rules
$ roam health # composite score (0-100)
$ roam diff # blast radius of uncommitted changes
What's New in v11
v11.2 -- AST Clone Detection + Debug Artifact Rules
- roam clones: New AST structural clone detection via subtree hashing. Finds Type-2 clones (identical control flow, different identifiers/literals) with Jaccard similarity scoring, Union-Find clustering, and automated refactoring suggestions. More precise than the metric-based duplicates command.
- 9 debug artifact rules (COR-560 through COR-568): Detect leftover print(), breakpoint(), pdb.set_trace(), console.log(), debugger, and System.out.println() in Python, JavaScript, TypeScript, and Java code. All use ast_match type with test file exemptions.
- 140 commands, 102 MCP tools.
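The idea behind Type-2 clone detection can be sketched with Python's standard `ast` module: blank out identifiers and literals, hash every statement subtree, and compare the resulting hash sets with Jaccard similarity. This is an illustrative approximation, not Roam's implementation; the names `_Normalizer`, `subtree_hashes`, and `jaccard` are hypothetical, and the Union-Find clustering step is omitted.

```python
import ast
import hashlib


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers and literals with placeholders so only structure remains."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node):
        node.arg = "_"
        node.annotation = None
        return node

    def visit_Constant(self, node):
        return ast.copy_location(ast.Constant(value=0), node)

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)  # normalize arguments and body too
        return node


def subtree_hashes(source: str) -> set:
    """Hash the normalized dump of every statement subtree in `source`."""
    tree = _Normalizer().visit(ast.parse(source))
    return {
        hashlib.sha1(ast.dump(node).encode()).hexdigest()
        for node in ast.walk(tree)
        if isinstance(node, ast.stmt)
    }


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Two functions that differ only in names and literal values normalize to the same tree and score 1.0, while a structurally different function shares few or no subtree hashes.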
v11.1.2 -- SQL + Scala Tier 1, 27 Languages
- SQL DDL promoted to Tier 1 with dedicated SqlExtractor -- tables, columns, views, functions, triggers, schemas, types (enums), sequences, ALTER TABLE ADD COLUMN. Foreign keys produce graph edges; views and triggers reference source tables. Database-schema projects now work with roam health, roam layers, roam impact, roam coupling and all graph commands.
- Scala promoted to Tier 1 with dedicated ScalaExtractor -- classes, traits, objects, case classes, sealed hierarchies, val/var properties, type aliases, imports, and inheritance. Full extends + with trait mixin resolution.
- 27 languages with 16 dedicated Tier 1 extractors.
- server.json for official MCP Registry submission.
v11.1.1 -- Command Quality Audit
- Full command audit: all 140 commands reviewed for usefulness, duplicates, and test coverage. ~20 bugs fixed, 21 new test files (700+ tests), every command docstring updated with cross-references to related commands.
- Kotlin promoted to Tier 1 via new YAML-based declarative extractor architecture. Classes, interfaces, enums, objects, functions, methods, properties, and inheritance fully extracted.
- 7 new commands: roam congestion, roam adrs, roam flag-dead, roam test-scaffold, roam sbom, roam triage, roam ci-setup.
- CI templates: roam ci-setup generates pipelines for GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, and Bitbucket.
- Bug fixes: --undocumented mode in intent (wrong DB table), --changed flag in verify (was permanently dead), lazy-load violation in visualize (~500ms penalty), exit code inconsistency in rules, VERDICT-first convention enforced across all commands.
- Code quality: 15 unused variables removed, dead code swept (4 orphaned cmd files, 2 dead helper functions), algo detector false-positive rate reduced (regex-in-loop: 7 to 1, list-prepend deque suppression), 6 regex patterns pre-compiled for loop performance.
v11.0 -- MCP v2 for Agent-First Workflows
- In-process MCP execution removes per-call subprocess overhead.
- 4 compound operations (roam_explore, roam_prepare_change, roam_review_change, roam_diagnose_issue) reduce multi-step agent workflows to single calls.
- Preset-based tool surfacing (core, review, refactor, debug, architecture, full) keeps default tool choice tight for agents while retaining full depth on demand.
- MCP tools now expose structured schemas and richer annotations for safer planner behavior.
- MCP token overhead for default core context dropped from ~36K to <3K tokens (about 92% reduction).
Performance and Retrieval
- Symbol search moved to SQLite FTS5/BM25: typical search moved from seconds to milliseconds (about 1000x on benchmarked paths).
- Incremental indexing shifted from O(N) full-edge rebuild behavior to O(changed) updates.
- DB/runtime optimizations (mmap_size, safer large-graph guards, batched writes) reduce first-run and reindex friction on larger repos.
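As a rough illustration of BM25-ranked symbol search on SQLite FTS5, the sketch below builds a small full-text index and queries it. It assumes the Python build's bundled SQLite was compiled with FTS5 (common, but not guaranteed); the table name `symbol_fts` and both helper functions are hypothetical, and the symbol names are borrowed from the minimap example later in this README.

```python
import sqlite3


def build_symbol_index(symbols) -> sqlite3.Connection:
    """Index (name, file) pairs; only `name` is searchable, `file` is stored as-is."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE symbol_fts USING fts5(name, file UNINDEXED)")
    conn.executemany("INSERT INTO symbol_fts (name, file) VALUES (?, ?)", symbols)
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list:
    """Return matching rows, best BM25 score first (lower bm25() means more relevant)."""
    return conn.execute(
        "SELECT name, file FROM symbol_fts "
        "WHERE symbol_fts MATCH ? ORDER BY bm25(symbol_fts) LIMIT ?",
        (query, limit),
    ).fetchall()


if __name__ == "__main__":
    index = build_symbol_index(
        [
            ("find_project_root", "src/roam/db/connection.py"),
            ("build_symbol_graph", "src/roam/graph/builder.py"),
            ("build_file_graph", "src/roam/graph/builder.py"),
            ("compute_pagerank", "src/roam/graph/pagerank.py"),
        ]
    )
    print(search(index, "build"))
```

The default FTS5 tokenizer splits on underscores, so a query for `build` matches both `build_symbol_graph` and `build_file_graph` through the inverted index rather than a table scan.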
CI, Governance, and Delivery
- GitHub Action supports quality gates, SARIF upload, sticky PR comments, and cache-aware execution.
- CI hardening includes changed-only analysis mode, trend-aware gates, and SARIF pre-upload guardrails (size/result caps + truncation signaling).
- Agent governance expanded with verification and AI-quality tooling (roam verify, roam vibe-check, roam ai-readiness, roam ai-ratio) for teams managing agent-written code.
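For pipelines that are scripted in Python rather than driven by the GitHub Action, a quality gate can be wrapped in a few lines. The sketch below relies only on what the command reference states (roam health --gate exits with code 5 on gate failure); the helper names and the three result labels are hypothetical, and treating every other non-zero code as a generic error is an assumption.

```python
import shutil
import subprocess

GATE_FAILURE_EXIT = 5  # exit code the command reference gives for a failed --gate check


def classify_exit(returncode: int) -> str:
    """Map a process exit code to a coarse CI outcome."""
    if returncode == 0:
        return "pass"
    if returncode == GATE_FAILURE_EXIT:
        return "gate-failed"
    return "error"  # assumption: any other non-zero code is a tool or usage error


def run_health_gate() -> str:
    """Run the gate if roam is on PATH; otherwise report that it is missing."""
    if shutil.which("roam") is None:
        return "roam-not-installed"
    completed = subprocess.run(["roam", "health", "--gate"], check=False)
    return classify_exit(completed.returncode)


if __name__ == "__main__":
    print(run_health_gate())
```

Separating a gate failure from other errors lets a pipeline fail the build on a real regression while retrying or alerting differently on infrastructure problems.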
Best for
- Agent-assisted coding -- structured answers that reduce token usage vs raw file exploration
- Large codebases (100+ files) -- graph queries beat linear search at scale
- Architecture governance -- health scores, CI quality gates, budget enforcement, fitness functions
- Safe refactoring -- blast radius, affected tests, pre-change safety checks, graph-level editing
- Multi-agent orchestration -- partition codebases for parallel agent work with zero-conflict guarantees
- Security analysis -- vulnerability reachability mapping, auth gaps, CVE path tracing
- Algorithm optimization -- detect O(n^2) loops, N+1 queries, and 21 other anti-patterns with suggested fixes
- Backend quality -- auth gaps, missing indexes, over-fetching models, non-idempotent migrations, orphan routes, API drift
- Runtime analysis -- overlay production trace data onto the static graph for hotspot detection
- Multi-repo projects -- cross-repo API edge detection between frontend and backend
When NOT to use Roam
- Real-time type checking -- use an LSP (pyright, gopls, tsserver). Roam is static and offline.
- Small scripts (<10 files) -- just read the files directly.
- Pure text search -- ripgrep is faster for raw string matching.
Why use Roam
Speed. One command replaces 5-10 tool calls (in typical workflows). Under 0.5s for any query.
Dependency-aware. Computes structure, not string matches. Knows Flask has 47 dependents and 31 affected tests. grep knows it appears 847 times.
LLM-optimized output. Plain ASCII, compact abbreviations (fn, cls, meth), --json envelopes. Designed for agent consumption, not human decoration.
Fully local. No API keys, telemetry, or network calls. Works in air-gapped environments.
Algorithm-aware. Built-in catalog of 23 anti-patterns. Detects suboptimal algorithms (quadratic loops, N+1 queries, unbounded recursion) and suggests fixes with Big-O improvements and confidence scores. Receiver-aware loop-invariant analysis minimizes false positives.
CI-ready. --json output, --gate quality gates, GitHub Action, SARIF 2.1.0.
| | Without Roam | With Roam |
|---|---|---|
| Tool calls | 8 | 1 |
| Wall time | ~11s | <0.5s |
| Tokens consumed | ~15,000 | ~3,000 |
Measured on a typical agent workflow in a 200-file Python project (Flask). See benchmarks for more.
Table of Contents
Getting Started: What is Roam? · What's New in v11 · Best for · Why use Roam · Install · Quick Start
Using Roam: Commands · Walkthrough · AI Coding Tools · MCP Server
Operations: CI/CD Integration · SARIF Output · For Teams
Reference: Language Support · Performance · How It Works · How Roam Compares · FAQ
More: Limitations · Troubleshooting · Update / Uninstall · Development · Contributing
Install
pip install roam-code
Recommended: isolated environment
pipx install roam-code
or
uv tool install roam-code
From source
pip install git+https://github.com/Cranot/roam-code.git
Requires Python 3.9+. Works on Linux, macOS, and Windows.
Windows: If roam is not found after installing with uv, run uv tool update-shell and restart your terminal.
Docker (alpine-based)
docker build -t roam-code .
docker run --rm -v "$PWD:/workspace" roam-code index
docker run --rm -v "$PWD:/workspace" roam-code health
Quick Start
cd your-project
roam init # indexes codebase, creates config + CI workflow
roam understand # full codebase briefing
First index takes ~5s for 200 files, ~15s for 1,000 files. Subsequent runs are incremental and near-instant.
Next steps:
- Set up your AI agent: roam describe --write (auto-detects CLAUDE.md, AGENTS.md, .cursor/rules, etc.; see integration instructions)
- Explore: roam health → roam weather → roam map
- Add to CI: roam init already generated a GitHub Action
Try it on Roam itself
git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
roam init
roam understand
roam health
Works With
Claude Code • Cursor • Windsurf • GitHub Copilot • Aider • Cline • Gemini CLI • OpenAI Codex CLI • MCP • GitHub Actions • GitLab CI • Azure DevOps
Commands
The 5 core commands shown above cover ~80% of agent workflows. All 140 commands are organized into 7 categories.
Full command reference
Getting Started
| Command | Description |
|---|---|
| roam index [--force] [--verbose] | Build or rebuild the codebase index |
| roam watch [--interval N] [--debounce N] [--webhook-port P] [--guardian] | Long-running index daemon: poll/webhook-triggered refreshes plus optional continuous architecture-guardian snapshots and JSONL compliance artifacts |
| roam init | Guided onboarding: creates .roam/fitness.yaml, CI workflow, runs index, shows health |
| roam hooks [--install] [--uninstall] | Manage git hooks for automated roam index updates and health gates |
| roam doctor | Diagnose installation and environment: verify tree-sitter grammars, SQLite, git, and config health |
| roam reset [--hard] | Reset the roam index and cached data. --hard removes all .roam/ artifacts |
| roam clean [--all] | Remove stale or orphaned index entries without a full rebuild |
| roam understand | Full codebase briefing: tech stack, architecture, key abstractions, health, conventions, complexity overview, entry points |
| roam onboard | Alias for understand |
| roam tour [--write PATH] | Auto-generated onboarding guide: top symbols, reading order, entry points, language breakdown. --write saves to Markdown |
| roam describe [--write] [--force] [-o PATH] [--agent-prompt] | Auto-generate project description for AI agents. --write auto-detects your agent's config file. --agent-prompt returns a compact (<500 token) system prompt |
| roam agent-export [--format F] [--write] | Generate agent-context bundle from project analysis (AGENTS.md + provider-specific overlays) |
| roam minimap [--update] [-o FILE] [--init-notes] | Compact annotated codebase snapshot for agent config injection: stack, annotated directory tree, key symbols by PageRank, high fan-in symbols to avoid touching, hotspots, conventions. Sentinel-based in-place updates |
| roam config [--set-db-dir PATH] [--semantic-backend MODE] | Manage .roam/config.json (DB path, excludes, optional ONNX semantic settings) |
| roam map [-n N] [--full] [--budget N] | Project skeleton: files, languages, entry points, top symbols by PageRank. --budget caps output to N tokens |
| roam schema [--diff] [--version V] | JSON envelope schema versioning: view, diff, and validate output schemas |
| roam mcp [--list-tools] [--transport T] | Start MCP server (stdio/SSE/streamable-http), inspect available tools, and expose roam to coding agents |
| roam mcp-setup <platform> | Generate MCP config snippets for AI platforms: claude-code, cursor, windsurf, vscode, gemini-cli, codex-cli |
| roam ci-setup [--platform P] [--write] | Generate CI/CD pipeline config (GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, Bitbucket) with SARIF + quality gates |
| roam adrs [--status S] [--limit N] | Discover Architecture Decision Records, link to affected code modules, show status and coverage |
Daily Workflow
| Command | Description |
|---|---|
| roam file <path> [--full] [--changed] [--deps-of PATH] | File skeleton: all definitions with signatures, cognitive load index, health score |
| roam symbol <name> [--full] | Symbol definition + callers + callees + metrics. Supports file:symbol disambiguation |
| roam context <symbol> [--task MODE] [--for-file PATH] | AI-optimized context: definition + callers + callees + files-to-read with line ranges |
| roam search <pattern> [--kind KIND] | Find symbols by name pattern, PageRank-ranked |
| roam grep <pattern> [-g glob] [-n N] | Text search annotated with enclosing symbol context |
| roam deps <path> [--full] | What a file imports and what imports it |
| roam trace <source> <target> [-k N] | Dependency paths with coupling strength and hub detection |
| roam impact <symbol> | Blast radius: what breaks if a symbol changes (Personalized PageRank weighted) |
| roam diff [--staged] [--full] [REV_RANGE] | Blast radius of uncommitted changes or a commit range |
| roam pr-risk [REV_RANGE] | PR risk score (0-100, multiplicative model) + structural spread + suggested reviewers |
| roam pr-diff [--staged] [--range R] [--format markdown] | Structural PR diff: metric deltas, edge analysis, symbol changes, footprint. A graph delta, not a text diff |
| roam api-changes [REV_RANGE] | API change classifier: breaking/non-breaking changes, severity, and affected contracts |
| roam semantic-diff [REV_RANGE] | Structural change summary: symbols added/removed/modified and changed call edges |
| roam test-gaps [REV_RANGE] | Changed-symbol test gap detection: what changed and what still lacks test coverage |
| roam affected [REV_RANGE] | Monorepo/package impact analysis: what components are affected by a change |
| roam attest [REV_RANGE] [--format markdown] [--sign] | Proof-carrying PR attestation: bundles blast radius, risk, breaking changes, fitness, budget, tests, effects into one verifiable artifact |
| roam annotate <symbol> <note> | Attach persistent notes to symbols (agentic memory across sessions) |
| roam annotations [--file F] [--symbol S] | View stored annotations |
| roam diagnose <symbol> [--depth N] | Root cause analysis: ranks suspects by z-score normalized risk |
| roam preflight <symbol\|file> | Compound pre-change check: blast radius + tests + complexity + coupling + fitness |
| roam guard <symbol> | Compact sub-agent preflight bundle: definition, 1-hop callers/callees, test files, breaking-risk score, and layer signals |
| roam agent-plan --agents N | Decompose partitions into dependency-ordered agent tasks with merge sequencing and handoffs |
| roam agent-context --agent-id N [--agents M] | Generate per-agent execution context: write scope, read-only dependencies, and interface contracts |
| roam syntax-check [--changed] [PATHS...] | Tree-sitter syntax integrity check for changed files and multi-agent judge workflows |
| roam verify [--threshold N] | Pre-commit AI-code consistency check across naming, imports, error handling, and duplication signals |
| roam verify-imports [--file F] | Import hallucination firewall: validate all imports against indexed symbol table, suggest corrections via FTS5 fuzzy matching |
| roam triage list\|add\|stats\|check | Security finding suppression workflow: manage .roam-suppressions.yml (SAFE/ACKNOWLEDGED/WONT-FIX status lifecycle) |
| roam safe-delete <symbol> | Safe deletion check: SAFE/REVIEW/UNSAFE verdict |
| roam test-map <name> | Map a symbol or file to its test coverage |
| roam adversarial [--staged] [--range R] | Adversarial architecture review: generates targeted challenges based on changes |
| roam plan [--staged] [--range R] [--agents N] | Agent work planner: decompose changes into sequenced, dependency-aware steps |
| roam closure <symbol> [--rename] [--delete] | Minimal-change synthesis: all files to touch for a safe rename/delete |
| roam mutate move\|rename\|add-call\|extract | Graph-level code editing: move symbols, rename across codebase, add calls, extract functions. Dry-run by default |
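The roam impact row mentions Personalized PageRank weighting. As a rough picture of what that can mean for a blast radius, the sketch below runs a plain power iteration that always restarts at the changed symbol, over a graph whose edges point from a symbol to the symbols that depend on it. The function, the damping value, the handling of dangling nodes, and the example graph are all assumptions for illustration; this is not Roam's implementation.

```python
def personalized_pagerank(graph, seed, damping=0.85, iterations=50):
    """Rank nodes by how much random-walk mass reaches them from `seed`.

    `graph` maps each node to the list of nodes it points to. For a blast
    radius, point edges from a symbol to its dependents (callee -> caller).
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += damping * rank[node]
        nxt[seed] += 1.0 - damping  # teleport always returns to the seed
        rank = nxt
    return rank


if __name__ == "__main__":
    dependents = {
        "Flask": ["FlaskClient", "create_app"],
        "FlaskClient": ["test_client"],
        "create_app": ["test_app_factory"],
    }
    scores = personalized_pagerank(dependents, "Flask")
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} {score:.3f}")
```

Because every restart lands on the changed symbol, direct dependents score higher than dependents two hops away, which gives a graded ranking instead of a flat reachable set.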
Codebase Health
| Command | Description |
|---|---|
| roam health [--no-framework] [--gate] | Composite health score (0-100): weighted geometric mean of tangle ratio, god components, bottlenecks, layer violations. --gate runs quality gate checks from .roam-gates.yml (exit 5 on failure) |
| roam smells [--file F] [--min-severity S] | Code smell detection: 15 deterministic detectors (brain methods, god classes, feature envy, shotgun surgery, data clumps, etc.) with per-file health scores |
| roam dashboard | Unified single-screen project status: health, hotspots, risks, ownership, and AI-rot indicators |
| roam vibe-check [--threshold N] | AI-rot auditor: 8-pattern taxonomy with composite risk score and prioritized findings |
| roam ai-readiness | 0-100 score for how well this codebase supports AI coding agents |
| roam ai-ratio [--since N] | Statistical estimate of AI-generated code ratio using commit-behavior signals |
| roam trends [--record] [--days N] [--metric M] | Historical metrics snapshots with sparklines and trend deltas |
| roam complexity [--bumpy-road] | Per-function cognitive complexity (SonarSource-compatible, triangular nesting penalty) + Halstead metrics (volume, difficulty, effort, bugs) + cyclomatic density |
| roam algo [--task T] [--confidence C] [--profile P] | Algorithm anti-pattern detection: 23-pattern catalog detects suboptimal algorithms (O(n^2) loops, N+1 queries, quadratic string building, branching recursion, loop-invariant calls) and suggests better approaches with Big-O improvements. Confidence calibration via caller-count + runtime traces, evidence paths, impact scoring, framework-aware N+1 packs, and language-aware fix templates. Alias: roam math |
| roam n1 [--confidence C] [--verbose] | Implicit N+1 I/O detection: finds ORM model computed properties ($appends/accessors) that trigger lazy-loaded DB queries in collection contexts. Cross-references with eager loading config. Supports Laravel, Django, Rails, SQLAlchemy, JPA |
| roam over-fetch [--threshold N] [--confidence C] | Detect models serializing too many fields: large $fillable without $hidden/$visible, direct controller returns bypassing API Resources, poor exposed-to-hidden ratio |
| roam missing-index [--table T] [--confidence C] | Find queries on non-indexed columns: cross-references WHERE/ORDER BY clauses, foreign keys, and paginated queries against migration-defined indexes |
| roam weather [-n N] | Hotspots ranked by geometric mean of churn x complexity (percentile-normalized) |
| roam debt [--roi] | Hotspot-weighted tech debt prioritization with SQALE remediation costs and optional refactoring ROI estimates |
| roam fitness [--explain] | Architectural fitness functions from .roam/fitness.yaml |
| roam alerts | Health degradation trend detection (Mann-Kendall + Sen's slope) |
| roam forecast [--symbol S] [--horizon N] [--alert-only] | Predict when metrics will exceed thresholds: Theil-Sen regression on snapshot history + churn-weighted per-symbol risk |
| roam budget [--init] [--staged] [--range R] | Architectural budget enforcement: per-PR delta limits on health, cycles, complexity. CI gate (exit 5 on violation) |
| roam bisect [--metric M] [--range R] | Architectural git bisect: find the commit that degraded a specific metric |
| roam ingest-trace <file> [--otel\|--jaeger\|--zipkin\|--generic] | Ingest runtime trace data (OpenTelemetry, Jaeger, Zipkin) for hotspot overlay |
| roam hotspots [--runtime] [--discrepancy] | Runtime hotspot analysis: find symbols missed by static analysis but critical at runtime |
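The health score is described above as a weighted geometric mean of several signals. A minimal sketch of that aggregation follows, assuming each signal has already been normalized to a 0-1 sub-score where 1 is healthy; the sub-score names, the equal weights, and the clamping floor are illustrative assumptions, not Roam's actual parameters.

```python
import math


def weighted_geometric_mean(scores: dict, weights: dict) -> float:
    """Combine 0-1 sub-scores as exp(sum(w * ln(s)) / sum(w))."""
    total_weight = sum(weights[name] for name in scores)
    log_sum = sum(
        weights[name] * math.log(max(value, 1e-9))  # clamp so ln(0) never occurs
        for name, value in scores.items()
    )
    return math.exp(log_sum / total_weight)


if __name__ == "__main__":
    sub_scores = {
        "tangle_ratio": 0.9,
        "god_components": 0.8,
        "bottlenecks": 1.0,
        "layer_violations": 0.5,
    }
    equal_weights = {name: 1.0 for name in sub_scores}
    health = 100 * weighted_geometric_mean(sub_scores, equal_weights)
    print(f"health = {health:.1f}")
```

Unlike an arithmetic mean, a geometric mean lets one very poor sub-score pull the composite down sharply, so a badly tangled codebase cannot hide behind good results elsewhere.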
roam algo — algorithm anti-pattern catalog (23 patterns)
roam algo scans every indexed function against a 23-pattern catalog, ranks findings by runtime-aware impact score, and shows the exact Big-O improvement available. Findings include semantic evidence paths, precision metadata, and language-aware tips/fixes (Python, JS, Go, Rust, Java, etc.):
$ roam algo
VERDICT: 8 algorithmic improvements found (3 high, 4 medium, 1 low)
Ordering: highest impact first
Profile: balanced (filtered 0 low-signal findings)

Nested loop lookup (2):
  fn resolve_permissions  src/auth/rbac.py:112  [high, impact=86.4]
    Current: Nested iteration -- O(n*m)
    Better:  Hash-map join -- O(n+m)
    Tip:     Build a dict/set from one collection, iterate the other

  fn find_matching_rule  src/rules/engine.py:67  [high, impact=78.1]
    Current: Nested iteration -- O(n*m)
    Better:  Hash-map join -- O(n+m)
    Tip:     Build a dict/set from one collection, iterate the other

String building (1):
  meth build_query  src/db/query.py:88  [high, impact=74.0]
    Current: Loop concatenation -- O(n^2)
    Better:  Join / StringBuilder -- O(n)
    Tip:     Collect parts in a list, join once at the end

Branching recursion without memoization (1):
  fn compute_cost  src/pricing/calc.py:34  [medium, impact=49.5]
    Current: Naive branching recursion -- O(2^n)
    Better:  Memoized / iterative DP -- O(n)
    Tip:     Add @cache / @lru_cache, or convert to iterative with a table
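The last finding in the sample output suggests adding a cache to a branching recursion. A minimal before/after sketch of that fix is shown here; `cost_naive` and `cost_memo` are hypothetical stand-ins using the f(n-1) + f(n-2) shape from the catalog, not the actual compute_cost function from the sample.

```python
from functools import lru_cache


def cost_naive(n: int) -> int:
    """Recomputes the same subproblems over and over: O(2^n) calls."""
    return n if n < 2 else cost_naive(n - 1) + cost_naive(n - 2)


@lru_cache(maxsize=None)
def cost_memo(n: int) -> int:
    """Same recurrence, but each n is computed once and then cached: O(n) calls."""
    return n if n < 2 else cost_memo(n - 1) + cost_memo(n - 2)


if __name__ == "__main__":
    print(cost_naive(20), cost_memo(20))
```

Both functions return the same values; only the number of recursive calls changes, which is why the suggested fix is a one-line decorator.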
Full catalog — 23 patterns:
| Pattern | Anti-pattern detected | Better approach | Improvement |
|---|---|---|---|
| Nested loop lookup | for x in a: for y in b: if x==y | Hash-map join | O(n·m) → O(n+m) |
| Membership test | if x in list in a loop | Set lookup | O(n) → O(1) per check |
| Sorting | Bubble / selection sort | Built-in sort | O(n²) → O(n log n) |
| Search in sorted data | Linear scan on sorted sequence | Binary search | O(n) → O(log n) |
| String building | s += chunk in loop | join() / StringBuilder | O(n²) → O(n) |
| Deduplication | Nested loop dedup | set() / dict.fromkeys | O(n²) → O(n) |
| Max / min | Manual tracking loop | max() / min() | idiom |
| Accumulation | Manual accumulator | sum() / reduce() | idiom |
| Group by key | Manual key-existence check | defaultdict / groupingBy | idiom |
| Fibonacci | Naive recursion | Iterative / @lru_cache | O(2ⁿ) → O(n) |
| Exponentiation | Loop multiplication | pow(b, e, mod) | O(n) → O(log n) |
| GCD | Manual loop | math.gcd() | O(n) → O(log n) |
| Matrix multiply | Naive triple loop | NumPy / BLAS | same asymptotic, ~1000× faster via SIMD |
| Busy wait | while True: sleep() poll | Event / condition variable | O(k) → O(1) wake-up |
| Regex in loop | re.match() compiled per iteration | Pre-compiled pattern | O(n·(p+m)) → O(p + n·m) |
| N+1 query | Per-item DB / API call in loop | Batch WHERE IN (...) | n round-trips → 1 |
| List front operations | list.insert(0, x) in loop | collections.deque | O(n) → O(1) per op |
| Sort to select | sorted(x)[0] or sorted(x)[:k] | min() / heapq.nsmallest | O(n log n) → O(n) or O(n log k) |
| Repeated lookup | .index() / .contains() inside loop | Pre-built set / dict | O(m) → O(1) per lookup |
| Branching recursion | Naive f(n-1) + f(n-2) without cache | @cache / iterative DP | O(2ⁿ) → O(n) |
| Quadratic string building | result += chunk across multiple scopes | parts.append + join at end | O(n²) → O(n) |
| Loop-invariant call | get_config() / compile_schema() inside loop body | Hoist before loop | per-iter cost → O(1) |
| String reversal | Manual char-by-char loop | s[::-1] / .reverse() | idiom |
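To make the first catalog row concrete, here is a small before/after pair for the nested loop lookup pattern. The function names and sample data are hypothetical; the point is only the change from a rescan per element to one set build plus constant-time membership checks.

```python
def shared_ids_nested(a: list, b: list) -> list:
    """Anti-pattern: for every x in a, scan all of b. O(n*m) comparisons."""
    return [x for x in a if any(x == y for y in b)]


def shared_ids_hashed(a: list, b: list) -> list:
    """Hash-map join: build a set from b once, then do O(1) lookups. O(n+m)."""
    lookup = set(b)
    return [x for x in a if x in lookup]


if __name__ == "__main__":
    users = [3, 1, 4, 1, 5, 9]
    admins = [9, 4, 7]
    print(shared_ids_nested(users, admins))
    print(shared_ids_hashed(users, admins))
```

The hashed version requires the elements to be hashable, which is the usual trade-off when applying this rewrite.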
Filtering:
roam algo --task nested-lookup # one pattern type only
roam algo --confidence high # high-confidence findings only
roam algo --profile strict # precision-first filtering
roam algo --task io-in-loop -n 5 # top 5 N+1 query sites
roam --json algo # machine-readable output
roam --sarif algo > roam-algo.sarif # SARIF with fingerprints + fixes
Confidence calibration: high = strong structural signal (unbounded loop + high caller/runtime impact + pattern confirmed); medium = pattern matched but uncertainty remains; low = heuristic signal only.
Profiles: balanced (default), strict (precision-first), aggressive (surface more candidates).
roam minimap — annotated codebase snapshot for agent configs
roam minimap generates a compact block (stack, annotated directory tree, key symbols, hotspots, conventions) wrapped in sentinel comments for in-place agent config updates:
$ roam minimap
<!-- roam:minimap generated=2026-02-25 -->
**Stack:** Python · JavaScript · YAML
.github/                      (CI + Action)
benchmarks/                   (agent-eval + oss-eval)
src/
  roam/
    bridges/
      base.py                 # LanguageBridge
      registry.py             # register_bridge, detect_bridges
    commands/ (137 cmd files) # is_test_file, get_changed_files
    db/
      connection.py           # find_project_root, batched_in
      schema.py
    graph/
      builder.py              # build_symbol_graph, build_file_graph
      pagerank.py             # compute_pagerank, compute_centrality
    languages/ (21 files)     # ApexExtractor
    output/
      formatter.py            # to_json, json_envelope
    cli.py                    # cli, LazyGroup
    mcp_server.py
tests/ (186 files)
Key symbols (PageRank): open_db · ensure_index · json_envelope · to_json · LanguageExtractor
Touch carefully (fan-in >= 15): to_json (116 callers) · json_envelope (116 callers) · open_db (105 callers) · ensure_index (100 callers)
Hotspots (churn x complexity): cmd_context.py · csharp_lang.py · cmd_dead.py
Conventions: snake_case fns, PascalCase classes
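A sentinel-based in-place update can be pictured as a bounded search-and-replace: find the block between the opening and closing comments and swap only that span. The sketch below is a hypothetical illustration of the technique, not Roam's code; the regex, the function name, and the choice to append the block when no sentinel exists are all assumptions.

```python
import re

# Matches from an opening sentinel (with optional attributes) to the closing one.
SENTINEL_RE = re.compile(
    r"<!-- roam:minimap\b[^>]*-->.*?<!-- /roam:minimap -->",
    re.DOTALL,
)


def replace_minimap_block(document: str, new_body: str, generated: str) -> str:
    """Swap the sentinel-delimited block, leaving all surrounding text untouched."""
    block = (
        f"<!-- roam:minimap generated={generated} -->\n"
        f"{new_body}\n"
        "<!-- /roam:minimap -->"
    )
    if SENTINEL_RE.search(document):
        # A callable replacement avoids backslash interpretation in `block`.
        return SENTINEL_RE.sub(lambda _match: block, document, count=1)
    # Assumption: with no existing block, append one at the end of the file.
    return document.rstrip("\n") + "\n\n" + block + "\n"


if __name__ == "__main__":
    doc = (
        "# Agent notes\n"
        "<!-- roam:minimap generated=2026-01-01 -->\nold snapshot\n<!-- /roam:minimap -->\n"
        "Hand-written section\n"
    )
    print(replace_minimap_block(doc, "new snapshot", "2026-02-25"))
```

Keeping the generated content inside a clearly delimited span is what allows hand-written notes in the same file to survive repeated regeneration.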
**Workflow:**
roam minimap # print to stdout
roam minimap --update # replace sentinel block in CLAUDE.md in-place
roam minimap -o docs/AGENTS.md # target a different file
roam minimap --init-notes # scaffold .roam/minimap-notes.md for project gotchas
</code></pre>
<p>The sentinel pair <code><!-- roam:minimap --></code> / <code><!-- /roam:minimap --></code> is replaced on each run — surrounding content is left intact. Add project-specific gotchas to <code>.roam/minimap-notes.md</code> and they appear in every subsequent output.</p>
<p><strong>Tree annotations</strong> come from the top exported symbols by fan-in per file. Non-source root directories (<code>.github/</code>, <code>benchmarks/</code>, <code>docs/</code>) are collapsed immediately. Large subdirectories (e.g. <code>commands/</code>, <code>languages/</code>) are collapsed at depth 2+ with a file count.</p>
</details>
<h3>Architecture</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam clusters [--min-size N]</code></td>
<td>Community detection vs directory structure. Modularity Q-score (Newman 2004) + per-cluster conductance</td>
</tr>
<tr>
<td><code>roam spectral [--depth N] [--compare] [--gap-only] [--k K]</code></td>
<td>Spectral bisection: Fiedler vector partition tree with algebraic connectivity gap verdict</td>
</tr>
<tr>
<td><code>roam layers</code></td>
<td>Topological dependency layers + upward violations + Gini balance</td>
</tr>
<tr>
<td><code>roam dead [--all] [--summary] [--clusters]</code></td>
<td>Unreferenced exported symbols with safety verdicts + confidence scoring (60-95%)</td>
</tr>
<tr>
<td><code>roam flag-dead [--config FILE] [--include-tests]</code></td>
<td>Feature flag dead code detection: stale LaunchDarkly/Unleash/Split/custom flags with staleness analysis</td>
</tr>
<tr>
<td><code>roam fan [symbol|file] [-n N] [--no-framework]</code></td>
<td>Fan-in/fan-out: most connected symbols or files</td>
</tr>
<tr>
<td><code>roam risk [-n N] [--domain KW] [--explain]</code></td>
<td>Domain-weighted risk ranking</td>
</tr>
<tr>
<td><code>roam why <name> [name2 ...]</code></td>
<td>Role classification (Hub/Bridge/Core/Leaf), reach, criticality</td>
</tr>
<tr>
<td><code>roam split <file></code></td>
<td>Internal symbol groups with isolation % and extraction suggestions</td>
</tr>
<tr>
<td><code>roam entry-points</code></td>
<td>Entry point catalog with protocol classification</td>
</tr>
<tr>
<td><code>roam patterns</code></td>
<td>Architectural pattern recognition: Strategy, Factory, Observer, etc.</td>
</tr>
<tr>
<td><code>roam visualize [--format mermaid|dot] [--focus NAME] [--limit N]</code></td>
<td>Generate Mermaid or DOT architecture diagrams. Smart filtering via PageRank, cluster grouping, cycle highlighting</td>
</tr>
<tr>
<td><code>roam effects [TARGET] [--file F] [--type T]</code></td>
<td>Side-effect classification: DB writes, network I/O, filesystem, global mutation. Direct + transitive effects through call graph</td>
</tr>
<tr>
<td><code>roam dark-matter [--min-cochanges N]</code></td>
<td>Detect hidden co-change couplings not explained by import/call edges</td>
</tr>
<tr>
<td><code>roam simulate move|extract|merge|delete</code></td>
<td>Counterfactual architecture simulator: test refactoring ideas in-memory, see metric deltas before writing code</td>
</tr>
<tr>
<td><code>roam orchestrate --agents N [--files P]</code></td>
<td>Multi-agent swarm partitioning: split codebase for parallel agents with zero-conflict guarantees</td>
</tr>
<tr>
<td><code>roam partition [--agents N]</code></td>
<td>Multi-agent partition manifest: conflict risk, complexity, and suggested ownership splits</td>
</tr>
<tr>
<td><code>roam fingerprint [--compact] [--compare F]</code></td>
<td>Topology fingerprint: extract/compare architectural signatures across repos</td>
</tr>
<tr>
<td><code>roam cut <target> [--depth N]</code></td>
<td>Minimum graph cuts: find critical edges whose removal disconnects components</td>
</tr>
<tr>
<td><code>roam safe-zones</code></td>
<td>Graph-based containment boundaries</td>
</tr>
<tr>
<td><code>roam coverage-gaps</code></td>
<td>Unprotected entry points with no path to gate symbols</td>
</tr>
<tr>
<td><code>roam duplicates [--threshold T] [--min-lines N]</code></td>
<td>Semantic duplicate detector: functionally equivalent code clusters with divergent edge-case handling</td>
</tr>
<tr>
<td><code>roam clones [--threshold T] [--min-lines N] [--scope P]</code></td>
<td>AST structural clone detection: Type-2 clones via subtree hashing (more precise than <code>duplicates</code>)</td>
</tr>
</tbody></table>
<h3>Exploration</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam module <path></code></td>
<td>Directory contents: exports, signatures, dependencies, cohesion</td>
</tr>
<tr>
<td><code>roam sketch <dir> [--full]</code></td>
<td>Compact structural skeleton of a directory</td>
</tr>
<tr>
<td><code>roam uses <name></code></td>
<td>All consumers: callers, importers, inheritors</td>
</tr>
<tr>
<td><code>roam owner <path></code></td>
<td>Code ownership: who owns a file or directory</td>
</tr>
<tr>
<td><code>roam coupling [-n N] [--set]</code></td>
<td>Temporal coupling: file pairs that change together (NPMI + lift)</td>
</tr>
<tr>
<td><code>roam fn-coupling</code></td>
<td>Function-level temporal coupling across files</td>
</tr>
<tr>
<td><code>roam bus-factor [--brain-methods]</code></td>
<td>Knowledge loss risk per module</td>
</tr>
<tr>
<td><code>roam doc-staleness</code></td>
<td>Detect stale docstrings</td>
</tr>
<tr>
<td><code>roam docs-coverage</code></td>
<td>Public-symbol doc coverage + stale docs + PageRank-ranked missing-doc hotlist</td>
</tr>
<tr>
<td><code>roam suggest-refactoring [--limit N] [--min-score N]</code></td>
<td>Proactive refactoring recommendations ranked by complexity, coupling, churn, smells, coverage gaps, and debt</td>
</tr>
<tr>
<td><code>roam plan-refactor <symbol> [--operation auto|extract|move]</code></td>
<td>Ordered refactor plan with blast radius, test gaps, layer risk, and simulation-based strategy preview</td>
</tr>
<tr>
<td><code>roam test-scaffold <name|file> [--write] [--framework F]</code></td>
<td>Generate test file/function/import skeletons from symbol data (pytest, jest, Go, JUnit, RSpec)</td>
</tr>
<tr>
<td><code>roam conventions</code></td>
<td>Auto-detect naming styles, import preferences. Flags outliers</td>
</tr>
<tr>
<td><code>roam breaking [REV_RANGE]</code></td>
<td>Breaking change detection: removed exports, signature changes</td>
</tr>
<tr>
<td><code>roam affected-tests <symbol|file></code></td>
<td>Trace reverse call graph to test files</td>
</tr>
<tr>
<td><code>roam relate <sym1> <sym2></code></td>
<td>Show relationship between two symbols: shared callers, shortest path, common ancestors</td>
</tr>
<tr>
<td><code>roam endpoints [--routes] [--api]</code></td>
<td>Enumerate all HTTP/API endpoint definitions and surface them for review or cross-repo matching</td>
</tr>
<tr>
<td><code>roam metrics <file|symbol></code></td>
<td>Unified vital signs: complexity, fan-in/out, PageRank, churn, test coverage, dead code risk -- all in one call</td>
</tr>
<tr>
<td><code>roam search-semantic <query></code></td>
<td>Hybrid semantic search: BM25 + TF-IDF + optional local ONNX vectors (select via <code>--backend</code>) with framework/library packs</td>
</tr>
<tr>
<td><code>roam intent [--staged] [--range R]</code></td>
<td>Doc-to-code linking: match documentation to symbols, detect drift</td>
</tr>
<tr>
<td><code>roam x-lang [--bridges] [--edges]</code></td>
<td>Cross-language edge browser: inspect bridge-resolved connections</td>
</tr>
</tbody></table>
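<p>The <code>roam coupling</code> row above reports NPMI and lift for file pairs that change together. As a rough illustration of what those two numbers mean, the sketch below computes them from commit counts using the textbook definitions. How Roam counts commits, windows history, or filters pairs is not described here, so the function name and inputs are assumptions.</p>
<pre><code class="language-python">import math


def coupling_scores(commits_a, commits_b, commits_both, total_commits):
    """Return (lift, npmi) for two files from commit counts."""
    if commits_both == 0:
        return 0.0, -1.0  # the files never changed together
    p_a = commits_a / total_commits
    p_b = commits_b / total_commits
    p_ab = commits_both / total_commits
    # Lift: observed co-change rate relative to what independence predicts.
    lift = p_ab / (p_a * p_b)
    if commits_both == total_commits:
        return lift, 1.0  # the files changed together in every commit
    # NPMI: pointwise mutual information normalized into the range [-1, 1].
    npmi = math.log(lift) / -math.log(p_ab)
    return lift, npmi
</code></pre>
<p>For example, with 100 commits where file A changed in 20, file B in 10, and both in 8, lift is 4.0 and NPMI is about 0.55. A lift of 1.0 (NPMI 0) means the files co-change no more often than chance.</p>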
<h3>Reports & CI</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam report [--list] [--config FILE] [PRESET]</code></td>
<td>Compound presets: <code>first-contact</code>, <code>security</code>, <code>pre-pr</code>, <code>refactor</code>, <code>guardian</code></td>
</tr>
<tr>
<td><code>roam describe --write</code></td>
<td>Generate agent config (auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.)</td>
</tr>
<tr>
<td><code>roam auth-gaps [--routes-only] [--controllers-only] [--min-confidence C]</code></td>
<td>Find endpoints missing authentication or authorization: routes outside auth middleware groups, CRUD methods without <code>$this->authorize()</code> / <code>Gate::allows()</code> checks. String-aware PHP brace parsing</td>
</tr>
<tr>
<td><code>roam orphan-routes [-n N] [--confidence C]</code></td>
<td>Detect backend routes with no frontend consumer: parses route definitions, searches frontend for API call references, reports controller methods with no route mapping</td>
</tr>
<tr>
<td><code>roam migration-safety [-n N] [--include-archive]</code></td>
<td>Detect non-idempotent migrations: missing <code>hasTable</code>/<code>hasColumn</code> guards, raw SQL without <code>IF NOT EXISTS</code>, index operations without existence checks</td>
</tr>
<tr>
<td><code>roam api-drift [--model M] [--confidence C]</code></td>
<td>Detect mismatches between PHP model <code>$fillable</code>/<code>$appends</code> fields and TypeScript interface properties. Auto-converts snake_case/camelCase for comparison. Single-repo; cross-repo planned for <code>roam ws api-drift</code></td>
</tr>
<tr>
<td><code>roam codeowners [--unowned] [--owner NAME]</code></td>
<td>CODEOWNERS coverage analysis: owned/unowned files, top owners, and ownership risk</td>
</tr>
<tr>
<td><code>roam drift [--threshold N]</code></td>
<td>Ownership drift detection: declared ownership vs observed maintenance activity</td>
</tr>
<tr>
<td><code>roam suggest-reviewers [REV_RANGE]</code></td>
<td>Reviewer recommendation via ownership, recency, breadth, and impact signals</td>
</tr>
<tr>
<td><code>roam simulate-departure <developer></code></td>
<td>Knowledge-loss simulation: what breaks if a key contributor leaves</td>
</tr>
<tr>
<td><code>roam dev-profile [--developer NAME] [--since N]</code></td>
<td>Developer productivity profile: commit patterns, specialization, impact, and knowledge concentration per contributor</td>
</tr>
<tr>
<td><code>roam secrets [--fail-on-found] [--include-tests]</code></td>
<td>Secret scanning with masking, entropy detection, env-var suppression, remediation suggestions, and optional CI gate failure</td>
</tr>
<tr>
<td><code>roam vulns [--import-file F] [--reachable-only]</code></td>
<td>Vulnerability scanning: ingest npm/pip/trivy/osv reports, auto-detect format, reachability filtering, SARIF output</td>
</tr>
<tr>
<td><code>roam path-coverage [--from P] [--to P] [--max-depth N]</code></td>
<td>Find critical call paths (entry -> sink) with zero test protection. Suggests optimal test insertion points</td>
</tr>
<tr>
<td><code>roam capsule [--redact-paths] [--no-signatures] [--output F]</code></td>
<td>Export sanitized structural graph (no code bodies) for external architectural review</td>
</tr>
<tr>
<td><code>roam rules [--init] [--ci] [--rules-dir D]</code></td>
<td>Plugin DSL for governance: user-defined path/symbol/AST rules via <code>.roam/rules/</code> YAML (<code>$METAVAR</code> captures supported)</td>
</tr>
<tr>
<td><code>roam check-rules [--severity S] [--fix]</code></td>
<td>Evaluate built-in and user-defined governance rules (10 built-in: no-circular-imports, max-fan-out, etc.)</td>
</tr>
<tr>
<td><code>roam vuln-map --generic|--npm-audit|--trivy F</code></td>
<td>Ingest vulnerability reports and match to codebase symbols</td>
</tr>
<tr>
<td><code>roam vuln-reach [--cve C] [--from E]</code></td>
<td>Vulnerability reachability: exact paths from entry points to vulnerable calls</td>
</tr>
<tr>
<td><code>roam supply-chain [--top N]</code></td>
<td>Dependency risk dashboard: pin coverage, risk scoring, supply-chain health</td>
</tr>
<tr>
<td><code>roam sbom [--format cyclonedx|spdx] [--no-reachability] [-o FILE]</code></td>
<td>SBOM generation (CycloneDX 1.5 / SPDX 2.3) enriched with call-graph reachability per dependency</td>
</tr>
<tr>
<td><code>roam congestion [--window N] [--min-authors N]</code></td>
<td>Developer congestion detection: concurrent authors per file, coordination risk scoring</td>
</tr>
<tr>
<td><code>roam invariants [--staged] [--range R]</code></td>
<td>Discover architectural contracts (invariants) from the codebase structure</td>
</tr>
</tbody></table>
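<p>The <code>roam api-drift</code> row above mentions automatic snake_case/camelCase conversion before comparing backend fields with frontend properties. A minimal sketch of that idea follows. The helper names are hypothetical and the field lists are plain inputs here, whereas Roam extracts them from PHP models and TypeScript interfaces.</p>
<pre><code class="language-python">def snake_to_camel(name):
    """Convert snake_case to camelCase, e.g. created_at becomes createdAt."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def field_drift(backend_fields, frontend_props):
    """Return (missing_in_frontend, missing_in_backend) after case conversion."""
    expected = {snake_to_camel(field) for field in backend_fields}
    actual = set(frontend_props)
    return sorted(expected - actual), sorted(actual - expected)
</code></pre>
<p>Calling <code>field_drift(["id", "created_at", "display_name"], ["id", "createdAt", "avatarUrl"])</code> reports <code>displayName</code> as missing on the frontend and <code>avatarUrl</code> as missing on the backend.</p>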
<h3>Multi-Repo Workspace</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam ws init <repo1> <repo2> [--name NAME]</code></td>
<td>Initialize a workspace from sibling repos. Auto-detects frontend/backend roles</td>
</tr>
<tr>
<td><code>roam ws status</code></td>
<td>Show workspace repos, index ages, cross-repo edge count</td>
</tr>
<tr>
<td><code>roam ws resolve</code></td>
<td>Scan for REST API endpoints and match frontend calls to backend routes</td>
</tr>
<tr>
<td><code>roam ws understand</code></td>
<td>Unified workspace overview: per-repo stats + cross-repo connections</td>
</tr>
<tr>
<td><code>roam ws health</code></td>
<td>Workspace-wide health report with cross-repo coupling assessment</td>
</tr>
<tr>
<td><code>roam ws context <symbol></code></td>
<td>Cross-repo augmented context: find a symbol across repos + show API callers</td>
</tr>
<tr>
<td><code>roam ws trace <source> <target></code></td>
<td>Trace cross-repo paths via API edges</td>
</tr>
</tbody></table>
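<p><code>roam ws resolve</code> matches frontend API calls to backend routes. The sketch below shows one simple way such matching can work. It assumes routes use <code>{param}</code> placeholders and that calls are already extracted as (method, path) pairs; both are illustrative assumptions, and the function names are hypothetical rather than part of Roam.</p>
<pre><code class="language-python">import re


def route_to_regex(route):
    """Compile a route such as /users/{id} into an anchored regex."""
    parts = re.split(r"(\{[^/{}]+\})", route)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile("^" + pattern + "/?$")


def match_calls(frontend_calls, backend_routes):
    """Map each (method, path) call to the first matching route, or None."""
    compiled = [(method, route, route_to_regex(route)) for method, route in backend_routes]
    result = {}
    for method, path in frontend_calls:
        hit = next(
            (route for m, route, rx in compiled if m == method and rx.match(path)),
            None,
        )
        result[(method, path)] = hit
    return result
</code></pre>
<p>Calls that map to <code>None</code> have no backend route, and routes that never appear as a value have no frontend consumer.</p>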
<h3>Global Options</h3>
<table>
<thead>
<tr>
<th>Option</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam --json <command></code></td>
<td>Structured JSON output with consistent envelope</td>
</tr>
<tr>
<td><code>roam --compact <command></code></td>
<td>Token-efficient output: TSV tables, minimal JSON envelope</td>
</tr>
<tr>
<td><code>roam --sarif <command></code></td>
<td>SARIF 2.1.0 output for dead, health, complexity, rules, secrets, and algo (GitHub/CI integration)</td>
</tr>
<tr>
<td><code>roam health --gate</code></td>
<td>CI quality gate. Reads <code>.roam-gates.yml</code> thresholds. Exit code 5 on failure</td>
</tr>
</tbody></table>
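<p>The <code>roam health --gate</code> row above documents exit code 5 on gate failure. A wrapper script can use that to separate a failed gate from a broken run. This is a sketch only: it assumes exit code 0 means the gate passed and treats every other code as a generic error, neither of which is stated here, and <code>classify_exit</code> and <code>run_gate</code> are hypothetical helpers.</p>
<pre><code class="language-python">import subprocess

GATE_FAILED_EXIT = 5  # documented above for `roam health --gate`


def classify_exit(code):
    """Map a process exit code to a CI-friendly status string."""
    if code == 0:
        return "pass"  # assumption: 0 means the gate passed
    if code == GATE_FAILED_EXIT:
        return "gate-failed"
    return "error"  # assumption: anything else is a tool or usage error


def run_gate(runner=subprocess.run):
    """Run the gate and classify the result; `runner` is injectable for tests."""
    completed = runner(["roam", "health", "--gate"], capture_output=True, text=True)
    return classify_exit(completed.returncode)
</code></pre>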
</details>
<h2>Walkthrough: Investigating a Codebase</h2>
<details>
<summary><strong>10-step walkthrough using Flask as an example</strong> (click to expand)</summary>
<p>Here's how you'd use Roam to understand a project you've never seen before, using Flask as an example:</p>
<p><strong>Step 1: Onboard and get the full picture</strong></p>
<pre><code>$ roam init
Created .roam/fitness.yaml (6 starter rules)
Created .github/workflows/roam.yml
Done. 226 files, 1132 symbols, 233 edges.
Health: 78/100
$ roam understand
Tech stack: Python (flask, jinja2, werkzeug)
Architecture: Monolithic — 3 layers, 5 clusters
Key abstractions: Flask, Blueprint, Request, Response
Health: 78/100 — 1 god component (Flask)
Entry points: src/flask/__init__.py, src/flask/cli.py
Conventions: snake_case functions, PascalCase classes, relative imports
Complexity: avg 4.2, 3 high (>15), 0 critical (>25)
</code></pre>
<p><strong>Step 2: Drill into a key file</strong></p>
<pre><code>$ roam file src/flask/app.py
src/flask/app.py (python, 963 lines)
cls Flask(App) :76-963
meth __init__(self, import_name, ...) :152
meth route(self, rule, **options) :411
meth register_blueprint(self, blueprint, ...) :580
meth make_response(self, rv) :742
...12 more methods
</code></pre>
<p><strong>Step 3: Who depends on this?</strong></p>
<pre><code>$ roam deps src/flask/app.py
Imported by:
file                        symbols
--------------------------  -------
src/flask/__init__.py       3
src/flask/testing.py        2
tests/test_basic.py         1
...18 files total
</code></pre>
<p><strong>Step 4: Find the hotspots</strong></p>
<pre><code>$ roam weather
=== Hotspots (churn x complexity) ===
Score  Churn  Complexity  Path                     Lang
-----  -----  ----------  ----------------------   ------
18400  460    40.0        src/flask/app.py         python
12180  348    35.0        src/flask/blueprints.py  python
</code></pre>
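<p>The header labels the hotspot score as churn x complexity, and the <code>blueprints.py</code> row matches that reading (348 x 35.0 = 12180). The sketch below ranks files by that product. It is a simplified reading of the sample output; Roam may apply weighting or normalization that this section does not describe, and the function names are hypothetical.</p>
<pre><code class="language-python">def hotspot_score(churn, complexity):
    """Score a file as churn multiplied by complexity."""
    return churn * complexity


def rank_hotspots(files):
    """Sort (path, churn, complexity) tuples by descending hotspot score."""
    scored = [(hotspot_score(churn, cx), path) for path, churn, cx in files]
    return sorted(scored, reverse=True)
</code></pre>
<p>Multiplying the two signals means a file ranks high only when it is both frequently changed and complex. A stable but complex file, or a simple but busy one, scores lower.</p>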
<p><strong>Step 5: Check architecture health</strong></p>
<pre><code>$ roam health
Health: 78/100
Tangle: 0.0% (0/1132 symbols in cycles)
1 god component (Flask, degree 47, actionable)
0 bottlenecks, 0 layer violations
=== God Components (degree > 20) ===
Sev      Name   Kind  Degree  Cat  File
-------  -----  ----  ------  ---  ------------------
WARNING  Flask  cls   47      act  src/flask/app.py
</code></pre>
<p><strong>Step 6: Get AI-ready context for a symbol</strong></p>
<pre><code>$ roam context Flask
Files to read:
src/flask/app.py:76-963 # definition
src/flask/__init__.py:1-15 # re-export
src/flask/testing.py:22-45 # caller: FlaskClient.__init__
tests/test_basic.py:12-30 # caller: test_app_factory
...12 more files
Callers: 47 Callees: 3
</code></pre>
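<p>An agent that consumes this text output needs the paths and line ranges as data. The sketch below parses the "Files to read" lines shown above into tuples. The line format is inferred from this sample only, so for real integrations the structured <code>roam --json</code> output is likely the safer choice. <code>parse_files_to_read</code> is a hypothetical helper.</p>
<pre><code class="language-python">import re

# Matches lines like: src/flask/app.py:76-963 # definition
ENTRY_RE = re.compile(r"^(\S+):(\d+)-(\d+)\s+#\s*(.+)$")


def parse_files_to_read(output):
    """Extract (path, start, end, role) tuples from `roam context` text output."""
    entries = []
    for line in output.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            path, start, end, role = match.groups()
            entries.append((path, int(start), int(end), role))
    return entries
</code></pre>
<p>Summary lines such as <code>...12 more files</code> and <code>Callers: 47 Callees: 3</code> do not match the pattern and are skipped.</p>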
<p><strong>Step 7: Pre-change safety check</strong></p>
<pre><code>$ roam preflight Flask
=== Preflight: Flask ===
Blast radius: 47 callers, 89 transitive
Affected tests: 31 (DIRECT: 12, TRANSITIVE: 19)
Complexity: cc=40 (critical), nesting=6
Coupling: 3 hidden co-change partners
Fitness: 1 violation (max-complexity exceeded)
Verdict: HIGH RISK — consider splitting before modifying
</code></pre>
<p><strong>Step 8: Decompose a large file</strong></p>
<pre><code>$ roam split src/flask/app.py
=== Split analysis: src/flask/app.py ===
87 symbols, 42 internal edges, 95 external edges
Cross-group coupling: 18%
Group 1 (routing) — 12 symbols, isolation: 83% [extractable]
meth route L411 PR=0.0088
meth add_url_rule L450 PR=0.0045
...
=== Extraction Suggestions ===
Extract 'routing' group: route, add_url_rule, endpoint (+9 more)
83% isolated, only 3 edges to other groups
</code></pre>
<p><strong>Step 9: Understand why a symbol matters</strong></p>
<pre><code>$ roam why Flask url_for Blueprint
Symbol     Role          Fan         Reach     Risk      Verdict
---------  ------------  ----------  --------  --------  --------------------------------------------------
Flask      Hub           fan-in:47   reach:89  CRITICAL  God symbol (47 in, 12 out). Consider splitting.
url_for    Core utility  fan-in:31   reach:45  HIGH      Widely used utility (31 callers). Stable interface.
Blueprint  Bridge        fan-in:18   reach:34  moderate  Coupling point between clusters.
</code></pre>
<p><strong>Step 10: Generate docs and set up CI</strong></p>
<pre><code>$ roam describe --write
Wrote CLAUDE.md (98 lines) # auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.
$ roam health --gate
Health: 78/100 — PASS
</code></pre>
<p>Ten steps, one complete picture: structure, dependencies, hotspots, health, context, safety checks, decomposition, and CI gates.</p>
</details>
<h2>Integration with AI Coding Tools</h2>
<p>Roam is designed to be called by coding agents via shell commands. Instead of repeatedly grepping and reading files, the agent runs one <code>roam</code> command and gets structured output.</p>
<p><strong>Decision order for agents:</strong></p>
<table>
<thead>
<tr>
<th>Situation</th>
<th>Command</th>
</tr>
</thead>
<tbody><tr>
<td>First time in a repo</td>
<td><code>roam understand</code> then <code>roam tour</code></td>
</tr>
<tr>
<td>Need to modify a symbol</td>
<td><code>roam preflight <name></code> (blast radius + tests + fitness)</td>
</tr>
<tr>
<td>Debugging a failure</td>
<td><code>roam diagnose <name></code> (root cause ranking)</td>
</tr>
<tr>
<td>Need files to read</td>
<td><code>roam context <name></code> (files + line ranges)</td>
</tr>
<tr>
<td>Need to find a symbol</td>
<td><code>roam search <pattern></code></td>
</tr>
<tr>
<td>Need file structure</td>
<td><code>roam file <path></code></td>
</tr>
<tr>
<td>Pre-PR check</td>
<td><code>roam pr-risk HEAD~3..HEAD</code></td>
</tr>
<tr>
<td>What breaks if I change X?</td>
<td><code>roam impact <symbol></code></td>
</tr>
<tr>
<td>Check for N+1 queries</td>
<td><code>roam n1</code> (implicit lazy-load detection)</td>
</tr>
<tr>
<td>Check auth coverage</td>
<td><code>roam auth-gaps</code> (routes + controllers)</td>
</tr>
<tr>
<td>Check migration safety</td>
<td><code>roam migration-safety</code> (idempotency guards)</td>
</tr>
</tbody></table>
<p><strong>Fastest setup:</strong></p>
<pre><code class="language-bash">roam describe --write # auto-detects your agent's config file
roam describe --write -o AGENTS.md # or specify an explicit path
roam describe --agent-prompt # compact ~500-token prompt (append to any config)
roam minimap --update # inject/refresh annotated codebase minimap in CLAUDE.md
</code></pre>
<p><strong>Agent not using Roam correctly?</strong> If your agent is ignoring Roam and falling back to grep/read exploration, it likely doesn't have the instructions. Run:</p>
<pre><code class="language-bash">roam describe --write # writes instructions to your agent's config (CLAUDE.md, AGENTS.md, etc.)
</code></pre>
<p>If you already have a config file and don't want to overwrite it:</p>
<pre><code class="language-bash">roam describe --agent-prompt # prints a compact prompt — copy-paste into your existing config
roam minimap --update # injects an annotated codebase snapshot into CLAUDE.md (won't touch other content)
</code></pre>
<p>This teaches the agent which Roam command to use for each situation (e.g., <code>roam preflight</code> before changes, <code>roam context</code> for files to read, <code>roam diagnose</code> for debugging).</p>
<details>
<summary><strong>Copy-paste agent instructions</strong></summary>
<pre><code class="language-markdown">## Codebase navigation
This project uses `roam` for codebase comprehension. Always prefer roam over Glob/Grep/Read exploration.
Before modifying any code:
1. First time in the repo: `roam understand` then `roam tour`
2. Find a symbol: `roam search <pattern>`
3. Before changing a symbol: `roam preflight <name>` (blast radius + tests + fitness)
4. Need files to read: `roam context <name>` (files + line ranges, prioritized)
5. Debugging a failure: `roam diagnose <name>` (root cause ranking)
6. After making changes: `roam diff` (blast radius of uncommitted changes)
Additional: `roam health` (0-100 score), `roam impact <name>` (what breaks),
`roam pr-risk` (PR risk), `roam file <path>` (file skeleton).
Run `roam --help` for all commands. Use `roam --json <cmd>` for structured output.
</code></pre>
</details>
<details>
<summary><strong>Where to put this for each tool</strong></summary>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Config file</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Claude Code</strong></td>
<td><code>CLAUDE.md</code> in your project root</td>
</tr>
<tr>
<td><strong>OpenAI Codex CLI</strong></td>
<td><code>AGENTS.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Gemini CLI</strong></td>
<td><code>GEMINI.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Cursor</strong></td>
<td><code>.cursor/rules/roam.mdc</code> (add <code>alwaysApply: true</code> frontmatter)</td>
</tr>
<tr>
<td><strong>Windsurf</strong></td>
<td><code>.windsurf/rules/roam.md</code> (add <code>trigger: always_on</code> frontmatter)</td>
</tr>
<tr>
<td><strong>GitHub Copilot</strong></td>
<td><code>.github/copilot-instructions.md</code></td>
</tr>
<tr>
<td><strong>Aider</strong></td>
<td><code>CONVENTIONS.md</code></td>
</tr>
<tr>
<td><strong>Continue.dev</strong></td>
<td><code>config.yaml</code> rules</td>
</tr>
<tr>
<td><strong>Cline</strong></td>
<td><code>.clinerules/</code> directory</td>
</tr>
</tbody></table>
</details>
<details>
<summary><strong>Roam vs native tools</strong></summary>
<table>
<thead>
<tr>
<th>Task</th>
<th>Use Roam</th>
<th>Use native tools</th>
</tr>
</thead>
<tbody><tr>
<td>"What calls this function?"</td>
<td><code>roam symbol <name></code></td>
<td>LSP / Grep</td>
</tr>
<tr>
<td>"What files do I need to read?"</td>
<td><code>roam context <name></code></td>
<td>Manual tracing (5+ calls)</td>
</tr>
<tr>
<td>"Is it safe to change X?"</td>
<td><code>roam preflight <name></code></td>
<td>Multiple manual checks</td>
</tr>
<tr>
<td>"Show me this file's structure"</td>
<td><code>roam file <path></code></td>
<td>Read the file directly</td>
</tr>
<tr>
<td>"Understand project architecture"</td>
<td><code>roam understand</code></td>
<td>Manual exploration</td>
</tr>
<tr>
<td>"What breaks if I change X?"</td>
<td><code>roam impact <symbol></code></td>
<td>No direct equivalent</td>
</tr>
<tr>
<td>"What tests to run?"</td>
<td><code>roam affected-tests <name></code></td>
<td>Grep for imports (misses indirect)</td>
</tr>
<tr>
<td>"What's causing this bug?"</td>
<td><code>roam diagnose <name></code></td>
<td>Manual call-chain tracing</td>
</tr>
<tr>
<td>"Codebase health score for CI"</td>
<td><code>roam health --gate</code></td>
<td>No equivalent</td>
</tr>
</tbody></table>
</details>
<h2>MCP Server</h2>
<p>Roam includes a <a href="https://modelcontextprotocol.io/">Model Context Protocol</a> server for direct integration with tools that support MCP.</p>
<pre><code class="language-bash">pip install "roam-code[mcp]"
roam mcp
</code></pre>
<p>102 tools, 10 resources, and 5 prompts are available in the full preset. Most tools are read-only index queries; side-effect tools are explicitly annotated.</p>
<p><strong>MCP v2 highlights (v11):</strong></p>
<ul>
<li>In-process MCP execution (no subprocess shell-out per call)</li>
<li>Preset-based tool surfacing (<code>core</code>, <code>review</code>, <code>refactor</code>, <code>debug</code>, <code>architecture</code>, <code>full</code>)</li>
<li>Compound tools that collapse multi-step exploration/review flows into one call</li>
<li>Structured output schemas + tool annotations for safer planner behavior</li>
</ul>
<p><strong>Default preset:</strong> <code>core</code> (24 tools: 23 core + <code>roam_expand_toolset</code> meta-tool).</p>
<pre><code class="language-bash"># Default
roam mcp
# Full toolset
ROAM_MCP_PRESET=full roam mcp
# Legacy compatibility (same as full preset)
ROAM_MCP_LITE=0 roam mcp
</code></pre>
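<p>When a supervising script launches the server, the preset can be set through the process environment. The sketch below builds the command and environment for <code>roam mcp</code>. It assumes <code>ROAM_MCP_PRESET</code> accepts every preset name listed in the highlights above (only <code>full</code> is shown explicitly), and <code>mcp_launch_spec</code> is a hypothetical helper.</p>
<pre><code class="language-python">import os

# Preset names as listed in the MCP v2 highlights above.
PRESETS = ("core", "review", "refactor", "debug", "architecture", "full")


def mcp_launch_spec(preset="core", base_env=None):
    """Return (argv, env) for starting `roam mcp` with the given preset."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["ROAM_MCP_PRESET"] = preset
    return ["roam", "mcp"], env
</code></pre>
<p>The result can be passed straight to <code>subprocess.Popen(argv, env=env)</code>.</p>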
<p>Core preset tools: <code>roam_affected_tests</code>, <code>roam_batch_get</code>, <code>roam_batch_search</code>, <code>roam_complexity_report</code>, <code>roam_context</code>, <code>roam_dead_code</code>, <code>roam_deps</code>, <code>roam_diagnose</code>, <code>roam_diagnose_issue</code>, <code>roam_diff</code>, <code>roam_expand_toolset</code>, <code>roam_explore</code>, <code>roam_file_info</code>, <code>roam_health</code>, <code>roam_impact</code>, <code>roam_pr_risk</code>, <code>roam_preflight</code>, <code>roam_prepare_change</code>, <code>roam_review_change</code>, <code>roam_search_symbol</code>, <code>roam_syntax_check</code>, <code>roam_trace</code>, <code>roam_understand</code>, <code>roam_uses</code>.</p>
<details>
<summary><strong>MCP tool list (all 102)</strong></summary>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam_understand</code></td>
<td>Full codebase briefing</td>
</tr>
<tr>
<td><code>roam_health</code></td>
<td>Health score (0-100) + issues</td>
</tr>
<tr>
<td><code>roam_preflight</code></td>
<td>Pre-change safety check</td>
</tr>
<tr>
<td><code>roam_search_symbol</code></td>
<td>Find symbols by name</td>
</tr>
<tr>
<td><code>roam_context</code></td>
<td>Files-to-read for modifying a symbol</td>
</tr>
<tr>
<td><code>roam_trace</code></td>
<td>Dependency path between two symbols</td>
</tr>
<tr>
<td><code>roam_impact</code></td>
<td>Blast radius of changing a symbol</td>
</tr>
<tr>
<td><code>roam_file_info</code></td>
<td>File skeleton with all definitions</td>
</tr>
<tr>
<td><code>roam_pr_risk</code></td>
<td>Risk score for pending changes</td>
</tr>
<tr>
<td><code>roam_breaking_changes</code></td>
<td>Detect breaking changes between refs</td>
</tr>
<tr>
<td><code>roam_affected_tests</code></td>
<td>Find tests affected by a change</td>
</tr>
<tr>
<td><code>roam_dead_code</code></td>
<td>List unreferenced exports</td>
</tr>
<tr>
<td><code>roam_complexity_report</code></td>
<td>Per-symbol cognitive complexity</td>
</tr>
<tr>
<td><code>roam_repo_map</code></td>
<td>Project skeleton with key symbols</td>
</tr>
<tr>
<td><code>roam_tour</code></td>
<td>Auto-generated onboarding guide</td>
</tr>
<tr>
<td><code>roam_diagnose</code></td>
<td>Root cause analysis for debugging</td>
</tr>
<tr>
<td><code>roam_visualize</code></td>
<td>Generate Mermaid or DOT architecture diagrams</td>
</tr>
<tr>
<td><code>roam_algo</code></td>
<td>Algorithm anti-pattern detection with language-aware tips</td>
</tr>
<tr>
<td><code>roam_ws_understand</code></td>
<td>Unified multi-repo workspace overview</td>
</tr>
<tr>
<td><code>roam_ws_context</code></td>
<td>Cross-repo augmented symbol context</td>
</tr>
<tr>
<td><code>roam_pr_diff</code></td>
<td>Structural PR diff: metric deltas, edge analysis, symbol changes</td>
</tr>
<tr>
<td><code>roam_budget_check</code></td>
<td>Check changes against architectural budgets</td>
</tr>
<tr>
<td><code>roam_effects</code></td>
<td>Side-effect classification (DB writes, network, filesystem)</td>
</tr>
<tr>
<td><code>roam_attest</code></td>
<td>Proof-carrying PR attestation with all evidence bundled</td>
</tr>
<tr>
<td><code>roam_capsule_export</code></td>
<td>Export sanitized structural graph (no code bodies)</td>
</tr>
<tr>
<td><code>roam_path_coverage</code></td>
<td>Find critical untested call paths (entry -> sink)</td>
</tr>
<tr>
<td><code>roam_forecast</code></td>
<td>Predict when metrics will exceed thresholds</td>
</tr>
<tr>
<td><code>roam_simulate</code></td>
<td>Counterfactual architecture simulator</td>
</tr>
<tr>
<td><code>roam_orchestrate</code></td>
<td>Multi-agent swarm partitioning</td>
</tr>
<tr>
<td><code>roam_fingerprint</code></td>
<td>Topology fingerprint comparison</td>
</tr>
<tr>
<td><code>roam_mutate</code></td>
<td>Graph-level code editing (move/rename/extract)</td>
</tr>
<tr>
<td><code>roam_dark_matter</code></td>
<td>Hidden co-change coupling detection</td>
</tr>
<tr>
<td><code>roam_closure</code></td>
<td>Minimal-change synthesis for rename/delete</td>
</tr>
<tr>
<td><code>roam_adversarial_review</code></td>
<td>Adversarial architecture review</td>
</tr>
<tr>
<td><code>roam_generate_plan</code></td>
<td>Agent work planner</td>
</tr>
<tr>
<td><code>roam_get_invariants</code></td>
<td>Architectural invariant discovery</td>
</tr>
<tr>
<td><code>roam_bisect_blame</code></td>
<td>Architectural git bisect</td>
</tr>
<tr>
<td><code>roam_doc_intent</code></td>
<td>Doc-to-code linking</td>
</tr>
<tr>
<td><code>roam_cut_analysis</code></td>
<td>Minimum graph cut analysis</td>
</tr>
<tr>
<td><code>roam_clones</code></td>
<td>AST structural clone detection (Type-2 clones)</td>
</tr>
<tr>
<td><code>roam_annotate_symbol</code></td>
<td>Attach persistent notes to symbols</td>
</tr>
<tr>
<td><code>roam_get_annotations</code></td>
<td>View stored annotations</td>
</tr>
<tr>
<td><code>roam_relate</code></td>
<td>Show relationship between two symbols</td>
</tr>
<tr>
<td><code>roam_search_semantic</code></td>
<td>Semantic search by meaning</td>
</tr>
<tr>
<td><code>roam_rules_check</code></td>
<td>Plugin DSL governance rules</td>
</tr>
<tr>
<td><code>roam_check_rules</code></td>
<td>Built-in + user-defined governance rule evaluation with autofix templates</td>
</tr>
<tr>
<td><code>roam_supply_chain</code></td>
<td>Dependency risk dashboard: pin coverage and supply-chain health</td>
</tr>
<tr>
<td><code>roam_spectral</code></td>
<td>Spectral bisection: Fiedler vector partition tree and modularity gap</td>
</tr>
<tr>
<td><code>roam_vuln_map</code></td>
<td>Vulnerability report ingestion</td>
</tr>
<tr>
<td><code>roam_vuln_reach</code></td>
<td>Vulnerability reachability paths</td>
</tr>
<tr>
<td><code>roam_ingest_trace</code></td>
<td>Ingest runtime trace data</td>
</tr>
<tr>
<td><code>roam_runtime_hotspots</code></td>
<td>Runtime hotspot analysis</td>
</tr>
<tr>
<td><code>roam_diff</code></td>
<td>Blast radius of uncommitted/committed changes</td>
</tr>
<tr>
<td><code>roam_symbol</code></td>
<td>Symbol definition, callers, callees, metrics</td>
</tr>
<tr>
<td><code>roam_deps</code></td>
<td>File-level import/imported-by relationships</td>
</tr>
<tr>
<td><code>roam_uses</code></td>
<td>All consumers of a symbol by edge type</td>
</tr>
<tr>
<td><code>roam_weather</code></td>
<td>Code hotspots: churn x complexity ranking</td>
</tr>
<tr>
<td><code>roam_debt</code></td>
<td>Hotspot-weighted technical debt prioritization with optional ROI estimate</td>
</tr>
<tr>
<td><code>roam_docs_coverage</code></td>
<td>Doc coverage and stale-doc drift with PageRank-ranked missing docs</td>
</tr>
<tr>
<td><code>roam_suggest_refactoring</code></td>
<td>Rank proactive refactoring candidates using complexity, coupling, churn, smells, and coverage gaps</td>
</tr>
<tr>
<td><code>roam_plan_refactor</code></td>
<td>Build an ordered refactor plan for one symbol with risk/test/simulation context</td>
</tr>
<tr>
<td><code>roam_n1</code></td>
<td>Detect N+1 I/O patterns in ORM code</td>
</tr>
<tr>
<td><code>roam_auth_gaps</code></td>
<td>Find endpoints missing auth</td>
</tr>
<tr>
<td><code>roam_over_fetch</code></td>
<td>Detect models serializing too many fields</td>
</tr>
<tr>
<td><code>roam_missing_index</code></td>
<td>Find queries on non-indexed columns</td>
</tr>
<tr>
<td><code>roam_orphan_routes</code></td>
<td>Detect dead backend routes</td>
</tr>
<tr>
<td><code>roam_migration_safety</code></td>
<td>Detect non-idempotent migrations</td>
</tr>
<tr>
<td><code>roam_api_drift</code></td>
<td>Backend/frontend model mismatch detection</td>
</tr>
<tr>
<td><code>roam_expand_toolset</code></td>
<td>Discover presets, active toolset, and switch instructions</td>
</tr>
<tr>
<td><code>roam_explore</code></td>
<td>Compound first-contact exploration bundle for fast repo orientation</td>
</tr>
<tr>
<td><code>roam_prepare_change</code></td>
<td>Compound pre-change bundle: context, blast radius, risk, and tests</td>
</tr>
<tr>
<td><code>roam_review_change</code></td>
<td>Compound review bundle for changed code and architecture checks</td>
</tr>
<tr>
<td><code>roam_diagnose_issue</code></td>
<td>Compound debugging bundle with ranked suspects and dependency context</td>
</tr>
<tr>
<td><code>roam_onboard</code></td>
<td>Structured onboarding brief for new contributors/agents</td>
</tr>
<tr>
<td><code>roam_syntax_check</code></td>
<td>Tree-sitter syntax integrity validation for changed paths</td>
</tr>
<tr>
<td><code>roam_agent_export</code></td>
<td>Generate multi-agent instruction bundles (<code>AGENTS.md</code> + overlays)</td>
</tr>
<tr>
<td><code>roam_vibe_check</code></td>
<td>AI-rot auditor with 8-pattern taxonomy and composite score</td>
</tr>
<tr>
<td><code>roam_ai_readiness</code></td>
<td>AI-agent effectiveness readiness scoring and recommendations</td>
</tr>
<tr>
<td><code>roam_dashboard</code></td>
<td>Unified status snapshot across health, risk, churn, and quality</td>
</tr>
<tr>
<td><code>roam_codeowners</code></td>
<td>CODEOWNERS coverage analysis and unowned file discovery</td>
</tr>
<tr>
<td><code>roam_drift</code></td>
<td>Ownership drift detection from declared vs observed ownership</td>
</tr>
<tr>
<td><code>roam_suggest_reviewers</code></td>
<td>Reviewer recommendations with multi-signal scoring</td>
</tr>
<tr>
<td><code>roam_simulate_departure</code></td>
<td>Knowledge-loss simulation for contributor departure scenarios</td>
</tr>
<tr>
<td><code>roam_verify</code></td>
<td>Pre-commit consistency verification and policy checks</td>
</tr>
<tr>
<td><code>roam_api_changes</code></td>
<td>API signature change classification and severity labeling</td>
</tr>
<tr>
<td><code>roam_test_gaps</code></td>
<td>Changed-symbol test gap analysis</td>
</tr>
<tr>
<td><code>roam_ai_ratio</code></td>
<td>Estimated AI-generated code ratio from repository signals</td>
</tr>
<tr>
<td><code>roam_duplicates</code></td>
<td>Semantic duplicate detection across structurally similar functions</td>
</tr>
<tr>
<td><code>roam_partition</code></td>
<td>Multi-agent partition manifest with conflict and complexity scores</td>
</tr>
<tr>
<td><code>roam_affected</code></td>
<td>Monorepo/package affected-set analysis for diffs</td>
</tr>
<tr>
<td><code>roam_semantic_diff</code></td>
<td>Structural diff of symbol/edge changes</td>
</tr>
<tr>
<td><code>roam_trends</code></td>
<td>Historical metric trend retrieval with sparkline output</td>
</tr>
<tr>
<td><code>roam_secrets</code></td>
<td>Secret scanning with masking and CI-friendly fail behavior</td>
</tr>
<tr>
<td><code>roam_endpoints</code></td>
<td>Enumerate HTTP/API endpoint definitions across the codebase</td>
</tr>
<tr>
<td><code>roam_doctor</code></td>
<td>Diagnose installation and environment health</td>
</tr>
<tr>
<td><code>roam_init</code></td>
<td>Initialize roam workspace state and build the first index</td>
</tr>
<tr>
<td><code>roam_reindex</code></td>
<td>Refresh or force-rebuild the index with task-mode support</td>
</tr>
<tr>
<td><code>roam_reset</code></td>
<td>Reset the roam index and cached data</td>
</tr>
<tr>
<td><code>roam_clean</code></td>
<td>Remove stale or orphaned index entries</td>
</tr>
<tr>
<td><code>roam_batch_search</code></td>
<td>Batch symbol search: run multiple pattern queries in a single call</td>
</tr>
<tr>
<td><code>roam_batch_get</code></td>
<td>Batch context retrieval: fetch multiple symbols/files in a single call</td>
</tr>
<tr>
<td><code>roam_dev_profile</code></td>
<td>Developer productivity profile: commit patterns, specialization, and impact</td>
</tr>
</tbody></table>
<p><strong>Resources include:</strong> <code>roam://health</code> (current health score), <code>roam://summary</code> (project overview)</p>
</details>
<details>
<summary><strong>Claude Code</strong></summary>
<pre><code class="language-bash">claude mcp add roam-code -- roam mcp
</code></pre>
<p>Or add to <code>.mcp.json</code> in your project root:</p>
<pre><code class="language-json">{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
</code></pre>
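<p>If <code>.mcp.json</code> already lists other servers, the entry can be merged in rather than overwriting the file. The sketch below does that with the standard library. The helper names are hypothetical, and the file layout is assumed to be the <code>mcpServers</code> shape shown above.</p>
<pre><code class="language-python">import json
from pathlib import Path


def add_roam_server(config):
    """Return a copy of an MCP config dict with the roam-code server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # setdefault keeps an existing roam-code entry untouched.
    servers.setdefault("roam-code", {"command": "roam", "args": ["mcp"]})
    merged["mcpServers"] = servers
    return merged


def update_mcp_json(path=".mcp.json"):
    """Read, merge, and rewrite the config file, creating it if missing."""
    target = Path(path)
    current = json.loads(target.read_text()) if target.exists() else {}
    target.write_text(json.dumps(add_roam_server(current), indent=2) + "\n")
</code></pre>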
</details>
<details>
<summary><strong>Claude Desktop</strong></summary>
<p>Add to your <code>claude_desktop_config.json</code>:</p>
<pre><code class="language-json">{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"],
      "cwd": "/path/to/your/project"
    }
  }
}
</code></pre>
</details>
<details>
<summary><strong>Cursor</strong></summary>
<p>Add to <code>.cursor/mcp.json</code>:</p>
<pre><code class="language-json">{
"mcpServers": {
"roam-code": {
"command": "roam",
"args": ["mcp"]
}
}
}
</code></pre>
</details>
<details>
<summary><strong>VS Code + Copilot</strong></summary>
<p>Add to <code>.vscode/mcp.json</code>:</p>
<pre><code class="language-json">{
"servers": {
"roam-code": {
"type": "stdio",
"command": "roam",
"args": ["mcp"]
}
}
}
</code></pre>
</details>
<h2>CI/CD Integration</h2>
<p>All you need is Python 3.9+ and <code>pip install roam-code</code>.</p>
<h3>GitHub Actions</h3>
<pre><code class="language-yaml"># .github/workflows/roam.yml
name: Roam Analysis
on: [pull_request]
jobs:
roam:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: Cranot/roam-code@main
with:
commands: health
gate: "score>=70"
sarif: true
comment: true
</code></pre>
<p>Use <code>roam init</code> to auto-generate this workflow.</p>
<table>
<thead>
<tr>
<th>Input</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>commands</code></td>
<td><code>health</code></td>
<td>Space-separated roam commands to run</td>
</tr>
<tr>
<td><code>gate</code></td>
<td>(empty)</td>
<td>Quality gate expression (e.g., <code>score>=70</code>). The step exits with code 5 when the gate fails</td>
</tr>
<tr>
<td><code>sarif</code></td>
<td><code>false</code></td>
<td>Upload SARIF results to GitHub Code Scanning</td>
</tr>
<tr>
<td><code>comment</code></td>
<td><code>true</code></td>
<td>Post sticky PR comment with results</td>
</tr>
<tr>
<td><code>python-version</code></td>
<td><code>3.11</code></td>
<td>Python version</td>
</tr>
<tr>
<td><code>version</code></td>
<td><code>latest</code></td>
<td>Pin to a specific roam-code version</td>
</tr>
<tr>
<td><code>cache</code></td>
<td><code>true</code></td>
<td>Cache the SQLite index between runs</td>
</tr>
<tr>
<td><code>changed-only</code></td>
<td><code>false</code></td>
<td>Incremental mode: adapt commands to changed files</td>
</tr>
</tbody></table>
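<p>To make the <code>gate</code> input concrete, here is a minimal sketch of how an expression such as <code>score>=70</code> could be evaluated and mapped to exit code 5. It is not roam's implementation: the grammar (one metric name, one comparison operator, one number) and the helper names <code>evaluate_gate</code> and <code>gate_exit_code</code> are assumptions made for illustration.</p>

```python
import operator
import re

# Comparison operators accepted by this sketch (an assumed set).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character operators are listed first so ">=" is not read as ">".
_GATE_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")

# Exit code on gate failure, as documented in the inputs table.
GATE_FAILURE_EXIT_CODE = 5


def evaluate_gate(expression, metrics):
    """Return True when the metric named in `expression` satisfies it."""
    match = _GATE_RE.match(expression)
    if match is None:
        raise ValueError(f"unsupported gate expression: {expression!r}")
    name, op, threshold = match.groups()
    if name not in metrics:
        raise KeyError(f"unknown metric: {name}")
    return _OPS[op](metrics[name], float(threshold))


def gate_exit_code(expression, metrics):
    """Map a gate result to a process exit code (0 = pass, 5 = fail)."""
    return 0 if evaluate_gate(expression, metrics) else GATE_FAILURE_EXIT_CODE
```

<p>A CI wrapper would pass the returned value to <code>sys.exit()</code>, so a failing gate stops the job while a passing one lets it continue.</p>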
<details>
<summary><strong>GitLab CI</strong></summary>
<pre><code class="language-yaml">roam-analysis:
stage: test
image: python:3.12-slim
before_script:
- pip install roam-code
script:
- roam index
- roam health --gate
- roam --json pr-risk origin/main..HEAD > roam-report.json
artifacts:
paths:
- roam-report.json
rules:
- if: $CI_MERGE_REQUEST_IID
</code></pre>
</details>
<details>
<summary><strong>Azure DevOps / any CI</strong></summary>
<p>Universal pattern:</p>
<pre><code class="language-bash">pip install roam-code
roam index
roam health --gate # exit 5 on failure (reads .roam-gates.yml)
roam --json health > report.json
</code></pre>
</details>
<h2>SARIF Output</h2>
<p>Roam exports analysis results in <a href="https://sarifweb.azurewebsites.net/">SARIF 2.1.0</a> format for GitHub Code Scanning.</p>
<pre><code class="language-python">from roam.output.sarif import health_to_sarif, write_sarif
# health_data holds the health analysis results to convert (produced elsewhere; not shown here)
sarif = health_to_sarif(health_data)
write_sarif(sarif, "roam-health.sarif")
</code></pre>
<pre><code class="language-yaml">- uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: roam-health.sarif
</code></pre>
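<p>For readers unfamiliar with the format, the sketch below builds a minimal SARIF 2.1.0 document with the standard library only. It illustrates the general shape of a SARIF log (one run, a tool driver, a list of results with locations) and is not the output of <code>health_to_sarif</code>. The helper name <code>build_sarif</code> and the finding keys (<code>rule_id</code>, <code>level</code>, <code>message</code>, <code>path</code>, <code>line</code>) are hypothetical.</p>

```python
import json


def build_sarif(tool_name, findings):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding is assumed (for this sketch) to have the keys
    rule_id, level, message, path and line.
    """
    rule_ids = sorted({finding["rule_id"] for finding in findings})
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": tool_name,
                        "rules": [{"id": rule_id} for rule_id in rule_ids],
                    }
                },
                "results": [
                    {
                        "ruleId": finding["rule_id"],
                        # SARIF levels: "none", "note", "warning", "error"
                        "level": finding["level"],
                        "message": {"text": finding["message"]},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {"uri": finding["path"]},
                                    "region": {"startLine": finding["line"]},
                                }
                            }
                        ],
                    }
                    for finding in findings
                ],
            }
        ],
    }


def write_sarif_file(document, path):
    """Serialize the SARIF log as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, indent=2)
```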
<h2>For Teams</h2>
<p>Zero infrastructure, zero vendor lock-in, zero data leaving your network.</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Annual cost (20-dev team)</th>
<th>Infrastructure</th>
<th>Setup time</th>
</tr>
</thead>
<tbody><tr>
<td>SonarQube Server</td>
<td>$15,000-$45,000</td>
<td>Self-hosted server</td>
<td>Days</td>
</tr>
<tr>
<td>CodeScene</td>
<td>$20,000-$60,000</td>
<td>SaaS or on-prem</td>
<td>Hours</td>
</tr>
<tr>
<td>Code Climate</td>
<td>$12,000-$36,000</td>
<td>SaaS</td>
<td>Hours</td>
</tr>
<tr>
<td><strong>Roam</strong></td>
<td><strong>$0 (MIT license)</strong></td>
<td><strong>None (local)</strong></td>
<td><strong>5 minutes</strong></td>
</tr>
</tbody></table>
<details>
<summary><strong>Team rollout guide</strong></summary>
<p><strong>Week 1-2 (pilot):</strong> 1-2 developers run <code>roam init</code> on one repo. Use <code>roam preflight</code> before changes, <code>roam pr-risk</code> before PRs.</p>
<p><strong>Week 3-4 (expand):</strong> Add <code>roam health --gate</code> to CI as a non-blocking check (configure thresholds in <code>.roam-gates.yml</code>).</p>
<p><strong>Month 2+ (standardize):</strong> Tighten gate thresholds. Expand to additional repos. Track trajectory with <code>roam trends</code>.</p>
</details>
<details>
<summary><strong>Complements your existing stack</strong></summary>
<table>
<thead>
<tr>
<th>If you use...</th>
<th>Roam adds...</th>
</tr>
</thead>
<tbody><tr>
<td><strong>SonarQube</strong></td>
<td>Architecture-level analysis: dependency cycles, god components, blast radius, health scoring</td>
</tr>
<tr>
<td><strong>CodeScene</strong></td>
<td>Free, local alternative for health scoring and hotspot analysis</td>
</tr>
<tr>
<td><strong>ESLint / Pylint</strong></td>
<td>Cross-language architecture checks. Linters enforce style per file; Roam enforces architecture across the codebase</td>
</tr>
<tr>
<td><strong>LSP</strong></td>
<td>AI-agent-optimized queries. <code>roam context</code> answers "what calls this?" with PageRank-ranked results in one call</td>
</tr>
</tbody></table>
</details>
<h2>Language Support</h2>
<h3>Tier 1 -- Full extraction (dedicated parsers)</h3>
<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Python</td>
<td><code>.py</code> <code>.pyi</code></td>
<td>classes, functions, methods, decorators, variables</td>
<td>imports, calls, inheritance</td>
<td>extends, <code>__all__</code> exports</td>
</tr>
<tr>
<td>JavaScript</td>
<td><code>.js</code> <code>.jsx</code> <code>.mjs</code> <code>.cjs</code></td>
<td>classes, functions, arrow functions, CJS exports</td>
<td>imports, require(), calls</td>
<td>extends</td>
</tr>
<tr>
<td>TypeScript</td>
<td><code>.ts</code> <code>.tsx</code> <code>.mts</code> <code>.cts</code></td>
<td>interfaces, type aliases, enums + all JS</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Java</td>
<td><code>.java</code></td>
<td>classes, interfaces, enums, constructors, fields</td>
<td>imports, calls</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Go</td>
<td><code>.go</code></td>
<td>structs, interfaces, functions, methods, fields</td>
<td>imports, calls</td>
<td>embedded structs</td>
</tr>
<tr>
<td>Rust</td>
<td><code>.rs</code></td>
<td>structs, traits, impls, enums, functions</td>
<td>use, calls</td>
<td>impl Trait for Struct</td>
</tr>
<tr>
<td>C / C++</td>
<td><code>.c</code> <code>.h</code> <code>.cpp</code> <code>.hpp</code> <code>.cc</code></td>
<td>structs, classes, functions, namespaces, templates</td>
<td>includes, calls</td>
<td>extends</td>
</tr>
<tr>
<td>C#</td>
<td><code>.cs</code></td>
<td>classes, interfaces, structs, enums, records, methods, constructors, properties, delegates, events, fields</td>
<td>using directives, calls, <code>new</code>, attributes</td>
<td>extends, implements</td>
</tr>
<tr>
<td>PHP</td>
<td><code>.php</code></td>
<td>classes, interfaces, traits, enums, methods, properties</td>
<td>namespace use, calls, static calls, <code>new</code></td>
<td>extends, implements, use (traits)</td>
</tr>
<tr>
<td>Visual FoxPro</td>
<td><code>.prg</code></td>
<td>functions, procedures, classes, methods, properties, constants</td>
<td>DO, SET PROCEDURE/CLASSLIB, CREATEOBJECT, <code>=func()</code>, <code>obj.method()</code></td>
<td>DEFINE CLASS ... AS</td>
</tr>
<tr>
<td>YAML (CI/CD)</td>
<td><code>.yml</code> <code>.yaml</code></td>
<td>GitLab CI: jobs, template anchors, stages. GitHub Actions: workflow name, jobs, reusable workflows. Generic: top-level keys</td>
<td><code>extends:</code>, <code>needs:</code>, <code>!reference</code>, <code>uses:</code></td>
<td>—</td>
</tr>
<tr>
<td>HCL / Terraform</td>
<td><code>.tf</code> <code>.tfvars</code> <code>.hcl</code></td>
<td><code>resource</code>, <code>data</code>, <code>variable</code>, <code>output</code>, <code>module</code>, <code>provider</code>, <code>locals</code> entries</td>
<td><code>var.*</code>, <code>module.*</code>, <code>data.*</code>, <code>local.*</code>, resource cross-refs</td>
<td>—</td>
</tr>
<tr>
<td>Vue</td>
<td><code>.vue</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Svelte</td>
<td><code>.svelte</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
</tbody></table>
<details>
<summary><strong>Salesforce ecosystem (Tier 1)</strong></summary>
<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
</tr>
</thead>
<tbody><tr>
<td>Apex</td>
<td><code>.cls</code> <code>.trigger</code></td>
<td>classes, triggers, SOQL, annotations</td>
<td>imports, calls, System.Label, generic type refs</td>
</tr>
<tr>
<td>Aura</td>
<td><code>.cmp</code> <code>.app</code> <code>.evt</code> <code>.intf</code> <code>.design</code></td>
<td>components, attributes, methods, events</td>
<td>controller refs, component refs</td>
</tr>
<tr>
<td>LWC (JavaScript)</td>
<td><code>.js</code> (in LWC dirs)</td>
<td>anonymous class from filename</td>
<td><code>@salesforce/apex/</code>, <code>@salesforce/schema/</code>, <code>@salesforce/label/</code></td>
</tr>
<tr>
<td>Visualforce</td>
<td><code>.page</code> <code>.component</code></td>
<td>pages, components</td>
<td>controller/extensions, merge fields, includes</td>
</tr>
<tr>
<td>SF Metadata XML</td>
<td><code>*-meta.xml</code></td>
<td>objects, fields, rules, layouts</td>
<td>Apex class refs, formula field refs, Flow actionCalls</td>
</tr>
</tbody></table>
<p>Cross-language edges mean <code>roam impact AccountService</code> shows blast radius across Apex, LWC, Aura, Visualforce, and Flows.</p>
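<p>The idea behind a cross-language blast radius can be sketched as a breadth-first walk over reversed dependency edges. This is a simplified stand-in: elsewhere this README states that <code>impact</code> uses Personalized PageRank, which weights results rather than just counting hops. The function name <code>blast_radius</code> and every symbol name in <code>EXAMPLE_EDGES</code> except <code>AccountService</code> are hypothetical.</p>

```python
from collections import deque

# Hypothetical (dependent, dependency) edges spanning several languages.
EXAMPLE_EDGES = [
    ("accountPanel (LWC)", "AccountService"),
    ("AccountController (Apex)", "AccountService"),
    ("AccountPage (Visualforce)", "AccountController (Apex)"),
    ("AccountService", "Logger"),
]


def blast_radius(edges, target, max_depth=None):
    """Return {symbol: hop distance} for everything that depends on `target`.

    `edges` is an iterable of (dependent, dependency) pairs. The walk follows
    edges backwards, so dependencies of `target` itself are not included.
    """
    dependents = {}
    for source, dest in edges:
        dependents.setdefault(dest, set()).add(source)

    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if max_depth is not None and distance[node] >= max_depth:
            continue  # do not expand beyond the requested depth
        for dependent in dependents.get(node, ()):
            if dependent not in distance:
                distance[dependent] = distance[node] + 1
                queue.append(dependent)

    del distance[target]  # the target is not part of its own blast radius
    return distance
```

<p>With the example edges, the walk from <code>AccountService</code> reaches the LWC component and the Apex controller at distance 1 and the Visualforce page at distance 2, while <code>Logger</code> (a dependency, not a dependent) is left out.</p>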
</details>
<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Ruby</td>
<td><code>.rb</code></td>
<td>classes, modules, methods, singleton methods, constants</td>
<td>require, require_relative, include/extend, calls, ClassName.new</td>
<td>class inheritance</td>
</tr>
<tr>
<td>Kotlin</td>
<td><code>.kt</code> <code>.kts</code></td>
<td>classes, interfaces, enums, objects, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Scala</td>
<td><code>.scala</code> <code>.sc</code></td>
<td>classes, traits, objects, case classes, functions, methods, val/var, type aliases</td>
<td>imports, calls, <code>new</code></td>
<td>extends, with (trait mixins)</td>
</tr>
<tr>
<td>SQL (DDL)</td>
<td><code>.sql</code></td>
<td>tables, columns, views, functions, triggers, schemas, types (enums), sequences</td>
<td>foreign keys, view table deps, trigger table/function refs</td>
<td>--</td>
</tr>
<tr>
<td>Swift</td>
<td><code>.swift</code></td>
<td>classes, structs, enums, protocols, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, conforms</td>
</tr>
<tr>
<td>JSONC</td>
<td><code>.jsonc</code></td>
<td>via JSON grammar</td>
<td>--</td>
<td>--</td>
</tr>
<tr>
<td>MDX</td>
<td><code>.mdx</code></td>
<td>via Markdown grammar</td>
<td>--</td>
<td>--</td>
</tr>
</tbody></table>
<h2>Performance</h2>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Index 200 files</td>
<td>~3-5s</td>
</tr>
<tr>
<td>Index 3,000 files</td>
<td>~2 min</td>
</tr>
<tr>
<td>Incremental (no changes)</td>
<td><1s</td>
</tr>
<tr>
<td>Any query command</td>
<td><0.5s</td>
</tr>
</tbody></table>
<details>
<summary><strong>Detailed benchmarks</strong></summary>
<h3>Indexing Speed</h3>
<table>
<thead>
<tr>
<th>Project</th>
<th>Language</th>
<th>Files</th>
<th>Symbols</th>
<th>Edges</th>
<th>Index Time</th>
<th>Rate</th>
</tr>
</thead>
<tbody><tr>
<td>Express</td>
<td>JS</td>
<td>211</td>
<td>624</td>
<td>804</td>
<td>3s</td>
<td>70 files/s</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td>237</td>
<td>1,065</td>
<td>868</td>
<td>6s</td>
<td>41 files/s</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td>697</td>
<td>5,335</td>
<td>8,984</td>
<td>25s</td>
<td>28 files/s</td>
</tr>
<tr>
<td>Laravel</td>
<td>PHP</td>
<td>3,058</td>
<td>39,097</td>
<td>38,045</td>
<td>1m46s</td>
<td>29 files/s</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td>8,445</td>
<td>16,445</td>
<td>19,618</td>
<td>2m40s</td>
<td>52 files/s</td>
</tr>
</tbody></table>
<h3>Quality Benchmark</h3>
<table>
<thead>
<tr>
<th>Repo</th>
<th>Language</th>
<th>Score</th>
<th>Coverage</th>
<th>Edge Density</th>
</tr>
</thead>
<tbody><tr>
<td>Laravel</td>
<td>PHP</td>
<td><strong>9.55</strong></td>
<td>91.2%</td>
<td>0.97</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td><strong>9.27</strong></td>
<td>85.8%</td>
<td>1.68</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td><strong>9.04</strong></td>
<td>94.7%</td>
<td>1.19</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td><strong>8.98</strong></td>
<td>85.9%</td>
<td>0.82</td>
</tr>
<tr>
<td>Express</td>
<td>JS</td>
<td><strong>8.46</strong></td>
<td>96.0%</td>
<td>1.29</td>
</tr>
</tbody></table>
<h3>Token Efficiency</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>1,600-line file → <code>roam file</code></td>
<td>~5,000 chars (~70:1 compression)</td>
</tr>
<tr>
<td>Full project map</td>
<td>~4,000 chars</td>
</tr>
<tr>
<td><code>--compact</code> mode</td>
<td>40-50% additional token reduction</td>
</tr>
<tr>
<td><code>roam preflight</code> replaces</td>
<td>5-7 separate agent tool calls</td>
</tr>
</tbody></table>
</details>
<p>Agent-efficiency benchmarks: see the <a href="benchmarks/"><code>benchmarks/</code></a> directory for harness, repos, and results.</p>
<h2>How It Works</h2>
<pre><code>Codebase
|
[1] Discovery ──── git ls-files (respects .gitignore + .roamignore)
|
[2] Parse ──────── tree-sitter AST per file (27 languages)
|
[3] Extract ────── symbols + references (calls, imports, inheritance)
|
[4] Resolve ────── match references to definitions → edges
|
[5] Metrics ────── adaptive PageRank, betweenness, cognitive complexity, Halstead
|
[6] Algorithms ── 23-pattern anti-pattern catalog (O(n^2) loops, N+1, recursion)
|
[7] Git ────────── churn, co-change matrix, authorship, Renyi entropy
|
[8] Clusters ───── Louvain community detection
|
[9] Health ─────── per-file scores (7-factor) + composite score (0-100)
|
[10] Store ─────── .roam/index.db (SQLite, WAL mode)
</code></pre>
<p>After the first full index, <code>roam index</code> only re-processes changed files (mtime + SHA-256 hash). Incremental updates are near-instant.</p>
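<p>The mtime + SHA-256 check can be sketched as a two-stage test: a cheap timestamp comparison first, and a content hash only when the timestamp differs. This is a sketch under assumptions, not the code in <code>incremental.py</code>; the function names and the shape of the <code>previous</code> state (path mapped to an mtime/hash pair) are hypothetical.</p>

```python
import hashlib
import os


def sha256_of(path):
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, previous):
    """Return (changed, current) for this run.

    `previous` maps path -> (mtime_ns, sha256) as recorded by the last run.
    `changed` lists new files and files whose content hash differs.
    `current` is the state to store for the next run.
    """
    changed = []
    current = {}
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        old = previous.get(path)
        if old is not None and old[0] == mtime:
            current[path] = old  # timestamp unchanged: skip hashing
            continue
        file_hash = sha256_of(path)
        current[path] = (mtime, file_hash)
        if old is None or old[1] != file_hash:
            changed.append(path)  # new file, or content really changed
    return changed, current
```

<p>Hashing only after an mtime mismatch keeps the no-change case cheap, and comparing hashes avoids re-parsing files that were merely touched (for example by a checkout that rewrote identical content).</p>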
<h3>.roamignore</h3>
<p>Create a <code>.roamignore</code> file in your project root to exclude files from indexing. It uses <strong>full gitignore syntax</strong>:</p>
<table>
<thead>
<tr>
<th>Pattern</th>
<th>Meaning</th>
</tr>
</thead>
<tbody><tr>
<td><code>*.log</code></td>
<td>Exclude all <code>.log</code> files (basename match)</td>
</tr>
<tr>
<td><code>vendor/</code></td>
<td>Exclude the <code>vendor</code> directory and everything under it</td>
</tr>
<tr>
<td><code>/build/</code></td>
<td>Exclude <code>build/</code> at repo root only (anchored)</td>
</tr>
<tr>
<td><code>src/**/*.pb.go</code></td>
<td>Exclude <code>.pb.go</code> files at any depth under <code>src/</code></td>
</tr>
<tr>
<td><code>**/test_*.py</code></td>
<td>Exclude <code>test_*.py</code> files anywhere</td>
</tr>
<tr>
<td><code>?</code></td>
<td>Match any single character (not <code>/</code>)</td>
</tr>
<tr>
<td><code>[abc]</code> / <code>[!abc]</code></td>
<td>Character class / negated character class</td>
</tr>
<tr>
<td><code>!important.log</code></td>
<td>Un-exclude (re-include) <code>important.log</code></td>
</tr>
<tr>
<td><code># comment</code></td>
<td>Lines starting with <code>#</code> are comments</td>
</tr>
</tbody></table>
<p>Key rules: <code>*</code> matches within a single path segment (not across <code>/</code>). <code>**</code> matches across <code>/</code> boundaries. Last matching pattern wins (for negation). Patterns with a leading or interior <code>/</code> are anchored to the repo root; a trailing <code>/</code> only restricts the match to directories.</p>
<pre><code># .roamignore example
*_pb2.py
*_pb2_grpc.py
vendor/
node_modules/
*.generated.*
/build/
!build/keep/
</code></pre>
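<p>The matching rules above can be sketched as a small translator from gitignore-style patterns to regular expressions, applied with last-match-wins. This is a simplified illustration, not roam's matcher: character classes and backslash escapes are left out, and the function names are hypothetical.</p>

```python
import re


def _translate(pattern):
    """Translate one gitignore-style glob into a regex body."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # stays within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # one character, never "/"
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def compile_rule(line):
    """Return (regex, negated) for a pattern line, or None for blanks/comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    negated = line.startswith("!")
    if negated:
        line = line[1:]
    dir_only = line.endswith("/")
    line = line.rstrip("/")
    anchored = "/" in line  # a slash other than the trailing one
    line = line.lstrip("/")
    prefix = "^" if anchored else "^(?:.*/)?"
    suffix = "/.*$" if dir_only else "(?:/.*)?$"
    return re.compile(prefix + _translate(line) + suffix), negated


def is_ignored(path, lines):
    """Apply every rule in order; the last matching pattern wins."""
    ignored = False
    for line in lines:
        rule = compile_rule(line)
        if rule is not None and rule[0].match(path):
            ignored = not rule[1]
    return ignored
```

<p>Paths are expected relative to the repo root with <code>/</code> separators, for example <code>is_ignored("src/a/b/x.pb.go", ["src/**/*.pb.go"])</code>.</p>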
<p>You can also exclude patterns via <code>roam config --exclude "*.proto"</code> (stored in <code>.roam/config.json</code>) or inspect active patterns with <code>roam config --show</code>.</p>
<details>
<summary><strong>Graph algorithms</strong></summary>
<ul>
<li><strong>Adaptive PageRank</strong> -- damping factor auto-tunes based on cycle density (0.82-0.92); identifies the most important symbols (used by <code>map</code>, <code>search</code>, <code>context</code>)</li>
<li><strong>Personalized PageRank</strong> -- distance-weighted blast radius for <code>impact</code> (Gleich, 2015)</li>
<li><strong>Adaptive betweenness centrality</strong> -- exact for small graphs, sqrt-scaled sampling for large (Brandes & Pich, 2007); finds bottleneck symbols</li>
<li><strong>Edge betweenness centrality</strong> -- identifies critical cycle-breaking edges in SCCs (Brandes, 2001)</li>
<li><strong>Tarjan's SCC</strong> -- detects dependency cycles with tangle ratio</li>
<li><strong>Propagation Cost</strong> -- fraction of system affected by any change, via transitive closure (MacCormack, Rusnak & Baldwin, 2006)</li>
<li><strong>Algebraic connectivity (Fiedler value)</strong> -- second-smallest Laplacian eigenvalue; measures architectural robustness (Fiedler, 1973)</li>
<li><strong>Louvain community detection</strong> -- groups related symbols into clusters</li>
<li><strong>Modularity Q-score</strong> -- measures if cluster boundaries match natural community structure (Newman, 2004)</li>
<li><strong>Conductance</strong> -- per-cluster boundary tightness: cut(S, S_bar) / min(vol(S), vol(S_bar)) (Yang & Leskovec)</li>
<li><strong>Topological sort</strong> -- computes dependency layers, Gini coefficient for layer balance (Gini, 1912), weighted violation severity</li>
<li><strong>k-shortest simple paths</strong> -- traces dependency paths with coupling strength</li>
<li><strong>Renyi entropy (order 2)</strong> -- measures co-change distribution; more robust to outliers than Shannon (Renyi, 1961)</li>
<li><strong>Mann-Kendall trend test</strong> -- non-parametric degradation detection, robust to noise (Mann, 1945; Kendall, 1975)</li>
<li><strong>Sen's slope estimator</strong> -- robust trend magnitude, resistant to outliers (Sen, 1968)</li>
<li><strong>NPMI</strong> -- Normalized Pointwise Mutual Information for coupling strength (Bouma, 2009)</li>
<li><strong>Lift</strong> -- association rule mining metric for co-change statistical significance (Agrawal & Srikant, 1994)</li>
<li><strong>Halstead metrics</strong> -- volume, difficulty, effort, and predicted bugs from operator/operand counts (Halstead, 1977)</li>
<li><strong>SQALE remediation cost</strong> -- time-to-fix estimates per issue type for tech debt prioritization (Letouzey, 2012)</li>
<li><strong>Algorithm anti-pattern catalog</strong> -- 23 patterns detecting suboptimal algorithms (quadratic loops, N+1 queries, quadratic string building, branching recursion, manual top-k, loop-invariant calls) with confidence calibration via caller-count and bounded-loop analysis</li>
</ul>
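<p>Two of the metrics above, the tangle ratio from Tarjan's SCC and propagation cost, can be sketched in plain Python on a small adjacency map. This is an illustrative sketch, not the code in <code>cycles.py</code>: the function names are hypothetical, the recursive form suits small graphs only, and propagation cost is taken here as the density of the reachability matrix with each node counted as reaching itself.</p>

```python
def _all_nodes(graph):
    """Collect every node that appears as a source or a successor."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    return nodes


def strongly_connected_components(graph):
    """Tarjan's algorithm; `graph` maps node -> iterable of successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:  # node is the root of a component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in sorted(_all_nodes(graph)):
        if node not in index:
            visit(node)
    return components


def tangle_ratio(graph):
    """Fraction of nodes that sit in a cycle (a component larger than one)."""
    components = strongly_connected_components(graph)
    total = sum(len(c) for c in components)
    tangled = sum(len(c) for c in components if len(c) > 1)
    return tangled / total if total else 0.0


def propagation_cost(graph):
    """Fraction of (source, target) pairs where source can reach target."""
    nodes = _all_nodes(graph)
    if not nodes:
        return 0.0
    reachable_pairs = 0
    for start in nodes:
        seen = {start}
        frontier = [start]
        while frontier:
            node = frontier.pop()
            for succ in graph.get(node, ()):
                if succ not in seen:
                    seen.add(succ)
                    frontier.append(succ)
        reachable_pairs += len(seen)  # includes the start node itself
    return reachable_pairs / (len(nodes) ** 2)
```

<p>For a graph with the cycle a → b → c → a plus c → d and e → d, three of the five nodes are in a cycle, and 15 of the 25 node pairs are reachable.</p>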
</details>
<details>
<summary><strong>Health scoring</strong></summary>
<p>The composite health score (0-100) is a <strong>weighted geometric mean</strong> of sigmoid health factors. The mean is non-compensatory: a zero in any dimension cannot be masked by high scores in the others.</p>
<table>
<thead>
<tr>
<th>Factor</th>
<th>Weight</th>
<th>What it measures</th>
</tr>
</thead>
<tbody><tr>
<td>Tangle ratio</td>
<td>30%</td>
<td>% of symbols in dependency cycles</td>
</tr>
<tr>
<td>God components</td>
<td>20%</td>
<td>Symbols with extreme fan-in/fan-out</td>
</tr>
<tr>
<td>Bottlenecks</td>
<td>15%</td>
<td>High-betweenness chokepoints</td>
</tr>
<tr>
<td>Layer violations</td>
<td>15%</td>
<td>Upward dependency violations (severity-weighted by layer distance)</td>
</tr>
<tr>
<td>Per-file health</td>
<td>20%</td>
<td>Average of 7-factor file health scores</td>
</tr>
</tbody></table>
<p>Each factor uses sigmoid health: <code>h = e^(-signal/scale)</code> (1 = pristine, approaches 0 = worst). Score = <code>100 * product(h_i ^ w_i)</code>. Also reports <strong>propagation cost</strong> (MacCormack 2006) and <strong>algebraic connectivity</strong> (Fiedler 1973). Per-file health (1-10) combines: cognitive complexity (triangular nesting penalty per Sweller's Cognitive Load Theory), indentation complexity, cycle membership, god component membership, dead export ratio, co-change entropy, and churn amplification.</p>
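<p>The scoring formula can be sketched directly from the table and the two expressions above. The weights are the ones listed in the table; the signal values, the scale values, and the function names are hypothetical, since this README does not specify how each raw signal is measured or scaled.</p>

```python
import math

# Factor weights from the table above; they sum to 1.0.
WEIGHTS = {
    "tangle_ratio": 0.30,
    "god_components": 0.20,
    "bottlenecks": 0.15,
    "layer_violations": 0.15,
    "per_file_health": 0.20,
}


def factor_health(signal, scale):
    """h = e^(-signal/scale): 1.0 when the signal is 0, approaching 0 as it grows."""
    return math.exp(-signal / scale)


def composite_score(signals, scales):
    """Score = 100 * product(h_i ^ w_i), a weighted geometric mean."""
    product = 1.0
    for name, weight in WEIGHTS.items():
        product *= factor_health(signals[name], scales[name]) ** weight
    return 100.0 * product
```

<p>Because the factors are multiplied rather than added, one factor with health near zero pulls the whole score toward zero regardless of the others, which is the non-compensatory behavior described above.</p>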
</details>
<h2>How Roam Compares</h2>
<p>roam-code is the only tool that combines graph algorithms (PageRank, Tarjan SCC, Louvain clustering), git archaeology, architecture simulation, and multi-agent partitioning in a single local CLI with zero API keys.</p>
<p>Documentation (local HTML in <code>docs/site/</code>, CI-deployed via <code>.github/workflows/pages.yml</code>):</p>
<ul>
<li><code>docs/site/getting-started.html</code> — tutorial</li>
<li><code>docs/site/command-reference.html</code> — examples</li>
<li><code>docs/site/architecture.html</code> — diagram + internals</li>
<li><code>docs/site/landscape.html</code> — competitor matrix</li>
</ul>
<table>
<thead>
<tr>
<th>Capability</th>
<th>roam-code</th>
<th>AI IDEs (Cursor, Windsurf)</th>
<th>AI Agents (Claude Code, Codex)</th>
<th>SAST (SonarQube, CodeQL)</th>
</tr>
</thead>
<tbody><tr>
<td>Persistent local index</td>
<td>SQLite</td>
<td>Cloud embeddings</td>
<td>None</td>
<td>Per-scan</td>
</tr>
<tr>
<td>Call graph analysis</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Yes (CodeQL)</td>
</tr>
<tr>
<td>PageRank / centrality</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Cycle detection (Tarjan)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Deprecated (SonarQube)</td>
</tr>
<tr>
<td>Community detection (Louvain)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Git churn / co-change</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Architecture simulation</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Multi-agent partitioning</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>MCP tools for agents</td>
<td>102 (24 in default core preset)</td>
<td>Client only</td>
<td>Client only</td>
<td>34 (SonarQube)</td>
</tr>
<tr>
<td>Languages</td>
<td>27</td>
<td>70+</td>
<td>50+</td>
<td>12-42</td>
</tr>
<tr>
<td>100% local, zero API keys</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Partial</td>
</tr>
<tr>
<td>Open source</td>
<td>MIT</td>
<td>No</td>
<td>Partial</td>
<td>Partial</td>
</tr>
</tbody></table>
<h3>Key Differentiators</h3>
<ul>
<li><strong>vs AI IDEs</strong> (Cursor, Windsurf, Augment): roam-code provides deterministic structural analysis. AI IDEs use probabilistic embeddings that can't guarantee reproducible results.</li>
<li><strong>vs AI Agents</strong> (Claude Code, Codex CLI, Gemini CLI): These agents read files one at a time. roam-code pre-computes relationships so agents get instant answers about architecture, blast radius, and dependencies.</li>
<li><strong>vs SAST Tools</strong> (SonarQube, CodeQL, Semgrep): SAST tools find bugs and vulnerabilities. roam-code understands architecture -- how code is structured, where it's coupled, and what breaks when you change it. Complementary, not competitive.</li>
<li><strong>vs Code Search</strong> (Sourcegraph/Amp, Greptile): Text search finds where code is. roam-code understands why code matters -- which functions are central, which modules are tangled, which files are high-risk.</li>
</ul>
<h2>FAQ</h2>
<p><strong>Does Roam send any data externally?</strong>
No. Zero network calls. No telemetry, no analytics, no update checks.</p>
<p><strong>Can Roam run in air-gapped environments?</strong>
Yes. Once installed, no internet access is required.</p>
<p><strong>Does Roam modify my source code?</strong>
Read-only by default. Creates <code>.roam/</code> with an index database. The <code>roam mutate</code> command can apply code changes (move/rename/extract) but defaults to <code>--dry-run</code> mode — you must explicitly pass <code>--apply</code> to write changes.</p>
<p><strong>How does Roam handle monorepos?</strong>
Indexes from the root. Batched SQL handles 100k+ symbols. Incremental updates stay fast.</p>
<p><strong>How does Roam handle multi-repo projects (e.g., frontend + backend)?</strong>
Use <code>roam ws init &lt;repo1&gt; &lt;repo2&gt;</code> to create a workspace. Each repo keeps its own index; a workspace overlay DB stores cross-repo API edges. <code>roam ws resolve</code> scans for REST endpoints and matches frontend calls to backend routes. Then <code>roam ws context</code>, <code>roam ws trace</code>, etc. work across repos.</p>
<p><strong>Is Roam compatible with SonarQube / CodeScene?</strong>
Yes. Roam complements existing tools. Both can run in the same CI pipeline. SARIF output integrates with GitHub Code Scanning.</p>
<h2>Limitations</h2>
<p>Static analysis trade-offs:</p>
<ul>
<li><strong>Primarily static analysis</strong> -- can't trace dynamic dispatch, reflection, or eval'd code. Runtime trace ingestion (<code>roam ingest-trace</code>) adds production data but requires an external trace export</li>
<li><strong>Import resolution is heuristic</strong> -- complex re-exports or conditional imports may not resolve</li>
<li><strong>Limited cross-language edges</strong> -- Salesforce, Protobuf, REST API, and multi-repo edges are supported, but not arbitrary FFI</li>
<li><strong>Tier 2 languages</strong> get only basic symbol extraction, via a generic tree-sitter walker</li>
<li><strong>Large monorepos</strong> (100k+ files) may have slow initial indexing</li>
</ul>
<h2>Troubleshooting</h2>
<table>
<thead>
<tr>
<th>Problem</th>
<th>Solution</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam: command not found</code></td>
<td>Ensure install location is on PATH. For <code>uv</code>: <code>uv tool update-shell</code></td>
</tr>
<tr>
<td><code>Another indexing process is running</code></td>
<td>Confirm that no other <code>roam index</code> process is still running, then delete <code>.roam/index.lock</code> and retry</td>
</tr>
<tr>
<td><code>database is locked</code></td>
<td><code>roam index --force</code> to rebuild</td>
</tr>
<tr>
<td>Unicode errors on Windows</td>
<td><code>chcp 65001</code> for UTF-8</td>
</tr>
<tr>
<td>Symbol resolves to wrong file</td>
<td>Use <code>file:symbol</code> syntax: <code>roam symbol myfile:MyFunction</code></td>
</tr>
<tr>
<td>Health score seems wrong</td>
<td><code>roam --json health</code> for factor breakdown</td>
</tr>
<tr>
<td>Index stale after <code>git pull</code></td>
<td><code>roam index</code> (incremental). After major refactors: <code>roam index --force</code></td>
</tr>
</tbody></table>
<h2>Update / Uninstall</h2>
<pre><code class="language-bash"># Update
pipx upgrade roam-code
uv tool upgrade roam-code
pip install --upgrade roam-code
# Uninstall
pipx uninstall roam-code
uv tool uninstall roam-code
pip uninstall roam-code
</code></pre>
<p>Delete <code>.roam/</code> from your project root to clean up local data.</p>
<h2>Development</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e ".[dev]" # includes pytest, ruff
pytest tests/ # ~5500 tests, Python 3.9-3.13
# Or use Make targets:
make dev # install with dev extras
make test # run tests
make lint # ruff check
</code></pre>
<details>
<summary><strong>Project structure</strong></summary>
<pre><code>roam-code/
├── pyproject.toml
├── action.yml # Reusable GitHub Action
├── src/roam/
│ ├── __init__.py # Version (from pyproject.toml)
│ ├── cli.py # Click CLI (140 commands)
│ ├── mcp_server.py # MCP server (102 tools, 10 resources, 5 prompts)
│ ├── db/
│ │ ├── connection.py # SQLite (WAL, pragmas, batched IN)
│ │ ├── schema.py # Tables, indexes, migrations
│ │ └── queries.py # Named SQL constants
│ ├── index/
│ │ ├── indexer.py # Orchestrates full pipeline
│ │ ├── discovery.py # git ls-files, .gitignore
│ │ ├── parser.py # Tree-sitter parsing
│ │ ├── symbols.py # Symbol + reference extraction
│ │ ├── relations.py # Reference resolution -> edges
│ │ ├── complexity.py # Cognitive complexity (SonarSource) + Halstead metrics
│ │ ├── git_stats.py # Churn, co-change, blame, Renyi entropy
│ │ ├── incremental.py # mtime + hash change detection
│ │ ├── file_roles.py # Smart file role classifier
│ │ └── test_conventions.py # Pluggable test naming adapters
│ ├── languages/
│ │ ├── base.py # Abstract LanguageExtractor
│ │ ├── registry.py # Language detection + aliasing
│ │ ├── *_lang.py # One file per language (21 dedicated + generic)
│ │ └── generic_lang.py # Tier 2 fallback
│ ├── bridges/
│ │ ├── base.py, registry.py # Cross-language bridge framework
│ │ ├── bridge_salesforce.py # Apex <-> Aura/LWC/Visualforce
│ │ └── bridge_protobuf.py # .proto -> Go/Java/Python stubs
│ ├── catalog/
│ │ ├── tasks.py # Universal algorithm catalog (23 patterns)
│ │ └── detectors.py # Anti-pattern detectors with confidence calibration
│ ├── workspace/
│ │ ├── config.py # .roam-workspace.json
│ │ ├── db.py # Workspace overlay DB
│ │ ├── api_scanner.py # REST API endpoint detection
│ │ └── aggregator.py # Cross-repo aggregation
│ ├── graph/
│ │ ├── builder.py, pagerank.py # DB -> NetworkX, PageRank
│ │ ├── cycles.py, clusters.py # Tarjan SCC, propagation cost, Louvain, modularity Q
│ │ ├── layers.py, pathfinding.py # Topo layers, k-shortest paths
│ │ ├── simulate.py, spectral.py # Architecture simulation, Fiedler bisection
│ │ ├── partition.py, fingerprint.py # Multi-agent partitioning, topology fingerprints
│ │ └── anomaly.py # Statistical anomaly detection
│ ├── commands/
│ │ ├── resolve.py # Shared symbol resolution
│ │ ├── graph_helpers.py # Shared graph utilities (adj builders, BFS)
│ │ ├── context_helpers.py # Data-gathering helpers for context command
│ │ ├── gate_presets.py # Framework-specific gate rules
│ │ └── cmd_*.py # One module per command
│ ├── analysis/
│ │ ├── effects.py # Side-effect classification engine
│ │ └── taint.py # Taint analysis
│ ├── refactor/
│ │ ├── codegen.py # Import generation (Python/JS/Go)
│ │ └── transforms.py # move/rename/add-call/extract transforms
│ ├── rules/
│ │ ├── engine.py # YAML rule parser + graph query evaluator
│ │ ├── builtin.py # 10 built-in governance rules
│ │ ├── ast_match.py # AST pattern matching with $METAVAR captures
│ │ └── dataflow.py # Intra-procedural dataflow analysis
│ ├── runtime/
│ │ ├── trace_ingest.py # OpenTelemetry/Jaeger/Zipkin ingestion
│ │ └── hotspots.py # Runtime hotspot analysis
│ ├── search/
│ │ ├── tfidf.py # TF-IDF semantic search engine
│ │ ├── index_embeddings.py # Embedding index builder
│ │ └── onnx_embeddings.py # Optional local ONNX semantic backend
│ ├── security/
│ │ ├── vuln_store.py # CVE/vulnerability storage
│ │ └── vuln_reach.py # Vulnerability reachability paths
│ └── output/
│ ├── formatter.py # Token-efficient formatting
│ ├── sarif.py # SARIF 2.1.0 output
│ └── schema_registry.py # JSON envelope schema versioning
└── tests/ # ~5500 tests across 186 test files
</code></pre>
</details>
<h3>Dependencies</h3>
<table>
<thead>
<tr>
<th>Package</th>
<th>Purpose</th>
</tr>
</thead>
<tbody><tr>
<td><a href="https://click.palletsprojects.com/">click</a> >= 8.0</td>
<td>CLI framework</td>
</tr>
<tr>
<td><a href="https://github.com/tree-sitter/py-tree-sitter">tree-sitter</a> >= 0.23</td>
<td>AST parsing</td>
</tr>
<tr>
<td><a href="https://github.com/nicolo-ribaudo/tree-sitter-language-pack">tree-sitter-language-pack</a> >= 0.6</td>
<td>165+ grammars</td>
</tr>
<tr>
<td><a href="https://networkx.org/">networkx</a> >= 3.0</td>
<td>Graph algorithms</td>
</tr>
</tbody></table>
<p>Optional: <a href="https://github.com/jlowin/fastmcp">fastmcp</a> >= 2.0 (MCP server — install with <code>pip install "roam-code[mcp]"</code>)</p>
<p>Optional: Local semantic ONNX stack (<code>numpy</code>, <code>onnxruntime</code>, <code>tokenizers</code>) via <code>pip install "roam-code[semantic]"</code></p>
<h2>Roadmap</h2>
<h3>Shipped</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> MCP v2 agent surface: in-process execution, compound operations, presets, schemas, annotations, and compatibility profiles.</li>
<li><input checked="" disabled="" type="checkbox"> Full command and MCP inventory parity in docs: 140 CLI commands and 102 MCP tools.</li>
<li><input checked="" disabled="" type="checkbox"> CI hardening: composite action, changed-only mode, trend-aware gates, sticky PR updater, and SARIF guardrails.</li>
<li><input checked="" disabled="" type="checkbox"> Performance foundation: FTS5/BM25 search, O(changed) incremental indexing, DB/index optimizations.</li>
<li><input checked="" disabled="" type="checkbox"> Agent governance suite: <code>vibe-check</code>, <code>ai-readiness</code>, <code>verify</code>, <code>ai-ratio</code>, <code>duplicates</code>, advanced <code>algo</code> scoring/SARIF.</li>
<li><input checked="" disabled="" type="checkbox"> Ownership/review intelligence: <code>codeowners</code>, <code>drift</code>, <code>simulate-departure</code>, <code>suggest-reviewers</code>, <code>api-changes</code>, <code>test-gaps</code>, <code>semantic-diff</code>, <code>secrets</code>.</li>
<li><input checked="" disabled="" type="checkbox"> Multi-agent operations: <code>partition</code>, <code>affected</code>, <code>syntax-check</code>, workspace-aware context and traces.</li>
<li><input checked="" disabled="" type="checkbox"> Budget-aware context delivery: <code>--budget</code> (partial rollout), PageRank-weighted truncation, conversation-aware ranking.</li>
</ul>
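<p>The idea behind PageRank-weighted truncation in the budget-aware item can be illustrated with a short sketch. This is not Roam's implementation: the function names, the edge list, and the per-symbol token costs are all hypothetical, and PageRank is computed with a plain power iteration so the example needs no third-party packages.</p>

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def truncate_to_budget(token_costs, rank, budget):
    """Greedily keep the highest-ranked symbols that still fit the budget."""
    kept, used = [], 0
    for symbol in sorted(token_costs, key=lambda s: rank.get(s, 0.0), reverse=True):
        if used + token_costs[symbol] <= budget:
            kept.append(symbol)
            used += token_costs[symbol]
    return kept, used


if __name__ == "__main__":
    # Hypothetical call graph and token costs, for illustration only.
    edges = [("cli", "parser"), ("cli", "graph"), ("parser", "graph"), ("tests", "graph")]
    costs = {"cli": 400, "parser": 300, "graph": 500, "tests": 200}
    print(truncate_to_budget(costs, pagerank(edges), budget=800))
```

<p>Symbols that many others depend on rank higher and therefore survive truncation first. The greedy pass is a simplification; a real budgeter could also weigh recency or conversation context.</p>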
<h3>Next</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> Terminal demo GIF in README.</li>
<li><input disabled="" type="checkbox"> GitHub repo topics.</li>
<li><input disabled="" type="checkbox"> GitHub Discussions enabled.</li>
<li><input disabled="" type="checkbox"> MCP directory + awesome-list submissions.</li>
</ul>
<h2>Contributing</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
pytest tests/ # all ~5500 tests must pass
</code></pre>
<p>Good first contributions: add a <a href="src/roam/languages/">Tier 1 language</a> (use <code>go_lang.py</code> or <code>php_lang.py</code> as a template), improve reference resolution, add benchmark repos, extend the SARIF converters, or add MCP tools.</p>
<p>Please open an issue first to discuss larger changes.</p>
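<p>For contributors looking at the SARIF converters, the following sketch shows the minimal shape of a SARIF 2.1.0 log. It is an illustration only and does not reflect the code in <code>sarif.py</code>: the function name <code>to_sarif</code>, the finding keys (<code>rule_id</code>, <code>message</code>, <code>path</code>, <code>line</code>), and the tool name are hypothetical.</p>

```python
import json


def to_sarif(findings, tool_name="example-tool"):
    """Wrap a list of finding dicts in a minimal SARIF 2.1.0 log."""
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = [
        {
            "ruleId": f["rule_id"],
            # SARIF result levels: none, note, warning, error.
            "level": f.get("level", "warning"),
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["path"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name, "rules": [{"id": r} for r in rule_ids]}},
                "results": results,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical finding, for illustration only.
    finding = {"rule_id": "no-cycles", "message": "Import cycle detected", "path": "src/app.py", "line": 12}
    print(json.dumps(to_sarif([finding]), indent=2))
```

<p>Each finding becomes one entry in <code>results</code>, and every distinct rule ID is declared once under <code>tool.driver.rules</code>.</p>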
<h2>License</h2>
<p><a href="LICENSE">MIT</a></p>