roam-code

The architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping, runtime analysis -- one CLI, zero API keys.

140 commands · 102 MCP tools · 27 languages · 100% local

What is Roam?

Roam is a structural intelligence engine for software. It pre-indexes your codebase into a semantic graph -- symbols, dependencies, call graphs, architecture layers, git history, and runtime traces -- stored in a local SQLite DB. Agents query it via CLI or MCP instead of repeatedly grepping files and guessing structure.

Unlike LSPs (editor-bound, language-specific) or Sourcegraph (hosted search), Roam provides architecture-level graph queries -- offline, cross-language, and compact. It goes beyond comprehension: Roam governs architecture through budget gates, simulates refactoring outcomes, orchestrates multi-agent swarms with zero-conflict guarantees, maps vulnerability reachability paths, and enables graph-level code editing without syntax errors.

Codebase ──> [Index] ──> Semantic Graph ──> 140 Commands ──> AI Agent
              │              │                  │
           tree-sitter    symbols            comprehend
           27 languages   + edges            govern
           git history    + metrics          refactor
           runtime traces + architecture     orchestrate

The problem

Coding agents explore codebases inefficiently: dozens of grep/read cycles, high token cost, no structural understanding. Roam replaces this with one graph query:

$ roam context Flask
Callers: 47  Callees: 3  Affected tests: 31

Files to read:
  src/flask/app.py:76-963        # definition
  src/flask/__init__.py:1-15     # re-export
  src/flask/testing.py:22-45     # caller: FlaskClient.__init__
  tests/test_basic.py:12-30      # caller: test_app_factory
  ...12 more files

Core commands

$ roam understand              # full codebase briefing
$ roam context <name>          # files-to-read with exact line ranges
$ roam preflight <name>        # blast radius + tests + complexity + architecture rules
$ roam health                  # composite score (0-100)
$ roam diff                    # blast radius of uncommitted changes

What's New in v11

v11.2 -- AST Clone Detection + Debug Artifact Rules

roam clones: New AST structural clone detection via subtree hashing. Finds Type-2 clones (identical control flow, different identifiers/literals) with Jaccard similarity scoring, Union-Find clustering, and automated refactoring suggestions. More precise than the metric-based duplicates command.
9 debug artifact rules (COR-560 through COR-568): Detect leftover print(), breakpoint(), pdb.set_trace(), console.log(), debugger, and System.out.println() in Python, JavaScript, TypeScript, and Java code. All use ast_match type with test file exemptions.
140 commands, 102 MCP tools.
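
The three steps named for `roam clones` (subtree hashing, Jaccard scoring, Union-Find clustering) can be illustrated with a minimal sketch. This is not Roam's implementation: it uses Python's standard `ast` module in place of tree-sitter, and `subtree_hashes`, `jaccard`, `UnionFind`, `find_clones`, the 0.9 threshold, and `SAMPLE` are hypothetical names and values chosen for illustration.

```python
import ast
import hashlib
from itertools import combinations

# Hypothetical input: the first two functions share control flow but differ
# in identifiers and literals (a Type-2 clone pair); the third is unrelated.
SAMPLE = '''
def total_price(items):
    total = 0
    for item in items:
        if item > 10:
            total += item
    return total

def sum_scores(values):
    acc = 1
    for v in values:
        if v > 99:
            acc += v
    return acc

def greet(name):
    return "hello " + name
'''


def subtree_hashes(func: ast.AST) -> set[str]:
    """Hash every subtree by node type only, ignoring names and literal values."""
    hashes: set[str] = set()

    def visit(node: ast.AST) -> str:
        children = [visit(child) for child in ast.iter_child_nodes(node)]
        shape = type(node).__name__ + "(" + ",".join(children) + ")"
        digest = hashlib.sha1(shape.encode()).hexdigest()[:12]
        hashes.add(digest)
        return digest

    visit(func)
    return hashes


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


class UnionFind:
    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, item):
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]  # path halving
            item = self.parent[item]
        return item

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def find_clones(source: str, threshold: float = 0.9) -> list[list[str]]:
    """Cluster functions whose subtree-hash sets are at least `threshold` similar."""
    tree = ast.parse(source)
    funcs = {
        node.name: subtree_hashes(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    uf = UnionFind(funcs)
    for a, b in combinations(funcs, 2):
        if jaccard(funcs[a], funcs[b]) >= threshold:
            uf.union(a, b)
    clusters: dict[str, list[str]] = {}
    for name in funcs:
        clusters.setdefault(uf.find(name), []).append(name)
    return sorted(sorted(group) for group in clusters.values() if len(group) > 1)


print(find_clones(SAMPLE))
```

Because only node types feed the hash, renaming variables or changing constants leaves the hash set unchanged, which is the defining property of a Type-2 clone.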

v11.1.2 -- SQL + Scala Tier 1, 27 Languages

SQL DDL promoted to Tier 1 with dedicated SqlExtractor -- tables, columns, views, functions, triggers, schemas, types (enums), sequences, ALTER TABLE ADD COLUMN. Foreign keys produce graph edges; views and triggers reference source tables. Database-schema projects now work with roam health, roam layers, roam impact, roam coupling and all graph commands.
Scala promoted to Tier 1 with dedicated ScalaExtractor -- classes, traits, objects, case classes, sealed hierarchies, val/var properties, type aliases, imports, and inheritance. Full extends + with trait mixin resolution.
27 languages with 16 dedicated Tier 1 extractors.
server.json for official MCP Registry submission.

v11.1.1 -- Command Quality Audit

Full command audit: all 140 commands reviewed for usefulness, duplicates, and test coverage. ~20 bugs fixed, 21 new test files (700+ tests), every command docstring updated with cross-references to related commands.
Kotlin promoted to Tier 1 via new YAML-based declarative extractor architecture. Classes, interfaces, enums, objects, functions, methods, properties, and inheritance fully extracted.
7 new commands: roam congestion, roam adrs, roam flag-dead, roam test-scaffold, roam sbom, roam triage, roam ci-setup.
CI templates: roam ci-setup generates pipelines for GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, and Bitbucket.
Bug fixes: --undocumented mode in intent (wrong DB table), --changed flag in verify (was permanently dead), lazy-load violation in visualize (~500ms penalty), exit code inconsistency in rules, VERDICT-first convention enforced across all commands.
Code quality: 15 unused variables removed, dead code swept (4 orphaned cmd files, 2 dead helper functions), algo detector false-positive rate reduced (regex-in-loop: 7 to 1, list-prepend deque suppression), 6 regex patterns pre-compiled for loop performance.

v11.0 -- MCP v2 for Agent-First Workflows

In-process MCP execution removes per-call subprocess overhead.
4 compound operations (roam_explore, roam_prepare_change, roam_review_change, roam_diagnose_issue) reduce multi-step agent workflows to single calls.
Preset-based tool surfacing (core, review, refactor, debug, architecture, full) keeps default tool choice tight for agents while retaining full depth on demand.
MCP tools now expose structured schemas and richer annotations for safer planner behavior.
MCP token overhead for default core context dropped from ~36K to <3K tokens (about 92% reduction).

Performance and Retrieval

Symbol search moved to SQLite FTS5/BM25: typical search moved from seconds to milliseconds (about 1000x on benchmarked paths).
Incremental indexing shifted from O(N) full-edge rebuild behavior to O(changed) updates.
DB/runtime optimizations (mmap_size, safer large-graph guards, batched writes) reduce first-run and reindex friction on larger repos.
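
To make the FTS5/BM25 point concrete, here is a minimal sketch of symbol search over an SQLite full-text index. It is not Roam's actual schema: the `symbols` table, its columns, the helper names, and the sample rows are assumptions for illustration, and it assumes the SQLite library bundled with your Python was compiled with FTS5.

```python
import sqlite3

# Hypothetical rows: (symbol name, kind, file path).
SAMPLE_SYMBOLS = [
    ("to_json", "fn", "src/roam/output/formatter.py"),
    ("json_envelope", "fn", "src/roam/output/formatter.py"),
    ("find_project_root", "fn", "src/roam/db/connection.py"),
    ("compute_pagerank", "fn", "src/roam/graph/pagerank.py"),
]


def build_symbol_index(symbols):
    """Create an in-memory FTS5 table and load the symbol rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE symbols USING fts5(name, kind, path)")
    conn.executemany(
        "INSERT INTO symbols (name, kind, path) VALUES (?, ?, ?)", symbols
    )
    return conn


def search_symbols(conn, query, limit=5):
    """Return (name, path) pairs, best BM25 match first."""
    rows = conn.execute(
        "SELECT name, path FROM symbols "
        "WHERE symbols MATCH ? "
        "ORDER BY bm25(symbols) "  # lower bm25() values mean better matches
        "LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


conn = build_symbol_index(SAMPLE_SYMBOLS)
for name, path in search_symbols(conn, "json"):
    print(name, path)
```

The default FTS5 tokenizer splits on underscores, so a query for `json` matches both `to_json` and `json_envelope` without a table scan.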

CI, Governance, and Delivery

GitHub Action supports quality gates, SARIF upload, sticky PR comments, and cache-aware execution.
CI hardening includes changed-only analysis mode, trend-aware gates, and SARIF pre-upload guardrails (size/result caps + truncation signaling).
Agent governance expanded with verification and AI-quality tooling (roam verify, roam vibe-check, roam ai-readiness, roam ai-ratio) for teams managing agent-written code.

Best for

Agent-assisted coding -- structured answers that reduce token usage vs raw file exploration
Large codebases (100+ files) -- graph queries beat linear search at scale
Architecture governance -- health scores, CI quality gates, budget enforcement, fitness functions
Safe refactoring -- blast radius, affected tests, pre-change safety checks, graph-level editing
Multi-agent orchestration -- partition codebases for parallel agent work with zero-conflict guarantees
Security analysis -- vulnerability reachability mapping, auth gaps, CVE path tracing
Algorithm optimization -- detect O(n^2) loops, N+1 queries, and 21 other anti-patterns with suggested fixes
Backend quality -- auth gaps, missing indexes, over-fetching models, non-idempotent migrations, orphan routes, API drift
Runtime analysis -- overlay production trace data onto the static graph for hotspot detection
Multi-repo projects -- cross-repo API edge detection between frontend and backend

When NOT to use Roam

Real-time type checking -- use an LSP (pyright, gopls, tsserver). Roam is static and offline.
Small scripts (<10 files) -- just read the files directly.
Pure text search -- ripgrep is faster for raw string matching.

Why use Roam

Speed. One command replaces 5-10 tool calls (in typical workflows). Under 0.5s for any query.

Dependency-aware. Computes structure, not string matches. Knows Flask has 47 dependents and 31 affected tests. grep knows it appears 847 times.

LLM-optimized output. Plain ASCII, compact abbreviations (fn, cls, meth), --json envelopes. Designed for agent consumption, not human decoration.

Fully local. No API keys, telemetry, or network calls. Works in air-gapped environments.

Algorithm-aware. Built-in catalog of 23 anti-patterns. Detects suboptimal algorithms (quadratic loops, N+1 queries, unbounded recursion) and suggests fixes with Big-O improvements and confidence scores. Receiver-aware loop-invariant analysis minimizes false positives.

CI-ready. --json output, --gate quality gates, GitHub Action, SARIF 2.1.0.

Metric	Without Roam	With Roam
Tool calls	8	1
Wall time	~11s	<0.5s
Tokens consumed	~15,000	~3,000

Measured on a typical agent workflow in a 200-file Python project (Flask). See benchmarks for more.

Table of Contents

Getting Started: What is Roam? · What's New in v11 · Best for · Why use Roam · Install · Quick Start

Using Roam: Commands · Walkthrough · AI Coding Tools · MCP Server

Operations: CI/CD Integration · SARIF Output · For Teams

Reference: Language Support · Performance · How It Works · How Roam Compares · FAQ

More: Limitations · Troubleshooting · Update / Uninstall · Development · Contributing

Install

pip install roam-code

# Recommended: isolated environment
pipx install roam-code
# or
uv tool install roam-code

# From source
pip install git+https://github.com/Cranot/roam-code.git

Requires Python 3.9+. Works on Linux, macOS, and Windows.

Windows: If roam is not found after installing with uv, run uv tool update-shell and restart your terminal.

Docker (alpine-based)

docker build -t roam-code .
docker run --rm -v "$PWD:/workspace" roam-code index
docker run --rm -v "$PWD:/workspace" roam-code health

Quick Start

cd your-project
roam init                  # indexes codebase, creates config + CI workflow
roam understand            # full codebase briefing

First index takes ~5s for 200 files, ~15s for 1,000 files. Subsequent runs are incremental and near-instant.

Next steps:

Set up your AI agent: roam describe --write (auto-detects CLAUDE.md, AGENTS.md, .cursor/rules, etc. — see integration instructions)
Explore: roam health → roam weather → roam map
Add to CI: roam init already generated a GitHub Action

Try it on Roam itself

git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
roam init
roam understand
roam health

Works With

Claude Code • Cursor • Windsurf • GitHub Copilot • Aider • Cline • Gemini CLI • OpenAI Codex CLI • MCP • GitHub Actions • GitLab CI • Azure DevOps

Commands

The 5 core commands shown above cover ~80% of agent workflows. All 140 commands are organized into 7 categories.

Full command reference

Getting Started

Command	Description
`roam index [--force] [--verbose]`	Build or rebuild the codebase index
`roam watch [--interval N] [--debounce N] [--webhook-port P] [--guardian]`	Long-running index daemon: poll/webhook-triggered refreshes plus optional continuous architecture-guardian snapshots and JSONL compliance artifacts
`roam init`	Guided onboarding: creates `.roam/fitness.yaml`, CI workflow, runs index, shows health
`roam hooks [--install] [--uninstall]`	Manage git hooks for automated roam index updates and health gates
`roam doctor`	Diagnose installation and environment: verify tree-sitter grammars, SQLite, git, and config health
`roam reset [--hard]`	Reset the roam index and cached data. `--hard` removes all `.roam/` artifacts
`roam clean [--all]`	Remove stale or orphaned index entries without a full rebuild
`roam understand`	Full codebase briefing: tech stack, architecture, key abstractions, health, conventions, complexity overview, entry points
`roam onboard`	Alias for `understand`
`roam tour [--write PATH]`	Auto-generated onboarding guide: top symbols, reading order, entry points, language breakdown. `--write` saves to Markdown
`roam describe [--write] [--force] [-o PATH] [--agent-prompt]`	Auto-generate project description for AI agents. `--write` auto-detects your agent's config file. `--agent-prompt` returns a compact (<500 token) system prompt
`roam agent-export [--format F] [--write]`	Generate agent-context bundle from project analysis (`AGENTS.md` + provider-specific overlays)
`roam minimap [--update] [-o FILE] [--init-notes]`	Compact annotated codebase snapshot for agent config injection: stack, annotated directory tree, key symbols by PageRank, high fan-in symbols to avoid touching, hotspots, conventions. Sentinel-based in-place updates
`roam config [--set-db-dir PATH] [--semantic-backend MODE]`	Manage `.roam/config.json` (DB path, excludes, optional ONNX semantic settings)
`roam map [-n N] [--full] [--budget N]`	Project skeleton: files, languages, entry points, top symbols by PageRank. `--budget` caps output to N tokens
`roam schema [--diff] [--version V]`	JSON envelope schema versioning: view, diff, and validate output schemas
`roam mcp [--list-tools] [--transport T]`	Start MCP server (stdio/SSE/streamable-http), inspect available tools, and expose roam to coding agents
`roam mcp-setup <platform>`	Generate MCP config snippets for AI platforms: claude-code, cursor, windsurf, vscode, gemini-cli, codex-cli
`roam ci-setup [--platform P] [--write]`	Generate CI/CD pipeline config (GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, Bitbucket) with SARIF + quality gates
`roam adrs [--status S] [--limit N]`	Discover Architecture Decision Records, link to affected code modules, show status and coverage

Daily Workflow

Command	Description
`roam file <path> [--full] [--changed] [--deps-of PATH]`	File skeleton: all definitions with signatures, cognitive load index, health score
`roam symbol <name> [--full]`	Symbol definition + callers + callees + metrics. Supports `file:symbol` disambiguation
`roam context <symbol> [--task MODE] [--for-file PATH]`	AI-optimized context: definition + callers + callees + files-to-read with line ranges
`roam search <pattern> [--kind KIND]`	Find symbols by name pattern, PageRank-ranked
`roam grep <pattern> [-g glob] [-n N]`	Text search annotated with enclosing symbol context
`roam deps <path> [--full]`	What a file imports and what imports it
`roam trace <source> <target> [-k N]`	Dependency paths with coupling strength and hub detection
`roam impact <symbol>`	Blast radius: what breaks if a symbol changes (Personalized PageRank weighted)
`roam diff [--staged] [--full] [REV_RANGE]`	Blast radius of uncommitted changes or a commit range
`roam pr-risk [REV_RANGE]`	PR risk score (0-100, multiplicative model) + structural spread + suggested reviewers
`roam pr-diff [--staged] [--range R] [--format markdown]`	Structural PR diff: metric deltas, edge analysis, symbol changes, footprint. Not text diff — graph delta
`roam api-changes [REV_RANGE]`	API change classifier: breaking/non-breaking changes, severity, and affected contracts
`roam semantic-diff [REV_RANGE]`	Structural change summary: symbols added/removed/modified and changed call edges
`roam test-gaps [REV_RANGE]`	Changed-symbol test gap detection: what changed and what still lacks test coverage
`roam affected [REV_RANGE]`	Monorepo/package impact analysis: what components are affected by a change
`roam attest [REV_RANGE] [--format markdown] [--sign]`	Proof-carrying PR attestation: bundles blast radius, risk, breaking changes, fitness, budget, tests, effects into one verifiable artifact
`roam annotate <symbol> <note>`	Attach persistent notes to symbols (agentic memory across sessions)
`roam annotations [--file F] [--symbol S]`	View stored annotations
`roam diagnose <symbol> [--depth N]`	Root cause analysis: ranks suspects by z-score normalized risk
`roam preflight <symbol|file>`	Compound pre-change check: blast radius + tests + complexity + coupling + fitness
`roam guard <symbol>`	Compact sub-agent preflight bundle: definition, 1-hop callers/callees, test files, breaking-risk score, and layer signals
`roam agent-plan --agents N`	Decompose partitions into dependency-ordered agent tasks with merge sequencing and handoffs
`roam agent-context --agent-id N [--agents M]`	Generate per-agent execution context: write scope, read-only dependencies, and interface contracts
`roam syntax-check [--changed] [PATHS...]`	Tree-sitter syntax integrity check for changed files and multi-agent judge workflows
`roam verify [--threshold N]`	Pre-commit AI-code consistency check across naming, imports, error handling, and duplication signals
`roam verify-imports [--file F]`	Import hallucination firewall: validate all imports against indexed symbol table, suggest corrections via FTS5 fuzzy matching
`roam triage list|add|stats|check`	Security finding suppression workflow: manage `.roam-suppressions.yml` (SAFE/ACKNOWLEDGED/WONT-FIX status lifecycle)
`roam safe-delete <symbol>`	Safe deletion check: SAFE/REVIEW/UNSAFE verdict
`roam test-map <name>`	Map a symbol or file to its test coverage
`roam adversarial [--staged] [--range R]`	Adversarial architecture review: generates targeted challenges based on changes
`roam plan [--staged] [--range R] [--agents N]`	Agent work planner: decompose changes into sequenced, dependency-aware steps
`roam closure <symbol> [--rename] [--delete]`	Minimal-change synthesis: all files to touch for a safe rename/delete
`roam mutate move|rename|add-call|extract`	Graph-level code editing: move symbols, rename across codebase, add calls, extract functions. Dry-run by default
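
The `roam impact` entry above mentions Personalized PageRank weighting. As a rough sketch of the idea, and not Roam's actual algorithm or parameters, the code below runs a random walk with restart over reversed call edges, so that score flows from a changed symbol to everything that depends on it. The function names, the damping factor of 0.85, the iteration count, and the sample graph are all assumptions.

```python
# Hypothetical call graph, stored reversed: symbol -> symbols that call it.
CALLERS_OF = {
    "parse_config": ["load_app", "cli_main"],
    "load_app": ["cli_main", "test_load_app"],
    "cli_main": [],
    "test_load_app": [],
    "unrelated": ["other"],
}


def personalized_pagerank(callers_of, seed, damping=0.85, iterations=50):
    """Power iteration for PageRank where every restart jumps back to `seed`."""
    nodes = set(callers_of) | {c for cs in callers_of.values() for c in cs} | {seed}
    rank = {node: 0.0 for node in nodes}
    rank[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node, score in rank.items():
            targets = callers_of.get(node, [])
            if targets:
                share = damping * score / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                nxt[seed] += damping * score  # dangling mass returns to the seed
        nxt[seed] += (1.0 - damping) * sum(rank.values())
        rank = nxt
    return rank


def blast_radius(callers_of, seed):
    """Dependents of `seed`, most exposed first; unreachable symbols are dropped."""
    rank = personalized_pagerank(callers_of, seed)
    reached = [n for n, score in rank.items() if n != seed and score > 0.0]
    return sorted(reached, key=lambda n: rank[n], reverse=True)


print(blast_radius(CALLERS_OF, "parse_config"))
```

In this toy graph `cli_main` ranks highest because it depends on `parse_config` both directly and through `load_app`, while `unrelated` and `other` never receive any score.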

Codebase Health

Command	Description
`roam health [--no-framework] [--gate]`	Composite health score (0-100): weighted geometric mean of tangle ratio, god components, bottlenecks, layer violations. `--gate` runs quality gate checks from `.roam-gates.yml` (exit 5 on failure)
`roam smells [--file F] [--min-severity S]`	Code smell detection: 15 deterministic detectors (brain methods, god classes, feature envy, shotgun surgery, data clumps, etc.) with per-file health scores
`roam dashboard`	Unified single-screen project status: health, hotspots, risks, ownership, and AI-rot indicators
`roam vibe-check [--threshold N]`	AI-rot auditor: 8-pattern taxonomy with composite risk score and prioritized findings
`roam ai-readiness`	0-100 score for how well this codebase supports AI coding agents
`roam ai-ratio [--since N]`	Statistical estimate of AI-generated code ratio using commit-behavior signals
`roam trends [--record] [--days N] [--metric M]`	Historical metrics snapshots with sparklines and trend deltas
`roam complexity [--bumpy-road]`	Per-function cognitive complexity (SonarSource-compatible, triangular nesting penalty) + Halstead metrics (volume, difficulty, effort, bugs) + cyclomatic density
`roam algo [--task T] [--confidence C] [--profile P]`	Algorithm anti-pattern detection: 23-pattern catalog detects suboptimal algorithms (O(n^2) loops, N+1 queries, quadratic string building, branching recursion, loop-invariant calls) and suggests better approaches with Big-O improvements. Confidence calibration via caller-count + runtime traces, evidence paths, impact scoring, framework-aware N+1 packs, and language-aware fix templates. Alias: `roam math`
`roam n1 [--confidence C] [--verbose]`	Implicit N+1 I/O detection: finds ORM model computed properties (`$appends`/accessors) that trigger lazy-loaded DB queries in collection contexts. Cross-references with eager loading config. Supports Laravel, Django, Rails, SQLAlchemy, JPA
`roam over-fetch [--threshold N] [--confidence C]`	Detect models serializing too many fields: large `$fillable` without `$hidden`/`$visible`, direct controller returns bypassing API Resources, poor exposed-to-hidden ratio
`roam missing-index [--table T] [--confidence C]`	Find queries on non-indexed columns: cross-references `WHERE`/`ORDER BY` clauses, foreign keys, and paginated queries against migration-defined indexes
`roam weather [-n N]`	Hotspots ranked by geometric mean of churn x complexity (percentile-normalized)
`roam debt [--roi]`	Hotspot-weighted tech debt prioritization with SQALE remediation costs and optional refactoring ROI estimates
`roam fitness [--explain]`	Architectural fitness functions from `.roam/fitness.yaml`
`roam alerts`	Health degradation trend detection (Mann-Kendall + Sen's slope)
`roam forecast [--symbol S] [--horizon N] [--alert-only]`	Predict when metrics will exceed thresholds: Theil-Sen regression on snapshot history + churn-weighted per-symbol risk
`roam budget [--init] [--staged] [--range R]`	Architectural budget enforcement: per-PR delta limits on health, cycles, complexity. CI gate (exit 5 on violation)
`roam bisect [--metric M] [--range R]`	Architectural git bisect: find the commit that degraded a specific metric
`roam ingest-trace <file> [--otel|--jaeger|--zipkin|--generic]`	Ingest runtime trace data (OpenTelemetry, Jaeger, Zipkin) for hotspot overlay
`roam hotspots [--runtime] [--discrepancy]`	Runtime hotspot analysis: find symbols missed by static analysis but critical at runtime
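
The `roam alerts` entry names two standard statistics, the Mann-Kendall trend test and Sen's slope. The sketch below shows how they are typically computed on a series of metric snapshots. It is a textbook version, not Roam's code: it omits the tie correction, uses the normal approximation for the p-value, and the function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(series):
    """Return (S, z, two-sided p) for a monotonic trend in `series`."""
    n = len(series)
    # S counts later-minus-earlier signs over every ordered pair.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance of S without tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return s, z, p


def sens_slope(series):
    """Median of all pairwise slopes; needs at least two points."""
    return median(
        (series[j] - series[i]) / (j - i)
        for i, j in combinations(range(len(series)), 2)
    )


health_history = [10, 8, 6, 4, 2]  # hypothetical snapshots, oldest first
s, z, p = mann_kendall(health_history)
print(s, round(z, 3), round(p, 4), sens_slope(health_history))
```

Both statistics depend only on the ordering and pairwise differences of the snapshots, which makes them robust to a single outlier reading. The same median-of-slopes estimator underlies the Theil-Sen regression mentioned for `roam forecast`.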

roam algo — algorithm anti-pattern catalog (23 patterns)

roam algo scans every indexed function against a 23-pattern catalog, ranks findings by runtime-aware impact score, and shows the exact Big-O improvement available. Findings include semantic evidence paths, precision metadata, and language-aware tips/fixes (Python, JS, Go, Rust, Java, etc.):

$ roam algo
VERDICT: 8 algorithmic improvements found (3 high, 4 medium, 1 low)
Ordering: highest impact first
Profile: balanced (filtered 0 low-signal findings)

Nested loop lookup (2):
  fn   resolve_permissions          src/auth/rbac.py:112     [high, impact=86.4]
       Current: Nested iteration -- O(n*m)
       Better:  Hash-map join -- O(n+m)
       Tip: Build a dict/set from one collection, iterate the other
  fn   find_matching_rule           src/rules/engine.py:67   [high, impact=78.1]
       Current: Nested iteration -- O(n*m)
       Better:  Hash-map join -- O(n+m)
       Tip: Build a dict/set from one collection, iterate the other

String building (1):
  meth build_query                  src/db/query.py:88       [high, impact=74.0]
       Current: Loop concatenation -- O(n^2)
       Better:  Join / StringBuilder -- O(n)
       Tip: Collect parts in a list, join once at the end

Branching recursion without memoization (1):
  fn   compute_cost                 src/pricing/calc.py:34   [medium, impact=49.5]
       Current: Naive branching recursion -- O(2^n)
       Better:  Memoized / iterative DP -- O(n)
       Tip: Add @cache / @lru_cache, or convert to iterative with a table

Full catalog — 23 patterns:

Pattern	Anti-pattern detected	Better approach	Improvement
Nested loop lookup	`for x in a: for y in b: if x==y`	Hash-map join	O(n·m) → O(n+m)
Membership test	`if x in list` in a loop	Set lookup	O(n) → O(1) per check
Sorting	Bubble / selection sort	Built-in sort	O(n²) → O(n log n)
Search in sorted data	Linear scan on sorted sequence	Binary search	O(n) → O(log n)
String building	`s += chunk` in loop	`join()` / StringBuilder	O(n²) → O(n)
Deduplication	Nested loop dedup	`set()` / `dict.fromkeys`	O(n²) → O(n)
Max / min	Manual tracking loop	`max()` / `min()`	idiom
Accumulation	Manual accumulator	`sum()` / `reduce()`	idiom
Group by key	Manual key-existence check	`defaultdict` / `groupingBy`	idiom
Fibonacci	Naive recursion	Iterative / `@lru_cache`	O(2ⁿ) → O(n)
Exponentiation	Loop multiplication	`pow(b, e, mod)`	O(n) → O(log n)
GCD	Manual loop	`math.gcd()`	O(n) → O(log n)
Matrix multiply	Naive triple loop	NumPy / BLAS	same asymptotic, ~1000× faster via SIMD
Busy wait	`while True: sleep()` poll	Event / condition variable	O(k) → O(1) wake-up
Regex in loop	`re.match()` compiled per iteration	Pre-compiled pattern	O(n·(p+m)) → O(p + n·m)
N+1 query	Per-item DB / API call in loop	Batch `WHERE IN (...)`	n round-trips → 1
List front operations	`list.insert(0, x)` in loop	`collections.deque`	O(n) → O(1) per op
Sort to select	`sorted(x)[0]` or `sorted(x)[:k]`	`min()` / `heapq.nsmallest`	O(n log n) → O(n) or O(n log k)
Repeated lookup	`.index()` / `.contains()` inside loop	Pre-built set / dict	O(m) → O(1) per lookup
Branching recursion	Naive `f(n-1) + f(n-2)` without cache	`@cache` / iterative DP	O(2ⁿ) → O(n)
Quadratic string building	`result += chunk` across multiple scopes	`parts.append` + `join` at end	O(n²) → O(n)
Loop-invariant call	`get_config()` / `compile_schema()` inside loop body	Hoist before loop	per-iter cost → O(1)
String reversal	Manual char-by-char loop	`s[::-1]` / `.reverse()`	idiom
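
As an illustration of the first catalog row, the sketch below contrasts a nested-loop lookup with the hash-map join that replaces it. The data and function names are hypothetical, and the hashed version assumes user ids are unique.

```python
def match_nested(users, orders):
    """O(n*m): compares every user against every order."""
    return [
        (user["name"], order["total"])
        for user in users
        for order in orders
        if user["id"] == order["user_id"]
    ]


def match_hashed(users, orders):
    """O(n+m): index users once, then probe the dict for each order."""
    names_by_id = {user["id"]: user["name"] for user in users}
    return [
        (names_by_id[order["user_id"]], order["total"])
        for order in orders
        if order["user_id"] in names_by_id
    ]


users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}]
orders = [
    {"user_id": 2, "total": 5},
    {"user_id": 1, "total": 7},
    {"user_id": 3, "total": 9},  # no matching user, dropped by both versions
]
print(sorted(match_nested(users, orders)) == sorted(match_hashed(users, orders)))
```

The two versions return the same pairs in a different order (user-major versus order-major), so callers that depend on ordering need a sort after switching.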

Filtering:

roam algo --task nested-lookup       # one pattern type only
roam algo --confidence high          # high-confidence findings only
roam algo --profile strict           # precision-first filtering
roam algo --task io-in-loop -n 5    # top 5 N+1 query sites
roam --json algo                     # machine-readable output
roam --sarif algo > roam-algo.sarif  # SARIF with fingerprints + fixes

Confidence calibration: high = strong structural signal (unbounded loop + high caller/runtime impact + pattern confirmed); medium = pattern matched but uncertainty remains; low = heuristic signal only.

Profiles: balanced (default), strict (precision-first), aggressive (surface more candidates).

roam minimap — annotated codebase snapshot for agent configs

roam minimap generates a compact block (stack, annotated directory tree, key symbols, hotspots, conventions) wrapped in sentinel comments for in-place agent config updates:

$ roam minimap
<!-- roam:minimap generated=2026-02-25 -->
**Stack:** Python · JavaScript · YAML

.github/                       (CI + Action)
benchmarks/                    (agent-eval + oss-eval)
src/
  roam/
    bridges/
      base.py                  # LanguageBridge
      registry.py              # register_bridge, detect_bridges
    commands/  (137 cmd files) # is_test_file, get_changed_files
    db/
      connection.py            # find_project_root, batched_in
      schema.py
    graph/
      builder.py               # build_symbol_graph, build_file_graph
      pagerank.py              # compute_pagerank, compute_centrality
    languages/ (21 files)      # ApexExtractor
    output/
      formatter.py             # to_json, json_envelope
    cli.py                     # cli, LazyGroup
    mcp_server.py
tests/  (186 files)

Key symbols (PageRank): open_db · ensure_index · json_envelope · to_json · LanguageExtractor

Touch carefully (fan-in >= 15): to_json (116 callers) · json_envelope (116 callers) · open_db (105 callers) · ensure_index (100 callers)

Hotspots (churn x complexity): cmd_context.py · csharp_lang.py · cmd_dead.py

Conventions: snake_case fns, PascalCase classes


**Workflow:**
roam minimap                    # print to stdout
roam minimap --update           # replace sentinel block in CLAUDE.md in-place
roam minimap -o docs/AGENTS.md  # target a different file
roam minimap --init-notes       # scaffold .roam/minimap-notes.md for project gotchas
</code></pre>
<p>The sentinel pair <code>&lt;!-- roam:minimap --&gt;</code> / <code>&lt;!-- /roam:minimap --&gt;</code> is replaced on each run — surrounding content is left intact. Add project-specific gotchas to <code>.roam/minimap-notes.md</code> and they appear in every subsequent output.</p>
<p><strong>Tree annotations</strong> come from the top exported symbols by fan-in per file. Non-source root directories (<code>.github/</code>, <code>benchmarks/</code>, <code>docs/</code>) are collapsed immediately. Large subdirectories (e.g. <code>commands/</code>, <code>languages/</code>) are collapsed at depth 2+ with a file count.</p>
</details>

<h3>Architecture</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam clusters [--min-size N]</code></td>
<td>Community detection vs directory structure. Modularity Q-score (Newman 2004) + per-cluster conductance</td>
</tr>
<tr>
<td><code>roam spectral [--depth N] [--compare] [--gap-only] [--k K]</code></td>
<td>Spectral bisection: Fiedler vector partition tree with algebraic connectivity gap verdict</td>
</tr>
<tr>
<td><code>roam layers</code></td>
<td>Topological dependency layers + upward violations + Gini balance</td>
</tr>
<tr>
<td><code>roam dead [--all] [--summary] [--clusters]</code></td>
<td>Unreferenced exported symbols with safety verdicts + confidence scoring (60-95%)</td>
</tr>
<tr>
<td><code>roam flag-dead [--config FILE] [--include-tests]</code></td>
<td>Feature flag dead code detection: stale LaunchDarkly/Unleash/Split/custom flags with staleness analysis</td>
</tr>
<tr>
<td><code>roam fan [symbol|file] [-n N] [--no-framework]</code></td>
<td>Fan-in/fan-out: most connected symbols or files</td>
</tr>
<tr>
<td><code>roam risk [-n N] [--domain KW] [--explain]</code></td>
<td>Domain-weighted risk ranking</td>
</tr>
<tr>
<td><code>roam why &lt;name&gt; [name2 ...]</code></td>
<td>Role classification (Hub/Bridge/Core/Leaf), reach, criticality</td>
</tr>
<tr>
<td><code>roam split &lt;file&gt;</code></td>
<td>Internal symbol groups with isolation % and extraction suggestions</td>
</tr>
<tr>
<td><code>roam entry-points</code></td>
<td>Entry point catalog with protocol classification</td>
</tr>
<tr>
<td><code>roam patterns</code></td>
<td>Architectural pattern recognition: Strategy, Factory, Observer, etc.</td>
</tr>
<tr>
<td><code>roam visualize [--format mermaid|dot] [--focus NAME] [--limit N]</code></td>
<td>Generate Mermaid or DOT architecture diagrams. Smart filtering via PageRank, cluster grouping, cycle highlighting</td>
</tr>
<tr>
<td><code>roam effects [TARGET] [--file F] [--type T]</code></td>
<td>Side-effect classification: DB writes, network I/O, filesystem, global mutation. Direct + transitive effects through call graph</td>
</tr>
<tr>
<td><code>roam dark-matter [--min-cochanges N]</code></td>
<td>Detect hidden co-change couplings not explained by import/call edges</td>
</tr>
<tr>
<td><code>roam simulate move|extract|merge|delete</code></td>
<td>Counterfactual architecture simulator: test refactoring ideas in-memory, see metric deltas before writing code</td>
</tr>
<tr>
<td><code>roam orchestrate --agents N [--files P]</code></td>
<td>Multi-agent swarm partitioning: split codebase for parallel agents with zero-conflict guarantees</td>
</tr>
<tr>
<td><code>roam partition [--agents N]</code></td>
<td>Multi-agent partition manifest: conflict risk, complexity, and suggested ownership splits</td>
</tr>
<tr>
<td><code>roam fingerprint [--compact] [--compare F]</code></td>
<td>Topology fingerprint: extract/compare architectural signatures across repos</td>
</tr>
<tr>
<td><code>roam cut &lt;target&gt; [--depth N]</code></td>
<td>Minimum graph cuts: find critical edges whose removal disconnects components</td>
</tr>
<tr>
<td><code>roam safe-zones</code></td>
<td>Graph-based containment boundaries</td>
</tr>
<tr>
<td><code>roam coverage-gaps</code></td>
<td>Unprotected entry points with no path to gate symbols</td>
</tr>
<tr>
<td><code>roam duplicates [--threshold T] [--min-lines N]</code></td>
<td>Semantic duplicate detector: functionally equivalent code clusters with divergent edge-case handling</td>
</tr>
<tr>
<td><code>roam clones [--threshold T] [--min-lines N] [--scope P]</code></td>
<td>AST structural clone detection: Type-2 clones via subtree hashing (more precise than <code>duplicates</code>)</td>
</tr>
</tbody></table>
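
<p>The modularity Q-score listed for <code>roam clusters</code> can be computed for any assignment of symbols to groups, whether detected communities or the directory layout. The sketch below uses the standard undirected formula, Q = sum over groups of (internal edges / m) minus (group degree / 2m) squared, where m is the total edge count. The function name and sample graph are hypothetical, and Roam's own computation may differ (for example by weighting or directing edges).</p>

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q for an undirected, unweighted edge list."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # sum of endpoint degrees per community
    for a, b in edges:
        degree[community_of[a]] += 1
        degree[community_of[b]] += 1
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += 1
    return sum(
        internal[c] / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Two triangles joined by a single bridge edge (hypothetical symbols).
EDGES = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "b1"),
]
BY_CLUSTER = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}
ONE_GROUP = {name: "all" for name in BY_CLUSTER}

print(round(modularity(EDGES, BY_CLUSTER), 4))
print(modularity(EDGES, ONE_GROUP))
```

<p>Scoring both the detected clusters and the directory-based grouping with the same function gives a direct measure of how well the folder structure matches the dependency structure.</p>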
<h3>Exploration</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam module &lt;path&gt;</code></td>
<td>Directory contents: exports, signatures, dependencies, cohesion</td>
</tr>
<tr>
<td><code>roam sketch &lt;dir&gt; [--full]</code></td>
<td>Compact structural skeleton of a directory</td>
</tr>
<tr>
<td><code>roam uses &lt;name&gt;</code></td>
<td>All consumers: callers, importers, inheritors</td>
</tr>
<tr>
<td><code>roam owner &lt;path&gt;</code></td>
<td>Code ownership: who owns a file or directory</td>
</tr>
<tr>
<td><code>roam coupling [-n N] [--set]</code></td>
<td>Temporal coupling: file pairs that change together (NPMI + lift)</td>
</tr>
<tr>
<td><code>roam fn-coupling</code></td>
<td>Function-level temporal coupling across files</td>
</tr>
<tr>
<td><code>roam bus-factor [--brain-methods]</code></td>
<td>Knowledge loss risk per module</td>
</tr>
<tr>
<td><code>roam doc-staleness</code></td>
<td>Detect stale docstrings</td>
</tr>
<tr>
<td><code>roam docs-coverage</code></td>
<td>Public-symbol doc coverage + stale docs + PageRank-ranked missing-doc hotlist</td>
</tr>
<tr>
<td><code>roam suggest-refactoring [--limit N] [--min-score N]</code></td>
<td>Proactive refactoring recommendations ranked by complexity, coupling, churn, smells, coverage gaps, and debt</td>
</tr>
<tr>
<td><code>roam plan-refactor &lt;symbol&gt; [--operation auto|extract|move]</code></td>
<td>Ordered refactor plan with blast radius, test gaps, layer risk, and simulation-based strategy preview</td>
</tr>
<tr>
<td><code>roam test-scaffold &lt;name|file&gt; [--write] [--framework F]</code></td>
<td>Generate test file/function/import skeletons from symbol data (pytest, jest, Go, JUnit, RSpec)</td>
</tr>
<tr>
<td><code>roam conventions</code></td>
<td>Auto-detects naming styles and import preferences; flags outliers</td>
</tr>
<tr>
<td><code>roam breaking [REV_RANGE]</code></td>
<td>Breaking change detection: removed exports, signature changes</td>
</tr>
<tr>
<td><code>roam affected-tests &lt;symbol|file&gt;</code></td>
<td>Trace reverse call graph to test files</td>
</tr>
<tr>
<td><code>roam relate &lt;sym1&gt; &lt;sym2&gt;</code></td>
<td>Show relationship between two symbols: shared callers, shortest path, common ancestors</td>
</tr>
<tr>
<td><code>roam endpoints [--routes] [--api]</code></td>
<td>Enumerate all HTTP/API endpoint definitions and surface them for review or cross-repo matching</td>
</tr>
<tr>
<td><code>roam metrics &lt;file|symbol&gt;</code></td>
<td>Unified vital signs: complexity, fan-in/out, PageRank, churn, test coverage, dead code risk -- all in one call</td>
</tr>
<tr>
<td><code>roam search-semantic &lt;query&gt;</code></td>
<td>Hybrid semantic search: BM25 + TF-IDF + optional local ONNX vectors (select via <code>--backend</code>) with framework/library packs</td>
</tr>
<tr>
<td><code>roam intent [--staged] [--range R]</code></td>
<td>Doc-to-code linking: match documentation to symbols, detect drift</td>
</tr>
<tr>
<td><code>roam x-lang [--bridges] [--edges]</code></td>
<td>Cross-language edge browser: inspect bridge-resolved connections</td>
</tr>
</tbody></table>
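<p>The "NPMI + lift" scoring mentioned for <code>roam coupling</code> can be sketched with the textbook definitions of both measures. This is a worked illustration under assumed commit counts, not Roam's code; the function names and the sample history are hypothetical.</p>

```python
import math


def lift(both, a, b, total):
    """Observed co-change rate over the rate expected under independence."""
    return (both * total) / (a * b)


def npmi(both, a, b, total):
    """Normalized PMI in [-1, 1]; 1 means the files always change together."""
    if both == 0:
        return -1.0
    if both == total:
        return 1.0  # convention used here; avoids dividing by log(1) == 0
    pmi = math.log(lift(both, a, b, total))
    return pmi / -math.log(both / total)


# Hypothetical history: 200 commits; file A changed in 40, file B in 25,
# and both changed in the same commit 20 times.
print(round(lift(20, 40, 25, 200), 2))  # 4.0
print(round(npmi(20, 40, 25, 200), 2))  # 0.6
```

<p>A lift of 4.0 says the pair co-changes four times more often than chance would predict; NPMI rescales the same signal so that pairs from busy and quiet files can be compared on one scale.</p>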
<h3>Reports &amp; CI</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam report [--list] [--config FILE] [PRESET]</code></td>
<td>Compound presets: <code>first-contact</code>, <code>security</code>, <code>pre-pr</code>, <code>refactor</code>, <code>guardian</code></td>
</tr>
<tr>
<td><code>roam describe --write</code></td>
<td>Generate agent config (auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.)</td>
</tr>
<tr>
<td><code>roam auth-gaps [--routes-only] [--controllers-only] [--min-confidence C]</code></td>
<td>Find endpoints missing authentication or authorization: routes outside auth middleware groups, CRUD methods without <code>$this-&gt;authorize()</code> / <code>Gate::allows()</code> checks. String-aware PHP brace parsing</td>
</tr>
<tr>
<td><code>roam orphan-routes [-n N] [--confidence C]</code></td>
<td>Detect backend routes with no frontend consumer: parses route definitions, searches frontend for API call references, reports controller methods with no route mapping</td>
</tr>
<tr>
<td><code>roam migration-safety [-n N] [--include-archive]</code></td>
<td>Detect non-idempotent migrations: missing <code>hasTable</code>/<code>hasColumn</code> guards, raw SQL without <code>IF NOT EXISTS</code>, index operations without existence checks</td>
</tr>
<tr>
<td><code>roam api-drift [--model M] [--confidence C]</code></td>
<td>Detect mismatches between PHP model <code>$fillable</code>/<code>$appends</code> fields and TypeScript interface properties. Auto-converts snake_case/camelCase for comparison. Single-repo; cross-repo planned for <code>roam ws api-drift</code></td>
</tr>
<tr>
<td><code>roam codeowners [--unowned] [--owner NAME]</code></td>
<td>CODEOWNERS coverage analysis: owned/unowned files, top owners, and ownership risk</td>
</tr>
<tr>
<td><code>roam drift [--threshold N]</code></td>
<td>Ownership drift detection: declared ownership vs observed maintenance activity</td>
</tr>
<tr>
<td><code>roam suggest-reviewers [REV_RANGE]</code></td>
<td>Reviewer recommendation via ownership, recency, breadth, and impact signals</td>
</tr>
<tr>
<td><code>roam simulate-departure &lt;developer&gt;</code></td>
<td>Knowledge-loss simulation: what breaks if a key contributor leaves</td>
</tr>
<tr>
<td><code>roam dev-profile [--developer NAME] [--since N]</code></td>
<td>Developer productivity profile: commit patterns, specialization, impact, and knowledge concentration per contributor</td>
</tr>
<tr>
<td><code>roam secrets [--fail-on-found] [--include-tests]</code></td>
<td>Secret scanning with masking, entropy detection, env-var suppression, remediation suggestions, and optional CI gate failure</td>
</tr>
<tr>
<td><code>roam vulns [--import-file F] [--reachable-only]</code></td>
<td>Vulnerability scanning: ingest npm/pip/trivy/osv reports, auto-detect format, reachability filtering, SARIF output</td>
</tr>
<tr>
<td><code>roam path-coverage [--from P] [--to P] [--max-depth N]</code></td>
<td>Find critical call paths (entry -&gt; sink) with zero test protection. Suggests optimal test insertion points</td>
</tr>
<tr>
<td><code>roam capsule [--redact-paths] [--no-signatures] [--output F]</code></td>
<td>Export sanitized structural graph (no code bodies) for external architectural review</td>
</tr>
<tr>
<td><code>roam rules [--init] [--ci] [--rules-dir D]</code></td>
<td>Plugin DSL for governance: user-defined path/symbol/AST rules via <code>.roam/rules/</code> YAML (<code>$METAVAR</code> captures supported)</td>
</tr>
<tr>
<td><code>roam check-rules [--severity S] [--fix]</code></td>
<td>Evaluate built-in and user-defined governance rules (10 built-in: no-circular-imports, max-fan-out, etc.)</td>
</tr>
<tr>
<td><code>roam vuln-map --generic|--npm-audit|--trivy F</code></td>
<td>Ingest vulnerability reports and match to codebase symbols</td>
</tr>
<tr>
<td><code>roam vuln-reach [--cve C] [--from E]</code></td>
<td>Vulnerability reachability: exact paths from entry points to vulnerable calls</td>
</tr>
<tr>
<td><code>roam supply-chain [--top N]</code></td>
<td>Dependency risk dashboard: pin coverage, risk scoring, supply-chain health</td>
</tr>
<tr>
<td><code>roam sbom [--format cyclonedx|spdx] [--no-reachability] [-o FILE]</code></td>
<td>SBOM generation (CycloneDX 1.5 / SPDX 2.3) enriched with call-graph reachability per dependency</td>
</tr>
<tr>
<td><code>roam congestion [--window N] [--min-authors N]</code></td>
<td>Developer congestion detection: concurrent authors per file, coordination risk scoring</td>
</tr>
<tr>
<td><code>roam invariants [--staged] [--range R]</code></td>
<td>Discover architectural contracts (invariants) from the codebase structure</td>
</tr>
</tbody></table>
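<p>The snake_case/camelCase normalization described for <code>roam api-drift</code> can be pictured with a small sketch. It assumes the backend field names and frontend property names have already been extracted; <code>snake_to_camel</code>, <code>field_drift</code>, and the sample fields are hypothetical and do not reflect how Roam parses PHP or TypeScript.</p>

```python
def snake_to_camel(name):
    """Convert snake_case to camelCase, e.g. created_at -> createdAt."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def field_drift(backend_fields, frontend_props):
    """Return (missing_in_frontend, missing_in_backend) after normalization."""
    backend = {snake_to_camel(field) for field in backend_fields}
    frontend = set(frontend_props)
    return sorted(backend - frontend), sorted(frontend - backend)


# Hypothetical model fields and interface properties, for illustration only.
fillable = ["first_name", "email", "created_at"]
interface_props = ["firstName", "email", "avatarUrl"]
print(field_drift(fillable, interface_props))  # (['createdAt'], ['avatarUrl'])
```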
<h3>Multi-Repo Workspace</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam ws init &lt;repo1&gt; &lt;repo2&gt; [--name NAME]</code></td>
<td>Initialize a workspace from sibling repos. Auto-detects frontend/backend roles</td>
</tr>
<tr>
<td><code>roam ws status</code></td>
<td>Show workspace repos, index ages, cross-repo edge count</td>
</tr>
<tr>
<td><code>roam ws resolve</code></td>
<td>Scan for REST API endpoints and match frontend calls to backend routes</td>
</tr>
<tr>
<td><code>roam ws understand</code></td>
<td>Unified workspace overview: per-repo stats + cross-repo connections</td>
</tr>
<tr>
<td><code>roam ws health</code></td>
<td>Workspace-wide health report with cross-repo coupling assessment</td>
</tr>
<tr>
<td><code>roam ws context &lt;symbol&gt;</code></td>
<td>Cross-repo augmented context: find a symbol across repos + show API callers</td>
</tr>
<tr>
<td><code>roam ws trace &lt;source&gt; &lt;target&gt;</code></td>
<td>Trace cross-repo paths via API edges</td>
</tr>
</tbody></table>
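<p>As a rough picture of what matching frontend calls to backend routes involves, the sketch below turns route templates with <code>{param}</code> placeholders into regular expressions and matches call URLs against them. The placeholder syntax, the helper names, and the sample data are assumptions for illustration; Roam's actual resolver may work differently.</p>

```python
import re


def route_to_regex(route):
    """Turn a route such as /users/{id} into a compiled regex."""
    parts = re.split(r"(\{[^}]+\})", route)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile(pattern + r"/?")  # tolerate one trailing slash


def match_calls(backend_routes, frontend_calls):
    """Map each frontend URL to the first backend route it matches, or None."""
    compiled = [(route, route_to_regex(route)) for route in backend_routes]
    result = {}
    for url in frontend_calls:
        path = url.split("?", 1)[0]  # ignore the query string
        result[url] = next(
            (route for route, rx in compiled if rx.fullmatch(path)), None
        )
    return result


# Hypothetical routes and calls, for illustration only.
routes = ["/api/users", "/api/users/{id}", "/api/orders/{id}/items"]
calls = ["/api/users/42", "/api/orders/7/items?page=2", "/api/reports"]
print(match_calls(routes, calls))
```

<p>Calls that map to <code>None</code> have no backend route in this toy model, and routes that no call maps to are candidates for the kind of finding <code>roam orphan-routes</code> reports.</p>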
<h3>Global Options</h3>
<table>
<thead>
<tr>
<th>Option</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam --json &lt;command&gt;</code></td>
<td>Structured JSON output with consistent envelope</td>
</tr>
<tr>
<td><code>roam --compact &lt;command&gt;</code></td>
<td>Token-efficient output: TSV tables, minimal JSON envelope</td>
</tr>
<tr>
<td><code>roam --sarif &lt;command&gt;</code></td>
<td>SARIF 2.1.0 output for dead, health, complexity, rules, secrets, and algo (GitHub/CI integration)</td>
</tr>
<tr>
<td><code>roam health --gate</code></td>
<td>CI quality gate. Reads <code>.roam-gates.yml</code> thresholds. Exit code 5 on failure</td>
</tr>
</tbody></table>
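<p>When an agent or script calls Roam programmatically, the <code>--json</code> flag is the natural entry point. A minimal wrapper might look like the sketch below. The helper names are hypothetical, and no envelope keys are assumed because the envelope schema is not described here; the function only returns the exit code and the parsed document.</p>

```python
import json
import subprocess


def roam_argv(command, *args):
    """Build the argument list for `roam --json COMMAND ARGS...`."""
    return ["roam", "--json", command, *args]


def run_roam(command, *args, runner=subprocess.run):
    """Run a roam command and return (exit_code, parsed_json_or_None)."""
    proc = runner(roam_argv(command, *args), capture_output=True, text=True)
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None  # output was not JSON, e.g. the command failed early
    return proc.returncode, payload


# Usage (requires roam on PATH and an indexed repository):
#     code, data = run_roam("context", "Flask")
```

<p>The <code>runner</code> parameter exists only so the wrapper can be exercised without Roam installed, for example in unit tests.</p>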
</details>

<h2>Walkthrough: Investigating a Codebase</h2>
<details>
<summary><strong>10-step walkthrough using Flask as an example</strong> (click to expand)</summary>

<p>Here&#39;s how you&#39;d use Roam to understand a project you&#39;ve never seen before, using Flask as an example:</p>
<p><strong>Step 1: Onboard and get the full picture</strong></p>
<pre><code>$ roam init
Created .roam/fitness.yaml (6 starter rules)
Created .github/workflows/roam.yml
Done. 226 files, 1132 symbols, 233 edges.
Health: 78/100

$ roam understand
Tech stack: Python (flask, jinja2, werkzeug)
Architecture: Monolithic — 3 layers, 5 clusters
Key abstractions: Flask, Blueprint, Request, Response
Health: 78/100 — 1 god component (Flask)
Entry points: src/flask/__init__.py, src/flask/cli.py
Conventions: snake_case functions, PascalCase classes, relative imports
Complexity: avg 4.2, 3 high (&gt;15), 0 critical (&gt;25)
</code></pre>
<p><strong>Step 2: Drill into a key file</strong></p>
<pre><code>$ roam file src/flask/app.py
src/flask/app.py  (python, 963 lines)

  cls  Flask(App)                                   :76-963
    meth  __init__(self, import_name, ...)           :152
    meth  route(self, rule, **options)               :411
    meth  register_blueprint(self, blueprint, ...)   :580
    meth  make_response(self, rv)                    :742
    ...12 more methods
</code></pre>
<p><strong>Step 3: Who depends on this?</strong></p>
<pre><code>$ roam deps src/flask/app.py
Imported by:
file                        symbols
--------------------------  -------
src/flask/__init__.py       3
src/flask/testing.py        2
tests/test_basic.py         1
...18 files total
</code></pre>
<p><strong>Step 4: Find the hotspots</strong></p>
<pre><code>$ roam weather
=== Hotspots (churn x complexity) ===
Score  Churn  Complexity  Path                    Lang
-----  -----  ----------  ----------------------  ------
18400  460    40.0        src/flask/app.py        python
12180  348    35.0        src/flask/blueprints.py python
</code></pre>
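<p>The ranking idea behind the hotspot table can be sketched in a few lines, assuming the score is the plain product of churn and complexity, as the header suggests and as the <code>blueprints.py</code> row (348 x 35.0 = 12180) is consistent with. The function names and the <code>src/utils.py</code> entry are made up for illustration.</p>

```python
def hotspot_score(churn, complexity):
    """Score a file as churn multiplied by complexity (assumed formula)."""
    return churn * complexity


def rank_hotspots(files):
    """Sort (path, churn, complexity) tuples by descending hotspot score."""
    scored = [(hotspot_score(churn, cx), path) for path, churn, cx in files]
    return sorted(scored, reverse=True)


# The blueprints.py figures come from the output above; utils.py is made up.
files = [
    ("src/utils.py", 120, 8.0),
    ("src/flask/blueprints.py", 348, 35.0),
]
print(rank_hotspots(files))
```

<p>Multiplying the two signals means a file only ranks high when it is both complex and frequently changed, which is why stable but complicated code drops down the list.</p>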
<p><strong>Step 5: Check architecture health</strong></p>
<pre><code>$ roam health
Health: 78/100
  Tangle: 0.0% (0/1132 symbols in cycles)
  1 god component (Flask, degree 47, actionable)
  0 bottlenecks, 0 layer violations

=== God Components (degree &gt; 20) ===
Sev      Name   Kind  Degree  Cat  File
-------  -----  ----  ------  ---  ------------------
WARNING  Flask  cls   47      act  src/flask/app.py
</code></pre>
<p><strong>Step 6: Get AI-ready context for a symbol</strong></p>
<pre><code>$ roam context Flask
Files to read:
  src/flask/app.py:76-963              # definition
  src/flask/__init__.py:1-15           # re-export
  src/flask/testing.py:22-45           # caller: FlaskClient.__init__
  tests/test_basic.py:12-30            # caller: test_app_factory
  ...12 more files

Callers: 47  Callees: 3
</code></pre>
<p><strong>Step 7: Pre-change safety check</strong></p>
<pre><code>$ roam preflight Flask
=== Preflight: Flask ===
Blast radius: 47 callers, 89 transitive
Affected tests: 31 (DIRECT: 12, TRANSITIVE: 19)
Complexity: cc=40 (critical), nesting=6
Coupling: 3 hidden co-change partners
Fitness: 1 violation (max-complexity exceeded)
Verdict: HIGH RISK — consider splitting before modifying
</code></pre>
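<p>The direct/transitive split in the "Affected tests" line can be pictured as a breadth-first walk over the reverse call graph. The sketch below is a toy model, not Roam's algorithm: it assumes "direct" means a test that calls the symbol itself and "transitive" means a test reached through at least one intermediate caller. The graph and names are hypothetical.</p>

```python
from collections import deque


def affected_tests(callers, target, is_test):
    """Walk the reverse call graph from target; split tests by distance.

    callers maps a symbol to the symbols that call it directly.
    """
    direct, transitive = set(), set()
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        symbol, depth = queue.popleft()
        for caller in callers.get(symbol, ()):
            if caller in seen:
                continue
            seen.add(caller)
            if is_test(caller):
                (direct if depth == 0 else transitive).add(caller)
            queue.append((caller, depth + 1))
    return sorted(direct), sorted(transitive)


# Hypothetical call graph, for illustration only.
callers = {
    "Flask": ["FlaskClient.__init__", "test_app_factory"],
    "FlaskClient.__init__": ["test_client_get"],
}
print(affected_tests(callers, "Flask", lambda name: name.startswith("test_")))
```

<p>Because the walk is breadth-first, each test is classified by its shortest path to the target, so a test that calls the symbol both directly and indirectly counts as direct.</p>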
<p><strong>Step 8: Decompose a large file</strong></p>
<pre><code>$ roam split src/flask/app.py
=== Split analysis: src/flask/app.py ===
  87 symbols, 42 internal edges, 95 external edges
  Cross-group coupling: 18%

  Group 1 (routing) — 12 symbols, isolation: 83% [extractable]
    meth  route              L411  PR=0.0088
    meth  add_url_rule       L450  PR=0.0045
    ...

=== Extraction Suggestions ===
  Extract &#39;routing&#39; group: route, add_url_rule, endpoint (+9 more)
    83% isolated, only 3 edges to other groups
</code></pre>
<p><strong>Step 9: Understand why a symbol matters</strong></p>
<pre><code>$ roam why Flask url_for Blueprint
Symbol     Role          Fan         Reach     Risk      Verdict
---------  ------------  ----------  --------  --------  --------------------------------------------------
Flask      Hub           fan-in:47   reach:89  CRITICAL  God symbol (47 in, 12 out). Consider splitting.
url_for    Core utility  fan-in:31   reach:45  HIGH      Widely used utility (31 callers). Stable interface.
Blueprint  Bridge        fan-in:18   reach:34  moderate  Coupling point between clusters.
</code></pre>
<p><strong>Step 10: Generate docs and set up CI</strong></p>
<pre><code>$ roam describe --write
Wrote CLAUDE.md (98 lines)  # auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.

$ roam health --gate
Health: 78/100 — PASS
</code></pre>
<p>Ten steps. Complete picture: structure, dependencies, hotspots, health, context, safety checks, decomposition, and CI gates.</p>
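<p>In a CI script, the gate from the last step can be consumed by checking the exit code. The sketch below relies on the documented exit code 5 for a failed gate and assumes, as is conventional, that 0 means the gate passed; any other code is treated as a tool or environment error. The helper names are hypothetical.</p>

```python
import subprocess

GATE_FAILED = 5  # exit code documented for a failed `roam health --gate`


def classify_exit(code):
    """Translate a `roam health --gate` exit code into a CI outcome."""
    if code == 0:
        return "pass"  # assumption: 0 signals that all thresholds were met
    if code == GATE_FAILED:
        return "gate-failed"
    return "error"


def run_gate(runner=subprocess.run):
    """Run the gate and return its classified outcome."""
    return classify_exit(runner(["roam", "health", "--gate"]).returncode)


# Usage in a CI script (requires roam on PATH):
#     import sys
#     sys.exit(0 if run_gate() == "pass" else 1)
```

<p>Separating "gate-failed" from "error" lets a pipeline distinguish a genuine quality regression from a broken installation or missing index.</p>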
</details>

<h2>Integration with AI Coding Tools</h2>
<p>Roam is designed to be called by coding agents via shell commands. Instead of repeatedly grepping and reading files, the agent runs one <code>roam</code> command and gets structured output.</p>
<p><strong>Decision order for agents:</strong></p>
<table>
<thead>
<tr>
<th>Situation</th>
<th>Command</th>
</tr>
</thead>
<tbody><tr>
<td>First time in a repo</td>
<td><code>roam understand</code> then <code>roam tour</code></td>
</tr>
<tr>
<td>Need to modify a symbol</td>
<td><code>roam preflight &lt;name&gt;</code> (blast radius + tests + fitness)</td>
</tr>
<tr>
<td>Debugging a failure</td>
<td><code>roam diagnose &lt;name&gt;</code> (root cause ranking)</td>
</tr>
<tr>
<td>Need files to read</td>
<td><code>roam context &lt;name&gt;</code> (files + line ranges)</td>
</tr>
<tr>
<td>Need to find a symbol</td>
<td><code>roam search &lt;pattern&gt;</code></td>
</tr>
<tr>
<td>Need file structure</td>
<td><code>roam file &lt;path&gt;</code></td>
</tr>
<tr>
<td>Pre-PR check</td>
<td><code>roam pr-risk HEAD~3..HEAD</code></td>
</tr>
<tr>
<td>What breaks if I change X?</td>
<td><code>roam impact &lt;symbol&gt;</code></td>
</tr>
<tr>
<td>Check for N+1 queries</td>
<td><code>roam n1</code> (implicit lazy-load detection)</td>
</tr>
<tr>
<td>Check auth coverage</td>
<td><code>roam auth-gaps</code> (routes + controllers)</td>
</tr>
<tr>
<td>Check migration safety</td>
<td><code>roam migration-safety</code> (idempotency guards)</td>
</tr>
</tbody></table>
<p><strong>Fastest setup:</strong></p>
<pre><code class="language-bash">roam describe --write               # auto-detects your agent&#39;s config file
roam describe --write -o AGENTS.md  # or specify an explicit path
roam describe --agent-prompt        # compact ~500-token prompt (append to any config)
roam minimap --update               # inject/refresh annotated codebase minimap in CLAUDE.md
</code></pre>
<p><strong>Agent not using Roam correctly?</strong> If your agent ignores Roam and falls back to grep/read exploration, its config file probably lacks the Roam instructions. Run:</p>
<pre><code class="language-bash">roam describe --write          # writes instructions to your agent&#39;s config (CLAUDE.md, AGENTS.md, etc.)
</code></pre>
<p>If you already have a config file and don&#39;t want to overwrite it:</p>
<pre><code class="language-bash">roam describe --agent-prompt   # prints a compact prompt — copy-paste into your existing config
roam minimap --update          # injects an annotated codebase snapshot into CLAUDE.md (won&#39;t touch other content)
</code></pre>
<p>This teaches the agent which Roam command to use for each situation (e.g., <code>roam preflight</code> before changes, <code>roam context</code> for files to read, <code>roam diagnose</code> for debugging).</p>
<details>
<summary><strong>Copy-paste agent instructions</strong></summary>

<pre><code class="language-markdown">## Codebase navigation

This project uses `roam` for codebase comprehension. Always prefer roam over Glob/Grep/Read exploration.

Before modifying any code:
1. First time in the repo: `roam understand` then `roam tour`
2. Find a symbol: `roam search &lt;pattern&gt;`
3. Before changing a symbol: `roam preflight &lt;name&gt;` (blast radius + tests + fitness)
4. Need files to read: `roam context &lt;name&gt;` (files + line ranges, prioritized)
5. Debugging a failure: `roam diagnose &lt;name&gt;` (root cause ranking)
6. After making changes: `roam diff` (blast radius of uncommitted changes)

Additional: `roam health` (0-100 score), `roam impact &lt;name&gt;` (what breaks),
`roam pr-risk` (PR risk), `roam file &lt;path&gt;` (file skeleton).

Run `roam --help` for all commands. Use `roam --json &lt;cmd&gt;` for structured output.
</code></pre>
</details>

<details>
<summary><strong>Where to put this for each tool</strong></summary>

<table>
<thead>
<tr>
<th>Tool</th>
<th>Config file</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Claude Code</strong></td>
<td><code>CLAUDE.md</code> in your project root</td>
</tr>
<tr>
<td><strong>OpenAI Codex CLI</strong></td>
<td><code>AGENTS.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Gemini CLI</strong></td>
<td><code>GEMINI.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Cursor</strong></td>
<td><code>.cursor/rules/roam.mdc</code> (add <code>alwaysApply: true</code> frontmatter)</td>
</tr>
<tr>
<td><strong>Windsurf</strong></td>
<td><code>.windsurf/rules/roam.md</code> (add <code>trigger: always_on</code> frontmatter)</td>
</tr>
<tr>
<td><strong>GitHub Copilot</strong></td>
<td><code>.github/copilot-instructions.md</code></td>
</tr>
<tr>
<td><strong>Aider</strong></td>
<td><code>CONVENTIONS.md</code></td>
</tr>
<tr>
<td><strong>Continue.dev</strong></td>
<td><code>config.yaml</code> rules</td>
</tr>
<tr>
<td><strong>Cline</strong></td>
<td><code>.clinerules/</code> directory</td>
</tr>
</tbody></table>
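<p>For the tools that need frontmatter, the file can be produced by hand or with a short script. The sketch below prepends the frontmatter keys named in the table above to the instruction text and writes the file. The helper names are hypothetical, and only the two frontmatter lines listed in the table are emitted; other keys your editor supports are left out.</p>

```python
from pathlib import Path

# Frontmatter lines are the ones named in the table above.
FRONTMATTER = {
    ".cursor/rules/roam.mdc": "alwaysApply: true",
    ".windsurf/rules/roam.md": "trigger: always_on",
}


def render_rule(target, instructions):
    """Return file content: optional YAML frontmatter plus the instructions."""
    header = FRONTMATTER.get(target)
    if header is None:
        return instructions  # plain files such as CLAUDE.md need no header
    return f"---\n{header}\n---\n\n{instructions}"


def write_rule(root, target, instructions):
    """Write the rule file under root, creating parent directories."""
    path = Path(root) / target
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_rule(target, instructions), encoding="utf-8")
    return path


# Usage: write_rule(".", ".cursor/rules/roam.mdc", "## Codebase navigation ...")
```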
</details>

<details>
<summary><strong>Roam vs native tools</strong></summary>

<table>
<thead>
<tr>
<th>Task</th>
<th>Use Roam</th>
<th>Use native tools</th>
</tr>
</thead>
<tbody><tr>
<td>&quot;What calls this function?&quot;</td>
<td><code>roam symbol &lt;name&gt;</code></td>
<td>LSP / Grep</td>
</tr>
<tr>
<td>&quot;What files do I need to read?&quot;</td>
<td><code>roam context &lt;name&gt;</code></td>
<td>Manual tracing (5+ calls)</td>
</tr>
<tr>
<td>&quot;Is it safe to change X?&quot;</td>
<td><code>roam preflight &lt;name&gt;</code></td>
<td>Multiple manual checks</td>
</tr>
<tr>
<td>&quot;Show me this file&#39;s structure&quot;</td>
<td><code>roam file &lt;path&gt;</code></td>
<td>Read the file directly</td>
</tr>
<tr>
<td>&quot;Understand project architecture&quot;</td>
<td><code>roam understand</code></td>
<td>Manual exploration</td>
</tr>
<tr>
<td>&quot;What breaks if I change X?&quot;</td>
<td><code>roam impact &lt;symbol&gt;</code></td>
<td>No direct equivalent</td>
</tr>
<tr>
<td>&quot;What tests to run?&quot;</td>
<td><code>roam affected-tests &lt;name&gt;</code></td>
<td>Grep for imports (misses indirect)</td>
</tr>
<tr>
<td>&quot;What&#39;s causing this bug?&quot;</td>
<td><code>roam diagnose &lt;name&gt;</code></td>
<td>Manual call-chain tracing</td>
</tr>
<tr>
<td>&quot;Codebase health score for CI&quot;</td>
<td><code>roam health --gate</code></td>
<td>No equivalent</td>
</tr>
</tbody></table>
</details>

<h2>MCP Server</h2>
<p>Roam includes a <a href="https://modelcontextprotocol.io/">Model Context Protocol</a> server for direct integration with tools that support MCP.</p>
<pre><code class="language-bash">pip install &quot;roam-code[mcp]&quot;
roam mcp
</code></pre>
<p>102 tools, 10 resources, and 5 prompts are available in the full preset. Most tools are read-only index queries; side-effect tools are explicitly annotated.</p>
<p><strong>MCP v2 highlights (v11):</strong></p>
<ul>
<li>In-process MCP execution (no subprocess shell-out per call)</li>
<li>Preset-based tool surfacing (<code>core</code>, <code>review</code>, <code>refactor</code>, <code>debug</code>, <code>architecture</code>, <code>full</code>)</li>
<li>Compound tools that collapse multi-step exploration/review flows into one call</li>
<li>Structured output schemas + tool annotations for safer planner behavior</li>
</ul>
<p><strong>Default preset:</strong> <code>core</code> (24 tools: 23 core + <code>roam_expand_toolset</code> meta-tool).</p>
<pre><code class="language-bash"># Default
roam mcp

# Full toolset
ROAM_MCP_PRESET=full roam mcp

# Legacy compatibility (same as full preset)
ROAM_MCP_LITE=0 roam mcp
</code></pre>
<p>Core preset tools: <code>roam_affected_tests</code>, <code>roam_batch_get</code>, <code>roam_batch_search</code>, <code>roam_complexity_report</code>, <code>roam_context</code>, <code>roam_dead_code</code>, <code>roam_deps</code>, <code>roam_diagnose</code>, <code>roam_diagnose_issue</code>, <code>roam_diff</code>, <code>roam_expand_toolset</code>, <code>roam_explore</code>, <code>roam_file_info</code>, <code>roam_health</code>, <code>roam_impact</code>, <code>roam_pr_risk</code>, <code>roam_preflight</code>, <code>roam_prepare_change</code>, <code>roam_review_change</code>, <code>roam_search_symbol</code>, <code>roam_syntax_check</code>, <code>roam_trace</code>, <code>roam_understand</code>, <code>roam_uses</code>.</p>
<details>
<summary><strong>MCP tool list (all 102)</strong></summary>

<table>
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam_understand</code></td>
<td>Full codebase briefing</td>
</tr>
<tr>
<td><code>roam_health</code></td>
<td>Health score (0-100) + issues</td>
</tr>
<tr>
<td><code>roam_preflight</code></td>
<td>Pre-change safety check</td>
</tr>
<tr>
<td><code>roam_search_symbol</code></td>
<td>Find symbols by name</td>
</tr>
<tr>
<td><code>roam_context</code></td>
<td>Files-to-read for modifying a symbol</td>
</tr>
<tr>
<td><code>roam_trace</code></td>
<td>Dependency path between two symbols</td>
</tr>
<tr>
<td><code>roam_impact</code></td>
<td>Blast radius of changing a symbol</td>
</tr>
<tr>
<td><code>roam_file_info</code></td>
<td>File skeleton with all definitions</td>
</tr>
<tr>
<td><code>roam_pr_risk</code></td>
<td>Risk score for pending changes</td>
</tr>
<tr>
<td><code>roam_breaking_changes</code></td>
<td>Detect breaking changes between refs</td>
</tr>
<tr>
<td><code>roam_affected_tests</code></td>
<td>Find tests affected by a change</td>
</tr>
<tr>
<td><code>roam_dead_code</code></td>
<td>List unreferenced exports</td>
</tr>
<tr>
<td><code>roam_complexity_report</code></td>
<td>Per-symbol cognitive complexity</td>
</tr>
<tr>
<td><code>roam_repo_map</code></td>
<td>Project skeleton with key symbols</td>
</tr>
<tr>
<td><code>roam_tour</code></td>
<td>Auto-generated onboarding guide</td>
</tr>
<tr>
<td><code>roam_diagnose</code></td>
<td>Root cause analysis for debugging</td>
</tr>
<tr>
<td><code>roam_visualize</code></td>
<td>Generate Mermaid or DOT architecture diagrams</td>
</tr>
<tr>
<td><code>roam_algo</code></td>
<td>Algorithm anti-pattern detection with language-aware tips</td>
</tr>
<tr>
<td><code>roam_ws_understand</code></td>
<td>Unified multi-repo workspace overview</td>
</tr>
<tr>
<td><code>roam_ws_context</code></td>
<td>Cross-repo augmented symbol context</td>
</tr>
<tr>
<td><code>roam_pr_diff</code></td>
<td>Structural PR diff: metric deltas, edge analysis, symbol changes</td>
</tr>
<tr>
<td><code>roam_budget_check</code></td>
<td>Check changes against architectural budgets</td>
</tr>
<tr>
<td><code>roam_effects</code></td>
<td>Side-effect classification (DB writes, network, filesystem)</td>
</tr>
<tr>
<td><code>roam_attest</code></td>
<td>Proof-carrying PR attestation with all evidence bundled</td>
</tr>
<tr>
<td><code>roam_capsule_export</code></td>
<td>Export sanitized structural graph (no code bodies)</td>
</tr>
<tr>
<td><code>roam_path_coverage</code></td>
<td>Find critical untested call paths (entry -&gt; sink)</td>
</tr>
<tr>
<td><code>roam_forecast</code></td>
<td>Predict when metrics will exceed thresholds</td>
</tr>
<tr>
<td><code>roam_simulate</code></td>
<td>Counterfactual architecture simulator</td>
</tr>
<tr>
<td><code>roam_orchestrate</code></td>
<td>Multi-agent swarm partitioning</td>
</tr>
<tr>
<td><code>roam_fingerprint</code></td>
<td>Topology fingerprint comparison</td>
</tr>
<tr>
<td><code>roam_mutate</code></td>
<td>Graph-level code editing (move/rename/extract)</td>
</tr>
<tr>
<td><code>roam_dark_matter</code></td>
<td>Hidden co-change coupling detection</td>
</tr>
<tr>
<td><code>roam_closure</code></td>
<td>Minimal-change synthesis for rename/delete</td>
</tr>
<tr>
<td><code>roam_adversarial_review</code></td>
<td>Adversarial architecture review</td>
</tr>
<tr>
<td><code>roam_generate_plan</code></td>
<td>Agent work planner</td>
</tr>
<tr>
<td><code>roam_get_invariants</code></td>
<td>Architectural invariant discovery</td>
</tr>
<tr>
<td><code>roam_bisect_blame</code></td>
<td>Architectural git bisect</td>
</tr>
<tr>
<td><code>roam_doc_intent</code></td>
<td>Doc-to-code linking</td>
</tr>
<tr>
<td><code>roam_cut_analysis</code></td>
<td>Minimum graph cut analysis</td>
</tr>
<tr>
<td><code>roam_clones</code></td>
<td>AST structural clone detection (Type-2 clones)</td>
</tr>
<tr>
<td><code>roam_annotate_symbol</code></td>
<td>Attach persistent notes to symbols</td>
</tr>
<tr>
<td><code>roam_get_annotations</code></td>
<td>View stored annotations</td>
</tr>
<tr>
<td><code>roam_relate</code></td>
<td>Show relationship between two symbols</td>
</tr>
<tr>
<td><code>roam_search_semantic</code></td>
<td>Semantic search by meaning</td>
</tr>
<tr>
<td><code>roam_rules_check</code></td>
<td>Plugin DSL governance rules</td>
</tr>
<tr>
<td><code>roam_check_rules</code></td>
<td>Built-in + user-defined governance rule evaluation with autofix templates</td>
</tr>
<tr>
<td><code>roam_supply_chain</code></td>
<td>Dependency risk dashboard: pin coverage and supply-chain health</td>
</tr>
<tr>
<td><code>roam_spectral</code></td>
<td>Spectral bisection: Fiedler vector partition tree and modularity gap</td>
</tr>
<tr>
<td><code>roam_vuln_map</code></td>
<td>Vulnerability report ingestion</td>
</tr>
<tr>
<td><code>roam_vuln_reach</code></td>
<td>Vulnerability reachability paths</td>
</tr>
<tr>
<td><code>roam_ingest_trace</code></td>
<td>Ingest runtime trace data</td>
</tr>
<tr>
<td><code>roam_runtime_hotspots</code></td>
<td>Runtime hotspot analysis</td>
</tr>
<tr>
<td><code>roam_diff</code></td>
<td>Blast radius of uncommitted/committed changes</td>
</tr>
<tr>
<td><code>roam_symbol</code></td>
<td>Symbol definition, callers, callees, metrics</td>
</tr>
<tr>
<td><code>roam_deps</code></td>
<td>File-level import/imported-by relationships</td>
</tr>
<tr>
<td><code>roam_uses</code></td>
<td>All consumers of a symbol by edge type</td>
</tr>
<tr>
<td><code>roam_weather</code></td>
<td>Code hotspots: churn x complexity ranking</td>
</tr>
<tr>
<td><code>roam_debt</code></td>
<td>Hotspot-weighted technical debt prioritization with optional ROI estimate</td>
</tr>
<tr>
<td><code>roam_docs_coverage</code></td>
<td>Doc coverage and stale-doc drift with PageRank-ranked missing docs</td>
</tr>
<tr>
<td><code>roam_suggest_refactoring</code></td>
<td>Rank proactive refactoring candidates using complexity, coupling, churn, smells, and coverage gaps</td>
</tr>
<tr>
<td><code>roam_plan_refactor</code></td>
<td>Build an ordered refactor plan for one symbol with risk/test/simulation context</td>
</tr>
<tr>
<td><code>roam_n1</code></td>
<td>Detect N+1 I/O patterns in ORM code</td>
</tr>
<tr>
<td><code>roam_auth_gaps</code></td>
<td>Find endpoints missing auth</td>
</tr>
<tr>
<td><code>roam_over_fetch</code></td>
<td>Detect models serializing too many fields</td>
</tr>
<tr>
<td><code>roam_missing_index</code></td>
<td>Find queries on non-indexed columns</td>
</tr>
<tr>
<td><code>roam_orphan_routes</code></td>
<td>Detect dead backend routes</td>
</tr>
<tr>
<td><code>roam_migration_safety</code></td>
<td>Detect non-idempotent migrations</td>
</tr>
<tr>
<td><code>roam_api_drift</code></td>
<td>Backend/frontend model mismatch detection</td>
</tr>
<tr>
<td><code>roam_expand_toolset</code></td>
<td>Discover presets, active toolset, and switch instructions</td>
</tr>
<tr>
<td><code>roam_explore</code></td>
<td>Compound first-contact exploration bundle for fast repo orientation</td>
</tr>
<tr>
<td><code>roam_prepare_change</code></td>
<td>Compound pre-change bundle: context, blast radius, risk, and tests</td>
</tr>
<tr>
<td><code>roam_review_change</code></td>
<td>Compound review bundle for changed code and architecture checks</td>
</tr>
<tr>
<td><code>roam_diagnose_issue</code></td>
<td>Compound debugging bundle with ranked suspects and dependency context</td>
</tr>
<tr>
<td><code>roam_onboard</code></td>
<td>Structured onboarding brief for new contributors/agents</td>
</tr>
<tr>
<td><code>roam_syntax_check</code></td>
<td>Tree-sitter syntax integrity validation for changed paths</td>
</tr>
<tr>
<td><code>roam_agent_export</code></td>
<td>Generate multi-agent instruction bundles (<code>AGENTS.md</code> + overlays)</td>
</tr>
<tr>
<td><code>roam_vibe_check</code></td>
<td>AI-rot auditor with 8-pattern taxonomy and composite score</td>
</tr>
<tr>
<td><code>roam_ai_readiness</code></td>
<td>AI-agent effectiveness readiness scoring and recommendations</td>
</tr>
<tr>
<td><code>roam_dashboard</code></td>
<td>Unified status snapshot across health, risk, churn, and quality</td>
</tr>
<tr>
<td><code>roam_codeowners</code></td>
<td>CODEOWNERS coverage analysis and unowned file discovery</td>
</tr>
<tr>
<td><code>roam_drift</code></td>
<td>Ownership drift detection from declared vs observed ownership</td>
</tr>
<tr>
<td><code>roam_suggest_reviewers</code></td>
<td>Reviewer recommendations with multi-signal scoring</td>
</tr>
<tr>
<td><code>roam_simulate_departure</code></td>
<td>Knowledge-loss simulation for contributor departure scenarios</td>
</tr>
<tr>
<td><code>roam_verify</code></td>
<td>Pre-commit consistency verification and policy checks</td>
</tr>
<tr>
<td><code>roam_api_changes</code></td>
<td>API signature change classification and severity labeling</td>
</tr>
<tr>
<td><code>roam_test_gaps</code></td>
<td>Changed-symbol test gap analysis</td>
</tr>
<tr>
<td><code>roam_ai_ratio</code></td>
<td>Estimated AI-generated code ratio from repository signals</td>
</tr>
<tr>
<td><code>roam_duplicates</code></td>
<td>Semantic duplicate detection across structurally similar functions</td>
</tr>
<tr>
<td><code>roam_partition</code></td>
<td>Multi-agent partition manifest with conflict and complexity scores</td>
</tr>
<tr>
<td><code>roam_affected</code></td>
<td>Monorepo/package affected-set analysis for diffs</td>
</tr>
<tr>
<td><code>roam_semantic_diff</code></td>
<td>Structural diff of symbol/edge changes</td>
</tr>
<tr>
<td><code>roam_trends</code></td>
<td>Historical metric trend retrieval with sparkline output</td>
</tr>
<tr>
<td><code>roam_secrets</code></td>
<td>Secret scanning with masking and CI-friendly fail behavior</td>
</tr>
<tr>
<td><code>roam_endpoints</code></td>
<td>Enumerate HTTP/API endpoint definitions across the codebase</td>
</tr>
<tr>
<td><code>roam_doctor</code></td>
<td>Diagnose installation and environment health</td>
</tr>
<tr>
<td><code>roam_init</code></td>
<td>Initialize roam workspace state and build the first index</td>
</tr>
<tr>
<td><code>roam_reindex</code></td>
<td>Refresh or force-rebuild the index with task-mode support</td>
</tr>
<tr>
<td><code>roam_reset</code></td>
<td>Reset the roam index and cached data</td>
</tr>
<tr>
<td><code>roam_clean</code></td>
<td>Remove stale or orphaned index entries</td>
</tr>
<tr>
<td><code>roam_batch_search</code></td>
<td>Batch symbol search: run multiple pattern queries in a single call</td>
</tr>
<tr>
<td><code>roam_batch_get</code></td>
<td>Batch context retrieval: fetch multiple symbols/files in a single call</td>
</tr>
<tr>
<td><code>roam_dev_profile</code></td>
<td>Developer productivity profile: commit patterns, specialization, and impact</td>
</tr>
</tbody></table>
<p><strong>Resources include:</strong> <code>roam://health</code> (current health score), <code>roam://summary</code> (project overview)</p>
</details>

<details>
<summary><strong>Claude Code</strong></summary>

<pre><code class="language-bash">claude mcp add roam-code -- roam mcp
</code></pre>
<p>Or add to <code>.mcp.json</code> in your project root:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>Claude Desktop</strong></summary>

<p>Add to your <code>claude_desktop_config.json</code>:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;],
      &quot;cwd&quot;: &quot;/path/to/your/project&quot;
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>Cursor</strong></summary>

<p>Add to <code>.cursor/mcp.json</code>:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>VS Code + Copilot</strong></summary>

<p>Add to <code>.vscode/mcp.json</code>:</p>
<pre><code class="language-json">{
  &quot;servers&quot;: {
    &quot;roam-code&quot;: {
      &quot;type&quot;: &quot;stdio&quot;,
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<h2>CI/CD Integration</h2>
<p>All you need is Python 3.9+ and <code>pip install roam-code</code>.</p>
<h3>GitHub Actions</h3>
<pre><code class="language-yaml"># .github/workflows/roam.yml
name: Roam Analysis
on: [pull_request]

jobs:
  roam:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: Cranot/roam-code@main
        with:
          commands: health
          gate: &quot;score&gt;=70&quot;
          sarif: true
          comment: true
</code></pre>
<p>Use <code>roam init</code> to auto-generate this workflow.</p>
<table>
<thead>
<tr>
<th>Input</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>commands</code></td>
<td><code>health</code></td>
<td>Space-separated roam commands to run</td>
</tr>
<tr>
<td><code>gate</code></td>
<td>(empty)</td>
<td>Quality gate expression (e.g., <code>score&gt;=70</code>). Exit 5 on failure</td>
</tr>
<tr>
<td><code>sarif</code></td>
<td><code>false</code></td>
<td>Upload SARIF results to GitHub Code Scanning</td>
</tr>
<tr>
<td><code>comment</code></td>
<td><code>true</code></td>
<td>Post sticky PR comment with results</td>
</tr>
<tr>
<td><code>python-version</code></td>
<td><code>3.11</code></td>
<td>Python version</td>
</tr>
<tr>
<td><code>version</code></td>
<td><code>latest</code></td>
<td>Pin to a specific roam-code version</td>
</tr>
<tr>
<td><code>cache</code></td>
<td><code>true</code></td>
<td>Cache the SQLite index between runs</td>
</tr>
<tr>
<td><code>changed-only</code></td>
<td><code>false</code></td>
<td>Incremental mode: adapt commands to changed files</td>
</tr>
</tbody></table>
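<p>The <code>gate</code> input is a comparison such as <code>score&gt;=70</code>. The sketch below shows one way an expression of that shape could be evaluated against a flat metrics dictionary. The names <code>evaluate_gate</code>, <code>gate_exit_code</code>, and <code>GATE_EXIT_CODE</code> are hypothetical and not part of roam-code; only the exit code 5 comes from the table above.</p>

```python
import operator
import re

# Hypothetical helper, not roam-code's implementation.
# Supports expressions of the form "<metric><op><number>".
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}
_GATE_RE = re.compile(
    r"^\s*([A-Za-z_][\w.-]*)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$"
)

GATE_EXIT_CODE = 5  # documented exit code for a failed gate


def evaluate_gate(expression: str, metrics: dict) -> bool:
    """Return True when the named metric satisfies the comparison."""
    match = _GATE_RE.match(expression)
    if match is None:
        raise ValueError(f"unsupported gate expression: {expression!r}")
    name, op, threshold = match.groups()
    if name not in metrics:
        raise KeyError(f"unknown metric: {name}")
    return _OPS[op](float(metrics[name]), float(threshold))


def gate_exit_code(expression: str, metrics: dict) -> int:
    """Map a gate result to a process exit code (0 = pass, 5 = fail)."""
    return 0 if evaluate_gate(expression, metrics) else GATE_EXIT_CODE
```

<p>In this sketch a CI wrapper would call <code>sys.exit(gate_exit_code(&quot;score&gt;=70&quot;, metrics))</code> so that a failed gate fails the job.</p>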
<details>
<summary><strong>GitLab CI</strong></summary>

<pre><code class="language-yaml">roam-analysis:
  stage: test
  image: python:3.12-slim
  before_script:
    - pip install roam-code
  script:
    - roam index
    - roam health --gate
    - roam --json pr-risk origin/main..HEAD &gt; roam-report.json
  artifacts:
    paths:
      - roam-report.json
  rules:
    - if: $CI_MERGE_REQUEST_IID
</code></pre>
</details>

<details>
<summary><strong>Azure DevOps / any CI</strong></summary>

<p>Universal pattern:</p>
<pre><code class="language-bash">pip install roam-code
roam index
roam health --gate               # exit 5 on failure (reads .roam-gates.yml)
roam --json health &gt; report.json
</code></pre>
</details>

<h2>SARIF Output</h2>
<p>Roam exports analysis results in <a href="https://sarifweb.azurewebsites.net/">SARIF 2.1.0</a> format for GitHub Code Scanning.</p>
<pre><code class="language-python">from roam.output.sarif import health_to_sarif, write_sarif

sarif = health_to_sarif(health_data)
write_sarif(sarif, &quot;roam-health.sarif&quot;)
</code></pre>
<pre><code class="language-yaml">- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: roam-health.sarif
</code></pre>
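<p>For orientation, this is roughly the minimal SARIF 2.1.0 structure that a code-scanning upload expects. It is an illustrative sketch: <code>findings_to_sarif</code> and the finding keys (<code>rule_id</code>, <code>message</code>, <code>file</code>, <code>line</code>) are hypothetical and do not describe what <code>health_to_sarif</code> produces internally.</p>

```python
import json


def findings_to_sarif(findings, tool_name="example-tool"):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding is assumed (hypothetically) to carry:
    rule_id, message, file, line, and an optional level.
    """
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = [
        {
            "ruleId": f["rule_id"],
            # SARIF result levels: none, note, warning, error
            "level": f.get("level", "warning"),
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["file"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": tool_name,
                        "rules": [{"id": rule_id} for rule_id in rule_ids],
                    }
                },
                "results": results,
            }
        ],
    }


example = findings_to_sarif(
    [
        {
            "rule_id": "dependency-cycle",
            "message": "Symbol is part of a dependency cycle",
            "file": "src/app.py",
            "line": 12,
        }
    ]
)
print(json.dumps(example, indent=2))
```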
<h2>For Teams</h2>
<p>Zero infrastructure, zero vendor lock-in, zero data leaving your network.</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Annual cost (20-dev team)</th>
<th>Infrastructure</th>
<th>Setup time</th>
</tr>
</thead>
<tbody><tr>
<td>SonarQube Server</td>
<td>$15,000-$45,000</td>
<td>Self-hosted server</td>
<td>Days</td>
</tr>
<tr>
<td>CodeScene</td>
<td>$20,000-$60,000</td>
<td>SaaS or on-prem</td>
<td>Hours</td>
</tr>
<tr>
<td>Code Climate</td>
<td>$12,000-$36,000</td>
<td>SaaS</td>
<td>Hours</td>
</tr>
<tr>
<td><strong>Roam</strong></td>
<td><strong>$0 (MIT license)</strong></td>
<td><strong>None (local)</strong></td>
<td><strong>5 minutes</strong></td>
</tr>
</tbody></table>
<details>
<summary><strong>Team rollout guide</strong></summary>

<p><strong>Week 1-2 (pilot):</strong> 1-2 developers run <code>roam init</code> on one repo. Use <code>roam preflight</code> before changes, <code>roam pr-risk</code> before PRs.</p>
<p><strong>Week 3-4 (expand):</strong> Add <code>roam health --gate</code> to CI as a non-blocking check (configure thresholds in <code>.roam-gates.yml</code>).</p>
<p><strong>Month 2+ (standardize):</strong> Tighten gate thresholds. Expand to additional repos. Track trajectory with <code>roam trends</code>.</p>
</details>

<details>
<summary><strong>Complements your existing stack</strong></summary>

<table>
<thead>
<tr>
<th>If you use...</th>
<th>Roam adds...</th>
</tr>
</thead>
<tbody><tr>
<td><strong>SonarQube</strong></td>
<td>Architecture-level analysis: dependency cycles, god components, blast radius, health scoring</td>
</tr>
<tr>
<td><strong>CodeScene</strong></td>
<td>Free, local alternative for health scoring and hotspot analysis</td>
</tr>
<tr>
<td><strong>ESLint / Pylint</strong></td>
<td>Cross-language architecture checks. Linters enforce style per file; Roam enforces architecture across the codebase</td>
</tr>
<tr>
<td><strong>LSP</strong></td>
<td>AI-agent-optimized queries. <code>roam context</code> answers &quot;what calls this?&quot; with PageRank-ranked results in one call</td>
</tr>
</tbody></table>
</details>

<h2>Language Support</h2>
<h3>Tier 1 -- Full extraction (dedicated parsers)</h3>
<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Python</td>
<td><code>.py</code> <code>.pyi</code></td>
<td>classes, functions, methods, decorators, variables</td>
<td>imports, calls, inheritance</td>
<td>extends, <code>__all__</code> exports</td>
</tr>
<tr>
<td>JavaScript</td>
<td><code>.js</code> <code>.jsx</code> <code>.mjs</code> <code>.cjs</code></td>
<td>classes, functions, arrow functions, CJS exports</td>
<td>imports, require(), calls</td>
<td>extends</td>
</tr>
<tr>
<td>TypeScript</td>
<td><code>.ts</code> <code>.tsx</code> <code>.mts</code> <code>.cts</code></td>
<td>interfaces, type aliases, enums + all JS</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Java</td>
<td><code>.java</code></td>
<td>classes, interfaces, enums, constructors, fields</td>
<td>imports, calls</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Go</td>
<td><code>.go</code></td>
<td>structs, interfaces, functions, methods, fields</td>
<td>imports, calls</td>
<td>embedded structs</td>
</tr>
<tr>
<td>Rust</td>
<td><code>.rs</code></td>
<td>structs, traits, impls, enums, functions</td>
<td>use, calls</td>
<td>impl Trait for Struct</td>
</tr>
<tr>
<td>C / C++</td>
<td><code>.c</code> <code>.h</code> <code>.cpp</code> <code>.hpp</code> <code>.cc</code></td>
<td>structs, classes, functions, namespaces, templates</td>
<td>includes, calls</td>
<td>extends</td>
</tr>
<tr>
<td>C#</td>
<td><code>.cs</code></td>
<td>classes, interfaces, structs, enums, records, methods, constructors, properties, delegates, events, fields</td>
<td>using directives, calls, <code>new</code>, attributes</td>
<td>extends, implements</td>
</tr>
<tr>
<td>PHP</td>
<td><code>.php</code></td>
<td>classes, interfaces, traits, enums, methods, properties</td>
<td>namespace use, calls, static calls, <code>new</code></td>
<td>extends, implements, use (traits)</td>
</tr>
<tr>
<td>Visual FoxPro</td>
<td><code>.prg</code></td>
<td>functions, procedures, classes, methods, properties, constants</td>
<td>DO, SET PROCEDURE/CLASSLIB, CREATEOBJECT, <code>=func()</code>, <code>obj.method()</code></td>
<td>DEFINE CLASS ... AS</td>
</tr>
<tr>
<td>YAML (CI/CD)</td>
<td><code>.yml</code> <code>.yaml</code></td>
<td>GitLab CI: jobs, template anchors, stages. GitHub Actions: workflow name, jobs, reusable workflows. Generic: top-level keys</td>
<td><code>extends:</code>, <code>needs:</code>, <code>!reference</code>, <code>uses:</code></td>
<td>—</td>
</tr>
<tr>
<td>HCL / Terraform</td>
<td><code>.tf</code> <code>.tfvars</code> <code>.hcl</code></td>
<td><code>resource</code>, <code>data</code>, <code>variable</code>, <code>output</code>, <code>module</code>, <code>provider</code>, <code>locals</code> entries</td>
<td><code>var.*</code>, <code>module.*</code>, <code>data.*</code>, <code>local.*</code>, resource cross-refs</td>
<td>—</td>
</tr>
<tr>
<td>Vue</td>
<td><code>.vue</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Svelte</td>
<td><code>.svelte</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
</tbody></table>
<details>
<summary><strong>Salesforce ecosystem (Tier 1)</strong></summary>

<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
</tr>
</thead>
<tbody><tr>
<td>Apex</td>
<td><code>.cls</code> <code>.trigger</code></td>
<td>classes, triggers, SOQL, annotations</td>
<td>imports, calls, System.Label, generic type refs</td>
</tr>
<tr>
<td>Aura</td>
<td><code>.cmp</code> <code>.app</code> <code>.evt</code> <code>.intf</code> <code>.design</code></td>
<td>components, attributes, methods, events</td>
<td>controller refs, component refs</td>
</tr>
<tr>
<td>LWC (JavaScript)</td>
<td><code>.js</code> (in LWC dirs)</td>
<td>anonymous class from filename</td>
<td><code>@salesforce/apex/</code>, <code>@salesforce/schema/</code>, <code>@salesforce/label/</code></td>
</tr>
<tr>
<td>Visualforce</td>
<td><code>.page</code> <code>.component</code></td>
<td>pages, components</td>
<td>controller/extensions, merge fields, includes</td>
</tr>
<tr>
<td>SF Metadata XML</td>
<td><code>*-meta.xml</code></td>
<td>objects, fields, rules, layouts</td>
<td>Apex class refs, formula field refs, Flow actionCalls</td>
</tr>
</tbody></table>
<p>Cross-language edges mean <code>roam impact AccountService</code> shows blast radius across Apex, LWC, Aura, Visualforce, and Flows.</p>
</details>

<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Ruby</td>
<td><code>.rb</code></td>
<td>classes, modules, methods, singleton methods, constants</td>
<td>require, require_relative, include/extend, calls, ClassName.new</td>
<td>class inheritance</td>
</tr>
<tr>
<td>Kotlin</td>
<td><code>.kt</code> <code>.kts</code></td>
<td>classes, interfaces, enums, objects, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Scala</td>
<td><code>.scala</code> <code>.sc</code></td>
<td>classes, traits, objects, case classes, functions, methods, val/var, type aliases</td>
<td>imports, calls, <code>new</code></td>
<td>extends, with (trait mixins)</td>
</tr>
<tr>
<td>SQL (DDL)</td>
<td><code>.sql</code></td>
<td>tables, columns, views, functions, triggers, schemas, types (enums), sequences</td>
<td>foreign keys, view table deps, trigger table/function refs</td>
<td>--</td>
</tr>
<tr>
<td>Swift</td>
<td><code>.swift</code></td>
<td>classes, structs, enums, protocols, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, conforms</td>
</tr>
<tr>
<td>JSONC</td>
<td><code>.jsonc</code></td>
<td>via JSON grammar</td>
<td>--</td>
<td>--</td>
</tr>
<tr>
<td>MDX</td>
<td><code>.mdx</code></td>
<td>via Markdown grammar</td>
<td>--</td>
<td>--</td>
</tr>
</tbody></table>
<h2>Performance</h2>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Index 200 files</td>
<td>~3-5s</td>
</tr>
<tr>
<td>Index 3,000 files</td>
<td>~2 min</td>
</tr>
<tr>
<td>Incremental (no changes)</td>
<td>&lt;1s</td>
</tr>
<tr>
<td>Any query command</td>
<td>&lt;0.5s</td>
</tr>
</tbody></table>
<details>
<summary><strong>Detailed benchmarks</strong></summary>

<h3>Indexing Speed</h3>
<table>
<thead>
<tr>
<th>Project</th>
<th>Language</th>
<th>Files</th>
<th>Symbols</th>
<th>Edges</th>
<th>Index Time</th>
<th>Rate</th>
</tr>
</thead>
<tbody><tr>
<td>Express</td>
<td>JS</td>
<td>211</td>
<td>624</td>
<td>804</td>
<td>3s</td>
<td>70 files/s</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td>237</td>
<td>1,065</td>
<td>868</td>
<td>6s</td>
<td>41 files/s</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td>697</td>
<td>5,335</td>
<td>8,984</td>
<td>25s</td>
<td>28 files/s</td>
</tr>
<tr>
<td>Laravel</td>
<td>PHP</td>
<td>3,058</td>
<td>39,097</td>
<td>38,045</td>
<td>1m46s</td>
<td>29 files/s</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td>8,445</td>
<td>16,445</td>
<td>19,618</td>
<td>2m40s</td>
<td>52 files/s</td>
</tr>
</tbody></table>
<h3>Quality Benchmark</h3>
<table>
<thead>
<tr>
<th>Repo</th>
<th>Language</th>
<th>Score</th>
<th>Coverage</th>
<th>Edge Density</th>
</tr>
</thead>
<tbody><tr>
<td>Laravel</td>
<td>PHP</td>
<td><strong>9.55</strong></td>
<td>91.2%</td>
<td>0.97</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td><strong>9.27</strong></td>
<td>85.8%</td>
<td>1.68</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td><strong>9.04</strong></td>
<td>94.7%</td>
<td>1.19</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td><strong>8.98</strong></td>
<td>85.9%</td>
<td>0.82</td>
</tr>
<tr>
<td>Express</td>
<td>JS</td>
<td><strong>8.46</strong></td>
<td>96.0%</td>
<td>1.29</td>
</tr>
</tbody></table>
<h3>Token Efficiency</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>1,600-line file → <code>roam file</code></td>
<td>~5,000 chars (~70:1 compression)</td>
</tr>
<tr>
<td>Full project map</td>
<td>~4,000 chars</td>
</tr>
<tr>
<td><code>--compact</code> mode</td>
<td>40-50% additional token reduction</td>
</tr>
<tr>
<td><code>roam preflight</code> replaces</td>
<td>5-7 separate agent tool calls</td>
</tr>
</tbody></table>
</details>

<p>Agent-efficiency benchmarks: see the <a href="benchmarks/"><code>benchmarks/</code></a> directory for harness, repos, and results.</p>
<h2>How It Works</h2>
<pre><code>Codebase
    |
[1] Discovery ──── git ls-files (respects .gitignore + .roamignore)
    |
[2] Parse ──────── tree-sitter AST per file (27 languages)
    |
[3] Extract ────── symbols + references (calls, imports, inheritance)
    |
[4] Resolve ────── match references to definitions → edges
    |
[5] Metrics ────── adaptive PageRank, betweenness, cognitive complexity, Halstead
    |
[6] Algorithms ── 23-pattern anti-pattern catalog (O(n^2) loops, N+1, recursion)
    |
[7] Git ────────── churn, co-change matrix, authorship, Renyi entropy
    |
[8] Clusters ───── Louvain community detection
    |
[9] Health ─────── per-file scores (7-factor) + composite score (0-100)
    |
[10] Store ─────── .roam/index.db (SQLite, WAL mode)
</code></pre>
<p>After the first full index, <code>roam index</code> only re-processes changed files (mtime + SHA-256 hash). Incremental updates are near-instant.</p>
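<p>The mtime + SHA-256 check can be sketched as follows. This is a minimal illustration of the general technique, not the code in <code>incremental.py</code>; <code>file_sha256</code>, <code>changed_files</code>, and the in-memory snapshot dictionary are assumptions made for the example.</p>

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash file contents in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, previous):
    """Return the paths whose content differs from the previous snapshot.

    `previous` maps path string -> (mtime_ns, sha256) and is updated in place.
    The mtime comparison is a cheap first filter; the hash confirms that the
    content really changed (a touched but identical file is not reported).
    """
    changed = []
    for path in map(Path, paths):
        key = str(path)
        mtime = path.stat().st_mtime_ns
        old = previous.get(key)
        if old is not None and old[0] == mtime:
            continue  # same mtime: assume unchanged and skip hashing
        digest = file_sha256(path)
        if old is None or old[1] != digest:
            changed.append(key)
        previous[key] = (mtime, digest)
    return changed
```

<p>Checking mtime first keeps the no-change case cheap, since unchanged files are never read.</p>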
<h3>.roamignore</h3>
<p>Create a <code>.roamignore</code> file in your project root to exclude files from indexing. It uses <strong>full gitignore syntax</strong>:</p>
<table>
<thead>
<tr>
<th>Pattern</th>
<th>Meaning</th>
</tr>
</thead>
<tbody><tr>
<td><code>*.log</code></td>
<td>Exclude all <code>.log</code> files (basename match)</td>
</tr>
<tr>
<td><code>vendor/</code></td>
<td>Exclude the <code>vendor</code> directory and everything under it</td>
</tr>
<tr>
<td><code>/build/</code></td>
<td>Exclude <code>build/</code> at repo root only (anchored)</td>
</tr>
<tr>
<td><code>src/**/*.pb.go</code></td>
<td>Exclude <code>.pb.go</code> files at any depth under <code>src/</code></td>
</tr>
<tr>
<td><code>**/test_*.py</code></td>
<td>Exclude <code>test_*.py</code> files anywhere</td>
</tr>
<tr>
<td><code>?</code></td>
<td>Match any single character (not <code>/</code>)</td>
</tr>
<tr>
<td><code>[abc]</code> / <code>[!abc]</code></td>
<td>Character class / negated character class</td>
</tr>
<tr>
<td><code>!important.log</code></td>
<td>Un-exclude (re-include) <code>important.log</code></td>
</tr>
<tr>
<td><code># comment</code></td>
<td>Lines starting with <code>#</code> are comments</td>
</tr>
</tbody></table>
<p>Key rules: <code>*</code> matches within a single path segment (not across <code>/</code>). <code>**</code> matches across <code>/</code> boundaries. Last matching pattern wins (for negation). As in gitignore, patterns with a <code>/</code> at the start or in the middle are anchored to the repo root; a trailing <code>/</code> on its own only restricts the match to directories.</p>
<pre><code># .roamignore example
*_pb2.py
*_pb2_grpc.py
vendor/
node_modules/
*.generated.*
/build/
!build/keep/
</code></pre>
<p>You can also exclude patterns via <code>roam config --exclude &quot;*.proto&quot;</code> (stored in <code>.roam/config.json</code>) or inspect active patterns with <code>roam config --show</code>.</p>
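<p>The matching rules can be illustrated with a simplified matcher. This is a sketch under stated assumptions, not Roam's implementation: <code>compile_rules</code> and <code>is_ignored</code> are hypothetical names, anchoring follows gitignore's rule (a <code>/</code> at the start or middle of a pattern), and the sketch applies plain last-match-wins without modelling gitignore's restriction on re-including files under an excluded parent directory.</p>

```python
import re


def _translate(pattern: str) -> str:
    """Translate one gitignore-style glob into a regex fragment (simplified)."""
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # crosses "/" boundaries
            i += 2
        elif ch == "*":
            out.append("[^/]*")  # stays within one path segment
            i += 1
        elif ch == "?":
            out.append("[^/]")
            i += 1
        elif ch == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]  # [!abc] -> [^abc]
            out.append("[" + body + "]")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def compile_rules(lines):
    """Turn ignore-file lines into (regex, negated) rules, skipping comments."""
    rules = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        dir_only = line.endswith("/")
        core = line.rstrip("/")
        anchored = "/" in core  # slash at the start or middle anchors to the root
        core = core.lstrip("/")
        prefix = "^" if anchored else "^(?:.*/)?"
        # Directory patterns match everything beneath; others match the path
        # itself or anything beneath it.
        suffix = "/.*$" if dir_only else "(?:/.*)?$"
        rules.append((re.compile(prefix + _translate(core) + suffix), negated))
    return rules


def is_ignored(path: str, rules) -> bool:
    """Apply rules in order; the last matching rule decides."""
    ignored = False
    for regex, negated in rules:
        if regex.match(path):
            ignored = not negated
    return ignored
```

<p>Paths are expected relative to the repo root with <code>/</code> separators, for example <code>is_ignored(&quot;src/vendor/a.js&quot;, compile_rules([&quot;vendor/&quot;]))</code>.</p>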
<details>
<summary><strong>Graph algorithms</strong></summary>

<ul>
<li><strong>Adaptive PageRank</strong> -- damping factor auto-tunes based on cycle density (0.82-0.92); identifies the most important symbols (used by <code>map</code>, <code>search</code>, <code>context</code>)</li>
<li><strong>Personalized PageRank</strong> -- distance-weighted blast radius for <code>impact</code> (Gleich, 2015)</li>
<li><strong>Adaptive betweenness centrality</strong> -- exact for small graphs, sqrt-scaled sampling for large (Brandes &amp; Pich, 2007); finds bottleneck symbols</li>
<li><strong>Edge betweenness centrality</strong> -- identifies critical cycle-breaking edges in SCCs (Brandes, 2001)</li>
<li><strong>Tarjan&#39;s SCC</strong> -- detects dependency cycles with tangle ratio</li>
<li><strong>Propagation Cost</strong> -- fraction of system affected by any change, via transitive closure (MacCormack, Rusnak &amp; Baldwin, 2006)</li>
<li><strong>Algebraic connectivity (Fiedler value)</strong> -- second-smallest Laplacian eigenvalue; measures architectural robustness (Fiedler, 1973)</li>
<li><strong>Louvain community detection</strong> -- groups related symbols into clusters</li>
<li><strong>Modularity Q-score</strong> -- measures if cluster boundaries match natural community structure (Newman, 2004)</li>
<li><strong>Conductance</strong> -- per-cluster boundary tightness: cut(S, S_bar) / min(vol(S), vol(S_bar)) (Yang &amp; Leskovec)</li>
<li><strong>Topological sort</strong> -- computes dependency layers, Gini coefficient for layer balance (Gini, 1912), weighted violation severity</li>
<li><strong>k-shortest simple paths</strong> -- traces dependency paths with coupling strength</li>
<li><strong>Renyi entropy (order 2)</strong> -- measures co-change distribution; more robust to outliers than Shannon (Renyi, 1961)</li>
<li><strong>Mann-Kendall trend test</strong> -- non-parametric degradation detection, robust to noise (Mann, 1945; Kendall, 1975)</li>
<li><strong>Sen&#39;s slope estimator</strong> -- robust trend magnitude, resistant to outliers (Sen, 1968)</li>
<li><strong>NPMI</strong> -- Normalized Pointwise Mutual Information for coupling strength (Bouma, 2009)</li>
<li><strong>Lift</strong> -- association rule mining metric for co-change statistical significance (Agrawal &amp; Srikant, 1994)</li>
<li><strong>Halstead metrics</strong> -- volume, difficulty, effort, and predicted bugs from operator/operand counts (Halstead, 1977)</li>
<li><strong>SQALE remediation cost</strong> -- time-to-fix estimates per issue type for tech debt prioritization (Letouzey, 2012)</li>
<li><strong>Algorithm anti-pattern catalog</strong> -- 23 patterns detecting suboptimal algorithms (quadratic loops, N+1 queries, quadratic string building, branching recursion, manual top-k, loop-invariant calls) with confidence calibration via caller-count and bounded-loop analysis</li>
</ul>
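<p>Two of the metrics above are simple enough to sketch in plain Python. The functions below are illustrative only (the project structure suggests Roam's own graph code is built on NetworkX). They assume a directed edge list, count every node as reaching itself for propagation cost, and compute conductance on the undirected view of the graph.</p>

```python
from collections import deque


def propagation_cost(edges, nodes):
    """Fraction of (source, target) pairs connected by a dependency path.

    Every node is counted as reaching itself, so the denominator is n * n.
    """
    adjacency = {n: set() for n in nodes}
    for src, dst in edges:
        adjacency[src].add(dst)
    reachable_pairs = 0
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search over outgoing edges
            for nxt in adjacency[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        reachable_pairs += len(seen)
    return reachable_pairs / (len(nodes) ** 2)


def conductance(edges, cluster, nodes):
    """cut(S, S_bar) / min(vol(S), vol(S_bar)), treating edges as undirected."""
    cluster = set(cluster)
    degree = {n: 0 for n in nodes}
    cut = 0
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
        if (src in cluster) != (dst in cluster):
            cut += 1  # edge crosses the cluster boundary
    vol_in = sum(degree[n] for n in cluster)
    vol_out = sum(degree[n] for n in nodes if n not in cluster)
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(propagation_cost(edges, nodes))         # 0.625
print(conductance(edges, {"a", "b"}, nodes))  # 1 cut edge / volume 3
```

<p>Lower values are better for both: a low propagation cost means changes stay contained, and a low conductance means a cluster has few edges crossing its boundary.</p>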
</details>

<details>
<summary><strong>Health scoring</strong></summary>

<p>Composite health score (0-100) using a <strong>weighted geometric mean</strong> of sigmoid health factors. Non-compensatory: a zero in any dimension cannot be masked by high scores in others.</p>
<table>
<thead>
<tr>
<th>Factor</th>
<th>Weight</th>
<th>What it measures</th>
</tr>
</thead>
<tbody><tr>
<td>Tangle ratio</td>
<td>30%</td>
<td>% of symbols in dependency cycles</td>
</tr>
<tr>
<td>God components</td>
<td>20%</td>
<td>Symbols with extreme fan-in/fan-out</td>
</tr>
<tr>
<td>Bottlenecks</td>
<td>15%</td>
<td>High-betweenness chokepoints</td>
</tr>
<tr>
<td>Layer violations</td>
<td>15%</td>
<td>Upward dependency violations (severity-weighted by layer distance)</td>
</tr>
<tr>
<td>Per-file health</td>
<td>20%</td>
<td>Average of 7-factor file health scores</td>
</tr>
</tbody></table>
<p>Each factor uses sigmoid health: <code>h = e^(-signal/scale)</code> (1 = pristine, approaches 0 = worst). Score = <code>100 * product(h_i ^ w_i)</code>. Also reports <strong>propagation cost</strong> (MacCormack 2006) and <strong>algebraic connectivity</strong> (Fiedler 1973). Per-file health (1-10) combines: cognitive complexity (triangular nesting penalty per Sweller&#39;s Cognitive Load Theory), indentation complexity, cycle membership, god component membership, dead export ratio, co-change entropy, and churn amplification.</p>
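<p>The composite formula can be written out directly. In this sketch the weights come from the table above, while the dictionary keys and the function names <code>factor_health</code> and <code>composite_score</code> are hypothetical; the per-factor <code>scale</code> values Roam uses are not stated here, so the caller supplies them.</p>

```python
import math

# Weights from the health-factor table; they sum to 1.0.
# The key names are illustrative, not Roam's identifiers.
WEIGHTS = {
    "tangle_ratio": 0.30,
    "god_components": 0.20,
    "bottlenecks": 0.15,
    "layer_violations": 0.15,
    "file_health": 0.20,
}


def factor_health(signal: float, scale: float) -> float:
    """h = e^(-signal/scale): 1.0 when the signal is 0, decaying toward 0."""
    return math.exp(-signal / scale)


def composite_score(health: dict) -> float:
    """100 * product(h_i ^ w_i): a weighted geometric mean of factor healths."""
    score = 100.0
    for name, weight in WEIGHTS.items():
        score *= health[name] ** weight
    return score
```

<p>Because the factors are multiplied rather than added, a single factor at 0 drives the whole score to 0, which is the non-compensatory behavior described above.</p>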
</details>

<h2>How Roam Compares</h2>
<p>roam-code is the only tool that combines graph algorithms (PageRank, Tarjan SCC, Louvain clustering), git archaeology, architecture simulation, and multi-agent partitioning in a single local CLI with zero API keys.</p>
<p>Documentation (local HTML in <code>docs/site/</code>, CI-deployed via <code>.github/workflows/pages.yml</code>):</p>
<ul>
<li><code>docs/site/getting-started.html</code> — tutorial</li>
<li><code>docs/site/command-reference.html</code> — examples</li>
<li><code>docs/site/architecture.html</code> — diagram + internals</li>
<li><code>docs/site/landscape.html</code> — competitor matrix</li>
</ul>
<table>
<thead>
<tr>
<th>Capability</th>
<th>roam-code</th>
<th>AI IDEs (Cursor, Windsurf)</th>
<th>AI Agents (Claude Code, Codex)</th>
<th>SAST (SonarQube, CodeQL)</th>
</tr>
</thead>
<tbody><tr>
<td>Persistent local index</td>
<td>SQLite</td>
<td>Cloud embeddings</td>
<td>None</td>
<td>Per-scan</td>
</tr>
<tr>
<td>Call graph analysis</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Yes (CodeQL)</td>
</tr>
<tr>
<td>PageRank / centrality</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Cycle detection (Tarjan)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Deprecated (SonarQube)</td>
</tr>
<tr>
<td>Community detection (Louvain)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Git churn / co-change</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Architecture simulation</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Multi-agent partitioning</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>MCP tools for agents</td>
<td>102 (24 in default core preset)</td>
<td>Client only</td>
<td>Client only</td>
<td>34 (SonarQube)</td>
</tr>
<tr>
<td>Languages</td>
<td>27</td>
<td>70+</td>
<td>50+</td>
<td>12-42</td>
</tr>
<tr>
<td>100% local, zero API keys</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Partial</td>
</tr>
<tr>
<td>Open source</td>
<td>MIT</td>
<td>No</td>
<td>Partial</td>
<td>Partial</td>
</tr>
</tbody></table>
<h3>Key Differentiators</h3>
<ul>
<li><strong>vs AI IDEs</strong> (Cursor, Windsurf, Augment): roam-code provides deterministic structural analysis. AI IDEs use probabilistic embeddings that can&#39;t guarantee reproducible results.</li>
<li><strong>vs AI Agents</strong> (Claude Code, Codex CLI, Gemini CLI): These agents read files one at a time. roam-code pre-computes relationships so agents get instant answers about architecture, blast radius, and dependencies.</li>
<li><strong>vs SAST Tools</strong> (SonarQube, CodeQL, Semgrep): SAST tools find bugs and vulnerabilities. roam-code understands architecture -- how code is structured, where it&#39;s coupled, and what breaks when you change it. Complementary, not competitive.</li>
<li><strong>vs Code Search</strong> (Sourcegraph/Amp, Greptile): Text search finds where code is. roam-code understands why code matters -- which functions are central, which modules are tangled, which files are high-risk.</li>
</ul>
<h2>FAQ</h2>
<p><strong>Does Roam send any data externally?</strong>
No. Zero network calls. No telemetry, no analytics, no update checks.</p>
<p><strong>Can Roam run in air-gapped environments?</strong>
Yes. Once installed, no internet access is required.</p>
<p><strong>Does Roam modify my source code?</strong>
Read-only by default. Creates <code>.roam/</code> with an index database. The <code>roam mutate</code> command can apply code changes (move/rename/extract) but defaults to <code>--dry-run</code> mode — you must explicitly pass <code>--apply</code> to write changes.</p>
<p><strong>How does Roam handle monorepos?</strong>
Indexes from the root. Batched SQL handles 100k+ symbols. Incremental updates stay fast.</p>
<p><strong>How does Roam handle multi-repo projects (e.g., frontend + backend)?</strong>
Use <code>roam ws init &lt;repo1&gt; &lt;repo2&gt;</code> to create a workspace. Each repo keeps its own index; a workspace overlay DB stores cross-repo API edges. <code>roam ws resolve</code> scans for REST endpoints and matches frontend calls to backend routes. Then <code>roam ws context</code>, <code>roam ws trace</code>, etc. work across repos.</p>
<p><strong>Is Roam compatible with SonarQube / CodeScene?</strong>
Yes. Roam complements existing tools. Both can run in the same CI pipeline. SARIF output integrates with GitHub Code Scanning.</p>
<h2>Limitations</h2>
<p>Static analysis trade-offs:</p>
<ul>
<li><strong>Primarily static analysis</strong> -- can&#39;t trace dynamic dispatch, reflection, or eval&#39;d code. Runtime trace ingestion (<code>roam ingest-trace</code>) adds production data but requires traces exported from an external tool</li>
<li><strong>Import resolution is heuristic</strong> -- complex re-exports or conditional imports may not resolve</li>
<li><strong>Limited cross-language edges</strong> -- Salesforce, Protobuf, REST API, and multi-repo edges are supported, but not arbitrary FFI</li>
<li><strong>Tier 2 languages</strong> get only basic symbol extraction, via the generic tree-sitter walker</li>
<li><strong>Large monorepos</strong> (100k+ files) may have slow initial indexing</li>
</ul>
<h2>Troubleshooting</h2>
<table>
<thead>
<tr>
<th>Problem</th>
<th>Solution</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam: command not found</code></td>
<td>Ensure install location is on PATH. For <code>uv</code>: <code>uv tool update-shell</code></td>
</tr>
<tr>
<td><code>Another indexing process is running</code></td>
<td>Delete <code>.roam/index.lock</code> and retry</td>
</tr>
<tr>
<td><code>database is locked</code></td>
<td><code>roam index --force</code> to rebuild</td>
</tr>
<tr>
<td>Unicode errors on Windows</td>
<td><code>chcp 65001</code> for UTF-8</td>
</tr>
<tr>
<td>Symbol resolves to wrong file</td>
<td>Use <code>file:symbol</code> syntax: <code>roam symbol myfile:MyFunction</code></td>
</tr>
<tr>
<td>Health score seems wrong</td>
<td><code>roam --json health</code> for factor breakdown</td>
</tr>
<tr>
<td>Index stale after <code>git pull</code></td>
<td><code>roam index</code> (incremental). After major refactors: <code>roam index --force</code></td>
</tr>
</tbody></table>
<h2>Update / Uninstall</h2>
<pre><code class="language-bash"># Update
pipx upgrade roam-code
uv tool upgrade roam-code
pip install --upgrade roam-code

# Uninstall
pipx uninstall roam-code
uv tool uninstall roam-code
pip uninstall roam-code
</code></pre>
<p>Delete <code>.roam/</code> from your project root to clean up local data.</p>
<h2>Development</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e &quot;.[dev]&quot;   # includes pytest, ruff
pytest tests/              # ~5500 tests, Python 3.9-3.13

# Or use Make targets:
make dev      # install with dev extras
make test     # run tests
make lint     # ruff check
</code></pre>
<details>
<summary><strong>Project structure</strong></summary>

<pre><code>roam-code/
├── pyproject.toml
├── action.yml                         # Reusable GitHub Action
├── src/roam/
│   ├── __init__.py                    # Version (from pyproject.toml)
│   ├── cli.py                         # Click CLI (140 commands)
│   ├── mcp_server.py                  # MCP server (102 tools, 10 resources, 5 prompts)
│   ├── db/
│   │   ├── connection.py              # SQLite (WAL, pragmas, batched IN)
│   │   ├── schema.py                  # Tables, indexes, migrations
│   │   └── queries.py                 # Named SQL constants
│   ├── index/
│   │   ├── indexer.py                 # Orchestrates full pipeline
│   │   ├── discovery.py               # git ls-files, .gitignore
│   │   ├── parser.py                  # Tree-sitter parsing
│   │   ├── symbols.py                 # Symbol + reference extraction
│   │   ├── relations.py               # Reference resolution -&gt; edges
│   │   ├── complexity.py              # Cognitive complexity (SonarSource) + Halstead metrics
│   │   ├── git_stats.py               # Churn, co-change, blame, Renyi entropy
│   │   ├── incremental.py             # mtime + hash change detection
│   │   ├── file_roles.py              # Smart file role classifier
│   │   └── test_conventions.py        # Pluggable test naming adapters
│   ├── languages/
│   │   ├── base.py                    # Abstract LanguageExtractor
│   │   ├── registry.py                # Language detection + aliasing
│   │   ├── *_lang.py                  # One file per language (21 dedicated + generic)
│   │   └── generic_lang.py            # Tier 2 fallback
│   ├── bridges/
│   │   ├── base.py, registry.py       # Cross-language bridge framework
│   │   ├── bridge_salesforce.py       # Apex &lt;-&gt; Aura/LWC/Visualforce
│   │   └── bridge_protobuf.py         # .proto -&gt; Go/Java/Python stubs
│   ├── catalog/
│   │   ├── tasks.py                  # Universal algorithm catalog (23 patterns)
│   │   └── detectors.py              # Anti-pattern detectors with confidence calibration
│   ├── workspace/
│   │   ├── config.py                  # .roam-workspace.json
│   │   ├── db.py                      # Workspace overlay DB
│   │   ├── api_scanner.py             # REST API endpoint detection
│   │   └── aggregator.py              # Cross-repo aggregation
│   ├── graph/
│   │   ├── builder.py, pagerank.py    # DB -&gt; NetworkX, PageRank
│   │   ├── cycles.py, clusters.py     # Tarjan SCC, propagation cost, Louvain, modularity Q
│   │   ├── layers.py, pathfinding.py  # Topo layers, k-shortest paths
│   │   ├── simulate.py, spectral.py   # Architecture simulation, Fiedler bisection
│   │   ├── partition.py, fingerprint.py # Multi-agent partitioning, topology fingerprints
│   │   └── anomaly.py                 # Statistical anomaly detection
│   ├── commands/
│   │   ├── resolve.py                 # Shared symbol resolution
│   │   ├── graph_helpers.py           # Shared graph utilities (adj builders, BFS)
│   │   ├── context_helpers.py         # Data-gathering helpers for context command
│   │   ├── gate_presets.py            # Framework-specific gate rules
│   │   └── cmd_*.py                   # One module per command
│   ├── analysis/
│   │   ├── effects.py                 # Side-effect classification engine
│   │   └── taint.py                   # Taint analysis
│   ├── refactor/
│   │   ├── codegen.py                 # Import generation (Python/JS/Go)
│   │   └── transforms.py              # move/rename/add-call/extract transforms
│   ├── rules/
│   │   ├── engine.py                  # YAML rule parser + graph query evaluator
│   │   ├── builtin.py                 # 10 built-in governance rules
│   │   ├── ast_match.py               # AST pattern matching with $METAVAR captures
│   │   └── dataflow.py                # Intra-procedural dataflow analysis
│   ├── runtime/
│   │   ├── trace_ingest.py            # OpenTelemetry/Jaeger/Zipkin ingestion
│   │   └── hotspots.py                # Runtime hotspot analysis
│   ├── search/
│   │   ├── tfidf.py                   # TF-IDF semantic search engine
│   │   ├── index_embeddings.py        # Embedding index builder
│   │   └── onnx_embeddings.py         # Optional local ONNX semantic backend
│   ├── security/
│   │   ├── vuln_store.py              # CVE/vulnerability storage
│   │   └── vuln_reach.py              # Vulnerability reachability paths
│   └── output/
│       ├── formatter.py               # Token-efficient formatting
│       ├── sarif.py                   # SARIF 2.1.0 output
│       └── schema_registry.py         # JSON envelope schema versioning
└── tests/                             # ~5500 tests across 186 test files
</code></pre>
</details>
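<p>The <code>graph/</code> modules listed above name Tarjan SCC as the cycle-detection technique. The block below is a minimal, standard-library sketch of that algorithm applied to a file-level dependency graph. It is not Roam's actual <code>cycles.py</code>: the function names <code>tarjan_scc</code> and <code>import_cycles</code> and the sample file names are hypothetical, and the real module builds on NetworkX rather than plain dicts.</p>

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    graph: dict mapping node -> iterable of successor nodes.
    Hypothetical helper for illustration; not part of Roam's API.
    """
    index_of = {}    # discovery order of each node
    lowlink = {}     # lowest discovery index reachable from the node
    on_stack = set()
    stack = []
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index_of:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index_of[succ])
        # A node whose lowlink equals its own index is the root of a component.
        if lowlink[node] == index_of[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index_of:
            visit(node)
    return components


def import_cycles(graph):
    """Keep only components with more than one node, i.e. real cycles."""
    return [sorted(c) for c in tarjan_scc(graph) if len(c) > 1]


# Made-up dependency edges: db.py and models.py import each other.
deps = {
    "cli.py": ["app.py"],
    "app.py": ["db.py"],
    "db.py": ["models.py"],
    "models.py": ["db.py"],
}
print(import_cycles(deps))
```

<p>This recursive form is fine for small graphs. A production implementation would likely use an iterative traversal to avoid Python's recursion limit on deep dependency chains, and would treat self-loops as cycles as well.</p>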

<h3>Dependencies</h3>
<table>
<thead>
<tr>
<th>Package</th>
<th>Purpose</th>
</tr>
</thead>
<tbody><tr>
<td><a href="https://click.palletsprojects.com/">click</a> &gt;= 8.0</td>
<td>CLI framework</td>
</tr>
<tr>
<td><a href="https://github.com/tree-sitter/py-tree-sitter">tree-sitter</a> &gt;= 0.23</td>
<td>AST parsing</td>
</tr>
<tr>
<td><a href="https://github.com/nicolo-ribaudo/tree-sitter-language-pack">tree-sitter-language-pack</a> &gt;= 0.6</td>
<td>165+ grammars</td>
</tr>
<tr>
<td><a href="https://networkx.org/">networkx</a> &gt;= 3.0</td>
<td>Graph algorithms</td>
</tr>
</tbody></table>
<p>Optional: <a href="https://github.com/jlowin/fastmcp">fastmcp</a> &gt;= 2.0 (MCP server — install with <code>pip install &quot;roam-code[mcp]&quot;</code>)</p>
<p>Optional: Local semantic ONNX stack (<code>numpy</code>, <code>onnxruntime</code>, <code>tokenizers</code>) via <code>pip install &quot;roam-code[semantic]&quot;</code></p>
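<p>To check from a script whether the optional extras are importable, something like the following may help. It is a sketch under assumptions: the mapping of extras to import names is taken from the two lines above, while the helper names <code>missing_modules</code> and <code>install_hint</code> are hypothetical and not part of Roam.</p>

```python
from importlib.util import find_spec

# Extras and the modules they are expected to provide, per the list above.
OPTIONAL_EXTRAS = {
    "mcp": ["fastmcp"],
    "semantic": ["numpy", "onnxruntime", "tokenizers"],
}


def missing_modules(extra):
    """Return the modules of an extra that cannot be found for import."""
    return [name for name in OPTIONAL_EXTRAS[extra] if find_spec(name) is None]


def install_hint(extra):
    """Return a pip hint if anything is missing, otherwise None."""
    missing = missing_modules(extra)
    if not missing:
        return None
    return f'Missing {", ".join(missing)}; try: pip install "roam-code[{extra}]"'


for extra in OPTIONAL_EXTRAS:
    print(extra, "->", install_hint(extra) or "available")
```

<p><code>find_spec</code> only locates a module without importing it, so the check stays cheap even when the ONNX stack is installed.</p>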
<h2>Roadmap</h2>
<h3>Shipped</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> MCP v2 agent surface: in-process execution, compound operations, presets, schemas, annotations, and compatibility profiles.</li>
<li><input checked="" disabled="" type="checkbox"> Full command and MCP inventory parity in docs: 140 CLI commands and 102 MCP tools.</li>
<li><input checked="" disabled="" type="checkbox"> CI hardening: composite action, changed-only mode, trend-aware gates, sticky PR updater, and SARIF guardrails.</li>
<li><input checked="" disabled="" type="checkbox"> Performance foundation: FTS5/BM25 search, O(changed) incremental indexing, DB/index optimizations.</li>
<li><input checked="" disabled="" type="checkbox"> Agent governance suite: <code>vibe-check</code>, <code>ai-readiness</code>, <code>verify</code>, <code>ai-ratio</code>, <code>duplicates</code>, advanced <code>algo</code> scoring/SARIF.</li>
<li><input checked="" disabled="" type="checkbox"> Ownership/review intelligence: <code>codeowners</code>, <code>drift</code>, <code>simulate-departure</code>, <code>suggest-reviewers</code>, <code>api-changes</code>, <code>test-gaps</code>, <code>semantic-diff</code>, <code>secrets</code>.</li>
<li><input checked="" disabled="" type="checkbox"> Multi-agent operations: <code>partition</code>, <code>affected</code>, <code>syntax-check</code>, workspace-aware context and traces.</li>
<li><input checked="" disabled="" type="checkbox"> Budget-aware context delivery: <code>--budget</code> (partial rollout), PageRank-weighted truncation, conversation-aware ranking.</li>
</ul>
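<p>The last shipped item combines a token budget with PageRank-weighted truncation. One plausible reading is a greedy selection: rank candidate files by score and keep each one that still fits in the budget. The sketch below illustrates that idea only. The function <code>truncate_to_budget</code>, the scores, and the token costs are all made up, and Roam's actual <code>--budget</code> behaviour may differ.</p>

```python
def truncate_to_budget(items, budget):
    """Greedily keep the highest-ranked items that fit in a token budget.

    items: list of (name, rank, token_cost) tuples.
    Returns (kept_names, tokens_spent). Hypothetical helper, not Roam's API.
    """
    kept = []
    spent = 0
    for name, rank, cost in sorted(items, key=lambda it: it[1], reverse=True):
        if spent + cost > budget:
            continue  # too large; a smaller, lower-ranked item may still fit
        kept.append(name)
        spent += cost
    return kept, spent


# Illustrative scores and token costs, not measured from any repository.
candidates = [
    ("app.py", 0.40, 600),
    ("testing.py", 0.25, 500),
    ("init.py", 0.20, 100),
    ("helpers.py", 0.15, 300),
]
print(truncate_to_budget(candidates, budget=1000))
```

<p>Skipping an oversized item instead of stopping at it keeps the output closer to the budget, at the cost of occasionally dropping a higher-ranked file in favour of smaller ones.</p>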
<h3>Next</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> Terminal demo GIF in README.</li>
<li><input disabled="" type="checkbox"> GitHub repo topics.</li>
<li><input disabled="" type="checkbox"> GitHub Discussions enabled.</li>
<li><input disabled="" type="checkbox"> MCP directory + awesome-list submissions.</li>
</ul>
<h2>Contributing</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
pytest tests/   # all ~5500 tests must pass
</code></pre>
<p>Good first contributions: add a <a href="src/roam/languages/">Tier 1 language</a> (see <code>go_lang.py</code> or <code>php_lang.py</code> as templates), improve reference resolution, add benchmark repos, extend SARIF converters, add MCP tools.</p>
<p>Please open an issue first to discuss larger changes.</p>
<h2>License</h2>
<p><a href="LICENSE">MIT</a></p>
