ROAM-CODE(1)

NAME

roam-code - Architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent…

SYNOPSIS

$ pip install roam-code

DESCRIPTION

Architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping. 140 commands, 102 MCP tools, 27 languages, 100% local.

README

roam-code

The architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping, runtime analysis -- one CLI, zero API keys.

140 commands · 102 MCP tools · 27 languages · 100% local

Python 3.9+ · License: MIT


What is Roam?

Roam is a structural intelligence engine for software. It pre-indexes your codebase into a semantic graph -- symbols, dependencies, call graphs, architecture layers, git history, and runtime traces -- stored in a local SQLite DB. Agents query it via CLI or MCP instead of repeatedly grepping files and guessing structure.

Unlike LSPs (editor-bound, language-specific) or Sourcegraph (hosted search), Roam provides architecture-level graph queries -- offline, cross-language, and compact. It goes beyond comprehension: Roam governs architecture through budget gates, simulates refactoring outcomes, orchestrates multi-agent swarms with zero-conflict guarantees, maps vulnerability reachability paths, and enables graph-level code editing without syntax errors.

Codebase ──> [Index] ──> Semantic Graph ──> 140 Commands ──> AI Agent
              │              │                  │
           tree-sitter    symbols            comprehend
           27 languages   + edges            govern
           git history    + metrics          refactor
           runtime traces + architecture     orchestrate
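
To make the "semantic graph in a local SQLite DB" idea concrete, here is a minimal sketch of how symbols and call edges could be stored and queried. The two-table schema, the column names, and the `callers_of` helper are assumptions made for illustration; they are not Roam's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one row per symbol, one row per directed edge.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT, line INTEGER);
CREATE TABLE edges (src INTEGER REFERENCES symbols(id), dst INTEGER REFERENCES symbols(id), kind TEXT);
""")
conn.executemany(
    "INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Flask", "cls", "src/flask/app.py", 76),
        (2, "FlaskClient", "cls", "src/flask/testing.py", 22),
        (3, "test_app_factory", "fn", "tests/test_basic.py", 12),
    ],
)
# An edge (src, dst) means "src calls dst".
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", [(2, 1, "call"), (3, 1, "call")])


def callers_of(name):
    """Return (caller, file, line) for every symbol with an edge into `name`."""
    rows = conn.execute(
        """
        SELECT s.name, s.file, s.line
        FROM edges e
        JOIN symbols s ON s.id = e.src
        JOIN symbols t ON t.id = e.dst
        WHERE t.name = ?
        ORDER BY s.file
        """,
        (name,),
    )
    return rows.fetchall()


print(callers_of("Flask"))
```

One indexed join answers "who calls this symbol, and where", which is the kind of question an agent would otherwise approximate with repeated grep and read cycles.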

The problem

Coding agents explore codebases inefficiently: dozens of grep/read cycles, high token cost, no structural understanding. Roam replaces this with one graph query:

$ roam context Flask
Callers: 47  Callees: 3
Affected tests: 31

Files to read:
  src/flask/app.py:76-963        # definition
  src/flask/__init__.py:1-15     # re-export
  src/flask/testing.py:22-45     # caller: FlaskClient.__init__
  tests/test_basic.py:12-30      # caller: test_app_factory
  ...12 more files

Core commands

$ roam understand              # full codebase briefing
$ roam context <name>          # files-to-read with exact line ranges
$ roam preflight <name>        # blast radius + tests + complexity + architecture rules
$ roam health                  # composite score (0-100)
$ roam diff                    # blast radius of uncommitted changes

What's New in v11

v11.2 -- AST Clone Detection + Debug Artifact Rules

  • roam clones: New AST structural clone detection via subtree hashing. Finds Type-2 clones (identical control flow, different identifiers/literals) with Jaccard similarity scoring, Union-Find clustering, and automated refactoring suggestions. More precise than the metric-based duplicates command.
  • 9 debug artifact rules (COR-560 through COR-568): Detect leftover print(), breakpoint(), pdb.set_trace(), console.log(), debugger, and System.out.println() in Python, JavaScript, TypeScript, and Java code. All use ast_match type with test file exemptions.
  • 140 commands, 102 MCP tools.
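
The clone-detection bullet combines three techniques: subtree hashing, Jaccard similarity, and Union-Find clustering. The sketch below shows how they fit together, using Python's built-in `ast` module in place of tree-sitter. The function names and the 0.9 threshold are assumptions for illustration, not Roam's implementation.

```python
import ast
import hashlib
from itertools import combinations


def subtree_hashes(node, out):
    """Hash every subtree by node types only, so identifiers and literals are ignored."""
    children = [subtree_hashes(child, out) for child in ast.iter_child_nodes(node)]
    text = type(node).__name__ + "(" + ",".join(children) + ")"
    digest = hashlib.sha1(text.encode()).hexdigest()
    out.add(digest)
    return digest


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def find(parent, x):
    """Union-Find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def clone_clusters(source, threshold=0.9):
    """Group functions whose subtree-hash sets overlap by at least `threshold`."""
    funcs = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            hashes = set()
            subtree_hashes(node, hashes)
            funcs[node.name] = hashes
    parent = {name: name for name in funcs}
    for a, b in combinations(funcs, 2):
        if jaccard(funcs[a], funcs[b]) >= threshold:
            parent[find(parent, a)] = find(parent, b)  # union the two sets
    clusters = {}
    for name in funcs:
        clusters.setdefault(find(parent, name), []).append(name)
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


SOURCE = '''
def total_price(items):
    total = 0
    for item in items:
        if item > 10:
            total += item
    return total

def sum_large(values):
    acc = 1
    for v in values:
        if v > 99:
            acc += v
    return acc

def greet(name):
    return "hello " + name
'''

print(clone_clusters(SOURCE))
```

`total_price` and `sum_large` share the same control flow and differ only in identifiers and literals, so they end up in one cluster; `greet` shares only a few small subtrees with them and stays out.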

v11.1.2 -- SQL + Scala Tier 1, 27 Languages

  • SQL DDL promoted to Tier 1 with dedicated SqlExtractor -- tables, columns, views, functions, triggers, schemas, types (enums), sequences, ALTER TABLE ADD COLUMN. Foreign keys produce graph edges; views and triggers reference source tables. Database-schema projects now work with roam health, roam layers, roam impact, roam coupling and all graph commands.
  • Scala promoted to Tier 1 with dedicated ScalaExtractor -- classes, traits, objects, case classes, sealed hierarchies, val/var properties, type aliases, imports, and inheritance. Full extends + with trait mixin resolution.
  • 27 languages with 16 dedicated Tier 1 extractors.
  • server.json for official MCP Registry submission.

v11.1.1 -- Command Quality Audit

  • Full command audit: all 140 commands reviewed for usefulness, duplicates, and test coverage. ~20 bugs fixed, 21 new test files (700+ tests), every command docstring updated with cross-references to related commands.
  • Kotlin promoted to Tier 1 via new YAML-based declarative extractor architecture. Classes, interfaces, enums, objects, functions, methods, properties, and inheritance fully extracted.
  • 7 new commands: roam congestion, roam adrs, roam flag-dead, roam test-scaffold, roam sbom, roam triage, roam ci-setup.
  • CI templates: roam ci-setup generates pipelines for GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, and Bitbucket.
  • Bug fixes: --undocumented mode in intent (wrong DB table), --changed flag in verify (was permanently dead), lazy-load violation in visualize (~500ms penalty), exit code inconsistency in rules, VERDICT-first convention enforced across all commands.
  • Code quality: 15 unused variables removed, dead code swept (4 orphaned cmd files, 2 dead helper functions), algo detector false-positive rate reduced (regex-in-loop: 7 to 1, list-prepend deque suppression), 6 regex patterns pre-compiled for loop performance.

v11.0 -- MCP v2 for Agent-First Workflows

  • In-process MCP execution removes per-call subprocess overhead.
  • 4 compound operations (roam_explore, roam_prepare_change, roam_review_change, roam_diagnose_issue) reduce multi-step agent workflows to single calls.
  • Preset-based tool surfacing (core, review, refactor, debug, architecture, full) keeps default tool choice tight for agents while retaining full depth on demand.
  • MCP tools now expose structured schemas and richer annotations for safer planner behavior.
  • MCP token overhead for default core context dropped from ~36K to <3K tokens (about 92% reduction).

Performance and Retrieval

  • Symbol search moved to SQLite FTS5/BM25: typical search moved from seconds to milliseconds (about 1000x on benchmarked paths).
  • Incremental indexing shifted from O(N) full-edge rebuild behavior to O(changed) updates.
  • DB/runtime optimizations (mmap_size, safer large-graph guards, batched writes) reduce first-run and reindex friction on larger repos.
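
As an illustration of the FTS5/BM25 point, the sketch below builds a small full-text index with Python's `sqlite3` module and ranks matches with SQLite's `bm25()` function. The table name, columns, and sample signatures are made up for the example, and it assumes an SQLite build that includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical index: one row per symbol, searchable by name and signature.
conn.execute("CREATE VIRTUAL TABLE symbol_fts USING fts5(name, signature)")
conn.executemany(
    "INSERT INTO symbol_fts (name, signature) VALUES (?, ?)",
    [
        ("open_db", "def open_db(path, readonly)"),
        ("ensure_index", "def ensure_index(root)"),
        ("json_envelope", "def json_envelope(command, summary)"),
    ],
)


def search(query, limit=5):
    """Return symbol names matching `query`, best BM25 score first."""
    rows = conn.execute(
        "SELECT name FROM symbol_fts WHERE symbol_fts MATCH ? "
        "ORDER BY bm25(symbol_fts) LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


print(search("index"))
```

The default FTS5 tokenizer splits on underscores, so a query for `index` finds `ensure_index` through the inverted index without scanning every row.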

CI, Governance, and Delivery

  • GitHub Action supports quality gates, SARIF upload, sticky PR comments, and cache-aware execution.
  • CI hardening includes changed-only analysis mode, trend-aware gates, and SARIF pre-upload guardrails (size/result caps + truncation signaling).
  • Agent governance expanded with verification and AI-quality tooling (roam verify, roam vibe-check, roam ai-readiness, roam ai-ratio) for teams managing agent-written code.
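
The SARIF guardrail idea (cap the number of results and signal truncation) can be sketched as follows. The structure follows the SARIF 2.1.0 layout of `runs`, `tool.driver`, and `results`; the tool name, the rule IDs, the input dictionary shape, and the `truncated` / `totalFindings` property names are assumptions for illustration, not Roam's output format.

```python
import json


def build_sarif(findings, max_results=1000):
    """Build a minimal SARIF 2.1.0 log, capping results and flagging truncation."""
    kept = findings[:max_results]
    results = [
        {
            "ruleId": f["rule"],
            "level": f["level"],
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["file"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in kept
    ]
    run = {
        "tool": {"driver": {"name": "example-analyzer"}},
        "results": results,
        # Property bags are SARIF's extension point; these key names are made up.
        "properties": {
            "truncated": len(findings) > len(kept),
            "totalFindings": len(findings),
        },
    }
    return {"version": "2.1.0", "runs": [run]}


findings = [
    {"rule": "EX-001", "level": "warning", "message": "leftover print()", "file": "app.py", "line": n}
    for n in (10, 20, 30)
]
log = build_sarif(findings, max_results=2)
print(json.dumps(log["runs"][0]["properties"]))
```

Capping before upload keeps the file within whatever size limit the receiving service enforces, and the truncation flag tells reviewers that the list is incomplete.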

Best for

  • Agent-assisted coding -- structured answers that reduce token usage vs raw file exploration
  • Large codebases (100+ files) -- graph queries beat linear search at scale
  • Architecture governance -- health scores, CI quality gates, budget enforcement, fitness functions
  • Safe refactoring -- blast radius, affected tests, pre-change safety checks, graph-level editing
  • Multi-agent orchestration -- partition codebases for parallel agent work with zero-conflict guarantees
  • Security analysis -- vulnerability reachability mapping, auth gaps, CVE path tracing
  • Algorithm optimization -- detect O(n^2) loops, N+1 queries, and 21 other anti-patterns with suggested fixes
  • Backend quality -- auth gaps, missing indexes, over-fetching models, non-idempotent migrations, orphan routes, API drift
  • Runtime analysis -- overlay production trace data onto the static graph for hotspot detection
  • Multi-repo projects -- cross-repo API edge detection between frontend and backend

When NOT to use Roam

  • Real-time type checking -- use an LSP (pyright, gopls, tsserver). Roam is static and offline.
  • Small scripts (<10 files) -- just read the files directly.
  • Pure text search -- ripgrep is faster for raw string matching.

Why use Roam

Speed. One command replaces 5-10 tool calls (in typical workflows). Under 0.5s for any query.

Dependency-aware. Computes structure, not string matches. Knows Flask has 47 dependents and 31 affected tests. grep knows it appears 847 times.

LLM-optimized output. Plain ASCII, compact abbreviations (fn, cls, meth), --json envelopes. Designed for agent consumption, not human decoration.

Fully local. No API keys, telemetry, or network calls. Works in air-gapped environments.

Algorithm-aware. Built-in catalog of 23 anti-patterns. Detects suboptimal algorithms (quadratic loops, N+1 queries, unbounded recursion) and suggests fixes with Big-O improvements and confidence scores. Receiver-aware loop-invariant analysis minimizes false positives.

CI-ready. --json output, --gate quality gates, GitHub Action, SARIF 2.1.0.

                   Without Roam   With Roam
Tool calls         8              1
Wall time          ~11s           <0.5s
Tokens consumed    ~15,000        ~3,000

Measured on a typical agent workflow in a 200-file Python project (Flask). See benchmarks for more.

Table of Contents

Getting Started: What is Roam? · What's New in v11 · Best for · Why use Roam · Install · Quick Start

Using Roam: Commands · Walkthrough · AI Coding Tools · MCP Server

Operations: CI/CD Integration · SARIF Output · For Teams

Reference: Language Support · Performance · How It Works · How Roam Compares · FAQ

More: Limitations · Troubleshooting · Update / Uninstall · Development · Contributing

Install

pip install roam-code

Recommended: isolated environment

pipx install roam-code

or

uv tool install roam-code

From source

pip install git+https://github.com/Cranot/roam-code.git

Requires Python 3.9+. Works on Linux, macOS, and Windows.

Windows: If roam is not found after installing with uv, run uv tool update-shell and restart your terminal.

Docker (alpine-based)

docker build -t roam-code .
docker run --rm -v "$PWD:/workspace" roam-code index
docker run --rm -v "$PWD:/workspace" roam-code health

Quick Start

cd your-project
roam init                  # indexes codebase, creates config + CI workflow
roam understand            # full codebase briefing

First index takes ~5s for 200 files, ~15s for 1,000 files. Subsequent runs are incremental and near-instant.
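
One common way to make re-indexing incremental is to keep a content digest per file and reprocess only files whose digest changed. The sketch below shows that idea; it assumes hash-based change detection and a `*.py` glob purely for illustration, and Roam's actual mechanism may differ (for example, it could rely on git or modification times).

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root, previous):
    """Compare current digests with the previous snapshot.

    Returns (changed_or_new, deleted, new_snapshot).
    """
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*.py"))
    }
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    deleted = [name for name in previous if name not in current]
    return changed, deleted, current


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 2\n")
    first, _, snapshot = changed_files(root, {})        # first run: everything is new
    (root / "b.py").write_text("y = 3\n")
    second, deleted, snapshot = changed_files(root, snapshot)  # second run: only b.py

print(first, second, deleted)
```

Only the files in `changed` need to be re-parsed, so the work on later runs scales with the size of the change, not the size of the repository.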

Next steps:

  • Set up your AI agent: roam describe --write (auto-detects CLAUDE.md, AGENTS.md, .cursor/rules, etc. — see integration instructions)
  • Explore: roam health, roam weather, roam map
  • Add to CI: roam init already generated a GitHub Action

Try it on Roam itself

git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
roam init
roam understand
roam health

Works With

Claude Code · Cursor · Windsurf · GitHub Copilot · Aider · Cline · Gemini CLI · OpenAI Codex CLI · MCP · GitHub Actions · GitLab CI · Azure DevOps

Commands

The 5 core commands shown above cover ~80% of agent workflows. All 140 commands are organized into 7 categories.

Full command reference

Getting Started

Command | Description
roam index [--force] [--verbose] | Build or rebuild the codebase index
roam watch [--interval N] [--debounce N] [--webhook-port P] [--guardian] | Long-running index daemon: poll/webhook-triggered refreshes plus optional continuous architecture-guardian snapshots and JSONL compliance artifacts
roam init | Guided onboarding: creates .roam/fitness.yaml, CI workflow, runs index, shows health
roam hooks [--install] [--uninstall] | Manage git hooks for automated roam index updates and health gates
roam doctor | Diagnose installation and environment: verify tree-sitter grammars, SQLite, git, and config health
roam reset [--hard] | Reset the roam index and cached data. --hard removes all .roam/ artifacts
roam clean [--all] | Remove stale or orphaned index entries without a full rebuild
roam understand | Full codebase briefing: tech stack, architecture, key abstractions, health, conventions, complexity overview, entry points
roam onboard | Alias for understand
roam tour [--write PATH] | Auto-generated onboarding guide: top symbols, reading order, entry points, language breakdown. --write saves to Markdown
roam describe [--write] [--force] [-o PATH] [--agent-prompt] | Auto-generate project description for AI agents. --write auto-detects your agent's config file. --agent-prompt returns a compact (<500 token) system prompt
roam agent-export [--format F] [--write] | Generate agent-context bundle from project analysis (AGENTS.md + provider-specific overlays)
roam minimap [--update] [-o FILE] [--init-notes] | Compact annotated codebase snapshot for agent config injection: stack, annotated directory tree, key symbols by PageRank, high fan-in symbols to avoid touching, hotspots, conventions. Sentinel-based in-place updates
roam config [--set-db-dir PATH] [--semantic-backend MODE] | Manage .roam/config.json (DB path, excludes, optional ONNX semantic settings)
roam map [-n N] [--full] [--budget N] | Project skeleton: files, languages, entry points, top symbols by PageRank. --budget caps output to N tokens
roam schema [--diff] [--version V] | JSON envelope schema versioning: view, diff, and validate output schemas
roam mcp [--list-tools] [--transport T] | Start MCP server (stdio/SSE/streamable-http), inspect available tools, and expose roam to coding agents
roam mcp-setup <platform> | Generate MCP config snippets for AI platforms: claude-code, cursor, windsurf, vscode, gemini-cli, codex-cli
roam ci-setup [--platform P] [--write] | Generate CI/CD pipeline config (GitHub Actions, GitLab CI, Azure Pipelines, Jenkins, Bitbucket) with SARIF + quality gates
roam adrs [--status S] [--limit N] | Discover Architecture Decision Records, link to affected code modules, show status and coverage

Daily Workflow

Command | Description
roam file <path> [--full] [--changed] [--deps-of PATH] | File skeleton: all definitions with signatures, cognitive load index, health score
roam symbol <name> [--full] | Symbol definition + callers + callees + metrics. Supports file:symbol disambiguation
roam context <symbol> [--task MODE] [--for-file PATH] | AI-optimized context: definition + callers + callees + files-to-read with line ranges
roam search <pattern> [--kind KIND] | Find symbols by name pattern, PageRank-ranked
roam grep <pattern> [-g glob] [-n N] | Text search annotated with enclosing symbol context
roam deps <path> [--full] | What a file imports and what imports it
roam trace <source> <target> [-k N] | Dependency paths with coupling strength and hub detection
roam impact <symbol> | Blast radius: what breaks if a symbol changes (Personalized PageRank weighted)
roam diff [--staged] [--full] [REV_RANGE] | Blast radius of uncommitted changes or a commit range
roam pr-risk [REV_RANGE] | PR risk score (0-100, multiplicative model) + structural spread + suggested reviewers
roam pr-diff [--staged] [--range R] [--format markdown] | Structural PR diff: metric deltas, edge analysis, symbol changes, footprint. Not text diff -- graph delta
roam api-changes [REV_RANGE] | API change classifier: breaking/non-breaking changes, severity, and affected contracts
roam semantic-diff [REV_RANGE] | Structural change summary: symbols added/removed/modified and changed call edges
roam test-gaps [REV_RANGE] | Changed-symbol test gap detection: what changed and what still lacks test coverage
roam affected [REV_RANGE] | Monorepo/package impact analysis: what components are affected by a change
roam attest [REV_RANGE] [--format markdown] [--sign] | Proof-carrying PR attestation: bundles blast radius, risk, breaking changes, fitness, budget, tests, effects into one verifiable artifact
roam annotate <symbol> <note> | Attach persistent notes to symbols (agentic memory across sessions)
roam annotations [--file F] [--symbol S] | View stored annotations
roam diagnose <symbol> [--depth N] | Root cause analysis: ranks suspects by z-score normalized risk
roam preflight <symbol|file> | Compound pre-change check: blast radius + tests + complexity + coupling + fitness
roam guard <symbol> | Compact sub-agent preflight bundle: definition, 1-hop callers/callees, test files, breaking-risk score, and layer signals
roam agent-plan --agents N | Decompose partitions into dependency-ordered agent tasks with merge sequencing and handoffs
roam agent-context --agent-id N [--agents M] | Generate per-agent execution context: write scope, read-only dependencies, and interface contracts
roam syntax-check [--changed] [PATHS...] | Tree-sitter syntax integrity check for changed files and multi-agent judge workflows
roam verify [--threshold N] | Pre-commit AI-code consistency check across naming, imports, error handling, and duplication signals
roam verify-imports [--file F] | Import hallucination firewall: validate all imports against indexed symbol table, suggest corrections via FTS5 fuzzy matching
roam triage list|add|stats|check | Security finding suppression workflow: manage .roam-suppressions.yml (SAFE/ACKNOWLEDGED/WONT-FIX status lifecycle)
roam safe-delete <symbol> | Safe deletion check: SAFE/REVIEW/UNSAFE verdict
roam test-map <name> | Map a symbol or file to its test coverage
roam adversarial [--staged] [--range R] | Adversarial architecture review: generates targeted challenges based on changes
roam plan [--staged] [--range R] [--agents N] | Agent work planner: decompose changes into sequenced, dependency-aware steps
roam closure <symbol> [--rename] [--delete] | Minimal-change synthesis: all files to touch for a safe rename/delete
roam mutate move|rename|add-call|extract | Graph-level code editing: move symbols, rename across codebase, add calls, extract functions. Dry-run by default
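
The roam impact entry mentions Personalized PageRank weighting. As a rough sketch of that technique, the code below runs power iteration with all restart mass on the changed symbol and lets rank flow from each symbol to its callers. The graph, the symbol names, the damping factor, and the dangling-node handling are assumptions for illustration, not Roam's scoring model.

```python
def personalized_pagerank(callers, seed, damping=0.85, iterations=50):
    """Power iteration for PageRank with every restart landing on `seed`.

    `callers` maps a symbol to the symbols that call it, so rank flows from
    the changed symbol towards the code that depends on it.
    """
    nodes = {seed} | set(callers) | {c for cs in callers.values() for c in cs}
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        nxt[seed] = 1.0 - damping  # restart mass always returns to the seed
        for n in nodes:
            targets = callers.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node (nothing calls it): hand its mass back to the seed.
                nxt[seed] += damping * rank[n]
        rank = nxt
    return rank


# Hypothetical call graph: key is called by each symbol in its list.
CALLERS = {
    "parse_config": ["load_app", "cli_main"],
    "load_app": ["cli_main", "test_load_app"],
    "format_date": ["render_page"],  # not connected to the seed
}

rank = personalized_pagerank(CALLERS, seed="parse_config")
for name in sorted(rank, key=rank.get, reverse=True):
    if rank[name] > 0:
        print(f"{name:15s} {rank[name]:.3f}")
```

Symbols that cannot be reached from the changed symbol keep a rank of zero, while callers reached through several paths (here `cli_main`) score higher than callers reached through one.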

Codebase Health

Command | Description
roam health [--no-framework] [--gate] | Composite health score (0-100): weighted geometric mean of tangle ratio, god components, bottlenecks, layer violations. --gate runs quality gate checks from .roam-gates.yml (exit 5 on failure)
roam smells [--file F] [--min-severity S] | Code smell detection: 15 deterministic detectors (brain methods, god classes, feature envy, shotgun surgery, data clumps, etc.) with per-file health scores
roam dashboard | Unified single-screen project status: health, hotspots, risks, ownership, and AI-rot indicators
roam vibe-check [--threshold N] | AI-rot auditor: 8-pattern taxonomy with composite risk score and prioritized findings
roam ai-readiness | 0-100 score for how well this codebase supports AI coding agents
roam ai-ratio [--since N] | Statistical estimate of AI-generated code ratio using commit-behavior signals
roam trends [--record] [--days N] [--metric M] | Historical metrics snapshots with sparklines and trend deltas
roam complexity [--bumpy-road] | Per-function cognitive complexity (SonarSource-compatible, triangular nesting penalty) + Halstead metrics (volume, difficulty, effort, bugs) + cyclomatic density
roam algo [--task T] [--confidence C] [--profile P] | Algorithm anti-pattern detection: 23-pattern catalog detects suboptimal algorithms (O(n^2) loops, N+1 queries, quadratic string building, branching recursion, loop-invariant calls) and suggests better approaches with Big-O improvements. Confidence calibration via caller-count + runtime traces, evidence paths, impact scoring, framework-aware N+1 packs, and language-aware fix templates. Alias: roam math
roam n1 [--confidence C] [--verbose] | Implicit N+1 I/O detection: finds ORM model computed properties ($appends/accessors) that trigger lazy-loaded DB queries in collection contexts. Cross-references with eager loading config. Supports Laravel, Django, Rails, SQLAlchemy, JPA
roam over-fetch [--threshold N] [--confidence C] | Detect models serializing too many fields: large $fillable without $hidden/$visible, direct controller returns bypassing API Resources, poor exposed-to-hidden ratio
roam missing-index [--table T] [--confidence C] | Find queries on non-indexed columns: cross-references WHERE/ORDER BY clauses, foreign keys, and paginated queries against migration-defined indexes
roam weather [-n N] | Hotspots ranked by geometric mean of churn x complexity (percentile-normalized)
roam debt [--roi] | Hotspot-weighted tech debt prioritization with SQALE remediation costs and optional refactoring ROI estimates
roam fitness [--explain] | Architectural fitness functions from .roam/fitness.yaml
roam alerts | Health degradation trend detection (Mann-Kendall + Sen's slope)
roam forecast [--symbol S] [--horizon N] [--alert-only] | Predict when metrics will exceed thresholds: Theil-Sen regression on snapshot history + churn-weighted per-symbol risk
roam budget [--init] [--staged] [--range R] | Architectural budget enforcement: per-PR delta limits on health, cycles, complexity. CI gate (exit 5 on violation)
roam bisect [--metric M] [--range R] | Architectural git bisect: find the commit that degraded a specific metric
roam ingest-trace <file> [--otel|--jaeger|--zipkin|--generic] | Ingest runtime trace data (OpenTelemetry, Jaeger, Zipkin) for hotspot overlay
roam hotspots [--runtime] [--discrepancy] | Runtime hotspot analysis: find symbols missed by static analysis but critical at runtime
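
The roam alerts entry names two standard statistics, the Mann-Kendall trend test and Sen's slope. The sketch below implements the textbook versions (without tie correction) on a made-up series of health snapshots; the data and the 1.96 significance cut-off are assumptions for illustration, and Roam's own thresholds are not documented here.

```python
import math
from statistics import median


def mann_kendall_z(series):
    """Mann-Kendall Z statistic (no tie correction); negative means a downward trend."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def sens_slope(series):
    """Median of all pairwise slopes, which is robust to single outliers."""
    n = len(series)
    return median(
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    )


# Hypothetical health scores from ten consecutive snapshots.
health = [82, 81, 79, 78, 76, 75, 73, 72, 70, 69]
z = mann_kendall_z(health)
slope = sens_slope(health)
print(f"z={z:.2f} slope={slope:.2f} degrading={z < -1.96}")
```

Mann-Kendall answers whether a monotonic trend exists at all, and Sen's slope estimates how many points per snapshot the metric is moving.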

roam algo -- algorithm anti-pattern catalog (23 patterns)

roam algo scans every indexed function against a 23-pattern catalog, ranks findings by runtime-aware impact score, and shows the exact Big-O improvement available. Findings include semantic evidence paths, precision metadata, and language-aware tips/fixes (Python, JS, Go, Rust, Java, etc.):

$ roam algo
VERDICT: 8 algorithmic improvements found (3 high, 4 medium, 1 low)
Ordering: highest impact first
Profile: balanced (filtered 0 low-signal findings)

Nested loop lookup (2):
  fn resolve_permissions  src/auth/rbac.py:112  [high, impact=86.4]
    Current: Nested iteration -- O(n*m)
    Better:  Hash-map join -- O(n+m)
    Tip:     Build a dict/set from one collection, iterate the other

  fn find_matching_rule  src/rules/engine.py:67  [high, impact=78.1]
    Current: Nested iteration -- O(n*m)
    Better:  Hash-map join -- O(n+m)
    Tip:     Build a dict/set from one collection, iterate the other

String building (1):
  meth build_query  src/db/query.py:88  [high, impact=74.0]
    Current: Loop concatenation -- O(n^2)
    Better:  Join / StringBuilder -- O(n)
    Tip:     Collect parts in a list, join once at the end

Branching recursion without memoization (1):
  fn compute_cost  src/pricing/calc.py:34  [medium, impact=49.5]
    Current: Naive branching recursion -- O(2^n)
    Better:  Memoized / iterative DP -- O(n)
    Tip:     Add @cache / @lru_cache, or convert to iterative with a table

Full catalog — 23 patterns:

Pattern | Anti-pattern detected | Better approach | Improvement
Nested loop lookup | for x in a: for y in b: if x==y | Hash-map join | O(n·m) → O(n+m)
Membership test | if x in list in a loop | Set lookup | O(n) → O(1) per check
Sorting | Bubble / selection sort | Built-in sort | O(n²) → O(n log n)
Search in sorted data | Linear scan on sorted sequence | Binary search | O(n) → O(log n)
String building | s += chunk in loop | join() / StringBuilder | O(n²) → O(n)
Deduplication | Nested loop dedup | set() / dict.fromkeys | O(n²) → O(n)
Max / min | Manual tracking loop | max() / min() | idiom
Accumulation | Manual accumulator | sum() / reduce() | idiom
Group by key | Manual key-existence check | defaultdict / groupingBy | idiom
Fibonacci | Naive recursion | Iterative / @lru_cache | O(2ⁿ) → O(n)
Exponentiation | Loop multiplication | pow(b, e, mod) | O(n) → O(log n)
GCD | Manual loop | math.gcd() | O(n) → O(log n)
Matrix multiply | Naive triple loop | NumPy / BLAS | same asymptotic, ~1000× faster via SIMD
Busy wait | while True: sleep() poll | Event / condition variable | O(k) → O(1) wake-up
Regex in loop | re.match() compiled per iteration | Pre-compiled pattern | O(n·(p+m)) → O(p + n·m)
N+1 query | Per-item DB / API call in loop | Batch WHERE IN (...) | n round-trips → 1
List front operations | list.insert(0, x) in loop | collections.deque | O(n) → O(1) per op
Sort to select | sorted(x)[0] or sorted(x)[:k] | min() / heapq.nsmallest | O(n log n) → O(n) or O(n log k)
Repeated lookup | .index() / .contains() inside loop | Pre-built set / dict | O(m) → O(1) per lookup
Branching recursion | Naive f(n-1) + f(n-2) without cache | @cache / iterative DP | O(2ⁿ) → O(n)
Quadratic string building | result += chunk across multiple scopes | parts.append + join at end | O(n²) → O(n)
Loop-invariant call | get_config() / compile_schema() inside loop body | Hoist before loop | per-iter cost → O(1)
String reversal | Manual char-by-char loop | s[::-1] / .reverse() | idiom
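
To show what the first catalog entry (nested loop lookup, fixed with a hash-map join) means in practice, here is a small before-and-after pair. The data shapes and function names are made up for illustration; this is the general technique, not output generated by roam algo.

```python
def match_nested(users, orders):
    """O(n*m): scans every order for every user."""
    pairs = []
    for user in users:
        for order in orders:
            if order["user_id"] == user["id"]:
                pairs.append((user["name"], order["total"]))
    return pairs


def match_hashed(users, orders):
    """O(n+m): index orders by user_id once, then look each user up."""
    by_user = {}
    for order in orders:
        by_user.setdefault(order["user_id"], []).append(order["total"])
    return [
        (user["name"], total)
        for user in users
        for total in by_user.get(user["id"], [])
    ]


users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}, {"id": 3, "name": "sam"}]
orders = [
    {"user_id": 2, "total": 40},
    {"user_id": 1, "total": 15},
    {"user_id": 2, "total": 5},
]

print(match_hashed(users, orders))
```

Both functions return the same pairs in the same order; the second replaces the inner scan with a dictionary lookup, which is the "build a dict/set from one collection, iterate the other" tip from the sample output.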

Filtering:

roam algo --task nested-lookup       # one pattern type only
roam algo --confidence high          # high-confidence findings only
roam algo --profile strict           # precision-first filtering
roam algo --task io-in-loop -n 5    # top 5 N+1 query sites
roam --json algo                     # machine-readable output
roam --sarif algo > roam-algo.sarif  # SARIF with fingerprints + fixes

Confidence calibration: high = strong structural signal (unbounded loop + high caller/runtime impact + pattern confirmed); medium = pattern matched but uncertainty remains; low = heuristic signal only.

Profiles: balanced (default), strict (precision-first), aggressive (surface more candidates).

roam minimap — annotated codebase snapshot for agent configs

roam minimap generates a compact block (stack, annotated directory tree, key symbols, hotspots, conventions) wrapped in sentinel comments for in-place agent config updates:

$ roam minimap
<!-- roam:minimap generated=2026-02-25 -->
**Stack:** Python · JavaScript · YAML

.github/                      (CI + Action)
benchmarks/                   (agent-eval + oss-eval)
src/
  roam/
    bridges/
      base.py                 # LanguageBridge
      registry.py             # register_bridge, detect_bridges
    commands/                 (137 cmd files)  # is_test_file, get_changed_files
    db/
      connection.py           # find_project_root, batched_in
      schema.py
    graph/
      builder.py              # build_symbol_graph, build_file_graph
      pagerank.py             # compute_pagerank, compute_centrality
    languages/                (21 files)  # ApexExtractor
    output/
      formatter.py            # to_json, json_envelope
    cli.py                    # cli, LazyGroup
    mcp_server.py
tests/                        (186 files)

Key symbols (PageRank): open_db · ensure_index · json_envelope · to_json · LanguageExtractor

Touch carefully (fan-in >= 15): to_json (116 callers) · json_envelope (116 callers) · open_db (105 callers) · ensure_index (100 callers)

Hotspots (churn x complexity): cmd_context.py · csharp_lang.py · cmd_dead.py

Conventions: snake_case fns, PascalCase classes


<p><strong>Workflow:</strong></p>
<pre><code>roam minimap                    # print to stdout
roam minimap --update           # replace sentinel block in CLAUDE.md in-place
roam minimap -o docs/AGENTS.md  # target a different file
roam minimap --init-notes       # scaffold .roam/minimap-notes.md for project gotchas
</code></pre>
<p>The sentinel pair <code>&lt;!-- roam:minimap --&gt;</code> / <code>&lt;!-- /roam:minimap --&gt;</code> is replaced on each run — surrounding content is left intact. Add project-specific gotchas to <code>.roam/minimap-notes.md</code> and they appear in every subsequent output.</p>
<p><strong>Tree annotations</strong> come from the top exported symbols by fan-in per file. Non-source root directories (<code>.github/</code>, <code>benchmarks/</code>, <code>docs/</code>) are collapsed immediately. Large subdirectories (e.g. <code>commands/</code>, <code>languages/</code>) are collapsed at depth 2+ with a file count.</p>
</details>

<h3>Architecture</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam clusters [--min-size N]</code></td>
<td>Community detection vs directory structure. Modularity Q-score (Newman 2004) + per-cluster conductance</td>
</tr>
<tr>
<td><code>roam spectral [--depth N] [--compare] [--gap-only] [--k K]</code></td>
<td>Spectral bisection: Fiedler vector partition tree with algebraic connectivity gap verdict</td>
</tr>
<tr>
<td><code>roam layers</code></td>
<td>Topological dependency layers + upward violations + Gini balance</td>
</tr>
<tr>
<td><code>roam dead [--all] [--summary] [--clusters]</code></td>
<td>Unreferenced exported symbols with safety verdicts + confidence scoring (60-95%)</td>
</tr>
<tr>
<td><code>roam flag-dead [--config FILE] [--include-tests]</code></td>
<td>Feature flag dead code detection: stale LaunchDarkly/Unleash/Split/custom flags with staleness analysis</td>
</tr>
<tr>
<td><code>roam fan [symbol|file] [-n N] [--no-framework]</code></td>
<td>Fan-in/fan-out: most connected symbols or files</td>
</tr>
<tr>
<td><code>roam risk [-n N] [--domain KW] [--explain]</code></td>
<td>Domain-weighted risk ranking</td>
</tr>
<tr>
<td><code>roam why &lt;name&gt; [name2 ...]</code></td>
<td>Role classification (Hub/Bridge/Core/Leaf), reach, criticality</td>
</tr>
<tr>
<td><code>roam split &lt;file&gt;</code></td>
<td>Internal symbol groups with isolation % and extraction suggestions</td>
</tr>
<tr>
<td><code>roam entry-points</code></td>
<td>Entry point catalog with protocol classification</td>
</tr>
<tr>
<td><code>roam patterns</code></td>
<td>Architectural pattern recognition: Strategy, Factory, Observer, etc.</td>
</tr>
<tr>
<td><code>roam visualize [--format mermaid|dot] [--focus NAME] [--limit N]</code></td>
<td>Generate Mermaid or DOT architecture diagrams. Smart filtering via PageRank, cluster grouping, cycle highlighting</td>
</tr>
<tr>
<td><code>roam effects [TARGET] [--file F] [--type T]</code></td>
<td>Side-effect classification: DB writes, network I/O, filesystem, global mutation. Direct + transitive effects through call graph</td>
</tr>
<tr>
<td><code>roam dark-matter [--min-cochanges N]</code></td>
<td>Detect hidden co-change couplings not explained by import/call edges</td>
</tr>
<tr>
<td><code>roam simulate move|extract|merge|delete</code></td>
<td>Counterfactual architecture simulator: test refactoring ideas in-memory, see metric deltas before writing code</td>
</tr>
<tr>
<td><code>roam orchestrate --agents N [--files P]</code></td>
<td>Multi-agent swarm partitioning: split codebase for parallel agents with zero-conflict guarantees</td>
</tr>
<tr>
<td><code>roam partition [--agents N]</code></td>
<td>Multi-agent partition manifest: conflict risk, complexity, and suggested ownership splits</td>
</tr>
<tr>
<td><code>roam fingerprint [--compact] [--compare F]</code></td>
<td>Topology fingerprint: extract/compare architectural signatures across repos</td>
</tr>
<tr>
<td><code>roam cut &lt;target&gt; [--depth N]</code></td>
<td>Minimum graph cuts: find critical edges whose removal disconnects components</td>
</tr>
<tr>
<td><code>roam safe-zones</code></td>
<td>Graph-based containment boundaries</td>
</tr>
<tr>
<td><code>roam coverage-gaps</code></td>
<td>Unprotected entry points with no path to gate symbols</td>
</tr>
<tr>
<td><code>roam duplicates [--threshold T] [--min-lines N]</code></td>
<td>Semantic duplicate detector: functionally equivalent code clusters with divergent edge-case handling</td>
</tr>
<tr>
<td><code>roam clones [--threshold T] [--min-lines N] [--scope P]</code></td>
<td>AST structural clone detection: Type-2 clones via subtree hashing (more precise than <code>duplicates</code>)</td>
</tr>
</tbody></table>
<h3>Exploration</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam module &lt;path&gt;</code></td>
<td>Directory contents: exports, signatures, dependencies, cohesion</td>
</tr>
<tr>
<td><code>roam sketch &lt;dir&gt; [--full]</code></td>
<td>Compact structural skeleton of a directory</td>
</tr>
<tr>
<td><code>roam uses &lt;name&gt;</code></td>
<td>All consumers: callers, importers, inheritors</td>
</tr>
<tr>
<td><code>roam owner &lt;path&gt;</code></td>
<td>Code ownership: who owns a file or directory</td>
</tr>
<tr>
<td><code>roam coupling [-n N] [--set]</code></td>
<td>Temporal coupling: file pairs that change together (NPMI + lift)</td>
</tr>
<tr>
<td><code>roam fn-coupling</code></td>
<td>Function-level temporal coupling across files</td>
</tr>
<tr>
<td><code>roam bus-factor [--brain-methods]</code></td>
<td>Knowledge loss risk per module</td>
</tr>
<tr>
<td><code>roam doc-staleness</code></td>
<td>Detect stale docstrings</td>
</tr>
<tr>
<td><code>roam docs-coverage</code></td>
<td>Public-symbol doc coverage + stale docs + PageRank-ranked missing-doc hotlist</td>
</tr>
<tr>
<td><code>roam suggest-refactoring [--limit N] [--min-score N]</code></td>
<td>Proactive refactoring recommendations ranked by complexity, coupling, churn, smells, coverage gaps, and debt</td>
</tr>
<tr>
<td><code>roam plan-refactor &lt;symbol&gt; [--operation auto|extract|move]</code></td>
<td>Ordered refactor plan with blast radius, test gaps, layer risk, and simulation-based strategy preview</td>
</tr>
<tr>
<td><code>roam test-scaffold &lt;name|file&gt; [--write] [--framework F]</code></td>
<td>Generate test file/function/import skeletons from symbol data (pytest, jest, Go, JUnit, RSpec)</td>
</tr>
<tr>
<td><code>roam conventions</code></td>
<td>Auto-detects naming styles and import preferences; flags outliers</td>
</tr>
<tr>
<td><code>roam breaking [REV_RANGE]</code></td>
<td>Breaking change detection: removed exports, signature changes</td>
</tr>
<tr>
<td><code>roam affected-tests &lt;symbol|file&gt;</code></td>
<td>Trace reverse call graph to test files</td>
</tr>
<tr>
<td><code>roam relate &lt;sym1&gt; &lt;sym2&gt;</code></td>
<td>Show relationship between two symbols: shared callers, shortest path, common ancestors</td>
</tr>
<tr>
<td><code>roam endpoints [--routes] [--api]</code></td>
<td>Enumerate all HTTP/API endpoint definitions and surface them for review or cross-repo matching</td>
</tr>
<tr>
<td><code>roam metrics &lt;file|symbol&gt;</code></td>
<td>Unified vital signs: complexity, fan-in/out, PageRank, churn, test coverage, dead code risk -- all in one call</td>
</tr>
<tr>
<td><code>roam search-semantic &lt;query&gt;</code></td>
<td>Hybrid semantic search: BM25 + TF-IDF + optional local ONNX vectors (select via <code>--backend</code>) with framework/library packs</td>
</tr>
<tr>
<td><code>roam intent [--staged] [--range R]</code></td>
<td>Doc-to-code linking: match documentation to symbols, detect drift</td>
</tr>
<tr>
<td><code>roam x-lang [--bridges] [--edges]</code></td>
<td>Cross-language edge browser: inspect bridge-resolved connections</td>
</tr>
</tbody></table>
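<p>The NPMI and lift scores mentioned for <code>roam coupling</code> can be illustrated with their standard definitions: lift is P(a,b) / (P(a)·P(b)) and NPMI is ln(lift) / -ln P(a,b). The sketch below assumes probabilities are estimated from plain commit counts; how Roam actually windows or weights commits is not stated here, and <code>coupling_scores</code> is a hypothetical helper.</p>

```python
import math


def coupling_scores(commits_a, commits_b, commits_both, total_commits):
    """Return (lift, npmi) for two files given commit counts."""
    p_a = commits_a / total_commits
    p_b = commits_b / total_commits
    p_ab = commits_both / total_commits
    if p_ab == 0:
        return 0.0, -1.0  # the files never changed together
    lift = p_ab / (p_a * p_b)
    if p_ab == 1.0:
        return lift, 1.0  # every commit touches both files
    npmi = math.log(lift) / -math.log(p_ab)
    return lift, npmi


# File A changed in 20 of 100 commits, file B in 10, both together in 10.
lift, npmi = coupling_scores(20, 10, 10, 100)
print(f"lift={lift:.2f} npmi={npmi:.2f}")
```

<p>Lift above 1 means the pair co-changes more often than chance would predict. NPMI rescales the same signal to the range -1 to 1, which makes pairs with different commit volumes comparable.</p>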
<h3>Reports &amp; CI</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam report [--list] [--config FILE] [PRESET]</code></td>
<td>Compound presets: <code>first-contact</code>, <code>security</code>, <code>pre-pr</code>, <code>refactor</code>, <code>guardian</code></td>
</tr>
<tr>
<td><code>roam describe --write</code></td>
<td>Generate agent config (auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.)</td>
</tr>
<tr>
<td><code>roam auth-gaps [--routes-only] [--controllers-only] [--min-confidence C]</code></td>
<td>Find endpoints missing authentication or authorization: routes outside auth middleware groups, CRUD methods without <code>$this-&gt;authorize()</code> / <code>Gate::allows()</code> checks. String-aware PHP brace parsing</td>
</tr>
<tr>
<td><code>roam orphan-routes [-n N] [--confidence C]</code></td>
<td>Detect backend routes with no frontend consumer: parses route definitions, searches frontend for API call references, reports controller methods with no route mapping</td>
</tr>
<tr>
<td><code>roam migration-safety [-n N] [--include-archive]</code></td>
<td>Detect non-idempotent migrations: missing <code>hasTable</code>/<code>hasColumn</code> guards, raw SQL without <code>IF NOT EXISTS</code>, index operations without existence checks</td>
</tr>
<tr>
<td><code>roam api-drift [--model M] [--confidence C]</code></td>
<td>Detect mismatches between PHP model <code>$fillable</code>/<code>$appends</code> fields and TypeScript interface properties. Auto-converts snake_case/camelCase for comparison. Single-repo; cross-repo planned for <code>roam ws api-drift</code></td>
</tr>
<tr>
<td><code>roam codeowners [--unowned] [--owner NAME]</code></td>
<td>CODEOWNERS coverage analysis: owned/unowned files, top owners, and ownership risk</td>
</tr>
<tr>
<td><code>roam drift [--threshold N]</code></td>
<td>Ownership drift detection: declared ownership vs observed maintenance activity</td>
</tr>
<tr>
<td><code>roam suggest-reviewers [REV_RANGE]</code></td>
<td>Reviewer recommendation via ownership, recency, breadth, and impact signals</td>
</tr>
<tr>
<td><code>roam simulate-departure &lt;developer&gt;</code></td>
<td>Knowledge-loss simulation: what breaks if a key contributor leaves</td>
</tr>
<tr>
<td><code>roam dev-profile [--developer NAME] [--since N]</code></td>
<td>Developer productivity profile: commit patterns, specialization, impact, and knowledge concentration per contributor</td>
</tr>
<tr>
<td><code>roam secrets [--fail-on-found] [--include-tests]</code></td>
<td>Secret scanning with masking, entropy detection, env-var suppression, remediation suggestions, and optional CI gate failure</td>
</tr>
<tr>
<td><code>roam vulns [--import-file F] [--reachable-only]</code></td>
<td>Vulnerability scanning: ingest npm/pip/trivy/osv reports, auto-detect format, reachability filtering, SARIF output</td>
</tr>
<tr>
<td><code>roam path-coverage [--from P] [--to P] [--max-depth N]</code></td>
<td>Find critical call paths (entry -&gt; sink) with zero test protection. Suggests optimal test insertion points</td>
</tr>
<tr>
<td><code>roam capsule [--redact-paths] [--no-signatures] [--output F]</code></td>
<td>Export sanitized structural graph (no code bodies) for external architectural review</td>
</tr>
<tr>
<td><code>roam rules [--init] [--ci] [--rules-dir D]</code></td>
<td>Plugin DSL for governance: user-defined path/symbol/AST rules via <code>.roam/rules/</code> YAML (<code>$METAVAR</code> captures supported)</td>
</tr>
<tr>
<td><code>roam check-rules [--severity S] [--fix]</code></td>
<td>Evaluate built-in and user-defined governance rules (10 built-in: no-circular-imports, max-fan-out, etc.)</td>
</tr>
<tr>
<td><code>roam vuln-map --generic|--npm-audit|--trivy F</code></td>
<td>Ingest vulnerability reports and match to codebase symbols</td>
</tr>
<tr>
<td><code>roam vuln-reach [--cve C] [--from E]</code></td>
<td>Vulnerability reachability: exact paths from entry points to vulnerable calls</td>
</tr>
<tr>
<td><code>roam supply-chain [--top N]</code></td>
<td>Dependency risk dashboard: pin coverage, risk scoring, supply-chain health</td>
</tr>
<tr>
<td><code>roam sbom [--format cyclonedx|spdx] [--no-reachability] [-o FILE]</code></td>
<td>SBOM generation (CycloneDX 1.5 / SPDX 2.3) enriched with call-graph reachability per dependency</td>
</tr>
<tr>
<td><code>roam congestion [--window N] [--min-authors N]</code></td>
<td>Developer congestion detection: concurrent authors per file, coordination risk scoring</td>
</tr>
<tr>
<td><code>roam invariants [--staged] [--range R]</code></td>
<td>Discover architectural contracts (invariants) from the codebase structure</td>
</tr>
</tbody></table>
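<p>The snake_case/camelCase normalization described for <code>roam api-drift</code> can be sketched as a set comparison after converting both sides to one convention. The helper names and sample fields below are hypothetical; the real command parses PHP models and TypeScript interfaces, which this sketch does not attempt.</p>

```python
def snake_to_camel(name):
    """Convert snake_case to camelCase; camelCase input passes through."""
    head, *rest = name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def field_drift(backend_fields, frontend_props):
    """Return (missing_in_frontend, missing_in_backend) after normalizing case."""
    backend = {snake_to_camel(f) for f in backend_fields}
    frontend = {snake_to_camel(p) for p in frontend_props}
    return sorted(backend - frontend), sorted(frontend - backend)


# Hypothetical model fields and interface properties.
missing_front, missing_back = field_drift(
    ["first_name", "email", "created_at"],
    ["firstName", "email", "avatarUrl"],
)
print(missing_front, missing_back)
```

<p>Normalizing before comparing keeps pure naming-convention differences from being reported as drift, so only fields that are truly absent on one side remain.</p>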
<h3>Multi-Repo Workspace</h3>
<table>
<thead>
<tr>
<th>Command</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam ws init &lt;repo1&gt; &lt;repo2&gt; [--name NAME]</code></td>
<td>Initialize a workspace from sibling repos. Auto-detects frontend/backend roles</td>
</tr>
<tr>
<td><code>roam ws status</code></td>
<td>Show workspace repos, index ages, cross-repo edge count</td>
</tr>
<tr>
<td><code>roam ws resolve</code></td>
<td>Scan for REST API endpoints and match frontend calls to backend routes</td>
</tr>
<tr>
<td><code>roam ws understand</code></td>
<td>Unified workspace overview: per-repo stats + cross-repo connections</td>
</tr>
<tr>
<td><code>roam ws health</code></td>
<td>Workspace-wide health report with cross-repo coupling assessment</td>
</tr>
<tr>
<td><code>roam ws context &lt;symbol&gt;</code></td>
<td>Cross-repo augmented context: find a symbol across repos + show API callers</td>
</tr>
<tr>
<td><code>roam ws trace &lt;source&gt; &lt;target&gt;</code></td>
<td>Trace cross-repo paths via API edges</td>
</tr>
</tbody></table>
<h3>Global Options</h3>
<table>
<thead>
<tr>
<th>Option</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam --json &lt;command&gt;</code></td>
<td>Structured JSON output with consistent envelope</td>
</tr>
<tr>
<td><code>roam --compact &lt;command&gt;</code></td>
<td>Token-efficient output: TSV tables, minimal JSON envelope</td>
</tr>
<tr>
<td><code>roam --sarif &lt;command&gt;</code></td>
<td>SARIF 2.1.0 output for dead, health, complexity, rules, secrets, and algo (GitHub/CI integration)</td>
</tr>
<tr>
<td><code>roam health --gate</code></td>
<td>CI quality gate. Reads <code>.roam-gates.yml</code> thresholds. Exit code 5 on failure</td>
</tr>
</tbody></table>
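<p>For scripts that drive Roam, the global options above might be wrapped as follows. This is a sketch under stated assumptions: the shape of the JSON envelope is not described here, so the parsed result is returned as-is, and treating exit code 0 as a pass and any code other than 0 or 5 as an error is a conventional guess. All helper names are hypothetical.</p>

```python
import json
import subprocess

GATE_FAILURE_EXIT_CODE = 5  # documented for `roam health --gate`


def roam_argv(command, *args, as_json=False):
    """Build an argument list; global options precede the subcommand."""
    argv = ["roam"]
    if as_json:
        argv.append("--json")
    return argv + [command, *args]


def run_roam_json(command, *args):
    """Run a roam command and parse its JSON output (needs roam on PATH)."""
    result = subprocess.run(
        roam_argv(command, *args, as_json=True),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def classify_gate(returncode):
    """Map the exit code of `roam health --gate` to a coarse outcome."""
    if returncode == 0:
        return "pass"  # assumption: 0 means every threshold was met
    if returncode == GATE_FAILURE_EXIT_CODE:
        return "gate-failed"
    return "error"  # assumption: any other code is a tool or usage error


print(roam_argv("health", "--gate"))
```

<p>Separating a failed gate from other non-zero exits lets a CI job report "quality threshold missed" differently from "Roam could not run".</p>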
</details>

<h2>Walkthrough: Investigating a Codebase</h2>
<details>
<summary><strong>10-step walkthrough using Flask as an example</strong> (click to expand)</summary>

<p>Here&#39;s how you&#39;d use Roam to understand a project you&#39;ve never seen before, using Flask as an example:</p>
<p><strong>Step 1: Onboard and get the full picture</strong></p>
<pre><code>$ roam init
Created .roam/fitness.yaml (6 starter rules)
Created .github/workflows/roam.yml
Done. 226 files, 1132 symbols, 233 edges.
Health: 78/100

$ roam understand
Tech stack: Python (flask, jinja2, werkzeug)
Architecture: Monolithic — 3 layers, 5 clusters
Key abstractions: Flask, Blueprint, Request, Response
Health: 78/100 — 1 god component (Flask)
Entry points: src/flask/__init__.py, src/flask/cli.py
Conventions: snake_case functions, PascalCase classes, relative imports
Complexity: avg 4.2, 3 high (&gt;15), 0 critical (&gt;25)
</code></pre>
<p><strong>Step 2: Drill into a key file</strong></p>
<pre><code>$ roam file src/flask/app.py
src/flask/app.py  (python, 963 lines)

  cls  Flask(App)                                   :76-963
    meth  __init__(self, import_name, ...)           :152
    meth  route(self, rule, **options)               :411
    meth  register_blueprint(self, blueprint, ...)   :580
    meth  make_response(self, rv)                    :742
    ...12 more methods
</code></pre>
<p><strong>Step 3: Who depends on this?</strong></p>
<pre><code>$ roam deps src/flask/app.py
Imported by:
file                        symbols
--------------------------  -------
src/flask/__init__.py       3
src/flask/testing.py        2
tests/test_basic.py         1
...18 files total
</code></pre>
<p><strong>Step 4: Find the hotspots</strong></p>
<pre><code>$ roam weather
=== Hotspots (churn x complexity) ===
Score  Churn  Complexity  Path                    Lang
-----  -----  ----------  ----------------------  ------
18420  460    40.0        src/flask/app.py        python
12180  348    35.0        src/flask/blueprints.py python
</code></pre>
<p><strong>Step 5: Check architecture health</strong></p>
<pre><code>$ roam health
Health: 78/100
  Tangle: 0.0% (0/1132 symbols in cycles)
  1 god component (Flask, degree 47, actionable)
  0 bottlenecks, 0 layer violations

=== God Components (degree &gt; 20) ===
Sev      Name   Kind  Degree  Cat  File
-------  -----  ----  ------  ---  ------------------
WARNING  Flask  cls   47      act  src/flask/app.py
</code></pre>
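<p>The tangle figure in this output is the share of symbols that sit on a dependency cycle. One way to compute such a number, sketched here with Kosaraju's strongly-connected-components algorithm over a hypothetical edge list, is shown below. Whether Roam uses this exact algorithm is an assumption.</p>

```python
from collections import defaultdict


def symbols_in_cycles(edges):
    """Return the set of nodes that sit on at least one directed cycle."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of DFS completion.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in seen:
                    seen.add(child)
                    stack.append((child, iter(graph[child])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in reverse completion order;
    # each walk collects one strongly connected component.
    cyclic, assigned = set(), set()
    for start in reversed(order):
        if start in assigned:
            continue
        component, stack = [], [start]
        assigned.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        if len(component) > 1 or start in graph[start]:
            cyclic.update(component)
    return cyclic


def tangle_percent(edges):
    """Percentage of nodes that belong to a cycle."""
    nodes = {n for edge in edges for n in edge}
    return 100.0 * len(symbols_in_cycles(edges)) / len(nodes) if nodes else 0.0
```

<p>A component with more than one member, or a single node with a self-edge, counts as cyclic. An acyclic graph therefore yields 0%.</p>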
<p><strong>Step 6: Get AI-ready context for a symbol</strong></p>
<pre><code>$ roam context Flask
Files to read:
  src/flask/app.py:76-963              # definition
  src/flask/__init__.py:1-15           # re-export
  src/flask/testing.py:22-45           # caller: FlaskClient.__init__
  tests/test_basic.py:12-30            # caller: test_app_factory
  ...12 more files

Callers: 47  Callees: 3
</code></pre>
<p><strong>Step 7: Pre-change safety check</strong></p>
<pre><code>$ roam preflight Flask
=== Preflight: Flask ===
Blast radius: 47 callers, 89 transitive
Affected tests: 31 (DIRECT: 12, TRANSITIVE: 19)
Complexity: cc=40 (critical), nesting=6
Coupling: 3 hidden co-change partners
Fitness: 1 violation (max-complexity exceeded)
Verdict: HIGH RISK — consider splitting before modifying
</code></pre>
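<p>The blast radius reported by preflight distinguishes direct callers from transitive ones. A minimal sketch of that idea is a breadth-first walk over the reversed call graph. The edge list and the <code>blast_radius</code> helper are hypothetical, and whether Roam's transitive count includes the direct callers is not stated here; this version keeps the two sets disjoint.</p>

```python
from collections import defaultdict, deque


def blast_radius(call_edges, target):
    """Return (direct_callers, transitive_callers) of `target`.

    call_edges is an iterable of (caller, callee) pairs.
    """
    callers_of = defaultdict(set)
    for caller, callee in call_edges:
        callers_of[callee].add(caller)

    direct = callers_of[target] - {target}  # ignore plain recursion
    seen, queue = set(direct), deque(direct)
    while queue:
        node = queue.popleft()
        for caller in callers_of[node]:
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return direct, seen - direct


# Hypothetical call graph: tests and a CLI reach make_response through view.
edges = [
    ("view", "make_response"),
    ("test_view", "view"),
    ("cli", "view"),
    ("make_response", "helper"),
]
direct, transitive = blast_radius(edges, "make_response")
print(sorted(direct), sorted(transitive))
```

<p>Filtering the resulting sets down to symbols defined in test files gives the same direct/transitive split for affected tests.</p>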
<p><strong>Step 8: Decompose a large file</strong></p>
<pre><code>$ roam split src/flask/app.py
=== Split analysis: src/flask/app.py ===
  87 symbols, 42 internal edges, 95 external edges
  Cross-group coupling: 18%

  Group 1 (routing) — 12 symbols, isolation: 83% [extractable]
    meth  route              L411  PR=0.0088
    meth  add_url_rule       L450  PR=0.0045
    ...

=== Extraction Suggestions ===
  Extract &#39;routing&#39; group: route, add_url_rule, endpoint (+9 more)
    83% isolated, only 3 edges to other groups
</code></pre>
<p><strong>Step 9: Understand why a symbol matters</strong></p>
<pre><code>$ roam why Flask url_for Blueprint
Symbol     Role          Fan         Reach     Risk      Verdict
---------  ------------  ----------  --------  --------  --------------------------------------------------
Flask      Hub           fan-in:47   reach:89  CRITICAL  God symbol (47 in, 12 out). Consider splitting.
url_for    Core utility  fan-in:31   reach:45  HIGH      Widely used utility (31 callers). Stable interface.
Blueprint  Bridge        fan-in:18   reach:34  moderate  Coupling point between clusters.
</code></pre>
<p><strong>Step 10: Generate docs and set up CI</strong></p>
<pre><code>$ roam describe --write
Wrote CLAUDE.md (98 lines)  # auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.

$ roam health --gate
Health: 78/100 — PASS
</code></pre>
<p>Ten steps. Complete picture: structure, dependencies, hotspots, health, context, safety checks, decomposition, and CI gates.</p>
</details>

<h2>Integration with AI Coding Tools</h2>
<p>Roam is designed to be called by coding agents via shell commands. Instead of repeatedly grepping and reading files, the agent runs one <code>roam</code> command and gets structured output.</p>
<p><strong>Decision order for agents:</strong></p>
<table>
<thead>
<tr>
<th>Situation</th>
<th>Command</th>
</tr>
</thead>
<tbody><tr>
<td>First time in a repo</td>
<td><code>roam understand</code> then <code>roam tour</code></td>
</tr>
<tr>
<td>Need to modify a symbol</td>
<td><code>roam preflight &lt;name&gt;</code> (blast radius + tests + fitness)</td>
</tr>
<tr>
<td>Debugging a failure</td>
<td><code>roam diagnose &lt;name&gt;</code> (root cause ranking)</td>
</tr>
<tr>
<td>Need files to read</td>
<td><code>roam context &lt;name&gt;</code> (files + line ranges)</td>
</tr>
<tr>
<td>Need to find a symbol</td>
<td><code>roam search &lt;pattern&gt;</code></td>
</tr>
<tr>
<td>Need file structure</td>
<td><code>roam file &lt;path&gt;</code></td>
</tr>
<tr>
<td>Pre-PR check</td>
<td><code>roam pr-risk HEAD~3..HEAD</code></td>
</tr>
<tr>
<td>What breaks if I change X?</td>
<td><code>roam impact &lt;symbol&gt;</code></td>
</tr>
<tr>
<td>Check for N+1 queries</td>
<td><code>roam n1</code> (implicit lazy-load detection)</td>
</tr>
<tr>
<td>Check auth coverage</td>
<td><code>roam auth-gaps</code> (routes + controllers)</td>
</tr>
<tr>
<td>Check migration safety</td>
<td><code>roam migration-safety</code> (idempotency guards)</td>
</tr>
</tbody></table>
<p><strong>Fastest setup:</strong></p>
<pre><code class="language-bash">roam describe --write               # auto-detects your agent&#39;s config file
roam describe --write -o AGENTS.md  # or specify an explicit path
roam describe --agent-prompt        # compact ~500-token prompt (append to any config)
roam minimap --update               # inject/refresh annotated codebase minimap in CLAUDE.md
</code></pre>
<p><strong>Agent not using Roam correctly?</strong> If your agent is ignoring Roam and falling back to grep/read exploration, it likely doesn&#39;t have the instructions. Run:</p>
<pre><code class="language-bash">roam describe --write          # writes instructions to your agent&#39;s config (CLAUDE.md, AGENTS.md, etc.)
</code></pre>
<p>If you already have a config file and don&#39;t want to overwrite it:</p>
<pre><code class="language-bash">roam describe --agent-prompt   # prints a compact prompt — copy-paste into your existing config
roam minimap --update          # injects an annotated codebase snapshot into CLAUDE.md (won&#39;t touch other content)
</code></pre>
<p>This teaches the agent which Roam command to use for each situation (e.g., <code>roam preflight</code> before changes, <code>roam context</code> for files to read, <code>roam diagnose</code> for debugging).</p>
<details>
<summary><strong>Copy-paste agent instructions</strong></summary>

<pre><code class="language-markdown">## Codebase navigation

This project uses `roam` for codebase comprehension. Always prefer roam over Glob/Grep/Read exploration.

Before modifying any code:
1. First time in the repo: `roam understand` then `roam tour`
2. Find a symbol: `roam search &lt;pattern&gt;`
3. Before changing a symbol: `roam preflight &lt;name&gt;` (blast radius + tests + fitness)
4. Need files to read: `roam context &lt;name&gt;` (files + line ranges, prioritized)
5. Debugging a failure: `roam diagnose &lt;name&gt;` (root cause ranking)
6. After making changes: `roam diff` (blast radius of uncommitted changes)

Additional: `roam health` (0-100 score), `roam impact &lt;name&gt;` (what breaks),
`roam pr-risk` (PR risk), `roam file &lt;path&gt;` (file skeleton).

Run `roam --help` for all commands. Use `roam --json &lt;cmd&gt;` for structured output.
</code></pre>
</details>

<details>
<summary><strong>Where to put this for each tool</strong></summary>

<table>
<thead>
<tr>
<th>Tool</th>
<th>Config file</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Claude Code</strong></td>
<td><code>CLAUDE.md</code> in your project root</td>
</tr>
<tr>
<td><strong>OpenAI Codex CLI</strong></td>
<td><code>AGENTS.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Gemini CLI</strong></td>
<td><code>GEMINI.md</code> in your project root</td>
</tr>
<tr>
<td><strong>Cursor</strong></td>
<td><code>.cursor/rules/roam.mdc</code> (add <code>alwaysApply: true</code> frontmatter)</td>
</tr>
<tr>
<td><strong>Windsurf</strong></td>
<td><code>.windsurf/rules/roam.md</code> (add <code>trigger: always_on</code> frontmatter)</td>
</tr>
<tr>
<td><strong>GitHub Copilot</strong></td>
<td><code>.github/copilot-instructions.md</code></td>
</tr>
<tr>
<td><strong>Aider</strong></td>
<td><code>CONVENTIONS.md</code></td>
</tr>
<tr>
<td><strong>Continue.dev</strong></td>
<td><code>config.yaml</code> rules</td>
</tr>
<tr>
<td><strong>Cline</strong></td>
<td><code>.clinerules/</code> directory</td>
</tr>
</tbody></table>
</details>

<details>
<summary><strong>Roam vs native tools</strong></summary>

<table>
<thead>
<tr>
<th>Task</th>
<th>Use Roam</th>
<th>Use native tools</th>
</tr>
</thead>
<tbody><tr>
<td>&quot;What calls this function?&quot;</td>
<td><code>roam symbol &lt;name&gt;</code></td>
<td>LSP / Grep</td>
</tr>
<tr>
<td>&quot;What files do I need to read?&quot;</td>
<td><code>roam context &lt;name&gt;</code></td>
<td>Manual tracing (5+ calls)</td>
</tr>
<tr>
<td>&quot;Is it safe to change X?&quot;</td>
<td><code>roam preflight &lt;name&gt;</code></td>
<td>Multiple manual checks</td>
</tr>
<tr>
<td>&quot;Show me this file&#39;s structure&quot;</td>
<td><code>roam file &lt;path&gt;</code></td>
<td>Read the file directly</td>
</tr>
<tr>
<td>&quot;Understand project architecture&quot;</td>
<td><code>roam understand</code></td>
<td>Manual exploration</td>
</tr>
<tr>
<td>&quot;What breaks if I change X?&quot;</td>
<td><code>roam impact &lt;symbol&gt;</code></td>
<td>No direct equivalent</td>
</tr>
<tr>
<td>&quot;What tests to run?&quot;</td>
<td><code>roam affected-tests &lt;name&gt;</code></td>
<td>Grep for imports (misses indirect)</td>
</tr>
<tr>
<td>&quot;What&#39;s causing this bug?&quot;</td>
<td><code>roam diagnose &lt;name&gt;</code></td>
<td>Manual call-chain tracing</td>
</tr>
<tr>
<td>&quot;Codebase health score for CI&quot;</td>
<td><code>roam health --gate</code></td>
<td>No equivalent</td>
</tr>
</tbody></table>
</details>

<h2>MCP Server</h2>
<p>Roam includes a <a href="https://modelcontextprotocol.io/">Model Context Protocol</a> server for direct integration with tools that support MCP.</p>
<pre><code class="language-bash">pip install &quot;roam-code[mcp]&quot;
roam mcp
</code></pre>
<p>102 tools, 10 resources, and 5 prompts are available in the full preset. Most tools are read-only index queries; side-effect tools are explicitly annotated.</p>
<p><strong>MCP v2 highlights (v11):</strong></p>
<ul>
<li>In-process MCP execution (no subprocess shell-out per call)</li>
<li>Preset-based tool surfacing (<code>core</code>, <code>review</code>, <code>refactor</code>, <code>debug</code>, <code>architecture</code>, <code>full</code>)</li>
<li>Compound tools that collapse multi-step exploration/review flows into one call</li>
<li>Structured output schemas + tool annotations for safer planner behavior</li>
</ul>
<p><strong>Default preset:</strong> <code>core</code> (24 tools: 23 core + <code>roam_expand_toolset</code> meta-tool).</p>
<pre><code class="language-bash"># Default
roam mcp

# Full toolset
ROAM_MCP_PRESET=full roam mcp

# Legacy compatibility (same as full preset)
ROAM_MCP_LITE=0 roam mcp
</code></pre>
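<p>When the server is launched from a script instead of a shell, the preset can be passed through the process environment. The sketch below assumes that every preset name listed above is accepted by <code>ROAM_MCP_PRESET</code> in the same way as <code>full</code>; only <code>full</code> is shown explicitly in the commands above. <code>mcp_environment</code> is a hypothetical helper.</p>

```python
import os

# Preset names as listed in the MCP v2 highlights.
PRESETS = ("core", "review", "refactor", "debug", "architecture", "full")


def mcp_environment(preset="core", base_env=None):
    """Return a copy of the environment with ROAM_MCP_PRESET set."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {PRESETS}")
    env = dict(os.environ if base_env is None else base_env)
    env["ROAM_MCP_PRESET"] = preset
    return env


# Usage (requires roam installed with the mcp extra):
#   import subprocess
#   subprocess.run(["roam", "mcp"], env=mcp_environment("full"))
```

<p>Validating the name before launch surfaces a typo immediately instead of leaving the server to fall back to, or reject, an unknown preset.</p>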
<p>Core preset tools: <code>roam_affected_tests</code>, <code>roam_batch_get</code>, <code>roam_batch_search</code>, <code>roam_complexity_report</code>, <code>roam_context</code>, <code>roam_dead_code</code>, <code>roam_deps</code>, <code>roam_diagnose</code>, <code>roam_diagnose_issue</code>, <code>roam_diff</code>, <code>roam_expand_toolset</code>, <code>roam_explore</code>, <code>roam_file_info</code>, <code>roam_health</code>, <code>roam_impact</code>, <code>roam_pr_risk</code>, <code>roam_preflight</code>, <code>roam_prepare_change</code>, <code>roam_review_change</code>, <code>roam_search_symbol</code>, <code>roam_syntax_check</code>, <code>roam_trace</code>, <code>roam_understand</code>, <code>roam_uses</code>.</p>
<details>
<summary><strong>MCP tool list (all 102)</strong></summary>

<table>
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam_understand</code></td>
<td>Full codebase briefing</td>
</tr>
<tr>
<td><code>roam_health</code></td>
<td>Health score (0-100) + issues</td>
</tr>
<tr>
<td><code>roam_preflight</code></td>
<td>Pre-change safety check</td>
</tr>
<tr>
<td><code>roam_search_symbol</code></td>
<td>Find symbols by name</td>
</tr>
<tr>
<td><code>roam_context</code></td>
<td>Files-to-read for modifying a symbol</td>
</tr>
<tr>
<td><code>roam_trace</code></td>
<td>Dependency path between two symbols</td>
</tr>
<tr>
<td><code>roam_impact</code></td>
<td>Blast radius of changing a symbol</td>
</tr>
<tr>
<td><code>roam_file_info</code></td>
<td>File skeleton with all definitions</td>
</tr>
<tr>
<td><code>roam_pr_risk</code></td>
<td>Risk score for pending changes</td>
</tr>
<tr>
<td><code>roam_breaking_changes</code></td>
<td>Detect breaking changes between refs</td>
</tr>
<tr>
<td><code>roam_affected_tests</code></td>
<td>Find tests affected by a change</td>
</tr>
<tr>
<td><code>roam_dead_code</code></td>
<td>List unreferenced exports</td>
</tr>
<tr>
<td><code>roam_complexity_report</code></td>
<td>Per-symbol cognitive complexity</td>
</tr>
<tr>
<td><code>roam_repo_map</code></td>
<td>Project skeleton with key symbols</td>
</tr>
<tr>
<td><code>roam_tour</code></td>
<td>Auto-generated onboarding guide</td>
</tr>
<tr>
<td><code>roam_diagnose</code></td>
<td>Root cause analysis for debugging</td>
</tr>
<tr>
<td><code>roam_visualize</code></td>
<td>Generate Mermaid or DOT architecture diagrams</td>
</tr>
<tr>
<td><code>roam_algo</code></td>
<td>Algorithm anti-pattern detection with language-aware tips</td>
</tr>
<tr>
<td><code>roam_ws_understand</code></td>
<td>Unified multi-repo workspace overview</td>
</tr>
<tr>
<td><code>roam_ws_context</code></td>
<td>Cross-repo augmented symbol context</td>
</tr>
<tr>
<td><code>roam_pr_diff</code></td>
<td>Structural PR diff: metric deltas, edge analysis, symbol changes</td>
</tr>
<tr>
<td><code>roam_budget_check</code></td>
<td>Check changes against architectural budgets</td>
</tr>
<tr>
<td><code>roam_effects</code></td>
<td>Side-effect classification (DB writes, network, filesystem)</td>
</tr>
<tr>
<td><code>roam_attest</code></td>
<td>Proof-carrying PR attestation with all evidence bundled</td>
</tr>
<tr>
<td><code>roam_capsule_export</code></td>
<td>Export sanitized structural graph (no code bodies)</td>
</tr>
<tr>
<td><code>roam_path_coverage</code></td>
<td>Find critical untested call paths (entry -&gt; sink)</td>
</tr>
<tr>
<td><code>roam_forecast</code></td>
<td>Predict when metrics will exceed thresholds</td>
</tr>
<tr>
<td><code>roam_simulate</code></td>
<td>Counterfactual architecture simulator</td>
</tr>
<tr>
<td><code>roam_orchestrate</code></td>
<td>Multi-agent swarm partitioning</td>
</tr>
<tr>
<td><code>roam_fingerprint</code></td>
<td>Topology fingerprint comparison</td>
</tr>
<tr>
<td><code>roam_mutate</code></td>
<td>Graph-level code editing (move/rename/extract)</td>
</tr>
<tr>
<td><code>roam_dark_matter</code></td>
<td>Hidden co-change coupling detection</td>
</tr>
<tr>
<td><code>roam_closure</code></td>
<td>Minimal-change synthesis for rename/delete</td>
</tr>
<tr>
<td><code>roam_adversarial_review</code></td>
<td>Adversarial architecture review</td>
</tr>
<tr>
<td><code>roam_generate_plan</code></td>
<td>Agent work planner</td>
</tr>
<tr>
<td><code>roam_get_invariants</code></td>
<td>Architectural invariant discovery</td>
</tr>
<tr>
<td><code>roam_bisect_blame</code></td>
<td>Architectural git bisect</td>
</tr>
<tr>
<td><code>roam_doc_intent</code></td>
<td>Doc-to-code linking</td>
</tr>
<tr>
<td><code>roam_cut_analysis</code></td>
<td>Minimum graph cut analysis</td>
</tr>
<tr>
<td><code>roam_clones</code></td>
<td>AST structural clone detection (Type-2 clones)</td>
</tr>
<tr>
<td><code>roam_annotate_symbol</code></td>
<td>Attach persistent notes to symbols</td>
</tr>
<tr>
<td><code>roam_get_annotations</code></td>
<td>View stored annotations</td>
</tr>
<tr>
<td><code>roam_relate</code></td>
<td>Show relationship between two symbols</td>
</tr>
<tr>
<td><code>roam_search_semantic</code></td>
<td>Semantic search by meaning</td>
</tr>
<tr>
<td><code>roam_rules_check</code></td>
<td>Plugin DSL governance rules</td>
</tr>
<tr>
<td><code>roam_check_rules</code></td>
<td>Built-in + user-defined governance rule evaluation with autofix templates</td>
</tr>
<tr>
<td><code>roam_supply_chain</code></td>
<td>Dependency risk dashboard: pin coverage and supply-chain health</td>
</tr>
<tr>
<td><code>roam_spectral</code></td>
<td>Spectral bisection: Fiedler vector partition tree and modularity gap</td>
</tr>
<tr>
<td><code>roam_vuln_map</code></td>
<td>Vulnerability report ingestion</td>
</tr>
<tr>
<td><code>roam_vuln_reach</code></td>
<td>Vulnerability reachability paths</td>
</tr>
<tr>
<td><code>roam_ingest_trace</code></td>
<td>Ingest runtime trace data</td>
</tr>
<tr>
<td><code>roam_runtime_hotspots</code></td>
<td>Runtime hotspot analysis</td>
</tr>
<tr>
<td><code>roam_diff</code></td>
<td>Blast radius of uncommitted/committed changes</td>
</tr>
<tr>
<td><code>roam_symbol</code></td>
<td>Symbol definition, callers, callees, metrics</td>
</tr>
<tr>
<td><code>roam_deps</code></td>
<td>File-level import/imported-by relationships</td>
</tr>
<tr>
<td><code>roam_uses</code></td>
<td>All consumers of a symbol by edge type</td>
</tr>
<tr>
<td><code>roam_weather</code></td>
<td>Code hotspots: churn x complexity ranking</td>
</tr>
<tr>
<td><code>roam_debt</code></td>
<td>Hotspot-weighted technical debt prioritization with optional ROI estimate</td>
</tr>
<tr>
<td><code>roam_docs_coverage</code></td>
<td>Doc coverage and stale-doc drift with PageRank-ranked missing docs</td>
</tr>
<tr>
<td><code>roam_suggest_refactoring</code></td>
<td>Rank proactive refactoring candidates using complexity, coupling, churn, smells, and coverage gaps</td>
</tr>
<tr>
<td><code>roam_plan_refactor</code></td>
<td>Build an ordered refactor plan for one symbol with risk/test/simulation context</td>
</tr>
<tr>
<td><code>roam_n1</code></td>
<td>Detect N+1 I/O patterns in ORM code</td>
</tr>
<tr>
<td><code>roam_auth_gaps</code></td>
<td>Find endpoints missing auth</td>
</tr>
<tr>
<td><code>roam_over_fetch</code></td>
<td>Detect models serializing too many fields</td>
</tr>
<tr>
<td><code>roam_missing_index</code></td>
<td>Find queries on non-indexed columns</td>
</tr>
<tr>
<td><code>roam_orphan_routes</code></td>
<td>Detect dead backend routes</td>
</tr>
<tr>
<td><code>roam_migration_safety</code></td>
<td>Detect non-idempotent migrations</td>
</tr>
<tr>
<td><code>roam_api_drift</code></td>
<td>Backend/frontend model mismatch detection</td>
</tr>
<tr>
<td><code>roam_expand_toolset</code></td>
<td>Discover presets, active toolset, and switch instructions</td>
</tr>
<tr>
<td><code>roam_explore</code></td>
<td>Compound first-contact exploration bundle for fast repo orientation</td>
</tr>
<tr>
<td><code>roam_prepare_change</code></td>
<td>Compound pre-change bundle: context, blast radius, risk, and tests</td>
</tr>
<tr>
<td><code>roam_review_change</code></td>
<td>Compound review bundle for changed code and architecture checks</td>
</tr>
<tr>
<td><code>roam_diagnose_issue</code></td>
<td>Compound debugging bundle with ranked suspects and dependency context</td>
</tr>
<tr>
<td><code>roam_onboard</code></td>
<td>Structured onboarding brief for new contributors/agents</td>
</tr>
<tr>
<td><code>roam_syntax_check</code></td>
<td>Tree-sitter syntax integrity validation for changed paths</td>
</tr>
<tr>
<td><code>roam_agent_export</code></td>
<td>Generate multi-agent instruction bundles (<code>AGENTS.md</code> + overlays)</td>
</tr>
<tr>
<td><code>roam_vibe_check</code></td>
<td>AI-rot auditor with 8-pattern taxonomy and composite score</td>
</tr>
<tr>
<td><code>roam_ai_readiness</code></td>
<td>AI-agent effectiveness readiness scoring and recommendations</td>
</tr>
<tr>
<td><code>roam_dashboard</code></td>
<td>Unified status snapshot across health, risk, churn, and quality</td>
</tr>
<tr>
<td><code>roam_codeowners</code></td>
<td>CODEOWNERS coverage analysis and unowned file discovery</td>
</tr>
<tr>
<td><code>roam_drift</code></td>
<td>Ownership drift detection from declared vs observed ownership</td>
</tr>
<tr>
<td><code>roam_suggest_reviewers</code></td>
<td>Reviewer recommendations with multi-signal scoring</td>
</tr>
<tr>
<td><code>roam_simulate_departure</code></td>
<td>Knowledge-loss simulation for contributor departure scenarios</td>
</tr>
<tr>
<td><code>roam_verify</code></td>
<td>Pre-commit consistency verification and policy checks</td>
</tr>
<tr>
<td><code>roam_api_changes</code></td>
<td>API signature change classification and severity labeling</td>
</tr>
<tr>
<td><code>roam_test_gaps</code></td>
<td>Changed-symbol test gap analysis</td>
</tr>
<tr>
<td><code>roam_ai_ratio</code></td>
<td>Estimated AI-generated code ratio from repository signals</td>
</tr>
<tr>
<td><code>roam_duplicates</code></td>
<td>Semantic duplicate detection across structurally similar functions</td>
</tr>
<tr>
<td><code>roam_partition</code></td>
<td>Multi-agent partition manifest with conflict and complexity scores</td>
</tr>
<tr>
<td><code>roam_affected</code></td>
<td>Monorepo/package affected-set analysis for diffs</td>
</tr>
<tr>
<td><code>roam_semantic_diff</code></td>
<td>Structural diff of symbol/edge changes</td>
</tr>
<tr>
<td><code>roam_trends</code></td>
<td>Historical metric trend retrieval with sparkline output</td>
</tr>
<tr>
<td><code>roam_secrets</code></td>
<td>Secret scanning with masking and CI-friendly fail behavior</td>
</tr>
<tr>
<td><code>roam_endpoints</code></td>
<td>Enumerate HTTP/API endpoint definitions across the codebase</td>
</tr>
<tr>
<td><code>roam_doctor</code></td>
<td>Diagnose installation and environment health</td>
</tr>
<tr>
<td><code>roam_init</code></td>
<td>Initialize roam workspace state and build the first index</td>
</tr>
<tr>
<td><code>roam_reindex</code></td>
<td>Refresh or force-rebuild the index with task-mode support</td>
</tr>
<tr>
<td><code>roam_reset</code></td>
<td>Reset the roam index and cached data</td>
</tr>
<tr>
<td><code>roam_clean</code></td>
<td>Remove stale or orphaned index entries</td>
</tr>
<tr>
<td><code>roam_batch_search</code></td>
<td>Batch symbol search: run multiple pattern queries in a single call</td>
</tr>
<tr>
<td><code>roam_batch_get</code></td>
<td>Batch context retrieval: fetch multiple symbols/files in a single call</td>
</tr>
<tr>
<td><code>roam_dev_profile</code></td>
<td>Developer productivity profile: commit patterns, specialization, and impact</td>
</tr>
</tbody></table>
<p><strong>Resources:</strong> <code>roam://health</code> (current health score), <code>roam://summary</code> (project overview)</p>
</details>

<details>
<summary><strong>Claude Code</strong></summary>

<pre><code class="language-bash">claude mcp add roam-code -- roam mcp
</code></pre>
<p>Or add to <code>.mcp.json</code> in your project root:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>Claude Desktop</strong></summary>

<p>Add to your <code>claude_desktop_config.json</code>:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;],
      &quot;cwd&quot;: &quot;/path/to/your/project&quot;
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>Cursor</strong></summary>

<p>Add to <code>.cursor/mcp.json</code>:</p>
<pre><code class="language-json">{
  &quot;mcpServers&quot;: {
    &quot;roam-code&quot;: {
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<details>
<summary><strong>VS Code + Copilot</strong></summary>

<p>Add to <code>.vscode/mcp.json</code>:</p>
<pre><code class="language-json">{
  &quot;servers&quot;: {
    &quot;roam-code&quot;: {
      &quot;type&quot;: &quot;stdio&quot;,
      &quot;command&quot;: &quot;roam&quot;,
      &quot;args&quot;: [&quot;mcp&quot;]
    }
  }
}
</code></pre>
</details>

<h2>CI/CD Integration</h2>
<p>All you need is Python 3.9+ and <code>pip install roam-code</code>.</p>
<h3>GitHub Actions</h3>
<pre><code class="language-yaml"># .github/workflows/roam.yml
name: Roam Analysis
on: [pull_request]

jobs:
  roam:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: Cranot/roam-code@main
        with:
          commands: health
          gate: &quot;score&gt;=70&quot;
          sarif: true
          comment: true
</code></pre>
<p>Use <code>roam init</code> to auto-generate this workflow.</p>
<table>
<thead>
<tr>
<th>Input</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>commands</code></td>
<td><code>health</code></td>
<td>Space-separated roam commands to run</td>
</tr>
<tr>
<td><code>gate</code></td>
<td>(empty)</td>
<td>Quality gate expression (e.g., <code>score&gt;=70</code>). Exit 5 on failure</td>
</tr>
<tr>
<td><code>sarif</code></td>
<td><code>false</code></td>
<td>Upload SARIF results to GitHub Code Scanning</td>
</tr>
<tr>
<td><code>comment</code></td>
<td><code>true</code></td>
<td>Post sticky PR comment with results</td>
</tr>
<tr>
<td><code>python-version</code></td>
<td><code>3.11</code></td>
<td>Python version</td>
</tr>
<tr>
<td><code>version</code></td>
<td><code>latest</code></td>
<td>Pin to a specific roam-code version</td>
</tr>
<tr>
<td><code>cache</code></td>
<td><code>true</code></td>
<td>Cache the SQLite index between runs</td>
</tr>
<tr>
<td><code>changed-only</code></td>
<td><code>false</code></td>
<td>Incremental mode: adapt commands to changed files</td>
</tr>
</tbody></table>
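<p>To make the <code>gate</code> input concrete, here is a minimal Python sketch of how an expression such as <code>score&gt;=70</code> could be evaluated against a metrics dictionary and mapped to exit code 5. The function names, the supported operator set, and the metrics dictionary are assumptions for illustration; this is not roam's actual gate parser.</p>

```python
import operator
import re

# Comparison operators a gate expression might use (assumed set).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# metric name, operator, numeric threshold; two-character operators are tried first.
_GATE_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")

GATE_FAILURE_EXIT_CODE = 5  # exit code documented in the inputs table


def evaluate_gate(expression: str, metrics: dict) -> bool:
    """Return True when the metric named in the expression satisfies the threshold."""
    match = _GATE_RE.match(expression)
    if match is None:
        raise ValueError(f"unsupported gate expression: {expression!r}")
    name, op, threshold = match.groups()
    if name not in metrics:
        raise KeyError(f"unknown metric in gate expression: {name}")
    return _OPS[op](float(metrics[name]), float(threshold))


def gate_exit_code(expression: str, metrics: dict) -> int:
    """Map a gate result to a process exit code (0 = pass, 5 = gate failure)."""
    return 0 if evaluate_gate(expression, metrics) else GATE_FAILURE_EXIT_CODE
```

<p>For example, <code>gate_exit_code("score&gt;=70", {"score": 64})</code> returns 5, which is the value a CI step would pass to <code>sys.exit</code> to fail the job.</p>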
<details>
<summary><strong>GitLab CI</strong></summary>

<pre><code class="language-yaml">roam-analysis:
  stage: test
  image: python:3.12-slim
  before_script:
    - pip install roam-code
  script:
    - roam index
    - roam health --gate
    - roam --json pr-risk origin/main..HEAD &gt; roam-report.json
  artifacts:
    paths:
      - roam-report.json
  rules:
    - if: $CI_MERGE_REQUEST_IID
</code></pre>
</details>

<details>
<summary><strong>Azure DevOps / any CI</strong></summary>

<p>Universal pattern:</p>
<pre><code class="language-bash">pip install roam-code
roam index
roam health --gate               # exit 5 on failure (reads .roam-gates.yml)
roam --json health &gt; report.json
</code></pre>
</details>

<h2>SARIF Output</h2>
<p>Roam exports analysis results in <a href="https://sarifweb.azurewebsites.net/">SARIF 2.1.0</a> format for GitHub Code Scanning.</p>
<pre><code class="language-python">from roam.output.sarif import health_to_sarif, write_sarif

sarif = health_to_sarif(health_data)
write_sarif(sarif, &quot;roam-health.sarif&quot;)
</code></pre>
<pre><code class="language-yaml">- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: roam-health.sarif
</code></pre>
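<p>For readers unfamiliar with the format, the sketch below builds a minimal SARIF 2.1.0 document by hand. It only illustrates the general shape that GitHub Code Scanning consumes; <code>build_sarif</code> and the keys of the finding dictionaries are hypothetical and do not describe what <code>health_to_sarif</code> actually emits.</p>

```python
import json

# Commonly used location of the SARIF 2.1.0 JSON schema.
SARIF_SCHEMA = "https://json.schemastore.org/sarif-2.1.0.json"


def build_sarif(findings, tool_name="roam-code"):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding is assumed (for this sketch) to carry:
    rule_id, message, path, line, and optionally level.
    """
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = []
    for f in findings:
        results.append({
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),  # SARIF levels: none, note, warning, error
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": SARIF_SCHEMA,
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name,
                                "rules": [{"id": r} for r in rule_ids]}},
            "results": results,
        }],
    }


if __name__ == "__main__":
    example = build_sarif([
        {"rule_id": "dependency-cycle", "message": "Symbol is part of a cycle",
         "path": "src/app/service.py", "line": 12},
    ])
    print(json.dumps(example, indent=2))
```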
<h2>For Teams</h2>
<p>Zero infrastructure, zero vendor lock-in, zero data leaving your network.</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Annual cost (20-dev team)</th>
<th>Infrastructure</th>
<th>Setup time</th>
</tr>
</thead>
<tbody><tr>
<td>SonarQube Server</td>
<td>$15,000-$45,000</td>
<td>Self-hosted server</td>
<td>Days</td>
</tr>
<tr>
<td>CodeScene</td>
<td>$20,000-$60,000</td>
<td>SaaS or on-prem</td>
<td>Hours</td>
</tr>
<tr>
<td>Code Climate</td>
<td>$12,000-$36,000</td>
<td>SaaS</td>
<td>Hours</td>
</tr>
<tr>
<td><strong>Roam</strong></td>
<td><strong>$0 (MIT license)</strong></td>
<td><strong>None (local)</strong></td>
<td><strong>5 minutes</strong></td>
</tr>
</tbody></table>
<details>
<summary><strong>Team rollout guide</strong></summary>

<p><strong>Week 1-2 (pilot):</strong> 1-2 developers run <code>roam init</code> on one repo. Use <code>roam preflight</code> before changes, <code>roam pr-risk</code> before PRs.</p>
<p><strong>Week 3-4 (expand):</strong> Add <code>roam health --gate</code> to CI as a non-blocking check (configure thresholds in <code>.roam-gates.yml</code>).</p>
<p><strong>Month 2+ (standardize):</strong> Tighten gate thresholds. Expand to additional repos. Track trajectory with <code>roam trends</code>.</p>
</details>

<details>
<summary><strong>Complements your existing stack</strong></summary>

<table>
<thead>
<tr>
<th>If you use...</th>
<th>Roam adds...</th>
</tr>
</thead>
<tbody><tr>
<td><strong>SonarQube</strong></td>
<td>Architecture-level analysis: dependency cycles, god components, blast radius, health scoring</td>
</tr>
<tr>
<td><strong>CodeScene</strong></td>
<td>Free, local alternative for health scoring and hotspot analysis</td>
</tr>
<tr>
<td><strong>ESLint / Pylint</strong></td>
<td>Cross-language architecture checks. Linters enforce style per file; Roam enforces architecture across the codebase</td>
</tr>
<tr>
<td><strong>LSP</strong></td>
<td>AI-agent-optimized queries. <code>roam context</code> answers &quot;what calls this?&quot; with PageRank-ranked results in one call</td>
</tr>
</tbody></table>
</details>

<h2>Language Support</h2>
<h3>Tier 1 -- Full extraction (dedicated parsers)</h3>
<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Python</td>
<td><code>.py</code> <code>.pyi</code></td>
<td>classes, functions, methods, decorators, variables</td>
<td>imports, calls, inheritance</td>
<td>extends, <code>__all__</code> exports</td>
</tr>
<tr>
<td>JavaScript</td>
<td><code>.js</code> <code>.jsx</code> <code>.mjs</code> <code>.cjs</code></td>
<td>classes, functions, arrow functions, CJS exports</td>
<td>imports, require(), calls</td>
<td>extends</td>
</tr>
<tr>
<td>TypeScript</td>
<td><code>.ts</code> <code>.tsx</code> <code>.mts</code> <code>.cts</code></td>
<td>interfaces, type aliases, enums + all JS</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Java</td>
<td><code>.java</code></td>
<td>classes, interfaces, enums, constructors, fields</td>
<td>imports, calls</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Go</td>
<td><code>.go</code></td>
<td>structs, interfaces, functions, methods, fields</td>
<td>imports, calls</td>
<td>embedded structs</td>
</tr>
<tr>
<td>Rust</td>
<td><code>.rs</code></td>
<td>structs, traits, impls, enums, functions</td>
<td>use, calls</td>
<td>impl Trait for Struct</td>
</tr>
<tr>
<td>C / C++</td>
<td><code>.c</code> <code>.h</code> <code>.cpp</code> <code>.hpp</code> <code>.cc</code></td>
<td>structs, classes, functions, namespaces, templates</td>
<td>includes, calls</td>
<td>extends</td>
</tr>
<tr>
<td>C#</td>
<td><code>.cs</code></td>
<td>classes, interfaces, structs, enums, records, methods, constructors, properties, delegates, events, fields</td>
<td>using directives, calls, <code>new</code>, attributes</td>
<td>extends, implements</td>
</tr>
<tr>
<td>PHP</td>
<td><code>.php</code></td>
<td>classes, interfaces, traits, enums, methods, properties</td>
<td>namespace use, calls, static calls, <code>new</code></td>
<td>extends, implements, use (traits)</td>
</tr>
<tr>
<td>Visual FoxPro</td>
<td><code>.prg</code></td>
<td>functions, procedures, classes, methods, properties, constants</td>
<td>DO, SET PROCEDURE/CLASSLIB, CREATEOBJECT, <code>=func()</code>, <code>obj.method()</code></td>
<td>DEFINE CLASS ... AS</td>
</tr>
<tr>
<td>YAML (CI/CD)</td>
<td><code>.yml</code> <code>.yaml</code></td>
<td>GitLab CI: jobs, template anchors, stages. GitHub Actions: workflow name, jobs, reusable workflows. Generic: top-level keys</td>
<td><code>extends:</code>, <code>needs:</code>, <code>!reference</code>, <code>uses:</code></td>
<td>—</td>
</tr>
<tr>
<td>HCL / Terraform</td>
<td><code>.tf</code> <code>.tfvars</code> <code>.hcl</code></td>
<td><code>resource</code>, <code>data</code>, <code>variable</code>, <code>output</code>, <code>module</code>, <code>provider</code>, <code>locals</code> entries</td>
<td><code>var.*</code>, <code>module.*</code>, <code>data.*</code>, <code>local.*</code>, resource cross-refs</td>
<td>—</td>
</tr>
<tr>
<td>Vue</td>
<td><code>.vue</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Svelte</td>
<td><code>.svelte</code></td>
<td>via <code>&lt;script&gt;</code> block extraction (TS/JS)</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
</tbody></table>
<details>
<summary><strong>Salesforce ecosystem (Tier 1)</strong></summary>

<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
</tr>
</thead>
<tbody><tr>
<td>Apex</td>
<td><code>.cls</code> <code>.trigger</code></td>
<td>classes, triggers, SOQL, annotations</td>
<td>imports, calls, System.Label, generic type refs</td>
</tr>
<tr>
<td>Aura</td>
<td><code>.cmp</code> <code>.app</code> <code>.evt</code> <code>.intf</code> <code>.design</code></td>
<td>components, attributes, methods, events</td>
<td>controller refs, component refs</td>
</tr>
<tr>
<td>LWC (JavaScript)</td>
<td><code>.js</code> (in LWC dirs)</td>
<td>anonymous class from filename</td>
<td><code>@salesforce/apex/</code>, <code>@salesforce/schema/</code>, <code>@salesforce/label/</code></td>
</tr>
<tr>
<td>Visualforce</td>
<td><code>.page</code> <code>.component</code></td>
<td>pages, components</td>
<td>controller/extensions, merge fields, includes</td>
</tr>
<tr>
<td>SF Metadata XML</td>
<td><code>*-meta.xml</code></td>
<td>objects, fields, rules, layouts</td>
<td>Apex class refs, formula field refs, Flow actionCalls</td>
</tr>
</tbody></table>
<p>Cross-language edges mean <code>roam impact AccountService</code> shows blast radius across Apex, LWC, Aura, Visualforce, and Flows.</p>
</details>

<table>
<thead>
<tr>
<th>Language</th>
<th>Extensions</th>
<th>Symbols</th>
<th>References</th>
<th>Inheritance</th>
</tr>
</thead>
<tbody><tr>
<td>Ruby</td>
<td><code>.rb</code></td>
<td>classes, modules, methods, singleton methods, constants</td>
<td>require, require_relative, include/extend, calls, ClassName.new</td>
<td>class inheritance</td>
</tr>
<tr>
<td>Kotlin</td>
<td><code>.kt</code> <code>.kts</code></td>
<td>classes, interfaces, enums, objects, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, implements</td>
</tr>
<tr>
<td>Scala</td>
<td><code>.scala</code> <code>.sc</code></td>
<td>classes, traits, objects, case classes, functions, methods, val/var, type aliases</td>
<td>imports, calls, <code>new</code></td>
<td>extends, with (trait mixins)</td>
</tr>
<tr>
<td>SQL (DDL)</td>
<td><code>.sql</code></td>
<td>tables, columns, views, functions, triggers, schemas, types (enums), sequences</td>
<td>foreign keys, view table deps, trigger table/function refs</td>
<td>--</td>
</tr>
<tr>
<td>Swift</td>
<td><code>.swift</code></td>
<td>classes, structs, enums, protocols, functions, methods, properties</td>
<td>imports, calls, type refs</td>
<td>extends, conforms</td>
</tr>
<tr>
<td>JSONC</td>
<td><code>.jsonc</code></td>
<td>via JSON grammar</td>
<td>--</td>
<td>--</td>
</tr>
<tr>
<td>MDX</td>
<td><code>.mdx</code></td>
<td>via Markdown grammar</td>
<td>--</td>
<td>--</td>
</tr>
</tbody></table>
<h2>Performance</h2>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Index 200 files</td>
<td>~3-5s</td>
</tr>
<tr>
<td>Index 3,000 files</td>
<td>~2 min</td>
</tr>
<tr>
<td>Incremental (no changes)</td>
<td>&lt;1s</td>
</tr>
<tr>
<td>Any query command</td>
<td>&lt;0.5s</td>
</tr>
</tbody></table>
<details>
<summary><strong>Detailed benchmarks</strong></summary>

<h3>Indexing Speed</h3>
<table>
<thead>
<tr>
<th>Project</th>
<th>Language</th>
<th>Files</th>
<th>Symbols</th>
<th>Edges</th>
<th>Index Time</th>
<th>Rate</th>
</tr>
</thead>
<tbody><tr>
<td>Express</td>
<td>JS</td>
<td>211</td>
<td>624</td>
<td>804</td>
<td>3s</td>
<td>70 files/s</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td>237</td>
<td>1,065</td>
<td>868</td>
<td>6s</td>
<td>41 files/s</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td>697</td>
<td>5,335</td>
<td>8,984</td>
<td>25s</td>
<td>28 files/s</td>
</tr>
<tr>
<td>Laravel</td>
<td>PHP</td>
<td>3,058</td>
<td>39,097</td>
<td>38,045</td>
<td>1m46s</td>
<td>29 files/s</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td>8,445</td>
<td>16,445</td>
<td>19,618</td>
<td>2m40s</td>
<td>52 files/s</td>
</tr>
</tbody></table>
<h3>Quality Benchmark</h3>
<table>
<thead>
<tr>
<th>Repo</th>
<th>Language</th>
<th>Score</th>
<th>Coverage</th>
<th>Edge Density</th>
</tr>
</thead>
<tbody><tr>
<td>Laravel</td>
<td>PHP</td>
<td><strong>9.55</strong></td>
<td>91.2%</td>
<td>0.97</td>
</tr>
<tr>
<td>Vue</td>
<td>TS</td>
<td><strong>9.27</strong></td>
<td>85.8%</td>
<td>1.68</td>
</tr>
<tr>
<td>Svelte</td>
<td>TS</td>
<td><strong>9.04</strong></td>
<td>94.7%</td>
<td>1.19</td>
</tr>
<tr>
<td>Axios</td>
<td>JS</td>
<td><strong>8.98</strong></td>
<td>85.9%</td>
<td>0.82</td>
</tr>
<tr>
<td>Express</td>
<td>JS</td>
<td><strong>8.46</strong></td>
<td>96.0%</td>
<td>1.29</td>
</tr>
</tbody></table>
<h3>Token Efficiency</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>1,600-line file → <code>roam file</code></td>
<td>~5,000 chars (~70:1 compression)</td>
</tr>
<tr>
<td>Full project map</td>
<td>~4,000 chars</td>
</tr>
<tr>
<td><code>--compact</code> mode</td>
<td>40-50% additional token reduction</td>
</tr>
<tr>
<td><code>roam preflight</code> replaces</td>
<td>5-7 separate agent tool calls</td>
</tr>
</tbody></table>
</details>

<p>Agent-efficiency benchmarks: see the <a href="benchmarks/"><code>benchmarks/</code></a> directory for harness, repos, and results.</p>
<h2>How It Works</h2>
<pre><code>Codebase
    |
[1] Discovery ──── git ls-files (respects .gitignore + .roamignore)
    |
[2] Parse ──────── tree-sitter AST per file (27 languages)
    |
[3] Extract ────── symbols + references (calls, imports, inheritance)
    |
[4] Resolve ────── match references to definitions → edges
    |
[5] Metrics ────── adaptive PageRank, betweenness, cognitive complexity, Halstead
    |
[6] Algorithms ── 23-pattern anti-pattern catalog (O(n^2) loops, N+1, recursion)
    |
[7] Git ────────── churn, co-change matrix, authorship, Renyi entropy
    |
[8] Clusters ───── Louvain community detection
    |
[9] Health ─────── per-file scores (7-factor) + composite score (0-100)
    |
[10] Store ─────── .roam/index.db (SQLite, WAL mode)
</code></pre>
<p>After the first full index, <code>roam index</code> only re-processes changed files (mtime + SHA-256 hash). Incremental updates are near-instant.</p>
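<p>The mtime + SHA-256 check can be sketched as follows. This is a minimal illustration of the general technique, assuming a stored state of <code>path -&gt; (mtime_ns, sha256)</code>; the function names and state layout are hypothetical and are not taken from <code>incremental.py</code>.</p>

```python
import hashlib
import os


def sha256_of(path):
    """Hash a file's contents in chunks so large files are not read in one go."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, previous):
    """Return (changed, new_state) for the given paths.

    previous maps path -> (mtime_ns, sha256) from the last run (assumed layout).
    A file is re-hashed only when its mtime differs; it counts as changed only
    when the hash differs too, so a bare `touch` does not trigger re-parsing.
    """
    changed = []
    state = {}
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        old = previous.get(path)
        if old is not None and old[0] == mtime:
            state[path] = old          # cheap path: same mtime, skip hashing
            continue
        digest = sha256_of(path)
        state[path] = (mtime, digest)
        if old is None or old[1] != digest:
            changed.append(path)       # new file or content actually changed
    return changed, state
```

<p>Checking the mtime first keeps the no-change case cheap, because unchanged files are never opened; the hash guards against files whose timestamp moved but whose content did not.</p>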
<h3>.roamignore</h3>
<p>Create a <code>.roamignore</code> file in your project root to exclude files from indexing. It uses <strong>full gitignore syntax</strong>:</p>
<table>
<thead>
<tr>
<th>Pattern</th>
<th>Meaning</th>
</tr>
</thead>
<tbody><tr>
<td><code>*.log</code></td>
<td>Exclude all <code>.log</code> files (basename match)</td>
</tr>
<tr>
<td><code>vendor/</code></td>
<td>Exclude the <code>vendor</code> directory and everything under it</td>
</tr>
<tr>
<td><code>/build/</code></td>
<td>Exclude <code>build/</code> at repo root only (anchored)</td>
</tr>
<tr>
<td><code>src/**/*.pb.go</code></td>
<td>Exclude <code>.pb.go</code> files at any depth under <code>src/</code></td>
</tr>
<tr>
<td><code>**/test_*.py</code></td>
<td>Exclude <code>test_*.py</code> files anywhere</td>
</tr>
<tr>
<td><code>?</code></td>
<td>Match any single character (not <code>/</code>)</td>
</tr>
<tr>
<td><code>[abc]</code> / <code>[!abc]</code></td>
<td>Character class / negated character class</td>
</tr>
<tr>
<td><code>!important.log</code></td>
<td>Un-exclude (re-include) <code>important.log</code></td>
</tr>
<tr>
<td><code># comment</code></td>
<td>Lines starting with <code>#</code> are comments</td>
</tr>
</tbody></table>
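<p>Key rules: <code>*</code> matches within a single path segment (not across <code>/</code>). <code>**</code> matches across <code>/</code> boundaries. The last matching pattern wins, which is what makes <code>!</code> negation work. Patterns containing a <code>/</code> anywhere other than at the end are anchored to the repo root; a trailing <code>/</code> only marks the pattern as a directory.</p>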
<pre><code># .roamignore example
*_pb2.py
*_pb2_grpc.py
vendor/
node_modules/
*.generated.*
/build/
!build/keep/
</code></pre>
<p>You can also exclude patterns via <code>roam config --exclude &quot;*.proto&quot;</code> (stored in <code>.roam/config.json</code>) or inspect active patterns with <code>roam config --show</code>.</p>
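<p>The matching rules above can be sketched in a few dozen lines of Python. This is a simplified illustration, not roam's matcher: the function names are hypothetical, paths are assumed to be repo-relative file paths with <code>/</code> separators, and last-match-wins is applied per path, so it does not model git's rule that files under an excluded directory cannot be re-included.</p>

```python
import re


def _translate(pattern):
    """Translate one gitignore-style glob into a regex fragment."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")      # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")            # anything, including '/'
            i += 2
        elif ch == "*":
            out.append("[^/]*")         # stays within one path segment
            i += 1
        elif ch == "?":
            out.append("[^/]")          # exactly one non-slash character
            i += 1
        elif ch == "[" and pattern.find("]", i + 2) != -1:
            end = pattern.find("]", i + 2)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^/" + body[1:]  # negated class, never matching '/'
            out.append("[" + body + "]")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def compile_rules(text):
    """Parse ignore-file text into an ordered list of (regex, negated) rules."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                    # blank line or comment
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        dir_only = line.endswith("/")
        line = line.rstrip("/")
        anchored = "/" in line          # a non-trailing slash anchors to the root
        line = line.lstrip("/")
        prefix = "^" if anchored else "^(?:.*/)?"
        suffix = "/.*$" if dir_only else "(?:/.*)?$"
        rules.append((re.compile(prefix + _translate(line) + suffix), negated))
    return rules


def is_ignored(path, rules):
    """Apply every rule in order; the last one that matches decides."""
    ignored = False
    for regex, negated in rules:
        if regex.match(path):
            ignored = not negated
    return ignored
```

<p>With the rules <code>/build/</code> followed by <code>!build/keep/</code>, this sketch ignores <code>build/out.o</code> but keeps <code>build/keep/x.txt</code>, because the later negated rule is the last one to match.</p>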
<details>
<summary><strong>Graph algorithms</strong></summary>

<ul>
<li><strong>Adaptive PageRank</strong> -- damping factor auto-tunes based on cycle density (0.82-0.92); identifies the most important symbols (used by <code>map</code>, <code>search</code>, <code>context</code>)</li>
<li><strong>Personalized PageRank</strong> -- distance-weighted blast radius for <code>impact</code> (Gleich, 2015)</li>
<li><strong>Adaptive betweenness centrality</strong> -- exact for small graphs, sqrt-scaled sampling for large (Brandes &amp; Pich, 2007); finds bottleneck symbols</li>
<li><strong>Edge betweenness centrality</strong> -- identifies critical cycle-breaking edges in SCCs (Brandes, 2001)</li>
<li><strong>Tarjan&#39;s SCC</strong> -- detects dependency cycles with tangle ratio</li>
<li><strong>Propagation Cost</strong> -- fraction of system affected by any change, via transitive closure (MacCormack, Rusnak &amp; Baldwin, 2006)</li>
<li><strong>Algebraic connectivity (Fiedler value)</strong> -- second-smallest Laplacian eigenvalue; measures architectural robustness (Fiedler, 1973)</li>
<li><strong>Louvain community detection</strong> -- groups related symbols into clusters</li>
<li><strong>Modularity Q-score</strong> -- measures if cluster boundaries match natural community structure (Newman, 2004)</li>
<li><strong>Conductance</strong> -- per-cluster boundary tightness: cut(S, S_bar) / min(vol(S), vol(S_bar)) (Yang &amp; Leskovec)</li>
<li><strong>Topological sort</strong> -- computes dependency layers, Gini coefficient for layer balance (Gini, 1912), weighted violation severity</li>
<li><strong>k-shortest simple paths</strong> -- traces dependency paths with coupling strength</li>
<li><strong>Renyi entropy (order 2)</strong> -- measures co-change distribution; more robust to outliers than Shannon (Renyi, 1961)</li>
<li><strong>Mann-Kendall trend test</strong> -- non-parametric degradation detection, robust to noise (Mann, 1945; Kendall, 1975)</li>
<li><strong>Sen&#39;s slope estimator</strong> -- robust trend magnitude, resistant to outliers (Sen, 1968)</li>
<li><strong>NPMI</strong> -- Normalized Pointwise Mutual Information for coupling strength (Bouma, 2009)</li>
<li><strong>Lift</strong> -- association rule mining metric for co-change statistical significance (Agrawal &amp; Srikant, 1994)</li>
<li><strong>Halstead metrics</strong> -- volume, difficulty, effort, and predicted bugs from operator/operand counts (Halstead, 1977)</li>
<li><strong>SQALE remediation cost</strong> -- time-to-fix estimates per issue type for tech debt prioritization (Letouzey, 2012)</li>
<li><strong>Algorithm anti-pattern catalog</strong> -- 23 patterns detecting suboptimal algorithms (quadratic loops, N+1 queries, quadratic string building, branching recursion, manual top-k, loop-invariant calls) with confidence calibration via caller-count and bounded-loop analysis</li>
</ul>
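<p>As one concrete instance from the list above, the following pure-Python sketch runs Tarjan's algorithm on a small dependency graph and derives a tangle ratio, taken here to mean the share of symbols that sit in a cycle of two or more nodes. The adjacency-dict input and the function names are assumptions for illustration; roam's own implementation (the project structure mentions NetworkX) may differ, and self-loops are not counted as cycles in this sketch.</p>

```python
def strongly_connected_components(graph):
    """Tarjan's SCC algorithm over an adjacency dict {node: [successors]}.

    Recursive for brevity; very deep graphs would need an iterative variant.
    """
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:      # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)           # include nodes that only appear as targets
    for v in sorted(nodes):
        if v not in index:
            visit(v)
    return components


def tangle_ratio(graph):
    """Fraction of nodes that belong to a cycle of two or more nodes."""
    components = strongly_connected_components(graph)
    total = sum(len(c) for c in components)
    tangled = sum(len(c) for c in components if len(c) > 1)
    return tangled / total if total else 0.0
```

<p>For the graph <code>{"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"], "e": []}</code>, the only non-trivial component is <code>a, b, c</code>, so three of five symbols are tangled and the ratio is 0.6.</p>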
</details>

<details>
<summary><strong>Health scoring</strong></summary>

<p>Composite health score (0-100) using a <strong>weighted geometric mean</strong> of sigmoid health factors. Non-compensatory: a zero in any dimension cannot be masked by high scores in others.</p>
<table>
<thead>
<tr>
<th>Factor</th>
<th>Weight</th>
<th>What it measures</th>
</tr>
</thead>
<tbody><tr>
<td>Tangle ratio</td>
<td>30%</td>
<td>% of symbols in dependency cycles</td>
</tr>
<tr>
<td>God components</td>
<td>20%</td>
<td>Symbols with extreme fan-in/fan-out</td>
</tr>
<tr>
<td>Bottlenecks</td>
<td>15%</td>
<td>High-betweenness chokepoints</td>
</tr>
<tr>
<td>Layer violations</td>
<td>15%</td>
<td>Upward dependency violations (severity-weighted by layer distance)</td>
</tr>
<tr>
<td>Per-file health</td>
<td>20%</td>
<td>Average of 7-factor file health scores</td>
</tr>
</tbody></table>
<p>Each factor uses sigmoid health: <code>h = e^(-signal/scale)</code> (1 = pristine, approaches 0 = worst). Score = <code>100 * product(h_i ^ w_i)</code>. Also reports <strong>propagation cost</strong> (MacCormack 2006) and <strong>algebraic connectivity</strong> (Fiedler 1973). Per-file health (1-10) combines: cognitive complexity (triangular nesting penalty per Sweller&#39;s Cognitive Load Theory), indentation complexity, cycle membership, god component membership, dead export ratio, co-change entropy, and churn amplification.</p>
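<p>A minimal Python sketch of the composite formula, using the weights from the table and the decay formula quoted above. The signal values and per-factor scales are placeholders, since the actual scales are not given here, and treating every factor as a signal where 0 means pristine is an assumption of this sketch.</p>

```python
import math

# Weights from the table above (they sum to 1.0).
WEIGHTS = {
    "tangle_ratio": 0.30,
    "god_components": 0.20,
    "bottlenecks": 0.15,
    "layer_violations": 0.15,
    "file_health": 0.20,
}


def factor_health(signal, scale):
    """h = e^(-signal/scale): 1.0 when the signal is 0, approaching 0 as it grows."""
    return math.exp(-signal / scale)


def composite_score(signals, scales, weights=WEIGHTS):
    """Score = 100 * product(h_i ** w_i), a weighted geometric mean."""
    product = 1.0
    for name, weight in weights.items():
        product *= factor_health(signals[name], scales[name]) ** weight
    return 100.0 * product


if __name__ == "__main__":
    scales = {name: 1.0 for name in WEIGHTS}      # placeholder scales
    signals = {name: 0.0 for name in WEIGHTS}
    print(composite_score(signals, scales))       # all factors pristine
    signals["tangle_ratio"] = 3.0                 # one badly degraded factor
    print(composite_score(signals, scales))
```

<p>With one factor at <code>signal = 3 * scale</code> and the rest pristine, the geometric mean gives about 40.7, whereas a weighted arithmetic mean of the same factor values would give about 71.5. That gap is the non-compensatory behavior described above.</p>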
</details>

<h2>How Roam Compares</h2>
<p>roam-code is the only tool that combines graph algorithms (PageRank, Tarjan SCC, Louvain clustering), git archaeology, architecture simulation, and multi-agent partitioning in a single local CLI with zero API keys.</p>
<p>Documentation (local HTML in <code>docs/site/</code>, CI-deployed via <code>.github/workflows/pages.yml</code>):</p>
<ul>
<li><code>docs/site/getting-started.html</code> — tutorial</li>
<li><code>docs/site/command-reference.html</code> — examples</li>
<li><code>docs/site/architecture.html</code> — diagram + internals</li>
<li><code>docs/site/landscape.html</code> — competitor matrix</li>
</ul>
<table>
<thead>
<tr>
<th>Capability</th>
<th>roam-code</th>
<th>AI IDEs (Cursor, Windsurf)</th>
<th>AI Agents (Claude Code, Codex)</th>
<th>SAST (SonarQube, CodeQL)</th>
</tr>
</thead>
<tbody><tr>
<td>Persistent local index</td>
<td>SQLite</td>
<td>Cloud embeddings</td>
<td>None</td>
<td>Per-scan</td>
</tr>
<tr>
<td>Call graph analysis</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Yes (CodeQL)</td>
</tr>
<tr>
<td>PageRank / centrality</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Cycle detection (Tarjan)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Deprecated (SonarQube)</td>
</tr>
<tr>
<td>Community detection (Louvain)</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Git churn / co-change</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Architecture simulation</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>Multi-agent partitioning</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<td>MCP tools for agents</td>
<td>102 (24 in default core preset)</td>
<td>Client only</td>
<td>Client only</td>
<td>34 (SonarQube)</td>
</tr>
<tr>
<td>Languages</td>
<td>27</td>
<td>70+</td>
<td>50+</td>
<td>12-42</td>
</tr>
<tr>
<td>100% local, zero API keys</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Partial</td>
</tr>
<tr>
<td>Open source</td>
<td>MIT</td>
<td>No</td>
<td>Partial</td>
<td>Partial</td>
</tr>
</tbody></table>
<h3>Key Differentiators</h3>
<ul>
<li><strong>vs AI IDEs</strong> (Cursor, Windsurf, Augment): roam-code provides deterministic structural analysis. AI IDEs use probabilistic embeddings that can&#39;t guarantee reproducible results.</li>
<li><strong>vs AI Agents</strong> (Claude Code, Codex CLI, Gemini CLI): These agents read files one at a time. roam-code pre-computes relationships so agents get instant answers about architecture, blast radius, and dependencies.</li>
<li><strong>vs SAST Tools</strong> (SonarQube, CodeQL, Semgrep): SAST tools find bugs and vulnerabilities. roam-code understands architecture -- how code is structured, where it&#39;s coupled, and what breaks when you change it. Complementary, not competitive.</li>
<li><strong>vs Code Search</strong> (Sourcegraph/Amp, Greptile): Text search finds where code is. roam-code understands why code matters -- which functions are central, which modules are tangled, which files are high-risk.</li>
</ul>
<h2>FAQ</h2>
<p><strong>Does Roam send any data externally?</strong>
No. Zero network calls. No telemetry, no analytics, no update checks.</p>
<p><strong>Can Roam run in air-gapped environments?</strong>
Yes. Once installed, no internet access is required.</p>
<p><strong>Does Roam modify my source code?</strong>
Read-only by default. Creates <code>.roam/</code> with an index database. The <code>roam mutate</code> command can apply code changes (move/rename/extract) but defaults to <code>--dry-run</code> mode — you must explicitly pass <code>--apply</code> to write changes.</p>
<p><strong>How does Roam handle monorepos?</strong>
Indexes from the root. Batched SQL handles 100k+ symbols. Incremental updates stay fast.</p>
<p><strong>How does Roam handle multi-repo projects (e.g., frontend + backend)?</strong>
Use <code>roam ws init &lt;repo1&gt; &lt;repo2&gt;</code> to create a workspace. Each repo keeps its own index; a workspace overlay DB stores cross-repo API edges. <code>roam ws resolve</code> scans for REST endpoints and matches frontend calls to backend routes. Then <code>roam ws context</code>, <code>roam ws trace</code>, etc. work across repos.</p>
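<p>To illustrate the idea of matching frontend calls to backend routes, here is a small Python sketch that turns route templates such as <code>/users/:id</code> into regular expressions and pairs them with call sites. The input tuples, handler names, and matching rules are hypothetical; this does not describe how <code>roam ws resolve</code> works internally.</p>

```python
import re


def route_to_regex(template):
    """Compile '/users/:id' or '/users/{id}' into a regex over request paths."""
    parts = []
    for segment in template.strip("/").split("/"):
        is_param = segment.startswith(":") or (
            segment.startswith("{") and segment.endswith("}"))
        # A parameter matches any single non-empty path segment.
        parts.append("[^/]+" if is_param else re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "/?$")


def match_calls_to_routes(calls, routes):
    """Pair frontend calls with backend routes.

    calls:  list of (http_method, path) found in frontend code
    routes: list of (http_method, template, handler) found in backend code
    Returns cross-repo edges as ((METHOD, path), handler); first match wins.
    """
    compiled = [(method.upper(), route_to_regex(template), handler)
                for method, template, handler in routes]
    edges = []
    for method, path in calls:
        for route_method, regex, handler in compiled:
            if method.upper() == route_method and regex.match(path):
                edges.append(((method.upper(), path), handler))
                break
    return edges
```

<p>Unmatched calls simply produce no edge, which is a reasonable default for a heuristic resolver: a missing edge is easier to spot and correct than a wrong one.</p>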
<p><strong>Is Roam compatible with SonarQube / CodeScene?</strong>
Yes. Roam complements existing tools. Both can run in the same CI pipeline. SARIF output integrates with GitHub Code Scanning.</p>
<h2>Limitations</h2>
<p>Static analysis trade-offs:</p>
<ul>
<li><strong>Primarily static analysis</strong> -- can&#39;t trace dynamic dispatch, reflection, or eval&#39;d code. Runtime trace ingestion (<code>roam ingest-trace</code>) adds production data, but the traces must first be exported from an external tracing system</li>
<li><strong>Import resolution is heuristic</strong> -- complex re-exports or conditional imports may not resolve</li>
<li><strong>Limited cross-language edges</strong> -- Salesforce, Protobuf, REST API, and multi-repo edges are supported, but not arbitrary FFI</li>
<li><strong>Tier 2 languages</strong> get basic symbol extraction only via generic tree-sitter walker</li>
<li><strong>Large monorepos</strong> (100k+ files) may have slow initial indexing</li>
</ul>
<h2>Troubleshooting</h2>
<table>
<thead>
<tr>
<th>Problem</th>
<th>Solution</th>
</tr>
</thead>
<tbody><tr>
<td><code>roam: command not found</code></td>
<td>Ensure install location is on PATH. For <code>uv</code>: <code>uv tool update-shell</code></td>
</tr>
<tr>
<td><code>Another indexing process is running</code></td>
<td>Delete <code>.roam/index.lock</code> and retry</td>
</tr>
<tr>
<td><code>database is locked</code></td>
<td><code>roam index --force</code> to rebuild</td>
</tr>
<tr>
<td>Unicode errors on Windows</td>
<td><code>chcp 65001</code> for UTF-8</td>
</tr>
<tr>
<td>Symbol resolves to wrong file</td>
<td>Use <code>file:symbol</code> syntax: <code>roam symbol myfile:MyFunction</code></td>
</tr>
<tr>
<td>Health score seems wrong</td>
<td><code>roam --json health</code> for factor breakdown</td>
</tr>
<tr>
<td>Index stale after <code>git pull</code></td>
<td><code>roam index</code> (incremental). After major refactors: <code>roam index --force</code></td>
</tr>
</tbody></table>
<h2>Update / Uninstall</h2>
<pre><code class="language-bash"># Update
pipx upgrade roam-code
uv tool upgrade roam-code
pip install --upgrade roam-code

# Uninstall
pipx uninstall roam-code
uv tool uninstall roam-code
pip uninstall roam-code
</code></pre>
<p>Delete <code>.roam/</code> from your project root to clean up local data.</p>
<h2>Development</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e &quot;.[dev]&quot;   # includes pytest, ruff
pytest tests/              # ~5500 tests, Python 3.9-3.13

# Or use Make targets:
make dev      # install with dev extras
make test     # run tests
make lint     # ruff check
</code></pre>
<details>
<summary><strong>Project structure</strong></summary>

<pre><code>roam-code/
├── pyproject.toml
├── action.yml                         # Reusable GitHub Action
├── src/roam/
│   ├── __init__.py                    # Version (from pyproject.toml)
│   ├── cli.py                         # Click CLI (140 commands)
│   ├── mcp_server.py                  # MCP server (102 tools, 10 resources, 5 prompts)
│   ├── db/
│   │   ├── connection.py              # SQLite (WAL, pragmas, batched IN)
│   │   ├── schema.py                  # Tables, indexes, migrations
│   │   └── queries.py                 # Named SQL constants
│   ├── index/
│   │   ├── indexer.py                 # Orchestrates full pipeline
│   │   ├── discovery.py               # git ls-files, .gitignore
│   │   ├── parser.py                  # Tree-sitter parsing
│   │   ├── symbols.py                 # Symbol + reference extraction
│   │   ├── relations.py               # Reference resolution -&gt; edges
│   │   ├── complexity.py              # Cognitive complexity (SonarSource) + Halstead metrics
│   │   ├── git_stats.py               # Churn, co-change, blame, Renyi entropy
│   │   ├── incremental.py             # mtime + hash change detection
│   │   ├── file_roles.py              # Smart file role classifier
│   │   └── test_conventions.py        # Pluggable test naming adapters
│   ├── languages/
│   │   ├── base.py                    # Abstract LanguageExtractor
│   │   ├── registry.py                # Language detection + aliasing
│   │   ├── *_lang.py                  # One file per language (21 dedicated + generic)
│   │   └── generic_lang.py            # Tier 2 fallback
│   ├── bridges/
│   │   ├── base.py, registry.py       # Cross-language bridge framework
│   │   ├── bridge_salesforce.py       # Apex &lt;-&gt; Aura/LWC/Visualforce
│   │   └── bridge_protobuf.py         # .proto -&gt; Go/Java/Python stubs
│   ├── catalog/
│   │   ├── tasks.py                  # Universal algorithm catalog (23 patterns)
│   │   └── detectors.py              # Anti-pattern detectors with confidence calibration
│   ├── workspace/
│   │   ├── config.py                  # .roam-workspace.json
│   │   ├── db.py                      # Workspace overlay DB
│   │   ├── api_scanner.py             # REST API endpoint detection
│   │   └── aggregator.py              # Cross-repo aggregation
│   ├── graph/
│   │   ├── builder.py, pagerank.py    # DB -&gt; NetworkX, PageRank
│   │   ├── cycles.py, clusters.py     # Tarjan SCC, propagation cost, Louvain, modularity Q
│   │   ├── layers.py, pathfinding.py  # Topo layers, k-shortest paths
│   │   ├── simulate.py, spectral.py   # Architecture simulation, Fiedler bisection
│   │   ├── partition.py, fingerprint.py # Multi-agent partitioning, topology fingerprints
│   │   └── anomaly.py                 # Statistical anomaly detection
│   ├── commands/
│   │   ├── resolve.py                 # Shared symbol resolution
│   │   ├── graph_helpers.py           # Shared graph utilities (adj builders, BFS)
│   │   ├── context_helpers.py         # Data-gathering helpers for context command
│   │   ├── gate_presets.py            # Framework-specific gate rules
│   │   └── cmd_*.py                   # One module per command
│   ├── analysis/
│   │   ├── effects.py                 # Side-effect classification engine
│   │   └── taint.py                   # Taint analysis
│   ├── refactor/
│   │   ├── codegen.py                 # Import generation (Python/JS/Go)
│   │   └── transforms.py              # move/rename/add-call/extract transforms
│   ├── rules/
│   │   ├── engine.py                  # YAML rule parser + graph query evaluator
│   │   ├── builtin.py                 # 10 built-in governance rules
│   │   ├── ast_match.py               # AST pattern matching with $METAVAR captures
│   │   └── dataflow.py                # Intra-procedural dataflow analysis
│   ├── runtime/
│   │   ├── trace_ingest.py            # OpenTelemetry/Jaeger/Zipkin ingestion
│   │   └── hotspots.py                # Runtime hotspot analysis
│   ├── search/
│   │   ├── tfidf.py                   # TF-IDF semantic search engine
│   │   ├── index_embeddings.py        # Embedding index builder
│   │   └── onnx_embeddings.py         # Optional local ONNX semantic backend
│   ├── security/
│   │   ├── vuln_store.py              # CVE/vulnerability storage
│   │   └── vuln_reach.py              # Vulnerability reachability paths
│   └── output/
│       ├── formatter.py               # Token-efficient formatting
│       ├── sarif.py                   # SARIF 2.1.0 output
│       └── schema_registry.py         # JSON envelope schema versioning
└── tests/                             # ~5500 tests across 186 test files
</code></pre>
</details>
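<p>The tree lists <code>incremental.py</code> as "mtime + hash change detection". As a rough illustration of that general technique (not Roam's actual implementation), the sketch below skips files whose mtime is unchanged and hashes only the rest. The names <code>detect_changes</code>, <code>bump_mtime</code>, and <code>demo</code>, and the snapshot layout, are all hypothetical.</p>

```python
import hashlib
import os
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def detect_changes(root, snapshot):
    """Compare files under root with a prior snapshot.

    snapshot maps a relative path to (mtime_ns, sha256).
    Returns (changed, removed, new_snapshot).
    Hypothetical helper for illustration; not part of roam's API.
    """
    changed, current = [], {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        mtime = path.stat().st_mtime_ns
        old = snapshot.get(rel)
        if old is not None and old[0] == mtime:
            # Cheap path: same mtime, so reuse the stored hash without reading.
            current[rel] = old
            continue
        digest = file_digest(path)
        current[rel] = (mtime, digest)
        # A new mtime alone is not a change; the content hash decides.
        if old is None or old[1] != digest:
            changed.append(rel)
    removed = sorted(set(snapshot) - set(current))
    return changed, removed, current


def bump_mtime(path, seconds=2):
    """Move a file's mtime forward so the demo does not depend on clock resolution."""
    st = path.stat()
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + seconds * 1_000_000_000))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name, text in [("a.py", "x = 1\n"), ("b.py", "y = 2\n"), ("c.py", "z = 3\n")]:
            (root / name).write_text(text)
        first, _, snapshot = detect_changes(root, {})

        bump_mtime(root / "a.py")                # touched, same content
        (root / "b.py").write_text("y = 20\n")   # edited
        bump_mtime(root / "b.py")
        (root / "c.py").unlink()                 # deleted

        changed, removed, _ = detect_changes(root, snapshot)
        return first, changed, removed


if __name__ == "__main__":
    print(demo())
```

<p>In this sketch a touched-but-identical file (<code>a.py</code>) is rehashed once and then reported as unchanged, so only <code>b.py</code> would need re-indexing and <code>c.py</code> would need removal.</p>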

<h3>Dependencies</h3>
<table>
<thead>
<tr>
<th>Package</th>
<th>Purpose</th>
</tr>
</thead>
<tbody><tr>
<td><a href="https://click.palletsprojects.com/">click</a> &gt;= 8.0</td>
<td>CLI framework</td>
</tr>
<tr>
<td><a href="https://github.com/tree-sitter/py-tree-sitter">tree-sitter</a> &gt;= 0.23</td>
<td>AST parsing</td>
</tr>
<tr>
<td><a href="https://github.com/nicolo-ribaudo/tree-sitter-language-pack">tree-sitter-language-pack</a> &gt;= 0.6</td>
<td>165+ grammars</td>
</tr>
<tr>
<td><a href="https://networkx.org/">networkx</a> &gt;= 3.0</td>
<td>Graph algorithms</td>
</tr>
</tbody></table>
<p>Optional: <a href="https://github.com/jlowin/fastmcp">fastmcp</a> &gt;= 2.0 (MCP server — install with <code>pip install &quot;roam-code[mcp]&quot;</code>)</p>
<p>Optional: Local semantic ONNX stack (<code>numpy</code>, <code>onnxruntime</code>, <code>tokenizers</code>) via <code>pip install &quot;roam-code[semantic]&quot;</code></p>
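<p>To see which optional extras are usable in the current environment, a small standard-library check along these lines may help. It is a sketch, not a Roam command: the helper names are hypothetical, and it assumes each package's import name matches the package name given above.</p>

```python
from importlib.util import find_spec

# Extra name -> importable module names, taken from the optional-dependency notes.
# Assumption: import names equal the package names listed in this README.
EXTRAS = {
    "mcp": ["fastmcp"],
    "semantic": ["numpy", "onnxruntime", "tokenizers"],
}


def missing_modules(modules, is_available=None):
    """Return the modules that cannot be located in the current environment."""
    if is_available is None:
        # find_spec returns None for a top-level module that is not installed.
        is_available = lambda name: find_spec(name) is not None
    return [name for name in modules if not is_available(name)]


def extras_report(extras=EXTRAS, is_available=None):
    """Map each extra to the list of its modules that are missing."""
    return {extra: missing_modules(mods, is_available) for extra, mods in extras.items()}


if __name__ == "__main__":
    for extra, missing in extras_report().items():
        if missing:
            print(f'{extra}: missing {", ".join(missing)} (pip install "roam-code[{extra}]")')
        else:
            print(f"{extra}: available")
```

<p>The <code>is_available</code> parameter exists only so the check can be exercised without installing or uninstalling anything.</p>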
<h2>Roadmap</h2>
<h3>Shipped</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> MCP v2 agent surface: in-process execution, compound operations, presets, schemas, annotations, and compatibility profiles.</li>
<li><input checked="" disabled="" type="checkbox"> Full command and MCP inventory parity in docs: 140 CLI commands and 102 MCP tools.</li>
<li><input checked="" disabled="" type="checkbox"> CI hardening: composite action, changed-only mode, trend-aware gates, sticky PR updater, and SARIF guardrails.</li>
<li><input checked="" disabled="" type="checkbox"> Performance foundation: FTS5/BM25 search, O(changed) incremental indexing, DB/index optimizations.</li>
<li><input checked="" disabled="" type="checkbox"> Agent governance suite: <code>vibe-check</code>, <code>ai-readiness</code>, <code>verify</code>, <code>ai-ratio</code>, <code>duplicates</code>, advanced <code>algo</code> scoring/SARIF.</li>
<li><input checked="" disabled="" type="checkbox"> Ownership/review intelligence: <code>codeowners</code>, <code>drift</code>, <code>simulate-departure</code>, <code>suggest-reviewers</code>, <code>api-changes</code>, <code>test-gaps</code>, <code>semantic-diff</code>, <code>secrets</code>.</li>
<li><input checked="" disabled="" type="checkbox"> Multi-agent operations: <code>partition</code>, <code>affected</code>, <code>syntax-check</code>, workspace-aware context and traces.</li>
<li><input checked="" disabled="" type="checkbox"> Budget-aware context delivery: <code>--budget</code> (partial rollout), PageRank-weighted truncation, conversation-aware ranking.</li>
</ul>
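<p>The last shipped item mentions PageRank-weighted truncation under a <code>--budget</code>. The README does not describe the algorithm, so the following is only one plausible reading: rank symbols with a plain power-iteration PageRank, then greedily keep the highest-ranked ones whose token costs fit the budget. The graph, token costs, and function names are invented for illustration.</p>

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of successors."""
    nodes = sorted(set(edges) | {t for targets in edges.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        nxt = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in edges.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for target in targets:
                    nxt[target] += share
        rank = nxt
    return rank


def truncate_to_budget(costs, rank, budget):
    """Greedily keep the highest-ranked symbols whose token costs fit the budget."""
    kept, used = [], 0
    # Sort by descending rank; break ties by name so the result is deterministic.
    for name in sorted(costs, key=lambda s: (-rank.get(s, 0.0), s)):
        if used + costs[name] <= budget:
            kept.append(name)
            used += costs[name]
    return kept, used


# Hypothetical call graph: caller -> callees.
EDGES = {
    "cli": ["parse", "format"],
    "parse": ["resolve"],
    "format": ["resolve"],
    "resolve": [],
}
# Hypothetical token cost of including each symbol's source in the context.
COSTS = {"cli": 40, "parse": 30, "format": 30, "resolve": 50}

if __name__ == "__main__":
    ranks = pagerank(EDGES)
    print(truncate_to_budget(COSTS, ranks, budget=100))
```

<p>With this toy graph and a budget of 100, <code>resolve</code> is kept first because both <code>parse</code> and <code>format</code> call it, and <code>format</code> fills most of the remaining space. A greedy pass is simple but not optimal; a knapsack-style selection would be an alternative if budgets are tight.</p>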
<h3>Next</h3>
<ul>
<li><input checked="" disabled="" type="checkbox"> Terminal demo GIF in README.</li>
<li><input disabled="" type="checkbox"> GitHub repo topics.</li>
<li><input disabled="" type="checkbox"> GitHub Discussions enabled.</li>
<li><input disabled="" type="checkbox"> MCP directory + awesome-list submissions.</li>
</ul>
<h2>Contributing</h2>
<pre><code class="language-bash">git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
pytest tests/   # all ~5500 tests must pass
</code></pre>
<p>Good first contributions: add a <a href="src/roam/languages/">Tier 1 language</a> (see <code>go_lang.py</code> or <code>php_lang.py</code> as templates), improve reference resolution, add benchmark repos, extend SARIF converters, add MCP tools.</p>
<p>Please open an issue first to discuss larger changes.</p>
<h2>License</h2>
<p><a href="LICENSE">MIT</a></p>

SEE ALSO

clihub    4/9/2026    ROAM-CODE(1)