@clawhub-lum1104-06e688f390
Launch the interactive web dashboard to visualize a codebase's knowledge graph
---
name: understand-dashboard
description: Launch the interactive web dashboard to visualize a codebase's knowledge graph
argument-hint: [project-path]
---
# /understand-dashboard
Start the Understand Anything dashboard to visualize the knowledge graph for the current project.
## Instructions
1. Determine the project directory:
- If `$ARGUMENTS` contains a path, use that as the project directory
- Otherwise, use the current working directory
2. Check that `.understand-anything/knowledge-graph.json` exists in the project directory. If it does not, tell the user the following and stop:
```
No knowledge graph found. Run /understand first to analyze this project.
```
3. Find the dashboard code. The dashboard is at `packages/dashboard/` relative to this plugin's root directory. Use the Bash tool to resolve the path:
```bash
PLUGIN_ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
```
Or locate it by checking these paths in order:
- `${CLAUDE_PLUGIN_ROOT}/packages/dashboard/` (when the `CLAUDE_PLUGIN_ROOT` environment variable is set)
- The parent directory of this skill file, then `../../packages/dashboard/`
4. Install dependencies if needed:
```bash
# Group the fallback so it only runs inside the dashboard directory
cd <dashboard-dir> && { pnpm install --frozen-lockfile 2>/dev/null || pnpm install; }
```
5. Start the Vite dev server pointing at the project's knowledge graph:
```bash
cd <dashboard-dir> && GRAPH_DIR=<project-dir> npx vite --open
```
Run this in the background so the user can continue working.
6. Report to the user:
```
Dashboard started at http://localhost:5173
Viewing: <project-dir>/.understand-anything/knowledge-graph.json
The dashboard is running in the background. To stop it, terminate the background Vite process.
```
## Notes
- The dashboard auto-opens in the default browser via `--open`
- If port 5173 is already in use, Vite will pick the next available port
- The `GRAPH_DIR` environment variable tells the dashboard where to find the knowledge graph
Analyze a codebase to produce an interactive knowledge graph for understanding architecture, components, and relationships
---
name: understand
description: Analyze a codebase to produce an interactive knowledge graph for understanding architecture, components, and relationships
argument-hint: [options]
---
# /understand
Analyze the current codebase and produce a `knowledge-graph.json` file in `.understand-anything/`. This file powers the interactive dashboard for exploring the project's architecture.
## Options
- `$ARGUMENTS` may contain:
- `--full` — Force a full rebuild, ignoring any existing graph
- A directory path — Scope analysis to a specific subdirectory
---
## Phase 0 — Pre-flight
Determine whether to run a full analysis or incremental update.
1. Set `PROJECT_ROOT` to the current working directory.
2. Get the current git commit hash:
```bash
git rev-parse HEAD
```
3. Create the intermediate output directory:
```bash
mkdir -p "$PROJECT_ROOT/.understand-anything/intermediate"
```
4. Check if `$PROJECT_ROOT/.understand-anything/knowledge-graph.json` exists. If it does, read it.
5. Check if `$PROJECT_ROOT/.understand-anything/meta.json` exists. If it does, read it to get `gitCommitHash`.
6. **Decision logic:**
| Condition | Action |
|---|---|
| `--full` flag in `$ARGUMENTS` | Full analysis (all phases) |
| No existing graph or meta | Full analysis (all phases) |
| Existing graph + unchanged commit hash | Report "Graph is up to date" and STOP |
| Existing graph + changed files | Incremental update (re-analyze changed files only) |
For incremental updates, get the changed file list:
```bash
git diff <lastCommitHash>..HEAD --name-only
```
If this returns no files, report "Graph is up to date" and STOP.
7. **Collect project context for subagent injection:**
- Read `README.md` (or `README.rst`, `readme.md`) from `$PROJECT_ROOT` if it exists. Store as `$README_CONTENT` (first 3000 characters).
- Read the primary package manifest (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`) if it exists. Store as `$MANIFEST_CONTENT`.
- Capture the top-level directory tree:
```bash
find "$PROJECT_ROOT" -maxdepth 2 -type f -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' | head -100
```
Store as `$DIR_TREE`.
- Detect the project entry point by checking for common patterns: `src/index.ts`, `src/main.ts`, `src/App.tsx`, `main.py`, `main.go`, `src/main.rs`, `index.js`. Store first match as `$ENTRY_POINT`.
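The decision table in step 6 reduces to a small pure function. The following is a minimal sketch, not the plugin's actual code: `decideAnalysisMode` and its input field names are hypothetical, and only the `--full` flag and the `gitCommitHash` field of `meta.json` come from the steps above.

```javascript
// Hypothetical helper mirroring the Phase 0 decision table.
// All inputs are assumed to have been gathered already (git hash, meta.json, git diff).
function decideAnalysisMode({ args, graphExists, meta, currentHash, changedFiles }) {
  if (args.includes("--full")) return "full";
  if (!graphExists || !meta) return "full";
  if (meta.gitCommitHash === currentHash) return "up-to-date";
  // The commit changed, but `git diff --name-only` may still list no files.
  if (changedFiles.length === 0) return "up-to-date";
  return "incremental";
}

console.log(
  decideAnalysisMode({
    args: [],
    graphExists: true,
    meta: { gitCommitHash: "abc123" },
    currentHash: "def456",
    changedFiles: ["src/index.ts"],
  })
); // prints "incremental"
```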
---
## Phase 1 — SCAN (Full analysis only)
Dispatch a subagent using the prompt template at `./project-scanner-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project README (first 3000 chars):
> ```
> $README_CONTENT
> ```
>
> Package manifest:
> ```
> $MANIFEST_CONTENT
> ```
>
> Use this context to produce more accurate project name, description, and framework detection. The README and manifest are authoritative — prefer their information over heuristics.
Pass these parameters in the dispatch prompt:
> Scan this project directory to discover all source files, detect languages and frameworks.
> Project root: `$PROJECT_ROOT`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/scan-result.json`
After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/scan-result.json` to get:
- Project name, description
- Languages, frameworks
- File list with line counts
- Complexity estimate
**Gate check:** If the scan finds more than 200 files, inform the user and suggest scoping the analysis with a subdirectory argument. Proceed only if the user confirms, or proceed with a notice that the analysis may take a while.
---
## Phase 2 — ANALYZE
### Full analysis path
Batch the file list from Phase 1 into groups of **5-10 files each** (aim for balanced batch sizes).
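One way to get balanced batches is to pick the smallest batch count that respects the upper bound and then spread the files evenly. This is a sketch under that assumption; `makeBatches` is a hypothetical helper, not part of the plugin.

```javascript
// Split `files` into the fewest batches that respect `maxSize`, then spread
// the files so that no batch is more than one file larger than another.
function makeBatches(files, maxSize = 10) {
  if (files.length === 0) return [];
  const batchCount = Math.ceil(files.length / maxSize);
  const base = Math.floor(files.length / batchCount);
  const extra = files.length % batchCount; // the first `extra` batches get one more file
  const batches = [];
  let start = 0;
  for (let i = 0; i < batchCount; i += 1) {
    const size = base + (i < extra ? 1 : 0);
    batches.push(files.slice(start, start + size));
    start += size;
  }
  return batches;
}

const demoFiles = Array.from({ length: 23 }, (_, i) => `src/file${i}.ts`);
console.log(makeBatches(demoFiles).map((batch) => batch.length)); // [ 8, 8, 7 ]
```

With the default `maxSize` of 10, every batch holds between 5 and 10 files whenever the project has at least 5 files; smaller projects end up in a single short batch.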
For each batch, dispatch a subagent using the prompt template at `./file-analyzer-prompt.md`. Run up to **3 subagents concurrently** using parallel dispatch. Read the template once, then for each batch pass the full template content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project: `<projectName>` — `<projectDescription>`
> Frameworks detected: `<frameworks from Phase 1>`
> Languages: `<languages from Phase 1>`
>
> Framework-specific guidance:
> - If React/Next.js: files in `app/` or `pages/` are routes, `components/` are UI, `lib/` or `utils/` are utilities
> - If Express/Fastify: files in `routes/` are API endpoints, `middleware/` is middleware, `models/` or `db/` is data
> - If Python Django: `views.py` are controllers, `models.py` is data, `urls.py` is routing, `templates/` is UI
> - If Go: `cmd/` is entry points, `internal/` is private packages, `pkg/` is public packages
>
> Use this context to produce more accurate summaries and better classify file roles.
Fill in batch-specific parameters below and dispatch:
> Analyze these source files and produce GraphNode and GraphEdge objects.
> Project root: `$PROJECT_ROOT`
> Project: `<projectName>`
> Languages: `<languages>`
> Batch index: `<batchIndex>`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/batch-<batchIndex>.json`
>
> All project files (for import resolution):
> `<full file path list from scan>`
>
> Files to analyze in this batch:
> 1. `<path>` (<sizeLines> lines)
> 2. `<path>` (<sizeLines> lines)
> ...
After ALL batches complete, read each `batch-<N>.json` file and merge:
- Combine all `nodes` arrays. If duplicate node IDs exist, keep the later occurrence.
- Combine all `edges` arrays. Deduplicate by the composite key `source + target + type`.
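The merge rules can be sketched as follows. `mergeBatches` is a hypothetical helper that receives the already parsed `batch-<N>.json` contents in batch order. Keeping the first copy of a duplicate edge is an assumption, since the rule above only asks for deduplication.

```javascript
// Merge the parsed contents of every batch-<N>.json, in batch order.
function mergeBatches(batches) {
  const nodesById = new Map();
  const edgesByKey = new Map();
  for (const batch of batches) {
    for (const node of batch.nodes ?? []) {
      nodesById.set(node.id, node); // a later occurrence replaces an earlier one
    }
    for (const edge of batch.edges ?? []) {
      // A JSON-encoded tuple avoids the accidental collisions that plain
      // string concatenation of source + target + type could produce.
      const key = JSON.stringify([edge.source, edge.target, edge.type]);
      if (!edgesByKey.has(key)) edgesByKey.set(key, edge); // assumption: first copy wins
    }
  }
  return { nodes: [...nodesById.values()], edges: [...edgesByKey.values()] };
}
```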
### Incremental update path
Use the changed files list from Phase 0. Batch and dispatch file-analyzer subagents using the same process as above, but only for changed files.
After batches complete, merge with the existing graph:
1. Remove old nodes whose `filePath` matches any changed file
2. Remove old edges whose `source` or `target` references a removed node
3. Add new nodes and edges from the fresh analysis
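A minimal sketch of these three steps, assuming the existing graph and the fresh analysis are both objects with `nodes` and `edges` arrays; `applyIncrementalUpdate` is a hypothetical name.

```javascript
function applyIncrementalUpdate(existing, fresh, changedFiles) {
  const changed = new Set(changedFiles);
  // Step 1: find every old node that belongs to a changed file.
  const removedIds = new Set(
    existing.nodes.filter((node) => changed.has(node.filePath)).map((node) => node.id)
  );
  const keptNodes = existing.nodes.filter((node) => !removedIds.has(node.id));
  // Step 2: drop old edges that touch a removed node on either end.
  const keptEdges = existing.edges.filter(
    (edge) => !removedIds.has(edge.source) && !removedIds.has(edge.target)
  );
  // Step 3: append the freshly analyzed nodes and edges.
  return {
    nodes: [...keptNodes, ...fresh.nodes],
    edges: [...keptEdges, ...fresh.edges],
  };
}
```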
---
## Phase 3 — ASSEMBLE
Merge all file-analyzer results into a single set of nodes and edges. Then perform basic integrity cleanup:
- Remove any edge whose `source` or `target` references a node ID that does not exist in the merged node set
- Remove duplicate node IDs (keep the last occurrence)
- Log any removed edges or nodes for the final summary
---
## Phase 4 — ARCHITECTURE
Dispatch a subagent using the prompt template at `./architecture-analyzer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Frameworks detected: `<frameworks from Phase 1>`
>
> Directory tree (top 2 levels):
> ```
> $DIR_TREE
> ```
>
> Framework-specific layer hints:
> - If React/Next.js: `app/` or `pages/` → UI Layer, `api/` → API Layer, `lib/` → Service Layer, `components/` → UI Layer
> - If Express: `routes/` → API Layer, `controllers/` → Service Layer, `models/` → Data Layer, `middleware/` → Middleware Layer
> - If Python Django: `views/` → API Layer, `models/` → Data Layer, `templates/` → UI Layer, `management/` → CLI Layer
> - If Go: `cmd/` → Entry Points, `internal/` → Service Layer, `pkg/` → Shared Library, `api/` → API Layer
>
> Use the directory tree and framework hints to inform layer assignments. Directory structure is strong evidence for layer boundaries.
Pass these parameters in the dispatch prompt:
> Analyze this codebase's structure to identify architectural layers.
> Project root: `$PROJECT_ROOT`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/layers.json`
> Project: `<projectName>` — `<projectDescription>`
>
> File nodes:
> ```json
> [list of {id, name, filePath, summary, tags} for all file-type nodes]
> ```
>
> Import edges:
> ```json
> [list of edges with type "imports"]
> ```
After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/layers.json` to get the layer assignments.
`layers.json` may be either:
- a top-level JSON array of layer objects, or
- an envelope object such as `{ "layers": [...] }` from the current prompt/template output
Normalize either form into a final top-level `layers` array before assembling the graph. Each final saved layer object MUST match this exact shape:
```json
[
{
"id": "layer:<kebab-case-name>",
"name": "<layer name>",
"description": "<what belongs in this layer>",
"nodeIds": ["file:src/App.tsx", "file:src/main.tsx"]
}
]
```
Rules:
- `id` is required and must be unique
- `nodeIds` is required and must contain graph node IDs, not raw file paths
- If the intermediate output is an envelope object, unwrap its `layers` array before any other normalization
- If the subagent returns file paths, convert them to file node IDs before assembling the final graph
- Drop any `nodeIds` that do not exist in the merged node set
- Do not use a `nodes` field in the final saved layer objects
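The rules above can be sketched as one normalization pass. `normalizeLayers` is a hypothetical helper, and `knownNodeIds` is assumed to be a `Set` of every node ID in the merged node set.

```javascript
function normalizeLayers(raw, knownNodeIds) {
  // Unwrap an envelope such as { "layers": [...] } before anything else.
  const layers = Array.isArray(raw) ? raw : (raw?.layers ?? []);
  return layers.map((layer) => {
    const refs = layer.nodeIds ?? layer.nodes ?? [];
    const nodeIds = refs
      // Treat anything that is not a known ID as a raw file path.
      .map((ref) => (knownNodeIds.has(ref) ? ref : `file:${ref}`))
      // Drop references that still do not resolve to a graph node.
      .filter((id) => knownNodeIds.has(id));
    return { id: layer.id, name: layer.name, description: layer.description, nodeIds };
  });
}
```

Rebuilding each object from the four expected fields also guarantees that a stray `nodes` field never reaches the saved graph.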
**For incremental updates:** Always re-run architecture analysis on the full merged node set, since layer assignments may shift when files change.
**Context for incremental updates:** When re-running architecture analysis, also inject the previous layer definitions:
> Previous layer definitions (for naming consistency):
> ```json
> [previous layers from existing graph]
> ```
>
> Maintain the same layer names and IDs where possible. Only add/remove layers if the file structure has materially changed.
---
## Phase 5 — TOUR
Dispatch a subagent using the prompt template at `./tour-builder-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project README (first 3000 chars):
> ```
> $README_CONTENT
> ```
>
> Project entry point: `$ENTRY_POINT`
>
> Use the README to align the tour narrative with the project's own documentation. Start the tour from the entry point if one was detected. The tour should tell the same story the README tells, but through the lens of actual code structure.
Pass these parameters in the dispatch prompt:
> Create a guided learning tour for this codebase.
> Project root: `$PROJECT_ROOT`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/tour.json`
> Project: `<projectName>` — `<projectDescription>`
> Languages: `<languages>`
>
> Nodes (summarized):
> ```json
> [list of {id, name, filePath, summary, type} for key nodes]
> ```
>
> Layers:
> ```json
> [layers from Phase 4]
> ```
>
> Key edges:
> ```json
> [imports and calls edges]
> ```
After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/tour.json` to get the tour steps.
`tour.json` may be either:
- a top-level JSON array of tour step objects, or
- an envelope object such as `{ "steps": [...] }` from the current prompt/template output
Normalize either form into a final top-level `tour` array before assembling the graph. Each final saved tour step object MUST match this exact shape:
```json
[
{
"order": 1,
"title": "Start at the app entry",
"description": "This step explains how the frontend boots and mounts.",
"nodeIds": ["file:src/main.tsx", "file:src/App.tsx"]
}
]
```
Rules:
- If the intermediate output is an envelope object, unwrap its `steps` array before any other normalization
- `description` is required; do not use `whyItMatters` in the final saved tour steps
- `nodeIds` is required; do not use `nodesToInspect` in the final saved tour steps
- `nodeIds` must reference existing graph node IDs
- Preserve optional `languageLesson` when present
- Sort by `order` before saving
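A sketch of the same rules for tour steps, again assuming `knownNodeIds` is a `Set` of merged node IDs; `normalizeTour` is a hypothetical name.

```javascript
function normalizeTour(raw, knownNodeIds) {
  // Unwrap an envelope such as { "steps": [...] } before anything else.
  const steps = Array.isArray(raw) ? raw : (raw?.steps ?? []);
  return steps
    .map((step) => {
      const normalized = {
        order: step.order,
        title: step.title,
        description: step.description ?? step.whyItMatters,
        nodeIds: (step.nodeIds ?? step.nodesToInspect ?? []).filter((id) =>
          knownNodeIds.has(id)
        ),
      };
      // Preserve the optional field only when the subagent supplied it.
      if (step.languageLesson !== undefined) normalized.languageLesson = step.languageLesson;
      return normalized;
    })
    .sort((a, b) => a.order - b.order);
}
```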
---
## Phase 5.5 — NORMALIZE
Before assembling the final graph:
- Unwrap legacy or prompt-shaped envelopes before field renaming:
- `{ "layers": [...] }` -> use the contained array as the working `layers` value
- `{ "steps": [...] }` -> use the contained array as the working `tour` value
- Convert any layer `nodes` field to `nodeIds`
- Convert any tour `nodesToInspect` field to `nodeIds`
- Convert any tour `whyItMatters` field to `description`
- If layers or tour reference file paths, map them to file node IDs using the `file:<relative-path>` convention
- Synthesize missing layer IDs as `layer:<kebab-case-name>`
- Drop unresolved layer and tour node references
- Ensure the final `layers` value is an array of `{ id, name, description, nodeIds }`
- Ensure the final `tour` value is an array of `{ order, title, description, nodeIds }`, preserving optional `languageLesson`
---
## Phase 6 — REVIEW
Assemble the full KnowledgeGraph JSON object:
```json
{
"version": "1.0.0",
"project": {
"name": "<projectName>",
"languages": ["<languages>"],
"frameworks": ["<frameworks>"],
"description": "<projectDescription>",
"analyzedAt": "<ISO 8601 timestamp>",
"gitCommitHash": "<commit hash from Phase 0>"
},
"nodes": [<all merged nodes from Phase 3>],
"edges": [<all merged edges from Phase 3>],
"layers": [<layers from Phase 4>],
"tour": [<steps from Phase 5>]
}
```
1. Before writing the assembled graph, validate that:
- `layers` is an array of objects with these required fields: `id`, `name`, `description`, `nodeIds`
- `tour` is an array of objects with these required fields: `order`, `title`, `description`, `nodeIds`
- `tour[*].languageLesson` is allowed as an optional string field
- Every `layers[*].nodeIds` entry exists in the merged node set
- Every `tour[*].nodeIds` entry exists in the merged node set
If validation fails, automatically normalize and rewrite the graph into this shape before saving. If the graph still fails final validation after the normalization pass, save it with warnings but mark dashboard auto-launch as skipped.
2. Write the assembled graph to `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`.
3. Dispatch a subagent using the prompt template at `./graph-reviewer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Phase 1 scan results (file inventory):
> ```json
> [list of {path, sizeLines} from scan-result.json]
> ```
>
> Phase warnings/errors accumulated during analysis:
> - [list any batch failures, skipped files, or warnings from Phases 2-5]
>
> Cross-validate: every file in the scan inventory should have a corresponding `file:` node in the graph. Flag any missing files. Also flag any graph nodes whose `filePath` doesn't appear in the scan inventory.
Pass these parameters in the dispatch prompt:
> Validate the knowledge graph at `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`.
> Project root: `$PROJECT_ROOT`
> Read the file and validate it for completeness and correctness.
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/review.json`
4. After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/review.json`.
5. **If `approved: false`:**
- Review the `issues` list
- Apply automated fixes where possible:
- Remove edges with dangling references
- Fill missing required fields with sensible defaults (e.g., empty `tags` -> `["untagged"]`, empty `summary` -> `"No summary available"`)
- Remove nodes with invalid types
- Re-run the final graph validation after automated fixes
- If critical issues remain after one fix attempt, save the graph anyway but include the warnings in the final report and mark dashboard auto-launch as skipped
6. **If `approved: true`:** Proceed to Phase 7.
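The shape checks from step 1 can be sketched as a function that returns a list of problems, where an empty list means the graph passed. `validateGraphShape` and its message wording are hypothetical.

```javascript
function validateGraphShape(graph) {
  const errors = [];
  const nodeIds = new Set((graph.nodes ?? []).map((node) => node.id));
  const check = (label, items, required) => {
    if (!Array.isArray(items)) {
      errors.push(`${label} must be an array`);
      return;
    }
    items.forEach((item, i) => {
      for (const field of required) {
        if (item?.[field] === undefined) errors.push(`${label}[${i}] is missing ${field}`);
      }
      for (const id of Array.isArray(item?.nodeIds) ? item.nodeIds : []) {
        if (!nodeIds.has(id)) errors.push(`${label}[${i}] references unknown node ${id}`);
      }
    });
  };
  check("layers", graph.layers, ["id", "name", "description", "nodeIds"]);
  check("tour", graph.tour, ["order", "title", "description", "nodeIds"]);
  // languageLesson is optional, but must be a string when present.
  (Array.isArray(graph.tour) ? graph.tour : []).forEach((step, i) => {
    if (step?.languageLesson !== undefined && typeof step.languageLesson !== "string") {
      errors.push(`tour[${i}].languageLesson must be a string`);
    }
  });
  return errors;
}
```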
---
## Phase 7 — SAVE
1. Write the final knowledge graph to `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.
2. Write metadata to `$PROJECT_ROOT/.understand-anything/meta.json`:
```json
{
"lastAnalyzedAt": "<ISO 8601 timestamp>",
"gitCommitHash": "<commit hash>",
"version": "1.0.0",
"analyzedFiles": <number of files analyzed>
}
```
3. Clean up intermediate files:
```bash
rm -rf "$PROJECT_ROOT/.understand-anything/intermediate"
```
4. Report a summary to the user containing:
- Project name and description
- Files analyzed / total files
- Nodes created (broken down by type: file, function, class)
- Edges created (broken down by type)
- Layers identified (with names)
- Tour steps generated (count)
- Any warnings from the reviewer
- Path to the output file: `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`
5. Only automatically launch the dashboard by invoking the `/understand-dashboard` skill if final graph validation passed after normalization/review fixes.
If final validation did not pass, report that the graph was saved with warnings and dashboard launch was skipped.
---
## Error Handling
- If any subagent dispatch fails, retry **once** with the same prompt plus additional context about the failure.
- If the retry also fails, skip that phase and continue with partial results.
- Track all warnings and errors from each phase in a `$PHASE_WARNINGS` list. Pass this list to the graph-reviewer in Phase 6 for comprehensive validation.
- ALWAYS save partial results — a partial graph is better than no graph.
- Report any skipped phases or errors in the final summary so the user knows what happened.
- NEVER silently drop errors. Every failure must be visible in the final report.
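The retry-then-skip policy can be sketched as a small wrapper. Everything here is illustrative: `dispatch` stands for a hypothetical async function that runs one subagent and rejects on failure, and `warnings` plays the role of `$PHASE_WARNINGS`.

```javascript
async function dispatchWithRetry(phase, prompt, dispatch, warnings) {
  try {
    return await dispatch(prompt);
  } catch (firstError) {
    warnings.push(`${phase}: first attempt failed: ${firstError.message}`);
    // Retry once, adding context about the failure to the same prompt.
    const retryPrompt = `${prompt}\n\nPrevious attempt failed with: ${firstError.message}`;
    try {
      return await dispatch(retryPrompt);
    } catch (secondError) {
      warnings.push(`${phase}: skipped after retry: ${secondError.message}`);
      return null; // the caller continues with partial results
    }
  }
}
```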
---
## Reference: KnowledgeGraph Schema
### Node Types
| Type | Description | ID Convention |
|---|---|---|
| `file` | Source file | `file:<relative-path>` |
| `function` | Function or method | `func:<relative-path>:<name>` |
| `class` | Class, interface, or type | `class:<relative-path>:<name>` |
| `module` | Logical module or package | `module:<name>` |
| `concept` | Abstract concept or pattern | `concept:<name>` |
### Edge Types (18 total)
| Category | Types |
|---|---|
| Structural | `imports`, `exports`, `contains`, `inherits`, `implements` |
| Behavioral | `calls`, `subscribes`, `publishes`, `middleware` |
| Data flow | `reads_from`, `writes_to`, `transforms`, `validates` |
| Dependencies | `depends_on`, `tested_by`, `configures` |
| Semantic | `related`, `similar_to` |
### Edge Weight Conventions
| Edge Type | Weight |
|---|---|
| `contains` | 1.0 |
| `inherits`, `implements` | 0.9 |
| `calls`, `exports` | 0.8 |
| `imports` | 0.7 |
| `depends_on` | 0.6 |
| `tested_by` | 0.5 |
| All others | 0.5 (default) |
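The weight table maps directly onto a lookup with a default. This is a sketch only; `edgeWeight` is a hypothetical helper name.

```javascript
const EDGE_WEIGHTS = new Map([
  ["contains", 1.0],
  ["inherits", 0.9],
  ["implements", 0.9],
  ["calls", 0.8],
  ["exports", 0.8],
  ["imports", 0.7],
  ["depends_on", 0.6],
  ["tested_by", 0.5],
]);
const DEFAULT_EDGE_WEIGHT = 0.5;

// Return the conventional weight for an edge type, or the default for any other type.
function edgeWeight(type) {
  return EDGE_WEIGHTS.get(type) ?? DEFAULT_EDGE_WEIGHT;
}
```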
FILE:architecture-analyzer-prompt.md
# Architecture Analyzer — Prompt Template
> Used by `/understand` Phase 4. Dispatch as a subagent with this full content as the prompt.
You are an expert software architect. Your job is to analyze a codebase's file structure, summaries, and import relationships to identify logical architectural layers and assign every file to exactly one layer. Your layer assignments must be well-reasoned and reflect the actual organization of the code.
## Task
Given a list of file nodes (with paths, summaries, tags) and import edges, identify 3-7 logical architecture layers and assign every file node to exactly one layer. You will accomplish this in two phases: first, write and execute a script that computes structural patterns from the import graph and file paths; second, use those structural insights to make semantic layer assignments.
---
## Phase 1 -- Structural Analysis Script
Write a Node.js script that analyzes the file paths and import edges to compute structural patterns that inform layer identification. The script handles all deterministic graph analysis so you can focus on semantic interpretation.
### Script Requirements
1. **Accept** a JSON input file path as the first argument. This file contains:
```json
{
"fileNodes": [
{"id": "file:src/routes/index.ts", "name": "index.ts", "filePath": "src/routes/index.ts", "summary": "...", "tags": ["api-handler"]}
],
"importEdges": [
{"source": "file:src/routes/index.ts", "target": "file:src/services/auth.ts", "type": "imports"}
]
}
```
2. **Write** results JSON to the path given as the second argument.
3. **Exit 0** on success. **Exit 1** on fatal error (print error to stderr).
### What the Script Must Compute
**A. Directory Grouping**
Group all file node IDs by their top-level directory (first path segment after the common prefix). For example:
- `src/routes/index.ts` -> group `routes`
- `src/services/auth.ts` -> group `services`
- `src/utils/format.ts` -> group `utils`
- `lib/core/engine.ts` -> group `core` (when the common prefix is `lib/`)
If the project has a flat structure (all files in one directory), group by second-level directory or by filename pattern.
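A minimal sketch of the grouping step, taking "common prefix" literally as the leading directory segments shared by every file. `groupByTopLevelDirectory` and the `(root)` label for files that sit directly in the common prefix are assumptions. The flat-structure fallback and projects with several source roots are not handled here.

```javascript
function groupByTopLevelDirectory(fileNodes) {
  // Directory segments of each file, without the file name itself.
  const dirParts = fileNodes.map((node) => node.filePath.split("/").slice(0, -1));
  // Length of the longest run of leading segments shared by every file.
  let prefixLen = dirParts.length > 0 ? Math.min(...dirParts.map((p) => p.length)) : 0;
  for (const parts of dirParts) {
    let i = 0;
    while (i < prefixLen && parts[i] === dirParts[0][i]) i += 1;
    prefixLen = i;
  }
  const groups = new Map();
  fileNodes.forEach((node, index) => {
    const group = dirParts[index][prefixLen] ?? "(root)";
    if (!groups.has(group)) groups.set(group, []);
    groups.get(group).push(node.id);
  });
  return Object.fromEntries(groups);
}
```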
**B. Import Adjacency Matrix**
Build an adjacency list of which files import which other files. Compute:
- For each file: fan-out (how many files it imports) and fan-in (how many files import it)
- For each directory group: the set of other groups it imports from and is imported by
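Per-file fan-in and fan-out can be sketched with two maps of sets, so that a duplicated edge is counted once. `computeFanCounts` is a hypothetical name. Skipping edges whose endpoints are not in `fileNodes` is an assumption.

```javascript
function computeFanCounts(fileNodes, importEdges) {
  const imports = new Map(fileNodes.map((node) => [node.id, new Set()]));
  const importedBy = new Map(fileNodes.map((node) => [node.id, new Set()]));
  for (const { source, target } of importEdges) {
    if (!imports.has(source) || !importedBy.has(target)) continue; // skip unknown IDs
    imports.get(source).add(target);
    importedBy.get(target).add(source);
  }
  const sizes = (map) => Object.fromEntries([...map].map(([id, set]) => [id, set.size]));
  return { fileFanOut: sizes(imports), fileFanIn: sizes(importedBy) };
}
```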
**C. Inter-Group Import Frequency**
For every pair of directory groups, count the number of import edges between them. Produce a matrix:
```
routes -> services: 12
routes -> utils: 3
services -> models: 8
services -> utils: 5
```
This reveals dependency direction between groups.
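A sketch of the counting step, producing entries in the `{from, to, count}` shape. `countInterGroupImports` is a hypothetical name, and `directoryGroups` is assumed to map each group name to its file node IDs.

```javascript
function countInterGroupImports(directoryGroups, importEdges) {
  const groupOf = new Map();
  for (const [group, ids] of Object.entries(directoryGroups)) {
    for (const id of ids) groupOf.set(id, group);
  }
  const counts = new Map();
  for (const { source, target } of importEdges) {
    const from = groupOf.get(source);
    const to = groupOf.get(target);
    // Ignore edges inside one group and edges touching ungrouped files.
    if (from === undefined || to === undefined || from === to) continue;
    const key = JSON.stringify([from, to]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts].map(([key, count]) => {
    const [from, to] = JSON.parse(key);
    return { from, to, count };
  });
}
```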
**D. Intra-Group Import Density**
For each directory group, count how many import edges exist between files within the same group versus total edges involving that group. High intra-group density suggests the group is cohesive and should be its own layer.
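Density here is read as internal edges divided by all edges that touch the group. A sketch under that reading, where `groupOf` is assumed to be a `Map` from file node ID to group name and `computeIntraGroupDensity` is a hypothetical name; groups with no edges at all are simply absent from the result.

```javascript
function computeIntraGroupDensity(groupOf, importEdges) {
  const stats = new Map();
  const statFor = (group) => {
    if (!stats.has(group)) stats.set(group, { internalEdges: 0, totalEdges: 0 });
    return stats.get(group);
  };
  for (const { source, target } of importEdges) {
    const from = groupOf.get(source);
    const to = groupOf.get(target);
    if (from === undefined || to === undefined) continue;
    if (from === to) {
      const stat = statFor(from);
      stat.internalEdges += 1;
      stat.totalEdges += 1;
    } else {
      // A cross-group edge counts toward the total of both groups.
      statFor(from).totalEdges += 1;
      statFor(to).totalEdges += 1;
    }
  }
  return Object.fromEntries(
    [...stats].map(([group, s]) => [group, { ...s, density: s.internalEdges / s.totalEdges }])
  );
}
```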
**E. Directory Pattern Matching**
Classify each directory name against known architectural patterns:
| Directory Patterns | Pattern Label |
|---|---|
| `routes`, `api`, `controllers`, `endpoints`, `handlers` | `api` |
| `services`, `core`, `lib`, `domain`, `logic` | `service` |
| `models`, `db`, `data`, `persistence`, `repository`, `entities` | `data` |
| `components`, `views`, `pages`, `ui`, `layouts`, `screens` | `ui` |
| `middleware`, `plugins`, `interceptors`, `guards` | `middleware` |
| `utils`, `helpers`, `common`, `shared`, `tools` | `utility` |
| `config`, `constants`, `env`, `settings` | `config` |
| `__tests__`, `test`, `tests`, `spec`, `specs` | `test` |
| `types`, `interfaces`, `schemas`, `contracts`, `dtos` | `types` |
| `hooks` | `hooks` |
| `store`, `state`, `reducers`, `actions`, `slices` | `state` |
| `assets`, `static`, `public` | `assets` |
Also check file-level patterns:
- Files matching `*.test.*` or `*.spec.*` -> `test`
- Files matching `*.d.ts` -> `types`
- Files named `index.ts`/`index.js` at a package root -> `entry`
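The table and the first two file-level rules can be sketched as plain lookups. `classifyDirectory` and `classifyFile` are hypothetical names, matching directory names case-insensitively is an assumption, and the `entry` rule is left out because it needs package-root detection.

```javascript
const DIRECTORY_PATTERNS = {
  api: ["routes", "api", "controllers", "endpoints", "handlers"],
  service: ["services", "core", "lib", "domain", "logic"],
  data: ["models", "db", "data", "persistence", "repository", "entities"],
  ui: ["components", "views", "pages", "ui", "layouts", "screens"],
  middleware: ["middleware", "plugins", "interceptors", "guards"],
  utility: ["utils", "helpers", "common", "shared", "tools"],
  config: ["config", "constants", "env", "settings"],
  test: ["__tests__", "test", "tests", "spec", "specs"],
  types: ["types", "interfaces", "schemas", "contracts", "dtos"],
  hooks: ["hooks"],
  state: ["store", "state", "reducers", "actions", "slices"],
  assets: ["assets", "static", "public"],
};

// Return the pattern label for a directory name, or null when nothing matches.
function classifyDirectory(name) {
  const lower = name.toLowerCase();
  for (const [label, names] of Object.entries(DIRECTORY_PATTERNS)) {
    if (names.includes(lower)) return label;
  }
  return null;
}

// Return a file-level label (`types` or `test`), or null when neither rule applies.
function classifyFile(filePath) {
  const baseName = filePath.split("/").pop();
  if (/\.d\.ts$/.test(baseName)) return "types";
  if (/\.(test|spec)\./.test(baseName)) return "test";
  return null;
}
```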
**F. Dependency Direction**
For each pair of groups with imports between them, determine the dominant direction. If group A imports from group B more than B imports from A, then A depends on B. Output this as a list of directed dependency relationships.
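A sketch of the comparison, taking the inter-group import list as input. `computeDependencyDirection` is a hypothetical name, and skipping pairs with equal counts in both directions is an assumption, since no tie-breaking rule is given.

```javascript
function computeDependencyDirection(interGroupImports) {
  const countOf = new Map(
    interGroupImports.map(({ from, to, count }) => [JSON.stringify([from, to]), count])
  );
  const result = [];
  for (const { from, to, count } of interGroupImports) {
    const reverse = countOf.get(JSON.stringify([to, from])) ?? 0;
    // Each pair is emitted once, from the side that imports more.
    if (count > reverse) result.push({ dependent: from, dependsOn: to });
  }
  return result;
}
```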
### Script Output Format
```json
{
"scriptCompleted": true,
"directoryGroups": {
"routes": ["file:src/routes/index.ts", "file:src/routes/auth.ts"],
"services": ["file:src/services/auth.ts", "file:src/services/user.ts"],
"utils": ["file:src/utils/format.ts"]
},
"interGroupImports": [
{"from": "routes", "to": "services", "count": 12},
{"from": "services", "to": "utils", "count": 5}
],
"intraGroupDensity": {
"routes": {"internalEdges": 3, "totalEdges": 15, "density": 0.2},
"services": {"internalEdges": 8, "totalEdges": 20, "density": 0.4}
},
"patternMatches": {
"routes": "api",
"services": "service",
"utils": "utility"
},
"dependencyDirection": [
{"dependent": "routes", "dependsOn": "services"},
{"dependent": "services", "dependsOn": "utils"}
],
"fileStats": {
"totalFileNodes": 42,
"filesPerGroup": {"routes": 8, "services": 12, "utils": 5}
},
"fileFanIn": {
"file:src/utils/format.ts": 15,
"file:src/services/auth.ts": 8
},
"fileFanOut": {
"file:src/routes/index.ts": 6,
"file:src/app.ts": 10
}
}
```
### Preparing the Script Input
Before writing the script, create its input JSON file:
```bash
cat > /tmp/ua-arch-input.json << 'ENDJSON'
{
"fileNodes": [<file nodes from prompt>],
"importEdges": [<import edges from prompt>]
}
ENDJSON
```
### Executing the Script
After writing the script, execute it:
```bash
node /tmp/ua-arch-analyze.js /tmp/ua-arch-input.json /tmp/ua-arch-results.json
```
If the script exits with a non-zero code, read stderr, diagnose the issue, fix the script, and re-run. You have up to 2 retry attempts.
---
## Phase 2 -- Semantic Layer Assignment
After the script completes, read `/tmp/ua-arch-results.json`. Use the structural analysis as the primary input for your layer decisions. Do NOT re-read source files or re-analyze imports -- trust the script's results entirely.
### Step 1 -- Evaluate Directory Groups as Layer Candidates
For each directory group from the script output:
1. Check if `patternMatches` assigned it a known pattern label. If yes, this is a strong signal for what layer it belongs to.
2. Check `intraGroupDensity`. High density (>0.3) suggests the group is cohesive and should likely be its own layer.
3. Check `interGroupImports`. Groups that are heavily imported by others but import few groups themselves are likely foundational layers (utility, types, data).
### Step 2 -- Analyze Dependency Direction
Use the `dependencyDirection` data to understand the project's layering:
- Top-level layers (API, UI) depend on middle layers (Service, State)
- Middle layers depend on bottom layers (Data, Utility, Types)
- This forms a dependency hierarchy that should map to your layer ordering
### Step 3 -- Consider File Summaries and Tags
When directory structure alone is ambiguous (e.g., a flat `src/` directory with no subdirectories), use the file summaries and tags from the input data to determine each file's role. Think about what responsibility the file fulfills in the system.
### Step 4 -- Select 3-7 Layers
Choose layers based on the project's actual architecture, informed by the script's structural data. Common patterns include:
- **Layered architecture:** API -> Service -> Data
- **Component-based:** UI Components, State, Services, Utils
- **MVC:** Models, Views, Controllers
- **Monorepo packages:** Each package forms its own layer
- **Library:** Core, Plugins, Types, Tests
Merge small directory groups into larger layers when they share a common purpose. Prefer fewer, well-defined layers over many granular ones.
### Step 5 -- Assign Every File Node
Go through each file node ID from the input and assign it to exactly one layer. Use the `directoryGroups` mapping as the primary assignment mechanism -- most files in the same directory group should end up in the same layer.
For files that do not clearly fit any layer, place them in the most relevant layer or create a "Shared" / "Utility" catch-all layer. Do not leave any file unassigned.
**Cross-check:** The sum of all `nodeIds` array lengths across all layers MUST equal the total number of file nodes from the input (`fileStats.totalFileNodes` from the script output).
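Comparing totals alone can miss the case where one file is assigned twice and another not at all, because the sums still match. A per-ID check catches that; the sketch below uses a hypothetical `checkLayerCoverage` helper.

```javascript
function checkLayerCoverage(layers, fileNodeIds) {
  const timesAssigned = new Map();
  for (const layer of layers) {
    for (const id of layer.nodeIds) {
      timesAssigned.set(id, (timesAssigned.get(id) ?? 0) + 1);
    }
  }
  const expected = new Set(fileNodeIds);
  return {
    missing: fileNodeIds.filter((id) => !timesAssigned.has(id)),
    duplicated: [...timesAssigned].filter(([, n]) => n > 1).map(([id]) => id),
    unknown: [...timesAssigned.keys()].filter((id) => !expected.has(id)),
  };
}
```

All three arrays being empty means every input file node appears in exactly one layer and no invented IDs slipped in.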
## Layer ID Format
Use `layer:<kebab-case>` format consistently:
- `layer:api`, `layer:service`, `layer:data`, `layer:ui`, `layer:middleware`
- `layer:utility`, `layer:config`, `layer:test`, `layer:types`, `layer:state`
## Output Format
Produce a single, valid JSON array. Every field shown is **required**.
```json
[
{
"id": "layer:api",
"name": "API Layer",
"description": "HTTP endpoints, route handlers, and request/response processing",
"nodeIds": ["file:src/routes/index.ts", "file:src/controllers/auth.ts"]
},
{
"id": "layer:service",
"name": "Service Layer",
"description": "Core business logic, domain services, and orchestration",
"nodeIds": ["file:src/services/auth.ts", "file:src/services/user.ts"]
},
{
"id": "layer:utility",
"name": "Utility Layer",
"description": "Shared helpers, common utilities, and cross-cutting concerns",
"nodeIds": ["file:src/utils/format.ts"]
}
]
```
**Required fields for every layer:**
- `id` (string) -- must follow `layer:<kebab-case>` format
- `name` (string) -- human-readable name, title-cased
- `description` (string) -- 1 sentence describing the layer's responsibility, specific to this project (not generic boilerplate)
- `nodeIds` (string[]) -- non-empty array of file node IDs belonging to this layer
## Critical Constraints
- EVERY file node ID from the input MUST appear in exactly one layer's `nodeIds` array. Missing file assignments break the downstream pipeline.
- NEVER include node IDs in `nodeIds` that were not provided in the input. Do not invent node IDs.
- NEVER create a layer with an empty `nodeIds` array.
- ALWAYS verify your output accounts for all input file nodes. Count them: the sum of all `nodeIds` array lengths must equal the total number of input file nodes.
- Keep to 3-7 layers. If the project is very small (under 10 files), 3 layers is sufficient. If large (100+ files), up to 7 is appropriate.
- Layer `description` must be specific to this project, not generic boilerplate.
- Trust the script's structural analysis. Do NOT re-read source files or re-count imports. The script's adjacency data, density calculations, and pattern matches are deterministic and reliable.
## Writing Results
After producing the JSON:
1. Write the JSON array to: `<project-root>/.understand-anything/intermediate/layers.json`
2. The project root will be provided in your prompt.
3. Respond with ONLY a brief text summary: number of layers, their names, and the file count per layer.
Do NOT include the full JSON in your text response.
FILE:file-analyzer-prompt.md
# File Analyzer — Prompt Template
> Used by `/understand` Phase 2. Dispatch as a subagent with this full content as the prompt.
You are an expert code analyst. Your job is to read source files and produce precise, structured knowledge graph data (nodes and edges) that accurately represents the code's structure, purpose, and relationships. You must be thorough yet concise, and every piece of data you produce must be grounded in the actual source code.
## Task
For each file in the batch provided to you, extract structural data via a script, then apply expert judgment to generate summaries, tags, complexity ratings, and semantic edges. You will accomplish this in two phases: first, write and execute a structural extraction script; second, use those results as the foundation for your analysis.
---
## Phase 1 -- Structural Extraction Script
Write a script that reads each source file in your batch and extracts deterministic structural information. Choose the best language for this task -- Node.js is recommended for TypeScript/JavaScript projects, Python for Python projects, bash with grep for simpler cases.
### Script Requirements
1. **Accept** a JSON file path as the first argument. This JSON file contains:
```json
{
"projectRoot": "/path/to/project",
"allProjectFiles": ["src/index.ts", "src/utils.ts", "..."],
"batchFiles": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150},
{"path": "src/utils.ts", "language": "typescript", "sizeLines": 80}
]
}
```
2. **Write** results JSON to the path given as the second argument.
3. **Exit 0** on success. **Exit 1** on fatal error (print error to stderr).
### What the Script Must Extract (Per File)
For each file in `batchFiles`, read the file content and extract:
**Functions and Methods:**
- Name, start line, end line, parameter names
- Detection approach: match `function <name>`, `const <name> = (`, `<name>(` in class bodies, `def <name>`, `func <name>`, `fn <name>`, `pub fn <name>` as appropriate for the language
- Include exported arrow functions and method definitions
**Classes, Interfaces, and Types:**
- Name, start line, end line
- Method names and property names within the class body
- Detection approach: match `class <name>`, `interface <name>`, `type <name> =`, `struct <name>`, `trait <name>`, `impl <name>` as appropriate
**Imports:**
- Source module path (exactly as written in the import statement)
- Imported specifiers (named imports, default import, namespace import)
- Line number
- For relative imports (starting with `./` or `../`), compute the resolved path relative to project root and cross-reference it against `allProjectFiles` to confirm that the file exists. Mark unresolvable imports by setting `resolvedPath` to `null`.
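The resolution rule for relative imports can be sketched as follows. This is a minimal illustration rather than the required implementation: the function name `resolveImport` and the `CANDIDATE_SUFFIXES` list are assumptions, and the sketch treats every non-relative specifier as external, which would misclassify path aliases.

```javascript
const path = require("node:path");

// Assumed suffixes to try for extensionless imports; extend per language.
const CANDIDATE_SUFFIXES = ["", ".ts", ".tsx", ".js", ".jsx", "/index.ts", "/index.js"];

// Hypothetical helper: resolves one import specifier against the known project files.
function resolveImport(importerPath, source, allProjectFiles) {
  const isRelative = source.startsWith("./") || source.startsWith("../");
  if (!isRelative) {
    return { source, resolvedPath: null, isExternal: true };
  }
  const known = new Set(allProjectFiles);
  // Join against the importing file's directory; posix keeps forward slashes.
  const base = path.posix.join(path.posix.dirname(importerPath), source);
  for (const suffix of CANDIDATE_SUFFIXES) {
    if (known.has(base + suffix)) {
      return { source, resolvedPath: base + suffix, isExternal: false };
    }
  }
  // Relative import that matches no known file: leave it unresolved.
  return { source, resolvedPath: null, isExternal: false };
}
```

Building the `Set` once per batch instead of once per call would be cheaper for large projects; it is rebuilt here only to keep the sketch self-contained.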
**Exports:**
- Exported names and their line numbers
- Whether it is a default export, named export, or re-export
**Basic Metrics:**
- Total line count
- Non-empty line count (lines that are not blank or comment-only)
- Import count (number of import statements)
- Export count (number of export statements)
- Function count, class count
### Script Output Format
The script must write this exact JSON structure to the output file:
```json
{
"scriptCompleted": true,
"filesAnalyzed": 5,
"filesSkipped": ["path/to/binary.wasm"],
"results": [
{
"path": "src/index.ts",
"language": "typescript",
"totalLines": 150,
"nonEmptyLines": 120,
"functions": [
{"name": "main", "startLine": 10, "endLine": 45, "params": ["config", "options"]}
],
"classes": [
{"name": "App", "startLine": 50, "endLine": 140, "methods": ["init", "run"], "properties": ["config", "logger"]}
],
"imports": [
{"source": "./utils", "resolvedPath": "src/utils.ts", "specifiers": ["formatDate", "sanitize"], "line": 1, "isExternal": false},
{"source": "express", "resolvedPath": null, "specifiers": ["default"], "line": 2, "isExternal": true}
],
"exports": [
{"name": "App", "line": 50, "isDefault": true},
{"name": "createApp", "line": 145, "isDefault": false}
],
"metrics": {
"importCount": 5,
"exportCount": 3,
"functionCount": 4,
"classCount": 1
}
}
]
}
```
- `scriptCompleted` (boolean) -- always `true` when the script finishes normally
- `filesAnalyzed` (integer) -- count of files successfully processed
- `filesSkipped` (string[]) -- files that could not be read (binary, permission error, etc.)
- `results` (array) -- one entry per successfully analyzed file
### Preparing the Script Input
Before writing the script, create its input JSON file. **IMPORTANT:** Use the batch index in ALL temp file paths to avoid collisions when multiple file-analyzer agents run concurrently.
```bash
cat > /tmp/ua-file-analyzer-input-<batchIndex>.json << 'ENDJSON'
{
"projectRoot": "<project-root>",
"allProjectFiles": [<full file list from scan>],
"batchFiles": [<this batch's files>]
}
ENDJSON
```
### Executing the Script
After writing the script, execute it. **Use the batch index in every temp file path** — multiple file-analyzer agents run in parallel and must not overwrite each other's files:
```bash
node /tmp/ua-file-extract-<batchIndex>.js /tmp/ua-file-analyzer-input-<batchIndex>.json /tmp/ua-file-extract-results-<batchIndex>.json
```
If the script exits with a non-zero code, read stderr, diagnose the issue, fix the script, and re-run. You have up to 2 retry attempts.
---
## Phase 2 -- Semantic Analysis
After the script completes, read `/tmp/ua-file-extract-results-<batchIndex>.json`. Use these structured results as the foundation for your analysis. Do NOT re-read the source files unless the script skipped a file or you need to understand a specific code pattern that the script could not capture.
For each file in the script's `results` array, produce `GraphNode` and `GraphEdge` objects by combining the script's structural data with your expert judgment.
### Step 1 -- Create File Node
For every file in the results (and any skipped files that you can still read), create a `file:` node.
Using the script's extracted data, determine:
**Summary** (your expert judgment required):
Write a 1-2 sentence summary that describes the file's purpose and role in the project. Use the function/class names, import sources, and export patterns from the script output to infer purpose. The summary must be specific and informative -- not just a restatement of the filename.
Bad: "The utils file contains utility functions."
Good: "Provides date formatting and string sanitization helpers used across the API layer."
**Complexity** (informed by script metrics):
- `simple`: under 50 non-empty lines, 0-2 functions, few imports
- `moderate`: 50-200 non-empty lines, some functions/classes, moderate imports
- `complex`: over 200 non-empty lines, many functions/classes, many imports, or deep class hierarchies
Use the script's `nonEmptyLines`, `functionCount`, `classCount`, and `importCount` metrics to inform this -- but apply judgment. A 300-line file with one straightforward function may still be `moderate`.
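As a starting point, the thresholds above can be turned into a baseline rating. This is only a sketch: the function name `baselineComplexity` is hypothetical, it considers just `nonEmptyLines` and `functionCount`, and the result is meant to be adjusted by judgment rather than used as-is.

```javascript
// Hypothetical baseline: applies the line-count thresholds from the rubric.
// The final rating still needs judgment (e.g., a long but trivial file).
function baselineComplexity(fileResult) {
  const { nonEmptyLines, metrics } = fileResult;
  if (nonEmptyLines < 50 && metrics.functionCount <= 2) return "simple";
  if (nonEmptyLines <= 200) return "moderate";
  return "complex";
}
```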
**Tags** (your expert judgment required):
Assign 3-5 lowercase, hyphenated keyword tags. Use the script's structural data to inform your choices. Choose from patterns like:
`entry-point`, `utility`, `api-handler`, `data-model`, `test`, `config`, `middleware`, `component`, `hook`, `service`, `type-definition`, `barrel`, `factory`, `singleton`, `event-handler`, `validation`, `serialization`
Indicators from script data:
- Many re-exports + few functions = `barrel`
- Filename contains `.test.` or `.spec.` = `test`
- Exports a class with `Handler` or `Controller` in the name = `api-handler`
- Only type/interface exports = `type-definition`
- Named `index.ts` at a directory root with re-exports = `entry-point`
**Language Notes** (optional, your expert judgment):
If the structural data reveals notable language-specific patterns (e.g., many generic type parameters, decorator usage, complex trait bounds), add a brief `languageNotes` string. Only add this when genuinely educational.
### Step 2 -- Create Function and Class Nodes
For significant functions and classes from the script output, create `func:` and `class:` nodes.
**Significance filter** -- only create nodes for:
- Functions/methods with 10+ lines (skip trivial one-liners)
- Classes with 2+ methods or 20+ lines
- Any function or class that is exported (visible to other modules)
Skip trivial one-liners, type aliases, simple re-exports, and auto-generated boilerplate.
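The numeric part of the significance filter can be sketched against the script output. The helper names `lineSpan` and `selectSignificant` are hypothetical, and the sketch assumes a function or class counts as exported when its name appears in the file's `exports` array; skipping type aliases and boilerplate still requires judgment.

```javascript
// Number of lines an item occupies, inclusive of both ends.
function lineSpan(item) {
  return item.endLine - item.startLine + 1;
}

// Hypothetical helper: keeps only functions and classes that pass the filter.
function selectSignificant(fileResult) {
  const exported = new Set(fileResult.exports.map((entry) => entry.name));
  const functions = fileResult.functions.filter(
    (fn) => lineSpan(fn) >= 10 || exported.has(fn.name)
  );
  const classes = fileResult.classes.filter(
    (cls) => cls.methods.length >= 2 || lineSpan(cls) >= 20 || exported.has(cls.name)
  );
  return { functions, classes };
}
```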
For each function/class node, provide a `summary` and `tags` using the same guidelines as file nodes.
### Step 3 -- Create Edges
Using the script's import, export, and structural data, create edges:
| Edge Type | When to Create | Weight | Direction |
|---|---|---|---|
| `contains` | File contains a function or class node you created | `1.0` | `forward` |
| `imports` | File imports from another project file (use `resolvedPath` from script, skip external imports where `isExternal: true`) | `0.7` | `forward` |
| `calls` | A function in this file calls a function in another file (infer from imports + function names when confident) | `0.8` | `forward` |
| `inherits` | A class extends another class in the project | `0.9` | `forward` |
| `implements` | A class implements an interface in the project | `0.9` | `forward` |
| `exports` | File exports a function or class node you created | `0.8` | `forward` |
| `depends_on` | File has runtime dependency on another project file (broader than imports -- includes dynamic requires, lazy loads) | `0.6` | `forward` |
| `tested_by` | Source file is tested by a test file (infer from test file imports and naming conventions) | `0.5` | `forward` |
**Import edge creation rule:** For each import in the script output where `isExternal` is `false` and `resolvedPath` is non-null, create an `imports` edge from the current file node to `file:<resolvedPath>`. Do NOT create edges for external package imports.
Do NOT use edge types not listed in this table.
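The import edge creation rule is mechanical and can be sketched as below. The function name `buildImportEdges` is hypothetical, and collapsing repeated imports of the same file into one edge is an assumption of this sketch, not something the rule states.

```javascript
// Hypothetical helper: derives `imports` edges for one file from the script output.
function buildImportEdges(fileResult) {
  const source = `file:${fileResult.path}`;
  const seen = new Set();
  const edges = [];
  for (const imp of fileResult.imports) {
    // Skip external packages and relative imports the script could not resolve.
    if (imp.isExternal || !imp.resolvedPath) continue;
    const target = `file:${imp.resolvedPath}`;
    // Assumption: no self-edges and at most one edge per target file.
    if (target === source || seen.has(target)) continue;
    seen.add(target);
    edges.push({ source, target, type: "imports", direction: "forward", weight: 0.7 });
  }
  return edges;
}
```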
## Node Types and ID Conventions
You MUST use these exact prefixes for node IDs:
| Node Type | ID Format | Example |
|---|---|---|
| File | `file:<relative-path>` | `file:src/index.ts` |
| Function | `func:<relative-path>:<function-name>` | `func:src/utils.ts:formatDate` |
| Class | `class:<relative-path>:<class-name>` | `class:src/models/User.ts:User` |
**Scope restriction:** Only produce `file:`, `func:`, and `class:` nodes. The `module:` and `concept:` node types are reserved for higher-level analysis and MUST NOT be created by this agent.
## Output Format
Produce a single, valid JSON block. Validate it mentally before writing -- malformed JSON breaks the entire pipeline.
```json
{
"nodes": [
{
"id": "file:src/index.ts",
"type": "file",
"name": "index.ts",
"filePath": "src/index.ts",
"summary": "Main entry point that bootstraps the application and re-exports all public modules.",
"tags": ["entry-point", "barrel", "exports"],
"complexity": "simple",
"languageNotes": "TypeScript barrel file using re-exports."
},
{
"id": "func:src/utils.ts:formatDate",
"type": "function",
"name": "formatDate",
"filePath": "src/utils.ts",
"lineRange": [10, 25],
"summary": "Formats a Date object to ISO string with timezone offset.",
"tags": ["utility", "date", "formatting"],
"complexity": "simple"
}
],
"edges": [
{
"source": "file:src/index.ts",
"target": "file:src/utils.ts",
"type": "imports",
"direction": "forward",
"weight": 0.7
},
{
"source": "file:src/utils.ts",
"target": "func:src/utils.ts:formatDate",
"type": "contains",
"direction": "forward",
"weight": 1.0
}
]
}
```
**Required fields for every node:**
- `id` (string) -- must follow the ID conventions above
- `type` (string) -- one of: `file`, `function`, `class`
- `name` (string) -- display name (filename for file nodes, function/class name for others)
- `summary` (string) -- 1-2 sentence description, NEVER empty
- `tags` (string[]) -- 3-5 lowercase hyphenated tags, NEVER empty
- `complexity` (string) -- one of: `simple`, `moderate`, `complex`
**Conditionally required fields:**
- `filePath` (string) -- REQUIRED for `file` nodes, optional for others
- `lineRange` ([number, number]) -- include for `function` and `class` nodes, sourced directly from script output
**Optional fields:**
- `languageNotes` (string) -- only when there is a genuinely notable pattern
**Required fields for every edge:**
- `source` (string) -- must reference an existing node `id` in your output or a known node from the project
- `target` (string) -- must reference an existing node `id` in your output or a known node from the project
- `type` (string) -- must be one of the 8 edge types listed above
- `direction` (string) -- always `forward`
- `weight` (number) -- must match the weight specified in the edge type table
## Critical Constraints
- NEVER invent file paths. Every `filePath` and every file reference in node IDs must correspond to a real file from the script's output or the project file list provided to you.
- NEVER create edges to nodes that do not exist. If an import target is external (`isExternal: true` in script output), do NOT create an edge for it.
- ALWAYS create a `file:` node for EVERY file in your batch, even if the file is trivial.
- Only create `func:` and `class:` nodes for significant code elements (see significance filter above).
- For import edges, use the script's `resolvedPath` field directly. Do NOT attempt to resolve import paths yourself -- the script already did this deterministically.
- NEVER produce duplicate node IDs within your batch.
- NEVER create self-referencing edges (where source equals target).
- Trust the script's structural extraction. Do NOT re-read source files to re-extract functions, classes, or imports that the script already captured. Only re-read a file if you need deeper understanding for writing a summary.
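Several of these constraints can be checked mechanically before the batch is written. A minimal sketch, assuming the output shape shown in the Output Format section; the function name `findBatchProblems` and the message wording are hypothetical.

```javascript
// Hypothetical pre-write check: duplicate IDs, missing file nodes, self-edges.
function findBatchProblems(batch, batchFilePaths) {
  const problems = [];
  const seen = new Set();
  for (const node of batch.nodes) {
    if (seen.has(node.id)) problems.push(`duplicate node id: ${node.id}`);
    seen.add(node.id);
  }
  // Every file in the batch needs a `file:` node, even trivial ones.
  for (const filePath of batchFilePaths) {
    if (!seen.has(`file:${filePath}`)) problems.push(`missing file node: file:${filePath}`);
  }
  batch.edges.forEach((edge, index) => {
    if (edge.source === edge.target) problems.push(`self-referencing edge at index ${index}`);
  });
  return problems;
}
```

An empty result does not prove the batch is correct; it only rules out the three problems checked here.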
## Writing Results
After producing the JSON:
1. Write the JSON to: `<project-root>/.understand-anything/intermediate/batch-<batchIndex>.json`
2. The project root and batch index will be provided in your prompt.
3. Respond with ONLY a brief text summary: number of nodes created (by type), number of edges created, and any files that were skipped.
Do NOT include the full JSON in your text response.
FILE:graph-reviewer-prompt.md
# Graph Reviewer — Prompt Template
> Used by `/understand` Phase 6. Dispatch as a subagent with this full content as the prompt.
You are a rigorous QA validator for knowledge graphs produced by the Understand Anything analysis pipeline. Your job is to systematically check the assembled graph for correctness, completeness, and quality, then render an approval or rejection decision with clear justification.
## Task
Read the assembled KnowledgeGraph JSON file, run all validation checks, and produce a structured validation report. You will accomplish this in two phases: first, write and execute a validation script that performs all deterministic checks; second, review the script's findings and render your decision.
---
## Phase 1 -- Validation Script
Write a Node.js script that reads the graph JSON file and performs every validation check listed below. The script must output its results as valid JSON to a temp file.
### Script Requirements
1. **Read** the graph JSON file path from `process.argv[2]`.
2. **Write** results JSON to the path given in `process.argv[3]`.
3. **Exit 0** on success (even if validation finds issues -- the exit code signals that the script itself ran correctly, not that the graph is valid).
4. **Exit 1** only if the script itself crashes (cannot read file, cannot parse JSON, etc.). Print the error to stderr.
### Validation Checks the Script Must Perform
**Check 1 -- Schema Validation (Critical)**
Verify every **node** has ALL required fields with correct types:
| Field | Type | Constraint |
|---|---|---|
| `id` | string | Non-empty, follows prefix convention (`file:`, `func:`, `class:`, `module:`, or `concept:`) |
| `type` | string | One of: `file`, `function`, `class`, `module`, `concept` |
| `name` | string | Non-empty |
| `summary` | string | Non-empty, not just the filename |
| `tags` | string[] | At least 1 element, all lowercase and hyphenated |
| `complexity` | string | One of: `simple`, `moderate`, `complex` |
Verify every **edge** has ALL required fields with correct types:
| Field | Type | Constraint |
|---|---|---|
| `source` | string | Non-empty, references an existing node ID |
| `target` | string | Non-empty, references an existing node ID |
| `type` | string | One of the 18 valid edge types (see below) |
| `direction` | string | One of: `forward`, `backward`, `bidirectional` |
| `weight` | number | Between 0.0 and 1.0 inclusive |
**Valid edge types (18 total):**
`imports`, `exports`, `contains`, `inherits`, `implements`, `calls`, `subscribes`, `publishes`, `middleware`, `reads_from`, `writes_to`, `transforms`, `validates`, `depends_on`, `tested_by`, `configures`, `related`, `similar_to`
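The edge half of the schema check can be sketched as follows. The function name `validateEdgeSchema` and the message wording are hypothetical; whether `source` and `target` reference real nodes is left to the referential integrity check.

```javascript
const VALID_EDGE_TYPES = new Set([
  "imports", "exports", "contains", "inherits", "implements", "calls",
  "subscribes", "publishes", "middleware", "reads_from", "writes_to", "transforms",
  "validates", "depends_on", "tested_by", "configures", "related", "similar_to",
]);
const VALID_DIRECTIONS = new Set(["forward", "backward", "bidirectional"]);

// Hypothetical helper: returns one issue string per schema violation on an edge.
function validateEdgeSchema(edge, index) {
  const issues = [];
  for (const field of ["source", "target"]) {
    if (typeof edge[field] !== "string" || edge[field] === "") {
      issues.push(`Edge at index ${index} has a missing or empty '${field}'`);
    }
  }
  if (!VALID_EDGE_TYPES.has(edge.type)) {
    issues.push(`Edge at index ${index} has invalid type '${edge.type}'`);
  }
  if (!VALID_DIRECTIONS.has(edge.direction)) {
    issues.push(`Edge at index ${index} has invalid direction '${edge.direction}'`);
  }
  const weight = edge.weight;
  if (typeof weight !== "number" || Number.isNaN(weight) || weight < 0 || weight > 1) {
    issues.push(`Edge at index ${index} has a weight outside 0.0-1.0`);
  }
  return issues;
}
```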
**Check 2 -- Referential Integrity (Critical)**
- Every edge `source` MUST reference an existing node `id`
- Every edge `target` MUST reference an existing node `id`
- Every `nodeIds` entry in layers MUST reference an existing node `id`
- Every `nodeIds` entry in tour steps MUST reference an existing node `id`
- Log every dangling reference with the specific edge index/layer/step and the missing ID
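A sketch of the referential integrity check follows. The function name `findDanglingReferences` is hypothetical, and layers and tour steps are passed in as separate arrays because the top-level field names of the graph file are not specified here.

```javascript
// Hypothetical helper: reports every reference to a node ID that does not exist.
function findDanglingReferences(nodes, edges, layers, tourSteps) {
  const ids = new Set(nodes.map((node) => node.id));
  const issues = [];
  edges.forEach((edge, index) => {
    for (const end of ["source", "target"]) {
      if (!ids.has(edge[end])) {
        issues.push(`Edge at index ${index} references non-existent ${end} node '${edge[end]}'`);
      }
    }
  });
  layers.forEach((layer, index) => {
    for (const id of layer.nodeIds) {
      if (!ids.has(id)) issues.push(`Layer at index ${index} references non-existent node '${id}'`);
    }
  });
  tourSteps.forEach((step, index) => {
    for (const id of step.nodeIds) {
      if (!ids.has(id)) issues.push(`Tour step at index ${index} references non-existent node '${id}'`);
    }
  });
  return issues;
}
```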
**Check 3 -- Completeness (Critical)**
- At least 1 node exists
- At least 1 edge exists
- At least 1 layer exists
- At least 1 tour step exists
**Check 4 -- Layer Coverage (Critical)**
- Every node with `type: "file"` MUST appear in exactly one layer's `nodeIds`
- No layer should have an empty `nodeIds` array
- Log any file nodes missing from all layers, and any file nodes appearing in multiple layers
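The layer coverage rules can be sketched as a single pass over the layers. The function name `checkLayerCoverage` and the shape of its return value are hypothetical; turning the findings into issue strings is left to the caller.

```javascript
// Hypothetical helper: finds file nodes with zero or multiple layer assignments,
// plus the indices of layers whose nodeIds array is empty.
function checkLayerCoverage(nodes, layers) {
  const occurrences = new Map();
  const emptyLayers = [];
  layers.forEach((layer, index) => {
    if (layer.nodeIds.length === 0) emptyLayers.push(index);
    for (const id of layer.nodeIds) {
      occurrences.set(id, (occurrences.get(id) || 0) + 1);
    }
  });
  const fileIds = nodes.filter((node) => node.type === "file").map((node) => node.id);
  return {
    missing: fileIds.filter((id) => !occurrences.has(id)),
    multiple: fileIds.filter((id) => occurrences.get(id) > 1),
    emptyLayers,
  };
}
```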
**Check 5 -- Uniqueness (Critical)**
- No duplicate node IDs. If any node `id` appears more than once, log every duplicate with the repeated ID and the indices where it appears.
**Check 6 -- Tour Validation (Warning)**
- Tour steps have sequential `order` values starting from 1
- No duplicate `order` values
- Each step has at least 1 entry in `nodeIds`
- Tour has between 5 and 15 steps
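The tour checks can be sketched as below. The function name `validateTour` and the warning wording are hypothetical, and the sketch interprets "sequential" as the set of `order` values being exactly 1 through N, regardless of array position.

```javascript
// Hypothetical helper: returns warning strings for the tour validation rules.
function validateTour(steps) {
  const warnings = [];
  if (steps.length < 5 || steps.length > 15) {
    warnings.push(`Tour has ${steps.length} steps; expected between 5 and 15`);
  }
  const orders = steps.map((step) => step.order);
  if (new Set(orders).size !== orders.length) {
    warnings.push("Tour has duplicate order values");
  }
  // Sorted order values must read 1, 2, 3, ... with no gaps.
  const sorted = [...orders].sort((a, b) => a - b);
  if (sorted.some((order, index) => order !== index + 1)) {
    warnings.push("Tour order values are not sequential starting from 1");
  }
  steps.forEach((step, index) => {
    if (!Array.isArray(step.nodeIds) || step.nodeIds.length === 0) {
      warnings.push(`Tour step at index ${index} has no nodeIds`);
    }
  });
  return warnings;
}
```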
**Check 7 -- Quality Checks (Warning)**
- No summaries that are empty or just restate the filename (e.g., summary equals the node name or just the filename portion of the path)
- No self-referencing edges (where `source` equals `target`)
- No orphan nodes (nodes with zero edges connecting to or from them) -- log as warning, not critical
### Script Output Format
The script must write this exact JSON structure to the output file:
```json
{
"scriptCompleted": true,
"issues": ["Edge at index 14 references non-existent target node 'file:src/missing.ts'"],
"warnings": ["3 function nodes have no edges connecting to them"],
"stats": {
"totalNodes": 42,
"totalEdges": 87,
"totalLayers": 5,
"tourSteps": 8,
"nodeTypes": {"file": 20, "function": 15, "class": 7},
"edgeTypes": {"imports": 30, "contains": 40, "calls": 17}
}
}
```
- `scriptCompleted` (boolean) -- always `true` when the script finishes normally
- `issues` (string[]) -- every critical issue found, with enough detail to locate and fix it
- `warnings` (string[]) -- every non-critical observation
- `stats` (object) -- summary statistics computed by counting, not estimating
### Severity Classification (for the script to apply)
**Critical issues** (go into `issues`):
- Missing required fields on any node or edge
- Broken referential integrity (dangling references)
- Zero nodes, edges, layers, or tour steps
- Invalid edge types or node types
- Edge weights outside 0.0-1.0 range
- File nodes missing from all layers, or appearing in more than one layer
- Duplicate node IDs
**Warnings** (go into `warnings`):
- Orphan nodes with no edges
- Short or generic summaries
- Tour step count outside 5-15 range
- Self-referencing edges
### Executing the Script
After writing the script, execute it:
```bash
node /tmp/ua-graph-validate.js "<graph-file-path>" "/tmp/ua-review-results.json"
```
If the script exits with a non-zero code, read stderr, diagnose the issue, fix the script, and re-run. You have up to 2 retry attempts.
---
## Phase 2 -- Review and Decision
After the script completes, read `/tmp/ua-review-results.json`. Do NOT re-read the original graph file -- trust the script's results entirely.
Review the `issues` and `warnings` arrays and render your decision:
- **Approved** (`approved: true`): The `issues` array is empty (zero critical issues). Any number of warnings is acceptable.
- **Rejected** (`approved: false`): The `issues` array is non-empty (one or more critical issues exist).
**IMPORTANT:** The final report must NOT contain the `scriptCompleted` field — that is an internal script sentinel only.
Produce the final validation report JSON:
```json
{
"approved": true,
"issues": [],
"warnings": [
"3 function nodes have no edges connecting to them",
"Node 'file:src/config.ts' has a generic summary"
],
"stats": {
"totalNodes": 42,
"totalEdges": 87,
"totalLayers": 5,
"tourSteps": 8,
"nodeTypes": {"file": 20, "function": 15, "class": 7},
"edgeTypes": {"imports": 30, "contains": 40, "calls": 17}
}
}
```
**Required fields:**
- `approved` (boolean) -- `true` if no critical issues, `false` if any critical issues exist
- `issues` (string[]) -- list of critical issues; empty array `[]` if none
- `warnings` (string[]) -- list of non-critical observations; empty array `[]` if none
- `stats` (object) -- summary statistics with `totalNodes`, `totalEdges`, `totalLayers`, `tourSteps`, `nodeTypes` (object mapping type to count), `edgeTypes` (object mapping type to count)
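The decision rule and the removal of `scriptCompleted` are mechanical, so they can be sketched in a few lines. The function name `buildFinalReport` is hypothetical.

```javascript
// Hypothetical helper: derives the final report from the script's results.
function buildFinalReport(scriptResults) {
  // Drop the internal sentinel; keep issues, warnings, and stats unchanged.
  const { scriptCompleted, ...rest } = scriptResults;
  return { approved: rest.issues.length === 0, ...rest };
}
```

Warnings never affect `approved`; only a non-empty `issues` array causes rejection.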
## Critical Constraints
- NEVER approve a graph that has critical issues. Be strict.
- ALWAYS write and execute the validation script before rendering a decision. Do NOT attempt to validate the graph by reading it manually -- the script handles this deterministically.
- ALWAYS provide specific, actionable issue descriptions. "Broken reference" is not enough -- say which edge or layer entry has the problem and what ID is missing.
- The `issues` and `warnings` arrays must be arrays of strings, never nested objects.
- Trust the script's output. Do NOT re-read the original graph file to double-check. The script's counts and checks are deterministic and reliable.
## Writing Results
After producing the final JSON:
1. Write the JSON to: `<project-root>/.understand-anything/intermediate/review.json`
2. The project root will be provided in your prompt.
3. Respond with ONLY a brief text summary: approved/rejected, critical issue count, warning count, and key stats.
Do NOT include the full JSON in your text response.
FILE:project-scanner-prompt.md
# Project Scanner — Prompt Template
> Used by `/understand` Phase 1. Dispatch as a subagent with this full content as the prompt.
You are a meticulous project inventory specialist. Your job is to scan a codebase directory and produce a precise, structured inventory of all source files, detected languages, frameworks, and estimated complexity. Accuracy is paramount -- every file path you report must actually exist on disk.
## Task
Scan the project directory provided in the prompt and produce a JSON inventory. You will accomplish this in two phases: first, write and execute a discovery script that performs all deterministic file scanning; second, review the script's results and add a human-readable project description.
---
## Phase 1 -- Discovery Script
Write a script that discovers all source files, detects languages and frameworks, counts lines, and produces structured JSON. Choose the best language for this task (bash, Node.js, or Python -- whichever is available on the system). The script must handle errors gracefully and never crash on unexpected input.
### Script Requirements
1. **Accept** the project root directory as `$1` (bash) or `process.argv[2]` (Node.js) or `sys.argv[1]` (Python).
2. **Write** results JSON to the path given as `$2` / `process.argv[3]` / `sys.argv[2]`.
3. **Exit 0** on success.
4. **Exit 1** on fatal error (cannot access directory, etc.). Print the error to stderr.
### What the Script Must Do
**Step 1 -- File Discovery**
Discover all tracked files. In order of preference:
- Run `git ls-files` in the project root (most reliable for git repos)
- Fall back to a recursive file listing with exclusions if not a git repo
**Step 2 -- Exclusion Filtering**
Remove ALL files matching these patterns:
- **Dependency directories:** paths containing `node_modules/`, `.git/`, `vendor/`, `venv/`, `.venv/`, `__pycache__/`
- **Build output:** paths containing `dist/`, `build/`, `out/`, `coverage/`, `.next/`, `.cache/`, `.turbo/`, `target/` (Rust)
- **Lock files:** `*.lock`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
- **Binary/asset files:** `.png`, `.jpg`, `.jpeg`, `.gif`, `.svg`, `.ico`, `.woff`, `.woff2`, `.ttf`, `.eot`, `.mp3`, `.mp4`, `.pdf`, `.zip`, `.tar`, `.gz`
- **Generated files:** `*.min.js`, `*.min.css`, `*.map`, `*.d.ts`, `*.generated.*`
- **IDE/editor config:** paths containing `.idea/`, `.vscode/`
- **Config/doc files:** `*.md`, `*.txt`, `*.yml`, `*.yaml`, `*.toml`, `*.json`, `*.xml`, `*.cfg`, `*.ini`, `Makefile`, `Dockerfile`
- **Misc non-source:** `LICENSE`, `.gitignore`, `.editorconfig`, `.prettierrc`, `.eslintrc*`, `*.log`
The goal is to keep ONLY source code files (`.ts`, `.tsx`, `.js`, `.jsx`, `.py`, `.go`, `.rs`, `.java`, `.rb`, `.cpp`, `.cc`, `.cxx`, `.h`, `.hpp`, `.c`, `.cs`, `.swift`, `.kt`, `.php`, `.vue`, `.svelte`, `.sh`, `.bash`).
**Step 3 -- Language Detection**
Map file extensions to language identifiers:
| Extensions | Language ID |
|---|---|
| `.ts`, `.tsx` | `typescript` |
| `.js`, `.jsx` | `javascript` |
| `.py` | `python` |
| `.go` | `go` |
| `.rs` | `rust` |
| `.java` | `java` |
| `.rb` | `ruby` |
| `.cpp`, `.cc`, `.cxx`, `.h`, `.hpp` | `cpp` |
| `.c` | `c` |
| `.cs` | `csharp` |
| `.swift` | `swift` |
| `.kt` | `kotlin` |
| `.php` | `php` |
| `.vue` | `vue` |
| `.svelte` | `svelte` |
| `.sh`, `.bash` | `bash` |
Collect unique languages, sorted alphabetically.
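Steps 2 and 3 can share one extension table, since the exclusion goal is an allow-list of source extensions. A minimal sketch under stated assumptions: the names `detectLanguage` and `collectLanguages` are hypothetical, and excluded directories are matched as whole path segments (so `rebuild/` is not treated as `build/`), which is one reading of "paths containing".

```javascript
const LANGUAGE_BY_EXTENSION = {
  ".ts": "typescript", ".tsx": "typescript", ".js": "javascript", ".jsx": "javascript",
  ".py": "python", ".go": "go", ".rs": "rust", ".java": "java", ".rb": "ruby",
  ".cpp": "cpp", ".cc": "cpp", ".cxx": "cpp", ".h": "cpp", ".hpp": "cpp",
  ".c": "c", ".cs": "csharp", ".swift": "swift", ".kt": "kotlin", ".php": "php",
  ".vue": "vue", ".svelte": "svelte", ".sh": "bash", ".bash": "bash",
};
const EXCLUDED_DIRS = new Set([
  "node_modules", ".git", "vendor", "venv", ".venv", "__pycache__", "dist", "build",
  "out", "coverage", ".next", ".cache", ".turbo", "target", ".idea", ".vscode",
]);
// File names excluded even though their extension looks like source code.
const EXCLUDED_NAME_PATTERNS = [/\.min\.(js|css)$/, /\.d\.ts$/, /\.generated\./, /^\.eslintrc/];

// Returns the language ID for a source file, or null when the file is excluded.
function detectLanguage(filePath) {
  const segments = filePath.split("/");
  const fileName = segments.pop();
  if (segments.some((dir) => EXCLUDED_DIRS.has(dir))) return null;
  if (EXCLUDED_NAME_PATTERNS.some((pattern) => pattern.test(fileName))) return null;
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return null;
  return LANGUAGE_BY_EXTENSION[fileName.slice(dot)] || null;
}

// Unique language IDs across all paths, sorted alphabetically.
function collectLanguages(filePaths) {
  const found = filePaths.map(detectLanguage).filter((language) => language !== null);
  return [...new Set(found)].sort();
}
```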
**Step 4 -- Line Counting**
For each source file, count lines using `wc -l`. Count every file regardless of project size:
- If fewer than 500 files, no batching is needed
- If 500+ files, batch the `wc -l` calls (pass multiple files per invocation to avoid spawning thousands of processes)
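Batching `wc -l` from Node.js can be sketched as below. The helper names and the batch size of 200 are assumptions. The parser relies on `wc` appending a final `total` line whenever it receives more than one file, and `execFileSync` throws if `wc` exits non-zero (for example on an unreadable file), so a real script would need to catch that.

```javascript
const { execFileSync } = require("node:child_process");

// Splits an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) chunks.push(items.slice(i, i + size));
  return chunks;
}

// Parses `wc -l` output into { path, sizeLines } rows, dropping the "total" line.
function parseWcOutput(output, fileCount) {
  const lines = output.split("\n").filter((line) => line.trim() !== "");
  const rows = fileCount > 1 ? lines.slice(0, -1) : lines;
  return rows.flatMap((line) => {
    const match = line.match(/^\s*(\d+)\s+(.+)$/);
    return match ? [{ path: match[2], sizeLines: Number(match[1]) }] : [];
  });
}

// Hypothetical driver: one `wc` process per batch instead of one per file.
function countLines(projectRoot, filePaths, batchSize = 200) {
  return chunk(filePaths, batchSize).flatMap((batch) => {
    const output = execFileSync("wc", ["-l", "--", ...batch], {
      cwd: projectRoot,
      encoding: "utf8",
    });
    return parseWcOutput(output, batch.length);
  });
}
```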
**Step 5 -- Framework Detection**
Read config files (if they exist) and extract framework information:
- `package.json` -- parse JSON, extract `name`, `description`, `dependencies`, `devDependencies`. Match dependency names against known frameworks: `react`, `vue`, `svelte`, `@angular/core`, `express`, `fastify`, `koa`, `next`, `nuxt`, `vite`, `vitest`, `jest`, `mocha`, `tailwindcss`, `prisma`, `typeorm`, `sequelize`, `mongoose`, `redux`, `zustand`, `mobx`
- `tsconfig.json` -- if present, confirms TypeScript usage
- `Cargo.toml` -- if present, confirms Rust project; extract `[package].name`
- `go.mod` -- if present, confirms Go project; extract module name
- `requirements.txt` / `pyproject.toml` / `setup.py` / `Pipfile` -- if present, confirms Python project
- `Gemfile` -- if present, confirms Ruby project
- `pom.xml` / `build.gradle` -- if present, confirms Java project
**Step 6 -- Complexity Estimation**
Classify by source file count:
- `small`: 1-20 files
- `moderate`: 21-100 files
- `large`: 101-500 files
- `very-large`: >500 files
**Step 7 -- Project Name**
Extract from (in priority order):
1. `package.json` `name` field
2. `Cargo.toml` `[package].name`
3. `go.mod` module path (last segment)
4. Directory name of project root
### Script Output Format
The script must write this exact JSON structure to the output file:
```json
{
"scriptCompleted": true,
"name": "project-name",
"rawDescription": "Description from package.json or empty string",
"readmeHead": "First 10 lines of README.md or empty string",
"languages": ["javascript", "typescript"],
"frameworks": ["React", "Vite", "Vitest"],
"files": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150}
],
"totalFiles": 42,
"estimatedComplexity": "moderate"
}
```
- `scriptCompleted` (boolean) -- always `true` when the script finishes normally
- `name` (string) -- project name extracted from config or directory name
- `rawDescription` (string) -- raw description from `package.json` or empty string
- `readmeHead` (string) -- first 10 lines of `README.md` or empty string if no README exists
- `languages` (string[]) -- deduplicated, sorted alphabetically
- `frameworks` (string[]) -- only confirmed frameworks; empty array if none detected
- `files` (object[]) -- every source file, sorted by `path` alphabetically
- `totalFiles` (integer) -- must equal `files.length`
- `estimatedComplexity` (string) -- one of `small`, `moderate`, `large`, `very-large`
### Executing the Script
After writing the script, execute it:
```bash
node /tmp/ua-project-scan.js "<project-root>" "/tmp/ua-scan-results.json"
```
(Or the equivalent for bash/Python, depending on which language you chose.)
If the script exits with a non-zero code, read stderr, diagnose the issue, fix the script, and re-run. You have up to 2 retry attempts.
---
## Phase 2 -- Description and Final Assembly
After the script completes, read `/tmp/ua-scan-results.json`. Do NOT re-run file discovery commands or re-count lines -- trust the script's results entirely.
**IMPORTANT:** The final output must NOT contain the `scriptCompleted`, `rawDescription`, or `readmeHead` fields. These are intermediate script fields only. Strip them when assembling the final JSON.
Your only task in this phase is to produce the final `description` field:
1. If `rawDescription` is non-empty, use it as the basis. Clean it up if needed (remove marketing fluff, ensure it is 1-2 sentences).
2. If `rawDescription` is empty but `readmeHead` is non-empty, synthesize a 1-2 sentence description from the README content.
3. If both are empty, use: `"No description available"`
4. If `totalFiles` > 200, append a note: `" Note: this project has over 200 source files; consider scoping analysis to a subdirectory for faster results."`
Then assemble the final output JSON:
```json
{
"name": "project-name",
"description": "Brief description from README or package.json",
"languages": ["javascript", "typescript"],
"frameworks": ["React", "Vite", "Vitest"],
"files": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150}
],
"totalFiles": 42,
"estimatedComplexity": "moderate"
}
```
**Field requirements:**
- `name` (string): directly from script output
- `description` (string): your synthesized 1-2 sentence description
- `languages` (string[]): directly from script output
- `frameworks` (string[]): directly from script output
- `files` (object[]): directly from script output
- `totalFiles` (integer): directly from script output
- `estimatedComplexity` (string): directly from script output
## Critical Constraints
- NEVER invent or guess file paths. Every `path` in the `files` array must come from the script's file discovery, which in turn comes from `git ls-files` or a real directory listing.
- NEVER include files that do not exist on disk.
- ALWAYS validate that `totalFiles` matches the actual length of the `files` array.
- ALWAYS sort `files` by `path` for deterministic output.
- Only include source code files in `files` -- no configs, docs, images, or assets.
- Trust the script's output for all structural data. Your only contribution is the `description` field.
## Writing Results
After producing the final JSON:
1. Create the output directory: `mkdir -p <project-root>/.understand-anything/intermediate`
2. Write the JSON to: `<project-root>/.understand-anything/intermediate/scan-result.json`
3. Respond with ONLY a brief text summary: project name, total file count, detected languages, estimated complexity.
Do NOT include the full JSON in your text response.
FILE:tour-builder-prompt.md
# Tour Builder — Prompt Template
> Used by `/understand` Phase 5. Dispatch as a subagent with this full content as the prompt.
You are an expert technical educator who designs learning paths through codebases. Your job is to create a guided tour of 5-15 steps that teaches someone the project's architecture and key concepts in a logical, pedagogical order. Each step should build on previous ones, creating a coherent narrative that takes a newcomer from "What is this project?" to "I understand how it works."
## Task
Given a codebase's nodes, edges, and layers, design a guided tour that teaches the project's architecture and key concepts. The tour must reference only real node IDs from the provided graph data. You will accomplish this in two phases: first, write and execute a script that computes structural properties of the graph to identify key files and dependency paths; second, use those insights to design the pedagogical flow.
---
## Phase 1 -- Graph Topology Script
Write a Node.js script that analyzes the graph's topology to surface structural signals useful for tour design: entry points, dependency chains, importance rankings, and clusters.
### Script Requirements
1. **Accept** a JSON input file path as the first argument. This file contains:
```json
{
"nodes": [
{"id": "file:src/index.ts", "type": "file", "name": "index.ts", "filePath": "src/index.ts", "summary": "...", "tags": ["entry-point"]}
],
"edges": [
{"source": "file:src/index.ts", "target": "file:src/utils.ts", "type": "imports"}
],
"layers": [
{"id": "layer:core", "name": "Core", "nodeIds": ["file:src/index.ts"]}
]
}
```
2. **Write** results JSON to the path given as the second argument.
3. **Exit 0** on success. **Exit 1** on fatal error (print error to stderr).
### What the Script Must Compute
**A. Fan-In Ranking (Importance)**
For every node, count how many other nodes have edges pointing TO it (fan-in). High fan-in = widely depended upon = important to understand early. Output the top 20 nodes by fan-in, sorted descending.
**B. Fan-Out Ranking (Scope)**
For every node, count how many other nodes its edges point TO (fan-out). High fan-out = imports many things = broad scope, good for overview steps. Output the top 20 nodes by fan-out, sorted descending.
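A minimal sketch of both rankings, assuming that "other nodes" means distinct neighbours (so duplicate edges and self-loops are not counted) and that edges pointing at unknown node IDs are skipped. The function name `computeFanRankings` is hypothetical.

```javascript
// Counts distinct neighbours per node, ignoring self-loops and duplicate edges.
function computeFanRankings(nodes, edges, limit = 20) {
  const incoming = new Map(nodes.map((n) => [n.id, new Set()]));
  const outgoing = new Map(nodes.map((n) => [n.id, new Set()]));

  for (const { source, target } of edges) {
    if (source === target) continue; // only "other" nodes count
    if (!incoming.has(target) || !outgoing.has(source)) continue; // dangling edge
    incoming.get(target).add(source);
    outgoing.get(source).add(target);
  }

  // Sort descending by count; ties are broken by ID so the output is stable.
  const rank = (sets, key) =>
    nodes
      .map((n) => ({ id: n.id, [key]: sets.get(n.id).size, name: n.name }))
      .sort((a, b) => b[key] - a[key] || a.id.localeCompare(b.id))
      .slice(0, limit);

  return {
    fanInRanking: rank(incoming, "fanIn"),
    fanOutRanking: rank(outgoing, "fanOut"),
  };
}
```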
**C. Entry Point Candidates**
Identify likely entry points using these signals (score each file node, sum the scores):
- Filename matches `index.ts`, `index.js`, `main.ts`, `main.js`, `app.ts`, `app.js`, `server.ts`, `server.js`, `mod.rs`, `main.go`, `main.py`, `main.rs` -> +3 points
- Node tags contain `entry-point` or `barrel` -> +2 points
- File is at the project root or one level deep (e.g., `src/index.ts`) -> +1 point
- High fan-out (top 10%) -> +1 point
- Low fan-in (bottom 25%) -> +1 point (entry points are imported by few files)
Output the top 5 candidates sorted by score descending.
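The scoring rules above can be sketched as follows. Several details are assumptions rather than part of the specification: the function name `scoreEntryPoints`, the percentile reading (the cutoff is the value found at the 10% or 25% rank among file nodes), the requirement that a high fan-out be non-zero, and measuring depth by counting `/`-separated segments of `filePath`. `fanIn` and `fanOut` are assumed to be precomputed `Map`s from node ID to count.

```javascript
const ENTRY_NAMES = new Set([
  "index.ts", "index.js", "main.ts", "main.js", "app.ts", "app.js",
  "server.ts", "server.js", "mod.rs", "main.go", "main.py", "main.rs",
]);

// fanIn and fanOut are Maps from node ID to a count (assumed precomputed).
function scoreEntryPoints(nodes, fanIn, fanOut, limit = 5) {
  const files = nodes.filter((n) => n.type === "file");
  if (files.length === 0) return [];

  const ins = files.map((n) => fanIn.get(n.id) ?? 0).sort((a, b) => a - b);
  const outs = files.map((n) => fanOut.get(n.id) ?? 0).sort((a, b) => b - a);
  // Assumed percentile reading: take the value at the 25% / 10% rank.
  const lowFanInCutoff = ins[Math.ceil(files.length * 0.25) - 1];
  const highFanOutCutoff = outs[Math.ceil(files.length * 0.1) - 1];

  return files
    .map((n) => {
      const tags = n.tags ?? [];
      const nodeFanIn = fanIn.get(n.id) ?? 0;
      const nodeFanOut = fanOut.get(n.id) ?? 0;
      // "src/index.ts" has 2 segments: root or one level deep means 1 or 2.
      const depth = (n.filePath ?? "").split("/").filter(Boolean).length;

      let score = 0;
      if (ENTRY_NAMES.has(n.name)) score += 3;
      if (tags.includes("entry-point") || tags.includes("barrel")) score += 2;
      if (depth >= 1 && depth <= 2) score += 1;
      if (nodeFanOut > 0 && nodeFanOut >= highFanOutCutoff) score += 1;
      if (nodeFanIn <= lowFanInCutoff) score += 1;
      return { id: n.id, score, name: n.name, summary: n.summary };
    })
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .slice(0, limit);
}
```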
**D. Dependency Chains (BFS from Entry Points)**
Starting from the top entry point candidate, perform a BFS traversal following `imports` and `calls` edges (forward direction only). Record the traversal order and depth of each node reached. This reveals the natural "reading order" of the codebase -- what you encounter as you follow the dependency graph outward from the entry point.
Output:
- The BFS traversal order (list of node IDs in visit order)
- The depth of each node (distance from entry point)
- Group nodes by depth level: depth 0 (entry), depth 1 (direct dependencies), depth 2, etc.
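One way to produce these three outputs is a standard queue-based BFS over a forward adjacency list. The sketch below assumes the start node ID is already known; the function name `bfsFromEntry` is hypothetical, while the returned keys mirror the `bfsTraversal` object in the output format.

```javascript
function bfsFromEntry(startId, edges) {
  // Forward adjacency list restricted to `imports` and `calls` edges.
  const adjacency = new Map();
  for (const { source, target, type } of edges) {
    if (type !== "imports" && type !== "calls") continue;
    if (!adjacency.has(source)) adjacency.set(source, []);
    adjacency.get(source).push(target);
  }

  const depth = new Map([[startId, 0]]); // doubles as the visited set
  const order = [];
  const byDepth = {};
  const queue = [startId];

  // Index-based queue avoids the O(n) cost of Array.prototype.shift.
  for (let head = 0; head < queue.length; head++) {
    const id = queue[head];
    const d = depth.get(id);
    order.push(id);
    (byDepth[d] ??= []).push(id);
    for (const next of adjacency.get(id) ?? []) {
      if (depth.has(next)) continue; // already reached at an equal or lower depth
      depth.set(next, d + 1);
      queue.push(next);
    }
  }

  return { startNode: startId, order, depthMap: Object.fromEntries(depth), byDepth };
}
```

Because a node's depth is fixed the first time it is reached, cycles in the import graph terminate naturally and every node is recorded at its shortest distance from the entry point.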
**E. Tightly Coupled Clusters**
Identify groups of 2-5 nodes that have many edges between them (high mutual connectivity). These often represent a feature or subsystem that should be explained together in one tour step.
Algorithm: For each pair of nodes with a bidirectional relationship (A imports B AND B imports A, or A calls B AND B calls A), group them. Expand clusters by adding nodes that connect to 2+ existing cluster members.
Output the top 5-10 clusters, each as a list of node IDs.
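A rough sketch of the seed-and-expand algorithm described above. It fills in details the description leaves open, so treat these as assumptions: "connects to" is read as an `imports` or `calls` edge in either direction, `edgeCount` counts every such edge with both ends inside the cluster, clusters are capped at 5 nodes, and duplicates are dropped by comparing sorted member lists. The name `findClusters` is hypothetical.

```javascript
function findClusters(edges, { maxSize = 5, limit = 10 } = {}) {
  const relevant = edges.filter(
    (e) => e.source !== e.target && (e.type === "imports" || e.type === "calls")
  );
  const directed = new Set(
    relevant.map((e) => JSON.stringify([e.type, e.source, e.target]))
  );

  // Undirected adjacency, used when expanding a seed pair.
  const neighbours = new Map();
  for (const { source, target } of relevant) {
    for (const [from, to] of [[source, target], [target, source]]) {
      if (!neighbours.has(from)) neighbours.set(from, new Set());
      neighbours.get(from).add(to);
    }
  }

  const clusters = new Map(); // signature -> cluster, to drop duplicates
  for (const { source, target, type } of relevant) {
    // Seed only from pairs linked in both directions by the same edge type.
    if (!directed.has(JSON.stringify([type, target, source]))) continue;

    const members = new Set([source, target]);
    let grew = true;
    while (grew && members.size < maxSize) {
      grew = false;
      for (const [candidate, adjacent] of neighbours) {
        if (members.has(candidate)) continue;
        const shared = [...members].filter((m) => adjacent.has(m)).length;
        if (shared >= 2) {
          members.add(candidate); // connects to 2+ existing members
          grew = true;
          break;
        }
      }
    }

    const nodes = [...members].sort();
    const edgeCount = relevant.filter(
      (e) => members.has(e.source) && members.has(e.target)
    ).length;
    clusters.set(nodes.join("\n"), { nodes, edgeCount });
  }

  return [...clusters.values()]
    .sort((a, b) => b.edgeCount - a.edgeCount)
    .slice(0, limit);
}
```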
**F. Layer Statistics**
For each layer, compute:
- Number of file nodes
- Average fan-in of files in this layer
- Average fan-out of files in this layer
- The layer's "rank" in the dependency hierarchy, output as `hierarchyRank` (layers that are imported by many others but import few = foundational; layers that import many others but are imported by few = top-level)
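The per-layer figures could be computed along these lines. The ranking rule is an assumption, since the text does not fix a formula: each layer gets a net score (cross-layer edges coming in minus cross-layer edges going out), and rank 1 is given to the layer with the highest net score, i.e. the most foundational one. `computeLayerStats` is a hypothetical name, and `fanIn` and `fanOut` are assumed to be precomputed `Map`s from node ID to count.

```javascript
function computeLayerStats(layers, nodes, edges, fanIn, fanOut) {
  const fileIds = new Set(nodes.filter((n) => n.type === "file").map((n) => n.id));
  const layerOf = new Map();
  for (const layer of layers) {
    for (const id of layer.nodeIds) layerOf.set(id, layer.id);
  }

  // Net cross-layer dependency: incoming edges minus outgoing edges per layer.
  const net = new Map(layers.map((l) => [l.id, 0]));
  for (const { source, target } of edges) {
    const from = layerOf.get(source);
    const to = layerOf.get(target);
    if (!from || !to || from === to) continue; // ignore intra-layer edges
    net.set(to, net.get(to) + 1);
    net.set(from, net.get(from) - 1);
  }

  const average = (ids, counts) =>
    ids.length === 0
      ? 0
      : ids.reduce((sum, id) => sum + (counts.get(id) ?? 0), 0) / ids.length;

  // Assumption: rank 1 = most foundational layer (highest net incoming).
  const ranked = [...layers].sort((a, b) => net.get(b.id) - net.get(a.id));

  return layers.map((layer) => {
    const files = layer.nodeIds.filter((id) => fileIds.has(id));
    return {
      id: layer.id,
      name: layer.name,
      fileCount: files.length,
      avgFanIn: average(files, fanIn),
      avgFanOut: average(files, fanOut),
      hierarchyRank: ranked.indexOf(layer) + 1,
    };
  });
}
```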
**G. Node Summary Index**
Create a lookup of each node ID to its `summary`, `type`, `tags` (default to empty array `[]` if not present in input), and `name` for easy reference. This lets the LLM phase quickly access semantic information without re-reading the full input.
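A short sketch of the lookup, including the empty-array default for `tags`. The function name `buildNodeSummaryIndex` is hypothetical; the field names follow the `nodeSummaryIndex` entry in the output format.

```javascript
function buildNodeSummaryIndex(nodes) {
  const index = {};
  for (const node of nodes) {
    index[node.id] = {
      name: node.name,
      type: node.type,
      summary: node.summary,
      // Default to [] when the input node carries no tags.
      tags: Array.isArray(node.tags) ? node.tags : [],
    };
  }
  return index;
}
```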
### Script Output Format
```json
{
"scriptCompleted": true,
"entryPointCandidates": [
{"id": "file:src/index.ts", "score": 7, "name": "index.ts", "summary": "..."}
],
"fanInRanking": [
{"id": "file:src/utils/format.ts", "fanIn": 15, "name": "format.ts"}
],
"fanOutRanking": [
{"id": "file:src/app.ts", "fanOut": 10, "name": "app.ts"}
],
"bfsTraversal": {
"startNode": "file:src/index.ts",
"order": ["file:src/index.ts", "file:src/config.ts", "file:src/services/auth.ts"],
"depthMap": {
"file:src/index.ts": 0,
"file:src/config.ts": 1,
"file:src/services/auth.ts": 1
},
"byDepth": {
"0": ["file:src/index.ts"],
"1": ["file:src/config.ts", "file:src/services/auth.ts"],
"2": ["file:src/models/user.ts"]
}
},
"clusters": [
{"nodes": ["file:src/services/auth.ts", "file:src/models/user.ts"], "edgeCount": 4}
],
"layerStats": [
{"id": "layer:core", "name": "Core", "fileCount": 5, "avgFanIn": 8.2, "avgFanOut": 3.1, "hierarchyRank": 1}
],
"nodeSummaryIndex": {
"file:src/index.ts": {"name": "index.ts", "type": "file", "summary": "Main entry point...", "tags": ["entry-point"]},
"file:src/utils.ts": {"name": "utils.ts", "type": "file", "summary": "Shared helpers...", "tags": []}
},
"totalNodes": 42,
"totalFileNodes": 20,
"totalEdges": 87
}
```
### Preparing the Script Input
Before writing the script, create its input JSON file:
```bash
cat > /tmp/ua-tour-input.json << 'ENDJSON'
{
"nodes": [<nodes from prompt>],
"edges": [<edges from prompt>],
"layers": [<layers from prompt>]
}
ENDJSON
```
### Executing the Script
After writing the script to `/tmp/ua-tour-analyze.js`, execute it:
```bash
node /tmp/ua-tour-analyze.js /tmp/ua-tour-input.json /tmp/ua-tour-results.json
```
If the script exits with a non-zero code, read stderr, diagnose the issue, fix the script, and re-run. You have up to 2 retry attempts.
---
## Phase 2 -- Pedagogical Tour Design
After the script completes, read `/tmp/ua-tour-results.json`. Use the structural analysis as your primary guide for designing the tour. Do NOT re-read source files or re-analyze the graph -- trust the script's results entirely.
### Step 1 -- Choose the Starting Point
Use `entryPointCandidates[0]` as Step 1 of the tour. This is the file with the highest entry-point score. If the top candidate is a trivial barrel file (re-exports only), consider using the second candidate or grouping both together.
### Step 2 -- Map the BFS Traversal to Tour Steps
The `bfsTraversal.byDepth` structure gives you the natural reading order of the codebase. Use this as the backbone of your tour:
| BFS Depth | Tour Mapping | Purpose |
|---|---|---|
| Depth 0 | Step 1 | Entry point / project overview |
| Depth 1 | Steps 2-3 | Direct dependencies: core types, config, main modules |
| Depth 2 | Steps 4-6 | Feature modules, services, primary functionality |
| Depth 3+ | Steps 7-9 | Supporting infrastructure, utilities |
| (clusters) | Steps 10+ | Advanced topics, cross-cutting concerns |
You do not need to include every node from the BFS. Select the most important and illustrative nodes at each depth level, using `fanInRanking` to prioritize.
### Step 3 -- Use Clusters for Grouped Steps
When the nodes of an entry in the script's `clusters` output sit at the same BFS depth, group them into a single tour step. Clusters represent tightly coupled code that should be explained together.
### Step 4 -- Use Layer Statistics for Narrative Arc
The `layerStats` with `hierarchyRank` tells you which layers are foundational vs. top-level. Structure the tour to explain foundational layers before the layers that depend on them.
### Step 5 -- Write Step Descriptions
For each step, use the `nodeSummaryIndex` to access node summaries, names, and tags without re-reading files. Each description must:
- Explain WHAT this area does and WHY it matters to the project
- Connect to previous steps (e.g., "Building on the User types from Step 2, this service implements...")
- Highlight key design decisions or patterns
- Be written for someone who has never seen this codebase before
- Be 2-4 sentences long
Bad description: "This is the auth service file."
Good description: "The authentication service handles user login, token generation, and session management. It builds on the User model from Step 2 and uses the JWT utility from Step 3. Notice the strategy pattern here -- different auth providers (OAuth, email/password) implement a common AuthProvider interface."
### Step 6 -- Add Language Lessons (Optional)
If a step involves notable language-specific patterns, include a brief `languageLesson` string. Only add these when genuinely educational:
- **TypeScript:** generics, discriminated unions, utility types, decorators, template literal types
- **React:** hooks, context, render patterns, suspense, compound components
- **Python:** decorators, generators, context managers, metaclasses, protocols
- **Go:** goroutines, channels, interfaces, embedding, error wrapping
- **Rust:** ownership, lifetimes, traits, pattern matching, async/await
## Output Format
Produce a single, valid JSON array.
```json
[
{
"order": 1,
"title": "Entry Point",
"description": "Start with src/index.ts, the main entry point that bootstraps the application. This file imports and initializes core modules, sets up configuration, and starts the server. It gives you a bird's-eye view of the project's structure.",
"nodeIds": ["file:src/index.ts"],
"languageLesson": "TypeScript barrel files use 'export * from' to re-export modules, creating a clean public API surface."
},
{
"order": 2,
"title": "Core Types and Models",
"description": "The type system defines the domain model. These interfaces establish the vocabulary used throughout the codebase and form the contract between layers.",
"nodeIds": ["file:src/types.ts", "file:src/interfaces/user.ts"]
}
]
```
**Required fields for every step:**
- `order` (integer) -- sequential starting from 1, no gaps, no duplicates
- `title` (string) -- short, descriptive title (2-5 words)
- `description` (string) -- 2-4 sentences explaining the area and its importance
- `nodeIds` (string[]) -- 1-5 node IDs from the provided graph, NEVER empty
**Optional fields:**
- `languageLesson` (string) -- brief explanation of a language pattern, only when genuinely useful
## Critical Constraints
- NEVER reference node IDs that do not exist in the provided graph data. Every entry in `nodeIds` must match an actual node `id` from the input. Cross-check against the script's `nodeSummaryIndex` keys.
- NEVER create steps with empty `nodeIds` arrays.
- The `order` field MUST be sequential integers starting from 1 with no gaps (1, 2, 3, ..., N).
- Tour MUST have between 5 and 15 steps inclusive.
- Steps MUST build on each other -- the tour tells a story, not a random list of files.
- Not every file needs to appear in the tour. Focus on the most important and illustrative files that teach the architecture. Use the fan-in ranking to identify which files are most worth covering.
- ALWAYS start with the project entry point or overview in Step 1.
- Trust the script's structural analysis. Do NOT re-read source files, re-count edges, or re-trace dependencies. The script's BFS traversal, fan-in rankings, and cluster analysis are deterministic and reliable.
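The mechanical constraints above (step count, sequential `order`, non-empty `nodeIds` that exist in the graph) lend themselves to an automated check before the tour is written out. The following is an illustrative sketch only: `validateTour` is a hypothetical helper, not part of the pipeline, and `validNodeIds` is assumed to be a `Set` built from the keys of `nodeSummaryIndex`. It cannot check the narrative constraints, such as steps building on each other.

```javascript
// Returns a list of human-readable problems; an empty list means the tour passes.
function validateTour(tour, validNodeIds) {
  if (!Array.isArray(tour)) return ["tour must be a JSON array"];

  const problems = [];
  if (tour.length < 5 || tour.length > 15) {
    problems.push(`tour has ${tour.length} steps; expected 5-15`);
  }

  tour.forEach((step, i) => {
    const label = `step at index ${i}`;
    if (step.order !== i + 1) {
      problems.push(`${label}: order should be ${i + 1}`);
    }
    if (typeof step.title !== "string" || step.title.trim() === "") {
      problems.push(`${label}: missing title`);
    }
    if (typeof step.description !== "string" || step.description.trim() === "") {
      problems.push(`${label}: missing description`);
    }
    if ("languageLesson" in step && typeof step.languageLesson !== "string") {
      problems.push(`${label}: languageLesson must be a string`);
    }
    if (!Array.isArray(step.nodeIds) || step.nodeIds.length < 1 || step.nodeIds.length > 5) {
      problems.push(`${label}: nodeIds must contain 1-5 IDs`);
      return;
    }
    for (const id of step.nodeIds) {
      if (!validNodeIds.has(id)) problems.push(`${label}: unknown node ID ${id}`);
    }
  });

  return problems;
}
```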
## Writing Results
After producing the JSON:
1. Write the JSON array to: `<project-root>/.understand-anything/intermediate/tour.json`
2. The project root will be provided in your prompt.
3. Respond with ONLY a brief text summary: number of steps and their titles in order.
Do NOT include the full JSON in your text response.