@clawhub-alirezarezvani-9164a8924b
---
name: "research-summarizer"
description: "Structured research summarization agent skill for non-dev users. Handles academic papers, web articles, reports, and documentation. Extracts key findings, generates comparative analyses, and produces properly formatted citations. Use when: user wants to summarize a research paper, compare multiple sources, extract citations from documents, or create structured research briefs. Plugin for Claude Code, Codex, Gemini CLI, and OpenClaw."
license: MIT
metadata:
  version: 1.0.0
  author: Alireza Rezvani
  category: product
  updated: 2026-03-16
---
# Research Summarizer
> Read less. Understand more. Cite correctly.
Structured research summarization workflow that turns dense source material into actionable briefs. Built for product managers, analysts, founders, and anyone who reads more than they should have to.
Not a generic "summarize this" — a repeatable framework that extracts what matters, compares across sources, and formats citations properly.
---
## Slash Commands
| Command | What it does |
|---------|-------------|
| `/research:summarize` | Summarize a single source into a structured brief |
| `/research:compare` | Compare 2-5 sources side-by-side with synthesis |
| `/research:cite` | Extract and format all citations from a document |
---
## When This Skill Activates
Recognize these patterns from the user:
- "Summarize this paper / article / report"
- "What are the key findings in this document?"
- "Compare these sources"
- "Extract citations from this PDF"
- "Give me a research brief on [topic]"
- "Break down this whitepaper"
- Any request involving: summarize, research brief, literature review, citation, source comparison
If the user has a document and wants structured understanding → this skill applies.
---
## Workflow
### `/research:summarize` — Single Source Summary
1. **Identify source type**
   - Academic paper → use IMRAD structure (Introduction, Methods, Results, and Discussion; the template used here adds a separate Analysis step)
   - Web article → use claim-evidence-implication structure
   - Technical report → use executive summary structure
   - Documentation → use reference summary structure
2. **Extract structured brief**
```
Title: [exact title]
Author(s): [names]
Date: [publication date]
Source Type: [paper | article | report | documentation]
## Key Thesis
[1-2 sentences: the central argument or finding]
## Key Findings
1. [Finding with supporting evidence]
2. [Finding with supporting evidence]
3. [Finding with supporting evidence]
## Methodology
[How they arrived at these findings — data sources, sample size, approach]
## Limitations
- [What the source doesn't cover or gets wrong]
## Actionable Takeaways
- [What to do with this information]
## Notable Quotes
> "[Direct quote]" (p. X)
```
3. **Assess quality**
   - Source credibility (peer-reviewed, reputable outlet, primary vs secondary)
   - Evidence strength (data-backed, anecdotal, theoretical)
   - Recency (when published, still relevant?)
   - Bias indicators (funding source, author affiliation, methodology gaps)
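The brief structure from step 2 can be sketched as a small data model. This is a minimal illustration, not part of the shipped scripts: the `ResearchBrief` class and its `to_markdown` method are hypothetical names, and the Notable Quotes section is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchBrief:
    """Hypothetical container mirroring the fields of the brief template."""
    title: str
    authors: list[str]
    date: str
    source_type: str  # "paper" | "article" | "report" | "documentation"
    key_thesis: str
    key_findings: list[str] = field(default_factory=list)
    methodology: str = ""
    limitations: list[str] = field(default_factory=list)
    takeaways: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the brief in the same section order as the template."""
        lines = [
            f"Title: {self.title}",
            f"Author(s): {', '.join(self.authors)}",
            f"Date: {self.date}",
            f"Source Type: {self.source_type}",
            "## Key Thesis",
            self.key_thesis,
            "## Key Findings",
        ]
        lines += [f"{i}. {finding}" for i, finding in enumerate(self.key_findings, 1)]
        lines += ["## Methodology", self.methodology, "## Limitations"]
        lines += [f"- {item}" for item in self.limitations]
        lines += ["## Actionable Takeaways"]
        lines += [f"- {item}" for item in self.takeaways]
        return "\n".join(lines)
```

Keeping the fields explicit makes it easy to spot a brief that is missing a methodology or limitations entry before it is shared.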
### `/research:compare` — Multi-Source Comparison
1. **Collect sources** (2-5 documents)
2. **Summarize each** using the single-source workflow above
3. **Build comparison matrix**
```
| Dimension | Source A | Source B | Source C |
|------------------|-----------------|-----------------|-----------------|
| Central Thesis | ... | ... | ... |
| Methodology | ... | ... | ... |
| Key Finding | ... | ... | ... |
| Sample/Scope | ... | ... | ... |
| Credibility | High/Med/Low | High/Med/Low | High/Med/Low |
```
4. **Synthesize**
   - Where do sources agree? (convergent findings = stronger signal)
   - Where do they disagree? (divergent findings = needs investigation)
   - What gaps exist across all sources?
   - What's the weight of evidence for each position?
5. **Produce synthesis brief**
```
## Consensus Findings
[What most sources agree on]
## Contested Points
[Where sources disagree, with strongest evidence for each side]
## Gaps
[What none of the sources address]
## Recommendation
[Based on weight of evidence, what should the reader believe/do?]
```
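As a rough sketch of step 3, the comparison matrix can be assembled from per-source summaries. The `build_matrix` helper and the shape of its input are assumptions made for illustration; the dimension names come from the matrix above.

```python
DIMENSIONS = ["Central Thesis", "Methodology", "Key Finding", "Sample/Scope", "Credibility"]


def build_matrix(sources: dict[str, dict[str, str]]) -> str:
    """Render per-source summaries as a Markdown comparison table.

    `sources` maps a source label to {dimension: value}. Missing
    dimensions are rendered as "..." rather than guessed.
    """
    if not 2 <= len(sources) <= 5:
        raise ValueError("comparison expects 2-5 sources")
    labels = list(sources)
    header = "| Dimension | " + " | ".join(labels) + " |"
    divider = "|" + "---|" * (len(labels) + 1)
    rows = [
        "| " + dim + " | " + " | ".join(sources[s].get(dim, "...") for s in labels) + " |"
        for dim in DIMENSIONS
    ]
    return "\n".join([header, divider, *rows])
```

Rejecting a single source up front mirrors the rule that a comparison needs at least two documents.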
### `/research:cite` — Citation Extraction
1. **Scan document** for all references, footnotes, in-text citations
2. **Extract and format** using the requested style (APA 7 default)
3. **Classify citations** by type:
   - Primary sources (original research, data)
   - Secondary sources (reviews, meta-analyses, commentary)
   - Tertiary sources (textbooks, encyclopedias)
4. **Output** sorted bibliography with classification tags
Supported citation formats:
- **APA 7** (default) — social sciences, business
- **IEEE** — engineering, computer science
- **Chicago** — humanities, history
- **Harvard** — general academic
- **MLA 9** — arts, humanities
---
## Tooling
### `scripts/extract_citations.py`
CLI utility for extracting and formatting citations from text.
**Features:**
- Regex-based citation detection (DOI, URL, author-year, numbered references)
- Multiple output formats (APA, IEEE, Chicago, Harvard, MLA)
- JSON export for integration with reference managers
- Deduplication of repeated citations
**Usage:**
```bash
# Extract citations from a file (APA format, default)
python3 scripts/extract_citations.py document.txt
# Specify format
python3 scripts/extract_citations.py document.txt --format ieee
# JSON output
python3 scripts/extract_citations.py document.txt --format apa --output json
# From stdin
cat paper.txt | python3 scripts/extract_citations.py --stdin
```
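The JSON mode is intended for downstream tooling. The sketch below shows one way to consume it, assuming the payload has the `format`, `total`, and `citations` keys that `scripts/extract_citations.py` prints. The sample values and the `group_by_classification` helper are illustrative only.

```python
import json
from collections import defaultdict

# Sample payload in the shape the script's JSON mode emits (values are illustrative).
SAMPLE_OUTPUT = """
{
  "format": "apa",
  "total": 2,
  "citations": [
    {"index": 1, "type": "doi", "classification": "primary",
     "formatted": "https://doi.org/10.1234/jpm.2023.001",
     "raw": "doi: 10.1234/jpm.2023.001"},
    {"index": 2, "type": "numbered", "classification": "secondary",
     "formatted": "Chen, L. A meta-analysis of discovery methods. 2024.",
     "raw": "[1] Chen, L. A meta-analysis of discovery methods. 2024."}
  ]
}
"""


def group_by_classification(payload: str) -> dict[str, list[str]]:
    """Group formatted citations by their classification tag."""
    data = json.loads(payload)
    groups: dict[str, list[str]] = defaultdict(list)
    for citation in data["citations"]:
        groups[citation["classification"]].append(citation["formatted"])
    return dict(groups)


if __name__ == "__main__":
    for tag, entries in group_by_classification(SAMPLE_OUTPUT).items():
        print(tag, len(entries))
```

In practice the payload would come from the command's standard output rather than an inline string.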
### `scripts/format_summary.py`
CLI utility for generating structured research summaries.
**Features:**
- Multiple summary templates (academic, article, report, executive, comparison, literature)
- Configurable output length (brief, standard, detailed)
- Markdown text and JSON output
- Key findings placeholders that prompt for supporting evidence
**Usage:**
```bash
# Generate structured summary template
python3 scripts/format_summary.py --template academic
# Brief executive summary format
python3 scripts/format_summary.py --template executive --length brief
# List all available templates
python3 scripts/format_summary.py --list-templates
# JSON output
python3 scripts/format_summary.py --template article --output json
```
---
## Quality Assessment Framework
Rate every source on four dimensions:
| Dimension | High | Medium | Low |
|-----------|------|--------|-----|
| **Credibility** | Peer-reviewed, established author | Reputable outlet, known author | Blog, unknown author, no review |
| **Evidence** | Large sample, rigorous method | Moderate data, sound approach | Anecdotal, no data, opinion |
| **Recency** | Published within 2 years | 2-5 years old | 5+ years, may be outdated |
| **Objectivity** | No conflicts, balanced view | Minor affiliations disclosed | Funded by interested party, one-sided |
**Overall Rating:**
- 4 Highs = Strong source — cite with confidence
- 2+ Mediums = Adequate source — cite with caveats
- 2+ Lows = Weak source — verify independently before citing
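The rating rules above leave some combinations open, for example three Highs and one Medium. The sketch below is one possible reading, with the tie-breaking order as an explicit assumption: two or more Lows win first, four Highs come next, and everything else is treated as adequate. `overall_rating` is a hypothetical helper, not part of the shipped scripts.

```python
from collections import Counter

DIMENSIONS = ("credibility", "evidence", "recency", "objectivity")


def overall_rating(ratings: dict[str, str]) -> str:
    """Map four per-dimension ratings to strong / adequate / weak.

    Assumption: rules are checked in order (weak, strong, adequate),
    which the framework itself does not specify.
    """
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    counts = Counter(ratings[d].lower() for d in DIMENSIONS)
    if counts["low"] >= 2:
        return "weak"      # verify independently before citing
    if counts["high"] == 4:
        return "strong"    # cite with confidence
    return "adequate"      # cite with caveats
```

Checking Lows first is the conservative choice: a source with two weak dimensions is flagged even if the other two are rated High.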
---
## Summary Templates
See `references/summary-templates.md` for:
- Academic paper summary template (IMRAD)
- Web article summary template (claim-evidence-implication)
- Technical report template (executive summary)
- Comparative analysis template (matrix + synthesis)
- Literature review template (thematic organization)
See `references/citation-formats.md` for:
- APA 7 formatting rules and examples
- IEEE formatting rules and examples
- Chicago, Harvard, MLA quick reference
---
## Proactive Triggers
Flag these without being asked:
- **Source has no date** → Note it. Undated sources lose credibility points.
- **Source contradicts other sources** → Highlight the contradiction explicitly. Don't paper over disagreements.
- **Source is behind a paywall** → Note limited access. Suggest alternatives if known.
- **User provides only one source for a comparison** → Ask for at least one more. Comparison needs 2+.
- **Citations are incomplete** → Flag missing fields (year, author, title). Don't invent metadata.
- **Source is 5+ years old in a fast-moving field** → Warn about potential obsolescence.
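Several of these triggers are mechanical enough to sketch in code. The example below assumes source metadata arrives as a plain dictionary with `author`, `year`, `title`, and an optional `paywalled` key. Those key names and the `proactive_flags` helper are hypothetical.

```python
from datetime import date
from typing import Optional

REQUIRED_FIELDS = ("author", "year", "title")


def proactive_flags(source: dict, fast_moving: bool = False,
                    current_year: Optional[int] = None) -> list[str]:
    """Return human-readable flags for a source's metadata."""
    current_year = current_year or date.today().year
    flags = []
    # Report missing fields instead of inventing metadata.
    missing = [name for name in REQUIRED_FIELDS if not source.get(name)]
    if missing:
        flags.append("incomplete citation, missing: " + ", ".join(missing))
    year = source.get("year")
    if year is None:
        flags.append("no date: note it in the brief")
    elif fast_moving and current_year - int(year) >= 5:
        flags.append("5+ years old in a fast-moving field: possible obsolescence")
    if source.get("paywalled"):
        flags.append("behind a paywall: note limited access")
    return flags
```

Contradictions between sources are not covered here, since detecting them requires reading the content rather than the metadata.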
---
## Installation
### One-liner (any tool)
```bash
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/product-team/research-summarizer ~/.claude/skills/
```
### Multi-tool install
```bash
# --tool accepts one of: codex, gemini, cursor, windsurf, openclaw
./scripts/convert.sh --skill research-summarizer --tool codex
```
### OpenClaw
```bash
clawhub install cs-research-summarizer
```
---
## Related Skills
- **product-analytics** — Quantitative analysis. Complementary — use research-summarizer for qualitative sources, product-analytics for metrics.
- **competitive-teardown** — Competitive research. Complementary — use research-summarizer for individual source analysis, competitive-teardown for market landscape.
- **content-production** — Content writing. Research-summarizer feeds content-production — summarize sources first, then write.
- **product-discovery** — Discovery frameworks. Complementary — research-summarizer for desk research, product-discovery for user research.
FILE:references/citation-formats.md
# Citation Formats Quick Reference
## APA 7 (American Psychological Association)
Default format for social sciences, business, and product research.
### Journal Article
Author, A. A., & Author, B. B. (Year). Title of article. *Title of Periodical*, *volume*(issue), page–page. https://doi.org/xxxxx
**Example:**
Smith, J., & Jones, K. (2023). Agile adoption in enterprise organizations. *Journal of Product Management*, *15*(2), 45–62. https://doi.org/10.1234/jpm.2023.001
### Book
Author, A. A. (Year). *Title of work: Capital letter also for subtitle*. Publisher.
**Example:**
Cagan, M. (2018). *Inspired: How to create tech products customers love*. Wiley.
### Web Page
Author, A. A. (Year, Month Day). *Title of page*. Site Name. URL
**Example:**
Torres, T. (2024, January 15). *Continuous discovery in practice*. Product Talk. https://www.producttalk.org/discovery
### In-Text Citation
- Parenthetical: (Smith & Jones, 2023)
- Narrative: Smith and Jones (2023) found that...
- 3+ authors: (Patel et al., 2022)
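The three in-text patterns above can be expressed as a short function. This is a minimal sketch covering only the cases listed here. `apa_in_text` is a hypothetical helper and takes surnames only.

```python
def apa_in_text(surnames: list[str], year: int, narrative: bool = False) -> str:
    """Build an APA 7 in-text citation from author surnames and a year."""
    if not surnames:
        raise ValueError("at least one author surname is required")
    if len(surnames) == 1:
        names = surnames[0]
    elif len(surnames) == 2:
        # Parenthetical citations use "&"; narrative citations spell out "and".
        joiner = " and " if narrative else " & "
        names = surnames[0] + joiner + surnames[1]
    else:
        names = f"{surnames[0]} et al."
    return f"{names} ({year})" if narrative else f"({names}, {year})"


print(apa_in_text(["Smith", "Jones"], 2023))
print(apa_in_text(["Smith", "Jones"], 2023, narrative=True))
print(apa_in_text(["Patel", "Chen", "Kumar"], 2022))
```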
---
## IEEE (Institute of Electrical and Electronics Engineers)
Standard for engineering, computer science, and technical research.
### Format
[N] A. Author, "Title of article," *Journal*, vol. X, no. Y, pp. Z–Z, Month Year, doi: 10.xxxx.
### Journal Article
[1] J. Smith and K. Jones, "Agile adoption in enterprise organizations," *J. Prod. Mgmt.*, vol. 15, no. 2, pp. 45–62, Mar. 2023, doi: 10.1234/jpm.2023.001.
### Conference Paper
[2] A. Patel, B. Chen, and C. Kumar, "Cross-functional team performance metrics," in *Proc. Int. Conf. Software Eng.*, 2022, pp. 112–119.
### Book
[3] M. Cagan, *Inspired: How to Create Tech Products Customers Love*. Hoboken, NJ, USA: Wiley, 2018.
### In-Text Citation
As shown in [1], agile adoption has increased...
Multiple: [1], [3], [5]–[7]
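Grouping several reference numbers the way the "Multiple" example does can be sketched as follows. The `ieee_in_text` helper is hypothetical. Collapsing only runs of three or more consecutive numbers into a range is an assumption, chosen because it reproduces the example above.

```python
def ieee_in_text(numbers: list[int]) -> str:
    """Collapse reference numbers into IEEE-style bracketed groups."""
    runs: list[list[int]] = []  # each entry is [start, end] of a consecutive run
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n
        else:
            runs.append([n, n])
    parts = []
    for start, end in runs:
        if end - start >= 2:
            # En dash between brackets, as in the example above.
            parts.append(f"[{start}]\u2013[{end}]")
        else:
            parts.extend(f"[{k}]" for k in range(start, end + 1))
    return ", ".join(parts)


print(ieee_in_text([1, 3, 5, 6, 7]))
```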
---
## Chicago (Notes-Bibliography)
Standard for humanities, history, and some business writing.
### Footnote Format
1. First Name Last Name, *Title of Book* (Place: Publisher, Year), page.
2. First Name Last Name, "Title of Article," *Journal* Volume, no. Issue (Year): pages.
### Bibliography Entry
Last Name, First Name. *Title of Book*. Place: Publisher, Year.
Last Name, First Name. "Title of Article." *Journal* Volume, no. Issue (Year): pages.
---
## Harvard
Common in UK and Australian academic writing.
### Format
Author, A.A. (Year) *Title of book*. Edition. Place: Publisher.
Author, A.A. (Year) 'Title of article', *Journal*, Volume(Issue), pp. X–Y.
### In-Text Citation
(Smith and Jones, 2023)
Smith and Jones (2023) argue that...
---
## MLA 9 (Modern Language Association)
Standard for arts and humanities.
### Format
Last, First. *Title of Book*. Publisher, Year.
Last, First. "Title of Article." *Journal*, vol. X, no. Y, Year, pp. Z–Z.
### In-Text Citation
(Smith and Jones 45)
Smith and Jones argue that "direct quote" (45).
---
## Quick Decision Guide
| Field / Context | Recommended Format |
|----------------|-------------------|
| Social sciences, business, psychology | APA 7 |
| Engineering, computer science, technical | IEEE |
| Humanities, history, arts | Chicago or MLA |
| UK/Australian academic | Harvard |
| Internal business reports | APA 7 (most widely recognized) |
| Product research briefs | APA 7 |
FILE:references/summary-templates.md
# Summary Templates Reference
## Academic Paper (IMRAD)
Use for peer-reviewed journal articles, conference papers, and research studies.
### Structure
1. **Introduction** — What problem does the paper address? Why does it matter?
2. **Methods** — How was the study conducted? What data, what approach?
3. **Results** — What did they find? Key numbers, key patterns.
4. **Analysis** — What do the results mean? How do they compare to prior work?
5. **Discussion** — What are the implications? Limitations? Future work?
### Quality Signals
- Published in a peer-reviewed venue
- Clear methodology section with reproducible steps
- Statistical significance reported (p-values, confidence intervals)
- Limitations acknowledged openly
- Conflicts of interest disclosed
### Red Flags
- No methodology section
- Claims without supporting data
- Funded by an entity that benefits from specific results
- Published in a predatory journal (check Beall's List)
---
## Web Article (Claim-Evidence-Implication)
Use for blog posts, news articles, opinion pieces, and online publications.
### Structure
1. **Claim** — What is the author arguing or reporting?
2. **Evidence** — What data, examples, or sources support the claim?
3. **Implication** — So what? What should the reader do or think differently?
### Quality Signals
- Author has relevant expertise or credentials
- Sources are linked and verifiable
- Multiple perspectives acknowledged
- Published on a reputable platform
- Date of publication is clear
### Red Flags
- No author attribution
- No sources or citations
- Sensationalist headline vs. measured content
- Affiliate links or sponsored content without disclosure
---
## Technical Report (Executive Summary)
Use for industry reports, whitepapers, market research, and internal documents.
### Structure
1. **Executive Summary** — Bottom line in 2-3 sentences
2. **Scope** — What does this report cover?
3. **Key Data** — Most important numbers and findings
4. **Methodology** — How was the data gathered?
5. **Recommendations** — What should be done based on findings?
6. **Relevance** — Why does this matter for our specific context?
### Quality Signals
- Clear methodology for data collection
- Sample size and composition disclosed
- Published by a recognized research firm or organization
- Methodology section available (even if separate document)
### Red Flags
- "Report" is actually a marketing piece for a product
- Data from a single, small, unrepresentative sample
- No methodology disclosure
- Conclusions far exceed what the data supports
---
## Comparative Analysis (Matrix + Synthesis)
Use when evaluating 2-5 sources on the same topic.
### Comparison Dimensions
- **Central thesis** — What is each source's main argument?
- **Methodology** — How did each source arrive at its conclusions?
- **Key finding** — What is the headline result?
- **Sample/scope** — How broad or narrow is the evidence?
- **Credibility** — How trustworthy is the source?
- **Recency** — When was it published?
### Synthesis Framework
1. **Convergent findings** — Where sources agree (stronger signal)
2. **Divergent findings** — Where sources disagree (investigate further)
3. **Gaps** — What no source addresses
4. **Weight of evidence** — Which position has stronger support?
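A rough sketch of the first three synthesis steps is shown below. It assumes each finding has already been mapped to the sources that address it, with `True` for support and `False` for contradiction. The `classify_findings` helper and the extra `single_source` bucket are illustrative additions, and weighing the evidence still needs human judgment.

```python
def classify_findings(positions: dict[str, dict[str, bool]]) -> dict[str, list[str]]:
    """Sort findings into convergent, divergent, single-source, and gap buckets.

    `positions` maps a finding to {source label: supports?}.
    """
    buckets: dict[str, list[str]] = {
        "convergent": [], "divergent": [], "single_source": [], "gaps": [],
    }
    for finding, stances in positions.items():
        if not stances:
            buckets["gaps"].append(finding)           # no source addresses it
        elif len(set(stances.values())) > 1:
            buckets["divergent"].append(finding)      # sources disagree
        elif len(stances) == 1:
            buckets["single_source"].append(finding)  # nothing to compare against
        else:
            buckets["convergent"].append(finding)     # 2+ sources, same stance
    return buckets
```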
---
## Literature Review (Thematic)
Use when synthesizing 5+ sources into a research overview.
### Organization Approaches
- **Thematic** — Group by topic (preferred for most use cases)
- **Chronological** — Group by time period (good for showing evolution)
- **Methodological** — Group by research approach (good for methods papers)
### Per-Theme Structure
1. Theme name and scope
2. Key sources that address this theme
3. What the sources say (points of agreement)
4. What the sources disagree on
5. Strength of evidence for each position
### Synthesis Checklist
- [ ] All sources categorized into themes
- [ ] Gaps in literature identified
- [ ] Contradictions highlighted (not hidden)
- [ ] Overall state of knowledge summarized
- [ ] Future research directions suggested
FILE:scripts/extract_citations.py
#!/usr/bin/env python3
"""
research-summarizer: Citation Extractor
Extract and format citations from text documents. Detects DOIs, URLs,
author-year patterns, and numbered references. Outputs in APA, IEEE,
Chicago, Harvard, or MLA format.
Usage:
    python scripts/extract_citations.py document.txt
    python scripts/extract_citations.py document.txt --format ieee
    python scripts/extract_citations.py document.txt --format apa --output json
    python scripts/extract_citations.py --stdin < document.txt
"""
import argparse
import json
import re
import sys
from collections import OrderedDict
# --- Citation Detection Patterns ---
PATTERNS = {
    "doi": re.compile(
        r"(?:https?://doi\.org/|doi:\s*)(10\.\d{4,}/[^\s,;}\]]+)", re.IGNORECASE
    ),
    "url": re.compile(
        r"https?://[^\s,;}\])\"'>]+", re.IGNORECASE
    ),
    "author_year": re.compile(
        r"(?:^|\(|\s)([A-Z][a-z]+(?:\s(?:&|and)\s[A-Z][a-z]+)?(?:\set\sal\.?)?)\s*\((\d{4})\)",
    ),
    "numbered_ref": re.compile(
        r"^\[(\d+)\]\s+(.+)$", re.MULTILINE
    ),
    "footnote": re.compile(
        r"^\d+\.\s+([A-Z].+?(?:\d{4}).+)$", re.MULTILINE
    ),
}
def extract_dois(text):
    """Extract DOI references."""
    citations = []
    for match in PATTERNS["doi"].finditer(text):
        # Drop a trailing period picked up from sentence punctuation
        doi = match.group(1).rstrip(".")
        citations.append({
            "type": "doi",
            "doi": doi,
            "raw": match.group(0).strip(),
            "url": f"https://doi.org/{doi}",
        })
    return citations
def extract_urls(text):
    """Extract URL references (excluding DOI URLs already captured)."""
    citations = []
    for match in PATTERNS["url"].finditer(text):
        url = match.group(0).rstrip(".,;)")
        if "doi.org" in url:
            continue
        citations.append({
            "type": "url",
            "url": url,
            "raw": url,
        })
    return citations
def extract_author_year(text):
    """Extract narrative author-year citations like Smith (2023), Smith & Jones (2021), or Patel et al. (2022).

    Parenthetical forms such as (Smith, 2023) are not matched by the current pattern.
    """
    citations = []
    for match in PATTERNS["author_year"].finditer(text):
        author = match.group(1).strip()
        year = match.group(2)
        citations.append({
            "type": "author_year",
            "author": author,
            "year": year,
            "raw": f"{author} ({year})",
        })
    return citations
def extract_numbered_refs(text):
    """Extract numbered reference list entries like [1] Author. Title..."""
    citations = []
    for match in PATTERNS["numbered_ref"].finditer(text):
        num = match.group(1)
        content = match.group(2).strip()
        citations.append({
            "type": "numbered",
            "number": int(num),
            "content": content,
            "raw": f"[{num}] {content}",
        })
    return citations
def deduplicate(citations):
    """Remove duplicate citations, keyed on DOI, then URL, then raw text (case-insensitive)."""
    seen = OrderedDict()
    for c in citations:
        key = c.get("doi") or c.get("url") or c.get("raw", "")
        key = key.lower().strip()
        if key and key not in seen:
            seen[key] = c
    return list(seen.values())
def classify_source(citation):
    """Classify citation as primary, secondary, or tertiary."""
    raw = citation.get("content", citation.get("raw", "")).lower()
    if any(kw in raw for kw in ["meta-analysis", "systematic review", "literature review", "survey of"]):
        return "secondary"
    if any(kw in raw for kw in ["textbook", "encyclopedia", "handbook", "dictionary"]):
        return "tertiary"
    # Default when no secondary or tertiary keyword is present
    return "primary"
# --- Formatting ---
def format_apa(citation):
    """Format citation in APA 7 style."""
    if citation["type"] == "doi":
        return f"https://doi.org/{citation['doi']}"
    if citation["type"] == "url":
        # APA 7 lists the bare URL; a "Retrieved ... from" prefix is only used with a retrieval date.
        return citation["url"]
    if citation["type"] == "author_year":
        return f"{citation['author']} ({citation['year']})."
    if citation["type"] == "numbered":
        return citation["content"]
    return citation.get("raw", "")
def format_ieee(citation):
    """Format citation in IEEE style."""
    if citation["type"] == "doi":
        return f"doi: {citation['doi']}"
    if citation["type"] == "url":
        return f"[Online]. Available: {citation['url']}"
    if citation["type"] == "author_year":
        return f"{citation['author']}, {citation['year']}."
    if citation["type"] == "numbered":
        return f"[{citation['number']}] {citation['content']}"
    return citation.get("raw", "")
def format_chicago(citation):
    """Format citation in Chicago style."""
    if citation["type"] == "doi":
        return f"https://doi.org/{citation['doi']}."
    if citation["type"] == "url":
        return f"{citation['url']}."
    if citation["type"] == "author_year":
        return f"{citation['author']}. {citation['year']}."
    if citation["type"] == "numbered":
        return citation["content"]
    return citation.get("raw", "")
def format_harvard(citation):
    """Format citation in Harvard style."""
    if citation["type"] == "doi":
        return f"doi:{citation['doi']}"
    if citation["type"] == "url":
        return f"Available at: {citation['url']}"
    if citation["type"] == "author_year":
        return f"{citation['author']} ({citation['year']})"
    if citation["type"] == "numbered":
        return citation["content"]
    return citation.get("raw", "")
def format_mla(citation):
    """Format citation in MLA 9 style."""
    if citation["type"] == "doi":
        return f"doi:{citation['doi']}."
    if citation["type"] == "url":
        return f"{citation['url']}."
    if citation["type"] == "author_year":
        return f"{citation['author']}. {citation['year']}."
    if citation["type"] == "numbered":
        return citation["content"]
    return citation.get("raw", "")
FORMATTERS = {
    "apa": format_apa,
    "ieee": format_ieee,
    "chicago": format_chicago,
    "harvard": format_harvard,
    "mla": format_mla,
}
# --- Demo Data ---
DEMO_TEXT = """
Recent studies in product management have shown significant shifts in methodology.
According to Smith & Jones (2023), agile adoption has increased by 47% since 2020.
Patel et al. (2022) found that cross-functional teams deliver 2.3x faster.
Several frameworks have been proposed:
[1] Cagan, M. Inspired: How to Create Tech Products Customers Love. Wiley, 2018.
[2] Torres, T. Continuous Discovery Habits. Product Talk LLC, 2021.
[3] Gothelf, J. & Seiden, J. Lean UX. O'Reilly Media, 2021. doi: 10.1234/leanux.2021
For further reading, see https://www.svpg.com/articles/ and the meta-analysis
by Chen (2024) on product discovery effectiveness.
Related work: doi: 10.1145/3544548.3581388
"""
def run_extraction(text, fmt, output_mode):
    """Run full extraction pipeline."""
    all_citations = []
    all_citations.extend(extract_dois(text))
    all_citations.extend(extract_author_year(text))
    all_citations.extend(extract_numbered_refs(text))
    all_citations.extend(extract_urls(text))
    citations = deduplicate(all_citations)
    for c in citations:
        c["classification"] = classify_source(c)
    formatter = FORMATTERS.get(fmt, format_apa)
    if output_mode == "json":
        result = {
            "format": fmt,
            "total": len(citations),
            "citations": [],
        }
        for i, c in enumerate(citations, 1):
            result["citations"].append({
                "index": i,
                "type": c["type"],
                "classification": c["classification"],
                "formatted": formatter(c),
                "raw": c.get("raw", ""),
            })
        print(json.dumps(result, indent=2))
    else:
        print(f"Citations ({fmt.upper()}) — {len(citations)} found\n")
        # Group by classification so each tier is printed under its own heading
        primary = [c for c in citations if c["classification"] == "primary"]
        secondary = [c for c in citations if c["classification"] == "secondary"]
        tertiary = [c for c in citations if c["classification"] == "tertiary"]
        for label, group in [("Primary Sources", primary), ("Secondary Sources", secondary), ("Tertiary Sources", tertiary)]:
            if group:
                print(f"### {label}")
                for i, c in enumerate(group, 1):
                    print(f" {i}. {formatter(c)}")
                print()
    return citations
def main():
    parser = argparse.ArgumentParser(
        description="research-summarizer: Extract and format citations from text"
    )
    parser.add_argument("file", nargs="?", help="Input text file (omit for demo)")
    parser.add_argument(
        "--format", "-f",
        choices=["apa", "ieee", "chicago", "harvard", "mla"],
        default="apa",
        help="Citation format (default: apa)",
    )
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output mode (default: text)",
    )
    parser.add_argument(
        "--stdin",
        action="store_true",
        help="Read from stdin instead of file",
    )
    args = parser.parse_args()
    if args.stdin:
        text = sys.stdin.read()
    elif args.file:
        try:
            with open(args.file, "r", encoding="utf-8") as f:
                text = f.read()
        except FileNotFoundError:
            print(f"Error: File not found: {args.file}", file=sys.stderr)
            sys.exit(1)
        except IOError as e:
            print(f"Error reading file: {e}", file=sys.stderr)
            sys.exit(1)
    else:
        # No file and no --stdin: fall back to the built-in demo text
        print("No input file provided. Running demo...\n")
        text = DEMO_TEXT
    run_extraction(text, args.format, args.output)
if __name__ == "__main__":
    main()
FILE:scripts/format_summary.py
#!/usr/bin/env python3
"""
research-summarizer: Summary Formatter
Generate structured research summary templates for different source types.
Produces fill-in-the-blank frameworks for academic papers, web articles,
technical reports, and executive briefs.
Usage:
    python scripts/format_summary.py --template academic
    python scripts/format_summary.py --template executive --length brief
    python scripts/format_summary.py --list-templates
    python scripts/format_summary.py --template article --output json
"""
import argparse
import json
import sys
import textwrap
from datetime import datetime
# --- Templates ---
TEMPLATES = {
    "academic": {
        "name": "Academic Paper Summary",
        "description": "IMRAD structure for peer-reviewed papers and research studies",
        "sections": [
            ("Title", "[Full paper title]"),
            ("Author(s)", "[Author names, affiliations]"),
            ("Publication", "[Journal/Conference, Year, DOI]"),
            ("Source Type", "Academic Paper"),
            ("Key Thesis", "[1-2 sentences: the central research question and answer]"),
            ("Methodology", "[Study design, sample size, data sources, analytical approach]"),
            ("Key Findings", "1. [Finding 1 with supporting data]\n2. [Finding 2 with supporting data]\n3. [Finding 3 with supporting data]"),
            ("Statistical Significance", "[Key p-values, effect sizes, confidence intervals]"),
            ("Limitations", "- [Limitation 1: scope, sample, methodology gap]\n- [Limitation 2]"),
            ("Implications", "- [What this means for practice]\n- [What this means for future research]"),
            ("Notable Quotes", '> "[Direct quote]" (p. X)'),
            ("Quality Assessment", "Credibility: [High/Med/Low] | Evidence: [High/Med/Low] | Recency: [High/Med/Low] | Objectivity: [High/Med/Low]"),
        ],
    },
    "article": {
        "name": "Web Article Summary",
        "description": "Claim-evidence-implication structure for online articles and blog posts",
        "sections": [
            ("Title", "[Article title]"),
            ("Author", "[Author name]"),
            ("Source", "[Publication/Website, Date, URL]"),
            ("Source Type", "Web Article"),
            ("Central Claim", "[1-2 sentences: main argument or thesis]"),
            ("Supporting Evidence", "1. [Evidence point 1]\n2. [Evidence point 2]\n3. [Evidence point 3]"),
            ("Counterarguments Addressed", "- [Counterargument and author's response]"),
            ("Implications", "- [What this means for the reader]"),
            ("Bias Check", "Author affiliation: [?] | Funding: [?] | Balanced perspective: [Yes/No]"),
            ("Actionable Takeaways", "- [What to do with this information]\n- [Next step]"),
            ("Quality Assessment", "Credibility: [High/Med/Low] | Evidence: [High/Med/Low] | Recency: [High/Med/Low] | Objectivity: [High/Med/Low]"),
        ],
    },
    "report": {
        "name": "Technical Report Summary",
        "description": "Structured summary for industry reports, whitepapers, and technical documentation",
        "sections": [
            ("Title", "[Report title]"),
            ("Organization", "[Publishing organization]"),
            ("Date", "[Publication date]"),
            ("Source Type", "Technical Report"),
            ("Executive Summary", "[2-3 sentences: scope, key conclusion, recommendation]"),
            ("Scope", "[What the report covers and what it excludes]"),
            ("Key Data Points", "1. [Statistic or data point with context]\n2. [Statistic or data point with context]\n3. [Statistic or data point with context]"),
            ("Methodology", "[How data was collected — survey, analysis, case study]"),
            ("Recommendations", "1. [Recommendation with supporting rationale]\n2. [Recommendation with supporting rationale]"),
            ("Limitations", "- [Sample bias, geographic scope, time period]"),
            ("Relevance", "[Why this matters for our context — specific applicability]"),
            ("Quality Assessment", "Credibility: [High/Med/Low] | Evidence: [High/Med/Low] | Recency: [High/Med/Low] | Objectivity: [High/Med/Low]"),
        ],
    },
    "executive": {
        "name": "Executive Brief",
        "description": "Condensed decision-focused summary for leadership consumption",
        "sections": [
            ("Source", "[Title, Author, Date]"),
            ("Bottom Line", "[1 sentence: the single most important takeaway]"),
            ("Key Facts", "1. [Fact]\n2. [Fact]\n3. [Fact]"),
            ("So What?", "[Why this matters for our business/product/strategy]"),
            ("Action Required", "- [Specific next step with owner and timeline]"),
            ("Confidence", "[High/Medium/Low] — based on source quality and evidence strength"),
        ],
    },
    "comparison": {
        "name": "Comparative Analysis",
        "description": "Side-by-side comparison matrix for 2-5 sources on the same topic",
        "sections": [
            ("Topic", "[Research topic or question being compared]"),
            ("Sources Compared", "1. [Source A — Author, Year]\n2. [Source B — Author, Year]\n3. [Source C — Author, Year]"),
            ("Comparison Matrix", "| Dimension | Source A | Source B | Source C |\n|-----------|---------|---------|---------|"
             "\n| Central Thesis | ... | ... | ... |"
             "\n| Methodology | ... | ... | ... |"
             "\n| Key Finding | ... | ... | ... |"
             "\n| Sample/Scope | ... | ... | ... |"
             "\n| Credibility | High/Med/Low | High/Med/Low | High/Med/Low |"),
            ("Consensus Findings", "[What most sources agree on]"),
            ("Contested Points", "[Where sources disagree — with strongest evidence for each side]"),
            ("Gaps", "[What none of the sources address]"),
            ("Synthesis", "[Weight-of-evidence recommendation: what to believe and do]"),
        ],
    },
    "literature": {
        "name": "Literature Review",
        "description": "Thematic organization of multiple sources for research synthesis",
        "sections": [
            ("Research Question", "[The question this review addresses]"),
            ("Search Scope", "[Databases, keywords, date range, inclusion/exclusion criteria]"),
            ("Sources Reviewed", "[Total count, breakdown by type]"),
            ("Theme 1: [Name]", "Summary: [Theme overview]\nKey Sources: [Author (Year), Author (Year)]\nFindings: [What sources say about this theme]"),
            ("Theme 2: [Name]", "Summary: [Theme overview]\nKey Sources: [Author (Year), Author (Year)]\nFindings: [What sources say about this theme]"),
            ("Theme 3: [Name]", "Summary: [Theme overview]\nKey Sources: [Author (Year), Author (Year)]\nFindings: [What sources say about this theme]"),
            ("Gaps in Literature", "- [Under-researched area 1]\n- [Under-researched area 2]"),
            ("Synthesis", "[Overall state of knowledge — what we know, what we don't, where to go next]"),
        ],
    },
}
LENGTH_CONFIGS = {
    "brief": {"max_sections": 4, "label": "Brief (key points only)"},
    "standard": {"max_sections": 99, "label": "Standard (full template)"},
    "detailed": {"max_sections": 99, "label": "Detailed (full template with extended guidance)"},
}
def render_template(template_key, length="standard", output_format="text"):
    """Render a summary template."""
    template = TEMPLATES[template_key]
    # Truncate to the section limit configured for this length ("brief" keeps the first 4)
    sections = template["sections"][:LENGTH_CONFIGS[length]["max_sections"]]
    if output_format == "json":
        result = {
            "template": template_key,
            "name": template["name"],
            "description": template["description"],
            "length": length,
            "generated": datetime.now().strftime("%Y-%m-%d"),
            "sections": [],
        }
        for title, content in sections:
            result["sections"].append({
                "heading": title,
                "placeholder": content,
            })
        return json.dumps(result, indent=2)
    # Text/Markdown output
    lines = []
    lines.append(f"# {template['name']}")
    lines.append(f"_{template['description']}_\n")
    lines.append(f"Length: {LENGTH_CONFIGS[length]['label']}")
    lines.append(f"Generated: {datetime.now().strftime('%Y-%m-%d')}\n")
    lines.append("---\n")
    for title, content in sections:
        lines.append(f"## {title}\n")
        # Emit the placeholder text line by line, unindented
        for line in content.split("\n"):
            lines.append(line)
        lines.append("")
    lines.append("---")
    lines.append("_Template from research-summarizer skill_")
    return "\n".join(lines)
def list_templates(output_format="text"):
    """List all available templates."""
    if output_format == "json":
        result = []
        for key, tmpl in TEMPLATES.items():
            result.append({
                "key": key,
                "name": tmpl["name"],
                "description": tmpl["description"],
                "sections": len(tmpl["sections"]),
            })
        return json.dumps(result, indent=2)
    lines = []
    lines.append("Available Summary Templates\n")
    lines.append(f"{'KEY':<15} {'NAME':<30} {'SECTIONS':>8} DESCRIPTION")
    lines.append(f"{'─' * 90}")
    for key, tmpl in TEMPLATES.items():
        lines.append(
            f"{key:<15} {tmpl['name']:<30} {len(tmpl['sections']):>8} {tmpl['description'][:40]}"
        )
    return "\n".join(lines)
def main():
    parser = argparse.ArgumentParser(
        description="research-summarizer: Generate structured summary templates"
    )
    parser.add_argument(
        "--template", "-t",
        choices=list(TEMPLATES.keys()),
        help="Template type to generate",
    )
    parser.add_argument(
        "--length", "-l",
        choices=["brief", "standard", "detailed"],
        default="standard",
        help="Output length (default: standard)",
    )
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--list-templates",
        action="store_true",
        help="List all available templates",
    )
    args = parser.parse_args()
    if args.list_templates:
        print(list_templates(args.output))
        return
    if not args.template:
        print("No template specified. Available templates:\n")
        print(list_templates(args.output))
        print("\nUsage: python scripts/format_summary.py --template academic")
        return
    print(render_template(args.template, args.length, args.output))
if __name__ == "__main__":
    main()
Reverse-engineer any codebase into a complete Product Requirements Document (PRD). Analyzes routes, components, state management, API integrations, and user...
---
Name: code-to-prd
Tier: STANDARD
Category: product
Dependencies: none
Author: Alireza Rezvani
Version: 2.1.2
name: code-to-prd
description: |
  Reverse-engineer any codebase into a complete Product Requirements Document (PRD).
  Analyzes routes, components, state management, API integrations, and user interactions to produce
  business-readable documentation detailed enough for engineers or AI agents to fully reconstruct
  every page and endpoint. Works with frontend frameworks (React, Vue, Angular, Svelte, Next.js, Nuxt),
  backend frameworks (NestJS, Django, Express, FastAPI), and fullstack applications.
  Trigger when users mention: generate PRD, reverse-engineer requirements, code to documentation,
  extract product specs from code, document page logic, analyze page fields and interactions,
  create a functional inventory, write requirements from an existing codebase, document API endpoints,
  or analyze backend routes.
license: MIT
metadata:
  updated: 2026-03-17
---
## Name
Code → PRD
## Description
Reverse-engineer any frontend, backend, or fullstack codebase into a complete Product Requirements Document (PRD). Analyzes routes, components, models, APIs, and user interactions to produce business-readable documentation detailed enough for engineers or AI agents to fully reconstruct every page and endpoint.
# Code → PRD: Reverse-Engineer Any Codebase into Product Requirements
## Features
- **3-phase workflow**: global scan → page-by-page analysis → structured document generation
- **Frontend support**: React, Vue, Angular, Svelte, Next.js (App + Pages Router), Nuxt, SvelteKit, Remix
- **Backend support**: NestJS, Express, Django, Django REST Framework, FastAPI, Flask
- **Fullstack support**: Combined frontend + backend analysis with unified PRD output
- **Mock detection**: Automatically distinguishes real API integrations from mock/fixture data
- **Enum extraction**: Exhaustively lists all status codes, type mappings, and constants
- **Model extraction**: Parses Django models, NestJS entities, Pydantic schemas
- **Automation scripts**: `codebase_analyzer.py` for scanning, `prd_scaffolder.py` for directory generation
- **Quality checklist**: Validation checklist for completeness, accuracy, readability
## Usage
```bash
# Analyze a project and generate PRD skeleton
python3 scripts/codebase_analyzer.py /path/to/project -o analysis.json
python3 scripts/prd_scaffolder.py analysis.json -o prd/ -n "My App"
# Or use the slash command
/code-to-prd /path/to/project
```
## Examples
### Frontend (React)
```bash
/code-to-prd ./src
# → Scans components, routes, API calls, state management
# → Generates prd/ with per-page docs, enum dictionary, API inventory
```
### Backend (Django)
```bash
/code-to-prd ./myproject
# → Detects Django via manage.py, scans urls.py, views.py, models.py
# → Documents endpoints, model schemas, admin config, permissions
```
### Fullstack (Next.js)
```bash
/code-to-prd .
# → Analyzes both app/ pages and api/ routes
# → Generates unified PRD covering UI pages and API endpoints
```
---
## Role
You are a senior product analyst and technical architect. Your job is to read a frontend, backend, or fullstack codebase, understand the business purpose of every page and endpoint, and produce a complete PRD in **product-manager-friendly language**.
### Dual Audience
1. **Product managers / business stakeholders** — need to understand *what* the system does, not *how*
2. **Engineers / AI agents** — need enough detail to **fully reconstruct** every page's fields, interactions, and relationships
Your document must describe functionality in non-technical language without omitting any business detail.
### Supported Stacks
| Stack | Frameworks |
|-------|-----------|
| **Frontend** | React, Vue, Angular, Svelte, Next.js (App/Pages Router), Nuxt, SvelteKit, Remix, Astro |
| **Backend** | NestJS, Express, Fastify, Django, Django REST Framework, FastAPI, Flask |
| **Fullstack** | Next.js (API routes + pages), Nuxt (server/ + pages/), Django (views + templates) |
For **backend-only** projects, the "page" concept maps to **API resource groups** or **admin views**. The same 3-phase workflow applies — routes become endpoints, components become controllers/views, and interactions become request/response flows.
---
## Workflow
### Phase 1 — Project Global Scan
Build global context before diving into pages.
#### 1. Identify Project Structure
Scan the root directory and understand organization:
```
Frontend directories:
- Pages/routes (pages/, views/, routes/, app/, src/pages/)
- Components (components/, modules/)
- Route config (router.ts, routes.ts, App.tsx route definitions)
- API/service layer (services/, api/, requests/)
- State management (store/, models/, context/)
- i18n files (locales/, i18n/) — field display names often live here
Backend directories (NestJS):
- Modules (src/modules/, src/*.module.ts)
- Controllers (*.controller.ts) — route handlers
- Services (*.service.ts) — business logic
- DTOs (dto/, *.dto.ts) — request/response shapes
- Entities (entities/, *.entity.ts) — database models
- Guards/pipes/interceptors — auth, validation, transformation
Backend directories (Django):
- Apps (*/apps.py, */views.py, */models.py, */urls.py)
- URL config (urls.py, */urls.py)
- Views (views.py, viewsets.py) — route handlers
- Models (models.py) — database schema
- Serializers (serializers.py) — request/response shapes
- Forms (forms.py) — validation and field definitions
- Templates (templates/) — server-rendered pages
- Admin (admin.py) — admin panel configuration
```
**Identify framework** from `package.json` (Node.js frameworks) or project files (`manage.py` for Django, `requirements.txt`/`pyproject.toml` for Python). Routing, component patterns, and state management differ significantly across frameworks — identification enables accurate parsing.
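The identification step can be sketched as follows. This is a minimal illustration, not the logic of `codebase_analyzer.py`: the `detect_framework` function, the `NODE_FRAMEWORKS` priority list, and the returned labels are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical priority list: more specific frameworks come first,
# because a Next.js project also lists React as a dependency.
NODE_FRAMEWORKS = [
    ("next", "next"),
    ("nuxt", "nuxt"),
    ("@nestjs/core", "nestjs"),
    ("@angular/core", "angular"),
    ("@sveltejs/kit", "sveltekit"),
    ("svelte", "svelte"),
    ("vue", "vue"),
    ("react", "react"),
    ("express", "express"),
]


def detect_framework(root):
    """Return a best-guess framework label for the project at `root`, or None."""
    root = Path(root)
    package_json = root / "package.json"
    if package_json.is_file():
        data = json.loads(package_json.read_text(encoding="utf-8"))
        deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})}
        for package, label in NODE_FRAMEWORKS:
            if package in deps:
                return label
    # Django projects ship a manage.py at the project root.
    if (root / "manage.py").is_file():
        return "django"
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "package.json").write_text(
            json.dumps({"dependencies": {"next": "14.1.0", "react": "18.2.0"}}),
            encoding="utf-8",
        )
        print(detect_framework(tmp))  # next
```

Checking the most specific package first keeps a Next.js project from being reported as plain React.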
#### 2. Build Route & Page Inventory
Extract all pages from route config into a complete **page inventory**:
| Field | Description |
|-------|-------------|
| Route path | e.g. `/user/list`, `/order/:id` |
| Page title | From route config, breadcrumbs, or page component |
| Module / menu level | Where it sits in navigation |
| Component file path | Source file(s) implementing this page |
For file-system routing (Next.js, Nuxt), infer from directory structure.
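As an illustration of inferring routes from the directory structure, the sketch below maps Next.js App Router page files to route paths. The `app_router_path` helper is hypothetical, and it assumes source paths are relative to the project root with `app/` as the first segment.

```python
import re
from pathlib import PurePosixPath


def app_router_path(source):
    """Map an App Router page file to its route path.

    Assumes `source` looks like 'app/<segments>/page.tsx'.
    """
    parts = PurePosixPath(source).parts
    # Drop the leading 'app' directory and the trailing page file.
    segments = parts[1:-1]
    route = []
    for segment in segments:
        # Route groups such as '(admin)' organize files but are not part of the URL.
        if segment.startswith("(") and segment.endswith(")"):
            continue
        # Dynamic segments: '[id]' becomes ':id'.
        match = re.fullmatch(r"\[(\w+)\]", segment)
        route.append(f":{match.group(1)}" if match else segment)
    return "/" + "/".join(route)


if __name__ == "__main__":
    for source in ("app/page.tsx", "app/dashboard/page.tsx", "app/users/[id]/page.tsx"):
        print(source, "->", app_router_path(source))
    # app/page.tsx -> /
    # app/dashboard/page.tsx -> /dashboard
    # app/users/[id]/page.tsx -> /users/:id
```

Catch-all segments such as `[...slug]` are left unchanged by this sketch and would need their own rule.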
**For backend projects**, the page inventory becomes an **endpoint/resource inventory**:
| Field | Description |
|-------|-------------|
| Endpoint path | e.g. `/api/users`, `/api/orders/:id` |
| HTTP method | GET, POST, PUT, DELETE, PATCH |
| Controller/view | Source file handling this route |
| Module/app | Which NestJS module or Django app owns it |
| Auth required | Whether authentication/permissions are needed |
For NestJS: extract from `@Controller` + `@Get/@Post/@Put/@Patch/@Delete` decorators.
For Django: extract from `urls.py` → `urlpatterns`, plus any DRF router registrations (`router.register(...)`) that expose the viewsets defined in `viewsets.py` or `views.py`.
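For the NestJS case, decorator extraction can be approximated with regular expressions. This is a rough sketch under stated assumptions: `extract_nest_endpoints` and both patterns are hypothetical, only string-literal paths are handled, and global prefixes or API versioning configured elsewhere are ignored.

```python
import re

# Matches @Controller('users'), @Controller("users") and @Controller().
CONTROLLER_RE = re.compile(r"@Controller\(\s*['\"]?([^'\")]*)['\"]?\s*\)")
# Matches @Get(), @Get(':id'), @Post('bulk'), and so on.
METHOD_RE = re.compile(r"@(Get|Post|Put|Patch|Delete)\(\s*(?:['\"]([^'\"]*)['\"])?\s*\)")


def extract_nest_endpoints(source):
    """Return (HTTP method, path) pairs found in one controller file."""
    controller = CONTROLLER_RE.search(source)
    prefix = controller.group(1).strip("/") if controller else ""
    endpoints = []
    for decorator, sub_path in METHOD_RE.findall(source):
        parts = [p for p in (prefix, sub_path.strip("/")) if p]
        endpoints.append((decorator.upper(), "/" + "/".join(parts)))
    return endpoints


SAMPLE = """
@Controller('users')
export class UsersController {
  @Get()
  findAll() {}

  @Get(':id')
  findOne() {}

  @Post()
  create() {}
}
"""

if __name__ == "__main__":
    for method, path in extract_nest_endpoints(SAMPLE):
        print(method, path)
    # GET /users
    # GET /users/:id
    # POST /users
```

A regex pass like this is enough to seed the endpoint inventory. Guards, DTOs, and permissions still have to be read from the controller itself.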
#### 3. Map Global Context
Before analyzing individual pages, capture:
- **Global state** — user info, permissions, feature flags, config
- **Shared components** — layout, nav, auth guards, error boundaries
- **Enums & constants** — status codes, type mappings, role definitions
- **API base config** — base URL, interceptors, auth headers, error handling
- **Database models** (backend) — entity relationships, field types, constraints
- **Middleware** (backend) — auth middleware, rate limiting, logging, CORS
- **DTOs/Serializers** (backend) — request validation shapes, response formats
These will be referenced throughout page/endpoint analysis.
---
### Phase 2 — Page-by-Page Deep Analysis
Analyze every page in the inventory. **Each page produces its own Markdown file.**
#### Analysis Dimensions
For each page, answer:
##### A. Page Overview
- What does this page do? (one sentence)
- Where does it fit in the system?
- What scenario brings a user here?
##### B. Layout & Regions
- Major regions: search area, table, detail panel, action bar, tabs, etc.
- Spatial arrangement: top/bottom, left/right, nested
##### C. Field Inventory (core — be exhaustive)
**For form pages**, list every field:
| Field Name | Type | Required | Default | Validation | Business Description |
|-----------|------|----------|---------|------------|---------------------|
| Username | Text input | Yes | — | Max 20 chars | System login account |
**For table/list pages**, list:
- Search/filter fields (type, required, enum options)
- Table columns (name, format, sortable, filterable)
- Row action buttons (what each one does)
**Field name extraction priority:**
1. Hardcoded display text in code
2. i18n translation values
3. Component `placeholder` / `label` / `title` props
4. Variable names (last resort — provide reasonable display name)
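The priority order above can be expressed as a simple fallback chain. The sketch below is illustrative only: `resolve_field_name` and `humanize` are hypothetical helpers, and the `[TBC]` fallback follows the skill's own convention for marking uncertainty.

```python
import re


def humanize(identifier):
    """Turn 'createdAt' or 'created_at' into 'Created At'."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", identifier)
    words = re.split(r"[\s_\-]+", spaced.strip())
    return " ".join(word.capitalize() for word in words if word)


def resolve_field_name(hardcoded=None, i18n=None, prop=None, variable=None):
    """Return (display name, where it came from), honoring the priority order."""
    candidates = [
        ("hardcoded text", hardcoded),
        ("i18n value", i18n),
        ("component prop", prop),
    ]
    for source, value in candidates:
        if value and value.strip():
            return value.strip(), source
    if variable:
        # Last resort: derive a readable name and record that it was inferred.
        return humanize(variable), "variable name (inferred)"
    return "[TBC]", "unknown"


if __name__ == "__main__":
    print(resolve_field_name(i18n="Username", variable="userName"))
    # ('Username', 'i18n value')
    print(resolve_field_name(variable="created_at"))
    # ('Created At', 'variable name (inferred)')
```

Returning the source alongside the name lets the PRD flag inferred names for review.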
##### D. Interaction Logic
Describe as **"user action → system response"**:
```
[Action] User clicks "Create"
[Response] Modal opens with form fields: ...
[Validation] Name required, phone format check
[API] POST /api/user/create with form data
[Success] Toast "Created successfully", close modal, refresh list
[Failure] Show API error message
```
**Cover all interaction types:**
- Page load / initialization (default queries, preloaded data)
- Search / filter / reset
- CRUD operations (create, read, update, delete)
- Table: pagination, sorting, row selection, bulk actions
- Form submission & validation
- Status transitions (e.g. approval flows: pending → approved or rejected)
- Import / export
- Field interdependencies (selecting value A changes options in field B)
- Permission controls (buttons/fields visible only to certain roles)
- Polling / auto-refresh / real-time updates
##### E. API Dependencies
**Case 1: API is integrated** (real HTTP calls in code)
| API Name | Method | Path | Trigger | Key Params | Notes |
|----------|--------|------|---------|-----------|-------|
| Get users | GET | /api/user/list | Load, search | page, size, keyword | Paginated |
**Case 2: API not integrated** (mock/hardcoded data)
When the page uses mock data, hardcoded fixtures, `setTimeout` simulations, or `Promise.resolve()` stubs, the API isn't real yet. **Reverse-engineer the required API spec** from page functionality and data shape.
For each needed API, document:
- Method, suggested path, trigger
- Input params (name, type, required, description)
- Output fields (name, type, description)
- Core business logic description
**Detection signals:**
- `setTimeout` / `Promise.resolve()` returning data → mock
- Data defined in component or `*.mock.*` files → mock
- Real HTTP calls (`axios`, `fetch`, service layer) with real paths → integrated
- `__mocks__` directory → mock
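These signals can be turned into a first-pass classifier. The sketch below is a heuristic under stated assumptions, not the detection logic of `codebase_analyzer.py`: `classify_api_usage` and its patterns are hypothetical, and `setTimeout` also appears in legitimate code such as debouncing, so a "mock" result should be confirmed by reading the file.

```python
import re

MOCK_SIGNALS = [
    re.compile(r"\bsetTimeout\s*\("),
    re.compile(r"\bPromise\.resolve\s*\("),
]
HTTP_SIGNALS = [
    re.compile(r"\baxios\s*(?:\.\w+)?\s*\("),
    re.compile(r"\bfetch\s*\("),
]


def classify_api_usage(file_path, source):
    """Return 'mock', 'integrated', or 'unknown' for one source file."""
    normalized = file_path.replace("\\", "/")
    # Location-based signals: __mocks__ directories and *.mock.* files.
    if "__mocks__/" in normalized or ".mock." in normalized:
        return "mock"
    # Content-based signals: simulated latency or pre-resolved data.
    if any(pattern.search(source) for pattern in MOCK_SIGNALS):
        return "mock"
    if any(pattern.search(source) for pattern in HTTP_SIGNALS):
        return "integrated"
    return "unknown"


if __name__ == "__main__":
    print(classify_api_usage("services/user.ts", "axios.get('/api/users')"))
    # integrated
    print(classify_api_usage("services/dashboard.ts", "Promise.resolve({ total: 3 })"))
    # mock
```

Mock signals are checked first on purpose: a file that mixes stubs with real calls is safer to flag for review than to report as integrated.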
##### F. Page Relationships
- **Inbound**: Which pages link here? What parameters do they pass?
- **Outbound**: Where can users navigate from here? What parameters?
- **Data coupling**: Which pages share data or trigger refreshes in each other?
---
### Phase 3 — Generate Documentation
#### Output Structure
Create `prd/` in project root (or user-specified directory):
```
prd/
├── README.md # System overview
├── pages/
│ ├── 01-user-mgmt-list.md # One file per page
│ ├── 02-user-mgmt-detail.md
│ ├── 03-order-mgmt-list.md
│ └── ...
└── appendix/
    ├── enum-dictionary.md # All enums, status codes, type mappings
    ├── page-relationships.md # Navigation map between pages
    └── api-inventory.md # Complete API reference
```
#### README.md Template
```markdown
# [System Name] — Product Requirements Document
## System Overview
[2-3 paragraphs: what the system does, business context, primary users]
## Module Overview
| Module | Pages | Core Functionality |
|--------|-------|--------------------|
| User Management | User list, User detail, Role mgmt | CRUD users, assign roles and permissions |
## Page Inventory
| # | Page Name | Route | Module | Doc Link |
|---|-----------|-------|--------|----------|
| 1 | User List | /user/list | User Mgmt | [→](./pages/01-user-mgmt-list.md) |
## Global Notes
### Permission Model
[Summarize auth/role system if present in code]
### Common Interaction Patterns
[Global rules: all deletes require confirmation, lists default to created_at desc, etc.]
```
#### Per-Page Document Template
```markdown
# [Page Name]
> **Route:** `/xxx/xxx`
> **Module:** [Module name]
> **Generated:** [Date]
## Overview
[2-3 sentences: core function and use case]
## Layout
[Region breakdown — text description or ASCII diagram]
## Fields
### [Region: e.g. "Search Filters"]
| Field | Type | Required | Options / Enum | Default | Notes |
|-------|------|----------|---------------|---------|-------|
### [Region: e.g. "Data Table"]
| Column | Format | Sortable | Filterable | Notes |
|--------|--------|----------|-----------|-------|
### [Region: e.g. "Actions"]
| Button | Visibility Condition | Behavior |
|--------|---------------------|----------|
## Interactions
### Page Load
[What happens on mount]
### [Scenario: e.g. "Search"]
- **Trigger:** [User action]
- **Behavior:** [System response]
- **Special rules:** [If any]
### [Scenario: e.g. "Create"]
- **Trigger:** ...
- **Modal/drawer content:** [Fields and logic inside]
- **Validation:** ...
- **On success:** ...
## API Dependencies
| API | Method | Path | Trigger | Notes |
|-----|--------|------|---------|-------|
| ... | ... | ... | ... | ... |
## Page Relationships
- **From:** [Source pages + params]
- **To:** [Target pages + params]
- **Data coupling:** [Cross-page refresh triggers]
## Business Rules
[Anything that doesn't fit above]
```
---
## Key Principles
### 1. Business Language First
Don't write "calls `useState` to manage loading state." Write "search button shows a spinner to prevent duplicate submissions."
Don't write "useEffect fetches on mount." Write "page automatically loads the first page of results on open."
Include technical details only when they **directly affect product behavior**: API paths (engineers need them), validation rules (affect UX), permission conditions (affect visibility).
### 2. Don't Miss Hidden Logic
Code contains logic PMs may not realize exists:
- Field interdependencies (type A shows field X; type B shows field Y)
- Conditional button visibility
- Data formatting (currency with 2 decimals, date formats, status label mappings)
- Default sort order and page size
- Debounce/throttle effects on user input
- Polling / auto-refresh intervals
### 3. Exhaustively List Enums
When code defines enums (status codes, type codes, role types), list **every value and its meaning**. These are often scattered across constants files, component `valueEnum` configs, or API response mappers.
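As a starting point for collecting enum values, the sketch below pulls string-valued TypeScript enums out of source text. `extract_ts_enums` and its patterns are hypothetical. Numeric or implicit members, constant maps, and `valueEnum` configs are not covered and still need manual review.

```python
import re

ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{(.*?)\}", re.DOTALL)
MEMBER_RE = re.compile(r"(\w+)\s*=\s*['\"]([^'\"]*)['\"]")


def extract_ts_enums(source):
    """Return {enum name: {member: value}} for string enums in TypeScript source."""
    enums = {}
    for name, body in ENUM_RE.findall(source):
        enums[name] = dict(MEMBER_RE.findall(body))
    return enums


SAMPLE = """
export enum UserRole {
  ADMIN = 'admin',
  USER = 'user',
  MANAGER = 'manager',
}
"""

if __name__ == "__main__":
    print(extract_ts_enums(SAMPLE))
    # {'UserRole': {'ADMIN': 'admin', 'USER': 'user', 'MANAGER': 'manager'}}
```

The extracted values give the enum dictionary its keys. The business meaning of each value still has to come from how the code uses it.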
### 4. Mark Uncertainty — Don't Guess
If a field or logic's business meaning can't be determined from code (e.g. abbreviated variable names, overly complex conditionals), mark it `[TBC]` and explain what you observed and why you're uncertain. Never fabricate business meaning.
### 5. Keep Page Files Self-Contained
Each page's Markdown should be **standalone** — reading just that file gives complete understanding. Use relative links when referencing other pages or appendix entries.
---
## Page Type Strategies
### Frontend Pages
| Page Type | Focus Areas |
|-----------|------------|
| **List / Table** | Search conditions, columns, row actions, pagination, bulk ops |
| **Form / Create-Edit** | Every field, validation, interdependencies, post-submit behavior |
| **Detail / View** | Displayed info, tab/section organization, available actions |
| **Modal / Drawer** | Describe as part of the triggering page, not as a separate file, but document its content fully |
| **Dashboard** | Data cards, charts, metrics meaning, filter dimensions, refresh frequency |
### Backend Endpoints (NestJS / Django / Express)
| Endpoint Type | Focus Areas |
|---------------|------------|
| **CRUD resource** | All fields (from DTO/serializer), validation rules, permissions, pagination, filtering, sorting |
| **Auth endpoints** | Login/register flow, token format, refresh logic, password reset, OAuth providers |
| **File upload** | Accepted types, size limits, storage destination, processing pipeline |
| **Webhook / event** | Trigger conditions, payload shape, retry policy, idempotency |
| **Background job** | Trigger, schedule, input/output, failure handling, monitoring |
| **Admin views** (Django) | Registered models, list_display, search_fields, filters, inline models, custom actions |
---
## Execution Pacing
**Large projects (>15 pages):** Work in batches of 3-5 pages per module. Complete system overview + page inventory first. Output each batch for user review before proceeding.
**Small projects (≤15 pages):** Complete all analysis in one pass.
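The pacing rule can be sketched as a small planning helper. `plan_batches` is hypothetical, it takes the page inventory as `(module, page)` pairs, and it uses the upper end of the 3-5 range as the batch size, so a module's final batch may hold fewer than 3 pages.

```python
from itertools import groupby


def plan_batches(pages, batch_size=5, single_pass_limit=15):
    """Split (module, page) pairs into review batches.

    Projects at or under `single_pass_limit` pages are handled in one pass.
    """
    if len(pages) <= single_pass_limit:
        return [list(pages)]
    batches = []
    # Sort by module so each batch stays within a single module.
    ordered = sorted(pages, key=lambda page: page[0])
    for _, group in groupby(ordered, key=lambda page: page[0]):
        group = list(group)
        for start in range(0, len(group), batch_size):
            batches.append(group[start:start + batch_size])
    return batches


if __name__ == "__main__":
    inventory = [("Orders", f"order-page-{i}") for i in range(10)]
    inventory += [("Users", f"user-page-{i}") for i in range(6)]
    print([len(batch) for batch in plan_batches(inventory)])
    # [5, 5, 5, 1]
```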
---
## Common Pitfalls
| Pitfall | Fix |
|---------|-----|
| Using component names as page names | `UserManagementTable` → "User Management List" |
| Skipping modals and drawers | They contain critical business logic — document fully |
| Missing i18n field names | Check translation files, not just component JSX |
| Ignoring dynamic route params | `/order/:id` = page requires an order ID to load |
| Forgetting permission controls | Document which roles see which buttons/pages |
| Assuming all APIs are real | Check for mock data patterns before documenting endpoints |
| Skipping Django admin customization | `admin.py` often contains critical business rules (list filters, custom actions, inlines) |
| Missing NestJS guards/pipes | `@UseGuards`, `@UsePipes` contain auth and validation logic that affects behavior |
| Ignoring database constraints | Model field constraints (unique, max_length, choices) are validation rules for the PRD |
| Overlooking middleware | Auth middleware, rate limiters, and CORS config define system-wide behavior |
---
## Tooling
### Scripts
| Script | Purpose | Usage |
|--------|---------|-------|
| `scripts/codebase_analyzer.py` | Scan codebase → extract routes, APIs, models, enums, structure | `python3 scripts/codebase_analyzer.py /path/to/project` |
| `scripts/prd_scaffolder.py` | Generate PRD directory skeleton from analysis JSON | `python3 scripts/prd_scaffolder.py analysis.json` |
**Recommended workflow:**
```bash
# 1. Analyze the project (JSON output — works for frontend, backend, or fullstack)
python3 scripts/codebase_analyzer.py /path/to/project -o analysis.json
# 2. Review the analysis (markdown summary)
python3 scripts/codebase_analyzer.py /path/to/project -f markdown
# 3. Scaffold the PRD directory with stubs
python3 scripts/prd_scaffolder.py analysis.json -o prd/ -n "My App"
# 4. Fill in TODO sections page-by-page using the SKILL.md workflow
```
Both scripts are **stdlib-only** — no pip install needed.
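Before scaffolding, it can help to sanity-check the analysis output. The sketch below assumes `analysis.json` has the shape shown in `assets/sample-analysis.json`. The `summary_mismatches` helper is hypothetical and not part of either script, and `SAMPLE` is a trimmed stand-in for a real analysis file.

```python
def summary_mismatches(analysis):
    """Compare the `summary` block against the detailed sections it summarizes."""
    summary = analysis["summary"]
    routes = analysis["routes"]
    apis = analysis["apis"]["endpoints"]
    expected = {
        "pages": len(routes["frontend_pages"]),
        "backend_endpoints": len(routes["backend_endpoints"]),
        "api_endpoints": len(apis),
        "api_integrated": sum(1 for api in apis if api["integrated"]),
        "api_mock": sum(1 for api in apis if api["mock_detected"]),
        "enums": len(analysis["enums"]["definitions"]),
        "models": len(analysis["models"]["definitions"]),
    }
    return [
        f"{key}: summary says {summary.get(key)}, details show {value}"
        for key, value in expected.items()
        if summary.get(key) != value
    ]


# Trimmed stand-in for a real analysis file (illustrative values only).
SAMPLE = {
    "routes": {
        "frontend_pages": [{"path": "/"}, {"path": "/users"}],
        "backend_endpoints": [{"path": "/api/users", "method": "GET"}],
    },
    "apis": {
        "endpoints": [
            {"path": "/api/users", "integrated": True, "mock_detected": False},
            {"path": "/api/dashboard/stats", "integrated": False, "mock_detected": True},
        ]
    },
    "enums": {"definitions": [{"name": "UserRole"}]},
    "models": {"definitions": []},
    "summary": {
        "pages": 2, "backend_endpoints": 1, "api_endpoints": 2,
        "api_integrated": 1, "api_mock": 1, "enums": 1, "models": 0,
    },
}

if __name__ == "__main__":
    print(summary_mismatches(SAMPLE))  # []
```

With a real file, load it with `json.load` and pass the resulting dict to `summary_mismatches`. An empty list means the summary agrees with the details.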
### References
| File | Contents |
|------|----------|
| `references/prd-quality-checklist.md` | Validation checklist for completeness, accuracy, readability |
| `references/framework-patterns.md` | Framework-specific patterns for routes, state, APIs, forms, permissions |
---
## Attribution
This skill was inspired by [code-to-prd](https://github.com/lihanglogan/code-to-prd) by [@lihanglogan](https://github.com/lihanglogan), who proposed the original concept and methodology in [PR #368](https://github.com/alirezarezvani/claude-skills/pull/368). The core three-phase workflow (global scan → page-by-page analysis → structured document generation) originated from that work. This version was rebuilt from scratch in English with added tooling (analysis scripts, scaffolder, framework reference, quality checklist).
FILE:README.md
# Code → PRD
Reverse-engineer any codebase into a complete Product Requirements Document (PRD).
## Quick Start
```bash
# One command
/code-to-prd /path/to/project
# Or step by step
python3 scripts/codebase_analyzer.py /path/to/project -o analysis.json
python3 scripts/prd_scaffolder.py analysis.json -o prd/ -n "My App"
```
## Supported Frameworks
| Stack | Frameworks |
|-------|-----------|
| Frontend | React, Vue, Angular, Svelte, Next.js, Nuxt, SvelteKit, Remix |
| Backend | NestJS, Express, Django, DRF, FastAPI, Flask |
| Fullstack | Next.js (pages + API), Nuxt (pages + server), Django (views + templates) |
## What It Generates
```
prd/
├── README.md # System overview
├── pages/
│ ├── 01-user-mgmt-list.md # Per-page/endpoint docs
│ └── ...
└── appendix/
    ├── enum-dictionary.md # All enums and status codes
    ├── api-inventory.md # Complete API reference
    └── page-relationships.md # Navigation and data coupling
```
## Scripts
| Script | Purpose |
|--------|---------|
| `codebase_analyzer.py` | Scan codebase → extract routes, APIs, models, enums |
| `prd_scaffolder.py` | Generate PRD directory skeleton from analysis JSON |
Both are stdlib-only, so no pip install is needed. Run either script with `--help` for full usage.
## References
- `references/framework-patterns.md` — Route, state, API, form, and model patterns per framework
- `references/prd-quality-checklist.md` — Validation checklist for completeness and accuracy
## Attribution
Inspired by [code-to-prd](https://github.com/lihanglogan/code-to-prd) by [@lihanglogan](https://github.com/lihanglogan).
## License
MIT
FILE:assets/sample-analysis.json
{
"project": {
"root": "/path/to/my-app",
"name": "my-app",
"framework": "next",
"detected_frameworks": ["next", "react"],
"key_dependencies": {
"next": "14.1.0",
"react": "18.2.0",
"tailwindcss": "3.4.1",
"axios": "1.6.5",
"@tanstack/react-query": "5.17.0"
},
"stack_type": "fullstack"
},
"structure": {
"total_files": 87,
"components": {
"components": 42,
"modules": 35
},
"route_dirs": ["/path/to/my-app/app"],
"api_dirs": ["/path/to/my-app/app/api"],
"state_dirs": ["/path/to/my-app/src/store"],
"i18n_dirs": [],
"controller_dirs": [],
"model_dirs": [],
"dto_dirs": []
},
"routes": {
"count": 8,
"frontend_pages": [
{"path": "/", "source": "app/page.tsx", "filesystem": true},
{"path": "/dashboard", "source": "app/dashboard/page.tsx", "filesystem": true},
{"path": "/users", "source": "app/users/page.tsx", "filesystem": true},
{"path": "/users/:id", "source": "app/users/[id]/page.tsx", "filesystem": true},
{"path": "/settings", "source": "app/settings/page.tsx", "filesystem": true}
],
"backend_endpoints": [
{"path": "/api/users", "method": "GET", "source": "app/api/users/route.ts", "type": "backend"},
{"path": "/api/users", "method": "POST", "source": "app/api/users/route.ts", "type": "backend"},
{"path": "/api/users/:id", "method": "GET", "source": "app/api/users/[id]/route.ts", "type": "backend"}
],
"pages": []
},
"apis": {
"total": 5,
"integrated": 4,
"mock": 1,
"endpoints": [
{"path": "/api/users", "method": "GET", "source": "services/user.ts", "integrated": true, "mock_detected": false},
{"path": "/api/users", "method": "POST", "source": "services/user.ts", "integrated": true, "mock_detected": false},
{"path": "/api/users/:id", "method": "GET", "source": "services/user.ts", "integrated": true, "mock_detected": false},
{"path": "/api/users/:id", "method": "PUT", "source": "services/user.ts", "integrated": true, "mock_detected": false},
{"path": "/api/dashboard/stats", "method": "GET", "source": "services/dashboard.ts", "integrated": false, "mock_detected": true}
]
},
"enums": {
"count": 2,
"definitions": [
{"name": "UserRole", "type": "enum", "values": {"ADMIN": "admin", "USER": "user", "MANAGER": "manager"}, "source": "types/user.ts"},
{"name": "STATUS_MAP", "type": "constant_map", "values": {"active": "Active", "inactive": "Inactive", "suspended": "Suspended"}, "source": "constants/status.ts"}
]
},
"models": {
"count": 0,
"definitions": []
},
"summary": {
"pages": 5,
"backend_endpoints": 3,
"api_endpoints": 5,
"api_integrated": 4,
"api_mock": 1,
"enums": 2,
"models": 0,
"has_i18n": false,
"has_state_management": true,
"stack_type": "fullstack"
}
}
FILE:expected_outputs/sample-enum-dictionary.md
# Enum Dictionary
All enums, status codes, and constant mappings found in the codebase.
## UserRole
**Source:** `types/user.ts`
**Type:** TypeScript enum
| Value | Label | Description |
|-------|-------|-------------|
| `admin` | Admin | Full system access, can manage all users |
| `manager` | Manager | Can view and edit users, cannot delete |
| `user` | User | Read-only access |
## STATUS_MAP
**Source:** `constants/status.ts`
**Type:** Constant map
| Key | Display Value | Color | Description |
|-----|--------------|-------|-------------|
| `active` | Active | Green | Normal active account |
| `inactive` | Inactive | Gray | Account disabled by user |
| `suspended` | Suspended | Red | Account suspended by admin |
FILE:expected_outputs/sample-page-user-list.md
# User List
> **Route:** `/users`
> **Module:** User Management
> **Generated:** 2026-03-17
## Overview
Displays all system users in a searchable, paginated table. Supports creating, editing, and deleting users. Only ADMIN and MANAGER roles can access this page.
## Layout
- **Top bar**: Search input + "Create User" button
- **Main area**: Data table with pagination
- **Modal**: Create/Edit user form (triggered by buttons)
## Fields
### Search Filters
| Field | Type | Required | Options | Default | Notes |
|-------|------|----------|---------|---------|-------|
| Keyword | Text input | No | — | — | Searches name and email |
| Role | Select dropdown | No | Admin, Manager, User | All | Filters by role |
| Status | Select dropdown | No | Active, Inactive, Suspended | All | Filters by status |
### Data Table
| Column | Format | Sortable | Filterable | Notes |
|--------|--------|----------|-----------|-------|
| Name | Text | Yes | No | Full name |
| Email | Text (link) | Yes | No | Clickable → opens detail |
| Role | Badge | No | Yes | Color-coded by role |
| Status | Badge | No | Yes | Green=active, Gray=inactive, Red=suspended |
| Created | Date (YYYY-MM-DD) | Yes | No | — |
| Actions | Buttons | No | No | Edit, Delete |
### Actions
| Button | Visibility | Behavior |
|--------|-----------|----------|
| Create User | ADMIN, MANAGER | Opens create modal |
| Edit | ADMIN, MANAGER | Opens edit modal with prefilled data |
| Delete | ADMIN only | Confirmation dialog → soft delete |
## Interactions
### Page Load
- Fetches first page of users via `GET /api/users?page=1&size=20`
- Default sort: `created_at` descending
### Search
- **Trigger:** User types in search field (300ms debounce)
- **Behavior:** Re-fetches users with `keyword` param, resets to page 1
- **Special rules:** Minimum 2 characters to trigger search
### Create User
- **Trigger:** Click "Create User" button
- **Modal content:** Name (required, max 50), Email (required, email format), Role (required, select), Status (default: Active)
- **Validation:** Name required + max length, Email required + format check
- **API:** `POST /api/users` with form data
- **On success:** Toast "User created", close modal, refresh list
- **On failure:** Show API error below form
### Delete User
- **Trigger:** Click "Delete" button on row
- **Behavior:** Confirmation dialog "Are you sure you want to delete {name}?"
- **API:** `DELETE /api/users/:id`
- **On success:** Toast "User deleted", refresh list
## API Dependencies
| API | Method | Path | Trigger | Notes |
|-----|--------|------|---------|-------|
| List users | GET | /api/users | Load, search, paginate | Params: page, size, keyword, role, status |
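| Create user | POST | /api/users | Submit create form | Body: name, email, role, status |
| Update user | PUT | /api/users/:id | Submit edit form | Form prefilled from the selected row |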
| Delete user | DELETE | /api/users/:id | Confirm delete | — |
## Page Relationships
- **From:** Dashboard (click "View Users" link)
- **To:** User Detail (click email or row)
- **Data coupling:** Creating/deleting a user triggers dashboard stats refresh
FILE:expected_outputs/sample-prd-readme.md
# My App — Product Requirements Document
## System Overview
My App is a user management platform for internal teams. It provides CRUD operations for users, a dashboard with key metrics, and system settings. Built with Next.js 14 (App Router) and Tailwind CSS.
## Module Overview
| Module | Pages | Core Functionality |
|--------|-------|--------------------|
| Dashboard | Dashboard | Key metrics, activity feed |
| User Management | User list, User detail | CRUD users, role assignment |
| Settings | Settings | System configuration |
## Page Inventory
| # | Page Name | Route | Module | Doc Link |
|---|-----------|-------|--------|----------|
| 1 | Home | / | — | [→](./pages/01-home.md) |
| 2 | Dashboard | /dashboard | Dashboard | [→](./pages/02-dashboard.md) |
| 3 | User List | /users | User Mgmt | [→](./pages/03-user-list.md) |
| 4 | User Detail | /users/:id | User Mgmt | [→](./pages/04-user-detail.md) |
| 5 | Settings | /settings | Settings | [→](./pages/05-settings.md) |
## API Inventory
| # | Method | Path | Status | Notes |
|---|--------|------|--------|-------|
| 1 | GET | /api/users | Integrated | Paginated list |
| 2 | POST | /api/users | Integrated | Create user |
| 3 | GET | /api/users/:id | Integrated | User detail |
| 4 | PUT | /api/users/:id | Integrated | Update user |
| 5 | GET | /api/dashboard/stats | Mock | Dashboard metrics |
## Global Notes
### Permission Model
Role-based access: ADMIN (full access), MANAGER (read + edit), USER (read-only).
### Common Interaction Patterns
- All delete operations require a confirmation modal
- Lists default to `created_at` descending, 20 items per page
- Form validation shows inline errors below each field
FILE:references/framework-patterns.md
# Framework-Specific Patterns
Quick reference for identifying routes, components, state, and APIs across frontend and backend frameworks.
## React (CRA / Vite)
| Aspect | Where to Look |
|--------|--------------|
| Routes | `react-router-dom` — `<Route path="...">` or `createBrowserRouter` |
| Components | `.tsx` / `.jsx` files, default exports |
| State | Redux (`store/`), Zustand, Jotai, Recoil, React Context |
| API | `axios`, `fetch`, TanStack Query (`useQuery`), SWR (`useSWR`) |
| Forms | React Hook Form, Formik, Ant Design Form, custom `useState` |
| i18n | `react-i18next`, `react-intl` |
## Next.js (App Router)
| Aspect | Where to Look |
|--------|--------------|
| Routes | `app/` directory — `page.tsx` = route, folders = segments |
| Layouts | `layout.tsx` per directory |
| Loading | `loading.tsx`, `error.tsx`, `not-found.tsx` |
| API routes | `app/api/` or `pages/api/` (Pages Router) |
| Server actions | `"use server"` directive |
| Middleware | `middleware.ts` at root |
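
The `page.tsx` = route convention in the table above can be sketched in a few lines of Python. This is only an illustration: the function name `app_path_to_route` and the `:param` / `*param` output style are assumptions, and optional catch-all segments (`[[...slug]]`) are left untouched.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

PAGE_FILES = {"page.tsx", "page.jsx", "page.ts", "page.js"}


def app_path_to_route(file_path: str) -> Optional[str]:
    """Map an App Router file path to a route string, or None if it is not a page file."""
    parts = PurePosixPath(file_path).parts
    if "app" not in parts or parts[-1] not in PAGE_FILES:
        return None
    segments = []
    for seg in parts[parts.index("app") + 1:-1]:
        if seg.startswith("(") and seg.endswith(")"):
            continue  # route group: organizes files, does not appear in the URL
        seg = re.sub(r"^\[\.\.\.(\w+)\]$", r"*\1", seg)  # [...slug] -> *slug
        seg = re.sub(r"^\[(\w+)\]$", r":\1", seg)        # [id] -> :id
        segments.append(seg)
    return "/" + "/".join(segments)
```

For example, `app/users/[id]/page.tsx` maps to `/users/:id`, while `app/users/layout.tsx` is ignored because it is not a page file.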
## Next.js (Pages Router)
| Aspect | Where to Look |
|--------|--------------|
| Routes | `pages/` directory — filename = route |
| Data fetching | `getServerSideProps`, `getStaticProps`, `getStaticPaths` |
| API routes | `pages/api/` |
## Vue 3
| Aspect | Where to Look |
|--------|--------------|
| Routes | `vue-router` — `routes` array in `router/index.ts` |
| Components | `.vue` SFCs (`<template>`, `<script setup>`, `<style>`) |
| State | Pinia (`stores/`), Vuex (`store/`) |
| API | `axios`, `fetch`, VueQuery |
| Forms | VeeValidate, FormKit, custom `ref()` / `reactive()` |
| i18n | `vue-i18n` |
## Nuxt 3
| Aspect | Where to Look |
|--------|--------------|
| Routes | `pages/` directory (file-system routing) |
| Layouts | `layouts/` |
| API routes | `server/api/` |
| Data fetching | `useFetch`, `useAsyncData`, `$fetch` |
| State | `useState`, Pinia |
| Middleware | `middleware/` |
## Angular
| Aspect | Where to Look |
|--------|--------------|
| Routes | `app-routing.module.ts` or `Routes` array |
| Components | `@Component` decorator, `*.component.ts` |
| State | NgRx (`store/`), services with `BehaviorSubject` |
| API | `HttpClient` in services |
| Forms | Reactive Forms (`FormGroup`), Template-driven forms |
| i18n | `@angular/localize`, `ngx-translate` |
| Guards | `CanActivate`, `CanDeactivate` |
## Svelte / SvelteKit
| Aspect | Where to Look |
|--------|--------------|
| Routes | `src/routes/` (file-system routing with `+page.svelte`) |
| Layouts | `+layout.svelte` |
| Data loading | `+page.ts` / `+page.server.ts` (`load` function) |
| API routes | `+server.ts` |
| State | Svelte stores (`writable`, `readable`, `derived`) |
## NestJS
| Aspect | Where to Look |
|--------|--------------|
| Routes | `@Controller('prefix')` + `@Get()/@Post()/@Put()/@Delete()` decorators |
| Modules | `*.module.ts` — `@Module({ controllers, providers, imports })` |
| Services | `*.service.ts` — injected via constructor, contains business logic |
| DTOs | `*.dto.ts` — `class-validator` decorators define validation rules |
| Entities | `*.entity.ts` — TypeORM `@Entity()` / Prisma schemas |
| Auth | `@UseGuards(AuthGuard)`, `@Roles('admin')`, Passport strategies |
| Middleware | `*.middleware.ts`, registered in module `configure()` |
| Pipes | `ValidationPipe`, `ParseIntPipe` — input transformation |
| Config | `ConfigModule`, `.env` files, `config/` directory |
## Express / Fastify
| Aspect | Where to Look |
|--------|--------------|
| Routes | `router.get('/path', handler)`, `app.post('/path', ...)` |
| Middleware | `app.use(...)`, `router.use(...)` |
| Controllers | Route handler files in `routes/`, `controllers/` |
| Models | Mongoose schemas (`*.model.ts`), Sequelize models, Prisma |
| Auth | `passport`, `jsonwebtoken`, middleware auth checks |
| Validation | `express-validator`, `joi`, `zod`, custom middleware |
## Django
| Aspect | Where to Look |
|--------|--------------|
| Routes | `urls.py` — `urlpatterns = [path('...', view)]` |
| Views | `views.py` — function-based or class-based views (`APIView`, `ViewSet`) |
| Models | `models.py` — `class MyModel(models.Model)` with field definitions |
| Forms | `forms.py` — `ModelForm`, `Form` with validation |
| Serializers | `serializers.py` (DRF) — `ModelSerializer`, field-level validation |
| Admin | `admin.py` — `@admin.register`, `list_display`, `search_fields`, `list_filter` |
| Templates | `templates/` — Jinja2/Django template HTML files |
| Middleware | `MIDDLEWARE` in `settings.py` |
| Auth | `django.contrib.auth`, `rest_framework.permissions`, `@login_required` |
| Signals | `signals.py` — `post_save`, `pre_delete` hooks (hidden business logic) |
| Management commands | `management/commands/` — CLI operations |
| Celery tasks | `tasks.py` — async/background operations |
## Django REST Framework (DRF)
| Aspect | Where to Look |
|--------|--------------|
| Endpoints | `router.register('prefix', ViewSet)` in `urls.py` |
| ViewSets | `viewsets.py` — `ModelViewSet` (full CRUD), `ReadOnlyModelViewSet` |
| Serializers | `serializers.py` — field types, validators, nested relations |
| Permissions | `permission_classes = [IsAuthenticated, IsAdminUser]` |
| Filtering | `django-filter`, `search_fields`, `ordering_fields` |
| Pagination | `DEFAULT_PAGINATION_CLASS` in settings, per-view override |
| Throttling | `DEFAULT_THROTTLE_CLASSES`, per-view `throttle_classes` |
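
A single `router.register('prefix', ViewSet)` line hides several endpoints, so it can help to expand it when building an API inventory. The sketch below lists the standard routes a DRF router generates for `ModelViewSet` and `ReadOnlyModelViewSet`. The helper name `expand_viewset` is hypothetical, `{pk}` stands for the lookup field, and custom `@action` routes and any URL prefix added by `include()` are not covered.

```python
from typing import List, Tuple

# (HTTP method, path suffix, DRF action name) for a standard ModelViewSet
CRUD_ACTIONS = [
    ("GET", "", "list"),
    ("POST", "", "create"),
    ("GET", "{pk}/", "retrieve"),
    ("PUT", "{pk}/", "update"),
    ("PATCH", "{pk}/", "partial_update"),
    ("DELETE", "{pk}/", "destroy"),
]
READ_ONLY_ACTIONS = {"list", "retrieve"}


def expand_viewset(prefix: str, read_only: bool = False) -> List[Tuple[str, str, str]]:
    """Expand a registered prefix into the (method, path, action) rows it implies."""
    base = "/" + prefix.strip("/") + "/"
    return [
        (method, base + suffix, action)
        for method, suffix, action in CRUD_ACTIONS
        if not read_only or action in READ_ONLY_ACTIONS
    ]
```

Each returned tuple corresponds to one row in the PRD's API inventory; the viewset source should still be checked for overridden or disabled actions.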
## FastAPI
| Aspect | Where to Look |
|--------|--------------|
| Routes | `@app.get('/path')`, `@router.post('/path')` decorators |
| Models | Pydantic `BaseModel` classes — request/response schemas |
| Dependencies | `Depends(...)` — auth, DB sessions, shared logic |
| DB | SQLAlchemy models, Tortoise ORM, or raw SQL |
| Auth | `OAuth2PasswordBearer`, JWT middleware, `Depends(get_current_user)` |
| Background | `BackgroundTasks`, Celery integration |
## Common Patterns Across Frameworks
### Mock Detection
```
# Likely mock
setTimeout(() => resolve(data), 500)
Promise.resolve(mockData)
import { data } from './fixtures'
faker.name.firstName()
# Likely real
axios.get('/api/users')
fetch('/api/data')
httpClient.post(url, body)
useSWR('/api/resource')
```
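
The signals above can be turned into a rough classifier. The following is a minimal sketch, not the analyzer's actual logic: the pattern lists and the `classify_call` name are assumptions chosen to mirror the examples, and a `mixed` result simply means both kinds of signal appear and a human should look.

```python
import re

# Heuristic patterns mirroring the examples above (assumed, not exhaustive)
MOCK_HINTS = [
    r"setTimeout\s*\(",
    r"Promise\.resolve\s*\(",
    r"fixtures?['\"/]",
    r"\bfaker\.",
    r"mock[A-Z_]",
]
REAL_HINTS = [
    r"\baxios\.",
    r"\bfetch\s*\(",
    r"httpClient\.",
    r"\buseSWR\s*\(",
]


def classify_call(snippet: str) -> str:
    """Return 'mock', 'real', 'mixed', or 'unknown' for a source snippet."""
    is_mock = any(re.search(p, snippet) for p in MOCK_HINTS)
    is_real = any(re.search(p, snippet) for p in REAL_HINTS)
    if is_mock and is_real:
        return "mixed"
    if is_mock:
        return "mock"
    if is_real:
        return "real"
    return "unknown"
```

Treat the result as a hint for the Status column of the API inventory, and confirm anything marked `mixed` or `unknown` by reading the code.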
### Permission Patterns
```
# React
{hasPermission('admin') && <Button>Delete</Button>}
<ProtectedRoute roles={['admin', 'manager']}>
# Vue
v-if="user.role === 'admin'"
v-permission="'user:delete'"
# Angular
*ngIf="authService.hasRole('admin')"
canActivate: [AuthGuard]
```
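
To build the Permission Model section of a PRD, it can help to collect every role or permission string that appears in checks like the ones above. The sketch below is one possible approach; the regexes and the `collect_roles` name are assumptions for illustration and only cover the literal forms shown here.

```python
import re
from typing import Set

ROLE_CALL = re.compile(r"has(?:Permission|Role)\(\s*['\"]([\w:.-]+)['\"]")
ROLE_COMPARE = re.compile(r"\.role\s*===?\s*['\"]([\w:.-]+)['\"]")
ROLES_PROP = re.compile(r"roles\s*=\s*\{\s*\[([^\]]*)\]")
V_PERMISSION = re.compile(r"v-permission\s*=\s*\"'([\w:.-]+)'\"")


def collect_roles(source: str) -> Set[str]:
    """Collect role/permission literals from template or component source."""
    roles = set(ROLE_CALL.findall(source))
    roles.update(ROLE_COMPARE.findall(source))
    roles.update(V_PERMISSION.findall(source))
    for group in ROLES_PROP.findall(source):
        # roles={['admin', 'manager']} -> each quoted item
        roles.update(re.findall(r"['\"]([\w:.-]+)['\"]", group))
    return roles
```

Roles built dynamically (for example from constants or API responses) will not be found this way and need manual tracing.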
### Form Validation
```
# React Hook Form
{ required: 'Name is required', maxLength: { value: 50, message: 'Too long' } }
# VeeValidate (Vue)
rules="required|email|max:100"
# Angular Reactive Forms
Validators.required, Validators.minLength(3), Validators.pattern(...)
# NestJS (class-validator)
@IsString() @IsNotEmpty() @MaxLength(50) name: string;
@IsEmail() email: string;
@IsEnum(UserRole) role: UserRole;
# Django Forms
name = forms.CharField(max_length=50, required=True)
email = forms.EmailField()
# DRF Serializers
name = serializers.CharField(max_length=50)
email = serializers.EmailField(required=True)
# FastAPI (Pydantic)
name: str = Field(max_length=50)
email: EmailStr
```
### Database Model Patterns
```
# Django
class Order(models.Model):
    status = models.CharField(max_length=20, choices=STATUS_CHOICES)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    total = models.DecimalField(max_digits=10, decimal_places=2)
# TypeORM (NestJS)
@Entity()
export class Order {
  @Column({ type: 'enum', enum: OrderStatus })
  status: OrderStatus;
  @ManyToOne(() => User)
  user: User;
}
# Prisma
model Order {
  status OrderStatus
  user User @relation(fields: [userId], references: [id])
  total Decimal
}
```
FILE:references/prd-quality-checklist.md
# PRD Quality Checklist
Use this checklist to validate generated PRDs before delivery.
## Completeness
- [ ] Every route/page has a corresponding document
- [ ] All form fields listed with type, required, validation, default
- [ ] All table columns listed with format, sortable, filterable
- [ ] All action buttons documented with visibility conditions
- [ ] All API endpoints listed with method, path, trigger, params
- [ ] Mock vs integrated APIs clearly distinguished
- [ ] All enums exhaustively listed with every value
- [ ] Page load behavior documented for every page
- [ ] Page relationships mapped (inbound, outbound, data coupling)
## Accuracy
- [ ] Route paths match actual code
- [ ] Field names match UI labels (not variable names)
- [ ] Validation rules match actual code logic
- [ ] Permission conditions match auth guard implementations
- [ ] API paths match actual service layer calls
- [ ] Enum values match source constants (no fabrication)
- [ ] Uncertain items marked `[TBC]` with explanation
## Readability
- [ ] Business language used (not implementation details)
- [ ] Each page doc is self-contained
- [ ] No component names used as page names
- [ ] Interactions described as user action → system response
- [ ] Modals/drawers documented within their parent page
- [ ] README system overview written for non-technical reader
## Structure
- [ ] `prd/README.md` exists with system overview + page inventory
- [ ] `prd/pages/` contains numbered page files
- [ ] `prd/appendix/enum-dictionary.md` exists
- [ ] `prd/appendix/api-inventory.md` exists
- [ ] `prd/appendix/page-relationships.md` exists
- [ ] Cross-references use relative links
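
The file-level items in this Structure list are easy to check mechanically. The sketch below is a minimal example of such a check; the function name `check_prd_structure` and the message wording are assumptions, and it does not validate links or file contents.

```python
import re
from pathlib import Path
from typing import List

REQUIRED_FILES = [
    "README.md",
    "appendix/enum-dictionary.md",
    "appendix/api-inventory.md",
    "appendix/page-relationships.md",
]


def check_prd_structure(prd_dir: Path) -> List[str]:
    """Return a list of structural problems found under the prd/ directory."""
    problems = [f"missing {rel}" for rel in REQUIRED_FILES if not (prd_dir / rel).is_file()]
    pages = prd_dir / "pages"
    if not pages.is_dir():
        problems.append("missing pages/")
    else:
        for page in sorted(pages.glob("*.md")):
            # Page files are expected to start with a numeric prefix, e.g. 01-home.md
            if not re.match(r"\d{2,}-", page.name):
                problems.append(f"unnumbered page file: {page.name}")
    return problems
```

An empty list means the layout matches the checklist; the remaining items (overview text, relative links) still need a manual read.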
## Backend-Specific Checks
- [ ] All controller/view endpoints documented with method, path, auth
- [ ] DTO/serializer fields listed with type, required, validation
- [ ] Database model relationships mapped (FK, M2M, O2O)
- [ ] Django admin customizations documented (list_display, actions, inlines)
- [ ] Background tasks/Celery jobs documented with trigger and schedule
- [ ] Middleware pipeline documented (auth, logging, rate limiting)
- [ ] Environment-dependent behavior noted (dev vs prod differences)
- [ ] Database migrations reviewed for field constraints and defaults
## Common Issues to Watch
| Issue | How to Detect | Fix |
|-------|--------------|-----|
| Missing modal content | Search for `Modal`, `Dialog`, `Drawer` components | Add as subsection in parent page |
| Undocumented field linking | Search for conditional renders based on field values | Add to interaction logic |
| Hidden permissions | Search for `v-if`, `v-show`, role checks, auth guards | Add visibility conditions |
| Stale mock data | Compare mock shapes with API types/interfaces | Flag as `[Mock - verify with backend]` |
| Missing error states | Search for error boundaries, catch blocks, toast errors | Add failure paths to interactions |
| Unlinked pages | Cross-reference route params with navigation calls | Complete page relationships |
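
The "Missing modal content" check in the table above can be partly automated. As a rough sketch, assuming JSX or Vue template source and component names that contain `Modal`, `Dialog`, or `Drawer`, the hypothetical helper below lists the overlay components used in a file so each can be matched against a subsection of the parent page doc.

```python
import re
from typing import List

# Opening tags whose component name contains Modal, Dialog, or Drawer
OVERLAY_TAG = re.compile(r"<(\w*(?:Modal|Dialog|Drawer)\w*)\b")


def find_overlays(source: str) -> List[str]:
    """Return the sorted, de-duplicated overlay component names used in the source."""
    return sorted(set(OVERLAY_TAG.findall(source)))
```

Overlays opened imperatively (for example through a `Modal.confirm(...)` style call) are not tags and will not be found by this pattern.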
FILE:scripts/codebase_analyzer.py
#!/usr/bin/env python3
"""Analyze any codebase (frontend, backend, or fullstack) and extract routes, APIs, models, and structure.
Supports: React, Vue, Angular, Svelte, Next.js, Nuxt, NestJS, Express, Django, FastAPI, Flask.
Stdlib only — no third-party dependencies. Outputs JSON for downstream PRD generation.
Usage:
python3 codebase_analyzer.py /path/to/project
python3 codebase_analyzer.py /path/to/project --output prd-analysis.json
python3 codebase_analyzer.py /path/to/project --format markdown
"""
import argparse
import json
import os
import re
from collections import defaultdict
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Tuple
IGNORED_DIRS = {
".git", "node_modules", ".next", "dist", "build", "coverage",
"venv", ".venv", "__pycache__", ".nuxt", ".output", ".cache",
".turbo", ".vercel", "out", "storybook-static",
".tox", ".mypy_cache", ".pytest_cache", "htmlcov", "staticfiles",
"media", "migrations", "egg-info",
}
FRAMEWORK_SIGNALS = {
"react": ["react", "react-dom"],
"next": ["next"],
"vue": ["vue"],
"nuxt": ["nuxt"],
"angular": ["@angular/core"],
"svelte": ["svelte"],
"sveltekit": ["@sveltejs/kit"],
"solid": ["solid-js"],
"astro": ["astro"],
"remix": ["@remix-run/react"],
"nestjs": ["@nestjs/core"],
"express": ["express"],
"fastify": ["fastify"],
}
# Python backend frameworks detected via project files (no package.json)
PYTHON_FRAMEWORK_FILES = {
"django": ["manage.py", "settings.py"],
"fastapi": ["main.py"], # confirmed via imports
"flask": ["app.py"], # confirmed via imports
}
ROUTE_FILE_PATTERNS = [
"**/router.{ts,tsx,js,jsx}",
"**/routes.{ts,tsx,js,jsx}",
"**/routing.{ts,tsx,js,jsx}",
"**/app-routing*.{ts,tsx,js,jsx}",
]
ROUTE_DIR_PATTERNS = [
"pages", "views", "routes", "app",
"src/pages", "src/views", "src/routes", "src/app",
]
API_DIR_PATTERNS = [
"api", "services", "requests", "endpoints", "client",
"src/api", "src/services", "src/requests",
]
STATE_DIR_PATTERNS = [
"store", "stores", "models", "context", "state",
"src/store", "src/stores", "src/models", "src/context",
]
I18N_DIR_PATTERNS = [
"locales", "i18n", "lang", "translations", "messages",
"src/locales", "src/i18n", "src/lang",
]
# Backend-specific directory patterns
CONTROLLER_DIR_PATTERNS = [
"controllers", "src/controllers", "src/modules",
]
MODEL_DIR_PATTERNS = [
"models", "entities", "src/entities", "src/models",
]
DTO_DIR_PATTERNS = [
"dto", "dtos", "src/dto", "serializers",
]
MOCK_SIGNALS = [
r"setTimeout\s*\(.*\breturn\b",
r"Promise\.resolve\s*\(",
r"\.mock\.",
r"__mocks__",
r"mockData",
r"mock[A-Z]",
r"faker\.",
r"fixtures?/",
]
REAL_API_SIGNALS = [
r"\baxios\b",
r"\bfetch\s*\(",
r"httpGet|httpPost|httpPut|httpDelete|httpPatch",
r"\.get\s*\(\s*['\"`/]",
r"\.post\s*\(\s*['\"`/]",
r"\.put\s*\(\s*['\"`/]",
r"\.delete\s*\(\s*['\"`/]",
r"\.patch\s*\(\s*['\"`/]",
r"useSWR|useQuery|useMutation",
r"\$http\.",
r"this\.http\.",
]
ROUTE_PATTERNS = [
    # React Router JSX: <Route path="...">
    r'<Route\s+[^>]*path\s*=\s*["\']([^"\']+)["\']',
    # Object-style route configs (React Router, Vue Router, Angular): path: "..."
    r'path\s*:\s*["\']([^"\']+)["\']',
]
API_PATH_PATTERNS = [
r'["\'](?:GET|POST|PUT|DELETE|PATCH)["\'].*?["\'](/[a-zA-Z0-9/_\-:{}]+)["\']',
r'(?:get|post|put|delete|patch)\s*\(\s*["\'](/[a-zA-Z0-9/_\-:{}]+)["\']',
r'(?:url|path|endpoint|baseURL)\s*[:=]\s*["\'](/[a-zA-Z0-9/_\-:{}]+)["\']',
r'fetch\s*\(\s*[`"\'](?:https?://[^/]+)?(/[a-zA-Z0-9/_\-:{}]+)',
]
COMPONENT_EXTENSIONS = {".tsx", ".jsx", ".vue", ".svelte", ".astro"}
CODE_EXTENSIONS = {".ts", ".tsx", ".js", ".jsx", ".vue", ".svelte", ".astro", ".py"}
# NestJS decorator patterns
NEST_ROUTE_PATTERNS = [
r"@(?:Get|Post|Put|Delete|Patch|Head|Options|All)\s*\(\s*['\"]([^'\"]*)['\"]",
r"@Controller\s*\(\s*['\"]([^'\"]*)['\"]",
]
# Django URL patterns
DJANGO_ROUTE_PATTERNS = [
r"path\s*\(\s*['\"]([^'\"]+)['\"]",
r"url\s*\(\s*r?['\"]([^'\"]+)['\"]",
r"register\s*\(\s*r?['\"]([^'\"]+)['\"]",
]
# Django/Python model patterns
PYTHON_MODEL_PATTERNS = [
r"class\s+(\w+)\s*\(.*?models\.Model\)",
r"class\s+(\w+)\s*\(.*?BaseModel\)", # Pydantic
]
# NestJS entity/DTO patterns
NEST_MODEL_PATTERNS = [
r"@Entity\s*\(.*?\)\s*(?:export\s+)?class\s+(\w+)",
r"class\s+(\w+(?:Dto|DTO|Entity|Schema))\b",
]
def detect_framework(project_root: Path) -> Dict[str, Any]:
    """Detect framework from package.json (Node.js) or project files (Python)."""
    detected = []
    all_deps = {}
    pkg_name = ""
    pkg_version = ""
    # Node.js detection via package.json
    pkg_path = project_root / "package.json"
    if pkg_path.exists():
        try:
            with open(pkg_path) as f:
                pkg = json.load(f)
            pkg_name = pkg.get("name", "")
            pkg_version = pkg.get("version", "")
            for key in ("dependencies", "devDependencies", "peerDependencies"):
                all_deps.update(pkg.get(key, {}))
            for framework, signals in FRAMEWORK_SIGNALS.items():
                if any(s in all_deps for s in signals):
                    detected.append(framework)
        except (json.JSONDecodeError, IOError):
            pass
    # Python backend detection via project files and dependency manifests
    if (project_root / "manage.py").exists():
        detected.append("django")
    # Scan every manifest that exists, not only requirements.txt / pyproject.toml
    for req_file in ["requirements.txt", "pyproject.toml", "setup.py", "Pipfile"]:
        req_path = project_root / req_file
        if not req_path.exists():
            continue
        try:
            content = req_path.read_text(errors="replace").lower()
        except IOError:
            continue
        # Record each framework once, even if several manifests mention it
        for fw_name in ("django", "fastapi", "flask"):
            if fw_name in content and fw_name not in detected:
                detected.append(fw_name)
    # Prefer specific over generic
    priority = [
        "sveltekit", "next", "nuxt", "remix", "astro",  # fullstack JS
        "nestjs", "express", "fastify",  # backend JS
        "django", "fastapi", "flask",  # backend Python
        "angular", "svelte", "vue", "react", "solid",  # frontend JS
    ]
    framework = "unknown"
    for fw in priority:
        if fw in detected:
            framework = fw
            break
    return {
        "framework": framework,
        "name": pkg_name or project_root.name,
        "version": pkg_version,
        "detected_frameworks": detected,
        "dependency_count": len(all_deps),
        "key_deps": {k: v for k, v in all_deps.items()
                     if any(s in k for s in ["router", "redux", "vuex", "pinia", "zustand",
                                             "mobx", "recoil", "jotai", "tanstack", "swr",
                                             "axios", "tailwind", "material", "ant",
                                             "chakra", "shadcn", "i18n", "intl",
                                             "typeorm", "prisma", "sequelize", "mongoose",
                                             "passport", "jwt", "class-validator"])},
    }
def find_dirs(root: Path, patterns: List[str]) -> List[Path]:
    """Find directories matching common patterns."""
    found = []
    for pattern in patterns:
        candidate = root / pattern
        if candidate.is_dir():
            found.append(candidate)
    return found
def walk_files(root: Path, extensions: Set[str] = CODE_EXTENSIONS) -> List[Path]:
    """Walk project tree, skip ignored dirs, return files matching extensions."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for fname in filenames:
            if Path(fname).suffix in extensions:
                results.append(Path(dirpath) / fname)
    return results
def extract_routes_from_file(filepath: Path) -> List[Dict[str, str]]:
    """Extract route definitions from a file."""
    routes = []
    try:
        content = filepath.read_text(errors="replace")
    except IOError:
        return routes
    for pattern in ROUTE_PATTERNS:
        for match in re.finditer(pattern, content):
            path = match.group(1)
            if path and not path.startswith("http") and len(path) < 200:
                routes.append({
                    "path": path,
                    "source": str(filepath),
                    "line": content[:match.start()].count("\n") + 1,
                })
    return routes
def extract_routes_from_filesystem(pages_dir: Path, root: Path) -> List[Dict[str, str]]:
    """Infer routes from file-system routing (Next.js, Nuxt, SvelteKit)."""
    routes = []
    for filepath in sorted(pages_dir.rglob("*")):
        if filepath.is_file() and filepath.suffix in CODE_EXTENSIONS:
            rel = filepath.relative_to(pages_dir)
            route = "/" + str(rel.with_suffix("")).replace("\\", "/")
            # Normalize index routes
            route = re.sub(r"/index$", "", route) or "/"
            # Convert [...param] to *param and [param] to :param
            route = re.sub(r"\[\.\.\.(\w+)\]", r"*\1", route)
            route = re.sub(r"\[(\w+)\]", r":\1", route)
            routes.append({
                "path": route,
                "source": str(filepath),
                "filesystem": True,
            })
    return routes
def extract_apis_from_file(filepath: Path) -> List[Dict[str, Any]]:
    """Extract API calls from a file."""
    apis = []
    try:
        content = filepath.read_text(errors="replace")
    except IOError:
        return apis
    is_mock = any(re.search(p, content) for p in MOCK_SIGNALS)
    is_real = any(re.search(p, content) for p in REAL_API_SIGNALS)
    for pattern in API_PATH_PATTERNS:
        for match in re.finditer(pattern, content):
            path = match.group(1) if match.lastindex else match.group(0)
            if path and len(path) < 200:
                # Try to detect HTTP method
                context = content[max(0, match.start() - 100):match.end()]
                method = "UNKNOWN"
                for m in ["GET", "POST", "PUT", "DELETE", "PATCH"]:
                    if m.lower() in context.lower():
                        method = m
                        break
                apis.append({
                    "path": path,
                    "method": method,
                    "source": str(filepath),
                    "line": content[:match.start()].count("\n") + 1,
                    "integrated": is_real and not is_mock,
                    "mock_detected": is_mock,
                })
    return apis
def extract_enums(filepath: Path) -> List[Dict[str, Any]]:
    """Extract enum/constant definitions."""
    enums = []
    try:
        content = filepath.read_text(errors="replace")
    except IOError:
        return enums
    # TypeScript enums
    for match in re.finditer(r"enum\s+(\w+)\s*\{([^}]+)\}", content):
        name = match.group(1)
        body = match.group(2)
        values = re.findall(r"(\w+)\s*=\s*['\"]?([^,'\"\n]+)", body)
        enums.append({
            "name": name,
            "type": "enum",
            "values": {k.strip(): v.strip().rstrip(",") for k, v in values},
            "source": str(filepath),
        })
    # Object constant maps (const STATUS_MAP = { ... })
    for match in re.finditer(
        r"(?:const|export\s+const)\s+(\w*(?:MAP|STATUS|TYPE|ENUM|OPTION|ROLE|STATE)\w*)\s*[:=]\s*\{([^}]+)\}",
        content, re.IGNORECASE
    ):
        name = match.group(1)
        body = match.group(2)
        values = re.findall(r"['\"]?(\w+)['\"]?\s*:\s*['\"]([^'\"]+)['\"]", body)
        if values:
            enums.append({
                "name": name,
                "type": "constant_map",
                "values": dict(values),
                "source": str(filepath),
            })
    return enums
def extract_backend_routes(filepath: Path, framework: str) -> List[Dict[str, str]]:
    """Extract route definitions from decorator-based controllers (NestJS style) or Django URL configs."""
    routes = []
    try:
        content = filepath.read_text(errors="replace")
    except IOError:
        return routes
    patterns = []
    if framework in ("nestjs", "express", "fastify"):
        patterns = NEST_ROUTE_PATTERNS
    elif framework == "django":
        patterns = DJANGO_ROUTE_PATTERNS
    # For NestJS, also grab the controller prefix
    controller_prefix = ""
    if framework == "nestjs":
        m = re.search(r"@Controller\s*\(\s*['\"]([^'\"]*)['\"]", content)
        if m:
            controller_prefix = "/" + m.group(1).strip("/")
    for pattern in patterns:
        for match in re.finditer(pattern, content):
            path = match.group(1)
            if not path or path.startswith("http") or len(path) > 200:
                continue
            matched_text = match.group(0)
            if framework == "nestjs" and matched_text.startswith("@Controller"):
                # The controller decorator maps to the bare prefix; do not prepend it to itself
                full_path = "/" + path.strip("/")
            elif framework == "nestjs" and not path.startswith("/"):
                # For NestJS method decorators, prepend controller prefix
                full_path = f"{controller_prefix}/{path}".replace("//", "/")
            else:
                full_path = path if path.startswith("/") else f"/{path}"
            # Detect HTTP method from the matched decorator itself
            # (the text before the match does not contain the decorator name)
            method = "UNKNOWN"
            for m_name in ["Get", "Post", "Put", "Delete", "Patch"]:
                if matched_text.startswith(f"@{m_name}"):
                    method = m_name.upper()
                    break
            routes.append({
                "path": full_path,
                "method": method,
                "source": str(filepath),
                "line": content[:match.start()].count("\n") + 1,
                "type": "backend",
            })
    return routes
def extract_models(filepath: Path, framework: str) -> List[Dict[str, Any]]:
    """Extract model/entity definitions from backend code."""
    models = []
    try:
        content = filepath.read_text(errors="replace")
    except IOError:
        return models
    patterns = PYTHON_MODEL_PATTERNS if framework in ("django", "fastapi", "flask") else NEST_MODEL_PATTERNS
    for pattern in patterns:
        for match in re.finditer(pattern, content):
            name = match.group(1)
            # Try to extract fields
            fields = []
            # For Django models: field_name = models.FieldType(...)
            if framework == "django":
                block_start = match.end()
                block = content[block_start:block_start + 2000]
                for fm in re.finditer(
                    r"(\w+)\s*=\s*models\.(\w+)\s*\(([^)]*)\)", block
                ):
                    fields.append({
                        "name": fm.group(1),
                        "type": fm.group(2),
                        "args": fm.group(3).strip()[:100],
                    })
            models.append({
                "name": name,
                "source": str(filepath),
                "framework": framework,
                "fields": fields,
            })
    return models
def count_components(files: List[Path]) -> Dict[str, int]:
    """Count components by type."""
    counts: Dict[str, int] = defaultdict(int)
    for f in files:
        if f.suffix in COMPONENT_EXTENSIONS:
            counts["components"] += 1
        elif f.suffix in {".ts", ".js"}:
            counts["modules"] += 1
    return dict(counts)
def analyze_project(project_root: Path) -> Dict[str, Any]:
    """Run full analysis on a frontend, backend, or fullstack project."""
    root = Path(project_root).resolve()
    if not root.is_dir():
        return {"error": f"Not a directory: {root}"}
    # 1. Framework detection
    framework_info = detect_framework(root)
    # 2. File inventory
    all_files = walk_files(root)
    component_counts = count_components(all_files)
    # 3. Directory structure
    route_dirs = find_dirs(root, ROUTE_DIR_PATTERNS)
    api_dirs = find_dirs(root, API_DIR_PATTERNS)
    state_dirs = find_dirs(root, STATE_DIR_PATTERNS)
    i18n_dirs = find_dirs(root, I18N_DIR_PATTERNS)
    # 4. Routes (frontend + backend)
    routes = []
    fw = framework_info["framework"]
    # Frontend: config-based routes
    for f in all_files:
        if any(p in f.name.lower() for p in ["router", "routes", "routing"]):
            routes.extend(extract_routes_from_file(f))
    # Frontend: file-system routes (Next.js, Nuxt, SvelteKit)
    if fw in ("next", "nuxt", "sveltekit", "remix", "astro"):
        for d in route_dirs:
            routes.extend(extract_routes_from_filesystem(d, root))
    # Backend: NestJS controllers, Django urls
    if fw in ("nestjs", "express", "fastify", "django"):
        for f in all_files:
            if fw == "django" and "urls.py" in f.name:
                routes.extend(extract_backend_routes(f, fw))
            elif fw in ("nestjs", "express", "fastify") and ".controller." in f.name:
                routes.extend(extract_backend_routes(f, fw))
    # Deduplicate routes by path (+ method for backend)
    seen_paths: Set[str] = set()
    unique_routes = []
    for r in routes:
        key = r["path"] if r.get("type") != "backend" else f"{r.get('method', '')}:{r['path']}"
        if key not in seen_paths:
            seen_paths.add(key)
            unique_routes.append(r)
    routes = sorted(unique_routes, key=lambda r: r["path"])
    # 5. API calls
    apis = []
    for f in all_files:
        apis.extend(extract_apis_from_file(f))
    # Deduplicate APIs by path+method
    seen_apis: Set[Tuple[str, str]] = set()
    unique_apis = []
    for a in apis:
        key = (a["path"], a["method"])
        if key not in seen_apis:
            seen_apis.add(key)
            unique_apis.append(a)
    apis = sorted(unique_apis, key=lambda a: a["path"])
    # 6. Enums
    enums = []
    for f in all_files:
        enums.extend(extract_enums(f))
    # 7. Models/entities (backend)
    models = []
    if fw in ("django", "fastapi", "flask", "nestjs"):
        for f in all_files:
            if fw == "django" and "models.py" in f.name:
                models.extend(extract_models(f, fw))
            elif fw == "nestjs" and (".entity." in f.name or ".dto." in f.name):
                models.extend(extract_models(f, fw))
    # Deduplicate models by name
    seen_models: Set[str] = set()
    unique_models = []
    for m in models:
        if m["name"] not in seen_models:
            seen_models.add(m["name"])
            unique_models.append(m)
    models = sorted(unique_models, key=lambda m: m["name"])
    # Backend-specific directories
    controller_dirs = find_dirs(root, CONTROLLER_DIR_PATTERNS)
    model_dirs = find_dirs(root, MODEL_DIR_PATTERNS)
    dto_dirs = find_dirs(root, DTO_DIR_PATTERNS)
    # 8. Summary
    mock_count = sum(1 for a in apis if a.get("mock_detected"))
    real_count = sum(1 for a in apis if a.get("integrated"))
    backend_routes = [r for r in routes if r.get("type") == "backend"]
    frontend_routes = [r for r in routes if r.get("type") != "backend"]
    analysis = {
        "project": {
            "root": str(root),
            "name": framework_info.get("name", root.name),
            "framework": framework_info["framework"],
            "detected_frameworks": framework_info.get("detected_frameworks", []),
            "key_dependencies": framework_info.get("key_deps", {}),
            "stack_type": "backend" if fw in ("django", "fastapi", "flask", "nestjs", "express", "fastify") and not frontend_routes else
                          "fullstack" if backend_routes and frontend_routes else "frontend",
        },
        "structure": {
            "total_files": len(all_files),
            "components": component_counts,
            "route_dirs": [str(d) for d in route_dirs],
            "api_dirs": [str(d) for d in api_dirs],
            "state_dirs": [str(d) for d in state_dirs],
            "i18n_dirs": [str(d) for d in i18n_dirs],
            "controller_dirs": [str(d) for d in controller_dirs],
            "model_dirs": [str(d) for d in model_dirs],
            "dto_dirs": [str(d) for d in dto_dirs],
        },
        "routes": {
            "count": len(routes),
            "frontend_pages": frontend_routes,
            "backend_endpoints": backend_routes,
            "pages": routes,  # backward compat
        },
        "apis": {
            "total": len(apis),
            "integrated": real_count,
            "mock": mock_count,
            "endpoints": apis,
        },
        "enums": {
            "count": len(enums),
            "definitions": enums,
        },
        "models": {
            "count": len(models),
            "definitions": models,
        },
        "summary": {
            "pages": len(frontend_routes),
            "backend_endpoints": len(backend_routes),
            "api_endpoints": len(apis),
            "api_integrated": real_count,
            "api_mock": mock_count,
            "enums": len(enums),
            "models": len(models),
            "has_i18n": len(i18n_dirs) > 0,
            "has_state_management": len(state_dirs) > 0,
            "stack_type": "backend" if fw in ("django", "fastapi", "flask", "nestjs", "express", "fastify") and not frontend_routes else
                          "fullstack" if backend_routes and frontend_routes else "frontend",
        },
    }
    return analysis
def format_markdown(analysis: Dict[str, Any]) -> str:
    """Format analysis as markdown summary."""
    lines = []
    proj = analysis["project"]
    summary = analysis["summary"]
    stack = summary.get("stack_type", "frontend")
    lines.append(f"# Codebase Analysis: {proj['name'] or 'Project'}")
    lines.append("")
    lines.append(f"**Framework:** {proj['framework']}")
    lines.append(f"**Stack type:** {stack}")
    lines.append(f"**Total files:** {analysis['structure']['total_files']}")
    if summary.get("pages"):
        lines.append(f"**Frontend pages:** {summary['pages']}")
    if summary.get("backend_endpoints"):
        lines.append(f"**Backend endpoints:** {summary['backend_endpoints']}")
    lines.append(f"**API calls detected:** {summary['api_endpoints']} "
                 f"({summary['api_integrated']} integrated, {summary['api_mock']} mock)")
    lines.append(f"**Enums:** {summary['enums']}")
    if summary.get("models"):
        lines.append(f"**Models/entities:** {summary['models']}")
    lines.append(f"**i18n:** {'Yes' if summary['has_i18n'] else 'No'}")
    lines.append(f"**State management:** {'Yes' if summary['has_state_management'] else 'No'}")
    lines.append("")
    if analysis["routes"]["pages"]:
        lines.append("## Pages / Routes")
        lines.append("")
        lines.append("| # | Route | Source |")
        lines.append("|---|-------|--------|")
        for i, r in enumerate(analysis["routes"]["pages"], 1):
            src = r.get("source", "").split("/")[-1]
            fs = " (fs)" if r.get("filesystem") else ""
            lines.append(f"| {i} | `{r['path']}` | {src}{fs} |")
        lines.append("")
    if analysis["apis"]["endpoints"]:
        lines.append("## API Endpoints")
        lines.append("")
        lines.append("| Method | Path | Integrated | Source |")
        lines.append("|--------|------|-----------|--------|")
        for a in analysis["apis"]["endpoints"]:
            src = a.get("source", "").split("/")[-1]
            status = "✅" if a.get("integrated") else "⚠️ Mock"
            lines.append(f"| {a['method']} | `{a['path']}` | {status} | {src} |")
        lines.append("")
    if analysis["enums"]["definitions"]:
        lines.append("## Enums & Constants")
        lines.append("")
        for e in analysis["enums"]["definitions"]:
            lines.append(f"### {e['name']} ({e['type']})")
            if e["values"]:
                lines.append("| Key | Value |")
                lines.append("|-----|-------|")
                for k, v in e["values"].items():
                    lines.append(f"| {k} | {v} |")
            lines.append("")
    if analysis.get("models", {}).get("definitions"):
        lines.append("## Models / Entities")
        lines.append("")
        for m in analysis["models"]["definitions"]:
            lines.append(f"### {m['name']} ({m.get('framework', '')})")
            if m.get("fields"):
                lines.append("| Field | Type | Args |")
                lines.append("|-------|------|------|")
                for fld in m["fields"]:
                    lines.append(f"| {fld['name']} | {fld['type']} | {fld.get('args', '')} |")
            lines.append("")
    if proj.get("key_dependencies"):
        lines.append("## Key Dependencies")
        lines.append("")
        for dep, ver in sorted(proj["key_dependencies"].items()):
            lines.append(f"- `{dep}`: {ver}")
        lines.append("")
    return "\n".join(lines)
def main():
    parser = argparse.ArgumentParser(
        description="Analyze any codebase (frontend, backend, fullstack) for PRD generation"
    )
    parser.add_argument("project", help="Path to project root")
    parser.add_argument("-o", "--output", help="Output file (default: stdout)")
    parser.add_argument(
        "-f", "--format",
        choices=["json", "markdown"],
        default="json",
        help="Output format (default: json)",
    )
    args = parser.parse_args()
    analysis = analyze_project(Path(args.project))
    # An error result has no "project"/"summary" keys, so always emit it as JSON
    if args.format == "markdown" and "error" not in analysis:
        output = format_markdown(analysis)
    else:
        output = json.dumps(analysis, indent=2, ensure_ascii=False)
    if args.output:
        # Explicit UTF-8: the output may contain non-ASCII characters
        Path(args.output).write_text(output, encoding="utf-8")
        print(f"Written to {args.output}")
    else:
        print(output)
if __name__ == "__main__":
    main()
FILE:scripts/prd_scaffolder.py
#!/usr/bin/env python3
"""Scaffold PRD directory structure from codebase_analyzer.py output.
Reads analysis JSON and creates the prd/ directory with README.md,
per-page stubs, and appendix files pre-populated with extracted data.
Stdlib only — no third-party dependencies.
Usage:
    python3 codebase_analyzer.py /path/to/project -o analysis.json
    python3 prd_scaffolder.py analysis.json
    python3 prd_scaffolder.py analysis.json --output-dir ./prd --project-name "My App"
"""
import argparse
import json
import re
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
def slugify(text: str) -> str:
    """Convert text to a filename-safe slug."""
    text = text.strip().lower()
    text = re.sub(r"[/:{}*?\"<>|]", "-", text)
    text = re.sub(r"[^a-z0-9\-]", "-", text)
    text = re.sub(r"-+", "-", text)
    return text.strip("-")
def route_to_page_name(route: str) -> str:
    """Convert a route path to a human-readable page name."""
    if route == "/" or route == "":
        return "Home"
    parts = route.strip("/").split("/")
    # Remove dynamic segments for naming
    clean = [p for p in parts if not p.startswith(":") and not p.startswith("*")]
    if not clean:
        # Only dynamic segments: fall back to their names without the markers
        clean = [p.lstrip(":*") for p in parts]
    return " ".join(w.capitalize() for w in "-".join(clean).replace("_", "-").split("-"))
def generate_readme(project_name: str, routes: List[Dict], summary: Dict, date: str) -> str:
"""Generate the PRD README.md."""
lines = [
f"# {project_name} — Product Requirements Document",
"",
f"> Generated: {date}",
"",
"## System Overview",
"",
f"<!-- TODO: Describe what {project_name} does, its business context, and primary users -->",
"",
"## Summary",
"",
f"| Metric | Count |",
f"|--------|-------|",
f"| Pages | {summary.get('pages', 0)} |",
f"| API Endpoints | {summary.get('api_endpoints', 0)} |",
f"| Integrated APIs | {summary.get('api_integrated', 0)} |",
f"| Mock APIs | {summary.get('api_mock', 0)} |",
f"| Enums/Constants | {summary.get('enums', 0)} |",
f"| i18n | {'Yes' if summary.get('has_i18n') else 'No'} |",
f"| State Management | {'Yes' if summary.get('has_state_management') else 'No'} |",
"",
"## Module Overview",
"",
"| Module | Pages | Core Functionality |",
"|--------|-------|--------------------|",
"| <!-- TODO: Group pages into modules --> | | |",
"",
"## Page Inventory",
"",
"| # | Page Name | Route | Module | Doc Link |",
"|---|-----------|-------|--------|----------|",
]
for i, route in enumerate(routes, 1):
path = route.get("path", "/")
name = route_to_page_name(path)
slug = slugify(name) or f"page-{i}"
filename = f"{i:02d}-{slug}.md"
lines.append(f"| {i} | {name} | `{path}` | <!-- TODO --> | [→](./pages/{filename}) |")
lines.extend([
"",
"## Global Notes",
"",
"### Permission Model",
"<!-- TODO: Summarize auth/role system if present -->",
"",
"### Common Interaction Patterns",
"<!-- TODO: Global rules — delete confirmations, default sort, etc. -->",
"",
])
return "\n".join(lines)
def generate_page_stub(route: Dict, index: int, date: str) -> str:
"""Generate a per-page PRD stub."""
path = route.get("path", "/")
name = route_to_page_name(path)
source = route.get("source", "unknown")
return f"""# {name}
> **Route:** `{path}`
> **Module:** <!-- TODO -->
> **Source:** `{source}`
> **Generated:** {date}
## Overview
<!-- TODO: 2-3 sentences — core function and use case -->
## Layout
<!-- TODO: Region breakdown — search area, table, detail panel, action bar, etc. -->
## Fields
### Search / Filters
| Field | Type | Required | Options / Enum | Default | Notes |
|-------|------|----------|---------------|---------|-------|
| <!-- TODO --> | | | | | |
### Data Table
| Column | Format | Sortable | Filterable | Notes |
|--------|--------|----------|-----------|-------|
| <!-- TODO --> | | | | |
### Actions
| Button | Visibility Condition | Behavior |
|--------|---------------------|----------|
| <!-- TODO --> | | |
## Interactions
### Page Load
<!-- TODO: What happens on mount — default queries, preloaded data -->
### Search
- **Trigger:** <!-- TODO -->
- **Behavior:** <!-- TODO -->
- **Special rules:** <!-- TODO -->
### Create / Edit
- **Trigger:** <!-- TODO -->
- **Modal/drawer content:** <!-- TODO -->
- **Validation:** <!-- TODO -->
- **On success:** <!-- TODO -->
### Delete
- **Trigger:** <!-- TODO -->
- **Confirmation:** <!-- TODO -->
- **On success:** <!-- TODO -->
## API Dependencies
| API | Method | Path | Trigger | Integrated | Notes |
|-----|--------|------|---------|-----------|-------|
| <!-- TODO --> | | | | | |
## Page Relationships
- **From:** <!-- TODO: Source pages + params -->
- **To:** <!-- TODO: Target pages + params -->
- **Data coupling:** <!-- TODO: Cross-page refresh triggers -->
## Business Rules
<!-- TODO: Anything that doesn't fit above -->
"""
def generate_enum_dictionary(enums: List[Dict]) -> str:
"""Generate the enum dictionary appendix."""
lines = [
"# Enum & Constant Dictionary",
"",
"All enums, status codes, and type mappings extracted from the codebase.",
"",
]
if not enums:
lines.append("*No enums detected. Manual review recommended.*")
return "\n".join(lines)
for e in enums:
lines.append(f"## {e['name']}")
lines.append(f"**Type:** {e.get('type', 'unknown')} | **Source:** `{e.get('source', 'unknown').split('/')[-1]}`")
lines.append("")
if e.get("values"):
lines.append("| Key | Value |")
lines.append("|-----|-------|")
for k, v in e["values"].items():
lines.append(f"| `{k}` | {v} |")
lines.append("")
return "\n".join(lines)
def generate_api_inventory(apis: List[Dict]) -> str:
"""Generate the API inventory appendix."""
lines = [
"# API Inventory",
"",
"All API endpoints detected in the codebase.",
"",
]
if not apis:
lines.append("*No API calls detected. Manual review recommended.*")
return "\n".join(lines)
integrated = [a for a in apis if a.get("integrated")]
mocked = [a for a in apis if a.get("mock_detected") and not a.get("integrated")]
unknown = [a for a in apis if not a.get("integrated") and not a.get("mock_detected")]
for label, group in [("Integrated APIs", integrated), ("Mock / Stub APIs", mocked), ("Unknown Status", unknown)]:
if group:
lines.append(f"## {label}")
lines.append("")
lines.append("| Method | Path | Source | Notes |")
lines.append("|--------|------|--------|-------|")
for a in group:
src = a.get("source", "").split("/")[-1]
lines.append(f"| {a.get('method', '?')} | `{a.get('path', '?')}` | {src} | |")
lines.append("")
return "\n".join(lines)
def generate_page_relationships(routes: List[Dict]) -> str:
"""Generate page relationships appendix stub."""
lines = [
"# Page Relationships",
"",
"Navigation flow and data coupling between pages.",
"",
"## Navigation Map",
"",
"<!-- TODO: Fill in after page-by-page analysis -->",
"",
"```",
"Home",
]
for r in routes[:20]: # Cap at 20 for readability
name = route_to_page_name(r.get("path", "/"))
lines.append(f" ├── {name}")
if len(routes) > 20:
lines.append(f" └── ... ({len(routes) - 20} more)")
lines.extend([
"```",
"",
"## Cross-Page Data Dependencies",
"",
"| Source Page | Target Page | Trigger | Data Passed |",
"|-----------|------------|---------|------------|",
"| <!-- TODO --> | | | |",
"",
])
return "\n".join(lines)
def scaffold(analysis: Dict[str, Any], output_dir: Path, project_name: Optional[str] = None):
"""Create the full PRD directory structure."""
date = datetime.now().strftime("%Y-%m-%d")
name = project_name or analysis.get("project", {}).get("name", "Project")
routes = analysis.get("routes", {}).get("pages", [])
apis = analysis.get("apis", {}).get("endpoints", [])
enums = analysis.get("enums", {}).get("definitions", [])
summary = analysis.get("summary", {})
# Create directories
pages_dir = output_dir / "pages"
appendix_dir = output_dir / "appendix"
pages_dir.mkdir(parents=True, exist_ok=True)
appendix_dir.mkdir(parents=True, exist_ok=True)
# README.md
readme = generate_readme(name, routes, summary, date)
(output_dir / "README.md").write_text(readme, encoding="utf-8")
print(f" Created: README.md")
# Per-page stubs
for i, route in enumerate(routes, 1):
page_name = route_to_page_name(route.get("path", "/"))
slug = slugify(page_name) or f"page-{i}"
filename = f"{i:02d}-{slug}.md"
content = generate_page_stub(route, i, date)
(pages_dir / filename).write_text(content, encoding="utf-8")
print(f" Created: pages/{filename}")
# Appendix
(appendix_dir / "enum-dictionary.md").write_text(generate_enum_dictionary(enums), encoding="utf-8")
print(f" Created: appendix/enum-dictionary.md")
(appendix_dir / "api-inventory.md").write_text(generate_api_inventory(apis), encoding="utf-8")
print(f" Created: appendix/api-inventory.md")
(appendix_dir / "page-relationships.md").write_text(generate_page_relationships(routes), encoding="utf-8")
print(f" Created: appendix/page-relationships.md")
print(f"\n✅ PRD scaffold complete: {output_dir}")
print(f" {len(routes)} page stubs, {len(apis)} API endpoints, {len(enums)} enums")
print(f"\n Next: Review each page stub and fill in the TODO sections.")
def validate_analysis(analysis: Dict[str, Any]) -> List[str]:
"""Validate analysis JSON has the required structure. Returns list of errors."""
errors = []
if not isinstance(analysis, dict):
return ["Analysis must be a JSON object"]
if "error" in analysis:
errors.append(f"Analysis contains error: {analysis['error']}")
required_keys = ["project", "routes", "apis"]
for key in required_keys:
if key not in analysis:
errors.append(f"Missing required key: '{key}'")
if "project" in analysis:
proj = analysis["project"]
if not isinstance(proj, dict):
errors.append("'project' must be an object")
elif "framework" not in proj:
errors.append("'project.framework' is missing")
if "routes" in analysis:
routes = analysis["routes"]
if not isinstance(routes, dict):
errors.append("'routes' must be an object")
elif "pages" not in routes and "frontend_pages" not in routes and "backend_endpoints" not in routes:
errors.append("'routes' must contain 'pages', 'frontend_pages', or 'backend_endpoints'")
if "apis" in analysis:
apis = analysis["apis"]
if not isinstance(apis, dict):
errors.append("'apis' must be an object")
elif "endpoints" not in apis:
errors.append("'apis.endpoints' is missing")
return errors
def print_summary(output_dir: Path, analysis: Dict[str, Any]):
"""Print a structured summary of what was generated."""
routes = analysis.get("routes", {}).get("pages", [])
apis = analysis.get("apis", {}).get("endpoints", [])
enums = analysis.get("enums", {}).get("definitions", [])
models = analysis.get("models", {}).get("definitions", [])
summary = analysis.get("summary", {})
stack = summary.get("stack_type", "unknown")
print(f"\nPRD scaffold complete: {output_dir}/")
print(f" Stack type: {stack}")
print(f" Page stubs: {len(routes)}")
print(f" API endpoints: {len(apis)}")
print(f" Enums: {len(enums)}")
if models:
print(f" Models: {len(models)}")
print(f"\n Next: Review each page stub and fill in the TODO sections.")
def main():
parser = argparse.ArgumentParser(
description="Scaffold PRD directory from codebase analysis"
)
parser.add_argument("analysis", help="Path to analysis JSON from codebase_analyzer.py")
parser.add_argument("-o", "--output-dir", default="prd", help="Output directory (default: prd/)")
parser.add_argument("-n", "--project-name", help="Override project name")
parser.add_argument("--validate-only", action="store_true",
help="Validate analysis JSON without generating files")
parser.add_argument("--dry-run", action="store_true",
help="Show what would be created without writing files")
args = parser.parse_args()
analysis_path = Path(args.analysis)
if not analysis_path.exists():
print(f"Error: Analysis file not found: {analysis_path}")
raise SystemExit(2)
try:
with open(analysis_path, encoding="utf-8") as f:
analysis = json.load(f)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {analysis_path}: {e}")
raise SystemExit(2)
# Validate
errors = validate_analysis(analysis)
if errors:
print(f"Validation errors in {analysis_path}:")
for err in errors:
print(f" - {err}")
raise SystemExit(1)
if args.validate_only:
print(f"Analysis file is valid: {analysis_path}")
routes = analysis.get("routes", {}).get("pages", [])
print(f" {len(routes)} routes, "
f"{len(analysis.get('apis', {}).get('endpoints', []))} APIs, "
f"{len(analysis.get('enums', {}).get('definitions', []))} enums")
return
output_dir = Path(args.output_dir)
if args.dry_run:
routes = analysis.get("routes", {}).get("pages", [])
print(f"Dry run — would create in {output_dir}/:\n")
print(f" {output_dir}/README.md")
for i, route in enumerate(routes, 1):
name = route_to_page_name(route.get("path", "/"))
slug = slugify(name) or f"page-{i}"
print(f" {output_dir}/pages/{i:02d}-{slug}.md")
print(f" {output_dir}/appendix/enum-dictionary.md")
print(f" {output_dir}/appendix/api-inventory.md")
print(f" {output_dir}/appendix/page-relationships.md")
print(f"\n Total: {len(routes) + 4} files")
return
print(f"Scaffolding PRD in {output_dir}/...\n")
scaffold(analysis, output_dir, args.project_name)
print_summary(output_dir, analysis)
if __name__ == "__main__":
main()
FILE:settings.json
{
"name": "code-to-prd",
"displayName": "Code → PRD",
"version": "2.1.2",
"description": "Reverse-engineer any codebase into a complete PRD. Analyzes routes, components, models, APIs, and interactions. Frontend (React, Vue, Next.js), backend (NestJS, Django, FastAPI), and fullstack.",
"author": "Alireza Rezvani",
"license": "MIT",
"platforms": ["claude-code", "openclaw", "codex"],
"category": "product",
"tags": ["prd", "product-requirements", "reverse-engineering", "frontend", "backend", "fullstack", "documentation", "code-analysis", "react", "vue", "angular", "next-js", "nestjs", "django", "fastapi", "express"],
"repository": "https://github.com/alirezarezvani/claude-skills",
"commands": {
"code-to-prd": "/code-to-prd"
}
}
Terraform infrastructure-as-code agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw. Covers module design patterns, state management...
---
name: "terraform-patterns"
description: "Terraform infrastructure-as-code agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw. Covers module design patterns, state management strategies, provider configuration, security hardening, policy-as-code with Sentinel/OPA, and CI/CD plan/apply workflows. Use when: user wants to design Terraform modules, manage state backends, review Terraform security, implement multi-region deployments, or follow IaC best practices."
license: MIT
metadata:
version: 1.0.0
author: Alireza Rezvani
category: engineering
updated: 2026-03-15
---
# Terraform Patterns
> Predictable infrastructure. Secure state. Modules that compose. No drift.
Opinionated Terraform workflow that turns sprawling HCL into well-structured, secure, production-grade infrastructure code. Covers module design, state management, provider patterns, security hardening, and CI/CD integration.
Not a Terraform tutorial — a set of concrete decisions about how to write infrastructure code that doesn't break at 3 AM.
---
## Slash Commands
| Command | What it does |
|---------|-------------|
| `/terraform:review` | Analyze Terraform code for anti-patterns, security issues, and structure problems |
| `/terraform:module` | Design or refactor a Terraform module with proper inputs, outputs, and composition |
| `/terraform:security` | Audit Terraform code for security vulnerabilities, secrets exposure, and IAM misconfigurations |
---
## When This Skill Activates
Recognize these patterns from the user:
- "Review this Terraform code"
- "Design a Terraform module for..."
- "My Terraform state is..."
- "Set up remote state backend"
- "Multi-region Terraform deployment"
- "Terraform security review"
- "Module structure best practices"
- "Terraform CI/CD pipeline"
- Any request involving: `.tf` files, HCL, Terraform modules, state management, provider configuration, infrastructure-as-code
If the user has `.tf` files or wants to provision infrastructure with Terraform → this skill applies.
---
## Workflow
### `/terraform:review` — Terraform Code Review
1. **Analyze current state**
- Read all `.tf` files in the target directory
- Identify module structure (flat vs nested)
- Count resources, data sources, variables, outputs
- Check naming conventions
2. **Apply review checklist**
```
MODULE STRUCTURE
├── Variables have descriptions and type constraints
├── Outputs expose only what consumers need
├── Resources use consistent naming: {provider}_{type}_{purpose}
├── Locals used for computed values and DRY expressions
└── No hardcoded values — everything parameterized or in locals
STATE & BACKEND
├── Remote backend configured (S3, GCS, Azure Blob, Terraform Cloud)
├── State locking enabled (DynamoDB for S3, native for others)
├── State encryption at rest enabled
├── No secrets stored in state (or state access is restricted)
└── Workspaces or directory isolation for environments
PROVIDERS
├── Version constraints use pessimistic operator: ~> 5.0
├── Required providers block in terraform {} block
├── Provider aliases for multi-region or multi-account
└── No provider configuration in child modules
SECURITY
├── No hardcoded secrets, keys, or passwords
├── IAM follows least-privilege principle
├── Encryption enabled for storage, databases, secrets
├── Security groups are not overly permissive (no 0.0.0.0/0 ingress on sensitive ports)
└── Sensitive variables marked with sensitive = true
```
3. **Generate report**
```bash
python3 scripts/tf_module_analyzer.py ./terraform
```
4. **Run security scan**
```bash
python3 scripts/tf_security_scanner.py ./terraform
```
### `/terraform:module` — Module Design
1. **Identify module scope**
- Single responsibility: one module = one logical grouping
- Determine inputs (variables), outputs, and resource boundaries
- Decide: flat module (single directory) vs nested (calling child modules)
2. **Apply module design checklist**
```
STRUCTURE
├── main.tf — Primary resources
├── variables.tf — All input variables with descriptions and types
├── outputs.tf — All outputs with descriptions
├── versions.tf — terraform {} block with required_providers
├── locals.tf — Computed values and naming conventions
├── data.tf — Data sources (if any)
└── README.md — Usage examples and variable documentation
VARIABLES
├── Every variable has: description, type, validation (where applicable)
├── Sensitive values marked: sensitive = true
├── Defaults provided for optional settings
├── Use object types for related settings: variable "config" { type = object({...}) }
└── Validate with: validation { condition = ... }
OUTPUTS
├── Output IDs, ARNs, endpoints — things consumers need
├── Include description on every output
├── Mark sensitive outputs: sensitive = true
└── Don't output entire resources — only specific attributes
COMPOSITION
├── Root module calls child modules
├── Child modules never call other child modules
├── Pass values explicitly — no hidden data source lookups in child modules
├── Provider configuration only in root module
└── Use module "name" { source = "./modules/name" }
```
3. **Generate module scaffold**
- Output file structure with boilerplate
- Include variable validation blocks
- Add lifecycle rules where appropriate
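
A minimal sketch of how the variable, output, and lifecycle items from the checklist might look in a scaffolded module. The resource, the variable names, and the 16-character rule are illustrative assumptions, not output of the skill's scripts.

```hcl
# variables.tf: hypothetical inputs for an illustrative database module
variable "db_password" {
  description = "Master password for the database"
  type        = string
  sensitive   = true

  validation {
    condition     = length(var.db_password) >= 16
    error_message = "db_password must be at least 16 characters."
  }
}

variable "allocated_storage" {
  description = "Allocated storage in GiB"
  type        = number
  default     = 20
}

# main.tf: stateful resource guarded against accidental deletion
resource "aws_db_instance" "this" {
  identifier          = "example-db" # placeholder name
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = var.allocated_storage
  username            = "app" # placeholder
  password            = var.db_password
  storage_encrypted   = true
  publicly_accessible = false

  lifecycle {
    prevent_destroy = true
  }
}

# outputs.tf: expose one specific attribute, not the whole resource
output "db_endpoint" {
  description = "Connection endpoint for the database"
  value       = aws_db_instance.this.endpoint
}
```

With `prevent_destroy = true`, any plan that would destroy the instance fails with an error, so removing the resource requires an explicit code change first.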
### `/terraform:security` — Security Audit
1. **Code-level audit**
| Check | Severity | Fix |
|-------|----------|-----|
| Hardcoded secrets in `.tf` files | Critical | Use variables with `sensitive = true` or Vault |
| IAM policy with `*` actions | Critical | Scope to specific actions and resources |
| Security group with 0.0.0.0/0 on port 22/3389 | Critical | Restrict to known CIDR blocks or use SSM/bastion |
| S3 bucket without encryption | High | Add an `aws_s3_bucket_server_side_encryption_configuration` resource |
| S3 bucket with public access | High | Add `aws_s3_bucket_public_access_block` |
| RDS without encryption | High | Set `storage_encrypted = true` |
| RDS publicly accessible | High | Set `publicly_accessible = false` |
| CloudTrail not enabled | Medium | Add `aws_cloudtrail` resource |
| Missing `prevent_destroy` on stateful resources | Medium | Add `lifecycle { prevent_destroy = true }` |
| Variables without `sensitive = true` for secrets | Medium | Add `sensitive = true` to secret variables |
2. **State security audit**
| Check | Severity | Fix |
|-------|----------|-----|
| Local state file | Critical | Migrate to remote backend with encryption |
| Remote state without encryption | High | Enable encryption on backend (SSE-S3, KMS) |
| No state locking | High | Enable DynamoDB for S3, native for TF Cloud |
| State accessible to all team members | Medium | Restrict via IAM policies or TF Cloud teams |
3. **Generate security report**
```bash
python3 scripts/tf_security_scanner.py ./terraform
python3 scripts/tf_security_scanner.py ./terraform --output json
```
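
As a rough illustration of the fixes listed in the code-level audit table, the sketch below pairs a bucket with a public access block and an encryption configuration, and restricts SSH ingress to a known CIDR. The bucket name and the variables `vpc_id` and `admin_cidr` are hypothetical; the resource types are from the AWS provider.

```hcl
variable "vpc_id" {
  description = "VPC that hosts the security group (hypothetical input)"
  type        = string
}

variable "admin_cidr" {
  description = "CIDR block allowed to reach SSH (hypothetical input)"
  type        = string
}

resource "aws_s3_bucket" "data" {
  bucket = "example-app-data" # placeholder name
}

# Fix for "S3 bucket with public access"
resource "aws_s3_bucket_public_access_block" "data" {
  bucket                  = aws_s3_bucket.data.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Fix for "S3 bucket without encryption"
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
  bucket = aws_s3_bucket.data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Fix for "Security group with 0.0.0.0/0 on port 22"
resource "aws_security_group" "ssh" {
  name_prefix = "ssh-restricted-"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH from the admin network only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr] # a known CIDR, never 0.0.0.0/0
  }
}
```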
---
## Tooling
### `scripts/tf_module_analyzer.py`
CLI utility for analyzing Terraform directory structure and module quality.
**Features:**
- Resource and data source counting
- Variable and output analysis (missing descriptions, types, validation)
- Naming convention checks
- Module composition detection
- File structure validation
- JSON and text output
**Usage:**
```bash
# Analyze a Terraform directory
python3 scripts/tf_module_analyzer.py ./terraform
# JSON output
python3 scripts/tf_module_analyzer.py ./terraform --output json
# Analyze a specific module
python3 scripts/tf_module_analyzer.py ./modules/vpc
```
### `scripts/tf_security_scanner.py`
CLI utility for scanning `.tf` files for common security issues.
**Features:**
- Hardcoded secret detection (AWS keys, passwords, tokens)
- Overly permissive IAM policy detection
- Open security group detection (0.0.0.0/0 on sensitive ports)
- Missing encryption checks (S3, RDS, EBS)
- Public access detection (S3, RDS, EC2)
- Sensitive variable audit
- JSON and text output
**Usage:**
```bash
# Scan a Terraform directory
python3 scripts/tf_security_scanner.py ./terraform
# JSON output
python3 scripts/tf_security_scanner.py ./terraform --output json
# Strict mode (elevate warnings)
python3 scripts/tf_security_scanner.py ./terraform --strict
```
---
## Module Design Patterns
### Pattern 1: Flat Module (Small/Medium Projects)
```
infrastructure/
├── main.tf # All resources
├── variables.tf # All inputs
├── outputs.tf # All outputs
├── versions.tf # Provider requirements
├── terraform.tfvars # Environment values (not committed)
└── backend.tf # Remote state configuration
```
Best for: Single application, < 20 resources, one team owns everything.
### Pattern 2: Nested Modules (Medium/Large Projects)
```
infrastructure/
├── environments/
│ ├── dev/
│ │ ├── main.tf # Calls modules with dev params
│ │ ├── backend.tf # Dev state backend
│ │ └── terraform.tfvars
│ ├── staging/
│ │ └── ...
│ └── prod/
│ └── ...
├── modules/
│ ├── networking/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ ├── compute/
│ │ └── ...
│ └── database/
│ └── ...
└── versions.tf
```
Best for: Multiple environments, shared infrastructure patterns, team collaboration.
### Pattern 3: Mono-Repo with Terragrunt
```
infrastructure/
├── terragrunt.hcl # Root config
├── modules/ # Reusable modules
│ ├── vpc/
│ ├── eks/
│ └── rds/
├── dev/
│ ├── terragrunt.hcl # Dev overrides
│ ├── vpc/
│ │ └── terragrunt.hcl # Module invocation
│ └── eks/
│ └── terragrunt.hcl
└── prod/
├── terragrunt.hcl
└── ...
```
Best for: Large-scale, many environments, DRY configuration, team-level isolation.
---
## Provider Configuration Patterns
### Version Pinning
```hcl
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0" # Allow 5.x, block 6.0
}
random = {
source = "hashicorp/random"
version = "~> 3.5"
}
}
}
```
### Multi-Region with Aliases
```hcl
provider "aws" {
region = "us-east-1"
}
provider "aws" {
alias = "west"
region = "us-west-2"
}
resource "aws_s3_bucket" "primary" {
bucket = "my-app-primary"
}
resource "aws_s3_bucket" "replica" {
provider = aws.west
bucket = "my-app-replica"
}
```
### Multi-Account with Assume Role
```hcl
provider "aws" {
alias = "production"
region = "us-east-1"
assume_role {
role_arn = "arn:aws:iam::PROD_ACCOUNT_ID:role/TerraformRole"
}
}
```
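
Aliased providers can also be handed to child modules, which keeps provider configuration in the root module. The sketch below assumes a hypothetical local module at `./modules/bucket` with a `bucket_name` input; only the `providers` argument is the point of the example.

```hcl
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

module "replica_bucket" {
  source = "./modules/bucket" # hypothetical module path

  # Inside the module, the default "aws" provider resolves to aws.west.
  providers = {
    aws = aws.west
  }

  bucket_name = "my-app-replica" # hypothetical module input
}
```

The child module only declares `hashicorp/aws` under `required_providers`; it contains no `provider` block of its own.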
---
## State Management Decision Tree
```
Single developer, small project?
├── Yes → Local state (but migrate to remote ASAP)
└── No
├── Using Terraform Cloud/Enterprise?
│ └── Yes → TF Cloud native backend (built-in locking, encryption, RBAC)
└── No
├── AWS?
│ └── S3 + DynamoDB (encryption, locking, versioning)
├── GCP?
│ └── GCS bucket (native locking, encryption)
├── Azure?
│ └── Azure Blob Storage (native locking, encryption)
└── Other?
└── Consul or PostgreSQL backend
Environment isolation strategy:
├── Separate state files per environment (recommended)
│ ├── Option A: Separate directories (dev/, staging/, prod/)
│ └── Option B: Terraform workspaces (simpler but less isolation)
└── Single state file for all environments (never do this)
```
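
For the AWS branch combined with Option A (separate directories), each environment directory might carry its own backend file along these lines. The bucket, table, and key names are placeholders, and the file path is an assumption about the layout.

```hcl
# environments/prod/backend.tf (hypothetical path)
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # placeholder bucket
    key            = "prod/terraform.tfstate"  # dev would use "dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks" # placeholder lock table
    encrypt        = true
  }
}
```

Because the key differs per directory, a `terraform apply` run from `dev/` cannot touch the prod state file.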
---
## CI/CD Integration Patterns
### GitHub Actions Plan/Apply
```yaml
# .github/workflows/terraform.yml
name: Terraform
on:
pull_request:
paths: ['terraform/**']
push:
branches: [main]
paths: ['terraform/**']
jobs:
plan:
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- run: terraform init
- run: terraform validate
- run: terraform plan -out=tfplan
- run: terraform show -json tfplan > plan.json
# Post plan as PR comment
apply:
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
environment: production
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- run: terraform init
- run: terraform apply -auto-approve
```
### Drift Detection
```yaml
# Run on schedule to detect drift
name: Drift Detection
on:
schedule:
- cron: '0 6 * * 1-5' # Weekdays at 06:00 UTC
jobs:
detect:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
- run: terraform init
- run: |
terraform plan -detailed-exitcode -out=drift.tfplan 2>&1 | tee drift.log
EXIT_CODE=${PIPESTATUS[0]} # exit code of terraform plan, not of tee
if [ $EXIT_CODE -eq 2 ]; then
echo "DRIFT DETECTED — review drift.log"
# Send alert (Slack, PagerDuty, etc.)
fi
```
---
## Proactive Triggers
Flag these without being asked:
- **No remote backend configured** → Migrate to S3/GCS/Azure Blob with locking and encryption.
- **Provider without version constraint** → Add `version = "~> X.0"` to prevent breaking upgrades.
- **Hardcoded secrets in .tf files** → Use variables with `sensitive = true`, or integrate Vault/SSM.
- **IAM policy with `"Action": "*"`** → Scope to specific actions. No wildcard actions in production.
- **Security group open to 0.0.0.0/0 on SSH/RDP** → Restrict to bastion CIDR or use SSM Session Manager.
- **No state locking** → Enable DynamoDB table for S3 backend, or use TF Cloud.
- **Resources without tags** → Add default_tags in provider block. Tags are mandatory for cost tracking.
- **Missing `prevent_destroy` on databases/storage** → Add lifecycle block to prevent accidental deletion.
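
The tagging trigger above can be addressed once at the provider level. A minimal sketch, with placeholder tag values:

```hcl
provider "aws" {
  region = "us-east-1"

  # Applied to every taggable resource created through this provider.
  default_tags {
    tags = {
      Project     = "example-app" # placeholder values
      Environment = "dev"
      ManagedBy   = "terraform"
    }
  }
}
```

Tags set on an individual resource are merged with these defaults, and the resource-level value wins on a key conflict.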
---
## Installation
### One-liner (any tool)
```bash
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/terraform-patterns ~/.claude/skills/
```
### Multi-tool install
```bash
./scripts/convert.sh --skill terraform-patterns --tool codex  # or: gemini, cursor, windsurf, openclaw
```
### OpenClaw
```bash
clawhub install terraform-patterns
```
---
## Related Skills
- **senior-devops** — Broader DevOps scope (CI/CD, monitoring, containerization). Complementary — use terraform-patterns for IaC-specific work, senior-devops for pipeline and infrastructure operations.
- **aws-solution-architect** — AWS architecture design. Complementary — terraform-patterns implements the infrastructure, aws-solution-architect designs it.
- **senior-security** — Application security. Complementary — terraform-patterns covers infrastructure security posture, senior-security covers application-level threats.
- **ci-cd-pipeline-builder** — Pipeline construction. Complementary — terraform-patterns defines infrastructure, ci-cd-pipeline-builder automates deployment.
FILE:references/module-patterns.md
# Terraform Module Design Patterns Reference
## Pattern 1: Flat Module (Single Directory)
Best for: Small projects, < 20 resources, single team ownership.
```
project/
├── main.tf
├── variables.tf
├── outputs.tf
├── versions.tf
├── locals.tf
├── backend.tf
└── terraform.tfvars
```
### Example: Simple VPC + EC2
```hcl
# versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# locals.tf
locals {
name_prefix = "${var.project}-${var.environment}"
common_tags = {
Project = var.project
Environment = var.environment
ManagedBy = "terraform"
}
}
# main.tf
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-vpc"
})
}
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-public-${count.index + 1}"
Tier = "public"
})
}
# variables.tf
variable "project" {
description = "Project name used for resource naming"
type = string
}
variable "environment" {
description = "Deployment environment"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "vpc_cidr" {
description = "CIDR block for the VPC"
type = string
default = "10.0.0.0/16"
validation {
condition = can(cidrhost(var.vpc_cidr, 0))
error_message = "Must be a valid CIDR block."
}
}
variable "public_subnet_cidrs" {
description = "CIDR blocks for public subnets"
type = list(string)
default = ["10.0.1.0/24", "10.0.2.0/24"]
}
variable "availability_zones" {
description = "AZs for subnet placement"
type = list(string)
default = ["us-east-1a", "us-east-1b"]
}
# outputs.tf
output "vpc_id" {
description = "ID of the created VPC"
value = aws_vpc.main.id
}
output "public_subnet_ids" {
description = "IDs of public subnets"
value = aws_subnet.public[*].id
}
```
---
## Pattern 2: Nested Modules (Composition)
Best for: Multiple environments, shared patterns, team collaboration.
```
infrastructure/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── backend.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ │ └── ...
│ └── prod/
│ └── ...
└── modules/
├── networking/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
├── compute/
│ └── ...
└── database/
└── ...
```
### Root Module (environments/dev/main.tf)
```hcl
module "networking" {
source = "../../modules/networking"
project = var.project
environment = "dev"
vpc_cidr = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24"]
}
module "compute" {
source = "../../modules/compute"
project = var.project
environment = "dev"
vpc_id = module.networking.vpc_id
subnet_ids = module.networking.private_subnet_ids
instance_type = "t3.micro"
instance_count = 1
}
module "database" {
source = "../../modules/database"
project = var.project
environment = "dev"
vpc_id = module.networking.vpc_id
subnet_ids = module.networking.private_subnet_ids
instance_class = "db.t3.micro"
allocated_storage = 20
db_password = var.db_password
}
```
### Key Rules
- Child modules never call other child modules
- Pass values explicitly — no hidden data source lookups in children
- Provider configuration only in root module
- Each module has its own variables.tf, outputs.tf, main.tf
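
A sketch of what these rules could mean for the child module's own files. It assumes `modules/networking/main.tf` defines `aws_vpc.main` and `aws_subnet.private`; those resource names are hypothetical.

```hcl
# modules/networking/versions.tf
# Declares which provider the module needs, but does not configure it.
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

# modules/networking/outputs.tf
# Values the root module passes on explicitly to compute and database.
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets"
  value       = aws_subnet.private[*].id
}
```

A lower-bound constraint (`>= 5.0`) in the child leaves the exact pin to the root module, which is one common convention rather than a requirement.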
---
## Pattern 3: Registry Module Pattern
Best for: Reusable modules shared across teams or organizations.
```
terraform-aws-vpc/
├── main.tf
├── variables.tf
├── outputs.tf
├── versions.tf
├── README.md
├── examples/
│ ├── simple/
│ │ └── main.tf
│ └── complete/
│ └── main.tf
└── modules/
├── subnet/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
└── nat-gateway/
└── ...
```
### Publishing Conventions
```hcl
# Consumer usage
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "my-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
}
```
### Registry Module Requirements
- Repository named `terraform-<PROVIDER>-<NAME>`
- README.md with usage examples
- Semantic versioning via git tags
- examples/ directory with working configurations
- No provider configuration in the module itself
---
## Pattern 4: Mono-Repo with Workspaces
Best for: Teams that prefer single-repo with workspace-based isolation.
```hcl
# backend.tf
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "project/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
# main.tf
locals {
env_config = {
dev = {
instance_type = "t3.micro"
instance_count = 1
db_class = "db.t3.micro"
}
staging = {
instance_type = "t3.small"
instance_count = 2
db_class = "db.t3.small"
}
prod = {
instance_type = "t3.large"
instance_count = 3
db_class = "db.r5.large"
}
}
config = local.env_config[terraform.workspace]
}
```
### Usage
```bash
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod
terraform workspace select dev
terraform apply
terraform workspace select prod
terraform apply
```
### Workspace Caveats
- All environments share the same backend — less isolation than separate directories
- A mistake in the code affects all environments
- Can't have different provider versions per workspace
- Recommended only for simple setups; prefer separate directories for production
---
## Pattern 5: for_each vs count
### Use `count` for identical resources
```hcl
resource "aws_subnet" "public" {
count = 3
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
availability_zone = data.aws_availability_zones.available.names[count.index]
}
```
### Use `for_each` for distinct resources
```hcl
variable "buckets" {
type = map(object({
versioning = bool
lifecycle_days = number
}))
default = {
logs = { versioning = false, lifecycle_days = 30 }
backups = { versioning = true, lifecycle_days = 90 }
assets = { versioning = true, lifecycle_days = 0 }
}
}
resource "aws_s3_bucket" "this" {
  for_each = var.buckets
  bucket   = "${var.project}-${each.key}"
}
resource "aws_s3_bucket_versioning" "this" {
for_each = { for k, v in var.buckets : k => v if v.versioning }
bucket = aws_s3_bucket.this[each.key].id
versioning_configuration {
status = "Enabled"
}
}
```
### Why `for_each` > `count`
- `count` uses index — removing item 0 shifts all others, causing destroy/recreate
- `for_each` uses keys — removing a key only affects that resource
- Use `count` only for identical resources where order doesn't matter
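
The difference in churn can be illustrated with a toy model. This is a minimal sketch, not Terraform's planner: the helper names (`count_addresses`, `for_each_addresses`, `changed_addresses`) are hypothetical, and the model only tracks which resource address is bound to which item before and after the first item is removed.

```python
def count_addresses(items):
    """Addresses under count: one per list position."""
    return {f"aws_subnet.public[{i}]": item for i, item in enumerate(items)}


def for_each_addresses(items):
    """Addresses under for_each: one per key."""
    return {f'aws_s3_bucket.this["{key}"]': key for key in items}


def changed_addresses(before, after):
    """Addresses whose bound item changes or disappears (replace or destroy)."""
    return sorted(addr for addr in before if after.get(addr) != before[addr])


items = ["logs", "backups", "assets"]
remaining = items[1:]  # remove the first item

count_churn = changed_addresses(count_addresses(items), count_addresses(remaining))
for_each_churn = changed_addresses(
    for_each_addresses(items), for_each_addresses(remaining)
)

print("count churn:", count_churn)
print("for_each churn:", for_each_churn)
```

In this model every index-based address is affected by the removal, while the key-based addressing touches only the removed key.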
---
## Variable Design Patterns
### Object Variables for Related Settings
```hcl
variable "database" {
description = "Database configuration"
type = object({
engine = string
instance_class = string
storage_gb = number
multi_az = bool
backup_days = number
})
default = {
engine = "postgres"
instance_class = "db.t3.micro"
storage_gb = 20
multi_az = false
backup_days = 7
}
}
```
### Validation Blocks
```hcl
variable "instance_type" {
description = "EC2 instance type"
type = string
validation {
condition = can(regex("^t[23]\\.", var.instance_type))
error_message = "Only t2 or t3 instance types are allowed."
}
}
variable "cidr_block" {
description = "VPC CIDR block"
type = string
validation {
condition = can(cidrhost(var.cidr_block, 0))
error_message = "Must be a valid IPv4 CIDR block."
}
}
```
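
To see which inputs these two conditions accept, the checks can be approximated outside Terraform. The sketch below is an assumption-laden mirror, not Terraform's own evaluation: the function names are hypothetical, and the CIDR check is deliberately limited to IPv4, whereas `cidrhost` itself also accepts IPv6 prefixes.

```python
import ipaddress
import re


def is_allowed_instance_type(instance_type):
    """Mirror of can(regex("^t[23]\\.", ...)): the name must start with t2. or t3."""
    return re.match(r"^t[23]\.", instance_type) is not None


def is_valid_ipv4_cidr(cidr_block):
    """Approximate can(cidrhost(cidr, 0)) for IPv4 prefixes only."""
    if "/" not in cidr_block:
        return False  # a bare address is not a CIDR block
    try:
        ipaddress.IPv4Network(cidr_block, strict=False)
    except ValueError:
        return False
    return True


for candidate in ("t3.micro", "m5.large"):
    print(candidate, is_allowed_instance_type(candidate))
for candidate in ("10.0.0.0/16", "10.0.0.0/33", "10.0.0.0"):
    print(candidate, is_valid_ipv4_cidr(candidate))
```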
---
## Anti-Patterns to Avoid
| Anti-Pattern | Problem | Solution |
|-------------|---------|----------|
| God module (100+ resources) | Impossible to reason about, slow plan/apply | Split into focused child modules |
| Circular module dependencies | Terraform can't resolve dependency graph | Flatten or restructure module boundaries |
| Data sources in child modules | Hidden dependencies, hard to test | Pass values as variables from root module |
| Provider config in child modules | Can't reuse module across accounts/regions | Configure providers in root only |
| Hardcoded values | Not reusable across environments | Use variables with defaults and validation |
| No outputs | Consumer modules can't reference resources | Output IDs, ARNs, endpoints |
| No variable descriptions | Users don't know what to provide | Every variable gets a description |
| `terraform.tfvars` committed | Secrets leak to version control | Use `.gitignore`, env vars, or Vault |
FILE:references/state-management.md
# Terraform State Management Reference
## Backend Configuration Patterns
### AWS: S3 + DynamoDB (Recommended)
```hcl
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "project/env/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks"
# Optional: KMS key for encryption
# kms_key_id = "arn:aws:kms:us-east-1:ACCOUNT:key/KEY_ID"
}
}
```
**Prerequisites:**
```hcl
# Bootstrap these resources manually or with a separate Terraform config
resource "aws_s3_bucket" "state" {
bucket = "mycompany-terraform-state"
lifecycle {
prevent_destroy = true
}
}
resource "aws_s3_bucket_versioning" "state" {
bucket = aws_s3_bucket.state.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
bucket = aws_s3_bucket.state.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
}
}
}
resource "aws_s3_bucket_public_access_block" "state" {
bucket = aws_s3_bucket.state.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_dynamodb_table" "locks" {
name = "terraform-locks"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}
```
---
### GCP: Google Cloud Storage
```hcl
terraform {
backend "gcs" {
bucket = "mycompany-terraform-state"
prefix = "project/env"
}
}
```
**Key features:**
- Native locking (no separate lock table needed)
- Object versioning for state history
- IAM-based access control
- Encryption at rest by default
---
### Azure: Blob Storage
```hcl
terraform {
backend "azurerm" {
resource_group_name = "terraform-state-rg"
storage_account_name = "mycompanytfstate"
container_name = "tfstate"
key = "project/env/terraform.tfstate"
}
}
```
**Key features:**
- Native blob locking
- Encryption at rest with Microsoft-managed or customer-managed keys
- RBAC-based access control
---
### Terraform Cloud / Enterprise
```hcl
terraform {
cloud {
organization = "mycompany"
workspaces {
name = "project-dev"
}
}
}
```
**Key features:**
- Built-in state locking, encryption, and versioning
- RBAC and team-based access control
- Remote execution (plan/apply run in TF Cloud)
- Sentinel policy-as-code integration
- Cost estimation on plans
---
## Environment Isolation Strategies
### Strategy 1: Separate Directories (Recommended)
```
infrastructure/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── backend.tf      # key = "project/dev/terraform.tfstate"
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   ├── backend.tf      # key = "project/staging/terraform.tfstate"
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       ├── backend.tf      # key = "project/prod/terraform.tfstate"
│       └── terraform.tfvars
└── modules/
    └── ...
```
**Pros:**
- Complete isolation — a mistake in dev can't affect prod
- Different provider versions per environment
- Different module versions per environment (pin prod, iterate in dev)
- Clear audit trail — who changed what, where
**Cons:**
- Some duplication across environment directories
- Must update modules in each environment separately
### Strategy 2: Terraform Workspaces
```hcl
# Single directory, multiple workspaces
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "project/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
# State files stored at:
# env:/dev/project/terraform.tfstate
# env:/staging/project/terraform.tfstate
# env:/prod/project/terraform.tfstate
```
```bash
terraform workspace new dev
terraform workspace select dev
terraform plan -var-file="env/dev.tfvars"
```
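
The state paths listed in the comment above follow a simple rule, sketched here. The function name is hypothetical; the `env:` prefix is assumed to be the S3 backend's default `workspace_key_prefix`, and the `default` workspace is assumed to use the configured key unchanged.

```python
def workspace_state_key(key, workspace, workspace_key_prefix="env:"):
    """Return the object key where a workspace's state would be stored."""
    if workspace == "default":
        # The default workspace uses the configured key as-is
        return key
    return f"{workspace_key_prefix}/{workspace}/{key}"


for ws in ("default", "dev", "staging", "prod"):
    print(ws, "->", workspace_state_key("project/terraform.tfstate", ws))
```

All workspaces land in the same bucket, which is why access to one environment's state usually implies access to the others unless IAM policies are scoped by key prefix.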
**Pros:**
- Less duplication — single set of .tf files
- Quick to switch between environments
- Built-in workspace support in backends
**Cons:**
- Shared code means a bug affects all environments simultaneously
- Can't have different provider versions per workspace
- Easy to accidentally apply to wrong workspace
- Less isolation than separate directories
### Strategy 3: Terragrunt (DRY Configuration)
```
infrastructure/
├── terragrunt.hcl          # Root: defines remote state pattern
├── modules/
│   └── vpc/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── dev/
│   ├── terragrunt.hcl      # env = "dev"
│   └── vpc/
│       └── terragrunt.hcl  # inputs for dev VPC
├── staging/
│   └── ...
└── prod/
    └── ...
```
```hcl
# Root terragrunt.hcl
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "mycompany-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# dev/vpc/terragrunt.hcl
terraform {
  source = "../../modules/vpc"
}

inputs = {
  environment = "dev"
  vpc_cidr    = "10.0.0.0/16"
}
```
**Pros:**
- Maximum DRY — define module once, parameterize per environment
- Automatic state key generation from directory structure
- Dependency management between modules (`dependency` blocks)
- `run-all` for applying multiple modules at once
**Cons:**
- Additional tool dependency (Terragrunt)
- Learning curve
- Debugging can be harder (generated files)
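
The automatic state key generation mentioned above amounts to taking the child directory's path relative to the root configuration. A minimal sketch of that idea follows, assuming the child configuration includes the root one; `state_key_for` is a hypothetical helper, not part of Terragrunt.

```python
from pathlib import PurePosixPath


def state_key_for(root_dir, child_dir):
    """Derive a state key from the child directory's path relative to the root."""
    relative = PurePosixPath(child_dir).relative_to(PurePosixPath(root_dir))
    return f"{relative.as_posix()}/terraform.tfstate"


print(state_key_for("infrastructure", "infrastructure/dev/vpc"))
print(state_key_for("infrastructure", "infrastructure/prod/vpc"))
```

Because the key is derived from the directory layout, moving or renaming a directory changes the key, so such moves typically need a state migration.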
---
## State Migration Patterns
### Local to Remote (S3)
```bash
# 1. Add backend configuration to backend.tf
# 2. Run init with migration flag
terraform init -migrate-state
# Terraform will prompt:
# "Do you want to copy existing state to the new backend?"
# Answer: yes
```
### Between Remote Backends
```bash
# 1. Pull current state
terraform state pull > terraform.tfstate.backup
# 2. Update backend configuration in backend.tf
# 3. Reinitialize with migration
terraform init -migrate-state
# 4. Verify
terraform plan # Should show no changes
```
### State Import (Existing Resources)
```bash
# Import a single resource
terraform import aws_instance.web i-1234567890abcdef0
# Import with for_each key
terraform import 'aws_subnet.public["us-east-1a"]' subnet-0123456789abcdef0
# Declarative import (Terraform 1.5+): put this HCL block in a .tf file, then run terraform plan and apply
import {
to = aws_instance.web
id = "i-1234567890abcdef0"
}
```
### State Move (Refactoring)
```bash
# Rename a resource (avoids destroy/recreate)
terraform state mv aws_instance.old_name aws_instance.new_name
# Move into a module
terraform state mv aws_instance.web module.compute.aws_instance.web
# Move between state files
terraform state mv -state-out=other.tfstate aws_instance.web aws_instance.web
```
---
## State Locking
### Why Locking Matters
Without locking, two concurrent `terraform apply` runs can corrupt state. The second apply reads stale state and may create duplicate resources or lose track of existing ones.
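
The failure mode can be reproduced with a toy in-memory model. This is an illustrative sketch only: `StateStore`, `unlocked_race`, and `locked_runs` are hypothetical names, and the model stands in for a remote backend rather than reproducing any real locking protocol.

```python
class StateStore:
    """Toy remote state: a resource list plus an exclusive lock."""

    def __init__(self):
        self.resources = []
        self.lock_holder = None

    def acquire(self, run_id):
        if self.lock_holder is not None:
            return False  # someone else holds the lock
        self.lock_holder = run_id
        return True

    def release(self, run_id):
        if self.lock_holder == run_id:
            self.lock_holder = None


def unlocked_race(store):
    """Both runs read the same snapshot, then write: the first write is lost."""
    snapshot_a = list(store.resources)
    snapshot_b = list(store.resources)
    store.resources = snapshot_a + ["web"]  # run A writes
    store.resources = snapshot_b + ["db"]   # run B overwrites with stale data
    return store.resources


def locked_runs(store):
    """Run B is refused while A holds the lock, then proceeds on fresh state."""
    assert store.acquire("run-a")
    blocked = not store.acquire("run-b")
    store.resources = list(store.resources) + ["web"]
    store.release("run-a")
    assert store.acquire("run-b")
    store.resources = list(store.resources) + ["db"]
    store.release("run-b")
    return blocked, store.resources


print(unlocked_race(StateStore()))
print(locked_runs(StateStore()))
```

In the unlocked case the state ends up tracking only one of the two resources, even though both were created.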
### Lock Behavior by Backend
| Backend | Lock Mechanism | Auto-Lock | Force Unlock |
|---------|---------------|-----------|--------------|
| S3 | DynamoDB table | Yes (if table configured) | `terraform force-unlock LOCK_ID` |
| GCS | Native blob locking | Yes | `terraform force-unlock LOCK_ID` |
| Azure Blob | Native blob lease | Yes | `terraform force-unlock LOCK_ID` |
| TF Cloud | Built-in | Always | Via UI or API |
| Consul | Key-value lock | Yes | `terraform force-unlock LOCK_ID` |
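| Local | System file lock on the state file (lock info in `.terraform.tfstate.lock.info`) | Yes (single machine) | Released when the process exits; remove a stale `.terraform.tfstate.lock.info` if one remains |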
### Force Unlock (Emergency Only)
```bash
# Only use when you're certain no other process is running
terraform force-unlock LOCK_ID
# The LOCK_ID is shown in the error message when lock fails:
# Error: Error locking state: Error acquiring the state lock
# Lock Info:
# ID: 12345678-abcd-1234-abcd-1234567890ab
```
---
## State Security Best Practices
### 1. Encrypt at Rest
```hcl
# S3 — server-side encryption
backend "s3" {
encrypt = true
kms_key_id = "arn:aws:kms:us-east-1:ACCOUNT:key/KEY_ID"
}
```
### 2. Restrict Access
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::mycompany-terraform-state/project/*",
"Condition": {
"StringEquals": {
"aws:PrincipalTag/Team": "platform"
}
}
},
{
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:DeleteItem"
],
"Resource": "arn:aws:dynamodb:us-east-1:ACCOUNT:table/terraform-locks"
}
]
}
```
### 3. Enable Versioning (State History)
```hcl
resource "aws_s3_bucket_versioning" "state" {
bucket = aws_s3_bucket.state.id
versioning_configuration {
status = "Enabled"
}
}
```
Versioning lets you recover from state corruption by restoring a previous version.
### 4. Audit Access
- Enable S3 access logging or CloudTrail data events
- Monitor for unexpected state reads (potential secret extraction)
- State files contain sensitive values — treat them like credentials
### 5. Sensitive Values in State
Terraform stores all resource attributes in state, including passwords, private keys, and tokens. This is unavoidable. Mitigate by:
- Encrypting state at rest (KMS)
- Restricting state file access (IAM)
- Using `sensitive = true` on variables and outputs (prevents display, not storage)
- Rotating secrets regularly (state contains the value at apply time)
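
To get a sense of what a given state file exposes, the attribute names can be listed without printing their values. The sketch below assumes the version 4 state layout (`resources` → `instances` → `attributes`); the hint list and the function name are hypothetical, and the sample state is made up for illustration.

```python
import json

# Hypothetical substrings that suggest an attribute holds a secret
SENSITIVE_HINTS = ("password", "secret", "token", "private_key")


def find_sensitive_attributes(state):
    """Return addresses of non-empty attributes whose names look sensitive."""
    hits = []
    for resource in state.get("resources", []):
        address = f"{resource.get('type')}.{resource.get('name')}"
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            for attr, value in attributes.items():
                if value and any(h in attr.lower() for h in SENSITIVE_HINTS):
                    hits.append(f"{address}.{attr}")  # name only, never the value
    return sorted(hits)


sample_state = json.loads("""
{
  "version": 4,
  "resources": [
    {
      "type": "aws_db_instance",
      "name": "main",
      "instances": [
        {"attributes": {"engine": "postgres", "password": "example-only"}}
      ]
    }
  ]
}
""")

print(find_sensitive_attributes(sample_state))
```

For a real check, the input could come from `terraform state pull`, handled with the same care as the state file itself.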
---
## Drift Detection and Reconciliation
### Detect Drift
```bash
# Plan with detailed exit code
terraform plan -detailed-exitcode
# Exit 0 = no changes
# Exit 1 = error
# Exit 2 = changes detected (drift)
```
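
In a scheduled job, the exit codes above can be turned into a simple status. This is a sketch under assumptions: `check_drift` and `interpret_plan_exit_code` are hypothetical helpers, `terraform` is assumed to be on `PATH`, and the directory is assumed to be initialized. Note that exit code 2 signals any pending change, including code changes not yet applied, not only out-of-band drift.

```python
import subprocess

EXIT_MEANINGS = {0: "in-sync", 1: "error", 2: "drift"}


def interpret_plan_exit_code(code):
    """Map a `terraform plan -detailed-exitcode` exit code to a status label."""
    return EXIT_MEANINGS.get(code, "unknown")


def check_drift(working_dir):
    """Run a non-interactive plan in working_dir and classify the result."""
    completed = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=working_dir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(completed.returncode)


# Example (requires terraform and an initialized directory):
# print(check_drift("./environments/dev"))
```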
### Common Drift Sources
| Source | Example | Prevention |
|--------|---------|------------|
| Console changes | Someone edits SG rules in AWS Console | SCPs to restrict console access, or accept and reconcile |
| Auto-scaling | ASG launches instances not in state | Don't manage individual instances; manage ASG |
| External tools | Ansible modifies EC2 tags | Agree on ownership boundaries |
| Dependent resource changes | AMI deregistered | Use data sources to detect, lifecycle ignore_changes |
### Reconciliation Options
```hcl
# Option 1: Apply to restore desired state
terraform apply
# Option 2: Refresh state to match reality
terraform apply -refresh-only
# Option 3: Ignore specific attribute drift
resource "aws_instance" "web" {
lifecycle {
ignore_changes = [tags["LastModifiedBy"], ami]
}
}
# Option 4: Import the manually-created resource
terraform import aws_security_group_rule.new sg-12345_ingress_tcp_443_443_0.0.0.0/0
```
---
## Troubleshooting Checklist
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| "Error acquiring state lock" | Concurrent run or crashed process | Wait for other run to finish, or `force-unlock` |
| "Backend configuration changed" | Backend config modified | Run `terraform init -reconfigure` or `-migrate-state` |
| "Resource already exists" | Resource created outside Terraform | `terraform import` the resource |
| "No matching resource found" | Resource deleted outside Terraform | `terraform state rm` the resource |
| State file growing very large | Too many resources in one state | Split into smaller state files, one per root configuration (for example per component or environment) |
| Slow plan/apply | Large state file, many resources | Split state, use `-target` for urgent changes |
| "Provider produced inconsistent result" | Provider bug or API race condition | Retry, or pin provider version |
| Workspace confusion | Applied to wrong workspace | Always check `terraform workspace show` before apply |
FILE:scripts/tf_module_analyzer.py
#!/usr/bin/env python3
"""
terraform-patterns: Terraform Module Analyzer
Analyze a Terraform directory structure for module quality, resource counts,
naming conventions, and structural best practices. Reports variable/output
coverage, file organization, and actionable recommendations.
Usage:
python scripts/tf_module_analyzer.py ./terraform
python scripts/tf_module_analyzer.py ./terraform --output json
python scripts/tf_module_analyzer.py ./modules/vpc
"""
import argparse
import json
import os
import re
import sys
from pathlib import Path
# --- Demo Terraform Files ---
DEMO_FILES = {
"main.tf": """
resource "aws_instance" "web_server" {
ami = var.ami_id
instance_type = var.instance_type
tags = {
Name = "web-server"
}
}
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket-12345"
}
resource "aws_security_group" "web" {
name = "web-sg"
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"]
}
module "vpc" {
source = "./modules/vpc"
cidr = var.vpc_cidr
}
""",
"variables.tf": """
variable "ami_id" {
type = string
}
variable "instance_type" {
default = "t3.micro"
}
variable "vpc_cidr" {
description = "CIDR block for the VPC"
type = string
default = "10.0.0.0/16"
}
variable "environment" {
description = "Deployment environment"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
""",
"outputs.tf": """
output "instance_id" {
value = aws_instance.web_server.id
}
output "bucket_arn" {
value = aws_s3_bucket.data.arn
description = "ARN of the data S3 bucket"
}
""",
}
# --- Naming convention patterns ---
# Terraform resource naming: lowercase, underscores, alphanumeric
VALID_RESOURCE_NAME = re.compile(r'^[a-z][a-z0-9_]*$')
# Expected files in a well-structured module
EXPECTED_FILES = {
"main.tf": "Primary resources",
"variables.tf": "Input variables",
"outputs.tf": "Output values",
"versions.tf": "Provider and Terraform version requirements",
}
OPTIONAL_FILES = {
"locals.tf": "Computed local values",
"data.tf": "Data sources",
"backend.tf": "Remote state backend configuration",
"providers.tf": "Provider configuration",
"README.md": "Module documentation",
}
def find_tf_files(directory):
    """Find all .tf files in a directory (non-recursive)."""
    tf_files = {}
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".tf"):
            filepath = os.path.join(directory, entry)
            with open(filepath, encoding="utf-8") as f:
                tf_files[entry] = f.read()
    return tf_files


def parse_resources(content):
    """Extract resource declarations from HCL content."""
    resources = []
    for match in re.finditer(
        r'^resource\s+"([^"]+)"\s+"([^"]+)"', content, re.MULTILINE
    ):
        resources.append({
            "type": match.group(1),
            "name": match.group(2),
            "provider": match.group(1).split("_")[0],
        })
    return resources


def parse_data_sources(content):
    """Extract data source declarations."""
    sources = []
    for match in re.finditer(
        r'^data\s+"([^"]+)"\s+"([^"]+)"', content, re.MULTILINE
    ):
        sources.append({"type": match.group(1), "name": match.group(2)})
    return sources


def parse_variables(content):
    """Extract variable declarations with metadata."""
    variables = []
    # Match variable blocks
    for match in re.finditer(
        r'^variable\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.MULTILINE | re.DOTALL,
    ):
        name = match.group(1)
        body = match.group(2)
        var = {
            "name": name,
            "has_description": "description" in body,
            "has_type": bool(re.search(r'\btype\s*=', body)),
            "has_default": bool(re.search(r'\bdefault\s*=', body)),
            "has_validation": "validation" in body,
            "is_sensitive": "sensitive" in body and bool(
                re.search(r'\bsensitive\s*=\s*true', body)
            ),
        }
        variables.append(var)
    return variables


def parse_outputs(content):
    """Extract output declarations with metadata."""
    outputs = []
    for match in re.finditer(
        r'^output\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.MULTILINE | re.DOTALL,
    ):
        name = match.group(1)
        body = match.group(2)
        out = {
            "name": name,
            "has_description": "description" in body,
            "is_sensitive": "sensitive" in body and bool(
                re.search(r'\bsensitive\s*=\s*true', body)
            ),
        }
        outputs.append(out)
    return outputs


def parse_modules(content):
    """Extract module calls."""
    modules = []
    for match in re.finditer(
        r'^module\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.MULTILINE | re.DOTALL,
    ):
        name = match.group(1)
        body = match.group(2)
        source_match = re.search(r'source\s*=\s*"([^"]+)"', body)
        source = source_match.group(1) if source_match else "unknown"
        modules.append({"name": name, "source": source})
    return modules


def check_naming(resources, data_sources):
    """Check naming conventions."""
    issues = []
    for r in resources:
        if not VALID_RESOURCE_NAME.match(r["name"]):
            issues.append({
                "severity": "medium",
                "message": f"Resource '{r['type']}.{r['name']}' uses non-standard naming — use lowercase with underscores",
            })
        if r["name"].startswith(r["provider"] + "_"):
            issues.append({
                "severity": "low",
                "message": f"Resource '{r['type']}.{r['name']}' name repeats the provider prefix — redundant",
            })
    for d in data_sources:
        if not VALID_RESOURCE_NAME.match(d["name"]):
            issues.append({
                "severity": "medium",
                "message": f"Data source '{d['type']}.{d['name']}' uses non-standard naming",
            })
    return issues


def check_variables(variables):
    """Check variable quality."""
    issues = []
    for v in variables:
        if not v["has_description"]:
            issues.append({
                "severity": "medium",
                "message": f"Variable '{v['name']}' missing description — consumers won't know what to provide",
            })
        if not v["has_type"]:
            issues.append({
                "severity": "high",
                "message": f"Variable '{v['name']}' missing type constraint — accepts any value",
            })
        # Check if name suggests a secret
        secret_patterns = ["password", "secret", "token", "key", "api_key", "credentials"]
        name_lower = v["name"].lower()
        if any(p in name_lower for p in secret_patterns) and not v["is_sensitive"]:
            issues.append({
                "severity": "high",
                "message": f"Variable '{v['name']}' looks like a secret but is not marked sensitive = true",
            })
    return issues


def check_outputs(outputs):
    """Check output quality."""
    issues = []
    for o in outputs:
        if not o["has_description"]:
            issues.append({
                "severity": "low",
                "message": f"Output '{o['name']}' missing description",
            })
    return issues


def check_file_structure(tf_files):
    """Check if expected files are present."""
    issues = []
    filenames = set(tf_files.keys())
    for expected, purpose in EXPECTED_FILES.items():
        if expected not in filenames:
            issues.append({
                "severity": "medium" if expected != "versions.tf" else "high",
                "message": f"Missing '{expected}' — {purpose}",
            })
    return issues


def analyze_directory(tf_files):
    """Run full analysis on a set of .tf files."""
    all_content = "\n".join(tf_files.values())
    resources = parse_resources(all_content)
    data_sources = parse_data_sources(all_content)
    variables = parse_variables(all_content)
    outputs = parse_outputs(all_content)
    modules = parse_modules(all_content)

    # Collect findings
    findings = []
    findings.extend(check_file_structure(tf_files))
    findings.extend(check_naming(resources, data_sources))
    findings.extend(check_variables(variables))
    findings.extend(check_outputs(outputs))

    # Check for backend configuration
    has_backend = any(
        re.search(r'\bbackend\s+"', content)
        for content in tf_files.values()
    )
    if not has_backend:
        findings.append({
            "severity": "high",
            "message": "No remote backend configured — state is stored locally",
        })

    # Check for terraform required_version
    has_tf_version = any(
        re.search(r'required_version\s*=', content)
        for content in tf_files.values()
    )
    if not has_tf_version:
        findings.append({
            "severity": "medium",
            "message": "No required_version constraint — any Terraform version can be used",
        })

    # Providers in child modules check
    for filename, content in tf_files.items():
        if filename not in ("providers.tf", "versions.tf", "backend.tf"):
            if re.search(r'^provider\s+"', content, re.MULTILINE):
                findings.append({
                    "severity": "medium",
                    "message": f"Provider configuration found in '{filename}' — keep providers in root module only",
                })

    # Sort findings
    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    findings.sort(key=lambda f: severity_order.get(f["severity"], 4))

    # Unique providers
    providers = sorted(set(r["provider"] for r in resources))

    return {
        "files": sorted(tf_files.keys()),
        "file_count": len(tf_files),
        "resources": resources,
        "resource_count": len(resources),
        "data_sources": data_sources,
        "data_source_count": len(data_sources),
        "variables": variables,
        "variable_count": len(variables),
        "outputs": outputs,
        "output_count": len(outputs),
        "modules": modules,
        "module_count": len(modules),
        "providers": providers,
        "findings": findings,
    }


def generate_report(analysis, output_format="text"):
    """Generate analysis report."""
    findings = analysis["findings"]

    # Score
    deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2}
    score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings))
    counts = {
        "critical": sum(1 for f in findings if f["severity"] == "critical"),
        "high": sum(1 for f in findings if f["severity"] == "high"),
        "medium": sum(1 for f in findings if f["severity"] == "medium"),
        "low": sum(1 for f in findings if f["severity"] == "low"),
    }
    result = {
        "score": score,
        "files": analysis["files"],
        "resource_count": analysis["resource_count"],
        "data_source_count": analysis["data_source_count"],
        "variable_count": analysis["variable_count"],
        "output_count": analysis["output_count"],
        "module_count": analysis["module_count"],
        "providers": analysis["providers"],
        "findings": findings,
        "finding_counts": counts,
    }
    if output_format == "json":
        print(json.dumps(result, indent=2))
        return result

    # Text output
    print(f"\n{'=' * 60}")
    print(f" Terraform Module Analysis Report")
    print(f"{'=' * 60}")
    print(f" Score: {score}/100")
    print(f" Files: {', '.join(analysis['files'])}")
    print(f" Providers: {', '.join(analysis['providers']) if analysis['providers'] else 'none detected'}")
    print()
    print(f" Resources: {analysis['resource_count']} | Data Sources: {analysis['data_source_count']}")
    print(f" Variables: {analysis['variable_count']} | Outputs: {analysis['output_count']} | Modules: {analysis['module_count']}")
    print()
    print(f" Findings: {counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low")
    print(f"{'─' * 60}")
    for f in findings:
        icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?")
        print(f"\n {icon} {f['severity'].upper()}")
        print(f" {f['message']}")
    if not findings:
        print("\n No issues found. Module structure looks good.")
    print(f"\n{'=' * 60}\n")
    return result


def main():
    parser = argparse.ArgumentParser(
        description="terraform-patterns: Terraform module analyzer"
    )
    parser.add_argument(
        "directory", nargs="?",
        help="Path to Terraform directory (omit for demo)",
    )
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    args = parser.parse_args()

    if args.directory:
        dirpath = Path(args.directory)
        if not dirpath.is_dir():
            print(f"Error: Not a directory: {args.directory}", file=sys.stderr)
            sys.exit(1)
        tf_files = find_tf_files(str(dirpath))
        if not tf_files:
            print(f"Error: No .tf files found in {args.directory}", file=sys.stderr)
            sys.exit(1)
    else:
        print("No directory provided. Running demo analysis...\n")
        tf_files = DEMO_FILES

    analysis = analyze_directory(tf_files)
    generate_report(analysis, args.output)


if __name__ == "__main__":
    main()
FILE:scripts/tf_security_scanner.py
#!/usr/bin/env python3
"""
terraform-patterns: Terraform Security Scanner
Scan .tf files for common security issues including hardcoded secrets,
overly permissive IAM policies, open security groups, missing encryption,
and sensitive variable misuse.
Usage:
python scripts/tf_security_scanner.py ./terraform
python scripts/tf_security_scanner.py ./terraform --output json
python scripts/tf_security_scanner.py ./terraform --strict
"""
import argparse
import json
import os
import re
import sys
from pathlib import Path
# --- Demo Terraform File ---
DEMO_TF = """
provider "aws" {
  region     = "us-east-1"
  access_key = "AKIAIOSFODNN7EXAMPLE"
  secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}

variable "db_password" {
  type    = string
  default = "supersecret123"
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t3.micro"
  tags = {
    Name = "web-server"
  }
}

resource "aws_security_group" "web" {
  name = "web-sg"
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_iam_policy" "admin" {
  name = "admin-policy"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = "*"
        Resource = "*"
      }
    ]
  })
}

resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"
}

resource "aws_db_instance" "main" {
  engine              = "mysql"
  instance_class      = "db.t3.micro"
  password            = "hardcoded-password"
  publicly_accessible = true
  skip_final_snapshot = true
}
"""
# --- Security Rules ---
SECRET_PATTERNS = [
{
"id": "SEC001",
"name": "aws_access_key",
"severity": "critical",
"pattern": r'(?:access_key|aws_access_key_id)\s*=\s*"(AKIA[A-Z0-9]{16})"',
"message": "AWS access key hardcoded in configuration",
"fix": "Use environment variables, AWS profiles, or IAM roles instead",
},
{
"id": "SEC002",
"name": "aws_secret_key",
"severity": "critical",
"pattern": r'(?:secret_key|aws_secret_access_key)\s*=\s*"[A-Za-z0-9/+=]{40}"',
"message": "AWS secret key hardcoded in configuration",
"fix": "Use environment variables, AWS profiles, or IAM roles instead",
},
{
"id": "SEC003",
"name": "generic_password",
"severity": "critical",
"pattern": r'(?:password|passwd)\s*=\s*"[^"]{4,}"',
"message": "Password hardcoded in resource or provider configuration",
"fix": "Use a variable with sensitive = true, or fetch from Vault/SSM/Secrets Manager",
},
{
"id": "SEC004",
"name": "generic_secret",
"severity": "critical",
"pattern": r'(?:secret|token|api_key)\s*=\s*"[^"]{8,}"',
"message": "Secret or token hardcoded in configuration",
"fix": "Use a sensitive variable or secrets manager",
},
{
"id": "SEC005",
"name": "private_key",
"severity": "critical",
"pattern": r'-----BEGIN (?:RSA |EC |DSA )?PRIVATE KEY-----',
"message": "Private key embedded in Terraform configuration",
"fix": "Reference key file with file() function or use secrets manager",
},
]
IAM_PATTERNS = [
{
"id": "SEC010",
"name": "iam_wildcard_action",
"severity": "critical",
"pattern": r'Action\s*=\s*"\*"',
"message": "IAM policy with wildcard Action = \"*\" — grants all permissions",
"fix": "Scope Action to specific services and operations",
},
{
"id": "SEC011",
"name": "iam_wildcard_resource",
"severity": "high",
"pattern": r'Resource\s*=\s*"\*"',
"message": "IAM policy with wildcard Resource = \"*\" — applies to all resources",
"fix": "Scope Resource to specific ARN patterns",
},
{
"id": "SEC012",
"name": "iam_star_star",
"severity": "critical",
"pattern": r'Action\s*=\s*"\*"[^}]*Resource\s*=\s*"\*"',
"message": "IAM policy with Action=* AND Resource=* — effectively admin access",
"fix": "Follow least-privilege: grant only the specific actions and resources needed",
},
]
NETWORK_PATTERNS = [
{
"id": "SEC020",
"name": "sg_ssh_open",
"severity": "critical",
"pattern": None, # Custom check
"message": "Security group allows SSH (port 22) from 0.0.0.0/0",
"fix": "Restrict to known CIDR blocks, or use SSM Session Manager instead",
},
{
"id": "SEC021",
"name": "sg_rdp_open",
"severity": "critical",
"pattern": None, # Custom check
"message": "Security group allows RDP (port 3389) from 0.0.0.0/0",
"fix": "Restrict to known CIDR blocks, or use a bastion host",
},
{
"id": "SEC022",
"name": "sg_all_ports",
"severity": "critical",
"pattern": None, # Custom check
"message": "Security group allows all ports (0-65535) from 0.0.0.0/0",
"fix": "Open only the specific ports your application needs",
},
]
ENCRYPTION_PATTERNS = [
{
"id": "SEC030",
"name": "s3_no_encryption",
"severity": "high",
"pattern": None, # Custom check
"message": "S3 bucket without server-side encryption configuration",
"fix": "Add aws_s3_bucket_server_side_encryption_configuration resource",
},
{
"id": "SEC031",
"name": "rds_no_encryption",
"severity": "high",
"pattern": None, # Custom check
"message": "RDS instance without storage encryption",
"fix": "Set storage_encrypted = true on aws_db_instance",
},
{
"id": "SEC032",
"name": "ebs_no_encryption",
"severity": "medium",
"pattern": None, # Custom check
"message": "EBS volume without encryption",
"fix": "Set encrypted = true on aws_ebs_volume or enable account-level default encryption",
},
]
ACCESS_PATTERNS = [
{
"id": "SEC040",
"name": "rds_public",
"severity": "high",
"pattern": r'publicly_accessible\s*=\s*true',
"message": "RDS instance is publicly accessible",
"fix": "Set publicly_accessible = false and access via VPC/bastion",
},
{
"id": "SEC041",
"name": "s3_public_acl",
"severity": "high",
"pattern": r'acl\s*=\s*"public-read(?:-write)?"',
"message": "S3 bucket with public ACL",
"fix": "Remove public ACL and add aws_s3_bucket_public_access_block",
},
]
def find_tf_files(directory):
    """Find all .tf files in a directory (non-recursive)."""
    tf_files = {}
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".tf"):
            filepath = os.path.join(directory, entry)
            with open(filepath, encoding="utf-8") as f:
                tf_files[entry] = f.read()
    return tf_files


def check_regex_rules(content, rules):
    """Run regex-based security rules against content."""
    findings = []
    for rule in rules:
        if rule["pattern"] is None:
            continue
        for match in re.finditer(rule["pattern"], content, re.MULTILINE | re.IGNORECASE):
            findings.append({
                "id": rule["id"],
                "severity": rule["severity"],
                "message": rule["message"],
                "fix": rule["fix"],
                "line": match.group(0).strip()[:80],
            })
    return findings


def check_security_groups(content):
    """Custom check for security groups with ingress open to the internet."""
    findings = []
    rules_by_id = {r["id"]: r for r in NETWORK_PATTERNS}
    # Parse ingress blocks within security group resources
    sg_blocks = re.finditer(
        r'resource\s+"aws_security_group"\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.DOTALL,
    )
    for sg_match in sg_blocks:
        sg_name = sg_match.group(1)
        sg_body = sg_match.group(2)
        for ingress in re.finditer(r'ingress\s*\{(.*?)\}', sg_body, re.DOTALL):
            block = ingress.group(1)
            open_cidrs = [c for c in ('0.0.0.0/0', '::/0') if c in block]
            if not open_cidrs:
                continue
            from_port_match = re.search(r'from_port\s*=\s*(\d+)', block)
            to_port_match = re.search(r'to_port\s*=\s*(\d+)', block)
            if not (from_port_match and to_port_match):
                continue
            from_port = int(from_port_match.group(1))
            to_port = int(to_port_match.group(1))
            matched = []
            if from_port <= 22 <= to_port:  # SSH open
                matched.append(("SEC020", "port 22"))
            if from_port <= 3389 <= to_port:  # RDP open
                matched.append(("SEC021", "port 3389"))
            if from_port == 0 and to_port >= 65535:  # All ports open
                matched.append(("SEC022", "ports 0-65535"))
            for rule_id, ports in matched:
                rule = rules_by_id[rule_id]
                findings.append({
                    "id": rule["id"],
                    "severity": rule["severity"],
                    "message": rule["message"],
                    "fix": rule["fix"],
                    # Include the resource name so findings from different
                    # security groups are not merged by the (id, line) dedup.
                    "line": f"aws_security_group.{sg_name} ingress {ports}, "
                            f"cidr {', '.join(open_cidrs)}",
                })
    return findings


def check_encryption(content):
    """Custom check for missing encryption on storage resources."""
    findings = []

    # S3 buckets without encryption
    s3_buckets = re.findall(
        r'resource\s+"aws_s3_bucket"\s+"([^"]+)"', content
    )
    s3_encryption = re.findall(
        r'resource\s+"aws_s3_bucket_server_side_encryption_configuration"', content
    )
    # Also check inline encryption (older format)
    inline_encryption = re.findall(
        r'server_side_encryption_configuration', content
    )
    if s3_buckets and not s3_encryption and not inline_encryption:
        rule = next(r for r in ENCRYPTION_PATTERNS if r["id"] == "SEC030")
        for bucket in s3_buckets:
            findings.append({
                "id": rule["id"],
                "severity": rule["severity"],
                "message": f"{rule['message']} (bucket: {bucket})",
                "fix": rule["fix"],
                "line": f'aws_s3_bucket.{bucket}',
            })

    # RDS without encryption
    rds_blocks = re.finditer(
        r'resource\s+"aws_db_instance"\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.DOTALL,
    )
    for rds_match in rds_blocks:
        name = rds_match.group(1)
        body = rds_match.group(2)
        if 'storage_encrypted' not in body or re.search(
            r'storage_encrypted\s*=\s*false', body
        ):
            rule = next(r for r in ENCRYPTION_PATTERNS if r["id"] == "SEC031")
            findings.append({
                "id": rule["id"],
                "severity": rule["severity"],
                "message": f"{rule['message']} (instance: {name})",
                "fix": rule["fix"],
                "line": f'aws_db_instance.{name}',
            })

    # EBS volumes without encryption
    ebs_blocks = re.finditer(
        r'resource\s+"aws_ebs_volume"\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.DOTALL,
    )
    for ebs_match in ebs_blocks:
        name = ebs_match.group(1)
        body = ebs_match.group(2)
        if 'encrypted' not in body or re.search(
            r'encrypted\s*=\s*false', body
        ):
            rule = next(r for r in ENCRYPTION_PATTERNS if r["id"] == "SEC032")
            findings.append({
                "id": rule["id"],
                "severity": rule["severity"],
                "message": f"{rule['message']} (volume: {name})",
                "fix": rule["fix"],
                "line": f'aws_ebs_volume.{name}',
            })
    return findings


def check_sensitive_variables(content):
    """Check if variables that look like secrets are marked sensitive."""
    findings = []
    var_blocks = re.finditer(
        r'variable\s+"([^"]+)"\s*\{(.*?)\n\}',
        content,
        re.DOTALL,
    )
    secret_names = ["password", "secret", "token", "api_key", "private_key", "credentials"]
    for var_match in var_blocks:
        name = var_match.group(1)
        body = var_match.group(2)
        name_lower = name.lower()
        if any(s in name_lower for s in secret_names):
            if not re.search(r'sensitive\s*=\s*true', body):
                findings.append({
                    "id": "SEC050",
                    "severity": "medium",
                    "message": f"Variable '{name}' appears to be a secret but is not marked sensitive = true",
                    "fix": "Add sensitive = true to prevent the value from appearing in logs and plan output",
                    "line": f'variable "{name}"',
                })
            # Check for hardcoded default (the pattern only matches non-empty strings)
            default_match = re.search(r'default\s*=\s*"([^"]+)"', body)
            if default_match:
                findings.append({
                    "id": "SEC051",
                    "severity": "critical",
                    "message": f"Variable '{name}' has a hardcoded default value for a secret",
                    "fix": "Remove the default value and require it to be passed at runtime via tfvars or env",
                    "line": f'variable "{name}" default = "{default_match.group(1)[:20]}..."',
                })
    return findings


def scan_content(content, strict=False):
    """Run all security checks on content."""
    findings = []
    findings.extend(check_regex_rules(content, SECRET_PATTERNS))
    findings.extend(check_regex_rules(content, IAM_PATTERNS))
    findings.extend(check_regex_rules(content, ACCESS_PATTERNS))
    findings.extend(check_security_groups(content))
    findings.extend(check_encryption(content))
    findings.extend(check_sensitive_variables(content))

    # Strict mode raises medium findings to high and low findings to medium.
    if strict:
        for f in findings:
            if f["severity"] == "medium":
                f["severity"] = "high"
            elif f["severity"] == "low":
                f["severity"] = "medium"

    # Deduplicate by (id, line)
    seen = set()
    unique = []
    for f in findings:
        key = (f["id"], f.get("line", ""))
        if key not in seen:
            seen.add(key)
            unique.append(f)
    findings = unique

    # Sort by severity
    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    findings.sort(key=lambda f: severity_order.get(f["severity"], 4))
    return findings


def generate_report(content, output_format="text", strict=False):
    """Generate security scan report."""
    findings = scan_content(content, strict)

    # Score: start at 100 and subtract a fixed deduction per finding.
    deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2}
    score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings))
    counts = {
        "critical": sum(1 for f in findings if f["severity"] == "critical"),
        "high": sum(1 for f in findings if f["severity"] == "high"),
        "medium": sum(1 for f in findings if f["severity"] == "medium"),
        "low": sum(1 for f in findings if f["severity"] == "low"),
    }
    result = {
        "score": score,
        "findings": findings,
        "finding_counts": counts,
        "total_findings": len(findings),
    }

    if output_format == "json":
        print(json.dumps(result, indent=2))
        return result

    # Text output
    print(f"\n{'=' * 60}")
    print(" Terraform Security Scan Report")
    print(f"{'=' * 60}")
    print(f" Score: {score}/100")
    print()
    print(f" Findings: {counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low")
    print(f"{'─' * 60}")
    for f in findings:
        icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?")
        print(f"\n [{f['id']}] {icon} {f['severity'].upper()}")
        print(f" {f['message']}")
        if f.get("line"):
            print(f" Match: {f['line']}")
        print(f" Fix: {f['fix']}")
    if not findings:
        print("\n No security issues found. Configuration looks clean.")
    print(f"\n{'=' * 60}\n")
    return result


def main():
    parser = argparse.ArgumentParser(
        description="terraform-patterns: Terraform security scanner"
    )
    parser.add_argument(
        "target", nargs="?",
        help="Path to Terraform directory or .tf file (omit for demo)",
    )
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--strict",
        action="store_true",
        help="Strict mode: raise medium findings to high and low findings to medium",
    )
    args = parser.parse_args()

    if args.target:
        target = Path(args.target)
        if target.is_dir():
            tf_files = find_tf_files(str(target))
            if not tf_files:
                print(f"Error: No .tf files found in {args.target}", file=sys.stderr)
                sys.exit(1)
            content = "\n".join(tf_files.values())
        elif target.is_file() and target.suffix == ".tf":
            content = target.read_text(encoding="utf-8")
        else:
            print(f"Error: {args.target} is not a directory or .tf file", file=sys.stderr)
            sys.exit(1)
    else:
        print("No target provided. Running demo scan...\n")
        content = DEMO_TF

    generate_report(content, args.output, args.strict)


if __name__ == "__main__":
    main()
Helm chart development agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw — chart scaffolding, values design, template patterns, dep...
---
name: "helm-chart-builder"
description: "Helm chart development agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw — chart scaffolding, values design, template patterns, dependency management, security hardening, and chart testing. Use when: user wants to create or improve Helm charts, design values.yaml files, implement template helpers, audit chart security (RBAC, network policies, pod security), manage subcharts, or run helm lint/test."
license: MIT
metadata:
version: 1.0.0
author: Alireza Rezvani
category: engineering
updated: 2026-03-15
---
# Helm Chart Builder
> Production-grade Helm charts. Sensible defaults. Secure by design. No cargo-culting.
Opinionated Helm workflow that turns ad-hoc Kubernetes manifests into maintainable, testable, reusable charts. Covers chart structure, values design, template patterns, dependency management, and security hardening.
Not a Helm tutorial — a set of concrete decisions about how to build charts that operators trust and developers don't fight.
---
## Slash Commands
| Command | What it does |
|---------|-------------|
| `/helm:create` | Scaffold a production-ready Helm chart with best-practice structure |
| `/helm:review` | Analyze an existing chart for issues — missing labels, hardcoded values, template anti-patterns |
| `/helm:security` | Audit chart for security issues — RBAC, network policies, pod security, secrets handling |
---
## When This Skill Activates
Recognize these patterns from the user:
- "Create a Helm chart for this service"
- "Review my Helm chart"
- "Is this chart secure?"
- "Design a values.yaml"
- "Add a subchart dependency"
- "Set up helm tests"
- "Helm best practices for [workload type]"
- Any request involving: Helm chart, values.yaml, Chart.yaml, templates, helpers, _helpers.tpl, subcharts, helm lint, helm test
If the user has a Helm chart or wants to package Kubernetes resources → this skill applies.
---
## Workflow
### `/helm:create` — Chart Scaffolding
1. **Identify workload type**
- Web service (Deployment + Service + Ingress)
- Worker (Deployment, no Service)
- CronJob (CronJob + ServiceAccount)
- Stateful service (StatefulSet + PVC + Headless Service)
- Library chart (named templates only, no rendered resources)
2. **Scaffold chart structure**
```
mychart/
├── Chart.yaml # Chart metadata and dependencies
├── values.yaml # Default configuration
├── values.schema.json # Optional: JSON Schema for values validation
├── .helmignore # Files to exclude from packaging
├── templates/
│ ├── _helpers.tpl # Named templates and helper functions
│ ├── deployment.yaml # Workload resource
│ ├── service.yaml # Service exposure
│ ├── ingress.yaml # Ingress (if applicable)
│ ├── serviceaccount.yaml # ServiceAccount
│ ├── hpa.yaml # HorizontalPodAutoscaler
│ ├── pdb.yaml # PodDisruptionBudget
│ ├── networkpolicy.yaml # NetworkPolicy
│ ├── configmap.yaml # ConfigMap (if needed)
│ ├── secret.yaml # Secret (if needed)
│ ├── NOTES.txt # Post-install usage instructions
│ └── tests/
│ └── test-connection.yaml
└── charts/ # Subcharts (dependencies)
```
3. **Apply Chart.yaml best practices**
```
METADATA
├── apiVersion: v2 (Helm 3 only — never v1)
├── name: matches directory name exactly
├── version: semver (chart version, not app version)
├── appVersion: application version string
├── description: one-line summary of what the chart deploys
└── type: application (or library for shared helpers)
DEPENDENCIES
├── Pin dependency versions with ~X.Y.Z (patch-level float)
├── Use condition field to make subcharts optional
├── Use alias for multiple instances of same subchart
└── Run helm dependency update after changes
```
4. **Generate values.yaml with documentation**
- Every value has an inline comment explaining purpose and type
- Sensible defaults that work for development
- Override-friendly structure (flat where possible, nested only when logical)
- No hardcoded cluster-specific values (image registry, domain, storage class)
5. **Validate**
```bash
python3 scripts/chart_analyzer.py mychart/
helm lint mychart/
helm template mychart/ --debug
```
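
The Chart.yaml rules in step 3 can be sketched as a small check over already-parsed metadata. This is an illustrative sketch only, not the logic of `scripts/chart_analyzer.py`: the function name `check_chart_metadata`, the simplified semver regex, and the assumption that Chart.yaml has already been loaded into a dict are all hypothetical.

```python
import re

# Simplified semver pattern (assumption): MAJOR.MINOR.PATCH plus optional
# pre-release and build metadata. Not a full implementation of the spec.
SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def check_chart_metadata(metadata, directory_name):
    """Return a list of problems found in a parsed Chart.yaml mapping."""
    problems = []
    if metadata.get("apiVersion") != "v2":
        problems.append("apiVersion should be v2")
    if metadata.get("name") != directory_name:
        problems.append("name should match the chart directory name")
    if not SEMVER_RE.match(str(metadata.get("version", ""))):
        problems.append("version should be a semver string")
    for field in ("appVersion", "description"):
        if not metadata.get(field):
            problems.append(f"{field} is missing")
    if metadata.get("type", "application") not in ("application", "library"):
        problems.append("type should be application or library")
    return problems


# Hypothetical chart metadata with three deliberate mistakes.
chart = {"apiVersion": "v1", "name": "mychart", "version": "1.0", "appVersion": "2.5.0"}
for problem in check_chart_metadata(chart, "mychart"):
    print(problem)
```

`helm lint` remains the authoritative check; a sketch like this is only useful for catching the conventions above before linting.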
### `/helm:review` — Chart Analysis
1. **Check chart structure**
| Check | Severity | Fix |
|-------|----------|-----|
| Missing _helpers.tpl | High | Create helpers for common labels and selectors |
| No NOTES.txt | Medium | Add post-install instructions |
| No .helmignore | Low | Create one to exclude .git, CI files, tests |
| Missing Chart.yaml fields | Medium | Add description, appVersion, maintainers |
| Hardcoded values in templates | High | Extract to values.yaml with defaults |
2. **Check template quality**
| Check | Severity | Fix |
|-------|----------|-----|
| Missing standard labels | High | Use `app.kubernetes.io/*` labels via _helpers.tpl |
| No resource requests/limits | Critical | Add resources section with defaults in values.yaml |
| Hardcoded image tag | High | Use `{{ .Values.image.repository }}:{{ .Values.image.tag }}` |
| No imagePullPolicy | Medium | Default to `IfNotPresent`, overridable |
| Missing liveness/readiness probes | High | Add probes with configurable paths and ports |
| No pod anti-affinity | Medium | Add preferred anti-affinity for HA |
| Duplicate template code | Medium | Extract into named templates in _helpers.tpl |
3. **Check values.yaml quality**
```bash
python3 scripts/values_validator.py mychart/values.yaml
```
4. **Generate review report**
```
HELM CHART REVIEW — [chart name]
Date: [timestamp]
CRITICAL: [count]
HIGH: [count]
MEDIUM: [count]
LOW: [count]
[Detailed findings with fix recommendations]
```
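
The severity counts in the report template above can be produced from a list of findings. A minimal sketch, assuming each finding is a dict with hypothetical `severity`, `check`, and `fix` keys (the skill does not prescribe a data model):

```python
from collections import Counter

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def summarize_findings(findings):
    """Return report lines: one count per severity, then findings by severity."""
    counts = Counter(f["severity"].upper() for f in findings)
    lines = [f"{severity}: {counts.get(severity, 0)}" for severity in SEVERITY_ORDER]
    # Unknown severities raise ValueError here; this sketch accepts only the four above.
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"].upper()))
    for f in ranked:
        lines.append(f"[{f['severity'].upper()}] {f['check']} -> {f['fix']}")
    return lines


# Two rows taken from the review tables above.
findings = [
    {"severity": "Medium", "check": "No NOTES.txt", "fix": "Add post-install instructions"},
    {"severity": "Critical", "check": "No resource requests/limits",
     "fix": "Add resources section with defaults in values.yaml"},
]
print("\n".join(summarize_findings(findings)))
```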
### `/helm:security` — Security Audit
1. **Pod security audit**
| Check | Severity | Fix |
|-------|----------|-----|
| No securityContext | Critical | Add runAsNonRoot, readOnlyRootFilesystem |
| Running as root | Critical | Set `runAsNonRoot: true`, `runAsUser: 1000` |
| Writable root filesystem | High | Set `readOnlyRootFilesystem: true` + emptyDir for tmp |
| All capabilities retained | High | Drop ALL, add only specific needed caps |
| Privileged container | Critical | Set `privileged: false`, use specific capabilities |
| No seccomp profile | Medium | Set `seccompProfile.type: RuntimeDefault` |
| allowPrivilegeEscalation true | High | Set `allowPrivilegeEscalation: false` |
2. **RBAC audit**
| Check | Severity | Fix |
|-------|----------|-----|
| No ServiceAccount | Medium | Create dedicated SA, don't use default |
| automountServiceAccountToken true | Medium | Set to false unless pod needs K8s API access |
| ClusterRole instead of Role | Medium | Use namespace-scoped Role unless cluster-wide needed |
| Wildcard permissions | Critical | Use specific resource names and verbs |
| No RBAC at all | Low | Acceptable if pod doesn't need K8s API access |
3. **Network and secrets audit**
| Check | Severity | Fix |
|-------|----------|-----|
| No NetworkPolicy | Medium | Add default-deny ingress + explicit allow rules |
| Secrets in values.yaml | Critical | Use external secrets operator or sealed-secrets |
| No PodDisruptionBudget | Medium | Add PDB with minAvailable for HA workloads |
| hostNetwork: true | High | Remove unless absolutely required (e.g., CNI plugin) |
| hostPID or hostIPC | Critical | Never use in application charts |
4. **Generate security report**
```
SECURITY AUDIT — [chart name]
Date: [timestamp]
CRITICAL: [count]
HIGH: [count]
MEDIUM: [count]
LOW: [count]
[Detailed findings with remediation steps]
```
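
The pod security checks in step 1 lend themselves to a mechanical pass over the rendered securityContext. The sketch below is an assumption-laden illustration, not the audit implementation: `audit_security_context` is a hypothetical name, the inputs are plain dicts for the pod-level and container-level securityContext, and an unset `allowPrivilegeEscalation` is treated as true.

```python
def audit_security_context(pod_sc, container_sc):
    """Check pod- and container-level securityContext dicts against the table."""
    if not pod_sc and not container_sc:
        return [("Critical", "No securityContext")]
    findings = []
    # Container-level settings override pod-level ones where both exist.
    run_as_non_root = container_sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False))
    if not run_as_non_root:
        # Simplification: without runAsNonRoot the container may run as root.
        findings.append(("Critical", "Running as root"))
    if not container_sc.get("readOnlyRootFilesystem", False):
        findings.append(("High", "Writable root filesystem"))
    if "ALL" not in container_sc.get("capabilities", {}).get("drop", []):
        findings.append(("High", "All capabilities retained"))
    if container_sc.get("privileged", False):
        findings.append(("Critical", "Privileged container"))
    if not (container_sc.get("seccompProfile") or pod_sc.get("seccompProfile")):
        findings.append(("Medium", "No seccomp profile"))
    if container_sc.get("allowPrivilegeEscalation", True):
        findings.append(("High", "allowPrivilegeEscalation true"))
    return findings


# Settings equivalent to the hardened pod spec in Pattern 3 produce no findings.
hardened_pod = {"runAsNonRoot": True, "runAsUser": 1000, "fsGroup": 1000,
                "seccompProfile": {"type": "RuntimeDefault"}}
hardened_container = {"allowPrivilegeEscalation": False, "readOnlyRootFilesystem": True,
                      "capabilities": {"drop": ["ALL"]}}
print(audit_security_context(hardened_pod, hardened_container))
```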
---
## Tooling
### `scripts/chart_analyzer.py`
CLI utility for static analysis of Helm chart directories.
**Features:**
- Chart structure validation (required files, directory layout)
- Template anti-pattern detection (hardcoded values, missing labels, no resource limits)
- Chart.yaml metadata checks
- Standard labels verification (app.kubernetes.io/*)
- Security baseline checks
- JSON and text output
**Usage:**
```bash
# Analyze a chart directory
python3 scripts/chart_analyzer.py mychart/
# JSON output
python3 scripts/chart_analyzer.py mychart/ --output json
# Security-focused analysis
python3 scripts/chart_analyzer.py mychart/ --security
```
### `scripts/values_validator.py`
CLI utility for validating values.yaml against best practices.
**Features:**
- Documentation coverage (inline comments)
- Type consistency checks
- Hardcoded secrets detection
- Default value quality analysis
- Structure depth analysis
- Naming convention validation
- JSON and text output
**Usage:**
```bash
# Validate values.yaml
python3 scripts/values_validator.py values.yaml
# JSON output
python3 scripts/values_validator.py values.yaml --output json
# Strict mode (fail on warnings)
python3 scripts/values_validator.py values.yaml --strict
```
---
## Template Patterns
### Pattern 1: Standard Labels (_helpers.tpl)
```yaml
{{/*
Common labels for all resources.
*/}}
{{- define "mychart.labels" -}}
helm.sh/chart: {{ include "mychart.chart" . }}
app.kubernetes.io/name: {{ include "mychart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels (subset of common labels — must be immutable).
*/}}
{{- define "mychart.selectorLabels" -}}
app.kubernetes.io/name: {{ include "mychart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
```
### Pattern 2: Conditional Resources
```yaml
{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  {{- if .Values.ingress.tls }}
  tls:
    {{- range .Values.ingress.tls }}
    - hosts:
        {{- range .hosts }}
        - {{ . | quote }}
        {{- end }}
      secretName: {{ .secretName }}
    {{- end }}
  {{- end }}
  rules:
    {{- range .Values.ingress.hosts }}
    - host: {{ .host | quote }}
      http:
        paths:
          {{- range .paths }}
          - path: {{ .path }}
            pathType: {{ .pathType }}
            backend:
              service:
                name: {{ include "mychart.fullname" $ }}
                port:
                  number: {{ $.Values.service.port }}
          {{- end }}
    {{- end }}
{{- end }}
```
### Pattern 3: Security-Hardened Pod Spec
```yaml
spec:
  serviceAccountName: {{ include "mychart.serviceAccountName" . }}
  automountServiceAccountToken: false
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: {{ .Chart.Name }}
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
      image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
      imagePullPolicy: {{ .Values.image.pullPolicy }}
      resources:
        {{- toYaml .Values.resources | nindent 8 }}
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}
```
---
## Values Design Principles
```
STRUCTURE
├── Flat over nested (image.tag > container.spec.image.tag)
├── Group by resource (service.*, ingress.*, resources.*)
├── Use enabled: true/false for optional resources
├── Document every key with inline YAML comments
└── Provide sensible development defaults
NAMING
├── camelCase for keys (replicaCount, not replica_count)
├── Boolean keys: use adjectives (enabled, required) not verbs
├── Nested keys: max 3 levels deep
└── Match upstream conventions (image.repository, image.tag, image.pullPolicy)
ANTI-PATTERNS
├── Hardcoded cluster URLs or domains
├── Secrets as default values
├── Empty strings where null is correct
├── Deeply nested structures (>3 levels)
├── Undocumented values
└── values.yaml that doesn't work without overrides
```
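
The naming and depth rules above are easy to check mechanically. The following is a minimal sketch, assuming values.yaml has already been parsed into nested dicts with string keys; `check_values` and its message wording are hypothetical and not taken from `scripts/values_validator.py`.

```python
import re

CAMEL_CASE = re.compile(r"^[a-z][a-zA-Z0-9]*$")
MAX_DEPTH = 3  # "Nested keys: max 3 levels deep"


def check_values(values, path=()):
    """Walk a parsed values mapping and report naming and nesting problems."""
    problems = []
    for key, value in values.items():
        key_path = path + (key,)
        dotted = ".".join(key_path)
        if not CAMEL_CASE.match(key):
            problems.append(f"{dotted}: key is not camelCase")
        if len(key_path) > MAX_DEPTH:
            problems.append(f"{dotted}: nested deeper than {MAX_DEPTH} levels")
        if isinstance(value, dict):
            problems.extend(check_values(value, key_path))
    return problems


values = {
    "replicaCount": 1,
    "replica_count": 1,
    "image": {"tag": "1.0"},
    "container": {"spec": {"image": {"tag": "1.0"}}},
}
for problem in check_values(values):
    print(problem)
```

A real check would likely need an allowlist: subchart keys follow the dependency name, which may legitimately contain hyphens.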
---
## Dependency Management
```
SUBCHARTS
├── Use Chart.yaml dependencies (not requirements.yaml — Helm 3)
├── Pin versions: version: ~15.5.0 (patch float: >=15.5.0, <15.6.0)
├── Use condition: to make optional: condition: postgresql.enabled
├── Use alias: for multiple instances of same chart
├── Override subchart values under subchart name key in values.yaml
└── Run helm dependency update before packaging
LIBRARY CHARTS
├── type: library in Chart.yaml; templates/ holds only named templates (_*.tpl files)
├── Export named templates only — no rendered resources
├── Use for shared labels, annotations, security contexts
└── Version independently from application charts
```
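
To make the patch-level float concrete: a `~X.Y.Z` constraint accepts versions from X.Y.Z up to, but not including, X.(Y+1).0. The sketch below models only that one case; it is a simplified illustration, not Helm's version resolver, and it ignores pre-release tags and partial versions. The function names are hypothetical.

```python
def parse_version(text):
    """Parse 'X.Y.Z' into a tuple of ints (no pre-release or build metadata)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def tilde_allows(constraint, candidate):
    """Return True if candidate satisfies ~X.Y.Z: >= X.Y.Z and < X.(Y+1).0."""
    if not constraint.startswith("~"):
        raise ValueError("this sketch only handles ~X.Y.Z constraints")
    lower = parse_version(constraint[1:])
    upper = (lower[0], lower[1] + 1, 0)
    return lower <= parse_version(candidate) < upper


for version in ("15.5.0", "15.5.7", "15.6.0", "15.4.9"):
    print(version, tilde_allows("~15.5.0", version))
```

Pinning this way picks up patch releases automatically while keeping minor and major upgrades a deliberate change to Chart.yaml.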
---
## Proactive Triggers
Flag these without being asked:
- **No _helpers.tpl** → Create one. Every chart needs standard labels and fullname helpers.
- **Hardcoded image tag in template** → Extract to values.yaml. Tags must be overridable.
- **No resource requests/limits** → Add them. Pods without limits can starve the node.
- **Running as root** → Add securityContext. No exceptions for production charts.
- **No NOTES.txt** → Create one. Users need post-install instructions.
- **Secrets in values.yaml defaults** → Remove them. Use placeholders with comments explaining how to provide secrets.
- **No liveness/readiness probes** → Add them. Kubernetes needs to know if the pod is healthy.
- **Missing app.kubernetes.io labels** → Add via _helpers.tpl. Required for proper resource tracking.
---
## Installation
### One-liner (any tool)
```bash
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/helm-chart-builder ~/.claude/skills/
```
### Multi-tool install
```bash
# Pass one tool name per run: codex, gemini, cursor, windsurf, or openclaw
./scripts/convert.sh --skill helm-chart-builder --tool codex
```
### OpenClaw
```bash
clawhub install cs-helm-chart-builder
```
---
## Related Skills
- **senior-devops** — Broader DevOps scope (CI/CD, IaC, monitoring). Complementary — use helm-chart-builder for chart-specific work, senior-devops for pipeline and infrastructure.
- **docker-development** — Container building. Complementary — docker-development builds the images, helm-chart-builder deploys them to Kubernetes.
- **ci-cd-pipeline-builder** — Pipeline construction. Complementary — helm-chart-builder defines the deployment artifact, ci-cd-pipeline-builder automates its delivery.
- **senior-security** — Application security. Complementary — helm-chart-builder covers Kubernetes-level security (RBAC, pod security), senior-security covers application-level threats.
FILE:references/chart-patterns.md
# Helm Chart Patterns Reference
## Standard Chart Structure
### Minimal Production Chart
```
mychart/
├── Chart.yaml
├── values.yaml
├── .helmignore
└── templates/
├── _helpers.tpl
├── deployment.yaml
├── service.yaml
├── serviceaccount.yaml
├── NOTES.txt
└── tests/
└── test-connection.yaml
```
### Full Production Chart
```
mychart/
├── Chart.yaml
├── values.yaml
├── values.schema.json # JSON Schema validation
├── .helmignore
├── templates/
│ ├── _helpers.tpl
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── serviceaccount.yaml
│ ├── hpa.yaml
│ ├── pdb.yaml
│ ├── networkpolicy.yaml
│ ├── configmap.yaml
│ ├── secret.yaml
│ ├── NOTES.txt
│ └── tests/
│ └── test-connection.yaml
└── charts/ # Managed by helm dependency update
```
---
## _helpers.tpl — Standard Helpers
Every chart needs these. Copy and adapt.
```yaml
{{/*
Expand the name of the chart.
*/}}
{{- define "mychart.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
Truncated at 63 chars because some Kubernetes name fields are limited.
*/}}
{{- define "mychart.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "mychart.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels.
*/}}
{{- define "mychart.labels" -}}
helm.sh/chart: {{ include "mychart.chart" . }}
{{ include "mychart.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels (immutable — used in matchLabels).
*/}}
{{- define "mychart.selectorLabels" -}}
app.kubernetes.io/name: {{ include "mychart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use.
*/}}
{{- define "mychart.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "mychart.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
```
### Why These Helpers Matter
- **Name truncation** — Kubernetes names max at 63 characters. Always trunc.
- **Selector labels separate from common labels** — selectors are immutable after creation. Adding `app.kubernetes.io/version` to selectors breaks upgrades.
- **nameOverride vs fullnameOverride** — `nameOverride` replaces the chart name portion, `fullnameOverride` replaces everything.
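
The naming rules can be easier to follow outside template syntax. This Python sketch mirrors the `mychart.fullname` helper shown above, including `trunc 63 | trimSuffix "-"`; the function names and the sample release names are illustrative only.

```python
def trunc_name(name, limit=63):
    """Mirror `trunc 63 | trimSuffix "-"`: cut to the limit, drop one trailing dash."""
    truncated = name[:limit]
    return truncated[:-1] if truncated.endswith("-") else truncated


def fullname(release_name, chart_name, name_override="", fullname_override=""):
    """Mirror the fullname helper: override wins, then avoid repeating the chart name."""
    if fullname_override:
        return trunc_name(fullname_override)
    name = name_override or chart_name
    if name in release_name:
        # The release name already contains the chart name, so do not append it again.
        return trunc_name(release_name)
    return trunc_name(f"{release_name}-{name}")


print(fullname("prod", "mychart"))
print(fullname("mychart-prod", "mychart"))
print(fullname("prod", "mychart", name_override="api"))
print(fullname("prod", "mychart", fullname_override="custom"))
print(len(fullname("a" * 62, "web")))
```

The last call shows why the trailing dash is trimmed: truncating `aaa…a-web` at 63 characters would otherwise leave a name ending in `-`, which Kubernetes rejects for DNS-style names.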
---
## Deployment Patterns
### Standard Web Service
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "mychart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "mychart.labels" . | nindent 8 }}
        {{- with .Values.podLabels }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
    spec:
      serviceAccountName: {{ include "mychart.serviceAccountName" . }}
      automountServiceAccountToken: false
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          {{- with .Values.volumeMounts }}
          volumeMounts:
            {{- toYaml . | nindent 12 }}
          {{- end }}
      {{- with .Values.volumes }}
      volumes:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```
### Worker (No Service)
```yaml
# Same as above but without ports, probes, or Service resource
# Use for background workers and queue consumers (scheduled jobs use a CronJob instead)
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          {{- with .Values.command }}
          command:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```
---
## Conditional Resource Patterns
### Optional Ingress
```yaml
{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  {{- with .Values.ingress.className }}
  ingressClassName: {{ . }}
  {{- end }}
  {{- if .Values.ingress.tls }}
  tls:
    {{- range .Values.ingress.tls }}
    - hosts:
        {{- range .hosts }}
        - {{ . | quote }}
        {{- end }}
      secretName: {{ .secretName }}
    {{- end }}
  {{- end }}
  rules:
    {{- range .Values.ingress.hosts }}
    - host: {{ .host | quote }}
      http:
        paths:
          {{- range .paths }}
          - path: {{ .path }}
            pathType: {{ .pathType }}
            backend:
              service:
                name: {{ include "mychart.fullname" $ }}
                port:
                  number: {{ $.Values.service.port }}
          {{- end }}
    {{- end }}
{{- end }}
```
### Optional HPA
```yaml
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "mychart.fullname" . }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
```
---
## PodDisruptionBudget
```yaml
{{- if .Values.pdb.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  {{- /* minAvailable and maxUnavailable are mutually exclusive; render only one. */}}
  {{- if .Values.pdb.minAvailable }}
  minAvailable: {{ .Values.pdb.minAvailable }}
  {{- else if .Values.pdb.maxUnavailable }}
  maxUnavailable: {{ .Values.pdb.maxUnavailable }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "mychart.selectorLabels" . | nindent 6 }}
{{- end }}
```
---
## Test Connection Template
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "mychart.fullname" . }}-test-connection"
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test
spec:
  containers:
    - name: wget
      image: busybox
      command: ['wget']
      args: ['{{ include "mychart.fullname" . }}:{{ .Values.service.port }}']
  restartPolicy: Never
```
---
## NOTES.txt Pattern
```
1. Get the application URL by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "mychart.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
kubectl get --namespace {{ .Release.Namespace }} svc {{ include "mychart.fullname" . }} -w
{{- else if contains "ClusterIP" .Values.service.type }}
kubectl --namespace {{ .Release.Namespace }} port-forward svc/{{ include "mychart.fullname" . }} {{ .Values.service.port }}:{{ .Values.service.port }}
echo "Visit http://127.0.0.1:{{ .Values.service.port }}"
{{- end }}
```
---
## Dependency Management
### Chart.yaml with Dependencies
```yaml
apiVersion: v2
name: myapp
version: 1.0.0
appVersion: "2.5.0"
dependencies:
  - name: postgresql
    version: ~15.5.0
    repository: https://charts.bitnami.com/bitnami
    condition: postgresql.enabled
  - name: redis
    version: ~19.0.0
    repository: https://charts.bitnami.com/bitnami
    condition: redis.enabled
  - name: common
    version: ~2.0.0
    repository: https://charts.bitnami.com/bitnami
    tags:
      - bitnami-common
```
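The `~15.5.0` style constraints above are tilde ranges: patch updates are accepted, minor bumps are not. A minimal Python sketch of that rule, assuming plain `MAJOR.MINOR.PATCH` versions with no pre-release or build suffix. The `parse_version` and `tilde_matches` helpers are hypothetical names for illustration, not part of Helm:

```python
def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


def tilde_matches(constraint, candidate):
    """Return True if candidate satisfies a '~MAJOR.MINOR.PATCH' constraint.

    '~15.5.0' is treated as '>=15.5.0, <15.6.0'.
    """
    base = parse_version(constraint.lstrip("~"))
    version = parse_version(candidate)
    upper = (base[0], base[1] + 1, 0)  # next minor release, exclusive
    return base <= version < upper


for candidate in ("15.5.0", "15.5.38", "15.6.0", "16.0.0"):
    print(candidate, tilde_matches("~15.5.0", candidate))
```

Pinning with a tilde keeps `helm dependency update` from pulling in a new minor release of a subchart without a deliberate change to Chart.yaml.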
### Overriding Subchart Values
```yaml
# values.yaml — subchart values go under the dependency name
postgresql:
  enabled: true
  auth:
    database: myapp
    username: myapp
  primary:
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
redis:
  enabled: false
```
### Commands
```bash
# Resolve dependencies, download them into charts/, and write Chart.lock
helm dependency update mychart/
# List dependencies
helm dependency list mychart/
# Rebuild charts/ from the existing Chart.lock without re-resolving versions
helm dependency build mychart/
```
---
## Troubleshooting Checklist
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| Template renders empty | `{{- if }}` guard evaluates to false, or wrong value path | Run `helm template --debug` and inspect the rendered output |
| Upgrade fails on selector change | Selector labels changed between versions | Never change selectorLabels — they're immutable |
| Values not applying | Wrong nesting in values override | Check indentation and key paths |
| Subchart not rendering | Missing `condition:` or dependency not updated | Run `helm dependency update` |
| Name too long | Kubernetes 63-char limit | Ensure `trunc 63` in _helpers.tpl |
| RBAC permission denied | ServiceAccount missing or wrong Role | Check SA exists and RoleBinding is correct |
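
The 63-character limit in the table is usually handled with `trunc 63 | trimSuffix "-"` in the name helper. A rough Python sketch of that behavior, assuming the helper follows the common `helm create` scaffold (release name joined with chart name unless the release name already contains it). The function names are hypothetical and only model the string handling:

```python
K8S_NAME_LIMIT = 63  # many Kubernetes names must fit in a DNS label


def truncate_name(name, limit=K8S_NAME_LIMIT):
    """Mimic `trunc 63 | trimSuffix "-"`: cut, then drop one trailing dash."""
    truncated = name[:limit]
    if truncated.endswith("-"):
        truncated = truncated[:-1]
    return truncated


def fullname(release_name, chart_name):
    """Assumed scaffold logic: avoid repeating the chart name in the result."""
    if chart_name in release_name:
        return truncate_name(release_name)
    return truncate_name(f"{release_name}-{chart_name}")


print(fullname("web", "mychart"))
print(len(fullname("a" * 70, "mychart")))
```

Dropping the trailing dash matters because a name cut at exactly the wrong position would otherwise end in `-`, which Kubernetes rejects.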
FILE:references/values-design.md
# Values.yaml Design Reference
## Design Principles
### 1. Every Value Is Documented
```yaml
# Bad — what does this mean?
replicaCount: 1
maxSurge: 25%
# Good — clear purpose, type, and constraints
# -- Number of pod replicas. Ignored when autoscaling.enabled is true.
replicaCount: 1
# -- Maximum number of pods above desired count during rolling update (int or percentage).
maxSurge: 25%
```
### 2. Sensible Defaults That Work
A user should be able to `helm install mychart .` with zero overrides and get a working deployment.
```yaml
# Bad — broken without override
image:
repository: "" # Fails: no image
tag: "" # Fails: no tag
# Good — works out of the box
image:
repository: nginx # Default image for development
tag: "" # Defaults to .Chart.AppVersion in template
pullPolicy: IfNotPresent
```
### 3. Flat Over Nested
```yaml
# Bad — 5 levels deep, painful to override
container:
  spec:
    security:
      context:
        runAsNonRoot: true
# Good — 2 levels, easy to override with --set
securityContext:
  runAsNonRoot: true
```
**Rule of thumb:** Max 3 levels of nesting. If you need more, redesign.
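
The nesting rule can be checked mechanically. A small Python sketch, assuming the values have already been loaded into nested dicts by a YAML parser (not shown). `max_depth` is a hypothetical helper, and lists are treated as leaves:

```python
def max_depth(values, depth=0):
    """Return the deepest level of nested mappings in a values dict."""
    if not isinstance(values, dict) or not values:
        return depth
    return max(max_depth(child, depth + 1) for child in values.values())


nested = {"container": {"spec": {"security": {"context": {"runAsNonRoot": True}}}}}
flat = {"securityContext": {"runAsNonRoot": True}}

for name, values in (("nested", nested), ("flat", flat)):
    depth = max_depth(values)
    verdict = "ok" if depth <= 3 else "too deep"
    print(name, depth, verdict)
```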
### 4. Group by Resource
```yaml
# Good — grouped by Kubernetes resource
service:
type: ClusterIP
port: 80
ingress:
enabled: false
className: ""
hosts: []
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 10
```
---
## Standard Values Structure
### Recommended Layout Order
```yaml
# -- Number of pod replicas
replicaCount: 1
# -- Override chart name
nameOverride: ""
# -- Override fully qualified app name
fullnameOverride: ""
image:
# -- Container image repository
repository: myapp
# -- Image pull policy
pullPolicy: IfNotPresent
# -- Image tag (defaults to .Chart.AppVersion)
tag: ""
# -- Image pull secrets for private registries
imagePullSecrets: []
serviceAccount:
# -- Create a ServiceAccount
create: true
# -- Annotations for the ServiceAccount
annotations: {}
# -- ServiceAccount name (generated from fullname if not set)
name: ""
# -- Automount the service account token
automount: false
# -- Pod annotations
podAnnotations: {}
# -- Additional pod labels
podLabels: {}
# -- Pod security context
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
# -- Container security context
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
service:
# -- Service type
type: ClusterIP
# -- Service port
port: 80
ingress:
# -- Enable ingress
enabled: false
# -- Ingress class name
className: ""
# -- Ingress annotations
annotations: {}
# -- Ingress hosts
hosts:
- host: chart-example.local
paths:
- path: /
pathType: ImplementationSpecific
# -- Ingress TLS configuration
tls: []
# -- Container resource requests and limits
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 256Mi
# -- Liveness probe configuration
livenessProbe:
httpGet:
path: /healthz
port: http
initialDelaySeconds: 15
periodSeconds: 20
# -- Readiness probe configuration
readinessProbe:
httpGet:
path: /readyz
port: http
initialDelaySeconds: 5
periodSeconds: 10
autoscaling:
# -- Enable horizontal pod autoscaler
enabled: false
# -- Minimum replicas
minReplicas: 1
# -- Maximum replicas
maxReplicas: 10
# -- Target CPU utilization percentage
targetCPUUtilizationPercentage: 80
# -- Target memory utilization percentage (optional)
# targetMemoryUtilizationPercentage: 80
pdb:
# -- Enable PodDisruptionBudget
enabled: false
# -- Minimum available pods
minAvailable: 1
# -- Maximum unavailable pods (alternative to minAvailable)
# maxUnavailable: 1
# -- Node selector constraints
nodeSelector: {}
# -- Tolerations for pod scheduling
tolerations: []
# -- Affinity rules for pod scheduling
affinity: {}
# -- Additional volumes
volumes: []
# -- Additional volume mounts
volumeMounts: []
```
---
## Anti-Patterns
### 1. Secrets in Default Values
```yaml
# BAD — secret visible in chart package, git history, Helm release
database:
password: "mysecretpassword"
apiKey: "sk-abc123"
# GOOD — empty defaults with documentation
database:
# -- Database password (required). Provide via --set or external secret.
password: ""
# -- API key. Use external-secrets or sealed-secrets in production.
apiKey: ""
```
### 2. Cluster-Specific Defaults
```yaml
# BAD — won't work on any other cluster
ingress:
host: app.my-company.internal
storageClass: gp3
registry: 123456789.dkr.ecr.us-east-1.amazonaws.com
# GOOD — generic defaults
ingress:
host: chart-example.local
storageClass: "" # Uses cluster default
image:
repository: myapp # Override for private registry
```
### 3. Boolean Naming
```yaml
# BAD: ad-hoc verb prefixes, inconsistent, and a negated flag
createServiceAccount: true
doAutoScale: false
skipTLS: true
# GOOD: grouped under the feature, with consistent create/enabled keys
serviceAccount:
  create: true  # "Is it created?" reads naturally
autoscaling:
  enabled: false  # "Is it enabled?" reads naturally
tls:
  insecureSkipVerify: false  # Matches Go/K8s convention
```
### 4. Undocumented Values
```yaml
# BAD — what are these? What types? What are valid options?
foo: bar
maxRetries: 3
mode: advanced
workers: 4
# GOOD — purpose, type, and constraints are clear
# -- Operation mode. Options: "simple", "advanced", "debug"
mode: advanced
# -- Number of background worker threads (1-16)
workers: 4
# -- Maximum retry attempts for failed API calls
maxRetries: 3
```
### 5. Empty String vs Null
```yaml
# BAD — ambiguous: is empty string intentional?
annotations: ""
nodeSelector: ""
# GOOD — null/empty map means "not set"
annotations: {}
nodeSelector: {}
# Or simply omit optional values
```
---
## Override Patterns
### Hierarchy (lowest to highest priority)
1. `values.yaml` in chart
2. Parent chart's `values.yaml` (for subcharts)
3. `-f custom-values.yaml` (left to right, last wins)
4. `--set key=value` (highest priority)
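
The precedence list above can be modeled as a chain of deep merges in which later layers win. A simplified Python sketch, assuming each layer is already a parsed dict. `merge_values` and `resolve_values` are hypothetical names, and the sketch does not model Helm's handling of `null` overrides:

```python
import copy


def merge_values(base, override):
    """Recursively merge override into a copy of base; override wins.

    Maps are merged key by key; lists and scalars are replaced wholesale.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_values(chart_defaults, value_files, set_overrides):
    """Apply -f layers left to right, then --set on top."""
    result = chart_defaults
    for layer in value_files:
        result = merge_values(result, layer)
    return merge_values(result, set_overrides)


defaults = {"replicaCount": 1, "image": {"repository": "myapp", "tag": ""}}
production = {"replicaCount": 3, "image": {"tag": "v2.1.0"}}
final = resolve_values(defaults, [production], {"replicaCount": 5})
print(final)
```

Because maps merge key by key, an override file only needs the keys it changes; `image.repository` survives even though the production layer sets only `image.tag`.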
### Common Override Scenarios
```bash
# Production override file
helm install myapp . -f values-production.yaml
# Quick override with --set
helm install myapp . --set replicaCount=3 --set image.tag=v2.1.0
# Multiple value files (last wins)
helm install myapp . -f values-base.yaml -f values-production.yaml -f values-secrets.yaml
```
### values-production.yaml Pattern
```yaml
# Production overrides only — don't repeat defaults
replicaCount: 3
image:
  tag: "v2.1.0"
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "2"
    memory: 1Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.example.com
```
---
## Type Safety with values.schema.json
### Basic Schema
```json
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"required": ["replicaCount", "image"],
"properties": {
"replicaCount": {
"type": "integer",
"minimum": 1,
"description": "Number of pod replicas"
},
"image": {
"type": "object",
"required": ["repository"],
"properties": {
"repository": {
"type": "string",
"minLength": 1,
"description": "Container image repository"
},
"tag": {
"type": "string",
"description": "Image tag"
},
"pullPolicy": {
"type": "string",
"enum": ["Always", "IfNotPresent", "Never"],
"description": "Image pull policy"
}
}
},
"service": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["ClusterIP", "NodePort", "LoadBalancer"],
"description": "Kubernetes service type"
},
"port": {
"type": "integer",
"minimum": 1,
"maximum": 65535,
"description": "Service port number"
}
}
}
}
}
```
### Why Use Schema
- **Fails fast** — `helm install` rejects invalid values before rendering templates
- **Documents types** — self-documenting valid options (enums, ranges)
- **IDE support** — editors can autocomplete and validate values files
- **CI safety** — catches typos in value overrides early
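
To make the fail-fast behavior concrete, here is a hand-rolled Python sketch that checks values against a trimmed copy of the basic schema. It is an illustration only: it supports just the keywords used above (`type`, `required`, `properties`, `enum`, `minimum`, `maximum`, `minLength`), and the `validate` function is hypothetical. Helm performs the real validation itself.

```python
TYPE_MAP = {
    "object": dict,
    "array": list,
    "string": str,
    "boolean": bool,
    "integer": int,
}

SCHEMA = {
    "type": "object",
    "required": ["replicaCount", "image"],
    "properties": {
        "replicaCount": {"type": "integer", "minimum": 1},
        "image": {
            "type": "object",
            "required": ["repository"],
            "properties": {
                "repository": {"type": "string", "minLength": 1},
                "pullPolicy": {"type": "string", "enum": ["Always", "IfNotPresent", "Never"]},
            },
        },
    },
}


def validate(value, schema, path="values"):
    """Return a list of error strings; an empty list means the value passed."""
    expected = schema.get("type")
    py_type = TYPE_MAP.get(expected)
    if py_type is not None:
        # bool is a subclass of int in Python, so reject it for "integer"
        is_bool_as_int = expected == "integer" and isinstance(value, bool)
        if not isinstance(value, py_type) or is_bool_as_int:
            return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: must be one of {schema['enum']}")
    if isinstance(value, int) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: must be >= {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: must be <= {schema['maximum']}")
    if isinstance(value, str) and len(value) < schema.get("minLength", 0):
        errors.append(f"{path}: must not be shorter than {schema['minLength']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], subschema, f"{path}.{key}"))
    return errors


bad = {"replicaCount": 0, "image": {"repository": "", "pullPolicy": "Sometimes"}}
for error in validate(bad, SCHEMA):
    print(error)
```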
---
## Testing Values
### helm lint
```bash
# Basic lint
helm lint mychart/
# Lint with override values
helm lint mychart/ -f values-production.yaml
# Lint with --set
helm lint mychart/ --set replicaCount=0 # Should fail schema
```
### helm template
```bash
# Render templates locally
helm template myrelease mychart/
# Render with overrides to verify
helm template myrelease mychart/ -f values-production.yaml
# Debug mode (verbose output; still prints the rendered manifests when the YAML is invalid)
helm template myrelease mychart/ --debug
# Render specific template
helm template myrelease mychart/ -s templates/deployment.yaml
```
### Checklist for New Values
| Check | Question |
|-------|----------|
| Documented? | Does the key have an inline comment? |
| Default works? | Can you run `helm install` without overriding anything? |
| Type clear? | Is it obvious if this is string, int, bool, list, map? |
| Overridable? | Can it be set with `--set`? (avoid deeply nested) |
| No secrets? | Are default values free of passwords/tokens? |
| camelCase? | Does it follow Helm naming convention? |
| Flat enough? | Is nesting 3 levels or less? |
FILE:scripts/chart_analyzer.py
#!/usr/bin/env python3
"""
helm-chart-builder: Chart Analyzer
Static analysis of Helm chart directories for structural issues, template
anti-patterns, missing labels, hardcoded values, and security baseline checks.
Usage:
python scripts/chart_analyzer.py mychart/
python scripts/chart_analyzer.py mychart/ --output json
python scripts/chart_analyzer.py mychart/ --security
"""
import argparse
import json
import re
import sys
from pathlib import Path
# --- Analysis Rules ---
REQUIRED_FILES = [
{"path": "Chart.yaml", "severity": "critical", "message": "Missing Chart.yaml — not a valid Helm chart"},
{"path": "values.yaml", "severity": "high", "message": "Missing values.yaml — chart has no configurable defaults"},
{"path": "templates/_helpers.tpl", "severity": "high", "message": "Missing _helpers.tpl — no shared label/name helpers"},
{"path": "templates/NOTES.txt", "severity": "medium", "message": "Missing NOTES.txt — no post-install instructions for users"},
{"path": ".helmignore", "severity": "low", "message": "Missing .helmignore — CI files, .git, tests may be packaged"},
]
CHART_YAML_CHECKS = [
{"field": "apiVersion", "severity": "critical", "message": "Missing apiVersion in Chart.yaml"},
{"field": "name", "severity": "critical", "message": "Missing name in Chart.yaml"},
{"field": "version", "severity": "critical", "message": "Missing version in Chart.yaml"},
{"field": "description", "severity": "medium", "message": "Missing description in Chart.yaml"},
{"field": "appVersion", "severity": "medium", "message": "Missing appVersion in Chart.yaml — operators won't know what app version is deployed"},
{"field": "type", "severity": "low", "message": "Missing type in Chart.yaml — defaults to 'application'"},
]
TEMPLATE_ANTI_PATTERNS = [
{
"id": "TP001",
"severity": "high",
"pattern": r'image:\s*["\']?[a-z][a-z0-9./-]+:[a-z0-9][a-z0-9._-]*["\']?\s*$',
"message": "Hardcoded image tag in template — must use .Values.image.repository and .Values.image.tag",
"fix": 'Use: image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"',
},
{
"id": "TP002",
"severity": "high",
"pattern": r'replicas:\s*\d+\s*$',
"message": "Hardcoded replica count — must be configurable via values",
"fix": "Use: replicas: {{ .Values.replicaCount }}",
},
{
"id": "TP003",
"severity": "medium",
"pattern": r'port:\s*\d+\s*$',
"message": "Hardcoded port number — should be configurable via values",
"fix": "Use: port: {{ .Values.service.port }}",
},
{
"id": "TP004",
"severity": "high",
"pattern": r'(?:name|namespace):\s*[a-z][a-z0-9-]+\s*$',
"message": "Hardcoded name/namespace — should use template helpers",
"fix": 'Use: name: {{ include "mychart.fullname" . }}',
},
{
"id": "TP005",
"severity": "medium",
"pattern": r'nodePort:\s*\d+',
"message": "Hardcoded nodePort — should be configurable or avoided",
"fix": "Use: nodePort: {{ .Values.service.nodePort }} with conditional",
},
]
SECURITY_CHECKS = [
{
"id": "SC001",
"severity": "critical",
"check": "no_security_context",
"message": "No securityContext found in any template — pods run as root with full capabilities",
"fix": "Add pod and container securityContext with runAsNonRoot, readOnlyRootFilesystem, drop ALL capabilities",
},
{
"id": "SC002",
"severity": "critical",
"check": "privileged_container",
"message": "Privileged container detected — full host access",
"fix": "Remove privileged: true. Use specific capabilities instead",
},
{
"id": "SC003",
"severity": "high",
"check": "no_run_as_non_root",
"message": "No runAsNonRoot: true — container may run as root",
"fix": "Add runAsNonRoot: true to pod securityContext",
},
{
"id": "SC004",
"severity": "high",
"check": "no_readonly_rootfs",
"message": "No readOnlyRootFilesystem — container filesystem is writable",
"fix": "Add readOnlyRootFilesystem: true and use emptyDir for writable paths",
},
{
"id": "SC005",
"severity": "medium",
"check": "no_network_policy",
"message": "No NetworkPolicy template — all pod-to-pod traffic allowed",
"fix": "Add a NetworkPolicy template with default-deny ingress and explicit allow rules",
},
{
"id": "SC006",
"severity": "medium",
"check": "automount_sa_token",
"message": "automountServiceAccountToken not set to false — pod can access K8s API",
"fix": "Set automountServiceAccountToken: false unless the pod needs K8s API access",
},
{
"id": "SC007",
"severity": "high",
"check": "host_network",
"message": "hostNetwork: true — pod shares host network namespace",
"fix": "Remove hostNetwork unless absolutely required (e.g., CNI plugin)",
},
{
"id": "SC008",
"severity": "critical",
"check": "host_pid_ipc",
"message": "hostPID or hostIPC enabled — pod can see host processes/IPC",
"fix": "Remove hostPID and hostIPC — never needed in application charts",
},
]
LABEL_PATTERNS = [
r"app\.kubernetes\.io/name",
r"app\.kubernetes\.io/instance",
r"app\.kubernetes\.io/version",
r"app\.kubernetes\.io/managed-by",
r"helm\.sh/chart",
]
# --- Demo Chart ---
DEMO_CHART_YAML = """apiVersion: v2
name: demo-app
version: 0.1.0
"""
DEMO_VALUES_YAML = """replicaCount: 1
image:
repository: nginx
tag: latest
pullPolicy: Always
service:
type: ClusterIP
port: 80
"""
DEMO_DEPLOYMENT = """apiVersion: apps/v1
kind: Deployment
metadata:
name: demo-app
spec:
replicas: 3
template:
spec:
containers:
- name: demo-app
image: nginx:1.25
ports:
- containerPort: 80
"""
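def parse_yaml_simple(content):
    """Parse top-level scalar `key: value` pairs from YAML (stdlib only).

    Indented lines are skipped so nested keys (for example a dependency's
    `version`) cannot overwrite the chart's own top-level fields.
    """
    result = {}
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Only top-level keys: skip nested mappings and list items
        if line[0] in " \t" or stripped.startswith("-"):
            continue
        if ":" in stripped:
            key, _, val = stripped.partition(":")
            key = key.strip()
            val = val.strip().strip("'\"")
            if val:
                result[key] = val
    return result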
def check_structure(chart_dir):
"""Check chart directory for required files."""
findings = []
for check in REQUIRED_FILES:
path = chart_dir / check["path"]
if not path.exists():
findings.append({
"id": "ST" + str(REQUIRED_FILES.index(check) + 1).zfill(3),
"severity": check["severity"],
"message": check["message"],
"fix": f"Create {check['path']}",
"file": check["path"],
})
return findings
def check_chart_yaml(chart_dir):
"""Validate Chart.yaml metadata."""
findings = []
chart_path = chart_dir / "Chart.yaml"
if not chart_path.exists():
return findings
content = chart_path.read_text(encoding="utf-8")
parsed = parse_yaml_simple(content)
for check in CHART_YAML_CHECKS:
if check["field"] not in parsed:
findings.append({
"id": "CY" + str(CHART_YAML_CHECKS.index(check) + 1).zfill(3),
"severity": check["severity"],
"message": check["message"],
"fix": f"Add '{check['field']}:' to Chart.yaml",
"file": "Chart.yaml",
})
# Check apiVersion value
if parsed.get("apiVersion") == "v1":
findings.append({
"id": "CY007",
"severity": "medium",
"message": "apiVersion: v1 is Helm 2 format — use v2 for Helm 3",
"fix": "Change apiVersion to v2",
"file": "Chart.yaml",
})
# Check version is semver
version = parsed.get("version", "")
if version and not re.match(r"^\d+\.\d+\.\d+", version):
findings.append({
"id": "CY008",
"severity": "high",
"message": f"Version '{version}' is not valid semver",
"fix": "Use semver format: MAJOR.MINOR.PATCH (e.g., 1.0.0)",
"file": "Chart.yaml",
})
return findings
def check_templates(chart_dir):
"""Scan templates for anti-patterns."""
findings = []
templates_dir = chart_dir / "templates"
if not templates_dir.exists():
return findings
template_files = list(templates_dir.glob("*.yaml")) + list(templates_dir.glob("*.yml")) + list(templates_dir.glob("*.tpl"))
all_content = ""
for tpl_file in template_files:
content = tpl_file.read_text(encoding="utf-8")
all_content += content + "\n"
rel_path = tpl_file.relative_to(chart_dir)
for rule in TEMPLATE_ANTI_PATTERNS:
# Skip patterns that would false-positive on template expressions
for match in re.finditer(rule["pattern"], content, re.MULTILINE):
line = match.group(0).strip()
# Skip if the line contains a template expression
if "{{" in line or "}}" in line:
continue
findings.append({
"id": rule["id"],
"severity": rule["severity"],
"message": rule["message"],
"fix": rule["fix"],
"file": str(rel_path),
"line": line[:80],
})
# Check for standard labels
helpers_file = templates_dir / "_helpers.tpl"
if helpers_file.exists():
helpers_content = helpers_file.read_text(encoding="utf-8")
for label_pattern in LABEL_PATTERNS:
if not re.search(label_pattern, helpers_content) and not re.search(label_pattern, all_content):
label_name = label_pattern.replace("\\.", ".")
findings.append({
"id": "LB001",
"severity": "high",
"message": f"Standard label '{label_name}' not found in helpers or templates",
"fix": f"Add {label_name} to the labels helper in _helpers.tpl",
"file": "templates/_helpers.tpl",
"line": "(label not found)",
})
# Check for resource limits
if "resources:" not in all_content and template_files:
findings.append({
"id": "TP006",
"severity": "critical",
"message": "No resource requests/limits in any template — pods can consume unlimited node resources",
"fix": "Add resources section: {{ toYaml .Values.resources | nindent 12 }}",
"file": "templates/",
"line": "(no resources block found)",
})
# Check for probes
if "livenessProbe" not in all_content and "readinessProbe" not in all_content and template_files:
has_deployment = any("Deployment" in f.read_text(encoding="utf-8") for f in template_files if f.suffix in (".yaml", ".yml"))
if has_deployment:
findings.append({
"id": "TP007",
"severity": "high",
"message": "No liveness/readiness probes — Kubernetes cannot detect unhealthy pods",
"fix": "Add livenessProbe and readinessProbe with configurable values",
"file": "templates/deployment.yaml",
"line": "(no probes found)",
})
return findings
def check_security(chart_dir):
"""Run security-focused checks."""
findings = []
templates_dir = chart_dir / "templates"
if not templates_dir.exists():
return findings
template_files = list(templates_dir.glob("*.yaml")) + list(templates_dir.glob("*.yml"))
all_content = ""
for tpl_file in template_files:
all_content += tpl_file.read_text(encoding="utf-8") + "\n"
for check in SECURITY_CHECKS:
triggered = False
if check["check"] == "no_security_context":
if "securityContext" not in all_content and template_files:
triggered = True
elif check["check"] == "privileged_container":
if re.search(r"privileged:\s*true", all_content):
triggered = True
elif check["check"] == "no_run_as_non_root":
if "securityContext" in all_content and "runAsNonRoot" not in all_content:
triggered = True
elif check["check"] == "no_readonly_rootfs":
if "securityContext" in all_content and "readOnlyRootFilesystem" not in all_content:
triggered = True
elif check["check"] == "no_network_policy":
np_file = templates_dir / "networkpolicy.yaml"
if not np_file.exists() and "NetworkPolicy" not in all_content:
triggered = True
elif check["check"] == "automount_sa_token":
if "automountServiceAccountToken" not in all_content and template_files:
triggered = True
elif check["check"] == "host_network":
if re.search(r"hostNetwork:\s*true", all_content):
triggered = True
elif check["check"] == "host_pid_ipc":
if re.search(r"host(?:PID|IPC):\s*true", all_content):
triggered = True
if triggered:
findings.append({
"id": check["id"],
"severity": check["severity"],
"message": check["message"],
"fix": check["fix"],
"file": "templates/",
})
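    # Check for secrets in values.yaml
    values_path = chart_dir / "values.yaml"
    if values_path.exists():
        values_content = values_path.read_text(encoding="utf-8")
        # Placeholders and empty containers are not real secrets
        placeholders = ("null", "~", "{}", "[]", "changeme", "CHANGEME", "TODO")
        # [ \t]* instead of \s* keeps the match on one line, so a parent key
        # such as `secret:` does not capture the first key of its child map
        secret_re = r"^([ \t]*\S*(?:password|secret|token|apiKey|api_key)[ \t]*:[ \t]*)(\S+)"
        for match in re.finditer(secret_re, values_content, re.MULTILINE | re.IGNORECASE):
            val = match.group(2).strip("'\"")
            # A value starting with "#" is an inline comment after an empty value
            if val and not val.startswith("#") and val not in placeholders:
                findings.append({
                    "id": "SC009",
                    "severity": "critical",
                    "message": f"Potential secret in values.yaml default: {match.group(0).strip()[:60]}",
                    "fix": "Remove default secret values. Use empty string or null with documentation",
                    "file": "values.yaml",
                    "line": match.group(0).strip()[:80],
                })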
return findings
def analyze_chart(chart_dir, output_format="text", security_focus=False):
"""Run full chart analysis."""
findings = []
findings.extend(check_structure(chart_dir))
findings.extend(check_chart_yaml(chart_dir))
findings.extend(check_templates(chart_dir))
    findings.extend(check_security(chart_dir))
    if security_focus:
        # Keep security findings plus any other critical/high findings
        security_ids = {"SC001", "SC002", "SC003", "SC004", "SC005", "SC006", "SC007", "SC008", "SC009"}
        security_severities = {"critical", "high"}
        findings = [f for f in findings if f["id"] in security_ids or f["severity"] in security_severities]
# Deduplicate
seen = set()
unique = []
for f in findings:
key = (f["id"], f.get("line", ""), f.get("file", ""))
if key not in seen:
seen.add(key)
unique.append(f)
findings = unique
# Sort by severity
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
findings.sort(key=lambda f: severity_order.get(f["severity"], 4))
# Score
deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2}
score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings))
counts = {
"critical": sum(1 for f in findings if f["severity"] == "critical"),
"high": sum(1 for f in findings if f["severity"] == "high"),
"medium": sum(1 for f in findings if f["severity"] == "medium"),
"low": sum(1 for f in findings if f["severity"] == "low"),
}
# Chart metadata
chart_yaml_path = chart_dir / "Chart.yaml"
chart_meta = parse_yaml_simple(chart_yaml_path.read_text(encoding="utf-8")) if chart_yaml_path.exists() else {}
result = {
"score": score,
"chart_name": chart_meta.get("name", chart_dir.name),
"chart_version": chart_meta.get("version", "unknown"),
"app_version": chart_meta.get("appVersion", "unknown"),
"findings": findings,
"finding_counts": counts,
}
if output_format == "json":
print(json.dumps(result, indent=2))
return result
# Text output
print(f"\n{'=' * 60}")
print(f" Helm Chart Analysis Report")
print(f"{'=' * 60}")
print(f" Score: {score}/100")
print(f" Chart: {result['chart_name']} v{result['chart_version']}")
print(f" App Version: {result['app_version']}")
print()
print(f" Findings: {counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low")
print(f"{'─' * 60}")
for f in findings:
icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?")
print(f"\n [{f['id']}] {icon} {f['severity'].upper()}")
print(f" {f['message']}")
if "file" in f:
print(f" File: {f['file']}")
if "line" in f:
print(f" Line: {f['line']}")
print(f" Fix: {f['fix']}")
if not findings:
print("\n No issues found. Chart looks good.")
print(f"\n{'=' * 60}\n")
return result
def run_demo(output_format="text", security_focus=False):
    """Write the demo chart to a temporary directory and analyze it."""
    import tempfile

    with tempfile.TemporaryDirectory() as tmpdir:
        chart_dir = Path(tmpdir) / "demo-app"
        chart_dir.mkdir()
        (chart_dir / "Chart.yaml").write_text(DEMO_CHART_YAML, encoding="utf-8")
        (chart_dir / "values.yaml").write_text(DEMO_VALUES_YAML, encoding="utf-8")
        templates_dir = chart_dir / "templates"
        templates_dir.mkdir()
        (templates_dir / "deployment.yaml").write_text(DEMO_DEPLOYMENT, encoding="utf-8")
        # Analyze while the temporary directory still exists
        return analyze_chart(chart_dir, output_format, security_focus)


def main():
    parser = argparse.ArgumentParser(
        description="helm-chart-builder: Helm chart static analyzer"
    )
    parser.add_argument("chartdir", nargs="?", help="Path to Helm chart directory (omit for demo)")
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--security",
        action="store_true",
        help="Security-focused analysis only",
    )
    args = parser.parse_args()
    if args.chartdir:
        chart_dir = Path(args.chartdir)
        if not chart_dir.is_dir():
            print(f"Error: Not a directory: {args.chartdir}", file=sys.stderr)
            sys.exit(1)
        analyze_chart(chart_dir, args.output, args.security)
    else:
        # Notice goes to stderr so that --output json stays parseable on stdout
        print("No chart directory provided. Running demo analysis...\n", file=sys.stderr)
        run_demo(args.output, args.security)


if __name__ == "__main__":
    main()
FILE:scripts/values_validator.py
#!/usr/bin/env python3
"""
helm-chart-builder: Values Validator
Validate values.yaml files against Helm best practices — documentation coverage,
type consistency, naming conventions, default quality, and security.
Usage:
python scripts/values_validator.py values.yaml
python scripts/values_validator.py values.yaml --output json
python scripts/values_validator.py values.yaml --strict
"""
import argparse
import json
import re
import sys
from pathlib import Path
# --- Demo values.yaml ---
DEMO_VALUES = """# Default values for demo-app
replicaCount: 1
image:
repository: nginx
tag: latest
pullPolicy: Always
service:
type: ClusterIP
port: 80
ingress:
enabled: false
resources: {}
PASSWORD: supersecret123
db_password: changeme
api-key: sk-12345
deeply:
nested:
structure:
that:
goes:
too:
deep: true
undocumented_value: something
AnotherValue: 42
snake_case_key: bad
"""
# --- Validation Rules ---
NAMING_PATTERN = re.compile(r"^[a-z][a-zA-Z0-9]*$") # camelCase
SNAKE_CASE_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$") # snake_case
UPPER_CASE_PATTERN = re.compile(r"^[A-Z]") # Starts with uppercase
SECRET_KEY_PATTERNS = [
re.compile(r"(?:password|secret|token|apiKey|api_key|api-key|private_key|credentials)", re.IGNORECASE),
]
KNOWN_STRUCTURES = {
"image": ["repository", "tag", "pullPolicy"],
"service": ["type", "port"],
"ingress": ["enabled"],
"resources": [],
"serviceAccount": ["create", "name"],
"autoscaling": ["enabled", "minReplicas", "maxReplicas"],
}
def parse_values(content):
"""Parse values.yaml into structured data with metadata.
Returns a list of entries with key paths, values, depth, and comment info.
"""
entries = []
key_stack = []
indent_stack = [0]
prev_comment = None
for line_num, line in enumerate(content.splitlines(), 1):
stripped = line.strip()
# Track comments for documentation coverage
if stripped.startswith("#"):
prev_comment = stripped
continue
if not stripped:
prev_comment = None
continue
indent = len(line) - len(line.lstrip())
# Pop stack for dedented lines
while len(indent_stack) > 1 and indent <= indent_stack[-1]:
indent_stack.pop()
if key_stack:
key_stack.pop()
# Parse key: value
match = re.match(r"^(\S+)\s*:\s*(.*)", stripped)
if match and not stripped.startswith("-"):
key = match.group(1)
raw_value = match.group(2).strip()
# Check for inline comment
inline_comment = None
if "#" in raw_value:
val_part, _, comment_part = raw_value.partition("#")
raw_value = val_part.strip()
inline_comment = comment_part.strip()
# Build full key path
full_path = ".".join(key_stack + [key])
depth = len(key_stack) + 1
# Determine value type
value_type = "unknown"
if not raw_value or raw_value == "":
value_type = "map"
key_stack.append(key)
indent_stack.append(indent)
elif raw_value in ("true", "false"):
value_type = "boolean"
elif raw_value == "null" or raw_value == "~":
value_type = "null"
elif raw_value == "{}":
value_type = "empty_map"
elif raw_value == "[]":
value_type = "empty_list"
elif re.match(r"^-?\d+$", raw_value):
value_type = "integer"
elif re.match(r"^-?\d+\.\d+$", raw_value):
value_type = "float"
elif raw_value.startswith('"') or raw_value.startswith("'"):
value_type = "string"
else:
value_type = "string"
has_doc = prev_comment is not None or inline_comment is not None
entries.append({
"key": key,
"full_path": full_path,
"value": raw_value,
"value_type": value_type,
"depth": depth,
"line": line_num,
"has_documentation": has_doc,
"comment": prev_comment or inline_comment,
})
prev_comment = None
else:
prev_comment = None
return entries
def validate_naming(entries):
"""Check key naming conventions."""
findings = []
for entry in entries:
key = entry["key"]
        # Parent (map) keys are checked as well: they should also be camelCase
if SNAKE_CASE_PATTERN.match(key):
findings.append({
"severity": "medium",
"category": "naming",
"message": f"Key '{entry['full_path']}' uses snake_case — Helm convention is camelCase",
"fix": f"Rename to camelCase: {to_camel_case(key)}",
"line": entry["line"],
})
elif UPPER_CASE_PATTERN.match(key) and not key.isupper():
findings.append({
"severity": "medium",
"category": "naming",
"message": f"Key '{entry['full_path']}' starts with uppercase — use camelCase",
"fix": f"Rename: {key[0].lower() + key[1:]}",
"line": entry["line"],
})
elif "-" in key:
findings.append({
"severity": "medium",
"category": "naming",
"message": f"Key '{entry['full_path']}' uses kebab-case — Helm convention is camelCase",
"fix": f"Rename to camelCase: {to_camel_case(key)}",
"line": entry["line"],
})
return findings
def validate_documentation(entries):
    """Check documentation coverage."""
    findings = []
    total = len(entries)
    documented = sum(1 for e in entries if e["has_documentation"])
    if total > 0:
        coverage = (documented / total) * 100
        if coverage < 50:
            findings.append({
                "severity": "high",
                "category": "documentation",
                "message": f"Only {coverage:.0f}% of values have comments ({documented}/{total})",
                "fix": "Add inline YAML comments explaining purpose, type, and valid options for each value",
                "line": 0,
            })
        elif coverage < 80:
            findings.append({
                "severity": "medium",
                "category": "documentation",
                "message": f"{coverage:.0f}% documentation coverage ({documented}/{total}) — aim for 80%+",
                "fix": "Add comments for undocumented values",
                "line": 0,
            })
    # Flag specific undocumented top-level keys
    for entry in entries:
        if entry["depth"] == 1 and not entry["has_documentation"]:
            findings.append({
                "severity": "low",
                "category": "documentation",
                "message": f"Top-level key '{entry['key']}' has no comment",
                "fix": f"Add a comment above '{entry['key']}' explaining its purpose",
                "line": entry["line"],
            })
    return findings
def validate_defaults(entries):
    """Check default value quality."""
    findings = []
    for entry in entries:
        # Check for :latest tag
        if entry["key"] == "tag" and entry["value"] in ("latest", '"latest"', "'latest'"):
            findings.append({
                "severity": "high",
                "category": "defaults",
                "message": "image.tag defaults to 'latest' — not reproducible",
                "fix": "Use a specific version tag or reference .Chart.AppVersion in template",
                "line": entry["line"],
            })
        # Check pullPolicy
        if entry["key"] == "pullPolicy" and entry["value"] in ("Always", '"Always"', "'Always'"):
            findings.append({
                "severity": "low",
                "category": "defaults",
                "message": "imagePullPolicy defaults to 'Always' — 'IfNotPresent' is better for production",
                "fix": "Change default to IfNotPresent (Always is appropriate for :latest only)",
                "line": entry["line"],
            })
        # Check empty resources
        if entry["key"] == "resources" and entry["value_type"] == "empty_map":
            findings.append({
                "severity": "medium",
                "category": "defaults",
                "message": "resources defaults to {} — no requests or limits set",
                "fix": "Provide default resource requests (e.g., cpu: 100m, memory: 128Mi)",
                "line": entry["line"],
            })
    return findings
def validate_secrets(entries):
    """Check for secrets in default values."""
    findings = []
    for entry in entries:
        for pattern in SECRET_KEY_PATTERNS:
            if pattern.search(entry["full_path"]):
                val = entry["value"].strip("'\"")
                if val and val not in ("", "null", "~", "{}", "[]", "changeme", "CHANGEME", "TODO", '""', "''"):
                    findings.append({
                        "severity": "critical",
                        "category": "security",
                        "message": f"Potential secret with default value: {entry['full_path']} = {val[:30]}...",
                        "fix": "Remove default. Use empty string, null, or 'changeme' placeholder with comment",
                        "line": entry["line"],
                    })
                break
    return findings
def validate_depth(entries):
    """Check nesting depth."""
    findings = []
    max_depth = max((e["depth"] for e in entries), default=0)
    if max_depth > 4:
        deep_entries = [e for e in entries if e["depth"] > 4]
        for entry in deep_entries[:3]:  # Report first 3
            findings.append({
                "severity": "medium",
                "category": "structure",
                "message": f"Deeply nested key ({entry['depth']} levels): {entry['full_path']}",
                "fix": "Flatten structure — max 3-4 levels deep for usability",
                "line": entry["line"],
            })
    return findings
def to_camel_case(name):
    """Convert snake_case or kebab-case to camelCase."""
    parts = re.split(r"[-_]", name)
    return parts[0].lower() + "".join(p.capitalize() for p in parts[1:])
def generate_report(content, output_format="text", strict=False):
    """Generate full validation report."""
    entries = parse_values(content)
    findings = []
    findings.extend(validate_naming(entries))
    findings.extend(validate_documentation(entries))
    findings.extend(validate_defaults(entries))
    findings.extend(validate_secrets(entries))
    findings.extend(validate_depth(entries))
    if strict:
        # Elevate medium to high, low to medium
        for f in findings:
            if f["severity"] == "medium":
                f["severity"] = "high"
            elif f["severity"] == "low":
                f["severity"] = "medium"
    # Sort by severity
    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    findings.sort(key=lambda f: severity_order.get(f["severity"], 4))
    # Score
    deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2}
    score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings))
    counts = {
        "critical": sum(1 for f in findings if f["severity"] == "critical"),
        "high": sum(1 for f in findings if f["severity"] == "high"),
        "medium": sum(1 for f in findings if f["severity"] == "medium"),
        "low": sum(1 for f in findings if f["severity"] == "low"),
    }
    # Stats
    total_keys = len(entries)
    documented = sum(1 for e in entries if e["has_documentation"])
    max_depth = max((e["depth"] for e in entries), default=0)
    result = {
        "score": score,
        "total_keys": total_keys,
        "documented_keys": documented,
        "documentation_coverage": f"{(documented / total_keys * 100):.0f}%" if total_keys > 0 else "N/A",
        "max_depth": max_depth,
        "findings": findings,
        "finding_counts": counts,
    }
    if output_format == "json":
        print(json.dumps(result, indent=2))
        return result
    # Text output
    print(f"\n{'=' * 60}")
    print(f" Values.yaml Validation Report")
    print(f"{'=' * 60}")
    print(f" Score: {score}/100")
    print(f" Keys: {total_keys} | Documented: {documented} ({result['documentation_coverage']})")
    print(f" Max Depth: {max_depth}")
    print()
    print(f" Findings: {counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low")
    print(f"{'─' * 60}")
    for f in findings:
        icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?")
        print(f"\n {icon} {f['severity'].upper()} [{f['category']}]")
        print(f" {f['message']}")
        if f.get("line", 0) > 0:
            print(f" Line: {f['line']}")
        print(f" Fix: {f['fix']}")
    if not findings:
        print("\n No issues found. Values file looks good.")
    print(f"\n{'=' * 60}\n")
    return result
def main():
    parser = argparse.ArgumentParser(
        description="helm-chart-builder: values.yaml best-practice validator"
    )
    parser.add_argument("valuesfile", nargs="?", help="Path to values.yaml (omit for demo)")
    parser.add_argument(
        "--output", "-o",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--strict",
        action="store_true",
        help="Strict mode — elevate warnings to higher severity",
    )
    args = parser.parse_args()
    if args.valuesfile:
        path = Path(args.valuesfile)
        if not path.exists():
            print(f"Error: File not found: {args.valuesfile}", file=sys.stderr)
            sys.exit(1)
        content = path.read_text(encoding="utf-8")
    else:
        print("No values file provided. Running demo validation...\n")
        content = DEMO_VALUES
    generate_report(content, args.output, args.strict)
if __name__ == "__main__":
    main()
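The score printed by `generate_report` is plain arithmetic: start at 100, subtract a fixed penalty per finding, and clamp at zero. The sketch below reproduces that calculation in isolation so the effect of `--strict` is easy to see. The deduction table and the strict-mode elevation are taken from the script above; the names `score_findings` and `STRICT_ELEVATION` are hypothetical and exist only for this illustration.

```python
# Penalty per finding, as used in generate_report above.
DEDUCTIONS = {"critical": 25, "high": 15, "medium": 5, "low": 2}

# Hypothetical helper table: strict mode bumps medium -> high and low -> medium.
STRICT_ELEVATION = {"medium": "high", "low": "medium"}


def score_findings(severities, strict=False):
    """Return the 0-100 score for a list of severity strings (illustrative helper)."""
    if strict:
        # Elevate each severity once; critical and high are left unchanged.
        severities = [STRICT_ELEVATION.get(s, s) for s in severities]
    penalty = sum(DEDUCTIONS.get(s, 0) for s in severities)
    return max(0, 100 - penalty)


sample = ["critical", "medium", "medium", "low"]
print(score_findings(sample))               # 100 - (25 + 5 + 5 + 2) = 63
print(score_findings(sample, strict=True))  # 100 - (25 + 15 + 15 + 5) = 40
```

Because the score is clamped at zero, four critical findings already exhaust it; the per-severity counts in the report remain the more informative signal for heavily flagged files.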
Docker and container development agent skill and plugin for Dockerfile optimization, docker-compose orchestration, multi-stage builds, and container security...
---
name: "docker-development"
description: "Docker and container development agent skill and plugin for Dockerfile optimization, docker-compose orchestration, multi-stage builds, and container security hardening. Use when: user wants to optimize a Dockerfile, create or improve docker-compose configurations, implement multi-stage builds, audit container security, reduce image size, or follow container best practices. Covers build performance, layer caching, secret management, and production-ready container patterns."
license: MIT
metadata:
  version: 1.0.0
  author: Alireza Rezvani
  category: engineering
  updated: 2026-03-16
---
# Docker Development
> Smaller images. Faster builds. Secure containers. No guesswork.
Opinionated Docker workflow that turns bloated Dockerfiles into production-grade containers. Covers optimization, multi-stage builds, compose orchestration, and security hardening.
Not a Docker tutorial — a set of concrete decisions about how to build containers that don't waste time, space, or attack surface.
---
## Slash Commands
| Command | What it does |
|---------|-------------|
| `/docker:optimize` | Analyze and optimize a Dockerfile for size, speed, and layer caching |
| `/docker:compose` | Generate or improve docker-compose.yml with best practices |
| `/docker:security` | Audit a Dockerfile or running container for security issues |
---
## When This Skill Activates
Recognize these patterns from the user:
- "Optimize this Dockerfile"
- "My Docker build is slow"
- "Create a docker-compose for this project"
- "Is this Dockerfile secure?"
- "Reduce my Docker image size"
- "Set up multi-stage builds"
- "Docker best practices for [language/framework]"
- Any request involving: Dockerfile, docker-compose, container, image size, build cache, Docker security
If the user has a Dockerfile or wants to containerize something → this skill applies.
---
## Workflow
### `/docker:optimize` — Dockerfile Optimization
1. **Analyze current state**
   - Read the Dockerfile
   - Identify base image and its size
   - Count layers (each RUN/COPY/ADD = 1 layer)
   - Check for common anti-patterns
2. **Apply optimization checklist**
   ```
   BASE IMAGE
   ├── Use specific tags, never :latest in production
   ├── Prefer slim/alpine variants (debian-slim > ubuntu > debian)
   ├── Pin digest for reproducibility in CI: image@sha256:...
   └── Match base to runtime needs (don't use python:3.12 for a compiled binary)
   LAYER OPTIMIZATION
   ├── Combine related RUN commands with && \
   ├── Order layers: least-changing first (deps before source code)
   ├── Clean package manager cache in the same RUN layer
   ├── Use .dockerignore to exclude unnecessary files
   └── Separate build deps from runtime deps
   BUILD CACHE
   ├── COPY dependency files before source code (package.json, requirements.txt, go.mod)
   ├── Install deps in a separate layer from code copy
   ├── Use BuildKit cache mounts: --mount=type=cache,target=/root/.cache
   └── Avoid COPY . . before dependency installation
   MULTI-STAGE BUILDS
   ├── Stage 1: build (full SDK, build tools, dev deps)
   ├── Stage 2: runtime (minimal base, only production artifacts)
   ├── COPY --from=builder only what's needed
   └── Final image should have NO build tools, NO source code, NO dev deps
   ```
3. **Generate optimized Dockerfile**
   - Apply all relevant optimizations
   - Add inline comments explaining each decision
   - Report estimated size reduction
4. **Validate**
   ```bash
   python3 scripts/dockerfile_analyzer.py Dockerfile
   ```
### `/docker:compose` — Docker Compose Configuration
1. **Identify services**
   - Application (web, API, worker)
   - Database (postgres, mysql, redis, mongo)
   - Cache (redis, memcached)
   - Queue (rabbitmq, kafka)
   - Reverse proxy (nginx, traefik, caddy)
2. **Apply compose best practices**
   ```
   SERVICES
   ├── Use depends_on with condition: service_healthy
   ├── Add healthchecks for every service
   ├── Set resource limits (mem_limit, cpus)
   ├── Use named volumes for persistent data
   └── Pin image versions
   NETWORKING
   ├── Create explicit networks (don't rely on default)
   ├── Separate frontend and backend networks
   ├── Only expose ports that need external access
   └── Use internal: true for backend-only networks
   ENVIRONMENT
   ├── Use env_file for secrets, not inline environment
   ├── Never commit .env files (add to .gitignore)
   ├── Use variable substitution: ${VAR:-default}
   └── Document all required env vars
   DEVELOPMENT vs PRODUCTION
   ├── Use compose profiles or override files
   ├── Dev: bind mounts for hot reload, debug ports exposed
   ├── Prod: named volumes, no debug ports, restart: unless-stopped
   └── docker-compose.override.yml for dev-only config
   ```
3. **Generate compose file**
   - Output docker-compose.yml with healthchecks, networks, volumes
   - Generate .env.example with all required variables documented
   - Add dev/prod profile annotations
### `/docker:security` — Container Security Audit
1. **Dockerfile audit**
   | Check | Severity | Fix |
   |-------|----------|-----|
   | Running as root | Critical | Add `USER nonroot` after creating user |
   | Using :latest tag | High | Pin to specific version |
   | Secrets in ENV/ARG | Critical | Use BuildKit secrets: `--mount=type=secret` |
   | COPY with broad glob | Medium | Use specific paths, add .dockerignore |
   | Unnecessary EXPOSE | Low | Only expose ports the app uses |
   | No HEALTHCHECK | Medium | Add HEALTHCHECK with appropriate interval |
   | Privileged instructions | High | Avoid `--privileged`, drop capabilities |
   | Package manager cache retained | Low | Clean in same RUN layer |
2. **Runtime security checks**
   | Check | Severity | Fix |
   |-------|----------|-----|
   | Container running as root | Critical | Set user in Dockerfile or compose |
   | Writable root filesystem | Medium | Use `read_only: true` in compose |
   | All capabilities retained | High | Drop all, add only needed: `cap_drop: [ALL]` |
   | No resource limits | Medium | Set `mem_limit` and `cpus` |
   | Host network mode | High | Use bridge or custom network |
   | Sensitive mounts | Critical | Never mount /etc, /var/run/docker.sock in prod |
   | No log driver configured | Low | Set `logging:` with size limits |
3. **Generate security report**
   ```
   SECURITY AUDIT — [Dockerfile/Image name]
   Date: [timestamp]
   CRITICAL: [count]
   HIGH: [count]
   MEDIUM: [count]
   LOW: [count]
   [Detailed findings with fix recommendations]
   ```
---
## Tooling
### `scripts/dockerfile_analyzer.py`
CLI utility for static analysis of Dockerfiles.
**Features:**
- Layer count and optimization suggestions
- Base image analysis with size estimates
- Anti-pattern detection (15+ rules)
- Security issue flagging
- Multi-stage build detection and validation
- JSON and text output
**Usage:**
```bash
# Analyze a Dockerfile
python3 scripts/dockerfile_analyzer.py Dockerfile
# JSON output
python3 scripts/dockerfile_analyzer.py Dockerfile --output json
# Analyze with security focus
python3 scripts/dockerfile_analyzer.py Dockerfile --security
# Analyze a Dockerfile in another directory
python3 scripts/dockerfile_analyzer.py path/to/Dockerfile
```
### `scripts/compose_validator.py`
CLI utility for validating docker-compose files.
**Features:**
- Service dependency validation
- Healthcheck presence detection
- Network configuration analysis
- Volume mount validation
- Environment variable audit
- Port conflict detection
- Best practice scoring
**Usage:**
```bash
# Validate a compose file
python3 scripts/compose_validator.py docker-compose.yml
# JSON output
python3 scripts/compose_validator.py docker-compose.yml --output json
# Strict mode (fail on warnings)
python3 scripts/compose_validator.py docker-compose.yml --strict
```
---
## Multi-Stage Build Patterns
### Pattern 1: Compiled Language (Go, Rust, C++)
```dockerfile
# Build stage
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app/server ./cmd/server
# Runtime stage
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]
```
### Pattern 2: Node.js / TypeScript
```dockerfile
# Dependencies stage
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --production=false
# Build stage
FROM deps AS builder
COPY . .
RUN npm run build
# Runtime stage
FROM node:20-alpine
WORKDIR /app
RUN addgroup -g 1001 -S appgroup && adduser -S appuser -u 1001
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
COPY package.json ./
USER appuser
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
### Pattern 3: Python
```dockerfile
# Build stage
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Runtime stage
FROM python:3.12-slim
WORKDIR /app
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
COPY --from=builder /install /usr/local
COPY . .
USER appuser
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
---
## Base Image Decision Tree
```
Is it a compiled binary (Go, Rust, C)?
├── Yes → distroless/static or scratch
└── No
    ├── Need a shell for debugging?
    │   ├── Yes → alpine variant (e.g., node:20-alpine)
    │   └── No → distroless variant
    ├── Need glibc (not musl)?
    │   ├── Yes → slim variant (e.g., python:3.12-slim)
    │   └── No → alpine variant
    └── Need specific OS packages?
        ├── Many → debian-slim
        └── Few → alpine + apk add
```
---
## Proactive Triggers
Flag these without being asked:
- **Dockerfile uses :latest** → Suggest pinning to a specific version tag.
- **No .dockerignore** → Create one. At minimum: `.git`, `node_modules`, `__pycache__`, `.env`.
- **COPY . . before dependency install** → Cache bust. Reorder to install deps first.
- **Running as root** → Add USER instruction. No exceptions for production.
- **Secrets in ENV or ARG** → Use BuildKit secret mounts. Never bake secrets into layers.
- **Image over 1GB** → Multi-stage build required. No reason for a production image this large.
- **No healthcheck** → Add one. Orchestrators (Compose, K8s) need it for proper lifecycle management.
- **apt-get without cleanup in same layer** → `rm -rf /var/lib/apt/lists/*` in the same RUN.
---
## Installation
### One-liner (any tool)
```bash
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/docker-development ~/.claude/skills/
```
### Multi-tool install
```bash
./scripts/convert.sh --skill docker-development --tool codex|gemini|cursor|windsurf|openclaw
```
### OpenClaw
```bash
clawhub install cs-docker-development
```
---
## Related Skills
- **senior-devops** — Broader DevOps scope (CI/CD, IaC, monitoring). Complementary — use docker-development for container-specific work, senior-devops for pipeline and infrastructure.
- **senior-security** — Application security. Complementary — docker-development covers container security, senior-security covers application-level threats.
- **autoresearch-agent** — Can optimize Docker build times or image sizes as measurable experiments.
- **ci-cd-pipeline-builder** — Pipeline construction. Complementary — docker-development builds the containers, ci-cd-pipeline-builder deploys them.
FILE:references/compose-patterns.md
# Docker Compose Patterns Reference
## Production-Ready Patterns
### Web App + Database + Cache
```yaml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    networks:
      - frontend
      - backend
    mem_limit: 512m
    cpus: 1.0
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    env_file:
      - .env.db
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - backend
    mem_limit: 256m
  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 64mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    restart: unless-stopped
    networks:
      - backend
    mem_limit: 128m
volumes:
  pgdata:
networks:
  frontend:
  backend:
    internal: true
```
### Key Patterns
- **Healthchecks on every service** — enables depends_on with condition
- **Named volumes** — data persists across container recreation
- **Explicit networks** — backend is internal (no external access)
- **env_file** — secrets not in compose file
- **Resource limits** — prevent runaway containers
---
## Development Override Pattern
### docker-compose.yml (base — production-like)
```yaml
services:
  app:
    build: .
    ports:
      - "3000:3000"
    restart: unless-stopped
```
### docker-compose.override.yml (dev — auto-loaded)
```yaml
services:
  app:
    build:
      target: development
    volumes:
      - .:/app # Bind mount for hot reload
      - /app/node_modules # Preserve container node_modules
    environment:
      - NODE_ENV=development
      - DEBUG=true
    ports:
      - "9229:9229" # Debug port
    restart: "no"
```
### Usage
```bash
# Development (auto-loads override)
docker compose up
# Production (skip override)
docker compose -f docker-compose.yml up -d
# Explicit profiles
docker compose --profile dev up
docker compose --profile prod up -d
```
---
## Network Isolation Pattern
```yaml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    networks:
      - frontend
  app:
    build: .
    networks:
      - frontend
      - backend
  db:
    image: postgres:16-alpine
    networks:
      - backend
  redis:
    image: redis:7-alpine
    networks:
      - backend
networks:
  frontend: # External traffic reaches nginx and app
  backend:
    internal: true # DB and Redis only reachable by app
```
### Why This Matters
- Database and cache are **not accessible from outside**
- Only nginx and app handle external traffic
- Lateral movement limited if one container is compromised
---
## Worker + Queue Pattern
```yaml
services:
  api:
    build:
      context: .
      target: runtime
    command: uvicorn main:app --host 0.0.0.0 --port 8000
    ports:
      - "8000:8000"
    depends_on:
      rabbitmq:
        condition: service_healthy
  worker:
    build:
      context: .
      target: runtime
    command: celery -A tasks worker --loglevel=info
    depends_on:
      rabbitmq:
        condition: service_healthy
  scheduler:
    build:
      context: .
      target: runtime
    command: celery -A tasks beat --loglevel=info
    depends_on:
      rabbitmq:
        condition: service_healthy
  rabbitmq:
    image: rabbitmq:3.13-management-alpine
    ports:
      - "15672:15672" # Management UI (dev only)
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "check_running"]
      interval: 10s
      timeout: 5s
      retries: 5
```
---
## Logging Configuration
```yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        tag: "{{.Name}}/{{.ID}}"
```
### Why
- **max-size** prevents disk exhaustion
- **max-file** rotates logs automatically
- Default Docker logging has NO size limit — production servers can run out of disk
---
## Environment Variable Patterns
### .env.example (committed to repo)
```env
# Database
DATABASE_URL=postgres://user:password@db:5432/appname
POSTGRES_USER=user
POSTGRES_PASSWORD=changeme
POSTGRES_DB=appname
# Redis
REDIS_URL=redis://redis:6379/0
# Application
SECRET_KEY=changeme-generate-a-real-secret
NODE_ENV=production
LOG_LEVEL=info
# External Services (BYOK)
# SMTP_HOST=
# SMTP_PORT=587
# AWS_ACCESS_KEY_ID=
# AWS_SECRET_ACCESS_KEY=
```
### Variable Substitution in Compose
```yaml
services:
  app:
    image: myapp:${VERSION:-latest}
    environment:
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - PORT=${PORT:-3000}
```
---
## Troubleshooting Checklist
| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| Container exits immediately | CMD/ENTRYPOINT crashes, missing env vars | Check logs: `docker compose logs service` |
| Port already in use | Another service or host process on same port | Change host port: `"3001:3000"` |
| Volume permissions denied | Container user doesn't own mounted path | Match UID/GID or use named volumes |
| Build cache not working | COPY . . invalidates cache early | Reorder: copy deps first, then source |
| depends_on doesn't wait | No healthcheck condition | Add `condition: service_healthy` |
| Container OOM killed | No memory limit or limit too low | Set appropriate `mem_limit` |
| Network connectivity issues | Wrong network or service name | Services communicate by service name within shared network |
FILE:references/dockerfile-best-practices.md
# Dockerfile Best Practices Reference
## Layer Optimization
### The Golden Rule
Every `RUN`, `COPY`, and `ADD` instruction creates a new layer. Fewer layers usually means a smaller image, because files deleted in a later layer still take up space in the layer that added them.
### Combine Related Commands
```dockerfile
# Bad — 3 layers
RUN apt-get update
RUN apt-get install -y curl git
RUN rm -rf /var/lib/apt/lists/*
# Good — 1 layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl git && \
    rm -rf /var/lib/apt/lists/*
```
### Order Layers by Change Frequency
```dockerfile
# Least-changing layers first
# Changes rarely
COPY package.json package-lock.json ./
# Changes when deps change
RUN npm ci
# Changes every build
COPY . .
# Changes every build
RUN npm run build
```
### Use .dockerignore
```
.git
node_modules
__pycache__
*.pyc
.env
.env.*
dist
build
*.log
.DS_Store
.vscode
.idea
coverage
.pytest_cache
```
---
## Base Image Selection
### Size Comparison (approximate)
| Base | Size | Use Case |
|------|------|----------|
| `scratch` | 0MB | Static binaries (Go, Rust) |
| `distroless/static` | 2MB | Static binaries with CA certs |
| `alpine` | 7MB | Minimal Linux, shell access |
| `distroless/base` | 20MB | Dynamic binaries (C/C++) |
| `debian-slim` | 80MB | When you need glibc + apt |
| `ubuntu` | 78MB | Full Ubuntu ecosystem |
| `python:3.12-slim` | 130MB | Python apps (production) |
| `node:20-alpine` | 130MB | Node.js apps |
| `golang:1.22` | 800MB | Go build stage only |
| `python:3.12` | 900MB | Never use in production |
| `node:20` | 1000MB | Never use in production |
### When to Use Alpine
- Small image size matters
- No dependency on glibc (musl works)
- Willing to handle occasional musl-related issues
- Not running Python with C extensions that need glibc
### When to Use Slim
- Need glibc compatibility
- Python with compiled C extensions (numpy, pandas)
- Fewer musl compatibility issues
- Still much smaller than full images
### When to Use Distroless
- Maximum security (no shell, no package manager)
- Compiled/static binaries
- Don't need debugging access inside container
- Production-only (not development)
---
## Multi-Stage Builds
### Why Multi-Stage
- Build tools and source code stay out of production image
- Final image contains only runtime artifacts
- Dramatically reduces image size and attack surface
### Naming Stages
```dockerfile
# Named stage
FROM golang:1.22 AS builder
# Named stage
FROM alpine:3.19 AS runtime
# Reference by name
COPY --from=builder /app /app
```
### Selective Copy
```dockerfile
# Only copy the built artifact — nothing else
COPY --from=builder /app/server /server
COPY --from=builder /app/config.yaml /config.yaml
# Don't COPY --from=builder /app/ /app/ (copies source code too)
```
---
## Security Hardening
### Run as Non-Root
```dockerfile
# Create user
RUN groupadd -r appgroup && useradd -r -g appgroup -s /sbin/nologin appuser
# Set ownership
COPY --chown=appuser:appgroup . .
# Switch user (after all root-requiring operations)
USER appuser
```
### Secret Management
```dockerfile
# Bad — secret baked into layer
ENV API_KEY=sk-12345
# Good — BuildKit secret mount (never in layer)
RUN --mount=type=secret,id=api_key \
    export API_KEY=$(cat /run/secrets/api_key) && \
    ./configure --api-key=$API_KEY
```
Build with:
```bash
docker build --secret id=api_key,src=./api_key.txt .
```
### Read-Only Filesystem
```yaml
# docker-compose.yml
services:
  app:
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
```
### Drop Capabilities
```yaml
services:
  app:
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE # Only if binding to ports < 1024
```
---
## Build Performance
### BuildKit Cache Mounts
```dockerfile
# Cache pip downloads across builds
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
# Cache apt downloads
RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update && apt-get install -y curl
```
### Parallel Builds
```dockerfile
# These stages build in parallel when using BuildKit
FROM node:20-alpine AS frontend
COPY frontend/ .
RUN npm ci && npm run build
FROM golang:1.22 AS backend
COPY backend/ .
RUN go build -o server
FROM alpine:3.19
COPY --from=frontend /dist /static
COPY --from=backend /server /server
```
### Enable BuildKit
```bash
export DOCKER_BUILDKIT=1
docker build .
# Or in daemon.json
{ "features": { "buildkit": true } }
```
---
## Health Checks
### HTTP Service
```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```
### Without curl (using wget)
```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:8000/health || exit 1
```
### TCP Check
```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD nc -z localhost 8000 || exit 1
```
### PostgreSQL
```dockerfile
HEALTHCHECK --interval=10s --timeout=5s --retries=5 \
    CMD pg_isready -U postgres || exit 1
```
### Redis
```dockerfile
HEALTHCHECK --interval=10s --timeout=3s --retries=3 \
    CMD redis-cli ping | grep PONG || exit 1
```
FILE:scripts/compose_validator.py
#!/usr/bin/env python3
"""
docker-development: Docker Compose Validator
Validate docker-compose.yml files for best practices, missing healthchecks,
network configuration, port conflicts, and security issues.
Usage:
    python scripts/compose_validator.py docker-compose.yml
    python scripts/compose_validator.py docker-compose.yml --output json
    python scripts/compose_validator.py docker-compose.yml --strict
"""
import argparse
import json
import re
import sys
from pathlib import Path
# --- Demo Compose File ---
DEMO_COMPOSE = """
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:password@db:5432/app
      - SECRET_KEY=my-secret-key
    depends_on:
      - db
      - redis
  db:
    image: postgres:latest
    ports:
      - "5432:5432"
    environment:
      POSTGRES_PASSWORD: password123
    volumes:
      - ./data:/var/lib/postgresql/data
  redis:
    image: redis
    ports:
      - "6379:6379"
  worker:
    build: .
    command: python worker.py
    environment:
      - DATABASE_URL=postgres://user:password@db:5432/app
"""
def parse_yaml_simple(content):
    """Simple YAML-like parser for docker-compose files (stdlib only).
    Handles the subset of YAML used in typical docker-compose files:
    - Top-level keys
    - Service definitions
    - Lists (- items)
    - Key-value pairs
    - Nested indentation
    """
    result = {"services": {}, "volumes": {}, "networks": {}}
    current_section = None
    current_service = None
    current_key = None
    indent_stack = []
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        # Top-level keys
        if indent == 0 and ":" in stripped:
            key = stripped.split(":")[0].strip()
            if key == "services":
                current_section = "services"
            elif key == "volumes":
                current_section = "volumes"
            elif key == "networks":
                current_section = "networks"
            elif key == "version":
                val = stripped.split(":", 1)[1].strip().strip("'\"")
                result["version"] = val
            current_service = None
            current_key = None
            continue
        if current_section == "services":
            # Service name (indent level 2)
            if indent == 2 and ":" in stripped and not stripped.startswith("-"):
                key = stripped.split(":")[0].strip()
                val = stripped.split(":", 1)[1].strip() if ":" in stripped else ""
                if val and not val.startswith("{"):
                    # Simple key:value inside a service
                    if current_service and current_service in result["services"]:
                        result["services"][current_service][key] = val
                    else:
                        current_service = key
                        result["services"][current_service] = {}
                        current_key = None
                else:
                    current_service = key
                    result["services"][current_service] = {}
                    current_key = None
                continue
            if current_service and current_service in result["services"]:
                svc = result["services"][current_service]
                # Service-level keys (indent 4)
                if indent == 4 and ":" in stripped and not stripped.startswith("-"):
                    key = stripped.split(":")[0].strip()
                    val = stripped.split(":", 1)[1].strip()
                    current_key = key
                    if val:
                        svc[key] = val.strip("'\"")
                    else:
                        svc[key] = []
                    continue
                # List items (indent 6 or 8)
                if stripped.startswith("-") and current_key:
                    item = stripped[1:].strip().strip("'\"")
                    if current_key in svc:
                        if isinstance(svc[current_key], list):
                            svc[current_key].append(item)
                        else:
                            svc[current_key] = [svc[current_key], item]
                    else:
                        svc[current_key] = [item]
                    continue
                # Nested key:value under current_key (e.g., healthcheck test)
                if indent >= 6 and ":" in stripped and not stripped.startswith("-"):
                    key = stripped.split(":")[0].strip()
                    val = stripped.split(":", 1)[1].strip()
                    if current_key and current_key in svc:
                        if isinstance(svc[current_key], list):
                            svc[current_key] = {}
                        if isinstance(svc[current_key], dict):
                            svc[current_key][key] = val
        elif current_section in ("volumes", "networks"):
            # Record top-level volume/network names; without this the
            # "No explicit networks" check below would fire for every file.
            if indent == 2 and ":" in stripped and not stripped.startswith("-"):
                result[current_section][stripped.split(":")[0].strip()] = {}
    return result
def validate_compose(parsed, strict=False):
    """Run validation rules on parsed compose file."""
    findings = []
    services = parsed.get("services", {})
    # --- Version check ---
    version = parsed.get("version", "")
    if version:
        findings.append({
            "severity": "low",
            "category": "deprecation",
            "message": f"'version: {version}' is deprecated in Compose V2 — remove it",
            "service": "(top-level)",
        })
    # --- Per-service checks ---
    all_ports = []
    for name, svc in services.items():
        # Healthcheck
        if "healthcheck" not in svc:
            findings.append({
                "severity": "medium",
                "category": "reliability",
                "message": f"No healthcheck defined — orchestrator can't detect unhealthy state",
                "service": name,
            })
        # Image tag
        image = svc.get("image", "")
        if image:
            if ":latest" in image:
                findings.append({
                    "severity": "high",
                    "category": "reproducibility",
                    "message": f"Using :latest tag on '{image}' — pin to specific version",
                    "service": name,
                })
            elif ":" not in image and "/" not in image:
                findings.append({
                    "severity": "high",
                    "category": "reproducibility",
                    "message": f"No tag on image '{image}' — defaults to :latest",
                    "service": name,
                })
        # Ports
        ports = svc.get("ports", [])
        if isinstance(ports, list):
            for p in ports:
                p_str = str(p)
                # Extract host port
                match = re.match(r"(\d+):\d+", p_str)
                if match:
                    host_port = match.group(1)
                    all_ports.append((host_port, name))
        # Environment secrets
        env = svc.get("environment", [])
        if isinstance(env, list):
            for e in env:
                e_str = str(e)
                if re.search(r"(?:PASSWORD|SECRET|TOKEN|KEY)=\S+", e_str, re.IGNORECASE):
                    if "env_file" not in svc:
                        findings.append({
                            "severity": "critical",
                            "category": "security",
                            "message": f"Inline secret in environment: {e_str[:40]}...",
                            "service": name,
                        })
        elif isinstance(env, dict):
            for k, v in env.items():
                if re.search(r"(?:PASSWORD|SECRET|TOKEN|KEY)", k, re.IGNORECASE) and v:
                    findings.append({
                        "severity": "critical",
                        "category": "security",
                        "message": f"Inline secret: {k}={str(v)[:20]}...",
                        "service": name,
                    })
        # depends_on without condition
        depends = svc.get("depends_on", [])
        if isinstance(depends, list) and depends:
            findings.append({
                "severity": "medium",
                "category": "reliability",
                "message": "depends_on without condition: service_healthy — race condition risk",
                "service": name,
            })
        # Bind mounts (./path style)
        volumes = svc.get("volumes", [])
        if isinstance(volumes, list):
            for v in volumes:
                v_str = str(v)
                if v_str.startswith("./") or v_str.startswith("/"):
                    if "/var/run/docker.sock" in v_str:
                        findings.append({
                            "severity": "critical",
                            "category": "security",
                            "message": "Docker socket mounted — container has host Docker access",
                            "service": name,
                        })
        # Restart policy
        if "restart" not in svc and "build" not in svc:
            findings.append({
                "severity": "low",
                "category": "reliability",
                "message": "No restart policy — container won't auto-restart on failure",
                "service": name,
            })
        # Resource limits
        if "mem_limit" not in svc and "deploy" not in svc:
            findings.append({
                "severity": "low" if not strict else "medium",
                "category": "resources",
                "message": "No memory limit — container can consume all host memory",
                "service": name,
            })
    # Port conflicts
    port_map = {}
    for port, svc_name in all_ports:
        if port in port_map:
            findings.append({
                "severity": "high",
                "category": "networking",
                "message": f"Port {port} conflict between '{port_map[port]}' and '{svc_name}'",
                "service": svc_name,
            })
        port_map[port] = svc_name
    # Network check
    if "networks" not in parsed or not parsed["networks"]:
        if len(services) > 1:
findings.append({ "severity": "low", "category": "networking", "message": "No explicit networks — all services share default bridge network", "service": "(top-level)", }) # Sort by severity severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3} findings.sort(key=lambda f: severity_order.get(f["severity"], 4)) return findings def generate_report(content, output_format="text", strict=False): """Generate validation report.""" parsed = parse_yaml_simple(content) findings = validate_compose(parsed, strict) services = parsed.get("services", {}) # Score deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2} score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings)) counts = { "critical": sum(1 for f in findings if f["severity"] == "critical"), "high": sum(1 for f in findings if f["severity"] == "high"), "medium": sum(1 for f in findings if f["severity"] == "medium"), "low": sum(1 for f in findings if f["severity"] == "low"), } result = { "score": score, "services": list(services.keys()), "service_count": len(services), "findings": findings, "finding_counts": counts, } if output_format == "json": print(json.dumps(result, indent=2)) return result # Text output print(f"\n{'=' * 60}") print(f" Docker Compose Validation Report") print(f"{'=' * 60}") print(f" Score: {score}/100") print(f" Services: {', '.join(services.keys()) if services else 'none'}") print() print(f" Findings: {counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low") print(f"{'─' * 60}") for f in findings: icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?") print(f"\n {icon} {f['severity'].upper()} [{f['category']}] — {f['service']}") print(f" {f['message']}") if not findings: print("\n No issues found. 
Compose file looks good.") print(f"\n{'=' * 60}\n") return result def main(): parser = argparse.ArgumentParser( description="docker-development: Docker Compose validator" ) parser.add_argument("composefile", nargs="?", help="Path to docker-compose.yml (omit for demo)") parser.add_argument( "--output", "-o", choices=["text", "json"], default="text", help="Output format (default: text)", ) parser.add_argument( "--strict", action="store_true", help="Strict mode — elevate warnings to higher severity", ) args = parser.parse_args() if args.composefile: path = Path(args.composefile) if not path.exists(): print(f"Error: File not found: {args.composefile}", file=sys.stderr) sys.exit(1) content = path.read_text(encoding="utf-8") else: print("No compose file provided. Running demo validation...\n") content = DEMO_COMPOSE generate_report(content, args.output, args.strict) if __name__ == "__main__": main() FILE:scripts/dockerfile_analyzer.py #!/usr/bin/env python3 """ docker-development: Dockerfile Analyzer Static analysis of Dockerfiles for optimization opportunities, anti-patterns, and security issues. Reports layer count, base image analysis, and actionable recommendations. 
Usage: python scripts/dockerfile_analyzer.py Dockerfile python scripts/dockerfile_analyzer.py Dockerfile --output json python scripts/dockerfile_analyzer.py Dockerfile --security """ import argparse import json import re import sys from pathlib import Path # --- Analysis Rules --- ANTI_PATTERNS = [ { "id": "AP001", "name": "latest_tag", "severity": "high", "pattern": r"^FROM\s+\S+:latest", "message": "Using :latest tag — pin to a specific version for reproducibility", "fix": "Use a specific tag like :3.12-slim or pin by digest", }, { "id": "AP002", "name": "no_tag", "severity": "high", "pattern": r"^FROM\s+([a-z][a-z0-9_.-]+)\s*$", "message": "No tag specified on base image — defaults to :latest", "fix": "Add a specific version tag", }, { "id": "AP003", "name": "run_apt_no_clean", "severity": "medium", "pattern": r"^RUN\s+.*apt-get\s+install(?!.*rm\s+-rf\s+/var/lib/apt/lists)", "message": "apt-get install without cleanup in same layer — bloats image", "fix": "Add && rm -rf /var/lib/apt/lists/* in the same RUN instruction", }, { "id": "AP004", "name": "run_apk_no_cache", "severity": "medium", "pattern": r"^RUN\s+.*apk\s+add(?!\s+--no-cache)", "message": "apk add without --no-cache — retains package index", "fix": "Use: apk add --no-cache <packages>", }, { "id": "AP005", "name": "add_instead_of_copy", "severity": "low", "pattern": r"^ADD\s+(?!https?://)\S+", "message": "Using ADD for local files — COPY is more explicit and predictable", "fix": "Use COPY instead of ADD unless you need tar auto-extraction or URL fetching", }, { "id": "AP006", "name": "multiple_cmd", "severity": "medium", "pattern": None, # Custom check "message": "Multiple CMD instructions — only the last one takes effect", "fix": "Keep exactly one CMD instruction", }, { "id": "AP007", "name": "env_secrets", "severity": "critical", "pattern": r"^(?:ENV|ARG)\s+\S*(?:PASSWORD|SECRET|TOKEN|KEY|API_KEY)\s*=", "message": "Secrets in ENV/ARG — baked into image layers and visible in history", "fix": "Use 
BuildKit secrets: RUN --mount=type=secret,id=mytoken", }, { "id": "AP008", "name": "broad_copy", "severity": "medium", "pattern": r"^COPY\s+\.\s+\.", "message": "COPY . . copies everything — may include secrets, git history, node_modules", "fix": "Use .dockerignore and copy specific directories, or copy after dependency install", }, { "id": "AP009", "name": "no_user", "severity": "critical", "pattern": None, # Custom check "message": "No USER instruction — container runs as root", "fix": "Add USER nonroot or create a dedicated user", }, { "id": "AP010", "name": "pip_no_cache", "severity": "low", "pattern": r"^RUN\s+.*pip\s+install(?!\s+--no-cache-dir)", "message": "pip install without --no-cache-dir — retains pip cache in layer", "fix": "Use: pip install --no-cache-dir -r requirements.txt", }, { "id": "AP011", "name": "npm_install_dev", "severity": "medium", "pattern": r"^RUN\s+.*npm\s+install\s*$", "message": "npm install includes devDependencies — use npm ci --omit=dev for production", "fix": "Use: npm ci --omit=dev (or npm ci --production)", }, { "id": "AP012", "name": "expose_all", "severity": "low", "pattern": r"^EXPOSE\s+\d+(?:\s+\d+){3,}", "message": "Exposing many ports — only expose what the application actually needs", "fix": "Remove unnecessary EXPOSE directives", }, { "id": "AP013", "name": "curl_wget_without_cleanup", "severity": "low", "pattern": r"^RUN\s+.*(?:curl|wget)\s+.*(?!&&\s*rm)", "message": "Download without cleanup — downloaded archives may remain in layer", "fix": "Download, extract, and remove archive in the same RUN instruction", }, { "id": "AP014", "name": "no_healthcheck", "severity": "medium", "pattern": None, # Custom check "message": "No HEALTHCHECK instruction — orchestrators can't determine container health", "fix": "Add HEALTHCHECK CMD curl -f http://localhost:PORT/health || exit 1", }, { "id": "AP015", "name": "shell_form_cmd", "severity": "low", "pattern": r'^(?:CMD|ENTRYPOINT)\s+(?!\[)["\']?\w', "message": "Using shell form for 
CMD/ENTRYPOINT — exec form is preferred for signal handling", "fix": 'Use exec form: CMD ["executable", "arg1", "arg2"]', }, ] # Approximate base image sizes (MB) BASE_IMAGE_SIZES = { "scratch": 0, "alpine": 7, "distroless/static": 2, "distroless/base": 20, "distroless/cc": 25, "debian-slim": 80, "debian": 120, "ubuntu": 78, "python-slim": 130, "python-alpine": 50, "python": 900, "node-alpine": 130, "node-slim": 200, "node": 1000, "golang-alpine": 250, "golang": 800, "rust-slim": 750, "rust": 1400, "nginx-alpine": 40, "nginx": 140, } # --- Demo Dockerfile --- DEMO_DOCKERFILE = """FROM python:3.12 WORKDIR /app COPY . . RUN pip install -r requirements.txt ENV SECRET_KEY=mysecretkey123 EXPOSE 8000 5432 6379 CMD python manage.py runserver 0.0.0.0:8000 """ def parse_dockerfile(content): """Parse Dockerfile into structured instructions.""" instructions = [] current = "" for line in content.splitlines(): stripped = line.strip() if not stripped or stripped.startswith("#"): continue if stripped.endswith("\\"): current += stripped[:-1] + " " continue current += stripped # Parse instruction match = re.match(r"^(\w+)\s+(.*)", current.strip()) if match: instructions.append({ "instruction": match.group(1).upper(), "args": match.group(2), "raw": current.strip(), }) current = "" return instructions def analyze_layers(instructions): """Count and classify layers.""" layer_instructions = {"FROM", "RUN", "COPY", "ADD"} layers = [i for i in instructions if i["instruction"] in layer_instructions] stages = [i for i in instructions if i["instruction"] == "FROM"] return { "total_layers": len(layers), "stages": len(stages), "is_multistage": len(stages) > 1, "run_count": sum(1 for i in instructions if i["instruction"] == "RUN"), "copy_count": sum(1 for i in instructions if i["instruction"] == "COPY"), "add_count": sum(1 for i in instructions if i["instruction"] == "ADD"), } def analyze_base_image(instructions): """Analyze base image choice.""" from_instructions = [i for i in instructions if 
i["instruction"] == "FROM"] if not from_instructions: return {"image": "unknown", "tag": "unknown", "estimated_size_mb": 0} last_from = from_instructions[-1]["args"].split()[0] parts = last_from.split(":") image = parts[0] tag = parts[1] if len(parts) > 1 else "latest" # Estimate size size = 0 image_base = image.split("/")[-1] for key, val in BASE_IMAGE_SIZES.items(): if key in f"{image_base}-{tag}" or key == image_base: size = val break return { "image": image, "tag": tag, "estimated_size_mb": size, "is_alpine": "alpine" in tag, "is_slim": "slim" in tag, "is_distroless": "distroless" in image, } def run_pattern_checks(content, instructions): """Run anti-pattern checks.""" findings = [] for rule in ANTI_PATTERNS: if rule["pattern"] is not None: for match in re.finditer(rule["pattern"], content, re.MULTILINE | re.IGNORECASE): findings.append({ "id": rule["id"], "severity": rule["severity"], "message": rule["message"], "fix": rule["fix"], "line": match.group(0).strip()[:80], }) # Custom checks # AP006: Multiple CMD cmd_count = sum(1 for i in instructions if i["instruction"] == "CMD") if cmd_count > 1: r = next(r for r in ANTI_PATTERNS if r["id"] == "AP006") findings.append({ "id": r["id"], "severity": r["severity"], "message": r["message"], "fix": r["fix"], "line": f"{cmd_count} CMD instructions found", }) # AP009: No USER has_user = any(i["instruction"] == "USER" for i in instructions) if not has_user and instructions: r = next(r for r in ANTI_PATTERNS if r["id"] == "AP009") findings.append({ "id": r["id"], "severity": r["severity"], "message": r["message"], "fix": r["fix"], "line": "(no USER instruction found)", }) # AP014: No HEALTHCHECK has_healthcheck = any(i["instruction"] == "HEALTHCHECK" for i in instructions) if not has_healthcheck and instructions: r = next(r for r in ANTI_PATTERNS if r["id"] == "AP014") findings.append({ "id": r["id"], "severity": r["severity"], "message": r["message"], "fix": r["fix"], "line": "(no HEALTHCHECK instruction found)", }) 
return findings def generate_report(content, output_format="text", security_focus=False): """Generate full analysis report.""" instructions = parse_dockerfile(content) layers = analyze_layers(instructions) base = analyze_base_image(instructions) findings = run_pattern_checks(content, instructions) if security_focus: security_ids = {"AP007", "AP009", "AP008"} security_severities = {"critical", "high"} findings = [f for f in findings if f["id"] in security_ids or f["severity"] in security_severities] # Deduplicate findings by id seen_ids = set() unique_findings = [] for f in findings: key = (f["id"], f["line"]) if key not in seen_ids: seen_ids.add(key) unique_findings.append(f) findings = unique_findings # Sort by severity severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3} findings.sort(key=lambda f: severity_order.get(f["severity"], 4)) # Score (100 minus deductions) deductions = {"critical": 25, "high": 15, "medium": 5, "low": 2} score = max(0, 100 - sum(deductions.get(f["severity"], 0) for f in findings)) result = { "score": score, "base_image": base, "layers": layers, "findings": findings, "finding_counts": { "critical": sum(1 for f in findings if f["severity"] == "critical"), "high": sum(1 for f in findings if f["severity"] == "high"), "medium": sum(1 for f in findings if f["severity"] == "medium"), "low": sum(1 for f in findings if f["severity"] == "low"), }, } if output_format == "json": print(json.dumps(result, indent=2)) return result # Text output print(f"\n{'=' * 60}") print(f" Dockerfile Analysis Report") print(f"{'=' * 60}") print(f" Score: {score}/100") print(f" Base: {base['image']}:{base['tag']} (~{base['estimated_size_mb']}MB)") print(f" Layers: {layers['total_layers']} | Stages: {layers['stages']} | Multi-stage: {'Yes' if layers['is_multistage'] else 'No'}") print(f" RUN: {layers['run_count']} | COPY: {layers['copy_count']} | ADD: {layers['add_count']}") print() counts = result["finding_counts"] print(f" Findings: 
{counts['critical']} critical | {counts['high']} high | {counts['medium']} medium | {counts['low']} low") print(f"{'─' * 60}") for f in findings: icon = {"critical": "!!!", "high": "!!", "medium": "!", "low": "~"}.get(f["severity"], "?") print(f"\n [{f['id']}] {icon} {f['severity'].upper()}") print(f" {f['message']}") print(f" Line: {f['line']}") print(f" Fix: {f['fix']}") if not findings: print("\n No issues found. Dockerfile looks good.") print(f"\n{'=' * 60}\n") return result def main(): parser = argparse.ArgumentParser( description="docker-development: Dockerfile static analyzer" ) parser.add_argument("dockerfile", nargs="?", help="Path to Dockerfile (omit for demo)") parser.add_argument( "--output", "-o", choices=["text", "json"], default="text", help="Output format (default: text)", ) parser.add_argument( "--security", action="store_true", help="Security-focused analysis only", ) args = parser.parse_args() if args.dockerfile: path = Path(args.dockerfile) if not path.exists(): print(f"Error: File not found: {args.dockerfile}", file=sys.stderr) sys.exit(1) content = path.read_text(encoding="utf-8") else: print("No Dockerfile provided. Running demo analysis...\n") content = DEMO_DOCKERFILE generate_report(content, args.output, args.security) if __name__ == "__main__": main()
Autonomous experiment loop that optimizes any file by a measurable metric. Inspired by Karpathy's autoresearch. The agent edits a target file, runs a fixed e...
---
name: "autoresearch-agent"
description: "Autonomous experiment loop that optimizes any file by a measurable metric. Inspired by Karpathy's autoresearch. The agent edits a target file, runs a fixed evaluation, keeps improvements (git commit), discards failures (git reset), and loops indefinitely. Use when: user wants to optimize code speed, reduce bundle/image size, improve test pass rate, optimize prompts, improve content quality (headlines, copy, CTR), or run any measurable improvement loop. Requires: a target file, an evaluation command that outputs a metric, and a git repo."
license: MIT
metadata:
  version: 2.0.0
  author: Alireza Rezvani
  category: engineering
  updated: 2026-03-13
---
# Autoresearch Agent
> You sleep. The agent experiments. You wake up to results.
Autonomous experiment loop inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). The agent edits one file, runs a fixed evaluation, keeps improvements, discards failures, and loops indefinitely.
Not one guess — fifty measured attempts, compounding.
---
## Slash Commands
| Command | What it does |
|---------|-------------|
| `/ar:setup` | Set up a new experiment interactively |
| `/ar:run` | Run a single experiment iteration |
| `/ar:loop` | Start autonomous loop with configurable interval (10m, 1h, daily, weekly, monthly) |
| `/ar:status` | Show dashboard and results |
| `/ar:resume` | Resume a paused experiment |
---
## When This Skill Activates
Recognize these patterns from the user:
- "Make this faster / smaller / better"
- "Optimize [file] for [metric]"
- "Improve my [headlines / copy / prompts]"
- "Run experiments overnight"
- "I want to get [metric] from X to Y"
- Any request involving: optimize, benchmark, improve, experiment loop, autoresearch
If the user describes a target file + a way to measure success → this skill applies.
---
## Setup
### First Time — Create the Experiment
Run the setup script. The user decides where experiments live:
**Project-level** (inside repo, git-tracked, shareable with team):
```bash
python scripts/setup_experiment.py \
  --domain engineering \
  --name api-speed \
  --target src/api/search.py \
  --eval "pytest bench.py --tb=no -q" \
  --metric p50_ms \
  --direction lower \
  --scope project
```
**User-level** (personal, in `~/.autoresearch/`):
```bash
python scripts/setup_experiment.py \
  --domain marketing \
  --name medium-ctr \
  --target content/titles.md \
  --eval "python evaluate.py" \
  --metric ctr_score \
  --direction higher \
  --evaluator llm_judge_content \
  --scope user
```
The `--scope` flag determines where `.autoresearch/` lives:
- `project` (default) → `.autoresearch/` in the repo root. Experiment definitions are git-tracked. Results are gitignored.
- `user` → `~/.autoresearch/` in the home directory. Everything is personal.
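As a rough illustration of the two scopes, the experiment directory could be resolved as below. This is a sketch only: the helper name `experiment_dir` and the `repo_root` parameter are hypothetical and not taken from `setup_experiment.py`.

```python
from pathlib import Path


def experiment_dir(scope, domain, name, repo_root="."):
    """Resolve the experiment directory for a given scope (hypothetical helper)."""
    if scope not in ("project", "user"):
        raise ValueError(f"unknown scope: {scope!r}")
    # "user" lives under the home directory, "project" under the repo root
    base = Path.home() if scope == "user" else Path(repo_root)
    return base / ".autoresearch" / domain / name


print(experiment_dir("project", "engineering", "api-speed"))
print(experiment_dir("user", "marketing", "medium-ctr"))
```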
### What Setup Creates
```
.autoresearch/
├── config.yaml                 ← Global settings
├── .gitignore                  ← Ignores results.tsv, *.log
└── {domain}/{experiment-name}/
    ├── program.md              ← Objectives, constraints, strategy
    ├── config.cfg              ← Target, eval cmd, metric, direction
    ├── results.tsv             ← Experiment log (gitignored)
    └── evaluate.py             ← Evaluation script (if --evaluator used)
```
**results.tsv columns:** `commit | metric | status | description`
- `commit` — short git hash
- `metric` — float value or "N/A" for crashes
- `status` — keep | discard | crash
- `description` — what changed or why it crashed
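The log can be read back with a few lines of Python. The sketch below assumes the file is tab-separated and starts with a header row naming the four columns; that header row, the helper name `best_kept`, and the sample values are assumptions for illustration, not confirmed behavior of the bundled scripts.

```python
import csv
import io


def best_kept(tsv_text, direction="lower"):
    """Return the best metric among rows with status 'keep', or None."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Crash rows carry "N/A" as the metric, so only 'keep' rows are converted
    kept = [float(r["metric"]) for r in rows if r["status"] == "keep"]
    if not kept:
        return None
    return min(kept) if direction == "lower" else max(kept)


# Illustrative log contents (made-up values)
sample = (
    "commit\tmetric\tstatus\tdescription\n"
    "a1b2c3d\t800.0\tkeep\tbaseline\n"
    "d4e5f6a\tN/A\tcrash\tmissing import\n"
    "b7c8d9e\t640.5\tkeep\tcache lookups\n"
    "f0a1b2c\t910.2\tdiscard\textra validation\n"
)
print(best_kept(sample, "lower"))
```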
### Domains
| Domain | Use Cases |
|--------|-----------|
| `engineering` | Code speed, memory, bundle size, test pass rate, build time |
| `marketing` | Headlines, social copy, email subjects, ad copy, engagement |
| `content` | Article structure, SEO descriptions, readability, CTR |
| `prompts` | System prompts, chatbot tone, agent instructions |
| `custom` | Anything else with a measurable metric |
### If `program.md` Already Exists
The user may have written their own `program.md`. If found in the experiment directory, read it. It overrides the template. Only ask for what's missing.
---
## Agent Protocol
You are the loop. The scripts handle setup and evaluation — you handle the creative work.
### Before Starting
1. Read `.autoresearch/{domain}/{name}/config.cfg` to get:
   - `target`: the file you edit
   - `evaluate_cmd`: the command that measures your changes
   - `metric`: the metric name to look for in eval output
   - `metric_direction`: "lower" or "higher" is better
   - `time_budget_minutes`: max time per evaluation
2. Read `program.md` for strategy, constraints, and what you can/cannot change
3. Read `results.tsv` for experiment history (columns: commit, metric, status, description)
4. Check out the experiment branch: `git checkout autoresearch/{domain}/{name}`
### Each Iteration
1. Review results.tsv — what worked? What failed? What hasn't been tried?
2. Decide ONE change to the target file. One variable per experiment.
3. Edit the target file
4. Commit: `git add {target} && git commit -m "experiment: {description}"`
5. Evaluate: `python scripts/run_experiment.py --experiment {domain}/{name} --single`
6. Read the output — it prints KEEP, DISCARD, or CRASH with the metric value
7. Go to step 1
### What the Script Handles (you don't)
- Running the eval command with timeout
- Parsing the metric from eval output
- Comparing to previous best
- Reverting the commit on failure (`git reset --hard HEAD~1`)
- Logging the result to results.tsv
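The comparison step in the list above can be pictured as a small pure function. This is a sketch under stated assumptions, not the logic of `run_experiment.py`: the name `decide` is hypothetical, the first successful run is treated as the baseline, and a tie is discarded.

```python
def decide(metric, best, direction):
    """Return 'keep', 'discard', or 'crash' for one evaluated run.

    metric is None when the eval failed or printed no parsable value.
    best is None when no earlier run has been kept.
    """
    if metric is None:
        return "crash"
    if best is None:
        # Assumption: the first successful run becomes the baseline
        return "keep"
    improved = metric < best if direction == "lower" else metric > best
    # Assumption: a tie counts as no improvement
    return "keep" if improved else "discard"


print(decide(185.0, 240.0, "lower"))
print(decide(7.9, 8.4, "higher"))
print(decide(None, 8.4, "higher"))
```

A `discard` or `crash` result is the case where the script reverts the experiment commit.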
### Starting an Experiment
```bash
# Single iteration (the agent calls this repeatedly)
python scripts/run_experiment.py --experiment engineering/api-speed --single
# Dry run (test setup before starting)
python scripts/run_experiment.py --experiment engineering/api-speed --dry-run
```
### Strategy Escalation
- Runs 1-5: Low-hanging fruit (obvious improvements, simple optimizations)
- Runs 6-15: Systematic exploration (vary one parameter at a time)
- Runs 16-30: Structural changes (algorithm swaps, architecture shifts)
- Runs 31+: Radical experiments (completely different approaches)
- If no improvement in 20+ runs: update program.md Strategy section
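The escalation bands can be expressed as a simple lookup on the run number. The function name `strategy_phase` is hypothetical, and run 30 is placed in the structural band here; treat this as one possible reading of the list above.

```python
def strategy_phase(run_number):
    """Map a 1-based run number to the escalation band described above."""
    if run_number < 1:
        raise ValueError("run_number starts at 1")
    if run_number <= 5:
        return "low-hanging fruit"
    if run_number <= 15:
        return "systematic exploration"
    if run_number <= 30:
        return "structural changes"
    return "radical experiments"


for n in (1, 6, 16, 31):
    print(n, strategy_phase(n))
```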
### Self-Improvement
After every 10 experiments, review results.tsv for patterns. Update the
Strategy section of program.md with what you learned (e.g., "caching changes
consistently improve by 5-10%", "refactoring attempts never improve the metric").
Future iterations benefit from this accumulated knowledge.
### Stopping
- Run until the user interrupts, the context limit is reached, or the goal in program.md is met
- Before stopping: ensure results.tsv is up to date
- On context limit: the next session can resume — results.tsv and git log persist
### Rules
- **One change per experiment.** Don't change 5 things at once. You won't know what worked.
- **Simplicity criterion.** A small improvement that adds ugly complexity is not worth it. Equal performance with simpler code is a win. Removing code that gets the same results is the best outcome.
- **Never modify the evaluator.** `evaluate.py` is the ground truth. Modifying it invalidates all comparisons. Hard stop if you catch yourself doing this.
- **Timeout.** If a run exceeds 2.5× the time budget, kill it and treat it as a crash.
- **Crash handling.** If it's a typo or missing import, fix and re-run. If the idea is fundamentally broken, revert, log "crash", move on. 5 consecutive crashes → pause and alert.
- **No new dependencies.** Only use what's already available in the project.
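Two of the rules above are purely numeric, so they can be sketched directly: the kill threshold derived from the time budget and the pause after five consecutive crashes. The helper names are hypothetical, and the status list is assumed to be the `status` column of results.tsv in chronological order.

```python
def kill_after_seconds(time_budget_minutes, factor=2.5):
    """Seconds after which a run is killed and treated as a crash."""
    return time_budget_minutes * 60 * factor


def consecutive_crashes(statuses):
    """Count 'crash' entries at the end of the status history."""
    count = 0
    for status in reversed(statuses):
        if status != "crash":
            break
        count += 1
    return count


def should_pause(statuses, limit=5):
    """True once the most recent `limit` runs have all crashed."""
    return consecutive_crashes(statuses) >= limit


history = ["keep", "discard"] + ["crash"] * 5
print(kill_after_seconds(2), should_pause(history))
```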
---
## Evaluators
Ready-to-use evaluation scripts. Copied into the experiment directory during setup with `--evaluator`.
### Free Evaluators (no API cost)
| Evaluator | Metric | Use Case |
|-----------|--------|----------|
| `benchmark_speed` | `p50_ms` (lower) | Function/API execution time |
| `benchmark_size` | `size_bytes` (lower) | File, bundle, Docker image size |
| `test_pass_rate` | `pass_rate` (higher) | Test suite pass percentage |
| `build_speed` | `build_seconds` (lower) | Build/compile/Docker build time |
| `memory_usage` | `peak_mb` (lower) | Peak memory during execution |
### LLM Judge Evaluators (uses your subscription)
| Evaluator | Metric | Use Case |
|-----------|--------|----------|
| `llm_judge_content` | `ctr_score` 0-10 (higher) | Headlines, titles, descriptions |
| `llm_judge_prompt` | `quality_score` 0-100 (higher) | System prompts, agent instructions |
| `llm_judge_copy` | `engagement_score` 0-10 (higher) | Social posts, ad copy, emails |
LLM judges call the CLI tool the user is already running (Claude, Codex, Gemini). The evaluation prompt is locked inside `evaluate.py` — the agent cannot modify it. This prevents the agent from gaming its own evaluator.
The user's existing subscription covers the cost, within each plan's usage limits:
- Claude Code Max → Claude calls for evaluation
- Codex CLI (ChatGPT Pro) → Codex calls
- Gemini CLI (free tier) → free evaluation calls
### Custom Evaluators
If no built-in evaluator fits, the user writes their own `evaluate.py`. Only requirement: it must print `metric_name: value` to stdout.
```python
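#!/usr/bin/env python3
# My custom evaluator. DO NOT MODIFY after experiment starts.
import json
import subprocess
import sys
# "my-benchmark" and its JSON "score" field are placeholders for your own tool
result = subprocess.run(["my-benchmark", "--json"], capture_output=True, text=True)
if result.returncode != 0:
    sys.exit(result.stderr or "benchmark failed")  # exit non-zero, print no metric
# Parse and output in the required "metric_name: value" form
print(f"my_metric: {json.loads(result.stdout)['score']}")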
```
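On the consuming side, the `metric_name: value` contract makes the metric easy to pull out of arbitrary eval output. The sketch below is an assumption about how such parsing might look, not the parser used by `run_experiment.py`; the name `parse_metric` and the choice to take the last matching line are illustrative.

```python
import re


def parse_metric(output, name):
    """Return the last `name: value` float found in output, or None."""
    number = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
    pattern = rf"^\s*{re.escape(name)}\s*:\s*({number})\s*$"
    matches = re.findall(pattern, output, flags=re.MULTILINE)
    # Assumption: if the metric is printed more than once, the last value wins
    return float(matches[-1]) if matches else None


eval_output = "warming up...\n3 passed in 1.20s\np50_ms: 185.4\n"
print(parse_metric(eval_output, "p50_ms"))
```

A return value of `None` corresponds to the case where the run would be logged with metric "N/A".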
---
## Viewing Results
```bash
# Single experiment
python scripts/log_results.py --experiment engineering/api-speed
# All experiments in a domain
python scripts/log_results.py --domain engineering
# Cross-experiment dashboard
python scripts/log_results.py --dashboard
# Export formats
python scripts/log_results.py --experiment engineering/api-speed --format csv --output results.csv
python scripts/log_results.py --experiment engineering/api-speed --format markdown --output results.md
python scripts/log_results.py --dashboard --format markdown --output dashboard.md
```
### Dashboard Output
```
DOMAIN        EXPERIMENT     RUNS  KEPT  BEST     Δ FROM START  STATUS
engineering   api-speed      47    14    185ms    -76.9%        active
engineering   bundle-size    23    8     412KB    -58.3%        paused
marketing     medium-ctr     31    11    8.4/10   +68.0%        active
prompts       support-tone   15    6     82/100   +46.4%        done
```
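The "Δ FROM START" column reads as the percent change of the best metric relative to the first measured value. A minimal sketch of that arithmetic follows; the starting values (800 ms, 5.0) are back-calculated assumptions, since the dashboard shows only the best value and the percentage.

```python
def delta_from_start(start, best):
    """Percent change of the best metric relative to the starting metric."""
    return (best - start) / start * 100


# Starting values are assumed for illustration, not taken from a real log
print(f"api-speed:  {delta_from_start(800, 185):+.1f}%")
print(f"medium-ctr: {delta_from_start(5.0, 8.4):+.1f}%")
```

The sign follows the metric, not the goal: a negative delta is an improvement when the direction is `lower`, and a positive delta is an improvement when the direction is `higher`.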
### Export Formats
- **TSV** — default, tab-separated (compatible with spreadsheets)
- **CSV** — comma-separated, with proper quoting
- **Markdown** — formatted table, readable in GitHub/docs
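To make the relationship between the formats concrete, here is a sketch that turns tab-separated text into a Markdown table. It is an illustration only and not the code in `log_results.py`; the function name `tsv_to_markdown` is hypothetical.

```python
import csv
import io


def tsv_to_markdown(tsv_text):
    """Convert tab-separated text (header row first) to a Markdown table."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    for row in body:
        # Escape pipes so a description cannot break the table layout
        cells = [cell.replace("|", "\\|") for cell in row]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


print(tsv_to_markdown("commit\tmetric\tstatus\na1b2c3d\t185.4\tkeep\n"))
```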
---
## Proactive Triggers
Flag these without being asked:
- **Evaluation command untested or failing** → Test it before starting the loop. Run it once and verify the output.
- **Target file not in git** → `git init && git add . && git commit -m 'initial'` first.
- **Metric direction unclear** → Ask: is lower or higher better? Must know before starting.
- **Time budget too short** → If eval takes longer than budget, every run crashes.
- **Agent modifying evaluate.py** → Hard stop. This invalidates all comparisons.
- **5 consecutive crashes** → Pause the loop. Alert the user. Don't keep burning cycles.
- **No improvement in 20+ runs** → Suggest changing strategy in program.md or trying a different approach.
---
## Installation
### One-liner (any tool)
```bash
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/autoresearch-agent ~/.claude/skills/
```
### Multi-tool install
```bash
# Replace codex with gemini, cursor, windsurf, or openclaw as needed
./scripts/convert.sh --skill autoresearch-agent --tool codex
```
### OpenClaw
```bash
clawhub install cs-autoresearch-agent
```
---
## Related Skills
- **self-improving-agent** — improves an agent's own memory/rules over time. NOT for structured experiment loops.
- **senior-ml-engineer** — ML architecture decisions. Complementary — use for initial design, then autoresearch for optimization.
- **tdd-guide** — test-driven development. Complementary — tests can be the evaluation function.
- **skill-security-auditor** — audit skills before publishing. NOT for optimization loops.
FILE:CLAUDE.md
# Autoresearch Agent — Claude Code Instructions
This plugin runs autonomous experiment loops that optimize any file by a measurable metric.
## Commands
Use the `/ar:` namespace for all commands:
- `/ar:setup` — Set up a new experiment interactively
- `/ar:run` — Run a single experiment iteration
- `/ar:loop` — Start an autonomous loop with user-selected interval
- `/ar:status` — Show dashboard and results
- `/ar:resume` — Resume a paused experiment
## How it works
You (the AI agent) are the experiment loop. The scripts handle evaluation and git rollback.
1. You edit the target file with ONE change
2. You commit it
3. You call `run_experiment.py --single` — it evaluates and prints KEEP/DISCARD/CRASH
4. You repeat
Results persist in `results.tsv` and git log. Sessions can be resumed.
## When to use each command
### Starting fresh
```
/ar:setup
```
Creates the experiment directory, config, program.md, results.tsv, and git branch.
### Running one iteration at a time
```
/ar:run engineering/api-speed
```
Read history, make one change, evaluate, report result.
### Autonomous background loop
```
/ar:loop engineering/api-speed
```
Prompts for interval (10m, 1h, daily, weekly, monthly), then creates a recurring job.
### Checking progress
```
/ar:status
```
Shows the dashboard across all experiments with metrics and trends.
### Resuming after context limit or break
```
/ar:resume engineering/api-speed
```
Reads results history, checks out the branch, and continues where you left off.
## Agents
- **experiment-runner**: Spawned for each loop iteration. Reads config, results history, decides what to try, edits target, commits, evaluates.
## Key principle
**One change per experiment. Measure everything. Compound improvements.**
The agent never modifies the evaluator. The evaluator is ground truth.
FILE:agents/experiment-runner.md
# Experiment Runner Agent
You are an autonomous experimenter. Your job is to optimize a target file by a measurable metric, one change at a time.
## Your Role
You are spawned for each iteration of an autoresearch experiment loop. You:
1. Read the experiment state (config, strategy, results history)
2. Decide what to try based on accumulated evidence
3. Make ONE change to the target file
4. Commit and evaluate
5. Report the result
## Process
### 1. Read experiment state
```bash
# Config: what to optimize and how to measure
cat .autoresearch/{domain}/{name}/config.cfg
# Strategy: what you can/cannot change, current approach
cat .autoresearch/{domain}/{name}/program.md
# History: every experiment ever run, with outcomes
cat .autoresearch/{domain}/{name}/results.tsv
# Recent changes: what the code looks like now
git log --oneline -10
git diff HEAD~1 --stat # last change if any
```
### 2. Analyze results history
From results.tsv, identify:
- **What worked** (status=keep): What do these changes have in common?
- **What failed** (status=discard): What approaches should you avoid?
- **What crashed** (status=crash): Are there fragile areas to be careful with?
- **Trends**: Is the metric plateauing? Accelerating? Oscillating?
### 3. Select strategy based on experiment count
| Run Count | Strategy | Risk Level |
|-----------|----------|------------|
| 1-5 | Low-hanging fruit: obvious improvements, simple optimizations | Low |
| 6-15 | Systematic exploration: vary one parameter at a time | Medium |
| 16-30 | Structural changes: algorithm swaps, architecture shifts | High |
| 31+ | Radical experiments: completely different approaches | Very High |
If no improvement in the last 20 runs, it's time to update the Strategy section of program.md and try something fundamentally different.
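The stagnation rule can be checked mechanically from the `status` column of results.tsv. The sketch below assumes statuses are listed oldest first and that only `keep` counts as an improvement; the helper names are hypothetical.

```python
def runs_since_improvement(statuses):
    """Number of runs since the most recent 'keep' (all runs if none)."""
    for age, status in enumerate(reversed(statuses)):
        if status == "keep":
            return age
    return len(statuses)


def needs_new_strategy(statuses, window=20):
    """True when the last `window` runs produced no kept change."""
    return runs_since_improvement(statuses) >= window


history = ["keep"] + ["discard"] * 12 + ["crash"] * 8
print(runs_since_improvement(history), needs_new_strategy(history))
```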
### 4. Make ONE change
- Edit only the target file (from config.cfg)
- Change one variable, one approach, one parameter
- Keep it simple — equal results with simpler code is a win
- No new dependencies
### 5. Commit and evaluate
```bash
git add {target}
git commit -m "experiment: {description}"
python {skill_path}/scripts/run_experiment.py --experiment {domain}/{name} --single
```
### 6. Self-improvement
After every 10th experiment, update program.md's Strategy section:
- Which approaches consistently work? Double down.
- Which approaches consistently fail? Stop trying.
- Any new hypotheses based on the data?
## Hard Rules
- **ONE change per experiment.** Multiple changes = you won't know what worked.
- **NEVER modify the evaluator.** evaluate.py is the ground truth. Modifying it invalidates all comparisons. If you catch yourself doing this, stop immediately.
- **5 consecutive crashes → stop.** Alert the user. Don't burn cycles on a broken setup.
- **Simplicity criterion.** A small improvement that adds ugly complexity is NOT worth it. Removing code while getting the same results is the best outcome.
- **No new dependencies.** Only use what's already available.
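The crash rule can be checked mechanically from the status column of results.tsv. A minimal sketch, assuming statuses are listed oldest first and use the `keep` / `discard` / `crash` values described earlier; the helper names are hypothetical.

```python
def consecutive_crashes(statuses):
    """Count how many of the most recent runs crashed in a row."""
    count = 0
    for status in reversed(statuses):
        if status != "crash":
            break
        count += 1
    return count


def should_stop_for_crashes(statuses, limit=5):
    """True once the trailing crash streak reaches the limit (5 per the rule above)."""
    return consecutive_crashes(statuses) >= limit
```

Only the trailing streak matters: earlier crashes followed by a successful run do not count toward the limit.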
## Constraints
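- Never modify files other than the target file and program.md, and read nothing beyond those two plus the experiment state from step 1 (config.cfg, results.tsv, git history)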
- Never push to remote — all work stays local
- Never skip the evaluation step — every change must be measured
- Be concise in commit messages — they become the experiment log
FILE:evaluators/benchmark_size.py
#!/usr/bin/env python3
"""Measure file, bundle, or Docker image size.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import os
import subprocess
import sys
# --- CONFIGURE ONE OF THESE ---
# Option 1: File size
TARGET_FILE = "dist/main.js"
# Option 2: Directory size (uncomment to use)
# TARGET_DIR = "dist/"
# Option 3: Docker image (uncomment to use)
# DOCKER_IMAGE = "myapp:latest"
# DOCKER_BUILD_CMD = "docker build -t myapp:latest ."
# Option 4: Build first, then measure (uncomment to use)
# BUILD_CMD = "npm run build"
# --- END CONFIG ---
# Build if needed
if "BUILD_CMD" in globals():
    result = subprocess.run(BUILD_CMD, shell=True, capture_output=True)
    if result.returncode != 0:
        print(f"Build failed: {result.stderr.decode()[:200]}", file=sys.stderr)
        sys.exit(1)
# Measure
if "DOCKER_IMAGE" in globals():
    if "DOCKER_BUILD_CMD" in globals():
        subprocess.run(DOCKER_BUILD_CMD, shell=True, capture_output=True)
    result = subprocess.run(
        f"docker image inspect {DOCKER_IMAGE} --format '{{{{.Size}}}}'",
        shell=True, capture_output=True, text=True
    )
    try:
        size_bytes = int(result.stdout.strip())
    except ValueError:
        print(f"Could not parse size from: {result.stdout[:100]}", file=sys.stderr)
        sys.exit(1)
elif "TARGET_DIR" in globals():
    size_bytes = sum(
        os.path.getsize(os.path.join(dp, f))
        for dp, _, fns in os.walk(TARGET_DIR) for f in fns
    )
elif os.path.exists(TARGET_FILE):
    size_bytes = os.path.getsize(TARGET_FILE)
else:
    print(f"Target not found: {TARGET_FILE}", file=sys.stderr)
    sys.exit(1)
size_kb = size_bytes / 1024
size_mb = size_bytes / (1024 * 1024)
print(f"size_bytes: {size_bytes}")
print(f"size_kb: {size_kb:.1f}")
print(f"size_mb: {size_mb:.2f}")
FILE:evaluators/benchmark_speed.py
#!/usr/bin/env python3
"""Measure execution speed of a target function or command.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import statistics
import subprocess
import sys
import time
# --- CONFIGURE THESE ---
COMMAND = "python src/module.py" # Command to benchmark
RUNS = 5 # Number of runs
WARMUP = 1 # Warmup runs (not counted)
# --- END CONFIG ---
times = []
# Warmup
for _ in range(WARMUP):
    subprocess.run(COMMAND, shell=True, capture_output=True, timeout=120)
# Benchmark
for i in range(RUNS):
    t0 = time.perf_counter()
    result = subprocess.run(COMMAND, shell=True, capture_output=True, timeout=120)
    elapsed = (time.perf_counter() - t0) * 1000  # ms
    if result.returncode != 0:
        print(f"Run {i+1} failed (exit {result.returncode})", file=sys.stderr)
        print(f"stderr: {result.stderr.decode()[:200]}", file=sys.stderr)
        sys.exit(1)
    times.append(elapsed)
p50 = statistics.median(times)
p95 = sorted(times)[int(len(times) * 0.95)] if len(times) >= 5 else max(times)
print(f"p50_ms: {p50:.2f}")
print(f"p95_ms: {p95:.2f}")
print(f"runs: {RUNS}")
FILE:evaluators/build_speed.py
#!/usr/bin/env python3
"""Measure build/compile time.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import statistics
import subprocess
import sys
import time
# --- CONFIGURE THESE ---
BUILD_CMD = "npm run build"  # or: docker build -t test .
CLEAN_CMD = ""  # optional: npm run clean (run before each build)
RUNS = 3  # Number of builds to time (the median is reported as build_seconds)
# --- END CONFIG ---
times = []
for i in range(RUNS):
    # Clean if configured
    if CLEAN_CMD:
        subprocess.run(CLEAN_CMD, shell=True, capture_output=True, timeout=60)
    t0 = time.perf_counter()
    result = subprocess.run(BUILD_CMD, shell=True, capture_output=True, timeout=600)
    elapsed = time.perf_counter() - t0
    if result.returncode != 0:
        print(f"Build {i+1} failed (exit {result.returncode})", file=sys.stderr)
        print(f"stderr: {result.stderr.decode()[:200]}", file=sys.stderr)
        sys.exit(1)
    times.append(elapsed)
avg = statistics.mean(times)
median = statistics.median(times)
print(f"build_seconds: {median:.2f}")
print(f"build_avg: {avg:.2f}")
print(f"runs: {RUNS}")
FILE:evaluators/llm_judge_content.py
#!/usr/bin/env python3
"""LLM judge for content quality (headlines, titles, descriptions).
Uses the user's existing CLI tool (claude, codex, gemini) for evaluation.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import subprocess
import sys
from pathlib import Path
# --- CONFIGURE THESE ---
TARGET_FILE = "content/titles.md" # File being optimized
CLI_TOOL = "claude" # or: codex, gemini
# --- END CONFIG ---
# The judge prompt is FIXED — the agent cannot change how it's evaluated
JUDGE_PROMPT = """You are a content quality evaluator. Score the following content strictly.
Criteria (each scored 1-10):
1. CURIOSITY GAP — Does this make you want to click? Is there an information gap
that can only be resolved by reading? Generic titles score 1-3. Specific,
intriguing titles score 7-10.
2. SPECIFICITY — Are there concrete numbers, tools, or details? "How I improved
performance" = 2. "How I reduced API latency from 800ms to 185ms" = 9.
3. EMOTIONAL PULL — Does it trigger curiosity, surprise, fear of missing out,
or recognition? Flat titles score 1-3. Emotionally charged score 7-10.
4. SCROLL-STOP POWER — Would this stop someone scrolling through a feed or
search results? Would they pause on this headline? Rate honestly.
5. SEO KEYWORD PRESENCE — Are searchable, high-intent terms present naturally?
Keyword-stuffed = 3. Natural integration of search terms = 8-10.
Output EXACTLY this format (nothing else):
curiosity: <score>
specificity: <score>
emotional: <score>
scroll_stop: <score>
seo: <score>
ctr_score: <average of all 5 scores>
Be harsh. Most content is mediocre (4-6 range). Only exceptional content scores 8+."""
try:
    content = Path(TARGET_FILE).read_text()
except FileNotFoundError:
    print(f"Target file not found: {TARGET_FILE}", file=sys.stderr)
    sys.exit(1)
full_prompt = f"{JUDGE_PROMPT}\n\n---\n\nContent to evaluate:\n\n{content}"
# Call the user's CLI tool
result = subprocess.run(
    [CLI_TOOL, "-p", full_prompt],
    capture_output=True, text=True, timeout=120
)
if result.returncode != 0:
    print(f"LLM judge failed: {result.stderr[:200]}", file=sys.stderr)
    sys.exit(1)
# Parse output: echo the ctr_score line and the per-criterion lines
output = result.stdout
for line in output.splitlines():
    line = line.strip()
    if line.startswith("ctr_score:"):
        print(line)
    elif line.startswith(("curiosity:", "specificity:", "emotional:", "scroll_stop:", "seo:")):
        print(line)
# Verify ctr_score was found
if "ctr_score:" not in output:
    print("Could not parse ctr_score from LLM output", file=sys.stderr)
    print(f"Raw output: {output[:500]}", file=sys.stderr)
    sys.exit(1)
FILE:evaluators/llm_judge_copy.py
#!/usr/bin/env python3
"""LLM judge for marketing copy (social posts, ads, emails).
Uses the user's existing CLI tool for evaluation.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import subprocess
import sys
from pathlib import Path
# --- CONFIGURE THESE ---
TARGET_FILE = "posts.md" # Copy being optimized
CLI_TOOL = "claude" # or: codex, gemini
PLATFORM = "twitter" # twitter, linkedin, instagram, email, ad
# --- END CONFIG ---
JUDGE_PROMPTS = {
"twitter": """Score this Twitter/X post strictly:
1. HOOK (1-10) — Does the first line stop the scroll?
2. VALUE (1-10) — Does it provide insight, entertainment, or utility?
3. ENGAGEMENT (1-10) — Would people reply, retweet, or like?
4. BREVITY (1-10) — Is every word earning its place? No filler?
5. CTA (1-10) — Is there a clear next action (even implicit)?""",
"linkedin": """Score this LinkedIn post strictly:
1. HOOK (1-10) — Does the first line make you click "see more"?
2. STORYTELLING (1-10) — Is there a narrative arc or just statements?
3. CREDIBILITY (1-10) — Does it demonstrate expertise without bragging?
4. ENGAGEMENT (1-10) — Would professionals comment or share?
5. CTA (1-10) — Does it invite discussion or action?""",
"instagram": """Score this Instagram caption strictly:
1. HOOK (1-10) — Does the first line grab attention?
2. RELATABILITY (1-10) — Does the audience see themselves in this?
3. VISUAL MATCH (1-10) — Does the copy complement visual content?
4. HASHTAG STRATEGY (1-10) — Are hashtags relevant and not spammy?
5. CTA (1-10) — Does it encourage saves, shares, or comments?""",
"email": """Score this email subject + preview strictly:
1. OPEN INCENTIVE (1-10) — Would you open this in a crowded inbox?
2. SPECIFICITY (1-10) — Is it concrete or vague?
3. URGENCY (1-10) — Is there a reason to open now vs later?
4. PERSONALIZATION (1-10) — Does it feel written for someone, not everyone?
5. PREVIEW SYNC (1-10) — Does the preview text complement the subject?""",
"ad": """Score this ad copy strictly:
1. ATTENTION (1-10) — Does it stop someone scrolling past ads?
2. DESIRE (1-10) — Does it create want for the product/service?
3. PROOF (1-10) — Is there credibility (numbers, social proof)?
4. ACTION (1-10) — Is the CTA clear and compelling?
5. OBJECTION HANDLING (1-10) — Does it preempt "why not"?""",
}
platform_prompt = JUDGE_PROMPTS.get(PLATFORM, JUDGE_PROMPTS["twitter"])
JUDGE_PROMPT = f"""{platform_prompt}
IMPORTANT: You MUST use criterion_1 through criterion_5 as labels, NOT the criterion names.
Do NOT output "hook: 7" — output "criterion_1: 7".
Output EXACTLY this format:
criterion_1: <score>
criterion_2: <score>
criterion_3: <score>
criterion_4: <score>
criterion_5: <score>
engagement_score: <average of all 5>
Be harsh. Most copy is mediocre (4-6). Only exceptional copy scores 8+."""
try:
    content = Path(TARGET_FILE).read_text()
except FileNotFoundError:
    print(f"Target file not found: {TARGET_FILE}", file=sys.stderr)
    sys.exit(1)
full_prompt = f"{JUDGE_PROMPT}\n\n---\n\nCopy to evaluate:\n\n{content}"
result = subprocess.run(
    [CLI_TOOL, "-p", full_prompt],
    capture_output=True, text=True, timeout=120
)
if result.returncode != 0:
    print(f"LLM judge failed: {result.stderr[:200]}", file=sys.stderr)
    sys.exit(1)
output = result.stdout
found_scores = False
for line in output.splitlines():
    line = line.strip()
    if line.startswith("engagement_score:") or line.startswith("criterion_"):
        print(line)
        found_scores = True
# Fallback: if no criterion_ lines found, try parsing any "word: digit" lines
if not found_scores:
    import re
    fallback_scores = []
    for line in output.splitlines():
        line = line.strip()
        match = re.match(r'^(\w[\w\s]*?):\s*(\d+(?:\.\d+)?)\s*$', line)
        if match and match.group(1).lower() not in ("engagement_score",):
            fallback_scores.append(float(match.group(2)))
            print(f"criterion_{len(fallback_scores)}: {match.group(2)}")
    if fallback_scores:
        avg = sum(fallback_scores) / len(fallback_scores)
        print(f"engagement_score: {avg:.1f}")
        found_scores = True
if "engagement_score:" not in output and not found_scores:
    print("Could not parse engagement_score from LLM output", file=sys.stderr)
    print(f"Raw: {output[:500]}", file=sys.stderr)
    sys.exit(1)
FILE:evaluators/llm_judge_prompt.py
#!/usr/bin/env python3
"""LLM judge for prompt/instruction quality.
Uses the user's existing CLI tool for evaluation.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import json
import subprocess
import sys
from pathlib import Path
# --- CONFIGURE THESE ---
TARGET_FILE = "prompt.md" # Prompt being optimized
TEST_CASES_FILE = "tests/cases.json" # Test cases: [{"input": "...", "expected": "..."}]
CLI_TOOL = "claude" # or: codex, gemini
# --- END CONFIG ---
JUDGE_PROMPT_TEMPLATE = """You are evaluating a system prompt's effectiveness.
SYSTEM PROMPT BEING TESTED:
{prompt}
TEST INPUT:
{input}
EXPECTED OUTPUT (reference):
{expected}
ACTUAL OUTPUT:
{actual}
Score the actual output on these criteria (each 1-10):
1. ACCURACY — Does it match the expected output's intent and facts?
2. COMPLETENESS — Does it cover all required elements?
3. CLARITY — Is it well-structured and easy to understand?
4. INSTRUCTION_FOLLOWING — Does it follow the system prompt's guidelines?
Output EXACTLY: quality_score: <average of all 4>
Nothing else."""
try:
    prompt = Path(TARGET_FILE).read_text()
except FileNotFoundError:
    print(f"Target file not found: {TARGET_FILE}", file=sys.stderr)
    sys.exit(1)
try:
    test_cases = json.loads(Path(TEST_CASES_FILE).read_text())
except FileNotFoundError:
    print(f"Test cases file not found: {TEST_CASES_FILE}", file=sys.stderr)
    sys.exit(1)
scores = []
for i, case in enumerate(test_cases):
    # Generate output using the prompt
    gen_prompt = f"{prompt}\n\n{case['input']}"
    gen_result = subprocess.run(
        [CLI_TOOL, "-p", gen_prompt],
        capture_output=True, text=True, timeout=60
    )
    if gen_result.returncode != 0:
        print(f"Generation failed for case {i+1}", file=sys.stderr)
        scores.append(0)
        continue
    actual = gen_result.stdout.strip()
    # Judge the output
    judge_prompt = JUDGE_PROMPT_TEMPLATE.format(
        prompt=prompt[:500],
        input=case["input"],
        expected=case.get("expected", "N/A"),
        actual=actual[:500]
    )
    judge_result = subprocess.run(
        [CLI_TOOL, "-p", judge_prompt],
        capture_output=True, text=True, timeout=60
    )
    if judge_result.returncode != 0:
        scores.append(0)
        continue
    # Parse score
    for line in judge_result.stdout.splitlines():
        if "quality_score:" in line:
            try:
                score = float(line.split(":")[-1].strip())
                scores.append(score)
            except ValueError:
                scores.append(0)
            break
    else:
        # No quality_score line in the judge output
        scores.append(0)
    print(f" Case {i+1}/{len(test_cases)}: {scores[-1]:.1f}", file=sys.stderr)
if not scores:
print("No test cases evaluated", file=sys.stderr)
sys.exit(1)
avg = sum(scores) / len(scores)
quality = avg * 10 # 1-10 scores → 10-100 range
print(f"quality_score: {quality:.2f}")
print(f"cases_tested: {len(scores)}")
print(f"avg_per_case: {avg:.2f}")
FILE:evaluators/memory_usage.py
#!/usr/bin/env python3
"""Measure peak memory usage of a command.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import platform
import subprocess
import sys
# --- CONFIGURE THESE ---
COMMAND = "python src/module.py" # Command to measure
# --- END CONFIG ---
system = platform.system()
if system == "Linux":
    # Use /usr/bin/time for peak RSS
    result = subprocess.run(
        f"/usr/bin/time -v {COMMAND}",
        shell=True, capture_output=True, text=True, timeout=300
    )
    output = result.stderr
    for line in output.splitlines():
        if "Maximum resident set size" in line:
            kb = int(line.split(":")[-1].strip())
            mb = kb / 1024
            print(f"peak_mb: {mb:.1f}")
            print(f"peak_kb: {kb}")
            sys.exit(0)
    print("Could not parse memory from /usr/bin/time output", file=sys.stderr)
    sys.exit(1)
elif system == "Darwin":
    # macOS: use /usr/bin/time -l
    result = subprocess.run(
        f"/usr/bin/time -l {COMMAND}",
        shell=True, capture_output=True, text=True, timeout=300
    )
    output = result.stderr
    for line in output.splitlines():
        if "maximum resident set size" in line.lower():
            # macOS reports in bytes
            val = int(line.strip().split()[0])
            kb = val / 1024
            mb = val / (1024 * 1024)
            print(f"peak_mb: {mb:.1f}")
            print(f"peak_kb: {int(kb)}")
            sys.exit(0)
    print("Could not parse memory from time output", file=sys.stderr)
    sys.exit(1)
else:
    print(f"Unsupported platform: {system}. Use Linux or macOS.", file=sys.stderr)
    sys.exit(1)
FILE:evaluators/test_pass_rate.py
#!/usr/bin/env python3
"""Measure test suite pass rate.
DO NOT MODIFY after experiment starts — this is the fixed evaluator."""
import re
import subprocess
import sys
# --- CONFIGURE THESE ---
TEST_CMD = "pytest tests/ --tb=no -q" # Test command
# --- END CONFIG ---
result = subprocess.run(TEST_CMD, shell=True, capture_output=True, text=True, timeout=300)
output = result.stdout + "\n" + result.stderr
# Try to parse pytest output: "X passed, Y failed, Z errors"
passed = failed = errors = 0
# pytest short format: "5 passed, 2 failed in 1.23s"
match = re.search(r"(\d+) passed", output)
if match:
    passed = int(match.group(1))
match = re.search(r"(\d+) failed", output)
if match:
    failed = int(match.group(1))
match = re.search(r"(\d+) error", output)
if match:
    errors = int(match.group(1))
total = passed + failed + errors
if total == 0:
    # Try unittest format: "Ran X tests"
    match = re.search(r"Ran (\d+) test", output)
    if match:
        total = int(match.group(1))
        if result.returncode == 0:
            passed = total
        else:
            # Count failures and errors from the "FAILED (failures=N, errors=M)" summary
            fail_match = re.search(r"FAILED \(failures=(\d+)", output)
            if fail_match:
                failed = int(fail_match.group(1))
            err_match = re.search(r"errors=(\d+)", output)
            if err_match:
                errors = int(err_match.group(1))
            passed = total - failed - errors
if total == 0:
    print("Could not parse test results", file=sys.stderr)
    print(f"Output: {output[:500]}", file=sys.stderr)
    sys.exit(1)
rate = passed / total
print(f"pass_rate: {rate:.4f}")
print(f"passed: {passed}")
print(f"failed: {failed}")
print(f"total: {total}")
FILE:references/experiment-domains.md
# Experiment Domains Guide
## Domain: Engineering
### Code Speed Optimization
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name api-speed \
--target src/api/search.py \
--eval "python .autoresearch/engineering/api-speed/evaluate.py" \
--metric p50_ms \
--direction lower \
--evaluator benchmark_speed
```
**What the agent optimizes:** Algorithm, data structures, caching, query patterns, I/O.
**Cost:** Free — just runs benchmarks.
**Speed:** ~5 min/experiment, ~12/hour, ~100 overnight.
### Bundle Size Reduction
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name bundle-size \
--target webpack.config.js \
--eval "npm run build && python .autoresearch/engineering/bundle-size/evaluate.py" \
--metric size_bytes \
--direction lower \
--evaluator benchmark_size
```
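Edit `evaluate.py` to set `TARGET_FILE = "dist/main.js"`. The `--eval` command above already runs `npm run build`; alternatively, uncomment `BUILD_CMD = "npm run build"` in `evaluate.py` and drop the build step from `--eval` so the build runs only once.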
### Test Pass Rate
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name fix-flaky-tests \
--target src/utils/parser.py \
--eval "python .autoresearch/engineering/fix-flaky-tests/evaluate.py" \
--metric pass_rate \
--direction higher \
--evaluator test_pass_rate
```
### Docker Build Speed
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name docker-build \
--target Dockerfile \
--eval "python .autoresearch/engineering/docker-build/evaluate.py" \
--metric build_seconds \
--direction lower \
--evaluator build_speed
```
### Memory Optimization
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name memory-usage \
--target src/processor.py \
--eval "python .autoresearch/engineering/memory-usage/evaluate.py" \
--metric peak_mb \
--direction lower \
--evaluator memory_usage
```
### ML Training (Karpathy-style)
Requires NVIDIA GPU. See [autoresearch](https://github.com/karpathy/autoresearch).
```bash
python scripts/setup_experiment.py \
--domain engineering \
--name ml-training \
--target train.py \
--eval "uv run train.py" \
--metric val_bpb \
--direction lower \
--time-budget 5
```
---
## Domain: Marketing
### Medium Article Headlines
```bash
python scripts/setup_experiment.py \
--domain marketing \
--name medium-ctr \
--target content/titles.md \
--eval "python .autoresearch/marketing/medium-ctr/evaluate.py" \
--metric ctr_score \
--direction higher \
--evaluator llm_judge_content
```
Edit `evaluate.py`: set `TARGET_FILE = "content/titles.md"` and `CLI_TOOL = "claude"`.
**What the agent optimizes:** Title phrasing, curiosity gaps, specificity, emotional triggers.
**Cost:** Uses your CLI subscription rather than per-call API billing; your plan's usage limits still apply.
**Speed:** ~2 min/experiment, ~30/hour.
### Social Media Copy
```bash
python scripts/setup_experiment.py \
--domain marketing \
--name twitter-engagement \
--target social/tweets.md \
--eval "python .autoresearch/marketing/twitter-engagement/evaluate.py" \
--metric engagement_score \
--direction higher \
--evaluator llm_judge_copy
```
Edit `evaluate.py`: set `PLATFORM = "twitter"` (or linkedin, instagram).
### Email Subject Lines
```bash
python scripts/setup_experiment.py \
--domain marketing \
--name email-open-rate \
--target emails/subjects.md \
--eval "python .autoresearch/marketing/email-open-rate/evaluate.py" \
--metric engagement_score \
--direction higher \
--evaluator llm_judge_copy
```
Edit `evaluate.py`: set `PLATFORM = "email"`.
### Ad Copy
```bash
python scripts/setup_experiment.py \
--domain marketing \
--name ad-copy-q2 \
--target ads/google-search.md \
--eval "python .autoresearch/marketing/ad-copy-q2/evaluate.py" \
--metric engagement_score \
--direction higher \
--evaluator llm_judge_copy
```
Edit `evaluate.py`: set `PLATFORM = "ad"`.
---
## Domain: Content
### Article Structure & Readability
```bash
python scripts/setup_experiment.py \
--domain content \
--name article-structure \
--target drafts/my-article.md \
--eval "python .autoresearch/content/article-structure/evaluate.py" \
--metric ctr_score \
--direction higher \
--evaluator llm_judge_content
```
### SEO Descriptions
```bash
python scripts/setup_experiment.py \
--domain content \
--name seo-meta \
--target seo/descriptions.md \
--eval "python .autoresearch/content/seo-meta/evaluate.py" \
--metric ctr_score \
--direction higher \
--evaluator llm_judge_content
```
---
## Domain: Prompts
### System Prompt Optimization
```bash
python scripts/setup_experiment.py \
--domain prompts \
--name support-bot \
--target prompts/support-system.md \
--eval "python .autoresearch/prompts/support-bot/evaluate.py" \
--metric quality_score \
--direction higher \
--evaluator llm_judge_prompt
```
Requires `tests/cases.json` with test inputs and expected outputs:
```json
[
{
"input": "I can't log in to my account",
"expected": "Ask for email, check account status, offer password reset"
},
{
"input": "How do I cancel my subscription?",
"expected": "Empathetic response, explain cancellation steps, offer retention"
}
]
```
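Before starting a run, it may help to check that the test-case file has the shape the evaluator reads. The sketch below assumes only what `llm_judge_prompt.py` shows: each case needs an `input` string, and `expected` is optional (the evaluator falls back to "N/A"). The function name `load_cases` is hypothetical.

```python
import json


def load_cases(text):
    """Parse the JSON test cases and check the fields the evaluator relies on."""
    cases = json.loads(text)
    if not isinstance(cases, list) or not cases:
        raise ValueError("expected a non-empty JSON array of test cases")
    for i, case in enumerate(cases, start=1):
        if not isinstance(case, dict) or not isinstance(case.get("input"), str):
            raise ValueError(f"case {i} needs a string 'input' field")
    return cases
```

A typical call would be `load_cases(Path("tests/cases.json").read_text())`, with `Path` imported from `pathlib`.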
### Agent Skill Optimization
```bash
python scripts/setup_experiment.py \
--domain prompts \
--name skill-improvement \
--target SKILL.md \
--eval "python .autoresearch/prompts/skill-improvement/evaluate.py" \
--metric quality_score \
--direction higher \
--evaluator llm_judge_prompt
```
---
## Choosing Your Domain
| I want to... | Domain | Evaluator | Cost |
|-------------|--------|-----------|------|
| Speed up my code | engineering | benchmark_speed | Free |
| Shrink my bundle | engineering | benchmark_size | Free |
| Fix flaky tests | engineering | test_pass_rate | Free |
| Speed up Docker builds | engineering | build_speed | Free |
| Reduce memory usage | engineering | memory_usage | Free |
| Train ML models | engineering | (custom) | Free + GPU |
| Write better headlines | marketing | llm_judge_content | Subscription |
| Improve social posts | marketing | llm_judge_copy | Subscription |
| Optimize email subjects | marketing | llm_judge_copy | Subscription |
| Improve ad copy | marketing | llm_judge_copy | Subscription |
| Optimize article structure | content | llm_judge_content | Subscription |
| Improve SEO descriptions | content | llm_judge_content | Subscription |
| Optimize system prompts | prompts | llm_judge_prompt | Subscription |
| Improve agent skills | prompts | llm_judge_prompt | Subscription |
**First time?** Start with an engineering experiment (free, fast, measurable). Once comfortable, try content/marketing with LLM judges.
FILE:references/program-template.md
# program.md Templates
Copy the template for your domain and paste into your project root as `program.md`.
---
## ML Training (Karpathy-style)
```markdown
# autoresearch — ML Training
## Goal
Minimize val_bpb on the validation set. Lower is better.
## What You Can Change (train.py only)
- Model architecture (depth, width, attention heads, FFN ratio)
- Optimizer (learning rate, warmup, scheduler, weight decay)
- Training loop (batch size, gradient accumulation, clipping)
- Regularization (dropout, weight tying, etc.)
- Any self-contained improvement that doesn't require new packages
## What You Cannot Change
- prepare.py (fixed — contains evaluation harness)
- Dependencies (pyproject.toml is locked)
- Time budget (always 5 minutes, wall clock)
- Evaluation metric (val_bpb is the ground truth)
## Strategy
1. First run: establish baseline. Do not change anything.
2. Explore learning rate range (try 2x and 0.5x current)
3. Try depth changes (±2 layers)
4. Try optimizer changes (Muon vs. AdamW variants)
5. If things improve, double down. If stuck, try something radical.
## Simplicity Rule
A small improvement with ugly code is NOT worth it.
Equal performance with simpler code IS worth it.
Removing code while getting the same results is the best outcome.
## Stop When
val_bpb < 0.95 OR after 100 experiments, whichever comes first.
```
---
## Prompt Engineering
```markdown
# autoresearch — Prompt Optimization
## Goal
Maximize quality_score on the test suite. Higher is better (0-100).
## What You Can Change (prompt.md only)
- System prompt instructions
- Examples and few-shot demonstrations
- Output format specifications
- Chain-of-thought instructions
- Persona and tone
- Task decomposition strategies
## What You Cannot Change
- evaluate.py (fixed evaluation harness)
- Test cases in tests/ (ground truth)
- Model being evaluated (specified in evaluate.py)
- Scoring criteria (defined in evaluate.py)
## Strategy
1. First run: baseline with current prompt (or empty)
2. Add clear role/persona definition
3. Add output format specification
4. Add chain-of-thought instruction
5. Add 2-3 diverse examples
6. Refine based on failure modes from run.log
## Evaluation
- evaluate.py runs the prompt against 20 test cases
- Each test case is scored 1-10 by your CLI tool (Claude, Codex, or Gemini)
- quality_score = average * 10 (maps to 10-100)
- Run log shows which test cases failed
## Stop When
quality_score >= 85 OR after 50 experiments.
```
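The scoring arithmetic in the Evaluation section of this template is small enough to sketch directly. This assumes per-case scores on the 1-10 scale, with 0 recorded for a case whose generation or judging failed, as `llm_judge_prompt.py` does; the function name is hypothetical.

```python
def quality_score(case_scores):
    """Average the per-case scores and scale them by 10."""
    if not case_scores:
        raise ValueError("no test cases evaluated")
    avg = sum(case_scores) / len(case_scores)
    return avg * 10
```

For example, per-case scores of 8, 9, and 7 average to 8.0, giving a quality_score of 80.0, which is below a stop threshold of 85.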
---
## Code Performance
```markdown
# autoresearch — Performance Optimization
## Goal
Minimize p50_ms (median latency). Lower is better.
## What You Can Change (src/module.py only)
- Algorithm implementation
- Data structures (use faster alternatives)
- Caching and memoization
- Vectorization (NumPy, etc.)
- Loop optimization
- I/O patterns
- Memory allocation patterns
## What You Cannot Change
- benchmark.py (fixed benchmark harness)
- Public API (function signatures must stay the same)
- External dependencies (add nothing new)
- Correctness tests (tests/ must still pass)
## Constraints
- Correctness is non-negotiable. benchmark.py runs tests first.
- If tests fail → immediate crash status, no metric recorded.
- Memory usage: p99 < 2x baseline acceptable, hard limit at 4x.
## Strategy
1. Baseline: profile first, don't guess
2. Check if there's any O(n²) → O(n log n) opportunity
3. Try caching repeated computations
4. Try NumPy vectorization for loops
5. Try algorithm-level changes last (higher risk)
## Stop When
p50_ms < 50ms OR improvement plateaus for 10 consecutive experiments.
```
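The "Stop When" rule in this template combines a target and a plateau check. A minimal sketch follows, assuming a list of p50_ms values with one entry per experiment, oldest first, and reading "plateaus" as "none of the last 10 experiments beat the best earlier value". That reading and the name `performance_stop` are assumptions.

```python
def performance_stop(p50_history, target_ms=50.0, patience=10):
    """Return True when the latency target is met or progress has stalled."""
    if not p50_history:
        return False
    if min(p50_history) < target_ms:
        return True  # target reached
    if len(p50_history) <= patience:
        return False  # not enough history to call a plateau
    best_before = min(p50_history[:-patience])
    # Plateau: the last `patience` experiments never improved on the earlier best.
    return min(p50_history[-patience:]) >= best_before
```

Crashed runs have no metric, so they would need to be filtered out before calling this helper.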
---
## Agent Skill Optimization
```markdown
# autoresearch — Skill Optimization
## Goal
Maximize pass_rate on the task evaluation suite. Higher is better (0-1).
## What You Can Change (SKILL.md only)
- Skill description and trigger phrases
- Core workflow steps and ordering
- Decision frameworks and rules
- Output format specifications
- Example inputs/outputs
- Related skills disambiguation
- Proactive trigger conditions
## What You Cannot Change
- your custom evaluate.py (see Custom Evaluators in SKILL.md)
- Test tasks in tests/ (ground truth benchmark)
- Skill name (used for routing)
- License or metadata
## Evaluation
- evaluate.py runs SKILL.md against 15 standardized tasks
- Your CLI tool scores each task: 0 (fail), 0.5 (partial), 1 (pass)
- pass_rate = sum(scores) / 15
## Strategy
1. Baseline: run as-is
2. Improve trigger description (better routing = more passes)
3. Sharpen the core workflow (clearer = better execution)
4. Add missing edge cases to the rules section
5. Improve disambiguation (reduce false-positive routing)
## Simplicity Rule
A shorter SKILL.md that achieves the same score is better.
Aim for 200-400 lines total.
## Stop When
pass_rate >= 0.90 OR after 30 experiments.
```
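The pass-rate formula from the Evaluation section of this template can be sketched as follows. The 15-task count and the 0 / 0.5 / 1 scale come from the template; the function name and the validation are illustrative assumptions.

```python
ALLOWED_SCORES = {0, 0.5, 1}


def pass_rate(task_scores, expected_tasks=15):
    """Compute sum(scores) / task count for scores of 0 (fail), 0.5 (partial), 1 (pass)."""
    if len(task_scores) != expected_tasks:
        raise ValueError(f"expected {expected_tasks} task scores, got {len(task_scores)}")
    if any(score not in ALLOWED_SCORES for score in task_scores):
        raise ValueError("each score must be 0, 0.5, or 1")
    return sum(task_scores) / expected_tasks
```

With 12 passes, 2 partials, and 1 fail the rate is 13/15 (about 0.87), which is still short of the 0.90 stop threshold.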
FILE:scripts/log_results.py
#!/usr/bin/env python3
"""
autoresearch-agent: Results Viewer
View experiment results in multiple formats: terminal, CSV, Markdown.
Supports single experiment, domain, or cross-experiment dashboard.
Usage:
python scripts/log_results.py --experiment engineering/api-speed
python scripts/log_results.py --domain engineering
python scripts/log_results.py --dashboard
python scripts/log_results.py --experiment engineering/api-speed --format csv --output results.csv
python scripts/log_results.py --experiment engineering/api-speed --format markdown --output results.md
python scripts/log_results.py --dashboard --format markdown --output dashboard.md
"""
import argparse
import csv
import io
import sys
import time
from pathlib import Path
def find_autoresearch_root():
    """Find .autoresearch/ in project or user home."""
    project_root = Path(".").resolve() / ".autoresearch"
    if project_root.exists():
        return project_root
    user_root = Path.home() / ".autoresearch"
    if user_root.exists():
        return user_root
    return None
def load_config(experiment_dir):
    """Load config.cfg."""
    cfg_file = experiment_dir / "config.cfg"
    config = {}
    if cfg_file.exists():
        for line in cfg_file.read_text().splitlines():
            if ":" in line:
                k, v = line.split(":", 1)
                config[k.strip()] = v.strip()
    return config
def load_results(experiment_dir):
    """Load results.tsv into list of dicts."""
    tsv = experiment_dir / "results.tsv"
    if not tsv.exists():
        return []
    results = []
    for line in tsv.read_text().splitlines()[1:]:
        parts = line.split("\t")
        if len(parts) >= 4:
            try:
                metric = float(parts[1]) if parts[1] != "N/A" else None
            except ValueError:
                metric = None
            results.append({
                "commit": parts[0],
                "metric": metric,
                "status": parts[2],
                "description": parts[3],
            })
    return results
def compute_stats(results, direction):
    """Compute statistics from results."""
    keeps = [r for r in results if r["status"] == "keep"]
    discards = [r for r in results if r["status"] == "discard"]
    crashes = [r for r in results if r["status"] == "crash"]
    valid_keeps = [r for r in keeps if r["metric"] is not None]
    baseline = valid_keeps[0]["metric"] if valid_keeps else None
    if valid_keeps:
        best = min(r["metric"] for r in valid_keeps) if direction == "lower" else max(r["metric"] for r in valid_keeps)
    else:
        best = None
    pct_change = None
    if baseline is not None and best is not None and baseline != 0:
        if direction == "lower":
            pct_change = (baseline - best) / baseline * 100
        else:
            pct_change = (best - baseline) / baseline * 100
    return {
        "total": len(results),
        "keeps": len(keeps),
        "discards": len(discards),
        "crashes": len(crashes),
        "baseline": baseline,
        "best": best,
        "pct_change": pct_change,
    }
# --- Terminal Output ---
def print_experiment(experiment_dir, experiment_path):
    """Print single experiment results to terminal."""
    config = load_config(experiment_dir)
    results = load_results(experiment_dir)
    direction = config.get("metric_direction", "lower")
    metric_name = config.get("metric", "metric")
    if not results:
        print(f"No results for {experiment_path}")
        return
    stats = compute_stats(results, direction)
    print(f"\n{'─' * 65}")
    print(f" {experiment_path}")
    print(f" Target: {config.get('target', '?')} | Metric: {metric_name} ({direction})")
    print(f"{'─' * 65}")
    print(f" Total: {stats['total']} | Keep: {stats['keeps']} | Discard: {stats['discards']} | Crash: {stats['crashes']}")
    if stats["baseline"] is not None and stats["best"] is not None:
        pct = f" ({stats['pct_change']:+.1f}%)" if stats["pct_change"] is not None else ""
        print(f" Baseline: {stats['baseline']:.6f} -> Best: {stats['best']:.6f}{pct}")
    print(f"\n {'COMMIT':<10} {'METRIC':>12} {'STATUS':<10} DESCRIPTION")
    print(f" {'─' * 60}")
    for r in results:
        m = f"{r['metric']:.6f}" if r["metric"] is not None else "N/A "
        icon = {"keep": "+", "discard": "-", "crash": "!"}.get(r["status"], "?")
        print(f" {r['commit']:<10} {m:>12} {icon} {r['status']:<7} {r['description'][:35]}")
    print()
def print_dashboard(root):
"""Print cross-experiment dashboard."""
experiments = []
for domain_dir in sorted(root.iterdir()):
if not domain_dir.is_dir() or domain_dir.name.startswith("."):
continue
for exp_dir in sorted(domain_dir.iterdir()):
if not exp_dir.is_dir() or not (exp_dir / "config.cfg").exists():
continue
config = load_config(exp_dir)
results = load_results(exp_dir)
direction = config.get("metric_direction", "lower")
stats = compute_stats(results, direction)
best_str = f"{stats['best']:.4f}" if stats["best"] is not None else "—"
pct_str = f"{stats['pct_change']:+.1f}%" if stats["pct_change"] is not None else "—"
# Determine status
status = "idle"
if stats["total"] > 0:
tsv = exp_dir / "results.tsv"
if tsv.exists():
age_hours = (time.time() - tsv.stat().st_mtime) / 3600
status = "active" if age_hours < 1 else "paused" if age_hours < 24 else "done"
experiments.append({
"domain": domain_dir.name,
"name": exp_dir.name,
"runs": stats["total"],
"kept": stats["keeps"],
"best": best_str,
"change": pct_str,
"status": status,
"metric": config.get("metric", "?"),
})
if not experiments:
print("No experiments found.")
return experiments
print(f"\n{'─' * 90}")
print(f" autoresearch — Dashboard")
print(f"{'─' * 90}")
print(f" {'DOMAIN':<15} {'EXPERIMENT':<20} {'RUNS':>5} {'KEPT':>5} {'BEST':>12} {'CHANGE':>10} {'STATUS':<8}")
print(f" {'─' * 85}")
for e in experiments:
print(f" {e['domain']:<15} {e['name']:<20} {e['runs']:>5} {e['kept']:>5} {e['best']:>12} {e['change']:>10} {e['status']:<8}")
print()
return experiments
# --- CSV Export ---
def export_experiment_csv(experiment_dir, experiment_path):
"""Export single experiment as CSV string."""
config = load_config(experiment_dir)
results = load_results(experiment_dir)
direction = config.get("metric_direction", "lower")
stats = compute_stats(results, direction)
buf = io.StringIO()
writer = csv.writer(buf)
# Header with metadata
writer.writerow(["# Experiment", experiment_path])
writer.writerow(["# Target", config.get("target", "")])
writer.writerow(["# Metric", f"{config.get('metric', '')} ({direction} is better)"])
if stats["baseline"] is not None:
writer.writerow(["# Baseline", f"{stats['baseline']:.6f}"])
if stats["best"] is not None:
pct = f" ({stats['pct_change']:+.1f}%)" if stats["pct_change"] is not None else ""
writer.writerow(["# Best", f"{stats['best']:.6f}{pct}"])
writer.writerow(["# Total", stats["total"]])
writer.writerow(["# Keep/Discard/Crash", f"{stats['keeps']}/{stats['discards']}/{stats['crashes']}"])
writer.writerow([])
writer.writerow(["Commit", "Metric", "Status", "Description"])
for r in results:
m = f"{r['metric']:.6f}" if r["metric"] is not None else "N/A"
writer.writerow([r["commit"], m, r["status"], r["description"]])
return buf.getvalue()
def export_dashboard_csv(root, domain_filter=None):
"""Export dashboard as CSV string."""
experiments = []
for domain_dir in sorted(root.iterdir()):
if not domain_dir.is_dir() or domain_dir.name.startswith("."):
continue
if domain_filter and domain_dir.name != domain_filter:
continue
for exp_dir in sorted(domain_dir.iterdir()):
if not exp_dir.is_dir() or not (exp_dir / "config.cfg").exists():
continue
config = load_config(exp_dir)
results = load_results(exp_dir)
direction = config.get("metric_direction", "lower")
stats = compute_stats(results, direction)
best_str = f"{stats['best']:.6f}" if stats["best"] is not None else ""
pct_str = f"{stats['pct_change']:+.1f}%" if stats["pct_change"] is not None else ""
experiments.append([
domain_dir.name, exp_dir.name, config.get("metric", ""),
stats["total"], stats["keeps"], stats["discards"], stats["crashes"],
best_str, pct_str
])
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Domain", "Experiment", "Metric", "Runs", "Kept", "Discarded", "Crashed", "Best", "Change"])
for e in experiments:
writer.writerow(e)
return buf.getvalue()
# --- Markdown Export ---
def export_experiment_markdown(experiment_dir, experiment_path):
"""Export single experiment as Markdown string."""
config = load_config(experiment_dir)
results = load_results(experiment_dir)
direction = config.get("metric_direction", "lower")
metric_name = config.get("metric", "metric")
stats = compute_stats(results, direction)
lines = []
lines.append(f"# Autoresearch: {experiment_path}\n")
lines.append(f"**Target:** `{config.get('target', '?')}` ")
lines.append(f"**Metric:** `{metric_name}` ({direction} is better) ")
lines.append(f"**Experiments:** {stats['total']} total — {stats['keeps']} kept, {stats['discards']} discarded, {stats['crashes']} crashed\n")
if stats["baseline"] is not None and stats["best"] is not None:
pct = f" ({stats['pct_change']:+.1f}%)" if stats["pct_change"] is not None else ""
lines.append(f"**Progress:** `{stats['baseline']:.6f}` → `{stats['best']:.6f}`{pct}\n")
lines.append(f"| Commit | Metric | Status | Description |")
lines.append(f"|--------|--------|--------|-------------|")
for r in results:
m = f"`{r['metric']:.6f}`" if r["metric"] is not None else "N/A"
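# Auto-generated descriptions come from `git diff --stat` and contain "|"; escape it so the table row stays intact.
desc = r["description"].replace("|", "\\|")
lines.append(f"| `{r['commit']}` | {m} | {r['status']} | {desc} |")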
lines.append("")
return "\n".join(lines)
def export_dashboard_markdown(root, domain_filter=None):
"""Export dashboard as Markdown string."""
lines = []
lines.append("# Autoresearch Dashboard\n")
lines.append("| Domain | Experiment | Metric | Runs | Kept | Best | Change | Status |")
lines.append("|--------|-----------|--------|------|------|------|--------|--------|")
for domain_dir in sorted(root.iterdir()):
if not domain_dir.is_dir() or domain_dir.name.startswith("."):
continue
if domain_filter and domain_dir.name != domain_filter:
continue
for exp_dir in sorted(domain_dir.iterdir()):
if not exp_dir.is_dir() or not (exp_dir / "config.cfg").exists():
continue
config = load_config(exp_dir)
results = load_results(exp_dir)
direction = config.get("metric_direction", "lower")
stats = compute_stats(results, direction)
best = f"`{stats['best']:.4f}`" if stats["best"] is not None else "—"
pct = f"{stats['pct_change']:+.1f}%" if stats["pct_change"] is not None else "—"
tsv = exp_dir / "results.tsv"
status = "idle"
if tsv.exists() and stats["total"] > 0:
age_h = (time.time() - tsv.stat().st_mtime) / 3600
status = "active" if age_h < 1 else "paused" if age_h < 24 else "done"
lines.append(f"| {domain_dir.name} | {exp_dir.name} | {config.get('metric', '?')} | {stats['total']} | {stats['keeps']} | {best} | {pct} | {status} |")
lines.append("")
return "\n".join(lines)
# --- Main ---
def main():
parser = argparse.ArgumentParser(description="autoresearch-agent results viewer")
parser.add_argument("--experiment", help="Show one experiment: domain/name")
parser.add_argument("--domain", help="Show all experiments in a domain")
parser.add_argument("--dashboard", action="store_true", help="Cross-experiment dashboard")
parser.add_argument("--format", choices=["terminal", "csv", "markdown"], default="terminal",
help="Output format (default: terminal)")
parser.add_argument("--output", "-o", help="Write to file instead of stdout")
parser.add_argument("--all", action="store_true", help="Show all experiments (alias for --dashboard)")
args = parser.parse_args()
root = find_autoresearch_root()
if root is None:
print("No .autoresearch/ found. Run setup_experiment.py first.")
sys.exit(1)
output_text = None
# Single experiment
if args.experiment:
experiment_dir = root / args.experiment
if not experiment_dir.exists():
print(f"Experiment not found: {args.experiment}")
sys.exit(1)
if args.format == "csv":
output_text = export_experiment_csv(experiment_dir, args.experiment)
elif args.format == "markdown":
output_text = export_experiment_markdown(experiment_dir, args.experiment)
else:
print_experiment(experiment_dir, args.experiment)
return
# Domain
elif args.domain:
domain_dir = root / args.domain
if not domain_dir.exists():
print(f"Domain not found: {args.domain}")
sys.exit(1)
for exp_dir in sorted(domain_dir.iterdir()):
if exp_dir.is_dir() and (exp_dir / "config.cfg").exists():
if args.format == "terminal":
print_experiment(exp_dir, f"{args.domain}/{exp_dir.name}")
# For CSV/MD, fall through to dashboard with domain filter
if args.format != "terminal":
# Use dashboard export filtered to domain
output_text = export_dashboard_csv(root, domain_filter=args.domain) if args.format == "csv" else export_dashboard_markdown(root, domain_filter=args.domain)
else:
return
# Dashboard
elif args.dashboard or args.all:
if args.format == "csv":
output_text = export_dashboard_csv(root)
elif args.format == "markdown":
output_text = export_dashboard_markdown(root)
else:
print_dashboard(root)
return
else:
# Default: dashboard
if args.format == "terminal":
print_dashboard(root)
return
output_text = export_dashboard_csv(root) if args.format == "csv" else export_dashboard_markdown(root)
# Write output
if output_text:
if args.output:
Path(args.output).write_text(output_text)
print(f"Written to {args.output}")
else:
print(output_text)
if __name__ == "__main__":
main()
FILE:scripts/run_experiment.py
#!/usr/bin/env python3
"""
autoresearch-agent: Experiment Runner
Executes a single experiment iteration. The AI agent is the loop —
it calls this script repeatedly. The script handles evaluation,
metric parsing, keep/discard decisions, and git rollback on failure.
Usage:
python scripts/run_experiment.py --experiment engineering/api-speed --single
python scripts/run_experiment.py --experiment engineering/api-speed --dry-run
python scripts/run_experiment.py --experiment engineering/api-speed --single --description "added caching"
"""
import argparse
import subprocess
import sys
import time
from datetime import datetime
from pathlib import Path
def find_autoresearch_root():
"""Find .autoresearch/ in project or user home."""
project_root = Path(".").resolve() / ".autoresearch"
if project_root.exists():
return project_root
user_root = Path.home() / ".autoresearch"
if user_root.exists():
return user_root
return None
def load_config(experiment_dir):
"""Load config.cfg from experiment directory."""
cfg_file = experiment_dir / "config.cfg"
if not cfg_file.exists():
print(f" Error: no config.cfg in {experiment_dir}")
sys.exit(1)
config = {}
for line in cfg_file.read_text().splitlines():
if ":" in line:
k, v = line.split(":", 1)
config[k.strip()] = v.strip()
return config
def run_git(args, cwd=None, timeout=30):
"""Run a git command safely (no shell injection). Returns (returncode, stdout, stderr)."""
result = subprocess.run(
["git"] + args,
capture_output=True, text=True,
cwd=cwd, timeout=timeout
)
return result.returncode, result.stdout.strip(), result.stderr.strip()
def get_current_commit(path):
"""Get short hash of current HEAD."""
_, commit, _ = run_git(["rev-parse", "--short", "HEAD"], cwd=path)
return commit
def get_best_metric(experiment_dir, direction):
"""Read the best metric from results.tsv."""
tsv = experiment_dir / "results.tsv"
if not tsv.exists():
return None
lines = [line for line in tsv.read_text().splitlines()[1:] if "\tkeep\t" in line]
if not lines:
return None
metrics = []
for line in lines:
parts = line.split("\t")
try:
if parts[1] != "N/A":
metrics.append(float(parts[1]))
except (ValueError, IndexError):
continue
if not metrics:
return None
return min(metrics) if direction == "lower" else max(metrics)
def run_evaluation(project_root, eval_cmd, time_budget_minutes, log_file):
"""Run evaluation with time limit. Output goes to log_file.
Note: shell=True is intentional here — eval_cmd is user-provided and
may contain pipes, redirects, or chained commands.
"""
hard_limit = time_budget_minutes * 60 * 2.5
t0 = time.time()
try:
with open(log_file, "w") as lf:
result = subprocess.run(
eval_cmd, shell=True,
stdout=lf, stderr=subprocess.STDOUT,
cwd=str(project_root),
timeout=hard_limit
)
elapsed = time.time() - t0
return result.returncode, elapsed
except subprocess.TimeoutExpired:
elapsed = time.time() - t0
return -1, elapsed
def extract_metric(log_file, metric_grep):
"""Extract metric value from log file."""
log_path = Path(log_file)
if not log_path.exists():
return None
for line in reversed(log_path.read_text().splitlines()):
stripped = line.strip()
if stripped.startswith(metric_grep.lstrip("^")):
try:
return float(stripped.split(":")[-1].strip())
except ValueError:
continue
return None
def is_improvement(new_val, old_val, direction):
"""Check if new result is better than old."""
if old_val is None:
return True
if direction == "lower":
return new_val < old_val
return new_val > old_val
def log_result(experiment_dir, commit, metric_val, status, description):
"""Append result to results.tsv."""
tsv = experiment_dir / "results.tsv"
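metric_str = f"{metric_val:.6f}" if metric_val is not None else "N/A"
# Collapse tabs and newlines so a user-supplied description cannot break the TSV row.
description = " ".join(str(description).split())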
with open(tsv, "a") as f:
f.write(f"{commit}\t{metric_str}\t{status}\t{description}\n")
def get_experiment_count(experiment_dir):
"""Count experiments run so far."""
tsv = experiment_dir / "results.tsv"
if not tsv.exists():
return 0
return max(0, len(tsv.read_text().splitlines()) - 1)
def get_description_from_diff(project_root):
"""Auto-generate a description from git diff --stat HEAD~1."""
code, diff_stat, _ = run_git(["diff", "--stat", "HEAD~1"], cwd=str(project_root))
if code == 0 and diff_stat:
return diff_stat.split("\n")[0][:50]
return "experiment"
def read_last_lines(filepath, n=5):
"""Read last n lines of a file (replaces tail shell command)."""
path = Path(filepath)
if not path.exists():
return ""
lines = path.read_text().splitlines()
return "\n".join(lines[-n:])
def run_single(project_root, experiment_dir, config, exp_num, dry_run=False, description=None):
"""Run one experiment iteration."""
direction = config.get("metric_direction", "lower")
metric_grep = config.get("metric_grep", "^metric:")
eval_cmd = config.get("evaluate_cmd", "python evaluate.py")
time_budget = int(config.get("time_budget_minutes", 5))
metric_name = config.get("metric", "metric")
log_file = str(experiment_dir / "run.log")
best = get_best_metric(experiment_dir, direction)
ts = datetime.now().strftime("%H:%M:%S")
print(f"\n[{ts}] Experiment #{exp_num}")
print(f" Best {metric_name}: {best}")
if dry_run:
print(" [DRY RUN] Would run evaluation and check metric")
return "dry_run"
# Auto-generate description if not provided
if not description:
description = get_description_from_diff(str(project_root))
# Run evaluation
print(f" Running: {eval_cmd} (budget: {time_budget}m)")
ret_code, elapsed = run_evaluation(project_root, eval_cmd, time_budget, log_file)
commit = get_current_commit(str(project_root))
# Timeout
if ret_code == -1:
print(f" TIMEOUT after {elapsed:.0f}s — discarding")
run_git(["checkout", "--", "."], cwd=str(project_root))
run_git(["reset", "--hard", "HEAD~1"], cwd=str(project_root))
log_result(experiment_dir, commit, None, "crash", f"timeout_{elapsed:.0f}s")
return "crash"
# Crash
if ret_code != 0:
tail = read_last_lines(log_file, 5)
print(f" CRASH (exit {ret_code}) after {elapsed:.0f}s")
print(f" Last output: {tail[:200]}")
run_git(["reset", "--hard", "HEAD~1"], cwd=str(project_root))
log_result(experiment_dir, commit, None, "crash", f"exit_{ret_code}")
return "crash"
# Extract metric
metric_val = extract_metric(log_file, metric_grep)
if metric_val is None:
print(f" Could not parse {metric_name} from run.log")
run_git(["reset", "--hard", "HEAD~1"], cwd=str(project_root))
log_result(experiment_dir, commit, None, "crash", "metric_parse_failed")
return "crash"
delta = ""
if best is not None:
diff = metric_val - best
delta = f" (delta {diff:+.4f})"
print(f" {metric_name}: {metric_val:.6f}{delta} in {elapsed:.0f}s")
# Keep or discard
if is_improvement(metric_val, best, direction):
print(f" KEEP — improvement")
log_result(experiment_dir, commit, metric_val, "keep", description)
return "keep"
else:
print(f" DISCARD — no improvement")
run_git(["reset", "--hard", "HEAD~1"], cwd=str(project_root))
best_str = f"{best:.4f}" if best is not None else "?"
log_result(experiment_dir, commit, metric_val, "discard",
f"no_improvement_{metric_val:.4f}_vs_{best_str}")
return "discard"
def main():
parser = argparse.ArgumentParser(description="autoresearch-agent runner")
parser.add_argument("--experiment", help="Experiment path: domain/name (e.g. engineering/api-speed)")
parser.add_argument("--single", action="store_true", help="Run one experiment iteration")
parser.add_argument("--dry-run", action="store_true", help="Show what would happen")
parser.add_argument("--description", help="Description of the change (auto-generated from git diff if omitted)")
parser.add_argument("--path", default=".", help="Project root")
args = parser.parse_args()
project_root = Path(args.path).resolve()
root = find_autoresearch_root()
if root is None:
print("No .autoresearch/ found. Run setup_experiment.py first.")
sys.exit(1)
if not args.experiment:
print("Specify --experiment domain/name")
sys.exit(1)
experiment_dir = root / args.experiment
if not experiment_dir.exists():
print(f"Experiment not found: {experiment_dir}")
print("Run: python scripts/setup_experiment.py --list")
sys.exit(1)
config = load_config(experiment_dir)
print(f"\n autoresearch-agent")
print(f" Experiment: {args.experiment}")
print(f" Target: {config.get('target', '?')}")
print(f" Metric: {config.get('metric', '?')} ({config.get('metric_direction', '?')} is better)")
print(f" Budget: {config.get('time_budget_minutes', '?')} min/experiment")
print(f" Mode: {'dry-run' if args.dry_run else 'single'}")
exp_num = get_experiment_count(experiment_dir) + 1
run_single(project_root, experiment_dir, config, exp_num, args.dry_run, args.description)
if __name__ == "__main__":
main()
FILE:scripts/setup_experiment.py
#!/usr/bin/env python3
"""
autoresearch-agent: Setup Experiment
Initialize a new experiment with domain, target, evaluator, and git branch.
Creates the .autoresearch/{domain}/{name}/ directory structure.
Usage:
python scripts/setup_experiment.py --domain engineering --name api-speed \
--target src/api/search.py --eval "pytest bench.py" \
--metric p50_ms --direction lower
python scripts/setup_experiment.py --domain marketing --name medium-ctr \
--target content/titles.md --eval "python evaluate.py" \
--metric ctr_score --direction higher --evaluator llm_judge_content
python scripts/setup_experiment.py --list # List all experiments
python scripts/setup_experiment.py --list-evaluators # List available evaluators
"""
import argparse
import shutil
import subprocess
import sys
from datetime import datetime
from pathlib import Path
DOMAINS = ["engineering", "marketing", "content", "prompts", "custom"]
EVALUATOR_DIR = Path(__file__).parent.parent / "evaluators"
DEFAULT_CONFIG = """# autoresearch global config
default_time_budget_minutes: 5
default_scope: project
dashboard_format: markdown
"""
GITIGNORE_CONTENT = """# autoresearch — experiment logs are local state
**/results.tsv
**/run.log
**/run.*.log
config.yaml
"""
def run_cmd(cmd, cwd=None, timeout=None):
"""Run shell command, return (returncode, stdout, stderr)."""
result = subprocess.run(
cmd, shell=True, capture_output=True, text=True,
cwd=cwd, timeout=timeout
)
return result.returncode, result.stdout.strip(), result.stderr.strip()
def get_autoresearch_root(scope, project_root=None):
"""Get the .autoresearch root directory based on scope."""
if scope == "user":
return Path.home() / ".autoresearch"
return Path(project_root or ".") / ".autoresearch"
def init_root(root):
"""Initialize .autoresearch root if it doesn't exist."""
created = False
if not root.exists():
root.mkdir(parents=True)
created = True
print(f" Created {root}/")
config_file = root / "config.yaml"
if not config_file.exists():
config_file.write_text(DEFAULT_CONFIG)
print(f" Created {config_file}")
gitignore = root / ".gitignore"
if not gitignore.exists():
gitignore.write_text(GITIGNORE_CONTENT)
print(f" Created {gitignore}")
return created
def create_program_md(experiment_dir, domain, name, target, metric, direction, constraints=""):
"""Generate a program.md template for the experiment."""
direction_word = "Minimize" if direction == "lower" else "Maximize"
content = f"""# autoresearch — {name}
## Goal
{direction_word} `{metric}` on `{target}`. {"Lower" if direction == "lower" else "Higher"} is better.
## What the Agent Can Change
- Only `{target}` — this is the single file being optimized.
- Everything inside that file is fair game unless constrained below.
## What the Agent Cannot Change
- The evaluation script (`evaluate.py` or the eval command). It is read-only.
- Dependencies — do not add new packages or imports that aren't already available.
- Any other files in the project unless explicitly noted here.
{f"- Additional constraints: {constraints}" if constraints else ""}
## Strategy
1. First run: establish baseline. Do not change anything.
2. Profile/analyze the current state — understand why the metric is what it is.
3. Try the most obvious improvement first (low-hanging fruit).
4. If that works, push further in the same direction.
5. If stuck, try something orthogonal or radical.
6. Read the git log of previous experiments. Don't repeat failed approaches.
## Simplicity Rule
A small improvement that adds ugly complexity is NOT worth it.
Equal performance with simpler code IS worth it.
Removing code that gets same results is the best outcome.
## Stop When
You don't stop. The human will interrupt you when they're satisfied.
If no improvement in 20+ consecutive runs, change strategy drastically.
"""
(experiment_dir / "program.md").write_text(content)
def create_config(experiment_dir, target, eval_cmd, metric, direction, time_budget):
"""Write experiment config."""
content = f"""target: {target}
evaluate_cmd: {eval_cmd}
metric: {metric}
metric_direction: {direction}
metric_grep: ^{metric}:
time_budget_minutes: {time_budget}
created: {datetime.now().strftime('%Y-%m-%d %H:%M')}
"""
(experiment_dir / "config.cfg").write_text(content)
def init_results_tsv(experiment_dir):
"""Create results.tsv with header."""
tsv = experiment_dir / "results.tsv"
if tsv.exists():
print(f" results.tsv already exists ({tsv.stat().st_size} bytes)")
return
tsv.write_text("commit\tmetric\tstatus\tdescription\n")
print(" Created results.tsv")
def copy_evaluator(experiment_dir, evaluator_name):
"""Copy a built-in evaluator to the experiment directory."""
source = EVALUATOR_DIR / f"{evaluator_name}.py"
if not source.exists():
print(f" Warning: evaluator '{evaluator_name}' not found in {EVALUATOR_DIR}")
print(f" Available: {', '.join(f.stem for f in EVALUATOR_DIR.glob('*.py'))}")
return False
dest = experiment_dir / "evaluate.py"
shutil.copy2(source, dest)
print(f" Copied evaluator: {evaluator_name}.py -> evaluate.py")
return True
def create_branch(path, domain, name):
"""Create and checkout the experiment branch."""
branch = f"autoresearch/{domain}/{name}"
result = subprocess.run(
["git", "checkout", "-b", branch],
cwd=path, capture_output=True, text=True
)
if result.returncode != 0:
if "already exists" in result.stderr:
print(f" Branch '{branch}' already exists. Checking out...")
subprocess.run(
["git", "checkout", branch],
cwd=path, capture_output=True, text=True
)
return branch
print(f" Warning: could not create branch: {result.stderr}")
return None
print(f" Created branch: {branch}")
return branch
def list_experiments(root):
"""List all experiments across all domains."""
if not root.exists():
print("No experiments found. Run setup to create your first experiment.")
return
experiments = []
for domain_dir in sorted(root.iterdir()):
if not domain_dir.is_dir() or domain_dir.name.startswith("."):
continue
for exp_dir in sorted(domain_dir.iterdir()):
if not exp_dir.is_dir():
continue
cfg_file = exp_dir / "config.cfg"
if not cfg_file.exists():
continue
config = {}
for line in cfg_file.read_text().splitlines():
if ":" in line:
k, v = line.split(":", 1)
config[k.strip()] = v.strip()
# Count results
tsv = exp_dir / "results.tsv"
runs = 0
if tsv.exists():
runs = max(0, len(tsv.read_text().splitlines()) - 1)
experiments.append({
"domain": domain_dir.name,
"name": exp_dir.name,
"target": config.get("target", "?"),
"metric": config.get("metric", "?"),
"runs": runs,
})
if not experiments:
print("No experiments found.")
return
print(f"\n{'DOMAIN':<15} {'EXPERIMENT':<25} {'TARGET':<30} {'METRIC':<15} {'RUNS':>5}")
print("-" * 95)
for e in experiments:
print(f"{e['domain']:<15} {e['name']:<25} {e['target']:<30} {e['metric']:<15} {e['runs']:>5}")
print(f"\nTotal: {len(experiments)} experiments")
def list_evaluators():
"""List available built-in evaluators."""
if not EVALUATOR_DIR.exists():
print("No evaluators directory found.")
return
print(f"\nAvailable evaluators ({EVALUATOR_DIR}):\n")
for f in sorted(EVALUATOR_DIR.glob("*.py")):
# Read first docstring line
desc = ""
for line in f.read_text().splitlines():
stripped = line.strip()
if stripped.startswith('"""') or stripped.startswith("'''"):
quote = stripped[:3]
# Single-line docstring: """Description."""
after_quote = stripped[3:]
if after_quote and after_quote.rstrip(quote[0]).strip():
desc = after_quote.rstrip('"').rstrip("'").strip()
break
continue
if stripped and not line.startswith("#!"):
desc = stripped.strip('"').strip("'")
break
print(f" {f.stem:<25} {desc}")
def main():
parser = argparse.ArgumentParser(description="autoresearch-agent setup")
parser.add_argument("--domain", choices=DOMAINS, help="Experiment domain")
parser.add_argument("--name", help="Experiment name (e.g. api-speed, medium-ctr)")
parser.add_argument("--target", help="Target file to optimize")
parser.add_argument("--eval", dest="eval_cmd", help="Evaluation command")
parser.add_argument("--metric", help="Metric name (must appear in eval output as 'name: value')")
parser.add_argument("--direction", choices=["lower", "higher"], default="lower",
help="Is lower or higher better?")
parser.add_argument("--time-budget", type=int, default=5, help="Minutes per experiment (default: 5)")
parser.add_argument("--evaluator", help="Built-in evaluator to copy (e.g. benchmark_speed)")
parser.add_argument("--scope", choices=["project", "user"], default="project",
help="Where to store experiments: project (./) or user (~/)")
parser.add_argument("--constraints", default="", help="Additional constraints for program.md")
parser.add_argument("--path", default=".", help="Project root path")
parser.add_argument("--skip-branch", action="store_true", help="Don't create git branch")
parser.add_argument("--list", action="store_true", help="List all experiments")
parser.add_argument("--list-evaluators", action="store_true", help="List available evaluators")
args = parser.parse_args()
project_root = Path(args.path).resolve()
# List mode
if args.list:
root = get_autoresearch_root("project", project_root)
list_experiments(root)
user_root = get_autoresearch_root("user")
if user_root.exists() and user_root != root:
print(f"\n--- User-level experiments ({user_root}) ---")
list_experiments(user_root)
return
if args.list_evaluators:
list_evaluators()
return
# Validate required args for setup
if not all([args.domain, args.name, args.target, args.eval_cmd, args.metric]):
parser.error("Required: --domain, --name, --target, --eval, --metric")
root = get_autoresearch_root(args.scope, project_root)
print(f"\n autoresearch-agent setup")
print(f" Project: {project_root}")
print(f" Scope: {args.scope}")
print(f" Domain: {args.domain}")
print(f" Experiment: {args.name}")
print(f" Time: {datetime.now().strftime('%Y-%m-%d %H:%M')}\n")
# Check git
result = subprocess.run(
["git", "rev-parse", "--is-inside-work-tree"],
cwd=str(project_root), capture_output=True, text=True
)
code = result.returncode
if code != 0:
print(" Error: not a git repository. Run: git init && git add . && git commit -m 'initial'")
sys.exit(1)
print(" Git repository found")
# Check target file
target_path = project_root / args.target
if not target_path.exists():
print(f" Error: target file not found: {args.target}")
sys.exit(1)
print(f" Target file found: {args.target}")
# Init root
init_root(root)
# Create experiment directory
experiment_dir = root / args.domain / args.name
if experiment_dir.exists():
print(f" Warning: experiment '{args.domain}/{args.name}' already exists.")
print(f" Use --name with a different name, or delete {experiment_dir}")
sys.exit(1)
experiment_dir.mkdir(parents=True)
print(f" Created {experiment_dir}/")
# Create files
create_program_md(experiment_dir, args.domain, args.name,
args.target, args.metric, args.direction, args.constraints)
print(" Created program.md")
create_config(experiment_dir, args.target, args.eval_cmd,
args.metric, args.direction, args.time_budget)
print(" Created config.cfg")
init_results_tsv(experiment_dir)
# Copy evaluator if specified
if args.evaluator:
copy_evaluator(experiment_dir, args.evaluator)
# Create git branch
if not args.skip_branch:
create_branch(str(project_root), args.domain, args.name)
# Test evaluation command
print(f"\n Testing evaluation: {args.eval_cmd}")
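try:
code, out, err = run_cmd(args.eval_cmd, cwd=str(project_root), timeout=60)
except subprocess.TimeoutExpired:
# A slow eval command should produce the warning below, not a traceback.
code, out, err = -1, "", "eval command timed out after 60s"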
if code != 0:
print(f" Warning: eval command failed (exit {code})")
if err:
print(f" stderr: {err[:200]}")
print(" Fix the eval command before running the experiment loop.")
else:
# Check metric is parseable
full_output = out + "\n" + err
metric_found = False
for line in full_output.splitlines():
if line.strip().startswith(f"{args.metric}:"):
metric_found = True
print(f" Eval works. Baseline: {line.strip()}")
break
if not metric_found:
print(f" Warning: eval ran but '{args.metric}:' not found in output.")
print(f" Make sure your eval command outputs: {args.metric}: <value>")
# Summary
print(f"\n Setup complete!")
print(f" Experiment: {args.domain}/{args.name}")
print(f" Target: {args.target}")
print(f" Metric: {args.metric} ({args.direction} is better)")
print(f" Budget: {args.time_budget} min/experiment")
if not args.skip_branch:
print(f" Branch: autoresearch/{args.domain}/{args.name}")
print(f"\n To start:")
print(f" python scripts/run_experiment.py --experiment {args.domain}/{args.name} --single")
if __name__ == "__main__":
main()
FILE:settings.json
{
"name": "autoresearch-agent",
"displayName": "Autoresearch Agent",
"version": "2.1.2",
"description": "Autonomous experiment loop — optimize any file by a measurable metric.",
"author": "Alireza Rezvani",
"license": "MIT",
"platforms": ["claude-code", "openclaw", "codex"],
"category": "engineering",
"tags": ["optimization", "experiments", "benchmarks", "autoresearch", "loop", "metrics"],
"repository": "https://github.com/alirezarezvani/claude-skills",
"commands": {
"setup": "/ar:setup",
"run": "/ar:run",
"loop": "/ar:loop",
"status": "/ar:status",
"resume": "/ar:resume"
},
"agents": [
"experiment-runner"
]
}
FILE:skills/loop/SKILL.md
---
name: "loop"
description: "Start an autonomous experiment loop with user-selected interval (10min, 1h, daily, weekly, monthly). Uses CronCreate for scheduling."
command: /ar:loop
---
# /ar:loop — Autonomous Experiment Loop
Start a recurring experiment loop that runs at a user-selected interval.
## Usage
```
/ar:loop engineering/api-speed # Start loop (prompts for interval)
/ar:loop engineering/api-speed 10m # Every 10 minutes
/ar:loop engineering/api-speed 1h # Every hour
/ar:loop engineering/api-speed daily # Daily at ~9am
/ar:loop engineering/api-speed weekly # Weekly on Monday ~9am
/ar:loop engineering/api-speed monthly # Monthly on 1st ~9am
/ar:loop stop engineering/api-speed # Stop an active loop
```
## What It Does
### Step 1: Resolve experiment
If no experiment specified, list experiments and let user pick.
### Step 2: Select interval
If interval not provided as argument, present options:
```
Select loop interval:
1. Every 10 minutes (rapid — stay and watch)
2. Every hour (background — check back later)
3. Daily at ~9am (overnight experiments)
4. Weekly on Monday (long-running experiments)
5. Monthly on 1st (slow experiments)
```
Map to cron expressions:
| Interval | Cron Expression | Shorthand |
|----------|----------------|-----------|
| 10 minutes | `*/10 * * * *` | `10m` |
| 1 hour | `7 * * * *` | `1h` |
| Daily | `57 8 * * *` | `daily` |
| Weekly | `57 8 * * 1` | `weekly` |
| Monthly | `57 8 1 * *` | `monthly` |
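The mapping above can be sketched as a small lookup. This is an illustration only: `CRON_BY_INTERVAL` and `resolve_cron` are hypothetical names, not part of the skill's scripts; the cron strings are copied from the table.

```python
# Interval shorthand -> cron expression, copied from the table above.
CRON_BY_INTERVAL = {
    "10m": "*/10 * * * *",
    "1h": "7 * * * *",
    "daily": "57 8 * * *",
    "weekly": "57 8 * * 1",
    "monthly": "57 8 1 * *",
}


def resolve_cron(shorthand: str) -> str:
    """Return the cron expression for a shorthand (hypothetical helper)."""
    key = shorthand.strip().lower()
    if key not in CRON_BY_INTERVAL:
        valid = ", ".join(CRON_BY_INTERVAL)
        raise ValueError(f"Unknown interval {shorthand!r}; expected one of: {valid}")
    return CRON_BY_INTERVAL[key]
```

Rejecting unknown shorthands up front keeps a typo such as `1hr` from reaching `CronCreate`.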
### Step 3: Create the recurring job
Use `CronCreate` with this prompt (fill in the experiment details):
```
You are running autoresearch experiment "{domain}/{name}".
1. Read .autoresearch/{domain}/{name}/config.cfg for: target, evaluate_cmd, metric, metric_direction
2. Read .autoresearch/{domain}/{name}/program.md for strategy and constraints
3. Read .autoresearch/{domain}/{name}/results.tsv for experiment history
4. Run: git checkout autoresearch/{domain}/{name}
Then do exactly ONE iteration:
- Review results.tsv: what worked, what failed, what hasn't been tried
- Edit the target file with ONE change (strategy escalation based on run count)
- Commit: git add {target} && git commit -m "experiment: {description}"
- Evaluate: python {skill_path}/scripts/run_experiment.py --experiment {domain}/{name} --single
- Read the output (KEEP/DISCARD/CRASH)
Rules:
- ONE change per experiment
- NEVER modify the evaluator
- If 5 consecutive crashes in results.tsv, delete this cron job (CronDelete) and alert
- After every 10 experiments, update Strategy section of program.md
Current best metric: {read from results.tsv or "no baseline yet"}
Total experiments so far: {count from results.tsv}
```
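The "5 consecutive crashes" rule in the prompt can be checked mechanically. A minimal sketch, assuming `results.tsv` is tab-separated with a header row and a `status` column holding KEEP/DISCARD/CRASH; the real column layout is not shown here, so the column name and both function names are assumptions.

```python
import csv
import io


def trailing_crashes(tsv_text: str, status_column: str = "status") -> int:
    """Count consecutive CRASH rows at the end of the results history.

    Assumes a header row and a status column (hypothetical layout).
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    count = 0
    for row in reversed(rows):
        if (row.get(status_column) or "").strip().upper() != "CRASH":
            break
        count += 1
    return count


def should_stop_loop(tsv_text: str, limit: int = 5) -> bool:
    """True when the most recent `limit` experiments all crashed."""
    return trailing_crashes(tsv_text) >= limit
```

Only the trailing run counts: a single KEEP or DISCARD after earlier crashes resets the streak.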
### Step 4: Store loop metadata
Write to `.autoresearch/{domain}/{name}/loop.json`:
```json
{
"cron_id": "{id from CronCreate}",
"interval": "{user selection}",
"started": "{ISO timestamp}",
"experiment": "{domain}/{name}"
}
```
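A minimal sketch of writing that file, assuming the cron ID has already been returned by `CronCreate`. The helper name `write_loop_metadata` and the `root` parameter are hypothetical; the four keys match the JSON above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_loop_metadata(
    root: Path, domain: str, name: str, cron_id: str, interval: str
) -> Path:
    """Write loop.json under .autoresearch/{domain}/{name}/ (hypothetical helper)."""
    exp_dir = root / ".autoresearch" / domain / name
    exp_dir.mkdir(parents=True, exist_ok=True)
    payload = {
        "cron_id": cron_id,
        "interval": interval,
        # ISO 8601 timestamp in UTC, e.g. 2026-03-16T09:00:00+00:00
        "started": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "experiment": f"{domain}/{name}",
    }
    path = exp_dir / "loop.json"
    path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return path
```

Usage: `write_loop_metadata(Path("."), "engineering", "api-speed", cron_id, "1h")`.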
### Step 5: Confirm to user
```
Loop started for {domain}/{name}
Interval: {interval description}
Cron ID: {id}
Auto-expires: 3 days (CronCreate limit)
To check progress: /ar:status
To stop the loop: /ar:loop stop {domain}/{name}
Note: Recurring jobs auto-expire after 3 days.
Run /ar:loop again to restart after expiry.
```
## Stopping a Loop
When user runs `/ar:loop stop {experiment}`:
1. Read `.autoresearch/{domain}/{name}/loop.json` to get the cron ID
2. Call `CronDelete` with that ID
3. Delete `loop.json`
4. Confirm: "Loop stopped for {experiment}. {n} experiments completed."
## Important Limitations
- **3-day auto-expiry**: CronCreate jobs expire after 3 days. For longer experiments, the user must re-run `/ar:loop` to restart. Results persist — the new loop picks up where the old one left off.
- **One loop per experiment**: Don't start multiple loops for the same experiment.
- **Concurrent experiments**: Multiple experiments can loop simultaneously ONLY if they're on different git branches (which they are by default — each experiment gets `autoresearch/{domain}/{name}`).
FILE:skills/resume/SKILL.md
---
name: "resume"
description: "Resume a paused experiment. Checkout the experiment branch, read results history, continue iterating."
command: /ar:resume
---
# /ar:resume — Resume Experiment
Resume a paused or context-limited experiment. Reads all history and continues where you left off.
## Usage
```
/ar:resume # List experiments, let user pick
/ar:resume engineering/api-speed # Resume specific experiment
```
## What It Does
### Step 1: List experiments if needed
If no experiment specified:
```bash
python {skill_path}/scripts/setup_experiment.py --list
```
Show status for each (active/paused/done based on results.tsv age). Let user pick.
### Step 2: Load full context
```bash
# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}
# Read config
cat .autoresearch/{domain}/{name}/config.cfg
# Read strategy
cat .autoresearch/{domain}/{name}/program.md
# Read full results history
cat .autoresearch/{domain}/{name}/results.tsv
# Read recent git log for the branch
git log --oneline -20
```
### Step 3: Report current state
Summarize for the user:
```
Resuming: engineering/api-speed
Target: src/api/search.py
Metric: p50_ms (lower is better)
Experiments: 23 total — 8 kept, 12 discarded, 3 crashed
Best: 185ms (-42% from baseline of 320ms)
Last experiment: "added response caching" → KEEP (185ms)
Recent patterns:
- Caching changes: 3 kept, 1 discarded (consistently helpful)
- Algorithm changes: 2 discarded, 1 crashed (high risk, low reward so far)
- I/O optimization: 2 kept (promising direction)
```
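The percentage in the "Best" line is the relative change from the baseline. A small worked sketch using the figures from the example above; `percent_change` is a hypothetical helper name.

```python
def percent_change(baseline: float, best: float) -> float:
    """Relative change from baseline to best, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (best - baseline) / baseline * 100.0


# (185 - 320) / 320 * 100 = -42.1875
change = percent_change(320, 185)
print(f"{change:+.0f}%")  # prints "-42%"
```

A negative value is an improvement when the metric direction is "lower", and a regression when it is "higher".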
### Step 4: Ask next action
```
How would you like to continue?
1. Single iteration (/ar:run) — I'll make one change and evaluate
2. Start a loop (/ar:loop) — Autonomous with scheduled interval
3. Just show me the results — I'll review and decide
```
If the user picks loop, hand off to `/ar:loop` with the experiment pre-selected.
If single, hand off to `/ar:run`.
FILE:skills/run/SKILL.md
---
name: "run"
description: "Run a single experiment iteration. Edit the target file, evaluate, keep or discard."
command: /ar:run
---
# /ar:run — Single Experiment Iteration
Run exactly ONE experiment iteration: review history, decide a change, edit, commit, evaluate.
## Usage
```
/ar:run engineering/api-speed # Run one iteration
/ar:run # List experiments, let user pick
```
## What It Does
### Step 1: Resolve experiment
If no experiment specified, run `python {skill_path}/scripts/setup_experiment.py --list` and ask the user to pick.
### Step 2: Load context
```bash
# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg
# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md
# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv
# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}
```
### Step 3: Decide what to try
Review results.tsv:
- What changes were kept? What pattern do they share?
- What was discarded? Avoid repeating those approaches.
- What crashed? Understand why.
- How many runs so far? (Escalate strategy accordingly)
**Strategy escalation:**
- Runs 1-5: Low-hanging fruit (obvious improvements)
- Runs 6-15: Systematic exploration (vary one parameter)
- Runs 16-30: Structural changes (algorithm swaps)
- Runs 31+: Radical experiments (completely different approaches)
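The escalation tiers can be expressed as a lookup on the run number. A sketch only: `strategy_tier` is a hypothetical name, and it assumes run 30 still belongs to the structural tier, with radical experiments starting at run 31.

```python
def strategy_tier(run_number: int) -> str:
    """Map a 1-based run number to its escalation tier."""
    if run_number < 1:
        raise ValueError("run_number starts at 1")
    if run_number <= 5:
        return "low-hanging fruit"
    if run_number <= 15:
        return "systematic exploration"
    if run_number <= 30:  # assumption: run 30 is the last structural run
        return "structural changes"
    return "radical experiments"
```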
### Step 4: Make ONE change
Edit only the target file specified in config.cfg. Change one thing. Keep it simple.
### Step 5: Commit and evaluate
```bash
git add {target}
git commit -m "experiment: {short description of what changed}"
python {skill_path}/scripts/run_experiment.py \
--experiment {domain}/{name} --single
```
### Step 6: Report result
Read the script output. Tell the user:
- **KEEP**: "Improvement! {metric}: {value} ({delta} from previous best)"
- **DISCARD**: "No improvement. {metric}: {value} vs best {best}. Reverted."
- **CRASH**: "Evaluation failed: {reason}. Reverted."
### Step 7: Self-improvement check
After every 10th experiment (check results.tsv line count), update the Strategy section of program.md with patterns learned.
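The line-count check can be sketched as follows. It assumes `results.tsv` has one header line followed by one line per experiment; if the file has no header, pass `has_header=False`. The function name is hypothetical.

```python
def strategy_update_due(tsv_text: str, every: int = 10, has_header: bool = True) -> bool:
    """True when the experiment count is a positive multiple of `every`."""
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    # Assumption: the first non-empty line is a header, not an experiment.
    experiments = len(lines) - (1 if has_header and lines else 0)
    return experiments > 0 and experiments % every == 0
```

Counting non-empty lines avoids an off-by-one from a trailing newline at the end of the file.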
## Rules
- ONE change per iteration. Don't change 5 things at once.
- NEVER modify the evaluator (evaluate.py). It's ground truth.
- Simplicity wins. Equal performance with simpler code is an improvement.
- No new dependencies.
FILE:skills/setup/SKILL.md
---
name: "setup"
description: "Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator."
command: /ar:setup
---
# /ar:setup — Create New Experiment
Set up a new autoresearch experiment with all required configuration.
## Usage
```
/ar:setup # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list # Show existing experiments
/ar:setup --list-evaluators # Show available evaluators
```
## What It Does
### If arguments provided
Pass them directly to the setup script:
```bash
python {skill_path}/scripts/setup_experiment.py \
--domain {domain} --name {name} \
--target {target} --eval "{eval_cmd}" \
--metric {metric} --direction {direction} \
[--evaluator {evaluator}] [--scope {scope}]
```
### If no arguments (interactive mode)
Collect each parameter one at a time:
1. **Domain** — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
2. **Name** — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
3. **Target file** — Ask: "Which file to optimize?" Verify it exists.
4. **Eval command** — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
5. **Metric** — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
6. **Direction** — Ask: "Is lower or higher better?"
7. **Evaluator** (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
8. **Scope** — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"
Then run `setup_experiment.py` with the collected parameters.
### Listing
```bash
# Show existing experiments
python {skill_path}/scripts/setup_experiment.py --list
# Show available evaluators
python {skill_path}/scripts/setup_experiment.py --list-evaluators
```
## Built-in Evaluators
| Name | Metric | Use Case |
|------|--------|----------|
| `benchmark_speed` | `p50_ms` (lower) | Function/API execution time |
| `benchmark_size` | `size_bytes` (lower) | File, bundle, Docker image size |
| `test_pass_rate` | `pass_rate` (higher) | Test suite pass percentage |
| `build_speed` | `build_seconds` (lower) | Build/compile/Docker build time |
| `memory_usage` | `peak_mb` (lower) | Peak memory during execution |
| `llm_judge_content` | `ctr_score` (higher) | Headlines, titles, descriptions |
| `llm_judge_prompt` | `quality_score` (higher) | System prompts, agent instructions |
| `llm_judge_copy` | `engagement_score` (higher) | Social posts, ad copy, emails |
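The direction column decides how a new result compares against the current best. A minimal sketch with the table transcribed into a dict; `EVALUATORS` and `is_improvement` are hypothetical names, and the strict comparison deliberately ignores the "equal but simpler" case, which needs human or agent judgment.

```python
# evaluator name -> (metric, direction), copied from the table above.
EVALUATORS = {
    "benchmark_speed": ("p50_ms", "lower"),
    "benchmark_size": ("size_bytes", "lower"),
    "test_pass_rate": ("pass_rate", "higher"),
    "build_speed": ("build_seconds", "lower"),
    "memory_usage": ("peak_mb", "lower"),
    "llm_judge_content": ("ctr_score", "higher"),
    "llm_judge_prompt": ("quality_score", "higher"),
    "llm_judge_copy": ("engagement_score", "higher"),
}


def is_improvement(evaluator: str, candidate: float, best: float) -> bool:
    """Compare a candidate metric against the best so far, respecting direction."""
    _, direction = EVALUATORS[evaluator]
    return candidate < best if direction == "lower" else candidate > best
```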
## After Setup
Report to the user:
- Experiment path and branch name
- Whether the eval command worked and the baseline metric
- Suggest: "Run `/ar:run {domain}/{name}` to start iterating, or `/ar:loop {domain}/{name}` for autonomous mode."
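The baseline check relies on the eval command printing a line of the form `{metric}: <value>`. A sketch of extracting that value, assuming a plain decimal number; this is not the setup script's actual parser, and `parse_metric` is a hypothetical name.

```python
import re
from typing import Optional


def parse_metric(output: str, metric: str) -> Optional[float]:
    """Return the first '<metric>: <number>' value in the output, or None."""
    pattern = re.compile(
        rf"^[ \t]*{re.escape(metric)}:[ \t]*([-+]?\d+(?:\.\d+)?)",
        re.MULTILINE,
    )
    match = pattern.search(output)
    return float(match.group(1)) if match else None
```

A `None` result corresponds to the setup warning that the eval ran but the metric line was not found. Scientific notation such as `1e-3` is not handled by this pattern.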
FILE:skills/status/SKILL.md
---
name: "status"
description: "Show experiment dashboard with results, active loops, and progress."
command: /ar:status
---
# /ar:status — Experiment Dashboard
Show experiment results, active loops, and progress across all experiments.
## Usage
```
/ar:status # Full dashboard
/ar:status engineering/api-speed # Single experiment detail
/ar:status --domain engineering # All experiments in a domain
/ar:status --format markdown # Export as markdown
/ar:status --format csv --output results.csv # Export as CSV
```
## What It Does
### Single experiment
```bash
python {skill_path}/scripts/log_results.py --experiment {domain}/{name}
```
Also check for active loop:
```bash
cat .autoresearch/{domain}/{name}/loop.json 2>/dev/null
```
If loop.json exists, show:
```
Active loop: every {interval} (cron ID: {id}, started: {date})
```
### Domain view
```bash
python {skill_path}/scripts/log_results.py --domain {domain}
```
### Full dashboard
```bash
python {skill_path}/scripts/log_results.py --dashboard
```
For each experiment, also check for loop.json and show loop status.
### Export
```bash
# CSV
python {skill_path}/scripts/log_results.py --dashboard --format csv --output {file}
# Markdown
python {skill_path}/scripts/log_results.py --dashboard --format markdown --output {file}
```
## Output Example
```
DOMAIN EXPERIMENT RUNS KEPT BEST CHANGE STATUS LOOP
engineering api-speed 47 14 185ms -76.9% active every 1h
engineering bundle-size 23 8 412KB -58.3% paused —
marketing medium-ctr 31 11 8.4/10 +68.0% active daily
prompts support-tone 15 6 82/100 +46.4% done —
```
---
name: epic-design
description: >
Build immersive, cinematic 2.5D interactive websites using scroll storytelling,
parallax depth, text animations, and premium scroll effects — no WebGL required.
Use this skill for any web design task: landing pages, product sites, hero sections,
scroll animations, parallax, sticky sections, section overlaps, floating products
between sections, clip-path reveals, text that flies in from sides, words that light
up on scroll, curtain drops, iris opens, card stacks, bleed typography, and any
site that should feel cinematic or premium. Trigger on phrases like "make it feel
alive", "Apple-style animation", "sections that overlap", "product rises between
sections", "immersive", "scrollytelling", or any scroll-driven visual effect.
Covers 45+ techniques across 8 categories. Always inspects, judges, and plans assets before coding. Use aggressively for ANY web design task.
license: MIT
metadata:
version: 1.0.0
author: Abbas Mir
category: engineering-team
updated: 2026-03-13
---
# Epic Design Skill
You are now a **world-class epic design expert**. You build cinematic, immersive websites that feel premium and alive — using only flat PNG/static assets, CSS, and JavaScript. No WebGL, no 3D modeling software required.
## Before Starting
**Check for context first:**
If `project-context.md` or `product-context.md` exists, read it before asking questions. Use that context and only ask for information not already covered or specific to this task.
## Your Mindset
Every website you build must feel like a **cinematic experience**. Think: Apple product pages, Awwwards winners, luxury brand sites. Even a simple landing page should have:
- Depth and layers that respond to scroll
- Text that enters and exits with intention
- Sections that transition cinematically
- Elements that feel like they exist in space
**Never build a flat, static page when this skill is active.**
---
## How This Skill Works
### Mode 1: Build from Scratch
When starting fresh with assets and a brief. Follow the complete workflow below (Steps 1-5).
### Mode 2: Enhance Existing Site
When adding 2.5D effects to an existing page. Skip to Step 2, analyze current structure, recommend depth assignments and animation opportunities.
### Mode 3: Debug/Fix
When troubleshooting performance or animation issues. Use `scripts/validate-layers.js`, check GPU rules, verify reduced-motion handling.
---
## Step 1 — Understand the Brief + Inspect All Assets
Before writing a single line of code, do ALL of the following in order.
### A. Extract the brief
1. What is the product/content? (brand site, portfolio, SaaS, event, etc.)
2. What mood/feeling? (dark/cinematic, bright/energetic, minimal/luxury, etc.)
3. How many sections? (hero only, full page, specific section?)
### B. Inspect every uploaded image asset
Run `scripts/inspect-assets.py` on every image the user has provided.
For each image, determine:
1. **Format** — JPEG never has an alpha channel. PNG supports alpha, but the channel may be fully opaque, so trust the script output rather than the file extension.
2. **Background status** — Use the script output. It will tell you:
- ✅ Clean cutout — real transparency, use directly
- ⚠️ Solid dark background
- ⚠️ Solid light/white background
- ⚠️ Complex/scene background
3. **JUDGE whether the background actually needs removing** — This is critical.
Not every image with a background needs it removed. Ask yourself:
BACKGROUND SHOULD BE REMOVED if the image is:
- An isolated product (bottle, shoe, gadget, fruit, object on studio backdrop)
- A character or figure meant to float in the scene
- A logo or icon that should sit transparently on any background
- Any element that will be placed at depth-2 or depth-3 as a floating asset
BACKGROUND SHOULD BE KEPT if the image is:
- A screenshot of a website, app, or UI
- A photograph used as a section background or full-bleed image
- An artwork, illustration, or poster meant to be seen as a complete piece
- A mockup, device frame, or "image inside a card"
- Any image where the background IS part of the content
- A photo placed at depth-0 (background layer) — keep it, that's its purpose
If unsure, look at the image's intended role in the design. If it needs to
"float" freely over other content → remove bg. If it fills a space or IS
the content → keep it.
4. **Inform the user about every image** — whether bg is fine or not.
Use the exact format from `references/asset-pipeline.md` Step 4.
5. **Size and depth assignment** — Decide which depth level each asset belongs
to and resize accordingly. State your decisions to the user before building.
### C. Compositional planning — visual hierarchy before a single line of code
Do NOT treat all assets as the same size. Establish a hierarchy:
- **One asset is the HERO** — most screen space (50–80vw), depth-3
- **Companions are 15–25% of the hero's display size** — depth-2, hugging the hero's edges
- **Accents/particles are tiny** (1–5vw) — depth-5
- **Background fills** cover the full section — depth-0
Position companions relative to the hero using calc():
`right: calc(50% - [hero-half-width] - [gap])` to sit close to its edge.
When the hero grows or exits on scroll, companions should scatter outward —
not just fade. This reinforces that they were orbiting the hero.
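The sizing ratio for companions is simple arithmetic. A sketch in viewport-width units, assuming "display size" means rendered width; `companion_width_range` is a hypothetical helper.

```python
from typing import Tuple


def companion_width_range(hero_vw: float) -> Tuple[float, float]:
    """Companion width bounds (vw) at 15-25% of the hero's display width."""
    return (round(hero_vw * 0.15, 2), round(hero_vw * 0.25, 2))


low, high = companion_width_range(60)
print(f"hero 60vw -> companions {low}vw to {high}vw")
```

For a 60vw hero this gives companions between 9vw and 15vw, well above the 1-5vw accent range.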
### D. Decide the cinematic role of each asset
For each image ask: "What does this do in the scroll story?"
- Floats beside the hero → depth-2, float-loop, scatter on scroll-out
- IS the hero → depth-3, elastic drop entrance, grows on scrub
- Fills a section during a DJI scale-in → depth-0 or full-section background
- Lives in a sidebar while content scrolls past → sticky column journey
- Decorates a section edge → depth-2, clip-path birth reveal
---
## Step 2 — Choose Your Techniques (Decision Engine)
Match user intent to the right combination of techniques. Read the full technique details from `references/` files.
### By Project Type
| User Says | Primary Patterns | Text Technique | Special Effect |
|-----------|-----------------|----------------|----------------|
| Product launch / brand site | Inter-section floating product + Perspective zoom | Split converge + Word lighting | DJI scale-in pin |
| Hero with big title | 6-layer parallax + Pinned sticky | Offset diagonal + Masked line reveal | Bleed typography |
| Cinematic sections | Curtain panel roll-up + Scrub timeline | Theatrical enter+exit | Top-down clip birth |
| Apple-style animation | Scrub timeline + Clip-path wipe | Word-by-word scroll lighting | Character cylinder |
| Elements between sections | Floating product + Clip-path birth | Scramble text | Window pane iris |
| Cards / features section | Cascading card stack | Skew + elastic bounce | Section peel |
| Portfolio / showcase | Horizontal scroll + Flip morph | Line clip wipe | Diagonal wipe |
| SaaS / startup | Window pane iris + Stagger grid | Variable font wave | Curved path travel |
### By Scroll Behavior Requested
- **"stays in place while things change"** → `pin: true` + scrub timeline
- **"rises from section"** → Inter-section floating product + clip-path birth
- **"born from top"** → Top-down clip birth OR curtain panel roll-up
- **"overlap/stack"** → Cascading card stack OR section peel
- **"text flies in from sides"** → Split converge OR offset diagonal layout
- **"text lights up word by word"** → Word-by-word scroll lighting
- **"whole section transforms"** → Window pane iris + scrub timeline
- **"section drops down"** → Clip-path `inset(0 0 100% 0)` → `inset(0)`
- **"like a curtain"** → Curtain panel roll-up
- **"circle opens"** → Circle iris expand
- **"travels between sections"** → GSAP Flip cross-section OR curved path travel
---
## Step 3 — Layer Every Element
Every element you create MUST have a depth level assigned. This is non-negotiable.
```
DEPTH 0 → Far background | parallax: 0.10x | blur: 8px | scale: 0.70
DEPTH 1 → Glow/atmosphere | parallax: 0.25x | blur: 4px | scale: 0.85
DEPTH 2 → Mid decorations | parallax: 0.50x | blur: 0px | scale: 1.00
DEPTH 3 → Main objects | parallax: 0.80x | blur: 0px | scale: 1.05
DEPTH 4 → UI / text | parallax: 1.00x | blur: 0px | scale: 1.00
DEPTH 5 → Foreground FX | parallax: 1.20x | blur: 0px | scale: 1.10
```
Apply as: `data-depth="3"` on HTML elements, matching CSS class `.depth-3`.
→ Full depth system details: `references/depth-system.md`
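The depth table reads naturally as a lookup. A sketch with the values transcribed from the block above; it assumes the parallax factor is applied as scroll distance times factor, which is a common convention but is not confirmed here. `DEPTH_LAYERS` and `parallax_offset` are hypothetical names.

```python
# depth level -> parameters, copied from the depth table above.
DEPTH_LAYERS = {
    0: {"role": "Far background", "parallax": 0.10, "blur_px": 8, "scale": 0.70},
    1: {"role": "Glow/atmosphere", "parallax": 0.25, "blur_px": 4, "scale": 0.85},
    2: {"role": "Mid decorations", "parallax": 0.50, "blur_px": 0, "scale": 1.00},
    3: {"role": "Main objects", "parallax": 0.80, "blur_px": 0, "scale": 1.05},
    4: {"role": "UI / text", "parallax": 1.00, "blur_px": 0, "scale": 1.00},
    5: {"role": "Foreground FX", "parallax": 1.20, "blur_px": 0, "scale": 1.10},
}


def parallax_offset(depth: int, scroll_y: float) -> float:
    """Pixels a layer moves for a given scroll distance (assumed linear model)."""
    return scroll_y * DEPTH_LAYERS[depth]["parallax"]
```

Under this model, depth 4 moves with the page, lower depths lag behind it, and depth 5 moves faster than the scroll.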
---
## Step 4 — Apply Accessibility & Performance (Always)
These are MANDATORY in every output:
```css
@media (prefers-reduced-motion: reduce) {
*, *::before, *::after {
animation-duration: 0.01ms !important;
animation-iteration-count: 1 !important;
transition-duration: 0.01ms !important;
scroll-behavior: auto !important;
}
}
```
- Only animate: `transform`, `opacity`, `filter`, `clip-path` — never `width/height/top/left`
- Use `will-change: transform` only on actively animating elements, and remove it once the animation completes
- Use `content-visibility: auto` on off-screen sections
- Use `IntersectionObserver` to only animate elements in viewport
- Detect mobile: `window.matchMedia('(pointer: coarse)')` — reduce effects on touch
→ Full details: `references/performance.md` and `references/accessibility.md`
---
## Step 5 — Code Structure (Always Use This HTML Architecture)
```html
<!-- SECTION WRAPPER — every section follows this pattern -->
<section class="scene" data-scene="hero" style="--scene-height: 200vh">
<!-- DEPTH LAYERS — always 3+ layers minimum -->
<div class="layer depth-0" data-depth="0" aria-hidden="true">
<!-- Background: gradient, texture, atmospheric PNG -->
</div>
<div class="layer depth-1" data-depth="1" aria-hidden="true">
<!-- Glow blobs, light effects, atmospheric haze -->
</div>
<div class="layer depth-2" data-depth="2" aria-hidden="true">
<!-- Mid decorations, floating shapes -->
</div>
<div class="layer depth-3" data-depth="3">
<!-- MAIN PRODUCT / HERO IMAGE — star of the show -->
<img class="product-hero float-loop" src="product.png" alt="[description]" />
</div>
<div class="layer depth-4" data-depth="4">
<!-- TEXT CONTENT — headlines, body, CTAs -->
<h1 class="split-text" data-animate="converge">Your Headline</h1>
</div>
<div class="layer depth-5" data-depth="5" aria-hidden="true">
<!-- Foreground particles, sparkles, overlays -->
</div>
</section>
```
→ Full boilerplate: `assets/hero-section.html`
→ Full CSS system: `assets/hero-section.css`
→ Full JS engine: `assets/hero-section.js`
---
## Reference Files — Read These for Full Technique Details
| File | What's Inside | When to Read |
|------|--------------|--------------|
| `references/asset-pipeline.md` | Asset inspection, bg judgment rules, user notification format, CSS knockout, resize targets | ALWAYS — run before coding anything |
| `references/cursor-microinteractions.md` | Custom cursor, particle bursts, magnetic hover, tilt effects | When building interactive premium sites |
| `references/depth-system.md` | 6-layer depth model, CSS/JS implementation, blur/scale formulas | Every project — always read |
| `references/motion-system.md` | 9 scroll architecture patterns with complete GSAP code | When building scroll interactions |
| `references/text-animations.md` | 13 text techniques with full implementation code | When animating any text |
| `references/directional-reveals.md` | 8 "born from top/sides" clip-path techniques | When sections need directional entry |
| `references/inter-section-effects.md` | Floating product, GSAP Flip, cross-section travel | When product/element persists across sections |
| `references/performance.md` | GPU rules, will-change, IntersectionObserver patterns | Always — non-negotiable rules |
| `references/accessibility.md` | WCAG 2.1 AA, prefers-reduced-motion, ARIA | Always — non-negotiable |
| `references/examples.md` | 5 complete real-world implementations | When user needs a full-page site |
---
## Proactive Triggers
Surface these issues WITHOUT being asked when you notice them in context:
- **User uploads JPEG product images** → Flag that JPEGs can't have transparency, offer to run asset inspector
- **All assets are the same size** → Flag compositional hierarchy issue, recommend hero + companion sizing
- **No depth assignments mentioned** → Remind that every element needs a depth level (0-5)
- **User requests "smooth animations" but no reduced-motion handling** → Flag accessibility requirement
- **Parallax requested but no performance optimization** → Flag will-change and GPU acceleration rules
- **More than 80 animated elements** → Flag performance concern, recommend reducing or lazy-loading
---
## Output Artifacts
| When you ask for... | You get... |
|---------------------|------------|
| "Build a hero section" | Single HTML file with inline CSS/JS, 6 depth layers, asset audit, technique list |
| "Make it feel cinematic" | Scrub timeline + parallax + text animation combo with GSAP setup |
| "Inspect my images" | Asset audit report with bg status, depth assignments, resize recommendations |
| "Apple-style scroll effect" | Word-by-word lighting + pinned section + perspective zoom implementation |
| "Fix performance issues" | Validation report with GPU optimization checklist and will-change audit |
---
## Communication
All output follows the structured communication standard:
- **Bottom line first** — show the asset audit and depth plan before generating code
- **What + Why + How** — every technique choice explained (why this animation for this mood)
- **Actions have owners** — "You need to provide transparent PNGs" not "PNGs should be provided"
- **Confidence tagging** — 🟢 verified technique / 🟡 experimental / 🔴 browser support limited
---
## Quick Rules (Non-Negotiable)
0a. ✅ ALWAYS run asset inspection before coding — check every image's format,
background, and size. State depth assignments to the user before building.
0b. ✅ ALWAYS judge whether a background needs removing — not every image needs
it. Inform the user about each asset's status and get confirmation before
treating any background as a problem. Never auto-remove, never silently ignore.
1. ✅ Every section has a minimum of **3 depth layers**
2. ✅ Every text element uses at least **1 animation technique**
3. ✅ Every project includes **`prefers-reduced-motion`** fallback
4. ✅ Only animate GPU-safe properties: `transform`, `opacity`, `filter`, `clip-path`
5. ✅ Product images are assigned **depth-3** by default
6. ✅ Background images always sit at **depth-0** with slight blur
7. ✅ Floating loops on any "hero" element (6–14s, never completely static)
8. ✅ Every decorative element gets `aria-hidden="true"`
9. ✅ Mobile gets reduced effects via `pointer: coarse` detection
10. ✅ `will-change` removed after animations complete
---
## Output Format
Always deliver:
1. **Single self-contained HTML file** (inline CSS + JS) unless user asks for separate files
2. **CDN imports** for GSAP via jsDelivr: `https://cdn.jsdelivr.net/npm/gsap@<version>/dist/gsap.min.js` (substitute the GSAP version you pin)
3. **Comments** explaining every major section and technique used
4. **Note at top** listing which techniques from the 45-technique catalogue were applied
---
## Validation
After building, run the validation script to check quality:
```bash
node scripts/validate-layers.js path/to/index.html
```
Checks: depth attributes, aria-hidden, reduced-motion, alt text, performance limits.
---
## Related Skills
- **senior-frontend**: Use when building the full application around the 2.5D site. NOT for the cinematic effects themselves.
- **ui-design**: Use when designing the visual layout and components. NOT for scroll animations or depth effects.
- **landing-page-generator**: Use for quick SaaS landing page scaffolds. NOT for custom cinematic experiences.
- **page-cro**: Use after the 2.5D site is built to optimize conversion. NOT during the initial build.
- **senior-architect**: Use when the 2.5D site is part of a larger system architecture. NOT for standalone pages.
- **accessibility-auditor**: Use to verify full WCAG compliance after build. This skill includes basic reduced-motion handling.
FILE:references/accessibility.md
# Accessibility Reference
## Non-Negotiable Rules
Every 2.5D website MUST implement ALL of the following. These are not optional enhancements — they are legal requirements in many jurisdictions and ethical requirements always.
---
## 1. prefers-reduced-motion (Most Critical)
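Parallax and complex animations can trigger symptoms such as dizziness, nausea, and migraines in users with vestibular disorders. WCAG 2.1 Success Criterion 2.3.3 (Animation from Interactions, Level AAA) addresses this by requiring that motion animation triggered by interaction can be disabled.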
```css
/* This block must be in EVERY project */
@media (prefers-reduced-motion: reduce) {
/* Nuclear option: stop all animations globally */
*,
*::before,
*::after {
animation-duration: 0.01ms !important;
animation-iteration-count: 1 !important;
transition-duration: 0.01ms !important;
scroll-behavior: auto !important;
}
/* Specifically disable 2.5D techniques */
.float-loop { animation: none !important; }
.parallax-layer { transform: none !important; }
.depth-0, .depth-1, .depth-2,
.depth-3, .depth-4, .depth-5 {
transform: none !important;
filter: none !important;
}
.glow-blob { opacity: 0.3; animation: none !important; }
.theatrical, .theatrical-with-exit {
animation: none !important;
opacity: 1 !important;
transform: none !important;
}
}
```
```javascript
// Also check in JavaScript — some GSAP animations don't respect CSS media queries
if (window.matchMedia('(prefers-reduced-motion: reduce)').matches) {
gsap.globalTimeline.timeScale(0); // Stops all GSAP animations
ScrollTrigger.getAll().forEach(t => t.kill()); // Kill all scroll triggers
// Show all content immediately (don't hide-until-animated)
document.querySelectorAll('[data-animate]').forEach(el => {
el.style.opacity = '1';
el.style.transform = 'none';
el.removeAttribute('data-animate');
});
}
```
## Per-Effect Reduced Motion (Smarter Than Kill-All)
Rather than freezing every animation globally, classify each type:
| Animation Type | At reduced-motion |
|---|---|
| Scroll parallax depth layers | DISABLE — continuous motion triggers vestibular issues |
| Float loops / ambient movement | DISABLE — looping motion is a trigger |
| DJI scale-in / perspective zoom | DISABLE — fast scale can cause dizziness |
| Particle systems | DISABLE |
| Clip-path reveals (one-shot) | KEEP — not continuous, not fast |
| Fade-in on scroll (opacity only) | KEEP — safe |
| Word-by-word scroll lighting | KEEP — no movement, just colour |
| Curtain / wipe reveals (one-shot) | KEEP |
| Text entrance slides (one-shot) | KEEP but reduce duration |
```javascript
const prefersReduced = window.matchMedia('(prefers-reduced-motion: reduce)').matches;
if (prefersReduced) {
// Disable the motion-heavy ones
document.querySelectorAll('.float-loop').forEach(el => {
el.style.animation = 'none';
});
document.querySelectorAll('[data-depth]').forEach(el => {
el.style.transform = 'none';
el.style.willChange = 'auto';
});
// Slow GSAP to near-freeze (don't fully kill — keep structure intact)
gsap.globalTimeline.timeScale(0.01);
// Safe animations: show them immediately at final state
gsap.utils.toArray('.clip-reveal, .fade-reveal, .word-light').forEach(el => {
gsap.set(el, { clipPath: 'inset(0 0% 0 0)', opacity: 1 });
});
}
```
---
## 2. Semantic HTML Structure
```html
<!-- CORRECT semantic structure -->
<main>
<!-- Each visual scene is a section with proper landmarks -->
<section aria-label="Hero — Product Introduction">
<!-- ALL purely decorative elements get aria-hidden -->
<div class="layer depth-0" aria-hidden="true">
<!-- background gradients, glow blobs, particles -->
</div>
<div class="layer depth-1" aria-hidden="true">
<!-- atmospheric effects -->
</div>
<div class="layer depth-5" aria-hidden="true">
<!-- particles, sparkles -->
</div>
<!-- Meaningful content is NOT hidden -->
<div class="layer depth-3">
<!-- Meaningful images need descriptive alt text. NOT: alt="" -->
<img
src="product.png"
alt="[Descriptive alt text — what is the product, what does it look like]"
>
</div>
<div class="layer depth-4">
<!-- Proper heading hierarchy -->
<h1>Your Brand Name</h1>
<!-- h1 is the page title — only one per page -->
<p>Supporting description that provides context for screen readers</p>
<a href="#features" class="cta-btn">
Explore Features
<!-- CTAs need descriptive text, not just "Click here" -->
</a>
</div>
</section>
<section aria-label="Product Features">
<h2>Why Choose [Product]</h2>
<!-- h2 for section headings -->
</section>
</main>
```
---
## 3. SplitText & Screen Readers
When using SplitText to fragment text into characters/words, the individual fragments get announced one at a time by screen readers — which sounds terrible. Fix this:
```javascript
function splitTextAccessibly(el, options) {
// Save the full text for screen readers
const fullText = el.textContent.trim();
el.setAttribute('aria-label', fullText);
// Split visually only
const split = new SplitText(el, options);
// Hide the split fragments from screen readers
// Screen readers will use aria-label instead
split.chars?.forEach(char => char.setAttribute('aria-hidden', 'true'));
split.words?.forEach(word => word.setAttribute('aria-hidden', 'true'));
split.lines?.forEach(line => line.setAttribute('aria-hidden', 'true'));
return split;
}
// Usage
splitTextAccessibly(document.querySelector('.hero-title'), { type: 'chars,words' });
```
---
## 4. Keyboard Navigation
All interactive elements must be reachable and operable via keyboard (Tab, Enter, Space, Arrow keys).
```css
/* Ensure focus indicators are visible — WCAG 2.4.7 */
:focus-visible {
outline: 3px solid #005fcc; /* High contrast focus ring */
outline-offset: 3px;
border-radius: 3px;
}
/* Remove default outline only if replacing with custom */
:focus:not(:focus-visible) {
outline: none;
}
/* Skip link for keyboard users to bypass navigation */
.skip-link {
position: absolute;
top: -100px;
left: 0;
background: #005fcc;
color: white;
padding: 12px 20px;
z-index: 10000;
font-weight: 600;
text-decoration: none;
}
.skip-link:focus {
top: 0; /* Appears at top when focused */
}
```
```html
<!-- Always first element in body -->
<a href="#main-content" class="skip-link">Skip to main content</a>
<main id="main-content">
...
</main>
```
---
## 5. Color Contrast (WCAG 2.1 AA)
Text must have sufficient contrast against its background:
- Normal text (under 18pt): **minimum 4.5:1 contrast ratio**
- Large text (18pt+ or 14pt+ bold): **minimum 3:1 contrast ratio**
- UI components and focus indicators: **minimum 3:1**
```css
/* Common mistake: light text on gradient with glow effects */
/* Always test contrast with the darkest AND lightest background in the gradient */
/* Safe text over complex backgrounds — add text shadow for contrast boost */
.hero-text-on-image {
color: #ffffff;
/* Multiple small text shadows create a halo that boosts contrast */
text-shadow:
0 0 20px rgba(0,0,0,0.8),
0 2px 4px rgba(0,0,0,0.6),
0 0 40px rgba(0,0,0,0.4);
}
/* Or use a semi-transparent backdrop */
.text-backdrop {
background: rgba(0, 0, 0, 0.55);
backdrop-filter: blur(8px);
padding: 1rem 1.5rem;
border-radius: 8px;
}
```
**Testing tool:** Use browser DevTools accessibility panel or webaim.org/resources/contrastchecker/
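The ratios above can also be checked in code. The sketch below applies the WCAG 2.1 relative-luminance and contrast-ratio formulas. The function names are illustrative assumptions, and the parser only handles six-digit `#rrggbb` values.

```javascript
// WCAG 2.1 relative luminance of a "#rrggbb" colour
function relativeLuminance(hex) {
  // Split into R, G, B channels scaled to 0..1
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  // Linearise each sRGB channel
  const [r, g, b] = channels.map((c) =>
    c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4)
  );
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colours: 1 (identical) up to 21 (black on white)
function contrastRatio(hexA, hexB) {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// True when the pair meets the AA threshold for its text size
function meetsAA(foreground, background, isLargeText = false) {
  return contrastRatio(foreground, background) >= (isLargeText ? 3 : 4.5);
}
```

For text over a gradient, run the check against both the lightest and the darkest stop, as the CSS comment above recommends.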
---
## 6. Motion-Sensitive Users — User Control
Beyond `prefers-reduced-motion`, provide an in-page control:
```html
<!-- Floating toggle button -->
<button
class="motion-toggle"
aria-pressed="false"
aria-label="Toggle animations on/off"
>
<span class="motion-toggle-icon">✦</span>
<span class="motion-toggle-text">Animations On</span>
</button>
```
```javascript
const motionToggle = document.querySelector('.motion-toggle');
const savedPreference = localStorage.getItem('motionPreference');
// A saved choice wins; otherwise fall back to the OS-level preference
let animationsEnabled = savedPreference
  ? savedPreference === 'on'
  : !window.matchMedia('(prefers-reduced-motion: reduce)').matches;

// Sync the button, the root class, and GSAP with the current state
function applyMotionState() {
  // aria-pressed="true" means the "animations off" state is active
  motionToggle.setAttribute('aria-pressed', String(!animationsEnabled));
  motionToggle.querySelector('.motion-toggle-text').textContent =
    animationsEnabled ? 'Animations On' : 'Animations Off';
  document.documentElement.classList.toggle('no-motion', !animationsEnabled);
  gsap.globalTimeline.timeScale(animationsEnabled ? 1 : 0);
}

motionToggle.addEventListener('click', () => {
  animationsEnabled = !animationsEnabled;
  applyMotionState();
  // Persist preference
  localStorage.setItem('motionPreference', animationsEnabled ? 'on' : 'off');
});

// Apply the initial state on load without toggling it
// (clicking the button here would flip a reduced-motion default back on)
applyMotionState();
```
---
## 7. Images — Alt Text Guidelines
```html
<!-- Meaningful product image -->
<img src="juice-glass.png" alt="Tall glass of fresh orange juice with ice, floating on a gradient background">
<!-- Decorative geometric shape -->
<img src="shape-circle.png" alt="" aria-hidden="true">
<!-- Empty alt="" tells screen readers to skip it -->
<!-- Icon with text label next to it -->
<img src="icon-arrow.svg" alt="" aria-hidden="true">
<span>Learn More</span>
<!-- Icon is decorative when text is present -->
<!-- Standalone icon button — needs alt text -->
<button>
<img src="icon-menu.svg" alt="Open navigation menu">
</button>
```
---
## 8. Loading Screen Accessibility
```javascript
// Announce loading state to screen readers
function announceLoading() {
const announcement = document.createElement('div');
announcement.setAttribute('role', 'status');
announcement.setAttribute('aria-live', 'polite');
announcement.textContent = 'Page loading'; // live regions announce text content, not aria-label
announcement.className = 'sr-only'; // visually hidden
document.body.appendChild(announcement);
// Update announcement when done
window.addEventListener('load', () => {
announcement.textContent = 'Page loaded';
setTimeout(() => announcement.remove(), 1000);
});
}
```
```css
/* Screen-reader only utility class */
.sr-only {
position: absolute;
width: 1px;
height: 1px;
padding: 0;
margin: -1px;
overflow: hidden;
clip: rect(0,0,0,0);
white-space: nowrap;
border: 0;
}
```
---
## WCAG 2.1 AA Compliance Checklist
Before shipping any 2.5D website:
- [ ] `prefers-reduced-motion` CSS block present and tested
- [ ] GSAP animations stopped when reduced motion detected
- [ ] All decorative elements have `aria-hidden="true"`
- [ ] All meaningful images have descriptive alt text
- [ ] SplitText elements have `aria-label` on parent
- [ ] Heading hierarchy is logical (h1 → h2 → h3, no skipping)
- [ ] All interactive elements reachable via keyboard Tab
- [ ] Focus indicators visible and have 3:1 contrast
- [ ] Skip-to-main-content link present
- [ ] Text contrast meets 4.5:1 minimum
- [ ] CTA buttons have descriptive text
- [ ] Motion toggle button provided (optional but recommended)
- [ ] Page has `<html lang="en">` (or correct language)
- [ ] `<main>` landmark wraps page content
- [ ] Section landmarks use `aria-label` to differentiate them
FILE:references/asset-pipeline.md
# Asset Pipeline Reference
Every image asset must be inspected and judged before use in any 2.5D site.
The AI inspects, judges, and informs — it does NOT auto-remove backgrounds.
---
## Step 1 — Run the Inspection Script
Run `scripts/inspect-assets.py` on every uploaded image before doing anything else.
The script outputs the format, mode, size, background type, and a recommendation
for each image. Read its output carefully.
---
## Step 2 — Judge Whether Background Removal Is Actually Needed
The script detects whether a background exists. YOU must decide whether it matters.
### Remove the background if the image is:
- An isolated product on a studio backdrop (bottle, shoe, phone, fruit, object)
- A character or figure that needs to float in the scene
- A logo or icon placed at any depth layer
- Any element at depth-2 or depth-3 that needs to "float" over other content
- An asset where the background colour will visibly clash with the site background
### Keep the background if the image is:
- A screenshot of a website, app UI, dashboard, or software
- A photograph used as a section background or depth-0 fill
- An artwork, poster, or illustration that is viewed as a complete piece
- A device mockup or "image inside a card/frame" design element
- A photo where the background is part of the visual content
- Any image placed at depth-0 — it IS the background, keep it
### When unsure, ask about the asset's role:
> "Does this image need to float freely over other content?"
> Yes → remove bg. No → keep it.
---
## Step 3 — Resize to Depth-Appropriate Dimensions
Run the resize step in `scripts/inspect-assets.py` or do it manually.
Never embed a large image when a smaller one is sufficient.
| Depth | Role | Max Longest Edge |
|---|---|---|
| 0 | Background fill | 1920px |
| 1 | Glow / atmosphere | 800px |
| 2 | Mid decorations, companions | 400px |
| 3 | Hero product | 1200px |
| 4 | UI components | 600px |
| 5 | Particles, sparkles | 128px |
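The size limits in the table can be expressed as a small lookup. This is a sketch only: it is not the logic inside `scripts/inspect-assets.py`, whose internals are not shown here, and `fitToMaxEdge` is a hypothetical name. It computes target dimensions and never upscales.

```javascript
// Max longest edge per depth level, copied from the table above
const MAX_EDGE_BY_DEPTH = { 0: 1920, 1: 800, 2: 400, 3: 1200, 4: 600, 5: 128 };

// Hypothetical helper: returns target dimensions that preserve aspect ratio
function fitToMaxEdge(width, height, depth) {
  const maxEdge = MAX_EDGE_BY_DEPTH[depth];
  if (maxEdge === undefined) {
    throw new RangeError(`Unknown depth level: ${depth}`);
  }
  const longest = Math.max(width, height);
  // Already small enough: leave it alone rather than upscale
  if (longest <= maxEdge) return { width, height };
  const ratio = maxEdge / longest;
  return { width: Math.round(width * ratio), height: Math.round(height * ratio) };
}
```

For example, a 4000×2000 hero product at depth 3 would be scaled so its longest edge is 1200px, while a 100×50 sparkle at depth 5 is left untouched.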
---
## Step 4 — Inform the User (Required for Every Asset)
Before outputting any HTML, always show an asset audit to the user.
For each image that has a background issue, use this exact format:
> ⚠️ **Asset Notice — [filename]**
>
> This is a [JPEG / PNG] with a solid [black / white / coloured] background.
> As-is, it will appear as a visible box on the page rather than a floating asset.
>
> Based on its intended role ([product shot / decoration / etc.]), I think the
> background [should be removed / should be kept because it's a [screenshot/artwork/bg fill/etc.]].
>
> **Options:**
> 1. Provide a new PNG with a transparent background — best quality, ideal
> 2. Proceed as-is with a CSS workaround (mix-blend-mode) — quick but approximate
> 3. Keep the background — if this image is meant to be seen with its background
>
> Which do you prefer?
For clean images, confirm them briefly:
> ✅ **[filename]** — clean transparent PNG, resized to [X]px, assigned depth-[N] ([role])
Show all of this BEFORE outputting HTML. Wait for the user's response on any ⚠️ items.
---
## Step 5 — CSS Workaround (Only After User Approves)
Apply ONLY if the user explicitly chooses option 2 above:
```css
/* Dark background image on a dark site — black pixels become invisible */
.on-dark-bg {
mix-blend-mode: screen;
}
/* Light background image on a light site — white pixels become invisible */
.on-light-bg {
mix-blend-mode: multiply;
}
```
Always add a comment in the HTML when using this:
```html
<!-- CSS approximation: [filename] has a solid background.
Replace with a transparent PNG for best quality. -->
```
Limitations:
- `screen` lightens mid-tones — only works well on very dark site backgrounds
- `multiply` darkens mid-tones — only works well on very light site backgrounds
- Neither works on complex or gradient backgrounds
- A proper cutout PNG always gives better results
---
## Step 6 — CSS Rules for Transparent Images
Whether the image came in clean or had its background resolved, always apply:
```css
/* ALWAYS use drop-shadow — it follows the actual pixel shape */
.product-img {
filter: drop-shadow(0 30px 60px rgba(0, 0, 0, 0.4));
}
/* NEVER use box-shadow on cutout images — it creates a rectangle, not a shape shadow */
/* NEVER apply these to transparent/cutout images: */
/*
border-radius → clips transparency into a rounded box
overflow: hidden → same problem on the parent element
object-fit: cover → crops the image to fill a box, cutting off parts of the cutout
background-color → makes the bounding box visible
*/
```
FILE:references/depth-system.md
# Depth System Reference
The 2.5D illusion is built entirely on a **6-level depth model**. Every element on the page belongs to exactly one depth level. Depth controls four automatic properties: parallax speed, blur, scale, and shadow intensity. Together these four signals trick the human visual system into perceiving genuine spatial depth from flat assets.
---
## The 6-Level Depth Table
| Level | Name | Parallax | Blur | Scale | Shadow | Z-Index |
|-------|-------------------|----------|-------|-------|---------|---------|
| 0 | Far Background | 0.10x | 8px | 0.70 | 0.05 | 0 |
| 1 | Glow / Atmosphere | 0.25x | 4px | 0.85 | 0.10 | 1 |
| 2 | Mid Decorations | 0.50x | 0px | 1.00 | 0.20 | 2 |
| 3 | Main Objects | 0.80x | 0px | 1.05 | 0.35 | 3 |
| 4 | UI / Text | 1.00x | 0px | 1.00 | 0.00 | 4 |
| 5 | Foreground FX | 1.20x | 0px | 1.10 | 0.50 | 5 |
**Parallax formula:**
```
element_translateY = scroll_position * depth_factor * -1
```
A depth-0 element at scroll position 500px moves only -50px (barely moves — feels far away).
A depth-5 element at 500px moves -600px (moves fast — feels close).
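The two worked values above follow directly from the formula. A small sketch, using the parallax factors from the depth table (the function names are illustrative):

```javascript
// Parallax factors for depth levels 0..5, from the depth table
const DEPTH_FACTORS = [0.10, 0.25, 0.50, 0.80, 1.00, 1.20];

// element_translateY = scroll_position * depth_factor * -1
function parallaxOffset(scrollY, depth) {
  return scrollY * DEPTH_FACTORS[depth] * -1;
}

// Offsets for all six levels at one scroll position
function offsetsAt(scrollY) {
  return DEPTH_FACTORS.map((_, depth) => parallaxOffset(scrollY, depth));
}
```

At a scroll position of 500px, the spread between the depth-0 and depth-5 offsets is what the eye reads as distance.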
---
## CSS Implementation
### CSS Custom Properties Foundation
```css
:root {
/* Depth parallax factors */
--depth-0-factor: 0.10;
--depth-1-factor: 0.25;
--depth-2-factor: 0.50;
--depth-3-factor: 0.80;
--depth-4-factor: 1.00;
--depth-5-factor: 1.20;
/* Depth blur values */
--depth-0-blur: 8px;
--depth-1-blur: 4px;
--depth-2-blur: 0px;
--depth-3-blur: 0px;
--depth-4-blur: 0px;
--depth-5-blur: 0px;
/* Depth scale values */
--depth-0-scale: 0.70;
--depth-1-scale: 0.85;
--depth-2-scale: 1.00;
--depth-3-scale: 1.05;
--depth-4-scale: 1.00;
--depth-5-scale: 1.10;
/* Live scroll value (updated by JS) */
--scroll-y: 0;
}
/* Base layer class */
.layer {
position: absolute;
inset: 0;
will-change: transform;
transform-origin: center center;
}
/* Depth-specific classes */
.depth-0 {
filter: blur(var(--depth-0-blur));
transform: scale(var(--depth-0-scale))
translateY(calc(var(--scroll-y) * var(--depth-0-factor) * -1px));
z-index: 0;
}
.depth-1 {
filter: blur(var(--depth-1-blur));
transform: scale(var(--depth-1-scale))
translateY(calc(var(--scroll-y) * var(--depth-1-factor) * -1px));
z-index: 1;
mix-blend-mode: screen; /* glow layers blend additively */
}
.depth-2 {
transform: scale(var(--depth-2-scale))
translateY(calc(var(--scroll-y) * var(--depth-2-factor) * -1px));
z-index: 2;
}
.depth-3 {
transform: scale(var(--depth-3-scale))
translateY(calc(var(--scroll-y) * var(--depth-3-factor) * -1px));
z-index: 3;
filter: drop-shadow(0 20px 40px rgba(0,0,0,0.35));
}
.depth-4 {
transform: translateY(calc(var(--scroll-y) * var(--depth-4-factor) * -1px));
z-index: 4;
}
.depth-5 {
transform: scale(var(--depth-5-scale))
translateY(calc(var(--scroll-y) * var(--depth-5-factor) * -1px));
z-index: 5;
}
```
### JavaScript — Scroll Driver
```javascript
// Throttled scroll listener using requestAnimationFrame
let ticking = false;
let lastScrollY = 0;
function updateDepthLayers() {
// Use the value captured by the scroll handler instead of reading it again
document.documentElement.style.setProperty('--scroll-y', lastScrollY);
ticking = false;
}
window.addEventListener('scroll', () => {
lastScrollY = window.scrollY;
if (!ticking) {
requestAnimationFrame(updateDepthLayers);
ticking = true;
}
}, { passive: true });
```
---
## Asset Assignment Rules
### What Goes in Each Depth Level
**Depth 0 — Far Background**
- Full-width background images (sky, gradient, texture)
- Very large PNGs (1920×1080+), file size 80–150KB max
- Heavily blurred by CSS — low detail is fine and preferred
- Examples: skyscape, abstract color wash, noise texture
**Depth 1 — Glow / Atmosphere**
- Radial gradient blobs, lens flare PNGs, soft light overlays
- Size: 600–1000px, file size: 30–60KB max
- Always use `mix-blend-mode: screen` or `mix-blend-mode: lighten`
- Always add `filter: blur()` in the 40–100px range on the element itself, on top of the layer's depth blur
- Examples: orange glow blob behind product, atmospheric haze
**Depth 2 — Mid Decorations**
- Abstract shapes, geometric patterns, floating decorative elements
- Size: 200–400px, file size: 20–50KB max
- Moderate shadow, no blur
- Examples: floating geometric shapes, brand pattern elements
**Depth 3 — Main Objects (The Star)**
- Hero product images, characters, featured illustrations
- Size: 800–1200px, file size: 50–120KB max
- High detail, clean cutout (transparent PNG background)
- Strong drop shadow: `filter: drop-shadow(0 30px 60px rgba(0,0,0,0.4))`
- This is the element users look at — give it the most visual weight
- Examples: juice bottle, product shot, hero character
**Depth 4 — UI / Text**
- Headlines, body copy, buttons, cards, navigation
- Always crisp, never blurred
- Text elements get animation data attributes (see text-animations.md)
- Examples: `<h1>`, `<p>`, `<button>`, card components
**Depth 5 — Foreground Particles / FX**
- Sparkles, floating dots, light particles, decorative splashes
- Small (32–128px), file size: 2–10KB
- High contrast, sharp edges
- Multiple instances scattered with different animation delays
- Examples: star sparkles, liquid splash dots, highlight flares
---
## Compositional Hierarchy — Size Relationships Between Assets
The most common mistake in 2.5D design is treating all assets as the same size.
Real cinematic depth requires deliberate, intentional size contrast.
### The Rule of One Hero
Every scene has exactly ONE dominant asset. Everything else serves it.
| Role | Display Size | Depth |
|---|---|---|
| Hero / star element | 50–85vw | depth-3 |
| Primary companion | 8–15vw | depth-2 |
| Secondary companion | 5–10vw | depth-2 |
| Accent / particle | 1–4vw | depth-5 |
| Background fill | 100vw | depth-0 |
### Positioning Companions Close to the Hero
Never scatter companions in random corners. Position them relative to the hero's edge:
```css
/*
Hero width: clamp(600px, 70vw, 1000px)
Hero half-width: clamp(300px, 35vw, 500px)
*/
.companion-right {
position: absolute;
right: calc(50% - clamp(300px, 35vw, 500px) - 20px);
/* negative gap value = slightly overlaps the hero */
}
.companion-left {
position: absolute;
left: calc(50% - clamp(300px, 35vw, 500px) - 20px);
}
```
Vertical placement:
- Upper shoulder: `top: 35%; transform: translateY(-50%)`
- Mid waist: `top: 55%; transform: translateY(-50%)`
- Lower base: `top: 72%; transform: translateY(-50%)`
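To see what the `calc()` expression resolves to at a given screen size, the arithmetic can be reproduced in JavaScript. This sketch assumes the positioning container spans the full viewport width; `companionOffsetPx` is a hypothetical name.

```javascript
// Equivalent of CSS clamp(min, preferred, max)
function clamp(min, preferred, max) {
  return Math.min(Math.max(preferred, min), max);
}

// Resolves calc(50% - clamp(300px, 35vw, 500px) - 20px) for a viewport width in px
function companionOffsetPx(viewportWidth, gapPx = 20) {
  const heroHalfWidth = clamp(300, viewportWidth * 0.35, 500);
  return viewportWidth / 2 - heroHalfWidth - gapPx;
}
```

On a 1440px viewport the hero half-width hits its 500px cap, so the offset is 720 - 500 - 20 = 200px. On a 600px viewport the 300px floor applies and the result is -20px, so narrow layouts may need their own rule.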
### Scatter Rule on Hero Scroll-Out
When the hero grows or exits, companions scatter outward — not just fade.
This reinforces they were "held in orbit" by the hero.
```javascript
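// Assumes `heroScrollTimeline` is an existing GSAP timeline and `scrollPos`
// is the timeline position at which the hero starts to exit
heroScrollTimeline
.to('.companion-right', { x: 80, y: -50, scale: 1.3 }, scrollPos)
.to('.companion-left', { x: -70, y: 40, scale: 1.25 }, scrollPos)
.to('.companion-lower', { x: 30, y: 80, scale: 1.1 }, scrollPos);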
```
### Pre-Build Size Checklist
Before assigning sizes, answer these for every asset:
1. Is this the hero? → make it large enough to command the viewport
2. Is this a companion? → it should be 15–25% of the hero's display size
3. Would this read better bigger or smaller than my first instinct?
4. Is there enough size contrast between depth layers to read as real depth?
5. Does the composition feel balanced, or does everything look the same size?
---
## Floating Loop Animation
Every image element at depth 2, 3, and 5 should have a floating animation. A perfectly static asset weakens the 3D illusion. Disable these loops for users who prefer reduced motion.
```css
/* Float variants — apply different ones to different elements */
@keyframes float-y {
0%, 100% { transform: translateY(0px); }
50% { transform: translateY(-18px); }
}
@keyframes float-rotate {
0%, 100% { transform: translateY(0px) rotate(0deg); }
33% { transform: translateY(-12px) rotate(2deg); }
66% { transform: translateY(-6px) rotate(-1deg); }
}
@keyframes float-breathe {
0%, 100% { transform: scale(1); }
50% { transform: scale(1.04); }
}
@keyframes float-orbit {
0% { transform: translate(0, 0) rotate(0deg); }
25% { transform: translate(8px, -12px) rotate(2deg); }
50% { transform: translate(0, -20px) rotate(0deg); }
75% { transform: translate(-8px, -12px) rotate(-2deg); }
100% { transform: translate(0, 0) rotate(0deg); }
}
/* Depth-appropriate durations */
.depth-2 .float-loop { animation: float-y 10s ease-in-out infinite; }
.depth-3 .float-loop { animation: float-orbit 8s ease-in-out infinite; }
.depth-5 .float-loop { animation: float-rotate 6s ease-in-out infinite; }
/* Stagger delays for multiple elements at same depth */
.float-loop:nth-child(2) { animation-delay: -2s; }
.float-loop:nth-child(3) { animation-delay: -4s; }
.float-loop:nth-child(4) { animation-delay: -1.5s; }
```
---
## Shadow Depth Enhancement
Stronger shadows on closer elements amplify depth perception:
```css
/* Depth shadow system */
.depth-2 img { filter: drop-shadow(0 10px 20px rgba(0,0,0,0.20)); }
.depth-3 img { filter: drop-shadow(0 25px 50px rgba(0,0,0,0.35)); }
.depth-5 img { filter: drop-shadow(0 5px 15px rgba(0,0,0,0.50)); }
```
## Glow Layer Pattern (Depth 1)
The glow layer is critical for the "product floating in light" premium feel:
```css
/* Glow blob behind the main product */
.glow-blob {
position: absolute;
width: 600px;
height: 600px;
border-radius: 50%;
background: radial-gradient(circle, var(--brand-color) 0%, transparent 70%);
filter: blur(80px);
opacity: 0.45;
mix-blend-mode: screen;
/* Position behind depth-3 product */
z-index: 1;
/* Slow breathing pulse (scale only) */
animation: float-breathe 12s ease-in-out infinite;
}
```
---
## HTML Scaffold Template
```html
<section class="scene" data-scene="[name]">
<div class="scene-inner">
<!-- DEPTH 0: Far background -->
<div class="layer depth-0" aria-hidden="true">
<div class="bg-gradient"></div>
<!-- OR: <img src="bg-texture.png" alt=""> -->
</div>
<!-- DEPTH 1: Glow atmosphere -->
<div class="layer depth-1" aria-hidden="true">
<div class="glow-blob glow-primary"></div>
<div class="glow-blob glow-secondary"></div>
</div>
<!-- DEPTH 2: Mid decorations -->
<div class="layer depth-2" aria-hidden="true">
<img class="deco float-loop" src="shape-1.png" alt="">
<img class="deco float-loop" src="shape-2.png" alt="">
</div>
<!-- DEPTH 3: Main product/hero -->
<div class="layer depth-3">
<img class="product-hero float-loop" src="product.png"
alt="[Meaningful description of product]" />
</div>
<!-- DEPTH 4: Text & UI -->
<div class="layer depth-4">
<h1 class="hero-title split-text" data-animate="converge">
Your Headline
</h1>
<p class="hero-sub" data-animate="fade-up">Supporting copy here</p>
<a class="cta-btn" href="#" data-animate="scale-in">Get Started</a>
</div>
<!-- DEPTH 5: Foreground particles -->
<div class="layer depth-5" aria-hidden="true">
<img class="particle float-loop" src="sparkle.png" alt="">
<img class="particle float-loop" src="sparkle.png" alt="">
<img class="particle float-loop" src="sparkle.png" alt="">
</div>
</div>
</section>
```
FILE:references/directional-reveals.md
# Directional Reveals Reference
Elements and sections don't always enter from the bottom. Premium sites use **directional births** — sections that drop from the top, iris open from center, peel away like wallpaper, or unfold diagonally. This file covers all 8 directional reveal patterns.
## Table of Contents
1. [Top-Down Clip Birth](#top-down)
2. [Window Pane Iris Open](#iris-open)
3. [Curtain Panel Roll-Up](#curtain-rollup)
4. [SVG Morph Border](#svg-morph)
5. [Diagonal Wipe Birth](#diagonal-wipe)
6. [Circle Iris Expand](#circle-iris)
7. [Multi-Directional Stagger Grid](#multi-direction)
8. [Loading Screen Curtain Lift](#loading-screen)
---
## Pattern 1: Top-Down Clip Birth {#top-down}
The section is born from the top edge and grows **downward**. Instead of rising from below, it drops and unfolds from above. This is the opposite of the conventional bottom-up reveal and creates a striking "curtain drop" feeling.
```css
/* Starting state — section is fully clipped (invisible) */
.top-drop-section {
/* Section exists in DOM but is invisible */
clip-path: inset(0 0 100% 0);
/*
inset(top right bottom left):
- top: 0 → clip starts at top edge
- bottom: 100% → clips 100% from bottom = nothing visible
*/
}
/* Revealed state */
.top-drop-section.revealed {
clip-path: inset(0 0 0% 0);
transition: clip-path 1.2s cubic-bezier(0.16, 1, 0.3, 1);
}
```
```javascript
// GSAP scroll-driven version with scrub
function initTopDownBirth(sectionEl) {
gsap.fromTo(sectionEl,
{ clipPath: 'inset(0 0 100% 0)' },
{
clipPath: 'inset(0 0 0% 0)',
ease: 'power2.out',
scrollTrigger: {
trigger: sectionEl.previousElementSibling, // previous section is the trigger
start: 'bottom 80%',
end: 'bottom 20%',
scrub: 1.5,
}
}
);
}
// Exit: section retracts back upward (born from top, dies back up)
function addTopRetractExit(sectionEl) {
gsap.to(sectionEl, {
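clipPath: 'inset(0 0 100% 0)', // bottom inset grows back to 100%, so the visible area shrinks toward the top edge
ease: 'power2.in',
scrollTrigger: {
trigger: sectionEl,
start: 'bottom 20%',
end: 'bottom top',
scrub: 1,
}
});
}
```
**Key insight:** Enter = `inset(0 0 100% 0)` → `inset(0 0 0% 0)` (the bottom inset shrinks, so the visible area grows downward from the top edge).
Exit = `inset(0 0 0% 0)` → `inset(0 0 100% 0)` (the bottom inset grows again, so the section retracts upward to where it came from). Animating the top inset instead, as in `inset(100% 0 0 0)`, would wipe the section away downward.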
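The entrance can also be read as a function of progress, which helps when driving the reveal from something other than ScrollTrigger. A minimal sketch, assuming a hypothetical `topDownInset` helper where 0 means fully hidden and 1 means fully revealed:

```javascript
// Maps reveal progress (0..1) to the clip-path of a top-down birth
function topDownInset(progress) {
  // Keep progress inside the 0..1 range
  const p = Math.min(Math.max(progress, 0), 1);
  // Only the bottom inset changes; the top edge stays pinned
  const bottomInset = (1 - p) * 100;
  return `inset(0 0 ${bottomInset}% 0)`;
}
```

Applying the result is a one-liner such as `el.style.clipPath = topDownInset(0.25)`, which leaves the top quarter of the section visible.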
---
## Pattern 2: Window Pane Iris Open {#iris-open}
An entire section starts as a tiny centered rectangle — like a keyhole or portal — and expands outward to fill the viewport. Creates a cinematic "opening shot" feeling.
```javascript
function initWindowPaneIris(sectionEl) {
// The section starts as a small centered window
gsap.fromTo(sectionEl,
{
clipPath: 'inset(42% 35% 42% 35% round 12px)',
// 42% from top AND bottom = only 16% of height visible
// 35% from left AND right = only 30% of width visible
// Centered rectangle peek
},
{
clipPath: 'inset(0% 0% 0% 0% round 0px)',
ease: 'none',
scrollTrigger: {
trigger: sectionEl,
start: 'top 90%',
end: 'top 10%',
scrub: 1.2,
}
}
);
// Also scale/zoom the content inside for parallax depth
gsap.fromTo(sectionEl.querySelector('.iris-content'),
{ scale: 1.4 },
{
scale: 1,
ease: 'none',
scrollTrigger: {
trigger: sectionEl,
start: 'top 90%',
end: 'top 10%',
scrub: 1.2,
}
}
);
}
```
**Variation — horizontal bar open (blinds effect):**
```javascript
// Two bars that slide apart (one from top, one from bottom)
function initBlindsOpen(topBar, bottomBar, revealEl) {
const tl = gsap.timeline({
scrollTrigger: {
trigger: revealEl,
start: 'top 70%',
toggleActions: 'play none none reverse',
}
});
tl.to(topBar, { yPercent: -100, duration: 1.0, ease: 'power3.inOut' })
.to(bottomBar, { yPercent: 100, duration: 1.0, ease: 'power3.inOut' }, 0);
}
```
---
## Pattern 3: Curtain Panel Roll-Up {#curtain-rollup}
Multiple layered panels. Each one "rolls up" from top, exposing the panel beneath. Like peeling back wallpaper layers to reveal what's underneath. Uses z-index stacking.
```css
.curtain-stack {
position: relative;
height: 100vh;
overflow: hidden;
}
.curtain-panel {
position: absolute;
inset: 0;
/* Stack panels — panel 1 on top, panel N on bottom */
}
.curtain-panel:nth-child(1) { z-index: 5; background: #0f0f0f; }
.curtain-panel:nth-child(2) { z-index: 4; background: #1a0a2e; }
.curtain-panel:nth-child(3) { z-index: 3; background: #2d0b4e; }
.curtain-panel:nth-child(4) { z-index: 2; background: #1e3a8a; }
/* Final revealed content at z-index 1 */
```
```javascript
function initCurtainRollUp(containerEl) {
const panels = gsap.utils.toArray('.curtain-panel', containerEl);
const tl = gsap.timeline({
scrollTrigger: {
trigger: containerEl,
start: 'top top',
end: `+=${panels.length * 120}%`, // 120% of scroll distance per panel
pin: true,
scrub: 1,
}
});
panels.forEach((panel, i) => {
const segmentDuration = 1 / panels.length;
const segmentStart = i * segmentDuration;
// Each panel rolls up: its bottom inset grows from 0 to 100%, so the bottom edge rises to the top
tl.to(panel, {
clipPath: 'inset(0 0 100% 0)', // rolls up toward the top edge
duration: segmentDuration,
ease: 'power2.inOut',
}, segmentStart);
// Heading for this panel fades in
const heading = panel.querySelector('.panel-heading');
if (heading) {
tl.from(heading, {
opacity: 0,
y: 30,
duration: segmentDuration * 0.4,
}, segmentStart + segmentDuration * 0.1);
}
});
return tl;
}
```
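The pin distance and per-panel timing depend only on the panel count, so they can be worked out ahead of time. The sketch below mirrors the arithmetic in `initCurtainRollUp`; `curtainPlan` is a hypothetical name, and the 120% per panel is the value used above.

```javascript
// Hypothetical helper: pin distance and timeline segments for N panels
function curtainPlan(panelCount, percentPerPanel = 120) {
  const segmentDuration = 1 / panelCount;
  return {
    // Relative ScrollTrigger end value, e.g. "+=480%" for four panels
    end: `+=${panelCount * percentPerPanel}%`,
    // One equal slice of the 0..1 timeline per panel
    segments: Array.from({ length: panelCount }, (_, i) => ({
      start: i * segmentDuration,
      duration: segmentDuration,
    })),
  };
}
```

With four panels, each roll-up occupies a quarter of the timeline and the section stays pinned for 480% of the scroller height.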
---
## Pattern 4: SVG Morph Border {#svg-morph}
The section's edge is not a hard straight line — it morphs between shapes (rectangle → wave → diagonal → organic curve) as the user scrolls. Makes sections feel alive and fluid.
```html
<!-- SVG clipPath element -->
<svg width="0" height="0" style="position:absolute">
<defs>
<clipPath id="morphClip" clipPathUnits="objectBoundingBox">
<path id="morphPath" d="M0,0 L1,0 L1,0.95 Q0.5,1.05 0,0.95 Z"/>
</clipPath>
</defs>
</svg>
<section class="morphed-section" style="clip-path: url(#morphClip)">
<!-- section content -->
</section>
```
```javascript
function initSVGMorphBorder() {
const morphPath = document.getElementById('morphPath');
const paths = {
straight: 'M0,0 L1,0 L1,1 L0,1 Z',
wave: 'M0,0 L1,0 L1,0.95 Q0.75,1.05 0.5,0.95 Q0.25,0.85 0,0.95 Z',
diagonal: 'M0,0 L1,0 L1,0.88 L0,1.0 Z',
organic: 'M0,0 L1,0 L1,0.92 C0.8,1.04 0.6,0.88 0.4,1.0 C0.2,1.12 0.1,0.90 0,0.96 Z',
};
ScrollTrigger.create({
trigger: '.morphed-section',
start: 'top 80%',
end: 'bottom 20%',
scrub: 2,
onUpdate: (self) => {
const p = self.progress;
// Step through straight → wave → diagonal at progress thresholds.
// Note: this swaps shapes at 25% and 75%; it does not interpolate between them.
if (p < 0.5) {
morphPath.setAttribute('d', p < 0.25 ? paths.straight : paths.wave);
} else {
morphPath.setAttribute('d', p < 0.75 ? paths.wave : paths.diagonal);
}
}
});
}
```
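For a continuous morph, one option is to interpolate the path coordinates directly. The sketch below is an assumption-laden illustration, not a GSAP feature: `lerpPath` is a hypothetical helper, and it only works when both paths use the same commands in the same order (as `straight` and `diagonal` do above, but `wave` and `organic` do not).

```javascript
// Interpolates every number in two path strings that share the same command structure
function lerpPath(fromPath, toPath, t) {
  const numberPattern = /-?\d*\.?\d+/g;
  const fromNums = fromPath.match(numberPattern).map(Number);
  const toNums = toPath.match(numberPattern).map(Number);
  if (fromNums.length !== toNums.length) {
    throw new Error('Paths must contain the same number of coordinates');
  }
  let index = 0;
  // Reuse toPath's command letters and swap in the blended numbers
  return toPath.replace(numberPattern, () => {
    const value = fromNums[index] + (toNums[index] - fromNums[index]) * t;
    index += 1;
    // Round to 4 decimals and drop trailing zeros
    return String(Number(value.toFixed(4)));
  });
}
```

Inside `onUpdate`, something like `morphPath.setAttribute('d', lerpPath(paths.straight, paths.diagonal, self.progress))` would then tilt the bottom edge gradually instead of jumping.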
---
## Pattern 5: Diagonal Wipe Birth {#diagonal-wipe}
Content is revealed by a diagonal sweep across the screen — from top-left corner to bottom-right (or any corner combination). Feels cinematic and directional.
```javascript
function initDiagonalWipe(el, direction = 'top-left') {
const clipPaths = {
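'top-left': {
from: 'polygon(0 0, 0 0, 0 0)',
// Legs must reach 200% so the triangle covers the opposite (bottom-right) corner
to: 'polygon(0 0, 200% 0, 0 200%)',
},
'top-right': {
from: 'polygon(100% 0, 100% 0, 100% 0)',
// Same rule mirrored: the triangle must reach the bottom-left corner
to: 'polygon(-100% 0, 100% 0, 100% 200%)',
},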
'center-out': {
from: 'polygon(50% 50%, 50% 50%, 50% 50%, 50% 50%)',
to: 'polygon(-10% -10%, 110% -10%, 110% 110%, -10% 110%)',
},
};
const { from, to } = clipPaths[direction];
gsap.fromTo(el,
{ clipPath: from },
{
clipPath: to,
duration: 1.4,
ease: 'power3.inOut',
scrollTrigger: {
trigger: el,
start: 'top 70%',
}
}
);
}
```
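Whether a wipe's final polygon uncovers the whole element can be checked numerically. The sketch below uses illustrative helper names and takes points as `[x%, y%]` pairs; it tests whether a triangle contains all four corners of the element's box. A right triangle anchored at one corner needs legs of 200% to reach the opposite corner.

```javascript
// Sign of the cross product tells which side of edge a→b the point p lies on
function side(a, b, p) {
  return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]);
}

// True when point p is inside or on the edge of the triangle
function triangleContains([a, b, c], p) {
  const s1 = side(a, b, p);
  const s2 = side(b, c, p);
  const s3 = side(c, a, p);
  const hasNegative = s1 < 0 || s2 < 0 || s3 < 0;
  const hasPositive = s1 > 0 || s2 > 0 || s3 > 0;
  // Mixed signs mean the point is outside
  return !(hasNegative && hasPositive);
}

// True when the triangle covers all four corners of the element's box
function coversElement(triangle) {
  const corners = [[0, 0], [100, 0], [100, 100], [0, 100]];
  return corners.every((corner) => triangleContains(triangle, corner));
}
```

Running a candidate end state through `coversElement` before shipping catches wipes that stop short and leave a corner of the content clipped.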
---
## Pattern 6: Circle Iris Expand {#circle-iris}
The most dramatic reveal: a perfect circle expands from the center of the section outward, like an aperture opening or a spotlight switching on.
```javascript
function initCircleIris(el, originX = '50%', originY = '50%') {
gsap.fromTo(el,
{ clipPath: `circle(0% at ${originX} ${originY})` },
{
clipPath: `circle(80% at ${originX} ${originY})`,
ease: 'none',
scrollTrigger: {
trigger: el,
start: 'top 75%',
end: 'top 25%',
scrub: 1,
}
}
);
}
// Variant: iris opens from cursor position on hover
function initHoverIris(el) {
el.addEventListener('mouseenter', (e) => {
const rect = el.getBoundingClientRect();
const x = ((e.clientX - rect.left) / rect.width * 100).toFixed(1) + '%';
const y = ((e.clientY - rect.top) / rect.height * 100).toFixed(1) + '%';
gsap.fromTo(el,
{ clipPath: `circle(0% at ${x} ${y})` },
// 150% so the circle still covers the far corner when the cursor enters at an edge
{ clipPath: `circle(150% at ${x} ${y})`, duration: 0.6, ease: 'power2.out' }
);
});
}
```
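The end radius a circle iris needs depends on where it starts. In CSS, a percentage radius in `circle()` resolves against `sqrt(width² + height²) / sqrt(2)`, so the minimum can be computed as below. `minIrisRadiusPercent` is a hypothetical name, and the origin is assumed to lie inside the element.

```javascript
// Smallest circle() radius, in percent, that covers a width×height box
// when the circle is centred at (originXPct, originYPct) inside the box
function minIrisRadiusPercent(width, height, originXPct = 50, originYPct = 50) {
  const ox = (originXPct / 100) * width;
  const oy = (originYPct / 100) * height;
  // Distance from the origin to the farthest corner of the box
  const farthest = Math.hypot(Math.max(ox, width - ox), Math.max(oy, height - oy));
  // Reference length that circle() percentages resolve against
  const reference = Math.hypot(width, height) / Math.SQRT2;
  return (farthest / reference) * 100;
}
```

From the centre the minimum is about 70.7% for any aspect ratio. From a corner it rises to about 141.4%, so a reveal that can start anywhere needs a final radius above that.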
---
## Pattern 7: Multi-Directional Stagger Grid {#multi-direction}
When a grid or set of cards appears, each item enters from a different edge/direction — creating a dynamic assembly effect instead of uniform fade-ups.
```javascript
function initMultiDirectionalGrid(gridEl) {
const items = gsap.utils.toArray('.grid-item', gridEl);
const directions = [
{ x: -80, y: 0 }, // from left
{ x: 0, y: -80 }, // from top
{ x: 80, y: 0 }, // from right
{ x: 0, y: 80 }, // from bottom
{ x: -60, y: -60 }, // from top-left
{ x: 60, y: -60 }, // from top-right
{ x: -60, y: 60 }, // from bottom-left
{ x: 60, y: 60 }, // from bottom-right
];
items.forEach((item, i) => {
const dir = directions[i % directions.length];
gsap.from(item, {
x: dir.x,
y: dir.y,
opacity: 0,
duration: 0.8,
ease: 'power3.out',
scrollTrigger: {
trigger: gridEl,
start: 'top 75%',
},
delay: i * 0.08, // stagger
});
});
}
```
---
## Pattern 8: Loading Screen Curtain Lift {#loading-screen}
A full-viewport branded intro screen that physically lifts off the page on load, revealing the site beneath. Sets cinematic expectations before any scroll animation begins.
```css
.loading-curtain {
  position: fixed;
  inset: 0;
  z-index: 9999;
  /* The colour is painted by the two halves. The container stays transparent
     so the page shows through as the halves slide apart. */
  --curtain-color: #0a0a0a; /* or brand color */
  display: flex;
  align-items: center;
  justify-content: center;
}
.curtain-top {
  position: absolute;
  top: 0; left: 0; right: 0;
  height: 50%;
  background: var(--curtain-color);
  transform-origin: top center;
}
.curtain-bottom {
  position: absolute;
  bottom: 0; left: 0; right: 0;
  height: 50%;
  background: var(--curtain-color);
  transform-origin: bottom center;
}
```
```javascript
function initLoadingCurtain() {
const curtainTop = document.querySelector('.curtain-top');
const curtainBottom = document.querySelector('.curtain-bottom');
const curtainLogo = document.querySelector('.curtain-logo');
const loadingScreen = document.querySelector('.loading-curtain');
// Prevent scroll during loading
document.body.style.overflow = 'hidden';
const tl = gsap.timeline({
delay: 0.5,
onComplete: () => {
document.body.style.overflow = '';
loadingScreen.style.display = 'none';
// Init all scroll animations AFTER curtain lifts
initAllAnimations();
}
});
// Logo appears first
tl.from(curtainLogo, { opacity: 0, scale: 0.8, duration: 0.6, ease: 'power2.out' })
// Brief hold
.to({}, { duration: 0.4 })
// Logo fades out
.to(curtainLogo, { opacity: 0, scale: 1.1, duration: 0.4, ease: 'power2.in' })
// Curtain splits: top goes up, bottom goes down
.to(curtainTop, { yPercent: -100, duration: 0.9, ease: 'power4.inOut' }, '-=0.1')
.to(curtainBottom, { yPercent: 100, duration: 0.9, ease: 'power4.inOut' }, '<');
}
window.addEventListener('load', initLoadingCurtain);
```
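The position parameters (`'-=0.1'` and `'<'`) decide how long the page stays locked. A rough model of that scheduling in plain JavaScript, assuming the usual GSAP meaning of the two parameters (an offset from the current end of the timeline, and the start of the previously added tween). `scheduleSteps` is a hypothetical helper, not a GSAP API:

```javascript
// Minimal model of sequential timeline placement. Each step has a duration
// and an optional offset: a number shifts the start relative to the end of
// the timeline so far (like '-=0.1'), and '<' reuses the previous start.
function scheduleSteps(steps) {
  let end = 0;        // end of the timeline so far
  let prevStart = 0;  // start time of the most recently placed step
  return steps.map(({ name, duration, offset = 0 }) => {
    const start = offset === '<' ? prevStart : end + offset;
    prevStart = start;
    end = Math.max(end, start + duration);
    return { name, start, end: start + duration };
  });
}

const curtain = scheduleSteps([
  { name: 'logo in', duration: 0.6 },
  { name: 'hold', duration: 0.4 },
  { name: 'logo out', duration: 0.4 },
  { name: 'top half', duration: 0.9, offset: -0.1 },
  { name: 'bottom half', duration: 0.9, offset: '<' },
]);
const total = Math.max(...curtain.map((s) => s.end));
console.log(total.toFixed(1)); // run time in seconds, excluding the 0.5 s delay
```

By this arithmetic the sequence runs for about 2.2 s, so with the 0.5 s delay the page unlocks roughly 2.7 s after `load`. That is worth keeping in mind for returning visitors.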
---
## Combining Directional Reveals
For maximum cinematic impact, chain directional reveals between sections:
```
Section 1 → Section 2: Window pane iris (section 2 peeks through a keyhole)
Section 2 → Section 3: Top-down clip birth (section 3 drops from top)
Section 3 → Section 4: Diagonal wipe (section 4 sweeps in from corner)
Section 4 → Section 5: Circle iris (section 5 opens from center)
Section 5 → Section 6: Curtain panel roll-up (exposes multiple layers)
```
Each transition feels distinct, keeping the user engaged across the full scroll experience.
FILE:references/examples.md
# Real-World Examples Reference
Five complete implementation blueprints. Each describes exactly which techniques to combine, in what order, with key code patterns.
## Table of Contents
1. [Juice/Beverage Brand Launch](#juice-brand)
2. [Tech SaaS Landing Page](#saas)
3. [Creative Portfolio](#portfolio)
4. [Gaming Website](#gaming)
5. [Luxury Product E-Commerce](#ecommerce)
---
## Example 1: Juice/Beverage Brand Launch {#juice-brand}
**Brief:** Premium juice brand. Hero has floating glass. Sections transition smoothly with the product "rising" between them.
**Techniques Used:**
- Loading screen curtain lift
- 6-layer depth parallax in hero
- Floating product between sections (THE signature move)
- Top-down clip birth for ingredients section
- Word-by-word scroll lighting for tagline
- Cascading card stack for flavors
- Split converge title exit
**Section Architecture:**
```
[LOADING SCREEN — brand logo on black, splits open]
↓
[HERO — dark purple gradient]
depth-0: purple/dark gradient background
depth-1: orange glow blob (brand color)
depth-2: floating citrus slice PNGs (scattered, decorative)
depth-3: juice glass PNG (main product, float-loop)
depth-4: headline "Pure. Fresh. Electric." (split converge on enter)
depth-5: liquid splash particle PNGs
[FLOATING PRODUCT BRIDGE — glass hovers between sections]
[INGREDIENTS — warm cream/yellow section]
Entry: top-down clip birth (section drops from top)
depth-0: warm gradient background
depth-3: large orange PNG illustration
depth-4: "Word by word" ingredient callouts (scroll-lit)
Floating text: ingredient names fade in one by one
[FLAVORS — cascading card stack, 3 cards]
Card 1: Orange — scales down as Card 2 arrives
Card 2: Mango — scales down as Card 3 arrives
Card 3: Berry — stays full screen
Each card: full-bleed color + depth-3 bottle + depth-4 title
[CTA — minimal, dark]
Circle iris expand reveal
Oversized bleed typography: "DRINK DIFFERENT"
Simple form/button
```
**Key Code Pattern — The Glass Journey:**
```javascript
// Glass starts in hero depth-3, floats between sections,
// then descends into ingredients section
initFloatingProduct(); // from inter-section-effects.md
// On arrival in ingredients section, glass triggers
// the ingredient words to light up one by one
ScrollTrigger.create({
trigger: '.ingredients-section',
start: 'top 50%',
once: true, // run the init a single time, not on every re-entry
onEnter: () => {
initWordScrollLighting(
'.ingredients-section',
'.ingredients-tagline'
);
}
});
```
**Color Palette:**
- Hero: `#0a0014` (deep purple) → `#2d0b4e`
- Glow: `#ff6b00` (orange), `#ff9900` (amber)
- Ingredients: `#fdf4e7` (warm cream)
- Flavors: Brand-specific per flavor
- CTA: `#0a0014` (returns to hero dark)
---
## Example 2: Tech SaaS Landing Page {#saas}
**Brief:** B2B SaaS product — analytics dashboard. Premium, modern, tech-forward. Animated product screenshots.
**Techniques Used:**
- Window pane iris open (hero reveals from keyhole)
- DJI-style scale-in pin (dashboard screenshot fills viewport)
- Scrub timeline (features appear one by one)
- Curtain panel roll-up (pricing tiers reveal)
- Character cylinder rotation (headline numbers: "10x faster")
- Line clip wipe (feature descriptions)
- Horizontal scroll (integration logos)
**Section Architecture:**
```
[HERO — midnight blue]
Entry: window pane iris — site reveals from tiny centered rectangle
depth-0: mesh gradient (dark blue/purple)
depth-1: subtle grid pattern (CSS, not PNG) with opacity 0.15
depth-2: floating abstract geometric shapes (low opacity)
depth-3: dashboard screenshot PNG (float-loop subtle)
depth-4: headline with CYLINDER ROTATION on "10x"
"Make your analytics 10x smarter"
depth-5: small glow dots/particles
[FEATURE ZOOM — pinned section, 300vh scroll distance]
DJI-style: Dashboard screenshot starts small, expands to full viewport
Scrub timeline reveals 3 features as user scrolls through pin:
- Feature 1: "Real-time insights" fades in left
- Feature 2: "AI-powered" fades in right
- Feature 3: "Zero setup" fades in center
Each feature: line clip wipe on description text
[HOW IT WORKS — top-down clip birth]
3-step process
Each step: multi-directional stagger (step 1 from left, step 2 from top, step 3 from right)
Numbered steps with variable font weight animation
[INTEGRATIONS — horizontal scroll]
Pin section, logos scroll horizontally
Speed reactive marquee for "works with everything you use"
[PRICING — curtain panel roll-up]
3 pricing tiers as curtain panels
Free → Pro → Enterprise reveals one by one
Each reveal: scramble text on price number
[CTA — circle iris]
Dark background
Bleed typography: "START FREE TODAY"
Magnetic button (cursor-attracted)
```
---
## Example 3: Creative Portfolio {#portfolio}
**Brief:** Designer/developer portfolio. Bold, experimental, Awwwards-worthy. The work is the hero.
**Techniques Used:**
- Offset diagonal layout for name/title
- Theatrical enter+exit for all section content
- Horizontal scroll for project showcase
- GSAP Flip cross-section for project previews
- Scroll-speed reactive marquee for skills
- Bleed typography throughout
- Diagonal wipe births
- Cursor spotlight
**Section Architecture:**
```
[INTRO — stark black]
NO loading screen — shock with immediate bold text
depth-0: pure black (#000)
depth-4: MASSIVE bleed title — name in 180px+ font
offset diagonal layout:
Line 1: "ALEX" — top-left, x: 5%
Line 2: "MORENO" — lower-right, x: 40%
Line 3: "Designer" — far right, smaller, italic
Cursor spotlight effect follows mouse
CTA: "See Work ↓" — subtle, bottom-right
[MARQUEE DIVIDER]
Scroll-speed reactive marquee:
"AVAILABLE FOR WORK · BASED IN LONDON · OPEN TO REMOTE ·"
Speed up when user scrolls fast
[PROJECTS — horizontal scroll, 4 projects]
Pinned container, horizontal scroll
Each panel: full-bleed project image
project title via line clip wipe
brief description via theatrical enter
On hover: project image scale(1.03), cursor becomes "View →"
Between projects: diagonal wipe transition
[ABOUT — section peel]
Upper section peels away to reveal about section
depth-3: portrait photo (clip-path circle iris, expands to full)
depth-4: about text — curtain line reveal
Skills: variable font wave animation
[PROCESS — pinned scrub timeline]
3 process stages animate through scroll:
Each stage: top-down clip birth reveals content
Numbers: character cylinder rotation
[CONTACT — minimal]
Circle iris expand
Email address: scramble text effect on hover
Social links: skew + bounce on scroll in
```
---
## Example 4: Gaming Website {#gaming}
**Brief:** Game launch page. Dark, cinematic, intense. Character reveals, environment depth.
**Techniques Used:**
- Curved path travel (character moves across page)
- Perspective zoom fly-through (fly into the game world)
- Full layered parallax (6 levels deep)
- SVG morph borders (organic landscape edges)
- Cascading card stacks (character select)
- Word-by-word scroll lighting (lore text)
- Particle trails (cursor leaves sparks)
- Multiple floating loops (atmospheric)
**Section Architecture:**
```
[LOADING SCREEN — game-style]
Loading bar fills
Logo does cylinder rotation
Splits open with curtain top/bottom
[HERO — extreme depth parallax]
depth-0: distant mountains/sky PNG (very slow, heavily blurred)
depth-1: mid-distance fog layer (slightly blurred, mix-blend: screen)
depth-2: closer terrain elements (decorative)
depth-3: CHARACTER PNG — hero character (main float-loop)
depth-4: game title — "SHADOWREALM" (split converge from sides)
depth-5: foreground particles — embers/sparks (fast float)
Cursor: particle trail (sparks follow cursor)
[FLY-THROUGH — perspective zoom, 300vh]
Pinned section
Camera appears to fly INTO the game world
Background rushes toward viewer (scale 0.3 → 1.4)
Character appears from far (scale 0.05 → 1)
Title resolves via scramble text
[LORE — word scroll lighting, pinned 400vh]
Dark section, long block of atmospheric text
Words light up as user scrolls
Atmospheric background particles drift slowly
Character silhouette visible at depth-1 (very faint)
[CHARACTERS — cascading card stack, 4 characters]
Each card: character art full-bleed
Character name: cylinder rotation
Class/description: line clip wipe
Stats: stagger animate (bars fill on enter)
Each card buried: scale(0.88), blur, pushed back
[WORLD MAP — horizontal scroll]
5 zones scroll horizontally
Zone titles: offset diagonal layout
Environment art at different parallax speeds
[PRE-ORDER — window pane iris]
Iris opens revealing pre-order section
Bleed typography: "ENTER THE REALM"
Magnetic CTA button
```
---
## Example 5: Luxury Product E-Commerce {#ecommerce}
**Brief:** High-end watch/jewelry brand. Understated elegance. Every animation whispers, not shouts. The product is the hero.
**Techniques Used:**
- DJI-style scale-in (product fills viewport, slowly)
- GSAP Flip (watch travels from hero to detail view)
- Section peel reveal (product details peel open)
- Masked line curtain reveal (all body text)
- Clip-path section birth (materials section)
- Floating product between sections
- Subtle parallax (depth factors halved for elegance)
- Bleed typography (collection names)
**Section Architecture:**
```
[HERO — pure white or cream]
No loading screen — immediate elegance
depth-0: pure white / soft cream gradient
depth-1: VERY subtle warm glow (opacity 0.2 only)
depth-2: minimal geometric line decoration (thin, opacity 0.3)
depth-3: WATCH PNG — centered, generous space, slow float (14s loop, tiny movement)
depth-4: brand name — thin weight, large tracking
"Est. 1887" — tiny, centered below
Parallax factors reduced: depth-3 factor = 0.3 (elegant, not dramatic)
[PRODUCT TRANSITION — GSAP Flip]
Watch morphs from hero center to detail view (left side)
Detail text reveals via masked line curtain (right side)
Flip duration: 1.4s (luxury = slow, unhurried)
[MATERIALS — clip-path section birth]
Cream/beige section
Product rises up through the section boundary
Material close-ups: stagger fade in from bottom (gentle)
Text: curtain line reveal (one line at a time, 0.2s stagger)
[CRAFTSMANSHIP — top-down clip birth, then peel]
Section drops from top (elegant, not dramatic)
Video/image of watchmaker — DJI scale-in at reduced intensity
Text: word-by-word scroll lighting (VERY slow, meditative)
[COLLECTION — section peel + horizontal scroll]
Peel reveals horizontal scroll gallery
4 watch variants scroll horizontally
Each: full-bleed product + minimal text (clip wipe)
[PURCHASE — circle iris (small, elegant)]
Circle opens from center, but slowly (2s duration)
Minimal layout: price, materials, add to cart
CTA: subtle skew + bounce (barely perceptible)
Trust signals: line-by-line curtain reveal
```
---
## Combining Patterns — Quick Reference
These combinations appear most often across successful premium sites:
**The "Product Hero" Combination:**
Floating product between sections + Top-down clip birth + Split converge title + Word scroll lighting
**The "Cinematic Chapter" Combination:**
Pinned sticky + Scrub timeline + Curtain panel roll-up + Theatrical enter/exit
**The "Tech Premium" Combination:**
Window pane iris + DJI scale-in + Line clip wipe + Cylinder rotation
**The "Editorial" Combination:**
Bleed typography + Offset diagonal + Horizontal scroll + Diagonal wipe
**The "Minimal Luxury" Combination:**
GSAP Flip + Section peel + Masked line curtain + Reduced parallax factors
FILE:references/inter-section-effects.md
# Inter-Section Effects Reference
These are the most premium techniques — effects where elements **persist, travel, or transition between sections**, creating a seamless narrative thread across the entire page.
## Table of Contents
1. [Floating Product Between Sections](#floating-product)
2. [GSAP Flip Cross-Section Morph](#flip-morph)
3. [Clip-Path Section Birth (Product Grows from Border)](#clip-birth)
4. [DJI-Style Scale-In Pin](#dji-scale)
5. [Element Curved Path Travel](#curved-path)
6. [Section Peel Reveal](#section-peel)
---
## Technique 1: Floating Product Between Sections {#floating-product}
This is THE signature technique for product brands. A product image (juice bottle, phone, sneaker) starts inside the hero section. As you scroll, it appears to "rise up" through the section boundary and hover between two differently colored sections, owned by neither. As you keep scrolling, it descends into the next section.
**The Visual Story:**
- Hero section: product sitting naturally inside
- Mid-scroll: product "floating" in space, section colors visible above and below it
- Continue scroll: product becomes part of the next section
```css
/* The product is positioned in a sticky wrapper */
.inter-section-product-wrapper {
/* This wrapper spans BOTH sections */
position: relative;
z-index: 100;
pointer-events: none;
height: 0; /* no height — just a position anchor */
}
.inter-section-product {
position: sticky;
top: 50vh; /* stick to vertical center of viewport */
transform: translateY(-50%); /* true center */
width: 100%;
display: flex;
justify-content: center;
pointer-events: none;
}
.inter-section-product img {
width: clamp(280px, 35vw, 560px);
/* The product will be exactly at the section boundary
when the page is scrolled to that point */
}
```
```javascript
function initFloatingProduct() {
const wrapper = document.querySelector('.inter-section-product-wrapper');
const productImg = wrapper.querySelector('img');
const heroSection = document.querySelector('.hero-section');
const nextSection = document.querySelector('.feature-section');
// Create a ScrollTrigger timeline for the product's journey
const tl = gsap.timeline({
scrollTrigger: {
trigger: heroSection,
start: 'bottom 80%', // starts rising as hero bottom approaches viewport
end: 'bottom 20%', // completes rise when hero fully exited
scrub: 1.5,
}
});
// Phase 1: Product rises up from hero (scale grows, shadow intensifies)
tl.fromTo(productImg,
{
y: 0,
scale: 0.85,
filter: 'drop-shadow(0 10px 20px rgba(0,0,0,0.2))',
},
{
y: '-8vh',
scale: 1.05,
filter: 'drop-shadow(0 40px 80px rgba(0,0,0,0.5))',
duration: 0.5,
}
);
// Phase 2: Product fully "between" sections — peak visibility
tl.to(productImg, {
y: '-5vh',
scale: 1.1,
duration: 0.3,
});
// Phase 3: Product descends into next section
ScrollTrigger.create({
trigger: nextSection,
start: 'top 60%',
end: 'top 20%',
scrub: 1.5,
onUpdate: (self) => {
gsap.to(productImg, {
y: `${self.progress * 8}vh`,
scale: 1.1 - (self.progress * 0.2),
duration: 0.1,
overwrite: true,
});
}
});
}
```
### Required HTML Structure
```html
<!-- SECTION 1: Hero (dark background) -->
<section class="hero-section" style="background: #0a0014; min-height: 100vh; position: relative; z-index: 1;">
<!-- depth layers 0-2 (bg, glow, decorations) -->
<!-- NO product image here — it's in the inter-section wrapper -->
<div class="layer depth-4">
<h1>Your Headline</h1>
<p>Hero subtext here</p>
</div>
</section>
<!-- THE FLOATING PRODUCT — outside both sections, between them -->
<div class="inter-section-product-wrapper">
<div class="inter-section-product">
<img
src="product.png"
alt="Product Name — floating between hero and features"
class="float-loop"
/>
</div>
</div>
<!-- SECTION 2: Features (lighter background) -->
<section class="feature-section" style="background: #f5f0ff; min-height: 100vh; position: relative; z-index: 2; padding-top: 15vh;">
<!-- Product appears to "land" into this section -->
<div class="feature-content">
<h2>Features Headline</h2>
</div>
</section>
```
---
## Technique 2: GSAP Flip Cross-Section Morph {#flip-morph}
The same DOM element appears to travel between completely different layout positions across sections. In the hero it's large and centered; in the feature section it's small and left-aligned; in the detail section it's full-width. One smooth morph connects them all.
```javascript
function initFlipMorphSections() {
gsap.registerPlugin(Flip);
// The product element exists in one place in the DOM
// but we have "ghost" placeholder positions in other sections
const product = document.querySelector('.traveling-product');
const positions = {
hero: document.querySelector('.product-position-hero'),
feature: document.querySelector('.product-position-feature'),
detail: document.querySelector('.product-position-detail'),
};
function morphToPosition(positionEl, options = {}) {
// Capture current state
const state = Flip.getState(product);
// Move element to new position
positionEl.appendChild(product);
// Animate from captured state to new position
Flip.from(state, {
duration: 0.9,
ease: 'power3.inOut',
...options
});
}
// Trigger morphs on scroll
ScrollTrigger.create({
trigger: '.feature-section',
start: 'top 60%',
onEnter: () => morphToPosition(positions.feature),
onLeaveBack: () => morphToPosition(positions.hero),
});
ScrollTrigger.create({
trigger: '.detail-section',
start: 'top 60%',
onEnter: () => morphToPosition(positions.detail),
onLeaveBack: () => morphToPosition(positions.feature),
});
}
```
### Ghost Position Placeholders HTML
```html
<!-- Hero section: large, centered position -->
<section class="hero-section">
<div class="product-position-hero" style="width: 500px; height: 500px; margin: 0 auto;">
<!-- Product starts here -->
<img class="traveling-product" src="product.png" alt="Product" style="width:100%;">
</div>
</section>
<!-- Feature section: medium, left-side position -->
<section class="feature-section">
<div class="feature-layout">
<div class="product-position-feature" style="width: 280px; height: 280px;">
<!-- Product morphs to here -->
</div>
<div class="feature-text">...</div>
</div>
</section>
```
---
## Technique 3: Clip-Path Section Birth (Product Grows from Border) {#clip-birth}
The product image starts completely hidden below the section's bottom border — clipped out of existence. As the user scrolls into the section boundary, the product "grows up" through the border like a plant emerging from soil. This is distinct from the floating product — here, the section itself is the stage.
```css
.birth-section {
position: relative;
overflow: hidden; /* hard clip at section border */
min-height: 100vh;
}
.birth-product {
position: absolute;
bottom: -20%; /* starts 20% below the section — invisible */
left: 50%;
transform: translateX(-50%);
width: clamp(300px, 40vw, 600px);
/* Will animate up through the section boundary */
}
```
```javascript
function initClipPathBirth(sectionEl, productEl) {
const tl = gsap.timeline({
scrollTrigger: {
trigger: sectionEl,
start: 'top 80%',
end: 'top 20%',
scrub: 1.2,
}
});
// Product rises from below section boundary
tl.fromTo(productEl,
{
y: '120%', // fully below section
scale: 0.7,
opacity: 0,
filter: 'blur(8px)'
},
{
y: '0%', // sits naturally in section
scale: 1,
opacity: 1,
filter: 'blur(0px)',
ease: 'power3.out',
duration: 1,
}
);
// Continue scroll → product rises further and becomes full height
// then disappears back below as section exits
ScrollTrigger.create({
trigger: sectionEl,
start: 'bottom 60%',
end: 'bottom top',
scrub: 1,
onUpdate: (self) => {
gsap.to(productEl, {
y: `${-self.progress * 50}%`,
opacity: 1 - self.progress,
scale: 1 + self.progress * 0.2,
duration: 0.1,
overwrite: true,
});
}
});
}
```
---
## Technique 4: DJI-Style Scale-In Pin {#dji-scale}
Made famous by DJI drone product pages. A section starts with a small, contained image. As the user scrolls, the image scales up to fill the entire viewport — THEN the section unpins and the next content reveals. Creates a "zoom into the world" feeling.
```javascript
function initDJIScaleIn(sectionEl) {
const heroMedia = sectionEl.querySelector('.dji-media');
const heroContent = sectionEl.querySelector('.dji-content');
const overlay = sectionEl.querySelector('.dji-overlay');
const tl = gsap.timeline({
scrollTrigger: {
trigger: sectionEl,
start: 'top top',
end: '+=300%',
pin: true,
scrub: 1.5,
}
});
// Stage 1: Small image scales up to fill viewport
tl.fromTo(heroMedia,
{
borderRadius: '20px',
scale: 0.3,
width: '60%',
left: '20%',
top: '20%',
},
{
borderRadius: '0px',
scale: 1,
width: '100%',
left: '0%',
top: '0%',
duration: 0.4,
ease: 'power2.inOut',
}
)
// Stage 2: Overlay fades in over the full-viewport image
.fromTo(overlay,
{ opacity: 0 },
{ opacity: 0.6, duration: 0.2 },
0.35
)
// Stage 3: Content text appears over the overlay
.from(heroContent.querySelectorAll('.dji-line'),
{
y: 40,
opacity: 0,
stagger: 0.08,
duration: 0.25,
},
0.45
);
return tl;
}
```
```css
.dji-section {
position: relative;
height: 100vh;
overflow: hidden;
}
.dji-media {
position: absolute;
height: 100%;
object-fit: cover;
/* Will be animated to full coverage */
}
.dji-overlay {
position: absolute;
inset: 0;
background: linear-gradient(to bottom, transparent, rgba(0,0,0,0.8));
opacity: 0;
}
.dji-content {
position: absolute;
bottom: 15%;
left: 8%;
right: 8%;
color: white;
}
```
---
## Technique 5: Element Curved Path Travel {#curved-path}
The most advanced technique. A product element travels along a smooth, curved Bezier path across the page as the user scrolls — arcing through space like it's floating or being thrown, rather than just translating in a straight line.
```html
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/MotionPathPlugin.min.js"></script>
```
```javascript
function initCurvedPathTravel(productEl) {
gsap.registerPlugin(MotionPathPlugin);
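// Define the curved path as x/y waypoints:
// transform offsets from the product's normal layout position
const path = [
{ x: 0, y: 0 }, // Start: hero center
{ x: -200, y: -100 }, // Arc left and up
{ x: 100, y: -300 }, // Continue arcing
{ x: 300, y: -150 }, // Swing right
{ x: 200, y: 50 }, // Land into feature section
];
// Maps tween progress (0 to 1) onto evenly spaced scale stops
const scaleAt = gsap.utils.interpolate([0.8, 1.1, 0.9, 1.0, 1.2]);
gsap.to(productEl, {
motionPath: {
path: path,
curviness: 1.4, // How curvy (0 = straight lines, 2 = very curved)
autoRotate: false, // Don't rotate along path (keep product upright)
},
// A function given as a tween value is evaluated once at start,
// so apply the scale stops on every update instead
onUpdate() {
gsap.set(productEl, { scale: scaleAt(this.progress()) });
},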
ease: 'none',
scrollTrigger: {
trigger: '.journey-container',
start: 'top top',
end: '+=400%',
pin: true,
scrub: 1.5,
}
});
}
```
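The five scale stops in the block above are meant to be spread evenly across the scroll journey. A sketch of that mapping in plain JavaScript, assuming evenly spaced stops with linear blending between neighbours; `valueAtProgress` is a hypothetical name used here for illustration:

```javascript
// Piecewise-linear interpolation across evenly spaced stops.
// `progress` is clamped to the 0..1 range.
function valueAtProgress(stops, progress) {
  const p = Math.min(1, Math.max(0, progress));
  const scaled = p * (stops.length - 1);
  const i = Math.min(Math.floor(scaled), stops.length - 2);
  const t = scaled - i;
  return stops[i] + (stops[i + 1] - stops[i]) * t;
}

const scaleStops = [0.8, 1.1, 0.9, 1.0, 1.2];
console.log(valueAtProgress(scaleStops, 0.5)); // lands on the middle stop
```

With five stops, each waypoint-to-waypoint leg of the journey covers a quarter of the scroll distance, so the product is at its smallest mid-path scale (0.9) exactly halfway through the pin.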
---
## Technique 6: Section Peel Reveal {#section-peel}
The section below is revealed as the section above peels away, like turning a page. Uses `position: sticky` with `bottom: 0` so the lower section sticks to the bottom of the viewport while the upper section scrolls away.
```css
.peel-upper {
position: relative;
z-index: 2;
min-height: 100vh;
/* This section scrolls away normally */
}
.peel-lower {
position: sticky;
bottom: 0; /* sticks to BOTTOM of viewport */
z-index: 1;
min-height: 100vh;
/* This section waits at the bottom as upper section peels away */
}
/* Container wraps both */
.peel-container {
position: relative;
}
```
```javascript
function initSectionPeel() {
const upper = document.querySelector('.peel-upper');
const lower = document.querySelector('.peel-lower');
// As upper section scrolls, reveal lower by reducing clip
gsap.fromTo(upper,
{ clipPath: 'inset(0 0 0 0)' },
{
clipPath: 'inset(0 0 100% 0)', // upper peels up and away
ease: 'none',
scrollTrigger: {
trigger: '.peel-container',
start: 'top top',
end: 'center top',
scrub: true,
}
}
);
// Lower section content animates in as it's revealed
gsap.from(lower.querySelectorAll('.peel-content > *'), {
y: 30,
opacity: 0,
stagger: 0.1,
duration: 0.6,
scrollTrigger: {
trigger: '.peel-container',
start: '30% top',
toggleActions: 'play none none reverse',
}
});
}
```
---
## Choosing the Right Inter-Section Technique
| Situation | Best Technique |
|-----------|---------------|
| Brand/product site with hero image | Floating Product Between Sections |
| Product appears in multiple contexts | GSAP Flip Cross-Section Morph |
| Product "rises" from section boundary | Clip-Path Section Birth |
| Cinematic "enter the world" feeling | DJI-Style Scale-In Pin |
| Product travels a journey narrative | Curved Path Travel |
| Elegant section-to-section transition | Section Peel Reveal |
| Dark → light section transition | Floating Product (section backgrounds change beneath) |
FILE:references/motion-system.md
# Motion System Reference
## Table of Contents
1. [GSAP Setup & CDN](#gsap-setup)
2. [Pattern 1: Multi-Layer Parallax](#pattern-1)
3. [Pattern 2: Pinned Sticky Sections](#pattern-2)
4. [Pattern 3: Cascading Card Stack](#pattern-3)
5. [Pattern 4: Scrub Timeline](#pattern-4)
6. [Pattern 5: Clip-Path Wipe Reveals](#pattern-5)
7. [Pattern 6: Horizontal Scroll Conversion](#pattern-6)
8. [Pattern 7: Perspective Zoom Fly-Through](#pattern-7)
9. [Pattern 8: Snap-to-Section](#pattern-8)
10. [Lenis Smooth Scroll](#lenis)
11. [IntersectionObserver Activation](#intersection-observer)
---
## GSAP Setup & CDN {#gsap-setup}
Always load from jsDelivr CDN:
```html
<!-- Core GSAP -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/gsap.min.js"></script>
<!-- ScrollTrigger plugin — required for all scroll patterns -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/ScrollTrigger.min.js"></script>
<!-- ScrollSmoother — optional, pairs with ScrollTrigger -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/ScrollSmoother.min.js"></script>
<!-- Flip plugin — for cross-section element morphing -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/Flip.min.js"></script>
<!-- MotionPathPlugin — for curved element paths -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/MotionPathPlugin.min.js"></script>
<script>
// Always register plugins immediately
gsap.registerPlugin(ScrollTrigger, Flip, MotionPathPlugin);
// Respect prefers-reduced-motion
const prefersReduced = window.matchMedia('(prefers-reduced-motion: reduce)').matches;
if (prefersReduced) {
gsap.globalTimeline.timeScale(0); // Freeze all animations
}
</script>
```
---
## Pattern 1: Multi-Layer Parallax {#pattern-1}
The foundation of all 2.5D depth. Different layers scroll at different speeds.
```javascript
function initParallax() {
const layers = document.querySelectorAll('[data-depth]');
const depthFactors = {
'0': 0.10, '1': 0.25, '2': 0.50,
'3': 0.80, '4': 1.00, '5': 1.20
};
layers.forEach(layer => {
const depth = layer.dataset.depth;
const factor = depthFactors[depth] || 1.0;
gsap.to(layer, {
yPercent: -15 * factor, // adjust multiplier for desired effect intensity
ease: 'none',
scrollTrigger: {
trigger: layer.closest('.scene'),
start: 'top bottom',
end: 'bottom top',
scrub: true, // 1:1 scroll-to-animation
}
});
});
}
```
**When to use:** Every project. This is always on.
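The travel of each layer is just the intensity multiplier times its depth factor. A short sketch of that calculation, useful for sanity-checking values before wiring them into GSAP; `parallaxTravel` and `DEPTH_FACTORS` are hypothetical names mirroring the function above:

```javascript
const DEPTH_FACTORS = { 0: 0.10, 1: 0.25, 2: 0.50, 3: 0.80, 4: 1.00, 5: 1.20 };

// Hypothetical helper: total yPercent travel for a layer at a given depth.
// `intensity` mirrors the -15 multiplier above; unknown depths fall back to 1.0.
function parallaxTravel(depth, intensity = 15) {
  const factor = DEPTH_FACTORS[depth] ?? 1.0;
  return -intensity * factor;
}

for (const depth of [0, 3, 5]) {
  console.log(`depth-${depth}: ${parallaxTravel(depth).toFixed(1)}%`);
}
```

Passing a smaller `intensity` scales every layer down together, which keeps the relative depth ordering intact while making the overall effect calmer.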
---
## Pattern 2: Pinned Sticky Sections {#pattern-2}
A section stays fixed while its content animates. Other sections slide over/under it. The "window over window" effect.
```javascript
function initPinnedSection(sceneEl) {
// The section stays pinned for the scroll distance set by `end`
// while inner content animates on a scrubbed timeline
const tl = gsap.timeline({
scrollTrigger: {
trigger: sceneEl,
start: 'top top',
end: '+=150%', // stay pinned for 1.5x viewport of scroll
pin: true, // THIS is what pins the section
scrub: 1, // 1 second smoothing
anticipatePin: 1, // prevents jump on pin
}
});
// Inner content animations while pinned
// These play out over the scroll distance
tl.from('.pinned-title', { opacity: 0, y: 60, duration: 0.3 })
.from('.pinned-image', { scale: 0.8, opacity: 0, duration: 0.4 })
.to('.pinned-bg', { backgroundColor: '#1a0a2e', duration: 0.3 })
.from('.pinned-sub', { opacity: 0, x: -40, duration: 0.3 });
return tl;
}
```
**Visual result:** Section feels like a chapter — the page "lives inside it" for a while, then moves on.
---
## Pattern 3: Cascading Card Stack {#pattern-3}
New sections slide over previous ones. Each buried section scales down and darkens, feeling like it's receding.
```css
/* CSS Setup */
.card-stack-section {
position: sticky;
top: 0;
height: 100vh;
/* Each subsequent section has higher z-index */
}
.card-stack-section:nth-child(1) { z-index: 1; }
.card-stack-section:nth-child(2) { z-index: 2; }
.card-stack-section:nth-child(3) { z-index: 3; }
.card-stack-section:nth-child(4) { z-index: 4; }
```
```javascript
function initCardStack() {
const cards = gsap.utils.toArray('.card-stack-section');
cards.forEach((card, i) => {
// Each card (except last) gets buried as next one enters
if (i < cards.length - 1) {
gsap.to(card, {
scale: 0.88,
filter: 'brightness(0.5) blur(3px)',
borderRadius: '20px',
ease: 'none',
scrollTrigger: {
trigger: cards[i + 1], // fires when NEXT card enters
start: 'top bottom',
end: 'top top',
scrub: true,
}
});
}
});
}
```
---
## Pattern 4: Scrub Timeline {#pattern-4}
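The most powerful pattern. Timeline progress is tied directly to scroll progress through the pinned distance, so scrolling back rewinds the animation. A numeric `scrub` adds a short catch-up lag; `scrub: true` tracks the scrollbar exactly.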
```javascript
function initScrubTimeline(sceneEl) {
const tl = gsap.timeline({
scrollTrigger: {
trigger: sceneEl,
start: 'top top',
end: '+=200%',
pin: true,
scrub: 1.5, // 1.5s lag for smooth, dreamy feel (use true for precise 1:1)
}
});
// Sequences play out as user scrolls
// 0.0 to 0.25 → first 25% of scroll
tl.fromTo('.hero-product',
{ scale: 0.6, opacity: 0, y: 100 },
{ scale: 1, opacity: 1, y: 0, duration: 0.25 }
)
// 0.25 to 0.5 → second quarter
.to('.hero-title span:first-child', {
x: '-30vw', opacity: 0, duration: 0.25
}, 0.25)
.to('.hero-title span:last-child', {
x: '30vw', opacity: 0, duration: 0.25
}, 0.25)
// 0.5 to 0.75 → third quarter
.to('.hero-product', {
scale: 1.3, y: -50, duration: 0.25
}, 0.5)
.fromTo('.next-section-content',
{ opacity: 0, y: 80 },
{ opacity: 1, y: 0, duration: 0.25 },
0.5
)
// 0.75 to 1.0 → final quarter
.to('.hero-product', {
opacity: 0, scale: 1.6, duration: 0.25
}, 0.75);
return tl;
}
```
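Since the timeline above is one unit long and split into quarters, it helps to know which quarter a given scroll offset lands in. A minimal sketch of that mapping, assuming `'+=200%'` resolves to twice the viewport height; `scrubPosition` is a hypothetical helper, not a ScrollTrigger API:

```javascript
// Hypothetical helper: which part of the scrubbed timeline is showing after
// `scrolledPx` pixels of scroll past the trigger's start.
function scrubPosition(scrolledPx, viewportHeight, endPercent = 200) {
  const distance = viewportHeight * endPercent / 100; // '+=200%' of the viewport
  const progress = Math.min(1, Math.max(0, scrolledPx / distance));
  const quarter = Math.min(4, Math.floor(progress * 4) + 1);
  return { progress, quarter };
}

// On a 900px-tall viewport the pin lasts 1800px; 1000px in is the third quarter
console.log(scrubPosition(1000, 900));
```

This is the position the animation settles on. With `scrub: 1.5` the visible state trails the scrollbar briefly before catching up.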
---
## Pattern 5: Clip-Path Wipe Reveals {#pattern-5}
Content is hidden behind a clip-path mask that animates away to reveal it. Animating `clip-path` does not trigger layout, so it usually stays smooth even on large elements.
```javascript
// Left-to-right horizontal wipe
function initHorizontalWipe(el) {
gsap.fromTo(el,
{ clipPath: 'inset(0 100% 0 0)' },
{
clipPath: 'inset(0 0% 0 0)',
duration: 1.2,
ease: 'power3.out',
scrollTrigger: { trigger: el, start: 'top 80%' }
}
);
}
// Top-to-bottom drop reveal
function initTopDropReveal(el) {
gsap.fromTo(el,
{ clipPath: 'inset(0 0 100% 0)' },
{
clipPath: 'inset(0 0 0% 0)',
duration: 1.0,
ease: 'power2.out',
scrollTrigger: { trigger: el, start: 'top 75%' }
}
);
}
// Circle iris expand
function initCircleIris(el) {
gsap.fromTo(el,
{ clipPath: 'circle(0% at 50% 50%)' },
{
clipPath: 'circle(75% at 50% 50%)',
duration: 1.4,
ease: 'power2.inOut',
scrollTrigger: { trigger: el, start: 'top 60%' }
}
);
}
// Window pane iris (tiny box expands to full)
function initWindowPaneIris(sceneEl) {
gsap.fromTo(sceneEl,
{ clipPath: 'inset(45% 30% 45% 30% round 8px)' },
{
clipPath: 'inset(0% 0% 0% 0% round 0px)',
ease: 'none',
scrollTrigger: {
trigger: sceneEl,
start: 'top 80%',
end: 'top 20%',
scrub: 1,
}
}
);
}
```
---
## Pattern 6: Horizontal Scroll Conversion {#pattern-6}
Vertical scrolling drives horizontal movement through panels. Classic premium technique.
```javascript
function initHorizontalScroll(containerEl) {
const panels = gsap.utils.toArray('.h-panel', containerEl);
gsap.to(panels, {
xPercent: -100 * (panels.length - 1),
ease: 'none',
scrollTrigger: {
trigger: containerEl,
pin: true,
scrub: 1,
end: () => `+=${containerEl.offsetWidth * (panels.length - 1)}`,
snap: 1 / (panels.length - 1), // auto-snap to each panel
}
});
}
```
```css
.h-scroll-container {
display: flex;
width: calc(300vw); /* 3 panels × 100vw */
height: 100vh;
overflow: hidden;
}
.h-panel {
width: 100vw;
height: 100vh;
flex-shrink: 0;
}
```
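The three numbers that drive the horizontal conversion (the `xPercent` target, the scroll distance, and the snap increment) all follow from the panel count. The sketch below computes them in isolation; `horizontalScrollMetrics` and its `widthPerStep` parameter are hypothetical names, and `widthPerStep` stands for whichever width you multiply by (the snippet above uses `containerEl.offsetWidth`).

```javascript
// Hypothetical helper: derives the values used by the horizontal-scroll tween.
function horizontalScrollMetrics(panelCount, widthPerStep) {
  if (panelCount < 2) {
    throw new RangeError('Need at least two panels to scroll horizontally');
  }
  const steps = panelCount - 1;            // number of panel-to-panel moves
  return {
    xPercent: -100 * steps,                // each panel shifts left by `steps` of its own widths
    scrollDistance: widthPerStep * steps,  // pixels of vertical scroll consumed while pinned
    snapIncrement: 1 / steps,              // progress interval between snap points
  };
}

console.log(horizontalScrollMetrics(3, 1440));
// { xPercent: -200, scrollDistance: 2880, snapIncrement: 0.5 }
```

With three panels the snap points fall at progress 0, 0.5, and 1, one per panel.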
---
## Pattern 7: Perspective Zoom Fly-Through {#pattern-7}
User appears to fly toward content. Combines scale, Z-axis, and opacity on a scrubbed pin.
```javascript
function initPerspectiveZoom(sceneEl) {
const tl = gsap.timeline({
scrollTrigger: {
trigger: sceneEl,
start: 'top top',
end: '+=300%',
pin: true,
scrub: 2,
}
});
// Background "rushes toward" viewer
tl.fromTo('.zoom-bg',
{ scale: 0.4, filter: 'blur(20px)', opacity: 0.3 },
{ scale: 1.2, filter: 'blur(0px)', opacity: 1, duration: 0.6 }
)
// Product appears from far
.fromTo('.zoom-product',
{ scale: 0.1, z: -2000, opacity: 0 },
{ scale: 1, z: 0, opacity: 1, duration: 0.5, ease: 'power2.out' },
0.2
)
// Text fades in after product arrives
.fromTo('.zoom-title',
{ opacity: 0, letterSpacing: '2em' },
{ opacity: 1, letterSpacing: '0.05em', duration: 0.3 },
0.55
);
}
```
```css
.zoom-scene {
perspective: 1200px;
perspective-origin: 50% 50%;
transform-style: preserve-3d;
overflow: hidden;
}
```
---
## Pattern 8: Snap-to-Section {#pattern-8}
Full-page scroll snapping between sections — creates a chapter-like book feeling.
```javascript
// Using GSAP Observer for smooth snapping
function initSectionSnap() {
// Register Observer plugin
gsap.registerPlugin(Observer);
const sections = gsap.utils.toArray('.snap-section');
let currentIndex = 0;
let animating = false;
function goTo(index) {
if (animating || index === currentIndex) return;
animating = true;
const direction = index > currentIndex ? 1 : -1;
const current = sections[currentIndex];
const next = sections[index];
const tl = gsap.timeline({
onComplete: () => {
currentIndex = index;
animating = false;
}
});
// Current section exits upward
tl.to(current, {
yPercent: -100 * direction,
opacity: 0,
duration: 0.8,
ease: 'power2.inOut'
})
// Next section enters from below/above
.fromTo(next,
{ yPercent: 100 * direction, opacity: 0 },
{ yPercent: 0, opacity: 1, duration: 0.8, ease: 'power2.inOut' },
0
);
}
Observer.create({
type: 'wheel,touch',
onDown: () => goTo(Math.min(currentIndex + 1, sections.length - 1)),
onUp: () => goTo(Math.max(currentIndex - 1, 0)),
tolerance: 100,
preventDefault: true,
});
}
```
---
## Lenis Smooth Scroll {#lenis}
Lenis replaces native browser scroll with silky-smooth physics-based scrolling. Always pair with GSAP ScrollTrigger.
```html
<script src="https://cdn.jsdelivr.net/npm/@studio-freight/[email protected]/dist/lenis.min.js"></script>
```
```javascript
function initLenis() {
const lenis = new Lenis({
duration: 1.2,
easing: (t) => Math.min(1, 1.001 - Math.pow(2, -10 * t)),
orientation: 'vertical',
smoothWheel: true,
});
// CRITICAL: Connect Lenis to GSAP ticker
lenis.on('scroll', ScrollTrigger.update);
gsap.ticker.add((time) => lenis.raf(time * 1000));
gsap.ticker.lagSmoothing(0);
return lenis;
}
```
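To get a feel for the `easing` option passed to Lenis above, the function can be pulled out and sampled on its own. This sketch only evaluates the same expression at a few points; the name `easeOutExpo` is a descriptive label chosen here, not something Lenis exports.

```javascript
// Same expression as the `easing` option above, extracted for inspection.
const easeOutExpo = (t) => Math.min(1, 1.001 - Math.pow(2, -10 * t));

for (const t of [0, 0.25, 0.5, 1]) {
  console.log(t, easeOutExpo(t).toFixed(3));
}
// 0 0.001
// 0.25 0.824
// 0.5 0.970
// 1 1.000
```

The curve covers most of the distance in the first quarter of the duration and then settles slowly, which is what gives the scroll its soft landing.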
---
## IntersectionObserver Activation {#intersection-observer}
Only animate elements that are currently visible. Critical for performance.
```javascript
function initRevealObserver() {
const observer = new IntersectionObserver((entries) => {
entries.forEach(entry => {
if (entry.isIntersecting) {
entry.target.classList.add('is-visible');
// Trigger GSAP animation
const animType = entry.target.dataset.animate;
if (animType) triggerAnimation(entry.target, animType);
// Stop observing after first trigger
observer.unobserve(entry.target);
}
});
}, {
threshold: 0.15,
rootMargin: '0px 0px -50px 0px'
});
document.querySelectorAll('[data-animate]').forEach(el => observer.observe(el));
}
function triggerAnimation(el, type) {
const animations = {
'fade-up': () => gsap.from(el, { y: 60, opacity: 0, duration: 0.8, ease: 'power3.out' }),
'fade-in': () => gsap.from(el, { opacity: 0, duration: 1.0, ease: 'power2.out' }),
'scale-in': () => gsap.from(el, { scale: 0.8, opacity: 0, duration: 0.7, ease: 'back.out(1.7)' }),
'slide-left': () => gsap.from(el, { x: -80, opacity: 0, duration: 0.8, ease: 'power3.out' }),
'slide-right':() => gsap.from(el, { x: 80, opacity: 0, duration: 0.8, ease: 'power3.out' }),
'converge': () => initSplitConverge(el), // See text-animations.md
};
animations[type]?.();
}
```
---
## Pattern 9: Elastic Drop with Impact Shake {#elastic-drop}
An element falls from above with an elastic overshoot, then a rapid
micro-rotation shake fires on landing — simulating physical weight and impact.
```javascript
function initElasticDrop(productEl, wrapperEl) {
const tl = gsap.timeline({ delay: 0.3 });
// Phase 1: element drops with elastic bounce
tl.from(productEl, {
y: -180,
opacity: 0,
scale: 1.1,
duration: 1.3,
ease: 'elastic.out(1, 0.65)',
})
// Phase 2: shake fires just as the elastic settles
// Apply to the WRAPPER not the element — avoids transform conflicts
.to(wrapperEl, {
keyframes: [
{ rotation: -2, duration: 0.08 },
{ rotation: 2, duration: 0.08 },
{ rotation: -1.5, duration: 0.07 },
{ rotation: 1, duration: 0.07 },
{ rotation: 0, duration: 0.10 },
],
ease: 'power1.inOut',
}, '-=0.35');
return tl;
}
```
```html
<!-- Wrapper and product must be separate elements -->
<div class="drop-wrapper" id="dropWrapper">
<img class="drop-product" id="dropProduct" src="product.png" alt="..." />
</div>
```
Ease variants:
- `elastic.out(1, 0.65)` — standard product, moderate bounce
- `elastic.out(1.2, 0.5)` — heavier object, more overshoot
- `elastic.out(0.8, 0.8)` — lighter, quicker settle
- `back.out(2.5)` — no oscillation, one clean overshoot
Do NOT use for: gentle floaters, airy elements (flowers, feathers) — use `power3.out` instead.
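The `'-=0.35'` offset in `initElasticDrop` decides when the shake starts relative to the drop. The arithmetic below is a rough sketch using the values from that function; it measures from the start of the timeline and leaves out the timeline's own `delay: 0.3`.

```javascript
// Timing values copied from initElasticDrop above.
const dropDuration = 1.3;                          // Phase 1 tween
const overlap = 0.35;                              // the '-=0.35' position offset
const shakeSteps = [0.08, 0.08, 0.07, 0.07, 0.10]; // keyframe durations

const shakeDuration = shakeSteps.reduce((sum, d) => sum + d, 0);
const shakeStart = dropDuration - overlap;         // seconds after the timeline starts
const shakeEnd = shakeStart + shakeDuration;

console.log(shakeStart.toFixed(2));    // 0.95
console.log(shakeDuration.toFixed(2)); // 0.40
console.log(shakeEnd.toFixed(2));      // 1.35
```

So the shake begins while the elastic ease is still settling and finishes slightly after the drop tween ends. If you change the drop duration or the keyframes, adjust the overlap to keep that relationship.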
FILE:references/performance.md
# Performance Reference
## The Golden Rule
**Only animate properties that the browser can handle on the GPU compositor thread:**
```
✅ SAFE (compositor-friendly): transform, opacity
⚠️ USUALLY OK (no layout, but may repaint): filter, clip-path
❌ AVOID (triggers layout): width, height, top, left, right, bottom, margin, padding,
font-size, border-width
❌ AVOID (triggers repaint): background-size
```
Animating layout properties forces the browser to recalculate layout (reflow) on every frame, which is a common cause of jank.
---
## requestAnimationFrame Pattern
Never put animation logic directly in event listeners. Always batch through rAF:
```javascript
let rafId = null;
let pendingScrollY = 0;
function onScroll() {
pendingScrollY = window.scrollY;
if (!rafId) {
rafId = requestAnimationFrame(processScroll);
}
}
function processScroll() {
rafId = null;
document.documentElement.style.setProperty('--scroll-y', pendingScrollY);
// update other values...
}
window.addEventListener('scroll', onScroll, { passive: true });
// passive: true is CRITICAL — tells browser scroll handler won't preventDefault
// allows browser to scroll on a separate thread
```
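The same batching idea can be wrapped in a small reusable function. The version below is a sketch, not a library API: `createFrameBatcher` is a hypothetical name, and its second parameter lets a fake scheduler stand in for `requestAnimationFrame` so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: coalesces many calls into one callback per frame.
// `schedule` defaults to requestAnimationFrame in the browser; a manual
// scheduler can be injected for testing.
function createFrameBatcher(callback, schedule = (fn) => requestAnimationFrame(fn)) {
  let scheduled = false;
  let latestValue;
  return function push(value) {
    latestValue = value;      // keep only the newest value
    if (scheduled) return;    // a frame is already pending
    scheduled = true;
    schedule(() => {
      scheduled = false;
      callback(latestValue);
    });
  };
}

// Demo with a manual queue standing in for requestAnimationFrame.
const queue = [];
const seen = [];
const push = createFrameBatcher((y) => seen.push(y), (fn) => queue.push(fn));

push(10); push(20); push(30); // three "scroll events" before the next frame
queue.shift()();              // run the single pending frame
console.log(seen);            // [ 30 ]
```

In a page you would call `push(window.scrollY)` from a passive scroll listener and do the DOM write inside the callback.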
---
## will-change Usage Rules
`will-change` promotes an element to its own GPU layer. Powerful but dangerous if overused.
```css
/* DO: Only apply when animation is about to start */
.element-about-to-animate {
will-change: transform, opacity;
}
/* DO: Remove after animation completes (from JavaScript, not CSS):
   element.addEventListener('animationend', () => {
     element.style.willChange = 'auto';
   });
*/
/* DON'T: Apply globally */
* { will-change: transform; } /* WRONG — massive GPU memory usage */
/* DON'T: Apply statically on all animated elements */
.animated-thing { will-change: transform; } /* Wrong if there are many of these */
```
### GSAP handles this automatically
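GSAP's default `force3D: "auto"` setting uses 3D transforms while a tween is running, which promotes the element to its own layer, and switches back to 2D transforms when the tween ends. If using GSAP, you generally don't need to manage `will-change` yourself.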
---
## IntersectionObserver Pattern
Never animate all elements all the time. Only animate what's currently visible.
```javascript
class AnimationManager {
constructor() {
this.activeAnimations = new Set();
this.observer = new IntersectionObserver(
this.handleIntersection.bind(this),
{ threshold: 0.1, rootMargin: '50px 0px' }
);
}
observe(el) {
this.observer.observe(el);
}
handleIntersection(entries) {
entries.forEach(entry => {
if (entry.isIntersecting) {
this.activateElement(entry.target);
} else {
this.deactivateElement(entry.target);
}
});
}
activateElement(el) {
// Start GSAP animation / add floating class
el.classList.add('animate-active');
this.activeAnimations.add(el);
}
deactivateElement(el) {
// Pause or stop animation
el.classList.remove('animate-active');
this.activeAnimations.delete(el);
}
}
const animManager = new AnimationManager();
document.querySelectorAll('.animated-layer').forEach(el => animManager.observe(el));
```
---
## content-visibility: auto
For pages with many off-screen sections, this dramatically improves initial load and scroll performance:
```css
/* Apply to every major section except the first (which is immediately visible) */
.scene:not(:first-child) {
content-visibility: auto;
/* Tells browser: don't render this until it's near the viewport */
contain-intrinsic-size: 0 100vh;
/* Gives browser an estimated height so scrollbar is correct */
}
```
**Note:** Don't apply to the first section — it causes a flash of invisible content.
---
## Asset Optimization Rules
### PNG File Size Targets (Maximum)
| Depth Level | Element Type | Max File Size | Max Dimensions |
|-------------|---------------------|---------------|----------------|
| Depth 0 | Background | 150KB | 1920×1080 |
| Depth 1 | Glow layer | 60KB | 1000×1000 |
| Depth 2 | Decorations | 50KB | 400×400 |
| Depth 3 | Main product/hero | 120KB | 1200×1200 |
| Depth 4 | UI components | 40KB | 800×800 |
| Depth 5 | Particles | 10KB | 128×128 |
**Total page weight target: Under 2MB for all assets combined.**
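The size targets above are easy to check mechanically. The following is a minimal sketch of such a check; `checkAssetBudget` and the `{ name, depth, sizeKB }` record shape are assumptions made for illustration, and 2MB is taken as 2048KB.

```javascript
// Per-depth limits copied from the table above, in KB.
const MAX_KB_BY_DEPTH = { 0: 150, 1: 60, 2: 50, 3: 120, 4: 40, 5: 10 };
const TOTAL_BUDGET_KB = 2048; // "under 2MB", assuming 1MB = 1024KB

// Hypothetical helper: `assets` is an array of { name, depth, sizeKB }.
// Assets with an unknown depth are counted in the total but never flagged.
function checkAssetBudget(assets) {
  const oversized = assets.filter((a) => a.sizeKB > MAX_KB_BY_DEPTH[a.depth]);
  const totalKB = assets.reduce((sum, a) => sum + a.sizeKB, 0);
  return {
    oversized: oversized.map((a) => a.name),
    totalKB,
    withinTotal: totalKB < TOTAL_BUDGET_KB,
  };
}

const report = checkAssetBudget([
  { name: 'bg.png', depth: 0, sizeKB: 140 },
  { name: 'hero.png', depth: 3, sizeKB: 180 },
  { name: 'spark.png', depth: 5, sizeKB: 8 },
]);
console.log(report);
// { oversized: [ 'hero.png' ], totalKB: 328, withinTotal: true }
```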
### Image Loading Strategy
```html
<!-- Hero image: preload immediately -->
<link rel="preload" as="image" href="hero-product.png">
<!-- Above-fold images: eager loading -->
<img src="hero-bg.png" loading="eager" fetchpriority="high" alt="">
<!-- Below-fold images: lazy loading -->
<img src="section-2-bg.png" loading="lazy" alt="">
<!-- Use srcset for responsive images -->
<img
src="product-800.png"
srcset="product-400.png 400w, product-800.png 800w, product-1200.png 1200w"
sizes="(max-width: 768px) 100vw, 50vw"
alt="Product description"
loading="eager"
>
```
---
## Mobile Performance
Touch devices often have less GPU headroom than desktops. Detect them and reduce effects:
```javascript
const isTouchDevice = window.matchMedia('(pointer: coarse)').matches;
const prefersReduced = window.matchMedia('(prefers-reduced-motion: reduce)').matches;
const isLowPower = navigator.hardwareConcurrency <= 4; // heuristic for low-end devices
const performanceMode = (isTouchDevice || prefersReduced || isLowPower) ? 'lite' : 'full';
function initForPerformanceMode() {
if (performanceMode === 'lite') {
// Disable: mouse tracking, floating loops, particles, perspective zoom
document.documentElement.classList.add('perf-lite');
// Keep: basic scroll fade-ins, curtain reveals (CSS only)
} else {
// Full experience
initParallaxLayers();
initFloatingLoops();
initParticles();
initMouseTracking();
}
}
```
```css
/* Disable GPU-heavy effects in lite mode */
.perf-lite .depth-0,
.perf-lite .depth-1,
.perf-lite .depth-5 {
transform: none !important;
will-change: auto !important;
}
.perf-lite .float-loop {
animation: none !important;
}
.perf-lite .glow-blob {
display: none;
}
```
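The mode decision above reads three browser globals, which makes it awkward to test. One option, sketched here under the assumption that you pass the detected values in yourself, is a pure function; `pickPerformanceMode` is a hypothetical name, and the explicit type check covers environments where `navigator.hardwareConcurrency` is not exposed.

```javascript
// Hypothetical pure version of the mode check above.
// `cores` mirrors navigator.hardwareConcurrency, which may be undefined
// where the API is not available; in that case it is not treated as low power.
function pickPerformanceMode({ isTouch, prefersReduced, cores }) {
  const isLowPower = typeof cores === 'number' && cores <= 4;
  return (isTouch || prefersReduced || isLowPower) ? 'lite' : 'full';
}

console.log(pickPerformanceMode({ isTouch: false, prefersReduced: false, cores: 8 }));         // full
console.log(pickPerformanceMode({ isTouch: false, prefersReduced: true, cores: 8 }));          // lite
console.log(pickPerformanceMode({ isTouch: false, prefersReduced: false, cores: 4 }));         // lite
console.log(pickPerformanceMode({ isTouch: false, prefersReduced: false, cores: undefined })); // full
```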
---
## Chrome DevTools Performance Checklist
Before shipping, verify:
1. **Layers**: In DevTools, open More tools → Rendering and enable "Layer borders", or inspect the Layers panel; the page should not show an excessive layer count (target: under 20 promoted layers)
2. **Performance tab**: Record a scroll and look for long frames (over ~16.7ms, the budget for 60fps)
3. **Memory tab**: Heap snapshot — should not grow during scroll (no leaks)
4. **Coverage tab**: Check unused CSS/JS — strip unused animation classes
---
## GSAP Performance Tips
```javascript
// BAD: Creates new tween every scroll event
window.addEventListener('scroll', () => {
gsap.to(element, { y: window.scrollY * 0.5 }); // creates new tween each frame!
});
// GOOD: Use scrub — GSAP manages timing internally
gsap.to(element, {
y: 200,
ease: 'none',
scrollTrigger: {
trigger: element,
scrub: true, // GSAP handles this efficiently
}
});
// GOOD: Kill ScrollTriggers when not needed
const trigger = ScrollTrigger.create({ ... });
// Later:
trigger.kill();
// GOOD: Use gsap.set() for instant placement (no tween overhead)
gsap.set('.element', { x: 0, opacity: 1 });
// GOOD: Batch DOM reads/writes
gsap.utils.toArray('.elements').forEach(el => {
// GSAP batches these reads automatically
gsap.from(el, { ... });
});
```
FILE:references/text-animations.md
# Text Animation Reference
## Table of Contents
1. [Setup: SplitText & Dependencies](#setup)
2. [Technique 1: Split Converge (Left+Right Merge)](#split-converge)
3. [Technique 2: Masked Line Curtain Reveal](#masked-line)
4. [Technique 3: Character Cylinder Rotation](#cylinder)
5. [Technique 4: Word-by-Word Scroll Lighting](#word-lighting)
6. [Technique 5: Scramble Text](#scramble)
7. [Technique 6: Skew + Elastic Bounce Entry](#skew-bounce)
8. [Technique 7: Theatrical Enter + Auto Exit](#theatrical)
9. [Technique 8: Offset Diagonal Layout](#offset-diagonal)
10. [Technique 9: Line Clip Wipe](#line-clip-wipe)
11. [Technique 10: Scroll-Speed Reactive Marquee](#marquee)
12. [Technique 11: Variable Font Wave](#variable-font)
13. [Technique 12: Bleed Typography](#bleed-type)
14. [Technique 13: Ghost Outlined Background Text](#ghost-text)
---
## Setup: SplitText & Dependencies {#setup}
```html
<!-- GSAP SplitText (free since GSAP 3.13) -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/gsap.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/SplitText.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/ScrollTrigger.min.js"></script>
<script>
gsap.registerPlugin(SplitText, ScrollTrigger);
</script>
```
### Universal Text Setup CSS
```css
/* All text elements that animate need this */
.anim-text {
overflow: hidden; /* Contains line mask reveals */
line-height: 1.15;
}
/* Screen reader note: aria-hidden is an HTML attribute, not a CSS property.
   Put aria-label on the heading and set aria-hidden="true" on the split
   child elements in markup or JavaScript. */
```
---
## Technique 1: Split Converge (Left+Right Merge) {#split-converge}
The signature effect: two halves of a title fly in from opposite sides, converge to form the complete title, hold, then diverge and disappear on scroll exit.
```css
.hero-title {
display: flex;
flex-wrap: wrap;
gap: 0.25em;
overflow: visible; /* allow parts to fly from outside viewport */
}
.hero-title .word-left {
display: inline-block;
/* starts at far left */
}
.hero-title .word-right {
display: inline-block;
/* starts at far right */
}
```
```javascript
function initSplitConverge(titleEl) {
// Preserve accessibility
const fullText = titleEl.textContent;
titleEl.setAttribute('aria-label', fullText);
const words = titleEl.querySelectorAll('.word');
const midpoint = Math.floor(words.length / 2);
const leftWords = Array.from(words).slice(0, midpoint);
const rightWords = Array.from(words).slice(midpoint);
const tl = gsap.timeline({
scrollTrigger: {
trigger: titleEl.closest('.scene'),
start: 'top top',
end: '+=250%',
pin: true,
scrub: 1.2,
}
});
// Phase 1 — ENTER (0% → 25%): Words converge from sides
tl.fromTo(leftWords,
{ x: '-120vw', opacity: 0 },
{ x: 0, opacity: 1, duration: 0.25, ease: 'power3.out', stagger: 0.03 },
0
)
.fromTo(rightWords,
{ x: '120vw', opacity: 0 },
{ x: 0, opacity: 1, duration: 0.25, ease: 'power3.out', stagger: -0.03 },
0
)
// Phase 2 — HOLD (25% → 70%): Nothing — words are readable, section pinned
// (empty duration keeps the scrub paused here)
.to({}, { duration: 0.45 }, 0.25)
// Phase 3 — EXIT (70% → 100%): Words diverge back out
.to(leftWords,
{ x: '-120vw', opacity: 0, duration: 0.28, ease: 'power3.in', stagger: 0.02 },
0.70
)
.to(rightWords,
{ x: '120vw', opacity: 0, duration: 0.28, ease: 'power3.in', stagger: -0.02 },
0.70
);
return tl;
}
```
### HTML Template
```html
<h1 class="hero-title anim-text" aria-label="Your Brand Name Here">
<span class="word word-left">Your</span>
<span class="word word-left">Brand</span>
<span class="word word-right">Name</span>
<span class="word word-right">Here</span>
</h1>
```
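`initSplitConverge` divides the words at `Math.floor(words.length / 2)`, which matters when the title has an odd number of words. A small standalone sketch of that rule (the helper name `splitAtMidpoint` is made up for this example):

```javascript
// Same split rule as initSplitConverge: the first floor(n / 2) words go left.
function splitAtMidpoint(words) {
  const midpoint = Math.floor(words.length / 2);
  return { left: words.slice(0, midpoint), right: words.slice(midpoint) };
}

console.log(splitAtMidpoint(['Your', 'Brand', 'Name', 'Here']));
// { left: [ 'Your', 'Brand' ], right: [ 'Name', 'Here' ] }
console.log(splitAtMidpoint(['Read', 'Less', 'Now']));
// { left: [ 'Read' ], right: [ 'Less', 'Now' ] }
```

With an odd count the extra word lands on the right-hand side, so a three-word title enters as one word from the left and two from the right.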
---
## Technique 2: Masked Line Curtain Reveal {#masked-line}
Lines slide upward from behind an invisible curtain. Each line is hidden in an `overflow: hidden` container and translates up into view.
```css
.curtain-text .line-mask {
overflow: hidden;
line-height: 1.2;
/* The mask — content starts below and slides up into view */
}
.curtain-text .line-inner {
display: block;
/* Starts translated down below the mask */
transform: translateY(110%);
}
```
```javascript
function initCurtainReveal(textEl) {
// SplitText splits into lines automatically
const split = new SplitText(textEl, {
type: 'lines',
linesClass: 'line-inner',
// Wraps each line in overflow:hidden container
lineThreshold: 0.1,
});
// Wrap each line in a mask container
split.lines.forEach(line => {
const mask = document.createElement('div');
mask.className = 'line-mask';
line.parentNode.insertBefore(mask, line);
mask.appendChild(line);
});
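// The CSS above already parks each .line-inner at translateY(110%),
// so tween TO the resting position instead of using gsap.from()
gsap.to(split.lines, {
y: 0,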
duration: 0.9,
ease: 'power4.out',
stagger: 0.12,
scrollTrigger: {
trigger: textEl,
start: 'top 80%',
}
});
}
```
---
## Technique 3: Character Cylinder Rotation {#cylinder}
Letters rotate in on a 3D cylinder axis — like a slot machine or odometer rolling into place. Premium, memorable.
```css
.cylinder-text {
perspective: 800px;
}
.cylinder-text .char {
display: inline-block;
transform-origin: center center -60px; /* pivot point BEHIND the letter */
transform-style: preserve-3d;
}
```
```javascript
function initCylinderRotation(titleEl) {
const split = new SplitText(titleEl, { type: 'chars' });
gsap.from(split.chars, {
rotateX: -90,
opacity: 0,
duration: 0.6,
ease: 'back.out(1.5)',
stagger: {
each: 0.04,
from: 'start'
},
scrollTrigger: {
trigger: titleEl,
start: 'top 75%',
}
});
}
```
---
## Technique 4: Word-by-Word Scroll Lighting {#word-lighting}
Words appear to light up one at a time, driven by scroll position. Apple's signature prose technique.
```css
.scroll-lit-text {
/* Start all words dim */
}
.scroll-lit-text .word {
display: inline-block;
color: rgba(255, 255, 255, 0.15); /* dim unlit state */
transition: color 0.1s ease;
}
.scroll-lit-text .word.lit {
color: rgba(255, 255, 255, 1.0); /* bright lit state */
}
```
```javascript
function initWordScrollLighting(containerEl, textEl) {
const split = new SplitText(textEl, { type: 'words' });
const words = split.words;
const totalWords = words.length;
// Pin the section and light words as user scrolls
ScrollTrigger.create({
trigger: containerEl,
start: 'top top',
end: `+=${totalWords * 80}`, // ~80px of scroll per word
pin: true,
scrub: 0.5,
onUpdate: (self) => {
const progress = self.progress;
const litCount = Math.round(progress * totalWords);
words.forEach((word, i) => {
word.classList.toggle('lit', i < litCount);
});
}
});
}
```
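The arithmetic inside `initWordScrollLighting` is simple enough to check in isolation. The helpers below are a sketch with hypothetical names (`pinDistance`, `litFlags`), using the same 80px-per-word figure and the same `Math.round` rule as the function above.

```javascript
const PX_PER_WORD = 80; // same ~80px-per-word figure as above

// Pixels of scroll consumed while the section is pinned.
function pinDistance(totalWords) {
  return totalWords * PX_PER_WORD;
}

// For a given scroll progress (0..1), which words carry the `lit` class?
function litFlags(progress, totalWords) {
  const litCount = Math.round(progress * totalWords);
  return Array.from({ length: totalWords }, (_, i) => i < litCount);
}

console.log(pinDistance(12));  // 960
console.log(litFlags(0.5, 4)); // [ true, true, false, false ]
console.log(litFlags(1, 3));   // [ true, true, true ]
```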
---
## Technique 5: Scramble Text {#scramble}
Characters cycle through random values before resolving to real text. Feels digital, techy, premium.
```html
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/TextPlugin.min.js"></script>
```
```javascript
// Custom scramble implementation (no plugin needed)
function scrambleText(el, finalText, duration = 1.5) {
const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%';
let startTime = null;
const originalText = finalText;
function step(timestamp) {
if (!startTime) startTime = timestamp;
const progress = Math.min((timestamp - startTime) / (duration * 1000), 1);
let result = '';
for (let i = 0; i < originalText.length; i++) {
if (originalText[i] === ' ') {
result += ' ';
} else if (i / originalText.length < progress) {
// This character has resolved
result += originalText[i];
} else {
// Still scrambling
result += chars[Math.floor(Math.random() * chars.length)];
}
}
el.textContent = result;
if (progress < 1) requestAnimationFrame(step);
}
requestAnimationFrame(step);
}
// Trigger on scroll
ScrollTrigger.create({
trigger: '.scramble-title',
start: 'top 80%',
once: true,
onEnter: () => {
scrambleText(
document.querySelector('.scramble-title'),
document.querySelector('.scramble-title').dataset.text,
1.8
);
}
});
```
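A single frame of the scramble can be expressed as a pure function, which makes the left-to-right resolve order easy to see. This is a sketch based on the loop in `scrambleText`; `scrambleFrame` is a hypothetical name, the character set is shortened, and the `random` parameter is an addition so a frame can be reproduced deterministically.

```javascript
const SCRAMBLE_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

// Hypothetical pure version of one animation step. `random` is injectable
// (defaults to Math.random) so a frame can be reproduced in a test.
function scrambleFrame(finalText, progress, random = Math.random) {
  let result = '';
  for (let i = 0; i < finalText.length; i++) {
    if (finalText[i] === ' ') {
      result += ' ';            // spaces are never scrambled
    } else if (i / finalText.length < progress) {
      result += finalText[i];   // this character has resolved
    } else {
      result += SCRAMBLE_CHARS[Math.floor(random() * SCRAMBLE_CHARS.length)];
    }
  }
  return result;
}

const alwaysFirst = () => 0;    // deterministic stand-in for Math.random
console.log(scrambleFrame('GO NOW', 0, alwaysFirst));   // AA AAA
console.log(scrambleFrame('GO NOW', 0.5, alwaysFirst)); // GO AAA
console.log(scrambleFrame('GO NOW', 1, alwaysFirst));   // GO NOW
```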
---
## Technique 6: Skew + Elastic Bounce Entry {#skew-bounce}
Elements enter with a skew that corrects itself, combined with a slight overshoot. Feels physical and energetic.
```javascript
function initSkewBounce(elements) {
gsap.from(elements, {
y: 80,
skewY: 7,
opacity: 0,
duration: 0.9,
ease: 'back.out(1.7)',
stagger: 0.1,
scrollTrigger: {
trigger: elements[0],
start: 'top 85%',
}
});
}
```
---
## Technique 7: Theatrical Enter + Auto Exit {#theatrical}
Element automatically animates in when entering the viewport AND animates out when leaving — zero JavaScript needed.
```css
/* Enter animation */
@keyframes theatrical-enter {
from {
opacity: 0;
transform: translateY(60px);
filter: blur(4px);
}
to {
opacity: 1;
transform: translateY(0);
filter: blur(0px);
}
}
/* Exit animation */
@keyframes theatrical-exit {
from {
opacity: 1;
transform: translateY(0);
}
to {
opacity: 0;
transform: translateY(-60px);
}
}
.theatrical {
/* Enter when element comes into view */
animation: theatrical-enter linear both;
animation-timeline: view();
animation-range: entry 0% entry 40%;
}
.theatrical-with-exit {
animation: theatrical-enter linear both, theatrical-exit linear both;
animation-timeline: view(), view();
animation-range: entry 0% entry 30%, exit 60% exit 100%;
}
```
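**Zero JavaScript required** in browsers that support CSS scroll-driven animations (`animation-timeline: view()`). Just add the `.theatrical` or `.theatrical-with-exit` class, and consider wrapping the rules in `@supports (animation-timeline: view())` so unsupported browsers show the element unanimated.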
---
## Technique 8: Offset Diagonal Layout {#offset-diagonal}
Lines of a title sit at offset positions (one top-left, one lower-right) and animate in from opposite directions to those positions. The staircase composition feels dynamic even before any animation runs.
```css
.offset-title {
position: relative;
/* Don't center — let offset do the work */
}
.offset-title .line-1 {
/* Top-left */
display: block;
text-align: left;
padding-left: 5%;
font-size: clamp(48px, 8vw, 100px);
}
.offset-title .line-2 {
/* Lower-right — drops down and shifts right */
display: block;
text-align: right;
padding-right: 5%;
margin-top: 0.4em;
font-size: clamp(48px, 8vw, 100px);
}
```
```javascript
function initOffsetDiagonal(titleEl) {
const line1 = titleEl.querySelector('.line-1');
const line2 = titleEl.querySelector('.line-2');
gsap.from(line1, {
x: '-15vw',
opacity: 0,
duration: 1.0,
ease: 'power4.out',
scrollTrigger: { trigger: titleEl, start: 'top 75%' }
});
gsap.from(line2, {
x: '15vw',
opacity: 0,
duration: 1.0,
ease: 'power4.out',
delay: 0.15,
scrollTrigger: { trigger: titleEl, start: 'top 75%' }
});
}
```
---
## Technique 9: Line Clip Wipe {#line-clip-wipe}
Each line of text reveals from left to right, like a typewriter but with a clean clip-path sweep.
```javascript
function initLineClipWipe(textEl) {
const split = new SplitText(textEl, { type: 'lines' });
split.lines.forEach((line, i) => {
gsap.fromTo(line,
{ clipPath: 'inset(0 100% 0 0)' },
{
clipPath: 'inset(0 0% 0 0)',
duration: 0.8,
ease: 'power3.out',
delay: i * 0.12, // stagger between lines
scrollTrigger: {
trigger: textEl,
start: 'top 80%',
}
}
);
});
}
```
---
## Technique 10: Scroll-Speed Reactive Marquee {#marquee}
Infinite scrolling text. Speed scales with scroll velocity — fast scroll = fast marquee. Slow scroll = slow/paused.
```css
.marquee-wrapper {
overflow: hidden;
white-space: nowrap;
}
.marquee-track {
display: inline-flex;
gap: 4rem;
/* Two copies side by side for seamless loop */
}
.marquee-track .marquee-item {
display: inline-block;
font-size: clamp(2rem, 5vw, 5rem);
font-weight: 700;
letter-spacing: -0.02em;
}
```
```javascript
function initReactiveMarquee(wrapperEl) {
const track = wrapperEl.querySelector('.marquee-track');
let currentX = 0;
let velocity = 0;
let baseSpeed = 0.8; // px per frame base speed
let lastScrollY = window.scrollY;
let lastTime = performance.now();
// Track scroll velocity
window.addEventListener('scroll', () => {
const now = performance.now();
const dt = now - lastTime;
const dy = window.scrollY - lastScrollY;
velocity = Math.abs(dy / dt) * 30; // scale to marquee speed
lastScrollY = window.scrollY;
lastTime = now;
}, { passive: true });
function animate() {
velocity = Math.max(0, velocity - 0.3); // decay
const speed = baseSpeed + velocity;
currentX -= speed;
// Reset when first copy exits viewport
const trackWidth = track.children[0].offsetWidth * track.children.length / 2;
if (Math.abs(currentX) >= trackWidth) {
currentX += trackWidth;
}
track.style.transform = `translateX(${currentX}px)`;
requestAnimationFrame(animate);
}
animate();
}
```
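The per-frame update in `initReactiveMarquee` combines three things: velocity decay, leftward movement, and wrap-around. The sketch below isolates one frame as a pure function so those rules can be checked without a DOM; `stepMarquee` is a hypothetical name, and the demo uses round numbers rather than the defaults to keep the output exact.

```javascript
// Hypothetical pure version of one frame of the marquee loop above.
function stepMarquee({ currentX, velocity }, { baseSpeed = 0.8, decay = 0.3, trackWidth }) {
  const nextVelocity = Math.max(0, velocity - decay); // scroll boost fades out
  let nextX = currentX - (baseSpeed + nextVelocity);  // move left
  if (Math.abs(nextX) >= trackWidth) {
    nextX += trackWidth;                              // wrap once the first copy has scrolled past
  }
  return { currentX: nextX, velocity: nextVelocity };
}

let state = { currentX: -995, velocity: 10 };
state = stepMarquee(state, { baseSpeed: 1, decay: 0.5, trackWidth: 1000 });
console.log(state); // { currentX: -5.5, velocity: 9.5 }
```

Because the wrap adds `trackWidth` back instead of resetting to zero, the overshoot past the boundary is preserved and the loop does not visibly stutter.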
---
## Technique 11: Variable Font Wave {#variable-font}
If the font supports variable axes (weight, width), animate them per-character for a wave/ripple effect.
```javascript
function initVariableFontWave(titleEl) {
const split = new SplitText(titleEl, { type: 'chars' });
// Wave through characters using weight axis
gsap.to(split.chars, {
fontVariationSettings: '"wght" 800',
duration: 0.4,
ease: 'power2.inOut',
stagger: {
each: 0.06,
yoyo: true,
repeat: -1, // infinite loop
}
});
}
```
**Note:** Requires a variable font. Free options: Inter Variable, Fraunces, Recursive. When loading from Google Fonts, request the weight axis as a range, e.g. `family=Inter:wght@100..900&display=swap`.
---
## Technique 12: Bleed Typography {#bleed-type}
Oversized headline that intentionally exceeds section boundaries. Creates drama, depth, and visual tension.
```css
.bleed-title {
font-size: clamp(80px, 18vw, 220px);
font-weight: 900;
line-height: 0.9;
letter-spacing: -0.04em;
/* Allow bleeding outside section */
position: relative;
z-index: 10;
pointer-events: none;
/* Negative margins to bleed out */
margin-left: -0.05em;
margin-right: -0.05em;
/* Optionally: half above, half below section boundary */
transform: translateY(30%);
}
/* Parent section allows overflow */
.bleed-section {
overflow: visible;
position: relative;
z-index: 2;
}
/* Next section needs to be higher to "trap" the bleed */
.bleed-section + .next-section {
position: relative;
z-index: 3;
}
```
```javascript
// Parallax on the bleed title — moves at slightly different rate
// to emphasize that it belongs to a different depth than content
gsap.to('.bleed-title', {
y: '-12%',
ease: 'none',
scrollTrigger: {
trigger: '.bleed-section',
start: 'top bottom',
end: 'bottom top',
scrub: true,
}
});
```
---
## Technique 13: Ghost Outlined Background Text {#ghost-text}
Massive atmospheric text sitting BEHIND the main product using only a thin stroke
with transparent fill. Supports the scene without competing with the content.
```css
.ghost-bg-text {
color: transparent;
-webkit-text-stroke: 1px rgba(255, 255, 255, 0.15); /* neutral white stroke on a dark background */
/* tinted alternative: -webkit-text-stroke: 1px rgba(255, 106, 26, 0.18); */
font-size: clamp(5rem, 15vw, 18rem);
font-weight: 900;
line-height: 0.85;
letter-spacing: -0.04em;
white-space: nowrap;
z-index: 2; /* must be lower than the hero product (depth-3 = z-index 3+) */
pointer-events: none;
user-select: none;
}
```
```javascript
// Entrance: lines slide up from a masked overflow:hidden parent
function initGhostTextEntrance(lines) {
gsap.set(lines, { y: '110%' });
gsap.to(lines, {
y: '0%',
stagger: 0.1,
duration: 1.1,
ease: 'power4.out',
delay: 0.2,
});
}
// Exit: lines drift apart as hero scrolls out
function addGhostTextExit(scrubTimeline, line1, line2) {
scrubTimeline
.to(line1, { x: '-12vw', opacity: 0.06, duration: 0.3 }, 0)
.to(line2, { x: '12vw', opacity: 0.06, duration: 0.3 }, 0)
.to(line1, { x: '-40vw', opacity: 0, duration: 0.25 }, 0.4)
.to(line2, { x: '40vw', opacity: 0, duration: 0.25 }, 0.4);
}
```
Stroke opacity guide:
- `0.08–0.12` → barely-there atmosphere
- `0.15–0.22` → readable on inspection, still subtle
- `0.25–0.35` → prominently visible — only if it IS the visual focus
Rules:
1. Always `aria-hidden="true"` — never the real heading
2. A real `<h1>` must exist elsewhere for SEO/screen readers
3. Only works on dark backgrounds — thin strokes vanish on light ones
4. Maximum 2 lines — 3+ becomes noise
5. Best with ultra-heavy weights (800–900) and tight letter-spacing
---
## Combining Techniques
The most premium results come from layering multiple text techniques in the same section:
```javascript
// Example: Full hero text sequence
function initHeroTextSequence() {
const tl = gsap.timeline({
scrollTrigger: {
trigger: '.hero-scene',
start: 'top top',
end: '+=300%',
pin: true,
scrub: 1,
}
});
// 1. Bleed title already visible via CSS
// 2. Subtitle curtain reveal
tl.from('.hero-sub .line-inner', {
y: '110%', duration: 0.2, stagger: 0.05
}, 0)
// 3. CTA skew bounce
.from('.hero-cta', {
y: 40, skewY: 5, opacity: 0, duration: 0.15, ease: 'back.out'
}, 0.15)
// 4. On scroll-through: title exits via split converge reverse
.to('.hero-title .word-left', {
x: '-80vw', opacity: 0, duration: 0.25, stagger: 0.03
}, 0.7)
.to('.hero-title .word-right', {
x: '80vw', opacity: 0, duration: 0.25, stagger: -0.03
}, 0.7);
}
```
FILE:scripts/inspect-assets.py
#!/usr/bin/env python3
"""
2.5D Asset Inspector
Usage: python scripts/inspect-assets.py image1.png image2.jpg ...
or: python scripts/inspect-assets.py path/to/folder/
Checks each image and reports:
- Format and mode
- Whether it has a real transparent background
- Background type if not transparent (dark, light, complex)
- Recommended depth level based on image characteristics
- Whether the background is likely a problem (product shot vs scene/artwork)
The AI reads this output and uses it to inform the user.
The script NEVER modifies images — inspect only.
"""
import sys
import os
try:
from PIL import Image
except ImportError:
print("PIL not found. Install with: pip install Pillow")
sys.exit(1)
def analyse_image(path):
    result = {
        "path": path,
        "filename": os.path.basename(path),
        "status": None,
        "format": None,
        "mode": None,
        "size": None,
        "bg_type": None,
        "bg_colour": None,
        "likely_needs_removal": None,
        "notes": [],
    }
    try:
        img = Image.open(path)
        result["format"] = img.format or os.path.splitext(path)[1].upper().strip(".")
        result["mode"] = img.mode
        result["size"] = img.size
        w, h = img.size
    except Exception as e:
        result["status"] = "ERROR"
        result["notes"].append(f"Could not open: {e}")
        return result
    # --- Alpha / transparency check ---
    if img.mode == "RGBA":
        extrema = img.getextrema()
        alpha_min = extrema[3][0]  # 0 = has real transparency, 255 = fully opaque
        if alpha_min == 0:
            result["status"] = "CLEAN"
            result["bg_type"] = "transparent"
            result["notes"].append("Real alpha channel with transparent pixels — clean cutout")
            result["likely_needs_removal"] = False
            return result
        else:
            result["notes"].append("RGBA mode but alpha is fully opaque — background was never removed")
            img = img.convert("RGB")  # treat as solid for analysis below
    if img.mode not in ("RGB", "L"):
        img = img.convert("RGB")
    # --- Sample corners and edges to detect background colour ---
    pixels = img.load()
    sample_points = [
        (0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1),  # corners
        (w // 2, 0), (w // 2, h - 1),  # top/bottom center
        (0, h // 2), (w - 1, h // 2),  # left/right center
    ]
    samples = []
    for x, y in sample_points:
        try:
            px = pixels[x, y]
            if isinstance(px, int):
                px = (px, px, px)
            samples.append(px[:3])
        except Exception:
            pass
    if not samples:
        result["status"] = "UNKNOWN"
        result["notes"].append("Could not sample pixels")
        return result
    # --- Classify background ---
    avg_r = sum(s[0] for s in samples) / len(samples)
    avg_g = sum(s[1] for s in samples) / len(samples)
    avg_b = sum(s[2] for s in samples) / len(samples)
    avg_brightness = (avg_r + avg_g + avg_b) / 3
    # Check colour consistency (low variance = solid bg, high variance = scene/complex bg)
    max_r = max(s[0] for s in samples)
    max_g = max(s[1] for s in samples)
    max_b = max(s[2] for s in samples)
    min_r = min(s[0] for s in samples)
    min_g = min(s[1] for s in samples)
    min_b = min(s[2] for s in samples)
    variance = max(max_r - min_r, max_g - min_g, max_b - min_b)
    result["bg_colour"] = (int(avg_r), int(avg_g), int(avg_b))
    if variance > 80:
        result["status"] = "COMPLEX_BG"
        result["bg_type"] = "complex or scene"
        result["notes"].append(
            "Background varies significantly across edges — likely a scene, "
            "photograph, or artwork background rather than a solid colour"
        )
        result["likely_needs_removal"] = False  # complex bg = probably intentional content
        result["notes"].append(
            "JUDGMENT: Complex backgrounds usually mean this image IS the content "
            "(site screenshot, artwork, section bg). Background likely should be KEPT."
        )
    elif avg_brightness < 40:
        result["status"] = "DARK_BG"
        result["bg_type"] = "solid dark/black"
        result["notes"].append(
            f"Solid dark background detected — average edge brightness: {avg_brightness:.0f}/255"
        )
        result["likely_needs_removal"] = True
        result["notes"].append(
            "JUDGMENT: Dark studio backgrounds on product shots typically need removal. "
            "BUT if this is a screenshot, artwork, or intentionally dark composition, keep it."
        )
    elif avg_brightness > 210:
        result["status"] = "LIGHT_BG"
        result["bg_type"] = "solid white/light"
        result["notes"].append(
            f"Solid light background detected — average edge brightness: {avg_brightness:.0f}/255"
        )
        result["likely_needs_removal"] = True
        result["notes"].append(
            "JUDGMENT: White studio backgrounds on product shots typically need removal. "
            "BUT if this is a screenshot, UI mockup, or document, keep it."
        )
    else:
        result["status"] = "MIDTONE_BG"
        result["bg_type"] = "solid mid-tone colour"
        result["notes"].append(
            f"Solid mid-tone background detected — avg colour: RGB{result['bg_colour']}"
        )
        result["likely_needs_removal"] = None  # ambiguous — let AI judge
        result["notes"].append(
            "JUDGMENT: Ambiguous — could be a branded background (keep) or a "
            "studio colour backdrop (remove). AI must judge based on context."
        )
    # --- JPEG format warning ---
    if result["format"] in ("JPEG", "JPG"):
        result["notes"].append(
            "JPEG format — cannot store transparency. "
            "If bg removal is needed, user must provide a PNG version or approve CSS workaround."
        )
    # --- Size note ---
    if w > 2000 or h > 2000:
        result["notes"].append(
            f"Large image ({w}x{h}px) — resize before embedding. "
            "See references/asset-pipeline.md Step 3 for depth-appropriate targets."
        )
    return result
def print_report(results):
    print("\n" + "═" * 55)
    print(" 2.5D Asset Inspector Report")
    print("═" * 55)
    for r in results:
        print(f"\n📁 {r['filename']}")
        print(f" Format : {r['format']} | Mode: {r['mode']} | Size: {r['size']}")
        status_icons = {
            "CLEAN": "✅",
            "DARK_BG": "⚠️ ",
            "LIGHT_BG": "⚠️ ",
            "COMPLEX_BG": "🔵",
            "MIDTONE_BG": "❓",
            "UNKNOWN": "❓",
            "ERROR": "❌",
        }
        icon = status_icons.get(r["status"], "❓")
        print(f" Status : {icon} {r['status']}")
        if r["bg_type"]:
            print(f" Bg type: {r['bg_type']}")
        if r["likely_needs_removal"] is True:
            print(" Removal: Likely needed (product/object shot)")
        elif r["likely_needs_removal"] is False:
            print(" Removal: Likely NOT needed (scene/artwork/content image)")
        else:
            print(" Removal: Ambiguous — AI must judge from context")
        for note in r["notes"]:
            print(f" → {note}")
    print("\n" + "═" * 55)
    clean = sum(1 for r in results if r["status"] == "CLEAN")
    flagged = sum(1 for r in results if r["status"] in ("DARK_BG", "LIGHT_BG", "MIDTONE_BG"))
    complex_bg = sum(1 for r in results if r["status"] == "COMPLEX_BG")
    errors = sum(1 for r in results if r["status"] == "ERROR")
    print(f" Clean: {clean} | Flagged: {flagged} | Complex/Scene: {complex_bg} | Errors: {errors}")
    print("═" * 55)
    print("\nNext step: Read JUDGMENT notes above and inform the user.")
    print("See references/asset-pipeline.md for the exact notification format.\n")
def collect_paths(args):
    paths = []
    for arg in args:
        if os.path.isdir(arg):
            for f in os.listdir(arg):
                if f.lower().endswith((".png", ".jpg", ".jpeg", ".webp", ".avif")):
                    paths.append(os.path.join(arg, f))
        elif os.path.isfile(arg):
            paths.append(arg)
        else:
            print(f"⚠️ Not found: {arg}")
    return paths
if __name__ == "__main__":
    if len(sys.argv) < 2 or sys.argv[1] in ('-h', '--help'):
        print("\nUsage:")
        print(" python scripts/inspect-assets.py image.png")
        print(" python scripts/inspect-assets.py image1.jpg image2.png")
        print(" python scripts/inspect-assets.py path/to/folder/\n")
        if len(sys.argv) < 2:
            sys.exit(1)
        else:
            sys.exit(0)
    paths = collect_paths(sys.argv[1:])
    if not paths:
        print("No valid image files found.")
        sys.exit(1)
    results = [analyse_image(p) for p in paths]
    print_report(results)
FILE:scripts/validate-layers.js
#!/usr/bin/env node
/**
* 2.5D Layer Validator
* Usage: node scripts/validate-layers.js path/to/your/index.html
*
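* Checks:
* 1. Scene elements (.scene) are present
* 2. At least 3 data-depth attributes are present
* 3. prefers-reduced-motion is referenced (inline or via hero-section.css)
* 4. At least one aria-hidden="true" attribute is present
* 5. Every <img> has an alt attribute
* 6. A skip-to-content link is present
* 7. GSAP script is included
* 8. ScrollTrigger plugin is loaded (warning only)
* 9. No more than 80 animated elements (performance)
* 10. A <main> landmark is present
* 11. Exactly one <h1> is present
* 12. A lang attribute is present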
*/
const fs = require('fs');
const path = require('path');
const filePath = process.argv[2];
if (!filePath) {
console.error('\n❌ Usage: node validate-layers.js path/to/index.html\n');
process.exit(1);
}
const html = fs.readFileSync(path.resolve(filePath), 'utf8');
let passed = 0;
let failed = 0;
const results = [];
function check(label, condition, suggestion) {
if (condition) {
passed++;
results.push({ status: '✅', label });
} else {
failed++;
results.push({ status: '❌', label, suggestion });
}
}
function warn(label, condition, suggestion) {
if (!condition) {
results.push({ status: '⚠️ ', label, suggestion });
}
}
// --- CHECKS ---
// 1. Scene elements present
check(
'Scene elements found (.scene)',
html.includes('class="scene') || html.includes("class='scene"),
'Wrap each major section in <section class="scene"> for the depth system to work.'
);
// 2. Depth layers present
const depthMatches = html.match(/data-depth=["']\d["']/g) || [];
check(
`Depth attributes found (${depthMatches.length} elements)`,
depthMatches.length >= 3,
'Each scene needs at least 3 elements with data-depth="0" through data-depth="5".'
);
// 3. prefers-reduced-motion in linked CSS
const hasReducedMotionInline = html.includes('prefers-reduced-motion');
check(
'prefers-reduced-motion implemented',
hasReducedMotionInline || html.includes('hero-section.css'),
'Add @media (prefers-reduced-motion: reduce) { } block. See references/accessibility.md.'
);
// 4. Decorative elements have aria-hidden
const decorativeElements = (html.match(/class="[^"]*(?:depth-0|depth-1|depth-5|glow-blob|particle|deco)[^"]*"/g) || []).length;
const ariaHiddenCount = (html.match(/aria-hidden="true"/g) || []).length;
check(
`Decorative elements have aria-hidden (found ${ariaHiddenCount})`,
ariaHiddenCount >= 1,
'Add aria-hidden="true" to all decorative layers (depth-0, depth-1, particles, glows).'
);
// 5. Images have alt text
const imgTags = html.match(/<img[^>]*>/g) || [];
const imgsWithoutAlt = imgTags.filter(tag => !tag.includes('alt=')).length;
check(
`All images have alt attributes (${imgTags.length} images found)`,
imgsWithoutAlt === 0,
`${imgsWithoutAlt} image(s) missing alt attribute. Decorative images use alt="", meaningful images need descriptive alt text.`
);
// 6. Skip link present
check(
'Skip-to-content link present',
html.includes('skip-link') || html.includes('Skip to'),
'Add <a href="#main-content" class="skip-link">Skip to main content</a> as first element in <body>.'
);
// 7. GSAP script loaded
check(
'GSAP script included',
html.includes('gsap') || html.includes('gsap.min.js'),
'Include GSAP from CDN: <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/gsap.min.js"></script>'
);
// 8. ScrollTrigger plugin loaded
warn(
'ScrollTrigger plugin loaded',
html.includes('ScrollTrigger'),
'Add ScrollTrigger plugin for scroll animations: <script src=".../ScrollTrigger.min.js"></script>'
);
// 9. Performance: too many animated elements
const animatedElements = (html.match(/data-animate=/g) || []).length + depthMatches.length;
check(
`Animated element count acceptable (${animatedElements} total)`,
animatedElements <= 80,
`${animatedElements} animated elements found. Target is under 80 for smooth 60fps performance.`
);
// 10. Main landmark present
check(
'<main> landmark present',
html.includes('<main'),
'Wrap page content in <main id="main-content"> for accessibility and skip link target.'
);
// 11. Heading hierarchy
const h1Count = (html.match(/<h1[\s>]/g) || []).length;
check(
`Single <h1> present (found ${h1Count})`,
h1Count === 1,
h1Count === 0
? 'Add one <h1> element as the main page heading.'
: `Multiple <h1> elements found (${h1Count}). Each page should have exactly one <h1>.`
);
// 12. lang attribute on html
check(
'<html lang=""> attribute present',
html.includes('lang='),
'Add lang="en" (or your language) to the <html> element: <html lang="en">'
);
// --- REPORT ---
console.log('\n📋 2.5D Layer Validator Report');
console.log('═══════════════════════════════════════');
console.log(`File: ${filePath}\n`);
results.forEach(r => {
console.log(`${r.status} ${r.label}`);
if (r.suggestion) {
console.log(` → ${r.suggestion}`);
}
});
console.log('\n═══════════════════════════════════════');
console.log(`Passed: ${passed} | Failed: ${failed}`);
if (failed === 0) {
console.log('\n🎉 All checks passed! Your 2.5D site is ready.\n');
} else {
console.log(`\n🔧 Fix the failed issue(s) above before shipping.\n`);
process.exit(1);
}
Production-grade Playwright testing toolkit. Use when the user mentions Playwright tests, end-to-end testing, browser automation, fixing flaky tests, test mi...
---
name: "playwright-pro"
description: "Production-grade Playwright testing toolkit. Use when the user mentions Playwright tests, end-to-end testing, browser automation, fixing flaky tests, test migration, CI/CD testing, or test suites. Generate tests, fix flaky failures, migrate from Cypress/Selenium, sync with TestRail, run on BrowserStack. 55 templates, 3 agents, smart reporting."
---
# Playwright Pro
Production-grade Playwright testing toolkit for AI coding agents.
## Available Commands
When installed as a Claude Code plugin, these are available as `/pw:` commands:
| Command | What it does |
|---|---|
| `/pw:init` | Set up Playwright — detects framework, generates config, CI, first test |
| `/pw:generate <spec>` | Generate tests from user story, URL, or component |
| `/pw:review` | Review tests for anti-patterns and coverage gaps |
| `/pw:fix <test>` | Diagnose and fix failing or flaky tests |
| `/pw:migrate` | Migrate from Cypress or Selenium to Playwright |
| `/pw:coverage` | Analyze what's tested vs. what's missing |
| `/pw:testrail` | Sync with TestRail — read cases, push results |
| `/pw:browserstack` | Run on BrowserStack, pull cross-browser reports |
| `/pw:report` | Generate test report in your preferred format |
## Quick Start Workflow
The recommended sequence for most projects:
```
1. /pw:init → scaffolds config, CI pipeline, and a first smoke test
2. /pw:generate → generates tests from your spec or URL
3. /pw:review → validates quality and flags anti-patterns ← always run after generate
4. /pw:fix <test> → diagnoses and repairs any failing/flaky tests ← run when CI turns red
```
**Validation checkpoints:**
- After `/pw:generate` — always run `/pw:review` before committing; it catches locator anti-patterns and missing assertions automatically.
- After `/pw:fix` — re-run the full suite locally (`npx playwright test`) to confirm the fix doesn't introduce regressions.
- After `/pw:migrate` — run `/pw:coverage` to confirm parity with the old suite before decommissioning Cypress/Selenium tests.
### Example: Generate → Review → Fix
```bash
# 1. Generate tests from a user story
/pw:generate "As a user I can log in with email and password"
# Generated: tests/auth/login.spec.ts
# → Playwright Pro creates the file using the auth template.
# 2. Review the generated tests
/pw:review tests/auth/login.spec.ts
# → Flags: one test used page.locator('input[type=password]') — suggests getByLabel('Password')
# → Fix applied automatically.
# 3. Run locally to confirm
npx playwright test tests/auth/login.spec.ts --headed
# 4. If a test is flaky in CI, diagnose it
/pw:fix tests/auth/login.spec.ts
# → Identifies missing web-first assertion; replaces waitForTimeout(2000) with expect(locator).toBeVisible()
```
## Golden Rules
1. `getByRole()` over CSS/XPath — resilient to markup changes
2. Never `page.waitForTimeout()` — use web-first assertions
3. `expect(locator)` auto-retries; `expect(await locator.textContent())` does not
4. Isolate every test — no shared state between tests
5. `baseURL` in config — zero hardcoded URLs
6. Retries: `2` in CI, `0` locally
7. Traces: `'on-first-retry'` — rich debugging without slowdown
8. Fixtures over globals — `test.extend()` for shared state
9. One behavior per test — multiple related assertions are fine
10. Mock external services only — never mock your own app
## Locator Priority
```
1. getByRole() — buttons, links, headings, form elements
2. getByLabel() — form fields with labels
3. getByText() — non-interactive text
4. getByPlaceholder() — inputs with placeholder
5. getByTestId() — when no semantic option exists
6. page.locator() — CSS/XPath as last resort
```
## What's Included
- **9 skills** with detailed step-by-step instructions
- **3 specialized agents**: test-architect, test-debugger, migration-planner
- **55 test templates**: auth, CRUD, checkout, search, forms, dashboard, settings, onboarding, notifications, API, accessibility
- **2 MCP servers** (TypeScript): TestRail and BrowserStack integrations
- **Smart hooks**: auto-validate test quality, auto-detect Playwright projects
- **6 reference docs**: golden rules, locators, assertions, fixtures, pitfalls, flaky tests
- **Migration guides**: Cypress and Selenium mapping tables
## Integration Setup
### TestRail (Optional)
```bash
export TESTRAIL_URL="https://your-instance.testrail.io"
export TESTRAIL_USER="[email protected]"
export TESTRAIL_API_KEY="your-api-key"
```
### BrowserStack (Optional)
```bash
export BROWSERSTACK_USERNAME="your-username"
export BROWSERSTACK_ACCESS_KEY="your-access-key"
```
## Quick Reference
See `reference/` directory for:
- `golden-rules.md` — The 10 non-negotiable rules
- `locators.md` — Complete locator priority with cheat sheet
- `assertions.md` — Web-first assertions reference
- `fixtures.md` — Custom fixtures and storageState patterns
- `common-pitfalls.md` — Top 10 mistakes and fixes
- `flaky-tests.md` — Diagnosis commands and quick fixes
See `templates/README.md` for the full template index.
FILE:CLAUDE.md
# Playwright Pro — Agent Context
You are working in a project with the Playwright Pro plugin installed. Follow these rules for all test-related work.
## Golden Rules (Non-Negotiable)
1. **`getByRole()` over CSS/XPath** — resilient to markup changes, mirrors how users see the page
2. **Never `page.waitForTimeout()`** — use `expect(locator).toBeVisible()` or `page.waitForURL()`
3. **Web-first assertions** — `expect(locator)` auto-retries; `expect(await locator.textContent())` does not
4. **Isolate every test** — no shared state, no execution-order dependencies
5. **`baseURL` in config** — zero hardcoded URLs in tests
6. **Retries: `2` in CI, `0` locally** — surface flakiness where it matters
7. **Traces: `'on-first-retry'`** — rich debugging without CI slowdown
8. **Fixtures over globals** — share state via `test.extend()`, not module-level variables
9. **One behavior per test** — multiple related `expect()` calls are fine
10. **Mock external services only** — never mock your own app
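
Rules 5 to 7 map directly onto `playwright.config.ts`. The fragment below is a minimal sketch rather than the plugin's generated config: the `./tests` directory, the `BASE_URL` environment variable, and the `http://localhost:3000` fallback are illustrative assumptions.

```typescript
// playwright.config.ts (illustrative sketch, not the plugin's generated file)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Assumed test directory; match whatever your project already uses.
  testDir: './tests',

  // Rule 6: retry twice in CI, never locally.
  retries: process.env.CI ? 2 : 0,

  use: {
    // Rule 5: tests call page.goto('/path') and never hardcode a host.
    // BASE_URL and the localhost fallback are assumptions for this sketch.
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',

    // Rule 7: record a trace only when a test is retried.
    trace: 'on-first-retry',
  },
});
```

With `baseURL` set, a test can navigate with `await page.goto('/login')` and the same file runs unchanged against local, staging, and CI environments.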
## Locator Priority
Always use the first option that works:
```typescript
page.getByRole('button', { name: 'Submit' }) // 1. Role (default)
page.getByLabel('Email address') // 2. Label (form fields)
page.getByText('Welcome back') // 3. Text (non-interactive)
page.getByPlaceholder('Search...') // 4. Placeholder
page.getByAltText('Company logo') // 5. Alt text (images)
page.getByTitle('Close dialog') // 6. Title attribute
page.getByTestId('checkout-summary') // 7. Test ID (last semantic)
page.locator('.legacy-widget') // 8. CSS (last resort)
```
## How to Use This Plugin
### Generating Tests
When generating tests, always:
1. Use the `Explore` subagent to scan the project structure first
2. Check `playwright.config.ts` for `testDir`, `baseURL`, and project settings
3. Load relevant templates from `templates/` directory
4. Match the project's language (check for `tsconfig.json` → TypeScript, else JavaScript)
5. Place tests in the configured `testDir` (default: `tests/` or `e2e/`)
6. Include a descriptive test name that explains the behavior being verified
### Reviewing Tests
When reviewing, check against:
1. All 10 golden rules above
2. The anti-patterns in `skills/review/anti-patterns.md`
3. Missing edge cases (empty state, error state, loading state)
4. Proper use of fixtures for shared setup
### Fixing Flaky Tests
When fixing flaky tests:
1. Categorize first: timing, isolation, environment, or infrastructure
2. Use `npx playwright test <file> --repeat-each=10` to reproduce
3. Use `--trace=on` for every attempt
4. Apply the targeted fix from `skills/fix/flaky-taxonomy.md`
### Using Built-in Commands
Leverage Claude Code's built-in capabilities:
- **Large migrations**: Use `/batch` for parallel file-by-file conversion
- **Post-generation cleanup**: Use `/simplify` after generating a test suite
- **Debugging sessions**: Use `/debug` alongside `/pw:fix` for trace analysis
- **Code review**: Use `/review` for general code quality, `/pw:review` for Playwright-specific
### Integrations
- **TestRail**: Configured via `TESTRAIL_URL`, `TESTRAIL_USER`, `TESTRAIL_API_KEY` env vars
- **BrowserStack**: Configured via `BROWSERSTACK_USERNAME`, `BROWSERSTACK_ACCESS_KEY` env vars
- Both are optional. The plugin works fully without them.
## File Conventions
- Test files: `*.spec.ts` or `*.spec.js`
- Page objects: `*.page.ts` in a `pages/` directory
- Fixtures: `fixtures.ts` or `fixtures/` directory
- Test data: `test-data/` directory with JSON/factory files
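
The conventions above can be expressed as a small path classifier, which may be useful when deciding where a generated file belongs. This is a sketch only: `FileKind` and `classifyByConvention` are hypothetical names, and the patterns assume forward-slash paths.

```typescript
// Hypothetical helper: maps a relative path to the convention it matches.
type FileKind = "test" | "page-object" | "fixture" | "test-data" | "other";

function classifyByConvention(path: string): FileKind {
  // Test files: *.spec.ts or *.spec.js
  if (/\.spec\.(ts|js)$/.test(path)) return "test";
  // Page objects: *.page.ts inside a pages/ directory
  if (/(^|\/)pages\/[^/]+\.page\.ts$/.test(path)) return "page-object";
  // Fixtures: a fixtures.ts file or anything under fixtures/
  if (/(^|\/)fixtures\.ts$/.test(path) || /(^|\/)fixtures\//.test(path)) return "fixture";
  // Test data: anything under test-data/
  if (/(^|\/)test-data\//.test(path)) return "test-data";
  return "other";
}
```

For example, `classifyByConvention("e2e/pages/login.page.ts")` yields `"page-object"`, while a `login.page.ts` outside a `pages/` directory falls through to `"other"`.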
FILE:README.md
# Playwright Pro
> Production-grade Playwright testing toolkit for AI coding agents.
Generate tests, fix flaky failures, migrate from Cypress/Selenium, sync with TestRail, run on BrowserStack — all from your AI agent.
## Install
```bash
# Claude Code plugin
claude plugin install pw@claude-skills
# Or load directly
claude --plugin-dir ./engineering-team/playwright-pro
```
## Commands
| Command | What it does |
|---|---|
| `/pw:init` | Set up Playwright in your project — detects framework, generates config, CI, first test |
| `/pw:generate <spec>` | Generate tests from a user story, URL, or component name |
| `/pw:review` | Review existing tests for anti-patterns and coverage gaps |
| `/pw:fix <test>` | Diagnose and fix a failing or flaky test |
| `/pw:migrate` | Migrate from Cypress or Selenium to Playwright |
| `/pw:coverage` | Analyze what's tested vs. what's missing |
| `/pw:testrail` | Sync with TestRail — read cases, push results, create runs |
| `/pw:browserstack` | Run tests on BrowserStack, pull cross-browser reports |
| `/pw:report` | Generate a test report in your preferred format |
## Quick Start
```bash
# In Claude Code:
/pw:init # Set up Playwright
/pw:generate "user can log in" # Generate your first test
# Tests are auto-validated by hooks — no extra steps
```
## What's Inside
### 9 Skills
Slash commands that turn natural language into production-ready Playwright tests. Each skill leverages Claude Code's built-in capabilities (`/batch` for parallel work, `Explore` for codebase analysis, `/debug` for trace inspection).
### 3 Specialized Agents
- **test-architect** — Plans test strategy for complex applications
- **test-debugger** — Diagnoses flaky tests using a systematic taxonomy
- **migration-planner** — Creates file-by-file migration plans from Cypress/Selenium
### 55 Test Templates
Ready-to-use, parametrizable templates covering:
| Category | Count | Examples |
|---|---|---|
| Authentication | 8 | Login, logout, SSO, MFA, password reset, RBAC |
| CRUD | 6 | Create, read, update, delete, bulk ops |
| Checkout | 6 | Cart, payment, coupon, order history |
| Search | 5 | Basic search, filters, sorting, pagination |
| Forms | 6 | Multi-step, validation, file upload |
| Dashboard | 5 | Data loading, charts, export |
| Settings | 4 | Profile, password, notifications |
| Onboarding | 4 | Registration, email verify, welcome tour |
| Notifications | 3 | In-app, toast, notification center |
| API | 5 | REST CRUD, GraphQL, error handling |
| Accessibility | 3 | Keyboard nav, screen reader, contrast |
### 2 MCP Integrations
- **TestRail** — Read test cases, create runs, push pass/fail results
- **BrowserStack** — Trigger cross-browser runs, pull session reports with video/screenshots
### Smart Hooks
- Auto-validates test quality when you write `*.spec.ts` files
- Auto-detects Playwright projects on session start
- Zero configuration required
## Integrations Setup
### TestRail (Optional)
Set environment variables:
```bash
export TESTRAIL_URL="https://your-instance.testrail.io"
export TESTRAIL_USER="[email protected]"
export TESTRAIL_API_KEY="your-api-key"
```
Then use `/pw:testrail` to sync test cases and push results.
### BrowserStack (Optional)
```bash
export BROWSERSTACK_USERNAME="your-username"
export BROWSERSTACK_ACCESS_KEY="your-access-key"
```
Then use `/pw:browserstack` to run tests across browsers.
## Works With
| Agent | How |
|---|---|
| **Claude Code** | Full plugin — slash commands, MCP tools, hooks, agents |
| **Codex CLI** | Copy `CLAUDE.md` to your project root as `AGENTS.md` |
| **OpenClaw** | Use as a skill with `SKILL.md` entry point |
## Built-in Command Integration
Playwright Pro doesn't reinvent what your AI agent already does. It orchestrates built-in capabilities:
- `/pw:generate` uses Claude's `Explore` subagent to understand your codebase before generating tests
- `/pw:migrate` uses `/batch` for parallel file-by-file conversion on large test suites
- `/pw:fix` uses `/debug` for trace analysis alongside Playwright-specific diagnostics
- `/pw:review` extends `/review` with Playwright anti-pattern detection
## Reference
Based on battle-tested patterns from production test suites. Includes curated guidance on:
- Locator strategies and priority hierarchy
- Assertion patterns and auto-retry behavior
- Fixture architecture and composition
- Common pitfalls (top 20, ranked by frequency)
- Flaky test diagnosis taxonomy
## License
MIT
FILE:agents/migration-planner.md
---
name: migration-planner
description: >-
Analyzes Cypress or Selenium test suites and creates a file-by-file
migration plan. Invoked by /pw:migrate before conversion starts.
allowed-tools:
- Read
- Grep
- Glob
- LS
---
# Migration Planner Agent
You are a test migration specialist. Your job is to analyze an existing Cypress or Selenium test suite and create a detailed, ordered migration plan.
## Planning Protocol
### Step 1: Detect Source Framework
Scan the project:
**Cypress indicators:**
- `cypress/` directory
- `cypress.config.ts` or `cypress.config.js`
- `@cypress` packages in `package.json`
- `.cy.ts` or `.cy.js` test files
**Selenium indicators:**
- `selenium-webdriver` in dependencies
- `webdriver` or `wdio` in dependencies
- Test files importing `selenium-webdriver`
- `chromedriver` or `geckodriver` in dependencies
- Python files importing `selenium`
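
The indicator lists above could be checked roughly as follows. This is a sketch under stated assumptions: `ProjectSnapshot` is a hypothetical input shape (dependency names plus relative file paths), the bare `cypress` package is treated as an indicator alongside `@cypress/*`, and the two content-based indicators (files importing `selenium-webdriver` or Python's `selenium`) are left out because they require reading file contents.

```typescript
// Hypothetical input: dependency names from package.json plus relative file paths.
interface ProjectSnapshot {
  dependencies: string[];
  files: string[];
}

type SourceFramework = "cypress" | "selenium" | "unknown";

// cypress/ directory, cypress.config.ts|js, or *.cy.ts|js test files.
const CYPRESS_FILE = /(^|\/)cypress\/|(^|\/)cypress\.config\.(ts|js)$|\.cy\.(ts|js)$/;
// The cypress package itself (an assumption) or any @cypress/* package.
const CYPRESS_DEP = /^(cypress$|@cypress\/)/;
// Substring match so scoped packages such as @wdio/cli are caught too.
const SELENIUM_DEP = /selenium-webdriver|webdriver|wdio|chromedriver|geckodriver/;

function detectSourceFramework(project: ProjectSnapshot): SourceFramework {
  const hasCypress =
    project.files.some((f) => CYPRESS_FILE.test(f)) ||
    project.dependencies.some((d) => CYPRESS_DEP.test(d));
  if (hasCypress) return "cypress";

  const hasSelenium = project.dependencies.some((d) => SELENIUM_DEP.test(d));
  return hasSelenium ? "selenium" : "unknown";
}
```

Checking Cypress first is an arbitrary choice for this sketch; a project containing both frameworks would need the planner to report both.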
### Step 2: Inventory All Test Files
List every test file with:
- File path
- Number of tests (count `it()`, `test()`, or test methods)
- Dependencies (custom commands, page objects, fixtures)
- Complexity (simple/medium/complex based on lines and patterns)
```
## Test Inventory
| # | File | Tests | Dependencies | Complexity |
|---|---|---|---|---|
| 1 | cypress/e2e/login.cy.ts | 5 | login command | Simple |
| 2 | cypress/e2e/checkout.cy.ts | 12 | api helpers, fixtures | Complex |
| 3 | cypress/e2e/search.cy.ts | 8 | none | Medium |
```
### Step 3: Map Dependencies
Identify shared resources that need migration:
**Custom commands** (`cypress/support/commands.ts`):
- List each command and what it does
- Map to Playwright equivalent (fixture, helper function, or page object)
**Fixtures** (`cypress/fixtures/`):
- List data files
- Plan: copy to `test-data/` with any format adjustments
**Plugins** (`cypress/plugins/`):
- List plugin functionality
- Map to Playwright config options or fixtures
**Page Objects** (if used):
- List page object files
- Plan: convert API calls (minimal structural change)
**Support files** (`cypress/support/`):
- List setup/teardown logic
- Map to `playwright.config.ts` or `fixtures/`
### Step 4: Determine Migration Order
Order files by dependency graph:
1. **Shared resources first**: custom commands → fixtures, page objects → helpers
2. **Simple tests next**: files with no dependencies, few tests
3. **Complex tests last**: files with many dependencies, custom commands
```
## Migration Order
### Phase 1: Foundation (do first)
1. Convert custom commands → fixtures.ts
2. Copy fixtures → test-data/
3. Convert page objects (API changes only)
### Phase 2: Simple Tests (quick wins)
4. login.cy.ts → auth/login.spec.ts (5 tests, ~15 min)
5. about.cy.ts → static/about.spec.ts (2 tests, ~5 min)
### Phase 3: Complex Tests
6. checkout.cy.ts → checkout/checkout.spec.ts (12 tests, ~45 min)
7. search.cy.ts → search/search.spec.ts (8 tests, ~30 min)
```
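
One way to derive the test-file ordering for Phases 2 and 3 from the Step 2 inventory is a simple sort. The sketch below assumes a hypothetical `InventoryEntry` shape and a tie-break scheme (complexity, then dependency count, then test count) that is a judgment call rather than something the planner prescribes. Shared resources still go first in Phase 1 and are not part of this sort.

```typescript
type Complexity = "simple" | "medium" | "complex";

// Hypothetical shape mirroring the Test Inventory columns.
interface InventoryEntry {
  file: string;
  tests: number;
  dependencies: string[];
  complexity: Complexity;
}

const COMPLEXITY_RANK: Record<Complexity, number> = { simple: 0, medium: 1, complex: 2 };

// Simple files first, complex files last; ties go to the file with fewer
// dependencies, then to the file with fewer tests.
function orderForMigration(inventory: InventoryEntry[]): InventoryEntry[] {
  // slice() copies the array so the caller's inventory is not mutated by sort().
  return inventory.slice().sort(
    (a, b) =>
      COMPLEXITY_RANK[a.complexity] - COMPLEXITY_RANK[b.complexity] ||
      a.dependencies.length - b.dependencies.length ||
      a.tests - b.tests
  );
}
```

Applied to the three files from the sample inventory, this scheme places `login.cy.ts` first, `search.cy.ts` second, and `checkout.cy.ts` last.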
### Step 5: Estimate Effort
| Complexity | Time per test | Notes |
|---|---|---|
| Simple | 2-3 min | Direct API mapping |
| Medium | 5-10 min | Needs locator upgrade |
| Complex | 10-20 min | Custom commands, plugins, complex flows |
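
The table can be turned into a rough estimator by multiplying each per-test range by the number of tests in a file. This is a sketch, assuming effort scales linearly with test count; `estimateFile` and `estimateSuite` are hypothetical helper names.

```typescript
type Complexity = "simple" | "medium" | "complex";

interface MinuteRange {
  min: number;
  max: number;
}

// Per-test ranges copied from the table above.
const MINUTES_PER_TEST: Record<Complexity, MinuteRange> = {
  simple: { min: 2, max: 3 },
  medium: { min: 5, max: 10 },
  complex: { min: 10, max: 20 },
};

// Estimate one file by scaling the per-test range by its test count.
function estimateFile(complexity: Complexity, testCount: number): MinuteRange {
  const perTest = MINUTES_PER_TEST[complexity];
  return { min: perTest.min * testCount, max: perTest.max * testCount };
}

// Sum the per-file ranges to get a suite-level range.
function estimateSuite(files: { complexity: Complexity; tests: number }[]): MinuteRange {
  return files.reduce<MinuteRange>(
    (total, file) => {
      const range = estimateFile(file.complexity, file.tests);
      return { min: total.min + range.min, max: total.max + range.max };
    },
    { min: 0, max: 0 }
  );
}
```

Under these assumptions a simple file with 5 tests comes out at 10 to 15 minutes. Shared-resource work from Phase 1 is not covered by the per-test figures and would need its own estimate.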
### Step 6: Identify Risks
Flag tests that may need manual intervention:
- Tests using Cypress-only features (`cy.origin()`, `cy.session()`)
- Tests with complex `cy.intercept()` patterns
- Tests relying on Cypress retry-ability semantics
- Tests using Cypress plugins with no Playwright equivalent
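
A first pass over these risks can be a plain text scan of each Cypress file. The sketch below is deliberately coarse: `flagRisks` is a hypothetical helper, it flags every `cy.intercept()` call rather than judging which patterns are complex, and it cannot detect retry-ability assumptions or plugin usage, which still need manual review.

```typescript
interface RiskFlag {
  line: number; // 1-based line number in the scanned source
  feature: string;
  reason: string;
}

// Patterns have no `g` flag, so RegExp.test() keeps no state between calls.
const RISK_PATTERNS: { feature: string; pattern: RegExp; reason: string }[] = [
  { feature: "cy.origin()", pattern: /\bcy\.origin\(/, reason: "Cypress-only feature" },
  { feature: "cy.session()", pattern: /\bcy\.session\(/, reason: "Cypress-only feature" },
  { feature: "cy.intercept()", pattern: /\bcy\.intercept\(/, reason: "Review intercept pattern complexity" },
];

// Scan line by line and record every risky call together with its line number.
function flagRisks(source: string): RiskFlag[] {
  const flags: RiskFlag[] = [];
  source.split("\n").forEach((text, index) => {
    for (const risk of RISK_PATTERNS) {
      if (risk.pattern.test(text)) {
        flags.push({ line: index + 1, feature: risk.feature, reason: risk.reason });
      }
    }
  });
  return flags;
}
```

The returned flags can be attached to the matching inventory rows so that `/pw:migrate` knows which files to convert with extra care.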
### Step 7: Return Plan
Return the complete migration plan to `/pw:migrate` for execution.
FILE:agents/test-architect.md
---
name: test-architect
description: >-
Plans test strategy for complex applications. Invoked by /pw:generate and
/pw:coverage when the app has multiple routes, complex state, or requires
a structured test plan before writing tests.
allowed-tools:
- Read
- Grep
- Glob
- LS
---
# Test Architect Agent
You are a test architecture specialist. Your job is to analyze an application's structure and create a comprehensive test plan before any tests are written.
## Your Responsibilities
1. **Map the application surface**: routes, components, API endpoints, user flows
2. **Identify critical paths**: the flows that, if broken, cause revenue loss or user churn
3. **Design test structure**: folder organization, fixture strategy, data management
4. **Prioritize**: which tests deliver the most confidence per effort
5. **Select patterns**: which template or approach fits each test scenario
## How You Work
You are a read-only agent. You analyze and plan — you do not write test files.
### Step 1: Scan the Codebase
- Read route definitions (Next.js `app/`, React Router, Vue Router, Angular routes)
- Read `package.json` for framework and dependencies
- Check for existing tests and their patterns
- Identify state management (Redux, Zustand, Pinia, etc.)
- Check for API layer (REST, GraphQL, tRPC)
### Step 2: Catalog Testable Surfaces
Create a structured inventory:
```
## Application Surface
### Pages (by priority)
1. /login — Auth entry point [CRITICAL]
2. /dashboard — Main user view [CRITICAL]
3. /settings — User preferences [HIGH]
4. /admin — Admin panel [HIGH]
5. /about — Static page [LOW]
### Interactive Components
1. SearchBar — complex state, debounced API calls
2. DataTable — sorting, filtering, pagination
3. FileUploader — drag-drop, progress, error handling
### API Endpoints
1. POST /api/auth/login — authentication
2. GET /api/users — user list with pagination
3. PUT /api/users/:id — user update
### User Flows (multi-page)
1. Registration → Email Verify → Onboarding → Dashboard
2. Search → Filter → Select → Add to Cart → Checkout → Confirm
```
### Step 3: Design Test Plan
```
## Test Plan
### Folder Structure
e2e/
├── auth/ # Authentication tests
├── dashboard/ # Dashboard tests
├── checkout/ # Checkout flow tests
├── fixtures/ # Shared fixtures
├── pages/ # Page object models
└── test-data/ # Test data files
### Fixture Strategy
- Auth fixture: shared `storageState` for logged-in tests
- API fixture: request context for data seeding
- Data fixture: factory functions for test entities
### Test Distribution
| Area | Tests | Template | Effort |
|---|---|---|---|
| Auth | 8 | auth/* | 1h |
| Dashboard | 6 | dashboard/* | 1h |
| Checkout | 10 | checkout/* | 2h |
| Search | 5 | search/* | 45m |
| Settings | 4 | settings/* | 30m |
| API | 5 | api/* | 45m |
### Priority Order
1. Auth (blocks everything else)
2. Core user flow (the main thing users do)
3. Payment/checkout (revenue-critical)
4. Everything else
```
### Step 4: Return Plan
Return the complete plan to the calling skill. Do not write files.
FILE:agents/test-debugger.md
---
name: test-debugger
description: >-
Diagnoses flaky or failing Playwright tests using systematic taxonomy.
Invoked by /pw:fix when a test needs deep analysis including running
tests, reading traces, and identifying root causes.
allowed-tools:
- Read
- Grep
- Glob
- LS
- Bash
---
# Test Debugger Agent
You are a Playwright test debugging specialist. Your job is to systematically diagnose why a test fails or behaves flakily, identify the root cause category, and return a specific fix.
## Debugging Protocol
### Step 1: Read the Test
Read the test file and understand:
- What behavior it's testing
- Which pages/URLs it visits
- Which locators it uses
- Which assertions it makes
- Any setup/teardown (fixtures, beforeEach)
### Step 2: Run the Test
Run it multiple ways to classify the failure:
```bash
# Single run — get the error
npx playwright test <file> --grep "<test name>" --reporter=list 2>&1
# Burn-in — expose timing issues
npx playwright test <file> --grep "<test name>" --repeat-each=10 --reporter=list 2>&1
# Isolation check — expose state leaks
npx playwright test <file> --grep "<test name>" --workers=1 --reporter=list 2>&1
# Full suite — expose interaction
npx playwright test --reporter=list 2>&1
```
### Step 3: Capture Trace
```bash
npx playwright test <file> --grep "<test name>" --trace=on --retries=0 2>&1
```
Read the trace output for:
- Network requests that failed or were slow
- Elements that weren't visible when expected
- Navigation timing issues
- Console errors
### Step 4: Classify
| Category | Evidence |
|---|---|
| **Timing/Async** | Fails on `--repeat-each=10`; error mentions timeout or element not found intermittently |
| **Test Isolation** | Passes alone (`--workers=1 --grep`), fails in full suite |
| **Environment** | Passes locally, fails in CI (check viewport, fonts, timezone) |
| **Infrastructure** | Random crash errors, OOM, browser process killed |
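
The classification table can be read as a decision procedure over what the Step 2 runs showed. The sketch below assumes a hypothetical `RunObservations` record filled in by hand or by a script. The order in which the categories are checked (crashes first, then isolation, timing, environment) is an assumption of this sketch, not something the table specifies.

```typescript
type FailureCategory = "timing" | "isolation" | "environment" | "infrastructure" | "unclassified";

// Hypothetical summary of the evidence gathered in Step 2.
interface RunObservations {
  crashedOrOom: boolean;      // random crash errors, OOM, browser process killed
  failsOnRepeatEach: boolean; // intermittent failures under --repeat-each=10
  passesAlone: boolean;       // passes with --workers=1 --grep
  failsInFullSuite: boolean;  // fails when the whole suite runs
  passesLocally: boolean;
  failsInCi: boolean;
}

function classifyFailure(o: RunObservations): FailureCategory {
  // Crashes make the other signals unreliable, so they are checked first.
  if (o.crashedOrOom) return "infrastructure";
  if (o.passesAlone && o.failsInFullSuite) return "isolation";
  if (o.failsOnRepeatEach) return "timing";
  if (o.passesLocally && o.failsInCi) return "environment";
  return "unclassified";
}
```

An `"unclassified"` result means the evidence so far matches no row of the table, which is a cue to capture a trace and look again before proposing a fix.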
### Step 5: Identify Specific Cause
Common root causes per category:
**Timing:**
- Missing `await` on a Playwright call
- `waitForTimeout()` that's too short
- Clicking before element is actionable
- Asserting before data loads
- Animation interference
**Isolation:**
- Global variable shared between tests
- Database not cleaned between tests
- localStorage/cookies leaking
- Test creates data with non-unique identifier
**Environment:**
- Different viewport size in CI
- Font rendering differences affect screenshots
- Timezone affects date assertions
- Network latency in CI is higher
**Infrastructure:**
- Browser runs out of memory with too many workers
- File system race condition
- DNS resolution failure
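As an illustration of fixing one of the isolation causes listed above (test data created with a non-unique identifier), the following is a minimal sketch of a name generator. The `uniqueName` helper is hypothetical and not part of Playwright; it assumes that a per-process counter combined with a timestamp and a random suffix is unique enough for test data.

```typescript
// Hypothetical helper: builds identifiers that should not collide across tests or workers.
let sequence = 0;

function uniqueName(prefix: string): string {
  sequence += 1; // guarantees distinct values within one worker process
  const stamp = Date.now().toString(36); // separates runs over time
  const random = Math.random().toString(36).slice(2, 10); // separates parallel workers (probabilistic)
  return `${prefix}-${stamp}-${sequence}-${random}`;
}

// Each call yields a different value, so two tests never fight over the same record.
const firstUser = uniqueName('user');
const secondUser = uniqueName('user');
console.log(firstUser !== secondUser); // prints "true"
```

Uniqueness across workers is only probabilistic here. If the backend enforces strict uniqueness, a server-generated ID is the safer choice.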
### Step 6: Return Diagnosis
Return to the calling skill:
```
## Diagnosis
**Category:** Timing/Async
**Root Cause:** Missing await on line 23 — `page.goto('/dashboard')` runs without
waiting, so the assertion on line 24 runs before navigation completes.
**Evidence:** Fails 3/10 times on `--repeat-each=10`. Trace shows assertion firing
before navigation response received.
## Fix
Line 23: Add `await` before `page.goto('/dashboard')`
## Verification
After fix: 10/10 passes on `--repeat-each=10`
```
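If the diagnosis is assembled programmatically, the template above could be produced from a typed record. This is a sketch under stated assumptions: the `Diagnosis` interface and `renderDiagnosis` function are hypothetical names, and the field values simply repeat the example above.

```typescript
// Hypothetical record mirroring the sections of the diagnosis template.
interface Diagnosis {
  category: 'Timing/Async' | 'Test Isolation' | 'Environment' | 'Infrastructure';
  rootCause: string;
  evidence: string;
  fix: string;
  verification: string;
}

// Renders the record in the same heading order as the template.
function renderDiagnosis(d: Diagnosis): string {
  return [
    '## Diagnosis',
    `**Category:** ${d.category}`,
    `**Root Cause:** ${d.rootCause}`,
    `**Evidence:** ${d.evidence}`,
    '## Fix',
    d.fix,
    '## Verification',
    d.verification,
  ].join('\n');
}

const report = renderDiagnosis({
  category: 'Timing/Async',
  rootCause: "Missing await on line 23: page.goto('/dashboard') runs without waiting.",
  evidence: 'Fails 3/10 times on --repeat-each=10.',
  fix: "Line 23: add await before page.goto('/dashboard')",
  verification: 'After fix: 10/10 passes on --repeat-each=10',
});
console.log(report);
```

Typing `category` as a union keeps the report limited to the four categories from Step 4.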
FILE:hooks/detect-playwright.sh
#!/usr/bin/env bash
# Session start hook: detects if the project uses Playwright.
# Outputs context hint for Claude if playwright.config exists.
set -euo pipefail
# Check for a Playwright config in the current directory
PW_CONFIG=""
for config in playwright.config.ts playwright.config.js playwright.config.mjs; do
if [[ -f "$config" ]]; then
PW_CONFIG="$config"
break
fi
done
if [[ -z "$PW_CONFIG" ]]; then
exit 0
fi
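# Count existing test files, skipping node_modules.
# Filtering inside find avoids a "no match" exit from grep aborting the script under pipefail.
TEST_COUNT=$(find . \( -name "*.spec.ts" -o -name "*.spec.js" -o -name "*.test.ts" -o -name "*.test.js" \) ! -path "*/node_modules/*" 2>/dev/null | wc -l | tr -d ' ')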
echo "🎭 Playwright detected ($PW_CONFIG) — $TEST_COUNT test files found. Use /pw: commands for testing workflows."
FILE:hooks/hooks.json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/validate-test.sh"
}
]
}
],
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "bash ${CLAUDE_PLUGIN_ROOT}/hooks/detect-playwright.sh"
}
]
}
]
}
}
FILE:hooks/validate-test.sh
#!/usr/bin/env bash
# Post-write hook: validates Playwright test files for common anti-patterns.
# Runs silently — only outputs warnings if issues found.
# Input: JSON on stdin with tool_input.file_path
set -euo pipefail
# Read the file path from stdin JSON
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | python3 -c "
import sys, json
try:
    data = json.load(sys.stdin)
    print(data.get('tool_input', {}).get('file_path', ''))
except:
    print('')
" 2>/dev/null || echo "")
# Only check .spec/.test files with a .ts, .js, or .mjs extension
if [[ ! "$FILE_PATH" =~ \.(spec|test)\.(ts|js|mjs)$ ]]; then
exit 0
fi
# Check if file exists
if [[ ! -f "$FILE_PATH" ]]; then
exit 0
fi
WARNINGS=""
# Check for waitForTimeout
if grep -n 'waitForTimeout' "$FILE_PATH" >/dev/null 2>&1; then
LINES=$(grep -n 'waitForTimeout' "$FILE_PATH" | head -3)
WARNINGS="${WARNINGS}\n⚠️ waitForTimeout() found — use web-first assertions instead:\n${LINES}\n"
fi
# Check for non-web-first assertions
if grep -n 'expect(await ' "$FILE_PATH" >/dev/null 2>&1; then
LINES=$(grep -n 'expect(await ' "$FILE_PATH" | head -3)
WARNINGS="${WARNINGS}\n⚠️ Non-web-first assertion — use expect(locator) instead:\n${LINES}\n"
fi
# Check for hardcoded localhost URLs
if grep -n "http://localhost\|https://localhost\|http://127.0.0.1" "$FILE_PATH" >/dev/null 2>&1; then
LINES=$(grep -n "http://localhost\|https://localhost\|http://127.0.0.1" "$FILE_PATH" | head -3)
WARNINGS="${WARNINGS}\n⚠️ Hardcoded URL — use baseURL from config:\n${LINES}\n"
fi
# Check for page.$() usage
if grep -n 'page\.\$(' "$FILE_PATH" >/dev/null 2>&1; then
LINES=$(grep -n 'page\.\$(' "$FILE_PATH" | head -3)
WARNINGS="${WARNINGS}\n⚠️ page.\$() is deprecated — use page.locator() or getByRole():\n${LINES}\n"
fi
# Output warnings if any found
if [[ -n "$WARNINGS" ]]; then
echo -e "\n🎭 Playwright Pro — Test Validation${WARNINGS}"
fi
FILE:integrations/browserstack-mcp/package.json
{
"name": "@pw/browserstack-mcp",
"version": "1.0.0",
"description": "MCP server for BrowserStack integration with Playwright Pro",
"type": "module",
"main": "src/index.ts",
"scripts": {
"start": "tsx src/index.ts",
"build": "tsc"
},
"dependencies": {
"@modelcontextprotocol/sdk": "^1.0.0"
},
"devDependencies": {
"tsx": "^4.0.0",
"typescript": "^5.0.0"
}
}
FILE:integrations/browserstack-mcp/src/client.ts
import type {
BrowserStackConfig,
BrowserStackPlan,
BrowserStackBrowser,
BrowserStackBuild,
BrowserStackSession,
BrowserStackSessionUpdate,
} from './types.js';
export class BrowserStackClient {
private readonly baseUrl = 'https://api.browserstack.com';
private readonly headers: Record<string, string>;
constructor(config: BrowserStackConfig) {
const auth = Buffer.from(`${config.username}:${config.accessKey}`).toString('base64');
this.headers = {
Authorization: `Basic ${auth}`,
'Content-Type': 'application/json',
};
}
private async request<T>(
method: string,
endpoint: string,
body?: unknown,
): Promise<T> {
const url = `${this.baseUrl}${endpoint}`;
const options: RequestInit = {
method,
headers: this.headers,
};
if (body) {
options.body = JSON.stringify(body);
}
const response = await fetch(url, options);
if (!response.ok) {
const errorText = await response.text();
throw new Error(
`BrowserStack API error ${response.status}: ${errorText}`,
);
}
return response.json() as Promise<T>;
}
async getPlan(): Promise<BrowserStackPlan> {
return this.request<BrowserStackPlan>('GET', '/automate/plan.json');
}
async getBrowsers(): Promise<BrowserStackBrowser[]> {
return this.request<BrowserStackBrowser[]>('GET', '/automate/browsers.json');
}
async getBuilds(limit?: number, status?: string): Promise<BrowserStackBuild[]> {
let endpoint = '/automate/builds.json';
const params: string[] = [];
if (limit) params.push(`limit=${limit}`);
if (status) params.push(`status=${status}`);
if (params.length > 0) endpoint += `?${params.join('&')}`;
return this.request<BrowserStackBuild[]>('GET', endpoint);
}
async getSessions(buildId: string, limit?: number): Promise<BrowserStackSession[]> {
let endpoint = `/automate/builds/${buildId}/sessions.json`;
if (limit) endpoint += `?limit=${limit}`;
return this.request<BrowserStackSession[]>('GET', endpoint);
}
async getSession(sessionId: string): Promise<BrowserStackSession> {
return this.request<BrowserStackSession>(
'GET',
`/automate/sessions/${sessionId}.json`,
);
}
async updateSession(
sessionId: string,
update: BrowserStackSessionUpdate,
): Promise<BrowserStackSession> {
return this.request<BrowserStackSession>(
'PUT',
`/automate/sessions/${sessionId}.json`,
update,
);
}
async getSessionLogs(sessionId: string): Promise<string> {
const url = `${this.baseUrl}/automate/sessions/${sessionId}/logs`;
const response = await fetch(url, { headers: this.headers });
if (!response.ok) {
throw new Error(`BrowserStack logs error ${response.status}`);
}
return response.text();
}
}
FILE:integrations/browserstack-mcp/src/index.ts
#!/usr/bin/env npx tsx
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { BrowserStackClient } from './client.js';
import type { BrowserStackSessionUpdate } from './types.js';
const config = {
username: process.env.BROWSERSTACK_USERNAME ?? '',
accessKey: process.env.BROWSERSTACK_ACCESS_KEY ?? '',
};
if (!config.username || !config.accessKey) {
console.error(
'Missing BrowserStack configuration. Set BROWSERSTACK_USERNAME and BROWSERSTACK_ACCESS_KEY.',
);
process.exit(1);
}
const client = new BrowserStackClient(config);
const server = new Server(
{ name: 'pw-browserstack', version: '1.0.0' },
{ capabilities: { tools: {} } },
);
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: 'browserstack_get_plan',
description: 'Get BrowserStack Automate plan details including parallel session limits',
inputSchema: { type: 'object', properties: {} },
},
{
name: 'browserstack_get_browsers',
description: 'List all available browser and OS combinations for Playwright testing',
inputSchema: { type: 'object', properties: {} },
},
{
name: 'browserstack_get_builds',
description: 'List recent test builds with status',
inputSchema: {
type: 'object',
properties: {
limit: { type: 'number', description: 'Max builds to return (default 10)' },
status: {
type: 'string',
enum: ['running', 'done', 'failed', 'timeout'],
description: 'Filter by status',
},
},
},
},
{
name: 'browserstack_get_sessions',
description: 'List test sessions within a build',
inputSchema: {
type: 'object',
properties: {
build_id: { type: 'string', description: 'Build hashed ID' },
limit: { type: 'number', description: 'Max sessions to return' },
},
required: ['build_id'],
},
},
{
name: 'browserstack_get_session',
description: 'Get detailed session info including video URL, logs, and screenshots',
inputSchema: {
type: 'object',
properties: {
session_id: { type: 'string', description: 'Session hashed ID' },
},
required: ['session_id'],
},
},
{
name: 'browserstack_update_session',
description: 'Update session status (mark as passed/failed) and name',
inputSchema: {
type: 'object',
properties: {
session_id: { type: 'string', description: 'Session hashed ID' },
status: {
type: 'string',
enum: ['passed', 'failed'],
description: 'Test result status',
},
name: { type: 'string', description: 'Updated session name' },
reason: { type: 'string', description: 'Reason for failure' },
},
required: ['session_id'],
},
},
{
name: 'browserstack_get_logs',
description: 'Get text logs for a specific test session',
inputSchema: {
type: 'object',
properties: {
session_id: { type: 'string', description: 'Session hashed ID' },
},
required: ['session_id'],
},
},
],
}));
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
try {
switch (name) {
case 'browserstack_get_plan': {
const plan = await client.getPlan();
return { content: [{ type: 'text', text: JSON.stringify(plan, null, 2) }] };
}
case 'browserstack_get_browsers': {
const browsers = await client.getBrowsers();
const playwrightBrowsers = browsers.filter(
(b) =>
['chrome', 'firefox', 'playwright-chromium', 'playwright-firefox', 'playwright-webkit'].includes(
b.browser?.toLowerCase() ?? '',
) || b.browser?.toLowerCase().includes('playwright'),
);
const summary = playwrightBrowsers.length > 0 ? playwrightBrowsers : browsers.slice(0, 50);
return { content: [{ type: 'text', text: JSON.stringify(summary, null, 2) }] };
}
case 'browserstack_get_builds': {
const builds = await client.getBuilds(
(args?.limit as number) ?? 10,
args?.status as string | undefined,
);
return { content: [{ type: 'text', text: JSON.stringify(builds, null, 2) }] };
}
case 'browserstack_get_sessions': {
const sessions = await client.getSessions(
args!.build_id as string,
args?.limit as number | undefined,
);
return { content: [{ type: 'text', text: JSON.stringify(sessions, null, 2) }] };
}
case 'browserstack_get_session': {
const session = await client.getSession(args!.session_id as string);
return { content: [{ type: 'text', text: JSON.stringify(session, null, 2) }] };
}
case 'browserstack_update_session': {
const update: BrowserStackSessionUpdate = {};
if (args?.status) update.status = args.status as 'passed' | 'failed';
if (args?.name) update.name = args.name as string;
if (args?.reason) update.reason = args.reason as string;
const updated = await client.updateSession(args!.session_id as string, update);
return { content: [{ type: 'text', text: JSON.stringify(updated, null, 2) }] };
}
case 'browserstack_get_logs': {
const logs = await client.getSessionLogs(args!.session_id as string);
return { content: [{ type: 'text', text: logs }] };
}
default:
return { content: [{ type: 'text', text: `Unknown tool: ${name}` }], isError: true };
}
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
return { content: [{ type: 'text', text: `Error: ${message}` }], isError: true };
}
});
async function main() {
const transport = new StdioServerTransport();
await server.connect(transport);
}
main().catch(console.error);
FILE:integrations/browserstack-mcp/src/types.ts
export interface BrowserStackConfig {
username: string;
accessKey: string;
}
export interface BrowserStackPlan {
automate_plan: string;
parallel_sessions_running: number;
team_parallel_sessions_max_allowed: number;
parallel_sessions_max_allowed: number;
queued_sessions: number;
queued_sessions_max_allowed: number;
}
export interface BrowserStackBrowser {
os: string;
os_version: string;
browser: string;
browser_version: string;
device: string | null;
real_mobile: boolean | null;
}
export interface BrowserStackBuild {
automation_build: {
name: string;
hashed_id: string;
duration: number;
status: string;
build_tag: string | null;
};
}
export interface BrowserStackSession {
automation_session: {
name: string;
duration: number;
os: string;
os_version: string;
browser_version: string;
browser: string;
device: string | null;
status: string;
hashed_id: string;
reason: string;
build_name: string;
project_name: string;
logs: string;
browser_url: string;
public_url: string;
video_url: string;
browser_console_logs_url: string;
har_logs_url: string;
};
}
export interface BrowserStackSessionUpdate {
name?: string;
status?: 'passed' | 'failed';
reason?: string;
}
FILE:integrations/browserstack-mcp/tsconfig.json
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"esModuleInterop": true,
"strict": true,
"outDir": "dist",
"rootDir": "src",
"declaration": true,
"skipLibCheck": true
},
"include": ["src/**/*"]
}
FILE:integrations/testrail-mcp/package.json
{
"name": "@pw/testrail-mcp",
"version": "1.0.0",
"description": "MCP server for TestRail integration with Playwright Pro",
"type": "module",
"main": "src/index.ts",
"scripts": {
"start": "tsx src/index.ts",
"build": "tsc"
},
"dependencies": {
"@modelcontextprotocol/sdk": "^1.0.0"
},
"devDependencies": {
"tsx": "^4.0.0",
"typescript": "^5.0.0"
}
}
FILE:integrations/testrail-mcp/src/client.ts
import type {
TestRailConfig,
TestRailProject,
TestRailSuite,
TestRailCase,
TestRailCasePayload,
TestRailRun,
TestRailRunPayload,
TestRailResult,
TestRailResultPayload,
} from './types.js';
export class TestRailClient {
private readonly baseUrl: string;
private readonly headers: Record<string, string>;
constructor(config: TestRailConfig) {
this.baseUrl = config.url.replace(/\/+$/, '');
const auth = Buffer.from(`${config.user}:${config.apiKey}`).toString('base64');
this.headers = {
Authorization: `Basic ${auth}`,
'Content-Type': 'application/json',
};
}
private async request<T>(
method: string,
endpoint: string,
body?: unknown,
): Promise<T> {
const url = `${this.baseUrl}/index.php?/api/v2/${endpoint}`;
const options: RequestInit = {
method,
headers: this.headers,
};
if (body) {
options.body = JSON.stringify(body);
}
const response = await fetch(url, options);
if (!response.ok) {
const errorText = await response.text();
throw new Error(
`TestRail API error ${response.status}: ${errorText}`,
);
}
return response.json() as Promise<T>;
}
async getProjects(): Promise<TestRailProject[]> {
const result = await this.request<{ projects: TestRailProject[] }>(
'GET',
'get_projects',
);
return result.projects ?? result as unknown as TestRailProject[];
}
async getSuites(projectId: number): Promise<TestRailSuite[]> {
return this.request<TestRailSuite[]>('GET', `get_suites/${projectId}`);
}
async getCases(
projectId: number,
suiteId?: number,
sectionId?: number,
limit?: number,
offset?: number,
filter?: string,
): Promise<TestRailCase[]> {
let endpoint = `get_cases/${projectId}`;
const params: string[] = [];
if (suiteId) params.push(`suite_id=${suiteId}`);
if (sectionId) params.push(`section_id=${sectionId}`);
if (limit) params.push(`limit=${limit}`);
if (offset) params.push(`offset=${offset}`);
if (filter) params.push(`filter=${encodeURIComponent(filter)}`);
// The base URL already contains "?" (index.php?/api/v2/...), so extra parameters are joined with "&".
if (params.length > 0) endpoint += `&${params.join('&')}`;
const result = await this.request<{ cases: TestRailCase[] }>(
'GET',
endpoint,
);
return result.cases ?? result as unknown as TestRailCase[];
}
async addCase(
sectionId: number,
payload: TestRailCasePayload,
): Promise<TestRailCase> {
return this.request<TestRailCase>(
'POST',
`add_case/${sectionId}`,
payload,
);
}
async updateCase(
caseId: number,
payload: Partial<TestRailCasePayload>,
): Promise<TestRailCase> {
return this.request<TestRailCase>(
'POST',
`update_case/${caseId}`,
payload,
);
}
async addRun(
projectId: number,
payload: TestRailRunPayload,
): Promise<TestRailRun> {
return this.request<TestRailRun>(
'POST',
`add_run/${projectId}`,
payload,
);
}
async addResultForCase(
runId: number,
caseId: number,
payload: TestRailResultPayload,
): Promise<TestRailResult> {
return this.request<TestRailResult>(
'POST',
`add_result_for_case/${runId}/${caseId}`,
payload,
);
}
async getResultsForCase(
runId: number,
caseId: number,
limit?: number,
): Promise<TestRailResult[]> {
let endpoint = `get_results_for_case/${runId}/${caseId}`;
if (limit) endpoint += `&limit=${limit}`;
const result = await this.request<{ results: TestRailResult[] }>(
'GET',
endpoint,
);
return result.results ?? result as unknown as TestRailResult[];
}
}
FILE:integrations/testrail-mcp/src/index.ts
#!/usr/bin/env npx tsx
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { TestRailClient } from './client.js';
import type { TestRailCasePayload, TestRailRunPayload, TestRailResultPayload } from './types.js';
const config = {
url: process.env.TESTRAIL_URL ?? '',
user: process.env.TESTRAIL_USER ?? '',
apiKey: process.env.TESTRAIL_API_KEY ?? '',
};
if (!config.url || !config.user || !config.apiKey) {
console.error(
'Missing TestRail configuration. Set TESTRAIL_URL, TESTRAIL_USER, and TESTRAIL_API_KEY.',
);
process.exit(1);
}
const client = new TestRailClient(config);
const server = new Server(
{ name: 'pw-testrail', version: '1.0.0' },
{ capabilities: { tools: {} } },
);
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: 'testrail_get_projects',
description: 'List all TestRail projects',
inputSchema: { type: 'object', properties: {} },
},
{
name: 'testrail_get_suites',
description: 'List test suites in a project',
inputSchema: {
type: 'object',
properties: {
project_id: { type: 'number', description: 'Project ID' },
},
required: ['project_id'],
},
},
{
name: 'testrail_get_cases',
description: 'Get test cases from a project. Supports filtering by suite, section, and search text.',
inputSchema: {
type: 'object',
properties: {
project_id: { type: 'number', description: 'Project ID' },
suite_id: { type: 'number', description: 'Suite ID (optional)' },
section_id: { type: 'number', description: 'Section ID (optional)' },
limit: { type: 'number', description: 'Max results (default 250)' },
offset: { type: 'number', description: 'Offset for pagination' },
filter: { type: 'string', description: 'Search text filter' },
},
required: ['project_id'],
},
},
{
name: 'testrail_add_case',
description: 'Create a new test case in a section',
inputSchema: {
type: 'object',
properties: {
section_id: { type: 'number', description: 'Section ID to add the case to' },
title: { type: 'string', description: 'Test case title' },
template_id: { type: 'number', description: 'Template ID (2 = Test Case Steps)' },
priority_id: { type: 'number', description: 'Priority (1=Low, 2=Medium, 3=High, 4=Critical)' },
custom_preconds: { type: 'string', description: 'Preconditions text' },
custom_steps_separated: {
type: 'array',
items: {
type: 'object',
properties: {
content: { type: 'string', description: 'Step action' },
expected: { type: 'string', description: 'Expected result' },
},
},
description: 'Test steps with expected results',
},
},
required: ['section_id', 'title'],
},
},
{
name: 'testrail_update_case',
description: 'Update an existing test case',
inputSchema: {
type: 'object',
properties: {
case_id: { type: 'number', description: 'Case ID to update' },
title: { type: 'string', description: 'Updated title' },
custom_preconds: { type: 'string', description: 'Updated preconditions' },
custom_steps_separated: {
type: 'array',
items: {
type: 'object',
properties: {
content: { type: 'string' },
expected: { type: 'string' },
},
},
description: 'Updated test steps',
},
},
required: ['case_id'],
},
},
{
name: 'testrail_add_run',
description: 'Create a new test run in a project',
inputSchema: {
type: 'object',
properties: {
project_id: { type: 'number', description: 'Project ID' },
name: { type: 'string', description: 'Run name' },
description: { type: 'string', description: 'Run description' },
suite_id: { type: 'number', description: 'Suite ID' },
include_all: { type: 'boolean', description: 'Include all cases (default true)' },
case_ids: {
type: 'array',
items: { type: 'number' },
description: 'Specific case IDs to include (if include_all is false)',
},
},
required: ['project_id', 'name'],
},
},
{
name: 'testrail_add_result',
description: 'Add a test result for a specific case in a run',
inputSchema: {
type: 'object',
properties: {
run_id: { type: 'number', description: 'Run ID' },
case_id: { type: 'number', description: 'Case ID' },
status_id: {
type: 'number',
description: 'Status: 1=Passed, 2=Blocked, 3=Untested, 4=Retest, 5=Failed',
},
comment: { type: 'string', description: 'Result comment or error message' },
elapsed: { type: 'string', description: 'Time spent (e.g., "30s", "1m 45s")' },
defects: { type: 'string', description: 'Defect IDs (comma-separated)' },
},
required: ['run_id', 'case_id', 'status_id'],
},
},
{
name: 'testrail_get_results',
description: 'Get historical results for a test case in a run',
inputSchema: {
type: 'object',
properties: {
run_id: { type: 'number', description: 'Run ID' },
case_id: { type: 'number', description: 'Case ID' },
limit: { type: 'number', description: 'Max results to return' },
},
required: ['run_id', 'case_id'],
},
},
],
}));
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
try {
switch (name) {
case 'testrail_get_projects': {
const projects = await client.getProjects();
return { content: [{ type: 'text', text: JSON.stringify(projects, null, 2) }] };
}
case 'testrail_get_suites': {
const suites = await client.getSuites(args!.project_id as number);
return { content: [{ type: 'text', text: JSON.stringify(suites, null, 2) }] };
}
case 'testrail_get_cases': {
const cases = await client.getCases(
args!.project_id as number,
args?.suite_id as number | undefined,
args?.section_id as number | undefined,
args?.limit as number | undefined,
args?.offset as number | undefined,
args?.filter as string | undefined,
);
return { content: [{ type: 'text', text: JSON.stringify(cases, null, 2) }] };
}
case 'testrail_add_case': {
const payload: TestRailCasePayload = {
title: args!.title as string,
template_id: args?.template_id as number | undefined,
priority_id: args?.priority_id as number | undefined,
custom_preconds: args?.custom_preconds as string | undefined,
custom_steps_separated: args?.custom_steps_separated as TestRailCasePayload['custom_steps_separated'],
};
const newCase = await client.addCase(args!.section_id as number, payload);
return { content: [{ type: 'text', text: JSON.stringify(newCase, null, 2) }] };
}
case 'testrail_update_case': {
const updatePayload: Partial<TestRailCasePayload> = {};
if (args?.title) updatePayload.title = args.title as string;
if (args?.custom_preconds) updatePayload.custom_preconds = args.custom_preconds as string;
if (args?.custom_steps_separated) {
updatePayload.custom_steps_separated = args.custom_steps_separated as TestRailCasePayload['custom_steps_separated'];
}
const updated = await client.updateCase(args!.case_id as number, updatePayload);
return { content: [{ type: 'text', text: JSON.stringify(updated, null, 2) }] };
}
case 'testrail_add_run': {
const runPayload: TestRailRunPayload = {
name: args!.name as string,
description: args?.description as string | undefined,
suite_id: args?.suite_id as number | undefined,
include_all: (args?.include_all as boolean) ?? true,
case_ids: args?.case_ids as number[] | undefined,
};
const run = await client.addRun(args!.project_id as number, runPayload);
return { content: [{ type: 'text', text: JSON.stringify(run, null, 2) }] };
}
case 'testrail_add_result': {
const resultPayload: TestRailResultPayload = {
status_id: args!.status_id as number,
comment: args?.comment as string | undefined,
elapsed: args?.elapsed as string | undefined,
defects: args?.defects as string | undefined,
};
const result = await client.addResultForCase(
args!.run_id as number,
args!.case_id as number,
resultPayload,
);
return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] };
}
case 'testrail_get_results': {
const results = await client.getResultsForCase(
args!.run_id as number,
args!.case_id as number,
args?.limit as number | undefined,
);
return { content: [{ type: 'text', text: JSON.stringify(results, null, 2) }] };
}
default:
return { content: [{ type: 'text', text: `Unknown tool: ${name}` }], isError: true };
}
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
return { content: [{ type: 'text', text: `Error: ${message}` }], isError: true };
}
});
async function main() {
const transport = new StdioServerTransport();
await server.connect(transport);
}
main().catch(console.error);
FILE:integrations/testrail-mcp/src/types.ts
export interface TestRailConfig {
url: string;
user: string;
apiKey: string;
}
export interface TestRailProject {
id: number;
name: string;
announcement: string;
is_completed: boolean;
suite_mode: number;
url: string;
}
export interface TestRailSuite {
id: number;
name: string;
description: string | null;
project_id: number;
url: string;
}
export interface TestRailSection {
id: number;
suite_id: number;
name: string;
description: string | null;
parent_id: number | null;
depth: number;
}
export interface TestRailCaseStep {
content: string;
expected: string;
}
export interface TestRailCase {
id: number;
title: string;
section_id: number;
template_id: number;
type_id: number;
priority_id: number;
estimate: string | null;
refs: string | null;
custom_preconds: string | null;
custom_steps_separated: TestRailCaseStep[] | null;
custom_steps: string | null;
custom_expected: string | null;
}
export interface TestRailRun {
id: number;
suite_id: number;
name: string;
description: string | null;
assignedto_id: number | null;
include_all: boolean;
is_completed: boolean;
passed_count: number;
failed_count: number;
untested_count: number;
url: string;
}
export interface TestRailResult {
id: number;
test_id: number;
status_id: number;
comment: string | null;
created_on: number;
elapsed: string | null;
defects: string | null;
}
export interface TestRailResultPayload {
status_id: number;
comment?: string;
elapsed?: string;
defects?: string;
}
export interface TestRailRunPayload {
suite_id?: number;
name: string;
description?: string;
assignedto_id?: number;
include_all?: boolean;
case_ids?: number[];
refs?: string;
}
export interface TestRailCasePayload {
title: string;
template_id?: number;
type_id?: number;
priority_id?: number;
estimate?: string;
refs?: string;
custom_preconds?: string;
custom_steps_separated?: TestRailCaseStep[];
custom_steps?: string;
custom_expected?: string;
}
FILE:integrations/testrail-mcp/tsconfig.json
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"esModuleInterop": true,
"strict": true,
"outDir": "dist",
"rootDir": "src",
"declaration": true,
"skipLibCheck": true
},
"include": ["src/**/*"]
}
FILE:reference/assertions.md
# Assertions Reference
## Web-First Assertions (Always Use These)
These assertions retry automatically until the condition holds or the timeout expires, which makes them safe for dynamic content.
```typescript
// Visibility
await expect(locator).toBeVisible();
await expect(locator).not.toBeVisible();
await expect(locator).toBeHidden();
// Text
await expect(locator).toHaveText('exact text');
await expect(locator).toHaveText(/partial/i);
await expect(locator).toContainText('partial');
// Value (inputs)
await expect(locator).toHaveValue('entered text');
await expect(locator).toHaveValues(['option1', 'option2']);
// Attributes
await expect(locator).toHaveAttribute('href', '/dashboard');
await expect(locator).toHaveClass(/active/);
await expect(locator).toHaveId('main-nav');
// State
await expect(locator).toBeEnabled();
await expect(locator).toBeDisabled();
await expect(locator).toBeChecked();
await expect(locator).toBeEditable();
await expect(locator).toBeFocused();
await expect(locator).toBeAttached();
// Count
await expect(locator).toHaveCount(5);
await expect(locator).toHaveCount(0); // element doesn't exist
// CSS
await expect(locator).toHaveCSS('color', 'rgb(255, 0, 0)');
// Screenshots
await expect(locator).toHaveScreenshot('button.png');
await expect(page).toHaveScreenshot('full-page.png');
```
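To make "retry until timeout" concrete, here is a conceptual model of the difference between a one-shot check and a retried one. It is a sketch only and not Playwright's implementation: `retryUntil`, `Clock`, and `fakeClock` are hypothetical names, and a simulated clock stands in for real waiting so the example runs instantly.

```typescript
// Conceptual model only; Playwright's real retry logic is not shown here.
interface Clock {
  now(): number;
  sleep(ms: number): void;
}

// Simulated clock: sleeping just advances a counter.
function fakeClock(): Clock {
  let current = 0;
  return {
    now: () => current,
    sleep: (ms) => {
      current += ms;
    },
  };
}

function retryUntil(
  check: () => boolean,
  timeoutMs: number,
  intervalMs: number,
  clock: Clock,
): { passed: boolean; attempts: number } {
  const deadline = clock.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    if (check()) return { passed: true, attempts };
    // Give up once another wait would run past the deadline.
    if (clock.now() + intervalMs > deadline) return { passed: false, attempts };
    clock.sleep(intervalMs);
  }
}

// Simulated element that becomes visible 250 ms after the test starts.
const clock = fakeClock();
const isVisible = (): boolean => clock.now() >= 250;

const oneShot = isVisible(); // checked once at t=0, like a non-retrying assertion
const retried = retryUntil(isVisible, 5000, 100, clock);
console.log(oneShot, retried.passed, retried.attempts); // prints "false true 4"
```

The one-shot check fails because it samples the page too early, while the retried check passes on its fourth attempt. This is the failure mode that the web-first assertions above are meant to avoid.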
## Page Assertions
```typescript
await expect(page).toHaveURL('/dashboard');
await expect(page).toHaveURL(/\/dashboard/);
await expect(page).toHaveTitle('Dashboard - App');
await expect(page).toHaveTitle(/Dashboard/);
```
## Anti-Patterns (Never Do This)
```typescript
// BAD — no auto-retry
const text = await locator.textContent();
expect(text).toBe('Hello');
// BAD — snapshot in time, not reactive
const isVisible = await locator.isVisible();
expect(isVisible).toBe(true);
// BAD — evaluating in page context
const value = await page.evaluate(() =>
document.querySelector('input')?.value
);
expect(value).toBe('test');
```
## Custom Timeout
```typescript
// Override timeout for slow operations
await expect(locator).toBeVisible({ timeout: 30_000 });
```
## Soft Assertions
Continue the test even if an assertion fails; all failures are reported at the end:
```typescript
await expect.soft(locator).toHaveText('Expected');
await expect.soft(page).toHaveURL('/next');
// Test continues even if above fail
```
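The "collect now, report at the end" behavior can be modeled in a few lines. The `SoftChecks` class below is a hypothetical illustration of the idea and not how `expect.soft` is implemented; the sample values are made up.

```typescript
// Conceptual model of soft assertions: failures are recorded, not thrown.
class SoftChecks {
  private readonly failures: string[] = [];

  // Record a failure instead of throwing, so later checks still run.
  check(condition: boolean, message: string): void {
    if (!condition) this.failures.push(message);
  }

  get failureCount(): number {
    return this.failures.length;
  }

  // Called once at the end: throws a single error listing every recorded failure.
  assertAll(): void {
    if (this.failures.length === 0) return;
    throw new Error(
      `${this.failures.length} soft check(s) failed:\n- ${this.failures.join('\n- ')}`,
    );
  }
}

// Made-up observed values standing in for what a page would return.
const headingText: string = 'Welcome back';
const currentPath: string = '/next';

const checks = new SoftChecks();
checks.check(headingText === 'Expected', 'heading text mismatch');
checks.check(currentPath === '/next', 'unexpected URL');
checks.check(headingText.length === 0, 'heading should be empty');

let summary = '';
try {
  checks.assertAll();
} catch (error) {
  summary = error instanceof Error ? error.message : String(error);
}
console.log(summary);
```

All three checks run even though the first one fails, and the final error names both failures in one report.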
FILE:reference/common-pitfalls.md
# Common Pitfalls (Top 10)
## 1. waitForTimeout
**Symptom:** Slow, flaky tests.
```typescript
// BAD
await page.waitForTimeout(3000);
// GOOD
await expect(page.getByTestId('result')).toBeVisible();
```
## 2. Non-Web-First Assertions
**Symptom:** Assertions fail on dynamic content.
```typescript
// BAD — checks once, no retry
const text = await page.textContent('.msg');
expect(text).toBe('Done');
// GOOD — retries until timeout
await expect(page.getByText('Done')).toBeVisible();
```
## 3. Missing await
**Symptom:** Random passes/failures, tests seem to skip steps.
```typescript
// BAD
page.goto('/dashboard');
expect(page.getByText('Welcome')).toBeVisible();
// GOOD
await page.goto('/dashboard');
await expect(page.getByText('Welcome')).toBeVisible();
```
## 4. Hardcoded URLs
**Symptom:** Tests break in different environments.
```typescript
// BAD
await page.goto('http://localhost:3000/login');
// GOOD — uses baseURL from config
await page.goto('/login');
```
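Playwright's documentation describes `baseURL` resolution in terms of the standard `URL()` constructor, so the effect of a relative path can be previewed without running a browser. The sketch below assumes that behavior; `resolveAgainstBase` is a hypothetical helper and the hosts are placeholders.

```typescript
// Hypothetical helper: previews how a path combines with a configured baseURL.
function resolveAgainstBase(path: string, baseURL: string): string {
  return new URL(path, baseURL).href;
}

// The same test path targets a different environment per configuration.
const local = resolveAgainstBase('/login', 'http://localhost:3000');
const staging = resolveAgainstBase('/login', 'https://staging.example.com');

// When the base URL has a path prefix, a leading slash discards that prefix.
const prefixDropped = resolveAgainstBase('/login', 'https://example.com/app/');
const prefixKept = resolveAgainstBase('./login', 'https://example.com/app/');

console.log(local);         // prints "http://localhost:3000/login"
console.log(staging);       // prints "https://staging.example.com/login"
console.log(prefixDropped); // prints "https://example.com/login"
console.log(prefixKept);    // prints "https://example.com/app/login"
```

If the application is served under a sub-path, end the base URL with a trailing slash and use `./`-style paths in tests so the prefix survives.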
## 5. CSS Selectors Instead of Roles
**Symptom:** Tests break after CSS refactors.
```typescript
// BAD
await page.click('#submit-btn');
// GOOD
await page.getByRole('button', { name: 'Submit' }).click();
```
## 6. Shared State Between Tests
**Symptom:** Tests pass alone, fail in suite.
```typescript
// BAD — test B depends on test A
let userId: string;
test('create user', async () => { userId = '123'; });
test('edit user', async () => { /* uses userId */ });
// GOOD — each test is independent
test('edit user', async ({ request }) => {
const res = await request.post('/api/users', { data: { name: 'Test' } });
const { id } = await res.json();
// ...
});
```
## 7. Using networkidle
**Symptom:** Tests hang or time out unpredictably.
```typescript
// BAD — waits for all network activity to stop
await page.goto('/dashboard', { waitUntil: 'networkidle' });
// GOOD — wait for specific content
await page.goto('/dashboard');
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
```
## 8. Not Waiting for Navigation
**Symptom:** Assertions run on wrong page.
```typescript
// BAD — click navigates but we don't wait
await page.getByRole('link', { name: 'Settings' }).click();
await expect(page.getByRole('heading')).toHaveText('Settings');
// GOOD — wait for URL change
await page.getByRole('link', { name: 'Settings' }).click();
await expect(page).toHaveURL('/settings');
await expect(page.getByRole('heading')).toHaveText('Settings');
```
## 9. Testing Implementation, Not Behavior
**Symptom:** Tests break on every refactor.
```typescript
// BAD — tests CSS class (implementation detail)
await expect(page.locator('.btn')).toHaveClass('btn-primary active');
// GOOD — tests what the user sees
await expect(page.getByRole('button', { name: 'Save' })).toBeEnabled();
```
## 10. No Error Case Tests
**Symptom:** App breaks on errors but all tests pass.
```typescript
// Missing: what happens when the API fails?
test('should handle API error', async ({ page }) => {
await page.route('**/api/data', (route) =>
route.fulfill({ status: 500 })
);
await page.goto('/dashboard');
await expect(page.getByText(/error|try again/i)).toBeVisible();
});
```
FILE:reference/fixtures.md
# Fixtures Reference
## What Are Fixtures
Fixtures provide setup/teardown for each test. They replace `beforeEach`/`afterEach` for shared state and are composable, type-safe, and lazy (only run when used).
## Creating Custom Fixtures
```typescript
// fixtures.ts
import { test as base, expect } from '@playwright/test';
import type { Page, APIRequestContext } from '@playwright/test';
// Define fixture types
type MyFixtures = {
authenticatedPage: Page;
testUser: { email: string; password: string };
apiClient: APIRequestContext;
};
export const test = base.extend<MyFixtures>({
// Simple value fixture
testUser: async ({}, use) => {
await use({
email: `test-${Date.now()}@example.com`,
password: 'Test123!',
});
},
// Fixture with setup and teardown
authenticatedPage: async ({ page, testUser }, use) => {
// Setup: log in
await page.goto('/login');
await page.getByLabel('Email').fill(testUser.email);
await page.getByLabel('Password').fill(testUser.password);
await page.getByRole('button', { name: 'Sign in' }).click();
await expect(page).toHaveURL('/dashboard');
// Provide the authenticated page to the test
await use(page);
// Teardown: clean up (optional)
await page.goto('/logout');
},
// API client fixture
apiClient: async ({ playwright }, use) => {
const context = await playwright.request.newContext({
baseURL: 'http://localhost:3000',
extraHTTPHeaders: {
Authorization: `Bearer ${process.env.API_TOKEN}`,
},
});
await use(context);
await context.dispose();
},
});
export { expect };
```
## Using Fixtures in Tests
```typescript
import { test, expect } from './fixtures';
test('should show dashboard for logged in user', async ({ authenticatedPage }) => {
// authenticatedPage is already logged in
await expect(authenticatedPage.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
test('should create item via API', async ({ apiClient }) => {
const response = await apiClient.post('/api/items', {
data: { name: 'Test Item' },
});
expect(response.ok()).toBeTruthy();
});
```
## Shared Auth State (storageState)
For performance, authenticate once and reuse:
```typescript
// auth.setup.ts
import { test as setup } from '@playwright/test';
setup('authenticate', async ({ page }) => {
await page.goto('/login');
await page.getByLabel('Email').fill('user@example.com');
await page.getByLabel('Password').fill('password');
await page.getByRole('button', { name: 'Sign in' }).click();
await page.waitForURL('/dashboard');
await page.context().storageState({ path: '.auth/user.json' });
});
```
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
projects: [
{ name: 'setup', testMatch: /.*\.setup\.ts/ },
{
name: 'chromium',
use: {
storageState: '.auth/user.json',
},
dependencies: ['setup'],
},
],
});
```
## When to Use What
| Need | Use |
|---|---|
| Shared login state | `storageState` + setup project |
| Per-test data creation | Custom fixture with API calls |
| Reusable page helpers | Custom fixture returning page |
| Test data cleanup | Fixture teardown (after `use()`) |
| Config values | Simple value fixture |
FILE:reference/flaky-tests.md
# Flaky Test Quick Reference
## Diagnosis Commands
```bash
# Burn-in: expose timing issues
npx playwright test tests/checkout.spec.ts --repeat-each=10
# Isolation: expose state leaks
npx playwright test tests/checkout.spec.ts --grep "adds item" --workers=1
# Full trace: capture everything
npx playwright test tests/checkout.spec.ts --trace=on --retries=0
# Parallel stress: expose race conditions
npx playwright test --fully-parallel --workers=4 --repeat-each=5
```
## Four Categories
| Category | Symptom | Fix |
|---|---|---|
| **Timing** | Fails intermittently | Replace waits with assertions |
| **Isolation** | Fails in suite, passes alone | Remove shared state |
| **Environment** | Fails in CI only | Match viewport, fonts, timezone |
| **Infrastructure** | Random crashes | Reduce workers, increase memory |
## Quick Fixes
**Timing → Add proper waits:**
```typescript
// Wait for specific response
const response = page.waitForResponse('**/api/data');
await page.getByRole('button', { name: 'Load' }).click();
await response;
await expect(page.getByTestId('results')).toBeVisible();
```
**Isolation → Unique test data:**
```typescript
const uniqueEmail = `test-${Date.now()}@example.com`;
```
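A timestamp alone can still collide when two parallel workers create data in the same millisecond. The sketch below is one way to make collisions unlikely; `uniqueEmail` and its parameters are hypothetical names, not part of Playwright. It assumes each worker runs in its own process, so the counter separates calls within a worker and the random suffix separates workers.
```typescript
// Hypothetical helper: counter for same-process calls, random suffix across workers.
let sequence = 0;

function uniqueEmail(prefix: string = 'test', domain: string = 'example.com'): string {
  sequence += 1;
  // Base-36 slice of a random number; short, lowercase, and safe in an email local part.
  const random = Math.random().toString(36).slice(2, 10);
  return `${prefix}-${Date.now()}-${sequence}-${random}@${domain}`;
}
```
Call it once per test (or inside a fixture) rather than at module level, so every test gets its own address.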
**Environment → Explicit viewport:**
```typescript
test.use({ viewport: { width: 1280, height: 720 } });
```
**Infrastructure → CI-safe config:**
```typescript
import { defineConfig } from '@playwright/test';
export default defineConfig({
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 2 : undefined,
timeout: process.env.CI ? 60_000 : 30_000,
});
```
FILE:reference/golden-rules.md
# Golden Rules
1. **`getByRole()` over CSS/XPath** — resilient to markup changes, mirrors assistive technology
2. **Never `page.waitForTimeout()`** — use `expect(locator).toBeVisible()` or `page.waitForURL()`
3. **Web-first assertions** — `expect(locator)` auto-retries; `expect(await locator.textContent())` does not
4. **Isolate every test** — no shared state, no execution-order dependencies
5. **`baseURL` in config** — zero hardcoded URLs in tests
6. **Retries: `2` in CI, `0` locally** — surface flakiness where it matters
7. **Traces: `'on-first-retry'`** — rich debugging artifacts without CI slowdown
8. **Fixtures over globals** — share state via `test.extend()`, not module-level variables
9. **One behavior per test** — multiple related `expect()` calls are fine
10. **Mock external services only** — never mock your own app; mock third-party APIs, payment gateways, email
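Rules 2, 3, and 5 are mechanical enough to check with a rough text scan. The sketch below is a hypothetical helper (`scanTestSource` and the rule names are illustrative, not a Playwright API). It uses regex heuristics rather than a real parser, so expect some false positives and misses.
```typescript
type Finding = { line: number; rule: string };

// Heuristic patterns for golden rules 2, 3, and 5. The rule names are made up for this sketch.
const ANTI_PATTERNS: Array<{ rule: string; pattern: RegExp }> = [
  { rule: 'no-wait-for-timeout', pattern: /\.waitForTimeout\(/ },
  { rule: 'web-first-assertions', pattern: /expect\(\s*await\s/ },
  { rule: 'no-hardcoded-url', pattern: /\.goto\(\s*['"`]https?:\/\// },
];

// Returns one finding per (line, rule) match, using 1-based line numbers.
function scanTestSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { rule, pattern } of ANTI_PATTERNS) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule });
      }
    }
  });
  return findings;
}
```
A scan like this could run as a pre-commit check over `*.spec.ts` files; treat its findings as prompts for review, not as verdicts.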
FILE:reference/locators.md
# Locator Priority
Use the first option that works:
| Priority | Locator | Use for |
|---|---|---|
| 1 | `getByRole('button', { name: 'Submit' })` | Buttons, links, headings, form elements |
| 2 | `getByLabel('Email address')` | Form fields with associated labels |
| 3 | `getByText('Welcome back')` | Non-interactive text content |
| 4 | `getByPlaceholder('Search...')` | Inputs with placeholder text |
| 5 | `getByAltText('Company logo')` | Images with alt text |
| 6 | `getByTitle('Close dialog')` | Elements with title attribute |
| 7 | `getByTestId('checkout-summary')` | When no semantic option exists |
| 8 | `page.locator('.legacy-widget')` | CSS/XPath — absolute last resort |
## Role Locator Cheat Sheet
```typescript
// Buttons — <button>, <input type="submit">, [role="button"]
page.getByRole('button', { name: 'Save changes' })
// Links — <a href>
page.getByRole('link', { name: 'View profile' })
// Headings — h1-h6
page.getByRole('heading', { name: 'Dashboard', level: 1 })
// Text inputs — by label association
page.getByRole('textbox', { name: 'Email' })
// Checkboxes
page.getByRole('checkbox', { name: 'Remember me' })
// Radio buttons
page.getByRole('radio', { name: 'Monthly billing' })
// Dropdowns — <select>
page.getByRole('combobox', { name: 'Country' })
// Navigation
page.getByRole('navigation', { name: 'Main' })
// Tables
page.getByRole('table', { name: 'Recent orders' })
// Rows within tables
page.getByRole('row', { name: /Order #123/ })
// Tabs
page.getByRole('tab', { name: 'Settings' })
// Dialogs
page.getByRole('dialog', { name: 'Confirm deletion' })
// Alerts
page.getByRole('alert')
```
## Filtering and Chaining
```typescript
// Filter by text
page.getByRole('listitem').filter({ hasText: 'Product A' })
// Filter by child locator
page.getByRole('listitem').filter({
has: page.getByRole('button', { name: 'Buy' })
})
// Chain locators
page.getByRole('navigation').getByRole('link', { name: 'Settings' })
// Nth match
page.getByRole('listitem').nth(0)
page.getByRole('listitem').first()
page.getByRole('listitem').last()
```
FILE:settings.json
{
"permissions": {
"allow": [
"Bash(npx playwright*)",
"Bash(npx tsx*)"
]
}
}
FILE:skills/browserstack/SKILL.md
---
name: "browserstack"
description: >-
Run tests on BrowserStack. Use when user mentions "browserstack",
"cross-browser", "cloud testing", "browser matrix", "test on safari",
"test on firefox", or "browser compatibility".
---
# BrowserStack Integration
Run Playwright tests on BrowserStack's cloud grid for cross-browser and cross-device testing.
## Prerequisites
Environment variables must be set:
- `BROWSERSTACK_USERNAME` — your BrowserStack username
- `BROWSERSTACK_ACCESS_KEY` — your access key
If not set, inform the user how to get them from [browserstack.com/accounts/settings](https://www.browserstack.com/accounts/settings) and stop.
## Capabilities
### 1. Configure for BrowserStack
```
/pw:browserstack setup
```
Steps:
1. Check current `playwright.config.ts`
2. Add BrowserStack connect options:
```typescript
// Add to playwright.config.ts
import { defineConfig } from '@playwright/test';
const isBS = !!process.env.BROWSERSTACK_USERNAME;
export default defineConfig({
  // ... existing config
  projects: isBS ? [
    {
      name: 'chrome@latest:Windows 11',
      use: {
        connectOptions: {
          // Capabilities are passed as URL-encoded JSON in the `caps` query parameter
          wsEndpoint: `wss://cdp.browserstack.com/playwright?caps=${encodeURIComponent(JSON.stringify({
            'browser': 'chrome',
            'browser_version': 'latest',
            'os': 'Windows',
            'os_version': '11',
            'browserstack.username': process.env.BROWSERSTACK_USERNAME,
            'browserstack.accessKey': process.env.BROWSERSTACK_ACCESS_KEY,
          }))}`,
        },
      },
    },
    {
      name: 'firefox@latest:Windows 11',
      use: {
        connectOptions: {
          wsEndpoint: `wss://cdp.browserstack.com/playwright?caps=${encodeURIComponent(JSON.stringify({
            'browser': 'playwright-firefox',
            'browser_version': 'latest',
            'os': 'Windows',
            'os_version': '11',
            'browserstack.username': process.env.BROWSERSTACK_USERNAME,
            'browserstack.accessKey': process.env.BROWSERSTACK_ACCESS_KEY,
          }))}`,
        },
      },
    },
    {
      name: 'webkit@latest:OS X Ventura',
      use: {
        connectOptions: {
          wsEndpoint: `wss://cdp.browserstack.com/playwright?caps=${encodeURIComponent(JSON.stringify({
            'browser': 'playwright-webkit',
            'browser_version': 'latest',
            'os': 'OS X',
            'os_version': 'Ventura',
            'browserstack.username': process.env.BROWSERSTACK_USERNAME,
            'browserstack.accessKey': process.env.BROWSERSTACK_ACCESS_KEY,
          }))}`,
        },
      },
    },
  ] : [
    // ... local projects fallback
  ],
});
```
3. Add npm script: `"test:e2e:cloud": "npx playwright test --project='chrome@*' --project='firefox@*' --project='webkit@*'"`
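The three project entries differ only in their capabilities, so the endpoint string can be built by one small function. This is a sketch under assumptions: `browserStackEndpoint` and the `BrowserStackCaps` type are hypothetical names, and it assumes the capabilities are sent as URL-encoded JSON in the `caps` query parameter of the `wss://cdp.browserstack.com/playwright` endpoint used in the config. Credentials are passed in as arguments here; in a real config they would come from the environment variables listed under Prerequisites.
```typescript
// Illustrative capability shape; the keys mirror the ones in the config above.
type BrowserStackCaps = {
  browser: string;
  browser_version: string;
  os: string;
  os_version: string;
  'browserstack.username': string;
  'browserstack.accessKey': string;
};

// Hypothetical helper: serialize the capabilities and append them to the endpoint.
function browserStackEndpoint(caps: BrowserStackCaps): string {
  const encoded = encodeURIComponent(JSON.stringify(caps));
  return `wss://cdp.browserstack.com/playwright?caps=${encoded}`;
}
```
Each project's `connectOptions.wsEndpoint` can then call the helper with its own browser and OS values, which keeps the credentials handling in one place.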
### 2. Run Tests on BrowserStack
```
/pw:browserstack run
```
Steps:
1. Verify credentials are set
2. Run tests with BrowserStack projects:
```bash
BROWSERSTACK_USERNAME=$BROWSERSTACK_USERNAME \
BROWSERSTACK_ACCESS_KEY=$BROWSERSTACK_ACCESS_KEY \
npx playwright test --project='chrome@*' --project='firefox@*'
```
3. Monitor execution
4. Report results per browser
### 3. Get Build Results
```
/pw:browserstack results
```
Steps:
1. Call `browserstack_get_builds` MCP tool
2. Get latest build's sessions
3. For each session:
- Status (pass/fail)
- Browser and OS
- Duration
- Video URL
- Log URLs
4. Format as summary table
### 4. Check Available Browsers
```
/pw:browserstack browsers
```
Steps:
1. Call `browserstack_get_browsers` MCP tool
2. Filter for Playwright-compatible browsers
3. Display available browser/OS combinations
### 5. Local Testing
```
/pw:browserstack local
```
For testing localhost or a staging environment behind a firewall:
1. Install BrowserStack Local: `npm install -D browserstack-local`
2. Add local tunnel to config
3. Provide setup instructions
## MCP Tools Used
| Tool | When |
|---|---|
| `browserstack_get_plan` | Check account limits |
| `browserstack_get_browsers` | List available browsers |
| `browserstack_get_builds` | List recent builds |
| `browserstack_get_sessions` | Get sessions in a build |
| `browserstack_get_session` | Get session details (video, logs) |
| `browserstack_update_session` | Mark pass/fail |
| `browserstack_get_logs` | Get text/network logs |
## Output
- Cross-browser test results table
- Per-browser pass/fail status
- Links to BrowserStack dashboard for video/screenshots
- Any browser-specific failures highlighted
FILE:skills/coverage/SKILL.md
---
name: "coverage"
description: >-
Analyze test coverage gaps. Use when user says "test coverage",
"what's not tested", "coverage gaps", "missing tests", "coverage report",
or "what needs testing".
---
# Analyze Test Coverage Gaps
Map all testable surfaces in the application and identify what's tested vs. what's missing.
## Steps
### 1. Map Application Surface
Use the `Explore` subagent to catalog:
**Routes/Pages:**
- Scan route definitions (Next.js `app/`, React Router config, Vue Router, etc.)
- List all user-facing pages with their paths
**Components:**
- Identify interactive components (forms, modals, dropdowns, tables)
- Note components with complex state logic
**API Endpoints:**
- Scan API route files or backend controllers
- List all endpoints with their methods
**User Flows:**
- Identify critical paths: auth, checkout, onboarding, core features
- Map multi-step workflows
### 2. Map Existing Tests
Scan all `*.spec.ts` / `*.spec.js` files:
- Extract which pages/routes are covered (by `page.goto()` calls)
- Extract which components are tested (by locator usage)
- Extract which API endpoints are mocked or hit
- Count tests per area
### 3. Generate Coverage Matrix
```
## Coverage Matrix
| Area | Route | Tests | Status |
|---|---|---|---|
| Auth | /login | 5 | ✅ Covered |
| Auth | /register | 0 | ❌ Missing |
| Auth | /forgot-password | 0 | ❌ Missing |
| Dashboard | /dashboard | 3 | ⚠️ Partial (no error states) |
| Settings | /settings | 0 | ❌ Missing |
| Checkout | /checkout | 8 | ✅ Covered |
```
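Once routes and test counts are collected, the matrix and the percentage estimate are simple to derive. The sketch below is illustrative only: the `RouteCoverage` shape and function names are assumptions, and it counts a route as covered when it has at least one test, with an optional note marking partial coverage.
```typescript
type RouteCoverage = { area: string; route: string; tests: number; note?: string };

// Zero tests means missing; a note marks the route as only partially covered.
function coverageStatus(entry: RouteCoverage): string {
  if (entry.tests === 0) return '❌ Missing';
  return entry.note ? `⚠️ Partial (${entry.note})` : '✅ Covered';
}

// Renders the same four-column markdown table shown above.
function renderCoverageMatrix(entries: RouteCoverage[]): string {
  const header = ['| Area | Route | Tests | Status |', '|---|---|---|---|'];
  const rows = entries.map(
    (e) => `| ${e.area} | ${e.route} | ${e.tests} | ${coverageStatus(e)} |`
  );
  return header.concat(rows).join('\n');
}

// Share of routes with at least one test, rounded to a whole percent.
function coveragePercent(entries: RouteCoverage[]): number {
  if (entries.length === 0) return 0;
  const covered = entries.filter((e) => e.tests > 0).length;
  return Math.round((covered / entries.length) * 100);
}
```
For the six routes in the example matrix, three have tests, so this estimate comes out at 50%.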
### 4. Prioritize Gaps
Rank uncovered areas by business impact:
1. **Critical** — auth, payment, core features → test first
2. **High** — user-facing CRUD, search, navigation
3. **Medium** — settings, preferences, edge cases
4. **Low** — static pages, about, terms
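The ranking above can be applied mechanically once each gap has been assigned a priority. A minimal sketch, assuming the priority is decided by a human or an earlier step; the `Gap` type and `rankGaps` are hypothetical names.
```typescript
type Priority = 'critical' | 'high' | 'medium' | 'low';
type Gap = { route: string; priority: Priority };

// Lower number means "test first", following the four levels listed above.
const PRIORITY_ORDER: Record<Priority, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

// Returns a sorted copy; the input array is left untouched.
function rankGaps(gaps: Gap[]): Gap[] {
  return gaps
    .slice()
    .sort((a, b) => PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority]);
}
```
The sorted list feeds directly into the test plan in the next step, with the highest-impact routes at the top.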
### 5. Suggest Test Plan
For each gap, recommend:
- Number of tests needed
- Which template from `templates/` to use
- Estimated effort (quick/medium/complex)
```
## Recommended Test Plan
### Priority 1: Critical
1. /register (4 tests) — use auth/registration template — quick
2. /forgot-password (3 tests) — use auth/password-reset template — quick
### Priority 2: High
3. /settings (4 tests) — use settings/ templates — medium
4. Dashboard error states (2 tests) — use dashboard/data-loading template — quick
```
### 6. Auto-Generate (Optional)
Ask user: "Generate tests for the top N gaps? [Yes/No/Pick specific]"
If yes, invoke `/pw:generate` for each gap with the recommended template.
## Output
- Coverage matrix (table format)
- Coverage percentage estimate
- Prioritized gap list with effort estimates
- Option to auto-generate missing tests
FILE:skills/fix/SKILL.md
---
name: "fix"
description: >-
Fix failing or flaky Playwright tests. Use when user says "fix test",
"flaky test", "test failing", "debug test", "test broken", "test passes
sometimes", or "intermittent failure".
---
# Fix Failing or Flaky Tests
Diagnose and fix a Playwright test that fails or passes intermittently using a systematic taxonomy.
## Input
`$ARGUMENTS` contains:
- A test file path: `e2e/login.spec.ts`
- A test name: `"should redirect after login"`
- A description: `"the checkout test fails in CI but passes locally"`
## Steps
### 1. Reproduce the Failure
Run the test to capture the error:
```bash
npx playwright test <file> --reporter=list
```
If the test passes, it's likely flaky. Run burn-in:
```bash
npx playwright test <file> --repeat-each=10 --reporter=list
```
If it still passes, try with parallel workers:
```bash
npx playwright test --fully-parallel --workers=4 --repeat-each=5
```
### 2. Capture Trace
Run with full tracing:
```bash
npx playwright test <file> --trace=on --retries=0
```
Read the trace output. Use `/debug` to analyze trace files if available.
### 3. Categorize the Failure
Load `flaky-taxonomy.md` from this skill directory.
Every failing test falls into one of four categories:
| Category | Symptom | Diagnosis |
|---|---|---|
| **Timing/Async** | Fails intermittently everywhere | `--repeat-each=20` reproduces locally |
| **Test Isolation** | Fails in suite, passes alone | `--workers=1 --grep "test name"` passes |
| **Environment** | Fails in CI, passes locally | Compare CI vs local screenshots/traces |
| **Infrastructure** | Random, no pattern | Error references browser internals |
### 4. Apply Targeted Fix
**Timing/Async:**
- Replace `waitForTimeout()` with web-first assertions
- Add `await` to missing Playwright calls
- Wait for specific network responses before asserting
- Use `toBeVisible()` before interacting with elements
**Test Isolation:**
- Remove shared mutable state between tests
- Create test data per-test via API or fixtures
- Use unique identifiers (timestamps, random strings) for test data
- Check for database state leaks
**Environment:**
- Match viewport sizes between local and CI
- Account for font rendering differences in screenshots
- Use `docker` locally to match CI environment
- Check for timezone-dependent assertions
**Infrastructure:**
- Increase timeout for slow CI runners
- Add retries in CI config (`retries: 2`)
- Check for browser OOM (reduce parallel workers)
- Ensure browser dependencies are installed
### 5. Verify the Fix
Run the test 10 times to confirm stability:
```bash
npx playwright test <file> --repeat-each=10 --reporter=list
```
All 10 must pass. If any fail, go back to step 3.
### 6. Prevent Recurrence
Suggest:
- Add `retries: 2` to the CI config if it is not already set
- Enable `trace: 'on-first-retry'` in config
- Add the fix pattern to project's test conventions doc
## Output
- Root cause category and specific issue
- The fix applied (with diff)
- Verification result (10/10 passes)
- Prevention recommendation
FILE:skills/fix/flaky-taxonomy.md
# Flaky Test Taxonomy
## Decision Tree
```
Test is flaky
│
├── Fails locally with --repeat-each=20?
│ ├── YES → TIMING / ASYNC
│ │ ├── Missing await? → Add await
│ │ ├── waitForTimeout? → Replace with assertion
│ │ ├── Race condition? → Wait for specific event
│ │ └── Animation? → Wait for animation end or disable
│ │
│ └── NO → Continue...
│
├── Passes alone, fails in suite?
│ ├── YES → TEST ISOLATION
│ │ ├── Shared variable? → Make per-test
│ │ ├── Database state? → Reset per-test
│ │ ├── localStorage? → Clear in beforeEach
│ │ └── Cookie leak? → Use isolated contexts
│ │
│ └── NO → Continue...
│
├── Fails in CI, passes locally?
│ ├── YES → ENVIRONMENT
│ │ ├── Viewport? → Set explicit size
│ │ ├── Fonts? → Use Docker locally
│ │ ├── Timezone? → Use UTC everywhere
│ │ └── Network? → Mock external services
│ │
│ └── NO → INFRASTRUCTURE
│ ├── Browser crash? → Reduce workers
│ ├── OOM? → Limit parallel tests
│ ├── DNS? → Add retry config
│ └── File system? → Use unique temp dirs
```
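The decision tree reads naturally as a chain of early returns. The sketch below assumes the three yes/no observations have already been gathered by running the diagnosis commands; the type and function names are illustrative.
```typescript
type FlakeCategory = 'timing' | 'isolation' | 'environment' | 'infrastructure';

type FlakeObservations = {
  failsLocallyOnRepeat: boolean;    // reproduces locally with --repeat-each=20
  passesAloneFailsInSuite: boolean; // passes in isolation, fails with the full suite
  failsOnlyInCI: boolean;           // fails in CI, passes locally
};

// Checks the branches in the same order as the tree; the first match wins.
function categorizeFlake(o: FlakeObservations): FlakeCategory {
  if (o.failsLocallyOnRepeat) return 'timing';
  if (o.passesAloneFailsInSuite) return 'isolation';
  if (o.failsOnlyInCI) return 'environment';
  return 'infrastructure';
}
```
Order matters: a test that is both timing-sensitive and CI-only is classified as timing first, which matches the tree's top-down reading.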
## Common Fixes by Category
### Timing / Async
**Missing await:**
```typescript
// BAD — race condition
page.goto('/dashboard');
expect(page.getByText('Welcome')).toBeVisible();
// GOOD
await page.goto('/dashboard');
await expect(page.getByText('Welcome')).toBeVisible();
```
**Clicking before visible:**
```typescript
// BAD — element may not be ready
await page.getByRole('button', { name: 'Submit' }).click();
// GOOD — ensure visible first
const submitBtn = page.getByRole('button', { name: 'Submit' });
await expect(submitBtn).toBeVisible();
await submitBtn.click();
```
**Race with network:**
```typescript
// BAD — data might not be loaded
await page.goto('/users');
await expect(page.getByRole('table')).toBeVisible();
// GOOD — wait for API response
const responsePromise = page.waitForResponse('**/api/users');
await page.goto('/users');
await responsePromise;
await expect(page.getByRole('table')).toBeVisible();
```
### Test Isolation
**Shared state fix:**
```typescript
// BAD — tests share userId
let userId: string;
test('create', async () => { userId = '123'; });
test('read', async () => { /* uses userId */ });
// GOOD — each test is independent
test('read user', async ({ request }) => {
const response = await request.post('/api/users', { data: { name: 'Test' } });
const { id } = await response.json();
// Use id within this test
});
```
**localStorage cleanup:**
```typescript
test.beforeEach(async ({ page }) => {
await page.goto('/');
await page.evaluate(() => localStorage.clear());
});
```
### Environment
**Explicit viewport:**
```typescript
test.use({ viewport: { width: 1280, height: 720 } });
```
**Timezone-safe dates:**
```typescript
// BAD
expect(dateText).toBe('March 5, 2026');
// GOOD — timezone independent
expect(dateText).toMatch(/\d{1,2}\/\d{1,2}\/\d{4}/);
```
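The regex above only checks the shape of the date. When the exact value matters, another option is to compute the expected string with the timezone pinned, so it no longer depends on where the test runs. A minimal sketch, assuming the app renders dates in UTC with a long `en-US` format; `formatUtcDate` is a hypothetical helper.
```typescript
// Formats a date as e.g. "March 5, 2026", always interpreted in UTC.
function formatUtcDate(date: Date): string {
  return new Intl.DateTimeFormat('en-US', {
    timeZone: 'UTC',
    year: 'numeric',
    month: 'long',
    day: 'numeric',
  }).format(date);
}

// 23:30 UTC is already the next day in many timezones; pinning UTC keeps it on March 5.
const expected = formatUtcDate(new Date('2026-03-05T23:30:00Z'));
```
If the app formats dates in the browser's own timezone instead, Playwright's `timezoneId` option in `test.use()` can pin the browser side as well.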
### Infrastructure
**Retry config:**
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 2 : undefined,
});
```
**Increase timeout for CI:**
```typescript
test.setTimeout(60_000); // 60s for slow CI
```
FILE:skills/generate/SKILL.md
---
name: "generate"
description: >-
Generate Playwright tests. Use when user says "write tests", "generate tests",
"add tests for", "test this component", "e2e test", "create test for",
"test this page", or "test this feature".
---
# Generate Playwright Tests
Generate production-ready Playwright tests from a user story, URL, component name, or feature description.
## Input
`$ARGUMENTS` contains what to test. Examples:
- `"user can log in with email and password"`
- `"the checkout flow"`
- `"src/components/UserProfile.tsx"`
- `"the search page with filters"`
## Steps
### 1. Understand the Target
Parse `$ARGUMENTS` to determine:
- **User story**: Extract the behavior to verify
- **Component path**: Read the component source code
- **Page/URL**: Identify the route and its elements
- **Feature name**: Map to relevant app areas
### 2. Explore the Codebase
Use the `Explore` subagent to gather context:
- Read `playwright.config.ts` for `testDir`, `baseURL`, `projects`
- Check existing tests in `testDir` for patterns, fixtures, and conventions
- If a component path is given, read the component to understand its props, states, and interactions
- Check for existing page objects in `pages/`
- Check for existing fixtures in `fixtures/`
- Check for auth setup (`auth.setup.ts` or `storageState` config)
### 3. Select Templates
Check `templates/` in this plugin for matching patterns:
| If testing... | Load template from |
|---|---|
| Login/auth flow | `templates/auth/login.md` |
| CRUD operations | `templates/crud/` |
| Checkout/payment | `templates/checkout/` |
| Search/filter UI | `templates/search/` |
| Form submission | `templates/forms/` |
| Dashboard/data | `templates/dashboard/` |
| Settings page | `templates/settings/` |
| Onboarding flow | `templates/onboarding/` |
| API endpoints | `templates/api/` |
| Accessibility | `templates/accessibility/` |
Adapt the template to the specific app — replace `{{placeholders}}` with actual selectors, URLs, and data.
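The table lookup can be approximated with keyword matching on the input description. This is a rough sketch: the template paths come from the table above, but the keyword patterns and the `selectTemplate` name are assumptions made for illustration, and the first matching rule wins.
```typescript
// Keyword patterns are guesses; the template paths are the ones listed in the table.
const TEMPLATE_RULES: Array<{ pattern: RegExp; template: string }> = [
  { pattern: /\b(log ?in|sign ?in|auth)/i, template: 'templates/auth/login.md' },
  { pattern: /\b(checkout|payment)/i, template: 'templates/checkout/' },
  { pattern: /\b(search|filter)/i, template: 'templates/search/' },
  { pattern: /\bonboarding/i, template: 'templates/onboarding/' },
  { pattern: /\bsettings/i, template: 'templates/settings/' },
  { pattern: /\bdashboard/i, template: 'templates/dashboard/' },
  { pattern: /\b(accessib|a11y)/i, template: 'templates/accessibility/' },
  { pattern: /\bapi\b/i, template: 'templates/api/' },
  { pattern: /\b(form|submit)/i, template: 'templates/forms/' },
  { pattern: /\b(create|edit|delete|crud)/i, template: 'templates/crud/' },
];

// Returns the first matching template path, or undefined when nothing matches.
function selectTemplate(description: string): string | undefined {
  for (const rule of TEMPLATE_RULES) {
    if (rule.pattern.test(description)) return rule.template;
  }
  return undefined;
}
```
Inputs that match nothing, such as a bare component path, fall through to `undefined`, which is the cue to read the source and write the test without a template.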
### 4. Generate the Test
Follow these rules:
**Structure:**
```typescript
import { test, expect } from '@playwright/test';
// Import custom fixtures if the project uses them
test.describe('Feature Name', () => {
// Group related behaviors
test('should <expected behavior>', async ({ page }) => {
// Arrange: navigate, set up state
// Act: perform user action
// Assert: verify outcome
});
});
```
**Locator priority** (use the first that works):
1. `getByRole()` — buttons, links, headings, form elements
2. `getByLabel()` — form fields with labels
3. `getByText()` — non-interactive text content
4. `getByPlaceholder()` — inputs with placeholder text
5. `getByTestId()` — when semantic options aren't available
**Assertions** — always web-first:
```typescript
// GOOD — auto-retries
await expect(page.getByRole('heading')).toBeVisible();
await expect(page.getByRole('alert')).toHaveText('Success');
// BAD — no retry
const text = await page.textContent('.msg');
expect(text).toBe('Success');
```
**Never use:**
- `page.waitForTimeout()`
- `page.$(selector)` or `page.$$(selector)`
- Bare CSS selectors unless absolutely necessary
- `page.evaluate()` for things locators can do
**Always include:**
- Descriptive test names that explain the behavior
- Error/edge case tests alongside happy path
- Proper `await` on every Playwright call
- `baseURL`-relative navigation (`page.goto('/')` not `page.goto('http://...')`)
### 5. Match Project Conventions
- If project uses TypeScript → generate `.spec.ts`
- If project uses JavaScript → generate `.spec.js` with `require()` imports
- If project has page objects → use them instead of inline locators
- If project has custom fixtures → import and use them
- If project has a test data directory → create test data files there
### 6. Generate Supporting Files (If Needed)
- **Page object**: If the test touches 5+ unique locators on one page, create a page object
- **Fixture**: If the test needs shared setup (auth, data), create or extend a fixture
- **Test data**: If the test uses structured data, create a JSON file in `test-data/`
### 7. Verify
Run the generated test:
```bash
npx playwright test <generated-file> --reporter=list
```
If it fails:
1. Read the error
2. Fix the test (not the app)
3. Run again
4. If it's an app issue, report it to the user
## Output
- Generated test file(s) with path
- Any supporting files created (page objects, fixtures, data)
- Test run result
- Coverage note: what behaviors are now tested
FILE:skills/generate/patterns.md
# Test Generation Patterns
## Pattern: Authentication Flow
```typescript
test.describe('Authentication', () => {
test('should login with valid credentials', async ({ page }) => {
await page.goto('/login');
await page.getByLabel('Email').fill('user@example.com');
await page.getByLabel('Password').fill('password123');
await page.getByRole('button', { name: 'Sign in' }).click();
await expect(page).toHaveURL('/dashboard');
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
test('should show error for invalid credentials', async ({ page }) => {
await page.goto('/login');
await page.getByLabel('Email').fill('user@example.com');
await page.getByLabel('Password').fill('wrong');
await page.getByRole('button', { name: 'Sign in' }).click();
await expect(page.getByRole('alert')).toHaveText(/invalid/i);
await expect(page).toHaveURL('/login');
});
});
```
## Pattern: CRUD Operations
```typescript
test.describe('Items', () => {
test('should create a new item', async ({ page }) => {
await page.goto('/items');
await page.getByRole('button', { name: 'Add item' }).click();
await page.getByLabel('Name').fill('Test Item');
await page.getByRole('button', { name: 'Save' }).click();
await expect(page.getByText('Test Item')).toBeVisible();
});
test('should edit an existing item', async ({ page }) => {
await page.goto('/items');
await page.getByRole('row', { name: /Test Item/ })
.getByRole('button', { name: 'Edit' }).click();
await page.getByLabel('Name').clear();
await page.getByLabel('Name').fill('Updated Item');
await page.getByRole('button', { name: 'Save' }).click();
await expect(page.getByText('Updated Item')).toBeVisible();
});
test('should delete an item with confirmation', async ({ page }) => {
await page.goto('/items');
await page.getByRole('row', { name: /Test Item/ })
.getByRole('button', { name: 'Delete' }).click();
await page.getByRole('button', { name: 'Confirm' }).click();
await expect(page.getByText('Test Item')).not.toBeVisible();
});
});
```
## Pattern: Form with Validation
```typescript
test.describe('Contact Form', () => {
test.beforeEach(async ({ page }) => {
await page.goto('/contact');
});
test('should submit valid form', async ({ page }) => {
await page.getByLabel('Name').fill('Jane Doe');
await page.getByLabel('Email').fill('jane@example.com');
await page.getByLabel('Message').fill('Hello, this is a test message.');
await page.getByRole('button', { name: 'Send' }).click();
await expect(page.getByText('Message sent')).toBeVisible();
});
test('should show validation errors for empty required fields', async ({ page }) => {
await page.getByRole('button', { name: 'Send' }).click();
await expect(page.getByText('Name is required')).toBeVisible();
await expect(page.getByText('Email is required')).toBeVisible();
});
test('should validate email format', async ({ page }) => {
await page.getByLabel('Email').fill('not-an-email');
await page.getByRole('button', { name: 'Send' }).click();
await expect(page.getByText('Invalid email')).toBeVisible();
});
});
```
## Pattern: Search and Filter
```typescript
test.describe('Product Search', () => {
test('should return results for valid query', async ({ page }) => {
await page.goto('/products');
await page.getByPlaceholder('Search products').fill('laptop');
await page.getByRole('button', { name: 'Search' }).click();
await expect(page.getByRole('list')).toBeVisible();
const results = page.getByRole('listitem');
await expect(results).not.toHaveCount(0);
});
test('should show empty state for no results', async ({ page }) => {
await page.goto('/products');
await page.getByPlaceholder('Search products').fill('xyznonexistent');
await page.getByRole('button', { name: 'Search' }).click();
await expect(page.getByText('No products found')).toBeVisible();
});
test('should filter by category', async ({ page }) => {
await page.goto('/products');
await page.getByRole('combobox', { name: 'Category' }).selectOption('Electronics');
await expect(page.getByRole('listitem')).not.toHaveCount(0);
});
});
```
## Pattern: Navigation and Layout
```typescript
test.describe('Navigation', () => {
test('should navigate between pages', async ({ page }) => {
await page.goto('/');
await page.getByRole('link', { name: 'About' }).click();
await expect(page).toHaveURL('/about');
await expect(page.getByRole('heading', { level: 1 })).toHaveText('About');
});
test('should show mobile menu on small screens', async ({ page }) => {
await page.setViewportSize({ width: 375, height: 667 });
await page.goto('/');
await expect(page.getByRole('navigation')).not.toBeVisible();
await page.getByRole('button', { name: 'Menu' }).click();
await expect(page.getByRole('navigation')).toBeVisible();
});
});
```
## Pattern: API Mocking
```typescript
test.describe('Dashboard with mocked API', () => {
test('should display data from API', async ({ page }) => {
await page.route('**/api/dashboard', async (route) => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({ revenue: 50000, users: 1200 }),
});
});
await page.goto('/dashboard');
await expect(page.getByText('$50,000')).toBeVisible();
await expect(page.getByText('1,200')).toBeVisible();
});
test('should handle API errors gracefully', async ({ page }) => {
await page.route('**/api/dashboard', async (route) => {
await route.fulfill({ status: 500 });
});
await page.goto('/dashboard');
await expect(page.getByText(/error|try again/i)).toBeVisible();
});
});
```
FILE:skills/init/SKILL.md
---
name: "init"
description: >-
Set up Playwright in a project. Use when user says "set up playwright",
"add e2e tests", "configure playwright", "testing setup", "init playwright",
or "add test infrastructure".
---
# Initialize Playwright Project
Set up a production-ready Playwright testing environment. Detect the framework, generate config, folder structure, example test, and CI workflow.
## Steps
### 1. Analyze the Project
Use the `Explore` subagent to scan the project:
- Check `package.json` for framework (React, Next.js, Vue, Angular, Svelte)
- Check for `tsconfig.json` → use TypeScript; otherwise JavaScript
- Check if Playwright is already installed (`@playwright/test` in dependencies)
- Check for existing test directories (`tests/`, `e2e/`, `__tests__/`)
- Check for existing CI config (`.github/workflows/`, `.gitlab-ci.yml`)
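The dependency checks above can be sketched as a pure function over the parsed `package.json`. This is a minimal illustration, not part of Playwright: `Framework`, `PackageJson`, `detectFramework`, and `hasPlaywright` are hypothetical names, and the marker packages are assumptions about how each framework typically appears in a dependency list.

```typescript
// Hypothetical helper types; none of these names come from Playwright.
type Framework = 'next' | 'nuxt' | 'angular' | 'svelte' | 'vue' | 'react' | 'unknown';

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Meta-frameworks are listed before the libraries they build on, because a
// Next.js project also depends on `react` and a Nuxt project also depends on `vue`.
const FRAMEWORK_MARKERS: Array<[string, Framework]> = [
  ['next', 'next'],
  ['nuxt', 'nuxt'],
  ['@angular/core', 'angular'],
  ['svelte', 'svelte'],
  ['vue', 'vue'],
  ['react', 'react'],
];

// Merges runtime and dev dependencies into one lookup table.
function allDependencies(pkg: PackageJson): Record<string, string> {
  return { ...pkg.dependencies, ...pkg.devDependencies };
}

// Returns the first framework whose marker package is present.
function detectFramework(pkg: PackageJson): Framework {
  const deps = allDependencies(pkg);
  for (const [marker, framework] of FRAMEWORK_MARKERS) {
    if (marker in deps) return framework;
  }
  return 'unknown';
}

// True when `@playwright/test` is already listed, so installation can be skipped.
function hasPlaywright(pkg: PackageJson): boolean {
  return '@playwright/test' in allDependencies(pkg);
}
```

Checking the meta-frameworks first is the main design choice here: a plain `react` lookup would misclassify every Next.js project.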
### 2. Install Playwright
If not already installed:
```bash
npm init playwright@latest -- --quiet
```
Or if the user prefers manual setup:
```bash
npm install -D @playwright/test
npx playwright install --with-deps chromium
```
### 3. Generate `playwright.config.ts`
Adapt to the detected framework:
**Next.js:**
```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: [
    ['html', { open: 'never' }],
    ['list'],
  ],
  use: {
    baseURL: 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});
```
**React (Vite):**
- Change `baseURL` and `webServer.url` to `http://localhost:5173`
- Keep `webServer.command` as `npm run dev`
**Vue/Nuxt:**
- Keep `baseURL` and `webServer.url` at `http://localhost:3000` (the Nuxt default; Vite-based Vue projects default to `http://localhost:5173`)
- Keep `webServer.command` as `npm run dev`
**Angular:**
- Change `baseURL` and `webServer.url` to `http://localhost:4200`
- Change `webServer.command` to `npm run start`
**No framework detected:**
- Omit `webServer` block
- Set `baseURL` from user input or leave as placeholder
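One way to keep these per-framework values in a single place is a small lookup table that produces only the framework-dependent part of the config. The sketch below is illustrative: `Framework`, `DEV_SERVERS`, `ConfigOverrides`, and `frameworkOverrides` are hypothetical names, the ports and commands are copied from the list above, and `isCI` is passed in explicitly so the function stays pure.

```typescript
// Hypothetical names; the values mirror the per-framework list above.
type Framework = 'next' | 'react-vite' | 'vue-nuxt' | 'angular' | 'none';

interface DevServerSettings {
  baseURL: string;
  command: string;
}

interface ConfigOverrides {
  use: { baseURL?: string };
  webServer?: { command: string; url: string; reuseExistingServer: boolean };
}

const DEV_SERVERS: Record<Framework, DevServerSettings | null> = {
  'next': { baseURL: 'http://localhost:3000', command: 'npm run dev' },
  'react-vite': { baseURL: 'http://localhost:5173', command: 'npm run dev' },
  'vue-nuxt': { baseURL: 'http://localhost:3000', command: 'npm run dev' },
  'angular': { baseURL: 'http://localhost:4200', command: 'npm run start' },
  'none': null,
};

// Builds the `use.baseURL` and `webServer` fragments for one framework.
function frameworkOverrides(framework: Framework, isCI: boolean): ConfigOverrides {
  const server = DEV_SERVERS[framework];
  if (server === null) {
    // No framework detected: omit `webServer` and leave `baseURL` for the user.
    return { use: {} };
  }
  return {
    use: { baseURL: server.baseURL },
    // `url` reuses `baseURL` so Playwright waits on the same port the tests hit.
    webServer: { command: server.command, url: server.baseURL, reuseExistingServer: !isCI },
  };
}
```

Deriving `webServer.url` from `baseURL` avoids the case where one is updated and the other still points at the old port.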
### 4. Create Folder Structure
```
e2e/
├── fixtures/
│ └── index.ts # Custom fixtures
├── pages/
│ └── .gitkeep # Page object models
├── test-data/
│ └── .gitkeep # Test data files
└── example.spec.ts # First example test
```
### 5. Generate Example Test
```typescript
import { test, expect } from '@playwright/test';
test.describe('Homepage', () => {
test('should load successfully', async ({ page }) => {
await page.goto('/');
await expect(page).toHaveTitle(/.+/);
});
test('should have visible navigation', async ({ page }) => {
await page.goto('/');
await expect(page.getByRole('navigation')).toBeVisible();
});
});
```
### 6. Generate CI Workflow
If `.github/workflows/` exists, create `playwright.yml`:
```yaml
name: "playwright-tests"
on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main, dev]
jobs:
  test:
    timeout-minutes: 60
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
      - name: "install-dependencies"
        run: npm ci
      - name: "install-playwright-browsers"
        run: npx playwright install --with-deps
      - name: "run-playwright-tests"
        run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: "playwright-report"
          path: playwright-report/
          retention-days: 30
```
If `.gitlab-ci.yml` exists, add a Playwright stage instead.
### 7. Update `.gitignore`
Append if not already present:
```
/test-results/
/playwright-report/
/blob-report/
/playwright/.cache/
```
### 8. Add npm Scripts
Add to `package.json` scripts:
```json
{
"test:e2e": "playwright test",
"test:e2e:ui": "playwright test --ui",
"test:e2e:debug": "playwright test --debug"
}
```
### 9. Verify Setup
Run the example test:
```bash
npx playwright test
```
Report the result. If it fails, diagnose and fix before completing.
## Output
Confirm what was created:
- Config file path and key settings
- Test directory and example test
- CI workflow (if applicable)
- npm scripts added
- How to run: `npx playwright test` or `npm run test:e2e`
FILE:skills/migrate/SKILL.md
---
name: "migrate"
description: >-
Migrate from Cypress or Selenium to Playwright. Use when user mentions
"cypress", "selenium", "migrate tests", "convert tests", "switch to
playwright", "move from cypress", or "replace selenium".
---
# Migrate to Playwright
Interactive migration from Cypress or Selenium to Playwright with file-by-file conversion.
## Input
`$ARGUMENTS` can be:
- `"from cypress"` — migrate Cypress test suite
- `"from selenium"` — migrate Selenium/WebDriver tests
- A file path: convert a specific test file
- Empty: auto-detect source framework
## Steps
### 1. Detect Source Framework
Use `Explore` subagent to scan:
- `cypress/` directory or `cypress.config.ts` → Cypress
- `selenium`, `webdriver` in `package.json` deps → Selenium
- `.py` test files with `selenium` imports → Selenium (Python)
### 2. Assess Migration Scope
Count files and categorize:
```
Migration Assessment:
- Total test files: X
- Cypress custom commands: Y
- Cypress fixtures: Z
- Estimated effort: [small|medium|large]
```
| Size | Files | Approach |
|---|---|---|
| Small | 1-10 | Convert sequentially (direct conversion) |
| Medium | 11-30 | Batch in groups of 5 using sub-agents |
| Large | 31+ | Parallel conversion with `/batch` |
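The size thresholds can be expressed as a small classifier. This is a sketch only: `planMigration` and `chunk` are hypothetical helper names, and the thresholds are taken directly from the table above.

```typescript
type MigrationSize = 'small' | 'medium' | 'large';

interface MigrationPlan {
  size: MigrationSize;
  approach: string;
}

// Maps a test-file count to the size bucket and approach from the table.
function planMigration(fileCount: number): MigrationPlan {
  if (!Number.isInteger(fileCount) || fileCount < 1) {
    throw new RangeError('fileCount must be a positive integer');
  }
  if (fileCount <= 10) return { size: 'small', approach: 'Convert sequentially' };
  if (fileCount <= 30) return { size: 'medium', approach: 'Batch in groups of 5 using sub-agents' };
  return { size: 'large', approach: 'Parallel conversion with /batch' };
}

// Splits a file list into consecutive groups, e.g. groups of 5 for medium suites.
function chunk<T>(items: readonly T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

For a medium suite, `chunk(files, 5)` yields the batches to hand to sub-agents one group at a time.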
### 3. Set Up Playwright (If Not Present)
Run `/pw:init` first if Playwright isn't configured.
### 4. Convert Files
For each file, apply the appropriate mapping:
#### Cypress → Playwright
Load `cypress-mapping.md` for complete reference.
Key translations:
```
cy.visit(url) → page.goto(url)
cy.get(selector) → page.locator(selector) or page.getByRole(...)
cy.contains(text) → page.getByText(text)
cy.find(selector) → locator.locator(selector)
cy.click() → locator.click()
cy.type(text) → locator.fill(text)
cy.should('be.visible') → expect(locator).toBeVisible()
cy.should('have.text') → expect(locator).toHaveText(text)
cy.intercept() → page.route()
cy.wait('@alias') → page.waitForResponse()
cy.fixture() → JSON import or test data file
```
**Cypress custom commands** → Playwright fixtures or helper functions
**Cypress plugins** → Playwright config or fixtures
**`before`/`beforeEach`** → `test.beforeAll()` / `test.beforeEach()`
#### Selenium → Playwright
Load `selenium-mapping.md` for complete reference.
Key translations:
```
driver.get(url) → page.goto(url)
driver.findElement(By.id('x')) → page.locator('#x') or page.getByTestId('x')
driver.findElement(By.css('.x')) → page.locator('.x') or page.getByRole(...)
element.click() → locator.click()
element.sendKeys(text) → locator.fill(text)
element.getText() → locator.textContent()
WebDriverWait + ExpectedConditions → expect(locator).toBeVisible()
driver.switchTo().frame() → page.frameLocator()
Actions → locator.hover(), locator.dragTo()
```
### 5. Upgrade Locators
During conversion, upgrade selectors to Playwright best practices:
- `#id` → `getByTestId()` or `getByRole()`
- `.class` → `getByRole()` or `getByText()`
- `[data-testid]` → `getByTestId()`
- XPath → role-based locators
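As a rough illustration of the upgrade rules above, a converter could classify each legacy selector and suggest a replacement. The function below is a naive, hypothetical sketch (`suggestLocator` is not a Playwright API): it only looks at the selector's shape and cannot know which role or accessible name actually applies, so its output is a hint for manual review.

```typescript
// Suggests a Playwright locator family for a legacy selector string.
// Only the `[data-testid]` case can be converted mechanically; the other
// cases return the options listed in the rules above.
function suggestLocator(selector: string): string {
  const testId = selector.match(/^\[data-testid=["']?([^"'\]]+)["']?\]$/);
  if (testId) return `getByTestId('${testId[1]}')`;
  if (selector.startsWith('//') || selector.startsWith('xpath=')) return 'role-based locator';
  if (selector.startsWith('#')) return 'getByTestId() or getByRole()';
  if (selector.startsWith('.')) return 'getByRole() or getByText()';
  return 'review manually';
}
```

For example, `suggestLocator('[data-testid="save"]')` returns `getByTestId('save')`, while `suggestLocator('.btn-primary')` only names the candidate locator families.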
### 6. Convert Custom Commands / Utilities
- Cypress custom commands → Playwright custom fixtures via `test.extend()`
- Selenium page objects → Playwright page objects (keep structure, update API)
- Shared helpers → TypeScript utility functions
### 7. Verify Each Converted File
After converting each file:
```bash
npx playwright test <converted-file> --reporter=list
```
Fix any compilation or runtime errors before moving to the next file.
### 8. Clean Up
After all files are converted:
- Remove Cypress/Selenium dependencies from `package.json`
- Remove old config files (`cypress.config.ts`, etc.)
- Update CI workflow to use Playwright
- Update README with new test commands
Ask user before deleting anything.
## Output
- Conversion summary: files converted, total tests migrated
- Any tests that couldn't be auto-converted (manual intervention needed)
- Updated CI config
- Before/after comparison of test run results
FILE:skills/migrate/cypress-mapping.md
# Cypress → Playwright Mapping
## Commands
| Cypress | Playwright | Notes |
|---|---|---|
| `cy.visit('/page')` | `await page.goto('/page')` | Use `baseURL` in config |
| `cy.get('.selector')` | `page.locator('.selector')` | Prefer `getByRole()` |
| `cy.get('[data-cy=x]')` | `page.getByTestId('x')` | |
| `cy.contains('text')` | `page.getByText('text')` | |
| `cy.find('.child')` | `parent.locator('.child')` | Chain from parent locator |
| `cy.first()` | `locator.first()` | |
| `cy.last()` | `locator.last()` | |
| `cy.eq(n)` | `locator.nth(n)` | |
| `cy.parent()` | `locator.locator('..')` | Or restructure with better locators |
| `cy.children()` | `locator.locator('> *')` | |
| `cy.siblings()` | Not direct — restructure test | Use parent + filter |
## Actions
| Cypress | Playwright | Notes |
|---|---|---|
| `.click()` | `await locator.click()` | Always `await` |
| `.dblclick()` | `await locator.dblclick()` | |
| `.rightclick()` | `await locator.click({ button: 'right' })` | |
| `.type('text')` | `await locator.fill('text')` | `fill()` clears first |
| `.type('text', { delay: 50 })` | `await locator.pressSequentially('text', { delay: 50 })` | Simulates typing |
| `.clear()` | `await locator.clear()` | |
| `.check()` | `await locator.check()` | |
| `.uncheck()` | `await locator.uncheck()` | |
| `.select('value')` | `await locator.selectOption('value')` | |
| `.scrollTo()` | `await locator.scrollIntoViewIfNeeded()` | |
| `.trigger('event')` | `await locator.dispatchEvent('event')` | |
| `.focus()` | `await locator.focus()` | |
| `.blur()` | `await locator.blur()` | |
## Assertions
| Cypress | Playwright | Notes |
|---|---|---|
| `.should('be.visible')` | `await expect(locator).toBeVisible()` | Web-first, auto-retry |
| `.should('not.exist')` | `await expect(locator).not.toBeAttached()` | Or `.toHaveCount(0)`; `not.toBeVisible()` also passes for hidden elements still in the DOM |
| `.should('have.text', 'x')` | `await expect(locator).toHaveText('x')` | |
| `.should('contain', 'x')` | `await expect(locator).toContainText('x')` | |
| `.should('have.value', 'x')` | `await expect(locator).toHaveValue('x')` | |
| `.should('have.attr', 'x', 'y')` | `await expect(locator).toHaveAttribute('x', 'y')` | |
| `.should('have.class', 'x')` | `await expect(locator).toHaveClass(/x/)` | |
| `.should('be.disabled')` | `await expect(locator).toBeDisabled()` | |
| `.should('be.checked')` | `await expect(locator).toBeChecked()` | |
| `.should('have.length', n)` | `await expect(locator).toHaveCount(n)` | |
| `cy.url().should('include', '/x')` | `await expect(page).toHaveURL(/\/x/)` | |
| `cy.title().should('eq', 'x')` | `await expect(page).toHaveTitle('x')` | |
## Network
| Cypress | Playwright |
|---|---|
| `cy.intercept('GET', '/api/*', { body: data })` | `await page.route('**/api/*', route => route.fulfill({ body: JSON.stringify(data) }))` |
| `cy.intercept('POST', '/api/*').as('save')` | `const savePromise = page.waitForResponse('**/api/*')` |
| `cy.wait('@save')` | `await savePromise` |
## Fixtures & Custom Commands
| Cypress | Playwright |
|---|---|
| `cy.fixture('data.json')` | `import data from './test-data/data.json'` |
| `Cypress.Commands.add('login', ...)` | `test.extend({ authenticatedPage: ... })` |
| `beforeEach(() => { ... })` | `test.beforeEach(async ({ page }) => { ... })` |
| `before(() => { ... })` | `test.beforeAll(async () => { ... })` |
## Config
| Cypress | Playwright |
|---|---|
| `baseUrl` in `cypress.config.ts` | `use.baseURL` in `playwright.config.ts` |
| `defaultCommandTimeout` | `expect.timeout` or `use.actionTimeout` |
| `video: true` | `use.video: 'on'` |
| `screenshotOnRunFailure` | `use.screenshot: 'only-on-failure'` |
| `retries: { runMode: 2 }` | `retries: 2` |
FILE:skills/migrate/selenium-mapping.md
# Selenium → Playwright Mapping
## Driver Setup
| Selenium (JS) | Playwright |
|---|---|
| `new Builder().forBrowser('chrome').build()` | Handled by config — no driver setup |
| `driver.quit()` | Automatic — Playwright manages browser lifecycle |
| `driver.manage().setTimeouts(...)` | Config: `timeout`, `expect.timeout` |
## Navigation
| Selenium | Playwright | Notes |
|---|---|---|
| `driver.get(url)` | `await page.goto(url)` | Use `baseURL` |
| `driver.navigate().back()` | `await page.goBack()` | |
| `driver.navigate().forward()` | `await page.goForward()` | |
| `driver.navigate().refresh()` | `await page.reload()` | |
| `driver.getCurrentUrl()` | `page.url()` | |
| `driver.getTitle()` | `await page.title()` | |
## Element Location
| Selenium | Playwright | Preferred |
|---|---|---|
| `By.id('x')` | `page.locator('#x')` | `page.getByTestId('x')` |
| `By.css('.x')` | `page.locator('.x')` | `page.getByRole(...)` |
| `By.xpath('//div')` | `page.locator('xpath=//div')` | Avoid — use role-based |
| `By.name('x')` | `page.locator('[name=x]')` | `page.getByLabel(...)` |
| `By.linkText('x')` | `page.getByRole('link', { name: 'x' })` | ✅ Best practice |
| `By.partialLinkText('x')` | `page.getByRole('link', { name: /x/ })` | ✅ Best practice |
| `By.tagName('button')` | `page.getByRole('button')` | ✅ Best practice |
| `By.className('x')` | `page.locator('.x')` | `page.getByRole(...)` |
| `findElement()` | `page.locator(selector).first()` | A locator that matches exactly one element |
| `findElements()` | `page.locator(selector)` | Use `.count()` or `.all()` |
## Actions
| Selenium | Playwright |
|---|---|
| `element.click()` | `await locator.click()` |
| `element.sendKeys('text')` | `await locator.fill('text')` |
| `element.sendKeys(Key.ENTER)` | `await locator.press('Enter')` |
| `element.clear()` | `await locator.clear()` |
| `element.submit()` | `await locator.press('Enter')` or click submit button |
| `element.getText()` | `await locator.textContent()` |
| `element.getAttribute('x')` | `await locator.getAttribute('x')` |
| `element.isDisplayed()` | `await locator.isVisible()` |
| `element.isEnabled()` | `await locator.isEnabled()` |
| `element.isSelected()` | `await locator.isChecked()` |
## Waits
| Selenium | Playwright | Notes |
|---|---|---|
| `WebDriverWait(driver, 10).until(EC.visibilityOf(el))` | `await expect(locator).toBeVisible()` | Auto-retries |
| `WebDriverWait(driver, 10).until(EC.elementToBeClickable(el))` | `await locator.click()` | Auto-waits for clickable |
| `WebDriverWait(driver, 10).until(EC.presenceOf(el))` | `await expect(locator).toBeAttached()` | |
| `WebDriverWait(driver, 10).until(EC.textToBe(el, 'x'))` | `await expect(locator).toHaveText('x')` | |
| `Thread.sleep(3000)` | ❌ Never use | Use assertions instead |
| `driver.manage().setTimeouts({ implicit: 10000 })` | Not needed | Playwright auto-waits |
## Advanced
| Selenium | Playwright |
|---|---|
| `Actions(driver).moveToElement(el).perform()` | `await locator.hover()` |
| `Actions(driver).dragAndDrop(src, tgt).perform()` | `await src.dragTo(tgt)` |
| `Actions(driver).doubleClick(el).perform()` | `await locator.dblclick()` |
| `Actions(driver).contextClick(el).perform()` | `await locator.click({ button: 'right' })` |
| `driver.switchTo().frame(el)` | `page.frameLocator('#frame')` |
| `driver.switchTo().defaultContent()` | Not needed — use `page` directly |
| `driver.switchTo().alert()` | `page.on('dialog', d => d.accept())` |
| `driver.switchTo().window(handle)` | `const popup = await page.waitForEvent('popup')` |
| `driver.executeScript(js)` | `await page.evaluate(js)` |
| `driver.takeScreenshot()` | `await page.screenshot({ path: 'x.png' })` |
## Test Structure
| Selenium (Jest/Mocha) | Playwright |
|---|---|
| `describe('Suite', () => { ... })` | `test.describe('Suite', () => { ... })` |
| `it('should...', () => { ... })` | `test('should...', async ({ page }) => { ... })` |
| `beforeAll(() => { ... })` | `test.beforeAll(async () => { ... })` |
| `beforeEach(() => { ... })` | `test.beforeEach(async ({ page }) => { ... })` |
| `afterEach(() => { ... })` | `test.afterEach(async ({ page }) => { ... })` |
## Key Differences
1. **No implicit waits** — Playwright auto-waits for actionability
2. **No driver management** — Playwright handles browser lifecycle
3. **Built-in assertions** — `expect(locator)` with auto-retry
4. **Parallel by default** — tests run in parallel, must be isolated
5. **Traces instead of screenshots** — richer debugging artifacts
FILE:skills/report/SKILL.md
---
name: "report"
description: >-
Generate test report. Use when user says "test report", "results summary",
"test status", "show results", "test dashboard", or "how did tests go".
---
# Smart Test Reporting
Generate test reports that plug into the user's existing workflow. Zero new tools.
## Steps
### 1. Run Tests (If Not Already Run)
Check if recent test results exist:
```bash
ls -la test-results/ playwright-report/ 2>/dev/null
```
If no recent results, run tests:
```bash
npx playwright test --reporter=json,html,list 2>&1 | tee test-output.log
```
### 2. Parse Results
Read the JSON report, which the JSON reporter prints to stdout:
```bash
npx playwright test --reporter=json 2> /dev/null
```
Extract:
- Total tests, passed, failed, skipped, flaky
- Duration per test and total
- Failed test names with error messages
- Flaky tests (passed on retry)
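The extraction step can be sketched as a recursive walk over the report. The interfaces below type only the fields this sketch reads and are an assumption about the JSON reporter's output shape, which may differ between Playwright versions; `Summary` and `summarize` are hypothetical names.

```typescript
// Assumed (partial) shape of the Playwright JSON report.
interface JsonResult { status: string; duration: number; error?: { message?: string } }
interface JsonTest {
  projectName: string;
  status: 'expected' | 'unexpected' | 'flaky' | 'skipped';
  results: JsonResult[];
}
interface JsonSpec { title: string; file: string; line: number; tests: JsonTest[] }
interface JsonSuite { title: string; specs?: JsonSpec[]; suites?: JsonSuite[] }
interface JsonReport { suites: JsonSuite[] }

interface Summary {
  passed: number;
  failed: number;
  skipped: number;
  flaky: number;
  durationMs: number;
  failures: Array<{ name: string; error: string; location: string }>;
}

// Walks nested suites and tallies outcomes per test (one entry per project).
function summarize(report: JsonReport): Summary {
  const summary: Summary = { passed: 0, failed: 0, skipped: 0, flaky: 0, durationMs: 0, failures: [] };

  const visit = (suite: JsonSuite): void => {
    for (const spec of suite.specs ?? []) {
      for (const t of spec.tests) {
        // Every attempt counts toward total duration, including retries.
        for (const r of t.results) summary.durationMs += r.duration;
        if (t.status === 'expected') summary.passed++;
        else if (t.status === 'flaky') summary.flaky++;
        else if (t.status === 'skipped') summary.skipped++;
        else {
          summary.failed++;
          const last = t.results[t.results.length - 1];
          summary.failures.push({
            name: `${spec.title} [${t.projectName}]`,
            error: last?.error?.message ?? 'unknown error',
            location: `${spec.file}:${spec.line}`,
          });
        }
      }
    }
    for (const child of suite.suites ?? []) visit(child);
  };

  for (const suite of report.suites) visit(suite);
  return summary;
}
```

A test marked `flaky` is one that failed at least once and then passed on retry, which is why it is counted separately from both passes and failures.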
### 3. Detect Report Destination
Check what's configured and route automatically:
| Check | If found | Action |
|---|---|---|
| `TESTRAIL_URL` env var | TestRail configured | Push results via `/pw:testrail push` |
| `SLACK_WEBHOOK_URL` env var | Slack configured | Post summary to Slack |
| `.github/workflows/` | GitHub Actions | Results go to PR comment via artifacts |
| `playwright-report/` | HTML reporter | Open or serve the report |
| None of the above | Default | Generate markdown report |
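The routing table can be read as a sequence of independent checks. The sketch below assumes the caller has already collected the environment variables and the set of paths that exist in the project; `Destination` and `detectDestinations` are hypothetical names. The markdown report is always included, so it is the only entry when nothing else is configured.

```typescript
type Destination = 'testrail' | 'slack' | 'github-actions' | 'html' | 'markdown';

// Returns every destination whose check from the table passes.
function detectDestinations(
  env: Record<string, string | undefined>,
  existingPaths: ReadonlySet<string>,
): Destination[] {
  const found: Destination[] = [];
  if (env['TESTRAIL_URL']) found.push('testrail');
  if (env['SLACK_WEBHOOK_URL']) found.push('slack');
  if (existingPaths.has('.github/workflows')) found.push('github-actions');
  if (existingPaths.has('playwright-report')) found.push('html');
  // The markdown report is the default and is generated in every case.
  found.push('markdown');
  return found;
}
```

Passing the environment and path set as arguments, rather than reading them inside the function, keeps the routing logic easy to test.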
### 4. Generate Report
#### Markdown Report (Always Generated)
```markdown
# Test Results — {{date}}
## Summary
- ✅ Passed: {{passed}}
- ❌ Failed: {{failed}}
- ⏭️ Skipped: {{skipped}}
- 🔄 Flaky: {{flaky}}
- ⏱️ Duration: {{duration}}
## Failed Tests
| Test | Error | File |
|---|---|---|
| {{name}} | {{error}} | {{file}}:{{line}} |
## Flaky Tests
| Test | Retries | File |
|---|---|---|
| {{name}} | {{retries}} | {{file}} |
## By Project
| Browser | Passed | Failed | Duration |
|---|---|---|---|
| Chromium | X | Y | Zs |
| Firefox | X | Y | Zs |
| WebKit | X | Y | Zs |
```
Save to `test-reports/{{date}}-report.md`.
#### Slack Summary (If Webhook Configured)
```bash
curl -X POST "$SLACK_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d '{
"text": "🧪 Test Results: ✅ {{passed}} | ❌ {{failed}} | ⏱️ {{duration}}\n{{failed_details}}"
}'
```
#### TestRail Push (If Configured)
Invoke `/pw:testrail push` with the JSON results.
#### HTML Report
```bash
npx playwright show-report
```
Or if in CI:
```bash
echo "HTML report available at: playwright-report/index.html"
```
### 5. Trend Analysis (If Historical Data Exists)
If previous reports exist in `test-reports/`:
- Compare pass rate over time
- Identify tests that became flaky recently
- Highlight new failures vs. recurring failures
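A minimal sketch of this comparison, assuming each historical report has already been parsed into a simple record. `RunRecord`, `passRate`, and `compareRuns` are hypothetical names, and the record layout is an assumption rather than the format of the saved markdown reports.

```typescript
// Assumed shape of one parsed historical report.
interface RunRecord {
  date: string;
  passed: number;
  failed: number;
  failedTests: string[];
  flakyTests: string[];
}

interface Trend {
  passRateDelta: number;
  newFailures: string[];
  recurringFailures: string[];
  newlyFlaky: string[];
}

// Fraction of executed tests that passed; 0 when nothing ran.
function passRate(run: RunRecord): number {
  const total = run.passed + run.failed;
  return total === 0 ? 0 : run.passed / total;
}

// Compares the current run with the previous one.
function compareRuns(previous: RunRecord, current: RunRecord): Trend {
  const previouslyFailed = new Set(previous.failedTests);
  const previouslyFlaky = new Set(previous.flakyTests);
  return {
    passRateDelta: passRate(current) - passRate(previous),
    newFailures: current.failedTests.filter((name) => !previouslyFailed.has(name)),
    recurringFailures: current.failedTests.filter((name) => previouslyFailed.has(name)),
    newlyFlaky: current.flakyTests.filter((name) => !previouslyFlaky.has(name)),
  };
}
```

Separating new failures from recurring ones matters for triage: a new failure usually points at a recent change, while a recurring one points at a known, unfixed problem.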
## Output
- Summary with pass/fail/skip/flaky counts
- Failed test details with error messages
- Report destination confirmation
- Trend comparison (if historical data available)
- Next action recommendation (fix failures or celebrate green)
FILE:skills/review/SKILL.md
---
name: "review"
description: >-
Review Playwright tests for quality. Use when user says "review tests",
"check test quality", "audit tests", "improve tests", "test code review",
or "playwright best practices check".
---
# Review Playwright Tests
Systematically review Playwright test files for anti-patterns, missed best practices, and coverage gaps.
## Input
`$ARGUMENTS` can be:
- A file path: review that specific test file
- A directory: review all test files in the directory
- Empty: review all tests in the project's `testDir`
## Steps
### 1. Gather Context
- Read `playwright.config.ts` for project settings
- List all `*.spec.ts` / `*.spec.js` files in scope
- If reviewing a single file, also check related page objects and fixtures
### 2. Check Each File Against Anti-Patterns
Load `anti-patterns.md` from this skill directory. Check for all 20 anti-patterns.
**Critical (must fix):**
1. `waitForTimeout()` usage
2. Non-web-first assertions (`expect(await ...)`)
3. Hardcoded URLs instead of `baseURL`
4. CSS/XPath selectors when role-based exists
5. Missing `await` on Playwright calls
6. Shared mutable state between tests
7. Test execution order dependencies
**Warning (should fix):**
8. Tests longer than 50 lines (consider splitting)
9. Magic strings without named constants
10. Missing error/edge case tests
11. `page.evaluate()` for things locators can do
12. Nested `test.describe()` more than 2 levels deep
13. Generic test names ("should work", "test 1")
**Info (consider):**
14. No page objects for pages with 5+ locators
15. Inline test data instead of factory/fixture
16. Missing accessibility assertions
17. No visual regression tests for UI-heavy pages
18. Console error assertions not checked
19. Network idle waits instead of specific assertions
20. Missing `test.describe()` grouping
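A few of these checks lend themselves to simple pattern matching. The scanner below is a rough sketch, not a real linter: `RULES` and `scanSource` are hypothetical names, the regular expressions cover only four of the twenty items, and line-based matching can produce false positives (for example inside comments or strings).

```typescript
type Severity = 'critical' | 'info';

interface Finding {
  line: number;
  rule: string;
  severity: Severity;
}

// Heuristic patterns for a handful of the anti-patterns listed above.
const RULES: Array<{ rule: string; severity: Severity; pattern: RegExp }> = [
  { rule: 'waitForTimeout() usage', severity: 'critical', pattern: /\.waitForTimeout\s*\(/ },
  { rule: 'Non-web-first assertion', severity: 'critical', pattern: /expect\(\s*await\s/ },
  { rule: 'Hardcoded URL instead of baseURL', severity: 'critical', pattern: /\.goto\(\s*['"`]https?:\/\// },
  { rule: 'Network idle wait', severity: 'info', pattern: /networkidle/ },
];

// Scans source text line by line and reports 1-based line numbers.
function scanSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { rule, severity, pattern } of RULES) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule, severity });
    }
  });
  return findings;
}
```

Items such as shared mutable state or order dependencies need an understanding of the whole file and are better left to a manual read.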
### 3. Score Each File
Rate 1-10 based on:
- **9-10**: Production-ready, follows all golden rules
- **7-8**: Good, minor improvements possible
- **5-6**: Functional but has anti-patterns
- **3-4**: Significant issues, likely flaky
- **1-2**: Needs rewrite
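The bands above describe what each score means but not how to arrive at one. The sketch below pairs the band labels from the list with a purely hypothetical penalty formula (2 points per critical issue, 1 per warning, 0.5 per info item); the weights are an assumption for illustration, and `scoreFile` and `describeScore` are not part of the skill.

```typescript
interface IssueCounts {
  critical: number;
  warning: number;
  info: number;
}

// Hypothetical weights: start at 10 and subtract a penalty per issue,
// then clamp the rounded result to the 1-10 range.
function scoreFile(counts: IssueCounts): number {
  const raw = 10 - 2 * counts.critical - 1 * counts.warning - 0.5 * counts.info;
  return Math.min(10, Math.max(1, Math.round(raw)));
}

// Maps a 1-10 score to the band descriptions listed above.
function describeScore(score: number): string {
  if (score < 1 || score > 10) throw new RangeError('score must be between 1 and 10');
  if (score >= 9) return 'Production-ready, follows all golden rules';
  if (score >= 7) return 'Good, minor improvements possible';
  if (score >= 5) return 'Functional but has anti-patterns';
  if (score >= 3) return 'Significant issues, likely flaky';
  return 'Needs rewrite';
}
```

With these assumed weights, a file with two critical issues and one warning scores 5 and lands in the "functional but has anti-patterns" band.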
### 4. Generate Review Report
For each file:
```
## <filename> — Score: X/10
### Critical
- Line 15: `waitForTimeout(2000)` → use `expect(locator).toBeVisible()`
- Line 28: CSS selector `.btn-submit` → `getByRole('button', { name: "submit" })`
### Warning
- Line 42: Test name "test login" → "should redirect to dashboard after login"
### Suggestions
- Consider adding error case: what happens with invalid credentials?
```
### 5. For Project-Wide Review
If reviewing an entire test suite:
- Spawn sub-agents per file for parallel review (up to 5 concurrent)
- Or use `/batch` for very large suites
- Aggregate results into a summary table
### 6. Offer Fixes
For each critical issue, provide the corrected code. Ask user: "Apply these fixes? [Yes/No]"
If yes, apply all fixes using `Edit` tool.
## Output
- File-by-file review with scores
- Summary: total files, average score, critical issue count
- Actionable fix list
- Coverage gaps identified (pages/features with no tests)
FILE:skills/review/anti-patterns.md
# Playwright Anti-Patterns Reference
## 1. Using `waitForTimeout()`
**Bad:**
```typescript
await page.click('.submit');
await page.waitForTimeout(3000);
await expect(page.locator('.result')).toBeVisible();
```
**Good:**
```typescript
await page.getByRole('button', { name: 'Submit' }).click();
await expect(page.getByTestId('result')).toBeVisible();
```
**Why:** Arbitrary waits slow tests and cause flakiness. Web-first assertions auto-retry.
## 2. Non-Web-First Assertions
**Bad:**
```typescript
const text = await page.textContent('.message');
expect(text).toBe('Success');
```
**Good:**
```typescript
await expect(page.getByText('Success')).toBeVisible();
```
**Why:** `expect(locator)` auto-retries until timeout. `expect(value)` checks once and fails.
## 3. Hardcoded URLs
**Bad:**
```typescript
await page.goto('http://localhost:3000/login');
```
**Good:**
```typescript
await page.goto('/login');
```
**Why:** `baseURL` in config handles the host. Tests break across environments with hardcoded URLs.
## 4. CSS/XPath When Role-Based Exists
**Bad:**
```typescript
await page.click('#submit-btn');
await page.locator('.nav-link.active').click();
```
**Good:**
```typescript
await page.getByRole('button', { name: 'Submit' }).click();
await page.getByRole('link', { name: 'Dashboard' }).click();
```
**Why:** Role-based locators survive CSS renames, class refactors, and component library changes.
## 5. Missing `await`
**Bad:**
```typescript
page.goto('/dashboard');
expect(page.getByText('Welcome')).toBeVisible();
```
**Good:**
```typescript
await page.goto('/dashboard');
await expect(page.getByText('Welcome')).toBeVisible();
```
**Why:** Missing `await` causes race conditions. Tests pass sometimes, fail others.
## 6. Shared Mutable State
**Bad:**
```typescript
let userId: string;

test('create user', async ({ page }) => {
  // ... creates user, sets userId
  userId = '123';
});

test('edit user', async ({ page }) => {
  await page.goto(`/users/${userId}`); // depends on previous test
});
```
**Good:**
```typescript
test('edit user', async ({ page }) => {
  // Create user via API in this test's setup
  const userId = await createUserViaAPI();
  await page.goto(`/users/${userId}`);
});
```
**Why:** Tests must be independent. Shared state causes order-dependent failures.
## 7. Execution Order Dependencies
**Bad:**
```typescript
test('step 1: fill form', async ({ page }) => { ... });
test('step 2: submit form', async ({ page }) => { ... });
test('step 3: verify result', async ({ page }) => { ... });
```
**Good:**
```typescript
test('should fill and submit form successfully', async ({ page }) => {
// All steps in one test
});
```
**Why:** Playwright runs tests in parallel by default. Order-dependent tests fail randomly.
## 8. Tests Over 50 Lines
Split into focused tests. Each test should verify one behavior.
## 9. Magic Strings
**Bad:**
```typescript
await page.getByLabel('Email').fill('user@example.com');
```
**Good:**
```typescript
const TEST_USER = { email: 'user@example.com', password: 'Test123!' };
await page.getByLabel('Email').fill(TEST_USER.email);
```
## 10. Missing Error Cases
If you test the happy path, also test:
- Invalid input
- Empty state
- Network error
- Permission denied
- Timeout/loading state
## 11. Using `page.evaluate()` Unnecessarily
**Bad:**
```typescript
const text = await page.evaluate(() => document.querySelector('.title')?.textContent);
```
**Good:**
```typescript
await expect(page.getByRole('heading')).toHaveText('Expected Title');
```
## 12. Deep Nesting
Keep `test.describe()` to max 2 levels. More makes tests hard to find and maintain.
## 13. Generic Test Names
**Bad:** `test('test 1')`, `test('should work')`, `test('login test')`
**Good:** `test('should show error when email is invalid')`, `test('should redirect to dashboard after successful login')`
## 14-20. Style Issues
- No page objects for complex pages → create them
- Inline data → use factories or fixtures
- Missing a11y assertions → add `toHaveAttribute('role', ...)`
- No visual regression → add `toHaveScreenshot()` for key pages
- Not checking console errors → add `page.on('console', ...)`
- Using `networkidle` → use specific assertions instead
- No `test.describe()` → group related tests
FILE:skills/testrail/SKILL.md
---
name: "testrail"
description: >-
Sync tests with TestRail. Use when user mentions "testrail", "test management",
"test cases", "test run", "sync test cases", "push results to testrail",
or "import from testrail".
---
# TestRail Integration
Bidirectional sync between Playwright tests and TestRail test management.
## Prerequisites
Environment variables must be set:
- `TESTRAIL_URL` — e.g., `https://your-instance.testrail.io`
- `TESTRAIL_USER` — your email
- `TESTRAIL_API_KEY` — API key from TestRail
If not set, inform the user how to configure them and stop.
## Capabilities
### 1. Import Test Cases → Generate Playwright Tests
```
/pw:testrail import --project <id> --suite <id>
```
Steps:
1. Call `testrail_get_cases` MCP tool to fetch test cases
2. For each test case:
- Read title, preconditions, steps, expected results
- Map to a Playwright test using appropriate template
- Include TestRail case ID as test annotation: `test.info().annotations.push({ type: 'testrail', description: 'C12345' })`
3. Generate test files grouped by section
4. Report: X cases imported, Y tests generated
### 2. Push Test Results → TestRail
```
/pw:testrail push --run <id>
```
Steps:
1. Run Playwright tests with JSON reporter:
```bash
npx playwright test --reporter=json > test-results.json
```
2. Parse results: map each test to its TestRail case ID (from annotations)
3. Call `testrail_add_result` MCP tool for each test:
- Pass → status_id: 1
- Fail → status_id: 5, include error message
- Skip → status_id: 2 (Blocked in a default TestRail setup, which has no built-in skipped status)
4. Report: X results pushed, Y passed, Z failed
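Steps 2 and 3 can be sketched as a mapping from one parsed test outcome to a result payload. The status IDs come from the list above; everything else is an assumption: `TestOutcome`, `TestRailResult`, and `toTestRailResult` are hypothetical names, and the exact arguments the `testrail_add_result` MCP tool expects are not specified here.

```typescript
type Outcome = 'passed' | 'failed' | 'skipped';

interface Annotation { type: string; description?: string }

// Assumed shape of one test after parsing the Playwright JSON report.
interface TestOutcome {
  title: string;
  outcome: Outcome;
  annotations: Annotation[];
  error?: string;
}

// Hypothetical payload; field names follow common TestRail API naming.
interface TestRailResult {
  case_id: number;
  status_id: number;
  comment?: string;
}

// Status IDs as listed in step 3.
const STATUS_IDS: Record<Outcome, number> = { passed: 1, failed: 5, skipped: 2 };

// Returns undefined for tests that carry no `testrail` annotation like `C12345`.
function toTestRailResult(test: TestOutcome): TestRailResult | undefined {
  const annotation = test.annotations.find((a) => a.type === 'testrail');
  const match = annotation?.description?.match(/^C(\d+)$/);
  if (!match) return undefined;
  const result: TestRailResult = { case_id: Number(match[1]), status_id: STATUS_IDS[test.outcome] };
  // Only failures carry the error message along.
  if (test.outcome === 'failed' && test.error) result.comment = test.error;
  return result;
}
```

Tests that return `undefined` are the unmatched ones to list in the final report rather than push.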
### 3. Create Test Run
```
/pw:testrail run --project <id> --name "Sprint 42 Regression"
```
Steps:
1. Call `testrail_add_run` MCP tool
2. Include all test case IDs found in Playwright test annotations
3. Return run ID for result pushing
### 4. Sync Status
```
/pw:testrail status --project <id>
```
Steps:
1. Fetch test cases from TestRail
2. Scan local Playwright tests for TestRail annotations
3. Report coverage:
```
TestRail cases: 150
Playwright tests with TestRail IDs: 120
Unlinked TestRail cases: 30
Playwright tests without TestRail IDs: 15
```
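The four numbers in this coverage report follow from two inputs: the case IDs fetched from TestRail and the local tests with their optional annotation. The sketch below assumes both have already been collected; `LocalTest`, `CoverageReport`, and `syncStatus` are hypothetical names.

```typescript
interface LocalTest {
  title: string;
  caseId?: string; // e.g. 'C12345', taken from the `testrail` annotation
}

interface CoverageReport {
  testRailCases: number;
  linkedTests: number;
  unlinkedCases: string[];
  testsWithoutIds: string[];
}

// Cross-references TestRail case IDs with locally annotated tests.
function syncStatus(testRailCaseIds: readonly string[], localTests: readonly LocalTest[]): CoverageReport {
  const linkedIds = new Set<string>();
  const testsWithoutIds: string[] = [];
  for (const t of localTests) {
    if (t.caseId) linkedIds.add(t.caseId);
    else testsWithoutIds.push(t.title);
  }
  return {
    testRailCases: testRailCaseIds.length,
    linkedTests: localTests.length - testsWithoutIds.length,
    unlinkedCases: testRailCaseIds.filter((id) => !linkedIds.has(id)),
    testsWithoutIds,
  };
}
```

Returning the unlinked IDs and unannotated titles, not just their counts, gives the user a concrete list to act on.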
### 5. Update Test Cases in TestRail
```
/pw:testrail update --case <id>
```
Steps:
1. Read the Playwright test for this case ID
2. Extract steps and expected results from test code
3. Call `testrail_update_case` MCP tool to update steps
## MCP Tools Used
| Tool | When |
|---|---|
| `testrail_get_projects` | List available projects |
| `testrail_get_suites` | List suites in project |
| `testrail_get_cases` | Read test cases |
| `testrail_add_case` | Create new test case |
| `testrail_update_case` | Update existing case |
| `testrail_add_run` | Create test run |
| `testrail_add_result` | Push individual result |
| `testrail_get_results` | Read historical results |
## Test Annotation Format
All Playwright tests linked to TestRail include:
```typescript
test('should login successfully', async ({ page }) => {
test.info().annotations.push({
type: 'testrail',
description: 'C12345',
});
// ... test code
});
```
This annotation is the bridge between Playwright and TestRail.
## Output
- Operation summary with counts
- Any errors or unmatched cases
- Link to TestRail run/results
FILE:templates/README.md
# Test Case Templates
55 ready-to-use, parametrizable Playwright test templates. Each includes TypeScript and JavaScript examples with `{{placeholder}}` markers for customization.
## Usage
Templates are loaded by `/pw:generate` when it detects a matching scenario. You can also reference them directly:
```
/pw:generate "login flow" → loads templates/auth/login.md
```
## Template Index
### Authentication (8)
| Template | Tests |
|---|---|
| [login.md](auth/login.md) | Email/password login, social login, remember me |
| [logout.md](auth/logout.md) | Logout from nav, session cleanup |
| [sso.md](auth/sso.md) | SSO redirect flow, callback handling |
| [mfa.md](auth/mfa.md) | 2FA code entry, backup codes |
| [password-reset.md](auth/password-reset.md) | Request reset, enter new password, expired link |
| [session-timeout.md](auth/session-timeout.md) | Auto-logout, session refresh |
| [remember-me.md](auth/remember-me.md) | Persistent login, cookie expiry |
| [rbac.md](auth/rbac.md) | Role-based access, forbidden page |
### CRUD Operations (6)
| Template | Tests |
|---|---|
| [create.md](crud/create.md) | Create entity with form |
| [read.md](crud/read.md) | View details, list view |
| [update.md](crud/update.md) | Edit entity, inline edit |
| [delete.md](crud/delete.md) | Delete with confirmation |
| [bulk-operations.md](crud/bulk-operations.md) | Select multiple, bulk actions |
| [soft-delete.md](crud/soft-delete.md) | Archive, restore |
### Checkout (6)
| Template | Tests |
|---|---|
| [add-to-cart.md](checkout/add-to-cart.md) | Add item, update cart |
| [update-quantity.md](checkout/update-quantity.md) | Increase, decrease, remove |
| [apply-coupon.md](checkout/apply-coupon.md) | Valid/invalid/expired codes |
| [payment.md](checkout/payment.md) | Card form, validation, processing |
| [order-confirm.md](checkout/order-confirm.md) | Success page, order details |
| [order-history.md](checkout/order-history.md) | List orders, pagination |
### Search & Filter (5)
| Template | Tests |
|---|---|
| [basic-search.md](search/basic-search.md) | Search input, results |
| [filters.md](search/filters.md) | Category, price, checkboxes |
| [sorting.md](search/sorting.md) | Sort by name, date, price |
| [pagination.md](search/pagination.md) | Page nav, items per page |
| [empty-state.md](search/empty-state.md) | No results, clear filters |
### Forms (6)
| Template | Tests |
|---|---|
| [single-step.md](forms/single-step.md) | Simple form submission |
| [multi-step.md](forms/multi-step.md) | Wizard with progress |
| [validation.md](forms/validation.md) | Required, format, inline errors |
| [file-upload.md](forms/file-upload.md) | Single, multiple, drag-drop |
| [conditional-fields.md](forms/conditional-fields.md) | Show/hide based on selection |
| [autosave.md](forms/autosave.md) | Draft save, restore |
### Dashboard (5)
| Template | Tests |
|---|---|
| [data-loading.md](dashboard/data-loading.md) | Loading state, skeleton, data |
| [chart-rendering.md](dashboard/chart-rendering.md) | Chart visible, tooltips |
| [date-range-filter.md](dashboard/date-range-filter.md) | Date picker, presets |
| [export.md](dashboard/export.md) | CSV/PDF download |
| [realtime-updates.md](dashboard/realtime-updates.md) | Live data, websocket |
### Settings (4)
| Template | Tests |
|---|---|
| [profile-update.md](settings/profile-update.md) | Name, email, avatar |
| [password-change.md](settings/password-change.md) | Current + new password |
| [notification-prefs.md](settings/notification-prefs.md) | Toggle, save prefs |
| [account-delete.md](settings/account-delete.md) | Confirm deletion |
### Onboarding (4)
| Template | Tests |
|---|---|
| [registration.md](onboarding/registration.md) | Signup form, validation |
| [email-verification.md](onboarding/email-verification.md) | Verify link, resend |
| [welcome-tour.md](onboarding/welcome-tour.md) | Step tour, skip |
| [first-time-setup.md](onboarding/first-time-setup.md) | Initial config |
### Notifications (3)
| Template | Tests |
|---|---|
| [in-app.md](notifications/in-app.md) | Badge, dropdown, mark read |
| [toast-messages.md](notifications/toast-messages.md) | Success/error toasts |
| [notification-center.md](notifications/notification-center.md) | List, filter, clear |
### API Testing (5)
| Template | Tests |
|---|---|
| [rest-crud.md](api/rest-crud.md) | GET/POST/PUT/DELETE |
| [graphql.md](api/graphql.md) | Query, mutation |
| [auth-headers.md](api/auth-headers.md) | Token, expired, refresh |
| [error-responses.md](api/error-responses.md) | 400-500 status handling |
| [rate-limiting.md](api/rate-limiting.md) | Rate limit, retry-after |
### Accessibility (3)
| Template | Tests |
|---|---|
| [keyboard-navigation.md](accessibility/keyboard-navigation.md) | Tab order, focus |
| [screen-reader.md](accessibility/screen-reader.md) | ARIA labels, live regions |
| [color-contrast.md](accessibility/color-contrast.md) | Contrast ratios |
FILE:templates/accessibility/color-contrast.md
# Color Contrast Template
Tests contrast ratios, color-independent status indicators, and focus indicator visibility.
## Prerequisites
- App running at `{{baseUrl}}`
- axe-playwright installed: `npm i -D @axe-core/playwright`
- Page under test: `{{baseUrl}}/{{pagePath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';
test.describe('Color Contrast', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
// Happy path: no color contrast violations (axe)
test('has no color contrast violations', async ({ page }) => {
const results = await new AxeBuilder({ page })
.withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
.withRules(['color-contrast'])
.analyze();
expect(results.violations).toEqual([]);
});
// Sanity check: body text has a non-transparent color (the axe color-contrast scan verifies the actual ≥ 4.5:1 ratio)
test('body text color is not transparent', async ({ page }) => {
const color = await page.evaluate(() => {
const el = document.querySelector('p, main, [class*="body"]') as HTMLElement;
if (!el) return null;
const style = getComputedStyle(el);
// Simplified check: use axe for full contrast verification
return style.color !== 'rgba(0, 0, 0, 0)' ? style.color : null;
});
expect(color).toBeTruthy();
});
// Happy path: large text contrast ratio ≥ 3:1
test('headings have sufficient contrast', async ({ page }) => {
const results = await new AxeBuilder({ page })
.withRules(['color-contrast'])
.include('h1, h2, h3, h4, h5, h6')
.analyze();
expect(results.violations).toEqual([]);
});
// Happy path: focus indicator meets contrast requirement
test('focus indicator is visible and meets contrast', async ({ page }) => {
await page.getByRole('button').first().focus();
const outline = await page.getByRole('button').first().evaluate(el => {
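// The element is already focused, so its computed style reflects :focus rules
// (the second argument of getComputedStyle is for pseudo-elements only)
const s = getComputedStyle(el);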
return {
outlineWidth: parseFloat(s.outlineWidth),
outlineColor: s.outlineColor,
outlineStyle: s.outlineStyle,
};
});
expect(outline.outlineWidth).toBeGreaterThanOrEqual(2);
expect(outline.outlineColor).not.toBe('rgba(0, 0, 0, 0)');
});
// Happy path: error text contrast
test('error messages have sufficient contrast', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /submit/i }).click();
const results = await new AxeBuilder({ page })
.withRules(['color-contrast'])
.include('[class*="error"], [role="alert"]')
.analyze();
expect(results.violations).toEqual([]);
});
// Happy path: no information conveyed by color alone
test('status badges use text or icon in addition to color', async ({ page }) => {
const badges = page.getByRole('status');
const count = await badges.count();
for (let i = 0; i < count; i++) {
const text = await badges.nth(i).textContent();
const ariaLabel = await badges.nth(i).getAttribute('aria-label');
expect(text?.trim() || ariaLabel).toBeTruthy();
}
});
// Edge case: full page axe scan for all WCAG 2.1 AA issues
test('full page passes WCAG 2.1 AA axe scan', async ({ page }) => {
const results = await new AxeBuilder({ page })
.withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
.exclude('{{knownExcludedSelector}}')
.analyze();
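if (results.violations.length > 0) {
// Interpolate each violation so the failure message lists rule id, description, and targets
const messages = results.violations.map(v =>
`${v.id}: ${v.description} - ${v.nodes.map(n => n.target).join(', ')}`
).join('\n');
throw new Error(`Axe violations:\n${messages}`);
}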
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const AxeBuilder = require('@axe-core/playwright').default;
test.describe('Color Contrast', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
test('no color contrast violations', async ({ page }) => {
const results = await new AxeBuilder({ page })
.withRules(['color-contrast'])
.analyze();
expect(results.violations).toEqual([]);
});
test('focus indicator is visible', async ({ page }) => {
await page.getByRole('button').first().focus();
const outlineWidth = await page.getByRole('button').first().evaluate(
el => parseFloat(getComputedStyle(el).outlineWidth)
);
expect(outlineWidth).toBeGreaterThanOrEqual(2);
});
test('status badges use text not just color', async ({ page }) => {
const badges = page.getByRole('status');
const count = await badges.count();
for (let i = 0; i < count; i++) {
const text = await badges.nth(i).textContent();
const label = await badges.nth(i).getAttribute('aria-label');
expect((text?.trim()) || label).toBeTruthy();
}
});
test('full page passes WCAG 2.1 AA', async ({ page }) => {
const results = await new AxeBuilder({ page })
.withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
.analyze();
expect(results.violations).toEqual([]);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Contrast violations | axe color-contrast rule → no violations |
| Body text contrast | Text color non-transparent |
| Heading contrast | axe include h1-h6 → no violations |
| Focus indicator | outline-width ≥ 2px and non-transparent |
| Error text contrast | Error messages pass axe |
| Color-only info | Badges have text or aria-label |
| Full axe scan | WCAG 2.1 AA complete scan |
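
The body-text test above only confirms that the text color is not transparent and leaves the real ratio check to axe. If a direct numeric assertion is wanted, the WCAG 2.x contrast formula can be computed from the `rgb(...)` strings that `getComputedStyle` returns. The following is a minimal sketch; `parseRgb`, `relativeLuminance`, and `contrastRatio` are hypothetical helper names, not part of the template, and alpha is ignored, so it only applies to opaque colors.

```typescript
type Rgb = [number, number, number];

// Parse `rgb(r, g, b)` or `rgba(r, g, b, a)`; returns null for anything else.
// Alpha is ignored, which is only valid for fully opaque colors.
function parseRgb(css: string): Rgb | null {
  const m = css.match(/^rgba?\(\s*(\d+)[,\s]+(\d+)[,\s]+(\d+)/);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// WCAG 2.x relative luminance of an sRGB color (0 = black, 1 = white).
function relativeLuminance([r, g, b]: Rgb): number {
  const lin = (c: number): number => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

// Contrast ratio between two colors, always >= 1 regardless of argument order.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}
```

In a test, the foreground and background strings could be read with `page.evaluate` and the result compared against 4.5 for body text or 3 for large text, the thresholds named in the comments above.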
FILE:templates/accessibility/keyboard-navigation.md
# Keyboard Navigation Template
Tests tab order, focus visibility, and keyboard shortcuts.
## Prerequisites
- App running at `{{baseUrl}}`
- Page under test: `{{baseUrl}}/{{pagePath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Keyboard Navigation', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
// Happy path: Tab moves through interactive elements in logical order
test('Tab key cycles through focusable elements in correct order', async ({ page }) => {
await page.keyboard.press('Tab');
await expect(page.getByRole('link', { name: /skip.*main|skip navigation/i }))
.toBeFocused();
await page.keyboard.press('Tab');
// First nav link focused
const navLinks = page.getByRole('navigation').getByRole('link');
await expect(navLinks.first()).toBeFocused();
});
// Happy path: skip link skips to main content
test('skip-to-content link moves focus to main', async ({ page }) => {
await page.keyboard.press('Tab');
await page.keyboard.press('Enter');
await expect(page.getByRole('main')).toBeFocused();
});
// Happy path: focus visible on all interactive elements
test('focus ring visible on interactive elements', async ({ page }) => {
const interactive = page.getByRole('button').first();
await interactive.focus();
// Assert a visible outline through computed CSS rather than a screenshot comparison
const outline = await interactive.evaluate(el =>
getComputedStyle(el).outlineWidth
);
expect(parseFloat(outline)).toBeGreaterThan(0);
});
// Happy path: modal traps focus
test('focus is trapped within modal when open', async ({ page }) => {
await page.getByRole('button', { name: /open modal/i }).click();
const modal = page.getByRole('dialog');
await expect(modal).toBeVisible();
// Repeatedly Tab and verify focus stays within dialog
for (let i = 0; i < 10; i++) {
await page.keyboard.press('Tab');
// Exactly one focused element must live inside the dialog
await expect(modal.locator(':focus')).toHaveCount(1);
}
});
// Happy path: Escape closes modal
test('Escape key closes modal', async ({ page }) => {
await page.getByRole('button', { name: /open modal/i }).click();
await expect(page.getByRole('dialog')).toBeVisible();
await page.keyboard.press('Escape');
await expect(page.getByRole('dialog')).toBeHidden();
// Focus returns to trigger button
await expect(page.getByRole('button', { name: /open modal/i })).toBeFocused();
});
// Happy path: keyboard shortcut
test('keyboard shortcut {{shortcutKey}} triggers action', async ({ page }) => {
await page.keyboard.press('{{shortcutKey}}');
await expect(page.getByRole('{{shortcutTargetRole}}', { name: /{{shortcutTargetName}}/i })).toBeVisible();
});
// Error case: focus not lost on dynamic content update
test('focus stays on element after async update', async ({ page }) => {
const btn = page.getByRole('button', { name: /{{asyncButton}}/i });
await btn.focus();
await btn.press('Enter');
await expect(btn).toBeFocused();
});
// Edge case: arrow keys navigate within component (listbox, tabs)
test('arrow keys navigate within tab list', async ({ page }) => {
const firstTab = page.getByRole('tab').first();
await firstTab.focus();
await page.keyboard.press('ArrowRight');
await expect(page.getByRole('tab').nth(1)).toBeFocused();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Keyboard Navigation', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
test('skip link moves focus to main content', async ({ page }) => {
await page.keyboard.press('Tab');
await page.keyboard.press('Enter');
await expect(page.getByRole('main')).toBeFocused();
});
test('Escape closes modal and returns focus', async ({ page }) => {
await page.getByRole('button', { name: /open modal/i }).click();
await page.keyboard.press('Escape');
await expect(page.getByRole('dialog')).toBeHidden();
await expect(page.getByRole('button', { name: /open modal/i })).toBeFocused();
});
test('focus ring visible on buttons', async ({ page }) => {
await page.getByRole('button').first().focus();
const outline = await page.getByRole('button').first().evaluate(
el => getComputedStyle(el).outlineWidth
);
expect(parseFloat(outline)).toBeGreaterThan(0);
});
test('arrow keys navigate tab list', async ({ page }) => {
await page.getByRole('tab').first().focus();
await page.keyboard.press('ArrowRight');
await expect(page.getByRole('tab').nth(1)).toBeFocused();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Tab order | Skip link first, nav links after |
| Skip link | Moves focus to `<main>` |
| Focus ring | CSS outline-width > 0 on focus |
| Focus trap | Tab stays within open modal |
| Escape closes | Modal closed, trigger re-focused |
| Keyboard shortcut | Custom key triggers action |
| Focus after update | Focus not lost on async update |
| Arrow keys | Tab/listbox/menu arrow navigation |
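
The arrow-key test expects `ArrowRight` on the first tab to focus the second one. As a rough model of what the component under test is assumed to do, the sketch below follows the wrap-around behavior described in the WAI-ARIA Authoring Practices tabs pattern. `nextTabIndex` and `NavKey` are hypothetical names used for illustration; an app that does not wrap at the ends would need different expectations.

```typescript
type NavKey = 'ArrowRight' | 'ArrowLeft' | 'Home' | 'End';

// Index of the tab that should receive focus after `key` is pressed
// while the tab at `current` is focused, in a list of `count` tabs.
function nextTabIndex(current: number, count: number, key: NavKey): number {
  if (count <= 0) throw new RangeError('tab list is empty');
  switch (key) {
    case 'ArrowRight':
      return (current + 1) % count; // wraps from last to first
    case 'ArrowLeft':
      return (current - 1 + count) % count; // wraps from first to last
    case 'Home':
      return 0;
    case 'End':
      return count - 1;
  }
}
```

A data-driven test could press each key in turn and assert `page.getByRole('tab').nth(nextTabIndex(...))` is focused, instead of hard-coding `nth(1)`.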
FILE:templates/accessibility/screen-reader.md
# Screen Reader Template
Tests ARIA labels, live regions, and announcements for assistive technology.
## Prerequisites
- App running at `{{baseUrl}}`
- Page under test: `{{baseUrl}}/{{pagePath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Screen Reader Accessibility', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
// Happy path: page has descriptive title
test('page has meaningful title', async ({ page }) => {
await expect(page).toHaveTitle(/{{expectedPageTitle}}/i);
});
// Happy path: main landmark exists
test('page has main landmark', async ({ page }) => {
await expect(page.getByRole('main')).toBeVisible();
});
// Happy path: images have alt text
test('informational images have non-empty alt text', async ({ page }) => {
const images = page.getByRole('img');
const count = await images.count();
for (let i = 0; i < count; i++) {
const alt = await images.nth(i).getAttribute('alt');
const isDecorative = await images.nth(i).getAttribute('role') === 'presentation'
|| alt === '';
if (!isDecorative) {
expect(alt).toBeTruthy();
}
}
});
// Happy path: form fields have accessible labels
test('all form inputs have associated labels', async ({ page }) => {
const inputs = page.getByRole('textbox');
const count = await inputs.count();
for (let i = 0; i < count; i++) {
const input = inputs.nth(i);
const labelledBy = await input.getAttribute('aria-labelledby');
const ariaLabel = await input.getAttribute('aria-label');
const id = await input.getAttribute('id');
const hasLabel = labelledBy || ariaLabel || (id && await page.locator(`label[for="${id}"]`).count() > 0);
expect(hasLabel).toBeTruthy();
}
});
// Happy path: live region announces updates
test('live region announces async updates', async ({ page }) => {
const liveRegion = page.getByRole('status').or(page.locator('[aria-live]'));
await page.getByRole('button', { name: /{{asyncTrigger}}/i }).click();
await expect(liveRegion).not.toBeEmpty();
});
// Happy path: alert role used for errors
test('validation errors use role="alert"', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByRole('alert')).toBeVisible();
const liveValue = await page.getByRole('alert').first().getAttribute('aria-live');
expect(liveValue ?? 'assertive').toBe('assertive');
});
// Happy path: buttons have accessible names
test('icon-only buttons have aria-label', async ({ page }) => {
const buttons = page.getByRole('button');
const count = await buttons.count();
for (let i = 0; i < count; i++) {
const btn = buttons.nth(i);
const text = (await btn.textContent())?.trim();
const ariaLabel = await btn.getAttribute('aria-label');
const ariaLabelledBy = await btn.getAttribute('aria-labelledby');
// Must have visible text or aria-label or aria-labelledby
expect(text || ariaLabel || ariaLabelledBy).toBeTruthy();
}
});
// Happy path: navigation landmark labelled
test('multiple nav elements have distinct aria-labels', async ({ page }) => {
const navs = page.getByRole('navigation');
const count = await navs.count();
if (count > 1) {
const labels = new Set<string>();
for (let i = 0; i < count; i++) {
const label = await navs.nth(i).getAttribute('aria-label') ?? '';
labels.add(label);
}
expect(labels.size).toBe(count); // all unique
}
});
// Edge case: expanded/collapsed state communicated
test('accordion aria-expanded reflects open/closed state', async ({ page }) => {
const trigger = page.getByRole('button', { name: /{{accordionItem}}/i });
await expect(trigger).toHaveAttribute('aria-expanded', 'false');
await trigger.click();
await expect(trigger).toHaveAttribute('aria-expanded', 'true');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Screen Reader Accessibility', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{pagePath}}');
});
test('page has meaningful title', async ({ page }) => {
await expect(page).toHaveTitle(/{{expectedPageTitle}}/i);
});
test('main landmark exists', async ({ page }) => {
await expect(page.getByRole('main')).toBeVisible();
});
test('validation errors use role=alert', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByRole('alert')).toBeVisible();
});
test('accordion aria-expanded toggles', async ({ page }) => {
const trigger = page.getByRole('button', { name: /{{accordionItem}}/i });
await expect(trigger).toHaveAttribute('aria-expanded', 'false');
await trigger.click();
await expect(trigger).toHaveAttribute('aria-expanded', 'true');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Page title | `<title>` matches expected pattern |
| Main landmark | `<main>` present and visible |
| Image alt text | Informational images have non-empty alt |
| Form labels | All inputs have accessible label |
| Live region | Status region updated on async action |
| Alert role | Errors use role=alert (assertive) |
| Button names | Icon buttons have aria-label |
| Unique nav labels | Multiple navs have distinct labels |
| aria-expanded | Accordion state communicated |
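
The unique-nav-labels test compares a `Set` size with the landmark count, which fails without saying which label is missing or repeated. A small helper can make the failure message more specific. This is a sketch only; `findLabelProblems` is a hypothetical name, and it takes the raw `aria-label` values (possibly `null`) collected from each `navigation` landmark.

```typescript
interface LabelProblems {
  missing: number; // landmarks with no label or only whitespace
  duplicates: string[]; // labels used by more than one landmark
}

function findLabelProblems(labels: Array<string | null>): LabelProblems {
  const counts: Record<string, number> = {};
  let missing = 0;
  for (const raw of labels) {
    const label = (raw ?? '').trim();
    if (label === '') {
      missing++;
      continue;
    }
    counts[label] = (counts[label] ?? 0) + 1;
  }
  const duplicates = Object.keys(counts).filter(label => counts[label] > 1);
  return { missing, duplicates };
}
```

Asserting `expect(findLabelProblems(labels)).toEqual({ missing: 0, duplicates: [] })` would then report the offending labels directly in the diff.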
FILE:templates/api/auth-headers.md
# Auth Headers Template
Tests token authentication, expired token handling, and token refresh flow.
## Prerequisites
- Valid token: `{{apiToken}}`
- Expired token: `{{expiredApiToken}}`
- Refresh token: `{{refreshToken}}`
- API base: `{{apiBaseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('API Auth Headers', () => {
// Happy path: valid Bearer token accepted
test('accepts valid Bearer token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer {{apiToken}}` },
});
expect(res.status()).toBe(200);
const body = await res.json();
expect(body.id).toBeTruthy();
});
// Happy path: API key in header accepted
test('accepts API key header', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s', {
headers: { 'X-API-Key': '{{apiKey}}' },
});
expect(res.status()).toBe(200);
});
// Error case: no auth header returns 401
test('returns 401 without auth header', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me');
expect(res.status()).toBe(401);
const body = await res.json();
expect(body.error ?? body.message).toMatch(/unauthorized|authentication required/i);
});
// Error case: expired token returns 401
test('returns 401 for expired token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer {{expiredApiToken}}` },
});
expect(res.status()).toBe(401);
const body = await res.json();
expect(body.error ?? body.code).toMatch(/token.*expired|expired_token/i);
});
// Happy path: refresh token obtains new access token
test('refreshes expired token and retries request', async ({ request }) => {
// Step 1: refresh
const refresh = await request.post('{{apiBaseUrl}}/auth/refresh', {
data: { refresh_token: '{{refreshToken}}' },
});
expect(refresh.status()).toBe(200);
const { access_token } = await refresh.json();
expect(access_token).toBeTruthy();
// Step 2: use new token
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer ${access_token}` },
});
expect(res.status()).toBe(200);
});
// Error case: invalid token format returns 401
test('returns 401 for malformed token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': 'Bearer not.a.jwt' },
});
expect(res.status()).toBe(401);
});
// Edge case: token in cookie vs header
test('accepts session cookie as auth alternative', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Cookie': `{{sessionCookieName}}={{sessionCookieValue}}` },
});
expect(res.status()).toBe(200);
});
// Edge case: revoked token returns 401
test('returns 401 for revoked token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer {{revokedApiToken}}` },
});
expect(res.status()).toBe(401);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('API Auth Headers', () => {
test('accepts valid Bearer token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer {{apiToken}}` },
});
expect(res.status()).toBe(200);
});
test('returns 401 without auth header', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me');
expect(res.status()).toBe(401);
});
test('returns 401 for expired token', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer {{expiredApiToken}}` },
});
expect(res.status()).toBe(401);
});
test('refreshes token and retries', async ({ request }) => {
const refresh = await request.post('{{apiBaseUrl}}/auth/refresh', {
data: { refresh_token: '{{refreshToken}}' },
});
const { access_token } = await refresh.json();
const res = await request.get('{{apiBaseUrl}}/me', {
headers: { 'Authorization': `Bearer ${access_token}` },
});
expect(res.status()).toBe(200);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid Bearer | 200 with user data |
| API key | X-API-Key header accepted |
| No auth | 401 + error message |
| Expired token | 401 + expired error code |
| Token refresh | New token from refresh endpoint |
| Malformed token | 401 for non-JWT |
| Cookie auth | Session cookie accepted |
| Revoked token | 401 for revoked token |
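
The token-refresh test performs the refresh and the retried request as two explicit steps. The same flow can be expressed as a reusable wrapper that retries once after a 401. The sketch below is an assumption about how such a wrapper might look, not the API under test: `callWithRefresh`, `call`, and `refresh` are hypothetical names standing in for the `request.get` and `request.post` calls in the template.

```typescript
interface ApiResult {
  status: number;
}

// Runs `call` with `token`; on a 401 it obtains a fresh token via `refresh`
// and retries exactly once. Any other status is returned unchanged.
async function callWithRefresh<T extends ApiResult>(
  token: string,
  call: (token: string) => Promise<T>,
  refresh: () => Promise<string>,
): Promise<T> {
  const first = await call(token);
  if (first.status !== 401) return first;
  const freshToken = await refresh(); // single refresh attempt, no loop
  return call(freshToken);
}
```

Limiting the wrapper to one retry keeps a revoked or invalid refresh token from causing an endless loop; the second 401 is simply returned to the caller.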
FILE:templates/api/error-responses.md
# API Error Responses Template
Tests 400, 401, 403, 404, 422, 429, and 500 HTTP error handling.
## Prerequisites
- Valid auth token: `{{apiToken}}`
- API base: `{{apiBaseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
const validHeaders = {
'Authorization': `Bearer {{apiToken}}`,
'Content-Type': 'application/json',
};
test.describe('API Error Responses', () => {
// 400 Bad Request
test('POST with invalid body returns 400', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers: validHeaders,
data: { name: '' }, // name too short / blank
});
expect(res.status()).toBe(400);
const body = await res.json();
expect(body.message ?? body.error).toMatch(/bad request|invalid/i);
expect(body.errors ?? body.details).toBeDefined();
});
// 401 Unauthorized
test('request without token returns 401', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s');
expect(res.status()).toBe(401);
const body = await res.json();
expect(body.message ?? body.error).toMatch(/unauthorized|authentication/i);
});
// 403 Forbidden
test('accessing admin endpoint as regular user returns 403', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/admin/users', {
headers: { 'Authorization': `Bearer {{userToken}}` },
});
expect(res.status()).toBe(403);
const body = await res.json();
expect(body.message ?? body.error).toMatch(/forbidden|insufficient.*permission/i);
});
// 404 Not Found
test('GET non-existent resource returns 404', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s/999999', { headers: validHeaders });
expect(res.status()).toBe(404);
const body = await res.json();
expect(body.message ?? body.error).toMatch(/not found/i);
});
// 422 Unprocessable Entity
test('POST with missing required field returns 422', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers: validHeaders,
data: { description: 'no name provided' },
});
expect([422, 400]).toContain(res.status());
const body = await res.json();
expect(body.errors ?? body.details).toBeDefined();
});
// 429 Too Many Requests (handled in rate-limiting template — kept here for completeness)
test('returns 429 when rate limit exceeded', async ({ request }) => {
let lastStatus = 0;
for (let i = 0; i < {{rateLimitThreshold}} + 1; i++) {
const res = await request.get('{{apiBaseUrl}}/{{rateLimitedEndpoint}}', { headers: validHeaders });
lastStatus = res.status();
if (lastStatus === 429) break;
}
expect(lastStatus).toBe(429);
});
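// 500 Internal Server Error
test('server error returns 500 with error body', async ({ page }) => {
// page.route() only intercepts requests issued by the page itself, not page.request,
// so the mocked endpoint is called with fetch() from inside the page.
await page.route('{{apiBaseUrl}}/{{entityName}}s', route =>
route.fulfill({
status: 500,
contentType: 'application/json',
headers: { 'Access-Control-Allow-Origin': '*' }, // lets the cross-origin fetch read the mock
body: JSON.stringify({ error: 'Internal Server Error' }),
})
);
const result = await page.evaluate(async url => {
const r = await fetch(url);
return { status: r.status, body: await r.json() };
}, '{{apiBaseUrl}}/{{entityName}}s');
expect(result.status).toBe(500);
expect(result.body.error ?? result.body.message).toBeTruthy();
});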
// Edge case: error response has consistent shape
test('all errors return JSON with error field', async ({ request }) => {
const endpoints = [
{ method: 'get' as const, url: '{{apiBaseUrl}}/{{entityName}}s/000000', headers: validHeaders },
{ method: 'get' as const, url: '{{apiBaseUrl}}/{{entityName}}s' },
];
for (const ep of endpoints) {
const res = await request[ep.method](ep.url, { headers: ep.headers });
if (res.status() >= 400) {
const body = await res.json();
expect(body.error ?? body.message ?? body.errors).toBeDefined();
}
}
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const headers = { 'Authorization': `Bearer {{apiToken}}`, 'Content-Type': 'application/json' };
test.describe('API Error Responses', () => {
test('POST with invalid body returns 400', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers,
data: { name: '' },
});
expect(res.status()).toBe(400);
});
test('no token returns 401', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s');
expect(res.status()).toBe(401);
});
test('regular user on admin endpoint returns 403', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/admin/users', {
headers: { 'Authorization': `Bearer {{userToken}}` },
});
expect(res.status()).toBe(403);
});
test('non-existent resource returns 404', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s/999999', { headers });
expect(res.status()).toBe(404);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| 400 Bad Request | Invalid body → 400 + errors detail |
| 401 Unauthorized | No token → 401 |
| 403 Forbidden | Wrong role → 403 |
| 404 Not Found | Missing resource → 404 |
| 422 Unprocessable | Missing required field → 422/400 |
| 429 Rate Limit | Threshold exceeded → 429 |
| 500 Server Error | Mocked 500 → error body present |
| Consistent shape | All errors have error/message field |
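
Several tests above repeat the `body.error ?? body.message ?? body.errors` check. If the API is expected to use one of those three fields consistently, the check could be factored into a type guard. This is a minimal sketch; `ErrorBody` and `isErrorBody` are hypothetical names, and the field list is taken from the assertions in this template rather than from any documented API contract.

```typescript
interface ErrorBody {
  error?: unknown;
  message?: unknown;
  errors?: unknown;
}

// True when the parsed JSON body is an object carrying at least one
// non-null `error`, `message`, or `errors` field.
function isErrorBody(body: unknown): body is ErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const b = body as ErrorBody;
  return b.error != null || b.message != null || b.errors != null;
}
```

In the consistent-shape test, `expect(isErrorBody(await res.json())).toBe(true)` would then replace the inline chain. Note that this guard is slightly stricter than the inline check, since it rejects a body whose only error field is `null`.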
FILE:templates/api/graphql.md
# GraphQL API Template
Tests queries and mutations via Playwright's request API, and subscriptions via a page-level WebSocket.
## Prerequisites
- Valid auth token: `{{apiToken}}`
- GraphQL endpoint: `{{graphqlEndpoint}}`
- WebSocket endpoint for subscriptions: `{{graphqlWsEndpoint}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
const GQL_URL = '{{graphqlEndpoint}}';
const headers = {
'Authorization': `Bearer {{apiToken}}`,
'Content-Type': 'application/json',
};
async function gql(request: any, query: string, variables = {}) {
const res = await request.post(GQL_URL, { headers, data: { query, variables } });
const body = await res.json();
expect(body.errors).toBeUndefined();
return body.data;
}
test.describe('GraphQL API', () => {
// Happy path: query
test('query fetches {{entityName}} list', async ({ request }) => {
const data = await gql(request, `
query Get{{EntityName}}s($limit: Int) {
{{entityName}}s(limit: $limit) { id name createdAt }
}
`, { limit: 10 });
expect(Array.isArray(data.{{entityName}}s)).toBe(true);
expect(data.{{entityName}}s.length).toBeLessThanOrEqual(10);
});
// Happy path: query single entity
test('query fetches single {{entityName}} by id', async ({ request }) => {
const data = await gql(request, `
query Get{{EntityName}}($id: ID!) {
{{entityName}}(id: $id) { id name description }
}
`, { id: '{{existingEntityId}}' });
expect(data.{{entityName}}.id).toBe('{{existingEntityId}}');
});
// Happy path: mutation creates entity
test('mutation creates {{entityName}}', async ({ request }) => {
const data = await gql(request, `
mutation Create{{EntityName}}($input: {{EntityName}}Input!) {
create{{EntityName}}(input: $input) { id name }
}
`, { input: { name: '{{testEntityName}}', description: '{{testDescription}}' } });
expect(data.create{{EntityName}}.id).toBeTruthy();
expect(data.create{{EntityName}}.name).toBe('{{testEntityName}}');
});
// Happy path: mutation updates entity
test('mutation updates {{entityName}}', async ({ request }) => {
const data = await gql(request, `
mutation Update{{EntityName}}($id: ID!, $input: {{EntityName}}Input!) {
update{{EntityName}}(id: $id, input: $input) { id name }
}
`, { id: '{{existingEntityId}}', input: { name: '{{updatedName}}' } });
expect(data.update{{EntityName}}.name).toBe('{{updatedName}}');
});
// Happy path: mutation deletes entity
test('mutation deletes {{entityName}}', async ({ request }) => {
const data = await gql(request, `
mutation Delete{{EntityName}}($id: ID!) {
delete{{EntityName}}(id: $id) { success }
}
`, { id: '{{deletableEntityId}}' });
expect(data.delete{{EntityName}}.success).toBe(true);
});
// Error case: invalid query returns errors array
test('invalid query returns errors', async ({ request }) => {
const res = await request.post(GQL_URL, {
headers,
data: { query: '{ invalidField }' },
});
const body = await res.json();
expect(body.errors).toBeDefined();
expect(body.errors.length).toBeGreaterThan(0);
});
// Error case: unauthorized query
test('query without auth returns unauthorized error', async ({ request }) => {
const res = await request.post(GQL_URL, {
headers: { 'Content-Type': 'application/json' }, // No auth
data: { query: '{ {{entityName}}s { id } }' },
});
const body = await res.json();
expect(body.errors?.[0]?.extensions?.code).toMatch(/UNAUTHENTICATED|UNAUTHORIZED/);
});
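// Edge case: subscription via page WebSocket
test('subscription receives real-time update', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
// Note: most GraphQL WebSocket servers expect an init/subscribe handshake before they push messages
await page.evaluate(() => {
const ws = new WebSocket('{{graphqlWsEndpoint}}');
ws.onmessage = e => (window as any).__gqlMsg = JSON.parse(e.data);
});
// Trigger mutation to fire subscription
await page.request.post(GQL_URL, {
headers,
data: { query: 'mutation { trigger{{EntityName}}Event { id } }' },
});
// Wait for the message instead of reading it immediately, which would race the server push
const handle = await page.waitForFunction(() => (window as any).__gqlMsg);
const msg = await handle.jsonValue();
// 'data' is the legacy subscriptions-transport-ws message type; graphql-ws servers send 'next'
expect(msg?.type).toBe('data');
});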
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const headers = { 'Authorization': `Bearer {{apiToken}}`, 'Content-Type': 'application/json' };
async function gql(request, query, variables = {}) {
const res = await request.post('{{graphqlEndpoint}}', { headers, data: { query, variables } });
const body = await res.json();
expect(body.errors).toBeUndefined();
return body.data;
}
test.describe('GraphQL API', () => {
test('query fetches entity list', async ({ request }) => {
const data = await gql(request, '{ {{entityName}}s { id name } }');
expect(Array.isArray(data.{{entityName}}s)).toBe(true);
});
test('mutation creates entity', async ({ request }) => {
const data = await gql(request,
'mutation($input: {{EntityName}}Input!) { create{{EntityName}}(input: $input) { id } }',
{ input: { name: '{{testEntityName}}' } }
);
expect(data.create{{EntityName}}.id).toBeTruthy();
});
test('invalid query returns errors array', async ({ request }) => {
const res = await request.post('{{graphqlEndpoint}}', {
headers,
data: { query: '{ nonExistentField }' },
});
const body = await res.json();
expect(body.errors?.length).toBeGreaterThan(0);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| List query | Returns array of entities |
| Single query | Returns entity by ID |
| Create mutation | Returns new entity with ID |
| Update mutation | Returns updated field value |
| Delete mutation | Returns success: true |
| Invalid query | errors[] defined in response |
| Unauthenticated | UNAUTHENTICATED extension code |
| Subscription | Real-time message via WebSocket |
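
The error-path variants above separate a plain `errors[]` response from an authentication failure by inspecting `extensions.code`. As a rough sketch, that decision could live in one small helper. `classifyGraphQLResponse` and its types are hypothetical names, not part of Playwright or any GraphQL library, and the two auth codes are simply the ones the tests match on.

```typescript
// Hypothetical helper: classify an already-parsed GraphQL response body.
type GraphQLError = { message: string; extensions?: { code?: string } };
type GraphQLBody = { data?: unknown; errors?: GraphQLError[] };
type GraphQLOutcome = 'ok' | 'unauthenticated' | 'error';

function classifyGraphQLResponse(body: GraphQLBody): GraphQLOutcome {
  const errors = body.errors ?? [];
  // No errors array (or an empty one) counts as success.
  if (errors.length === 0) return 'ok';
  // Assumption: the server reports auth failures with one of these codes.
  const authCodes = ['UNAUTHENTICATED', 'UNAUTHORIZED'];
  const isAuthFailure = errors.some(e => authCodes.includes(e.extensions?.code ?? ''));
  return isAuthFailure ? 'unauthenticated' : 'error';
}
```

A test could then assert on the outcome, for example `expect(classifyGraphQLResponse(body)).toBe('unauthenticated')`, rather than repeating the optional-chaining lookup.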
FILE:templates/api/rate-limiting.md
# Rate Limiting Template
Tests rate limit headers, 429 response, and Retry-After handling.
## Prerequisites
- Valid auth token: `{{apiToken}}`
- Rate-limited endpoint: `{{rateLimitedEndpoint}}`
- Rate limit: `{{rateLimit}}` requests per `{{rateLimitWindow}}`
- API base: `{{apiBaseUrl}}`
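
HTTP allows `Retry-After` to carry either a number of seconds or an HTTP date, while the tests in this template convert it with `Number(...)`, which only covers the seconds form. The sketch below shows one way to normalize both forms to milliseconds. `retryAfterToMs` is a hypothetical helper name, and passing the current time in explicitly is an assumption made to keep it deterministic.

```typescript
// Hypothetical helper: convert a Retry-After header value to a delay in ms.
// Returns undefined when the header is missing or cannot be parsed.
function retryAfterToMs(headerValue: string | undefined, nowMs: number): number | undefined {
  if (headerValue === undefined) return undefined;
  const trimmed = headerValue.trim();
  // Form 1: delay in whole seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  // A date in the past means the client may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

In a test, `retryAfterToMs(res.headers()['retry-after'], Date.now())` could replace the bare `Number(...)` conversion if the API under test sends the date form.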
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
const headers = {
'Authorization': `Bearer {{apiToken}}`,
'Content-Type': 'application/json',
};
test.describe('Rate Limiting', () => {
// Happy path: rate limit headers present on normal requests
test('includes rate limit headers on success response', async ({ request }) => {
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
expect(res.status()).toBe(200);
expect(res.headers()['x-ratelimit-limit']).toBeTruthy();
expect(res.headers()['x-ratelimit-remaining']).toBeTruthy();
expect(Number(res.headers()['x-ratelimit-limit'])).toBe({{rateLimit}});
});
// Happy path: remaining count decrements
test('x-ratelimit-remaining decrements with each request', async ({ request }) => {
const first = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
const second = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
const remaining1 = Number(first.headers()['x-ratelimit-remaining']);
const remaining2 = Number(second.headers()['x-ratelimit-remaining']);
expect(remaining2).toBeLessThan(remaining1);
});
// Error case: 429 when limit exceeded
test('returns 429 when rate limit exceeded', async ({ request }) => {
let lastStatus = 200;
let retryAfter: string | undefined;
for (let i = 0; i <= {{rateLimit}}; i++) {
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
lastStatus = res.status();
if (lastStatus === 429) {
retryAfter = res.headers()['retry-after'];
break;
}
}
expect(lastStatus).toBe(429);
expect(retryAfter).toBeTruthy();
});
// Error case: 429 body contains error message
test('429 response body contains error and retry info', async ({ request }) => {
// Exhaust limit
for (let i = 0; i <= {{rateLimit}}; i++) {
await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
}
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
// Assert the status directly so the test cannot pass silently without a 429
expect(res.status()).toBe(429);
const body = await res.json();
expect(body.error ?? body.message).toMatch(/rate limit|too many requests/i);
expect(Number(res.headers()['retry-after'])).toBeGreaterThan(0);
});
// Happy path: different users have separate rate limit buckets
test('rate limit is per-user, not global', async ({ request }) => {
// Exhaust limit for user 1
for (let i = 0; i <= {{rateLimit}}; i++) {
await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, {
headers: { 'Authorization': `Bearer {{apiToken}}` },
});
}
// User 2 should still succeed
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, {
headers: { 'Authorization': `Bearer {{apiToken2}}` },
});
expect(res.status()).toBe(200);
});
// Edge case: reset after window expires
// page.clock only fakes time inside the browser page; it cannot move the
// server's rate limit window, so this test waits in real time.
test('rate limit resets after window expires', async ({ request }) => {
test.setTimeout({{rateLimitWindowMs}} + 30_000);
// Exhaust limit
for (let i = 0; i <= {{rateLimit}}; i++) {
await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
}
// Wait for the window to elapse
await new Promise(resolve => setTimeout(resolve, {{rateLimitWindowMs}}));
// Should succeed again
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
expect(res.status()).toBe(200);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const headers = { 'Authorization': `Bearer {{apiToken}}` };
test.describe('Rate Limiting', () => {
test('includes rate limit headers on success', async ({ request }) => {
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
expect(res.status()).toBe(200);
expect(res.headers()['x-ratelimit-limit']).toBeTruthy();
expect(res.headers()['x-ratelimit-remaining']).toBeTruthy();
});
test('returns 429 with Retry-After when limit exceeded', async ({ request }) => {
let lastStatus = 200;
let retryAfter;
for (let i = 0; i <= {{rateLimit}}; i++) {
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
lastStatus = res.status();
if (lastStatus === 429) { retryAfter = res.headers()['retry-after']; break; }
}
expect(lastStatus).toBe(429);
expect(retryAfter).toBeTruthy();
});
test('per-user buckets: other user unaffected', async ({ request }) => {
for (let i = 0; i <= {{rateLimit}}; i++) {
await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, { headers });
}
const res = await request.get(`{{apiBaseUrl}}/{{rateLimitedEndpoint}}`, {
headers: { 'Authorization': `Bearer {{apiToken2}}` },
});
expect(res.status()).toBe(200);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Headers present | x-ratelimit-limit and -remaining on 200 |
| Decrement | remaining decreases each request |
| 429 triggered | Limit exceeded → 429 + Retry-After |
| 429 body | Error message + retry info in body |
| Per-user bucket | Exhausted user doesn't affect others |
| Window reset | Window elapsed → limit resets |
FILE:templates/api/rest-crud.md
# REST CRUD API Template
Tests GET, POST, PUT, and DELETE API endpoints directly via Playwright's request API.
## Prerequisites
- Valid auth token: `{{apiToken}}`
- Base API URL: `{{apiBaseUrl}}`
- Test entity endpoint: `/{{entityName}}s`
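
The `{{...}}` tokens listed above are placeholders that have to be replaced with real values before a template will run. A minimal sketch of that substitution follows. `fillTemplate` is a hypothetical helper, not something this skill or Playwright ships, and failing loudly on an unknown placeholder is a design assumption.

```typescript
// Hypothetical helper: replace {{name}} placeholders with concrete values.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => {
    const value = values[key];
    // Fail early so an unfilled placeholder never reaches a real request URL.
    if (value === undefined) {
      throw new Error(`Missing value for placeholder {{${key}}}`);
    }
    return value;
  });
}

// Example values are illustrative only.
const listUrl = fillTemplate('{{apiBaseUrl}}/{{entityName}}s', {
  apiBaseUrl: 'https://api.example.test',
  entityName: 'widget',
});
```

Throwing on a missing key is deliberate: a silently unfilled `{{entityName}}` would otherwise surface later as a confusing 404.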
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('REST CRUD — /{{entityName}}s', () => {
let createdId: string;
const headers = {
'Authorization': `Bearer {{apiToken}}`,
'Content-Type': 'application/json',
};
// Happy path: GET list
test('GET /{{entityName}}s returns list', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s', { headers });
expect(res.status()).toBe(200);
const body = await res.json();
expect(Array.isArray(body.data ?? body)).toBe(true);
});
// Happy path: POST creates entity
test('POST /{{entityName}}s creates new entity', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers,
data: { name: '{{testEntityName}}', description: '{{testDescription}}' },
});
expect(res.status()).toBe(201);
const body = await res.json();
expect(body.id).toBeTruthy();
expect(body.name).toBe('{{testEntityName}}');
createdId = body.id;
});
// Happy path: GET single entity
test('GET /{{entityName}}s/:id returns entity', async ({ request }) => {
const res = await request.get(`{{apiBaseUrl}}/{{entityName}}s/{{existingEntityId}}`, { headers });
expect(res.status()).toBe(200);
const body = await res.json();
expect(body.id).toBe('{{existingEntityId}}');
expect(body.name).toBeTruthy();
});
// Happy path: PUT updates entity
test('PUT /{{entityName}}s/:id updates entity', async ({ request }) => {
const res = await request.put(`{{apiBaseUrl}}/{{entityName}}s/{{existingEntityId}}`, {
headers,
data: { name: '{{updatedEntityName}}' },
});
expect(res.status()).toBe(200);
const body = await res.json();
expect(body.name).toBe('{{updatedEntityName}}');
});
// Happy path: PATCH partial update
test('PATCH /{{entityName}}s/:id partially updates entity', async ({ request }) => {
const res = await request.patch(`{{apiBaseUrl}}/{{entityName}}s/{{existingEntityId}}`, {
headers,
data: { description: '{{patchedDescription}}' },
});
expect(res.status()).toBe(200);
const body = await res.json();
expect(body.description).toBe('{{patchedDescription}}');
});
// Happy path: DELETE removes entity
test('DELETE /{{entityName}}s/:id deletes entity', async ({ request }) => {
const del = await request.delete(`{{apiBaseUrl}}/{{entityName}}s/{{deletableEntityId}}`, { headers });
expect(del.status()).toBe(204);
// Verify gone
const get = await request.get(`{{apiBaseUrl}}/{{entityName}}s/{{deletableEntityId}}`, { headers });
expect(get.status()).toBe(404);
});
// Error case: POST with missing required field returns 422
test('POST with missing required field returns 422', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers,
data: {},
});
expect(res.status()).toBe(422);
const body = await res.json();
expect(body.errors).toBeTruthy();
});
// Error case: GET non-existent entity returns 404
test('GET non-existent entity returns 404', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s/999999', { headers });
expect(res.status()).toBe(404);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const headers = {
'Authorization': `Bearer {{apiToken}}`,
'Content-Type': 'application/json',
};
test.describe('REST CRUD — /{{entityName}}s', () => {
test('GET list returns 200 and array', async ({ request }) => {
const res = await request.get('{{apiBaseUrl}}/{{entityName}}s', { headers });
expect(res.status()).toBe(200);
const body = await res.json();
expect(Array.isArray(body.data ?? body)).toBe(true);
});
test('POST creates entity and returns 201', async ({ request }) => {
const res = await request.post('{{apiBaseUrl}}/{{entityName}}s', {
headers,
data: { name: '{{testEntityName}}' },
});
expect(res.status()).toBe(201);
expect((await res.json()).id).toBeTruthy();
});
test('DELETE removes entity, GET returns 404', async ({ request }) => {
await request.delete(`{{apiBaseUrl}}/{{entityName}}s/{{deletableEntityId}}`, { headers });
const res = await request.get(`{{apiBaseUrl}}/{{entityName}}s/{{deletableEntityId}}`, { headers });
expect(res.status()).toBe(404);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| GET list | 200 + array body |
| POST create | 201 + id in response |
| GET single | 200 + correct entity body |
| PUT update | 200 + updated field in response |
| PATCH partial | 200 + only the patched field changed |
| DELETE | 204 → subsequent GET returns 404 |
| POST validation | Missing field → 422 + errors |
| GET 404 | Non-existent ID → 404 |
FILE:templates/auth/login.md
# Login Template
Tests email/password login, social login, and remember me functionality.
## Prerequisites
- Valid user account: `{{username}}` / `{{password}}`
- Social provider configured (Google/GitHub)
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Login', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/login');
});
// Happy path: email/password login
test('logs in with valid credentials', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
// Password inputs have no implicit ARIA textbox role, so locate by label
await page.getByLabel(/password/i).fill('{{password}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
// Happy path: remember me
test('persists session with remember me checked', async ({ page, context }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('{{password}}');
await page.getByRole('checkbox', { name: /remember me/i }).check();
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
expect(session?.expires).toBeGreaterThan(Date.now() / 1000 + 86400);
});
// Happy path: social login
test('redirects to social provider', async ({ page }) => {
await page.getByRole('button', { name: /continue with google/i }).click();
await expect(page).toHaveURL(/accounts\.google\.com/);
});
// Error case: invalid credentials
test('shows error for wrong password', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('wrong-password');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*credentials/i);
await expect(page).toHaveURL('{{baseUrl}}/login');
});
// Edge case: empty fields
test('shows validation for empty submission', async ({ page }) => {
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByRole('textbox', { name: /email/i })).toBeFocused();
await expect(page.getByText(/email is required/i)).toBeVisible();
});
// Edge case: locked account
test('shows account locked message after multiple failures', async ({ page }) => {
for (let i = 0; i < {{lockoutAttempts}}; i++) {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('wrong');
await page.getByRole('button', { name: /sign in/i }).click();
}
await expect(page.getByRole('alert')).toContainText(/account.*locked/i);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Login', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/login');
});
test('logs in with valid credentials', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('{{password}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
test('shows error for wrong password', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('wrong-password');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*credentials/i);
});
test('shows validation for empty submission', async ({ page }) => {
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByText(/email is required/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Happy path | Valid credentials → dashboard redirect |
| Remember me | Long-lived cookie set |
| Social login | OAuth redirect to provider |
| Wrong password | Alert with error message |
| Empty form | Inline validation shown |
| Locked account | Lockout message after N failures |
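
The remember-me variant depends on how Playwright reports cookie lifetimes: `expires` is a Unix timestamp in seconds, and session cookies report `-1`. The sketch below isolates that comparison. `isLongLived` and `CookieLike` are hypothetical names, and the one-day default simply mirrors the `86400` used in the test above.

```typescript
// Minimal shape of the fields used from a Playwright cookie object.
type CookieLike = { name: string; expires: number };

// Hypothetical helper: true when the cookie outlives the minimum lifetime.
function isLongLived(
  cookie: CookieLike | undefined,
  nowMs: number,
  minLifetimeSeconds = 86_400,
): boolean {
  // A missing cookie or a session cookie (expires === -1) is never long-lived.
  if (!cookie || cookie.expires === -1) return false;
  // Date.now() is in milliseconds; cookie.expires is in seconds.
  return cookie.expires > nowMs / 1000 + minLifetimeSeconds;
}
```

Calling `isLongLived(session, Date.now())` makes the seconds versus milliseconds conversion explicit, which is the easiest part of this check to get wrong.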
FILE:templates/auth/logout.md
# Logout Template
Tests logout from navigation, session cleanup, and redirect behaviour.
## Prerequisites
- Authenticated session (use `storageState` or login fixture)
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Logout', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: logout via nav menu
test('logs out from user menu', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/login');
await expect(page.getByRole('heading', { name: /sign in/i })).toBeVisible();
});
// Happy path: session cookies cleared
test('clears session cookie on logout', async ({ page, context }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/login');
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
expect(session).toBeUndefined();
});
// Happy path: accessing protected page after logout redirects
test('redirects to login when accessing protected page after logout', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await page.goto('{{baseUrl}}/dashboard');
await expect(page).toHaveURL(/\/login/);
});
// Error case: double logout (stale session)
test('handles logout gracefully when session already expired', async ({ page, context }) => {
await page.goto('{{baseUrl}}/dashboard');
await context.clearCookies();
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await expect(page).toHaveURL(/\/login/);
});
// Edge case: logout from multiple tabs
test('invalidates session across tabs', async ({ page, context }) => {
const tab2 = await context.newPage();
await page.goto('{{baseUrl}}/dashboard');
await tab2.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await tab2.reload();
await expect(tab2).toHaveURL(/\/login/);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Logout', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('logs out from user menu', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/login');
});
test('clears session cookie on logout', async ({ page, context }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
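await page.getByRole('menuitem', { name: /sign out/i }).click();
// Wait for the redirect so cookies are read after logout has completed
await expect(page).toHaveURL('{{baseUrl}}/login');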
const cookies = await context.cookies();
expect(cookies.find(c => c.name === '{{sessionCookieName}}')).toBeUndefined();
});
test('redirects protected page to login after logout', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await page.goto('{{baseUrl}}/dashboard');
await expect(page).toHaveURL(/\/login/);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Happy path | Nav menu → sign out → login page |
| Cookie cleanup | Session cookie removed after logout |
| Protected redirect | Accessing /dashboard after logout → /login |
| Stale session | Already-expired session handled gracefully |
| Multi-tab | Logout invalidates other open tabs |
FILE:templates/auth/mfa.md
# MFA Template
Tests 2FA TOTP code entry, backup codes, and MFA enrollment flow.
## Prerequisites
- MFA-enabled account: `{{mfaUsername}}` / `{{mfaPassword}}`
- TOTP secret for generating codes: `{{totpSecret}}`
- Backup code: `{{backupCode}}`
- App running at `{{baseUrl}}`
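
The tests rely on `otplib` to turn `{{totpSecret}}` into a code. For orientation, here is a rough sketch of the underlying TOTP computation (HMAC-SHA1 over a time-step counter, then dynamic truncation) using only Node's `crypto` module. The `totp` function is a hypothetical illustration, not otplib's implementation, and it takes the key as raw bytes, whereas authenticator secrets are normally Base32 encoded.

```typescript
import { createHmac } from 'node:crypto';

// Hypothetical sketch of TOTP; takes a raw key, not a Base32 string.
function totp(key: Buffer, timeMs: number, stepSeconds = 30, digits = 6): string {
  // Number of whole time steps since the Unix epoch.
  const counter = Math.floor(timeMs / 1000 / stepSeconds);
  // Encode the counter as an 8-byte big-endian integer.
  const message = Buffer.alloc(8);
  message.writeUInt32BE(Math.floor(counter / 2 ** 32), 0);
  message.writeUInt32BE(counter % 2 ** 32, 4);
  const hmac = createHmac('sha1', key).update(message).digest();
  // Dynamic truncation: the low nibble of the last byte selects 4 bytes.
  const offset = hmac[hmac.length - 1] & 0x0f;
  const binary =
    ((hmac[offset] & 0x7f) << 24) |
    (hmac[offset + 1] << 16) |
    (hmac[offset + 2] << 8) |
    hmac[offset + 3];
  return (binary % 10 ** digits).toString().padStart(digits, '0');
}
```

Because the code depends only on the key and the current time step, two calls within the same 30 second step return the same value, and a code from an earlier step is what a server would treat as expired (subject to whatever tolerance window it allows).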
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
import { authenticator } from 'otplib'; // npm i otplib
test.describe('MFA', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{mfaUsername}}');
// Password inputs have no implicit ARIA textbox role, so locate by label
await page.getByLabel(/password/i).fill('{{mfaPassword}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL(/\/mfa|\/two-factor/);
});
// Happy path: valid TOTP code
test('accepts valid TOTP code', async ({ page }) => {
const token = authenticator.generate('{{totpSecret}}');
await page.getByRole('textbox', { name: /code|token/i }).fill(token);
await page.getByRole('button', { name: /verify/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
// Happy path: backup code
test('accepts backup code', async ({ page }) => {
await page.getByRole('link', { name: /use backup code/i }).click();
await page.getByRole('textbox', { name: /backup code/i }).fill('{{backupCode}}');
await page.getByRole('button', { name: /verify/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
// Backup code consumed — warning shown
await expect(page.getByRole('alert')).toContainText(/backup code used/i);
});
// Error case: wrong TOTP code
test('rejects invalid TOTP code', async ({ page }) => {
await page.getByRole('textbox', { name: /code|token/i }).fill('000000');
await page.getByRole('button', { name: /verify/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*code/i);
await expect(page).toHaveURL(/\/mfa|\/two-factor/);
});
// Error case: expired code
// The server validates TOTP against its own clock, so faking the browser clock
// has no effect. Generate a code for a past time step instead.
// Assumes otplib v12, where `clone` exists and `epoch` is in milliseconds.
test('rejects expired TOTP code', async ({ page }) => {
const staleAuthenticator = authenticator.clone({ epoch: Date.now() - 5 * 60_000 });
const expiredToken = staleAuthenticator.generate('{{totpSecret}}');
await page.getByRole('textbox', { name: /code|token/i }).fill(expiredToken);
await page.getByRole('button', { name: /verify/i }).click();
await expect(page.getByRole('alert')).toContainText(/expired|invalid.*code/i);
});
// Edge case: MFA enrollment for new user
test('enrolls MFA via QR code scan', async ({ page: enrollPage }) => {
await enrollPage.goto('{{baseUrl}}/settings/security');
await enrollPage.getByRole('button', { name: /enable.*two-factor/i }).click();
await expect(enrollPage.getByRole('img', { name: /qr code/i })).toBeVisible();
await expect(enrollPage.getByText(/scan.*authenticator/i)).toBeVisible();
// User scans QR → enters token
const token = authenticator.generate('{{totpSecret}}');
await enrollPage.getByRole('textbox', { name: /verification code/i }).fill(token);
await enrollPage.getByRole('button', { name: /activate/i }).click();
await expect(enrollPage.getByRole('heading', { name: /backup codes/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const { authenticator } = require('otplib');
test.describe('MFA', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{mfaUsername}}');
await page.getByLabel(/password/i).fill('{{mfaPassword}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL(/\/mfa|\/two-factor/);
});
test('accepts valid TOTP code', async ({ page }) => {
const token = authenticator.generate('{{totpSecret}}');
await page.getByRole('textbox', { name: /code|token/i }).fill(token);
await page.getByRole('button', { name: /verify/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
test('accepts backup code', async ({ page }) => {
await page.getByRole('link', { name: /use backup code/i }).click();
await page.getByRole('textbox', { name: /backup code/i }).fill('{{backupCode}}');
await page.getByRole('button', { name: /verify/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
test('rejects invalid TOTP code', async ({ page }) => {
await page.getByRole('textbox', { name: /code|token/i }).fill('000000');
await page.getByRole('button', { name: /verify/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*code/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid TOTP | Correct time-based code → dashboard |
| Backup code | Single-use backup code accepted; warning shown |
| Invalid code | Wrong code → alert, stays on MFA page |
| Expired code | Stale token from a past time step rejected |
| MFA enrollment | QR shown → token verified → backup codes displayed |
FILE:templates/auth/password-reset.md
# Password Reset Template
Tests reset request, setting a new password, and expired link handling.
## Prerequisites
- Account with email: `{{username}}`
- Reset link / token available in test environment (`{{resetToken}}`)
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Password Reset', () => {
// Happy path: request reset email
test('sends reset email for known address', async ({ page }) => {
await page.goto('{{baseUrl}}/forgot-password');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('button', { name: /send reset/i }).click();
await expect(page.getByRole('alert')).toContainText(/check your email/i);
});
// Happy path: set new password via reset link
test('sets new password with valid reset token', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{resetToken}}');
await expect(page.getByRole('heading', { name: /set.*new password/i })).toBeVisible();
// Password inputs have no implicit ARIA textbox role, so locate by label
await page.getByLabel(/^new password$/i).fill('{{newPassword}}');
await page.getByLabel(/confirm password/i).fill('{{newPassword}}');
await page.getByRole('button', { name: /reset password/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/login');
await expect(page.getByRole('alert')).toContainText(/password.*updated/i);
});
// Happy path: login with new password
test('can log in with updated password', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByLabel(/password/i).fill('{{newPassword}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
// Error case: expired reset link
test('shows error for expired reset token', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{expiredResetToken}}');
await expect(page.getByRole('alert')).toContainText(/link.*expired|token.*invalid/i);
await expect(page.getByRole('link', { name: /request new link/i })).toBeVisible();
});
// Error case: unknown email
test('shows generic message for unknown email (anti-enumeration)', async ({ page }) => {
await page.goto('{{baseUrl}}/forgot-password');
await page.getByRole('textbox', { name: /email/i }).fill('unknown@example.com');
await page.getByRole('button', { name: /send reset/i }).click();
// Should NOT reveal whether email exists
await expect(page.getByRole('alert')).toContainText(/check your email/i);
});
// Error case: passwords do not match
test('validates that passwords match', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{resetToken}}');
await page.getByLabel(/^new password$/i).fill('{{newPassword}}');
await page.getByLabel(/confirm password/i).fill('different-password');
await page.getByRole('button', { name: /reset password/i }).click();
await expect(page.getByText(/passwords.*do not match/i)).toBeVisible();
});
// Edge case: weak password rejected
test('rejects password that does not meet strength requirements', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{resetToken}}');
await page.getByLabel(/^new password$/i).fill('123');
await page.getByLabel(/confirm password/i).fill('123');
await page.getByRole('button', { name: /reset password/i }).click();
await expect(page.getByText(/password.*too weak|must be at least/i)).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Password Reset', () => {
test('sends reset email for known address', async ({ page }) => {
await page.goto('{{baseUrl}}/forgot-password');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('button', { name: /send reset/i }).click();
await expect(page.getByRole('alert')).toContainText(/check your email/i);
});
test('sets new password with valid reset token', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{resetToken}}');
await page.getByLabel(/^new password$/i).fill('{{newPassword}}');
await page.getByLabel(/confirm password/i).fill('{{newPassword}}');
await page.getByRole('button', { name: /reset password/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/login');
});
test('shows error for expired reset token', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{expiredResetToken}}');
await expect(page.getByRole('alert')).toContainText(/link.*expired|token.*invalid/i);
});
test('validates passwords match', async ({ page }) => {
await page.goto('{{baseUrl}}/reset-password?token={{resetToken}}');
await page.getByLabel(/^new password$/i).fill('{{newPassword}}');
await page.getByLabel(/confirm password/i).fill('other');
await page.getByRole('button', { name: /reset password/i }).click();
await expect(page.getByText(/passwords.*do not match/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Request reset | Known email → check email message |
| Set new password | Valid token → new password set → login page |
| Login with new pw | Updated credentials accepted |
| Expired token | Error + "request new link" shown |
| Unknown email | Generic response (anti-enumeration) |
| Passwords mismatch | Inline validation error |
| Weak password | Strength requirement error |
FILE:templates/auth/rbac.md
# RBAC Template
Tests role-based access control: admin vs user permissions and forbidden pages.
## Prerequisites
- Admin account: `{{adminUsername}}` / `{{adminPassword}}`
- Regular user: `{{userUsername}}` / `{{userPassword}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
const adminState = '{{adminStorageStatePath}}';
const userState = '{{userStorageStatePath}}';
test.describe('RBAC — Admin', () => {
test.use({ storageState: adminState });
// Happy path: admin accesses admin panel
test('admin can access admin panel', async ({ page }) => {
await page.goto('{{baseUrl}}/admin');
await expect(page.getByRole('heading', { name: /admin/i })).toBeVisible();
});
test('admin can see user management menu item', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('link', { name: /user management/i })).toBeVisible();
});
test('admin can delete any resource', async ({ page }) => {
await page.goto('{{baseUrl}}/admin/{{entityName}}s');
await page.getByRole('row').nth(1).getByRole('button', { name: /delete/i }).click();
await page.getByRole('button', { name: /confirm/i }).click();
await expect(page.getByRole('alert')).toContainText(/deleted/i);
});
});
test.describe('RBAC — Regular User', () => {
test.use({ storageState: userState });
// Error case: user cannot access admin panel
test('regular user sees 403 on admin panel', async ({ page }) => {
await page.goto('{{baseUrl}}/admin');
await expect(page).toHaveURL(/\/403|\/forbidden|\/dashboard/);
const forbidden = page.getByRole('heading', { name: /403|forbidden|not authorized/i });
await expect(forbidden).toBeVisible();
});
test('regular user does not see admin menu items', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('link', { name: /user management/i })).toBeHidden();
});
// Error case: user cannot delete others' resources
test('regular user cannot delete another user\'s resource', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{otherUsersEntityId}}');
await expect(page.getByRole('button', { name: /delete/i })).toBeHidden();
});
// Edge case: direct navigation to admin API returns 403
test('API returns 403 for unauthorized role', async ({ page }) => {
const response = await page.request.get('{{baseUrl}}/api/admin/users');
expect(response.status()).toBe(403);
});
});
test.describe('RBAC — Role Elevation', () => {
// Edge case: user promoted to admin gains access
test('newly promoted admin can access admin panel', async ({ browser }) => {
// Step 1: use admin context to promote user
const adminCtx = await browser.newContext({ storageState: adminState });
const adminPage = await adminCtx.newPage();
await adminPage.goto('{{baseUrl}}/admin/users/{{promotedUserId}}/role');
await adminPage.getByRole('combobox', { name: /role/i }).selectOption('admin');
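await adminPage.getByRole('button', { name: /save/i }).click();
// Wait for the app's save confirmation here so closing the context does not cut the request short
await adminCtx.close();
// Step 2: promoted user can now access admin panel (userState must belong to {{promotedUserId}})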
const userCtx = await browser.newContext({ storageState: userState });
const userPage = await userCtx.newPage();
await userPage.goto('{{baseUrl}}/admin');
await expect(userPage.getByRole('heading', { name: /admin/i })).toBeVisible();
await userCtx.close();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('RBAC — Admin', () => {
test.use({ storageState: '{{adminStorageStatePath}}' });
test('admin can access admin panel', async ({ page }) => {
await page.goto('{{baseUrl}}/admin');
await expect(page.getByRole('heading', { name: /admin/i })).toBeVisible();
});
});
test.describe('RBAC — Regular User', () => {
test.use({ storageState: '{{userStorageStatePath}}' });
test('regular user sees 403 on admin panel', async ({ page }) => {
await page.goto('{{baseUrl}}/admin');
await expect(page.getByRole('heading', { name: /403|forbidden/i })).toBeVisible();
});
test('API returns 403 for unauthorized role', async ({ page }) => {
const res = await page.request.get('{{baseUrl}}/api/admin/users');
expect(res.status()).toBe(403);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Admin access | Admin reaches /admin panel |
| Admin menu | Admin-only nav items visible |
| Admin delete | Admin can delete any resource |
| User forbidden | Regular user → 403/redirect on /admin |
| User hidden menu | Admin nav items not rendered for user |
| API 403 | Backend enforces role on API routes |
| Role elevation | Promoted user gains new access immediately |
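
The variants above amount to a small access matrix. A minimal sketch of that matrix as data, assuming two roles and the three routes used in the tests; the `accessMatrix` and `expectedStatus` names are hypothetical and the rules are illustrative, not the application's real policy:

```typescript
// Hypothetical role names; adjust to the application under test.
type Role = 'admin' | 'user';

interface AccessRule {
  route: string;
  allowed: Role[];
}

// Illustrative matrix mirroring the variants table, not a real policy.
const accessMatrix: AccessRule[] = [
  { route: '/admin', allowed: ['admin'] },
  { route: '/api/admin/users', allowed: ['admin'] },
  { route: '/dashboard', allowed: ['admin', 'user'] },
];

// Expected outcome for a role requesting a route: 200 when allowed, 403 otherwise.
function expectedStatus(role: Role, route: string): 200 | 403 {
  const rule = accessMatrix.filter(r => r.route === route)[0];
  if (!rule) {
    throw new Error('No access rule defined for ' + route);
  }
  return rule.allowed.indexOf(role) !== -1 ? 200 : 403;
}
```

Keeping the expectations in one table makes it easier to loop over roles and routes when generating tests, and a missing rule fails loudly instead of silently passing.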
FILE:templates/auth/remember-me.md
# Remember Me Template
Tests persistent login cookie behaviour and expiry.
## Prerequisites
- Valid account: `{{username}}` / `{{password}}`
- `{{sessionCookieName}}` cookie used for auth
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Remember Me', () => {
// Happy path: cookie is long-lived when remember me is checked
test('sets persistent cookie when remember me is checked', async ({ page, context }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
await page.getByRole('checkbox', { name: /remember me/i }).check();
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
// Cookie should expire > 7 days from now
expect(session?.expires).toBeGreaterThan(Date.now() / 1000 + 7 * 86400);
});
// Happy path: session cookie (no remember me) is session-scoped
test('sets session-scoped cookie when remember me is unchecked', async ({ page, context }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
const checkbox = page.getByRole('checkbox', { name: /remember me/i });
if (await checkbox.isChecked()) await checkbox.uncheck();
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
// Session cookie: expires = -1 (browser session only)
expect(session?.expires).toBeLessThanOrEqual(0);
});
// Happy path: persistent login survives a browser restart
test('stays logged in across browser restart with remember me', async ({ page, context, browser }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
await page.getByRole('checkbox', { name: /remember me/i }).check();
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
// Simulate a restart: a fresh context that keeps only persistent cookies (session cookies report expires = -1)
const state = await context.storageState();
const restarted = await browser.newContext({
storageState: { ...state, cookies: state.cookies.filter(c => c.expires > 0) },
});
const newPage = await restarted.newPage();
await newPage.goto('{{baseUrl}}/dashboard');
await expect(newPage).toHaveURL('{{baseUrl}}/dashboard');
await expect(newPage.getByRole('heading', { name: /dashboard/i })).toBeVisible();
await restarted.close();
});
// Error case: expired persistent cookie redirects to login
test('redirects to login when persistent cookie has expired', async ({ page, context }) => {
await context.addCookies([{
name: '{{sessionCookieName}}',
value: '{{expiredCookieValue}}',
domain: '{{cookieDomain}}',
path: '/',
expires: Math.floor(Date.now() / 1000) - 1, // already expired
}]);
await page.goto('{{baseUrl}}/dashboard');
await expect(page).toHaveURL(/\/login/);
});
// Edge case: remember me checkbox state is preserved on validation error
test('retains remember me checkbox state after failed login', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('checkbox', { name: /remember me/i }).check();
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('wrong');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid/i);
await expect(page.getByRole('checkbox', { name: /remember me/i })).toBeChecked();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Remember Me', () => {
test('sets persistent cookie when remember me is checked', async ({ page, context }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
await page.getByRole('checkbox', { name: /remember me/i }).check();
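await page.getByRole('button', { name: /sign in/i }).click();
// Wait for the login to complete before reading cookies
await expect(page).toHaveURL('{{baseUrl}}/dashboard');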
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
expect(session?.expires).toBeGreaterThan(Date.now() / 1000 + 7 * 86400);
});
test('sets session cookie when remember me is unchecked', async ({ page, context }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
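await page.getByRole('button', { name: /sign in/i }).click();
// Wait for the login to complete before reading cookies
await expect(page).toHaveURL('{{baseUrl}}/dashboard');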
const cookies = await context.cookies();
const session = cookies.find(c => c.name === '{{sessionCookieName}}');
expect(session?.expires).toBeLessThanOrEqual(0);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Persistent cookie | Remember me → long-lived cookie (>7 days) |
| Session cookie | No remember me → session-scoped cookie |
| Survives restart | Persistent cookie keeps user logged in across restart |
| Expired cookie | Stale cookie → redirect to /login |
| Checkbox retained | State preserved after failed login attempt |
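
The cookie assertions above rely on the `expires` field, which Playwright reports as Unix time in seconds, or `-1` for a session cookie. A minimal sketch of helpers that make that distinction explicit; `CookieLike`, `isSessionCookie`, `outlivesDays`, and `dropSessionCookies` are hypothetical names introduced for illustration:

```typescript
// Minimal shape of the cookie fields used here; real cookie objects carry more.
interface CookieLike {
  name: string;
  expires: number; // Unix time in seconds, or -1 for a session cookie
}

const SECONDS_PER_DAY = 86400;

// True when the cookie has no expiry and ends with the browser session.
function isSessionCookie(cookie: CookieLike): boolean {
  return cookie.expires <= 0;
}

// True when the cookie is still valid more than `days` days after `nowMs`.
function outlivesDays(cookie: CookieLike, days: number, nowMs: number): boolean {
  return cookie.expires > nowMs / 1000 + days * SECONDS_PER_DAY;
}

// Approximates a browser restart: only persistent cookies survive.
function dropSessionCookies<T extends CookieLike>(cookies: T[]): T[] {
  return cookies.filter(c => !isSessionCookie(c));
}
```

In a test, the array returned by `context.cookies()` could be passed to these helpers, for example `outlivesDays(session, 7, Date.now())` for the seven-day check.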
FILE:templates/auth/session-timeout.md
# Session Timeout Template
Tests auto-logout after inactivity and session refresh behaviour.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Session timeout configured to `{{sessionTimeoutMs}}` ms in test env
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Session Timeout', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: session refresh on activity
test('refreshes session on user activity', async ({ page }) => {
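// Install the fake clock before navigating so timers the app schedules on load are controlled
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
// Advance to just before timeout
await page.clock.fastForward({{sessionTimeoutMs}} - 5000);
// Placeholder locator: replace with any interactive element on the page
await page.getByRole('button', { name: /any interactive element/i }).click();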
// Advance past original timeout — session should still be valid
await page.clock.fastForward(10_000);
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
// Happy path: warning dialog shown before logout
test('shows session-expiry warning before auto-logout', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} - {{warningLeadMs}});
await expect(page.getByRole('dialog', { name: /session.*expiring/i })).toBeVisible();
await expect(page.getByRole('button', { name: /stay signed in/i })).toBeVisible();
});
// Happy path: extend session from warning dialog
test('extends session when "stay signed in" clicked', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} - {{warningLeadMs}});
await page.getByRole('button', { name: /stay signed in/i }).click();
await expect(page.getByRole('dialog', { name: /session.*expiring/i })).toBeHidden();
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
// Error case: auto-logout after inactivity
test('redirects to login after session timeout', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} + 1000);
await expect(page).toHaveURL(/\/login/);
await expect(page.getByText(/session.*expired|signed out/i)).toBeVisible();
});
// Edge case: API calls return 401 after timeout
test('shows re-auth prompt when API returns 401', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.route('{{baseUrl}}/api/**', route =>
route.fulfill({ status: 401, contentType: 'application/json', body: JSON.stringify({ error: 'Unauthorized' }) })
);
await page.getByRole('button', { name: /refresh|reload/i }).click();
await expect(page.getByRole('dialog', { name: /session.*expired/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Session Timeout', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows warning before auto-logout', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} - {{warningLeadMs}});
await expect(page.getByRole('dialog', { name: /session.*expiring/i })).toBeVisible();
});
test('auto-logs out after inactivity', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} + 1000);
await expect(page).toHaveURL(/\/login/);
});
test('extends session on "stay signed in"', async ({ page }) => {
await page.clock.install();
await page.goto('{{baseUrl}}/dashboard');
await page.clock.fastForward({{sessionTimeoutMs}} - {{warningLeadMs}});
await page.getByRole('button', { name: /stay signed in/i }).click();
await expect(page.getByRole('dialog', { name: /session.*expiring/i })).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Session refresh | Activity before timeout resets the clock |
| Warning dialog | Shown N ms before timeout |
| Extend session | "Stay signed in" dismisses warning |
| Auto-logout | Inactivity past timeout → /login |
| 401 from API | Re-auth dialog shown when backend rejects request |
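
The clock offsets used in these tests follow from two values: the session timeout and the warning lead time. A minimal sketch of that arithmetic, assuming both are expressed in milliseconds; `SessionTiming`, `clockOffsets`, and the one-second default margin are hypothetical choices for illustration:

```typescript
interface SessionTiming {
  timeoutMs: number;     // corresponds to {{sessionTimeoutMs}}
  warningLeadMs: number; // corresponds to {{warningLeadMs}}
}

interface ClockOffsets {
  warningAt: number; // ms of inactivity after which the warning dialog is expected
  logoutAt: number;  // ms of inactivity after which auto-logout is expected
}

// Computes how far each test should fast-forward the fake clock.
function clockOffsets(timing: SessionTiming, marginMs: number = 1000): ClockOffsets {
  if (timing.warningLeadMs >= timing.timeoutMs) {
    throw new Error('warningLeadMs must be smaller than timeoutMs');
  }
  return {
    warningAt: timing.timeoutMs - timing.warningLeadMs,
    logoutAt: timing.timeoutMs + marginMs,
  };
}
```

With a 60-second timeout and a 10-second warning lead, the warning is expected after 50 000 ms of inactivity and the logout check runs at 61 000 ms.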
FILE:templates/auth/sso.md
# SSO Template
Tests SSO redirect flow, IdP callback handling, and attribute mapping.
## Prerequisites
- SSO provider configured (SAML / OIDC) at `{{ssoProviderUrl}}`
- Test IdP with user `{{ssoUsername}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect, Page } from '@playwright/test';
async function completeSsoLogin(page: Page, username: string): Promise<void> {
// Fill IdP login form — adapt selectors to your provider
await page.getByRole('textbox', { name: /username/i }).fill(username);
await page.getByRole('button', { name: /login/i }).click();
}
test.describe('SSO', () => {
// Happy path: SSO redirect and callback
test('redirects to IdP and returns authenticated', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('button', { name: /sign in with sso/i }).click();
await expect(page).toHaveURL(/{{ssoProviderDomain}}/);
await completeSsoLogin(page, '{{ssoUsername}}');
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
// Happy path: SSO with domain hint
test('pre-fills organisation domain and redirects', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /work email/i }).fill('{{ssoUsername}}');
await page.getByRole('button', { name: /continue/i }).click();
await expect(page).toHaveURL(/{{ssoProviderDomain}}/);
});
// Happy path: attributes mapped to user profile
test('maps SSO attributes to user profile', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('button', { name: /sign in with sso/i }).click();
await completeSsoLogin(page, '{{ssoUsername}}');
await page.goto('{{baseUrl}}/settings/profile');
await expect(page.getByRole('textbox', { name: /email/i })).toHaveValue('{{ssoUsername}}');
});
// Error case: IdP returns error
test('shows error page when IdP returns error response', async ({ page }) => {
await page.goto('{{baseUrl}}/auth/callback?error=access_denied&error_description=User+denied+access');
await expect(page.getByRole('alert')).toContainText(/access denied/i);
await expect(page.getByRole('link', { name: /back to login/i })).toBeVisible();
});
// Error case: invalid callback state
test('rejects callback with invalid state parameter', async ({ page }) => {
await page.goto('{{baseUrl}}/auth/callback?code=valid_code&state=tampered_state');
await expect(page.getByRole('alert')).toContainText(/invalid.*state|authentication failed/i);
});
// Edge case: SSO user first login provisions account
test('provisions new account on first SSO login', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('button', { name: /sign in with sso/i }).click();
await completeSsoLogin(page, '{{newSsoUsername}}');
// {{baseUrl}} contains slashes that would end a regex literal early, so match on the path only
await expect(page).toHaveURL(/\/(dashboard|onboarding)/);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
async function completeSsoLogin(page, username) {
await page.getByRole('textbox', { name: /username/i }).fill(username);
await page.getByRole('button', { name: /login/i }).click();
}
test.describe('SSO', () => {
test('redirects to IdP and returns authenticated', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('button', { name: /sign in with sso/i }).click();
await expect(page).toHaveURL(/{{ssoProviderDomain}}/);
await completeSsoLogin(page, '{{ssoUsername}}');
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
test('shows error when IdP returns access_denied', async ({ page }) => {
await page.goto('{{baseUrl}}/auth/callback?error=access_denied');
await expect(page.getByRole('alert')).toContainText(/access denied/i);
});
test('rejects tampered state parameter', async ({ page }) => {
await page.goto('{{baseUrl}}/auth/callback?code=abc&state=tampered');
await expect(page.getByRole('alert')).toContainText(/invalid.*state|authentication failed/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Happy path | SSO button → IdP → callback → dashboard |
| Domain hint | Email triggers org-specific IdP redirect |
| Attribute mapping | SSO profile fields populate user record |
| IdP error | access_denied → error page with back link |
| Invalid state | CSRF protection rejects tampered callback |
| First login | Auto-provisions account on initial SSO |
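
The error-case tests navigate straight to the callback route with hand-written query strings. A minimal sketch of building those URLs with proper encoding, assuming the `/auth/callback` path used above; `buildCallbackUrl` is a hypothetical helper, and it encodes spaces as `%20` rather than `+`:

```typescript
// Builds a callback URL like the ones the error-case tests navigate to.
function buildCallbackUrl(baseUrl: string, params: { [key: string]: string }): string {
  const query = Object.keys(params)
    .map(key => encodeURIComponent(key) + '=' + encodeURIComponent(params[key]))
    .join('&');
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return query ? root + '/auth/callback?' + query : root + '/auth/callback';
}
```

A test could then call `page.goto(buildCallbackUrl(baseUrl, { error: 'access_denied' }))`, which keeps special characters in error descriptions from breaking the URL.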
FILE:templates/checkout/add-to-cart.md
# Add to Cart Template
Tests adding items to cart and quantity updates.
## Prerequisites
- Authenticated (or guest) session
- Product: ID `{{productId}}`, name `{{productName}}`, price `{{productPrice}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Add to Cart', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/products/{{productId}}');
});
// Happy path: add single item
test('adds product to cart', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('status', { name: /cart/i })).toContainText('1');
await expect(page.getByRole('alert')).toContainText(/added to cart/i);
});
// Happy path: add multiple items increments count
test('increments cart count on repeated add', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('status', { name: /cart/i })).toContainText('2');
});
// Happy path: add with quantity selector
test('adds specified quantity to cart', async ({ page }) => {
await page.getByRole('spinbutton', { name: /quantity/i }).fill('3');
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('status', { name: /cart/i })).toContainText('3');
});
// Happy path: cart persists on navigation
test('cart persists after navigating away', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await page.goto('{{baseUrl}}/products');
await expect(page.getByRole('status', { name: /cart/i })).toContainText('1');
});
// Error case: out of stock product cannot be added
test('add to cart button disabled for out-of-stock product', async ({ page }) => {
await page.goto('{{baseUrl}}/products/{{outOfStockProductId}}');
await expect(page.getByRole('button', { name: /add to cart/i })).toBeDisabled();
await expect(page.getByText(/out of stock/i)).toBeVisible();
});
// Error case: quantity exceeds stock
test('shows error when quantity exceeds available stock', async ({ page }) => {
await page.getByRole('spinbutton', { name: /quantity/i }).fill('{{overStockQuantity}}');
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('alert')).toContainText(/only.*available|exceeds.*stock/i);
});
// Edge case: cart opens after add
test('cart drawer opens after adding item', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('dialog', { name: /cart/i })).toBeVisible();
await expect(page.getByRole('dialog').getByText('{{productName}}')).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Add to Cart', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/products/{{productId}}');
});
test('adds product to cart', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await expect(page.getByRole('status', { name: /cart/i })).toContainText('1');
});
test('add to cart disabled for out-of-stock', async ({ page }) => {
await page.goto('{{baseUrl}}/products/{{outOfStockProductId}}');
await expect(page.getByRole('button', { name: /add to cart/i })).toBeDisabled();
});
test('cart persists after navigation', async ({ page }) => {
await page.getByRole('button', { name: /add to cart/i }).click();
await page.goto('{{baseUrl}}/products');
await expect(page.getByRole('status', { name: /cart/i })).toContainText('1');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Single add | Product added, cart count = 1 |
| Repeated add | Cart count increments |
| Quantity selector | Specified quantity added |
| Persist on nav | Cart count survives page change |
| Out of stock | Button disabled, label shown |
| Quantity exceeds stock | Error alert |
| Cart drawer | Slide-in cart opens showing added item |
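
One caveat with the count assertions: `toContainText('1')` is a substring match, so a badge reading `10` would also pass. A minimal sketch of parsing the badge text into a number for an exact comparison; `parseCartCount` is a hypothetical helper, and the badge text formats are assumptions:

```typescript
// Extracts the first integer from badge text such as "Cart (10)"; null when none is present.
function parseCartCount(badgeText: string): number | null {
  const match = /\d+/.exec(badgeText);
  return match ? parseInt(match[0], 10) : null;
}
```

An alternative that keeps Playwright's auto-retry is a word-bounded pattern such as `toHaveText(/\b1\b/)`, provided the badge contains no other digits.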
FILE:templates/checkout/apply-coupon.md
# Apply Coupon Template
Tests valid coupon code, invalid code, and expired coupon handling.
## Prerequisites
- Cart with items totalling `{{cartTotal}}`
- Valid coupon: `{{validCouponCode}}` ({{discountPercent}}% off)
- Expired coupon: `{{expiredCouponCode}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Apply Coupon', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/cart');
});
// Happy path: valid coupon applied
test('applies valid coupon and shows discount', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{validCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByText(/{{discountPercent}}%.*off|discount applied/i)).toBeVisible();
await expect(page.getByText('{{discountedTotal}}')).toBeVisible();
await expect(page.getByRole('button', { name: /remove coupon/i })).toBeVisible();
});
// Happy path: percentage discount calculated correctly
test('calculates discount amount correctly', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{validCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
const discountLine = page.getByRole('row', { name: /discount/i });
await expect(discountLine).toContainText('-{{discountAmount}}');
});
// Happy path: remove applied coupon
test('removes applied coupon and restores original total', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{validCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
await page.getByRole('button', { name: /remove coupon/i }).click();
await expect(page.getByText('{{cartTotal}}')).toBeVisible();
await expect(page.getByRole('button', { name: /remove coupon/i })).toBeHidden();
});
// Error case: invalid coupon code
test('shows error for invalid coupon code', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('INVALID123');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*coupon|code not found/i);
await expect(page.getByText('{{cartTotal}}')).toBeVisible();
});
// Error case: expired coupon
test('shows error for expired coupon', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{expiredCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/expired|no longer valid/i);
});
// Error case: coupon not applicable to cart items
test('shows error when coupon excludes cart products', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{categoryRestrictedCoupon}}');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/not applicable|excluded/i);
});
// Edge case: empty coupon field
test('apply button disabled when coupon field is empty', async ({ page }) => {
const applyBtn = page.getByRole('button', { name: /apply/i });
await expect(applyBtn).toBeDisabled();
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('X');
await expect(applyBtn).toBeEnabled();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Apply Coupon', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/cart');
});
test('applies valid coupon and shows discount', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{validCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByText(/discount applied/i)).toBeVisible();
await expect(page.getByText('{{discountedTotal}}')).toBeVisible();
});
test('shows error for invalid coupon', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('INVALID123');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/invalid.*coupon/i);
});
test('shows error for expired coupon', async ({ page }) => {
await page.getByRole('textbox', { name: /coupon|promo code/i }).fill('{{expiredCouponCode}}');
await page.getByRole('button', { name: /apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/expired/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid coupon | Discount applied, total updated |
| Discount calculation | Discount line shows correct amount |
| Remove coupon | Original total restored |
| Invalid code | Error alert, total unchanged |
| Expired coupon | Expiry error shown |
| Category restriction | Coupon not applicable error |
| Empty field | Apply button disabled |
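
The `{{discountAmount}}` and `{{discountedTotal}}` placeholders have to agree with `{{cartTotal}}` and `{{discountPercent}}`. A minimal sketch of that arithmetic in integer cents, assuming a percentage coupon rounded to the nearest cent; the rounding rule and the `applyPercentCoupon` and `formatCents` names are assumptions, and the shop under test may round differently:

```typescript
interface DiscountResult {
  discountCents: number;
  totalCents: number;
}

// Percentage discount on a cart total, in integer cents, rounded to the nearest cent.
function applyPercentCoupon(cartTotalCents: number, discountPercent: number): DiscountResult {
  if (discountPercent < 0 || discountPercent > 100) {
    throw new Error('discountPercent must be between 0 and 100');
  }
  const discountCents = Math.round((cartTotalCents * discountPercent) / 100);
  return { discountCents: discountCents, totalCents: cartTotalCents - discountCents };
}

// Formats cents as a plain decimal string without a currency symbol.
function formatCents(cents: number): string {
  return (cents / 100).toFixed(2);
}
```

Working in cents avoids floating-point drift when the expected strings are derived from the cart total rather than typed in by hand.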
FILE:templates/checkout/order-confirm.md
# Order Confirmation Template
Tests the success page and order details after checkout.
## Prerequisites
- Completed order with ID `{{orderId}}`
- Authenticated session via `{{authStorageStatePath}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Order Confirmation', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: confirmation page content
test('shows order confirmation with correct details', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByRole('heading', { name: /order confirmed|thank you/i })).toBeVisible();
await expect(page.getByText('{{orderId}}')).toBeVisible();
await expect(page.getByText('{{productName}}')).toBeVisible();
await expect(page.getByText('{{orderTotal}}')).toBeVisible();
});
// Happy path: confirmation email notice
test('shows confirmation email notice', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByText(/confirmation.*sent to|email.*{{username}}/i)).toBeVisible();
});
// Happy path: billing and shipping details shown
test('displays shipping and billing addresses on confirmation page', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByText('{{shippingAddress}}')).toBeVisible();
await expect(page.getByText('{{billingAddress}}')).toBeVisible();
});
// Happy path: CTA navigates to order history
test('"view your orders" link navigates to order history', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await page.getByRole('link', { name: /view.*orders|my orders/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/orders');
});
// Happy path: continue shopping CTA
test('"continue shopping" returns to products', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await page.getByRole('link', { name: /continue shopping/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/products');
});
// Error case: accessing another user's order shows 403
test('cannot access another user\'s confirmation page', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{otherUsersOrderId}}');
await expect(page).toHaveURL(/\/403|\/dashboard/);
});
// Edge case: cart is empty after successful checkout
test('cart is empty after order confirmed', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByRole('status', { name: /cart/i })).toContainText('0');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Order Confirmation', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows order id and total on confirmation', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByRole('heading', { name: /order confirmed|thank you/i })).toBeVisible();
await expect(page.getByText('{{orderId}}')).toBeVisible();
await expect(page.getByText('{{orderTotal}}')).toBeVisible();
});
test('cart is empty after checkout', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{orderId}}');
await expect(page.getByRole('status', { name: /cart/i })).toContainText('0');
});
test('cannot access another user\'s order', async ({ page }) => {
await page.goto('{{baseUrl}}/order-confirmation/{{otherUsersOrderId}}');
await expect(page).toHaveURL(/\/403|\/dashboard/);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Confirmation content | Order ID, product, total visible |
| Email notice | Confirmation email address shown |
| Shipping/billing | Addresses displayed |
| View orders CTA | Navigates to /orders |
| Continue shopping | Returns to /products |
| Unauthorized | Other user's order → 403 |
| Cart cleared | Cart count = 0 after checkout |
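
The email-notice test places `{{username}}` inside a regular expression, where characters such as `.` and `+` in an email address take on special meaning. A minimal sketch of escaping the value before building the pattern; `escapeRegExp` and `emailNoticePattern` are hypothetical helpers, not part of Playwright:

```typescript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Builds the email-notice pattern from a literal address.
function emailNoticePattern(email: string): RegExp {
  return new RegExp('confirmation.*sent to|email.*' + escapeRegExp(email), 'i');
}
```

The resulting pattern can be passed to `page.getByText(...)` in place of the inline regex literal.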
FILE:templates/checkout/order-history.md
# Order History Template
Tests listing orders, viewing order details, and pagination.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- At least `{{orderCount}}` orders seeded for user
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Order History', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: order list
test('displays list of orders with key details', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
await expect(page.getByRole('heading', { name: /orders|order history/i })).toBeVisible();
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await expect(rows.first()).toContainText('{{latestOrderId}}');
await expect(rows.first()).toContainText('{{latestOrderStatus}}');
await expect(rows.first()).toContainText('{{latestOrderTotal}}');
});
// Happy path: view order details
test('navigates to order detail from history', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
await page.getByRole('link', { name: new RegExp('{{latestOrderId}}') }).click();
await expect(page).toHaveURL(`{{baseUrl}}/orders/{{latestOrderId}}`);
await expect(page.getByRole('heading', { name: '{{latestOrderId}}' })).toBeVisible();
await expect(page.getByText('{{productName}}')).toBeVisible();
});
// Happy path: order status badge
test('shows correct status badge for each order', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
const deliveredBadge = page.getByRole('status', { name: /delivered/i }).first();
await expect(deliveredBadge).toBeVisible();
});
// Happy path: pagination
test('paginates through orders', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
const firstPageFirstOrder = await page.getByRole('row').nth(1).textContent();
await page.getByRole('button', { name: /next page|>/i }).click();
await expect(page.getByRole('row').nth(1)).not.toHaveText(firstPageFirstOrder!);
await expect(page.getByRole('button', { name: /previous page|</i })).toBeEnabled();
});
// Happy path: items per page selector
test('changes items per page', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
await page.getByRole('combobox', { name: /per page|items per page/i }).selectOption('50');
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await expect(rows).toHaveCount(Math.min(50, {{orderCount}}));
});
// Error case: empty order history
test('shows empty state for user with no orders', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
// Assumes this test runs with a storageState for a user who has no orders;
// the describe-level state above belongs to a user with orders
await expect(page.getByText(/no orders yet|start shopping/i)).toBeVisible();
});
// Edge case: reorder from history
test('adds previous order items to cart via reorder', async ({ page }) => {
await page.goto('{{baseUrl}}/orders/{{latestOrderId}}');
await page.getByRole('button', { name: /reorder|buy again/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/cart');
await expect(page.getByText('{{productName}}')).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Order History', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('displays orders with id, status, and total', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await expect(rows.first()).toContainText('{{latestOrderId}}');
});
test('navigates to order detail', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
await page.getByRole('link', { name: new RegExp('{{latestOrderId}}') }).click();
await expect(page).toHaveURL(`{{baseUrl}}/orders/{{latestOrderId}}`);
});
test('paginates through orders', async ({ page }) => {
await page.goto('{{baseUrl}}/orders');
await page.getByRole('button', { name: /next page|>/i }).click();
await expect(page.getByRole('button', { name: /previous page|</i })).toBeEnabled();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Order list | ID, status, total visible per row |
| Order detail | Clicking order → detail page |
| Status badge | Correct badge per order state |
| Pagination | Next page loads different orders |
| Items per page | Selector changes row count |
| Empty state | No-orders message with CTA |
| Reorder | Previous order items added to cart |
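
The items-per-page test computes its expected row count inline with `Math.min(50, {{orderCount}})`. The same arithmetic extends to any page of the list. The sketch below assumes pages are 1-indexed and that every page except the last is full; `expectedRowsOnPage` is a hypothetical helper, not part of Playwright.

```typescript
// Hypothetical helper: expected number of data rows on a given page.
// Assumes 1-indexed pages and that every page except the last is full.
function expectedRowsOnPage(totalOrders: number, perPage: number, pageNumber: number): number {
  if (perPage <= 0 || pageNumber < 1) {
    throw new RangeError('perPage must be positive and pageNumber must be >= 1');
  }
  const alreadyShown = (pageNumber - 1) * perPage;
  // Pages past the end of the list show no rows.
  return Math.max(0, Math.min(perPage, totalOrders - alreadyShown));
}
```

In the items-per-page test, `toHaveCount(expectedRowsOnPage({{orderCount}}, 50, 1))` could stand in for the inline `Math.min` call.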
FILE:templates/checkout/payment.md
# Payment Template
Tests card form entry, validation, and payment processing.
## Prerequisites
- Cart with items, shipping filled
- Test card numbers: `{{testCardNumber}}` (success), `{{declinedCardNumber}}` (decline), `{{threeDsCardNumber}}` (3DS challenge)
- Cardholder name: `{{cardholderName}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect, Page } from '@playwright/test';
async function fillCardForm(page: Page, card: {
number: string; expiry: string; cvc: string; name: string;
}): Promise<void> {
// Stripe/Braintree iframes — adapt frame locator to your provider
const cardFrame = page.frameLocator('[data-testid="card-number-frame"]');
await cardFrame.getByRole('textbox', { name: /card number/i }).fill(card.number);
const expiryFrame = page.frameLocator('[data-testid="expiry-frame"]');
await expiryFrame.getByRole('textbox', { name: /expiry/i }).fill(card.expiry);
const cvcFrame = page.frameLocator('[data-testid="cvc-frame"]');
await cvcFrame.getByRole('textbox', { name: /cvc|cvv/i }).fill(card.cvc);
await page.getByRole('textbox', { name: /cardholder name/i }).fill(card.name);
}
test.describe('Payment', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/checkout/payment');
});
// Happy path: successful payment
test('completes payment with valid card', async ({ page }) => {
await fillCardForm(page, {
number: '{{testCardNumber}}',
expiry: '12/28',
cvc: '123',
name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
await expect(page).toHaveURL(/\/order-confirmation|\/success/);
await expect(page.getByRole('heading', { name: /order confirmed|thank you/i })).toBeVisible();
});
// Happy path: processing state shown
test('shows processing state while payment is pending', async ({ page }) => {
await fillCardForm(page, {
number: '{{testCardNumber}}',
expiry: '12/28',
cvc: '123',
name: '{{cardholderName}}',
});
const payBtn = page.getByRole('button', { name: /pay|place order/i });
await payBtn.click();
await expect(payBtn).toBeDisabled();
await expect(page.getByText(/processing|please wait/i)).toBeVisible();
});
// Error case: declined card
test('shows decline error for rejected card', async ({ page }) => {
await fillCardForm(page, {
number: '{{declinedCardNumber}}',
expiry: '12/28',
cvc: '123',
name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
await expect(page.getByRole('alert')).toContainText(/declined|card.*not accepted/i);
await expect(page).toHaveURL(/\/checkout\/payment/);
});
// Error case: invalid card number format
test('shows inline error for invalid card number', async ({ page }) => {
const cardFrame = page.frameLocator('[data-testid="card-number-frame"]');
await cardFrame.getByRole('textbox', { name: /card number/i }).fill('1234');
await page.getByRole('button', { name: /pay|place order/i }).click();
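// If your provider renders validation messages inside its own iframe,
// assert through cardFrame.getByText(...) instead of page.getByText(...)
await expect(page.getByText(/invalid.*card number/i)).toBeVisible();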
});
// Error case: expired card
test('shows error for expired card', async ({ page }) => {
await fillCardForm(page, {
number: '{{testCardNumber}}',
expiry: '01/20',
cvc: '123',
name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
await expect(page.getByRole('alert')).toContainText(/expired|invalid.*expiry/i);
});
// Edge case: 3DS authentication required
test('handles 3DS challenge and completes payment', async ({ page }) => {
await fillCardForm(page, {
number: '{{threeDsCardNumber}}',
expiry: '12/28',
cvc: '123',
name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
// 3DS modal appears
const challengeFrame = page.frameLocator('[data-testid="3ds-challenge-frame"]');
await challengeFrame.getByRole('button', { name: /complete authentication/i }).click();
await expect(page).toHaveURL(/\/order-confirmation|\/success/);
});
});
```
---
## JavaScript
```javascript
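const { test, expect } = require('@playwright/test');
// Fills every card field; adapt the frame locators to your provider
async function fillCardForm(page, card) {
await page.frameLocator('[data-testid="card-number-frame"]')
.getByRole('textbox', { name: /card number/i }).fill(card.number);
await page.frameLocator('[data-testid="expiry-frame"]')
.getByRole('textbox', { name: /expiry/i }).fill(card.expiry);
await page.frameLocator('[data-testid="cvc-frame"]')
.getByRole('textbox', { name: /cvc|cvv/i }).fill(card.cvc);
await page.getByRole('textbox', { name: /cardholder name/i }).fill(card.name);
}
test.describe('Payment', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/checkout/payment');
});
test('completes payment with valid card', async ({ page }) => {
await fillCardForm(page, {
number: '{{testCardNumber}}', expiry: '12/28', cvc: '123', name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
await expect(page).toHaveURL(/\/order-confirmation/);
});
test('shows decline error for rejected card', async ({ page }) => {
await fillCardForm(page, {
number: '{{declinedCardNumber}}', expiry: '12/28', cvc: '123', name: '{{cardholderName}}',
});
await page.getByRole('button', { name: /pay|place order/i }).click();
await expect(page.getByRole('alert')).toContainText(/declined/i);
});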
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Successful payment | Valid test card → order confirmation |
| Processing state | Button disabled + processing message while payment is pending |
| Declined card | Error alert, stays on payment page |
| Invalid card number | Inline validation from provider |
| Expired card | Expiry error |
| 3DS challenge | Modal completed, payment succeeds |
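
The tests hard-code `12/28` as the valid expiry, so they are likely to start failing with an expiry error once that month has passed. One way around this is to derive the expiry from the current date. The sketch below assumes the `MM/YY` format used by the literals in the tests; `futureExpiry` is a hypothetical helper.

```typescript
// Hypothetical helper: returns an expiry string in MM/YY form, a fixed
// number of years after the given date (the month is kept as-is).
function futureExpiry(now: Date, yearsAhead: number = 2): string {
  const month = now.getMonth() + 1; // getMonth() is zero-based
  const year = (now.getFullYear() + yearsAhead) % 100;
  const mm = month < 10 ? '0' + month : String(month);
  const yy = year < 10 ? '0' + year : String(year);
  return mm + '/' + yy;
}
```

Passing `expiry: futureExpiry(new Date())` to `fillCardForm` would keep the happy-path tests valid over time, while the expired-card test can keep its fixed past date.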
FILE:templates/checkout/update-quantity.md
# Update Cart Quantity Template
Tests increasing, decreasing, and removing items from cart.
## Prerequisites
- Cart with at least one item: `{{productName}}` (quantity 2)
- App running at `{{baseUrl}}`
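
The tests in this template locate the cart row with `new RegExp('{{productName}}')`. If the product name contains regex metacharacters such as `(`, `+`, or `.`, the pattern may match the wrong text or fail to match at all. A minimal sketch of escaping the name first, where `escapeForRegExp` is a hypothetical helper:

```typescript
// Hypothetical helper: escapes regex metacharacters so the text is
// matched literally when passed to new RegExp(...).
function escapeForRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Example: build a literal, case-insensitive pattern for a product name.
const rowNamePattern = new RegExp(escapeForRegExp('A+B (2-pack)'), 'i');
```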
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Update Cart Quantity', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/cart');
// Assumes cart is pre-populated via storageState or API setup
});
// Happy path: increase quantity
test('increases item quantity', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
await row.getByRole('button', { name: /increase|plus|\+/i }).click();
await expect(row.getByRole('spinbutton', { name: /quantity/i })).toHaveValue('3');
await expect(page.getByRole('region', { name: /order summary/i })).toContainText('{{updatedTotal}}');
});
// Happy path: decrease quantity
test('decreases item quantity', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
await row.getByRole('button', { name: /decrease|minus|−/i }).click();
await expect(row.getByRole('spinbutton', { name: /quantity/i })).toHaveValue('1');
});
// Happy path: type quantity directly
test('updates quantity by typing in field', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
const qtyInput = row.getByRole('spinbutton', { name: /quantity/i });
await qtyInput.fill('5');
await qtyInput.press('Tab');
await expect(qtyInput).toHaveValue('5');
});
// Happy path: remove item with remove button
test('removes item from cart', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
await row.getByRole('button', { name: /remove|delete/i }).click();
await expect(row).toBeHidden();
await expect(page.getByText(/cart is empty/i)).toBeVisible();
});
// Happy path: decrease to 0 removes item
// (alternative to the minimum-quantity variant below; keep whichever matches your app)
test('decreasing to quantity 0 removes item', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
await row.getByRole('button', { name: /decrease|minus/i }).click(); // from 2 to 1
await row.getByRole('button', { name: /decrease|minus/i }).click(); // should trigger remove
await expect(row).toBeHidden();
});
// Error case: quantity cannot go below 1 via decrease button
test('decrease button disabled at minimum quantity', async ({ page }) => {
const row = page.getByRole('row').nth(1);
const qty = row.getByRole('spinbutton', { name: /quantity/i });
await qty.fill('1');
await qty.press('Tab');
await expect(row.getByRole('button', { name: /decrease|minus/i })).toBeDisabled();
});
// Edge case: quantity clamped to stock limit
test('quantity capped at available stock', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
const qtyInput = row.getByRole('spinbutton', { name: /quantity/i });
await qtyInput.fill('{{overStockQuantity}}');
await qtyInput.press('Tab');
await expect(qtyInput).toHaveValue('{{maxStock}}');
await expect(page.getByRole('alert')).toContainText(/max.*available|stock limit/i);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Update Cart Quantity', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/cart');
});
test('increases item quantity', async ({ page }) => {
const row = page.getByRole('row', { name: new RegExp('{{productName}}') });
await row.getByRole('button', { name: /increase|plus|\+/i }).click();
await expect(row.getByRole('spinbutton', { name: /quantity/i })).toHaveValue('3');
});
test('removes item from cart', async ({ page }) => {
await page.getByRole('row', { name: new RegExp('{{productName}}') })
.getByRole('button', { name: /remove|delete/i }).click();
await expect(page.getByText(/cart is empty/i)).toBeVisible();
});
test('decrease button disabled at quantity 1', async ({ page }) => {
const row = page.getByRole('row').nth(1);
await row.getByRole('spinbutton', { name: /quantity/i }).fill('1');
await row.getByRole('spinbutton', { name: /quantity/i }).press('Tab');
await expect(row.getByRole('button', { name: /decrease|minus/i })).toBeDisabled();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Increase | +1 → quantity updates, total recalculates |
| Decrease | -1 → quantity updates |
| Type directly | Manual quantity input accepted on blur/tab |
| Remove button | Item removed, empty-cart message shown |
| Decrease to 0 | Triggers item removal |
| Min quantity | Decrease button disabled at 1 |
| Stock cap | Input clamped to available stock |
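
The increase test checks the order summary against `{{updatedTotal}}` after the quantity goes from 2 to 3. A sketch of deriving that expected text from a unit price follows. It assumes prices are tracked in integer cents and displayed as `$D.CC`; `formatLineTotal` is a hypothetical helper, and the currency format is a guess to adjust for your storefront.

```typescript
// Hypothetical helper: expected cart line total as displayed text.
// Assumes integer cents (avoids floating-point rounding) and a "$D.CC" format.
function formatLineTotal(unitPriceCents: number, quantity: number): string {
  const totalCents = unitPriceCents * quantity;
  const dollars = Math.floor(totalCents / 100);
  const cents = totalCents % 100;
  return '$' + dollars + '.' + (cents < 10 ? '0' + cents : String(cents));
}
```

This covers a single-item cart only; taxes, shipping, and discounts would need to be added before comparing against a full order summary.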
FILE:templates/crud/bulk-operations.md
# Bulk Operations Template
Tests selecting multiple items and performing bulk delete/update actions.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- At least `{{minItemCount}}` entities seeded in list
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Bulk Operations', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
});
// Happy path: select all and bulk delete
test('selects all and bulk deletes', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
const checkboxes = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') })
.getByRole('checkbox');
await expect(checkboxes.first()).toBeChecked();
await page.getByRole('button', { name: /bulk delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByRole('alert')).toContainText(/deleted/i);
await expect(page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') }))
.toHaveCount(0);
});
// Happy path: select specific rows and bulk update status
test('updates status of selected rows', async ({ page }) => {
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await rows.nth(0).getByRole('checkbox').check();
await rows.nth(1).getByRole('checkbox').check();
await expect(page.getByText(/2 selected/i)).toBeVisible();
await page.getByRole('button', { name: /bulk actions/i }).click();
await page.getByRole('menuitem', { name: /mark as active/i }).click();
await expect(page.getByRole('alert')).toContainText(/2.*updated/i);
});
// Happy path: toolbar appears only when items selected
test('shows bulk action toolbar only when items are selected', async ({ page }) => {
await expect(page.getByRole('toolbar', { name: /bulk actions/i })).toBeHidden();
await page.getByRole('row').nth(1).getByRole('checkbox').check();
await expect(page.getByRole('toolbar', { name: /bulk actions/i })).toBeVisible();
});
// Happy path: deselect all clears toolbar
test('hides toolbar after deselecting all', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
await page.getByRole('checkbox', { name: /select all/i }).uncheck();
await expect(page.getByRole('toolbar', { name: /bulk actions/i })).toBeHidden();
});
// Error case: bulk delete requires confirmation
test('requires confirmation before bulk delete', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
await page.getByRole('button', { name: /bulk delete/i }).click();
await expect(page.getByRole('dialog', { name: /confirm/i })).toBeVisible();
await page.getByRole('button', { name: /cancel/i }).click();
const rowCount = await page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') }).count();
expect(rowCount).toBeGreaterThan(0);
});
// Edge case: select all across pages
test('shows "select all across pages" option when applicable', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
const crossPage = page.getByRole('button', { name: /select all.*across pages/i });
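// isVisible() returns immediately without waiting, so this branch only runs
// if the option has already rendered (single-page lists skip it)
if (await crossPage.isVisible()) {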
await crossPage.click();
await expect(page.getByText(/all.*selected/i)).toBeVisible();
}
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Bulk Operations', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
});
test('shows bulk action toolbar when items selected', async ({ page }) => {
await expect(page.getByRole('toolbar', { name: /bulk actions/i })).toBeHidden();
await page.getByRole('row').nth(1).getByRole('checkbox').check();
await expect(page.getByRole('toolbar', { name: /bulk actions/i })).toBeVisible();
});
test('selects all and bulk deletes', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
await page.getByRole('button', { name: /bulk delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByRole('alert')).toContainText(/deleted/i);
});
test('requires confirmation before bulk delete', async ({ page }) => {
await page.getByRole('checkbox', { name: /select all/i }).check();
await page.getByRole('button', { name: /bulk delete/i }).click();
await expect(page.getByRole('dialog', { name: /confirm/i })).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Select all + delete | All rows selected → confirmed delete → empty list |
| Partial select + update | N rows selected → status updated → success |
| Toolbar visibility | Appears on select, hides on deselect |
| Deselect all | Select all → uncheck → toolbar gone |
| Confirmation required | Bulk delete shows dialog first |
| Cross-page select | Select-all-pages option shown on multi-page lists |
FILE:templates/crud/create.md
# Create Entity Template
Tests creating a new entity via form submission.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Entity type: `{{entityName}}` (e.g. "Project", "Product", "User")
- App running at `{{baseUrl}}`
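
The templates carry `{{placeholder}}` tokens such as `{{baseUrl}}` and `{{entityName}}` that must be replaced before the tests can run. How they get replaced is not specified here. One possible approach is a plain string substitution that fails loudly on any token left without a value; `fillPlaceholders` is a hypothetical helper.

```typescript
// Hypothetical helper: replaces every {{token}} in a template string.
// Throws if a token has no supplied value, so unfilled placeholders
// are caught before the test file is written.
function fillPlaceholders(template: string, values: { [key: string]: string }): string {
  return template.replace(/\{\{(\w+)\}\}/g, (token: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error('No value supplied for placeholder ' + token);
    }
    return values[key];
  });
}

// Example: resolve the "new entity" URL used in this template.
const newEntityUrl = fillPlaceholders('{{baseUrl}}/{{entityName}}s/new', {
  baseUrl: 'http://localhost:3000',
  entityName: 'project',
});
```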
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Create {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/new');
});
// Happy path: create with valid data
test('creates {{entityName}} with valid data', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('{{testEntityName}}');
await page.getByRole('textbox', { name: /description/i }).fill('{{testEntityDescription}}');
await page.getByRole('combobox', { name: /category/i }).selectOption('{{testEntityCategory}}');
await page.getByRole('button', { name: /create|save/i }).click();
await expect(page).toHaveURL(/\/{{entityName}}s\/\d+/);
await expect(page.getByRole('heading', { name: '{{testEntityName}}' })).toBeVisible();
await expect(page.getByRole('alert')).toContainText(/created successfully/i);
});
// Happy path: create and add another
test('clears form after "save and add another"', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('{{testEntityName}}');
await page.getByRole('button', { name: /save and add another/i }).click();
await expect(page.getByRole('textbox', { name: /name/i })).toHaveValue('');
await expect(page.getByRole('alert')).toContainText(/created successfully/i);
});
// Error case: required fields missing
test('shows validation errors for empty required fields', async ({ page }) => {
await page.getByRole('button', { name: /create|save/i }).click();
await expect(page.getByText(/name is required/i)).toBeVisible();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s/new');
});
// Error case: duplicate name
test('shows error when entity name already exists', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('{{existingEntityName}}');
await page.getByRole('button', { name: /create|save/i }).click();
await expect(page.getByRole('alert')).toContainText(/already exists|duplicate/i);
});
// Edge case: max length enforcement
test('enforces max length on name field', async ({ page }) => {
const longName = 'A'.repeat({{maxNameLength}} + 1);
await page.getByRole('textbox', { name: /name/i }).fill(longName);
const actualValue = await page.getByRole('textbox', { name: /name/i }).inputValue();
expect(actualValue.length).toBeLessThanOrEqual({{maxNameLength}});
});
// Edge case: cancel navigates away without saving
test('cancel navigates back without creating', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('should-not-save');
await page.getByRole('button', { name: /cancel/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('cell', { name: 'should-not-save' })).toBeHidden();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Create {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/new');
});
test('creates entity with valid data', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('{{testEntityName}}');
await page.getByRole('textbox', { name: /description/i }).fill('{{testEntityDescription}}');
await page.getByRole('button', { name: /create|save/i }).click();
await expect(page).toHaveURL(/\/{{entityName}}s\/\d+/);
await expect(page.getByRole('alert')).toContainText(/created successfully/i);
});
test('shows validation errors for empty form', async ({ page }) => {
await page.getByRole('button', { name: /create|save/i }).click();
await expect(page.getByText(/name is required/i)).toBeVisible();
});
test('cancel navigates back without saving', async ({ page }) => {
await page.getByRole('textbox', { name: /name/i }).fill('not-saved');
await page.getByRole('button', { name: /cancel/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Happy path | Valid form → entity created → detail page |
| Save and add | Form cleared, ready for next entry |
| Required fields | Empty submit → inline validation |
| Duplicate name | Server error shown |
| Max length | Input truncated at field max |
| Cancel | No entity created, returns to list |
FILE:templates/crud/delete.md
# Delete Entity Template
Tests deletion with confirmation dialog and post-delete behaviour.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Entity to delete: ID `{{entityId}}`, name `{{entityName}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Delete {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: delete from detail page
test('deletes entity after confirming dialog', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /delete/i }).click();
const dialog = page.getByRole('dialog', { name: /delete|confirm/i });
await expect(dialog).toBeVisible();
await expect(dialog).toContainText('{{entityName}}');
await dialog.getByRole('button', { name: /delete|confirm/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('alert')).toContainText(/deleted successfully/i);
await expect(page.getByRole('link', { name: '{{entityName}}' })).toBeHidden();
});
// Happy path: delete from list view
test('deletes entity from list row action', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
const row = page.getByRole('row', { name: new RegExp('{{entityName}}') });
await row.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm|delete/i }).click();
await expect(row).toBeHidden();
});
// Error case: cancel deletion
test('does not delete when cancel is clicked in dialog', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /cancel/i }).click();
await expect(page.getByRole('dialog')).toBeHidden();
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityId}}`);
await expect(page.getByRole('heading', { name: '{{entityName}}' })).toBeVisible();
});
// Error case: delete entity with dependents
test('shows error when entity has dependent records', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityWithDependentsId}}');
await page.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm|delete/i }).click();
await expect(page.getByRole('alert')).toContainText(/cannot delete|has dependents/i);
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityWithDependentsId}}`);
});
// Edge case: confirmation dialog requires typing entity name
test('requires typing entity name to confirm deletion', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /delete/i }).click();
const confirmBtn = page.getByRole('dialog').getByRole('button', { name: /confirm|delete/i });
await expect(confirmBtn).toBeDisabled();
await page.getByRole('textbox', { name: /type.*to confirm/i }).fill('{{entityName}}');
await expect(confirmBtn).toBeEnabled();
await confirmBtn.click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Delete {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('deletes entity after confirming dialog', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm|delete/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('alert')).toContainText(/deleted successfully/i);
});
test('does not delete when cancel clicked', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /cancel/i }).click();
await expect(page.getByRole('heading', { name: '{{entityName}}' })).toBeVisible();
});
test('shows error for entity with dependents', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityWithDependentsId}}');
await page.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm|delete/i }).click();
await expect(page.getByRole('alert')).toContainText(/cannot delete/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Delete confirmed | Dialog confirmed → entity removed → list page |
| Delete from list | Row action → confirm → row removed |
| Cancel deletion | Dialog cancelled → entity intact |
| Dependent error | Entity with children → deletion blocked |
| Type-to-confirm | Confirm button disabled until name typed |
FILE:templates/crud/read.md
# Read Entity Template
Tests viewing entity details and list view with correct data display.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Seeded entity with ID `{{entityId}}` and name `{{entityName}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Read {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: detail page
test('displays entity details correctly', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await expect(page.getByRole('heading', { name: '{{expectedTitle}}' })).toBeVisible();
await expect(page.getByText('{{expectedField}}')).toBeVisible();
await expect(page.getByText('{{expectedCategory}}')).toBeVisible();
});
// Happy path: list view shows all items
test('displays list of entities', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('table')).toBeVisible();
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await expect(rows).toHaveCount({{expectedItemCount}});
});
// Happy path: list item links to detail
test('clicking list item navigates to detail page', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
await page.getByRole('link', { name: '{{expectedTitle}}' }).click();
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityId}}`);
});
// Happy path: breadcrumb navigation
test('breadcrumb shows correct path', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await expect(page.getByRole('navigation', { name: /breadcrumb/i })).toContainText('{{entityName}}s');
await expect(page.getByRole('navigation', { name: /breadcrumb/i })).toContainText('{{expectedTitle}}');
});
// Error case: non-existent entity shows 404
test('shows 404 for non-existent entity', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/999999');
await expect(page.getByRole('heading', { name: /404|not found/i })).toBeVisible();
});
// Edge case: loading state resolves to data
test('shows data after loading completes', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
// Skeleton/spinner should be gone, data visible
await expect(page.getByTestId('skeleton')).toBeHidden();
await expect(page.getByRole('heading', { name: '{{expectedTitle}}' })).toBeVisible();
});
// Edge case: empty list state
test('shows empty state when no entities exist', async ({ page }) => {
// Assumes a fresh context or filter that returns no results
await page.goto('{{baseUrl}}/{{entityName}}s?filter={{emptyFilter}}');
await expect(page.getByText(/no {{entityName}}s found/i)).toBeVisible();
await expect(page.getByRole('button', { name: /create|add/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Read {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('displays entity details correctly', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await expect(page.getByRole('heading', { name: '{{expectedTitle}}' })).toBeVisible();
await expect(page.getByText('{{expectedField}}')).toBeVisible();
});
test('displays list of entities with correct count', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
const rows = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') });
await expect(rows).toHaveCount({{expectedItemCount}});
});
test('shows 404 for non-existent entity', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/999999');
await expect(page.getByRole('heading', { name: /404|not found/i })).toBeVisible();
});
test('shows empty state when list is empty', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s?filter={{emptyFilter}}');
await expect(page.getByText(/no {{entityName}}s found/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Detail view | Entity fields rendered correctly |
| List view | Correct row count in table |
| List → detail | Clicking row/link navigates correctly |
| Breadcrumb | Path reflects current location |
| 404 | Non-existent ID shows not-found page |
| Loading → data | Skeleton hidden, data visible after load |
| Empty list | No-results state with call to action |
FILE:templates/crud/soft-delete.md
# Soft Delete (Archive/Restore) Template
Tests archiving an entity, viewing archived items, and restoring them.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Active entity: ID `{{entityId}}`, name `{{entityName}}`
- Archived entity: ID `{{archivedEntityId}}`, name `{{archivedEntityName}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Soft Delete — Archive & Restore', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: archive entity
test('archives entity and removes from active list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /archive/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /archive|confirm/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('alert')).toContainText(/archived/i);
await expect(page.getByRole('link', { name: '{{entityName}}' })).toBeHidden();
});
// Happy path: archived entity appears in archived view
test('archived entity visible in archived list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s?status=archived');
await expect(page.getByRole('link', { name: '{{entityName}}' })).toBeVisible();
});
// Happy path: restore archived entity
test('restores archived entity to active list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s?status=archived');
const row = page.getByRole('row', { name: new RegExp('{{entityName}}') });
await row.getByRole('button', { name: /restore/i }).click();
await expect(page.getByRole('alert')).toContainText(/restored/i);
await expect(row).toBeHidden();
await page.goto('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('link', { name: '{{entityName}}' })).toBeVisible();
});
// Happy path: active list does not show archived by default
test('active list does not include archived entities', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('link', { name: /{{archivedEntityName}}/i })).toBeHidden();
});
// Error case: archived entity cannot be edited
test('archived entity edit button is disabled', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{archivedEntityId}}');
await expect(page.getByRole('button', { name: /edit/i })).toBeDisabled();
await expect(page.getByText(/archived/i)).toBeVisible();
});
// Edge case: permanently delete archived entity
test('permanently deletes archived entity', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{archivedEntityId}}');
await page.getByRole('button', { name: /delete permanently/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /delete permanently/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s?status=archived');
await expect(page.getByRole('link', { name: '{{archivedEntityName}}' })).toBeHidden();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Soft Delete — Archive & Restore', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('archives entity and removes from active list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /archive/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /archive|confirm/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{entityName}}s');
await expect(page.getByRole('link', { name: '{{entityName}}' })).toBeHidden();
});
test('restores archived entity to active list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s?status=archived');
await page.getByRole('row', { name: new RegExp('{{entityName}}') })
.getByRole('button', { name: /restore/i }).click();
await expect(page.getByRole('alert')).toContainText(/restored/i);
});
test('archived entity edit button is disabled', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{archivedEntityId}}');
await expect(page.getByRole('button', { name: /edit/i })).toBeDisabled();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Archive | Entity moved to archived list, removed from active |
| Archived list | Archived items visible with status=archived filter |
| Restore | Archived entity returned to active list |
| Active list clean | Archived items hidden from default view |
| Edit disabled | Archived entity cannot be edited |
| Permanent delete | Hard-delete of archived entity |
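
The variants above imply a small status lifecycle: active entities can be archived, archived ones can be restored or permanently deleted, and editing is blocked while archived. Below is a sketch of that lifecycle as a pure function, which can help decide which transitions still need a test. The `Status`, `Action`, and `nextStatus` names are assumptions for illustration, not part of the application under test.

```typescript
// Hypothetical model of the lifecycle the soft-delete tests assume.
type Status = 'active' | 'archived' | 'deleted';
type Action = 'archive' | 'restore' | 'deletePermanently' | 'edit';

function nextStatus(status: Status, action: Action): Status {
  if (status === 'active' && action === 'archive') return 'archived';
  if (status === 'active' && action === 'edit') return 'active';
  if (status === 'archived' && action === 'restore') return 'active';
  if (status === 'archived' && action === 'deletePermanently') return 'deleted';
  // Everything else (for example editing an archived entity) is rejected.
  throw new Error(`Action "${action}" is not allowed while ${status}`);
}
```

Each allowed transition corresponds to one happy-path test, and each rejected one is a candidate for an error-case test such as the disabled edit button.
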
FILE:templates/crud/update.md
# Update Entity Template
Tests editing an entity via form and inline edit interactions.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Existing entity ID: `{{entityId}}`, name: `{{originalEntityName}}`
- Replacement name used by the edit tests: `{{updatedEntityName}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Update {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: edit via form
test('updates entity via edit form', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
const nameField = page.getByRole('textbox', { name: /name/i });
await nameField.clear();
await nameField.fill('{{updatedEntityName}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityId}}`);
await expect(page.getByRole('heading', { name: '{{updatedEntityName}}' })).toBeVisible();
await expect(page.getByRole('alert')).toContainText(/updated successfully/i);
});
// Happy path: inline edit
test('updates name via inline edit', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /edit name/i }).click();
const inlineInput = page.getByRole('textbox', { name: /name/i });
await inlineInput.clear();
await inlineInput.fill('{{updatedEntityName}}');
await inlineInput.press('Enter');
await expect(page.getByRole('heading', { name: '{{updatedEntityName}}' })).toBeVisible();
});
// Happy path: edit then navigate away — unsaved changes warning
test('warns before discarding unsaved changes', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
await page.getByRole('textbox', { name: /name/i }).fill('unsaved-change');
await page.getByRole('link', { name: /cancel|back/i }).click();
await expect(page.getByRole('dialog', { name: /unsaved changes/i })).toBeVisible();
await page.getByRole('button', { name: /discard/i }).click();
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityId}}`);
await expect(page.getByRole('heading', { name: '{{originalEntityName}}' })).toBeVisible();
});
// Error case: clearing required field
test('shows validation error when required field is cleared', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
await page.getByRole('textbox', { name: /name/i }).clear();
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByText(/name is required/i)).toBeVisible();
await expect(page).toHaveURL(`{{baseUrl}}/{{entityName}}s/{{entityId}}/edit`);
});
// Error case: conflict (optimistic update failure)
test('handles concurrent edit conflict gracefully', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
// Simulate another user modifying the record
await page.request.put(`{{baseUrl}}/api/{{entityName}}s/{{entityId}}`, {
data: { name: 'modified-by-other', version: 999 },
});
await page.getByRole('textbox', { name: /name/i }).fill('my-change');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/conflict|modified by another/i);
});
// Edge case: inline edit cancelled with Escape
test('cancels inline edit on Escape key', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /edit name/i }).click();
await page.getByRole('textbox', { name: /name/i }).fill('should-not-save');
await page.keyboard.press('Escape');
await expect(page.getByRole('heading', { name: '{{originalEntityName}}' })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Update {{entityName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('updates entity via edit form', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
await page.getByRole('textbox', { name: /name/i }).clear();
await page.getByRole('textbox', { name: /name/i }).fill('{{updatedEntityName}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('heading', { name: '{{updatedEntityName}}' })).toBeVisible();
});
test('shows validation error when required field cleared', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}/edit');
await page.getByRole('textbox', { name: /name/i }).clear();
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByText(/name is required/i)).toBeVisible();
});
test('cancels inline edit on Escape', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s/{{entityId}}');
await page.getByRole('button', { name: /edit name/i }).click();
await page.getByRole('textbox', { name: /name/i }).fill('nope');
await page.keyboard.press('Escape');
await expect(page.getByRole('heading', { name: '{{originalEntityName}}' })).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Edit form | Full edit form → save → detail page |
| Inline edit | Click field → type → Enter to save |
| Unsaved changes | Navigation shows discard confirmation |
| Required field | Cleared required field → validation |
| Conflict | Concurrent edit → conflict error |
| Escape cancel | Inline edit cancelled, original value restored |
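
The conflict variant presumes the backend rejects a save that was based on stale data. One common way to do that is a version counter on the record; the sketch below assumes that approach. The `version` field mirrors the one in the simulated `PUT` above, while `applyUpdate`, `VersionedRecord`, and the 409 status are hypothetical.

```typescript
// Hypothetical server-side check; the real backend may work differently.
interface VersionedRecord {
  id: string;
  name: string;
  version: number;
}

type UpdateResult =
  | { ok: true; record: VersionedRecord }
  | { ok: false; status: 409; current: VersionedRecord };

function applyUpdate(
  current: VersionedRecord,
  name: string,
  expectedVersion: number,
): UpdateResult {
  // Reject the write when the caller edited an older version of the record.
  if (expectedVersion !== current.version) {
    return { ok: false, status: 409, current };
  }
  return { ok: true, record: { ...current, name, version: current.version + 1 } };
}

const stored: VersionedRecord = { id: '1', name: 'original', version: 3 };
const first = applyUpdate(stored, 'modified-by-other', 3); // accepted, version becomes 4
const latest = first.ok ? first.record : stored;
const second = applyUpdate(latest, 'my-change', 3); // rejected: the form still holds version 3
```

The second call models the form in the conflict test: it was loaded before the other user's write, so its version is stale when it saves.
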
FILE:templates/dashboard/chart-rendering.md
# Chart Rendering Template
Tests chart visibility, interactive tooltips, and legend behaviour.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Dashboard with charts at `{{baseUrl}}/dashboard`
- Chart library: `{{chartLibrary}}` (e.g. Chart.js, Recharts, D3)
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Chart Rendering', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
// Wait for chart container to be visible
await expect(page.getByRole('img', { name: /{{chartName}} chart/i })
.or(page.getByTestId('{{chartTestId}}'))).toBeVisible();
});
// Happy path: chart rendered and visible
test('renders {{chartName}} chart', async ({ page }) => {
const chart = page.getByTestId('{{chartTestId}}');
await expect(chart).toBeVisible();
// Chart has non-zero dimensions
const box = await chart.boundingBox();
expect(box?.width).toBeGreaterThan(0);
expect(box?.height).toBeGreaterThan(0);
});
// Happy path: tooltip shown on hover
test('shows tooltip on data point hover', async ({ page }) => {
const chart = page.getByTestId('{{chartTestId}}');
const box = await chart.boundingBox();
// Hover over the centre of the chart
await page.mouse.move(box!.x + box!.width / 2, box!.y + box!.height / 2);
await expect(page.getByRole('tooltip')).toBeVisible();
await expect(page.getByRole('tooltip')).toContainText(/\d/);
});
// Happy path: legend visible with correct labels
test('displays chart legend with correct series labels', async ({ page }) => {
const legend = page.getByRole('list', { name: /legend/i });
await expect(legend).toBeVisible();
await expect(legend.getByRole('listitem', { name: '{{seriesName1}}' })).toBeVisible();
await expect(legend.getByRole('listitem', { name: '{{seriesName2}}' })).toBeVisible();
});
// Happy path: clicking legend toggles series visibility
test('toggles series visibility via legend click', async ({ page }) => {
await page.getByRole('button', { name: '{{seriesName1}}' }).click();
// Series hidden — legend item shows struck-through or disabled state
await expect(page.getByRole('button', { name: '{{seriesName1}}' })).toHaveAttribute('aria-pressed', 'false');
});
// Happy path: chart updates when date range changed
test('updates chart when date range filter applied', async ({ page }) => {
const chart = page.getByTestId('{{chartTestId}}');
const before = await chart.screenshot();
await page.getByRole('combobox', { name: /date range/i }).selectOption('last_7_days');
// Poll until the re-rendered chart differs from the first screenshot,
// instead of capturing once before the new data may have arrived
await expect.poll(async () => Buffer.compare(before, await chart.screenshot()) !== 0).toBe(true);
});
// Error case: empty data shows no-data state
test('shows no-data message when chart has no data', async ({ page }) => {
await page.route('{{baseUrl}}/api/chart-data*', route =>
route.fulfill({ status: 200, body: JSON.stringify({ data: [] }) })
);
await page.reload();
const chart = page.getByTestId('{{chartTestId}}');
await expect(chart.getByText(/no data|no results/i)).toBeVisible();
});
// Edge case: chart accessible via aria
test('chart has accessible title and description', async ({ page }) => {
const chart = page.getByTestId('{{chartTestId}}');
await expect(chart.getByRole('img')).toHaveAttribute('aria-label', /{{chartName}}/i);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Chart Rendering', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('renders chart with non-zero dimensions', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
const chart = page.getByTestId('{{chartTestId}}');
await expect(chart).toBeVisible();
const box = await chart.boundingBox();
expect(box?.width).toBeGreaterThan(0);
expect(box?.height).toBeGreaterThan(0);
});
test('shows tooltip on hover', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
const chart = page.getByTestId('{{chartTestId}}');
const box = await chart.boundingBox();
await page.mouse.move(box.x + box.width / 2, box.y + box.height / 2);
await expect(page.getByRole('tooltip')).toBeVisible();
});
test('displays legend labels', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('list', { name: /legend/i })).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Chart visible | Non-zero bounding box confirmed |
| Tooltip on hover | Tooltip appears with numeric value |
| Legend labels | Series names present in legend |
| Legend toggle | Click hides/shows series |
| Date range update | Chart changes when filter applied |
| No-data state | Empty dataset → no-data message |
| Accessible label | aria-label present on chart element |
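
Hovering the exact centre of the chart only raises a tooltip if a data point happens to sit there. A hedged alternative is to sweep several points along the horizontal midline and stop at the first one that shows a tooltip. The sketch below only computes the coordinates; `probePoints`, `Box`, and `Point` are hypothetical names, and `Box` mirrors the shape returned by `boundingBox()`.

```typescript
// Hypothetical helper for choosing hover targets inside a chart.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

function probePoints(box: Box, count: number): Point[] {
  // All probes share the vertical centre of the chart.
  const y = box.y + box.height / 2;
  const points: Point[] = [];
  for (let i = 1; i <= count; i++) {
    // Spread probes evenly, leaving equal margins at both edges.
    points.push({ x: box.x + (box.width * i) / (count + 1), y });
  }
  return points;
}
```

In a test, each point could be passed to `page.mouse.move(p.x, p.y)` until `page.getByRole('tooltip')` becomes visible. With `count` set to 1 the single probe is the centre, matching the hover in the template above.
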
FILE:templates/dashboard/data-loading.md
# Dashboard Data Loading Template
Tests loading state, skeleton screens, and data display after fetch.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Dashboard at `{{baseUrl}}/dashboard`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Dashboard Data Loading', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: skeleton shown then replaced by data
test('shows skeleton during load, then displays data', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
// Skeleton should resolve; real data appears
await expect(page.getByTestId('skeleton')).toBeHidden();
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
await expect(page.getByRole('region', { name: /{{widgetName}}/i })).toBeVisible();
});
// Happy path: all metric cards populated
test('renders metric cards with values', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
const cards = page.getByRole('article', { name: /metric/i });
await expect(cards).toHaveCount({{expectedMetricCount}});
await expect(cards.first().getByText(/\d/)).toBeVisible();
});
// Happy path: data updates on refresh
test('refreshes data when refresh button clicked', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByTestId('skeleton')).toBeHidden();
// Confirm the metric is rendered before triggering a refresh
await expect(page.getByTestId('{{metricId}}')).toBeVisible();
await page.getByRole('button', { name: /refresh/i }).click();
await expect(page.getByTestId('skeleton')).toBeHidden();
// Value may or may not change — just confirm data loads again
await expect(page.getByTestId('{{metricId}}')).toBeVisible();
});
// Error case: shows error state when API fails
test('shows error state when data fetch fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/dashboard*', route =>
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) })
);
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('alert')).toContainText(/failed to load|error loading/i);
await expect(page.getByRole('button', { name: /retry/i })).toBeVisible();
});
// Error case: retry after failure loads data
test('retries and loads data after error', async ({ page }) => {
let callCount = 0;
await page.route('{{baseUrl}}/api/dashboard*', route => {
callCount++;
if (callCount === 1) return route.fulfill({ status: 500, body: '{}' });
return route.continue();
});
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /retry/i }).click();
await expect(page.getByTestId('skeleton')).toBeHidden();
await expect(page.getByRole('region', { name: /{{widgetName}}/i })).toBeVisible();
});
// Edge case: slow network shows skeleton for duration
test('skeleton persists during slow API response', async ({ page }) => {
await page.route('{{baseUrl}}/api/dashboard*', async route => {
await new Promise(r => setTimeout(r, 2000));
await route.continue();
});
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByTestId('skeleton')).toBeVisible();
await expect(page.getByTestId('skeleton')).toBeHidden(); // eventually resolves
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Dashboard Data Loading', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('renders metric cards after loading', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByTestId('skeleton')).toBeHidden();
await expect(page.getByRole('article', { name: /metric/i }).first()).toBeVisible();
});
test('shows error state on API failure', async ({ page }) => {
await page.route('{{baseUrl}}/api/dashboard*', route =>
route.fulfill({ status: 500, body: '{}' })
);
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('alert')).toContainText(/failed to load|error/i);
await expect(page.getByRole('button', { name: /retry/i })).toBeVisible();
});
test('skeleton visible during slow response', async ({ page }) => {
await page.route('{{baseUrl}}/api/dashboard*', async route => {
await new Promise(r => setTimeout(r, 1500));
await route.continue();
});
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByTestId('skeleton')).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Skeleton → data | Loading state resolves to populated widgets |
| Metric cards | N cards each showing a numeric value |
| Refresh | Data reloaded on button click |
| API error | Error alert + retry button shown |
| Retry success | Second request succeeds after failure |
| Slow network | Skeleton persists during delay |
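
The retry variant depends on a mock that fails once and then lets traffic through. Pulling that decision into a small counter keeps the route handler readable and makes the failure count configurable. `failFirst` and `MockDecision` are hypothetical names used only for this sketch.

```typescript
// Hypothetical helper: decides per request whether the mock should fail.
type MockDecision = 'fail' | 'passthrough';

function failFirst(times: number): () => MockDecision {
  let calls = 0;
  return () => {
    calls += 1;
    // Fail the first `times` requests, then stop interfering.
    return calls <= times ? 'fail' : 'passthrough';
  };
}
```

Inside a `page.route` handler, `'fail'` would map to `route.fulfill({ status: 500, body: '{}' })` and `'passthrough'` to `route.continue()`, as in the retry test above.
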
FILE:templates/dashboard/date-range-filter.md
# Date Range Filter Template
Tests date picker interaction, preset ranges, and data refresh on selection.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Dashboard at `{{baseUrl}}/dashboard`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Date Range Filter', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
});
// Happy path: preset range — last 7 days
test('applies "last 7 days" preset', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
await expect(page).toHaveURL(/from=|start_date=/);
await expect(page.getByRole('button', { name: /date range/i })).toContainText(/last 7 days/i);
});
// Happy path: preset range — last 30 days
test('applies "last 30 days" preset', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 30 days/i }).click();
await expect(page.getByRole('button', { name: /date range/i })).toContainText(/last 30 days/i);
await expect(page.getByTestId('skeleton')).toBeHidden();
});
// Happy path: custom date range via date picker
test('applies custom date range from picker', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /custom/i }).click();
const picker = page.getByRole('dialog', { name: /date range/i });
await expect(picker).toBeVisible();
// Select start date
await picker.getByRole('button', { name: '{{startDay}}' }).click();
// Select end date
await picker.getByRole('button', { name: '{{endDay}}' }).click();
await picker.getByRole('button', { name: /apply/i }).click();
await expect(picker).toBeHidden();
await expect(page.getByRole('button', { name: /date range/i })).toContainText('{{startDateFormatted}}');
});
// Happy path: data reloads on range change
test('reloads dashboard data on date range change', async ({ page }) => {
let requestCount = 0;
await page.route('{{baseUrl}}/api/dashboard*', route => {
requestCount++;
return route.continue();
});
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
// The request fires asynchronously after the click, so poll the counter
await expect.poll(() => requestCount).toBeGreaterThan(0);
await expect(page.getByTestId('skeleton')).toBeHidden();
});
// Error case: invalid custom range (end before start)
test('shows error when end date is before start date', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /custom/i }).click();
const picker = page.getByRole('dialog', { name: /date range/i });
await picker.getByRole('button', { name: '{{endDay}}' }).click(); // pick later date first
await picker.getByRole('button', { name: '{{startDay}}' }).click(); // then earlier
await expect(picker.getByText(/end.*after.*start|invalid.*range/i)).toBeVisible();
await expect(picker.getByRole('button', { name: /apply/i })).toBeDisabled();
});
// Edge case: range persists after page reload
test('date range persists in URL after reload', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
const url = page.url();
await page.reload();
await expect(page).toHaveURL(url);
await expect(page.getByRole('button', { name: /date range/i })).toContainText(/last 7 days/i);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Date Range Filter', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('applies last-7-days preset', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
await expect(page.getByRole('button', { name: /date range/i })).toContainText(/last 7 days/i);
});
test('shows error for invalid range', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /custom/i }).click();
const picker = page.getByRole('dialog', { name: /date range/i });
await picker.getByRole('button', { name: '{{endDay}}' }).click();
await picker.getByRole('button', { name: '{{startDay}}' }).click();
await expect(picker.getByRole('button', { name: /apply/i })).toBeDisabled();
});
test('range persists after page reload', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
const url = page.url();
await page.reload();
await expect(page).toHaveURL(url);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Last 7 days | Preset applied, URL updated |
| Last 30 days | Preset applied, data refreshed |
| Custom range | Date picker → start + end → apply |
| Data reload | API called again on range change |
| Invalid range | End before start → apply disabled |
| URL persistence | Range in URL survives reload |
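
Asserting only that the URL contains `from=` or `start_date=` does not check the dates themselves. If the application encodes presets as ISO dates in UTC, the expected query can be computed in the test. That encoding is an assumption, as are the `from`/`to` parameter names and the helpers `presetRange`, `isValidRange`, and `toQuery`; here "last N days" is taken to mean N days back from today.

```typescript
// Hypothetical helpers; parameter names and date format are assumptions.
interface DateRange {
  from: string; // ISO date, e.g. 2024-03-03
  to: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function toIsoDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

function presetRange(days: number, now: Date): DateRange {
  // Truncate "now" to UTC midnight so the result does not depend on the time of day.
  const end = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  return { from: toIsoDate(new Date(end - days * DAY_MS)), to: toIsoDate(new Date(end)) };
}

function isValidRange(range: DateRange): boolean {
  // ISO dates sort lexicographically, so string comparison is sufficient.
  return range.from <= range.to;
}

function toQuery(range: DateRange): string {
  return `from=${range.from}&to=${range.to}`;
}

const lastWeek = presetRange(7, new Date('2024-03-10T15:30:00Z'));
// lastWeek is { from: '2024-03-03', to: '2024-03-10' }
```

Passing a fixed `now` keeps the expected values deterministic; a real test would also need the browser clock to agree with it.
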
FILE:templates/dashboard/export.md
# Export Template
Tests CSV and PDF export, download triggering, and file verification.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Dashboard or report page at `{{baseUrl}}/{{reportPath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
import path from 'path';
import fs from 'fs';
test.describe('Export', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{reportPath}}');
});
// Happy path: CSV download
test('downloads CSV export', async ({ page }) => {
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*csv|download.*csv/i }).click();
const download = await downloadPromise;
expect(download.suggestedFilename()).toMatch(/\.csv$/);
const filePath = path.join('/tmp', download.suggestedFilename());
await download.saveAs(filePath);
const content = fs.readFileSync(filePath, 'utf-8');
expect(content).toContain('{{expectedCsvHeader}}');
expect(content.split('\n').length).toBeGreaterThan(1);
});
// Happy path: PDF download
test('downloads PDF export', async ({ page }) => {
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*pdf|download.*pdf/i }).click();
const download = await downloadPromise;
expect(download.suggestedFilename()).toMatch(/\.pdf$/);
const filePath = path.join('/tmp', download.suggestedFilename());
await download.saveAs(filePath);
const buffer = fs.readFileSync(filePath);
// PDF magic bytes
expect(buffer.subarray(0, 4).toString()).toBe('%PDF');
});
// Happy path: export with current filters applied
test('export respects active date range filter', async ({ page }) => {
await page.getByRole('button', { name: /date range/i }).click();
await page.getByRole('option', { name: /last 7 days/i }).click();
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*csv/i }).click();
const download = await downloadPromise;
const filePath = path.join('/tmp', download.suggestedFilename());
await download.saveAs(filePath);
const content = fs.readFileSync(filePath, 'utf-8');
expect(content.split('\n').length).toBeGreaterThan(1);
});
// Happy path: export loading indicator
test('shows loading state during export generation', async ({ page }) => {
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*csv/i }).click();
await expect(page.getByRole('button', { name: /export.*csv/i })).toBeDisabled();
await downloadPromise;
await expect(page.getByRole('button', { name: /export.*csv/i })).toBeEnabled();
});
// Error case: export fails with server error
test('shows error when export generation fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/export*', route =>
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Export failed' }) })
);
await page.getByRole('button', { name: /export.*csv/i }).click();
await expect(page.getByRole('alert')).toContainText(/export failed|could not generate/i);
});
// Edge case: export with no data shows warning
test('shows warning when exporting empty dataset', async ({ page }) => {
await page.route('{{baseUrl}}/api/{{reportEndpoint}}*', route =>
route.fulfill({ status: 200, body: JSON.stringify({ data: [] }) })
);
await page.reload();
await page.getByRole('button', { name: /export.*csv/i }).click();
await expect(page.getByRole('alert')).toContainText(/no data to export|empty/i);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const path = require('path');
const fs = require('fs');
test.describe('Export', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('downloads CSV export with correct header', async ({ page }) => {
await page.goto('{{baseUrl}}/{{reportPath}}');
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*csv/i }).click();
const download = await downloadPromise;
expect(download.suggestedFilename()).toMatch(/\.csv$/);
const filePath = path.join('/tmp', download.suggestedFilename());
await download.saveAs(filePath);
expect(fs.readFileSync(filePath, 'utf-8')).toContain('{{expectedCsvHeader}}');
});
test('downloads PDF with correct magic bytes', async ({ page }) => {
await page.goto('{{baseUrl}}/{{reportPath}}');
const downloadPromise = page.waitForEvent('download');
await page.getByRole('button', { name: /export.*pdf/i }).click();
const download = await downloadPromise;
const filePath = path.join('/tmp', download.suggestedFilename());
await download.saveAs(filePath);
expect(fs.readFileSync(filePath).subarray(0, 4).toString()).toBe('%PDF');
});
test('shows error when export fails', async ({ page }) => {
await page.goto('{{baseUrl}}/{{reportPath}}');
await page.route('{{baseUrl}}/api/export*', route =>
route.fulfill({ status: 500, body: '{}' })
);
await page.getByRole('button', { name: /export.*csv/i }).click();
await expect(page.getByRole('alert')).toContainText(/export failed/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| CSV download | File downloaded, header row verified |
| PDF download | File downloaded, %PDF magic bytes checked |
| Filtered export | Active filters applied to exported data |
| Loading state | Button disabled during generation |
| Server error | Export failure → error alert |
| Empty dataset | No-data warning shown |
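
The line-count check in the CSV tests is also satisfied by a header-only file that ends with a newline, since splitting `"id,name\n"` on `\n` yields two elements. Below is a slightly stricter sketch that strips a byte-order mark, ignores blank lines, and counts data rows, plus a byte-level PDF check that avoids string conversion. `csvDataRowCount` and `hasPdfMagicBytes` are hypothetical helpers, and the line splitting is naive: it does not handle quoted fields containing newlines.

```typescript
// Hypothetical verification helpers for downloaded exports.
function csvDataRowCount(content: string, expectedHeader: string): number {
  // Drop a leading UTF-8 byte-order mark if the exporter wrote one.
  const text = content.charCodeAt(0) === 0xfeff ? content.slice(1) : content;
  // Accept both LF and CRLF line endings and skip blank lines.
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (lines.length === 0 || !lines[0].includes(expectedHeader)) {
    throw new Error(`CSV header does not contain "${expectedHeader}"`);
  }
  return lines.length - 1; // rows after the header
}

function hasPdfMagicBytes(bytes: Uint8Array): boolean {
  const magic = [0x25, 0x50, 0x44, 0x46]; // "%PDF"
  return bytes.length >= 4 && magic.every((byte, i) => bytes[i] === byte);
}
```

Both helpers accept what `fs.readFileSync` returns in the tests above: a string when read as `'utf-8'`, and a `Buffer` (a `Uint8Array` subclass) otherwise.
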
FILE:templates/dashboard/realtime-updates.md
# Realtime Updates Template
Tests live data via WebSocket or polling, connection handling, and reconnection.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Dashboard with live data at `{{baseUrl}}/dashboard`
- WebSocket endpoint: `{{wsEndpoint}}`
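
The tests in this template push a `metric_update` message and expect the value `9999` to render as `9,999`. The sketch below builds that payload and the expected display text in one place so the two stay in sync. The message shape is taken from the tests; the `en-US` grouping format and the names `MetricUpdate`, `buildMetricUpdate`, and `expectedMetricText` are assumptions.

```typescript
// Hypothetical helpers; the dashboard's real formatting may differ.
interface MetricUpdate {
  type: 'metric_update';
  id: string;
  value: number;
}

function buildMetricUpdate(id: string, value: number): string {
  const message: MetricUpdate = { type: 'metric_update', id, value };
  return JSON.stringify(message);
}

function expectedMetricText(value: number): string {
  // Assumes the UI groups thousands the way the en-US locale does.
  return new Intl.NumberFormat('en-US').format(value);
}

const payload = buildMetricUpdate('revenue', 9999); // 'revenue' is an illustrative metric ID
const displayText = expectedMetricText(9999); // '9,999'
```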
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Realtime Updates', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: live metric updates via WebSocket
test('updates metric when WebSocket message received', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByTestId('{{metricId}}')).toBeVisible();
// Inject a WS message to simulate server push
await page.evaluate(() => {
const ws = (window as any).__dashboardWs;
if (ws) ws.dispatchEvent(new MessageEvent('message', {
data: JSON.stringify({ type: 'metric_update', id: '{{metricId}}', value: 9999 })
}));
});
await expect(page.getByTestId('{{metricId}}')).toContainText('9,999');
});
// Happy path: connection status indicator
test('shows "connected" status indicator', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('status', { name: /live|connected/i })).toBeVisible();
});
// Happy path: data highlighted on update
test('highlights updated value briefly', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = (window as any).__dashboardWs;
if (ws) ws.dispatchEvent(new MessageEvent('message', {
data: JSON.stringify({ type: 'metric_update', id: '{{metricId}}', value: 42 })
}));
});
await expect(page.getByTestId('{{metricId}}')).toHaveClass(/updated|flash/);
// Highlight fades
await expect(page.getByTestId('{{metricId}}')).not.toHaveClass(/updated|flash/);
});
// Error case: WebSocket disconnected — shows offline indicator
test('shows disconnected state when WebSocket closes', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = (window as any).__dashboardWs;
if (ws) ws.close();
});
await expect(page.getByRole('status', { name: /disconnected|offline/i })).toBeVisible();
await expect(page.getByText(/reconnecting/i)).toBeVisible();
});
// Error case: connection refused — error state shown
test('shows connection error when WebSocket cannot connect', async ({ page }) => {
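// page.route() does not intercept WebSocket connections; routeWebSocket (Playwright 1.48+) does
await page.routeWebSocket('**/{{wsEndpoint}}', ws => ws.close());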
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('alert')).toContainText(/connection.*failed|live updates.*unavailable/i);
});
// Edge case: reconnects automatically after disconnect
test('reconnects automatically after network interruption', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = (window as any).__dashboardWs;
if (ws) ws.close();
});
await expect(page.getByRole('status', { name: /disconnected/i })).toBeVisible();
// Wait for auto-reconnect
await expect(page.getByRole('status', { name: /connected|live/i })).toBeVisible();
});
// Edge case: stale data badge shown when disconnected
test('shows stale data warning when disconnected', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = (window as any).__dashboardWs;
if (ws) ws.close();
});
await expect(page.getByText(/data may be outdated|stale/i)).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Realtime Updates', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows connected status on load', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('status', { name: /live|connected/i })).toBeVisible();
});
test('shows disconnected state when WS closes', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = window.__dashboardWs;
if (ws) ws.close();
});
await expect(page.getByRole('status', { name: /disconnected|offline/i })).toBeVisible();
});
test('updates metric on WS message', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.evaluate(() => {
const ws = window.__dashboardWs;
if (ws) ws.dispatchEvent(new MessageEvent('message', {
data: JSON.stringify({ type: 'metric_update', id: '{{metricId}}', value: 9999 })
}));
});
await expect(page.getByTestId('{{metricId}}')).toContainText('9,999');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Live update | WS message updates metric value |
| Connected status | Status indicator shows "live" |
| Update highlight | Changed value briefly highlighted |
| Disconnected | WS close → disconnected indicator |
| Connection refused | WS blocked → error alert |
| Auto-reconnect | Reconnects after close |
| Stale data | Warning shown while disconnected |
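The live-update tests dispatch a `metric_update` message with `value: 9999` and then expect the text `9,999`, which implies the dashboard groups thousands when rendering. A minimal sketch of that payload-to-text path, assuming `en-US` number formatting; `buildMetricUpdate` and `formatMetricValue` are hypothetical helper names, not part of the templates or the app under test:

```typescript
// Hypothetical helpers for illustration only.
interface MetricUpdate {
  type: 'metric_update';
  id: string;
  value: number;
}

// Serializes a payload in the same shape the tests dispatch on the WebSocket.
function buildMetricUpdate(id: string, value: number): string {
  const message: MetricUpdate = { type: 'metric_update', id, value };
  return JSON.stringify(message);
}

// Assumption: the dashboard uses en-US grouping, so 9999 renders as "9,999".
function formatMetricValue(value: number): string {
  return new Intl.NumberFormat('en-US').format(value);
}
```

If the app under test uses a different locale, the expected text in `toContainText` needs to change to match.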
FILE:templates/forms/autosave.md
# Autosave Template
Tests auto-save draft functionality and draft restoration on revisit.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Form with autosave at `{{baseUrl}}/{{formPath}}`
- Autosave interval: `{{autosaveIntervalMs}}` ms
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Autosave', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
// Happy path: autosave indicator appears after typing
test('shows autosave indicator after typing', async ({ page }) => {
// Install the fake clock before typing so the autosave timer is scheduled on it
await page.clock.install();
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await expect(page.getByText(/saved|draft saved/i)).toBeVisible();
});
// Happy path: draft restored on revisit
test('restores draft on page revisit', async ({ page }) => {
// Install the fake clock before typing so the autosave timer is scheduled on it
await page.clock.install();
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await expect(page.getByText(/draft saved/i)).toBeVisible();
// Simulate revisit
await page.reload();
await expect(page.getByRole('textbox', { name: /{{fieldLabel}}/i })).toHaveValue('{{draftContent}}');
await expect(page.getByText(/draft restored|you have a saved draft/i)).toBeVisible();
});
// Happy path: restore draft via banner
test('restores draft when user clicks restore', async ({ page }) => {
await page.reload();
const banner = page.getByRole('alert', { name: /saved draft/i });
if (await banner.isVisible()) {
await banner.getByRole('button', { name: /restore/i }).click();
await expect(page.getByRole('textbox', { name: /{{fieldLabel}}/i })).toHaveValue('{{draftContent}}');
}
});
// Happy path: dismiss draft banner discards old draft
test('discards draft when user clicks dismiss', async ({ page }) => {
await page.reload();
const banner = page.getByRole('alert', { name: /saved draft/i });
if (await banner.isVisible()) {
await banner.getByRole('button', { name: /dismiss|discard/i }).click();
await expect(banner).toBeHidden();
await expect(page.getByRole('textbox', { name: /{{fieldLabel}}/i })).toHaveValue('');
}
});
// Happy path: draft cleared after successful submit
test('clears autosaved draft after form submission', async ({ page }) => {
// Install the fake clock before typing so the autosave timer is scheduled on it
await page.clock.install();
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByRole('alert')).toContainText(/submitted|saved/i);
// Revisit — no draft banner
await page.goto('{{baseUrl}}/{{formPath}}');
await expect(page.getByRole('alert', { name: /saved draft/i })).toBeHidden();
});
// Error case: autosave request fails and an error message is shown
test('shows autosave error when network fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/drafts*', route => route.abort('failed'));
// Install the fake clock before typing so the autosave timer is scheduled on it
await page.clock.install();
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await expect(page.getByText(/save failed|could not save/i)).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Autosave', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows autosave indicator after interval', async ({ page }) => {
// Install the fake clock before navigation so the page's autosave timers run on it
await page.clock.install();
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await expect(page.getByText(/draft saved/i)).toBeVisible();
});
test('restores draft on page revisit', async ({ page }) => {
// Install the fake clock before navigation so the page's autosave timers run on it
await page.clock.install();
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await page.reload();
await expect(page.getByRole('textbox', { name: /{{fieldLabel}}/i })).toHaveValue('{{draftContent}}');
});
test('clears draft after successful submit', async ({ page }) => {
// Install the fake clock before navigation so the page's autosave timers run on it
await page.clock.install();
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{draftContent}}');
await page.clock.fastForward({{autosaveIntervalMs}});
await page.getByRole('button', { name: /submit/i }).click();
await page.goto('{{baseUrl}}/{{formPath}}');
await expect(page.getByRole('alert', { name: /saved draft/i })).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Autosave indicator | "Draft saved" shown after interval |
| Draft restored | Revisit → field pre-filled |
| Restore via banner | Banner restore button populates field |
| Dismiss draft | Discard clears saved value |
| Cleared on submit | No draft banner after successful submit |
| Network failure | Save-failed message shown |
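The tests above advance a fake clock by `{{autosaveIntervalMs}}` and then look for a "draft saved" message, which fits a debounce-style autosave: each keystroke restarts a timer, and the save fires once the interval passes without input. The sketch below illustrates that idea under stated assumptions. `createAutosaver`, `Schedule`, and `realTimerSchedule` are hypothetical names, and the real app may implement autosave differently (for example with `setInterval`).

```typescript
// Hypothetical sketch of a debounce-style autosave; not the app's actual code.
type Cancel = () => void;
type Schedule = (callback: () => void, delayMs: number) => Cancel;

function createAutosaver(
  save: (draft: string) => void,
  intervalMs: number,
  schedule: Schedule,
) {
  let cancelPending: Cancel | null = null;
  return {
    // Call on every input event; only the latest draft is saved.
    onInput(draft: string): void {
      if (cancelPending) cancelPending();
      cancelPending = schedule(() => {
        cancelPending = null;
        save(draft);
      }, intervalMs);
    },
  };
}

// Scheduler backed by setTimeout, the function a fake clock replaces.
const realTimerSchedule: Schedule = (callback, delayMs) => {
  const handle = setTimeout(callback, delayMs);
  return () => clearTimeout(handle);
};
```

Because the timer is created at input time, a fake clock only controls it if the clock is installed before the field is filled.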
FILE:templates/forms/conditional-fields.md
# Conditional Fields Template
Tests show/hide fields based on selection and correct validation of visible fields only.
## Prerequisites
- Form at `{{baseUrl}}/{{formPath}}`
- Trigger field: `{{triggerField}}` (e.g. country, type selector)
- Conditional field shown when value is `{{triggerValue}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Conditional Fields', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
// Happy path: conditional field shown on trigger
test('shows conditional field when trigger value selected', async ({ page }) => {
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeHidden();
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeVisible();
});
// Happy path: conditional field hidden when trigger changes
test('hides conditional field when trigger changes back', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeVisible();
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{nonTriggerValue}}');
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeHidden();
});
// Happy path: form submits with conditional field filled
test('submits form when conditional field is shown and filled', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i }).fill('{{conditionalFieldValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByRole('alert')).toContainText(/submitted|saved/i);
});
// Error case: conditional field required when visible
test('validates conditional field when it is visible', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/{{conditionalFieldLabel}}.*required/i)).toBeVisible();
});
// Error case: hidden field not validated
test('does not validate conditional field when hidden', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{nonTriggerValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/{{conditionalFieldLabel}}.*required/i)).toBeHidden();
});
// Edge case: conditional field value cleared when hidden
test('clears conditional field value when field is hidden', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i }).fill('some value');
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{nonTriggerValue}}');
// Re-show and verify value is cleared
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toHaveValue('');
});
// Edge case: radio trigger shows/hides field
test('shows field based on radio button selection', async ({ page }) => {
await page.getByRole('radio', { name: '{{radioTriggerLabel}}' }).check();
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeVisible();
await page.getByRole('radio', { name: '{{radioOtherLabel}}' }).check();
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeHidden();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Conditional Fields', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
test('shows conditional field when trigger selected', async ({ page }) => {
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeHidden();
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await expect(page.getByRole('textbox', { name: /{{conditionalFieldLabel}}/i })).toBeVisible();
});
test('validates visible conditional field on submit', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{triggerValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/{{conditionalFieldLabel}}.*required/i)).toBeVisible();
});
test('does not validate hidden conditional field', async ({ page }) => {
await page.getByRole('combobox', { name: /{{triggerField}}/i }).selectOption('{{nonTriggerValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/{{conditionalFieldLabel}}.*required/i)).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Show on trigger | Selecting value reveals hidden field |
| Hide on change | Changing back hides field again |
| Submit with field | Visible field filled → success |
| Validate visible | Visible empty field → required error |
| Skip hidden | Hidden field not validated |
| Clear on hide | Value cleared when field hidden |
| Radio trigger | Radio button controls field visibility |
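The variants above describe two rules: only visible fields are validated, and a field's value is cleared when it is hidden. A minimal sketch of those rules as pure functions, assuming a simple in-memory field model; `FieldState`, `validateVisibleFields`, and `setVisibility` are hypothetical names used for illustration:

```typescript
// Hypothetical field model; the real form may track state differently.
interface FieldState {
  label: string;
  value: string;
  required: boolean;
  visible: boolean;
}

// Returns one "<label> is required" message per visible, required, empty field.
function validateVisibleFields(fields: FieldState[]): string[] {
  return fields
    .filter((f) => f.visible && f.required && f.value.trim() === '')
    .map((f) => `${f.label} is required`);
}

// Hiding a field also clears its value, matching the "Clear on hide" variant.
function setVisibility(field: FieldState, visible: boolean): FieldState {
  return visible ? { ...field, visible } : { ...field, visible, value: '' };
}
```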
FILE:templates/forms/file-upload.md
# File Upload Template
Tests single file, multiple files, drag-and-drop, and upload progress.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Test files available: `{{testFilePath}}`, `{{largeFilePath}}`
- Accepted types: `{{acceptedMimeTypes}}` (e.g. image/jpeg, application/pdf)
- Max file size: `{{maxFileSizeMb}}` MB
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
import path from 'path';
const testFile = path.resolve('{{testFilePath}}');
const largeFile = path.resolve('{{largeFilePath}}');
test.describe('File Upload', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{uploadPath}}');
});
// Happy path: single file upload
test('uploads a single file successfully', async ({ page }) => {
await page.getByRole('button', { name: /choose file|browse/i }).setInputFiles(testFile);
await expect(page.getByText(/{{testFileName}}/)).toBeVisible();
await page.getByRole('button', { name: /upload/i }).click();
await expect(page.getByRole('progressbar')).toBeVisible();
await expect(page.getByRole('alert')).toContainText(/upload.*complete|uploaded successfully/i);
});
// Happy path: multiple files
test('uploads multiple files', async ({ page }) => {
const input = page.getByRole('button', { name: /choose file|browse/i });
await input.setInputFiles([testFile, testFile]);
await expect(page.getByText(/2 files?|{{testFileName}}/i)).toBeVisible();
await page.getByRole('button', { name: /upload/i }).click();
await expect(page.getByRole('alert')).toContainText(/2.*uploaded/i);
});
// Happy path: drag and drop
test('uploads file via drag and drop', async ({ page }) => {
const dropzone = page.getByRole('region', { name: /drop.*files|drag.*here/i });
await expect(dropzone).toBeVisible();
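// Build a DataTransfer that actually carries a file; an empty one makes the drop a no-op.
// The in-memory File below is a stand-in for the real test file.
const dataTransfer = await page.evaluateHandle(() => {
const dt = new DataTransfer();
dt.items.add(new File(['sample content'], '{{testFileName}}'));
return dt;
});
await dropzone.dispatchEvent('drop', { dataTransfer });
// If the dropzone only wraps a hidden input, use this instead of the drop event:
// await page.locator('input[type="file"]').setInputFiles(testFile);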
await expect(page.getByText(/{{testFileName}}/)).toBeVisible();
});
// Happy path: remove file from queue before upload
test('removes file from queue', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles(testFile);
await page.getByRole('button', { name: /remove.*{{testFileName}}|×/i }).click();
await expect(page.getByText(/{{testFileName}}/)).toBeHidden();
});
// Error case: file too large
test('shows error for oversized file', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles(largeFile);
await expect(page.getByText(/too large|exceeds.*{{maxFileSizeMb}}|max.*size/i)).toBeVisible();
});
// Error case: wrong file type
test('shows error for unsupported file type', async ({ page }) => {
const wrongTypeFile = { name: 'test.exe', mimeType: 'application/octet-stream', buffer: Buffer.from('data') };
await page.locator('input[type="file"]').setInputFiles(wrongTypeFile);
await expect(page.getByText(/unsupported.*type|{{acceptedMimeTypes}}.*only/i)).toBeVisible();
});
// Edge case: upload progress shown and completed
test('shows progress bar during upload', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles(testFile);
await page.getByRole('button', { name: /upload/i }).click();
const progress = page.getByRole('progressbar');
await expect(progress).toBeVisible();
await expect(progress).toBeHidden(); // completes and hides
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
const path = require('path');
test.describe('File Upload', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{uploadPath}}');
});
test('uploads single file', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles('{{testFilePath}}');
await page.getByRole('button', { name: /upload/i }).click();
await expect(page.getByRole('alert')).toContainText(/uploaded successfully/i);
});
test('shows error for oversized file', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles('{{largeFilePath}}');
await expect(page.getByText(/too large|exceeds/i)).toBeVisible();
});
test('shows error for wrong file type', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles({
name: 'bad.exe',
mimeType: 'application/octet-stream',
buffer: Buffer.from('x'),
});
await expect(page.getByText(/unsupported.*type/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Single file | File picker → upload → success |
| Multiple files | Two files queued and uploaded |
| Drag-and-drop | Drop event populates queue |
| Remove from queue | File removed before upload |
| Oversized | Error shown, upload blocked |
| Wrong type | Mime-type error shown |
| Progress bar | Progressbar visible during upload |
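The "Oversized" and "Wrong type" variants assume the page checks a file against `{{acceptedMimeTypes}}` and `{{maxFileSizeMb}}` before uploading. A minimal sketch of such a check, assuming 1 MB means 1024 * 1024 bytes; `CandidateFile` and `validateUpload` are hypothetical names, and the messages are chosen only so they match the regular expressions used in the tests:

```typescript
// Hypothetical client-side check; the real app's rules and messages may differ.
interface CandidateFile {
  name: string;
  mimeType: string;
  sizeBytes: number;
}

// Returns an error message, or null when the file is acceptable.
function validateUpload(
  file: CandidateFile,
  acceptedMimeTypes: string[],
  maxFileSizeMb: number,
): string | null {
  if (!acceptedMimeTypes.includes(file.mimeType)) {
    return `Unsupported file type: ${file.mimeType}`;
  }
  // Assumption: 1 MB = 1024 * 1024 bytes.
  if (file.sizeBytes > maxFileSizeMb * 1024 * 1024) {
    return `File too large: exceeds ${maxFileSizeMb} MB`;
  }
  return null;
}
```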
FILE:templates/forms/multi-step.md
# Multi-Step Form (Wizard) Template
Tests wizard step navigation, validation per step, and final submission.
## Prerequisites
- Form wizard at `{{baseUrl}}/{{wizardPath}}`
- Steps: Step 1 (personal), Step 2 (details), Step 3 (review)
---
## TypeScript
```typescript
import { test, expect, Page } from '@playwright/test';
async function completeStep1(page: Page): Promise<void> {
await page.getByRole('textbox', { name: /first name/i }).fill('{{firstName}}');
await page.getByRole('textbox', { name: /last name/i }).fill('{{lastName}}');
await page.getByRole('textbox', { name: /email/i }).fill('{{email}}');
await page.getByRole('button', { name: /next/i }).click();
}
async function completeStep2(page: Page): Promise<void> {
await page.getByRole('combobox', { name: /{{step2Field}}/i }).selectOption('{{step2Value}}');
await page.getByRole('textbox', { name: /{{step2TextField}}/i }).fill('{{step2TextValue}}');
await page.getByRole('button', { name: /next/i }).click();
}
test.describe('Multi-Step Form', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{wizardPath}}');
});
// Happy path: complete all steps
test('completes all wizard steps and submits', async ({ page }) => {
await expect(page.getByText(/step 1/i)).toBeVisible();
await completeStep1(page);
await expect(page.getByText(/step 2/i)).toBeVisible();
await completeStep2(page);
await expect(page.getByText(/review|step 3/i)).toBeVisible();
// Review page shows entered data
await expect(page.getByText('{{firstName}}')).toBeVisible();
await page.getByRole('button', { name: /submit|finish/i }).click();
await expect(page).toHaveURL(/\/{{successPath}}/);
});
// Happy path: step indicator updates
test('step indicator reflects current step', async ({ page }) => {
const step1 = page.getByRole('listitem', { name: /step 1/i });
await expect(step1).toHaveAttribute('aria-current', 'step');
await completeStep1(page);
const step2 = page.getByRole('listitem', { name: /step 2/i });
await expect(step2).toHaveAttribute('aria-current', 'step');
});
// Happy path: back navigation
test('navigates back to previous step without losing data', async ({ page }) => {
await completeStep1(page);
await page.getByRole('button', { name: /back|previous/i }).click();
await expect(page.getByRole('textbox', { name: /first name/i })).toHaveValue('{{firstName}}');
});
// Happy path: completed steps accessible via indicator
test('clicking completed step in indicator navigates back', async ({ page }) => {
await completeStep1(page);
await page.getByRole('button', { name: /step 1/i }).click();
await expect(page.getByRole('textbox', { name: /first name/i })).toBeVisible();
});
// Error case: cannot proceed with invalid step 1 data
test('stays on step 1 when required field missing', async ({ page }) => {
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByText(/first name.*required|required/i)).toBeVisible();
await expect(page.getByText(/step 1/i)).toBeVisible();
});
// Error case: future step not accessible directly
test('cannot access step 3 without completing step 2', async ({ page }) => {
await expect(page.getByRole('button', { name: /step 3/i })).toBeDisabled();
});
// Edge case: browser back button handled
test('browser back from step 2 returns to step 1 with data', async ({ page }) => {
await completeStep1(page);
await page.goBack();
await expect(page.getByRole('textbox', { name: /first name/i })).toHaveValue('{{firstName}}');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Multi-Step Form', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{wizardPath}}');
});
test('completes all wizard steps and submits', async ({ page }) => {
await page.getByRole('textbox', { name: /first name/i }).fill('{{firstName}}');
await page.getByRole('textbox', { name: /last name/i }).fill('{{lastName}}');
await page.getByRole('textbox', { name: /email/i }).fill('{{email}}');
await page.getByRole('button', { name: /next/i }).click();
await page.getByRole('combobox', { name: /{{step2Field}}/i }).selectOption('{{step2Value}}');
await page.getByRole('button', { name: /next/i }).click();
await page.getByRole('button', { name: /submit|finish/i }).click();
await expect(page).toHaveURL(/\/{{successPath}}/);
});
test('stays on step 1 when required field missing', async ({ page }) => {
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByText(/required/i)).toBeVisible();
});
test('navigates back without losing data', async ({ page }) => {
await page.getByRole('textbox', { name: /first name/i }).fill('{{firstName}}');
await page.getByRole('textbox', { name: /last name/i }).fill('{{lastName}}');
await page.getByRole('textbox', { name: /email/i }).fill('{{email}}');
await page.getByRole('button', { name: /next/i }).click();
await page.getByRole('button', { name: /back|previous/i }).click();
await expect(page.getByRole('textbox', { name: /first name/i })).toHaveValue('{{firstName}}');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Full completion | All steps filled → submit → success URL |
| Step indicator | aria-current updates per step |
| Back navigation | Data preserved on back |
| Completed step click | Step indicator link works |
| Validation | Required field blocks Next |
| Locked future step | Step 3 button disabled until step 2 done |
| Browser back | History navigation preserves data |
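The "Completed step click" and "Locked future step" variants imply a gating rule: a step can be opened only once every earlier step is complete. A minimal sketch of that rule, assuming steps are numbered from 1; `isStepAccessible` is a hypothetical name, not something the wizard is known to expose:

```typescript
// Hypothetical gating rule for a linear wizard with 1-based step numbers.
function isStepAccessible(step: number, completedSteps: ReadonlySet<number>): boolean {
  if (step < 1) return false;
  // Every step before the requested one must already be complete.
  for (let prior = 1; prior < step; prior++) {
    if (!completedSteps.has(prior)) return false;
  }
  return true;
}
```

With this rule, step 3 stays disabled after only step 1 is done, while step 1 remains reachable from the indicator.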
FILE:templates/forms/single-step.md
# Single-Step Form Template
Tests simple form submission with success and validation scenarios.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Form at `{{baseUrl}}/{{formPath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Single-Step Form — {{formName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
// Happy path: successful submission
test('submits form with valid data', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
await page.getByRole('textbox', { name: /{{field2Label}}/i }).fill('{{field2Value}}');
await page.getByRole('combobox', { name: /{{field3Label}}/i }).selectOption('{{field3Value}}');
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByRole('alert')).toContainText(/submitted|saved successfully/i);
});
// Happy path: success redirect
test('redirects to success page after submission', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/{{successPath}}');
});
// Happy path: reset clears form
test('reset button clears all fields', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('some value');
await page.getByRole('button', { name: /reset|clear/i }).click();
await expect(page.getByRole('textbox', { name: /{{field1Label}}/i })).toHaveValue('');
});
// Error case: required field missing
test('shows required field error', async ({ page }) => {
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByText(/{{field1Label}}.*required|required/i)).toBeVisible();
await expect(page.getByRole('textbox', { name: /{{field1Label}}/i })).toBeFocused();
});
// Error case: invalid email format
test('shows format error for invalid email', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('not-an-email');
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByText(/valid.*email|invalid.*email/i)).toBeVisible();
});
// Error case: server error on submit
test('shows generic error when server returns 500', async ({ page }) => {
await page.route('{{baseUrl}}/api/{{formEndpoint}}', route =>
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Server Error' }) })
);
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByRole('alert')).toContainText(/error|something went wrong/i);
});
// Edge case: double submit prevented
test('disables submit button after first click', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
const btn = page.getByRole('button', { name: /submit|save/i });
await btn.click();
await expect(btn).toBeDisabled();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Single-Step Form — {{formName}}', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
test('submits form with valid data', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
await page.getByRole('textbox', { name: /{{field2Label}}/i }).fill('{{field2Value}}');
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByRole('alert')).toContainText(/submitted|saved/i);
});
test('shows required error for empty submission', async ({ page }) => {
await page.getByRole('button', { name: /submit|save/i }).click();
await expect(page.getByText(/required/i)).toBeVisible();
});
test('disables submit after click (prevents double submit)', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field1Label}}/i }).fill('{{field1Value}}');
const btn = page.getByRole('button', { name: /submit|save/i });
await btn.click();
await expect(btn).toBeDisabled();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid submit | All fields filled → success message |
| Success redirect | Navigates to success URL |
| Reset | All fields cleared |
| Required field | Empty submit → first error focused |
| Invalid email | Format error shown |
| Server 500 | Generic error alert |
| Double submit | Button disabled after first click |
FILE:templates/forms/validation.md
# Form Validation Template
Tests required fields, format validation, and inline error messages.
## Prerequisites
- Form at `{{baseUrl}}/{{formPath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Form Validation', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
// Happy path: all errors resolved on re-submit
test('clears errors when valid data entered', async ({ page }) => {
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/required/i)).toBeVisible();
await page.getByRole('textbox', { name: /{{requiredField}}/i }).fill('{{validValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/required/i)).toBeHidden();
});
// Error case: required fields
test('shows required error for each empty required field', async ({ page }) => {
await page.getByRole('button', { name: /submit/i }).click();
const requiredErrors = page.getByText(/is required|required field/i);
await expect(requiredErrors.first()).toBeVisible();
});
// Error case: invalid email format
test('shows error for invalid email format', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('bad@');
await page.getByRole('textbox', { name: /email/i }).blur();
await expect(page.getByText(/valid.*email|enter.*valid email/i)).toBeVisible();
});
// Error case: invalid phone format
test('shows error for invalid phone number', async ({ page }) => {
await page.getByRole('textbox', { name: /phone/i }).fill('123');
await page.getByRole('textbox', { name: /phone/i }).blur();
await expect(page.getByText(/valid.*phone|invalid phone/i)).toBeVisible();
});
// Error case: password too short
test('shows error when password is too short', async ({ page }) => {
// Password inputs have no implicit "textbox" role, so locate them by label
await page.getByLabel(/^password$/i).fill('abc');
await page.getByLabel(/^password$/i).blur();
await expect(page.getByText(/at least \d+ characters/i)).toBeVisible();
});
// Error case: passwords do not match
test('shows error when confirm password does not match', async ({ page }) => {
// Password inputs have no implicit "textbox" role, so locate them by label
await page.getByLabel(/^password$/i).fill('{{validPassword}}');
await page.getByLabel(/confirm password/i).fill('different');
await page.getByLabel(/confirm password/i).blur();
await expect(page.getByText(/passwords.*do not match/i)).toBeVisible();
});
// Error case: inline error on blur (not on submit)
test('shows inline error on blur for invalid value', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('invalid');
await page.getByRole('textbox', { name: /email/i }).blur();
// Error shown immediately, not waiting for submit
await expect(page.getByText(/valid.*email/i)).toBeVisible();
});
// Error case: field-level errors tied to field via aria-describedby
test('error message is associated with field via aria', async ({ page }) => {
await page.getByRole('button', { name: /submit/i }).click();
const emailField = page.getByRole('textbox', { name: /email/i });
const errorId = await emailField.getAttribute('aria-describedby');
expect(errorId).toBeTruthy();
await expect(page.locator(`#${errorId}`)).toBeVisible();
});
// Edge case: field max-length validation
test('shows error when input exceeds max length', async ({ page }) => {
await page.getByRole('textbox', { name: /{{field}}/i }).fill('A'.repeat({{maxLength}} + 1));
await page.getByRole('textbox', { name: /{{field}}/i }).blur();
await expect(page.getByText(/max.*{{maxLength}}|too long/i)).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Form Validation', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
});
test('shows required errors on empty submit', async ({ page }) => {
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/is required|required field/i).first()).toBeVisible();
});
test('shows error for invalid email on blur', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('bad@');
await page.getByRole('textbox', { name: /email/i }).blur();
await expect(page.getByText(/valid.*email/i)).toBeVisible();
});
test('passwords mismatch error shown', async ({ page }) => {
await page.getByRole('textbox', { name: /^password$/i }).fill('{{validPassword}}');
await page.getByRole('textbox', { name: /confirm password/i }).fill('other');
await page.getByRole('textbox', { name: /confirm password/i }).blur();
await expect(page.getByText(/do not match/i)).toBeVisible();
});
test('clears errors when valid data entered', async ({ page }) => {
await page.getByRole('button', { name: /submit/i }).click();
await page.getByRole('textbox', { name: /{{requiredField}}/i }).fill('{{validValue}}');
await page.getByRole('button', { name: /submit/i }).click();
await expect(page.getByText(/required/i)).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Error cleared | Valid input → errors removed on next submit |
| Required fields | Empty submit → at least one required error |
| Email format | Blur with bad email → inline error |
| Phone format | Invalid phone → inline error |
| Password length | Too short → character count error |
| Password match | Mismatch → confirmation error |
| Blur validation | Error shown on blur, not just submit |
| aria-describedby | Error programmatically linked to field |
| Max length | Exceeded length → error shown |
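
The aria-describedby variant reads the attribute and looks the error element up by ID. Because `aria-describedby` is a space-separated list of IDs, a field may reference more than one element (for example a hint and an error). The sketch below shows one way to turn the attribute value into a selector; `describedBySelector` is a hypothetical helper and the IDs in the example are made up.

```typescript
// Hypothetical helper: converts an aria-describedby value into a CSS selector list.
// Attribute selectors are used instead of "#id" so IDs with characters such as
// ":" or "." need no CSS escaping (IDs containing double quotes are not handled).
function describedBySelector(attr: string | null): string {
  const ids = (attr ?? '').split(/\s+/).filter((id) => id.length > 0);
  if (ids.length === 0) {
    throw new Error('Field has no aria-describedby value');
  }
  return ids.map((id) => `[id="${id}"]`).join(', ');
}

// Example with assumed IDs: a hint and an error message on the same field.
const exampleSelector = describedBySelector('email-hint email-error');
```

The resulting string could be passed to `page.locator(...)`, optionally narrowed with `.filter({ hasText: ... })` to pick out the error text.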
FILE:templates/notifications/in-app.md
# In-App Notifications Template
Tests notification badge count, dropdown, and mark-as-read behaviour.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- At least `{{unreadCount}}` unread notifications seeded
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('In-App Notifications', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: badge shows unread count
test('shows unread notification count badge', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('status', { name: /notification.*count/i }))
.toContainText('{{unreadCount}}');
});
// Happy path: dropdown opens on bell click
test('opens notification dropdown on bell click', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await expect(page.getByRole('menu', { name: /notifications/i })).toBeVisible();
const items = page.getByRole('menuitem');
await expect(items.first()).toBeVisible();
});
// Happy path: mark single notification as read
test('marks notification as read', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
const firstNotif = page.getByRole('menuitem').first();
await firstNotif.getByRole('button', { name: /mark as read/i }).click();
// Word boundaries keep this from also matching "unread"
await expect(firstNotif).toHaveAttribute('aria-label', /\bread\b/i);
// Badge count decremented
await expect(page.getByRole('status', { name: /notification.*count/i }))
.toContainText(String({{unreadCount}} - 1));
});
// Happy path: mark all as read
test('marks all notifications as read', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await page.getByRole('button', { name: /mark all.*read/i }).click();
await expect(page.getByRole('status', { name: /notification.*count/i })).toBeHidden();
});
// Happy path: clicking notification navigates to context
test('clicking notification navigates to relevant page', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await page.getByRole('menuitem').first().click();
await expect(page).toHaveURL(/\/{{notificationTargetPath}}/);
});
// Error case: notification dropdown empty state
test('shows empty state when no notifications', async ({ page }) => {
await page.route('{{baseUrl}}/api/notifications*', route =>
route.fulfill({ status: 200, body: JSON.stringify({ items: [], unread: 0 }) })
);
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await expect(page.getByText(/no notifications|all caught up/i)).toBeVisible();
});
// Edge case: dropdown closes on outside click
test('closes notification dropdown on outside click', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await expect(page.getByRole('menu', { name: /notifications/i })).toBeVisible();
await page.getByRole('heading', { name: /dashboard/i }).click();
await expect(page.getByRole('menu', { name: /notifications/i })).toBeHidden();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('In-App Notifications', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('badge shows unread count', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('status', { name: /notification.*count/i }))
.toContainText('{{unreadCount}}');
});
test('opens dropdown on bell click', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await expect(page.getByRole('menu', { name: /notifications/i })).toBeVisible();
});
test('marks all as read clears badge', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /notifications/i }).click();
await page.getByRole('button', { name: /mark all.*read/i }).click();
await expect(page.getByRole('status', { name: /notification.*count/i })).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Badge count | Unread count shown in badge |
| Dropdown open | Bell click → notification list |
| Mark single read | Item marked, badge decremented |
| Mark all read | Badge hidden |
| Notification click | Navigates to context page |
| Empty state | No-notifications message |
| Outside click | Dropdown closes |
FILE:templates/notifications/notification-center.md
# Notification Center Template
Tests full notification list, filtering, and bulk clear.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Mix of read/unread notifications seeded
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Notification Center', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/notifications');
});
// Happy path: notification list visible
test('displays notification list', async ({ page }) => {
await expect(page.getByRole('heading', { name: /notifications/i })).toBeVisible();
await expect(page.getByRole('list', { name: /notifications/i })).toBeVisible();
const items = page.getByRole('listitem');
await expect(items.first()).toBeVisible();
});
// Happy path: filter by unread
test('filters to show only unread notifications', async ({ page }) => {
await page.getByRole('button', { name: /unread/i }).click();
const items = page.getByRole('listitem');
const count = await items.count();
for (let i = 0; i < count; i++) {
await expect(items.nth(i)).toHaveAttribute('aria-label', /unread/i);
}
});
// Happy path: filter by type
test('filters notifications by type', async ({ page }) => {
await page.getByRole('combobox', { name: /type|category/i }).selectOption('{{notificationType}}');
const items = page.getByRole('listitem');
await expect(items.first()).toContainText(/{{notificationTypeLabel}}/i);
});
// Happy path: mark single as read
test('marks individual notification as read', async ({ page }) => {
const first = page.getByRole('listitem').first();
await first.getByRole('button', { name: /mark.*read/i }).click();
await expect(first).not.toHaveAttribute('data-unread', 'true');
});
// Happy path: clear all notifications
test('clears all notifications', async ({ page }) => {
await page.getByRole('button', { name: /clear all/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByText(/no notifications|all cleared/i)).toBeVisible();
await expect(page.getByRole('listitem')).toHaveCount(0);
});
// Happy path: pagination / load more
test('loads more notifications on scroll or button click', async ({ page }) => {
const items = page.getByRole('listitem');
const initialCount = await items.count();
await page.getByRole('button', { name: /load more/i }).click();
// Poll until the list grows; a one-off count() can run before new items render
await expect.poll(() => items.count()).toBeGreaterThan(initialCount);
});
// Error case: empty state after clearing
test('shows empty state after clearing all', async ({ page }) => {
await page.getByRole('button', { name: /clear all/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByText(/no notifications/i)).toBeVisible();
});
// Edge case: notification links to source
test('clicking notification navigates to source', async ({ page }) => {
await page.getByRole('listitem').first().getByRole('link').click();
await expect(page).not.toHaveURL('{{baseUrl}}/notifications');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Notification Center', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('displays notification list', async ({ page }) => {
await page.goto('{{baseUrl}}/notifications');
await expect(page.getByRole('list', { name: /notifications/i })).toBeVisible();
await expect(page.getByRole('listitem').first()).toBeVisible();
});
test('filters to unread only', async ({ page }) => {
await page.goto('{{baseUrl}}/notifications');
await page.getByRole('button', { name: /unread/i }).click();
await expect(page.getByRole('listitem').first()).toHaveAttribute('aria-label', /unread/i);
});
test('clears all notifications', async ({ page }) => {
await page.goto('{{baseUrl}}/notifications');
await page.getByRole('button', { name: /clear all/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByText(/no notifications/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| List display | Notification items visible |
| Unread filter | Only unread items shown |
| Type filter | Category filter scopes list |
| Mark single read | Item marked, styling changes |
| Clear all | Confirmation → empty state |
| Load more | Additional items appended |
| Empty state | No-notifications message post-clear |
| Source link | Click navigates away from center |
FILE:templates/notifications/toast-messages.md
# Toast Messages Template
Tests success, error, and warning toasts with auto-dismiss and manual close.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Toast Messages', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
// Happy path: success toast on action
test('shows success toast after save action', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{validValue}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /saved|success/i });
await expect(toast).toBeVisible();
});
// Happy path: error toast on failure
test('shows error toast when action fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/{{endpoint}}*', route =>
route.fulfill({ status: 500, body: '{}' })
);
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /error|failed/i });
await expect(toast).toBeVisible();
});
// Happy path: warning toast shown
test('shows warning toast', async ({ page }) => {
await page.goto('{{baseUrl}}/{{warningTriggerPath}}');
await page.getByRole('button', { name: /{{warningAction}}/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /warning|attention/i });
await expect(toast).toBeVisible();
});
// Happy path: toast auto-dismisses
test('toast auto-dismisses after timeout', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.clock.install();
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{validValue}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /saved/i });
await expect(toast).toBeVisible();
await page.clock.fastForward({{toastDurationMs}});
await expect(toast).toBeHidden();
});
// Happy path: toast manually dismissed
test('dismisses toast via close button', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /saved/i });
await expect(toast).toBeVisible();
await toast.getByRole('button', { name: /close|dismiss|×/i }).click();
await expect(toast).toBeHidden();
});
// Happy path: multiple toasts stack
test('stacks multiple toasts', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
// Trigger two saves quickly
await page.getByRole('button', { name: /save/i }).click();
await page.getByRole('button', { name: /save/i }).click();
const toasts = page.getByRole('alert');
// Poll so the assertion waits for both toasts to render
await expect.poll(() => toasts.count()).toBeGreaterThanOrEqual(2);
});
// Edge case: toast announces to screen readers
test('toast has live region role for accessibility', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').first();
await expect(toast).toBeVisible();
// role="alert" implies aria-live="assertive"
const role = await toast.getAttribute('role');
expect(role).toMatch(/alert|status/);
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Toast Messages', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows success toast after save', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('textbox', { name: /{{fieldLabel}}/i }).fill('{{validValue}}');
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert').filter({ hasText: /saved|success/i })).toBeVisible();
});
test('toast auto-dismisses after timeout', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.clock.install();
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /saved/i });
await expect(toast).toBeVisible();
await page.clock.fastForward({{toastDurationMs}});
await expect(toast).toBeHidden();
});
test('dismisses toast via close button', async ({ page }) => {
await page.goto('{{baseUrl}}/{{formPath}}');
await page.getByRole('button', { name: /save/i }).click();
const toast = page.getByRole('alert').filter({ hasText: /saved/i });
await toast.getByRole('button', { name: /close|×/i }).click();
await expect(toast).toBeHidden();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Success toast | Save → green/success alert visible |
| Error toast | 500 → red/error alert visible |
| Warning toast | Trigger action → warning alert |
| Auto-dismiss | Toast hidden after N ms (clock-controlled) |
| Manual dismiss | Close button hides toast |
| Stacked toasts | Multiple alerts visible simultaneously |
| Accessible | role=alert or role=status present |
FILE:templates/onboarding/email-verification.md
# Email Verification Template
Tests email verification link, resend flow, and expired token handling.
## Prerequisites
- Registered but unverified account: `{{unverifiedEmail}}`
- Valid token: `{{verificationToken}}`
- Expired token: `{{expiredVerificationToken}}`
- Already-used token: `{{usedVerificationToken}}`
- App running at `{{baseUrl}}`
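
The `{{...}}` tokens listed above must be replaced with real values before a template can run. How that replacement happens is not specified here, so the following is only a minimal sketch that assumes plain text substitution; `fillTemplate` and the sample values are hypothetical.

```typescript
// Hypothetical helper: replaces {{name}} tokens with values from a lookup table.
type TemplateValues = Record<string, string | number>;

function fillTemplate(source: string, values: TemplateValues): string {
  return source.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string): string => {
    // Fail loudly instead of leaving an unresolved token in a generated test.
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing value for placeholder {{${key}}}`);
    }
    return String(values[key]);
  });
}

// Example with assumed values for two of the tokens used in this template.
const renderedLine = fillTemplate(
  "await page.goto('{{baseUrl}}/verify-email?token={{verificationToken}}');",
  { baseUrl: 'http://localhost:3000', verificationToken: 'abc123' },
);
```

Throwing on a missing key is a deliberate choice in this sketch: an unresolved `{{token}}` inside a URL or regex would otherwise surface later as a confusing test failure.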
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Email Verification', () => {
// Happy path: valid verification link
test('verifies email with valid token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{verificationToken}}');
await expect(page.getByRole('heading', { name: /email verified|verified/i })).toBeVisible();
await expect(page.getByRole('link', { name: /continue|go to dashboard/i })).toBeVisible();
});
// Happy path: continues to app after verification
test('redirects to dashboard after clicking continue', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{verificationToken}}');
await page.getByRole('link', { name: /continue|go to dashboard/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
// Happy path: resend verification email
test('resends verification email', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email/resend');
await page.getByRole('textbox', { name: /email/i }).fill('{{unverifiedEmail}}');
await page.getByRole('button', { name: /resend/i }).click();
await expect(page.getByRole('alert')).toContainText(/sent|check your email/i);
});
// Happy path: verification prompt on login for unverified user
test('shows verification prompt when unverified user logs in', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{unverifiedEmail}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByText(/verify.*email|check.*inbox/i)).toBeVisible();
await expect(page.getByRole('button', { name: /resend.*verification/i })).toBeVisible();
});
// Error case: expired token
test('shows error for expired verification token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{expiredVerificationToken}}');
await expect(page.getByRole('heading', { name: /link.*expired|verification.*failed/i })).toBeVisible();
await expect(page.getByRole('link', { name: /resend|request new/i })).toBeVisible();
});
// Error case: invalid token
test('shows error for invalid verification token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token=invalid-token-xyz');
await expect(page.getByRole('heading', { name: /invalid|failed/i })).toBeVisible();
});
// Edge case: already verified user hitting link
test('shows already verified message for used token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{usedVerificationToken}}');
await expect(page.getByText(/already verified|email.*confirmed/i)).toBeVisible();
await expect(page.getByRole('link', { name: /sign in/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Email Verification', () => {
test('verifies email with valid token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{verificationToken}}');
await expect(page.getByRole('heading', { name: /email verified/i })).toBeVisible();
});
test('shows error for expired token', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email?token={{expiredVerificationToken}}');
await expect(page.getByRole('heading', { name: /link.*expired/i })).toBeVisible();
await expect(page.getByRole('link', { name: /resend|request new/i })).toBeVisible();
});
test('resends verification email', async ({ page }) => {
await page.goto('{{baseUrl}}/verify-email/resend');
await page.getByRole('textbox', { name: /email/i }).fill('{{unverifiedEmail}}');
await page.getByRole('button', { name: /resend/i }).click();
await expect(page.getByRole('alert')).toContainText(/sent/i);
});
test('shows verification prompt on login for unverified user', async ({ page }) => {
await page.goto('{{baseUrl}}/login');
await page.getByRole('textbox', { name: /email/i }).fill('{{unverifiedEmail}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{password}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page.getByText(/verify.*email/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid token | Email verified heading + continue link |
| Continue CTA | Navigates to dashboard |
| Resend | Sends new email, success alert |
| Login prompt | Unverified login shows resend button |
| Expired token | Error heading + resend link |
| Invalid token | Generic error heading |
| Already verified | "Already verified" with login link |
FILE:templates/onboarding/first-time-setup.md
# First-Time Setup Template
Tests initial configuration wizard and profile completion after registration.
## Prerequisites
- Newly registered session via `{{newUserStorageStatePath}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('First-Time Setup', () => {
test.use({ storageState: '{{newUserStorageStatePath}}' });
// Happy path: setup wizard shown on first login
test('shows setup wizard on first login', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page).toHaveURL(/\/setup|\/onboarding/);
await expect(page.getByRole('heading', { name: /set up.*account|get started/i })).toBeVisible();
});
// Happy path: complete organisation setup step
test('completes organisation details step', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
await page.getByRole('textbox', { name: /organisation.*name|company/i }).fill('{{orgName}}');
await page.getByRole('combobox', { name: /industry/i }).selectOption('{{industry}}');
await page.getByRole('spinbutton', { name: /team size/i }).fill('{{teamSize}}');
await page.getByRole('button', { name: /next|continue/i }).click();
await expect(page.getByText(/step 2|preferences/i)).toBeVisible();
});
// Happy path: complete preferences step
test('completes preferences step', async ({ page }) => {
await page.goto('{{baseUrl}}/setup/preferences');
await page.getByRole('combobox', { name: /timezone/i }).selectOption('{{timezone}}');
await page.getByRole('combobox', { name: /language/i }).selectOption('{{language}}');
await page.getByRole('button', { name: /next|continue/i }).click();
await expect(page.getByText(/step 3|invite|done/i)).toBeVisible();
});
// Happy path: full wizard completion redirects to dashboard
test('completes all setup steps and lands on dashboard', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
// Step 1
await page.getByRole('textbox', { name: /organisation.*name/i }).fill('{{orgName}}');
await page.getByRole('button', { name: /next/i }).click();
// Step 2
await page.getByRole('combobox', { name: /timezone/i }).selectOption('{{timezone}}');
await page.getByRole('button', { name: /next/i }).click();
// Final step
await page.getByRole('button', { name: /finish|go to dashboard/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
// Happy path: setup completion percentage shown
test('progress indicator updates on each step', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
await expect(page.getByRole('progressbar')).toHaveAttribute('aria-valuenow', '0');
await page.getByRole('textbox', { name: /organisation.*name/i }).fill('{{orgName}}');
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByRole('progressbar')).not.toHaveAttribute('aria-valuenow', '0');
});
// Error case: required setup field missing
test('shows validation when required field missing', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByText(/organisation.*required|required/i)).toBeVisible();
});
// Edge case: setup not required on subsequent login
test('skips setup on second login', async ({ page }) => {
// Complete setup
await page.goto('{{baseUrl}}/setup');
await page.getByRole('textbox', { name: /organisation.*name/i }).fill('{{orgName}}');
await page.getByRole('button', { name: /next/i }).click();
await page.getByRole('button', { name: /finish/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
// Reload — setup not re-triggered
await page.reload();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('First-Time Setup', () => {
test.use({ storageState: '{{newUserStorageStatePath}}' });
test('redirects to setup wizard on first login', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page).toHaveURL(/\/setup|\/onboarding/);
});
test('shows validation for missing required field', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByText(/required/i)).toBeVisible();
});
test('completes setup and lands on dashboard', async ({ page }) => {
await page.goto('{{baseUrl}}/setup');
await page.getByRole('textbox', { name: /organisation.*name/i }).fill('{{orgName}}');
await page.getByRole('button', { name: /next/i }).click();
await page.getByRole('button', { name: /finish|go to dashboard/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Setup on first login | Redirected to /setup wizard |
| Org details step | Company name + industry filled |
| Preferences step | Timezone + language selected |
| Full completion | All steps → dashboard |
| Progress bar | Progressbar value updates per step |
| Required field | Empty step blocked with error |
| Skip on re-login | Setup not triggered again |
FILE:templates/onboarding/registration.md
# Registration Template
Tests signup form submission, validation, and post-registration flow.
## Prerequisites
- Unique test email for each run: `{{newUserEmail}}`
- App running at `{{baseUrl}}`
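
A value computed once at module load is shared by every test in the file, so two tests that both complete registration would submit the same address. One possible way to get a fresh address per call is sketched below; `uniqueTestEmail` is a hypothetical helper, and the `test+...@example.com` shape is only an assumption about what the app accepts.

```typescript
// Hypothetical helper: returns a different address on every call.
// The counter guards against two calls landing in the same millisecond.
let emailCounter = 0;

function uniqueTestEmail(prefix: string = 'test', domain: string = 'example.com'): string {
  emailCounter += 1;
  return `${prefix}+${Date.now()}-${emailCounter}@${domain}`;
}

// Example: call inside each test rather than once at the top of the file.
const firstEmail = uniqueTestEmail();
const secondEmail = uniqueTestEmail();
```

The counter is per module instance, so uniqueness across parallel workers still relies on the timestamp; adding a worker index to the prefix would be one way to tighten that.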
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
const uniqueEmail = `test+${Date.now()}@example.com`;
test.describe('Registration', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/register');
});
// Happy path: successful registration
test('registers new user with valid data', async ({ page }) => {
await page.getByRole('textbox', { name: /first name/i }).fill('{{firstName}}');
await page.getByRole('textbox', { name: /last name/i }).fill('{{lastName}}');
await page.getByRole('textbox', { name: /email/i }).fill(uniqueEmail);
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('checkbox', { name: /terms/i }).check();
await page.getByRole('button', { name: /sign up|register|create account/i }).click();
await expect(page).toHaveURL(/\/verify-email|\/dashboard|\/onboarding/);
});
// Happy path: success message or redirect
test('shows confirmation after registration', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill(uniqueEmail);
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('checkbox', { name: /terms/i }).check();
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page.getByText(/check your email|account created|welcome/i)).toBeVisible();
});
// Error case: email already registered
test('shows error for already registered email', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill('{{existingUserEmail}}');
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('checkbox', { name: /terms/i }).check();
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page.getByRole('alert')).toContainText(/already.*registered|email.*taken/i);
});
// Error case: terms not accepted
test('blocks registration if terms not accepted', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).fill(uniqueEmail);
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page.getByText(/accept.*terms|terms.*required/i)).toBeVisible();
});
// Error case: weak password
test('shows error for weak password', async ({ page }) => {
await page.getByRole('textbox', { name: /^password$/i }).fill('123');
await page.getByRole('textbox', { name: /^password$/i }).blur();
await expect(page.getByText(/at least \d+ characters|too weak/i)).toBeVisible();
});
// Error case: passwords mismatch
test('shows error when passwords do not match', async ({ page }) => {
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('different');
await page.getByRole('textbox', { name: /confirm.*password/i }).blur();
await expect(page.getByText(/do not match/i)).toBeVisible();
});
// Edge case: already logged-in user redirected
test('redirects to dashboard when already authenticated', async ({ page, context }) => {
await context.addCookies([{ name: '{{sessionCookieName}}', value: '{{validSession}}', domain: '{{cookieDomain}}', path: '/' }]);
await page.goto('{{baseUrl}}/register');
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Registration', () => {
test('registers with valid data', async ({ page }) => {
const email = `test+${Date.now()}@example.com`;
await page.goto('{{baseUrl}}/register');
await page.getByRole('textbox', { name: /email/i }).fill(email);
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('checkbox', { name: /terms/i }).check();
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page).toHaveURL(/\/verify-email|\/dashboard|\/onboarding/);
});
test('shows error for existing email', async ({ page }) => {
await page.goto('{{baseUrl}}/register');
await page.getByRole('textbox', { name: /email/i }).fill('{{existingUserEmail}}');
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('checkbox', { name: /terms/i }).check();
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page.getByRole('alert')).toContainText(/already.*registered/i);
});
test('requires terms acceptance', async ({ page }) => {
await page.goto('{{baseUrl}}/register');
await page.getByRole('textbox', { name: /email/i }).fill(`t${Date.now()}@example.com`);
await page.getByRole('textbox', { name: /^password$/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /sign up|register/i }).click();
await expect(page.getByText(/accept.*terms/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Valid registration | All fields → redirect or success message |
| Confirmation | Email check or welcome shown |
| Existing email | Error alert |
| Terms not accepted | Validation error |
| Weak password | Strength error on blur |
| Password mismatch | Confirm error |
| Already authed | Redirected to dashboard |
FILE:templates/onboarding/welcome-tour.md
# Welcome Tour Template
Tests step-by-step onboarding tour, skip, and completion behaviour.
## Prerequisites
- Newly registered session (first login) via `{{newUserStorageStatePath}}`
- Tour has `{{tourStepCount}}` steps
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Welcome Tour', () => {
test.use({ storageState: '{{newUserStorageStatePath}}' });
// Happy path: tour shown on first login
test('shows welcome tour on first login', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeVisible();
await expect(page.getByText(/step 1 of {{tourStepCount}}/i)).toBeVisible();
});
// Happy path: advance through all steps
test('advances through all tour steps', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
for (let i = 1; i <= {{tourStepCount}}; i++) {
await expect(page.getByText(new RegExp(`step ${i} of {{tourStepCount}}`, 'i'))).toBeVisible();
if (i < {{tourStepCount}}) {
await page.getByRole('button', { name: /next/i }).click();
} else {
await page.getByRole('button', { name: /finish|done|get started/i }).click();
}
}
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeHidden();
});
// Happy path: back navigation within tour
test('navigates back to previous step', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /next/i }).click();
await expect(page.getByText(/step 2 of {{tourStepCount}}/i)).toBeVisible();
await page.getByRole('button', { name: /back|previous/i }).click();
await expect(page.getByText(/step 1 of {{tourStepCount}}/i)).toBeVisible();
});
// Happy path: skip tour
test('skips tour and dismisses overlay', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /skip.*tour|skip/i }).click();
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeHidden();
await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
});
// Happy path: tour not shown on subsequent logins
test('tour not shown on second login', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
// Complete or skip tour
await page.getByRole('button', { name: /skip.*tour|skip/i }).click();
// Simulate re-login by reloading
await page.reload();
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeHidden();
});
// Happy path: tooltip highlights correct element
test('tour tooltip highlights the correct UI element', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
const tooltip = page.getByRole('tooltip').or(page.getByRole('dialog', { name: /tour/i }));
await expect(tooltip).toBeVisible();
const targetEl = page.getByRole('{{tourStep1TargetRole}}', { name: /{{tourStep1TargetName}}/i });
await expect(targetEl).toBeVisible();
});
// Edge case: close button (×) dismisses tour
test('× button dismisses tour', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('dialog', { name: /welcome|tour/i })
.getByRole('button', { name: /close|×/i }).click();
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeHidden();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Welcome Tour', () => {
test.use({ storageState: '{{newUserStorageStatePath}}' });
test('shows welcome tour on first login', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await expect(page.getByRole('dialog', { name: /welcome|tour/i })).toBeVisible();
});
test('skips tour on button click', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
await page.getByRole('button', { name: /skip/i }).click();
await expect(page.getByRole('dialog', { name: /tour/i })).toBeHidden();
});
test('advances through all steps to completion', async ({ page }) => {
await page.goto('{{baseUrl}}/dashboard');
for (let i = 1; i < {{tourStepCount}}; i++) {
await page.getByRole('button', { name: /next/i }).click();
}
await page.getByRole('button', { name: /finish|done|get started/i }).click();
await expect(page.getByRole('dialog', { name: /tour/i })).toBeHidden();
});
});
```
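The JavaScript variant advances through the tour without checking which step is on screen. A small sketch of a pattern builder that could support such a check, assuming the tour renders the "step N of M" wording used in the TypeScript tests; `stepLabel` is a hypothetical name:

```javascript
// Hypothetical helper: returns a case-insensitive pattern for the tour's step indicator.
function stepLabel(step, total) {
  if (!Number.isInteger(step) || step < 1 || step > total) {
    throw new RangeError(`step must be between 1 and ${total}, got ${step}`);
  }
  // The trailing \b keeps "step 1 of 3" from also matching "step 1 of 30".
  return new RegExp(`step ${step} of ${total}\\b`, 'i');
}
```

Inside the loop, `await expect(page.getByText(stepLabel(i, total))).toBeVisible();` would then confirm each step before clicking Next.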
## Variants
| Variant | Description |
|---------|-------------|
| Tour on first login | Dialog shown with step 1 of N |
| Full completion | All steps advanced → tour dismissed |
| Back navigation | Previous step accessible |
| Skip tour | Dismissed immediately |
| Not shown again | Tour absent on subsequent visits |
| Tooltip target | Tour highlights correct element |
| Close button | × closes tour |
FILE:templates/search/basic-search.md
# Basic Search Template
Tests search input, query submission, and results display.
## Prerequisites
- At least one indexed item matching `{{searchQuery}}`
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Basic Search', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}');
});
// Happy path: search returns results
test('displays results for valid search query', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.getByRole('button', { name: /search/i }).click();
await expect(page).toHaveURL(/[?&]q={{searchQuery}}/);
await expect(page.getByRole('list', { name: /results/i })).toBeVisible();
const results = page.getByRole('listitem').filter({ hasText: /{{searchQuery}}/i });
await expect(results.first()).toBeVisible();
});
// Happy path: search via Enter key
test('submits search on Enter key', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.keyboard.press('Enter');
await expect(page).toHaveURL(/[?&]q=/);
await expect(page.getByRole('list', { name: /results/i })).toBeVisible();
});
// Happy path: result count shown
test('shows result count in heading', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.getByRole('button', { name: /search/i }).click();
await expect(page.getByText(/\d+\s+results? for/i)).toBeVisible();
});
// Happy path: clicking result navigates to detail
test('clicking result navigates to detail page', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.getByRole('button', { name: /search/i }).click();
await page.getByRole('listitem').first().getByRole('link').click();
await expect(page).toHaveURL(/\/{{entityName}}s\/\d+/);
});
// Happy path: query pre-filled from URL
test('pre-fills search box from URL query param', async ({ page }) => {
await page.goto(`{{baseUrl}}/search?q={{searchQuery}}`);
await expect(page.getByRole('searchbox', { name: /search/i })).toHaveValue('{{searchQuery}}');
});
// Error case: no results
test('shows no-results message for unmatched query', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('xyzzy-no-match-12345');
await page.getByRole('button', { name: /search/i }).click();
await expect(page.getByText(/no results|nothing found/i)).toBeVisible();
});
// Edge case: special characters handled safely
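test('handles special characters in query', async ({ page }) => {
// A script-triggered alert() surfaces as a Playwright 'dialog' event, not as an ARIA alert
let dialogOpened = false;
page.on('dialog', async dialog => { dialogOpened = true; await dialog.dismiss(); });
await page.getByRole('searchbox', { name: /search/i }).fill('<script>alert(1)</script>');
await page.getByRole('button', { name: /search/i }).click();
await expect(page.getByRole('alert')).toBeHidden();
await expect(page.getByText(/no results/i)).toBeVisible();
expect(dialogOpened).toBe(false);
});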
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Basic Search', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}');
});
test('displays results for valid query', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.getByRole('button', { name: /search/i }).click();
await expect(page.getByRole('list', { name: /results/i })).toBeVisible();
});
test('shows no-results for unmatched query', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('xyzzy-no-match');
await page.getByRole('button', { name: /search/i }).click();
await expect(page.getByText(/no results|nothing found/i)).toBeVisible();
});
test('submits on Enter key', async ({ page }) => {
await page.getByRole('searchbox', { name: /search/i }).fill('{{searchQuery}}');
await page.keyboard.press('Enter');
await expect(page).toHaveURL(/[?&]q=/);
});
});
```
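The URL assertions above match on `q=` only, or interpolate the raw query into a regular expression. A sketch of a stricter pattern builder follows, assuming the app serialises the query the way `URLSearchParams` does (spaces become `+`); if it uses `%20` instead, the encoding step would need to change. `escapeRegExp` and `queryParamPattern` are hypothetical helper names:

```javascript
// Escapes characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a pattern matching `?name=value` or `&name=value`
// as a complete parameter, so "q=red+shoes" does not match "q=red+shoesx".
function queryParamPattern(name, value) {
  const encoded = new URLSearchParams([[name, value]]).toString();
  return new RegExp(`[?&]${escapeRegExp(encoded)}(&|#|$)`);
}
```

The result can be passed straight to `expect(page).toHaveURL(...)`, which accepts a regular expression.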
## Variants
| Variant | Description |
|---------|-------------|
| Valid query | Results list visible, count shown |
| Enter key | Search submitted without clicking button |
| Result count | Heading shows N results for query |
| Result click | Navigates to entity detail |
| URL pre-fill | Query param populates search box |
| No results | Empty state message |
| Special chars | XSS input handled, no script execution |
FILE:templates/search/empty-state.md
# Empty State Template
Tests no-results messaging and clear-filters behaviour.
## Prerequisites
- App running at `{{baseUrl}}`
- Query that returns no results: `{{emptySearchQuery}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Empty State', () => {
// Happy path: no results message
test('shows no-results message for unmatched query', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{emptySearchQuery}}');
await expect(page.getByRole('heading', { name: /no results|nothing found/i })).toBeVisible();
await expect(page.getByText(/try.*different|adjust.*search/i)).toBeVisible();
});
// Happy path: clear filters CTA shown in empty state
test('shows "clear filters" button when filters applied with no results', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&category={{nonExistentCategory}}');
await expect(page.getByText(/no results/i)).toBeVisible();
await expect(page.getByRole('button', { name: /clear.*filter/i })).toBeVisible();
});
// Happy path: clearing filters restores results
test('clearing filters from empty state restores results', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&category={{nonExistentCategory}}');
await page.getByRole('button', { name: /clear.*filter/i }).click();
await expect(page.getByRole('listitem').first()).toBeVisible();
await expect(page.getByText(/no results/i)).toBeHidden();
});
// Happy path: search suggestions shown in empty state
test('shows related search suggestions', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{emptySearchQuery}}');
const suggestions = page.getByRole('list', { name: /suggestions|similar/i });
if (await suggestions.isVisible()) {
await expect(suggestions.getByRole('listitem').first()).toBeVisible();
}
});
// Happy path: empty list view (not search)
test('shows empty state on entity list with no data', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s?filter={{emptyFilter}}');
await expect(page.getByText(/no {{entityName}}s|empty/i)).toBeVisible();
await expect(page.getByRole('button', { name: /create|add new/i })).toBeVisible();
});
// Error case: network error shows error state not empty state
test('distinguishes network error from no-results', async ({ page }) => {
await page.route('{{baseUrl}}/api/search*', route => route.abort('failed'));
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
await expect(page.getByText(/error|something went wrong/i)).toBeVisible();
await expect(page.getByText(/no results/i)).toBeHidden();
});
// Edge case: empty state after removing last item
test('shows empty state after deleting last item in list', async ({ page }) => {
await page.goto('{{baseUrl}}/{{entityName}}s');
const row = page.getByRole('row').filter({ hasNot: page.getByRole('columnheader') }).last();
await row.getByRole('button', { name: /delete/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /confirm/i }).click();
await expect(page.getByText(/no {{entityName}}s|empty/i)).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Empty State', () => {
test('shows no-results message for unmatched query', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{emptySearchQuery}}');
await expect(page.getByRole('heading', { name: /no results|nothing found/i })).toBeVisible();
});
test('shows clear-filters button in no-results state', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&category={{nonExistentCategory}}');
await expect(page.getByRole('button', { name: /clear.*filter/i })).toBeVisible();
});
test('clearing filters restores results', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&category={{nonExistentCategory}}');
await page.getByRole('button', { name: /clear.*filter/i }).click();
await expect(page.getByRole('listitem').first()).toBeVisible();
});
});
```
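Several tests above navigate to hand-written search URLs that combine a query with a filter. A sketch of a URL builder that keeps the encoding in one place, assuming `baseUrl` is a bare origin with no path prefix and that the search page lives at `/search`; `searchUrl` is a hypothetical name:

```javascript
// Hypothetical helper: builds a search URL from a base origin and a params object.
function searchUrl(baseUrl, params) {
  const url = new URL('/search', baseUrl);
  for (const [key, value] of Object.entries(params)) {
    // Skip unset filters so they do not appear as "category=undefined".
    if (value !== undefined && value !== null && value !== '') {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}
```

For example, `page.goto(searchUrl(baseUrl, { q: query, category }))` would cover both the filtered and unfiltered cases.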
## Variants
| Variant | Description |
|---------|-------------|
| No-results query | Heading + suggestion text shown |
| Filter no-results | Clear-filters CTA displayed |
| Clear filters | Removes filter, results return |
| Search suggestions | Related terms listed when available |
| Empty list view | Entity list empty state with create CTA |
| Network error | Error state distinct from no-results |
| Last item deleted | Empty state shown after deletion |
FILE:templates/search/filters.md
# Search Filters Template
Tests category filter, price range, and checkbox filters.
## Prerequisites
- Search results available for `{{searchQuery}}`
- Category `{{filterCategory}}` with items
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Search Filters', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
// Happy path: category filter
test('filters results by category', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await expect(page).toHaveURL(/category={{filterCategory}}/);
const results = page.getByRole('listitem');
await expect(results.first()).toContainText('{{filterCategory}}');
const count = await results.count();
expect(count).toBeGreaterThan(0);
});
// Happy path: price range filter
test('filters results by price range', async ({ page }) => {
const minInput = page.getByRole('spinbutton', { name: /min.*price/i });
const maxInput = page.getByRole('spinbutton', { name: /max.*price/i });
await minInput.fill('{{minPrice}}');
await maxInput.fill('{{maxPrice}}');
await page.getByRole('button', { name: /apply|filter/i }).click();
await expect(page).toHaveURL(/min_price={{minPrice}}/);
// Verify no results exceed max price
const prices = page.getByTestId('item-price');
const priceCount = await prices.count();
for (let i = 0; i < priceCount; i++) {
const text = await prices.nth(i).textContent() ?? '';
const value = parseFloat(text.replace(/[^0-9.]/g, ''));
expect(value).toBeLessThanOrEqual({{maxPrice}});
}
});
// Happy path: multiple checkboxes combine filters
test('applies multiple checkbox filters simultaneously', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterOption1}}' }).check();
await page.getByRole('checkbox', { name: '{{filterOption2}}' }).check();
await expect(page).toHaveURL(/{{filterParam1}}.*{{filterParam2}}|{{filterParam2}}.*{{filterParam1}}/);
});
// Happy path: active filters shown as chips
test('shows active filter chips', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await expect(page.getByRole('button', { name: /remove.*{{filterCategory}}/i })).toBeVisible();
});
// Happy path: clear individual filter chip
test('removes filter by clicking chip close', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await page.getByRole('button', { name: /remove.*{{filterCategory}}/i }).click();
await expect(page.getByRole('checkbox', { name: '{{filterCategory}}' })).not.toBeChecked();
});
// Happy path: clear all filters
test('clears all filters', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await page.getByRole('button', { name: /clear all filters/i }).click();
await expect(page.getByRole('checkbox', { name: '{{filterCategory}}' })).not.toBeChecked();
await expect(page).not.toHaveURL(/category=/);
});
// Error case: no results for filter combination
test('shows empty state when filters yield no results', async ({ page }) => {
await page.getByRole('spinbutton', { name: /min.*price/i }).fill('999999');
await page.getByRole('button', { name: /apply|filter/i }).click();
await expect(page.getByText(/no results/i)).toBeVisible();
await expect(page.getByRole('button', { name: /clear.*filter/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Search Filters', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
test('filters results by category', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await expect(page).toHaveURL(/category={{filterCategory}}/);
await expect(page.getByRole('listitem').first()).toBeVisible();
});
test('shows active filter chips', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await expect(page.getByRole('button', { name: /remove.*{{filterCategory}}/i })).toBeVisible();
});
test('clears all filters', async ({ page }) => {
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await page.getByRole('button', { name: /clear all filters/i }).click();
await expect(page.getByRole('checkbox', { name: '{{filterCategory}}' })).not.toBeChecked();
});
});
```
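The JavaScript variant omits the price-range check from the TypeScript tests. A sketch of the parsing and range logic as plain functions, assuming prices render like `$1,299.00` with `.` as the decimal separator; `parsePrice` and `pricesOutsideRange` are hypothetical names:

```javascript
// Hypothetical helper: strips currency symbols and thousands separators.
function parsePrice(text) {
  const value = parseFloat(String(text ?? '').replace(/[^0-9.]/g, ''));
  if (Number.isNaN(value)) {
    throw new Error(`Could not parse a price from "${text}"`);
  }
  return value;
}

// Returns the prices that fall outside [min, max]; an empty array means the filter held.
function pricesOutsideRange(texts, min, max) {
  return texts.map(parsePrice).filter(price => price < min || price > max);
}
```

In a test, the input could come from `await page.getByTestId('item-price').allTextContents()`, followed by an assertion that the returned array is empty.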
## Variants
| Variant | Description |
|---------|-------------|
| Category filter | Checkbox → results scoped to category |
| Price range | Min/max filter applied, prices verified |
| Multi-filter | Multiple checkboxes combine in URL |
| Filter chips | Active filters shown as removable chips |
| Remove chip | Chip close → filter unchecked |
| Clear all | All filters removed at once |
| No-results combo | Filter combination yields empty state |
FILE:templates/search/pagination.md
# Pagination Template
Tests page navigation, items-per-page selector, and URL state.
## Prerequisites
- Search results for `{{searchQuery}}` spanning multiple pages
- At least `{{totalItemCount}}` items total
- Default page size of `{{defaultPageSize}}` items
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Pagination', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
// Happy path: navigate to next page
test('navigates to next page and updates URL', async ({ page }) => {
const firstItem = await page.getByRole('listitem').first().textContent();
await page.getByRole('button', { name: /next page/i }).click();
await expect(page).toHaveURL(/page=2/);
await expect(page.getByRole('listitem').first()).not.toHaveText(firstItem!);
});
// Happy path: navigate to previous page
test('navigates to previous page', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&page=2');
const secondPageFirst = await page.getByRole('listitem').first().textContent();
await page.getByRole('button', { name: /previous page/i }).click();
await expect(page).toHaveURL(/page=1/);
await expect(page.getByRole('listitem').first()).not.toHaveText(secondPageFirst!);
});
// Happy path: jump to specific page
test('jumps to specific page number', async ({ page }) => {
await page.getByRole('button', { name: '3' }).click();
await expect(page).toHaveURL(/page=3/);
await expect(page.getByRole('button', { name: '3' })).toHaveAttribute('aria-current', 'page');
});
// Happy path: items per page selector
test('changes items per page', async ({ page }) => {
await page.getByRole('combobox', { name: /per page/i }).selectOption('50');
await expect(page).toHaveURL(/per_page=50/);
const items = page.getByRole('listitem');
await expect(items).toHaveCount(Math.min(50, {{totalItemCount}}));
});
// Happy path: page info text
test('shows correct page info text', async ({ page }) => {
await expect(page.getByText(/showing \d+.+of\s+{{totalItemCount}}/i)).toBeVisible();
});
// Error case: previous button is disabled on the first page
test('previous page button disabled on first page', async ({ page }) => {
await expect(page.getByRole('button', { name: /previous page/i })).toBeDisabled();
});
// Error case: next button is disabled on the last page
test('next page button disabled on last page', async ({ page }) => {
const lastPage = Math.ceil({{totalItemCount}} / {{defaultPageSize}});
await page.goto(`{{baseUrl}}/search?q={{searchQuery}}&page=${lastPage}`);
await expect(page.getByRole('button', { name: /next page/i })).toBeDisabled();
});
// Edge case: out-of-range page redirects to last page
test('out-of-range page parameter redirects gracefully', async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}&page=99999');
await expect(page.getByRole('listitem').first()).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Pagination', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
test('navigates to next page', async ({ page }) => {
await page.getByRole('button', { name: /next page/i }).click();
await expect(page).toHaveURL(/page=2/);
});
test('previous page disabled on first page', async ({ page }) => {
await expect(page.getByRole('button', { name: /previous page/i })).toBeDisabled();
});
test('next page disabled on last page', async ({ page }) => {
const last = Math.ceil({{totalItemCount}} / {{defaultPageSize}});
await page.goto(`{{baseUrl}}/search?q={{searchQuery}}&page=${last}`);
await expect(page.getByRole('button', { name: /next page/i })).toBeDisabled();
});
test('changes items per page', async ({ page }) => {
await page.getByRole('combobox', { name: /per page/i }).selectOption('50');
await expect(page).toHaveURL(/per_page=50/);
});
});
```
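The last-page test depends on simple pagination arithmetic. A sketch of that arithmetic as standalone functions, assuming 1-based page numbers and a fixed page size; `lastPage` and `itemsOnPage` are hypothetical names:

```javascript
// Hypothetical helper: number of the final page; an empty result set still has page 1.
function lastPage(totalItems, pageSize) {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError(`pageSize must be a positive integer, got ${pageSize}`);
  }
  return Math.max(1, Math.ceil(totalItems / pageSize));
}

// Hypothetical helper: how many items a given page should display.
function itemsOnPage(totalItems, pageSize, page) {
  if (page < 1 || page > lastPage(totalItems, pageSize)) {
    return 0; // out-of-range pages hold nothing
  }
  return Math.min(pageSize, totalItems - (page - 1) * pageSize);
}

console.log(lastPage(95, 20));       // 5
console.log(itemsOnPage(95, 20, 5)); // 15
```

The second function gives an expected value for `toHaveCount` on any page, not only the first.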
## Variants
| Variant | Description |
|---------|-------------|
| Next page | Items change, URL updates page=2 |
| Previous page | Back to page 1 |
| Jump to page | Clicking page number sets aria-current |
| Items per page | Selector changes count of visible items |
| Page info | "Showing X-Y of N" text |
| First page prev | Previous button disabled |
| Last page next | Next button disabled |
| Out-of-range | Graceful fallback |
FILE:templates/search/sorting.md
# Search Sorting Template
Tests sorting results by name, date, and price.
## Prerequisites
- Search results for `{{searchQuery}}` with multiple items
- App running at `{{baseUrl}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Search Sorting', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
// Happy path: sort by name A-Z
test('sorts results alphabetically A-Z', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('name_asc');
await expect(page).toHaveURL(/sort=name_asc/);
const names = page.getByTestId('result-name');
const first = await names.first().textContent();
const second = await names.nth(1).textContent();
expect(first!.localeCompare(second!)).toBeLessThanOrEqual(0);
});
// Happy path: sort by name Z-A
test('sorts results alphabetically Z-A', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('name_desc');
const names = page.getByTestId('result-name');
const first = await names.first().textContent();
const second = await names.nth(1).textContent();
expect(first!.localeCompare(second!)).toBeGreaterThanOrEqual(0);
});
// Happy path: sort by date newest
test('sorts results by newest date first', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('date_desc');
await expect(page).toHaveURL(/sort=date_desc/);
const dates = page.getByTestId('result-date');
const firstDate = new Date(await dates.first().getAttribute('datetime') ?? '');
const secondDate = new Date(await dates.nth(1).getAttribute('datetime') ?? '');
expect(firstDate.getTime()).toBeGreaterThanOrEqual(secondDate.getTime());
});
// Happy path: sort by price low-high
test('sorts by price low to high', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('price_asc');
const prices = page.getByTestId('result-price');
const firstText = await prices.first().textContent() ?? '';
const secondText = await prices.nth(1).textContent() ?? '';
const first = parseFloat(firstText.replace(/[^0-9.]/g, ''));
const second = parseFloat(secondText.replace(/[^0-9.]/g, ''));
expect(first).toBeLessThanOrEqual(second);
});
// Happy path: sort by price high-low
test('sorts by price high to low', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('price_desc');
const prices = page.getByTestId('result-price');
const firstText = await prices.first().textContent() ?? '';
const secondText = await prices.nth(1).textContent() ?? '';
const first = parseFloat(firstText.replace(/[^0-9.]/g, ''));
const second = parseFloat(secondText.replace(/[^0-9.]/g, ''));
expect(first).toBeGreaterThanOrEqual(second);
});
// Happy path: sort persists with filters
test('sort selection persists when filter applied', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('price_asc');
await page.getByRole('checkbox', { name: '{{filterCategory}}' }).check();
await expect(page).toHaveURL(/sort=price_asc/);
await expect(page.getByRole('combobox', { name: /sort by/i })).toHaveValue('price_asc');
});
// Edge case: default sort is relevance
test('default sort is relevance', async ({ page }) => {
await expect(page.getByRole('combobox', { name: /sort by/i })).toHaveValue('relevance');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Search Sorting', () => {
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/search?q={{searchQuery}}');
});
test('sorts alphabetically A-Z', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('name_asc');
await expect(page).toHaveURL(/sort=name_asc/);
const names = page.getByTestId('result-name');
const first = await names.first().textContent();
const second = await names.nth(1).textContent();
expect(first.localeCompare(second)).toBeLessThanOrEqual(0);
});
test('sorts by price low to high', async ({ page }) => {
await page.getByRole('combobox', { name: /sort by/i }).selectOption('price_asc');
const prices = page.getByTestId('result-price');
const a = parseFloat((await prices.first().textContent()).replace(/[^0-9.]/g, ''));
const b = parseFloat((await prices.nth(1).textContent()).replace(/[^0-9.]/g, ''));
expect(a).toBeLessThanOrEqual(b);
});
test('default sort is relevance', async ({ page }) => {
await expect(page.getByRole('combobox', { name: /sort by/i })).toHaveValue('relevance');
});
});
```
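The sorting tests compare only the first two results, which can pass even when later items are out of order. A sketch of a whole-list check, offered as one possible strengthening; `isSorted`, `byNameAsc`, and `byNumberDesc` are hypothetical names:

```javascript
// Returns true when every adjacent pair is in the order defined by `compare`.
// `compare` follows the Array.prototype.sort convention: > 0 means "a comes after b".
function isSorted(values, compare) {
  for (let i = 1; i < values.length; i++) {
    if (compare(values[i - 1], values[i]) > 0) {
      return false;
    }
  }
  return true;
}

// Example comparators for the sort options used above.
const byNameAsc = (a, b) => a.localeCompare(b);
const byNumberDesc = (a, b) => b - a;
```

With `await page.getByTestId('result-name').allTextContents()` as input, `expect(isSorted(names, byNameAsc)).toBe(true)` would cover the full page of results.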
## Variants
| Variant | Description |
|---------|-------------|
| Name A-Z | First result ≤ second alphabetically |
| Name Z-A | First result ≥ second alphabetically |
| Date newest | Dates in descending order |
| Price low-high | Prices in ascending order |
| Price high-low | Prices in descending order |
| Sort + filter | Sort param persists when filter applied |
| Default sort | Relevance selected by default |
FILE:templates/settings/account-delete.md
# Account Delete Template
Tests account deletion flow with confirmation and data warning.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Disposable test account (deletion is irreversible)
- Settings at `{{baseUrl}}/settings/account`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Account Delete', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/settings/account');
});
// Happy path: delete button opens confirmation
test('clicking delete account shows confirmation dialog', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog', { name: /delete account/i });
await expect(dialog).toBeVisible();
await expect(dialog).toContainText(/irreversible|cannot be undone/i);
await expect(dialog).toContainText(/{{dataWarningText}}/i);
});
// Happy path: cancel preserves account
test('cancel keeps account intact', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /cancel/i }).click();
await expect(page.getByRole('dialog')).toBeHidden();
await expect(page).toHaveURL('{{baseUrl}}/settings/account');
});
// Happy path: type-to-confirm gates deletion
test('confirm button disabled until account email typed', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog');
const confirmBtn = dialog.getByRole('button', { name: /delete.*account|confirm/i });
await expect(confirmBtn).toBeDisabled();
await dialog.getByRole('textbox', { name: /type.*email/i }).fill('{{username}}');
await expect(confirmBtn).toBeEnabled();
});
// Happy path: successful deletion redirects to login
test('deletes account and redirects to login', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog');
await dialog.getByRole('textbox', { name: /type.*email/i }).fill('{{username}}');
await dialog.getByRole('button', { name: /delete.*account|confirm/i }).click();
await expect(page).toHaveURL(/\/login/);
await expect(page.getByText(/account.*deleted|successfully deleted/i)).toBeVisible();
});
// Error case: wrong email in confirmation box
test('shows error when wrong email typed in confirmation', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog');
await dialog.getByRole('textbox', { name: /type.*email/i }).fill('wrong@example.com');
const confirmBtn = dialog.getByRole('button', { name: /delete.*account|confirm/i });
await expect(confirmBtn).toBeDisabled();
await expect(dialog.getByText(/does not match/i)).toBeVisible();
});
// Error case: deletion fails server-side
test('shows error when account deletion fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/account', route =>
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Deletion failed' }) })
);
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog');
await dialog.getByRole('textbox', { name: /type.*email/i }).fill('{{username}}');
await dialog.getByRole('button', { name: /confirm/i }).click();
await expect(page.getByRole('alert')).toContainText(/failed|error/i);
await expect(page).toHaveURL('{{baseUrl}}/settings/account');
});
// Edge case: data export offered before deletion
test('shows data export option in deletion dialog', async ({ page }) => {
await page.getByRole('button', { name: /delete.*account/i }).click();
await expect(page.getByRole('link', { name: /export.*data|download.*data/i })).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Account Delete', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('shows confirmation dialog on delete', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/account');
await page.getByRole('button', { name: /delete.*account/i }).click();
await expect(page.getByRole('dialog', { name: /delete account/i })).toBeVisible();
await expect(page.getByRole('dialog')).toContainText(/irreversible/i);
});
test('confirm button disabled until email typed', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/account');
await page.getByRole('button', { name: /delete.*account/i }).click();
const dialog = page.getByRole('dialog');
await expect(dialog.getByRole('button', { name: /confirm/i })).toBeDisabled();
await dialog.getByRole('textbox', { name: /type.*email/i }).fill('{{username}}');
await expect(dialog.getByRole('button', { name: /confirm/i })).toBeEnabled();
});
test('cancel preserves account', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/account');
await page.getByRole('button', { name: /delete.*account/i }).click();
await page.getByRole('dialog').getByRole('button', { name: /cancel/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/settings/account');
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Dialog opens | Delete button → confirmation with warning |
| Cancel | Dialog closed, account preserved |
| Type-to-confirm | Button enabled only with correct email |
| Successful delete | Account deleted → /login |
| Wrong email | Input mismatch → button stays disabled |
| Server error | Deletion fails → error alert |
| Data export | Export link offered in dialog |
FILE:templates/settings/notification-prefs.md
# Notification Preferences Template
Tests toggling notification channels and saving preferences.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Settings page at `{{baseUrl}}/settings/notifications`
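The `{{...}}` tokens in these templates need real values before a test can run. A minimal sketch of one way to fill them, assuming plain string substitution. The `fillTemplate` helper and the sample URL are hypothetical and not part of this skill:

```typescript
// Hypothetical helper: replaces {{name}} tokens with values from a map.
// Unknown tokens are left untouched so that missing values stay visible.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (token: string, name: string) => {
    const value = values[name];
    return value !== undefined ? value : token;
  });
}

// Example: fill the navigation line used in the tests below.
const line = "await page.goto('{{baseUrl}}/settings/notifications');";
const filled = fillTemplate(line, { baseUrl: 'https://staging.example.com' });
```

Leaving unknown tokens in place is a deliberate choice here: a leftover `{{frequency}}` in a generated test is easier to spot than an empty string.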
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Notification Preferences', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/settings/notifications');
});
// Happy path: enable email notifications
test('enables email notifications', async ({ page }) => {
const emailToggle = page.getByRole('switch', { name: /email notifications/i });
if (!(await emailToggle.isChecked())) {
await emailToggle.click();
}
await expect(emailToggle).toBeChecked();
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/preferences.*saved|updated/i);
});
// Happy path: disable push notifications
test('disables push notifications', async ({ page }) => {
const pushToggle = page.getByRole('switch', { name: /push notifications/i });
if (await pushToggle.isChecked()) {
await pushToggle.click();
}
await expect(pushToggle).not.toBeChecked();
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/saved/i);
});
// Happy path: preferences persist after reload
test('saved preferences persist after page reload', async ({ page }) => {
const emailToggle = page.getByRole('switch', { name: /email notifications/i });
const wasChecked = await emailToggle.isChecked();
await emailToggle.click();
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/saved/i);
await page.reload();
if (wasChecked) {
await expect(emailToggle).not.toBeChecked();
} else {
await expect(emailToggle).toBeChecked();
}
});
// Happy path: notification frequency selector
test('changes notification frequency', async ({ page }) => {
await page.getByRole('combobox', { name: /frequency|digest/i }).selectOption('{{frequency}}');
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/saved/i);
await page.reload();
await expect(page.getByRole('combobox', { name: /frequency|digest/i })).toHaveValue('{{frequency}}');
});
// Error case: save fails — preferences not changed
test('shows error when save fails', async ({ page }) => {
await page.route('{{baseUrl}}/api/settings/notifications*', route =>
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Server error' }) })
);
await page.getByRole('switch', { name: /email notifications/i }).click();
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/error|failed to save/i);
});
// Edge case: unsubscribe all shows confirmation
test('shows confirmation before unsubscribing all', async ({ page }) => {
await page.getByRole('button', { name: /unsubscribe all/i }).click();
await expect(page.getByRole('dialog', { name: /unsubscribe/i })).toBeVisible();
await page.getByRole('button', { name: /cancel/i }).click();
// Still subscribed
await expect(page.getByRole('switch', { name: /email notifications/i })).toBeChecked();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Notification Preferences', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('saves notification preferences', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/notifications');
const toggle = page.getByRole('switch', { name: /email notifications/i });
await toggle.click();
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/saved/i);
});
test('preferences persist after reload', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/notifications');
const toggle = page.getByRole('switch', { name: /email notifications/i });
const was = await toggle.isChecked();
await toggle.click();
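await page.getByRole('button', { name: /save/i }).click();
// Wait for the save to complete before reloading; otherwise the reload can race the request
await expect(page.getByRole('alert')).toContainText(/saved/i);
await page.reload();
if (was) {
await expect(toggle).not.toBeChecked();
} else {
await expect(toggle).toBeChecked();
}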
});
test('shows error when save fails', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/notifications');
await page.route('{{baseUrl}}/api/settings/notifications*', r =>
r.fulfill({ status: 500, body: '{}' })
);
await page.getByRole('button', { name: /save/i }).click();
await expect(page.getByRole('alert')).toContainText(/error|failed/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Enable email | Toggle on → saved → success |
| Disable push | Toggle off → saved |
| Persists reload | Saved state survives page reload |
| Frequency selector | Dropdown value saved and restored |
| Save error | Server error → error alert |
| Unsubscribe all | Confirmation dialog before all disabled |
FILE:templates/settings/password-change.md
# Password Change Template
Tests current password verification, new password validation, and success flow.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Current password: `{{currentPassword}}`
- New password: `{{newPassword}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Password Change', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/settings/security');
});
// Happy path: successful password change
test('changes password with valid inputs', async ({ page }) => {
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /change.*password|update password/i }).click();
await expect(page.getByRole('alert')).toContainText(/password.*changed|updated successfully/i);
});
// Happy path: can log in with new password
test('new password accepted on next login', async ({ page, context }) => {
// Change password
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByRole('alert')).toContainText(/changed/i);
// Log out and back in
await page.getByRole('button', { name: /user menu/i }).click();
await page.getByRole('menuitem', { name: /sign out/i }).click();
await page.getByRole('textbox', { name: /email/i }).fill('{{username}}');
await page.getByRole('textbox', { name: /password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /sign in/i }).click();
await expect(page).toHaveURL('{{baseUrl}}/dashboard');
});
// Error case: wrong current password
test('shows error when current password is wrong', async ({ page }) => {
await page.getByRole('textbox', { name: /current password/i }).fill('wrong-password');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByRole('alert')).toContainText(/current password.*incorrect|wrong password/i);
});
// Error case: new passwords do not match
test('shows error when confirmation does not match', async ({ page }) => {
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('mismatch');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByText(/passwords.*do not match/i)).toBeVisible();
});
// Error case: new password too weak
test('shows strength error for weak new password', async ({ page }) => {
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('123');
await page.getByRole('textbox', { name: /^new password$/i }).blur();
await expect(page.getByText(/too weak|at least \d+ characters/i)).toBeVisible();
});
// Error case: new password same as current
test('shows error when new password matches current', async ({ page }) => {
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{currentPassword}}');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByText(/same as.*current|choose.*different/i)).toBeVisible();
});
// Edge case: password strength meter updates on input
test('strength meter reacts to new password input', async ({ page }) => {
await page.getByRole('textbox', { name: /^new password$/i }).fill('weak');
await expect(page.getByRole('meter', { name: /strength/i })).toHaveAttribute('aria-valuenow', '1');
await page.getByRole('textbox', { name: /^new password$/i }).fill('Str0ng!Pass#2026');
await expect(page.getByRole('meter', { name: /strength/i })).toHaveAttribute('aria-valuenow', '4');
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Password Change', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('changes password with valid inputs', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/security');
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByRole('alert')).toContainText(/changed|updated/i);
});
test('shows error for wrong current password', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/security');
await page.getByRole('textbox', { name: /current password/i }).fill('wrong');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('{{newPassword}}');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByRole('alert')).toContainText(/incorrect|wrong/i);
});
test('shows mismatch error', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/security');
await page.getByRole('textbox', { name: /current password/i }).fill('{{currentPassword}}');
await page.getByRole('textbox', { name: /^new password$/i }).fill('{{newPassword}}');
await page.getByRole('textbox', { name: /confirm.*password/i }).fill('nope');
await page.getByRole('button', { name: /change.*password/i }).click();
await expect(page.getByText(/do not match/i)).toBeVisible();
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Success | All fields valid → success alert |
| Login with new pw | New password accepted at login |
| Wrong current | Incorrect current → error alert |
| Mismatch | Confirm ≠ new → validation error |
| Weak password | Short password → strength error |
| Same as current | Reuse blocked with error |
| Strength meter | Meter aria-valuenow updates on input |
FILE:templates/settings/profile-update.md
# Profile Update Template
Tests updating name, email, and avatar in user profile settings.
## Prerequisites
- Authenticated session via `{{authStorageStatePath}}`
- Current name: `{{currentName}}`, email: `{{currentEmail}}`
- Test avatar image: `{{avatarFilePath}}`
---
## TypeScript
```typescript
import { test, expect } from '@playwright/test';
test.describe('Profile Update', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test.beforeEach(async ({ page }) => {
await page.goto('{{baseUrl}}/settings/profile');
});
// Happy path: update display name
test('updates display name', async ({ page }) => {
const nameField = page.getByRole('textbox', { name: /display name|full name/i });
await nameField.clear();
await nameField.fill('{{newName}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/profile updated|saved/i);
await expect(page.getByRole('textbox', { name: /display name|full name/i })).toHaveValue('{{newName}}');
});
// Happy path: update email
test('updates email address', async ({ page }) => {
const emailField = page.getByRole('textbox', { name: /email/i });
await emailField.clear();
await emailField.fill('{{newEmail}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/verification.*sent|email updated/i);
});
// Happy path: upload avatar
test('uploads new avatar image', async ({ page }) => {
await page.getByRole('button', { name: /change.*avatar|upload.*photo/i }).click();
await page.locator('input[type="file"]').setInputFiles('{{avatarFilePath}}');
await expect(page.getByRole('img', { name: /avatar preview/i })).toBeVisible();
await page.getByRole('button', { name: /save|apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/avatar updated|photo saved/i);
});
// Happy path: avatar crop dialog
test('shows crop dialog after avatar upload', async ({ page }) => {
await page.locator('input[type="file"]').setInputFiles('{{avatarFilePath}}');
await expect(page.getByRole('dialog', { name: /crop/i })).toBeVisible();
await page.getByRole('button', { name: /apply crop/i }).click();
await expect(page.getByRole('dialog', { name: /crop/i })).toBeHidden();
});
// Error case: invalid email format
test('shows error for invalid email format', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).clear();
await page.getByRole('textbox', { name: /email/i }).fill('bad-email');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByText(/valid.*email/i)).toBeVisible();
});
// Error case: email already taken
test('shows error when email is already in use', async ({ page }) => {
await page.getByRole('textbox', { name: /email/i }).clear();
await page.getByRole('textbox', { name: /email/i }).fill('{{takenEmail}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/already in use|taken/i);
});
// Edge case: name reflected in nav after update
test('nav shows updated name after save', async ({ page }) => {
const nameField = page.getByRole('textbox', { name: /display name|full name/i });
await nameField.clear();
await nameField.fill('{{newName}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('navigation').getByText('{{newName}}')).toBeVisible();
});
});
```
---
## JavaScript
```javascript
const { test, expect } = require('@playwright/test');
test.describe('Profile Update', () => {
test.use({ storageState: '{{authStorageStatePath}}' });
test('updates display name', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/profile');
await page.getByRole('textbox', { name: /display name|full name/i }).clear();
await page.getByRole('textbox', { name: /display name|full name/i }).fill('{{newName}}');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByRole('alert')).toContainText(/profile updated|saved/i);
});
test('shows error for invalid email', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/profile');
await page.getByRole('textbox', { name: /email/i }).fill('bad-email');
await page.getByRole('button', { name: /save|update/i }).click();
await expect(page.getByText(/valid.*email/i)).toBeVisible();
});
test('uploads avatar image', async ({ page }) => {
await page.goto('{{baseUrl}}/settings/profile');
await page.locator('input[type="file"]').setInputFiles('{{avatarFilePath}}');
await page.getByRole('button', { name: /save|apply/i }).click();
await expect(page.getByRole('alert')).toContainText(/avatar updated/i);
});
});
```
## Variants
| Variant | Description |
|---------|-------------|
| Name update | Name saved, field reflects new value |
| Email update | Email saved, verification notice shown |
| Avatar upload | Image uploaded, success alert |
| Crop dialog | Cropper shown, apply saves |
| Invalid email | Format error shown |
| Taken email | Duplicate error shown |
| Nav update | Navigation reflects new name |
---
name: "self-improving-agent"
description: "Curate Claude Code's auto-memory into durable project knowledge. Analyze MEMORY.md for patterns, promote proven learnings to CLAUDE.md and .claude/rules/, extract recurring solutions into reusable skills. Use when: (1) reviewing what Claude has learned about your project, (2) graduating a pattern from notes to enforced rules, (3) turning a debugging solution into a skill, (4) checking memory health and capacity."
---
# Self-Improving Agent
> Auto-memory captures. This plugin curates.
Claude Code's auto-memory (v2.1.32+) automatically records project patterns, debugging insights, and your preferences in `MEMORY.md`. This plugin adds the intelligence layer: it analyzes what Claude has learned, promotes proven patterns into project rules, and extracts recurring solutions into reusable skills.
## Quick Reference
| Command | What it does |
|---------|-------------|
| `/si:review` | Analyze MEMORY.md — find promotion candidates, stale entries, consolidation opportunities |
| `/si:promote` | Graduate a pattern from MEMORY.md → CLAUDE.md or `.claude/rules/` |
| `/si:extract` | Turn a proven pattern into a standalone skill |
| `/si:status` | Memory health dashboard — line counts, topic files, recommendations |
| `/si:remember` | Explicitly save important knowledge to auto-memory |
## How It Fits Together
```
┌─────────────────────────────────────────────────────────┐
│ Claude Code Memory Stack │
├─────────────┬──────────────────┬────────────────────────┤
│ CLAUDE.md │ Auto Memory │ Session Memory │
│ (you write)│ (Claude writes)│ (Claude writes) │
│ Rules & │ MEMORY.md │ Conversation logs │
│ standards │ + topic files │ + continuity │
│ Full load │ First 200 lines│ Contextual load │
├─────────────┴──────────────────┴────────────────────────┤
│ ↑ /si:promote ↑ /si:review │
│ Self-Improving Agent (this plugin) │
│ ↓ /si:extract ↓ /si:remember │
├─────────────────────────────────────────────────────────┤
│ .claude/rules/ │ New Skills │ Error Logs │
│ (scoped rules) │ (extracted) │ (auto-captured)│
└─────────────────────────────────────────────────────────┘
```
## Installation
### Claude Code (Plugin)
```
/plugin marketplace add alirezarezvani/claude-skills
/plugin install self-improving-agent@claude-code-skills
```
### OpenClaw
```bash
clawhub install self-improving-agent
```
### Codex CLI
```bash
./scripts/codex-install.sh --skill self-improving-agent
```
## Memory Architecture
### Where things live
| File | Who writes | Scope | Loaded |
|------|-----------|-------|--------|
| `./CLAUDE.md` | You (+ `/si:promote`) | Project rules | Full file, every session |
| `~/.claude/CLAUDE.md` | You | Global preferences | Full file, every session |
| `~/.claude/projects/<path>/memory/MEMORY.md` | Claude (auto) | Project learnings | First 200 lines |
| `~/.claude/projects/<path>/memory/*.md` | Claude (overflow) | Topic-specific notes | On demand |
| `.claude/rules/*.md` | You (+ `/si:promote`) | Scoped rules | When matching files open |
### The promotion lifecycle
```
1. Claude discovers pattern → auto-memory (MEMORY.md)
2. Pattern recurs 2-3x → /si:review flags it as promotion candidate
3. You approve → /si:promote graduates it to CLAUDE.md or rules/
4. Pattern becomes an enforced rule, not just a note
5. MEMORY.md entry removed → frees space for new learnings
```
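The lifecycle above can be read as a small decision function. The sketch below is illustrative only: the `PatternState` shape, the action names, and the threshold of 2 (the lower bound of "recurs 2-3x") are assumptions, not the plugin's actual implementation.

```typescript
// Hypothetical action names, one per lifecycle step.
type Action = 'wait' | 'flag-for-review' | 'promote' | 'remove-from-memory' | 'done';

interface PatternState {
  occurrences: number;        // times the pattern appears in auto-memory
  approved: boolean;          // you accepted the promotion
  promoted: boolean;          // rule written to CLAUDE.md or .claude/rules/
  removedFromMemory: boolean; // MEMORY.md entry deleted
}

// Assumed threshold: the lower bound of "recurs 2-3x".
const RECURRENCE_THRESHOLD = 2;

// Returns the next step for a pattern, following the lifecycle order.
function nextAction(p: PatternState): Action {
  if (p.occurrences < RECURRENCE_THRESHOLD) return 'wait';
  if (!p.approved) return 'flag-for-review';
  if (!p.promoted) return 'promote';
  if (!p.removedFromMemory) return 'remove-from-memory';
  return 'done';
}
```

Note that approval gates promotion: a pattern that recurs often still stays in MEMORY.md until you accept it.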
## Core Concepts
### Auto-memory is capture, not curation
Auto-memory is excellent at recording what Claude learns. But it has no judgment about:
- Which learnings are temporary vs. permanent
- Which patterns should become enforced rules
- When stale entries are wasting space inside the 200-line limit
- Which solutions are good enough to become reusable skills
That's what this plugin does.
### Promotion = graduation
When you promote a learning, it moves from Claude's scratchpad (MEMORY.md) to your project's rule system (CLAUDE.md or `.claude/rules/`). The difference matters:
- **MEMORY.md**: "I noticed this project uses pnpm" (background context)
- **CLAUDE.md**: "Use pnpm, not npm" (enforced instruction)
Promoted rules have higher priority and load in full (not truncated at 200 lines).
### Rules directory for scoped knowledge
Not everything belongs in CLAUDE.md. Use `.claude/rules/` for patterns that only apply to specific file types:
```yaml
# .claude/rules/api-testing.md
---
paths:
- "src/api/**/*.test.ts"
- "tests/api/**/*"
---
- Use supertest for API endpoint testing
- Mock external services with msw
- Always test error responses, not just happy paths
```
This loads only when Claude works with API test files — zero overhead otherwise.
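To make the scoping concrete, here is a rough sketch of how `paths` globs could decide whether a rule file applies to a given file. It assumes common glob conventions (`**/` spans zero or more directories, `*` stays within one path segment); Claude Code's real matcher may behave differently, and `globToRegExp` and `ruleApplies` are hypothetical names.

```typescript
// Converts a glob into a RegExp. Supports only `**/`, `**`, and `*`.
// This is a simplification and may differ from Claude Code's real matcher.
function globToRegExp(glob: string): RegExp {
  let pattern = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.slice(i, i + 3) === '**/') {
      pattern += '(?:.*/)?'; // zero or more whole directories
      i += 3;
    } else if (glob.slice(i, i + 2) === '**') {
      pattern += '.*'; // anything, including slashes
      i += 2;
    } else if (glob.charAt(i) === '*') {
      pattern += '[^/]*'; // anything within one path segment
      i += 1;
    } else {
      // Escape regex metacharacters so the "." in ".test.ts" is literal
      pattern += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp('^' + pattern + '$');
}

// A rule applies when at least one of its globs matches the file path.
function ruleApplies(paths: string[], file: string): boolean {
  return paths.some((glob) => globToRegExp(glob).test(file));
}

// The globs from the api-testing.md example above.
const apiTestingPaths = ['src/api/**/*.test.ts', 'tests/api/**/*'];
```

Under these assumptions, `ruleApplies(apiTestingPaths, 'src/api/users/list.test.ts')` is true, while a file such as `src/ui/button.test.ts` does not trigger the rule.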
## Agents
### memory-analyst
Analyzes MEMORY.md and topic files to identify:
- Entries that recur across sessions (promotion candidates)
- Stale entries referencing deleted files or old patterns
- Related entries that should be consolidated
- Gaps between what MEMORY.md knows and what CLAUDE.md enforces
### skill-extractor
Takes a proven pattern and generates a complete skill:
- SKILL.md with proper frontmatter
- Reference documentation
- Examples and edge cases
- Ready for `/plugin install` or `clawhub publish`
## Hooks
### error-capture (PostToolUse → Bash)
Monitors command output for errors. When detected, appends a structured entry to auto-memory with:
- The command that failed
- Error output (truncated)
- Timestamp and context
- Suggested category
**Token overhead:** Zero on success. ~30 tokens only when an error is detected.
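The shape of such an entry might look like the sketch below. The real hook is a shell script; this TypeScript version only illustrates the four fields listed above. The field names, the 200-character limit, and the category heuristic are all assumptions.

```typescript
// Hypothetical entry shape mirroring the four items the hook records.
interface ErrorEntry {
  command: string;   // the command that failed
  output: string;    // error output, truncated
  timestamp: string; // ISO 8601
  context: string;   // free-form note about what was being done
  category: string;  // suggested category
}

// Assumed limit; the hook's actual truncation length is not documented here.
const MAX_OUTPUT_CHARS = 200;

function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max) + '...';
}

// Toy heuristic for the suggested category, for illustration only.
function suggestCategory(output: string): string {
  if (/permission denied/i.test(output)) return 'permissions';
  if (/not found/i.test(output)) return 'missing-dependency';
  return 'uncategorized';
}

function buildErrorEntry(command: string, output: string, context: string, now: Date): ErrorEntry {
  return {
    command: command,
    output: truncate(output, MAX_OUTPUT_CHARS),
    timestamp: now.toISOString(),
    context: context,
    category: suggestCategory(output),
  };
}
```

Truncating the output keeps each entry small, which matters because every line competes for space in MEMORY.md.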
## Platform Support
| Platform | Memory System | Plugin Works? |
|----------|--------------|---------------|
| Claude Code | Auto-memory (MEMORY.md) | ✅ Full support |
| OpenClaw | workspace/MEMORY.md | ✅ Adapted (reads workspace memory) |
| Codex CLI | AGENTS.md | ✅ Adapted (reads AGENTS.md patterns) |
| GitHub Copilot | `.github/copilot-instructions.md` | ⚠️ Manual promotion only |
## Related
- [Claude Code Memory Docs](https://code.claude.com/docs/en/memory)
- [pskoett/self-improving-agent](https://clawhub.ai/pskoett/self-improving-agent) — inspiration
- [playwright-pro](../playwright-pro/) — sister plugin in this repo
FILE:CLAUDE.md
# Self-Improving Agent — Claude Code Instructions
This plugin helps you curate Claude Code's auto-memory into durable project knowledge.
## Commands
Use the `/si:` namespace for all commands:
- `/si:review` — Analyze auto-memory health and find promotion candidates
- `/si:promote <pattern>` — Graduate a learning to CLAUDE.md or `.claude/rules/`
- `/si:extract <pattern>` — Create a reusable skill from a proven pattern
- `/si:status` — Quick memory health dashboard
- `/si:remember <knowledge>` — Explicitly save something to auto-memory
## How auto-memory works
Claude Code maintains `~/.claude/projects/<project-path>/memory/MEMORY.md` automatically. The first 200 lines load into every session. When it grows too large, Claude moves details into topic files like `debugging.md` or `patterns.md`.
This plugin reads that directory — it never creates its own storage.
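The 200-line window is easy to picture as a split of the file's lines. A minimal sketch, assuming lines are separated by `\n`; `splitMemory` and `MemoryWindow` are hypothetical names used only for illustration:

```typescript
// The first 200 lines of MEMORY.md load into every session.
const LOAD_LIMIT = 200;

interface MemoryWindow {
  loaded: string[];   // lines available at session start
  overflow: string[]; // lines past the limit, not loaded automatically
}

// Splits MEMORY.md content into the loaded window and the remainder.
function splitMemory(content: string): MemoryWindow {
  const lines = content.split('\n');
  return {
    loaded: lines.slice(0, LOAD_LIMIT),
    overflow: lines.slice(LOAD_LIMIT),
  };
}
```

Anything that lands in `overflow` is a candidate for promotion, consolidation, or a move into a topic file.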
## When to use each command
### After completing a feature or debugging session
```
/si:review
```
Check if anything Claude learned should become a permanent rule.
### When a pattern keeps coming up
```
/si:promote "Always run migrations before tests in this project"
```
Moves it from MEMORY.md (background note) to CLAUDE.md (enforced rule).
### When you solved something non-obvious that could help other projects
```
/si:extract "Docker build fix for ARM64 platform mismatch"
```
Creates a standalone skill with SKILL.md, ready to install elsewhere.
### To check memory capacity
```
/si:status
```
Shows line counts, topic files, stale entries, and recommendations.
## Key principle
**Don't fight auto-memory — orchestrate it.**
- Auto-memory is great at capturing patterns. Let it do its job.
- This plugin adds judgment: what's worth keeping, what should be promoted, what's stale.
- Promoted rules in CLAUDE.md have higher priority than MEMORY.md entries.
- Removing promoted entries from MEMORY.md frees space for new learnings.
## Agents
- **memory-analyst**: Spawned by `/si:review` to analyze patterns across memory files
- **skill-extractor**: Spawned by `/si:extract` to generate complete skill packages
## Hooks
The `error-capture.sh` hook fires on `PostToolUse` (Bash only). It detects command failures and appends structured entries to auto-memory. Zero overhead on successful commands.
To enable, add the following to `.claude/settings.json`:
```json
{
"hooks": {
"PostToolUse": [{
"matcher": "Bash",
"hooks": [{
"type": "command",
"command": "./skills/self-improving-agent/hooks/error-capture.sh"
}]
}]
}
}
```
FILE:README.md
# Self-Improving Agent
> Auto-memory captures. This plugin curates.
A Claude Code plugin that turns auto-memory into a structured self-improvement loop. Analyze what Claude has learned, promote proven patterns to enforced rules, and extract recurring solutions into reusable skills.
## Why
Claude Code's auto-memory (v2.1.32+) automatically records project patterns in `MEMORY.md`. But it has no judgment about what to keep, what to promote, or when entries go stale. This plugin adds the intelligence layer.
**The difference:**
- **MEMORY.md**: "I noticed this project uses pnpm" (background note, truncated at 200 lines)
- **CLAUDE.md**: "Use pnpm, not npm" (enforced instruction, loaded in full)
Promoting a pattern from memory to rules fundamentally changes how Claude treats it.
## Commands
| Command | What it does |
|---------|-------------|
| `/si:review` | Analyze auto-memory — find promotion candidates, stale entries, health metrics |
| `/si:promote` | Graduate a pattern from MEMORY.md → CLAUDE.md or `.claude/rules/` |
| `/si:extract` | Turn a recurring pattern into a standalone reusable skill |
| `/si:status` | Memory health dashboard — line counts, capacity, recommendations |
| `/si:remember` | Explicitly save important knowledge to auto-memory |
## Install
### Claude Code
```
/plugin marketplace add alirezarezvani/claude-skills
/plugin install self-improving-agent@claude-code-skills
```
### OpenClaw
```bash
clawhub install self-improving-agent
```
### Codex CLI
```bash
./scripts/codex-install.sh --skill self-improving-agent
```
## How It Works
```
Claude discovers pattern → auto-memory (MEMORY.md)
↓
Pattern recurs 2-3x → /si:review flags it
↓
You approve → /si:promote graduates it to CLAUDE.md
↓
Pattern becomes enforced rule, memory entry removed
↓
Space freed for new learnings
```
## What's Included
| Component | Count | Description |
|-----------|-------|-------------|
| Skills | 5 | review, promote, extract, status, remember |
| Agents | 2 | memory-analyst, skill-extractor |
| Hooks | 1 | PostToolUse error capture (zero overhead on success) |
| Reference docs | 3 | memory architecture, promotion rules, rules directory patterns |
| Templates | 2 | rule template, skill template |
## Design Principles
1. **Don't fight auto-memory — orchestrate it.** Auto-memory captures. This plugin curates.
2. **No duplicate storage.** Reads from `~/.claude/projects/` directly. No `.learnings/` directory.
3. **Zero capture overhead.** Auto-memory handles capture. Hook only fires on errors.
4. **Promotion = graduation.** Moving a pattern from MEMORY.md to CLAUDE.md changes its priority.
5. **Respect the 200-line limit.** Actively manages MEMORY.md capacity.
## Platform Support
| Platform | Memory System | Support |
|----------|--------------|---------|
| Claude Code | Auto-memory (MEMORY.md) | ✅ Full |
| OpenClaw | workspace/MEMORY.md | ✅ Adapted |
| Codex CLI | AGENTS.md | ✅ Adapted |
| GitHub Copilot | copilot-instructions.md | ⚠️ Manual |
## Credits
Inspired by [pskoett/self-improving-agent](https://clawhub.ai/pskoett/self-improving-agent) — a structured learning loop for AI coding agents. This plugin builds on that concept by integrating natively with Claude Code's auto-memory system.
## License
MIT — see [LICENSE](LICENSE)
FILE:agents/memory-analyst.md
# Memory Analyst Agent
You are a memory analyst for Claude Code projects. Your job is to analyze the auto-memory directory and produce actionable insights.
## Your Role
You analyze `~/.claude/projects/<project>/memory/` to find:
1. **Promotion candidates** — entries proven enough to become CLAUDE.md rules
2. **Stale entries** — references to files, tools, or patterns that no longer apply
3. **Consolidation opportunities** — multiple entries about the same topic
4. **Conflicts** — memory entries that contradict CLAUDE.md rules
5. **Health metrics** — capacity, freshness, organization
## Analysis Process
### 1. Read all memory files
- `MEMORY.md` (main file, first 200 lines loaded at startup)
- Any topic files (`debugging.md`, `patterns.md`, etc.)
- Note total line counts and file sizes
### 2. Cross-reference with CLAUDE.md
- Read `./CLAUDE.md` and `~/.claude/CLAUDE.md`
- Read all files in `.claude/rules/`
- Identify duplicates, contradictions, and gaps
### 3. Detect patterns
For each MEMORY.md entry, evaluate:
**Recurrence signals:**
- Same concept in multiple entries (paraphrased)
- Words like "again", "still", "always", "every time"
- Similar entries in topic files
**Staleness signals:**
- File paths that don't exist on disk (verify with `find` or `ls`)
- Version numbers that are outdated
- References to removed dependencies
- Patterns that contradict current CLAUDE.md
**Promotion signals:**
- Actionable (can be written as "Do X" / "Never Y")
- Broadly applicable (not a one-time debugging note)
- Not already in CLAUDE.md or rules/
- High impact (prevents common mistakes)
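The keyword-based recurrence signal above is the simplest one to check mechanically. A minimal sketch, assuming each memory entry is a single string; the function names are hypothetical, and detecting paraphrased duplicates still needs the analyst's judgment:

```typescript
// Matches the recurrence words listed above as whole words, case-insensitively.
const RECURRENCE_WORDS = /\b(again|still|always|every time)\b/i;

function hasRecurrenceSignal(entry: string): boolean {
  return RECURRENCE_WORDS.test(entry);
}

// Returns only the entries that contain at least one recurrence word.
function recurringEntries(entries: string[]): string[] {
  return entries.filter(hasRecurrenceSignal);
}
```

The word-boundary anchors avoid false positives such as "still" inside "distillation".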
### 4. Score each entry
Rate each entry on three dimensions:
- **Durability** (0-3): Will this still be true in a month?
- **Impact** (0-3): How much does this affect daily work?
- **Scope** (0-3): One file (0), one directory (1), entire project (2), all your projects (3)
Promotion candidates: total score ≥ 6
### 5. Generate report
Organize findings into:
1. Promotion candidates (sorted by score, highest first)
2. Stale entries (with reason for staleness)
3. Consolidation groups (which entries to merge)
4. Conflicts (with both sides shown)
5. Health metrics (capacity, freshness)
6. Recommendations (top 3 actions)
## Output Format
Use the format defined in the `/si:review` skill. Be specific — include line numbers, exact text, and concrete suggestions.
## Constraints
- Never modify files directly — only analyze and report
- Don't invent entries — only report what's actually in the memory files
- Be concise — the report should be shorter than the memory files it analyzes
- Prioritize actionable findings over completeness
FILE:agents/skill-extractor.md
# Skill Extractor Agent
You are a skill extraction specialist. Your job is to transform proven patterns and debugging solutions into standalone, portable skills.
## Your Role
Given a pattern description (and optionally auto-memory entries), generate a complete skill package that:
- Solves a specific, recurring problem
- Works in any project (no hardcoded paths, credentials, or project-specific values)
- Is self-contained (readable without the original context)
- Follows the claude-skills format specification
## Extraction Process
### 1. Understand the pattern
From the input, identify:
- **The problem**: What goes wrong? What's the symptom?
- **The root cause**: Why does it happen?
- **The solution**: What's the fix? Are there multiple approaches?
- **The edge cases**: When does the solution NOT work?
- **The trigger conditions**: When should an agent use this skill?
### 2. Generate skill name
Rules:
- Lowercase, hyphens between words
- 2-4 words, descriptive
- Match the problem, not the project
- Examples: `docker-arm64-fixes`, `api-timeout-patterns`, `pnpm-monorepo-setup`
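The two mechanical naming rules (lowercase with hyphens, 2-4 words) can be checked with a regular expression. This is a minimal sketch with a hypothetical helper name, assuming words are made of lowercase letters and digits. Whether a name is descriptive and matches the problem still needs a human or agent judgment.

```python
import re

# 2-4 lowercase alphanumeric words joined by single hyphens.
SKILL_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+){1,3}")


def is_valid_skill_name(name: str) -> bool:
    """Check the format rules only: lowercase, hyphen-separated, 2-4 words."""
    return SKILL_NAME_RE.fullmatch(name) is not None
```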
### 3. Create SKILL.md
Required structure:
```markdown
---
name: {{skill-name}}
description: "{{One sentence}}. Use when: {{trigger conditions}}."
---
# {{Skill Title}}
> {{One-line value proposition}}
## Quick Reference
| Problem | Solution |
|---------|----------|
| {{error/symptom}} | {{fix}} |
## The Problem
{{2-3 sentences. Include the error message or symptom people would search for.}}
## Solutions
### Option 1: {{Name}} (Recommended)
{{Step-by-step instructions with code blocks.}}
### Option 2: {{Alternative}} {{if applicable}}
{{When Option 1 doesn't apply.}}
## Trade-offs
| Approach | Pros | Cons |
|----------|------|------|
| {{option}} | {{pros}} | {{cons}} |
## Edge Cases
- {{When this approach breaks and what to do instead}}
## Related
- {{Links to official docs or related skills}}
```
### 4. Create README.md
Brief human-readable overview:
- What the skill does (1 paragraph)
- Installation instructions
- When to use it
- Credits/source
### 5. Quality checks
Before delivering, verify:
- [ ] YAML frontmatter is valid (`name` and `description` present)
- [ ] `name` in frontmatter matches folder name
- [ ] Description includes "Use when:" trigger
- [ ] No project-specific paths, URLs, or credentials
- [ ] Code examples are complete and runnable
- [ ] Error messages are exact (copy-pasteable for searching)
- [ ] Solutions work without additional context
- [ ] Trade-offs table helps users choose between options
- [ ] Skill is useful in a project you've never seen before
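The first three checks in the list are mechanical and can be scripted. The sketch below is illustrative only: `check_frontmatter` is a hypothetical helper, and it uses a flat `key: value` reader rather than a real YAML parser, so it assumes `name` and `description` each sit on a single top-level line.

```python
def check_frontmatter(skill_md: str, folder_name: str) -> list[str]:
    """Return a list of problems found in the frontmatter of a SKILL.md string."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---'"]
    # Find the line that closes the frontmatter block.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return ["missing closing '---'"]

    # Flat "key: value" reader; splits on the first colon only.
    fields = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')

    problems = []
    name = fields.get("name", "")
    description = fields.get("description", "")
    if not name:
        problems.append("`name` is missing")
    elif name != folder_name:
        problems.append(f"`name` ({name}) does not match folder ({folder_name})")
    if not description:
        problems.append("`description` is missing")
    elif "Use when:" not in description:
        problems.append('`description` lacks a "Use when:" trigger')
    return problems
```

An empty list means the frontmatter passes these three checks. The remaining items (portability, runnable examples, exact error messages) cannot be verified this way.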
## Constraints
- **One problem per skill** — don't create omnibus guides
- **Show, don't tell** — code examples over prose
- **Include the error** — people search by error message
- **Be portable** — no `npm` vs `pnpm` assumptions
- **Keep it short** — under 200 lines for SKILL.md
- **No unnecessary files** — only SKILL.md is required. Add reference/ only if the topic is complex enough to warrant it
FILE:hooks/error-capture.sh
#!/bin/bash
# Self-Improving Agent — Error Capture Hook
# Fires on PostToolUse (Bash) to detect command failures.
# Zero output on success — only captures when errors are detected.
#
# Install: Add to .claude/settings.json:
# {
#   "hooks": {
#     "PostToolUse": [{
#       "matcher": "Bash",
#       "hooks": [{
#         "type": "command",
#         "command": "./skills/self-improving-agent/hooks/error-capture.sh"
#       }]
#     }]
#   }
# }
set -e
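# NOTE: the original assignment was garbled in extraction. Reading the hook
# payload from stdin is an assumption (Claude Code passes hook input as JSON on stdin).
OUTPUT=""
[ -t 0 ] || OUTPUT="$(cat)"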
# Exit silently if no output or empty
[ -z "$OUTPUT" ] && exit 0
# Error patterns — ordered by specificity
ERROR_PATTERNS=(
"error:"
"Error:"
"ERROR:"
"FATAL:"
"fatal:"
"FAILED"
"failed"
"command not found"
"No such file or directory"
"Permission denied"
"Module not found"
"ModuleNotFoundError"
"ImportError"
"SyntaxError"
"TypeError"
"ReferenceError"
"Cannot find module"
"ENOENT"
"EACCES"
"ECONNREFUSED"
"ETIMEDOUT"
"npm ERR!"
"pnpm ERR!"
"Traceback (most recent call last)"
"panic:"
"segmentation fault"
"core dumped"
"exit code"
"non-zero exit"
"Build failed"
"Compilation failed"
"Test failed"
)
# False positive exclusions — don't trigger on these
EXCLUSIONS=(
"error-capture" # Don't trigger on ourselves
"error_handler" # Code that handles errors
"errorHandler"
"error.log" # Log file references
"console.error" # Code that logs errors
"catch (error" # Error handling code
"catch (err"
".error(" # Logger calls
"no error" # Absence of error
"without error"
"error-free"
)
# Check exclusions first
for excl in "${EXCLUSIONS[@]}"; do
  if [[ "$OUTPUT" == *"$excl"* ]]; then
    exit 0
  fi
done
# Check for error patterns
contains_error=false
matched_pattern=""
for pattern in "${ERROR_PATTERNS[@]}"; do
  if [[ "$OUTPUT" == *"$pattern"* ]]; then
    contains_error=true
    matched_pattern="$pattern"
    break
  fi
done
# Exit silently if no error
[ "$contains_error" = false ] && exit 0
# Extract relevant error context (first 5 lines containing the pattern)
# -F treats the pattern as a fixed string; -m 5 already caps the result at 5 lines
error_context=$(printf '%s\n' "$OUTPUT" | grep -i -F -m 5 -- "$matched_pattern" || true)
# Output a concise reminder — ~40 tokens
cat << EOF
<error-detected>
Command error detected (pattern: "$matched_pattern").
If this was unexpected or required investigation to fix, save the solution:
/si:remember "explanation of what went wrong and the fix"
Or if this is a known pattern, check: /si:review
Context: $(echo "$error_context" | head -2 | tr '\n' ' ' | cut -c1-200)
</error-detected>
EOF
FILE:hooks/hooks.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "./hooks/error-capture.sh"
          }
        ]
      }
    ]
  }
}
FILE:reference/memory-architecture.md
# Claude Code Memory Architecture
A complete reference for how Claude Code's memory systems work together.
## Four Memory Systems
### 1. CLAUDE.md Files (You → Claude)
**Purpose:** Persistent instructions you write to guide Claude's behavior.
**Locations (in priority order):**
| Scope | Path | Shared |
|-------|------|--------|
| Managed policy | `/etc/claude-code/CLAUDE.md` (Linux) | All users |
| Project | `./CLAUDE.md` or `./.claude/CLAUDE.md` | Team (git) |
| User | `~/.claude/CLAUDE.md` | Just you |
| Local | `./CLAUDE.local.md` | Just you |
**Loading:** Full file, every session. Files higher in the directory tree load first.
**Key facts:**
- Target under 200 lines per file
- Use `@path/to/file` syntax to import additional files, for example `@README` or `@docs/guide.md` (max 5 hops deep)
- More specific locations take precedence over broader ones
- CLAUDE.local.md is auto-added to .gitignore
### 2. Auto Memory (Claude → Claude)
**Purpose:** Notes Claude writes to itself about project patterns and learnings.
**Location:** `~/.claude/projects/<project-path>/memory/`
**Structure:**
```
~/.claude/projects/<project-path>/memory/
├── MEMORY.md # Main file (first 200 lines loaded)
├── debugging.md # Topic file (loaded on demand)
├── patterns.md # Topic file (loaded on demand)
└── ... # More topic files as needed
```
**Key facts:**
- Enabled by default (since v2.1.32)
- Only the first 200 lines of MEMORY.md load at startup
- Claude creates topic files automatically when MEMORY.md gets long
- Git repo root determines the project path
- Git worktrees get separate memory directories
- Local only — not shared via git
- Toggle with `/memory`, settings, or `CLAUDE_CODE_DISABLE_AUTO_MEMORY=1`
- Subagents can have their own auto memory
**What it captures:**
- Build commands and test conventions
- Debugging solutions and error patterns
- Code style preferences and architecture notes
- Your communication preferences and workflow habits
### 3. Session Memory (Claude → Claude)
**Purpose:** Conversation summaries for cross-session continuity.
**Location:** `~/.claude/projects/<project-path>/<session>/session-memory/`
**Key facts:**
- Saves what was discussed and decided in specific sessions
- "What did we do yesterday?" context
- Loaded contextually (relevant past sessions, not all)
- Use `/remember` to turn session memory into permanent project knowledge
### 4. Rules Directory (You → Claude, scoped)
**Purpose:** Modular instructions scoped to specific file types.
**Location:** `.claude/rules/*.md`
**Key facts:**
- Uses YAML frontmatter with `paths` field for scoping
- Only loads when Claude works with matching files
- Recursive — can organize into subdirectories
- Same priority as `.claude/CLAUDE.md`
- Great for keeping CLAUDE.md under 200 lines
```yaml
---
paths:
- "src/api/**/*.ts"
---
# API rules only load when working with API files
```
## Memory Priority
When entries conflict:
1. CLAUDE.md (highest — explicit instructions)
2. `.claude/rules/` (high — scoped instructions)
3. Auto-memory MEMORY.md (medium — learned patterns)
4. Session memory (low — historical context)
## The Self-Improving Agent's Role
```
Auto-memory captures  → This plugin curates  → CLAUDE.md enforces
MEMORY.md (raw notes) → /si:review (analyze) → /si:promote (graduate)
                                                         ↓
                                                    CLAUDE.md or
                                                   .claude/rules/
                                                  (enforced rules)
```
**Why this matters:** MEMORY.md entries are background context truncated at 200 lines. CLAUDE.md entries are high-priority instructions loaded in full. Promoting a pattern from memory to rules fundamentally changes how Claude treats it.
## Capacity Planning
| File | Soft limit | Hard limit | What happens at limit |
|------|-----------|------------|----------------------|
| MEMORY.md | 150 lines | 200 lines | Lines after 200 not loaded at startup |
| CLAUDE.md | 150 lines | No hard limit | Adherence decreases with length |
| Topic files | No limit | No limit | Loaded on demand, not at startup |
| Rules files | No limit per file | No limit | Only loaded when paths match |
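The MEMORY.md row of the table translates into a simple line-count check. The sketch below is an illustration under stated assumptions: `memory_status` and its status labels are hypothetical names, and the thresholds are the 150-line soft limit and 200-line startup limit from the table.

```python
MEMORY_SOFT_LIMIT = 150   # soft limit from the table above
MEMORY_HARD_LIMIT = 200   # lines after this are not loaded at startup


def memory_status(memory_md: str) -> dict:
    """Classify a MEMORY.md string against the soft and hard line limits."""
    count = len(memory_md.splitlines())
    if count > MEMORY_HARD_LIMIT:
        status = "truncated"      # some lines will not load at startup
    elif count > MEMORY_SOFT_LIMIT:
        status = "warning"        # time to promote or prune
    else:
        status = "ok"
    return {
        "lines": count,
        "status": status,
        "not_loaded": max(0, count - MEMORY_HARD_LIMIT),
    }
```

For a 230-line file this reports 30 lines that are not loaded at startup, which are the entries most worth promoting, moving to a topic file, or deleting.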
## Best Practices
1. **Keep MEMORY.md lean** — promote proven patterns, delete stale ones
2. **Keep CLAUDE.md under 200 lines** — split into rules/ if growing
3. **Don't duplicate** — if it's in CLAUDE.md, remove it from MEMORY.md
4. **Scope rules** — use `.claude/rules/` with paths for file-type-specific patterns
5. **Review quarterly** — memory files go stale after refactors
6. **Use /si:status** — monitor capacity before it becomes a problem
FILE:reference/promotion-rules.md
# Promotion Rules
When to promote a learning from auto-memory (MEMORY.md) to the project's rule system (CLAUDE.md or `.claude/rules/`).
## Promotion Criteria
A learning should be promoted when **all three** are true:
1. **Proven** — appeared in 2+ sessions or confirmed correct after testing
2. **Actionable** — can be written as a concrete instruction ("Use X", "Never Y")
3. **Durable** — will still be true in 30+ days
## Scoring Guide
| Dimension | Score 0 | Score 1 | Score 2 | Score 3 |
|-----------|---------|---------|---------|---------|
| **Durability** | One-time fix | Temporary workaround | Stable pattern | Architectural truth |
| **Impact** | Nice-to-know | Saves 1 minute | Prevents mistakes | Prevents breakage |
| **Scope** | One file only | One directory | Entire project | All your projects |
**Promote when total ≥ 6.** Watch when total = 4-5. Ignore when total ≤ 3.
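The thresholds can be expressed as a small function. This is a minimal sketch with hypothetical names (`Score`, `decision`). As a worked example, a learning such as "Use `pnpm install`, not npm" might be scored Durability 2 (stable pattern), Impact 3 (prevents breakage), Scope 2 (entire project), for a total of 7, which lands in the promote band. Those individual scores are illustrative assumptions, not values from the plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    durability: int
    impact: int
    scope: int

    def __post_init__(self):
        # Each dimension is rated 0-3, as in the scoring guide.
        for name in ("durability", "impact", "scope"):
            value = getattr(self, name)
            if not 0 <= value <= 3:
                raise ValueError(f"{name} must be between 0 and 3, got {value}")

    @property
    def total(self) -> int:
        return self.durability + self.impact + self.scope


def decision(score: Score) -> str:
    """Map a total score to promote (>= 6), watch (4-5), or ignore (<= 3)."""
    if score.total >= 6:
        return "promote"
    if score.total >= 4:
        return "watch"
    return "ignore"
```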
## Target Selection
### Use CLAUDE.md when:
- The rule applies to the entire project
- It's a build command, test convention, or architecture decision
- Any contributor (human or AI) needs to know it
- It's short enough to add without exceeding 200 lines
### Use .claude/rules/ when:
- The rule only applies to specific file types
- CLAUDE.md is already near 200 lines
- The rule needs detailed explanation (multiple paragraphs)
- You want it to load only when relevant files are open
### Use ~/.claude/CLAUDE.md when:
- The rule applies to all your projects
- It's a personal preference, not a project convention
- Examples: "Prefer explicit returns over implicit", "Use descriptive variable names"
## Distillation Rules
When promoting, transform the learning:
### From descriptive to prescriptive
❌ "I noticed the project uses pnpm workspaces. npm install fails because of the lock file."
✅ "Use `pnpm install`, not npm. Lock file: `pnpm-lock.yaml`."
### From verbose to concise
❌ "When modifying API endpoints in the OpenAPI spec file, you need to regenerate the TypeScript client by running the generate command, otherwise the types won't match at runtime and you'll get errors."
✅ "After editing `openapi.yaml`: run `pnpm run generate:api` to regenerate TS client."
### From conditional to absolute
❌ "Sometimes you need to restart the dev server after changing environment variables."
✅ "Restart dev server after any `.env` change — hot reload doesn't pick up env vars."
## Anti-Patterns
### Don't promote:
- **One-time debugging notes** — "Fixed the CORS issue by adding header X" (unless it recurs)
- **Session-specific context** — "We decided to use Approach A in today's meeting"
- **Unstable patterns** — "Currently using v3 of the API" (will change)
- **Obvious things** — "Run tests before committing" (Claude knows this)
- **Credentials or secrets** — never store in any memory file
### Don't duplicate:
- If CLAUDE.md already says "Use pnpm", don't also keep it in MEMORY.md
- After promoting, remove the source entry to free space
## Promotion Workflow
```
1. /si:review identifies candidate
2. Confirm the pattern is still valid
3. Distill into one-line instruction
4. /si:promote writes to CLAUDE.md or rules/
5. Remove from MEMORY.md
6. Verify with /si:status
```
FILE:reference/rules-directory-patterns.md
# Rules Directory Patterns
Best practices for organizing `.claude/rules/` files — the scoped instruction system that loads rules only when relevant files are open.
## Directory Structure
```
.claude/
├── CLAUDE.md                 # Main project instructions (always loaded)
└── rules/
    ├── code-style.md         # No paths → loads always (like CLAUDE.md)
    ├── testing.md            # Scoped to test files
    ├── api-design.md         # Scoped to API source files
    ├── database.md           # Scoped to migration/model files
    └── frontend/
        ├── components.md     # Scoped to React components
        └── styling.md        # Scoped to CSS/styled files
```
## Path Scoping
### Basic patterns
```yaml
---
paths:
- "**/*.test.ts" # All TypeScript test files
- "src/api/**/*.ts" # API source files
- "*.md" # Root-level markdown
- "src/components/**/*.tsx" # React components
---
```
### Brace expansion
```yaml
---
paths:
- "src/**/*.{ts,tsx}" # All TypeScript + TSX
- "tests/**/*.{test,spec}.ts" # Test and spec files
---
```
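To see which concrete patterns a brace group stands for, it can help to expand it by hand or with a few lines of code. The sketch below shows the general technique for a hypothetical helper, `expand_braces`. It is not a description of how Claude Code itself resolves `paths`, and it assumes brace groups are not nested.

```python
import re

# Matches one brace group that contains no nested braces.
_BRACE_RE = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> list[str]:
    """Expand comma-separated brace groups into a list of plain glob patterns."""
    match = _BRACE_RE.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Recurse so that patterns with several groups are fully expanded.
        results.extend(expand_braces(head + option + tail))
    return results
```

Under this expansion, `src/**/*.{ts,tsx}` is equivalent to listing `src/**/*.ts` and `src/**/*.tsx` as two separate entries.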
### Multiple scopes
```yaml
---
paths:
- "src/api/**/*.ts"
- "tests/api/**/*"
- "openapi.yaml"
---
```
## Common Rule Files
### testing.md
```yaml
---
paths:
- "**/*.test.{ts,tsx,js,jsx}"
- "**/*.spec.{ts,tsx,js,jsx}"
- "tests/**/*"
- "__tests__/**/*"
---
# Testing Rules
- Use `describe` blocks to group related tests
- One assertion per test when possible
- Mock external services; never hit real APIs in tests
- Use factories for test data, not inline objects
- Run `pnpm test` before committing
```
### api-design.md
```yaml
---
paths:
- "src/api/**/*.ts"
- "src/routes/**/*.ts"
- "src/handlers/**/*.ts"
---
# API Design Rules
- Validate all input with Zod schemas
- Use `ApiError` class for error responses
- Include OpenAPI JSDoc on all handlers
- Return consistent error format: `{ error: string, code: string }`
```
### database.md
```yaml
---
paths:
- "src/db/**/*"
- "migrations/**/*"
- "prisma/**/*"
- "drizzle/**/*"
---
# Database Rules
- Always create a migration for schema changes
- Never modify existing migrations — create new ones
- Use transactions for multi-table operations
- Index foreign keys and frequently queried columns
```
### security.md (unscoped — always loads)
```markdown
# Security Rules
- Never log sensitive data (tokens, passwords, PII)
- Sanitize all user input before database queries
- Use parameterized queries, never string interpolation
- Validate file uploads: type, size, content
- Environment variables for all secrets — never hardcode
```
## When to Create a Rule File
| Signal | Action |
|--------|--------|
| CLAUDE.md over 150 lines | Move scoped patterns to rules/ |
| Same instruction repeated for different file types | Create a scoped rule |
| `/si:promote` suggests a file-type-specific pattern | Create or append to a rule file |
| Team adds a new convention for a specific area | New rule file |
## Organization Tips
1. **One topic per file** — `testing.md`, not `testing-and-linting.md`
2. **Use subdirectories for large projects** — `rules/frontend/`, `rules/backend/`
3. **Keep unscoped rules minimal** — they load every session like CLAUDE.md
4. **Review after refactors** — paths may change when directories are reorganized
5. **Share via git** — rules/ should be version-controlled (unlike auto-memory)
FILE:settings.json
{
  "name": "self-improving-agent",
  "displayName": "Self-Improving Agent",
  "version": "1.0.0",
  "description": "Curate auto-memory, promote learnings to rules, extract skills from patterns.",
  "author": "Reza Rezvani",
  "license": "MIT",
  "platforms": ["claude-code", "openclaw", "codex"],
  "category": "development",
  "tags": ["memory", "auto-memory", "self-improvement", "learning", "rules", "skills"],
  "repository": "https://github.com/alirezarezvani/claude-skills",
  "commands": {
    "review": "/si:review",
    "promote": "/si:promote",
    "extract": "/si:extract",
    "status": "/si:status",
    "remember": "/si:remember"
  },
  "hooks": {
    "PostToolUse": {
      "Bash": "hooks/error-capture.sh"
    }
  },
  "agents": [
    "memory-analyst",
    "skill-extractor"
  ]
}
FILE:skills/extract/SKILL.md
---
name: "extract"
description: "Turn a proven pattern or debugging solution into a standalone reusable skill with SKILL.md, reference docs, and examples."
command: /si:extract
---
# /si:extract — Create Skills from Patterns
Transforms a recurring pattern or debugging solution into a standalone, portable skill that can be installed in any project.
## Usage
```
/si:extract <pattern description> # Interactive extraction
/si:extract <pattern> --name docker-m1-fixes # Specify skill name
/si:extract <pattern> --output ./skills/ # Custom output directory
/si:extract <pattern> --dry-run # Preview without creating files
```
## When to Extract
A learning qualifies for skill extraction when ANY of these are true:
| Criterion | Signal |
|---|---|
| **Recurring** | Same issue across 2+ projects |
| **Non-obvious** | Required real debugging to discover |
| **Broadly applicable** | Not tied to one specific codebase |
| **Complex solution** | Multi-step fix that's easy to forget |
| **User-flagged** | "Save this as a skill", "I want to reuse this" |
## Workflow
### Step 1: Identify the pattern
Read the user's description. Search auto-memory for related entries:
```bash
MEMORY_DIR="$HOME/.claude/projects/$(pwd | sed 's|/|%2F|g; s|%2F|/|; s|^/||')/memory"
grep -rni "<keywords>" "$MEMORY_DIR/"
```
If found in auto-memory, use those entries as source material. If not, use the user's description directly.
### Step 2: Determine skill scope
Ask (max 2 questions):
- "What problem does this solve?" (if not clear)
- "Should this include code examples?" (if applicable)
### Step 3: Generate skill name
Rules for naming:
- Lowercase, hyphens between words
- Descriptive but concise (2-4 words)
- Examples: `docker-m1-fixes`, `api-timeout-patterns`, `pnpm-workspace-setup`
### Step 4: Create the skill files
**Spawn the `skill-extractor` agent** for the actual file generation.
The agent creates:
```
<skill-name>/
├── SKILL.md            # Main skill file with frontmatter
├── README.md           # Human-readable overview
└── reference/          # (optional) Supporting documentation
    └── examples.md     # Concrete examples and edge cases
```
### Step 5: SKILL.md structure
The generated SKILL.md must follow this format:
```markdown
---
name: "skill-name"
description: "<one-line description>. Use when: <trigger conditions>."
---
# <Skill Title>
> One-line summary of what this skill solves.
## Quick Reference
| Problem | Solution |
|---------|----------|
| {{problem 1}} | {{solution 1}} |
| {{problem 2}} | {{solution 2}} |
## The Problem
{{2-3 sentences explaining what goes wrong and why it's non-obvious.}}
## Solutions
### Option 1: {{Name}} (Recommended)
{{Step-by-step with code examples.}}
### Option 2: {{Alternative}}
{{For when Option 1 doesn't apply.}}
## Trade-offs
| Approach | Pros | Cons |
|----------|------|------|
| Option 1 | {{pros}} | {{cons}} |
| Option 2 | {{pros}} | {{cons}} |
## Edge Cases
- {{edge case 1 and how to handle it}}
- {{edge case 2 and how to handle it}}
```
### Step 6: Quality gates
Before finalizing, verify:
- [ ] SKILL.md has valid YAML frontmatter with `name` and `description`
- [ ] `name` matches the folder name (lowercase, hyphens)
- [ ] Description includes "Use when:" trigger conditions
- [ ] Solutions are self-contained (no external context needed)
- [ ] Code examples are complete and copy-pasteable
- [ ] No project-specific hardcoded values (paths, URLs, credentials)
- [ ] No unnecessary dependencies
### Step 7: Report
```
✅ Skill extracted: {{skill-name}}
Files created:
{{path}}/SKILL.md ({{lines}} lines)
{{path}}/README.md ({{lines}} lines)
{{path}}/reference/examples.md ({{lines}} lines)
Install: /plugin install (copy to your skills directory)
Publish: clawhub publish {{path}}
Source: MEMORY.md entries at lines {{n, m, ...}} (retained — the skill is portable, the memory is project-specific)
```
## Examples
### Extracting a debugging pattern
```
/si:extract "Fix for Docker builds failing on Apple Silicon with platform mismatch"
```
Creates `docker-m1-fixes/SKILL.md` with:
- The platform mismatch error message
- Three solutions (build flag, Dockerfile, docker-compose)
- Trade-offs table
- Performance note about Rosetta 2 emulation
### Extracting a workflow pattern
```
/si:extract "Always regenerate TypeScript API client after modifying OpenAPI spec"
```
Creates `api-client-regen/SKILL.md` with:
- Why manual regen is needed
- The exact command sequence
- CI integration snippet
- Common failure modes
## Tips
- Extract patterns that would save time in a *different* project
- Keep skills focused — one problem per skill
- Include the error messages people would search for
- Test the skill by reading it without the original context — does it make sense?
FILE:skills/promote/SKILL.md
---
name: "promote"
description: "Graduate a proven pattern from auto-memory (MEMORY.md) to CLAUDE.md or .claude/rules/ for permanent enforcement."
command: /si:promote
---
# /si:promote — Graduate Learnings to Rules
Moves a proven pattern from Claude's auto-memory into the project's rule system, where it becomes an enforced instruction rather than a background note.
## Usage
```
/si:promote <pattern description> # Auto-detect best target
/si:promote <pattern> --target claude.md # Promote to CLAUDE.md
/si:promote <pattern> --target rules/testing.md # Promote to scoped rule
/si:promote <pattern> --target rules/api.md --paths "src/api/**/*.ts" # Scoped with paths
```
## Workflow
### Step 1: Understand the pattern
Parse the user's description. If vague, ask one clarifying question:
- "What specific behavior should Claude follow?"
- "Does this apply to all files or specific paths?"
### Step 2: Find the pattern in auto-memory
```bash
# Search MEMORY.md for related entries
MEMORY_DIR="$HOME/.claude/projects/$(pwd | sed 's|/|%2F|g; s|%2F|/|; s|^/||')/memory"
grep -ni "<keywords>" "$MEMORY_DIR/MEMORY.md"
```
Show the matching entries and confirm they're what the user means.
### Step 3: Determine the right target
| Pattern scope | Target | Example |
|---|---|---|
| Applies to entire project | `./CLAUDE.md` | "Use pnpm, not npm" |
| Applies to specific file types | `.claude/rules/<topic>.md` | "API handlers need validation" |
| Applies to all your projects | `~/.claude/CLAUDE.md` | "Prefer explicit error handling" |
If the user didn't specify a target, recommend one based on scope.
### Step 4: Distill into a concise rule
Transform the learning from auto-memory's note format into CLAUDE.md's instruction format:
**Before** (MEMORY.md — descriptive):
> The project uses pnpm workspaces. When I tried npm install it failed. The lock file is pnpm-lock.yaml. Must use pnpm install for dependencies.
**After** (CLAUDE.md — prescriptive):
```markdown
## Build & Dependencies
- Package manager: pnpm (not npm). Use `pnpm install`.
```
**Rules for distillation:**
- One line per rule when possible
- Imperative voice ("Use X", "Always Y", "Never Z")
- Include the command or example, not just the concept
- No backstory — just the instruction
### Step 5: Write to target
**For CLAUDE.md:**
1. Read existing CLAUDE.md
2. Find the appropriate section (or create one)
3. Append the new rule under the right heading
4. If file would exceed 200 lines, suggest using `.claude/rules/` instead
**For `.claude/rules/`:**
1. Create the file if it doesn't exist
2. Add YAML frontmatter with `paths` if scoped
3. Write the rule content
```markdown
---
paths:
- "src/api/**/*.ts"
- "tests/api/**/*"
---
# API Development Rules
- All endpoints must validate input with Zod schemas
- Use `ApiError` class for error responses (not raw Error)
- Include OpenAPI JSDoc comments on handler functions
```
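The rule-file layout shown above (optional `paths` frontmatter, a heading, one rule per line) is regular enough to generate. This is a minimal sketch, assuming a hypothetical helper `render_rule_file` and assuming the path patterns contain no double quotes. It only builds the text; writing it to `.claude/rules/` is left to the caller.

```python
from typing import Optional, Sequence


def render_rule_file(
    title: str,
    rules: Sequence[str],
    paths: Optional[Sequence[str]] = None,
) -> str:
    """Build the text of a rule file; omit `paths` for an unscoped rule."""
    lines = []
    if paths:
        # Scoped rule: emit YAML frontmatter with one quoted pattern per line.
        lines.append("---")
        lines.append("paths:")
        lines.extend(f'  - "{p}"' for p in paths)
        lines.append("---")
    lines.append(f"# {title}")
    lines.extend(f"- {rule}" for rule in rules)
    return "\n".join(lines) + "\n"
```

Calling it without `paths` produces a file with no frontmatter, which matches the unscoped case where the rule loads in every session.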
### Step 6: Clean up auto-memory
After promoting, remove or mark the original entry in MEMORY.md:
```bash
# Show what will be removed
grep -n "<pattern>" "$MEMORY_DIR/MEMORY.md"
```
Ask the user to confirm removal. Then edit MEMORY.md to remove the promoted entry. This frees space for new learnings.
### Step 7: Confirm
```
✅ Promoted to {{target}}
Rule: "{{distilled rule}}"
Source: MEMORY.md line {{n}} (removed)
MEMORY.md: {{lines}}/200 lines used
The pattern is now an enforced instruction. Claude will follow it in all future sessions.
```
## Promotion Decision Guide
### Promote when:
- Pattern appeared 3+ times in auto-memory
- You corrected Claude about it more than once
- It's a project convention that any contributor should know
- It prevents a recurring mistake
### Don't promote when:
- It's a one-time debugging note (leave in auto-memory)
- It's session-specific context (session memory handles this)
- It might change soon (e.g., during a migration)
- It's already covered by existing rules
### CLAUDE.md vs .claude/rules/
| Use CLAUDE.md for | Use .claude/rules/ for |
|---|---|
| Global project rules | File-type-specific patterns |
| Build commands | Testing conventions |
| Architecture decisions | API design rules |
| Team conventions | Framework-specific gotchas |
## Tips
- Keep CLAUDE.md under 200 lines — use rules/ for overflow
- One rule per line is easier to maintain than paragraphs
- Include the concrete command, not just the concept
- Review promoted rules quarterly — remove what's no longer relevant
FILE:skills/remember/SKILL.md
---
name: "remember"
description: "Explicitly save important knowledge to auto-memory. Use when a discovery is too important to rely on auto-capture."
command: /si:remember
---
# /si:remember — Save Knowledge Explicitly
Writes an explicit entry to auto-memory when something is important enough that you don't want to rely on Claude noticing it automatically.
## Usage
```
/si:remember <what to remember>
/si:remember "This project's CI requires Node 20 LTS — v22 breaks the build"
/si:remember "The /api/auth endpoint uses a custom JWT library, not passport"
/si:remember "Reza prefers explicit error handling over try-catch-all patterns"
```
## When to Use
| Situation | Example |
|-----------|---------|
| Hard-won debugging insight | "CORS errors on /api/upload are caused by the CDN, not the backend" |
| Project convention not in CLAUDE.md | "We use barrel exports in src/components/" |
| Tool-specific gotcha | "Jest needs `--forceExit` flag or it hangs on DB tests" |
| Architecture decision | "We chose Drizzle over Prisma for type-safe SQL" |
| Preference you want Claude to learn | "Don't add comments explaining obvious code" |
## Workflow
### Step 1: Parse the knowledge
Extract from the user's input:
- **What**: The concrete fact or pattern
- **Why it matters**: Context (if provided)
- **Scope**: Project-specific or global?
### Step 2: Check for duplicates
```bash
MEMORY_DIR="$HOME/.claude/projects/$(pwd | sed 's|/|%2F|g; s|%2F|/|; s|^/||')/memory"
grep -ni "<keywords>" "$MEMORY_DIR/MEMORY.md" 2>/dev/null
```
If a similar entry exists:
- Show it to the user
- Ask: "Update the existing entry or add a new one?"
### Step 3: Write to MEMORY.md
Append to the end of `MEMORY.md`:
```markdown
- {{concise fact or pattern}}
```
Keep entries concise — one line when possible. Auto-memory entries don't need timestamps, IDs, or metadata. They're notes, not database records.
If MEMORY.md is over 180 lines, warn the user:
```
⚠️ MEMORY.md is at {{n}}/200 lines. Consider running /si:review to free space.
```
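Step 3 amounts to appending one bullet and checking the line count. The sketch below is illustrative: `append_entry` is a hypothetical helper that works on the file's text rather than the file itself, and it assumes the 180-line threshold is measured after the new entry is added.

```python
from typing import Optional, Tuple

WARN_THRESHOLD = 180   # warn when MEMORY.md grows past this many lines
STARTUP_LIMIT = 200    # only this many lines load at startup


def append_entry(memory_md: str, entry: str) -> Tuple[str, Optional[str]]:
    """Append a one-line bullet to MEMORY.md text; return (new_text, warning_or_None)."""
    # Make sure the existing text ends with a newline before appending.
    text = memory_md if not memory_md or memory_md.endswith("\n") else memory_md + "\n"
    # Collapse internal whitespace so the entry stays on a single line.
    updated = f"{text}- {' '.join(entry.split())}\n"
    count = len(updated.splitlines())
    warning = None
    if count > WARN_THRESHOLD:
        warning = (
            f"MEMORY.md is at {count}/{STARTUP_LIMIT} lines. "
            "Consider running /si:review to free space."
        )
    return updated, warning
```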
### Step 4: Suggest promotion
If the knowledge sounds like a rule (imperative, always/never, convention):
```
💡 This sounds like it could be a CLAUDE.md rule rather than a memory entry.
Rules are enforced with higher priority. Want to /si:promote it instead?
```
### Step 5: Confirm
```
✅ Saved to auto-memory
"{{entry}}"
MEMORY.md: {{n}}/200 lines
Claude will see this at the start of every session in this project.
```
## What NOT to use /si:remember for
- **Temporary context**: Use session memory or just tell Claude in conversation
- **Enforced rules**: Use `/si:promote` to write directly to CLAUDE.md
- **Cross-project knowledge**: Use `~/.claude/CLAUDE.md` for global rules
- **Sensitive data**: Never store credentials, tokens, or secrets in memory files
## Tips
- Be concise — one line beats a paragraph
- Include the concrete command or value, not just the concept
- ✅ "Build with `pnpm build`, tests with `pnpm test:e2e`"
- ❌ "The project uses pnpm for building and testing"
- If you're remembering the same thing twice, promote it to CLAUDE.md
FILE:skills/review/SKILL.md
---
name: "review"
description: "Analyze auto-memory for promotion candidates, stale entries, consolidation opportunities, and health metrics."
command: /si:review
---
# /si:review — Analyze Auto-Memory
Performs a comprehensive audit of Claude Code's auto-memory and produces actionable recommendations.
## Usage
```
/si:review # Full review
/si:review --quick # Summary only (counts + top 3 candidates)
/si:review --stale # Focus on stale/outdated entries
/si:review --candidates # Show only promotion candidates
```
## What It Does
### Step 1: Locate memory directory
```bash
# Find the project's auto-memory directory
MEMORY_DIR="$HOME/.claude/projects/$(pwd | sed 's|/|%2F|g; s|%2F|/|; s|^/||')/memory"
# Fallback: check common path patterns
# ~/.claude/projects/<user>/<project>/memory/
# ~/.claude/projects/<absolute-path>/memory/
# List all memory files
ls -la "$MEMORY_DIR"/
```
If the memory directory doesn't exist, report that auto-memory may be disabled and suggest checking with `/memory`.
### Step 2: Read and analyze MEMORY.md
Read the full `MEMORY.md` file. Count lines and check against the 200-line startup limit.
Analyze each entry for:
1. **Recurrence indicators**
   - Same concept appears multiple times (different wording)
   - References to "again" or "still" or "keeps happening"
   - Similar entries across topic files
2. **Staleness indicators**
   - References files that no longer exist (`find` to verify)
   - Mentions outdated tools, versions, or commands
   - Contradicts current CLAUDE.md rules
3. **Consolidation opportunities**
   - Multiple entries about the same topic (e.g., three lines about testing)
   - Entries that could merge into one concise rule
4. **Promotion candidates** — entries that meet ALL criteria:
   - Appeared in 2+ sessions (check wording patterns)
   - Not project-specific trivia (broadly useful)
   - Actionable (can be written as a concrete rule)
   - Not already in CLAUDE.md or `.claude/rules/`
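The recurrence and consolidation checks both come down to spotting entries that say nearly the same thing in different words. One rough way to approximate that is fuzzy string matching. The sketch below uses `difflib.SequenceMatcher` from the standard library. The `0.6` threshold and the helper names `normalize` and `consolidation_groups` are illustrative assumptions, not part of the skill.

```python
from difflib import SequenceMatcher


def normalize(entry: str) -> str:
    """Lower-case an entry and strip list markers and extra whitespace."""
    return " ".join(entry.lower().lstrip("-* ").split())


def consolidation_groups(entries, threshold=0.6):
    """Group entries whose normalized text is similar enough to merge.

    Each entry is compared with the first member of every existing group.
    Only groups with two or more members are returned.
    """
    groups = []
    for entry in entries:
        for group in groups:
            ratio = SequenceMatcher(None, normalize(entry), normalize(group[0])).ratio()
            if ratio >= threshold:
                group.append(entry)
                break
        else:
            groups.append([entry])
    return [group for group in groups if len(group) > 1]


if __name__ == "__main__":
    sample = [
        "- Run tests with pnpm test",
        "- Deploy via GitHub Actions",
        "- Run the tests with pnpm test",
    ]
    for group in consolidation_groups(sample):
        print(group)
```

String similarity only catches rewording. Entries that express the same rule with different vocabulary still need a manual read.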
### Step 3: Read topic files
If `MEMORY.md` references additional files, or the directory contains them (`debugging.md`, `patterns.md`, etc.):
- Read each one
- Cross-reference with MEMORY.md for duplicates
- Check for entries that belong in the main file (high value) vs. topic files (details)
### Step 4: Cross-reference with CLAUDE.md
Read the project's `CLAUDE.md` (if it exists) and compare:
- Are there MEMORY.md entries that duplicate CLAUDE.md rules? (→ remove from memory)
- Are there MEMORY.md entries that contradict CLAUDE.md? (→ flag conflict)
- Are there MEMORY.md patterns not yet in CLAUDE.md that should be? (→ promotion candidate)
Also check `.claude/rules/` directory for existing scoped rules.
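The duplicate check in this step can be approximated mechanically. Contradictions and promotion decisions still need judgment. A minimal sketch, assuming both files keep their rules as markdown bullets and that a case-insensitive exact match counts as a duplicate (`bullets` and `cross_reference` are hypothetical helper names):

```python
from __future__ import annotations


def bullets(markdown: str) -> list[str]:
    """Return the text of every markdown bullet line ("- " or "* ")."""
    items = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(("- ", "* ")):
            items.append(stripped[2:].strip())
    return items


def cross_reference(memory_md: str, claude_md: str) -> dict[str, list[str]]:
    """Split memory bullets into CLAUDE.md duplicates and everything else."""
    known = {item.casefold() for item in bullets(claude_md)}
    result: dict[str, list[str]] = {"duplicates": [], "not_in_claude_md": []}
    for entry in bullets(memory_md):
        key = "duplicates" if entry.casefold() in known else "not_in_claude_md"
        result[key].append(entry)
    return result


if __name__ == "__main__":
    memory = "# Memory\n- Use pnpm, not npm\n- Tests need a running Redis\n"
    claude = "## Rules\n- use pnpm, not npm\n"
    print(cross_reference(memory, claude))
```

Entries under `duplicates` can be removed from memory. Entries under `not_in_claude_md` are the pool to review for conflicts and promotion candidates.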
### Step 5: Generate report
Output format:
```
📊 Auto-Memory Review
Memory Health:
MEMORY.md: {{lines}}/200 lines ({{percent}}%)
Topic files: {{count}} ({{names}})
CLAUDE.md: {{lines}} lines
Rules: {{count}} files in .claude/rules/
🎯 Promotion Candidates ({{count}}):
1. "{{pattern}}" — seen {{n}}x, applies broadly
→ Suggest: {{target}} (CLAUDE.md / .claude/rules/{{name}}.md)
2. ...
🗑️ Stale Entries ({{count}}):
1. Line {{n}}: "{{entry}}" — {{reason}}
2. ...
🔄 Consolidation ({{count}} groups):
1. Lines {{a}}, {{b}}, {{c}} all about {{topic}} → merge into 1 entry
2. ...
⚠️ Conflicts ({{count}}):
1. MEMORY.md line {{n}} contradicts CLAUDE.md: {{detail}}
💡 Recommendations:
- {{actionable suggestion}}
- {{actionable suggestion}}
```
## When to Use
- After completing a major feature or debugging session
- When `/si:status` shows MEMORY.md is over 150 lines
- Weekly during active development
- Before starting a new project phase
- After onboarding a new team member (review what Claude learned)
## Tips
- Run `/si:review --quick` frequently (low overhead)
- Full review is most valuable when MEMORY.md is getting crowded
- Act on promotion candidates promptly — they're proven patterns
- Don't hesitate to delete stale entries — auto-memory will re-learn if needed
FILE:skills/status/SKILL.md
---
name: "status"
description: "Memory health dashboard showing line counts, topic files, capacity, stale entries, and recommendations."
command: /si:status
---
# /si:status — Memory Health Dashboard
Quick overview of your project's memory state across all memory systems.
## Usage
```
/si:status # Full dashboard
/si:status --brief # One-line summary
```
## What It Reports
### Step 1: Locate all memory files
```bash
# Auto-memory directory
MEMORY_DIR="$HOME/.claude/projects/$(pwd | sed 's|/|%2F|g; s|%2F|/|; s|^/||')/memory"
# Count lines in MEMORY.md
wc -l "$MEMORY_DIR/MEMORY.md" 2>/dev/null || echo "0"
# List topic files
ls "$MEMORY_DIR/"*.md 2>/dev/null | grep -v MEMORY.md
# CLAUDE.md
wc -l ./CLAUDE.md 2>/dev/null || echo "0"
wc -l ~/.claude/CLAUDE.md 2>/dev/null || echo "0"
# Rules directory
ls .claude/rules/*.md 2>/dev/null | wc -l
```
### Step 2: Analyze capacity
| Metric | Healthy | Warning | Critical |
|--------|---------|---------|----------|
| MEMORY.md lines | < 120 | 120-180 | > 180 |
| CLAUDE.md lines | < 150 | 150-200 | > 200 |
| Topic files | 0-3 | 4-6 | > 6 |
| Stale entries | 0 | 1-3 | > 3 |
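The thresholds in the table translate directly into a small lookup. A minimal sketch in Python: the metric keys and the `classify` and `overall` helpers are hypothetical names, and taking the worst individual level as the overall status is an assumption, since the skill does not say how to combine metrics.

```python
# (first value that counts as "warning", last value that still counts as "warning")
THRESHOLDS = {
    "memory_lines": (120, 180),
    "claude_lines": (150, 200),
    "topic_files": (4, 6),
    "stale_entries": (1, 3),
}
LEVELS = ["healthy", "warning", "critical"]


def classify(metric: str, value: int) -> str:
    """Map a metric value to healthy / warning / critical per the table above."""
    warn_from, warn_to = THRESHOLDS[metric]
    if value > warn_to:
        return "critical"
    if value >= warn_from:
        return "warning"
    return "healthy"


def overall(metrics: dict) -> str:
    """Assumption: the overall status is the worst level among all metrics."""
    return max((classify(name, value) for name, value in metrics.items()), key=LEVELS.index)


if __name__ == "__main__":
    snapshot = {"memory_lines": 90, "claude_lines": 160, "topic_files": 2, "stale_entries": 0}
    print(overall(snapshot))
```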
### Step 3: Quick stale check
For each MEMORY.md entry that references a file path:
```bash
# Verify referenced files still exist
grep -oE '[a-zA-Z0-9_/.-]+\.(ts|js|py|md|json|yaml|yml)' "$MEMORY_DIR/MEMORY.md" | while IFS= read -r f; do
  # Paths are resolved relative to the current directory (the project root)
  [ ! -f "$f" ] && echo "STALE: $f"
done
```
### Step 4: Output
```
📊 Memory Status
Auto-Memory (MEMORY.md):
Lines: {{n}}/200 ({{bar}}) {{emoji}}
Topic files: {{count}} ({{names}})
Last updated: {{date}}
Project Rules:
CLAUDE.md: {{n}} lines
Rules: {{count}} files in .claude/rules/
User global: {{n}} lines (~/.claude/CLAUDE.md)
Health:
Capacity: {{healthy/warning/critical}}
Stale refs: {{count}} (files no longer exist)
Duplicates: {{count}} (entries repeated across files)
{{if recommendations}}
💡 Recommendations:
- {{recommendation}}
{{endif}}
```
### Brief mode
```
/si:status --brief
```
Output: `📊 Memory: {{n}}/200 lines | {{count}} rules | {{status_emoji}} {{status_word}}`
## Interpretation
- **Green (< 60%)**: Plenty of room. Auto-memory is working well.
- **Yellow (60-90%)**: Getting full. Consider running `/si:review` to promote or clean up.
- **Red (> 90%)**: Near capacity. Entries beyond the 200-line startup limit may not be loaded at session start. Run `/si:review` now.
## Tips
- Run `/si:status --brief` as a quick check anytime
- If capacity is yellow+, run `/si:review` to identify promotion candidates
- Stale entries waste space — delete references to files that no longer exist
- Topic files are fine — Claude creates them to keep MEMORY.md under 200 lines
FILE:templates/rule-template.md
---
paths:
- "{{glob-pattern}}"
---
# {{Topic}} Rules
## Conventions
- {{convention 1}}
- {{convention 2}}
## Patterns
- {{preferred pattern with example}}
- {{anti-pattern to avoid}}
## Commands
- {{relevant command}}: `{{command}}`
FILE:templates/skill-template.md
---
name: {{skill-name}}
description: "{{One-line description}}. Use when: {{trigger conditions}}."
---
# {{Skill Title}}
> {{One-line value proposition}}
## Quick Reference
| Problem | Solution |
|---------|----------|
| {{error/symptom 1}} | {{fix 1}} |
| {{error/symptom 2}} | {{fix 2}} |
## The Problem
{{2-3 sentences explaining what goes wrong and why.
Include the exact error message if applicable.}}
## Solutions
### Option 1: {{Name}} (Recommended)
{{Step-by-step instructions.}}
```{{language}}
{{code example}}
```
### Option 2: {{Alternative}}
{{When Option 1 doesn't apply.}}
```{{language}}
{{code example}}
```
## Trade-offs
| Approach | Pros | Cons |
|----------|------|------|
| Option 1 | {{pros}} | {{cons}} |
| Option 2 | {{pros}} | {{cons}} |
## Edge Cases
- {{edge case and how to handle it}}
## Related
- {{link to official docs}}
---
name: "google-workspace-cli"
description: "Google Workspace administration via the gws CLI. Install, authenticate, and automate Gmail, Drive, Sheets, Calendar, Docs, Chat, and Tasks. Run security audits, execute 43 built-in recipes, and use 10 persona bundles. Use for Google Workspace admin, gws CLI setup, Gmail automation, Drive management, or Calendar scheduling."
---
# Google Workspace CLI
Expert guidance and automation for Google Workspace administration using the open-source `gws` CLI. Covers installation, authentication, 18+ service APIs, 43 built-in recipes, and 10 persona bundles for role-based workflows.
---
## Quick Start
### Check Installation
```bash
# Verify gws is installed and authenticated
python3 scripts/gws_doctor.py
```
### Send an Email
```bash
gws gmail users.messages send me --to "[email protected]" \
  --subject "Weekly Update" --body "Here's this week's summary..."
```
### List Drive Files
```bash
gws drive files list --json --limit 20 | python3 scripts/output_analyzer.py --select "name,mimeType,modifiedTime" --format table
```
---
## Installation
### npm (recommended)
```bash
npm install -g @anthropic/gws
gws --version
```
### Cargo (from source)
```bash
cargo install gws-cli
gws --version
```
### Pre-built Binaries
Download from [github.com/googleworkspace/cli/releases](https://github.com/googleworkspace/cli/releases) for macOS, Linux, or Windows.
### Verify Installation
```bash
python3 scripts/gws_doctor.py
# Checks: PATH, version, auth status, service connectivity
```
---
## Authentication
### OAuth Setup (Interactive)
```bash
# Step 1: Create Google Cloud project and OAuth credentials
python3 scripts/auth_setup_guide.py --guide oauth
# Step 2: Run auth setup
gws auth setup
# Step 3: Validate
gws auth status --json
```
### Service Account (Headless/CI)
```bash
# Generate setup instructions
python3 scripts/auth_setup_guide.py --guide service-account
# Configure with key file
export GWS_SERVICE_ACCOUNT_KEY=/path/to/key.json
export [email protected]
gws auth status
```
### Environment Variables
```bash
# Generate .env template
python3 scripts/auth_setup_guide.py --generate-env
```
| Variable | Purpose |
|----------|---------|
| `GWS_CLIENT_ID` | OAuth client ID |
| `GWS_CLIENT_SECRET` | OAuth client secret |
| `GWS_TOKEN_PATH` | Custom token storage path |
| `GWS_SERVICE_ACCOUNT_KEY` | Service account JSON key path |
| `GWS_DELEGATED_USER` | User to impersonate (service accounts) |
| `GWS_DEFAULT_FORMAT` | Default output format (json/ndjson/table) |
### Validate Authentication
```bash
python3 scripts/auth_setup_guide.py --validate --json
# Tests each service endpoint
```
---
## Workflow 1: Gmail Automation
**Goal:** Automate email operations — send, search, label, and filter management.
### Send and Reply
```bash
# Send a new email
gws gmail users.messages send me --to "[email protected]" \
  --subject "Proposal" --body "Please find attached..." \
  --attachment proposal.pdf
# Reply to a thread
gws gmail users.messages reply me --thread-id <THREAD_ID> \
  --body "Thanks for your feedback..."
# Forward a message
gws gmail users.messages forward me --message-id <MSG_ID> \
  --to "[email protected]"
```
### Search and Filter
```bash
# Search emails
gws gmail users.messages list me --query "from:[email protected] after:2025/01/01" --json \
  | python3 scripts/output_analyzer.py --count
# List labels
gws gmail users.labels list me --json
# Create a filter
gws gmail users.settings.filters create me \
  --criteria '{"from":"[email protected]"}' \
  --action '{"addLabelIds":["Label_123"],"removeLabelIds":["INBOX"]}'
```
### Bulk Operations
```bash
# Archive all read emails older than 30 days
gws gmail users.messages list me --query "is:read older_than:30d" --json \
  | python3 scripts/output_analyzer.py --select "id" --format json \
  | xargs -I {} gws gmail users.messages modify me {} --removeLabelIds INBOX
```
---
## Workflow 2: Drive & Sheets
**Goal:** Manage files, create spreadsheets, configure sharing, and export data.
### File Operations
```bash
# List files
gws drive files list --json --limit 50 \
  | python3 scripts/output_analyzer.py --select "name,mimeType,size" --format table
# Upload a file
gws drive files create --name "Q1 Report" --upload report.pdf \
  --parents <FOLDER_ID>
# Create a Google Sheet
gws sheets spreadsheets create --title "Budget 2026" --json
# Download/export
gws drive files export <FILE_ID> --mime "application/pdf" --output report.pdf
```
### Sharing
```bash
# Share with user
gws drive permissions create <FILE_ID> \
  --type user --role writer --emailAddress "[email protected]"
# Share with domain (view only)
gws drive permissions create <FILE_ID> \
  --type domain --role reader --domain "company.com"
# List who has access
gws drive permissions list <FILE_ID> --json
```
### Sheets Data
```bash
# Read a range
gws sheets spreadsheets.values get <SHEET_ID> --range "Sheet1!A1:D10" --json
# Write data
gws sheets spreadsheets.values update <SHEET_ID> --range "Sheet1!A1" \
  --values '[["Name","Score"],["Alice",95],["Bob",87]]'
# Append rows
gws sheets spreadsheets.values append <SHEET_ID> --range "Sheet1!A1" \
  --values '[["Charlie",92]]'
```
---
## Workflow 3: Calendar & Meetings
**Goal:** Schedule events, find available times, and generate standup reports.
### Event Management
```bash
# Create an event
gws calendar events insert primary \
  --summary "Sprint Planning" \
  --start "2026-03-15T10:00:00" --end "2026-03-15T11:00:00" \
  --attendees "[email protected]" \
  --location "Conference Room A"
# List upcoming events
gws calendar events list primary --timeMin "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --maxResults 10 --json
# Quick event (natural language)
gws helpers quick-event "Lunch with Sarah tomorrow at noon"
```
### Find Available Time
```bash
# Check free/busy for multiple people
gws helpers find-time \
  --attendees "[email protected],[email protected],[email protected]" \
  --duration 60 --within "2026-03-15,2026-03-19" --json
```
### Standup Report
```bash
# Generate daily standup from calendar + tasks
gws recipes standup-report --json \
  | python3 scripts/output_analyzer.py --format table
# Meeting prep (agenda + attendee info)
gws recipes meeting-prep --event-id <EVENT_ID>
```
---
## Workflow 4: Security Audit
**Goal:** Audit Google Workspace security configuration and generate remediation commands.
### Run Full Audit
```bash
# Full audit across all services
python3 scripts/workspace_audit.py --json
# Audit specific services
python3 scripts/workspace_audit.py --services gmail,drive,calendar
# Demo mode (no gws required)
python3 scripts/workspace_audit.py --demo
```
### Audit Checks
| Area | Check | Risk |
|------|-------|------|
| Drive | External sharing enabled | Data exfiltration |
| Gmail | Auto-forwarding rules | Data exfiltration |
| Gmail | DMARC/SPF/DKIM records | Email spoofing |
| Calendar | Default sharing visibility | Information leak |
| OAuth | Third-party app grants | Unauthorized access |
| Admin | Super admin count | Privilege escalation |
| Admin | 2-Step verification enforcement | Account takeover |
### Review and Remediate
```bash
# Review findings
python3 scripts/workspace_audit.py --json | python3 scripts/output_analyzer.py \
  --filter "status=FAIL" --select "area,check,remediation"
# Execute remediation (example: restrict external sharing)
gws drive about get --json  # Check current settings
# Follow remediation commands from audit output
```
---
## Python Tools
| Script | Purpose | Usage |
|--------|---------|-------|
| `gws_doctor.py` | Pre-flight diagnostics | `python3 scripts/gws_doctor.py [--json] [--services gmail,drive]` |
| `auth_setup_guide.py` | Guided auth setup | `python3 scripts/auth_setup_guide.py --guide oauth` |
| `gws_recipe_runner.py` | Recipe catalog & runner | `python3 scripts/gws_recipe_runner.py --list [--persona pm]` |
| `workspace_audit.py` | Security/config audit | `python3 scripts/workspace_audit.py [--json] [--demo]` |
| `output_analyzer.py` | JSON/NDJSON analysis | `gws ... --json \| python3 scripts/output_analyzer.py --count` |

All scripts are stdlib-only, support `--json` output, and include demo mode with embedded sample data.
---
## Best Practices
### Security
1. Use OAuth with minimal scopes — request only what each workflow needs
2. Store tokens in the system keyring, never in plain text files
3. Rotate service account keys every 90 days
4. Audit third-party OAuth app grants quarterly
5. Use `--dry-run` before bulk destructive operations
### Automation
1. Pipe `--json` output through `output_analyzer.py` for filtering and aggregation
2. Use recipes for multi-step operations instead of chaining raw commands
3. Select a persona bundle to scope recipes to your role
4. Use NDJSON format (`--format ndjson`) for streaming large result sets
5. Set `GWS_DEFAULT_FORMAT=json` in your shell profile for scripting
### Performance
1. Use `--fields` to request only needed fields (reduces payload size)
2. Use `--limit` to cap results when browsing
3. Use `--page-all` only when you need complete datasets
4. Batch operations with recipes rather than individual API calls
5. Cache frequently accessed data (e.g., label IDs, folder IDs) in variables
---
## Limitations
| Constraint | Impact |
|------------|--------|
| OAuth tokens expire after 1 hour | Re-auth needed for long-running scripts |
| API rate limits (per-user, per-service) | Bulk operations may hit 429 errors |
| Scope requirements vary by service | Must request correct scopes during auth |
| Pre-v1.0 CLI status | Breaking changes possible between releases |
| Google Cloud project required | Free, but requires setup in Cloud Console |
| Admin API needs admin privileges | Some audit checks require Workspace Admin role |
### Required Scopes by Service
```bash
# List scopes for specific services
python3 scripts/auth_setup_guide.py --scopes gmail,drive,calendar,sheets
```
| Service | Key Scopes |
|---------|-----------|
| Gmail | `gmail.modify`, `gmail.send`, `gmail.labels` |
| Drive | `drive.file`, `drive.metadata.readonly` |
| Sheets | `spreadsheets` |
| Calendar | `calendar`, `calendar.events` |
| Admin | `admin.directory.user.readonly`, `admin.directory.group` |
| Tasks | `tasks` |

FILE:assets/persona-profiles.md
# Google Workspace CLI Persona Profiles
10 role-based bundles that scope recipes and commands to your daily workflow.
---
## 1. Executive Assistant
**Description:** Managing schedules, emails, and communications for executives.
**Top Commands:**
- `gws helpers morning-briefing` — Start the day with schedule + inbox overview
- `gws helpers find-time` — Find available slots for meetings
- `gws helpers meeting-prep --event-id <id>` — Prepare meeting agenda
- `gws gmail users.messages send me` — Send emails on behalf
- `gws helpers eod-wrap` — End of day summary

**Recommended Recipes:** morning-briefing, today-schedule, find-time, send-email, reply-to-thread, meeting-prep, eod-wrap, quick-event, inbox-zero, standup-report
**Daily Workflow:**
1. Run `morning-briefing` at 8:00 AM
2. Process inbox with `inbox-zero`
3. Schedule meetings with `find-time` + `create-event`
4. Prep for meetings with `meeting-prep`
5. Close day with `eod-wrap`
---
## 2. Project Manager
**Description:** Tracking tasks, meetings, and project deliverables.
**Top Commands:**
- `gws recipes standup-report` — Generate standup updates
- `gws helpers find-time` — Schedule sprint ceremonies
- `gws tasks tasks insert` — Create and assign tasks
- `gws sheets spreadsheets.values get` — Read project trackers
- `gws recipes project-status` — Aggregate project status

**Recommended Recipes:** standup-report, create-event, find-time, task-create, task-progress, project-status, weekly-summary, share-folder, sheet-read, morning-briefing
**Daily Workflow:**
1. Run `standup-report` before standup
2. Update project tracker via `sheet-write`
3. Create action items with `task-create`
4. Run `weekly-summary` on Fridays
5. Share updates via `chat-message`
---
## 3. HR
**Description:** Managing people, onboarding, and team communications.
**Top Commands:**
- `gws admin users list` — List all domain users
- `gws admin users get <email>` — Look up employee details
- `gws docs documents create` — Create onboarding docs
- `gws drive permissions create` — Share folders with new hires
- `gws people people.connections list` — Export contact directory

**Recommended Recipes:** list-users, user-info, send-email, create-event, create-doc, share-folder, chat-message, list-groups, export-contacts, today-schedule
**Daily Workflow:**
1. Check new hire onboarding queue
2. Create welcome docs with `create-doc`
3. Set up 1:1s with `create-event`
4. Share team folders with `share-folder`
5. Send announcements via `send-email`
---
## 4. Sales
**Description:** Managing client communications, proposals, and scheduling.
**Top Commands:**
- `gws gmail users.messages send me` — Send proposals and follow-ups
- `gws gmail users.messages list me --query` — Search client conversations
- `gws helpers find-time` — Schedule client meetings
- `gws docs documents create` — Create proposals
- `gws sheets spreadsheets.values update` — Update pipeline tracker

**Recommended Recipes:** send-email, search-emails, create-event, find-time, create-doc, share-file, sheet-read, sheet-write, export-file, morning-briefing
**Daily Workflow:**
1. Run `morning-briefing` for meeting overview
2. Search emails for client updates
3. Update pipeline in Sheets
4. Send proposals via `send-email` + `share-file`
5. Schedule follow-ups with `create-event`
---
## 5. IT Admin
**Description:** Managing Workspace configuration, security, and user administration.
**Top Commands:**
- `gws admin users list --domain` — Audit user accounts
- `gws admin activities list login` — Monitor login activity
- `gws admin groups list` — Manage groups
- `python3 workspace_audit.py` — Run security audit
- `gws drive files list --orderBy "quotaBytesUsed desc"` — Find storage hogs

**Recommended Recipes:** list-users, list-groups, user-info, audit-logins, drive-activity, find-large-files, cleanup-trash, label-manager, filter-setup, share-folder
**Daily Workflow:**
1. Check `audit-logins` for suspicious activity
2. Run `workspace_audit.py` weekly
3. Process user provisioning requests
4. Monitor storage with `find-large-files`
5. Review group memberships
---
## 6. Developer
**Description:** Using Workspace APIs for automation and data integration.
**Top Commands:**
- `gws sheets spreadsheets.values get` — Read config/data from Sheets
- `gws sheets spreadsheets.values update` — Write results to Sheets
- `gws drive files create --upload` — Upload build artifacts
- `gws chat spaces.messages create` — Post deployment notifications
- `gws tasks tasks insert` — Create tasks from CI/CD

**Recommended Recipes:** sheet-read, sheet-write, sheet-append, upload-file, create-doc, chat-message, task-create, list-files, export-file, send-email
**Daily Workflow:**
1. Read config from Sheets API
2. Run automated reports to Sheets
3. Post updates to Chat spaces
4. Upload artifacts to Drive
5. Create tasks for bugs/issues
---
## 7. Marketing
**Description:** Managing campaigns, content creation, and team coordination.
**Top Commands:**
- `gws docs documents create` — Draft blog posts and briefs
- `gws drive files create --upload` — Upload creative assets
- `gws sheets spreadsheets.values append` — Log campaign metrics
- `gws gmail users.messages send me` — Send campaign emails
- `gws chat spaces.messages create` — Coordinate with team

**Recommended Recipes:** send-email, create-doc, share-file, upload-file, create-sheet, sheet-write, chat-message, create-event, email-stats, weekly-summary
**Daily Workflow:**
1. Check `email-stats` for campaign performance
2. Create content in Docs
3. Upload assets to shared Drive folders
4. Update metrics in Sheets
5. Coordinate launches via Chat
---
## 8. Finance
**Description:** Managing spreadsheets, financial reports, and data analysis.
**Top Commands:**
- `gws sheets spreadsheets.values get` — Pull financial data
- `gws sheets spreadsheets.values update` — Update forecasts
- `gws sheets spreadsheets create` — Create new reports
- `gws drive files export` — Export reports as PDF
- `gws drive permissions create` — Share with auditors

**Recommended Recipes:** sheet-read, sheet-write, sheet-append, create-sheet, export-file, share-file, send-email, find-large-files, drive-activity, weekly-summary
**Daily Workflow:**
1. Pull latest data into Sheets
2. Update financial models
3. Generate PDF reports with `export-file`
4. Share reports with stakeholders
5. Weekly summary for leadership
---
## 9. Legal
**Description:** Managing documents, contracts, and compliance.
**Top Commands:**
- `gws docs documents create` — Draft contracts
- `gws drive files export` — Export final versions as PDF
- `gws drive permissions create` — Manage document access
- `gws gmail users.messages list me --query` — Search for compliance emails
- `gws admin activities list` — Audit trail for compliance

**Recommended Recipes:** create-doc, share-file, export-file, search-emails, send-email, upload-file, list-files, drive-activity, audit-logins, find-large-files
**Daily Workflow:**
1. Draft and review documents
2. Search email for contract references
3. Export finalized docs as PDF
4. Set precise sharing permissions
5. Maintain audit trail
---
## 10. Customer Support
**Description:** Managing customer communications and ticket tracking.
**Top Commands:**
- `gws gmail users.messages list me --query` — Search customer emails
- `gws gmail users.messages reply me` — Reply to tickets
- `gws gmail users.labels create` — Organize by ticket status
- `gws tasks tasks insert` — Create follow-up tasks
- `gws chat spaces.messages create` — Escalate to team

**Recommended Recipes:** search-emails, send-email, reply-to-thread, label-manager, filter-setup, task-create, chat-message, unread-digest, inbox-zero, morning-briefing
**Daily Workflow:**
1. Run `morning-briefing` for ticket overview
2. Process inbox with label-based triage
3. Reply to open tickets
4. Escalate via Chat for urgent issues
5. Create follow-up tasks for pending items

FILE:assets/workspace-config.json
{
  "_comment": "Google Workspace CLI automation config template. Copy and customize for your environment.",
  "auth": {
    "method": "oauth",
    "client_id": "",
    "client_secret": "",
    "token_path": "~/.config/gws/token.json",
    "service_account_key": "",
    "delegated_user": ""
  },
  "defaults": {
    "output_format": "json",
    "pagination_limit": 100,
    "timeout_ms": 30000,
    "log_level": "warn"
  },
  "persona": "developer",
  "scopes": [
    "gmail.modify",
    "gmail.send",
    "drive.file",
    "drive.metadata.readonly",
    "spreadsheets",
    "calendar",
    "calendar.events",
    "tasks"
  ],
  "scheduled_tasks": [
    {
      "name": "morning-briefing",
      "recipe": "morning-briefing",
      "schedule": "0 8 * * 1-5",
      "output": "~/workspace-reports/morning-{date}.json"
    },
    {
      "name": "eod-wrap",
      "recipe": "eod-wrap",
      "schedule": "0 17 * * 1-5",
      "output": "~/workspace-reports/eod-{date}.json"
    },
    {
      "name": "weekly-summary",
      "recipe": "weekly-summary",
      "schedule": "0 9 * * 5",
      "output": "~/workspace-reports/weekly-{date}.json"
    },
    {
      "name": "security-audit",
      "command": "python3 scripts/workspace_audit.py --json",
      "schedule": "0 10 * * 1",
      "output": "~/workspace-reports/audit-{date}.json"
    }
  ],
  "aliases": {
    "inbox": "gws gmail users.messages list me --query 'is:inbox' --limit 20 --json",
    "unread": "gws gmail users.messages list me --query 'is:unread' --limit 20 --json",
    "files": "gws drive files list --limit 20 --json",
    "events": "gws calendar events list primary --timeMin $(date -u +%Y-%m-%dT%H:%M:%SZ) --maxResults 10 --json",
    "tasks": "gws tasks tasks list @default --json"
  }
}
FILE:references/gws-command-reference.md
# Google Workspace CLI Command Reference
Comprehensive reference for the `gws` CLI covering 18 services, 22 helper commands, global flags, and environment variables.
---
## Global Flags
| Flag | Description |
|------|-------------|
| `--json` | Output as JSON |
| `--format ndjson` | Output as newline-delimited JSON |
| `--dry-run` | Show what would be done without executing |
| `--limit <n>` | Maximum results to return |
| `--page-all` | Fetch all pages of results |
| `--fields <spec>` | Partial response field mask |
| `--quiet` | Suppress non-error output |
| `--verbose` | Verbose debug output |
| `--timeout <ms>` | Request timeout in milliseconds |

---
## Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `GWS_CLIENT_ID` | OAuth client ID | — |
| `GWS_CLIENT_SECRET` | OAuth client secret | — |
| `GWS_TOKEN_PATH` | Token storage location | `~/.config/gws/token.json` |
| `GWS_SERVICE_ACCOUNT_KEY` | Service account JSON key path | — |
| `GWS_DELEGATED_USER` | User to impersonate (service accounts) | — |
| `GWS_DEFAULT_FORMAT` | Default output format | `text` |
| `GWS_PAGINATION_LIMIT` | Default pagination limit | `100` |
| `GWS_LOG_LEVEL` | Logging level (debug/info/warn/error) | `warn` |

---
## Services
### Gmail
```bash
gws gmail users.messages list me --query "<query>" --json
gws gmail users.messages get me <messageId> --json
gws gmail users.messages send me --to <email> --subject <subj> --body <body>
gws gmail users.messages reply me --thread-id <id> --body <body>
gws gmail users.messages forward me --message-id <id> --to <email>
gws gmail users.messages modify me <id> --addLabelIds <label> --removeLabelIds INBOX
gws gmail users.messages trash me <id>
gws gmail users.labels list me --json
gws gmail users.labels create me --name <name>
gws gmail users.settings.filters create me --criteria <json> --action <json>
gws gmail users.settings.forwardingAddresses list me --json
gws gmail users getProfile me --json
```
### Google Drive
```bash
gws drive files list --json --limit <n>
gws drive files list --query "name contains '<term>'" --json
gws drive files list --parents <folderId> --json
gws drive files get <fileId> --json
gws drive files create --name <name> --upload <path> --parents <folderId>
gws drive files create --name <name> --mimeType application/vnd.google-apps.folder
gws drive files update <fileId> --upload <path>
gws drive files delete <fileId>
gws drive files export <fileId> --mime <mimeType> --output <path>
gws drive files copy <fileId> --name <newName>
gws drive permissions list <fileId> --json
gws drive permissions create <fileId> --type <user|group|domain> --role <reader|writer|owner> --emailAddress <email>
gws drive permissions delete <fileId> <permissionId>
gws drive about get --json
gws drive files emptyTrash
```
### Google Sheets
```bash
gws sheets spreadsheets create --title <title> --json
gws sheets spreadsheets get <spreadsheetId> --json
gws sheets spreadsheets.values get <spreadsheetId> --range <range> --json
gws sheets spreadsheets.values update <spreadsheetId> --range <range> --values <json>
gws sheets spreadsheets.values append <spreadsheetId> --range <range> --values <json>
gws sheets spreadsheets.values clear <spreadsheetId> --range <range>
gws sheets spreadsheets.values batchGet <spreadsheetId> --ranges <range1>,<range2> --json
gws sheets spreadsheets.values batchUpdate <spreadsheetId> --data <json>
```
### Google Calendar
```bash
gws calendar calendarList list --json
gws calendar calendarList get <calendarId> --json
gws calendar events list <calendarId> --timeMin <datetime> --timeMax <datetime> --json
gws calendar events get <calendarId> <eventId> --json
gws calendar events insert <calendarId> --summary <title> --start <datetime> --end <datetime> --attendees <emails>
gws calendar events update <calendarId> <eventId> --summary <title>
gws calendar events patch <calendarId> <eventId> --start <datetime> --end <datetime>
gws calendar events delete <calendarId> <eventId>
gws calendar freebusy query --timeMin <start> --timeMax <end> --items <calendarId1>,<calendarId2> --json
```
### Google Docs
```bash
gws docs documents create --title <title> --json
gws docs documents get <documentId> --json
gws docs documents batchUpdate <documentId> --requests <json>
```
### Google Slides
```bash
gws slides presentations create --title <title> --json
gws slides presentations get <presentationId> --json
gws slides presentations.pages get <presentationId> <pageId> --json
gws slides presentations.pages getThumbnail <presentationId> <pageId> --json
```
### Google Chat
```bash
gws chat spaces list --json
gws chat spaces get <spaceName> --json
gws chat spaces.messages create <spaceName> --text <message>
gws chat spaces.messages list <spaceName> --json
gws chat spaces.messages get <messageName> --json
gws chat spaces.members list <spaceName> --json
```
### Google Tasks
```bash
gws tasks tasklists list --json
gws tasks tasklists get <tasklistId> --json
gws tasks tasklists insert --title <title> --json
gws tasks tasks list <tasklistId> --json
gws tasks tasks get <tasklistId> <taskId> --json
gws tasks tasks insert <tasklistId> --title <title> --due <datetime>
gws tasks tasks update <tasklistId> <taskId> --status completed
gws tasks tasks delete <tasklistId> <taskId>
```
### Admin SDK (Directory)
```bash
gws admin users list --domain <domain> --json
gws admin users get <email> --json
gws admin users insert --primaryEmail <email> --name.givenName <first> --name.familyName <last>
gws admin users update <email> --suspended true
gws admin groups list --domain <domain> --json
gws admin groups get <email> --json
gws admin groups insert --email <email> --name <name>
gws admin groups.members list <groupEmail> --json
gws admin groups.members insert <groupEmail> --email <memberEmail> --role MEMBER
gws admin orgunits list --customerId my_customer --json
```
### Google Groups
```bash
gws groups groups list --domain <domain> --json
gws groups groups get <email> --json
gws groups memberships list <groupEmail> --json
```
### Google People (Contacts)
```bash
gws people people.connections list me --personFields names,emailAddresses --json
gws people people get <resourceName> --personFields names,emailAddresses,phoneNumbers --json
gws people people searchContacts --query <term> --readMask names,emailAddresses --json
```
### Google Meet
```bash
gws meet spaces create --json
gws meet spaces get <spaceName> --json
gws meet conferenceRecords list --json
```
### Google Classroom
```bash
gws classroom courses list --json
gws classroom courses get <courseId> --json
gws classroom courses.courseWork list <courseId> --json
gws classroom courses.students list <courseId> --json
```
### Google Forms
```bash
gws forms forms get <formId> --json
gws forms forms.responses list <formId> --json
```
### Google Keep
```bash
gws keep notes list --json
gws keep notes get <noteId> --json
```
### Google Sites
```bash
gws sites sites list --json
gws sites sites get <siteId> --json
```
### Google Vault
```bash
gws vault matters list --json
gws vault matters get <matterId> --json
gws vault matters.holds list <matterId> --json
```
### Admin Reports / Activities
```bash
gws admin activities list <applicationName> --json
gws admin activities list login --json
gws admin activities list drive --json
gws admin activities list admin --json
```
---
## Helper Commands (22)
| Helper | Description | Example |
|--------|-------------|---------|
| `send` | Quick send email | `gws helpers send --to [email protected] --subject Hi --body Hello` |
| `reply` | Quick reply | `gws helpers reply --thread <id> --body Thanks` |
| `forward` | Quick forward | `gws helpers forward --message <id> --to [email protected]` |
| `upload` | Quick upload to Drive | `gws helpers upload file.pdf --folder <id>` |
| `download` | Quick download | `gws helpers download <fileId> --output file.pdf` |
| `share` | Quick share | `gws helpers share <fileId> --with [email protected] --role writer` |
| `quick-event` | Natural language event | `gws helpers quick-event "Lunch tomorrow at noon"` |
| `find-time` | Find free slots | `gws helpers find-time --attendees a,b --duration 60` |
| `standup-report` | Daily standup | `gws helpers standup-report` |
| `meeting-prep` | Prep for meeting | `gws helpers meeting-prep --event <id>` |
| `weekly-summary` | Week summary | `gws helpers weekly-summary` |
| `morning-briefing` | Morning overview | `gws helpers morning-briefing` |
| `eod-wrap` | End of day wrap | `gws helpers eod-wrap` |
| `inbox-zero` | Process inbox | `gws helpers inbox-zero` |
| `search` | Cross-service search | `gws helpers search "quarterly report"` |
| `create-task` | Quick task creation | `gws helpers create-task "Review PR" --due tomorrow` |
| `list-tasks` | Quick task listing | `gws helpers list-tasks` |
| `chat-send` | Quick chat message | `gws helpers chat-send --space <id> --text "Hello"` |
| `export-pdf` | Export as PDF | `gws helpers export-pdf <fileId> --output file.pdf` |
| `trash-old` | Trash old files | `gws helpers trash-old --older-than 365d` |
| `audit-sharing` | Audit file sharing | `gws helpers audit-sharing --folder <id>` |
| `backup-labels` | Backup Gmail labels | `gws helpers backup-labels --output labels.json` |

---
## Schema Introspection
```bash
# View the API schema for any service method
gws schema gmail.users.messages.list
gws schema drive.files.create
gws schema calendar.events.insert
# List all available services
gws schema --list
# List methods for a service
gws schema gmail --methods
```
---
## Authentication Commands
```bash
gws auth setup                    # Interactive OAuth setup
gws auth setup --service-account  # Service account setup
gws auth status                   # Check current auth
gws auth status --json            # JSON auth details
gws auth refresh                  # Refresh expired token
gws auth revoke                   # Revoke current token
gws auth switch <profile>         # Switch auth profile
gws auth profiles list            # List saved profiles
```
---
## Recipe Commands
```bash
gws recipes list                  # List all 43 recipes
gws recipes list --category email # Filter by category
gws recipes describe <name>       # Show recipe details
gws recipes run
<name> # Execute a recipe gws recipes run <name> --dry-run # Preview recipe commands ``` --- ## Persona Commands ```bash gws persona list # List all 10 personas gws persona select <name> # Activate a persona gws persona show # Show active persona gws persona recipes # Show recipes for active persona ``` FILE:references/recipes-cookbook.md # Google Workspace CLI Recipes Cookbook Complete catalog of 43 built-in recipes organized by category, with command sequences and persona mapping. --- ## Recipe Categories | Category | Count | Description | |----------|-------|-------------| | Email | 8 | Gmail operations — send, search, label, filter | | Files | 7 | Drive file management — upload, share, export | | Calendar | 6 | Events, scheduling, meeting prep | | Reporting | 5 | Activity summaries and analytics | | Collaboration | 5 | Chat, Docs, Tasks teamwork | | Data | 4 | Sheets read/write and contacts | | Admin | 4 | User and group management | | Cross-Service | 4 | Multi-service workflows | --- ## Email Recipes (8) ### send-email Send an email with optional attachments. ```bash gws gmail users.messages send me --to "[email protected]" \ --subject "Subject" --body "Body text" [--attachment file.pdf] ``` ### reply-to-thread Reply to an existing email thread. ```bash gws gmail users.messages reply me --thread-id <THREAD_ID> --body "Reply text" ``` ### forward-email Forward an email to another recipient. ```bash gws gmail users.messages forward me --message-id <MSG_ID> --to "[email protected]" ``` ### search-emails Search emails using Gmail query syntax. ```bash gws gmail users.messages list me --query "from:[email protected] after:2025/01/01" --json ``` **Query examples:** `is:unread`, `has:attachment`, `label:important`, `newer_than:7d` ### archive-old Archive read emails older than N days. 
```bash
gws gmail users.messages list me --query "is:read older_than:30d" --json
# Extract IDs, then batch modify to remove INBOX label
```

### label-manager

Create and organize Gmail labels.

```bash
gws gmail users.labels list me --json
gws gmail users.labels create me --name "Projects/Alpha"
```

### filter-setup

Create auto-labeling filters.

```bash
gws gmail users.settings.filters create me \
  --criteria '{"from":"[email protected]"}' \
  --action '{"addLabelIds":["Label_123"],"removeLabelIds":["INBOX"]}'
```

### unread-digest

Get digest of unread emails.

```bash
gws gmail users.messages list me --query "is:unread" --limit 20 --json
```

---

## Files Recipes (7)

### upload-file

Upload a file to Google Drive.

```bash
gws drive files create --name "Report Q1" --upload report.pdf --parents <FOLDER_ID>
```

### create-sheet

Create a new Google Spreadsheet.

```bash
gws sheets spreadsheets create --title "Budget 2026" --json
```

### share-file

Share a Drive file with a user or domain.

```bash
gws drive permissions create <FILE_ID> --type user --role writer --emailAddress "[email protected]"
```

### export-file

Export a Google Doc/Sheet as PDF.

```bash
gws drive files export <FILE_ID> --mime "application/pdf" --output report.pdf
```

### list-files

List files in a Drive folder.

```bash
gws drive files list --parents <FOLDER_ID> --json
```

### find-large-files

Find the largest files in Drive.

```bash
gws drive files list --orderBy "quotaBytesUsed desc" --limit 20 --json
```

### cleanup-trash

Empty Drive trash.

```bash
gws drive files emptyTrash
```

---

## Calendar Recipes (6)

### create-event

Create a calendar event with attendees.

```bash
gws calendar events insert primary \
  --summary "Sprint Planning" \
  --start "2026-03-15T10:00:00" --end "2026-03-15T11:00:00" \
  --attendees "[email protected]" --location "Room A"
```

### quick-event

Create event from natural language.
```bash
gws helpers quick-event "Lunch with Sarah tomorrow at noon"
```

### find-time

Find available time slots for a meeting.

```bash
gws helpers find-time --attendees "[email protected],[email protected]" --duration 60 \
  --within "2026-03-15,2026-03-19" --json
```

### today-schedule

Show today's calendar events.

```bash
gws calendar events list primary \
  --timeMin "$(date -u +%Y-%m-%dT00:00:00Z)" \
  --timeMax "$(date -u +%Y-%m-%dT23:59:59Z)" --json
```

### meeting-prep

Prepare for an upcoming meeting.

```bash
gws recipes meeting-prep --event-id <EVENT_ID>
```

**Output:** Agenda, attendee list, related Drive files, previous meeting notes.

### reschedule

Move an event to a new time.

```bash
gws calendar events patch primary <EVENT_ID> \
  --start "2026-03-16T14:00:00" --end "2026-03-16T15:00:00"
```

---

## Reporting Recipes (5)

### standup-report

Generate daily standup from calendar and tasks.

```bash
gws recipes standup-report --json
```

**Output:** Yesterday's events, today's schedule, pending tasks, blockers.

### weekly-summary

Summarize week's emails, events, and tasks.

```bash
gws recipes weekly-summary --json
```

### drive-activity

Report on Drive file activity.

```bash
gws drive activities list --json
```

### email-stats

Email volume statistics for the past 7 days.

```bash
gws gmail users.messages list me --query "newer_than:7d" --json | python3 output_analyzer.py --count
```

### task-progress

Report on task completion.

```bash
gws tasks tasks list <TASKLIST_ID> --json | python3 output_analyzer.py --group-by "status"
```

---

## Collaboration Recipes (5)

### share-folder

Share a Drive folder with a team.

```bash
gws drive permissions create <FOLDER_ID> --type group --role writer --emailAddress "[email protected]"
```

### create-doc

Create a Google Doc with initial content.

```bash
gws docs documents create --title "Meeting Notes - March 15" --json
```

### chat-message

Send a message to a Google Chat space.
```bash
gws chat spaces.messages create <SPACE_NAME> --text "Deployment complete!"
```

### list-spaces

List Google Chat spaces.

```bash
gws chat spaces list --json
```

### task-create

Create a task in Google Tasks.

```bash
gws tasks tasks insert <TASKLIST_ID> --title "Review PR #42" --due "2026-03-16"
```

---

## Data Recipes (4)

### sheet-read

Read data from a spreadsheet range.

```bash
gws sheets spreadsheets.values get <SHEET_ID> --range "Sheet1!A1:D10" --json
```

### sheet-write

Write data to a spreadsheet.

```bash
gws sheets spreadsheets.values update <SHEET_ID> --range "Sheet1!A1" \
  --values '[["Name","Score"],["Alice",95],["Bob",87]]'
```

### sheet-append

Append rows to a spreadsheet.

```bash
gws sheets spreadsheets.values append <SHEET_ID> --range "Sheet1!A1" \
  --values '[["Charlie",92]]'
```

### export-contacts

Export contacts list.

```bash
gws people people.connections list me --personFields names,emailAddresses --json
```

---

## Admin Recipes (4)

### list-users

List all users in the Workspace domain.

```bash
gws admin users list --domain company.com --json
```

**Prerequisites:** Admin SDK API enabled, `admin.directory.user.readonly` scope.

### list-groups

List all groups in the domain.

```bash
gws admin groups list --domain company.com --json
```

### user-info

Get detailed user information.

```bash
gws admin users get [email protected] --json
```

### audit-logins

Audit recent login activity.

```bash
gws admin activities list login --json
```

---

## Cross-Service Recipes (4)

### morning-briefing

Today's events + unread emails + pending tasks.

```bash
gws recipes morning-briefing --json
```

**Combines:** Calendar events, Gmail unread count, Tasks pending.

### eod-wrap

End-of-day summary: completed, pending, tomorrow's schedule.

```bash
gws recipes eod-wrap --json
```

### project-status

Aggregate project status from Drive, Sheets, Tasks.
```bash
gws recipes project-status --project "Project Alpha" --json
```

### inbox-zero

Process inbox to zero: label, archive, reply, or create task.

```bash
gws recipes inbox-zero --interactive
```

---

## Persona Mapping

| Persona | Top Recipes |
|---------|-------------|
| Executive Assistant | morning-briefing, today-schedule, find-time, send-email, meeting-prep, eod-wrap |
| Project Manager | standup-report, create-event, find-time, task-create, project-status, weekly-summary |
| HR | list-users, user-info, send-email, create-event, create-doc, export-contacts |
| Sales | send-email, search-emails, create-event, find-time, create-doc, share-file |
| IT Admin | list-users, list-groups, audit-logins, drive-activity, find-large-files, cleanup-trash |
| Developer | sheet-read, sheet-write, upload-file, chat-message, task-create, send-email |
| Marketing | send-email, create-doc, share-file, upload-file, create-sheet, chat-message |
| Finance | sheet-read, sheet-write, sheet-append, create-sheet, export-file, share-file |
| Legal | create-doc, share-file, export-file, search-emails, upload-file, audit-logins |
| Customer Support | search-emails, send-email, reply-to-thread, label-manager, task-create, inbox-zero |

FILE:references/troubleshooting.md

# Google Workspace CLI Troubleshooting

Common errors, fixes, and platform-specific guidance for the `gws` CLI.

---

## Installation Issues

### gws not found on PATH

**Error:** `command not found: gws`

**Fixes:**

```bash
# Check if installed
npm list -g @anthropic/gws 2>/dev/null || echo "Not installed via npm"
which gws || echo "Not on PATH"

# Install via npm
npm install -g @anthropic/gws

# If npm global bin not on PATH
export PATH="$(npm config get prefix)/bin:$PATH"
# Add to ~/.zshrc or ~/.bashrc for persistence
```

### npm permission errors

**Error:** `EACCES: permission denied`

**Fixes:**

```bash
# Option 1: Fix npm prefix (recommended)
mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
export PATH=~/.npm-global/bin:$PATH

# Option 2: Use npx without installing
npx @anthropic/gws --version
```

### Cargo build failures

**Error:** `error[E0463]: can't find crate`

**Fixes:**

```bash
# Ensure Rust is up to date
rustup update stable

# Clean build
cargo clean && cargo install gws-cli
```

---

## Authentication Errors

### Token expired

**Error:** `401 Unauthorized: Token has been expired or revoked`

**Cause:** OAuth access tokens expire after 1 hour.
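Because expiry is predictable, a script that calls `gws` can recover from it without manual intervention. The following is a minimal sketch, assuming the CLI signals an expired token with `401` in stderr (as in the error above) and that `gws auth refresh` renews it. `run_with_refresh`, `default_runner`, and the `runner` parameter are hypothetical names used for illustration; they are not part of the CLI.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], subprocess.CompletedProcess]


def default_runner(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=30)


def run_with_refresh(cmd: List[str], runner: Runner = default_runner) -> subprocess.CompletedProcess:
    """Run cmd; if it fails with a 401, refresh the token once and retry."""
    result = runner(cmd)
    # Assumption: an expired token shows up as "401" in stderr.
    if result.returncode != 0 and "401" in result.stderr:
        refresh = runner(["gws", "auth", "refresh"])
        if refresh.returncode == 0:
            result = runner(cmd)  # single retry, so a bad token cannot loop forever
    return result
```

The `runner` argument exists so the retry logic can be exercised without a real `gws` binary. If the refresh itself fails, the original failing result is returned unchanged.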
**Fix:**

```bash
gws auth refresh

# If refresh fails:
gws auth setup  # Re-authenticate
```

### Insufficient scopes

**Error:** `403 Forbidden: Request had insufficient authentication scopes`

**Fix:**

```bash
# Check current scopes
gws auth status --json | grep scopes

# Re-auth with additional scopes
gws auth setup --scopes gmail,drive,calendar,sheets,tasks

# Or list required scopes for a service
python3 scripts/auth_setup_guide.py --scopes gmail,drive
```

### Keyring/keychain errors

**Error:** `Failed to access keyring` or `SecKeychainFindGenericPassword failed`

**Fixes:**

```bash
# macOS: Unlock keychain
security unlock-keychain ~/Library/Keychains/login.keychain-db

# Linux: Install keyring backend
sudo apt install gnome-keyring  # or libsecret

# Fallback: Use file-based token storage
export GWS_TOKEN_PATH=~/.config/gws/token.json
gws auth setup
```

### Service account delegation errors

**Error:** `403: Not Authorized to access this resource/api`

**Fix:**

1. Verify domain-wide delegation is enabled on the service account
2. Verify client ID is authorized in Admin Console > Security > API Controls
3. Verify scopes match exactly (no trailing slashes)
4. Verify `GWS_DELEGATED_USER` is a valid admin account

```bash
# Debug
echo $GWS_SERVICE_ACCOUNT_KEY  # Should point to valid JSON key file
echo $GWS_DELEGATED_USER       # Should be [email protected]
gws auth status --json         # Check auth details
```

---

## API Errors

### Rate limit exceeded (429)

**Error:** `429 Too Many Requests: Rate Limit Exceeded`

**Cause:** Google Workspace APIs have per-user, per-service rate limits.
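For scripted bulk operations, a common way to stay within such limits is to retry with exponential backoff. The sketch below is illustrative only: it assumes a rate-limited call exits non-zero with `429` in stderr (as in the error above), and `run_with_backoff` with its parameters is a hypothetical helper, not a `gws` feature. The delay policy (1s, 2s, 4s, ... plus jitter) is likewise an assumption.

```python
import random
import subprocess
import time
from typing import Callable, List


def run_with_backoff(
    cmd: List[str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> subprocess.CompletedProcess:
    """Retry cmd with exponential backoff while stderr reports HTTP 429."""
    for attempt in range(max_attempts):
        result = run(cmd, capture_output=True, text=True)
        # Success, or a failure unrelated to rate limiting: stop retrying.
        if result.returncode == 0 or "429" not in result.stderr:
            return result
        if attempt < max_attempts - 1:
            # Wait base_delay * 2^attempt seconds plus up to 1s of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return result  # still rate-limited after max_attempts
```

The jitter spreads out retries when several scripts hit the same limit at once. The `run` and `sleep` parameters are injectable so the behaviour can be checked without calling the API or actually waiting.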
**Fix:**

```bash
# Add delays between bulk operations
for id in $(cat file_ids.txt); do
  gws drive files get $id --json >> results.json
  sleep 0.5  # 500ms delay
done

# Use --limit to reduce result size
gws drive files list --limit 100 --json

# For admin operations, batch in groups of 50
```

**Rate limits by service:**

| Service | Limit |
|---------|-------|
| Gmail | 250 quota units/second/user |
| Drive | 1,000 requests/100 seconds/user |
| Sheets | 60 read requests/minute/user |
| Calendar | 500 requests/100 seconds/user |
| Admin SDK | 2,400 requests/minute |

### Permission denied (403)

**Error:** `403 Forbidden: The caller does not have permission`

**Causes and fixes:**

1. **Wrong scope** — Re-auth with correct scopes
2. **Not the file owner** — Request access from the owner
3. **Domain policy** — Check Admin Console sharing policies
4. **API not enabled** — Enable the API in Google Cloud Console

```bash
# Check which APIs are enabled
gws schema --list

# Enable an API
# Go to: console.cloud.google.com > APIs & Services > Library
```

### Not found (404)

**Error:** `404 Not Found: File not found`

**Causes:**

1. File was deleted or moved to trash
2. File ID is incorrect
3. No permission to see the file

```bash
# Check trash
gws drive files list --query "trashed=true and name='filename'" --json

# Verify file ID
gws drive files get <fileId> --json
```

---

## Output Parsing Issues

### NDJSON vs JSON array

**Problem:** Output format varies between commands and versions.

```bash
# Force JSON array output
gws drive files list --json

# Force NDJSON output
gws drive files list --format ndjson

# Handle both in output_analyzer.py (automatic detection)
gws drive files list --json | python3 scripts/output_analyzer.py --count
```

### Pagination

**Problem:** Only partial results returned.
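A script can detect truncated results by looking for a `nextPageToken` field in the output. The following is a minimal sketch, assuming the output is either a single JSON document or NDJSON and that a truncated listing carries a top-level `nextPageToken`. `parse_output` and `has_more_pages` are hypothetical helpers written for illustration; they are not taken from `gws` or from `output_analyzer.py`.

```python
import json
from typing import Any, List


def parse_output(text: str) -> List[Any]:
    """Parse CLI output that is either one JSON document or NDJSON."""
    text = text.strip()
    if not text:
        return []
    try:
        doc = json.loads(text)
        # A JSON array is returned as-is; any other document is wrapped in a list.
        return doc if isinstance(doc, list) else [doc]
    except json.JSONDecodeError:
        # Fall back to NDJSON: one JSON value per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def has_more_pages(text: str) -> bool:
    """Return True if any top-level record carries a non-empty nextPageToken."""
    records = parse_output(text)
    return any(isinstance(r, dict) and bool(r.get("nextPageToken")) for r in records)
```

Trying the whole text as one document first means pretty-printed multi-line JSON is not mistaken for NDJSON.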
```bash
# Fetch all pages
gws drive files list --page-all --json

# Or set a high limit
gws drive files list --limit 1000 --json

# Check if more pages exist (look for nextPageToken in output)
gws drive files list --limit 100 --json | grep nextPageToken
```

### Empty response

**Problem:** Command returns empty or `{}`.

```bash
# Check auth
gws auth status

# Try with verbose output
gws drive files list --verbose --json

# Check if the service is accessible
gws drive about get --json
```

---

## Platform-Specific Issues

### macOS

**Keychain access prompts:**

```bash
# Allow gws to access keychain without repeated prompts
# In Keychain Access.app, find "gws" entries and set "Allow all applications"

# Or use file-based storage
export GWS_TOKEN_PATH=~/.config/gws/token.json
```

**Browser not opening for OAuth:**

```bash
# If default browser doesn't open
gws auth setup --no-browser
# Copy the URL manually and paste in browser
```

### Linux

**Headless OAuth (no browser):**

```bash
# Use out-of-band flow
gws auth setup --no-browser
# Prints a URL — open on another machine, paste code back

# Or use service account (no browser needed)
export GWS_SERVICE_ACCOUNT_KEY=/path/to/key.json
export GWS_DELEGATED_USER=[email protected]
```

**Missing keyring backend:**

```bash
# Install a keyring backend
sudo apt install gnome-keyring libsecret-1-dev

# Or use file-based storage
export GWS_TOKEN_PATH=~/.config/gws/token.json
```

### Windows

**PATH issues:**

```powershell
# Add npm global bin to PATH
$env:PATH += ";$(npm config get prefix)\bin"

# Or use npx
npx @anthropic/gws --version
```

**PowerShell quoting:**

```powershell
# Use single quotes for JSON arguments
gws gmail users.settings.filters create me `
  --criteria '{"from":"[email protected]"}' `
  --action '{"addLabelIds":["Label_1"]}'
```

---

## Getting Help

```bash
# General help
gws --help
gws <service> --help
gws <service> <resource> --help

# API schema for a method
gws schema gmail.users.messages.send

# Version info
gws --version

# Debug mode
gws --verbose <command>

# Report issues
# https://github.com/googleworkspace/cli/issues
```

FILE:scripts/auth_setup_guide.py
#!/usr/bin/env python3
"""
Google Workspace CLI Auth Setup Guide — Guided authentication configuration.

Prints step-by-step instructions for OAuth and service account setup,
generates .env templates, lists required scopes, and validates auth.

Usage:
    python3 auth_setup_guide.py --guide oauth
    python3 auth_setup_guide.py --guide service-account
    python3 auth_setup_guide.py --scopes gmail,drive,calendar
    python3 auth_setup_guide.py --generate-env
    python3 auth_setup_guide.py --validate [--json]
    python3 auth_setup_guide.py --check [--json]
"""

import argparse
import json
import shutil
import subprocess
import sys
from dataclasses import dataclass, field, asdict
from typing import List, Dict

SERVICE_SCOPES: Dict[str, List[str]] = {
    "gmail": [
        "https://www.googleapis.com/auth/gmail.modify",
        "https://www.googleapis.com/auth/gmail.send",
        "https://www.googleapis.com/auth/gmail.labels",
        "https://www.googleapis.com/auth/gmail.settings.basic",
    ],
    "drive": [
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/drive.file",
        "https://www.googleapis.com/auth/drive.metadata.readonly",
    ],
    "sheets": [
        "https://www.googleapis.com/auth/spreadsheets",
    ],
    "calendar": [
        "https://www.googleapis.com/auth/calendar",
        "https://www.googleapis.com/auth/calendar.events",
    ],
    "tasks": [
        "https://www.googleapis.com/auth/tasks",
    ],
    "chat": [
        "https://www.googleapis.com/auth/chat.spaces.readonly",
        "https://www.googleapis.com/auth/chat.messages",
    ],
    "docs": [
        "https://www.googleapis.com/auth/documents",
    ],
    "admin": [
        "https://www.googleapis.com/auth/admin.directory.user.readonly",
        "https://www.googleapis.com/auth/admin.directory.group",
        "https://www.googleapis.com/auth/admin.directory.orgunit.readonly",
    ],
    "meet": [
        "https://www.googleapis.com/auth/meetings.space.created",
    ],
}

OAUTH_GUIDE = """
=== Google Workspace CLI: OAuth Setup Guide ===

Step 1: Create a Google Cloud Project
  1. Go to https://console.cloud.google.com/
  2. Click "Select a project" -> "New Project"
  3. Name it (e.g., "gws-cli-access") and click Create
  4. Note the Project ID

Step 2: Enable Required APIs
  1. Go to APIs & Services -> Library
  2. Search and enable each API you need:
     - Gmail API
     - Google Drive API
     - Google Sheets API
     - Google Calendar API
     - Tasks API
     - Admin SDK API (for admin operations)

Step 3: Configure OAuth Consent Screen
  1. Go to APIs & Services -> OAuth consent screen
  2. Select "Internal" (for Workspace) or "External" (for personal)
  3. Fill in app name, support email
  4. Add scopes for the services you need
  5. Save and continue

Step 4: Create OAuth Credentials
  1. Go to APIs & Services -> Credentials
  2. Click "Create Credentials" -> "OAuth client ID"
  3. Application type: "Desktop app"
  4. Name it "gws-cli"
  5. Download the JSON file

Step 5: Configure gws CLI
  1. Set environment variables:
     export GWS_CLIENT_ID=<your-client-id>
     export GWS_CLIENT_SECRET=<your-client-secret>
  2. Or place the credentials JSON:
     mv client_secret_*.json ~/.config/gws/credentials.json

Step 6: Authenticate
  gws auth setup
  # Opens browser for consent, stores token in system keyring

Step 7: Verify
  gws auth status
  gws gmail users getProfile me
"""

SERVICE_ACCOUNT_GUIDE = """
=== Google Workspace CLI: Service Account Setup Guide ===

Step 1: Create a Google Cloud Project
  (Same as OAuth Step 1)

Step 2: Create a Service Account
  1. Go to IAM & Admin -> Service Accounts
  2. Click "Create Service Account"
  3. Name: "gws-cli-service"
  4. Grant roles as needed (no role needed for Workspace API access)
  5. Click "Done"

Step 3: Create Key
  1. Click on the service account
  2. Go to "Keys" tab
  3. Add Key -> Create new key -> JSON
  4. Download and store securely

Step 4: Enable Domain-Wide Delegation
  1. On the service account page, click "Edit"
  2. Check "Enable Google Workspace domain-wide delegation"
  3. Save
  4. Note the Client ID (numeric)

Step 5: Authorize in Google Admin
  1. Go to admin.google.com
  2. Security -> API Controls -> Domain-wide Delegation
  3. Add new:
     - Client ID: <numeric client ID from Step 4>
     - Scopes: (paste required scopes)
  4. Authorize

Step 6: Configure gws CLI
  export GWS_SERVICE_ACCOUNT_KEY=/path/to/service-account-key.json
  export GWS_DELEGATED_USER=[email protected]

Step 7: Verify
  gws auth status
  gws gmail users getProfile me
"""

ENV_TEMPLATE = """# Google Workspace CLI Configuration
# Copy to .env and fill in values

# OAuth Credentials (for interactive auth)
GWS_CLIENT_ID=
GWS_CLIENT_SECRET=
GWS_TOKEN_PATH=~/.config/gws/token.json

# Service Account (for headless/CI auth)
# GWS_SERVICE_ACCOUNT_KEY=/path/to/key.json
# GWS_DELEGATED_USER=[email protected]

# Defaults
GWS_DEFAULT_FORMAT=json
GWS_PAGINATION_LIMIT=100
"""


@dataclass
class ValidationResult:
    service: str
    status: str  # PASS, WARN, FAIL
    message: str


@dataclass
class ValidationReport:
    auth_method: str = ""
    user: str = ""
    results: List[dict] = field(default_factory=list)
    summary: str = ""
    demo_mode: bool = False


DEMO_VALIDATION = ValidationReport(
    auth_method="oauth",
    user="[email protected]",
    results=[
        {"service": "gmail", "status": "PASS", "message": "Gmail API accessible"},
        {"service": "drive", "status": "PASS", "message": "Drive API accessible"},
        {"service": "calendar", "status": "PASS", "message": "Calendar API accessible"},
        {"service": "sheets", "status": "PASS", "message": "Sheets API accessible"},
        {"service": "tasks", "status": "FAIL", "message": "Scope not authorized"},
    ],
    summary="4/5 services validated (demo mode)",
    demo_mode=True,
)


def check_auth_status() -> dict:
    """Check current gws auth status."""
    try:
        result = subprocess.run(
            ["gws", "auth", "status", "--json"],
            capture_output=True, text=True, timeout=15
        )
        if result.returncode == 0:
            try:
                return json.loads(result.stdout)
            except json.JSONDecodeError:
                return {"status": "authenticated", "raw": result.stdout.strip()}
        return {"status": "not_authenticated", "error": result.stderr.strip()[:200]}
    except subprocess.TimeoutExpired:
        # A hung CLI should not crash the script; report it as an auth failure.
        return {"status": "not_authenticated", "error": "gws auth status timed out"}
    except (FileNotFoundError, OSError):
        return {"status": "gws_not_found"}


def validate_services(services: List[str]) -> ValidationReport:
    """Validate auth by testing each service."""
    report = ValidationReport()
    auth = check_auth_status()
    if auth.get("status") == "gws_not_found":
        report.summary = "gws CLI not installed"
        return report
    if auth.get("status") == "not_authenticated":
        report.auth_method = "none"
        report.summary = "Not authenticated"
        return report

    report.auth_method = auth.get("method", "oauth")
    report.user = auth.get("user", auth.get("email", "unknown"))

    service_cmds = {
        "gmail": ["gws", "gmail", "users", "getProfile", "me", "--json"],
        "drive": ["gws", "drive", "files", "list", "--limit", "1", "--json"],
        "calendar": ["gws", "calendar", "calendarList", "list", "--limit", "1", "--json"],
        "sheets": ["gws", "sheets", "spreadsheets", "get", "test", "--json"],
        "tasks": ["gws", "tasks", "tasklists", "list", "--limit", "1", "--json"],
    }

    for svc in services:
        cmd = service_cmds.get(svc)
        if not cmd:
            report.results.append(asdict(
                ValidationResult(svc, "WARN", f"No test available for {svc}")
            ))
            continue
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=15)
            if result.returncode == 0:
                report.results.append(asdict(
                    ValidationResult(svc, "PASS", f"{svc.title()} API accessible")
                ))
            else:
                report.results.append(asdict(
                    ValidationResult(svc, "FAIL", result.stderr.strip()[:100])
                ))
        except (subprocess.TimeoutExpired, OSError) as e:
            report.results.append(asdict(
                ValidationResult(svc, "FAIL", str(e)[:100])
            ))

    passed = sum(1 for r in report.results if r["status"] == "PASS")
    total = len(report.results)
    report.summary = f"{passed}/{total} services validated"
    return report


def main():
    parser = argparse.ArgumentParser(
        description="Guided authentication setup for Google Workspace CLI (gws)",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s --guide oauth              # OAuth setup instructions
  %(prog)s --guide service-account    # Service account setup
  %(prog)s --scopes gmail,drive       # Show required scopes
  %(prog)s --generate-env             # Generate .env template
  %(prog)s --check                    # Check current auth status
  %(prog)s --validate --json          # Validate all services (JSON)
""",
    )
    parser.add_argument("--guide", choices=["oauth", "service-account"],
                        help="Print setup guide")
    parser.add_argument("--scopes", help="Comma-separated services to show scopes for")
    parser.add_argument("--generate-env", action="store_true",
                        help="Generate .env template")
    parser.add_argument("--check", action="store_true",
                        help="Check current auth status")
    parser.add_argument("--validate", action="store_true",
                        help="Validate auth by testing services")
    parser.add_argument("--services", default="gmail,drive,calendar,sheets,tasks",
                        help="Services to validate (default: gmail,drive,calendar,sheets,tasks)")
    parser.add_argument("--json", action="store_true", help="Output JSON")
    args = parser.parse_args()

    if not any([args.guide, args.scopes, args.generate_env, args.check, args.validate]):
        parser.print_help()
        return

    if args.guide:
        if args.guide == "oauth":
            print(OAUTH_GUIDE)
        else:
            print(SERVICE_ACCOUNT_GUIDE)
        return

    if args.scopes:
        services = [s.strip() for s in args.scopes.split(",") if s.strip()]
        if args.json:
            output = {}
            for svc in services:
                output[svc] = SERVICE_SCOPES.get(svc, [])
            print(json.dumps(output, indent=2))
        else:
            print(f"\n{'='*60}")
            print(f" REQUIRED OAUTH SCOPES")
            print(f"{'='*60}\n")
            for svc in services:
                scopes = SERVICE_SCOPES.get(svc, [])
                print(f" {svc.upper()}:")
                if scopes:
                    for scope in scopes:
                        print(f" - {scope}")
                else:
                    print(f" (no scopes defined for '{svc}')")
                print()
            # Print combined for easy copy-paste
            all_scopes = []
            for svc in services:
                all_scopes.extend(SERVICE_SCOPES.get(svc, []))
            if all_scopes:
                print(f" COMBINED (for consent screen):")
                print(f" {','.join(all_scopes)}")
            print(f"\n{'='*60}\n")
        return

    if args.generate_env:
        print(ENV_TEMPLATE)
        return

    if args.check:
        if shutil.which("gws"):
            status = check_auth_status()
        else:
            status = {"status": "gws_not_found",
                      "note": "Install gws first: cargo install gws-cli OR https://github.com/googleworkspace/cli/releases"}
        if args.json:
            print(json.dumps(status, indent=2))
        else:
            print(f"\nAuth Status: {status.get('status', 'unknown')}")
            for k, v in status.items():
                if k != "status":
                    print(f" {k}: {v}")
            print()
        return

    if args.validate:
        services = [s.strip() for s in args.services.split(",") if s.strip()]
        if not shutil.which("gws"):
            report = DEMO_VALIDATION
        else:
            report = validate_services(services)
        if args.json:
            print(json.dumps(asdict(report), indent=2))
        else:
            print(f"\n{'='*60}")
            print(f" AUTH VALIDATION REPORT")
            if report.demo_mode:
                print(f" (DEMO MODE)")
            print(f"{'='*60}\n")
            if report.user:
                print(f" User: {report.user}")
                print(f" Method: {report.auth_method}\n")
            for r in report.results:
                # Show the real status so WARN results are not mislabeled as FAIL.
                icon = r["status"]
                print(f" [{icon}] {r['service']}: {r['message']}")
            print(f"\n {report.summary}")
            print(f"\n{'='*60}\n")


if __name__ == "__main__":
    main()

FILE:scripts/gws_doctor.py
#!/usr/bin/env python3
"""
Google Workspace CLI Doctor — Pre-flight diagnostics for gws CLI.

Checks installation, version, authentication status, and service connectivity.
Runs in demo mode with embedded sample data when gws is not installed.
Usage:
    python3 gws_doctor.py
    python3 gws_doctor.py --json
    python3 gws_doctor.py --services gmail,drive,calendar
"""

import argparse
import json
import shutil
import subprocess
import sys
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Check:
    name: str
    status: str  # PASS, WARN, FAIL
    message: str
    fix: str = ""


@dataclass
class DiagnosticReport:
    gws_installed: bool = False
    gws_version: str = ""
    auth_status: str = ""
    checks: List[dict] = field(default_factory=list)
    summary: str = ""
    demo_mode: bool = False


DEMO_CHECKS = [
    Check("gws-installed", "PASS", "gws v0.9.2 found at /usr/local/bin/gws"),
    Check("gws-version", "PASS", "Version 0.9.2 (latest)"),
    Check("auth-status", "PASS", "Authenticated as [email protected]"),
    Check("token-expiry", "WARN", "Token expires in 23 minutes",
          "Run 'gws auth refresh' to extend token lifetime"),
    Check("gmail-access", "PASS", "Gmail API accessible — user profile retrieved"),
    Check("drive-access", "PASS", "Drive API accessible — root folder listed"),
    Check("calendar-access", "PASS", "Calendar API accessible — primary calendar found"),
    Check("sheets-access", "PASS", "Sheets API accessible"),
    Check("tasks-access", "FAIL", "Tasks API not authorized",
          "Run 'gws auth setup' and add 'tasks' scope"),
]

SERVICE_TEST_COMMANDS = {
    "gmail": ["gws", "gmail", "users", "getProfile", "me", "--json"],
    "drive": ["gws", "drive", "files", "list", "--limit", "1", "--json"],
    "calendar": ["gws", "calendar", "calendarList", "list", "--limit", "1", "--json"],
    "sheets": ["gws", "sheets", "spreadsheets", "get", "test", "--json"],
    "tasks": ["gws", "tasks", "tasklists", "list", "--limit", "1", "--json"],
    "chat": ["gws", "chat", "spaces", "list", "--limit", "1", "--json"],
    "docs": ["gws", "docs", "documents", "get", "test", "--json"],
}


def check_installation() -> Check:
    """Check if gws is installed and on PATH."""
    path = shutil.which("gws")
    if path:
        return Check("gws-installed", "PASS", f"gws found at {path}")
    return Check("gws-installed", "FAIL", "gws not found on PATH",
                 "Install via: cargo install gws-cli OR download from https://github.com/googleworkspace/cli/releases")


def check_version() -> Check:
    """Get gws version."""
    try:
        result = subprocess.run(
            ["gws", "--version"], capture_output=True, text=True, timeout=10
        )
        version = result.stdout.strip()
        if version:
            return Check("gws-version", "PASS", f"Version: {version}")
        return Check("gws-version", "WARN", "Could not parse version output")
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
        return Check("gws-version", "FAIL", f"Version check failed: {e}")


def check_auth() -> Check:
    """Check authentication status."""
    try:
        result = subprocess.run(
            ["gws", "auth", "status", "--json"],
            capture_output=True, text=True, timeout=15
        )
        if result.returncode == 0:
            try:
                data = json.loads(result.stdout)
                user = data.get("user", data.get("email", "unknown"))
                return Check("auth-status", "PASS", f"Authenticated as {user}")
            except json.JSONDecodeError:
                return Check("auth-status", "PASS", "Authenticated (could not parse details)")
        return Check("auth-status", "FAIL", "Not authenticated",
                     "Run 'gws auth setup' to configure authentication")
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
        return Check("auth-status", "FAIL", f"Auth check failed: {e}",
                     "Run 'gws auth setup' to configure authentication")


def check_service(service: str) -> Check:
    """Test connectivity to a specific service."""
    cmd = SERVICE_TEST_COMMANDS.get(service)
    if not cmd:
        return Check(f"{service}-access", "WARN", f"No test command for {service}")
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=15)
        if result.returncode == 0:
            return Check(f"{service}-access", "PASS", f"{service.title()} API accessible")
        stderr = result.stderr.strip()[:100]
        if "403" in stderr or "permission" in stderr.lower():
            return Check(f"{service}-access", "FAIL",
                         f"{service.title()} API permission denied",
                         f"Add '{service}' scope: gws auth setup --scopes {service}")
        return Check(f"{service}-access", "FAIL",
                     f"{service.title()} API error: {stderr}",
                     f"Check scope and permissions for {service}")
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as e:
        return Check(f"{service}-access", "FAIL", f"{service.title()} test failed: {e}")


def run_diagnostics(services: List[str]) -> DiagnosticReport:
    """Run all diagnostic checks."""
    report = DiagnosticReport()
    checks = []

    # Installation check
    install_check = check_installation()
    checks.append(install_check)
    report.gws_installed = install_check.status == "PASS"
    if not report.gws_installed:
        report.checks = [asdict(c) for c in checks]
        report.summary = "FAIL: gws is not installed"
        return report

    # Version check
    version_check = check_version()
    checks.append(version_check)
    if version_check.status == "PASS":
        report.gws_version = version_check.message.replace("Version: ", "")

    # Auth check
    auth_check = check_auth()
    checks.append(auth_check)
    report.auth_status = auth_check.status
    if auth_check.status != "PASS":
        report.checks = [asdict(c) for c in checks]
        report.summary = "FAIL: Authentication not configured"
        return report

    # Service checks
    for svc in services:
        checks.append(check_service(svc))
    report.checks = [asdict(c) for c in checks]

    # Summary
    fails = sum(1 for c in checks if c.status == "FAIL")
    warns = sum(1 for c in checks if c.status == "WARN")
    passes = sum(1 for c in checks if c.status == "PASS")
    if fails > 0:
        report.summary = f"ISSUES FOUND: {passes} passed, {warns} warnings, {fails} failures"
    elif warns > 0:
        report.summary = f"MOSTLY OK: {passes} passed, {warns} warnings"
    else:
        report.summary = f"ALL CLEAR: {passes}/{passes} checks passed"
    return report


def run_demo() -> DiagnosticReport:
    """Return demo report with embedded sample data."""
    report = DiagnosticReport(
        gws_installed=True,
        gws_version="0.9.2",
        auth_status="PASS",
        checks=[asdict(c) for c in DEMO_CHECKS],
        # DEMO_CHECKS contains one FAIL, so the label matches what
        # run_diagnostics() reports whenever any check fails.
        summary="ISSUES FOUND: 7 passed, 1 warning, 1 failure (demo mode)",
        demo_mode=True,
    )
    return report


def main():
    parser = argparse.ArgumentParser(
        description="Pre-flight diagnostics for Google Workspace CLI (gws)",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s                          # Run all checks
  %(prog)s --json                   # JSON output
  %(prog)s --services gmail,drive   # Check specific services only
  %(prog)s --demo                   # Demo mode (no gws required)
""",
    )
    parser.add_argument("--json", action="store_true", help="Output JSON")
    parser.add_argument(
        "--services", default="gmail,drive,calendar,sheets,tasks",
        help="Comma-separated services to check (default: gmail,drive,calendar,sheets,tasks)"
    )
    parser.add_argument("--demo", action="store_true", help="Run with demo data")
    args = parser.parse_args()

    services = [s.strip() for s in args.services.split(",") if s.strip()]

    # Use demo mode if requested or gws not installed
    if args.demo or not shutil.which("gws"):
        report = run_demo()
    else:
        report = run_diagnostics(services)

    if args.json:
        print(json.dumps(asdict(report), indent=2))
    else:
        print(f"\n{'='*60}")
        print(f" GWS CLI DIAGNOSTIC REPORT")
        if report.demo_mode:
            print(f" (DEMO MODE — sample data)")
        print(f"{'='*60}\n")
        for c in report.checks:
            icon = {"PASS": "PASS", "WARN": "WARN", "FAIL": "FAIL"}.get(c["status"], "????")
            print(f" [{icon}] {c['name']}: {c['message']}")
            if c.get("fix") and c["status"] != "PASS":
                print(f" -> {c['fix']}")
        print(f"\n {'-'*56}")
        print(f" {report.summary}")
        print(f"\n{'='*60}\n")


if __name__ == "__main__":
    main()
FILE:scripts/gws_recipe_runner.py
#!/usr/bin/env python3
"""
Google Workspace CLI Recipe Runner — Catalog, search, and execute gws recipes.

Browse 43 built-in recipes, filter by persona, search by keyword, and run with dry-run support.
Usage:
    python3 gws_recipe_runner.py --list
    python3 gws_recipe_runner.py --search "email"
    python3 gws_recipe_runner.py --describe standup-report
    python3 gws_recipe_runner.py --run standup-report --dry-run
    python3 gws_recipe_runner.py --persona pm --list
    python3 gws_recipe_runner.py --list --json
"""

import argparse
import json
import subprocess
import sys
from dataclasses import dataclass, field, asdict
from typing import List, Dict, Optional


@dataclass
class Recipe:
    name: str
    description: str
    category: str
    services: List[str]
    commands: List[str]
    prerequisites: str = ""


RECIPES: Dict[str, Recipe] = {
    # Email (8)
    "send-email": Recipe("send-email", "Send an email with optional attachments", "email", ["gmail"],
        ["gws gmail users.messages send me --to {to} --subject {subject} --body {body}"]),
    "reply-to-thread": Recipe("reply-to-thread", "Reply to an existing email thread", "email", ["gmail"],
        ["gws gmail users.messages reply me --thread-id {thread_id} --body {body}"]),
    "forward-email": Recipe("forward-email", "Forward an email to another recipient", "email", ["gmail"],
        ["gws gmail users.messages forward me --message-id {msg_id} --to {to}"]),
    "search-emails": Recipe("search-emails", "Search emails with Gmail query syntax", "email", ["gmail"],
        ["gws gmail users.messages list me --query {query} --json"]),
    "archive-old": Recipe("archive-old", "Archive read emails older than N days", "email", ["gmail"], [
        "gws gmail users.messages list me --query 'is:read older_than:{days}d' --json",
        "# Pipe IDs to batch modify to remove INBOX label",
    ]),
    "label-manager": Recipe("label-manager", "Create, list, and organize Gmail labels", "email", ["gmail"],
        ["gws gmail users.labels list me --json", "gws gmail users.labels create me --name {name}"]),
    "filter-setup": Recipe("filter-setup", "Create email filters for auto-labeling", "email", ["gmail"],
        ["gws gmail users.settings.filters create me --criteria {criteria} --action {action}"]),
    "unread-digest": Recipe("unread-digest", "Get digest of unread emails", "email", ["gmail"],
        ["gws gmail users.messages list me --query 'is:unread' --limit 20 --json"]),

    # Files (7)
    "upload-file": Recipe("upload-file", "Upload a file to Google Drive", "files", ["drive"],
        ["gws drive files create --name {name} --upload {path} --parents {folder_id}"]),
    "create-sheet": Recipe("create-sheet", "Create a new Google Spreadsheet", "files", ["sheets"],
        ["gws sheets spreadsheets create --title {title} --json"]),
    "share-file": Recipe("share-file", "Share a Drive file with a user or domain", "files", ["drive"],
        ["gws drive permissions create {file_id} --type user --role writer --emailAddress {email}"]),
    "export-file": Recipe("export-file", "Export a Google Doc/Sheet as PDF", "files", ["drive"],
        ["gws drive files export {file_id} --mime application/pdf --output {output}"]),
    "list-files": Recipe("list-files", "List files in a Drive folder", "files", ["drive"],
        ["gws drive files list --parents {folder_id} --json"]),
    "find-large-files": Recipe("find-large-files", "Find largest files in Drive", "files", ["drive"],
        ["gws drive files list --orderBy 'quotaBytesUsed desc' --limit 20 --json"]),
    "cleanup-trash": Recipe("cleanup-trash", "Empty Drive trash", "files", ["drive"],
        ["gws drive files emptyTrash"]),

    # Calendar (6)
    "create-event": Recipe("create-event", "Create a calendar event with attendees", "calendar", ["calendar"], [
        "gws calendar events insert primary --summary {title} "
        "--start {start} --end {end} --attendees {attendees}"
    ]),
    "quick-event": Recipe("quick-event", "Create event from natural language", "calendar", ["calendar"],
        ["gws helpers quick-event {text}"]),
    "find-time": Recipe("find-time", "Find available time slots for a meeting", "calendar", ["calendar"],
        ["gws helpers find-time --attendees {attendees} --duration {minutes} --within {date_range}"]),
    "today-schedule": Recipe("today-schedule", "Show today's calendar events", "calendar", ["calendar"],
        ["gws calendar events list primary --timeMin {today_start} --timeMax {today_end} --json"]),
    "meeting-prep": Recipe("meeting-prep", "Prepare for an upcoming meeting (agenda + attendees)", "calendar", ["calendar"],
        ["gws recipes meeting-prep --event-id {event_id}"]),
    "reschedule": Recipe("reschedule", "Move an event to a new time", "calendar", ["calendar"],
        ["gws calendar events patch primary {event_id} --start {new_start} --end {new_end}"]),

    # Reporting (5)
    "standup-report": Recipe("standup-report", "Generate daily standup from calendar and tasks", "reporting", ["calendar", "tasks"],
        ["gws recipes standup-report --json"]),
    "weekly-summary": Recipe("weekly-summary", "Summarize week's emails, events, and tasks", "reporting", ["gmail", "calendar", "tasks"],
        ["gws recipes weekly-summary --json"]),
    "drive-activity": Recipe("drive-activity", "Report on Drive file activity", "reporting", ["drive"],
        ["gws drive activities list --json"]),
    "email-stats": Recipe("email-stats", "Email volume statistics", "reporting", ["gmail"], [
        "gws gmail users.messages list me --query 'newer_than:7d' --json",
        "# Pipe through output_analyzer.py --count",
    ]),
    "task-progress": Recipe("task-progress", "Report on task completion", "reporting", ["tasks"],
        ["gws tasks tasks list {tasklist_id} --json"]),

    # Collaboration (5)
    "share-folder": Recipe("share-folder", "Share a Drive folder with a team", "collaboration", ["drive"],
        ["gws drive permissions create {folder_id} --type group --role writer --emailAddress {group}"]),
    "create-doc": Recipe("create-doc", "Create a Google Doc with initial content", "collaboration", ["docs"],
        ["gws docs documents create --title {title} --json"]),
    "chat-message": Recipe("chat-message", "Send a message to a Google Chat space", "collaboration", ["chat"],
        ["gws chat spaces.messages create {space} --text {message}"]),
    "list-spaces": Recipe("list-spaces", "List Google Chat spaces", "collaboration", ["chat"],
        ["gws chat spaces list --json"]),
    "task-create": Recipe("task-create", "Create a task in Google Tasks", "collaboration", ["tasks"],
        ["gws tasks tasks insert {tasklist_id} --title {title} --due {due_date}"]),

    # Data (4)
    "sheet-read": Recipe("sheet-read", "Read data from a spreadsheet range", "data", ["sheets"],
        ["gws sheets spreadsheets.values get {sheet_id} --range {range} --json"]),
    "sheet-write": Recipe("sheet-write", "Write data to a spreadsheet", "data", ["sheets"],
        ["gws sheets spreadsheets.values update {sheet_id} --range {range} --values {data}"]),
    "sheet-append": Recipe("sheet-append", "Append rows to a spreadsheet", "data", ["sheets"],
        ["gws sheets spreadsheets.values append {sheet_id} --range {range} --values {data}"]),
    "export-contacts": Recipe("export-contacts", "Export contacts list", "data", ["people"],
        ["gws people people.connections list me --personFields names,emailAddresses --json"]),

    # Admin (4)
    "list-users": Recipe("list-users", "List all users in the Workspace domain", "admin", ["admin"],
        ["gws admin users list --domain {domain} --json"],
        "Requires Admin SDK API and admin.directory.user.readonly scope"),
    "list-groups": Recipe("list-groups", "List all groups in the domain", "admin", ["admin"],
        ["gws admin groups list --domain {domain} --json"]),
    "user-info": Recipe("user-info", "Get detailed user information", "admin", ["admin"],
        ["gws admin users get {email} --json"]),
    "audit-logins": Recipe("audit-logins", "Audit recent login activity", "admin", ["admin"],
        ["gws admin activities list login --json"]),

    # Cross-Service (4)
    "morning-briefing": Recipe("morning-briefing", "Today's events + unread emails + pending tasks", "cross-service", ["gmail", "calendar", "tasks"], [
        "gws calendar events list primary --timeMin {today} --maxResults 10 --json",
        "gws gmail users.messages list me --query 'is:unread' --limit 10 --json",
        "gws tasks tasks list {default_tasklist} --json",
    ]),
    "eod-wrap": Recipe("eod-wrap", "End-of-day wrap up: summarize completed, pending, tomorrow", "cross-service", ["calendar", "tasks"], [
        "gws calendar events list primary --timeMin {today_start} --timeMax {today_end} --json",
        "gws tasks tasks list {default_tasklist} --json",
    ]),
    "project-status": Recipe("project-status", "Aggregate project status from Drive, Sheets, Tasks", "cross-service", ["drive", "sheets", "tasks"], [
        "gws drive files list --query 'name contains {project}' --json",
        "gws tasks tasks list {tasklist_id} --json",
    ]),
    "inbox-zero": Recipe("inbox-zero", "Process inbox to zero: label, archive, reply, task", "cross-service", ["gmail", "tasks"], [
        "gws gmail users.messages list me --query 'is:inbox' --json",
        "# Process each: label, archive, or create task",
    ]),
}

PERSONAS: Dict[str, Dict] = {
    "executive-assistant": {
        "description": "Executive assistant managing schedules, emails, and communications",
        "recipes": ["morning-briefing", "today-schedule", "find-time", "send-email", "reply-to-thread",
                    "standup-report", "meeting-prep", "eod-wrap", "quick-event", "inbox-zero"],
    },
    "pm": {
        "description": "Project manager tracking tasks, meetings, and deliverables",
        "recipes": ["standup-report", "create-event", "find-time", "task-create", "task-progress",
                    "project-status", "weekly-summary", "share-folder", "sheet-read", "morning-briefing"],
    },
    "hr": {
        "description": "HR managing people, onboarding, and communications",
        "recipes": ["list-users", "user-info", "send-email", "create-event", "create-doc",
                    "share-folder", "chat-message", "list-groups", "export-contacts", "today-schedule"],
    },
    "sales": {
        "description": "Sales rep managing client communications and proposals",
        "recipes": ["send-email", "search-emails", "create-event", "find-time", "create-doc",
                    "share-file", "sheet-read", "sheet-write", "export-file", "morning-briefing"],
    },
    "it-admin": {
        "description": "IT administrator managing Workspace configuration and security",
        "recipes": ["list-users", "list-groups", "user-info", "audit-logins", "drive-activity",
                    "find-large-files", "cleanup-trash", "label-manager", "filter-setup", "share-folder"],
    },
    "developer": {
        "description": "Developer using Workspace APIs for automation",
        "recipes": ["sheet-read", "sheet-write", "sheet-append", "upload-file", "create-doc",
                    "chat-message", "task-create", "list-files", "export-file", "send-email"],
    },
    "marketing": {
        "description": "Marketing team member managing campaigns and content",
        "recipes": ["send-email", "create-doc", "share-file", "upload-file", "create-sheet",
                    "sheet-write", "chat-message", "create-event", "email-stats", "weekly-summary"],
    },
    "finance": {
        "description": "Finance team managing spreadsheets and reports",
        "recipes": ["sheet-read", "sheet-write", "sheet-append", "create-sheet", "export-file",
                    "share-file", "send-email", "find-large-files", "drive-activity", "weekly-summary"],
    },
    "legal": {
        "description": "Legal team managing documents and compliance",
        "recipes": ["create-doc", "share-file", "export-file", "search-emails", "send-email",
                    "upload-file", "list-files", "drive-activity", "audit-logins", "find-large-files"],
    },
    "support": {
        "description": "Customer support managing tickets and communications",
        "recipes": ["search-emails", "send-email", "reply-to-thread", "label-manager", "filter-setup",
                    "task-create", "chat-message", "unread-digest", "inbox-zero", "morning-briefing"],
    },
}


def list_recipes(persona: Optional[str], output_json: bool):
    """List all recipes, optionally filtered by persona."""
    if persona:
        if persona not in PERSONAS:
            print(f"Unknown persona: {persona}. Available: {', '.join(PERSONAS.keys())}")
            sys.exit(1)
        recipe_names = PERSONAS[persona]["recipes"]
        recipes = {k: v for k, v in RECIPES.items() if k in recipe_names}
        title = f"Recipes for {persona.upper()}: {PERSONAS[persona]['description']}"
    else:
        recipes = RECIPES
        title = "All 43 Google Workspace CLI Recipes"

    if output_json:
        output = []
        for name, r in recipes.items():
            output.append(asdict(r))
        print(json.dumps(output, indent=2))
        return

    print(f"\n{'='*60}")
    print(f" {title}")
    print(f"{'='*60}\n")

    by_category: Dict[str, list] = {}
    for name, r in recipes.items():
        by_category.setdefault(r.category, []).append(r)

    for cat, cat_recipes in sorted(by_category.items()):
        print(f" {cat.upper()} ({len(cat_recipes)})")
        for r in cat_recipes:
            svcs = ",".join(r.services)
            print(f" {r.name:<24} {r.description:<40} [{svcs}]")
        print()

    print(f" Total: {len(recipes)} recipes")
    print(f"\n{'='*60}\n")


def search_recipes(keyword: str, output_json: bool):
    """Search recipes by keyword."""
    keyword_lower = keyword.lower()
    matches = {k: v for k, v in RECIPES.items()
               if keyword_lower in k.lower()
               or keyword_lower in v.description.lower()
               or keyword_lower in v.category.lower()
               or any(keyword_lower in s for s in v.services)}

    if output_json:
        print(json.dumps([asdict(r) for r in matches.values()], indent=2))
        return

    print(f"\n Search results for '{keyword}': {len(matches)} matches\n")
    for name, r in matches.items():
        print(f" {r.name:<24} {r.description}")
    print()


def describe_recipe(name: str, output_json: bool):
    """Show full details for a recipe."""
    recipe = RECIPES.get(name)
    if not recipe:
        print(f"Unknown recipe: {name}")
        print(f"Use --list to see available recipes")
        sys.exit(1)

    if output_json:
        print(json.dumps(asdict(recipe), indent=2))
        return

    print(f"\n{'='*60}")
    print(f" Recipe: {recipe.name}")
    print(f"{'='*60}\n")
    print(f" Description: {recipe.description}")
    print(f" Category: {recipe.category}")
    print(f" Services: {', '.join(recipe.services)}")
    if recipe.prerequisites:
        print(f" Prerequisites: {recipe.prerequisites}")
    print(f"\n Commands:")
    for i, cmd in enumerate(recipe.commands, 1):
        print(f" {i}. {cmd}")
    print(f"\n{'='*60}\n")


def run_recipe(name: str, dry_run: bool):
    """Execute a recipe (or print commands in dry-run mode)."""
    recipe = RECIPES.get(name)
    if not recipe:
        print(f"Unknown recipe: {name}")
        sys.exit(1)

    if dry_run:
        print(f"\n [DRY RUN] Recipe: {recipe.name}\n")
        for i, cmd in enumerate(recipe.commands, 1):
            print(f" {i}. {cmd}")
        print(f"\n (No commands executed)")
        return

    print(f"\n Executing recipe: {recipe.name}\n")
    for cmd in recipe.commands:
        if cmd.startswith("#"):
            print(f" {cmd}")
            continue
        print(f" $ {cmd}")
        try:
            result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=30)
            if result.stdout:
                print(result.stdout)
            if result.returncode != 0 and result.stderr:
                print(f" Error: {result.stderr.strip()[:200]}")
        except subprocess.TimeoutExpired:
            print(f" Timeout after 30s")
        except OSError as e:
            print(f" Execution error: {e}")


def list_personas(output_json: bool):
    """List all available personas."""
    if output_json:
        print(json.dumps(PERSONAS, indent=2))
        return

    print(f"\n{'='*60}")
    print(f" 10 PERSONA BUNDLES")
    print(f"{'='*60}\n")
    for name, p in PERSONAS.items():
        print(f" {name:<24} {p['description']}")
        print(f" {'':24} Recipes: {', '.join(p['recipes'][:5])}...")
        print()
    print(f"{'='*60}\n")


def main():
    parser = argparse.ArgumentParser(
        description="Catalog, search, and execute Google Workspace CLI recipes",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s --list                             # List all 43 recipes
  %(prog)s --list --persona pm                # Recipes for project managers
  %(prog)s --search "email"                   # Search by keyword
  %(prog)s --describe standup-report          # Full recipe details
  %(prog)s --run standup-report --dry-run     # Preview recipe commands
  %(prog)s --personas                         # List all 10 personas
  %(prog)s --list --json                      # JSON output
""",
    )
    parser.add_argument("--list", action="store_true", help="List all recipes")
    parser.add_argument("--search", help="Search recipes by keyword")
    parser.add_argument("--describe", help="Show full details for a recipe")
    parser.add_argument("--run", help="Execute a recipe")
    parser.add_argument("--dry-run", action="store_true", help="Print commands without executing")
    parser.add_argument("--persona", help="Filter recipes by persona")
    parser.add_argument("--personas", action="store_true", help="List all personas")
    parser.add_argument("--json", action="store_true", help="Output JSON")
    args = parser.parse_args()

    if not any([args.list, args.search, args.describe, args.run, args.personas]):
        parser.print_help()
        return

    if args.personas:
        list_personas(args.json)
        return
    if args.list:
        list_recipes(args.persona, args.json)
        return
    if args.search:
        search_recipes(args.search, args.json)
        return
    if args.describe:
        describe_recipe(args.describe, args.json)
        return
    if args.run:
        run_recipe(args.run, args.dry_run)
        return


if __name__ == "__main__":
    main()
FILE:scripts/output_analyzer.py
#!/usr/bin/env python3
"""
Google Workspace CLI Output Analyzer — Parse, filter, and aggregate JSON/NDJSON output.

Reads JSON arrays or NDJSON streams from stdin or file, applies filters,
projections, sorting, grouping, and outputs in table/csv/json format.
Usage:
    gws drive files list --json | python3 output_analyzer.py --count
    gws drive files list --json | python3 output_analyzer.py --filter "mimeType=application/pdf"
    gws drive files list --json | python3 output_analyzer.py --select "name,size" --format table
    python3 output_analyzer.py --input results.json --group-by "mimeType"
    python3 output_analyzer.py --demo --select "name,mimeType,size" --format table
"""

import argparse
import csv
import io
import json
import sys
from dataclasses import dataclass
from typing import List, Dict, Any, Optional

DEMO_DATA = [
    {"id": "1", "name": "Q1 Report.pdf", "mimeType": "application/pdf", "size": "245760",
     "modifiedTime": "2026-03-10T14:30:00Z", "shared": True, "owners": [{"displayName": "Alice"}]},
    {"id": "2", "name": "Budget 2026.xlsx", "mimeType": "application/vnd.google-apps.spreadsheet", "size": "0",
     "modifiedTime": "2026-03-09T09:15:00Z", "shared": True, "owners": [{"displayName": "Bob"}]},
    {"id": "3", "name": "Meeting Notes.docx", "mimeType": "application/vnd.google-apps.document", "size": "0",
     "modifiedTime": "2026-03-08T16:00:00Z", "shared": False, "owners": [{"displayName": "Alice"}]},
    {"id": "4", "name": "Logo.png", "mimeType": "image/png", "size": "102400",
     "modifiedTime": "2026-03-07T11:00:00Z", "shared": False, "owners": [{"displayName": "Charlie"}]},
    {"id": "5", "name": "Presentation.pptx", "mimeType": "application/vnd.google-apps.presentation", "size": "0",
     "modifiedTime": "2026-03-06T10:00:00Z", "shared": True, "owners": [{"displayName": "Alice"}]},
    {"id": "6", "name": "Invoice-001.pdf", "mimeType": "application/pdf", "size": "89000",
     "modifiedTime": "2026-03-05T08:30:00Z", "shared": False, "owners": [{"displayName": "Bob"}]},
    {"id": "7", "name": "Project Plan.xlsx", "mimeType": "application/vnd.google-apps.spreadsheet", "size": "0",
     "modifiedTime": "2026-03-04T13:45:00Z", "shared": True, "owners": [{"displayName": "Charlie"}]},
    {"id": "8", "name": "Contract Draft.docx", "mimeType": "application/vnd.google-apps.document", "size": "0",
     "modifiedTime": "2026-03-03T09:00:00Z", "shared": False, "owners": [{"displayName": "Alice"}]},
]


def read_input(input_file: Optional[str]) -> List[Dict[str, Any]]:
    """Read JSON array or NDJSON from file or stdin."""
    if input_file:
        with open(input_file, "r") as f:
            text = f.read().strip()
    else:
        if sys.stdin.isatty():
            return []
        text = sys.stdin.read().strip()

    if not text:
        return []

    # Try JSON array first
    try:
        data = json.loads(text)
        if isinstance(data, list):
            return data
        if isinstance(data, dict):
            # Some gws commands wrap results in a key
            for key in ("files", "messages", "events", "items", "results",
                        "spreadsheets", "spaces", "tasks", "users", "groups"):
                if key in data and isinstance(data[key], list):
                    return data[key]
            return [data]
    except json.JSONDecodeError:
        pass

    # Try NDJSON
    records = []
    for line in text.split("\n"):
        line = line.strip()
        if line:
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue
    return records


def get_nested(obj: Dict, path: str) -> Any:
    """Get a nested value by dot-separated path."""
    parts = path.split(".")
    current = obj
    for part in parts:
        if isinstance(current, dict):
            current = current.get(part)
        elif isinstance(current, list) and part.isdigit():
            idx = int(part)
            current = current[idx] if idx < len(current) else None
        else:
            return None
        if current is None:
            return None
    return current


def apply_filter(records: List[Dict], filter_expr: str) -> List[Dict]:
    """Filter records by field=value expression."""
    if "=" not in filter_expr:
        return records
    field_path, value = filter_expr.split("=", 1)
    result = []
    for rec in records:
        rec_val = get_nested(rec, field_path)
        if rec_val is None:
            continue
        rec_str = str(rec_val).lower()
        if rec_str == value.lower() or value.lower() in rec_str:
            result.append(rec)
    return result


def apply_select(records: List[Dict], fields: str) -> List[Dict]:
    """Project specific fields from records."""
    field_list = [f.strip() for f in fields.split(",")]
    result = []
    for rec in records:
        projected = {}
        for f in field_list:
            projected[f] = get_nested(rec, f)
        result.append(projected)
    return result


def apply_sort(records: List[Dict], sort_field: str, reverse: bool = False) -> List[Dict]:
    """Sort records by a field."""
    def sort_key(rec):
        # Return a (rank, number, text) tuple so missing, numeric, and text
        # values are never compared with each other directly. Mixing bare
        # str and float keys raises TypeError in sorted().
        val = get_nested(rec, sort_field)
        if val is None:
            return (0, 0.0, "")
        try:
            return (1, float(val), "")
        except (ValueError, TypeError):
            return (2, 0.0, str(val).lower())
    return sorted(records, key=sort_key, reverse=reverse)


def apply_group_by(records: List[Dict], field: str) -> Dict[str, int]:
    """Group records by a field and count."""
    groups: Dict[str, int] = {}
    for rec in records:
        val = get_nested(rec, field)
        key = str(val) if val is not None else "(null)"
        groups[key] = groups.get(key, 0) + 1
    return dict(sorted(groups.items(), key=lambda x: x[1], reverse=True))


def compute_stats(records: List[Dict], field: str) -> Dict[str, Any]:
    """Compute min/max/avg/sum for a numeric field."""
    values = []
    for rec in records:
        val = get_nested(rec, field)
        if val is not None:
            try:
                values.append(float(val))
            except (ValueError, TypeError):
                continue
    if not values:
        return {"field": field, "count": 0, "error": "No numeric values found"}
    return {
        "field": field,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "avg": sum(values) / len(values),
    }


def format_table(records: List[Dict]) -> str:
    """Format records as an aligned text table."""
    if not records:
        return "(no records)"
    headers = list(records[0].keys())

    # Calculate column widths
    widths = {h: len(h) for h in headers}
    for rec in records:
        for h in headers:
            val = str(rec.get(h, ""))
            if len(val) > 60:
                val = val[:57] + "..."
            widths[h] = max(widths[h], len(val))

    # Header
    header_line = " ".join(h.ljust(widths[h]) for h in headers)
    sep_line = " ".join("-" * widths[h] for h in headers)
    lines = [header_line, sep_line]

    # Rows
    for rec in records:
        row = []
        for h in headers:
            val = str(rec.get(h, ""))
            if len(val) > 60:
                val = val[:57] + "..."
            row.append(val.ljust(widths[h]))
        lines.append(" ".join(row))
    return "\n".join(lines)


def format_csv_output(records: List[Dict]) -> str:
    """Format records as CSV."""
    if not records:
        return ""
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
    return output.getvalue()


def main():
    parser = argparse.ArgumentParser(
        description="Parse, filter, and aggregate JSON/NDJSON from gws CLI output",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  gws drive files list --json | %(prog)s --count
  gws drive files list --json | %(prog)s --filter "mimeType=pdf" --select "name,size"
  gws drive files list --json | %(prog)s --group-by "mimeType" --format table
  gws drive files list --json | %(prog)s --sort "size" --reverse --format table
  gws drive files list --json | %(prog)s --stats "size"
  %(prog)s --input results.json --select "name,mimeType" --format csv
  %(prog)s --demo --select "name,mimeType,size" --format table
""",
    )
    parser.add_argument("--input", help="Input file (default: stdin)")
    parser.add_argument("--demo", action="store_true", help="Use demo data")
    parser.add_argument("--count", action="store_true", help="Count records")
    parser.add_argument("--filter", help="Filter by field=value")
    parser.add_argument("--select", help="Comma-separated fields to project")
    parser.add_argument("--sort", help="Sort by field")
    parser.add_argument("--reverse", action="store_true", help="Reverse sort order")
    parser.add_argument("--group-by", help="Group by field and count")
    parser.add_argument("--stats", help="Compute stats for a numeric field")
    parser.add_argument("--format", choices=["json", "table", "csv"], default="json",
                        help="Output format (default: json)")
    parser.add_argument("--json", action="store_true", help="Shorthand for --format json")
    args = parser.parse_args()

    if args.json:
        args.format = "json"

    # Read input
    if args.demo:
        records = DEMO_DATA[:]
    else:
        records = read_input(args.input)
        if not records and not args.demo:
            # If no pipe input and no file, use demo
            records = DEMO_DATA[:]
            print("(No input detected, using demo data)\n", file=sys.stderr)

    # Apply operations in order
    if args.filter:
        records = apply_filter(records, args.filter)
    if args.sort:
        records = apply_sort(records, args.sort, args.reverse)

    # Count
    if args.count:
        if args.format == "json":
            print(json.dumps({"count": len(records)}))
        else:
            print(f"Count: {len(records)}")
        return

    # Group by
    if args.group_by:
        groups = apply_group_by(records, args.group_by)
        if args.format == "json":
            print(json.dumps(groups, indent=2))
        elif args.format == "csv":
            print(f"{args.group_by},count")
            for k, v in groups.items():
                print(f"{k},{v}")
        else:
            print(f"\n Group by: {args.group_by}\n")
            for k, v in groups.items():
                print(f" {k:<50} {v}")
            print(f"\n Total groups: {len(groups)}")
        return

    # Stats
    if args.stats:
        stats = compute_stats(records, args.stats)
        if args.format == "json":
            print(json.dumps(stats, indent=2))
        else:
            print(f"\n Stats for '{args.stats}':")
            for k, v in stats.items():
                if isinstance(v, float):
                    print(f" {k}: {v:,.2f}")
                else:
                    print(f" {k}: {v}")
        return

    # Select fields
    if args.select:
        records = apply_select(records, args.select)

    # Output
    if args.format == "json":
        print(json.dumps(records, indent=2))
    elif args.format == "csv":
        print(format_csv_output(records))
    else:
        print(f"\n{format_table(records)}\n")
        print(f" ({len(records)} records)\n")


if __name__ == "__main__":
    main()
FILE:scripts/workspace_audit.py
#!/usr/bin/env python3
"""
Google Workspace Security Audit — Audit Workspace configuration for security risks.

Checks Drive external sharing, Gmail forwarding rules, OAuth app grants,
Calendar visibility, admin settings, and generates remediation commands.
Runs in demo mode with embedded sample data when gws is not installed.

Usage:
    python3 workspace_audit.py
    python3 workspace_audit.py --json
    python3 workspace_audit.py --services gmail,drive,calendar
    python3 workspace_audit.py --demo
"""

import argparse
import json
import shutil
import subprocess
import sys
from dataclasses import dataclass, field, asdict
from typing import List, Dict, Optional


@dataclass
class AuditFinding:
    area: str
    check: str
    status: str  # PASS, WARN, FAIL
    message: str
    risk: str = ""
    remediation: str = ""


@dataclass
class AuditReport:
    findings: List[dict] = field(default_factory=list)
    score: int = 0
    max_score: int = 100
    grade: str = ""
    summary: str = ""
    demo_mode: bool = False


DEMO_FINDINGS = [
    AuditFinding("drive", "External sharing", "WARN",
                 "External sharing is enabled for the domain",
                 "Data exfiltration via shared links",
                 "Review sharing settings in Admin Console > Apps > Google Workspace > Drive"),
    AuditFinding("drive", "Link sharing defaults", "FAIL",
                 "Default link sharing is set to 'Anyone with the link'",
                 "Sensitive files accessible without authentication",
                 "gws admin settings update drive --defaultLinkSharing restricted"),
    AuditFinding("gmail", "Auto-forwarding", "PASS",
                 "No auto-forwarding rules detected for admin accounts"),
    AuditFinding("gmail", "SPF record", "PASS",
                 "SPF record configured correctly"),
    AuditFinding("gmail", "DMARC record", "WARN",
                 "DMARC policy is set to 'none' (monitoring only)",
                 "Email spoofing not actively blocked",
                 "Update DMARC DNS record: v=DMARC1; p=quarantine; rua=mailto:[email protected]"),
    AuditFinding("gmail", "DKIM signing", "PASS",
                 "DKIM signing is enabled"),
    AuditFinding("calendar", "Default visibility", "WARN",
                 "Calendar default visibility is 'See all event details'",
                 "Meeting details visible to all domain users",
                 "Admin Console > Apps > Calendar > Sharing settings > Set to 'Free/Busy'"),
    AuditFinding("calendar", "External sharing", "PASS",
                 "External calendar sharing is restricted"),
    AuditFinding("oauth", "Third-party apps", "FAIL",
                 "12 third-party OAuth apps with broad access detected",
                 "Unauthorized data access via OAuth grants",
                 "Review: Admin Console > Security > API controls > App access control"),
    AuditFinding("oauth", "High-risk apps", "WARN",
                 "3 apps have Drive full access scope",
                 "Apps can read/modify all Drive files",
                 "Audit each app: gws admin tokens list --json | filter by scope"),
    AuditFinding("admin", "Super admin count", "WARN",
                 "4 super admin accounts detected (recommended: 2-3)",
                 "Increased attack surface for privilege escalation",
                 "Reduce super admins: gws admin users list --query 'isAdmin=true' --json"),
    AuditFinding("admin", "2-Step verification", "PASS",
                 "2-Step verification enforced for all users"),
    AuditFinding("admin", "Password policy", "PASS",
                 "Minimum password length: 12 characters"),
    AuditFinding("admin", "Login challenges", "PASS",
                 "Suspicious login challenges enabled"),
]


def run_gws_command(cmd: List[str]) -> Optional[str]:
    """Run a gws command and return stdout, or None on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=20)
        if result.returncode == 0:
            return result.stdout
        return None
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
        return None


def audit_drive() -> List[AuditFinding]:
    """Audit Drive sharing and security settings."""
    findings = []

    # Check sharing settings
    output = run_gws_command(["gws", "drive", "about", "get", "--json"])
    if output:
        try:
            data = json.loads(output)
            # Check if external sharing is enabled
            if data.get("canShareOutsideDomain", True):
                findings.append(AuditFinding(
                    "drive", "External sharing", "WARN",
                    "External sharing is enabled",
                    "Data exfiltration via shared links",
                    "Review Admin Console > Apps > Drive > Sharing settings"
                ))
            else:
                findings.append(AuditFinding(
                    "drive", "External sharing", "PASS", "External 
sharing is restricted" )) except json.JSONDecodeError: findings.append(AuditFinding( "drive", "External sharing", "WARN", "Could not parse Drive settings" )) else: findings.append(AuditFinding( "drive", "External sharing", "WARN", "Could not retrieve Drive settings" )) return findings def audit_gmail() -> List[AuditFinding]: """Audit Gmail forwarding and email security.""" findings = [] # Check forwarding rules output = run_gws_command(["gws", "gmail", "users.settings.forwardingAddresses", "list", "me", "--json"]) if output: try: data = json.loads(output) addrs = data if isinstance(data, list) else data.get("forwardingAddresses", []) if addrs: findings.append(AuditFinding( "gmail", "Auto-forwarding", "WARN", f"{len(addrs)} forwarding addresses configured", "Data exfiltration via email forwarding", "Review: gws gmail users.settings.forwardingAddresses list me --json" )) else: findings.append(AuditFinding( "gmail", "Auto-forwarding", "PASS", "No forwarding addresses configured" )) except json.JSONDecodeError: pass else: findings.append(AuditFinding( "gmail", "Auto-forwarding", "WARN", "Could not check forwarding settings" )) return findings def audit_calendar() -> List[AuditFinding]: """Audit Calendar sharing settings.""" findings = [] output = run_gws_command(["gws", "calendar", "calendarList", "get", "primary", "--json"]) if output: findings.append(AuditFinding( "calendar", "Primary calendar", "PASS", "Primary calendar accessible" )) else: findings.append(AuditFinding( "calendar", "Primary calendar", "WARN", "Could not access primary calendar" )) return findings def run_live_audit(services: List[str]) -> AuditReport: """Run live audit against actual gws installation.""" report = AuditReport() all_findings = [] audit_map = { "drive": audit_drive, "gmail": audit_gmail, "calendar": audit_calendar, } for svc in services: fn = audit_map.get(svc) if fn: all_findings.extend(fn()) report.findings = [asdict(f) for f in all_findings] report = calculate_score(report) return 
report def run_demo_audit() -> AuditReport: """Return demo audit report with embedded sample data.""" report = AuditReport( findings=[asdict(f) for f in DEMO_FINDINGS], demo_mode=True, ) report = calculate_score(report) return report def calculate_score(report: AuditReport) -> AuditReport: """Calculate audit score and grade.""" total = len(report.findings) if total == 0: report.score = 0 report.grade = "N/A" report.summary = "No checks performed" return report passes = sum(1 for f in report.findings if f["status"] == "PASS") warns = sum(1 for f in report.findings if f["status"] == "WARN") fails = sum(1 for f in report.findings if f["status"] == "FAIL") # Score: PASS=100, WARN=50, FAIL=0 score = int(((passes * 100) + (warns * 50)) / total) report.score = score report.max_score = 100 if score >= 90: report.grade = "A" elif score >= 75: report.grade = "B" elif score >= 60: report.grade = "C" elif score >= 40: report.grade = "D" else: report.grade = "F" report.summary = f"{passes} passed, {warns} warnings, {fails} failures — Score: {score}/100 (Grade: {report.grade})" return report def main(): parser = argparse.ArgumentParser( description="Security and configuration audit for Google Workspace", formatter_class=argparse.RawDescriptionHelpFormatter, epilog=""" Examples: %(prog)s # Full audit (or demo if gws not installed) %(prog)s --json # JSON output %(prog)s --services gmail,drive # Audit specific services %(prog)s --demo # Demo mode with sample data """, ) parser.add_argument("--json", action="store_true", help="Output JSON") parser.add_argument("--services", default="gmail,drive,calendar", help="Comma-separated services to audit (default: gmail,drive,calendar)") parser.add_argument("--demo", action="store_true", help="Run with demo data") args = parser.parse_args() services = [s.strip() for s in args.services.split(",") if s.strip()] if args.demo or not shutil.which("gws"): report = run_demo_audit() else: report = run_live_audit(services) if args.json: 
print(json.dumps(asdict(report), indent=2)) else: print(f"\n{'='*60}") print(f" GOOGLE WORKSPACE SECURITY AUDIT") if report.demo_mode: print(f" (DEMO MODE — sample data)") print(f"{'='*60}\n") print(f" Score: {report.score}/{report.max_score} (Grade: {report.grade})\n") current_area = "" for f in report.findings: if f["area"] != current_area: current_area = f["area"] print(f"\n {current_area.upper()}") print(f" {'-'*40}") icon = {"PASS": "PASS", "WARN": "WARN", "FAIL": "FAIL"}.get(f["status"], "????") print(f" [{icon}] {f['check']}: {f['message']}") if f.get("risk") and f["status"] != "PASS": print(f" Risk: {f['risk']}") if f.get("remediation") and f["status"] != "PASS": print(f" Fix: {f['remediation']}") print(f"\n {'='*56}") print(f" {report.summary}") print(f"\n{'='*60}\n") if __name__ == "__main__": main()
4 production-ready business and growth skills: customer success manager with health scoring and churn prediction, sales engineer with RFP analysis, revenue o...
---
name: "business-growth-skills"
description: "4 production-ready business and growth skills: customer success manager with health scoring and churn prediction, sales engineer with RFP analysis, revenue operations with pipeline and GTM metrics, and contract & proposal writer. Python tools included (all stdlib-only). Works with Claude Code, Codex CLI, and OpenClaw."
version: 1.1.0
author: Alireza Rezvani
license: MIT
tags:
- business
- customer-success
- sales
- revenue-operations
- growth
agents:
- claude-code
- codex-cli
- openclaw
---
# Business & Growth Skills
4 production-ready skills for customer success, sales engineering, revenue operations, and contract and proposal writing.
## Quick Start
### Claude Code
```
/read business-growth/customer-success-manager/SKILL.md
```
### Codex CLI
```bash
npx agent-skills-cli add alirezarezvani/claude-skills/business-growth
```
## Skills Overview
| Skill | Folder | Focus |
|-------|--------|-------|
| Customer Success Manager | `customer-success-manager/` | Health scoring, churn prediction, expansion |
| Sales Engineer | `sales-engineer/` | RFP analysis, competitive matrices, PoC planning |
| Revenue Operations | `revenue-operations/` | Pipeline analysis, forecast accuracy, GTM metrics |
| Contract & Proposal Writer | `contract-and-proposal-writer/` | Proposal generation, contract templates |
## Python Tools
9 scripts, all stdlib-only:
```bash
python3 customer-success-manager/scripts/health_score_calculator.py --help
python3 revenue-operations/scripts/pipeline_analyzer.py --help
```
## Rules
- Load only the SKILL.md for the specific skill you need
- Use Python tools for scoring and metrics, not manual estimates
FILE:CLAUDE.md
# Business & Growth Skills - Claude Code Guidance
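This guide covers the 3 production-ready business and growth skills that ship Python automation tools. The Contract & Proposal Writer skill is documented separately in `contract-and-proposal-writer/SKILL.md`.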
## Business & Growth Skills Overview
**Available Skills:**
1. **customer-success-manager/** - Customer health scoring, churn risk analysis, expansion opportunities (3 Python tools)
2. **sales-engineer/** - Technical discovery, RFP analysis, competitive positioning, POC planning (3 Python tools)
3. **revenue-operations/** - Pipeline analysis, forecast accuracy, GTM efficiency metrics (3 Python tools)
**Total Tools:** 9 Python automation tools, 9 knowledge bases, 19+ templates
## Python Automation Tools
### Customer Success Manager Tools
#### 1. Health Score Calculator (`customer-success-manager/scripts/health_score_calculator.py`)
**Purpose:** Multi-dimensional customer health scoring with trend analysis
**Features:**
- Weighted scoring across 4 dimensions (usage, engagement, support, relationship)
- Red/Yellow/Green classification with configurable thresholds
- Trend analysis comparing current vs previous period
- Segment-aware benchmarking (Enterprise/Mid-Market/SMB)
**Usage:**
```bash
python customer-success-manager/scripts/health_score_calculator.py customer_data.json
python customer-success-manager/scripts/health_score_calculator.py customer_data.json --format json
```
#### 2. Churn Risk Analyzer (`customer-success-manager/scripts/churn_risk_analyzer.py`)
**Purpose:** Identify at-risk accounts with intervention recommendations
**Features:**
- Risk scoring based on behavioral signals
- Warning signal detection and categorization
- Tier-appropriate intervention playbooks
- Urgency-based prioritization
**Usage:**
```bash
python customer-success-manager/scripts/churn_risk_analyzer.py customer_data.json
python customer-success-manager/scripts/churn_risk_analyzer.py customer_data.json --format json
```
#### 3. Expansion Opportunity Scorer (`customer-success-manager/scripts/expansion_opportunity_scorer.py`)
**Purpose:** Identify upsell and cross-sell opportunities
**Features:**
- Adoption depth analysis across product modules
- Whitespace mapping for unused features
- Revenue opportunity estimation
- Priority ranking by effort and impact
**Usage:**
```bash
python customer-success-manager/scripts/expansion_opportunity_scorer.py customer_data.json
python customer-success-manager/scripts/expansion_opportunity_scorer.py customer_data.json --format json
```
### Sales Engineer Tools
#### 4. RFP Response Analyzer (`sales-engineer/scripts/rfp_response_analyzer.py`)
**Purpose:** Score RFP/RFI coverage and identify gaps
**Features:**
- Requirement coverage scoring (Full/Partial/Planned/Gap)
- Effort estimation per requirement
- Gap identification with mitigation strategies
- Overall bid/no-bid recommendation
**Usage:**
```bash
python sales-engineer/scripts/rfp_response_analyzer.py rfp_data.json
python sales-engineer/scripts/rfp_response_analyzer.py rfp_data.json --format json
```
#### 5. Competitive Matrix Builder (`sales-engineer/scripts/competitive_matrix_builder.py`)
**Purpose:** Generate feature comparison matrices and competitive positioning
**Features:**
- Feature-by-feature comparison matrix
- Competitive scoring with weighted categories
- Differentiator identification
- Battlecard-ready output
**Usage:**
```bash
python sales-engineer/scripts/competitive_matrix_builder.py competitive_data.json
python sales-engineer/scripts/competitive_matrix_builder.py competitive_data.json --format json
```
#### 6. POC Planner (`sales-engineer/scripts/poc_planner.py`)
**Purpose:** Plan proof-of-concept engagements
**Features:**
- Timeline estimation based on scope
- Resource allocation planning
- Success criteria definition
- Evaluation scorecard generation
**Usage:**
```bash
python sales-engineer/scripts/poc_planner.py poc_data.json
python sales-engineer/scripts/poc_planner.py poc_data.json --format json
```
### Revenue Operations Tools
#### 7. Pipeline Analyzer (`revenue-operations/scripts/pipeline_analyzer.py`)
**Purpose:** Analyze sales pipeline health and velocity
**Features:**
- Coverage ratio calculation (pipeline/quota)
- Stage conversion rate analysis
- Sales velocity metrics (4-lever model)
- Deal aging analysis
**Usage:**
```bash
python revenue-operations/scripts/pipeline_analyzer.py pipeline_data.json
python revenue-operations/scripts/pipeline_analyzer.py pipeline_data.json --format json
```
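The two headline metrics above can be sketched in a few lines. This is a minimal illustration, not the script's actual implementation: the `PipelineSnapshot` fields are hypothetical names, and the velocity formula is the commonly used four-lever form (opportunities × average deal size × win rate ÷ sales cycle length), which is assumed to be what "4-lever model" refers to.

```python
from dataclasses import dataclass


@dataclass
class PipelineSnapshot:
    """Hypothetical input shape; the real script's JSON schema may differ."""
    open_pipeline_value: float  # total value of open opportunities
    quota: float                # quota for the same period
    num_opportunities: int      # qualified opportunities in the period
    avg_deal_size: float        # average won-deal value
    win_rate: float             # closed-won share of closed deals, 0..1
    sales_cycle_days: float     # average days from open to close


def coverage_ratio(s: PipelineSnapshot) -> float:
    """Pipeline coverage = open pipeline / quota."""
    if s.quota <= 0:
        raise ValueError("quota must be positive")
    return s.open_pipeline_value / s.quota


def sales_velocity(s: PipelineSnapshot) -> float:
    """Expected revenue per day from the four levers."""
    if s.sales_cycle_days <= 0:
        raise ValueError("sales_cycle_days must be positive")
    return (s.num_opportunities * s.avg_deal_size * s.win_rate) / s.sales_cycle_days


snapshot = PipelineSnapshot(
    open_pipeline_value=3_000_000, quota=1_000_000,
    num_opportunities=40, avg_deal_size=25_000,
    win_rate=0.25, sales_cycle_days=50,
)
print(f"coverage: {coverage_ratio(snapshot):.1f}x")
print(f"velocity: {sales_velocity(snapshot):,.0f} per day")
```

Each lever appears once in the velocity formula, so improving any single lever by 10% moves velocity by roughly the same proportion; that is what makes the model useful for deciding where to focus.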
#### 8. Forecast Accuracy Tracker (`revenue-operations/scripts/forecast_accuracy_tracker.py`)
**Purpose:** Measure and improve forecast accuracy
**Features:**
- MAPE (Mean Absolute Percentage Error) calculation
- Forecast bias detection (over/under-forecasting)
- Period-over-period trend analysis
- Category-level accuracy breakdown
**Usage:**
```bash
python revenue-operations/scripts/forecast_accuracy_tracker.py forecast_data.json
python revenue-operations/scripts/forecast_accuracy_tracker.py forecast_data.json --format json
```
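As a rough illustration of the first two features, MAPE and forecast bias can be computed as below. The function names and the choice to skip periods with a zero actual are assumptions made for this sketch; the script itself may handle these cases differently.

```python
from typing import List, Sequence


def _percentage_errors(forecast: Sequence[float], actual: Sequence[float],
                       signed: bool) -> List[float]:
    """Per-period errors relative to actuals; periods with actual == 0 are skipped."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    errors = []
    for f, a in zip(forecast, actual):
        if a == 0:
            continue  # percentage error is undefined for a zero actual
        diff = (f - a) if signed else abs(f - a)
        errors.append(diff / abs(a))
    if not errors:
        raise ValueError("need at least one period with a non-zero actual")
    return errors


def mape(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent."""
    errors = _percentage_errors(forecast, actual, signed=False)
    return 100.0 * sum(errors) / len(errors)


def forecast_bias(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean signed percentage error; positive means over-forecasting."""
    errors = _percentage_errors(forecast, actual, signed=True)
    return 100.0 * sum(errors) / len(errors)


forecasts = [110.0, 90.0, 100.0]
actuals = [100.0, 100.0, 100.0]
print(f"MAPE: {mape(forecasts, actuals):.2f}%")
print(f"Bias: {forecast_bias(forecasts, actuals):+.2f}%")
```

The example data shows why both numbers matter: one over-forecast and one under-forecast cancel out in the bias while still contributing to MAPE.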
#### 9. GTM Efficiency Calculator (`revenue-operations/scripts/gtm_efficiency_calculator.py`)
**Purpose:** Calculate go-to-market efficiency metrics
**Features:**
- Magic number calculation
- LTV:CAC ratio analysis
- CAC payback period
- Burn multiple assessment
- Industry benchmarking
**Usage:**
```bash
python revenue-operations/scripts/gtm_efficiency_calculator.py gtm_data.json
python revenue-operations/scripts/gtm_efficiency_calculator.py gtm_data.json --format json
```
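The metrics listed above have several definitions in circulation. The sketch below uses commonly cited forms and should be read as an assumption about what the script computes, not a description of it; all function and parameter names are hypothetical.

```python
def magic_number(revenue_this_quarter: float, revenue_prev_quarter: float,
                 sm_spend_prev_quarter: float) -> float:
    """Annualized quarterly revenue growth per unit of prior-quarter S&M spend."""
    if sm_spend_prev_quarter <= 0:
        raise ValueError("sales and marketing spend must be positive")
    return (revenue_this_quarter - revenue_prev_quarter) * 4 / sm_spend_prev_quarter


def ltv(monthly_arpa: float, gross_margin: float, monthly_churn: float) -> float:
    """Lifetime value = margin-adjusted monthly revenue / monthly churn rate."""
    if monthly_churn <= 0:
        raise ValueError("monthly churn must be positive")
    return monthly_arpa * gross_margin / monthly_churn


def ltv_to_cac(ltv_value: float, cac: float) -> float:
    """Ratio of lifetime value to customer acquisition cost."""
    if cac <= 0:
        raise ValueError("CAC must be positive")
    return ltv_value / cac


def cac_payback_months(cac: float, monthly_arpa: float, gross_margin: float) -> float:
    """Months of margin-adjusted revenue needed to recover CAC."""
    monthly_margin = monthly_arpa * gross_margin
    if monthly_margin <= 0:
        raise ValueError("margin-adjusted monthly revenue must be positive")
    return cac / monthly_margin


def burn_multiple(net_burn: float, net_new_arr: float) -> float:
    """Cash burned per unit of net new ARR; lower is better."""
    if net_new_arr <= 0:
        raise ValueError("net new ARR must be positive")
    return net_burn / net_new_arr


customer_ltv = ltv(monthly_arpa=1_000, gross_margin=0.8, monthly_churn=0.02)
print(f"Magic number: {magic_number(1_200_000, 1_000_000, 800_000):.2f}")
print(f"LTV:CAC:      {ltv_to_cac(customer_ltv, cac=12_000):.2f}")
print(f"CAC payback:  {cac_payback_months(12_000, 1_000, 0.8):.1f} months")
print(f"Burn multiple: {burn_multiple(1_500_000, 1_000_000):.2f}")
```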
## Quality Standards
**All business & growth Python tools must:**
- Use standard library only (no external dependencies)
- Support both JSON and human-readable output via `--format` flag
- Provide clear error messages for invalid input
- Return appropriate exit codes
- Process files locally (no API calls)
- Include argparse CLI with `--help` support
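A tool that satisfies these standards might follow the skeleton below. It is a sketch only: `load_records`, the `record_count` result, and the specific exit codes are placeholders chosen for illustration, not taken from any of the nine scripts.

```python
import argparse
import json
import sys
from typing import List, Optional


def load_records(path: str) -> list:
    """Read a local JSON file and require a top-level array."""
    with open(path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of records")
    return data


def main(argv: Optional[List[str]] = None) -> int:
    """Return 0 on success and 1 on invalid input."""
    parser = argparse.ArgumentParser(description="Hypothetical example tool")
    parser.add_argument("input", help="Path to the input JSON file")
    parser.add_argument("--format", choices=["text", "json"], default="text",
                        help="Output format (default: text)")
    args = parser.parse_args(argv)

    try:
        records = load_records(args.input)
    except FileNotFoundError:
        print(f"Error: file not found: {args.input}", file=sys.stderr)
        return 1
    except ValueError as exc:  # also covers json.JSONDecodeError
        print(f"Error: invalid input: {exc}", file=sys.stderr)
        return 1

    result = {"record_count": len(records)}
    if args.format == "json":
        print(json.dumps(result, indent=2))
    else:
        print(f"Records: {result['record_count']}")
    return 0
```

In a real script the module would end with `raise SystemExit(main())` under an `if __name__ == "__main__":` guard so the return value becomes the process exit code. Passing `argv` explicitly, as in `main(["data.json", "--format", "json"])`, keeps the entry point easy to test.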
## Related Skills
- **Marketing:** Content creation, demand generation -> `../marketing-skill/`
- **Product Team:** User research, feature prioritization -> `../product-team/`
- **C-Level:** Strategic planning -> `../c-level-advisor/`
- **Engineering:** Technical implementation -> `../engineering-team/`
---
**Last Updated:** February 2026
**Skills Deployed:** 3/3 business & growth skills production-ready
**Total Tools:** 9 Python automation tools
FILE:contract-and-proposal-writer/SKILL.md
---
name: "contract-and-proposal-writer"
description: "Contract & Proposal Writer"
---
# Contract & Proposal Writer
**Tier:** POWERFUL
**Category:** Business Growth
**Domain:** Legal Documents, Business Development, Client Relations
---
## Overview
Generate professional, jurisdiction-aware business documents: freelance contracts, project proposals, SOWs, NDAs, and MSAs. Outputs structured Markdown with docx conversion instructions. Covers US (Delaware), EU (GDPR), UK, and DACH (German law) jurisdictions.
**Not a substitute for legal counsel.** Use these templates as strong starting points; review with an attorney for high-value or complex engagements.
---
## Core Capabilities
- Freelance development contracts (fixed-price & hourly)
- Project proposals with timeline/budget breakdown
- Statements of Work (SOW) with deliverables matrix
- NDAs (mutual & one-way)
- Master Service Agreements (MSA)
- Jurisdiction-specific clauses (US/EU/UK/DACH)
- GDPR Data Processing Addenda (EU/DACH)
---
## Key Clauses Reference
| Clause | Options |
|--------|---------|
| Payment terms | Net-30, milestone-based, monthly retainer |
| IP ownership | Work-for-hire (US), assignment (EU/UK), license-back |
| Liability cap | 1x contract value (standard), 3x (high-risk) |
| Termination | For cause (14-day cure), convenience (30/60/90-day notice) |
| Confidentiality | 2-5 year term, perpetual for trade secrets |
| Warranty | "As-is" disclaimer, limited 30/90-day fix warranty |
| Dispute resolution | Arbitration (AAA/ICC), courts (jurisdiction-specific) |
---
## When to Use
- Starting a new client engagement and need a contract fast
- Client asks for a proposal with pricing and timeline
- Partnership or vendor relationship requiring an MSA
- Protecting IP or confidential information with an NDA
- EU/DACH project requiring GDPR-compliant data clauses
---
## Workflow
### 1. Gather Requirements
Ask the user:
1. Document type? (contract / proposal / SOW / NDA / MSA)
2. Jurisdiction? (US-Delaware / EU / UK / DACH)
3. Engagement type? (fixed-price / hourly / retainer)
4. Parties? (names, roles, business addresses)
5. Scope summary? (1-3 sentences)
6. Total value or hourly rate?
7. Start date / end date or duration?
8. Special requirements? (IP assignment, white-label, subcontractors)
### 2. Select Template
| Type | Jurisdiction | Template |
|------|-------------|----------|
| Dev contract fixed | Any | Template A |
| Consulting retainer | Any | Template B |
| SaaS partnership | Any | Template C |
| NDA mutual | US/EU/UK/DACH | NDA-M |
| NDA one-way | US/EU/UK/DACH | NDA-OW |
| SOW | Any | SOW base |
### 3. Generate & Fill
Fill all [BRACKETED] placeholders. Flag missing data as "REQUIRED".
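One way to automate this step is a small substitution pass over the template text. The sketch below is an assumption about how it could be done, not part of the skill: `fill_template` is a hypothetical helper, and its pattern only matches upper-case placeholders such as `[DATE]`, leaving bracketed defaults like `[10]`, checkboxes like `[ ]`, and mixed-case placeholders such as `[Deliverable 1]` untouched.

```python
import re
from typing import Dict, List, Tuple

# Upper-case placeholders only, e.g. [DATE] or [YOUR LEGAL NAME / COMPANY].
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z0-9 /_-]*)\]")


def fill_template(template: str, values: Dict[str, str]) -> Tuple[str, List[str]]:
    """Substitute known placeholders and flag the rest as REQUIRED.

    Returns the filled text and the list of placeholder names still missing.
    """
    missing: List[str] = []

    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        value = values.get(key, "").strip()
        if value:
            return value
        if key not in missing:
            missing.append(key)
        return f"[REQUIRED: {key}]"

    return PLACEHOLDER.sub(substitute, template), missing


template = (
    "**Client:** [CLIENT LEGAL NAME]\n"
    "**Effective Date:** [DATE]\n"
    "Client has [10] business days to accept."
)
filled, missing = fill_template(template, {"CLIENT LEGAL NAME": "Acme GmbH"})
print(filled)
print("Still required:", missing)
```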
### 4. Convert to DOCX
```bash
# Install pandoc
brew install pandoc        # macOS
sudo apt install pandoc    # Ubuntu

# Basic conversion
# Margins, fonts, and heading styles for DOCX output come from the reference
# document; LaTeX variables such as geometry or fontsize have no effect here.
pandoc contract.md -o contract.docx \
  --reference-doc=reference.docx

# With numbered sections (legal style)
pandoc contract.md -o contract.docx \
  --number-sections

# With custom company template
pandoc contract.md -o contract.docx \
  --reference-doc=company-template.docx
```
---
## Jurisdiction Notes
### US (Delaware)
- Governing law: State of Delaware
- Work-made-for-hire doctrine applies (Copyright Act, 17 U.S.C. § 101)
- Arbitration: AAA Commercial Rules
- Non-compete: enforceable with reasonable scope/time
### EU (GDPR)
- Must include Data Processing Addendum if handling personal data
- IP assignment requires separate written deed in some member states
- Arbitration: ICC or local chamber
### UK (post-Brexit)
- Governed by English law
- IP: Patents Act 1977 / CDPA 1988
- Arbitration: LCIA Rules
- Data: UK GDPR (post-Brexit equivalent)
### DACH (Germany / Austria / Switzerland)
- BGB (Bürgerliches Gesetzbuch) governs contracts under German law; Austria (ABGB) and Switzerland (OR) have their own civil codes
- Written form requirement for certain clauses (§ 126 BGB)
- IP: Author always retains moral rights; the contract must explicitly grant Nutzungsrechte (usage rights)
- Non-competes: max 2 years, compensation required (§§ 74, 74a HGB)
- Jurisdiction: German courts (Landgericht) or DIS arbitration
- DSGVO (the German name for the GDPR) is mandatory for personal data processing
- Kündigungsfristen: statutory notice periods apply
---
## Template A: Web Dev Fixed-Price Contract
```markdown
# SOFTWARE DEVELOPMENT AGREEMENT
**Effective Date:** [DATE]
**Client:** [CLIENT LEGAL NAME], [ADDRESS] ("Client")
**Developer:** [YOUR LEGAL NAME / COMPANY], [ADDRESS] ("Developer")
---
## 1. SERVICES
Developer agrees to design, develop, and deliver:
**Project:** [PROJECT NAME]
**Description:** [1-3 sentence scope]
**Deliverables:**
- [Deliverable 1] due [DATE]
- [Deliverable 2] due [DATE]
- [Deliverable 3] due [DATE]
## 2. PAYMENT
**Total Fee:** [CURRENCY] [AMOUNT]
| Milestone | Amount | Due |
|-----------|--------|-----|
| Contract signing | 50% | Upon execution |
| Beta delivery | 25% | [DATE] |
| Final acceptance | 25% | Within 5 days of acceptance |
Late payments accrue interest at 1.5% per month.
Client has [10] business days to accept or reject deliverables in writing.
## 3. INTELLECTUAL PROPERTY
Upon receipt of full payment, Developer assigns all right, title, and interest in the
Work Product to Client as a work made for hire (US) / by assignment of future copyright (EU/UK).
Developer retains the right to display Work Product in portfolio unless Client
requests confidentiality in writing within [30] days of delivery.
Pre-existing IP (tools, libraries, frameworks) remains Developer's property.
Developer grants Client a perpetual, royalty-free license to use pre-existing IP
as embedded in the Work Product.
## 4. CONFIDENTIALITY
Each party keeps confidential all non-public information received from the other.
This obligation survives termination for [3] years.
## 5. WARRANTIES
Developer warrants Work Product will substantially conform to specifications for
[90] days post-delivery. Developer will fix material defects at no charge during
this period. EXCEPT AS STATED, WORK PRODUCT IS PROVIDED "AS IS."
## 6. LIABILITY
Developer's total liability shall not exceed total fees paid under this Agreement.
Neither party is liable for indirect, incidental, or consequential damages.
## 7. TERMINATION
For Cause: Either party may terminate if the other materially breaches and fails
to cure within [14] days of written notice.
For Convenience: Client may terminate with [30] days written notice and pay for
all work completed plus [10%] of remaining contract value.
## 8. DISPUTE RESOLUTION
US: Binding arbitration under AAA Commercial Rules, [CITY], Delaware law.
EU/DACH: ICC / DIS arbitration, [CITY]. German / English law.
UK: LCIA Rules, London. English law.
## 9. GENERAL
- Entire Agreement: Supersedes all prior discussions.
- Amendments: Must be in writing, signed by both parties.
- Independent Contractor: Developer is not an employee of Client.
---
CLIENT: _________________________ Date: _________
[CLIENT NAME], [TITLE]
DEVELOPER: _________________________ Date: _________
[YOUR NAME], [TITLE]
```
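The payment terms in Template A (a 50% / 25% / 25% milestone split and 1.5% monthly late interest) reduce to simple arithmetic. The sketch below assumes simple, non-compounding interest and whole months of delay, neither of which the template specifies; the function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import List, Sequence, Tuple

CENT = Decimal("0.01")

# Milestone shares as written in Template A.
DEFAULT_SPLITS = (
    ("Contract signing", Decimal("0.50")),
    ("Beta delivery", Decimal("0.25")),
    ("Final acceptance", Decimal("0.25")),
)


def milestone_schedule(
    total_fee: Decimal,
    splits: Sequence[Tuple[str, Decimal]] = DEFAULT_SPLITS,
) -> List[Tuple[str, Decimal]]:
    """Split the fee by share; the last milestone absorbs any rounding remainder."""
    if sum(share for _, share in splits) != Decimal("1"):
        raise ValueError("milestone shares must sum to 1")
    schedule = []
    allocated = Decimal("0")
    for name, share in splits[:-1]:
        amount = (total_fee * share).quantize(CENT, rounding=ROUND_HALF_UP)
        schedule.append((name, amount))
        allocated += amount
    schedule.append((splits[-1][0], total_fee - allocated))
    return schedule


def late_interest(amount: Decimal, months_late: int,
                  monthly_rate: Decimal = Decimal("0.015")) -> Decimal:
    """Simple (non-compounding) interest on an overdue amount. Assumption."""
    if months_late < 0:
        raise ValueError("months_late cannot be negative")
    return (amount * monthly_rate * months_late).quantize(CENT, rounding=ROUND_HALF_UP)


for name, amount in milestone_schedule(Decimal("20000")):
    print(f"{name:<18} {amount:>10}")
print("Interest on 5000 overdue by 2 months:", late_interest(Decimal("5000"), 2))
```

`Decimal` is used instead of `float` so that invoice amounts round predictably to the cent.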
---
## Template B: Monthly Consulting Retainer
```markdown
# CONSULTING RETAINER AGREEMENT
**Effective Date:** [DATE]
**Client:** [CLIENT LEGAL NAME] ("Client")
**Consultant:** [YOUR NAME / COMPANY] ("Consultant")
---
## 1. SERVICES
Consultant provides [DOMAIN, e.g., "CTO advisory and technical architecture"] services.
**Monthly Hours:** Up to [X] hours/month
**Rollover:** Unused hours [do / do not] roll over (max [X] hours banked)
**Overflow Rate:** [CURRENCY] [RATE]/hr for hours exceeding retainer
## 2. FEES
**Monthly Retainer:** [CURRENCY] [AMOUNT], due on the 1st of each month.
**Payment Method:** Bank transfer / Stripe / SEPA direct debit
**Late Payment:** 2% monthly interest after [10]-day grace period.
## 3. TERM AND TERMINATION
**Initial Term:** [3] months starting [DATE]
**Renewal:** Auto-renews monthly unless either party gives [30] days written notice.
**Immediate termination:** For material breach uncured after [7] days notice.
On termination, Consultant delivers all work in progress within [5] business days.
## 4. INTELLECTUAL PROPERTY
Work product created under this Agreement belongs to [Client / Consultant / jointly].
Advisory output (recommendations, analyses) is Client property upon full payment.
## 5. EXCLUSIVITY
[OPTION A - Non-exclusive:]
This Agreement is non-exclusive. Consultant may work with other clients.
[OPTION B - Partial exclusivity:]
Consultant will not work with direct competitors of Client during the term
and [90] days thereafter.
## 6. CONFIDENTIALITY AND DATA PROTECTION
EU/DACH: If Consultant processes personal data on behalf of Client, the parties
shall execute a Data Processing Agreement (DPA) per Art. 28 GDPR.
## 7. LIABILITY
Consultant's aggregate liability is capped at [3x] the fees paid in the [3] months
preceding the claim.
---
Signatures as above.
```
---
## Template C: SaaS Partnership Agreement
```markdown
# SAAS PARTNERSHIP AGREEMENT
**Effective Date:** [DATE]
**Provider:** [NAME], [ADDRESS]
**Partner:** [NAME], [ADDRESS]
---
## 1. PURPOSE
Provider grants Partner [reseller / referral / white-label / integration] rights to
Provider's [PRODUCT NAME] ("Software") subject to this Agreement.
## 2. PARTNERSHIP TYPE
[ ] Referral: Partner refers customers; earns [X%] of first-year ARR per referral.
[ ] Reseller: Partner resells licenses; earns [X%] discount off list price.
[ ] White-label: Partner rebrands Software; pays [AMOUNT]/month platform fee.
[ ] Integration: Partner integrates Software via API; terms in Exhibit A.
## 3. REVENUE SHARE
| Tier | Monthly ARR Referred | Commission |
|------|---------------------|------------|
| Bronze | < $10,000 | [X]% |
| Silver | $10,000-$50,000 | [X]% |
| Gold | > $50,000 | [X]% |
Payout: Net-30 after month close, minimum $[500] threshold.
## 4. INTELLECTUAL PROPERTY
Each party retains all IP in its own products. No implied licenses.
Partner may use Provider's marks per Provider's Brand Guidelines (Exhibit B).
## 5. DATA AND PRIVACY
Each party is an independent data controller for its own customers.
Joint processing requires a separate DPA (Exhibit C - EU/DACH projects).
## 6. TERM
Initial: [12] months. Renews annually unless [90]-day written notice given.
Termination for Cause: [30]-day cure period for material breach.
## 7. LIMITATION OF LIABILITY
Each party's liability capped at [1x] fees paid/received in prior [12] months.
Mutual indemnification for IP infringement claims from own products.
---
Signatures, exhibits, and governing law per applicable jurisdiction.
```
---
## GDPR Data Processing Addendum (EU/DACH Clause Block)
```markdown
## DATA PROCESSING ADDENDUM (Art. 28 GDPR)
Controller: [CLIENT NAME]
Processor: [CONTRACTOR NAME]
### Subject Matter
Processor processes personal data on behalf of Controller solely to perform services
under the main Agreement.
### Categories of Data Subjects
[e.g., end users, employees, customers]
### Categories of Personal Data
[e.g., names, email addresses, usage data]
### Processing Duration
For the term of the main Agreement; deletion within [30] days of termination.
### Processor Obligations
- Process data only on Controller's documented instructions
- Ensure persons authorized to process have committed to confidentiality
- Implement technical and organizational measures per Art. 32 GDPR
- Assist Controller with data subject rights requests
- Not engage sub-processors without prior written consent
- Delete or return all personal data upon termination
### Sub-processors (current as of Effective Date)
| Sub-processor | Location | Purpose |
|--------------|----------|---------|
| [AWS / GCP / Azure] | [Region] | Cloud hosting |
| [Other] | [Location] | [Purpose] |
### Cross-border Transfers
Data transfers outside EEA covered by: [ ] SCCs [ ] Adequacy Decision [ ] BCRs
```
---
## Common Pitfalls
1. **Missing IP assignment language** - "work for hire" alone is insufficient in EU; need explicit assignment of Nutzungsrechte in DACH
2. **Vague acceptance criteria** - Always define what "accepted" means (written sign-off, X days to reject)
3. **No change order process** - Scope creep kills fixed-price projects; add a clause for out-of-scope work
4. **Jurisdiction mismatch** - Choosing Delaware law for a German-only project creates enforcement problems
5. **Missing limitation of liability** - Without a cap, one bug could mean unlimited damages
6. **Oral amendments** - Contracts modified verbally are hard to enforce; always require written amendments
---
## Best Practices
- Use **milestone payments** over net-30 for projects >$10K - reduces cash flow risk
- For EU/DACH: always check if a DPA is needed (any personal data = yes)
- For DACH: include a **Schriftformklausel** (written form clause) explicitly
- Add a **force majeure** clause for anything over 3 months
- For retainers: define response time SLAs (e.g., 4h urgent / 24h normal)
- Keep templates in version control; track changes with `git diff`
- Review annually - laws change, especially GDPR enforcement interpretations
- For NDAs: always specify the return/destruction of confidential materials on termination
FILE:customer-success-manager/SKILL.md
---
name: "customer-success-manager"
description: Monitors customer health, predicts churn risk, and identifies expansion opportunities using weighted scoring models for SaaS customer success. Use when analyzing customer accounts, reviewing retention metrics, scoring at-risk customers, or when the user mentions churn, customer health scores, upsell opportunities, expansion revenue, retention analysis, or customer analytics. Runs three Python CLI tools to produce deterministic health scores, churn risk tiers, and prioritized expansion recommendations across Enterprise, Mid-Market, and SMB segments.
license: MIT
metadata:
version: 1.0.0
author: Alireza Rezvani
category: business-growth
domain: customer-success
updated: 2026-02-06
python-tools: health_score_calculator.py, churn_risk_analyzer.py, expansion_opportunity_scorer.py
tech-stack: customer-success, saas-metrics, health-scoring
---
# Customer Success Manager
Production-grade customer success analytics with multi-dimensional health scoring, churn risk prediction, and expansion opportunity identification. Three Python CLI tools provide deterministic, repeatable analysis using standard library only -- no external dependencies, no API calls, no ML models.
---
## Table of Contents
- [Input Requirements](#input-requirements)
- [Output Formats](#output-formats)
- [How to Use](#how-to-use)
- [Scripts](#scripts)
- [Reference Guides](#reference-guides)
- [Templates](#templates)
- [Best Practices](#best-practices)
- [Limitations](#limitations)
---
## Input Requirements
All scripts accept a JSON file as a positional input argument. See `assets/sample_customer_data.json` for complete schema examples and sample data.
### Health Score Calculator
Required fields per customer object: `customer_id`, `name`, `segment`, `arr`, and nested objects `usage` (login_frequency, feature_adoption, dau_mau_ratio), `engagement` (support_ticket_volume, meeting_attendance, nps_score, csat_score), `support` (open_tickets, escalation_rate, avg_resolution_hours), `relationship` (executive_sponsor_engagement, multi_threading_depth, renewal_sentiment), and `previous_period` scores for trend analysis.
### Churn Risk Analyzer
Required fields per customer object: `customer_id`, `name`, `segment`, `arr`, `contract_end_date`, and nested objects `usage_decline`, `engagement_drop`, `support_issues`, `relationship_signals`, and `commercial_factors`.
### Expansion Opportunity Scorer
Required fields per customer object: `customer_id`, `name`, `segment`, `arr`, and nested objects `contract` (licensed_seats, active_seats, plan_tier, available_tiers), `product_usage` (per-module adoption flags and usage percentages), and `departments` (current and potential).
---
## Output Formats
All scripts support two output formats via the `--format` flag:
- **`text`** (default): Human-readable formatted output for terminal viewing
- **`json`**: Machine-readable JSON output for integrations and pipelines
---
## How to Use
### Quick Start
```bash
# Health scoring
python scripts/health_score_calculator.py assets/sample_customer_data.json
python scripts/health_score_calculator.py assets/sample_customer_data.json --format json
# Churn risk analysis
python scripts/churn_risk_analyzer.py assets/sample_customer_data.json
python scripts/churn_risk_analyzer.py assets/sample_customer_data.json --format json
# Expansion opportunity scoring
python scripts/expansion_opportunity_scorer.py assets/sample_customer_data.json
python scripts/expansion_opportunity_scorer.py assets/sample_customer_data.json --format json
```
### Workflow Integration
```bash
# 1. Score customer health across portfolio
python scripts/health_score_calculator.py customer_portfolio.json --format json > health_results.json
# Verify: confirm health_results.json contains the expected number of customer records before continuing
# 2. Identify at-risk accounts
python scripts/churn_risk_analyzer.py customer_portfolio.json --format json > risk_results.json
# Verify: confirm risk_results.json is non-empty and risk tiers are present for each customer
# 3. Find expansion opportunities in healthy accounts
python scripts/expansion_opportunity_scorer.py customer_portfolio.json --format json > expansion_results.json
# Verify: confirm expansion_results.json lists opportunities ranked by priority
# 4. Prepare QBR using templates
# Reference: assets/qbr_template.md
```
**Error handling:** If a script exits with an error, check that:
- The input JSON matches the required schema for that script (see Input Requirements above)
- All required fields are present and correctly typed
- Python 3.7+ is being used (`python --version`)
- Output files from prior steps are non-empty before you move on to subsequent steps
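The "Verify" comments in the workflow can be automated with a small check. The sketch below assumes the health results follow the structure of `assets/expected_output.json` (a top-level `customers` list whose entries carry `customer_id` and `classification`); `check_health_results` is a hypothetical helper, not one of the shipped tools.

```python
import json

VALID_CLASSIFICATIONS = {"green", "yellow", "red"}


def check_health_results(results, expected_count):
    """Return a list of human-readable problems; an empty list means the file looks sane."""
    problems = []
    customers = results.get("customers", [])
    if len(customers) != expected_count:
        problems.append(f"expected {expected_count} customers, found {len(customers)}")
    for customer in customers:
        if customer.get("classification") not in VALID_CLASSIFICATIONS:
            customer_id = customer.get("customer_id", "?")
            problems.append(f"{customer_id}: missing or unknown classification")
    return problems


def check_health_results_file(path, expected_count):
    """Load a results file written with --format json and run the checks on it."""
    with open(path) as handle:
        return check_health_results(json.load(handle), expected_count)
```

Calling `check_health_results_file("health_results.json", 4)` after step 1 would return an empty list when all four sample customers are present and classified.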
---
## Scripts
### 1. health_score_calculator.py
**Purpose:** Multi-dimensional customer health scoring with trend analysis and segment-aware benchmarking.
**Dimensions and Weights:**
| Dimension | Weight | Metrics |
|-----------|--------|---------|
| Usage | 30% | Login frequency, feature adoption, DAU/MAU ratio |
| Engagement | 25% | Support ticket volume, meeting attendance, NPS/CSAT |
| Support | 20% | Open tickets, escalation rate, avg resolution time |
| Relationship | 25% | Executive sponsor engagement, multi-threading depth, renewal sentiment |
**Classification:**
- Green (75-100): Healthy -- customer achieving value
- Yellow (50-74): Needs attention -- monitor closely
- Red (0-49): At risk -- immediate intervention required
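The weighting and classification can be illustrated with a short sketch. It assumes each dimension has already been scored on a 0-100 scale and that the overall score is the weighted sum rounded to one decimal, which is consistent with the figures in `assets/expected_output.json`. The function names are hypothetical, the band boundaries are read as "75 or above" and "50 or above", and segment-aware benchmarking is not modelled.

```python
# Weights from the table above; the function names are illustrative only.
HEALTH_WEIGHTS = {
    "usage": 0.30,
    "engagement": 0.25,
    "support": 0.20,
    "relationship": 0.25,
}


def overall_health_score(dimension_scores):
    """Weighted sum of 0-100 dimension scores, rounded to one decimal."""
    total = sum(weight * dimension_scores[name] for name, weight in HEALTH_WEIGHTS.items())
    return round(total, 1)


def classify_health(score):
    """Map a 0-100 score to the Green / Yellow / Red bands listed above."""
    if score >= 75:
        return "green"
    if score >= 50:
        return "yellow"
    return "red"


# Dimension scores for CUST-001 in assets/expected_output.json.
acme = {"usage": 91.6, "engagement": 82.0, "support": 78.5, "relationship": 90.1}
print(overall_health_score(acme), classify_health(overall_health_score(acme)))
```

With the CUST-001 dimension scores this yields 86.2 and `green`, matching the sample output.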
**Usage:**
```bash
python scripts/health_score_calculator.py customer_data.json
python scripts/health_score_calculator.py customer_data.json --format json
```
### 2. churn_risk_analyzer.py
**Purpose:** Identify at-risk accounts with behavioral signal detection and tier-based intervention recommendations.
**Risk Signal Weights:**
| Signal Category | Weight | Indicators |
|----------------|--------|------------|
| Usage Decline | 30% | Login trend, feature adoption change, DAU/MAU change |
| Engagement Drop | 25% | Meeting cancellations, response time, NPS change |
| Support Issues | 20% | Open escalations, unresolved critical, satisfaction trend |
| Relationship Signals | 15% | Champion left, sponsor change, competitor mentions |
| Commercial Factors | 10% | Contract type, pricing complaints, budget cuts |
**Risk Tiers:**
- Critical (80-100): Immediate executive escalation
- High (60-79): Urgent CSM intervention
- Medium (40-59): Proactive outreach
- Low (0-39): Standard monitoring
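As a rough illustration of how the weights and tiers combine, the sketch below assumes each signal category has already been reduced to a 0-100 risk score; how the script derives those category scores from the raw indicators is not described here and is not modelled. `overall_risk_score` and `risk_tier` are hypothetical names.

```python
# Weights from the table above; they sum to 100%.
RISK_WEIGHTS = {
    "usage_decline": 0.30,
    "engagement_drop": 0.25,
    "support_issues": 0.20,
    "relationship_signals": 0.15,
    "commercial_factors": 0.10,
}


def overall_risk_score(category_scores):
    """Weighted sum of 0-100 category risk scores, rounded to one decimal."""
    total = sum(weight * category_scores[name] for name, weight in RISK_WEIGHTS.items())
    return round(total, 1)


def risk_tier(score):
    """Map a 0-100 risk score to the tiers listed above (higher means riskier)."""
    if score >= 80:
        return "critical"
    if score >= 60:
        return "high"
    if score >= 40:
        return "medium"
    return "low"
```

For instance, category scores of 80, 60, 50, 40, and 20 (in table order) give 24 + 15 + 10 + 6 + 2 = 57.0, which falls in the Medium tier.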
**Usage:**
```bash
python scripts/churn_risk_analyzer.py customer_data.json
python scripts/churn_risk_analyzer.py customer_data.json --format json
```
### 3. expansion_opportunity_scorer.py
**Purpose:** Identify upsell, cross-sell, and expansion opportunities with revenue estimation and priority ranking.
**Expansion Types:**
- **Upsell**: Upgrade to a higher tier or more of the existing product
- **Cross-sell**: Add new product modules
- **Expansion**: Additional seats or departments
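The three expansion types map naturally onto the input fields (`contract`, `product_usage`, `departments`). The following sketch only lists candidate opportunities; it does not reproduce the script's scoring or revenue estimation. `expansion_candidates` is a hypothetical helper, and it assumes `available_tiers` is ordered from lowest to highest tier.

```python
def expansion_candidates(customer):
    """List raw upsell, cross-sell, and expansion candidates for one customer."""
    contract = customer["contract"]
    tiers = contract["available_tiers"]  # assumed ordered lowest -> highest
    current = contract["plan_tier"]

    # Upsell: the next tier above the current plan, if there is one.
    next_tier = None
    if current in tiers and tiers.index(current) + 1 < len(tiers):
        next_tier = tiers[tiers.index(current) + 1]

    return {
        "upsell": [next_tier] if next_tier else [],
        # Cross-sell: modules the customer has not adopted yet.
        "cross_sell": [
            module for module, info in customer["product_usage"].items()
            if not info["adopted"]
        ],
        # Expansion: departments not yet using the product.
        "expansion": list(customer["departments"]["potential"]),
        # High utilization suggests room for additional seats.
        "seat_utilization": round(contract["active_seats"] / contract["licensed_seats"], 2),
    }
```

Applied to the Acme Corp record in `assets/sample_customer_data.json`, this would surface the `enterprise` tier, the two unadopted modules, and three potential departments.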
**Usage:**
```bash
python scripts/expansion_opportunity_scorer.py customer_data.json
python scripts/expansion_opportunity_scorer.py customer_data.json --format json
```
---
## Reference Guides
| Reference | Description |
|-----------|-------------|
| `references/health-scoring-framework.md` | Complete health scoring methodology, dimension definitions, weighting rationale, threshold calibration |
| `references/cs-playbooks.md` | Intervention playbooks for each risk tier, onboarding, renewal, expansion, and escalation procedures |
| `references/cs-metrics-benchmarks.md` | Industry benchmarks for NRR, GRR, churn rates, health scores, expansion rates by segment and industry |
---
## Templates
| Template | Purpose |
|----------|---------|
| `assets/qbr_template.md` | Quarterly Business Review presentation structure |
| `assets/success_plan_template.md` | Customer success plan with goals, milestones, and metrics |
| `assets/onboarding_checklist_template.md` | 90-day onboarding checklist with phase gates |
| `assets/executive_business_review_template.md` | Executive stakeholder review for strategic accounts |
---
## Best Practices
1. **Combine signals**: Use all three scripts together for a complete customer picture
2. **Act on trends, not snapshots**: A declining Green is more urgent than a stable Yellow
3. **Calibrate thresholds**: Adjust segment benchmarks based on your product and industry per `references/health-scoring-framework.md`
4. **Prepare with data**: Run scripts before every QBR and executive meeting; reference `references/cs-playbooks.md` for intervention guidance
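The "trends, not snapshots" practice can be applied to the health report by listing every account whose overall trend is declining, whatever its current colour. This is a minimal sketch that assumes the report structure shown in `assets/expected_output.json` (`customers[].classification` and `customers[].trends.overall`); `declining_accounts` is a hypothetical helper.

```python
def declining_accounts(results):
    """Return (customer_id, classification) pairs whose overall trend is declining."""
    return [
        (customer["customer_id"], customer["classification"])
        for customer in results.get("customers", [])
        if customer.get("trends", {}).get("overall") == "declining"
    ]
```

A declining Green account appears in this list even though a colour-only view of the portfolio would not flag it.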
---
## Limitations
- **No real-time data**: Scripts analyze point-in-time snapshots from JSON input files
- **No CRM integration**: Data must be exported manually from your CRM/CS platform
- **Deterministic only**: No predictive ML -- scoring is algorithmic based on weighted signals
- **Threshold tuning**: Default thresholds are general-purpose starting points and may need calibration for your business
- **Revenue estimates**: Expansion revenue estimates are approximations based on usage patterns
---
**Last Updated:** February 2026
**Tools:** 3 Python CLI tools
**Dependencies:** Python 3.7+ standard library only
FILE:customer-success-manager/assets/executive_business_review_template.md
# Executive Business Review
**Customer:** [Customer Name]
**Date:** [Review Date]
**Prepared for:** [Executive Name, Title]
**Prepared by:** [CSM Name] | [VP Customer Success Name]
**Classification:** [Strategic / Enterprise / Key Account]
---
## 1. Partnership Summary
| Metric | Value |
|--------|-------|
| Partnership Duration | [X months/years] |
| Current ARR | $[Amount] |
| Lifetime Value to Date | $[Amount] |
| Current Plan | [Tier] |
| Licensed Seats | [Number] |
| Active Seats | [Number] |
| Health Score | [Score]/100 ([Green/Yellow/Red]) |
| NPS Score | [Score] |
| Renewal Date | [Date] ([X] days remaining) |
---
## 2. Strategic Alignment
### Customer's Business Priorities (This Year)
1. **[Priority 1]** -- [How our solution supports this]
2. **[Priority 2]** -- [How our solution supports this]
3. **[Priority 3]** -- [How our solution supports this]
### Alignment Assessment
| Business Priority | Our Contribution | Alignment Score |
|-------------------|-----------------|----------------|
| [Priority 1] | [Specific contribution] | [Strong / Moderate / Weak] |
| [Priority 2] | [Specific contribution] | [Strong / Moderate / Weak] |
| [Priority 3] | [Specific contribution] | [Strong / Moderate / Weak] |
---
## 3. Value Delivered
### Quantified Business Impact
| Outcome | Metric | Before | After | Business Value |
|---------|--------|--------|-------|---------------|
| [e.g., Operational efficiency] | [Hours saved/week] | [Baseline] | [Current] | $[Estimated value] |
| [e.g., Revenue acceleration] | [Deal velocity] | [Baseline] | [Current] | $[Estimated value] |
| [e.g., Risk reduction] | [Error rate] | [Baseline] | [Current] | $[Estimated value] |
**Total Estimated Business Value:** $[Amount]
**ROI:** [X]x return on investment
### Key Achievements This Period
1. [Achievement 1 with measurable outcome]
2. [Achievement 2 with measurable outcome]
3. [Achievement 3 with measurable outcome]
---
## 4. Adoption and Engagement Scorecard
### Platform Utilization
| Module | Adoption Status | Usage Depth | Benchmark | Assessment |
|--------|---------------|-------------|-----------|------------|
| [Module 1] | Fully Adopted | [High/Med/Low] | [Benchmark] | [Above/At/Below] |
| [Module 2] | Partially Adopted | [High/Med/Low] | [Benchmark] | [Above/At/Below] |
| [Module 3] | Not Adopted | -- | -- | Opportunity |
### Engagement Health
| Indicator | Current | Previous Period | Trend |
|-----------|---------|----------------|-------|
| Executive Engagement | [Score] | [Score] | [Up/Down/Stable] |
| Stakeholder Breadth | [# contacts] | [# contacts] | [Up/Down/Stable] |
| Meeting Participation | [%] | [%] | [Up/Down/Stable] |
| Feature Request Activity | [Count] | [Count] | [Up/Down/Stable] |
---
## 5. Account Health Overview
### Health Score Trend (Last 4 Quarters)
| Quarter | Overall | Usage | Engagement | Support | Relationship |
|---------|---------|-------|------------|---------|-------------|
| [Q-3] | [Score] | [Score] | [Score] | [Score] | [Score] |
| [Q-2] | [Score] | [Score] | [Score] | [Score] | [Score] |
| [Q-1] | [Score] | [Score] | [Score] | [Score] | [Score] |
| Current | [Score] | [Score] | [Score] | [Score] | [Score] |
### Risk Assessment
| Risk Factor | Level | Details | Mitigation |
|------------|-------|---------|-----------|
| [Risk 1] | [High/Med/Low] | [Description] | [Action] |
| [Risk 2] | [High/Med/Low] | [Description] | [Action] |
---
## 6. Support and Service Quality
| Metric | This Period | SLA Target | Status |
|--------|------------|-----------|--------|
| Total Tickets | [Number] | -- | |
| Avg First Response | [Hours] | [Hours] | [Met / Not Met] |
| Avg Resolution Time | [Hours] | [Hours] | [Met / Not Met] |
| Escalations | [Number] | 0 | |
| CSAT Score | [Score] | [Target] | [Above / Below] |
| Critical Issues | [Number] | 0 | |
### Notable Support Interactions
- [Summary of any significant support events and resolution]
---
## 7. Product Roadmap Alignment
### Features Delivered (Relevant to This Customer)
| Feature | Release Date | Customer Impact |
|---------|-------------|----------------|
| [Feature 1] | [Date] | [How it helps them] |
| [Feature 2] | [Date] | [How it helps them] |
### Upcoming Features (Customer-Relevant)
| Feature | Expected Release | Expected Impact |
|---------|-----------------|----------------|
| [Feature 1] | [Quarter] | [Business value] |
| [Feature 2] | [Quarter] | [Business value] |
### Customer Feature Requests
| Request | Priority | Status | Business Case |
|---------|----------|--------|--------------|
| [Request 1] | [P1/P2/P3] | [Status] | [Why it matters] |
| [Request 2] | [P1/P2/P3] | [Status] | [Why it matters] |
---
## 8. Growth and Expansion Opportunity
### Current Whitespace Analysis
| Opportunity | Type | Est. Revenue | Effort | Priority |
|------------|------|-------------|--------|----------|
| [Opportunity 1] | [Upsell/Cross-sell/Expansion] | $[Amount] | [Low/Med/High] | [1-5] |
| [Opportunity 2] | [Upsell/Cross-sell/Expansion] | $[Amount] | [Low/Med/High] | [1-5] |
| [Opportunity 3] | [Upsell/Cross-sell/Expansion] | $[Amount] | [Low/Med/High] | [1-5] |
**Total Expansion Opportunity:** $[Amount]
### Recommended Next Steps for Growth
1. [Specific expansion recommendation with business justification]
2. [Specific expansion recommendation with business justification]
---
## 9. Renewal Outlook
| Factor | Assessment |
|--------|-----------|
| Overall Renewal Confidence | [High / Medium / Low] |
| Budget Availability | [Confirmed / Expected / Uncertain] |
| Sponsor Support | [Strong / Moderate / Weak] |
| Competitive Threat | [None / Low / Medium / High] |
| Value Perception | [Strong / Moderate / Weak] |
| Contract Satisfaction | [Satisfied / Neutral / Concerned] |
### Renewal Strategy
[2-3 sentences on the approach for securing renewal, including any specific actions needed]
---
## 10. Executive-Level Action Items
| Action | Owner | Due Date | Priority | Impact |
|--------|-------|----------|----------|--------|
| [Action 1] | [Name, Title] | [Date] | [Critical/High/Med] | [Expected outcome] |
| [Action 2] | [Name, Title] | [Date] | [Critical/High/Med] | [Expected outcome] |
| [Action 3] | [Name, Title] | [Date] | [Critical/High/Med] | [Expected outcome] |
---
## Appendix
### Stakeholder Map
| Name | Title | Influence | Sentiment | Last Contact |
|------|-------|-----------|-----------|-------------|
| [Name] | [Title] | [Decision Maker / Influencer / User] | [Positive / Neutral / Negative] | [Date] |
| [Name] | [Title] | [Decision Maker / Influencer / User] | [Positive / Neutral / Negative] | [Date] |
### Competitive Landscape (If Applicable)
- **Known competitors in evaluation:** [List]
- **Our differentiators:** [Key strengths vs. competition]
- **Risk mitigation:** [Actions to defend position]
---
**Confidential -- For Internal and Customer Executive Use Only**
**Next Executive Review:** [Date]
FILE:customer-success-manager/assets/expected_output.json
{
"report": "customer_health_scores",
"summary": {
"total_customers": 4,
"average_score": 78.8,
"green_count": 3,
"yellow_count": 1,
"red_count": 0
},
"customers": [
{
"customer_id": "CUST-001",
"name": "Acme Corp",
"segment": "enterprise",
"arr": 120000,
"overall_score": 86.2,
"classification": "green",
"dimensions": {
"usage": {
"score": 91.6,
"weight": "30%",
"classification": "green"
},
"engagement": {
"score": 82.0,
"weight": "25%",
"classification": "green"
},
"support": {
"score": 78.5,
"weight": "20%",
"classification": "green"
},
"relationship": {
"score": 90.1,
"weight": "25%",
"classification": "green"
}
},
"trends": {
"usage": "improving",
"engagement": "improving",
"support": "stable",
"relationship": "improving",
"overall": "improving"
},
"recommendations": []
},
{
"customer_id": "CUST-002",
"name": "TechStart Inc",
"segment": "smb",
"arr": 18000,
"overall_score": 53.7,
"classification": "yellow",
"dimensions": {
"usage": {
"score": 52.5,
"weight": "30%",
"classification": "yellow"
},
"engagement": {
"score": 61.6,
"weight": "25%",
"classification": "yellow"
},
"support": {
"score": 63.2,
"weight": "20%",
"classification": "yellow"
},
"relationship": {
"score": 39.5,
"weight": "25%",
"classification": "red"
}
},
"trends": {
"usage": "stable",
"engagement": "improving",
"support": "stable",
"relationship": "declining",
"overall": "stable"
},
"recommendations": [
"Login frequency below target -- schedule product engagement session",
"NPS below threshold -- conduct a feedback deep-dive with customer",
"CSAT is critically low -- escalate to support leadership",
"Single-threaded relationship -- expand contacts across departments",
"Renewal sentiment is negative -- initiate save plan immediately"
]
},
{
"customer_id": "CUST-003",
"name": "GlobalTrade Solutions",
"segment": "mid-market",
"arr": 55000,
"overall_score": 79.7,
"classification": "green",
"dimensions": {
"usage": {
"score": 85.6,
"weight": "30%",
"classification": "green"
},
"engagement": {
"score": 79.6,
"weight": "25%",
"classification": "green"
},
"support": {
"score": 72.0,
"weight": "20%",
"classification": "green"
},
"relationship": {
"score": 79.0,
"weight": "25%",
"classification": "green"
}
},
"trends": {
"usage": "improving",
"engagement": "improving",
"support": "improving",
"relationship": "improving",
"overall": "improving"
},
"recommendations": []
},
{
"customer_id": "CUST-004",
"name": "HealthFirst Medical",
"segment": "enterprise",
"arr": 200000,
"overall_score": 95.7,
"classification": "green",
"dimensions": {
"usage": {
"score": 100.0,
"weight": "30%",
"classification": "green"
},
"engagement": {
"score": 92.0,
"weight": "25%",
"classification": "green"
},
"support": {
"score": 88.7,
"weight": "20%",
"classification": "green"
},
"relationship": {
"score": 100.0,
"weight": "25%",
"classification": "green"
}
},
"trends": {
"usage": "improving",
"engagement": "improving",
"support": "stable",
"relationship": "improving",
"overall": "improving"
},
"recommendations": []
}
]
}
FILE:customer-success-manager/assets/onboarding_checklist_template.md
# Customer Onboarding Checklist (90-Day)
**Customer:** [Customer Name]
**Segment:** [Enterprise / Mid-Market / SMB]
**CSM:** [CSM Name]
**Kickoff Date:** [Date]
**Target Go-Live:** [Date]
**Target First Value Date:** [Date -- must be within 30 days of kickoff]
---
## Phase 1: Welcome and Setup (Days 1-14)
### Pre-Kickoff Preparation (Day 0)
- [ ] Review signed contract and SOW for scope and commitments
- [ ] Research customer's industry, business model, and competitive landscape
- [ ] Review handoff notes from sales team (pain points, decision drivers, stakeholders)
- [ ] Prepare welcome package (login credentials, documentation links, support contacts)
- [ ] Create customer workspace in CS platform
- [ ] Schedule kickoff meeting with all required attendees
- [ ] Prepare kickoff deck with agenda and success plan draft
### Kickoff Meeting (Days 1-2)
- [ ] Conduct kickoff meeting with customer stakeholders
- [ ] Confirm business objectives and success criteria
- [ ] Identify key stakeholders and their roles (sponsor, champion, technical lead, users)
- [ ] Align on communication cadence and preferred channels
- [ ] Review onboarding timeline and milestones
- [ ] Set expectations for time commitment from customer team
- [ ] Share and agree on success plan (mutual accountability)
- [ ] Schedule recurring check-in meetings
**Kickoff Meeting Notes:**
> [Document key takeaways, concerns raised, decisions made]
### Technical Setup (Days 3-7)
- [ ] Provision customer environment (tenant, workspace, permissions)
- [ ] Configure SSO/authentication if applicable
- [ ] Set up integrations with customer's existing tools
- [ ] Import or migrate existing data (if applicable)
- [ ] Validate data integrity post-migration
- [ ] Configure role-based access and permissions
- [ ] Set up monitoring and alerting
**Technical Setup Owner:** [SE / Implementation team name]
**Technical Setup Notes:**
> [Document configuration decisions, customizations, issues]
### Admin Training (Days 7-10)
- [ ] Deliver admin training session (system configuration, user management)
- [ ] Provide admin documentation and quick reference guide
- [ ] Ensure admins can independently manage basic operations
- [ ] Set up admin support escalation path
### Initial User Training (Days 10-14)
- [ ] Deliver core user training (session 1: basic navigation and key workflows)
- [ ] Provide user quickstart guide and video resources
- [ ] Set up user support channel (Slack, email, in-app chat)
- [ ] Confirm all target users have active accounts
- [ ] Track initial login completion rate
**Training Completion Rate:** [___%] of target users
---
## Phase 2: Activation (Days 15-30)
### User Activation (Days 15-20)
- [ ] Monitor daily active user metrics
- [ ] Follow up with users who have not logged in
- [ ] Conduct follow-up training for users needing additional help
- [ ] Address any usability issues or confusion reported
- [ ] Validate that core workflows are functioning as expected
- [ ] Collect early feedback from champion and key users
**Activation Rate:** [___%] of licensed users active
### First Value Milestone (Days 20-30)
- [ ] Define and track first value milestone (specific to customer objectives)
- [ ] Verify customer has completed their first meaningful workflow
- [ ] Document value delivered (even if small -- establish the pattern)
- [ ] Share "first win" with executive sponsor
- [ ] Celebrate the milestone with the customer team
**First Value Milestone:** [Describe the specific milestone]
**Date Achieved:** [Date]
### 30-Day Review (Days 28-30)
- [ ] Conduct 30-day review meeting with customer
- [ ] Review activation metrics (logins, usage, adoption)
- [ ] Assess progress against success plan milestones
- [ ] Identify any blockers or concerns
- [ ] Adjust onboarding plan if needed
- [ ] Confirm transition from setup phase to adoption phase
- [ ] Set goals for days 31-60
**30-Day Health Score:** [Score]/100 -- [Green/Yellow/Red]
---
## Phase 3: Adoption (Days 31-60)
### Feature Expansion (Days 31-45)
- [ ] Introduce additional features beyond core workflows
- [ ] Deliver advanced training session (session 2: power features)
- [ ] Enable at least one integration with customer's existing tools
- [ ] Identify and address feature adoption gaps
- [ ] Share best practices from similar customers
### Usage Benchmarking (Days 45-55)
- [ ] Compare customer's usage against segment benchmarks
- [ ] Identify underperforming areas and create enablement plan
- [ ] Share usage report with customer champion
- [ ] Discuss usage targets for the next 30 days
**Current vs. Benchmark:**
| Metric | Current | Benchmark | Gap |
|--------|---------|-----------|-----|
| Feature Adoption | [%] | [%] | [+/-] |
| Daily Active Users | [#] | [#] | [+/-] |
| Key Workflow Completion | [%] | [%] | [+/-] |
### 60-Day Check-in (Days 55-60)
- [ ] Conduct 60-day check-in meeting
- [ ] Review adoption metrics and progress
- [ ] Discuss any roadblocks to deeper adoption
- [ ] Begin identifying advanced use cases
- [ ] Set goals for days 61-90
---
## Phase 4: Optimization (Days 61-90)
### Advanced Use Cases (Days 61-75)
- [ ] Conduct use case discovery workshop with customer
- [ ] Identify 2-3 advanced use cases beyond initial scope
- [ ] Build implementation plan for advanced use cases
- [ ] Begin pilot of advanced use cases with power users
### ROI Measurement (Days 75-85)
- [ ] Collect data for ROI measurement against baseline
- [ ] Build ROI summary document
- [ ] Share ROI results with executive sponsor
- [ ] Document customer testimonial or case study opportunity (if willing)
**ROI Summary:**
| Metric | Baseline | Current | Improvement |
|--------|----------|---------|-------------|
| [Metric 1] | [Value] | [Value] | [% change] |
| [Metric 2] | [Value] | [Value] | [% change] |
### 90-Day Executive Review (Days 85-90)
- [ ] Prepare 90-day executive review presentation
- [ ] Include: value delivered, adoption metrics, ROI, next steps
- [ ] Conduct review meeting with executive sponsor
- [ ] Transition from onboarding to ongoing success management
- [ ] Establish ongoing success plan with quarterly milestones
- [ ] Confirm ongoing meeting cadence
- [ ] Introduce expansion opportunities if appropriate
**90-Day Health Score:** [Score]/100 -- [Green/Yellow/Red]
---
## Onboarding Completion Gate
The following criteria must be met to consider onboarding complete:
- [ ] User activation rate above 80%
- [ ] First value milestone achieved within 30 days
- [ ] Core workflows actively used by target users
- [ ] Executive sponsor confirms satisfaction
- [ ] Health score is Yellow (50+) or better
- [ ] Success plan established with ongoing milestones
- [ ] Recurring meeting cadence confirmed
- [ ] Support escalation path understood by customer
**Onboarding Status:** [Complete / In Progress / Blocked]
**Completion Date:** [Date]
**Handoff to Steady-State CSM:** [Date if different CSM]
---
## Notes
### Risks and Blockers
| Risk/Blocker | Impact | Mitigation | Status |
|-------------|--------|-----------|--------|
| [Item] | [High/Med/Low] | [Action] | [Open/Resolved] |
### Key Decisions
| Date | Decision | Made By | Impact |
|------|----------|---------|--------|
| [Date] | [Decision] | [Name] | [Description] |
---
**Template Version:** 1.0
**Last Updated:** February 2026
FILE:customer-success-manager/assets/qbr_template.md
# Quarterly Business Review (QBR)
**Customer:** [Customer Name]
**Date:** [QBR Date]
**Prepared by:** [CSM Name]
**Attendees:** [List attendees and titles]
---
## 1. Executive Summary
**Overall Relationship Status:** [Green / Yellow / Red]
**Health Score:** [Score]/100
**Key Theme:** [One sentence summarizing the quarter]
### Quarter Highlights
- [Highlight 1: major achievement or milestone]
- [Highlight 2: value delivered]
- [Highlight 3: initiative completed]
### Areas of Focus
- [Focus area 1]
- [Focus area 2]
---
## 2. Value Delivered This Quarter
### Business Outcomes Achieved
| Objective | Target | Actual | Status |
|-----------|--------|--------|--------|
| [Objective 1] | [Target metric] | [Actual metric] | [On Track / At Risk / Achieved] |
| [Objective 2] | [Target metric] | [Actual metric] | [On Track / At Risk / Achieved] |
| [Objective 3] | [Target metric] | [Actual metric] | [On Track / At Risk / Achieved] |
### ROI Summary
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| [Metric 1, e.g., Time savings] | [Baseline] | [Current] | [% change] |
| [Metric 2, e.g., Cost reduction] | [Baseline] | [Current] | [% change] |
| [Metric 3, e.g., Revenue impact] | [Baseline] | [Current] | [% change] |
**Estimated Total Value Delivered:** $[Amount]
---
## 3. Product Usage and Adoption
### Usage Metrics
| Metric | Last Quarter | This Quarter | Trend |
|--------|-------------|--------------|-------|
| Monthly Active Users | [Number] | [Number] | [Up/Down/Stable] |
| Feature Adoption Rate | [%] | [%] | [Up/Down/Stable] |
| DAU/MAU Ratio | [Ratio] | [Ratio] | [Up/Down/Stable] |
| Seat Utilization | [%] | [%] | [Up/Down/Stable] |
### Feature Adoption Breakdown
| Feature/Module | Status | Usage Level | Notes |
|---------------|--------|-------------|-------|
| [Feature 1] | Active | [High/Med/Low] | |
| [Feature 2] | Active | [High/Med/Low] | |
| [Feature 3] | Not Adopted | -- | [Reason / Opportunity] |
### Adoption Recommendations
1. [Recommendation for increasing adoption of underused features]
2. [Recommendation for enabling new use cases]
---
## 4. Support Summary
| Metric | This Quarter | Previous Quarter | Benchmark |
|--------|-------------|-----------------|-----------|
| Total Tickets | [Number] | [Number] | [Segment avg] |
| Avg Resolution Time | [Hours] | [Hours] | [SLA target] |
| Escalations | [Number] | [Number] | [Target: 0] |
| CSAT Score | [Score] | [Score] | [Target] |
### Open Issues
| Issue | Priority | Status | ETA |
|-------|----------|--------|-----|
| [Issue 1] | [P1/P2/P3] | [In Progress / Pending] | [Date] |
---
## 5. Success Plan Progress
### Current Success Plan Goals
| Goal | Timeline | Progress | Status |
|------|----------|----------|--------|
| [Goal 1] | [Date] | [%] | [On Track / At Risk / Complete] |
| [Goal 2] | [Date] | [%] | [On Track / At Risk / Complete] |
| [Goal 3] | [Date] | [%] | [On Track / At Risk / Complete] |
### Next Quarter Goals (Proposed)
1. [Goal 1 with specific measurable outcome]
2. [Goal 2 with specific measurable outcome]
3. [Goal 3 with specific measurable outcome]
---
## 6. Product Roadmap Highlights
### Recently Released (Relevant to [Customer Name])
- [Feature/enhancement 1] -- [How it benefits them]
- [Feature/enhancement 2] -- [How it benefits them]
### Coming Next Quarter
- [Upcoming feature 1] -- [Expected benefit]
- [Upcoming feature 2] -- [Expected benefit]
### Feature Requests Status
| Request | Priority | Status | Expected Release |
|---------|----------|--------|-----------------|
| [Request 1] | [High/Med/Low] | [Planned / In Development / Under Review] | [Quarter] |
---
## 7. Growth Opportunities
### Expansion Discussion Points
- [Opportunity 1: e.g., additional seats for new team]
- [Opportunity 2: e.g., new module that addresses identified need]
- [Opportunity 3: e.g., tier upgrade for advanced capabilities]
### Estimated Value of Expansion: $[Amount] additional ARR
---
## 8. Action Items
| Action | Owner | Due Date | Priority |
|--------|-------|----------|----------|
| [Action 1] | [Name] | [Date] | [High/Med/Low] |
| [Action 2] | [Name] | [Date] | [High/Med/Low] |
| [Action 3] | [Name] | [Date] | [High/Med/Low] |
| [Action 4] | [Name] | [Date] | [High/Med/Low] |
---
## 9. Contract and Renewal
**Contract Start:** [Date]
**Renewal Date:** [Date]
**Current ARR:** $[Amount]
**Days to Renewal:** [Number]
### Renewal Readiness
- [ ] Value documented and communicated
- [ ] Executive sponsor aligned
- [ ] Open issues resolved or plan in place
- [ ] Pricing and terms discussed
- [ ] Expansion proposal prepared (if applicable)
---
**Next QBR Date:** [Date]
**Next Check-in:** [Date]
FILE:customer-success-manager/assets/sample_customer_data.json
{
"customers": [
{
"customer_id": "CUST-001",
"name": "Acme Corp",
"segment": "enterprise",
"arr": 120000,
"contract_end_date": "2026-12-31",
"usage": {
"login_frequency": 85,
"feature_adoption": 72,
"dau_mau_ratio": 0.45
},
"engagement": {
"support_ticket_volume": 3,
"meeting_attendance": 90,
"nps_score": 8,
"csat_score": 4.2
},
"support": {
"open_tickets": 2,
"escalation_rate": 0.05,
"avg_resolution_hours": 18
},
"relationship": {
"executive_sponsor_engagement": 80,
"multi_threading_depth": 4,
"renewal_sentiment": "positive"
},
"previous_period": {
"usage_score": 70,
"engagement_score": 65,
"support_score": 75,
"relationship_score": 60,
"overall_score": 67
},
"usage_decline": {
"login_trend": 5,
"feature_adoption_change": 3,
"dau_mau_change": 0.02
},
"engagement_drop": {
"meeting_cancellations": 0,
"response_time_days": 1,
"nps_change": 1
},
"support_issues": {
"open_escalations": 0,
"unresolved_critical": 0,
"satisfaction_trend": "improving"
},
"relationship_signals": {
"champion_left": false,
"sponsor_change": false,
"competitor_mentions": 0
},
"commercial_factors": {
"contract_type": "annual",
"pricing_complaints": false,
"budget_cuts_mentioned": false
},
"contract": {
"licensed_seats": 100,
"active_seats": 95,
"plan_tier": "professional",
"available_tiers": ["professional", "enterprise", "enterprise_plus"]
},
"product_usage": {
"core_platform": {"adopted": true, "usage_pct": 85},
"analytics_module": {"adopted": true, "usage_pct": 60},
"integrations_module": {"adopted": false, "usage_pct": 0},
"api_access": {"adopted": true, "usage_pct": 40},
"advanced_reporting": {"adopted": false, "usage_pct": 0}
},
"departments": {
"current": ["engineering", "product"],
"potential": ["marketing", "sales", "support"]
}
},
{
"customer_id": "CUST-002",
"name": "TechStart Inc",
"segment": "smb",
"arr": 18000,
"contract_end_date": "2026-04-15",
"usage": {
"login_frequency": 40,
"feature_adoption": 30,
"dau_mau_ratio": 0.15
},
"engagement": {
"support_ticket_volume": 8,
"meeting_attendance": 50,
"nps_score": 5,
"csat_score": 3.0
},
"support": {
"open_tickets": 6,
"escalation_rate": 0.18,
"avg_resolution_hours": 42
},
"relationship": {
"executive_sponsor_engagement": 30,
"multi_threading_depth": 1,
"renewal_sentiment": "negative"
},
"previous_period": {
"usage_score": 55,
"engagement_score": 50,
"support_score": 60,
"relationship_score": 45,
"overall_score": 52
},
"usage_decline": {
"login_trend": -25,
"feature_adoption_change": -18,
"dau_mau_change": -0.12
},
"engagement_drop": {
"meeting_cancellations": 3,
"response_time_days": 8,
"nps_change": -4
},
"support_issues": {
"open_escalations": 2,
"unresolved_critical": 1,
"satisfaction_trend": "declining"
},
"relationship_signals": {
"champion_left": true,
"sponsor_change": false,
"competitor_mentions": 3
},
"commercial_factors": {
"contract_type": "month-to-month",
"pricing_complaints": true,
"budget_cuts_mentioned": true
},
"contract": {
"licensed_seats": 20,
"active_seats": 8,
"plan_tier": "starter",
"available_tiers": ["starter", "professional", "enterprise"]
},
"product_usage": {
"core_platform": {"adopted": true, "usage_pct": 35},
"analytics_module": {"adopted": false, "usage_pct": 0},
"integrations_module": {"adopted": false, "usage_pct": 0},
"api_access": {"adopted": false, "usage_pct": 0},
"advanced_reporting": {"adopted": false, "usage_pct": 0}
},
"departments": {
"current": ["engineering"],
"potential": ["product", "design"]
}
},
{
"customer_id": "CUST-003",
"name": "GlobalTrade Solutions",
"segment": "mid-market",
"arr": 55000,
"contract_end_date": "2026-09-30",
"usage": {
"login_frequency": 70,
"feature_adoption": 58,
"dau_mau_ratio": 0.35
},
"engagement": {
"support_ticket_volume": 5,
"meeting_attendance": 75,
"nps_score": 7,
"csat_score": 3.8
},
"support": {
"open_tickets": 3,
"escalation_rate": 0.10,
"avg_resolution_hours": 30
},
"relationship": {
"executive_sponsor_engagement": 60,
"multi_threading_depth": 3,
"renewal_sentiment": "neutral"
},
"previous_period": {
"usage_score": 68,
"engagement_score": 70,
"support_score": 65,
"relationship_score": 62,
"overall_score": 66
},
"usage_decline": {
"login_trend": -8,
"feature_adoption_change": -5,
"dau_mau_change": -0.03
},
"engagement_drop": {
"meeting_cancellations": 1,
"response_time_days": 3,
"nps_change": -1
},
"support_issues": {
"open_escalations": 1,
"unresolved_critical": 0,
"satisfaction_trend": "stable"
},
"relationship_signals": {
"champion_left": false,
"sponsor_change": true,
"competitor_mentions": 1
},
"commercial_factors": {
"contract_type": "annual",
"pricing_complaints": false,
"budget_cuts_mentioned": false
},
"contract": {
"licensed_seats": 50,
"active_seats": 48,
"plan_tier": "professional",
"available_tiers": ["professional", "enterprise", "enterprise_plus"]
},
"product_usage": {
"core_platform": {"adopted": true, "usage_pct": 78},
"analytics_module": {"adopted": true, "usage_pct": 45},
"integrations_module": {"adopted": true, "usage_pct": 55},
"api_access": {"adopted": false, "usage_pct": 0},
"advanced_reporting": {"adopted": false, "usage_pct": 0}
},
"departments": {
"current": ["operations", "finance"],
"potential": ["logistics", "compliance"]
}
},
{
"customer_id": "CUST-004",
"name": "HealthFirst Medical",
"segment": "enterprise",
"arr": 200000,
"contract_end_date": "2027-03-15",
"usage": {
"login_frequency": 92,
"feature_adoption": 88,
"dau_mau_ratio": 0.55
},
"engagement": {
"support_ticket_volume": 2,
"meeting_attendance": 95,
"nps_score": 9,
"csat_score": 4.6
},
"support": {
"open_tickets": 1,
"escalation_rate": 0.02,
"avg_resolution_hours": 12
},
"relationship": {
"executive_sponsor_engagement": 92,
"multi_threading_depth": 6,
"renewal_sentiment": "positive"
},
"previous_period": {
"usage_score": 85,
"engagement_score": 82,
"support_score": 88,
"relationship_score": 80,
"overall_score": 84
},
"usage_decline": {
"login_trend": 3,
"feature_adoption_change": 5,
"dau_mau_change": 0.03
},
"engagement_drop": {
"meeting_cancellations": 0,
"response_time_days": 1,
"nps_change": 0
},
"support_issues": {
"open_escalations": 0,
"unresolved_critical": 0,
"satisfaction_trend": "improving"
},
"relationship_signals": {
"champion_left": false,
"sponsor_change": false,
"competitor_mentions": 0
},
"commercial_factors": {
"contract_type": "multi-year",
"pricing_complaints": false,
"budget_cuts_mentioned": false
},
"contract": {
"licensed_seats": 250,
"active_seats": 240,
"plan_tier": "enterprise",
"available_tiers": ["professional", "enterprise", "enterprise_plus"]
},
"product_usage": {
"core_platform": {"adopted": true, "usage_pct": 92},
"analytics_module": {"adopted": true, "usage_pct": 80},
"integrations_module": {"adopted": true, "usage_pct": 70},
"api_access": {"adopted": true, "usage_pct": 65},
"advanced_reporting": {"adopted": true, "usage_pct": 50},
"security_module": {"adopted": false, "usage_pct": 0},
"audit_module": {"adopted": false, "usage_pct": 0}
},
"departments": {
"current": ["clinical", "operations", "IT", "compliance"],
"potential": ["research", "finance", "HR"]
}
}
]
}
FILE:customer-success-manager/assets/success_plan_template.md
# Customer Success Plan
**Customer:** [Customer Name]
**CSM:** [CSM Name]
**Account Executive:** [AE Name]
**Plan Created:** [Date]
**Last Updated:** [Date]
**Review Cadence:** [Monthly / Quarterly]
---
## 1. Customer Overview
| Field | Details |
|-------|---------|
| Industry | [Industry] |
| Company Size | [Employees] |
| Segment | [Enterprise / Mid-Market / SMB] |
| ARR | $[Amount] |
| Contract Start | [Date] |
| Renewal Date | [Date] |
| Plan Tier | [Tier name] |
| Licensed Seats | [Number] |
### Key Stakeholders
| Name | Title | Role | Engagement Level |
|------|-------|------|-----------------|
| [Name] | [Title] | Executive Sponsor | [High / Medium / Low] |
| [Name] | [Title] | Day-to-Day Champion | [High / Medium / Low] |
| [Name] | [Title] | Technical Lead | [High / Medium / Low] |
| [Name] | [Title] | End User Lead | [High / Medium / Low] |
---
## 2. Business Objectives
### Primary Business Objectives
| # | Objective | Success Metric | Target | Timeline |
|---|-----------|---------------|--------|----------|
| 1 | [e.g., Reduce manual reporting time] | [Hours saved per week] | [Target number] | [Date] |
| 2 | [e.g., Improve team collaboration] | [Project completion rate] | [Target %] | [Date] |
| 3 | [e.g., Increase revenue visibility] | [Forecast accuracy] | [Target %] | [Date] |
### Why These Objectives Matter
- **Objective 1:** [Business context -- why this matters to the customer's overall strategy]
- **Objective 2:** [Business context]
- **Objective 3:** [Business context]
---
## 3. Success Milestones
### Phase 1: Foundation (Days 1-30)
| Milestone | Target Date | Status | Owner | Notes |
|-----------|------------|--------|-------|-------|
| Technical setup complete | [Date] | [ ] | [Name] | |
| Admin training delivered | [Date] | [ ] | CSM | |
| Core team onboarded | [Date] | [ ] | CSM | |
| First value milestone achieved | [Date] | [ ] | [Name] | |
| Data migration validated | [Date] | [ ] | SE | |
### Phase 2: Adoption (Days 31-90)
| Milestone | Target Date | Status | Owner | Notes |
|-----------|------------|--------|-------|-------|
| 80% user adoption | [Date] | [ ] | CSM | |
| Key workflows live | [Date] | [ ] | [Name] | |
| Integrations configured | [Date] | [ ] | SE | |
| First ROI measurement | [Date] | [ ] | CSM | |
| 30-day review complete | [Date] | [ ] | CSM | |
### Phase 3: Value Realisation (Days 91-180)
| Milestone | Target Date | Status | Owner | Notes |
|-----------|------------|--------|-------|-------|
| Objective 1 progress measurable | [Date] | [ ] | [Name] | |
| Advanced features adopted | [Date] | [ ] | CSM | |
| QBR completed | [Date] | [ ] | CSM | |
| Executive alignment confirmed | [Date] | [ ] | CSM | |
### Phase 4: Optimisation and Growth (Days 181-365)
| Milestone | Target Date | Status | Owner | Notes |
|-----------|------------|--------|-------|-------|
| All objectives on track | [Date] | [ ] | CSM | |
| ROI documented for renewal | [Date] | [ ] | CSM | |
| Expansion opportunities identified | [Date] | [ ] | CSM + AE | |
| Renewal conversation initiated | [Date] | [ ] | CSM + AE | |
---
## 4. Health Score Tracking
| Date | Overall Score | Usage | Engagement | Support | Relationship | Classification |
|------|--------------|-------|------------|---------|-------------|---------------|
| [Date] | [Score] | [Score] | [Score] | [Score] | [Score] | [Green/Yellow/Red] |
| [Date] | [Score] | [Score] | [Score] | [Score] | [Score] | [Green/Yellow/Red] |
---
## 5. Risk Register
| Risk | Probability | Impact | Mitigation | Owner | Status |
|------|------------|--------|-----------|-------|--------|
| [e.g., Executive sponsor departure] | [High/Med/Low] | [High/Med/Low] | [Multi-thread relationships] | CSM | [Active/Resolved] |
| [e.g., Low adoption in team X] | [High/Med/Low] | [High/Med/Low] | [Targeted training session] | CSM | [Active/Resolved] |
| [e.g., Budget review next quarter] | [High/Med/Low] | [High/Med/Low] | [Document ROI before review] | CSM | [Active/Resolved] |
---
## 6. Communication Plan
| Activity | Frequency | Participants | Purpose |
|----------|-----------|-------------|---------|
| Status check-in | [Weekly / Bi-weekly] | CSM + Champion | Tactical progress review |
| Strategic review | [Monthly] | CSM + Stakeholders | Objective alignment |
| QBR | [Quarterly] | CSM + Executive Sponsor | Executive business review |
| Technical review | [As needed] | SE + Technical Lead | Architecture and integration |
| Renewal planning | [90 days before] | CSM + AE + Sponsor | Contract discussion |
---
## 7. Product Adoption Plan
### Current State
| Module/Feature | Status | Usage Level | Target Usage | Gap |
|---------------|--------|-------------|-------------|-----|
| [Module 1] | Adopted | [%] | [%] | [Actions needed] |
| [Module 2] | Adopted | [%] | [%] | [Actions needed] |
| [Module 3] | Not Adopted | 0% | [%] | [Enablement plan] |
### Enablement Activities
| Activity | Target Date | Audience | Expected Outcome |
|----------|------------|----------|-----------------|
| [Training session] | [Date] | [Team/Group] | [Metric improvement] |
| [Workshop] | [Date] | [Team/Group] | [New workflow adoption] |
| [Office hours] | [Ongoing] | [All users] | [Question resolution] |
---
## 8. Expansion Roadmap
| Opportunity | Type | Estimated Value | Timeline | Prerequisites |
|------------|------|----------------|----------|--------------|
| [e.g., Additional seats] | Expansion | $[Amount] | [Quarter] | [Usage > 90%] |
| [e.g., Tier upgrade] | Upsell | $[Amount] | [Quarter] | [Feature requests] |
| [e.g., New module] | Cross-sell | $[Amount] | [Quarter] | [Use case validated] |
---
## 9. Notes and Updates
### [Date] - [Author]
[Update notes, key decisions, changes to plan]
### [Date] - [Author]
[Update notes, key decisions, changes to plan]
---
**Next Review Date:** [Date]
**Plan Owner:** [CSM Name]
FILE:customer-success-manager/references/cs-metrics-benchmarks.md
# Customer Success Metrics and Benchmarks
Industry benchmarks for key customer success metrics, segmented by company size, customer segment, and industry vertical.
---
## Core SaaS Metrics
### Net Revenue Retention (NRR)
NRR measures revenue retained from existing customers including expansion, contraction, and churn. It is the single most important metric for SaaS customer success.
**Formula:** (Starting ARR + Expansion - Contraction - Churn) / Starting ARR * 100
| Performance Level | NRR Range | Interpretation |
|-------------------|-----------|----------------|
| Best-in-class | > 130% | Strong expansion engine, very low churn |
| Excellent | 120-130% | Healthy growth from existing customers |
| Good | 110-120% | Solid retention with moderate expansion |
| Target | > 110% | Minimum for sustainable growth |
| Acceptable | 100-110% | Revenue stable but limited expansion |
| Below target | 90-100% | Churn exceeds expansion |
| Concerning | < 90% | Significant revenue erosion |
**Benchmarks by Segment:**
| Customer Segment | Median NRR | Top Quartile | Bottom Quartile |
|-----------------|------------|--------------|-----------------|
| Enterprise (>$100K ARR) | 115% | 130%+ | 105% |
| Mid-Market ($25K-$100K) | 108% | 120% | 98% |
| SMB (<$25K ARR) | 95% | 105% | 85% |
### Gross Revenue Retention (GRR)
GRR measures revenue retained without counting expansion. It isolates the churn and contraction signal.
**Formula:** (Starting ARR - Contraction - Churn) / Starting ARR * 100
| Performance Level | GRR Range | Interpretation |
|-------------------|-----------|----------------|
| Best-in-class | > 95% | Minimal churn, highly sticky product |
| Excellent | 92-95% | Strong retention |
| Good | 90-92% | Healthy with room to improve |
| Target | > 90% | Industry standard target |
| Acceptable | 85-90% | Moderate churn, needs focus |
| Below target | 80-85% | High churn impacting growth |
| Concerning | < 80% | Urgent retention problem |
**Benchmarks by Segment:**
| Customer Segment | Median GRR | Top Quartile | Bottom Quartile |
|-----------------|------------|--------------|-----------------|
| Enterprise | 95% | 98% | 90% |
| Mid-Market | 90% | 95% | 85% |
| SMB | 82% | 90% | 75% |
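The NRR and GRR formulas above can be sketched in Python as follows. The function and parameter names are illustrative assumptions rather than part of any shipped script, and the cohort figures are made-up inputs.

```python
def net_revenue_retention(starting_arr, expansion, contraction, churn):
    """NRR % = (Starting ARR + Expansion - Contraction - Churn) / Starting ARR * 100."""
    if starting_arr <= 0:
        raise ValueError("starting_arr must be positive")
    return 100.0 * (starting_arr + expansion - contraction - churn) / starting_arr


def gross_revenue_retention(starting_arr, contraction, churn):
    """GRR % = (Starting ARR - Contraction - Churn) / Starting ARR * 100."""
    if starting_arr <= 0:
        raise ValueError("starting_arr must be positive")
    return 100.0 * (starting_arr - contraction - churn) / starting_arr


# Hypothetical cohort: $1.0M starting ARR, $200K expansion,
# $30K contraction, $50K churned.
nrr = net_revenue_retention(1_000_000, 200_000, 30_000, 50_000)
grr = gross_revenue_retention(1_000_000, 30_000, 50_000)
print(f"NRR {nrr:.1f}%  GRR {grr:.1f}%")  # NRR 112.0%  GRR 92.0%
```

In this example the cohort clears both the > 110% NRR target and the > 90% GRR target from the tables above.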
---
## Health Score Benchmarks
### Portfolio Health Distribution (Target)
A healthy CS portfolio should have the following approximate distribution:
| Classification | Target Distribution | Alert Threshold |
|---------------|-------------------|-----------------|
| Green (Healthy) | 60-70% | < 50% triggers portfolio review |
| Yellow (Attention) | 20-30% | > 35% signals systemic issues |
| Red (At Risk) | 5-10% | > 15% requires executive intervention |
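As a rough sketch, the alert thresholds in this table could be checked against a list of account classifications like this. The function name and the message wording are assumptions made for illustration.

```python
from collections import Counter


def portfolio_alerts(classifications):
    """Return alert messages for a list of 'green'/'yellow'/'red' labels."""
    total = len(classifications)
    if total == 0:
        return []
    counts = Counter(label.lower() for label in classifications)
    share = {c: 100.0 * counts[c] / total for c in ("green", "yellow", "red")}

    alerts = []
    if share["green"] < 50:
        alerts.append(f"Green at {share['green']:.0f}%: portfolio review")
    if share["yellow"] > 35:
        alerts.append(f"Yellow at {share['yellow']:.0f}%: possible systemic issues")
    if share["red"] > 15:
        alerts.append(f"Red at {share['red']:.0f}%: executive intervention")
    return alerts


# Hypothetical 20-account portfolio: 9 Green, 8 Yellow, 3 Red.
portfolio = ["green"] * 9 + ["yellow"] * 8 + ["red"] * 3
for alert in portfolio_alerts(portfolio):
    print(alert)
```

Here Green (45%) and Yellow (40%) both trip their thresholds, while Red sits exactly at 15% and does not, because the table's threshold is strictly greater than 15%.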
### Average Health Score by Segment
| Segment | Target Average | Industry Median | Top Quartile |
|---------|---------------|-----------------|--------------|
| Enterprise | > 78 | 72 | 82 |
| Mid-Market | > 75 | 68 | 78 |
| SMB | > 70 | 65 | 75 |
### Health Score by Dimension (Industry Medians)
| Dimension | Enterprise | Mid-Market | SMB |
|-----------|-----------|------------|-----|
| Usage | 72 | 68 | 60 |
| Engagement | 70 | 62 | 55 |
| Support | 78 | 72 | 65 |
| Relationship | 68 | 60 | 50 |
---
## Churn Metrics
### Logo Churn Rate (Annual)
| Performance Level | Rate | Interpretation |
|-------------------|------|----------------|
| Best-in-class | < 5% | Exceptional retention |
| Excellent | 5-8% | Very strong |
| Good | 8-12% | Healthy |
| Acceptable | 12-15% | Room for improvement |
| Below target | 15-20% | Significant churn problem |
| Concerning | > 20% | Urgent -- product-market fit issues likely |
**Benchmarks by Segment:**
| Segment | Median Annual Logo Churn | Top Quartile | Bottom Quartile |
|---------|------------------------|--------------|-----------------|
| Enterprise | 5% | 2% | 10% |
| Mid-Market | 10% | 5% | 18% |
| SMB | 20% | 12% | 35% |
### Churn Leading Indicators
The following metrics have the highest predictive power for churn events:
| Indicator | Lead Time | Correlation with Churn |
|-----------|-----------|----------------------|
| Login frequency decline (>30%) | 60-90 days | Very High |
| NPS drop (>3 points) | 30-60 days | High |
| Executive sponsor departure | 30-90 days | Very High |
| Support escalation rate increase | 30-60 days | High |
| Meeting cancellation increase | 30-45 days | Moderate-High |
| Feature adoption decline | 60-90 days | Moderate |
| Competitor mentions | 30-60 days | Moderate |
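Only three of the indicators above carry an explicit trigger (a login decline over 30%, an NPS drop over 3 points, a sponsor departure). A minimal sketch of flagging those three follows; the function name and input parameters are hypothetical, and how the decline and drop are measured is left to the caller.

```python
def churn_warning_signals(login_decline_pct, nps_drop_points, sponsor_departed):
    """Return (indicator, lead time, correlation) rows that are triggered.

    login_decline_pct and nps_drop_points are positive magnitudes,
    e.g. 35 means logins fell by 35%.
    """
    signals = []
    if login_decline_pct > 30:
        signals.append(("Login frequency decline", "60-90 days", "Very High"))
    if nps_drop_points > 3:
        signals.append(("NPS drop", "30-60 days", "High"))
    if sponsor_departed:
        signals.append(("Executive sponsor departure", "30-90 days", "Very High"))
    return signals


# Hypothetical account: logins down 35%, NPS down 2 points, sponsor still in place.
for indicator, lead_time, correlation in churn_warning_signals(35, 2, False):
    print(f"{indicator}: lead time {lead_time}, correlation {correlation}")
```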
---
## Expansion Metrics
### Expansion Revenue Rate
| Performance Level | Rate | Notes |
|-------------------|------|-------|
| Best-in-class | > 30% of total revenue | Strong land-and-expand motion |
| Excellent | 25-30% | Effective expansion engine |
| Good | 20-25% | Solid upsell/cross-sell |
| Target | > 20% | Minimum for healthy growth |
| Below target | 10-20% | Expansion motion needs development |
| Concerning | < 10% | Missing significant expansion opportunity |
### Expansion by Type
| Expansion Type | Typical Contribution | Average Deal Size |
|---------------|---------------------|-------------------|
| Seat Expansion | 40-50% of expansion | 15-25% of contract value |
| Tier Upsell | 25-35% of expansion | 40-80% of contract value |
| Module Cross-sell | 15-25% of expansion | 10-20% of contract value |
| Department Expansion | 5-15% of expansion | 50-100% of contract value |
### Expansion Readiness Indicators
| Signal | Interpretation |
|--------|---------------|
| Seat utilisation > 90% | Ready for seat expansion |
| Feature requests for higher tier | Upsell opportunity |
| Usage of 70%+ of current modules | Ready for cross-sell |
| New department interest | Department expansion play |
| Customer referral activity | Strong relationship, open to expansion |
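The two quantified signals above can be sketched against the `contract` and `product_usage` blocks used in the sample data. Reading "usage of 70%+ of current modules" as "at least 70% of listed modules are adopted" is an assumption, as are the function name and the signal labels; the sketch also assumes `licensed_seats` is positive and at least one module is listed.

```python
def expansion_signals(account):
    """Return (seat utilisation %, module coverage %, suggested plays)."""
    contract = account["contract"]
    seat_utilisation = 100.0 * contract["active_seats"] / contract["licensed_seats"]

    modules = account["product_usage"]
    adopted = sum(1 for module in modules.values() if module["adopted"])
    module_coverage = 100.0 * adopted / len(modules)

    signals = []
    if seat_utilisation > 90:
        signals.append("seat_expansion")
    if module_coverage >= 70:  # assumed reading of "70%+ of current modules"
        signals.append("module_cross_sell")
    return seat_utilisation, module_coverage, signals


# Trimmed copy of one sample-data account (48 of 50 seats, 3 of 5 modules adopted).
account = {
    "contract": {"licensed_seats": 50, "active_seats": 48},
    "product_usage": {
        "core_platform": {"adopted": True},
        "analytics_module": {"adopted": True},
        "integrations_module": {"adopted": True},
        "api_access": {"adopted": False},
        "advanced_reporting": {"adopted": False},
    },
}
seats, coverage, plays = expansion_signals(account)
print(f"seats {seats:.0f}%, modules {coverage:.0f}%, plays {plays}")
```

With 96% seat utilisation and 60% module coverage, only the seat expansion play is suggested for this account.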
---
## Engagement Metrics
### Customer Engagement Score (CES) Benchmarks
| Metric | Target | Median | Warning |
|--------|--------|--------|---------|
| Meeting attendance rate | > 80% | 72% | < 50% |
| Average NPS | > 50 | 35 | < 20 |
| Average CSAT | > 4.2/5 | 3.8/5 | < 3.0/5 |
| Response time (days) | < 2 | 3 | > 5 |
| QBR completion rate | > 90% | 75% | < 60% |
### Time to First Value (TTFV)
| Segment | Target TTFV | Median TTFV | Warning Threshold |
|---------|------------|------------|-------------------|
| Enterprise | < 30 days | 45 days | > 60 days |
| Mid-Market | < 21 days | 30 days | > 45 days |
| SMB | < 14 days | 21 days | > 30 days |
---
## CSM Operational Metrics
### Portfolio Management
| Metric | Enterprise CSM | Mid-Market CSM | SMB CSM (Tech-Touch) |
|--------|---------------|----------------|---------------------|
| Accounts per CSM | 10-25 | 30-60 | 100-300+ |
| ARR per CSM | $2M-$5M | $2M-$4M | $1M-$3M |
| Touch frequency | Weekly-biweekly | Biweekly-monthly | Quarterly-automated |
| QBR frequency | Quarterly | Semi-annually | Annually |
| Health score reviews | Weekly | Bi-weekly | Monthly |
### CSM Activity Benchmarks
| Activity | Target | Purpose |
|----------|--------|---------|
| Strategic calls | 2-4 per account per month | Relationship building |
| Health score reviews | 4 per month (weekly) | Portfolio monitoring |
| QBR preparation | 3-5 per quarter | Executive engagement |
| Escalation handling | < 2 per month | Issue resolution |
| Expansion conversations | 1-2 per account per month | Revenue growth |
---
## Industry-Specific Benchmarks
### By Industry Vertical
| Industry | Median NRR | Median GRR | Median Logo Churn |
|----------|-----------|-----------|------------------|
| Infrastructure/DevOps | 125% | 95% | 5% |
| Cybersecurity | 120% | 93% | 7% |
| HR Tech | 110% | 90% | 12% |
| MarTech | 105% | 87% | 15% |
| FinTech | 115% | 92% | 8% |
| HealthTech | 112% | 91% | 10% |
| EdTech | 100% | 85% | 18% |
| eCommerce Tools | 108% | 88% | 14% |
### By Company Stage
| Stage | Median NRR | Median GRR | Notes |
|-------|-----------|-----------|-------|
| Early Stage (<$10M ARR) | 100% | 85% | Focus on product-market fit |
| Growth ($10M-$50M ARR) | 110% | 90% | Building CS function |
| Scale ($50M-$200M ARR) | 118% | 93% | Mature CS operations |
| Enterprise (>$200M ARR) | 115% | 95% | Optimisation phase |
---
## Metric Relationships
### Key Correlations
| If This Metric Moves | This Also Tends to Move | Direction |
|---------------------|------------------------|-----------|
| Health score down | Churn probability up | Inverse |
| NPS up | NRR up | Direct |
| TTFV down | GRR up | Inverse |
| Feature adoption up | Expansion rate up | Direct |
| Escalation rate up | NPS down | Inverse |
| Multi-threading depth up | GRR up | Direct |
### The SaaS Retention Equation
**Sustainable Growth requires:** NRR > 110% AND GRR > 90%
If NRR is high but GRR is low: You are churning customers and replacing the lost revenue with expansion from survivors. This is not sustainable.
If GRR is high but NRR is low: You retain well but do not expand, leaving money on the table.
Both high: Healthy, compounding growth from existing customers.
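The retention equation can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and return strings are illustrative, and the final "both low" branch is not described in the text above.

```python
def retention_diagnosis(nrr_pct, grr_pct, nrr_target=110.0, grr_target=90.0):
    """Classify a (NRR, GRR) pair against the NRR > 110% and GRR > 90% targets."""
    nrr_high = nrr_pct > nrr_target
    grr_high = grr_pct > grr_target
    if nrr_high and grr_high:
        return "healthy: compounding growth from existing customers"
    if nrr_high:
        return "unsustainable: expansion from survivors is masking churn"
    if grr_high:
        return "under-expanding: strong retention, limited expansion"
    # Assumption: the source does not name this case.
    return "at risk: both retention and expansion are below target"


print(retention_diagnosis(118, 93))
print(retention_diagnosis(115, 84))
print(retention_diagnosis(102, 94))
```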
---
**Last Updated:** February 2026
**Sources:** Industry surveys, SaaS benchmarking reports, customer success community data (2024-2025 data cycles).
FILE:customer-success-manager/references/cs-playbooks.md
# Customer Success Playbooks
Comprehensive intervention, onboarding, renewal, expansion, and escalation playbooks for SaaS customer success management.
---
## Risk Tier Intervention Playbooks
### Critical Risk (Score 80-100)
**Situation:** Customer is at imminent risk of churn. Multiple severe warning signals detected. Requires immediate executive-level intervention.
**Timeline:** Act within 48 hours.
**Steps:**
1. **Executive Escalation (Day 0)**
- Alert VP of Customer Success and account executive immediately
- Brief internal leadership on situation, warning signals, and ARR at risk
- Identify any pending support issues and fast-track resolution
2. **Customer Contact (Day 1-2)**
- Schedule executive-to-executive call (VP CS to customer VP/C-level)
- Frame the conversation around understanding their challenges, not defending your product
- Listen more than talk -- capture the real objections
3. **Save Plan Creation (Day 2-3)**
- Create a detailed save plan with specific value milestones tied to their business outcomes
- Include timeline, owners, and measurable success criteria
- Get internal alignment on any concessions (pricing, features, roadmap commitments)
4. **Rescue Team Assignment (Day 3-5)**
- Assign a dedicated rescue team: CSM + Solutions Engineer + Support Lead
- Daily internal stand-up (15 min max) on account status
- Solutions Engineer to conduct technical health check
5. **Execution and Monitoring (Week 2-4)**
- Execute save plan with weekly customer check-ins
- Track progress against milestones
- Prepare competitive displacement defence if competitor involvement detected
6. **Resolution Assessment (Week 4)**
- Evaluate whether the situation is stabilising
- If improving: transition to High-risk monitoring cadence
- If not improving: escalate to CEO/GM for final intervention
**Success Criteria:** Risk score drops below 60 within 30 days. Customer confirms continued partnership intent.
---
### High Risk (Score 60-79)
**Situation:** Customer showing clear signs of dissatisfaction or disengagement. Still salvageable with focused CSM intervention.
**Timeline:** Act within 1 week.
**Steps:**
1. **Root Cause Analysis (Day 1-3)**
- Review all health score dimensions to identify the primary drivers
- Pull support ticket history for patterns
- Check product usage trends for the past 90 days
2. **CSM Outreach (Day 3-5)**
- Schedule a dedicated call with the customer (not a routine check-in)
- Open with empathy: "I've noticed some changes and want to make sure we're supporting you properly"
- Identify the top 3 customer concerns
3. **30-Day Recovery Plan (Day 5-7)**
- Build a 30-day recovery plan with measurable checkpoints every week
- Include specific actions for each concern identified
- Share the plan with the customer for mutual commitment
4. **Re-Engage Executive Sponsor (Week 2)**
- Request a meeting with the executive sponsor
- Align on business outcomes and how your product supports them
- Confirm continued sponsorship and address any political changes
5. **Support Fast-Track (Ongoing)**
- Escalate any pending support tickets internally
- Assign a support point of contact for this account
- Provide weekly status updates on open issues
6. **Progress Review (Week 3-4)**
- Review all metrics for improvement
- Adjust plan if specific interventions are not working
- If score drops to Critical: escalate to executive playbook
**Success Criteria:** Risk score drops below 40 within 30 days. No new warning signals emerge.
---
### Medium Risk (Score 40-59)
**Situation:** Early warning signs detected. Customer may not be aware of emerging issues. Proactive outreach prevents escalation.
**Timeline:** Act within 2 weeks.
**Steps:**
1. **Data Review (Day 1-5)**
- Analyse which dimension(s) are pulling the score down
- Review recent support interactions for sentiment clues
- Check for any known product issues affecting this customer
2. **Proactive Check-In (Week 1-2)**
- Schedule a "value check-in" call (position it as routine, not reactive)
- Share relevant success stories from similar customers
- Propose a training session or product walkthrough for underutilised features
3. **Value Reinforcement (Week 2-3)**
- Send a customised ROI summary showing value delivered
- Highlight feature releases relevant to their use case
- Connect them with your customer community or user group
4. **Monitoring (Week 3-4)**
- Increase monitoring frequency to bi-weekly
- Watch for improvement or continued decline
- If declining: move to High-risk playbook
**Success Criteria:** Score stabilises above 50 or improves. No escalation to High risk.
---
### Low Risk (Score 0-39)
**Situation:** Customer is healthy. Standard success cadence applies. Focus on value reinforcement and expansion readiness.
**Timeline:** Standard touch cadence.
**Steps:**
1. **Maintain Cadence**
- Enterprise: Monthly strategic reviews, quarterly QBRs
- Mid-Market: Bi-monthly check-ins, semi-annual reviews
- SMB: Quarterly automated health updates, annual review
2. **Proactive Communication**
- Share product updates and release notes
- Invite to webinars, conferences, and community events
- Share relevant industry insights and benchmarks
3. **Expansion Readiness**
- Monitor for expansion signals (usage approaching limits, new use cases)
- Prepare expansion proposals when timing is right
- Position premium features and modules relevant to their needs
4. **Renewal Preparation**
- Begin renewal preparation 90 days before contract end
- Build renewal proposal with value delivered summary
- Identify any terms or pricing adjustments needed
**Success Criteria:** Customer remains in Green classification. Expansion conversations initiated when appropriate.
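The four risk tiers and their response timelines can be sketched as a lookup. The names `RISK_TIERS` and `risk_tier` are assumptions for illustration; the score bands and timelines come from the playbooks above.

```python
# (minimum risk score, tier, response timeline), highest tier first.
RISK_TIERS = [
    (80, "critical", "Act within 48 hours"),
    (60, "high", "Act within 1 week"),
    (40, "medium", "Act within 2 weeks"),
    (0, "low", "Standard touch cadence"),
]


def risk_tier(score):
    """Map a 0-100 risk score (higher means more risk) to (tier, timeline)."""
    if not 0 <= score <= 100:
        raise ValueError("risk score must be between 0 and 100")
    for floor, tier, timeline in RISK_TIERS:
        if score >= floor:
            return tier, timeline


for score in (85, 62, 40, 12):
    tier, timeline = risk_tier(score)
    print(f"{score}: {tier} ({timeline})")
```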
---
## Onboarding Playbook
### Phase 1: Welcome and Setup (Day 1-14)
| Day | Activity | Owner | Deliverable |
|-----|----------|-------|-------------|
| 1 | Welcome email and introduction | CSM | Welcome package sent |
| 1-2 | Kickoff call | CSM + SE | Success plan drafted |
| 3-5 | Technical setup and configuration | SE | Environment configured |
| 5-7 | Admin training session | CSM | Admins trained |
| 7-10 | Data migration (if applicable) | SE | Data validated |
| 10-14 | Initial user training | CSM | Core team trained |
### Phase 2: Activation (Day 15-30)
| Day | Activity | Owner | Deliverable |
|-----|----------|-------|-------------|
| 15 | Activation check -- are users logging in? | CSM | Usage report |
| 15-20 | Follow-up training for laggards | CSM | All users active |
| 20-25 | First business outcome milestone | CSM | Milestone achieved |
| 25-30 | 30-day review call | CSM | Review documented |
**Critical Milestone:** Time to First Value must be under 30 days.
### Phase 3: Adoption (Day 31-60)
| Day | Activity | Owner | Deliverable |
|-----|----------|-------|-------------|
| 31-40 | Feature adoption expansion | CSM | New features in use |
| 40-50 | Integration setup (if applicable) | SE | Integrations live |
| 50-60 | Usage benchmarking vs. peers | CSM | Benchmark report |
### Phase 4: Optimisation (Day 61-90)
| Day | Activity | Owner | Deliverable |
|-----|----------|-------|-------------|
| 61-70 | Advanced use case workshop | CSM + SE | New use cases identified |
| 70-80 | ROI measurement | CSM | ROI documented |
| 80-90 | 90-day executive review | CSM | Transition to steady-state |
**Gate:** Handoff from onboarding to ongoing CSM management. Health score must be Yellow or better.
---
## Renewal Playbook
### 120 Days Before Renewal
- Review contract terms and pricing
- Assess current health score and trajectory
- Identify any outstanding issues or concerns
- Begin internal alignment on renewal strategy
### 90 Days Before Renewal
- Schedule renewal conversation with customer
- Prepare value delivered summary (ROI, usage stats, milestones achieved)
- Draft renewal proposal with recommended terms
- If at-risk: escalate and begin risk mitigation
### 60 Days Before Renewal
- Present renewal proposal to customer
- Negotiate terms if needed
- Address any concerns raised during the process
- Escalate blockers to leadership
### 30 Days Before Renewal
- Finalise contract terms
- Obtain signatures
- Plan for any post-renewal actions (expansion, migration)
- Update CRM with renewal details
### Post-Renewal
- Confirm renewed contract in systems
- Send thank-you and updated success plan
- Schedule next QBR
- Identify expansion opportunities
---
## Expansion Playbook
### Identifying Expansion Signals
| Signal | Expansion Type | Priority |
|--------|---------------|----------|
| Seat utilisation > 90% | Seat expansion | High |
| Requests for features in higher tier | Tier upsell | High |
| New department inquiries | Department expansion | Medium |
| High adoption of existing modules | Module cross-sell | Medium |
| Customer referencing competitors for missing features | Cross-sell | High |
### Expansion Conversation Framework
1. **Discovery:** "I noticed your team has been getting great value from [feature]. Have you considered how [new module] could help with [related business outcome]?"
2. **Value Framing:** "Companies similar to yours who adopted [module] saw [specific metric improvement]."
3. **Proposal:** "Based on your current usage, here's what the expansion would look like..."
4. **Stakeholder Alignment:** Involve the economic buyer early. The champion can advocate, but the budget holder decides.
5. **Close:** Coordinate with sales/account executive for commercial negotiation.
---
## Escalation Procedures
### Internal Escalation Matrix
| Trigger | Escalation Level | Response Time |
|---------|-----------------|---------------|
| Health score drops to Red | VP Customer Success | 24 hours |
| Executive sponsor leaves | Director CS + AE | 48 hours |
| Critical bug affecting customer | VP Engineering + VP CS | 4 hours |
| Customer mentions competitor evaluation | VP CS + VP Sales | 24 hours |
| Renewal at risk (60 days or less) | CRO/VP Sales | 24 hours |
| Customer threatens legal action | Legal + VP CS | Immediate |
### Escalation Communication Template
**Subject:** [ESCALATION] {Customer Name} -- {Brief Description}
**Body:**
- Customer: {name}, {segment}, {ARR}
- Health Score: {score} ({classification})
- Renewal Date: {date}
- Issue Summary: {2-3 sentences}
- Warning Signals: {list}
- Recommended Action: {specific next step}
- Urgency: {critical/high/medium}
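One way to fill this template programmatically is sketched below. The function name, the dictionary keys, and the dollar formatting of ARR are assumptions; the account details in the usage example are illustrative values, not a real escalation.

```python
def build_escalation(account, brief, summary, signals, action, urgency):
    """Fill the escalation template; returns (subject, body)."""
    subject = f"[ESCALATION] {account['name']} -- {brief}"
    body = "\n".join([
        f"- Customer: {account['name']}, {account['segment']}, ${account['arr']:,} ARR",
        f"- Health Score: {account['score']} ({account['classification']})",
        f"- Renewal Date: {account['renewal_date']}",
        f"- Issue Summary: {summary}",
        f"- Warning Signals: {', '.join(signals)}",
        f"- Recommended Action: {action}",
        f"- Urgency: {urgency}",
    ])
    return subject, body


# Illustrative values only.
account = {
    "name": "GlobalTrade Solutions",
    "segment": "mid-market",
    "arr": 55000,
    "score": 66,
    "classification": "Yellow",
    "renewal_date": "2026-09-30",
}
subject, body = build_escalation(
    account,
    brief="Executive sponsor change",
    summary="Sponsor changed this quarter and usage is trending down.",
    signals=["sponsor change", "competitor mention"],
    action="Schedule intro call with new sponsor",
    urgency="high",
)
print(subject)
print(body)
```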
---
**Last Updated:** February 2026
FILE:customer-success-manager/references/health-scoring-framework.md
# Health Scoring Framework
Complete methodology for multi-dimensional customer health scoring in SaaS customer success.
---
## Overview
Customer health scoring is the foundation of proactive customer success management. A well-calibrated health score enables CSMs to prioritise their portfolio, identify emerging risks before they become churn events, and allocate resources where they will have the greatest impact.
This framework uses a weighted, multi-dimensional approach that scores customers across four key areas: usage, engagement, support, and relationship. Each dimension contributes to an overall health score (0-100) that classifies accounts as Green (healthy), Yellow (needs attention), or Red (at risk).
---
## Scoring Dimensions
### 1. Usage (Weight: 30%)
Usage metrics are the strongest leading indicator of customer health. Customers who are not using the product are not deriving value and are at elevated churn risk.
| Metric | Definition | Scoring Method |
|--------|-----------|----------------|
| Login Frequency | Percentage of expected login days with actual logins | (actual / target) * 100, capped at 100 |
| Feature Adoption | Percentage of available features actively used | (adopted / available) * 100, capped at 100 |
| DAU/MAU Ratio | Daily active users divided by monthly active users | (actual / target) * 100, capped at 100 |
**Sub-weights within Usage:**
- Login Frequency: 35%
- Feature Adoption: 40%
- DAU/MAU Ratio: 25%
**Why 30% weight:** Usage is the most objective, data-driven signal. Declining usage almost always precedes churn. However, some customers may have seasonal usage patterns, which is why it is not weighted even higher.
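A minimal sketch of the Usage dimension, combining the capped `(actual / target) * 100` scoring method with the sub-weights listed above. The function names are assumptions, and the feature adoption and DAU/MAU targets in the example are hypothetical placeholders.

```python
USAGE_WEIGHTS = {
    "login_frequency": 0.35,
    "feature_adoption": 0.40,
    "dau_mau_ratio": 0.25,
}


def score_direct(actual, target):
    """(actual / target) * 100, capped at 100."""
    if target <= 0:
        raise ValueError("target must be positive")
    return min(100.0, 100.0 * actual / target)


def usage_score(metrics, targets):
    """Weighted sum of the three usage sub-scores (0-100)."""
    return sum(
        weight * score_direct(metrics[name], targets[name])
        for name, weight in USAGE_WEIGHTS.items()
    )


# Metric values shaped like the sample data's "usage" block.
metrics = {"login_frequency": 70, "feature_adoption": 58, "dau_mau_ratio": 0.35}
# Hypothetical targets; real targets are calibrated per segment.
targets = {"login_frequency": 80, "feature_adoption": 70, "dau_mau_ratio": 0.40}
print(f"{usage_score(metrics, targets):.1f}")  # 85.6
```

The cap matters for over-performing accounts: a login frequency of 95 against a target of 80 still scores 100, so one strong metric cannot mask weakness in another.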
### 2. Engagement (Weight: 25%)
Engagement measures how actively the customer participates in the relationship beyond just product usage.
| Metric | Definition | Scoring Method |
|--------|-----------|----------------|
| Support Ticket Volume | Number of support tickets in the period | Inverse score: (1 - actual/max) * 100 |
| Meeting Attendance | Percentage of scheduled meetings attended | (actual / target) * 100, capped at 100 |
| NPS Score | Net Promoter Score response (0-10) | (actual / target) * 100, capped at 100 |
| CSAT Score | Customer Satisfaction score (1-5) | (actual / target) * 100, capped at 100 |
**Sub-weights within Engagement:**
- Support Ticket Volume: 20% (inverse -- fewer tickets is better)
- Meeting Attendance: 30%
- NPS Score: 25%
- CSAT Score: 25%
**Why 25% weight:** Engagement signals complement usage data. A customer who attends meetings but does not use the product may be in an evaluation phase. A customer who uses the product but skips meetings may be becoming self-sufficient -- or disengaging.
### 3. Support (Weight: 20%)
Support health measures the quality of the customer's support experience, which directly impacts satisfaction and renewal likelihood.
| Metric | Definition | Scoring Method |
|--------|-----------|----------------|
| Open Tickets | Number of currently unresolved tickets | Inverse score: (1 - actual/max) * 100 |
| Escalation Rate | Percentage of tickets escalated | Inverse score: (1 - actual/max) * 100 |
| Avg Resolution Time | Average hours to resolve tickets | Inverse score: (1 - actual/max) * 100 |
**Sub-weights within Support:**
- Open Tickets: 35%
- Escalation Rate: 35%
- Resolution Time: 30%
**Why 20% weight:** Support issues are lagging indicators -- they tell you there is already a problem. However, unresolved support issues are a strong predictor of churn, especially when combined with declining engagement.
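The inverse scoring method used for Support metrics can be sketched the same way. Flooring the result at 0 when a value exceeds its maximum is an assumption (the table does not say what happens in that case), and the maxima in the example are hypothetical.

```python
SUPPORT_WEIGHTS = {
    "open_tickets": 0.35,
    "escalation_rate": 0.35,
    "avg_resolution_hours": 0.30,
}


def score_inverse(actual, maximum):
    """(1 - actual/max) * 100, so lower raw values score higher.

    Floored at 0 when actual exceeds maximum (an assumption).
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return max(0.0, (1.0 - actual / maximum) * 100.0)


def support_score(metrics, maxima):
    """Weighted sum of the three support sub-scores (0-100)."""
    return sum(
        weight * score_inverse(metrics[name], maxima[name])
        for name, weight in SUPPORT_WEIGHTS.items()
    )


# Metric values shaped like the sample data's "support" block.
metrics = {"open_tickets": 3, "escalation_rate": 0.10, "avg_resolution_hours": 30}
# Hypothetical maxima for illustration.
maxima = {"open_tickets": 10, "escalation_rate": 0.25, "avg_resolution_hours": 72}
print(f"{support_score(metrics, maxima):.1f}")  # 63.0
```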
### 4. Relationship (Weight: 25%)
Relationship health measures the strength and depth of the human connection between the customer and your organisation.
| Metric | Definition | Scoring Method |
|--------|-----------|----------------|
| Executive Sponsor Engagement | Engagement level of exec sponsor (0-100) | (actual / target) * 100, capped at 100 |
| Multi-Threading Depth | Number of stakeholder contacts | (actual / target) * 100, capped at 100 |
| Renewal Sentiment | Qualitative sentiment assessment | Mapped to score: positive=100, neutral=60, negative=20, unknown=50 |
**Sub-weights within Relationship:**
- Executive Sponsor Engagement: 35%
- Multi-Threading Depth: 30%
- Renewal Sentiment: 35%
**Why 25% weight:** Relationship strength is the most important defence against competitive displacement. A customer with strong relationships will give you more chances to fix problems. A customer with weak relationships may leave without warning.
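A minimal sketch of the Relationship dimension follows, assuming the capped ratio and sentiment mapping from the table above. The function names, the argument layout, and the choice to treat an unrecognised sentiment label as `unknown` are assumptions made for this example.

```python
from typing import Dict

# Sentiment mapping from the scoring table.
SENTIMENT_SCORES: Dict[str, float] = {
    "positive": 100.0,
    "neutral": 60.0,
    "negative": 20.0,
    "unknown": 50.0,
}

# Sub-weights from the "Sub-weights within Relationship" list.
RELATIONSHIP_WEIGHTS: Dict[str, float] = {
    "exec_sponsor": 0.35,
    "multi_threading": 0.30,
    "renewal_sentiment": 0.35,
}


def capped_ratio(actual: float, target: float) -> float:
    """(actual / target) * 100, capped at 100."""
    if target <= 0:
        return 0.0
    return min(100.0, actual / target * 100.0)


def relationship_score(
    exec_engagement: float,
    exec_target: float,
    contacts: int,
    contacts_target: int,
    sentiment: str,
) -> float:
    """Weighted Relationship score on a 0-100 scale."""
    # Assumption: labels outside the mapping are scored as "unknown".
    sentiment_score = SENTIMENT_SCORES.get(sentiment.lower(), SENTIMENT_SCORES["unknown"])
    total = (
        capped_ratio(exec_engagement, exec_target) * RELATIONSHIP_WEIGHTS["exec_sponsor"]
        + capped_ratio(contacts, contacts_target) * RELATIONSHIP_WEIGHTS["multi_threading"]
        + sentiment_score * RELATIONSHIP_WEIGHTS["renewal_sentiment"]
    )
    return round(total, 1)


print(relationship_score(40, 80, 6, 4, "neutral"))
```

In this example the six stakeholder contacts exceed the target of four, so that metric is capped at 100 instead of inflating the dimension score.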
---
## Classification Thresholds
### Standard Thresholds
| Classification | Score Range | Meaning | Action |
|---------------|-------------|---------|--------|
| Green | 75-100 | Customer is healthy and achieving value | Standard cadence, focus on expansion |
| Yellow | 50-74 | Customer needs attention | Increase touch frequency, investigate root causes |
| Red | 0-49 | Customer is at risk | Immediate intervention, create save plan |
### Segment-Adjusted Thresholds
Enterprise customers typically have higher expectations and more complex deployments, which means a higher bar for "healthy." SMB customers may have simpler use cases and lower engagement expectations.
| Segment | Green Threshold | Yellow Threshold | Red Threshold |
|---------|----------------|------------------|---------------|
| Enterprise | 75-100 | 50-74 | 0-49 |
| Mid-Market | 70-100 | 45-69 | 0-44 |
| SMB | 65-100 | 40-64 | 0-39 |
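The segment-adjusted table can be expressed as a small lookup, sketched here with hypothetical names. Comparing against each band's lower bound, highest band first, is a choice made for this example so that fractional scores such as 74.5 do not fall between the integer ranges.

```python
from typing import Dict

# Lower bounds taken from the segment-adjusted thresholds table.
SEGMENT_THRESHOLDS: Dict[str, Dict[str, float]] = {
    "enterprise": {"green": 75, "yellow": 50},
    "mid-market": {"green": 70, "yellow": 45},
    "smb": {"green": 65, "yellow": 40},
}


def classify(score: float, segment: str) -> str:
    """Return 'green', 'yellow', or 'red' for a 0-100 health score."""
    thresholds = SEGMENT_THRESHOLDS.get(segment.lower())
    if thresholds is None:
        raise ValueError(f"Unknown segment: {segment}")
    if score >= thresholds["green"]:
        return "green"
    if score >= thresholds["yellow"]:
        return "yellow"
    return "red"


# The same score lands in different bands depending on segment.
for segment in SEGMENT_THRESHOLDS:
    print(segment, classify(72, segment))
```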
### Segment-Specific Benchmarks
Each metric target is calibrated per segment. Enterprise customers are expected to have higher login frequency, attendance, and sponsor engagement. SMB customers have lower targets but still meaningful thresholds.
**Example Calibration:**
- Enterprise login frequency target: 90% (high-touch, deeply embedded)
- Mid-Market login frequency target: 80% (balanced engagement)
- SMB login frequency target: 70% (self-serve oriented)
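To show how these targets change the resulting score, the sketch below applies the capped ratio `(actual / target) * 100` to the three example targets. Using that method for login frequency is an assumption for this illustration, as is the function name.

```python
from typing import Dict

# Login frequency targets from the example calibration (percent).
LOGIN_TARGETS: Dict[str, float] = {
    "enterprise": 90.0,
    "mid-market": 80.0,
    "smb": 70.0,
}


def login_frequency_score(actual_pct: float, segment: str) -> float:
    """Score = (actual / target) * 100, capped at 100, using the segment target."""
    target = LOGIN_TARGETS[segment.lower()]
    return round(min(100.0, actual_pct / target * 100.0), 1)


# The same 72% login frequency scores differently per segment.
for segment in LOGIN_TARGETS:
    print(segment, login_frequency_score(72.0, segment))
```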
---
## Trend Analysis
A single health score snapshot is useful. A health score trend is actionable.
### Trend Classification
| Trend | Criteria | Implication |
|-------|----------|-------------|
| Improving | Current > Previous by 5+ points | Positive trajectory, reinforce what is working |
| Stable | Change of less than 5 points in either direction | Maintain current approach |
| Declining | Current < Previous by 5+ points | Investigate and intervene |
| No Data | No previous period available | Establish baseline |
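The trend rules might be implemented along these lines. This sketch reads "5+ points" as including a change of exactly 5 points, which is one interpretation; the function name and the `threshold` parameter are hypothetical.

```python
from typing import Optional


def classify_trend(
    current: float, previous: Optional[float], threshold: float = 5.0
) -> str:
    """Classify the period-over-period change in a health score."""
    if previous is None:
        return "no_data"  # no previous period: establish a baseline
    delta = current - previous
    if delta >= threshold:
        return "improving"
    if delta <= -threshold:
        return "declining"
    return "stable"


print(classify_trend(68, 75))    # a 7-point drop
print(classify_trend(71, 68))    # a 3-point rise
print(classify_trend(71, None))  # first scoring period
```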
### Trend Priority Matrix
| Current Score | Trend | Priority |
|--------------|-------|----------|
| Green | Declining | HIGH -- intervene before it drops further |
| Yellow | Declining | CRITICAL -- trajectory leads to Red |
| Yellow | Improving | MEDIUM -- reinforce positive momentum |
| Red | Improving | HIGH -- support the recovery |
| Red | Stable | CRITICAL -- needs new intervention approach |
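One way to encode the matrix is a dictionary keyed by classification and trend, as in this sketch. Combinations the matrix does not list return `None` here; that fallback and the function name are assumptions, not part of the framework.

```python
from typing import Dict, Optional, Tuple

# Rows of the trend priority matrix.
PRIORITY_MATRIX: Dict[Tuple[str, str], str] = {
    ("green", "declining"): "HIGH",
    ("yellow", "declining"): "CRITICAL",
    ("yellow", "improving"): "MEDIUM",
    ("red", "improving"): "HIGH",
    ("red", "stable"): "CRITICAL",
}


def trend_priority(classification: str, trend: str) -> Optional[str]:
    """Look up the priority for a classification/trend pair, or None if unlisted."""
    return PRIORITY_MATRIX.get((classification.lower(), trend.lower()))


print(trend_priority("Yellow", "Declining"))
print(trend_priority("Green", "Stable"))
```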
---
## Calibration Guidelines
### When to Recalibrate
1. **After major product changes**: New features may change what "good usage" looks like
2. **Seasonal patterns**: Some industries have cyclical usage (retail holiday season, fiscal year end)
3. **Portfolio composition changes**: If you add many SMB customers, the overall averages shift
4. **After churn events**: Review whether the health score predicted the churn
### Calibration Process
1. Export health scores for all customers over the past 12 months
2. Identify all churn events in the same period
3. Calculate the average health score of churned customers 90, 60, and 30 days before churn
4. Adjust thresholds so that churned customers would have been classified as Yellow or Red at least 60 days before churn
5. Validate with a holdout set of recent data
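Steps 3 and 4 can be sketched as follows. The input shape (one dict per churned customer, keyed by days before churn) and the sample scores are made up purely for illustration; real data would come from the export in step 1.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical shape: {days_before_churn: health_score} per churned customer.
ChurnHistory = Dict[int, float]


def average_scores_before_churn(histories: List[ChurnHistory]) -> Dict[int, float]:
    """Average health score of churned customers 90, 60, and 30 days out."""
    return {days: round(mean(h[days] for h in histories), 1) for days in (90, 60, 30)}


def flagged_rate(
    histories: List[ChurnHistory], green_threshold: float, days_before: int = 60
) -> float:
    """Share of churned customers scoring below Green (Yellow or Red) at days_before."""
    if not histories:
        return 0.0
    flagged = sum(1 for h in histories if h[days_before] < green_threshold)
    return flagged / len(histories)


# Made-up sample data, for illustration only.
histories = [
    {90: 78.0, 60: 71.0, 30: 55.0},
    {90: 82.0, 60: 76.0, 30: 62.0},
    {90: 66.0, 60: 58.0, 30: 41.0},
    {90: 74.0, 60: 69.0, 30: 48.0},
]

print(average_scores_before_churn(histories))
for threshold in (75, 80):
    print(threshold, flagged_rate(histories, threshold))
```

Raising the Green threshold until the flagged rate is acceptable also pushes more healthy customers into Yellow, which is why step 5 validates against a holdout set.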
### Common Calibration Pitfalls
- **Threshold creep**: Gradually lowering Green thresholds to make the portfolio look healthier
- **Over-weighting lagging indicators**: Support metrics react after the damage is done
- **Ignoring segment differences**: Using one threshold for all segments
- **Sentiment bias**: Over-relying on subjective renewal sentiment
---
## Implementation Checklist
1. Define data sources for each metric (CRM, product analytics, support system)
2. Establish data refresh frequency (daily for usage, weekly for engagement)
3. Configure segment benchmarks for your customer base
4. Set initial thresholds using industry defaults (provided above)
5. Run a 30-day pilot with manual review of edge cases
6. Calibrate thresholds based on pilot results
7. Automate scoring and alerting
8. Review and recalibrate quarterly
---
**Last Updated:** February 2026
FILE:customer-success-manager/scripts/churn_risk_analyzer.py
#!/usr/bin/env python3
"""
Churn Risk Analyzer
Identifies at-risk customer accounts by scoring behavioral signals across
usage decline, engagement drop, support issues, relationship signals, and
commercial factors. Produces risk tiers with intervention playbooks and
time-to-renewal urgency multipliers.
Usage:
python churn_risk_analyzer.py customer_data.json
python churn_risk_analyzer.py customer_data.json --format json
"""
import argparse
import json
import sys
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
RISK_SIGNAL_WEIGHTS: Dict[str, float] = {
"usage_decline": 0.30,
"engagement_drop": 0.25,
"support_issues": 0.20,
"relationship_signals": 0.15,
"commercial_factors": 0.10,
}
RISK_TIERS: List[Dict[str, Any]] = [
{"name": "critical", "min": 80, "max": 100, "label": "CRITICAL", "action": "Immediate executive escalation"},
{"name": "high", "min": 60, "max": 79, "label": "HIGH", "action": "Urgent CSM intervention"},
{"name": "medium", "min": 40, "max": 59, "label": "MEDIUM", "action": "Proactive outreach"},
{"name": "low", "min": 0, "max": 39, "label": "LOW", "action": "Standard monitoring"},
]
WARNING_SEVERITY: Dict[str, int] = {
"critical": 4,
"high": 3,
"medium": 2,
"low": 1,
}
# Intervention playbooks per tier
INTERVENTION_PLAYBOOKS: Dict[str, List[str]] = {
"critical": [
"Schedule executive-to-executive call within 48 hours",
"Create detailed save plan with specific value milestones",
"Offer concessions or contract restructuring if needed",
"Assign dedicated rescue team (CSM + Solutions Engineer)",
"Daily internal stand-up on account status until stabilised",
"Prepare competitive displacement defence strategy",
],
"high": [
"Schedule urgent CSM call within 1 week",
"Conduct root cause analysis on declining metrics",
"Build 30-day recovery plan with measurable checkpoints",
"Re-engage executive sponsor for alignment meeting",
"Accelerate any pending feature requests or bug fixes",
"Increase touch frequency to weekly until improvement",
],
"medium": [
"Schedule proactive check-in within 2 weeks",
"Share relevant success stories and best practices",
"Propose training session or product walkthrough",
"Review current usage against success plan goals",
"Identify and address any unvoiced concerns",
"Bi-weekly monitoring until score improves to Low",
],
"low": [
"Maintain standard touch cadence",
"Share product updates and new feature announcements",
"Monitor health score trends monthly",
"Proactively share relevant industry insights",
"Prepare for upcoming renewal conversations (if within 90 days)",
],
}
SATISFACTION_TREND_SCORES: Dict[str, float] = {
"improving": 10.0,
"stable": 30.0,
"declining": 70.0,
"critical": 95.0,
}
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator / denominator, or *default* when denominator is zero."""
    if denominator == 0:
        return default
    return numerator / denominator


def clamp(value: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Clamp *value* between *lo* and *hi*."""
    return max(lo, min(hi, value))


def days_until(date_str: Optional[str]) -> Optional[int]:
    """Return whole days from today until *date_str* (ISO format), or None.

    Dates in the past are reported as 0 days remaining.
    """
    if not date_str:
        return None
    try:
        target = datetime.strptime(date_str[:10], "%Y-%m-%d").date()
    except (ValueError, TypeError):
        return None
    # Compare calendar dates so the current time of day does not shave a day off.
    delta = (target - datetime.now().date()).days
    return max(delta, 0)


def renewal_urgency_multiplier(days_remaining: Optional[int]) -> float:
    """Return a multiplier (1.0 - 1.5) based on proximity to renewal.

    Closer renewals amplify the risk score.
    """
    if days_remaining is None:
        return 1.0
    if days_remaining <= 30:
        return 1.5
    elif days_remaining <= 60:
        return 1.35
    elif days_remaining <= 90:
        return 1.2
    elif days_remaining <= 180:
        return 1.1
    return 1.0


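def get_risk_tier(score: float) -> Dict[str, Any]:
    """Return the risk tier dict matching the score.

    Tiers are matched on their lower bound, highest first, so fractional
    scores such as 79.5 land in the tier below instead of falling into the
    gap between integer ranges and defaulting to low.
    """
    for tier in sorted(RISK_TIERS, key=lambda t: t["min"], reverse=True):
        if score >= tier["min"]:
            return tier
    return RISK_TIERS[-1]  # default to low

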
# ---------------------------------------------------------------------------
# Signal Scoring
# ---------------------------------------------------------------------------
def score_usage_decline(data: Dict[str, Any]) -> Tuple[float, List[Dict[str, str]]]:
"""Score usage decline signals (0-100, higher = more risk)."""
warnings: List[Dict[str, str]] = []
login_trend = data.get("login_trend", 0) # negative = decline
feature_change = data.get("feature_adoption_change", 0)
dau_mau_change = data.get("dau_mau_change", 0)
# Convert declines to risk scores (0-100)
    login_risk = clamp(abs(min(login_trend, 0)) * 3.0)  # about -33.3% => 100
feature_risk = clamp(abs(min(feature_change, 0)) * 4.0) # -25% => 100
dau_mau_risk = clamp(abs(min(dau_mau_change, 0)) * 500) # -0.20 => 100
score = round(login_risk * 0.40 + feature_risk * 0.35 + dau_mau_risk * 0.25, 1)
if login_trend <= -20:
warnings.append({"severity": "critical", "signal": f"Login frequency dropped {abs(login_trend)}%"})
elif login_trend <= -10:
warnings.append({"severity": "high", "signal": f"Login frequency declined {abs(login_trend)}%"})
elif login_trend < -5:
warnings.append({"severity": "medium", "signal": f"Login frequency dipping {abs(login_trend)}%"})
if feature_change <= -15:
warnings.append({"severity": "high", "signal": f"Feature adoption dropped {abs(feature_change)}%"})
elif feature_change < -5:
warnings.append({"severity": "medium", "signal": f"Feature adoption declining {abs(feature_change)}%"})
if dau_mau_change <= -0.10:
warnings.append({"severity": "high", "signal": f"DAU/MAU ratio fell by {abs(dau_mau_change):.2f}"})
return score, warnings
def score_engagement_drop(data: Dict[str, Any]) -> Tuple[float, List[Dict[str, str]]]:
"""Score engagement drop signals (0-100, higher = more risk)."""
warnings: List[Dict[str, str]] = []
cancellations = data.get("meeting_cancellations", 0)
response_days = data.get("response_time_days", 1)
nps_change = data.get("nps_change", 0)
cancel_risk = clamp(cancellations * 25.0) # 4 cancellations => 100
response_risk = clamp((response_days - 1) * 15.0) # 1 day baseline; 7+ days => 90+
nps_risk = clamp(abs(min(nps_change, 0)) * 20.0) # -5 => 100
score = round(cancel_risk * 0.30 + response_risk * 0.35 + nps_risk * 0.35, 1)
if cancellations >= 3:
warnings.append({"severity": "critical", "signal": f"{cancellations} meeting cancellations -- customer disengaging"})
elif cancellations >= 2:
warnings.append({"severity": "high", "signal": f"{cancellations} meeting cancellations recently"})
if response_days >= 7:
warnings.append({"severity": "critical", "signal": f"Customer response time: {response_days} days -- going dark"})
elif response_days >= 4:
warnings.append({"severity": "high", "signal": f"Customer response time increasing: {response_days} days"})
if nps_change <= -4:
warnings.append({"severity": "critical", "signal": f"NPS dropped by {abs(nps_change)} points"})
elif nps_change <= -2:
warnings.append({"severity": "high", "signal": f"NPS declined by {abs(nps_change)} points"})
return score, warnings
def score_support_issues(data: Dict[str, Any]) -> Tuple[float, List[Dict[str, str]]]:
"""Score support-related risk signals (0-100, higher = more risk)."""
warnings: List[Dict[str, str]] = []
escalations = data.get("open_escalations", 0)
critical_unresolved = data.get("unresolved_critical", 0)
sat_trend = data.get("satisfaction_trend", "stable").lower()
esc_risk = clamp(escalations * 35.0) # 3 escalations => 100
critical_risk = clamp(critical_unresolved * 50.0) # 2 unresolved critical => 100
sat_risk = SATISFACTION_TREND_SCORES.get(sat_trend, 30.0)
score = round(esc_risk * 0.35 + critical_risk * 0.35 + sat_risk * 0.30, 1)
if critical_unresolved >= 2:
warnings.append({"severity": "critical", "signal": f"{critical_unresolved} unresolved critical support tickets"})
elif critical_unresolved >= 1:
warnings.append({"severity": "high", "signal": "Unresolved critical support ticket"})
if escalations >= 2:
warnings.append({"severity": "high", "signal": f"{escalations} open escalations"})
elif escalations >= 1:
warnings.append({"severity": "medium", "signal": "Open support escalation"})
if sat_trend == "critical":
warnings.append({"severity": "critical", "signal": "Support satisfaction at critical levels"})
elif sat_trend == "declining":
warnings.append({"severity": "high", "signal": "Support satisfaction trending down"})
return score, warnings
def score_relationship_signals(data: Dict[str, Any]) -> Tuple[float, List[Dict[str, str]]]:
"""Score relationship risk signals (0-100, higher = more risk)."""
warnings: List[Dict[str, str]] = []
risk_points = 0.0
champion_left = data.get("champion_left", False)
sponsor_change = data.get("sponsor_change", False)
competitor_mentions = data.get("competitor_mentions", 0)
if champion_left:
risk_points += 45.0
warnings.append({"severity": "critical", "signal": "Internal champion has left the organisation"})
if sponsor_change:
risk_points += 30.0
warnings.append({"severity": "high", "signal": "Executive sponsor change detected"})
if competitor_mentions >= 3:
risk_points += 35.0
warnings.append({"severity": "critical", "signal": f"Customer mentioned competitors {competitor_mentions} times"})
elif competitor_mentions >= 1:
risk_points += competitor_mentions * 12.0
warnings.append({"severity": "medium", "signal": f"Customer mentioned competitor {competitor_mentions} time(s)"})
score = clamp(risk_points)
return round(score, 1), warnings
def score_commercial_factors(data: Dict[str, Any]) -> Tuple[float, List[Dict[str, str]]]:
"""Score commercial risk factors (0-100, higher = more risk)."""
warnings: List[Dict[str, str]] = []
risk_points = 0.0
contract_type = data.get("contract_type", "annual").lower()
pricing_complaints = data.get("pricing_complaints", False)
budget_cuts = data.get("budget_cuts_mentioned", False)
if contract_type == "month-to-month":
risk_points += 30.0
warnings.append({"severity": "medium", "signal": "Month-to-month contract -- low switching cost"})
elif contract_type == "quarterly":
risk_points += 15.0
if pricing_complaints:
risk_points += 35.0
warnings.append({"severity": "high", "signal": "Customer has raised pricing complaints"})
if budget_cuts:
risk_points += 40.0
warnings.append({"severity": "high", "signal": "Customer mentioned budget cuts or cost reduction"})
score = clamp(risk_points)
return round(score, 1), warnings
# ---------------------------------------------------------------------------
# Main Analysis
# ---------------------------------------------------------------------------
def analyse_churn_risk(customer: Dict[str, Any]) -> Dict[str, Any]:
"""Analyse churn risk for a single customer."""
usage_score, usage_warnings = score_usage_decline(customer.get("usage_decline", {}))
engagement_score, engagement_warnings = score_engagement_drop(customer.get("engagement_drop", {}))
support_score, support_warnings = score_support_issues(customer.get("support_issues", {}))
relationship_score, relationship_warnings = score_relationship_signals(customer.get("relationship_signals", {}))
commercial_score, commercial_warnings = score_commercial_factors(customer.get("commercial_factors", {}))
# Weighted raw score
raw_score = (
usage_score * RISK_SIGNAL_WEIGHTS["usage_decline"]
+ engagement_score * RISK_SIGNAL_WEIGHTS["engagement_drop"]
+ support_score * RISK_SIGNAL_WEIGHTS["support_issues"]
+ relationship_score * RISK_SIGNAL_WEIGHTS["relationship_signals"]
+ commercial_score * RISK_SIGNAL_WEIGHTS["commercial_factors"]
)
# Apply renewal urgency multiplier
remaining = days_until(customer.get("contract_end_date"))
multiplier = renewal_urgency_multiplier(remaining)
adjusted_score = clamp(round(raw_score * multiplier, 1))
tier = get_risk_tier(adjusted_score)
# Collect and sort warnings by severity
all_warnings = usage_warnings + engagement_warnings + support_warnings + relationship_warnings + commercial_warnings
all_warnings.sort(key=lambda w: WARNING_SEVERITY.get(w["severity"], 0), reverse=True)
playbook = INTERVENTION_PLAYBOOKS.get(tier["name"], [])
return {
"customer_id": customer.get("customer_id", "unknown"),
"name": customer.get("name", "Unknown"),
"segment": customer.get("segment", "unknown"),
"arr": customer.get("arr", 0),
"risk_score": adjusted_score,
"raw_score": round(raw_score, 1),
"risk_tier": tier["name"],
"risk_label": tier["label"],
"urgency_multiplier": multiplier,
"days_to_renewal": remaining,
"signal_scores": {
"usage_decline": {"score": usage_score, "weight": "30%"},
"engagement_drop": {"score": engagement_score, "weight": "25%"},
"support_issues": {"score": support_score, "weight": "20%"},
"relationship_signals": {"score": relationship_score, "weight": "15%"},
"commercial_factors": {"score": commercial_score, "weight": "10%"},
},
"warning_signals": all_warnings,
"recommended_actions": playbook,
}
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text(results: List[Dict[str, Any]]) -> str:
"""Format results as human-readable text."""
lines: List[str] = []
lines.append("=" * 72)
lines.append("CHURN RISK ANALYSIS REPORT")
lines.append("=" * 72)
lines.append("")
total = len(results)
critical_count = sum(1 for r in results if r["risk_tier"] == "critical")
high_count = sum(1 for r in results if r["risk_tier"] == "high")
medium_count = sum(1 for r in results if r["risk_tier"] == "medium")
low_count = sum(1 for r in results if r["risk_tier"] == "low")
total_arr_at_risk = sum(r["arr"] for r in results if r["risk_tier"] in ("critical", "high"))
lines.append(f"Portfolio Summary: {total} customers analysed")
lines.append(f" Critical Risk: {critical_count}")
lines.append(f" High Risk: {high_count}")
lines.append(f" Medium Risk: {medium_count}")
lines.append(f" Low Risk: {low_count}")
    lines.append(f" ARR at Risk (Critical + High): ${total_arr_at_risk:,.0f}")
lines.append("")
# Sort by risk score descending
sorted_results = sorted(results, key=lambda r: r["risk_score"], reverse=True)
for r in sorted_results:
lines.append("-" * 72)
lines.append(f"Customer: {r['name']} ({r['customer_id']})")
        lines.append(f"Segment: {r['segment'].title()} | ARR: ${r['arr']:,.0f}")
renewal_str = f"{r['days_to_renewal']} days" if r["days_to_renewal"] is not None else "N/A"
lines.append(f"Risk Score: {r['risk_score']}/100 [{r['risk_label']}] | Renewal: {renewal_str}")
if r["urgency_multiplier"] > 1.0:
lines.append(f" ** Urgency multiplier applied: {r['urgency_multiplier']}x (renewal approaching)")
lines.append("")
lines.append(" Signal Scores:")
for signal_name, signal_data in r["signal_scores"].items():
display_name = signal_name.replace("_", " ").title()
lines.append(f" {display_name:25s} {signal_data['score']:6.1f}/100 ({signal_data['weight']})")
if r["warning_signals"]:
lines.append("")
lines.append(" Warning Signals:")
for w in r["warning_signals"]:
severity_tag = w["severity"].upper()
lines.append(f" [{severity_tag}] {w['signal']}")
if r["recommended_actions"]:
lines.append("")
lines.append(" Recommended Actions:")
for i, action in enumerate(r["recommended_actions"], 1):
lines.append(f" {i}. {action}")
lines.append("")
lines.append("=" * 72)
return "\n".join(lines)
def format_json(results: List[Dict[str, Any]]) -> str:
"""Format results as JSON."""
total = len(results)
output = {
"report": "churn_risk_analysis",
"summary": {
"total_customers": total,
"critical_count": sum(1 for r in results if r["risk_tier"] == "critical"),
"high_count": sum(1 for r in results if r["risk_tier"] == "high"),
"medium_count": sum(1 for r in results if r["risk_tier"] == "medium"),
"low_count": sum(1 for r in results if r["risk_tier"] == "low"),
"total_arr_at_risk": sum(r["arr"] for r in results if r["risk_tier"] in ("critical", "high")),
},
"customers": sorted(results, key=lambda r: r["risk_score"], reverse=True),
}
return json.dumps(output, indent=2)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description="Analyse churn risk with behavioral signal detection and intervention recommendations."
)
parser.add_argument("input_file", help="Path to JSON file containing customer data")
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
dest="output_format",
help="Output format (default: text)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input_file}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input_file}: {e}", file=sys.stderr)
sys.exit(1)
customers = data.get("customers", [])
if not customers:
print("Error: No customer records found in input file.", file=sys.stderr)
sys.exit(1)
results = [analyse_churn_risk(c) for c in customers]
if args.output_format == "json":
print(format_json(results))
else:
print(format_text(results))
if __name__ == "__main__":
main()
FILE:customer-success-manager/scripts/expansion_opportunity_scorer.py
#!/usr/bin/env python3
"""
Expansion Opportunity Scorer
Analyses customer product adoption depth, maps whitespace for unused
features/products, estimates revenue opportunities, and prioritises
expansion plays by effort vs impact.
Usage:
python expansion_opportunity_scorer.py customer_data.json
python expansion_opportunity_scorer.py customer_data.json --format json
"""
import argparse
import json
import sys
from typing import Any, Dict, List, Optional, Tuple
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
# Tier pricing multipliers (relative to current plan price)
TIER_UPLIFT: Dict[str, float] = {
"starter": 1.0,
"professional": 1.8,
"enterprise": 3.0,
"enterprise_plus": 4.5,
}
# Module revenue estimates as a fraction of base ARR
MODULE_REVENUE_FRACTION: Dict[str, float] = {
"core_platform": 0.00, # Already included in base
"analytics_module": 0.15,
"integrations_module": 0.12,
"api_access": 0.10,
"advanced_reporting": 0.18,
"security_module": 0.20,
"automation_module": 0.15,
"collaboration_module": 0.10,
"data_export": 0.08,
"custom_workflows": 0.22,
"sso_module": 0.08,
"audit_module": 0.10,
}
# Effort classification for different expansion types
EFFORT_MAP: Dict[str, str] = {
"upsell_tier": "medium",
"cross_sell_module": "low",
"seat_expansion": "low",
"department_expansion": "high",
}
# Usage thresholds for recommendations
HIGH_USAGE_THRESHOLD = 75 # % usage indicates readiness for more
LOW_ADOPTION_THRESHOLD = 30 # % usage is too low to push expansion there
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Return numerator / denominator, or *default* when denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def clamp(value: float, lo: float = 0.0, hi: float = 100.0) -> float:
"""Clamp *value* between *lo* and *hi*."""
return max(lo, min(hi, value))
def estimate_seat_expansion_revenue(
arr: float, licensed: int, active: int, segment: str
) -> Tuple[float, str]:
"""Estimate revenue from seat expansion.
Returns (estimated_revenue, rationale).
"""
utilisation = safe_divide(active, licensed)
if utilisation >= 0.90:
# Near capacity -- likely needs more seats
growth_factor = {"enterprise": 0.25, "mid-market": 0.20, "smb": 0.15}
factor = growth_factor.get(segment.lower(), 0.15)
revenue = round(arr * factor, 0)
return revenue, f"Seat utilisation at {utilisation:.0%} -- likely needs {int(licensed * factor)} additional seats"
return 0.0, f"Seat utilisation at {utilisation:.0%} -- not yet at expansion threshold"
def estimate_tier_upgrade_revenue(
    arr: float, current_tier: str, available_tiers: List[str]
) -> Tuple[float, Optional[str], str]:
    """Estimate revenue from upgrading to the next tier up.

    Returns (estimated_revenue, target_tier, rationale).
    """
    current_mult = TIER_UPLIFT.get(current_tier.lower(), 1.0)
    # Candidate tiers are those priced above the current plan.
    higher_tiers = [
        tier for tier in available_tiers
        if TIER_UPLIFT.get(tier.lower(), 1.0) > current_mult
    ]
    if not higher_tiers:
        return 0.0, None, "Already on highest tier"
    # Pick the next tier up (lowest multiplier above the current one),
    # regardless of the order in which the tiers are listed.
    best_tier = min(higher_tiers, key=lambda t: TIER_UPLIFT.get(t.lower(), 1.0))
    tier_mult = TIER_UPLIFT.get(best_tier.lower(), 1.0)
    # Incremental ARR: rescale current ARR to the target tier's multiplier.
    base_arr = safe_divide(arr, current_mult)
    incremental = round(base_arr * tier_mult - arr, 0)
    rationale = f"Upgrade from {current_tier} to {best_tier} adds ${incremental:,.0f} ARR"
    return incremental, best_tier, rationale


def estimate_module_revenue(
arr: float, product_usage: Dict[str, Dict[str, Any]]
) -> List[Dict[str, Any]]:
"""Identify cross-sell opportunities from unadopted modules.
Returns list of opportunity dicts.
"""
opportunities: List[Dict[str, Any]] = []
for module_name, module_data in product_usage.items():
adopted = module_data.get("adopted", False)
usage_pct = module_data.get("usage_pct", 0)
fraction = MODULE_REVENUE_FRACTION.get(module_name.lower(), 0.10)
if not adopted and fraction > 0:
revenue = round(arr * fraction, 0)
            opportunities.append({
                "module": module_name,
                "type": "cross_sell",
                "estimated_revenue": revenue,
                "effort": "low",
                "rationale": f"Module not adopted -- ${revenue:,.0f} potential ARR",
            })
elif adopted and usage_pct < LOW_ADOPTION_THRESHOLD and fraction > 0:
# Already adopted but underutilised -- focus on enablement, not expansion
pass # Skip -- needs enablement, not a sales motion
return opportunities
def estimate_department_expansion_revenue(
arr: float,
current_departments: List[str],
potential_departments: List[str],
segment: str,
) -> List[Dict[str, Any]]:
"""Estimate revenue from expanding to new departments."""
opportunities: List[Dict[str, Any]] = []
current_set = {d.lower() for d in current_departments}
per_dept_estimate = safe_divide(arr, max(len(current_departments), 1))
for dept in potential_departments:
if dept.lower() not in current_set:
# Estimate each new department at the average per-department ARR
revenue = round(per_dept_estimate * 0.8, 0) # Slight discount for new dept
            opportunities.append({
                "department": dept,
                "type": "expansion",
                "estimated_revenue": revenue,
                "effort": "high",
                "rationale": f"Expand to {dept} department -- est. ${revenue:,.0f} ARR",
            })
return opportunities
# ---------------------------------------------------------------------------
# Priority Scoring
# ---------------------------------------------------------------------------
def priority_score(revenue: float, effort: str) -> float:
"""Calculate priority score (higher = better).
Favours high revenue with low effort.
"""
effort_multiplier = {"low": 3.0, "medium": 2.0, "high": 1.0}
mult = effort_multiplier.get(effort.lower(), 1.0)
# Normalise revenue to a 0-100 scale (assume max single opportunity is $200k)
rev_score = clamp(safe_divide(revenue, 2000.0)) # $200k => 100
return round(rev_score * mult, 1)
# ---------------------------------------------------------------------------
# Main Analysis
# ---------------------------------------------------------------------------
def analyse_expansion(customer: Dict[str, Any]) -> Dict[str, Any]:
"""Analyse expansion opportunities for a single customer."""
arr = customer.get("arr", 0)
segment = customer.get("segment", "mid-market").lower()
contract = customer.get("contract", {})
product_usage = customer.get("product_usage", {})
departments = customer.get("departments", {})
all_opportunities: List[Dict[str, Any]] = []
# 1. Seat expansion
licensed = contract.get("licensed_seats", 0)
active = contract.get("active_seats", 0)
seat_rev, seat_rationale = estimate_seat_expansion_revenue(arr, licensed, active, segment)
if seat_rev > 0:
all_opportunities.append({
"type": "expansion",
"category": "seat_expansion",
"estimated_revenue": seat_rev,
"effort": "low",
"rationale": seat_rationale,
"priority_score": priority_score(seat_rev, "low"),
})
# 2. Tier upgrade
current_tier = contract.get("plan_tier", "").lower()
available_tiers = contract.get("available_tiers", [])
tier_rev, target_tier, tier_rationale = estimate_tier_upgrade_revenue(arr, current_tier, available_tiers)
if tier_rev > 0 and target_tier:
all_opportunities.append({
"type": "upsell",
"category": "tier_upgrade",
"target_tier": target_tier,
"estimated_revenue": tier_rev,
"effort": "medium",
"rationale": tier_rationale,
"priority_score": priority_score(tier_rev, "medium"),
})
# 3. Module cross-sell
module_opps = estimate_module_revenue(arr, product_usage)
for opp in module_opps:
opp["category"] = "module_cross_sell"
opp["priority_score"] = priority_score(opp["estimated_revenue"], opp["effort"])
all_opportunities.append(opp)
# 4. Department expansion
current_depts = departments.get("current", [])
potential_depts = departments.get("potential", [])
dept_opps = estimate_department_expansion_revenue(arr, current_depts, potential_depts, segment)
for opp in dept_opps:
opp["category"] = "department_expansion"
opp["priority_score"] = priority_score(opp["estimated_revenue"], opp["effort"])
all_opportunities.append(opp)
# Sort by priority score descending
all_opportunities.sort(key=lambda o: o["priority_score"], reverse=True)
# Adoption depth summary
total_modules = len(product_usage)
adopted_modules = sum(1 for m in product_usage.values() if m.get("adopted", False))
avg_usage = round(
safe_divide(
sum(m.get("usage_pct", 0) for m in product_usage.values() if m.get("adopted", False)),
max(adopted_modules, 1),
),
1,
)
total_estimated_revenue = sum(o["estimated_revenue"] for o in all_opportunities)
return {
"customer_id": customer.get("customer_id", "unknown"),
"name": customer.get("name", "Unknown"),
"segment": segment,
"arr": arr,
"adoption_summary": {
"total_modules": total_modules,
"adopted_modules": adopted_modules,
"adoption_rate": round(safe_divide(adopted_modules, total_modules) * 100, 1) if total_modules > 0 else 0,
"avg_usage_pct": avg_usage,
"seat_utilisation": round(safe_divide(active, max(licensed, 1)) * 100, 1),
"current_tier": current_tier,
"departments_covered": len(current_depts),
"departments_potential": len(potential_depts),
},
"total_estimated_revenue": round(total_estimated_revenue, 0),
"opportunity_count": len(all_opportunities),
"opportunities": all_opportunities,
}
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text(results: List[Dict[str, Any]]) -> str:
"""Format results as human-readable text."""
lines: List[str] = []
lines.append("=" * 72)
lines.append("EXPANSION OPPORTUNITY REPORT")
lines.append("=" * 72)
lines.append("")
total_rev = sum(r["total_estimated_revenue"] for r in results)
total_opps = sum(r["opportunity_count"] for r in results)
lines.append(f"Portfolio Summary: {len(results)} customers")
    lines.append(f" Total Expansion Revenue Potential: ${total_rev:,.0f}")
lines.append(f" Total Opportunities Identified: {total_opps}")
lines.append("")
# Sort customers by total estimated revenue descending
sorted_results = sorted(results, key=lambda r: r["total_estimated_revenue"], reverse=True)
for r in sorted_results:
lines.append("-" * 72)
lines.append(f"Customer: {r['name']} ({r['customer_id']})")
lines.append(f"Segment: {r['segment'].title()} | Current ARR: ${r['arr']:,.0f}")
lines.append(f"Total Expansion Potential: ${r['total_estimated_revenue']:,.0f} ({r['opportunity_count']} opportunities)")
lines.append("")
adoption = r["adoption_summary"]
lines.append(" Adoption Summary:")
lines.append(f" Modules Adopted: {adoption['adopted_modules']}/{adoption['total_modules']} ({adoption['adoption_rate']}%)")
lines.append(f" Avg Module Usage: {adoption['avg_usage_pct']}%")
lines.append(f" Seat Utilisation: {adoption['seat_utilisation']}%")
lines.append(f" Current Tier: {adoption['current_tier'].title()}")
lines.append(f" Departments: {adoption['departments_covered']} active, {adoption['departments_potential']} potential")
if r["opportunities"]:
lines.append("")
lines.append(" Opportunities (ranked by priority):")
for i, opp in enumerate(r["opportunities"], 1):
opp_type = opp.get("type", "unknown").title()
category = opp.get("category", "").replace("_", " ").title()
rev = opp["estimated_revenue"]
effort = opp.get("effort", "unknown").title()
pri = opp.get("priority_score", 0)
lines.append(f" {i}. [{opp_type}] {category}")
lines.append(f" Revenue: ${rev:,.0f} | Effort: {effort} | Priority: {pri}")
lines.append(f" {opp.get('rationale', '')}")
else:
lines.append("")
lines.append(" No expansion opportunities identified at this time.")
lines.append("")
lines.append("=" * 72)
return "\n".join(lines)
def format_json(results: List[Dict[str, Any]]) -> str:
"""Format results as JSON."""
total_rev = sum(r["total_estimated_revenue"] for r in results)
total_opps = sum(r["opportunity_count"] for r in results)
output = {
"report": "expansion_opportunities",
"summary": {
"total_customers": len(results),
"total_estimated_revenue": total_rev,
"total_opportunities": total_opps,
},
"customers": sorted(results, key=lambda r: r["total_estimated_revenue"], reverse=True),
}
return json.dumps(output, indent=2)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description="Score expansion opportunities with adoption analysis and revenue estimation."
)
parser.add_argument("input_file", help="Path to JSON file containing customer data")
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
dest="output_format",
help="Output format (default: text)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input_file}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input_file}: {e}", file=sys.stderr)
sys.exit(1)
customers = data.get("customers", [])
if not customers:
print("Error: No customer records found in input file.", file=sys.stderr)
sys.exit(1)
results = [analyse_expansion(c) for c in customers]
if args.output_format == "json":
print(format_json(results))
else:
print(format_text(results))
if __name__ == "__main__":
main()
FILE:customer-success-manager/scripts/health_score_calculator.py
#!/usr/bin/env python3
"""
Customer Health Score Calculator
Multi-dimensional weighted health scoring across usage, engagement, support,
and relationship dimensions. Produces Red/Yellow/Green classification with
trend analysis and segment-aware benchmarking.
Usage:
python health_score_calculator.py customer_data.json
python health_score_calculator.py customer_data.json --format json
"""
import argparse
import json
import sys
from typing import Any, Dict, List, Optional, Tuple
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
DIMENSION_WEIGHTS: Dict[str, float] = {
"usage": 0.30,
"engagement": 0.25,
"support": 0.20,
"relationship": 0.25,
}
# Segment-specific score ranges per classification, stored as (min, max).
# classify() only reads the lower bound of the green and yellow ranges.
SEGMENT_THRESHOLDS: Dict[str, Dict[str, Tuple[int, int]]] = {
"enterprise": {"green": (75, 100), "yellow": (50, 74), "red": (0, 49)},
"mid-market": {"green": (70, 100), "yellow": (45, 69), "red": (0, 44)},
"smb": {"green": (65, 100), "yellow": (40, 64), "red": (0, 39)},
}
# Benchmarks per segment for normalising raw metrics
SEGMENT_BENCHMARKS: Dict[str, Dict[str, Any]] = {
"enterprise": {
"login_frequency_target": 90,
"feature_adoption_target": 80,
"dau_mau_target": 0.50,
"support_ticket_volume_max": 5,
"meeting_attendance_target": 95,
"nps_target": 9,
"csat_target": 4.5,
"open_tickets_max": 10,
"escalation_rate_max": 0.25,
"avg_resolution_hours_max": 72,
"exec_sponsor_target": 90,
"multi_threading_target": 5,
},
"mid-market": {
"login_frequency_target": 80,
"feature_adoption_target": 70,
"dau_mau_target": 0.40,
"support_ticket_volume_max": 8,
"meeting_attendance_target": 85,
"nps_target": 8,
"csat_target": 4.0,
"open_tickets_max": 15,
"escalation_rate_max": 0.30,
"avg_resolution_hours_max": 96,
"exec_sponsor_target": 75,
"multi_threading_target": 3,
},
"smb": {
"login_frequency_target": 70,
"feature_adoption_target": 60,
"dau_mau_target": 0.30,
"support_ticket_volume_max": 10,
"meeting_attendance_target": 75,
"nps_target": 7,
"csat_target": 3.8,
"open_tickets_max": 20,
"escalation_rate_max": 0.40,
"avg_resolution_hours_max": 120,
"exec_sponsor_target": 60,
"multi_threading_target": 2,
},
}
RENEWAL_SENTIMENT_SCORES: Dict[str, float] = {
"positive": 100.0,
"neutral": 60.0,
"negative": 20.0,
"unknown": 50.0,
}
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Return numerator / denominator, or *default* when denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def clamp(value: float, lo: float = 0.0, hi: float = 100.0) -> float:
"""Clamp *value* between *lo* and *hi*."""
return max(lo, min(hi, value))
def get_benchmarks(segment: str) -> Dict[str, Any]:
"""Return benchmarks for the given segment, falling back to mid-market."""
return SEGMENT_BENCHMARKS.get(segment.lower(), SEGMENT_BENCHMARKS["mid-market"])
def get_thresholds(segment: str) -> Dict[str, Tuple[int, int]]:
"""Return classification thresholds for the given segment."""
return SEGMENT_THRESHOLDS.get(segment.lower(), SEGMENT_THRESHOLDS["mid-market"])
def classify(score: float, segment: str) -> str:
"""Return 'green', 'yellow', or 'red' classification."""
thresholds = get_thresholds(segment)
if score >= thresholds["green"][0]:
return "green"
elif score >= thresholds["yellow"][0]:
return "yellow"
return "red"
def trend_direction(current: float, previous: Optional[float]) -> str:
"""Return trend direction string."""
if previous is None:
return "no_data"
diff = current - previous
if diff > 5:
return "improving"
elif diff < -5:
return "declining"
return "stable"
# ---------------------------------------------------------------------------
# Dimension Scoring
# ---------------------------------------------------------------------------
def score_usage(data: Dict[str, Any], benchmarks: Dict[str, Any]) -> Tuple[float, List[str]]:
"""Score the usage dimension (0-100).
Metrics: login_frequency, feature_adoption, dau_mau_ratio.
"""
recommendations: List[str] = []
login = clamp(safe_divide(data.get("login_frequency", 0), benchmarks["login_frequency_target"]) * 100)
adoption = clamp(safe_divide(data.get("feature_adoption", 0), benchmarks["feature_adoption_target"]) * 100)
dau_mau = clamp(safe_divide(data.get("dau_mau_ratio", 0), benchmarks["dau_mau_target"]) * 100)
score = round(login * 0.35 + adoption * 0.40 + dau_mau * 0.25, 1)
if login < 60:
recommendations.append("Login frequency below target -- schedule product engagement session")
if adoption < 50:
recommendations.append("Feature adoption is low -- recommend guided feature walkthrough")
if dau_mau < 50:
recommendations.append("DAU/MAU ratio indicates shallow usage -- investigate stickiness barriers")
return score, recommendations
def score_engagement(data: Dict[str, Any], benchmarks: Dict[str, Any]) -> Tuple[float, List[str]]:
"""Score the engagement dimension (0-100).
Metrics: support_ticket_volume (inverse), meeting_attendance, nps_score, csat_score.
"""
recommendations: List[str] = []
# Lower ticket volume is better -- invert
ticket_vol = data.get("support_ticket_volume", 0)
ticket_score = clamp((1.0 - safe_divide(ticket_vol, benchmarks["support_ticket_volume_max"])) * 100)
attendance = clamp(safe_divide(data.get("meeting_attendance", 0), benchmarks["meeting_attendance_target"]) * 100)
nps_raw = data.get("nps_score", 5)
nps_score = clamp(safe_divide(nps_raw, benchmarks["nps_target"]) * 100)
csat_raw = data.get("csat_score", 3.0)
csat_score = clamp(safe_divide(csat_raw, benchmarks["csat_target"]) * 100)
score = round(ticket_score * 0.20 + attendance * 0.30 + nps_score * 0.25 + csat_score * 0.25, 1)
if attendance < 60:
recommendations.append("Meeting attendance is low -- re-evaluate meeting cadence and agenda value")
if nps_raw < 7:
recommendations.append("NPS below threshold -- conduct a feedback deep-dive with customer")
if csat_raw < 3.5:
recommendations.append("CSAT is critically low -- escalate to support leadership")
return score, recommendations
def score_support(data: Dict[str, Any], benchmarks: Dict[str, Any]) -> Tuple[float, List[str]]:
"""Score the support dimension (0-100).
Metrics: open_tickets (inverse), escalation_rate (inverse), avg_resolution_hours (inverse).
"""
recommendations: List[str] = []
open_tix = data.get("open_tickets", 0)
open_score = clamp((1.0 - safe_divide(open_tix, benchmarks["open_tickets_max"])) * 100)
esc_rate = data.get("escalation_rate", 0)
esc_score = clamp((1.0 - safe_divide(esc_rate, benchmarks["escalation_rate_max"])) * 100)
res_hours = data.get("avg_resolution_hours", 0)
res_score = clamp((1.0 - safe_divide(res_hours, benchmarks["avg_resolution_hours_max"])) * 100)
score = round(open_score * 0.35 + esc_score * 0.35 + res_score * 0.30, 1)
if open_tix > benchmarks["open_tickets_max"] * 0.5:
recommendations.append("Open ticket count elevated -- prioritise ticket resolution")
if esc_rate > benchmarks["escalation_rate_max"] * 0.5:
recommendations.append("Escalation rate too high -- review support process and training")
if res_hours > benchmarks["avg_resolution_hours_max"] * 0.5:
recommendations.append("Resolution time exceeds SLA target -- engage support leadership")
return score, recommendations
def score_relationship(data: Dict[str, Any], benchmarks: Dict[str, Any]) -> Tuple[float, List[str]]:
"""Score the relationship dimension (0-100).
Metrics: executive_sponsor_engagement, multi_threading_depth, renewal_sentiment.
"""
recommendations: List[str] = []
exec_score = clamp(safe_divide(data.get("executive_sponsor_engagement", 0), benchmarks["exec_sponsor_target"]) * 100)
threading = data.get("multi_threading_depth", 1)
thread_score = clamp(safe_divide(threading, benchmarks["multi_threading_target"]) * 100)
sentiment_str = data.get("renewal_sentiment", "unknown").lower()
sentiment_score = RENEWAL_SENTIMENT_SCORES.get(sentiment_str, 50.0)
score = round(exec_score * 0.35 + thread_score * 0.30 + sentiment_score * 0.35, 1)
if exec_score < 50:
recommendations.append("Executive sponsor engagement is weak -- schedule executive alignment meeting")
if threading < 2:
recommendations.append("Single-threaded relationship -- expand contacts across departments")
if sentiment_str == "negative":
recommendations.append("Renewal sentiment is negative -- initiate save plan immediately")
return score, recommendations
# ---------------------------------------------------------------------------
# Main Scoring
# ---------------------------------------------------------------------------
def calculate_health_score(customer: Dict[str, Any]) -> Dict[str, Any]:
"""Calculate the overall health score for a single customer."""
segment = customer.get("segment", "mid-market").lower()
benchmarks = get_benchmarks(segment)
# Score each dimension
usage_score, usage_recs = score_usage(customer.get("usage", {}), benchmarks)
engagement_score, engagement_recs = score_engagement(customer.get("engagement", {}), benchmarks)
support_score, support_recs = score_support(customer.get("support", {}), benchmarks)
relationship_score, relationship_recs = score_relationship(customer.get("relationship", {}), benchmarks)
# Weighted overall
overall = round(
usage_score * DIMENSION_WEIGHTS["usage"]
+ engagement_score * DIMENSION_WEIGHTS["engagement"]
+ support_score * DIMENSION_WEIGHTS["support"]
+ relationship_score * DIMENSION_WEIGHTS["relationship"],
1,
)
classification = classify(overall, segment)
# Trend analysis
prev = customer.get("previous_period", {})
trends = {
"usage": trend_direction(usage_score, prev.get("usage_score")),
"engagement": trend_direction(engagement_score, prev.get("engagement_score")),
"support": trend_direction(support_score, prev.get("support_score")),
"relationship": trend_direction(relationship_score, prev.get("relationship_score")),
}
overall_prev = prev.get("overall_score")
trends["overall"] = trend_direction(overall, overall_prev)
# Combine recommendations
all_recs = usage_recs + engagement_recs + support_recs + relationship_recs
return {
"customer_id": customer.get("customer_id", "unknown"),
"name": customer.get("name", "Unknown"),
"segment": segment,
"arr": customer.get("arr", 0),
"overall_score": overall,
"classification": classification,
"dimensions": {
"usage": {"score": usage_score, "weight": "30%", "classification": classify(usage_score, segment)},
"engagement": {"score": engagement_score, "weight": "25%", "classification": classify(engagement_score, segment)},
"support": {"score": support_score, "weight": "20%", "classification": classify(support_score, segment)},
"relationship": {"score": relationship_score, "weight": "25%", "classification": classify(relationship_score, segment)},
},
"trends": trends,
"recommendations": all_recs,
}
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
CLASSIFICATION_LABELS = {
"green": "HEALTHY",
"yellow": "NEEDS ATTENTION",
"red": "AT RISK",
}
def format_text(results: List[Dict[str, Any]]) -> str:
"""Format results as human-readable text."""
lines: List[str] = []
lines.append("=" * 72)
lines.append("CUSTOMER HEALTH SCORE REPORT")
lines.append("=" * 72)
lines.append("")
# Portfolio summary
total = len(results)
green_count = sum(1 for r in results if r["classification"] == "green")
yellow_count = sum(1 for r in results if r["classification"] == "yellow")
red_count = sum(1 for r in results if r["classification"] == "red")
avg_score = round(safe_divide(sum(r["overall_score"] for r in results), total), 1)
lines.append(f"Portfolio Summary: {total} customers")
lines.append(f" Average Health Score: {avg_score}/100")
lines.append(f" Green (Healthy): {green_count}")
lines.append(f" Yellow (Attention): {yellow_count}")
lines.append(f" Red (At Risk): {red_count}")
lines.append("")
for r in results:
label = CLASSIFICATION_LABELS.get(r["classification"], "UNKNOWN")
lines.append("-" * 72)
lines.append(f"Customer: {r['name']} ({r['customer_id']})")
lines.append(f"Segment: {r['segment'].title()} | ARR: ${r['arr']:,.0f}")
lines.append(f"Overall Score: {r['overall_score']}/100 [{label}]")
lines.append("")
lines.append(" Dimension Scores:")
for dim_name, dim_data in r["dimensions"].items():
dim_label = CLASSIFICATION_LABELS.get(dim_data["classification"], "")
lines.append(f" {dim_name.title():15s} {dim_data['score']:6.1f}/100 ({dim_data['weight']}) [{dim_label}]")
lines.append("")
lines.append(" Trends:")
for dim_name, direction in r["trends"].items():
arrow = {"improving": "+", "declining": "-", "stable": "=", "no_data": "?"}
lines.append(f" {dim_name.title():15s} {arrow.get(direction, '?')} {direction}")
if r["recommendations"]:
lines.append("")
lines.append(" Recommendations:")
for i, rec in enumerate(r["recommendations"], 1):
lines.append(f" {i}. {rec}")
lines.append("")
lines.append("=" * 72)
return "\n".join(lines)
def format_json(results: List[Dict[str, Any]]) -> str:
"""Format results as JSON."""
total = len(results)
output = {
"report": "customer_health_scores",
"summary": {
"total_customers": total,
"average_score": round(safe_divide(sum(r["overall_score"] for r in results), total), 1),
"green_count": sum(1 for r in results if r["classification"] == "green"),
"yellow_count": sum(1 for r in results if r["classification"] == "yellow"),
"red_count": sum(1 for r in results if r["classification"] == "red"),
},
"customers": results,
}
return json.dumps(output, indent=2)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description="Calculate multi-dimensional customer health scores with trend analysis."
)
parser.add_argument("input_file", help="Path to JSON file containing customer data")
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
dest="output_format",
help="Output format (default: text)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input_file}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input_file}: {e}", file=sys.stderr)
sys.exit(1)
customers = data.get("customers", [])
if not customers:
print("Error: No customer records found in input file.", file=sys.stderr)
sys.exit(1)
results = [calculate_health_score(c) for c in customers]
if args.output_format == "json":
print(format_json(results))
else:
print(format_text(results))
if __name__ == "__main__":
main()
FILE:revenue-operations/SKILL.md
---
name: "revenue-operations"
description: Analyzes sales pipeline health, revenue forecasting accuracy, and go-to-market efficiency metrics for SaaS revenue optimization. Use when analyzing sales pipeline coverage, forecasting revenue, evaluating go-to-market performance, reviewing sales metrics, assessing pipeline analysis, tracking forecast accuracy with MAPE, calculating GTM efficiency, or measuring sales efficiency and unit economics for SaaS teams.
---
# Revenue Operations
Pipeline analysis, forecast accuracy tracking, and GTM efficiency measurement for SaaS revenue teams.
> **Output formats:** All scripts support `--format text` (human-readable) and `--format json` (dashboards/integrations).
---
## Quick Start
```bash
# Analyze pipeline health and coverage
python scripts/pipeline_analyzer.py --input assets/sample_pipeline_data.json --format text
# Track forecast accuracy over multiple periods
python scripts/forecast_accuracy_tracker.py assets/sample_forecast_data.json --format text
# Calculate GTM efficiency metrics
python scripts/gtm_efficiency_calculator.py assets/sample_gtm_data.json --format text
```
---
## Tools Overview
### 1. Pipeline Analyzer
Analyzes sales pipeline health including coverage ratios, stage conversion rates, deal velocity, aging risks, and concentration risks.
**Input:** JSON file with deals, quota, and stage configuration
**Output:** Coverage ratios, conversion rates, velocity metrics, aging flags, risk assessment
**Usage:**
```bash
python scripts/pipeline_analyzer.py --input pipeline.json --format text
```
**Key Metrics Calculated:**
- **Pipeline Coverage Ratio** -- Total pipeline value / quota target (healthy: 3-4x)
- **Stage Conversion Rates** -- Stage-to-stage progression rates
- **Sales Velocity** -- (Opportunities x Avg Deal Size x Win Rate) / Avg Sales Cycle
- **Deal Aging** -- Flags deals exceeding 2x average cycle time per stage
- **Concentration Risk** -- Warns when >40% of pipeline is in a single deal
- **Coverage Gap Analysis** -- Identifies quarters with insufficient pipeline
**Input Schema:**
```json
{
"quota": 500000,
"stages": ["Discovery", "Qualification", "Proposal", "Negotiation", "Closed Won"],
"average_cycle_days": 45,
"deals": [
{
"id": "D001",
"name": "Acme Corp",
"stage": "Proposal",
"value": 85000,
"age_days": 32,
"close_date": "2025-03-15",
"owner": "rep_1"
}
]
}
```
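The coverage, velocity, and concentration formulas listed under Key Metrics can be sketched against this input schema. This is an illustrative sketch, not the code in `scripts/pipeline_analyzer.py`: the names `pipeline_snapshot` and `sales_velocity` are hypothetical, and treating every deal outside `Closed Won` as open pipeline is an assumption.

```python
from typing import Any, Dict, List


def pipeline_snapshot(data: Dict[str, Any], concentration_limit: float = 0.40) -> Dict[str, Any]:
    """Compute coverage ratio and single-deal concentration for open deals."""
    # Assumption: any deal not in "Closed Won" counts as open pipeline.
    open_deals: List[Dict[str, Any]] = [d for d in data["deals"] if d["stage"] != "Closed Won"]
    total_value = sum(d["value"] for d in open_deals)
    quota = data["quota"]

    coverage = round(total_value / quota, 2) if quota else 0.0
    # Flag deals that hold more than the limit share of open pipeline value.
    concentrated = [
        d["id"]
        for d in open_deals
        if total_value and d["value"] / total_value > concentration_limit
    ]
    return {
        "total_pipeline_value": total_value,
        "coverage_ratio": coverage,
        "meets_3x_target": coverage >= 3.0,
        "concentrated_deals": concentrated,
    }


def sales_velocity(opportunities: int, avg_deal_size: float, win_rate: float, avg_cycle_days: float) -> float:
    """(Opportunities x Avg Deal Size x Win Rate) / Avg Sales Cycle, per day."""
    if avg_cycle_days <= 0:
        return 0.0
    return opportunities * avg_deal_size * win_rate / avg_cycle_days


# Made-up sample data for illustration only.
sample = {
    "quota": 500000,
    "deals": [
        {"id": "D001", "stage": "Proposal", "value": 85000},
        {"id": "D002", "stage": "Negotiation", "value": 250000},
        {"id": "D003", "stage": "Closed Won", "value": 60000},
    ],
}
print(pipeline_snapshot(sample))
print(sales_velocity(10, 50000, 0.2, 40))
```

Win rate is passed as a fraction (0.2 for 20%), so the velocity result is in currency per day.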
### 2. Forecast Accuracy Tracker
Tracks forecast accuracy over time using MAPE, detects systematic bias, analyzes trends, and provides category-level breakdowns.
**Input:** JSON file with forecast periods and optional category breakdowns
**Output:** MAPE score, bias analysis, trends, category breakdown, accuracy rating
**Usage:**
```bash
python scripts/forecast_accuracy_tracker.py forecast_data.json --format text
```
**Key Metrics Calculated:**
- **MAPE** -- mean(|actual - forecast| / |actual|) x 100
- **Forecast Bias** -- Over-forecasting (positive) vs under-forecasting (negative) tendency
- **Weighted Accuracy** -- MAPE weighted by deal value for materiality
- **Period Trends** -- Improving, stable, or declining accuracy over time
- **Category Breakdown** -- Accuracy by rep, product, segment, or any custom dimension
**Accuracy Ratings:**
| Rating | MAPE Range | Interpretation |
|--------|-----------|----------------|
| Excellent | <10% | Highly predictable, data-driven process |
| Good | 10-15% | Reliable forecasting with minor variance |
| Fair | 15-25% | Needs process improvement |
| Poor | >25% | Significant forecasting methodology gaps |
**Input Schema:**
```json
{
"forecast_periods": [
{"period": "2025-Q1", "forecast": 480000, "actual": 520000},
{"period": "2025-Q2", "forecast": 550000, "actual": 510000}
],
"category_breakdowns": {
"by_rep": [
{"category": "Rep A", "forecast": 200000, "actual": 210000},
{"category": "Rep B", "forecast": 280000, "actual": 310000}
]
}
}
```
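As a rough illustration of the MAPE and bias definitions above, the following sketch works on the `forecast_periods` list from the schema. It is not the implementation in `scripts/forecast_accuracy_tracker.py`; the function names are hypothetical, skipping zero-actual periods is an assumption, and so is the handling of the exact 10%, 15%, and 25% boundaries, which the ratings table leaves open.

```python
from typing import Any, Dict, List


def mape(periods: List[Dict[str, Any]]) -> float:
    """mean(|actual - forecast| / |actual|) x 100, skipping zero actuals."""
    errors = [
        abs(p["actual"] - p["forecast"]) / abs(p["actual"])
        for p in periods
        if p["actual"] != 0
    ]
    return round(sum(errors) / len(errors) * 100, 2) if errors else 0.0


def bias_pct(periods: List[Dict[str, Any]]) -> float:
    """Signed mean error: positive = over-forecasting, negative = under-forecasting."""
    signed = [
        (p["forecast"] - p["actual"]) / abs(p["actual"])
        for p in periods
        if p["actual"] != 0
    ]
    return round(sum(signed) / len(signed) * 100, 2) if signed else 0.0


def accuracy_rating(mape_pct: float) -> str:
    """Map a MAPE value to a rating; boundary handling is an assumption."""
    if mape_pct < 10:
        return "Excellent"
    if mape_pct < 15:
        return "Good"
    if mape_pct <= 25:
        return "Fair"
    return "Poor"


periods = [
    {"period": "2025-Q1", "forecast": 480000, "actual": 520000},
    {"period": "2025-Q2", "forecast": 550000, "actual": 510000},
]
score = mape(periods)
print(score, bias_pct(periods), accuracy_rating(score))
```

In this sample the two periods miss by similar amounts in opposite directions, so MAPE stays meaningful while the signed bias nearly cancels out. That is why the two numbers are reported separately.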
### 3. GTM Efficiency Calculator
Calculates core SaaS GTM efficiency metrics with industry benchmarking, ratings, and improvement recommendations.
**Input:** JSON file with revenue, cost, and customer metrics
**Output:** Magic Number, LTV:CAC, CAC Payback, Burn Multiple, Rule of 40, NDR with ratings
**Usage:**
```bash
python scripts/gtm_efficiency_calculator.py gtm_data.json --format text
```
**Key Metrics Calculated:**
| Metric | Formula | Target |
|--------|---------|--------|
| Magic Number | Net New ARR / Prior Period S&M Spend | >0.75 |
| LTV:CAC | (ARPA x Gross Margin / Churn Rate) / CAC | >3:1 |
| CAC Payback | CAC / (ARPA x Gross Margin) months | <18 months |
| Burn Multiple | Net Burn / Net New ARR | <2x |
| Rule of 40 | Revenue Growth % + FCF Margin % | >40% |
| Net Dollar Retention | (Begin ARR + Expansion - Contraction - Churn) / Begin ARR | >110% |
**Input Schema:**
```json
{
"revenue": {
"current_arr": 5000000,
"prior_arr": 3800000,
"net_new_arr": 1200000,
"arpa_monthly": 2500,
"revenue_growth_pct": 31.6
},
"costs": {
"sales_marketing_spend": 1800000,
"cac": 18000,
"gross_margin_pct": 78,
"total_operating_expense": 6500000,
"net_burn": 1500000,
"fcf_margin_pct": 8.4
},
"customers": {
"beginning_arr": 3800000,
"expansion_arr": 600000,
"contraction_arr": 100000,
"churned_arr": 300000,
"annual_churn_rate_pct": 8
}
}
```
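The formulas in the metrics table can be applied to this input as a quick sanity check. The sketch below is an assumption-laden illustration, not the logic of `scripts/gtm_efficiency_calculator.py`: the name `gtm_metrics` and the output keys are hypothetical, `sales_marketing_spend` is assumed to be the prior-period S&M spend, and monthly ARPA is annualised so that it matches the annual churn rate in the LTV formula.

```python
from typing import Any, Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def gtm_metrics(data: Dict[str, Any]) -> Dict[str, float]:
    """Compute the six headline GTM metrics from the input schema."""
    rev, costs, cust = data["revenue"], data["costs"], data["customers"]
    margin = costs["gross_margin_pct"] / 100
    churn = cust["annual_churn_rate_pct"] / 100

    # Assumption: annual ARPA x gross margin / annual churn rate.
    ltv = _ratio(rev["arpa_monthly"] * 12 * margin, churn)
    retained = (
        cust["beginning_arr"] + cust["expansion_arr"]
        - cust["contraction_arr"] - cust["churned_arr"]
    )
    return {
        "magic_number": round(_ratio(rev["net_new_arr"], costs["sales_marketing_spend"]), 2),
        "ltv_cac": round(_ratio(ltv, costs["cac"]), 2),
        "cac_payback_months": round(_ratio(costs["cac"], rev["arpa_monthly"] * margin), 1),
        "burn_multiple": round(_ratio(costs["net_burn"], rev["net_new_arr"]), 2),
        "rule_of_40": round(rev["revenue_growth_pct"] + costs["fcf_margin_pct"], 1),
        "ndr_pct": round(_ratio(retained, cust["beginning_arr"]) * 100, 1),
    }


sample = {
    "revenue": {"net_new_arr": 1200000, "arpa_monthly": 2500, "revenue_growth_pct": 31.6},
    "costs": {"sales_marketing_spend": 1800000, "cac": 18000, "gross_margin_pct": 78,
              "net_burn": 1500000, "fcf_margin_pct": 8.4},
    "customers": {"beginning_arr": 3800000, "expansion_arr": 600000,
                  "contraction_arr": 100000, "churned_arr": 300000,
                  "annual_churn_rate_pct": 8},
}
print(gtm_metrics(sample))
```

If your churn figure is monthly rather than annual, drop the `* 12` so both sides of the LTV ratio use the same time unit.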
---
## Revenue Operations Workflows
### Weekly Pipeline Review
Use this workflow for your weekly pipeline inspection cadence.
1. **Verify input data:** Confirm pipeline export is current and all required fields (stage, value, close_date, owner) are populated before proceeding.
2. **Generate pipeline report:**
```bash
python scripts/pipeline_analyzer.py --input current_pipeline.json --format text
```
3. **Cross-check output totals** against your CRM source system to confirm data integrity.
4. **Review key indicators:**
- Pipeline coverage ratio (is it above 3x quota?)
- Deals aging beyond threshold (which deals need intervention?)
- Concentration risk (are we over-reliant on a few large deals?)
- Stage distribution (is there a healthy funnel shape?)
5. **Document using template:** Use `assets/pipeline_review_template.md`
6. **Action items:** Address aging deals, redistribute pipeline concentration, fill coverage gaps
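Step 1 of this workflow (checking that required fields are populated) can be automated before the analyzer runs. The following is a minimal sketch, assuming the pipeline export follows the input schema shown earlier; `find_incomplete_deals` and `REQUIRED_DEAL_FIELDS` are hypothetical names and not part of the shipped scripts.

```python
from typing import Any, Dict, List

# Fields the weekly review expects on every deal.
REQUIRED_DEAL_FIELDS = ("stage", "value", "close_date", "owner")


def find_incomplete_deals(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one entry per deal with missing or empty required fields."""
    problems: List[Dict[str, Any]] = []
    for index, deal in enumerate(data.get("deals", [])):
        # Treat absent keys, None, and empty strings as missing; 0 is a valid value.
        missing = [f for f in REQUIRED_DEAL_FIELDS if deal.get(f) in (None, "")]
        if missing:
            problems.append({"id": deal.get("id", f"index {index}"), "missing": missing})
    return problems


# Made-up export for illustration only.
export = {
    "deals": [
        {"id": "D001", "stage": "Proposal", "value": 85000,
         "close_date": "2025-03-15", "owner": "rep_1"},
        {"id": "D002", "stage": "Discovery", "value": 40000,
         "close_date": "", "owner": None},
    ]
}
for problem in find_incomplete_deals(export):
    print(f"{problem['id']}: missing {', '.join(problem['missing'])}")
```

An empty result means the export is safe to pass to the analyzer. Otherwise fix the listed deals in the CRM and re-export.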
### Forecast Accuracy Review
Run this review monthly or quarterly to evaluate and improve forecasting discipline.
1. **Verify input data:** Confirm all forecast periods have corresponding actuals and no periods are missing before running.
2. **Generate accuracy report:**
```bash
python scripts/forecast_accuracy_tracker.py forecast_history.json --format text
```
3. **Cross-check actuals** against closed-won records in your CRM before drawing conclusions.
4. **Analyze patterns:**
- Is MAPE trending down (improving)?
- Which reps or segments have the highest error rates?
- Is there systematic over- or under-forecasting?
5. **Document using template:** Use `assets/forecast_report_template.md`
6. **Improvement actions:** Coach high-bias reps, adjust methodology, improve data hygiene
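To answer the "which reps or segments have the highest error rates" question in step 4, the category rows can be ranked by absolute error. This is a sketch under assumptions: it expects rows shaped like the `by_rep` entries in the input schema, and `rank_by_error` is a hypothetical helper, not a function exposed by the tracker script.

```python
from typing import Any, Dict, List


def rank_by_error(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rank categories by absolute percentage error, worst first."""
    ranked: List[Dict[str, Any]] = []
    for row in rows:
        actual = row["actual"]
        if actual == 0:
            continue  # percentage error is undefined without an actual
        signed = (row["forecast"] - actual) / abs(actual) * 100
        if signed > 0:
            bias = "over"
        elif signed < 0:
            bias = "under"
        else:
            bias = "balanced"
        ranked.append({
            "category": row["category"],
            "error_pct": round(abs(signed), 1),
            "bias": bias,
        })
    return sorted(ranked, key=lambda r: r["error_pct"], reverse=True)


by_rep = [
    {"category": "Rep A", "forecast": 200000, "actual": 210000},
    {"category": "Rep B", "forecast": 280000, "actual": 310000},
]
for entry in rank_by_error(by_rep):
    print(entry["category"], entry["error_pct"], entry["bias"])
```

The categories at the top of the ranking are the natural starting point for the coaching actions in step 6.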
### GTM Efficiency Audit
Run this audit quarterly or during board prep to evaluate go-to-market efficiency.
1. **Verify input data:** Confirm revenue, cost, and customer figures reconcile with finance records before running.
2. **Calculate efficiency metrics:**
```bash
python scripts/gtm_efficiency_calculator.py quarterly_data.json --format text
```
3. **Cross-check computed ARR and spend totals** against your finance system before sharing results.
4. **Benchmark against targets:**
- Magic Number (>0.75)
- LTV:CAC (>3:1)
- CAC Payback (<18 months)
- Rule of 40 (>40%)
5. **Document using template:** Use `assets/gtm_dashboard_template.md`
6. **Strategic decisions:** Adjust spend allocation, optimize channels, improve retention
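The benchmark step can be expressed as a small pass/fail check. The sketch below assumes you already have the metric values in a dictionary; the key names, the `TARGETS` table, and `check_targets` are hypothetical, and the comparisons are treated as strict (a Rule of 40 value of exactly 40 does not pass), which is one reading of the `>` and `<` targets listed above.

```python
from typing import Dict, Tuple

# Direction and threshold for each target listed in step 4.
TARGETS: Dict[str, Tuple[str, float]] = {
    "magic_number": (">", 0.75),
    "ltv_cac": (">", 3.0),
    "cac_payback_months": ("<", 18.0),
    "rule_of_40": (">", 40.0),
}


def check_targets(metrics: Dict[str, float]) -> Dict[str, bool]:
    """Return True per metric when it beats its target, False otherwise."""
    results: Dict[str, bool] = {}
    for name, (direction, target) in TARGETS.items():
        if name not in metrics:
            continue  # skip metrics that were not supplied
        value = metrics[name]
        results[name] = value > target if direction == ">" else value < target
    return results


# Made-up values for illustration only.
quarter = {
    "magic_number": 0.67,
    "ltv_cac": 16.25,
    "cac_payback_months": 9.2,
    "rule_of_40": 40.0,
}
for metric, passed in check_targets(quarter).items():
    print(f"{metric}: {'on target' if passed else 'below target'}")
```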
### Quarterly Business Review
Combine all three tools for a comprehensive QBR analysis.
1. Run pipeline analyzer for forward-looking coverage
2. Run forecast tracker for backward-looking accuracy
3. Run GTM calculator for efficiency benchmarks
4. Cross-reference pipeline health with forecast accuracy
5. Align GTM efficiency metrics with growth targets
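Steps 1 to 3 can be chained in one script. The following is a sketch rather than a shipped tool: `build_commands` and `run_all` are hypothetical helpers, the script paths and flags are copied from the usage examples in this document, and it assumes each script prints a single JSON document to stdout when run with `--format json`.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List


def build_commands(pipeline_file: str, forecast_file: str, gtm_file: str) -> Dict[str, List[str]]:
    """Build one command line per tool, following the documented usage."""
    return {
        "pipeline": [sys.executable, "scripts/pipeline_analyzer.py",
                     "--input", pipeline_file, "--format", "json"],
        "forecast": [sys.executable, "scripts/forecast_accuracy_tracker.py",
                     forecast_file, "--format", "json"],
        "gtm": [sys.executable, "scripts/gtm_efficiency_calculator.py",
                gtm_file, "--format", "json"],
    }


def run_all(commands: Dict[str, List[str]]) -> Dict[str, Any]:
    """Run each tool and parse its JSON output; raises if a tool fails."""
    reports: Dict[str, Any] = {}
    for name, cmd in commands.items():
        completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
        reports[name] = json.loads(completed.stdout)
    return reports


commands = build_commands("current_pipeline.json", "forecast_history.json", "quarterly_data.json")
for name, cmd in commands.items():
    print(name, " ".join(cmd[1:]))
```

Calling `run_all(commands)` from the skill's root directory would return the three reports keyed by tool name, ready for the cross-referencing in steps 4 and 5.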
---
## Reference Documentation
| Reference | Description |
|-----------|-------------|
| [RevOps Metrics Guide](references/revops-metrics-guide.md) | Complete metrics hierarchy, definitions, formulas, and interpretation |
| [Pipeline Management Framework](references/pipeline-management-framework.md) | Pipeline best practices, stage definitions, conversion benchmarks |
| [GTM Efficiency Benchmarks](references/gtm-efficiency-benchmarks.md) | SaaS benchmarks by stage, industry standards, improvement strategies |
---
## Templates
| Template | Use Case |
|----------|----------|
| [Pipeline Review Template](assets/pipeline_review_template.md) | Weekly/monthly pipeline inspection documentation |
| [Forecast Report Template](assets/forecast_report_template.md) | Forecast accuracy reporting and trend analysis |
| [GTM Dashboard Template](assets/gtm_dashboard_template.md) | GTM efficiency dashboard for leadership review |
| [Sample Pipeline Data](assets/sample_pipeline_data.json) | Example input for pipeline_analyzer.py |
| [Expected Output](assets/expected_output.json) | Reference output from pipeline_analyzer.py |
FILE:revenue-operations/assets/expected_output.json
{
"coverage": {
"total_pipeline_value": 1105000,
"quota": 500000,
"coverage_ratio": 2.21,
"rating": "At Risk",
"target": "3.0x - 4.0x"
},
"stage_conversions": [
{
"from_stage": "Discovery",
"to_stage": "Qualification",
"from_count": 17,
"to_count": 12,
"conversion_rate_pct": 70.6
},
{
"from_stage": "Qualification",
"to_stage": "Proposal",
"from_count": 12,
"to_count": 9,
"conversion_rate_pct": 75.0
},
{
"from_stage": "Proposal",
"to_stage": "Negotiation",
"from_count": 9,
"to_count": 5,
"conversion_rate_pct": 55.6
},
{
"from_stage": "Negotiation",
"to_stage": "Closed Won",
"from_count": 5,
"to_count": 2,
"conversion_rate_pct": 40.0
}
],
"velocity": {
"num_opportunities": 17,
"avg_deal_size": 74588.24,
"win_rate_pct": 11.8,
"avg_cycle_days": 32.5,
"velocity_per_day": 4594.2,
"velocity_per_month": 137826.09
},
"aging": {
"global_aging_threshold_days": 90,
"stage_thresholds": {
"Discovery": 90,
"Qualification": 78,
"Proposal": 67,
"Negotiation": 56
},
"total_open_deals": 15,
"healthy_deals": 13,
"at_risk_deals": 2,
"aging_deals": [
{
"id": "D011",
"name": "Vertex Solutions",
"stage": "Proposal",
"age_days": 95,
"threshold_days": 67,
"days_over": 28,
"value": 110000
},
{
"id": "D014",
"name": "Horizon Telecom",
"stage": "Negotiation",
"age_days": 60,
"threshold_days": 56,
"days_over": 4,
"value": 250000
}
]
},
"risk": {
"overall_risk": "MEDIUM",
"risk_factors_count": 3,
"concentration_risks": [],
"has_concentration_risk": false,
"stage_distribution": {
"Discovery": {
"count": 5,
"value": 194000,
"pct_of_pipeline": 17.6
},
"Qualification": {
"count": 3,
"value": 150000,
"pct_of_pipeline": 13.6
},
"Proposal": {
"count": 4,
"value": 333000,
"pct_of_pipeline": 30.1
},
"Negotiation": {
"count": 3,
"value": 428000,
"pct_of_pipeline": 38.7
}
},
"empty_stages": [],
"coverage_gaps": [
{
"quarter": "2025-Q2",
"pipeline_value": 344000,
"quarterly_target": 125000.0,
"coverage_ratio": 2.75,
"gap": "Below 3x target"
}
]
}
}
FILE:revenue-operations/assets/forecast_report_template.md
# Forecast Accuracy Report - [Period]
## Report Details
- **Prepared By:** [Name]
- **Report Date:** [YYYY-MM-DD]
- **Period Analyzed:** [Start Period] to [End Period]
- **Periods Covered:** [N] periods
---
## Executive Summary
| Metric | Value | Rating | Trend |
|--------|-------|--------|-------|
| MAPE | _% | | |
| Weighted MAPE | _% | | |
| Forecast Bias | _% | | |
| Bias Direction | | | |
**Accuracy Rating:**
- Excellent (<10%) / Good (10-15%) / Fair (15-25%) / Poor (>25%)
**Key Finding:** [1-2 sentence summary of forecast accuracy status]
---
## Period-by-Period Analysis
| Period | Forecast | Actual | Variance | Error % | Bias |
|--------|----------|--------|----------|---------|------|
| | $_ | $_ | $_ | _% | Over/Under |
| | $_ | $_ | $_ | _% | Over/Under |
| | $_ | $_ | $_ | _% | Over/Under |
| | $_ | $_ | $_ | _% | Over/Under |
| | $_ | $_ | $_ | _% | Over/Under |
| | $_ | $_ | $_ | _% | Over/Under |
---
## Bias Analysis
### Overall Bias
- **Direction:** [Over-forecasting / Under-forecasting / Balanced]
- **Bias Magnitude:** _%
- **Over-forecast Periods:** _ of _
- **Under-forecast Periods:** _ of _
- **Bias Ratio:** _ (1.0 = always over, 0.0 = always under, 0.5 = balanced)
### Interpretation
[What does the bias pattern tell us about our forecasting process? Is it systematic or random?]
### Root Cause
[Identify the primary drivers of bias: optimistic deal assessment, poor stage qualification, sandbagging, late-arriving deals, etc.]
---
## Trend Analysis
### Accuracy Trend
- **Direction:** [Improving / Stable / Declining]
- **Early Period MAPE:** _%
- **Recent Period MAPE:** _%
- **MAPE Change:** _% (positive = worsening, negative = improving)
### Trend Chart (Text)
```
Period Error% Trend
Q1 __% ████████
Q2 __% ██████████
Q3 __% ██████
Q4 __% ████████████
```
---
## Category Breakdown
### By Rep
| Rep | Forecast | Actual | Error % | Bias | Rating |
|-----|----------|--------|---------|------|--------|
| | $_ | $_ | _% | | |
| | $_ | $_ | _% | | |
| | $_ | $_ | _% | | |
| | $_ | $_ | _% | | |
**Overall Rep MAPE:** _%
### By Segment
| Segment | Forecast | Actual | Error % | Bias | Rating |
|---------|----------|--------|---------|------|--------|
| Enterprise | $_ | $_ | _% | | |
| Mid-Market | $_ | $_ | _% | | |
| SMB | $_ | $_ | _% | | |
**Overall Segment MAPE:** _%
### By Product (if applicable)
| Product | Forecast | Actual | Error % | Bias | Rating |
|---------|----------|--------|---------|------|--------|
| | $_ | $_ | _% | | |
| | $_ | $_ | _% | | |
---
## Recommendations
### Immediate Actions (This Quarter)
1. **[Action]** -- [Why and expected impact]
2. **[Action]** -- [Why and expected impact]
3. **[Action]** -- [Why and expected impact]
### Process Improvements (Next Quarter)
1. **[Improvement]** -- [Implementation plan]
2. **[Improvement]** -- [Implementation plan]
### Coaching Focus Areas
| Rep/Team | Issue | Coaching Action | Target |
|----------|-------|-----------------|--------|
| | | | |
| | | | |
---
## Forecast Methodology Notes
### Current Methodology
[Describe the current forecasting methodology: weighted pipeline, commit/upside categories, AI-assisted, etc.]
### Methodology Changes This Period
[Any changes to the forecasting process or methodology during the reporting period]
### Data Quality Issues
[Note any data quality issues that may affect accuracy: missing close dates, inconsistent stage definitions, CRM hygiene gaps]
---
## Next Steps
| # | Action | Owner | Due Date |
|---|--------|-------|----------|
| 1 | | | |
| 2 | | | |
| 3 | | | |
FILE:revenue-operations/assets/gtm_dashboard_template.md
# GTM Efficiency Dashboard - [Quarter/Period]
## Dashboard Details
- **Prepared By:** [Name]
- **Report Date:** [YYYY-MM-DD]
- **Period:** [Quarter or Date Range]
- **Company Stage:** [Seed / Series A / Series B / Series C+ / Growth]
---
## Metrics At A Glance
| Metric | Value | Rating | Target | Trend | vs. Last Period |
|--------|-------|--------|--------|-------|-----------------|
| Magic Number | _ | | >0.75 | | |
| LTV:CAC | _:1 | | >3:1 | | |
| CAC Payback | _ mo | | <18 mo | | |
| Burn Multiple | _x | | <2x | | |
| Rule of 40 | _% | | >40% | | |
| NDR | _% | | >110% | | |
**Rating Legend:** Green = Healthy | Yellow = Monitor | Red = Action Required
**Overall GTM Health:** [Strong / Healthy / Needs Attention / Critical]
---
## Detailed Metric Analysis
### Magic Number
| Component | Value |
|-----------|-------|
| Net New ARR | $_ |
| Prior Period S&M Spend | $_ |
| **Magic Number** | **_** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [What does this metric tell us about GTM spend efficiency?]
### LTV:CAC Ratio
| Component | Value |
|-----------|-------|
| ARPA (Monthly) | $_ |
| ARPA (Annual) | $_ |
| Gross Margin | _% |
| Annual Churn Rate | _% |
| **Customer LTV** | **$_** |
| Customer Acquisition Cost | $_ |
| **LTV:CAC Ratio** | **_:1** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [Are unit economics sustainable?]
### CAC Payback Period
| Component | Value |
|-----------|-------|
| CAC | $_ |
| Monthly Gross Margin Contribution | $_ |
| **CAC Payback** | **_ months** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [How quickly are we recovering acquisition costs?]
### Burn Multiple
| Component | Value |
|-----------|-------|
| Net Burn | $_ |
| Net New ARR | $_ |
| **Burn Multiple** | **_x** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [Is growth capital-efficient?]
### Rule of 40
| Component | Value |
|-----------|-------|
| Revenue Growth Rate | _% |
| FCF Margin | _% |
| **Rule of 40 Score** | **_%** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [Is the growth-profitability balance healthy?]
### Net Dollar Retention
| Component | Value |
|-----------|-------|
| Beginning ARR | $_ |
| Expansion ARR | +$_ |
| Contraction ARR | -$_ |
| Churned ARR | -$_ |
| Ending ARR | $_ |
| **NDR** | **_%** |
- **Rating:** [Green / Yellow / Red]
- **Percentile:** [Top 10% / Top 25% / Median / Below Median]
- **Trend:** [Improving / Stable / Declining]
- **Interpretation:** [Are we growing revenue from the existing customer base?]
---
## Quarterly Trend
| Metric | Q-3 | Q-2 | Q-1 | Current | Direction |
|--------|-----|-----|-----|---------|-----------|
| Magic Number | _ | _ | _ | _ | |
| LTV:CAC | _:1 | _:1 | _:1 | _:1 | |
| CAC Payback | _ mo | _ mo | _ mo | _ mo | |
| Burn Multiple | _x | _x | _x | _x | |
| Rule of 40 | _% | _% | _% | _% | |
| NDR | _% | _% | _% | _% | |
---
## Benchmark Comparison
| Metric | Our Value | Stage Median | Top Quartile | Gap to Top Quartile |
|--------|-----------|-------------|--------------|---------------------|
| Magic Number | _ | _ | _ | _ |
| LTV:CAC | _:1 | _:1 | _:1 | _ |
| CAC Payback | _ mo | _ mo | _ mo | _ mo |
| Burn Multiple | _x | _x | _x | _ |
| Rule of 40 | _% | _% | _% | _% |
| NDR | _% | _% | _% | _% |
---
## Revenue Composition
### ARR Bridge
```
Beginning ARR: $____________
+ New Logo ARR: $____________
+ Expansion ARR: $____________
- Contraction ARR: $____________
- Churned ARR: $____________
= Ending ARR: $____________
Net New ARR: $____________
Growth Rate: ____________%
```
### Cost Structure
```
S&M Spend: $____________ (___% of revenue)
R&D Spend: $____________ (___% of revenue)
G&A Spend: $____________ (___% of revenue)
Total OpEx: $____________
Net Burn: $____________
Gross Margin: ____________%
```
---
## Strategic Recommendations
### Top 3 Priorities
1. **[Priority]**
- Current state: [Where we are]
- Target: [Where we need to be]
- Action plan: [How to get there]
- Expected impact: [Metric improvement]
- Timeline: [When]
2. **[Priority]**
- Current state:
- Target:
- Action plan:
- Expected impact:
- Timeline:
3. **[Priority]**
- Current state:
- Target:
- Action plan:
- Expected impact:
- Timeline:
### Investment Recommendations
| Area | Current Spend | Recommended | Rationale |
|------|--------------|-------------|-----------|
| | $_ | $_ | |
| | $_ | $_ | |
| | $_ | $_ | |
---
## Next Steps
| # | Action | Owner | Due Date | Success Metric |
|---|--------|-------|----------|---------------|
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |
| 5 | | | | |
FILE:revenue-operations/assets/pipeline_review_template.md
# Pipeline Review - [Date]
## Review Period
- **Review Type:** [Weekly / Monthly]
- **Prepared By:** [Name]
- **Review Date:** [YYYY-MM-DD]
- **Period Covered:** [Start Date] to [End Date]
---
## Executive Summary
| Metric | Current | Last Period | Target | Status |
|--------|---------|-------------|--------|--------|
| Pipeline Coverage | _x | _x | 3-4x | |
| Total Pipeline Value | $_ | $_ | $_ | |
| Net Pipeline Change | $_ | $_ | >$0 | |
| Deals in Pipeline | _ | _ | _ | |
| Avg Deal Size | $_ | $_ | $_ | |
| Sales Velocity ($/mo) | $_ | $_ | $_ | |
**Overall Assessment:** [1-2 sentence summary of pipeline health]
---
## Coverage Analysis
### By Quarter
| Quarter | Pipeline | Target | Coverage | Status |
|---------|----------|--------|----------|--------|
| Current Quarter | $_ | $_ | _x | |
| Next Quarter | $_ | $_ | _x | |
| Q+2 | $_ | $_ | _x | |
### By Segment
| Segment | Pipeline | Target | Coverage | Notes |
|---------|----------|--------|----------|-------|
| Enterprise | $_ | $_ | _x | |
| Mid-Market | $_ | $_ | _x | |
| SMB | $_ | $_ | _x | |
---
## Stage Distribution
| Stage | # Deals | Value | % of Pipeline | Conversion Rate |
|-------|---------|-------|---------------|-----------------|
| Discovery | _ | $_ | _% | _% |
| Qualification | _ | $_ | _% | _% |
| Proposal | _ | $_ | _% | _% |
| Negotiation | _ | $_ | _% | _% |
**Funnel Health:** [Healthy / Top-heavy / Bottom-heavy / Gaps identified]
---
## Top Deals Review (S3+)
| Deal | Stage | Value | Age | Close Date | Risk | Next Step |
|------|-------|-------|-----|------------|------|-----------|
| | | $_ | _d | | | |
| | | $_ | _d | | | |
| | | $_ | _d | | | |
| | | $_ | _d | | | |
| | | $_ | _d | | | |
---
## Risk Assessment
### Concentration Risk
- **Largest deal as % of pipeline:** _%
- **Top 3 deals as % of pipeline:** _%
- **Risk Level:** [Low / Medium / High]
- **Mitigation:** [Actions to diversify]
### Aging Deals
| Deal | Stage | Age | Threshold | Days Over | Action Required |
|------|-------|-----|-----------|-----------|-----------------|
| | | _d | _d | +_d | |
| | | _d | _d | +_d | |
### Deals Pushed from Last Period
| Deal | Original Close | New Close | Times Pushed | Reason |
|------|---------------|-----------|-------------|--------|
| | | | | |
| | | | | |
---
## Pipeline Movement
### Created This Period
| Deal | Source | Value | Stage | Expected Close |
|------|--------|-------|-------|---------------|
| | | $_ | | |
| | | $_ | | |
**Total Created:** $_
### Advanced This Period
| Deal | From Stage | To Stage | Value |
|------|-----------|----------|-------|
| | | | $_ |
| | | | $_ |
### Closed Won This Period
| Deal | Value | Cycle Days | Source |
|------|-------|-----------|--------|
| | $_ | _d | |
| | $_ | _d | |
**Total Closed Won:** $_
### Closed Lost This Period
| Deal | Value | Stage Lost | Loss Reason |
|------|-------|-----------|-------------|
| | $_ | | |
| | $_ | | |
**Total Closed Lost:** $_
---
## Action Items
| # | Action | Owner | Due Date | Priority |
|---|--------|-------|----------|----------|
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |
| 5 | | | | |
---
## Notes
[Additional context, observations, or discussion points for the review meeting]
FILE:revenue-operations/assets/sample_forecast_data.json
{
"forecast_periods": [
{"period": "2024-Q1", "forecast": 420000, "actual": 445000},
{"period": "2024-Q2", "forecast": 480000, "actual": 460000},
{"period": "2024-Q3", "forecast": 510000, "actual": 525000},
{"period": "2024-Q4", "forecast": 550000, "actual": 510000},
{"period": "2025-Q1", "forecast": 520000, "actual": 540000},
{"period": "2025-Q2", "forecast": 580000, "actual": 560000}
],
"category_breakdowns": {
"by_rep": [
{"category": "Sarah Chen", "forecast": 210000, "actual": 225000},
{"category": "Marcus Johnson", "forecast": 185000, "actual": 160000},
{"category": "Priya Patel", "forecast": 125000, "actual": 135000},
{"category": "Alex Rivera", "forecast": 60000, "actual": 40000}
],
"by_segment": [
{"category": "Enterprise", "forecast": 320000, "actual": 310000},
{"category": "Mid-Market", "forecast": 180000, "actual": 175000},
{"category": "SMB", "forecast": 80000, "actual": 75000}
]
}
}
FILE:revenue-operations/assets/sample_gtm_data.json
{
"revenue": {
"current_arr": 5000000,
"prior_arr": 3800000,
"net_new_arr": 1200000,
"arpa_monthly": 2500,
"revenue_growth_pct": 31.6
},
"costs": {
"sales_marketing_spend": 1800000,
"cac": 18000,
"gross_margin_pct": 78,
"total_operating_expense": 6500000,
"net_burn": 1500000,
"fcf_margin_pct": 8.4
},
"customers": {
"beginning_arr": 3800000,
"expansion_arr": 600000,
"contraction_arr": 100000,
"churned_arr": 300000,
"annual_churn_rate_pct": 8
}
}
FILE:revenue-operations/assets/sample_pipeline_data.json
{
"quota": 500000,
"stages": ["Discovery", "Qualification", "Proposal", "Negotiation", "Closed Won"],
"average_cycle_days": 45,
"deals": [
{
"id": "D001",
"name": "Acme Corp",
"stage": "Proposal",
"value": 85000,
"age_days": 32,
"close_date": "2025-03-15",
"owner": "rep_1"
},
{
"id": "D002",
"name": "TechFlow Inc",
"stage": "Discovery",
"value": 42000,
"age_days": 8,
"close_date": "2025-04-30",
"owner": "rep_2"
},
{
"id": "D003",
"name": "GlobalData Systems",
"stage": "Negotiation",
"value": 120000,
"age_days": 55,
"close_date": "2025-02-28",
"owner": "rep_1"
},
{
"id": "D004",
"name": "Pinnacle Software",
"stage": "Qualification",
"value": 35000,
"age_days": 18,
"close_date": "2025-04-15",
"owner": "rep_3"
},
{
"id": "D005",
"name": "Meridian Health",
"stage": "Proposal",
"value": 95000,
"age_days": 40,
"close_date": "2025-03-20",
"owner": "rep_2"
},
{
"id": "D006",
"name": "CloudVault",
"stage": "Discovery",
"value": 28000,
"age_days": 5,
"close_date": "2025-05-15",
"owner": "rep_1"
},
{
"id": "D007",
"name": "Nexus Financial",
"stage": "Closed Won",
"value": 72000,
"age_days": 38,
"close_date": "2025-01-31",
"owner": "rep_3"
},
{
"id": "D008",
"name": "Urban Analytics",
"stage": "Negotiation",
"value": 58000,
"age_days": 42,
"close_date": "2025-03-05",
"owner": "rep_2"
},
{
"id": "D009",
"name": "Redwood Logistics",
"stage": "Discovery",
"value": 31000,
"age_days": 12,
"close_date": "2025-05-01",
"owner": "rep_3"
},
{
"id": "D010",
"name": "Summit Enterprises",
"stage": "Qualification",
"value": 48000,
"age_days": 22,
"close_date": "2025-04-10",
"owner": "rep_1"
},
{
"id": "D011",
"name": "Vertex Solutions",
"stage": "Proposal",
"value": 110000,
"age_days": 95,
"close_date": "2025-03-01",
"owner": "rep_2"
},
{
"id": "D012",
"name": "DataBridge AI",
"stage": "Discovery",
"value": 55000,
"age_days": 3,
"close_date": "2025-06-15",
"owner": "rep_1"
},
{
"id": "D013",
"name": "Atlas Manufacturing",
"stage": "Qualification",
"value": 67000,
"age_days": 28,
"close_date": "2025-04-20",
"owner": "rep_3"
},
{
"id": "D014",
"name": "Horizon Telecom",
"stage": "Negotiation",
"value": 250000,
"age_days": 60,
"close_date": "2025-03-10",
"owner": "rep_1"
},
{
"id": "D015",
"name": "BlueShift Labs",
"stage": "Proposal",
"value": 43000,
"age_days": 35,
"close_date": "2025-03-25",
"owner": "rep_3"
},
{
"id": "D016",
"name": "Crestview Partners",
"stage": "Discovery",
"value": 38000,
"age_days": 15,
"close_date": "2025-05-20",
"owner": "rep_2"
},
{
"id": "D017",
"name": "Ironclad Security",
"stage": "Closed Won",
"value": 91000,
"age_days": 44,
"close_date": "2025-02-10",
"owner": "rep_1"
}
]
}
FILE:revenue-operations/references/gtm-efficiency-benchmarks.md
# GTM Efficiency Benchmarks
SaaS benchmarks by funding stage, industry standards, and strategies for improving go-to-market efficiency.
---
## Benchmarks by Funding Stage
### Seed Stage ($0-$2M ARR)
| Metric | Red | Yellow | Green | Elite |
|--------|-----|--------|-------|-------|
| Magic Number | <0.3 | 0.3-0.5 | >0.5 | >0.8 |
| LTV:CAC | <1.5:1 | 1.5-2.5:1 | >2.5:1 | >4:1 |
| CAC Payback | >30 mo | 24-30 mo | <24 mo | <15 mo |
| Burn Multiple | >5x | 3-5x | <3x | <2x |
| Rule of 40 | <0% | 0-20% | >20% | >40% |
| NDR | <90% | 90-100% | >100% | >110% |
**Context:** At seed stage, efficiency metrics are naturally less stable due to small sample sizes. Focus on directional improvement rather than absolute numbers. Burn multiple is the most critical metric -- investors want to see capital-efficient growth.
### Series A ($2M-$10M ARR)
| Metric | Red | Yellow | Green | Elite |
|--------|-----|--------|-------|-------|
| Magic Number | <0.4 | 0.4-0.6 | >0.6 | >0.9 |
| LTV:CAC | <2:1 | 2-3:1 | >3:1 | >5:1 |
| CAC Payback | >24 mo | 18-24 mo | <18 mo | <12 mo |
| Burn Multiple | >4x | 2.5-4x | <2.5x | <1.5x |
| Rule of 40 | <10% | 10-30% | >30% | >50% |
| NDR | <95% | 95-105% | >105% | >115% |
**Context:** Series A is where unit economics must prove out. LTV:CAC >3:1 validates product-market fit in the revenue model. Investors will scrutinize CAC payback to understand capital requirements.
### Series B ($10M-$50M ARR)
| Metric | Red | Yellow | Green | Elite |
|--------|-----|--------|-------|-------|
| Magic Number | <0.5 | 0.5-0.75 | >0.75 | >1.0 |
| LTV:CAC | <2.5:1 | 2.5-3.5:1 | >3.5:1 | >5:1 |
| CAC Payback | >22 mo | 15-22 mo | <15 mo | <10 mo |
| Burn Multiple | >3x | 2-3x | <2x | <1.5x |
| Rule of 40 | <20% | 20-35% | >35% | >50% |
| NDR | <100% | 100-110% | >110% | >120% |
**Context:** At Series B, the GTM machine should be scaling predictably. Magic Number >0.75 demonstrates that adding GTM spend produces proportional returns. NDR >110% shows that the land-and-expand motion works.
### Series C+ ($50M-$200M ARR)
| Metric | Red | Yellow | Green | Elite |
|--------|-----|--------|-------|-------|
| Magic Number | <0.5 | 0.5-0.75 | >0.75 | >1.0 |
| LTV:CAC | <3:1 | 3-4:1 | >4:1 | >6:1 |
| CAC Payback | >20 mo | 14-20 mo | <14 mo | <10 mo |
| Burn Multiple | >2.5x | 1.5-2.5x | <1.5x | <1x |
| Rule of 40 | <25% | 25-40% | >40% | >60% |
| NDR | <105% | 105-115% | >115% | >130% |
**Context:** Growth efficiency and path to profitability become paramount. The Rule of 40 is the primary board-level metric. Companies approaching IPO should target Rule of 40 >40% consistently.
### Growth / Pre-IPO ($200M+ ARR)
| Metric | Red | Yellow | Green | Elite |
|--------|-----|--------|-------|-------|
| Magic Number | <0.6 | 0.6-0.8 | >0.8 | >1.0 |
| LTV:CAC | <3:1 | 3-5:1 | >5:1 | >7:1 |
| CAC Payback | >18 mo | 12-18 mo | <12 mo | <8 mo |
| Burn Multiple | >2x | 1-2x | <1x | <0.5x |
| Rule of 40 | <30% | 30-45% | >45% | >65% |
| NDR | <110% | 110-120% | >120% | >140% |
**Context:** Pre-IPO and public companies are measured on absolute efficiency. FCF margin matters as much as growth rate. Best-in-class companies demonstrate both growth and profitability.
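As a minimal sketch of how the stage tables above might be applied in code, the example below rates a metric against the Series B thresholds. The names `SERIES_B` and `rate_metric` are hypothetical and not part of this skill's scripts. Treating exact boundary values as Yellow is an assumption, since the tables do not say which band a boundary value belongs to.

```python
# Series B thresholds copied from the table above.
# Hypothetical layout: metric -> (red_bound, green_bound, elite_bound, higher_is_better)
SERIES_B = {
    "magic_number": (0.5, 0.75, 1.0, True),
    "ltv_cac": (2.5, 3.5, 5.0, True),
    "cac_payback_months": (22, 15, 10, False),
    "burn_multiple": (3.0, 2.0, 1.5, False),
    "rule_of_40_pct": (20, 35, 50, True),
    "ndr_pct": (100, 110, 120, True),
}


def rate_metric(name, value, thresholds=SERIES_B):
    """Return 'Red', 'Yellow', 'Green', or 'Elite' for one metric value."""
    red, green, elite, higher_is_better = thresholds[name]
    if not higher_is_better:
        # Negate everything so the "larger is better" comparisons below
        # also work for metrics where smaller values are better.
        value, red, green, elite = -value, -red, -green, -elite
    if value > elite:
        return "Elite"
    if value > green:
        return "Green"
    if value < red:
        return "Red"
    # Assumption: values exactly on a boundary fall into Yellow.
    return "Yellow"


print(rate_metric("magic_number", 0.8))        # Green
print(rate_metric("cac_payback_months", 25))   # Red
print(rate_metric("burn_multiple", 1.25))      # Elite
```

The same function could be reused for other stages by passing a different thresholds dictionary built from the corresponding table.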
---
## Industry Vertical Benchmarks
### Horizontal SaaS (CRM, HR, Finance, Marketing)
| Metric | Median | Top Quartile |
|--------|--------|-------------|
| Magic Number | 0.65 | 0.90+ |
| LTV:CAC | 3.2:1 | 5.5:1+ |
| CAC Payback | 17 months | 11 months |
| Gross Margin | 72% | 80%+ |
| NDR | 108% | 120%+ |
| Win Rate | 22% | 32%+ |
### Vertical SaaS (Healthcare, FinTech, PropTech)
| Metric | Median | Top Quartile |
|--------|--------|-------------|
| Magic Number | 0.55 | 0.80+ |
| LTV:CAC | 3.8:1 | 6.0:1+ |
| CAC Payback | 15 months | 10 months |
| Gross Margin | 68% | 76%+ |
| NDR | 112% | 125%+ |
| Win Rate | 25% | 38%+ |
**Note:** Vertical SaaS often has higher NDR (deeper embedding) and higher win rates (less competition) but lower gross margins (more services).
### Infrastructure / DevTools
| Metric | Median | Top Quartile |
|--------|--------|-------------|
| Magic Number | 0.70 | 1.0+ |
| LTV:CAC | 4.0:1 | 7.0:1+ |
| CAC Payback | 14 months | 9 months |
| Gross Margin | 75% | 85%+ |
| NDR | 118% | 140%+ |
| Win Rate | 18% | 28%+ |
**Note:** Usage-based pricing in infrastructure drives exceptional NDR but more volatile revenue patterns.
### Security / Compliance
| Metric | Median | Top Quartile |
|--------|--------|-------------|
| Magic Number | 0.60 | 0.85+ |
| LTV:CAC | 3.5:1 | 5.8:1+ |
| CAC Payback | 16 months | 11 months |
| Gross Margin | 74% | 82%+ |
| NDR | 115% | 130%+ |
| Win Rate | 20% | 30%+ |
---
## Efficiency Improvement Strategies
### Improving Magic Number
**Current: <0.5 (Red) -- Target: >0.75 (Green)**
1. **Channel ROI analysis:** Audit spend by channel (paid, outbound, events, content). Cut the bottom-performing 20% of channels and reallocate that spend.
2. **Sales productivity:** Measure revenue per rep. Identify bottom-quartile performers for coaching or role change. Top performers should be studied and their practices systematized.
3. **Funnel efficiency:** Improve MQL-to-SQL conversion through better lead scoring. Fewer, higher-quality leads reduce wasted sales capacity.
4. **Ramp time reduction:** Accelerate new rep ramp from an average of 6 months to 4 months through structured onboarding, shadowing, and certification.
5. **Territory optimization:** Ensure territories are balanced by opportunity (not just geography). Over-served territories waste capacity.
### Improving LTV:CAC
**Current: <3:1 (Yellow) -- Target: >5:1 (Green)**
**Increase LTV:**
- Reduce churn through proactive health scoring and intervention
- Build expansion playbooks for cross-sell and upsell
- Increase pricing through value-based packaging
- Improve product stickiness with integrations and workflows
**Decrease CAC:**
- Invest in organic channels (content, SEO, community)
- Implement product-led growth (PLG) motion
- Optimize paid spend through better targeting and attribution
- Leverage customer referrals and case studies
### Improving CAC Payback
**Current: >18 months (Yellow) -- Target: <12 months (Green)**
1. **Increase ARPA:** Package features to drive higher initial contract values. Annual prepay discounts accelerate cash collection.
2. **Improve gross margin:** Reduce COGS through automation, self-serve onboarding, and tech-touch customer success.
3. **Reduce CAC:** Apply the same CAC-reduction strategies listed under Improving LTV:CAC.
4. **Contract structure:** Annual or multi-year contracts with upfront payment reduce effective payback period.
### Improving Burn Multiple
**Current: >2x (Yellow) -- Target: <1.5x (Green)**
1. **Revenue efficiency:** Focus on the highest ROI growth activities. Not all ARR is equal -- expansion ARR is typically much cheaper than new logo ARR.
2. **Operational efficiency:** Automate repeatable processes (billing, provisioning, basic support). Reduce headcount growth rate relative to revenue growth rate.
3. **Spending discipline:** Implement zero-based budgeting for non-essential spend. Every dollar of burn should connect to revenue generation.
4. **Revenue acceleration:** Sometimes the best way to improve burn multiple is not cutting costs but accelerating revenue. If you can accelerate revenue growth by 20% with 5% more spend, the burn multiple improves.
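The revenue acceleration point can be checked with quick arithmetic. The sketch below assumes burn multiple is net burn divided by net new ARR (the components used in this skill's dashboard template), and it reads "20% faster revenue growth with 5% more spend" as 20% more net new ARR for 5% more net burn. That interpretation, the function name, and the starting figures are illustrative assumptions.

```python
def burn_multiple(net_burn, net_new_arr):
    """Net burn divided by net new ARR; lower is better."""
    if net_new_arr <= 0:
        raise ValueError("net_new_arr must be positive")
    return net_burn / net_new_arr


# Illustrative starting point: $1.5M net burn, $1.2M net new ARR.
baseline = burn_multiple(1_500_000, 1_200_000)

# Assumed scenario: burn rises 5%, net new ARR rises 20%.
accelerated = burn_multiple(1_500_000 * 1.05, 1_200_000 * 1.20)

print(f"baseline: {baseline:.2f}x, accelerated: {accelerated:.2f}x")
```

Under these assumptions the multiple falls from 1.25x to roughly 1.09x even though absolute burn went up.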
### Improving NDR
**Current: 100-110% (Yellow) -- Target: >120% (Green)**
1. **Expansion playbooks:** Define trigger events for upsell (usage thresholds, team growth, feature requests). Arm CSMs with expansion talk tracks.
2. **Usage-based pricing:** Align pricing with customer value creation. As customers use more, they pay more, which naturally drives expansion.
3. **Product-led expansion:** Build in-product prompts for upgrades. Use feature gating that shows the value of the next tier.
4. **Reduce contraction:** Identify reasons for downgrades. They are often related to poor adoption of features customers are paying for.
5. **Reduce churn:** Implement an early warning system (health scores). Intervene before renewal, not at renewal.
6. **Multi-product strategy:** Cross-sell additional products to existing customers. Second product adoption reduces churn by 30-50%.
---
## Metric Relationships and Trade-offs
### Growth vs. Efficiency
The fundamental tension in SaaS is between growth rate and capital efficiency:
```
High Growth + High Burn = Blitzscaling (risky but fast)
High Growth + Low Burn = Efficient Growth (ideal)
Low Growth + Low Burn = Cash Cow (sustainable but limited)
Low Growth + High Burn = Trouble (restructure immediately)
```
**Rule of 40** captures this balance: revenue growth rate (%) plus profit margin (%, FCF margin in this skill's templates) should exceed 40%.
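A minimal sketch of the Rule of 40 check, assuming the margin term is FCF margin as in this skill's dashboard template. The function name and the input figures are hypothetical.

```python
def rule_of_40(revenue_growth_pct, margin_pct):
    """Return (score, passes), where passes means the score exceeds 40."""
    score = revenue_growth_pct + margin_pct
    return score, score > 40


# A slower-growing but profitable company clears the bar...
print(rule_of_40(35.0, 10.0))    # (45.0, True)
# ...while fast growth funded by heavy burn does not.
print(rule_of_40(60.0, -30.0))   # (30.0, False)
```

Reporting the two components alongside the score keeps the growth-versus-margin composition visible, which matters because very different companies can land on the same total.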
### CAC Payback vs. Growth Rate
Shorter CAC payback enables faster reinvestment in growth. A company with 12-month payback can reinvest recovered CAC into new customer acquisition sooner than one with 24-month payback, creating a compounding advantage.
### NDR vs. New Logo Acquisition
High NDR reduces dependence on new logo acquisition for growth:
- NDR of 120% means 20% growth from existing base before any new customers
- NDR of 100% means all growth must come from new customers (expensive)
- NDR of 80% means the company is shrinking and must acquire even more new customers just to replace lost revenue
**Strategic implication:** Invest in NDR improvement before scaling new logo acquisition. A dollar spent improving NDR typically has a higher ROI than a dollar spent acquiring new customers, because expansion ARR is usually cheaper than new logo ARR.
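To make the NDR scenarios above concrete, the sketch below computes NDR from the ARR components used in this skill's dashboard template and then estimates how much new logo ARR is needed to hit a growth target. The function names, the $1M starting base, and the 30% growth target are illustrative assumptions.

```python
def ndr_pct(beginning_arr, expansion_arr, contraction_arr, churned_arr):
    """Net dollar retention as a percentage of beginning ARR."""
    ending_from_existing = (
        beginning_arr + expansion_arr - contraction_arr - churned_arr
    )
    return ending_from_existing / beginning_arr * 100


def new_logo_arr_needed(beginning_arr, ndr, target_growth_pct):
    """New logo ARR required to reach the growth target at a given NDR."""
    target_arr = beginning_arr * (1 + target_growth_pct / 100)
    arr_from_existing = beginning_arr * ndr / 100
    return max(0.0, target_arr - arr_from_existing)


# Hypothetical company: $1M beginning ARR, 30% growth target.
for ndr in (120, 100, 80):
    needed = new_logo_arr_needed(1_000_000, ndr, target_growth_pct=30)
    print(f"NDR {ndr}%: ${needed:,.0f} of new logo ARR needed")
```

With these inputs, the new logo requirement grows from $100,000 at 120% NDR to $500,000 at 80% NDR for the same growth target.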
---
## Benchmark Data Sources
The benchmarks in this guide are compiled from:
1. **Bessemer Cloud Index** -- Public cloud company financial data
2. **KeyBanc SaaS Survey** -- Annual survey of private SaaS companies
3. **OpenView SaaS Benchmarks** -- Product-led growth focused benchmarks
4. **Iconiq Growth Analytics** -- Private company growth and efficiency data
5. **SaaStr Annual Surveys** -- Community-sourced SaaS metrics
6. **Battery Ventures Software Report** -- Enterprise software metrics
**Note:** Benchmarks shift over time. In capital-constrained environments (higher interest rates), efficiency metrics (burn multiple, Rule of 40) receive more weight. In growth-oriented environments (lower interest rates), growth rate and market share gain importance.
---
## Quarterly Board Reporting Template
When presenting GTM efficiency to the board, organize metrics as follows:
1. **Growth:** ARR, net new ARR, growth rate, NDR
2. **Efficiency:** Magic Number, LTV:CAC, CAC Payback, Burn Multiple
3. **Balance:** Rule of 40 score and composition
4. **Pipeline:** Coverage ratio, velocity, forecast accuracy
5. **Trends:** Quarter-over-quarter change for each metric with directional indicators
6. **Benchmarks:** How the company compares to stage-appropriate benchmarks
7. **Actions:** Top 3 initiatives to improve weakest metrics
FILE:revenue-operations/references/pipeline-management-framework.md
# Pipeline Management Framework
Best practices for pipeline management including stage definitions, conversion benchmarks, velocity optimization, and inspection cadence.
---
## Pipeline Stage Definitions
A well-defined pipeline requires clear, observable exit criteria at each stage. Subjective stages lead to inaccurate forecasting and unreliable conversion data.
### Recommended Stage Model (B2B SaaS)
| Stage | Name | Exit Criteria | Probability | Typical Duration |
|-------|------|--------------|-------------|-----------------|
| S0 | Lead | Contact identified, initial interest signal | 5% | 0-7 days |
| S1 | Discovery | Pain identified, budget confirmed, stakeholder engaged | 10% | 7-14 days |
| S2 | Qualification | MEDDPICC criteria met, mutual action plan created | 20% | 14-21 days |
| S3 | Proposal | Solution presented, pricing delivered, champion confirmed | 40% | 7-14 days |
| S4 | Negotiation | Commercial terms discussed, legal engaged, verbal commitment | 60% | 7-21 days |
| S5 | Commit | Contract redlined, signature timeline confirmed | 80% | 3-7 days |
| S6 | Closed Won | Signed contract received | 100% | -- |
| SL | Closed Lost | Deal disposition recorded with loss reason | 0% | -- |
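One common use of the stage probabilities is a probability-weighted pipeline value. The sketch below is an illustration under that assumption: `STAGE_PROBABILITY_PCT` and `weighted_pipeline` are hypothetical names, the percentages are the defaults from the table above, and in practice they would need to be calibrated against historical conversion data.

```python
# Default probabilities (in percent) from the stage model table.
STAGE_PROBABILITY_PCT = {
    "Lead": 5,
    "Discovery": 10,
    "Qualification": 20,
    "Proposal": 40,
    "Negotiation": 60,
    "Commit": 80,
    "Closed Won": 100,
    "Closed Lost": 0,
}


def weighted_pipeline(deals):
    """Sum of deal value times stage probability."""
    return sum(
        deal["value"] * STAGE_PROBABILITY_PCT[deal["stage"]] / 100
        for deal in deals
    )


deals = [
    {"name": "Acme Corp", "stage": "Proposal", "value": 85_000},
    {"name": "TechFlow Inc", "stage": "Discovery", "value": 42_000},
    {"name": "GlobalData Systems", "stage": "Negotiation", "value": 120_000},
]
print(weighted_pipeline(deals))  # 110200.0
```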
### Stage Exit Criteria Best Practices
**Discovery (S1) Exit Criteria:**
- Pain point articulated by prospect (not assumed by rep)
- Budget range discussed (even if informal)
- Decision-making process understood
- Next meeting scheduled with clear agenda
**Qualification (S2) Exit Criteria:**
- MEDDPICC or BANT qualification framework completed
- Economic buyer identified (not just champion)
- Compelling event or timeline identified
- Mutual action plan (MAP) shared and agreed upon
- Technical requirements understood
**Proposal (S3) Exit Criteria:**
- Solution demo completed and well-received
- Pricing proposal delivered
- Champion validated proposal internally
- Competitive landscape understood
- No unresolved technical blockers
**Negotiation (S4) Exit Criteria:**
- Commercial terms discussed (not just pricing, but payment terms, SLA, etc.)
- Legal review initiated
- Security/procurement review started
- Verbal agreement on core terms
- Close date confirmed within 30 days
**Commit (S5) Exit Criteria:**
- Final contract sent for signature
- All legal redlines resolved
- Procurement approval obtained
- Signature expected within 7 business days
---
## Conversion Benchmarks by Segment
### SMB (ACV <$25K)
| Transition | Benchmark | Top Quartile |
|-----------|-----------|--------------|
| Lead to Discovery | 20-30% | 35%+ |
| Discovery to Qualification | 40-50% | 55%+ |
| Qualification to Proposal | 50-60% | 65%+ |
| Proposal to Negotiation | 55-65% | 70%+ |
| Negotiation to Close | 65-75% | 80%+ |
| Overall Win Rate | 20-30% | 35%+ |
| Avg Cycle Length | 14-30 days | <14 days |
### Mid-Market (ACV $25K-$100K)
| Transition | Benchmark | Top Quartile |
|-----------|-----------|--------------|
| Lead to Discovery | 15-25% | 30%+ |
| Discovery to Qualification | 35-45% | 50%+ |
| Qualification to Proposal | 45-55% | 60%+ |
| Proposal to Negotiation | 50-60% | 65%+ |
| Negotiation to Close | 60-70% | 75%+ |
| Overall Win Rate | 15-25% | 30%+ |
| Avg Cycle Length | 30-60 days | <30 days |
### Enterprise (ACV >$100K)
| Transition | Benchmark | Top Quartile |
|-----------|-----------|--------------|
| Lead to Discovery | 10-20% | 25%+ |
| Discovery to Qualification | 30-40% | 45%+ |
| Qualification to Proposal | 40-50% | 55%+ |
| Proposal to Negotiation | 45-55% | 60%+ |
| Negotiation to Close | 55-65% | 70%+ |
| Overall Win Rate | 10-20% | 25%+ |
| Avg Cycle Length | 60-120 days | <60 days |
---
## Sales Velocity Optimization
Sales velocity = (# Opportunities × Avg Deal Size × Win Rate) / Avg Cycle Days
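The formula can be sketched as a small function. Because the cycle length is in days, the result is revenue per day; multiplying by 30 gives an approximate monthly figure. The function name and the input figures below are hypothetical.

```python
def sales_velocity(num_opportunities, avg_deal_size, win_rate, avg_cycle_days):
    """Expected revenue per day moving through the pipeline."""
    if avg_cycle_days <= 0:
        raise ValueError("avg_cycle_days must be positive")
    return num_opportunities * avg_deal_size * win_rate / avg_cycle_days


# Hypothetical pipeline: 15 opportunities, $70K average deal,
# 20% win rate, 45-day average cycle.
baseline = sales_velocity(15, 70_000, 0.20, 45)

# Same pipeline with the cycle shortened from 45 to 36 days.
shorter_cycle = sales_velocity(15, 70_000, 0.20, 36)

print(f"baseline:      ${baseline:,.0f}/day (~${baseline * 30:,.0f}/mo)")
print(f"shorter cycle: ${shorter_cycle:,.0f}/day")
```

Shortening the cycle by 20% raises velocity by 25% in this example, because cycle length sits in the denominator while the other three levers scale the result linearly.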
Each component is an optimization lever:
### Lever 1: Increase Opportunity Volume
**Strategies:**
- Invest in inbound marketing (content, SEO, paid)
- Scale outbound SDR capacity
- Develop partner/channel sourcing
- Launch product-led growth (PLG) motion
- Implement customer referral programs
**Measurement:** Pipeline created ($) per week/month, by source
### Lever 2: Increase Average Deal Size
**Strategies:**
- Multi-product bundling and packaging
- Usage-based pricing with growth triggers
- Land-and-expand with defined expansion playbooks
- Move upmarket with enterprise features
- Value-based pricing tied to customer outcomes
**Measurement:** ACV trend by quarter, by segment
### Lever 3: Increase Win Rate
**Strategies:**
- Implement MEDDPICC qualification rigor
- Build competitive battle cards and train on them
- Create multi-threaded relationships (not single-threaded)
- Develop ROI/business case tools
- Invest in sales engineering and demo quality
- Win/loss analysis with structured debriefs
**Measurement:** Win rate by stage entry, by competitor, by rep
### Lever 4: Decrease Sales Cycle Length
**Strategies:**
- Pre-qualify harder at S1/S2 to remove slow deals
- Mutual action plans with milestone dates
- Champion enablement (arm champions with internal selling materials)
- Parallel processing (legal/security review concurrent with evaluation)
- Standardized contracts and pre-approved terms
- Executive sponsor engagement for stuck deals
**Measurement:** Days in each stage, cycle length trend, stage-specific bottlenecks
---
## Pipeline Inspection Cadence
### Daily (Rep Level)
**Focus:** Deal-level activity and next steps
**Questions:**
- What is the next step for each deal in S3+?
- Are any deals missing next steps or scheduled meetings?
- Which deals have not been updated in >3 days?
### Weekly (Manager/Team Level)
**Focus:** Pipeline health and forecast accuracy
**Review Format (45-60 minutes):**
1. **Coverage Check (10 min)**
- Current pipeline vs. quota -- is coverage >3x?
- Pipeline created this week vs. target
- Net pipeline change (created minus closed minus lost)
2. **Deal Inspection (25 min)**
- Walk top 10 deals by value in S3+
- MEDDPICC validation for each commit deal
- Identify deals at risk (aging, single-threaded, no next step)
3. **Forecast Call (10 min)**
- Commit, best case, and pipeline forecast
- Changes from last week's forecast (what moved and why)
- Gaps to plan and remediation
4. **Action Items (5 min)**
- Deals needing executive engagement
- Pipeline generation actions for next week
- Coaching priorities
### Monthly (Leadership Level)
**Focus:** Pipeline trends, velocity, and efficiency
**Review Areas:**
- Month-over-month pipeline growth trend
- Conversion rate trends by stage
- Sales velocity trend (improving or declining?)
- Forecast accuracy (MAPE) for the month
- Rep performance distribution (quartile analysis)
- Pipeline source mix health
### Quarterly (Executive/Board Level)
**Focus:** GTM efficiency and strategic pipeline
**Review Areas:**
- Pipeline coverage for next 2-3 quarters
- LTV:CAC and Magic Number trends
- Sales efficiency ratio trends
- Market segment performance comparison
- New market/product pipeline contribution
- Competitive win/loss trends
---
## Pipeline Hygiene
### Deal Hygiene Standards
1. **Close date accuracy:** Close dates must be based on buyer commitment, not rep hope. Any deal pushed more than twice should be flagged for re-qualification.
2. **Stage accuracy:** Deals must meet exit criteria to be in a stage. No deal should be in Proposal (S3) without a pricing deliverable sent.
3. **Amount accuracy:** Deal amounts must reflect the current proposal, not aspirational upsell. Variance between deal value and proposal should be <10%.
4. **Contact coverage:** Deals >$50K should have 3+ contacts associated. Enterprise deals should have economic buyer, champion, and technical evaluator.
5. **Activity recency:** No deal should go 7+ days without logged activity. Deals without recent activity signal stalling.
### Pipeline Cleanup Triggers
Run cleanup when:
- Pipeline-to-quota ratio drops below 2.5x
- Forecast accuracy (MAPE) exceeds 20%
- More than 15% of pipeline is >90 days old
- Average deal age exceeds 1.5x normal cycle time
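As a rough illustration, the four triggers above can be evaluated in one pass. This is a sketch, not part of the skill's scripts: the function name `cleanup_triggers` and its parameter names are hypothetical, and it assumes each input has already been computed elsewhere.

```python
def cleanup_triggers(
    pipeline_to_quota: float,
    mape_pct: float,
    pct_pipeline_over_90_days: float,
    avg_deal_age_days: float,
    normal_cycle_days: float,
) -> list[str]:
    """Return the cleanup triggers from the list above that currently fire."""
    fired = []
    if pipeline_to_quota < 2.5:
        fired.append("coverage below 2.5x")
    if mape_pct > 20:
        fired.append("MAPE above 20%")
    if pct_pipeline_over_90_days > 15:
        fired.append("more than 15% of pipeline older than 90 days")
    if avg_deal_age_days > 1.5 * normal_cycle_days:
        fired.append("average deal age above 1.5x normal cycle")
    return fired


# Hypothetical inputs: 2.2x coverage, 12% MAPE, 18% of pipeline older than
# 90 days, 70-day average deal age against a 60-day normal cycle.
print(cleanup_triggers(2.2, 12.0, 18.0, 70.0, 60.0))
```

Any non-empty result would be a reason to start the cleanup process.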
### Cleanup Process
1. Flag all deals with close date in the past
2. Flag all deals with no activity in 14+ days
3. Flag all deals pushed 3+ times
4. Rep self-assessment: keep, push, or close for each flagged deal
5. Manager review and disposition
6. Update CRM and recalculate metrics
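Steps 1-3 of the process above are mechanical and could be scripted; steps 4-6 remain human decisions. The sketch below assumes a hypothetical deal record with `close_date`, `last_activity`, and `push_count` fields; real CRM exports will use different field names.

```python
from datetime import date


def cleanup_flags(deal: dict, today: date) -> list[str]:
    """Return the cleanup flags (steps 1-3) that apply to one open deal."""
    flags = []
    if deal["close_date"] < today:
        flags.append("close date in the past")
    if (today - deal["last_activity"]).days >= 14:
        flags.append("no activity in 14+ days")
    if deal["push_count"] >= 3:
        flags.append("pushed 3+ times")
    return flags


# Hypothetical example data.
today = date(2025, 1, 20)
stale = {
    "close_date": date(2025, 1, 10),
    "last_activity": date(2025, 1, 1),
    "push_count": 3,
}
print(cleanup_flags(stale, today))
```

Deals that return an empty list need no rep self-assessment in step 4.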
---
## Pipeline Risk Indicators
### Concentration Risk
**Definition:** Over-reliance on a small number of large deals.
**Thresholds:**
- Single deal >40% of pipeline = HIGH risk
- Single deal >25% of pipeline = MEDIUM risk
- Top 3 deals >70% of pipeline = HIGH risk
**Mitigation:** Diversify pipeline across segments, deal sizes, and sources. Increase deal count even if average deal size decreases.
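The thresholds above translate directly into a small classifier. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the `LOW` and `N/A` labels are added here for completeness and do not appear in the thresholds.

```python
def concentration_risk(deal_values: list[float]) -> str:
    """Classify concentration risk from open-deal values."""
    total = sum(deal_values)
    if total <= 0:
        return "N/A"  # assumption: no open pipeline means nothing to classify
    ranked = sorted(deal_values, reverse=True)
    largest_share = ranked[0] / total
    top3_share = sum(ranked[:3]) / total
    if largest_share > 0.40 or top3_share > 0.70:
        return "HIGH"
    if largest_share > 0.25:
        return "MEDIUM"
    return "LOW"  # assumption: label for pipelines under every threshold


# One 500K deal in a 1M pipeline is 50% of the total.
print(concentration_risk([500_000, 100_000, 100_000, 100_000, 100_000, 100_000]))
```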
### Stage Imbalance Risk
**Definition:** Pipeline is concentrated in early or late stages with gaps in between.
**Healthy Distribution:**
- Discovery/Qualification: 50-60% of pipeline value
- Proposal: 20-25% of pipeline value
- Negotiation/Commit: 15-20% of pipeline value
**Warning Signs:**
- >70% in early stages = insufficient progression
- >50% in late stages = insufficient pipeline generation
- Empty stages = broken funnel mechanics
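The warning signs above can be checked against pipeline value per stage. In this sketch the caller decides which stages count as early or late, because the guide does not define that boundary; the function and parameter names are hypothetical.

```python
def stage_imbalance_warnings(
    value_by_stage: dict[str, float],
    early_stages: set[str],
    late_stages: set[str],
) -> list[str]:
    """Return the stage-imbalance warning signs that apply."""
    total = sum(value_by_stage.values())
    if total <= 0:
        return ["no open pipeline"]
    early = sum(v for s, v in value_by_stage.items() if s in early_stages)
    late = sum(v for s, v in value_by_stage.items() if s in late_stages)
    warnings = []
    if early / total > 0.70:
        warnings.append("insufficient progression")
    if late / total > 0.50:
        warnings.append("insufficient pipeline generation")
    empty = [s for s, v in value_by_stage.items() if v == 0]
    if empty:
        warnings.append("empty stages: " + ", ".join(empty))
    return warnings


# Hypothetical pipeline: 80% of value in Discovery and nothing in Proposal.
pipeline = {"Discovery": 800_000, "Proposal": 0, "Negotiation": 200_000}
print(stage_imbalance_warnings(pipeline, {"Discovery"}, {"Negotiation"}))
```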
### Temporal Risk
**Definition:** Pipeline is concentrated in a single quarter or lacks coverage for future quarters.
**Standard:** Maintain 3x coverage for current quarter and 1.5x for next quarter.
### Source Risk
**Definition:** Pipeline is overly dependent on a single source (e.g., 80% outbound, 0% inbound).
**Healthy Mix (varies by stage):**
- Inbound/Marketing: 30-40%
- Outbound/SDR: 30-40%
- Partner/Channel: 10-20%
- Expansion/Customer: 10-20%
FILE:revenue-operations/references/revops-metrics-guide.md
# RevOps Metrics Guide
Complete reference for Revenue Operations metrics hierarchy, definitions, formulas, interpretation guidelines, and common mistakes.
---
## Metrics Hierarchy
Revenue Operations metrics are organized in a hierarchy from leading indicators (pipeline activity) through lagging indicators (efficiency outcomes):
```
Level 1: Activity Metrics (Leading)
├── Pipeline created ($, #)
├── Meetings booked
├── Proposals sent
└── Demo completion rate
Level 2: Pipeline Metrics (Mid-funnel)
├── Pipeline coverage ratio
├── Stage conversion rates
├── Sales velocity
├── Deal aging
└── Pipeline hygiene score
Level 3: Revenue Metrics (Outcomes)
├── Bookings (new, expansion, renewal)
├── Revenue (ARR, MRR, TCV)
├── Win rate
└── Average deal size
Level 4: Efficiency Metrics (Unit Economics)
├── Magic Number
├── LTV:CAC Ratio
├── CAC Payback Period
├── Burn Multiple
├── Rule of 40
└── Net Dollar Retention
Level 5: Strategic Metrics (Board-Level)
├── Revenue per employee
├── Gross margin trend
├── NRR cohort analysis
└── Customer health score
```
---
## Core Metric Definitions
### Pipeline Coverage Ratio
**Formula:** Total Weighted Pipeline / Quota Target
**What it measures:** Whether there is sufficient pipeline to meet revenue targets.
**Interpretation:**
- 4x+: Strong coverage, selective deal pursuit possible
- 3-4x: Healthy coverage, standard operations
- 2-3x: At risk, accelerate pipeline generation
- <2x: Critical, immediate pipeline intervention needed
**Common Mistakes:**
- Including closed-won deals in the pipeline total
- Not weighting by stage probability
- Using annual quota against quarterly pipeline
- Ignoring deal quality in favor of quantity
**Best Practice:** Measure coverage ratio weekly. Track by quarter to identify seasonal gaps early.
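A short worked example of the formula may help. The sketch below assumes each open deal carries a hypothetical `amount` and `stage_probability` field, and that closed-won deals were filtered out beforehand. Treating each band's lower bound as inclusive is also an assumption, since the ranges above share their endpoints.

```python
def coverage_ratio(open_deals: list[dict], quota: float) -> float:
    """Weighted pipeline divided by quota for the same period."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    weighted = sum(d["amount"] * d["stage_probability"] for d in open_deals)
    return weighted / quota


def interpret_coverage(ratio: float) -> str:
    """Map a coverage ratio to the interpretation bands above."""
    if ratio >= 4:
        return "Strong"
    if ratio >= 3:
        return "Healthy"
    if ratio >= 2:
        return "At risk"
    return "Critical"


# Hypothetical quarter: 600K weighted pipeline against a 200K quarterly quota.
deals = [
    {"amount": 400_000, "stage_probability": 0.50},
    {"amount": 1_000_000, "stage_probability": 0.25},
    {"amount": 200_000, "stage_probability": 0.75},
]
ratio = coverage_ratio(deals, quota=200_000)
print(ratio, interpret_coverage(ratio))
```

Both inputs must cover the same period; dividing quarterly pipeline by annual quota is the mismatch listed under Common Mistakes.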
---
### Stage Conversion Rates
**Formula:** # Deals advancing to Stage N+1 / # Deals entering Stage N
**What it measures:** Efficiency of progression through each pipeline stage.
**Typical SaaS Conversion Benchmarks:**
| Stage Transition | Median Rate | Top Quartile |
|-----------------|-------------|--------------|
| Lead to Qualification | 15-25% | 30%+ |
| Qualification to Proposal | 40-50% | 60%+ |
| Proposal to Negotiation | 50-60% | 70%+ |
| Negotiation to Close | 60-70% | 80%+ |
| Overall Win Rate | 15-25% | 30%+ |
**Common Mistakes:**
- Not standardizing stage exit criteria (subjective stages)
- Comparing conversion rates across different sales motions (PLG vs enterprise)
- Ignoring stage skipping (deals that jump stages inflate later conversion rates)
- Not segmenting by deal size or segment
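To make the formula concrete, the sketch below derives per-transition rates from counts of deals that entered each stage. It assumes no stage skipping, so the number of deals entering Stage N+1 equals the number advancing from Stage N. The function name and stage labels are illustrative.

```python
def stage_conversion_rates(
    entered: dict[str, int], stage_order: list[str]
) -> dict[str, float]:
    """Conversion rate for each consecutive pair of stages."""
    rates = {}
    for current, nxt in zip(stage_order, stage_order[1:]):
        if entered[current] == 0:
            continue  # a rate is undefined when no deals entered the stage
        rates[f"{current} -> {nxt}"] = entered[nxt] / entered[current]
    return rates


# Hypothetical counts for one segment and one sales motion.
order = ["Qualification", "Proposal", "Negotiation", "Closed Won"]
entered = {"Qualification": 100, "Proposal": 45, "Negotiation": 27, "Closed Won": 18}
for transition, rate in stage_conversion_rates(entered, order).items():
    print(f"{transition}: {rate:.0%}")
```

If deals do skip stages, count advancements per deal from stage history instead of comparing stage-entry totals.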
---
### Sales Velocity
**Formula:** (# Opportunities x Avg Deal Size x Win Rate) / Avg Sales Cycle Days
**What it measures:** The rate at which the pipeline generates revenue, measured as revenue per day.
**Components:**
1. **# Opportunities** -- Volume of qualified deals in pipeline
2. **Avg Deal Size** -- Average contract value of won deals
3. **Win Rate** -- Percentage of deals that close
4. **Avg Sales Cycle** -- Days from opportunity creation to close
**Optimization levers:**
- Increase opportunity volume (marketing/SDR investment)
- Increase deal size (pricing, packaging, upsell)
- Increase win rate (sales enablement, competitive positioning)
- Decrease cycle length (champion building, MEDDPICC adherence)
**Common Mistakes:**
- Using all pipeline deals instead of qualified opportunities
- Not normalizing for segment (SMB velocity vs Enterprise velocity)
- Conflating calendar time with active selling time
- Ignoring velocity trend in favor of absolute number
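The formula is simple enough to compute by hand, but a worked example shows how the levers combine. The function name and the input figures below are illustrative assumptions.

```python
def sales_velocity(
    opportunities: int,
    avg_deal_size: float,
    win_rate: float,
    avg_cycle_days: float,
) -> float:
    """Expected revenue per day from the qualified pipeline."""
    if avg_cycle_days <= 0:
        raise ValueError("avg_cycle_days must be positive")
    return opportunities * avg_deal_size * win_rate / avg_cycle_days


# Hypothetical segment: 50 qualified opportunities, 20K average deal,
# 25% win rate, 50-day cycle.
base = sales_velocity(50, 20_000, 0.25, 50)
# Shortening the cycle by 10 days, with everything else unchanged.
faster = sales_velocity(50, 20_000, 0.25, 40)
print(f"{base:,.0f} per day vs {faster:,.0f} per day")
```

Because cycle length sits in the denominator, a 20% shorter cycle lifts velocity by 25% in this example.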
---
### MAPE (Mean Absolute Percentage Error)
**Formula:** mean(|Actual - Forecast| / |Actual|) x 100
**What it measures:** Average forecast error magnitude as a percentage.
**Interpretation:**
| MAPE | Rating | Action |
|------|--------|--------|
| <10% | Excellent | Maintain current methodology |
| 10-15% | Good | Minor calibration adjustments |
| 15-25% | Fair | Methodology review needed |
| >25% | Poor | Fundamental process overhaul |
**Common Mistakes:**
- Using forecast vs. target instead of forecast vs. actual
- Not distinguishing between bias (systematic) and variance (random)
- Measuring only at the aggregate level (masks individual rep errors)
- Comparing MAPE across different time horizons (monthly vs quarterly)
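As a worked example of the formula, the sketch below computes MAPE over three hypothetical periods. Skipping periods with a zero actual is an assumption made here to avoid division by zero; the function name is illustrative.

```python
def mape(forecasts: list[float], actuals: list[float]) -> float:
    """Mean absolute percentage error, as a percentage."""
    errors = [
        abs(a - f) / abs(a)
        for f, a in zip(forecasts, actuals)
        if a != 0  # assumption: periods with zero actuals are skipped
    ]
    if not errors:
        raise ValueError("need at least one period with a non-zero actual")
    return sum(errors) / len(errors) * 100


# Hypothetical periods with per-period errors of 10%, 5%, and 20%.
forecasts = [1_100_000, 950_000, 1_200_000]
actuals = [1_000_000, 1_000_000, 1_000_000]
print(f"MAPE: {mape(forecasts, actuals):.1f}%")
```

The three errors average to about 11.7%, which lands in the Good band of the table above.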
---
### Forecast Bias
**Formula:** mean(Forecast - Actual) / mean(Actual) x 100
**What it measures:** Systematic tendency to over-forecast or under-forecast.
**Types:**
- **Positive bias (over-forecasting):** Forecast consistently exceeds actual. Often indicates optimistic deal assessment, insufficient qualification, or sandbagging reversal.
- **Negative bias (under-forecasting):** Actual consistently exceeds forecast. Often indicates conservative call culture, late-stage deals arriving unexpectedly, or poor pipeline visibility.
**Healthy Range:** Bias within +/- 5% of actual is considered well-calibrated.
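Bias and MAPE answer different questions, which a small example can show. In this sketch, with a hypothetical function name and made-up figures, one forecast series misses in both directions and nets to zero bias, while the other misses high every period.

```python
def forecast_bias_pct(forecasts: list[float], actuals: list[float]) -> float:
    """Signed bias: mean(forecast - actual) / mean(actual) x 100."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("actuals must not sum to zero")
    # Dividing the sums is equivalent to dividing the means.
    total_error = sum(f - a for f, a in zip(forecasts, actuals))
    return total_error / total_actual * 100


actuals = [100.0, 100.0]
offsetting = forecast_bias_pct([110.0, 90.0], actuals)   # misses cancel out
optimistic = forecast_bias_pct([110.0, 120.0], actuals)  # misses all run high
print(f"offsetting: {offsetting:+.1f}%  optimistic: {optimistic:+.1f}%")
```

The offsetting series has zero bias even though each period is off by 10%, so bias should be read alongside MAPE, not instead of it.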
---
### Magic Number
**Formula:** Net New ARR / Prior Period S&M Spend
**What it measures:** Efficiency of sales & marketing spend in generating new revenue.
**Interpretation:**
- >1.0: Extremely efficient, consider increasing GTM investment
- 0.75-1.0: Healthy efficiency, optimize and scale
- 0.50-0.75: Acceptable, focus on channel/spend optimization
- <0.50: Inefficient, audit spend allocation and productivity
**Common Mistakes:**
- Using total revenue instead of net new ARR
- Including expansion ARR (Magic Number measures new logo efficiency)
- Using current period spend instead of prior period (lag effect)
- Not separating sales spend from marketing spend for diagnostics
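A worked example, using illustrative figures and hypothetical function names, shows the lag built into the formula: the ARR added this period is divided by the spend from the period before. Treating each band's lower bound as inclusive is an assumption.

```python
def magic_number(net_new_arr: float, prior_period_sm_spend: float) -> float:
    """Net new ARR this period per dollar of S&M spend last period."""
    if prior_period_sm_spend <= 0:
        raise ValueError("prior_period_sm_spend must be positive")
    return net_new_arr / prior_period_sm_spend


def interpret_magic_number(value: float) -> str:
    """Map a Magic Number to the interpretation bands above."""
    if value > 1.0:
        return "Extremely efficient"
    if value >= 0.75:
        return "Healthy"
    if value >= 0.50:
        return "Acceptable"
    return "Inefficient"


# Hypothetical: 1.2M net new ARR this quarter, 1.5M S&M spend last quarter.
value = magic_number(1_200_000, 1_500_000)
print(f"{value:.2f} ({interpret_magic_number(value)})")
```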
---
### LTV:CAC Ratio
**Formula:** Customer Lifetime Value / Customer Acquisition Cost
**Where:**
- LTV = (ARPA x Gross Margin) / Churn Rate
- ARPA = Average Revenue Per Account (annualized)
- CAC = Total S&M Spend / New Customers Acquired
**Target:** >3:1 is healthy; >5:1 may indicate under-investment in growth
**Common Mistakes:**
- Using revenue instead of gross-margin-weighted revenue in LTV
- Not including all acquisition costs (SDR, marketing, sales engineering)
- Using blended churn instead of cohort-specific churn
- Comparing across segments without normalizing (enterprise LTV:CAC is naturally higher)
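The definitions above chain together as follows. This is a sketch with illustrative figures and a hypothetical function name; gross margin and churn are passed as fractions (0.80, 0.10) rather than percentages.

```python
def ltv_cac_ratio(
    arpa_annual: float,
    gross_margin: float,
    annual_churn_rate: float,
    cac: float,
) -> float:
    """LTV:CAC, where LTV = (ARPA x gross margin) / churn rate."""
    if annual_churn_rate <= 0 or cac <= 0:
        raise ValueError("churn rate and CAC must be positive")
    ltv = arpa_annual * gross_margin / annual_churn_rate
    return ltv / cac


# Hypothetical account: 12K ARPA, 80% gross margin, 10% annual churn, 24K CAC.
# LTV = 12,000 x 0.80 / 0.10 = 96,000, so the ratio is 96,000 / 24,000.
ratio = ltv_cac_ratio(12_000, 0.80, 0.10, 24_000)
print(f"{ratio:.1f}:1")
```

Leaving out the gross margin term would give an LTV of 120,000 and a ratio of 5:1, which is the overstatement the first common mistake describes.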
---
### CAC Payback Period
**Formula:** CAC / (ARPA_monthly x Gross Margin)
**What it measures:** Months to recover the cost of acquiring a customer.
**Interpretation:**
- <12 months: Excellent capital efficiency
- 12-18 months: Healthy, especially for mid-market/enterprise
- 18-24 months: Acceptable for enterprise, concerning for SMB
- >24 months: Capital-intensive, needs optimization
**Common Mistakes:**
- Using revenue instead of gross-margin contribution
- Ignoring expansion revenue in payback calculation (conservative approach)
- Comparing SMB payback to enterprise payback without context
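For a concrete sense of scale, the sketch below applies the formula to illustrative figures. The function name is hypothetical, and gross margin is passed as a fraction.

```python
def cac_payback_months(cac: float, arpa_monthly: float, gross_margin: float) -> float:
    """Months of gross-margin contribution needed to recover CAC."""
    monthly_contribution = arpa_monthly * gross_margin
    if monthly_contribution <= 0:
        raise ValueError("monthly gross-margin contribution must be positive")
    return cac / monthly_contribution


# Hypothetical: 12K CAC, 1K monthly ARPA, 80% gross margin.
months = cac_payback_months(12_000, 1_000, 0.80)
# The same inputs with revenue used in place of gross-margin contribution.
months_on_revenue = cac_payback_months(12_000, 1_000, 1.0)
print(f"{months:.0f} months (vs {months_on_revenue:.0f} on raw revenue)")
```

Using raw revenue makes payback look three months shorter here, which is why the gross-margin adjustment matters.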
---
### Burn Multiple
**Formula:** Net Burn / Net New ARR
**What it measures:** How much cash is consumed for each dollar of new ARR.
**Interpretation (David Sacks framework):**
- <1.0x: Amazing -- hyper-efficient growth
- 1.0-1.5x: Great -- strong capital efficiency
- 1.5-2.0x: Good -- healthy burn rate
- 2.0-3.0x: Suspect -- needs attention
- >3.0x: Bad -- unsustainable without course correction
**Common Mistakes:**
- Using gross burn instead of net burn
- Not annualizing ARR when using quarterly burn
- Ignoring the denominator quality (all new ARR is not equal)
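The bands above can be applied programmatically. In this sketch the function names are hypothetical, and where a boundary value belongs (for example exactly 1.5x) is an assumption, since the listed ranges share their endpoints. A non-positive net new ARR is treated as undefined rather than scored.

```python
def burn_multiple(net_burn: float, net_new_arr: float) -> float:
    """Cash burned per dollar of net new ARR in the same period."""
    if net_new_arr <= 0:
        raise ValueError("burn multiple is undefined without positive net new ARR")
    return net_burn / net_new_arr


def interpret_burn_multiple(value: float) -> str:
    """Map a burn multiple to the interpretation bands above."""
    if value < 1.0:
        return "Amazing"
    if value < 1.5:
        return "Great"
    if value < 2.0:
        return "Good"
    if value <= 3.0:
        return "Suspect"
    return "Bad"


# Hypothetical year: 2.4M net burn against 2.0M net new ARR.
value = burn_multiple(2_400_000, 2_000_000)
print(f"{value:.1f}x ({interpret_burn_multiple(value)})")
```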
---
### Rule of 40
**Formula:** Revenue Growth Rate (%) + Free Cash Flow Margin (%)
**What it measures:** Balance between growth and profitability.
**Interpretation:**
- >60%: Elite SaaS company
- 40-60%: Strong performance
- 20-40%: Acceptable, optimize one dimension
- <20%: Needs significant improvement
**Common Mistakes:**
- Using EBITDA margin instead of FCF margin
- Comparing early-stage (growth-heavy) with late-stage (margin-heavy)
- Not considering the composition (80% growth with a -40% FCF margin scores the same 40% as 30% growth with a 10% FCF margin)
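Two very different companies can land on the same score, which is why composition matters. The sketch below, with a hypothetical function name and made-up profiles, returns the score together with its components so the mix stays visible.

```python
def rule_of_40(revenue_growth_pct: float, fcf_margin_pct: float) -> dict:
    """Rule of 40 score plus the components that produced it."""
    return {
        "score": revenue_growth_pct + fcf_margin_pct,
        "growth_pct": revenue_growth_pct,
        "fcf_margin_pct": fcf_margin_pct,
    }


# Hypothetical profiles: a cash-burning hyper-grower and a profitable steady grower.
hyper_growth = rule_of_40(80, -40)
steady = rule_of_40(30, 10)
print(hyper_growth["score"], steady["score"])
```

Both profiles score 40, yet one depends on continued access to capital and the other funds itself.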
---
### Net Dollar Retention (NDR)
**Formula:** (Beginning ARR + Expansion - Contraction - Churn) / Beginning ARR x 100
**What it measures:** Revenue retention and expansion from existing customers.
**Interpretation:**
- >130%: World-class expansion (Snowflake, Datadog)
- 120-130%: Excellent land-and-expand
- 110-120%: Strong retention with moderate expansion
- 100-110%: Stable base, limited expansion
- <100%: Net revenue contraction -- critical concern
**Common Mistakes:**
- Including new logos in the calculation
- Not normalizing for cohort age (newer cohorts expand differently)
- Confusing gross retention with net retention
- Using logo retention as a proxy for dollar retention
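A worked example helps separate net from gross retention. The NDR function follows the formula above. The gross retention function uses the common definition that excludes expansion, which is an assumption here because this guide does not define it. Names and figures are illustrative.

```python
def net_dollar_retention(
    beginning_arr: float, expansion: float, contraction: float, churn: float
) -> float:
    """NDR as a percentage of beginning ARR (existing customers only)."""
    if beginning_arr <= 0:
        raise ValueError("beginning_arr must be positive")
    return (beginning_arr + expansion - contraction - churn) / beginning_arr * 100


def gross_dollar_retention(
    beginning_arr: float, contraction: float, churn: float
) -> float:
    """Gross retention: the same cohort with expansion left out (assumed definition)."""
    if beginning_arr <= 0:
        raise ValueError("beginning_arr must be positive")
    return (beginning_arr - contraction - churn) / beginning_arr * 100


# Hypothetical cohort: 10M beginning ARR, 2M expansion, 0.5M contraction, 1M churn.
ndr = net_dollar_retention(10_000_000, 2_000_000, 500_000, 1_000_000)
gdr = gross_dollar_retention(10_000_000, 500_000, 1_000_000)
print(f"NDR {ndr:.0f}%, gross retention {gdr:.0f}%")
```

An NDR above 100% can coexist with gross retention well below it, because expansion from surviving accounts masks the revenue that was lost.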
---
## Metric Interdependencies
Understanding how metrics relate prevents conflicting optimizations:
1. **Magic Number and LTV:CAC** -- Both use S&M spend but measure different horizons. Magic Number is period-specific; LTV:CAC is lifetime.
2. **Burn Multiple and Rule of 40** -- Both measure efficiency but from different angles. Burn Multiple is cash-focused; Rule of 40 balances growth with profitability.
3. **Pipeline Coverage and Sales Velocity** -- High coverage with low velocity means pipeline is stagnating. Both must be healthy.
4. **NDR and LTV** -- NDR directly impacts LTV. Improving NDR is the highest-leverage way to improve LTV:CAC.
5. **Win Rate and Deal Size** -- Often inversely correlated. Moving upmarket increases deal size but may reduce win rate.
---
## Measurement Cadence
| Metric | Cadence | Owner |
|--------|---------|-------|
| Pipeline Coverage | Weekly | Sales Leadership |
| Stage Conversion | Bi-weekly | Sales Ops |
| Sales Velocity | Monthly | RevOps |
| Forecast Accuracy (MAPE) | Monthly/Quarterly | RevOps |
| Magic Number | Quarterly | CRO/CFO |
| LTV:CAC | Quarterly | Finance/RevOps |
| CAC Payback | Quarterly | Finance |
| Burn Multiple | Quarterly | CFO |
| Rule of 40 | Quarterly/Annual | CEO/Board |
| NDR | Quarterly | CS/RevOps |
FILE:revenue-operations/scripts/forecast_accuracy_tracker.py
#!/usr/bin/env python3
"""Forecast Accuracy Tracker - Measures forecast accuracy and bias for SaaS revenue teams.
Calculates MAPE (Mean Absolute Percentage Error), detects systematic forecasting
bias, analyzes accuracy trends, and provides category-level breakdowns.
Usage:
python forecast_accuracy_tracker.py forecast_data.json --format text
python forecast_accuracy_tracker.py forecast_data.json --format json
"""
import argparse
import json
import sys
from typing import Any
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def calculate_mape(periods: list[dict]) -> float:
"""Calculate Mean Absolute Percentage Error.
Formula: mean(|actual - forecast| / |actual|) x 100
Args:
periods: List of dicts with 'forecast' and 'actual' keys.
Returns:
MAPE as a percentage.
"""
if not periods:
return 0.0
errors = []
for p in periods:
actual = p["actual"]
forecast = p["forecast"]
if actual != 0:
errors.append(abs(actual - forecast) / abs(actual))
if not errors:
return 0.0
return (sum(errors) / len(errors)) * 100
def calculate_weighted_mape(periods: list[dict]) -> float:
"""Calculate value-weighted MAPE.
Weights each period's error by its actual value, giving more importance
to larger periods.
Args:
periods: List of dicts with 'forecast' and 'actual' keys.
Returns:
Weighted MAPE as a percentage.
"""
if not periods:
return 0.0
total_actual = sum(abs(p["actual"]) for p in periods)
if total_actual == 0:
return 0.0
weighted_errors = 0.0
for p in periods:
actual = p["actual"]
forecast = p["forecast"]
if actual != 0:
weight = abs(actual) / total_actual
weighted_errors += weight * (abs(actual - forecast) / abs(actual))
return weighted_errors * 100
def get_accuracy_rating(mape: float) -> dict[str, str]:
"""Return accuracy rating based on MAPE threshold.
Ratings:
Excellent: <10%
Good: 10-15%
Fair: 15-25%
Poor: >25%
"""
if mape < 10:
return {"rating": "Excellent", "description": "Highly predictable, data-driven process"}
elif mape < 15:
return {"rating": "Good", "description": "Reliable forecasting with minor variance"}
elif mape < 25:
return {"rating": "Fair", "description": "Needs process improvement"}
else:
return {"rating": "Poor", "description": "Significant forecasting methodology gaps"}
def analyze_bias(periods: list[dict]) -> dict[str, Any]:
    """Analyze systematic forecasting bias.

    Positive bias = over-forecasting (forecast > actual, i.e., actual fell short)
    Negative bias = under-forecasting (forecast < actual, i.e., actual exceeded)

    Args:
        periods: List of dicts with 'forecast' and 'actual' keys.

    Returns:
        Bias analysis with direction, magnitude, and ratio.
    """
    if not periods:
        # Return the same keys as the non-empty case: the text report reads
        # 'avg_bias_amount' and would raise KeyError if it were missing.
        return {
            "direction": "None",
            "avg_bias_amount": 0.0,
            "bias_pct": 0.0,
            "over_forecast_count": 0,
            "under_forecast_count": 0,
            "exact_count": 0,
            "bias_ratio": 0.0,
        }

    over_count = 0
    under_count = 0
    exact_count = 0
    total_bias = 0.0

    for p in periods:
        diff = p["forecast"] - p["actual"]
        total_bias += diff
        if diff > 0:
            over_count += 1
        elif diff < 0:
            under_count += 1
        else:
            exact_count += 1

    avg_bias = total_bias / len(periods)
    total_actual = sum(p["actual"] for p in periods)
    bias_pct = safe_divide(total_bias, total_actual) * 100

    # Direction is decided by how many periods missed high vs. low,
    # not by the sign of the aggregate bias.
    if over_count > under_count:
        direction = "Over-forecasting"
    elif under_count > over_count:
        direction = "Under-forecasting"
    else:
        direction = "Balanced"

    # Share of non-exact periods that were over-forecast (0.5 = balanced).
    bias_ratio = safe_divide(over_count, over_count + under_count)

    return {
        "direction": direction,
        "avg_bias_amount": round(avg_bias, 2),
        "bias_pct": round(bias_pct, 1),
        "over_forecast_count": over_count,
        "under_forecast_count": under_count,
        "exact_count": exact_count,
        "bias_ratio": round(bias_ratio, 2),
    }
def analyze_trend(periods: list[dict]) -> dict[str, Any]:
"""Analyze period-over-period accuracy trend.
Determines if forecast accuracy is improving, stable, or declining
by comparing error rates across consecutive periods.
Args:
periods: List of dicts with 'period', 'forecast', and 'actual' keys.
Returns:
Trend analysis with direction and period details.
"""
if len(periods) < 2:
return {
"trend": "Insufficient data",
"period_errors": [],
"improving_periods": 0,
"declining_periods": 0,
}
period_errors = []
for p in periods:
actual = p["actual"]
forecast = p["forecast"]
if actual != 0:
error_pct = abs(actual - forecast) / abs(actual) * 100
else:
error_pct = 0.0
period_errors.append({
"period": p.get("period", "Unknown"),
"error_pct": round(error_pct, 1),
"forecast": forecast,
"actual": actual,
})
improving = 0
declining = 0
for i in range(1, len(period_errors)):
if period_errors[i]["error_pct"] < period_errors[i - 1]["error_pct"]:
improving += 1
elif period_errors[i]["error_pct"] > period_errors[i - 1]["error_pct"]:
declining += 1
if improving > declining:
trend = "Improving"
elif declining > improving:
trend = "Declining"
else:
trend = "Stable"
# Calculate recent vs historical MAPE
midpoint = len(periods) // 2
if midpoint > 0:
early_mape = calculate_mape(periods[:midpoint])
recent_mape = calculate_mape(periods[midpoint:])
mape_change = recent_mape - early_mape
else:
early_mape = 0.0
recent_mape = 0.0
mape_change = 0.0
return {
"trend": trend,
"period_errors": period_errors,
"improving_periods": improving,
"declining_periods": declining,
"early_mape": round(early_mape, 1),
"recent_mape": round(recent_mape, 1),
"mape_change": round(mape_change, 1),
}
def analyze_categories(category_breakdowns: dict) -> dict[str, Any]:
"""Analyze accuracy by category (rep, product, segment, etc.).
Args:
category_breakdowns: Dict of category_name -> list of
{category, forecast, actual} dicts.
Returns:
Category-level MAPE and accuracy analysis.
"""
results = {}
for category_name, entries in category_breakdowns.items():
category_results = []
for entry in entries:
actual = entry["actual"]
forecast = entry["forecast"]
if actual != 0:
error_pct = abs(actual - forecast) / abs(actual) * 100
else:
error_pct = 0.0
diff = forecast - actual
if diff > 0:
bias = "Over"
elif diff < 0:
bias = "Under"
else:
bias = "Exact"
rating = get_accuracy_rating(error_pct)
category_results.append({
"category": entry["category"],
"forecast": forecast,
"actual": actual,
"error_pct": round(error_pct, 1),
"bias": bias,
"variance": round(diff, 2),
"rating": rating["rating"],
})
# Sort by error percentage (worst first)
category_results.sort(key=lambda x: x["error_pct"], reverse=True)
overall_mape = calculate_mape(entries)
results[category_name] = {
"entries": category_results,
"overall_mape": round(overall_mape, 1),
"overall_rating": get_accuracy_rating(overall_mape)["rating"],
}
return results
def generate_recommendations(
mape: float, bias: dict, trend: dict, categories: dict
) -> list[str]:
"""Generate actionable recommendations based on analysis results.
Args:
mape: Overall MAPE percentage.
bias: Bias analysis results.
trend: Trend analysis results.
categories: Category analysis results.
Returns:
List of recommendation strings.
"""
recommendations = []
# MAPE-based recommendations
if mape > 25:
recommendations.append(
"CRITICAL: MAPE exceeds 25%. Implement structured forecasting methodology "
"(e.g., weighted pipeline with stage-based probabilities)."
)
elif mape > 15:
recommendations.append(
"Forecast accuracy needs improvement. Consider implementing deal-level "
"forecasting with commit/upside/pipeline categories."
)
# Bias-based recommendations
if bias["direction"] == "Over-forecasting" and abs(bias["bias_pct"]) > 10:
recommendations.append(
f"Systematic over-forecasting detected ({bias['bias_pct']}% bias). "
"Review deal qualification criteria and apply more conservative "
"stage probabilities."
)
elif bias["direction"] == "Under-forecasting" and abs(bias["bias_pct"]) > 10:
recommendations.append(
f"Systematic under-forecasting detected ({bias['bias_pct']}% bias). "
"Review upside deals more carefully and improve pipeline visibility."
)
# Trend-based recommendations
if trend["trend"] == "Declining":
recommendations.append(
"Forecast accuracy is declining over time. Schedule a forecasting "
"methodology review and retrain the team on forecasting best practices."
)
elif trend["trend"] == "Improving":
recommendations.append(
"Forecast accuracy is improving. Continue current methodology and "
"document best practices for consistency."
)
# Category-based recommendations
for cat_name, cat_data in categories.items():
worst_entries = [
e for e in cat_data["entries"] if e["error_pct"] > 25
]
if worst_entries:
names = ", ".join(e["category"] for e in worst_entries[:3])
recommendations.append(
f"High error rates in {cat_name}: {names}. "
f"Provide targeted coaching on forecasting discipline."
)
if not recommendations:
recommendations.append(
"Forecasting performance is strong. Maintain current processes "
"and continue monitoring for drift."
)
return recommendations
def track_forecast_accuracy(data: dict) -> dict[str, Any]:
"""Run complete forecast accuracy analysis.
Args:
data: Forecast data with periods and optional category breakdowns.
Returns:
Complete forecast accuracy analysis results.
"""
periods = data["forecast_periods"]
mape = calculate_mape(periods)
weighted_mape = calculate_weighted_mape(periods)
rating = get_accuracy_rating(mape)
bias = analyze_bias(periods)
trend = analyze_trend(periods)
categories = {}
if "category_breakdowns" in data:
categories = analyze_categories(data["category_breakdowns"])
recommendations = generate_recommendations(mape, bias, trend, categories)
return {
"mape": round(mape, 1),
"weighted_mape": round(weighted_mape, 1),
"accuracy_rating": rating,
"bias": bias,
"trend": trend,
"category_breakdowns": categories,
"recommendations": recommendations,
"periods_analyzed": len(periods),
}
def format_currency(value: float) -> str:
    """Format a number as currency, abbreviating thousands (K) and millions (M)."""
    if abs(value) >= 1_000_000:
        return f"${value / 1_000_000:,.1f}M"
    elif abs(value) >= 1_000:
        return f"${value / 1_000:,.1f}K"
    return f"${value:,.0f}"
def format_text_report(results: dict) -> str:
"""Format analysis results as a human-readable text report."""
lines = []
lines.append("=" * 70)
lines.append("FORECAST ACCURACY REPORT")
lines.append("=" * 70)
# Overall accuracy
lines.append("")
lines.append("OVERALL ACCURACY")
lines.append("-" * 40)
lines.append(f" MAPE: {results['mape']}%")
lines.append(f" Weighted MAPE: {results['weighted_mape']}%")
lines.append(f" Rating: {results['accuracy_rating']['rating']}")
lines.append(f" Assessment: {results['accuracy_rating']['description']}")
lines.append(f" Periods Analyzed: {results['periods_analyzed']}")
# Bias analysis
bias = results["bias"]
lines.append("")
lines.append("FORECAST BIAS")
lines.append("-" * 40)
lines.append(f" Direction: {bias['direction']}")
lines.append(f" Bias %: {bias['bias_pct']}%")
lines.append(f" Avg Bias Amount: {format_currency(bias['avg_bias_amount'])}")
lines.append(f" Over-forecast: {bias['over_forecast_count']} periods")
lines.append(f" Under-forecast: {bias['under_forecast_count']} periods")
lines.append(f" Bias Ratio: {bias['bias_ratio']}")
# Trend analysis
trend = results["trend"]
lines.append("")
lines.append("ACCURACY TREND")
lines.append("-" * 40)
lines.append(f" Trend: {trend['trend']}")
lines.append(f" Improving: {trend['improving_periods']} periods")
lines.append(f" Declining: {trend['declining_periods']} periods")
if trend.get("early_mape") is not None and trend["trend"] != "Insufficient data":
lines.append(f" Early MAPE: {trend['early_mape']}%")
lines.append(f" Recent MAPE: {trend['recent_mape']}%")
lines.append(f" MAPE Change: {trend['mape_change']:+.1f}%")
if trend.get("period_errors"):
lines.append("")
lines.append(" PERIOD DETAIL:")
for pe in trend["period_errors"]:
lines.append(
f" {pe['period']:12s} "
f"Forecast: {format_currency(pe['forecast']):>10s} "
f"Actual: {format_currency(pe['actual']):>10s} "
f"Error: {pe['error_pct']}%"
)
# Category breakdowns
if results["category_breakdowns"]:
lines.append("")
lines.append("CATEGORY BREAKDOWN")
lines.append("-" * 40)
for cat_name, cat_data in results["category_breakdowns"].items():
lines.append(
f"\n {cat_name.upper()} (Overall MAPE: {cat_data['overall_mape']}% "
f"- {cat_data['overall_rating']})"
)
for entry in cat_data["entries"]:
lines.append(
f" {entry['category']:20s} "
f"Error: {entry['error_pct']:5.1f}% "
f"Bias: {entry['bias']:5s} "
f"Rating: {entry['rating']}"
)
# Recommendations
lines.append("")
lines.append("RECOMMENDATIONS")
lines.append("-" * 40)
for i, rec in enumerate(results["recommendations"], 1):
lines.append(f" {i}. {rec}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point for forecast accuracy tracker CLI."""
parser = argparse.ArgumentParser(
description="Track and analyze forecast accuracy for SaaS revenue teams."
)
parser.add_argument(
"input",
help="Path to JSON file containing forecast data",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
try:
with open(args.input, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input}: {e}", file=sys.stderr)
sys.exit(1)
if "forecast_periods" not in data:
print("Error: Missing required field 'forecast_periods' in input data", file=sys.stderr)
sys.exit(1)
results = track_forecast_accuracy(data)
if args.format == "json":
print(json.dumps(results, indent=2))
else:
print(format_text_report(results))
if __name__ == "__main__":
main()
FILE:revenue-operations/scripts/gtm_efficiency_calculator.py
#!/usr/bin/env python3
"""GTM Efficiency Calculator - Calculates go-to-market efficiency metrics for SaaS.
Computes Magic Number, LTV:CAC, CAC Payback, Burn Multiple, Rule of 40,
and Net Dollar Retention with industry benchmarking and ratings.
Usage:
python gtm_efficiency_calculator.py gtm_data.json --format text
python gtm_efficiency_calculator.py gtm_data.json --format json
"""
import argparse
import json
import sys
from typing import Any
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
# --- Benchmark tables ---
# Each benchmark defines green/yellow/red thresholds
# and optional percentile placement guidance
BENCHMARKS = {
"magic_number": {
"green": {"min": 0.75, "label": ">0.75 - Efficient GTM spend"},
"yellow": {"min": 0.50, "max": 0.75, "label": "0.50-0.75 - Acceptable efficiency"},
"red": {"max": 0.50, "label": "<0.50 - Inefficient GTM spend"},
"elite": 1.0,
"description": "Net New ARR / Prior Period S&M Spend",
},
"ltv_cac_ratio": {
"green": {"min": 3.0, "label": ">3:1 - Strong unit economics"},
"yellow": {"min": 1.0, "max": 3.0, "label": "1:1-3:1 - Marginal unit economics"},
"red": {"max": 1.0, "label": "<1:1 - Unsustainable unit economics"},
"elite": 5.0,
"description": "Customer LTV / Customer Acquisition Cost",
},
"cac_payback_months": {
"green": {"max": 18, "label": "<18 months - Healthy payback"},
"yellow": {"min": 18, "max": 24, "label": "18-24 months - Acceptable payback"},
"red": {"min": 24, "label": ">24 months - Capital intensive"},
"elite": 12,
"description": "CAC / (ARPA x Gross Margin) in months",
},
"burn_multiple": {
"green": {"max": 2.0, "label": "<2x - Capital efficient growth"},
"yellow": {"min": 2.0, "max": 4.0, "label": "2-4x - Moderate burn"},
"red": {"min": 4.0, "label": ">4x - Unsustainable burn"},
"elite": 1.0,
"description": "Net Burn / Net New ARR",
},
"rule_of_40": {
"green": {"min": 40, "label": ">40% - Strong balance of growth & profitability"},
"yellow": {"min": 20, "max": 40, "label": "20-40% - Acceptable balance"},
"red": {"max": 20, "label": "<20% - Needs improvement"},
"elite": 60,
"description": "Revenue Growth % + FCF Margin %",
},
"ndr_pct": {
"green": {"min": 110, "label": ">110% - Strong expansion revenue"},
"yellow": {"min": 100, "max": 110, "label": "100-110% - Stable base"},
"red": {"max": 100, "label": "<100% - Net revenue contraction"},
"elite": 130,
"description": "(Begin ARR + Expansion - Contraction - Churn) / Begin ARR",
},
}
def rate_metric(metric_name: str, value: float) -> dict[str, str]:
"""Rate a metric as Green/Yellow/Red based on benchmark thresholds.
Args:
metric_name: Key into BENCHMARKS dict.
value: The metric value to rate.
Returns:
Dict with rating color, label, and percentile guidance.
"""
bench = BENCHMARKS.get(metric_name)
if not bench:
return {"rating": "Unknown", "label": "No benchmark available"}
# For metrics where lower is better (cac_payback, burn_multiple)
lower_is_better = metric_name in ("cac_payback_months", "burn_multiple")
if lower_is_better:
if "max" in bench["green"] and value <= bench["green"]["max"]:
rating = "Green"
label = bench["green"]["label"]
elif "min" in bench.get("yellow", {}) and "max" in bench.get("yellow", {}):
if bench["yellow"]["min"] <= value <= bench["yellow"]["max"]:
rating = "Yellow"
label = bench["yellow"]["label"]
else:
rating = "Red"
label = bench["red"]["label"]
else:
rating = "Red"
label = bench["red"]["label"]
else:
if "min" in bench["green"] and value >= bench["green"]["min"]:
rating = "Green"
label = bench["green"]["label"]
elif "min" in bench.get("yellow", {}) and "max" in bench.get("yellow", {}):
if bench["yellow"]["min"] <= value <= bench["yellow"]["max"]:
rating = "Yellow"
label = bench["yellow"]["label"]
else:
rating = "Red"
label = bench["red"]["label"]
else:
rating = "Red"
label = bench["red"]["label"]
# Percentile placement (simplified)
elite = bench.get("elite", 0)
if lower_is_better:
if elite > 0 and value > 0:
if value <= elite:
percentile = "Top 10%"
elif rating == "Green":
percentile = "Top 25%"
elif rating == "Yellow":
percentile = "Median"
else:
percentile = "Below median"
else:
percentile = "N/A"
else:
if elite > 0:
if value >= elite:
percentile = "Top 10%"
elif rating == "Green":
percentile = "Top 25%"
elif rating == "Yellow":
percentile = "Median"
else:
percentile = "Below median"
else:
percentile = "N/A"
return {
"rating": rating,
"label": label,
"percentile": percentile,
}
def calculate_magic_number(net_new_arr: float, sm_spend: float) -> dict[str, Any]:
"""Calculate Magic Number.
Formula: Net New ARR / Prior Period S&M Spend
Target: >0.75
Args:
net_new_arr: Net new annual recurring revenue in the period.
sm_spend: Sales & marketing spend in the prior period.
Returns:
Magic number value with rating and benchmark.
"""
value = safe_divide(net_new_arr, sm_spend)
benchmark = rate_metric("magic_number", value)
return {
"value": round(value, 2),
"net_new_arr": net_new_arr,
"sm_spend": sm_spend,
"formula": "Net New ARR / Prior Period S&M Spend",
"target": ">0.75",
**benchmark,
}
def calculate_ltv_cac(
arpa_monthly: float,
gross_margin_pct: float,
annual_churn_rate_pct: float,
cac: float,
) -> dict[str, Any]:
"""Calculate LTV:CAC Ratio.
LTV = ARPA_monthly x 12 x Gross Margin / Annual Churn Rate
Ratio = LTV / CAC
Target: >3:1
Args:
arpa_monthly: Average revenue per account per month.
gross_margin_pct: Gross margin as percentage (e.g., 78 for 78%).
annual_churn_rate_pct: Annual churn rate as percentage (e.g., 8 for 8%).
cac: Customer acquisition cost.
Returns:
LTV:CAC ratio with component values, rating, and benchmark.
"""
gross_margin = gross_margin_pct / 100
churn_rate = annual_churn_rate_pct / 100
arpa_annual = arpa_monthly * 12
ltv = safe_divide(arpa_annual * gross_margin, churn_rate)
ratio = safe_divide(ltv, cac)
benchmark = rate_metric("ltv_cac_ratio", ratio)
return {
"ratio": round(ratio, 1),
"ltv": round(ltv, 2),
"cac": cac,
"arpa_monthly": arpa_monthly,
"arpa_annual": arpa_annual,
"gross_margin_pct": gross_margin_pct,
"annual_churn_rate_pct": annual_churn_rate_pct,
"formula": "LTV (ARPA x Gross Margin / Churn Rate) / CAC",
"target": ">3:1",
**benchmark,
}
def calculate_cac_payback(
cac: float, arpa_monthly: float, gross_margin_pct: float
) -> dict[str, Any]:
"""Calculate CAC Payback Period.
Formula: CAC / (ARPA_monthly x Gross Margin) in months
Target: <18 months
Args:
cac: Customer acquisition cost.
arpa_monthly: Average revenue per account per month.
gross_margin_pct: Gross margin as percentage.
Returns:
CAC payback months with rating and benchmark.
"""
gross_margin = gross_margin_pct / 100
monthly_contribution = arpa_monthly * gross_margin
payback_months = safe_divide(cac, monthly_contribution)
benchmark = rate_metric("cac_payback_months", payback_months)
return {
"months": round(payback_months, 1),
"cac": cac,
"arpa_monthly": arpa_monthly,
"gross_margin_pct": gross_margin_pct,
"monthly_contribution": round(monthly_contribution, 2),
"formula": "CAC / (ARPA_monthly x Gross Margin)",
"target": "<18 months",
**benchmark,
}
def calculate_burn_multiple(net_burn: float, net_new_arr: float) -> dict[str, Any]:
"""Calculate Burn Multiple.
Formula: Net Burn / Net New ARR
Target: <2x (lower is better)
Args:
net_burn: Net cash burn in the period.
net_new_arr: Net new ARR added in the period.
Returns:
Burn multiple with rating and benchmark.
"""
value = safe_divide(net_burn, net_new_arr)
benchmark = rate_metric("burn_multiple", value)
return {
"value": round(value, 2),
"net_burn": net_burn,
"net_new_arr": net_new_arr,
"formula": "Net Burn / Net New ARR",
"target": "<2x",
**benchmark,
}
def calculate_rule_of_40(
revenue_growth_pct: float, fcf_margin_pct: float
) -> dict[str, Any]:
"""Calculate Rule of 40.
Formula: Revenue Growth % + FCF Margin %
Target: >40%
Args:
revenue_growth_pct: Year-over-year revenue growth percentage.
fcf_margin_pct: Free cash flow margin percentage.
Returns:
Rule of 40 score with rating and benchmark.
"""
value = revenue_growth_pct + fcf_margin_pct
benchmark = rate_metric("rule_of_40", value)
return {
"value": round(value, 1),
"revenue_growth_pct": revenue_growth_pct,
"fcf_margin_pct": fcf_margin_pct,
"formula": "Revenue Growth % + FCF Margin %",
"target": ">40%",
**benchmark,
}
def calculate_ndr(
beginning_arr: float,
expansion_arr: float,
contraction_arr: float,
churned_arr: float,
) -> dict[str, Any]:
"""Calculate Net Dollar Retention.
Formula: (Beginning ARR + Expansion - Contraction - Churn) / Beginning ARR
Target: >110%
Args:
beginning_arr: ARR at start of period.
expansion_arr: Expansion revenue from existing customers.
contraction_arr: Revenue lost from downgrades.
churned_arr: Revenue lost from customer churn.
Returns:
NDR percentage with rating and benchmark.
"""
ending_arr = beginning_arr + expansion_arr - contraction_arr - churned_arr
ndr_pct = safe_divide(ending_arr, beginning_arr) * 100
benchmark = rate_metric("ndr_pct", ndr_pct)
return {
"ndr_pct": round(ndr_pct, 1),
"beginning_arr": beginning_arr,
"expansion_arr": expansion_arr,
"contraction_arr": contraction_arr,
"churned_arr": churned_arr,
"ending_arr": round(ending_arr, 2),
"formula": "(Begin ARR + Expansion - Contraction - Churn) / Begin ARR",
"target": ">110%",
**benchmark,
}
def generate_recommendations(metrics: dict) -> list[str]:
"""Generate strategic recommendations based on GTM efficiency metrics.
Args:
metrics: Dict of all calculated metric results.
Returns:
List of recommendation strings.
"""
recs = []
# Magic Number
mn = metrics["magic_number"]
if mn["rating"] == "Red":
recs.append(
f"Magic Number is {mn['value']} (target >0.75). GTM spend is inefficient. "
"Audit channel ROI, optimize sales productivity, and consider reducing "
"low-performing spend."
)
elif mn["rating"] == "Yellow":
recs.append(
f"Magic Number is {mn['value']}. GTM efficiency is acceptable but can improve. "
"Focus on sales enablement and pipeline quality over quantity."
)
# LTV:CAC
lc = metrics["ltv_cac"]
if lc["rating"] == "Red":
recs.append(
f"LTV:CAC ratio is {lc['ratio']}:1 (target >3:1). Unit economics are unsustainable. "
"Reduce CAC through better targeting, improve retention to increase LTV, "
"or increase ARPA through pricing optimization."
)
elif lc["rating"] == "Yellow":
recs.append(
f"LTV:CAC ratio is {lc['ratio']}:1. Unit economics are marginal. "
"Focus on reducing churn and expanding within existing accounts."
)
# CAC Payback
cp = metrics["cac_payback"]
if cp["rating"] == "Red":
recs.append(
f"CAC payback is {cp['months']} months (target <18). Capital recovery is too slow. "
"Reduce acquisition costs or increase gross-margin-weighted ARPA."
)
# Burn Multiple
bm = metrics["burn_multiple"]
if bm["rating"] == "Red":
recs.append(
f"Burn multiple is {bm['value']}x (target <2x). Cash consumption relative to "
"growth is unsustainable. Prioritize operating efficiency and path to profitability."
)
# Rule of 40
r40 = metrics["rule_of_40"]
if r40["rating"] == "Red":
recs.append(
f"Rule of 40 score is {r40['value']}% (target >40%). Balance of growth and "
"profitability needs improvement. Either accelerate growth or improve margins."
)
# NDR
ndr = metrics["ndr"]
if ndr["rating"] == "Red":
recs.append(
f"NDR is {ndr['ndr_pct']}% (target >110%). Net revenue is contracting from "
"the existing base. Prioritize churn reduction and expansion playbooks."
)
elif ndr["rating"] == "Yellow":
recs.append(
f"NDR is {ndr['ndr_pct']}%. Base is stable but not expanding. "
"Invest in cross-sell/upsell motions and customer success capacity."
)
# Positive summary if everything is green
green_count = sum(
1 for m in metrics.values()
if isinstance(m, dict) and m.get("rating") == "Green"
)
total_metrics = 6
if green_count == total_metrics:
recs.append(
"All GTM efficiency metrics are in healthy ranges. Maintain current "
"trajectory and optimize for best-in-class performance."
)
elif green_count >= 4:
recs.append(
f"{green_count}/{total_metrics} metrics are green. GTM efficiency is generally "
"healthy. Address the yellow/red areas for continuous improvement."
)
return recs
def calculate_all_metrics(data: dict) -> dict[str, Any]:
"""Calculate all GTM efficiency metrics from input data.
Args:
data: Input data with revenue, costs, and customers sections.
Returns:
Complete GTM efficiency analysis results.
"""
revenue = data["revenue"]
costs = data["costs"]
customers = data["customers"]
metrics = {
"magic_number": calculate_magic_number(
net_new_arr=revenue["net_new_arr"],
sm_spend=costs["sales_marketing_spend"],
),
"ltv_cac": calculate_ltv_cac(
arpa_monthly=revenue["arpa_monthly"],
gross_margin_pct=costs["gross_margin_pct"],
annual_churn_rate_pct=customers["annual_churn_rate_pct"],
cac=costs["cac"],
),
"cac_payback": calculate_cac_payback(
cac=costs["cac"],
arpa_monthly=revenue["arpa_monthly"],
gross_margin_pct=costs["gross_margin_pct"],
),
"burn_multiple": calculate_burn_multiple(
net_burn=costs["net_burn"],
net_new_arr=revenue["net_new_arr"],
),
"rule_of_40": calculate_rule_of_40(
revenue_growth_pct=revenue["revenue_growth_pct"],
fcf_margin_pct=costs["fcf_margin_pct"],
),
"ndr": calculate_ndr(
beginning_arr=customers["beginning_arr"],
expansion_arr=customers["expansion_arr"],
contraction_arr=customers["contraction_arr"],
churned_arr=customers["churned_arr"],
),
}
metrics["recommendations"] = generate_recommendations(metrics)
return metrics
def format_currency(value: float) -> str:
    """Format a number as currency, abbreviating thousands (K) and millions (M)."""
    if abs(value) >= 1_000_000:
        return f"${value / 1_000_000:,.1f}M"
    elif abs(value) >= 1_000:
        return f"${value / 1_000:,.1f}K"
    return f"${value:,.0f}"
def format_text_report(results: dict) -> str:
"""Format analysis results as a human-readable text report."""
lines = []
lines.append("=" * 70)
lines.append("GTM EFFICIENCY REPORT")
lines.append("=" * 70)
# Metric summary table
metrics_order = [
("magic_number", "Magic Number", lambda m: f"{m['value']}"),
("ltv_cac", "LTV:CAC Ratio", lambda m: f"{m['ratio']}:1"),
("cac_payback", "CAC Payback", lambda m: f"{m['months']} months"),
("burn_multiple", "Burn Multiple", lambda m: f"{m['value']}x"),
("rule_of_40", "Rule of 40", lambda m: f"{m['value']}%"),
("ndr", "Net Dollar Retention", lambda m: f"{m['ndr_pct']}%"),
]
lines.append("")
lines.append("METRICS SUMMARY")
lines.append("-" * 70)
lines.append(f" {'Metric':25s} {'Value':>12s} {'Rating':>8s} {'Target':>15s}")
lines.append(f" {'':25s} {'':>12s} {'':>8s} {'':>15s}")
for key, name, fmt_fn in metrics_order:
m = results[key]
lines.append(
f" {name:25s} {fmt_fn(m):>12s} {m['rating']:>8s} {m['target']:>15s}"
)
# Detailed breakdown
lines.append("")
lines.append("DETAILED BREAKDOWN")
lines.append("-" * 70)
# Magic Number
mn = results["magic_number"]
lines.append("")
lines.append(f" MAGIC NUMBER: {mn['value']}")
lines.append(f" Net New ARR: {format_currency(mn['net_new_arr'])}")
lines.append(f" S&M Spend: {format_currency(mn['sm_spend'])}")
lines.append(f" Rating: {mn['rating']} - {mn['label']}")
lines.append(f" Percentile: {mn['percentile']}")
# LTV:CAC
lc = results["ltv_cac"]
lines.append("")
lines.append(f" LTV:CAC RATIO: {lc['ratio']}:1")
lines.append(f" Customer LTV: {format_currency(lc['ltv'])}")
lines.append(f" CAC: {format_currency(lc['cac'])}")
lines.append(f" ARPA (Monthly): {format_currency(lc['arpa_monthly'])}")
lines.append(f" Gross Margin: {lc['gross_margin_pct']}%")
lines.append(f" Churn Rate: {lc['annual_churn_rate_pct']}%")
lines.append(f" Rating: {lc['rating']} - {lc['label']}")
lines.append(f" Percentile: {lc['percentile']}")
# CAC Payback
cp = results["cac_payback"]
lines.append("")
lines.append(f" CAC PAYBACK: {cp['months']} months")
lines.append(f" CAC: {format_currency(cp['cac'])}")
lines.append(f" Monthly Contribution:{format_currency(cp['monthly_contribution'])}")
lines.append(f" Rating: {cp['rating']} - {cp['label']}")
lines.append(f" Percentile: {cp['percentile']}")
# Burn Multiple
bm = results["burn_multiple"]
lines.append("")
lines.append(f" BURN MULTIPLE: {bm['value']}x")
lines.append(f" Net Burn: {format_currency(bm['net_burn'])}")
lines.append(f" Net New ARR: {format_currency(bm['net_new_arr'])}")
lines.append(f" Rating: {bm['rating']} - {bm['label']}")
lines.append(f" Percentile: {bm['percentile']}")
# Rule of 40
r40 = results["rule_of_40"]
lines.append("")
lines.append(f" RULE OF 40: {r40['value']}%")
lines.append(f" Revenue Growth: {r40['revenue_growth_pct']}%")
lines.append(f" FCF Margin: {r40['fcf_margin_pct']}%")
lines.append(f" Rating: {r40['rating']} - {r40['label']}")
lines.append(f" Percentile: {r40['percentile']}")
# NDR
ndr = results["ndr"]
lines.append("")
lines.append(f" NET DOLLAR RETENTION: {ndr['ndr_pct']}%")
lines.append(f" Beginning ARR: {format_currency(ndr['beginning_arr'])}")
lines.append(f" Expansion: +{format_currency(ndr['expansion_arr'])}")
lines.append(f" Contraction: -{format_currency(ndr['contraction_arr'])}")
lines.append(f" Churn: -{format_currency(ndr['churned_arr'])}")
lines.append(f" Ending ARR: {format_currency(ndr['ending_arr'])}")
lines.append(f" Rating: {ndr['rating']} - {ndr['label']}")
lines.append(f" Percentile: {ndr['percentile']}")
# Recommendations
lines.append("")
lines.append("RECOMMENDATIONS")
lines.append("-" * 70)
for i, rec in enumerate(results["recommendations"], 1):
lines.append(f" {i}. {rec}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point for GTM efficiency calculator CLI."""
parser = argparse.ArgumentParser(
description="Calculate GTM efficiency metrics for SaaS revenue teams."
)
parser.add_argument(
"input",
help="Path to JSON file containing GTM data",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
try:
with open(args.input, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input}: {e}", file=sys.stderr)
sys.exit(1)
required_sections = ["revenue", "costs", "customers"]
for section in required_sections:
if section not in data:
print(
f"Error: Missing required section '{section}' in input data",
file=sys.stderr,
)
sys.exit(1)
results = calculate_all_metrics(data)
if args.format == "json":
print(json.dumps(results, indent=2))
else:
print(format_text_report(results))
if __name__ == "__main__":
main()
FILE:revenue-operations/scripts/pipeline_analyzer.py
#!/usr/bin/env python3
"""Pipeline Analyzer - Analyzes sales pipeline health for SaaS revenue teams.
Calculates pipeline coverage ratios, stage conversion rates, sales velocity,
deal aging risks, and concentration risks from pipeline data.
Usage:
python pipeline_analyzer.py --input pipeline.json --format text
python pipeline_analyzer.py --input pipeline.json --format json
"""
import argparse
import json
import sys
from datetime import datetime, date
from typing import Any
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Safely divide two numbers, returning default if denominator is zero."""
    if denominator == 0:
        return default
    return numerator / denominator


def parse_date(date_str: str) -> date:
    """Parse a date string in YYYY-MM-DD format."""
    return datetime.strptime(date_str, "%Y-%m-%d").date()


def get_quarter(d: date) -> str:
    """Return the quarter string for a given date (e.g., '2025-Q1')."""
    quarter = (d.month - 1) // 3 + 1
    return f"{d.year}-Q{quarter}"
def calculate_coverage_ratio(deals: list[dict], quota: float) -> dict[str, Any]:
"""Calculate pipeline coverage ratio against quota.
Target: 3-4x pipeline coverage for healthy pipeline.
"""
total_pipeline = sum(d["value"] for d in deals if d["stage"] != "Closed Won")
ratio = safe_divide(total_pipeline, quota)
if ratio >= 4.0:
rating = "Strong"
elif ratio >= 3.0:
rating = "Healthy"
elif ratio >= 2.0:
rating = "At Risk"
else:
rating = "Critical"
return {
"total_pipeline_value": total_pipeline,
"quota": quota,
"coverage_ratio": round(ratio, 2),
"rating": rating,
"target": "3.0x - 4.0x",
}
def calculate_stage_conversion_rates(
deals: list[dict], stages: list[str]
) -> list[dict[str, Any]]:
"""Calculate stage-to-stage conversion rates.
Measures the percentage of deals that progress from one stage to the next.
"""
stage_order = {stage: i for i, stage in enumerate(stages)}
stage_counts: dict[str, int] = {stage: 0 for stage in stages}
for deal in deals:
stage = deal["stage"]
if stage in stage_order:
stage_idx = stage_order[stage]
# A deal at stage N has passed through all stages 0..N
for i in range(stage_idx + 1):
stage_counts[stages[i]] += 1
conversions = []
for i in range(len(stages) - 1):
from_stage = stages[i]
to_stage = stages[i + 1]
from_count = stage_counts[from_stage]
to_count = stage_counts[to_stage]
rate = safe_divide(to_count, from_count) * 100
conversions.append({
"from_stage": from_stage,
"to_stage": to_stage,
"from_count": from_count,
"to_count": to_count,
"conversion_rate_pct": round(rate, 1),
})
return conversions
def calculate_sales_velocity(deals: list[dict]) -> dict[str, Any]:
"""Calculate sales velocity.
Formula: (# opportunities x avg deal size x win rate) / avg sales cycle length
Result is revenue per day.
"""
if not deals:
return {
"num_opportunities": 0,
"avg_deal_size": 0,
"win_rate_pct": 0,
"avg_cycle_days": 0,
"velocity_per_day": 0,
"velocity_per_month": 0,
}
won_deals = [d for d in deals if d["stage"] == "Closed Won"]
all_considered = deals
num_opportunities = len(all_considered)
avg_deal_size = safe_divide(
sum(d["value"] for d in all_considered), num_opportunities
)
win_rate = safe_divide(len(won_deals), num_opportunities)
avg_cycle_days = safe_divide(
sum(d["age_days"] for d in all_considered), num_opportunities
)
velocity_per_day = safe_divide(
num_opportunities * avg_deal_size * win_rate, avg_cycle_days
)
return {
"num_opportunities": num_opportunities,
"avg_deal_size": round(avg_deal_size, 2),
"win_rate_pct": round(win_rate * 100, 1),
"avg_cycle_days": round(avg_cycle_days, 1),
"velocity_per_day": round(velocity_per_day, 2),
"velocity_per_month": round(velocity_per_day * 30, 2),
}
def analyze_deal_aging(
deals: list[dict], average_cycle_days: int, stages: list[str]
) -> dict[str, Any]:
"""Analyze deal aging and flag stale deals.
Flags open deals whose age exceeds the threshold for their stage.
Thresholds are stage-specific and shrink with position in the pipeline;
a stage not listed in `stages` falls back to 2x the average cycle time.
"""
aging_threshold = average_cycle_days * 2
num_stages = len(stages)
stage_order = {stage: i for i, stage in enumerate(stages)}
# Stage-specific thresholds: early stages get more time, later stages less
stage_thresholds: dict[str, int] = {}
for i, stage in enumerate(stages):
if stage == "Closed Won":
continue
# Progressive thresholds: the first stage gets 2x the average cycle,
# tapering linearly toward 1x at the final entry in `stages`
progress = safe_divide(i, num_stages - 1)
threshold = int(average_cycle_days * (1.0 + (1.0 - progress)))
stage_thresholds[stage] = threshold
aging_deals = []
healthy_deals = 0
at_risk_deals = 0
for deal in deals:
if deal["stage"] == "Closed Won":
continue
stage = deal["stage"]
age = deal["age_days"]
threshold = stage_thresholds.get(stage, aging_threshold)
if age > threshold:
at_risk_deals += 1
aging_deals.append({
"id": deal["id"],
"name": deal["name"],
"stage": stage,
"age_days": age,
"threshold_days": threshold,
"days_over": age - threshold,
"value": deal["value"],
})
else:
healthy_deals += 1
aging_deals.sort(key=lambda x: x["days_over"], reverse=True)
return {
"global_aging_threshold_days": aging_threshold,
"stage_thresholds": stage_thresholds,
"total_open_deals": healthy_deals + at_risk_deals,
"healthy_deals": healthy_deals,
"at_risk_deals": at_risk_deals,
"aging_deals": aging_deals,
}
def assess_pipeline_risk(
deals: list[dict], quota: float, stages: list[str]
) -> dict[str, Any]:
"""Assess overall pipeline risk.
Checks for:
- Concentration risk (>40% in single deal)
- Stage distribution health
- Coverage gap by quarter
"""
open_deals = [d for d in deals if d["stage"] != "Closed Won"]
total_pipeline = sum(d["value"] for d in open_deals)
# Concentration risk
concentration_risks = []
for deal in open_deals:
pct = safe_divide(deal["value"], total_pipeline) * 100
if pct > 40:
concentration_risks.append({
"id": deal["id"],
"name": deal["name"],
"value": deal["value"],
"pct_of_pipeline": round(pct, 1),
"risk_level": "HIGH",
})
elif pct > 25:
concentration_risks.append({
"id": deal["id"],
"name": deal["name"],
"value": deal["value"],
"pct_of_pipeline": round(pct, 1),
"risk_level": "MEDIUM",
})
has_concentration_risk = any(
r["risk_level"] == "HIGH" for r in concentration_risks
)
# Stage distribution
stage_distribution: dict[str, dict] = {}
for stage in stages:
if stage == "Closed Won":
continue
stage_deals = [d for d in open_deals if d["stage"] == stage]
count = len(stage_deals)
value = sum(d["value"] for d in stage_deals)
stage_distribution[stage] = {
"count": count,
"value": value,
"pct_of_pipeline": round(safe_divide(value, total_pipeline) * 100, 1),
}
# Check for empty stages (unhealthy funnel)
empty_stages = [
stage for stage, data in stage_distribution.items() if data["count"] == 0
]
# Coverage gap by quarter
quarterly_coverage: dict[str, float] = {}
for deal in open_deals:
try:
close_date = parse_date(deal["close_date"])
quarter = get_quarter(close_date)
quarterly_coverage[quarter] = (
quarterly_coverage.get(quarter, 0) + deal["value"]
)
except (ValueError, KeyError):
pass
quarterly_target = quota / 4
coverage_gaps = []
for quarter, value in sorted(quarterly_coverage.items()):
coverage = safe_divide(value, quarterly_target)
if coverage < 3.0:
coverage_gaps.append({
"quarter": quarter,
"pipeline_value": value,
"quarterly_target": quarterly_target,
"coverage_ratio": round(coverage, 2),
"gap": "Below 3x target",
})
# Overall risk rating
risk_factors = 0
if has_concentration_risk:
risk_factors += 2
if len(empty_stages) > 0:
risk_factors += 1
if len(coverage_gaps) > 0:
risk_factors += 1
if safe_divide(total_pipeline, quota) < 3.0:
risk_factors += 2
if risk_factors >= 4:
overall_risk = "HIGH"
elif risk_factors >= 2:
overall_risk = "MEDIUM"
else:
overall_risk = "LOW"
return {
"overall_risk": overall_risk,
"risk_factors_count": risk_factors,
"concentration_risks": concentration_risks,
"has_concentration_risk": has_concentration_risk,
"stage_distribution": stage_distribution,
"empty_stages": empty_stages,
"coverage_gaps": coverage_gaps,
}
def analyze_pipeline(data: dict) -> dict[str, Any]:
"""Run complete pipeline analysis.
Args:
data: Pipeline data with deals, quota, stages, and average_cycle_days.
Returns:
Complete analysis results dictionary.
"""
deals = data["deals"]
quota = data["quota"]
stages = data["stages"]
average_cycle_days = data.get("average_cycle_days", 45)
return {
"coverage": calculate_coverage_ratio(deals, quota),
"stage_conversions": calculate_stage_conversion_rates(deals, stages),
"velocity": calculate_sales_velocity(deals),
"aging": analyze_deal_aging(deals, average_cycle_days, stages),
"risk": assess_pipeline_risk(deals, quota, stages),
}
def format_currency(value: float) -> str:
    """Format a number as currency, abbreviating thousands (K) and millions (M)."""
    if value >= 1_000_000:
        return f"${value / 1_000_000:,.1f}M"
    elif value >= 1_000:
        return f"${value / 1_000:,.1f}K"
    return f"${value:,.0f}"
def format_text_report(results: dict) -> str:
"""Format analysis results as a human-readable text report."""
lines = []
lines.append("=" * 70)
lines.append("PIPELINE ANALYSIS REPORT")
lines.append("=" * 70)
# Coverage
cov = results["coverage"]
lines.append("")
lines.append("PIPELINE COVERAGE")
lines.append("-" * 40)
lines.append(f" Total Pipeline: {format_currency(cov['total_pipeline_value'])}")
lines.append(f" Quota Target: {format_currency(cov['quota'])}")
lines.append(f" Coverage Ratio: {cov['coverage_ratio']}x (Target: {cov['target']})")
lines.append(f" Rating: {cov['rating']}")
# Stage Conversions
lines.append("")
lines.append("STAGE CONVERSION RATES")
lines.append("-" * 40)
for conv in results["stage_conversions"]:
lines.append(
f" {conv['from_stage']} -> {conv['to_stage']}: "
f"{conv['conversion_rate_pct']}% "
f"({conv['to_count']}/{conv['from_count']})"
)
# Velocity
vel = results["velocity"]
lines.append("")
lines.append("SALES VELOCITY")
lines.append("-" * 40)
lines.append(f" Opportunities: {vel['num_opportunities']}")
lines.append(f" Avg Deal Size: {format_currency(vel['avg_deal_size'])}")
lines.append(f" Win Rate: {vel['win_rate_pct']}%")
lines.append(f" Avg Cycle: {vel['avg_cycle_days']} days")
lines.append(f" Velocity/Day: {format_currency(vel['velocity_per_day'])}")
lines.append(f" Velocity/Month: {format_currency(vel['velocity_per_month'])}")
# Aging
aging = results["aging"]
lines.append("")
lines.append("DEAL AGING ANALYSIS")
lines.append("-" * 40)
lines.append(f" Total Open Deals: {aging['total_open_deals']}")
lines.append(f" Healthy: {aging['healthy_deals']}")
lines.append(f" At Risk: {aging['at_risk_deals']}")
if aging["aging_deals"]:
lines.append("")
lines.append(" AGING DEALS (needs attention):")
for deal in aging["aging_deals"]:
lines.append(
f" - {deal['name']} ({deal['stage']}): "
f"{deal['age_days']}d (threshold: {deal['threshold_days']}d, "
f"+{deal['days_over']}d over) | {format_currency(deal['value'])}"
)
# Risk
risk = results["risk"]
lines.append("")
lines.append("PIPELINE RISK ASSESSMENT")
lines.append("-" * 40)
lines.append(f" Overall Risk: {risk['overall_risk']}")
lines.append(f" Risk Factors: {risk['risk_factors_count']}")
if risk["concentration_risks"]:
lines.append("")
lines.append(" CONCENTRATION RISKS:")
for cr in risk["concentration_risks"]:
lines.append(
f" - {cr['name']}: {format_currency(cr['value'])} "
f"({cr['pct_of_pipeline']}% of pipeline) [{cr['risk_level']}]"
)
if risk["empty_stages"]:
lines.append("")
lines.append(f" EMPTY STAGES: {', '.join(risk['empty_stages'])}")
lines.append("")
lines.append(" STAGE DISTRIBUTION:")
for stage, data in risk["stage_distribution"].items():
bar = "#" * max(1, int(data["pct_of_pipeline"] / 2))
lines.append(
f" {stage:20s} {data['count']:3d} deals "
f"{format_currency(data['value']):>10s} "
f"{data['pct_of_pipeline']:5.1f}% {bar}"
)
if risk["coverage_gaps"]:
lines.append("")
lines.append(" COVERAGE GAPS BY QUARTER:")
for gap in risk["coverage_gaps"]:
lines.append(
f" - {gap['quarter']}: {gap['coverage_ratio']}x coverage "
f"({format_currency(gap['pipeline_value'])} vs "
f"{format_currency(gap['quarterly_target'])} target)"
)
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point for pipeline analyzer CLI."""
parser = argparse.ArgumentParser(
description="Analyze sales pipeline health for SaaS revenue teams."
)
parser.add_argument(
"--input",
required=True,
help="Path to JSON file containing pipeline data",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
try:
with open(args.input, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {args.input}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {args.input}: {e}", file=sys.stderr)
sys.exit(1)
# Validate required fields
required_fields = ["deals", "quota", "stages"]
for field in required_fields:
if field not in data:
print(f"Error: Missing required field '{field}' in input data", file=sys.stderr)
sys.exit(1)
results = analyze_pipeline(data)
if args.format == "json":
print(json.dumps(results, indent=2))
else:
print(format_text_report(results))
if __name__ == "__main__":
main()
FILE:sales-engineer/SKILL.md
---
name: "sales-engineer"
description: Analyzes RFP/RFI responses for coverage gaps, builds competitive feature comparison matrices, and plans proof-of-concept (POC) engagements for pre-sales engineering. Use when responding to RFPs, bids, or proposal requests; comparing product features against competitors; planning or scoring a customer POC or sales demo; preparing a technical proposal; or performing win/loss competitor analysis. Handles tasks described as 'RFP response', 'bid response', 'proposal response', 'competitor comparison', 'feature matrix', 'POC planning', 'sales demo prep', or 'pre-sales engineering'.
---
# Sales Engineer Skill
## 5-Phase Workflow
### Phase 1: Discovery & Research
**Objective:** Understand customer requirements, technical environment, and business drivers.
**Checklist:**
- [ ] Conduct technical discovery calls with stakeholders
- [ ] Map customer's current architecture and pain points
- [ ] Identify integration requirements and constraints
- [ ] Document security and compliance requirements
- [ ] Assess competitive landscape for this opportunity
**Tools:** Run `rfp_response_analyzer.py` to score initial requirement alignment.
```bash
python scripts/rfp_response_analyzer.py assets/sample_rfp_data.json --format json > phase1_rfp_results.json
```
**Output:** Technical discovery document, requirement map, initial coverage assessment.
**Validation checkpoint:** Coverage score must be >50% and must-have gaps ≤3 before proceeding to Phase 2. Check with:
```bash
python scripts/rfp_response_analyzer.py assets/sample_rfp_data.json --format json | python -c "import sys,json; r=json.load(sys.stdin); print('PROCEED' if r['coverage_score']>50 and r['must_have_gaps']<=3 else 'REVIEW')"
```
---
### Phase 2: Solution Design
**Objective:** Design a solution architecture that addresses customer requirements.
**Checklist:**
- [ ] Map product capabilities to customer requirements
- [ ] Design integration architecture
- [ ] Identify customization needs and development effort
- [ ] Build competitive differentiation strategy
- [ ] Create solution architecture diagrams
**Tools:** Run `competitive_matrix_builder.py` using Phase 1 data to identify differentiators and vulnerabilities.
```bash
python scripts/competitive_matrix_builder.py competitive_data.json --format json > phase2_competitive.json
python -c "import json; d=json.load(open('phase2_competitive.json')); print('Differentiators:', d['differentiators']); print('Vulnerabilities:', d['vulnerabilities'])"
```
**Output:** Solution architecture, competitive positioning, technical differentiation strategy.
**Validation checkpoint:** Confirm at least one strong differentiator exists per customer priority before proceeding to Phase 3. If no differentiators found, escalate to Product Team (see Integration Points).
---
### Phase 3: Demo Preparation & Delivery
**Objective:** Deliver compelling technical demonstrations tailored to stakeholder priorities.
**Checklist:**
- [ ] Build demo environment matching customer's use case
- [ ] Create demo script with talking points per stakeholder role
- [ ] Prepare objection handling responses
- [ ] Rehearse failure scenarios and recovery paths
- [ ] Collect feedback and adjust approach
**Templates:** Use `assets/demo_script_template.md` for structured demo preparation.
**Output:** Customized demo, stakeholder-specific talking points, feedback capture.
**Validation checkpoint:** Demo script must cover every must-have requirement flagged in `phase1_rfp_results.json` before delivery. Cross-reference with:
```bash
python -c "import json; rfp=json.load(open('phase1_rfp_results.json')); [print('UNCOVERED:', r) for r in rfp['must_have_requirements'] if r['coverage']=='Gap']"
```
---
### Phase 4: POC & Evaluation
**Objective:** Execute a structured proof-of-concept that validates the solution.
**Checklist:**
- [ ] Define POC scope, success criteria, and timeline
- [ ] Allocate resources and set up environment
- [ ] Execute phased testing (core, advanced, edge cases)
- [ ] Track progress against success criteria
- [ ] Generate evaluation scorecard
**Tools:** Run `poc_planner.py` to generate the complete POC plan.
```bash
python scripts/poc_planner.py poc_data.json --format json > phase4_poc_plan.json
python -c "import json; p=json.load(open('phase4_poc_plan.json')); print('Go/No-Go:', p['recommendation'])"
```
**Templates:** Use `assets/poc_scorecard_template.md` for evaluation tracking.
**Output:** POC plan, evaluation scorecard, go/no-go recommendation.
**Validation checkpoint:** POC conversion requires a scorecard score >60% across all evaluation dimensions (functionality, performance, integration, usability, support). If the score is 60% or lower, document the gaps and loop back to Phase 2 for solution redesign.
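The checkpoint above can be sketched as a small gate function. This is an illustration only: it assumes each dimension is scored as a percentage and reads "across all evaluation dimensions" as "every dimension must clear the bar", which is one possible interpretation. The function name `poc_gate` and the input shape are hypothetical and are not taken from `poc_planner.py`.

```python
# Hypothetical gate check; dimension names come from the checkpoint text,
# the dict-of-percentages input shape is an assumption.
DIMENSIONS = ("functionality", "performance", "integration", "usability", "support")


def poc_gate(scores: dict[str, float], threshold: float = 60.0) -> dict:
    """Return whether the POC may convert and which dimensions fall short."""
    # A dimension fails when it does not strictly exceed the threshold.
    gaps = {d: scores[d] for d in DIMENSIONS if scores[d] <= threshold}
    return {"proceed": not gaps, "gaps": gaps}


example = {
    "functionality": 82.0,
    "performance": 71.5,
    "integration": 58.0,  # below the bar, so the gate should fail
    "usability": 90.0,
    "support": 65.0,
}
result = poc_gate(example)
print(result["proceed"], result["gaps"])
```

Any dimension listed under `gaps` is a candidate for the Phase 2 redesign loop.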
---
### Phase 5: Proposal & Closing
**Objective:** Deliver a technical proposal that supports the commercial close.
**Checklist:**
- [ ] Compile POC results and success metrics
- [ ] Create technical proposal with implementation plan
- [ ] Address outstanding objections with evidence
- [ ] Support pricing and packaging discussions
- [ ] Conduct win/loss analysis post-decision
**Templates:** Use `assets/technical_proposal_template.md` for the proposal document.
**Output:** Technical proposal, implementation timeline, risk mitigation plan.
---
## Python Automation Tools
### 1. RFP Response Analyzer
**Script:** `scripts/rfp_response_analyzer.py`
**Purpose:** Parse RFP/RFI requirements, score coverage, identify gaps, and generate bid/no-bid recommendations.
**Coverage Categories:** Full (100%), Partial (50%), Planned (25%), Gap (0%).
**Priority Weighting:** Must-Have 3×, Should-Have 2×, Nice-to-Have 1×.
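A minimal sketch of how the coverage categories and priority weights could combine into a single score. It assumes the score is a weighted mean and that each requirement is a dict with `priority` and `coverage` keys. Both the key names and the function `weighted_coverage_pct` are hypothetical; the real schema is in `assets/sample_rfp_data.json` and may differ.

```python
# Category values and weights are taken from the lines above;
# the aggregation (weighted mean) is an assumption.
COVERAGE_VALUES = {"Full": 1.0, "Partial": 0.5, "Planned": 0.25, "Gap": 0.0}
PRIORITY_WEIGHTS = {"Must-Have": 3, "Should-Have": 2, "Nice-to-Have": 1}


def weighted_coverage_pct(requirements: list[dict]) -> float:
    """Weighted mean coverage as a percentage (0.0 for an empty list)."""
    total_weight = sum(PRIORITY_WEIGHTS[r["priority"]] for r in requirements)
    if total_weight == 0:
        return 0.0
    earned = sum(
        PRIORITY_WEIGHTS[r["priority"]] * COVERAGE_VALUES[r["coverage"]]
        for r in requirements
    )
    return round(earned / total_weight * 100, 1)


sample = [
    {"priority": "Must-Have", "coverage": "Full"},        # 3 x 1.0
    {"priority": "Must-Have", "coverage": "Gap"},         # 3 x 0.0
    {"priority": "Should-Have", "coverage": "Partial"},   # 2 x 0.5
    {"priority": "Nice-to-Have", "coverage": "Planned"},  # 1 x 0.25
]
print(weighted_coverage_pct(sample))  # 4.25 / 9 -> 47.2
```

Because must-have items carry three times the weight of nice-to-have items, a single must-have gap pulls the score down far more than a gap in a low-priority requirement.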
**Bid/No-Bid Logic:**
- **Bid:** Coverage >70% AND must-have gaps ≤3
- **Conditional Bid:** Coverage 50–70% OR must-have gaps 2–3
- **No-Bid:** Coverage <50% OR must-have gaps >3
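The rules above overlap as written (for example, coverage above 70% with 2 or 3 must-have gaps matches both Bid and Conditional Bid), so any implementation has to pick an evaluation order. The sketch below assumes No-Bid is checked first, then Bid, with Conditional Bid as the fallback. That ordering and the function name `bid_recommendation` are assumptions, not confirmed behavior of `rfp_response_analyzer.py`.

```python
def bid_recommendation(coverage_pct: float, must_have_gaps: int) -> str:
    """Classify an opportunity using the thresholds listed above.

    Assumed precedence: No-Bid, then Bid, then Conditional Bid.
    """
    if coverage_pct < 50 or must_have_gaps > 3:
        return "No-Bid"
    if coverage_pct > 70 and must_have_gaps <= 3:
        return "Bid"
    # Remaining cases: coverage between 50% and 70% with at most 3 gaps.
    return "Conditional Bid"


for coverage, gaps in [(85.0, 1), (60.0, 0), (45.0, 0), (90.0, 4)]:
    print(coverage, gaps, bid_recommendation(coverage, gaps))
```

With a different precedence (for instance, Conditional Bid checked before Bid), an opportunity at 85% coverage with 3 must-have gaps would be downgraded, so the order is worth confirming against the script itself.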
**Usage:**
```bash
python scripts/rfp_response_analyzer.py assets/sample_rfp_data.json # human-readable
python scripts/rfp_response_analyzer.py assets/sample_rfp_data.json --format json # JSON output
python scripts/rfp_response_analyzer.py --help
```
**Input Format:** See `assets/sample_rfp_data.json` for the complete schema.
---
### 2. Competitive Matrix Builder
**Script:** `scripts/competitive_matrix_builder.py`
**Purpose:** Generate feature comparison matrices, calculate competitive scores, identify differentiators and vulnerabilities.
**Feature Scoring:** Full (3), Partial (2), Limited (1), None (0).
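To illustrate how the 0 to 3 feature scale could feed a weighted competitive score, here is a minimal sketch. The per-feature weights, the normalization to a percentage, and the definitions of differentiator (we outscore the rival) and vulnerability (the rival outscores us) are all assumptions made for this example. The names `weighted_score_pct` and `compare` are hypothetical and do not come from `competitive_matrix_builder.py`.

```python
# Scale taken from the line above; everything else is illustrative.
FEATURE_SCORES = {"Full": 3, "Partial": 2, "Limited": 1, "None": 0}


def weighted_score_pct(ratings: dict[str, str], weights: dict[str, int]) -> float:
    """Weighted feature score as a percentage of the maximum attainable."""
    max_points = 3 * sum(weights.values())
    if max_points == 0:
        return 0.0
    earned = sum(weights[f] * FEATURE_SCORES[ratings[f]] for f in weights)
    return round(earned / max_points * 100, 1)


def compare(ours: dict[str, str], rival: dict[str, str]) -> dict[str, list[str]]:
    """Split features into those where we lead and those where the rival leads."""
    lead = [f for f in ours if FEATURE_SCORES[ours[f]] > FEATURE_SCORES[rival[f]]]
    trail = [f for f in ours if FEATURE_SCORES[ours[f]] < FEATURE_SCORES[rival[f]]]
    return {"differentiators": lead, "vulnerabilities": trail}


weights = {"SSO": 3, "API": 2, "Reporting": 1}
ours = {"SSO": "Full", "API": "Partial", "Reporting": "Limited"}
rival = {"SSO": "Partial", "API": "Full", "Reporting": "None"}

print(weighted_score_pct(ours, weights))   # 14 / 18 -> 77.8
print(weighted_score_pct(rival, weights))  # 12 / 18 -> 66.7
print(compare(ours, rival))
```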
**Usage:**
```bash
python scripts/competitive_matrix_builder.py competitive_data.json # human-readable
python scripts/competitive_matrix_builder.py competitive_data.json --format json # JSON output
```
**Output Includes:** Feature comparison matrix, weighted competitive scores, differentiators, vulnerabilities, and win themes.
---
### 3. POC Planner
**Script:** `scripts/poc_planner.py`
**Purpose:** Generate structured POC plans with timeline, resource allocation, success criteria, and evaluation scorecards.
**Default Phase Breakdown:**
- **Week 1:** Setup — environment provisioning, data migration, configuration
- **Weeks 2–3:** Core Testing — primary use cases, integration testing
- **Week 4:** Advanced Testing — edge cases, performance, security
- **Week 5:** Evaluation — scorecard completion, stakeholder review, go/no-go
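The default phase breakdown maps directly onto calendar dates once a start date is chosen. The sketch below shows one way to do that. It is a hypothetical helper rather than the code in `scripts/poc_planner.py`, and the start date is an arbitrary example.

```python
from datetime import date, timedelta

# Phase names and durations (in weeks) follow the default breakdown above.
DEFAULT_PHASES = [
    ("Setup", 1),
    ("Core Testing", 2),
    ("Advanced Testing", 1),
    ("Evaluation", 1),
]


def build_timeline(start):
    """Return consecutive phases with inclusive start and end dates."""
    timeline = []
    cursor = start
    for name, weeks in DEFAULT_PHASES:
        end = cursor + timedelta(weeks=weeks) - timedelta(days=1)
        timeline.append({"phase": name, "start": cursor, "end": end})
        cursor = end + timedelta(days=1)  # next phase starts the following day
    return timeline


for phase in build_timeline(date(2026, 3, 2)):
    print(f"{phase['phase']}: {phase['start']} to {phase['end']}")
```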
**Usage:**
```bash
python scripts/poc_planner.py poc_data.json # human-readable
python scripts/poc_planner.py poc_data.json --format json # JSON output
```
**Output Includes:** Phased POC plan, resource allocation, success criteria, evaluation scorecard, risk register, and go/no-go recommendation framework.
---
## Reference Knowledge Bases
| Reference | Description |
|-----------|-------------|
| `references/rfp-response-guide.md` | RFP/RFI response best practices, compliance matrix, bid/no-bid framework |
| `references/competitive-positioning-framework.md` | Competitive analysis methodology, battlecard creation, objection handling |
| `references/poc-best-practices.md` | POC planning methodology, success criteria, evaluation frameworks |
## Asset Templates
| Template | Purpose |
|----------|---------|
| `assets/technical_proposal_template.md` | Technical proposal with executive summary, solution architecture, implementation plan |
| `assets/demo_script_template.md` | Demo script with agenda, talking points, objection handling |
| `assets/poc_scorecard_template.md` | POC evaluation scorecard with weighted scoring |
| `assets/sample_rfp_data.json` | Sample RFP data for testing the analyzer |
| `assets/expected_output.json` | Expected output of `scripts/rfp_response_analyzer.py` for `assets/sample_rfp_data.json` |
## Integration Points
- **Marketing Skills** - Leverage competitive intelligence and messaging frameworks from `../../marketing-skill/`
- **Product Team** - Coordinate with `../../product-team/` on roadmap items flagged as "Planned" in RFP analysis
- **C-Level Advisory** - Escalate strategic deals requiring executive engagement to `../../c-level-advisor/`
- **Customer Success** - Hand off POC results and success criteria to the CSM via `../customer-success-manager/`
---
**Last Updated:** February 2026
**Status:** Production-ready
**Tools:** 3 Python automation scripts
**References:** 3 knowledge base documents
**Templates:** 5 asset files
FILE:sales-engineer/assets/demo_script_template.md
# Demo Script Template
## Demo Information
| Field | Value |
|-------|-------|
| Customer | [Customer Name] |
| Date/Time | [Date and Time] |
| Duration | [XX minutes] |
| Demo Environment | [Environment URL/Details] |
| Presenter | [Sales Engineer Name] |
| Account Executive (AE) | [AE Name] |
---
## Pre-Demo Checklist
- [ ] Demo environment tested and confirmed working
- [ ] Sample data loaded and validated
- [ ] Backup demo environment prepared
- [ ] Screen sharing tested with correct resolution
- [ ] Browser tabs pre-loaded with key screens
- [ ] Recording setup confirmed (if applicable)
- [ ] Customer-specific branding applied (if applicable)
- [ ] Network and VPN connectivity verified
- [ ] All integrations connected and tested
- [ ] Backup slides prepared in case of technical issues
---
## Attendees and Roles
| Name | Title | Role in Evaluation | Key Interest |
|------|-------|-------------------|--------------|
| [Name] | [CTO/VP Eng] | Decision Maker | ROI, strategic fit |
| [Name] | [Director] | Champion | Solving [specific problem] |
| [Name] | [Manager] | Technical Evaluator | Architecture, integrations |
| [Name] | [Analyst] | End User | Day-to-day usability |
---
## Agenda
| Time | Duration | Topic | Lead |
|------|----------|-------|------|
| 0:00 | 5 min | Welcome and introductions | AE |
| 0:05 | 5 min | Agenda and objectives | SE |
| 0:10 | 20 min | Core demo (Use Cases 1-3) | SE |
| 0:30 | 10 min | Integration demo | SE |
| 0:40 | 5 min | Admin and security overview | SE |
| 0:45 | 10 min | Q&A | SE + AE |
| 0:55 | 5 min | Next steps and wrap-up | AE |
---
## Demo Flow
### Opening (5 minutes)
**Talking Points:**
- Thank attendees for their time
- Recap what we learned in discovery: "[Summarize 2-3 key challenges]"
- Set expectations: "Today I'll show you how we address [Challenge 1], [Challenge 2], and [Challenge 3]"
- Frame the demo: "I'll be using [data type] similar to what you described in our earlier conversations"
**Transition:** "Let me start with the challenge you mentioned is most pressing: [Challenge 1]."
---
### Use Case 1: [Name] (7 minutes)
**Business Context:**
[1-2 sentences on why this matters to the customer]
**Demo Steps:**
1. **Step 1:** [Navigate to / Click on / Show...]
- **What to say:** "[Explain what they're seeing and why it matters]"
- **Highlight:** [Specific feature or capability to emphasize]
2. **Step 2:** [Navigate to / Click on / Show...]
- **What to say:** "[Connect this to their specific pain point]"
- **Highlight:** [Differentiator from competitor]
3. **Step 3:** [Navigate to / Click on / Show...]
- **What to say:** "[Quantify the value - time saved, errors reduced, etc.]"
- **Highlight:** [Ease of use or power of the feature]
**Key Message:** "[One sentence summarizing the value demonstrated]"
**Transition:** "Now that you've seen how we handle [Use Case 1], let me show you [Use Case 2]."
---
### Use Case 2: [Name] (7 minutes)
**Business Context:**
[1-2 sentences on why this matters to the customer]
**Demo Steps:**
1. **Step 1:** [Navigate to / Click on / Show...]
- **What to say:** "[Explanation]"
- **Highlight:** [Key capability]
2. **Step 2:** [Navigate to / Click on / Show...]
- **What to say:** "[Explanation]"
- **Highlight:** [Key capability]
3. **Step 3:** [Navigate to / Click on / Show...]
- **What to say:** "[Explanation]"
- **Highlight:** [Key capability]
**Key Message:** "[One sentence summarizing the value demonstrated]"
**Transition:** "[Transition statement to next section]"
---
### Use Case 3: [Name] (6 minutes)
**Business Context:**
[1-2 sentences on why this matters to the customer]
**Demo Steps:**
1. **Step 1:** [Description]
- **What to say:** "[Explanation]"
- **Highlight:** [Key capability]
2. **Step 2:** [Description]
- **What to say:** "[Explanation]"
- **Highlight:** [Key capability]
**Key Message:** "[One sentence summarizing the value demonstrated]"
---
### Integration Demo (10 minutes)
**Context:** "You mentioned that integration with [System X] and [System Y] is critical. Let me show you how that works."
**Demo Steps:**
1. **Show integration configuration:**
- **What to say:** "Setting up the connection takes [X minutes/clicks]"
- **Highlight:** Native connector, no custom code required
2. **Show data flow:**
- **What to say:** "Data syncs in [real-time/X minute intervals]"
- **Highlight:** Reliability, error handling, monitoring
3. **Show end-to-end workflow:**
- **What to say:** "Here's the complete flow from [source] to [destination]"
- **Highlight:** Automation, reduced manual effort
---
### Admin and Security (5 minutes)
**Demo Steps:**
1. **Show RBAC configuration:**
- **What to say:** "Administrators can define roles and permissions at [granularity level]"
2. **Show audit log:**
- **What to say:** "Every action is logged for compliance and security review"
3. **Show SSO setup:**
- **What to say:** "Single sign-on integrates with your existing identity provider"
---
## Objection Handling
### Anticipated Objections
| Objection | Response |
|-----------|----------|
| "[Feature X] looks limited compared to [Competitor]" | "Great observation. Our approach to [Feature X] focuses on [benefit]. What specific aspect of [Feature X] is most important to your workflow? [Then demonstrate or explain how we address the specific need]" |
| "How does this handle [edge case]?" | "That's an important scenario. [If supported: Let me show you how that works.] [If not directly: Here's how our customers typically handle that use case...]" |
| "What about performance at our scale?" | "Excellent question. Our platform handles [benchmark data]. For your specific scale of [X], we'd recommend [architecture approach]. We can validate this in a POC." |
| "The implementation timeline seems long" | "The timeline I shared is for the full solution. We can phase the rollout to deliver value sooner. Phase 1 would give you [core capability] within [X weeks]." |
| "What happens if we outgrow this?" | "Our architecture is designed for growth. [Describe scaling approach]. We have customers who have scaled from [X] to [Y] without re-architecture." |
### Recovery Strategies
**If the demo breaks:**
1. Stay calm: "Let me switch to [backup environment / backup approach]"
2. Explain what they would have seen
3. Offer to follow up with a recorded walkthrough
4. Pivot to the next demo section
**If an unexpected question derails the flow:**
1. Acknowledge: "That's an excellent question"
2. Briefly answer or note it for follow-up
3. Return to the demo flow: "Let me continue with [next section] and we can dive deeper into that during Q&A"
**If the audience seems disengaged:**
1. Pause and ask: "Before I continue, is this addressing what you're looking for?"
2. Adjust focus based on their response
3. Skip ahead to the section most relevant to their interests
---
## Post-Demo Actions
- [ ] Send thank-you email with recording link (if recorded)
- [ ] Share demo environment access credentials (if applicable)
- [ ] Send follow-up document addressing unanswered questions
- [ ] Schedule next meeting (POC kickoff, technical deep-dive, etc.)
- [ ] Update CRM with demo notes and next steps
- [ ] Debrief with AE on stakeholder reactions and concerns
- [ ] Log key objections and responses for battlecard updates
---
## Notes
[Space for real-time notes during the demo]
### Questions Raised
1. [Question] - [Answer / Follow-up needed]
2. [Question] - [Answer / Follow-up needed]
### Feedback Received
- [Positive feedback]
- [Concerns raised]
### Next Steps Agreed
1. [Action item] - [Owner] - [Date]
2. [Action item] - [Owner] - [Date]
FILE:sales-engineer/assets/expected_output.json
{
"rfp_info": {
"rfp_name": "Enterprise Data Analytics Platform RFP",
"customer": "Acme Financial Services",
"due_date": "2026-03-15",
"strategic_value": "high",
"deal_value": "$450,000 ARR"
},
"coverage_summary": {
"overall_coverage_percentage": 84.5,
"total_requirements": 21,
"full": 14,
"partial": 3,
"planned": 2,
"gap": 2,
"must_have_gaps": 0
},
"category_scores": {
"Data Integration": {
"coverage_percentage": 90.0,
"requirements_count": 4,
"full": 3,
"partial": 1,
"planned": 0,
"gap": 0,
"effort_hours": 34
},
"Analytics & Visualization": {
"coverage_percentage": 77.8,
"requirements_count": 4,
"full": 2,
"partial": 1,
"planned": 1,
"gap": 0,
"effort_hours": 56
},
"Security & Compliance": {
"coverage_percentage": 81.8,
"requirements_count": 4,
"full": 3,
"partial": 0,
"planned": 0,
"gap": 1,
"effort_hours": 50
},
"Performance & Scalability": {
"coverage_percentage": 87.5,
"requirements_count": 3,
"full": 2,
"partial": 1,
"planned": 0,
"gap": 0,
"effort_hours": 32
},
"API & Extensibility": {
"coverage_percentage": 87.5,
"requirements_count": 3,
"full": 2,
"partial": 0,
"planned": 1,
"gap": 0,
"effort_hours": 38
},
"Support & SLA": {
"coverage_percentage": 100.0,
"requirements_count": 2,
"full": 2,
"partial": 0,
"planned": 0,
"gap": 0,
"effort_hours": 4
},
"Deployment": {
"coverage_percentage": 0.0,
"requirements_count": 1,
"full": 0,
"partial": 0,
"planned": 0,
"gap": 1,
"effort_hours": 80
}
},
"bid_recommendation": {
"decision": "BID",
"confidence": "high",
"overall_coverage_percentage": 84.5,
"must_have_gaps": 0,
"strategic_value": "high",
"reasons": [
"Coverage score 84.5% exceeds 70% threshold"
]
},
"gap_analysis": [
{
"id": "R-004",
"requirement": "Change data capture (CDC) for real-time sync",
"category": "Data Integration",
"priority": "should-have",
"coverage_status": "partial",
"severity": "high",
"effort_hours": 16,
"mitigation": "Document supported CDC sources; provide configuration guide for non-standard sources"
},
{
"id": "R-007",
"requirement": "Natural language query interface for business users",
"category": "Analytics & Visualization",
"priority": "should-have",
"coverage_status": "planned",
"severity": "high",
"effort_hours": 24,
"mitigation": "Share roadmap timeline; offer guided query builder as interim solution"
},
{
"id": "R-012",
"requirement": "HIPAA compliance for healthcare data handling",
"category": "Security & Compliance",
"priority": "should-have",
"coverage_status": "gap",
"severity": "high",
"effort_hours": 40,
"mitigation": "Evaluate HIPAA certification timeline with compliance team; consider data masking as interim"
},
{
"id": "R-015",
"requirement": "Multi-region deployment with data residency controls",
"category": "Performance & Scalability",
"priority": "should-have",
"coverage_status": "partial",
"severity": "high",
"effort_hours": 20,
"mitigation": "Confirm customer region requirements; provide APAC beta access if needed"
},
{
"id": "R-008",
"requirement": "Predictive analytics and ML model integration",
"category": "Analytics & Visualization",
"priority": "nice-to-have",
"coverage_status": "partial",
"severity": "low",
"effort_hours": 20,
"mitigation": "Demonstrate Python integration for custom models; provide example notebooks"
},
{
"id": "R-018",
"requirement": "Custom plugin/extension framework",
"category": "API & Extensibility",
"priority": "nice-to-have",
"coverage_status": "planned",
"severity": "low",
"effort_hours": 30,
"mitigation": "Current API extensibility covers most use cases; plugin framework will expand options"
},
{
"id": "R-021",
"requirement": "On-premise deployment option",
"category": "Deployment",
"priority": "nice-to-have",
"coverage_status": "gap",
"severity": "low",
"effort_hours": 80,
"mitigation": "Position cloud-first architecture benefits; offer VPC deployment as alternative"
}
],
"risk_assessment": [
{
"risk": "High customization effort",
"impact": "high",
"description": "230 hours estimated for non-full requirements",
"mitigation": "Evaluate resource availability and timeline feasibility before committing"
}
],
"effort_estimate": {
"total_hours": 294,
"gap_closure_hours": 230,
"full_coverage_hours": 64
},
"requirements_detail": [
{
"id": "R-001",
"requirement": "Real-time data ingestion from multiple sources (APIs, databases, streaming)",
"category": "Data Integration",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 8,
"notes": "Native connectors for 200+ data sources",
"mitigation": ""
},
{
"id": "R-002",
"requirement": "Support for SQL and NoSQL data sources",
"category": "Data Integration",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 4,
"notes": "Supports PostgreSQL, MySQL, MongoDB, Cassandra, and more",
"mitigation": ""
},
{
"id": "R-003",
"requirement": "Automated ETL pipeline creation with visual designer",
"category": "Data Integration",
"priority": "should-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 2.0,
"weighted_score": 2.0,
"max_weighted": 2.0,
"effort_hours": 6,
"notes": "Drag-and-drop pipeline builder included",
"mitigation": ""
},
{
"id": "R-004",
"requirement": "Change data capture (CDC) for real-time sync",
"category": "Data Integration",
"priority": "should-have",
"coverage_status": "partial",
"coverage_score": 0.5,
"weight": 2.0,
"weighted_score": 1.0,
"max_weighted": 2.0,
"effort_hours": 16,
"notes": "CDC supported for major databases; some require custom configuration",
"mitigation": "Document supported CDC sources; provide configuration guide for non-standard sources"
},
{
"id": "R-005",
"requirement": "Interactive dashboard creation with drag-and-drop",
"category": "Analytics & Visualization",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 4,
"notes": "Full drag-and-drop dashboard builder with 50+ chart types",
"mitigation": ""
},
{
"id": "R-006",
"requirement": "Embedded analytics with white-labeling support",
"category": "Analytics & Visualization",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 8,
"notes": "Full embedding SDK with CSS customization",
"mitigation": ""
},
{
"id": "R-007",
"requirement": "Natural language query interface for business users",
"category": "Analytics & Visualization",
"priority": "should-have",
"coverage_status": "planned",
"coverage_score": 0.25,
"weight": 2.0,
"weighted_score": 0.5,
"max_weighted": 2.0,
"effort_hours": 24,
"notes": "NLQ feature on roadmap for Q3 2026",
"mitigation": "Share roadmap timeline; offer guided query builder as interim solution"
},
{
"id": "R-008",
"requirement": "Predictive analytics and ML model integration",
"category": "Analytics & Visualization",
"priority": "nice-to-have",
"coverage_status": "partial",
"coverage_score": 0.5,
"weight": 1.0,
"weighted_score": 0.5,
"max_weighted": 1.0,
"effort_hours": 20,
"notes": "Python/R integration available; no built-in ML models",
"mitigation": "Demonstrate Python integration for custom models; provide example notebooks"
},
{
"id": "R-009",
"requirement": "Role-based access control (RBAC) with row-level security",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 6,
"notes": "Granular RBAC with row-level and column-level security",
"mitigation": ""
},
{
"id": "R-010",
"requirement": "SOC 2 Type II certification",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 2,
"notes": "Current SOC 2 Type II report available upon NDA",
"mitigation": ""
},
{
"id": "R-011",
"requirement": "Data encryption at rest and in transit (AES-256, TLS 1.3)",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 2,
"notes": "AES-256 at rest, TLS 1.3 in transit, customer-managed keys supported",
"mitigation": ""
},
{
"id": "R-012",
"requirement": "HIPAA compliance for healthcare data handling",
"category": "Security & Compliance",
"priority": "should-have",
"coverage_status": "gap",
"coverage_score": 0.0,
"weight": 2.0,
"weighted_score": 0.0,
"max_weighted": 2.0,
"effort_hours": 40,
"notes": "HIPAA BAA not currently offered",
"mitigation": "Evaluate HIPAA certification timeline with compliance team; consider data masking as interim"
},
{
"id": "R-013",
"requirement": "Horizontal scaling to handle 10B+ rows",
"category": "Performance & Scalability",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 8,
"notes": "Distributed query engine scales to 50B+ rows",
"mitigation": ""
},
{
"id": "R-014",
"requirement": "Sub-second query response for cached dashboards",
"category": "Performance & Scalability",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 4,
"notes": "Intelligent caching layer with <500ms p95 for cached queries",
"mitigation": ""
},
{
"id": "R-015",
"requirement": "Multi-region deployment with data residency controls",
"category": "Performance & Scalability",
"priority": "should-have",
"coverage_status": "partial",
"coverage_score": 0.5,
"weight": 2.0,
"weighted_score": 1.0,
"max_weighted": 2.0,
"effort_hours": 20,
"notes": "US and EU regions available; APAC region in beta",
"mitigation": "Confirm customer region requirements; provide APAC beta access if needed"
},
{
"id": "R-016",
"requirement": "RESTful API with comprehensive documentation",
"category": "API & Extensibility",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 4,
"notes": "Full REST API with OpenAPI spec and interactive documentation",
"mitigation": ""
},
{
"id": "R-017",
"requirement": "Webhook support for event-driven workflows",
"category": "API & Extensibility",
"priority": "should-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 2.0,
"weighted_score": 2.0,
"max_weighted": 2.0,
"effort_hours": 4,
"notes": "Webhook support for 30+ event types",
"mitigation": ""
},
{
"id": "R-018",
"requirement": "Custom plugin/extension framework",
"category": "API & Extensibility",
"priority": "nice-to-have",
"coverage_status": "planned",
"coverage_score": 0.25,
"weight": 1.0,
"weighted_score": 0.25,
"max_weighted": 1.0,
"effort_hours": 30,
"notes": "Plugin framework on roadmap for Q4 2026",
"mitigation": "Current API extensibility covers most use cases; plugin framework will expand options"
},
{
"id": "R-019",
"requirement": "24/7 enterprise support with 1-hour critical response time",
"category": "Support & SLA",
"priority": "must-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 3.0,
"weighted_score": 3.0,
"max_weighted": 3.0,
"effort_hours": 2,
"notes": "Premium support tier includes 24/7 coverage with 30-min critical response SLA",
"mitigation": ""
},
{
"id": "R-020",
"requirement": "Dedicated customer success manager",
"category": "Support & SLA",
"priority": "should-have",
"coverage_status": "full",
"coverage_score": 1.0,
"weight": 2.0,
"weighted_score": 2.0,
"max_weighted": 2.0,
"effort_hours": 2,
"notes": "Included in Enterprise tier",
"mitigation": ""
},
{
"id": "R-021",
"requirement": "On-premise deployment option",
"category": "Deployment",
"priority": "nice-to-have",
"coverage_status": "gap",
"coverage_score": 0.0,
"weight": 1.0,
"weighted_score": 0.0,
"max_weighted": 1.0,
"effort_hours": 80,
"notes": "Cloud-only platform; no on-premise offering",
"mitigation": "Position cloud-first architecture benefits; offer VPC deployment as alternative"
}
]
}
FILE:sales-engineer/assets/poc_scorecard_template.md
# POC Evaluation Scorecard
## Scorecard Information
| Field | Value |
|-------|-------|
| POC Name | [POC Name] |
| Customer | [Customer Name] |
| Vendor/Product | [Product Name] |
| Evaluation Period | [Start Date] - [End Date] |
| Evaluated By | [Names and Roles] |
| Date Completed | [Date] |
---
## Scoring Scale
| Score | Label | Definition |
|-------|-------|------------|
| 5 | Exceeds | Superior capability; exceeds requirements with notable strengths |
| 4 | Meets | Full capability; meets all requirements with no significant gaps |
| 3 | Partial | Acceptable capability; minor gaps that can be addressed |
| 2 | Below | Below expectations; significant gaps that impact value |
| 1 | Fails | Does not meet requirements; critical gaps |
| N/A | Not Evaluated | Not tested during this POC |
---
## Evaluation Categories
### 1. Functionality (Weight: 30%)
| Criterion | Score (1-5) | Evidence / Notes |
|-----------|-------------|-----------------|
| Core feature completeness | | |
| Use case coverage | | |
| Customization flexibility | | |
| Workflow automation | | |
| Data handling and transformation | | |
| Reporting and analytics | | |
**Category Score:** ___/5.0
**Category Notes:**
[Summary of functionality evaluation, key strengths and gaps]
---
### 2. Performance (Weight: 20%)
| Criterion | Score (1-5) | Evidence / Notes |
|-----------|-------------|-----------------|
| Response time under expected load | | |
| Response time under peak load | | |
| Throughput capacity | | |
| Scalability characteristics | | |
| Resource utilization | | |
| Batch processing performance | | |
**Category Score:** ___/5.0
**Category Notes:**
[Summary of performance evaluation, benchmark results]
---
### 3. Integration (Weight: 20%)
| Criterion | Score (1-5) | Evidence / Notes |
|-----------|-------------|-----------------|
| API completeness and documentation | | |
| Data migration ease | | |
| Third-party connector availability | | |
| Authentication/SSO integration | | |
| Real-time sync reliability | | |
| Error handling and recovery | | |
**Category Score:** ___/5.0
**Category Notes:**
[Summary of integration evaluation, systems tested]
---
### 4. Usability (Weight: 15%)
| Criterion | Score (1-5) | Evidence / Notes |
|-----------|-------------|-----------------|
| User interface intuitiveness | | |
| Learning curve assessment | | |
| Documentation quality | | |
| Admin console functionality | | |
| Mobile experience | | |
| Accessibility compliance | | |
**Category Score:** ___/5.0
**Category Notes:**
[Summary of usability evaluation, user feedback]
---
### 5. Support (Weight: 15%)
| Criterion | Score (1-5) | Evidence / Notes |
|-----------|-------------|-----------------|
| Technical support responsiveness | | |
| Knowledge base quality | | |
| Training resources availability | | |
| Community and ecosystem | | |
| Issue resolution speed | | |
| Proactive engagement quality | | |
**Category Score:** ___/5.0
**Category Notes:**
[Summary of support evaluation during POC]
---
## Score Summary
| Category | Weight | Score | Weighted Score |
|----------|--------|-------|----------------|
| Functionality | 30% | ___/5.0 | ___ |
| Performance | 20% | ___/5.0 | ___ |
| Integration | 20% | ___/5.0 | ___ |
| Usability | 15% | ___/5.0 | ___ |
| Support | 15% | ___/5.0 | ___ |
| **Overall** | **100%** | | **___/5.0** |
### Decision Thresholds
| Weighted Average | Decision |
|-----------------|----------|
| >= 4.0 | **Strong Pass** - Proceed to procurement |
| >= 3.5 and < 4.0 | **Pass** - Proceed with noted conditions |
| >= 3.0 and < 3.5 | **Conditional** - Requires further evaluation |
| < 3.0 | **Fail** - Does not meet requirements |
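For teams that want to compute the overall result rather than fill it in by hand, here is a small sketch of the arithmetic. It assumes every category has been scored (no N/A entries) and treats each band's lower bound as inclusive, so a value such as 3.95 is classified as a Pass. The function names are hypothetical.

```python
# Category weights as listed in the Score Summary table.
CATEGORY_WEIGHTS = {
    "Functionality": 0.30,
    "Performance": 0.20,
    "Integration": 0.20,
    "Usability": 0.15,
    "Support": 0.15,
}


def weighted_average(category_scores):
    """Weighted average on the 1-5 scale; expects a score for every category."""
    total = sum(
        weight * category_scores[name] for name, weight in CATEGORY_WEIGHTS.items()
    )
    return round(total, 2)


def decision(average):
    """Map a weighted average to a decision band (lower bounds inclusive)."""
    if average >= 4.0:
        return "Strong Pass"
    if average >= 3.5:
        return "Pass"
    if average >= 3.0:
        return "Conditional"
    return "Fail"


scores = {
    "Functionality": 4.2,
    "Performance": 3.8,
    "Integration": 4.0,
    "Usability": 3.5,
    "Support": 4.5,
}
average = weighted_average(scores)  # 1.26 + 0.76 + 0.80 + 0.525 + 0.675
print(average, decision(average))
```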
---
## Success Criteria Results
| # | Criterion | Priority | Target | Actual | Pass/Fail |
|---|-----------|----------|--------|--------|-----------|
| 1 | [Criterion 1] | Must-Have | [Target] | [Result] | [ ] |
| 2 | [Criterion 2] | Must-Have | [Target] | [Result] | [ ] |
| 3 | [Criterion 3] | Must-Have | [Target] | [Result] | [ ] |
| 4 | [Criterion 4] | Should-Have | [Target] | [Result] | [ ] |
| 5 | [Criterion 5] | Should-Have | [Target] | [Result] | [ ] |
| 6 | [Criterion 6] | Nice-to-Have | [Target] | [Result] | [ ] |
**Must-Have Pass Rate:** ___%
**Overall Pass Rate:** ___%
---
## Issues Log
| # | Issue | Severity | Status | Resolution | Impact on Score |
|---|-------|----------|--------|------------|----------------|
| 1 | [Issue] | [Critical/High/Medium/Low] | [Open/Resolved] | [Resolution] | [Category affected] |
| 2 | [Issue] | [Critical/High/Medium/Low] | [Open/Resolved] | [Resolution] | [Category affected] |
---
## Stakeholder Feedback
### [Stakeholder Name 1] - [Role]
**Rating:** ___/5
**Comments:** [Feedback]
### [Stakeholder Name 2] - [Role]
**Rating:** ___/5
**Comments:** [Feedback]
### [Stakeholder Name 3] - [Role]
**Rating:** ___/5
**Comments:** [Feedback]
---
## Recommendation
### Decision: [ ] GO / [ ] CONDITIONAL GO / [ ] NO-GO
**Rationale:**
[2-3 paragraphs explaining the recommendation based on scorecard results, success criteria outcomes, stakeholder feedback, and overall evaluation]
**Conditions (if Conditional GO):**
1. [Condition 1 that must be met before proceeding]
2. [Condition 2 that must be met before proceeding]
**Key Strengths:**
1. [Strength 1]
2. [Strength 2]
3. [Strength 3]
**Key Concerns:**
1. [Concern 1 with proposed mitigation]
2. [Concern 2 with proposed mitigation]
**Next Steps:**
1. [Action item] - [Owner] - [Date]
2. [Action item] - [Owner] - [Date]
3. [Action item] - [Owner] - [Date]
---
## Sign-Off
| Role | Name | Signature | Date |
|------|------|-----------|------|
| Technical Evaluator | | | |
| Business Sponsor | | | |
| Decision Maker | | | |
| Sales Engineer | | | |
FILE:sales-engineer/assets/sample_rfp_data.json
{
"rfp_name": "Enterprise Data Analytics Platform RFP",
"customer": "Acme Financial Services",
"due_date": "2026-03-15",
"deal_value": "$450,000 ARR",
"strategic_value": "high",
"requirements": [
{
"id": "R-001",
"requirement": "Real-time data ingestion from multiple sources (APIs, databases, streaming)",
"category": "Data Integration",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 8,
"notes": "Native connectors for 200+ data sources",
"mitigation": ""
},
{
"id": "R-002",
"requirement": "Support for SQL and NoSQL data sources",
"category": "Data Integration",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 4,
"notes": "Supports PostgreSQL, MySQL, MongoDB, Cassandra, and more",
"mitigation": ""
},
{
"id": "R-003",
"requirement": "Automated ETL pipeline creation with visual designer",
"category": "Data Integration",
"priority": "should-have",
"coverage_status": "full",
"effort_hours": 6,
"notes": "Drag-and-drop pipeline builder included",
"mitigation": ""
},
{
"id": "R-004",
"requirement": "Change data capture (CDC) for real-time sync",
"category": "Data Integration",
"priority": "should-have",
"coverage_status": "partial",
"effort_hours": 16,
"notes": "CDC supported for major databases; some require custom configuration",
"mitigation": "Document supported CDC sources; provide configuration guide for non-standard sources"
},
{
"id": "R-005",
"requirement": "Interactive dashboard creation with drag-and-drop",
"category": "Analytics & Visualization",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 4,
"notes": "Full drag-and-drop dashboard builder with 50+ chart types",
"mitigation": ""
},
{
"id": "R-006",
"requirement": "Embedded analytics with white-labeling support",
"category": "Analytics & Visualization",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 8,
"notes": "Full embedding SDK with CSS customization",
"mitigation": ""
},
{
"id": "R-007",
"requirement": "Natural language query interface for business users",
"category": "Analytics & Visualization",
"priority": "should-have",
"coverage_status": "planned",
"effort_hours": 24,
"notes": "NLQ feature on roadmap for Q3 2026",
"mitigation": "Share roadmap timeline; offer guided query builder as interim solution"
},
{
"id": "R-008",
"requirement": "Predictive analytics and ML model integration",
"category": "Analytics & Visualization",
"priority": "nice-to-have",
"coverage_status": "partial",
"effort_hours": 20,
"notes": "Python/R integration available; no built-in ML models",
"mitigation": "Demonstrate Python integration for custom models; provide example notebooks"
},
{
"id": "R-009",
"requirement": "Role-based access control (RBAC) with row-level security",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 6,
"notes": "Granular RBAC with row-level and column-level security",
"mitigation": ""
},
{
"id": "R-010",
"requirement": "SOC 2 Type II certification",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 2,
"notes": "Current SOC 2 Type II report available upon NDA",
"mitigation": ""
},
{
"id": "R-011",
"requirement": "Data encryption at rest and in transit (AES-256, TLS 1.3)",
"category": "Security & Compliance",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 2,
"notes": "AES-256 at rest, TLS 1.3 in transit, customer-managed keys supported",
"mitigation": ""
},
{
"id": "R-012",
"requirement": "HIPAA compliance for healthcare data handling",
"category": "Security & Compliance",
"priority": "should-have",
"coverage_status": "gap",
"effort_hours": 40,
"notes": "HIPAA BAA not currently offered",
"mitigation": "Evaluate HIPAA certification timeline with compliance team; consider data masking as interim"
},
{
"id": "R-013",
"requirement": "Horizontal scaling to handle 10B+ rows",
"category": "Performance & Scalability",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 8,
"notes": "Distributed query engine scales to 50B+ rows",
"mitigation": ""
},
{
"id": "R-014",
"requirement": "Sub-second query response for cached dashboards",
"category": "Performance & Scalability",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 4,
"notes": "Intelligent caching layer with <500ms p95 for cached queries",
"mitigation": ""
},
{
"id": "R-015",
"requirement": "Multi-region deployment with data residency controls",
"category": "Performance & Scalability",
"priority": "should-have",
"coverage_status": "partial",
"effort_hours": 20,
"notes": "US and EU regions available; APAC region in beta",
"mitigation": "Confirm customer region requirements; provide APAC beta access if needed"
},
{
"id": "R-016",
"requirement": "RESTful API with comprehensive documentation",
"category": "API & Extensibility",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 4,
"notes": "Full REST API with OpenAPI spec and interactive documentation",
"mitigation": ""
},
{
"id": "R-017",
"requirement": "Webhook support for event-driven workflows",
"category": "API & Extensibility",
"priority": "should-have",
"coverage_status": "full",
"effort_hours": 4,
"notes": "Webhook support for 30+ event types",
"mitigation": ""
},
{
"id": "R-018",
"requirement": "Custom plugin/extension framework",
"category": "API & Extensibility",
"priority": "nice-to-have",
"coverage_status": "planned",
"effort_hours": 30,
"notes": "Plugin framework on roadmap for Q4 2026",
"mitigation": "Current API extensibility covers most use cases; plugin framework will expand options"
},
{
"id": "R-019",
"requirement": "24/7 enterprise support with 1-hour critical response time",
"category": "Support & SLA",
"priority": "must-have",
"coverage_status": "full",
"effort_hours": 2,
"notes": "Premium support tier includes 24/7 coverage with 30-min critical response SLA",
"mitigation": ""
},
{
"id": "R-020",
"requirement": "Dedicated customer success manager",
"category": "Support & SLA",
"priority": "should-have",
"coverage_status": "full",
"effort_hours": 2,
"notes": "Included in Enterprise tier",
"mitigation": ""
},
{
"id": "R-021",
"requirement": "On-premise deployment option",
"category": "Deployment",
"priority": "nice-to-have",
"coverage_status": "gap",
"effort_hours": 80,
"notes": "Cloud-only platform; no on-premise offering",
"mitigation": "Position cloud-first architecture benefits; offer VPC deployment as alternative"
}
]
}
FILE:sales-engineer/assets/technical_proposal_template.md
# Technical Proposal Template
## Document Information
| Field | Value |
|-------|-------|
| Customer | [Customer Name] |
| Opportunity | [Opportunity Name / RFP Reference] |
| Prepared By | [Sales Engineer Name] |
| Date | [Date] |
| Version | [Version Number] |
| Classification | [Confidential / Internal] |
---
## 1. Executive Summary
### Business Context
[2-3 paragraphs summarizing the customer's business challenges and strategic objectives that this solution addresses. Focus on business outcomes, not technical features.]
### Proposed Solution
[1-2 paragraphs describing the solution at a high level, emphasizing how it addresses the specific challenges identified above.]
### Key Value Propositions
1. **[Value 1]:** [Quantified benefit, e.g., "Reduce reporting time by 60%"]
2. **[Value 2]:** [Quantified benefit]
3. **[Value 3]:** [Quantified benefit]
### Recommended Approach
[Brief overview of the implementation approach, timeline, and key milestones.]
---
## 2. Requirements Summary
### Coverage Overview
| Category | Requirements | Full | Partial | Planned | Gap | Coverage |
|----------|-------------|------|---------|---------|-----|----------|
| [Category 1] | [N] | [N] | [N] | [N] | [N] | [X%] |
| [Category 2] | [N] | [N] | [N] | [N] | [N] | [X%] |
| **Total** | **[N]** | **[N]** | **[N]** | **[N]** | **[N]** | **[X%]** |
### Key Differentiators
1. [Differentiator 1 with brief explanation]
2. [Differentiator 2 with brief explanation]
3. [Differentiator 3 with brief explanation]
### Gap Mitigation Plan
| Gap | Priority | Mitigation Strategy | Timeline |
|-----|----------|-------------------|----------|
| [Gap 1] | [Must/Should/Nice] | [Strategy] | [Date] |
| [Gap 2] | [Must/Should/Nice] | [Strategy] | [Date] |
---
## 3. Solution Architecture
### Architecture Overview
[High-level architecture description. Include or reference an architecture diagram.]
```
[ASCII architecture diagram or reference to attached diagram]
Example:
+------------------+     +------------------+     +------------------+
|   Data Sources   | --> |   Our Platform   | --> |     Delivery     |
|   - System A     |     |   - Ingestion    |     |   - Dashboards   |
|   - System B     |     |   - Processing   |     |   - API          |
|   - System C     |     |   - Analytics    |     |   - Exports      |
+------------------+     +------------------+     +------------------+
                                  |
                         +------------------+
                         |    Management    |
                         |   - Security     |
                         |   - Monitoring   |
                         |   - Admin        |
                         +------------------+
```
### Component Details
#### [Component 1]
- **Purpose:** [What this component does]
- **Technology:** [Underlying technology]
- **Scaling:** [How it scales]
- **Availability:** [HA/DR approach]
#### [Component 2]
- **Purpose:** [What this component does]
- **Technology:** [Underlying technology]
- **Scaling:** [How it scales]
- **Availability:** [HA/DR approach]
### Integration Architecture
| Integration Point | Protocol | Direction | Frequency | Authentication |
|-------------------|----------|-----------|-----------|---------------|
| [System A] | REST API | Inbound | Real-time | OAuth 2.0 |
| [System B] | JDBC | Inbound | Batch (hourly) | Service Account |
| [System C] | Webhook | Outbound | Event-driven | API Key |
### Security Architecture
- **Authentication:** [SSO, SAML, OAuth, etc.]
- **Authorization:** [RBAC, row-level security, etc.]
- **Encryption:** [At rest, in transit, key management]
- **Compliance:** [SOC 2, GDPR, HIPAA, etc.]
- **Network:** [VPC, firewall, IP restrictions]
---
## 4. Implementation Plan
### Phase Overview
| Phase | Duration | Focus | Deliverables |
|-------|----------|-------|-------------|
| Phase 1: Foundation | [X weeks] | Environment setup, core configuration | Working environment, admin access |
| Phase 2: Core Implementation | [X weeks] | Primary use cases, integrations | [Deliverables] |
| Phase 3: Advanced Features | [X weeks] | Advanced scenarios, optimization | [Deliverables] |
| Phase 4: Go-Live | [X weeks] | Testing, training, cutover | Production deployment |
### Detailed Timeline
```
Week 1-2: [Phase 1 - Foundation]
  - Environment provisioning
  - Security configuration
  - Data source connectivity
Week 3-6: [Phase 2 - Core Implementation]
  - Use case 1 implementation
  - Use case 2 implementation
  - Integration testing
Week 7-8: [Phase 3 - Advanced Features]
  - Advanced analytics
  - Custom workflows
  - Performance optimization
Week 9-10: [Phase 4 - Go-Live]
  - User acceptance testing
  - Training sessions
  - Production cutover
  - Post-launch support
```
### Resource Requirements
| Role | Hours | Phase(s) | Provider |
|------|-------|----------|----------|
| Solutions Architect | [X] | All | [Vendor] |
| Implementation Engineer | [X] | 1-3 | [Vendor] |
| Project Manager | [X] | All | [Vendor] |
| Customer IT Admin | [X] | 1, 4 | [Customer] |
| Customer Business Lead | [X] | 2-4 | [Customer] |
### Training Plan
| Audience | Format | Duration | Content |
|----------|--------|----------|---------|
| Administrators | Workshop | [X hours] | Configuration, security, monitoring |
| Power Users | Workshop | [X hours] | Advanced features, reporting, automation |
| End Users | Webinar | [X hours] | Core workflows, self-service analytics |
---
## 5. Risk Mitigation
| Risk | Probability | Impact | Mitigation |
|------|------------|--------|------------|
| [Risk 1] | [H/M/L] | [H/M/L] | [Strategy] |
| [Risk 2] | [H/M/L] | [H/M/L] | [Strategy] |
| [Risk 3] | [H/M/L] | [H/M/L] | [Strategy] |
---
## 6. Commercial Summary
### Pricing Overview
| Component | Annual Cost |
|-----------|------------|
| Platform License | $[X] |
| Implementation Services | $[X] |
| Training | $[X] |
| Premium Support | $[X] |
| **Total Year 1** | **$[X]** |
| **Annual Renewal** | **$[X]** |
### ROI Projection
| Metric | Current State | With Solution | Improvement |
|--------|--------------|---------------|-------------|
| [Metric 1] | [Value] | [Value] | [%] |
| [Metric 2] | [Value] | [Value] | [%] |
| [Metric 3] | [Value] | [Value] | [%] |
**Estimated payback period:** [X months]
---
## 7. Next Steps
1. [Next step 1 with owner and date]
2. [Next step 2 with owner and date]
3. [Next step 3 with owner and date]
---
## Appendices
### A. Detailed Compliance Matrix
[Reference to full requirement-by-requirement response]
### B. Reference Customers
[2-3 relevant customer references with industry, use case, and outcomes]
### C. Architecture Diagrams
[Detailed architecture diagrams]
### D. Product Roadmap (Relevant Items)
[Roadmap items relevant to this proposal with estimated delivery dates]
FILE:sales-engineer/references/competitive-positioning-framework.md
# Competitive Positioning Framework
A comprehensive guide for Sales Engineers to analyze competitors, build battlecards, handle objections, and position for wins.
## Competitive Analysis Methodology
### 1. Intelligence Gathering
**Primary Sources:**
- Competitor product documentation and release notes
- Analyst reports (Gartner, Forrester, IDC)
- Customer feedback from win/loss reviews
- Industry conferences and webinars
- Public case studies and testimonials
- Open-source repositories and API documentation
**Secondary Sources:**
- Glassdoor reviews (engineering culture, product direction)
- Job postings (technology stack, expansion areas)
- Patent filings (future direction signals)
- Social media and community forums
- Partner ecosystem announcements
### 2. Feature Comparison Best Practices
**Feature Scoring Scale:**
| Score | Label | Definition |
|-------|-------|------------|
| 3 | Full | Complete, production-ready feature support |
| 2 | Partial | Feature exists but with limitations or caveats |
| 1 | Limited | Minimal implementation, significant gaps |
| 0 | None | Feature not available |
**Comparison Categories:**
Organize features into weighted categories that reflect customer priorities:
| Category | Typical Weight | What to Evaluate |
|----------|---------------|------------------|
| Core Functionality | 25-35% | Primary use case coverage |
| Integration & API | 15-25% | Ecosystem connectivity |
| Security & Compliance | 15-20% | Enterprise readiness |
| Scalability & Performance | 10-20% | Growth capacity |
| Usability & UX | 10-15% | Time to value |
| Support & Services | 5-10% | Vendor partnership quality |
**Weighting Guidelines:**
- Adjust weights based on the specific customer's priorities
- Security-sensitive industries (healthcare, finance) should weight compliance higher
- High-growth companies should weight scalability higher
- Enterprise deals should weight integration and support higher
### 3. Differentiator Identification
A differentiator is a feature or capability where your product scores highest among all compared products. Strong differentiators have these properties:
- **Unique:** Only your product offers this capability
- **Valuable:** Customers care about this capability
- **Defensible:** Not easily replicated by competitors
- **Demonstrable:** Can be shown in a demo or POC
**Differentiator Categories:**
| Type | Description | Example |
|------|-------------|---------|
| Feature Differentiator | Unique product capability | Native ML-powered anomaly detection |
| Architecture Differentiator | Fundamental design advantage | Multi-tenant with data isolation |
| Ecosystem Differentiator | Partner or integration advantage | 200+ native integrations |
| Service Differentiator | Support or engagement model | Dedicated SE throughout contract |
| Economic Differentiator | Pricing or TCO advantage | Usage-based pricing with no minimums |
### 4. Vulnerability Assessment
Vulnerabilities are features where competitors score higher than your product. Address vulnerabilities proactively:
**Vulnerability Response Strategies:**
1. **Acknowledge and redirect:** Confirm the gap, then pivot to your strength areas
2. **Reframe the requirement:** Show why the customer's real need is better met differently
3. **Demonstrate workaround:** Show how existing capabilities address the underlying need
4. **Commit to roadmap:** Provide a credible timeline for native support
5. **Partner solution:** Identify an integration partner that fills the gap
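The differentiator and vulnerability definitions above can be sketched as a small classifier over a feature comparison matrix. This is a minimal illustration, not part of any shipped tooling: the function name `classify_features`, the product names, and the sample scores are all hypothetical, and it assumes "scores highest" means strictly higher than every competitor (ties are reported as parity).

```python
from typing import Dict, List, Tuple

# Feature scores use the 0-3 scale above (3 = Full ... 0 = None).
# Product and feature names below are placeholders, not real data.
Scores = Dict[str, Dict[str, int]]  # feature -> product -> score


def classify_features(scores: Scores, ours: str) -> Tuple[List[str], List[str], List[str]]:
    """Split features into differentiators, vulnerabilities, and parity."""
    differentiators: List[str] = []
    vulnerabilities: List[str] = []
    parity: List[str] = []
    for feature, by_product in scores.items():
        our_score = by_product[ours]
        best_competitor = max(s for p, s in by_product.items() if p != ours)
        if our_score > best_competitor:
            differentiators.append(feature)  # we score strictly highest
        elif our_score < best_competitor:
            vulnerabilities.append(feature)  # at least one competitor leads
        else:
            parity.append(feature)           # tied with the best competitor
    return differentiators, vulnerabilities, parity


example_scores: Scores = {
    "Anomaly detection": {"Ours": 3, "Competitor A": 1, "Competitor B": 2},
    "On-premise deployment": {"Ours": 0, "Competitor A": 3, "Competitor B": 2},
    "REST API": {"Ours": 3, "Competitor A": 3, "Competitor B": 2},
}

diffs, vulns, ties = classify_features(example_scores, ours="Ours")
print("Differentiators:", diffs)
print("Vulnerabilities:", vulns)
print("Parity:", ties)
```

A score-based differentiator still needs to pass the Unique, Valuable, Defensible, and Demonstrable checks, which are judgment calls this sketch does not attempt.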
## Objection Handling
### Common Technical Objections
#### "Your product lacks [Feature X]"
**Response Framework:**
1. Acknowledge: "You're right that [Feature X] is not a standalone feature today."
2. Explore: "Help me understand the specific use case you need [Feature X] for."
3. Redirect: "Our approach to solving that is [alternative], which actually provides [benefit]."
4. Evidence: "Customer [reference] had the same concern and found [outcome]."
#### "Competitor [Y] has better [Capability]"
**Response Framework:**
1. Acknowledge: "I understand [Competitor Y] has invested in [Capability]."
2. Qualify: "Can you share what specific aspects of [Capability] are most important?"
3. Differentiate: "While they focus on [approach], we take a different approach with [our method] because [reason]."
4. Quantify: "The practical difference in real-world usage is [metric/evidence]."
#### "Your product is too expensive"
**Response Framework:**
1. Acknowledge: "I appreciate you sharing that concern."
2. Reframe: "Let's look at total cost of ownership rather than license cost alone."
3. Quantify: "When you factor in [implementation, training, maintenance, time-to-value], the TCO comparison shows..."
4. Value: "Based on our analysis, the ROI timeline is [X months], delivering [Y value]."
#### "We're concerned about vendor lock-in"
**Response Framework:**
1. Acknowledge: "That's a smart concern for any technology investment."
2. Evidence: "Our architecture uses [open standards, APIs, data portability features]."
3. Demonstrate: "Here's how data export and migration work [show the feature]."
4. Reference: "We can connect you with customers who evaluated this exact concern."
### Objection Handling Principles
1. **Never disparage competitors.** Focus on your strengths, not their weaknesses.
2. **Ask questions first.** Understand the real concern behind the objection.
3. **Use evidence.** Reference customers, benchmarks, and demonstrations.
4. **Be honest about gaps.** Credibility is your most valuable asset.
5. **Redirect to value.** Connect every response back to business outcomes.
## Win/Loss Analysis
### Post-Decision Review Process
**Timing:** Conduct within 2 weeks of the decision for accurate recall.
**Interview Questions (for wins):**
1. What was the deciding factor in choosing us?
2. Which features or capabilities were most compelling?
3. How did our demo/POC compare to alternatives?
4. What concerns did you have that were resolved during the process?
5. What could we have done better in the evaluation process?
**Interview Questions (for losses):**
1. What was the primary reason for choosing the competitor?
2. Were there specific requirements we did not meet?
3. How did our demo/POC compare to the winning vendor?
4. What would have changed your decision?
5. Would you consider us for future evaluations?
### Win/Loss Data Tracking
| Data Point | Purpose |
|-----------|---------|
| Deal size | Pattern analysis by segment |
| Industry | Vertical-specific insights |
| Competitor | Head-to-head record |
| Decision factors | Feature priority validation |
| Sales cycle length | Process efficiency |
| Stakeholder roles | Engagement strategy |
| Technical requirements | Capability gap tracking |
| POC outcome | POC process improvement |
### Analysis Dimensions
1. **By Competitor:** Win rate per competitor, common objections, feature gaps
2. **By Segment:** Enterprise vs mid-market vs SMB patterns
3. **By Industry:** Vertical-specific win factors
4. **By Deal Size:** Large vs small deal dynamics
5. **By Feature Category:** Which capabilities drive wins vs losses
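As one example of these dimensions, the "By Competitor" win rate can be computed from tracked deal records roughly as follows. The record shape (a competitor name plus a won/lost flag) and the function name `win_rate_by_competitor` are assumptions made for illustration; real win/loss data would carry the other tracked data points as well.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def win_rate_by_competitor(deals: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of deals won against each competitor."""
    wins: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for competitor, won in deals:
        totals[competitor] += 1
        if won:
            wins[competitor] += 1
    return {c: wins[c] / totals[c] for c in totals}


# Placeholder records: (competitor, did we win?)
sample_deals = [
    ("Competitor A", True),
    ("Competitor A", False),
    ("Competitor A", True),
    ("Competitor A", True),
    ("Competitor B", False),
    ("Competitor B", True),
]

rates = win_rate_by_competitor(sample_deals)
for competitor, rate in rates.items():
    print(f"{competitor}: {rate:.0%}")
```

The same grouping pattern applies to the other dimensions by swapping the key (segment, industry, deal-size band, or feature category).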
## Battlecard Creation
### Battlecard Structure
**Page 1: Quick Reference**
- Competitor overview (company size, funding, market position)
- Key strengths (top 3)
- Key weaknesses (top 3)
- Ideal customer profile for the competitor
- Our win rate against this competitor
**Page 2: Feature Comparison**
- Category-by-category comparison (summary view)
- Top differentiators (features where we lead)
- Top vulnerabilities (features where they lead)
- Parity features (features at same level)
**Page 3: Talk Track**
- Opening positioning statement
- Discovery questions that expose competitor weaknesses
- Objection responses for their key strengths
- Proof points (customer references, benchmarks, case studies)
- Trap-setting questions for demos and POCs
**Page 4: Win Strategies**
- Recommended evaluation criteria that favor our strengths
- Demo scenarios that highlight our differentiators
- POC success criteria that align with our capabilities
- Pricing and packaging positioning
- Stakeholder engagement strategy
### Battlecard Maintenance
- **Monthly review:** Update feature scores based on new releases
- **Quarterly refresh:** Incorporate win/loss analysis findings
- **Trigger-based update:** Major competitor release, pricing change, or acquisition
## Competitive Positioning During Evaluations
### Evaluation Stage Tactics
| Stage | Tactic |
|-------|--------|
| Discovery | Ask questions that expose competitor weaknesses |
| Demo | Lead with differentiators, show end-to-end workflows |
| POC | Define success criteria aligned with your strengths |
| Proposal | Quantify TCO advantage, emphasize implementation risk |
| Negotiation | Leverage competitive urgency, offer migration assistance |
### Influencing Evaluation Criteria
The sales engineer's most impactful opportunity is shaping the evaluation criteria before the formal process begins:
1. **Map criteria to strengths:** Propose evaluation categories where you excel
2. **Weight appropriately:** Ensure critical categories (where you lead) carry higher weight
3. **Define metrics:** Specific, measurable criteria favor the more capable product
4. **Include non-obvious criteria:** Total cost of ownership, time-to-value, ecosystem breadth
---
**Last Updated:** February 2026
FILE:sales-engineer/references/poc-best-practices.md
# Proof of Concept (POC) Best Practices
A comprehensive guide for Sales Engineers planning, executing, and evaluating proof-of-concept engagements.
## POC Planning Methodology
### 1. Pre-POC Qualification
Not every deal warrants a POC. Qualify before committing resources:
**POC-Worthy Indicators:**
- Deal value justifies 80-200+ hours of SE and engineering time
- Customer has an identified champion who will actively participate
- Clear decision timeline with POC as a defined evaluation step
- Budget is allocated or allocation process is underway
- Technical stakeholders are available for the evaluation period
**POC Red Flags:**
- "Free trial" request with no commitment to evaluate
- No identified decision-maker or budget owner
- Competitor has already been selected; POC is for validation only
- Customer expects production-grade environment for extended period
- No defined success criteria or evaluation framework
### 2. Scope Definition
The most critical success factor is a well-defined scope. An uncontrolled scope leads to extended timelines, unmet expectations, and lost deals.
**Scope Elements:**
- **Use cases:** 3-5 specific scenarios to validate (not "everything")
- **Integrations:** Which systems must connect during the POC
- **Data:** What data will be used (sample, synthetic, production subset)
- **Users:** Who will access the POC environment and in what roles
- **Duration:** Fixed timeline with clear milestones
- **Success criteria:** Measurable, objective criteria for each use case
**Scope Control Tactics:**
- Document scope in writing with customer sign-off
- Define what is explicitly out of scope
- Create a change request process for scope additions
- Set a maximum number of use cases per complexity tier
### 3. Timeline Planning
**Standard 5-Week Framework:**
| Week | Phase | Focus | Key Activities |
|------|-------|-------|---------------|
| 1 | Setup | Foundation | Environment, data, access, kickoff |
| 2-3 | Core Testing | Validation | Primary use cases, integrations, workflows |
| 4 | Advanced Testing | Edge cases | Performance, security, scale, administration |
| 5 | Evaluation | Decision | Scorecard, review, recommendation |
**Timeline Adjustments by Complexity:**
| Complexity | Duration | Use Cases | Integrations |
|-----------|----------|-----------|-------------|
| Low | 3 weeks | 2-3 | 0-1 |
| Medium | 5 weeks | 3-5 | 2-3 |
| High | 6-8 weeks | 5-8 | 4+ |
**Timeline Rules:**
- Never exceed 8 weeks. Longer POCs lose momentum and stakeholder attention.
- Front-load the most impressive capabilities to build early momentum.
- Schedule stakeholder checkpoints at the end of each phase.
- Build 20% buffer into each phase for unexpected issues.
### 4. Resource Planning
**SE Allocation:**
| Activity | Hours/Week (Medium Complexity) |
|----------|-------------------------------|
| Environment setup and configuration | 15-20 (Week 1 only) |
| Use case execution and testing | 20-25 |
| Stakeholder communication | 3-5 |
| Documentation and reporting | 3-5 |
| Issue resolution | 5-8 |
**Engineering Support:**
- Allocate dedicated engineering support for complex integrations
- Establish an escalation path for blocking issues
- Pre-schedule engineering availability during Core Testing phase
- Request customer IT support for integration access and credentials
**Customer Resources:**
- Technical sponsor for daily communication
- Business stakeholders for use case validation
- IT/Security for environment access and compliance review
- End users for usability feedback (if applicable)
## Success Criteria Definition
### Writing Effective Success Criteria
Each criterion must be:
- **Specific:** Clearly defined with no ambiguity
- **Measurable:** Quantifiable metric or clear pass/fail
- **Agreed:** Documented and signed off by both parties
- **Relevant:** Tied to a business outcome or technical requirement
- **Time-bound:** Evaluated within the POC timeline
### Success Criteria Categories
**Functionality Criteria:**
- "System processes [X] transactions per hour without errors"
- "Workflow automation reduces manual steps from [Y] to [Z]"
- "Report generation completes within [N] seconds for [M] records"
- "All [X] defined use cases completed successfully"
**Performance Criteria:**
- "API response time <200ms at p95 under [N] concurrent users"
- "Batch processing completes [X] records in under [Y] minutes"
- "System maintains performance with [N]x expected data volume"
**Integration Criteria:**
- "Bidirectional sync with [System X] operates within [Y] minute latency"
- "SSO integration with [IdP] supports all required authentication flows"
- "Data import from [Source] completes with <1% error rate"
**Usability Criteria:**
- "New users complete [task] within [N] minutes without assistance"
- "Admin configuration for [scenario] requires fewer than [N] steps"
- "Stakeholder satisfaction rating >= 4.0/5.0"
### Anti-Patterns in Success Criteria
- **Too vague:** "System performs well" (what is "well"?)
- **Too many:** More than 15 criteria dilute focus and extend the timeline
- **Unmeasurable:** "Users like the interface" (how do you measure "like"?)
- **Biased toward feature count:** "Must have Feature X" instead of "Must solve Problem Y"
- **Moving target:** Criteria that change mid-POC without formal agreement
## Stakeholder Management
### Stakeholder Map
| Role | Priority | Engagement Strategy |
|------|----------|-------------------|
| Decision Maker | High | Executive briefings, ROI summaries |
| Champion | Critical | Daily communication, progress updates |
| Technical Evaluator | High | Hands-on access, deep-dive sessions |
| End User | Medium | Usability testing, feedback sessions |
| IT/Security | High | Compliance reviews, architecture sessions |
| Procurement | Low-Medium | TCO documentation, reference connections |
### Engagement Cadence
- **Daily:** Champion check-in (10 min, Slack/email)
- **Weekly:** Progress report to all stakeholders (written summary)
- **Phase transitions:** Formal review meeting with demo of progress
- **Final:** Executive presentation with scorecard results and recommendation
### Managing Stakeholder Expectations
1. **Set clear boundaries:** Define what will and will not be demonstrated
2. **Communicate early and often:** No surprises; surface issues immediately
3. **Document everything:** Meeting notes, decisions, change requests
4. **Celebrate wins:** Highlight successful milestones to maintain momentum
5. **Address concerns immediately:** Delays in resolution erode confidence
## Evaluation Frameworks
### Weighted Scorecard Model
The evaluation scorecard provides an objective, comparable assessment:
| Category | Weight | Score (1-5) | Weighted Score |
|----------|--------|-------------|----------------|
| Functionality | 30% | | |
| Performance | 20% | | |
| Integration | 20% | | |
| Usability | 15% | | |
| Support | 15% | | |
| **Total** | **100%** | | |
**Scoring Scale:**
- 5: Exceeds requirements - superior capability demonstrated
- 4: Meets requirements - full capability with minor enhancements possible
- 3: Partially meets - acceptable but notable gaps remain
- 2: Below expectations - significant gaps that impact value
- 1: Does not meet - critical failure for this category
**Decision Thresholds:**
- Weighted average >= 4.0: **Strong Pass** - proceed to procurement
- Weighted average 3.5-3.9: **Pass** - proceed with noted conditions
- Weighted average 3.0-3.4: **Conditional** - requires further evaluation or negotiation
- Weighted average < 3.0: **Fail** - does not meet requirements
### Go/No-Go Decision Framework
The go/no-go decision should be based on multiple factors, not just the scorecard:
**Go Indicators:**
- Scorecard score >= 3.5
- All must-have success criteria met
- Champion and decision-maker both express positive sentiment
- No unresolved critical technical blockers
- Clear implementation path identified
**No-Go Indicators:**
- Scorecard score < 3.0
- Critical success criteria failed without clear resolution
- Decision-maker expresses significant concerns
- Multiple unresolved technical blockers
- Competitive alternative clearly preferred by evaluators
**Conditional Go Indicators:**
- Scorecard score 3.0-3.4 with clear path to improvement
- 1-2 minor success criteria not met but with workarounds
- Mixed stakeholder sentiment that can be addressed
- Blockers identified but resolution path confirmed with engineering
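The weighted scorecard and the go/no-go indicators above can be combined into a rough decision helper. This is a simplified sketch under stated assumptions: the function names are hypothetical, stakeholder sentiment and competitive preference are not modeled, "multiple" blockers is read as more than one, and any case that is neither a clear Go nor a clear No-Go falls through to Conditional Go.

```python
from typing import Dict

# Category weights (percent) from the scorecard table above.
WEIGHTS = {
    "Functionality": 30,
    "Performance": 20,
    "Integration": 20,
    "Usability": 15,
    "Support": 15,
}


def weighted_score(scores: Dict[str, int]) -> float:
    """Weighted average of the 1-5 category scores."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / 100


def poc_decision(score: float, must_haves_met: bool, critical_blockers: int) -> str:
    """Simplified go/no-go rule; sentiment indicators are not modeled."""
    if score < 3.0 or critical_blockers > 1:
        return "No-Go"
    if score >= 3.5 and must_haves_met and critical_blockers == 0:
        return "Go"
    return "Conditional Go"


# Placeholder scores for illustration only.
example_scores = {
    "Functionality": 4,
    "Performance": 4,
    "Integration": 3,
    "Usability": 4,
    "Support": 3,
}

score = weighted_score(example_scores)
decision = poc_decision(score, must_haves_met=True, critical_blockers=0)
print(f"Weighted score: {score:.2f} -> {decision}")
```

Treat the output as an input to the review meeting rather than the decision itself, since the qualitative indicators still need a human read.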
## Common POC Failure Modes
### 1. Scope Creep
**Symptom:** Customer continuously adds requirements during the POC.
**Prevention:** Written scope agreement with change request process.
**Recovery:** Renegotiate timeline or defer additions to Phase 2.
### 2. Champion Absence
**Symptom:** Champion becomes unavailable or disengaged mid-POC.
**Prevention:** Identify a backup champion. Schedule regular touchpoints.
**Recovery:** Escalate to decision-maker. Demonstrate value already achieved.
### 3. Data Issues
**Symptom:** Customer data is unavailable, poor quality, or incompatible.
**Prevention:** Request sample data before kickoff. Prepare synthetic data.
**Recovery:** Use synthetic data for core testing. Document data requirements for implementation.
### 4. Environment Problems
**Symptom:** POC environment is unstable, slow, or inaccessible.
**Prevention:** Use a dedicated, pre-configured environment. Test before kickoff.
**Recovery:** Have a backup environment. Communicate honestly about delays.
### 5. Moving Goalposts
**Symptom:** Evaluation criteria change mid-POC, often influenced by competitor demos.
**Prevention:** Get written sign-off on criteria before starting. Reference agreement when changes arise.
**Recovery:** Agree to evaluate new criteria as addendum, not replacement. Highlight what has already been validated.
### 6. Extended Timeline
**Symptom:** POC drags beyond planned duration without clear progress.
**Prevention:** Set hard deadlines in the agreement. Schedule decision meetings in advance.
**Recovery:** Force a checkpoint. Present results to date and ask for a go/no-go with current evidence.
### 7. Technical Blockers
**Symptom:** Unexpected technical issues prevent completion of key use cases.
**Prevention:** Conduct technical discovery before committing to POC. Have engineering on standby.
**Recovery:** Escalate immediately. Provide transparent status updates. Offer alternative approaches.
## POC Documentation
### Required Artifacts
| Document | When | Owner |
|----------|------|-------|
| Scope agreement | Pre-POC | SE + Customer |
| Environment setup guide | Week 1 | SE |
| Progress reports | Weekly | SE |
| Phase review presentations | Phase transitions | SE |
| Issue log | Ongoing | SE |
| Final evaluation report | Final week (Week 5 in the standard framework) | SE + Customer |
| Lessons learned | Post-POC | SE |
### Final Report Template
1. **Executive Summary** - POC objectives, approach, and outcome
2. **Scope and Success Criteria** - What was tested and how
3. **Results Summary** - Success criteria outcomes with evidence
4. **Evaluation Scorecard** - Weighted scores across all categories
5. **Issues and Resolutions** - Problems encountered and how they were addressed
6. **Recommendation** - Go/No-Go with rationale
7. **Implementation Considerations** - Next steps, timeline, and resource needs
---
**Last Updated:** February 2026
FILE:sales-engineer/references/rfp-response-guide.md
# RFP/RFI Response Guide
A comprehensive reference for Sales Engineers responding to Requests for Proposal (RFP) and Requests for Information (RFI).
## RFP Response Best Practices
### 1. Pre-Response Assessment
Before investing time in a response, conduct a thorough bid/no-bid assessment:
**Bid Criteria Checklist:**
- Do we have a pre-existing relationship with the customer?
- Is there an identified champion or sponsor?
- Do our capabilities align with >70% of requirements?
- Is the deal size justified against the response effort?
- Do we understand the competitive landscape?
- Is the timeline realistic for our solution?
**Red Flags for No-Bid:**
- No prior customer engagement (blind RFP)
- Requirement language mirrors a competitor's product
- Timeline is unrealistically short
- Must-have requirements fall outside our platform
- Budget is undefined or misaligned with our pricing
### 2. Response Organization
**Executive Summary (1-2 pages):**
- Lead with business outcomes, not features
- Reference the customer's specific challenges
- Quantify value proposition with relevant metrics
- State confidence level and key differentiators
**Solution Overview:**
- Map directly to the customer's stated requirements
- Use the customer's language and terminology
- Include architecture diagrams for technical sections
- Address integration with existing systems
**Compliance Matrix:**
- Mirror the RFP's requirement numbering exactly
- Use consistent coverage categories: Full, Partial, Planned, Gap
- Provide clear explanations for each response
- Include roadmap dates for "Planned" items
### 3. Coverage Classification
| Status | Score | Definition | Response Approach |
|--------|-------|------------|-------------------|
| Full | 100% | Current product fully meets requirement | Describe capability with evidence |
| Partial | 50% | Met with configuration or workaround | Explain approach and any limitations |
| Planned | 25% | On product roadmap | Provide timeline and interim solution |
| Gap | 0% | Not currently supported | Acknowledge gap and propose alternatives |
### 4. Priority-Weighted Scoring
Not all requirements are equal. Weight them by business impact:
- **Must-Have (3x weight):** Core requirements that are deal-breakers. Gaps here typically result in disqualification.
- **Should-Have (2x weight):** Important requirements that influence the decision significantly.
- **Nice-to-Have (1x weight):** Desirable but not critical. Often used as tie-breakers.
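The coverage scores and priority weights above can be combined into a single percentage. The sketch below is one plausible way to do that, assuming the overall score is the weighted sum of earned coverage divided by the maximum weighted sum; the function name `weighted_coverage` and the sample requirements are hypothetical and not taken from the scripts in this skill.

```python
# Coverage scores from the Coverage Classification table, as fractions.
COVERAGE_SCORES = {"Full": 1.0, "Partial": 0.5, "Planned": 0.25, "Gap": 0.0}
# Priority weights from the Priority-Weighted Scoring list.
PRIORITY_WEIGHTS = {"Must-Have": 3, "Should-Have": 2, "Nice-to-Have": 1}


def weighted_coverage(requirements):
    """Return the priority-weighted coverage percentage.

    `requirements` is a list of (priority, status) pairs. The aggregation
    (earned weight / possible weight) is an assumption for illustration.
    """
    earned = sum(PRIORITY_WEIGHTS[p] * COVERAGE_SCORES[s] for p, s in requirements)
    possible = sum(PRIORITY_WEIGHTS[p] for p, _ in requirements)
    return 100.0 * earned / possible if possible else 0.0


# Hypothetical requirement set, for illustration only.
example = [
    ("Must-Have", "Full"),
    ("Must-Have", "Partial"),
    ("Should-Have", "Planned"),
    ("Nice-to-Have", "Gap"),
]
print(f"{weighted_coverage(example):.1f}%")  # prints 55.6%
```

Because Must-Have items carry three times the weight, the single Partial on a Must-Have costs more than the outright Gap on the Nice-to-Have.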
### 5. Response Writing Tips
**Do:**
- Answer the question directly before elaborating
- Use the customer's terminology, not internal jargon
- Provide specific examples, case studies, and metrics
- Include screenshots or architecture diagrams where relevant
- Cross-reference related answers to avoid redundancy
- Proofread for consistency across sections (multiple authors)
**Avoid:**
- Marketing fluff or vague language ("best-in-class", "world-class")
- Answering a question you were not asked
- Contradictions between sections
- Overselling capabilities you do not have
- Ignoring the question format (tables vs. narrative)
## Bid/No-Bid Decision Framework
### Decision Matrix
| Factor | Weight | Score (1-5) | Weighted |
|--------|--------|-------------|----------|
| Technical fit | 25% | | |
| Relationship strength | 20% | | |
| Competitive position | 20% | | |
| Deal value vs effort | 15% | | |
| Strategic importance | 10% | | |
| Win probability | 10% | | |
| **Total** | **100%** | | |
**Scoring Guide:**
- 5: Strong advantage
- 4: Slight advantage
- 3: Neutral / competitive parity
- 2: Slight disadvantage
- 1: Significant disadvantage
**Decision Thresholds:**
- Score >= 3.5: **Bid** - proceed with full response
- Score >= 2.5 and < 3.5: **Conditional Bid** - proceed with executive approval
- Score < 2.5: **No-Bid** - decline or submit information-only response
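As a rough sketch, the decision matrix reduces to a weighted average followed by a threshold check. The function name `bid_decision` and the sample scores are hypothetical; the weights and thresholds come from the tables above, with scores between the bands treated as Conditional Bid.

```python
# Factor weights from the Decision Matrix (they sum to 1.0).
FACTOR_WEIGHTS = {
    "Technical fit": 0.25,
    "Relationship strength": 0.20,
    "Competitive position": 0.20,
    "Deal value vs effort": 0.15,
    "Strategic importance": 0.10,
    "Win probability": 0.10,
}


def bid_decision(scores):
    """Return (weighted_score, decision) for a dict of factor -> score (1-5)."""
    # Round first so floating-point noise cannot push a score across a threshold.
    total = round(sum(FACTOR_WEIGHTS[f] * scores[f] for f in FACTOR_WEIGHTS), 2)
    if total >= 3.5:
        decision = "Bid"
    elif total >= 2.5:
        decision = "Conditional Bid"
    else:
        decision = "No-Bid"
    return total, decision


# Hypothetical scores for one opportunity.
example_scores = {
    "Technical fit": 4,
    "Relationship strength": 3,
    "Competitive position": 3,
    "Deal value vs effort": 4,
    "Strategic importance": 2,
    "Win probability": 3,
}
print(bid_decision(example_scores))  # prints (3.3, 'Conditional Bid')
```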
### Effort Estimation
Estimate the total effort required and compare against deal value:
| Response Component | Typical Effort (hours) |
|-------------------|----------------------|
| Requirements analysis | 4-8 |
| Technical writing | 16-40 |
| Architecture diagrams | 4-8 |
| Demo preparation | 8-16 |
| Internal review | 4-8 |
| Final formatting | 2-4 |
| **Total** | **38-84 hours** |
**Rule of thumb:** The cost of the response effort should not exceed 2% of the deal value.
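To apply the rule of thumb, the hour estimate has to be converted into a cost. The sketch below assumes a blended hourly cost for the response team; the rate, the deal value, and the function name `effort_within_budget` are all hypothetical inputs chosen for illustration.

```python
def effort_within_budget(effort_hours, hourly_cost, deal_value, max_ratio=0.02):
    """Return (within_budget, effort_cost) under the 2% rule of thumb.

    `hourly_cost` is an assumed blended rate; the guide does not specify one.
    """
    effort_cost = effort_hours * hourly_cost
    return effort_cost <= deal_value * max_ratio, effort_cost


# Hypothetical deal: 500,000 in value, team cost of 150 per hour.
print(effort_within_budget(38, 150, 500_000))  # low end of the estimate
print(effort_within_budget(84, 150, 500_000))  # high end of the estimate
```

With these assumed numbers the low-end estimate stays under the 2% ceiling while the high-end estimate exceeds it, which is the kind of signal that feeds the "Deal value vs effort" factor in the decision matrix.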
## Compliance Matrix Structure
### Standard Format
```
| Req ID | Requirement Description | Priority | Compliance | Response | Evidence |
|--------|------------------------|----------|------------|----------|----------|
| R-001 | SSO via SAML 2.0 | Must | Full | Native SAML 2.0 support... | Config guide |
| R-002 | Custom reporting | Should | Partial | Standard reports + API... | API docs |
```
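A minimal sketch of generating the standard format programmatically, so that every row uses one of the four coverage categories. The `ComplianceRow` class, its field names, and `render_matrix` are hypothetical helpers, not part of the scripts in this skill.

```python
from dataclasses import dataclass

# Coverage categories named in the Compliance Matrix guidance.
ALLOWED_COMPLIANCE = ("Full", "Partial", "Planned", "Gap")
HEADER = ("Req ID", "Requirement Description", "Priority",
          "Compliance", "Response", "Evidence")


@dataclass
class ComplianceRow:
    """One requirement row; the field names are hypothetical."""
    req_id: str
    description: str
    priority: str
    compliance: str
    response: str
    evidence: str


def render_matrix(rows):
    """Render rows as a pipe-delimited table, rejecting unknown coverage values."""
    lines = [
        "| " + " | ".join(HEADER) + " |",
        "|" + "|".join("---" for _ in HEADER) + "|",
    ]
    for row in rows:
        if row.compliance not in ALLOWED_COMPLIANCE:
            raise ValueError(f"{row.req_id}: unknown compliance value {row.compliance!r}")
        cells = (row.req_id, row.description, row.priority,
                 row.compliance, row.response, row.evidence)
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


rows = [
    ComplianceRow("R-001", "SSO via SAML 2.0", "Must", "Full",
                  "Native SAML 2.0 support...", "Config guide"),
    ComplianceRow("R-002", "Custom reporting", "Should", "Partial",
                  "Standard reports + API...", "API docs"),
]
print(render_matrix(rows))
```

Validating the coverage value at render time catches ad hoc labels such as "Mostly" before they reach the customer.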
### Section Organization
Organize requirements by category for clarity:
1. **Functional Requirements** - Core features and capabilities
2. **Technical Requirements** - Architecture, APIs, performance
3. **Security & Compliance** - Authentication, encryption, certifications
4. **Integration Requirements** - Third-party systems, data flows
5. **Support & SLA** - Support tiers, response times, uptime
6. **Vendor Qualifications** - Company size, financials, references
## Common Pitfalls
### 1. The Wired RFP
**Symptom:** Requirements language matches a competitor's product feature list.
**Response:** Focus on outcomes over features. Highlight areas of differentiation. Ask clarifying questions that expose broader needs.
### 2. Feature Checklist Syndrome
**Symptom:** RFP is a massive feature checklist with no context about business problems.
**Response:** Group features by business outcome. Add context in your response that demonstrates understanding of the underlying need.
### 3. Scope Creep in Response
**Symptom:** Team keeps adding content that was not requested.
**Response:** Assign a response manager to enforce scope. Answer what was asked and provide references for additional information.
### 4. Inconsistent Messaging
**Symptom:** Multiple authors provide contradictory information.
**Response:** Assign a single editor for final review. Create a response style guide. Use consistent terminology throughout.
### 5. Overcommitting on Gaps
**Symptom:** Marking "Planned" items as "Full" to improve scores.
**Response:** Never misrepresent coverage. Planned items with firm timelines and interim workarounds are better than lies discovered during POC.
## RFP Response Timeline Management
### Typical Response Timeline
| Day | Activity |
|-----|----------|
| Day 1 | Receive RFP, conduct initial review, assign team |
| Day 2-3 | Bid/no-bid decision, questions submission |
| Day 4-7 | Requirements analysis, coverage assessment |
| Day 8-14 | Draft responses, architecture diagrams |
| Day 15-17 | Internal review, quality check |
| Day 18-19 | Final edits, formatting, executive review |
| Day 20 | Submission |
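The day offsets in the table can be mapped to calendar dates once the RFP arrives. The sketch below assumes the days are consecutive calendar days counted from the receipt date (the table does not say whether weekends are excluded); the `schedule` function and the sample date are hypothetical.

```python
from datetime import date, timedelta

# (start_day, end_day, activity) taken from the Typical Response Timeline table.
TIMELINE = [
    (1, 1, "Receive RFP, conduct initial review, assign team"),
    (2, 3, "Bid/no-bid decision, questions submission"),
    (4, 7, "Requirements analysis, coverage assessment"),
    (8, 14, "Draft responses, architecture diagrams"),
    (15, 17, "Internal review, quality check"),
    (18, 19, "Final edits, formatting, executive review"),
    (20, 20, "Submission"),
]


def schedule(received):
    """Map each timeline row to (start_date, end_date, activity).

    Assumes Day 1 is the receipt date and every day is a calendar day.
    """
    return [
        (received + timedelta(days=start - 1),
         received + timedelta(days=end - 1),
         activity)
        for start, end, activity in TIMELINE
    ]


# Hypothetical receipt date.
for start, end, activity in schedule(date(2026, 3, 2)):
    print(f"{start.isoformat()} to {end.isoformat()}: {activity}")
```

If the actual deadline is shorter than 20 days, scaling the ranges proportionally is one option, though which phases can safely be compressed depends on the team.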
### Time-Saving Strategies
1. **Maintain a response library** - Reusable answers for common requirements
2. **Pre-built architecture diagrams** - Template diagrams for common integration patterns
3. **Standardized compliance language** - Pre-approved language for security and compliance sections
4. **Question templates** - Standard clarifying questions for common ambiguities
---
**Last Updated:** February 2026
FILE:sales-engineer/scripts/competitive_matrix_builder.py
#!/usr/bin/env python3
"""Competitive Matrix Builder - Generate feature comparison matrices and positioning analysis.
Builds feature-by-feature comparison matrices, calculates weighted competitive
scores, identifies differentiators and vulnerabilities, and generates win themes.
Usage:
python competitive_matrix_builder.py competitive_data.json
python competitive_matrix_builder.py competitive_data.json --format json
python competitive_matrix_builder.py competitive_data.json --format text
"""
import argparse
import json
import sys
from typing import Any
# Feature scoring levels
FEATURE_SCORES: dict[str, int] = {
"full": 3,
"partial": 2,
"limited": 1,
"none": 0,
}
FEATURE_LABELS: dict[int, str] = {
3: "Full",
2: "Partial",
1: "Limited",
0: "None",
}
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def load_competitive_data(filepath: str) -> dict[str, Any]:
"""Load and validate competitive data from a JSON file.
Args:
filepath: Path to the JSON file containing competitive data.
Returns:
Parsed competitive data dictionary.
Raises:
SystemExit: If the file cannot be read or parsed.
"""
try:
with open(filepath, "r", encoding="utf-8") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {filepath}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {filepath}: {e}", file=sys.stderr)
sys.exit(1)
if "categories" not in data:
print("Error: JSON must contain a 'categories' array.", file=sys.stderr)
sys.exit(1)
if "our_product" not in data:
print("Error: JSON must contain 'our_product' name.", file=sys.stderr)
sys.exit(1)
if "competitors" not in data or not data["competitors"]:
print("Error: JSON must contain a non-empty 'competitors' array.", file=sys.stderr)
sys.exit(1)
return data
def normalize_score(score_value: Any) -> int:
"""Normalize a score value to an integer.
Args:
score_value: Score as string label or integer.
Returns:
Normalized integer score (0-3).
"""
if isinstance(score_value, str):
return FEATURE_SCORES.get(score_value.lower(), 0)
if isinstance(score_value, (int, float)):
return max(0, min(3, int(score_value)))
return 0
def build_comparison_matrix(data: dict[str, Any]) -> dict[str, Any]:
"""Build the feature comparison matrix from input data.
Args:
data: Competitive data with categories, features, and scores.
Returns:
Comparison matrix with per-feature and per-category scores.
"""
our_product = data["our_product"]
competitors = data["competitors"]
all_products = [our_product] + competitors
matrix: list[dict[str, Any]] = []
category_summaries: dict[str, dict[str, Any]] = {}
for category in data["categories"]:
cat_name = category["name"]
cat_weight = category.get("weight", 1.0)
cat_features = category.get("features", [])
cat_scores: dict[str, list[int]] = {p: [] for p in all_products}
for feature in cat_features:
feature_name = feature["name"]
scores: dict[str, int] = {}
for product in all_products:
raw_score = feature.get("scores", {}).get(product, 0)
scores[product] = normalize_score(raw_score)
cat_scores[product].append(scores[product])
# Determine leader for this feature
max_score = max(scores.values())
leaders = [p for p, s in scores.items() if s == max_score]
matrix.append({
"category": cat_name,
"feature": feature_name,
"scores": scores,
"leaders": leaders,
"our_score": scores[our_product],
"max_score": max_score,
"we_lead": our_product in leaders and len(leaders) == 1,
"we_trail": scores[our_product] < max_score,
})
# Category summary
cat_product_scores = {}
for product in all_products:
product_scores = cat_scores[product]
total = sum(product_scores)
max_possible = len(product_scores) * 3
pct = safe_divide(total, max_possible) * 100
cat_product_scores[product] = {
"total_score": total,
"max_possible": max_possible,
"percentage": round(pct, 1),
}
category_summaries[cat_name] = {
"weight": cat_weight,
"feature_count": len(cat_features),
"product_scores": cat_product_scores,
}
return {
"our_product": our_product,
"competitors": competitors,
"all_products": all_products,
"matrix": matrix,
"category_summaries": category_summaries,
}
def compute_competitive_scores(
comparison: dict[str, Any],
) -> dict[str, dict[str, Any]]:
"""Compute weighted competitive scores for each product.
Args:
comparison: Comparison matrix data.
Returns:
Product scores with weighted and unweighted totals.
"""
all_products = comparison["all_products"]
category_summaries = comparison["category_summaries"]
product_scores: dict[str, dict[str, float]] = {
p: {"weighted_total": 0.0, "max_weighted": 0.0, "unweighted_total": 0, "max_unweighted": 0}
for p in all_products
}
for cat_name, cat_data in category_summaries.items():
weight = cat_data["weight"]
for product in all_products:
p_data = cat_data["product_scores"][product]
product_scores[product]["weighted_total"] += p_data["total_score"] * weight
product_scores[product]["max_weighted"] += p_data["max_possible"] * weight
product_scores[product]["unweighted_total"] += p_data["total_score"]
product_scores[product]["max_unweighted"] += p_data["max_possible"]
result = {}
for product in all_products:
ps = product_scores[product]
weighted_pct = safe_divide(ps["weighted_total"], ps["max_weighted"]) * 100
unweighted_pct = safe_divide(ps["unweighted_total"], ps["max_unweighted"]) * 100
result[product] = {
"weighted_score": round(weighted_pct, 1),
"unweighted_score": round(unweighted_pct, 1),
"weighted_total": round(ps["weighted_total"], 2),
"max_weighted": round(ps["max_weighted"], 2),
}
return result
def identify_differentiators(comparison: dict[str, Any]) -> list[dict[str, Any]]:
"""Identify features where our product leads all competitors.
Args:
comparison: Comparison matrix data.
Returns:
List of differentiator features with details.
"""
differentiators = []
for entry in comparison["matrix"]:
if entry["we_lead"] and entry["our_score"] >= 2:
# Calculate gap from nearest competitor
competitor_scores = [
entry["scores"][c] for c in comparison["competitors"]
]
max_competitor = max(competitor_scores) if competitor_scores else 0
gap = entry["our_score"] - max_competitor
differentiators.append({
"feature": entry["feature"],
"category": entry["category"],
"our_score": entry["our_score"],
"our_label": FEATURE_LABELS.get(entry["our_score"], "Unknown"),
"best_competitor_score": max_competitor,
"gap": gap,
})
# Sort by gap size descending
differentiators.sort(key=lambda d: d["gap"], reverse=True)
return differentiators
def identify_vulnerabilities(comparison: dict[str, Any]) -> list[dict[str, Any]]:
"""Identify features where competitors lead our product.
Args:
comparison: Comparison matrix data.
Returns:
List of vulnerability features with details.
"""
vulnerabilities = []
for entry in comparison["matrix"]:
if entry["we_trail"]:
# Find which competitor leads
leader_scores = {
p: entry["scores"][p]
for p in comparison["competitors"]
if entry["scores"][p] == entry["max_score"]
}
gap = entry["max_score"] - entry["our_score"]
vulnerabilities.append({
"feature": entry["feature"],
"category": entry["category"],
"our_score": entry["our_score"],
"our_label": FEATURE_LABELS.get(entry["our_score"], "Unknown"),
"leading_competitors": leader_scores,
"gap": gap,
})
# Sort by gap size descending
vulnerabilities.sort(key=lambda v: v["gap"], reverse=True)
return vulnerabilities
def generate_win_themes(
differentiators: list[dict[str, Any]],
competitive_scores: dict[str, dict[str, Any]],
our_product: str,
) -> list[str]:
"""Generate win themes based on differentiators and competitive position.
Args:
differentiators: List of differentiator features.
competitive_scores: Product competitive scores.
our_product: Our product name.
Returns:
List of win theme strings.
"""
themes = []
# Theme from top differentiators
if differentiators:
top_diff_categories = list({d["category"] for d in differentiators[:5]})
for cat in top_diff_categories[:3]:
cat_diffs = [d for d in differentiators if d["category"] == cat]
feature_names = [d["feature"] for d in cat_diffs[:3]]
themes.append(
f"Superior {cat} capabilities: {', '.join(feature_names)}"
)
# Theme from overall competitive position
our_score = competitive_scores.get(our_product, {}).get("weighted_score", 0)
competitor_scores = [
(p, s["weighted_score"])
for p, s in competitive_scores.items()
if p != our_product
]
if competitor_scores:
best_competitor_name, best_competitor_score = max(
competitor_scores, key=lambda x: x[1]
)
if our_score > best_competitor_score:
themes.append(
f"Overall strongest solution ({our_score:.1f}% vs {best_competitor_name} at {best_competitor_score:.1f}%)"
)
# Theme from breadth of coverage
strong_diffs = [d for d in differentiators if d["gap"] >= 2]
if len(strong_diffs) >= 3:
themes.append(
f"Clear technical leadership across {len(strong_diffs)} key features with significant competitive gaps"
)
if not themes:
themes.append("Competitive parity - emphasize implementation quality, support, and total cost of ownership")
return themes
def analyze_competitive(data: dict[str, Any]) -> dict[str, Any]:
"""Run the complete competitive analysis pipeline.
Args:
data: Parsed competitive data dictionary.
Returns:
Complete analysis results dictionary.
"""
comparison = build_comparison_matrix(data)
competitive_scores = compute_competitive_scores(comparison)
differentiators = identify_differentiators(comparison)
vulnerabilities = identify_vulnerabilities(comparison)
win_themes = generate_win_themes(
differentiators, competitive_scores, comparison["our_product"]
)
return {
"analysis_info": {
"our_product": comparison["our_product"],
"competitors": comparison["competitors"],
"total_features": len(comparison["matrix"]),
"total_categories": len(comparison["category_summaries"]),
},
"competitive_scores": competitive_scores,
"category_breakdown": comparison["category_summaries"],
"comparison_matrix": comparison["matrix"],
"differentiators": differentiators,
"vulnerabilities": vulnerabilities,
"win_themes": win_themes,
}
def format_text(result: dict[str, Any]) -> str:
"""Format analysis results as human-readable text.
Args:
result: Complete analysis results dictionary.
Returns:
Formatted text string.
"""
lines = []
info = result["analysis_info"]
all_products = [info["our_product"]] + info["competitors"]
lines.append("=" * 80)
lines.append("COMPETITIVE MATRIX ANALYSIS")
lines.append("=" * 80)
lines.append(f"Our Product: {info['our_product']}")
lines.append(f"Competitors: {', '.join(info['competitors'])}")
lines.append(f"Features: {info['total_features']}")
lines.append(f"Categories: {info['total_categories']}")
lines.append("")
# Competitive scores
lines.append("-" * 80)
lines.append("COMPETITIVE SCORES")
lines.append("-" * 80)
lines.append(f"{'Product':<25} {'Weighted':>10} {'Unweighted':>12}")
lines.append("-" * 80)
# Sort by weighted score descending
sorted_scores = sorted(
result["competitive_scores"].items(),
key=lambda x: x[1]["weighted_score"],
reverse=True,
)
for product, scores in sorted_scores:
marker = " <-- US" if product == info["our_product"] else ""
lines.append(
f"{product:<25} {scores['weighted_score']:>9.1f}% {scores['unweighted_score']:>11.1f}%{marker}"
)
lines.append("")
# Feature matrix
lines.append("-" * 80)
lines.append("FEATURE COMPARISON MATRIX")
lines.append("-" * 80)
# Build header
product_cols = " ".join(f"{p[:10]:>10}" for p in all_products)
lines.append(f"{'Feature':<30} {product_cols}")
lines.append("-" * 80)
current_category = ""
for entry in result["comparison_matrix"]:
if entry["category"] != current_category:
current_category = entry["category"]
cat_data = result["category_breakdown"].get(current_category, {})
weight = cat_data.get("weight", 1.0)
lines.append(f"\n [{current_category}] (weight: {weight}x)")
score_cols = " ".join(
f"{FEATURE_LABELS.get(entry['scores'].get(p, 0), 'N/A'):>10}"
for p in all_products
)
lead_marker = " *" if entry["we_lead"] else (" !" if entry["we_trail"] else "")
feature_display = entry["feature"][:28]
lines.append(f" {feature_display:<28} {score_cols}{lead_marker}")
lines.append("")
lines.append(" * = We lead | ! = We trail")
lines.append("")
# Differentiators
diffs = result["differentiators"]
if diffs:
lines.append("-" * 80)
lines.append(f"DIFFERENTIATORS ({len(diffs)} features where we lead)")
lines.append("-" * 80)
for d in diffs:
lines.append(
f" + {d['feature']} [{d['category']}] "
f"- Us: {d['our_label']} vs Best Competitor: {FEATURE_LABELS.get(d['best_competitor_score'], 'N/A')} "
f"(gap: +{d['gap']})"
)
lines.append("")
# Vulnerabilities
vulns = result["vulnerabilities"]
if vulns:
lines.append("-" * 80)
lines.append(f"VULNERABILITIES ({len(vulns)} features where competitors lead)")
lines.append("-" * 80)
for v in vulns:
leaders = ", ".join(
f"{p}: {FEATURE_LABELS.get(s, 'N/A')}"
for p, s in v["leading_competitors"].items()
)
lines.append(
f" - {v['feature']} [{v['category']}] "
f"- Us: {v['our_label']} vs {leaders} "
f"(gap: -{v['gap']})"
)
lines.append("")
# Win themes
themes = result["win_themes"]
lines.append("-" * 80)
lines.append("WIN THEMES")
lines.append("-" * 80)
for i, theme in enumerate(themes, 1):
lines.append(f" {i}. {theme}")
lines.append("")
lines.append("=" * 80)
return "\n".join(lines)
def main() -> None:
"""Main entry point for the Competitive Matrix Builder."""
parser = argparse.ArgumentParser(
description="Build competitive feature comparison matrices and positioning analysis.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=(
"Feature Scoring:\n"
" Full (3) - Complete feature support\n"
" Partial (2) - Partial support\n"
" Limited (1) - Minimal or basic support\n"
" None (0) - Feature not available\n"
"\n"
"Example:\n"
" python competitive_matrix_builder.py competitive_data.json --format json\n"
),
)
parser.add_argument(
"input_file",
help="Path to JSON file containing competitive data",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
dest="output_format",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
data = load_competitive_data(args.input_file)
result = analyze_competitive(data)
if args.output_format == "json":
print(json.dumps(result, indent=2))
else:
print(format_text(result))
if __name__ == "__main__":
main()
FILE:sales-engineer/scripts/poc_planner.py
#!/usr/bin/env python3
"""POC Planner - Plan proof-of-concept engagements with timeline, resources, and scorecards.
Generates structured POC plans including phased timelines, resource allocation,
success criteria with measurable metrics, evaluation scorecards, risk identification,
and go/no-go recommendation frameworks.
Usage:
python poc_planner.py poc_data.json
python poc_planner.py poc_data.json --format json
python poc_planner.py poc_data.json --format text
"""
import argparse
import json
import sys
from typing import Any
# Default phase definitions
DEFAULT_PHASES = [
{
"name": "Setup",
"duration_weeks": 1,
"description": "Environment provisioning, data migration, initial configuration",
"activities": [
"Provision POC environment",
"Configure authentication and access",
"Migrate sample data sets",
"Set up monitoring and logging",
"Conduct kickoff meeting with stakeholders",
],
},
{
"name": "Core Testing",
"duration_weeks": 2,
"description": "Primary use case validation and integration testing",
"activities": [
"Execute primary use case scenarios",
"Test core integrations",
"Validate data flow and transformations",
"Conduct mid-point review with stakeholders",
"Document findings and adjust test plan",
],
},
{
"name": "Advanced Testing",
"duration_weeks": 1,
"description": "Edge cases, performance testing, and security validation",
"activities": [
"Execute edge case scenarios",
"Run performance and load tests",
"Validate security controls and compliance",
"Test disaster recovery and failover",
"Test administrative workflows",
],
},
{
"name": "Evaluation",
"duration_weeks": 1,
"description": "Scorecard completion, stakeholder review, and go/no-go decision",
"activities": [
"Complete evaluation scorecard",
"Compile POC results documentation",
"Conduct final stakeholder review",
"Present go/no-go recommendation",
"Gather lessons learned",
],
},
]
# Evaluation categories with default weights
DEFAULT_EVAL_CATEGORIES = {
"Functionality": {
"weight": 0.30,
"criteria": [
"Core feature completeness",
"Use case coverage",
"Customization flexibility",
"Workflow automation",
],
},
"Performance": {
"weight": 0.20,
"criteria": [
"Response time under load",
"Throughput capacity",
"Scalability characteristics",
"Resource utilization",
],
},
"Integration": {
"weight": 0.20,
"criteria": [
"API completeness and documentation",
"Data migration ease",
"Third-party connector availability",
"Authentication/SSO integration",
],
},
"Usability": {
"weight": 0.15,
"criteria": [
"User interface intuitiveness",
"Learning curve assessment",
"Documentation quality",
"Admin console functionality",
],
},
"Support": {
"weight": 0.15,
"criteria": [
"Technical support responsiveness",
"Knowledge base quality",
"Training resources availability",
"Community and ecosystem",
],
},
}
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def load_poc_data(filepath: str) -> dict[str, Any]:
"""Load and validate POC data from a JSON file.
Args:
filepath: Path to the JSON file containing POC data.
Returns:
Parsed POC data dictionary.
Raises:
SystemExit: If the file cannot be read or parsed.
"""
try:
with open(filepath, "r", encoding="utf-8") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {filepath}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {filepath}: {e}", file=sys.stderr)
sys.exit(1)
if "poc_name" not in data:
print("Error: JSON must contain 'poc_name' field.", file=sys.stderr)
sys.exit(1)
return data
def estimate_resources(data: dict[str, Any], phases: list[dict[str, Any]]) -> dict[str, Any]:
"""Estimate resource requirements for the POC.
Args:
data: POC data with scope and requirements.
phases: List of phase definitions.
Returns:
Resource allocation dictionary.
"""
total_weeks = sum(p["duration_weeks"] for p in phases)
complexity = data.get("complexity", "medium").lower()
scope_items = data.get("scope_items", [])
num_integrations = data.get("num_integrations", 0)
# Base SE hours per week by complexity
se_hours_per_week = {"low": 15, "medium": 25, "high": 35}.get(complexity, 25)
# Engineering support hours
eng_base = {"low": 5, "medium": 10, "high": 20}.get(complexity, 10)
eng_integration_hours = num_integrations * 8
# Customer resource hours
customer_hours_per_week = {"low": 5, "medium": 8, "high": 12}.get(complexity, 8)
se_total = se_hours_per_week * total_weeks
eng_total = (eng_base * total_weeks) + eng_integration_hours
customer_total = customer_hours_per_week * total_weeks
# Phase-level breakdown
phase_resources = []
for phase in phases:
weeks = phase["duration_weeks"]
# Setup phase has higher SE and eng effort
se_multiplier = 1.3 if phase["name"] == "Setup" else (
1.0 if phase["name"] in ("Core Testing", "Advanced Testing") else 0.7
)
eng_multiplier = 1.5 if phase["name"] == "Setup" else (
1.0 if phase["name"] == "Core Testing" else (
1.2 if phase["name"] == "Advanced Testing" else 0.5
)
)
phase_resources.append({
"phase": phase["name"],
"duration_weeks": weeks,
"se_hours": round(se_hours_per_week * weeks * se_multiplier),
"engineering_hours": round(eng_base * weeks * eng_multiplier),
"customer_hours": round(customer_hours_per_week * weeks),
})
return {
"total_duration_weeks": total_weeks,
"complexity": complexity,
"totals": {
"se_hours": se_total,
"engineering_hours": eng_total,
"customer_hours": customer_total,
"total_hours": se_total + eng_total + customer_total,
},
"phase_breakdown": phase_resources,
"additional_resources": {
"integration_hours": eng_integration_hours,
"num_integrations": num_integrations,
},
}
def generate_success_criteria(data: dict[str, Any]) -> list[dict[str, Any]]:
"""Generate success criteria based on POC scope and requirements.
Args:
data: POC data with scope and requirements.
Returns:
List of success criteria with metrics.
"""
criteria = []
# Custom criteria from input
custom_criteria = data.get("success_criteria", [])
for cc in custom_criteria:
criteria.append({
"criterion": cc.get("criterion", "Unnamed criterion"),
"metric": cc.get("metric", "Pass/Fail"),
"target": cc.get("target", "Met"),
"category": cc.get("category", "Functionality"),
"priority": cc.get("priority", "must-have"),
})
# Auto-generated criteria based on scope
scope_items = data.get("scope_items", [])
for item in scope_items:
if isinstance(item, str):
criteria.append({
"criterion": f"Validate: {item}",
"metric": "Pass/Fail",
"target": "Pass",
"category": "Functionality",
"priority": "must-have",
})
elif isinstance(item, dict):
criteria.append({
"criterion": item.get("name", "Unnamed scope item"),
"metric": item.get("metric", "Pass/Fail"),
"target": item.get("target", "Pass"),
"category": item.get("category", "Functionality"),
"priority": item.get("priority", "must-have"),
})
# Default criteria if none provided
if not criteria:
criteria = [
{
"criterion": "Core use case validation",
"metric": "Percentage of use cases successfully demonstrated",
"target": ">90%",
"category": "Functionality",
"priority": "must-have",
},
{
"criterion": "Performance under expected load",
"metric": "Response time at target concurrency",
"target": "<2 seconds p95",
"category": "Performance",
"priority": "must-have",
},
{
"criterion": "Integration with existing systems",
"metric": "Number of integrations successfully tested",
"target": "All planned integrations",
"category": "Integration",
"priority": "must-have",
},
{
"criterion": "User acceptance",
"metric": "Stakeholder satisfaction score",
"target": ">4.0/5.0",
"category": "Usability",
"priority": "should-have",
},
]
return criteria
def generate_evaluation_scorecard(data: dict[str, Any]) -> dict[str, Any]:
"""Generate the POC evaluation scorecard template.
Args:
data: POC data.
Returns:
Evaluation scorecard structure.
"""
custom_categories = data.get("evaluation_categories", {})
# Merge custom categories with defaults
categories = {}
for cat_name, cat_data in DEFAULT_EVAL_CATEGORIES.items():
if cat_name in custom_categories:
custom = custom_categories[cat_name]
categories[cat_name] = {
"weight": custom.get("weight", cat_data["weight"]),
"criteria": custom.get("criteria", cat_data["criteria"]),
"score": None,
"notes": "",
}
else:
categories[cat_name] = {
"weight": cat_data["weight"],
"criteria": cat_data["criteria"],
"score": None,
"notes": "",
}
# Normalize weights to sum to 1.0
total_weight = sum(c["weight"] for c in categories.values())
if total_weight > 0 and abs(total_weight - 1.0) > 0.01:
for cat in categories.values():
cat["weight"] = round(safe_divide(cat["weight"], total_weight), 2)
return {
"scoring_scale": {
"5": "Exceeds requirements - superior capability",
"4": "Meets requirements - full capability",
"3": "Partially meets - acceptable with minor gaps",
"2": "Below expectations - significant gaps",
"1": "Does not meet - critical gaps",
},
"categories": categories,
"pass_threshold": 3.5,
"strong_pass_threshold": 4.0,
}
def identify_risks(data: dict[str, Any], resources: dict[str, Any]) -> list[dict[str, Any]]:
"""Identify POC risks and generate mitigation strategies.
Args:
data: POC data.
resources: Resource allocation data.
Returns:
List of risk entries with probability, impact, and mitigation.
"""
risks = []
complexity = data.get("complexity", "medium").lower()
num_integrations = data.get("num_integrations", 0)
total_weeks = resources["total_duration_weeks"]
stakeholders = data.get("stakeholders", [])
# Timeline risk
if total_weeks > 6:
risks.append({
"risk": "Extended timeline may lose stakeholder attention",
"probability": "high",
"impact": "high",
"mitigation": "Schedule weekly progress checkpoints; deliver early wins in week 2",
"category": "Timeline",
})
elif total_weeks >= 4:
risks.append({
"risk": "Timeline may slip due to unforeseen technical issues",
"probability": "medium",
"impact": "medium",
"mitigation": "Build 20% buffer into each phase; identify critical path early",
"category": "Timeline",
})
# Integration risks
if num_integrations > 3:
risks.append({
"risk": "Multiple integrations increase complexity and failure points",
"probability": "high",
"impact": "high",
"mitigation": "Prioritize integrations by business value; test incrementally; have fallback demo data",
"category": "Technical",
})
elif num_integrations > 0:
risks.append({
"risk": "Integration dependencies may cause delays",
"probability": "medium",
"impact": "medium",
"mitigation": "Engage customer IT early; confirm API access and credentials in setup phase",
"category": "Technical",
})
# Data risks
risks.append({
"risk": "Customer data quality or availability issues",
"probability": "medium",
"impact": "high",
"mitigation": "Request sample data early; prepare synthetic data as fallback; validate data format in setup",
"category": "Data",
})
# Stakeholder risks
if len(stakeholders) > 5:
risks.append({
"risk": "Too many stakeholders may slow decision-making",
"probability": "medium",
"impact": "medium",
"mitigation": "Identify decision-maker and champion; schedule focused reviews per stakeholder group",
"category": "Stakeholder",
})
if not stakeholders:
risks.append({
"risk": "Undefined stakeholder map may lead to misaligned evaluation",
"probability": "high",
"impact": "high",
"mitigation": "Confirm stakeholder list, roles, and evaluation criteria before setup phase",
"category": "Stakeholder",
})
# Resource risks
if complexity == "high":
risks.append({
"risk": "High complexity may require additional engineering resources",
"probability": "medium",
"impact": "high",
"mitigation": "Secure engineering commitment upfront; identify escalation path for blockers",
"category": "Resource",
})
# Competitive risk
risks.append({
"risk": "Competitor POC running in parallel may shift evaluation criteria",
"probability": "medium",
"impact": "medium",
"mitigation": "Stay close to champion; align success criteria early; differentiate on unique strengths",
"category": "Competitive",
})
return risks
def generate_go_no_go_framework(data: dict[str, Any]) -> dict[str, Any]:
"""Generate the go/no-go decision framework.
Args:
data: POC data.
Returns:
Go/no-go framework with criteria and thresholds.
"""
return {
"decision_criteria": [
{
"criterion": "Overall scorecard score",
"go_threshold": ">=3.5 weighted average",
"no_go_threshold": "<3.0 weighted average",
"conditional_range": "3.0 to <3.5 weighted average",
},
{
"criterion": "Must-have success criteria met",
"go_threshold": "100% of must-have criteria pass",
"no_go_threshold": "<80% of must-have criteria pass",
"conditional_range": "80-99% with mitigation plan",
},
{
"criterion": "Stakeholder satisfaction",
"go_threshold": "Champion and decision-maker both positive",
"no_go_threshold": "Decision-maker negative",
"conditional_range": "Mixed signals - needs follow-up",
},
{
"criterion": "Technical blockers",
"go_threshold": "No unresolved critical blockers",
"no_go_threshold": ">2 unresolved critical blockers",
"conditional_range": "1-2 blockers with clear resolution path",
},
],
"recommendation_logic": {
"GO": "All criteria meet go thresholds, or majority go with no no-go triggers",
"CONDITIONAL_GO": "Some criteria in conditional range, but no no-go triggers and clear resolution plan",
"NO_GO": "Any criterion triggers no-go threshold without clear mitigation",
},
}
def plan_poc(data: dict[str, Any]) -> dict[str, Any]:
"""Run the complete POC planning pipeline.
Args:
data: Parsed POC data dictionary.
Returns:
Complete POC plan dictionary.
"""
poc_info = {
"poc_name": data.get("poc_name", "Unnamed POC"),
"customer": data.get("customer", "Unknown Customer"),
"opportunity_value": data.get("opportunity_value", "Not specified"),
"complexity": data.get("complexity", "medium"),
"start_date": data.get("start_date", "TBD"),
"champion": data.get("champion", "Not identified"),
"decision_maker": data.get("decision_maker", "Not identified"),
}
# Use custom phases if provided, otherwise defaults
phases = data.get("phases", DEFAULT_PHASES)
# Resource estimation
resources = estimate_resources(data, phases)
# Success criteria
success_criteria = generate_success_criteria(data)
# Evaluation scorecard
scorecard = generate_evaluation_scorecard(data)
# Risk identification
risks = identify_risks(data, resources)
# Go/No-Go framework
go_no_go = generate_go_no_go_framework(data)
# Timeline with phase details
timeline = []
current_week = 1
for phase in phases:
end_week = current_week + phase["duration_weeks"] - 1
timeline.append({
"phase": phase["name"],
"start_week": current_week,
"end_week": end_week,
"duration_weeks": phase["duration_weeks"],
"description": phase["description"],
"activities": phase["activities"],
})
current_week = end_week + 1
# Stakeholder plan
stakeholders = data.get("stakeholders", [])
stakeholder_plan = []
for s in stakeholders:
if isinstance(s, str):
stakeholder_plan.append({
"name": s,
"role": "Evaluator",
"engagement": "Weekly updates, phase reviews",
})
elif isinstance(s, dict):
stakeholder_plan.append({
"name": s.get("name", "Unknown"),
"role": s.get("role", "Evaluator"),
"engagement": s.get("engagement", "Weekly updates, phase reviews"),
})
return {
"poc_info": poc_info,
"timeline": timeline,
"resource_allocation": resources,
"success_criteria": success_criteria,
"evaluation_scorecard": scorecard,
"risk_register": risks,
"go_no_go_framework": go_no_go,
"stakeholder_plan": stakeholder_plan,
}
def format_text(result: dict[str, Any]) -> str:
"""Format POC plan as human-readable text.
Args:
result: Complete POC plan dictionary.
Returns:
Formatted text string.
"""
lines = []
info = result["poc_info"]
lines.append("=" * 70)
lines.append("PROOF OF CONCEPT PLAN")
lines.append("=" * 70)
lines.append(f"POC Name: {info['poc_name']}")
lines.append(f"Customer: {info['customer']}")
lines.append(f"Opportunity Value: {info['opportunity_value']}")
lines.append(f"Complexity: {info['complexity'].upper()}")
lines.append(f"Start Date: {info['start_date']}")
lines.append(f"Champion: {info['champion']}")
lines.append(f"Decision Maker: {info['decision_maker']}")
lines.append("")
# Timeline
lines.append("-" * 70)
lines.append("TIMELINE")
lines.append("-" * 70)
for phase in result["timeline"]:
week_range = (
f"Week {phase['start_week']}"
if phase["start_week"] == phase["end_week"]
else f"Weeks {phase['start_week']}-{phase['end_week']}"
)
lines.append(f"\n Phase: {phase['phase']} ({week_range})")
lines.append(f" {phase['description']}")
lines.append(" Activities:")
for activity in phase["activities"]:
lines.append(f" - {activity}")
lines.append("")
# Resource allocation
res = result["resource_allocation"]
lines.append("-" * 70)
lines.append("RESOURCE ALLOCATION")
lines.append("-" * 70)
lines.append(f"Total Duration: {res['total_duration_weeks']} weeks")
lines.append(f"Complexity: {res['complexity'].upper()}")
lines.append("")
lines.append(" Totals:")
lines.append(f" SE Hours: {res['totals']['se_hours']}")
lines.append(f" Engineering Hours: {res['totals']['engineering_hours']}")
lines.append(f" Customer Hours: {res['totals']['customer_hours']}")
lines.append(f" Total Hours: {res['totals']['total_hours']}")
lines.append("")
lines.append(" Phase Breakdown:")
lines.append(f" {'Phase':<20} {'Weeks':>5} {'SE':>6} {'Eng':>6} {'Cust':>6}")
lines.append(" " + "-" * 45)
for pr in res["phase_breakdown"]:
lines.append(
f" {pr['phase']:<20} {pr['duration_weeks']:>5} "
f"{pr['se_hours']:>5}h {pr['engineering_hours']:>5}h {pr['customer_hours']:>5}h"
)
lines.append("")
# Success criteria
criteria = result["success_criteria"]
lines.append("-" * 70)
lines.append("SUCCESS CRITERIA")
lines.append("-" * 70)
for i, sc in enumerate(criteria, 1):
priority_marker = "[MUST]" if sc["priority"] == "must-have" else (
"[SHOULD]" if sc["priority"] == "should-have" else "[NICE]"
)
lines.append(f" {i}. {priority_marker} {sc['criterion']}")
lines.append(f" Metric: {sc['metric']}")
lines.append(f" Target: {sc['target']}")
lines.append(f" Category: {sc['category']}")
lines.append("")
# Evaluation scorecard
scorecard = result["evaluation_scorecard"]
lines.append("-" * 70)
lines.append("EVALUATION SCORECARD")
lines.append("-" * 70)
lines.append(f" Pass Threshold: {scorecard['pass_threshold']}/5.0")
lines.append(f" Strong Pass Threshold: {scorecard['strong_pass_threshold']}/5.0")
lines.append("")
lines.append(" Scoring Scale:")
for score, desc in scorecard["scoring_scale"].items():
lines.append(f" {score} = {desc}")
lines.append("")
lines.append(" Categories:")
for cat_name, cat_data in scorecard["categories"].items():
lines.append(f"\n {cat_name} (weight: {cat_data['weight']:.0%})")
for criterion in cat_data["criteria"]:
lines.append(f" [ ] {criterion}")
lines.append("")
# Risk register
risks = result["risk_register"]
lines.append("-" * 70)
lines.append("RISK REGISTER")
lines.append("-" * 70)
for risk in risks:
lines.append(f" [{risk['impact'].upper()}] {risk['risk']}")
lines.append(f" Probability: {risk['probability']} | Impact: {risk['impact']}")
lines.append(f" Category: {risk['category']}")
lines.append(f" Mitigation: {risk['mitigation']}")
lines.append("")
# Go/No-Go framework
framework = result["go_no_go_framework"]
lines.append("-" * 70)
lines.append("GO / NO-GO DECISION FRAMEWORK")
lines.append("-" * 70)
for dc in framework["decision_criteria"]:
lines.append(f" {dc['criterion']}:")
lines.append(f" GO: {dc['go_threshold']}")
lines.append(f" CONDITIONAL: {dc['conditional_range']}")
lines.append(f" NO-GO: {dc['no_go_threshold']}")
lines.append("")
lines.append(" Recommendation Logic:")
for decision, logic in framework["recommendation_logic"].items():
lines.append(f" {decision}: {logic}")
lines.append("")
# Stakeholder plan
stakeholders = result["stakeholder_plan"]
if stakeholders:
lines.append("-" * 70)
lines.append("STAKEHOLDER PLAN")
lines.append("-" * 70)
for s in stakeholders:
lines.append(f" {s['name']} ({s['role']})")
lines.append(f" Engagement: {s['engagement']}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point for the POC Planner."""
parser = argparse.ArgumentParser(
description="Plan proof-of-concept engagements with timeline, resources, and evaluation scorecards.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=(
"Default Phases:\n"
" Week 1: Setup - Environment provisioning, configuration\n"
" Weeks 2-3: Core Testing - Primary use cases, integrations\n"
" Week 4: Advanced Testing - Edge cases, performance, security\n"
" Week 5: Evaluation - Scorecard, stakeholder review, go/no-go\n"
"\n"
"Example:\n"
" python poc_planner.py poc_data.json --format json\n"
),
)
parser.add_argument(
"input_file",
help="Path to JSON file containing POC scope and requirements",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
dest="output_format",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
data = load_poc_data(args.input_file)
result = plan_poc(data)
if args.output_format == "json":
print(json.dumps(result, indent=2))
else:
print(format_text(result))
if __name__ == "__main__":
main()
FILE:sales-engineer/scripts/rfp_response_analyzer.py
#!/usr/bin/env python3
"""RFP/RFI Response Analyzer - Score coverage, identify gaps, and recommend bid/no-bid.
Parses RFP/RFI requirements and scores coverage using Full/Partial/Planned/Gap
categories. Generates weighted coverage scores, gap analysis with mitigation
strategies, effort estimation, and bid/no-bid recommendations.
Usage:
python rfp_response_analyzer.py rfp_data.json
python rfp_response_analyzer.py rfp_data.json --format json
python rfp_response_analyzer.py rfp_data.json --format text
"""
import argparse
import json
import sys
from typing import Any
# Coverage status to score mapping
COVERAGE_SCORES: dict[str, float] = {
"full": 1.0,
"partial": 0.5,
"planned": 0.25,
"gap": 0.0,
}
# Priority to weight mapping
PRIORITY_WEIGHTS: dict[str, float] = {
"must-have": 3.0,
"should-have": 2.0,
"nice-to-have": 1.0,
}
# Bid thresholds
BID_THRESHOLD = 0.70
CONDITIONAL_THRESHOLD = 0.50
MAX_MUST_HAVE_GAPS_FOR_BID = 3
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0:
return default
return numerator / denominator
def load_rfp_data(filepath: str) -> dict[str, Any]:
"""Load and validate RFP data from a JSON file.
Args:
filepath: Path to the JSON file containing RFP data.
Returns:
Parsed RFP data dictionary.
Raises:
SystemExit: If the file cannot be read or parsed.
"""
try:
with open(filepath, "r", encoding="utf-8") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File not found: {filepath}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {filepath}: {e}", file=sys.stderr)
sys.exit(1)
if "requirements" not in data:
print("Error: JSON must contain a 'requirements' array.", file=sys.stderr)
sys.exit(1)
return data
def analyze_requirement(req: dict[str, Any]) -> dict[str, Any]:
    """Analyze a single requirement and compute its score.

    Args:
        req: Requirement dictionary with category, priority, coverage_status, etc.

    Returns:
        Enriched requirement with computed score and weight.
    """
    coverage_status = str(req.get("coverage_status", "gap")).strip().lower()
    priority = str(req.get("priority", "nice-to-have")).strip().lower()
    # Unrecognized statuses score 0.0, so record them as gaps. Otherwise they
    # would be skipped by the per-status counts and by the gap analysis.
    if coverage_status not in COVERAGE_SCORES:
        coverage_status = "gap"
    coverage_score = COVERAGE_SCORES[coverage_status]
    # Unrecognized priorities fall back to the nice-to-have weight.
    weight = PRIORITY_WEIGHTS.get(priority, 1.0)
    weighted_score = coverage_score * weight
    max_weighted = weight
    effort_hours = req.get("effort_hours", 0)
    result = {
        "id": req.get("id", "unknown"),
        "requirement": req.get("requirement", "Unnamed requirement"),
        "category": req.get("category", "Uncategorized"),
        "priority": priority,
        "coverage_status": coverage_status,
        "coverage_score": coverage_score,
        "weight": weight,
        "weighted_score": weighted_score,
        "max_weighted": max_weighted,
        "effort_hours": effort_hours,
        "notes": req.get("notes", ""),
        "mitigation": req.get("mitigation", ""),
    }
    return result
def generate_gap_analysis(analyzed_reqs: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Generate gap analysis for requirements not fully covered.
Args:
analyzed_reqs: List of analyzed requirement dictionaries.
Returns:
List of gap entries with mitigation strategies.
"""
gaps = []
for req in analyzed_reqs:
if req["coverage_status"] in ("gap", "partial", "planned"):
severity = "critical" if req["priority"] == "must-have" else (
"high" if req["priority"] == "should-have" else "low"
)
mitigation = req["mitigation"]
if not mitigation:
if req["coverage_status"] == "partial":
mitigation = "Enhance existing capability to achieve full coverage"
elif req["coverage_status"] == "planned":
mitigation = "Communicate roadmap timeline and interim workaround"
else:
mitigation = "Evaluate build vs. partner vs. no-bid for this requirement"
gaps.append({
"id": req["id"],
"requirement": req["requirement"],
"category": req["category"],
"priority": req["priority"],
"coverage_status": req["coverage_status"],
"severity": severity,
"effort_hours": req["effort_hours"],
"mitigation": mitigation,
})
# Sort by severity: critical > high > low
severity_order = {"critical": 0, "high": 1, "low": 2}
gaps.sort(key=lambda g: severity_order.get(g["severity"], 3))
return gaps
def compute_category_scores(analyzed_reqs: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
"""Compute coverage scores grouped by requirement category.
Args:
analyzed_reqs: List of analyzed requirement dictionaries.
Returns:
Dictionary of category names to score summaries.
"""
categories: dict[str, dict[str, float]] = {}
for req in analyzed_reqs:
cat = req["category"]
if cat not in categories:
categories[cat] = {
"weighted_score": 0.0,
"max_weighted": 0.0,
"count": 0,
"full_count": 0,
"partial_count": 0,
"planned_count": 0,
"gap_count": 0,
"effort_hours": 0,
}
categories[cat]["weighted_score"] += req["weighted_score"]
categories[cat]["max_weighted"] += req["max_weighted"]
categories[cat]["count"] += 1
categories[cat]["effort_hours"] += req["effort_hours"]
status_key = f"{req['coverage_status']}_count"
if status_key in categories[cat]:
categories[cat][status_key] += 1
result = {}
for cat, scores in categories.items():
coverage_pct = safe_divide(scores["weighted_score"], scores["max_weighted"]) * 100
result[cat] = {
"coverage_percentage": round(coverage_pct, 1),
"requirements_count": int(scores["count"]),
"full": int(scores["full_count"]),
"partial": int(scores["partial_count"]),
"planned": int(scores["planned_count"]),
"gap": int(scores["gap_count"]),
"effort_hours": int(scores["effort_hours"]),
}
return result
def determine_bid_recommendation(
    overall_coverage: float,
    must_have_gaps: int,
    strategic_value: str,
) -> dict[str, Any]:
    """Determine bid/no-bid recommendation based on coverage and gaps.

    Args:
        overall_coverage: Overall weighted coverage percentage (0-100).
        must_have_gaps: Number of must-have requirements with gap status.
        strategic_value: Strategic value assessment (high, medium, low).

    Returns:
        Recommendation dictionary with decision and rationale.
    """
    coverage_ratio = overall_coverage / 100.0
    reasons = []
    # Primary decision logic
    if coverage_ratio >= BID_THRESHOLD and must_have_gaps <= MAX_MUST_HAVE_GAPS_FOR_BID:
        decision = "BID"
        reasons.append(f"Coverage score {overall_coverage:.1f}% meets or exceeds {BID_THRESHOLD*100:.0f}% threshold")
        if must_have_gaps > 0:
            reasons.append(f"{must_have_gaps} must-have gap(s) within acceptable range (max {MAX_MUST_HAVE_GAPS_FOR_BID})")
    elif coverage_ratio >= CONDITIONAL_THRESHOLD or (
        must_have_gaps <= MAX_MUST_HAVE_GAPS_FOR_BID and coverage_ratio >= 0.4
    ):
        decision = "CONDITIONAL BID"
        # This branch covers three cases, so state the one that actually applies:
        # high coverage held back by must-have gaps, coverage in the conditional
        # range, or 40-50% coverage with must-have gaps within the limit.
        if coverage_ratio >= BID_THRESHOLD:
            reasons.append(f"Coverage score {overall_coverage:.1f}% meets the {BID_THRESHOLD*100:.0f}% threshold, but must-have gaps exceed maximum of {MAX_MUST_HAVE_GAPS_FOR_BID}")
        elif coverage_ratio >= CONDITIONAL_THRESHOLD:
            reasons.append(f"Coverage score {overall_coverage:.1f}% in conditional range ({CONDITIONAL_THRESHOLD*100:.0f}%-{BID_THRESHOLD*100:.0f}%)")
        else:
            reasons.append(f"Coverage score {overall_coverage:.1f}% is below {CONDITIONAL_THRESHOLD*100:.0f}% but at least 40%, with must-have gaps within acceptable range")
        if must_have_gaps > 0:
            reasons.append(f"{must_have_gaps} must-have gap(s) require mitigation plan")
    else:
        decision = "NO-BID"
        if coverage_ratio < CONDITIONAL_THRESHOLD:
            reasons.append(f"Coverage score {overall_coverage:.1f}% below {CONDITIONAL_THRESHOLD*100:.0f}% minimum")
        if must_have_gaps > MAX_MUST_HAVE_GAPS_FOR_BID:
            reasons.append(f"{must_have_gaps} must-have gaps exceed maximum of {MAX_MUST_HAVE_GAPS_FOR_BID}")
    # Strategic value adjustment
    if strategic_value.lower() == "high" and decision == "CONDITIONAL BID":
        reasons.append("High strategic value supports pursuing despite coverage gaps")
    elif strategic_value.lower() == "low" and decision == "CONDITIONAL BID":
        decision = "NO-BID"
        reasons.append("Low strategic value does not justify investment for conditional coverage")
    confidence = "high" if coverage_ratio >= 0.80 else (
        "medium" if coverage_ratio >= 0.60 else "low"
    )
    return {
        "decision": decision,
        "confidence": confidence,
        "overall_coverage_percentage": round(overall_coverage, 1),
        "must_have_gaps": must_have_gaps,
        "strategic_value": strategic_value,
        "reasons": reasons,
    }
def generate_risk_assessment(
analyzed_reqs: list[dict[str, Any]],
gaps: list[dict[str, Any]],
) -> list[dict[str, str]]:
"""Generate risk assessment based on gaps and coverage patterns.
Args:
analyzed_reqs: List of analyzed requirement dictionaries.
gaps: List of gap analysis entries.
Returns:
List of risk entries with impact and mitigation.
"""
risks = []
critical_gaps = [g for g in gaps if g["severity"] == "critical"]
if critical_gaps:
risks.append({
"risk": "Critical requirement gaps",
"impact": "high",
"description": f"{len(critical_gaps)} must-have requirements not fully met",
"mitigation": "Prioritize engineering effort or partner integration for gap closure",
})
total_effort = sum(r["effort_hours"] for r in analyzed_reqs if r["coverage_status"] != "full")
if total_effort > 200:
risks.append({
"risk": "High customization effort",
"impact": "high",
"description": f"{total_effort} hours estimated for non-full requirements",
"mitigation": "Evaluate resource availability and timeline feasibility before committing",
})
elif total_effort > 80:
risks.append({
"risk": "Moderate customization effort",
"impact": "medium",
"description": f"{total_effort} hours estimated for non-full requirements",
"mitigation": "Phase implementation and set clear expectations on delivery timeline",
})
planned_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "planned")
if planned_count > 3:
risks.append({
"risk": "Roadmap dependency",
"impact": "medium",
"description": f"{planned_count} requirements depend on planned product features",
"mitigation": "Confirm roadmap timelines with product team; include contractual commitments if needed",
})
partial_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "partial")
if partial_count > 5:
risks.append({
"risk": "Workaround complexity",
"impact": "medium",
"description": f"{partial_count} requirements need workarounds or configuration",
"mitigation": "Document workarounds clearly; plan for native support in future releases",
})
if not risks:
risks.append({
"risk": "No significant risks identified",
"impact": "low",
"description": "Strong coverage across all requirement categories",
"mitigation": "Maintain standard engagement process",
})
return risks
def analyze_rfp(data: dict[str, Any]) -> dict[str, Any]:
"""Run the complete RFP analysis pipeline.
Args:
data: Parsed RFP data with requirements array.
Returns:
Complete analysis results dictionary.
"""
rfp_info = {
"rfp_name": data.get("rfp_name", "Unnamed RFP"),
"customer": data.get("customer", "Unknown Customer"),
"due_date": data.get("due_date", "Not specified"),
"strategic_value": data.get("strategic_value", "medium"),
"deal_value": data.get("deal_value", "Not specified"),
}
# Analyze each requirement
analyzed_reqs = [analyze_requirement(req) for req in data["requirements"]]
# Compute overall scores
total_weighted = sum(r["weighted_score"] for r in analyzed_reqs)
total_max = sum(r["max_weighted"] for r in analyzed_reqs)
overall_coverage = safe_divide(total_weighted, total_max) * 100
# Coverage summary
total_count = len(analyzed_reqs)
full_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "full")
partial_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "partial")
planned_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "planned")
gap_count = sum(1 for r in analyzed_reqs if r["coverage_status"] == "gap")
# Must-have gap count
must_have_gaps = sum(
1 for r in analyzed_reqs
if r["priority"] == "must-have" and r["coverage_status"] == "gap"
)
# Category breakdown
category_scores = compute_category_scores(analyzed_reqs)
# Gap analysis
gaps = generate_gap_analysis(analyzed_reqs)
# Bid recommendation
bid_recommendation = determine_bid_recommendation(
overall_coverage,
must_have_gaps,
rfp_info["strategic_value"],
)
# Risk assessment
risks = generate_risk_assessment(analyzed_reqs, gaps)
# Effort summary
total_effort = sum(r["effort_hours"] for r in analyzed_reqs)
gap_effort = sum(r["effort_hours"] for r in analyzed_reqs if r["coverage_status"] != "full")
return {
"rfp_info": rfp_info,
"coverage_summary": {
"overall_coverage_percentage": round(overall_coverage, 1),
"total_requirements": total_count,
"full": full_count,
"partial": partial_count,
"planned": planned_count,
"gap": gap_count,
"must_have_gaps": must_have_gaps,
},
"category_scores": category_scores,
"bid_recommendation": bid_recommendation,
"gap_analysis": gaps,
"risk_assessment": risks,
"effort_estimate": {
"total_hours": total_effort,
"gap_closure_hours": gap_effort,
"full_coverage_hours": total_effort - gap_effort,
},
"requirements_detail": analyzed_reqs,
}
def format_text(result: dict[str, Any]) -> str:
"""Format analysis results as human-readable text.
Args:
result: Complete analysis results dictionary.
Returns:
Formatted text string.
"""
lines = []
info = result["rfp_info"]
lines.append("=" * 70)
lines.append("RFP RESPONSE ANALYSIS")
lines.append("=" * 70)
lines.append(f"RFP: {info['rfp_name']}")
lines.append(f"Customer: {info['customer']}")
lines.append(f"Due Date: {info['due_date']}")
lines.append(f"Deal Value: {info['deal_value']}")
lines.append(f"Strategic Value: {info['strategic_value'].upper()}")
lines.append("")
# Coverage summary
cs = result["coverage_summary"]
lines.append("-" * 70)
lines.append("COVERAGE SUMMARY")
lines.append("-" * 70)
lines.append(f"Overall Coverage: {cs['overall_coverage_percentage']}%")
lines.append(f"Total Requirements: {cs['total_requirements']}")
lines.append(f" Full: {cs['full']} | Partial: {cs['partial']} | Planned: {cs['planned']} | Gap: {cs['gap']}")
lines.append(f"Must-Have Gaps: {cs['must_have_gaps']}")
lines.append("")
# Bid recommendation
bid = result["bid_recommendation"]
lines.append("-" * 70)
lines.append(f"BID RECOMMENDATION: {bid['decision']}")
lines.append(f"Confidence: {bid['confidence'].upper()}")
lines.append("-" * 70)
for reason in bid["reasons"]:
lines.append(f" - {reason}")
lines.append("")
# Category scores
lines.append("-" * 70)
lines.append("CATEGORY BREAKDOWN")
lines.append("-" * 70)
lines.append(f"{'Category':<25} {'Coverage':>8} {'Full':>5} {'Part':>5} {'Plan':>5} {'Gap':>5} {'Effort':>7}")
lines.append("-" * 70)
for cat, scores in result["category_scores"].items():
lines.append(
f"{cat:<25} {scores['coverage_percentage']:>7.1f}% "
f"{scores['full']:>5} {scores['partial']:>5} "
f"{scores['planned']:>5} {scores['gap']:>5} "
f"{scores['effort_hours']:>6}h"
)
lines.append("")
# Gap analysis
gaps = result["gap_analysis"]
if gaps:
lines.append("-" * 70)
lines.append("GAP ANALYSIS")
lines.append("-" * 70)
for gap in gaps:
severity_marker = "!!!" if gap["severity"] == "critical" else (
"!!" if gap["severity"] == "high" else "!"
)
lines.append(f" [{severity_marker}] {gap['id']}: {gap['requirement']}")
lines.append(f" Category: {gap['category']} | Priority: {gap['priority']} | Status: {gap['coverage_status']}")
lines.append(f" Effort: {gap['effort_hours']}h | Mitigation: {gap['mitigation']}")
lines.append("")
# Risk assessment
risks = result["risk_assessment"]
lines.append("-" * 70)
lines.append("RISK ASSESSMENT")
lines.append("-" * 70)
for risk in risks:
lines.append(f" [{risk['impact'].upper()}] {risk['risk']}")
lines.append(f" {risk['description']}")
lines.append(f" Mitigation: {risk['mitigation']}")
lines.append("")
# Effort estimate
effort = result["effort_estimate"]
lines.append("-" * 70)
lines.append("EFFORT ESTIMATE")
lines.append("-" * 70)
lines.append(f" Total Effort: {effort['total_hours']} hours")
lines.append(f" Gap Closure Effort: {effort['gap_closure_hours']} hours")
lines.append(f" Supported Effort: {effort['full_coverage_hours']} hours")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point for the RFP Response Analyzer."""
parser = argparse.ArgumentParser(
description="Analyze RFP/RFI requirements for coverage, gaps, and bid recommendation.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=(
"Coverage Categories:\n"
" Full (100%) - Requirement fully met\n"
" Partial (50%) - Partially met, workaround needed\n"
" Planned (25%) - On roadmap, not yet available\n"
" Gap (0%) - Not supported\n"
"\n"
"Priority Weights:\n"
" Must-Have (3x) | Should-Have (2x) | Nice-to-Have (1x)\n"
"\n"
"Example:\n"
" python rfp_response_analyzer.py rfp_data.json --format json\n"
),
)
parser.add_argument(
"input_file",
help="Path to JSON file containing RFP requirements data",
)
parser.add_argument(
"--format",
choices=["json", "text"],
default="text",
dest="output_format",
help="Output format: json or text (default: text)",
)
args = parser.parse_args()
data = load_rfp_data(args.input_file)
result = analyze_rfp(data)
if args.output_format == "json":
print(json.dumps(result, indent=2))
else:
print(format_text(result))
if __name__ == "__main__":
main()
---
name: roadmap-communicator
description: Use when preparing roadmap narratives, release notes, changelogs, or stakeholder updates tailored for executives, engineering teams, and customers.
---
# Roadmap Communicator
Create clear roadmap communication artifacts for internal and external stakeholders.
## When To Use
Use this skill for:
- Building roadmap presentations in different formats
- Writing stakeholder updates (board, engineering, customers)
- Producing release notes (user-facing and internal)
- Generating changelogs from git history
- Structuring feature announcements
## Roadmap Formats
1. Now / Next / Later
   - Best for uncertainty and strategic flexibility.
   - Communicate direction without false precision.
2. Timeline roadmap
   - Best for fixed-date commitments and launch coordination.
   - Requires active risk and dependency management.
3. Theme-based roadmap
   - Best for outcome-led planning and cross-team alignment.
   - Groups initiatives by problem space or strategic objective.
See `references/roadmap-templates.md` for templates.
## Stakeholder Update Patterns
### Board / Executive
- Outcome and risk oriented
- Focus on progress against strategic goals
- Highlight trade-offs and required decisions
### Engineering
- Scope, dependencies, and sequencing clarity
- Status, blockers, and resourcing implications
### Customers
- Value narrative and timing window
- What is available now vs upcoming
- Clear expectation setting
See `references/communication-templates.md` for reusable templates.
## Release Notes Guidance
### User-Facing Release Notes
- Lead with user value, not internal implementation details.
- Group by workflows or user jobs.
- Include migration/behavior changes explicitly.
### Internal Release Notes
- Include technical details, operational impact, and known issues.
- Capture rollout plan, rollback criteria, and monitoring notes.
## Changelog Generation
Use:
```bash
python3 scripts/changelog_generator.py --from v1.0.0 --to HEAD
```
Features:
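- Reads a git log range (`--from`/`--to`), piped commit subjects (`--stdin`), or built-in sample data (`--demo`)
- Parses conventional commit prefixes
- Groups entries by type (`feat`, `fix`, `chore`, etc.); unrecognized subjects go under "Other"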
- Outputs markdown or plain text
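
The parsing and grouping steps listed above can be sketched as follows. This is a minimal illustration, assuming commit subjects follow the Conventional Commits shape `type(scope)!: description`. The names `COMMIT_RE`, `KNOWN_TYPES`, and `group_by_type` are hypothetical and are not taken from `scripts/changelog_generator.py`, which may match prefixes differently.

```python
import re
from collections import defaultdict

# Hypothetical pattern: type, optional "(scope)", optional "!" breaking marker, then ": ".
COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]*)\))?(?P<breaking>!)?:\s*(?P<desc>.+)$"
)

# Assumed set of recognized types; adjust to match your team's conventions.
KNOWN_TYPES = {
    "feat", "fix", "docs", "refactor", "test", "chore",
    "perf", "ci", "build", "style", "revert",
}


def group_by_type(subjects):
    """Group commit subjects by type; anything unrecognized goes under 'other'."""
    grouped = defaultdict(list)
    for subject in subjects:
        subject = subject.strip()
        match = COMMIT_RE.match(subject)
        if match and match.group("type") in KNOWN_TYPES:
            # Keep only the description so the type is not repeated in each entry.
            grouped[match.group("type")].append(match.group("desc"))
        else:
            grouped["other"].append(subject)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        "feat(ui): add dark mode",
        "fix!: drop legacy flag",
        "Merge branch 'main'",
    ]
    for commit_type, items in group_by_type(sample).items():
        print(commit_type, items)
```

Subjects that do not match the pattern, such as merge commits, are kept verbatim under `other` rather than dropped, so nothing from the commit range silently disappears from the changelog.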
## Feature Announcement Framework
1. Problem context
2. What changed
3. Why it matters
4. Who benefits most
5. How to get started
6. Call to action and feedback channel
## Communication Quality Checklist
- [ ] Audience-specific framing is explicit.
- [ ] Outcomes and trade-offs are clear.
- [ ] Terminology is consistent across artifacts.
- [ ] Risks and dependencies are not hidden.
- [ ] Next actions and owners are specified.
FILE:references/communication-templates.md
# Communication Templates
## Stakeholder Update Email
Subject: Product roadmap update - [Period]
Hi [Audience],
Here is the [weekly/monthly/quarterly] product update.
- Progress:
- KPI movement:
- Risks/blockers:
- Decisions needed:
- Next period focus:
Thanks,
[Owner]
## User-Facing Release Notes Template
# Release [Version/Date]
## Highlights
- [User value outcome]
## New
- [Feature + benefit]
## Improved
- [Improvement + impact]
## Fixed
- [Issue + user-facing resolution]
## Known Limitations
- [If applicable]
## Internal Release Notes Template
# Internal Release [Version/Date]
## Scope
- Included workstreams and commit range
## Operational Notes
- Rollout plan
- Monitoring checks
- Rollback criteria
## Risks
- Known issues and mitigations
## Feature Announcement Template
Title: [Outcome-focused headline]
1. The problem:
2. The new capability:
3. Why this matters:
4. Who should use it:
5. How to start:
6. Feedback channel:
FILE:references/roadmap-templates.md
# Roadmap Templates
## Now / Next / Later Template
### Now (0-1 quarter)
- Committed initiatives in active execution
- Success metrics and owners
- Dependencies and known risks
### Next (1-2 quarters)
- Prioritized bets with confidence levels
- Discovery items needed before commit
- Resource assumptions
### Later (2+ quarters)
- Strategic themes and directional intent
- Explicitly marked as non-commitment
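
If initiatives carry an estimated start horizon, the three buckets above can be assigned mechanically. The sketch below is illustrative only: the function name `horizon_bucket` is hypothetical, and because the ranges in the template share their boundary values, it assumes a boundary value (exactly 1 or 2 quarters out) falls into the later bucket.

```python
def horizon_bucket(quarters_out: float) -> str:
    """Map an estimated start horizon (in quarters from today) to a roadmap bucket.

    Assumption: boundary values go to the later bucket, so 1.0 is "Next"
    and 2.0 is "Later".
    """
    if quarters_out < 0:
        raise ValueError("quarters_out must be non-negative")
    if quarters_out < 1:
        return "Now"
    if quarters_out < 2:
        return "Next"
    return "Later"


if __name__ == "__main__":
    for horizon in (0, 0.5, 1, 1.5, 2, 4):
        print(horizon, horizon_bucket(horizon))
```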
## Quarterly Roadmap Template
| Quarter | Theme | Key Initiatives | Success Metrics | Risks |
|---|---|---|---|---|
| Q1 | | | | |
| Q2 | | | | |
| Q3 | | | | |
| Q4 | | | | |
## Theme-Based Roadmap Template
| Theme | Problem Statement | Initiatives | KPI Link | Owner |
|---|---|---|---|---|
| Activation | | | | |
| Retention | | | | |
| Expansion | | | | |
## OKR-Aligned Roadmap Template
| Objective | Key Result | Initiative | Milestone | Team |
|---|---|---|---|---|
| | | | | |
Guideline:
- Every initiative should map to an objective or key result.
- Mark items without alignment as candidate de-scope.
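
The alignment check in the guideline can be sketched as a small filter. This is a hedged example, not part of the skill's scripts: the `Initiative` dataclass, its field names, and `find_descope_candidates` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Initiative:
    """Hypothetical roadmap row; fields mirror the table columns above."""
    name: str
    objective: Optional[str] = None
    key_result: Optional[str] = None


def find_descope_candidates(initiatives):
    """Return names of initiatives that map to neither an objective nor a key result."""
    return [
        item.name
        for item in initiatives
        if not (item.objective or item.key_result)
    ]


if __name__ == "__main__":
    roadmap = [
        Initiative("Onboarding checklist", objective="Improve activation"),
        Initiative("Usage alerts", key_result="Reduce churn to 3%"),
        Initiative("Settings page redesign"),
    ]
    print(find_descope_candidates(roadmap))
```

An empty string counts as missing alignment here, which matches how blank table cells would typically be read.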
FILE:scripts/changelog_generator.py
#!/usr/bin/env python3
"""Generate changelog sections from git log or piped commit messages using conventional commit prefixes."""
import argparse
import shutil
import subprocess
import sys
from collections import defaultdict
SECTIONS = {
"feat": "Features",
"fix": "Fixes",
"docs": "Documentation",
"refactor": "Refactors",
"test": "Tests",
"chore": "Chores",
"perf": "Performance",
"ci": "CI",
"build": "Build",
"style": "Style",
"revert": "Reverts",
}
DEMO_COMMITS = [
"feat: add user dashboard with analytics widgets",
"feat: implement dark mode toggle",
"fix: resolve crash on empty CSV import",
"fix: correct timezone offset in calendar view",
"docs: update API reference for v2 endpoints",
"refactor: extract shared validation into utils module",
"chore: bump dependencies to latest patch versions",
"perf: optimize database queries for user listing",
]
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Generate changelog from git commits or piped input.",
epilog="Examples:\n"
" %(prog)s --from v1.0.0 --to HEAD\n"
" git log --pretty=format:%%s v1.0..HEAD | %(prog)s --stdin\n"
" %(prog)s --demo\n",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument("--from", dest="from_ref", default="HEAD~50",
help="Start ref for git log (default: HEAD~50)")
parser.add_argument("--to", dest="to_ref", default="HEAD",
help="End ref for git log (default: HEAD)")
parser.add_argument("--format", choices=["markdown", "text"], default="markdown",
help="Output format (default: markdown)")
parser.add_argument("--stdin", action="store_true",
help="Read commit subjects from stdin instead of git log")
parser.add_argument("--demo", action="store_true",
help="Run with sample data (no git required)")
return parser.parse_args()
def get_git_log(from_ref: str, to_ref: str) -> list[str]:
    """Get commit subjects from git log. Requires git on PATH and a git repo."""
    if not shutil.which("git"):
        print("Error: git not found on PATH. Use --stdin or --demo instead.", file=sys.stderr)
        sys.exit(1)
    commit_range = f"{from_ref}..{to_ref}"
    cmd = ["git", "log", "--pretty=format:%s", commit_range]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except subprocess.TimeoutExpired:
        print("Error: git log timed out.", file=sys.stderr)
        sys.exit(1)
    if result.returncode != 0:
        print(f"Error: git log failed: {result.stderr.strip()}", file=sys.stderr)
        sys.exit(1)
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return lines
def read_stdin() -> list[str]:
    """Read commit subjects from stdin, one per line."""
    return [line.strip() for line in sys.stdin if line.strip()]
def group_commits(subjects: list[str]) -> dict[str, list[str]]:
    """Bucket commit subjects by conventional-commit prefix; unmatched subjects go to "other"."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for subject in subjects:
        commit_type = "other"
        for prefix in SECTIONS:
            # Match both "feat: ..." and scoped "feat(ui): ..." forms.
            if subject.startswith(f"{prefix}:") or subject.startswith(f"{prefix}("):
                commit_type = prefix
                break
        grouped[commit_type].append(subject)
    return grouped
def render_markdown(grouped: dict[str, list[str]]) -> str:
    out = ["# Changelog", ""]
    ordered_types = list(SECTIONS.keys()) + ["other"]
    for commit_type in ordered_types:
        commits = grouped.get(commit_type, [])
        if not commits:
            continue
        header = SECTIONS.get(commit_type, "Other")
        out.append(f"## {header}")
        for item in commits:
            out.append(f"- {item}")
        out.append("")
    return "\n".join(out).rstrip() + "\n"
def render_text(grouped: dict[str, list[str]]) -> str:
    out: list[str] = []
    ordered_types = list(SECTIONS.keys()) + ["other"]
    for commit_type in ordered_types:
        commits = grouped.get(commit_type, [])
        if not commits:
            continue
        header = SECTIONS.get(commit_type, "Other")
        out.append(header.upper())
        for item in commits:
            out.append(f"* {item}")
        out.append("")
    return "\n".join(out).rstrip() + "\n"
def main() -> int:
    args = parse_args()
    if args.demo:
        subjects = DEMO_COMMITS
    elif args.stdin:
        subjects = read_stdin()
    else:
        subjects = get_git_log(args.from_ref, args.to_ref)
    if not subjects:
        print("No commits found.", file=sys.stderr)
        return 0
    grouped = group_commits(subjects)
    if args.format == "markdown":
        print(render_markdown(grouped), end="")
    else:
        print(render_text(grouped), end="")
    return 0
if __name__ == "__main__":
    raise SystemExit(main())
---
name: product-discovery
description: Use when validating product opportunities, mapping assumptions, planning discovery sprints, or testing problem-solution fit before committing delivery resources.
---
# Product Discovery
Run structured discovery to identify high-value opportunities and de-risk product bets.
## When To Use
Use this skill for:
- Opportunity Solution Tree facilitation
- Assumption mapping and test planning
- Problem validation interviews and evidence synthesis
- Solution validation with prototypes/experiments
- Discovery sprint planning and outputs
## Core Discovery Workflow
1. Define desired outcome
- Set one measurable outcome to improve.
- Establish baseline and target horizon.
2. Build Opportunity Solution Tree (OST)
- Outcome -> opportunities -> solution ideas -> experiments
- Keep opportunities grounded in user evidence, not internal opinions.
3. Map assumptions
- Identify desirability, viability, feasibility, and usability assumptions.
- Score assumptions by risk and certainty.
Use:
```bash
python3 scripts/assumption_mapper.py assumptions.csv
```
4. Validate the problem
- Conduct interviews and behavior analysis.
- Confirm frequency, severity, and willingness to solve.
- Reject weak opportunities early.
5. Validate the solution
- Prototype before building.
- Run concept, usability, and value tests.
- Measure behavior, not only stated preference.
6. Plan discovery sprint
- 1-2 week cycle with explicit hypotheses
- Daily evidence reviews
- End with decision: proceed, pivot, or stop
## Opportunity Solution Tree (Teresa Torres)
Structure:
- Outcome: metric you want to move
- Opportunities: unmet customer needs/pains
- Solutions: candidate interventions
- Experiments: fastest learning actions
Quality checks:
- At least 3 distinct opportunities before converging.
- At least 2 experiments per top opportunity.
- Tie every branch to evidence source.
## Assumption Mapping
Assumption categories:
- Desirability: users want this
- Viability: business value exists
- Feasibility: team can build/operate it
- Usability: users can successfully use it
Prioritization rule:
- High risk + low certainty assumptions are tested first.
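The prioritization rule can be expressed as a single score. The sketch below mirrors the `risk * (1 - certainty)` formula used by `scripts/assumption_mapper.py`; the sample assumptions and their values are invented purely for illustration.

```python
def priority_score(risk: float, certainty: float) -> float:
    """Higher score means test sooner. Both inputs are expected in [0, 1]."""
    return risk * (1.0 - certainty)


# Hypothetical assumptions: (statement, risk, certainty)
samples = [
    ("Users will pay for export", 0.9, 0.2),
    ("Team can build the importer", 0.4, 0.8),
    ("Admins want audit logs", 0.7, 0.5),
]

# Sort so the riskiest, least certain assumptions come first.
ranked = sorted(samples, key=lambda s: priority_score(s[1], s[2]), reverse=True)
for statement, risk, certainty in ranked:
    print(f"{priority_score(risk, certainty):.2f}  {statement}")
```

A multiplicative score keeps a well-understood high-risk assumption from crowding out a poorly understood one, which is the intent of the rule above.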
## Problem Validation Techniques
- Problem interviews focused on current behavior
- Journey friction mapping
- Support ticket and sales-call synthesis
- Behavioral analytics triangulation
Evidence threshold examples:
- Same pain repeated across multiple target users
- Observable workaround behavior
- Measurable cost of current pain
## Solution Validation Techniques
- Concept tests (value proposition comprehension)
- Prototype usability tests (task success/time-to-complete)
- Fake door or concierge tests (demand signal)
- Limited beta cohorts (retention/activation signals)
## Discovery Sprint Planning
Suggested 10-day structure:
- Day 1-2: Outcome + opportunity framing
- Day 3-4: Assumption mapping + test design
- Day 5-7: Problem and solution tests
- Day 8-9: Evidence synthesis + decision options
- Day 10: Stakeholder decision review
## Tooling
### `scripts/assumption_mapper.py`
CLI utility that:
- reads assumptions from CSV or inline input
- scores risk/certainty priority
- emits prioritized test plan with suggested test types
See `references/discovery-frameworks.md` for framework details.
FILE:references/discovery-frameworks.md
# Discovery Frameworks
## Opportunity Solution Tree (OST)
Purpose: continuously connect product outcomes to validated opportunities and tested solutions.
Core structure:
- Outcome (metric)
- Opportunity nodes (needs/pains)
- Solution ideas
- Experiments
OST practice tips:
- Keep tree live; update after each interview or test.
- Separate opportunity evidence from solution proposals.
- Avoid single-branch trees that force one solution.
## Jobs-to-be-Done (JTBD)
Use JTBD to understand progress users seek.
JTBD template:
"When [situation], I want to [motivation], so I can [expected outcome]."
JTBD interview focus:
- Trigger moments
- Current alternatives and workarounds
- Purchase/adoption anxieties
- Desired progress and success criteria
## Kano Model
Classify features by impact on satisfaction:
- Must-be: expected baseline features
- Performance: more is better
- Delighters: unexpected value multipliers
- Indifferent: low impact
- Reverse: can reduce satisfaction for some users
Use Kano when prioritizing solution concepts after problem validation.
## Design Sprint Methodology
Typical phases:
1. Understand
2. Sketch
3. Decide
4. Prototype
5. Test
Discovery usage:
- Compress learning cycle into one week.
- Best for high-ambiguity opportunities requiring cross-functional alignment.
## Assumption Prioritization Matrix
Map assumptions on two axes:
- Risk if wrong (low -> high)
- Certainty (low -> high)
Priority order:
1. High risk, low certainty (test first)
2. High risk, high certainty (validate quickly)
3. Low risk, low certainty (defer)
4. Low risk, high certainty (document)
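As a rough illustration, the matrix can be read as a lookup from two scores to an action. The 0.5 cutoff and the function name `quadrant` below are assumptions for this sketch; the framework itself does not fix a numeric threshold.

```python
def quadrant(risk: float, certainty: float, cutoff: float = 0.5) -> str:
    """Map a (risk, certainty) pair in [0, 1] to the matrix's recommended action.

    The cutoff separating "low" from "high" is an illustrative assumption.
    """
    high_risk = risk >= cutoff
    high_certainty = certainty >= cutoff
    if high_risk and not high_certainty:
        return "test first"
    if high_risk and high_certainty:
        return "validate quickly"
    if not high_certainty:
        return "defer"
    return "document"


print(quadrant(0.9, 0.2))  # high risk, low certainty
print(quadrant(0.2, 0.9))  # low risk, high certainty
```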
## Discovery Evidence Rules
- One source is not enough for major decisions.
- Triangulate qualitative and quantitative signals.
- Predefine decision criteria before test execution.
- Archive evidence with date, segment, and method.
FILE:scripts/assumption_mapper.py
#!/usr/bin/env python3
"""Prioritize product assumptions and suggest validation tests."""
import argparse
import csv
from dataclasses import dataclass
@dataclass
class Assumption:
    statement: str
    category: str
    risk: float
    certainty: float
    @property
    def priority_score(self) -> float:
        # High-risk, low-certainty assumptions should be tested first.
        return self.risk * (1.0 - self.certainty)
def parse_float(value: str, field: str) -> float:
    number = float(value)
    if number < 0 or number > 1:
        raise ValueError(f"{field} must be in [0, 1]")
    return number
def suggest_test(category: str) -> str:
    category = category.lower().strip()
    if category == "desirability":
        return "problem interviews or fake-door test"
    if category == "viability":
        return "pricing/willingness-to-pay test"
    if category == "feasibility":
        return "technical spike or architecture prototype"
    if category == "usability":
        return "moderated usability test"
    # Unknown category: fall back to a generic recommendation.
    return "smallest possible experiment with clear success criteria"
def load_from_csv(path: str) -> list[Assumption]:
    assumptions: list[Assumption] = []
    with open(path, "r", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)
        required = {"assumption", "category", "risk", "certainty"}
        missing = required - set(reader.fieldnames or [])
        if missing:
            missing_str = ", ".join(sorted(missing))
            raise ValueError(f"Missing required columns: {missing_str}")
        for row in reader:
            assumptions.append(
                Assumption(
                    statement=(row.get("assumption") or "").strip(),
                    category=(row.get("category") or "").strip(),
                    risk=parse_float(row.get("risk") or "0", "risk"),
                    certainty=parse_float(row.get("certainty") or "0", "certainty"),
                )
            )
    return assumptions
def parse_inline(items: list[str]) -> list[Assumption]:
    assumptions: list[Assumption] = []
    for item in items:
        # format: statement|category|risk|certainty
        parts = [part.strip() for part in item.split("|")]
        if len(parts) != 4:
            raise ValueError("Inline assumption must be: statement|category|risk|certainty")
        assumptions.append(
            Assumption(
                statement=parts[0],
                category=parts[1],
                risk=parse_float(parts[2], "risk"),
                certainty=parse_float(parts[3], "certainty"),
            )
        )
    return assumptions
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Prioritize assumptions and generate test plan.")
    parser.add_argument("input", nargs="?", help="CSV file path")
    parser.add_argument(
        "--assumption",
        action="append",
        default=[],
        help="Inline assumption: statement|category|risk|certainty",
    )
    parser.add_argument("--top", type=int, default=10, help="Maximum assumptions to print")
    return parser
def main() -> int:
    parser = build_parser()
    args = parser.parse_args()
    assumptions: list[Assumption] = []
    if args.input:
        assumptions.extend(load_from_csv(args.input))
    if args.assumption:
        assumptions.extend(parse_inline(args.assumption))
    if not assumptions:
        parser.error("Provide a CSV input file or at least one --assumption value.")
    # Highest priority (high risk, low certainty) first.
    assumptions.sort(key=lambda item: item.priority_score, reverse=True)
    print("prioritized_assumption_test_plan")
    print("rank,priority_score,category,risk,certainty,test,assumption")
    for rank, item in enumerate(assumptions[: args.top], start=1):
        test = suggest_test(item.category)
        print(
            f"{rank},{item.priority_score:.4f},{item.category},{item.risk:.2f},"
            f"{item.certainty:.2f},{test},{item.statement}"
        )
    return 0
if __name__ == "__main__":
    raise SystemExit(main())
---
name: product-analytics
description: Use when defining product KPIs, building metric dashboards, running cohort or retention analysis, or interpreting feature adoption trends across product stages.
---
# Product Analytics
Define, track, and interpret product metrics across discovery, growth, and mature product stages.
## When To Use
Use this skill for:
- Metric framework selection (AARRR, North Star, HEART)
- KPI definition by product stage (pre-PMF, growth, mature)
- Dashboard design and metric hierarchy
- Cohort and retention analysis
- Feature adoption and funnel interpretation
## Workflow
1. Select metric framework
- AARRR for growth loops and funnel visibility
- North Star for cross-functional strategic alignment
- HEART for UX quality and user experience measurement
2. Define stage-appropriate KPIs
- Pre-PMF: activation, early retention, qualitative success
- Growth: acquisition efficiency, expansion, conversion velocity
- Mature: retention depth, revenue quality, operational efficiency
3. Design dashboard layers
- Executive layer: 5-7 directional metrics
- Product health layer: acquisition, activation, retention, engagement
- Feature layer: adoption, depth, repeat usage, outcome correlation
4. Run cohort + retention analysis
- Segment by signup cohort or feature exposure cohort
- Compare retention curves, not single-point snapshots
- Identify inflection points around onboarding and first value moment
5. Interpret and act
- Connect metric movement to product changes and release timeline
- Distinguish signal from noise using period-over-period context
- Propose one clear product action per major metric risk/opportunity
## KPI Guidance By Stage
### Pre-PMF
- Activation rate
- Week-1 retention
- Time-to-first-value
- Problem-solution fit interview score
### Growth
- Funnel conversion by stage
- Monthly retained users
- Feature adoption among new cohorts
- Expansion / upsell proxy metrics
### Mature
- Product metrics aligned with net revenue retention (NRR)
- Power-user share and depth of use
- Churn risk indicators by segment
- Reliability and support-deflection product metrics
## Dashboard Design Principles
- Show trends, not isolated point estimates.
- Keep one owner per KPI.
- Pair each KPI with target, threshold, and decision rule.
- Use cohort and segment filters by default.
- Prefer comparable time windows (weekly vs weekly, monthly vs monthly).
See:
- `references/metrics-frameworks.md`
- `references/dashboard-templates.md`
## Cohort Analysis Method
1. Define cohort anchor event (signup, activation, first purchase).
2. Define retained behavior (active day, key action, repeat session).
3. Build retention matrix by cohort week/month and age period.
4. Compare curve shape across cohorts.
5. Flag early drop points and investigate journey friction.
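The five steps above can be sketched with the standard library alone. Everything in this example is illustrative: the event tuples are made up, signup is taken as the anchor event, any recorded activity counts as "retained", and age is measured in whole weeks since signup.

```python
import datetime as dt
from collections import defaultdict

# Hypothetical events: (user_id, signup_date, activity_date)
events = [
    ("u1", dt.date(2024, 1, 1), dt.date(2024, 1, 1)),
    ("u1", dt.date(2024, 1, 1), dt.date(2024, 1, 9)),
    ("u2", dt.date(2024, 1, 3), dt.date(2024, 1, 3)),
    ("u3", dt.date(2024, 1, 8), dt.date(2024, 1, 8)),
    ("u3", dt.date(2024, 1, 8), dt.date(2024, 1, 16)),
]


def week_start(day: dt.date) -> dt.date:
    """Monday of the week containing `day`."""
    return day - dt.timedelta(days=day.weekday())


cohort_of = {}             # user -> cohort week (step 1)
active = defaultdict(set)  # (cohort week, age in weeks) -> active users (steps 2-3)
for user, signup, activity in events:
    cohort = cohort_of.setdefault(user, week_start(signup))
    age_weeks = (activity - signup).days // 7
    if age_weeks >= 0:
        active[(cohort, age_weeks)].add(user)

# Cohort size is the number of distinct users anchored to each cohort week.
sizes = defaultdict(int)
for cohort in cohort_of.values():
    sizes[cohort] += 1

# Retention matrix: share of each cohort active at each age (input to steps 4-5).
matrix = {key: len(users) / sizes[key[0]] for key, users in active.items()}
for (cohort, age), rate in sorted(matrix.items()):
    print(f"{cohort} week+{age}: {rate:.0%}")
```

Reading the rows for one cohort left to right gives that cohort's retention curve; comparing the same age column across cohorts shows whether newer cohorts retain better.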
## Retention Curve Interpretation
- Sharp early drop, low plateau: onboarding mismatch or weak initial value.
- Moderate drop, stable plateau: healthy core audience with predictable churn.
- Flattening at low level: product used occasionally, revisit value metric.
- Improving newer cohorts: onboarding or positioning improvements are working.
## Tooling
### `scripts/metrics_calculator.py`
CLI utility for:
- Retention rate calculations by cohort age
- Cohort table generation
- Basic funnel conversion analysis
Examples:
```bash
python3 scripts/metrics_calculator.py retention events.csv
python3 scripts/metrics_calculator.py cohort events.csv --cohort-grain month
python3 scripts/metrics_calculator.py funnel funnel.csv --stages visit,signup,activate,pay
```
FILE:references/dashboard-templates.md
# Dashboard Templates
## 1. Executive Dashboard Template
Purpose: quick company-level product signal for leadership.
Sections:
1. North Star trend (current, target, trailing 12 periods)
2. Growth summary (new users/accounts, activation)
3. Retention summary (short-term + medium-term cohorts)
4. Revenue-linked product indicators
5. Risks and actions
Suggested KPI block:
| KPI | Current | Target | Delta | Owner | Action |
|---|---:|---:|---:|---|---|
| North Star | | | | | |
| Activation Rate | | | | | |
| W8 Retention | | | | | |
| Paid Conversion | | | | | |
## 2. Product Health Dashboard Template
Purpose: monitor full user journey and detect bottlenecks.
Sections:
1. Acquisition funnel by channel/segment
2. Activation funnel with drop-off points
3. Cohort retention matrix + curve chart
4. Feature adoption distribution
5. Reliability metrics tied to user outcomes
Recommended views:
- Weekly cohort retention heatmap
- Funnel stage conversion waterfall
- Segment comparison (SMB vs enterprise)
- New vs returning user behavior split
## 3. Feature Adoption Dashboard Template
Purpose: evaluate feature launch quality and ongoing usage.
Sections:
1. Exposure and eligibility count
2. First-use adoption rate
3. Repeat usage rate (2nd, 3rd, nth use)
4. Time-to-adoption from signup/activation
5. Impact on primary outcomes (retention, conversion)
Adoption KPI examples:
| Metric | Definition |
|---|---|
| First-use adoption | Users who used feature at least once / eligible users |
| Repeat adoption | Users with 2+ uses / users with first use |
| Sustained adoption | Users with usage in 3 of last 4 weeks |
| Time to adoption | Median days from eligibility to first use |
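A minimal sketch of three of the definitions in the table, assuming usage has already been aggregated per user. The user IDs, counts, and variable names are hypothetical; sustained adoption is left out because it needs week-level usage data.

```python
from statistics import median

# Hypothetical inputs
eligible_users = {"a", "b", "c", "d", "e"}    # users who could use the feature
use_counts = {"a": 3, "b": 1, "c": 2}         # total feature uses per user
days_to_first_use = {"a": 1, "b": 6, "c": 3}  # days from eligibility to first use

first_users = {u for u, n in use_counts.items() if n >= 1 and u in eligible_users}
repeat_users = {u for u in first_users if use_counts[u] >= 2}

# First-use adoption: users with at least one use / eligible users
first_use_adoption = len(first_users) / len(eligible_users)
# Repeat adoption: users with 2+ uses / users with a first use
repeat_adoption = len(repeat_users) / len(first_users) if first_users else 0.0
# Time to adoption: median days from eligibility to first use
time_to_adoption = median(days_to_first_use[u] for u in first_users)

print(f"first-use adoption: {first_use_adoption:.0%}")
print(f"repeat adoption:    {repeat_adoption:.0%}")
print(f"time to adoption:   {time_to_adoption} days (median)")
```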
## Dashboard Design Rules
- Keep each dashboard to one decision horizon (weekly ops vs quarterly strategy).
- Always annotate major product releases on charts.
- Add threshold bands for risk detection.
- Show metric definitions next to charts.
- Include a short "what changed" narrative block.
FILE:references/metrics-frameworks.md
# Metrics Frameworks
## AARRR (Pirate Metrics)
AARRR breaks the product journey into five stages.
1. Acquisition
- How users discover the product
- Example metrics: signups, CAC, channel conversion
2. Activation
- First meaningful value moment
- Example metrics: activation rate, time-to-first-value
3. Retention
- Ongoing user return behavior
- Example metrics: D7/W4 retention, rolling retained users
4. Revenue
- Monetization and value capture
- Example metrics: conversion to paid, ARPU, expansion revenue
5. Referral
- Organic growth from existing users
- Example metrics: referral rate, invite conversion, K-factor
## North Star Metric Framework
North Star = metric capturing long-term customer value delivered.
### North Star Criteria
- Reflects real user value
- Sensitive to product improvements
- Predictive of sustainable growth
- Understandable across functions
### Example North Star Metrics
- Collaboration SaaS: weekly active teams
- Marketplace: successful transactions per active buyer
- Content product: hours of qualified consumption
### Input Metrics
Track levers that influence the North Star:
- Acquisition quality
- Activation quality
- Engagement depth
- Retention durability
## HEART Framework
HEART is a UX-oriented framework from Google.
- Happiness: satisfaction, NPS, perceived quality
- Engagement: interaction depth/frequency
- Adoption: first-time use of features/products
- Retention: return behavior over time
- Task Success: completion rate, error rate, time on task
### HEART + Goals-Signals-Metrics
1. Goals: what UX outcome you want
2. Signals: observed behavior indicating movement
3. Metrics: measurable indicator for each signal
## Framework Selection Guide
| Situation | Recommended Framework |
|---|---|
| Early growth and funnel bottlenecks | AARRR |
| Company-wide strategic alignment | North Star |
| UX and product quality optimization | HEART |
| Mixed maturity org | North Star + AARRR operational layers |
## Example: B2B SaaS Product
- North Star: weekly active accounts completing core workflow
- AARRR operational metrics:
  - Acquisition: qualified signups
  - Activation: % accounts completing setup in 7 days
  - Retention: W8 retained accounts
  - Revenue: paid conversion and expansion rate
  - Referral: invited teammate activation rate
- HEART for onboarding redesign:
  - Task Success: onboarding completion rate
  - Happiness: onboarding CSAT
FILE:scripts/metrics_calculator.py
#!/usr/bin/env python3
"""Product metrics calculator: retention, cohort matrix, and funnel conversion."""
import argparse
import csv
import datetime as dt
from collections import defaultdict
def parse_date(value: str) -> dt.date:
    # Keep only the YYYY-MM-DD prefix so full ISO timestamps are accepted too.
    return dt.date.fromisoformat(value.strip()[:10])
def load_csv(path: str):
    with open(path, "r", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))
def retention(args: argparse.Namespace) -> int:
    rows = load_csv(args.input)
    cohorts = {}
    events = []
    # First pass: find each user's earliest cohort date so results do not depend on row order.
    for row in rows:
        user = row[args.user_column].strip()
        cohort_date = parse_date(row[args.cohort_column])
        activity_date = parse_date(row[args.activity_column])
        cohorts[user] = min(cohorts.get(user, cohort_date), cohort_date)
        events.append((user, activity_date))
    # Second pass: bucket activity by days since that earliest cohort date.
    activity = defaultdict(set)
    for user, activity_date in events:
        delta = (activity_date - cohorts[user]).days
        if delta >= 0:
            activity[delta].add(user)
    base_users = len(cohorts)
    if base_users == 0:
        print("No users found.")
        return 1
    print("Retention by period")
    print("period,active_users,retention_rate")
    max_period = args.max_period
    for period in range(0, max_period + 1):
        users = len(activity.get(period, set()))
        rate = users / base_users
        print(f"{period},{users},{rate:.4f}")
    return 0
def cohort(args: argparse.Namespace) -> int:
    rows = load_csv(args.input)
    cohorts = {}
    activity = defaultdict(set)
    for row in rows:
        user = row[args.user_column].strip()
        cohort_date = parse_date(row[args.cohort_column])
        activity_date = parse_date(row[args.activity_column])
        if args.cohort_grain == "month":
            cohort_key = cohort_date.strftime("%Y-%m")
        else:
            cohort_key = f"{cohort_date.isocalendar().year}-W{cohort_date.isocalendar().week:02d}"
        cohorts.setdefault(user, cohort_key)
        age = (activity_date - cohort_date).days
        if age >= 0:
            activity[(cohort_key, age)].add(user)
    cohort_sizes = defaultdict(int)
    for cohort_key in cohorts.values():
        cohort_sizes[cohort_key] += 1
    cohort_keys = sorted(cohort_sizes.keys())
    print("cohort,age_days,active_users,cohort_size,retention_rate")
    for cohort_key in cohort_keys:
        size = cohort_sizes[cohort_key]
        for age in range(0, args.max_period + 1):
            active_users = len(activity.get((cohort_key, age), set()))
            rate = (active_users / size) if size else 0
            print(f"{cohort_key},{age},{active_users},{size},{rate:.4f}")
    return 0
def funnel(args: argparse.Namespace) -> int:
    rows = load_csv(args.input)
    stages = [item.strip() for item in args.stages.split(",") if item.strip()]
    if not stages:
        print("No stages provided.")
        return 1
    stage_users = {stage: set() for stage in stages}
    for row in rows:
        user = row[args.user_column].strip()
        stage = row[args.stage_column].strip()
        if stage in stage_users:
            stage_users[stage].add(user)
    print("stage,users,conversion_from_previous,conversion_from_first")
    previous_count = None
    first_count = None
    for stage in stages:
        count = len(stage_users[stage])
        if first_count is None:
            first_count = count
        if previous_count is None:
            # The first stage has no predecessor, so it converts at 100% by definition.
            conv_prev = 1.0
        else:
            # An empty previous stage reports 0 rather than a misleading 100%.
            conv_prev = (count / previous_count) if previous_count else 0.0
        conv_first = (count / first_count) if first_count else 0.0
        print(f"{stage},{count},{conv_prev:.4f},{conv_first:.4f}")
        previous_count = count
    return 0
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Calculate retention, cohort, and funnel metrics from CSV data."
    )
    subparsers = parser.add_subparsers(dest="command", required=True)
    common = {
        "help": "CSV input path",
    }
    retention_parser = subparsers.add_parser("retention", help="Calculate retention by day.")
    retention_parser.add_argument("input", **common)
    retention_parser.add_argument("--user-column", default="user_id")
    retention_parser.add_argument("--cohort-column", default="cohort_date")
    retention_parser.add_argument("--activity-column", default="activity_date")
    retention_parser.add_argument("--max-period", type=int, default=30)
    retention_parser.set_defaults(func=retention)
    cohort_parser = subparsers.add_parser("cohort", help="Build cohort retention matrix rows.")
    cohort_parser.add_argument("input", **common)
    cohort_parser.add_argument("--user-column", default="user_id")
    cohort_parser.add_argument("--cohort-column", default="cohort_date")
    cohort_parser.add_argument("--activity-column", default="activity_date")
    cohort_parser.add_argument("--cohort-grain", choices=["week", "month"], default="week")
    cohort_parser.add_argument("--max-period", type=int, default=30)
    cohort_parser.set_defaults(func=cohort)
    funnel_parser = subparsers.add_parser("funnel", help="Calculate funnel conversion by stage.")
    funnel_parser.add_argument("input", **common)
    funnel_parser.add_argument("--user-column", default="user_id")
    funnel_parser.add_argument("--stage-column", default="stage")
    funnel_parser.add_argument("--stages", required=True)
    funnel_parser.set_defaults(func=funnel)
    return parser
def main() -> int:
    parser = build_parser()
    args = parser.parse_args()
    # Each subcommand registers its handler via set_defaults(func=...).
    return args.func(args)
if __name__ == "__main__":
    raise SystemExit(main())
---
name: experiment-designer
description: Use when planning product experiments, writing testable hypotheses, estimating sample size, prioritizing tests, or interpreting A/B outcomes with practical statistical rigor.
---
# Experiment Designer
Design, prioritize, and evaluate product experiments with clear hypotheses and defensible decisions.
## When To Use
Use this skill for:
- A/B and multivariate experiment planning
- Hypothesis writing and success criteria definition
- Sample size and minimum detectable effect planning
- Experiment prioritization with ICE scoring
- Reading statistical output for product decisions
## Core Workflow
1. Write hypothesis in If/Then/Because format
- If we change `[intervention]`
- Then `[metric]` will change by `[expected direction/magnitude]`
- Because `[behavioral mechanism]`
2. Define metrics before running test
- Primary metric: single decision metric
- Guardrail metrics: quality/risk protection
- Secondary metrics: diagnostics only
3. Estimate sample size
- Baseline conversion or baseline mean
- Minimum detectable effect (MDE)
- Significance level (alpha) and power
Use:
```bash
python3 scripts/sample_size_calculator.py --baseline-rate 0.12 --mde 0.02 --mde-type absolute
```
4. Prioritize experiments with ICE
- Impact: potential upside
- Confidence: evidence quality
- Ease: cost/speed/complexity
ICE Score = (Impact * Confidence * Ease) / 10
5. Launch with stopping rules
- Decide fixed sample size or fixed duration in advance
- Avoid repeated peeking without proper method
- Monitor guardrails continuously
6. Interpret results
- Statistical significance is not business significance
- Compare point estimate + confidence interval to decision threshold
- Investigate novelty effects and segment heterogeneity
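The ICE formula from step 4 can be applied to a backlog in a few lines. In this sketch the experiment names and the 1-10 scoring scale are assumptions chosen for illustration; only the formula itself comes from the workflow above.

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE Score = (Impact * Confidence * Ease) / 10."""
    return (impact * confidence * ease) / 10


# Hypothetical backlog: name -> (impact, confidence, ease), each scored 1-10
backlog = {
    "Shorter signup form": (8, 6, 9),
    "Annual pricing toggle": (9, 4, 5),
    "Onboarding checklist": (6, 7, 7),
}

# Highest ICE score first.
ranked = sorted(backlog, key=lambda name: ice_score(*backlog[name]), reverse=True)
for name in ranked:
    print(f"{ice_score(*backlog[name]):6.1f}  {name}")
```

Because the three factors multiply, a low score on any one of them pulls an idea down sharply, so a high-impact idea with weak evidence will rank below a moderate idea that is well supported and cheap to run.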
## Hypothesis Quality Checklist
- [ ] Contains explicit intervention and audience
- [ ] Specifies measurable metric change
- [ ] States plausible causal reason
- [ ] Includes expected minimum effect
- [ ] Defines failure condition
## Common Experiment Pitfalls
- Underpowered tests leading to false negatives
- Running too many simultaneous changes without isolation
- Changing targeting or implementation mid-test
- Stopping early on random spikes
- Ignoring sample ratio mismatch and instrumentation drift
- Declaring success from p-value without effect-size context
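Sample ratio mismatch is one pitfall in this list that can be checked mechanically. The sketch below is one common approach, not something this skill ships: a chi-square goodness-of-fit test with one degree of freedom on the observed group sizes. The function name `srm_p_value`, the 50/50 expected split, and the 0.001 alert threshold in the usage lines are all assumptions.

```python
import math


def srm_p_value(n_control: int, n_variant: int, expected_share: float = 0.5) -> float:
    """p-value that the observed split is consistent with the intended allocation.

    `expected_share` is the intended fraction of traffic in control.
    """
    total = n_control + n_variant
    exp_control = total * expected_share
    exp_variant = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_variant - exp_variant) ** 2 / exp_variant)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


p = srm_p_value(5000, 5400)
print(f"SRM p-value: {p:.6f}")
if p < 0.001:  # illustrative alert threshold
    print("Possible sample ratio mismatch: check assignment and logging before reading results.")
```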
## Statistical Interpretation Guardrails
- p-value < alpha indicates evidence against null, not guaranteed truth.
- Confidence interval crossing zero/no-effect means uncertain directional claim.
- Wide intervals imply low precision even when significant.
- Use practical significance thresholds tied to business impact.
See:
- `references/experiment-playbook.md`
- `references/statistics-reference.md`
## Tooling
### `scripts/sample_size_calculator.py`
Computes required sample size (per variant and total) from:
- baseline rate
- MDE (absolute or relative)
- significance level (alpha)
- statistical power
Example:
```bash
python3 scripts/sample_size_calculator.py \
--baseline-rate 0.10 \
--mde 0.015 \
--mde-type absolute \
--alpha 0.05 \
--power 0.8
```
FILE:references/experiment-playbook.md
# Experiment Playbook
## Experiment Types
### A/B Test
- Compare one control versus one variant.
- Best for high-confidence directional decisions.
### Multivariate Test
- Test combinations of multiple factors.
- Useful for interaction effects, requires larger traffic.
### Holdout Test
- Keep a percentage unexposed to intervention.
- Useful for measuring incremental lift over broader changes.
## Metric Design
### Primary Metric
- One metric that decides ship/no-ship.
- Must align with user value and business objective.
### Guardrail Metrics
- Prevent local optimization damage.
- Examples: error rate, latency, churn proxy, support contacts.
### Diagnostic Metrics
- Explain why change happened.
- Do not use as decision gate unless pre-specified.
## Stopping Rules
Define before launch:
- Fixed sample size per group
- Minimum run duration (to capture weekday/weekend behavior)
- Guardrail breach thresholds (pause criteria)
Avoid:
- Continuous peeking with fixed-horizon inference
- Changing success metric mid-test
- Retroactive segmentation without correction
## Novelty and Primacy Effects
- Novelty effect: short-term spike due to newness, not durable value.
- Primacy effect: early exposure creates bias in user behavior.
Mitigation:
- Run long enough for behavior stabilization.
- Check returning users and delayed cohorts separately.
- Re-run key tests when stakes are high.
## Pre-Launch Checklist
- [ ] Hypothesis complete (If/Then/Because)
- [ ] Metric definitions frozen
- [ ] Instrumentation validated
- [ ] Randomization and assignment verified
- [ ] Sample size and duration approved
- [ ] Rollback plan documented
## Post-Test Readout Template
1. Hypothesis and scope
2. Experiment setup and quality checks
3. Primary metric effect size + confidence interval
4. Guardrail status
5. Segment-level observations (pre-registered only)
6. Decision: ship, iterate, or reject
7. Follow-up experiments
FILE:references/statistics-reference.md
# Statistics Reference for Product Managers
## p-value
The p-value is the probability of observing data at least as extreme as yours if there were no true effect.
- Small p-value means data is less consistent with "no effect".
- It does not tell you the probability that the variant is best.
## Confidence Interval (CI)
A CI gives a plausible range for the true effect size.
- Narrow interval: more precise estimate.
- Wide interval: uncertain estimate.
- If CI includes zero (or no-effect), directional confidence is weak.
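To make the last point concrete, here is a minimal sketch of a confidence interval for the difference between two conversion rates. It uses the normal (Wald) approximation, which is an assumption of this example and is only reasonable for fairly large samples; the counts are made up.

```python
import math
from statistics import NormalDist


def diff_ci(conv_a: int, n_a: int, conv_b: int, n_b: int, confidence: float = 0.95):
    """Wald interval for (rate_b - rate_a) from conversion counts and sample sizes."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Hypothetical test: control 500/5000 (10.0%), variant 550/5000 (11.0%)
low, high = diff_ci(500, 5000, 550, 5000)
print(f"95% CI for lift: [{low:+.4f}, {high:+.4f}]")
if low <= 0 <= high:
    print("Interval includes zero: directional confidence is weak.")
```

Here the observed lift is a full percentage point, yet the interval still spans zero, so the data alone do not support a confident directional claim.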
## Minimum Detectable Effect (MDE)
The smallest effect worth detecting.
- Set MDE by business value threshold, not wishful optimism.
- Smaller MDE requires larger sample size.
## Statistical Power
Power is the probability of detecting a true effect of at least MDE.
- Common target: 80% (0.8)
- Higher power increases sample requirements.
## Type I and Type II Errors
- Type I (false positive): claim effect when none exists (controlled by alpha).
- Type II (false negative): miss a real effect (controlled by power).
## Practical Significance
An effect can be statistically significant but too small to matter.
Always ask:
- Does the effect clear implementation cost?
- Does it move strategic KPIs materially?
## Power Analysis Inputs
For conversion experiments (two proportions):
- Baseline conversion rate
- MDE (absolute points or relative uplift)
- Alpha (e.g., 0.05)
- Power (e.g., 0.8)
Output:
- Required sample size per variant
- Total sample size
- Approximate runtime based on traffic volume
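For reference, the per-variant sample size for a two-sided, two-proportion test can be approximated with the pooled-variance formula below, which is the calculation `scripts/sample_size_calculator.py` performs. The symbols are notation introduced here: $p_1$ is the baseline rate, $p_2$ the target rate, and $z_q$ the standard normal quantile at probability $q$.

```latex
n_{\text{per variant}} =
  \left\lceil
    \frac{2\,\bar{p}\,(1-\bar{p})\,\bigl(z_{1-\alpha/2} + z_{\text{power}}\bigr)^{2}}{\delta^{2}}
  \right\rceil,
\qquad
\bar{p} = \frac{p_1 + p_2}{2},
\quad
\delta = \lvert p_2 - p_1 \rvert
```

With $p_1 = 0.10$, an absolute MDE of $0.015$ (so $p_2 = 0.115$), $\alpha = 0.05$, and power $0.8$, the quantiles are about $1.96$ and $0.84$, and the formula works out to roughly 6,700 users per variant. Because $\delta$ is squared in the denominator, halving the MDE roughly quadruples the required sample.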
FILE:scripts/sample_size_calculator.py
#!/usr/bin/env python3
"""Calculate sample size for two-proportion A/B tests."""
import argparse
import math
import statistics
def clamp_rate(value: float, name: str) -> float:
    # Validates rather than clamps: rates must lie strictly between 0 and 1.
    if value <= 0 or value >= 1:
        raise ValueError(f"{name} must be between 0 and 1 (exclusive).")
    return value
def required_sample_size_per_group(
    baseline_rate: float,
    target_rate: float,
    alpha: float,
    power: float,
) -> int:
    delta = abs(target_rate - baseline_rate)
    if delta <= 0:
        raise ValueError("MDE resolves to zero; target and baseline must differ.")
    # Two-sided test: split alpha across both tails.
    z_alpha = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = statistics.NormalDist().inv_cdf(power)
    # Pooled-variance approximation using the average of the two rates.
    pooled = (baseline_rate + target_rate) / 2
    numerator = 2 * pooled * (1 - pooled) * (z_alpha + z_beta) ** 2
    n = numerator / (delta ** 2)
    return math.ceil(n)
def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Compute sample size for two-proportion product experiments."
    )
    parser.add_argument("--baseline-rate", type=float, required=True)
    parser.add_argument(
        "--mde",
        type=float,
        required=True,
        help="Minimum detectable effect. Absolute points when --mde-type absolute, otherwise relative uplift.",
    )
    parser.add_argument("--mde-type", choices=["absolute", "relative"], default="relative")
    parser.add_argument("--alpha", type=float, default=0.05)
    parser.add_argument("--power", type=float, default=0.8)
    parser.add_argument(
        "--daily-samples",
        type=int,
        default=0,
        help="Optional total daily samples to estimate runtime in days.",
    )
    return parser.parse_args()
def main() -> int:
args = parse_args()
baseline = clamp_rate(args.baseline_rate, "baseline-rate")
if args.mde <= 0:
raise ValueError("mde must be > 0")
if args.alpha <= 0 or args.alpha >= 1:
raise ValueError("alpha must be between 0 and 1")
if args.power <= 0 or args.power >= 1:
raise ValueError("power must be between 0 and 1")
if args.mde_type == "absolute":
target = baseline + args.mde
else:
target = baseline * (1 + args.mde)
target = clamp_rate(target, "target-rate")
n_per_group = required_sample_size_per_group(
baseline_rate=baseline,
target_rate=target,
alpha=args.alpha,
power=args.power,
)
total_n = n_per_group * 2
print("A/B Test Sample Size Estimate")
print(f"baseline_rate: {baseline:.6f}")
print(f"target_rate: {target:.6f}")
print(f"mde_type: {args.mde_type}")
print(f"alpha: {args.alpha}")
print(f"power: {args.power}")
print(f"n_per_group: {n_per_group}")
print(f"n_total: {total_n}")
if args.daily_samples > 0:
days = math.ceil(total_n / args.daily_samples)
print(f"estimated_days_at_daily_samples_{args.daily_samples}: {days}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
Generates high-converting landing pages as complete Next.js/React (TSX) components with Tailwind CSS. Creates hero sections, feature grids, pricing tables, F...
---
name: "landing-page-generator"
description: "Generates high-converting landing pages as complete Next.js/React (TSX) components with Tailwind CSS. Creates hero sections, feature grids, pricing tables, FAQ accordions, testimonial blocks, and CTA sections using proven copy frameworks (PAS, AIDA, BAB). Outputs SEO meta tags, structured data, and performance-optimised code targeting Core Web Vitals (LCP < 1s, CLS < 0.1). Use when the user asks to create a landing page, marketing page, homepage, single-page site, lead capture page, campaign page, promo page, or conversion-optimised web page — or when they want to A/B test landing page variants or replace a static page with one designed to convert."
---
# Landing Page Generator
Generate high-converting landing pages from a product description. Output complete Next.js/React components with multiple section variants, proven copy frameworks, SEO optimization, and performance-first patterns. Not lorem ipsum — actual copy that converts.
**Target:** LCP < 1s · CLS < 0.1 · FID < 100ms
**Output:** TSX components + Tailwind styles + SEO meta + copy variants
---
## Core Capabilities
- 5 hero section variants (centered, split, gradient, video-bg, minimal)
- Feature sections (grid, alternating, cards with icons)
- Pricing tables (2–4 tiers with feature lists and toggle)
- FAQ accordion with schema markup
- Testimonials (grid, carousel, single-quote)
- CTA sections (banner, full-page, inline)
- Footer (simple, mega, minimal)
- 4 design styles with Tailwind class sets
---
## Generation Workflow
Follow these steps in order for every landing page request:
1. **Gather inputs** — collect product name, tagline, audience, pain point, key benefit, pricing tiers, design style, and copy framework using the trigger format below. Ask only for missing fields.
2. **Analyze brand voice** (recommended) — if the user has existing brand content (website copy, blog posts, marketing materials), run it through `marketing-skill/content-production/scripts/brand_voice_analyzer.py` to get a voice profile (formality, tone, perspective). Use the profile to inform design style and copy framework selection:
- formal + professional → **enterprise** style, **AIDA** framework
- casual + friendly → **bold-startup** style, **BAB** framework
- professional + authoritative → **dark-saas** style, **PAS** framework
- casual + conversational → **clean-minimal** style, **BAB** framework
3. **Select design style** — map the user's choice (or infer from brand voice analysis) to one of the four Tailwind class sets in the Design Style Reference.
4. **Apply copy framework** — write all headline and body copy using the chosen framework (PAS / AIDA / BAB) before generating components. Match the voice profile's formality and tone throughout.
5. **Generate sections in order** — Hero → Features → Pricing → FAQ → Testimonials → CTA → Footer. Skip sections not relevant to the product.
6. **Validate against SEO checklist** — run through every item in the SEO Checklist before outputting final code. Fix any gaps inline.
7. **Output final components** — deliver complete, copy-paste-ready TSX files with all Tailwind classes, SEO meta, and structured data included.
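The voice-profile mapping in step 2 can be sketched as a small lookup table. This is a minimal illustration: `VOICE_TO_STYLE` and `select_style` are hypothetical names, and the actual output shape of `brand_voice_analyzer.py` is not shown in this document, so the two string arguments are an assumption.

```python
from typing import Optional, Tuple

# Hypothetical lookup mirroring the step 2 mapping:
# (first trait, second trait) -> (design style, copy framework).
VOICE_TO_STYLE = {
    ("formal", "professional"): ("enterprise", "AIDA"),
    ("casual", "friendly"): ("bold-startup", "BAB"),
    ("professional", "authoritative"): ("dark-saas", "PAS"),
    ("casual", "conversational"): ("clean-minimal", "BAB"),
}


def select_style(first: str, second: str) -> Optional[Tuple[str, str]]:
    """Return (style, framework) for a voice profile, or None if unmapped."""
    key = (first.strip().lower(), second.strip().lower())
    return VOICE_TO_STYLE.get(key)
```

Returning `None` for an unmapped profile keeps the decision with the user instead of silently picking a style.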
---
## Triggering This Skill
```
Product: [name]
Tagline: [one sentence value prop]
Target audience: [who they are]
Key pain point: [what problem you solve]
Key benefit: [primary outcome]
Pricing tiers: [free/pro/enterprise or describe]
Design style: dark-saas | clean-minimal | bold-startup | enterprise
Copy framework: PAS | AIDA | BAB
```
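To support "ask only for missing fields", the trigger block can be parsed into key/value pairs and checked against the expected field names. The sketch below is one possible approach; `parse_trigger` and `missing_fields` are hypothetical helpers, not part of this skill.

```python
# Field names taken from the trigger format above.
REQUIRED_FIELDS = [
    "Product", "Tagline", "Target audience", "Key pain point",
    "Key benefit", "Pricing tiers", "Design style", "Copy framework",
]


def parse_trigger(text: str) -> dict:
    """Collect 'Field: value' lines, ignoring unknown or empty fields."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key in REQUIRED_FIELDS and value:
            fields[key] = value
    return fields


def missing_fields(fields: dict) -> list:
    """Return required field names that were not supplied, in template order."""
    return [name for name in REQUIRED_FIELDS if name not in fields]
```

Note that an unedited placeholder such as `[name]` would count as supplied here; a stricter version could treat bracketed values as missing.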
---
## Design Style Reference
| Style | Background | Accent | Cards | CTA Button |
|---|---|---|---|---|
| **Dark SaaS** | `bg-gray-950 text-white` | `violet-500/400` | `bg-gray-900 border border-gray-800` | `bg-violet-600 hover:bg-violet-500` |
| **Clean Minimal** | `bg-white text-gray-900` | `blue-600` | `bg-gray-50 border border-gray-200 rounded-2xl` | `bg-blue-600 hover:bg-blue-700` |
| **Bold Startup** | `bg-white text-gray-900` | `orange-500` | `shadow-xl rounded-3xl` | `bg-orange-500 hover:bg-orange-600 text-white` |
| **Enterprise** | `bg-slate-50 text-slate-900` | `slate-700` | `bg-white border border-slate-200 shadow-sm` | `bg-slate-900 hover:bg-slate-800 text-white` |
> **Bold Startup** headings: add `font-black tracking-tight` to all `<h1>`/`<h2>` elements.
---
## Copy Frameworks
**PAS (Problem → Agitate → Solution)**
- H1: Painful state they're in
- Sub: What happens if they don't fix it
- CTA: What you offer
- *Example — H1:* "Your team wastes 3 hours a day on manual reporting" / *Sub:* "Every hour spent on spreadsheets is an hour not closing deals. Your competitors are already automated." / *CTA:* "Automate your reports in 10 minutes →"
**AIDA (Attention → Interest → Desire → Action)**
- H1: Bold attention-grabbing statement → Sub: Interesting fact or benefit → Features: Desire-building proof points → CTA: Clear action
**BAB (Before → After → Bridge)**
- H1: "[Before state] → [After state]" → Sub: "Here's how [product] bridges the gap" → Features: How it works (the bridge)
---
## Representative Component: Hero (Centered Gradient — Dark SaaS)
Use this as the structural template for all hero variants. Swap layout classes, gradient direction, and image placement for split, video-bg, and minimal variants.
```tsx
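// Assumes the shadcn/ui Button at its default path; adjust the import to your project.
import { Button } from "@/components/ui/button"

export function HeroCentered() {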
return (
<section className="relative flex min-h-screen flex-col items-center justify-center overflow-hidden bg-gray-950 px-4 text-center">
<div className="absolute inset-0 bg-gradient-to-b from-violet-900/20 to-transparent" />
<div className="pointer-events-none absolute -top-40 left-1/2 h-[600px] w-[600px] -translate-x-1/2 rounded-full bg-violet-600/20 blur-3xl" />
<div className="relative z-10 max-w-4xl">
<div className="mb-6 inline-flex items-center gap-2 rounded-full border border-violet-500/30 bg-violet-500/10 px-4 py-1.5 text-sm text-violet-300">
<span className="h-1.5 w-1.5 rounded-full bg-violet-400" />
Now in public beta
</div>
<h1 className="mb-6 text-5xl font-bold tracking-tight text-white md:text-7xl">
Ship faster.<br />
<span className="bg-gradient-to-r from-violet-400 to-pink-400 bg-clip-text text-transparent">
Break less.
</span>
</h1>
<p className="mx-auto mb-10 max-w-2xl text-xl text-gray-400">
The deployment platform that catches errors before your users do.
Zero config. Instant rollbacks. Real-time monitoring.
</p>
<div className="flex flex-col items-center gap-4 sm:flex-row sm:justify-center">
<Button size="lg" className="bg-violet-600 text-white hover:bg-violet-500 px-8">
Start free trial
</Button>
<Button size="lg" variant="outline" className="border-gray-700 text-gray-300">
See how it works →
</Button>
</div>
<p className="mt-4 text-sm text-gray-500">No credit card required · 14-day free trial</p>
</div>
</section>
)
}
```
---
## Other Section Patterns
### Feature Section (Alternating)
Map over a `features` array with `{ title, description, image, badge }`. Toggle layout direction with `i % 2 === 1 ? "lg:flex-row-reverse" : ""`. Use `<Image>` with explicit `width`/`height` and `rounded-2xl shadow-xl`. Wrap in `<section className="py-24">` with `max-w-6xl` container.
### Pricing Table
Map over a `plans` array with `{ name, price, description, features[], cta, highlighted }`. Highlighted plan gets `border-2 border-violet-500 bg-violet-950/50 ring-4 ring-violet-500/20`; others get `border border-gray-800 bg-gray-900`. Render `null` price as "Custom". Use `<Check>` icon per feature row. Layout: `grid gap-8 lg:grid-cols-3`.
### FAQ with Schema Markup
Inject `FAQPage` JSON-LD via `<script type="application/ld+json" dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }} />` inside the section. Map FAQs with `{ q, a }` into shadcn `<Accordion>` with `type="single" collapsible`. Container: `max-w-3xl`.
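The `schema` object passed to `JSON.stringify` follows the schema.org `FAQPage` shape. As a sketch of that structure (written in Python to match the skill's scripts; the `faq_schema` helper is hypothetical), each `{ q, a }` pair becomes a `Question` with an `acceptedAnswer`:

```python
import json


def faq_schema(faqs: list) -> dict:
    """Build a schema.org FAQPage object from a list of {"q": ..., "a": ...} dicts."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": item["q"],
                "acceptedAnswer": {"@type": "Answer", "text": item["a"]},
            }
            for item in faqs
        ],
    }


# Serialize for embedding in a <script type="application/ld+json"> tag.
payload = json.dumps(faq_schema([{"q": "Is there a free trial?", "a": "Yes, 14 days."}]))
```

The same nesting applies when the object is built directly in TSX.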
### Testimonials, CTA, Footer
- **Testimonials:** Grid (`grid-cols-1 md:grid-cols-3`) or single-quote hero block with avatar, name, role, and quote text.
- **CTA Banner:** Full-width section with headline, subhead, and two buttons (primary + ghost). Add trust signals (money-back guarantee, logo strip) immediately below.
- **Footer:** Logo + nav columns + social links + legal. Use `border-t border-gray-800` separator.
---
## SEO Checklist
- [ ] `<title>` tag: primary keyword + brand (50–60 chars)
- [ ] Meta description: benefit + CTA (150–160 chars)
- [ ] OG image: 1200×630px with product name and tagline
- [ ] H1: one per page, includes primary keyword
- [ ] Structured data: FAQPage, Product, or Organization schema
- [ ] Canonical URL set
- [ ] Image alt text on all `<Image>` components
- [ ] robots.txt and sitemap.xml configured
- [ ] Core Web Vitals: LCP < 1s, CLS < 0.1
- [ ] Mobile viewport meta tag present
- [ ] Internal linking to pricing and docs
> **Validation step:** Before outputting final code, verify every checklist item above is satisfied. Fix any gaps inline — do not skip items.
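The two length-based checklist items are easy to verify mechanically. The helper below is a minimal sketch with an assumed name, `check_meta_lengths`, using the character ranges from the checklist; the other items still need manual review.

```python
def check_meta_lengths(title: str, description: str) -> list:
    """Return a list of problems; an empty list means both lengths are in range."""
    problems = []
    if not 50 <= len(title) <= 60:
        problems.append(f"title is {len(title)} chars; target 50-60")
    if not 150 <= len(description) <= 160:
        problems.append(f"meta description is {len(description)} chars; target 150-160")
    return problems
```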
---
## Performance Targets
| Metric | Target | Technique |
|---|---|---|
| LCP | < 1s | Preload hero image, use `priority` on Next/Image |
| CLS | < 0.1 | Set explicit width/height on all images |
| FID/INP | < 100ms | Defer non-critical JS, use `loading="lazy"` |
| TTFB | < 200ms | Use ISR or static generation for landing pages |
| Bundle | < 100KB JS | Audit with `@next/bundle-analyzer` |
---
## Common Pitfalls
- Hero image not preloaded — add `priority` prop to first `<Image>`
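- Missing mobile breakpoints — design mobile-first: unprefixed Tailwind utilities style the smallest screens, and `sm:`/`md:`/`lg:` prefixes layer on changes for larger viewports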
- CTA copy too vague — "Get started" beats "Learn more"; "Start free trial" beats "Sign up"
- Pricing page missing trust signals — add money-back guarantee and testimonials near CTA
- No above-the-fold CTA on mobile — ensure button is visible without scrolling on 375px viewport
---
## Related Skills
- **Brand Voice Analyzer** (`marketing-skill/content-production/scripts/brand_voice_analyzer.py`) — Run before generation to establish voice profile and ensure copy consistency
- **UI Design System** (`product-team/ui-design-system/`) — Generate design tokens from brand color before building the page
- **Competitive Teardown** (`product-team/competitive-teardown/`) — Competitive positioning informs landing page messaging and differentiation
FILE:references/conversion-patterns.md
# High-Converting Landing Page Patterns
## Overview
This reference catalogs proven landing page design patterns that drive higher conversion rates. Each pattern includes placement guidance, implementation notes, and A/B testing priorities.
## Hero Section Layouts
### Pattern 1: Left Copy + Right Product Screenshot
- **Best for:** SaaS products with a strong visual UI
- **Structure:** Headline, subheadline, CTA on left (60%); product screenshot on right (40%)
- **Why it works:** F-pattern reading leads with copy, product image provides proof
- **Conversion lift:** Baseline pattern, strong performer across industries
### Pattern 2: Centered Copy + Full-Width Background
- **Best for:** Brand-driven products, consumer apps
- **Structure:** Centered headline, subheadline, CTA over background image/gradient
- **Why it works:** Focuses attention on single message, high visual impact
- **Note:** Ensure text contrast against background for readability
### Pattern 3: Video Hero
- **Best for:** Complex products requiring demonstration
- **Structure:** Short headline + embedded video (60-90 seconds) + CTA below
- **Why it works:** Video explains what text cannot, increases time on page
- **Note:** Always include thumbnail; autoplay is often counterproductive
### Pattern 4: Interactive Demo
- **Best for:** Developer tools, data products, design tools
- **Structure:** Minimal copy + embedded interactive product experience
- **Why it works:** Hands-on experience converts better than description
- **Note:** Keep demo focused on one "aha moment" workflow
## Social Proof Placement
### Logo Bar
- **Position:** Immediately below hero section
- **Count:** 5-7 logos for credibility without clutter
- **Label:** "Trusted by" or "Used by teams at"
- **Selection:** Mix recognizable brands with relevant industry logos
### Testimonial Cards
- **Position:** After feature explanation sections
- **Format:** Photo + name + title + company + specific quote
- **Best quotes:** Include measurable outcomes ("Saved 10 hours/week")
- **Layout:** 2-3 testimonials in a row, carousel for more
### Case Study Callouts
- **Position:** Mid-page, before pricing
- **Format:** Company logo + headline metric + "Read the story" link
- **Example:** "Acme Corp reduced onboarding time by 60%"
### Social Proof Numbers
- **Position:** Near CTA or in dedicated trust section
- **Format:** Large number + descriptor (e.g., "50,000+ teams", "4.8/5 rating")
- **Selection:** Choose 3-4 most impressive metrics
## Pricing Table Designs
### Good/Better/Best (3-Tier)
- Most effective for SaaS with clear feature tiers
- Highlight recommended plan with visual emphasis
- Show annual discount prominently
- Include feature comparison matrix below
### Simple Two-Tier
- Free/Pro or Starter/Professional
- Best for PLG products with clear upgrade trigger
- Minimize decision fatigue
### Enterprise Custom
- Replace price with "Contact Sales" for high-ACV products
- List enterprise-specific features (SSO, SLA, dedicated support)
- Include a "Talk to Sales" CTA, not just a form
### Pricing Psychology
- Anchor with highest-priced plan first (or in the middle with visual highlight)
- Use monthly price with annual billing toggle
- Show savings percentage for annual plans
- Use charm pricing: prices ending in 9 (e.g., $49/mo, $99/mo)
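The annual savings percentage is the discount relative to paying the monthly price for twelve months. A minimal sketch, with a hypothetical helper name:

```python
def annual_savings_percent(monthly_price: float, annual_price: float) -> float:
    """Percentage saved by paying annually instead of monthly for 12 months."""
    if monthly_price <= 0:
        raise ValueError("monthly_price must be positive")
    full_year = monthly_price * 12
    return round((full_year - annual_price) / full_year * 100, 1)
```

For example, a $49/mo plan billed at $490/yr works out to about 16.7%, which would typically be displayed as "Save 17%".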
## Trust Signals
### Security Badges
- SOC 2, ISO 27001, GDPR compliance badges
- SSL certificate indicator
- Place near forms and payment sections
### Guarantees
- Money-back guarantee with specific timeframe
- Free trial with no credit card requirement
- SLA uptime commitments
### Awards & Recognition
- Industry awards (best of, top rated)
- Analyst recognition (Gartner, Forrester, G2 Leader)
- Media mentions (as seen in logos)
### Real-Time Activity
- "X people signed up today" (use real data only)
- Recent activity feed
- Live user count
## Urgency Elements
### Ethical Urgency
- Limited-time pricing (with real deadline)
- Early adopter benefits (extra features, lower price)
- Cohort-based enrollment (actual capacity limits)
### Avoid
- Fake countdown timers that reset
- False scarcity ("only 3 left" when unlimited)
- Pressure tactics that erode trust
## Form Optimization
### Field Reduction
- Each additional field tends to reduce conversion (roughly 10% per field is a common rule of thumb; measure on your own form)
- Start with email only, progressive profiling later
- Use single-column layouts for forms
### Smart Defaults
- Pre-fill country based on IP
- Auto-detect company from email domain
- Default to most popular plan
### Inline Validation
- Validate fields on blur, not on submit
- Show success states (green checkmark)
- Provide helpful error messages
### Multi-Step Forms
- Break long forms into 2-3 steps with progress indicator
- Put easiest questions first to build commitment
- Allow saving progress for complex forms
## Mobile-First Patterns
### Thumb-Friendly Design
- CTAs in thumb zone (bottom 40% of screen)
- Minimum tap target: 44x44px
- Adequate spacing between interactive elements
### Content Priority
- Lead with most compelling content (no scrolling to find CTA)
- Collapse secondary information into accordions
- Use sticky CTA bar on scroll
### Performance
- Target <3s load time on 3G
- Lazy-load images below fold
- Minimize JavaScript execution
## A/B Testing Priority Matrix
Test these elements in order of expected impact:
| Priority | Element | Expected Impact | Effort |
|----------|---------|----------------|--------|
| 1 | Headline | High | Low |
| 2 | CTA text and color | High | Low |
| 3 | Hero image/video | High | Medium |
| 4 | Social proof placement | Medium | Low |
| 5 | Form fields (fewer) | Medium | Low |
| 6 | Pricing presentation | Medium | Medium |
| 7 | Page length | Medium | High |
| 8 | Testimonial selection | Low | Low |
| 9 | Color scheme | Low | Medium |
| 10 | Font choices | Low | Low |
### Testing Best Practices
- Test one variable at a time for clear attribution
- Run tests for a minimum of 2 weeks or 1,000 visitors per variant
- Use a 95% statistical significance threshold
- Document all test results for institutional knowledge
- Winner becomes new control for next test iteration
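A 95% significance threshold corresponds to alpha = 0.05 in a two-sided test. One common way to check it for conversion data is a pooled two-proportion z-test; the sketch below uses only the standard library, and the function names are assumptions for illustration, not part of this reference.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(conv_a: int, n_a: int, conv_b: int, n_b: int, alpha: float = 0.05) -> bool:
    """True when the observed difference clears the significance threshold."""
    return two_proportion_p_value(conv_a, n_a, conv_b, n_b) < alpha
```

With 1,000 visitors per variant, 100 versus 150 conversions clears the threshold, while 100 versus 105 does not. Decide the sample size before the test starts; checking significance repeatedly mid-test inflates the false-positive rate.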
FILE:references/copy-frameworks.md
# Landing Page Copywriting Frameworks
## Overview
Four copy frameworks you can adapt. AIDA, PAS, and BAB each include a complete worked SaaS example with weak/strong contrasts for every section; 4Ps includes per-section guidelines only.
## 1. AIDA Framework (Attention - Interest - Desire - Action)
The classic direct response formula, ideal for product landing pages.
**Example — Project management SaaS:**
> **Attention:** "Your Team Loses 12 Hours Every Sprint to Status Meetings"
>
> **Interest:** "Engineering teams at Series A-C startups spend 23% of their week in sync meetings — not writing code. We tracked 847 teams over 6 months. The pattern was clear: the more people in a standup, the less code shipped that day."
>
> **Desire:** "Teams using AsyncStand ship 31% more story points per sprint. No more 15-person standups where 13 people zone out. Replace your daily sync with a 2-minute async check-in that your engineers actually complete (94% response rate vs 67% attendance for live standups)."
>
> **Action:** "Start Your Free 14-Day Trial — No Credit Card Required"
### Attention
- Lead with a specific, quantified pain point (not vague claims)
- Weak: "Save time on meetings" → Strong: "Your Team Loses 12 Hours Every Sprint to Status Meetings"
- Keep headlines under 10 words for maximum impact
### Interest
- Back up the headline with specific data or a relatable scenario
- Weak: "Meetings waste time" → Strong: "We tracked 847 teams — the more people in standup, the less code shipped that day"
- Use their language: mirror words from customer reviews, support tickets, and G2 feedback
### Desire
- Stack measurable outcomes, not features
- Weak: "AI-powered async updates" → Strong: "31% more story points per sprint, 94% response rate"
- Compare directly to the status quo they already endure
### Action
- Single, clear CTA with action-oriented verb
- Reduce friction: "No credit card required," "Set up in 2 minutes"
- Repeat CTA after each major content block
## 2. PAS Framework (Problem - Agitate - Solution)
Best for pain-point-driven products where the problem is well understood.
**Example — Expense management tool:**
> **Problem:** "Your finance team is still chasing receipts in Slack DMs."
>
> **Agitate:** "Last quarter, your team spent 46 hours manually reconciling expenses across email threads, shared drives, and 'I'll submit it later' promises. That's $4,200 in payroll — spent on data entry. And when audit season hits? Good luck finding that client dinner receipt from February."
>
> **Solution:** "Snap a photo of the receipt. ExpenseFlow auto-extracts vendor, amount, and category in 3 seconds. Your monthly close drops from 5 days to 1. 2,400 finance teams already made the switch."
### Problem
- Name the exact scenario (not the abstract category)
- Weak: "Expense tracking is hard" → Strong: "Your finance team is still chasing receipts in Slack DMs"
- Mirror language from reviews and support tickets
### Agitate
- Quantify the cost in dollars, hours, or missed opportunities
- Weak: "This costs you money" → Strong: "46 hours last quarter, $4,200 in payroll — on data entry"
- Acknowledge the workarounds they've tried and why those fail too
### Solution
- Lead with the user action, not the technology: "Snap a photo" not "AI-powered OCR"
- Include one proof point: number of customers, time saved, or before/after metric
- Make the mechanism clear in one sentence: what happens when they use it
## 3. BAB Framework (Before - After - Bridge)
Ideal for aspirational products and lifestyle-oriented landing pages.
**Example — Sales enablement platform:**
> **Before:** "It's 9 PM. You're rebuilding a deck for tomorrow's demo because the prospect is in healthcare, not fintech. You copy-paste from three old decks, pray the logos are right, and rehearse the new talk track in the shower."
>
> **After:** "It's 9 AM. You type 'healthcare, 200-bed hospital, HIPAA-concerned CTO.' DeckGen builds your slides in 40 seconds — case studies, compliance badges, ROI calculator pre-loaded. You walk into the call with the best deck your prospect has ever seen."
>
> **Bridge:** "DeckGen connects to your CRM, learns your win patterns, and generates prospect-specific decks in under a minute. 340 AEs at companies like Stripe and Notion already use it. Start free — your first 5 decks are on us."
### Before
- Describe a specific, lived moment — not an abstract pain category
- Weak: "Sales decks take too long" → Strong: "It's 9 PM. You're rebuilding a deck for tomorrow's demo..."
- Use second person and present tense to make it feel immediate
### After
- Same level of specificity — show the transformed version of that exact moment
- Include a measurable outcome: "40 seconds," "best deck your prospect has ever seen"
- The after state should feel effortless compared to the before
### Bridge
- Name the product explicitly and explain the mechanism in one sentence
- Include one social proof data point
- End with a low-friction CTA that connects to the after state
## 4. 4Ps Framework (Promise - Picture - Proof - Push)
Strong for SaaS and B2B landing pages with measurable outcomes.
### Promise
- Make a clear, specific, believable promise
- Tie it to a measurable outcome
- Example: "Reduce customer churn by 25% in 90 days"
### Picture
- Help the reader visualize success
- Use scenarios they can relate to
- Show the product in context (screenshots, demos)
### Proof
- Back the promise with evidence
- Customer testimonials with specific results
- Case studies with before/after metrics
- Third-party validation (awards, analyst reports)
### Push
- Give a compelling reason to act now
- Limited-time offer, bonus, or guarantee
- Risk reversal (money-back guarantee, free trial)
## Headline Formulas
### Benefit-Driven
- "Get [Desired Outcome] Without [Common Objection]"
- "[Specific Result] in [Timeframe]"
- "The [Adjective] Way to [Achieve Goal]"
### Problem-Driven
- "Stop [Painful Activity]. Start [Better Alternative]."
- "Tired of [Problem]? There's a Better Way."
- "[Problem]? Not Anymore."
### Social Proof-Driven
- "[Number] Teams Trust [Product] to [Outcome]"
- "Why [Notable Company] Switched to [Product]"
- "Rated #1 for [Category] by [Authority]"
### Question-Driven
- "What If You Could [Desirable Outcome]?"
- "Ready to [Transformation]?"
- "Still [Painful Status Quo]?"
## CTA Best Practices
### Language
- Use first-person: "Start My Free Trial" > "Start Your Free Trial"
- Be specific: "Get My Report" > "Submit"
- Include benefit: "Start Saving Time" > "Sign Up"
- Add urgency naturally: "Start Free Today" > "Sign Up Now!!!"
### Placement
- Primary CTA above the fold
- Repeat after each major content section
- Sticky CTA on scroll (mobile especially)
- Exit-intent as last chance
### Design
- High contrast color (stands out from page palette)
- Sufficient whitespace around the button
- Large enough to tap on mobile (min 44x44px)
- Micro-copy below button to reduce anxiety ("No credit card required")
## Above-the-Fold Principles
The first viewport must accomplish these goals within 5 seconds:
1. **Communicate what you do** - Clear, jargon-free headline
2. **Show who it's for** - Audience identification
3. **Demonstrate value** - Primary benefit or outcome
4. **Provide next step** - Visible CTA button
5. **Build credibility** - One trust signal (logo bar, metric, badge)
### Above-the-Fold Checklist
- [ ] Headline states primary benefit (under 10 words)
- [ ] Subheadline adds specificity or addresses objection
- [ ] Hero image/video shows product in use
- [ ] CTA button is visible without scrolling
- [ ] At least one trust signal present
- [ ] No jargon or ambiguity in messaging
FILE:references/landing-page-patterns.md
# Landing Page Patterns
This reference captures high-converting page patterns and copy structures.
## Hero Section Patterns
### Pattern 1: Problem-Solution Hero
- Headline names the painful problem.
- Subheadline states the clear outcome.
- Primary CTA starts immediately.
- Optional supporting visual demonstrates product in context.
### Pattern 2: Outcome-First Hero
- Headline leads with measurable value.
- Subheadline clarifies who the page is for.
- CTA is action-oriented and specific.
### Pattern 3: Authority Hero
- Headline + trust indicator (logos, testimonial snippet, proof metric).
- Useful when category skepticism is high.
## Social Proof Layouts
### Logo Strip + Proof Metric
- Keep to recognizable logos.
- Add one proof metric (e.g., active users, revenue saved, hours reduced).
### Testimonial Grid
- 3-6 testimonials across segments.
- Include role/company where possible.
- Prefer concrete outcomes over generic praise.
### Case Study Snapshot
- Mini blocks: challenge -> approach -> measurable result.
## CTA Best Practices
- Use one dominant CTA per section.
- Match CTA verb to user intent ("Start trial", "Get demo", "Run audit").
- Keep CTA copy specific; avoid vague labels like "Submit".
- Reduce friction near CTA (short form, trust indicators, no surprise commitments).
## Above-the-Fold Checklist
- [ ] Clear value proposition in first viewport
- [ ] Audience clarity (who this is for)
- [ ] One primary CTA visible without scrolling
- [ ] Proof element (logos, stat, quote)
- [ ] Visual hierarchy emphasizes headline + CTA
- [ ] Mobile layout keeps CTA accessible
## Conversion-Optimized Templates
### SaaS Demo Page
1. Hero with problem-solution framing
2. Product walkthrough section
3. Social proof strip
4. Benefits by persona
5. Objection handling FAQ
6. Final CTA
### Lead Magnet Page
1. Promise + asset preview
2. Bullet outcomes
3. Short form
4. Trust/privacy note
### Product Launch Page
1. Outcome-first hero
2. Why now / differentiation
3. Feature blocks
4. Testimonials / beta feedback
5. Pricing or waitlist CTA
## Headline Formulas
### PAS (Problem-Agitate-Solution)
- Problem: identify the pain
- Agitate: show consequences of inaction
- Solution: position the offer as relief
Example structure:
"Still [problem]? Stop [negative consequence] and start [desired outcome]."
### AIDA (Attention-Interest-Desire-Action)
- Attention: pattern interrupt headline
- Interest: relevant context and stakes
- Desire: proof and benefits
- Action: concrete next step
### 4U Formula
- Useful: clear practical value
- Urgent: reason to act now
- Unique: differentiated promise
- Ultra-specific: concrete outcome and scope
Example structure:
"Get [specific result] in [timeframe] without [common pain]."
FILE:references/seo-checklist.md
# Landing Page SEO Checklist
## Overview
This checklist ensures landing pages are optimized for search engine visibility while maintaining conversion focus. Apply these checks before launching any landing page.
## Meta Tags
- [ ] **Title tag**: Under 60 characters, includes primary keyword, ends with brand name
- [ ] **Meta description**: 150-160 characters, includes CTA language, unique per page
- [ ] **Canonical URL**: Set to prevent duplicate content issues
- [ ] **Robots meta**: Ensure page is indexable (`index, follow`) unless intentionally noindex
- [ ] **Open Graph tags**: og:title, og:description, og:image, og:url for social sharing
- [ ] **Twitter Card tags**: twitter:card, twitter:title, twitter:description, twitter:image
- [ ] **Viewport meta**: `<meta name="viewport" content="width=device-width, initial-scale=1">`
## Structured Data
- [ ] **Organization schema**: Company name, logo, social profiles
- [ ] **Product schema**: Name, description, price, availability (for product pages)
- [ ] **FAQ schema**: For pages with FAQ sections (rich snippet opportunity)
- [ ] **Breadcrumb schema**: Navigation path for deep pages
- [ ] **Review schema**: Aggregate rating if testimonials present (use carefully per guidelines)
- [ ] **Validate**: Test all structured data with Google Rich Results Test
## Core Web Vitals Targets
### Largest Contentful Paint (LCP) - Target: < 2.5s (Google's "good" threshold; this skill's stricter goal is < 1s)
- [ ] Optimize hero image (WebP format, proper dimensions)
- [ ] Preload critical resources (`<link rel="preload">`)
- [ ] Use CDN for static assets
- [ ] Minimize render-blocking CSS and JavaScript
### Interaction to Next Paint (INP; replaced FID as a Core Web Vital in March 2024) - Target: < 200ms
- [ ] Defer non-critical JavaScript
- [ ] Break up long tasks (>50ms)
- [ ] Minimize third-party script impact
- [ ] Use `requestAnimationFrame` for visual updates
### Cumulative Layout Shift (CLS) - Target: < 0.1
- [ ] Set explicit width/height on images and videos
- [ ] Reserve space for dynamic content (ads, embeds)
- [ ] Use `font-display: swap` for web fonts
- [ ] Avoid inserting content above existing content
## Keyword Placement
- [ ] **H1 tag**: Contains primary keyword, one per page only
- [ ] **H2 tags**: Include secondary keywords naturally
- [ ] **First paragraph**: Primary keyword appears in first 100 words
- [ ] **Body copy**: Natural keyword density (1-2%), no stuffing
- [ ] **Image alt text**: Descriptive, includes keyword where relevant
- [ ] **URL slug**: Short, keyword-rich, hyphen-separated
- [ ] **CTA text**: Consider keyword inclusion where natural
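Keyword density can be estimated with a short script. Definitions vary between tools; this sketch assumes density means the words belonging to keyword matches divided by total words, and the `keyword_density` helper is hypothetical.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Percentage of words in `text` that belong to occurrences of `keyword`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    kw = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not kw:
        return 0.0
    n = len(kw)
    # Count positions where the keyword's word sequence appears.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == kw)
    return hits * n / len(words) * 100
```

A result well above the 1-2% range is a signal to reread the copy for stuffing rather than a hard failure.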
## Internal Linking
- [ ] Link to relevant product/feature pages
- [ ] Link to blog content that supports the page topic
- [ ] Use descriptive anchor text (not "click here")
- [ ] Ensure landing page is linked from main navigation or sitemap
- [ ] Link to pricing page if applicable
- [ ] Limit links to avoid diluting page authority (15-20 max)
## Image Optimization
- [ ] **Format**: Use WebP with JPEG/PNG fallback
- [ ] **Compression**: Lossy compression for photos, lossless for graphics
- [ ] **Dimensions**: Serve at exact display size (no CSS resizing)
- [ ] **Alt text**: Descriptive, 125 characters max, natural keyword inclusion
- [ ] **File names**: Descriptive, hyphenated (e.g., `product-dashboard-screenshot.webp`)
- [ ] **Lazy loading**: Apply to images below the fold (`loading="lazy"`)
- [ ] **Responsive images**: Use `srcset` for different viewport sizes
## Canonical URLs
- [ ] Self-referencing canonical on every page
- [ ] Consistent protocol (https) and trailing slash usage
- [ ] Canonical points to preferred URL version (www vs non-www)
- [ ] UTM parameters excluded from canonical URL
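- [ ] Pagination handled with rel="next"/"prev" or single-page canonical (note: Google has not used rel="next"/"prev" as an indexing signal since 2019, though other search engines may)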
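Several of these rules can be applied when deriving the canonical URL from a request URL. The sketch below uses the standard library `urllib.parse`; the `canonical_url` helper is hypothetical and covers only https, host casing, UTM stripping, and fragment removal. It does not choose between www and non-www or enforce a trailing-slash policy.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Force https, lowercase the host, and drop utm_* parameters and fragments."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    # An empty path becomes "/" so the bare domain has one consistent form.
    return urlunsplit(("https", parts.netloc.lower(), parts.path or "/", urlencode(query), ""))
```

For instance, `http://Example.com/pricing?utm_source=x&plan=pro#top` becomes `https://example.com/pricing?plan=pro`.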
## Mobile Responsiveness
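- [ ] **Mobile-friendly test**: Pass a mobile usability audit (e.g., Lighthouse in Chrome DevTools; Google retired its standalone Mobile-Friendly Test in December 2023)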
- [ ] **Touch targets**: Minimum 44x44px, 8px spacing between targets
- [ ] **Font size**: Minimum 16px base font, no pinch-to-zoom needed
- [ ] **Content parity**: All critical content accessible on mobile
- [ ] **Horizontal scroll**: None present at any viewport width
- [ ] **Form usability**: Appropriate input types (email, tel), autocomplete attributes
- [ ] **Media queries**: Breakpoints at 480px, 768px, 1024px, 1200px minimum
## Technical SEO
- [ ] **HTTPS**: SSL certificate valid and active
- [ ] **Page speed**: < 3s load time on mobile (test with PageSpeed Insights)
- [ ] **XML sitemap**: Page included in sitemap.xml
- [ ] **Robots.txt**: Page not blocked by robots.txt
- [ ] **404 handling**: Custom 404 page with navigation
- [ ] **Redirect chains**: No more than 1 redirect hop
- [ ] **Hreflang**: Set for multi-language landing pages
## Content Quality Signals
- [ ] **Unique content**: No duplicate content from other pages
- [ ] **Content depth**: Sufficient content for topic coverage (500+ words for SEO pages)
- [ ] **Readability**: Grade level 6-8 for broad audiences
- [ ] **Freshness**: Last modified date reflects recent updates
- [ ] **E-E-A-T signals**: Author expertise, company authority, trust indicators
FILE:scripts/landing_page_scaffolder.py
#!/usr/bin/env python3
"""Landing Page Scaffolder — Generate landing pages as HTML or Next.js TSX from config.
Creates production-ready landing pages with hero sections, features,
testimonials, pricing, CTAs, and responsive design.
Usage:
python landing_page_scaffolder.py config.json --format html --output page.html
python landing_page_scaffolder.py config.json --format tsx --output LandingPage.tsx
python landing_page_scaffolder.py config.json --format json
"""
import argparse
import json
from typing import Any, Dict
from datetime import datetime
import html as html_module
def escape(text: str) -> str:
"""HTML-escape text."""
return html_module.escape(str(text))
# ---------------------------------------------------------------------------
# Tailwind style mappings for TSX output
# ---------------------------------------------------------------------------
DESIGN_STYLES = {
"dark-saas": {
"bg": "bg-gray-950", "text": "text-white",
"accent": "violet", "card_bg": "bg-gray-900 border border-gray-800",
"btn": "bg-violet-600 hover:bg-violet-500 text-white",
"btn_secondary": "border border-gray-700 text-gray-300 hover:bg-gray-800",
"section_alt": "bg-gray-900/50", "muted": "text-gray-400",
"border": "border-gray-800",
},
"clean-minimal": {
"bg": "bg-white", "text": "text-gray-900",
"accent": "blue", "card_bg": "bg-gray-50 border border-gray-200 rounded-2xl",
"btn": "bg-blue-600 hover:bg-blue-700 text-white",
"btn_secondary": "border border-gray-300 text-gray-700 hover:bg-gray-50",
"section_alt": "bg-gray-50", "muted": "text-gray-500",
"border": "border-gray-200",
},
"bold-startup": {
"bg": "bg-white", "text": "text-gray-900",
"accent": "orange", "card_bg": "shadow-xl rounded-3xl bg-white",
"btn": "bg-orange-500 hover:bg-orange-600 text-white",
"btn_secondary": "border-2 border-orange-500 text-orange-600 hover:bg-orange-50",
"section_alt": "bg-orange-50/30", "muted": "text-gray-500",
"border": "border-gray-200",
},
"enterprise": {
"bg": "bg-slate-50", "text": "text-slate-900",
"accent": "slate", "card_bg": "bg-white border border-slate-200 shadow-sm",
"btn": "bg-slate-900 hover:bg-slate-800 text-white",
"btn_secondary": "border border-slate-300 text-slate-700 hover:bg-slate-100",
"section_alt": "bg-white", "muted": "text-slate-500",
"border": "border-slate-200",
},
}
# ---------------------------------------------------------------------------
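# TSX generators
# NOTE: config values are interpolated into the JSX as-is (unlike the HTML
# generators below, which call escape()), so text containing <, >, {, or }
# will produce invalid TSX. Sanitize such values in the config first.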
# ---------------------------------------------------------------------------
def tsx_nav(config: Dict[str, Any], style: Dict[str, str]) -> str:
brand = config.get("brand", "Brand")
nav_links = config.get("nav_links", [])
cta = config.get("nav_cta", {"text": "Get Started", "url": "#"})
links_jsx = "\n ".join(
f'<a href="{l.get("url", "#")}" className="{style["muted"]} hover:{style["text"]} font-medium transition-colors">{l.get("text", "")}</a>'
for l in nav_links
)
return f'''function Navbar() {{
return (
<nav className="sticky top-0 z-50 {style["bg"]} border-b {style["border"]} backdrop-blur-sm">
<div className="mx-auto flex max-w-7xl items-center justify-between px-6 py-4">
<a href="#" className="text-xl font-bold {style["text"]}">{brand}</a>
<div className="hidden items-center gap-8 md:flex">
{links_jsx}
<a href="{cta.get("url", "#")}" className="rounded-lg {style["btn"]} px-5 py-2.5 text-sm font-semibold transition-colors">
{cta.get("text", "Get Started")}
</a>
</div>
</div>
</nav>
);
}}'''
def tsx_hero(hero: Dict[str, Any], style: Dict[str, str]) -> str:
h1 = hero.get("headline", "Your Headline Here")
sub = hero.get("subheadline", "")
primary_cta = hero.get("primary_cta", {"text": "Get Started", "url": "#"})
secondary_cta = hero.get("secondary_cta", None)
secondary_jsx = ""
if secondary_cta:
secondary_jsx = f'''
<a href="{secondary_cta.get("url", "#")}" className="rounded-lg {style["btn_secondary"]} px-8 py-3 text-lg font-semibold transition-colors">
{secondary_cta.get("text", "Learn More")}
</a>'''
return f'''function Hero() {{
return (
<section className="flex min-h-[80vh] flex-col items-center justify-center px-6 py-24 text-center {style["bg"]}">
<div className="mx-auto max-w-4xl">
<h1 className="mb-6 text-5xl font-bold tracking-tight {style["text"]} md:text-7xl">
{h1}
</h1>
<p className="mx-auto mb-10 max-w-2xl text-xl {style["muted"]}">
{sub}
</p>
<div className="flex flex-col items-center gap-4 sm:flex-row sm:justify-center">
<a href="{primary_cta.get("url", "#")}" className="rounded-lg {style["btn"]} px-8 py-3 text-lg font-semibold transition-colors">
{primary_cta.get("text", "Get Started")}
</a>{secondary_jsx}
</div>
</div>
</section>
);
}}'''
def tsx_features(features: Dict[str, Any], style: Dict[str, str]) -> str:
title = features.get("title", "Features")
subtitle = features.get("subtitle", "")
items = features.get("items", [])
cards_jsx = "\n ".join(
f'''<div className="{style["card_bg"]} rounded-xl p-8">
<div className="mb-4 text-3xl">{f.get("icon", "")}</div>
<h3 className="mb-3 text-xl font-semibold {style["text"]}">{f.get("title", "")}</h3>
<p className="{style["muted"]}">{f.get("description", "")}</p>
</div>'''
for f in items
)
return f'''function Features() {{
return (
<section className="{style["section_alt"]} px-6 py-24">
<div className="mx-auto max-w-7xl">
<h2 className="mb-4 text-center text-4xl font-bold {style["text"]}">{title}</h2>
<p className="mx-auto mb-16 max-w-2xl text-center text-lg {style["muted"]}">{subtitle}</p>
<div className="grid gap-8 md:grid-cols-2 lg:grid-cols-3">
{cards_jsx}
</div>
</div>
</section>
);
}}'''
def tsx_testimonials(testimonials: Dict[str, Any], style: Dict[str, str]) -> str:
title = testimonials.get("title", "What Our Customers Say")
items = testimonials.get("items", [])
if not items:
return ""
cards_jsx = "\n ".join(
f'''<div className="rounded-xl border {style["border"]} p-8">
<p className="mb-6 text-lg italic {style["muted"]}">"{t.get("quote", "")}"</p>
<div>
<p className="font-semibold {style["text"]}">{t.get("name", "")}</p>
<p className="text-sm {style["muted"]}">{t.get("title", "")}, {t.get("company", "")}</p>
</div>
</div>'''
for t in items
)
return f'''function Testimonials() {{
return (
<section className="px-6 py-24 {style["bg"]}">
<div className="mx-auto max-w-7xl">
<h2 className="mb-16 text-center text-4xl font-bold {style["text"]}">{title}</h2>
<div className="grid gap-8 md:grid-cols-2 lg:grid-cols-3">
{cards_jsx}
</div>
</div>
</section>
);
}}'''
def tsx_pricing(pricing: Dict[str, Any], style: Dict[str, str]) -> str:
title = pricing.get("title", "Pricing")
plans = pricing.get("plans", [])
if not plans:
return ""
accent = style["accent"]
cards = []
for p in plans:
featured = p.get("featured", False)
border_cls = f"border-2 border-{accent}-500 ring-4 ring-{accent}-500/20" if featured else f"border {style['border']}"
badge = f'\n <div className="absolute -top-3 left-1/2 -translate-x-1/2 rounded-full bg-{accent}-600 px-4 py-1 text-xs font-semibold text-white">Most Popular</div>' if featured else ""
features_jsx = "\n ".join(
f'<li className="flex items-center gap-2 py-2"><span className="text-{accent}-500 font-bold">✓</span> {feat}</li>'
for feat in p.get("features", [])
)
cards.append(f'''<div className="relative rounded-2xl {border_cls} {style["card_bg"]} p-8 text-center">{badge}
<h3 className="mb-2 text-xl font-semibold {style["text"]}">{p.get("name", "")}</h3>
<div className="my-6 text-5xl font-extrabold {style["text"]}">{p.get("price", "0")}<span className="text-base font-normal {style["muted"]}">/mo</span></div>
<p className="{style["muted"]} mb-6">{p.get("description", "")}</p>
<ul className="mb-8 space-y-1 text-left {style["muted"]}">
{features_jsx}
</ul>
<a href="{p.get("cta_url", "#")}" className="block w-full rounded-lg {style["btn"]} py-3 text-center font-semibold transition-colors">
{p.get("cta_text", "Choose Plan")}
</a>
</div>''')
cards_jsx = "\n ".join(cards)
return f'''function Pricing() {{
return (
<section className="{style["section_alt"]} px-6 py-24">
<div className="mx-auto max-w-5xl">
<h2 className="mb-16 text-center text-4xl font-bold {style["text"]}">{title}</h2>
<div className="grid gap-8 lg:grid-cols-{min(len(plans), 3)}">
{cards_jsx}
</div>
</div>
</section>
);
}}'''
def tsx_cta(cta: Dict[str, Any], style: Dict[str, str]) -> str:
accent = style["accent"]
return f'''function CTASection() {{
return (
<section className="bg-{accent}-600 px-6 py-24 text-center text-white">
<div className="mx-auto max-w-3xl">
<h2 className="mb-4 text-4xl font-bold">{cta.get("headline", "Ready to get started?")}</h2>
<p className="mb-10 text-xl opacity-90">{cta.get("subheadline", "")}</p>
<a href="{cta.get("url", "#")}" className="rounded-lg bg-white px-8 py-3 text-lg font-semibold text-{accent}-600 transition-colors hover:bg-gray-100">
{cta.get("text", "Start Free Trial")}
</a>
</div>
</section>
);
}}'''
def tsx_footer(config: Dict[str, Any], style: Dict[str, str]) -> str:
brand = config.get("brand", "Company")
year = datetime.now().year
footer_text = config.get("footer_text", f"{year} {brand}. All rights reserved.")
return f'''function Footer() {{
return (
<footer className="border-t {style["border"]} {style["bg"]} px-6 py-10 text-center {style["muted"]}">
<p>© {footer_text}</p>
</footer>
);
}}'''
def generate_tsx(config: Dict[str, Any]) -> str:
"""Generate complete Next.js/React TSX landing page with Tailwind CSS."""
style_name = config.get("design_style", "clean-minimal")
style = DESIGN_STYLES.get(style_name, DESIGN_STYLES["clean-minimal"])
components = []
component_names = []
components.append(tsx_nav(config, style))
component_names.append("Navbar")
if config.get("hero"):
components.append(tsx_hero(config["hero"], style))
component_names.append("Hero")
if config.get("features"):
components.append(tsx_features(config["features"], style))
component_names.append("Features")
if config.get("testimonials") and config["testimonials"].get("items"):
components.append(tsx_testimonials(config["testimonials"], style))
component_names.append("Testimonials")
if config.get("pricing") and config["pricing"].get("plans"):
components.append(tsx_pricing(config["pricing"], style))
component_names.append("Pricing")
if config.get("cta"):
components.append(tsx_cta(config["cta"], style))
component_names.append("CTASection")
components.append(tsx_footer(config, style))
component_names.append("Footer")
title = config.get("title", "Landing Page")
meta_desc = config.get("meta_description", "")
page_body = "\n ".join(f"<{name} />" for name in component_names)
all_components = "\n\n".join(components)
return f'''// Generated by Landing Page Scaffolder — {datetime.now().strftime("%Y-%m-%d")}
// Stack: Next.js 14+ App Router, React, Tailwind CSS
// Design style: {style_name}
import type {{ Metadata }} from "next";
export const metadata: Metadata = {{
title: {json.dumps(title)},
description: {json.dumps(meta_desc)},
openGraph: {{
title: {json.dumps(title)},
description: {json.dumps(meta_desc)},
type: "website",
}},
}};
{all_components}
export default function LandingPage() {{
return (
<main>
{page_body}
</main>
);
}}
'''
# ---------------------------------------------------------------------------
# HTML generators (existing)
# ---------------------------------------------------------------------------
def generate_css(config: Dict[str, Any]) -> str:
"""Generate responsive CSS from config theme."""
theme = config.get("theme", {})
primary = theme.get("primary_color", "#2563eb")
secondary = theme.get("secondary_color", "#1e40af")
bg = theme.get("background", "#ffffff")
text_color = theme.get("text_color", "#1f2937")
font = theme.get("font", "Inter, system-ui, -apple-system, sans-serif")
return f"""
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{ font-family: {font}; color: {text_color}; background: {bg}; line-height: 1.6; }}
.container {{ max-width: 1200px; margin: 0 auto; padding: 0 24px; }}
nav {{ padding: 16px 0; border-bottom: 1px solid #e5e7eb; position: sticky; top: 0; background: {bg}; z-index: 100; }}
nav .container {{ display: flex; justify-content: space-between; align-items: center; }}
.nav-logo {{ font-size: 1.5rem; font-weight: 700; color: {primary}; text-decoration: none; }}
.nav-links {{ display: flex; gap: 24px; list-style: none; }}
.nav-links a {{ text-decoration: none; color: {text_color}; font-weight: 500; }}
.nav-cta {{ background: {primary}; color: white; padding: 8px 20px; border-radius: 6px; text-decoration: none; font-weight: 600; }}
.hero {{ padding: 80px 0; text-align: center; }}
.hero h1 {{ font-size: 3.5rem; font-weight: 800; line-height: 1.1; margin-bottom: 24px; max-width: 800px; margin-left: auto; margin-right: auto; }}
.hero p {{ font-size: 1.25rem; color: #6b7280; max-width: 600px; margin: 0 auto 32px; }}
.hero-cta {{ display: inline-flex; gap: 16px; }}
.btn-primary {{ background: {primary}; color: white; padding: 14px 32px; border-radius: 8px; text-decoration: none; font-weight: 600; font-size: 1.1rem; }}
.btn-secondary {{ background: transparent; color: {primary}; padding: 14px 32px; border-radius: 8px; text-decoration: none; font-weight: 600; font-size: 1.1rem; border: 2px solid {primary}; }}
.features {{ padding: 80px 0; background: #f9fafb; }}
.section-title {{ text-align: center; font-size: 2.5rem; font-weight: 700; margin-bottom: 16px; }}
.section-subtitle {{ text-align: center; color: #6b7280; font-size: 1.1rem; margin-bottom: 48px; max-width: 600px; margin-left: auto; margin-right: auto; }}
.features-grid {{ display: grid; grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); gap: 32px; }}
.feature-card {{ background: white; padding: 32px; border-radius: 12px; box-shadow: 0 1px 3px rgba(0,0,0,0.1); }}
.feature-icon {{ font-size: 2rem; margin-bottom: 16px; }}
.feature-card h3 {{ font-size: 1.25rem; margin-bottom: 12px; }}
.feature-card p {{ color: #6b7280; }}
.testimonials {{ padding: 80px 0; }}
.testimonials-grid {{ display: grid; grid-template-columns: repeat(auto-fit, minmax(350px, 1fr)); gap: 24px; }}
.testimonial-card {{ padding: 32px; border: 1px solid #e5e7eb; border-radius: 12px; }}
.testimonial-text {{ font-size: 1.1rem; font-style: italic; margin-bottom: 20px; }}
.testimonial-author {{ display: flex; align-items: center; gap: 12px; }}
.author-info strong {{ display: block; }}
.author-info span {{ color: #6b7280; font-size: 0.9rem; }}
.pricing {{ padding: 80px 0; background: #f9fafb; }}
.pricing-grid {{ display: grid; grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); gap: 24px; max-width: 900px; margin: 0 auto; }}
.pricing-card {{ background: white; padding: 32px; border-radius: 12px; border: 2px solid #e5e7eb; text-align: center; }}
.pricing-card.featured {{ border-color: {primary}; position: relative; }}
.pricing-card.featured::before {{ content: "Most Popular"; position: absolute; top: -12px; left: 50%; transform: translateX(-50%); background: {primary}; color: white; padding: 4px 16px; border-radius: 20px; font-size: 0.8rem; font-weight: 600; }}
.pricing-name {{ font-size: 1.25rem; font-weight: 600; margin-bottom: 8px; }}
.pricing-price {{ font-size: 3rem; font-weight: 800; margin: 16px 0; }}
.pricing-price span {{ font-size: 1rem; font-weight: 400; color: #6b7280; }}
.pricing-features {{ list-style: none; text-align: left; margin: 24px 0; }}
.pricing-features li {{ padding: 8px 0; border-bottom: 1px solid #f3f4f6; }}
.pricing-features li::before {{ content: "\\2713 "; color: {primary}; font-weight: 700; }}
.cta-section {{ padding: 80px 0; text-align: center; background: {primary}; color: white; }}
.cta-section h2 {{ font-size: 2.5rem; margin-bottom: 16px; }}
.cta-section p {{ font-size: 1.1rem; opacity: 0.9; margin-bottom: 32px; }}
.btn-white {{ background: white; color: {primary}; padding: 14px 32px; border-radius: 8px; text-decoration: none; font-weight: 600; font-size: 1.1rem; }}
footer {{ padding: 40px 0; border-top: 1px solid #e5e7eb; color: #6b7280; text-align: center; }}
@media (max-width: 768px) {{
.hero h1 {{ font-size: 2.25rem; }}
.hero-cta {{ flex-direction: column; align-items: center; }}
.nav-links {{ display: none; }}
.features-grid {{ grid-template-columns: 1fr; }}
.pricing-grid {{ grid-template-columns: 1fr; }}
}}
"""
def render_nav(config: Dict[str, Any]) -> str:
brand = escape(config.get("brand", "Brand"))
nav_links = config.get("nav_links", [])
cta = config.get("nav_cta", {"text": "Get Started", "url": "#"})
links = "\n".join(
f'<li><a href="{escape(l.get("url", "#"))}">{escape(l.get("text", ""))}</a></li>'
for l in nav_links
)
return f"""
<nav><div class="container">
<a href="#" class="nav-logo">{brand}</a>
<ul class="nav-links">{links}</ul>
<a href="{escape(cta.get('url', '#'))}" class="nav-cta">{escape(cta.get('text', 'Get Started'))}</a>
</div></nav>"""
def render_hero(hero: Dict[str, Any]) -> str:
h1 = escape(hero.get("headline", "Your Headline Here"))
sub = escape(hero.get("subheadline", ""))
primary_cta = hero.get("primary_cta", {"text": "Get Started", "url": "#"})
secondary_cta = hero.get("secondary_cta", None)
cta_html = f'<a href="{escape(primary_cta.get("url", "#"))}" class="btn-primary">{escape(primary_cta.get("text", "Get Started"))}</a>'
if secondary_cta:
cta_html += f'\n<a href="{escape(secondary_cta.get("url", "#"))}" class="btn-secondary">{escape(secondary_cta.get("text", "Learn More"))}</a>'
return f"""
<section class="hero"><div class="container">
<h1>{h1}</h1>
<p>{sub}</p>
<div class="hero-cta">{cta_html}</div>
</div></section>"""
def render_features(features: Dict[str, Any]) -> str:
title = escape(features.get("title", "Features"))
subtitle = escape(features.get("subtitle", ""))
items = features.get("items", [])
cards = "\n".join(f"""
<div class="feature-card">
<div class="feature-icon">{escape(f.get('icon', ''))}</div>
<h3>{escape(f.get('title', ''))}</h3>
<p>{escape(f.get('description', ''))}</p>
</div>""" for f in items)
return f"""
<section class="features"><div class="container">
<h2 class="section-title">{title}</h2>
<p class="section-subtitle">{subtitle}</p>
<div class="features-grid">{cards}</div>
</div></section>"""
def render_testimonials(testimonials: Dict[str, Any]) -> str:
title = escape(testimonials.get("title", "What Our Customers Say"))
items = testimonials.get("items", [])
if not items:
return ""
cards = "\n".join(f"""
<div class="testimonial-card">
<p class="testimonial-text">"{escape(t.get('quote', ''))}"</p>
<div class="testimonial-author">
<div class="author-info">
<strong>{escape(t.get('name', ''))}</strong>
<span>{escape(t.get('title', ''))}, {escape(t.get('company', ''))}</span>
</div>
</div>
</div>""" for t in items)
return f"""
<section class="testimonials"><div class="container">
<h2 class="section-title">{title}</h2>
<div class="testimonials-grid">{cards}</div>
</div></section>"""
def render_pricing(pricing: Dict[str, Any]) -> str:
title = escape(pricing.get("title", "Pricing"))
plans = pricing.get("plans", [])
if not plans:
return ""
cards = "\n".join(f"""
<div class="pricing-card {'featured' if p.get('featured') else ''}">
<div class="pricing-name">{escape(p.get('name', ''))}</div>
<div class="pricing-price">{escape(str(p.get('price', '0')))}<span>/mo</span></div>
<p>{escape(p.get('description', ''))}</p>
<ul class="pricing-features">
{"".join(f'<li>{escape(f)}</li>' for f in p.get('features', []))}
</ul>
<a href="{escape(p.get('cta_url', '#'))}" class="btn-primary">{escape(p.get('cta_text', 'Choose Plan'))}</a>
</div>""" for p in plans)
return f"""
<section class="pricing"><div class="container">
<h2 class="section-title">{title}</h2>
<div class="pricing-grid">{cards}</div>
</div></section>"""
def render_cta(cta: Dict[str, Any]) -> str:
return f"""
<section class="cta-section"><div class="container">
<h2>{escape(cta.get('headline', 'Ready to get started?'))}</h2>
<p>{escape(cta.get('subheadline', ''))}</p>
<a href="{escape(cta.get('url', '#'))}" class="btn-white">{escape(cta.get('text', 'Start Free Trial'))}</a>
</div></section>"""
def generate_html(config: Dict[str, Any]) -> str:
"""Generate complete HTML landing page."""
title = escape(config.get("title", "Landing Page"))
css = generate_css(config)
sections = []
sections.append(render_nav(config))
if config.get("hero"):
sections.append(render_hero(config["hero"]))
if config.get("features"):
sections.append(render_features(config["features"]))
if config.get("testimonials"):
sections.append(render_testimonials(config["testimonials"]))
if config.get("pricing"):
sections.append(render_pricing(config["pricing"]))
if config.get("cta"):
sections.append(render_cta(config["cta"]))
sections.append(f"""
<footer><div class="container">
<p>{escape(config.get('footer_text', f'{datetime.now().year} {config.get("brand", "Company")}. All rights reserved.'))}</p>
</div></footer>""")
return f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{title}</title>
<meta name="description" content="{escape(config.get('meta_description', ''))}">
<style>{css}</style>
</head>
<body>
{"".join(sections)}
</body>
</html>"""
def main():
parser = argparse.ArgumentParser(
description="Generate landing pages as HTML or Next.js TSX with Tailwind CSS"
)
parser.add_argument("input", help="Path to page config JSON")
parser.add_argument(
"--format", choices=["html", "tsx", "json"], default="tsx",
help="Output format: tsx (Next.js + Tailwind), html (standalone), json (metadata)"
)
parser.add_argument("--output", type=str, default=None, help="Output file path")
args = parser.parse_args()
with open(args.input, encoding="utf-8") as f:
config = json.load(f)
if args.format == "json":
output = json.dumps({
"generated_at": datetime.now().isoformat(),
"config": config,
"formats_available": ["html", "tsx"],
"sections": [k for k in ["nav", "hero", "features", "testimonials", "pricing", "cta", "footer"]
if config.get(k) or k in ("nav", "footer")]
}, indent=2)
elif args.format == "tsx":
output = generate_tsx(config)
else:
output = generate_html(config)
if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(output)
print(f"Landing page written to {args.output}")
else:
print(output)
if __name__ == "__main__":
main()
Personal leadership development for founders and first-time CEOs. Covers founder archetype identification, delegation frameworks, energy management, CEO cale...
---
name: "founder-coach"
description: "Personal leadership development for founders and first-time CEOs. Covers founder archetype identification, delegation frameworks, energy management, CEO calendar audits, leadership style evolution, blind spot identification, imposter syndrome, founder mental health, and succession planning. Use when a founder feels like the bottleneck, struggles to delegate, is burning out, transitioning from IC to executive, managing a board, or when user mentions founder mode, CEO growth, leadership development, delegation, burnout, or imposter syndrome."
license: MIT
metadata:
version: 1.0.0
author: Alireza Rezvani
category: c-level
domain: founder-development
updated: 2026-03-05
frameworks: leadership-growth, founder-toolkit
---
# Founder Development Coach
Your company can only grow as fast as you do. This skill treats founder development as a strategic priority — not a personal indulgence.
## Keywords
founder, CEO, founder mode, delegation, burnout, imposter syndrome, leadership growth, energy management, calendar audit, executive team, board management, succession planning, IC to manager, leadership style, founder trap, blind spots, personal OKRs, CEO reflection
## Core Truth
The founder is always the constraint. Not intentionally — it's structural. You built the company. You know everything. Decisions flow through you. This works until it doesn't.
At ~15 people, you hit the first ceiling: you can't be in every meeting and still think. At ~50 people, the second: your style starts creating culture problems. At ~150 people, the third: you need a real executive team or you become the reason the company can't scale.
The earlier you address this, the better.
---
## 1. Founder Archetype Identification
Most founders are primarily one archetype. Knowing yours predicts what you'll struggle with.
| Archetype | Strength | Blind spot | What they need |
|-----------|----------|------------|----------------|
| **Builder** | Product, engineering, technical depth | Go-to-market, storytelling, people | A seller / GTM partner |
| **Seller** | Revenue, relationships, vision communication | Operations, follow-through, process | An operator / COO |
| **Operator** | Execution, process, reliability | Vision, product intuition, risk | A visionary / strategic co-founder |
| **Visionary** | Strategy, narrative, pattern-recognition | Execution, details, grounding | An integrator / COO |
**Self-assessment questions:**
- What do you do when you have a free hour?
- What do you procrastinate on most?
- What do your co-founders or early team complain you don't do?
- What's the best feedback you've received about your leadership?
Most founders are Builder or Visionary. Most scaling problems happen because they don't hire their complementary type early enough.
---
## 2. Delegation Framework
Founders fail to delegate for four reasons:
1. "Nobody does it as well as I do" (often true short-term, fatal long-term)
2. "It takes longer to explain than to do it" (true once; not true the 10th time)
3. "I lose control if I don't do it myself" (control is an illusion at scale)
4. "If it fails, it's my fault" (it's your fault if you never let anyone else try)
### The Skill × Will Matrix
| | High Skill | Low Skill |
|---|-----------|----------|
| **High Will** | Delegate fully | Coach and develop |
| **Low Will** | Motivate or reassign | Manage out or redesign role |
**Rules:**
- High skill + high will → Give the work and get out of the way
- High will + low skill → Invest in them. They want to grow.
- High skill + low will → Find out why. Fix the environment or accept the mismatch.
- Low skill + low will → Don't delegate to them. Address the performance issue.
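
The matrix above is a four-cell lookup, which can be written down directly. This is only an illustrative sketch: the function name `delegation_action` and the boolean inputs are assumptions, and deciding whether someone counts as "high" on skill or will remains a judgment call.

```python
from typing import Dict, Tuple

# (high_skill, high_will) -> recommended action, copied from the matrix.
SKILL_WILL_MATRIX: Dict[Tuple[bool, bool], str] = {
    (True, True): "Delegate fully",
    (False, True): "Coach and develop",
    (True, False): "Motivate or reassign",
    (False, False): "Manage out or redesign role",
}


def delegation_action(high_skill: bool, high_will: bool) -> str:
    """Return the matrix cell for a given skill/will assessment."""
    return SKILL_WILL_MATRIX[(high_skill, high_will)]
```

For instance, `delegation_action(high_skill=False, high_will=True)` returns `"Coach and develop"`.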
### The Delegation Ladder
Not all delegation is equal. Build up gradually:
1. "Do exactly what I tell you" — not delegation, instruction
2. "Research this and report back" — information gathering
3. "Propose a solution and I'll decide" — thinking delegation
4. "Decide and tell me what you decided" — decision delegation with review
5. "Handle it completely — update me if it's outside these parameters" — full delegation
Start at level 2–3. Move people up as trust is established. Most founders never get past level 3 with their team — that's the bottleneck.
### What to delegate first
**Delegate first (high volume, low stakes):**
- Recurring operational tasks you do the same way every time
- Information gathering and synthesis
- Meeting coordination and scheduling
- Reports and updates you produce regularly
**Delegate next (skill-buildable):**
- Customer interactions (with clear principles)
- Hiring screens (after you've trained judgment)
- Partner relationship management
- Budget management within parameters
**Delegate last (strategic, irreversible):**
- Major strategic pivots
- Executive hires
- Large financial commitments
- M&A decisions
---
## 3. Energy Management
Founders manage energy, not just time. Time is fixed. Energy is renewable — but only if you manage it.
### The Energy Audit
Map your week by energy, not tasks. See `references/founder-toolkit.md` for the full template.
**Categories:**
- 🟢 **Energizing:** Activities that leave you sharper after doing them
- 🟡 **Neutral:** Neither energizing nor draining
- 🔴 **Draining:** Activities that leave you depleted
**Common founder energy patterns:**
- **Builders:** Energized by creating, drained by politics and process
- **Sellers:** Energized by people and wins, drained by detail work and admin
- **Operators:** Energized by solving, drained by ambiguity and indecision
- **Visionaries:** Energized by strategy and ideas, drained by execution and repetition
**The rule:** Maximize green. Eliminate or delegate red. Accept yellow as the price of leadership.
### Energy management practices
**Protect deep work time.** 2–4 hours of uninterrupted thinking time, 3–5 days per week. Schedule it. Defend it. This is where strategy happens.
**Batch shallow work.** Email, Slack, administrative tasks — twice a day maximum.
**Single-task during recovery.** If you're depleted, don't try to do your best work. Do tasks that don't require your best.
**Identify your peak window.** Most people have 4–6 peak hours per day. Schedule your hardest work in those windows.
---
## 4. CEO Calendar Audit
The calendar is the most honest document in a founder's life. It shows what you actually prioritize, not what you say you prioritize.
### Running the audit
Pull the last 4 weeks of calendar data. Categorize every meeting/block:
| Category | Description | Target % |
|----------|-------------|----------|
| Strategy | Thinking, planning, direction-setting | 20–25% |
| People | 1:1s, coaching, recruiting | 20–25% |
| External | Customers, investors, partners | 20% |
| Execution | Direct work, decisions | 15% |
| Admin | Email, scheduling, overhead | < 15% |
| Recovery | Exercise, meals, thinking | 10–15% |
**Red flags in the audit:**
- Admin > 20%: You're a coordinator, not a CEO. Fix your systems.
- Execution > 30%: You're still an IC. Build the team.
- People < 10%: Your team is running on empty. They need more of you.
- No recovery blocks: You're running on adrenaline. It ends badly.
- Strategy < 10%: You're running the company, not leading it.
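
The audit arithmetic can be sketched in a few lines: sum hours per category, convert to percentages, and apply the red-flag thresholds listed above. The function name `audit_calendar` and the `(category, hours)` input shape are assumptions for illustration; exporting and labeling the calendar blocks is left to the reader.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def audit_calendar(blocks: Iterable[Tuple[str, float]]) -> Tuple[Dict[str, float], List[str]]:
    """Return each category's share of total hours (in %) plus any red flags."""
    hours: Dict[str, float] = defaultdict(float)
    for category, duration in blocks:
        hours[category] += duration

    total = sum(hours.values())
    if total == 0:
        return {}, []

    share = {category: 100.0 * h / total for category, h in hours.items()}

    # Thresholds are taken from the red-flag list above.
    flags: List[str] = []
    if share.get("Admin", 0.0) > 20:
        flags.append("Admin > 20%: fix your systems")
    if share.get("Execution", 0.0) > 30:
        flags.append("Execution > 30%: build the team")
    if share.get("People", 0.0) < 10:
        flags.append("People < 10%: your team needs more of you")
    if share.get("Recovery", 0.0) == 0:
        flags.append("No recovery blocks")
    if share.get("Strategy", 0.0) < 10:
        flags.append("Strategy < 10%: running, not leading")
    return share, flags
```

Feeding it four weeks of labeled blocks gives both the percentages to compare against the target column and the list of flags to act on.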
### The CEO's primary job at each stage
| Stage | CEO should spend most time on... |
|-------|--------------------------------|
| Seed | Product and customers. Directly. |
| Series A | Hiring the executive team. Recruiting is your job. |
| Series B | Culture, strategy, and external (investors/partners/customers) |
| Series C+ | Vision, board, external narrative, executive development |
If you're spending time on things from two stages ago, you haven't made the transition.
---
## 5. Leadership Style Evolution
The job changes at every stage. Most founders don't change with it.
**IC → Manager (0 to ~10 people):**
You need to teach and build trust. People are watching how you treat failure. The skill: give clear context, set expectations, check in frequently.
**Manager → Leader (~10 to ~50 people):**
You can't manage everyone directly. You need people who manage people. The skill: hire managers you trust, let them manage.
**Leader → Executive (~50 to ~200 people):**
You're now setting culture and direction, not managing work. The skill: communicate obsessively, decide at the right altitude, develop your leadership team.
**Executive → Institutional CEO (200+):**
You're a symbol as much as a manager. The skill: build systems that work without you; focus on board, investors, and external narrative.
**The hardest transition:** Manager → Leader. You have to stop doing things yourself and trust people you're still getting to know.
---
## 6. Blind Spot Identification
Everyone has them. Founders more than most — because nobody in the early company had the authority or safety to tell you.
### Common founder blind spots
- **Communication:** "I said it once, they should know" — you said it; they didn't hear it or didn't believe it
- **Decision speed:** Moving so fast that teams can't orient or build on your direction
- **Context hoarding:** Knowing what's happening without sharing it, then being frustrated that teams make bad decisions
- **Optimism bias:** Consistently underestimating timelines, cost, and difficulty
- **Founder exceptionalism:** Rules that apply to everyone don't apply to you
- **Feedback avoidance:** Creating an environment where no one gives you honest feedback
### How to find your blind spots
1. **360 feedback (anonymous):** Once a year. Ask direct reports, peers, board members. Include "What does [name] do that gets in the way of our success?"
2. **Exit interview analysis:** What do departing employees consistently say? Find the pattern.
3. **Failure post-mortems:** What do your worst decisions have in common? What were you assuming that wasn't true?
4. **The energy audit:** Where do you consistently drain the people around you?
---
## 7. Imposter Syndrome Toolkit
It doesn't go away. It evolves. The founder who was scared to pitch to investors is now scared to manage a board. The founder who was scared to hire is now scared to fire.
**The reframe:** Imposter syndrome is proportional to stretch. If you never feel it, you're not growing.
**Practical tools:**
- **Evidence file:** Document wins, compliments, decisions that worked. Read it when the doubt hits.
- **Normalize the feeling:** "I feel underprepared for this" ≠ "I am an imposter." Feeling and fact are different.
- **Do the thing anyway.** Competence comes from doing, not from feeling ready.
- **Name it:** Saying "I'm feeling imposter syndrome about this investor meeting" to a trusted person takes away much of its power.
---
## 8. Founder Mental Health
Burnout isn't weakness. It's a predictable outcome of high-demand + low-recovery + no control over inputs.
### Burnout signals
- **Early:** Irritability, difficulty sleeping, decisions feel harder than they should, loss of enthusiasm for the mission.
- **Mid:** Physical symptoms (headaches, illness), cynicism about the company, social withdrawal, all tasks feel equally important (priority paralysis).
- **Late:** Can't function, decisions have stopped, team notices before you do.
**If you're in late burnout:** Stop performing. Get support. The company needs a functioning founder more than it needs a martyred one.
### Structural prevention
- **Protect recovery time.** Not weekends — protected time during the week where you're not available.
- **Therapy or coaching.** Not optional for founders. The job is isolating and the stakes are high.
- **Peer group.** Other founders at similar stages. They're the only people who actually understand the job.
- **Clear off-ramps.** Know what "enough for today" looks like. Don't let the work be infinite.
---
## 9. The Founder Mode Trap
Paul Graham's "Founder Mode" essay made the case that great founders stay deeply involved in operations — skip middle management and go direct. It resonated because it's sometimes true.
**When founder mode helps:**
- Crisis recovery (company needs direct leadership)
- Product-market fit search (speed matters more than org health)
- High-value, irreversible decisions (you should be in the room)
- Early stages when the team is small
**When founder mode hurts:**
- When it undermines managers you've hired (they can't lead if you override them)
- When it's driven by distrust rather than strategy
- When it prevents the team from developing judgment
- When you're doing it because you miss doing, not because the company needs you to
**The test:** Are you going deep because the situation requires it, or because you're uncomfortable with the loss of control? The first is leadership. The second is the trap.
---
## 10. Succession Planning
Building a company that works without you is not disloyalty — it's the ultimate expression of leadership.
**Succession is not just about exit.** It's about resilience. What happens if you're sick? On sabbatical? Acquired?
**Succession readiness levels:**
- Level 1: You've documented your key knowledge and processes
- Level 2: At least one person can cover each of your key functions for 2 weeks
- Level 3: Your leadership team can run the company for a quarter without you
- Level 4: You've identified and developed your potential successor
Most founders are at Level 0: none of the above is in place. Level 2 is a reasonable target. Level 3 is a strategic asset.
---
## Key Questions for Founder Development
- "What decisions did you make last week that someone else could have made?"
- "What are you still doing that you should have delegated 6 months ago?"
- "When did you last get honest, critical feedback? From whom? What did it say?"
- "What would need to be true for the company to run for a week without you?"
- "What's draining your energy that you've accepted as unavoidable?"
## Detailed References
- `references/leadership-growth.md` — Maxwell levels, situational leadership, founder-to-CEO transition
- `references/founder-toolkit.md` — Weekly reflection, energy audit, delegation matrix, 1:1 templates
FILE:references/founder-toolkit.md
# Founder Toolkit
Practical tools for founder self-management and leadership development.
---
## 1. Weekly CEO Reflection Template
**15 minutes. Every Friday. No excuses.**
This is the most important meeting of the week. You with yourself.
```
DATE: _______________
## This Week
**1. What was my most important contribution this week?**
(Not the longest meeting or the hardest problem — the thing that will matter in 90 days.)
_______________________________________________
**2. Where did I add the least value? Why was I involved?**
(Be honest. Where were you in the room out of habit, not necessity?)
_______________________________________________
**3. What should I have delegated but didn't?**
(Name the specific task and the person you could have delegated it to.)
_______________________________________________
**4. What decision am I avoiding? Why?**
(Fear of being wrong? Not enough information? Conflict avoidance?)
_______________________________________________
**5. What would I do differently this week if I could do it over?**
(One thing. Make it specific.)
_______________________________________________
## Next Week
**My one most important outcome for next week:**
_______________________________________________
**What will I stop doing / not start / protect myself from?**
_______________________________________________
```
---
## 2. Energy Audit Template
Map your week by energy, not tasks. Do this for one full work week.
### Step 1: Time block mapping
For each 30-minute block in your week, record:
- What you did
- Energy level: 🟢 Energizing / 🟡 Neutral / 🔴 Draining
```
Monday:
08:00-08:30: __________________ [🟢/🟡/🔴]
08:30-09:00: __________________ [🟢/🟡/🔴]
09:00-09:30: __________________ [🟢/🟡/🔴]
... (continue through the day)
```
### Step 2: Pattern analysis
After one week, categorize activities:
| Activity type | Energy level | Total hours | % of week |
|--------------|-------------|-------------|-----------|
| Customer calls | | | |
| Investor meetings | | | |
| Team 1:1s | | | |
| Product decisions | | | |
| Strategy/planning | | | |
| Email/Slack | | | |
| Recruiting | | | |
| Financial review | | | |
| External talks/events | | | |
| Administrative tasks | | | |
| Deep work/building | | | |
| Recovery/breaks | | | |
### Step 3: Optimization plan
**Green activities to protect (min 40% of week):**
- _______________________________________________
**Red activities to eliminate or delegate (target: < 15% of week):**
- Activity: __________________ → Delegate to: __________________
- Activity: __________________ → Eliminate via: __________________
**Your personal energy peak hours:**
I do my best thinking: _______ to _______
Schedule this time as: Protected deep work (no meetings)
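
The Step 2 and Step 3 arithmetic can be sketched as follows, assuming each logged entry is one 30-minute block tagged green, yellow, or red. The function name `energy_summary` and the input shape are assumptions made for illustration; the 40% and 15% targets come from the optimization plan above.

```python
from collections import defaultdict

BLOCK_HOURS = 0.5  # each logged entry is one 30-minute block


def energy_summary(blocks):
    """blocks: list of (activity, energy) pairs, energy in {"green", "yellow", "red"}."""
    if not blocks:
        raise ValueError("no blocks logged")
    hours = defaultdict(float)
    for _activity, energy in blocks:
        hours[energy] += BLOCK_HOURS
    total = sum(hours.values())
    pct = {e: round(100 * hours[e] / total, 1) for e in ("green", "yellow", "red")}
    return {
        "pct": pct,
        "green_ok": pct["green"] >= 40,  # protect at least 40% green time
        "red_ok": pct["red"] < 15,       # keep red time below 15%
    }


# Hypothetical log: 8 green, 8 yellow, and 4 red blocks
blocks = ([("Customer calls", "green")] * 8
          + [("Team 1:1s", "yellow")] * 8
          + [("Email/Slack", "red")] * 4)
print(energy_summary(blocks))
```

In this sample the green target is met at exactly 40%, while red time sits at 20% and misses the target.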
---
## 3. Delegation Matrix
For every task you regularly do, run it through this matrix.
### Assessment
| Task | My skill | My will to keep it | Decision |
|------|----------|-------------------|----------|
| | High / Med / Low | High / Med / Low | Keep / Delegate / Develop / Kill |
### Delegation scoring
| My Skill | My Will | Decision |
|----------|---------|----------|
| High | High | Keep — this is your zone of genius |
| High | Low | Delegate — you can do it, but it drains you. Train someone. |
| Low | High | Develop — learn it or hire for it |
| Low | Low | Kill or outsource — why is this on your plate? |
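
The scoring table reduces to a four-entry lookup. The sketch below covers only the High/Low combinations the table defines, and the names `DECISIONS` and `delegation_decision` are illustrative assumptions.

```python
# Decisions copied from the scoring table, keyed by (my skill, my will).
DECISIONS = {
    ("high", "high"): "Keep",
    ("high", "low"): "Delegate",
    ("low", "high"): "Develop",
    ("low", "low"): "Kill or outsource",
}


def delegation_decision(skill, will):
    """Map a skill/will rating to the table's decision; ratings are case-insensitive."""
    key = (skill.strip().lower(), will.strip().lower())
    if key not in DECISIONS:
        raise ValueError(f"no rule in the scoring table for {key}")
    return DECISIONS[key]


print(delegation_decision("High", "Low"))
```

Any rating outside High/Low raises an error rather than guessing, since the table gives no rule for it.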
### The 70% rule
If someone can do a task 70% as well as you, delegate it. Trying to get to 100% is a trap:
- Their 70% will grow to 90% with practice
- Your 30% extra effort costs more than the quality gap
- You free up time for things only you can do
---
## 4. 1:1 Template for Direct Reports
Weekly or biweekly. 30 minutes. Their agenda, not yours.
```
DATE: _______________
PERSON: _______________
## Their Section (first 20 min)
**What's on their mind? (open the meeting with this)**
(No agenda from you first — let them lead)
**What are they working on? Where are they stuck?**
**What do they need from me?**
**Anything they wanted to raise but haven't had the chance to?**
## Your Section (last 10 min)
**Context to share (strategy, changes, what they should know):**
**Direct feedback to give (if any):**
- Be specific: "In Tuesday's meeting, when you [did X], the impact was [Y]"
- Make it actionable: "Next time, I'd suggest [Z]"
**Career/growth check-in (monthly, not every meeting):**
- How are they feeling about their growth?
- What do they want to be doing more of?
- What are they interested in that they're not currently doing?
## Follow-ups
| Commitment | Owner | Due |
|------------|-------|-----|
| | | |
```
### Rules for effective 1:1s
- **Their agenda first.** If you dominate with your updates, they stop bringing theirs.
- **No status updates.** That's what tools are for. This time is for their thinking, blockers, and development.
- **Consistent time.** Rescheduled 1:1s signal that they're not a priority.
- **Take notes.** Review them before the next meeting. It signals that you listened.
- **Follow up on commitments.** If you say "I'll get you that answer by Thursday," get it by Thursday.
---
## 5. Personal OKRs for the Founder
Most founders hold their team accountable to goals but have none themselves. Fix that.
### Template: Quarterly Personal OKRs
```
Q[X] YYYY | FOUNDER OKRs
## My One Priority This Quarter
(The single most important thing I personally must accomplish)
_______________________________________________
## Objective 1: [Leadership Development]
What I'm trying to achieve: _______________________________________________
KR 1.1: [Measurable outcome by EoQ]
KR 1.2: [Measurable outcome by EoQ]
KR 1.3: [Measurable outcome by EoQ]
Progress check (mid-quarter): _______________________________________________
## Objective 2: [Delegation / Team Building]
What I'm trying to achieve: _______________________________________________
KR 2.1: [Measurable outcome by EoQ]
KR 2.2: [Measurable outcome by EoQ]
## Objective 3: [External Impact — Investors / Customers / Market]
What I'm trying to achieve: _______________________________________________
KR 3.1: [Measurable outcome by EoQ]
KR 3.2: [Measurable outcome by EoQ]
## The "Stop Doing" List (equally important)
Things I'm committing to stop doing this quarter:
- Stop: _______________________________________________
- Stop: _______________________________________________
- Stop: _______________________________________________
```
### Personal OKR examples
**Objective: Become a better coach, not just a decision-maker**
- KR: 90% of my direct reports can make their top 3 recurring decisions without me by EoQ
- KR: In 1:1 reviews, 80% of the team rates me as "helps me think through problems" vs "tells me what to do"
- KR: Conduct quarterly 360 feedback session with all direct reports
**Objective: Build investor trust before I need it**
- KR: Monthly investor updates sent within 5 days of month-end, every month this quarter
- KR: 1:1 calls with each board member, once per quarter, outside of board meetings
- KR: Create and share 3-year financial model with board by EoQ
**Objective: Protect my energy and performance**
- KR: 3+ hours of protected deep work time per day, 4+ days per week
- KR: Complete weekly CEO reflection every Friday (track: 0/13 weeks → 13/13)
- KR: Zero email after 8pm, zero weekend work unless there is an explicit crisis
---
## 6. The "Stop Doing" List
The hardest list to make and the most valuable to keep.
Most founders have clear to-do lists. Few have stop-doing lists. The asymmetry is the problem.
### The stop-doing audit
**Things to stop doing immediately (decisions you can make today):**
- Attending meetings you don't add value to
- Being the default person for decisions that should be made by others
- Redoing work that your team completed
- Checking email/Slack during deep work blocks
- Starting tasks you know you'll delegate partway through
**Things to stop doing by delegating (need to train someone):**
- _______________________________________________
- _______________________________________________
- _______________________________________________
**Things to stop doing by building systems:**
- Recurring manual tasks → automate
- Recurring decisions → write decision criteria so others can decide
- Recurring explanations → document once, reference always
### The decision filter
Before accepting new responsibilities, run through:
1. Does this require something only I can do?
2. Is this the highest and best use of my time?
3. If I say yes to this, what am I saying no to?
If the answers are no, no, and something important — say no.
---
## 7. Evidence File
For when imposter syndrome hits. Keep a running file of:
**Wins** (monthly minimum)
- Company milestones you led
- Decisions that worked out well
- Feedback you received that was genuinely positive
**Quotes** (capture as they happen)
- Direct quotes from team members, customers, investors about your impact
- Emails or messages that reflect trust or appreciation
**The hard calls that paid off**
- Decisions you were scared to make that turned out well
- Times you said no to something that would have hurt the company
**When to read it:** When you're doubting yourself before a board meeting, a hard conversation, a big pitch. The feeling isn't fact. The evidence file is.
FILE:references/leadership-growth.md
# Leadership Growth Reference
Frameworks for founder and executive leadership development.
---
## 1. The 5 Levels of Leadership (Maxwell)
John Maxwell's model describes leadership development as a ladder. Most founders start at Level 2–3 and need to reach Level 4–5 to scale effectively.
| Level | Name | People follow because... | What it looks like |
|-------|------|--------------------------|-------------------|
| 1 | Position | They have to (title/authority) | "Do this because I'm the CEO" |
| 2 | Permission | They want to (relationship) | People choose to work with you beyond the job requirement |
| 3 | Production | You produce results | Team rallies because you deliver; your track record gives credibility |
| 4 | People Development | You develop others | You're multiplying leaders; your success is measured by others' growth |
| 5 | Pinnacle | Who you are (reputation) | People follow because of what you've built and who you've become |
**Most founders are at Level 3.** They got here by building and shipping. The path to scaling is Level 4: developing other leaders.
**The Level 3 trap:** Production-focused founders attract doers, not leaders. They value results over growth. Their teams are effective but dependent. Every decision still goes through the founder.
**The Level 4 shift:** Measure your success by how well your team succeeds without you. Your job is to make the people around you better.
---
## 2. Situational Leadership Model
Ken Blanchard's Situational Leadership II (SLII) model, which grew out of his earlier work with Paul Hersey, says effective leadership style shifts based on the person and the task, not the leader's preference.
Four styles based on the follower's development level:
| Development Level | Competence | Commitment | Leadership Style | What to do |
|------------------|------------|------------|-----------------|------------|
| D1 — Enthusiastic Beginner | Low | High | S1: Directing | High direction, low support. Tell them what to do. |
| D2 — Disillusioned Learner | Low/Medium | Low | S2: Coaching | High direction + high support. Teach and encourage. |
| D3 — Capable but Cautious | Medium/High | Variable | S3: Supporting | Low direction, high support. Collaborate and encourage. |
| D4 — Self-Reliant Achiever | High | High | S4: Delegating | Low direction, low support. Get out of the way. |
**Common founder error:** Using the same leadership style with everyone. The founder who directs a D4 will frustrate them into leaving. The founder who delegates to a D1 will watch them fail.
**Diagnosis before deciding:**
Before determining your style, ask for each person + task:
- How much do they know about this specific task? (Not in general — this task.)
- How much do they want to do this specific task?
These answers may surprise you. A senior engineer may be D4 on architecture and D1 on customer calls.
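
One way to keep the diagnosis per person and task rather than per person is to key it on the pair. This is a small sketch; the people, tasks, and the names `diagnosis` and `style_for` are hypothetical, and only the level-to-style mapping comes from the table above.

```python
# Style lookup copied from the table above.
STYLE_BY_LEVEL = {
    "D1": "S1: Directing",
    "D2": "S2: Coaching",
    "D3": "S3: Supporting",
    "D4": "S4: Delegating",
}

# Hypothetical diagnoses, keyed by (person, task) rather than by person alone.
diagnosis = {
    ("senior_engineer", "architecture"): "D4",
    ("senior_engineer", "customer_calls"): "D1",
}


def style_for(person, task):
    """Look up the leadership style for one person on one specific task."""
    level = diagnosis[(person, task)]
    return STYLE_BY_LEVEL[level]


print(style_for("senior_engineer", "architecture"))
print(style_for("senior_engineer", "customer_calls"))
```

The same engineer gets a delegating style on architecture and a directing style on customer calls.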
---
## 3. The Founder → CEO Transition
The hardest leadership change most founders face, and nobody prepares them for it.
### What changes
**As a founder, you were judged on:**
- What you personally built
- How fast you moved
- Your own output
**As a CEO, you're judged on:**
- What your team produces
- How effectively you set direction
- The quality of the people around you
The skills that made you a great founder — doing, deciding, building — can actively work against you as a CEO.
### The transition phases
**Phase 1: Still doing (0–15 people)**
You're right to be deep in the work. Speed requires it. Your personal output matters.
Risk: Staying here too long.
**Phase 2: Building around you (15–50 people)**
You're hiring and starting to delegate. People do work you used to do.
Challenge: Learning to trust output that doesn't look like yours.
Failure mode: Hiring people and then redoing their work.
**Phase 3: Leading through leaders (50–150 people)**
You no longer know everything happening in the company. That's correct.
Challenge: Managing people who manage people — twice removed from the work.
Failure mode: Bypassing your managers to go direct (undermines them, creates chaos).
**Phase 4: Setting the container (150+ people)**
Your job is culture, strategy, and the senior leadership team. You're a CEO, not a senior contributor.
Challenge: Staying relevant and strategic without getting lost in the weeds.
Failure mode: Retreating to execution to feel productive.
### The emotional reality
Most founders describe the transition as:
- A loss of identity ("I used to know everything that was happening")
- A loss of control ("Decisions happen without me")
- A loss of clarity ("Was I more effective before?")
These are real losses, not just discomfort. Acknowledge them. Find identity in what the CEO role is, not what the founder role was.
---
## 4. Building Your Executive Team
### When to hire your first executive
Common question: "When do I need a VP/C-suite?"
**Trigger signs:**
- The function is failing and you can't fix it by working harder
- You can't attract or develop talent in that function because you lack the expertise
- The function is growing faster than you can lead it
- You're making bad decisions in that domain because you don't have deep knowledge
**Order of first executives:**
Most companies hire in this order, but the right order depends on your archetype and what's breaking:
1. First non-founder exec is usually Sales (VP Sales) or Engineering (VP Eng / CTO)
2. Then COO/Operations when coordination becomes the bottleneck
3. Then Finance (CFO) when fundraising or financial complexity demands it
4. Then People/HR when hiring velocity and culture require dedicated ownership
### How to onboard executives
**The 30-60-90 plan:**
- Day 1–30: Listen. Meet everyone. Learn the current state. No major decisions.
- Day 31–60: Diagnose. What's working, what isn't, what's missing. Share findings.
- Day 61–90: Act. Make changes. Start building systems. Establish their leadership presence.
**The trust-building sequence:**
Start with small, visible wins. Let them prove themselves in low-stakes situations before handing over high-stakes decisions.
**The founder's role during exec onboarding:**
- Provide context generously
- Introduce them with genuine authority ("This is the decision-maker for X — go to them, not me")
- Don't override their decisions publicly
- Give feedback privately, not in front of their team
**Failure mode:** Hiring a great executive and then making them feel like a senior employee. If you override every major decision, you don't have an executive — you have an expensive advisor.
---
## 5. Managing Your Board
### The fundamental tension
You work for the board. The board elected you. They can remove you. This is a governance reality, not a threat.
And: You lead the company. The board sets governance and approves major decisions, but they're not running the business day-to-day. You are.
**Healthy dynamic:** Board holds accountability; CEO holds authority. They're not adversarial — they're complementary.
### The founder mistake
Most founders either:
1. **Over-inform:** Share every detail, create noise, invite micro-management
2. **Under-inform:** Share only wins, board is surprised by problems, trust erodes
Neither works. The goal is strategic partnership.
### What the board actually needs
- **Monthly written update:** Financial performance vs plan, key metrics, top 3 issues + proposed solutions, forward-looking risks. 1–2 pages.
- **Quarterly board meeting:** Strategic discussion, not financial recap. They've read the update. Use the time for decisions and input.
- **Real-time alerts:** Big bad news before the meeting. Never let board members be surprised by negative news they should have known earlier.
### Managing board members individually
Invest in 1:1 relationships with each board member between meetings. Understand what they care about. Use their expertise.
Board members who feel informed and useful are your allies. Board members who feel blindsided or sidelined become difficult.
**The pre-meeting call:** Before every board meeting, call each member individually. Preview the agenda, surface concerns, align on decisions. The meeting itself should have no surprises.
### When the board challenges you
"The board doesn't trust my judgment" is often really: "I haven't given them enough information to trust my judgment."
Fix the transparency gap before assuming it's a political problem.
**When the board is actually wrong:** Make the case clearly, once, with data. If they override you on something important and you can't accept it, that's a signal about fit. Founders get removed. It happens. Build board relationships before you need them to trust you on a hard call.
---
name: "financial-analyst"
description: Performs financial ratio analysis, DCF valuation, budget variance analysis, and rolling forecast construction for strategic decision-making. Use when analyzing financial statements, building valuation models, assessing budget variances, or constructing financial projections and forecasts. Also applicable when users mention financial modeling, cash flow analysis, company valuation, financial projections, or spreadsheet analysis.
---
# Financial Analyst Skill
## Overview
Production-ready financial analysis toolkit providing ratio analysis, DCF valuation, budget variance analysis, and rolling forecast construction. Designed for financial modeling, forecasting & budgeting, management reporting, business performance analysis, and investment analysis.
## 5-Phase Workflow
### Phase 1: Scoping
- Define analysis objectives and stakeholder requirements
- Identify data sources and time periods
- Establish materiality thresholds and accuracy targets
- Select appropriate analytical frameworks
### Phase 2: Data Analysis & Modeling
- Collect and validate financial data (income statement, balance sheet, cash flow)
- **Validate input data completeness** before running ratio calculations (check for missing fields, nulls, or implausible values)
- Calculate financial ratios across 5 categories (profitability, liquidity, leverage, efficiency, valuation)
- Build DCF models with WACC and terminal value calculations; **cross-check DCF outputs against sanity bounds** (e.g., implied multiples vs. comparables)
- Construct budget variance analyses with favorable/unfavorable classification
- Develop driver-based forecasts with scenario modeling
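
The input completeness check in this phase could look roughly like the sketch below. It is not the validation performed by the bundled scripts. The field names in `REQUIRED_FIELDS` and the non-negative rule are assumptions chosen for illustration, so adapt them to the real input schema.

```python
import math

# Hypothetical required fields; the real schema may differ.
REQUIRED_FIELDS = ("revenue", "net_income", "total_assets", "total_equity")


def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_inputs(data, required=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    for field in required:
        value = data.get(field)
        if value is None:
            problems.append(f"{field}: missing or null")
        elif not _is_number(value):
            problems.append(f"{field}: not a number")
        elif math.isnan(value) or math.isinf(value):
            problems.append(f"{field}: not finite")
    # Example plausibility rule (an assumption): these totals should not be negative.
    for field in ("revenue", "total_assets"):
        value = data.get(field)
        if _is_number(value) and value < 0:
            problems.append(f"{field}: negative value is implausible")
    return problems


print(validate_inputs({"revenue": -5, "net_income": None, "total_assets": 200}))
```

Returning a list of problems instead of raising on the first one lets the analyst see every gap in the data before rerunning.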
### Phase 3: Insight Generation
- Interpret ratio trends and benchmark against industry standards
- Identify material variances and root causes
- Assess valuation ranges through sensitivity analysis
- Evaluate forecast scenarios (base/bull/bear) for decision support
### Phase 4: Reporting
- Generate executive summaries with key findings
- Produce detailed variance reports by department and category
- Deliver DCF valuation reports with sensitivity tables
- Present rolling forecasts with trend analysis
### Phase 5: Follow-up
- Track forecast accuracy (target: +/-5% revenue, +/-3% expenses)
- Monitor report delivery timeliness (target: 100% on time)
- Update models with actuals as they become available
- Refine assumptions based on variance analysis
## Tools
### 1. Ratio Calculator (`scripts/ratio_calculator.py`)
Calculate and interpret financial ratios from financial statement data.
**Ratio Categories:**
- **Profitability:** ROE, ROA, Gross Margin, Operating Margin, Net Margin
- **Liquidity:** Current Ratio, Quick Ratio, Cash Ratio
- **Leverage:** Debt-to-Equity, Interest Coverage, DSCR
- **Efficiency:** Asset Turnover, Inventory Turnover, Receivables Turnover, DSO
- **Valuation:** P/E, P/B, P/S, EV/EBITDA, PEG Ratio
```bash
python scripts/ratio_calculator.py sample_financial_data.json
python scripts/ratio_calculator.py sample_financial_data.json --format json
python scripts/ratio_calculator.py sample_financial_data.json --category profitability
```
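
To show the kind of arithmetic involved, here is a minimal sketch of a few ratios from the categories above using their standard textbook definitions. It is not the code in `scripts/ratio_calculator.py`. The input keys are assumed names, and ROE and DSO use period-end balances rather than averages as a simplification.

```python
def safe_div(numerator, denominator):
    """Return None instead of raising when the denominator is zero or missing."""
    return None if not denominator else numerator / denominator


def basic_ratios(fs):
    """fs: dict of financial statement values (key names are assumptions)."""
    return {
        "gross_margin": safe_div(fs["revenue"] - fs["cogs"], fs["revenue"]),
        "net_margin": safe_div(fs["net_income"], fs["revenue"]),
        "roe": safe_div(fs["net_income"], fs["total_equity"]),
        "current_ratio": safe_div(fs["current_assets"], fs["current_liabilities"]),
        "debt_to_equity": safe_div(fs["total_debt"], fs["total_equity"]),
        "dso_days": safe_div(fs["accounts_receivable"] * 365, fs["revenue"]),
    }


# Hypothetical figures
sample = {
    "revenue": 1000, "cogs": 600, "net_income": 100, "total_equity": 500,
    "current_assets": 400, "current_liabilities": 200,
    "total_debt": 250, "accounts_receivable": 100,
}
for name, value in basic_ratios(sample).items():
    print(f"{name}: {value}")
```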
### 2. DCF Valuation (`scripts/dcf_valuation.py`)
Discounted Cash Flow enterprise and equity valuation with sensitivity analysis.
**Features:**
- WACC calculation, with cost of equity estimated via CAPM
- Revenue and free cash flow projections (5-year default)
- Terminal value via perpetuity growth and exit multiple methods
- Enterprise value and equity value derivation
- Two-way sensitivity analysis (discount rate vs growth rate)
```bash
python scripts/dcf_valuation.py valuation_data.json
python scripts/dcf_valuation.py valuation_data.json --format json
python scripts/dcf_valuation.py valuation_data.json --projection-years 7
```
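
The core calculations behind the features listed above can be sketched as follows. This illustrates the standard formulas and is not the implementation in `scripts/dcf_valuation.py`. Function names and inputs are assumptions, cash flows are treated as end-of-year amounts, and only the perpetuity growth terminal value is shown.

```python
def cost_of_equity_capm(risk_free, beta, equity_risk_premium):
    """CAPM: risk-free rate plus beta times the equity risk premium."""
    return risk_free + beta * equity_risk_premium


def wacc(equity_value, debt_value, cost_of_equity, pre_tax_cost_of_debt, tax_rate):
    """Weight each cost by its share of total capital; debt cost is after tax."""
    total = equity_value + debt_value
    after_tax_debt = pre_tax_cost_of_debt * (1 - tax_rate)
    return (equity_value / total) * cost_of_equity + (debt_value / total) * after_tax_debt


def dcf_enterprise_value(fcfs, discount_rate, terminal_growth):
    """PV of projected free cash flows plus PV of a perpetuity-growth terminal value."""
    if discount_rate <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    pv_fcfs = sum(fcf / (1 + discount_rate) ** t
                  for t, fcf in enumerate(fcfs, start=1))
    terminal_value = fcfs[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    pv_terminal = terminal_value / (1 + discount_rate) ** len(fcfs)
    return pv_fcfs + pv_terminal


# Hypothetical inputs
ke = cost_of_equity_capm(risk_free=0.04, beta=1.2, equity_risk_premium=0.05)
rate = wacc(equity_value=800, debt_value=200, cost_of_equity=ke,
            pre_tax_cost_of_debt=0.05, tax_rate=0.25)
ev = dcf_enterprise_value([100, 110, 120, 130, 140], rate, terminal_growth=0.02)
print(f"WACC: {rate:.4f}  Enterprise value: {ev:.1f}")
```

Equity value then follows by subtracting net debt from enterprise value. A two-way sensitivity table can be built by calling `dcf_enterprise_value` over a grid of discount and growth rates.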
### 3. Budget Variance Analyzer (`scripts/budget_variance_analyzer.py`)
Analyze actual vs budget vs prior year performance with materiality filtering.
**Features:**
- Dollar and percentage variance calculation
- Materiality threshold filtering (default: 10% or $50K)
- Favorable/unfavorable classification with revenue/expense logic
- Department and category breakdown
- Executive summary generation
```bash
python scripts/budget_variance_analyzer.py budget_data.json
python scripts/budget_variance_analyzer.py budget_data.json --format json
python scripts/budget_variance_analyzer.py budget_data.json --threshold-pct 5 --threshold-amt 25000
```
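
The favorable/unfavorable logic and the materiality filter can be illustrated with a short sketch. It is not the code in `scripts/budget_variance_analyzer.py`. It assumes a variance is material when either threshold is met, and it treats the thresholds as inclusive; both are interpretations of "10% or $50K", not confirmed behavior.

```python
def classify_variance(actual, budget, kind, threshold_pct=10.0, threshold_amt=50_000):
    """Classify one line item; kind is "revenue" or "expense"."""
    if kind not in ("revenue", "expense"):
        raise ValueError("kind must be 'revenue' or 'expense'")
    variance = actual - budget
    pct = variance * 100 / abs(budget) if budget else None

    if variance == 0:
        direction = "on budget"
    elif kind == "revenue":
        # More revenue than budgeted is good.
        direction = "favorable" if variance > 0 else "unfavorable"
    else:
        # Less spending than budgeted is good.
        direction = "favorable" if variance < 0 else "unfavorable"

    # Assumption: material if EITHER the dollar or the percentage threshold is met.
    material = abs(variance) >= threshold_amt or (
        pct is not None and abs(pct) >= threshold_pct
    )
    return {"variance": variance, "pct": pct,
            "direction": direction, "material": material}


# Hypothetical line items
print(classify_variance(1_100_000, 1_000_000, "revenue"))
print(classify_variance(520_000, 500_000, "expense"))
```

The same positive variance reads as favorable for revenue and unfavorable for expenses, which is why the sign alone is not enough to classify a line.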
### 4. Forecast Builder (`scripts/forecast_builder.py`)
Driver-based revenue forecasting with rolling cash flow projection and scenario modeling.
**Features:**
- Driver-based revenue forecast model
- 13-week rolling cash flow projection
- Scenario modeling (base/bull/bear cases)
- Trend analysis using simple linear regression (standard library)
```bash
python scripts/forecast_builder.py forecast_data.json
python scripts/forecast_builder.py forecast_data.json --format json
python scripts/forecast_builder.py forecast_data.json --scenarios base,bull,bear
```
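
A simple linear trend with scenario multipliers can be written with the standard library alone. The sketch below illustrates the technique and is not the code in `scripts/forecast_builder.py`. The scenario multipliers are hypothetical placeholders; a driver-based model would derive each scenario from its own assumptions.

```python
def linear_trend(values):
    """Ordinary least squares fit of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical multipliers applied to the base trend.
SCENARIO_MULTIPLIERS = {"base": 1.00, "bull": 1.10, "bear": 0.90}


def project_scenarios(history, periods, multipliers=SCENARIO_MULTIPLIERS):
    """Extend the fitted trend by `periods` steps and scale it per scenario."""
    slope, intercept = linear_trend(history)
    start = len(history)
    base = [intercept + slope * (start + i) for i in range(periods)]
    return {name: [value * m for value in base] for name, m in multipliers.items()}


# Hypothetical quarterly revenue history
print(project_scenarios([100, 110, 120, 130], periods=2))
```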
## Knowledge Bases
| Reference | Purpose |
|-----------|---------|
| `references/financial-ratios-guide.md` | Ratio formulas, interpretation, industry benchmarks |
| `references/valuation-methodology.md` | DCF methodology, WACC, terminal value, comps |
| `references/forecasting-best-practices.md` | Driver-based forecasting, rolling forecasts, accuracy |
| `references/industry-adaptations.md` | Sector-specific metrics and considerations (SaaS, Retail, Manufacturing, Financial Services, Healthcare) |
## Templates
| Template | Purpose |
|----------|---------|
| `assets/variance_report_template.md` | Budget variance report template |
| `assets/dcf_analysis_template.md` | DCF valuation analysis template |
| `assets/forecast_report_template.md` | Revenue forecast report template |
## Key Metrics & Targets
| Metric | Target |
|--------|--------|
| Forecast accuracy (revenue) | +/-5% |
| Forecast accuracy (expenses) | +/-3% |
| Report delivery | 100% on time |
| Model documentation | Complete for all assumptions |
| Variance explanation | 100% of material variances |
## Input Data Format
All scripts accept JSON input files. See `assets/sample_financial_data.json` for the complete input schema covering all four tools.
## Dependencies
**None** - All scripts use Python standard library only (`math`, `statistics`, `json`, `argparse`, `datetime`). No numpy, pandas, or scipy required.
FILE:assets/dcf_analysis_template.md
# DCF Valuation Analysis
## Report Header
| Field | Value |
|-------|-------|
| **Company** | [Company Name] |
| **Ticker** | [Ticker Symbol] |
| **Analysis Date** | [Date] |
| **Prepared By** | [Analyst Name] |
| **Current Share Price** | $[X] |
| **Shares Outstanding** | [X]M |
## Executive Summary
[2-3 sentence overview of the valuation conclusion, including the implied value range per share compared to the current market price, and whether the stock appears undervalued, fairly valued, or overvalued.]
### Valuation Summary
| Method | Enterprise Value | Equity Value | Value Per Share | vs Current Price |
|--------|-----------------|-------------|----------------|-----------------|
| DCF (Perpetuity Growth) | $[X]M | $[X]M | $[X] | [X]% |
| DCF (Exit Multiple) | $[X]M | $[X]M | $[X] | [X]% |
| Comparable Companies | $[X]M | $[X]M | $[X] | [X]% |
| **Blended Estimate** | **$[X]M** | **$[X]M** | **$[X]** | **[X]%** |
## Investment Thesis
[Summary of the investment case, including key strengths, risks, and catalysts.]
## Historical Financial Summary
| ($M) | FY-4 | FY-3 | FY-2 | FY-1 | LTM |
|------|------|------|------|------|-----|
| Revenue | [X] | [X] | [X] | [X] | [X] |
| Revenue Growth | [X]% | [X]% | [X]% | [X]% | [X]% |
| Gross Profit | [X] | [X] | [X] | [X] | [X] |
| Gross Margin | [X]% | [X]% | [X]% | [X]% | [X]% |
| EBITDA | [X] | [X] | [X] | [X] | [X] |
| EBITDA Margin | [X]% | [X]% | [X]% | [X]% | [X]% |
| Net Income | [X] | [X] | [X] | [X] | [X] |
| Free Cash Flow | [X] | [X] | [X] | [X] | [X] |
## WACC Calculation
### Cost of Equity (CAPM)
| Component | Value | Source |
|-----------|-------|--------|
| Risk-Free Rate | [X]% | [10-Year Treasury] |
| Equity Risk Premium | [X]% | [Damodaran / internal] |
| Beta (Levered) | [X] | [Bloomberg / regression] |
| Size Premium | [X]% | [Duff & Phelps] |
| Company-Specific Risk | [X]% | [Analyst judgment] |
| **Cost of Equity** | **[X]%** | |
### Cost of Debt
| Component | Value |
|-----------|-------|
| Pre-Tax Cost of Debt | [X]% |
| Tax Rate | [X]% |
| After-Tax Cost of Debt | [X]% |
### Capital Structure
| Component | Market Value ($M) | Weight |
|-----------|------------------|--------|
| Equity | [X] | [X]% |
| Debt | [X] | [X]% |
| **Total Capital** | **[X]** | **100%** |
### WACC Result: [X]%
## Revenue Projections
| ($M) | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
|------|--------|--------|--------|--------|--------|
| Revenue | [X] | [X] | [X] | [X] | [X] |
| Growth Rate | [X]% | [X]% | [X]% | [X]% | [X]% |
**Key Revenue Assumptions:**
- [Assumption 1 with supporting rationale]
- [Assumption 2 with supporting rationale]
- [Assumption 3 with supporting rationale]
## Free Cash Flow Projections
| ($M) | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
|------|--------|--------|--------|--------|--------|
| Revenue | [X] | [X] | [X] | [X] | [X] |
| EBIT | [X] | [X] | [X] | [X] | [X] |
| Taxes on EBIT | ([X]) | ([X]) | ([X]) | ([X]) | ([X]) |
| NOPAT | [X] | [X] | [X] | [X] | [X] |
| D&A | [X] | [X] | [X] | [X] | [X] |
| CapEx | ([X]) | ([X]) | ([X]) | ([X]) | ([X]) |
| Change in NWC | ([X]) | ([X]) | ([X]) | ([X]) | ([X]) |
| **Unlevered FCF** | **[X]** | **[X]** | **[X]** | **[X]** | **[X]** |
| FCF Margin | [X]% | [X]% | [X]% | [X]% | [X]% |
## Terminal Value
### Perpetuity Growth Method
| Component | Value |
|-----------|-------|
| Terminal FCF | $[X]M |
| Terminal Growth Rate | [X]% |
| WACC | [X]% |
| **Terminal Value** | **$[X]M** |
| TV as % of EV | [X]% |
### Exit Multiple Method
| Component | Value |
|-----------|-------|
| Terminal EBITDA | $[X]M |
| Exit EV/EBITDA Multiple | [X]x |
| **Terminal Value** | **$[X]M** |
| TV as % of EV | [X]% |
## Enterprise Value Bridge
| Component | Perpetuity Growth | Exit Multiple |
|-----------|------------------|---------------|
| PV of Projected FCFs | $[X]M | $[X]M |
| PV of Terminal Value | $[X]M | $[X]M |
| **Enterprise Value** | **$[X]M** | **$[X]M** |
| Less: Net Debt | ($[X]M) | ($[X]M) |
| Less: Minority Interest | ($[X]M) | ($[X]M) |
| **Equity Value** | **$[X]M** | **$[X]M** |
| Diluted Shares (M) | [X] | [X] |
| **Value Per Share** | **$[X]** | **$[X]** |
## Sensitivity Analysis
### WACC vs Terminal Growth Rate (Enterprise Value, $M)
| WACC \ Growth | [g-2]% | [g-1]% | [g]% | [g+1]% | [g+2]% |
|--------------|--------|--------|------|--------|--------|
| [WACC-2]% | [X] | [X] | [X] | [X] | [X] |
| [WACC-1]% | [X] | [X] | [X] | [X] | [X] |
| **[WACC]%** | [X] | [X] | **[X]** | [X] | [X] |
| [WACC+1]% | [X] | [X] | [X] | [X] | [X] |
| [WACC+2]% | [X] | [X] | [X] | [X] | [X] |
### Implied Share Price Range
| Scenario | Share Price | vs Current | Upside/Downside |
|----------|-----------|------------|----------------|
| Bear Case (WACC+2%, g-2%) | $[X] | [X]% | [X]% |
| Base Case | $[X] | [X]% | [X]% |
| Bull Case (WACC-2%, g+2%) | $[X] | [X]% | [X]% |
## Key Risks to Valuation
1. **[Risk 1]** - [Description and potential impact on value]
2. **[Risk 2]** - [Description and potential impact on value]
3. **[Risk 3]** - [Description and potential impact on value]
## Comparable Company Analysis
| Company | EV/Revenue | EV/EBITDA | P/E | Growth | Margin |
|---------|-----------|----------|-----|--------|--------|
| [Comp 1] | [X]x | [X]x | [X]x | [X]% | [X]% |
| [Comp 2] | [X]x | [X]x | [X]x | [X]% | [X]% |
| [Comp 3] | [X]x | [X]x | [X]x | [X]% | [X]% |
| [Comp 4] | [X]x | [X]x | [X]x | [X]% | [X]% |
| **Median** | **[X]x** | **[X]x** | **[X]x** | **[X]%** | **[X]%** |
| **[Target]** | **[X]x** | **[X]x** | **[X]x** | **[X]%** | **[X]%** |
## Conclusion and Recommendation
**Valuation Range:** $[Low] - $[High] per share
**Current Price:** $[X]
**Recommendation:** [Buy / Hold / Sell]
[Final paragraph with investment recommendation rationale, key upside catalysts, and primary risks to monitor.]
---
*Analysis generated using Financial Analyst Skill - DCF Valuation Model*
FILE:assets/expected_output.json
{
"_description": "Expected output structure for all 4 scripts. Values are illustrative to show data format.",
"ratio_calculator_output": {
"categories": {
"profitability": {
"roe": {
"value": 0.25,
"formula": "Net Income / Total Equity",
"name": "Return on Equity",
"interpretation": "Good - above average performance"
},
"roa": {
"value": 0.1375,
"formula": "Net Income / Total Assets",
"name": "Return on Assets",
"interpretation": "Excellent - significantly above peers"
},
"gross_margin": {
"value": 0.40,
"formula": "(Revenue - COGS) / Revenue",
"name": "Gross Margin",
"interpretation": "Acceptable - within normal range"
},
"operating_margin": {
"value": 0.16,
"formula": "Operating Income / Revenue",
"name": "Operating Margin",
"interpretation": "Good - above average performance"
},
"net_margin": {
"value": 0.11,
"formula": "Net Income / Revenue",
"name": "Net Margin",
"interpretation": "Good - above average performance"
}
},
"liquidity": {
"current_ratio": {"value": 1.875, "name": "Current Ratio"},
"quick_ratio": {"value": 1.4375, "name": "Quick Ratio"},
"cash_ratio": {"value": 0.625, "name": "Cash Ratio"}
},
"leverage": {
"debt_to_equity": {"value": 0.545, "name": "Debt-to-Equity Ratio"},
"interest_coverage": {"value": 6.67, "name": "Interest Coverage Ratio"},
"dscr": {"value": 2.50, "name": "Debt Service Coverage Ratio"}
},
"efficiency": {
"asset_turnover": {"value": 1.25, "name": "Asset Turnover"},
"inventory_turnover": {"value": 8.57, "name": "Inventory Turnover"},
"receivables_turnover": {"value": 8.33, "name": "Receivables Turnover"},
"dso": {"value": 43.8, "name": "Days Sales Outstanding"}
},
"valuation": {
"pe_ratio": {"value": 81.82, "name": "Price-to-Earnings Ratio"},
"pb_ratio": {"value": 20.45, "name": "Price-to-Book Ratio"},
"ps_ratio": {"value": 9.0, "name": "Price-to-Sales Ratio"},
"ev_ebitda": {"value": 45.7, "name": "EV/EBITDA"},
"peg_ratio": {"value": 6.82, "name": "PEG Ratio"}
}
}
},
"dcf_valuation_output": {
"wacc": 0.085,
"projected_revenue": [55000000, 59950000, 64746000, 69278220, 73434953],
"projected_fcf": [6600000, 7793500, 8416980, 9698951, 10280893],
"terminal_value": {
"perpetuity_growth": 175382225,
"exit_multiple": 176243484
},
"enterprise_value": {
"perpetuity_growth": 149500000,
"exit_multiple": 150100000
},
"equity_value": {
"perpetuity_growth": 142500000,
"exit_multiple": 143100000
},
"value_per_share": {
"perpetuity_growth": 14.25,
"exit_multiple": 14.31
},
"sensitivity_analysis": {
"wacc_values": [0.065, 0.075, 0.085, 0.095, 0.105],
"growth_values": [0.015, 0.020, 0.025, 0.030, 0.035],
"enterprise_value_table": "5x5 nested list of enterprise values",
"share_price_table": "5x5 nested list of share prices"
}
},
"budget_variance_output": {
"executive_summary": {
"period": "Q4 2025",
"company": "Acme Corp",
"total_line_items": 10,
"material_variances_count": 3,
"favorable_count": 4,
"unfavorable_count": 6,
"revenue": {
"actual": 15700000,
"budget": 15500000,
"variance_amount": 200000,
"variance_pct": 1.29
},
"expenses": {
"actual": 13255000,
"budget": 12520000,
"variance_amount": 735000,
"variance_pct": 5.87
},
"net_impact": -535000
},
"material_variances": [
{
"name": "Cost of Goods Sold",
"budget_variance_amount": 600000,
"budget_variance_pct": 8.33,
"favorability": "Unfavorable"
}
],
"department_summary": {
"Sales": {"total_variance": 0, "variance_pct": 0},
"Operations": {"total_variance": 0, "variance_pct": 0}
},
"category_summary": {
"Revenue": {"total_variance": 0, "variance_pct": 0},
"COGS": {"total_variance": 0, "variance_pct": 0}
}
},
"forecast_builder_output": {
"trend_analysis": {
"trend": {
"slope": 650000,
"intercept": 9500000,
"r_squared": 0.98,
"direction": "upward"
},
"average_growth_rate": 0.06,
"seasonality_index": [0.92, 0.97, 1.01, 1.10]
},
"scenario_comparison": {
"comparison": [
{"scenario": "base", "total_revenue": 185000000, "growth_rate": 0.08},
{"scenario": "bull", "total_revenue": 210000000, "growth_rate": 0.12},
{"scenario": "bear", "total_revenue": 165000000, "growth_rate": 0.05}
]
},
"rolling_cash_flow": {
"weeks": 13,
"opening_balance": 2500000,
"closing_balance": 2800000,
"total_inflows": 4200000,
"total_outflows": 3900000,
"minimum_balance": 2100000,
"minimum_balance_week": 4,
"cash_runway_weeks": 12
}
}
}
FILE:assets/forecast_report_template.md
# Revenue Forecast Report
## Report Header
| Field | Value |
|-------|-------|
| **Company** | [Company Name] |
| **Forecast Period** | [Start] to [End] |
| **Prepared By** | [Analyst Name] |
| **Date** | [Report Date] |
| **Forecast Type** | [Driver-Based / Trend-Based / Blended] |
## Executive Summary
[2-3 sentence overview of the revenue forecast, key assumptions, and confidence level. Highlight the base case total revenue, expected growth rate, and any significant departures from prior forecast or budget.]
### Key Metrics at a Glance
| Metric | Value |
|--------|-------|
| Base Case Total Revenue | $[X]M |
| Expected Growth Rate | [X]% |
| Forecast Confidence | [High / Medium / Low] |
| Revenue Range (Bear to Bull) | $[X]M - $[X]M |
| Primary Revenue Driver | [Driver description] |
## Historical Trend Analysis
### Revenue Trend
| Period | Revenue | Growth Rate | Gross Margin |
|--------|---------|------------|-------------|
| [Q/Year-4] | $[X]M | - | [X]% |
| [Q/Year-3] | $[X]M | [X]% | [X]% |
| [Q/Year-2] | $[X]M | [X]% | [X]% |
| [Q/Year-1] | $[X]M | [X]% | [X]% |
| [Current] | $[X]M | [X]% | [X]% |
### Trend Statistics
| Metric | Value |
|--------|-------|
| Average Growth Rate | [X]% |
| Trend Direction | [Upward / Flat / Downward] |
| R-squared (fit quality) | [X] |
| Seasonality Detected | [Yes / No] |
## Revenue Drivers
### Primary Drivers
| Driver | Current Value | Projected Value | Growth |
|--------|-------------|-----------------|--------|
| [Units / Customers / etc.] | [X] | [X] | [X]% |
| [Price / ARPU / etc.] | $[X] | $[X] | [X]% |
| [Conversion / Retention] | [X]% | [X]% | [X]pp |
### Driver Assumptions
1. **[Driver 1]:** [Assumption and rationale]
2. **[Driver 2]:** [Assumption and rationale]
3. **[Driver 3]:** [Assumption and rationale]
## Scenario Comparison
### Summary
| Scenario | Total Revenue | Growth Rate | Op. Income | Gross Margin | Probability |
|----------|-------------|-------------|-----------|-------------|-------------|
| Bull | $[X]M | [X]% | $[X]M | [X]% | [X]% |
| **Base** | **$[X]M** | **[X]%** | **$[X]M** | **[X]%** | **[X]%** |
| Bear | $[X]M | [X]% | $[X]M | [X]% | [X]% |
### Scenario Assumptions
**Bull Case:**
- [Key assumption 1]
- [Key assumption 2]
- [Trigger: what conditions would cause this scenario]
**Base Case:**
- [Key assumption 1]
- [Key assumption 2]
**Bear Case:**
- [Key assumption 1]
- [Key assumption 2]
- [Trigger: what conditions would cause this scenario]
## Monthly/Quarterly Forecast Detail (Base Case)
| Period | Revenue | COGS | Gross Profit | OpEx | Op. Income |
|--------|---------|------|-------------|------|-----------|
| [Period 1] | $[X] | $[X] | $[X] | $[X] | $[X] |
| [Period 2] | $[X] | $[X] | $[X] | $[X] | $[X] |
| [Period 3] | $[X] | $[X] | $[X] | $[X] | $[X] |
| [Period 4] | $[X] | $[X] | $[X] | $[X] | $[X] |
| ... | ... | ... | ... | ... | ... |
| **Total** | **$[X]** | **$[X]** | **$[X]** | **$[X]** | **$[X]** |
## 13-Week Rolling Cash Flow
### Summary
| Metric | Value |
|--------|-------|
| Opening Cash Balance | $[X] |
| Projected Closing Balance | $[X] |
| Net Cash Change | $[X] |
| Minimum Cash Balance | $[X] (Week [N]) |
| Cash Runway | [N] weeks |
### Weekly Cash Flow Projection
| Week | Inflows | Outflows | Net Cash Flow | Closing Balance |
|------|---------|----------|--------------|----------------|
| 1 | $[X] | $[X] | $[X] | $[X] |
| 2 | $[X] | $[X] | $[X] | $[X] |
| 3 | $[X] | $[X] | $[X] | $[X] |
| ... | ... | ... | ... | ... |
| 13 | $[X] | $[X] | $[X] | $[X] |
### Cash Flow Notes
- **Week [N]:** [Description of any significant one-time items]
- **Week [N]:** [Description of any significant one-time items]
## Forecast Accuracy Tracking
### vs Prior Forecast
| Metric | Prior Forecast | Current Forecast | Change |
|--------|---------------|-----------------|--------|
| Revenue | $[X]M | $[X]M | [X]% |
| Growth Rate | [X]% | [X]% | [X]pp |
| Gross Margin | [X]% | [X]% | [X]pp |
### Historical Forecast Accuracy (MAPE)
| Period | Forecast | Actual | Error | Abs. % Error |
|--------|----------|--------|-------|------|
| [Period-3] | $[X] | $[X] | $[X] | [X]% |
| [Period-2] | $[X] | $[X] | $[X] | [X]% |
| [Period-1] | $[X] | $[X] | $[X] | [X]% |
| **Average MAPE** | | | | **[X]%** |
## Key Risks and Assumptions
### Upside Risks
1. [Risk/opportunity with quantified potential impact]
2. [Risk/opportunity with quantified potential impact]
### Downside Risks
1. [Risk with quantified potential impact]
2. [Risk with quantified potential impact]
### Critical Assumptions
1. [Assumption that if wrong would materially change the forecast]
2. [Assumption that if wrong would materially change the forecast]
## Recommendations
1. **[Recommendation 1]:** [Specific action with expected impact]
2. **[Recommendation 2]:** [Specific action with expected impact]
3. **[Recommendation 3]:** [Specific action with expected impact]
## Next Steps
| # | Action | Owner | Due Date |
|---|--------|-------|----------|
| 1 | [Action item] | [Name] | [Date] |
| 2 | [Action item] | [Name] | [Date] |
| 3 | [Action item] | [Name] | [Date] |
---
*Report generated using Financial Analyst Skill - Forecast Builder*
FILE:assets/sample_financial_data.json
{
"_description": "Sample financial data covering all 4 scripts: ratio_calculator, dcf_valuation, budget_variance_analyzer, and forecast_builder",
"ratio_analysis": {
"income_statement": {
"revenue": 50000000,
"cost_of_goods_sold": 30000000,
"operating_income": 8000000,
"ebitda": 10000000,
"net_income": 5500000,
"interest_expense": 1200000
},
"balance_sheet": {
"total_assets": 40000000,
"current_assets": 15000000,
"cash_and_equivalents": 5000000,
"accounts_receivable": 6000000,
"inventory": 3500000,
"total_equity": 22000000,
"total_debt": 12000000,
"current_liabilities": 8000000
},
"cash_flow": {
"operating_cash_flow": 7500000,
"total_debt_service": 3000000
},
"market_data": {
"share_price": 45.00,
"shares_outstanding": 10000000,
"market_cap": 450000000,
"earnings_growth_rate": 0.12
}
},
"dcf_valuation": {
"historical": {
"revenue": [38000000, 42000000, 45000000, 48000000, 50000000],
"net_income": [3800000, 4200000, 4500000, 5000000, 5500000],
"net_debt": 7000000,
"shares_outstanding": 10000000
},
"assumptions": {
"projection_years": 5,
"revenue_growth_rates": [0.10, 0.09, 0.08, 0.07, 0.06],
"fcf_margins": [0.12, 0.13, 0.13, 0.14, 0.14],
"default_revenue_growth": 0.05,
"default_fcf_margin": 0.10,
"terminal_growth_rate": 0.025,
"terminal_ebitda_margin": 0.20,
"exit_ev_ebitda_multiple": 12.0,
"wacc_inputs": {
"risk_free_rate": 0.04,
"equity_risk_premium": 0.06,
"beta": 1.1,
"cost_of_debt": 0.055,
"tax_rate": 0.25,
"debt_weight": 0.30,
"equity_weight": 0.70
}
}
},
"budget_variance": {
"company": "Acme Corp",
"period": "Q4 2025",
"line_items": [
{
"name": "Product Revenue",
"type": "revenue",
"department": "Sales",
"category": "Revenue",
"actual": 12500000,
"budget": 12000000,
"prior_year": 10800000
},
{
"name": "Service Revenue",
"type": "revenue",
"department": "Sales",
"category": "Revenue",
"actual": 3200000,
"budget": 3500000,
"prior_year": 2900000
},
{
"name": "Cost of Goods Sold",
"type": "expense",
"department": "Operations",
"category": "COGS",
"actual": 7800000,
"budget": 7200000,
"prior_year": 6700000
},
{
"name": "Salaries & Wages",
"type": "expense",
"department": "Human Resources",
"category": "Personnel",
"actual": 2100000,
"budget": 2200000,
"prior_year": 1950000
},
{
"name": "Marketing & Advertising",
"type": "expense",
"department": "Marketing",
"category": "Sales & Marketing",
"actual": 850000,
"budget": 750000,
"prior_year": 680000
},
{
"name": "Software & Technology",
"type": "expense",
"department": "Engineering",
"category": "Technology",
"actual": 420000,
"budget": 400000,
"prior_year": 350000
},
{
"name": "Office & Facilities",
"type": "expense",
"department": "Operations",
"category": "G&A",
"actual": 180000,
"budget": 200000,
"prior_year": 175000
},
{
"name": "Travel & Entertainment",
"type": "expense",
"department": "Sales",
"category": "Sales & Marketing",
"actual": 95000,
"budget": 120000,
"prior_year": 88000
},
{
"name": "Professional Services",
"type": "expense",
"department": "Finance",
"category": "G&A",
"actual": 310000,
"budget": 250000,
"prior_year": 220000
},
{
"name": "R&D Expenses",
"type": "expense",
"department": "Engineering",
"category": "R&D",
"actual": 1500000,
"budget": 1400000,
"prior_year": 1200000
}
]
},
"forecast": {
"historical_periods": [
{"period": "Q1 2024", "revenue": 10500000, "gross_profit": 4200000, "operating_income": 1575000},
{"period": "Q2 2024", "revenue": 11200000, "gross_profit": 4480000, "operating_income": 1680000},
{"period": "Q3 2024", "revenue": 11800000, "gross_profit": 4720000, "operating_income": 1770000},
{"period": "Q4 2024", "revenue": 12500000, "gross_profit": 5000000, "operating_income": 1875000},
{"period": "Q1 2025", "revenue": 12800000, "gross_profit": 5120000, "operating_income": 1920000},
{"period": "Q2 2025", "revenue": 13500000, "gross_profit": 5400000, "operating_income": 2025000},
{"period": "Q3 2025", "revenue": 14100000, "gross_profit": 5640000, "operating_income": 2115000},
{"period": "Q4 2025", "revenue": 15700000, "gross_profit": 6280000, "operating_income": 2355000}
],
"drivers": {
"units": {
"base_units": 5000,
"growth_rate": 0.04
},
"pricing": {
"base_price": 2800,
"annual_increase": 0.03
}
},
"assumptions": {
"revenue_growth_rate": 0.08,
"gross_margin": 0.40,
"opex_pct_revenue": 0.25,
"forecast_periods": 12
},
"scenarios": {
"base": {
"growth_adjustment": 0.0,
"margin_adjustment": 0.0
},
"bull": {
"growth_adjustment": 0.04,
"margin_adjustment": 0.03
},
"bear": {
"growth_adjustment": -0.03,
"margin_adjustment": -0.02
}
},
"cash_flow_inputs": {
"opening_cash_balance": 2500000,
"weekly_revenue": 350000,
"collection_rate": 0.85,
"collection_lag_weeks": 2,
"weekly_payroll": 160000,
"weekly_rent": 15000,
"weekly_operating": 45000,
"weekly_other": 20000,
"one_time_items": [
{"week": 3, "amount": -250000, "description": "Annual insurance premium"},
{"week": 6, "amount": 500000, "description": "Customer prepayment"},
{"week": 9, "amount": -180000, "description": "Equipment purchase"},
{"week": 13, "amount": -75000, "description": "Quarterly tax payment"}
]
},
"forecast_periods": 12
}
}
FILE:assets/variance_report_template.md
# Budget Variance Report
## Report Header
| Field | Value |
|-------|-------|
| **Company** | [Company Name] |
| **Period** | [Reporting Period] |
| **Prepared By** | [Analyst Name] |
| **Date** | [Report Date] |
| **Materiality Threshold** | [X]% or $[Y]K |
## Executive Summary
[2-3 sentence overview of overall performance vs budget, highlighting whether the company is tracking ahead or behind plan and the primary drivers of variance.]
### Key Metrics
| Metric | Actual | Budget | Variance ($) | Variance (%) | Status |
|--------|--------|--------|-------------|-------------|--------|
| Total Revenue | $[X] | $[X] | $[X] | [X]% | [Fav/Unfav] |
| Total Expenses | $[X] | $[X] | $[X] | [X]% | [Fav/Unfav] |
| Net Income | $[X] | $[X] | $[X] | [X]% | [Fav/Unfav] |
| Operating Margin | [X]% | [X]% | [X]pp | - | [Fav/Unfav] |
## Material Variances
### [Variance Item 1 - e.g., Product Revenue]
| | Actual | Budget | Variance ($) | Variance (%) |
|---|--------|--------|---------|---|
| Amount | $[X] | $[X] | $[X] | [X]% |
**Root Cause:** [Detailed explanation of why this variance occurred]
**Impact:** [Quantified impact on profitability and cash flow]
**Corrective Action:** [Specific steps being taken to address the variance]
**Responsible:** [Owner] | **Target Date:** [Date]
---
### [Variance Item 2]
| | Actual | Budget | Variance ($) | Variance (%) |
|---|--------|--------|---------|---|
| Amount | $[X] | $[X] | $[X] | [X]% |
**Root Cause:** [Explanation]
**Impact:** [Impact]
**Corrective Action:** [Action items]
**Responsible:** [Owner] | **Target Date:** [Date]
---
## Department Performance
| Department | Actual | Budget | Variance ($) | Variance (%) | Favorable | Unfavorable |
|-----------|--------|--------|-------------|-------------|-----------|-------------|
| Sales | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
| Operations | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
| Marketing | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
| Engineering | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
| Finance | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
| HR | $[X] | $[X] | $[X] | [X]% | [N] | [N] |
## Category Breakdown
| Category | Actual | Budget | Variance ($) | Variance (%) |
|----------|--------|--------|-------------|-------------|
| Revenue | $[X] | $[X] | $[X] | [X]% |
| COGS | $[X] | $[X] | $[X] | [X]% |
| Personnel | $[X] | $[X] | $[X] | [X]% |
| Sales & Marketing | $[X] | $[X] | $[X] | [X]% |
| Technology | $[X] | $[X] | $[X] | [X]% |
| G&A | $[X] | $[X] | $[X] | [X]% |
| R&D | $[X] | $[X] | $[X] | [X]% |
## Prior Year Comparison
| Metric | Current Actual | Prior Year | YoY Change ($) | YoY Change (%) |
|--------|---------------|-----------|---------------|---------------|
| Revenue | $[X] | $[X] | $[X] | [X]% |
| Gross Profit | $[X] | $[X] | $[X] | [X]% |
| Operating Income | $[X] | $[X] | $[X] | [X]% |
| Net Income | $[X] | $[X] | $[X] | [X]% |
## Risks and Opportunities
### Risks
1. [Risk description with quantified impact]
2. [Risk description with quantified impact]
### Opportunities
1. [Opportunity description with quantified upside]
2. [Opportunity description with quantified upside]
## Forecast Impact
Based on current variances, the full-year forecast is adjusted as follows:
| Metric | Original FY Forecast | Revised FY Forecast | Change |
|--------|---------------------|--------------------|---------|
| Revenue | $[X] | $[X] | $[X] |
| EBITDA | $[X] | $[X] | $[X] |
| Net Income | $[X] | $[X] | $[X] |
## Action Items
| # | Action | Owner | Due Date | Status |
|---|--------|-------|----------|--------|
| 1 | [Action description] | [Name] | [Date] | [Open/In Progress/Complete] |
| 2 | [Action description] | [Name] | [Date] | [Open/In Progress/Complete] |
| 3 | [Action description] | [Name] | [Date] | [Open/In Progress/Complete] |
---
*Report generated using Financial Analyst Skill - Budget Variance Analyzer*
FILE:references/financial-ratios-guide.md
# Financial Ratios Guide
Comprehensive reference for financial ratio analysis covering formulas, interpretation, and industry benchmarks across five categories.
## 1. Profitability Ratios
Measure a company's ability to generate earnings relative to revenue, assets, or equity.
### Return on Equity (ROE)
**Formula:** Net Income / Total Shareholders' Equity
**Interpretation:**
- Measures how effectively management uses equity to generate profits
- Higher ROE indicates more efficient use of equity capital
- Compare against cost of equity - ROE should exceed it
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Below Average | < 8% |
| Acceptable | 8% - 15% |
| Good | 15% - 25% |
| Excellent | > 25% |
**Caveats:** High leverage can inflate ROE. Use DuPont decomposition (ROE = Margin x Turnover x Leverage) for deeper analysis.
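
The DuPont decomposition in the caveat above can be sketched in a few lines. This is a minimal illustration, not the skill's own `ratio_calculator` code: the function name `dupont` and its return keys are assumptions, and "Leverage" is taken to mean the equity multiplier (Total Assets / Total Equity). The inputs come from the `ratio_analysis` block of `assets/sample_financial_data.json`.

```python
def dupont(net_income: float, revenue: float,
           total_assets: float, total_equity: float) -> dict:
    """Split ROE into net margin, asset turnover, and equity multiplier."""
    net_margin = net_income / revenue
    asset_turnover = revenue / total_assets
    equity_multiplier = total_assets / total_equity  # the "Leverage" term
    return {
        "net_margin": net_margin,
        "asset_turnover": asset_turnover,
        "equity_multiplier": equity_multiplier,
        # The product of the three components equals Net Income / Total Equity.
        "roe": net_margin * asset_turnover * equity_multiplier,
    }


# Sample figures from assets/sample_financial_data.json (ratio_analysis block).
parts = dupont(net_income=5_500_000, revenue=50_000_000,
               total_assets=40_000_000, total_equity=22_000_000)
print({name: round(value, 4) for name, value in parts.items()})
```

Reading the three components side by side shows whether a given ROE comes from profitability, asset efficiency, or leverage, which is the distinction the caveat asks for.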
### Return on Assets (ROA)
**Formula:** Net Income / Total Assets
**Interpretation:**
- Measures how efficiently assets generate profit
- Asset-light businesses naturally have higher ROA
- Compare within industry only
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Below Average | < 3% |
| Acceptable | 3% - 6% |
| Good | 6% - 12% |
| Excellent | > 12% |
### Gross Margin
**Formula:** (Revenue - COGS) / Revenue
**Interpretation:**
- Measures production efficiency and pricing power
- Declining gross margin may signal competitive pressure or cost inflation
- Critical for evaluating business model sustainability
**Benchmarks by Industry:**
| Industry | Typical Range |
|----------|--------------|
| Software/SaaS | 70% - 85% |
| Financial Services | 50% - 70% |
| Retail | 25% - 45% |
| Manufacturing | 20% - 40% |
| Grocery | 25% - 30% |
### Operating Margin
**Formula:** Operating Income / Revenue
**Interpretation:**
- Measures operational efficiency after all operating expenses
- Excludes interest and taxes for better operational comparison
- Indicates management effectiveness in controlling costs
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Below Average | < 5% |
| Acceptable | 5% - 15% |
| Good | 15% - 25% |
| Excellent | > 25% |
### Net Margin
**Formula:** Net Income / Revenue
**Interpretation:**
- Bottom-line profitability after all expenses
- Affected by tax strategy, capital structure, and one-time items
- Most comprehensive profitability measure
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Below Average | < 3% |
| Acceptable | 3% - 10% |
| Good | 10% - 20% |
| Excellent | > 20% |
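
The rating tables in this section can be turned into a small lookup. The sketch below is hypothetical and is not taken from the skill's scripts: the names `BENCHMARKS` and `rate` are illustrative, and the tables do not say which band a boundary value belongs to. Here a value that sits exactly on a cutoff is assigned to the lower band, which is an assumption chosen so that an ROE of 0.25 rates as "Good", as in `assets/expected_output.json`.

```python
from bisect import bisect_left

LABELS = ["Below Average", "Acceptable", "Good", "Excellent"]

# Cutoffs copied from the benchmark tables above, as decimals.
BENCHMARKS = {
    "roe": [0.08, 0.15, 0.25],
    "roa": [0.03, 0.06, 0.12],
    "operating_margin": [0.05, 0.15, 0.25],
    "net_margin": [0.03, 0.10, 0.20],
}


def rate(metric: str, value: float) -> str:
    """Return the rating band for a profitability ratio.

    Assumption: a value exactly on a cutoff falls into the lower band.
    """
    cutoffs = BENCHMARKS[metric]
    return LABELS[bisect_left(cutoffs, value)]


# Ratios derived from the sample data set.
for metric, value in [("roe", 0.25), ("roa", 0.1375),
                      ("operating_margin", 0.16), ("net_margin", 0.11)]:
    print(metric, rate(metric, value))
```

Gross margin is left out on purpose because its benchmarks above are given per industry rather than as a single rating scale.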
## 2. Liquidity Ratios
Measure a company's ability to meet short-term obligations.
### Current Ratio
**Formula:** Current Assets / Current Liabilities
**Interpretation:**
- Measures short-term solvency
- Too high may indicate inefficient asset use
- Too low signals potential liquidity risk
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Concern | < 1.0 |
| Acceptable | 1.0 - 1.5 |
| Healthy | 1.5 - 3.0 |
| Excessive | > 3.0 |
### Quick Ratio (Acid Test)
**Formula:** (Current Assets - Inventory) / Current Liabilities
**Interpretation:**
- More conservative than current ratio
- Excludes inventory (least liquid current asset)
- Critical for businesses with slow-moving inventory
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Concern | < 0.8 |
| Acceptable | 0.8 - 1.0 |
| Healthy | 1.0 - 2.0 |
| Excessive | > 2.0 |
### Cash Ratio
**Formula:** Cash & Equivalents / Current Liabilities
**Interpretation:**
- Most conservative liquidity measure
- Indicates ability to pay obligations with cash on hand
- Particularly important during credit crunches
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Low | < 0.2 |
| Adequate | 0.2 - 0.5 |
| Strong | 0.5 - 1.0 |
| Excessive | > 1.0 |
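
As a worked example of the three liquidity formulas, the sketch below applies them to the balance sheet in `assets/sample_financial_data.json`. The function name `liquidity_ratios` and the guard against non-positive liabilities are illustrative assumptions, not the behavior of the skill's own script.

```python
def liquidity_ratios(current_assets: float, inventory: float,
                     cash: float, current_liabilities: float) -> dict:
    """Compute current, quick, and cash ratios from balance sheet figures."""
    if current_liabilities <= 0:
        raise ValueError("current_liabilities must be positive")
    return {
        "current_ratio": current_assets / current_liabilities,
        "quick_ratio": (current_assets - inventory) / current_liabilities,
        "cash_ratio": cash / current_liabilities,
    }


# Sample balance sheet: 15.0M current assets, 3.5M inventory,
# 5.0M cash and equivalents, 8.0M current liabilities.
ratios = liquidity_ratios(current_assets=15_000_000, inventory=3_500_000,
                          cash=5_000_000, current_liabilities=8_000_000)
print(ratios)
```

Each ratio strips out another layer of less liquid assets, so the three values fall from 1.875 to 1.4375 to 0.625 for the sample company.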
## 3. Leverage Ratios
Measure the extent to which a company uses debt financing.
### Debt-to-Equity Ratio
**Formula:** Total Debt / Total Shareholders' Equity
**Interpretation:**
- Measures financial leverage and risk
- Higher ratio = more reliance on debt financing
- Industry norms vary significantly (utilities vs tech)
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Conservative | < 0.3 |
| Moderate | 0.3 - 0.8 |
| Elevated | 0.8 - 2.0 |
| High Risk | > 2.0 |
### Interest Coverage Ratio
**Formula:** Operating Income (EBIT) / Interest Expense
**Interpretation:**
- Measures ability to service debt from operating earnings
- Below 1.5x is a red flag for lenders
- Critical for credit analysis
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Distressed | < 2.0 |
| Adequate | 2.0 - 5.0 |
| Strong | 5.0 - 10.0 |
| Very Strong | > 10.0 |
### Debt Service Coverage Ratio (DSCR)
**Formula:** Operating Cash Flow / Total Debt Service
**Interpretation:**
- Cash-based measure of debt servicing capacity
- Includes principal repayments (unlike interest coverage)
- Required by many loan covenants
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Default Risk | < 1.0 |
| Minimum | 1.0 - 1.5 |
| Comfortable | 1.5 - 2.5 |
| Strong | > 2.5 |
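
The difference between interest coverage (earnings-based, interest only) and DSCR (cash-based, interest plus principal) is easiest to see side by side. The following is a minimal sketch under stated assumptions: the function name `coverage_ratios` is hypothetical, and returning infinity when there is no interest or debt service is a design choice made here, not something the guide specifies.

```python
def coverage_ratios(operating_income: float, interest_expense: float,
                    operating_cash_flow: float, total_debt_service: float) -> dict:
    """Compute interest coverage (EBIT / interest) and DSCR (OCF / debt service)."""
    # Assumption: with nothing to service, coverage is treated as unbounded.
    interest_coverage = (operating_income / interest_expense
                         if interest_expense else float("inf"))
    dscr = (operating_cash_flow / total_debt_service
            if total_debt_service else float("inf"))
    return {"interest_coverage": interest_coverage, "dscr": dscr}


# Figures from assets/sample_financial_data.json (ratio_analysis block).
coverage = coverage_ratios(operating_income=8_000_000, interest_expense=1_200_000,
                           operating_cash_flow=7_500_000, total_debt_service=3_000_000)
print({name: round(value, 2) for name, value in coverage.items()})
```

For the sample company both ratios land in the "Strong" or "Comfortable" bands of the tables above, but the DSCR is much lower than the interest coverage because it also counts principal repayments.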
## 4. Efficiency Ratios
Measure how effectively a company uses its assets and manages operations.
### Asset Turnover
**Formula:** Revenue / Total Assets
**Interpretation:**
- Measures revenue generated per dollar of assets
- Higher indicates more efficient asset utilization
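- Tends to trade off against profit margin: in the DuPont identity (ROE = Margin x Turnover x Leverage), low-margin businesses typically rely on high turnover and vice versa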
**Benchmarks:**
| Industry | Typical Range |
|----------|--------------|
| Retail | 2.0 - 3.0 |
| Manufacturing | 0.8 - 1.5 |
| Utilities | 0.3 - 0.5 |
| Technology | 0.5 - 1.0 |
### Inventory Turnover
**Formula:** COGS / Average Inventory
**Interpretation:**
- Measures how quickly inventory is sold
- Low turnover suggests overstock or obsolescence risk
- High turnover may indicate strong sales or thin inventory
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Slow | < 4x |
| Average | 4x - 8x |
| Efficient | 8x - 12x |
| Very Efficient | > 12x |
### Receivables Turnover
**Formula:** Revenue / Accounts Receivable
**Interpretation:**
- Measures efficiency of credit and collections
- Higher turnover means faster collections
- Monitor trends for credit policy changes
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Slow | < 6x |
| Average | 6x - 10x |
| Efficient | 10x - 15x |
| Very Efficient | > 15x |
### Days Sales Outstanding (DSO)
**Formula:** 365 / Receivables Turnover
**Interpretation:**
- Average days to collect payment after a sale
- Lower DSO = faster cash conversion
- Compare against payment terms
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Excellent | < 30 days |
| Good | 30 - 45 days |
| Acceptable | 45 - 60 days |
| Concern | > 60 days |
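
DSO follows directly from receivables turnover, so the two formulas can be chained. A short sketch, assuming a 365-day year as in the formula above; the function name `days_sales_outstanding` is illustrative.

```python
def days_sales_outstanding(revenue: float, accounts_receivable: float,
                           days_in_year: int = 365) -> float:
    """DSO = days in year / (Revenue / Accounts Receivable)."""
    receivables_turnover = revenue / accounts_receivable
    return days_in_year / receivables_turnover


# Sample data: 50M revenue and 6M accounts receivable.
dso = days_sales_outstanding(revenue=50_000_000, accounts_receivable=6_000_000)
print(round(dso, 1))
```

With the sample data the turnover is about 8.33x, which gives a DSO of roughly 43.8 days and places the company in the "Good" band.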
## 5. Valuation Ratios
Measure a company's market value relative to financial metrics.
### Price-to-Earnings (P/E) Ratio
**Formula:** Share Price / Earnings Per Share
**Interpretation:**
- Most widely used valuation metric
- High P/E suggests growth expectations or overvaluation
- Use trailing (TTM) and forward P/E for comparison
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Value | < 10x |
| Fair | 10x - 20x |
| Growth | 20x - 35x |
| Premium | > 35x |
### Price-to-Book (P/B) Ratio
**Formula:** Share Price / Book Value Per Share
**Interpretation:**
- Compares market value to accounting value
- Below 1.0 may indicate undervaluation or distress
- Most useful for asset-heavy industries
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Undervalued | < 1.0 |
| Fair | 1.0 - 2.5 |
| Premium | 2.5 - 5.0 |
| Rich | > 5.0 |
### Price-to-Sales (P/S) Ratio
**Formula:** Market Cap / Revenue
**Interpretation:**
- Useful for companies without positive earnings
- Compare within industry only
- Lower = potentially better value
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Value | < 1.0 |
| Fair | 1.0 - 3.0 |
| Growth | 3.0 - 8.0 |
| Premium | > 8.0 |
### EV/EBITDA
**Formula:** Enterprise Value / EBITDA
**Interpretation:**
- Capital-structure-neutral valuation metric
- Preferred for M&A analysis and leveraged buyouts
- More comparable across capital structures than P/E
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Value | < 6x |
| Fair | 6x - 12x |
| Growth | 12x - 20x |
| Premium | > 20x |
### PEG Ratio
**Formula:** P/E Ratio / Earnings Growth Rate (expressed in percentage points, e.g. 12 for 12% growth)
**Interpretation:**
- Growth-adjusted P/E ratio
- PEG of 1.0 suggests fair valuation relative to growth
- Below 1.0 may indicate undervaluation
**Benchmarks:**
| Rating | Range |
|--------|-------|
| Undervalued | < 0.5 |
| Fair | 0.5 - 1.0 |
| Fully Valued | 1.0 - 2.0 |
| Overvalued | > 2.0 |
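
The five valuation ratios can be computed together from market and financial statement data. The sketch below is illustrative only. The guide does not define Enterprise Value, so the code assumes EV = Market Cap + Total Debt - Cash and Equivalents, which reproduces the EV/EBITDA figure in `assets/expected_output.json`. The function name `valuation_ratios` is hypothetical.

```python
def valuation_ratios(share_price: float, shares_outstanding: float,
                     net_income: float, total_equity: float, revenue: float,
                     ebitda: float, total_debt: float, cash: float,
                     earnings_growth_rate: float) -> dict:
    """Compute P/E, P/B, P/S, EV/EBITDA, and PEG from raw inputs."""
    market_cap = share_price * shares_outstanding
    eps = net_income / shares_outstanding
    book_value_per_share = total_equity / shares_outstanding
    # Assumption: enterprise value = market cap + total debt - cash.
    enterprise_value = market_cap + total_debt - cash
    pe_ratio = share_price / eps
    return {
        "pe_ratio": pe_ratio,
        "pb_ratio": share_price / book_value_per_share,
        "ps_ratio": market_cap / revenue,
        "ev_ebitda": enterprise_value / ebitda,
        # The growth rate is stored as a decimal (0.12) but PEG uses
        # percentage points (12), hence the factor of 100.
        "peg_ratio": pe_ratio / (earnings_growth_rate * 100),
    }


# Figures from assets/sample_financial_data.json (ratio_analysis block).
valuation = valuation_ratios(
    share_price=45.00, shares_outstanding=10_000_000, net_income=5_500_000,
    total_equity=22_000_000, revenue=50_000_000, ebitda=10_000_000,
    total_debt=12_000_000, cash=5_000_000, earnings_growth_rate=0.12,
)
print({name: round(value, 2) for name, value in valuation.items()})
```

Dividing by the growth rate as a decimal instead of in percentage points would inflate the PEG ratio by a factor of 100, which is a common source of nonsensical results.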
## Ratio Analysis Best Practices
1. **Compare within industry** - Ratios vary significantly across sectors
2. **Analyze trends** - A single-period snapshot is insufficient; look at 3-5 year trends
3. **Use multiple ratios** - No single ratio tells the complete story
4. **Consider context** - Accounting policies, business cycle, and company stage matter
5. **DuPont decomposition** - Break ROE into margin, turnover, and leverage components
6. **Peer comparison** - Compare against direct competitors, not just broad benchmarks
7. **Watch for manipulation** - Revenue recognition changes, off-balance-sheet items, and one-time adjustments can distort ratios
FILE:references/forecasting-best-practices.md
# Forecasting Best Practices
Comprehensive reference for financial forecasting including driver-based models, rolling forecasts, accuracy improvement techniques, and scenario planning.
## 1. Driver-Based Forecasting
### Overview
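Driver-based forecasting models financial outcomes from key business drivers rather than extrapolating from historical trends alone. Because each forecast line traces back to an explicit operational assumption, the result is easier to audit and act on, and it is often more accurate when the drivers themselves are well understood.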
### Identifying Key Drivers
**Revenue Drivers:**
| Business Model | Primary Drivers |
|---------------|----------------|
| SaaS/Subscription | Customers x ARPU x Retention Rate |
| E-commerce | Visitors x Conversion Rate x AOV |
| Manufacturing | Units x Price per Unit |
| Professional Services | Headcount x Utilization x Bill Rate |
| Retail | Stores x Revenue per Store (or sqft) |
| Marketplace | GMV x Take Rate |
**Cost Drivers:**
| Category | Common Drivers |
|----------|---------------|
| COGS | Revenue x (1 - Gross Margin) or Units x Unit Cost |
| Headcount Costs | Employees x Average Compensation x (1 + Benefits Rate) |
| Sales & Marketing | Revenue x S&M % or CAC x New Customers |
| R&D | Engineering Headcount x Avg Salary |
| G&A | Headcount-based + fixed costs |
| CapEx | Revenue x CapEx Intensity or Project-based |
### Building a Driver-Based Model
**Step 1: Map the value chain**
- Revenue = f(volume drivers, pricing drivers, mix drivers)
- Costs = f(variable drivers, fixed components, step functions)
**Step 2: Establish driver relationships**
- Linear: Revenue = Units x Price
- Non-linear: Revenue = Base x (1 + Growth Rate)^t
- Step function: Facilities costs that jump at capacity thresholds
**Step 3: Validate driver assumptions**
- Compare driver values to historical actuals
- Benchmark against industry data
- Stress-test extreme values
**Step 4: Build sensitivity**
- Identify which drivers have the largest impact on output
- Quantify the range of reasonable values for each driver
- Create scenario combinations
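The four steps above can be sketched for the SaaS row of the revenue driver table (Customers x ARPU x Retention Rate). The driver values, the plausible ranges, and the names `revenue` and `rank_drivers` are hypothetical; the sketch only shows how a one-at-a-time sensitivity ranking might be built.

```python
from math import prod
from typing import Dict, List, Tuple


def revenue(drivers: Dict[str, float]) -> float:
    """Multiplicative driver model: revenue is the product of all drivers."""
    return prod(drivers.values())


def rank_drivers(
    drivers: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float]]:
    """Vary one driver at a time across its range and rank by revenue swing."""
    swings = {}
    for name, (low, high) in ranges.items():
        low_case = revenue({**drivers, name: low})
        high_case = revenue({**drivers, name: high})
        swings[name] = high_case - low_case
    return sorted(swings.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical base-case drivers and reasonable ranges for each
base = {"customers": 1000, "arpu": 1200.0, "retention": 0.90}
plausible = {
    "customers": (900, 1100),
    "arpu": (1150.0, 1250.0),
    "retention": (0.85, 0.95),
}

ranked = rank_drivers(base, plausible)
print(f"Base revenue: {revenue(base):,.0f}")
for name, swing in ranked:
    print(f"{name}: swing of {swing:,.0f}")
```

Drivers at the top of the ranking are the ones that deserve scenario modeling; those at the bottom can use simple assumptions.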
### Driver Sensitivity Matrix
Rank drivers by impact and uncertainty:
| | High Impact | Low Impact |
|---|-----------|-----------|
| **High Uncertainty** | Model these carefully, run scenarios | Monitor but don't over-model |
| **Low Uncertainty** | Get these right; high accuracy needed | Use simple assumptions |
## 2. Rolling Forecasts
### What Is a Rolling Forecast?
A rolling forecast continuously extends the forecast horizon as each period closes. Unlike a static annual budget, a rolling forecast always looks forward the same number of periods (typically 12-18 months).
### Rolling Forecast vs Annual Budget
| Feature | Annual Budget | Rolling Forecast |
|---------|--------------|-----------------|
| Time Horizon | Fixed (Jan-Dec) | Rolling (12-18 months) |
| Update Frequency | Once per year | Monthly or quarterly |
| Detail Level | Very detailed | Driver-level |
| Preparation Time | 3-6 months | 2-5 days per cycle |
| Relevance | Declines over time | Stays current |
| Flexibility | Rigid | Adaptive |
### Implementation Steps
1. **Select the horizon** - 12 months rolling is most common (some use 18 months for CapEx planning)
2. **Define update cadence** - Monthly for volatile businesses; quarterly for stable ones
3. **Choose the right detail** - Driver-level, not line-item detail
4. **Automate data feeds** - Reduce manual effort per cycle
5. **Separate actuals from forecast** - Clear delineation between reported and projected periods
6. **Track forecast accuracy** - Measure MAPE (Mean Absolute Percentage Error) over time
### 13-Week Cash Flow Forecast
A specialized rolling forecast for liquidity management:
**Structure:**
- Week-by-week cash inflows and outflows
- Opening and closing cash balances
- Minimum cash threshold alerts
**Key Components:**
| Inflows | Outflows |
|---------|----------|
| Customer collections (by aging) | Payroll (fixed cadence) |
| Other receivables | Rent / Lease payments |
| Asset sales | Vendor payments (by terms) |
| Financing proceeds | Debt service |
| Tax refunds | Tax payments |
| Other income | Capital expenditures |
**Collection Modeling:**
- Apply collection rates by customer segment or aging bucket
- Model DSO trends to project collection timing
- Account for seasonal patterns in payment behavior
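A weekly cash roll-forward with a minimum-balance alert, as described in the structure above, might look like the following. The function name `roll_forward`, the field names, and the three-week sample data are assumptions for illustration; a real forecast would pass 13 weeks of inflows and outflows.

```python
from typing import Any, Dict, List


def roll_forward(
    opening_cash: float,
    inflows: List[float],
    outflows: List[float],
    minimum_cash: float,
) -> List[Dict[str, Any]]:
    """Build week-by-week opening/closing balances and flag threshold breaches."""
    rows = []
    balance = opening_cash
    for week, (cash_in, cash_out) in enumerate(zip(inflows, outflows), start=1):
        closing = balance + cash_in - cash_out
        rows.append({
            "week": week,
            "opening": balance,
            "closing": closing,
            "below_minimum": closing < minimum_cash,
        })
        balance = closing  # this week's close is next week's open
    return rows


forecast = roll_forward(
    opening_cash=100,
    inflows=[50, 20, 80],
    outflows=[60, 70, 40],
    minimum_cash=50,
)
for row in forecast:
    flag = "  <-- below minimum" if row["below_minimum"] else ""
    print(f"Week {row['week']}: {row['opening']} -> {row['closing']}{flag}")
```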
## 3. Accuracy Improvement
### Measuring Forecast Accuracy
**Mean Absolute Percentage Error (MAPE):**
```
MAPE = (1/n) x Sum of |Actual - Forecast| / |Actual| x 100%
```
**Accuracy Benchmarks:**
| MAPE | Rating |
|------|--------|
| < 5% | Excellent |
| 5% - 10% | Good |
| 10% - 20% | Acceptable |
| > 20% | Needs improvement |
**Weighted MAPE (WMAPE):**
Use when line items vary significantly in magnitude - weights errors by actual values.
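Both error measures can be computed with the standard library. In this sketch WMAPE is taken as the sum of absolute errors divided by the sum of absolute actuals, which is the usual definition; skipping periods with a zero actual in MAPE is an assumption, since the formula is undefined there. The sample data is made up.

```python
from typing import List


def mape(actuals: List[float], forecasts: List[float]) -> float:
    """Mean absolute percentage error, in percent. Zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual is zero")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs) * 100


def wmape(actuals: List[float], forecasts: List[float]) -> float:
    """Weighted MAPE: total absolute error over total absolute actuals, in percent."""
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when actuals sum to zero")
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / total_actual * 100


actuals = [100.0, 200.0, 50.0]
forecasts = [110.0, 180.0, 60.0]
print(f"MAPE:  {mape(actuals, forecasts):.2f}%")
print(f"WMAPE: {wmape(actuals, forecasts):.2f}%")
```

In this sample the small line item (50) has the largest percentage error, so MAPE comes out higher than WMAPE.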
### Techniques to Improve Accuracy
**1. Bias Detection and Correction**
- Track directional bias (consistently over or under forecasting)
- Calculate mean signed error to detect systematic bias
- Adjust driver assumptions to correct persistent bias
**2. Variance Analysis Loop**
- After each period closes, compare actual vs forecast
- Identify root causes of significant variances
- Update driver assumptions based on learnings
- Document what changed and why
**3. Ensemble Approach**
- Combine multiple forecasting methods
- Blend statistical (trend) with judgmental (management input)
- Weight methods by their historical accuracy
**4. Granularity Optimization**
- Forecast at the right level of detail - not too aggregated, not too granular
- Product/segment level usually more accurate than single top-line
- Aggregate bottom-up forecasts for total, then adjust
**5. Leading Indicators**
- Identify metrics that predict financial outcomes 1-3 months ahead
- Pipeline/bookings predict revenue
- Hiring plans predict headcount costs
- Customer churn signals predict retention revenue
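Two of the techniques above lend themselves to a short sketch: mean signed error for bias detection, and an ensemble that weights methods by historical accuracy. Weighting by the inverse of each method's historical MAPE is one possible scheme and is an assumption here, as are the function names and sample numbers.

```python
from typing import Dict, List


def mean_signed_error(actuals: List[float], forecasts: List[float]) -> float:
    """Average of (forecast - actual). Positive means over-forecasting."""
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)


def blend_forecasts(
    forecasts: Dict[str, float],
    historical_mape: Dict[str, float],
) -> float:
    """Combine methods, weighting each by the inverse of its historical MAPE."""
    weights = {method: 1.0 / historical_mape[method] for method in forecasts}
    total_weight = sum(weights.values())
    return sum(forecasts[m] * weights[m] for m in forecasts) / total_weight


bias = mean_signed_error([100.0, 200.0, 50.0], [110.0, 210.0, 60.0])
blended = blend_forecasts(
    forecasts={"trend": 1000.0, "management": 1200.0},
    historical_mape={"trend": 5.0, "management": 10.0},
)
print(f"Mean signed error: {bias:+.1f}")
print(f"Blended forecast:  {blended:,.1f}")
```

A persistently positive mean signed error suggests optimism bias; the blend leans toward whichever method has been more accurate.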
### Common Accuracy Killers
1. **Anchoring bias** - Over-relying on last year's numbers
2. **Optimism bias** - Systematic overestimation of growth
3. **Lack of accountability** - No one tracks forecast vs actual
4. **Stale assumptions** - Not updating for market changes
5. **Missing data** - Forecasting without key driver inputs
6. **Over-precision** - False precision in uncertain environments
## 4. Scenario Planning
### Three-Scenario Framework
| Scenario | Description | Probability |
|----------|-------------|-------------|
| **Base Case** | Most likely outcome based on current trajectory | 50-60% |
| **Bull Case** | Favorable conditions, upside realization | 15-25% |
| **Bear Case** | Adverse conditions, downside risks | 15-25% |
### Scenario Construction
**Base Case:**
- Continuation of current trends
- Management's operational plan
- Market consensus assumptions
- Normal competitive dynamics
**Bull Case (apply selectively, not uniformly):**
- Faster customer acquisition or market adoption
- Successful product launch or expansion
- Favorable macro conditions
- Competitor weakness or exit
- Margin expansion from operating leverage
**Bear Case (be realistic, not catastrophic):**
- Slower growth or market contraction
- Increased competition or pricing pressure
- Key customer or contract loss
- Supply chain disruption
- Regulatory headwinds
### Scenario Variables
Map each scenario to specific driver values:
| Driver | Bear | Base | Bull |
|--------|------|------|------|
| Revenue Growth | +2% | +8% | +15% |
| Gross Margin | 35% | 40% | 43% |
| Customer Churn | 8% | 5% | 3% |
| New Customers/Month | 50 | 100 | 180 |
| Price Increase | 0% | 3% | 5% |
### Presenting Scenarios
1. **Show the range** - Management needs to see the potential outcomes
2. **Quantify the gap** - Dollar impact of bull vs bear on key metrics
3. **Identify triggers** - What conditions would cause each scenario
4. **Define actions** - What levers to pull in each scenario
5. **Assign probabilities** - Not all scenarios are equally likely
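Quantifying the gap and assigning probabilities can be sketched as below. The growth rates come from the Scenario Variables table; the probabilities are hypothetical picks from inside the ranges of the three-scenario framework, and the base revenue is made up.

```python
from typing import Dict

# Growth rates from the scenario table; probabilities are illustrative assumptions
SCENARIOS: Dict[str, Dict[str, float]] = {
    "bear": {"probability": 0.20, "revenue_growth": 0.02},
    "base": {"probability": 0.55, "revenue_growth": 0.08},
    "bull": {"probability": 0.25, "revenue_growth": 0.15},
}


def scenario_revenue(base_revenue: float,
                     scenarios: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Next-period revenue under each scenario."""
    return {
        name: base_revenue * (1 + s["revenue_growth"])
        for name, s in scenarios.items()
    }


def expected_revenue(base_revenue: float,
                     scenarios: Dict[str, Dict[str, float]]) -> float:
    """Probability-weighted revenue; probabilities must sum to 1."""
    total_probability = sum(s["probability"] for s in scenarios.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("Scenario probabilities must sum to 1")
    outcomes = scenario_revenue(base_revenue, scenarios)
    return sum(scenarios[name]["probability"] * outcomes[name] for name in scenarios)


outcomes = scenario_revenue(10_000_000, SCENARIOS)
gap = outcomes["bull"] - outcomes["bear"]
weighted = expected_revenue(10_000_000, SCENARIOS)
print(f"Bull vs bear gap: {gap:,.0f}")
print(f"Probability-weighted revenue: {weighted:,.0f}")
```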
## 5. Forecast Communication
### Stakeholder Needs
| Audience | Needs |
|----------|-------|
| Board | High-level scenarios, key risks, strategic implications |
| CEO/CFO | Detailed drivers, variance explanations, action items |
| Department Heads | Their specific budget vs forecast, headcount plans |
| Investors | Revenue guidance, margin trajectory, capital allocation |
| Operations | Weekly/monthly targets, resource requirements |
### Presentation Framework
1. **Executive summary** - Key metrics, direction of travel, confidence level
2. **Variance bridge** - Walk from budget/prior forecast to current forecast
3. **Driver analysis** - What changed and why
4. **Scenario comparison** - Range of outcomes
5. **Key risks and opportunities** - What could change the forecast
6. **Action items** - Decisions needed based on forecast
### Forecast Cadence
| Activity | Frequency | Time Required |
|----------|-----------|--------------|
| 13-week cash flow update | Weekly | 1-2 hours |
| Rolling forecast update | Monthly | 1-2 days |
| Full reforecast | Quarterly | 3-5 days |
| Annual budget/plan | Annually | 4-8 weeks |
| Board reporting | Quarterly | 2-3 days |
## 6. Industry-Specific Considerations
### SaaS Metrics in Forecasting
- **MRR/ARR decomposition:** New, expansion, contraction, churn
- **Cohort-based forecasting:** Forecast by customer cohort for retention accuracy
- **Rule of 40:** Revenue growth % + Profit margin % should be at least 40%
- **Net Revenue Retention:** Target > 110% for healthy SaaS
- **CAC Payback:** Should be < 18 months
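The SaaS checks above reduce to simple arithmetic. This sketch uses common definitions: NRR as ending MRR from the starting customer base divided by starting MRR, and CAC payback as CAC divided by monthly gross profit per customer. Other payback variants exist, so treat the formula choice, function names, and inputs as assumptions.

```python
def net_revenue_retention(starting_mrr: float, expansion: float,
                          contraction: float, churn: float) -> float:
    """Ending MRR from the existing customer base as a share of starting MRR."""
    return (starting_mrr + expansion - contraction - churn) / starting_mrr


def rule_of_40(revenue_growth_pct: float, profit_margin_pct: float) -> float:
    """Sum of growth and margin, both expressed in percent."""
    return revenue_growth_pct + profit_margin_pct


def cac_payback_months(cac: float, monthly_arpu: float, gross_margin: float) -> float:
    """Months of gross profit needed to recover customer acquisition cost."""
    return cac / (monthly_arpu * gross_margin)


nrr = net_revenue_retention(starting_mrr=100_000, expansion=15_000,
                            contraction=2_000, churn=3_000)
score = rule_of_40(revenue_growth_pct=30.0, profit_margin_pct=12.0)
payback = cac_payback_months(cac=6_000, monthly_arpu=500, gross_margin=0.8)

print(f"NRR: {nrr:.0%}")
print(f"Rule of 40 score: {score:.0f}")
print(f"CAC payback: {payback:.0f} months")
```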
### Retail Forecasting
- **Same-store sales growth** as primary organic growth metric
- **Seasonal decomposition** for accurate monthly/weekly forecasts
- **Markdown optimization** impact on gross margin
- **Inventory turns** drive working capital forecasts
### Manufacturing Forecasting
- **Order backlog** as a leading indicator
- **Capacity constraints** creating step-function cost increases
- **Raw material price forecasts** for COGS
- **Maintenance CapEx vs growth CapEx** distinction
- **Utilization rates** driving unit cost projections
FILE:references/industry-adaptations.md
# Industry Adaptations
Sector-specific metrics, benchmarks, and considerations for financial analysis.
## SaaS / Software
**Key Metrics:**
- ARR / MRR growth rate
- Net Revenue Retention (NRR) — target >110%
- CAC Payback Period — target <18 months
- Rule of 40 (growth rate + profit margin ≥ 40%)
- LTV:CAC ratio — target >3:1
- Gross margin — target >70%
**Valuation Multiples:**
- Revenue multiple: 5-15x ARR (growth-adjusted)
- High-growth (>50%): 15-25x ARR
- Moderate growth (20-50%): 8-15x ARR
- Low growth (<20%): 3-8x ARR
**Considerations:**
- Deferred revenue recognition (ASC 606)
- Stock-based compensation impact on margins
- Cohort analysis critical for retention metrics
## Retail / E-Commerce
**Key Metrics:**
- Same-store sales growth (SSS)
- Gross margin by category
- Inventory turnover — target varies by segment (grocery: 14-20x, fashion: 4-6x)
- Revenue per square foot (physical)
- Customer acquisition cost vs. AOV
- Return rate impact on unit economics
**Valuation Multiples:**
- EV/EBITDA: 8-15x (premium brands higher)
- P/E: 15-25x
**Considerations:**
- Seasonal revenue concentration (Q4 holiday)
- Working capital intensity (inventory cycles)
- Omnichannel attribution complexity
## Manufacturing
**Key Metrics:**
- Gross margin by product line
- Capacity utilization rate — target >80%
- Days Inventory Outstanding (DIO)
- Warranty reserve as % of revenue
- Capex as % of revenue (maintenance vs. growth)
- Order backlog / book-to-bill ratio
**Valuation Multiples:**
- EV/EBITDA: 6-12x
- P/E: 12-20x
**Considerations:**
- Raw material cost volatility
- Currency exposure in supply chain
- Depreciation schedules (straight-line vs. accelerated)
- Regulatory compliance costs (environmental, safety)
## Financial Services
**Key Metrics:**
- Net Interest Margin (NIM)
- Return on Equity (ROE) — target >12%
- Cost-to-Income Ratio — target <60%
- Non-Performing Loan (NPL) ratio
- Tier 1 Capital Ratio — regulatory minimum varies
- Assets Under Management (AUM) growth
**Valuation Multiples:**
- Price-to-Book (P/B): 1.0-2.5x
- P/E: 10-18x
**Considerations:**
- Regulatory capital requirements (Basel III/IV)
- Interest rate sensitivity analysis
- Credit risk provisioning (CECL / IFRS 9)
- Mark-to-market vs. held-to-maturity accounting
## Healthcare
**Key Metrics:**
- Revenue per patient / per bed
- Payor mix (Medicare/Medicaid vs. commercial)
- EBITDAR margin (rent-adjusted for facilities)
- Clinical trial pipeline value (biotech/pharma)
- Patent cliff exposure
- R&D as % of revenue — benchmark 15-25% (pharma)
**Valuation Multiples:**
- EV/EBITDA: 10-18x (medtech), 12-20x (pharma)
- EV/Revenue: 3-8x (services), 5-15x (devices)
**Considerations:**
- Reimbursement rate changes (regulatory risk)
- FDA approval timelines and probability-weighted pipeline
- 340B pricing program impact
- Medical device regulation (MDR, QSR compliance)
FILE:references/valuation-methodology.md
# Valuation Methodology Guide
Comprehensive reference for business valuation approaches including DCF analysis, comparable company analysis, and precedent transactions.
## 1. Discounted Cash Flow (DCF) Methodology
### Overview
DCF is an intrinsic valuation method that estimates the present value of a company's expected future free cash flows, discounted at an appropriate rate reflecting the risk of those cash flows.
**Core Principle:** The value of a business equals the present value of all future cash flows it will generate.
**Formula:**
```
Enterprise Value = Sum of [FCF_t / (1 + WACC)^t] + Terminal Value / (1 + WACC)^n
```
Where:
- FCF_t = Free Cash Flow in year t
- WACC = Weighted Average Cost of Capital
- n = number of projection years
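The enterprise value formula translates directly into code. This is a minimal sketch with year-end discounting and made-up cash flows chosen so each term has a present value that is easy to check by hand; the function name is illustrative.

```python
from typing import List


def enterprise_value(fcfs: List[float], wacc: float, terminal_value: float) -> float:
    """Present value of projected FCFs plus discounted terminal value."""
    n = len(fcfs)
    # FCF_t is discounted t years, with t starting at 1
    pv_fcfs = sum(fcf / (1 + wacc) ** t for t, fcf in enumerate(fcfs, start=1))
    # Terminal value is discounted from the final projection year n
    pv_terminal = terminal_value / (1 + wacc) ** n
    return pv_fcfs + pv_terminal


# 110 / 1.1 = 100, 121 / 1.1^2 = 100, 242 / 1.1^2 = 200
ev = enterprise_value(fcfs=[110.0, 121.0], wacc=0.10, terminal_value=242.0)
print(f"Enterprise value: {ev:,.2f}")
```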
### Step 1: Historical Analysis
Before projecting, analyze 3-5 years of historical financials:
- **Revenue growth rates** - Identify organic vs acquisition-driven growth
- **Margin trends** - Gross, operating, and net margin trajectories
- **Capital intensity** - CapEx as % of revenue
- **Working capital** - Cash conversion cycle trends
- **Free cash flow conversion** - FCF / Net Income ratio
### Step 2: Revenue Projections
**Approaches:**
1. **Top-down:** Market size x Market share x Pricing
2. **Bottom-up:** Units x Price, or Customers x ARPU
3. **Growth rate extrapolation:** Historical growth with decay
**Revenue Projection Best Practices:**
- Use 5-7 year explicit projection period
- Growth should converge toward GDP growth by terminal year
- Support assumptions with market data and management guidance
- Model revenue by segment/product line when possible
### Step 3: Free Cash Flow Calculation
**Unlevered Free Cash Flow (UFCF):**
```
UFCF = EBIT x (1 - Tax Rate)
+ Depreciation & Amortization
- Capital Expenditures
- Changes in Net Working Capital
```
**Key Drivers:**
- Operating margin trajectory
- CapEx as % of revenue (maintenance vs growth)
- Working capital requirements (DSO, DIO, DPO)
- Tax rate (effective vs marginal)
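As a quick worked instance of the UFCF build-up above, with hypothetical inputs and an illustrative function name:

```python
def unlevered_fcf(ebit: float, tax_rate: float, d_and_a: float,
                  capex: float, change_in_nwc: float) -> float:
    """UFCF = EBIT x (1 - tax) + D&A - CapEx - change in net working capital."""
    nopat = ebit * (1 - tax_rate)  # after-tax operating profit
    return nopat + d_and_a - capex - change_in_nwc


ufcf = unlevered_fcf(ebit=200.0, tax_rate=0.25, d_and_a=40.0,
                     capex=60.0, change_in_nwc=10.0)
print(f"UFCF: {ufcf:.1f}")
```

An increase in net working capital is passed as a positive `change_in_nwc` and reduces cash flow; a release of working capital is passed as a negative number.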
### Step 4: WACC Calculation
**Weighted Average Cost of Capital:**
```
WACC = (E/V x Re) + (D/V x Rd x (1 - T))
```
Where:
- E/V = Equity weight (market value)
- D/V = Debt weight (market value)
- Re = Cost of equity
- Rd = Cost of debt (pre-tax)
- T = Marginal tax rate
#### Cost of Equity (CAPM)
```
Re = Rf + Beta x (Rm - Rf) + Size Premium + Company-Specific Risk
```
| Component | Description | Typical Range |
|-----------|-------------|---------------|
| Risk-Free Rate (Rf) | 10-year Treasury yield | 3.5% - 5.0% |
| Equity Risk Premium (ERP) | Market return above risk-free | 5.0% - 7.0% |
| Beta | Systematic risk relative to market | 0.5 - 2.0 |
| Size Premium | Small-cap additional risk | 0% - 5% |
| Company-Specific Risk | Unique risk factors | 0% - 5% |
**Beta Estimation:**
- Use 2-5 year weekly returns against broad market index
- Unlever peer betas for comparability, then re-lever to the target capital structure
- Consider industry median beta for stability
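The unlever and re-lever step is often done with the Hamada equation, which is assumed here because the text does not name a specific method. The sketch also feeds the re-levered beta into the cost of equity build-up shown earlier; peer betas, debt-to-equity ratios, and function names are hypothetical.

```python
from statistics import median


def unlever_beta(levered_beta: float, debt_to_equity: float, tax_rate: float) -> float:
    """Hamada: strip out the effect of a peer's own capital structure."""
    return levered_beta / (1 + (1 - tax_rate) * debt_to_equity)


def relever_beta(unlevered_beta: float, debt_to_equity: float, tax_rate: float) -> float:
    """Hamada: apply the target capital structure to an asset beta."""
    return unlevered_beta * (1 + (1 - tax_rate) * debt_to_equity)


def cost_of_equity(risk_free: float, beta: float, equity_risk_premium: float,
                   size_premium: float = 0.0, company_specific: float = 0.0) -> float:
    """CAPM plus the optional premiums from the component table."""
    return risk_free + beta * equity_risk_premium + size_premium + company_specific


# Hypothetical peers: (levered beta, debt-to-equity ratio)
peers = [(1.2, 0.50), (1.0, 0.25), (1.4, 0.80)]
tax = 0.25
asset_beta = median(unlever_beta(beta, de, tax) for beta, de in peers)
target_beta = relever_beta(asset_beta, debt_to_equity=0.40, tax_rate=tax)
re = cost_of_equity(risk_free=0.04, beta=target_beta, equity_risk_premium=0.06)
print(f"Median unlevered beta: {asset_beta:.3f}")
print(f"Re-levered beta: {target_beta:.3f}")
print(f"Cost of equity: {re:.2%}")
```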
#### Cost of Debt
```
Rd = Yield on comparable-maturity corporate bonds
OR
Rd = Risk-Free Rate + Credit Spread
```
**Credit Spread by Rating:**
| Rating | Typical Spread |
|--------|---------------|
| AAA | 0.5% - 1.0% |
| AA | 1.0% - 1.5% |
| A | 1.5% - 2.0% |
| BBB | 2.0% - 3.0% |
| BB | 3.0% - 5.0% |
| B | 5.0% - 8.0% |
### Step 5: Terminal Value
Terminal value typically represents 60-80% of total enterprise value. Use two methods and cross-check.
#### Perpetuity Growth Method
```
TV = FCF_n x (1 + g) / (WACC - g)
```
Where g = terminal growth rate (typically 2.0% - 3.0%, should not exceed long-term GDP growth)
**Sensitivity:** Terminal value is highly sensitive to g. A 0.5% change in g can move enterprise value by 15-25%.
#### Exit Multiple Method
```
TV = Terminal Year EBITDA x Exit EV/EBITDA Multiple
```
**Exit Multiple Selection:**
- Use current trading multiples of comparable companies
- Consider whether current multiples are at historical highs/lows
- Apply a discount for lack of marketability if private
**Cross-Check:** Both methods should yield similar results. Large discrepancies signal inconsistent assumptions.
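One way to run the cross-check is to back out the perpetual growth rate implied by the exit-multiple terminal value and compare it with the assumed g. The rearrangement below follows from solving TV = FCF x (1 + g) / (WACC - g) for g; function names and inputs are illustrative.

```python
def tv_perpetuity(final_fcf: float, wacc: float, g: float) -> float:
    """Terminal value using the perpetuity growth method."""
    if g >= wacc:
        raise ValueError("Terminal growth must be below WACC")
    return final_fcf * (1 + g) / (wacc - g)


def tv_exit_multiple(terminal_ebitda: float, exit_multiple: float) -> float:
    """Terminal value using an exit EV/EBITDA multiple."""
    return terminal_ebitda * exit_multiple


def implied_growth(terminal_value: float, final_fcf: float, wacc: float) -> float:
    """Growth rate that would reproduce a given terminal value."""
    return (terminal_value * wacc - final_fcf) / (terminal_value + final_fcf)


tv_growth = tv_perpetuity(final_fcf=100.0, wacc=0.10, g=0.025)
tv_multiple = tv_exit_multiple(terminal_ebitda=180.0, exit_multiple=8.0)
g_from_multiple = implied_growth(tv_multiple, final_fcf=100.0, wacc=0.10)

print(f"Perpetuity growth TV: {tv_growth:,.1f}")
print(f"Exit multiple TV:     {tv_multiple:,.1f}")
print(f"Growth implied by exit multiple: {g_from_multiple:.2%}")
```

If the implied growth rate sits well above long-term GDP growth, the exit multiple is probably too generous for the cash flows being projected.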
### Step 6: Enterprise to Equity Bridge
```
Enterprise Value
- Net Debt (Total Debt - Cash)
- Minority Interest
- Preferred Equity
+ Equity Method Investments
= Equity Value
Equity Value / Diluted Shares Outstanding = Value Per Share
```
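The bridge above can be walked through in code with hypothetical balance sheet figures; the function names are illustrative.

```python
def equity_value(enterprise_value: float, total_debt: float, cash: float,
                 minority_interest: float = 0.0, preferred_equity: float = 0.0,
                 equity_method_investments: float = 0.0) -> float:
    """Walk from enterprise value to equity value."""
    net_debt = total_debt - cash
    return (enterprise_value
            - net_debt
            - minority_interest
            - preferred_equity
            + equity_method_investments)


def value_per_share(equity: float, diluted_shares: float) -> float:
    """Equity value divided by diluted shares outstanding."""
    return equity / diluted_shares


equity = equity_value(enterprise_value=1000.0, total_debt=300.0, cash=100.0,
                      minority_interest=20.0, preferred_equity=30.0,
                      equity_method_investments=50.0)
per_share = value_per_share(equity, diluted_shares=40.0)
print(f"Equity value: {equity:,.1f}")
print(f"Value per share: {per_share:,.2f}")
```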
### Step 7: Sensitivity Analysis
Always present results as a range, not a single point estimate.
**Standard Sensitivity Tables:**
1. WACC vs Terminal Growth Rate
2. WACC vs Exit Multiple
3. Revenue Growth vs Operating Margin
**Scenario Analysis:**
- Base case: Management guidance / consensus estimates
- Bull case: Upside scenario with faster growth or margin expansion
- Bear case: Downside scenario with slower growth or margin compression
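The first standard table (WACC vs terminal growth rate) can be generated with a small grid. This is a sketch using the perpetuity growth method with made-up cash flows, not the skill's own implementation; all names are illustrative.

```python
from typing import Dict, List, Tuple


def dcf_enterprise_value(fcfs: List[float], wacc: float, g: float) -> float:
    """Enterprise value with a perpetuity-growth terminal value."""
    if g >= wacc:
        raise ValueError("Terminal growth must be below WACC")
    n = len(fcfs)
    pv_fcfs = sum(fcf / (1 + wacc) ** t for t, fcf in enumerate(fcfs, start=1))
    terminal_value = fcfs[-1] * (1 + g) / (wacc - g)
    return pv_fcfs + terminal_value / (1 + wacc) ** n


def sensitivity_grid(fcfs: List[float], waccs: List[float],
                     growth_rates: List[float]) -> Dict[Tuple[float, float], float]:
    """Enterprise value for every (WACC, growth) combination."""
    return {
        (wacc, g): dcf_enterprise_value(fcfs, wacc, g)
        for wacc in waccs
        for g in growth_rates
    }


fcfs = [100.0, 110.0, 120.0, 130.0, 140.0]
waccs = [0.09, 0.10, 0.11]
growth_rates = [0.020, 0.025, 0.030]
grid = sensitivity_grid(fcfs, waccs, growth_rates)

print("WACC \\ g", *(f"{g:.1%}" for g in growth_rates))
for wacc in waccs:
    print(f"{wacc:.1%}", *(f"{grid[(wacc, g)]:,.0f}" for g in growth_rates))
```

Values rise toward the low-WACC, high-growth corner of the grid, which is why the result is better presented as a range than as a single number.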
## 2. Comparable Company Analysis
### Methodology
1. **Select peer group** - Similar size, industry, growth profile, and margins
2. **Calculate trading multiples** for each peer
3. **Determine appropriate multiple range**
4. **Apply to target company's metrics**
### Common Multiples
| Multiple | When to Use |
|----------|-------------|
| EV/Revenue | Pre-profit companies, high-growth tech |
| EV/EBITDA | Most common for mature companies |
| EV/EBIT | When D&A differs significantly across peers |
| P/E | Stable earnings, financial services |
| P/B | Banks, insurance, asset-heavy industries |
| EV/FCF | Capital-light businesses with clean FCF |
### Peer Selection Criteria
- **Industry:** Same or closely adjacent sectors
- **Size:** Within 0.5x to 2x of target revenue/market cap
- **Geography:** Same primary markets
- **Growth profile:** Similar revenue growth rates (within 5-10 percentage points)
- **Margin profile:** Similar operating margin structure
- **Business model:** Comparable revenue mix and customer base
### Premium/Discount Adjustments
| Factor | Adjustment |
|--------|-----------|
| Higher growth | Premium of 1-3x on EV/EBITDA |
| Lower margins | Discount of 1-2x |
| Smaller scale | Discount of 10-20% |
| Private company | Discount of 15-30% (illiquidity) |
| Control premium | Premium of 20-40% (for acquisitions) |
## 3. Precedent Transaction Analysis
### Methodology
1. **Identify comparable transactions** in same industry
2. **Calculate transaction multiples** (EV/Revenue, EV/EBITDA)
3. **Adjust for market conditions** and deal-specific factors
4. **Apply adjusted multiples** to target
### Key Considerations
- Transactions include control premiums (typically 20-40%)
- Market conditions at time of deal affect multiples
- Strategic vs financial buyer valuations differ
- Consider synergy expectations embedded in price
- More recent transactions carry greater relevance
## 4. Valuation Framework Selection
| Situation | Primary Method | Secondary Method |
|-----------|---------------|-----------------|
| Profitable, stable | DCF | Comparable companies |
| High growth, pre-profit | Comparable companies (EV/Revenue) | DCF with scenario analysis |
| M&A target | Precedent transactions | DCF |
| Asset-heavy, cyclical | Asset-based valuation | Normalized DCF |
| Financial institution | Dividend discount model | P/B, P/E comps |
| Distressed | Liquidation value | Restructured DCF |
## 5. Common Pitfalls
1. **Hockey stick projections** - Unrealistic growth acceleration in later years
2. **Terminal value dominance** - If TV > 80% of EV, shorten projection period or question assumptions
3. **Circular references** - WACC depends on equity value which depends on WACC
4. **Ignoring working capital** - Can significantly affect FCF
5. **Single-point estimates** - Always present as a range
6. **Stale comparables** - Market conditions change; update regularly
7. **Confirmation bias** - Don't work backward from a desired conclusion
8. **Ignoring dilution** - Use fully diluted shares (treasury stock method for options)
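The treasury stock method mentioned in item 8 assumes in-the-money options are exercised and the proceeds are used to buy back shares at the current price. A minimal sketch, with a hypothetical function name and option tranches:

```python
from typing import List, Tuple


def diluted_shares(basic_shares: float,
                   option_tranches: List[Tuple[float, float]],
                   share_price: float) -> float:
    """Basic shares plus net new shares from in-the-money options.

    option_tranches holds (option_count, strike_price) pairs.
    """
    net_new_shares = 0.0
    for count, strike in option_tranches:
        if share_price > strike:
            # Exercise proceeds (count x strike) repurchase shares at share_price
            repurchased = count * strike / share_price
            net_new_shares += count - repurchased
    return basic_shares + net_new_shares


shares = diluted_shares(
    basic_shares=1000.0,
    option_tranches=[(100.0, 10.0), (50.0, 30.0)],  # second tranche is out of the money
    share_price=20.0,
)
print(f"Diluted shares: {shares:,.1f}")
```

Because the diluted share count depends on the share price, and the DCF value per share depends on the share count, a valuation model typically iterates this step until the price converges.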
FILE:scripts/budget_variance_analyzer.py
#!/usr/bin/env python3
"""
Budget Variance Analyzer
Analyzes actual vs budget vs prior year performance with materiality
threshold filtering, favorable/unfavorable classification, and
department/category breakdown.
Usage:
python budget_variance_analyzer.py budget_data.json
python budget_variance_analyzer.py budget_data.json --format json
python budget_variance_analyzer.py budget_data.json --threshold-pct 5 --threshold-amt 25000
"""
import argparse
import json
import sys
from typing import Any, Dict, List
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0 or denominator is None:
return default
return numerator / denominator
class BudgetVarianceAnalyzer:
"""Analyze budget variances with materiality filtering and classification."""
def __init__(
self,
data: Dict[str, Any],
threshold_pct: float = 10.0,
threshold_amt: float = 50000.0,
) -> None:
"""
Initialize the analyzer.
Args:
data: Budget data with line items
threshold_pct: Materiality threshold as percentage (default 10%)
threshold_amt: Materiality threshold as dollar amount (default $50K)
"""
self.line_items: List[Dict[str, Any]] = data.get("line_items", [])
self.period: str = data.get("period", "Current Period")
self.company: str = data.get("company", "Company")
self.threshold_pct = threshold_pct
self.threshold_amt = threshold_amt
self.variances: List[Dict[str, Any]] = []
self.material_variances: List[Dict[str, Any]] = []
self.summary: Dict[str, Any] = {}
def classify_favorability(
self, line_type: str, variance_amount: float
) -> str:
"""
Classify variance as favorable or unfavorable.
Revenue: over budget = favorable
Expense: under budget = favorable
"""
if line_type.lower() in ("revenue", "income", "sales"):
return "Favorable" if variance_amount > 0 else "Unfavorable"
else:
# For expenses, under budget (negative variance) is favorable
return "Favorable" if variance_amount < 0 else "Unfavorable"
def calculate_variances(self) -> List[Dict[str, Any]]:
"""Calculate variances for all line items."""
self.variances = []
for item in self.line_items:
name = item.get("name", "Unknown")
line_type = item.get("type", "expense")
department = item.get("department", "General")
category = item.get("category", "Other")
actual = item.get("actual", 0)
budget = item.get("budget", 0)
prior_year = item.get("prior_year", None)
# Budget variance
budget_var_amt = actual - budget
budget_var_pct = safe_divide(budget_var_amt, budget) * 100
# Prior year variance (if available)
py_var_amt = (actual - prior_year) if prior_year is not None else None
py_var_pct = (
safe_divide(py_var_amt, prior_year) * 100
if prior_year is not None
else None
)
favorability = self.classify_favorability(line_type, budget_var_amt)
is_material = (
abs(budget_var_pct) >= self.threshold_pct
or abs(budget_var_amt) >= self.threshold_amt
)
variance_record = {
"name": name,
"type": line_type,
"department": department,
"category": category,
"actual": actual,
"budget": budget,
"prior_year": prior_year,
"budget_variance_amount": budget_var_amt,
"budget_variance_pct": round(budget_var_pct, 2),
"prior_year_variance_amount": py_var_amt,
"prior_year_variance_pct": (
round(py_var_pct, 2) if py_var_pct is not None else None
),
"favorability": favorability,
"is_material": is_material,
}
self.variances.append(variance_record)
# Filter material variances
self.material_variances = [v for v in self.variances if v["is_material"]]
return self.variances
def department_summary(self) -> Dict[str, Dict[str, Any]]:
"""Summarize variances by department."""
departments: Dict[str, Dict[str, float]] = {}
for v in self.variances:
dept = v["department"]
if dept not in departments:
departments[dept] = {
"total_actual": 0.0,
"total_budget": 0.0,
"total_variance": 0.0,
"favorable_count": 0,
"unfavorable_count": 0,
"line_count": 0,
}
departments[dept]["total_actual"] += v["actual"]
departments[dept]["total_budget"] += v["budget"]
departments[dept]["total_variance"] += v["budget_variance_amount"]
departments[dept]["line_count"] += 1
if v["favorability"] == "Favorable":
departments[dept]["favorable_count"] += 1
else:
departments[dept]["unfavorable_count"] += 1
# Add variance percentage
for dept_data in departments.values():
dept_data["variance_pct"] = round(
safe_divide(
dept_data["total_variance"], dept_data["total_budget"]
)
* 100,
2,
)
return departments
def category_summary(self) -> Dict[str, Dict[str, Any]]:
"""Summarize variances by category."""
categories: Dict[str, Dict[str, float]] = {}
for v in self.variances:
cat = v["category"]
if cat not in categories:
categories[cat] = {
"total_actual": 0.0,
"total_budget": 0.0,
"total_variance": 0.0,
"line_count": 0,
}
categories[cat]["total_actual"] += v["actual"]
categories[cat]["total_budget"] += v["budget"]
categories[cat]["total_variance"] += v["budget_variance_amount"]
categories[cat]["line_count"] += 1
for cat_data in categories.values():
cat_data["variance_pct"] = round(
safe_divide(
cat_data["total_variance"], cat_data["total_budget"]
)
* 100,
2,
)
return categories
def generate_executive_summary(self) -> Dict[str, Any]:
"""Generate an executive summary of the variance analysis."""
total_actual = sum(
v["actual"] for v in self.variances if v["type"].lower() in ("revenue", "income", "sales")
)
total_budget = sum(
v["budget"] for v in self.variances if v["type"].lower() in ("revenue", "income", "sales")
)
total_expense_actual = sum(
v["actual"] for v in self.variances if v["type"].lower() not in ("revenue", "income", "sales")
)
total_expense_budget = sum(
v["budget"] for v in self.variances if v["type"].lower() not in ("revenue", "income", "sales")
)
revenue_variance = total_actual - total_budget
expense_variance = total_expense_actual - total_expense_budget
favorable_count = sum(
1 for v in self.variances if v["favorability"] == "Favorable"
)
unfavorable_count = sum(
1 for v in self.variances if v["favorability"] == "Unfavorable"
)
self.summary = {
"period": self.period,
"company": self.company,
"total_line_items": len(self.variances),
"material_variances_count": len(self.material_variances),
"favorable_count": favorable_count,
"unfavorable_count": unfavorable_count,
"revenue": {
"actual": total_actual,
"budget": total_budget,
"variance_amount": revenue_variance,
"variance_pct": round(
safe_divide(revenue_variance, total_budget) * 100, 2
),
},
"expenses": {
"actual": total_expense_actual,
"budget": total_expense_budget,
"variance_amount": expense_variance,
"variance_pct": round(
safe_divide(expense_variance, total_expense_budget) * 100, 2
),
},
"net_impact": revenue_variance - expense_variance,
"materiality_thresholds": {
"percentage": self.threshold_pct,
"amount": self.threshold_amt,
},
}
return self.summary
def run_analysis(self) -> Dict[str, Any]:
"""Run the complete variance analysis."""
self.calculate_variances()
dept_summary = self.department_summary()
cat_summary = self.category_summary()
exec_summary = self.generate_executive_summary()
return {
"executive_summary": exec_summary,
"all_variances": self.variances,
"material_variances": self.material_variances,
"department_summary": dept_summary,
"category_summary": cat_summary,
}
def format_text(self, results: Dict[str, Any]) -> str:
"""Format results as human-readable text."""
lines: List[str] = []
lines.append("=" * 70)
lines.append("BUDGET VARIANCE ANALYSIS")
lines.append("=" * 70)
summary = results["executive_summary"]
lines.append(f"\n Company: {summary['company']}")
lines.append(f" Period: {summary['period']}")
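def fmt_money(val: float) -> str:
# Put the sign before the currency symbol, e.g. "+$1.25M" or "-$3.4K"
sign = "+" if val > 0 else "-" if val < 0 else ""
magnitude = abs(val)
if magnitude >= 1e6:
return f"{sign}${magnitude / 1e6:,.2f}M"
if magnitude >= 1e3:
return f"{sign}${magnitude / 1e3:,.1f}K"
return f"{sign}${magnitude:,.2f}"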
lines.append(f"\n--- EXECUTIVE SUMMARY ---")
rev = summary["revenue"]
exp = summary["expenses"]
lines.append(
f" Revenue: Actual {fmt_money(rev['actual'])} vs "
f"Budget {fmt_money(rev['budget'])} "
f"({fmt_money(rev['variance_amount'])}, {rev['variance_pct']:+.1f}%)"
)
lines.append(
f" Expenses: Actual {fmt_money(exp['actual'])} vs "
f"Budget {fmt_money(exp['budget'])} "
f"({fmt_money(exp['variance_amount'])}, {exp['variance_pct']:+.1f}%)"
)
lines.append(f" Net Impact: {fmt_money(summary['net_impact'])}")
lines.append(
f" Total Items: {summary['total_line_items']} | "
f"Material: {summary['material_variances_count']} | "
f"Favorable: {summary['favorable_count']} | "
f"Unfavorable: {summary['unfavorable_count']}"
)
# Material variances
material = results["material_variances"]
if material:
lines.append(f"\n--- MATERIAL VARIANCES ---")
lines.append(
f" (Threshold: {self.threshold_pct}% or "
f"${self.threshold_amt:,.0f})"
)
for v in material:
lines.append(
f"\n {v['name']} ({v['department']})"
)
lines.append(
f" Actual: {fmt_money(v['actual'])} | "
f"Budget: {fmt_money(v['budget'])}"
)
lines.append(
f" Variance: {fmt_money(v['budget_variance_amount'])} "
f"({v['budget_variance_pct']:+.1f}%) - {v['favorability']}"
)
# Department summary
dept = results["department_summary"]
if dept:
lines.append(f"\n--- DEPARTMENT SUMMARY ---")
for dept_name, d in dept.items():
lines.append(
f" {dept_name}: Variance {fmt_money(d['total_variance'])} "
f"({d['variance_pct']:+.1f}%) | "
f"Fav: {d['favorable_count']} / Unfav: {d['unfavorable_count']}"
)
# Category summary
cat = results["category_summary"]
if cat:
lines.append(f"\n--- CATEGORY SUMMARY ---")
for cat_name, c in cat.items():
lines.append(
f" {cat_name}: Variance {fmt_money(c['total_variance'])} "
f"({c['variance_pct']:+.1f}%)"
)
lines.append("\n" + "=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Analyze budget variances with materiality filtering"
)
parser.add_argument(
"input_file",
help="Path to JSON file with budget data",
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)",
)
parser.add_argument(
"--threshold-pct",
type=float,
default=10.0,
help="Materiality threshold percentage (default: 10)",
)
parser.add_argument(
"--threshold-amt",
type=float,
default=50000.0,
help="Materiality threshold dollar amount (default: 50000)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File '{args.input_file}' not found.", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.input_file}': {e}", file=sys.stderr)
sys.exit(1)
analyzer = BudgetVarianceAnalyzer(
data,
threshold_pct=args.threshold_pct,
threshold_amt=args.threshold_amt,
)
results = analyzer.run_analysis()
if args.format == "json":
print(json.dumps(results, indent=2))
else:
print(analyzer.format_text(results))
if __name__ == "__main__":
main()
FILE:scripts/dcf_valuation.py
#!/usr/bin/env python3
"""
DCF Valuation Model
Discounted Cash Flow enterprise and equity valuation with WACC calculation,
terminal value estimation, and two-way sensitivity analysis.
Uses standard library only (math, statistics) - NO numpy/pandas/scipy.
Usage:
python dcf_valuation.py valuation_data.json
python dcf_valuation.py valuation_data.json --format json
python dcf_valuation.py valuation_data.json --projection-years 7
"""
import argparse
import json
import math
import sys
from statistics import mean
from typing import Any, Dict, List, Optional, Tuple
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0 or denominator is None:
return default
return numerator / denominator
class DCFModel:
"""Discounted Cash Flow valuation model."""
def __init__(self) -> None:
"""Initialize the DCF model."""
self.historical: Dict[str, Any] = {}
self.assumptions: Dict[str, Any] = {}
self.wacc: float = 0.0
self.projected_revenue: List[float] = []
self.projected_fcf: List[float] = []
self.projection_years: int = 5
self.terminal_value_perpetuity: float = 0.0
self.terminal_value_exit_multiple: float = 0.0
self.enterprise_value_perpetuity: float = 0.0
self.enterprise_value_exit_multiple: float = 0.0
self.equity_value_perpetuity: float = 0.0
self.equity_value_exit_multiple: float = 0.0
self.value_per_share_perpetuity: float = 0.0
self.value_per_share_exit_multiple: float = 0.0
def set_historical_financials(self, historical: Dict[str, Any]) -> None:
"""Set historical financial data."""
self.historical = historical
def set_assumptions(self, assumptions: Dict[str, Any]) -> None:
"""Set projection assumptions."""
self.assumptions = assumptions
self.projection_years = assumptions.get("projection_years", 5)
def calculate_wacc(self) -> float:
"""Calculate Weighted Average Cost of Capital, with cost of equity derived via CAPM."""
wacc_inputs = self.assumptions.get("wacc_inputs", {})
risk_free_rate = wacc_inputs.get("risk_free_rate", 0.04)
equity_risk_premium = wacc_inputs.get("equity_risk_premium", 0.06)
beta = wacc_inputs.get("beta", 1.0)
cost_of_debt = wacc_inputs.get("cost_of_debt", 0.05)
tax_rate = wacc_inputs.get("tax_rate", 0.25)
debt_weight = wacc_inputs.get("debt_weight", 0.30)
equity_weight = wacc_inputs.get("equity_weight", 0.70)
# CAPM: Cost of Equity = Risk-Free Rate + Beta * Equity Risk Premium
cost_of_equity = risk_free_rate + beta * equity_risk_premium
# WACC = (E/V * Re) + (D/V * Rd * (1 - T))
after_tax_cost_of_debt = cost_of_debt * (1 - tax_rate)
self.wacc = (equity_weight * cost_of_equity) + (
debt_weight * after_tax_cost_of_debt
)
return self.wacc
def project_cash_flows(self) -> Tuple[List[float], List[float]]:
"""Project revenue and free cash flow over the projection period."""
base_revenue = self.historical.get("revenue", [])
if not base_revenue:
raise ValueError("Historical revenue data is required")
last_revenue = base_revenue[-1]
revenue_growth_rates = self.assumptions.get("revenue_growth_rates", [])
fcf_margins = self.assumptions.get("fcf_margins", [])
# Years beyond the supplied growth-rate or FCF-margin lists fall back to these defaults
default_growth = self.assumptions.get("default_revenue_growth", 0.05)
default_fcf_margin = self.assumptions.get("default_fcf_margin", 0.10)
self.projected_revenue = []
self.projected_fcf = []
current_revenue = last_revenue
for year in range(self.projection_years):
growth = (
revenue_growth_rates[year]
if year < len(revenue_growth_rates)
else default_growth
)
fcf_margin = (
fcf_margins[year]
if year < len(fcf_margins)
else default_fcf_margin
)
current_revenue = current_revenue * (1 + growth)
fcf = current_revenue * fcf_margin
self.projected_revenue.append(current_revenue)
self.projected_fcf.append(fcf)
return self.projected_revenue, self.projected_fcf
def calculate_terminal_value(self) -> Tuple[float, float]:
"""Calculate terminal value using both perpetuity growth and exit multiple."""
if not self.projected_fcf:
raise ValueError("Must project cash flows before terminal value")
terminal_fcf = self.projected_fcf[-1]
terminal_growth = self.assumptions.get("terminal_growth_rate", 0.025)
exit_multiple = self.assumptions.get("exit_ev_ebitda_multiple", 12.0)
# Perpetuity growth method: TV = FCF * (1+g) / (WACC - g)
if self.wacc > terminal_growth:
self.terminal_value_perpetuity = (
terminal_fcf * (1 + terminal_growth)
) / (self.wacc - terminal_growth)
else:
# The perpetuity formula is undefined when WACC <= g; report 0.0 instead
self.terminal_value_perpetuity = 0.0
# Exit multiple method: TV = Terminal EBITDA * Exit Multiple
terminal_revenue = self.projected_revenue[-1]
ebitda_margin = self.assumptions.get("terminal_ebitda_margin", 0.20)
terminal_ebitda = terminal_revenue * ebitda_margin
self.terminal_value_exit_multiple = terminal_ebitda * exit_multiple
return self.terminal_value_perpetuity, self.terminal_value_exit_multiple
def calculate_enterprise_value(self) -> Tuple[float, float]:
"""Calculate enterprise value by discounting projected FCFs and terminal value."""
if not self.projected_fcf:
raise ValueError("Must project cash flows first")
# Discount projected FCFs
pv_fcf = 0.0
for i, fcf in enumerate(self.projected_fcf):
discount_factor = (1 + self.wacc) ** (i + 1)
pv_fcf += fcf / discount_factor
# Discount terminal values
terminal_discount = (1 + self.wacc) ** self.projection_years
pv_tv_perpetuity = self.terminal_value_perpetuity / terminal_discount
pv_tv_exit = self.terminal_value_exit_multiple / terminal_discount
self.enterprise_value_perpetuity = pv_fcf + pv_tv_perpetuity
self.enterprise_value_exit_multiple = pv_fcf + pv_tv_exit
return self.enterprise_value_perpetuity, self.enterprise_value_exit_multiple
def calculate_equity_value(self) -> Tuple[float, float]:
"""Calculate equity value from enterprise value."""
net_debt = self.historical.get("net_debt", 0)
shares_outstanding = self.historical.get("shares_outstanding", 1)
self.equity_value_perpetuity = (
self.enterprise_value_perpetuity - net_debt
)
self.equity_value_exit_multiple = (
self.enterprise_value_exit_multiple - net_debt
)
self.value_per_share_perpetuity = safe_divide(
self.equity_value_perpetuity, shares_outstanding
)
self.value_per_share_exit_multiple = safe_divide(
self.equity_value_exit_multiple, shares_outstanding
)
return self.equity_value_perpetuity, self.equity_value_exit_multiple
def sensitivity_analysis(
self,
wacc_range: Optional[List[float]] = None,
growth_range: Optional[List[float]] = None,
) -> Dict[str, Any]:
"""
Two-way sensitivity analysis: WACC vs terminal growth rate.
Returns a table of enterprise values using nested lists (no numpy).
"""
if wacc_range is None:
base_wacc = self.wacc
wacc_range = [
round(base_wacc - 0.02, 4),
round(base_wacc - 0.01, 4),
round(base_wacc, 4),
round(base_wacc + 0.01, 4),
round(base_wacc + 0.02, 4),
]
if growth_range is None:
base_growth = self.assumptions.get("terminal_growth_rate", 0.025)
growth_range = [
round(base_growth - 0.01, 4),
round(base_growth - 0.005, 4),
round(base_growth, 4),
round(base_growth + 0.005, 4),
round(base_growth + 0.01, 4),
]
rows = len(wacc_range)
cols = len(growth_range)
# Initialize sensitivity table as nested lists
ev_table = [[0.0] * cols for _ in range(rows)]
share_price_table = [[0.0] * cols for _ in range(rows)]
terminal_fcf = self.projected_fcf[-1] if self.projected_fcf else 0
for i, wacc_val in enumerate(wacc_range):
for j, growth_val in enumerate(growth_range):
if wacc_val <= growth_val:
ev_table[i][j] = float("inf")
share_price_table[i][j] = float("inf")
continue
# Recalculate PV of projected FCFs with this WACC
pv_fcf = 0.0
for k, fcf in enumerate(self.projected_fcf):
pv_fcf += fcf / ((1 + wacc_val) ** (k + 1))
# Terminal value with this growth rate
tv = (terminal_fcf * (1 + growth_val)) / (wacc_val - growth_val)
pv_tv = tv / ((1 + wacc_val) ** self.projection_years)
ev = pv_fcf + pv_tv
ev_table[i][j] = round(ev, 2)
net_debt = self.historical.get("net_debt", 0)
shares = self.historical.get("shares_outstanding", 1)
equity = ev - net_debt
share_price_table[i][j] = round(
safe_divide(equity, shares), 2
)
return {
"wacc_values": wacc_range,
"growth_values": growth_range,
"enterprise_value_table": ev_table,
"share_price_table": share_price_table,
}
def run_full_valuation(self) -> Dict[str, Any]:
"""Run the complete DCF valuation."""
self.calculate_wacc()
self.project_cash_flows()
self.calculate_terminal_value()
self.calculate_enterprise_value()
self.calculate_equity_value()
sensitivity = self.sensitivity_analysis()
return {
"wacc": self.wacc,
"projected_revenue": self.projected_revenue,
"projected_fcf": self.projected_fcf,
"terminal_value": {
"perpetuity_growth": self.terminal_value_perpetuity,
"exit_multiple": self.terminal_value_exit_multiple,
},
"enterprise_value": {
"perpetuity_growth": self.enterprise_value_perpetuity,
"exit_multiple": self.enterprise_value_exit_multiple,
},
"equity_value": {
"perpetuity_growth": self.equity_value_perpetuity,
"exit_multiple": self.equity_value_exit_multiple,
},
"value_per_share": {
"perpetuity_growth": self.value_per_share_perpetuity,
"exit_multiple": self.value_per_share_exit_multiple,
},
"sensitivity_analysis": sensitivity,
}
def format_text(self, results: Dict[str, Any]) -> str:
"""Format valuation results as human-readable text."""
lines: List[str] = []
lines.append("=" * 70)
lines.append("DCF VALUATION ANALYSIS")
lines.append("=" * 70)
def fmt_money(val: float) -> str:
if val == float("inf"):
return "N/A (WACC <= growth)"
if abs(val) >= 1e9:
return f"${val / 1e9:,.2f}B"
if abs(val) >= 1e6:
return f"${val / 1e6:,.2f}M"
if abs(val) >= 1e3:
return f"${val / 1e3:,.1f}K"
return f"${val:,.2f}"
lines.append(f"\n--- WACC ---")
lines.append(f" Weighted Average Cost of Capital: {results['wacc'] * 100:.2f}%")
lines.append(f"\n--- REVENUE PROJECTIONS ---")
for i, rev in enumerate(results["projected_revenue"], 1):
lines.append(f" Year {i}: {fmt_money(rev)}")
lines.append(f"\n--- FREE CASH FLOW PROJECTIONS ---")
for i, fcf in enumerate(results["projected_fcf"], 1):
lines.append(f" Year {i}: {fmt_money(fcf)}")
lines.append(f"\n--- TERMINAL VALUE ---")
lines.append(
f" Perpetuity Growth Method: "
f"{fmt_money(results['terminal_value']['perpetuity_growth'])}"
)
lines.append(
f" Exit Multiple Method: "
f"{fmt_money(results['terminal_value']['exit_multiple'])}"
)
lines.append(f"\n--- ENTERPRISE VALUE ---")
lines.append(
f" Perpetuity Growth Method: "
f"{fmt_money(results['enterprise_value']['perpetuity_growth'])}"
)
lines.append(
f" Exit Multiple Method: "
f"{fmt_money(results['enterprise_value']['exit_multiple'])}"
)
lines.append(f"\n--- EQUITY VALUE ---")
lines.append(
f" Perpetuity Growth Method: "
f"{fmt_money(results['equity_value']['perpetuity_growth'])}"
)
lines.append(
f" Exit Multiple Method: "
f"{fmt_money(results['equity_value']['exit_multiple'])}"
)
lines.append(f"\n--- VALUE PER SHARE ---")
vps = results["value_per_share"]
lines.append(f" Perpetuity Growth Method: ${vps['perpetuity_growth']:,.2f}")
lines.append(f" Exit Multiple Method: ${vps['exit_multiple']:,.2f}")
# Sensitivity table
sens = results["sensitivity_analysis"]
lines.append(f"\n--- SENSITIVITY ANALYSIS (Enterprise Value) ---")
lines.append(f" WACC vs Terminal Growth Rate")
lines.append("")
header = " {:>10s}".format("WACC \\ g")
for g in sens["growth_values"]:
header += f" {g * 100:>8.1f}%"
lines.append(header)
lines.append(" " + "-" * (10 + 10 * len(sens["growth_values"])))
for i, w in enumerate(sens["wacc_values"]):
row = f" {w * 100:>9.1f}%"
for j in range(len(sens["growth_values"])):
val = sens["enterprise_value_table"][i][j]
if val == float("inf"):
row += f" {'N/A':>8s}"
else:
row += f" {fmt_money(val):>8s}"
lines.append(row)
lines.append("\n" + "=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point."""
parser = argparse.ArgumentParser(
description="DCF Valuation Model - Enterprise and equity valuation"
)
parser.add_argument(
"input_file",
help="Path to JSON file with valuation data",
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)",
)
parser.add_argument(
"--projection-years",
type=int,
default=None,
help="Number of projection years (overrides input file)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File '{args.input_file}' not found.", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.input_file}': {e}", file=sys.stderr)
sys.exit(1)
model = DCFModel()
model.set_historical_financials(data.get("historical", {}))
assumptions = data.get("assumptions", {})
if args.projection_years is not None:
assumptions["projection_years"] = args.projection_years
model.set_assumptions(assumptions)
try:
results = model.run_full_valuation()
except ValueError as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
if args.format == "json":
# Handle inf values for JSON serialization
def sanitize(obj: Any) -> Any:
if isinstance(obj, float) and math.isinf(obj):
return None
if isinstance(obj, dict):
return {k: sanitize(v) for k, v in obj.items()}
if isinstance(obj, list):
return [sanitize(v) for v in obj]
return obj
print(json.dumps(sanitize(results), indent=2))
else:
print(model.format_text(results))
if __name__ == "__main__":
main()
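
The valuation arithmetic in the script above can be checked by hand with a much smaller sketch. The function names below (`capm_wacc`, `gordon_terminal_value`, `enterprise_value`) are hypothetical and not part of `dcf_valuation.py`; the input values are the script's own defaults. One deliberate difference, assumed here for clarity: this sketch raises an error when WACC does not exceed the growth rate, whereas the script reports a terminal value of 0.0.

```python
from typing import List


def capm_wacc(
    risk_free_rate: float,
    equity_risk_premium: float,
    beta: float,
    cost_of_debt: float,
    tax_rate: float,
    equity_weight: float,
    debt_weight: float,
) -> float:
    """WACC with the cost of equity taken from CAPM."""
    cost_of_equity = risk_free_rate + beta * equity_risk_premium
    after_tax_cost_of_debt = cost_of_debt * (1 - tax_rate)
    return equity_weight * cost_of_equity + debt_weight * after_tax_cost_of_debt


def gordon_terminal_value(final_fcf: float, wacc: float, growth: float) -> float:
    """Perpetuity growth terminal value: FCF * (1 + g) / (WACC - g)."""
    if wacc <= growth:
        # Assumption of this sketch: fail loudly instead of returning 0.0
        raise ValueError("WACC must exceed the terminal growth rate")
    return final_fcf * (1 + growth) / (wacc - growth)


def enterprise_value(fcfs: List[float], wacc: float, growth: float) -> float:
    """Present value of the projected FCFs plus the discounted terminal value."""
    pv_fcf = sum(fcf / (1 + wacc) ** year for year, fcf in enumerate(fcfs, start=1))
    terminal_value = gordon_terminal_value(fcfs[-1], wacc, growth)
    return pv_fcf + terminal_value / (1 + wacc) ** len(fcfs)


# Script defaults: rf 4%, ERP 6%, beta 1.0, Rd 5%, tax 25%, weights 70/30
wacc = capm_wacc(0.04, 0.06, 1.0, 0.05, 0.25, 0.70, 0.30)
print(f"WACC: {wacc:.4%}")

# Sanity check: a flat 100 per year forever, discounted at 10%
print(f"EV: {enterprise_value([100.0], 0.10, 0.0):,.2f}")
```

With the default inputs the WACC works out to 0.70 × 10% + 0.30 × 3.75% = 8.125%. The second call is a useful sanity check: a level perpetuity of 100 discounted at 10% should be worth about 1,000, however the cash flows are split between the explicit period and the terminal value.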
FILE:scripts/forecast_builder.py
#!/usr/bin/env python3
"""
Forecast Builder
Driver-based revenue forecasting with 13-week rolling cash flow projection,
scenario modeling (base/bull/bear), and trend analysis using simple linear
regression (standard library only).
Usage:
python forecast_builder.py forecast_data.json
python forecast_builder.py forecast_data.json --format json
python forecast_builder.py forecast_data.json --scenarios base,bull,bear
"""
import argparse
import json
import math
import sys
from statistics import mean
from typing import Any, Dict, List, Optional, Tuple
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0 or denominator is None:
return default
return numerator / denominator
def simple_linear_regression(
x_values: List[float], y_values: List[float]
) -> Tuple[float, float, float]:
"""
Simple linear regression using standard library.
Returns (slope, intercept, r_squared).
"""
n = len(x_values)
if n < 2 or n != len(y_values):
return (0.0, 0.0, 0.0)
x_mean = mean(x_values)
y_mean = mean(y_values)
ss_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, y_values))
ss_xx = sum((x - x_mean) ** 2 for x in x_values)
ss_yy = sum((y - y_mean) ** 2 for y in y_values)
slope = safe_divide(ss_xy, ss_xx)
intercept = y_mean - slope * x_mean
# R-squared
r_squared = safe_divide(ss_xy ** 2, ss_xx * ss_yy) if ss_yy > 0 else 0.0
return (slope, intercept, r_squared)
class ForecastBuilder:
"""Driver-based revenue forecasting with scenario modeling."""
def __init__(self, data: Dict[str, Any]) -> None:
"""Initialize the forecast builder."""
self.historical: List[Dict[str, Any]] = data.get("historical_periods", [])
self.drivers: Dict[str, Any] = data.get("drivers", {})
self.assumptions: Dict[str, Any] = data.get("assumptions", {})
self.cash_flow_inputs: Dict[str, Any] = data.get("cash_flow_inputs", {})
self.scenarios_config: Dict[str, Any] = data.get("scenarios", {})
self.forecast_periods: int = data.get("forecast_periods", 12)
def analyze_trends(self) -> Dict[str, Any]:
"""Analyze historical trends using linear regression."""
if not self.historical:
return {"error": "No historical data available"}
# Extract revenue series
revenues = [p.get("revenue", 0) for p in self.historical]
periods = list(range(1, len(revenues) + 1))
slope, intercept, r_squared = simple_linear_regression(
[float(x) for x in periods],
[float(y) for y in revenues],
)
# Calculate growth rates
growth_rates = []
for i in range(1, len(revenues)):
if revenues[i - 1] > 0:
growth = (revenues[i] - revenues[i - 1]) / revenues[i - 1]
growth_rates.append(growth)
avg_growth = mean(growth_rates) if growth_rates else 0.0
# Seasonality detection (if enough data)
seasonality_index: List[float] = []
if len(revenues) >= 4:
overall_avg = mean(revenues)
if overall_avg > 0:
seasonality_index = [r / overall_avg for r in revenues[-4:]]
return {
"trend": {
"slope": round(slope, 2),
"intercept": round(intercept, 2),
"r_squared": round(r_squared, 4),
"direction": "upward" if slope > 0 else "downward" if slope < 0 else "flat",
},
"growth_rates": [round(g, 4) for g in growth_rates],
"average_growth_rate": round(avg_growth, 4),
"seasonality_index": [round(s, 4) for s in seasonality_index],
"historical_revenues": revenues,
}
def build_driver_based_forecast(
self, scenario: str = "base"
) -> Dict[str, Any]:
"""
Build a driver-based revenue forecast.
Drivers may include: units, price, customers, ARPU, conversion rate, etc.
"""
scenario_adjustments = self.scenarios_config.get(scenario, {})
growth_adjustment = scenario_adjustments.get("growth_adjustment", 0.0)
margin_adjustment = scenario_adjustments.get("margin_adjustment", 0.0)
base_revenue = 0.0
if self.historical:
base_revenue = self.historical[-1].get("revenue", 0)
# Driver-based calculation
unit_drivers = self.drivers.get("units", {})
price_drivers = self.drivers.get("pricing", {})
customer_drivers = self.drivers.get("customers", {})
base_growth = self.assumptions.get("revenue_growth_rate", 0.05)
adjusted_growth = base_growth + growth_adjustment
base_margin = self.assumptions.get("gross_margin", 0.40)
adjusted_margin = base_margin + margin_adjustment
cogs_pct = 1.0 - adjusted_margin
opex_pct = self.assumptions.get("opex_pct_revenue", 0.25)
forecast_periods: List[Dict[str, Any]] = []
current_revenue = base_revenue
# If we have unit and price drivers, use them
has_unit_drivers = bool(unit_drivers) and bool(price_drivers)
if has_unit_drivers:
base_units = unit_drivers.get("base_units", 1000)
unit_growth = unit_drivers.get("growth_rate", 0.03) + growth_adjustment
base_price = price_drivers.get("base_price", 100)
price_growth = price_drivers.get("annual_increase", 0.02)
current_units = base_units
current_price = base_price
for period in range(1, self.forecast_periods + 1):
current_units = current_units * (1 + unit_growth / 12)
if period % 12 == 0:
current_price = current_price * (1 + price_growth)
period_revenue = current_units * current_price
cogs = period_revenue * cogs_pct
gross_profit = period_revenue - cogs
opex = period_revenue * opex_pct
operating_income = gross_profit - opex
forecast_periods.append({
"period": period,
"revenue": round(period_revenue, 2),
"units": round(current_units, 0),
"price": round(current_price, 2),
"cogs": round(cogs, 2),
"gross_profit": round(gross_profit, 2),
"gross_margin": round(adjusted_margin, 4),
"opex": round(opex, 2),
"operating_income": round(operating_income, 2),
})
else:
# Simple growth-based forecast
monthly_growth = (1 + adjusted_growth) ** (1 / 12) - 1
for period in range(1, self.forecast_periods + 1):
current_revenue = current_revenue * (1 + monthly_growth)
cogs = current_revenue * cogs_pct
gross_profit = current_revenue - cogs
opex = current_revenue * opex_pct
operating_income = gross_profit - opex
forecast_periods.append({
"period": period,
"revenue": round(current_revenue, 2),
"cogs": round(cogs, 2),
"gross_profit": round(gross_profit, 2),
"gross_margin": round(adjusted_margin, 4),
"opex": round(opex, 2),
"operating_income": round(operating_income, 2),
})
total_revenue = sum(p["revenue"] for p in forecast_periods)
total_operating_income = sum(p["operating_income"] for p in forecast_periods)
return {
"scenario": scenario,
"growth_rate": round(adjusted_growth, 4),
"gross_margin": round(adjusted_margin, 4),
"forecast_periods": forecast_periods,
"total_revenue": round(total_revenue, 2),
"total_operating_income": round(total_operating_income, 2),
"average_monthly_revenue": round(
safe_divide(total_revenue, len(forecast_periods)), 2
),
}
def build_rolling_cash_flow(self, weeks: int = 13) -> Dict[str, Any]:
"""Build a rolling weekly cash flow projection (13 weeks by default)."""
cfi = self.cash_flow_inputs
opening_balance = cfi.get("opening_cash_balance", 0)
weekly_revenue = cfi.get("weekly_revenue", 0)
collection_rate = cfi.get("collection_rate", 0.85)
collection_lag_weeks = cfi.get("collection_lag_weeks", 2)
# Weekly expenses
weekly_payroll = cfi.get("weekly_payroll", 0)
weekly_rent = cfi.get("weekly_rent", 0)
weekly_operating = cfi.get("weekly_operating", 0)
weekly_other = cfi.get("weekly_other", 0)
total_weekly_expenses = weekly_payroll + weekly_rent + weekly_operating + weekly_other
# One-time items
one_time_items: List[Dict[str, Any]] = cfi.get("one_time_items", [])
weekly_projections: List[Dict[str, Any]] = []
running_balance = opening_balance
# Revenue pipeline for lagged collections
revenue_pipeline: List[float] = [0.0] * collection_lag_weeks
for week in range(1, weeks + 1):
# Revenue collections (lagged)
revenue_pipeline.append(weekly_revenue)
collections = revenue_pipeline.pop(0) * collection_rate
# One-time items for this week
one_time_inflows = 0.0
one_time_outflows = 0.0
one_time_labels: List[str] = []
for item in one_time_items:
if item.get("week") == week:
amount = item.get("amount", 0)
if amount > 0:
one_time_inflows += amount
else:
one_time_outflows += abs(amount)
one_time_labels.append(item.get("description", ""))
total_inflows = collections + one_time_inflows
total_outflows = total_weekly_expenses + one_time_outflows
net_cash_flow = total_inflows - total_outflows
running_balance += net_cash_flow
weekly_projections.append({
"week": week,
"collections": round(collections, 2),
"one_time_inflows": round(one_time_inflows, 2),
"total_inflows": round(total_inflows, 2),
"payroll": round(weekly_payroll, 2),
"rent": round(weekly_rent, 2),
"operating": round(weekly_operating, 2),
"other_expenses": round(weekly_other, 2),
"one_time_outflows": round(one_time_outflows, 2),
"total_outflows": round(total_outflows, 2),
"net_cash_flow": round(net_cash_flow, 2),
"closing_balance": round(running_balance, 2),
"notes": ", ".join(one_time_labels) if one_time_labels else "",
})
# Summary
total_inflows = sum(w["total_inflows"] for w in weekly_projections)
total_outflows = sum(w["total_outflows"] for w in weekly_projections)
min_balance = min(w["closing_balance"] for w in weekly_projections)
min_balance_week = next(
w["week"]
for w in weekly_projections
if w["closing_balance"] == min_balance
)
return {
"weeks": weeks,
"opening_balance": opening_balance,
"closing_balance": round(running_balance, 2),
"total_inflows": round(total_inflows, 2),
"total_outflows": round(total_outflows, 2),
"net_change": round(total_inflows - total_outflows, 2),
"minimum_balance": round(min_balance, 2),
"minimum_balance_week": min_balance_week,
"cash_runway_weeks": (
round(safe_divide(running_balance, total_weekly_expenses))
if total_weekly_expenses > 0
else None
),
"weekly_projections": weekly_projections,
}
def build_scenario_comparison(
self, scenarios: Optional[List[str]] = None
) -> Dict[str, Any]:
"""Build and compare multiple scenarios."""
if scenarios is None:
scenarios = ["base", "bull", "bear"]
scenario_results: Dict[str, Any] = {}
for scenario in scenarios:
scenario_results[scenario] = self.build_driver_based_forecast(scenario)
# Comparison summary
comparison: List[Dict[str, Any]] = []
for scenario in scenarios:
result = scenario_results[scenario]
comparison.append({
"scenario": scenario,
"total_revenue": result["total_revenue"],
"total_operating_income": result["total_operating_income"],
"growth_rate": result["growth_rate"],
"gross_margin": result["gross_margin"],
"avg_monthly_revenue": result["average_monthly_revenue"],
})
return {
"scenarios": scenario_results,
"comparison": comparison,
}
def run_full_forecast(
self, scenarios: Optional[List[str]] = None
) -> Dict[str, Any]:
"""Run the complete forecast analysis."""
trends = self.analyze_trends()
scenario_comparison = self.build_scenario_comparison(scenarios)
cash_flow = self.build_rolling_cash_flow()
return {
"trend_analysis": trends,
"scenario_comparison": scenario_comparison,
"rolling_cash_flow": cash_flow,
}
def format_text(self, results: Dict[str, Any]) -> str:
"""Format forecast results as human-readable text."""
lines: List[str] = []
lines.append("=" * 70)
lines.append("FINANCIAL FORECAST REPORT")
lines.append("=" * 70)
def fmt_money(val: float) -> str:
if abs(val) >= 1e9:
return f"${val / 1e9:,.2f}B"
if abs(val) >= 1e6:
return f"${val / 1e6:,.2f}M"
if abs(val) >= 1e3:
return f"${val / 1e3:,.1f}K"
return f"${val:,.2f}"
# Trend Analysis
trend = results["trend_analysis"]
if "error" not in trend:
lines.append(f"\n--- TREND ANALYSIS ---")
t = trend["trend"]
lines.append(f" Direction: {t['direction']}")
lines.append(f" R-squared: {t['r_squared']:.4f}")
lines.append(
f" Average Historical Growth: "
f"{trend['average_growth_rate'] * 100:.1f}%"
)
if trend["seasonality_index"]:
lines.append(
f" Seasonality Index (last 4): "
f"{', '.join(f'{s:.2f}' for s in trend['seasonality_index'])}"
)
# Scenario Comparison
comp = results["scenario_comparison"]["comparison"]
lines.append(f"\n--- SCENARIO COMPARISON ---")
lines.append(
f" {'Scenario':<10s} {'Revenue':>14s} {'Op. Income':>14s} "
f"{'Growth':>8s} {'Margin':>8s}"
)
lines.append(" " + "-" * 62)
for c in comp:
lines.append(
f" {c['scenario']:<10s} {fmt_money(c['total_revenue']):>14s} "
f"{fmt_money(c['total_operating_income']):>14s} "
f"{c['growth_rate'] * 100:>7.1f}% "
f"{c['gross_margin'] * 100:>7.1f}%"
)
# Base scenario detail
base = results["scenario_comparison"]["scenarios"].get("base", {})
if base and base.get("forecast_periods"):
lines.append(f"\n--- BASE CASE MONTHLY FORECAST ---")
lines.append(
f" {'Period':>6s} {'Revenue':>12s} {'Gross Profit':>12s} "
f"{'Op. Income':>12s}"
)
lines.append(" " + "-" * 48)
for p in base["forecast_periods"]:
lines.append(
f" {p['period']:>6d} {fmt_money(p['revenue']):>12s} "
f"{fmt_money(p['gross_profit']):>12s} "
f"{fmt_money(p['operating_income']):>12s}"
)
# Cash Flow
cf = results["rolling_cash_flow"]
lines.append(f"\n--- 13-WEEK ROLLING CASH FLOW ---")
lines.append(f" Opening Balance: {fmt_money(cf['opening_balance'])}")
lines.append(f" Closing Balance: {fmt_money(cf['closing_balance'])}")
lines.append(f" Net Change: {fmt_money(cf['net_change'])}")
lines.append(
f" Minimum Balance: {fmt_money(cf['minimum_balance'])} "
f"(Week {cf['minimum_balance_week']})"
)
if cf.get("cash_runway_weeks") is not None:
lines.append(f" Cash Runway: {cf['cash_runway_weeks']:.0f} weeks")
lines.append(f"\n Weekly Detail:")
lines.append(
f" {'Wk':>3s} {'Inflows':>10s} {'Outflows':>10s} "
f"{'Net':>10s} {'Balance':>12s}"
)
lines.append(" " + "-" * 50)
for w in cf["weekly_projections"]:
notes = f" {w['notes']}" if w["notes"] else ""
lines.append(
f" {w['week']:>3d} {fmt_money(w['total_inflows']):>10s} "
f"{fmt_money(w['total_outflows']):>10s} "
f"{fmt_money(w['net_cash_flow']):>10s} "
f"{fmt_money(w['closing_balance']):>12s}{notes}"
)
lines.append("\n" + "=" * 70)
return "\n".join(lines)
def main() -> None:
"""Main entry point."""
parser = argparse.ArgumentParser(
description="Driver-based revenue forecasting with scenario modeling"
)
parser.add_argument(
"input_file",
help="Path to JSON file with forecast data",
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)",
)
parser.add_argument(
"--scenarios",
type=str,
default="base,bull,bear",
help="Comma-separated list of scenarios (default: base,bull,bear)",
)
args = parser.parse_args()
try:
with open(args.input_file, "r") as f:
data = json.load(f)
except FileNotFoundError:
print(f"Error: File '{args.input_file}' not found.", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.input_file}': {e}", file=sys.stderr)
sys.exit(1)
builder = ForecastBuilder(data)
scenarios = [s.strip() for s in args.scenarios.split(",")]
results = builder.run_full_forecast(scenarios)
if args.format == "json":
print(json.dumps(results, indent=2))
else:
print(builder.format_text(results))
if __name__ == "__main__":
main()
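
The lagged-collections logic in `build_rolling_cash_flow` is easier to follow in isolation. The following is a minimal sketch, assuming a single flat weekly expense figure and no one-time items; `rolling_cash_balance` is a hypothetical helper, not a function in `forecast_builder.py`. It uses a `deque` as the revenue pipeline where the script uses a list with `pop(0)`.

```python
from collections import deque
from typing import Deque, List, Tuple


def rolling_cash_balance(
    opening_balance: float,
    weekly_revenue: float,
    weekly_expenses: float,
    collection_rate: float = 0.85,
    collection_lag_weeks: int = 2,
    weeks: int = 13,
) -> List[Tuple[int, float, float]]:
    """Return (week, collections, closing_balance) for each projected week."""
    # Pre-fill the pipeline with zeros so nothing is collected during the lag
    pipeline: Deque[float] = deque([0.0] * collection_lag_weeks)
    balance = opening_balance
    rows: List[Tuple[int, float, float]] = []
    for week in range(1, weeks + 1):
        # This week's billings enter the pipeline; the oldest entry is collected
        pipeline.append(weekly_revenue)
        collections = pipeline.popleft() * collection_rate
        balance += collections - weekly_expenses
        rows.append((week, collections, balance))
    return rows


for week, collections, balance in rolling_cash_balance(
    opening_balance=500.0, weekly_revenue=1000.0, weekly_expenses=300.0, weeks=4
):
    print(f"Week {week}: collected {collections:,.2f}, balance {balance:,.2f}")
```

With a two-week lag the first two weeks collect nothing, so the balance dips below zero in week 2 before collections start in week 3. That dip is what the script's `minimum_balance` and `minimum_balance_week` fields are meant to surface.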
FILE:scripts/ratio_calculator.py
#!/usr/bin/env python3
"""
Financial Ratio Calculator
Calculates and interprets financial ratios across 5 categories:
profitability, liquidity, leverage, efficiency, and valuation.
Usage:
python ratio_calculator.py financial_data.json
python ratio_calculator.py financial_data.json --format json
python ratio_calculator.py financial_data.json --category profitability
"""
import argparse
import json
import sys
from typing import Any, Dict, List, Optional, Tuple
def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
"""Safely divide two numbers, returning default if denominator is zero."""
if denominator == 0 or denominator is None:
return default
return numerator / denominator
class FinancialRatioCalculator:
"""Calculate and interpret financial ratios from statement data."""
# Industry benchmark ranges: (low, typical, high)
BENCHMARKS: Dict[str, Tuple[float, float, float]] = {
"roe": (0.08, 0.15, 0.25),
"roa": (0.03, 0.06, 0.12),
"gross_margin": (0.25, 0.40, 0.60),
"operating_margin": (0.05, 0.15, 0.25),
"net_margin": (0.03, 0.10, 0.20),
"current_ratio": (1.0, 1.5, 3.0),
"quick_ratio": (0.8, 1.0, 2.0),
"cash_ratio": (0.2, 0.5, 1.0),
"debt_to_equity": (0.3, 0.8, 2.0),
"interest_coverage": (2.0, 5.0, 10.0),
"dscr": (1.0, 1.5, 2.5),
"asset_turnover": (0.5, 1.0, 2.0),
"inventory_turnover": (4.0, 8.0, 12.0),
"receivables_turnover": (6.0, 10.0, 15.0),
"dso": (30.0, 45.0, 60.0),
"pe_ratio": (10.0, 20.0, 35.0),
"pb_ratio": (1.0, 2.5, 5.0),
"ps_ratio": (1.0, 3.0, 8.0),
"ev_ebitda": (6.0, 12.0, 20.0),
"peg_ratio": (0.5, 1.0, 2.0),
}
def __init__(self, data: Dict[str, Any]) -> None:
"""Initialize with financial statement data."""
self.income = data.get("income_statement", {})
self.balance = data.get("balance_sheet", {})
self.cash_flow = data.get("cash_flow", {})
self.market = data.get("market_data", {})
self.results: Dict[str, Dict[str, Any]] = {}
def calculate_profitability(self) -> Dict[str, Any]:
"""Calculate profitability ratios."""
revenue = self.income.get("revenue", 0)
cogs = self.income.get("cost_of_goods_sold", 0)
operating_income = self.income.get("operating_income", 0)
net_income = self.income.get("net_income", 0)
total_equity = self.balance.get("total_equity", 0)
total_assets = self.balance.get("total_assets", 0)
gross_profit = revenue - cogs
ratios = {
"roe": {
"value": safe_divide(net_income, total_equity),
"formula": "Net Income / Total Equity",
"name": "Return on Equity",
},
"roa": {
"value": safe_divide(net_income, total_assets),
"formula": "Net Income / Total Assets",
"name": "Return on Assets",
},
"gross_margin": {
"value": safe_divide(gross_profit, revenue),
"formula": "(Revenue - COGS) / Revenue",
"name": "Gross Margin",
},
"operating_margin": {
"value": safe_divide(operating_income, revenue),
"formula": "Operating Income / Revenue",
"name": "Operating Margin",
},
"net_margin": {
"value": safe_divide(net_income, revenue),
"formula": "Net Income / Revenue",
"name": "Net Margin",
},
}
for key, ratio in ratios.items():
ratio["interpretation"] = self.interpret_ratio(key, ratio["value"])
self.results["profitability"] = ratios
return ratios
def calculate_liquidity(self) -> Dict[str, Any]:
"""Calculate liquidity ratios."""
current_assets = self.balance.get("current_assets", 0)
current_liabilities = self.balance.get("current_liabilities", 0)
inventory = self.balance.get("inventory", 0)
cash = self.balance.get("cash_and_equivalents", 0)
ratios = {
"current_ratio": {
"value": safe_divide(current_assets, current_liabilities),
"formula": "Current Assets / Current Liabilities",
"name": "Current Ratio",
},
"quick_ratio": {
"value": safe_divide(
current_assets - inventory, current_liabilities
),
"formula": "(Current Assets - Inventory) / Current Liabilities",
"name": "Quick Ratio",
},
"cash_ratio": {
"value": safe_divide(cash, current_liabilities),
"formula": "Cash & Equivalents / Current Liabilities",
"name": "Cash Ratio",
},
}
for key, ratio in ratios.items():
ratio["interpretation"] = self.interpret_ratio(key, ratio["value"])
self.results["liquidity"] = ratios
return ratios
def calculate_leverage(self) -> Dict[str, Any]:
"""Calculate leverage ratios."""
total_debt = self.balance.get("total_debt", 0)
total_equity = self.balance.get("total_equity", 0)
operating_income = self.income.get("operating_income", 0)
interest_expense = self.income.get("interest_expense", 0)
operating_cash_flow = self.cash_flow.get("operating_cash_flow", 0)
total_debt_service = self.cash_flow.get(
"total_debt_service", interest_expense
)
ratios = {
"debt_to_equity": {
"value": safe_divide(total_debt, total_equity),
"formula": "Total Debt / Total Equity",
"name": "Debt-to-Equity Ratio",
},
"interest_coverage": {
"value": safe_divide(operating_income, interest_expense),
"formula": "Operating Income / Interest Expense",
"name": "Interest Coverage Ratio",
},
"dscr": {
"value": safe_divide(operating_cash_flow, total_debt_service),
"formula": "Operating Cash Flow / Total Debt Service",
"name": "Debt Service Coverage Ratio",
},
}
for key, ratio in ratios.items():
ratio["interpretation"] = self.interpret_ratio(key, ratio["value"])
self.results["leverage"] = ratios
return ratios
    def calculate_efficiency(self) -> Dict[str, Any]:
        """Calculate efficiency ratios."""
        revenue = self.income.get("revenue", 0)
        cogs = self.income.get("cost_of_goods_sold", 0)
        total_assets = self.balance.get("total_assets", 0)
        inventory = self.balance.get("inventory", 0)
        accounts_receivable = self.balance.get("accounts_receivable", 0)
        receivables_turnover_val = safe_divide(revenue, accounts_receivable)
        ratios = {
            "asset_turnover": {
                "value": safe_divide(revenue, total_assets),
                "formula": "Revenue / Total Assets",
                "name": "Asset Turnover",
            },
            "inventory_turnover": {
                "value": safe_divide(cogs, inventory),
                "formula": "COGS / Inventory",
                "name": "Inventory Turnover",
            },
            "receivables_turnover": {
                "value": receivables_turnover_val,
                "formula": "Revenue / Accounts Receivable",
                "name": "Receivables Turnover",
            },
            "dso": {
                "value": safe_divide(365, receivables_turnover_val)
                if receivables_turnover_val > 0
                else 0.0,
                "formula": "365 / Receivables Turnover",
                "name": "Days Sales Outstanding",
            },
        }
        for key, ratio in ratios.items():
            ratio["interpretation"] = self.interpret_ratio(key, ratio["value"])
        self.results["efficiency"] = ratios
        return ratios

    def calculate_valuation(self) -> Dict[str, Any]:
        """Calculate valuation ratios (requires market data)."""
        market_cap = self.market.get("market_cap", 0)
        share_price = self.market.get("share_price", 0)
        shares_outstanding = self.market.get("shares_outstanding", 0)
        earnings_growth_rate = self.market.get("earnings_growth_rate", 0)
        net_income = self.income.get("net_income", 0)
        revenue = self.income.get("revenue", 0)
        total_equity = self.balance.get("total_equity", 0)
        total_debt = self.balance.get("total_debt", 0)
        cash = self.balance.get("cash_and_equivalents", 0)
        ebitda = self.income.get("ebitda", 0)
        if market_cap == 0 and share_price > 0 and shares_outstanding > 0:
            market_cap = share_price * shares_outstanding
        eps = safe_divide(net_income, shares_outstanding)
        book_value_per_share = safe_divide(total_equity, shares_outstanding)
        enterprise_value = market_cap + total_debt - cash
        pe = safe_divide(share_price, eps)
        ratios = {
            "pe_ratio": {
                "value": pe,
                "formula": "Share Price / Earnings Per Share",
                "name": "Price-to-Earnings Ratio",
            },
            "pb_ratio": {
                "value": safe_divide(share_price, book_value_per_share),
                "formula": "Share Price / Book Value Per Share",
                "name": "Price-to-Book Ratio",
            },
            "ps_ratio": {
                "value": safe_divide(
                    market_cap, revenue
                ),
                "formula": "Market Cap / Revenue",
                "name": "Price-to-Sales Ratio",
            },
            "ev_ebitda": {
                "value": safe_divide(enterprise_value, ebitda),
                "formula": "Enterprise Value / EBITDA",
                "name": "EV/EBITDA",
            },
            "peg_ratio": {
                "value": safe_divide(pe, earnings_growth_rate * 100)
                if earnings_growth_rate > 0
                else 0.0,
                "formula": "P/E Ratio / Earnings Growth Rate (%)",
                "name": "PEG Ratio",
            },
        }
        for key, ratio in ratios.items():
            ratio["interpretation"] = self.interpret_ratio(key, ratio["value"])
        self.results["valuation"] = ratios
        return ratios

    def calculate_all(self) -> Dict[str, Dict[str, Any]]:
        """Calculate all ratio categories."""
        self.calculate_profitability()
        self.calculate_liquidity()
        self.calculate_leverage()
        self.calculate_efficiency()
        self.calculate_valuation()
        return self.results

    def interpret_ratio(self, ratio_key: str, value: float) -> str:
        """Interpret a ratio value against benchmarks."""
        if value == 0.0:
            return "Insufficient data to calculate"
        benchmarks = self.BENCHMARKS.get(ratio_key)
        if not benchmarks:
            return "No benchmark available"
        low, typical, high = benchmarks
        # DSO is inverse - lower is better
        if ratio_key == "dso":
            if value <= low:
                return "Excellent - collections well above average"
            elif value <= typical:
                return "Good - collections within normal range"
            elif value <= high:
                return "Acceptable - monitor collection trends"
            else:
                return "Concern - collections significantly slower than peers"
        # Debt-to-equity - lower generally better (but context matters)
        if ratio_key == "debt_to_equity":
            if value <= low:
                return "Conservative leverage - strong equity position"
            elif value <= typical:
                return "Moderate leverage - well balanced"
            elif value <= high:
                return "Elevated leverage - monitor debt levels"
            else:
                return "High leverage - potential financial risk"
        # Standard interpretation (higher is better for most ratios)
        if value < low:
            return "Below average - needs improvement"
        elif value <= typical:
            return "Acceptable - within normal range"
        elif value <= high:
            return "Good - above average performance"
        else:
            return "Excellent - significantly above peers"

    @staticmethod
    def format_ratio(value: float, is_percentage: bool = False) -> str:
        """Format a ratio value for display."""
        if is_percentage:
            return f"{value * 100:.1f}%"
        return f"{value:.2f}"

    def format_text(self, category: Optional[str] = None) -> str:
        """Format results as human-readable text."""
        lines: List[str] = []
        lines.append("=" * 70)
        lines.append("FINANCIAL RATIO ANALYSIS")
        lines.append("=" * 70)
        categories = (
            {category: self.results[category]}
            if category and category in self.results
            else self.results
        )
        percentage_ratios = {
            "roe", "roa", "gross_margin", "operating_margin", "net_margin"
        }
        for cat_name, ratios in categories.items():
            lines.append(f"\n--- {cat_name.upper()} ---")
            for key, ratio in ratios.items():
                is_pct = key in percentage_ratios
                formatted = self.format_ratio(ratio["value"], is_pct)
                lines.append(f" {ratio['name']}: {formatted}")
                lines.append(f" Formula: {ratio['formula']}")
                lines.append(f" Assessment: {ratio['interpretation']}")
        lines.append("\n" + "=" * 70)
        return "\n".join(lines)

    def to_json(self, category: Optional[str] = None) -> Dict[str, Any]:
        """Return results as JSON-serializable dict."""
        if category and category in self.results:
            return {"category": category, "ratios": self.results[category]}
        return {"categories": self.results}


def main() -> None:
    """Main entry point."""
    parser = argparse.ArgumentParser(
        description="Calculate and interpret financial ratios"
    )
    parser.add_argument(
        "input_file",
        help="Path to JSON file with financial statement data",
    )
    parser.add_argument(
        "--format",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--category",
        choices=[
            "profitability",
            "liquidity",
            "leverage",
            "efficiency",
            "valuation",
        ],
        default=None,
        help="Calculate only a specific ratio category",
    )
    args = parser.parse_args()
    try:
        with open(args.input_file, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        print(f"Error: File '{args.input_file}' not found.", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in '{args.input_file}': {e}", file=sys.stderr)
        sys.exit(1)
    calculator = FinancialRatioCalculator(data)
    if args.category:
        method_map = {
            "profitability": calculator.calculate_profitability,
            "liquidity": calculator.calculate_liquidity,
            "leverage": calculator.calculate_leverage,
            "efficiency": calculator.calculate_efficiency,
            "valuation": calculator.calculate_valuation,
        }
        method_map[args.category]()
    else:
        calculator.calculate_all()
    if args.format == "json":
        print(json.dumps(calculator.to_json(args.category), indent=2))
    else:
        print(calculator.format_text(args.category))


if __name__ == "__main__":
    main()
Systematic competitor tracking that feeds CMO positioning, CRO battlecards, and CPO roadmap decisions. Use when analyzing competitors, building sales battlec...
---
name: "competitive-intel"
description: "Systematic competitor tracking that feeds CMO positioning, CRO battlecards, and CPO roadmap decisions. Use when analyzing competitors, building sales battlecards, tracking market moves, positioning against alternatives, or when user mentions competitive intelligence, competitive analysis, competitor research, battlecards, win/loss, or market positioning."
license: MIT
metadata:
  version: 1.0.0
  author: Alireza Rezvani
  category: c-level
  domain: competitive-strategy
  updated: 2026-03-05
  frameworks: ci-playbook, battlecard-template
---
# Competitive Intelligence
Systematic competitor tracking. Not obsession — intelligence that drives real decisions.
## Keywords
competitive intelligence, competitor analysis, battlecard, win/loss analysis, competitive positioning, competitive tracking, market intelligence, competitor research, SWOT, competitive map, feature gap analysis, competitive strategy
## Quick Start
```
/ci:landscape — Map your competitive space (direct, indirect, future)
/ci:battlecard [name] — Build a sales battlecard for a specific competitor
/ci:winloss — Analyze recent wins and losses by reason
/ci:update [name] — Track what a competitor did recently
/ci:map — Build competitive positioning map
```
## Framework: 5-Layer Intelligence System
### Layer 1: Competitor Identification
**Direct competitors:** Same ICP, same problem, comparable solution, similar price point.
**Indirect competitors:** Same budget, different solution (including "do nothing" and "build in-house").
**Future competitors:** Well-funded startups in adjacent space; large incumbents with stated roadmap overlap.
**The 2x2 Threat Matrix:**
| | Same ICP | Different ICP |
|---|---|---|
| **Same problem** | Direct threat | Adjacent (watch) |
| **Different problem** | Displacement risk | Ignore for now |
Update this quarterly. Who's moved quadrants?
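The matrix is a lookup on two yes/no questions, and the quarterly review is a diff between two snapshots. A minimal Python sketch; `classify_threat` and `moved_quadrants` are illustrative names, not part of this skill:

```python
from typing import Dict, List


def classify_threat(same_icp: bool, same_problem: bool) -> str:
    """Return the 2x2 threat-matrix quadrant for one competitor."""
    quadrants = {
        (True, True): "Direct threat",
        (False, True): "Adjacent (watch)",
        (True, False): "Displacement risk",
        (False, False): "Ignore for now",
    }
    return quadrants[(same_icp, same_problem)]


def moved_quadrants(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """List competitors whose quadrant changed since the last review."""
    return [
        name
        for name, quadrant in current.items()
        if name in previous and previous[name] != quadrant
    ]
```

Run `moved_quadrants` against last quarter's snapshot to answer "who's moved quadrants?" without re-reading every profile.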
### Layer 2: Tracking Dimensions
Track these 8 dimensions per competitor:
| Dimension | Sources | Cadence |
|-----------|---------|---------|
| **Product moves** | Changelog, G2/Capterra reviews, Twitter/LinkedIn | Monthly |
| **Pricing changes** | Pricing page, sales call intel, customer feedback | Triggered |
| **Funding** | Crunchbase, TechCrunch, LinkedIn | Triggered |
| **Hiring signals** | LinkedIn job postings, Indeed | Monthly |
| **Partnerships** | Press releases, co-marketing | Triggered |
| **Customer wins** | Case studies, review sites, LinkedIn | Monthly |
| **Customer losses** | Win/loss interviews, churned accounts | Ongoing |
| **Messaging shifts** | Homepage, ads (Facebook/Google Ad Library) | Quarterly |
### Layer 3: Analysis Frameworks
**SWOT per Competitor:**
- Strengths: What do they do well? Where do they win?
- Weaknesses: Where do they lose? What do customers complain about?
- Opportunities: What could they do that would threaten you?
- Threats: What's their existential risk?
**Competitive Positioning Map (2 axes):**
Choose axes that matter for your buyers:
- Common: Price vs Feature Depth; Enterprise-ready vs SMB-ready; Easy to implement vs Configurable
- Pick axes that show YOUR differentiation clearly
**Feature Gap Analysis:**
| Feature | You | Competitor A | Competitor B | Gap status |
|---------|-----|-------------|-------------|------------|
| [Feature] | ✅ | ✅ | ❌ | Your advantage |
| [Feature] | ❌ | ✅ | ✅ | Gap — roadmap? |
| [Feature] | ✅ | ❌ | ❌ | Moat |
| [Feature] | ❌ | ❌ | ✅ | Competitor B only |
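The gap status column can be derived from the checkmarks rather than filled in by hand. A rough sketch, assuming each feature is recorded as a boolean for you plus one boolean per competitor; `gap_status` is a hypothetical helper, and the "Parity" and "Nobody offers it" labels cover cases the table above does not show:

```python
from typing import Dict


def gap_status(you: bool, competitors: Dict[str, bool]) -> str:
    """Derive the gap-status label for one feature row."""
    have = [name for name, has_it in competitors.items() if has_it]
    if you:
        if not have:
            return "Moat"
        if len(have) == len(competitors):
            return "Parity"  # assumption: not a label from the table
        return "Your advantage"
    if not have:
        return "Nobody offers it"  # assumption: not a label from the table
    if len(have) == len(competitors):
        return "Gap: roadmap candidate"
    return f"{', '.join(have)} only"
```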
### Layer 4: Output Formats
**For Sales (CRO):** Battlecards — one page per competitor, designed for pre-call prep.
See `templates/battlecard-template.md`
**For Marketing (CMO):** Positioning update — message shifts, new differentiators, claims to stop or start making.
**For Product (CPO):** Feature gap summary — what customers ask for that we don't have, what competitors ship, what to reprioritize.
**For CEO/Board:** Monthly competitive summary — 1-page: who moved, what it means, recommended responses.
### Layer 5: Intelligence Cadence
**Monthly (scheduled):**
- Review all tier-1 competitors (direct threats, top 3)
- Update battlecards with new intel
- Publish 1-page summary to leadership
**Triggered (event-based):**
- Competitor raises funding → assess implications within 48 hours
- Competitor launches major feature → product + sales response within 1 week
- Competitor poaches key customer → win/loss interview within 2 weeks
- Competitor changes pricing → analyze and respond within 1 week
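
The triggered windows above translate directly into due dates. A small sketch, assuming events are logged with a type and a date; the event keys and `response_due` are made up for illustration:

```python
from datetime import date, timedelta

# Response windows taken from the triggered list above.
RESPONSE_WINDOWS = {
    "funding": timedelta(days=2),          # 48 hours
    "major_feature": timedelta(weeks=1),
    "customer_poached": timedelta(weeks=2),
    "pricing_change": timedelta(weeks=1),
}


def response_due(event_type: str, event_date: date) -> date:
    """Return the date by which the response to an event is due."""
    return event_date + RESPONSE_WINDOWS[event_type]
```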
**Quarterly:**
- Full competitive landscape review
- Update positioning map
- Refresh ICP competitive threat assessment
- Add/remove companies from tracking list
---
## Win/Loss Analysis
This is the highest-signal competitive data you have. Most companies do it too rarely.
**When to interview:**
- Every lost deal >$50K ACV
- Every churn >6 months tenure
- Every competitive win (learn why — it may not be what you think)
**Who conducts it:**
- NOT the AE who worked the deal (too close, prospect won't be candid)
- Customer success, product team, or external researcher
**Question structure:**
1. "Walk me through your evaluation process"
2. "Who else were you considering?"
3. "What were the top 3 criteria in your decision?"
4. "Where did [our product] fall short?"
5. "What was the deciding factor?"
6. "What would have changed your decision?"
**Aggregate findings monthly:**
- Win reasons (rank by frequency)
- Loss reasons (rank by frequency)
- Competitor win rates (by competitor, by segment)
- Patterns over time
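
One way to compute the competitor and segment breakdown, as a sketch only: it assumes each closed deal is logged as a `(competitor, segment, won)` row, where `won` means you won the deal. The function name and row shape are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def win_rates(
    deals: Iterable[Tuple[str, str, bool]]
) -> Dict[Tuple[str, str], float]:
    """Win rate against each (competitor, segment) pair."""
    tally = defaultdict(lambda: [0, 0])  # key -> [wins, total]
    for competitor, segment, won in deals:
        entry = tally[(competitor, segment)]
        entry[0] += int(won)
        entry[1] += 1
    return {key: wins / total for key, (wins, total) in tally.items()}
```

Storing one month's result next to the previous ones gives the "patterns over time" view.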
---
## The Balance: Intelligence Without Obsession
**Signs you're over-tracking competitors:**
- Roadmap decisions are primarily driven by "they just shipped X"
- Team morale drops when competitors fundraise
- You're shipping features you don't believe in to match their checklist
- Pricing discussions always start with "well, they charge X"
**Signs you're under-tracking:**
- Your AEs get blindsided on calls
- Prospects know more about competitors than your team does
- You missed a major product launch until customers told you
- Your positioning hasn't changed in 12+ months despite market moves
**The right posture:**
- Know competitors well enough to win against them
- Don't let them set your agenda
- Your roadmap is led by customer problems, informed by competitive gaps
---
## Distributing Intelligence
| Audience | Format | Cadence | Owner |
|----------|--------|---------|-------|
| AEs + SDRs | Updated battlecards in CRM | Monthly + triggered | CRO |
| Product | Feature gap analysis | Quarterly | CPO |
| Marketing | Positioning brief | Quarterly | CMO |
| Leadership | 1-page competitive summary | Monthly | CEO/COO |
| Board | Competitive landscape slide | Quarterly | CEO |
**One source of truth:** All competitive intel lives in one place (Notion, Confluence, Salesforce). Avoid Slack-only distribution — it disappears.
---
## Red Flags in Competitive Intelligence
| Signal | What it means |
|--------|---------------|
| Competitor's win rate >50% in your core segment | Fundamental positioning problem, not sales problem |
| Same objection from 5+ deals: "competitor has X" | Feature gap that's real, not just optics |
| Competitor hired 10 engineers in your domain | Major product investment incoming |
| Competitor raised >$20M and targets your ICP | 12-month runway for them to compete hard |
| Prospects evaluate you to justify competitor decision | You're the "check box" — fix perception or segment |
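
The numeric rows of this table can be checked mechanically. A minimal sketch under stated assumptions: the `CompetitorSnapshot` fields are hypothetical and would be fed from your own tracking data, and the last row (being the "check box" vendor) is qualitative, so it is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompetitorSnapshot:
    win_rate_in_core_segment: float  # competitor's win rate, 0.0 to 1.0
    deals_with_same_objection: int
    engineers_hired_in_domain: int
    funding_raised_usd: float
    targets_our_icp: bool


def red_flags(s: CompetitorSnapshot) -> List[str]:
    """Return the red-flag meanings triggered by one snapshot."""
    flags = []
    if s.win_rate_in_core_segment > 0.50:
        flags.append("Fundamental positioning problem")
    if s.deals_with_same_objection >= 5:
        flags.append("Real feature gap")
    if s.engineers_hired_in_domain >= 10:
        flags.append("Major product investment incoming")
    if s.funding_raised_usd > 20_000_000 and s.targets_our_icp:
        flags.append("Funded to compete hard")
    return flags
```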
## Integration with C-Suite Roles
| Intelligence Type | Feeds To | Output Format |
|------------------|----------|---------------|
| Product moves | CPO | Roadmap input, feature gap analysis |
| Pricing changes | CRO, CFO | Pricing response recommendations |
| Funding rounds | CEO, CFO | Strategic positioning update |
| Hiring signals | CHRO, CTO | Talent market intelligence |
| Customer wins/losses | CRO, CMO | Battlecard updates, positioning shifts |
| Marketing campaigns | CMO | Counter-positioning, channel intelligence |
## References
- `references/ci-playbook.md` — OSINT sources, win/loss framework, positioning map construction
- `templates/battlecard-template.md` — sales battlecard template
FILE:references/ci-playbook.md
# Competitive Intelligence Playbook
## OSINT Sources for Competitor Tracking
### Free, Reliable Sources
**Company & Product:**
- **Their website** — pricing page (archive.org for history), product changelog, careers page
- **G2 / Capterra / Trustpilot** — customer reviews; filter by recency; read 1-star reviews carefully
- **LinkedIn** — job postings signal roadmap; company page for headcount trend; employees for leaks
- **GitHub** — open source activity; what they're building; engineering team size; tech stack
- **Crunchbase / PitchBook** (free tier) — funding history, investors, team changes
- **BuiltWith** — tech stack they use; signals about infrastructure maturity
**Messaging & Positioning:**
- **Facebook Ad Library** — see their current ad copy and creative; what messages they're testing
- **Google Keyword Planner** — enter their site URL to get keyword ideas; a rough proxy for the terms they may be bidding on
- **SEMrush / Ahrefs** (free trial or limited) — their organic keywords, backlink profile
- **Wayback Machine** — homepage evolution over time; when positioning shifted
- **Their blog** — content strategy reveals priorities and ICP assumptions
**News & Events:**
- **TechCrunch, VentureBeat** — funding announcements, major launches
- **Twitter/X / LinkedIn** — CEO + founders; direct signals about strategy
- **Podcast appearances** — founders talk more openly on podcasts than press releases
- **Job descriptions** — "Senior Engineer - Payments" means they're building payments
### Paid (Worth It for Tier-1 Competitors)
- **G2 Buyer Intent** — which prospects are researching your competitor right now
- **Bombora** — intent data for account-level research signals
- **PitchBook** — funding, investors, valuation estimates
- **Klue / Crayon / Kompyte** — dedicated CI platforms that aggregate automatically
### Primary Research (Best Signal)
- **Win/loss interviews** — the single highest-signal source (see below)
- **Talk to churned customers** — why did they switch? To whom?
- **Talk to their customers** — LinkedIn outreach; honest conversations
- **Industry events** — competitor presentations reveal roadmap; talk to attendees
- **Former employees** — LinkedIn; respectful outreach; no NDA violations
---
## Competitive Battlecard Format
A battlecard is a 1-page (or single screen) document for sales reps to reference before and during calls.
**Design principles:**
- Written for a rep with 2 minutes to prep, not a product manager
- Action-oriented: tells reps what to SAY, not just what to know
- Updated monthly at minimum; never more than 90 days old
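
The 90-day rule is easy to enforce automatically. A small sketch, assuming each battlecard stores its last-updated date; `is_stale` is an illustrative name:

```python
from datetime import date

MAX_AGE_DAYS = 90  # from the design principle above


def is_stale(last_updated: date, today: date) -> bool:
    """True when a battlecard is more than 90 days old."""
    return (today - last_updated).days > MAX_AGE_DAYS
```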
### Battlecard Structure
```
COMPETITOR: [Name]
Last updated: [Date] | Owner: [Name]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
THE 30-SECOND SUMMARY
[One paragraph. Who they are, who they sell to, why they win.]
THEIR STRENGTHS (know these — don't dismiss them)
• [Strength 1] — what customers actually love about them
• [Strength 2]
• [Strength 3]
THEIR REAL WEAKNESSES (from win/loss data, not assumptions)
• [Weakness 1] — source: [customer quote / win/loss theme]
• [Weakness 2]
• [Weakness 3]
OUR DIFFERENTIATED ADVANTAGES
• [Advantage 1] — proof point: [metric/customer/case study]
• [Advantage 2] — proof point:
• [Advantage 3] — proof point:
COMMON OBJECTIONS + RESPONSES
"They have [feature] and you don't."
→ [Response. Acknowledge, reframe, redirect.]
"They're cheaper."
→ [Response with ROI angle or TCO comparison.]
"They're more established / bigger."
→ [Response. Size isn't always an advantage; use to your benefit.]
TRAP-SETTING QUESTIONS (ask these early to shift the eval criteria)
• "How important is [your differentiator] to your team?"
• "Have you looked at [pain point they create]?"
• "What happens to your workflow when [their known limitation occurs]?"
WHEN WE WIN
• [Segment or scenario where we almost always beat them]
• [Use case where we're clearly stronger]
WHEN WE LOSE (be honest)
• [Scenario where they're genuinely better — don't fight these battles]
• [Segment where they have structural advantages]
DO NOT SAY
• Don't claim [X] — it's not true and they'll call it out
• Don't say [Y] — prospect will already know it and it sounds desperate
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
---
## Win/Loss Analysis Framework
### Why Most Companies Do This Wrong
- They survey instead of interview (surveys get polite answers)
- The AE conducts it (too emotionally invested; prospect won't be candid)
- They do it 6 months after the decision (memory fades)
- They look for confirmation of what they believe
### The Right Process
**Timing:** Within 30 days of deal closed/lost/churned.
**Interviewer:** Customer success, product, or external researcher. Never the AE.
**Duration:** 30 minutes (budget 45).
**Incentive:** $100 gift card gets you 80% acceptance. Worth it.
**Interview Guide:**
*Opening:*
"I'm [name] from [company]. I'm not in sales — I'm trying to understand what drove your decision so we can improve. There's nothing you can say that will change the outcome. I just want honest feedback."
*Core questions:*
1. "Can you walk me through your evaluation process from the beginning?"
2. "Who were the key stakeholders involved in the decision?"
3. "What were the 3 most important criteria you were evaluating against?"
4. "Which vendors did you seriously consider?"
5. "Where did [company] fall short of your expectations?" (For losses)
OR "What tipped the decision in [company]'s favor?" (For wins)
6. "Was price a factor? How significant?"
7. "What would have had to be different for you to choose [us / the other option]?"
8. "Any advice for our team on how we handled the process?"
**Data aggregation:**
- Tag every response: [criterion], [competitor mentioned], [product gap], [sales process], [price], [trust/credibility]
- Monthly rollup: top 5 win reasons, top 5 loss reasons, competitor win rate
- Share with: CEO, CRO, CPO, CMO — not just sales
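
Once responses are tagged, the monthly rollup is a frequency count. A sketch, assuming each interview is stored as an outcome plus its list of tags; the record shape and `top_reasons` are assumptions:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_reasons(
    interviews: Iterable[Tuple[str, List[str]]], outcome: str, n: int = 5
) -> List[Tuple[str, int]]:
    """Rank the tags attached to interviews with the given outcome."""
    counts = Counter(
        tag
        for result, tags in interviews
        if result == outcome
        for tag in tags
    )
    return counts.most_common(n)
```

Call it once with `"win"` and once with `"loss"` to get the two top-5 lists.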
---
## Competitive Positioning Map Construction
A positioning map shows where you sit relative to competitors on 2 dimensions that BUYERS care about.
### Step 1: Choose Your Axes
- Pick dimensions that actually drive purchase decisions in your segment
- At least one axis should be where you win
- Avoid generic axes ("feature-rich vs. simple" tells you nothing)
**Good axis pairs:**
- Implementation time (days vs. months) × Customization depth
- Price point × Enterprise readiness
- Automation level × Human-in-the-loop control
- Time-to-value × Total cost of ownership
**Bad axes:**
- Quality (too vague)
- "Innovation" (unmeasurable)
- Any axis where all competitors cluster in the same spot
### Step 2: Place Competitors Objectively
- Use customer quotes and win/loss data to justify placement
- Don't place competitors where you WANT them — where they ACTUALLY are
- If you're unsure, ask 5 customers to place them
### Step 3: Find and Name Your White Space
- Where is there a position no competitor holds?
- Is that white space there because it's valuable (opportunity) or worthless (avoid)?
- Can you credibly occupy it?
### Step 4: Test Your Positioning
- Show the map to 5 prospects: "Does this match your perception?"
- Show it to 5 lost prospects: "Where would you place [the winner] and us?"
- Adjust until the map matches buyer reality, not internal perception
---
## Intelligence Sharing Across Roles
### What Each Role Needs and When
**CRO (Sales):**
- Needs: Battlecards, win rates by competitor, competitor objections + responses
- Cadence: Updated battlecards monthly; triggered updates on major competitor moves
- Format: 1-pager per competitor in CRM, linked from deal record
**CMO (Marketing):**
- Needs: Messaging shifts, new claims, ad spend signals, keyword battles
- Cadence: Quarterly positioning review, triggered on major launches
- Format: Positioning brief with recommended response to messaging shifts
**CPO (Product):**
- Needs: Feature gap analysis, competitor roadmap signals (job postings, changelog), what we lose to
- Cadence: Monthly feature gap update, triggered on major launches
- Format: Feature comparison matrix + gap prioritization recommendation
**CTO (Engineering):**
- Needs: Tech stack signals, infrastructure approaches, scale they've achieved
- Cadence: Quarterly
- Format: Technical comparison notes, relevant for architectural decisions
**CEO:**
- Needs: Summary of threat landscape, recommended responses, board-level narrative
- Cadence: Monthly 1-pager + quarterly deep dive
- Format: 1-page brief: who moved, what it means, what we do
### The Single Source of Truth Rule
All competitive intel in one place. Suggest:
- Notion database per competitor: profile, battlecard, changelog, win/loss notes
- Slack channel: `#competitive-intel` for real-time triggered alerts
- Monthly digest email to leadership
If it lives only in Slack, it disappears. If it lives only in a wiki that nobody reads, it doesn't matter. Combine both.
---
## How to Track Without Obsessing
**Set up the system, then let it run:**
- Google Alerts for competitor names + CEO names
- LinkedIn Saved Searches for their job postings
- Klue/Crayon if budget allows (automated aggregation)
- Monthly 60-minute competitive review meeting (not 4 hours)
**What to do when a competitor makes a big move:**
1. Read the announcement objectively
2. Talk to 3 customers: "Did you see this? What do you think?"
3. Assess: does this change any buying criteria in your deals?
4. If yes: update battlecard and positioning within 1 week
5. If no: log it, move on
**The test:** After reviewing a competitor move, do you feel urgency to ship something? If yes, you're reacting. The right feeling is "noted — let's see if customers care."
FILE:templates/battlecard-template.md
# Sales Battlecard Template
**COMPETITOR:** [Name]
**Last updated:** [YYYY-MM-DD] | **Owner:** [Name]
**Win rate vs this competitor:** [X]% | **Deals tracked:** [N]
---
## 30-Second Summary
[Who they are. Who they target. Why they win. What they're known for. 3-4 sentences max.]
---
## Their Strengths
*Know these. Don't dismiss them. Prospects have already heard their pitch.*
- **[Strength]:** [What customers genuinely love; source if available]
- **[Strength]:** [Specific capability or trait]
- **[Strength]:** [Brand, market position, or ecosystem advantage]
---
## Their Real Weaknesses
*From win/loss data only — not wishful thinking.*
- **[Weakness]:** "[Customer quote]" — seen in [N] deals
- **[Weakness]:** [Documented limitation with evidence]
- **[Weakness]:** [Implementation, support, or pricing issue]
---
## Our Differentiated Advantages
*Must be real and provable. Each needs a proof point.*
- **[Advantage]:** [Proof: metric / customer quote / case study]
- **[Advantage]:** [Proof]
- **[Advantage]:** [Proof]
---
## Common Objections + Responses
**"They have [feature X] and you don't."**
> [Acknowledge. Reframe to your strength. Redirect to outcome.
> "You're right that they have X. What we've found is that customers who care most about X tend to also care about [Y], where we're significantly stronger. Can I show you [specific example]?"]
**"They're cheaper."**
> [Don't fight on price. Reframe to TCO or ROI.
> "They are lower in initial cost. Most customers find the total cost over 12 months is actually comparable when you factor in [implementation time / support costs / integrations]. Want to walk through that?"]
**"They've been around longer / they're more established."**
> [Reframe tenure as potential liability or irrelevance.
> "Their longevity means they have a lot of technical debt and a big customer base that pulls their roadmap in every direction. Our customers tell us that's exactly why they chose us — we move faster and we're laser-focused on [their specific use case]."]
**"[Competitor] is already used by [big customer they respect]."**
> [Name-drop your wins in their segment.
> "We work with [comparable logo]. Want me to connect you with their [role] to ask how they made the decision?"]
---
## Trap-Setting Questions
*Ask early in discovery to establish criteria that favor you.*
- "How important is [your key differentiator] to your workflow?"
- "What happens when [their known limitation] occurs? Has that been an issue before?"
- "How long does your team typically take to onboard a new tool?"
- "Who manages the integration work — do you have dedicated engineering resources for that?"
- "What does your current vendor do when you need support?"
---
## When We Win
- [Scenario or segment where we consistently beat them]
- [Use case that plays to our strengths]
- [Buyer profile that prefers us]
## When We Lose (Be Honest)
- [Scenario where they genuinely win — don't fight here]
- [Segment where their strengths matter more than ours]
---
## Do NOT Say
- ❌ Don't claim [X] — it's not accurate and they'll check
- ❌ Don't attack [Y] — it backfires and makes us look insecure
- ❌ Don't say "we're better" without specifics — be concrete
---
## Recent Intel
*Last 90 days only. Older than 90 days: archive.*
- [Date]: [What happened — funding, product launch, pricing change, key hire]
- [Date]: [Customer feedback from win/loss interview]
- [Date]: [Any notable market move]
---
*Battlecards are only useful if current. If this is >90 days old, flag to [owner] for update.*
Inter-agent communication protocol for C-suite agent teams. Defines invocation syntax, loop prevention, isolation rules, and response formats. Use when C-sui...
---
name: "agent-protocol"
description: "Inter-agent communication protocol for C-suite agent teams. Defines invocation syntax, loop prevention, isolation rules, and response formats. Use when C-suite agents need to query each other, coordinate cross-functional analysis, or run board meetings with multiple agent roles."
license: MIT
metadata:
  version: 1.0.0
  author: Alireza Rezvani
  category: c-level
  domain: agent-orchestration
  updated: 2026-03-05
  frameworks: invocation-patterns
---
# Inter-Agent Protocol
How C-suite agents talk to each other. Rules that prevent chaos, loops, and circular reasoning.
## Keywords
agent protocol, inter-agent communication, agent invocation, agent orchestration, multi-agent, c-suite coordination, agent chain, loop prevention, agent isolation, board meeting protocol
## Invocation Syntax
Any agent can query another using:
```
[INVOKE:role|question]
```
**Examples:**
```
[INVOKE:cfo|What's the burn rate impact of hiring 5 engineers in Q3?]
[INVOKE:cto|Can we realistically ship this feature by end of quarter?]
[INVOKE:chro|What's our typical time-to-hire for senior engineers?]
[INVOKE:cro|What does our pipeline look like for the next 90 days?]
```
**Valid roles:** `ceo`, `cfo`, `cro`, `cmo`, `cpo`, `cto`, `chro`, `coo`, `ciso`
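
A possible way to pull an invocation out of agent output, sketched in Python. The regex and `parse_invocation` are assumptions, not part of the protocol, and this simple pattern stops at the first `]`, so a question containing a closing bracket would be cut short.

```python
import re
from typing import Optional, Tuple

VALID_ROLES = {"ceo", "cfo", "cro", "cmo", "cpo", "cto", "chro", "coo", "ciso"}

# [INVOKE:role|question]; the question runs to the first closing bracket.
INVOKE_RE = re.compile(r"\[INVOKE:([a-z]+)\|(.+?)\]", re.DOTALL)


def parse_invocation(text: str) -> Optional[Tuple[str, str]]:
    """Return (role, question) for the first valid invocation, else None."""
    match = INVOKE_RE.search(text)
    if match is None:
        return None
    role, question = match.group(1), match.group(2).strip()
    if role not in VALID_ROLES:
        return None
    return role, question
```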
## Response Format
Invoked agents respond using this structure:
```
[RESPONSE:role]
Key finding: [one line — the actual answer]
Supporting data:
- [data point 1]
- [data point 2]
- [data point 3 — optional]
Confidence: [high | medium | low]
Caveat: [one line — what could make this wrong]
[/RESPONSE]
```
**Example:**
```
[RESPONSE:cfo]
Key finding: Hiring 5 engineers in Q3 shortens runway from 14 to 9 months at current burn.
Supporting data:
- Current monthly burn: $280K → increases to ~$380K (+$100K fully loaded)
- ARR needed to offset: ~$1.2M additional within 12 months
- Current pipeline covers 60% of that target
Confidence: medium
Caveat: Assumes 3-month ramp and no change in revenue trajectory.
[/RESPONSE]
```
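
To keep responses in this shape, an agent runtime could build them from a small record instead of free text. A sketch only; the `AgentResponse` class is hypothetical:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentResponse:
    role: str
    key_finding: str
    supporting_data: List[str]
    confidence: str  # "high", "medium", or "low"
    caveat: str

    def render(self) -> str:
        """Render the response in the [RESPONSE:role] block format."""
        if self.confidence not in {"high", "medium", "low"}:
            raise ValueError(f"invalid confidence: {self.confidence}")
        lines = [
            f"[RESPONSE:{self.role}]",
            f"Key finding: {self.key_finding}",
            "Supporting data:",
        ]
        lines += [f"- {point}" for point in self.supporting_data]
        lines += [
            f"Confidence: {self.confidence}",
            f"Caveat: {self.caveat}",
            "[/RESPONSE]",
        ]
        return "\n".join(lines)
```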
## Loop Prevention (Hard Rules)
These rules are enforced unconditionally. No exceptions.
### Rule 1: No Self-Invocation
An agent cannot invoke itself.
```
❌ CFO → [INVOKE:cfo|...] — BLOCKED
```
### Rule 2: Maximum Depth = 2
Chains can go A→B→C (two hops). A third hop is blocked.
```
✅ CRO → CFO → COO (depth 2)
❌ CRO → CFO → COO → CHRO (depth 3 — BLOCKED)
```
### Rule 3: No Circular Calls
If agent A called agent B, agent B cannot call agent A in the same chain.
```
✅ CRO → CFO → CMO
❌ CRO → CFO → CRO (circular — BLOCKED)
```
### Rule 4: Chain Tracking
Each invocation carries its call chain. Format:
```
[CHAIN: cro → cfo → coo]
```
Agents check this chain before responding with another invocation.
**When blocked:** Return this instead of invoking:
```
[BLOCKED: cannot invoke cro — circular call detected in chain cro→cfo]
State assumption used instead: [explicit assumption the agent is making]
```
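
Rules 1 to 3 can be checked from the chain alone. A minimal sketch, assuming the chain is a non-empty list of role names with the original caller first and the current agent last; `check_invocation` and the reason strings are illustrative:

```python
from typing import List, Optional

MAX_DEPTH = 2  # Rule 2: at most two hops


def check_invocation(chain: List[str], target: str) -> Optional[str]:
    """Return None if the current agent may invoke target, else the reason."""
    current = chain[-1]
    if target == current:
        return "self-invocation"      # Rule 1
    if target in chain:
        return "circular call"        # Rule 3
    if len(chain) - 1 >= MAX_DEPTH:   # hops already used
        return "depth limit"          # Rule 2
    return None
```

When the result is not `None`, the agent emits the `[BLOCKED: ...]` line with that reason and states its assumption instead of invoking.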
## Isolation Rules
### Board Meeting Phase 2 (Independent Analysis)
**NO invocations allowed.** Each role forms independent views before cross-pollination.
- Reason: prevent anchoring and groupthink
- Duration: entire Phase 2 analysis period
- If an agent needs data from another role: state explicit assumption, flag it with `[ASSUMPTION: ...]`
### Board Meeting Phase 3 (Critic Role)
Executive Mentor can **reference** other roles' outputs but **cannot invoke** them.
- Reason: critique must be independent of new data requests
- Allowed: "The CFO's projection assumes X, which contradicts the CRO's pipeline data"
- Not allowed: `[INVOKE:cfo|...]` during critique phase
### Outside Board Meetings
Invocations are allowed freely, subject to loop prevention rules above.
## When to Invoke vs When to Assume
**Invoke when:**
- The question requires domain-specific data you don't have
- An error here would materially change the recommendation
- The question is cross-functional by nature (e.g., hiring impact on both budget and capacity)
**Assume when:**
- The data is directionally clear and precision isn't critical
- You're in Phase 2 isolation (always assume, never invoke)
- The chain is already at depth 2
- The question is minor compared to your main analysis
**When assuming, always state it:**
```
[ASSUMPTION: runway ~12 months based on typical Series A burn profile — not verified with CFO]
```
## Conflict Resolution
When two invoked agents give conflicting answers:
1. **Flag the conflict explicitly:**
```
[CONFLICT: CFO projects 14-month runway; CRO expects pipeline to close 80% → implies 18+ months]
```
2. **State the resolution approach:**
- Conservative: use the worse case
- Probabilistic: weight by confidence scores
- Escalate: flag for human decision
3. **Never silently pick one** — surface the conflict to the user.
## Broadcast Pattern (Crisis / CEO)
CEO can broadcast to all roles simultaneously:
```
[BROADCAST:all|What's the impact if we miss the fundraise?]
```
Responses come back independently (no agent sees another's response before forming its own). Aggregate after all respond.
## Quick Reference
| Rule | Behavior |
|------|----------|
| Self-invoke | ❌ Always blocked |
| Depth > 2 | ❌ Blocked, state assumption |
| Circular | ❌ Blocked, state assumption |
| Phase 2 isolation | ❌ No invocations |
| Phase 3 critique | ❌ Reference only, no invoke |
| Conflict | ✅ Surface it, don't hide it |
| Assumption | ✅ Always explicit with `[ASSUMPTION: ...]` |
## Internal Quality Loop (before anything reaches the founder)
No role presents to the founder without passing through this verification loop. The founder sees polished, verified output — not first drafts.
### Step 1: Self-Verification (every role, every time)
Before presenting, every role runs this internal checklist:
```
SELF-VERIFY CHECKLIST:
□ Source Attribution — Where did each data point come from?
✅ "ARR is $2.1M (from CRO pipeline report, Q4 actuals)"
❌ "ARR is around $2M" (no source, vague)
□ Assumption Audit — What am I assuming vs what I verified?
Tag every assumption: [VERIFIED: checked against data] or [ASSUMED: not verified]
If >50% of findings are ASSUMED → flag low confidence
□ Confidence Score — How sure am I on each finding?
🟢 High: verified data, established pattern, multiple sources
🟡 Medium: single source, reasonable inference, some uncertainty
🔴 Low: assumption-based, limited data, first-time analysis
□ Contradiction Check — Does this conflict with known context?
Check against company-context.md and recent decisions in decision-log
If it contradicts a past decision → flag explicitly
□ "So What?" Test — Does every finding have a business consequence?
If you can't answer "so what?" in one sentence → cut it
```
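The assumption-audit threshold in the checklist is simple to apply mechanically. A minimal sketch, assuming each finding carries one tag string (`"VERIFIED"` or `"ASSUMED"`); the function name is hypothetical:

```python
def flag_low_confidence(tags):
    """Return True when more than half of the findings are tagged ASSUMED."""
    if not tags:
        return False
    assumed = sum(1 for tag in tags if tag == "ASSUMED")
    return assumed / len(tags) > 0.5


print(flag_low_confidence(["VERIFIED", "VERIFIED", "ASSUMED"]))
print(flag_low_confidence(["ASSUMED", "ASSUMED", "VERIFIED"]))
```

Exactly half assumed does not trip the flag, because the checklist says ">50%".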
### Step 2: Peer Verification (cross-functional validation)
When a recommendation impacts another role's domain, that role validates BEFORE presenting.
| If your recommendation involves... | Validate with... | They check... |
|-------------------------------------|-------------------|---------------|
| Financial numbers or budget | CFO | Math, runway impact, budget reality |
| Revenue projections | CRO | Pipeline backing, historical accuracy |
| Headcount or hiring | CHRO | Market reality, comp feasibility, timeline |
| Technical feasibility or timeline | CTO | Engineering capacity, technical debt load |
| Operational process changes | COO | Capacity, dependencies, scaling impact |
| Customer-facing changes | CRO + CPO | Churn risk, product roadmap conflict |
| Security or compliance claims | CISO | Actual posture, regulation requirements |
| Market or positioning claims | CMO | Data backing, competitive reality |
**Peer validation format:**
```
[PEER-VERIFY:cfo]
Validated: ✅ Burn rate calculation correct
Adjusted: ⚠️ Hiring timeline should be Q3 not Q2 (budget constraint)
Flagged: 🔴 Missing equity cost in total comp projection
[/PEER-VERIFY]
```
**Skip peer verification when:**
- Single-domain question with no cross-functional impact
- Time-sensitive proactive alert (send alert, verify after)
- Founder explicitly asked for a quick take
### Step 3: Critic Pre-Screen (high-stakes decisions only)
For decisions that are **irreversible, high-cost, or bet-the-company**, the Executive Mentor pre-screens before the founder sees it.
**Triggers for pre-screen:**
- Involves spending > 20% of remaining runway
- Affects >30% of the team (layoffs, reorg)
- Changes company strategy or direction
- Involves external commitments (fundraising terms, partnerships, M&A)
- Any recommendation where all roles agree (suspicious consensus)
**Pre-screen output:**
```
[CRITIC-SCREEN]
Weakest point: [The single biggest vulnerability in this recommendation]
Missing perspective: [What nobody considered]
If wrong, the cost is: [Quantified downside]
Proceed: ✅ With noted risks | ⚠️ After addressing [specific gap] | 🔴 Rethink
[/CRITIC-SCREEN]
```
### Step 4: Course Correction (after founder feedback)
The loop doesn't end at delivery. After the founder responds:
```
FOUNDER FEEDBACK LOOP:
1. Founder approves → log decision (Layer 2), assign actions
2. Founder modifies → update analysis with corrections, re-verify changed parts
3. Founder rejects → log rejection with DO_NOT_RESURFACE, understand WHY
4. Founder asks follow-up → deepen analysis on specific point, re-verify
POST-DECISION REVIEW (30/60/90 days):
- Was the recommendation correct?
- What did we miss?
- Update company-context.md with what we learned
- If wrong → document the lesson, adjust future analysis
```
### Verification Level by Stakes
| Stakes | Self-Verify | Peer-Verify | Critic Pre-Screen |
|--------|-------------|-------------|-------------------|
| Low (informational) | ✅ Required | ❌ Skip | ❌ Skip |
| Medium (operational) | ✅ Required | ✅ Required | ❌ Skip |
| High (strategic) | ✅ Required | ✅ Required | ✅ Required |
| Critical (irreversible) | ✅ Required | ✅ Required | ✅ Required + board meeting |
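The table above reads naturally as a lookup. The sketch below encodes it as a dictionary; the key names, step labels, and the `required_steps` function are illustrative assumptions, not identifiers defined by this skill.

```python
# Verification steps required at each stakes level, transcribed from the table.
VERIFICATION_STEPS = {
    "low": ["self-verify"],
    "medium": ["self-verify", "peer-verify"],
    "high": ["self-verify", "peer-verify", "critic pre-screen"],
    "critical": ["self-verify", "peer-verify", "critic pre-screen", "board meeting"],
}


def required_steps(stakes):
    """Return the ordered verification steps for a stakes level (case-insensitive)."""
    return VERIFICATION_STEPS[stakes.lower()]


print(required_steps("Medium"))
```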
### What Changes in the Output Format
The verified output adds confidence and source information:
```
BOTTOM LINE
[Answer] — Confidence: 🟢 High
WHAT
• [Finding 1] [VERIFIED: Q4 actuals] 🟢
• [Finding 2] [VERIFIED: CRO pipeline data] 🟢
• [Finding 3] [ASSUMED: based on industry benchmarks] 🟡
PEER-VERIFIED BY: CFO (math ✅), CTO (timeline ⚠️ adjusted to Q3)
```
---
## User Communication Standard
All C-suite output to the founder follows ONE format. No exceptions. The founder is the decision-maker — give them results, not process.
### Standard Output (single-role response)
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 [ROLE] — [Topic]
BOTTOM LINE
[One sentence. The answer. No preamble.]
WHAT
• [Finding 1 — most critical]
• [Finding 2]
• [Finding 3]
(Max 5 bullets. If more needed → reference doc.)
WHY THIS MATTERS
[1-2 sentences. Business impact. Not theory — consequence.]
HOW TO ACT
1. [Action] → [Owner] → [Deadline]
2. [Action] → [Owner] → [Deadline]
3. [Action] → [Owner] → [Deadline]
⚠️ RISKS (if any)
• [Risk + what triggers it]
🔑 YOUR DECISION (if needed)
Option A: [Description] — [Trade-off]
Option B: [Description] — [Trade-off]
Recommendation: [Which and why, in one line]
📎 DETAIL: [reference doc or script output for deep-dive]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
### Proactive Alert (unsolicited — triggered by context)
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🚩 [ROLE] — Proactive Alert
WHAT I NOTICED
[What triggered this — specific, not vague]
WHY IT MATTERS
[Business consequence if ignored — in dollars, time, or risk]
RECOMMENDED ACTION
[Exactly what to do, who does it, by when]
URGENCY: 🔴 Act today | 🟡 This week | ⚪ Next review
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
### Board Meeting Output (multi-role synthesis)
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 BOARD MEETING — [Date] — [Agenda Topic]
DECISION REQUIRED
[Frame the decision in one sentence]
PERSPECTIVES
CEO: [one-line position]
CFO: [one-line position]
CRO: [one-line position]
[... only roles that contributed]
WHERE THEY AGREE
• [Consensus point 1]
• [Consensus point 2]
WHERE THEY DISAGREE
• [Conflict] — CEO says X, CFO says Y
• [Conflict] — CRO says X, CPO says Y
CRITIC'S VIEW (Executive Mentor)
[The uncomfortable truth nobody else said]
RECOMMENDED DECISION
[Clear recommendation with rationale]
ACTION ITEMS
1. [Action] → [Owner] → [Deadline]
2. [Action] → [Owner] → [Deadline]
3. [Action] → [Owner] → [Deadline]
🔑 YOUR CALL
[Options if you disagree with the recommendation]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
### Communication Rules (non-negotiable)
1. **Bottom line first.** Always. The founder's time is the scarcest resource.
2. **Results and decisions only.** No process narration ("First I analyzed..."). No thinking out loud.
3. **What + Why + How.** Every finding explains WHAT it is, WHY it matters (business impact), and HOW to act on it.
4. **Max 5 bullets per section.** Longer = reference doc.
5. **Actions have owners and deadlines.** "We should consider" is banned. Who does what by when.
6. **Decisions framed as options.** Not "what do you think?" — "Option A or B, here's the trade-off, here's my recommendation."
7. **The founder decides.** Roles recommend. The founder approves, modifies, or rejects. Every output respects this hierarchy.
8. **Risks are concrete.** Not "there might be risks" — "if X happens, Y breaks, costing $Z."
9. **No jargon without explanation.** If you use a term, explain it on first use.
10. **Silence is an option.** If there's nothing to report, don't fabricate updates.
## Reference
- `references/invocation-patterns.md` — common cross-functional patterns with examples
FILE:references/invocation-patterns.md
# Invocation Patterns — Inter-Agent Protocol
Common cross-functional patterns, with examples and guidance on when to invoke vs assume.
---
## Pattern 1: Revenue Planning Chain
**CRO → CFO → CMO**
Used when: setting revenue targets, allocating growth budget, modeling pipeline scenarios.
```
CRO initiates:
"Planning Q3 revenue targets. Need financial constraints and marketing input."
CRO → [INVOKE:cfo|What's our Q3 revenue target given current runway and burn?]
CFO responds:
[RESPONSE:cfo]
Key finding: To maintain 12-month runway without new raise, Q3 ARR target ≥ $2.4M (+20% QoQ)
Supporting data:
- Current ARR: $2.0M, burn: $180K/month, runway: 14 months
- 20% QoQ growth maintains runway; <15% triggers bridge discussion
- Board expectation communicated as 25% QoQ
Confidence: high
Caveat: Assumes no new hires or major infrastructure spend in Q3.
[/RESPONSE]
CRO → [INVOKE:cmo|Given a $2.4M Q3 ARR target, what pipeline coverage and CAC should we plan for?]
[CHAIN: cro → cfo → cmo]
CMO responds:
[RESPONSE:cmo]
Key finding: Need 3x pipeline ($7.2M) and CAC ≤ $8,400 to hit $2.4M ARR target.
Supporting data:
- Historical close rate: 33% → 3x coverage required
- Current CAC: $9,100 → 8% efficiency improvement needed
- Recommended channel mix: 60% outbound, 30% inbound, 10% partner
Confidence: medium
Caveat: Close rate assumes no sales team changes; any rep turnover drops it to 25%.
[/RESPONSE]
```
**What CRO does with this:** Synthesizes CFO constraint (target) + CMO playbook (how to hit it) into a revenue plan. Does not invoke further (depth limit reached).
---
## Pattern 2: Hiring Decision Chain
**CHRO → CFO → COO**
Used when: headcount planning, role approval, capacity vs budget tradeoffs.
```
CHRO initiates:
"Engineering wants to hire 3 senior engineers. Need budget clearance and capacity validation."
CHRO → [INVOKE:cfo|Can we afford 3 senior engineers in H2? Fully loaded ~$180K each.]
[RESPONSE:cfo]
Key finding: Budget supports 2 hires now; 3rd is contingent on Q2 ARR closing above $2.1M.
Supporting data:
- Current H2 headcount budget: $340K remaining
- 2 hires = $270K fully loaded → $70K buffer
- 3 hires = $405K → $65K over; triggers board approval
Confidence: high
Caveat: Based on current runway projection — fundraise close changes this completely.
[/RESPONSE]
CHRO → [INVOKE:coo|Do operations support onboarding 2-3 engineers in H2? Any capacity constraints?]
[CHAIN: chro → cfo → coo]
[RESPONSE:coo]
Key finding: Can absorb 2 comfortably; 3rd requires dedicated onboarding lead or delays velocity 6 weeks.
Supporting data:
- Current eng team: 8 → adding 2 is 25% growth (manageable)
- Onboarding bottleneck: 1 senior eng mentor per 1-2 new hires, currently at capacity
- Dev environment setup: 3-week lead time per hire
Confidence: medium
Caveat: Assumes no concurrent product launches requiring full team focus.
[/RESPONSE]
CHRO synthesis: Recommends 2 hires now, pipeline 3rd for Q3 contingent on ARR milestone.
```
---
## Pattern 3: Product Launch Chain
**CPO → CTO → CMO**
Used when: planning feature launches, assessing readiness, aligning go-to-market.
```
CPO initiates:
"Planning Q3 launch of [Feature X]. Need technical readiness and GTM input."
CPO → [INVOKE:cto|Is Feature X technically ready for Q3 launch? Key risks?]
[RESPONSE:cto]
Key finding: 70% complete; Q3 launch is achievable with scope cut — 2 items must move to Q4.
Supporting data:
- Core feature: complete; performance optimization: 3 weeks remaining
- Infrastructure scaling: needs load testing at 10x current volume
- Scope cuts needed: advanced analytics dashboard, third-party integrations
Confidence: medium
Caveat: Assumes no critical bugs found in load testing; historical rate suggests 1-2 found.
[/RESPONSE]
CPO → [INVOKE:cmo|Given Q3 launch of Feature X (core only, no analytics/integrations), what's the GTM plan?]
[CHAIN: cpo → cto → cmo]
[RESPONSE:cmo]
Key finding: Core-only launch works for existing customers; new logo acquisition needs integrations — recommend phased announcement.
Supporting data:
- Phase 1: Early access for top 20 accounts → testimonials and case studies
- Phase 2: Full launch with integration partners in Q4
- CAC impact: core-only launch reduces new logo conversion rate by 15%
Confidence: medium
Caveat: Competitor launched similar feature last month — window for differentiation is narrow.
[/RESPONSE]
```
---
## Pattern 4: Crisis Broadcast
**CEO → All Roles**
Used when: major adverse event, fundraise miss, key customer loss, market disruption.
```
CEO broadcasts:
[BROADCAST:all|Scenario: largest customer (28% of ARR) gives 60-day termination notice.
Each role: what's the immediate impact on your domain and your top recommended action?]
Roles respond INDEPENDENTLY (no agent sees others' responses first):
[RESPONSE:cfo]
Key finding: 28% ARR loss drops runway from 14 to 8 months at current burn.
Supporting data:
- Lost ARR: ~$560K → burn multiple increases from 1.8x to 3.2x
- Bridge options: $500K credit line (pre-approved), defer 2 planned hires
- Fundraise timeline must compress: target term sheet within 60 days
Confidence: high
Caveat: Assumes no secondary customer churn triggered by news.
[/RESPONSE]
[RESPONSE:cro]
Key finding: Pipeline must replace $560K ARR within 6 months to avoid emergency measures.
Supporting data:
- Current qualified pipeline: $1.8M → need 31% close rate (currently 28%)
- 3 deals in late stage can cover 70% of gap if closed this quarter
- Churn risk: 2 other accounts had similar usage patterns — immediate EBR required
Confidence: medium
Caveat: Close rate may drop if word spreads; manage communication carefully.
[/RESPONSE]
[RESPONSE:chro]
Key finding: Hiring freeze recommended immediately; retain all current staff.
Supporting data:
- 4 open reqs should pause pending CFO guidance
- At-risk of counter-poaching: 2 senior engineers, 1 product lead
- Retention bonuses for key staff: ~$80K to secure 12-month commitments
Confidence: high
Caveat: Morale impact of freeze can accelerate attrition if not communicated well.
[/RESPONSE]
[RESPONSE:cpo]
Key finding: Accelerate features that differentiate with remaining accounts; reduce scope to core.
Supporting data:
- Lost customer's feature requests: deprioritize (no longer strategic)
- Features most valued by remaining top 5 accounts: [list from account data]
- Reduce Q3 scope by 30% to free engineering capacity for retention work
Confidence: medium
Caveat: Need sales input on which features are blocking new logo deals.
[/RESPONSE]
[RESPONSE:cmo]
Key finding: Control the narrative; proactive communication to market reduces churn cascade risk.
Supporting data:
- Draft customer communication within 48 hours (frame as strategic pivot, not loss)
- Press: no-comment strategy unless asked directly
- Replace pipeline: double down on ICP segments where we're strongest
Confidence: medium
Caveat: If customer goes public with criticism, narrative control becomes much harder.
[/RESPONSE]
CEO synthesis: [Aggregates all role responses, identifies conflicts, sets priorities]
```
---
## When to Invoke vs When to Assume
### Invoke when:
- Cross-functional data is material to the decision
- Getting it wrong changes the recommendation significantly
- The other role has data you genuinely don't have
- Time allows (not in Phase 2 isolation)
### Assume when:
- You're in Phase 2 (always — no exceptions)
- The chain is at depth 2 (you cannot invoke further)
- The answer is directionally obvious (e.g., "CFO will care about runway")
- The precision doesn't change the recommendation
### State assumptions explicitly:
```
[ASSUMPTION: runway ~12 months — not verified with CFO; actual may vary ±20%]
[ASSUMPTION: CAC ~$8K based on industry benchmark — CMO has actual figures]
[ASSUMPTION: engineering capacity at ~70% — not verified with CTO]
```
---
## Handling Conflicting Responses
When two agents give incompatible answers, surface it:
```
[CONFLICT DETECTED]
CFO says: runway extends to 18 months if Q3 targets hit
CRO says: only 45% confidence Q3 targets will be hit
Resolution: use probabilistic blend
- 45% probability: 18-month runway (optimistic case)
- 55% probability: 11-month runway (current trajectory)
Expected value: ~14 months
Recommendation: plan for 12 months, trigger bridge at 10.
[/CONFLICT]
```
**Resolution options:**
1. **Conservative:** Use worse case — appropriate for cash/runway decisions
2. **Probabilistic:** Weight by confidence scores — appropriate for planning
3. **Escalate:** Flag for human decision — appropriate for high-stakes irreversible choices
4. **Time-box:** Gather more data within 48 hours — appropriate when data gap is closeable
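The probabilistic blend in the conflict example is a probability-weighted average, and the conservative option simply takes the worse case. A minimal sketch of both follows; the function names are hypothetical, and "worse" is assumed to mean the smaller value, which holds for runway.

```python
def expected_value(scenarios):
    """Probability-weighted average of (probability, value) pairs."""
    total_probability = sum(p for p, _ in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * value for p, value in scenarios)


def conservative(values):
    """Worse case, assuming a smaller value is worse (true for runway months)."""
    return min(values)


# 45% chance of an 18-month runway, 55% chance of an 11-month runway
runway = expected_value([(0.45, 18), (0.55, 11)])
print(round(runway, 2))          # 14.15, matching the "~14 months" above
print(conservative([18, 11]))    # 11
```

The gap between the blended figure and the conservative one is why the example recommends planning for 12 months rather than 14.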
---
## Anti-Patterns to Avoid
| Anti-pattern | Problem | Fix |
|---|---|---|
| Invoke to validate your own conclusion | Confirmation bias loop | Ask open-ended questions |
| Invoke when assuming works | Unnecessary latency | State assumption clearly |
| Hide conflicts between responses | Bad synthesis | Always surface conflicts |
| Invoke across depth > 2 | Loop risk | State assumption at depth 2 |
| Invoke during Phase 2 | Groupthink contamination | Flag with [ASSUMPTION:] |
| Vague questions | Poor responses | Specific, scoped questions only |
---
name: saas-metrics-coach
description: SaaS financial health advisor. Use when a user shares revenue or customer numbers, or mentions ARR, MRR, churn, LTV, CAC, NRR, or asks how their SaaS business is doing.
license: MIT
metadata:
version: 1.0.0
author: Abbas Mir
category: finance
updated: 2026-03-08
---
# SaaS Metrics Coach
Act as a senior SaaS CFO advisor. Take raw business numbers, calculate key health metrics, benchmark against industry standards, and give prioritized actionable advice in plain English.
## Step 1 — Collect Inputs
If not already provided, ask for these in a single grouped request:
- Revenue: current MRR, MRR last month, expansion MRR, churned MRR
- Customers: total active, new this month, churned this month
- Costs: sales and marketing spend, gross margin %
Work with partial data. Be explicit about what is missing and what assumptions are being made.
## Step 2 — Calculate Metrics
Run `scripts/metrics_calculator.py` with the user's inputs. If the script is unavailable, use the formulas in `references/formulas.md`.
Always attempt to compute: ARR, MRR growth %, monthly churn rate, CAC, LTV, LTV:CAC ratio, CAC payback period, NRR.
**Additional Analysis Tools:**
- Use `scripts/quick_ratio_calculator.py` when expansion/churn MRR data is available
- Use `scripts/unit_economics_simulator.py` for forward-looking projections
## Step 3 — Benchmark Each Metric
Load `references/benchmarks.md`. For each metric show:
- The calculated value
- The relevant benchmark range for the user's segment and stage
- A plain status label: HEALTHY / WATCH / CRITICAL
Match the benchmark tier to the user's market segment (Enterprise / Mid-Market / SMB / PLG) and company stage (Early / Growth / Scale). Ask if unclear.
## Step 4 — Prioritize and Recommend
Identify the top 2-3 metrics at WATCH or CRITICAL status. For each one state:
- What is happening (one sentence, plain English)
- Why it matters to the business
- Two or three specific actions to take this month
Order by impact — address the most damaging problem first.
## Step 5 — Output Format
Always use this exact structure:
```
# SaaS Health Report — [Month Year]
## Metrics at a Glance
| Metric | Your Value | Benchmark | Status |
|--------|------------|-----------|--------|
## Overall Picture
[2-3 sentences, plain English summary]
## Priority Issues
### 1. [Metric Name]
What is happening: ...
Why it matters: ...
Fix it this month: ...
### 2. [Metric Name]
...
## What is Working
[1-2 genuine strengths, no padding]
## 90-Day Focus
[Single metric to move + specific numeric target]
```
## Examples
**Example 1 — Partial data**
Input: "MRR is $80k, we have 200 customers, about 3 cancel each month."
Expected output: Calculates ARPA ($400), monthly churn (1.5%), ARR ($960k), LTV estimate. Flags CAC and growth rate as missing. Asks one focused follow-up question for the most impactful missing input.
**Example 2 — Critical scenario**
Input: "MRR $22k (was $23.5k), 80 customers, lost 9, gained 6, spent $15k on ads, 65% gross margin."
Expected output: Flags negative MoM growth (-6.4%), critical churn (11.25%), and LTV:CAC of 0.64:1 as CRITICAL. Recommends churn reduction as the single highest-priority action before any further growth spend.
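The figures in Example 2 can be reproduced directly from the formulas in `references/formulas.md`. The sketch below is a standalone check, not the code of `scripts/metrics_calculator.py`; as in the example, churn is measured against the current customer count.

```python
mrr, mrr_last = 22_000, 23_500
customers, churned, new_customers = 80, 9, 6
sm_spend, gross_margin = 15_000, 0.65

growth_pct = (mrr - mrr_last) / mrr_last * 100   # MoM MRR growth in percent
churn_rate = churned / customers                  # monthly churn as a decimal
arpa = mrr / customers                            # average revenue per account
cac = sm_spend / new_customers                    # customer acquisition cost
ltv = arpa / churn_rate * gross_margin            # margin-adjusted lifetime value
ltv_cac = ltv / cac

print(f"MoM growth: {growth_pct:.1f}%")      # -6.4%
print(f"Monthly churn: {churn_rate:.2%}")    # 11.25%
print(f"LTV:CAC: {ltv_cac:.2f}:1")           # 0.64:1
```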
## Key Principles
- Be direct. If a metric is bad, say it is bad.
- Explain every metric in one sentence before showing the number.
- Cap priority issues at three. More than three paralyzes action.
- Context changes benchmarks. Five percent monthly churn is CRITICAL for Enterprise SaaS but only WATCH for SMB/PLG. Always confirm the user's target market before scoring.
## Reference Files
- `references/formulas.md` — All metric formulas with worked examples
- `references/benchmarks.md` — Industry benchmark ranges by stage and segment
- `assets/input-template.md` — Blank input form to share with users
- `scripts/metrics_calculator.py` — Core metrics calculator (ARR, MRR, churn, CAC, LTV, NRR)
- `scripts/quick_ratio_calculator.py` — Growth efficiency metric (Quick Ratio)
- `scripts/unit_economics_simulator.py` — 12-month forward projection
## Tools
### 1. Metrics Calculator (`scripts/metrics_calculator.py`)
Core SaaS metrics from raw business numbers.
```bash
# Interactive mode
python scripts/metrics_calculator.py
# CLI mode
python scripts/metrics_calculator.py --mrr 50000 --customers 100 --churned 5 --json
```
### 2. Quick Ratio Calculator (`scripts/quick_ratio_calculator.py`)
Growth efficiency metric: (New MRR + Expansion) / (Churned + Contraction)
```bash
python scripts/quick_ratio_calculator.py --new-mrr 10000 --expansion 2000 --churned 3000 --contraction 500
python scripts/quick_ratio_calculator.py --new-mrr 10000 --expansion 2000 --churned 3000 --json
```
**Benchmarks:**
- < 1.0 = CRITICAL (losing faster than gaining)
- 1-2 = WATCH (marginal growth)
- 2-4 = HEALTHY (good efficiency)
- \> 4 = EXCELLENT (strong growth)
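For reference when the script is unavailable, the ratio and its status bands can be sketched as below. This is an illustration, not the implementation of `scripts/quick_ratio_calculator.py`: the function names are hypothetical, and the handling of exact boundary values (2.0 counted as HEALTHY, 4.0 as HEALTHY) is an assumption, since the bands above overlap at their edges.

```python
def quick_ratio(new_mrr, expansion, churned, contraction=0):
    """(New MRR + Expansion) / (Churned + Contraction)."""
    lost = churned + contraction
    if lost == 0:
        return float("inf")  # nothing lost: the ratio is unbounded
    return (new_mrr + expansion) / lost


def quick_ratio_status(ratio):
    """Map a ratio to the benchmark labels; boundary handling is assumed."""
    if ratio < 1:
        return "CRITICAL"
    if ratio < 2:
        return "WATCH"
    if ratio <= 4:
        return "HEALTHY"
    return "EXCELLENT"


ratio = quick_ratio(10_000, 2_000, 3_000, 500)
print(f"{ratio:.2f} -> {quick_ratio_status(ratio)}")  # 3.43 -> HEALTHY
```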
### 3. Unit Economics Simulator (`scripts/unit_economics_simulator.py`)
Project metrics forward 12 months based on growth/churn assumptions.
```bash
python scripts/unit_economics_simulator.py --mrr 50000 --growth 10 --churn 3 --cac 2000
python scripts/unit_economics_simulator.py --mrr 50000 --growth 10 --churn 3 --cac 2000 --json
```
**Use for:**
- "What if we grow at X% per month?"
- Runway projections
- Scenario planning (best/base/worst case)
## Related Skills
- **financial-analyst**: Use for DCF valuation, budget variance analysis, and traditional financial modeling. NOT for SaaS-specific metrics like CAC, LTV, or churn.
- **business-growth/customer-success**: Use for retention strategies and customer health scoring. Complements this skill when churn is flagged as CRITICAL.
FILE:assets/input-template.md
# SaaS Metrics — Input Template
Fill in what you know and paste to the SaaS Metrics Coach. Leave blanks empty.
---
**Context**
- Target market: [ ] Enterprise [ ] Mid-Market [ ] SMB [ ] Consumer/PLG
- Stage: [ ] Early (<$1M ARR) [ ] Growth ($1M–$10M) [ ] Scale ($10M+)
**Revenue**
- Current MRR: $
- MRR last month: $
- Expansion MRR this month (upsells/upgrades): $
- Churned MRR this month: $
- Contraction MRR (downgrades): $
**Customers**
- Total active customers:
- New customers this month:
- Churned customers this month:
**Costs**
- Sales & Marketing spend this month: $
- Gross margin %:
- Net profit margin % (optional):
---
*Partial data is fine — the coach works with whatever you have.*
FILE:references/benchmarks.md
# SaaS Industry Benchmarks
Industry-standard benchmark ranges for SaaS metrics, segmented by company stage and market segment.
**Sources:**
- OpenView SaaS Benchmarks 2024
- Bessemer Venture Partners Cloud Index
- SaaS Capital Index
- Paddle SaaS Metrics Report 2025
**Last updated:** March 2026
## Stage Definitions
- Early: < $1M ARR
- Growth: $1M–$10M ARR
- Scale: $10M–$50M ARR
- Late: $50M+ ARR
---
## Monthly Churn Rate
| Segment | CRITICAL | WATCH | HEALTHY |
|---|---|---|---|
| Enterprise (ACV > $25k) | > 3% | 1–3% | < 1% |
| Mid-Market ($5k–$25k ACV) | > 5% | 2–5% | < 2% |
| SMB / PLG (< $5k ACV) | > 8% | 4–8% | < 4% |
| Consumer | > 10% | 5–10% | < 5% |
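Because the thresholds differ by segment, the same churn figure can earn different labels. A minimal sketch of the lookup, with hypothetical names and the assumption that a value exactly on a boundary falls into WATCH:

```python
# (healthy_below, critical_above) in percent per month, from the table above
CHURN_THRESHOLDS = {
    "enterprise": (1, 3),
    "mid-market": (2, 5),
    "smb": (4, 8),
    "consumer": (5, 10),
}


def churn_status(segment, monthly_churn_pct):
    """Classify monthly churn for a segment as HEALTHY, WATCH, or CRITICAL."""
    healthy_below, critical_above = CHURN_THRESHOLDS[segment]
    if monthly_churn_pct > critical_above:
        return "CRITICAL"
    if monthly_churn_pct >= healthy_below:
        return "WATCH"
    return "HEALTHY"


print(churn_status("enterprise", 5))  # CRITICAL
print(churn_status("smb", 5))         # WATCH
```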
## LTV:CAC Ratio
| Status | Range |
|---|---|
| CRITICAL | < 1:1 — losing money on every customer |
| POOR | 1:1–2:1 — barely breaking even |
| WATCH | 2:1–3:1 — marginally viable |
| HEALTHY | 3:1–5:1 — industry standard |
| EXCELLENT | 5:1–8:1 — strong unit economics |
| WATCH | > 8:1 — possibly under-investing in growth |
## CAC Payback Period
| Status | Range |
|---|---|
| CRITICAL | > 24 months |
| WATCH | 18–24 months |
| HEALTHY | 12–18 months |
| GOOD | 6–12 months |
| EXCELLENT | < 6 months (PLG indicator) |
## NRR (Net Revenue Retention)
| Status | Range |
|---|---|
| CRITICAL | < 80% — revenue shrinking from existing base |
| POOR | 80–90% |
| WATCH | 90–100% — flat, not expanding |
| HEALTHY | 100–110% |
| EXCELLENT | 110–120% |
| WORLD-CLASS | > 120% (Snowflake / Datadog territory) |
## MoM MRR Growth
| Stage | CRITICAL | WATCH | HEALTHY | EXCELLENT |
|---|---|---|---|---|
| Early (< $1M ARR) | < 5% | 5–10% | 10–20% | > 20% |
| Growth ($1M–$10M) | < 3% | 3–7% | 7–15% | > 15% |
| Scale ($10M+) | < 1% | 1–3% | 3–7% | > 7% |
## Gross Margin
| Status | Range |
|---|---|
| CRITICAL | < 50% |
| WATCH | 50–65% |
| HEALTHY | 65–75% |
| EXCELLENT | 75–85% |
| WORLD-CLASS | > 85% (API / infrastructure businesses) |
## Rule of 40
| Score | Status |
|---|---|
| < 20 | CONCERNING |
| 20–40 | DEVELOPING |
| 40–60 | HEALTHY |
| > 60 | EXCELLENT |
## Quick Reference Card
```
Metric Must Hit Good Great
---------------------------------------------
Monthly Churn < 5% < 3% < 1%
LTV:CAC > 3:1 > 4:1 > 5:1
CAC Payback < 18 mo < 12 mo < 6 mo
NRR > 100% > 110% > 120%
Gross Margin > 65% > 75% > 80%
MoM Growth > 5% > 10% > 15%
```
FILE:references/formulas.md
# SaaS Metric Formulas
Complete reference with worked examples for all metrics calculated by the SaaS Metrics Coach.
## ARR (Annual Recurring Revenue)
```
ARR = MRR × 12
```
**Example:**
- Current MRR: $50,000
- ARR = $50,000 × 12 = **$600,000**
**When to use:** Quick snapshot of annualized revenue run rate. Not the same as actual annual revenue if you have seasonality or one-time fees.
## MoM MRR Growth Rate
```
MoM Growth % = ((MRR_now - MRR_last) / MRR_last) × 100
```
**Example:**
- Current MRR: $50,000
- Last month MRR: $45,000
- Growth = (($50,000 - $45,000) / $45,000) × 100 = **11.1%**
**Interpretation:**
- Negative = losing revenue
- 0-5% = slow growth (concerning for early stage)
- 5-15% = healthy growth
- >15% = strong growth (early stage)
## Monthly Churn Rate
```
Churn % = (Customers lost / Customers at start of month) × 100
```
**Example:**
- Customers at start of month: 100
- Customers lost during month: 5
- Churn = (5 / 100) × 100 = **5%**
**Annualized impact:** 5% monthly = ~46% annual churn (compounding effect)
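The annualized figure comes from compounding monthly retention over twelve months rather than multiplying churn by 12. A short sketch, assuming a constant monthly rate (the function name is illustrative):

```python
def annual_churn(monthly_churn):
    """Share of customers lost over 12 months at a constant monthly churn rate."""
    return 1 - (1 - monthly_churn) ** 12


print(f"{annual_churn(0.05):.1%}")  # 46.0%
```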
**Critical context:** Churn tolerance varies by segment:
- Enterprise: >3% is critical
- SMB: >8% is critical
- Always confirm segment before judging severity
## ARPA (Avg Revenue Per Account)
```
ARPA = MRR / Total active customers
```
## CAC (Customer Acquisition Cost)
```
CAC = Total Sales & Marketing spend / New customers acquired
```
Example: $20k spend / 10 customers → CAC $2,000
## LTV (Customer Lifetime Value)
```
LTV = (ARPA / Monthly Churn Rate) × Gross Margin %
```
**Simplified (no gross margin data):**
```
LTV = ARPA / Monthly Churn Rate
```
**Example:**
- ARPA: $500
- Monthly churn: 5% (0.05)
- Gross margin: 70% (0.70)
- LTV = ($500 / 0.05) × 0.70 = **$7,000**
**Simplified (no margin):** $500 / 0.05 = **$10,000**
**Why it matters:** LTV estimates the total gross profit an average customer generates over their lifetime (total revenue in the simplified form). It must be at least 3x your CAC for sustainable unit economics.
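The worked example can be checked with a small function. This is a sketch of the formula only; the function name is hypothetical, and the $2,000 CAC is borrowed from the CAC example earlier in this file to show the 3x test.

```python
def ltv(arpa, monthly_churn, gross_margin=1.0):
    """LTV = (ARPA / monthly churn) x gross margin; margin defaults to 1.0 (simplified form)."""
    if monthly_churn <= 0:
        raise ValueError("monthly churn must be positive")
    return arpa / monthly_churn * gross_margin


margin_adjusted = ltv(500, 0.05, 0.70)
simplified = ltv(500, 0.05)
print(round(margin_adjusted), round(simplified))          # 7000 10000
print(f"LTV:CAC = {margin_adjusted / 2000:.1f}:1")        # 3.5:1, above the 3x floor
```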
## LTV:CAC Ratio
```
LTV:CAC = LTV / CAC
```
Example: LTV $10k / CAC $2k = 5:1
## CAC Payback Period
```
Payback (months) = CAC / (ARPA × Gross Margin %)
Simplified: Payback = CAC / ARPA
```
Example: CAC $2k / ARPA $500 = 4 months
## NRR (Net Revenue Retention)
```
NRR % = ((MRR_start + Expansion MRR - Churned MRR - Contraction MRR) / MRR_start) × 100
```
Simplified (no expansion data): NRR ≈ (1 - Revenue Churn Rate) × 100
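A short numeric illustration of the full formula follows. The input figures are made up for the example, and `nrr_pct` is a hypothetical name rather than a function from the skill's scripts.

```python
def nrr_pct(mrr_start, expansion=0, churned=0, contraction=0):
    """Net revenue retention in percent for one period."""
    if mrr_start <= 0:
        raise ValueError("starting MRR must be positive")
    return (mrr_start + expansion - churned - contraction) / mrr_start * 100


# Illustrative inputs: $50k starting MRR, $2k expansion, $3k churned, $500 contraction
print(f"{nrr_pct(50_000, 2_000, 3_000, 500):.1f}%")  # 97.0%
```

A result below 100% means the existing customer base is shrinking before any new sales are counted.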
## Rule of 40
```
Score = (MoM Growth % × 12) + Net Profit Margin %
Healthy: ≥ 40
```
FILE:scripts/metrics_calculator.py
#!/usr/bin/env python3
"""
SaaS Metrics Calculator — zero external dependencies (stdlib only).

Usage (interactive): python metrics_calculator.py
Usage (CLI): python metrics_calculator.py --mrr 48000 --customers 160 --json
Usage (import):
    from metrics_calculator import calculate, report
    results = calculate(mrr=48000, mrr_last=42000, customers=160,
                        churned=4, new_customers=22, sm_spend=18000,
                        gross_margin=0.72)
    print(report(results))
"""
import json
import sys


def calculate(
    mrr=None,
    mrr_last=None,
    customers=None,
    churned=None,
    new_customers=None,
    sm_spend=None,
    gross_margin=0.70,
    expansion_mrr=0,
    churned_mrr=0,
    contraction_mrr=0,
    profit_margin=None,
):
    r, missing = {}, []

    # ── Core revenue ─────────────────────────────────────────────────────────
    if mrr is not None:
        r["MRR"] = round(mrr, 2)
        r["ARR"] = round(mrr * 12, 2)
    else:
        missing.append("ARR/MRR — need current MRR")

    if mrr and customers:
        r["ARPA"] = round(mrr / customers, 2)
    else:
        missing.append("ARPA — need MRR + customer count")

    # ── Growth ────────────────────────────────────────────────────────────────
    if mrr and mrr_last and mrr_last > 0:
        r["MoM_Growth_Pct"] = round(((mrr - mrr_last) / mrr_last) * 100, 2)
    else:
        missing.append("MoM Growth — need last month MRR")

    # ── Churn ─────────────────────────────────────────────────────────────────
    if churned is not None and customers:
        r["Churn_Pct"] = round((churned / customers) * 100, 2)
    else:
        missing.append("Churn Rate — need churned + total customers")

    # ── CAC ───────────────────────────────────────────────────────────────────
    if sm_spend and new_customers and new_customers > 0:
        r["CAC"] = round(sm_spend / new_customers, 2)
    else:
        missing.append("CAC — need S&M spend + new customers")

    # ── LTV ───────────────────────────────────────────────────────────────────
    arpa = r.get("ARPA")
    churn_dec = r.get("Churn_Pct", 0) / 100
    if arpa and churn_dec > 0:
        r["LTV"] = round((arpa / churn_dec) * gross_margin, 2)
    else:
        missing.append("LTV — need ARPA and churn rate")

    # ── LTV:CAC ───────────────────────────────────────────────────────────────
    if r.get("LTV") and r.get("CAC") and r["CAC"] > 0:
        r["LTV_CAC"] = round(r["LTV"] / r["CAC"], 2)
    else:
        missing.append("LTV:CAC — need both LTV and CAC")

    # ── Payback ───────────────────────────────────────────────────────────────
    if r.get("CAC") and arpa and arpa > 0:
        r["Payback_Months"] = round(r["CAC"] / (arpa * gross_margin), 1)
    else:
        missing.append("Payback Period — need CAC and ARPA")

    # ── NRR ───────────────────────────────────────────────────────────────────
    if mrr_last and mrr_last > 0 and (expansion_mrr or churned_mrr or contraction_mrr):
        nrr = ((mrr_last + expansion_mrr - churned_mrr - contraction_mrr) / mrr_last) * 100
        r["NRR_Pct"] = round(nrr, 2)
    elif r.get("Churn_Pct"):
        r["NRR_Est_Pct"] = round((1 - r["Churn_Pct"] / 100) * 100, 2)
        missing.append("NRR (accurate) — using churn-only estimate; provide expansion MRR for full NRR")

    # ── Rule of 40 ────────────────────────────────────────────────────────────
    if r.get("MoM_Growth_Pct") and profit_margin is not None:
        r["Rule_of_40"] = round(r["MoM_Growth_Pct"] * 12 + profit_margin, 1)

    r["_missing"] = missing
    r["_gross_margin"] = gross_margin
    return r


def report(r):
    labels = [
        ("MRR", "Monthly Recurring Revenue", "$"),
        ("ARR", "Annual Recurring Revenue", "$"),
        ("ARPA", "Avg Revenue Per Account/mo", "$"),
        ("MoM_Growth_Pct", "MoM MRR Growth", "%"),
        ("Churn_Pct", "Monthly Churn Rate", "%"),
        ("CAC", "Customer Acquisition Cost", "$"),
        ("LTV", "Customer Lifetime Value", "$"),
        ("LTV_CAC", "LTV:CAC Ratio", ":1"),
        ("Payback_Months", "CAC Payback Period", " months"),
        ("NRR_Pct", "NRR (Net Revenue Retention)", "%"),
        ("NRR_Est_Pct", "NRR Estimate (churn-only)", "%"),
        ("Rule_of_40", "Rule of 40 Score", ""),
    ]
    lines = ["=" * 54, " SAAS METRICS CALCULATOR", "=" * 54, ""]
    for key, label, unit in labels:
        val = r.get(key)
        if val is None:
            continue
        if unit == "$":
            fmt = f"${val:,.2f}"
        elif unit == "%":
            fmt = f"{val}%"
        elif unit == ":1":
            fmt = f"{val}:1"
        else:
            fmt = f"{val}{unit}"
        lines.append(f" {label:<40} {fmt}")
    if r.get("_missing"):
        lines += ["", " Missing / estimated:"]
        for m in r["_missing"]:
            lines.append(f" - {m}")
    lines.append("=" * 54)
    return "\n".join(lines)


# ── Interactive mode ──────────────────────────────────────────────────────────
def _ask(prompt, required=False):
    while True:
        v = input(f" {prompt}: ").strip()
        if not v:
            if required:
                print(" Required — please enter a value.")
                continue
            return None
        try:
            return float(v)
        except ValueError:
            print(" Enter a number (e.g. 48000 or 72).")


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="SaaS Metrics Calculator")
    parser.add_argument("--mrr", type=float, help="Current MRR")
    parser.add_argument("--mrr-last", type=float, help="MRR last month")
    parser.add_argument("--customers", type=int, help="Total active customers")
    parser.add_argument("--churned", type=int, help="Customers churned this month")
    parser.add_argument("--new-customers", type=int, help="New customers acquired")
    parser.add_argument("--sm-spend", type=float, help="Sales & Marketing spend")
    parser.add_argument("--gross-margin", type=float, default=70, help="Gross margin %% (default: 70)")
    parser.add_argument("--expansion-mrr", type=float, default=0, help="Expansion MRR")
    parser.add_argument("--churned-mrr", type=float, default=0, help="Churned MRR")
    parser.add_argument("--contraction-mrr", type=float, default=0, help="Contraction MRR")
    parser.add_argument("--profit-margin", type=float, help="Net profit margin %%")
    parser.add_argument("--json", action="store_true", help="Output JSON format")
    args = parser.parse_args()

    # CLI mode
    if args.mrr is not None:
        inputs = {
            "mrr": args.mrr,
            "mrr_last": args.mrr_last,
            "customers": args.customers,
            "churned": args.churned,
            "new_customers": args.new_customers,
            "sm_spend": args.sm_spend,
            # Accept the margin as either a percentage (70) or a decimal (0.70)
            "gross_margin": args.gross_margin / 100 if args.gross_margin > 1 else args.gross_margin,
            "expansion_mrr": args.expansion_mrr,
            "churned_mrr": args.churned_mrr,
            "contraction_mrr": args.contraction_mrr,
            "profit_margin": args.profit_margin,
        }
        result = calculate(**inputs)
        if args.json:
            print(json.dumps(result, indent=2))
        else:
            print("\n" + report(result))
        sys.exit(0)

    # Interactive mode
    print("\nSaaS Metrics Calculator (press Enter to skip)\n")
    gm = _ask("Gross margin % (default 70)", required=False) or 70
    inputs = dict(
        mrr=_ask("Current MRR ($)", required=True),
        mrr_last=_ask("MRR last month ($)"),
        customers=_ask("Total active customers"),
        churned=_ask("Customers churned this month"),
        new_customers=_ask("New customers acquired this month"),
        sm_spend=_ask("Sales & Marketing spend this month ($)"),
        gross_margin=gm / 100 if gm > 1 else gm,
        expansion_mrr=_ask("Expansion MRR (upsells) ($)") or 0,
        churned_mrr=_ask("Churned MRR ($)") or 0,
        contraction_mrr=_ask("Contraction MRR (downgrades) ($)") or 0,
        profit_margin=_ask("Net profit margin % (for Rule of 40, optional)"),
    )
    print("\n" + report(calculate(**inputs)))
FILE:scripts/quick_ratio_calculator.py
#!/usr/bin/env python3
"""
Quick Ratio Calculator - SaaS growth efficiency metric.

Quick Ratio = (New MRR + Expansion MRR) / (Churned MRR + Contraction MRR)

A ratio > 4 indicates healthy, efficient growth.
A ratio < 1 means you're losing revenue faster than gaining it.

Usage:
    python quick_ratio_calculator.py --new-mrr 10000 --expansion 2000 --churned 3000 --contraction 500
    python quick_ratio_calculator.py --new-mrr 10000 --expansion 2000 --churned 3000 --contraction 500 --json
"""
import json
import sys
import argparse


def calculate_quick_ratio(new_mrr, expansion_mrr, churned_mrr, contraction_mrr):
    """
    Calculate Quick Ratio and provide interpretation.

    Args:
        new_mrr: New MRR from new customers
        expansion_mrr: Expansion MRR from existing customers (upsells)
        churned_mrr: MRR lost from churned customers
        contraction_mrr: MRR lost from downgrades

    Returns:
        dict with quick ratio and analysis
    """
    # Calculate components
    growth_mrr = new_mrr + expansion_mrr
    lost_mrr = churned_mrr + contraction_mrr

    # Quick Ratio (undefined when nothing was lost, so report infinity or zero)
    if lost_mrr == 0:
        quick_ratio = float('inf') if growth_mrr > 0 else 0
        quick_ratio_display = "∞" if growth_mrr > 0 else "0"
    else:
        quick_ratio = growth_mrr / lost_mrr
        quick_ratio_display = f"{quick_ratio:.2f}"

    # Status assessment
    if lost_mrr == 0 and growth_mrr > 0:
        status = "EXCELLENT"
        interpretation = "No revenue loss - perfect retention with growth"
    elif quick_ratio >= 4:
        status = "EXCELLENT"
        interpretation = "Strong, efficient growth - gaining revenue 4x faster than losing it"
    elif quick_ratio >= 2:
        status = "HEALTHY"
        interpretation = "Good growth efficiency - gaining revenue 2x+ faster than losing it"
    elif quick_ratio >= 1:
        status = "WATCH"
        interpretation = "Marginal growth - barely gaining more than losing"
    else:
        status = "CRITICAL"
        interpretation = "Losing revenue faster than gaining - growth is unsustainable"

    # Breakdown percentages
    if growth_mrr > 0:
        new_pct = (new_mrr / growth_mrr) * 100
        expansion_pct = (expansion_mrr / growth_mrr) * 100
    else:
        new_pct = expansion_pct = 0
    if lost_mrr > 0:
        churned_pct = (churned_mrr / lost_mrr) * 100
        contraction_pct = (contraction_mrr / lost_mrr) * 100
    else:
        churned_pct = contraction_pct = 0

    results = {
        # Infinity is not valid JSON, so it is reported as None
        "quick_ratio": quick_ratio if quick_ratio != float('inf') else None,
        "quick_ratio_display": quick_ratio_display,
        "status": status,
        "interpretation": interpretation,
        "components": {
            "growth_mrr": round(growth_mrr, 2),
            "lost_mrr": round(lost_mrr, 2),
            "new_mrr": round(new_mrr, 2),
            "expansion_mrr": round(expansion_mrr, 2),
            "churned_mrr": round(churned_mrr, 2),
            "contraction_mrr": round(contraction_mrr, 2),
        },
        "breakdown": {
            "new_mrr_pct": round(new_pct, 1),
            "expansion_mrr_pct": round(expansion_pct, 1),
            "churned_mrr_pct": round(churned_pct, 1),
            "contraction_mrr_pct": round(contraction_pct, 1),
        },
    }
    return results


def format_report(results):
    """Format quick ratio results as human-readable report."""
    lines = []
    lines.append("\n" + "=" * 70)
    lines.append("QUICK RATIO ANALYSIS")
    lines.append("=" * 70)

    # Quick Ratio
    lines.append(f"\n⚡ QUICK RATIO: {results['quick_ratio_display']}")
    lines.append(f" Status: {results['status']}")
    lines.append(f" {results['interpretation']}")

    # Components
    comp = results["components"]
    lines.append("\n📊 COMPONENTS")
    lines.append(f" Growth MRR (New + Expansion): ${comp['growth_mrr']:,.2f}")
    lines.append(f" • New MRR: ${comp['new_mrr']:,.2f}")
    lines.append(f" • Expansion MRR: ${comp['expansion_mrr']:,.2f}")
    lines.append(f" Lost MRR (Churned + Contraction): ${comp['lost_mrr']:,.2f}")
    lines.append(f" • Churned MRR: ${comp['churned_mrr']:,.2f}")
    lines.append(f" • Contraction MRR: ${comp['contraction_mrr']:,.2f}")

    # Breakdown
    bd = results["breakdown"]
    lines.append("\n📈 GROWTH BREAKDOWN")
    lines.append(f" New customers: {bd['new_mrr_pct']:.1f}%")
    lines.append(f" Expansion: {bd['expansion_mrr_pct']:.1f}%")
    lines.append("\n📉 LOSS BREAKDOWN")
    lines.append(f" Churn: {bd['churned_mrr_pct']:.1f}%")
    lines.append(f" Contraction: {bd['contraction_mrr_pct']:.1f}%")

    # Benchmarks
    lines.append("\n🎯 BENCHMARKS")
    lines.append(" < 1.0 = CRITICAL (losing revenue faster than gaining)")
    lines.append(" 1-2 = WATCH (marginal growth)")
    lines.append(" 2-4 = HEALTHY (good growth efficiency)")
    lines.append(" > 4 = EXCELLENT (strong, efficient growth)")
    lines.append("\n" + "=" * 70 + "\n")
    return "\n".join(lines)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Calculate SaaS Quick Ratio (growth efficiency metric)"
    )
    parser.add_argument(
        "--new-mrr", type=float, required=True, help="New MRR from new customers"
    )
    parser.add_argument(
        "--expansion", type=float, default=0, help="Expansion MRR from upsells (default: 0)"
    )
    parser.add_argument(
        "--churned", type=float, required=True, help="Churned MRR from lost customers"
    )
    parser.add_argument(
        "--contraction", type=float, default=0, help="Contraction MRR from downgrades (default: 0)"
    )
    parser.add_argument("--json", action="store_true", help="Output JSON format")
    args = parser.parse_args()

    results = calculate_quick_ratio(
        new_mrr=args.new_mrr,
        expansion_mrr=args.expansion,
        churned_mrr=args.churned,
        contraction_mrr=args.contraction,
    )
    if args.json:
        print(json.dumps(results, indent=2))
    else:
        print(format_report(results))
FILE:scripts/unit_economics_simulator.py
#!/usr/bin/env python3
"""
Unit Economics Simulator - Project SaaS metrics forward 12 months.

Usage:
    python unit_economics_simulator.py --mrr 50000 --growth 10 --churn 3 --cac 2000
    python unit_economics_simulator.py --mrr 50000 --growth 10 --churn 3 --cac 2000 --json
"""
import json
import sys
import argparse


def simulate(
    mrr,
    monthly_growth_pct,
    monthly_churn_pct,
    cac,
    gross_margin=0.70,
    sm_spend_pct=0.30,
    months=12,
):
    """
    Simulate unit economics forward.

    Args:
        mrr: Starting MRR
        monthly_growth_pct: Expected monthly growth rate (%)
        monthly_churn_pct: Expected monthly churn rate (%)
        cac: Customer acquisition cost
        gross_margin: Gross margin (0-1)
        sm_spend_pct: Sales & marketing as % of revenue (0-1)
        months: Number of months to project

    Returns:
        dict with monthly projections and summary
    """
    results = {
        "inputs": {
            "starting_mrr": mrr,
            "monthly_growth_pct": monthly_growth_pct,
            "monthly_churn_pct": monthly_churn_pct,
            "cac": cac,
            "gross_margin": gross_margin,
            "sm_spend_pct": sm_spend_pct,
        },
        "projections": [],
        "summary": {},
    }
    current_mrr = mrr
    cumulative_sm_spend = 0
    cumulative_gross_profit = 0

    for month in range(1, months + 1):
        # Calculate growth and churn
        growth_rate = monthly_growth_pct / 100
        churn_rate = monthly_churn_pct / 100
        # Net growth = growth - churn
        net_growth_rate = growth_rate - churn_rate
        new_mrr = current_mrr * (1 + net_growth_rate)

        # Revenue and costs
        monthly_revenue = current_mrr
        gross_profit = monthly_revenue * gross_margin
        sm_spend = monthly_revenue * sm_spend_pct
        net_profit = gross_profit - sm_spend

        # Accumulate
        cumulative_sm_spend += sm_spend
        cumulative_gross_profit += gross_profit

        # ARR
        arr = current_mrr * 12

        results["projections"].append({
            "month": month,
            "mrr": round(current_mrr, 2),
            "arr": round(arr, 2),
            "monthly_revenue": round(monthly_revenue, 2),
            "gross_profit": round(gross_profit, 2),
            "sm_spend": round(sm_spend, 2),
            "net_profit": round(net_profit, 2),
            "growth_rate_pct": round(net_growth_rate * 100, 2),
        })
        current_mrr = new_mrr

    # Summary
    final_mrr = results["projections"][-1]["mrr"]
    final_arr = results["projections"][-1]["arr"]
    total_revenue = sum(p["monthly_revenue"] for p in results["projections"])
    total_net_profit = sum(p["net_profit"] for p in results["projections"])
    results["summary"] = {
        "starting_mrr": mrr,
        "ending_mrr": round(final_mrr, 2),
        "ending_arr": round(final_arr, 2),
        "mrr_growth_pct": round(((final_mrr - mrr) / mrr) * 100, 2),
        "total_revenue_12m": round(total_revenue, 2),
        "total_gross_profit_12m": round(cumulative_gross_profit, 2),
        "total_sm_spend_12m": round(cumulative_sm_spend, 2),
        "total_net_profit_12m": round(total_net_profit, 2),
        "avg_monthly_growth_pct": round((monthly_growth_pct - monthly_churn_pct), 2),
    }
    return results


def format_report(results):
    """Format simulation results as human-readable report."""
    lines = []
    lines.append("\n" + "=" * 70)
    lines.append("UNIT ECONOMICS SIMULATION - 12 MONTH PROJECTION")
    lines.append("=" * 70)

    # Inputs
    inputs = results["inputs"]
    lines.append("\n📊 INPUTS")
    lines.append(f" Starting MRR: ${inputs['starting_mrr']:,.0f}")
    lines.append(f" Monthly Growth: {inputs['monthly_growth_pct']}%")
    lines.append(f" Monthly Churn: {inputs['monthly_churn_pct']}%")
    lines.append(f" CAC: ${inputs['cac']:,.0f}")
    lines.append(f" Gross Margin: {inputs['gross_margin']*100:.0f}%")
    lines.append(f" S&M Spend: {inputs['sm_spend_pct']*100:.0f}% of revenue")

    # Summary
    summary = results["summary"]
    lines.append("\n📈 12-MONTH SUMMARY")
    lines.append(f" Starting MRR: ${summary['starting_mrr']:,.0f}")
    lines.append(f" Ending MRR: ${summary['ending_mrr']:,.0f}")
    lines.append(f" Ending ARR: ${summary['ending_arr']:,.0f}")
    lines.append(f" MRR Growth: {summary['mrr_growth_pct']:+.1f}%")
    lines.append(f" Total Revenue: ${summary['total_revenue_12m']:,.0f}")
    lines.append(f" Total Gross Profit: ${summary['total_gross_profit_12m']:,.0f}")
    lines.append(f" Total S&M Spend: ${summary['total_sm_spend_12m']:,.0f}")
    lines.append(f" Total Net Profit: ${summary['total_net_profit_12m']:,.0f}")

    # Monthly breakdown (first 3, last 3)
    lines.append("\n📅 MONTHLY PROJECTIONS")
    lines.append(f"{'Month':<8} {'MRR':<12} {'ARR':<12} {'Revenue':<12} {'Net Profit':<12}")
    lines.append("-" * 70)
    projs = results["projections"]
    for p in projs[:3]:
        lines.append(
            f"{p['month']:<8} ${p['mrr']:<11,.0f} ${p['arr']:<11,.0f} "
            f"${p['monthly_revenue']:<11,.0f} ${p['net_profit']:<11,.0f}"
        )
    if len(projs) > 6:
        lines.append(" ...")
    for p in projs[-3:]:
        lines.append(
            f"{p['month']:<8} ${p['mrr']:<11,.0f} ${p['arr']:<11,.0f} "
            f"${p['monthly_revenue']:<11,.0f} ${p['net_profit']:<11,.0f}"
        )
    lines.append("\n" + "=" * 70 + "\n")
    return "\n".join(lines)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Simulate SaaS unit economics over 12 months"
    )
    parser.add_argument("--mrr", type=float, required=True, help="Starting MRR")
    parser.add_argument(
        "--growth", type=float, required=True, help="Monthly growth rate (pct)"
    )
    parser.add_argument(
        "--churn", type=float, required=True, help="Monthly churn rate (pct)"
    )
    parser.add_argument("--cac", type=float, required=True, help="Customer acquisition cost")
    parser.add_argument(
        "--gross-margin", type=float, default=70, help="Gross margin %% (default: 70)"
    )
    parser.add_argument(
        "--sm-spend", type=float, default=30, help="S&M spend as %% of revenue (default: 30)"
    )
    parser.add_argument(
        "--months", type=int, default=12, help="Months to project (default: 12)"
    )
    parser.add_argument("--json", action="store_true", help="Output JSON format")
    args = parser.parse_args()

    results = simulate(
        mrr=args.mrr,
        monthly_growth_pct=args.growth,
        monthly_churn_pct=args.churn,
        cac=args.cac,
        # Accept percentages (70) or decimals (0.70) for both ratios
        gross_margin=args.gross_margin / 100 if args.gross_margin > 1 else args.gross_margin,
        sm_spend_pct=args.sm_spend / 100 if args.sm_spend > 1 else args.sm_spend,
        months=args.months,
    )
    if args.json:
        print(json.dumps(results, indent=2))
    else:
        print(format_report(results))
Contract & Proposal Writer
---
name: "contract-and-proposal-writer"
description: "Contract & Proposal Writer"
---
# Contract & Proposal Writer
**Tier:** POWERFUL
**Category:** Business Growth
**Domain:** Legal Documents, Business Development, Client Relations
---
## Overview
Generate professional, jurisdiction-aware business documents: freelance contracts, project proposals, SOWs, NDAs, and MSAs. Outputs structured Markdown with docx conversion instructions. Covers US (Delaware), EU (GDPR), UK, and DACH (German law) jurisdictions.
**Not a substitute for legal counsel.** Use these templates as strong starting points; review with an attorney for high-value or complex engagements.
---
## Core Capabilities
- Freelance development contracts (fixed-price & hourly)
- Project proposals with timeline/budget breakdown
- Statements of Work (SOW) with deliverables matrix
- NDAs (mutual & one-way)
- Master Service Agreements (MSA)
- Jurisdiction-specific clauses (US/EU/UK/DACH)
- GDPR Data Processing Addenda (EU/DACH)
---
## Key Clauses Reference
| Clause | Options |
|--------|---------|
| Payment terms | Net-30, milestone-based, monthly retainer |
| IP ownership | Work-for-hire (US), assignment (EU/UK), license-back |
| Liability cap | 1x contract value (standard), 3x (high-risk) |
| Termination | For cause (14-day cure), convenience (30/60/90-day notice) |
| Confidentiality | 2-5 year term, perpetual for trade secrets |
| Warranty | "As-is" disclaimer, limited 30/90-day fix warranty |
| Dispute resolution | Arbitration (AAA/ICC), courts (jurisdiction-specific) |
---
## When to Use
- Starting a new client engagement and need a contract fast
- Client asks for a proposal with pricing and timeline
- Partnership or vendor relationship requiring an MSA
- Protecting IP or confidential information with an NDA
- EU/DACH project requiring GDPR-compliant data clauses
---
## Workflow
### 1. Gather Requirements
Ask the user:
1. Document type? (contract / proposal / SOW / NDA / MSA)
2. Jurisdiction? (US-Delaware / EU / UK / DACH)
3. Engagement type? (fixed-price / hourly / retainer)
4. Parties? (names, roles, business addresses)
5. Scope summary? (1-3 sentences)
6. Total value or hourly rate?
7. Start date / end date or duration?
8. Special requirements? (IP assignment, white-label, subcontractors)
### 2. Select Template
| Type | Jurisdiction | Template |
|------|-------------|----------|
| Dev contract fixed | Any | Template A |
| Consulting retainer | Any | Template B |
| SaaS partnership | Any | Template C |
| NDA mutual | US/EU/UK/DACH | NDA-M |
| NDA one-way | US/EU/UK/DACH | NDA-OW |
| SOW | Any | SOW base |
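The table above amounts to a lookup keyed on document type. A minimal sketch follows; the function name `select_template` and the lowercase keys are assumptions made for illustration.

```python
# Document type -> template, mirroring the selection table
TEMPLATES = {
    "dev contract fixed": "Template A",
    "consulting retainer": "Template B",
    "saas partnership": "Template C",
    "nda mutual": "NDA-M",
    "nda one-way": "NDA-OW",
    "sow": "SOW base",
}
# The NDA templates list explicit jurisdictions; the others accept any
NDA_JURISDICTIONS = {"US", "EU", "UK", "DACH"}


def select_template(doc_type, jurisdiction):
    """Return the template name for a document type and jurisdiction."""
    key = doc_type.strip().lower()
    if key not in TEMPLATES:
        raise KeyError(f"No template for document type: {doc_type!r}")
    if key.startswith("nda") and jurisdiction.strip().upper() not in NDA_JURISDICTIONS:
        raise ValueError(f"NDA templates do not cover jurisdiction: {jurisdiction!r}")
    return TEMPLATES[key]


print(select_template("NDA mutual", "DACH"))
print(select_template("Dev contract fixed", "US"))
```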
### 3. Generate & Fill
Fill all [BRACKETED] placeholders. Flag missing data as "REQUIRED".
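One way to automate this step is sketched below. It assumes placeholders are written in capitals (for example `[CLIENT LEGAL NAME]`), so bracketed defaults such as `[30]` and checkbox markers are left untouched. The name `fill_placeholders` and the sample values are hypothetical.

```python
import re

# Matches all-caps placeholders such as [DATE] or [YOUR LEGAL NAME / COMPANY]
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z0-9 /_-]*)\]")


def fill_placeholders(template, values):
    """Substitute known placeholders; mark unknown ones as REQUIRED."""
    missing = []

    def substitute(match):
        name = match.group(1)
        value = values.get(name)
        if value not in (None, ""):
            return str(value)
        missing.append(name)
        return "REQUIRED"

    return PLACEHOLDER.sub(substitute, template), missing


text, missing = fill_placeholders(
    '**Client:** [CLIENT LEGAL NAME], [ADDRESS] ("Client"), cure period [14] days',
    {"CLIENT LEGAL NAME": "Acme GmbH"},
)
print(text)
print("Still needed:", missing)
```

Returning the list of missing names alongside the text makes it easy to ask the user for exactly the fields that are still open.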
### 4. Convert to DOCX
```bash
# Install pandoc
brew install pandoc # macOS
apt install pandoc # Ubuntu
# Basic conversion (page margins, fonts, and styles come from the reference document;
# LaTeX variables such as geometry, documentclass, and fontsize do not affect DOCX output)
pandoc contract.md -o contract.docx \
  --reference-doc=reference.docx
# With numbered sections (legal style)
pandoc contract.md -o contract.docx \
  --number-sections
# With custom company template
pandoc contract.md -o contract.docx \
--reference-doc=company-template.docx
```
---
## Jurisdiction Notes
### US (Delaware)
- Governing law: State of Delaware
- Work-for-hire doctrine applies (Copyright Act, 17 U.S.C. § 101)
- Arbitration: AAA Commercial Rules
- Non-compete: enforceable with reasonable scope/time
### EU (GDPR)
- Must include Data Processing Addendum if handling personal data
- IP assignment requires separate written deed in some member states
- Arbitration: ICC or local chamber
### UK (post-Brexit)
- Governed by English law
- IP: Patents Act 1977 / CDPA 1988
- Arbitration: LCIA Rules
- Data: UK GDPR (post-Brexit equivalent)
### DACH (Germany / Austria / Switzerland)
- BGB (Bürgerliches Gesetzbuch) governs contracts under German law; Austria (ABGB) and Switzerland (OR) have their own civil codes
- Written form requirement for certain clauses (§ 126 BGB)
- IP: Author always retains moral rights; must explicitly transfer Nutzungsrechte
- Non-competes: max 2 years, compensation required (§ 74 HGB)
- Jurisdiction: German courts (Landgericht) or DIS arbitration
- DSGVO (GDPR implementation) mandatory for personal data processing
- Kündigungsfristen: statutory notice periods apply
---
## Template A: Web Dev Fixed-Price Contract
```markdown
# SOFTWARE DEVELOPMENT AGREEMENT
**Effective Date:** [DATE]
**Client:** [CLIENT LEGAL NAME], [ADDRESS] ("Client")
**Developer:** [YOUR LEGAL NAME / COMPANY], [ADDRESS] ("Developer")
---
## 1. SERVICES
Developer agrees to design, develop, and deliver:
**Project:** [PROJECT NAME]
**Description:** [1-3 sentence scope]
**Deliverables:**
- [Deliverable 1] due [DATE]
- [Deliverable 2] due [DATE]
- [Deliverable 3] due [DATE]
## 2. PAYMENT
**Total Fee:** [CURRENCY] [AMOUNT]
| Milestone | Amount | Due |
|-----------|--------|-----|
| Contract signing | 50% | Upon execution |
| Beta delivery | 25% | [DATE] |
| Final acceptance | 25% | Within 5 days of acceptance |
Late payments accrue interest at 1.5% per month.
Client has [10] business days to accept or reject deliverables in writing.
## 3. INTELLECTUAL PROPERTY
Upon receipt of full payment, Developer assigns all right, title, and interest in the
Work Product to Client as a work made for hire (US) / by assignment of future copyright (EU/UK).
Developer retains the right to display Work Product in portfolio unless Client
requests confidentiality in writing within [30] days of delivery.
Pre-existing IP (tools, libraries, frameworks) remains Developer's property.
Developer grants Client a perpetual, royalty-free license to use pre-existing IP
as embedded in the Work Product.
## 4. CONFIDENTIALITY
Each party keeps confidential all non-public information received from the other.
This obligation survives termination for [3] years.
## 5. WARRANTIES
Developer warrants Work Product will substantially conform to specifications for
[90] days post-delivery. Developer will fix material defects at no charge during
this period. EXCEPT AS STATED, WORK PRODUCT IS PROVIDED "AS IS."
## 6. LIABILITY
Developer's total liability shall not exceed total fees paid under this Agreement.
Neither party liable for indirect, incidental, or consequential damages.
## 7. TERMINATION
For Cause: Either party may terminate if the other materially breaches and fails
to cure within [14] days of written notice.
For Convenience: Client may terminate with [30] days written notice and pay for
all work completed plus [10%] of remaining contract value.
## 8. DISPUTE RESOLUTION
US: Binding arbitration under AAA Commercial Rules, [CITY], Delaware law.
EU/DACH: ICC / DIS arbitration, [CITY]. German / English law.
UK: LCIA Rules, London. English law.
## 9. GENERAL
- Entire Agreement: Supersedes all prior discussions.
- Amendments: Must be in writing, signed by both parties.
- Independent Contractor: Developer is not an employee of Client.
---
CLIENT: _________________________ Date: _________
[CLIENT NAME], [TITLE]
DEVELOPER: _________________________ Date: _________
[YOUR NAME], [TITLE]
```
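The payment terms in Template A (50% / 25% / 25% milestones, 1.5% monthly late interest) can be sanity-checked with a short sketch. The helper names are hypothetical, and the interest is assumed to be simple rather than compounding because the template does not say which applies.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
# Milestone shares from the Template A payment table
MILESTONES = [
    ("Contract signing", Decimal("0.50")),
    ("Beta delivery", Decimal("0.25")),
    ("Final acceptance", Decimal("0.25")),
]


def milestone_amounts(total_fee):
    """Split the total fee by milestone, rounding to cents."""
    total = Decimal(str(total_fee))
    amounts = [
        (name, (total * share).quantize(CENT, rounding=ROUND_HALF_UP))
        for name, share in MILESTONES
    ]
    # Put any rounding remainder on the final milestone so the parts sum to the total
    remainder = total - sum(amount for _, amount in amounts)
    name, last = amounts[-1]
    amounts[-1] = (name, last + remainder)
    return amounts


def late_interest(amount, months_late, monthly_rate=Decimal("0.015")):
    """Simple (non-compounding) late interest; compounding is not assumed."""
    interest = Decimal(str(amount)) * monthly_rate * months_late
    return interest.quantize(CENT, rounding=ROUND_HALF_UP)


for name, amount in milestone_amounts(20000):
    print(f"{name:<18} {amount:>12,.2f}")
print("Interest on 5,000 paid 2 months late:", late_interest(5000, 2))
```

`Decimal` is used instead of floats so that invoice amounts round predictably to the cent.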
---
## Template B: Monthly Consulting Retainer
```markdown
# CONSULTING RETAINER AGREEMENT
**Effective Date:** [DATE]
**Client:** [CLIENT LEGAL NAME] ("Client")
**Consultant:** [YOUR NAME / COMPANY] ("Consultant")
---
## 1. SERVICES
Consultant provides [DOMAIN, e.g., "CTO advisory and technical architecture"] services.
**Monthly Hours:** Up to [X] hours/month
**Rollover:** Unused hours [do / do not] roll over (max [X] hours banked)
**Overflow Rate:** [CURRENCY] [RATE]/hr for hours exceeding retainer
## 2. FEES
**Monthly Retainer:** [CURRENCY] [AMOUNT], due on the 1st of each month.
**Payment Method:** Bank transfer / Stripe / SEPA direct debit
**Late Payment:** 2% monthly interest after [10]-day grace period.
## 3. TERM AND TERMINATION
**Initial Term:** [3] months starting [DATE]
**Renewal:** Auto-renews monthly unless either party gives [30] days written notice.
**Immediate termination:** For material breach uncured after [7] days notice.
On termination, Consultant delivers all work in progress within [5] business days.
## 4. INTELLECTUAL PROPERTY
Work product created under this Agreement belongs to [Client / Consultant / jointly].
Advisory output (recommendations, analyses) is Client property upon full payment.
## 5. EXCLUSIVITY
[OPTION A - Non-exclusive:]
This Agreement is non-exclusive. Consultant may work with other clients.
[OPTION B - Partial exclusivity:]
Consultant will not work with direct competitors of Client during the term
and [90] days thereafter.
## 6. CONFIDENTIALITY AND DATA PROTECTION
EU/DACH: If Consultant processes personal data on behalf of Client, the parties
shall execute a Data Processing Agreement (DPA) per Art. 28 GDPR.
## 7. LIABILITY
Consultant's aggregate liability is capped at [3x] the fees paid in the [3] months
preceding the claim.
---
Signatures as above.
```
---
## Template C: SaaS Partnership Agreement
```markdown
# SAAS PARTNERSHIP AGREEMENT
**Effective Date:** [DATE]
**Provider:** [NAME], [ADDRESS]
**Partner:** [NAME], [ADDRESS]
---
## 1. PURPOSE
Provider grants Partner [reseller / referral / white-label / integration] rights to
Provider's [PRODUCT NAME] ("Software") subject to this Agreement.
## 2. PARTNERSHIP TYPE
[ ] Referral: Partner refers customers; earns [X%] of first-year ARR per referral.
[ ] Reseller: Partner resells licenses; earns [X%] discount off list price.
[ ] White-label: Partner rebrands Software; pays [AMOUNT]/month platform fee.
[ ] Integration: Partner integrates Software via API; terms in Exhibit A.
## 3. REVENUE SHARE
| Tier | Monthly ARR Referred | Commission |
|------|---------------------|------------|
| Bronze | < $10,000 | [X]% |
| Silver | $10,000-$50,000 | [X]% |
| Gold | > $50,000 | [X]% |
Payout: Net-30 after month close, minimum $[500] threshold.
## 4. INTELLECTUAL PROPERTY
Each party retains all IP in its own products. No implied licenses.
Partner may use Provider's marks per Provider's Brand Guidelines (Exhibit B).
## 5. DATA AND PRIVACY
Each party is an independent data controller for its own customers.
Joint processing requires a separate DPA (Exhibit C - EU/DACH projects).
## 6. TERM
Initial: [12] months. Renews annually unless [90]-day written notice given.
Termination for Cause: [30]-day cure period for material breach.
## 7. LIMITATION OF LIABILITY
Each party's liability capped at [1x] fees paid/received in prior [12] months.
Mutual indemnification for IP infringement claims from own products.
---
Signatures, exhibits, and governing law per applicable jurisdiction.
```
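The revenue-share table in Template C can be modeled as a tier lookup plus a payout threshold. In the sketch below the commission rates are placeholders invented for illustration, since the template leaves them as `[X]%`. The $50,000 boundary is treated as Silver, following the table's "$10,000-$50,000" range, and the function names are hypothetical.

```python
# Hypothetical commission rates; the template leaves these as [X]%
RATES = {"Bronze": 0.10, "Silver": 0.15, "Gold": 0.20}


def tier_for(referred_revenue):
    """Map referred revenue to a tier using the Template C thresholds."""
    if referred_revenue < 10_000:
        return "Bronze"
    if referred_revenue <= 50_000:
        return "Silver"
    return "Gold"


def payout(referred_revenue, min_payout=500.0):
    """Return (tier, commission, payable); payable is False below the threshold."""
    tier = tier_for(referred_revenue)
    commission = round(referred_revenue * RATES[tier], 2)
    return tier, commission, commission >= min_payout


for revenue in (4_000, 50_000, 60_000):
    print(revenue, payout(revenue))
```

The agreement should also state what happens to commissions below the threshold (for example, whether they carry over to the next period), which this sketch does not decide.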
---
## GDPR Data Processing Addendum (EU/DACH Clause Block)
```markdown
## DATA PROCESSING ADDENDUM (Art. 28 GDPR)
Controller: [CLIENT NAME]
Processor: [CONTRACTOR NAME]
### Subject Matter
Processor processes personal data on behalf of Controller solely to perform services
under the main Agreement.
### Categories of Data Subjects
[e.g., end users, employees, customers]
### Categories of Personal Data
[e.g., names, email addresses, usage data]
### Processing Duration
For the term of the main Agreement; deletion within [30] days of termination.
### Processor Obligations
- Process data only on Controller's documented instructions
- Ensure persons authorized to process have committed to confidentiality
- Implement technical and organizational measures per Art. 32 GDPR
- Assist Controller with data subject rights requests
- Not engage sub-processors without prior written consent
- Delete or return all personal data upon termination
### Sub-processors (current as of Effective Date)
| Sub-processor | Location | Purpose |
|--------------|----------|---------|
| [AWS / GCP / Azure] | [Region] | Cloud hosting |
| [Other] | [Location] | [Purpose] |
### Cross-border Transfers
Data transfers outside EEA covered by: [ ] SCCs [ ] Adequacy Decision [ ] BCRs
```
---
## Common Pitfalls
1. **Missing IP assignment language** - "work for hire" alone is insufficient in EU; need explicit assignment of Nutzungsrechte in DACH
2. **Vague acceptance criteria** - Always define what "accepted" means (written sign-off, X days to reject)
3. **No change order process** - Scope creep kills fixed-price projects; add a clause for out-of-scope work
4. **Jurisdiction mismatch** - Choosing Delaware law for a German-only project creates enforcement problems
5. **Missing limitation of liability** - Without a cap, one bug could mean unlimited damages
6. **Oral amendments** - Contracts modified verbally are hard to enforce; always require written amendments
---
## Best Practices
- Use **milestone payments** over net-30 for projects >$10K - reduces cash flow risk
- For EU/DACH: always check if a DPA is needed (any personal data = yes)
- For DACH: include a **Schriftformklausel** (written form clause) explicitly
- Add a **force majeure** clause for anything over 3 months
- For retainers: define response time SLAs (e.g., 4h urgent / 24h normal)
- Keep templates in version control; track changes with `git diff`
- Review annually - laws change, especially GDPR enforcement interpretations
- For NDAs: always specify the return/destruction of confidential materials on termination
Senior Project Manager for enterprise software, SaaS, and digital transformation projects. Specializes in portfolio management, quantitative risk analysis, r...
---
name: "senior-pm"
description: Senior Project Manager for enterprise software, SaaS, and digital transformation projects. Specializes in portfolio management, quantitative risk analysis, resource optimization, stakeholder alignment, and executive reporting. Uses advanced methodologies including EMV analysis, Monte Carlo simulation, WSJF prioritization, and multi-dimensional health scoring. Use when a user needs help with project plans, project status reports, risk assessments, resource allocation, project roadmaps, milestone tracking, team capacity planning, portfolio health reviews, program management, or executive-level project reporting — especially for enterprise-scale initiatives with multiple workstreams, complex dependencies, or multi-million dollar budgets.
---
# Senior Project Management Expert
## Overview
Strategic project management for enterprise software, SaaS, and digital transformation initiatives. Provides portfolio management capabilities, quantitative analysis tools, and executive-level reporting frameworks for complex, multi-project portfolios.
### Core Expertise Areas
**Portfolio Management & Strategic Alignment**
- Multi-project portfolio optimization using advanced prioritization models (WSJF, RICE, ICE, MoSCoW)
- Strategic roadmap development aligned with business objectives and market conditions
- Resource capacity planning and allocation optimization across portfolio
- Portfolio health monitoring with multi-dimensional scoring frameworks
**Quantitative Risk Management**
- Expected Monetary Value (EMV) analysis for financial risk quantification
- Monte Carlo simulation for schedule risk modeling and confidence intervals
- Risk appetite framework implementation with enterprise-level thresholds
- Portfolio risk correlation analysis and diversification strategies
**Executive Communication & Governance**
- Board-ready executive reports with RAG status and strategic recommendations
- Stakeholder alignment through sophisticated RACI matrices and escalation paths
- Financial performance tracking with risk-adjusted ROI and NPV calculations
- Change management strategies for large-scale digital transformations
## Methodology & Frameworks
### Three-Tier Analysis Approach
**Tier 1: Portfolio Health Assessment**
Uses `project_health_dashboard.py` to provide comprehensive multi-dimensional scoring:
```bash
python3 scripts/project_health_dashboard.py assets/sample_project_data.json
```
**Health Dimensions (Weighted Scoring):**
- **Timeline Performance** (25% weight): Schedule adherence, milestone achievement, critical path analysis
- **Budget Management** (25% weight): Spend variance, forecast accuracy, cost efficiency metrics
- **Scope Delivery** (20% weight): Feature completion rates, requirement satisfaction, change control
- **Quality Metrics** (20% weight): Code coverage, defect density, technical debt, security posture
- **Risk Exposure** (10% weight): Risk score, mitigation effectiveness, exposure trends
**RAG Status Calculation:**
- 🟢 Green: Composite score >80 and every dimension >60
- 🟡 Amber: Composite score 60-80, or any dimension 40-60
- 🔴 Red: Composite score <60, or any dimension <40
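The weighting and RAG rules above can be sketched in a few lines. This is a minimal illustration, not the logic of `project_health_dashboard.py`: the function names, the lower-case dimension keys, and the choice to send boundary values (exactly 60 or 80) to amber are assumptions.

```python
# Dimension weights taken from the "Health Dimensions" list above.
WEIGHTS = {"timeline": 0.25, "budget": 0.25, "scope": 0.20, "quality": 0.20, "risk": 0.10}


def composite_score(dimensions):
    """Weighted sum of the five dimension scores (each on a 0-100 scale)."""
    return sum(dimensions[name] * weight for name, weight in WEIGHTS.items())


def rag_status(dimensions):
    """Apply the RAG rules; values sitting exactly on a boundary fall to amber (assumption)."""
    score = composite_score(dimensions)
    lowest = min(dimensions[name] for name in WEIGHTS)
    if score < 60 or lowest < 40:
        return "red"
    if score > 80 and lowest > 60:
        return "green"
    return "amber"


# Hypothetical project scores for illustration only.
example = {"timeline": 90, "budget": 85, "scope": 80, "quality": 70, "risk": 75}
print(round(composite_score(example), 2), rag_status(example))
```

Checking red first means a single failing dimension overrides an otherwise strong composite, which matches the "or any dimension" wording of the rules.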
**Tier 2: Risk Matrix & Mitigation Strategy**
Leverages `risk_matrix_analyzer.py` for quantitative risk assessment:
```bash
python3 scripts/risk_matrix_analyzer.py assets/sample_project_data.json
```
**Risk Quantification Process:**
1. **Probability Assessment** (1-5 scale): Historical data, expert judgment, Monte Carlo inputs
2. **Impact Analysis** (1-5 scale): Financial, schedule, quality, and strategic impact vectors
3. **Category Weighting**: Technical (1.2x), Resource (1.1x), Financial (1.4x), Schedule (1.0x)
4. **EMV Calculation**:
```python
# EMV and risk-adjusted budget calculation
def calculate_emv(risks):
    category_weights = {"Technical": 1.2, "Resource": 1.1, "Financial": 1.4, "Schedule": 1.0}
    total_emv = 0
    for risk in risks:
        # Weighted risk score: probability and impact on the 1-5 scales above
        score = risk["probability"] * risk["impact"] * category_weights[risk["category"]]
        # EMV expects probability as a 0-1 likelihood; convert a 1-5 rating before this step
        emv = risk["probability"] * risk["financial_impact"]
        total_emv += emv
        risk["score"] = score
    return total_emv


def risk_adjusted_budget(base_budget, portfolio_risk_score, risk_tolerance_factor):
    # The risk premium is applied as a fractional uplift on the base budget
    risk_premium = portfolio_risk_score * risk_tolerance_factor
    return base_budget * (1 + risk_premium)
```
**Risk Response Strategies (by score threshold):**
- **Avoid** (>18): Eliminate through scope/approach changes
- **Mitigate** (12-18): Reduce probability or impact through active intervention
- **Transfer** (8-12): Insurance, contracts, partnerships
- **Accept** (<8): Monitor with contingency planning
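As a rough sketch of how a weighted score maps onto these response bands: the function names are hypothetical, and because the listed ranges share their endpoints, the handling of scores at exactly 8, 12, or 18 is an assumption rather than documented behaviour of `risk_matrix_analyzer.py`.

```python
CATEGORY_WEIGHTS = {"Technical": 1.2, "Resource": 1.1, "Financial": 1.4, "Schedule": 1.0}


def risk_score(probability, impact, category):
    """Probability (1-5) x impact (1-5) x category weight."""
    return probability * impact * CATEGORY_WEIGHTS[category]


def response_strategy(score):
    """Map a weighted score to a response band.

    Assumption: a score of exactly 8 or 12 takes the higher band,
    and exactly 18 stays in Mitigate because Avoid is listed as >18.
    """
    if score > 18:
        return "Avoid"
    if score >= 12:
        return "Mitigate"
    if score >= 8:
        return "Transfer"
    return "Accept"


# Hypothetical financial risk: probability 3, impact 4.
print(response_strategy(risk_score(3, 4, "Financial")))
```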
**Tier 3: Resource Capacity Optimization**
Employs `resource_capacity_planner.py` for portfolio resource analysis:
```bash
python3 scripts/resource_capacity_planner.py assets/sample_project_data.json
```
**Capacity Analysis Framework:**
- **Utilization Optimization**: Target 70-85% for sustainable productivity
- **Skill Matching**: Algorithm-based resource allocation to maximize efficiency
- **Bottleneck Identification**: Critical path resource constraints across portfolio
- **Scenario Planning**: What-if analysis for resource reallocation strategies
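A minimal sketch of utilization banding follows. The 70-85% optimal range comes from the target above, and the 85% and 95% cut-offs mirror the alert wording in the sample output; the band names, the `under_utilized` label, and the boundary handling are assumptions, not the behaviour of `resource_capacity_planner.py`.

```python
def utilization_band(utilization_pct):
    """Classify a single resource by utilization percentage."""
    if utilization_pct > 95:
        return "critical"
    if utilization_pct > 85:
        return "over_utilized"
    if utilization_pct >= 70:
        return "optimal"
    return "under_utilized"


def summarize(team):
    """Count resources per band; `team` maps a resource name to utilization %."""
    counts = {}
    for pct in team.values():
        band = utilization_band(pct)
        counts[band] = counts.get(band, 0) + 1
    return counts


# Hypothetical team for illustration only.
print(summarize({"tech_lead": 100, "backend_dev": 90, "designer": 80, "analyst": 55}))
```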
### Advanced Prioritization Models
Apply each model in the specific context where it provides the most signal:
**Weighted Shortest Job First (WSJF)** — Resource-constrained agile portfolios with quantifiable cost-of-delay
```python
def wsjf(user_value, time_criticality, risk_reduction, job_size):
    return (user_value + time_criticality + risk_reduction) / job_size
```
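To show the formula in use, here is a small ranking sketch. The backlog items and their relative-scale scores are hypothetical, and `rank_by_wsjf` is an illustrative helper rather than part of the skill's scripts.

```python
def wsjf(user_value, time_criticality, risk_reduction, job_size):
    return (user_value + time_criticality + risk_reduction) / job_size


def rank_by_wsjf(items):
    """Return items sorted from highest to lowest WSJF score."""
    return sorted(
        items,
        key=lambda item: wsjf(
            item["user_value"],
            item["time_criticality"],
            item["risk_reduction"],
            item["job_size"],
        ),
        reverse=True,
    )


# Hypothetical backlog scored on a relative scale.
backlog = [
    {"name": "SSO integration", "user_value": 8, "time_criticality": 5, "risk_reduction": 3, "job_size": 8},
    {"name": "Billing revamp", "user_value": 13, "time_criticality": 8, "risk_reduction": 5, "job_size": 20},
    {"name": "Audit logging", "user_value": 5, "time_criticality": 8, "risk_reduction": 8, "job_size": 3},
]

for item in rank_by_wsjf(backlog):
    print(item["name"])
```

Small jobs with a high cost of delay rise to the top, which is the intended effect of dividing by job size.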
**RICE** — Customer-facing initiatives where reach metrics are quantifiable
```python
def rice(reach, impact, confidence_pct, effort_person_months):
    return (reach * impact * (confidence_pct / 100)) / effort_person_months
```
**ICE** — Rapid prioritization during brainstorming or when analysis time is limited
```python
def ice(impact, confidence, ease):
    return (impact + confidence + ease) / 3
```
**Model Selection — Use this decision logic:**
```
if resource_constrained and agile_methodology and cost_of_delay_quantifiable:
    → WSJF
elif customer_facing and reach_metrics_available:
    → RICE
elif quick_prioritization_needed or ideation_phase:
    → ICE
elif multiple_stakeholder_groups_with_differing_priorities:
    → MoSCoW
elif complex_tradeoffs_across_incommensurable_criteria:
    → Multi-Criteria Decision Analysis (MCDA)
```
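The decision logic above translates directly into a runnable function. This is a sketch: the function name, the shortened flag names, and the `None` fallback when no condition matches are assumptions added for illustration.

```python
def select_model(
    resource_constrained=False,
    agile_methodology=False,
    cost_of_delay_quantifiable=False,
    customer_facing=False,
    reach_metrics_available=False,
    quick_prioritization_needed=False,
    ideation_phase=False,
    multiple_stakeholder_groups=False,
    complex_tradeoffs=False,
):
    """Return the first prioritization model whose conditions hold, in the order listed above."""
    if resource_constrained and agile_methodology and cost_of_delay_quantifiable:
        return "WSJF"
    if customer_facing and reach_metrics_available:
        return "RICE"
    if quick_prioritization_needed or ideation_phase:
        return "ICE"
    if multiple_stakeholder_groups:
        return "MoSCoW"
    if complex_tradeoffs:
        return "MCDA"
    return None  # Assumption: no default model is defined when nothing matches


print(select_model(customer_facing=True, reach_metrics_available=True))
```

Because the checks run in order, a portfolio that satisfies several conditions gets the first matching model, so the ordering itself encodes a preference.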
Reference: `references/portfolio-prioritization-models.md`
### Risk Management Framework
Reference: `references/risk-management-framework.md`
**Step 1: Risk Classification by Category**
- Technical: Architecture, integration, performance
- Resource: Availability, skills, retention
- Schedule: Dependencies, critical path, external factors
- Financial: Budget overruns, currency, economic factors
- Business: Market changes, competitive pressure, strategic shifts
**Step 2: Three-Point Estimation for Monte Carlo Inputs**
```python
def three_point_estimate(optimistic, most_likely, pessimistic):
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev
```
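One way these three-point inputs could feed a schedule simulation is sketched below. It is an assumption-heavy illustration: it samples each task from a triangular distribution (via the standard library's `random.triangular`) rather than the beta/PERT distribution implied by the formula above, and it treats tasks as sequential and independent. The function names and task durations are hypothetical.

```python
import random


def simulate_schedule(tasks, runs=10_000, seed=42):
    """Simulate total duration for sequential tasks.

    tasks: list of (optimistic, most_likely, pessimistic) durations.
    Returns the simulated totals in ascending order.
    """
    rng = random.Random(seed)  # Fixed seed keeps the sketch reproducible
    return sorted(
        sum(rng.triangular(low, high, mode) for low, mode, high in tasks)
        for _ in range(runs)
    )


def percentile(sorted_values, pct):
    """Nearest-rank style percentile lookup on pre-sorted values."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]


# Hypothetical task estimates in weeks.
tasks = [(2, 4, 8), (5, 6, 10), (1, 2, 3)]
totals = simulate_schedule(tasks)
for pct in (50, 80, 90):
    print(f"P{pct}: {percentile(totals, pct):.1f} weeks")
```

Reading off P50, P80, and P90 gives the confidence levels typically quoted when committing to a delivery date.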
**Step 3: Portfolio Risk Correlation**
```python
import math


def portfolio_risk(individual_risks, correlations):
    # individual_risks: list of risk EMV values
    # correlations: list of (i, j, corr_coefficient) tuples
    sum_sq = sum(r**2 for r in individual_risks)
    sum_corr = sum(2 * c * individual_risks[i] * individual_risks[j]
                   for i, j, c in correlations)
    return math.sqrt(sum_sq + sum_corr)
```
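A short worked example shows how correlation changes the combined exposure. The EMV figures are hypothetical; the function is repeated so the snippet runs on its own.

```python
import math


def portfolio_risk(individual_risks, correlations):
    # individual_risks: list of risk EMV values
    # correlations: list of (i, j, corr_coefficient) tuples, one per pair
    sum_sq = sum(r**2 for r in individual_risks)
    sum_corr = sum(2 * c * individual_risks[i] * individual_risks[j]
                   for i, j, c in correlations)
    return math.sqrt(sum_sq + sum_corr)


# Two hypothetical risks with EMVs of $30k and $40k.
emvs = [30_000, 40_000]

print(portfolio_risk(emvs, []))             # uncorrelated: 50000.0
print(portfolio_risk(emvs, [(0, 1, 1.0)]))  # perfectly correlated: 70000.0
print(portfolio_risk(emvs, [(0, 1, 0.5)]))  # partial correlation falls in between
```

With no correlation the exposures combine as a root-sum-of-squares; with perfect correlation they simply add, which is why concentrated risk categories deserve attention.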
**Risk Appetite Framework:**
- **Conservative**: Risk scores 0-8, 25-30% contingency reserves
- **Moderate**: Risk scores 8-15, 15-20% contingency reserves
- **Aggressive**: Risk scores 15+, 10-15% contingency reserves
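The reserve percentages above can be turned into a budget range with simple arithmetic. This is a sketch only: the lower-case profile keys and the `contingency_reserve` helper are hypothetical names, and the base budget is an example figure.

```python
# Contingency reserve ranges from the Risk Appetite Framework above.
RESERVE_RANGES = {
    "conservative": (0.25, 0.30),
    "moderate": (0.15, 0.20),
    "aggressive": (0.10, 0.15),
}


def contingency_reserve(base_budget, appetite):
    """Return the (low, high) reserve amounts for a given risk appetite profile."""
    low_pct, high_pct = RESERVE_RANGES[appetite]
    return base_budget * low_pct, base_budget * high_pct


# Hypothetical $2M project under a moderate risk appetite.
low, high = contingency_reserve(2_000_000, "moderate")
print(f"Reserve between ${low:,.0f} and ${high:,.0f}")
```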
## Assets & Templates
### Project Charter Template
Reference: `assets/project_charter_template.md`
**Comprehensive 12-section charter including:**
- Executive summary with strategic alignment
- Success criteria with KPIs and quality gates
- RACI matrix with decision authority levels
- Risk assessment with mitigation strategies
- Budget breakdown with contingency analysis
- Timeline with critical path dependencies
### Executive Report Template
Reference: `assets/executive_report_template.md`
**Board-level portfolio reporting with:**
- RAG status dashboard with trend analysis
- Financial performance vs. strategic objectives
- Risk heat map with mitigation status
- Resource utilization and capacity analysis
- Forward-looking recommendations with ROI projections
### RACI Matrix Template
Reference: `assets/raci_matrix_template.md`
**Enterprise-grade responsibility assignment featuring:**
- Detailed stakeholder roster with decision authority
- Phase-based RACI assignments (initiation through deployment)
- Escalation paths with timeline and authority levels
- Communication protocols and meeting frameworks
- Conflict resolution processes with governance integration
### Sample Portfolio Data
Reference: `assets/sample_project_data.json`
**Realistic multi-project portfolio including:**
- 4 projects across different phases and priorities
- Complete financial data (budgets, actuals, forecasts)
- Resource allocation with utilization metrics
- Risk register with probability/impact scoring
- Quality metrics and stakeholder satisfaction data
- Dependencies and milestone tracking
### Expected Output Examples
Reference: `assets/expected_output.json`
**Demonstrates script capabilities with:**
- Portfolio health scores and RAG status
- Risk matrix visualization and mitigation priorities
- Resource capacity analysis with optimization recommendations
- Integration examples showing how outputs complement each other
## Implementation Workflows
### Portfolio Health Review (Weekly)
1. **Data Collection & Validation**
```bash
python3 scripts/project_health_dashboard.py current_portfolio.json
```
⚠️ If any project composite score <60 or a critical data field is missing, STOP and resolve data integrity issues before proceeding.
2. **Risk Assessment Update**
```bash
python3 scripts/risk_matrix_analyzer.py current_portfolio.json
```
⚠️ If any risk score >18 (Avoid threshold), STOP and initiate escalation to project sponsor before proceeding.
3. **Capacity Analysis**
```bash
python3 scripts/resource_capacity_planner.py current_portfolio.json
```
⚠️ If any team utilization >90% or <60%, flag for immediate reallocation discussion before step 4.
4. **Executive Summary Generation**
- Synthesize outputs into executive report format
- Highlight critical issues and recommendations
- Prepare stakeholder communications
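The three stop conditions in this weekly review could be checked together as follows. This is a hedged sketch: the input dictionaries are a hypothetical simplification and do not reflect the actual output structure of the three scripts.

```python
def weekly_review_gates(project_scores, risk_scores, team_utilization):
    """Collect the flags raised by the weekly review thresholds.

    Each argument maps a name to a number; these shapes are assumptions
    made for illustration.
    """
    flags = []
    for name, score in project_scores.items():
        if score < 60:
            flags.append(f"STOP: {name} composite score {score} is below 60")
    for title, score in risk_scores.items():
        if score > 18:
            flags.append(f"ESCALATE: '{title}' risk score {score} exceeds 18")
    for team, pct in team_utilization.items():
        if pct > 90 or pct < 60:
            flags.append(f"REALLOCATE: {team} utilization {pct}% is outside 60-90%")
    return flags


# Hypothetical weekly snapshot.
flags = weekly_review_gates(
    project_scores={"Project A": 72.5, "Project B": 55.0},
    risk_scores={"Vendor delay": 19.6, "Scope creep": 9.0},
    team_utilization={"Engineering": 93, "Design": 75},
)
for flag in flags:
    print(flag)
```

An empty list means the review can proceed straight to executive summary generation.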
### Monthly Strategic Review
1. **Portfolio Prioritization Review**
- Apply WSJF/RICE/ICE models to evaluate current priorities
- Assess strategic alignment with business objectives
- Identify optimization opportunities
2. **Risk Portfolio Analysis**
- Update risk appetite and tolerance levels
- Review portfolio risk correlation and concentration
- Adjust risk mitigation investments
3. **Resource Optimization Planning**
- Analyze capacity constraints across upcoming quarter
- Plan resource reallocation and hiring strategies
- Identify skill gaps and training needs
4. **Stakeholder Alignment Session**
- Present portfolio health and strategic recommendations
- Gather feedback on prioritization and resource allocation
- Align on upcoming quarter priorities and investments
### Quarterly Portfolio Optimization
1. **Strategic Alignment Assessment**
- Evaluate portfolio contribution to business objectives
- Assess market and competitive position changes
- Update strategic priorities and success criteria
2. **Financial Performance Review**
- Analyze risk-adjusted ROI across portfolio
- Review budget performance and forecast accuracy
- Optimize investment allocation for maximum value
3. **Capability Gap Analysis**
- Identify emerging technology and skill requirements
- Plan capability building investments
- Assess make vs. buy vs. partner decisions
4. **Portfolio Rebalancing**
- Apply three horizons model for innovation balance
- Optimize risk-return profile using efficient frontier
- Plan new initiatives and sunset decisions
## Integration Strategies
### Atlassian Integration
- **Jira**: Portfolio dashboards, cross-project metrics, risk tracking
- **Confluence**: Strategic documentation, executive reports, knowledge management
- Use MCP integrations to automate data collection and report generation
### Financial Systems Integration
- **Budget Tracking**: Real-time spend data for variance analysis
- **Resource Costing**: Hourly rates and utilization for capacity planning
- **ROI Measurement**: Value realization tracking against projections
### Stakeholder Management
- **Executive Dashboards**: Real-time portfolio health visualization
- **Team Scorecards**: Individual project performance metrics
- **Risk Registers**: Collaborative risk management with automated escalation
## Handoff Protocols
### TO Scrum Master
**Context Transfer:**
- Strategic priorities and success criteria
- Resource allocation and team composition
- Risk factors requiring sprint-level attention
- Quality standards and acceptance criteria
**Ongoing Collaboration:**
- Weekly velocity and health metrics review
- Sprint retrospective insights for portfolio learning
- Impediment escalation and resolution support
- Team capacity and utilization feedback
### TO Product Owner
**Strategic Context:**
- Market prioritization and competitive analysis
- User value frameworks and measurement criteria
- Feature prioritization aligned with portfolio objectives
- Resource and timeline constraints
**Decision Support:**
- ROI analysis for feature investments
- Risk assessment for product decisions
- Market intelligence and customer feedback integration
- Strategic roadmap alignment and dependencies
### FROM Executive Team
**Strategic Direction:**
- Business objective updates and priority changes
- Budget allocation and resource approval decisions
- Risk appetite and tolerance level adjustments
- Market strategy and competitive response decisions
**Performance Expectations:**
- Portfolio health and value delivery targets
- Timeline and milestone commitment expectations
- Quality standards and compliance requirements
- Stakeholder satisfaction and communication standards
## Success Metrics & KPIs
Reference: `references/portfolio-kpis.md` for full definitions and measurement guidance.
### Portfolio Performance
- On-time Delivery Rate: >80% within 10% of planned timeline
- Budget Variance: <5% average across portfolio
- Quality Score: >85 composite rating
- Risk Mitigation Coverage: >90% risks with active plans
- Resource Utilization: 75-85% average
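Two of these KPIs can be computed from basic project records, as sketched below. The field names are hypothetical, and two interpretations are assumed: "within 10% of planned timeline" is read as finishing no more than 10% late, and budget variance is taken as the mean absolute deviation from budget.

```python
def on_time_delivery_rate(projects):
    """Share of projects that finished no more than 10% over planned duration."""
    on_time = sum(1 for p in projects if p["actual_days"] <= p["planned_days"] * 1.10)
    return on_time / len(projects)


def average_budget_variance(projects):
    """Mean absolute variance of actual cost against budget, as a fraction."""
    return sum(abs(p["actual_cost"] - p["budget"]) / p["budget"] for p in projects) / len(projects)


# Hypothetical completed projects.
projects = [
    {"planned_days": 100, "actual_days": 105, "budget": 1000, "actual_cost": 1040},
    {"planned_days": 200, "actual_days": 260, "budget": 2000, "actual_cost": 2100},
    {"planned_days": 50, "actual_days": 50, "budget": 500, "actual_cost": 480},
]

print(f"On-time delivery: {on_time_delivery_rate(projects):.0%} (target >80%)")
print(f"Budget variance: {average_budget_variance(projects):.1%} (target <5%)")
```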
### Strategic Value
- ROI Achievement: >90% projects meeting projections within 12 months
- Strategic Alignment: >95% investment aligned with business priorities
- Innovation Balance: 70% operational / 20% growth / 10% transformational
- Stakeholder Satisfaction: >8.5/10 executive average
- Time-to-Value: <6 months average post-completion
### Risk Management
- Risk Exposure: Maintain within approved appetite ranges
- Resolution Time: <30 days (medium), <7 days (high)
- Mitigation Cost Efficiency: <20% of total portfolio risk EMV
- Risk Prediction Accuracy: >70% probability assessment accuracy
## Continuous Improvement Framework
### Portfolio Learning Integration
- Capture lessons learned from completed projects
- Update risk probability assessments based on historical data
- Refine estimation accuracy through retrospective analysis
- Share best practices across project teams
### Methodology Evolution
- Regular review of prioritization model effectiveness
- Update risk frameworks based on industry best practices
- Integrate new tools and technologies for analysis efficiency
- Benchmark against industry portfolio performance standards
### Stakeholder Feedback Integration
- Quarterly stakeholder satisfaction surveys
- Executive interview feedback on decision support quality
- Team feedback on process efficiency and effectiveness
- Customer impact assessment of portfolio decisions
## Related Skills
- **Product Strategist** (`product-team/product-strategist/`) — Product OKRs align with portfolio objectives
- **Scrum Master** (`project-management/scrum-master/`) — Sprint velocity data feeds project health dashboards
FILE:assets/executive_report_template.md
# Executive Portfolio Report Template
**Reporting Period:** [Start Date] - [End Date]
**Report Date:** [Report Generation Date]
**Prepared By:** [Senior Project Manager Name]
**Distribution:** Executive Leadership Team, Board of Directors
---
## Executive Summary & Key Messages
### Portfolio Health at a Glance
- **Overall Portfolio Health:** 🟢 **GREEN** | 🟡 **AMBER** | 🔴 **RED**
- **Total Active Projects:** [Number] projects, $[Total Budget]M investment
- **Projects On-Track:** [Number]% | **At-Risk:** [Number]% | **Critical:** [Number]%
- **This Quarter's Achievements:** [2-3 key wins with business impact]
- **Critical Actions Needed:** [1-2 most urgent executive decisions required]
### Strategic Impact Summary
| Strategic Priority | Progress | Risk Level | Business Value Delivered |
|--------------------|----------|------------|--------------------------|
| [Priority 1] | [%] Complete | 🟢🟡🔴 | $[Value]M / [Key Metric] |
| [Priority 2] | [%] Complete | 🟢🟡🔴 | $[Value]M / [Key Metric] |
| [Priority 3] | [%] Complete | 🟢🟡🔴 | $[Value]M / [Key Metric] |
---
## Portfolio Dashboard & RAG Status
### Current Portfolio Overview
| Project Name | Priority | Status | Budget Health | Timeline | Risk Level | Business Value |
|--------------|----------|---------|---------------|----------|------------|----------------|
| [Project 1] | Critical | 🟢 | 📊 $[X]M / $[Y]M | [X]% | 🟢🟡🔴 | $[Value]M |
| [Project 2] | High | 🟡 | 📊 $[X]M / $[Y]M | [X]% | 🟢🟡🔴 | $[Value]M |
| [Project 3] | Medium | 🔴 | 📊 $[X]M / $[Y]M | [X]% | 🟢🟡🔴 | $[Value]M |
### RAG Status Definitions
- 🟢 **GREEN:** On-track for all success criteria (scope, time, budget, quality)
- 🟡 **AMBER:** Minor deviations, manageable with standard mitigation actions
- 🔴 **RED:** Significant issues requiring immediate executive intervention
### Portfolio Trends (Last 6 Months)
```
🟢 Green Projects: ████████░░ 75% → 80% (↗️ +5%)
🟡 Amber Projects: ████░░░░░░ 20% → 15% (↘️ -5%)
🔴 Red Projects: █░░░░░░░░░ 5% → 5% (→ No Change)
```
---
## Financial Performance
### Budget Performance Summary
| Metric | This Quarter | YTD | Variance | Forecast |
|--------|--------------|-----|----------|----------|
| **Total Portfolio Budget** | $[X]M | $[X]M | $[X]M ([±]%) | $[X]M |
| **Actual Spend** | $[X]M | $[X]M | $[X]M ([±]%) | $[X]M |
| **Committed/Forecast** | $[X]M | $[X]M | - | $[X]M |
| **Available/Reserve** | $[X]M | $[X]M | - | $[X]M |
### Investment by Strategic Category
```
Digital Transformation: ████████████░ 60% ($[X]M)
Operational Excellence: ████████░░░░░ 25% ($[X]M)
Market Expansion: ████░░░░░░░░░ 15% ($[X]M)
```
### ROI & Value Realization
- **Expected Portfolio ROI:** [X]% over [Y] years
- **Value Already Delivered:** $[X]M ([X]% of total expected value)
- **At-Risk Value:** $[X]M (due to delayed/troubled projects)
- **Value Acceleration Opportunities:** $[X]M (with additional investment)
---
## Key Achievements This Period
### Major Milestones Completed
1. **[Project Name] - [Milestone]**
- **Business Impact:** [Quantified benefit - revenue, cost savings, efficiency]
- **Strategic Value:** [How this advances business objectives]
- **Stakeholder Impact:** [Customer, employee, operational improvements]
2. **[Project Name] - [Milestone]**
- **Business Impact:** [Quantified benefit]
- **Strategic Value:** [Strategic advancement]
- **Stakeholder Impact:** [Stakeholder benefits]
### Business Value Delivered
- **Revenue Impact:** $[X]M additional revenue / [X]% growth
- **Cost Reduction:** $[X]M annual savings / [X]% efficiency gain
- **Process Improvements:** [X]% faster processing / [X]% error reduction
- **Customer Impact:** [X]% satisfaction increase / [X]K new customers
- **Employee Impact:** [X]% productivity gain / [X] hours saved per week
---
## Critical Issues & Executive Decisions Needed
### 🔴 RED ALERT - Immediate Action Required
#### Issue 1: [Critical Issue Title]
- **Project:** [Project Name]
- **Business Impact:** [Revenue at risk, customer impact, competitive disadvantage]
- **Root Cause:** [Primary cause - resource, technical, external]
- **Options Available:**
1. [Option 1]: [Cost, timeline, risk implications]
2. [Option 2]: [Cost, timeline, risk implications]
3. [Option 3]: [Cost, timeline, risk implications]
- **Recommended Action:** [Clear recommendation with rationale]
- **Decision Needed By:** [Date]
- **Decision Maker:** [Executive Name/Role]
### 🟡 AMBER - Strategic Decisions Required
#### Issue 2: [Strategic Issue Title]
- **Context:** [Background and strategic importance]
- **Decision Required:** [What needs to be decided and by when]
- **Business Case:** [Financial and strategic implications]
- **Recommendation:** [Proposed path forward]
- **Dependencies:** [What else depends on this decision]
### Resource & Investment Requests
| Request | Project | Justification | Investment Required | Expected ROI | Decision Date |
|---------|---------|---------------|-------------------|--------------|---------------|
| [Request 1] | [Project] | [Business case] | $[Amount] | [ROI/Value] | [Date] |
| [Request 2] | [Project] | [Business case] | $[Amount] | [ROI/Value] | [Date] |
---
## Risk & Opportunity Management
### Top 5 Portfolio Risks
| Risk | Probability | Business Impact | Mitigation Status | Owner | Action Required |
|------|-------------|-----------------|-------------------|-------|-----------------|
| [Risk 1] | [H/M/L] | $[X]M / [Strategic Impact] | 🟢🟡🔴 | [Owner] | [Action by Date] |
| [Risk 2] | [H/M/L] | $[X]M / [Strategic Impact] | 🟢🟡🔴 | [Owner] | [Action by Date] |
### Emerging Opportunities
1. **[Opportunity Title]**
- **Business Potential:** [Revenue potential, strategic advantage]
- **Investment Required:** [Resources, budget, timeline]
- **Decision Timeline:** [When decision needed]
### Risk Appetite & Tolerance
- **Current Portfolio Risk Level:** [High/Medium/Low] vs Target [High/Medium/Low]
- **Risk Concentration:** [Top risk categories and exposure levels]
- **Mitigation Effectiveness:** [% of risks with active mitigation plans]
---
## Resource & Capacity Analysis
### Team Health & Capacity
| Department | Utilization | Critical Resources | Capacity Alerts |
|------------|-------------|-------------------|-----------------|
| Engineering | [X]% | [Number] at >95% | 🟢🟡🔴 |
| Product | [X]% | [Number] at >95% | 🟢🟡🔴 |
| Design | [X]% | [Number] at >95% | 🟢🟡🔴 |
### Resource Conflicts & Bottlenecks
- **Critical Resource Conflicts:** [Specific people/skills in high demand]
- **Skill Gaps:** [Missing capabilities affecting multiple projects]
- **Succession Risks:** [Key person dependencies and mitigation plans]
### Capacity Planning
- **Current Quarter Capacity:** [X]% utilized
- **Next Quarter Outlook:** [Capacity vs demand analysis]
- **Resource Investment Needs:** [Where additional resources needed most]
---
## Market & Competitive Intelligence
### External Factors Impacting Portfolio
- **Market Dynamics:** [Changes affecting project priorities or timelines]
- **Competitive Moves:** [Competitor actions requiring portfolio adjustments]
- **Regulatory Changes:** [Compliance requirements affecting projects]
- **Technology Shifts:** [Emerging technologies creating opportunities/threats]
### Strategic Positioning
- **Competitive Advantage Progress:** [How projects advance market position]
- **Market Entry Status:** [New markets, customer segments being accessed]
- **Innovation Pipeline:** [Next-generation capabilities being developed]
---
## Forward Look & Recommendations
### Next Quarter Priorities
1. **Priority 1:** [Specific focus area with success metrics]
2. **Priority 2:** [Specific focus area with success metrics]
3. **Priority 3:** [Specific focus area with success metrics]
### Strategic Recommendations
1. **[Recommendation 1]**
- **Rationale:** [Why this is important now]
- **Business Impact:** [Expected benefit]
- **Investment Required:** [Resources, budget, timeline]
- **Risk of Delay:** [Consequences of not acting]
2. **[Recommendation 2]**
- [Same format as above]
### Portfolio Optimization Opportunities
- **Resource Reallocation:** [Moving resources between projects for better ROI]
- **Scope Adjustments:** [Projects where scope could be modified for faster value]
- **Timeline Acceleration:** [Projects where additional investment could accelerate delivery]
- **Strategic Pivots:** [Projects that should be redirected based on market changes]
---
## Key Performance Indicators
### Portfolio Health Metrics
| KPI | This Period | Previous Period | YTD | Target | Trend |
|-----|-------------|-----------------|-----|---------|-------|
| **On-Time Delivery %** | [X]% | [X]% | [X]% | [X]% | ↗️↘️→ |
| **Budget Variance %** | [±X]% | [±X]% | [±X]% | <[X]% | ↗️↘️→ |
| **Quality Score** | [X]/10 | [X]/10 | [X]/10 | >[X] | ↗️↘️→ |
| **Stakeholder Satisfaction** | [X]/10 | [X]/10 | [X]/10 | >[X] | ↗️↘️→ |
| **ROI Achievement** | [X]% | [X]% | [X]% | [X]% | ↗️↘️→ |
### Business Impact Metrics
| Metric | Current | Target | Gap | Notes |
|--------|---------|---------|-----|-------|
| **Revenue Impact** | $[X]M | $[X]M | $[X]M | [Commentary] |
| **Cost Savings** | $[X]M | $[X]M | $[X]M | [Commentary] |
| **Process Efficiency** | [X]% | [X]% | [X]% | [Commentary] |
| **Customer Satisfaction** | [X]/10 | [X]/10 | [X] | [Commentary] |
---
## Appendix
### A. Detailed Project Status Reports
[Link to individual project detailed reports]
### B. Financial Deep-Dive
[Detailed budget analysis, variance explanations]
### C. Risk Register
[Complete risk register with full details]
### D. Resource Allocation Matrix
[Detailed resource assignments and utilization]
### E. Stakeholder Feedback Summary
[Key feedback themes from stakeholder surveys/interviews]
---
**Report Prepared By:**
[Senior Project Manager Name]
[Title]
[Email] | [Phone]
**Quality Assurance:**
[PMO Director Name] - Reviewed and Approved
[Date of Approval]
**Next Report Due:** [Date]
**Special Topics Next Period:** [Preview of upcoming focus areas]
---
*This report contains confidential business information. Distribution limited to authorized executives only.*
FILE:assets/expected_output.json
{
"description": "Expected outputs from all three senior-pm scripts when run against sample_project_data.json",
"risk_matrix_analyzer": {
"summary": {
"total_risks": 6,
"active_risks": 5,
"closed_risks": 1,
"critical_risks": 0,
"high_risks": 1,
"total_risk_exposure": 59.2,
"average_risk_score": 11.84,
"overdue_risks": 5
},
"risk_level_distribution": {
"critical": 0,
"high": 1,
"medium": 3,
"low": 1
},
"highest_risk_categories": [
"financial",
"technical",
"resource"
],
"key_recommendations": [
"Focus mitigation efforts on financial risks - highest concentration of risk exposure",
"Address overdue mitigation actions - more than 20% of risks are past their target resolution date"
],
"top_risks": [
{
"title": "Cloud migration budget overrun",
"score": 16.8,
"level": "high",
"category": "financial"
},
{
"title": "Third-party API dependency for mobile banking app",
"score": 14.4,
"level": "medium",
"category": "technical"
},
{
"title": "Key ML engineer departure risk",
"score": 11.0,
"level": "medium",
"category": "resource"
}
]
},
"resource_capacity_planner": {
"summary": {
"total_resources": 6,
"total_projects": 4,
"active_projects": 2,
"overall_utilization": 86.7
},
"utilization_analysis": {
"optimal": 3,
"over_utilized": 2,
"critical": 1
},
"capacity_alerts": [
"CRITICAL: 1 resources are severely over-allocated (>95%)",
"WARNING: 2 resources are over-allocated (85-95%)"
],
"critical_resources": [
{
"name": "Marcus Rodriguez",
"role": "tech lead",
"utilization": 100.0
}
],
"available_capacity": {
"Jennifer Walsh": "20% available (8h/week)",
"Lisa Thompson": "30% available (12h/week)",
"David Kim": "15% available (6h/week)"
},
"key_recommendations": [
"URGENT: Redistribute workload for critically over-allocated resources to prevent burnout",
"Review skill-to-project matching and consider reallocation for better efficiency"
]
},
"project_health_dashboard": {
"portfolio_overview": {
"total_projects": 4,
"active_projects": 3,
"portfolio_average_score": 89.8,
"projects_needing_attention": 0,
"critical_projects": 0
},
"rag_status": {
"green": 3,
"amber": 0,
"red": 0,
"portfolio_grade": "healthy"
},
"dimension_analysis": {
"strongest": "timeline",
"weakest": "quality",
"dimension_scores": {
"timeline": 100.0,
"budget": 100.0,
"scope": 100.0,
"quality": 49.0,
"risk": 100.0
}
},
"project_performance": [
{
"name": "Mobile Banking App v3.0",
"score": 89.8,
"status": "green",
"priority": "high"
},
{
"name": "Cloud Infrastructure Migration",
"score": 89.8,
"status": "green",
"priority": "critical"
},
{
"name": "AI-Powered Analytics Dashboard",
"score": 89.8,
"status": "green",
"priority": "medium"
}
],
"key_recommendations": [
"Focus improvement efforts on quality - weakest portfolio dimension"
]
},
"usage_examples": {
"risk_analysis": {
"command": "python3 scripts/risk_matrix_analyzer.py assets/sample_project_data.json",
"description": "Generates comprehensive risk analysis with probability/impact matrix, category breakdown, and mitigation recommendations"
},
"capacity_planning": {
"command": "python3 scripts/resource_capacity_planner.py assets/sample_project_data.json",
"description": "Analyzes resource utilization across portfolio, identifies capacity constraints and optimization opportunities"
},
"portfolio_health": {
"command": "python3 scripts/project_health_dashboard.py assets/sample_project_data.json",
"description": "Provides executive dashboard view of portfolio health across multiple dimensions with RAG status"
},
"json_output": {
"command": "python3 scripts/[script_name].py assets/sample_project_data.json --format json",
"description": "All scripts support JSON output format for integration with dashboards and reporting tools"
}
}
}
FILE:assets/project_charter_template.md
# Project Charter Template
**Project Name:** [Project Name]
**Project ID:** [Unique Identifier]
**Prepared By:** [Project Manager Name]
**Date:** [Charter Date]
**Version:** [Version Number]
---
## Executive Summary
**One-sentence Project Description:**
[Clear, concise statement of what the project will deliver and its primary value]
**Strategic Alignment:**
- Business Objective: [Link to specific business goal/OKR]
- Strategic Priority: [High/Medium/Low with justification]
- Portfolio Fit: [How this project fits within broader portfolio strategy]
---
## Project Definition
### Project Purpose & Business Case
**Problem Statement:**
[Clear articulation of the business problem or opportunity this project addresses]
**Business Justification:**
- Financial Impact: [ROI, NPV, cost savings, revenue impact]
- Strategic Benefits: [Market position, competitive advantage, capability building]
- Risk of NOT Doing: [Consequences of maintaining status quo]
**Expected Business Value:**
- Quantified Benefits: [Specific metrics and targets]
- Qualitative Benefits: [Brand, customer satisfaction, employee engagement]
- Success Metrics: [How success will be measured]
### Scope Definition
**In Scope:**
- [Specific deliverable 1 with acceptance criteria]
- [Specific deliverable 2 with acceptance criteria]
- [Specific deliverable 3 with acceptance criteria]
**Out of Scope:**
- [Explicitly excluded item 1 - prevents scope creep]
- [Explicitly excluded item 2 - prevents scope creep]
- [Future phases or features deferred]
**Key Deliverables:**
| Deliverable | Description | Acceptance Criteria | Due Date |
|-------------|-------------|-------------------|----------|
| [Name] | [Description] | [Measurable criteria] | [Date] |
| [Name] | [Description] | [Measurable criteria] | [Date] |
---
## Success Criteria
### Primary Success Criteria
1. **[Criterion 1]:** [Specific, measurable outcome with target value]
2. **[Criterion 2]:** [Specific, measurable outcome with target value]
3. **[Criterion 3]:** [Specific, measurable outcome with target value]
### Key Performance Indicators (KPIs)
| KPI | Baseline | Target | Measurement Method | Review Frequency |
|-----|----------|--------|-------------------|------------------|
| [KPI Name] | [Current State] | [Desired State] | [How Measured] | [When Reviewed] |
### Quality Gates
- **Gate 1:** [Milestone] - [Quality criteria that must be met]
- **Gate 2:** [Milestone] - [Quality criteria that must be met]
- **Gate 3:** [Milestone] - [Quality criteria that must be met]
---
## Project Organization & RACI
### Steering Committee
| Role | Name | Responsibilities |
|------|------|-----------------|
| Executive Sponsor | [Name] | Final accountability, funding authority, strategic alignment |
| Business Owner | [Name] | Business requirements, user acceptance, benefits realization |
| Technical Owner | [Name] | Technical architecture, standards compliance, technical risk |
### Core Project Team
| Role | Name | RACI Key | Responsibilities |
|------|------|----------|-----------------|
| Project Manager | [Name] | A | Overall project delivery, timeline, budget, risk management |
| Product Owner | [Name] | R | Requirements definition, backlog prioritization, user stories |
| Technical Lead | [Name] | R | Technical design, code quality, technical decision-making |
| QA Lead | [Name] | R | Test strategy, quality assurance, defect management |
| UI/UX Designer | [Name] | R | User experience design, interface design, usability |
### Extended Stakeholders
| Stakeholder Group | Representative | Interest Level | Influence Level | Communication Needs |
|-------------------|----------------|----------------|-----------------|-------------------|
| [Department/Group] | [Name] | [High/Medium/Low] | [High/Medium/Low] | [Frequency and method] |
### RACI Matrix - Key Decisions
| Decision/Activity | Project Manager | Product Owner | Tech Lead | QA Lead | Sponsor |
|-------------------|-----------------|---------------|-----------|---------|---------|
| Requirements approval | A | R | C | C | I |
| Technical architecture | A | C | R | C | I |
| Go-live decision | A | C | C | C | R |
| Scope changes | A | R | C | C | R |
**RACI Legend:** R=Responsible, A=Accountable, C=Consulted, I=Informed
---
## Timeline & Milestones
### High-Level Timeline
| Phase | Start Date | End Date | Key Deliverables | Dependencies |
|-------|------------|----------|-----------------|--------------|
| Discovery | [Date] | [Date] | Requirements, Architecture | [Dependencies] |
| Development | [Date] | [Date] | Core Features, Testing | [Dependencies] |
| Testing | [Date] | [Date] | QA Sign-off, UAT | [Dependencies] |
| Deployment | [Date] | [Date] | Production Release | [Dependencies] |
### Critical Path Milestones
1. **[Milestone 1]:** [Date] - [Deliverable and significance]
2. **[Milestone 2]:** [Date] - [Deliverable and significance]
3. **[Milestone 3]:** [Date] - [Deliverable and significance]
### Dependencies & Constraints
**External Dependencies:**
- [Dependency 1]: [Description, owner, required date]
- [Dependency 2]: [Description, owner, required date]
**Resource Constraints:**
- [Constraint 1]: [Description and mitigation plan]
- [Constraint 2]: [Description and mitigation plan]
---
## Budget & Resources
### Budget Summary
| Category | Planned Budget | Contingency | Total Authorized |
|----------|----------------|-------------|------------------|
| Personnel | $[Amount] | $[Amount] | $[Amount] |
| Software/Licenses | $[Amount] | $[Amount] | $[Amount] |
| Hardware/Infrastructure | $[Amount] | $[Amount] | $[Amount] |
| External Services | $[Amount] | $[Amount] | $[Amount] |
| **Total** | **$[Total]** | **$[Total]** | **$[Total]** |
### Resource Requirements
| Role | FTE Required | Duration | Skills Required | Availability |
|------|--------------|----------|----------------|--------------|
| [Role] | [FTE] | [Months] | [Key Skills] | [Confirmed/TBD] |
### Funding & Financial Management
- **Funding Source:** [Department/Budget code]
- **Budget Authority:** [Who can approve expenditures]
- **Financial Reporting:** [Frequency and format of budget reports]
- **Change Control:** [Process for budget change requests]
---
## Risk Management
### High-Level Risk Assessment
| Risk Category | Probability | Impact | Risk Score | Mitigation Strategy |
|---------------|-------------|--------|------------|-------------------|
| Technical | [H/M/L] | [H/M/L] | [1-25] | [High-level strategy] |
| Resource | [H/M/L] | [H/M/L] | [1-25] | [High-level strategy] |
| Schedule | [H/M/L] | [H/M/L] | [1-25] | [High-level strategy] |
| Business | [H/M/L] | [H/M/L] | [1-25] | [High-level strategy] |
### Top 5 Project Risks
1. **[Risk Title]:** [Description, impact, probability, mitigation plan]
2. **[Risk Title]:** [Description, impact, probability, mitigation plan]
3. **[Risk Title]:** [Description, impact, probability, mitigation plan]
4. **[Risk Title]:** [Description, impact, probability, mitigation plan]
5. **[Risk Title]:** [Description, impact, probability, mitigation plan]
### Risk Management Process
- **Risk Identification:** [How risks will be identified and by whom]
- **Risk Assessment:** [Methodology for probability/impact scoring]
- **Risk Response:** [Strategies - avoid, mitigate, transfer, accept]
- **Risk Monitoring:** [Review frequency and reporting process]
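The `[1-25]` range in the risk table above is consistent with multiplying a 1-5 probability rating by a 1-5 impact rating. A minimal Python sketch of that scoring follows. The H/M/L-to-number mapping and the response thresholds are illustrative assumptions, not part of this template; replace them with the methodology recorded under **Risk Assessment**.

```python
# Assumed mapping from the template's H/M/L ratings to a 1-5 scale.
LEVELS = {"L": 1, "M": 3, "H": 5}


def risk_score(probability: str, impact: str) -> int:
    """Return probability x impact, which falls in the 1-25 range."""
    return LEVELS[probability] * LEVELS[impact]


def response_hint(score: int) -> str:
    """Suggest a response strategy using hypothetical thresholds."""
    if score >= 15:
        return "avoid or mitigate"
    if score >= 6:
        return "mitigate or transfer"
    return "accept and monitor"


score = risk_score("H", "M")
print(score, response_hint(score))
```

The thresholds only illustrate how a score can map onto the avoid, mitigate, transfer, and accept strategies listed above. Tune them to the organization's risk appetite.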
---
## Communication & Governance
### Communication Plan
| Audience | Information Needs | Format | Frequency | Owner |
|----------|------------------|--------|-----------|-------|
| Executive Sponsors | Status, risks, decisions needed | Dashboard + Meeting | Weekly | PM |
| Steering Committee | Progress, issues, change requests | Report + Meeting | Bi-weekly | PM |
| Project Team | Tasks, blockers, technical updates | Standup + Slack | Daily | Tech Lead |
| Stakeholders | Feature progress, testing needs | Newsletter | Bi-weekly | PO |
### Decision-Making Framework
- **Decision Types:** [Operational, tactical, strategic classifications]
- **Decision Rights:** [Who makes what decisions at what levels]
- **Escalation Path:** [When and how to escalate decisions upward]
- **Decision Log:** [How decisions will be recorded and communicated]
### Change Control Process
1. **Change Request:** [How changes are requested and documented]
2. **Impact Assessment:** [Analysis of scope, time, cost, quality impacts]
3. **Approval Authority:** [Who can approve different types/sizes of changes]
4. **Implementation:** [How approved changes are implemented and communicated]
---
## Quality Management
### Quality Standards & Requirements
- **Technical Standards:** [Coding standards, security requirements, performance criteria]
- **Business Standards:** [Acceptance criteria, usability requirements, accessibility]
- **Process Standards:** [Development methodology, testing approach, documentation]
### Quality Assurance Plan
- **Code Reviews:** [Process, criteria, tools]
- **Testing Strategy:** [Unit, integration, system, user acceptance testing]
- **Quality Gates:** [Go/no-go criteria at each phase]
- **Defect Management:** [Bug tracking, severity classification, resolution process]
---
## Assumptions & Constraints
### Key Assumptions
- [Assumption 1 about resources, technology, or business environment]
- [Assumption 2 about stakeholder availability or external dependencies]
- [Assumption 3 about market conditions or regulatory environment]
### Project Constraints
- **Time Constraints:** [Fixed deadlines, seasonal considerations]
- **Budget Constraints:** [Funding limitations, cost restrictions]
- **Resource Constraints:** [Team size limits, skill availability]
- **Technical Constraints:** [System limitations, technology choices]
- **Regulatory Constraints:** [Compliance requirements, approval processes]
---
## Approval & Sign-off
### Charter Approval
| Role | Name | Signature | Date |
|------|------|-----------|------|
| Executive Sponsor | [Name] | _________________ | [Date] |
| Business Owner | [Name] | _________________ | [Date] |
| Project Manager | [Name] | _________________ | [Date] |
| Technical Owner | [Name] | _________________ | [Date] |
### Project Authorization
By signing this charter, the undersigned acknowledge they have reviewed and approve:
- Project scope, objectives, and success criteria
- Resource allocation and budget authorization
- Timeline and milestone commitments
- Risk acceptance and mitigation strategies
- Communication and governance processes
**Next Steps:**
1. Distribute approved charter to all stakeholders
2. Schedule project kick-off meeting
3. Begin detailed planning and team formation
4. Establish project tracking and reporting mechanisms
---
**Document Control:**
- **Template Version:** 2.1
- **Last Updated:** [Date]
- **Next Review:** [Date]
- **Document Owner:** Project Management Office
FILE:assets/raci_matrix_template.md
# RACI Matrix Template
**Project:** [Project Name]
**Version:** [Version Number]
**Date:** [Creation/Update Date]
**Owner:** [Project Manager Name]
---
## RACI Matrix Legend
| Code | Role | Description |
|------|------|-------------|
| **R** | **Responsible** | The person(s) who actually perform the work to complete the task |
| **A** | **Accountable** | The person who is ultimately answerable for the correct completion of the task |
| **C** | **Consulted** | The person(s) whose opinions are sought and with whom there is two-way communication |
| **I** | **Informed** | The person(s) who are kept up to date on progress, often through one-way communication only |
### RACI Best Practices
- ✅ **One A per activity** - Only one person can be accountable for each task
- ✅ **At least one R per activity** - Someone must be responsible for doing the work
- ✅ **Minimize C's** - Too many consulted stakeholders can slow decision-making
- ✅ **Strategic I's only** - Inform only those who truly need to know
---
## Stakeholder Roster
### Core Project Team
| Name | Role | Department | Contact | Availability |
|------|------|------------|---------|--------------|
| [Name] | Project Manager | PMO | [email] | 100% |
| [Name] | Product Owner | Product | [email] | 75% |
| [Name] | Technical Lead | Engineering | [email] | 90% |
| [Name] | UX Designer | Design | [email] | 50% |
| [Name] | QA Lead | Quality | [email] | 60% |
### Executive Stakeholders
| Name | Role | Department | Contact | Decision Authority |
|------|------|------------|---------|-------------------|
| [Name] | Executive Sponsor | [Department] | [email] | Budget & Strategic Direction |
| [Name] | Business Owner | [Department] | [email] | Requirements & Acceptance |
| [Name] | Technical Owner | [Department] | [email] | Architecture & Standards |
### Extended Stakeholders
| Name | Role | Department | Contact | Interest Level |
|------|------|------------|---------|----------------|
| [Name] | [Role] | [Department] | [email] | High/Medium/Low |
| [Name] | [Role] | [Department] | [email] | High/Medium/Low |
---
## Project Phase RACI Matrices
### Phase 1: Project Initiation & Planning
| Activity | Project Manager | Executive Sponsor | Business Owner | Product Owner | Technical Lead |
|----------|-----------------|-------------------|----------------|---------------|----------------|
| **Business Case Development** | R | A | R | C | C |
| **Project Charter Creation** | R | A | C | C | C |
| **Stakeholder Analysis** | A, R | C | R | C | I |
| **Initial Requirements Gathering** | A | I | R | R | C |
| **High-Level Architecture** | A | I | C | C | R |
| **Resource Planning** | A, R | C | C | C | C |
| **Budget Approval** | R | A | C | I | I |
| **Risk Assessment** | A, R | C | C | C | R |
| **Project Charter Sign-off** | R | A | R | C | C |
### Phase 2: Design & Development Setup
| Activity | Project Manager | Product Owner | Technical Lead | UX Designer | QA Lead |
|----------|-----------------|---------------|----------------|-------------|---------|
| **Requirements Documentation** | A | R | C | C | C |
| **Technical Architecture** | A | C | R | I | C |
| **System Design Documentation** | A | C | R | C | C |
| **UI/UX Design** | A | R | C | R | I |
| **Database Design** | A | I | R | I | C |
| **API Specifications** | A | C | R | I | C |
| **Test Strategy** | A | C | C | I | R |
| **Development Environment Setup** | A | I | R | I | C |
| **CI/CD Pipeline Setup** | A | I | R | I | R |
### Phase 3: Development & Implementation
| Activity | Project Manager | Product Owner | Technical Lead | Dev Team | QA Lead |
|----------|-----------------|---------------|----------------|----------|---------|
| **Sprint Planning** | R | A | R | R | C |
| **User Story Development** | A | R | C | C | C |
| **Code Development** | A | C | R | R | I |
| **Code Reviews** | I | I | A | R | I |
| **Unit Testing** | I | I | A, R | R | C |
| **Integration Testing** | A | C | R | R | R |
| **Feature Testing** | A | R | C | I | R |
| **Bug Triage** | R | A | R | R | R |
| **Sprint Reviews** | A, R | R | R | R | R |
### Phase 4: Testing & Quality Assurance
| Activity | Project Manager | Product Owner | Technical Lead | QA Lead | Business Owner |
|----------|-----------------|---------------|----------------|---------|----------------|
| **Test Plan Creation** | A | C | C | R | C |
| **System Testing** | A | C | C | R | I |
| **Performance Testing** | A | C | R | R | I |
| **Security Testing** | A | I | R | R | I |
| **User Acceptance Testing** | A | R | C | C | R |
| **Bug Resolution** | A | C | R | R | I |
| **Go-Live Readiness** | A | R | R | R | R |
| **Sign-off Documentation** | R | R | C | R | A |
### Phase 5: Deployment & Launch
| Activity | Project Manager | Technical Lead | DevOps | Business Owner | Support Team |
|----------|-----------------|----------------|--------|----------------|--------------|
| **Deployment Planning** | A | R | R | C | C |
| **Production Deployment** | A | R | R | I | I |
| **Smoke Testing** | A | R | C | C | R |
| **Go-Live Communication** | R | C | I | A | I |
| **User Training** | A | C | I | R | C |
| **Support Documentation** | A | C | C | C | R |
| **Monitoring Setup** | A | R | R | I | R |
| **Launch Retrospective** | A, R | R | C | R | C |
---
## Decision-Making RACI
### Strategic Decisions
| Decision Type | Project Manager | Executive Sponsor | Business Owner | Technical Owner |
|---------------|-----------------|-------------------|----------------|-----------------|
| **Budget Changes >10%** | R | A | C | C |
| **Scope Changes (Major)** | R | A | R | C |
| **Timeline Changes >2 weeks** | R | A | R | C |
| **Technology Platform Changes** | R | C | C | A |
| **Resource Reallocation** | R | A | C | C |
| **Go/No-Go Decisions** | R | A | R | R |
### Operational Decisions
| Decision Type | Project Manager | Product Owner | Technical Lead | Team Members |
|---------------|-----------------|---------------|----------------|--------------|
| **Sprint Scope** | C | A | R | R |
| **Technical Implementation** | C | C | A, R | R |
| **Bug Priority** | A | R | C | C |
| **Code Standards** | C | C | A, R | R |
| **Testing Approach** | A | C | R | R |
| **Daily Task Assignment** | I | C | A | R |
---
## Escalation Paths & Conflict Resolution
### Escalation Matrix
| Issue Level | Primary Resolver | Escalation To | Timeline | Authority |
|-------------|------------------|---------------|----------|-----------|
| **Level 1: Task/Technical** | Team Member → Technical Lead | Product Owner | 24 hours | Technical decisions |
| **Level 2: Sprint/Feature** | Technical Lead → Product Owner | Project Manager | 48 hours | Feature scope/priority |
| **Level 3: Project Impact** | Project Manager → Business Owner | Executive Sponsor | 72 hours | Budget/timeline changes |
| **Level 4: Strategic** | Executive Sponsor → Steering Committee | CEO/Board | 1 week | Strategic direction |
### Conflict Resolution Process
1. **Direct Resolution** (Level 1)
   - **Who:** Conflicting parties attempt direct resolution
   - **Timeline:** 24 hours
   - **Documentation:** Brief note in project log
2. **Mediated Resolution** (Level 2)
   - **Who:** Project Manager facilitates discussion
   - **Timeline:** 48 hours from escalation
   - **Documentation:** Decision recorded with rationale
3. **Executive Resolution** (Level 3)
   - **Who:** Executive Sponsor makes binding decision
   - **Timeline:** 72 hours from escalation
   - **Documentation:** Formal decision memo to all stakeholders
4. **Steering Committee** (Level 4)
   - **Who:** Full steering committee vote
   - **Timeline:** Next scheduled meeting (max 1 week)
   - **Documentation:** Board resolution or meeting minutes
### Communication Protocols
- **Escalation Notification:** All RACI stakeholders informed within 4 hours
- **Decision Communication:** Decision communicated to all affected parties within 24 hours
- **Documentation:** All escalations and resolutions logged in project management system
---
## Communication & Meeting RACI
### Regular Meetings
| Meeting Type | Frequency | Project Manager | Team | Stakeholders | Sponsor |
|-------------|-----------|-----------------|------|--------------|---------|
| **Daily Standup** | Daily | A | R | I | I |
| **Sprint Planning** | Bi-weekly | A | R | C | I |
| **Sprint Review** | Bi-weekly | R | R | A | C |
| **Stakeholder Updates** | A, R | C | R | I |
| **Steering Committee** | Monthly | R | I | C | A |
### Communication Artifacts
| Artifact | Creator (R) | Approver (A) | Reviewers (C) | Recipients (I) |
|----------|-------------|-------------|---------------|----------------|
| **Status Reports** | Project Manager | Business Owner | Team Leads | All Stakeholders |
| **Risk Register** | Project Manager | Executive Sponsor | Risk Owners | Steering Committee |
| **Change Requests** | Requestor | Business Owner | Project Manager | Affected Teams |
| **Decision Log** | Project Manager | Decision Maker | Consulted Parties | All Stakeholders |
---
## Risk & Issue Management RACI
### Risk Management
| Activity | Project Manager | Risk Owner | Executive Sponsor | Team |
|----------|-----------------|------------|-------------------|------|
| **Risk Identification** | A | R | C | R |
| **Risk Assessment** | A | R | C | C |
| **Mitigation Planning** | A | R | C | R |
| **Risk Monitoring** | A | R | I | C |
| **Risk Escalation** | R | R | A | I |
### Issue Resolution
| Issue Severity | Reporter (R) | Owner (A) | Resolver (R) | Informed (I) |
|----------------|-------------|-----------|-------------|-------------|
| **Critical** | Anyone | Project Manager | Technical Lead | Executive Sponsor |
| **High** | Team/Stakeholder | Technical Lead | Team Member | Project Manager |
| **Medium** | Team Member | Team Lead | Team Member | Project Manager |
| **Low** | Team Member | Team Member | Team Member | Team Lead |
---
## RACI Validation & Maintenance
### Validation Checklist
- [ ] Every activity has exactly one "A" (Accountable)
- [ ] Every activity has at least one "R" (Responsible)
- [ ] "C" (Consulted) roles are minimized to essential stakeholders
- [ ] "I" (Informed) includes only those who truly need updates
- [ ] No person is assigned "A" for more tasks than they can handle
- [ ] Escalation paths are clear and realistic
- [ ] Decision rights match organizational authority
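The first two checklist items can be checked mechanically. The Python sketch below is one possible approach, assuming the matrix has been loaded into a dictionary that maps each activity to its role-to-codes cells. The function name `validate_raci` and the sample rows are hypothetical.

```python
def validate_raci(matrix: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems: activities without exactly one A or without any R.

    Each cell is a string such as "R", "A, R", or "C", matching the
    cell format used in the RACI tables of this template.
    """
    problems = []
    for activity, cells in matrix.items():
        # Split combined cells like "A, R" into individual codes.
        codes = [code.strip() for cell in cells.values() for code in cell.split(",")]
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(f"{activity}: expected exactly one A, found {accountable}")
        if "R" not in codes:
            problems.append(f"{activity}: no R assigned")
    return problems


# Hypothetical rows for illustration only.
sample = {
    "Budget Approval": {"Project Manager": "R", "Executive Sponsor": "A"},
    "Vendor Selection": {"Project Manager": "A", "Executive Sponsor": "A"},
}
for problem in validate_raci(sample):
    print(problem)
```

Running a check like this after every matrix update catches duplicate or missing accountability before the matrix is redistributed. The remaining checklist items still need human judgment.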
### Review & Update Process
- **Review Frequency:** Every project phase or monthly
- **Update Triggers:** Team changes, scope changes, organizational changes
- **Approval Process:** Changes require Project Manager and Executive Sponsor approval
- **Communication:** RACI updates communicated to all stakeholders within 48 hours
### RACI Health Metrics
| Metric | Target | Current | Notes |
|--------|---------|---------|-------|
| **Decision Speed** | <48 hours | [X] hours | Average time for routine decisions |
| **Escalation Rate** | <10% | [X]% | Percentage of issues requiring escalation |
| **Role Clarity** | >90% | [X]% | Stakeholder survey on role understanding |
| **Conflict Resolution** | <72 hours | [X] hours | Average resolution time |
---
**Document Control:**
- **Version:** [Version Number]
- **Last Updated:** [Date]
- **Next Review:** [Date]
- **Approved By:** [Executive Sponsor Name]
**Distribution List:**
- All Project Stakeholders (as identified in roster)
- PMO (for template compliance)
- HR (for role clarity and performance management)
FILE:assets/sample_project_data.json
{
"portfolio_metadata": {
"organization": "TechCorp Inc.",
"reporting_period": "2025-Q1",
"generated_on": "2025-02-15",
"total_projects": 4,
"total_budget": 2800000,
"fte_count": 32
},
"projects": [
{
"id": "PROJ001",
"name": "Mobile Banking App v3.0",
"status": "in_progress",
"priority": "high",
"start_date": "2024-10-01",
"planned_end_date": "2025-06-30",
"actual_end_date": null,
"budget": {
"planned": 850000,
"spent": 425000,
"remaining": 425000,
"variance_percentage": 0.0
},
"timeline": {
"total_sprints": 18,
"completed_sprints": 9,
"progress_percentage": 50.0,
"days_behind_schedule": 5,
"critical_path_delay": false
},
"team": {
"size": 12,
"roles": {
"product_manager": 1,
"tech_lead": 1,
"senior_developer": 3,
"developer": 4,
"qa_engineer": 2,
"ui_ux_designer": 1
}
},
"quality_metrics": {
"code_coverage": 85.2,
"test_pass_rate": 94.7,
"defect_density": 0.8,
"technical_debt_hours": 120,
"security_vulnerabilities": 2
},
"stakeholder_satisfaction": 8.5,
"scope_change_count": 3,
"dependencies": ["PROJ002", "PROJ004"],
"key_milestones": [
{
"name": "MVP Release",
"planned_date": "2025-03-15",
"status": "at_risk",
"completion_percentage": 75
},
{
"name": "Beta Testing",
"planned_date": "2025-05-01",
"status": "on_track",
"completion_percentage": 0
}
]
},
{
"id": "PROJ002",
"name": "Cloud Infrastructure Migration",
"status": "in_progress",
"priority": "critical",
"start_date": "2024-08-15",
"planned_end_date": "2025-04-30",
"actual_end_date": null,
"budget": {
"planned": 650000,
"spent": 520000,
"remaining": 130000,
"variance_percentage": -20.0
},
"timeline": {
"total_sprints": 16,
"completed_sprints": 12,
"progress_percentage": 75.0,
"days_behind_schedule": 0,
"critical_path_delay": false
},
"team": {
"size": 8,
"roles": {
"solution_architect": 1,
"devops_engineer": 3,
"senior_developer": 2,
"security_specialist": 1,
"project_manager": 1
}
},
"quality_metrics": {
"code_coverage": 78.9,
"test_pass_rate": 98.2,
"defect_density": 0.3,
"technical_debt_hours": 45,
"security_vulnerabilities": 0
},
"stakeholder_satisfaction": 9.2,
"scope_change_count": 1,
"dependencies": [],
"key_milestones": [
{
"name": "Phase 1: Core Services Migration",
"planned_date": "2025-01-31",
"status": "completed",
"completion_percentage": 100
},
{
"name": "Phase 2: Database Migration",
"planned_date": "2025-03-15",
"status": "on_track",
"completion_percentage": 80
}
]
},
{
"id": "PROJ003",
"name": "AI-Powered Analytics Dashboard",
"status": "planning",
"priority": "medium",
"start_date": "2025-03-01",
"planned_end_date": "2025-10-31",
"actual_end_date": null,
"budget": {
"planned": 450000,
"spent": 25000,
"remaining": 425000,
"variance_percentage": 0.0
},
"timeline": {
"total_sprints": 16,
"completed_sprints": 0,
"progress_percentage": 5.0,
"days_behind_schedule": 0,
"critical_path_delay": false
},
"team": {
"size": 6,
"roles": {
"product_manager": 1,
"ml_engineer": 2,
"data_scientist": 1,
"frontend_developer": 2
}
},
"quality_metrics": {
"code_coverage": 0.0,
"test_pass_rate": 0.0,
"defect_density": 0.0,
"technical_debt_hours": 0,
"security_vulnerabilities": 0
},
"stakeholder_satisfaction": 7.8,
"scope_change_count": 0,
"dependencies": ["PROJ002"],
"key_milestones": [
{
"name": "Data Pipeline Setup",
"planned_date": "2025-04-30",
"status": "not_started",
"completion_percentage": 0
},
{
"name": "ML Model Training",
"planned_date": "2025-07-15",
"status": "not_started",
"completion_percentage": 0
}
]
},
{
"id": "PROJ004",
"name": "Customer Portal Redesign",
"status": "completed",
"priority": "high",
"start_date": "2024-05-01",
"planned_end_date": "2024-12-15",
"actual_end_date": "2024-12-22",
"budget": {
"planned": 320000,
"spent": 340000,
"remaining": 0,
"variance_percentage": 6.25
},
"timeline": {
"total_sprints": 14,
"completed_sprints": 14,
"progress_percentage": 100.0,
"days_behind_schedule": 7,
"critical_path_delay": true
},
"team": {
"size": 6,
"roles": {
"product_manager": 1,
"ui_ux_designer": 2,
"frontend_developer": 2,
"qa_engineer": 1
}
},
"quality_metrics": {
"code_coverage": 92.4,
"test_pass_rate": 99.1,
"defect_density": 0.2,
"technical_debt_hours": 18,
"security_vulnerabilities": 0
},
"stakeholder_satisfaction": 9.5,
"scope_change_count": 2,
"dependencies": [],
"key_milestones": [
{
"name": "Design System Implementation",
"planned_date": "2024-08-30",
"status": "completed",
"completion_percentage": 100
},
{
"name": "User Acceptance Testing",
"planned_date": "2024-11-30",
"status": "completed",
"completion_percentage": 100
}
]
}
],
"resources": [
{
"id": "RES001",
"name": "Sarah Chen",
"role": "Senior Product Manager",
"department": "Product",
"hourly_rate": 120,
"available_hours": 40,
"current_utilization": 0.9,
"skills": ["product_strategy", "stakeholder_management", "agile"],
"current_projects": ["PROJ001", "PROJ003"],
"capacity_notes": "Available for strategic initiatives"
},
{
"id": "RES002",
"name": "Marcus Rodriguez",
"role": "Tech Lead",
"department": "Engineering",
"hourly_rate": 110,
"available_hours": 40,
"current_utilization": 1.0,
"skills": ["system_architecture", "team_leadership", "java", "microservices"],
"current_projects": ["PROJ001"],
"capacity_notes": "At full capacity, consider load balancing"
},
{
"id": "RES003",
"name": "Jennifer Walsh",
"role": "DevOps Engineer",
"department": "Engineering",
"hourly_rate": 105,
"available_hours": 40,
"current_utilization": 0.8,
"skills": ["aws", "kubernetes", "terraform", "ci_cd"],
"current_projects": ["PROJ002"],
"capacity_notes": "Can take on additional infrastructure work"
},
{
"id": "RES004",
"name": "David Kim",
"role": "Senior Developer",
"department": "Engineering",
"hourly_rate": 95,
"available_hours": 40,
"current_utilization": 0.85,
"skills": ["react", "node_js", "typescript", "aws"],
"current_projects": ["PROJ001", "PROJ004"],
"capacity_notes": "Strong full-stack capabilities"
},
{
"id": "RES005",
"name": "Lisa Thompson",
"role": "ML Engineer",
"department": "Data Science",
"hourly_rate": 115,
"available_hours": 40,
"current_utilization": 0.7,
"skills": ["python", "tensorflow", "data_pipelines", "mlops"],
"current_projects": ["PROJ003"],
"capacity_notes": "Available for additional ML initiatives"
},
{
"id": "RES006",
"name": "Ahmed Hassan",
"role": "Solution Architect",
"department": "Engineering",
"hourly_rate": 125,
"available_hours": 40,
"current_utilization": 0.95,
"skills": ["enterprise_architecture", "cloud_strategy", "security"],
"current_projects": ["PROJ002"],
"capacity_notes": "Critical resource for architectural decisions"
}
],
"risks": [
{
"id": "RISK001",
"title": "Third-party API dependency for mobile banking app",
"description": "Banking app relies on external payment processor API that has had recent stability issues",
"category": "technical",
"probability": 3,
"impact": 4,
"status": "open",
"owner": "Marcus Rodriguez",
"project_id": "PROJ001",
"created_date": "2024-11-15",
"target_resolution": "2025-03-01",
"mitigation_actions": [
"Implement fallback payment processor integration",
"Add circuit breaker pattern for API calls",
"Negotiate SLA improvements with vendor"
],
"impact_areas": ["schedule", "quality", "customer_satisfaction"],
"severity": "high"
},
{
"id": "RISK002",
"title": "Cloud migration budget overrun",
"description": "Migration costs exceeding budget due to unexpected data transfer fees and extended downtime windows",
"category": "financial",
"probability": 4,
"impact": 3,
"status": "open",
"owner": "Jennifer Walsh",
"project_id": "PROJ002",
"created_date": "2024-12-01",
"target_resolution": "2025-02-28",
"mitigation_actions": [
"Implement incremental data migration strategy",
"Negotiate volume discounts with cloud provider",
"Optimize data transfer timing for cost efficiency"
],
"impact_areas": ["budget", "timeline"],
"severity": "high"
},
{
"id": "RISK003",
"title": "Key ML engineer departure risk",
"description": "Primary ML engineer considering external opportunity, critical for AI dashboard project",
"category": "resource",
"probability": 2,
"impact": 5,
"status": "open",
"owner": "Sarah Chen",
"project_id": "PROJ003",
"created_date": "2025-01-10",
"target_resolution": "2025-03-31",
"mitigation_actions": [
"Conduct retention conversation and career planning",
"Cross-train additional team members on ML pipeline",
"Identify external consultant as backup resource"
],
"impact_areas": ["timeline", "quality", "team_morale"],
"severity": "critical"
},
{
"id": "RISK004",
"title": "Regulatory compliance requirements for banking app",
"description": "New financial regulations may require additional security features and audit trails",
"category": "compliance",
"probability": 3,
"impact": 3,
"status": "open",
"owner": "Ahmed Hassan",
"project_id": "PROJ001",
"created_date": "2024-12-15",
"target_resolution": "2025-04-30",
"mitigation_actions": [
"Engage legal and compliance teams early",
"Build regulatory requirements into technical design",
"Plan for additional security audit phase"
],
"impact_areas": ["timeline", "scope", "budget"],
"severity": "medium"
},
{
"id": "RISK005",
"title": "Integration complexity with legacy systems",
"description": "Cloud migration may face unexpected integration challenges with legacy on-premise systems",
"category": "technical",
"probability": 2,
"impact": 2,
"status": "mitigated",
"owner": "Ahmed Hassan",
"project_id": "PROJ002",
"created_date": "2024-09-01",
"target_resolution": "2024-12-31",
"mitigation_actions": [
"Complete comprehensive system mapping and API inventory",
"Create detailed integration test suite",
"Establish rollback procedures for each integration phase"
],
"impact_areas": ["timeline", "quality"],
"severity": "low"
},
{
"id": "RISK006",
"title": "Data privacy requirements for analytics platform",
"description": "AI dashboard must comply with GDPR and CCPA for customer data analysis",
"category": "compliance",
"probability": 4,
"impact": 2,
"status": "open",
"owner": "Lisa Thompson",
"project_id": "PROJ003",
"created_date": "2025-02-01",
"target_resolution": "2025-05-15",
"mitigation_actions": [
"Implement data anonymization in ML pipeline",
"Add consent management features to data collection",
"Conduct privacy impact assessment"
],
"impact_areas": ["timeline", "scope"],
"severity": "medium"
}
],
"historical_data": {
"risk_trends": {
"2024-Q3": {
"total_risks": 3,
"average_score": 8.5,
"critical_risks": 1
},
"2024-Q4": {
"total_risks": 5,
"average_score": 10.2,
"critical_risks": 1
},
"2025-Q1": {
"total_risks": 6,
"average_score": 9.8,
"critical_risks": 1
}
},
"resource_utilization": {
"2024-Q4": 0.87,
"2025-Q1": 0.89
},
"project_delivery": {
"on_time_percentage": 0.75,
"budget_variance_avg": 0.05
}
}
}
FILE:references/portfolio-kpis.md
# Portfolio KPIs Reference
## Delivery KPIs
| KPI | Formula | Target |
|-----|---------|--------|
| Sprint Velocity | Story points completed / sprint | Stable ±10% |
| Sprint Predictability | Completed / Committed × 100 | ≥80% |
| Cycle Time | Time from In Progress → Done | Decreasing trend |
| Lead Time | Time from Created → Done | <2 sprints |
| Throughput | Items completed per sprint | Increasing trend |
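As a rough illustration, the Python sketch below computes Sprint Predictability and checks the velocity target. It reads "Stable ±10%" as "every sprint is within 10% of the average velocity", which is an assumption; the function names and sample figures are hypothetical.

```python
from statistics import mean


def sprint_predictability(completed: float, committed: float) -> float:
    """Completed / Committed x 100."""
    return completed / committed * 100


def velocity_is_stable(velocities: list[float], tolerance: float = 0.10) -> bool:
    """True if every sprint's velocity is within +/- tolerance of the mean."""
    average = mean(velocities)
    return all(abs(v - average) <= tolerance * average for v in velocities)


# Hypothetical sprint data: story points per sprint.
recent_velocities = [30, 32, 29, 31]
print(round(sprint_predictability(completed=34, committed=40), 1))
print(velocity_is_stable(recent_velocities))
```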
## Quality KPIs
| KPI | Formula | Target |
|-----|---------|--------|
| Defect Escape Rate | Prod bugs / total stories × 100 | <5% |
| Rework Rate | Reopened items / completed × 100 | <10% |
| Test Coverage | Covered lines / total lines × 100 | >80% |
## Team Health KPIs
| KPI | Formula | Target |
|-----|---------|--------|
| Planned vs Unplanned | Unplanned work / total work × 100 | <20% |
| Blocked Time | Hours blocked / total hours × 100 | <10% |
| WIP Limit Compliance | Times WIP exceeded / sprints × 100 | <15% |
## Portfolio KPIs
| KPI | Formula | Target |
|-----|---------|--------|
| On-Time Delivery | Projects on schedule / total × 100 | >85% |
| Budget Variance | (Actual - Budget) / Budget × 100 | ±10% |
| Resource Utilization | Allocated / Available × 100 | 70-85% |
| Strategic Alignment | Projects aligned to OKRs / total × 100 | >80% |
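The portfolio formulas above translate directly into code. The Python sketch below is illustrative only: the function names are hypothetical, and the inputs are the planned 320000 and spent 340000 figures of `PROJ004` in the sample project data plus an assumed 36 allocated hours out of 40 available.

```python
def budget_variance_pct(actual: float, budget: float) -> float:
    """(Actual - Budget) / Budget x 100; positive means over budget."""
    return (actual - budget) / budget * 100


def resource_utilization_pct(allocated: float, available: float) -> float:
    """Allocated / Available x 100."""
    return allocated / available * 100


def on_time_delivery_pct(on_schedule: int, total: int) -> float:
    """Projects on schedule / total x 100."""
    return on_schedule / total * 100


def meets_target(value: float, low: float, high: float) -> bool:
    """True if value lies inside the inclusive target band."""
    return low <= value <= high


variance = budget_variance_pct(actual=340_000, budget=320_000)
utilization = resource_utilization_pct(allocated=36, available=40)
print(round(variance, 2), meets_target(variance, -10, 10))
print(round(utilization, 1), meets_target(utilization, 70, 85))
```

With these inputs the variance is 6.25%, which is inside the ±10% band, while 90% utilization falls outside the 70-85% target.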
FILE:references/portfolio-prioritization-models.md
# Portfolio Prioritization Models & Decision Frameworks
## Executive Overview
This reference guide gives senior project managers prioritization methods for managing complex project portfolios. It covers quantitative scoring models (WSJF, RICE, ICE), qualitative frameworks (MoSCoW, Kano), and a decision tree for selecting a prioritization approach based on context, stakeholder needs, and strategic objectives.
---
## Model Selection Decision Tree
### Context-Based Framework Selection
```
START: What is your primary prioritization objective?
├── Maximize Business Value & ROI
│   ├── Clear quantitative metrics available? → RICE Model
│   └── Mix of quantitative/qualitative factors? → Weighted Scoring Matrix
│
├── Optimize Resource Utilization
│   ├── Agile/SAFe environment? → WSJF (Weighted Shortest Job First)
│   └── Traditional PM environment? → Resource-Constraint Optimization
│
├── Stakeholder Alignment & Buy-in
│   ├── Multiple stakeholder groups? → MoSCoW Method
│   └── Customer-focused prioritization? → Kano Analysis
│
├── Speed of Decision Making
│   ├── Need rapid decisions? → ICE Scoring
│   └── Complex trade-offs acceptable? → Multi-Criteria Decision Analysis
│
└── Strategic Portfolio Balance
    ├── Innovation vs. Operations balance? → Three Horizons Model
    └── Risk vs. Return optimization? → Efficient Frontier Analysis
```
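The tree above can also be expressed as a lookup table, which is convenient when the selection needs to be scripted. In this Python sketch the model names are copied from the tree, while the short dictionary keys and the function name `select_model` are hypothetical labels chosen for illustration.

```python
# Leaf names come from the decision tree; the keys are hypothetical shorthand.
MODEL_TREE = {
    "business_value": {
        "quantitative_metrics": "RICE Model",
        "mixed_factors": "Weighted Scoring Matrix",
    },
    "resource_utilization": {
        "agile_safe": "WSJF (Weighted Shortest Job First)",
        "traditional_pm": "Resource-Constraint Optimization",
    },
    "stakeholder_alignment": {
        "multiple_groups": "MoSCoW Method",
        "customer_focused": "Kano Analysis",
    },
    "decision_speed": {
        "rapid": "ICE Scoring",
        "complex_tradeoffs": "Multi-Criteria Decision Analysis",
    },
    "portfolio_balance": {
        "innovation_vs_operations": "Three Horizons Model",
        "risk_vs_return": "Efficient Frontier Analysis",
    },
}


def select_model(objective: str, condition: str) -> str:
    """Return the model at the given branch, or raise ValueError if it does not exist."""
    try:
        return MODEL_TREE[objective][condition]
    except KeyError:
        raise ValueError(
            f"No branch for objective={objective!r}, condition={condition!r}"
        ) from None


print(select_model("resource_utilization", "agile_safe"))
```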
---
## Quantitative Prioritization Models
### 1. WSJF (Weighted Shortest Job First)
**Best Used For:** Agile portfolios, resource-constrained environments, when cost of delay is critical
**Formula:** `WSJF Score = (User/Business Value + Time Criticality + Risk Reduction) ÷ Job Size`
#### Detailed Scoring Framework
**User/Business Value (1-20 scale):**
- **1-5:** Nice to have improvements, minimal user impact
- **6-10:** Moderate value, affects subset of users/processes
- **11-15:** Significant value, major user/business impact
- **16-20:** Critical value, transformational business impact
**Time Criticality (1-20 scale):**
- **1-5:** No time pressure, can be delayed 12+ months
- **6-10:** Some urgency, should complete within 6-12 months
- **11-15:** Urgent, needed within 3-6 months
- **16-20:** Critical time pressure, needed within 1-3 months
**Risk Reduction/Opportunity Enablement (1-20 scale):**
- **1-5:** Minimal risk mitigation or future opportunity impact
- **6-10:** Moderate risk reduction or enables some future work
- **11-15:** Significant risk mitigation or enables key capabilities
- **16-20:** Critical risk mitigation or foundational for future strategy
**Job Size (1-20 scale; a larger score means a larger job):**
- **1-5:** Small (<3 months, <$500K, <5 people)
- **6-10:** Medium (3-6 months, $500K-1M, 5-10 people)
- **11-15:** Large (6-12 months, $1-2M, 10-20 people)
- **16-20:** Very large (>12 months, >$2M, >20 people)
#### WSJF Implementation Example
```
Project A: Mobile App Enhancement
- User Value: 15 (significant user experience improvement)
- Time Criticality: 12 (competitive pressure, 4-month window)
- Risk Reduction: 8 (moderate technical debt reduction)
- Job Size: 8 (3-month project, $750K, 7 people)
WSJF = (15 + 12 + 8) ÷ 8 = 4.4
Project B: Infrastructure Security Upgrade
- User Value: 8 (minimal user-facing impact)
- Time Criticality: 18 (regulatory compliance deadline)
- Risk Reduction: 17 (critical security vulnerability mitigation)
- Job Size: 12 (8-month project, $1.5M, 12 people)
WSJF = (8 + 18 + 17) ÷ 12 = 3.6
Result: Project A prioritized. Project B scores higher on time criticality and risk reduction, but its larger job size lowers its value per unit of effort.
```
### 2. RICE Framework
**Best Used For:** Product development, marketing initiatives, when reach and impact can be quantified
**Formula:** `RICE Score = (Reach × Impact × Confidence) ÷ Effort`
#### RICE Scoring Guidelines
**Reach (Number per time period):**
- **Projects:** Number of users/customers/processes affected per month
- **Internal Initiatives:** Number of employees/systems/workflows impacted
- **Strategic Programs:** Market size or business units affected
**Impact (Multiplier scale):**
- **3.0:** Massive impact - Transforms core business metrics
- **2.0:** High impact - Significantly improves key metrics
- **1.0:** Medium impact - Moderately improves metrics
- **0.5:** Low impact - Slight improvement in metrics
- **0.25:** Minimal impact - Barely measurable improvement
**Confidence (Percentage as decimal):**
- **100% (1.0):** High confidence - Strong data and precedent
- **80% (0.8):** Medium confidence - Some data, reasonable assumptions
- **50% (0.5):** Low confidence - Limited data, high uncertainty
**Effort (Person-months):**
- Total estimated effort across all teams and functions
- Include planning, design, development, testing, deployment, training
#### RICE Application Example
```
Initiative: Customer Self-Service Portal
- Reach: 50,000 customers per month
- Impact: 1.0 (moderate reduction in support calls)
- Confidence: 0.8 (good data from customer surveys)
- Effort: 18 person-months
RICE = (50,000 × 1.0 × 0.8) ÷ 18 = 2,222
Initiative: Sales Process Automation
- Reach: 200 sales reps per month
- Impact: 2.0 (significant productivity improvement)
- Confidence: 0.9 (pilot data available)
- Effort: 12 person-months
RICE = (200 × 2.0 × 0.9) ÷ 12 = 30
Result: The self-service portal ranks far higher (2,222 vs. 30). Its much larger reach outweighs the sales automation's higher impact, higher confidence, and lower effort.
```
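The RICE formula above is simple enough to script once more than a handful of initiatives need ranking. The following is a minimal Python sketch, not a prescribed tool: the `Initiative` class and its field names are illustrative assumptions, and the inputs are the two initiatives from the example.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    """Hypothetical container for one RICE-scored initiative."""
    name: str
    reach: float       # people or units affected per period
    impact: float      # multiplier from the impact scale (0.25 to 3.0)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months, must be greater than zero

    def rice(self) -> float:
        # RICE Score = (Reach x Impact x Confidence) / Effort
        return self.reach * self.impact * self.confidence / self.effort


initiatives = [
    Initiative("Customer Self-Service Portal", 50_000, 1.0, 0.8, 18),
    Initiative("Sales Process Automation", 200, 2.0, 0.9, 12),
]

# Highest RICE score first
ranked = sorted(initiatives, key=lambda item: item.rice(), reverse=True)
for item in ranked:
    print(f"{item.name}: {item.rice():,.0f}")
```

Because reach is a raw count while the other factors are small multipliers, reach differences tend to dominate the score. That is worth keeping in mind when customer-facing and internal initiatives are compared in the same list.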
### 3. ICE Scoring
**Best Used For:** Rapid prioritization, brainstorming sessions, when detailed analysis isn't feasible
**Formula:** `ICE Score = (Impact + Confidence + Ease) ÷ 3`
Each dimension scored 1-10:
**Impact (1-10):**
- **10:** Revolutionary change, massive business impact
- **7-9:** Significant improvement in key metrics
- **4-6:** Moderate positive impact
- **1-3:** Minimal or unclear impact
**Confidence (1-10):**
- **10:** Certain of outcome, strong data/precedent
- **7-9:** High confidence, some supporting evidence
- **4-6:** Medium confidence, reasonable assumptions
- **1-3:** Low confidence, uncertain outcome
**Ease (1-10):**
- **10:** Minimal effort, existing resources, low complexity
- **7-9:** Moderate effort, some new resources needed
- **4-6:** Significant effort, substantial resource commitment
- **1-3:** Very difficult, major resource investment
#### ICE Prioritization Matrix
| Initiative | Impact | Confidence | Ease | ICE Score | Priority |
|------------|--------|------------|------|-----------|----------|
| API Documentation Update | 6 | 9 | 9 | 8.0 | High |
| Machine Learning Platform | 9 | 5 | 3 | 5.7 | Medium |
| Mobile App Redesign | 8 | 7 | 5 | 6.7 | Medium-High |
| Data Warehouse Migration | 7 | 8 | 2 | 5.7 | Medium |
---
## Qualitative Prioritization Frameworks
### 1. MoSCoW Method
**Best Used For:** Scope management, stakeholder alignment, requirement prioritization
**Categories:**
- **Must Have:** Non-negotiable requirements, project fails without these
- **Should Have:** Important but not critical, can be delayed if necessary
- **Could Have:** Nice to have, include if resources permit
- **Won't Have:** Explicitly out of scope for current timeframe
#### MoSCoW Implementation Guidelines
**Must Have Criteria:**
- Legal/regulatory requirement
- Critical business process dependency
- Fundamental system functionality
- Security/compliance necessity
**Should Have Criteria:**
- Significant user value or business benefit
- Competitive advantage requirement
- Important process improvement
- Strong stakeholder demand
**Could Have Criteria:**
- Enhancement to user experience
- Process optimization opportunity
- Future-proofing consideration
- Secondary stakeholder request
**Won't Have Criteria:**
- Feature creep identification
- Future phase consideration
- Out-of-budget items
- Low-value/high-effort items
#### MoSCoW with Quantitative Overlay
```
Priority Distribution Guidelines:
- Must Have: 60% of budget/effort (ensures core delivery)
- Should Have: 20% of budget/effort (key value delivery)
- Could Have: 20% of budget/effort (buffer for scope adjustment)
- Won't Have: Document for future consideration
Risk Management:
- If Must Haves exceed 60%: Scope too large, requires reduction
- If Should Haves exceed 30%: Risk of scope creep
- If Could Haves exceed 20%: May indicate unclear priorities
```
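The risk-management thresholds above can be checked mechanically against a planned effort split. This is a small sketch under stated assumptions: the function name `check_moscow` and the dictionary keys are hypothetical, and effort can be in any consistent unit (person-days, budget, story points).

```python
def check_moscow(effort_by_category: dict[str, float]) -> list[str]:
    """Return warnings when the effort split breaches the guideline thresholds.

    Expects keys "must", "should", "could"; Won't Have items carry no effort.
    """
    planned = {key: effort_by_category.get(key, 0.0) for key in ("must", "should", "could")}
    total = sum(planned.values())
    if total <= 0:
        raise ValueError("total planned effort must be positive")
    share = {key: value / total for key, value in planned.items()}

    warnings = []
    if share["must"] > 0.60:
        warnings.append("Must Haves exceed 60%: scope too large, requires reduction")
    if share["should"] > 0.30:
        warnings.append("Should Haves exceed 30%: risk of scope creep")
    if share["could"] > 0.20:
        warnings.append("Could Haves exceed 20%: may indicate unclear priorities")
    return warnings


# Hypothetical plan in person-days: Must Haves take 70% of the effort
print(check_moscow({"must": 140, "should": 40, "could": 20}))
```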
### 2. Kano Model Analysis
**Best Used For:** Customer-focused prioritization, product development, user experience improvements
#### Kano Categories
**Basic Needs (Must-Be):**
- **Definition:** Expected features, dissatisfaction if absent
- **Customer Response:** "Of course it should do that"
- **Business Impact:** Prevents customer loss but doesn't drive acquisition
- **Examples:** Security, basic functionality, compliance
**Performance Needs (More-Is-Better):**
- **Definition:** Linear satisfaction relationship with performance
- **Customer Response:** "The better it performs, the happier I am"
- **Business Impact:** Competitive differentiation opportunity
- **Examples:** Speed, efficiency, cost, reliability
**Excitement Needs (Delighters):**
- **Definition:** Unexpected features that create delight
- **Customer Response:** "Wow, I didn't expect that!"
- **Business Impact:** Customer acquisition and loyalty driver
- **Examples:** Innovative features, exceptional experiences
**Indifferent Features:**
- **Definition:** Features customers don't care about
- **Customer Response:** "Whatever, doesn't matter to me"
- **Business Impact:** Resource waste if prioritized
- **Action:** Eliminate or deprioritize
**Reverse Features:**
- **Definition:** Features that actually create dissatisfaction
- **Customer Response:** "I wish this wasn't here"
- **Business Impact:** Customer churn risk
- **Action:** Remove immediately
#### Kano Prioritization Matrix
| Feature | Kano Category | Customer Impact | Implementation Cost | Priority |
|---------|---------------|-----------------|---------------------|----------|
| Single Sign-On | Basic | High Dissatisfaction if Missing | Medium | Must Do |
| Load Time <2sec | Performance | Linear Satisfaction | High | High Priority |
| AI-Powered Recommendations | Excitement | High Delight Potential | Very High | Medium Priority |
| Advanced Analytics Dashboard | Indifferent | Low Interest | Medium | Low Priority |
---
## Advanced Prioritization Models
### 1. Multi-Criteria Decision Analysis (MCDA)
**Best Used For:** Complex portfolios with multiple competing objectives and diverse stakeholder interests
#### Weighted Scoring Matrix Setup
**Step 1: Define Evaluation Criteria**
```
Strategic Criteria (40% weight):
- Strategic Alignment (15%)
- Market Opportunity (10%)
- Competitive Advantage (15%)
Financial Criteria (35% weight):
- ROI/NPV (20%)
- Payback Period (10%)
- Cost Efficiency (5%)
Risk/Feasibility Criteria (25% weight):
- Technical Risk (10%)
- Resource Availability (10%)
- Timeline Feasibility (5%)
```
**Step 2: Score Each Project (1-5 scale)**
**Step 3: Calculate Weighted Scores**
```
Project Score = Σ(Criterion Score × Criterion Weight)
Example:
Project Alpha:
- Strategic Alignment: 4 × 0.15 = 0.60
- Market Opportunity: 5 × 0.10 = 0.50
- Competitive Advantage: 3 × 0.15 = 0.45
- ROI/NPV: 4 × 0.20 = 0.80
- Payback Period: 3 × 0.10 = 0.30
- Cost Efficiency: 5 × 0.05 = 0.25
- Technical Risk: 2 × 0.10 = 0.20
- Resource Availability: 4 × 0.10 = 0.40
- Timeline Feasibility: 4 × 0.05 = 0.20
Total Score: 3.70
```
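The weighted-sum calculation in Step 3 can be reproduced with a few lines of Python. The sketch below uses the criterion weights and Project Alpha scores listed above; the names `WEIGHTS` and `weighted_score` are illustrative, and the validation checks are an added assumption about how a team might guard against incomplete scoring.

```python
WEIGHTS = {
    "Strategic Alignment": 0.15,
    "Market Opportunity": 0.10,
    "Competitive Advantage": 0.15,
    "ROI/NPV": 0.20,
    "Payback Period": 0.10,
    "Cost Efficiency": 0.05,
    "Technical Risk": 0.10,
    "Resource Availability": 0.10,
    "Timeline Feasibility": 0.05,
}


def weighted_score(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Project Score = sum of (criterion score x criterion weight)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criterion weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(scores[criterion] * weight for criterion, weight in weights.items())


project_alpha = {
    "Strategic Alignment": 4,
    "Market Opportunity": 5,
    "Competitive Advantage": 3,
    "ROI/NPV": 4,
    "Payback Period": 3,
    "Cost Efficiency": 5,
    "Technical Risk": 2,
    "Resource Availability": 4,
    "Timeline Feasibility": 4,
}

print(f"Project Alpha: {weighted_score(project_alpha):.2f}")
```

Keeping the weights in one dictionary makes it straightforward to rerun the ranking with alternative weightings when stakeholders disagree about priorities.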
### 2. Three Horizons Model
**Best Used For:** Balancing innovation with operational excellence, strategic portfolio planning
#### Horizon Definitions
**Horizon 1: Core Business (70% of portfolio)**
- **Focus:** Optimize existing products/services
- **Timeline:** 0-2 years
- **Risk Level:** Low
- **ROI Expectation:** High certainty, moderate returns
- **Examples:** Process improvements, maintenance, incremental features
**Horizon 2: Emerging Opportunities (20% of portfolio)**
- **Focus:** Extend core capabilities into new areas
- **Timeline:** 2-5 years
- **Risk Level:** Medium
- **ROI Expectation:** Medium certainty, high returns
- **Examples:** New markets, adjacent products, platform extensions
**Horizon 3: Transformational Initiatives (10% of portfolio)**
- **Focus:** Create new capabilities and business models
- **Timeline:** 5+ years
- **Risk Level:** High
- **ROI Expectation:** Low certainty, very high potential returns
- **Examples:** Breakthrough technologies, new business models, moonshots
#### Portfolio Balance Guidelines
```
Balanced Portfolio Allocation:
- Conservative Organization: H1=80%, H2=15%, H3=5%
- Growth-Oriented: H1=60%, H2=25%, H3=15%
- Innovation Leader: H1=50%, H2=30%, H3=20%
Risk Management:
- H1 projects should fund H2 and H3 experiments
- H2 successes should scale to become new H1 businesses
- H3 failures should generate learning for future initiatives
```
### 3. Efficient Frontier Analysis
**Best Used For:** Risk-return optimization, portfolio-level resource allocation
#### Risk-Return Plotting
**Step 1: Quantify Risk and Return for Each Project**
```
Return Metrics:
- Expected NPV or IRR
- Strategic value score
- Market opportunity size
Risk Metrics:
- Probability of failure
- Variance in expected outcomes
- Technical/market uncertainty
```
**Step 2: Plot Projects on Risk-Return Matrix**
**Step 3: Identify Efficient Frontier**
- Projects offering maximum return for each risk level
- Projects below the frontier are suboptimal
- Portfolio optimization involves selecting mix along frontier
**Step 4: Apply Risk Appetite**
- Conservative: Lower risk portion of frontier
- Moderate: Balanced mix across frontier
- Aggressive: Higher risk/return portion
#### Portfolio Optimization Example
```
Efficient Frontier Projects:
- Low Risk/Low Return: Process Automation (Risk=2, Return=15%)
- Medium Risk/Medium Return: Market Expansion (Risk=5, Return=25%)
- High Risk/High Return: New Technology Platform (Risk=8, Return=45%)
Suboptimal Projects:
- High Risk/Low Return: Legacy System Upgrade (Risk=7, Return=12%)
- Reason: Market Expansion offers a higher return (25%) at lower risk (5); Process Automation also beats it on both risk and return
```
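One simple way to identify the frontier in Step 3 is a dominance check: a project is suboptimal if some other project offers at least as much return for no more risk. The Python sketch below applies that check to the four example projects. It treats each project individually, which is a simplification; a full efficient-frontier analysis would evaluate mixes of projects. The function names are illustrative.

```python
# (risk score, expected return) taken from the example above
projects = {
    "Process Automation": (2, 0.15),
    "Market Expansion": (5, 0.25),
    "New Technology Platform": (8, 0.45),
    "Legacy System Upgrade": (7, 0.12),
}


def dominates(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if a has no more risk and no less return than b, and is not identical."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def efficient_frontier(candidates: dict[str, tuple[float, float]]) -> list[str]:
    """Return the names of projects that no other project dominates."""
    return [
        name
        for name, profile in candidates.items()
        if not any(dominates(other, profile) for other in candidates.values())
    ]


frontier = efficient_frontier(projects)
print("On the frontier:", frontier)
print("Suboptimal:", [name for name in projects if name not in frontier])
```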
---
## Decision Trees for Model Selection
### Scenario-Based Model Selection
#### Scenario 1: Resource-Constrained Environment
```
Available Resources < Demand?
├── Yes: Use WSJF (maximize value per unit effort)
└── No: Use RICE or Weighted Scoring (optimize for maximum impact)
Time Pressure for Decisions?
├── High: Use ICE Scoring (rapid evaluation)
└── Low: Use MCDA (thorough analysis)
Stakeholder Alignment Issues?
├── Yes: Use MoSCoW (consensus building)
└── No: Proceed with quantitative method
```
#### Scenario 2: Innovation vs. Operations Balance
```
Portfolio Currently Imbalanced?
├── Too Operational: Apply Three Horizons Model (increase H2/H3)
├── Too Innovative: Focus on H1 projects (stabilize revenue)
└── Balanced: Use efficient frontier analysis (optimize mix)
Strategic Direction Clear?
├── Yes: Use strategic alignment scoring
└── No: Use broad stakeholder input (MoSCoW or Kano)
```
#### Scenario 3: Customer vs. Business Value Tension
```
Primary Value Driver?
├── Customer Satisfaction: Use Kano Analysis
├── Business ROI: Use RICE or financial scoring
└── Both Equally Important: Use balanced scorecard approach
Data Availability?
├── Rich Customer Data: Kano → RICE combination
├── Limited Data: ICE scoring → MoSCoW validation
└── Financial Data Only: WSJF or NPV ranking
```
---
## Hybrid Prioritization Approaches
### 1. Two-Stage Prioritization
**Stage 1: Strategic Filtering**
- Apply MoSCoW or Strategic Alignment Filter
- Eliminate projects that don't meet minimum criteria
- Reduce candidate pool by 40-60%
**Stage 2: Detailed Scoring**
- Apply WSJF, RICE, or MCDA to remaining candidates
- Rank order for resource allocation
- Final prioritization with stakeholder review
### 2. Weighted Multi-Model Approach
```
Combined Score = (WSJF Score × 0.4) + (Strategic Score × 0.3) + (Risk Score × 0.3)
Benefits:
- Reduces single-model bias
- Incorporates multiple perspectives
- Provides robustness check
Challenges:
- More complex to calculate
- Requires normalization of scales
- May obscure clear trade-offs
```
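The normalization challenge noted above matters because WSJF scores, strategic scores, and risk scores usually live on different scales. A minimal Python sketch, assuming min-max normalization and assuming that a higher risk score means a more favorable risk profile (both are assumptions, as are the sample numbers):

```python
def min_max(values: list[float]) -> list[float]:
    """Rescale values to the 0-1 range; identical values all map to 0.5."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5 for _ in values]
    return [(value - low) / (high - low) for value in values]


def combined_scores(wsjf, strategic, risk, weights=(0.4, 0.3, 0.3)) -> list[float]:
    """Combined Score = 0.4 x WSJF + 0.3 x Strategic + 0.3 x Risk, after normalization."""
    columns = [min_max(wsjf), min_max(strategic), min_max(risk)]
    return [
        sum(weight * column[i] for weight, column in zip(weights, columns))
        for i in range(len(wsjf))
    ]


# Hypothetical scores for three projects
wsjf = [2.5, 4.3, 3.0]
strategic = [4, 3, 5]
risk = [3, 2, 5]

scores = combined_scores(wsjf, strategic, risk)
for index, score in enumerate(scores, start=1):
    print(f"Project {index}: {score:.2f}")
```

Min-max scaling is sensitive to outliers and to which projects are in the pool, so rankings can shift when a project is added or removed. Rank-based or z-score normalization are alternatives worth considering.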
### 3. Dynamic Prioritization
**Concept:** Priorities change as conditions change; build flexibility into the system
**Implementation:**
- Monthly priority reviews using lightweight scoring (ICE)
- Quarterly deep-dive analysis using comprehensive model (MCDA)
- Annual strategic realignment using Three Horizons
**Trigger Events for Reprioritization:**
- Significant market changes
- Technology breakthroughs or failures
- Resource availability changes
- Strategic direction shifts
- Competitive moves
---
## Implementation Best Practices
### 1. Model Calibration and Validation
**Historical Validation:**
- Compare model predictions to actual project outcomes
- Identify systematic biases in scoring
- Adjust scoring criteria based on lessons learned
**Cross-Validation:**
- Use multiple models on same project set
- Investigate projects that rank very differently
- Understand root causes of ranking differences
**Stakeholder Validation:**
- Present prioritization results to key stakeholders
- Gather feedback on "surprising" rankings
- Adjust weights or criteria based on strategic input
### 2. Common Implementation Pitfalls
**Over-Engineering the Process:**
- **Problem:** Complex models that take too long to use
- **Solution:** Start simple, add complexity only when needed
**Score Inflation:**
- **Problem:** All projects rated as high importance
- **Solution:** Forced ranking, relative scoring, external calibration
**Gaming the System:**
- **Problem:** Project sponsors inflate scores to get priority
- **Solution:** Independent scoring, historical validation, transparency
**Analysis Paralysis:**
- **Problem:** Endless refinement without decision making
- **Solution:** Set decision deadlines, "good enough" thresholds
### 3. Organizational Change Management
**Building Buy-In:**
- Involve stakeholders in model selection process
- Provide training on chosen methodology
- Start with pilot group before full rollout
- Demonstrate early wins from improved prioritization
**Managing Resistance:**
- Address concerns about "pet projects" being deprioritized
- Show how model supports rather than replaces judgment
- Provide transparency into scoring rationale
- Allow for appeals process with clear criteria
**Continuous Improvement:**
- Regular retrospectives on prioritization effectiveness
- Gather feedback from project teams and stakeholders
- Update models based on changing business context
- Share success stories and lessons learned
---
## Tools and Templates
### 1. Excel-Based Prioritization Templates
**WSJF Calculator:**
- Automated score calculation
- Sensitivity analysis for weight changes
- Portfolio-level aggregation
- Visual ranking dashboard
**RICE Framework Spreadsheet:**
- Reach estimation guidelines
- Impact scoring rubric
- Confidence level definitions
- Effort estimation templates
### 2. Decision Support Dashboards
**Portfolio Overview:**
- Current project distribution across models
- Resource allocation vs. strategic priorities
- Risk-return visualization
- Priority change tracking
**Stakeholder Views:**
- Executive summary of top priorities
- Department-specific project impacts
- Budget allocation by strategic theme
- Timeline and milestone visualization
### 3. Governance Integration
**Portfolio Review Templates:**
- Monthly priority health check
- Quarterly strategic alignment review
- Annual prioritization methodology assessment
- Exception handling procedures
---
## Advanced Topics
### 1. Machine Learning Enhanced Prioritization
**Predictive Scoring:**
- Use historical project data to improve scoring accuracy
- Identify patterns in successful vs. failed initiatives
- Automate routine scoring updates
- Flag projects with unusual risk profiles
**Natural Language Processing:**
- Analyze project descriptions for implicit risk factors
- Extract customer sentiment from feedback data
- Monitor market signals for priority implications
- Automate competitive intelligence gathering
### 2. Real-Time Priority Adjustment
**Market Signal Integration:**
- Customer satisfaction scores
- Competitive intelligence
- Regulatory changes
- Technology disruption indicators
**Internal Signal Monitoring:**
- Resource availability changes
- Budget reforecasts
- Strategic initiative launches
- Organizational restructuring
### 3. Portfolio Scenario Planning
**What-If Analysis:**
- Impact of budget cuts on portfolio balance
- Effect of resource constraints on delivery timelines
- Strategic pivot implications for current priorities
- Market disruption response strategies
---
*This framework should be customized based on organizational maturity, industry context, and strategic objectives. Regular updates should incorporate lessons learned and evolving best practices.*
FILE:references/risk-management-framework.md
# Risk Management Framework for Senior Project Managers
## Executive Summary
This framework provides senior project managers with quantitative risk analysis methodologies, decision frameworks, and portfolio-level risk management strategies. It goes beyond basic risk identification to provide sophisticated tools for risk quantification, Monte Carlo simulation, expected monetary value (EMV) analysis, and enterprise risk appetite frameworks.
---
## Risk Classification & Quantification
### Risk Categories with Quantitative Weightings
#### 1. Technical Risk (Weight: 1.2x)
**Definition:** Technology implementation, integration, and performance risks
**Quantification Approach:**
- **Technology Maturity Score (TMS):** 1-5 scale based on technology adoption curve
- **Integration Complexity Index (ICI):** Number of integration points × complexity factor
- **Performance Risk Factor (PRF):** Historical performance variance in similar projects
**Formula:** `Technical Risk Score = (TMS × 0.3 + ICI × 0.4 + PRF × 0.3) × 1.2`
**Typical Sub-Risks:**
- Architecture scalability limitations (Impact: Schedule +15-30%, Cost +10-25%)
- Third-party integration failures (Impact: Schedule +20-40%, Cost +15-30%)
- Performance bottlenecks (Impact: Quality -20-40%, Cost +5-15%)
- Technology obsolescence (Impact: Long-term maintenance +50-100%)
#### 2. Resource Risk (Weight: 1.1x)
**Definition:** Human capital availability, skills, and retention risks
**Quantification Approach:**
- **Skill Availability Index (SAI):** Market availability of required skills (1-5)
- **Team Stability Factor (TSF):** Historical turnover rate in similar roles
- **Capacity Utilization Ratio (CUR):** Team utilization vs. sustainable capacity
**Formula:** `Resource Risk Score = (SAI × 0.4 + TSF × 0.3 + CUR × 0.3) × 1.1`
**Financial Impact Models:**
- Key person departure: 3-6 months replacement + 2-4 weeks knowledge transfer
- Skill gap: 15-30% productivity reduction + training/hiring costs
- Over-utilization: 20-40% quality degradation + burnout-related delays
#### 3. Schedule Risk (Weight: 1.0x)
**Definition:** Timeline compression, dependencies, and critical path risks
**Quantification Method: Monte Carlo Simulation**
```
Three-Point Estimation:
- Optimistic (O): Best-case duration (roughly a 10% chance of finishing this quickly or faster)
- Most Likely (M): Realistic estimate (the single most probable duration)
- Pessimistic (P): Worst-case duration (roughly a 90% chance of finishing within this time)
Expected Duration = (O + 4M + P) / 6
Standard Deviation = (P - O) / 6
Monte Carlo Variables:
- Task duration uncertainty
- Resource availability variations
- Dependency delay impacts
- External factor disruptions
```
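The three-point formulas above translate directly into code. A minimal Python sketch follows; the function name `pert_estimate` and the sample task durations are hypothetical.

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> tuple[float, float]:
    """Return (expected duration, standard deviation) from a three-point estimate."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Hypothetical task estimated at 4 / 6 / 11 weeks
expected, std_dev = pert_estimate(4, 6, 11)
print(f"Expected duration: {expected:.1f} weeks, standard deviation: {std_dev:.2f} weeks")
```

The expected value lands above the most likely estimate here because the pessimistic tail is longer than the optimistic one, which is typical of schedule estimates.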
#### 4. Financial Risk (Weight: 1.4x)
**Definition:** Budget overruns, funding availability, and cost variability risks
**Expected Monetary Value (EMV) Analysis:**
```
EMV = Σ(Probability × Impact) for all financial risk scenarios
Cost Escalation Model:
- Labor cost inflation: Historical rate ± standard deviation
- Technology cost changes: Market volatility analysis
- Scope creep financial impact: Historical data from similar projects
- Currency/economic factors: Economic indicators correlation
Risk-Adjusted Budget = Base Budget × (1 + Risk Premium)
Risk Premium = Portfolio Risk Score × Risk Tolerance Factor
```
---
## Quantitative Risk Analysis Methodologies
### 1. Expected Monetary Value (EMV) Analysis
**Purpose:** Quantify financial impact of risks to inform investment decisions
**Process:**
1. **Risk Event Identification:** Catalog all potential financial impact events
2. **Probability Assessment:** Use historical data, expert judgment, and statistical models
3. **Impact Quantification:** Model financial consequences across multiple scenarios
4. **EMV Calculation:** Probability × Financial Impact for each risk
5. **Portfolio EMV:** Sum of all individual risk EMVs
**Example EMV Calculation:**
```
Risk: Third-party API failure requiring alternative implementation
Probability Scenarios:
- Minor disruption (60% chance): $50K additional cost
- Major redesign (30% chance): $200K additional cost
- Complete platform change (10% chance): $500K additional cost
EMV = (0.6 × $50K) + (0.3 × $200K) + (0.1 × $500K)
EMV = $30K + $60K + $50K = $140K
Risk-adjusted budget should include $140K contingency for this risk.
```
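The EMV arithmetic above can be scripted so that a risk register recalculates contingency whenever probabilities or impacts are revised. This Python sketch reuses the three scenarios from the example; the function name and the probability check are illustrative assumptions.

```python
# (scenario, probability, additional cost in dollars) from the example above
scenarios = [
    ("Minor disruption", 0.6, 50_000),
    ("Major redesign", 0.3, 200_000),
    ("Complete platform change", 0.1, 500_000),
]


def expected_monetary_value(risk_scenarios) -> float:
    """EMV = sum of (probability x impact) across mutually exclusive scenarios."""
    total_probability = sum(probability for _, probability, _ in risk_scenarios)
    if total_probability > 1.0 + 1e-9:
        raise ValueError("scenario probabilities must not sum to more than 1.0")
    return sum(probability * impact for _, probability, impact in risk_scenarios)


emv = expected_monetary_value(scenarios)
print(f"EMV: ${emv:,.0f}")
```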
### 2. Monte Carlo Simulation for Schedule Risk
**Purpose:** Model schedule uncertainty using probabilistic analysis
**Implementation Process:**
1. **Task Duration Modeling:** Define probability distributions for each task
2. **Dependency Mapping:** Model task dependencies and their uncertainty
3. **Resource Constraint Integration:** Include resource availability variations
4. **External Factor Variables:** Weather, regulatory approvals, vendor delays
5. **Simulation Execution:** Run 10,000+ iterations to generate probability curves
**Key Outputs:**
- **P50 Schedule:** 50% confidence completion date
- **P80 Schedule:** 80% confidence completion date (recommended for commitments)
- **P95 Schedule:** 95% confidence completion date (worst-case planning)
- **Critical Path Sensitivity:** Which tasks most impact overall schedule
**Schedule Risk Interpretation:**
```
If P50 = 6 months, P80 = 7.5 months:
- Schedule Buffer Required: 1.5 months (25% buffer)
- Risk Level: Medium (broad distribution indicates uncertainty)
- Mitigation Priority: Focus on tasks with highest variance contribution
```
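The simulation process described above can be prototyped with the Python standard library before investing in a dedicated scheduling tool. The sketch below makes several simplifying assumptions that should be stated when sharing results: tasks run strictly one after another, each task duration follows a triangular distribution, and task durations are independent. The task estimates are hypothetical.

```python
import random


def simulate_schedule(tasks, iterations=10_000, seed=42):
    """Simulate total duration for tasks executed one after another.

    tasks: list of (optimistic, most_likely, pessimistic) durations.
    Returns the simulated totals sorted in ascending order.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    totals = [
        # random.triangular takes (low, high, mode)
        sum(rng.triangular(low, high, mode) for low, mode, high in tasks)
        for _ in range(iterations)
    ]
    return sorted(totals)


def percentile(sorted_values, pct):
    """Simple rank-based percentile on an already sorted list."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]


# Hypothetical three-task chain, durations in months
tasks = [(1.0, 1.5, 3.0), (2.0, 2.5, 4.0), (1.5, 2.0, 3.5)]
totals = simulate_schedule(tasks)
p50, p80, p95 = (percentile(totals, p) for p in (50, 80, 95))

print(f"P50: {p50:.2f}  P80: {p80:.2f}  P95: {p95:.2f} months")
print(f"Buffer from P50 to P80: {(p80 - p50) / p50:.0%}")
```

Real schedules have parallel paths and correlated delays, so a production model would need dependency logic and correlation handling on top of this.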
### 3. Risk Appetite & Tolerance Frameworks
#### Enterprise Risk Appetite Levels
**Conservative (Risk Score Target: 0-8)**
- **Philosophy:** Minimize risk exposure, accept lower returns for certainty
- **Suitable Projects:** Core business operations, regulatory compliance, customer-facing systems
- **Contingency Reserves:** 20-30% of project budget
- **Decision Criteria:** Require 90%+ confidence levels for major decisions
**Moderate (Risk Score Target: 8-15)**
- **Philosophy:** Balanced risk-return approach, selective risk taking
- **Suitable Projects:** Process improvements, technology upgrades, market expansion
- **Contingency Reserves:** 15-20% of project budget
- **Decision Criteria:** 70-80% confidence levels acceptable
**Aggressive (Risk Score Target: 15+)**
- **Philosophy:** High risk tolerance for high strategic returns
- **Suitable Projects:** Innovation initiatives, emerging technology adoption, new market entry
- **Contingency Reserves:** 10-15% of project budget (accept higher failure rates)
- **Decision Criteria:** 60-70% confidence levels acceptable
#### Risk Tolerance Thresholds
**Financial Tolerance Levels:**
- **Level 1:** <$100K potential loss - Team/PM authority
- **Level 2:** $100K-$500K potential loss - Business unit approval required
- **Level 3:** $500K-$2M potential loss - Executive committee approval
- **Level 4:** >$2M potential loss - Board approval required
**Schedule Tolerance Levels:**
- **Green:** <5% schedule impact - Monitor and mitigate
- **Amber:** 5-15% schedule impact - Active mitigation required
- **Red:** >15% schedule impact - Escalation and replanning required
---
## Advanced Risk Modeling Techniques
### 1. Correlation Analysis for Portfolio Risk
**Purpose:** Understand how risks interact across projects and compound at portfolio level
**Correlation Types:**
- **Positive Correlation:** Risks that tend to occur together (e.g., economic downturn affecting multiple projects)
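- **Negative Correlation:** Risks that tend to offset each other, so one materializing makes the other less likely (e.g., projects competing for the same resources, where a delay in one frees capacity for the other)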
- **No Correlation:** Independent risks
**Portfolio Risk Calculation:**
```
Portfolio Variance = Σ(Variance_i) + 2 × Σ(Correlation_ij × StdDev_i × StdDev_j), with the second sum taken over every project pair i < j
Where correlation coefficients range from -1.0 to +1.0:
- +1.0: Perfect positive correlation (risks always occur together)
- 0.0: No correlation (risks are independent)
- -1.0: Perfect negative correlation (risks never occur together)
```
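A short Python sketch of the portfolio variance calculation, assuming each project's outcome uncertainty is summarized by a standard deviation and that pairwise correlations have been estimated. The three-project figures and the correlation matrix are hypothetical.

```python
import math


def portfolio_variance(std_devs, correlation):
    """Sum of individual variances plus twice the covariance of every pair i < j."""
    n = len(std_devs)
    variance = sum(s ** 2 for s in std_devs)
    for i in range(n):
        for j in range(i + 1, n):
            variance += 2 * correlation[i][j] * std_devs[i] * std_devs[j]
    return variance


# Hypothetical cost uncertainty per project, in $K
std_devs = [100.0, 150.0, 80.0]
correlation = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, -0.2],
    [0.0, -0.2, 1.0],
]

variance = portfolio_variance(std_devs, correlation)
independent = sum(s ** 2 for s in std_devs)
print(f"Portfolio std dev with correlations: {math.sqrt(variance):.1f}K")
print(f"Portfolio std dev if independent:    {math.sqrt(independent):.1f}K")
```

Comparing the two figures shows how much of the portfolio's exposure comes from risks moving together instead of from the individual projects themselves.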
### 2. Value at Risk (VaR) for Project Portfolios
**Definition:** Maximum expected loss over a specific time period at a given confidence level
**Calculation Example:**
```
For a portfolio with expected value of $10M and monthly VaR of $500K at 95% confidence:
"There is a 95% chance that portfolio losses will not exceed $500K in any given month"
VaR Calculation Methods:
1. Historical Simulation: Use past project performance data
2. Parametric Method: Assume normal distribution of returns
3. Monte Carlo Simulation: Model complex risk interactions
```
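Of the three methods listed, the parametric method is the quickest to illustrate. The Python sketch below assumes monthly changes in portfolio value are normally distributed; the mean and standard deviation are hypothetical inputs chosen so the result lands near the $500K figure in the example.

```python
from statistics import NormalDist


def parametric_var(mean_change: float, std_dev: float, confidence: float = 0.95) -> float:
    """Loss not expected to be exceeded at the given confidence level.

    Assumes value changes over the period are normally distributed.
    """
    # Value change at the lower tail, e.g. the 5th percentile for 95% confidence
    tail_change = NormalDist(mean_change, std_dev).inv_cdf(1 - confidence)
    return max(0.0, -tail_change)


# Hypothetical: zero expected monthly change, $304K monthly standard deviation
var_95 = parametric_var(mean_change=0.0, std_dev=304_000)
print(f"Monthly VaR at 95% confidence: ${var_95:,.0f}")
```

Project outcomes are often skewed, with large overruns more likely than large savings, so the normality assumption tends to understate tail losses. Historical or Monte Carlo methods handle that better.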
### 3. Real Options Analysis for Project Flexibility
**Purpose:** Value the flexibility to modify project approach based on new information
**Common Real Options in Projects:**
- **Expansion Option:** Scale up successful projects
- **Abandonment Option:** Exit failing projects early
- **Timing Option:** Delay project start for better conditions
- **Switching Option:** Change technology/approach mid-project
**Black-Scholes Adaptation for Projects:**
```
Project Option Value = S₀ × N(d₁) - K × e^(-r×T) × N(d₂)
Where:
S₀ = Current project value estimate
K = Required investment (strike price)
r = Risk-free rate
T = Time to decision point
σ = Volatility of the project value estimate
N(d) = Cumulative standard normal distribution
d₁ = [ln(S₀ ÷ K) + (r + σ² ÷ 2) × T] ÷ (σ × √T)
d₂ = d₁ - σ × √T
```
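The option value formula can be evaluated with the Python standard library. The sketch below uses the standard Black-Scholes definitions of d₁ and d₂, which depend on a volatility input (`sigma`, the annualized volatility of the project value estimate). The function name and sample inputs are illustrative, and applying a financial-market model to projects is itself an approximation.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def project_option_value(s0: float, k: float, r: float, t: float, sigma: float) -> float:
    """Black-Scholes call value: S0 * N(d1) - K * exp(-r * T) * N(d2)."""
    if min(s0, k, t, sigma) <= 0:
        raise ValueError("s0, k, t, and sigma must all be positive")
    d1 = (log(s0 / k) + (r + sigma ** 2 / 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    cdf = NormalDist().cdf  # cumulative standard normal distribution
    return s0 * cdf(d1) - k * exp(-r * t) * cdf(d2)


# Hypothetical: $100K value estimate, $100K investment, 5% rate, 1 year, 20% volatility
value = project_option_value(s0=100.0, k=100.0, r=0.05, t=1.0, sigma=0.20)
print(f"Option value: ${value:.2f}K")
```

Volatility is usually the hardest input to justify for a project. Running the calculation across a range of `sigma` values shows how sensitive the option value is to that assumption.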
---
## Risk Response Strategies with Decision Trees
### Strategy Selection Framework
#### 1. Avoid (Eliminate Risk)
**Decision Criteria:**
- High impact + High probability risks
- Cost of avoidance < Expected risk cost
- Alternative approaches available
**Examples:**
- Choose proven technology over cutting-edge solutions
- Eliminate high-risk features from scope
- Change project approach entirely
#### 2. Mitigate (Reduce Probability or Impact)
**Decision Tree for Mitigation Investment:**
```
If (Risk EMV > Mitigation Cost × 1.5):
    Implement mitigation
Else if (Risk Impact > Risk Tolerance Threshold):
    Consider partial mitigation
Else:
    Accept risk
```
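The decision rule above maps directly onto a small function, which makes it easy to apply consistently across a risk register. This is a sketch in Python; the function and parameter names are illustrative, and all monetary inputs are assumed to be in the same currency unit.

```python
def mitigation_decision(
    risk_emv: float,
    mitigation_cost: float,
    risk_impact: float,
    tolerance_threshold: float,
) -> str:
    """Apply the mitigation investment rule to a single risk."""
    # Mitigate when the expected loss clearly outweighs the cost (1.5x margin)
    if risk_emv > mitigation_cost * 1.5:
        return "implement mitigation"
    # Otherwise, impacts above the tolerance threshold still warrant partial action
    if risk_impact > tolerance_threshold:
        return "consider partial mitigation"
    return "accept risk"


# Hypothetical risks, amounts in dollars
print(mitigation_decision(140_000, 60_000, 500_000, 250_000))
print(mitigation_decision(40_000, 60_000, 500_000, 250_000))
print(mitigation_decision(40_000, 60_000, 100_000, 250_000))
```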
**Mitigation Effectiveness Factors:**
- Cost efficiency: Mitigation cost ÷ Risk EMV reduction
- Implementation feasibility: Resource availability and timeline
- Residual risk: Remaining risk after mitigation
#### 3. Transfer (Share Risk with Others)
**Transfer Mechanisms:**
- Insurance: For predictable, quantifiable risks
- Contracts: Fixed-price contracts transfer cost risk to vendors
- Partnerships: Share both risks and rewards
- Outsourcing: Transfer operational risks to specialists
**Transfer Decision Matrix:**
| Risk Type | Transfer Mechanism | Cost Efficiency | Risk Retention |
|-----------|-------------------|-----------------|----------------|
| Technical | Fixed-price contract | High | Low |
| Schedule | Penalty clauses | Medium | Medium |
| Market | Revenue sharing | Low | High |
| Operational | Insurance/SLA | High | Low |
#### 4. Accept (Acknowledge and Monitor)
**Acceptance Criteria:**
- Low impact + Low probability risks
- Mitigation cost > Risk EMV
- Risk within established tolerance thresholds
**Active Acceptance:** Establish contingency reserves and response plans
**Passive Acceptance:** Monitor but take no proactive action
---
## Risk Monitoring & Key Performance Indicators
### Risk Health Metrics
#### 1. Portfolio Risk Exposure Trends
```
Risk Velocity = (New Risks Added - Risks Resolved) / Time Period
Risk Burn Rate = Total Risk EMV Reduction / Time Period
Risk Coverage Ratio = Mitigation Budget / Total Risk EMV
```
#### 2. Risk Response Effectiveness
```
Mitigation Success Rate = Risks Successfully Mitigated / Total Mitigation Attempts
Average Resolution Time = Σ(Risk Resolution Days) / Number of Resolved Risks
Cost of Risk Management = Total Risk Management Spend / Project Budget
```
#### 3. Leading vs. Lagging Indicators
**Leading Indicators (Predictive):**
- Resource utilization trends
- Stakeholder satisfaction scores
- Technical debt accumulation
- Team velocity variance
- Budget burn rate vs. planned
**Lagging Indicators (Confirmatory):**
- Actual schedule delays
- Budget overruns
- Quality defect rates
- Stakeholder complaints
- Team turnover events
### Risk Dashboard Design
**Executive Level (Strategic View):**
- Portfolio risk heat map
- Top 10 risks by EMV
- Risk appetite vs. actual exposure
- Risk-adjusted project ROI
**Program Level (Tactical View):**
- Risk trend analysis
- Mitigation plan status
- Resource allocation for risk management
- Cross-project risk correlations
**Project Level (Operational View):**
- Individual risk register
- Risk response action items
- Risk probability/impact changes
- Mitigation cost tracking
---
## Integration with Portfolio Management
### Strategic Risk Alignment
**Risk-Adjusted Portfolio Optimization:**
1. **Risk-Return Analysis:** Plot projects on risk vs. return matrix
2. **Portfolio Diversification:** Balance high-risk/high-reward with stable projects
3. **Resource Allocation:** Allocate risk management resources based on EMV
4. **Strategic Fit:** Ensure risk appetite aligns with strategic objectives
**Capital Allocation Models:**
```
Risk-Adjusted NPV = Standard NPV × Risk Adjustment Factor
Risk Adjustment Factor = 1 - (Project Risk Score × Risk Penalty Rate)
Where Risk Penalty Rate reflects organization's risk aversion:
- Conservative: 0.8% per risk score point
- Moderate: 0.5% per risk score point
- Aggressive: 0.2% per risk score point
```
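The capital allocation model above can be sketched in Python as follows. The penalty rates come from the list above, expressed as decimals; the function name and the sample project are hypothetical.

```python
# Penalty per risk score point, by organizational risk appetite
PENALTY_RATES = {
    "conservative": 0.008,
    "moderate": 0.005,
    "aggressive": 0.002,
}


def risk_adjusted_npv(npv: float, risk_score: float, appetite: str) -> float:
    """Risk-Adjusted NPV = NPV x (1 - risk score x penalty rate)."""
    if appetite not in PENALTY_RATES:
        raise ValueError(f"unknown risk appetite: {appetite!r}")
    adjustment_factor = 1 - risk_score * PENALTY_RATES[appetite]
    return npv * adjustment_factor


# Hypothetical project: $1M standard NPV, risk score of 12
for appetite in PENALTY_RATES:
    adjusted = risk_adjusted_npv(1_000_000, 12, appetite)
    print(f"{appetite:>12}: ${adjusted:,.0f}")
```

The adjustment is linear, so a very high risk score can push the factor toward or below zero. Organizations using this model may want to cap the penalty, which is a design decision outside what the formula specifies.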
### Governance Integration
**Risk Committee Structure:**
- **Executive Risk Committee:** Monthly, strategic risks >$1M impact
- **Portfolio Risk Board:** Bi-weekly, cross-project risks
- **Project Risk Teams:** Weekly, operational risk management
**Escalation Triggers:**
- Risk EMV exceeds defined thresholds
- Risk probability or impact significantly changes
- Mitigation plans fail or become ineffective
- New risk categories emerge
**Decision Authority Matrix:**
| Risk EMV Level | Authority Level | Response Time | Required Documentation |
|----------------|-----------------|---------------|------------------------|
| <$50K | Project Manager | 24 hours | Risk register update |
| $50K-$250K | Program Manager | 48 hours | Risk assessment report |
| $250K-$1M | Business Owner | 72 hours | Executive summary + options |
| >$1M | Executive Committee | 1 week | Full risk analysis + recommendation |
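The matrix can be encoded as a simple lookup so that escalation routing is applied consistently. The sketch below is one possible encoding; the names are hypothetical, and because the table does not say which row owns an exact boundary value, it assumes a boundary amount escalates to the higher authority.

```python
from typing import NamedTuple


class Escalation(NamedTuple):
    authority: str
    response_time: str
    documentation: str


# (exclusive upper EMV bound in dollars, escalation details), lowest tier first
_TIERS = [
    (50_000, Escalation("Project Manager", "24 hours", "Risk register update")),
    (250_000, Escalation("Program Manager", "48 hours", "Risk assessment report")),
    (1_000_000, Escalation("Business Owner", "72 hours", "Executive summary + options")),
]
_TOP_TIER = Escalation(
    "Executive Committee", "1 week", "Full risk analysis + recommendation"
)


def escalation_for(emv):
    """Return the decision authority for a risk with the given EMV."""
    for upper_bound, tier in _TIERS:
        if emv < upper_bound:
            return tier
    return _TOP_TIER  # Assumption: boundary values escalate upward


print(escalation_for(120_000).authority)
```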
---
## Advanced Topics
### Behavioral Risk Factors
**Cognitive Biases in Risk Assessment:**
- **Optimism Bias:** Tendency to underestimate risk probability
- **Anchoring Bias:** Over-reliance on first information received
- **Availability Heuristic:** Overweighting easily recalled risks
- **Confirmation Bias:** Seeking information that confirms existing beliefs
**Bias Mitigation Techniques:**
- Independent risk assessments from multiple sources
- Devil's advocate roles in risk sessions
- Cross-checking expert judgment against historical data
- Pre-mortem analysis: "How could this project fail?"
### Emerging Risk Categories
**Digital Transformation Risks:**
- Data privacy and cybersecurity (GDPR, CCPA compliance)
- Legacy system integration complexity
- Change management and user adoption
- Cloud migration and vendor lock-in
**Regulatory and Compliance Risks:**
- Changing regulatory landscape
- Cross-border data transfer restrictions
- Industry-specific compliance requirements
- Audit and documentation requirements
**Sustainability and ESG Risks:**
- Environmental impact assessments
- Social responsibility requirements
- Governance and ethical considerations
- Long-term sustainability of solutions
---
## Implementation Guidelines
### Risk Framework Maturity Model
**Level 1 - Basic (Ad Hoc):**
- Qualitative risk identification
- Simple probability/impact matrices
- Reactive risk response
- Project-level focus only
**Level 2 - Managed (Repeatable):**
- Standardized risk processes
- Quantitative risk analysis
- Proactive mitigation planning
- Portfolio-level risk aggregation
**Level 3 - Defined (Systematic):**
- Enterprise risk integration
- Monte Carlo simulation
- Risk-adjusted decision making
- Cross-functional risk management
**Level 4 - Advanced (Quantitative):**
- Real-time risk monitoring
- Predictive risk analytics
- Automated risk reporting
- Strategic risk optimization
**Level 5 - Optimizing (Continuous Improvement):**
- AI-enhanced risk prediction
- Dynamic risk response
- Industry benchmark integration
- Continuous framework evolution
### Getting Started: 90-Day Implementation Plan
**Days 1-30: Foundation**
- Assess current risk management maturity
- Define risk appetite and tolerance levels
- Establish risk governance structure
- Train core team on quantitative methods
**Days 31-60: Tools & Processes**
- Implement EMV and Monte Carlo tools
- Create risk dashboard templates
- Establish risk register standards
- Begin historical data collection
**Days 61-90: Integration & Optimization**
- Integrate with portfolio management
- Establish reporting rhythms
- Conduct first portfolio risk review
- Plan continuous improvement initiatives
---
*This framework should be adapted to organizational context, industry requirements, and project complexity. Regular updates should incorporate lessons learned and emerging best practices.*
FILE:scripts/project_health_dashboard.py
#!/usr/bin/env python3
"""
Project Health Dashboard
Aggregates project metrics across timeline, budget, scope, and quality dimensions.
Calculates composite health scores, generates RAG (Red/Amber/Green) status reports,
and identifies projects needing intervention for portfolio management.
Usage:
python project_health_dashboard.py portfolio_data.json
python project_health_dashboard.py portfolio_data.json --format json
"""
import argparse
import json
import statistics
import sys
from datetime import datetime
from typing import Any, Dict, List, Optional
# ---------------------------------------------------------------------------
# Health Assessment Configuration
# ---------------------------------------------------------------------------
HEALTH_DIMENSIONS = {
"timeline": {
"weight": 0.25,
"thresholds": {
"green": {"min": 0.0, "max": 0.05}, # ≤5% delay
"amber": {"min": 0.05, "max": 0.15}, # 5-15% delay
"red": {"min": 0.15, "max": 1.0} # >15% delay
}
},
"budget": {
"weight": 0.25,
"thresholds": {
"green": {"min": 0.0, "max": 0.05}, # ≤5% over budget
"amber": {"min": 0.05, "max": 0.15}, # 5-15% over budget
"red": {"min": 0.15, "max": 1.0} # >15% over budget
}
},
"scope": {
"weight": 0.20,
"thresholds": {
"green": {"min": 0.90, "max": 1.0}, # 90-100% scope delivered
"amber": {"min": 0.75, "max": 0.90}, # 75-90% scope delivered
"red": {"min": 0.0, "max": 0.75} # <75% scope delivered
}
},
"quality": {
"weight": 0.20,
"thresholds": {
            "green": {"min": 0.95, "max": 1.0},   # quality score ≥ 0.95
            "amber": {"min": 0.85, "max": 0.95},  # quality score 0.85-0.95
            "red": {"min": 0.0, "max": 0.85}      # quality score < 0.85
}
},
"risk": {
"weight": 0.10,
"thresholds": {
"green": {"min": 0.0, "max": 15}, # Low risk score
"amber": {"min": 15, "max": 25}, # Medium risk score
"red": {"min": 25, "max": 100} # High risk score
}
}
}
PROJECT_STATUS_MAPPING = {
"planning": ["planning", "initiation", "chartered"],
"active": ["active", "in_progress", "execution", "development"],
"monitoring": ["monitoring", "testing", "review"],
"completed": ["completed", "delivered", "closed"],
"cancelled": ["cancelled", "terminated", "suspended"],
"on_hold": ["on_hold", "paused", "blocked"]
}
PRIORITY_WEIGHTS = {
"critical": 1.5,
"high": 1.2,
"medium": 1.0,
"low": 0.8
}
INTERVENTION_THRESHOLDS = {
"immediate": 30, # Health score ≤30
"urgent": 50, # Health score ≤50
"monitor": 70 # Health score ≤70
}
# ---------------------------------------------------------------------------
# Data Models
# ---------------------------------------------------------------------------
class ProjectMetrics:
"""Represents project health metrics and calculations."""
def __init__(self, data: Dict[str, Any]):
self.project_id: str = data.get("project_id", "")
self.project_name: str = data.get("project_name", "")
self.priority: str = data.get("priority", "medium").lower()
self.status: str = data.get("status", "planning").lower()
self.phase: str = data.get("phase", "planning")
# Timeline metrics
self.planned_start: str = data.get("planned_start", "")
self.actual_start: Optional[str] = data.get("actual_start")
self.planned_end: str = data.get("planned_end", "")
self.forecasted_end: str = data.get("forecasted_end", "")
self.completion_percentage: float = max(0, min(100, data.get("completion_percentage", 0))) / 100
# Budget metrics
self.planned_budget: float = data.get("planned_budget", 0)
self.spent_to_date: float = data.get("spent_to_date", 0)
self.forecasted_total_cost: float = data.get("forecasted_total_cost", 0)
# Scope metrics
self.planned_features: int = data.get("planned_features", 0)
self.completed_features: int = data.get("completed_features", 0)
self.descoped_features: int = data.get("descoped_features", 0)
self.added_features: int = data.get("added_features", 0)
# Quality metrics
self.total_defects: int = data.get("total_defects", 0)
self.resolved_defects: int = data.get("resolved_defects", 0)
self.critical_defects: int = data.get("critical_defects", 0)
self.test_coverage: float = max(0, min(1, data.get("test_coverage", 0)))
# Risk metrics
self.risk_score: float = data.get("risk_score", 0)
self.open_risks: int = data.get("open_risks", 0)
self.critical_risks: int = data.get("critical_risks", 0)
# Team metrics
self.team_size: int = data.get("team_size", 0)
self.team_utilization: float = data.get("team_utilization", 0)
self.team_satisfaction: Optional[float] = data.get("team_satisfaction")
# Stakeholder metrics
self.stakeholder_satisfaction: Optional[float] = data.get("stakeholder_satisfaction")
self.last_status_update: str = data.get("last_status_update", "")
# Calculate derived metrics
self._calculate_health_metrics()
self._normalize_status()
def _calculate_health_metrics(self):
"""Calculate normalized health metrics for each dimension."""
# Timeline health (0 = on time, 1 = severely delayed)
self.timeline_health = self._calculate_timeline_variance()
# Budget health (0 = on budget, 1 = severely over budget)
self.budget_health = self._calculate_budget_variance()
# Scope health (0 = no scope delivered, 1 = full scope delivered)
self.scope_health = self._calculate_scope_completion()
# Quality health (0 = poor quality, 1 = excellent quality)
self.quality_health = self._calculate_quality_score()
        # Risk health (raw risk score clamped to 0-100; lower is better)
        self.risk_health = max(0, min(self.risk_score, 100))
def _calculate_timeline_variance(self) -> float:
"""Calculate timeline variance as percentage of planned duration."""
if not self.planned_start or not self.planned_end:
return 0.0
try:
planned_start = datetime.strptime(self.planned_start, "%Y-%m-%d")
planned_end = datetime.strptime(self.planned_end, "%Y-%m-%d")
planned_duration = (planned_end - planned_start).days
if planned_duration <= 0:
return 0.0
# Use forecasted end if available, otherwise current date for active projects
if self.forecasted_end:
forecast_date = datetime.strptime(self.forecasted_end, "%Y-%m-%d")
            elif self.status in PROJECT_STATUS_MAPPING["completed"] + PROJECT_STATUS_MAPPING["cancelled"]:
return 0.0 # Project is done
else:
forecast_date = datetime.now()
actual_duration = (forecast_date - planned_start).days
variance = max(0, actual_duration - planned_duration) / planned_duration
return min(variance, 1.0) # Cap at 100% delay
except (ValueError, ZeroDivisionError):
return 0.0
def _calculate_budget_variance(self) -> float:
"""Calculate budget variance as percentage over original budget."""
if self.planned_budget <= 0:
return 0.0
# Use forecasted total cost if available, otherwise spent to date
actual_cost = self.forecasted_total_cost or self.spent_to_date
variance = max(0, actual_cost - self.planned_budget) / self.planned_budget
return min(variance, 1.0) # Cap at 100% over budget
def _calculate_scope_completion(self) -> float:
"""Calculate scope completion percentage."""
if self.planned_features <= 0:
return 1.0 # No planned features, consider complete
# Account for scope changes
effective_planned = self.planned_features + self.added_features - self.descoped_features
if effective_planned <= 0:
return 1.0
        # Clamp to 0-1 so over-delivery cannot push the dimension score above 100
        return max(0.0, min(self.completed_features / effective_planned, 1.0))
def _calculate_quality_score(self) -> float:
"""Calculate quality score based on defects and test coverage."""
if self.total_defects == 0:
defect_score = 1.0
else:
resolution_rate = self.resolved_defects / self.total_defects
critical_penalty = self.critical_defects / max(self.total_defects, 1)
defect_score = resolution_rate * (1 - critical_penalty * 0.5)
# Combine defect score with test coverage
quality_score = (defect_score * 0.7) + (self.test_coverage * 0.3)
return max(0, min(1, quality_score))
def _normalize_status(self):
"""Normalize project status to standard categories."""
status_lower = self.status.lower()
for category, statuses in PROJECT_STATUS_MAPPING.items():
if status_lower in statuses:
self.normalized_status = category
return
self.normalized_status = "active" # Default
@property
def is_active(self) -> bool:
return self.normalized_status in ["planning", "active", "monitoring"]
@property
def requires_intervention(self) -> bool:
health_score = self.calculate_composite_health_score()
return health_score <= INTERVENTION_THRESHOLDS["urgent"] and self.is_active
class PortfolioHealthResult:
"""Complete portfolio health analysis results."""
def __init__(self):
self.summary: Dict[str, Any] = {}
self.project_scores: List[Dict[str, Any]] = []
self.dimension_analysis: Dict[str, Any] = {}
self.rag_status: Dict[str, Any] = {}
self.intervention_list: List[Dict[str, Any]] = []
self.portfolio_trends: Dict[str, Any] = {}
self.recommendations: List[str] = []
# ---------------------------------------------------------------------------
# Health Calculation Functions
# ---------------------------------------------------------------------------
def calculate_dimension_score(value: float, dimension: str, is_reverse: bool = False) -> int:
"""Calculate dimension score (0-100) based on thresholds."""
config = HEALTH_DIMENSIONS[dimension]
thresholds = config["thresholds"]
if not is_reverse:
# Lower values are better (timeline, budget, risk)
if value <= thresholds["green"]["max"]:
return 90 + int((1 - value / thresholds["green"]["max"]) * 10)
elif value <= thresholds["amber"]["max"]:
range_size = thresholds["amber"]["max"] - thresholds["amber"]["min"]
position = (value - thresholds["amber"]["min"]) / range_size
return 60 + int((1 - position) * 30)
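        else:
            # Red zone: scale the distance past the red threshold by the width of
            # the red band, so 0-1 ratios and the 0-100 risk score behave alike
            red = thresholds["red"]
            range_size = red["max"] - red["min"]
            position = min((value - red["min"]) / range_size, 1.0) if range_size > 0 else 1.0
            return max(10, 60 - int(position * 50))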
else:
# Higher values are better (scope, quality)
if value >= thresholds["green"]["min"]:
range_size = thresholds["green"]["max"] - thresholds["green"]["min"]
position = (value - thresholds["green"]["min"]) / range_size if range_size > 0 else 1
return 90 + int(position * 10)
elif value >= thresholds["amber"]["min"]:
range_size = thresholds["amber"]["max"] - thresholds["amber"]["min"]
position = (value - thresholds["amber"]["min"]) / range_size
return 60 + int(position * 30)
else:
# Red zone
if thresholds["red"]["max"] > 0:
position = value / thresholds["red"]["max"]
return max(10, int(position * 60))
else:
return 10
def calculate_project_health_score(project: ProjectMetrics) -> Dict[str, Any]:
"""Calculate comprehensive health score for a project."""
# Calculate individual dimension scores
timeline_score = calculate_dimension_score(project.timeline_health, "timeline")
budget_score = calculate_dimension_score(project.budget_health, "budget")
scope_score = calculate_dimension_score(project.scope_health, "scope", is_reverse=True)
quality_score = calculate_dimension_score(project.quality_health, "quality", is_reverse=True)
risk_score = calculate_dimension_score(project.risk_health, "risk")
# Calculate weighted composite score
dimensions = {
"timeline": {"score": timeline_score, "weight": HEALTH_DIMENSIONS["timeline"]["weight"]},
"budget": {"score": budget_score, "weight": HEALTH_DIMENSIONS["budget"]["weight"]},
"scope": {"score": scope_score, "weight": HEALTH_DIMENSIONS["scope"]["weight"]},
"quality": {"score": quality_score, "weight": HEALTH_DIMENSIONS["quality"]["weight"]},
"risk": {"score": risk_score, "weight": HEALTH_DIMENSIONS["risk"]["weight"]}
}
composite_score = sum(
dim_data["score"] * dim_data["weight"]
for dim_data in dimensions.values()
)
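    # Apply priority weighting: dividing by the weight lowers the adjusted score of
    # high-priority projects, so they sort ahead of equally healthy low-priority ones
    priority_weight = PRIORITY_WEIGHTS.get(project.priority, 1.0)
    adjusted_score = composite_score / priority_weight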
# Determine RAG status
if composite_score >= 80:
rag_status = "green"
elif composite_score >= 60:
rag_status = "amber"
else:
rag_status = "red"
# Determine intervention level
if composite_score <= INTERVENTION_THRESHOLDS["immediate"]:
intervention_level = "immediate"
elif composite_score <= INTERVENTION_THRESHOLDS["urgent"]:
intervention_level = "urgent"
elif composite_score <= INTERVENTION_THRESHOLDS["monitor"]:
intervention_level = "monitor"
else:
intervention_level = "none"
return {
"project_id": project.project_id,
"project_name": project.project_name,
"composite_score": composite_score,
"adjusted_score": adjusted_score,
"rag_status": rag_status,
"intervention_level": intervention_level,
"dimension_scores": dimensions,
"priority": project.priority,
"status": project.status,
"completion_percentage": project.completion_percentage
}
def analyze_portfolio_dimensions(project_scores: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Analyze portfolio performance across health dimensions."""
dimension_analysis = {}
for dimension in HEALTH_DIMENSIONS.keys():
scores = [
project["dimension_scores"][dimension]["score"]
for project in project_scores
]
if scores:
dimension_analysis[dimension] = {
"average_score": statistics.mean(scores),
"median_score": statistics.median(scores),
"min_score": min(scores),
"max_score": max(scores),
"std_deviation": statistics.stdev(scores) if len(scores) > 1 else 0,
"projects_below_60": len([s for s in scores if s < 60]),
"projects_above_80": len([s for s in scores if s >= 80])
}
# Identify weakest and strongest dimensions
avg_scores = {dim: data["average_score"] for dim, data in dimension_analysis.items()}
weakest_dimension = min(avg_scores.keys(), key=lambda k: avg_scores[k])
strongest_dimension = max(avg_scores.keys(), key=lambda k: avg_scores[k])
return {
"dimension_statistics": dimension_analysis,
"weakest_dimension": weakest_dimension,
"strongest_dimension": strongest_dimension,
"dimension_rankings": sorted(avg_scores.items(), key=lambda x: x[1], reverse=True)
}
def generate_rag_status_summary(project_scores: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Generate RAG status summary for portfolio."""
rag_counts = {"green": 0, "amber": 0, "red": 0}
# Count by RAG status
for project in project_scores:
rag_status = project["rag_status"]
rag_counts[rag_status] += 1
total_projects = len(project_scores)
# Calculate percentages
rag_percentages = {
status: (count / max(total_projects, 1)) * 100
for status, count in rag_counts.items()
}
# Categorize projects by status
green_projects = [p for p in project_scores if p["rag_status"] == "green"]
amber_projects = [p for p in project_scores if p["rag_status"] == "amber"]
red_projects = [p for p in project_scores if p["rag_status"] == "red"]
# Calculate portfolio health grade
if rag_percentages["red"] > 30:
portfolio_grade = "critical"
elif rag_percentages["red"] > 15 or rag_percentages["amber"] > 50:
portfolio_grade = "concerning"
elif rag_percentages["green"] > 60:
portfolio_grade = "healthy"
else:
portfolio_grade = "moderate"
return {
"rag_counts": rag_counts,
"rag_percentages": rag_percentages,
"portfolio_grade": portfolio_grade,
"green_projects": [{"id": p["project_id"], "name": p["project_name"], "score": p["composite_score"]} for p in green_projects],
"amber_projects": [{"id": p["project_id"], "name": p["project_name"], "score": p["composite_score"]} for p in amber_projects],
"red_projects": [{"id": p["project_id"], "name": p["project_name"], "score": p["composite_score"]} for p in red_projects]
}
def identify_intervention_priorities(project_scores: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Identify projects requiring intervention, prioritized by urgency and impact."""
intervention_projects = [
p for p in project_scores
if p["intervention_level"] in ["immediate", "urgent", "monitor"]
]
# Sort by intervention level and then by adjusted score (priority-weighted)
intervention_priority = {"immediate": 3, "urgent": 2, "monitor": 1}
intervention_projects.sort(
key=lambda p: (
intervention_priority[p["intervention_level"]],
-p["adjusted_score"] # Lower scores need more urgent attention
),
reverse=True
)
# Add recommended actions based on weakest dimensions
for project in intervention_projects:
project["recommended_actions"] = _generate_project_recommendations(project)
project["risk_factors"] = _identify_risk_factors(project)
return intervention_projects
def _generate_project_recommendations(project: Dict[str, Any]) -> List[str]:
"""Generate specific recommendations based on project's weak dimensions."""
recommendations = []
dimension_scores = project["dimension_scores"]
# Timeline recommendations
if dimension_scores["timeline"]["score"] < 60:
recommendations.append("Conduct timeline recovery analysis and implement fast-tracking or crashing strategies")
# Budget recommendations
if dimension_scores["budget"]["score"] < 60:
recommendations.append("Implement cost control measures and review budget forecasts")
# Scope recommendations
if dimension_scores["scope"]["score"] < 60:
recommendations.append("Review scope management and consider feature prioritization or descoping")
# Quality recommendations
if dimension_scores["quality"]["score"] < 60:
recommendations.append("Increase testing coverage and implement quality improvement processes")
# Risk recommendations
if dimension_scores["risk"]["score"] < 60:
recommendations.append("Escalate critical risks and implement additional risk mitigation measures")
# Overall health recommendations
if project["composite_score"] < 40:
recommendations.append("Consider project restructuring or emergency stakeholder review")
return recommendations
def _identify_risk_factors(project: Dict[str, Any]) -> List[str]:
"""Identify specific risk factors for a project."""
risk_factors = []
if project["composite_score"] < 30:
risk_factors.append("Critical project failure risk")
if project["intervention_level"] == "immediate":
risk_factors.append("Requires immediate management attention")
dimension_scores = project["dimension_scores"]
poor_dimensions = [
dim for dim, data in dimension_scores.items()
if data["score"] < 50
]
if len(poor_dimensions) > 2:
risk_factors.append(f"Multiple failing dimensions: {', '.join(poor_dimensions)}")
return risk_factors
def generate_portfolio_recommendations(analysis_results: Dict[str, Any]) -> List[str]:
"""Generate portfolio-level recommendations."""
recommendations = []
# RAG status recommendations
rag_status = analysis_results.get("rag_status", {})
red_percentage = rag_status.get("rag_percentages", {}).get("red", 0)
amber_percentage = rag_status.get("rag_percentages", {}).get("amber", 0)
if red_percentage > 30:
        recommendations.append("URGENT: More than 30% of projects are in red status. Consider portfolio restructuring or resource reallocation.")
elif red_percentage > 15:
recommendations.append("HIGH: Significant number of projects in red status require immediate attention.")
if amber_percentage > 50:
recommendations.append("MEDIUM: Over half of portfolio projects need monitoring and support.")
# Dimension-based recommendations
dimension_analysis = analysis_results.get("dimension_analysis", {})
weakest_dimension = dimension_analysis.get("weakest_dimension", "")
if weakest_dimension:
recommendations.append(f"Focus improvement efforts on {weakest_dimension} - weakest portfolio dimension.")
# Intervention recommendations
intervention_list = analysis_results.get("intervention_list", [])
immediate_count = len([p for p in intervention_list if p["intervention_level"] == "immediate"])
urgent_count = len([p for p in intervention_list if p["intervention_level"] == "urgent"])
if immediate_count > 0:
recommendations.append(f"CRITICAL: {immediate_count} projects require immediate intervention within 48 hours.")
if urgent_count > 3:
recommendations.append(f"Capacity alert: {urgent_count} projects need urgent attention - consider resource reallocation.")
# Portfolio health recommendations
portfolio_grade = rag_status.get("portfolio_grade", "")
if portfolio_grade == "critical":
recommendations.append("Portfolio health is critical. Recommend executive review and strategic realignment.")
elif portfolio_grade == "concerning":
recommendations.append("Portfolio health needs improvement. Implement enhanced monitoring and support.")
return recommendations
# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------
def analyze_portfolio_health(data: Dict[str, Any]) -> PortfolioHealthResult:
"""Perform comprehensive portfolio health analysis."""
result = PortfolioHealthResult()
try:
# Parse project data
project_records = data.get("projects", [])
projects = [ProjectMetrics(record) for record in project_records]
if not projects:
raise ValueError("No project data found")
# Calculate health scores for each project
project_scores = [calculate_project_health_score(project) for project in projects]
result.project_scores = project_scores
# Filter active projects for portfolio analysis
active_scores = [score for i, score in enumerate(project_scores) if projects[i].is_active]
# Portfolio summary
if active_scores:
composite_scores = [score["composite_score"] for score in active_scores]
result.summary = {
"total_projects": len(projects),
"active_projects": len(active_scores),
"portfolio_average_score": statistics.mean(composite_scores),
"portfolio_median_score": statistics.median(composite_scores),
"projects_needing_attention": len([s for s in active_scores if s["composite_score"] < 70]),
"critical_projects": len([s for s in active_scores if s["composite_score"] < 40])
}
else:
result.summary = {
"total_projects": len(projects),
"active_projects": 0,
"portfolio_average_score": 0,
"message": "No active projects found"
}
if active_scores:
# Dimension analysis
result.dimension_analysis = analyze_portfolio_dimensions(active_scores)
# RAG status analysis
result.rag_status = generate_rag_status_summary(active_scores)
# Intervention priorities
result.intervention_list = identify_intervention_priorities(active_scores)
# Generate recommendations
analysis_data = {
"rag_status": result.rag_status,
"dimension_analysis": result.dimension_analysis,
"intervention_list": result.intervention_list
}
result.recommendations = generate_portfolio_recommendations(analysis_data)
except Exception as e:
result.summary = {"error": str(e)}
return result
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text_output(result: PortfolioHealthResult) -> str:
"""Format analysis results as readable text report."""
lines = []
lines.append("="*60)
lines.append("PROJECT HEALTH DASHBOARD")
lines.append("="*60)
lines.append("")
if "error" in result.summary:
lines.append(f"ERROR: {result.summary['error']}")
return "\n".join(lines)
# Executive Summary
summary = result.summary
lines.append("PORTFOLIO OVERVIEW")
lines.append("-"*30)
lines.append(f"Total Projects: {summary['total_projects']} ({summary.get('active_projects', 0)} active)")
if "portfolio_average_score" in summary:
lines.append(f"Portfolio Health Score: {summary['portfolio_average_score']:.1f}/100")
lines.append(f"Projects Needing Attention: {summary.get('projects_needing_attention', 0)}")
lines.append(f"Critical Projects: {summary.get('critical_projects', 0)}")
if "message" in summary:
lines.append(f"Status: {summary['message']}")
lines.append("")
# RAG Status Summary
rag_status = result.rag_status
if rag_status:
lines.append("RAG STATUS SUMMARY")
lines.append("-"*30)
rag_counts = rag_status.get("rag_counts", {})
rag_percentages = rag_status.get("rag_percentages", {})
lines.append(f"🟢 Green: {rag_counts.get('green', 0)} ({rag_percentages.get('green', 0):.1f}%)")
lines.append(f"🟡 Amber: {rag_counts.get('amber', 0)} ({rag_percentages.get('amber', 0):.1f}%)")
lines.append(f"🔴 Red: {rag_counts.get('red', 0)} ({rag_percentages.get('red', 0):.1f}%)")
lines.append(f"Portfolio Grade: {rag_status.get('portfolio_grade', 'N/A').title()}")
lines.append("")
# Dimension Analysis
dimension_analysis = result.dimension_analysis
if dimension_analysis:
lines.append("HEALTH DIMENSION ANALYSIS")
lines.append("-"*30)
dimension_stats = dimension_analysis.get("dimension_statistics", {})
for dimension, stats in dimension_stats.items():
lines.append(f"{dimension.title()}: {stats['average_score']:.1f} avg "
f"({stats['projects_below_60']} below 60, {stats['projects_above_80']} above 80)")
lines.append(f"Strongest: {dimension_analysis.get('strongest_dimension', '').title()}")
lines.append(f"Weakest: {dimension_analysis.get('weakest_dimension', '').title()}")
lines.append("")
# Critical Projects Needing Intervention
intervention_list = result.intervention_list
if intervention_list:
lines.append("PROJECTS REQUIRING INTERVENTION")
lines.append("-"*30)
immediate_projects = [p for p in intervention_list if p["intervention_level"] == "immediate"]
urgent_projects = [p for p in intervention_list if p["intervention_level"] == "urgent"]
if immediate_projects:
lines.append("🚨 IMMEDIATE ACTION REQUIRED:")
for project in immediate_projects[:5]:
lines.append(f" • {project['project_name']} (Score: {project['composite_score']:.0f})")
if project.get("recommended_actions"):
lines.append(f" → {project['recommended_actions'][0]}")
lines.append("")
if urgent_projects:
lines.append("⚠️ URGENT ATTENTION NEEDED:")
for project in urgent_projects[:5]:
lines.append(f" • {project['project_name']} (Score: {project['composite_score']:.0f})")
lines.append("")
# Top Performing Projects
if result.project_scores:
top_projects = sorted(result.project_scores, key=lambda p: p["composite_score"], reverse=True)[:5]
lines.append("TOP PERFORMING PROJECTS")
lines.append("-"*30)
for project in top_projects:
status_emoji = {"green": "🟢", "amber": "🟡", "red": "🔴"}.get(project["rag_status"], "⚫")
lines.append(f"{status_emoji} {project['project_name']}: {project['composite_score']:.0f}/100")
lines.append("")
# Recommendations
if result.recommendations:
lines.append("PORTFOLIO RECOMMENDATIONS")
lines.append("-"*30)
for i, rec in enumerate(result.recommendations, 1):
lines.append(f"{i}. {rec}")
return "\n".join(lines)
def format_json_output(result: PortfolioHealthResult) -> Dict[str, Any]:
"""Format analysis results as JSON."""
return {
"summary": result.summary,
"project_scores": result.project_scores,
"dimension_analysis": result.dimension_analysis,
"rag_status": result.rag_status,
"intervention_list": result.intervention_list,
"portfolio_trends": result.portfolio_trends,
"recommendations": result.recommendations
}
# ---------------------------------------------------------------------------
# ProjectMetrics Helper Method
# ---------------------------------------------------------------------------
def _calculate_composite_health_score(self) -> float:
"""Helper method to calculate composite health score."""
health_calc = calculate_project_health_score(self)
return health_calc["composite_score"]
# Attach the helper to the class so ProjectMetrics.requires_intervention can call it
ProjectMetrics.calculate_composite_health_score = _calculate_composite_health_score
# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------
def main() -> int:
"""Main CLI entry point."""
parser = argparse.ArgumentParser(
description="Analyze project portfolio health across multiple dimensions"
)
parser.add_argument(
"data_file",
help="JSON file containing project portfolio data"
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)"
)
args = parser.parse_args()
try:
# Load and validate data
        with open(args.data_file, 'r', encoding='utf-8') as f:
data = json.load(f)
# Perform analysis
result = analyze_portfolio_health(data)
# Output results
if args.format == "json":
output = format_json_output(result)
print(json.dumps(output, indent=2))
else:
output = format_text_output(result)
print(output)
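        # Signal failure to the shell when the analysis recorded an error
        return 1 if "error" in result.summary else 0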
except FileNotFoundError:
print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
return 1
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
return 1
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
return 1
if __name__ == "__main__":
sys.exit(main())
FILE:scripts/resource_capacity_planner.py
#!/usr/bin/env python3
"""
Resource Capacity Planner
Models team capacity across projects, identifies over/under-allocation, simulates
"what-if" scenarios for adding/removing resources, calculates utilization rates,
and provides capacity optimization recommendations for project portfolios.
Usage:
python resource_capacity_planner.py capacity_data.json
python resource_capacity_planner.py capacity_data.json --format json
"""
import argparse
import json
import statistics
import sys
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple, Union
# ---------------------------------------------------------------------------
# Capacity Planning Configuration
# ---------------------------------------------------------------------------
ROLE_TYPES = {
"senior_engineer": {
"hourly_rate": 150,
"efficiency_factor": 1.2,
"skill_multipliers": {
"backend": 1.0,
"frontend": 0.9,
"mobile": 0.8,
"devops": 1.1,
"data": 0.9
}
},
"mid_engineer": {
"hourly_rate": 100,
"efficiency_factor": 1.0,
"skill_multipliers": {
"backend": 1.0,
"frontend": 1.0,
"mobile": 0.9,
"devops": 0.8,
"data": 0.8
}
},
"junior_engineer": {
"hourly_rate": 70,
"efficiency_factor": 0.7,
"skill_multipliers": {
"backend": 0.8,
"frontend": 0.9,
"mobile": 0.7,
"devops": 0.6,
"data": 0.7
}
},
"product_manager": {
"hourly_rate": 130,
"efficiency_factor": 1.1,
"skill_multipliers": {
"planning": 1.0,
"stakeholder_mgmt": 1.0,
"analysis": 0.9
}
},
"designer": {
"hourly_rate": 90,
"efficiency_factor": 1.0,
"skill_multipliers": {
"ui_design": 1.0,
"ux_research": 1.0,
"prototyping": 0.9
}
},
"qa_engineer": {
"hourly_rate": 80,
"efficiency_factor": 0.9,
"skill_multipliers": {
"manual_testing": 1.0,
"automation": 1.1,
"performance": 0.9
}
}
}
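UTILIZATION_THRESHOLDS = {
"under_utilized": 0.60, # Up to and including 60%
"optimal": 0.85, # Above 60%, up to 85%
"over_utilized": 0.95, # Above 85%, up to 95%
"critical": 1.0 # Above 95% (anything past the over_utilized bound is treated as critical)
}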
CAPACITY_FACTORS = {
"meeting_overhead": 0.15, # 15% for meetings
"learning_development": 0.05, # 5% for skill development
"administrative": 0.10, # 10% for admin tasks
"context_switching": 0.05, # 5% for project switching penalty
"vacation_sick": 0.12 # 12% for time off
}
PROJECT_COMPLEXITY_FACTORS = {
"simple": 1.0,
"moderate": 1.2,
"complex": 1.5,
"very_complex": 2.0
}
# ---------------------------------------------------------------------------
# Data Models
# ---------------------------------------------------------------------------
class Resource:
"""Represents a team member with skills and capacity."""
def __init__(self, data: Dict[str, Any]):
self.id: str = data.get("id", "")
self.name: str = data.get("name", "")
self.role: str = data.get("role", "").lower()
self.skills: List[str] = data.get("skills", [])
self.skill_levels: Dict[str, float] = data.get("skill_levels", {})
self.hourly_rate: float = data.get("hourly_rate", 0)
self.max_hours_per_week: int = data.get("max_hours_per_week", 40)
self.current_utilization: float = data.get("current_utilization", 0.0)
self.availability_start: str = data.get("availability_start", "")
self.availability_end: Optional[str] = data.get("availability_end")
self.location: str = data.get("location", "")
self.time_zone: str = data.get("time_zone", "")
# Calculate derived metrics
self._calculate_effective_capacity()
self._determine_role_config()
def _calculate_effective_capacity(self):
"""Calculate effective weekly capacity accounting for overhead."""
base_capacity = self.max_hours_per_week
# Apply overhead factors
overhead_total = sum(CAPACITY_FACTORS.values())
self.effective_hours_per_week = base_capacity * (1 - overhead_total)
# Current available capacity
self.available_hours = self.effective_hours_per_week * (1 - self.current_utilization)
def _determine_role_config(self):
"""Get role configuration from predefined types."""
self.role_config = ROLE_TYPES.get(self.role, {
"hourly_rate": self.hourly_rate or 100,
"efficiency_factor": 1.0,
"skill_multipliers": {}
})
# Use provided rate if available, otherwise use role default
if not self.hourly_rate:
self.hourly_rate = self.role_config["hourly_rate"]
def get_skill_effectiveness(self, skill: str) -> float:
"""Calculate effectiveness for a specific skill."""
base_level = self.skill_levels.get(skill, 0.5) # Default 50% if not specified
multiplier = self.role_config.get("skill_multipliers", {}).get(skill, 1.0)
efficiency = self.role_config.get("efficiency_factor", 1.0)
return base_level * multiplier * efficiency
def can_work_on_project(self, project_skills: List[str], min_effectiveness: float = 0.6) -> bool:
"""Check if resource can effectively work on project."""
for skill in project_skills:
if skill in self.skills and self.get_skill_effectiveness(skill) >= min_effectiveness:
return True
return False
class Project:
"""Represents a project with resource requirements."""
def __init__(self, data: Dict[str, Any]):
self.id: str = data.get("id", "")
self.name: str = data.get("name", "")
self.priority: str = data.get("priority", "medium").lower()
self.complexity: str = data.get("complexity", "moderate").lower()
self.estimated_hours: int = data.get("estimated_hours", 0)
self.start_date: str = data.get("start_date", "")
self.target_end_date: str = data.get("target_end_date", "")
self.required_skills: List[str] = data.get("required_skills", [])
self.skill_requirements: Dict[str, int] = data.get("skill_requirements", {})
self.current_allocation: List[Dict[str, Any]] = data.get("current_allocation", [])
self.status: str = data.get("status", "planned").lower()
# Calculate derived metrics
self._calculate_project_metrics()
def _calculate_project_metrics(self):
"""Calculate project-specific metrics."""
# Apply complexity factor
complexity_multiplier = PROJECT_COMPLEXITY_FACTORS.get(self.complexity, 1.0)
self.adjusted_hours = self.estimated_hours * complexity_multiplier
# Calculate current allocation
self.currently_allocated_hours = sum(
alloc.get("hours_per_week", 0) for alloc in self.current_allocation
)
# Calculate timeline metrics
if self.start_date and self.target_end_date:
try:
start = datetime.strptime(self.start_date, "%Y-%m-%d")
end = datetime.strptime(self.target_end_date, "%Y-%m-%d")
self.duration_weeks = (end - start).days / 7
# Required weekly capacity
if self.duration_weeks > 0:
self.required_hours_per_week = self.adjusted_hours / self.duration_weeks
else:
self.required_hours_per_week = self.adjusted_hours
except ValueError:
self.duration_weeks = 0
self.required_hours_per_week = 0
else:
self.duration_weeks = 0
self.required_hours_per_week = 0
# Capacity gap
self.capacity_gap = self.required_hours_per_week - self.currently_allocated_hours
class CapacityAnalysisResult:
"""Complete capacity analysis results."""
def __init__(self):
self.summary: Dict[str, Any] = {}
self.resource_analysis: Dict[str, Any] = {}
self.project_analysis: Dict[str, Any] = {}
self.allocation_optimization: Dict[str, Any] = {}
self.scenario_analysis: Dict[str, Any] = {}
self.recommendations: List[str] = []
# ---------------------------------------------------------------------------
# Capacity Analysis Functions
# ---------------------------------------------------------------------------
def analyze_resource_utilization(resources: List[Resource]) -> Dict[str, Any]:
"""Analyze current resource utilization and capacity."""
utilization_stats = {
"total_resources": len(resources),
"total_capacity": sum(r.effective_hours_per_week for r in resources),
"total_allocated": sum(r.effective_hours_per_week * r.current_utilization for r in resources),
"total_available": sum(r.available_hours for r in resources)
}
# Calculate overall utilization
utilization_stats["overall_utilization"] = (
utilization_stats["total_allocated"] / max(utilization_stats["total_capacity"], 1)
)
# Categorize resources by utilization
utilization_categories = {
"under_utilized": [],
"optimal": [],
"over_utilized": [],
"critical": []
}
for resource in resources:
if resource.current_utilization <= UTILIZATION_THRESHOLDS["under_utilized"]:
utilization_categories["under_utilized"].append(resource)
elif resource.current_utilization <= UTILIZATION_THRESHOLDS["optimal"]:
utilization_categories["optimal"].append(resource)
elif resource.current_utilization <= UTILIZATION_THRESHOLDS["over_utilized"]:
utilization_categories["over_utilized"].append(resource)
else:
utilization_categories["critical"].append(resource)
# Role-based analysis
role_analysis = {}
for resource in resources:
if resource.role not in role_analysis:
role_analysis[resource.role] = {
"count": 0,
"total_capacity": 0,
"average_utilization": 0,
"available_hours": 0,
"hourly_cost": 0
}
role_data = role_analysis[resource.role]
role_data["count"] += 1
role_data["total_capacity"] += resource.effective_hours_per_week
role_data["available_hours"] += resource.available_hours
role_data["hourly_cost"] += resource.hourly_rate
# Calculate averages for roles
for role in role_analysis:
role_data = role_analysis[role]
role_data["average_utilization"] = 1 - (role_data["available_hours"] / max(role_data["total_capacity"], 1))
role_data["average_hourly_rate"] = role_data["hourly_cost"] / role_data["count"]
return {
"utilization_stats": utilization_stats,
"utilization_categories": {
k: [{"id": r.id, "name": r.name, "role": r.role, "utilization": r.current_utilization}
for r in v]
for k, v in utilization_categories.items()
},
"role_analysis": role_analysis,
"capacity_alerts": _generate_capacity_alerts(utilization_categories)
}
def analyze_project_capacity_requirements(projects: List[Project]) -> Dict[str, Any]:
"""Analyze project capacity requirements and gaps."""
project_stats = {
"total_projects": len(projects),
"active_projects": len([p for p in projects if p.status in ["active", "in_progress"]]),
"planned_projects": len([p for p in projects if p.status == "planned"]),
"total_estimated_hours": sum(p.adjusted_hours for p in projects),
"total_weekly_demand": sum(p.required_hours_per_week for p in projects if p.status != "completed")
}
# Project priority analysis
priority_distribution = {}
for priority in ["high", "medium", "low"]:
priority_projects = [p for p in projects if p.priority == priority]
priority_distribution[priority] = {
"count": len(priority_projects),
"total_hours": sum(p.adjusted_hours for p in priority_projects),
"weekly_demand": sum(p.required_hours_per_week for p in priority_projects if p.status != "completed")
}
# Capacity gap analysis
projects_with_gaps = [p for p in projects if p.capacity_gap > 0 and p.status != "completed"]
total_capacity_gap = sum(p.capacity_gap for p in projects_with_gaps)
# Skill demand analysis
skill_demand = {}
for project in projects:
if project.status != "completed":
for skill, hours in project.skill_requirements.items():
if skill not in skill_demand:
skill_demand[skill] = 0
skill_demand[skill] += hours
# Sort skills by demand
sorted_skill_demand = sorted(skill_demand.items(), key=lambda x: x[1], reverse=True)
return {
"project_stats": project_stats,
"priority_distribution": priority_distribution,
"capacity_gaps": {
"projects_with_gaps": len(projects_with_gaps),
"total_gap_hours_weekly": total_capacity_gap,
"gap_projects": [
{
"id": p.id,
"name": p.name,
"priority": p.priority,
"gap_hours": p.capacity_gap,
"required_skills": p.required_skills
}
for p in sorted(projects_with_gaps, key=lambda p: p.capacity_gap, reverse=True)[:10]
]
},
"skill_demand": dict(sorted_skill_demand[:10]) # Top 10 skills in demand
}
def optimize_resource_allocation(resources: List[Resource], projects: List[Project]) -> Dict[str, Any]:
"""Optimize resource allocation across projects."""
optimization_results = {
"current_allocation_efficiency": 0.0,
"optimization_opportunities": [],
"suggested_reallocations": [],
"skill_matching_scores": {}
}
# Calculate current allocation efficiency
total_effectiveness = 0
total_allocations = 0
for project in projects:
if project.status not in ["completed", "cancelled"] and project.current_allocation:
project_effectiveness = 0
for allocation in project.current_allocation:
resource_id = allocation.get("resource_id", "")
hours = allocation.get("hours_per_week", 0)
# Find the resource
resource = next((r for r in resources if r.id == resource_id), None)
if resource:
# Calculate effectiveness for this allocation
avg_skill_effectiveness = 0
skill_count = 0
for skill in project.required_skills:
if skill in resource.skills:
avg_skill_effectiveness += resource.get_skill_effectiveness(skill)
skill_count += 1
if skill_count > 0:
avg_skill_effectiveness /= skill_count
project_effectiveness += avg_skill_effectiveness * hours
total_allocations += hours
if total_allocations > 0:
total_effectiveness += project_effectiveness / total_allocations
current_efficiency = total_effectiveness / max(len(projects), 1)
optimization_results["current_allocation_efficiency"] = current_efficiency
# Find optimization opportunities
under_utilized = [r for r in resources if r.current_utilization <= UTILIZATION_THRESHOLDS["under_utilized"]]
over_allocated_projects = [p for p in projects if p.capacity_gap < 0 and p.status != "completed"]
# Generate reallocation suggestions
for project in projects:
if project.capacity_gap > 0 and project.status != "completed":
# Find best-fit under-utilized resources
suitable_resources = []
for resource in under_utilized:
if resource.can_work_on_project(project.required_skills):
skill_match_score = 0
for skill in project.required_skills:
if skill in resource.skills:
skill_match_score += resource.get_skill_effectiveness(skill)
skill_match_score /= max(len(project.required_skills), 1)
suitable_resources.append({
"resource": resource,
"skill_match_score": skill_match_score,
"available_hours": resource.available_hours
})
# Sort by skill match and availability
suitable_resources.sort(key=lambda x: (x["skill_match_score"], x["available_hours"]), reverse=True)
if suitable_resources:
optimization_results["suggested_reallocations"].append({
"project_id": project.id,
"project_name": project.name,
"gap_hours": project.capacity_gap,
"recommended_resources": suitable_resources[:3] # Top 3 recommendations
})
return optimization_results
def simulate_capacity_scenarios(resources: List[Resource], projects: List[Project], scenarios: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Simulate what-if scenarios for capacity planning."""
scenario_results = {}
for scenario in scenarios:
scenario_name = scenario.get("name", "Unnamed Scenario")
scenario_type = scenario.get("type", "")
scenario_params = scenario.get("parameters", {})
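# Create copies for simulation by rebuilding each object from its attribute dict
# (this works because the attribute names match the input JSON keys)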
sim_resources = [Resource(r.__dict__.copy()) for r in resources]
sim_projects = [Project(p.__dict__.copy()) for p in projects]
# Apply scenario changes
if scenario_type == "add_resource":
# Add new resource
new_resource_data = scenario_params.get("resource_data", {})
new_resource = Resource(new_resource_data)
sim_resources.append(new_resource)
elif scenario_type == "remove_resource":
# Remove resource
resource_id = scenario_params.get("resource_id", "")
sim_resources = [r for r in sim_resources if r.id != resource_id]
elif scenario_type == "add_project":
# Add new project
new_project_data = scenario_params.get("project_data", {})
new_project = Project(new_project_data)
sim_projects.append(new_project)
elif scenario_type == "adjust_utilization":
# Adjust resource utilization
resource_id = scenario_params.get("resource_id", "")
new_utilization = scenario_params.get("new_utilization", 0)
for resource in sim_resources:
if resource.id == resource_id:
resource.current_utilization = new_utilization
resource._calculate_effective_capacity()
# Analyze scenario results
resource_analysis = analyze_resource_utilization(sim_resources)
project_analysis = analyze_project_capacity_requirements(sim_projects)
scenario_results[scenario_name] = {
"scenario_type": scenario_type,
"resource_utilization": resource_analysis["utilization_stats"]["overall_utilization"],
"total_capacity": resource_analysis["utilization_stats"]["total_capacity"],
"capacity_gaps": project_analysis["capacity_gaps"]["total_gap_hours_weekly"],
"under_utilized_count": len(resource_analysis["utilization_categories"]["under_utilized"]),
"over_utilized_count": len(resource_analysis["utilization_categories"]["over_utilized"]),
"cost_impact": _calculate_cost_impact(sim_resources, resources)
}
return scenario_results
def _generate_capacity_alerts(utilization_categories: Dict[str, List[Resource]]) -> List[str]:
"""Generate capacity-related alerts and warnings."""
alerts = []
critical_resources = utilization_categories.get("critical", [])
over_utilized = utilization_categories.get("over_utilized", [])
under_utilized = utilization_categories.get("under_utilized", [])
if critical_resources:
alerts.append(f"CRITICAL: {len(critical_resources)} resources are severely over-allocated (>95%)")
if over_utilized:
alerts.append(f"WARNING: {len(over_utilized)} resources are over-allocated (85-95%)")
if len(under_utilized) > len(critical_resources) + len(over_utilized):
alerts.append(f"OPPORTUNITY: {len(under_utilized)} resources are under-utilized (<60%)")
return alerts
def _calculate_cost_impact(sim_resources: List[Resource], baseline_resources: List[Resource]) -> float:
"""Calculate cost impact of scenario vs baseline."""
sim_cost = sum(r.hourly_rate * r.effective_hours_per_week for r in sim_resources)
baseline_cost = sum(r.hourly_rate * r.effective_hours_per_week for r in baseline_resources)
return sim_cost - baseline_cost
def generate_capacity_recommendations(analysis_results: Dict[str, Any]) -> List[str]:
"""Generate actionable capacity management recommendations."""
recommendations = []
# Resource utilization recommendations
resource_analysis = analysis_results.get("resource_analysis", {})
utilization_categories = resource_analysis.get("utilization_categories", {})
critical_count = len(utilization_categories.get("critical", []))
over_utilized_count = len(utilization_categories.get("over_utilized", []))
under_utilized_count = len(utilization_categories.get("under_utilized", []))
if critical_count > 0:
recommendations.append(f"URGENT: Redistribute workload for {critical_count} critically over-allocated resources to prevent burnout.")
if over_utilized_count > 2:
recommendations.append(f"Consider hiring or redistributing work - {over_utilized_count} team members are over-allocated.")
if under_utilized_count > 0 and critical_count + over_utilized_count > 0:
recommendations.append(f"Rebalance allocation - {under_utilized_count} under-utilized resources could help with over-allocated work.")
# Project capacity recommendations
project_analysis = analysis_results.get("project_analysis", {})
capacity_gaps = project_analysis.get("capacity_gaps", {})
total_gap = capacity_gaps.get("total_gap_hours_weekly", 0)
if total_gap > 40: # More than 1 FTE worth of gap
recommendations.append(f"Capacity shortfall of {total_gap:.0f} hours/week detected. Consider hiring or timeline adjustments.")
# Skill-based recommendations
skill_demand = project_analysis.get("skill_demand", {})
if skill_demand:
top_skill = list(skill_demand.keys())[0]
top_demand = skill_demand[top_skill]
recommendations.append(f"High demand for {top_skill} skills ({top_demand} hours). Consider training or specialized hiring.")
# Optimization recommendations
optimization = analysis_results.get("allocation_optimization", {})
efficiency = optimization.get("current_allocation_efficiency", 0)
if efficiency < 0.7:
recommendations.append("Low allocation efficiency detected. Review skill-to-project matching and consider reallocation.")
return recommendations
# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------
def analyze_capacity(data: Dict[str, Any]) -> CapacityAnalysisResult:
"""Perform comprehensive capacity analysis."""
result = CapacityAnalysisResult()
try:
# Parse resource and project data
resource_records = data.get("resources", [])
project_records = data.get("projects", [])
resources = [Resource(record) for record in resource_records]
projects = [Project(record) for record in project_records]
if not resources:
raise ValueError("No resource data found")
# Basic summary
result.summary = {
"total_resources": len(resources),
"total_projects": len(projects),
"active_projects": len([p for p in projects if p.status in ["active", "in_progress"]]),
"total_capacity_hours": sum(r.effective_hours_per_week for r in resources),
"total_demand_hours": sum(p.required_hours_per_week for p in projects if p.status != "completed"),
"overall_utilization": sum(r.current_utilization for r in resources) / max(len(resources), 1)
}
# Resource analysis
result.resource_analysis = analyze_resource_utilization(resources)
# Project analysis
result.project_analysis = analyze_project_capacity_requirements(projects)
# Allocation optimization
result.allocation_optimization = optimize_resource_allocation(resources, projects)
# Scenario analysis (if scenarios provided)
scenarios = data.get("scenarios", [])
if scenarios:
result.scenario_analysis = simulate_capacity_scenarios(resources, projects, scenarios)
# Generate recommendations
analysis_data = {
"resource_analysis": result.resource_analysis,
"project_analysis": result.project_analysis,
"allocation_optimization": result.allocation_optimization
}
result.recommendations = generate_capacity_recommendations(analysis_data)
except Exception as e:
result.summary = {"error": str(e)}
return result
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text_output(result: CapacityAnalysisResult) -> str:
"""Format analysis results as readable text report."""
lines = []
lines.append("="*60)
lines.append("RESOURCE CAPACITY PLANNING REPORT")
lines.append("="*60)
lines.append("")
if "error" in result.summary:
lines.append(f"ERROR: {result.summary['error']}")
return "\n".join(lines)
# Executive Summary
summary = result.summary
lines.append("CAPACITY OVERVIEW")
lines.append("-"*30)
lines.append(f"Total Resources: {summary['total_resources']}")
lines.append(f"Total Projects: {summary['total_projects']} ({summary['active_projects']} active)")
lines.append(f"Capacity vs Demand: {summary['total_capacity_hours']:.0f}h vs {summary['total_demand_hours']:.0f}h per week")
lines.append(f"Overall Utilization: {summary['overall_utilization']:.1%}")
lines.append("")
# Resource Utilization
resource_analysis = result.resource_analysis
lines.append("RESOURCE UTILIZATION ANALYSIS")
lines.append("-"*30)
utilization_categories = resource_analysis.get("utilization_categories", {})
for category, resources in utilization_categories.items():
if resources:
lines.append(f"{category.replace('_', ' ').title()}: {len(resources)} resources")
for resource in resources[:3]: # Show top 3
lines.append(f" - {resource['name']} ({resource['role']}): {resource['utilization']:.1%}")
if len(resources) > 3:
lines.append(f" ... and {len(resources) - 3} more")
lines.append("")
# Capacity Alerts
alerts = resource_analysis.get("capacity_alerts", [])
if alerts:
lines.append("CAPACITY ALERTS")
lines.append("-"*30)
for alert in alerts:
lines.append(f"⚠️ {alert}")
lines.append("")
# Project Capacity Gaps
project_analysis = result.project_analysis
capacity_gaps = project_analysis.get("capacity_gaps", {})
lines.append("PROJECT CAPACITY GAPS")
lines.append("-"*30)
lines.append(f"Projects with gaps: {capacity_gaps.get('projects_with_gaps', 0)}")
lines.append(f"Total gap: {capacity_gaps.get('total_gap_hours_weekly', 0):.0f} hours/week")
gap_projects = capacity_gaps.get("gap_projects", [])
if gap_projects:
lines.append("Top projects needing resources:")
for project in gap_projects[:5]:
lines.append(f" - {project['name']} ({project['priority']}): {project['gap_hours']:.0f}h/week gap")
lines.append("")
# Skill Demand
skill_demand = project_analysis.get("skill_demand", {})
if skill_demand:
lines.append("TOP SKILL DEMANDS")
lines.append("-"*30)
for skill, hours in list(skill_demand.items())[:5]:
lines.append(f"{skill}: {hours} hours needed")
lines.append("")
# Optimization Suggestions
optimization = result.allocation_optimization
suggested_reallocations = optimization.get("suggested_reallocations", [])
if suggested_reallocations:
lines.append("RESOURCE REALLOCATION SUGGESTIONS")
lines.append("-"*30)
for suggestion in suggested_reallocations[:3]:
lines.append(f"Project: {suggestion['project_name']}")
lines.append(f" Gap: {suggestion['gap_hours']:.0f} hours/week")
recommended = suggestion.get("recommended_resources", [])
if recommended:
best_match = recommended[0]
resource_info = best_match["resource"]
lines.append(f" Best fit: {resource_info.name} ({resource_info.role})")
lines.append(f" Skill match: {best_match['skill_match_score']:.1%}")
lines.append(f" Available: {best_match['available_hours']:.0f}h/week")
lines.append("")
# Scenario Analysis
scenario_analysis = result.scenario_analysis
if scenario_analysis:
lines.append("SCENARIO ANALYSIS")
lines.append("-"*30)
for scenario_name, results in scenario_analysis.items():
lines.append(f"{scenario_name}:")
lines.append(f" Utilization: {results['resource_utilization']:.1%}")
lines.append(f" Capacity gaps: {results['capacity_gaps']:.0f}h/week")
lines.append(f" Cost impact: ${results['cost_impact']:.0f}/week")
lines.append("")
# Recommendations
if result.recommendations:
lines.append("RECOMMENDATIONS")
lines.append("-"*30)
for i, rec in enumerate(result.recommendations, 1):
lines.append(f"{i}. {rec}")
return "\n".join(lines)
def format_json_output(result: CapacityAnalysisResult) -> Dict[str, Any]:
"""Format analysis results as JSON."""
# Helper function to serialize Resource objects
def serialize_resource(resource):
if hasattr(resource, 'id'):
return {
"id": resource.id,
"name": resource.name,
"role": resource.role,
"utilization": resource.current_utilization,
"available_hours": resource.available_hours,
"hourly_rate": resource.hourly_rate
}
return resource
# Collect the result sections (a shallow view, not a deep copy: Resource objects are replaced in place below)
serialized_result = {
"summary": result.summary,
"resource_analysis": result.resource_analysis,
"project_analysis": result.project_analysis,
"allocation_optimization": result.allocation_optimization,
"scenario_analysis": result.scenario_analysis,
"recommendations": result.recommendations
}
# Handle Resource objects in optimization suggestions
if "suggested_reallocations" in serialized_result["allocation_optimization"]:
for suggestion in serialized_result["allocation_optimization"]["suggested_reallocations"]:
if "recommended_resources" in suggestion:
for rec in suggestion["recommended_resources"]:
if "resource" in rec:
rec["resource"] = serialize_resource(rec["resource"])
return serialized_result
# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------
def main() -> int:
"""Main CLI entry point."""
parser = argparse.ArgumentParser(
description="Analyze resource capacity and allocation across project portfolio"
)
parser.add_argument(
"data_file",
help="JSON file containing resource and project capacity data"
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)"
)
args = parser.parse_args()
try:
# Load and validate data
with open(args.data_file, 'r') as f:
data = json.load(f)
# Perform analysis
result = analyze_capacity(data)
# Output results
if args.format == "json":
output = format_json_output(result)
print(json.dumps(output, indent=2))
else:
output = format_text_output(result)
print(output)
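# analyze_capacity() reports failures through summary["error"], so reflect that in the exit code
return 1 if "error" in result.summary else 0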
except FileNotFoundError:
print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
return 1
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
return 1
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
return 1
if __name__ == "__main__":
sys.exit(main())
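
The capacity arithmetic in `resource_capacity_planner.py` is easier to sanity-check with concrete numbers. The sketch below restates the two core formulas from `Resource._calculate_effective_capacity` and `Project._calculate_project_metrics` as standalone functions. The function names (`effective_hours`, `available_hours`, `required_hours_per_week`) and the sample figures are illustrative assumptions, not part of the script; the factor values are copied from its configuration.

```python
from datetime import datetime

# Values copied from CAPACITY_FACTORS and PROJECT_COMPLEXITY_FACTORS in the script.
CAPACITY_FACTORS = {
    "meeting_overhead": 0.15,
    "learning_development": 0.05,
    "administrative": 0.10,
    "context_switching": 0.05,
    "vacation_sick": 0.12,
}
PROJECT_COMPLEXITY_FACTORS = {
    "simple": 1.0,
    "moderate": 1.2,
    "complex": 1.5,
    "very_complex": 2.0,
}


def effective_hours(max_hours_per_week: float) -> float:
    """Weekly hours left after subtracting all overhead factors (47% in total)."""
    return max_hours_per_week * (1 - sum(CAPACITY_FACTORS.values()))


def available_hours(max_hours_per_week: float, current_utilization: float) -> float:
    """Effective hours not yet consumed by current work."""
    return effective_hours(max_hours_per_week) * (1 - current_utilization)


def required_hours_per_week(
    estimated_hours: float, complexity: str, start_date: str, target_end_date: str
) -> float:
    """Complexity-adjusted hours spread evenly over the project duration."""
    adjusted = estimated_hours * PROJECT_COMPLEXITY_FACTORS.get(complexity, 1.0)
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(target_end_date, "%Y-%m-%d")
    weeks = (end - start).days / 7
    # Mirror the script: a zero or negative duration means all hours fall in one week.
    return adjusted / weeks if weeks > 0 else adjusted


# Hypothetical sample figures, chosen only to make the arithmetic easy to follow.
print(f"{effective_hours(40):.1f}")       # 21.2 (40 h x 0.53)
print(f"{available_hours(40, 0.5):.1f}")  # 10.6 (half of the effective hours)
print(required_hours_per_week(400, "complex", "2026-01-05", "2026-03-30"))  # 50.0
```

With these defaults a nominal 40-hour week yields only about 21 effective hours, so a project that needs 50 hours per week takes more than two full-time people before any existing utilization is counted.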
FILE:scripts/risk_matrix_analyzer.py
#!/usr/bin/env python3
"""
Risk Matrix Analyzer
Builds probability/impact matrices, calculates risk scores, suggests mitigation
strategies based on risk category, and tracks risk trends over time. Provides
comprehensive risk assessment and prioritization for project portfolios.
Usage:
python risk_matrix_analyzer.py risk_data.json
python risk_matrix_analyzer.py risk_data.json --format json
"""
import argparse
import json
import statistics
import sys
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple, Union
# ---------------------------------------------------------------------------
# Risk Assessment Configuration
# ---------------------------------------------------------------------------
RISK_CATEGORIES = {
"technical": {
"weight": 1.2,
"description": "Technology, architecture, integration risks",
"mitigation_strategies": [
"Proof of concept development",
"Technical spike implementation",
"Expert consultation",
"Alternative technology evaluation",
"Incremental development approach"
]
},
"resource": {
"weight": 1.1,
"description": "Team capacity, skills, availability risks",
"mitigation_strategies": [
"Resource planning and buffer allocation",
"Skill development and training",
"Cross-training and knowledge sharing",
"Contractor or consultant engagement",
"Timeline adjustment for capacity"
]
},
"schedule": {
"weight": 1.0,
"description": "Timeline, deadline, dependency risks",
"mitigation_strategies": [
"Critical path analysis and optimization",
"Buffer time allocation",
"Dependency management and coordination",
"Scope prioritization and phasing",
"Parallel work streams where possible"
]
},
"business": {
"weight": 1.3,
"description": "Market, customer, competitive risks",
"mitigation_strategies": [
"Market research and validation",
"Customer feedback integration",
"Competitive analysis monitoring",
"Stakeholder engagement strategy",
"Business case validation checkpoints"
]
},
"financial": {
"weight": 1.4,
"description": "Budget, ROI, cost overrun risks",
"mitigation_strategies": [
"Detailed cost estimation and tracking",
"Budget reserve allocation",
"Regular financial checkpoint reviews",
"Cost-benefit analysis updates",
"Alternative funding source identification"
]
},
"regulatory": {
"weight": 1.5,
"description": "Compliance, legal, governance risks",
"mitigation_strategies": [
"Legal review and approval processes",
"Compliance audit preparation",
"Regulatory body engagement",
"Documentation and audit trail maintenance",
"External legal counsel consultation"
]
},
"external": {
"weight": 1.0,
"description": "Vendor, partner, environmental risks",
"mitigation_strategies": [
"Vendor assessment and backup options",
"Contract negotiation and SLA definition",
"Environmental monitoring and adaptation",
"Partner relationship management",
"External dependency tracking"
]
}
}
PROBABILITY_LEVELS = {
1: {"label": "Very Low", "range": "0-10%", "description": "Highly unlikely to occur"},
2: {"label": "Low", "range": "11-30%", "description": "Unlikely but possible"},
3: {"label": "Medium", "range": "31-60%", "description": "Moderate likelihood"},
4: {"label": "High", "range": "61-85%", "description": "Likely to occur"},
5: {"label": "Very High", "range": "86-100%", "description": "Almost certain to occur"}
}
IMPACT_LEVELS = {
1: {"label": "Very Low", "description": "Minimal impact on project success"},
2: {"label": "Low", "description": "Minor delays or cost increases"},
3: {"label": "Medium", "description": "Significant impact on timeline/budget"},
4: {"label": "High", "description": "Major project disruption"},
5: {"label": "Very High", "description": "Project failure or critical compromise"}
}
RISK_TOLERANCE_THRESHOLDS = {
"low": 8, # Weighted score <= 8: Accept
"medium": 15, # Weighted score > 8 and <= 15: Monitor
"high": 20, # Weighted score > 15 and <= 20: Mitigate
"critical": 25 # Weighted score > 20: Urgent action (weighted scores can reach 37.5)
}
MITIGATION_STRATEGIES = {
"accept": "Monitor risk without active mitigation",
"avoid": "Eliminate risk through scope or approach changes",
"mitigate": "Reduce probability or impact through proactive measures",
"transfer": "Share or transfer risk to third parties",
"contingency": "Prepare response plan for risk occurrence"
}
# ---------------------------------------------------------------------------
# Data Models
# ---------------------------------------------------------------------------

class Risk:
    """Represents a single project risk with assessment and mitigation data."""

    def __init__(self, data: Dict[str, Any]):
        self.id: str = data.get("id", "")
        self.title: str = data.get("title", "")
        self.description: str = data.get("description", "")
        self.category: str = data.get("category", "technical").lower()
        self.probability: int = max(1, min(5, data.get("probability", 3)))
        self.impact: int = max(1, min(5, data.get("impact", 3)))
        self.owner: str = data.get("owner", "")
        self.status: str = data.get("status", "open").lower()
        self.identified_date: str = data.get("identified_date", "")
        self.target_resolution: Optional[str] = data.get("target_resolution")
        self.mitigation_strategy: str = data.get("mitigation_strategy", "").lower()
        self.mitigation_actions: List[str] = data.get("mitigation_actions", [])
        self.cost_impact: Optional[float] = data.get("cost_impact")
        self.schedule_impact: Optional[int] = data.get("schedule_impact_days")

        # Calculate derived metrics
        self._calculate_risk_score()
        self._determine_risk_level()
        self._suggest_mitigation_approach()

    def _calculate_risk_score(self):
        """Calculate weighted risk score based on category, probability, and impact."""
        base_score = self.probability * self.impact
        category_weight = RISK_CATEGORIES.get(self.category, {}).get("weight", 1.0)
        self.risk_score = base_score * category_weight

    def _determine_risk_level(self):
        """Determine risk level based on score thresholds."""
        if self.risk_score <= RISK_TOLERANCE_THRESHOLDS["low"]:
            self.risk_level = "low"
        elif self.risk_score <= RISK_TOLERANCE_THRESHOLDS["medium"]:
            self.risk_level = "medium"
        elif self.risk_score <= RISK_TOLERANCE_THRESHOLDS["high"]:
            self.risk_level = "high"
        else:
            self.risk_level = "critical"

    def _suggest_mitigation_approach(self):
        """Suggest mitigation approach based on risk characteristics."""
        if self.risk_level == "low":
            self.suggested_approach = "accept"
        elif self.probability >= 4 and self.impact <= 2:
            self.suggested_approach = "mitigate"  # Likely but low impact
        elif self.probability <= 2 and self.impact >= 4:
            self.suggested_approach = "contingency"  # Unlikely but high impact
        elif self.impact >= 4:
            self.suggested_approach = "avoid"  # High impact risks
        else:
            self.suggested_approach = "mitigate"

    @property
    def is_active(self) -> bool:
        return self.status.lower() in ["open", "identified", "monitoring", "mitigating"]

    @property
    def is_overdue(self) -> bool:
        if not self.target_resolution:
            return False
        try:
            target_date = datetime.strptime(self.target_resolution, "%Y-%m-%d")
            return datetime.now() > target_date and self.is_active
        except ValueError:
            return False


class RiskAnalysisResult:
    """Complete risk analysis results."""

    def __init__(self):
        self.summary: Dict[str, Any] = {}
        self.risk_matrix: Dict[str, Any] = {}
        self.category_analysis: Dict[str, Any] = {}
        self.mitigation_analysis: Dict[str, Any] = {}
        self.trend_analysis: Dict[str, Any] = {}
        self.recommendations: List[str] = []
# ---------------------------------------------------------------------------
# Risk Analysis Functions
# ---------------------------------------------------------------------------

def build_risk_matrix(risks: List[Risk]) -> Dict[str, Any]:
    """Build probability/impact risk matrix with risk distribution."""
    matrix = {}
    risk_distribution = {}

    # Initialize matrix
    for prob in range(1, 6):
        matrix[prob] = {}
        for impact in range(1, 6):
            matrix[prob][impact] = []

    # Populate matrix with risks
    for risk in risks:
        if risk.is_active:
            matrix[risk.probability][risk.impact].append({
                "id": risk.id,
                "title": risk.title,
                "risk_score": risk.risk_score,
                "category": risk.category
            })

    # Calculate distribution statistics
    total_risks = len([r for r in risks if r.is_active])
    risk_distribution = {
        "critical": len([r for r in risks if r.is_active and r.risk_level == "critical"]),
        "high": len([r for r in risks if r.is_active and r.risk_level == "high"]),
        "medium": len([r for r in risks if r.is_active and r.risk_level == "medium"]),
        "low": len([r for r in risks if r.is_active and r.risk_level == "low"])
    }

    # Calculate risk exposure
    total_score = sum(r.risk_score for r in risks if r.is_active)
    average_score = total_score / max(total_risks, 1)

    return {
        "matrix": matrix,
        "distribution": risk_distribution,
        "total_risks": total_risks,
        "total_risk_score": total_score,
        "average_risk_score": average_score,
        "risk_exposure_level": _classify_risk_exposure(average_score)
    }
def analyze_risk_categories(risks: List[Risk]) -> Dict[str, Any]:
    """Analyze risks by category with detailed statistics."""
    category_stats = {}
    active_risks = [r for r in risks if r.is_active]

    for category, config in RISK_CATEGORIES.items():
        category_risks = [r for r in active_risks if r.category == category]
        if category_risks:
            risk_scores = [r.risk_score for r in category_risks]
            category_stats[category] = {
                "count": len(category_risks),
                "total_score": sum(risk_scores),
                "average_score": statistics.mean(risk_scores),
                "max_score": max(risk_scores),
                "risk_level_distribution": _get_risk_level_distribution(category_risks),
                "top_risks": sorted(category_risks, key=lambda r: r.risk_score, reverse=True)[:3],
                "mitigation_coverage": _calculate_mitigation_coverage(category_risks),
                "suggested_strategies": config["mitigation_strategies"][:3]
            }
        else:
            category_stats[category] = {
                "count": 0,
                "total_score": 0,
                "average_score": 0,
                "risk_level_distribution": {},
                "mitigation_coverage": 0
            }

    # Identify highest risk categories
    sorted_categories = sorted(
        [(cat, stats) for cat, stats in category_stats.items() if stats["count"] > 0],
        key=lambda x: x[1]["total_score"],
        reverse=True
    )

    return {
        "category_statistics": category_stats,
        "highest_risk_categories": [cat for cat, _ in sorted_categories[:3]],
        "category_concentration": len([c for c in category_stats if category_stats[c]["count"] > 0])
    }
def analyze_mitigation_effectiveness(risks: List[Risk]) -> Dict[str, Any]:
    """Analyze mitigation strategy effectiveness and coverage."""
    active_risks = [r for r in risks if r.is_active]

    # Mitigation strategy distribution
    strategy_distribution = {}
    for strategy in MITIGATION_STRATEGIES.keys():
        strategy_risks = [r for r in active_risks if r.mitigation_strategy == strategy]
        if strategy_risks:
            strategy_distribution[strategy] = {
                "count": len(strategy_risks),
                "average_risk_score": statistics.mean([r.risk_score for r in strategy_risks]),
                "risk_levels": _get_risk_level_distribution(strategy_risks)
            }

    # Mitigation coverage analysis
    risks_with_mitigation = [r for r in active_risks if r.mitigation_actions]
    mitigation_coverage = len(risks_with_mitigation) / max(len(active_risks), 1)

    # Action item analysis
    total_actions = sum(len(r.mitigation_actions) for r in active_risks)
    average_actions_per_risk = total_actions / max(len(active_risks), 1)

    # Overdue mitigation analysis
    overdue_risks = [r for r in active_risks if r.is_overdue]
    overdue_rate = len(overdue_risks) / max(len(active_risks), 1)

    return {
        "strategy_distribution": strategy_distribution,
        "mitigation_coverage": mitigation_coverage,
        "average_actions_per_risk": average_actions_per_risk,
        "overdue_mitigation_count": len(overdue_risks),
        "overdue_rate": overdue_rate,
        "top_overdue_risks": sorted(overdue_risks, key=lambda r: r.risk_score, reverse=True)[:5]
    }
def analyze_risk_trends(current_risks: List[Risk], historical_data: Optional[List[Dict]] = None) -> Dict[str, Any]:
    """Analyze risk trends over time if historical data is available."""
    if not historical_data:
        return {
            "trend_analysis_available": False,
            "message": "Historical data required for trend analysis"
        }

    # Simple trend analysis based on current vs. historical risk levels
    current_total_score = sum(r.risk_score for r in current_risks if r.is_active)
    current_risk_count = len([r for r in current_risks if r.is_active])

    # This is a simplified implementation - in practice, you'd track risks over time
    trend_data = {
        "trend_analysis_available": True,
        "current_total_risk_score": current_total_score,
        "current_active_risks": current_risk_count,
        "risk_velocity": {
            "new_risks_rate": "Calculate from historical data",
            "resolution_rate": "Calculate from historical data",
            "escalation_rate": "Calculate from historical data"
        }
    }
    return trend_data
def generate_risk_recommendations(risks: List[Risk], analysis_results: Dict[str, Any]) -> List[str]:
    """Generate actionable risk management recommendations."""
    recommendations = []

    # Critical risk recommendations
    critical_risks = [r for r in risks if r.is_active and r.risk_level == "critical"]
    if critical_risks:
        recommendations.append(f"URGENT: Address {len(critical_risks)} critical risks immediately. These require executive attention and dedicated resources.")
        for risk in critical_risks[:3]:  # Top 3 critical risks
            recommendations.append(f"Critical Risk - {risk.title}: Implement {risk.suggested_approach} strategy within 48 hours.")

    # High-concentration category recommendations
    category_analysis = analysis_results.get("category_analysis", {})
    highest_categories = category_analysis.get("highest_risk_categories", [])
    if highest_categories:
        top_category = highest_categories[0]
        recommendations.append(f"Focus mitigation efforts on {top_category} risks - highest concentration of risk exposure.")

    # Mitigation coverage recommendations
    mitigation_analysis = analysis_results.get("mitigation_analysis", {})
    coverage = mitigation_analysis.get("mitigation_coverage", 0)
    if coverage < 0.7:
        recommendations.append("Improve mitigation coverage - less than 70% of risks have defined mitigation actions.")

    overdue_rate = mitigation_analysis.get("overdue_rate", 0)
    if overdue_rate > 0.2:
        recommendations.append("Address overdue mitigation actions - more than 20% of risks are past their target resolution date.")

    # Risk matrix recommendations
    matrix_analysis = analysis_results.get("risk_matrix", {})
    avg_score = matrix_analysis.get("average_risk_score", 0)
    if avg_score > 15:
        recommendations.append("Portfolio risk exposure is high. Consider scope reduction or additional risk mitigation investments.")
    elif avg_score < 8:
        recommendations.append("Risk exposure is well-managed. Consider taking on additional strategic initiatives.")

    return recommendations
# ---------------------------------------------------------------------------
# Utility Functions
# ---------------------------------------------------------------------------

def _classify_risk_exposure(average_score: float) -> str:
    """Classify overall portfolio risk exposure level."""
    if average_score > 18:
        return "very_high"
    elif average_score > 15:
        return "high"
    elif average_score > 12:
        return "medium"
    elif average_score > 8:
        return "low"
    else:
        return "very_low"


def _get_risk_level_distribution(risks: List[Risk]) -> Dict[str, int]:
    """Get distribution of risk levels for a set of risks."""
    distribution = {"critical": 0, "high": 0, "medium": 0, "low": 0}
    for risk in risks:
        distribution[risk.risk_level] += 1
    return distribution


def _calculate_mitigation_coverage(risks: List[Risk]) -> float:
    """Calculate percentage of risks with defined mitigation actions."""
    if not risks:
        return 0.0
    risks_with_mitigation = sum(1 for r in risks if r.mitigation_actions)
    return risks_with_mitigation / len(risks)
# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------

def analyze_risks(data: Dict[str, Any]) -> RiskAnalysisResult:
    """Perform comprehensive risk analysis."""
    result = RiskAnalysisResult()

    try:
        # Parse risk data
        risk_records = data.get("risks", [])
        risks = [Risk(record) for record in risk_records]
        if not risks:
            raise ValueError("No risk data found")

        # Basic summary
        active_risks = [r for r in risks if r.is_active]
        result.summary = {
            "total_risks": len(risks),
            "active_risks": len(active_risks),
            "closed_risks": len(risks) - len(active_risks),
            "critical_risks": len([r for r in active_risks if r.risk_level == "critical"]),
            "high_risks": len([r for r in active_risks if r.risk_level == "high"]),
            "total_risk_exposure": sum(r.risk_score for r in active_risks),
            "average_risk_score": sum(r.risk_score for r in active_risks) / max(len(active_risks), 1),
            "overdue_risks": len([r for r in active_risks if r.is_overdue])
        }

        # Risk matrix analysis
        result.risk_matrix = build_risk_matrix(risks)

        # Category analysis
        result.category_analysis = analyze_risk_categories(risks)

        # Mitigation analysis
        result.mitigation_analysis = analyze_mitigation_effectiveness(risks)

        # Trend analysis (simplified without historical data)
        result.trend_analysis = analyze_risk_trends(risks, data.get("historical_data"))

        # Generate recommendations
        analysis_data = {
            "category_analysis": result.category_analysis,
            "mitigation_analysis": result.mitigation_analysis,
            "risk_matrix": result.risk_matrix
        }
        result.recommendations = generate_risk_recommendations(risks, analysis_data)

    except Exception as e:
        result.summary = {"error": str(e)}

    return result
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------

def format_text_output(result: RiskAnalysisResult) -> str:
    """Format analysis results as readable text report."""
    lines = []
    lines.append("="*60)
    lines.append("RISK MATRIX ANALYSIS REPORT")
    lines.append("="*60)
    lines.append("")

    if "error" in result.summary:
        lines.append(f"ERROR: {result.summary['error']}")
        return "\n".join(lines)

    # Executive Summary
    summary = result.summary
    lines.append("EXECUTIVE SUMMARY")
    lines.append("-"*30)
    lines.append(f"Total Risks: {summary['total_risks']} ({summary['active_risks']} active)")
    lines.append(f"Risk Exposure: {summary['total_risk_exposure']:.1f} points (avg: {summary['average_risk_score']:.1f})")
    lines.append(f"Critical/High Risks: {summary['critical_risks']}/{summary['high_risks']}")
    lines.append(f"Overdue Mitigations: {summary['overdue_risks']}")
    lines.append("")

    # Risk Distribution
    matrix = result.risk_matrix
    lines.append("RISK LEVEL DISTRIBUTION")
    lines.append("-"*30)
    distribution = matrix.get("distribution", {})
    for level in ["critical", "high", "medium", "low"]:
        count = distribution.get(level, 0)
        percentage = (count / max(summary["active_risks"], 1)) * 100
        lines.append(f"{level.title()}: {count} ({percentage:.1f}%)")
    lines.append("")

    # Risk Matrix Visualization
    lines.append("RISK MATRIX (Probability vs Impact)")
    lines.append("-"*50)
    # Column headers are padded to line up with the 5-character cells below
    lines.append("     1    2    3    4    5  (Impact)")
    matrix_data = matrix.get("matrix", {})
    for prob in range(5, 0, -1):
        line = f"{prob} "
        for impact in range(1, 6):
            risk_count = len(matrix_data.get(prob, {}).get(impact, []))
            line += f" [{risk_count:2}]"
        lines.append(line)
    lines.append("(P)")
    lines.append("")

    # Category Analysis
    category_analysis = result.category_analysis
    lines.append("RISK BY CATEGORY")
    lines.append("-"*30)
    category_stats = category_analysis.get("category_statistics", {})
    for category, stats in category_stats.items():
        if stats["count"] > 0:
            lines.append(f"{category.title()}: {stats['count']} risks, "
                         f"avg score: {stats['average_score']:.1f}, "
                         f"total exposure: {stats['total_score']:.1f}")
    lines.append("")

    # Mitigation Analysis
    mitigation = result.mitigation_analysis
    lines.append("MITIGATION EFFECTIVENESS")
    lines.append("-"*30)
    lines.append(f"Mitigation Coverage: {mitigation.get('mitigation_coverage', 0):.1%}")
    lines.append(f"Average Actions per Risk: {mitigation.get('average_actions_per_risk', 0):.1f}")
    lines.append(f"Overdue Mitigations: {mitigation.get('overdue_mitigation_count', 0)} "
                 f"({mitigation.get('overdue_rate', 0):.1%})")
    lines.append("")

    # Top Risks
    lines.append("TOP RISKS REQUIRING ATTENTION")
    lines.append("-"*30)

    # Find top risks across all categories.
    # The loop variable is named `stats` so it does not shadow `category_stats`.
    all_risks = []
    for stats in category_stats.values():
        if "top_risks" in stats:
            all_risks.extend(stats["top_risks"])
    top_risks = sorted(all_risks, key=lambda r: r.risk_score, reverse=True)[:5]
    for i, risk in enumerate(top_risks, 1):
        lines.append(f"{i}. {risk.title} (Score: {risk.risk_score:.1f}, Level: {risk.risk_level.title()})")
        lines.append(f"   Category: {risk.category.title()}, Strategy: {risk.suggested_approach.title()}")
        lines.append("")

    # Recommendations
    if result.recommendations:
        lines.append("RECOMMENDATIONS")
        lines.append("-"*30)
        for i, rec in enumerate(result.recommendations, 1):
            lines.append(f"{i}. {rec}")

    return "\n".join(lines)
def format_json_output(result: RiskAnalysisResult) -> Dict[str, Any]:
    """Format analysis results as JSON."""

    # Convert Risk objects to dictionaries for JSON serialization
    def serialize_risks(obj):
        if isinstance(obj, list):
            return [serialize_risks(item) for item in obj]
        elif hasattr(obj, 'id') and hasattr(obj, 'title'):  # This is a Risk object
            return {
                "id": obj.id,
                "title": obj.title,
                "risk_score": obj.risk_score,
                "risk_level": obj.risk_level,
                "category": obj.category,
                "probability": obj.probability,
                "impact": obj.impact,
                "status": obj.status
            }
        elif isinstance(obj, dict):
            return {key: serialize_risks(value) for key, value in obj.items()}
        else:
            return obj

    # Deep copy and serialize all risk objects recursively
    return serialize_risks({
        "summary": result.summary,
        "risk_matrix": result.risk_matrix,
        "category_analysis": result.category_analysis,
        "mitigation_analysis": result.mitigation_analysis,
        "trend_analysis": result.trend_analysis,
        "recommendations": result.recommendations
    })
# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------

def main() -> int:
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(
        description="Analyze project risks with probability/impact matrix and mitigation recommendations"
    )
    parser.add_argument(
        "data_file",
        help="JSON file containing risk register data"
    )
    parser.add_argument(
        "--format",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)"
    )
    args = parser.parse_args()

    try:
        # Load and validate data
        with open(args.data_file, 'r') as f:
            data = json.load(f)

        # Perform analysis
        result = analyze_risks(data)

        # Output results
        if args.format == "json":
            output = format_json_output(result)
            print(json.dumps(output, indent=2))
        else:
            output = format_text_output(result)
            print(output)
        return 0

    except FileNotFoundError:
        print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
        return 1
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
        return 1
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    sys.exit(main())
Advanced Scrum Master skill for data-driven agile team analysis and coaching. Use when the user asks about sprint planning, velocity tracking, retrospectives...
---
name: "scrum-master"
description: "Advanced Scrum Master skill for data-driven agile team analysis and coaching. Use when the user asks about sprint planning, velocity tracking, retrospectives, standup facilitation, backlog grooming, story points, burndown charts, blocker resolution, or agile team health. Runs Python scripts to analyse sprint JSON exports from Jira or similar tools: velocity_analyzer.py for Monte Carlo sprint forecasting, sprint_health_scorer.py for multi-dimension health scoring, and retrospective_analyzer.py for action-item and theme tracking. Produces confidence-interval forecasts, health grade reports, and improvement-velocity trends for high-performing Scrum teams."
license: MIT
metadata:
version: 2.0.0
author: Alireza Rezvani
category: project-management
domain: agile-development
updated: 2026-02-15
python-tools: velocity_analyzer.py, sprint_health_scorer.py, retrospective_analyzer.py
tech-stack: scrum, agile-coaching, team-dynamics, data-analysis
---
# Scrum Master Expert
Data-driven Scrum Master skill combining sprint analytics, probabilistic forecasting, and team development coaching. The unique value is in the three Python analysis scripts and their workflows — refer to `references/` and `assets/` for deeper framework detail.
---
## Table of Contents
- [Analysis Tools & Usage](#analysis-tools--usage)
- [Input Requirements](#input-requirements)
- [Sprint Execution Workflows](#sprint-execution-workflows)
- [Team Development Workflow](#team-development-workflow)
- [Key Metrics & Targets](#key-metrics--targets)
- [Limitations](#limitations)
---
## Analysis Tools & Usage
### 1. Velocity Analyzer (`scripts/velocity_analyzer.py`)
Runs rolling averages, linear-regression trend detection, and Monte Carlo simulation over sprint history.
```bash
# Text report
python velocity_analyzer.py sprint_data.json --format text
# JSON output for downstream processing
python velocity_analyzer.py sprint_data.json --format json > analysis.json
```
**Outputs**: velocity trend (improving/stable/declining), coefficient of variation, 6-sprint Monte Carlo forecast at the 50 / 70 / 85 / 95% confidence levels, anomaly flags with root-cause suggestions.
**Validation**: If fewer than 3 sprints are present in the input, stop and prompt the user: *"Velocity analysis needs at least 3 sprints. Please provide additional sprint data."* At least 6 sprints are recommended for reliable Monte Carlo results.
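To make the forecast idea concrete, here is a minimal sketch of a Monte Carlo velocity forecast. It is not the code of `velocity_analyzer.py`: the function name `monte_carlo_forecast`, the assumption that sprint velocities are normally distributed, the fixed seed, and the example velocities are all illustrative choices.

```python
import random
import statistics


def monte_carlo_forecast(velocities, sprints_ahead=6, simulations=10_000, seed=42):
    """Simulate total points over `sprints_ahead` sprints and report percentiles.

    Assumption: each future sprint is drawn from a normal distribution fitted
    to the historical velocities. The real script may model this differently.
    """
    if len(velocities) < 3:
        raise ValueError("Velocity analysis needs at least 3 sprints.")

    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    mean = statistics.mean(velocities)
    std_dev = statistics.stdev(velocities)  # sample standard deviation

    # One simulated total per run; negative draws are clamped to zero.
    totals = sorted(
        sum(max(0.0, rng.gauss(mean, std_dev)) for _ in range(sprints_ahead))
        for _ in range(simulations)
    )

    def percentile(p):
        index = min(len(totals) - 1, int(p / 100 * len(totals)))
        return totals[index]

    return {f"{p}%": round(percentile(p), 1) for p in (50, 70, 85, 95)}


# Illustrative velocities, not read from any file
velocities = [18, 21, 19, 22, 17, 24]
forecast = monte_carlo_forecast(velocities)
print(forecast)
```

Each key is the percentile of the simulated totals, so the values rise from the 50% entry to the 95% entry.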
---
### 2. Sprint Health Scorer (`scripts/sprint_health_scorer.py`)
Scores team health across 6 weighted dimensions, producing an overall 0–100 grade.
| Dimension | Weight | Target |
|---|---|---|
| Commitment Reliability | 25% | >85% sprint goals met |
| Scope Stability | 20% | <15% mid-sprint changes |
| Blocker Resolution | 15% | <3 days average |
| Ceremony Engagement | 15% | >90% participation |
| Story Completion Distribution | 15% | High ratio of fully done stories |
| Velocity Predictability | 10% | CV <20% |
```bash
python sprint_health_scorer.py sprint_data.json --format text
```
**Outputs**: overall health score + grade, per-dimension scores with recommendations, sprint-over-sprint trend, intervention priority matrix.
**Validation**: Requires 2+ sprints with ceremony and story-completion data. If data is missing, report which dimensions cannot be scored and ask the user to supply the gaps.
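As a rough illustration of how the weights in the table combine, the sketch below computes a plain weighted average of per-dimension scores. The function name `weighted_health_score` and the choice to return a list of missing dimensions are assumptions for illustration; `sprint_health_scorer.py` may aggregate differently.

```python
# Weights taken from the dimension table above; keys follow the
# dimension names used in assets/expected_output.json.
DIMENSION_WEIGHTS = {
    "commitment_reliability": 0.25,
    "scope_stability": 0.20,
    "blocker_resolution": 0.15,
    "ceremony_engagement": 0.15,
    "story_completion_distribution": 0.15,
    "velocity_predictability": 0.10,
}


def weighted_health_score(dimension_scores):
    """Return (score, missing). Score is None when any dimension is missing."""
    missing = [name for name in DIMENSION_WEIGHTS if name not in dimension_scores]
    if missing:
        return None, missing  # report the gaps instead of guessing
    total = sum(
        DIMENSION_WEIGHTS[name] * dimension_scores[name]
        for name in DIMENSION_WEIGHTS
    )
    return round(total, 1), []


# Per-dimension scores as listed in assets/expected_output.json
scores = {
    "commitment_reliability": 96.8,
    "scope_stability": 54.8,
    "blocker_resolution": 51.7,
    "ceremony_engagement": 92.3,
    "story_completion_distribution": 93.3,
    "velocity_predictability": 80.5,
}
overall, missing = weighted_health_score(scores)
print(overall, missing)
```

A plain weighted average of these scores comes to about 78.8, close to but not identical to the 78.3 recorded in `assets/expected_output.json`, so the script's actual aggregation presumably differs in some detail.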
---
### 3. Retrospective Analyzer (`scripts/retrospective_analyzer.py`)
Tracks action-item completion, recurring themes, sentiment trends, and team maturity progression.
```bash
python retrospective_analyzer.py sprint_data.json --format text
```
**Outputs**: action-item completion rate by priority/owner, recurring-theme persistence scores, team maturity level (forming/storming/norming/performing), improvement-velocity trend.
**Validation**: Requires 3+ retrospectives with action-item tracking. With fewer, note the limitation and offer partial theme analysis only.
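The completion-rate-by-priority output can be pictured with a short sketch. The field names `priority` and `status`, the `"completed"` status value, and the sample items are hypothetical, since the action-item schema is elided in this document.

```python
from collections import defaultdict


def completion_rate_by_priority(action_items):
    """Group action items by priority and return the completed fraction of each.

    Assumes each item is a dict with hypothetical "priority" and "status" keys.
    """
    counts = defaultdict(lambda: {"total": 0, "completed": 0})
    for item in action_items:
        bucket = counts[item.get("priority", "unspecified")]
        bucket["total"] += 1
        if item.get("status") == "completed":
            bucket["completed"] += 1
    return {
        priority: bucket["completed"] / bucket["total"]
        for priority, bucket in counts.items()
    }


# Hypothetical action items for illustration only
items = [
    {"priority": "high", "status": "completed"},
    {"priority": "high", "status": "open"},
    {"priority": "low", "status": "completed"},
]
rates = completion_rate_by_priority(items)
print(rates)
```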
---
## Input Requirements
All scripts accept JSON following the schema in `assets/sample_sprint_data.json`:
```json
{
"team_info": { "name": "string", "size": "number", "scrum_master": "string" },
"sprints": [
{
"sprint_number": "number",
"planned_points": "number",
"completed_points": "number",
"stories": [...],
"blockers": [...],
"ceremonies": {...}
}
],
"retrospectives": [
{
"sprint_number": "number",
"went_well": ["string"],
"to_improve": ["string"],
"action_items": [...]
}
]
}
```
Jira and similar tools can export sprint data; map exported fields to this schema before running the scripts. See `assets/sample_sprint_data.json` for a complete 6-sprint example and `assets/expected_output.json` for corresponding expected results (velocity avg 20.2 pts, CV 12.7%, health score 78.3/100, action-item completion 46.7%).
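Before running the scripts, it can help to check the input against the minimum data requirements stated in the tool sections. The following is a minimal sketch under that assumption; `check_input` is a hypothetical helper, not part of the shipped scripts, and it only checks the top-level `sprints` and `retrospectives` keys from the schema.

```python
import json


def check_input(data):
    """Return a list of problems; an empty list means all three scripts can run."""
    sprints = data.get("sprints", [])
    retrospectives = data.get("retrospectives", [])
    problems = []

    if len(sprints) < 3:
        problems.append("Velocity analysis needs at least 3 sprints.")

    # Health scoring needs sprints that carry ceremony and story data.
    scorable = [s for s in sprints if s.get("ceremonies") and s.get("stories")]
    if len(scorable) < 2:
        problems.append("Health scoring needs 2+ sprints with ceremony and story data.")

    if len(retrospectives) < 3:
        problems.append("Retrospective analysis needs 3+ retrospectives.")

    return problems


# A deliberately incomplete document, for illustration
sample = json.loads('{"sprints": [{"sprint_number": 1}], "retrospectives": []}')
for problem in check_input(sample):
    print(problem)
```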
---
## Sprint Execution Workflows
### Sprint Planning
1. Run velocity analysis: `python velocity_analyzer.py sprint_data.json --format text`
2. Use the 70% confidence interval as the recommended commitment ceiling for the sprint backlog.
3. Review the health scorer's Commitment Reliability and Scope Stability scores to calibrate negotiation with the Product Owner.
4. If Monte Carlo output shows high volatility (CV >20%), surface this to stakeholders with range estimates rather than single-point forecasts.
5. Document capacity assumptions (leave, dependencies) for retrospective comparison.
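The volatility check in step 4 can be sketched as a coefficient-of-variation calculation. The helper names and the example velocities are illustrative; the 20% threshold comes from the step above.

```python
import statistics


def coefficient_of_variation(velocities):
    """Sample standard deviation divided by the mean velocity."""
    return statistics.stdev(velocities) / statistics.mean(velocities)


def is_high_volatility(velocities, threshold=0.20):
    """True when CV exceeds the threshold, i.e. present ranges to stakeholders."""
    return coefficient_of_variation(velocities) > threshold


# Illustrative velocities, not read from any file
steady = [18, 21, 19, 22, 17, 24]
erratic = [10, 30, 15, 35]
print(round(coefficient_of_variation(steady), 3), is_high_volatility(steady))
print(round(coefficient_of_variation(erratic), 3), is_high_volatility(erratic))
```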
### Daily Standup
1. Track participation and help-seeking patterns — feed ceremony data into `sprint_health_scorer.py` at sprint end.
2. Log each blocker with date opened; resolution time feeds the Blocker Resolution dimension.
3. If a blocker is unresolved after 2 days, escalate proactively and note in sprint data.
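A small sketch of the escalation rule in step 3 follows. It uses the `created_date` and `resolved_date` fields that appear in `assets/sample_sprint_data.json`; the function name, the reading of "after 2 days" as strictly more than 2 days open, and blockers B-002 and B-003 are assumptions for illustration.

```python
from datetime import date


def blockers_to_escalate(blockers, today, max_open_days=2):
    """Return IDs of unresolved blockers open for more than `max_open_days`."""
    overdue = []
    for blocker in blockers:
        if blocker.get("resolved_date"):
            continue  # already resolved, nothing to escalate
        opened = date.fromisoformat(blocker["created_date"])
        if (today - opened).days > max_open_days:
            overdue.append(blocker["id"])
    return overdue


blockers = [
    {"id": "B-001", "created_date": "2024-01-10", "resolved_date": "2024-01-12"},
    {"id": "B-002", "created_date": "2024-01-15"},  # hypothetical, still open
    {"id": "B-003", "created_date": "2024-01-17"},  # hypothetical, opened yesterday
]
print(blockers_to_escalate(blockers, today=date(2024, 1, 18)))
```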
### Sprint Review
1. Present velocity trend and health score alongside the demo to give stakeholders delivery context.
2. Capture scope-change requests raised during review; record as scope-change events in sprint data for next scoring cycle.
### Sprint Retrospective
1. Run the health scorer and the retrospective analyzer before the session:
   ```bash
   python sprint_health_scorer.py sprint_data.json --format text > health.txt
   python retrospective_analyzer.py sprint_data.json --format text > retro.txt
   ```
2. Open with the health score and top-flagged dimensions to focus discussion.
3. Use the retrospective analyzer's action-item completion rate to determine how many new action items the team can realistically absorb (target: ≤3 if completion rate <60%).
4. Assign each action item an owner and measurable success criterion before closing the session.
5. Record new action items in `sprint_data.json` for tracking in the next cycle.
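The action-item guideline in step 3 can be written as a small rule. This is a sketch: the function name is hypothetical, and because the text states a cap only for completion rates below 60%, the function returns `None` otherwise rather than inventing a limit.

```python
def max_new_action_items(completion_rate, low_rate=0.60, capped_limit=3):
    """Return the cap on new action items, or None when no cap is stated."""
    if not 0.0 <= completion_rate <= 1.0:
        raise ValueError("completion_rate must be a fraction between 0 and 1")
    return capped_limit if completion_rate < low_rate else None


print(max_new_action_items(0.467))  # completion rate from assets/expected_output.json
print(max_new_action_items(0.75))
```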
---
## Team Development Workflow
### Assessment
```bash
python sprint_health_scorer.py team_data.json > health_assessment.txt
python retrospective_analyzer.py team_data.json > retro_insights.txt
```
- Map retrospective analyzer maturity output to the appropriate development stage.
- Supplement with an anonymous psychological safety pulse survey (Edmondson 7-point scale) and individual 1:1 observations.
- If maturity output is `forming` or `storming`, prioritise safety and conflict-facilitation interventions before process optimisation.
### Intervention
Apply stage-specific facilitation (details in `references/team-dynamics-framework.md`):
| Stage | Focus |
|---|---|
| Forming | Structure, process education, trust building |
| Storming | Conflict facilitation, psychological safety maintenance |
| Norming | Autonomy building, process ownership transfer |
| Performing | Challenge introduction, innovation support |
### Progress Measurement
- **Sprint cadence**: re-run health scorer; target overall score improvement of ≥5 points per quarter.
- **Monthly**: psychological safety pulse survey; target >4.0/5.0.
- **Quarterly**: full maturity re-assessment via retrospective analyzer.
- If scores plateau or regress for 2 consecutive sprints, escalate intervention strategy (see `references/team-dynamics-framework.md`).
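The plateau rule in the last bullet can be sketched as a check over the health-score history. The function name and the sample score series are illustrative; "plateau or regress" is read here as each of the last two sprint-over-sprint changes being zero or negative.

```python
def needs_escalation(health_scores, window=2):
    """True if the score failed to improve for `window` consecutive sprints."""
    if len(health_scores) < window + 1:
        return False  # not enough history to judge
    recent = health_scores[-(window + 1):]
    return all(later <= earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical score histories, oldest first
print(needs_escalation([70, 74, 74, 73]))  # flat, then down
print(needs_escalation([70, 72, 71, 75]))  # dipped once, then recovered
```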
---
## Key Metrics & Targets
| Metric | Target |
|---|---|
| Overall Health Score | >80/100 |
| Psychological Safety Index | >4.0/5.0 |
| Velocity CV (predictability) | <20% |
| Commitment Reliability | >85% |
| Scope Stability | <15% mid-sprint changes |
| Blocker Resolution Time | <3 days |
| Ceremony Engagement | >90% |
| Retrospective Action Completion | >70% |
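The targets in this table can be checked mechanically. The sketch below is one way to do so; the metric key names and the `evaluate` helper are hypothetical, and the example values are the figures reported in `assets/expected_output.json`.

```python
import operator

# (comparison, target) pairs taken from the table above; key names are illustrative
TARGETS = {
    "overall_health_score": (operator.gt, 80),
    "psychological_safety_index": (operator.gt, 4.0),
    "velocity_cv_pct": (operator.lt, 20),
    "commitment_reliability_pct": (operator.gt, 85),
    "scope_change_pct": (operator.lt, 15),
    "blocker_resolution_days": (operator.lt, 3),
    "ceremony_engagement_pct": (operator.gt, 90),
    "retro_action_completion_pct": (operator.gt, 70),
}


def evaluate(metrics):
    """Map each supplied metric to True (target met) or False (target missed)."""
    return {
        name: compare(metrics[name], target)
        for name, (compare, target) in TARGETS.items()
        if name in metrics  # metrics that were not measured are skipped
    }


# Figures reported in assets/expected_output.json
observed = {
    "overall_health_score": 78.3,
    "velocity_cv_pct": 12.7,
    "scope_change_pct": 22.6,
    "blocker_resolution_days": 4.7,
    "retro_action_completion_pct": 46.7,
}
status = evaluate(observed)
for name, met in status.items():
    print(f"{name}: {'met' if met else 'missed'}")
```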
---
## Limitations
- **Sample size**: fewer than 6 sprints reduces Monte Carlo confidence; always state confidence intervals, not point estimates.
- **Data completeness**: missing ceremony or story-completion fields suppress affected scoring dimensions — report gaps explicitly.
- **Context sensitivity**: script recommendations must be interpreted alongside organisational and team context not captured in JSON data.
- **Quantitative bias**: metrics do not replace qualitative observation; combine scores with direct team interaction.
- **Team size**: techniques are optimised for 5–9 member teams; larger groups may require adaptation.
- **External factors**: cross-team dependencies and organisational constraints are not fully modelled by single-team metrics.
---
## Related Skills
- **Agile Product Owner** (`product-team/agile-product-owner/`) — User stories and backlog feed sprint planning
- **Senior PM** (`project-management/senior-pm/`) — Portfolio health context informs sprint priorities
---
*For deep framework references see `references/velocity-forecasting-guide.md` and `references/team-dynamics-framework.md`. For template assets see `assets/sprint_report_template.md` and `assets/team_health_check_template.md`.*
FILE:assets/expected_output.json
{
"velocity_analysis": {
"summary": {
"total_sprints": 6,
"velocity_stats": {
"mean": 20.17,
"median": 20.0,
"min": 17,
"max": 24,
"total_points": 121
},
"commitment_analysis": {
"average_commitment_ratio": 0.908,
"commitment_consistency": 0.179,
"sprints_under_committed": 3,
"sprints_over_committed": 2
},
"volatility": {
"volatility": "low",
"coefficient_of_variation": 0.127
}
},
"trend_analysis": {
"trend": "stable",
"confidence": 0.15,
"relative_slope": -0.013
},
"forecasting": {
"expected_total": 121.0,
"forecasted_totals": {
"50%": 115,
"70%": 125,
"85%": 135,
"95%": 148
}
},
"anomalies": [
{
"sprint_number": 5,
"velocity": 17,
"anomaly_type": "outlier",
"deviation_percentage": -15.7
}
]
},
"sprint_health": {
"overall_score": 78.3,
"health_grade": "good",
"dimension_scores": {
"commitment_reliability": {
"score": 96.8,
"grade": "excellent"
},
"scope_stability": {
"score": 54.8,
"grade": "poor"
},
"blocker_resolution": {
"score": 51.7,
"grade": "poor"
},
"ceremony_engagement": {
"score": 92.3,
"grade": "excellent"
},
"story_completion_distribution": {
"score": 93.3,
"grade": "excellent"
},
"velocity_predictability": {
"score": 80.5,
"grade": "good"
}
}
},
"retrospective_analysis": {
"summary": {
"total_retrospectives": 6,
"average_duration": 74,
"average_attendance": 0.933
},
"action_item_analysis": {
"total_action_items": 15,
"completion_rate": 0.467,
"overdue_rate": 0.533,
"priority_analysis": {
"high": {"completion_rate": 0.50},
"medium": {"completion_rate": 0.33},
"low": {"completion_rate": 0.67}
}
},
"theme_analysis": {
"recurring_themes": {
"process": {"frequency": 1.0, "trend": {"direction": "decreasing"}},
"team_dynamics": {"frequency": 1.0, "trend": {"direction": "increasing"}},
"technical": {"frequency": 0.83, "trend": {"direction": "increasing"}},
"communication": {"frequency": 0.67, "trend": {"direction": "decreasing"}}
}
},
"improvement_trends": {
"team_maturity_score": {
"score": 75.6,
"level": "performing"
},
"improvement_velocity": {
"velocity": "moderate",
"velocity_score": 0.62
}
}
},
"interpretation": {
"strengths": [
"Excellent commitment reliability - team consistently delivers what they commit to",
"High ceremony engagement - team actively participates in scrum events",
"Good story completion distribution - stories are finished rather than left partially done",
"Low velocity volatility - predictable delivery capability"
],
"areas_for_improvement": [
"Scope instability - too much mid-sprint change (22.6% average)",
"Blocker resolution time - 4.7 days average is too long",
"Action item completion rate - only 46.7% completed",
"High overdue rate - 53.3% of action items become overdue"
],
"recommended_actions": [
"Strengthen backlog refinement to reduce scope changes",
"Implement faster blocker escalation process",
"Reduce number of retrospective action items and focus on follow-through",
"Create external dependency register to proactively manage blockers"
]
}
}
FILE:assets/expected_velocity_output.json
{
"summary": {
"total_sprints": 6,
"velocity_stats": {
"mean": 20.166666666666668,
"median": 20.0,
"min": 17,
"max": 24,
"total_points": 121
},
"commitment_analysis": {
"average_commitment_ratio": 0.9075307422046552,
"commitment_consistency": 0.17889820455801825,
"sprints_under_committed": 3,
"sprints_over_committed": 2
},
"scope_change_analysis": {
"average_scope_change": 0.22586752619361317,
"scope_change_volatility": 0.1828476660567787
},
"rolling_averages": {
"3": [
null,
null,
19.333333333333332,
20.666666666666668,
19.333333333333332,
21.0
],
"5": [
null,
null,
19.333333333333332,
20.0,
19.4,
20.6
],
"8": [
null,
null,
19.333333333333332,
20.0,
19.4,
20.166666666666668
]
},
"volatility": {
"volatility": "low",
"coefficient_of_variation": 0.13088153980052333,
"standard_deviation": 2.6394443859772205,
"mean_velocity": 20.166666666666668,
"velocity_range": 7,
"range_ratio": 0.3471074380165289,
"min_velocity": 17,
"max_velocity": 24
}
},
"trend_analysis": {
"trend": "stable",
"slope": 0.6,
"relative_slope": 0.029752066115702476,
"correlation": 0.42527784332026836,
"confidence": 0.42527784332026836,
"recent_sprints_analyzed": 6,
"average_velocity": 20.166666666666668
},
"forecasting": {
"sprints_ahead": 6,
"historical_sprints_used": 6,
"mean_velocity": 20.166666666666668,
"velocity_std_dev": 2.6394443859772205,
"forecasted_totals": {
"50%": 121.00756172377734,
"70%": 124.35398229685968,
"85%": 127.68925669583572,
"95%": 131.66775744677182
},
"average_per_sprint": 20.166666666666668,
"expected_total": 121.0
},
"anomalies": [],
"recommendations": [
"Good velocity stability. Continue current practices."
]
}
FILE:assets/sample_sprint_data.json
{
"team_info": {
"name": "Phoenix Development Team",
"size": 5,
"scrum_master": "Sarah Chen",
"product_owner": "Mike Rodriguez"
},
"sprints": [
{
"sprint_number": 1,
"sprint_name": "Sprint Alpha",
"start_date": "2024-01-08",
"end_date": "2024-01-19",
"planned_points": 23,
"completed_points": 18,
"added_points": 3,
"removed_points": 2,
"carry_over_points": 5,
"team_capacity": 40,
"working_days": 10,
"team_size": 5,
"stories": [
{
"id": "US-101",
"title": "User authentication system",
"points": 8,
"status": "completed",
"assigned_to": "John Doe",
"created_date": "2024-01-08",
"completed_date": "2024-01-16",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-102",
"title": "Dashboard layout implementation",
"points": 5,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-01-08",
"completed_date": "2024-01-18",
"blocked_days": 1,
"priority": "medium"
},
{
"id": "US-103",
"title": "API integration for user data",
"points": 5,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-01-08",
"completed_date": "2024-01-19",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-104",
"title": "Advanced filtering options",
"points": 5,
"status": "in_progress",
"assigned_to": "Alice Brown",
"created_date": "2024-01-08",
"blocked_days": 2,
"priority": "low"
}
],
"blockers": [
{
"id": "B-001",
"description": "Third-party API documentation incomplete",
"created_date": "2024-01-10",
"resolved_date": "2024-01-12",
"resolution_days": 2,
"affected_stories": ["US-103"],
"category": "external"
}
],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.92,
"engagement_score": 0.85
},
"sprint_planning": {
"attendance_rate": 1.0,
"engagement_score": 0.90
},
"sprint_review": {
"attendance_rate": 0.96,
"engagement_score": 0.88
},
"retrospective": {
"attendance_rate": 1.0,
"engagement_score": 0.95
}
}
},
{
"sprint_number": 2,
"sprint_name": "Sprint Beta",
"start_date": "2024-01-22",
"end_date": "2024-02-02",
"planned_points": 21,
"completed_points": 21,
"added_points": 1,
"removed_points": 1,
"carry_over_points": 3,
"team_capacity": 38,
"working_days": 9,
"team_size": 5,
"stories": [
{
"id": "US-105",
"title": "Email notification system",
"points": 8,
"status": "completed",
"assigned_to": "John Doe",
"created_date": "2024-01-22",
"completed_date": "2024-01-30",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-106",
"title": "User profile management",
"points": 5,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-01-22",
"completed_date": "2024-02-01",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-107",
"title": "Data export functionality",
"points": 3,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-01-22",
"completed_date": "2024-01-31",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-104",
"title": "Advanced filtering options",
"points": 5,
"status": "completed",
"assigned_to": "Alice Brown",
"created_date": "2024-01-08",
"completed_date": "2024-02-02",
"blocked_days": 0,
"priority": "low"
}
],
"blockers": [],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.94,
"engagement_score": 0.88
},
"sprint_planning": {
"attendance_rate": 1.0,
"engagement_score": 0.92
},
"sprint_review": {
"attendance_rate": 1.0,
"engagement_score": 0.90
},
"retrospective": {
"attendance_rate": 1.0,
"engagement_score": 0.93
}
}
},
{
"sprint_number": 3,
"sprint_name": "Sprint Gamma",
"start_date": "2024-02-05",
"end_date": "2024-02-16",
"planned_points": 24,
"completed_points": 19,
"added_points": 4,
"removed_points": 3,
"carry_over_points": 5,
"team_capacity": 42,
"working_days": 10,
"team_size": 5,
"stories": [
{
"id": "US-108",
"title": "Real-time chat implementation",
"points": 13,
"status": "in_progress",
"assigned_to": "John Doe",
"created_date": "2024-02-05",
"blocked_days": 3,
"priority": "high"
},
{
"id": "US-109",
"title": "Mobile responsive design",
"points": 8,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-02-05",
"completed_date": "2024-02-14",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-110",
"title": "Performance optimization",
"points": 3,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-02-05",
"completed_date": "2024-02-13",
"blocked_days": 1,
"priority": "medium"
}
],
"blockers": [
{
"id": "B-002",
"description": "WebSocket library compatibility issue",
"created_date": "2024-02-07",
"resolved_date": "2024-02-11",
"resolution_days": 4,
"affected_stories": ["US-108"],
"category": "technical"
},
{
"id": "B-003",
"description": "Database migration pending approval",
"created_date": "2024-02-09",
"resolution_days": 0,
"affected_stories": ["US-110"],
"category": "process"
}
],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.88,
"engagement_score": 0.82
},
"sprint_planning": {
"attendance_rate": 0.96,
"engagement_score": 0.85
},
"sprint_review": {
"attendance_rate": 0.92,
"engagement_score": 0.83
},
"retrospective": {
"attendance_rate": 1.0,
"engagement_score": 0.87
}
}
},
{
"sprint_number": 4,
"sprint_name": "Sprint Delta",
"start_date": "2024-02-19",
"end_date": "2024-03-01",
"planned_points": 20,
"completed_points": 22,
"added_points": 2,
"removed_points": 0,
"carry_over_points": 2,
"team_capacity": 40,
"working_days": 10,
"team_size": 5,
"stories": [
{
"id": "US-108",
"title": "Real-time chat implementation",
"points": 13,
"status": "completed",
"assigned_to": "John Doe",
"created_date": "2024-02-05",
"completed_date": "2024-02-28",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-111",
"title": "Search functionality enhancement",
"points": 5,
"status": "completed",
"assigned_to": "Alice Brown",
"created_date": "2024-02-19",
"completed_date": "2024-02-26",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-112",
"title": "Unit test coverage improvement",
"points": 3,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-02-19",
"completed_date": "2024-02-27",
"blocked_days": 0,
"priority": "low"
},
{
"id": "US-113",
"title": "Error handling improvements",
"points": 1,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-02-25",
"completed_date": "2024-03-01",
"blocked_days": 0,
"priority": "medium"
}
],
"blockers": [],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.96,
"engagement_score": 0.90
},
"sprint_planning": {
"attendance_rate": 1.0,
"engagement_score": 0.94
},
"sprint_review": {
"attendance_rate": 1.0,
"engagement_score": 0.92
},
"retrospective": {
"attendance_rate": 1.0,
"engagement_score": 0.95
}
}
},
{
"sprint_number": 5,
"sprint_name": "Sprint Epsilon",
"start_date": "2024-03-04",
"end_date": "2024-03-15",
"planned_points": 25,
"completed_points": 17,
"added_points": 6,
"removed_points": 8,
"carry_over_points": 8,
"team_capacity": 35,
"working_days": 9,
"team_size": 4,
"stories": [
{
"id": "US-114",
"title": "Advanced analytics dashboard",
"points": 13,
"status": "blocked",
"assigned_to": "John Doe",
"created_date": "2024-03-04",
"blocked_days": 7,
"priority": "high"
},
{
"id": "US-115",
"title": "User permissions system",
"points": 8,
"status": "in_progress",
"assigned_to": "Alice Brown",
"created_date": "2024-03-04",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-116",
"title": "API rate limiting",
"points": 2,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-03-04",
"completed_date": "2024-03-08",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-117",
"title": "Documentation updates",
"points": 2,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-03-04",
"completed_date": "2024-03-10",
"blocked_days": 0,
"priority": "low"
}
],
"blockers": [
{
"id": "B-004",
"description": "Analytics service downtime",
"created_date": "2024-03-05",
"resolution_days": 0,
"affected_stories": ["US-114"],
"category": "external"
},
{
"id": "B-005",
"description": "Team member on sick leave",
"created_date": "2024-03-07",
"resolved_date": "2024-03-15",
"resolution_days": 8,
"affected_stories": ["US-115"],
"category": "team"
}
],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.75,
"engagement_score": 0.70
},
"sprint_planning": {
"attendance_rate": 0.80,
"engagement_score": 0.75
},
"sprint_review": {
"attendance_rate": 0.85,
"engagement_score": 0.78
},
"retrospective": {
"attendance_rate": 0.95,
"engagement_score": 0.88
}
}
},
{
"sprint_number": 6,
"sprint_name": "Sprint Zeta",
"start_date": "2024-03-18",
"end_date": "2024-03-29",
"planned_points": 22,
"completed_points": 24,
"added_points": 2,
"removed_points": 0,
"carry_over_points": 6,
"team_capacity": 45,
"working_days": 10,
"team_size": 5,
"stories": [
{
"id": "US-115",
"title": "User permissions system",
"points": 8,
"status": "completed",
"assigned_to": "Alice Brown",
"created_date": "2024-03-04",
"completed_date": "2024-03-25",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-118",
"title": "Backup and recovery system",
"points": 8,
"status": "completed",
"assigned_to": "John Doe",
"created_date": "2024-03-18",
"completed_date": "2024-03-28",
"blocked_days": 0,
"priority": "high"
},
{
"id": "US-119",
"title": "UI theme customization",
"points": 5,
"status": "completed",
"assigned_to": "Jane Smith",
"created_date": "2024-03-18",
"completed_date": "2024-03-26",
"blocked_days": 0,
"priority": "medium"
},
{
"id": "US-120",
"title": "Performance monitoring",
"points": 3,
"status": "completed",
"assigned_to": "Bob Wilson",
"created_date": "2024-03-18",
"completed_date": "2024-03-24",
"blocked_days": 0,
"priority": "low"
}
],
"blockers": [],
"ceremonies": {
"daily_standup": {
"attendance_rate": 0.98,
"engagement_score": 0.93
},
"sprint_planning": {
"attendance_rate": 1.0,
"engagement_score": 0.96
},
"sprint_review": {
"attendance_rate": 1.0,
"engagement_score": 0.94
},
"retrospective": {
"attendance_rate": 1.0,
"engagement_score": 0.97
}
}
}
],
"retrospectives": [
{
"sprint_number": 1,
"date": "2024-01-19",
"facilitator": "Sarah Chen",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Sarah Chen"],
"duration_minutes": 75,
"went_well": [
"Team collaboration was excellent during planning",
"Daily standups were efficient and focused",
"Good technical problem-solving on authentication system",
"New team member integrated well",
"Clear user story definitions"
],
"to_improve": [
"Story estimation accuracy needs work",
"Too many blockers appeared mid-sprint",
"API documentation was incomplete at start",
"Need better communication with external teams"
],
"action_items": [
{
"id": "AI-001",
"description": "Schedule estimation workshop for next sprint planning",
"owner": "Sarah Chen",
"priority": "high",
"due_date": "2024-01-26",
"status": "completed",
"created_sprint": 1,
"completed_sprint": 2,
"category": "process",
"effort_estimate": "medium"
},
{
"id": "AI-002",
"description": "Establish direct communication channel with API team",
"owner": "Bob Wilson",
"priority": "medium",
"due_date": "2024-01-30",
"status": "completed",
"created_sprint": 1,
"completed_sprint": 2,
"category": "communication",
"effort_estimate": "low"
},
{
"id": "AI-003",
"description": "Create blocker escalation process documentation",
"owner": "Sarah Chen",
"priority": "medium",
"due_date": "2024-02-02",
"status": "in_progress",
"created_sprint": 1,
"category": "process",
"effort_estimate": "low"
}
]
},
{
"sprint_number": 2,
"date": "2024-02-02",
"facilitator": "Sarah Chen",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Sarah Chen"],
"duration_minutes": 60,
"went_well": [
"Perfect sprint execution - completed all planned work",
"No blockers encountered",
"Estimation workshop improved accuracy significantly",
"Team velocity is stabilizing",
"Good ceremony attendance and engagement"
],
"to_improve": [
"Could have taken on more work given the smooth execution",
"Need to celebrate successes more",
"Sprint review could be more interactive",
"Documentation still lagging behind development"
],
"action_items": [
{
"id": "AI-004",
"description": "Implement team celebration ritual for successful sprints",
"owner": "Jane Smith",
"priority": "low",
"due_date": "2024-02-09",
"status": "completed",
"created_sprint": 2,
"completed_sprint": 3,
"category": "team_dynamics",
"effort_estimate": "low"
},
{
"id": "AI-005",
"description": "Create documentation sprint for next iteration",
"owner": "Alice Brown",
"priority": "medium",
"due_date": "2024-02-16",
"status": "cancelled",
"created_sprint": 2,
"category": "process",
"effort_estimate": "high"
}
]
},
{
"sprint_number": 3,
"date": "2024-02-16",
"facilitator": "John Doe",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown"],
"duration_minutes": 90,
"went_well": [
"Good adaptation when faced with technical challenges",
"Team helped each other overcome blockers",
"Mobile design work exceeded expectations",
"Performance improvements had measurable impact"
],
"to_improve": [
"WebSocket integration took longer than expected",
"Too much scope change during the sprint",
"Daily standup attendance dropped",
"Need better technical spike planning",
"Database migration process is too slow"
],
"action_items": [
{
"id": "AI-006",
"description": "Schedule technical spike for complex integrations",
"owner": "John Doe",
"priority": "high",
"due_date": "2024-02-23",
"status": "completed",
"created_sprint": 3,
"completed_sprint": 4,
"category": "technical",
"effort_estimate": "medium"
},
{
"id": "AI-007",
"description": "Review scope change process with Product Owner",
"owner": "Sarah Chen",
"priority": "medium",
"due_date": "2024-02-26",
"status": "completed",
"created_sprint": 3,
"completed_sprint": 4,
"category": "process",
"effort_estimate": "low"
},
{
"id": "AI-008",
"description": "Improve database migration approval workflow",
"owner": "Bob Wilson",
"priority": "medium",
"due_date": "2024-03-08",
"status": "blocked",
"created_sprint": 3,
"category": "process",
"effort_estimate": "high"
}
]
},
{
"sprint_number": 4,
"date": "2024-03-01",
"facilitator": "Sarah Chen",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Sarah Chen"],
"duration_minutes": 45,
"went_well": [
"Exceeded sprint goal by completing extra work",
"Real-time chat finally delivered with high quality",
"Technical spikes prevented major blockers",
"Team ceremonies back to full engagement",
"Search functionality delivered ahead of schedule"
],
"to_improve": [
"Sprint retrospective was rushed due to time constraints",
"Need better capacity planning for variable team sizes",
"Unit test coverage still below target"
],
"action_items": [
{
"id": "AI-009",
"description": "Block more time for retrospectives in calendar",
"owner": "Sarah Chen",
"priority": "low",
"due_date": "2024-03-08",
"status": "completed",
"created_sprint": 4,
"completed_sprint": 5,
"category": "process",
"effort_estimate": "low"
},
{
"id": "AI-010",
"description": "Establish unit test coverage gates in CI/CD",
"owner": "Bob Wilson",
"priority": "high",
"due_date": "2024-03-15",
"status": "in_progress",
"created_sprint": 4,
"category": "technical",
"effort_estimate": "medium"
}
]
},
{
"sprint_number": 5,
"date": "2024-03-15",
"facilitator": "Alice Brown",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown"],
"duration_minutes": 105,
"went_well": [
"Team adapted well to reduced capacity",
"Good support for team member on sick leave",
"Documentation work was delivered on time",
"Rate limiting implementation was smooth"
],
"to_improve": [
"External service dependencies caused major delays",
"Too much scope change again - need better discipline",
"Team capacity planning needs improvement",
"Daily standup attendance dropped significantly",
"Analytics service reliability is a recurring issue"
],
"action_items": [
{
"id": "AI-011",
"description": "Create external service dependency register",
"owner": "John Doe",
"priority": "high",
"due_date": "2024-03-22",
"status": "not_started",
"created_sprint": 5,
"category": "process",
"effort_estimate": "medium"
},
{
"id": "AI-012",
"description": "Escalate analytics service reliability issues",
"owner": "Sarah Chen",
"priority": "high",
"due_date": "2024-03-18",
"status": "completed",
"created_sprint": 5,
"completed_sprint": 6,
"category": "external",
"effort_estimate": "low"
},
{
"id": "AI-013",
"description": "Implement capacity planning buffer for sick leave",
"owner": "Sarah Chen",
"priority": "medium",
"due_date": "2024-03-29",
"status": "in_progress",
"created_sprint": 5,
"category": "process",
"effort_estimate": "medium"
}
]
},
{
"sprint_number": 6,
"date": "2024-03-29",
"facilitator": "Sarah Chen",
"attendees": ["John Doe", "Jane Smith", "Bob Wilson", "Alice Brown", "Sarah Chen"],
"duration_minutes": 70,
"went_well": [
"Excellent sprint execution with team back to full capacity",
"Delivered more points than planned",
"No blockers encountered",
"Strong ceremony engagement across all events",
"Backup system implementation was flawless",
"Team morale has improved significantly"
],
"to_improve": [
"Need to maintain this momentum",
"Could optimize sprint planning efficiency",
"Theme customization feature needs user feedback",
"Performance monitoring setup could be automated"
],
"action_items": [
{
"id": "AI-014",
"description": "Gather user feedback on theme customization",
"owner": "Jane Smith",
"priority": "medium",
"due_date": "2024-04-05",
"status": "not_started",
"created_sprint": 6,
"category": "external",
"effort_estimate": "low"
},
{
"id": "AI-015",
"description": "Automate performance monitoring setup",
"owner": "Bob Wilson",
"priority": "low",
"due_date": "2024-04-12",
"status": "not_started",
"created_sprint": 6,
"category": "technical",
"effort_estimate": "medium"
}
]
}
]
}
FILE:assets/sprint_report_template.md
# Sprint [NUMBER] - [SPRINT_NAME] Report
**Team:** [TEAM_NAME]
**Scrum Master:** [SCRUM_MASTER_NAME]
**Sprint Period:** [START_DATE] to [END_DATE]
**Report Date:** [REPORT_DATE]
---
## Executive Summary
**Sprint Goal Achievement:** [ACHIEVED/PARTIALLY_ACHIEVED/NOT_ACHIEVED]
**Overall Health Grade:** [EXCELLENT/GOOD/FAIR/POOR] ([HEALTH_SCORE]/100)
**Velocity:** [COMPLETED_POINTS] points ([VELOCITY_TREND] from previous sprint)
**Commitment Ratio:** [COMMITMENT_PERCENTAGE]% of planned work completed
### Key Highlights
- [KEY_ACHIEVEMENT_1]
- [KEY_ACHIEVEMENT_2]
- [KEY_CHALLENGE_1]
- [KEY_CHALLENGE_2]
---
## Sprint Metrics Dashboard
### Delivery Performance
| Metric | Value | Target | Status |
|--------|-------|---------|--------|
| **Planned Points** | [PLANNED_POINTS] | - | - |
| **Completed Points** | [COMPLETED_POINTS] | [TARGET_VELOCITY] | [ON_TRACK/BELOW/ABOVE] |
| **Commitment Ratio** | [COMMITMENT_PERCENTAGE]% | 85-100% | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
| **Stories Completed** | [COMPLETED_STORIES]/[TOTAL_STORIES] | 80%+ | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
| **Carry-over Points** | [CARRY_OVER_POINTS] | <20% of planned points | [GOOD/ACCEPTABLE/CONCERNING] |
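
The Commitment Ratio row can be filled from raw sprint totals. The following is a minimal Python sketch, assuming the ratio is completed points divided by planned points; that assumed definition reproduces the `average_commitment_ratio` of roughly 0.9075 in `assets/expected_velocity_output.json`. The `commitment_ratio` helper is hypothetical, and the point totals are copied from `assets/sample_sprint_data.json`.

```python
from statistics import mean

# (planned_points, completed_points) per sprint, copied from
# assets/sample_sprint_data.json.
SPRINTS = [(23, 18), (21, 21), (24, 19), (20, 22), (25, 17), (22, 24)]


def commitment_ratio(planned: int, completed: int) -> float:
    """Return completed / planned (assumed definition of the ratio)."""
    if planned <= 0:
        raise ValueError("planned points must be positive")
    return completed / planned


ratios = [commitment_ratio(planned, completed) for planned, completed in SPRINTS]
average_ratio = mean(ratios)

# Print one line per sprint, then the average, formatted as percentages.
for number, ratio in enumerate(ratios, start=1):
    print(f"Sprint {number}: {ratio:.1%}")
print(f"Average: {average_ratio:.1%}")
```

A ratio above 100% means the team finished more than it planned, so values on both sides of the 85-100% target band are worth a second look.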
### Process Health
| Metric | Value | Target | Status |
|--------|-------|---------|--------|
| **Scope Change** | [SCOPE_CHANGE_PERCENTAGE]% | <15% | [STABLE/MODERATE/UNSTABLE] |
| **Blocker Resolution** | [AVG_RESOLUTION_DAYS] days | <3 days | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
| **Daily Standup Attendance** | [STANDUP_ATTENDANCE]% | >90% | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
| **Retrospective Participation** | [RETRO_ATTENDANCE]% | >95% | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
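
The Scope Change row can be derived the same way. This sketch assumes scope change is the sum of added and removed points divided by planned points, which matches the `average_scope_change` of roughly 0.2259 in `assets/expected_velocity_output.json`. The `scope_change` helper and the `TARGET` constant are hypothetical names; the 15% figure is the target from the table above.

```python
# (planned, added, removed) points per sprint, copied from
# assets/sample_sprint_data.json.
SPRINTS = [(23, 3, 2), (21, 1, 1), (24, 4, 3), (20, 2, 0), (25, 6, 8), (22, 2, 0)]
TARGET = 0.15  # the "<15%" target from the Process Health table


def scope_change(planned: int, added: int, removed: int) -> float:
    """Return (added + removed) / planned (assumed definition)."""
    if planned <= 0:
        raise ValueError("planned points must be positive")
    return (added + removed) / planned


changes = [scope_change(*sprint) for sprint in SPRINTS]
average_change = sum(changes) / len(changes)

# Sprint numbers (1-based) whose scope change misses the target.
over_target = [n for n, change in enumerate(changes, start=1) if change >= TARGET]

print(f"Average scope change: {average_change:.1%}")
print(f"Sprints at or above target: {over_target}")
```

Counting removed points as well as added points treats any mid-sprint churn as instability, not only growth in scope.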
### Quality Indicators
| Metric | Value | Target | Status |
|--------|-------|---------|--------|
| **Definition of Done Adherence** | [DOD_ADHERENCE]% | 100% | [EXCELLENT/NEEDS_IMPROVEMENT] |
| **Test Coverage** | [TEST_COVERAGE]% | >80% | [EXCELLENT/GOOD/NEEDS_IMPROVEMENT] |
| **Code Review Completion** | [CODE_REVIEW_COMPLETION]% | 100% | [EXCELLENT/NEEDS_IMPROVEMENT] |
| **Technical Debt Items** | [TECH_DEBT_ADDED] added / [TECH_DEBT_RESOLVED] resolved | Resolved > added | [IMPROVING/STABLE/CONCERNING] |
---
## User Stories Delivered
### Completed Stories ([COMPLETED_COUNT])
| Story ID | Title | Points | Owner | Completion Date | Notes |
|----------|-------|---------|-------|----------------|-------|
| [STORY_ID_1] | [STORY_TITLE_1] | [POINTS_1] | [OWNER_1] | [DATE_1] | [NOTES_1] |
| [STORY_ID_2] | [STORY_TITLE_2] | [POINTS_2] | [OWNER_2] | [DATE_2] | [NOTES_2] |
### In Progress Stories ([IN_PROGRESS_COUNT])
| Story ID | Title | Points | Owner | Progress | Expected Completion |
|----------|-------|---------|-------|----------|-------------------|
| [STORY_ID_3] | [STORY_TITLE_3] | [POINTS_3] | [OWNER_3] | [PROGRESS_3] | [ETA_3] |
### Blocked Stories ([BLOCKED_COUNT])
| Story ID | Title | Points | Owner | Blocker | Days Blocked | Escalation Status |
|----------|-------|---------|-------|---------|-------------|------------------|
| [STORY_ID_4] | [STORY_TITLE_4] | [POINTS_4] | [OWNER_4] | [BLOCKER_4] | [DAYS_4] | [ESCALATION_4] |
---
## Blockers & Impediments
### Resolved This Sprint ([RESOLVED_BLOCKERS_COUNT])
| ID | Description | Category | Created | Resolved | Resolution Time | Impact |
|----|-------------|----------|---------|----------|----------------|---------|
| [BLOCKER_ID_1] | [DESCRIPTION_1] | [CATEGORY_1] | [CREATED_1] | [RESOLVED_1] | [TIME_1] days | [IMPACT_1] |
### Active Blockers ([ACTIVE_BLOCKERS_COUNT])
| ID | Description | Category | Age | Owner | Next Steps | Priority |
|----|-------------|----------|-----|-------|------------|----------|
| [BLOCKER_ID_2] | [DESCRIPTION_2] | [CATEGORY_2] | [AGE_2] days | [OWNER_2] | [NEXT_STEPS_2] | [PRIORITY_2] |
### Escalation Required
- [ESCALATION_ITEM_1]
- [ESCALATION_ITEM_2]
---
## Team Performance Analysis
### Velocity Trend
```
Sprint [N-2]: [VELOCITY_N2] points
Sprint [N-1]: [VELOCITY_N1] points
Sprint [N]: [VELOCITY_N] points
Trend: [IMPROVING/STABLE/DECLINING] ([TREND_PERCENTAGE]% change)
```
### Predictability Assessment
- **Coefficient of Variation:** [CV_PERCENTAGE]% ([HIGH/MODERATE/LOW] volatility)
- **Commitment Reliability:** [COMMITMENT_RELIABILITY_SCORE]/100
- **Forecast Confidence:** [FORECAST_CONFIDENCE]% for next sprint
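
The Coefficient of Variation above can be computed from completed points per sprint. This is a minimal sketch, assuming the sample standard deviation divided by the mean, which reproduces the `coefficient_of_variation` of roughly 0.1309 in `assets/expected_velocity_output.json`. The `LOW_MAX` and `MODERATE_MAX` cut-offs and both helper functions are hypothetical; the bundled fixture only confirms that 0.1309 is labelled low.

```python
from statistics import mean, stdev

# Completed points per sprint, copied from assets/sample_sprint_data.json.
VELOCITIES = [18, 21, 19, 22, 17, 24]

# Hypothetical cut-offs for the HIGH/MODERATE/LOW label; tune to your team.
LOW_MAX = 0.15
MODERATE_MAX = 0.30


def coefficient_of_variation(values: list[int]) -> float:
    """Return the sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("need at least two sprints")
    average = mean(values)
    if average == 0:
        raise ValueError("mean velocity is zero")
    return stdev(values) / average


def volatility_label(cv: float) -> str:
    """Map a CV to a label using the hypothetical cut-offs above."""
    if cv < LOW_MAX:
        return "LOW"
    if cv < MODERATE_MAX:
        return "MODERATE"
    return "HIGH"


cv = coefficient_of_variation(VELOCITIES)
label = volatility_label(cv)
print(f"Coefficient of Variation: {cv:.1%} ({label} volatility)")
```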
### Team Health Indicators
| Dimension | Score | Grade | Trend | Action Required |
|-----------|-------|--------|-------|-----------------|
| **Commitment Reliability** | [SCORE_1]/100 | [GRADE_1] | [TREND_1] | [ACTION_1] |
| **Scope Stability** | [SCORE_2]/100 | [GRADE_2] | [TREND_2] | [ACTION_2] |
| **Blocker Resolution** | [SCORE_3]/100 | [GRADE_3] | [TREND_3] | [ACTION_3] |
| **Ceremony Engagement** | [SCORE_4]/100 | [GRADE_4] | [TREND_4] | [ACTION_4] |
| **Story Completion** | [SCORE_5]/100 | [GRADE_5] | [TREND_5] | [ACTION_5] |
---
## Retrospective Insights
### What Went Well
- [WENT_WELL_1]
- [WENT_WELL_2]
- [WENT_WELL_3]
### Areas for Improvement
- [IMPROVE_1]
- [IMPROVE_2]
- [IMPROVE_3]
### Action Items from Retrospective
| ID | Action | Owner | Due Date | Priority | Status |
|----|--------|-------|----------|----------|--------|
| [AI_ID_1] | [ACTION_1] | [OWNER_1] | [DUE_1] | [PRIORITY_1] | [STATUS_1] |
| [AI_ID_2] | [ACTION_2] | [OWNER_2] | [DUE_2] | [PRIORITY_2] | [STATUS_2] |
### Previous Sprint Action Items Follow-up
| ID | Action | Owner | Status | Completion Notes |
|----|--------|-------|--------|------------------|
| [PREV_AI_1] | [PREV_ACTION_1] | [PREV_OWNER_1] | [PREV_STATUS_1] | [PREV_NOTES_1] |
---
## Risks & Dependencies
### High Priority Risks
| Risk | Probability | Impact | Mitigation Plan | Owner |
|------|-------------|---------|-----------------|-------|
| [RISK_1] | [PROB_1] | [IMPACT_1] | [MITIGATION_1] | [OWNER_1] |
### External Dependencies
| Dependency | Provider | Status | Expected Resolution | Contingency Plan |
|------------|----------|--------|---------------------|------------------|
| [DEP_1] | [PROVIDER_1] | [STATUS_1] | [RESOLUTION_1] | [CONTINGENCY_1] |
---
## Looking Ahead: Next Sprint
### Sprint Goals
1. [GOAL_1]
2. [GOAL_2]
3. [GOAL_3]
### Planned Capacity
- **Team Size:** [TEAM_SIZE] members
- **Available Capacity:** [AVAILABLE_HOURS] hours ([CAPACITY_POINTS] points)
- **Planned Velocity:** [PLANNED_VELOCITY] points
- **Capacity Buffer:** [BUFFER_PERCENTAGE]% for unknowns
### Key Focus Areas
- [FOCUS_AREA_1]
- [FOCUS_AREA_2]
- [FOCUS_AREA_3]
### Dependencies to Monitor
- [MONITOR_DEP_1]
- [MONITOR_DEP_2]
---
## Recommendations
### Immediate Actions (This Sprint)
1. **[HIGH_PRIORITY_ACTION_1]** - [DESCRIPTION] (Owner: [OWNER], Due: [DATE])
2. **[HIGH_PRIORITY_ACTION_2]** - [DESCRIPTION] (Owner: [OWNER], Due: [DATE])
### Process Improvements (Next 2-3 Sprints)
1. **[PROCESS_IMPROVEMENT_1]** - [DESCRIPTION]
2. **[PROCESS_IMPROVEMENT_2]** - [DESCRIPTION]
### Team Development Opportunities
1. **[DEVELOPMENT_1]** - [DESCRIPTION]
2. **[DEVELOPMENT_2]** - [DESCRIPTION]
---
## Appendix
### Sprint Burndown Chart
[BURNDOWN_CHART_REFERENCE]
### Detailed Metrics
[DETAILED_METRICS_REFERENCE]
### Team Feedback
[TEAM_FEEDBACK_SUMMARY]
---
**Report prepared by:** [SCRUM_MASTER_NAME]
**Next review date:** [NEXT_REVIEW_DATE]
**Distribution:** Product Owner, Development Team, Stakeholders
---
*This report is generated using standardized sprint health metrics and retrospective analysis. For questions or deeper analysis, please contact the Scrum Master.*
FILE:assets/team_health_check_template.md
# Team Health Check - Spotify Squad Model
**Team:** [TEAM_NAME]
**Assessment Date:** [DATE]
**Facilitator:** [FACILITATOR_NAME]
**Participants:** [PARTICIPANT_COUNT] of [TOTAL_TEAM_SIZE] members
---
## Health Check Overview
The Team Health Check is based on Spotify's Squad Health Check model, designed to visualize team health across multiple dimensions. Each dimension is assessed using a simple traffic light system:
- 🟢 **Green (Awesome):** We're doing great! No major concerns.
- 🟡 **Yellow (Some Concerns):** We're doing okay, but there are some things we could improve.
- 🔴 **Red (Not Good):** This really sucks and we need to do something about it.
### Assessment Method
- Anonymous individual ratings followed by team discussion
- Focus on trends over time rather than absolute scores
- Action-oriented outcomes for improvement areas
---
## Health Dimensions Assessment
### 1. Delivering Value 🎯
*Are we delivering value to our users and stakeholders?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 2. Learning 📚
*Are we learning and growing as individuals and as a team?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 3. Fun 🎉
*Do we enjoy working together and find our work engaging?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 4. Health of Codebase 🏗️
*Is our code healthy, maintainable, and of good quality?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 5. Mission Clarity 🎯
*Do we understand why we exist and what we're supposed to achieve?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 6. Suitable Process ⚙️
*Is our process helping us be effective?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 7. Support 🤝
*Do we get the support we need from management and other teams?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 8. Speed ⚡
*Are we able to deliver quickly without compromising quality?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
### 9. Pawns or Players 👥
*Do we feel like we have control over our work and destiny?*
**Current Status:** [🟢/🟡/🔴]
**Trend from Last Check:** [⬆️ Improving / ➡️ Stable / ⬇️ Declining]
**Team Rating:** [X]/[PARTICIPANT_COUNT] team members voted Green, [Y]/[PARTICIPANT_COUNT] Yellow, [Z]/[PARTICIPANT_COUNT] Red
**What's Working Well:**
- [POSITIVE_POINT_1]
- [POSITIVE_POINT_2]
**Areas of Concern:**
- [CONCERN_1]
- [CONCERN_2]
**Suggested Actions:**
- [ACTION_1]
- [ACTION_2]
---
## Overall Health Summary
### Health Score Distribution
- 🟢 **Green Dimensions:** [GREEN_COUNT]/9 ([GREEN_PERCENTAGE]%)
- 🟡 **Yellow Dimensions:** [YELLOW_COUNT]/9 ([YELLOW_PERCENTAGE]%)
- 🔴 **Red Dimensions:** [RED_COUNT]/9 ([RED_PERCENTAGE]%)
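
The counts and percentages in the distribution above can be tallied from the per-dimension statuses. This is a minimal sketch; the `distribution` helper is hypothetical and the statuses in `STATUSES` are illustrative values only, not results from any real assessment.

```python
from collections import Counter

# Illustrative statuses only; replace with the team's agreed colour per dimension.
STATUSES = {
    "Delivering Value": "green",
    "Learning": "green",
    "Fun": "yellow",
    "Health of Codebase": "red",
    "Mission Clarity": "green",
    "Suitable Process": "yellow",
    "Support": "green",
    "Speed": "yellow",
    "Pawns or Players": "green",
}


def distribution(statuses: dict[str, str]) -> dict[str, tuple[int, float]]:
    """Map each colour to (count, percentage of all dimensions)."""
    total = len(statuses)
    if total == 0:
        raise ValueError("no dimensions were assessed")
    counts = Counter(statuses.values())
    return {
        colour: (counts[colour], round(100 * counts[colour] / total, 1))
        for colour in ("green", "yellow", "red")
    }


summary = distribution(STATUSES)
for colour, (count, percent) in summary.items():
    print(f"{colour.title()} Dimensions: {count}/{len(STATUSES)} ({percent}%)")
```

Because each percentage is rounded independently, the three values may not sum to exactly 100%.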
### Overall Health Grade: [EXCELLENT/GOOD/FAIR/POOR]
### Trend Analysis
- **Improving:** [IMPROVING_COUNT] dimensions
- **Stable:** [STABLE_COUNT] dimensions
- **Declining:** [DECLINING_COUNT] dimensions
### Team Maturity Level
Based on the health check results and team dynamics observed:
**[FORMING/STORMING/NORMING/PERFORMING/ADJOURNING]**
---
## Priority Action Items
### High Priority (Red Dimensions)
1. **[RED_DIMENSION_1]:** [ACTION_DESCRIPTION_1]
   - Owner: [OWNER_1]
   - Timeline: [TIMELINE_1]
   - Success Criteria: [CRITERIA_1]
2. **[RED_DIMENSION_2]:** [ACTION_DESCRIPTION_2]
   - Owner: [OWNER_2]
   - Timeline: [TIMELINE_2]
   - Success Criteria: [CRITERIA_2]
### Medium Priority (Yellow Dimensions)
1. **[YELLOW_DIMENSION_1]:** [ACTION_DESCRIPTION_1]
   - Owner: [OWNER_1]
   - Timeline: [TIMELINE_1]
2. **[YELLOW_DIMENSION_2]:** [ACTION_DESCRIPTION_2]
   - Owner: [OWNER_2]
   - Timeline: [TIMELINE_2]
### Maintain Strengths (Green Dimensions)
1. **[GREEN_DIMENSION_1]:** Continue [STRENGTH_PRACTICE_1]
2. **[GREEN_DIMENSION_2]:** Share [BEST_PRACTICE_1] with other teams
---
## Psychological Safety Assessment
*Separate anonymous assessment of team psychological safety*
### Psychological Safety Indicators
1. **Speaking Up:** Team members feel safe to speak up with ideas, questions, concerns, or mistakes
   - Score: [SCORE_1]/5 ⭐⭐⭐⭐⭐
2. **Risk Taking:** Team members feel safe to take risks and make mistakes
   - Score: [SCORE_2]/5 ⭐⭐⭐⭐⭐
3. **Asking for Help:** Team members feel comfortable asking for help or admitting they don't know something
   - Score: [SCORE_3]/5 ⭐⭐⭐⭐⭐
4. **Discussing Problems:** Difficult topics and problems can be discussed openly
   - Score: [SCORE_4]/5 ⭐⭐⭐⭐⭐
5. **Being Yourself:** Team members don't feel they have to pretend to be someone else
   - Score: [SCORE_5]/5 ⭐⭐⭐⭐⭐
**Overall Psychological Safety Score:** [TOTAL_SCORE]/25
### Psychological Safety Actions
- [PSYCH_SAFETY_ACTION_1]
- [PSYCH_SAFETY_ACTION_2]
---
## Communication & Collaboration Assessment
### Communication Quality
- **Clarity of Communication:** [SCORE]/5 ⭐⭐⭐⭐⭐
- **Frequency of Communication:** [SCORE]/5 ⭐⭐⭐⭐⭐
- **Openness & Transparency:** [SCORE]/5 ⭐⭐⭐⭐⭐
### Collaboration Patterns
- **Cross-functional Collaboration:** [SCORE]/5 ⭐⭐⭐⭐⭐
- **Knowledge Sharing:** [SCORE]/5 ⭐⭐⭐⭐⭐
- **Conflict Resolution:** [SCORE]/5 ⭐⭐⭐⭐⭐
---
## Follow-up Plan
### Next Health Check
**Scheduled Date:** [NEXT_DATE]
**Frequency:** [MONTHLY/QUARTERLY/BI-ANNUAL]
### Interim Check-ins
- **Sprint Retrospectives:** Continue monitoring health indicators
- **Weekly 1:1s:** Individual pulse checks with team members
- **Monthly Team Lunches:** Informal health and morale assessment
### Success Metrics
We'll know we're improving when we see:
- [SUCCESS_METRIC_1]
- [SUCCESS_METRIC_2]
- [SUCCESS_METRIC_3]
---
## Historical Comparison
### Previous Health Checks
| Date | Green | Yellow | Red | Overall Trend |
|------|-------|--------|-----|---------------|
| [PREV_DATE_1] | [G1] | [Y1] | [R1] | [TREND_1] |
| [PREV_DATE_2] | [G2] | [Y2] | [R2] | [TREND_2] |
| [CURRENT_DATE] | [G3] | [Y3] | [R3] | [TREND_3] |
### Long-term Improvements
- [LONG_TERM_IMPROVEMENT_1]
- [LONG_TERM_IMPROVEMENT_2]
### Persistent Challenges
- [PERSISTENT_CHALLENGE_1]
- [PERSISTENT_CHALLENGE_2]
---
## Team Comments & Feedback
*Anonymous feedback from team members*
### What's the most important thing we should focus on?
- "[FEEDBACK_1]"
- "[FEEDBACK_2]"
- "[FEEDBACK_3]"
### What's our biggest strength as a team?
- "[STRENGTH_1]"
- "[STRENGTH_2]"
- "[STRENGTH_3]"
### If you could change one thing, what would it be?
- "[CHANGE_1]"
- "[CHANGE_2]"
- "[CHANGE_3]"
---
## Action Item Summary
| Priority | Action | Owner | Due Date | Success Criteria | Status |
|----------|---------|-------|----------|------------------|--------|
| High | [ACTION_1] | [OWNER_1] | [DATE_1] | [CRITERIA_1] | [STATUS_1] |
| High | [ACTION_2] | [OWNER_2] | [DATE_2] | [CRITERIA_2] | [STATUS_2] |
| Medium | [ACTION_3] | [OWNER_3] | [DATE_3] | [CRITERIA_3] | [STATUS_3] |
| Medium | [ACTION_4] | [OWNER_4] | [DATE_4] | [CRITERIA_4] | [STATUS_4] |
---
**Assessment completed by:** [FACILITATOR_NAME]
**Report distribution:** Team Members, Product Owner, Management (summary only)
**Confidentiality:** Individual responses kept confidential, only aggregate data shared
---
*This health check is based on the Spotify Squad Health Check model. The goal is continuous improvement, not judgment. Use this data to have better conversations about how to work together effectively.*
FILE:references/retro-formats.md
# Sprint Retrospective Formats
## Start/Stop/Continue
**Best for:** Teams new to retrospectives, quick format
**Duration:** 45-60 minutes
### Structure
Create three columns:
- **Start:** What should we begin doing?
- **Stop:** What should we stop doing?
- **Continue:** What's working well that we should keep doing?
### Process
1. Team silently adds items to each column (10 min)
2. Group similar items (5 min)
3. Discuss each category, vote on top items (20 min)
4. Select 2-3 actions (10 min)
### Example Output
**Start:**
- Pairing on complex stories
- Code reviews within 4 hours
**Stop:**
- Taking on work mid-sprint
- Skipping acceptance criteria
**Continue:**
- Daily standups at 9:30am
- Demo prep on Thursday
---
## Glad/Sad/Mad
**Best for:** Emotional check-in, team morale assessment
**Duration:** 60-75 minutes
### Structure
Create three areas:
- **Glad:** What made you happy this sprint?
- **Sad:** What disappointed you?
- **Mad:** What frustrated you?
### Process
1. Silent brainstorming (10 min)
2. Share items, one person at a time (15 min)
3. Group themes (5 min)
4. Discuss top items from each category (20 min)
5. Identify action items (10 min)
### Example Output
**Glad:**
- Shipped feature X on time
- Great collaboration with design team
- New deployment process worked well
**Sad:**
- Lost time to production bugs
- Didn't finish all committed work
- Documentation fell behind
**Mad:**
- Environment was down 2 days
- Requirements changed mid-sprint
- Still waiting on API key from vendor
### Facilitation Tips
- Acknowledge emotions, don't dismiss
- Focus on what we can control
- Convert frustrations into actions
---
## 4Ls (Liked, Learned, Lacked, Longed For)
**Best for:** Deeper reflection, learning focus
**Duration:** 60-90 minutes
### Structure
- **Liked:** What went well? What did we enjoy?
- **Learned:** What new insights did we gain?
- **Lacked:** What was missing? What did we need?
- **Longed For:** What do we wish we had?
### Process
1. Individual reflection (10 min)
2. Round-robin sharing (20 min)
3. Group similar items (10 min)
4. Deep dive on top items (20 min)
5. Action planning (15 min)
### Example Output
**Liked:**
- Pair programming sessions
- Clear acceptance criteria
- Product Owner availability
**Learned:**
- New testing framework capabilities
- How to better estimate stories
- Importance of architectural review
**Lacked:**
- Automated deployment
- Clear API documentation
- Sufficient testing time
**Longed For:**
- Better development environments
- More design time upfront
- Dedicated QA support
---
## Sailboat
**Best for:** Visual teams, identifying headwinds and tailwinds
**Duration:** 60-90 minutes
### Structure
Draw a sailboat with:
- **Wind (tailwinds):** What's helping us go faster?
- **Anchors:** What's slowing us down?
- **Rocks (hazards):** What risks are ahead?
- **Island (goal):** Where are we headed?
### Process
1. Explain metaphor (5 min)
2. Team adds sticky notes to each area (15 min)
3. Group and discuss each area (30 min)
4. Prioritize anchors to remove (10 min)
5. Create action plan (15 min)
### Example Output
**Wind:**
- Strong team collaboration
- Clear product vision
- Good tooling
**Anchors:**
- Slow CI/CD pipeline
- Too many meetings
- Technical debt
**Rocks:**
- Upcoming dependency on Team B
- Key person on vacation next sprint
- Infrastructure migration
**Island:**
- Launch v2.0 by end of quarter
- Improve system stability
- Reduce production bugs by 50%
---
## Timeline
**Best for:** Detailed sprint review, identifying patterns
**Duration:** 75-90 minutes
### Structure
Create a timeline of the sprint on a whiteboard:
- Days of the sprint across the top
- Events, milestones, feelings plotted on timeline
### Process
1. Draw sprint timeline (5 min)
2. Team adds events chronologically (15 min)
3. Add emotion indicators (happy/sad/stressed) (10 min)
4. Identify patterns and themes (20 min)
5. Discuss high/low points (20 min)
6. Extract learnings and actions (15 min)
### Example Timeline
```
Day 1: Sprint planning, feeling optimistic 😊
Day 3: Production bug discovered, stressed 😰
Day 5: Bug fixed, relieved 😌
Day 7: Design feedback changed scope, frustrated 😠
Day 9: Great pairing session on new feature 😊
Day 10: Demo went really well! 🎉
```
### Facilitation Tips
- Focus on objective events first, emotions second
- Look for correlations between events and feelings
- Identify early warning signs
- Celebrate wins
---
## Starfish
**Best for:** More granular feedback than Start/Stop/Continue
**Duration:** 60-90 minutes
### Structure
Five categories:
- **Keep Doing:** What's working, don't change
- **Less Of:** What should we reduce?
- **More Of:** What should we increase?
- **Stop Doing:** What should we eliminate?
- **Start Doing:** What new practices should we try?
### Process
1. Explain each category (5 min)
2. Silent brainstorming (15 min)
3. Share and group items (15 min)
4. Discuss each category (25 min)
5. Vote on top actions (10 min)
6. Create action plan (15 min)
### Example Output
**Keep Doing:**
- Pairing on complex stories
- Demo every Friday
**Less Of:**
- Context switching
- Unplanned work
**More Of:**
- Automated testing
- Design upfront
**Stop Doing:**
- Skipping code reviews
- Working weekends
**Start Doing:**
- Mob programming for knowledge sharing
- Weekly architecture discussions
---
## Speed Dating
**Best for:** Large teams, fresh perspectives
**Duration:** 60-65 minutes
### Structure
- Pair up team members who don't usually work together
- Rotate pairs every 10 minutes
- Discuss sprint from different perspectives
### Process
1. Create pairs (2 min)
2. Round 1: "What went well?" (10 min)
3. Rotate pairs (2 min)
4. Round 2: "What could improve?" (10 min)
5. Rotate pairs (2 min)
6. Round 3: "What should we try?" (10 min)
7. Full group synthesis (15 min)
8. Action planning (10 min)
### Facilitation Tips
- Ensure quiet voices are heard
- Mix up pairs intentionally
- Capture themes as they emerge
- Focus on shared themes in synthesis
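When adapting this format, it can help to check that the timeboxes still fit the meeting slot. The sketch below is a minimal illustration, not part of the format itself: `SPEED_DATING_AGENDA`, `total_minutes`, and `overrun` are hypothetical names, and the agenda simply restates the Process steps above. Those steps add up to 61 minutes, so a facilitator working to a strict one-hour slot may want to trim a minute somewhere.

```python
# Hypothetical helper: sums agenda timeboxes and compares them to a planned slot.
SPEED_DATING_AGENDA = [
    ("Create pairs", 2),
    ("Round 1: What went well?", 10),
    ("Rotate pairs", 2),
    ("Round 2: What could improve?", 10),
    ("Rotate pairs", 2),
    ("Round 3: What should we try?", 10),
    ("Full group synthesis", 15),
    ("Action planning", 10),
]


def total_minutes(agenda):
    """Add up the minutes of every (step, minutes) pair."""
    return sum(minutes for _, minutes in agenda)


def overrun(agenda, planned_minutes):
    """Return how many minutes the agenda exceeds the plan (0 if it fits)."""
    return max(0, total_minutes(agenda) - planned_minutes)


print(total_minutes(SPEED_DATING_AGENDA))  # 61
print(overrun(SPEED_DATING_AGENDA, 60))    # 1
```

The same check works for any format in this file once its Process steps are listed as `(step, minutes)` pairs.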
---
## Three Little Pigs
**Best for:** Architecture and technical decisions
**Duration:** 60-75 minutes
### Structure
Based on the story:
- **Straw House:** What's fragile? What will blow down?
- **Stick House:** What's okay but could be better?
- **Brick House:** What's solid and will last?
### Process
1. Explain metaphor (5 min)
2. Team identifies items for each house (15 min)
3. Group and discuss (20 min)
4. Prioritize straw house items to fix (10 min)
5. Create action plan (15 min)
### Example Output
**Straw House (fragile):**
- Manual deployment process
- No automated tests for API
- Undocumented code
**Stick House (needs improvement):**
- Test coverage at 60%
- Some documentation exists
- Partially automated builds
**Brick House (solid):**
- Strong CI/CD for frontend
- Well-tested core modules
- Clear architecture docs
---
## Facilitation Best Practices
### Before Retrospective
- Review previous action items
- Gather sprint metrics
- Choose format based on team needs
- Prepare collaboration space
### During Retrospective
- **Set the stage:** Create safe environment
- **Prime directive:** "Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand."
- **Timebox discussions:** Keep energy high
- **Focus on actions:** Not just talk
- **Limit action items:** 1-3 max for next sprint
- **Get specific:** Vague actions don't happen
### After Retrospective
- Document immediately in Confluence
- Create Jira tickets for actions
- Assign owners and due dates
- Track completion
- Start next retro by reviewing these
### Red Flags
- Same issues every retro → Need deeper intervention
- No action items → Team not engaged
- Blame game → Not safe environment
- No follow-through → Actions not valued
- Facilitator talks more than team → Not facilitating
### Rotation Strategy
- Vary formats every 2-3 sprints
- Let team choose occasionally
- Match format to team mood
- Try new format when stuck
FILE:references/team-dynamics-framework.md
# Team Dynamics Framework for Scrum Teams
## Table of Contents
- [Overview](#overview)
- [Tuckman's Model Applied to Scrum](#tuckmans-model-applied-to-scrum)
- [Psychological Safety in Agile Teams](#psychological-safety-in-agile-teams)
- [Team Performance Metrics](#team-performance-metrics)
- [Facilitation Techniques by Stage](#facilitation-techniques-by-stage)
- [Conflict Resolution Strategies](#conflict-resolution-strategies)
- [Assessment Tools](#assessment-tools)
- [Intervention Strategies](#intervention-strategies)
- [Measurement & Tracking](#measurement--tracking)
---
## Overview
Understanding team dynamics is crucial for Scrum Masters to effectively guide teams through their development journey. This framework combines Tuckman's stages of group development with psychological safety principles and practical Scrum-specific interventions.
### Core Principles
1. **Development is Non-Linear**: Teams may cycle between stages based on changes
2. **Each Stage Has Value**: Every stage serves a purpose in team development
3. **Facilitation Must Adapt**: Leadership style should match the team's developmental stage
4. **Psychological Safety is Foundational**: Without safety, teams cannot reach high performance
5. **Measurement Enables Improvement**: Track dynamics to guide interventions
### Framework Components
- **Tuckman's Stages**: Forming → Storming → Norming → Performing → Adjourning
- **Psychological Safety**: Environment for risk-taking and learning
- **Scrum Ceremonies**: Team development accelerators when facilitated well
- **Metrics & Assessment**: Data-driven approach to team health
---
## Tuckman's Model Applied to Scrum
### Stage 1: Forming (Team Inception)
*"Getting to know each other and understanding the work"*
#### Characteristics in Scrum Context
- **Individual Focus**: Members work independently, unsure of roles
- **Politeness**: Conflict is avoided, everyone tries to be agreeable
- **Dependency**: Heavy reliance on Scrum Master for guidance
- **Ceremony Awkwardness**: Standups feel forced, retrospectives are superficial
- **Low Velocity**: Productivity is low as team learns to work together
#### Scrum Master Behaviors
- **Directing Style**: Provide clear structure and guidance
- **Process Champion**: Teach the Scrum framework and ceremonies rigorously
- **Relationship Builder**: Facilitate team bonding and trust building
- **Context Setter**: Explain the "why" behind practices and goals
#### Key Metrics & Indicators
| Metric | Forming Range | Assessment Method |
|--------|---------------|-------------------|
| Ceremony Participation | 60-80% | Attendance tracking |
| Cross-functional Collaboration | Low | Story pairing frequency |
| Velocity Predictability | High volatility (CV >40%) | Velocity coefficient of variation |
| Psychological Safety | 2.0-3.5/5.0 | Anonymous team survey |
| Conflict Frequency | Very low | Retrospective themes |
#### Intervention Strategies
- **Team Charter Creation**: Define working agreements and values together
- **Skill Inventory**: Map team capabilities and identify knowledge gaps
- **Pairing/Mobbing**: Encourage collaborative work to build relationships
- **Social Activities**: Team lunches, informal interactions
- **Process Education**: Intensive Scrum training and coaching
#### Success Indicators
- Consistent ceremony attendance (>85%)
- Team members start asking questions about process
- Initial working agreements are established
- Some cross-functional collaboration begins
---
### Stage 2: Storming (Productive Conflict)
*"Working through differences and establishing team dynamics"*
#### Characteristics in Scrum Context
- **Conflict Emergence**: Disagreements about technical approaches, priorities
- **Role Struggles**: Tension around responsibilities and decision-making authority
- **Process Pushback**: Questioning Scrum practices, suggesting changes
- **Subgroup Formation**: Cliques or mini-alliances may form
- **Velocity Fluctuations**: Performance varies as team works through conflicts
#### Scrum Master Behaviors
- **Coaching Style**: Guide conflict resolution without directing solutions
- **Neutral Facilitator**: Help team work through disagreements constructively
- **Psychological Safety Guardian**: Ensure conflicts remain productive
- **Process Flexibility**: Adapt ceremonies to team's evolving needs
#### Key Metrics & Indicators
| Metric | Storming Range | Assessment Method |
|--------|---------------|-------------------|
| Conflict Frequency | Moderate-High | Retrospective action items |
| Ceremony Engagement | Variable (70-90%) | Participation quality scoring |
| Velocity Volatility | Moderate (CV 25-40%) | Sprint-to-sprint variation |
| Psychological Safety | 2.5-4.0/5.0 | Team surveys + observation |
| Process Adherence | Inconsistent | Ceremony audit scores |
#### Intervention Strategies
- **Conflict Facilitation**: Structured conflict resolution sessions
- **Retrospective Focus**: Deep-dive into team dynamics and relationships
- **Individual Coaching**: 1:1s to address personal concerns and conflicts
- **Working Agreement Updates**: Revisit and refine team agreements
- **External Facilitation**: Bring in neutral parties for significant conflicts
#### Success Indicators
- Conflicts are addressed openly rather than avoided
- Team develops mechanisms for working through disagreements
- Ceremony participation becomes more authentic
- Velocity starts to stabilize
---
### Stage 3: Norming (Agreement & Collaboration)
*"Establishing effective ways of working together"*
#### Characteristics in Scrum Context
- **Shared Ownership**: Team takes collective responsibility for outcomes
- **Process Refinement**: Self-organizing improvements to Scrum practices
- **Collaboration Increase**: More cross-functional pairing and knowledge sharing
- **Ceremony Effectiveness**: Meetings become more focused and productive
- **Velocity Stabilization**: More predictable delivery patterns emerge
#### Scrum Master Behaviors
- **Supporting Style**: Step back and let team lead, provide support when needed
- **Impediment Remover**: Focus on external blockers and organizational issues
- **Continuous Improvement Coach**: Help team identify and implement improvements
- **Shield Provider**: Protect team from external disruptions
#### Key Metrics & Indicators
| Metric | Norming Range | Assessment Method |
|--------|---------------|-------------------|
| Self-Organization | Increasing | Decision-making autonomy tracking |
| Ceremony Effectiveness | 80-90% | Time-to-value ratios |
| Velocity Consistency | Good (CV 15-25%) | Rolling average stability |
| Psychological Safety | 3.5-4.5/5.0 | Regular pulse surveys |
| Knowledge Sharing | High | Cross-training metrics |
#### Intervention Strategies
- **Process Ownership Transfer**: Guide team to own ceremony facilitation
- **Skill Development**: Focus on technical and collaboration skills
- **Measurement Introduction**: Help team define their own success metrics
- **External Relationship Building**: Facilitate connections with other teams
- **Continuous Improvement Rhythm**: Establish regular process refinement
#### Success Indicators
- Team members facilitate some ceremonies themselves
- Proactive identification and resolution of impediments
- Stable, predictable velocity patterns
- High-quality retrospectives with actionable outcomes
---
### Stage 4: Performing (High Performance)
*"Delivering exceptional results together"*
#### Characteristics in Scrum Context
- **Collective Excellence**: Team consistently exceeds expectations
- **Adaptive Expertise**: Quick response to changing requirements
- **Self-Management**: Minimal need for external direction
- **Innovation**: Team generates creative solutions and process improvements
- **Knowledge Multiplication**: Members actively develop others
#### Scrum Master Behaviors
- **Delegating Style**: Minimal intervention, team is largely autonomous
- **Strategic Facilitator**: Focus on long-term team development and capability
- **Organizational Catalyst**: Help team influence broader organizational change
- **Mentor Developer**: Coach team members to become coaches themselves
#### Key Metrics & Indicators
| Metric | Performing Range | Assessment Method |
|--------|---------------|-------------------|
| Autonomy Level | High | Decision independence tracking |
| Innovation Frequency | Regular | New idea implementation rate |
| Velocity Excellence | High + Consistent (CV <15%) | Performance benchmarking |
| Psychological Safety | 4.0-5.0/5.0 | Team assessment + observation |
| External Impact | Significant | Other teams adopting practices |
#### Intervention Strategies
- **Challenge Provision**: Introduce stretch goals and complex problems
- **Leadership Development**: Grow team members into coaches/leaders
- **Knowledge Sharing**: Facilitate teaching other teams
- **Strategic Alignment**: Connect team excellence to organizational goals
- **Innovation Support**: Create space for experimentation and learning
#### Success Indicators
- Consistent delivery of high-quality work with minimal defects
- Team serves as a model for other teams in the organization
- Members are sought out for coaching and mentoring roles
- Proactive contribution to organizational process improvements
---
### Stage 5: Adjourning (Transition & Legacy)
*"Wrapping up and transitioning knowledge"*
#### Characteristics in Scrum Context
- **Closure Activities**: Project completion or team dissolution
- **Knowledge Transfer**: Documenting learnings and sharing expertise
- **Relationship Maintenance**: Preserving professional networks
- **Legacy Creation**: Ensuring practices continue beyond the team
- **Emotional Processing**: Addressing feelings about team ending
#### Scrum Master Behaviors
- **Closure Facilitator**: Guide proper conclusion of work and relationships
- **Legacy Curator**: Ensure knowledge and practices are preserved
- **Transition Planner**: Help members move to new roles/teams effectively
- **Emotional Support**: Acknowledge and process team disbanding feelings
#### Key Activities
- **Final Retrospective**: Comprehensive review of team journey and learnings
- **Practice Documentation**: Record effective processes for future teams
- **Knowledge Transfer Sessions**: Share expertise with successor teams
- **Celebration**: Acknowledge achievements and relationships built
- **Network Maintenance**: Establish ongoing professional connections
---
## Psychological Safety in Agile Teams
### Definition & Importance
Psychological safety is the belief that one can show vulnerability, ask questions, admit mistakes, and propose ideas without risk of negative consequences to self-image, status, or career.
### Four Components Applied to Scrum
1. **Ability to show vulnerability and ask for help**
2. **Permission to discuss difficult topics and disagreements**
3. **Freedom to take risks and make mistakes**
4. **Encouragement to be authentic and express oneself**
### Building Psychological Safety in Scrum Teams
#### Daily Standups
- **Model Vulnerability**: Scrum Master admits own mistakes and uncertainties
- **Normalize Help-Seeking**: "Who needs help?" vs. "Any blockers?"
- **Celebrate Learning**: Highlight lessons learned from failures
- **Time Protection**: Ensure everyone has space to speak
#### Sprint Planning
- **Estimation Comfort**: No judgment for "wrong" estimates
- **Capacity Honesty**: Safe to express realistic availability
- **Question Encouragement**: Reward curiosity and clarification requests
- **Scope Negotiation**: Team can push back on unrealistic commitments
#### Sprint Reviews
- **Failure Normalization**: Discuss what didn't work without blame
- **Stakeholder Preparation**: Coach stakeholders on constructive feedback
- **Team Support**: Unified front when facing criticism
- **Learning Focus**: Frame setbacks as learning opportunities
#### Retrospectives
- **Non-Judgmental Space**: Focus on systems, not individuals
- **Equal Participation**: Ensure all voices are heard
- **Actionable Outcomes**: Team commits to improvements together
- **Confidentiality**: What's said in retro stays in retro
### Measuring Psychological Safety
#### Edmondson's 7-Item Scale
1. If you make a mistake on this team, it is often held against you
2. Members of this team are able to bring up problems and tough issues
3. People on this team sometimes reject others for being different
4. It is safe to take a risk on this team
5. It is difficult to ask other members of this team for help
6. No one on this team would deliberately act to undermine my efforts
7. Working with members of this team, my unique skills and talents are valued and utilized
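Items 1, 3, and 5 above are negatively worded, so agreement with them signals lower safety. They are normally reverse-scored before averaging. The sketch below shows one way to do that. It assumes each item is rated from 1 (strongly disagree) to 7 (strongly agree); change `SCALE_MAX` if your survey uses a 5-point scale. The function names are illustrative, not part of any survey tool.

```python
REVERSE_SCORED = {1, 3, 5}   # negatively worded items in the list above
SCALE_MIN, SCALE_MAX = 1, 7  # assumption: 1 = strongly disagree, 7 = strongly agree


def safety_score(responses):
    """Average one respondent's ratings after flipping negatively worded items.

    `responses` maps item number (1-7) to the raw rating.
    """
    if set(responses) != set(range(1, 8)):
        raise ValueError("expected ratings for items 1 through 7")
    adjusted = []
    for item, rating in responses.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"item {item}: rating {rating} is off the scale")
        if item in REVERSE_SCORED:
            rating = SCALE_MIN + SCALE_MAX - rating  # e.g. 2 becomes 6 on a 1-7 scale
        adjusted.append(rating)
    return sum(adjusted) / len(adjusted)


def team_score(all_responses):
    """Mean of the per-respondent scores; higher means safer."""
    scores = [safety_score(r) for r in all_responses]
    return sum(scores) / len(scores)


example = {1: 2, 2: 6, 3: 1, 4: 5, 5: 3, 6: 7, 7: 6}
print(safety_score(example))  # 6.0
```

Reporting only the team-level mean (and not individual scores) keeps the survey anonymous.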
#### Practical Assessment Questions
- **Risk Taking**: "Do team members speak up when they disagree with leadership?"
- **Mistake Handling**: "How does the team respond when someone makes an error?"
- **Help Seeking**: "Do people admit when they don't know something?"
- **Inclusion**: "Are all team members' ideas heard and considered?"
- **Innovation**: "Does the team experiment with new approaches?"
---
## Team Performance Metrics
### Quantitative Indicators
#### Velocity & Predictability
- **Sprint Velocity Trends**: Improvement over time indicates team development
- **Commitment Reliability**: Ability to deliver planned work consistently
- **Velocity Volatility (CV)**: Lower variation indicates team maturity
- **Forecast Accuracy**: Precision in release planning improves with development
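The coefficient of variation (CV) is the standard deviation of sprint velocities divided by their mean, expressed as a percentage. The sketch below computes it and maps the result onto the CV bands from the stage tables earlier in this document (>40%, 25-40%, 15-25%, <15%). Two choices here are assumptions rather than part of the framework: it uses the population standard deviation, and it assigns the shared boundary values 25% and 40% to the Storming band. CV is only one indicator, so treat the label as a prompt for discussion, not a diagnosis.

```python
from statistics import pstdev


def velocity_cv(velocities):
    """Return the coefficient of variation of sprint velocities, in percent."""
    if len(velocities) < 2:
        raise ValueError("need at least two sprints")
    avg = sum(velocities) / len(velocities)
    if avg <= 0:
        raise ValueError("mean velocity must be positive")
    # Assumption: population standard deviation; statistics.stdev gives the sample version.
    return pstdev(velocities) / avg * 100


def cv_band(cv):
    """Map a CV percentage onto the stage bands used in this document."""
    if cv < 15:
        return "Performing"
    if cv < 25:
        return "Norming"
    if cv <= 40:
        return "Storming"
    return "Forming"


last_five_sprints = [20, 24, 22, 26, 28]  # story points per sprint (made-up data)
cv = velocity_cv(last_five_sprints)
print(round(cv, 1), cv_band(cv))  # 11.8 Performing
```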
#### Quality Metrics
- **Defect Rates**: High-performing teams have lower defect introduction
- **Definition of Done Adherence**: Mature teams consistently meet quality criteria
- **Technical Debt Management**: Performing teams proactively address debt
- **Customer Satisfaction**: Ultimately reflected in user/stakeholder feedback
#### Collaboration Indicators
- **Cross-functional Work**: Story completion without handoffs
- **Knowledge Sharing**: Pair programming, code review participation
- **Skill Development**: Team members learning from each other
- **Collective Ownership**: Shared responsibility for all team outputs
### Qualitative Assessments
#### Ceremony Quality
- **Engagement Level**: Active participation vs. passive attendance
- **Value Generation**: Productive outcomes from time invested
- **Self-Facilitation**: Team taking ownership of meeting effectiveness
- **Adaptation**: Tailoring practices to team's specific needs
#### Communication Patterns
- **Openness**: Willingness to share problems and concerns
- **Constructive Conflict**: Disagreements lead to better solutions
- **Active Listening**: Team members build on each other's ideas
- **Feedback Culture**: Regular, specific, actionable feedback exchange
---
## Facilitation Techniques by Stage
### Forming Stage Facilitation
- **Structured Introductions**: Personal/professional background sharing
- **Explicit Process Teaching**: Step-by-step ceremony instruction
- **Role Clarification**: Clear explanation of responsibilities and expectations
- **Safe-to-Fail Experiments**: Low-risk opportunities to try new things
### Storming Stage Facilitation
- **Conflict Normalization**: "Conflict is healthy and expected"
- **Ground Rules Enforcement**: Maintain respectful disagreement standards
- **Perspective Taking**: Help team members understand different viewpoints
- **External Processing**: Individual coaching sessions for complex issues
### Norming Stage Facilitation
- **Autonomy Building**: Gradually reduce direct intervention
- **Process Ownership Transfer**: Team takes responsibility for improvements
- **Skill Gap Identification**: Focus on capability development
- **Success Pattern Recognition**: Help team understand what's working
### Performing Stage Facilitation
- **Challenge Introduction**: Stretch goals and complex problems
- **Innovation Support**: Time and space for experimentation
- **Teaching Opportunities**: Help team share knowledge with others
- **Strategic Connection**: Link team excellence to organizational goals
---
## Conflict Resolution Strategies
### Healthy vs. Unhealthy Conflict
#### Healthy Conflict Characteristics
- **Task-Focused**: About work, not personalities
- **Solution-Oriented**: Aimed at finding better ways forward
- **Open and Direct**: Issues addressed transparently
- **Respectful**: Maintains dignity of all parties
- **Temporary**: Resolved and doesn't fester
#### Unhealthy Conflict Characteristics
- **Personal Attacks**: Targeting individuals rather than ideas
- **Win-Lose Mentality**: Zero-sum thinking
- **Underground**: Gossip and indirect communication
- **Destructive**: Damages relationships and trust
- **Persistent**: Continues without resolution
### Conflict Resolution Process
#### 1. Early Detection
- **Retrospective Themes**: Recurring issues or tensions
- **Ceremony Observation**: Body language, participation patterns
- **1:1 Conversations**: Individual team member concerns
- **Performance Indicators**: Velocity drops, quality issues
#### 2. Assessment & Preparation
- **Stakeholder Mapping**: Who's involved, who's affected
- **Issue Clarification**: Separate facts from interpretations
- **Desired Outcomes**: What would resolution look like?
- **Facilitation Planning**: Process design for resolution session
#### 3. Facilitated Resolution
- **Ground Rules**: Safe space for honest dialogue
- **Perspective Sharing**: Each party states their view
- **Common Ground**: Identify shared interests and values
- **Solution Generation**: Collaborative problem-solving
- **Agreement Creation**: Clear commitments and follow-up
#### 4. Follow-up & Learning
- **Implementation Support**: Help parties honor agreements
- **Relationship Repair**: Ongoing relationship building
- **Process Improvement**: Learn from conflict for future prevention
- **Team Strengthening**: Use resolution as team development opportunity
---
## Assessment Tools
### Team Development Stage Assessment
#### Behavioral Indicators Checklist
**Forming Indicators:**
- [ ] Heavy reliance on Scrum Master for decisions
- [ ] Polite, superficial interactions
- [ ] Individual work preferences
- [ ] Process confusion or resistance
- [ ] Low ceremony engagement
**Storming Indicators:**
- [ ] Open disagreements about approach
- [ ] Questioning of established processes
- [ ] Subgroup formation
- [ ] Inconsistent performance
- [ ] Emotional reactions to feedback
**Norming Indicators:**
- [ ] Collaborative problem-solving
- [ ] Process adaptation and improvement
- [ ] Shared responsibility for outcomes
- [ ] Constructive feedback exchange
- [ ] Stable performance patterns
**Performing Indicators:**
- [ ] Self-organization without external direction
- [ ] Proactive problem anticipation
- [ ] Innovation and experimentation
- [ ] Mentoring of other teams
- [ ] Exceptional results consistently
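The checklist does not prescribe a scoring rule. One simple reading, sketched below, is to count the ticked boxes per stage and treat the stage with the most ticks as the likely current stage. That rule and the `likely_stage` helper are assumptions for illustration. Ties are returned as-is, because teams often show indicators from two adjacent stages at once.

```python
STAGES = ("Forming", "Storming", "Norming", "Performing")
BOXES_PER_STAGE = 5  # each stage above lists five indicators


def likely_stage(checked):
    """Return the stage(s) with the most ticked indicators.

    `checked` maps stage name to the number of boxes ticked (0-5).
    Returns an empty list when nothing is ticked.
    """
    counts = {}
    for stage in STAGES:
        count = checked.get(stage, 0)
        if not 0 <= count <= BOXES_PER_STAGE:
            raise ValueError(f"{stage}: expected 0-{BOXES_PER_STAGE} ticks, got {count}")
        counts[stage] = count
    best = max(counts.values())
    if best == 0:
        return []
    return [stage for stage in STAGES if counts[stage] == best]


print(likely_stage({"Forming": 1, "Storming": 4, "Norming": 2}))  # ['Storming']
print(likely_stage({"Storming": 3, "Norming": 3}))                # ['Storming', 'Norming']
```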
### Psychological Safety Assessment Survey
#### Team Member Self-Assessment (5-point Likert Scale)
1. **Mistake Tolerance**: "When I make a mistake, my team supports me in learning from it"
2. **Voice Safety**: "I feel comfortable challenging decisions or raising concerns"
3. **Inclusion**: "My unique perspective is valued by the team"
4. **Risk Taking**: "I can take calculated risks without fear of negative consequences"
5. **Help Seeking**: "I can admit when I don't know something without judgment"
6. **Authenticity**: "I can be myself without pretending or hiding parts of my personality"
7. **Innovation**: "We try new approaches even if they might not work"
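All seven statements above are positively worded, so no reverse-scoring is needed. A per-item average across anonymous respondents shows where to focus first. The sketch below assumes ratings from 1 (strongly disagree) to 5 (strongly agree), given in the order of the list above. `item_averages` and `weakest_items` are hypothetical helper names, and the sample responses are invented.

```python
ITEMS = [
    "Mistake Tolerance", "Voice Safety", "Inclusion", "Risk Taking",
    "Help Seeking", "Authenticity", "Innovation",
]


def item_averages(responses):
    """Average each survey item across respondents.

    `responses` is a list of 7-element lists, one per respondent,
    with ratings from 1 to 5 in the order of ITEMS.
    """
    if not responses:
        raise ValueError("need at least one response")
    for row in responses:
        if len(row) != len(ITEMS) or any(not 1 <= r <= 5 for r in row):
            raise ValueError("each response needs seven ratings between 1 and 5")
    return {
        name: sum(row[i] for row in responses) / len(responses)
        for i, name in enumerate(ITEMS)
    }


def weakest_items(responses, n=2):
    """Return the n lowest-scoring items, lowest first."""
    averages = item_averages(responses)
    return sorted(averages, key=averages.get)[:n]


sample = [
    [4, 3, 5, 2, 4, 5, 3],
    [5, 2, 4, 3, 4, 4, 3],
    [3, 4, 4, 1, 5, 4, 3],
]
print(weakest_items(sample))  # ['Risk Taking', 'Voice Safety']
```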
#### Behavioral Observation Checklist
- **Speaking Up**: Team members voice disagreements respectfully
- **Mistake Response**: Errors are discussed openly for learning
- **Help Seeking**: People admit knowledge gaps and ask for assistance
- **Experimentation**: Team tries new approaches without excessive fear
- **Inclusion**: All members participate actively in discussions
- **Feedback**: Constructive criticism is given and received well
---
## Intervention Strategies
### Stage-Specific Interventions
#### Forming → Storming Transition
- **Trust Building Activities**: Structured sharing and team bonding
- **Psychological Safety Foundation**: Establish ground rules for safe conflict
- **Process Education**: Deep training on collaboration and communication
- **Individual Coaching**: Prepare team members for productive disagreement
#### Storming → Norming Transition
- **Conflict Resolution Skills**: Training in constructive disagreement
- **Working Agreement Updates**: Refine team collaboration standards
- **Success Celebration**: Acknowledge progress through difficult conversations
- **Process Ownership**: Begin transferring facilitation responsibilities
#### Norming → Performing Transition
- **Challenge Introduction**: Stretch goals to push team capabilities
- **Leadership Development**: Grow coaching and mentoring skills
- **Innovation Support**: Create time and space for experimentation
- **External Engagement**: Opportunities to influence other teams
### Crisis Interventions
#### Performance Regression
**Symptoms**: Sudden drops in velocity, quality, or team satisfaction
**Interventions**:
- Team health check and root cause analysis
- Individual 1:1s to understand personal factors
- Process audit to identify systemic issues
- Targeted support for specific capability gaps
#### Psychological Safety Violations
**Symptoms**: Team members withdrawing, avoiding risk, or leaving
**Interventions**:
- Immediate protective actions for affected individuals
- Team-wide discussion of psychological safety principles
- Leadership coaching for those who violated safety
- System changes to prevent future violations
#### External Pressure Impact
**Symptoms**: Team stress, process shortcuts, decreased collaboration
**Interventions**:
- Stakeholder education about sustainable pace
- Scope negotiation and priority clarification
- Team capacity protection and workload management
- Stress management and resilience building
---
## Measurement & Tracking
### Dashboard Metrics by Stage
#### Forming Stage Metrics
- Ceremony attendance rates
- Individual vs. collaborative work ratios
- Process adherence scores
- Initial psychological safety baseline
#### Storming Stage Metrics
- Conflict frequency and resolution time
- Ceremony engagement quality
- Velocity volatility measures
- Team satisfaction surveys
#### Norming Stage Metrics
- Self-organization indicators
- Process improvement frequency
- Knowledge sharing metrics
- Stakeholder satisfaction
#### Performing Stage Metrics
- Innovation and experimentation rates
- External influence and mentoring
- Exceptional result achievement
- Leadership development outcomes
### Tracking Tools & Methods
#### Regular Assessment Schedule
- **Weekly**: Ceremony quality observation
- **Sprint**: Velocity and quality metrics
- **Monthly**: Psychological safety pulse survey
- **Quarterly**: Comprehensive team development assessment
#### Data Collection Methods
- **Quantitative**: Sprint metrics, attendance, survey scores
- **Qualitative**: Observation notes, retrospective themes, interview insights
- **Behavioral**: Video/audio analysis of team interactions (with consent)
- **External**: Stakeholder feedback, other team perceptions
#### Progress Visualization
- **Team Development Radar**: Multi-dimensional progress tracking
- **Psychological Safety Trends**: Safety metrics over time
- **Stage Transition Timeline**: Development milestone tracking
- **Intervention Impact Assessment**: Before/after comparison
---
## Conclusion
Effective team dynamics facilitation requires understanding that team development is a journey, not a destination. Scrum Masters must:
1. **Assess Accurately**: Understand current team development stage
2. **Facilitate Appropriately**: Match leadership style to team needs
3. **Build Safety First**: Psychological safety enables all other development
4. **Measure Progress**: Track both quantitative and qualitative indicators
5. **Intervene Thoughtfully**: Apply stage-appropriate interventions
6. **Celebrate Growth**: Acknowledge progress and learning throughout the journey
The goal is not just high-performing teams, but sustainable high performance built on strong relationships, psychological safety, and continuous learning. This framework provides the structure and tools to guide teams through their development journey effectively.
---
*This framework combines research-based models with practical scrum implementation experience. Adapt the tools and techniques to fit your specific organizational context and team needs.*
FILE:references/velocity-forecasting-guide.md
# Velocity Forecasting Guide: Monte Carlo Methods & Probabilistic Estimation
## Table of Contents
- [Overview](#overview)
- [Monte Carlo Simulation Fundamentals](#monte-carlo-simulation-fundamentals)
- [Velocity-Based Forecasting](#velocity-based-forecasting)
- [Implementation Approaches](#implementation-approaches)
- [Confidence Intervals & Risk Assessment](#confidence-intervals--risk-assessment)
- [Practical Applications](#practical-applications)
- [Advanced Techniques](#advanced-techniques)
- [Common Pitfalls](#common-pitfalls)
- [Case Studies](#case-studies)
- [Tools and Implementation](#tools-and-implementation)
- [Conclusion](#conclusion)
---
## Overview
Velocity forecasting using Monte Carlo simulation provides probabilistic estimates for sprint and project completion, moving beyond single-point estimates to give stakeholders a range of likely outcomes with associated confidence levels.
### Why Probabilistic Forecasting?
- **Uncertainty Acknowledgment**: Software development is inherently uncertain
- **Risk Quantification**: Provides probability distributions rather than false precision
- **Stakeholder Communication**: Better expectation management through confidence intervals
- **Decision Support**: Enables data-driven planning and resource allocation
### Core Principles
1. **Historical Velocity Patterns**: Use actual team performance data
2. **Statistical Modeling**: Apply appropriate probability distributions
3. **Confidence Intervals**: Provide ranges, not single points
4. **Continuous Calibration**: Update forecasts with new data
---
## Monte Carlo Simulation Fundamentals
### What is Monte Carlo Simulation?
Monte Carlo simulation uses random sampling to model the probability of different outcomes in systems that cannot be easily predicted due to random variables.
### Application to Velocity Forecasting
```
For each simulation iteration:
1. Sample a velocity value from historical distribution
2. Calculate projected completion time
3. Repeat thousands of times
4. Analyze the distribution of results
```
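The steps above mention a projected completion time, so here is a minimal sketch of that variant: it counts how many sprints each simulated run needs to burn down a backlog. The function name, the `max_sprints` guard, and the sample history are illustrative assumptions, not part of the guide.

```python
import random

def simulate_sprints_to_complete(velocities, backlog_points, iterations=10000,
                                 max_sprints=200, seed=None):
    """Return a sorted list with the number of sprints each run needed."""
    rng = random.Random(seed)  # a seed makes the run reproducible
    outcomes = []
    for _ in range(iterations):
        remaining = backlog_points
        sprints = 0
        # Draw one historical velocity per sprint until the backlog is empty;
        # max_sprints guards against histories that contain only zeros
        while remaining > 0 and sprints < max_sprints:
            remaining -= rng.choice(velocities)
            sprints += 1
        outcomes.append(sprints)
    return sorted(outcomes)

history = [18, 22, 20, 25, 19, 23]  # hypothetical sprint velocities
outcomes = simulate_sprints_to_complete(history, backlog_points=120, seed=42)
p50 = outcomes[len(outcomes) // 2]
p85 = outcomes[int(0.85 * len(outcomes))]
print(f"50% of runs finished within {p50} sprints, 85% within {p85}")
```

When the simulated quantity is time to finish, a higher percentile is the more cautious figure, because more of the simulated runs finished within it.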
### Key Statistical Concepts
#### Normal Distribution
Velocity is often approximated by a normal distribution once a team has stabilized:
- **Mean (μ)**: Average historical velocity
- **Standard Deviation (σ)**: Velocity variability measure
- **68-95-99.7 Rule**: Probability ranges for forecasting
#### Distribution Characteristics
- **Symmetry**: Balanced around the mean (normal teams)
- **Skewness**: Teams with frequent disruptions may show negative skew (a long tail of unusually low sprints)
- **Kurtosis**: Measure of "tail heaviness" - extreme outcomes frequency
---
## Velocity-Based Forecasting
### Basic Velocity Forecasting Formula
**Single Sprint Forecast:**
```
Confidence Interval = μ ± (Z-score × σ)
Where:
- μ = historical mean velocity
- σ = standard deviation of velocity
- Z-score = confidence level multiplier
```
**Multi-Sprint Forecast:**
```
Total Points = Σ(sampled_velocity_i) for i = 1 to n sprints
Where each velocity_i is randomly sampled from historical distribution
```
### Confidence Level Z-Scores
| Confidence Level | Z-Score | Interpretation |
|------------------|---------|----------------|
| 50% | 0.67 | Central half of outcomes (interquartile range) |
| 70% | 1.04 | Moderate confidence |
| 85% | 1.44 | High confidence |
| 95% | 1.96 | Very high confidence |
| 99% | 2.58 | Extremely high confidence |
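The multipliers in this table are two-sided values for a standard normal distribution. As a sketch, they can be reproduced with the standard library and applied to the single-sprint formula; the function names and the sample history are illustrative assumptions.

```python
from statistics import NormalDist, mean, stdev

def z_for_central_interval(confidence):
    """Z multiplier for a two-sided interval covering `confidence` of a normal curve."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

def single_sprint_interval(velocities, confidence):
    """Apply mu +/- z * sigma to historical velocities."""
    mu = mean(velocities)
    sigma = stdev(velocities)  # sample standard deviation
    z = z_for_central_interval(confidence)
    return mu - z * sigma, mu + z * sigma

for level in (0.50, 0.70, 0.85, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_for_central_interval(level):.2f}")

history = [18, 22, 20, 25, 19, 23]  # hypothetical sprint velocities
low, high = single_sprint_interval(history, 0.85)
print(f"85% interval: {low:.1f} to {high:.1f} points")
```

The interval is only as good as the normality assumption; with a short or skewed history, the resampling methods later in this guide may be a better fit.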
---
## Implementation Approaches
### 1. Simple Historical Distribution Method
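```python
import random

def simple_monte_carlo_forecast(velocities, sprints_ahead, iterations=10000):
    """Resample historical sprint velocities to simulate total points delivered."""
    results = []
    for _ in range(iterations):
        # Draw one historical velocity per future sprint and sum them
        total_points = sum(random.choice(velocities) for _ in range(sprints_ahead))
        results.append(total_points)
    # analyze_results is a summary helper that this guide does not define
    return analyze_results(results)
```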
**Pros:** Simple, uses actual data points
**Cons:** Ignores trends, assumes stationary distribution
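The forecast functions in this section call `analyze_results`, which the guide does not define. One possible shape for it, offered as an assumption rather than the author's implementation, is a small summary of the simulated totals; the percentile choices and key names are illustrative.

```python
import statistics

def analyze_results(results, percentiles=(0.15, 0.50, 0.85)):
    """Summarize simulated totals (hypothetical helper, not defined in the guide)."""
    ordered = sorted(results)
    last = len(ordered) - 1
    summary = {
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "max": ordered[-1],
    }
    for p in percentiles:
        # Clamp the index so p=1.0 cannot run past the end of the list
        index = min(last, int(p * len(ordered)))
        summary[f"p{round(p * 100)}"] = ordered[index]
    return summary

print(analyze_results([100, 110, 120, 125, 130, 140, 150]))
```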
### 2. Normal Distribution Method
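```python
import random
import statistics

def normal_distribution_forecast(velocities, sprints_ahead, iterations=10000):
    """Simulate totals by drawing each sprint's velocity from a fitted normal distribution."""
    mean_velocity = statistics.mean(velocities)
    std_velocity = statistics.stdev(velocities)  # needs at least two data points
    results = []
    for _ in range(iterations):
        total_points = sum(
            # Clamp at zero: a normal draw can be negative, a velocity cannot
            max(0, random.normalvariate(mean_velocity, std_velocity))
            for _ in range(sprints_ahead)
        )
        results.append(total_points)
    return analyze_results(results)
```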
**Pros:** Mathematically clean, handles interpolation
**Cons:** Assumes normal distribution, may generate impossible values
### 3. Bootstrap Sampling Method
```python
import random
import statistics

def bootstrap_forecast(velocities, sprints_ahead, iterations=10000):
    """Refit mean and standard deviation on a bootstrap resample for every iteration."""
    n = len(velocities)
    results = []
    for _ in range(iterations):
        # Sample with replacement
        bootstrap_sample = [random.choice(velocities) for _ in range(n)]
        # Calculate statistics from bootstrap sample
        mean_vel = statistics.mean(bootstrap_sample)
        std_vel = statistics.stdev(bootstrap_sample)
        total_points = sum(
            max(0, random.normalvariate(mean_vel, std_vel))
            for _ in range(sprints_ahead)
        )
        results.append(total_points)
    return analyze_results(results)
```
**Pros:** Accounts for sampling uncertainty in the estimated mean and standard deviation
**Cons:** More complex, still draws each sprint from a normal distribution, requires sufficient historical data
---
## Confidence Intervals & Risk Assessment
### Interpreting Forecast Results
#### Percentile-Based Confidence Intervals
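```python
def calculate_confidence_intervals(results, confidence_levels=(0.5, 0.7, 0.85, 0.95)):
    """Return the simulated value at each requested percentile."""
    sorted_results = sorted(results)
    last_index = len(sorted_results) - 1
    intervals = {}
    for confidence in confidence_levels:
        # Clamp so that confidence=1.0 cannot index past the end of the list
        percentile_index = min(last_index, int(confidence * len(sorted_results)))
        intervals[f"{round(confidence * 100)}%"] = sorted_results[percentile_index]
    return intervals
```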
#### Example Interpretation
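For a 6-sprint forecast, each percentile is the total that the given share of simulations did not exceed:
- **50%:** 120 points (median outcome: half of the simulations delivered 120 points or fewer)
- **70%:** 135 points (70% of simulations delivered 135 points or fewer)
- **85%:** 150 points (85% of simulations delivered 150 points or fewer)
- **95%:** 170 points (only 5% of simulations delivered more)

Because these values are totals delivered, a higher percentile is the more optimistic figure. A cautious commitment reads from the low end of the distribution instead: the 15th percentile, for example, is the total that 85% of simulations met or exceeded.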
### Risk Assessment Framework
#### Delivery Probability
```
P(Completion ≤ Target) = (# simulations ≤ target) / total_simulations
```
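As a sketch, the ratio above is a one-line count over the simulation output. The example assumes each result is the number of sprints a simulated run needed; the function name and sample data are hypothetical.

```python
def delivery_probability(simulated_sprints, target_sprints):
    """Fraction of simulated runs that finished within target_sprints."""
    if not simulated_sprints:
        raise ValueError("no simulation results to evaluate")
    hits = sum(1 for sprints in simulated_sprints if sprints <= target_sprints)
    return hits / len(simulated_sprints)

simulated = [5, 6, 6, 6, 7, 7, 6, 5, 8, 6]  # hypothetical results from 10 runs
print(delivery_probability(simulated, target_sprints=6))  # 0.7
```

If the simulation produces total points rather than sprints, the comparison flips: count the runs that delivered at least the target number of points.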
#### Risk Categories
| Probability Range | Risk Level | Recommendation |
|-------------------|------------|----------------|
| > 85% | Low Risk | Proceed with confidence |
| 70-85% | Moderate Risk | Add buffer, monitor closely |
| 50-70% | High Risk | Reduce scope or extend timeline |
| < 50% | Very High Risk | Significant replanning required |
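The table can be encoded as a small lookup. The table does not say which band owns the boundary values, so this sketch assumes that exactly 85% counts as Moderate Risk and that 70% and 50% belong to the higher band; the function name is hypothetical.

```python
def classify_delivery_risk(probability):
    """Map a delivery probability (0.0 to 1.0) to the risk levels in the table."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability > 0.85:
        return "Low Risk"
    if probability >= 0.70:  # assumption: boundary values fall in the higher band
        return "Moderate Risk"
    if probability >= 0.50:
        return "High Risk"
    return "Very High Risk"

for p in (0.92, 0.78, 0.55, 0.30):
    print(f"{p:.0%}: {classify_delivery_risk(p)}")
```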
---
## Practical Applications
### Sprint Planning
Use velocity forecasting to:
- Set realistic sprint goals
- Communicate uncertainty to Product Owner
- Plan capacity buffers for unknowns
- Identify when to adjust scope
### Release Planning
Apply Monte Carlo methods to:
- Estimate feature completion dates
- Plan release milestones
- Assess project schedule risk
- Make go/no-go decisions
### Stakeholder Communication
Present forecasts as:
- Range estimates, not single points
- Probability statements ("70% confident we'll deliver X by date Y")
- Risk scenarios with mitigation options
- Visual distributions showing uncertainty
---
## Advanced Techniques
### 1. Trend-Adjusted Forecasting
Account for improving or declining velocity trends:
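```python
def trend_adjusted_forecast(velocities, sprints_ahead):
    """Project velocity along a fitted linear trend before simulating."""
    # Calculate linear trend (calculate_linear_regression is an assumed helper
    # returning (slope, intercept); it is not defined in this guide)
    x = list(range(len(velocities)))
    slope, intercept = calculate_linear_regression(x, velocities)
    # Adjust future velocities for trend
    adjusted_velocities = []
    for i in range(sprints_ahead):
        future_sprint = len(velocities) + i
        predicted_velocity = slope * future_sprint + intercept
        adjusted_velocities.append(predicted_velocity)
    # monte_carlo_with_adjusted_velocities is likewise an assumed helper
    return monte_carlo_with_adjusted_velocities(adjusted_velocities)
```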
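The snippet above relies on two helpers that the guide leaves undefined. A minimal sketch of one way to fill them in follows: an ordinary least-squares fit, then a simulation that adds randomly resampled residuals to each trend prediction. Resampling residuals is a swapped-in technique chosen for illustration, and `trend_forecast_with_residuals` is a hypothetical name.

```python
import random

def calculate_linear_regression(x, y):
    """Ordinary least-squares slope and intercept for paired samples."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:  # a single point or identical x values: no trend to fit
        return 0.0, y_mean
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def trend_forecast_with_residuals(velocities, sprints_ahead, iterations=10000, seed=None):
    """Simulate totals as trend prediction plus a resampled historical residual."""
    rng = random.Random(seed)
    x = list(range(len(velocities)))
    slope, intercept = calculate_linear_regression(x, velocities)
    # Residual = how far each historical sprint sat from the fitted line
    residuals = [v - (slope * xi + intercept) for xi, v in zip(x, velocities)]
    totals = []
    for _ in range(iterations):
        total = 0.0
        for i in range(sprints_ahead):
            predicted = slope * (len(velocities) + i) + intercept
            total += max(0.0, predicted + rng.choice(residuals))
        totals.append(total)
    return sorted(totals)

totals = trend_forecast_with_residuals([18, 20, 19, 23, 22, 25], sprints_ahead=4, seed=7)
print(f"median projected total: {totals[len(totals) // 2]:.0f} points")
```

Extrapolating a linear trend far ahead is risky; velocity gains usually level off, so short horizons are the safer use.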
### 2. Seasonality Adjustments
For teams with seasonal patterns (holidays, budget cycles):
```python
def seasonal_adjustment(velocities, sprint_dates, forecast_dates):
    """Outline only: both helper functions are placeholders not defined in this guide."""
    # Identify seasonal patterns
    seasonal_factors = calculate_seasonal_factors(velocities, sprint_dates)
    # Apply factors to forecast
    adjusted_forecast = apply_seasonal_factors(forecast_dates, seasonal_factors)
    return adjusted_forecast
```
### 3. Capacity-Based Modeling
Incorporate team capacity changes:
```python
import statistics

def capacity_adjusted_forecast(velocities, historical_capacity, future_capacity):
    """Scale forecast velocity by each future sprint's planned capacity."""
    # Calculate velocity per capacity unit
    velocity_per_capacity = [v / c for v, c in zip(velocities, historical_capacity)]
    baseline_efficiency = statistics.mean(velocity_per_capacity)
    # Forecast based on future capacity
    future_velocities = [capacity * baseline_efficiency for capacity in future_capacity]
    # monte_carlo_forecast is an assumed helper not defined in this guide
    return monte_carlo_forecast(future_velocities)
```
### 4. Multi-Team Forecasting
For dependencies across teams:
```python
def multi_team_forecast(team_forecasts, dependencies):
    """Placeholder: the guide lists the modeling steps but gives no implementation."""
    # Account for critical path and dependencies
    # Use min/max operations for dependent deliveries
    # Model coordination overhead
    pass
```
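The placeholder above can be made concrete in many ways. The sketch below covers only the simplest case, under stated assumptions: teams work independently, the joint delivery is done when the slowest team finishes (the max operation), and coordination overhead is a fixed number of extra sprints. The function name, parameters, and sample data are hypothetical.

```python
import random

def multi_team_completion(team_histories, team_backlogs, iterations=5000,
                          overhead_sprints=0, seed=None):
    """Simulate the sprint in which the slowest of several teams finishes."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(iterations):
        finish = 0
        for team, history in team_histories.items():
            remaining, sprints = team_backlogs[team], 0
            # Burn down this team's backlog with resampled velocities;
            # the 200-sprint cap guards against all-zero histories
            while remaining > 0 and sprints < 200:
                remaining -= rng.choice(history)
                sprints += 1
            # A joint delivery waits for the last team to finish
            finish = max(finish, sprints)
        outcomes.append(finish + overhead_sprints)
    return sorted(outcomes)

histories = {"api": [10, 12, 9, 11], "web": [20, 18, 22, 19]}  # hypothetical
backlogs = {"api": 60, "web": 95}
runs = multi_team_completion(histories, backlogs, overhead_sprints=1, seed=3)
print(f"85% of runs finished within {runs[int(0.85 * len(runs))]} sprints")
```

Sequential dependencies, where one team cannot start until another finishes, would add the sprint counts rather than take the maximum.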
---
## Common Pitfalls
### 1. Insufficient Historical Data
**Problem:** Using too few sprint data points
**Solution:** Minimum 6-8 sprints for reliable forecasting
**Mitigation:** Use industry benchmarks or similar team data
### 2. Non-Stationary Data
**Problem:** Including data from different team compositions or processes
**Solution:** Use only recent, relevant historical data
**Identification:** Look for structural breaks in velocity time series
### 3. False Precision
**Problem:** Reporting over-precise estimates (e.g., "23.7 points")
**Solution:** Round to reasonable precision, emphasize ranges
**Communication:** Use language like "approximately" and "around"
### 4. Ignoring External Factors
**Problem:** Not accounting for holidays, team changes, external dependencies
**Solution:** Adjust historical data or forecasts for known factors
**Documentation:** Maintain context for each sprint's circumstances
### 5. Overconfidence in Models
**Problem:** Treating forecasts as guarantees
**Solution:** Regular calibration against actual outcomes
**Improvement:** Update models based on forecast accuracy
---
## Case Studies
### Case Study 1: Stabilizing Team
**Situation:** New team, first 10 sprints, velocity ranging 15-25 points
**Approach:**
- Used bootstrap sampling due to small sample size
- Applied 30% buffer for team learning curve
- Updated forecast every 2 sprints
**Results:**
- Initial forecast: 20 ± 8 points per sprint
- Final 3 sprints: 22 ± 3 points per sprint
- Accuracy improved from 60% to 85% confidence bands
### Case Study 2: Seasonal Product Team
**Situation:** E-commerce team with holiday impacts
**Data:** 24 sprints showing clear seasonal patterns
**Approach:**
- Identified seasonal multipliers (0.7x during holidays)
- Used 2-year historical data for seasonal adjustment
- Applied capacity-based modeling for temporary staff
**Results:**
- Standard model: 40% forecast accuracy during Q4
- Seasonal-adjusted model: 80% forecast accuracy
- Better resource planning and stakeholder communication
### Case Study 3: Platform Team with Dependencies
**Situation:** Infrastructure team supporting multiple product teams
**Challenge:** High variability due to urgent requests and dependencies
**Approach:**
- Separated planned vs. unplanned work velocity
- Used wider confidence intervals (90% vs 70%)
- Implemented buffer management strategy
**Results:**
- Planned work predictability: 85%
- Total work predictability: 65% (acceptable for context)
- Improved capacity allocation decisions
---
## Tools and Implementation
### Recommended Tools
1. **Python/R:** For custom implementation and complex models
2. **Excel/Google Sheets:** For simple implementations and visualization
3. **Jira/Azure DevOps:** For automated data collection
4. **Specialized Tools:** ActionableAgile, Monte Carlo simulation software
### Key Metrics to Track
- **Forecast Accuracy:** How often do actual results fall within predicted ranges?
- **Calibration:** Do 70% confidence intervals contain 70% of actual results?
- **Bias:** Are forecasts consistently optimistic or pessimistic?
- **Resolution:** How precise are the forecasts for decision-making?
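Calibration and bias from the list above can be checked with a few lines once past forecasts and actual results are recorded side by side. This is a sketch with hypothetical function names and made-up data.

```python
def calibration_rate(forecast_intervals, actuals):
    """Share of actual results that landed inside their forecast interval."""
    if len(forecast_intervals) != len(actuals) or not actuals:
        raise ValueError("need one interval per actual result")
    hits = sum(
        1 for (low, high), actual in zip(forecast_intervals, actuals)
        if low <= actual <= high
    )
    return hits / len(actuals)

def forecast_bias(point_forecasts, actuals):
    """Mean of (forecast - actual); positive values mean optimistic forecasts."""
    if len(point_forecasts) != len(actuals) or not actuals:
        raise ValueError("need one forecast per actual result")
    return sum(f - a for f, a in zip(point_forecasts, actuals)) / len(actuals)

# Hypothetical 70% intervals and observed velocities for five sprints
intervals = [(18, 26), (18, 26), (19, 25), (20, 26), (20, 25)]
actuals = [22, 17, 24, 27, 21]
print(calibration_rate(intervals, actuals))  # 0.6
print(forecast_bias([22, 22, 22, 23, 22], actuals))  # 0.0
```

A 70% interval that captures only 60% of outcomes hints at overconfident ranges, though five sprints is too small a sample to conclude much.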
### Implementation Checklist
- [ ] Historical velocity data collection (minimum 6 sprints)
- [ ] Data quality validation (outliers, context)
- [ ] Distribution analysis (normal, skewed, multi-modal)
- [ ] Model selection and parameter estimation
- [ ] Validation against held-out data
- [ ] Visualization and communication materials
- [ ] Regular calibration and model updates
---
## Conclusion
Monte Carlo velocity forecasting transforms uncertain estimates into probabilistic statements that enable better decision-making. Success requires:
1. **Quality Data:** Clean, relevant historical velocity data
2. **Appropriate Models:** Choose methods suited to your team's patterns
3. **Clear Communication:** Present uncertainty honestly to stakeholders
4. **Continuous Improvement:** Calibrate and refine models over time
5. **Contextual Awareness:** Account for team changes, external factors, and business context
The goal is not perfect prediction, but better understanding of uncertainty to make more informed planning decisions.
---
*This guide provides a comprehensive foundation for implementing probabilistic velocity forecasting. Adapt the techniques to your team's specific context and constraints.*
FILE:scripts/retrospective_analyzer.py
#!/usr/bin/env python3
"""
Retrospective Analyzer
Processes retrospective data to track action item completion rates, identify
recurring themes, measure improvement trends, and generate insights for
continuous team improvement.
Usage:
python retrospective_analyzer.py retro_data.json
python retrospective_analyzer.py retro_data.json --format json
"""
import argparse
import json
import re
import statistics
import sys
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Set, Tuple
# ---------------------------------------------------------------------------
# Configuration and Constants
# ---------------------------------------------------------------------------
SENTIMENT_KEYWORDS = {
"positive": [
"good", "great", "excellent", "awesome", "fantastic", "wonderful",
"improved", "better", "success", "achievement", "celebration",
"working well", "effective", "efficient", "smooth", "pleased",
"happy", "satisfied", "proud", "accomplished", "breakthrough"
],
"negative": [
"bad", "terrible", "awful", "horrible", "frustrating", "annoying",
"problem", "issue", "blocker", "impediment", "concern", "worry",
"difficult", "challenging", "struggling", "failing", "broken",
"slow", "delayed", "confused", "unclear", "chaos", "stressed"
],
"neutral": [
"okay", "average", "normal", "standard", "typical", "usual",
"process", "procedure", "meeting", "discussion", "review",
"update", "status", "information", "data", "report"
]
}
THEME_CATEGORIES = {
"communication": [
"communication", "meeting", "standup", "discussion", "feedback",
"information", "clarity", "understanding", "alignment", "sync",
"reporting", "updates", "transparency", "visibility"
],
"process": [
"process", "procedure", "workflow", "methodology", "framework",
"scrum", "agile", "ceremony", "planning", "retrospective",
"review", "estimation", "refinement", "definition of done"
],
"technical": [
"technical", "code", "development", "bug", "testing", "deployment",
"architecture", "infrastructure", "tools", "technology",
"performance", "quality", "automation", "ci/cd", "devops"
],
"team_dynamics": [
"team", "collaboration", "cooperation", "support", "morale",
"motivation", "engagement", "culture", "relationship", "trust",
"conflict", "personality", "workload", "capacity", "burnout"
],
"external": [
"customer", "stakeholder", "management", "product owner", "business",
"requirement", "priority", "deadline", "budget", "resource",
"dependency", "vendor", "third party", "integration"
]
}
ACTION_PRIORITY_KEYWORDS = {
"high": ["urgent", "critical", "asap", "immediately", "blocker", "must"],
"medium": ["important", "should", "needed", "required", "significant"],
"low": ["nice to have", "consider", "explore", "investigate", "eventually"]
}
COMPLETION_STATUS_MAPPING = {
"completed": ["done", "completed", "finished", "resolved", "closed", "achieved"],
"in_progress": ["in progress", "ongoing", "working on", "started", "partial"],
"blocked": ["blocked", "stuck", "waiting", "dependent", "impediment"],
"cancelled": ["cancelled", "dropped", "abandoned", "not needed", "deprioritized"],
"not_started": ["not started", "pending", "todo", "planned", "upcoming"]
}
# ---------------------------------------------------------------------------
# Data Models
# ---------------------------------------------------------------------------
class ActionItem:
    """Represents a single action item from a retrospective."""

    def __init__(self, data: Dict[str, Any]):
        self.id: str = data.get("id", "")
        self.description: str = data.get("description", "")
        self.owner: str = data.get("owner", "")
        self.priority: str = data.get("priority", "medium").lower()
        self.due_date: Optional[str] = data.get("due_date")
        self.status: str = data.get("status", "not_started").lower()
        self.created_sprint: int = data.get("created_sprint", 0)
        self.completed_sprint: Optional[int] = data.get("completed_sprint")
        self.category: str = data.get("category", "")
        self.effort_estimate: str = data.get("effort_estimate", "medium")
        # Normalize status
        self.normalized_status = self._normalize_status(self.status)
        # Infer priority from description if not explicitly set
        if self.priority == "medium":
            self.inferred_priority = self._infer_priority(self.description)
        else:
            self.inferred_priority = self.priority

    def _normalize_status(self, status: str) -> str:
        """Normalize status to standard categories."""
        # Treat "not_started" and "not started" alike
        status_lower = status.lower().strip().replace("_", " ")
        # Check "not_started" first: "not started" contains "started",
        # which would otherwise be matched as "in_progress"
        categories = ["not_started"] + [
            c for c in COMPLETION_STATUS_MAPPING if c != "not_started"
        ]
        for category in categories:
            if any(s in status_lower for s in COMPLETION_STATUS_MAPPING[category]):
                return category
        return "not_started"

    def _infer_priority(self, description: str) -> str:
        """Infer priority from description text."""
        description_lower = description.lower()
        for priority, keywords in ACTION_PRIORITY_KEYWORDS.items():
            if any(keyword in description_lower for keyword in keywords):
                return priority
        return "medium"

    @property
    def is_completed(self) -> bool:
        return self.normalized_status == "completed"

    @property
    def is_overdue(self) -> bool:
        if not self.due_date:
            return False
        try:
            due_date = datetime.strptime(self.due_date, "%Y-%m-%d")
            return datetime.now() > due_date and not self.is_completed
        except ValueError:
            return False
class RetrospectiveData:
    """Represents data from a single retrospective session."""

    def __init__(self, data: Dict[str, Any]):
        self.sprint_number: int = data.get("sprint_number", 0)
        self.date: str = data.get("date", "")
        self.facilitator: str = data.get("facilitator", "")
        self.attendees: List[str] = data.get("attendees", [])
        self.duration_minutes: int = data.get("duration_minutes", 0)
        # Retrospective categories
        self.went_well: List[str] = data.get("went_well", [])
        self.to_improve: List[str] = data.get("to_improve", [])
        self.action_items_data: List[Dict[str, Any]] = data.get("action_items", [])
        # Create action items
        self.action_items: List[ActionItem] = [
            ActionItem({**item, "created_sprint": self.sprint_number})
            for item in self.action_items_data
        ]
        # Calculate metrics
        self._calculate_metrics()

    def _calculate_metrics(self):
        """Calculate retrospective session metrics."""
        self.total_items = len(self.went_well) + len(self.to_improve)
        self.action_items_count = len(self.action_items)
        self.attendance_rate = len(self.attendees) / max(1, 5)  # Assume team of 5
        # Sentiment analysis
        self.sentiment_scores = self._analyze_sentiment()
        # Theme analysis
        self.themes = self._extract_themes()

    def _analyze_sentiment(self) -> Dict[str, float]:
        """Analyze sentiment of retrospective items."""
        all_text = " ".join(self.went_well + self.to_improve).lower()
        sentiment_scores = {}
        for sentiment, keywords in SENTIMENT_KEYWORDS.items():
            count = sum(1 for keyword in keywords if keyword in all_text)
            sentiment_scores[sentiment] = count
        # Normalize to percentages
        total_sentiment = sum(sentiment_scores.values())
        if total_sentiment > 0:
            for sentiment in sentiment_scores:
                sentiment_scores[sentiment] = sentiment_scores[sentiment] / total_sentiment
        return sentiment_scores

    def _extract_themes(self) -> Dict[str, int]:
        """Extract themes from retrospective items."""
        all_text = " ".join(self.went_well + self.to_improve).lower()
        theme_counts = {}
        for theme, keywords in THEME_CATEGORIES.items():
            count = sum(1 for keyword in keywords if keyword in all_text)
            if count > 0:
                theme_counts[theme] = count
        return theme_counts
class RetroAnalysisResult:
    """Complete retrospective analysis results."""

    def __init__(self):
        self.summary: Dict[str, Any] = {}
        self.action_item_analysis: Dict[str, Any] = {}
        self.theme_analysis: Dict[str, Any] = {}
        self.improvement_trends: Dict[str, Any] = {}
        self.recommendations: List[str] = []
# ---------------------------------------------------------------------------
# Analysis Functions
# ---------------------------------------------------------------------------
def analyze_action_item_completion(retros: List[RetrospectiveData]) -> Dict[str, Any]:
    """Analyze action item completion rates and patterns."""
    all_action_items = []
    for retro in retros:
        all_action_items.extend(retro.action_items)
    if not all_action_items:
        return {
            "total_action_items": 0,
            "completion_rate": 0.0,
            "average_completion_time": 0.0
        }
    # Overall completion statistics
    completed_items = [item for item in all_action_items if item.is_completed]
    completion_rate = len(completed_items) / len(all_action_items)
    # Completion time analysis
    completion_times = []
    for item in completed_items:
        if item.completed_sprint and item.created_sprint:
            completion_time = item.completed_sprint - item.created_sprint
            if completion_time >= 0:
                completion_times.append(completion_time)
    avg_completion_time = statistics.mean(completion_times) if completion_times else 0.0
    # Status distribution
    status_counts = Counter(item.normalized_status for item in all_action_items)
    # Priority analysis
    priority_completion = {}
    for priority in ["high", "medium", "low"]:
        priority_items = [item for item in all_action_items if item.inferred_priority == priority]
        if priority_items:
            priority_completed = sum(1 for item in priority_items if item.is_completed)
            priority_completion[priority] = {
                "total": len(priority_items),
                "completed": priority_completed,
                "completion_rate": priority_completed / len(priority_items)
            }
    # Owner analysis
    owner_performance = defaultdict(lambda: {"total": 0, "completed": 0})
    for item in all_action_items:
        if item.owner:
            owner_performance[item.owner]["total"] += 1
            if item.is_completed:
                owner_performance[item.owner]["completed"] += 1
    for owner in owner_performance:
        owner_data = owner_performance[owner]
        owner_data["completion_rate"] = owner_data["completed"] / owner_data["total"]
    # Overdue items
    overdue_items = [item for item in all_action_items if item.is_overdue]
    return {
        "total_action_items": len(all_action_items),
        "completion_rate": completion_rate,
        "completed_items": len(completed_items),
        "average_completion_time": avg_completion_time,
        "status_distribution": dict(status_counts),
        "priority_analysis": priority_completion,
        "owner_performance": dict(owner_performance),
        "overdue_items": len(overdue_items),
        "overdue_rate": len(overdue_items) / len(all_action_items) if all_action_items else 0.0
    }
def analyze_recurring_themes(retros: List[RetrospectiveData]) -> Dict[str, Any]:
    """Identify recurring themes across retrospectives."""
    theme_evolution = defaultdict(list)
    sentiment_evolution = defaultdict(list)
    # Track themes over time
    for retro in retros:
        sprint = retro.sprint_number
        # Theme tracking
        for theme, count in retro.themes.items():
            theme_evolution[theme].append((sprint, count))
        # Sentiment tracking
        for sentiment, score in retro.sentiment_scores.items():
            sentiment_evolution[sentiment].append((sprint, score))
    # Identify recurring themes (appear in at least 50% of retros)
    recurring_threshold = len(retros) * 0.5
    recurring_themes = {}
    for theme, occurrences in theme_evolution.items():
        if len(occurrences) >= recurring_threshold:
            sprints, counts = zip(*occurrences)
            recurring_themes[theme] = {
                "frequency": len(occurrences) / len(retros),
                "average_mentions": statistics.mean(counts),
                "trend": _calculate_trend(list(counts)),
                "first_appearance": min(sprints),
                "last_appearance": max(sprints),
                "total_mentions": sum(counts)
            }
    # Sentiment trend analysis
    sentiment_trends = {}
    for sentiment, scores_by_sprint in sentiment_evolution.items():
        if len(scores_by_sprint) >= 3:  # Need at least 3 data points
            _, scores = zip(*scores_by_sprint)
            sentiment_trends[sentiment] = {
                "average_score": statistics.mean(scores),
                "trend": _calculate_trend(list(scores)),
                "volatility": statistics.stdev(scores) if len(scores) > 1 else 0.0
            }
    # Identify persistent issues (negative themes that recur)
    persistent_issues = []
    for theme, data in recurring_themes.items():
        if theme in ["technical", "process", "external"] and data["frequency"] > 0.6:
            if data["trend"]["direction"] in ["stable", "increasing"]:
                persistent_issues.append({
                    "theme": theme,
                    "frequency": data["frequency"],
                    "severity": data["average_mentions"],
                    "trend": data["trend"]["direction"]
                })
    return {
        "recurring_themes": recurring_themes,
        "sentiment_trends": sentiment_trends,
        "persistent_issues": persistent_issues,
        "total_themes_identified": len(theme_evolution),
        "themes_per_retro": sum(len(r.themes) for r in retros) / len(retros) if retros else 0
    }
def analyze_improvement_trends(retros: List[RetrospectiveData]) -> Dict[str, Any]:
    """Analyze improvement trends across retrospectives."""
    if len(retros) < 3:
        return {"error": "Need at least 3 retrospectives for trend analysis"}
    # Sort retrospectives by sprint number
    sorted_retros = sorted(retros, key=lambda r: r.sprint_number)
    # Track various metrics over time
    metrics_over_time = {
        "action_items_per_retro": [len(r.action_items) for r in sorted_retros],
        "attendance_rate": [r.attendance_rate for r in sorted_retros],
        "duration": [r.duration_minutes for r in sorted_retros],
        "positive_sentiment": [r.sentiment_scores.get("positive", 0) for r in sorted_retros],
        "negative_sentiment": [r.sentiment_scores.get("negative", 0) for r in sorted_retros],
        "total_items_discussed": [r.total_items for r in sorted_retros]
    }
    # Calculate trends for each metric
    trend_analysis = {}
    for metric_name, values in metrics_over_time.items():
        if len(values) >= 3:
            trend_analysis[metric_name] = {
                "values": values,
                "trend": _calculate_trend(values),
                "average": statistics.mean(values),
                "latest": values[-1],
                "change_from_first": ((values[-1] - values[0]) / values[0]) if values[0] != 0 else 0
            }
    # Action item completion trend
    completion_rates_by_sprint = []
    for i, retro in enumerate(sorted_retros):
        if i > 0:  # Skip first retro as it has no previous action items to complete
            prev_retro = sorted_retros[i-1]
            if prev_retro.action_items:
                completed_count = sum(1 for item in prev_retro.action_items
                                      if item.is_completed and item.completed_sprint == retro.sprint_number)
                completion_rate = completed_count / len(prev_retro.action_items)
                completion_rates_by_sprint.append(completion_rate)
    if completion_rates_by_sprint:
        trend_analysis["action_item_completion"] = {
            "values": completion_rates_by_sprint,
            "trend": _calculate_trend(completion_rates_by_sprint),
            "average": statistics.mean(completion_rates_by_sprint),
            "latest": completion_rates_by_sprint[-1] if completion_rates_by_sprint else 0
        }
    # Team maturity indicators
    maturity_score = _calculate_team_maturity(sorted_retros)
    return {
        "trend_analysis": trend_analysis,
        "team_maturity_score": maturity_score,
        "retrospective_quality_trend": _assess_retrospective_quality_trend(sorted_retros),
        "improvement_velocity": _calculate_improvement_velocity(sorted_retros)
    }
def _calculate_trend(values: List[float]) -> Dict[str, Any]:
    """Calculate trend direction and strength for a series of values."""
    if len(values) < 2:
        return {"direction": "insufficient_data", "strength": 0.0}
    # Simple linear regression
    n = len(values)
    x_values = list(range(n))
    x_mean = sum(x_values) / n
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, values))
    denominator = sum((x - x_mean) ** 2 for x in x_values)
    if denominator == 0:
        slope = 0
    else:
        slope = numerator / denominator
    # Calculate correlation coefficient for trend strength
    try:
        correlation = statistics.correlation(x_values, values) if n > 2 else 0.0
    except statistics.StatisticsError:
        correlation = 0.0
    # Determine trend direction
    if abs(slope) < 0.01:  # Practically no change
        direction = "stable"
    elif slope > 0:
        direction = "increasing"
    else:
        direction = "decreasing"
    return {
        "direction": direction,
        "slope": slope,
        "strength": abs(correlation),
        "correlation": correlation
    }
def _calculate_team_maturity(retros: List[RetrospectiveData]) -> Dict[str, Any]:
"""Calculate team maturity based on retrospective patterns."""
if len(retros) < 3:
return {"score": 50, "level": "developing"}
maturity_indicators = {
"action_item_focus": 0, # Fewer but higher quality action items
"sentiment_balance": 0, # Balanced positive/negative sentiment
"theme_consistency": 0, # Consistent themes without chaos
"participation": 0, # High attendance rates
"follow_through": 0 # Good action item completion
}
# Action item focus (quality over quantity)
avg_action_items = sum(len(r.action_items) for r in retros) / len(retros)
if 2 <= avg_action_items <= 5: # Sweet spot
maturity_indicators["action_item_focus"] = 100
elif avg_action_items < 2 or avg_action_items > 8:
maturity_indicators["action_item_focus"] = 30
else:
maturity_indicators["action_item_focus"] = 70
# Sentiment balance
avg_positive = sum(r.sentiment_scores.get("positive", 0) for r in retros) / len(retros)
avg_negative = sum(r.sentiment_scores.get("negative", 0) for r in retros) / len(retros)
if 0.3 <= avg_positive <= 0.6 and 0.2 <= avg_negative <= 0.4:
maturity_indicators["sentiment_balance"] = 100
else:
maturity_indicators["sentiment_balance"] = 50
# Participation
avg_attendance = sum(r.attendance_rate for r in retros) / len(retros)
maturity_indicators["participation"] = min(100, avg_attendance * 100)
# Theme consistency (not too chaotic, not too narrow)
avg_themes = sum(len(r.themes) for r in retros) / len(retros)
if 2 <= avg_themes <= 4:
maturity_indicators["theme_consistency"] = 100
else:
maturity_indicators["theme_consistency"] = 70
# Follow-through (estimated from action item patterns)
# This is a simplification; a fuller version would track actual completion
recent_retros = retros[-3:] if len(retros) >= 3 else retros
avg_recent_actions = sum(len(r.action_items) for r in recent_retros) / len(recent_retros)
if avg_recent_actions <= 3: # Fewer action items might indicate better follow-through
maturity_indicators["follow_through"] = 80
else:
maturity_indicators["follow_through"] = 60
# Calculate overall maturity score
overall_score = sum(maturity_indicators.values()) / len(maturity_indicators)
if overall_score >= 85:
level = "high_performing"
elif overall_score >= 70:
level = "performing"
elif overall_score >= 55:
level = "developing"
else:
level = "forming"
return {
"score": overall_score,
"level": level,
"indicators": maturity_indicators
}
def _assess_retrospective_quality_trend(retros: List[RetrospectiveData]) -> Dict[str, Any]:
"""Assess the quality trend of retrospectives over time."""
quality_scores = []
for retro in retros:
score = 0
# Duration appropriateness (60-90 minutes is ideal)
if 60 <= retro.duration_minutes <= 90:
score += 25
elif 45 <= retro.duration_minutes <= 120:
score += 15
else:
score += 5
# Participation
score += min(25, retro.attendance_rate * 25)
# Balance of content
went_well_count = len(retro.went_well)
to_improve_count = len(retro.to_improve)
total_items = went_well_count + to_improve_count
if total_items > 0:
balance = min(went_well_count, to_improve_count) / total_items
score += balance * 25
# Action items quality (not too many, not too few)
action_count = len(retro.action_items)
if 2 <= action_count <= 5:
score += 25
elif 1 <= action_count <= 7:
score += 15
else:
score += 5
quality_scores.append(score)
if len(quality_scores) >= 2:
trend = _calculate_trend(quality_scores)
else:
trend = {"direction": "insufficient_data", "strength": 0.0}
return {
"quality_scores": quality_scores,
"average_quality": statistics.mean(quality_scores) if quality_scores else 0,
"trend": trend,
"latest_quality": quality_scores[-1] if quality_scores else 0
}
def _calculate_improvement_velocity(retros: List[RetrospectiveData]) -> Dict[str, Any]:
"""Calculate how quickly the team improves based on retrospective patterns."""
if len(retros) < 4:
return {"velocity": "insufficient_data"}
# Look at theme evolution - are persistent issues being resolved?
theme_counts = defaultdict(list)
for retro in retros:
for theme, count in retro.themes.items():
theme_counts[theme].append(count)
resolved_themes = 0
persistent_themes = 0
for theme, counts in theme_counts.items():
if len(counts) >= 3:
recent_avg = statistics.mean(counts[-2:])
early_avg = statistics.mean(counts[:2])
if recent_avg < early_avg * 0.7: # 30% reduction
resolved_themes += 1
elif recent_avg > early_avg * 0.9: # Still persistent
persistent_themes += 1
total_themes = resolved_themes + persistent_themes
if total_themes > 0:
resolution_rate = resolved_themes / total_themes
else:
resolution_rate = 0.5 # Neutral if no data
# Action item density: a drop from the first two retros to the last two is
# treated as improved efficiency (len(retros) >= 4 is guaranteed by the guard above)
recent_action_density = sum(len(r.action_items) for r in retros[-2:]) / 2
early_action_density = sum(len(r.action_items) for r in retros[:2]) / 2
action_efficiency = 1.0
if early_action_density > 0:
action_efficiency = min(1.0, early_action_density / max(recent_action_density, 1))
# Overall velocity score
velocity_score = (resolution_rate * 0.6) + (action_efficiency * 0.4)
if velocity_score >= 0.8:
velocity = "high"
elif velocity_score >= 0.6:
velocity = "moderate"
elif velocity_score >= 0.4:
velocity = "low"
else:
velocity = "stagnant"
return {
"velocity": velocity,
"velocity_score": velocity_score,
"theme_resolution_rate": resolution_rate,
"action_efficiency": action_efficiency,
"resolved_themes": resolved_themes,
"persistent_themes": persistent_themes
}
def generate_recommendations(result: RetroAnalysisResult) -> List[str]:
"""Generate actionable recommendations based on retrospective analysis."""
recommendations = []
# Action item recommendations
action_analysis = result.action_item_analysis
completion_rate = action_analysis.get("completion_rate", 0)
if completion_rate < 0.5:
recommendations.append("CRITICAL: Low action item completion rate (<50%). Reduce action items per retro and focus on realistic, achievable goals.")
elif completion_rate < 0.7:
recommendations.append("Improve action item follow-through. Consider assigning owners and due dates more systematically.")
elif completion_rate > 0.9:
recommendations.append("Excellent action item completion! Consider taking on more ambitious improvement initiatives.")
overdue_rate = action_analysis.get("overdue_rate", 0)
if overdue_rate > 0.3:
recommendations.append("High overdue rate suggests unrealistic timelines. Review estimation and prioritization process.")
# Theme recommendations
theme_analysis = result.theme_analysis
persistent_issues = theme_analysis.get("persistent_issues", [])
if len(persistent_issues) >= 2:
recommendations.append(f"Address {len(persistent_issues)} persistent issues that keep recurring across retrospectives.")
for issue in persistent_issues[:2]: # Top 2 issues
recommendations.append(f"Focus on resolving recurring {issue['theme']} issues (appears in {issue['frequency']:.0%} of retros).")
# Trend-based recommendations
improvement_trends = result.improvement_trends
if "team_maturity_score" in improvement_trends:
maturity = improvement_trends["team_maturity_score"]
level = maturity.get("level", "forming")
if level == "forming":
recommendations.append("Team is in forming stage. Focus on establishing basic retrospective disciplines and psychological safety.")
elif level == "developing":
recommendations.append("Team is developing. Work on action item follow-through and deeper root cause analysis.")
elif level == "performing":
recommendations.append("Good team maturity. Consider advanced techniques like continuous improvement tracking.")
elif level == "high_performing":
recommendations.append("Excellent retrospective maturity! Share practices with other teams and focus on innovation.")
# Quality recommendations
if "retrospective_quality_trend" in improvement_trends:
quality_trend = improvement_trends["retrospective_quality_trend"]
avg_quality = quality_trend.get("average_quality", 50)
if avg_quality < 60:
recommendations.append("Retrospective quality is below average. Review facilitation techniques and engagement strategies.")
trend_direction = quality_trend.get("trend", {}).get("direction", "stable")
if trend_direction == "decreasing":
recommendations.append("Retrospective quality is declining. Consider changing facilitation approach or addressing team engagement issues.")
return recommendations
# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------
def analyze_retrospectives(data: Dict[str, Any]) -> RetroAnalysisResult:
"""Perform comprehensive retrospective analysis."""
result = RetroAnalysisResult()
try:
# Parse retrospective data
retro_records = data.get("retrospectives", [])
retros = [RetrospectiveData(record) for record in retro_records]
if not retros:
raise ValueError("No retrospective data found")
# Sort by sprint number
retros.sort(key=lambda r: r.sprint_number)
# Basic summary
result.summary = {
"total_retrospectives": len(retros),
"date_range": {
"first": retros[0].date if retros else "",
"last": retros[-1].date if retros else "",
"span_sprints": retros[-1].sprint_number - retros[0].sprint_number + 1 if retros else 0
},
"average_duration": statistics.mean([r.duration_minutes for r in retros if r.duration_minutes > 0] or [0]),
"average_attendance": statistics.mean([r.attendance_rate for r in retros]),
}
# Action item analysis
result.action_item_analysis = analyze_action_item_completion(retros)
# Theme analysis
result.theme_analysis = analyze_recurring_themes(retros)
# Improvement trends
result.improvement_trends = analyze_improvement_trends(retros)
# Generate recommendations
result.recommendations = generate_recommendations(result)
except Exception as e:
result.summary = {"error": str(e)}
return result
# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text_output(result: RetroAnalysisResult) -> str:
"""Format analysis results as readable text report."""
lines = []
lines.append("="*60)
lines.append("RETROSPECTIVE ANALYSIS REPORT")
lines.append("="*60)
lines.append("")
if "error" in result.summary:
lines.append(f"ERROR: {result.summary['error']}")
return "\n".join(lines)
# Summary section
summary = result.summary
lines.append("RETROSPECTIVE SUMMARY")
lines.append("-"*30)
lines.append(f"Total Retrospectives: {summary['total_retrospectives']}")
lines.append(f"Sprint Range: {summary['date_range']['span_sprints']} sprints")
lines.append(f"Average Duration: {summary.get('average_duration', 0):.0f} minutes")
lines.append(f"Average Attendance: {summary.get('average_attendance', 0):.1%}")
lines.append("")
# Action item analysis
action_analysis = result.action_item_analysis
lines.append("ACTION ITEM ANALYSIS")
lines.append("-"*30)
lines.append(f"Total Action Items: {action_analysis.get('total_action_items', 0)}")
lines.append(f"Completion Rate: {action_analysis.get('completion_rate', 0):.1%}")
lines.append(f"Average Completion Time: {action_analysis.get('average_completion_time', 0):.1f} sprints")
lines.append(f"Overdue Items: {action_analysis.get('overdue_items', 0)} ({action_analysis.get('overdue_rate', 0):.1%})")
priority_analysis = action_analysis.get('priority_analysis', {})
if priority_analysis:
lines.append("Priority-based completion rates:")
for priority, data in priority_analysis.items():
lines.append(f" {priority.title()}: {data['completion_rate']:.1%} ({data['completed']}/{data['total']})")
lines.append("")
# Theme analysis
theme_analysis = result.theme_analysis
lines.append("THEME ANALYSIS")
lines.append("-"*30)
recurring_themes = theme_analysis.get("recurring_themes", {})
if recurring_themes:
lines.append("Top recurring themes:")
sorted_themes = sorted(recurring_themes.items(), key=lambda x: x[1]['frequency'], reverse=True)
for theme, data in sorted_themes[:5]:
lines.append(f" {theme.replace('_', ' ').title()}: {data['frequency']:.1%} frequency, {data['trend']['direction']} trend")
persistent_issues = theme_analysis.get("persistent_issues", [])
if persistent_issues:
lines.append("Persistent issues requiring attention:")
for issue in persistent_issues:
lines.append(f" {issue['theme'].replace('_', ' ').title()}: {issue['frequency']:.1%} frequency")
lines.append("")
# Improvement trends
improvement_trends = result.improvement_trends
if "team_maturity_score" in improvement_trends:
maturity = improvement_trends["team_maturity_score"]
lines.append("TEAM MATURITY")
lines.append("-"*30)
lines.append(f"Maturity Level: {maturity['level'].replace('_', ' ').title()}")
lines.append(f"Maturity Score: {maturity['score']:.0f}/100")
lines.append("")
if "improvement_velocity" in improvement_trends:
velocity = improvement_trends["improvement_velocity"]
lines.append("IMPROVEMENT VELOCITY")
lines.append("-"*30)
lines.append(f"Velocity: {velocity['velocity'].title()}")
lines.append(f"Theme Resolution Rate: {velocity.get('theme_resolution_rate', 0):.1%}")
lines.append("")
# Recommendations
if result.recommendations:
lines.append("RECOMMENDATIONS")
lines.append("-"*30)
for i, rec in enumerate(result.recommendations, 1):
lines.append(f"{i}. {rec}")
return "\n".join(lines)
def format_json_output(result: RetroAnalysisResult) -> Dict[str, Any]:
"""Format analysis results as a JSON-serializable dict."""
return {
"summary": result.summary,
"action_item_analysis": result.action_item_analysis,
"theme_analysis": result.theme_analysis,
"improvement_trends": result.improvement_trends,
"recommendations": result.recommendations,
}
# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------
def main() -> int:
"""Main CLI entry point."""
parser = argparse.ArgumentParser(
description="Analyze retrospective data for continuous improvement insights"
)
parser.add_argument(
"data_file",
help="JSON file containing retrospective data"
)
parser.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format (default: text)"
)
args = parser.parse_args()
try:
# Load and validate data
with open(args.data_file, 'r', encoding='utf-8') as f:
data = json.load(f)
# Perform analysis
result = analyze_retrospectives(data)
# Output results
if args.format == "json":
output = format_json_output(result)
print(json.dumps(output, indent=2))
else:
output = format_text_output(result)
print(output)
return 0
except FileNotFoundError:
print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
return 1
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
return 1
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
return 1
if __name__ == "__main__":
sys.exit(main())
FILE:scripts/sprint_health_scorer.py
#!/usr/bin/env python3
"""
Sprint Health Scorer
Scores sprint health across multiple dimensions including commitment reliability,
scope creep, blocker resolution time, ceremony attendance, and story completion
distribution. Produces composite health scores with actionable recommendations.
Usage:
python sprint_health_scorer.py sprint_data.json
python sprint_health_scorer.py sprint_data.json --format json
"""
import argparse
import json
import statistics
import sys
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple
# ---------------------------------------------------------------------------
# Scoring Configuration
# ---------------------------------------------------------------------------
HEALTH_DIMENSIONS = {
"commitment_reliability": {
"weight": 0.25,
"excellent_threshold": 0.95, # 95%+ commitment achievement
"good_threshold": 0.85, # 85%+ commitment achievement
"poor_threshold": 0.70, # Below 70% is poor
},
"scope_stability": {
"weight": 0.20,
"excellent_threshold": 0.05, # ≤5% scope change
"good_threshold": 0.15, # ≤15% scope change
"poor_threshold": 0.30, # >30% scope change is poor
},
"blocker_resolution": {
"weight": 0.15,
"excellent_threshold": 1.0, # ≤1 day average resolution
"good_threshold": 3.0, # ≤3 days average resolution
"poor_threshold": 7.0, # >7 days is poor
},
"ceremony_engagement": {
"weight": 0.15,
"excellent_threshold": 0.95, # 95%+ attendance
"good_threshold": 0.85, # 85%+ attendance
"poor_threshold": 0.70, # Below 70% is poor
},
"story_completion_distribution": {
"weight": 0.15,
"excellent_threshold": 0.80, # 80%+ stories fully completed
"good_threshold": 0.65, # 65%+ stories completed
"poor_threshold": 0.50, # Below 50% is poor
},
"velocity_predictability": {
"weight": 0.10,
"excellent_threshold": 0.10, # ≤10% CV
"good_threshold": 0.20, # ≤20% CV
"poor_threshold": 0.35, # >35% CV is poor
}
}
OVERALL_HEALTH_THRESHOLDS = {
"excellent": 85,
"good": 70,
"fair": 55,
"poor": 40,
}
STORY_STATUS_MAPPING = {
"completed": ["done", "completed", "closed", "resolved"],
"in_progress": ["in progress", "in_progress", "development", "testing"],
"blocked": ["blocked", "impediment", "waiting"],
"not_started": ["todo", "to do", "backlog", "new", "open"],
}
# ---------------------------------------------------------------------------
# Data Models
# ---------------------------------------------------------------------------
class Story:
"""Represents a user story within a sprint."""
def __init__(self, data: Dict[str, Any]):
self.id: str = data.get("id", "")
self.title: str = data.get("title", "")
self.points: int = data.get("points", 0)
self.status: str = data.get("status", "").lower()
self.assigned_to: str = data.get("assigned_to", "")
self.created_date: str = data.get("created_date", "")
self.completed_date: Optional[str] = data.get("completed_date")
self.blocked_days: int = data.get("blocked_days", 0)
self.priority: str = data.get("priority", "medium")
# Normalize status
self.normalized_status = self._normalize_status(self.status)
def _normalize_status(self, status: str) -> str:
"""Normalize status to standard categories."""
status_lower = status.lower().strip()
for category, statuses in STORY_STATUS_MAPPING.items():
if status_lower in statuses:
return category
return "unknown"
@property
def is_completed(self) -> bool:
return self.normalized_status == "completed"
@property
def is_blocked(self) -> bool:
return self.normalized_status == "blocked" or self.blocked_days > 0
class SprintHealthData:
"""Comprehensive sprint health data model."""
def __init__(self, data: Dict[str, Any]):
self.sprint_number: int = data.get("sprint_number", 0)
self.sprint_name: str = data.get("sprint_name", "")
self.start_date: str = data.get("start_date", "")
self.end_date: str = data.get("end_date", "")
self.team_size: int = data.get("team_size", 0)
self.working_days: int = data.get("working_days", 10)
# Commitment and delivery
self.planned_points: int = data.get("planned_points", 0)
self.completed_points: int = data.get("completed_points", 0)
self.added_points: int = data.get("added_points", 0)
self.removed_points: int = data.get("removed_points", 0)
# Stories
story_data = data.get("stories", [])
self.stories: List[Story] = [Story(story) for story in story_data]
# Blockers
self.blockers: List[Dict[str, Any]] = data.get("blockers", [])
# Ceremonies
self.ceremonies: Dict[str, Any] = data.get("ceremonies", {})
# Calculate derived metrics
self._calculate_derived_metrics()
def _calculate_derived_metrics(self):
"""Calculate derived health metrics."""
# Commitment reliability
self.commitment_ratio = (
self.completed_points / max(self.planned_points, 1)
)
# Scope change
total_scope_change = self.added_points + self.removed_points
self.scope_change_ratio = total_scope_change / max(self.planned_points, 1)
# Story completion distribution
total_stories = len(self.stories)
if total_stories > 0:
completed_stories = sum(1 for story in self.stories if story.is_completed)
self.story_completion_ratio = completed_stories / total_stories
else:
self.story_completion_ratio = 0.0
# Blocked stories analysis
blocked_stories = [story for story in self.stories if story.is_blocked]
self.blocked_stories_count = len(blocked_stories)
self.blocked_points = sum(story.points for story in blocked_stories)
class HealthScoreResult:
"""Complete health scoring results."""
def __init__(self):
self.dimension_scores: Dict[str, Dict[str, Any]] = {}
self.overall_score: float = 0.0
self.health_grade: str = ""
self.trend_analysis: Dict[str, Any] = {}
self.recommendations: List[str] = []
self.detailed_metrics: Dict[str, Any] = {}
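The data models above read every field with `data.get(...)`, so the expected input shape can be inferred from the keys they look up. The sketch below builds a minimal single-sprint payload from those keys and recomputes two of the derived ratios by hand. All sample values are invented for illustration, and the shape of each blocker and ceremony entry is an assumption based on the keys the scoring functions read (`resolution_days`, `attendance_rate`, `engagement_score`).

```python
import json

# Hypothetical sample data; only the key names come from the data models.
sample = {
    "sprints": [
        {
            "sprint_number": 12,
            "sprint_name": "Sprint 12",
            "start_date": "2024-03-04",
            "end_date": "2024-03-15",
            "team_size": 6,
            "working_days": 10,
            "planned_points": 40,
            "completed_points": 36,
            "added_points": 5,
            "removed_points": 3,
            "stories": [
                {"id": "S-1", "title": "Login form", "points": 5, "status": "Done"},
                {"id": "S-2", "title": "Audit log", "points": 8, "status": "In Progress"},
            ],
            "blockers": [{"resolution_days": 2}],
            "ceremonies": {
                "daily_standup": {"attendance_rate": 0.9, "engagement_score": 0.8}
            },
        }
    ]
}

sprint = sample["sprints"][0]
# Same formulas as _calculate_derived_metrics: divide by planned points,
# using max(..., 1) to avoid dividing by zero.
commitment_ratio = sprint["completed_points"] / max(sprint["planned_points"], 1)
scope_change_ratio = (sprint["added_points"] + sprint["removed_points"]) / max(
    sprint["planned_points"], 1
)

print(f"commitment={commitment_ratio:.0%}, scope change={scope_change_ratio:.0%}")
serialized = json.dumps(sample, indent=2)  # text suitable for a sprint_data.json file
```

Added and removed points both count toward scope change, so a sprint that swaps 5 points in and 3 points out registers 8 points of churn against its plan.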
# ---------------------------------------------------------------------------
# Scoring Functions
# ---------------------------------------------------------------------------
def score_commitment_reliability(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score commitment reliability across sprints."""
if not sprints:
return {"score": 0, "grade": "insufficient_data"}
commitment_ratios = [sprint.commitment_ratio for sprint in sprints]
avg_commitment = statistics.mean(commitment_ratios)
consistency = 1.0 - (statistics.stdev(commitment_ratios) if len(commitment_ratios) > 1 else 0)
# Score based on average achievement and consistency
config = HEALTH_DIMENSIONS["commitment_reliability"]
base_score = _calculate_dimension_score(avg_commitment, config)
# Consistency bonus: up to 10 points when commitment ratios vary little between sprints
consistency_bonus = min(10, consistency * 10)
final_score = min(100, base_score + consistency_bonus)
return {
"score": final_score,
"grade": _score_to_grade(final_score),
"average_commitment": avg_commitment,
"consistency": consistency,
"commitment_ratios": commitment_ratios,
"details": f"Average commitment: {avg_commitment:.1%}, Consistency: {consistency:.1%}"
}
def score_scope_stability(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score scope stability (low scope change is better)."""
if not sprints:
return {"score": 0, "grade": "insufficient_data"}
scope_change_ratios = [sprint.scope_change_ratio for sprint in sprints]
avg_scope_change = statistics.mean(scope_change_ratios)
# For scope change, lower is better, so invert the scoring
config = HEALTH_DIMENSIONS["scope_stability"]
if avg_scope_change <= config["excellent_threshold"]:
score = 90 + (config["excellent_threshold"] - avg_scope_change) * 200
elif avg_scope_change <= config["good_threshold"]:
score = 70 + (config["good_threshold"] - avg_scope_change) * 200
elif avg_scope_change <= config["poor_threshold"]:
score = 40 + (config["poor_threshold"] - avg_scope_change) * 200
else:
score = max(0, 40 - (avg_scope_change - config["poor_threshold"]) * 100)
score = min(100, max(0, score))
return {
"score": score,
"grade": _score_to_grade(score),
"average_scope_change": avg_scope_change,
"scope_change_ratios": scope_change_ratios,
"details": f"Average scope change: {avg_scope_change:.1%}"
}
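To make the inverted, lower-is-better scoring above easier to follow, here is a standalone sketch of the same piecewise-linear mapping evaluated at a few scope-change levels. The function name `scope_stability_score` is hypothetical; the thresholds are copied from `HEALTH_DIMENSIONS["scope_stability"]`.

```python
def scope_stability_score(avg_scope_change: float) -> float:
    """Piecewise-linear score where a smaller scope change earns a higher score."""
    excellent, good, poor = 0.05, 0.15, 0.30  # thresholds from HEALTH_DIMENSIONS
    if avg_scope_change <= excellent:
        score = 90 + (excellent - avg_scope_change) * 200
    elif avg_scope_change <= good:
        score = 70 + (good - avg_scope_change) * 200
    elif avg_scope_change <= poor:
        score = 40 + (poor - avg_scope_change) * 200
    else:
        score = 40 - (avg_scope_change - poor) * 100
    return min(100, max(0, score))  # clamp to the 0-100 range


for change in (0.0, 0.05, 0.10, 0.15, 0.30, 0.50):
    print(f"{change:.0%} scope change -> {scope_stability_score(change):.0f}")
```

Each segment picks up where the previous one ends (90 at 5%, 70 at 15%, 40 at 30%), so the score has no jumps at the thresholds; beyond 30% it falls by one point per additional percentage point of change.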
def score_blocker_resolution(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score blocker resolution efficiency."""
if not sprints:
return {"score": 0, "grade": "insufficient_data"}
all_blockers = []
for sprint in sprints:
all_blockers.extend(sprint.blockers)
if not all_blockers:
return {
"score": 100,
"grade": "excellent",
"average_resolution_time": 0,
"details": "No blockers reported"
}
# Calculate average resolution time
resolution_times = []
for blocker in all_blockers:
resolution_time = blocker.get("resolution_days", 0)
if resolution_time > 0:
resolution_times.append(resolution_time)
if not resolution_times:
return {"score": 50, "grade": "fair", "details": "No resolution time data"}
avg_resolution_time = statistics.mean(resolution_times)
# Score based on resolution time (lower is better)
config = HEALTH_DIMENSIONS["blocker_resolution"]
if avg_resolution_time <= config["excellent_threshold"]:
score = 95
elif avg_resolution_time <= config["good_threshold"]:
score = 80 - (avg_resolution_time - config["excellent_threshold"]) * 10
elif avg_resolution_time <= config["poor_threshold"]:
score = 60 - (avg_resolution_time - config["good_threshold"]) * 5
else:
score = max(20, 40 - (avg_resolution_time - config["poor_threshold"]) * 3)
return {
"score": score,
"grade": _score_to_grade(score),
"average_resolution_time": avg_resolution_time,
"total_blockers": len(all_blockers),
"resolved_blockers": len(resolution_times),
"details": f"Average resolution: {avg_resolution_time:.1f} days from {len(all_blockers)} blockers"
}
def score_ceremony_engagement(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score team engagement in scrum ceremonies."""
if not sprints:
return {"score": 0, "grade": "insufficient_data"}
ceremony_scores = []
ceremony_details = {}
for sprint in sprints:
ceremonies = sprint.ceremonies
sprint_ceremony_scores = []
for ceremony_name, ceremony_data in ceremonies.items():
if isinstance(ceremony_data, dict):
attendance_rate = ceremony_data.get("attendance_rate", 0)
engagement_score = ceremony_data.get("engagement_score", 0)
# Weight attendance more heavily than engagement
ceremony_score = (attendance_rate * 0.7) + (engagement_score * 0.3)
sprint_ceremony_scores.append(ceremony_score)
if ceremony_name not in ceremony_details:
ceremony_details[ceremony_name] = []
ceremony_details[ceremony_name].append({
"sprint": sprint.sprint_number,
"attendance": attendance_rate,
"engagement": engagement_score,
"score": ceremony_score
})
if sprint_ceremony_scores:
ceremony_scores.append(statistics.mean(sprint_ceremony_scores))
if not ceremony_scores:
return {"score": 50, "grade": "fair", "details": "No ceremony data available"}
avg_ceremony_score = statistics.mean(ceremony_scores)
config = HEALTH_DIMENSIONS["ceremony_engagement"]
score = _calculate_dimension_score(avg_ceremony_score, config)
return {
"score": score,
"grade": _score_to_grade(score),
"average_ceremony_score": avg_ceremony_score,
"ceremony_details": ceremony_details,
"details": f"Average ceremony engagement: {avg_ceremony_score:.1%}"
}
def score_story_completion_distribution(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score how well stories are completed vs. partially done."""
if not sprints:
return {"score": 0, "grade": "insufficient_data"}
completion_ratios = []
story_analysis = {
"total_stories": 0,
"completed_stories": 0,
"blocked_stories": 0,
"partial_completion": 0
}
for sprint in sprints:
if sprint.stories:
sprint_completion = sprint.story_completion_ratio
completion_ratios.append(sprint_completion)
story_analysis["total_stories"] += len(sprint.stories)
story_analysis["completed_stories"] += sum(1 for s in sprint.stories if s.is_completed)
story_analysis["blocked_stories"] += sum(1 for s in sprint.stories if s.is_blocked)
if not completion_ratios:
return {"score": 50, "grade": "fair", "details": "No story data available"}
avg_completion_ratio = statistics.mean(completion_ratios)
config = HEALTH_DIMENSIONS["story_completion_distribution"]
score = _calculate_dimension_score(avg_completion_ratio, config)
# Penalty for high number of blocked stories
if story_analysis["total_stories"] > 0:
blocked_ratio = story_analysis["blocked_stories"] / story_analysis["total_stories"]
if blocked_ratio > 0.20: # More than 20% blocked
score = max(0, score - (blocked_ratio - 0.20) * 100)
return {
"score": score,
"grade": _score_to_grade(score),
"average_completion_ratio": avg_completion_ratio,
"story_analysis": story_analysis,
"details": f"Average story completion: {avg_completion_ratio:.1%}"
}
def score_velocity_predictability(sprints: List[SprintHealthData]) -> Dict[str, Any]:
"""Score velocity predictability based on coefficient of variation."""
if len(sprints) < 2:
return {"score": 50, "grade": "fair", "details": "Insufficient sprints for predictability analysis"}
velocities = [sprint.completed_points for sprint in sprints]
mean_velocity = statistics.mean(velocities)
if mean_velocity == 0:
return {"score": 0, "grade": "poor", "details": "No velocity recorded"}
velocity_cv = statistics.stdev(velocities) / mean_velocity
# Lower CV is better for predictability
config = HEALTH_DIMENSIONS["velocity_predictability"]
if velocity_cv <= config["excellent_threshold"]:
score = 95
elif velocity_cv <= config["good_threshold"]:
score = 80 - (velocity_cv - config["excellent_threshold"]) * 150
elif velocity_cv <= config["poor_threshold"]:
score = 60 - (velocity_cv - config["good_threshold"]) * 100
else:
score = max(20, 40 - (velocity_cv - config["poor_threshold"]) * 50)
return {
"score": score,
"grade": _score_to_grade(score),
"coefficient_of_variation": velocity_cv,
"mean_velocity": mean_velocity,
"velocity_std_dev": statistics.stdev(velocities),
"details": f"Velocity CV: {velocity_cv:.1%} (lower is more predictable)"
}
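The coefficient of variation (CV) used above is the sample standard deviation divided by the mean. The sketch below works through it for two invented velocity series; the helper name `predictability_score` is hypothetical and the thresholds are copied from `HEALTH_DIMENSIONS["velocity_predictability"]`.

```python
import statistics
from typing import List, Tuple


def predictability_score(velocities: List[float]) -> Tuple[float, float]:
    """Return (coefficient of variation, score) for a list of sprint velocities."""
    mean_velocity = statistics.mean(velocities)
    cv = statistics.stdev(velocities) / mean_velocity  # needs >= 2 values, mean != 0
    excellent, good, poor = 0.10, 0.20, 0.35
    if cv <= excellent:
        score = 95.0
    elif cv <= good:
        score = 80 - (cv - excellent) * 150
    elif cv <= poor:
        score = 60 - (cv - good) * 100
    else:
        score = max(20.0, 40 - (cv - poor) * 50)
    return cv, score


steady_cv, steady_score = predictability_score([20, 22, 18, 20])
erratic_cv, erratic_score = predictability_score([10, 20, 30])
print(f"steady:  CV={steady_cv:.1%}, score={steady_score:.1f}")
print(f"erratic: CV={erratic_cv:.1%}, score={erratic_score:.1f}")
```

Because the CV is relative to the mean, a team averaging 20 points with a spread of about 1.6 points is judged the same as a team averaging 40 points with a spread of about 3.3.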
def _calculate_dimension_score(value: float, config: Dict[str, Any]) -> float:
"""Score a higher-is-better metric on a 20-95 scale using the dimension's thresholds."""
if value >= config["excellent_threshold"]:
return 95
elif value >= config["good_threshold"]:
# Linear interpolation between good and excellent
range_size = config["excellent_threshold"] - config["good_threshold"]
position = (value - config["good_threshold"]) / range_size
return 80 + (position * 15)
elif value >= config["poor_threshold"]:
# Linear interpolation between poor and good
range_size = config["good_threshold"] - config["poor_threshold"]
position = (value - config["poor_threshold"]) / range_size
return 50 + (position * 30)
else:
# Below poor threshold
return max(20, 50 - (config["poor_threshold"] - value) * 100)
def _score_to_grade(score: float) -> str:
"""Convert a numerical score to a descriptive grade (excellent, good, fair, or poor)."""
if score >= OVERALL_HEALTH_THRESHOLDS["excellent"]:
return "excellent"
elif score >= OVERALL_HEALTH_THRESHOLDS["good"]:
return "good"
elif score >= OVERALL_HEALTH_THRESHOLDS["fair"]:
return "fair"
else:
return "poor"
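The two helpers above combine linear interpolation between thresholds with a fixed grade cutoff table. The following standalone sketch reproduces that logic with the commitment-reliability thresholds (0.95, 0.85, 0.70) so the arithmetic can be checked by hand. The names `dimension_score` and `grade` are hypothetical stand-ins, not the module's own functions.

```python
def dimension_score(value: float, excellent: float, good: float, poor: float) -> float:
    """Map a higher-is-better ratio onto a 20-95 score by linear interpolation."""
    if value >= excellent:
        return 95.0
    if value >= good:
        # 80 at the good threshold, approaching 95 at the excellent threshold
        return 80 + (value - good) / (excellent - good) * 15
    if value >= poor:
        # 50 at the poor threshold, approaching 80 at the good threshold
        return 50 + (value - poor) / (good - poor) * 30
    return max(20.0, 50 - (poor - value) * 100)


def grade(score: float) -> str:
    """Apply the 85 / 70 / 55 cutoffs from OVERALL_HEALTH_THRESHOLDS."""
    if score >= 85:
        return "excellent"
    if score >= 70:
        return "good"
    if score >= 55:
        return "fair"
    return "poor"


for ratio in (0.90, 0.775, 0.60):
    score = dimension_score(ratio, excellent=0.95, good=0.85, poor=0.70)
    print(f"commitment {ratio:.1%} -> {score:.1f} ({grade(score)})")
```

A 90% commitment ratio sits halfway between the good and excellent thresholds, so it scores 80 + 0.5 * 15 = 87.5. That already clears the 85-point cutoff for an "excellent" grade even though the ratio itself is below the 95% excellent threshold.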
# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------
def analyze_sprint_health(data: Dict[str, Any]) -> HealthScoreResult:
"""Perform comprehensive sprint health analysis."""
result = HealthScoreResult()
try:
# Parse sprint data
sprint_records = data.get("sprints", [])
sprints = [SprintHealthData(record) for record in sprint_records]
if not sprints:
raise ValueError("No sprint data found")
# Sort by sprint number
sprints.sort(key=lambda s: s.sprint_number)
# Calculate dimension scores
dimensions = {
"commitment_reliability": score_commitment_reliability,
"scope_stability": score_scope_stability,
"blocker_resolution": score_blocker_resolution,
"ceremony_engagement": score_ceremony_engagement,
"story_completion_distribution": score_story_completion_distribution,
"velocity_predictability": score_velocity_predictability,
}
weighted_scores = []
for dimension_name, scoring_func in dimensions.items():
dimension_result = scoring_func(sprints)
result.dimension_scores[dimension_name] = dimension_result
# Calculate weighted contribution
weight = HEALTH_DIMENSIONS[dimension_name]["weight"]
weighted_score = dimension_result["score"] * weight
weighted_scores.append(weighted_score)
# Calculate overall score
result.overall_score = sum(weighted_scores)
result.health_grade = _score_to_grade(result.overall_score)
# Generate detailed metrics
result.detailed_metrics = _generate_detailed_metrics(sprints)
# Generate recommendations
result.recommendations = _generate_health_recommendations(result)
except Exception as e:
result.dimension_scores = {"error": str(e)}
result.overall_score = 0
return result
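The overall score computed above is a weighted sum of the six dimension scores. A minimal sketch of that arithmetic follows, with the weights copied from `HEALTH_DIMENSIONS` and a set of invented dimension scores; the names `WEIGHTS` and `overall_score` are hypothetical.

```python
from typing import Dict

# Weights copied from HEALTH_DIMENSIONS; they sum to 1.0, so the composite
# stays on the same 0-100 scale as the individual dimension scores.
WEIGHTS: Dict[str, float] = {
    "commitment_reliability": 0.25,
    "scope_stability": 0.20,
    "blocker_resolution": 0.15,
    "ceremony_engagement": 0.15,
    "story_completion_distribution": 0.15,
    "velocity_predictability": 0.10,
}


def overall_score(dimension_scores: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores."""
    return sum(dimension_scores[name] * weight for name, weight in WEIGHTS.items())


# Hypothetical dimension scores for one team
example_scores = {
    "commitment_reliability": 90,
    "scope_stability": 80,
    "blocker_resolution": 70,
    "ceremony_engagement": 85,
    "story_completion_distribution": 75,
    "velocity_predictability": 60,
}
total = overall_score(example_scores)
print(f"overall health score: {total:.1f}")
```

With these inputs the weighted contributions are 22.5, 16, 10.5, 12.75, 11.25, and 6, for a total of 79, which falls in the "good" band (70 to 85).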
def _generate_detailed_metrics(sprints: List[SprintHealthData]) -> Dict[str, Any]:
    """Generate detailed metrics for analysis."""
    metrics = {
        "sprint_count": len(sprints),
        "date_range": {
            "start": sprints[0].start_date if sprints else "",
            "end": sprints[-1].end_date if sprints else "",
        },
        "team_metrics": {},
        "story_metrics": {},
        "blocker_metrics": {},
    }

    if not sprints:
        return metrics

    # Team metrics
    team_sizes = [sprint.team_size for sprint in sprints if sprint.team_size > 0]
    if team_sizes:
        metrics["team_metrics"] = {
            "average_team_size": statistics.mean(team_sizes),
            "team_size_stability": statistics.stdev(team_sizes) if len(team_sizes) > 1 else 0,
        }

    # Story metrics
    all_stories = []
    for sprint in sprints:
        all_stories.extend(sprint.stories)

    if all_stories:
        story_points = [story.points for story in all_stories if story.points > 0]
        metrics["story_metrics"] = {
            "total_stories": len(all_stories),
            "average_story_points": statistics.mean(story_points) if story_points else 0,
            "completed_stories": sum(1 for story in all_stories if story.is_completed),
            "blocked_stories": sum(1 for story in all_stories if story.is_blocked),
        }

    # Blocker metrics
    all_blockers = []
    for sprint in sprints:
        all_blockers.extend(sprint.blockers)

    if all_blockers:
        resolution_times = [b.get("resolution_days", 0) for b in all_blockers if b.get("resolution_days", 0) > 0]
        metrics["blocker_metrics"] = {
            "total_blockers": len(all_blockers),
            "resolved_blockers": len(resolution_times),
            "average_resolution_days": statistics.mean(resolution_times) if resolution_times else 0,
        }

    return metrics


def _generate_health_recommendations(result: HealthScoreResult) -> List[str]:
    """Generate actionable recommendations based on health scores."""
    recommendations = []

    # Overall health recommendations
    if result.overall_score < OVERALL_HEALTH_THRESHOLDS["poor"]:
        recommendations.append("CRITICAL: Sprint health is poor across multiple dimensions. Immediate intervention required.")
    elif result.overall_score < OVERALL_HEALTH_THRESHOLDS["fair"]:
        recommendations.append("Sprint health needs improvement. Focus on top 2-3 problem areas.")
    elif result.overall_score >= OVERALL_HEALTH_THRESHOLDS["excellent"]:
        recommendations.append("Excellent sprint health! Maintain current practices and share learnings with other teams.")

    # Dimension-specific recommendations
    for dimension, scores in result.dimension_scores.items():
        if isinstance(scores, dict) and "score" in scores:
            score = scores["score"]
            grade = scores["grade"]

            if score < 50:  # Poor performance
                if dimension == "commitment_reliability":
                    recommendations.append("Improve sprint planning accuracy and realistic capacity estimation.")
                elif dimension == "scope_stability":
                    recommendations.append("Reduce mid-sprint scope changes. Strengthen backlog refinement process.")
                elif dimension == "blocker_resolution":
                    recommendations.append("Implement faster blocker escalation and resolution processes.")
                elif dimension == "ceremony_engagement":
                    recommendations.append("Improve ceremony facilitation and team engagement strategies.")
                elif dimension == "story_completion_distribution":
                    recommendations.append("Focus on completing stories fully rather than starting many partially.")
                elif dimension == "velocity_predictability":
                    recommendations.append("Work on consistent estimation and delivery patterns.")
            elif score >= 85:  # Excellent performance
                dimension_name = dimension.replace("_", " ").title()
                recommendations.append(f"Excellent {dimension_name}! Document and share best practices.")

    return recommendations

# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text_output(result: HealthScoreResult) -> str:
    """Format results as readable text report."""
    lines = []
    lines.append("="*60)
    lines.append("SPRINT HEALTH ANALYSIS REPORT")
    lines.append("="*60)
    lines.append("")

    if "error" in result.dimension_scores:
        lines.append(f"ERROR: {result.dimension_scores['error']}")
        return "\n".join(lines)

    # Overall health summary
    lines.append("OVERALL HEALTH SUMMARY")
    lines.append("-"*30)
    lines.append(f"Health Score: {result.overall_score:.1f}/100")
    lines.append(f"Health Grade: {result.health_grade.title()}")
    lines.append("")

    # Dimension scores
    lines.append("DIMENSION SCORES")
    lines.append("-"*30)
    for dimension, scores in result.dimension_scores.items():
        if isinstance(scores, dict) and "score" in scores:
            dimension_name = dimension.replace("_", " ").title()
            weight = HEALTH_DIMENSIONS[dimension]["weight"]
            lines.append(f"{dimension_name} (Weight: {weight:.0%})")
            lines.append(f" Score: {scores['score']:.1f}/100 ({scores['grade'].title()})")
            lines.append(f" Details: {scores['details']}")
            lines.append("")

    # Detailed metrics
    metrics = result.detailed_metrics
    if metrics:
        lines.append("DETAILED METRICS")
        lines.append("-"*30)
        lines.append(f"Sprints Analyzed: {metrics.get('sprint_count', 0)}")

        if "team_metrics" in metrics and metrics["team_metrics"]:
            team = metrics["team_metrics"]
            lines.append(f"Average Team Size: {team.get('average_team_size', 0):.1f}")

        if "story_metrics" in metrics and metrics["story_metrics"]:
            stories = metrics["story_metrics"]
            lines.append(f"Total Stories: {stories.get('total_stories', 0)}")
            lines.append(f"Completed Stories: {stories.get('completed_stories', 0)}")
            lines.append(f"Blocked Stories: {stories.get('blocked_stories', 0)}")

        if "blocker_metrics" in metrics and metrics["blocker_metrics"]:
            blockers = metrics["blocker_metrics"]
            lines.append(f"Total Blockers: {blockers.get('total_blockers', 0)}")
            lines.append(f"Average Resolution Time: {blockers.get('average_resolution_days', 0):.1f} days")

        lines.append("")

    # Recommendations
    if result.recommendations:
        lines.append("RECOMMENDATIONS")
        lines.append("-"*30)
        for i, rec in enumerate(result.recommendations, 1):
            lines.append(f"{i}. {rec}")

    return "\n".join(lines)


def format_json_output(result: HealthScoreResult) -> Dict[str, Any]:
    """Format results as JSON."""
    return {
        "overall_score": result.overall_score,
        "health_grade": result.health_grade,
        "dimension_scores": result.dimension_scores,
        "detailed_metrics": result.detailed_metrics,
        "recommendations": result.recommendations,
    }

# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------
def main() -> int:
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(
        description="Analyze sprint health across multiple dimensions"
    )
    parser.add_argument(
        "data_file",
        help="JSON file containing sprint health data"
    )
    parser.add_argument(
        "--format",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)"
    )
    args = parser.parse_args()

    try:
        # Load and validate data
        with open(args.data_file, 'r') as f:
            data = json.load(f)

        # Perform analysis
        result = analyze_sprint_health(data)

        # Output results
        if args.format == "json":
            output = format_json_output(result)
            print(json.dumps(output, indent=2))
        else:
            output = format_text_output(result)
            print(output)

        return 0

    except FileNotFoundError:
        print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
        return 1
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
        return 1
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    sys.exit(main())

FILE:scripts/velocity_analyzer.py
#!/usr/bin/env python3
"""
Sprint Velocity Analyzer
Analyzes sprint velocity data to calculate rolling averages, detect trends, forecast
capacity, and identify anomalies. Supports multiple statistical measures and
probabilistic forecasting for scrum teams.
Usage:
python velocity_analyzer.py sprint_data.json
python velocity_analyzer.py sprint_data.json --format json
"""
import argparse
import json
import math
import statistics
import sys
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple, Union
# ---------------------------------------------------------------------------
# Constants and Configuration
# ---------------------------------------------------------------------------
VELOCITY_THRESHOLDS: Dict[str, Dict[str, float]] = {
    "trend_detection": {
        "strong_improvement": 0.15,  # 15% improvement
        "improvement": 0.08,         # 8% improvement
        "stable": 0.05,              # ±5% stable range
        "decline": -0.08,            # 8% decline
        "strong_decline": -0.15,     # 15% decline
    },
    "volatility": {
        "low": 0.15,        # CV below 15%
        "moderate": 0.25,   # CV 15-25%
        "high": 0.40,       # CV 25-40%
        "very_high": 0.40,  # CV above 40%
    },
    "anomaly_detection": {
        "outlier_threshold": 2.0,  # Standard deviations from mean
        "extreme_outlier": 3.0,    # Extreme outlier threshold
    }
}

FORECASTING_CONFIG: Dict[str, Any] = {
    "confidence_levels": [0.50, 0.70, 0.85, 0.95],
    "monte_carlo_iterations": 10000,
    "min_sprints_for_forecast": 3,
    "max_sprints_lookback": 8,
}

# ---------------------------------------------------------------------------
# Data Structures and Types
# ---------------------------------------------------------------------------
class SprintData:
    """Represents a single sprint's velocity and metadata."""

    def __init__(self, data: Dict[str, Any]):
        self.sprint_number: int = data.get("sprint_number", 0)
        self.sprint_name: str = data.get("sprint_name", "")
        self.start_date: str = data.get("start_date", "")
        self.end_date: str = data.get("end_date", "")
        self.planned_points: int = data.get("planned_points", 0)
        self.completed_points: int = data.get("completed_points", 0)
        self.added_points: int = data.get("added_points", 0)
        self.removed_points: int = data.get("removed_points", 0)
        self.carry_over_points: int = data.get("carry_over_points", 0)
        self.team_capacity: float = data.get("team_capacity", 0.0)
        self.working_days: int = data.get("working_days", 10)

        # Calculate derived metrics
        self.velocity: int = self.completed_points
        self.commitment_ratio: float = (
            self.completed_points / max(self.planned_points, 1)
        )
        self.scope_change_ratio: float = (
            (self.added_points + self.removed_points) / max(self.planned_points, 1)
        )


class VelocityAnalysis:
    """Complete velocity analysis results."""

    def __init__(self):
        self.summary: Dict[str, Any] = {}
        self.trend_analysis: Dict[str, Any] = {}
        self.forecasting: Dict[str, Any] = {}
        self.anomalies: List[Dict[str, Any]] = []
        self.recommendations: List[str] = []

# ---------------------------------------------------------------------------
# Core Analysis Functions
# ---------------------------------------------------------------------------
def calculate_rolling_averages(
    sprints: List[SprintData],
    window_sizes: Optional[List[int]] = None,
) -> Dict[int, List[Optional[float]]]:
    """Calculate rolling averages for different window sizes."""
    # Avoid a mutable default argument; fall back to the standard windows.
    if window_sizes is None:
        window_sizes = [3, 5, 8]

    velocities = [sprint.velocity for sprint in sprints]
    rolling_averages: Dict[int, List[Optional[float]]] = {}

    for window_size in window_sizes:
        averages: List[Optional[float]] = []
        for i in range(len(velocities)):
            start_idx = max(0, i - window_size + 1)
            window = velocities[start_idx:i + 1]
            if len(window) >= min(3, window_size):  # Minimum data points
                averages.append(sum(window) / len(window))
            else:
                # Not enough history yet for this position
                averages.append(None)
        rolling_averages[window_size] = averages

    return rolling_averages

def detect_trend(sprints: List[SprintData], lookback_sprints: int = 6) -> Dict[str, Any]:
    """Detect velocity trends using linear regression and statistical analysis."""
    if len(sprints) < 3:
        return {"trend": "insufficient_data", "confidence": 0.0}

    # Use recent sprints for trend analysis
    recent_sprints = sprints[-lookback_sprints:] if len(sprints) > lookback_sprints else sprints
    velocities = [sprint.velocity for sprint in recent_sprints]

    # Calculate linear trend
    n = len(velocities)
    x_values = list(range(n))
    x_mean = sum(x_values) / n
    y_mean = sum(velocities) / n

    # Linear regression slope
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, velocities))
    denominator = sum((x - x_mean) ** 2 for x in x_values)
    if denominator == 0:
        slope = 0
    else:
        slope = numerator / denominator

    # Calculate correlation coefficient for trend strength
    # (statistics.correlation requires Python 3.10+)
    if n > 2:
        try:
            correlation = statistics.correlation(x_values, velocities)
        except statistics.StatisticsError:
            correlation = 0.0
    else:
        correlation = 0.0

    # Determine trend direction and strength
    avg_velocity = statistics.mean(velocities)
    relative_slope = slope / max(avg_velocity, 1)  # Normalize by average velocity

    # Classify symmetrically around zero. The previous comparison order sent
    # every slope at or below the "decline" threshold to "strong_decline" and
    # never consulted the "strong_decline" threshold.
    thresholds = VELOCITY_THRESHOLDS["trend_detection"]
    if relative_slope > thresholds["strong_improvement"]:
        trend = "strong_improvement"
    elif relative_slope > thresholds["improvement"]:
        trend = "improvement"
    elif relative_slope >= thresholds["decline"]:
        trend = "stable"
    elif relative_slope >= thresholds["strong_decline"]:
        trend = "decline"
    else:
        trend = "strong_decline"

    return {
        "trend": trend,
        "slope": slope,
        "relative_slope": relative_slope,
        "correlation": abs(correlation),
        "confidence": abs(correlation),
        "recent_sprints_analyzed": len(recent_sprints),
        "average_velocity": avg_velocity,
    }

def calculate_volatility(sprints: List[SprintData]) -> Dict[str, Any]:
    """Calculate velocity volatility and stability metrics."""
    if len(sprints) < 2:
        return {"volatility": "insufficient_data"}

    velocities = [sprint.velocity for sprint in sprints]
    mean_velocity = statistics.mean(velocities)
    if mean_velocity == 0:
        return {"volatility": "no_velocity"}

    # Coefficient of Variation (CV)
    std_dev = statistics.stdev(velocities) if len(velocities) > 1 else 0
    cv = std_dev / mean_velocity

    # Classify volatility
    thresholds = VELOCITY_THRESHOLDS["volatility"]
    if cv <= thresholds["low"]:
        volatility_level = "low"
    elif cv <= thresholds["moderate"]:
        volatility_level = "moderate"
    elif cv <= thresholds["high"]:
        volatility_level = "high"
    else:
        volatility_level = "very_high"

    # Calculate additional stability metrics
    velocity_range = max(velocities) - min(velocities)
    range_ratio = velocity_range / mean_velocity if mean_velocity > 0 else 0

    return {
        "volatility": volatility_level,
        "coefficient_of_variation": cv,
        "standard_deviation": std_dev,
        "mean_velocity": mean_velocity,
        "velocity_range": velocity_range,
        "range_ratio": range_ratio,
        "min_velocity": min(velocities),
        "max_velocity": max(velocities),
    }

def detect_anomalies(sprints: List[SprintData]) -> List[Dict[str, Any]]:
    """Detect velocity anomalies using statistical methods."""
    if len(sprints) < 3:
        return []

    velocities = [sprint.velocity for sprint in sprints]
    mean_velocity = statistics.mean(velocities)
    std_dev = statistics.stdev(velocities) if len(velocities) > 1 else 0

    anomalies = []
    threshold = VELOCITY_THRESHOLDS["anomaly_detection"]["outlier_threshold"]
    extreme_threshold = VELOCITY_THRESHOLDS["anomaly_detection"]["extreme_outlier"]

    for i, sprint in enumerate(sprints):
        if std_dev == 0:
            continue

        z_score = abs(sprint.velocity - mean_velocity) / std_dev
        if z_score >= extreme_threshold:
            anomaly_type = "extreme_outlier"
        elif z_score >= threshold:
            anomaly_type = "outlier"
        else:
            continue

        anomalies.append({
            "sprint_number": sprint.sprint_number,
            "sprint_name": sprint.sprint_name,
            "velocity": sprint.velocity,
            "expected_range": (mean_velocity - 2 * std_dev, mean_velocity + 2 * std_dev),
            "z_score": z_score,
            "anomaly_type": anomaly_type,
            "deviation_percentage": ((sprint.velocity - mean_velocity) / mean_velocity) * 100,
        })

    return anomalies

def monte_carlo_forecast(sprints: List[SprintData], sprints_ahead: int = 6) -> Dict[str, Any]:
    """Generate probabilistic velocity forecasts using Monte Carlo simulation."""
    if len(sprints) < FORECASTING_CONFIG["min_sprints_for_forecast"]:
        return {"error": "insufficient_historical_data"}

    # Use recent sprints for forecasting
    lookback = min(len(sprints), FORECASTING_CONFIG["max_sprints_lookback"])
    recent_sprints = sprints[-lookback:]
    velocities = [sprint.velocity for sprint in recent_sprints]
    if not velocities:
        return {"error": "no_velocity_data"}

    mean_velocity = statistics.mean(velocities)
    std_dev = statistics.stdev(velocities) if len(velocities) > 1 else 0

    # Monte Carlo simulation
    iterations = FORECASTING_CONFIG["monte_carlo_iterations"]
    confidence_levels = FORECASTING_CONFIG["confidence_levels"]
    simulated_totals = []

    for _ in range(iterations):
        total_points = 0
        for _ in range(sprints_ahead):
            # Sample from normal distribution, clamped at zero
            if std_dev > 0:
                simulated_velocity = max(0, random_normal(mean_velocity, std_dev))
            else:
                simulated_velocity = mean_velocity
            total_points += simulated_velocity
        simulated_totals.append(total_points)

    # Calculate percentiles for confidence intervals.
    # Each entry is a percentile of the simulated totals: roughly that share
    # of simulated outcomes falls at or below the reported value.
    simulated_totals.sort()
    forecasts = {}
    for confidence in confidence_levels:
        percentile_index = int(confidence * iterations)
        percentile_index = min(percentile_index, iterations - 1)
        forecasts[f"{int(confidence * 100)}%"] = simulated_totals[percentile_index]

    return {
        "sprints_ahead": sprints_ahead,
        "historical_sprints_used": lookback,
        "mean_velocity": mean_velocity,
        "velocity_std_dev": std_dev,
        "forecasted_totals": forecasts,
        "average_per_sprint": mean_velocity,
        "expected_total": mean_velocity * sprints_ahead,
    }

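def random_normal(mean: float, std_dev: float) -> float:
    """Generate a random number from a normal distribution using Box-Muller transform."""
    import random  # not imported at module level; math already is

    # Box-Muller transformation.
    # random.random() returns values in [0.0, 1.0); using 1.0 - x shifts the
    # range to (0.0, 1.0] so math.log never receives 0.
    u1 = 1.0 - random.random()
    u2 = random.random()
    z0 = math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)
    return mean + z0 * std_dev
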
def generate_recommendations(analysis: VelocityAnalysis) -> List[str]:
    """Generate actionable recommendations based on velocity analysis."""
    recommendations = []

    # Trend-based recommendations
    trend = analysis.trend_analysis.get("trend", "")
    if trend == "strong_decline":
        recommendations.append("URGENT: Address strong declining velocity trend. Review impediments, team capacity, and story complexity.")
    elif trend == "decline":
        recommendations.append("Monitor declining velocity. Consider impediment removal and capacity planning review.")
    elif trend == "strong_improvement":
        recommendations.append("Excellent improvement trend! Document successful practices to maintain momentum.")

    # Volatility-based recommendations
    volatility = analysis.summary.get("volatility", {}).get("volatility", "")
    if volatility == "very_high":
        recommendations.append("HIGH PRIORITY: Reduce velocity volatility. Review story sizing, definition of done, and sprint planning process.")
    elif volatility == "high":
        recommendations.append("Work on consistency. Review estimation practices and sprint commitment process.")
    elif volatility == "low":
        recommendations.append("Good velocity stability. Continue current practices.")

    # Anomaly-based recommendations
    if len(analysis.anomalies) > 0:
        extreme_anomalies = [a for a in analysis.anomalies if a["anomaly_type"] == "extreme_outlier"]
        if extreme_anomalies:
            recommendations.append(f"Investigate {len(extreme_anomalies)} extreme velocity anomalies for root causes.")

    # Commitment ratio recommendations
    # (commitment ratio = completed points / planned points)
    commitment_ratios = analysis.summary.get("commitment_analysis", {})
    avg_commitment = commitment_ratios.get("average_commitment_ratio", 1.0)
    if avg_commitment < 0.8:
        recommendations.append("Low sprint commitment achievement. Review capacity planning and story complexity estimation.")
    elif avg_commitment > 1.2:
        # A ratio above 1 means more was completed than planned.
        recommendations.append("Consistently completing more than planned. Consider raising sprint commitments to match actual capacity.")

    return recommendations

# ---------------------------------------------------------------------------
# Main Analysis Function
# ---------------------------------------------------------------------------
def analyze_velocity(data: Dict[str, Any]) -> VelocityAnalysis:
    """Perform comprehensive velocity analysis."""
    analysis = VelocityAnalysis()

    try:
        # Parse sprint data
        sprint_records = data.get("sprints", [])
        sprints = [SprintData(record) for record in sprint_records]
        if not sprints:
            raise ValueError("No sprint data found")

        # Sort by sprint number
        sprints.sort(key=lambda s: s.sprint_number)

        # Basic summary statistics
        velocities = [sprint.velocity for sprint in sprints]
        commitment_ratios = [sprint.commitment_ratio for sprint in sprints]
        scope_change_ratios = [sprint.scope_change_ratio for sprint in sprints]

        analysis.summary = {
            "total_sprints": len(sprints),
            "velocity_stats": {
                "mean": statistics.mean(velocities),
                "median": statistics.median(velocities),
                "min": min(velocities),
                "max": max(velocities),
                "total_points": sum(velocities),
            },
            "commitment_analysis": {
                "average_commitment_ratio": statistics.mean(commitment_ratios),
                "commitment_consistency": statistics.stdev(commitment_ratios) if len(commitment_ratios) > 1 else 0,
                "sprints_under_committed": sum(1 for r in commitment_ratios if r < 1.0),
                "sprints_over_committed": sum(1 for r in commitment_ratios if r > 1.0),
            },
            "scope_change_analysis": {
                "average_scope_change": statistics.mean(scope_change_ratios),
                "scope_change_volatility": statistics.stdev(scope_change_ratios) if len(scope_change_ratios) > 1 else 0,
            },
            "rolling_averages": calculate_rolling_averages(sprints),
            "volatility": calculate_volatility(sprints),
        }

        # Trend analysis
        analysis.trend_analysis = detect_trend(sprints)

        # Forecasting
        analysis.forecasting = monte_carlo_forecast(sprints, sprints_ahead=6)

        # Anomaly detection
        analysis.anomalies = detect_anomalies(sprints)

        # Generate recommendations
        analysis.recommendations = generate_recommendations(analysis)

    except Exception as e:
        analysis.summary = {"error": str(e)}

    return analysis

# ---------------------------------------------------------------------------
# Output Formatting
# ---------------------------------------------------------------------------
def format_text_output(analysis: VelocityAnalysis) -> str:
    """Format analysis results as readable text report."""
    lines = []
    lines.append("="*60)
    lines.append("SPRINT VELOCITY ANALYSIS REPORT")
    lines.append("="*60)
    lines.append("")

    if "error" in analysis.summary:
        lines.append(f"ERROR: {analysis.summary['error']}")
        return "\n".join(lines)

    # Summary section
    summary = analysis.summary
    lines.append("VELOCITY SUMMARY")
    lines.append("-"*30)
    lines.append(f"Total Sprints Analyzed: {summary['total_sprints']}")
    velocity_stats = summary.get("velocity_stats", {})
    lines.append(f"Average Velocity: {velocity_stats.get('mean', 0):.1f} points")
    lines.append(f"Median Velocity: {velocity_stats.get('median', 0):.1f} points")
    lines.append(f"Velocity Range: {velocity_stats.get('min', 0)} - {velocity_stats.get('max', 0)} points")
    lines.append(f"Total Points Completed: {velocity_stats.get('total_points', 0)}")
    lines.append("")

    # Volatility analysis
    volatility = summary.get("volatility", {})
    lines.append("VELOCITY STABILITY")
    lines.append("-"*30)
    lines.append(f"Volatility Level: {volatility.get('volatility', 'Unknown').replace('_', ' ').title()}")
    lines.append(f"Coefficient of Variation: {volatility.get('coefficient_of_variation', 0):.2%}")
    lines.append(f"Standard Deviation: {volatility.get('standard_deviation', 0):.1f} points")
    lines.append("")

    # Trend analysis
    trend_analysis = analysis.trend_analysis
    lines.append("TREND ANALYSIS")
    lines.append("-"*30)
    lines.append(f"Trend Direction: {trend_analysis.get('trend', 'Unknown').replace('_', ' ').title()}")
    lines.append(f"Trend Confidence: {trend_analysis.get('confidence', 0):.1%}")
    lines.append(f"Velocity Change Rate: {trend_analysis.get('relative_slope', 0):.1%} per sprint")
    lines.append("")

    # Forecasting
    forecasting = analysis.forecasting
    lines.append("CAPACITY FORECAST (Next 6 Sprints)")
    lines.append("-"*30)
    if "error" not in forecasting:
        lines.append(f"Expected Total: {forecasting.get('expected_total', 0):.0f} points")
        lines.append(f"Average Per Sprint: {forecasting.get('average_per_sprint', 0):.1f} points")
        forecasted_totals = forecasting.get("forecasted_totals", {})
        lines.append("Confidence Intervals:")
        for confidence, total in forecasted_totals.items():
            lines.append(f" {confidence}: {total:.0f} points")
    else:
        lines.append(f"Forecast unavailable: {forecasting.get('error', 'Unknown error')}")
    lines.append("")

    # Anomalies
    if analysis.anomalies:
        lines.append("VELOCITY ANOMALIES")
        lines.append("-"*30)
        for anomaly in analysis.anomalies:
            lines.append(f"Sprint {anomaly['sprint_number']} ({anomaly['sprint_name']})")
            lines.append(f" Velocity: {anomaly['velocity']} points")
            lines.append(f" Deviation: {anomaly['deviation_percentage']:.1f}%")
            lines.append(f" Type: {anomaly['anomaly_type'].replace('_', ' ').title()}")
            lines.append("")

    # Recommendations
    if analysis.recommendations:
        lines.append("RECOMMENDATIONS")
        lines.append("-"*30)
        for i, rec in enumerate(analysis.recommendations, 1):
            lines.append(f"{i}. {rec}")

    return "\n".join(lines)


def format_json_output(analysis: VelocityAnalysis) -> Dict[str, Any]:
    """Format analysis results as JSON."""
    return {
        "summary": analysis.summary,
        "trend_analysis": analysis.trend_analysis,
        "forecasting": analysis.forecasting,
        "anomalies": analysis.anomalies,
        "recommendations": analysis.recommendations,
    }

# ---------------------------------------------------------------------------
# CLI Interface
# ---------------------------------------------------------------------------
def main() -> int:
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(
        description="Analyze sprint velocity data with trend detection and forecasting"
    )
    parser.add_argument(
        "data_file",
        help="JSON file containing sprint data"
    )
    parser.add_argument(
        "--format",
        choices=["text", "json"],
        default="text",
        help="Output format (default: text)"
    )
    args = parser.parse_args()

    try:
        # Load and validate data
        with open(args.data_file, 'r') as f:
            data = json.load(f)

        # Perform analysis
        analysis = analyze_velocity(data)

        # Output results
        if args.format == "json":
            output = format_json_output(analysis)
            print(json.dumps(output, indent=2))
        else:
            output = format_text_output(analysis)
            print(output)

        return 0

    except FileNotFoundError:
        print(f"Error: File '{args.data_file}' not found", file=sys.stderr)
        return 1
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in '{args.data_file}': {e}", file=sys.stderr)
        return 1
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    sys.exit(main())
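
For reference, a minimal sketch of an input file that `velocity_analyzer.py` could consume. The field names mirror the keys that `SprintData` reads with `data.get(...)`. The sprint values are invented for illustration, and `derived_metrics` is a hypothetical helper (not part of the script) that only repeats the ratio arithmetic from `SprintData.__init__`.

```python
import json

# Hypothetical sample data: keys match what SprintData reads, values are invented.
sample = {
    "sprints": [
        {"sprint_number": 1, "sprint_name": "Sprint 1",
         "planned_points": 20, "completed_points": 18,
         "added_points": 2, "removed_points": 0},
        {"sprint_number": 2, "sprint_name": "Sprint 2",
         "planned_points": 20, "completed_points": 22,
         "added_points": 0, "removed_points": 1},
        {"sprint_number": 3, "sprint_name": "Sprint 3",
         "planned_points": 25, "completed_points": 20,
         "added_points": 3, "removed_points": 2},
    ]
}


def derived_metrics(record):
    """Repeat the ratio arithmetic from SprintData.__init__ for one record."""
    planned = max(record.get("planned_points", 0), 1)  # guard against division by zero
    completed = record.get("completed_points", 0)
    changed = record.get("added_points", 0) + record.get("removed_points", 0)
    return {
        "velocity": completed,
        "commitment_ratio": completed / planned,
        "scope_change_ratio": changed / planned,
    }


rows = [derived_metrics(r) for r in sample["sprints"]]

# Serialized form, ready to be saved as the JSON file passed on the command line.
payload = json.dumps(sample, indent=2)
```

Saving `payload` as `sprint_data.json` gives a file that can be passed to `python velocity_analyzer.py sprint_data.json`. Three sprints is the smallest history for which `monte_carlo_forecast` returns a forecast, per `FORECASTING_CONFIG["min_sprints_for_forecast"]`.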