@clawhub-quochungto-93dad49abd
---
name: domain-logic-pattern-selector
description: "Choose domain logic pattern for enterprise application subsystems: Transaction Script vs Domain Model vs Table Module, and decide Service Layer thickness. Use when organizing business logic, choosing between procedural service methods and rich domain models, selecting enterprise app architecture, routing domain logic to the right pattern, deciding anemic domain model vs rich domain model, applying Fowler's complexity-vs-effort curve, determining when to use Domain Model, when to use Transaction Script for simple CRUD, when Table Module fits .NET RecordSet environments, deciding Service Layer facade vs operation script vs controller-entity, avoiding transaction-script-sprawl, avoiding anemic-domain-model anti-pattern, preventing stored-procedure logic leakage, structuring enterprise app business logic, domain logic organization, choose domain pattern, enterprise app business logic design."
version: "1.0.0"
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/patterns-of-enterprise-application-architecture/skills/domain-logic-pattern-selector
metadata: {"openclaw":{"emoji":"🗂️","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: patterns-of-enterprise-application-architecture
title: "Patterns of Enterprise Application Architecture"
authors: ["Martin Fowler", "David Rice", "Matthew Foemmel", "Edward Hieatt", "Robert Mee", "Randy Stafford"]
chapters: [2, 9]
domain: software-architecture
tags:
- domain-logic
- enterprise-architecture
- software-architecture
- design-patterns
- business-logic
depends-on: []
execution:
tier: 2
mode: hybrid
inputs:
- type: description
description: "Subsystem description: what the module does, the nature of the business rules (simple CRUD, rule-heavy, algorithmic), team OO experience, language and stack (Java, C#, Python, .NET, JVM), any existing patterns in place, and whether multiple client types or transactional resources are involved."
- type: codebase
description: "Optional: existing domain and service code to diagnose current pattern and detect anti-patterns."
tools-required:
- Read
- Grep
- Write
tools-optional:
- Glob
mcps-required: []
environment: "Enterprise application codebase or architecture discussion. Works offline. Codebase helpful but not required — skill operates on description alone."
discovery:
goal: "Route a subsystem to Transaction Script, Domain Model, or Table Module, decide Service Layer thickness, and produce a domain-logic pattern decision record."
tasks:
- "Classify the subsystem's domain complexity (simple CRUD → rule-heavy → rich invariant-laden domain)"
- "Apply complexity × team × platform routing matrix to select primary pattern"
- "Decide whether a Service Layer is warranted and, if so, what thickness"
- "Check for anti-pattern risk (Transaction Script sprawl, anemic domain model, stored-procedure logic leakage)"
- "Produce a domain-logic pattern decision record with rationale, migration path, and implementation sketch"
audience:
roles:
- software-architect
- senior-backend-engineer
- tech-lead
experience: intermediate
when_to_use:
triggers:
- "Starting a new subsystem or module and need to decide how to organize domain logic"
- "Existing code shows Transaction Script sprawl (god-class service with duplicated conditional logic)"
- "Existing code has an anemic domain model (data-only classes with all behavior in services)"
- "Team debating whether to use Domain Model or keep procedural Transaction Scripts"
- "Evaluating whether a Service Layer is needed and how thick it should be"
- ".NET project with ADO.NET RecordSets and uncertainty about Table Module applicability"
- "Legacy codebase migrating from stored procedures and need a domain-logic pattern to migrate toward"
- "Complex business rules growing unmanageable in Transaction Scripts"
prerequisites: []
not_for:
- "Choosing data-source patterns (ORM, Active Record, Data Mapper) — invoke data-source-pattern-selector for those"
- "Choosing presentation-layer patterns (MVC, Front Controller, Template View)"
- "Concurrency or session-state decisions"
- "Systems with no domain logic at all (pure data passthrough APIs)"
environment:
codebase_required: false
codebase_helpful: true
works_offline: true
quality:
scores:
with_skill: null
baseline: null
delta: null
tested_at: null
eval_count: null
assertion_count: 13
iterations_needed: null
---
# Domain Logic Pattern Selector
Choose between Transaction Script, Domain Model, and Table Module for a subsystem, then decide how thick to make the Service Layer. Based on Fowler's Patterns of Enterprise Application Architecture (Ch 2 + Ch 9).
## When to Use
Use this skill when starting or refactoring a subsystem and the central question is: _how should business logic be organized?_ This is distinct from _where_ that logic persists (data-source patterns) or _how_ it is presented.
Typical triggers: greenfield module design, Transaction Script code growing unmanageable, a team debate about "should we use DDD?", evaluating whether to add a service layer, diagnosing anemic domain model or script sprawl in legacy code.
Prerequisites: none. Works from a verbal description of the subsystem; codebase access improves diagnosis.
## Context & Input Gathering
Ask the user (or read from codebase) in this order:
**Required:**
1. **What does the subsystem do?** One paragraph — the nouns (entities), the operations (use cases), and the outputs.
2. **How complex is the business logic?** Pick the best match:
- Simple: mostly CRUD, a few validations, no conditional algorithms
- Moderate: validations + multi-step calculations + some conditional rules
- Complex: many rules that interact, vary by product/customer/time, change frequently
3. **What is the team's OO experience?** Low (procedural background), moderate (OO-familiar but not DDD-trained), high (experienced with rich domain models, strategy/composite patterns).
4. **What is the language and stack?** Java/JVM, C#/.NET, Python, TypeScript, Ruby, etc. If .NET, is ADO.NET / RecordSet usage widespread?
**Observable from codebase (if provided):**
- Service classes with many long methods containing `if/else` chains → Transaction Script
- Classes with fields but no methods (getters/setters only) → anemic domain model signal
- Behavior scattered across utility/helper classes → Transaction Script sprawl
- SQL or stored-procedure calls embedded directly in business-logic code → stored-procedure leakage
- DataSet/DataTable types flowing through business logic (.NET) → Table Module context
**Defaults if not specified:**
- Team OO experience: moderate
- Stack: JVM (Java/Kotlin) unless .NET signals are present
**Sufficiency test:** Proceed once complexity level and team experience are known. Stack is the tie-breaker for moderate complexity.
## Process
### Step 1 — Classify Domain Complexity
WHY: Fowler's core insight is that the cost curves for the three patterns cross at different complexity points. With the wrong pattern for the complexity level, the effort needed to add each new rule climbs steeply as the logic grows.
Map the subsystem to one of three complexity bands:
| Band | Signals | Cost Curve Behavior |
|------|---------|---------------------|
| **Low** | Pure CRUD, few validations, minimal branching | All three patterns are cheap. Transaction Script wins on simplicity. |
| **Moderate** | Multi-step calculations, conditional rules, some shared logic | Transaction Script starts showing duplication. Domain Model pays off if team is OO-comfortable. Table Module is viable if RecordSet tooling is strong. |
| **High** | Rules vary by product/customer/time, complex invariants, frequent changes, algorithmic variation | Transaction Script effort climbs steeply with each added rule. Domain Model is the only one of the three patterns that absorbs this growth gracefully. |
If the codebase is available: look for service methods longer than ~50 lines with nested conditionals (grep for dense `if`/`else` runs, then check the length of the enclosing method). That is a high-complexity signal even if the requirement description sounds simple.
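The long-method check above can be automated. The following is a minimal sketch, assuming the codebase is Python and using the standard-library `ast` module; the function names and the thresholds (50 lines, two levels of `if` nesting) are illustrative choices, not part of Fowler's guidance. Other stacks would need an equivalent parser or linter rule.

```python
import ast


def max_if_depth(node: ast.AST, depth: int = 0) -> int:
    """Deepest nesting of `if` statements below `node` (each `elif` counts as one more level)."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, ast.If) else depth
        deepest = max(deepest, max_if_depth(child, child_depth))
    return deepest


def flag_complex_functions(source: str, max_lines: int = 50, min_depth: int = 2):
    """Return (name, line_count, if_depth) for long functions with nested conditionals."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            line_count = node.end_lineno - node.lineno + 1
            depth = max_if_depth(node)
            if line_count > max_lines and depth >= min_depth:
                flagged.append((node.name, line_count, depth))
    return flagged
```

Running `flag_complex_functions(open(path).read())` over each service module gives a shortlist to review by hand; a flagged method is a prompt to look closer, not proof of high complexity.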
### Step 2 — Apply the Routing Matrix
WHY: Complexity alone does not determine the best pattern. Team OO experience shifts the cost curve for Domain Model (familiar teams pay the ramp-up cost once; unfamiliar teams pay it every sprint). Platform shifts the appeal of Table Module (RecordSet tooling makes it a natural fit in .NET; without that tooling it is pointless friction).
```
ROUTING MATRIX
─────────────────────────────────────────────────────────────────────
Complexity Team OO Exp Platform → Recommended Pattern
─────────────────────────────────────────────────────────────────────
Low any any → Transaction Script
Moderate low any → Transaction Script (with extraction discipline)
Moderate moderate/high .NET + RecordSet → Table Module
Moderate moderate/high Java/Python/TS → Domain Model (simple variant)
High low any → Domain Model + coaching (or Table Module in .NET as interim)
High moderate/high any → Domain Model
─────────────────────────────────────────────────────────────────────
```
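The matrix can be read as a small lookup function. This is a sketch only: the function name, the string labels, and the `dotnet_recordset` flag are assumptions made for illustration, and .NET without RecordSet tooling is treated like any other platform because the matrix has no separate row for it.

```python
def route_pattern(complexity: str, oo_experience: str, dotnet_recordset: bool = False) -> str:
    """Return the routing-matrix recommendation for one subsystem."""
    levels = {"low", "moderate", "high"}
    if complexity not in levels:
        raise ValueError(f"unknown complexity band: {complexity!r}")
    if oo_experience not in levels:
        raise ValueError(f"unknown OO experience level: {oo_experience!r}")

    if complexity == "low":
        return "Transaction Script"
    if complexity == "moderate":
        if oo_experience == "low":
            return "Transaction Script (with extraction discipline)"
        return "Table Module" if dotnet_recordset else "Domain Model (simple variant)"
    # complexity == "high"
    if oo_experience == "low":
        interim = " (Table Module as interim)" if dotnet_recordset else ""
        return "Domain Model + coaching" + interim
    return "Domain Model"
```

For example, `route_pattern("moderate", "high", dotnet_recordset=True)` yields `"Table Module"`, matching the third row of the matrix.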
**Platform defaults (Fowler's explicit guidance):**
- **.NET with ADO.NET / DataSet**: Table Module is the natural fit. Fowler says "I don't see a reason to use Transaction Script in a .NET environment" once RecordSet tooling is present.
- **Java / JVM**: Domain Model is the natural target for moderate-to-high complexity. POJOs (plain old Java objects) with Data Mapper are the recommended implementation; avoid entity beans for rich domain models.
- **Python / Ruby / TypeScript**: Domain Model with Active Record is viable for simple-to-moderate domains; Data Mapper for rich ones.
**Refactoring direction rule:** Refactoring up from Transaction Script to Domain Model is a well-worn path. Going the other direction (Domain Model → Transaction Script) is rarely worthwhile unless you can also simplify the data-source layer. Start simple; refactor up when complexity demands.
**Mixed patterns are allowed.** Transaction Script for some use cases, Domain Model for the core rules — Fowler explicitly calls this common.
### Step 3 — Decide Service Layer
WHY: A Service Layer defines the application boundary: where transaction control and security live, and the coarse-grained API that client layers (UI, APIs, messaging consumers) call into. Adding one without need adds a layer of overhead; omitting it when multiple client types exist causes cross-cutting logic to scatter.
**Add a Service Layer when ANY of these are true:**
- More than one kind of client (UI + REST API, UI + message consumer, UI + batch job)
- A use case response must be transacted atomically across multiple resources (database + message queue + email)
- Need a distinct, stable API boundary (versioning, remote access, access control)
**Skip the Service Layer when:**
- Single client type (a UI), simple use cases, single transactional resource
- Page Controllers can control transactions directly by delegating to the domain or data-source layer
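The add/skip criteria above reduce to a three-way "any" check. The sketch below assumes the caller has already listed the client kinds and transactional resources; the function and parameter names are hypothetical.

```python
from __future__ import annotations


def needs_service_layer(
    client_kinds: set[str],
    transactional_resources: set[str],
    needs_stable_api: bool = False,
) -> bool:
    """True when any of the three Service Layer triggers applies."""
    multiple_clients = len(client_kinds) > 1
    multiple_resources = len(transactional_resources) > 1
    return multiple_clients or multiple_resources or needs_stable_api
```

A single web UI over one database returns `False`; adding a REST API client or a message queue to the same use case flips the answer to `True`.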
**Thickness — three variants:**
| Variant | Description | When to Use | Fowler's Verdict |
|---------|-------------|------------|-----------------|
| **Domain Facade** (thin) | Service layer methods are one-liner delegations to the Domain Model. No business logic here. | When Domain Model is rich and self-contained. | Preferred by Fowler when using Domain Model. |
| **Operation Script** (thick) | Service methods contain application logic (notifications, workflow coordination, cross-resource transactions), delegating domain logic to domain objects. | When application logic (workflow, integration, notifications) must be coordinated. | Fowler's pick for applications with application-logic responsibilities. |
| **Controller-Entity** (mid) | Use-case-specific logic in service/controller scripts; shared logic on entities. | Common but risky: tends to produce duplication across use-case scripts. | Fowler warns against this as a starting point; useful only when refactoring from Transaction Script. |
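To make the thin and thick variants concrete, here is a minimal sketch of a Domain Facade next to an Operation Script over the same entity. The `Contract` class, its one-line stand-in rule, and the `notify` callback are illustrative assumptions rather than code from the book.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    revenue_cents: int
    recognized_cents: int = 0

    def calculate_recognitions(self) -> None:
        # Stand-in for a real domain rule; the point is that it lives on the entity.
        self.recognized_cents = self.revenue_cents


class ContractFacade:
    """Domain Facade: each method is a straight delegation, with no logic of its own."""

    def __init__(self, contracts: dict):
        self._contracts = contracts

    def calculate_recognitions(self, contract_id: int) -> None:
        self._contracts[contract_id].calculate_recognitions()


class RecognitionService:
    """Operation Script: coordinates application logic around the domain call."""

    def __init__(self, contracts: dict, notify):
        self._contracts = contracts
        self._notify = notify  # e.g. an email or queue sender (hypothetical)

    def calculate_recognitions(self, contract_id: int) -> None:
        self._contracts[contract_id].calculate_recognitions()  # domain logic stays on the entity
        self._notify(f"contract {contract_id}: revenue recognitions calculated")
```

In both variants the rule itself stays on `Contract`; the only difference is whether the service adds coordination work such as notifications around the call.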
**Avoid stored procedures for domain logic.** Reserve stored procedures for batch jobs and extreme performance requirements only. Business rules placed in stored procedures are much harder to test, version, and refactor than rules that live in the domain layer.
### Step 4 — Check Anti-Pattern Risk
WHY: Pattern choices create pressure toward specific failure modes. Flagging the risk up front lets the team establish guards before the anti-pattern takes hold.
| Chosen Pattern | Primary Anti-Pattern Risk | Early Warning Signs | Guard |
|---------------|--------------------------|--------------------|----|
| Transaction Script | **Transaction Script Sprawl**: logic duplicated across scripts; one god-class service | Methods >50 lines; copy-pasted validation blocks; service class >500 lines | Extract shared logic into domain objects; set line-count budget |
| Domain Model | **Anemic Domain Model**: classes with only getters/setters; all behavior in service layer | Domain class has zero methods beyond accessors; service layer orchestrates every operation | Move behavior onto entities; services handle only application logic |
| Domain Model | **Bloated Domain Model**: domain objects absorbing use-case-specific UI behavior | Domain class imports UI or HTTP types | Move use-case-specific logic to service layer or presentation; keep domain pure |
| Table Module | **Stored-Procedure Logic Leakage**: rules drift into the database | Business conditionals in SQL; procedures called from Table Module | Keep calculation logic in Table Module C# / VB class, not in SQL |
**Anemic Domain Model** is Fowler's most prominent warning: a system that uses Domain Model classes but has stripped behavior out into service classes reduces the pattern to an expensive object-relational mapping exercise with none of the encapsulation benefit. If your "domain model" has no methods beyond getters and setters, it is a data transfer layer, not a domain model.
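The difference is easiest to see side by side. In this sketch the invoice classes and the payment rule are hypothetical; the anemic version keeps the rule in a free-standing service function, while the rich version makes the entity enforce it.

```python
from dataclasses import dataclass


@dataclass
class AnemicInvoice:
    """Data only: nothing stops a caller from writing paid_cents directly."""
    total_cents: int
    paid_cents: int = 0


def pay_invoice(invoice: AnemicInvoice, amount_cents: int) -> None:
    # The rule lives outside the class, so every other caller must remember to repeat it.
    if amount_cents <= 0 or invoice.paid_cents + amount_cents > invoice.total_cents:
        raise ValueError("invalid payment")
    invoice.paid_cents += amount_cents


class Invoice:
    """Rich entity: the overpayment rule travels with the data it protects."""

    def __init__(self, total_cents: int):
        self._total_cents = total_cents
        self._paid_cents = 0

    @property
    def outstanding_cents(self) -> int:
        return self._total_cents - self._paid_cents

    def pay(self, amount_cents: int) -> None:
        if amount_cents <= 0 or amount_cents > self.outstanding_cents:
            raise ValueError("invalid payment")
        self._paid_cents += amount_cents
```

With `AnemicInvoice`, a second service that sets `paid_cents` directly silently bypasses the check; with `Invoice`, the only way to record a payment is through `pay`.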
### Step 5 — Produce the Decision Record
WHY: A written record anchors the decision, gives future maintainers the context to maintain pattern consistency, and becomes a checklist for code review.
Produce a **Domain-Logic Pattern Decision Record** containing:
```
## Domain-Logic Pattern Decision Record: [Subsystem Name]
### Classification
- Complexity band: [low / moderate / high]
- Team OO experience: [low / moderate / high]
- Stack / platform: [Java / .NET / Python / ...]
### Primary Pattern: [Transaction Script | Domain Model | Table Module]
**Rationale:** [1-2 sentences applying the routing matrix]
### Service Layer: [None | Domain Facade | Operation Script | Controller-Entity]
**Rationale:** [Why needed / not needed; thickness reasoning]
### Anti-Pattern Watch
- Risk: [name] Warning sign: [observable symptom]
### Migration Path (if refactoring)
[Starting point → recommended direction → end state]
### Implementation Sketch
[5-10 lines of pseudocode or language-idiomatic code showing the structural skeleton]
### Pair With Data-Source Pattern
[Brief note on which data-source pattern pairs with the chosen domain pattern]
```
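If the record is generated by a script rather than written by hand, a small data class can keep the fields consistent. This is a sketch under stated assumptions: the `DecisionRecord` class and its field names are hypothetical, and it renders only the classification, pattern, Service Layer, and anti-pattern sections of the template.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    subsystem: str
    complexity: str
    oo_experience: str
    stack: str
    pattern: str
    pattern_rationale: str
    service_layer: str
    service_layer_rationale: str
    risk: str
    warning_sign: str

    def to_markdown(self) -> str:
        """Render the first four sections of the decision record template."""
        return "\n".join([
            f"## Domain-Logic Pattern Decision Record: {self.subsystem}",
            "### Classification",
            f"- Complexity band: {self.complexity}",
            f"- Team OO experience: {self.oo_experience}",
            f"- Stack / platform: {self.stack}",
            f"### Primary Pattern: {self.pattern}",
            f"**Rationale:** {self.pattern_rationale}",
            f"### Service Layer: {self.service_layer}",
            f"**Rationale:** {self.service_layer_rationale}",
            "### Anti-Pattern Watch",
            f"- Risk: {self.risk} Warning sign: {self.warning_sign}",
        ])
```

The migration path, implementation sketch, and data-source pairing sections are free text and are left to be appended by hand.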
## Inputs
| Input | Required | Description |
|-------|----------|-------------|
| Subsystem description | Yes | What it does, the entities, the use cases |
| Complexity level | Yes | Low / moderate / high (or description to classify) |
| Team OO experience | Yes | Low / moderate / high |
| Language / stack | Yes (for tie-breaking) | JVM, .NET, Python, etc. |
| Existing codebase | No | Helps diagnose current pattern and anti-patterns |
| Multiple client types? | No | Defaults to single client (no Service Layer) |
## Outputs
- **Domain-Logic Pattern Decision Record** (markdown) — primary output
- **Routing rationale** — explicit complexity × team × platform reasoning
- **Anti-pattern risk flags** — with early warning signs and guards
- **Implementation skeleton** — language-idiomatic structural sketch
- **Service Layer decision** — thickness recommendation with rationale
- **Data-source pairing note** — which data-source pattern to combine with the chosen domain pattern
## Key Principles
1. **Complexity determines the crossover point, not personal preference.** Fowler's complexity-vs-effort graph shows that Domain Model costs more upfront but its effort rises slowly as complexity grows, while Transaction Script effort rises steeply. The crossover is real but cannot be measured precisely, so use experienced judgment on domain complexity signals.
2. **Team experience shifts the curve but does not override complexity.** A team unfamiliar with OO domain models raises the break-even point (Domain Model's upfront cost is higher for them, so it pays off only at greater complexity), but high-complexity domains still demand Domain Model eventually. The alternative is unbounded Transaction Script sprawl.
3. **Platform preference is Fowler's explicit tie-breaker.** .NET RecordSet tooling makes Table Module the natural mid-complexity choice. Without that tooling, Table Module adds no value over a simple Domain Model with Active Record.
4. **Refactor up, rarely down.** If you start with Transaction Script and complexity grows: refactor toward Domain Model. If you start with Domain Model and want simplicity: going back to Transaction Script is rarely worth it unless you simultaneously simplify the data-source layer.
5. **The anemic domain model is a pattern failure, not a design choice.** Stripping behavior from domain classes into services defeats the purpose of Domain Model. If your domain classes have no methods, you are paying ORM complexity without the encapsulation benefit.
6. **Service Layer thickness should match application-logic responsibility.** Thin facade when the Domain Model carries all logic. Operation script when use cases require coordinating notifications, messaging, and multi-resource transactions. Avoid controller-entity as an architectural style — it tends to replicate Transaction Script duplication inside the service layer.
7. **Mixed patterns are legitimate.** Simple subsystems within a Domain Model application can use Transaction Script. The patterns are not mutually exclusive — the key is conscious, documented choice.
## Examples
### Scenario A — Revenue Recognition Engine (Complex Rules → Domain Model)
**Trigger:** A team is building a SaaS billing system. Revenue recognition rules vary by product type: word processors recognize 100% immediately; spreadsheets and databases each recognize revenue in thirds across three dates, on different schedules; and the CFO adds new product categories quarterly.
**Process:**
- Classify: High complexity — rules vary by product, change frequently, involve calculations and date logic.
- Route: Domain Model (complexity high → Domain Model regardless of team experience).
- Service Layer: Operation Script — use case requires updating the database and notifying contract administrators atomically.
- Anti-pattern check: Risk of anemic domain model if the recognition strategy is placed in the service layer. Guard: `RevenueRecognition` objects should carry the `isRecognizableBy(date)` method; strategy subclasses should carry `calculateRevenueRecognitions()`. The service layer only coordinates the transaction and sends notifications.
**Output:** Domain Model with `Contract`, `Product`, `RevenueRecognition`, and a strategy hierarchy (`CompleteRecognitionStrategy`, `ThreeWayRecognitionStrategy`). `RecognitionService` is an Operation Script service layer that delegates domain logic to domain objects and handles email/integration notifications.
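A minimal Python sketch of that structure follows. The class and method names mirror the ones in the output above, translated to Python naming; the integer-cents representation, the `allocate` helper, and the day offsets passed to the three-way strategy are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, timedelta


def allocate(amount_cents: int, parts: int) -> list[int]:
    """Split an amount into `parts` shares that sum exactly to the original."""
    base, remainder = divmod(amount_cents, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]


@dataclass(frozen=True)
class RevenueRecognition:
    amount_cents: int
    recognized_on: date

    def is_recognizable_by(self, as_of: date) -> bool:
        return self.recognized_on <= as_of


class CompleteRecognitionStrategy:
    def calculate_revenue_recognitions(self, contract: Contract) -> None:
        contract.add_recognition(RevenueRecognition(contract.revenue_cents, contract.signed_on))


class ThreeWayRecognitionStrategy:
    def __init__(self, first_offset_days: int, second_offset_days: int):
        self._offsets = (0, first_offset_days, second_offset_days)

    def calculate_revenue_recognitions(self, contract: Contract) -> None:
        for share, offset in zip(allocate(contract.revenue_cents, 3), self._offsets):
            when = contract.signed_on + timedelta(days=offset)
            contract.add_recognition(RevenueRecognition(share, when))


@dataclass
class Product:
    name: str
    strategy: object  # any object with calculate_revenue_recognitions(contract)


@dataclass
class Contract:
    product: Product
    revenue_cents: int
    signed_on: date
    recognitions: list = field(default_factory=list)

    def add_recognition(self, recognition: RevenueRecognition) -> None:
        self.recognitions.append(recognition)

    def calculate_recognitions(self) -> None:
        self.product.strategy.calculate_revenue_recognitions(self)

    def recognized_revenue(self, as_of: date) -> int:
        return sum(r.amount_cents for r in self.recognitions if r.is_recognizable_by(as_of))
```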
---
### Scenario B — Simple Order Management (.NET CRUD → Table Module)
**Trigger:** A .NET team building internal order management. Rules are mostly: validate fields, calculate totals, apply standard discounts. All UI is data-grid based. ADO.NET DataSets flow through every layer.
**Process:**
- Classify: Low-to-moderate complexity. No complex invariants, no algorithmic variation.
- Route: Table Module. .NET RecordSet tooling is present; the calculation logic fits naturally in table-oriented classes.
- Service Layer: None needed — single client (a Windows Forms UI), single transactional resource.
- Anti-pattern check: Watch for stored-procedure logic drift. Keep discount calculation in the `Order` Table Module, not in a SQL stored procedure.
**Output:** An `Order` Table Module, plus one Table Module per related table, operating on a shared `DataSet`. No domain object identity: every operation takes a primary key parameter.
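The shape of a Table Module can be sketched in Python as well, with plain lists of dicts standing in for the ADO.NET `DataSet`. The `OrderModule` class, the column names, and the percentage discount rule are all hypothetical; a real .NET implementation would work against `DataTable` rows instead.

```python
class OrderModule:
    """Table Module: one instance serves every row of the orders table."""

    def __init__(self, order_rows: list, line_rows: list):
        # Index orders by primary key; there is no per-order object.
        self._orders = {row["id"]: row for row in order_rows}
        self._lines = line_rows

    def subtotal_cents(self, order_id: int) -> int:
        return sum(
            line["quantity"] * line["unit_price_cents"]
            for line in self._lines
            if line["order_id"] == order_id
        )

    def total_cents(self, order_id: int) -> int:
        # The discount rule lives here, not in a stored procedure.
        subtotal = self.subtotal_cents(order_id)
        discount_percent = self._orders[order_id]["discount_percent"]
        return subtotal - subtotal * discount_percent // 100
```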
---
### Scenario C — Legacy Transaction Scripts Hitting Complexity Wall → Refactor to Domain Model
**Trigger:** A Java team's billing service has grown to a 2,000-line `BillingService` class. Discount logic is copy-pasted across five methods. Adding a new rule requires touching seven places.
**Process:**
- Classify: High complexity. The sprawl shows that the logic has outgrown what Transaction Script can carry.
- Route: Refactor toward Domain Model. Direction: up, not down.
- Service Layer: Keep `BillingService` as an Operation Script service layer; move business rules onto `Contract`, `LineItem`, and `Discount` domain objects.
- Migration path: Extract duplicate validation into domain object methods first; introduce `Data Mapper` to decouple persistence; migrate rule-specific conditional branches into strategy classes.
- Anti-pattern check: Do not introduce an anemic domain model by creating new classes that only hold data while the rules stay in `BillingService`. Behavior must move with the data.
**Output:** Decision record documenting the refactor direction, a domain object skeleton (`Contract`, `LineItem`, `Discount`), `BillingService` reduced to coordinating transactions and notifications, and a Data Mapper recommendation for persistence.
## References
- `references/complexity-vs-effort-graph.md` — Fowler's unquantified-but-instructive chart described textually with decision thresholds
- `references/revenue-recognition-four-ways.md` — the three-table schema (products / contracts / revenueRecognitions) implemented as Transaction Script (Java), Domain Model (Java), Table Module (C#), and Service Layer (Java)
- `references/service-layer-thickness-guide.md` — detailed comparison of domain facade, operation script, and controller-entity with code sketches
If unclear which data-source pattern to pair with the chosen domain pattern → invoke `data-source-pattern-selector`. If unclear whether a distribution boundary (Remote Facade, DTO) is needed → invoke `distribution-boundary-designer`. If unsure which overall layering approach fits the application → invoke `enterprise-architecture-pattern-stack-selector`.
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Patterns of Enterprise Application Architecture by Martin Fowler et al. (Addison-Wesley, 2002).
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
---
name: distribution-boundary-designer
description: "Distribution design for enterprise systems: decide whether to distribute, where to draw the service boundary, and how to implement it with Remote Facade and Data Transfer Object (DTO). Use when deciding microservices vs monolith, evaluating process boundaries, extracting services, designing remote APIs, choosing coarse-grained API shape, preventing distribution-by-class anti-pattern, applying Fowler's First Law of Distributed Object Design, designing service extraction strategy, determining when distribution is warranted vs cargo-culting microservices, implementing Remote Facade pattern, designing DTOs independent from domain objects, choosing between gRPC vs REST vs message queue vs GraphQL for service boundary, monolith decomposition, service boundary design, remote API design, distribution strategy, when to distribute, process boundary decision, coarse-grained interface design."
version: "1.0.0"
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/patterns-of-enterprise-application-architecture/skills/distribution-boundary-designer
metadata: {"openclaw":{"emoji":"🔀","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: patterns-of-enterprise-application-architecture
title: "Patterns of Enterprise Application Architecture"
authors: ["Martin Fowler", "David Rice", "Matthew Foemmel", "Edward Hieatt", "Robert Mee", "Randy Stafford"]
chapters: [7, 15]
domain: software-architecture
tags:
- distributed-systems
- microservices
- api-design
- software-architecture
- design-patterns
- remote-facade
- service-boundaries
- enterprise-architecture
depends-on: []
execution:
tier: 2
mode: hybrid
inputs:
- type: description
description: "System topology description: current architecture (monolith / modular monolith / cluster / microservices), motivation for distribution, team structure, deployment independence needs, scaling needs, and any security zone or vendor integration requirements."
- type: codebase
description: "Optional: existing codebase or architecture diagrams to identify existing remote boundaries, chatty interfaces, or domain logic in remote shells."
tools-required:
- Read
- Write
tools-optional:
- Grep
- Glob
mcps-required: []
environment: "Architecture design or refactor session. Works offline. Codebase helpful for detecting existing anti-patterns but not required — skill operates on description alone."
discovery:
goal: "Decide whether to distribute a system (or parts of it), identify legitimate boundaries if distribution is warranted, and design Remote Facade + DTO contracts at each boundary."
tasks:
- "Apply the First Law filter: check each legitimate distribution reason; recommend against distributing if none apply"
- "Identify coarse-grained boundary candidates (per subsystem, not per class)"
- "Design Remote Facade methods as use-case-shaped coarse calls (not CRUD wrappers)"
- "Design DTOs per interaction — independent from domain objects, shaped for client needs"
- "Audit for anti-patterns: distribution-by-class, domain logic in facade, chatty interfaces, DTO-domain coupling"
- "Select interface style (gRPC / REST / message queue / GraphQL) based on coupling tolerance and async needs"
- "Produce a distribution design record"
audience:
roles:
- software-architect
- senior-backend-engineer
- tech-lead
- staff-engineer
experience: intermediate
when_to_use:
triggers:
- "Team considering a microservices migration or service extraction"
- "Existing system with distribution-by-class (one remote component per domain class)"
- "Remote API is chatty — many fine-grained calls per user interaction"
- "Domain logic has leaked into remote service shells or API controllers"
- "Planning a new system with multiple client types (browser, mobile, desktop) or security zones"
- "Evaluating whether to extract a scaling hot-spot (payments, search, notifications) as a separate service"
- "Designing DTOs or gRPC protos for a service boundary"
- "Reviewing whether a microservices architecture is justified for the team size and scaling needs"
prerequisites: []
not_for:
- "Choosing domain-logic pattern (Transaction Script vs Domain Model) — invoke domain-logic-pattern-selector"
- "Selecting data-source patterns (ORM, Active Record, Data Mapper) — invoke data-source-pattern-selector"
- "Concurrency control within a service — invoke offline-concurrency-strategy-selector"
- "Systems already committed to a specific distribution topology with no flexibility"
environment:
codebase_required: false
codebase_helpful: true
works_offline: true
quality:
scores:
with_skill: null
baseline: null
delta: null
tested_at: null
eval_count: null
assertion_count: 13
iterations_needed: null
---
# Distribution Boundary Designer
Guides you through the decision of whether to distribute a system across processes or machines, where to draw the boundary if distribution is needed, and how to implement the boundary using the Remote Facade and Data Transfer Object patterns. Grounded in Fowler's First Law of Distributed Object Design and the pattern pair that makes distribution workable when it cannot be avoided.
## When to Use
Use this skill when your team is considering breaking a monolith into services, extracting a subsystem as a remote service, reviewing whether a microservices architecture is appropriate, or designing the interface at an existing remote boundary. Also use it when an existing remote API shows signs of the distribution-by-class anti-pattern (too many fine-grained calls per operation), or when domain logic has leaked into remote shells.
Prerequisites: a description of the system's purpose, team structure, deployment needs, and any known scaling or security zone requirements. A codebase or architecture diagram is helpful for detecting existing boundary problems.
## Context & Input Gathering
Gather these before proceeding:
**Required:**
- What does the system do? What are its major subsystems?
- What is the stated motivation for distribution? ("scalability," "team independence," "vendor integration," "security zones," etc.)
- Are there currently separate deployment units? If yes, which ones and why?
**Helpful:**
- How many teams? Do different teams need to deploy independently?
- Are there subsystems with dramatically different scaling needs (e.g., search scales 100x vs catalog)?
- Are there security zone requirements (DMZ vs internal network)?
- Are there vendor package integrations that run in their own process?
- Is the codebase available? Grep for remote stubs, service clients, or proto/schema files to detect existing boundary shape.
**Defaults if not provided:**
- Assume a single team if not stated — leans toward not distributing
- Assume uniform scaling if not stated — leans toward not distributing
- Assume no security zone requirement if not stated
**Sufficiency check:** If the only stated motivation is "we want microservices," "clean architecture," or "team size," the filter step will almost certainly return "do not distribute." Be explicit about this.
## Process
### Step 1 — Apply the First Law Filter
WHY: The single most common architecture mistake is distributing a system that does not need it. Fowler names this explicitly: "Don't distribute your objects." Distribution incurs latency, partial failure, marshaling cost, and interface rigidity — costs that are paid on every call, forever. Every process boundary is a tax.
Check each of the seven legitimate distribution reasons in the table below. The system should distribute ONLY if at least one applies:
| Reason | Check |
|--------|-------|
| Different client machines (desktop vs server, browser vs app server) | Does the client run on a different physical machine? |
| Application server ↔ database process boundary | Standard; SQL is designed for this. |
| Web server and app server must be separate processes | Vendor constraint, security zone, or scaling forces? |
| Independent scaling requirements | Does a hot subsystem need to scale at a dramatically different rate? |
| External vendor / purchased package software | Does a package run in its own process with a coarse-grained interface? |
| Security zone boundary (DMZ, internal network) | Is a firewall or network zone required between subsystems? |
| Different hardware or OS requirements | Does a subsystem require specialized hardware or a different OS? |
IF none apply → **Recommend modular monolith**: keep the application in one process, divide it into packages/modules with clear interfaces. Name this explicitly. Do not proceed to boundary design.
IF one or more apply → Identify only the subsystem pairs that cross a legitimate boundary. Every other subsystem pair should remain in-process.
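The filter amounts to intersecting the stated motivations with the list of legitimate reasons. In this sketch the reason keys and the function name are hypothetical labels for the seven table rows.

```python
from __future__ import annotations

# One key per row of the table above (labels are illustrative).
LEGITIMATE_REASONS = frozenset({
    "different-client-machines",
    "app-server-to-database",
    "web-server-app-server-split",
    "independent-scaling",
    "vendor-package",
    "security-zone",
    "different-hardware-or-os",
})


def first_law_filter(stated_reasons: set[str]) -> tuple[str, list[str]]:
    """Return the recommendation and the legitimate reasons that justify it."""
    applicable = sorted(stated_reasons & LEGITIMATE_REASONS)
    if not applicable:
        return ("modular monolith", [])
    return ("distribute only at the justified boundaries", applicable)
```

A motivation such as "we want microservices" matches no key, so `first_law_filter({"we-want-microservices"})` returns the modular monolith recommendation.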
### Step 2 — Identify Coarse-Grained Boundaries (per subsystem, not per class)
WHY: The distribution-by-class anti-pattern (one remote component per domain class) multiplies call counts and destroys performance. A remote boundary should span a subsystem — a coherent cluster of domain classes that collaborates internally and exposes a small number of use-case-shaped operations externally.
For each legitimate boundary identified in Step 1:
- Name the subsystem on each side (e.g., "order processing" and "payment processor")
- Identify the operations that cross the boundary — framed as user-intent operations ("place order," "authorize payment"), not as CRUD methods on individual entities
- Estimate call frequency: how many times is this boundary crossed per user interaction? If more than ~5, the boundary may be too fine-grained.
Signal that a boundary is too fine-grained: "GetCustomer() then GetOrders() then GetLineItems() then GetProducts()..." — four separate calls to display a single order screen. This should be one call: `GetOrderSummary(orderId)`.
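
To make the call-frequency estimate concrete, the sketch below counts boundary crossings for a chatty screen and for a single coarse-grained call. Everything here is hypothetical: `CountingChannel` merely stands in for a network hop, and the in-memory dictionaries stand in for server-side data.

```python
class CountingChannel:
    """Stand-in for a process boundary: every invoke() is one remote round trip."""

    def __init__(self):
        self.calls = 0

    def invoke(self, func, *args):
        self.calls += 1
        return func(*args)


# Hypothetical server-side data.
ORDERS = {7: {"customer_id": 1, "lines": [(10, 2), (11, 1)]}}  # (product_id, qty)
CUSTOMERS = {1: "Ada Lovelace"}
PRODUCTS = {10: "Widget", 11: "Gadget"}


def get_order(order_id):
    return ORDERS[order_id]


def get_customer(customer_id):
    return CUSTOMERS[customer_id]


def get_product(product_id):
    return PRODUCTS[product_id]


def get_order_summary(order_id):
    """Coarse-grained operation: assembles the whole screen on the server side."""
    order = ORDERS[order_id]
    return {
        "customer": CUSTOMERS[order["customer_id"]],
        "lines": [(PRODUCTS[pid], qty) for pid, qty in order["lines"]],
    }


def chatty_screen(channel, order_id):
    """Fine-grained client: one crossing per entity it touches."""
    order = channel.invoke(get_order, order_id)
    customer = channel.invoke(get_customer, order["customer_id"])
    lines = [(channel.invoke(get_product, pid), qty) for pid, qty in order["lines"]]
    return {"customer": customer, "lines": lines}


def coarse_screen(channel, order_id):
    """Coarse-grained client: one crossing for the whole screen."""
    return channel.invoke(get_order_summary, order_id)
```

Both functions return the same data, but the chatty version's crossing count grows with the number of line items while the coarse version always crosses once.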
### Step 3 — Design Remote Facade Methods
WHY: A Remote Facade is a thin coordination shell — its only job is to translate coarse-grained remote calls into sequences of fine-grained domain object calls. All domain logic lives in the fine-grained domain objects, not in the facade. A facade with domain logic becomes a second domain layer that is harder to test, harder to evolve, and violates the principle that the application should be runnable entirely in-process without the remote shells.
For each boundary, design the Remote Facade:
1. **Methods are use-case-shaped**: `GetOrderSummary(orderId)`, `SubmitPaymentAuthorization(authDTO)`, not `GetOrder()`, `GetCustomer()`, `UpdateOrderStatus()`
2. **Each method does one useful unit of work**: A single remote call should accomplish a meaningful operation, not just fetch one property
3. **Facade has no conditional domain logic**: No `if (order.status == SHIPPED) applyLateCharge()` in the facade. That belongs in domain objects.
4. **Facade can have multiple methods corresponding to different client use cases**: Different screens may call different facade methods that ultimately touch the same domain objects — this is correct
5. **Facade granularity**: Prefer fewer, larger facades over many small ones. A moderate application might have one; a large application, half a dozen.
6. **Security and transactions are appropriate in the facade**: The facade is a natural boundary for access control and transaction demarcation (start transaction → call domain objects → commit). This is not domain logic.
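
A minimal sketch of rule 3, assuming a hypothetical `Order` domain class and an invented 5% late-charge rule: the condition lives in the domain object, and the facade method only looks up, delegates, and returns a DTO.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    order_id: int
    status: str
    total: Decimal
    days_overdue: int = 0

    def apply_late_charge(self) -> None:
        """Domain rule (hypothetical): 5% charge on shipped, overdue orders."""
        if self.status == "SHIPPED" and self.days_overdue > 0:
            self.total += (self.total * Decimal("0.05")).quantize(Decimal("0.01"))


@dataclass(frozen=True)
class OrderSummaryDTO:
    order_id: int
    status: str
    total: str  # decimal rendered as a string for the wire


class OrderRepository:
    def __init__(self, orders):
        self._orders = {o.order_id: o for o in orders}

    def find(self, order_id):
        return self._orders[order_id]


class OrderFacade:
    """Thin shell: no business conditions, only lookup, delegation, and DTO assembly."""

    def __init__(self, repo):
        self._repo = repo

    def apply_late_charges(self, order_id: int) -> OrderSummaryDTO:
        order = self._repo.find(order_id)
        order.apply_late_charge()  # the domain object decides whether a charge applies
        return OrderSummaryDTO(order.order_id, order.status, str(order.total))
```

Because the rule sits on `Order`, it can be exercised without the facade at all, which is the property the WHY paragraph above asks for.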
### Step 4 — Design DTOs Per Use Case
WHY: A DTO is a data carrier designed for the wire, not for the domain. Auto-deriving DTOs from domain classes couples the wire format to the domain model — any domain refactoring forces a wire format change, which may require client updates. DTOs designed around use cases are stable because use cases change less often than domain internals.
For each Remote Facade method, design the corresponding DTO(s):
1. **Shape the DTO around what the client interaction needs** — not around the domain class structure. Collapse related objects: embed artist name in AlbumDTO rather than passing a separate ArtistDTO with a separate call.
2. **Fields are primitives, strings, dates, or nested DTOs** — no domain object references. The DTO must be serializable independently of the domain model.
3. **Use a separate Assembler** to move data between domain objects and DTOs. The assembler keeps the domain model independent of wire format.
4. **Consider separate request and response DTOs** if the shapes diverge significantly. Use a single DTO if they are similar.
5. **Err on the side of sending too much data** rather than requiring a second call: "it's better to err on the side of sending too much data than have to make multiple calls."
6. **Modern naming**: Fowler's DTO = what the J2EE community called "Value Object." Today: gRPC proto messages, JSON response schemas, GraphQL types, OpenAPI schemas, Avro records.
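
The album example from item 1 and the Assembler from item 3 might look like the following. This is a sketch under stated assumptions: the class names and fields are illustrative, and `update_domain` deliberately leaves the artist untouched because resolving an artist by name would need a lookup that is out of scope here.

```python
from dataclasses import dataclass, field


# Domain objects (hypothetical): a small object graph.
@dataclass
class Artist:
    name: str


@dataclass
class Track:
    title: str


@dataclass
class Album:
    title: str
    artist: Artist
    tracks: list = field(default_factory=list)


# DTOs: primitives and nested DTOs only, no domain references.
@dataclass
class TrackDTO:
    title: str


@dataclass
class AlbumDTO:
    title: str
    artist_name: str  # collapsed from Artist, so no second call is needed
    tracks: list = field(default_factory=list)


class AlbumAssembler:
    """Keeps Album and AlbumDTO ignorant of each other."""

    @staticmethod
    def to_dto(album: Album) -> AlbumDTO:
        return AlbumDTO(
            title=album.title,
            artist_name=album.artist.name,
            tracks=[TrackDTO(t.title) for t in album.tracks],
        )

    @staticmethod
    def update_domain(album: Album, dto: AlbumDTO) -> None:
        # Artist changes are omitted: they would require a lookup by name.
        album.title = dto.title
        album.tracks = [Track(t.title) for t in dto.tracks]
```

Neither `Album` nor `AlbumDTO` imports the other, so the DTO module could be shipped to a client without the domain classes.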
### Step 5 — Audit for Anti-Patterns
WHY: The most common mistakes at distribution boundaries are predictable and detectable in advance.
Check each boundary against these anti-patterns:
| Anti-Pattern | Signal | Remedy |
|--------------|--------|--------|
| Distribution by class | One remote component per domain class; dozens of remote calls per user interaction | Consolidate to subsystem-level Remote Facades with use-case methods |
| Domain logic in Remote Facade | Conditional logic, business rules, or workflow coordination in the facade | Move logic to domain objects or a non-remote Service Layer; reduce facade to thin delegation |
| Chatty fine-grained remote interface | N calls per screen/operation where N > ~3; individual property getters exposed remotely | Redesign as coarse-grained bulk accessor / command method |
| DTO-domain coupling | DTO fields map 1:1 to domain class fields; DTO imports domain classes | Introduce Assembler; redesign DTOs around use-case needs |
| Distribution without legitimate reason | Motivation is trend, architecture taste, or "clean code" | Recommend modular monolith; document the test that should be passed before re-evaluating |
### Step 6 — Select Interface Style
WHY: The interface style determines coupling tolerance, latency profile, and async capability. Fowler's 2002 guidance was RPC vs XML/HTTP; the modern decision space is richer.
| Style | Best for | Trade-offs |
|-------|----------|------------|
| gRPC (binary, proto) | Internal service-to-service, high throughput, same-platform | Tight schema coupling; not browser-native |
| REST / JSON | External APIs, browser clients, cross-platform, public APIs | Looser coupling; HTTP overhead; no streaming by default |
| Message queue (Kafka, SQS, RabbitMQ) | Fire-and-forget, event-driven, high latency tolerance, decoupled producers/consumers | Async only; harder to test; eventual consistency |
| GraphQL | Multi-client (mobile + web) with varying field needs; avoiding over-fetch | Query complexity; N+1 risk server-side; schema governance overhead |
Fowler's principle still holds: use the simplest mechanism that works. If both sides share a platform, use a direct binary RPC mechanism such as gRPC. Use REST/JSON when interoperability across platforms or external access matters. Use message queues when decoupling and async throughput matter more than latency.
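
The selection guidance can be condensed into a rough decision function. This is only a heuristic sketch: the parameter names and the ordering of the checks are assumptions made here, and a real decision would weigh the trade-offs in the table rather than three booleans.

```python
def select_interface_style(*, needs_immediate_reply: bool,
                           external_or_browser_client: bool,
                           clients_need_varying_fields: bool = False) -> str:
    """Return one of the four styles from the table above."""
    if not needs_immediate_reply:
        return "message queue"  # fire-and-forget, latency-tolerant
    if clients_need_varying_fields:
        return "GraphQL"        # multiple clients with different field subsets
    if external_or_browser_client:
        return "REST/JSON"      # cross-platform or public access
    return "gRPC"               # internal, both sides controlled
```

For instance, an internal payment authorization call that needs a reply maps to "gRPC", while a browser-facing search endpoint maps to "REST/JSON".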
### Step 7 — Produce Distribution Design Record
WHY: The decision must be documented so future architects understand WHY the boundary exists and what test should be passed before adding more. Distribution decisions are expensive to reverse.
Produce a distribution design record containing:
- **Current topology**: monolith / modular monolith / partial services
- **Legitimate distribution reasons identified** (Step 1) and which subsystem pair each applies to
- **Non-distribution alternatives considered** (modular monolith, package separation)
- **Proposed boundaries**: subsystem pair, direction of calls, call frequency estimate
- **Remote Facade spec**: for each facade — class name, method signatures, WHY each method is shaped this way
- **DTO spec**: for each DTO — fields, which domain objects it aggregates, assembler location
- **Interface style selected** with rationale
- **Anti-pattern audit result**: pass / fail per anti-pattern, with remediation notes if any failed
## Inputs
- System description (purpose, subsystems, team structure, deployment needs)
- Stated motivation for distribution
- Optional: codebase, architecture diagrams, existing proto/schema files
- Optional: scaling metrics or capacity constraints
## Outputs
**Distribution Design Record** (`distribution-design-record.md`) containing:
```
# Distribution Design Record: [System Name]
## Summary Decision
[Distribute / Do Not Distribute — one sentence]
## First Law Filter Results
| Reason | Applies? | Notes |
|--------|----------|-------|
| ... | Yes/No | ... |
## Recommended Topology
[Modular monolith / specific service boundaries]
## Boundaries (if distributing)
### Boundary: [Subsystem A] ↔ [Subsystem B]
- Legitimate reason: [reason from filter]
- Interface style: [gRPC / REST / queue / GraphQL]
- Remote Facade: [FacadeClass]
  - [method(params) → ReturnDTO]: [WHY this use-case shape]
- DTOs:
  - [DTOName]: fields=[...], aggregates=[domain objects], assembled by=[AssemblerClass]
## Anti-Pattern Audit
| Anti-Pattern | Result | Notes |
|--------------|--------|-------|
| Distribution by class | Pass/Fail | ... |
| Domain logic in facade | Pass/Fail | ... |
| Chatty interface | Pass/Fail | ... |
| DTO-domain coupling | Pass/Fail | ... |
| Distribution without reason | Pass/Fail | ... |
## Non-Distribution Alternatives Considered
[What modular monolith structure was evaluated and why distribution was or was not preferred]
```
## Key Principles
**1. The First Law is a default, not a suggestion.**
Distribute only when you have a concrete, operational reason from Fowler's list. "We want microservices" is not a reason. "The payment processor must be PCI-compliant in a separate security zone" is a reason. The default is always in-process.
WHY: Remote calls are orders of magnitude slower than in-process calls. Every process boundary is a permanent tax on every call that crosses it. The cumulative cost of unjustified distribution dwarfs the cost of an in-process module boundary.
**2. Boundaries are per subsystem, not per class.**
A Remote Facade corresponds to a subsystem (a cluster of collaborating domain objects), not to an individual class. If every domain class has a remote interface, the boundary design is broken.
WHY: Each fine-grained call to a remote class pays the full inter-process latency cost. A subsystem-level facade batches those calls into one network round-trip per use case.
**3. Facades contain no domain logic — none.**
A Remote Facade is a translation layer: coarse-grained interface → fine-grained domain object calls. Any conditional logic, business rules, or workflow coordination in the facade must be moved to domain objects or a non-remote Service Layer.
WHY: A facade with domain logic becomes a hidden second domain layer. The application can no longer be tested or run in-process without the remote shell, because the logic lives there. This creates a testing and evolution trap.
**4. DTOs are designed for the interaction, not the domain.**
Each DTO is shaped around a specific client use case — what data that client needs to display or send in a single interaction. DTOs are not auto-generated mirrors of domain classes.
WHY: Domain classes change frequently as business rules evolve. DTOs change when the client interaction changes. Coupling these two change rates together means every domain refactoring potentially breaks clients across the wire.
**5. The modular monolith is always on the table.**
When a team says "we need microservices for separation of concerns," the correct counter-offer is: separate your packages/modules/bounded contexts within a single process. You get team ownership, clear interfaces, and independent evolution without network cost.
WHY: The costs of distribution (latency, partial failure, operational complexity, debugging difficulty) are real and ongoing. The benefits of a modular monolith (same testability, same team ownership, same interface discipline) are available without those costs.
**6. Send more data per call rather than making more calls.**
When designing DTOs and facade methods, err toward aggregating more data in one call rather than planning a second call. Over-fetching slightly is far cheaper than a second round-trip.
WHY: Remote call latency is fixed overhead per call. Bandwidth is cheap. If a client might need the order's line items after getting the order header, include the line items in the initial DTO.
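
A back-of-the-envelope model shows why over-fetching wins. The figures are assumptions picked for illustration (a 20 ms round trip and roughly 100 Mbit/s of bandwidth, taken as 12.5 KB per millisecond), not measurements, and `transfer_ms` is a hypothetical helper.

```python
def transfer_ms(calls: int, payload_kb_per_call: float,
                round_trip_ms: float = 20.0,
                bandwidth_kb_per_ms: float = 12.5) -> float:
    """Total time = per-call latency plus time to move the payload, summed over calls."""
    per_call = round_trip_ms + payload_kb_per_call / bandwidth_kb_per_ms
    return calls * per_call


# Two calls of 4 KB each (order header, then line items).
two_small_calls = transfer_ms(calls=2, payload_kb_per_call=4)

# One call of 10 KB that over-fetches by 2 KB.
one_larger_call = transfer_ms(calls=1, payload_kb_per_call=10)
```

Under these assumptions the two small calls cost about 40.6 ms and the single larger call about 20.8 ms: the extra 2 KB adds a fraction of a millisecond, while the second round trip adds a full 20 ms.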
## Examples
### Example 1 — Reject Distribution for a Two-Team LOB App
**Scenario:** A 12-person engineering team across two squads is building an internal CRM. Engineering lead proposes microservices "for clean boundaries between the sales squad (contacts, deals) and the support squad (tickets, SLAs)."
**Trigger:** "Should we use microservices for our two-squad CRM?"
**Process:**
1. Apply First Law Filter:
   - Client-server divide: Yes (browser client + server). But this is the standard web-app boundary, not an argument for internal microservices.
   - Independent scaling: No. Contacts/deals and tickets/SLAs have similar load profiles.
   - Different security zones: No.
   - Vendor package: No.
   - Independent deployment: "Nice to have" but not operationally required, so it counts as a non-reason rather than one of the seven.
   - Result: No legitimate reason for internal service distribution.
2. Recommend modular monolith: define a `contacts-deals` package and a `tickets-slas` package with explicit module boundaries, shared domain types, and clear internal APIs. Each squad owns its package.
3. Note: the existing web-app boundary (browser ↔ app server) is the natural Remote Facade. A single `CRMService` facade handles screen-level operations for both squads.
4. Anti-pattern audit: distribution-by-class risk = not applicable (not distributing). Domain logic in facade = not applicable.
**Output:** Distribution Design Record stating "Do not distribute internally." Modular monolith with squad-owned packages. One Remote Facade for the browser-server boundary.
---
### Example 2 — Extract Payments as a Distributed Boundary
**Scenario:** E-commerce platform. Payments team requires PCI-DSS compliance in an isolated network zone. The search subsystem handles 50x the load of the product catalog at peak.
**Trigger:** "We need to separate payments for PCI compliance and search for scaling. How do we design those boundaries?"
**Process:**
1. Apply First Law Filter:
   - Security zone (PCI): Yes. Payments must live in an isolated environment. Legitimate.
   - Independent scaling (search): Yes. A 50x load differential is a real scaling reason.
   - All other subsystems: No legitimate reason.
2. Two boundaries identified: Catalog ↔ Payments, Catalog ↔ Search. Everything else (orders, inventory, user) stays in the monolith.
3. Design Payments Remote Facade (`PaymentService`):
   - `AuthorizePayment(orderId, paymentMethodDTO) → PaymentResultDTO`
   - `CapturePayment(authorizationId) → CaptureResultDTO`
   - `RefundPayment(paymentId, amount) → RefundResultDTO`
   - NOT: `GetCard()`, `UpdateCard()`, `GetAuthorization()`, which are fine-grained domain calls
4. Design DTOs: `PaymentMethodDTO` (cardToken, billingAddress, amount), `PaymentResultDTO` (authorizationId, status, errorCode). Assembled from domain card/billing objects. Not coupled to domain class structure.
5. Design Search Remote Facade (`SearchService`):
   - `SearchProducts(query, filters, pagination) → ProductSearchResultDTO`
   - `SuggestProducts(partialQuery) → SuggestionDTO[]`
6. Interface style: Payments = gRPC (internal, same platform, high reliability needed). Search = REST/JSON (browser clients query search directly; must be cross-platform).
7. Anti-pattern audit: no distribution-by-class, facades are thin, DTOs are use-case-shaped.
**Output:** Distribution Design Record with two boundaries (PCI, scaling), facade specs, DTO specs, interface styles.
---
### Example 3 — Fix Legacy Chatty Distributed Objects
**Scenario:** Legacy J2EE system from 2005. Every entity bean (Customer, Order, OrderLine, Product, Address) is a separate remote EJB. Displaying an order summary screen makes 15-20 remote calls. Performance is unacceptable.
**Trigger:** "Our EJB app is painfully slow. Each screen makes dozens of remote calls. How do we fix this?"
**Process:**
1. First Law Filter: Application server is separate from DB (legitimate). The internal EJB-to-EJB calls are the problem: distribution by class applied to the domain model.
2. Diagnosis: Classic distribution-by-class anti-pattern. The EJBs mirror the domain class structure. Every property getter is a remote call.
3. Remediation:
   - Keep domain objects as plain in-process Java objects (POJOs): remove remote interfaces from Customer, Order, OrderLine, Product, and Address.
   - Introduce a single `OrderService` Remote Facade with use-case methods: `GetOrderSummary(orderId) → OrderSummaryDTO`, `UpdateOrderStatus(orderId, statusDTO)`.
   - `OrderSummaryDTO` aggregates order header + customer name + line items + product names in one call.
   - Assembler populates the DTO from the fine-grained domain objects.
4. Interface style: Keep the Remote Facade as an EJB session bean (same platform), or migrate to gRPC as a modern replacement.
5. Expected result: 15-20 calls → 1-2 calls per screen interaction.
**Output:** Refactoring plan with identified chatty calls, new facade spec, DTO design, migration sequence.
## References
- [`references/distribution-reasons-checklist.md`](references/distribution-reasons-checklist.md) — Fowler's seven legitimate distribution reasons as a quick-reference filter
- [`references/remote-facade-design-guide.md`](references/remote-facade-design-guide.md) — detailed guidance on Remote Facade granularity, method design, session facade vs remote facade distinction
- [`references/dto-design-guide.md`](references/dto-design-guide.md) — DTO design patterns, assembler pattern, serialization format trade-offs, modern parallels (proto, JSON schema, GraphQL types)
- [`references/interface-style-selector.md`](references/interface-style-selector.md) — gRPC vs REST vs message queue vs GraphQL selection criteria
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Patterns of Enterprise Application Architecture by Martin Fowler, David Rice, Matthew Foemmel, Edward Hieatt, Robert Mee, Randy Stafford.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-enterprise-architecture-pattern-stack-selector`
- `clawhub install bookforge-domain-logic-pattern-selector`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
FILE:references/distribution-reasons-checklist.md
# Distribution Reasons Checklist
Source: Patterns of Enterprise Application Architecture, Ch 7 (Distribution Strategies) — Fowler
## The First Law
**Don't distribute your objects.**
Any inter-process call is orders of magnitude more expensive than an in-process call — even when both processes are on the same machine. This cost is paid on every call, forever.
## Legitimate Reasons to Distribute
Pass this checklist before committing to a process boundary. Distribute only if one or more applies:
| # | Reason | Question to Ask |
|---|--------|-----------------|
| 1 | **Client-server machine divide** | Do clients run on physically different machines from the server? (e.g., desktop PCs vs shared server, browser vs app server) |
| 2 | **Application server ↔ database** | Is the database in a separate process? (Almost always yes. SQL is designed as a remote interface; this cost is unavoidable but minimizable.) |
| 3 | **Web server and app server separation** | Is a separate web server process required by vendor constraints, security policy, or independent scaling? |
| 4 | **Independent scaling requirements** | Does a hot subsystem need to scale at a dramatically different rate than the rest? (e.g., 50x load differential) |
| 5 | **External vendor / package software** | Does a purchased package run in its own process with a coarse-grained interface? |
| 6 | **Security zone boundary** | Is a firewall or network zone required between subsystems? (e.g., PCI-DSS, DMZ, internal vs external) |
| 7 | **Different hardware or OS requirements** | Does a subsystem require specialized hardware, a different OS, or a different language runtime that cannot coexist in one process? |
## Non-Reasons (Do Not Distribute For These)
| Non-Reason | Why It Is Not Sufficient | Correct Alternative |
|------------|--------------------------|---------------------|
| "We want microservices" (trend) | Distribution costs are real; trend is not a force | Modular monolith with clear package boundaries |
| "Clean architecture / separation of concerns" | SoC is achievable within a process via packages, modules, or bounded contexts | Package-level separation, not process separation |
| "Conway's Law / team size" | Teams can own packages within a monolith; process boundaries are not required for team autonomy | Package ownership per team |
| "Polyglot language preference" | Distribution to use a different language has enormous operational cost | Choose a shared language or isolate via a truly necessary boundary |
| "Independent deployment" (without operational need) | If deploys are cheap (CI/CD), the benefit rarely outweighs the cost | Feature flags, versioned packages, trunk-based development |
## Decision Rule
If NONE of the seven legitimate reasons apply → **Modular monolith**. Structure the application with clear package/module interfaces; defer distribution until a legitimate operational reason emerges.
If one or more apply → Distribute ONLY the subsystem pairs that cross that specific boundary. Every other pair stays in-process.
FILE:references/dto-design-guide.md
# DTO Design Guide
Source: Patterns of Enterprise Application Architecture, Ch 15 (Data Transfer Object) — Fowler
## Pattern Intent
A Data Transfer Object carries data between processes in a single method call, reducing call count. It is a serializable data holder with no domain behavior.
## Core Rules
1. **No behavior** — only data fields and serialization logic
2. **Serializable independently** of the domain model
3. **Fields are primitives, strings, dates, or nested DTOs** — not domain object references
4. **Designed around client interaction needs**, not domain class structure
5. **Separated from domain objects by an Assembler** — neither the DTO nor the domain object depends on the other
## Assembler Pattern
The Assembler is a Mapper that lives on the server side:
- Populates DTOs from domain objects (for responses)
- Updates domain objects from DTOs (for commands/updates)
The Assembler is separate because:
- DTOs may need to be deployed on both sides of the wire without domain classes
- Domain model must not depend on wire format (wire format changes more often than domain invariants)
- Auto-generating DTOs is easy; auto-generating assemblers is hard — keeping them separate allows manual control of the translation
## DTO Structural Guidelines
**Collapse related objects:** If the client always needs the artist name when viewing an album, embed `artistName: string` in `AlbumDTO` rather than requiring a separate `ArtistDTO` call.
**Hierarchical, not graph:** DTOs should form a tree (album → tracks → performers), not a graph with back-references. Domain models often have complex bidirectional references; DTOs flatten or simplify these.
**Aggregation scope:** A single DTO typically carries data from multiple domain objects. If a screen needs order + customer + line items + delivery info, that's one DTO with nested sub-DTOs.
**Err toward over-sending:** Include related data the client might need in the near future. The cost of a second call exceeds the cost of modest over-fetching.
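
As an illustration of the tree shape and aggregation scope described above, the sketch below nests line items inside one screen-level DTO and round-trips it through JSON using only the standard library. The DTO names and fields are hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class LineItemDTO:
    product_name: str
    quantity: int


@dataclass
class OrderScreenDTO:
    order_id: int
    customer_name: str   # collapsed from the customer domain object
    delivery_date: str   # ISO 8601 string, so no date type crosses the wire
    line_items: list = field(default_factory=list)  # children only, no back-references


def to_json(dto: OrderScreenDTO) -> str:
    # asdict() recurses into nested dataclasses, which works because the DTO is a tree.
    return json.dumps(asdict(dto), sort_keys=True)


def from_json(text: str) -> OrderScreenDTO:
    raw = json.loads(text)
    items = [LineItemDTO(**item) for item in raw.pop("line_items")]
    return OrderScreenDTO(line_items=items, **raw)
```

A graph with back-references (a line item pointing back at its order) could not be serialized this simply, which is one practical reason to keep DTOs hierarchical.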
## Request vs Response DTOs
- If request and response shapes are similar → single DTO (mutable, populated differently)
- If they diverge significantly → separate `OrderRequestDTO` and `OrderResponseDTO`
- Fowler prefers mutable DTOs (easier to populate incrementally)
## Serialization Format Trade-offs
| Format | Pros | Cons | Use when |
|--------|------|------|----------|
| Binary (Java serialization, .NET binary) | Fast, compact | Fragile — any schema change breaks old clients | Both sides fully controlled and versioned together |
| JSON | Human-readable, widely supported, tolerant of adding optional fields | Larger payload, slower parsing | REST APIs, browser clients, cross-platform |
| XML / SOAP | Maximum interoperability, tooling support | Verbose, slower | Legacy integration, SOAP contracts |
| gRPC / protobuf | Fast, strongly typed, schema-first | Requires proto compiler, not browser-native | Internal service-to-service, high throughput |
| Dictionary/Map (binary) | Tolerant of field additions — old clients skip unknown keys | Loses strong typing, no explicit interface | When tolerance matters more than type safety |
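
The tolerance described in the Dictionary/Map row can be sketched with a dictionary serialized in binary form. The example uses Python's `pickle` purely as a stand-in binary format, which is only appropriate between trusted processes; the field names and the `carrier` addition are hypothetical.

```python
import pickle


def server_v2_payload() -> bytes:
    """A newer server adds a 'carrier' key that older clients do not know about."""
    return pickle.dumps({"order_id": 7, "status": "SHIPPED", "carrier": "ACME"})


def client_v1_read(payload: bytes) -> tuple:
    """An older client reads only the keys it knows; the extra key is simply skipped."""
    data = pickle.loads(payload)  # trusted peer assumed
    return (data["order_id"], data["status"])


def client_v2_read(payload: bytes) -> tuple:
    """A newer client tolerates older servers by defaulting the missing key."""
    data = pickle.loads(payload)
    return (data["order_id"], data["status"], data.get("carrier", "unknown"))
```

The cost named in the table is visible too: nothing checks that `order_id` exists or has the right type until a client actually reads it.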
## Modern Parallels
| Fowler DTO concept | Modern equivalent |
|--------------------|-------------------|
| DTO class with getters/setters | Java record, Kotlin data class, Python dataclass, TypeScript interface |
| XML serialization | JSON (Jackson, Gson, System.Text.Json) |
| Binary serialization | gRPC proto message, Avro record, Thrift struct |
| Assembler | MapStruct (Java), AutoMapper (.NET), manual mapper function |
| DTO designed per screen | GraphQL fragment, OpenAPI response schema, REST response body |
## DTO vs Value Object (Naming Clarification)
- **Fowler's DTO**: a data carrier for the wire — no behavior, serializable, designed per use case
- **Fowler's Value Object**: a small immutable object whose equality is based on its values rather than on identity (e.g., Money, DateRange, Address); NOT for wire transfer
- **J2EE community "Value Object"**: what Fowler calls DTO — this naming collision caused significant confusion
## Common Anti-Patterns
| Anti-Pattern | Description | Remedy |
|--------------|-------------|--------|
| DTO mirrors domain class 1:1 | Auto-generated DTOs for every entity; one DTO per domain class | Design DTOs around use-case interactions, collapse related data |
| Domain object in DTO field | DTO references a domain class directly | Replace with primitive fields or nested DTO |
| Behavior in DTO | Business logic or validation in DTO methods | Move to domain objects or validators; DTO = data only |
| Over-split DTOs | Separate DTO per property group, requiring multiple calls to populate a screen | Consolidate into one use-case-shaped DTO |
FILE:references/interface-style-selector.md
# Interface Style Selector
Source: Patterns of Enterprise Application Architecture, Ch 7 (Interfaces for Distribution) — Fowler, extended with modern options
## Fowler's Original Guidance (2002)
- **RPC-style** (CORBA, RMI): Both sides same platform → use native binary mechanism. Avoid XML overhead.
- **XML over HTTP** (SOAP): Cross-platform communication, firewall traversal, public APIs. More interoperable, more overhead.
- Principle: "Use XML Web services only when a more direct approach isn't possible."
The principle holds today. Substitute gRPC for RMI/CORBA and REST/JSON for SOAP/XML.
## Modern Decision Matrix
| Style | Coupling | Latency | Async? | Browser-native? | Best for |
|-------|----------|---------|--------|-----------------|----------|
| **gRPC** (binary, proto) | Tight (schema-first) | Low | Streaming (not fire-and-forget) | No (needs grpc-web proxy) | Internal service-to-service, high throughput, same-platform |
| **REST / JSON** | Loose (URL + JSON) | Medium | No (HTTP/2 streams possible) | Yes | External APIs, browser clients, public/partner APIs, cross-platform |
| **Message queue** (Kafka, SQS, RabbitMQ) | Very loose (event schema) | High tolerance | Yes — async only | No | Fire-and-forget, event-driven, decoupled producers/consumers, high-volume ingest |
| **GraphQL** | Medium (schema-typed) | Medium | Subscriptions | Yes | Multi-client (mobile + web) with varying field needs, avoiding over-fetch, developer experience |
## Selection Heuristics
**Use gRPC when:**
- Both sides are controlled by your team and share the same platform or can consume proto-generated clients
- Low latency and high throughput are required (e.g., internal order processing, payment authorization)
- You want schema-first API contracts with strong typing
**Use REST/JSON when:**
- External clients (browsers, mobile apps, third parties) consume the API
- Cross-platform interoperability matters
- Developer experience / debuggability > raw performance
- The API is public or exposed to partners
**Use message queue when:**
- The call is fire-and-forget (no immediate response needed)
- The producer should not be blocked by consumer availability or speed
- High latency tolerance (seconds, not milliseconds)
- Event-driven architecture or data pipeline
- Examples: order placed → fulfillment service picks up from queue; audit log events; notification dispatch
**Use GraphQL when:**
- Multiple client types with different field subsets (mobile shows 3 fields; desktop shows 20)
- You want to avoid REST over-fetching / under-fetching
- Schema introspection and developer tooling matter
- Consider the N+1 risk: server-side resolvers must batch (DataLoader pattern) to avoid per-field DB queries
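
The batching idea behind the DataLoader pattern can be shown in a simplified, synchronous form. This is not the API of any DataLoader library: `BatchLoader`, `FakeProductTable`, and `resolve_line_items` are hypothetical names, and real implementations typically batch per event-loop tick rather than on first access.

```python
class FakeProductTable:
    """Stand-in for a database table; counts how many queries are issued."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def fetch_many(self, ids):
        self.queries += 1
        return {i: self.rows[i] for i in ids}


class BatchLoader:
    """Collects requested keys, then resolves all of them with a single query."""

    def __init__(self, table):
        self._table = table
        self._pending = []
        self._cache = {}

    def load(self, key):
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)
        return lambda: self._resolve(key)  # deferred: nothing is fetched yet

    def _resolve(self, key):
        if self._pending:
            self._cache.update(self._table.fetch_many(self._pending))
            self._pending = []
        return self._cache[key]


def resolve_line_items(lines, loader):
    thunks = [loader.load(line["product_id"]) for line in lines]  # collect keys first
    return [thunk() for thunk in thunks]  # the first call triggers the one batch query
```

Without the loader, each line item's resolver would issue its own lookup, which is the per-field query pattern the bullet above warns about.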
## Combining Styles
It is common and correct to use different styles at different boundaries:
- Internal services → gRPC
- External / public API → REST/JSON (possibly generated from proto schemas via gRPC-gateway)
- Async event flows → message queue
- Mobile-specific → GraphQL in preference to REST
The Remote Facade pattern applies to all of these: design the interface as a coarse-grained use-case-shaped contract, implement it with whichever transport style is appropriate.
FILE:references/remote-facade-design-guide.md
# Remote Facade Design Guide
Source: Patterns of Enterprise Application Architecture, Ch 15 (Remote Facade) — Fowler
## Pattern Intent
A Remote Facade is a **coarse-grained facade over a web of fine-grained objects** that provides efficient remote access to a domain model that would otherwise be chatty.
## Core Rules
1. **No domain logic in the facade.** "Repeat after me three times: Remote Facade has no domain logic." The facade translates coarse → fine. Logic lives in domain objects.
2. **Thin delegation.** Each facade method should be 1-3 lines, delegating to domain objects or a non-remote Service Layer.
3. **Coarse-grained methods.** One call does one unit of useful work. Not one getter per property.
4. **Few facades, many methods.** A moderate application may have one facade; a large application, half a dozen.
## Method Design Heuristics
Good Remote Facade method shapes:
- **Bulk accessor**: `getAddressData(customerId) → AddressDTO` — returns everything the client needs about an address in one call
- **Use-case command**: `submitOrder(orderId, paymentDTO) → OrderConfirmationDTO` — performs a complete business operation
- **Screen loader**: `loadOrderScreen(orderId) → OrderScreenDTO` — loads all data for a specific screen
- **Status command**: `changeOrderStatus(orderId, status)` — a single button-press action
Bad Remote Facade method shapes (distribution-by-class):
- `getCity(addressId)`, `getState(addressId)`, `getZip(addressId)` — three calls for three properties
- `getOrder(orderId)`, `getOrderLines(orderId)`, `getProducts(orderId)` — three calls to display one screen
## Granularity Decision
Fowler prefers very few, very coarse facades. Design facades around **client families** (a group of related screens or operations), not around domain classes.
Example — music domain:
- One `AlbumService` facade handles: `getAlbum`, `createAlbum`, `updateAlbum`, `getArtist`, `addArtist`
- Not: separate `AlbumService`, `ArtistService`, `TrackService` facades
## Remote Facade vs Session Facade (J2EE)
| | Remote Facade (Fowler) | Session Facade (J2EE community) |
|--|------------------------|----------------------------------|
| Contains domain logic? | No — thin skin only | Often yes — workflow coordination |
| Domain objects | Fine-grained POJOs inside | Entity beans or services |
| Correctness | Correct pattern | Anti-pattern per Fowler |
If your "Remote Facade" has conditional logic, workflows, or business rules — it is a Session Facade with leaked domain logic, not a Remote Facade. Move the logic to domain objects.
## Stateful vs Stateless
- **Stateless facade**: Can be pooled; better for high-concurrency B2C scenarios. Requires external session state (Client Session State / Database Session State).
- **Stateful facade**: Holds session state itself; simpler to implement. May become a bottleneck under high user concurrency.
## Security and Transaction Boundaries
The Remote Facade is the correct place for:
- **Access control**: Check permissions at facade method entry
- **Transaction demarcation**: Begin transaction before calling domain objects; commit after
These are infrastructure concerns, not domain logic.
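
As a rough sketch of that split, the facade method below checks a role and brackets the work in a transaction, while the rule about legal status changes stays in the domain object. The `orders` table, the `order-admin` role, and the class names are assumptions made for the example; `sqlite3` stands in for whatever transactional resource is actually in use.

```python
from __future__ import annotations

import sqlite3


class PermissionDenied(Exception):
    pass


class Order:
    """Domain object: owns the rule about which status changes are legal."""

    _ALLOWED = {"new": {"paid", "cancelled"}, "paid": {"shipped"}}

    def __init__(self, order_id: int, status: str) -> None:
        self.order_id = order_id
        self.status = status

    def change_status(self, new_status: str) -> None:
        if new_status not in self._ALLOWED.get(self.status, set()):
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status


class OrderFacade:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def change_order_status(self, user_roles: set[str], order_id: int, status: str) -> None:
        # Access control at facade entry.
        if "order-admin" not in user_roles:
            raise PermissionDenied("order-admin role required")
        # Transaction demarcation: commits on success, rolls back on any exception.
        with self._conn:
            row = self._conn.execute(
                "SELECT id, status FROM orders WHERE id = ?", (order_id,)
            ).fetchone()
            order = Order(*row)
            order.change_status(status)  # the business rule lives in the domain object
            self._conn.execute(
                "UPDATE orders SET status = ? WHERE id = ?", (order.status, order.order_id)
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'new')")
conn.commit()

OrderFacade(conn).change_order_status({"order-admin"}, 1, "paid")
final_status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
```

The facade contains one permission check and one transaction bracket; the only conditional about order states is inside `Order`.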
## Testing
Because the facade has no domain logic, you can test domain behavior without deploying the remote shell. Instantiate the facade bean/class directly and test in-process. This is a feature, not a workaround — it confirms the facade has no logic worth testing.
## Modern Equivalents
| Classic | Modern |
|---------|--------|
| EJB Session Bean | gRPC service implementation |
| SOAP Web Service interface | REST controller / OpenAPI endpoint |
| CORBA interface | gRPC .proto service definition |
| RMI interface | Java/Kotlin gRPC stub |
Choose the right data access pattern — Table Data Gateway, Row Data Gateway, Active Record, or Data Mapper — for a persistence layer. Use when asked "should...
---
name: data-source-pattern-selector
description: |
Choose the right data access pattern — Table Data Gateway, Row Data Gateway, Active Record, or Data Mapper — for a persistence layer. Use when asked "should I use Active Record or Data Mapper?", "which ORM pattern fits my app?", "when does Hibernate-style mapping make sense vs. Rails ActiveRecord?", "how do I structure my database access layer?", "data mapper or active record for my domain model?", "Row Data Gateway vs Active Record", "Table Data Gateway vs Data Mapper", "Fowler data source patterns", "persistence layer design", "ORM pattern selection", "choose ORM pattern", "database access layer architecture", "Hibernate vs Rails persistence style". Applies when designing a new persistence layer or refactoring an existing one. Routes each domain-logic pattern (Table Module, Transaction Script, Domain Model) to its natural data-source counterpart. Identifies the Active Record / Data Mapper mismatch anti-pattern (AR when schema is not isomorphic with objects; DM when AR would suffice). Maps each pattern to modern framework idioms: Rails ActiveRecord → AR pattern; Hibernate / Spring Data JPA / EF Core → DM; Django ORM → AR-leaning; SQLAlchemy Core → TDG-style; SQLAlchemy ORM → DM; Laravel Eloquent → AR. Warns against business logic creeping into Gateway classes. Produces a pattern decision record with rationale, framework notes, and migration path. If the domain-logic pattern has not yet been chosen, invoke `domain-logic-pattern-selector` first.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/patterns-of-enterprise-application-architecture/skills/data-source-pattern-selector
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: patterns-of-enterprise-application-architecture
title: "Patterns of Enterprise Application Architecture"
authors: ["Martin Fowler", "David Rice", "Matthew Foemmel", "Edward Hieatt", "Robert Mee", "Randy Stafford"]
chapters: [3, 10]
tags: ["data-access", "persistence", "orm", "database", "design-patterns", "software-architecture", "active-record", "data-mapper", "table-data-gateway", "row-data-gateway", "fowler-peaa", "object-relational-mapping", "enterprise-patterns"]
depends-on: []
execution:
tier: 2
mode: hybrid
inputs:
- type: document
description: "System description including: the chosen (or candidate) domain-logic pattern, schema shape, language and ORM framework, whether the schema is owned by this team or external, team sophistication, and any current pain points with data access."
tools-required: [Read, Write]
tools-optional: [Grep]
mcps-required: []
environment: "Any agent environment. Works with codebase analysis, architecture documents, schema files, or a written project description."
discovery:
goal: "Produce a concrete data-source pattern recommendation with rationale, framework mapping, anti-pattern warnings, and migration path."
tasks:
- "Determine the already-chosen or candidate domain-logic pattern (Table Module, Transaction Script, Domain Model)"
- "Assess schema isomorphism with domain objects (high = AR-friendly; low = DM)"
- "Assess business logic weight on domain objects (none = TDG/RDG; moderate-to-heavy = AR or DM)"
- "Apply the routing matrix to produce a primary pattern recommendation"
- "Map the recommendation to the team's language/framework idiom"
- "Identify applicable anti-patterns and warn against them"
- "Output a pattern decision record with rationale and migration path"
audience:
roles: ["software-architect", "senior-backend-engineer", "tech-lead", "framework-designer"]
experience: "intermediate-to-advanced"
when_to_use:
triggers:
- "Starting a new service and choosing how to structure database access"
- "Refactoring a persistence layer that has grown messy or tightly coupled to the schema"
- "Evaluating whether to adopt an ORM and which style (AR-style vs. DM-style)"
- "Noticing Active Record classes that have become fat with complex domain logic and inheritance"
- "Noticing a full Data Mapper layer that feels over-engineered for simple CRUD"
- "Choosing between Rails-style ORM and Hibernate-style ORM for a new stack"
- "Documenting an architecture decision about persistence-layer design"
prerequisites:
- "Some clarity about domain-logic approach (even a guess). If unknown, run `domain-logic-pattern-selector` first."
not_for:
- "Choosing a specific database product (PostgreSQL vs. MySQL) — this skill selects the access pattern, not the product"
- "Concurrency control patterns (optimistic/pessimistic locking) — see `unit-of-work-implementer` or `optimistic-lock-advisor`"
- "Query optimization or index design — this is an architectural pattern selection, not a performance tuning skill"
environment:
codebase_required: false
codebase_helpful: true
works_offline: true
quality:
scores:
with_skill: "{filled by tester}"
baseline: "{filled by tester}"
delta: "{filled by tester}"
tested_at: "{filled by tester}"
eval_count: "{filled by tester}"
assertion_count: 14
iterations_needed: "{filled by tester}"
---
# Data Source Pattern Selector
## When to Use
You are designing or refactoring the persistence layer of an enterprise application and need to choose *how* objects talk to the database. The four canonical options — Table Data Gateway, Row Data Gateway, Active Record, Data Mapper — differ not just in complexity but in what they assume about domain object structure and business logic placement.
This skill applies when:
- You are choosing a data-access approach before — or alongside — selecting an ORM framework
- Your Active Record classes have started accumulating complex inheritance, collections, or non-trivial business logic and things feel messy
- You suspect your full Data Mapper layer is overkill for a largely CRUD application
- You want to document a persistence architecture decision with clear rationale
**If the domain-logic pattern (Table Module, Transaction Script, Domain Model) has not yet been chosen, the data-source pattern cannot be selected reliably.** Invoke `domain-logic-pattern-selector` first, or ask the user to describe their domain-logic approach before proceeding.
---
## Context & Input Gathering
### Required
- **Domain-logic pattern in use or planned:** Table Module / Transaction Script / Domain Model — this is the primary routing axis.
- **Schema shape:** Is the database schema closely aligned with how you model objects in code, or does it differ significantly? Are there inheritance hierarchies, embedded collections, or many-to-many associations that don't map neatly to tables?
- **Business logic weight:** Do domain objects carry significant behavior (validation, calculations, policies), or is the application largely CRUD?
- **Language and framework:** What language and ORM/database library are in use or being evaluated? (Rails, Hibernate, Django, SQLAlchemy, EF Core, Laravel, etc.)
### Helpful
- **Schema ownership:** Does your team control the schema, or is it a legacy/external schema you must conform to? (External schemas often push toward Data Mapper for isolation.)
- **Team sophistication:** Larger or more complex mapping layers (Data Mapper) require more experience to implement and maintain.
- **Existing code artifacts:** Any domain classes, repository interfaces, or gateway classes already in the codebase signal where the current approach sits.
### Defaults if not specified
- Unknown domain-logic pattern → ask before proceeding (this is a hard prerequisite)
- Unknown schema shape → assume moderate complexity; flag isomorphism check as required
- Unknown framework → provide pattern-level recommendation; offer framework mapping as a separate step
---
## Process
**Step 1 — Identify the domain-logic pattern.**
WHY: The domain-logic pattern is the primary determinant of which data-source pattern fits. Fowler is explicit: Table Module pairs with Table Data Gateway; Domain Model pairs with Active Record or Data Mapper. Choosing a data-source pattern independently from domain-logic pattern creates structural mismatches that compound over time.
- Table Module → go directly to **Table Data Gateway** (Step 5). No further analysis needed.
- Transaction Script → go to Step 2 to pick TDG vs RDG.
- Domain Model → go to Step 3 to pick AR vs DM.
- Unknown → invoke `domain-logic-pattern-selector` or ask the user to describe domain-logic approach.
**Step 2 — Transaction Script: choose TDG vs RDG.**
WHY: Both Gateway patterns separate SQL from application logic, but they differ in their result shape. The choice is about what is more convenient for the Transaction Script to work with — a result set / record set, or a per-row object.
- Prefer **Table Data Gateway** when: the environment has good Record Set support (.NET ADO.NET DataSet, JDBC ResultSet consumed directly), when scripts prefer to iterate over result sets rather than object collections, or when stored procedures are the access mechanism (stored procs naturally map to TDG).
- Prefer **Row Data Gateway** when: the environment favors per-row objects, when you want object-oriented field access in scripts, or when you anticipate logic accumulating on the gateway and want a natural refactoring path toward Active Record.
- Note: If you observe business logic creeping into either gateway class, that is the signal to refactor toward Active Record (see the Domain Model rows of the routing matrix in Step 5).
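
The difference in result shape can be sketched with Python's standard `sqlite3` module. The `people` table and both gateway classes are hypothetical, and the class-level finder on the row gateway is a simplification chosen for brevity; a separate finder object is an equally valid layout.

```python
import sqlite3


class PersonTableGateway:
    """Table Data Gateway: one object per table, returns raw rows."""

    def __init__(self, conn):
        self._conn = conn

    def find_by_last_name(self, last_name):
        return self._conn.execute(
            "SELECT id, first_name, last_name FROM people WHERE last_name = ? ORDER BY id",
            (last_name,),
        ).fetchall()


class PersonRowGateway:
    """Row Data Gateway: one object per row, fields as attributes, no domain logic."""

    def __init__(self, conn, person_id, first_name, last_name):
        self._conn = conn
        self.id = person_id
        self.first_name = first_name
        self.last_name = last_name

    @classmethod
    def find(cls, conn, person_id):
        row = conn.execute(
            "SELECT id, first_name, last_name FROM people WHERE id = ?", (person_id,)
        ).fetchone()
        return cls(conn, *row) if row else None

    def update(self):
        self._conn.execute(
            "UPDATE people SET first_name = ?, last_name = ? WHERE id = ?",
            (self.first_name, self.last_name, self.id),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Grace", "Hopper")],
)

# Table Data Gateway: the script iterates over tuples.
rows = PersonTableGateway(conn).find_by_last_name("Hopper")

# Row Data Gateway: the script works with attributes on a per-row object.
person = PersonRowGateway.find(conn, 1)
person.last_name = "King"
person.update()
```

Neither class contains validation or calculation. Adding such methods to `PersonRowGateway` is the drift toward Active Record that the note above describes.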
**Step 3 — Domain Model: assess schema isomorphism.**
WHY: Active Record works well only when the schema is isomorphic with domain objects — one table per class, fields map one-to-one to columns. The moment inheritance hierarchies, embedded value objects, rich collections, or divergent naming appear, Active Record mapping becomes patchwork. Isomorphism is the primary AR/DM split criterion.
Evaluate isomorphism:
- HIGH (AR-friendly): Each domain class corresponds to one table. No inheritance in the domain. Field names map cleanly to column names. No complex associations beyond simple foreign keys.
- LOW (DM territory): Inheritance hierarchies in the domain. Collections mapped across multiple tables. Domain objects named and shaped for business concepts, schema named for normalization. External/legacy schema that the domain must adapt to.
**Step 4 — Domain Model: assess business logic weight.**
WHY: Data Mapper is justified by domain model complexity. If the domain model is simple (validations, derivations, single-record logic), Active Record carries that complexity at low cost with no extra layer. If the domain model has complex policies, multi-object calculations, or needs to be testable without a database, Data Mapper's isolation pays off.
- Simple business logic (CRUD + validations + single-record derivations) → lean toward **Active Record**.
- Complex business logic (multi-entity rules, domain events, complex state machines, aggregate roots, test isolation needed) → lean toward **Data Mapper**.
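
To make the trade-off concrete, here is a minimal sketch of the same withdrawal rule under both patterns. The `accounts` table and all class names are assumptions for illustration; a real Data Mapper layer would normally come from an ORM rather than hand-written SQL.

```python
import sqlite3
from dataclasses import dataclass


class AccountRecord:
    """Active Record: row data, simple domain logic, and persistence in one class."""

    def __init__(self, conn, account_id, balance_cents):
        self._conn = conn
        self.id = account_id
        self.balance_cents = balance_cents

    def withdraw(self, amount_cents):
        if amount_cents > self.balance_cents:
            raise ValueError("insufficient funds")
        self.balance_cents -= amount_cents

    def save(self):
        self._conn.execute(
            "UPDATE accounts SET balance_cents = ? WHERE id = ?",
            (self.balance_cents, self.id),
        )


@dataclass
class Account:
    """Data Mapper style: the domain object knows nothing about the database."""
    id: int
    balance_cents: int

    def withdraw(self, amount_cents):
        if amount_cents > self.balance_cents:
            raise ValueError("insufficient funds")
        self.balance_cents -= amount_cents


class AccountMapper:
    """All SQL and column names are confined to the mapper."""

    def __init__(self, conn):
        self._conn = conn

    def find(self, account_id):
        row = self._conn.execute(
            "SELECT id, balance_cents FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        return Account(*row)

    def update(self, account):
        self._conn.execute(
            "UPDATE accounts SET balance_cents = ? WHERE id = ?",
            (account.balance_cents, account.id),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance_cents INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 1000), (2, 1000)")

# Active Record: the object persists itself.
record = AccountRecord(conn, 1, 1000)
record.withdraw(250)
record.save()

# Data Mapper: the domain object is plain; the mapper moves data in and out.
mapper = AccountMapper(conn)
account = mapper.find(2)
account.withdraw(400)
mapper.update(account)
```

With logic this simple the Active Record version is shorter and loses nothing. The mapper version starts to pay off once `Account` needs to be exercised without a connection or its shape stops matching the table.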
**Step 5 — Apply the routing matrix and select pattern.**
WHY: Combines Steps 1–4 into a single recommendation. Mixing patterns at the primary persistence level is a known anti-pattern — it creates confusion about where persistence logic lives and where domain logic lives.
| Domain-Logic Pattern | Schema Isomorphism | Business Logic Weight | → Data-Source Pattern |
|---|---|---|---|
| Table Module | — | — | **Table Data Gateway** |
| Transaction Script | — | — | TDG (result-set style) or RDG (object style) |
| Domain Model | High | Low | **Active Record** |
| Domain Model | High | Growing / complex | **Active Record** → plan migration to DM |
| Domain Model | Low | Any | **Data Mapper** |
| Domain Model | Any | Heavy / test-isolated | **Data Mapper** |
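
The matrix can also be read as a small decision function. The sketch below is one possible encoding; the string labels (`"table-module"`, `"growing"`, and so on) are assumptions introduced here, not vocabulary from the source.

```python
def route_data_source_pattern(domain_logic, isomorphism=None, logic_weight=None):
    """Return the data-source pattern the routing matrix suggests.

    domain_logic: "table-module" | "transaction-script" | "domain-model"
    isomorphism:  "high" | "low"               (Domain Model only)
    logic_weight: "low" | "growing" | "heavy"  (Domain Model only)
    """
    if domain_logic == "table-module":
        return "Table Data Gateway"
    if domain_logic == "transaction-script":
        return "Table Data Gateway or Row Data Gateway"
    if domain_logic != "domain-model":
        raise ValueError(f"unknown domain-logic pattern: {domain_logic!r}")
    # Low isomorphism or heavy logic each route to Data Mapper on their own.
    if isomorphism == "low" or logic_weight == "heavy":
        return "Data Mapper"
    if isomorphism == "high" and logic_weight == "growing":
        return "Active Record (plan migration to Data Mapper)"
    if isomorphism == "high" and logic_weight == "low":
        return "Active Record"
    raise ValueError("isomorphism and logic_weight are required for Domain Model")


recommendation = route_data_source_pattern("domain-model", "high", "growing")
```

The Data Mapper check comes first so that the two "Any" rows of the matrix take precedence over the Active Record rows.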
**Step 6 — Map to framework idiom.**
WHY: Pattern selection is only useful if it connects to the team's actual tooling. Most mainstream ORMs lean toward one of the four patterns; choosing a pattern that is misaligned with the framework means fighting the framework's defaults.
- **Active Record pattern:** Rails ActiveRecord (naming intentional), Django ORM model classes, Laravel Eloquent, Grails GORM.
- **Data Mapper pattern:** Hibernate (Java), Spring Data JPA (via Hibernate), Entity Framework Core (especially with separate repository layer), SQLAlchemy ORM (with `Session` and mapped classes), TypeORM (repository mode), Doctrine ORM (PHP).
- **Table Data Gateway:** ADO.NET DataSet/TableAdapter (.NET classic), stored-procedure wrappers, SQLAlchemy Core (execute + handle result sets directly), JDBC direct usage.
- **Row Data Gateway:** Less common in modern frameworks; often hand-rolled or generated from metadata. Appears in legacy Java/C# codebases before frameworks matured.
**Step 7 — Check for anti-patterns.**
WHY: Even the correct pattern recommendation fails if common misuses are not flagged upfront. These are the most frequent failure modes Fowler identifies.
Check each:
- [ ] **AR with non-isomorphic schema**: Using Active Record when the domain has inheritance hierarchies, value object collections, or complex associations that don't map 1:1 to tables. Symptom: lots of `has_many :through`, STI workarounds, or manual column overrides. Fix: migrate to Data Mapper.
- [ ] **Premature Data Mapper**: Full mapping layer (Hibernate, hand-rolled mappers) for a largely CRUD application with simple domain. Symptom: enormous mapper configuration, trivial domain classes. Fix: evaluate whether Active Record would suffice.
- [ ] **Business logic in Gateway classes**: TDG or RDG methods that contain validation, calculation, or domain rules. This is Active Record by another name — but without intent. Fix: either commit to Active Record or strip the logic out.
- [ ] **Mixed primary persistence patterns**: Using both Active Record and Data Mapper for different parts of the same domain model. Fowler: "you don't want to mix them because that ends up getting very messy."
**Step 8 — Produce the decision record.**
WHY: Pattern selection without documented rationale gets revisited and second-guessed. A one-page decision record captures the reasoning, making future refactoring decisions faster.
Output: See Outputs section below.
---
## Inputs
- Domain-logic pattern name (Table Module / Transaction Script / Domain Model)
- Schema shape description (isomorphic vs. divergent, presence of inheritance/collections)
- Business logic weight (CRUD-only vs. complex policies)
- Language and ORM framework
- Optional: existing codebase, schema files, architecture docs
---
## Outputs
**Pattern Decision Record** (written to a markdown file or returned inline):
```
# Data Source Pattern Decision Record
## Context
- Domain-logic pattern: [Table Module | Transaction Script | Domain Model]
- Schema isomorphism: [High | Medium | Low]
- Business logic weight: [Low | Moderate | Heavy]
- Language / framework: [e.g., Java / Hibernate]
## Recommended Pattern
**[Table Data Gateway | Row Data Gateway | Active Record | Data Mapper]**
## Rationale
[2-4 sentences connecting domain-logic pattern + isomorphism + logic weight to the recommendation]
## Framework Mapping
- This project uses [framework], which implements the [pattern] style natively.
- [Any idiomatic notes or configuration guidance]
## Migration Path (if applicable)
[e.g., "Currently using RDG; as business logic accumulates on row objects, refactor to AR by
moving Transaction Script logic into the gateway class, renaming to domain classes."]
## Anti-Patterns to Watch
- [AR/DM mismatch warning if applicable]
- [Business-logic-in-gateway warning if applicable]
## Related Patterns Triggered
- Unit of Work: [needed / not needed] — [reason]
- Identity Map: [needed / not needed]
- Lazy Load: [consider / not applicable]
```
---
## Key Principles
**1. Domain-logic pattern is the primary routing axis, not a preference.**
The data-source pattern is constrained by domain-logic choice, not independent of it. Table Module operates on record sets, so Table Data Gateway is its natural partner. Domain Model with complex logic cannot sustain Active Record cleanly. This is why `domain-logic-pattern-selector` must run first.
**2. Schema isomorphism is the Active Record / Data Mapper split criterion.**
Active Record requires one-to-one correspondence between class and table structure. When domain objects diverge from the schema — through inheritance, value objects, or separate evolution — Data Mapper becomes necessary. Trying to force AR onto a non-isomorphic schema produces layered hacks.
**3. The four patterns form a progression, not a menu.**
TDG → RDG → AR → DM is a progression of increasing capability and increasing complexity. The appropriate pattern is the simplest one that handles the domain-logic complexity. Over-engineering (premature DM) is as damaging as under-engineering (AR for complex domain).
**4. Business logic in a Gateway class is a smell, not a feature.**
If logic accumulates in a TDG or RDG, the class is evolving into an Active Record. That evolution is fine if intentional; it is an anti-pattern if unintentional, because it violates the Gateway's contract of pure data access.
**5. Data Mapper buys independent evolution, at a cost.**
DM lets the database schema and object model evolve independently — you can rename a table without touching domain code, or add a collection without altering the schema. The cost is the mapping layer itself: more code, more configuration, more tooling. Only justified when that independence is actually needed.
**6. Framework choice reflects pattern choice — fight the framework at your peril.**
Rails assumes Active Record pattern. Hibernate assumes Data Mapper. Using Hibernate like Active Record (loading everything into rich entities for simple CRUD) is as problematic as using Rails AR for a rich domain with complex inheritance. Choose the framework whose default pattern matches the one this skill recommends.
**7. Do not mix primary persistence patterns in the same domain.**
Mixing AR and DM for different domain classes creates two parallel mental models for how persistence works. Fowler is explicit: pick one primary pattern and apply it consistently. DM can call TDG internally (as a layering technique), but that is a deliberate architectural choice, not mixing.
---
## Examples
### Scenario A: Ruby on Rails e-commerce app with standard domain
**Trigger:** "We're building an online store with Rails. Products, Orders, Users — standard stuff. Should we use ActiveRecord or something like Trailblazer's Data Mapper approach?"
**Process:**
1. Domain-logic pattern: Transaction Script evolving toward Domain Model — validation and simple calculation on models. Standard Rails.
2. Schema isomorphism: HIGH — each model maps to one table, no unusual inheritance.
3. Business logic weight: LOW to MODERATE — discount calculations, order state transitions, nothing requiring test isolation from DB.
4. Routing matrix: Domain Model (simple) + High isomorphism + Low-moderate logic → **Active Record**.
5. Framework mapping: Rails ActiveRecord implements this natively.
6. Anti-pattern check: Flag that AR will strain if the domain grows to include complex pricing rules, product hierarchies, or external pricing APIs. Plan for that threshold and document it.
**Output:** Recommend Active Record. Use Rails ActiveRecord natively. Add a note: "When Product becomes a polymorphic hierarchy with >3 types and complex pricing rules, reassess toward Data Mapper using a separate service/repository layer."
---
### Scenario B: Java enterprise app with complex domain, legacy schema
**Trigger:** "We have a Java application with a complex insurance policy domain — policies, coverages, endorsements, riders, pricing rules. Legacy Oracle schema from 2003 that doesn't match how we model policies in code. Team of 20, using Spring."
**Process:**
1. Domain-logic pattern: Domain Model — clear from the description (policies, endorsements, pricing rules).
2. Schema isomorphism: LOW — legacy schema with 2003 naming, coverage modeled differently than in code, likely separate pricing tables.
3. Business logic weight: HEAVY — pricing rules, endorsements, riders all suggest complex domain behavior.
4. Routing matrix: Domain Model + Low isomorphism + Heavy logic → **Data Mapper**.
5. Framework mapping: Spring with Hibernate already implements the Data Mapper pattern. Use JPA `@Entity` with `@Column` mapping to isolate domain names from schema names. Repository interfaces provide finder behavior.
6. Anti-pattern check: Warn against "anemic domain model" — putting all logic in service classes while entities are DTOs. That removes DM's benefit. Domain logic should live in domain classes, not in Spring `@Service` beans.
**Output:** Recommend Data Mapper via Hibernate/JPA. Domain classes in `src/domain/`, repository interfaces in `src/persistence/`. Isolate schema column names in `@Column(name="POLICY_COVG_AMT")` mappings. Plan Unit of Work via JPA `EntityManager` + transaction boundaries.
---
### Scenario C: .NET WinForms reporting app with data grids
**Trigger:** "We're building an internal reporting tool in C# .NET. Lots of data grids, editable tables, CRUD screens that mirror database tables. Using DataSets and DataGridViews."
**Process:**
1. Domain-logic pattern: Table Module — screens correspond to tables, data grids bind to record sets.
2. No need to assess isomorphism or business logic weight: Table Module routes directly to TDG (Step 1).
3. Framework mapping: ADO.NET DataSet + DataAdapter + DataGridView is the canonical .NET Table Data Gateway implementation. TableAdapter in the Visual Studio dataset designer IS a TDG.
4. Anti-pattern check: Do not add business logic to gateway classes. Validation that spans multiple tables belongs in a separate validation layer, not in the TableAdapter.
**Output:** Confirm Table Data Gateway. Use ADO.NET DataAdapter/DataSet pattern or TableAdapter. Return DataSet from gateway methods to bind to DataGridView controls. Keep gateway classes free of business logic.
---
## References
- `references/pattern-routing-matrix.md` — Complete routing table with all branching conditions
- `references/anti-pattern-catalog.md` — Detailed descriptions of each anti-pattern with detection criteria and remediation steps
- `references/framework-pattern-map.md` — Extended framework mapping (20+ ORM frameworks) organized by pattern
- `references/migration-paths.md` — Step-by-step migration guides: TDG→RDG, RDG→AR, AR→DM
**Downstream skills triggered by this skill's output:**
- If Data Mapper selected → `unit-of-work-implementer` (commit sequencing, dirty tracking)
- If Lazy Load needed → `lazy-load-strategy-implementer`
- If audit of existing persistence code → `data-access-anti-pattern-auditor`
---
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Patterns of Enterprise Application Architecture by Martin Fowler et al.
---
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-patterns-of-enterprise-application-architecture`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Audit a persistence layer and schema for data access anti-patterns: N+1 query (SELECT N+1), ripple loading, lazy loading anti-pattern, ghost/proxy identity t...
---
name: data-access-anti-pattern-auditor
description: "Audit a persistence layer and schema for data access anti-patterns: N+1 query
(SELECT N+1), ripple loading, lazy loading anti-pattern, ghost/proxy identity trap (missing
Identity Map), Active Record anti-pattern on non-isomorphic schema, Active Record / Data Mapper
mismatch, Serialized LOB overuse (queryable data stored in BLOB/JSONB/TEXT), meaningful primary
key leakage, business logic in Gateway classes. Given a codebase and schema, produces a
prioritized anti-pattern inventory with code location, evidence snippet, consequence, and
remediation that cross-references pattern-selector skills. Use this for ORM performance audit,
ORM anti-pattern detection, persistence anti-pattern inventory, database access anti-pattern
review, persistence layer review, data access review, audit persistence layer."
version: "1.0.0"
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/patterns-of-enterprise-application-architecture/skills/data-access-anti-pattern-auditor
metadata: {"openclaw":{"emoji":"🔍","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: patterns-of-enterprise-application-architecture
title: "Patterns of Enterprise Application Architecture"
authors: ["Martin Fowler", "David Rice", "Matthew Foemmel", "Edward Hieatt", "Robert Mee", "Randy Stafford"]
chapters: [3, 11, 12]
domain: software-architecture
tags:
- anti-patterns
- persistence
- orm
- auditing
- data-access
- code-review
- performance
- database
depends-on: []
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "Persistence layer source files (models, repositories, gateways, mappers) plus schema (SQL DDL or ORM model definitions). The fuller the snapshot, the more precise the findings."
- type: description
description: "ORM / framework in use (Rails/ActiveRecord, Hibernate/JPA, SQLAlchemy, EF Core, TypeORM, etc.) and language. Inferred from build files if not stated."
tools-required:
- Grep
- Read
- Glob
tools-optional:
- Bash
mcps-required: []
environment: "Enterprise application codebase with persistence layer accessible. Minimum: schema DDL + one sample model/repository file. Optimal: full src/persistence/, src/domain/, schema.sql, and ORM config."
discovery:
goal: "Produce a prioritized anti-pattern inventory for a persistence layer, with code location, evidence, consequence, and remediation for each finding."
tasks:
- "Identify N+1 / ripple loading: loop + per-iteration DB access"
- "Identify ghost/proxy identity trap: missing or bypassed Identity Map"
- "Identify Active Record / Data Mapper mismatch: schema-isomorphism failure"
- "Identify Serialized LOB overuse: queryable data buried in BLOB/JSONB/TEXT"
- "Identify meaningful primary key leakage: business values as PKs"
- "Identify business logic in Gateway classes: non-CRUD methods in TDG/RDG"
- "Prioritize findings by severity (data-integrity > performance > maintainability)"
- "Produce remediation recommendations cross-referencing selector skills"
audience:
roles:
- senior-backend-engineer
- software-architect
- tech-lead
- code-reviewer
experience: intermediate
when_to_use:
triggers:
- "Suspecting N+1 query performance problems in an ORM-backed app"
- "Pre-refactoring audit of persistence layer before introducing a new ORM or framework"
- "Code review of a persistence layer or data access object set"
- "Diagnosing slow pages or endpoints where the bottleneck is database round-trips"
- "Migrating from ActiveRecord to Data Mapper (or vice versa) and need to assess fit"
- "Greenfield design review: confirming the chosen data-source pattern matches domain complexity"
- "Legacy system audit: identifying structural persistence debt before modernization"
prerequisites: []
not_for:
- "Concurrency/locking anti-patterns (use transaction-isolation-level-auditor)"
- "Web presentation anti-patterns (fat controller, template scriptlets)"
- "Distribution anti-patterns (chatty remote calls)"
- "Choosing the correct data-source pattern from scratch (use data-source-pattern-selector)"
environment:
codebase_required: true
codebase_helpful: true
works_offline: true
quality:
scores:
with_skill: "{filled by tester}"
baseline: "{filled by tester}"
delta: "{filled by tester}"
tested_at: "{filled by tester}"
eval_count: "{filled by tester}"
assertion_count: 14
iterations_needed: "{filled by tester}"
---
# Data Access Anti-Pattern Auditor
An end-to-end persistence-layer audit that detects six classes of data access anti-pattern,
grades each finding by severity, and produces remediation recommendations grounded in
Fowler's Patterns of Enterprise Application Architecture.
## When to Use
Run this audit when:
- Pages or endpoints are inexplicably slow and the bottleneck is query count, not query duration.
- A code review reveals ORM model files that feel "too smart" or "too tangled with schema concerns."
- A migration is planned (new ORM, schema refactor, framework upgrade) and you need a
baseline of structural debt.
- A new subsystem is being designed and you want to confirm the data-source pattern choice
before coding begins.
- You have inherited a legacy codebase and need to map persistence risks before touching it.
Not for: concurrency/locking problems, web controller bloat, remote-call overhead, or
choosing a pattern from scratch (those have dedicated skills).
## Context and Input Gathering
**Required:**
- Persistence layer source files: model classes, repository/gateway/mapper files, ORM config.
- Schema: SQL DDL, migration files, or ORM model definitions that define table structure.
**Observable / inferred:**
- ORM and language: detected from `import`/`require` statements, `pom.xml`, `Gemfile`,
`requirements.txt`, `package.json`. Ask if not determinable.
- Application domain: inferred from table/class names. A brief description helps scope findings.
**Defaults if not provided:**
- Assume a modern ORM with lazy loading in play (the default for associations in Rails and
SQLAlchemy and for collections in Hibernate; opt-in in EF Core).
- Assume new system unless file timestamps or comments indicate otherwise.
**Sufficiency check:** You need at least one model/entity file and the schema to produce
findings. Without both, produce a "what to look for" checklist and ask for the missing piece.
## Process
### Step 1 — Discover the persistence layer
WHY: Anti-patterns live in specific file types. Locating them before analysis avoids
missing findings buried in unusual directory structures.
1a. Glob for ORM model/entity/gateway files:
```
src/persistence/** src/models/** src/domain/**
**/models.py **/*.entity.ts **/*Gateway.* **/*Mapper.* **/*Repository.*
schema.sql db/schema.rb migrations/
```
1b. Read the build file (pom.xml, Gemfile, requirements.txt, package.json) to confirm ORM.
1c. Read `config/database.yml` or equivalent for connection pool and isolation level context.
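
Where no Glob tool is available, the discovery in 1a can be approximated with the standard library. This is a sketch under assumptions: `find_persistence_files` and `PERSISTENCE_PATTERNS` are hypothetical names, it matches on file names only, and the temporary directory exists purely to demonstrate the call.

```python
import tempfile
from pathlib import Path

# File-name patterns taken from Step 1a; extend for the stack under audit.
PERSISTENCE_PATTERNS = [
    "models.py", "*.entity.ts", "*Gateway.*", "*Mapper.*", "*Repository.*",
    "schema.sql", "schema.rb",
]


def find_persistence_files(root):
    """Return sorted relative paths of files whose names match a persistence pattern."""
    root = Path(root)
    matches = set()
    for pattern in PERSISTENCE_PATTERNS:
        for path in root.rglob(pattern):
            if path.is_file():
                matches.add(path.relative_to(root).as_posix())
    return sorted(matches)


# Demonstration against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["src/models/OrderMapper.java", "src/web/OrderController.java", "db/schema.rb"]:
        file_path = Path(tmp, name)
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.touch()
    found = find_persistence_files(tmp)
```

Directory-based patterns such as `src/persistence/**` and `migrations/` are not covered by this name-only match and would need a separate pass.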
### Step 2 — Scan for N+1 / ripple loading
WHY: Fowler warns that "ripple loading" — filling a collection with individually lazy-loaded
objects then examining them one at a time — is the most common ORM performance killer. Each
object triggers a separate SQL round-trip, making N parent entities produce 1+N queries.
2a. Grep for loop constructs: `forEach`, `each`, `for...in`, `for...of`, `.map(`, `.stream(`.
2b. Inside each loop body, check whether a navigation property or finder is accessed:
`.line_items`, `.getRelated()`, `.find(`, `.query(`.
2c. Confirm no eager-load option at the query site: Rails `.includes`, JPA `JOIN FETCH`,
SQLAlchemy `joinedload`, EF Core `.Include`, Prisma `include`.
2d. Check ORM model definitions for `lazy=True`, `FetchType.LAZY`, `virtual ICollection` —
these are the seed of N+1 when accessed in a loop.
Record: file path, line range, loop shape, the lazy field accessed, evidence snippet.
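Steps 2a and 2b can be approximated with a line-window scan. The sketch below is a heuristic under stated assumptions: the regexes cover only the loop and access shapes listed above, `find_n_plus_one_candidates` and its `window` parameter are hypothetical names, and the eager-load check in step 2c still has to be done at the query site.

```python
import re

# Loop openers from step 2a and access shapes from step 2b (illustrative, not exhaustive).
LOOP_RE = re.compile(r"\bforEach\b|\.each\b|\bfor\b.+\b(in|of)\b|\.map\(|\.stream\(")
ACCESS_RE = re.compile(r"\.line_items\b|\.getRelated\(|\.find\(|\.query\(")


def find_n_plus_one_candidates(source: str, window: int = 5) -> list[tuple[int, int, str]]:
    """Return (loop_line, access_line, snippet) for each DB-looking access found
    on a loop line or within `window` lines after it. Line numbers are 1-based."""
    lines = source.splitlines()
    hits = []
    for i, line in enumerate(lines):
        loop = LOOP_RE.search(line)
        if loop is None:
            continue
        # A same-line access only counts if it appears after the loop opener.
        if ACCESS_RE.search(line, loop.end()):
            hits.append((i + 1, i + 1, line.strip()))
        for j in range(i + 1, min(i + 1 + window, len(lines))):
            if ACCESS_RE.search(lines[j]):
                hits.append((i + 1, j + 1, lines[j].strip()))
    return hits
```

Each hit is only a candidate. Classify it as a finding after confirming that the accessed field is a lazy relationship and that no eager-load option is present.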
### Step 3 — Scan for ghost / proxy identity trap
WHY: Virtual proxies for lazy loading carry a different object identity than the real object.
Two proxies for the same DB row compare as unequal. Fowler calls this "a nasty identity
problem." Without an Identity Map, the same row can produce multiple conflicting instances
in one transaction — causing double-writes and broken version checks.
3a. Check if the ORM is configured to use a session / Unit of Work / first-level cache.
3b. Grep for multiple session/context creation within one request:
`new Session(`, `new DbContext(`, `sessionFactory.openSession()`.
3c. Check entity `equals()` / `__eq__` — if not overridden, comparison is reference equality,
which will fail for proxies of the same row.
3d. Look for detached entity access after session close (JPA `LazyInitializationException`
stack traces in logs; SQLAlchemy `DetachedInstanceError`).
Record: whether single-session discipline is enforced, any violations found.
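The reason multiple sessions matter in step 3b is easier to see with a toy Identity Map. The sketch below is an illustration only: `IdentityMap`, `Person`, and `fake_loader` are hypothetical stand-ins and do not reproduce any ORM's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Person:
    id: int
    name: str


class IdentityMap:
    """Session-scoped cache: one in-memory object per (type, primary key)."""

    def __init__(self, loader: Callable[[type, int], object]):
        self._loader = loader  # hypothetical row loader, e.g. a SELECT by PK
        self._objects: Dict[Tuple[type, int], object] = {}

    def get(self, cls: type, pk: int) -> object:
        key = (cls, pk)
        if key not in self._objects:
            self._objects[key] = self._loader(cls, pk)
        return self._objects[key]


def fake_loader(cls, pk):
    # Stand-in for a database read; returns a fresh instance on every call.
    return cls(id=pk, name="Ada")


session = IdentityMap(fake_loader)
a = session.get(Person, 1)
b = session.get(Person, 1)
same_session_identical = a is b      # one map, one instance per row

other_session = IdentityMap(fake_loader)
c = other_session.get(Person, 1)
cross_session_identical = a is c     # two maps, two instances for the same row
```

Within one map the row resolves to a single object. A second map in the same request yields a second instance of the same row, which is the shape that step 3b looks for.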
### Step 4 — Scan for Active Record / Data Mapper mismatch
WHY: Fowler states Active Record "works well only if the objects correspond directly to the
database tables: an isomorphic schema." When the domain has inheritance hierarchies, value
objects, or systematic naming divergence, AR fights the ORM constantly — every method
involves manual conversion. Conversely, a full Data Mapper for a simple CRUD app adds
ceremony without benefit.
4a. Identify the data-source pattern in use: ActiveRecord base class, explicit mapper classes,
or repository classes that load domain objects.
4b. If AR: check for isomorphism failure signals —
- `to_domain` / `from_record` / `to_entity` methods inside the AR class.
- Inheritance hierarchy in domain mapped to a flat AR class (STI or manual type column).
- Value objects (Money, Address) stored as AR fields with multi-column mapping.
- Systematic column/field name mismatch beyond convention (e.g., `bill_addr_l1` ↔ `billing_address_line1`).
4c. If Data Mapper: check for over-engineering signals —
- Mapper classes that are pure field-for-field copies with no structural transformation.
- Domain objects with no behavior (all data, no methods) — anemic domain model.
Record: pattern identified, isomorphism verdict, specific mismatch evidence.
### Step 5 — Scan for Serialized LOB overuse
WHY: Fowler warns that the primary disadvantage of Serialized LOB is "you can't query the
structure using SQL." When teams query inside a LOB using LIKE, JSON operators, or secondary
indexes, they have chosen the wrong pattern — the data needed real columns from the start.
The versioning risk is equally serious: changing the serialized class definition can break
deserialization of old rows.
5a. Grep schema for TEXT/BLOB/JSONB/CLOB/XML columns: `BLOB|CLOB|JSONB?\b|TEXT|XML`.
5b. For each LOB column found: check whether it appears in a WHERE, ORDER BY, or JOIN
in SQL queries or ORM filter expressions.
5c. Grep application code for deserialization + post-filtering:
`json.loads(`, `JSON.parse(`, `deserialize(` followed by list comprehension / filter.
5d. Check for GIN indexes or expression indexes on LOB columns.
5e. Grep migration history for changes to LOB content structure (adding/renaming JSON fields
via UPDATE queries or migration helpers).
Record: column name, query-pattern found, whether content is ever filtered/joined.
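Steps 5a and 5b can be combined into one pass over the DDL and the SQL. This is a rough sketch assuming one column definition per line in the DDL; `lob_columns` and `queried_lob_columns` are hypothetical helper names, and the regexes will miss ORM filter expressions.

```python
import re

LOB_COLUMN_RE = re.compile(
    r"^\s*(\w+)\s+(TEXT|BLOB|CLOB|JSONB|JSON|XML)\b", re.IGNORECASE | re.MULTILINE
)


def lob_columns(ddl: str) -> list[str]:
    """Column names declared with a LOB-like type, one definition per line."""
    return [m.group(1) for m in LOB_COLUMN_RE.finditer(ddl)]


def queried_lob_columns(ddl: str, sql: str) -> list[str]:
    """LOB columns that appear after WHERE, ORDER BY, or ON within a statement of `sql`."""
    flagged = []
    for col in lob_columns(ddl):
        pattern = re.compile(
            rf"\b(WHERE|ORDER\s+BY|ON)\b[^;]*\b{re.escape(col)}\b", re.IGNORECASE
        )
        if pattern.search(sql):
            flagged.append(col)
    return flagged
```

A column returned by `queried_lob_columns` is a candidate for the Definite classification; a column returned only by `lob_columns` still needs the application-side checks in 5c.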
### Step 6 — Scan for meaningful primary key leakage
WHY: Fowler states "meaningful keys should be distrusted." Business values used as PKs
are supposed to be unique and immutable, but human error makes them neither. A PK cascade
required by a business-rule change (new order number format, SSN correction) touches every
child table — expensive and error-prone.
6a. Inspect schema PKs: flag any that are VARCHAR, CHAR, or named after business concepts
(`email`, `ssn`, `order_number`, `sku`, `username`, `code`).
6b. Flag composite PKs where both columns carry business meaning.
6c. Check FK references: if child tables reference a meaningful PK, the cascade risk is live.
6d. Check domain objects: if the PK field is exposed as a stable business identifier (`getId()`
used as `orderNumber` in UI or external API), the business key is leaking.
Record: table, PK definition, FK references, cascade risk assessment.
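Step 6a lends itself to a quick DDL scan. The following sketch assumes inline `PRIMARY KEY` declarations only; table-level and composite keys (step 6b) need a real SQL parser. `BUSINESS_NAMES` and `meaningful_pk_flags` are hypothetical names introduced for illustration.

```python
import re

# Business-sounding column names from step 6a.
BUSINESS_NAMES = {"email", "ssn", "order_number", "sku", "username", "code"}
INLINE_PK_RE = re.compile(r"(\w+)\s+(\w+)(?:\(\d+\))?\s+PRIMARY\s+KEY", re.IGNORECASE)


def meaningful_pk_flags(ddl: str) -> list[tuple[str, str, str]]:
    """Return (column, type, reason) for inline PKs that look like business values."""
    flags = []
    for col, col_type in INLINE_PK_RE.findall(ddl):
        if col.lower() in BUSINESS_NAMES:
            flags.append((col, col_type.upper(), "business-named key"))
        elif col_type.upper() in {"VARCHAR", "CHAR"}:
            flags.append((col, col_type.upper(), "character-typed key"))
    return flags
```

Every flagged column then feeds step 6c: search the schema for `REFERENCES` clauses that point at it to judge whether the cascade risk is live.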
### Step 7 — Scan for business logic in Gateway classes
WHY: A Table Data Gateway (DAO returning result sets / recordsets) or Row Data Gateway
(per-row accessor) has a single contract: data access — find, insert, update, delete.
When validation, calculation, or workflow logic accumulates in Gateway methods, the Gateway
has drifted toward Active Record without the design intent. Testing requires a live DB even
for logic that has nothing to do with persistence.
7a. Identify Gateway / DAO classes from naming: `*Gateway`, `*DAO`, `*Dao`, `*DataAccess`.
7b. Enumerate public methods. Flag any that are not CRUD verbs:
`find*`, `insert`, `update`, `delete`, `save`, `get*`.
7c. For non-CRUD methods: read the body — is it a business rule, calculation, or validation?
7d. Check for SQL WHERE clauses that encode business policy (discount eligibility, status
transitions) rather than just structural filtering.
Record: class, method name, evidence of non-CRUD logic, suggested move-to target.
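For Python codebases, steps 7a and 7b can be automated with the standard `ast` module. This is a sketch under assumptions: it checks method names only, the prefix list is taken from step 7b, and `non_crud_gateway_methods` is a hypothetical helper name. Step 7c, reading the method body, remains a manual review.

```python
import ast

CRUD_PREFIXES = ("find", "get", "insert", "update", "delete", "save")
GATEWAY_SUFFIXES = ("Gateway", "DAO", "Dao", "DataAccess")


def non_crud_gateway_methods(source: str) -> dict[str, list[str]]:
    """Map each gateway-like class to its public methods whose names
    do not start with a CRUD verb."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        if not node.name.endswith(GATEWAY_SUFFIXES):
            continue
        suspects = [
            item.name
            for item in node.body
            if isinstance(item, ast.FunctionDef)
            and not item.name.startswith("_")
            and not item.name.startswith(CRUD_PREFIXES)
        ]
        if suspects:
            report[node.name] = suspects
    return report
```

A name-based filter lets through methods such as `update_discount_policy`, so treat an empty report as "no obvious leakage" rather than a clean bill of health.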
### Step 8 — Prioritize and produce the audit report
WHY: Not all findings are equal. Data-integrity risks (double-writes, lost updates from
missing Identity Map) must be fixed before performance issues, which must be fixed before
maintainability issues. Prioritizing prevents "fixing smells while the building is on fire."
8a. Rank all findings by severity tier:
1. **Critical** — Missing Identity Map / proxy identity trap (data integrity)
2. **High** — N+1 / ripple loading (performance), AR/DM mismatch (correctness)
3. **Medium-High** — Serialized LOB overuse (query loss + versioning)
4. **Medium** — Meaningful key leakage (stability), Business logic in Gateway (maintainability)
8b. Write the audit report (see Outputs section).
8c. For each finding, cross-reference the family skill that handles the fix:
- N+1 → `lazy-load-strategy-implementer`
- AR/DM mismatch → `data-source-pattern-selector`
- Serialized LOB / meaningful key → `object-relational-structural-mapping-guide`
- Business logic in Gateway → `data-source-pattern-selector`
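The ranking in step 8a amounts to a lookup followed by a stable sort. A minimal sketch, assuming each finding is a dict with a `pattern` key; the pattern slugs and the `rank_findings` name are illustrative, while the tiers are copied from the list above.

```python
SEVERITY_ORDER = ["Critical", "High", "Medium-High", "Medium"]

# Tier per anti-pattern, copied from step 8a. The slugs are hypothetical.
SEVERITY_BY_PATTERN = {
    "proxy-identity-trap": "Critical",
    "n-plus-one": "High",
    "ar-dm-mismatch": "High",
    "serialized-lob-overuse": "Medium-High",
    "meaningful-key": "Medium",
    "gateway-business-logic": "Medium",
}


def rank_findings(findings: list[dict]) -> list[dict]:
    """Attach a severity to each finding and sort Critical-first.
    Python's sort is stable, so discovery order is kept within a tier."""
    ranked = [dict(f, severity=SEVERITY_BY_PATTERN[f["pattern"]]) for f in findings]
    return sorted(ranked, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
```

The sorted list maps directly onto the report sections in Outputs: one section per tier, in the same order.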
## Inputs
| Input | Required | Description |
|---|---|---|
| Persistence layer source | Yes | Model, gateway, mapper, repository files |
| Schema / DDL | Yes | SQL CREATE TABLE statements or ORM model definitions |
| ORM / framework | Inferred | Detected from imports/build files; ask if ambiguous |
| Domain description | Optional | Helps scope findings; inferred from naming if absent |
## Outputs
**Primary artifact: Anti-Pattern Audit Report**
```markdown
# Data Access Anti-Pattern Audit — [System / Subsystem Name]
**Stack:** [language, ORM, database]
**Date:** [date]
**Scope:** [files reviewed]
---
## Critical Findings
### [AP-ID]: [Anti-Pattern Name]
- **Location:** `path/to/file.py`, line XX–YY
- **Evidence:**
~~~
[code snippet showing the problem]
~~~
- **Consequence:** [what goes wrong, how badly]
- **Remediation:** [specific fix; cross-ref skill if pattern choice is involved]
---
## High Findings
[same structure]
## Medium-High Findings
[same structure]
## Medium Findings
[same structure]
---
## Summary Table
| Finding | Anti-Pattern | Severity | File | Remediation Skill |
|---|---|---|---|---|
| AP-01 | N+1: Order.line_items | High | orders_controller.rb:42 | lazy-load-strategy-implementer |
| ... | | | | |
---
## Recommended Action Order
1. [Critical findings first, with rationale]
2. [High findings]
3. [Medium-High findings]
4. [Medium findings]
```
## Key Principles
**1. Ripple loading is detectable statically — it doesn't require profiling.**
The shape is always a loop over N entities with an inner DB access. Grep finds it before
the system is ever under load. Do not wait for slow-query logs; find it in the code.
**2. The proxy identity trap is silent until it causes data loss.**
Two proxies for the same row compare as not-equal. There's no exception, no log message —
just incorrect equality checks, phantom dirty-tracking, and version-check failures. An
Identity Map check (single-session discipline, equals() override) is a correctness audit,
not a performance audit.
**3. "Isomorphic schema" is the only condition under which Active Record is correct.**
Active Record works when every domain field maps 1:1 to a column, every domain class maps
1:1 to a table, and the domain has no inheritance or value objects. Any deviation from this
— even a single `to_domain()` helper — signals that Data Mapper is the correct pattern.
**4. A Serialized LOB is correct only when the content will never be queried independently.**
Fowler's test: "Think of a LOB as a way to take a bunch of objects that aren't likely to be
queried from any SQL route outside the application." If a WHERE clause or application-side
filter ever touches the content, the content needs real columns.
**5. Meaningful keys should be distrusted by default.**
Keys need to be unique AND immutable to function correctly. Business values fail both
properties under human error. The cost of a PK cascade far exceeds the cost of adding a
surrogate column early.
**6. Prioritize by data-integrity risk, not by visibility.**
N+1 queries are loud and visible in slow query logs. Missing Identity Map bugs are silent
and only surface as "phantom data" in production. Audit for the silent killers first.
**7. Gateway classes have a single contract: data access.**
A method on a Gateway that is not `find`, `insert`, `update`, or `delete` is a smell.
Business logic in a Gateway is hidden domain behavior that cannot be tested without a
live database — a compound failure of the persistence/domain separation.
## Examples
### Scenario A: Rails E-Commerce App with N+1 on Order Items
**Trigger:** Engineer reports that the orders index page takes 3–8 seconds for 100+ orders.
**Process:**
1. Glob `app/controllers/orders_controller.rb` and `app/views/orders/index.html.erb`.
2. Grep for `.each` in the view; find `order.line_items.count` inside the loop.
3. Confirm `Order.all` in the controller has no `.includes(:line_items)`.
4. Grep `app/models/order.rb` for `has_many :line_items` — default lazy.
5. Severity: High (performance; 1+100 queries for 100 orders).
**Output snippet:**
```markdown
### AP-01: N+1 — Order.line_items in orders#index
- **Location:** `app/views/orders/index.html.erb`, line 14
- **Evidence:** `<%= order.line_items.count %>` inside `orders.each` loop;
controller: `@orders = Order.all` (no `.includes`).
- **Consequence:** 1 + N SQL queries (1 for orders + 1 per order for line_items).
With 200 orders: 201 queries per page load.
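- **Remediation:** `Order.includes(:line_items)` (or `.eager_load(:line_items)`) in the controller,
and `line_items.size` in the view; `.count` always issues a COUNT query, even on a preloaded
association. Cross-ref: `lazy-load-strategy-implementer`.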
```
### Scenario B: Hibernate App — Domain Has Inheritance, AR-Style Mapping
**Trigger:** "Our User entity has a `to_domain()` method and the mapper is constantly wrong."
**Process:**
1. Read `User.java`: `User extends BaseEntity` (AR style) and has a `toUserDomain()` method.
2. Read schema: the `users` table has a `type` discriminator column; subtypes `AdminUser`,
`GuestUser` have different behavior.
3. Confirm `@Entity` on `User` with `@Inheritance(strategy = SINGLE_TABLE)`: single-table
inheritance, so several domain classes share one table and the plain AR shape no longer holds.
4. The `toUserDomain()` method converts `User` to a separate `UserDomain` POJO — isomorphism broken.
5. Severity: High (mismatch: domain has inheritance + behavior; AR fighting the ORM).
**Output snippet:**
```markdown
### AP-02: AR/DM Mismatch — User domain inheritance vs flat ActiveRecord
- **Location:** `src/persistence/User.java`, `toUserDomain()` at line 87
- **Evidence:** `User` uses AR-style `@Entity` but contains `toUserDomain()` conversion;
schema has `type` discriminator for AdminUser/GuestUser subtypes.
- **Consequence:** Every domain operation requires manual conversion; schema changes force
dual updates to entity and domain class; tests require DB for all domain logic.
- **Remediation:** Replace with proper Data Mapper (JPA Mapper or manual Mapper class) that
maps User table → AdminUser/GuestUser domain objects. Cross-ref: `data-source-pattern-selector`.
```
### Scenario C: PostgreSQL System — Customer Preferences in JSONB, Queried Frequently
**Trigger:** "We're adding a filter UI for customer preferences and it's slow."
**Process:**
1. Read schema — `customers.preferences` column is `JSONB`.
2. Grep SQL files — find `WHERE preferences->>'theme' = 'dark'` and `WHERE preferences->>'notifications' = 'email'`.
3. Grep migrations — find a migration that renames a JSON key inside `preferences`.
4. Severity: Medium-High (query-loss: querying inside LOB; versioning trap evident).
**Output snippet:**
```markdown
### AP-03: Serialized LOB Overuse — customers.preferences
- **Location:** `db/migrations/20240301_create_customers.sql` (column def),
`src/queries/customer_filters.sql` (WHERE clause at line 12)
- **Evidence:** `customers.preferences JSONB` column filtered via `->>'theme'` and
`->>'notifications'` operators; migration 20240512 renames `notif_type` → `notifications`.
- **Consequence:** JSONB filters cannot use an index unless an expression or GIN index is
added; adding such indexes is schema-within-schema churn; a past migration already renamed
a JSON key (versioning trap confirmed).
- **Remediation:** Extract `theme VARCHAR(20)`, `notifications_channel VARCHAR(20)` as
real columns; retain `preferences JSONB` only for content never filtered directly.
Cross-ref: `object-relational-structural-mapping-guide`.
```
## References
- `references/anti-pattern-detection-cheatsheets.md` — Per-stack grep patterns, evidence
classification tables, and diagnostic tests for all six anti-pattern types.
- Fowler, *PEAA* Chapter 3: "Reading in Data" — N+1 avoidance rules; finder method placement.
- Fowler, *PEAA* Chapter 11: "Lazy Load" — Ripple loading definition; proxy identity trap.
- Fowler, *PEAA* Chapter 11: "Identity Map" — Single-instance guarantee; first-level cache.
- Fowler, *PEAA* Chapter 10: "Active Record", "Data Mapper" — Isomorphism requirement;
When to Use It sections.
- Fowler, *PEAA* Chapter 12: "Identity Field" — Meaningful vs meaningless keys.
- Fowler, *PEAA* Chapter 12: "Serialized LOB" — Queryability loss; versioning trap.
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Patterns of Enterprise
Application Architecture by Martin Fowler, David Rice, Matthew Foemmel, Edward Hieatt,
Robert Mee, Randy Stafford.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-data-source-pattern-selector`
- `clawhub install bookforge-lazy-load-strategy-implementer`
- `clawhub install bookforge-object-relational-structural-mapping-guide`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
FILE:references/anti-pattern-detection-cheatsheets.md
# Anti-Pattern Detection Cheatsheets
Reference for `data-access-anti-pattern-auditor`. Each section provides per-stack
grep/search patterns and evidence classification for one anti-pattern.
---
## AP-1: N+1 / Ripple Loading
### Grep signatures (any ORM)
```
# Loop + lazy access — generic
forEach|for\s+\w+\s+in\s+|\.each\s*\{|\.map\s*\{|\.stream\(\)
# Inside the loop — look for DB access
\.find\(|\.query\(|\.fetch\(|\.load\(|\.count\(|\.size\(|\.all\b|SELECT
```
### Rails / ActiveRecord
```ruby
# Bad: N+1 smell
orders.each { |o| o.line_items.count } # one COUNT query per order
# Good: eager load, then use size (count would still query once per order)
Order.includes(:line_items).each { |o| o.line_items.size }
```
Grep: `has_many\|belongs_to` combined with loop constructs that do not use `.includes` or `.eager_load`.
### Hibernate / JPA
```java
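// Bad: N+1 smell
for (Order o : orders) { o.getLineItems().size(); } // initializes the lazy collection per order
// Good: fetch join on a Spring Data repository method (method name is illustrative)
@Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.lineItems")
List<Order> findAllWithLineItems();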
```
Grep: `@OneToMany` (lazy by default) or any mapping with an explicit `fetch = FetchType.LAZY`,
then check whether the relationship is accessed inside a loop without `JOIN FETCH` or `@BatchSize`.
### SQLAlchemy (Python)
```python
# Bad
for order in session.query(Order).all():
    print(order.line_items)  # lazy SELECT per order
# Good
from sqlalchemy.orm import joinedload
session.query(Order).options(joinedload(Order.line_items)).all()
```
Grep: `relationship(` without `lazy='joined'`, `lazy='selectin'`, or `lazy='subquery'`, combined with query sites that lack `options(joinedload(...))` or `options(selectinload(...))`.
### EF Core (C#)
```csharp
// Bad
foreach (var o in context.Orders.ToList())
    Console.WriteLine(o.LineItems.Count); // lazy navigation
// Good
var orders = context.Orders.Include(o => o.LineItems).ToList();
```
Grep: `virtual ICollection` combined with loop that accesses the property without `.Include`.
### Evidence classification
- **Definite**: Loop body contains explicit `find`, `query`, or ORM lazy proxy access on
a field that maps to a FK relationship, AND no eager-load option is present.
- **Probable**: Loop iterates collection of entities; inner body accesses navigation property.
- **Possible**: `lazy=True` or `FetchType.LAZY` in model definition — flag for review.
---
## AP-2: Ghost / Proxy Identity Trap
### Grep signatures
```
# Equality comparison on entity/domain objects
==\s*\w+Entity|equals\(.*entity|assertSame\(|assert.*==.*proxy
# Multiple sessions / detached entities
new Session\(|new DbContext\(|sessionFactory\.openSession\(\)
# within the same request/transaction — signals multiple identity maps in one transaction
```
### Evidence classification
- **Definite**: Two `find(sameId)` calls in the same request that return different object
references (checked via reference equality); OR detached entity used after session close.
- **Probable**: Multiple `Session`/`DbContext` instances opened in the same request scope.
- **Possible**: `equals()` / `__eq__` not overridden on entity classes (all proxy comparisons
use reference equality by default).
### Diagnostic test (manual)
```python
a = session.get(Person, 1)  # SQLAlchemy 1.4+; legacy code uses session.query(Person).get(1)
b = session.get(Person, 1)
assert a is b  # should pass with Identity Map; fails if bypassed
```
---
## AP-3: AR / Data Mapper Mismatch
### Signal A — AR on non-isomorphic schema
Grep for conversion methods inside AR classes:
```
to_domain\|from_record\|to_entity\|from_model\|toDomain\|fromRecord
```
Grep for domain inheritance inside AR classes:
```
class \w+ < ApplicationRecord.*\n.*STI|type_column|polymorphic
```
Schema divergence signals:
- Domain field names differ systematically from DB column names (snake_case domain ≠ DB abbrev).
- Value Objects referenced from AR (Money, Address, DateRange) mapped via `composed_of` / `@Embeddable`
inside what is supposed to be a simple AR.
### Signal B — DM with no structural complexity
Grep for mapper classes that are pure field-copy:
```python
# SQLAlchemy mapper with no transform
mapper(Person, people_table, properties={'first_name': people_table.c.first_name, ...})
# every property is identity — no transformation
```
- Mapper class count equals table count with no complex mappings → likely over-engineered for CRUD app.
### Evidence classification
- **Definite**: AR class contains `to_domain` / `from_record` methods AND schema has
inheritance or value objects.
- **Probable**: AR class has explicit column-name-to-field-name mappings that differ in structure
(not just naming convention).
- **Possible**: AR class exceeds 200 lines with heavy methods — worth reviewing for DM candidate.
---
## AP-4: Serialized LOB Overuse
### Grep signatures
```sql
-- Querying inside LOB
WHERE config LIKE '%theme%'
WHERE prefs->>'setting' = 'value'
jsonb_extract_path_text(data, 'field') = 'value'
CAST(xml_col AS TEXT) LIKE '%<status>active%'
```
```
# Schema definition — suspicious column types
TEXT|BLOB|JSONB|CLOB|XML.*column
# AND referenced in WHERE/ORDER BY/JOIN
```
```python
# Deserialization inside a finder — LOB is being filtered after fetch
[x for x in session.query(Config).all() if json.loads(x.data)['theme'] == 'dark']
```
### Evidence classification
- **Definite**: LOB column appears in a WHERE clause (SQL filtering inside the LOB) or is
deserialized in application code to filter before returning to caller.
- **Probable**: LOB column has a GIN index, indexed expression, or computed column — indicates
pressure to query inside it.
- **Possible**: Column type is TEXT/BLOB/JSONB with more than ~3 application reads that each
parse/deserialize the content.
### Versioning trap indicator
- Migration file modifies structure of content inside a LOB column (e.g., renames a JSON key,
adds a required nested field) → schema-within-schema churn.
---
## AP-5: Meaningful Primary Key Leakage
### Grep signatures
```sql
-- PK is a business value
CREATE TABLE orders (order_number VARCHAR(20) PRIMARY KEY, ...)
CREATE TABLE employees (ssn CHAR(9) PRIMARY KEY, ...)
CREATE TABLE line_items (order_number VARCHAR(20), seq INT, PRIMARY KEY (order_number, seq))
```
```python
# Domain object exposes PK as stable business ID
def get_order_number(self):
    return self.id  # id IS the business key
```
```
# FK references to business-meaningful PK
REFERENCES orders(order_number)
REFERENCES employees(ssn)
```
### Evidence classification
- **Definite**: PK column name is `email`, `ssn`, `order_number`, `username`, `code`, `sku`,
or a composite of business-meaningful components.
- **Probable**: PK is a VARCHAR or CHAR type (surrogate keys are almost always INT/BIGINT/UUID).
- **Possible**: Composite PK with more than one column — check if components carry business meaning.
---
## AP-6: Business Logic in Gateway
### Grep signatures
```python
# Gateway method names that suggest domain logic (bodies elided)
class CustomerGateway:
    def apply_loyalty_discount(self, *args): ...  # business rule
    def calculate_tax(self, *args): ...           # business calculation
    def validate_credit_limit(self, *args): ...   # validation
```
```java
// TDG method beyond CRUD
public class OrderGateway {
    public BigDecimal computeTotal(long orderId) { ... } // domain logic
    public boolean isEligibleForPromotion(long orderId) { ... }
}
```
### CRUD boundary definition
A Gateway's legal methods: `find*(...)` (including `findBy*(...)`), `get*(...)`, `insert(...)`,
`update(...)`, `delete(...)`, and `save(...)`. Any method that does not fit these templates is a candidate for leakage.
### Evidence classification
- **Definite**: Gateway method contains conditional business rules, calculations, or validation
that references business policy (discount %, tax rate, eligibility rule).
- **Probable**: Gateway method name is a verb other than find/insert/update/delete.
- **Possible**: Gateway method has more than ~10 lines of logic beyond SQL construction.
---
## Severity Ranking Reference
| Anti-Pattern | Severity | Primary Risk |
|---|---|---|
| Missing Identity Map / proxy identity trap | Critical | Data integrity: double-write, lost update |
| N+1 / Ripple Loading | High | Performance: 1+N queries (one round-trip per parent row) |
| AR / DM Mismatch | High | Correctness + maintainability |
| Serialized LOB overuse | Medium-High | Queryability + versioning |
| Meaningful key leakage | Medium | Stability: cascade updates, uniqueness collisions |
| Business logic in Gateway | Medium | Maintainability: test isolation, logic scatter |
---
## Remediation Cross-References
| Anti-Pattern | Primary BookForge Skill |
|---|---|
| N+1 / Ripple Loading | `lazy-load-strategy-implementer` |
| AR / DM Mismatch | `data-source-pattern-selector` |
| Serialized LOB | `object-relational-structural-mapping-guide` |
| Meaningful key | `object-relational-structural-mapping-guide` |
| Business logic in Gateway | `data-source-pattern-selector` |
| Missing Identity Map | `unit-of-work-implementer` (if built) |
Assess whether a codebase situation warrants refactoring and determine the right approach before any structural changes begin. Use this skill when a develope...
---
name: refactoring-readiness-assessment
description: "Assess whether a codebase situation warrants refactoring and determine the right approach before any structural changes begin. Use this skill when a developer is about to modify existing code and needs to decide: should I refactor first, refactor not at all, or rewrite entirely? Triggers include: developer is adding a feature and the existing code is hard to understand or extend; developer just received a bug report and suspects the code structure is hiding more bugs; a code review has surfaced design concerns and the team wants concrete guidance; code appears to have been copied more than twice in similar form; developer is unsure whether to clean up code before a deadline; codebase uses a published interface or is tightly coupled to a database schema and the developer wants to know the constraints before restructuring; developer suspects the code is so broken it cannot be stabilized without a full rewrite. This skill produces a structured go/no-go assessment and session plan — it does not apply any refactoring itself. Use code-smell-diagnosis after this skill to identify specific smells, then individual refactoring skills to apply transformations."
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/refactoring-readiness-assessment
metadata: {"openclaw":{"emoji":"🔍","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck", "John Brant", "William Opdyke", "Don Roberts"]
chapters: [2, 15]
tags: [refactoring, code-quality, software-design]
depends-on: []
execution:
tier: 1
mode: plan-only
inputs:
- type: description
description: "User's description of the code situation: what they are trying to do (add feature, fix bug, review, clean up), what the code looks like, and any constraints (deadline, published interface, database coupling)"
tools-required: []
tools-optional: [Read]
mcps-required: []
environment: "Any agent environment; user supplies context in text form or answers guided questions. No code changes are made during this skill."
discovery:
goal: "Produce a structured refactoring readiness assessment: go/no-go decision, opportunity trigger classification, constraint inventory, Two Hats protocol plan, and session discipline rules for this specific situation"
tasks:
- "Identify which of the four opportunity triggers applies to the current situation"
- "Check for hard blockers (non-stabilizable code, published interfaces, deadline proximity, database coupling)"
- "Apply the refactor-vs-rewrite heuristic if scale of decay is uncertain"
- "Determine Two Hats sequencing: which hat first, what the boundary is, how to track hat switches"
- "Define session discipline rules: goal statement, step size, stop condition, backtrack trigger, pairing recommendation"
- "Produce the go/no-go recommendation with rationale and next steps"
audience: "Software developers, senior developers, and tech leads deciding whether and how to refactor existing code before making changes"
triggers:
- "Developer is about to add a feature and the code is hard to follow"
- "Bug report received and code structure may be obscuring more bugs"
- "Code review surfaced design concerns requiring restructuring guidance"
- "Same or similar code appears in three or more places"
- "Developer is unsure whether cleanup is worth the time before a deadline"
- "Code may be too broken to stabilize safely for refactoring"
- "Published API or database schema may constrain refactoring options"
---
# Refactoring Readiness Assessment
## When to Use
You are about to change existing code — adding a feature, fixing a bug, doing a code review, or cleaning up — and you are not sure whether to refactor first, refactor not at all, or start from scratch.
This skill runs before any code changes. It answers three questions:
1. **Should we refactor?** (Go / No-go)
2. **If yes, what approach?** (Opportunity trigger, Two Hats plan, session rules)
3. **What are the risks?** (Constraint inventory, stop conditions, backtrack rules)
**What this skill does NOT do:**
- It does not apply any refactoring transformations (use `method-decomposition-refactoring`, `conditional-simplification-strategy`, or other catalog skills for that)
- It does not diagnose specific code smells (use `code-smell-diagnosis` after this assessment)
- It does not plan a large multi-session effort (use `big-refactoring-planner` for codebase-scale work)
---
## Context and Input Gathering
### Required (ask if not provided)
- **Immediate task:** What are you trying to do right now — add a feature, fix a bug, review someone's code, or clean up in general?
-> Ask: "What brought you to the code today?"
- **Code description:** What does the code look like? Long methods, duplicated blocks, tangled conditionals, hard-to-name concepts?
-> Ask: "Describe the code you are looking at. What makes it difficult to work with?"
- **Test coverage:** Do you have automated tests that verify the current behavior?
-> Ask: "If you change the code, how do you know it still works?" (This is the single most important safety question.)
### Useful (gather if present)
- **Deadline context:** Is there a deployment, release, or sprint deadline within the next few days?
- **Interface visibility:** Is the code behind a published API, a library interface, or a database schema that external code depends on?
- **Size estimate:** Is this a single method, a class, a module, or a subsystem?
- **Prior rewrite history:** Has this code been rewritten before without improving it?
---
## Process
### Step 1 — Classify the Opportunity Trigger
**Why:** Fowler identifies four specific situations where refactoring is most valuable. Knowing which trigger applies sets the scope and urgency of the refactoring session. Refactoring without a trigger is speculative and hard to time-box.
Identify which trigger best matches the user's situation:
**Trigger A: Rule of Three (Don Roberts)**
The same or structurally similar logic appears in three or more places. The third occurrence is the signal to refactor — not the first (do it), not the second (wince but proceed), but the third.
- Signal: Developer says "I keep copy-pasting this," or three methods share identical structure with minor variations.
- Scope: Extract the shared pattern. Eliminate duplicates. The refactoring is bounded by the duplication.
**Trigger B: Refactor When Adding a Feature**
The code is hard to understand or extend for the planned addition. Refactoring first makes the feature addition faster, not slower — once the code is well-structured, adding the feature is straightforward.
- Signal: Developer says "I need to add X but I can't figure out where it goes" or "the design doesn't support this."
- Scope: Refactor only until the code is ready to receive the new feature. Stop when the feature can be added cleanly. The feature is not added during the refactoring hat.
**Trigger C: Refactor When Fixing a Bug**
A bug report is a sign that code was not clear enough to reveal the bug during development. Refactoring to understand the code often exposes the bug itself, and the improved structure prevents similar bugs.
- Signal: Developer is reading code to understand how a bug could exist. Refactoring improves that understanding, and the clarified code makes the bug visible.
- Scope: Refactor enough to make the logic visible and the bug obvious. The bug fix itself happens after switching to the adding-function hat.
**Trigger D: Refactor During Code Review**
Code review is the ideal moment to suggest structural improvements while the author is present and the code is being read fresh. Implementing suggestions in the review session (small review groups) creates more concrete outcomes than suggestions alone.
- Signal: Review session has identified design concerns or clarity issues beyond style. A single reviewer and the original author are present.
- Scope: Apply refactorings that can be demonstrated during the review session. Defer large structural changes to a separate session with a stated goal.
Record the trigger classification. If more than one applies, use the primary motivation for this session.
---
### Step 2 — Check Hard Blockers
**Why:** Some situations make refactoring unsafe, premature, or impossible to complete successfully. Proceeding into these situations without acknowledging the constraints leads to half-finished refactorings that leave the code worse than before.
Run through each blocker category:
**Blocker 1: Non-stabilizable code (rewrite candidate)**
The code does not work correctly and cannot be made to work mostly correctly before refactoring.
- Test: Can you run the existing code, write tests that capture its current behavior, and get those tests passing — before touching the structure?
- If no: Do not refactor. The code cannot be made safe for transformation. Evaluate the rewrite heuristic in Step 3 instead.
- Signal phrase: "The code is so full of bugs that I can't even figure out what it's supposed to do."
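The stabilization test above can be sketched as a characterization test: tests that record what the code does today, before any structural change. `legacy_shipping_cost` is a hypothetical stand-in for the existing code, not something from the source.

```python
import unittest


def legacy_shipping_cost(weight_kg: float, express: bool) -> float:
    """Stand-in for existing code whose structure we want to change later."""
    cost = 5.0 + 1.5 * weight_kg
    if express:
        cost = cost * 2
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pin down what the code does today, whether or not it is ideal."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(2.0, express=False), 8.0)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(2.0, express=True), 16.0)


def baseline_is_green() -> bool:
    """Run the characterization tests and report whether they all pass."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostCharacterization)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

If `baseline_is_green()` cannot be made to return `True` without changing the code under test, the blocker applies.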
**Blocker 2: Deadline proximity**
A release, deployment, or hard commitment is within a few days.
- Fowler's rule: Near a deadline, the productivity gain from refactoring would appear after the deadline, too late to matter. Ward Cunningham describes unfinished refactoring as design debt: you can carry it, but the interest payments (maintenance cost, slower development) will come due.
- If deadline is imminent: Do not start a refactoring session. Make a note of the debt explicitly. Schedule refactoring for after the delivery.
- Exception: A tiny, clearly bounded refactoring (e.g., renaming one confusing variable) that takes under five minutes is acceptable even near a deadline. Anything requiring structural changes is not.
**Blocker 3: Published interfaces**
The code exposes methods or APIs that external code depends on and that you cannot find and change everywhere.
- Risk: Many refactorings change interfaces. Rename Method on a published API breaks callers you do not control.
- Approach if present: You can still refactor internal implementations. You cannot remove or rename the published interface without a migration strategy (keep old interface, call new one, deprecate, remove in a later version). Add this constraint explicitly to the assessment output.
- Tip from Fowler: Do not publish interfaces prematurely. Modify code ownership policies so people can change any caller before publishing. Once published, the refactoring cost multiplies.
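One possible shape for the migration strategy described above (keep the old interface, call the new one, deprecate). `OrderService`, `getTot`, and `total_for` are hypothetical names chosen for illustration.

```python
import warnings


class OrderService:
    def total_for(self, order_id: int) -> float:
        """New, clearer name. Internal callers are updated to use this."""
        return self._compute_total(order_id)

    def getTot(self, order_id: int) -> float:
        """Old published name, kept so external callers keep working."""
        warnings.warn(
            "getTot() is deprecated; use total_for() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.total_for(order_id)

    def _compute_total(self, order_id: int) -> float:
        # Placeholder calculation standing in for the real logic.
        return 100.0 + order_id
```

The old method delegates rather than duplicating logic, so internal refactoring can continue behind both names until the deprecated one is removed in a later version.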
**Blocker 4: Database schema coupling**
The code is tightly coupled to a relational database schema, and refactoring the object model would require a data migration.
- Risk: Schema changes force data migration, which can be a long and fraught task. Data migration in production is high-risk and often irreversible.
- Approach if present: Place a separate isolation layer between the object model and the database schema. Refactor to that layer first, so changes to one model do not force immediate changes to the other. Do not move fields across persistence boundaries without an explicit migration plan.
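A minimal sketch of such an isolation layer, assuming a hypothetical `Customer` object and invented column names (`first_nm`, `last_nm`, `email_addr`). Only the mapper knows the schema, so the object model can be reshaped without touching stored data.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Object model: free to be reshaped by refactoring."""
    full_name: str
    email: str


class CustomerMapper:
    """Isolation layer: the only code that knows the column names."""

    @staticmethod
    def from_row(row: dict) -> Customer:
        return Customer(
            full_name=f"{row['first_nm']} {row['last_nm']}",
            email=row["email_addr"],
        )

    @staticmethod
    def to_row(customer: Customer) -> dict:
        # Split on the first space only; a simplification for this sketch.
        first, _, last = customer.full_name.partition(" ")
        return {"first_nm": first, "last_nm": last, "email_addr": customer.email}
```

Here the object model merged two columns into one field while the table stayed as it was; the mapper absorbs the difference.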
If one or more hard blockers are present, the assessment may be a partial go (proceed with constraints explicitly stated) or a no-go.
---
### Step 3 — Apply the Refactor-vs-Rewrite Heuristic
**Why:** Fowler is explicit that sometimes the right answer is to start over rather than refactor. But "this is a mess" is not a sufficient reason to rewrite — rewrites have their own risks, and partial rewrites that decompose by component are often better than full rewrites.
Apply this heuristic only if Blocker 1 (non-stabilizable code) triggered or if the code appears fundamentally broken:
**Signal that rewrite is warranted:**
- The existing code does not work and cannot be made to work mostly correctly
- It would be easier to start from scratch than to salvage the existing structure
- Trying to refactor it would take longer and be riskier than a rewrite
**Fowler's compromise route:**
Do not rewrite the whole system. Decompose it into components with strong encapsulation. Evaluate the refactor-vs-rewrite decision one component at a time. A component that is too broken to stabilize can be rewritten. Components that are poorly structured but functional can be refactored.
**Output of Step 3:**
- For each major component: Refactor (testable, structurally redeemable) or Rewrite (non-stabilizable, easier from scratch)
- Note: A mixed recommendation is common and valid. Do not let one broken component force a full-system rewrite decision.
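The per-component decision can be sketched as below. The single `stabilizable` flag is a simplifying assumption; in practice it is a team judgment, not something code can compute. The component names are borrowed from Example C later in this document.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    stabilizable: bool  # can passing tests capture its current behavior?


def recommend(component: Component) -> str:
    """Refactor what can be stabilized; rewrite only what cannot."""
    return "Refactor" if component.stabilizable else "Rewrite"


components = [
    Component("BillingCalculator", stabilizable=False),
    Component("InvoiceFormatter", stabilizable=True),
    Component("PaymentGatewayAdapter", stabilizable=True),
]
plan = {c.name: recommend(c) for c in components}
```

Evaluating each component separately is what produces a mixed plan instead of a single system-wide verdict.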
If no blockers from Step 2 triggered and the code works, skip this step.
---
### Step 4 — Define the Two Hats Protocol
**Why:** Kent Beck's Two Hats metaphor captures the most common refactoring failure mode: developers mix structural cleanup with feature additions or bug fixes, losing track of which changes do what. When a test fails during mixed-hat work, it is impossible to know whether the refactoring broke something or the feature change introduced a bug. Keeping the hats separate preserves this diagnostic clarity.
**The Two Hats:**
- **Adding-function hat:** You add new behavior. You do not change existing code structure. You add tests. You add capabilities. Observable behavior changes.
- **Refactoring hat:** You restructure existing code. You do not add new behavior. You do not add tests (unless you find a case you missed). Observable behavior must not change. Tests pass before and after every step.
**Rules:**
1. You wear exactly one hat at a time. Never both simultaneously.
2. When you realize you need to switch hats, finish your current step, verify tests pass, then switch.
3. Keep a list of hat-switch requests — things you notice that need doing in the other hat. Do not act on them immediately. Act on them when you switch hats.
**For this session, determine:**
- Which hat do you wear first?
  - Trigger B (adding feature): Refactoring hat first. Switch to adding-function hat once the code is ready.
  - Trigger C (fixing bug): Refactoring hat first to understand the code. Switch to adding-function hat to apply the fix.
  - Trigger A (Rule of Three) or Trigger D (code review): Refactoring hat only. No feature work.
- What is the hat-switch boundary?
  - State it explicitly: "I will switch to adding-function hat when [specific condition]."
  - Example: "I will switch when the OrderProcessor class has a single clear method for each processing step and all existing tests pass."
- How will you track hat-switch requests?
  - Keep a notepad (physical or digital) for "things to do in the other hat." Write observations there during a session. Do not act on them.
---
### Step 5 — Define Session Discipline
**Why:** Kent Beck's session rules from Chapter 15 are the most commonly skipped part of refactoring practice and the source of most refactoring failures. Without them, developers refactor for too long without testing, get lost, cannot find the source of a test failure, and abandon the session mid-stream — leaving the code in a worse state than when they started. The rules are designed to make stopping and backtracking safe rather than shameful.
Define these five rules for this specific session:
**Rule 1: Pick a goal**
State the refactoring goal in one sentence before touching the code. "I am refactoring to [specific outcome] so that [specific next task] becomes possible."
- The goal bounds the session. Refactoring that does not serve the stated goal is deferred.
- Example: "I am refactoring to extract the fee calculation from OrderProcessor so that I can add the promotional pricing feature without touching the core calculation."
**Rule 2: Move in small steps**
Each step is the smallest transformation that leaves the code in a consistent, passing state. Run tests after each step — not after every three steps, not at the end.
- Why: When a test fails, you want to know exactly which one-step change caused it. If you made three changes before running tests, you have a debugging problem, not a refactoring session.
- If a step feels too large, decompose it into smaller steps.
**Rule 3: Stop when unsure**
If you cannot prove to yourself that the current step preserves behavior, stop. Do not proceed.
- If the code is already better than when you started: commit or save what you have, then stop. Integrate and release.
- If the code is not better: throw away your changes. Start again with a smaller step or a more bounded goal.
**Rule 4: Backtrack to last passing state**
If a test fails and you are not certain which change caused it: do not debug. Backtrack.
- Go back to the last known good configuration where all tests passed.
- Replay your changes one by one, running tests after each. The failing test will identify the specific change.
- Why: An hour of changes can be replayed in about ten minutes. Debugging a failure you cannot explain can easily take longer than the original work, and you still may not find the root cause.
- Record this rule explicitly: "If any test fails and I cannot immediately identify the cause, I will revert to last passing state and replay."
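The replay procedure can be modeled abstractly as follows. This is a sketch, not a tool: in a real session the "state" would be a working tree, each "change" a small commit or patch, and `tests_pass` a run of the test suite. All names here are hypothetical.

```python
from typing import Callable, Optional, Sequence, TypeVar

State = TypeVar("State")


def replay(
    last_good: State,
    changes: Sequence[Callable[[State], State]],
    tests_pass: Callable[[State], bool],
) -> Optional[int]:
    """Reapply changes one at a time from the last passing state.

    Returns the index of the first change after which the tests fail,
    or None if every change replays cleanly.
    """
    state = last_good
    for index, change in enumerate(changes):
        state = change(state)
        if not tests_pass(state):
            return index
    return None
```

Because the tests run after every single change, the returned index names exactly one step, which is the property that makes backtracking cheaper than debugging.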
**Rule 5: Work in pairs (recommended)**
Pair programming provides three benefits specific to refactoring: it keeps step sizes small (partner observes drift), it provides a second opinion on when to stop, and it provides quiet confidence when you are uncertain. Your partner catches the moment you switch from confident steps to uncertain ones.
- Assess: Is pair programming available for this session? If not, what substitute applies? (Time-boxing the session, or committing after each step as a proxy for a pair check-in.)
---
### Step 6 — Produce the Assessment Output
**Why:** The assessment must be tangible and actionable. A verbal judgment ("yeah, refactor it") is not sufficient. The output is a document the developer can refer to during the session — especially the hat-switch boundary and the backtrack rule, which are hardest to remember under pressure.
Write the assessment in this structure:
---
**REFACTORING READINESS ASSESSMENT**
**Situation:** [One sentence describing the code and the immediate task]
**Opportunity Trigger:** [A / B / C / D] — [name and brief rationale]
**Go / No-Go Recommendation:** [Go / No-Go / Conditional Go]
[2-3 sentences explaining the recommendation]
**Constraint Inventory:**
- Tests available: [Yes / No / Partial — what needs to be added before proceeding]
- Published interfaces: [None / Present — [which ones, constraints]]
- Database coupling: [None / Present — [migration risk level]]
- Deadline: [None within session / Imminent — [do not proceed recommendation]]
**Refactor vs. Rewrite:**
- [If applicable] Component X: Refactor | Rewrite — [rationale]
- [If no blockers triggered] Not evaluated — code is stabilizable
**Two Hats Plan:**
- First hat: [Refactoring | Adding-function]
- Hat-switch boundary: [Explicit condition]
- Hat-switch request tracker: [How to record observations during the session]
**Session Discipline Rules:**
- Goal: [One-sentence refactoring goal]
- Step size: [Specific — e.g., "one method extraction per step, tests after each"]
- Stop condition: [When to stop for the session]
- Backtrack trigger: [Any failing test I cannot immediately explain → revert to last passing state]
- Pairing: [Yes / No — if no, substitute]
**Next Steps:**
1. [First concrete action: e.g., "Add tests for OrderProcessor.process() to establish baseline"]
2. [Second action: e.g., "Run code-smell-diagnosis on OrderProcessor to identify which smells to address"]
3. [Third action: e.g., "Apply method-decomposition-refactoring to extract fee calculation"]
---
## Key Principles
**Refactoring is not a scheduled activity.** Fowler is explicit: do not allocate two weeks every few months to refactoring. Refactoring happens in small bursts, triggered by real tasks. You refactor because you want to do something else, and refactoring helps you do that other thing.
**The code must work before you refactor.** Refactoring restructures working code. Code that does not work mostly correctly cannot be safely transformed — every refactoring step's safety depends on tests that confirm behavior is preserved.
**Design debt accrues interest.** Ward Cunningham's framing: unfinished refactoring is debt. Some debt is necessary to function. But the interest payment is the extra cost of maintenance and slow development caused by overly complex code. When the payments become too great, you are overwhelmed.
**Not having enough time to refactor is a sign you need to refactor.** Tight schedules that prevent cleanup cause the conditions that make schedules tight. If you never have time to clean up, you are paying the interest on accumulated design debt.
**A big refactoring is a recipe for disaster.** Beck's warning from Chapter 15 is direct: when you see all the problems at once and want to clean up everything in sight, resist. Nibble at the problem — take a few minutes to clean up an area when you are about to add functionality there. A three-month cleanup halt is not acceptable to any organization and they are right to refuse it.
---
## Examples
### Example A: Feature Addition (Trigger B)
**Situation:** Adding promotional pricing to OrderProcessor. The method is 200 lines long, mixes fee calculation, tax computation, and discount application in a single block.
**Assessment:**
- Trigger: B (Refactor when adding a feature)
- Go: Yes — tests exist for current behavior, no published interface, no deadline this week
- First hat: Refactoring
- Hat-switch boundary: "When OrderProcessor has separate methods for fee calculation, tax computation, and discount application and all tests pass"
- Goal: "Extract fee calculation into its own method so the promotional pricing hook has a single, clear insertion point"
- Backtrack rule: "Any failing test I cannot explain in thirty seconds → revert to last commit"
- Next step: Run code-smell-diagnosis on OrderProcessor, then apply method-decomposition-refactoring
---
### Example B: Bug Fix (Trigger C)
**Situation:** Bug report: "discounts not applying correctly for international orders." Code mixes currency handling with discount logic in several places.
**Assessment:**
- Trigger: C (Refactor when fixing a bug)
- Conditional Go: Tests exist, but they do not cover international order paths — add tests first
- First hat: Refactoring hat (to understand the code and make the logic visible)
- Hat-switch boundary: "When the discount application logic is isolated and I can see exactly where the international order path diverges"
- Goal: "Clarify the discount calculation path well enough that the bug becomes visible"
- Stop condition: "Stop refactoring the moment the bug is visible — switch to adding-function hat to fix it"
- Backtrack rule: Standard — revert to last passing state on any unexplained failure
---
### Example C: Rewrite Candidate
**Situation:** Legacy billing module. Known to produce incorrect results in certain scenarios. Cannot write tests that pass because the existing behavior is wrong.
**Assessment:**
- Trigger: None of A/B/C/D applies cleanly — the code does not work
- No-Go for refactoring: Cannot stabilize the code before refactoring
- Refactor vs. Rewrite: Evaluate component by component
  - BillingCalculator: Rewrite (non-stabilizable, too broken to test)
  - InvoiceFormatter: Refactor (works correctly, poor structure only)
  - PaymentGatewayAdapter: Refactor (stable behavior, interface can be preserved)
- Next step: Isolate InvoiceFormatter and PaymentGatewayAdapter first; rewrite BillingCalculator behind a clean interface boundary
---
### Example D: Deadline Proximity (No-Go)
**Situation:** Release is tomorrow. Developer wants to clean up the authentication module because it has been bothering them.
**Assessment:**
- No-Go: Productivity gain from refactoring would land after the deadline, not before
- Action: Make a note of the specific concerns (e.g., "AuthService.authenticate() is 150 lines mixing session management with credential validation"). Schedule refactoring for the next working day after release.
- Rule: Even a "quick" structural refactoring near a deadline tends to expand into unexpected scope. The right time to have done this was before the deadline pressure began.
---
## References
- `references/two-hats-protocol.md` — Extended Two Hats guidance including hat-switch request tracking templates and pair programming patterns for refactoring sessions
- `references/refactoring-constraints.md` — Detailed guidance on published interface migration strategies, database schema isolation layers, and design debt management
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler with Kent Beck, John Brant, William Opdyke, and Don Roberts.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-method-decomposition-refactoring`
- `clawhub install bookforge-big-refactoring-planner`
- `clawhub install bookforge-build-refactoring-test-suite`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Optimize code performance by first refactoring to a well-factored structure, then running a profiler to find actual hot spots, and applying targeted optimiza...
---
name: profiling-driven-performance-optimization
description: Optimize code performance by first refactoring to a well-factored structure, then running a profiler to find actual hot spots, and applying targeted optimizations only where the profiler points — never by guessing. Use this skill when users report the program is too slow, before any performance work begins on an unfactored codebase, or after refactoring is complete and performance must now be tuned to acceptable levels.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/profiling-driven-performance-optimization
metadata: {"openclaw":{"emoji":"⚡","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler"]
chapters: [2]
tags: [refactoring, performance, code-quality]
depends-on: []
execution:
tier: 3
mode: hybrid
inputs:
- type: codebase
description: "A working program or module that runs too slowly or consumes too much memory, with or without existing tests."
tools-required: [Read, Write, Bash]
tools-optional: []
mcps-required: []
environment: "Working codebase with a profiler available for the language. Output: optimized code with before/after profiling data showing improvement at the identified hot spots."
discovery:
goal: "Reach performance that satisfies users by targeting only the small fraction of code that actually consumes most of the time, leaving the rest untouched."
tasks:
- "Confirm the codebase is in a well-factored state before beginning optimization"
- "Establish a baseline: run the profiler on real or representative workloads"
- "Identify the hot spots — the methods or code paths consuming the most time or memory"
- "Apply optimizations in small, targeted steps to the identified hot spots only"
- "Re-profile after each change to confirm measurable improvement"
- "Back out any change that does not produce a measurable improvement"
- "Continue until performance satisfies the users"
audience: "developers, engineers, performance engineers, anyone tasked with making a program faster or more memory-efficient"
when_to_use: "When a program's performance is unacceptable and optimization work is beginning, or when developers are tempted to optimize code speculatively before measuring"
environment: "Working codebase. A profiler must be available (see Context and Input Gathering). The code should have a test suite to protect against regressions during optimization."
quality: placeholder
---
# Profiling-Driven Performance Optimization
## When to Use
You are being asked to make a program faster or reduce its memory usage, and one of these is true:
- Users or stakeholders have reported the program is too slow
- Performance testing has produced unacceptable numbers
- You are about to begin a performance optimization pass after feature work is complete
- A developer is speculating about what might be slow and wants to optimize "obvious" bottlenecks before measuring
This skill applies to any context where performance matters but you are not operating under hard real-time constraints (heart pacemakers, flight control systems). For hard real-time systems, time-budgeting per component is appropriate — that is a different technique. For the vast majority of software — web services, data pipelines, desktop applications, CLIs, APIs, batch jobs — this profiling-based approach is the correct one.
**The core insight (Fowler's principle):** "The secret to fast software, in all but hard real-time contexts, is to write tunable software first and then to tune it for sufficient speed." Well-factored code is not just easier to read — it is easier to optimize, because profilers can pinpoint individual methods rather than tangled blocks.
Before starting, confirm you have:
- A working program (it runs correctly)
- A profiler available for the language and runtime
- A representative workload or benchmark to run under the profiler
- A test suite (strongly recommended) to catch regressions during optimization
---
## Context and Input Gathering
### The Three Approaches to Performance — Know Which One You Are Avoiding
Before beginning optimization work, establish which approach is being proposed and steer toward the correct one:
| Approach | Description | When appropriate |
|---|---|---|
| **Constant attention** | Every developer, all the time, keeps performance in mind and makes micro-optimizations during regular development | Almost never. Spreads optimization throughout the codebase, increases development cost, and makes code harder to change — with most of the effort going to code that is not actually slow. |
| **Time budgeting** | Decompose the design into components; assign each a time/memory budget that must not be exceeded | Hard real-time systems only (medical devices, avionics, embedded with strict latency guarantees). Overkill for ordinary software. |
| **Profiling-based** | Build the program in a well-factored manner first; then enter a dedicated optimization stage driven entirely by profiler data | The correct approach for nearly all software. This is what this skill implements. |
If the developer is proposing constant attention ("I'll just be more careful about performance as I write code"), redirect them to the profiling-based approach. The 90/10 rule makes constant attention wasteful: most programs spend 90% of their time in 10% of the code. Optimizing the other 90% is effort that produces no user-visible improvement.
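To make the 90/10 argument concrete, here is a small worked calculation. It uses Amdahl's law, which the source does not name; the 2x local speedup is an arbitrary figure chosen for illustration.

```python
def overall_speedup(fraction_optimized: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the runtime gets faster."""
    return 1.0 / ((1.0 - fraction_optimized) + fraction_optimized / local_speedup)


# Doubling the speed of the code where 90% of the runtime is spent:
hot = overall_speedup(0.90, 2.0)    # about 1.82x overall

# Doubling the speed of everything else, which accounts for 10% of the runtime:
cold = overall_speedup(0.10, 2.0)   # about 1.05x overall
```

The same local improvement is worth roughly 82% more throughput in the hot code and roughly 5% in the rest, even though the rest is the far larger body of code to touch.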
### Required Context
- **The program or module:** What is being optimized? Get a clear scope boundary — the whole application, a specific service, a batch processing path, a single module.
- **The performance complaint:** What specifically is too slow or too large? "The report generation takes 4 minutes and users expect under 30 seconds." Concrete numbers matter — they define what "satisfies users" means at the end.
- **The workload:** What inputs or usage patterns should be used when profiling? Profiling against unrepresentative inputs produces unrepresentative hot spots. Use real or realistic production-like data.
- **The profiler:** Identify the profiler for the language and runtime in use. Common options:
| Language / Runtime | Profilers |
|---|---|
| Python | `cProfile`, `py-spy`, `line_profiler`, `memory_profiler` |
| JavaScript / Node.js | V8 CPU profiler (built into Chrome DevTools, `--prof` flag), `clinic.js` |
| Java / JVM | JProfiler, YourKit, async-profiler, JDK Flight Recorder |
| Go | `pprof` (built in), `go test -bench` |
| Rust | `perf`, `cargo flamegraph`, `criterion` |
| Ruby | `rack-mini-profiler`, `stackprof`, `ruby-prof` |
| C / C++ | `gprof`, `Valgrind`/`Callgrind`, `perf`, Instruments (macOS) |
| .NET | dotTrace, PerfView, BenchmarkDotNet |
| General | Flamegraphs (Brendan Gregg's format, works across runtimes) |
- **The current state of the codebase:** Is it already well-factored? If not, the correct first step is to refactor before optimizing. See the "Precondition" step below.
### Sufficiency Check
You are ready to begin optimization when:
1. You know exactly what performance target must be met ("satisfies users")
2. You have a representative workload to run under the profiler
3. A profiler is installed and runnable
4. The codebase is in a well-factored state (or you have a plan to get it there)
---
## Process
### Step 0 — Precondition: Ensure the Codebase Is Well-Factored
Before running the profiler for the first time, assess whether the code is already well-structured.
**Why this step exists:** Profiling unfactored code (long methods, mixed responsibilities, tangled logic) is less effective because the profiler reports time in large, coarse units. You cannot pinpoint which part of a 300-line method is slow — only that the whole method is slow. Well-factored code with fine-grained methods gives the profiler smaller targets to report on, which means your optimization targets are also smaller and more precise.
Additionally, well-factored code is faster to optimize: you can add performance-specific code (caches, lazy initialization, index structures) more quickly because the code is already modular. Fowler found that well-factored code "gives you more time to focus on performance" — not because it runs faster, but because you spend less time understanding it before changing it.
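A small illustration of why fine-grained methods help, using Python's `cProfile`. The functions `parse`, `total`, and `report` are hypothetical; the point is only that each named step becomes its own entry in the profile.

```python
import cProfile
import pstats


def parse(lines):
    return [int(line) for line in lines]


def total(values):
    return sum(values)


def report(lines):
    """Small, named steps: each shows up as its own row in the profile."""
    return total(parse(lines))


profiler = cProfile.Profile()
profiler.enable()
report([str(n) for n in range(10_000)])
profiler.disable()

# Collect the function names the profiler recorded.
profiled_names = {func_name for (_file, _line, func_name) in pstats.Stats(profiler).stats}
```

Had `report` done the parsing and summing inline, the profile would contain a single row for `report` and no indication of which half was expensive.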
**How to check:**
- Methods are small and do one thing
- Classes have clear, single responsibilities
- Logic is not duplicated across multiple places
- External dependencies (I/O, network, database) are isolated in identifiable locations
**If the codebase is not well-factored:** Refactor first. Apply `method-decomposition-refactoring`, `code-smell-diagnosis`, or other refactoring skills before starting this optimization workflow. This is not lost time — it is what makes the optimization phase both faster and more effective.
**If the codebase is already well-factored:** Proceed to Step 1.
---
### Step 1 — Establish a Baseline: Run the Profiler
Run the program under the profiler using a representative workload. Record the output in full before making any changes.
**Why:** You need a before-state to compare against. Without a baseline, you cannot tell whether an optimization improved performance, had no effect, or made things worse. The baseline also establishes the current hot spots — the specific methods or code paths the profiler identifies as consuming the most time or memory.
**What to record:**
- Total elapsed time (wall-clock time) for the workload
- Profiler output: which methods/functions are consuming the most cumulative time, the most self time, or the most memory
- For memory profiling: which allocations are largest, where they are created
**Save the baseline output to a file** — do not rely on memory. Example:
```bash
# Python: profile to a file, then display top 20 hotspots
python -m cProfile -o baseline.prof my_script.py workload_input.csv
python -c "import pstats; p = pstats.Stats('baseline.prof'); p.sort_stats('cumulative'); p.print_stats(20)"
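# Go: CPU profile (profile one package at a time; go test rejects
# -cpuprofile when the pattern matches more than one package)
go test -cpuprofile=baseline.prof -bench=BenchmarkMyFunction .
go tool pprof -top baseline.prof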
# Node.js: run with profiling flag, then analyze
node --prof my_script.js
node --prof-process isolate-*.log > baseline.txt
```
Identify the **top hot spots**: typically 3-5 methods/functions that together account for most of the time. These are your optimization targets. Everything else is not worth touching.
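When the workload is easier to drive from code than from the command line, the baseline can also be captured programmatically. This is a sketch using the standard `cProfile` module; `workload` and `record_baseline` are hypothetical names, and the wall-clock figure includes profiler overhead, so compare it only with other profiled runs.

```python
import cProfile
import pstats
import time


def workload():
    """Hypothetical stand-in for the real program run on representative input."""
    return sum(i * i for i in range(200_000))


def record_baseline(func, stats_path: str) -> float:
    """Profile func once, save the stats to stats_path, return wall-clock seconds."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    func()
    profiler.disable()
    elapsed = time.perf_counter() - start
    profiler.dump_stats(stats_path)
    return elapsed
```

A typical call would be `record_baseline(workload, "baseline.prof")`, after which `pstats.Stats("baseline.prof").sort_stats("cumulative").print_stats(20)` prints the same top-20 view as the command-line example.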
---
### Step 2 — Select One Hot Spot to Optimize
From the profiler output, choose the single largest consumer of time or memory. Focus on one hot spot at a time.
**Why:** Focusing on one hot spot at a time maintains the connection between a change and its effect. If you optimize three things simultaneously and re-profile to find no improvement, you cannot tell which (if any) of the three changes was responsible. One change → one measurement.
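**How to select:** Sort the profiler output by cumulative time (time spent in a function including all calls it makes). Entry points and thin wrappers always rank near the top of that list because they contain everything beneath them, so skip past them to the first function whose own work you can change. Cross-check against self time to confirm where in the call tree the time is actually spent. That function is the hot spot to address first, since it has the most potential to improve the overall result.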
If the top hot spot is infrastructure code you cannot change (a library, a system call), move to the next one down the list.
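The selection rule can be sketched against Python's `pstats` data. `rank_hot_spots` is a hypothetical helper; it assumes your own code can be recognized by a substring of its file path, and it lets you exclude entry points that merely wrap the real work.

```python
import cProfile
import pstats


def rank_hot_spots(stats, own_code_marker, exclude=frozenset(), limit=5):
    """Return (function name, cumulative seconds, self seconds) rows for
    functions in our own code, sorted by cumulative time, largest first."""
    rows = []
    for (filename, _line, func_name), (_cc, _nc, tottime, cumtime, _callers) in stats.stats.items():
        # Library and built-in frames are skipped: we cannot change them.
        if own_code_marker in filename and func_name not in exclude:
            rows.append((func_name, cumtime, tottime))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]
```

The first row of the result is the candidate for Step 3. The self-time column is returned alongside so you can see whether the time is spent in that function or in something it calls.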
---
### Step 3 — Understand the Hot Spot Before Changing It
Read the hot spot code carefully before applying any optimization.
**Why:** The profiler tells you *where* time is being spent; it does not tell you *why* or *what to do about it*. You need to understand the code before you can optimize it correctly. Common causes of hot spots:
| Cause | Typical fix |
|---|---|
| Repeated expensive computation with the same inputs | Cache the result (memoization) |
| Repeated object/memory allocation in a tight loop | Reuse objects, pre-allocate, use lazy initialization |
| Unnecessary I/O inside a loop (database calls, file reads) | Batch the I/O, move it outside the loop, use connection pooling |
| Inefficient data structure for the access pattern | Replace with a more appropriate structure (e.g., list to set for membership tests) |
| Redundant work (computing the same derived value multiple times) | Compute once, store the result |
| Excessive copying of large data structures | Use references, views, or generators instead |
Do not guess — read the code, understand the data flow, identify the specific waste.
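As one instance of the first row in the table (repeated expensive computation with the same inputs), here is a memoization sketch using `functools.lru_cache`. `shipping_zone` is a hypothetical lookup; the call counter exists only to make the saved work visible.

```python
from functools import lru_cache


def shipping_zone(postcode: str) -> str:
    """Hypothetical expensive lookup; imagine a slow computation here."""
    shipping_zone.calls += 1
    return "north" if postcode < "5000" else "south"


shipping_zone.calls = 0


@lru_cache(maxsize=None)
def cached_shipping_zone(postcode: str) -> str:
    """Same result as shipping_zone, computed once per distinct postcode."""
    return shipping_zone(postcode)


orders = ["2000", "2000", "7000", "2000", "7000"]
zones = [cached_shipping_zone(p) for p in orders]
# Five lookups were requested, but shipping_zone ran only twice.
```

Caching like this is only safe when the function's result depends solely on its arguments, which is exactly the kind of fact that reading the hot spot first is meant to establish.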
---
### Step 4 — Apply One Optimization in a Small Step
Make the smallest possible change that addresses the identified cause. Do not refactor other parts of the hot spot while you are in it — stay focused on the performance change only.
**Why small steps:** Small changes are easier to revert. If the optimization does not produce improvement (which happens frequently — see Step 5), the cost of undoing it is low. Large, sweeping optimizations that do not help leave you with a large, sweeping revert.
**Apply the change, then immediately:**
1. Compile (if the language requires it)
2. Run the test suite — confirm no regressions were introduced
3. Proceed to Step 5
**Note on trade-offs:** Performance optimizations often make code harder to understand. This is an accepted trade-off, but only at the hot spot — not throughout the codebase. Localized complexity in one well-identified method is manageable. Widespread micro-optimizations are not.
---
### Step 5 — Re-Profile: Measure the Effect
Run the profiler again with the same workload and compare the output to the baseline.
**Why:** This is the critical decision gate. Performance intuition is unreliable. Experienced engineers routinely expect optimizations to help that do not — and sometimes make things slower. The profiler does not care about intuition.
**Decision logic:**
```
Run profiler with same workload as baseline
|
├── Hot spot time decreased meaningfully?
| YES → Keep the change. Update baseline. Return to Step 2.
|
└── No meaningful improvement (or performance got worse)?
→ BACK OUT THE CHANGE immediately. Do not keep it.
Return to Step 3 and reconsider the cause.
```
**What counts as "meaningful improvement":** If the profiler shows the target method is now faster, and the total elapsed time for the workload improved measurably, keep the change. If the numbers are essentially the same within noise, the optimization had no real effect — remove it. Code clarity costs were paid; performance gains were not received.
**Why back out aggressively:** Every optimization that does not help is a net negative — it adds complexity without benefit. Fowler is explicit: "If you haven't improved performance, you back out the change." The discipline to revert unsuccessful optimizations is what keeps the codebase from accumulating unexplained complexity.
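The keep-or-revert gate can be written down so the decision is not made by feel. This sketch compares median timings over repeated runs; the 5% threshold is an assumption standing in for "meaningful" and should be set from the noise you actually observe in your measurements.

```python
from statistics import median


def keep_change(baseline_runs, candidate_runs, min_gain: float = 0.05) -> bool:
    """Return True only if the candidate's median time beats the baseline's
    median by at least min_gain (a fraction, 0.05 meaning 5%)."""
    before = median(baseline_runs)
    after = median(candidate_runs)
    return (before - after) / before >= min_gain
```

For example, `keep_change([4.0, 4.1, 3.9], [3.0, 3.1, 2.9])` returns `True`, while a candidate whose timings sit within the baseline's spread returns `False` and should be reverted.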
---
### Step 6 — Repeat Until Performance Satisfies Users
Return to Step 2. Select the next hot spot from the current profiler output (the hot spot landscape changes as you optimize — the second-largest consumer may have become the first).
Continue the loop:
```
Profile → Identify hot spot → Understand cause → Apply one change → Compile + test → Re-profile
→ Improved? Keep and continue.
→ Not improved? Revert and try differently.
→ Performance satisfies users? Stop.
```
**Stop condition:** Stop when the performance target established in Context and Input Gathering is met — not before, not after. Over-optimizing past the target produces diminishing returns and accumulates unnecessary complexity.
**If you exhaust the hot spots without meeting the target:** The remaining time may be in infrastructure (OS, runtime, network) outside your control, or the architecture itself may need to change. Escalate to architectural-level decisions (parallelism, caching layer, algorithmic change) — these are larger changes that warrant their own planning.
---
## Key Principles
**1. Never optimize without profiler data.**
"Programmers are very bad at guessing where the bottlenecks are." (McConnell, cited by Fowler.) The 90/10 rule means you will almost certainly guess wrong. Code that looks slow often is not; code that looks innocent is often the real bottleneck. Measurement is the only reliable guide.
**2. Well-factored code is a precondition for effective optimization.**
Fine-grained methods give the profiler fine-grained targets. A 300-line method that the profiler marks as slow tells you very little. Three 20-line methods with clear names tell you exactly where to look. Refactoring before optimizing is not delay — it is what makes the optimization fast.
**3. One change per measurement cycle.**
The discipline of one change → one profile run is what connects cause to effect. Without it, you accumulate changes you cannot attribute to specific improvements, and you cannot safely revert the ones that did not help.
**4. Back out unsuccessful optimizations without hesitation.**
An optimization that produced no measurable improvement costs code clarity for no performance gain. It must be removed. This is not failure — it is the scientific method applied to software. You learned something: that was not the bottleneck.
**5. Optimize only where the profiler points.**
Leaving the non-hot-spot code clean and unoptimized is correct behavior, not laziness. The 90% of code that is not a hot spot should remain readable, maintainable, and clear. Optimizing it wastes time and degrades code quality without improving user-visible performance.
**6. "Satisfies users" is the target, not maximum possible speed.**
Stop when the performance target is met. Over-optimization past the requirement accumulates complexity for no user benefit.
---
## Examples
### Example 1: Payroll processing system (based on Fowler's Chrysler case study)
**Situation:** A payroll system was expected to take over 1,000 hours to run the full payroll. The team suspected complex business logic was the cause and had ideas about where to optimize.
**What actually happened:** A profiler was brought in before any changes were made. The profiler revealed the biggest consumer was not business logic — it was the repeated creation of 12,000-byte strings, three per output record, deep in the I/O framework. The strings were so large that the runtime's garbage collector could not handle them normally and was paging them to disk on every creation.
**Fix 1:** Cache a single 12,000-byte string rather than creating a new one for each record. This addressed most of the problem.
**Fix 2:** Change the framework to write directly to a file stream, eliminating the large strings entirely.
**After profiling again:** The profiler found the next hot spots — smaller strings of 800 bytes, 500 bytes. Converting those to stream writes as well continued the improvement.
**Result:** The payroll that was expected to take over 1,000 hours ran in 40 hours at launch, then 18 hours, then 12 hours, then 9 hours as optimization continued.
**Key lesson:** The team's initial guesses about what was slow were completely wrong. The profiler pointed to something nobody had considered. The well-factored codebase also enabled a later optimization — adding multithreading — that took only three days to implement because the code was already modular.
---
### Example 2: A data pipeline that is too slow
**Situation:** A data processing pipeline takes 8 minutes to process daily input. Users expect under 2 minutes.
**Step 0 — Check factoring:** Methods are small and clearly named. Ready to proceed.
**Step 1 — Baseline profile:**
```
cumulative time by function:
process_records() 482s (called once)
validate_record() 431s (called 50,000 times)
lookup_category() 398s (called 50,000 times)
format_output() 49s (called 50,000 times)
write_batch() 2s (called 200 times)
```
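**Step 2 — Hot spot:** `lookup_category()` accounts for 398s of the 482s total. `validate_record()` shows 431s, but the times are cumulative, and the totals only add up if `lookup_category()` is called from within `validate_record()`. Clear target.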
**Step 3 — Understand cause:** Reading the code reveals `lookup_category()` makes a database query on every call. 50,000 queries × ~8ms each = 400 seconds.
**Step 4 — Optimization:** Load the entire category table into an in-memory dictionary at startup. Replace the per-record database call with a dictionary lookup.
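
As a rough sketch, the Step 4 change might look like this in Java. The pipeline's own code is not shown in this example, so `CategoryLookup`, its constructor argument, and the `"UNKNOWN"` fallback are all hypothetical. The sketch also assumes the category table fits in memory and does not change during a run.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the Step 4 change: load once, then look up in memory.
final class CategoryLookup {
    private final Map<String, String> categoriesByKey;

    // The full category table is loaded once at startup
    // (one bulk read instead of one query per record).
    CategoryLookup(Map<String, String> categoryTable) {
        this.categoriesByKey = new HashMap<>(categoryTable);
    }

    // Replaces the per-record database call with a map lookup.
    // The "UNKNOWN" fallback is an assumption of this sketch.
    String lookupCategory(String recordKey) {
        return categoriesByKey.getOrDefault(recordKey, "UNKNOWN");
    }
}
```

The change stays confined to the hot spot: callers still ask for a category by key, and only the source of the answer moves from the database to memory.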
**Step 5 — Re-profile:**
```
cumulative time by function:
process_records() 89s
validate_record() 52s (now the hot spot)
lookup_category() 1s (resolved)
format_output() 34s
write_batch() 2s
```
Total: 89 seconds (down from 482). Improvement confirmed. Keep the change.
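**Continue or stop:** Check the stop condition first. The workload now runs in 89s, which is under the 2-minute target, so the loop ends here. Had the target been tighter, the next hot spot would be `validate_record()` at 52s: profile, understand, optimize, measure, repeat.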
---
### Example 3: Recognizing and redirecting the constant attention anti-pattern
**Situation:** A developer says: "I'm going to be careful about performance as I write this new feature — I'll avoid creating unnecessary objects and think about efficiency at every step."
**Correct response:** Redirect. Point out that constant attention has two specific costs:
1. It slows down feature development because the developer is making optimization decisions without knowing which code will actually be slow.
2. The 90/10 rule means most of the code being "carefully" written is not a hot spot and will never be. The optimization effort is wasted.
**Better approach:** Write the feature in the clearest, most well-factored way. After it is working, if performance is unacceptable, run a profiler. The profiler will tell you exactly which 10% of the new code to look at. Optimize only that.
---
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler.
## Related BookForge Skills
- `refactoring-readiness-assessment` — Assess whether code is ready to refactor before entering the optimization precondition
- `build-refactoring-test-suite` — Build the test suite that protects against regressions during optimization steps
- `method-decomposition-refactoring` — Decompose large methods to give the profiler finer-grained targets
Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Decompose long, tangled methods into clean, composable units using the 9 composing-method refactorings from Fowler's catalog. Use when: a method has grown to...
---
name: method-decomposition-refactoring
description: |
Decompose long, tangled methods into clean, composable units using the 9 composing-method refactorings from Fowler's catalog. Use when: a method has grown too long to understand at a glance; code contains a comment that explains what a block does (the comment is a signal to extract); a method cannot be changed without understanding all of its internals; local variables are so numerous that Extract Method keeps failing; a method is doing several conceptually distinct things that are collapsed into one body. The flagship technique is Extract Method — applied when the semantic distance between the method name and its body is too large; name the fragment after what it does, not how it does it. Companion techniques handle the obstacles: Replace Temp with Query eliminates the local variables that block extraction; Split Temporary Variable separates a temp that has been reused for two different things; Introduce Explaining Variable names a sub-expression when extraction is blocked by too many locals; Remove Assignments to Parameters prevents a parameter from being reassigned and muddying the intent; Inline Method collapses a method whose body is as clear as its name; Inline Temp removes a temp that obstructs another refactoring; Replace Method with Method Object converts a hopelessly entangled method into its own class so that Extract Method can be applied freely; Substitute Algorithm replaces an obscure implementation with a cleaner one once the method is small enough. Trigger: Long Method smell from code-smell-diagnosis, or any method where a comment is needed to understand a code block.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/method-decomposition-refactoring
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on:
- code-smell-diagnosis
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [6]
tags: [refactoring, code-quality, methods]
execution:
tier: 2
mode: full
level: 1
inputs:
- type: codebase
description: "The method or class to decompose — a file path, a method name, or a pasted code block"
- type: document
description: "Diagnosis report from code-smell-diagnosis identifying Long Method or related smells, if available"
tools-required: [Read, Edit, Bash]
tools-optional: [Grep, Write]
mcps-required: []
environment: "Run inside a project directory with source files readable and a test suite runnable via the project's standard test command."
discovery:
goal: "Transform every method whose semantic distance between name and body is too large into a set of short, intention-revealing methods — each doing one thing, each named after what it does"
tasks:
- "Identify the method(s) to decompose from smell diagnosis or user direction"
- "Eliminate obstructing local variables using Replace Temp with Query or Introduce Explaining Variable"
- "Extract cohesive code fragments into named methods with Extract Method"
- "Handle remaining obstacles: split dual-use temps, remove parameter assignments, apply Method Object for irreducible tangles"
- "Run the test suite after each step to confirm behavior is preserved"
- "Optionally substitute the algorithm if a cleaner implementation becomes apparent"
audience:
roles: ["software-developer", "senior-developer", "tech-lead"]
experience: "intermediate — assumes working knowledge of the host language and object-oriented design"
triggers:
- "Long Method smell identified by code-smell-diagnosis"
- "A method requires a comment to explain what a block of code does"
- "A method cannot be understood without reading its entire body"
- "Adding a feature to a method requires untangling it first"
- "A code review flags a method as too complex or hard to follow"
not_for:
- "Cross-class refactoring (Feature Envy, Shotgun Surgery) — use class-responsibility-realignment instead"
- "Conditional structure simplification — use conditional-simplification-strategy instead"
- "Performance optimization — use profiling-driven-performance-optimization instead"
- "New code that hasn't been written yet — these techniques apply to existing, working code"
---
# Method Decomposition Refactoring
## When to Use
A method has grown beyond the point where its name describes what it does. The body contains
conceptually distinct operations mixed together, or it requires comments to explain sections
of code that should be self-explanatory from method names.
**The key diagnostic from Fowler:** Length is not the issue. What matters is the semantic
distance between the method name and the method body. If extracting a fragment into a named
method improves clarity, extract it, even if the extracted method's name is longer than the
code it replaces.
**Signals that decomposition is needed:**
- A comment precedes a block of code explaining what the block does
- Understanding the method requires tracking many local variables simultaneously
- A loop body, a conditional branch, or an initialization block feels like a separate concept
- The method is long enough that you scroll to read it
- When you try Extract Method, you get blocked by too many parameters from local variables
**This skill executes decomposition.** `code-smell-diagnosis` identifies that decomposition is
needed and points here. After decomposition, `conditional-simplification-strategy` handles
any remaining complex conditionals exposed in the extracted methods.
---
## Context and Input Gathering
### Required Input
- **The method to decompose.** File path and method name, or a pasted code block. Why: the
refactoring is grounded in the actual code — general descriptions are insufficient for
making safe, correct transformations.
- **The test suite command.** Why: every Extract Method step must be verified by running
tests. Without a passing test suite before starting, you cannot confirm you have preserved
behavior. If no tests exist, flag this and apply `build-refactoring-test-suite` first.
### Observable Context
Before starting, read the method and answer these questions:
```
Local variable inventory:
- How many local variables does the method declare?
- Which variables are read-only within a candidate extraction?
- Which variables are assigned within a candidate extraction?
- Which variables are used both inside and outside a candidate extraction?
Conceptual block inventory:
- Where are the comments? Each comment marks a likely extraction boundary.
- Does the method initialize, then compute, then output? Each phase is a candidate.
- Are there loops? Each loop and its body is a candidate for extraction.
- Are there conditionals? Each branch may be a candidate.
```
### Default Assumptions
- If the user has not run the tests: run them first and confirm they pass before any change.
- If a candidate extraction would require more than one output variable: choose different
extraction boundaries or apply Replace Temp with Query first to eliminate the variable.
- If the method is so entangled that every extraction attempt produces a 5+ parameter
signature: jump to Replace Method with Method Object (Step 6) before attempting extraction.
---
## Process
### Step 1: Read and Inventory the Method
**ACTION:** Read the entire method. List every local variable, its type, where it is assigned,
and where it is used. Mark conceptual blocks — typically announced by comments.
**WHY:** Extract Method's only real complication is local variables. A full inventory before
starting prevents surprises mid-extraction (discovering a variable is written inside the
extraction and read outside it, which requires returning a value). The inventory also reveals
which temp elimination refactoring to apply before extraction.
**Categorize each local variable:**
- **Read-only inside candidate extraction, declared outside:** pass as parameter to extracted
method
- **Assigned only inside candidate extraction:** declare inside the extracted method; remove
the outer declaration afterward if it was declared only for the extracted block
- **Assigned inside, read outside:** the extracted method must return this value; only one
such variable is workable per extraction
- **Assigned inside, read outside, AND other variables also modified:** you cannot extract
cleanly; apply Replace Temp with Query or Split Temporary Variable first
---
### Step 2: Eliminate Temp Blockers Before Extracting
**ACTION:** For each local variable that would block a clean extraction, apply one of:
**Replace Temp with Query** — when a temp holds a simple expression assigned once and the
expression has no side effects.
Mechanics:
1. Find a temp assigned only once. Declare it `final` and compile — confirms it is only
assigned once.
2. Extract the right-hand side of the assignment into a private method.
3. Replace each reference to the temp with a call to the method.
4. Compile and test after each reference replacement.
5. Remove the temp declaration.
See Example 2 below for a full walkthrough.
**WHY Replace Temp with Query:** Temps are local — visible only inside the method, and they
encourage longer methods because the method is the only way to reach the value. A query
method is visible to the entire class and can be reused, making the class's code cleaner
overall. It also unblocks Extract Method by eliminating variables that would otherwise need
to be passed as parameters.
**Split Temporary Variable** — when a temp is assigned more than once for two different
purposes (not a loop variable and not a collecting variable).
Mechanics:
1. Rename the temp at its first declaration and first assignment to a name that reflects
only the first use. Declare it `final`.
2. Change all references from the first assignment up to the second assignment to use the
new name.
3. At the second assignment, declare a new temp with the original name.
4. Compile and test.
5. Repeat for further assignments.
**WHY Split Temporary Variable:** A variable used for two different purposes has two
responsibilities. The name cannot honestly reflect both, so it becomes a source of confusion.
Splitting it gives each responsibility an honest name and makes each one available for
Replace Temp with Query.
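
A minimal sketch of Split Temporary Variable, assuming a hypothetical `Rectangle` class in which one temp first holds the perimeter and is then reused for the area:

```java
// Hypothetical example: 'temp' holds the perimeter first, then the area.
final class Rectangle {
    private final double height;
    private final double width;

    Rectangle(double height, double width) {
        this.height = height;
        this.width = width;
    }

    // Before: one temp, two responsibilities.
    String describeBefore() {
        double temp = 2 * (height + width);
        String text = "perimeter=" + temp;
        temp = height * width; // second, unrelated use of the same temp
        return text + ", area=" + temp;
    }

    // After: each value gets its own final temp with an honest name.
    String describeAfter() {
        final double perimeter = 2 * (height + width);
        final double area = height * width;
        return "perimeter=" + perimeter + ", area=" + area;
    }
}
```

Once split, `perimeter` and `area` are each assigned exactly once, so either can become a query method in a later step.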
**Inline Temp** — when a temp is assigned the value of a method call once and never
reassigned, and the temp is blocking another refactoring (most commonly blocking Extract
Method).
Mechanics:
1. Declare the temp `final` and compile — confirms single assignment.
2. Replace each reference to the temp with the method call expression.
3. Compile and test after each replacement.
4. Remove the temp declaration.
**WHY Inline Temp:** Temps that simply name a method call result add indirection without
adding clarity. When they obstruct a more important refactoring, inlining them is the right
trade.
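
A minimal sketch of Inline Temp. The `Order` and `PricingCheck` classes and the 1000 threshold are hypothetical, chosen only to show the shape of the change:

```java
// Hypothetical Order with a basePrice() query; all names are illustrative.
final class Order {
    private final int quantity;
    private final double itemPrice;

    Order(int quantity, double itemPrice) {
        this.quantity = quantity;
        this.itemPrice = itemPrice;
    }

    double basePrice() {
        return quantity * itemPrice;
    }
}

final class PricingCheck {
    // Before: the temp only renames the result of a method call.
    static boolean isLargeOrderBefore(Order anOrder) {
        final double basePrice = anOrder.basePrice();
        return basePrice > 1000;
    }

    // After Inline Temp: the call is used directly.
    static boolean isLargeOrderAfter(Order anOrder) {
        return anOrder.basePrice() > 1000;
    }
}
```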
**Introduce Explaining Variable** — when an expression is too complex to extract into a
method (usually because there are too many local variables blocking extraction) and the
expression needs a name to be readable.
Mechanics:
1. Declare a `final` temporary variable whose name explains the purpose of the expression.
2. Replace the expression (or the first occurrence if repeated) with the temp.
3. If the expression is repeated, replace other occurrences one at a time.
4. Compile and test.
**WHY Introduce Explaining Variable:** This is a stepping stone, not a destination. Fowler
prefers Extract Method because a method is available to the whole object while a temp is only
local. Use Introduce Explaining Variable when you are inside a tangled algorithm with many
local variables that block extraction — the explained temps can later become Replace Temp with
Query calls as the tangles loosen.
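
A minimal sketch of Introduce Explaining Variable, assuming a hypothetical `LineItem` class whose price is base price minus a quantity discount plus shipping. The discount and shipping rules are invented for the illustration:

```java
// Hypothetical pricing rule: base price - quantity discount + shipping.
final class LineItem {
    private final int quantity;
    private final double itemPrice;

    LineItem(int quantity, double itemPrice) {
        this.quantity = quantity;
        this.itemPrice = itemPrice;
    }

    // Before: one dense expression that needs a comment to be read.
    double priceBefore() {
        return quantity * itemPrice
                - Math.max(0, quantity - 500) * itemPrice * 0.05
                + Math.min(quantity * itemPrice * 0.1, 100.0);
    }

    // After: each sub-expression is named by a final temp.
    double priceAfter() {
        final double basePrice = quantity * itemPrice;
        final double quantityDiscount = Math.max(0, quantity - 500) * itemPrice * 0.05;
        final double shipping = Math.min(basePrice * 0.1, 100.0);
        return basePrice - quantityDiscount + shipping;
    }
}
```

Each of the three temps is assigned once, so each is a ready candidate for Replace Temp with Query later.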
---
### Step 3: Apply Extract Method
**ACTION:** For each cohesive code fragment that represents a single concept, extract it into
a new method.
**WHY Extract Method:** It is the most common and highest-value refactoring in the catalog.
Short, well-named methods increase reuse (other methods can call them), make higher-level
methods read like a series of comments, and make overriding easier because the granularity
aligns with the conceptual granularity of the domain.
**Full mechanics (9 steps):**
1. **Create a new method.** Name it after the intention — what it does, not how it does it.
If you cannot find a name more meaningful than the code itself, do not extract.
Why: the name is the primary value of extraction. A good name turns code into
self-documentation.
2. **Copy the extracted code** from the source method into the new target method body.
3. **Scan for references to local variables** in scope in the source method. These are
local variables declared in the source method and parameters of the source method.
4. **Handle read-only variables:** For variables referenced inside the extracted code but
declared outside it, and that are not modified inside the extraction, declare them as
parameters of the new method. Why: these are inputs to the computation the method
represents; passing them as parameters makes the dependency explicit.
5. **Handle variables declared only inside the extracted code:** Move their declaration into
the target method. After extraction, remove the original declaration from the source
method if it no longer appears there.
6. **Handle variables modified inside the extracted code:**
- If only one variable is modified and it is used after the extraction: have the target
method return that variable's value; assign the return value in the source method.
- If more than one variable is modified: you cannot cleanly extract. Either choose
different extraction boundaries, or apply Replace Temp with Query and Split Temporary
Variable to reduce the number of modified variables, then try again. As a last resort,
apply Replace Method with Method Object (Step 6 below).
7. **Finalize the parameter list:** confirm that the new method's signature takes every
read-only variable identified in step 4.
8. **Compile** once all locally scoped variables have been dealt with.
9. **Replace the extracted code in the source method with a call to the new method.** If
you moved any temp declarations into the target method, remove their declarations from
the source method. Compile and test.
**The name heuristic:** If extracting improves clarity, do it — even if the name is longer
than the code you extracted. Fowler: "Length is not the issue. The key is the semantic
distance between the method name and the method body."
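
The mechanics above can be sketched on a `printOwing()` method that does have local variable complications. This is an illustrative result, not the book's exact listing: the `Customer` class is hypothetical and stores order amounts in a `List<Double>` to keep the block self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Customer showing the result of three extractions.
final class Customer {
    private final String name;
    private final List<Double> orderAmounts = new ArrayList<>();

    Customer(String name) {
        this.name = name;
    }

    void addOrder(double amount) {
        orderAmounts.add(amount);
    }

    void printOwing() {
        printBanner();
        // Step 6: the single modified variable comes back as the return value.
        double outstanding = getOutstanding();
        // Step 4: a read-only variable is passed in as a parameter.
        printDetails(outstanding);
    }

    void printBanner() {
        System.out.println("**************************");
        System.out.println("***** Customer Owes ******");
        System.out.println("**************************");
    }

    double getOutstanding() {
        // Step 5: a variable used only in the fragment is declared here.
        double result = 0.0;
        for (double amount : orderAmounts) {
            result += amount;
        }
        return result;
    }

    void printDetails(double outstanding) {
        System.out.println("name: " + name);
        System.out.println("amount: " + outstanding);
    }
}
```

The source method now reads as three named steps, and each extracted method has at most one parameter.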
---
### Step 4: Handle Parameter Assignment
**ACTION:** If the source method assigns to a parameter variable (not just calling a method
on it, but reassigning the parameter reference itself), apply Remove Assignments to Parameters.
**WHY Remove Assignments to Parameters:** Assigning to a parameter is confusing because it
blurs what the parameter represents (the value passed in) with what the local computation
produces. In Java, arguments are passed by value, so assigning to a parameter changes only
the local copy and the caller sees no change; rebinding a parameter in Python behaves the
same way. Code that looks as if it updates the caller's variable but does not is a common
source of bugs. Using a temp makes the semantics explicit.
Mechanics:
1. Create a temporary variable for the parameter. Initialize it to the parameter's value.
2. Replace all references to the parameter that appear after the assignment with the temp.
3. Change the assignment to assign to the temp instead of the parameter.
4. Compile and test.
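
A minimal sketch of Remove Assignments to Parameters, assuming a hypothetical `DiscountPolicy` class. The thresholds are invented for the illustration:

```java
// Hypothetical discount rule used only to show the refactoring.
final class DiscountPolicy {
    // Before: the parameter inputVal is reassigned, so it no longer
    // means "the value that was passed in".
    static int discountBefore(int inputVal, int quantity, int yearToDate) {
        if (inputVal > 50) inputVal -= 2;
        if (quantity > 100) inputVal -= 1;
        if (yearToDate > 10000) inputVal -= 4;
        return inputVal;
    }

    // After: a temp carries the running result; the parameters are final,
    // so the compiler rejects any future assignment to them.
    static int discountAfter(final int inputVal, final int quantity, final int yearToDate) {
        int result = inputVal;
        if (inputVal > 50) result -= 2;
        if (quantity > 100) result -= 1;
        if (yearToDate > 10000) result -= 4;
        return result;
    }
}
```

The first condition still reads `inputVal` because it comes before any assignment. Only the references after the first assignment switch to the temp.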
---
### Step 5: Apply Inline Method When Indirection Becomes Noise
**ACTION:** After extraction rounds, if a method's body is as clear as its name — or if you
have a cluster of methods that delegate to each other without adding clarity — inline the
method back into its callers.
**WHY Inline Method:** Extraction can overshoot. A method whose name says exactly what its
one-line body says adds indirection without adding comprehension. Inline Method also prepares
a method for Replace Method with Method Object: inlining all the called methods into the
target method first makes it easier to move the whole behavior into the new class.
Mechanics:
1. Confirm the method is not polymorphic (no subclasses override it). Do not inline if they
do — subclasses cannot override a method that no longer exists.
2. Find all call sites.
3. Replace each call site with the method body.
4. Compile and test.
5. Remove the method definition.
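
A minimal sketch of Inline Method, assuming a hypothetical `Courier` class with a one-line helper whose body is as clear as its name:

```java
// Hypothetical Courier; the rating rule is invented for the illustration.
final class Courier {
    private final int numberOfLateDeliveries;

    Courier(int numberOfLateDeliveries) {
        this.numberOfLateDeliveries = numberOfLateDeliveries;
    }

    // Before: the helper adds a hop without adding meaning.
    int getRatingBefore() {
        return moreThanFiveLateDeliveries() ? 2 : 1;
    }

    private boolean moreThanFiveLateDeliveries() {
        return numberOfLateDeliveries > 5;
    }

    // After Inline Method: the body replaces the call. In real code the
    // helper is then deleted; it stays here only for the Before version.
    int getRatingAfter() {
        return numberOfLateDeliveries > 5 ? 2 : 1;
    }
}
```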
---
### Step 6: Apply Replace Method with Method Object for Irreducible Tangles
**ACTION:** If after applying Replace Temp with Query and Split Temporary Variable, the
method still has so many local variables that Extract Method cannot be applied cleanly,
convert the entire method into its own class.
**WHY Replace Method with Method Object:** All local variables become fields on the new
class, eliminating the parameter-passing problem entirely. Once the method is an object, you
can apply Extract Method freely on the `compute()` method because all the "parameters" are
already available as fields. This is the escalation path for methods that resist decomposition.
Mechanics:
1. Create a new class. Name it after the method.
2. Give the new class a `final` field for the object that hosted the original method (the
source object). Give it a field for each parameter and each local variable of the method.
3. Give the new class a constructor that takes the source object and each parameter; assigns
them to the corresponding fields.
4. Give the new class a method named `compute`.
5. Copy the body of the original method into `compute`. Replace any calls to source object
methods with calls via the source object field.
6. Compile.
7. Replace the original method's body with: `return new MethodObjectClass(this, param1,
param2, ...).compute();`
8. Now apply Extract Method freely on `compute()` — local variables are all fields, so
parameter passing is no longer needed.
---
### Step 7: Apply Substitute Algorithm When a Cleaner Path Exists
**ACTION:** Once the method is decomposed enough to be understood, if the algorithm itself
is unnecessarily complex (a clearer algorithm is known, or a library method already provides
the behavior), replace the algorithm wholesale.
**WHY Substitute Algorithm:** Decomposition makes the algorithm legible enough to evaluate.
Sometimes the algorithm can be replaced with a simpler version (a list lookup instead of
cascading conditionals, a standard library call instead of manual iteration). You can only
substitute safely when the method is small; substituting a large complex algorithm is
unreliable.
Mechanics:
1. Prepare the replacement algorithm so it compiles.
2. Run the test suite with both the old and new algorithm available (comment one out) to
compare results.
3. If results match, replace permanently. If they differ, use the old algorithm to debug
which test cases fail.
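
A minimal sketch of Substitute Algorithm, replacing cascading conditionals with a list lookup. The `PersonFinder` class and the candidate names are hypothetical. Both versions are kept side by side here so their results can be compared, as step 2 suggests.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical finder: returns the first known name in the input, or "".
final class PersonFinder {
    // Old algorithm: one conditional per candidate.
    static String foundPersonOld(String[] people) {
        for (String person : people) {
            if (person.equals("Don")) return "Don";
            if (person.equals("John")) return "John";
            if (person.equals("Kent")) return "Kent";
        }
        return "";
    }

    // New algorithm: a single membership test against a list.
    static String foundPersonNew(String[] people) {
        List<String> candidates = Arrays.asList("Don", "John", "Kent");
        for (String person : people) {
            if (candidates.contains(person)) return person;
        }
        return "";
    }
}
```

Adding a fourth candidate now means changing one list literal instead of adding another conditional.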
---
### Step 8: Verify and Review
**ACTION:** Run the full test suite. Read the decomposed methods aloud — can each be
understood from its name alone?
**WHY:** The behavioral contract must be preserved exactly. Reading method names aloud is
the fastest test of whether the extraction produced intention-revealing names: if you need
to look at the body to understand what the method does, the name is still wrong.
**Acceptance criteria for completed decomposition:**
- All tests pass
- The original method body reads like a series of high-level steps, each a method call
- No method requires a comment to explain what it does (the name provides that)
- No extracted method has more than 3-4 parameters (if more, check for missed Replace Temp
with Query or Introduce Parameter Object opportunities)
- Each extracted method has a single, nameable responsibility
---
## Key Principles
**1. The semantic distance heuristic is the decision rule.**
Do not count lines. Ask: is there a gap between what this method's name says and what its
body does? If the body is implementing "how" at a level of detail that the name abstracts
away, extract the implementation detail into its own method. If the name and body are at the
same level of abstraction, leave it.
**2. Extract Method is the primary technique — the others clear the path to it.**
Replace Temp with Query, Split Temporary Variable, and Introduce Explaining Variable exist
primarily to reduce the local variable count so that Extract Method can proceed. Inline Temp
removes a specific class of obstruction. Remove Assignments to Parameters prevents a subtle
class of confusion that Extract Method would propagate. Replace Method with Method Object is
the escape hatch when all else fails.
**3. Names are the product, not the side effect.**
One extraction that produces a well-named method is more valuable than ten that produce
vaguely named helper methods. If you cannot give the fragment a better name than the comment
that describes it, treat that comment as a weak extraction signal.
**4. Compile and test after every single step.**
Each refactoring is designed to be applied in tiny, verifiable increments. Testing after
each step means that when something breaks, the cause is obvious — it was the last change.
Batching multiple refactorings before testing makes failures hard to diagnose.
**5. Replace Temp with Query before Extract Method, not after.**
The order matters. Eliminating temps first reduces the parameter list of the extraction.
Extracting first and then trying to eliminate temps in the extracted method is harder because
the scope boundaries have already been drawn.
**6. Performance concerns about query methods are almost always premature.**
Replace Temp with Query replaces a stored value with repeated method calls. In most cases
the compiler or runtime optimizes those calls, or their cost proves negligible. If
performance does become an issue, a profiler will identify it; at that point, putting the
temp back is trivial. Readable, factored code is worth the theoretical risk.
---
## Examples
### Example 1: Extract Method with No Local Variable Complications
**Before** — a method with a comment marking an extractable block:
```java
void printOwing() {
    Enumeration e = _orders.elements();
    double outstanding = 0.0;
    // print banner
    System.out.println("**************************");
    System.out.println("***** Customer Owes ******");
    System.out.println("**************************");
    while (e.hasMoreElements()) {
        Order each = (Order) e.nextElement();
        outstanding += each.getAmount();
    }
    System.out.println("name: " + _name);
    System.out.println("amount: " + outstanding);
}
```
The banner block has no local variable dependencies. Extract directly:
```java
void printOwing() {
    printBanner();
    // ... rest of method
}

void printBanner() {
    System.out.println("**************************");
    System.out.println("***** Customer Owes ******");
    System.out.println("**************************");
}
```
---
### Example 2: Replace Temp with Query Unblocking Extract Method
**Before** — `discountFactor` temp blocks clean extraction because `basePrice` is used to
compute it:
```java
double getPrice() {
    int basePrice = _quantity * _itemPrice;
    double discountFactor;
    if (basePrice > 1000) discountFactor = 0.95;
    else discountFactor = 0.98;
    return basePrice * discountFactor;
}
```
**Step 1 — Replace Temp with Query for `basePrice`:**
Declare `final`, extract right-hand side, replace references, remove declaration:
```java
private int basePrice() {
    return _quantity * _itemPrice;
}
```
**Step 2 — With `basePrice` as a query, `discountFactor` can now be extracted:**
```java
double getPrice() {
    return basePrice() * discountFactor();
}

private double discountFactor() {
    if (basePrice() > 1000) return 0.95;
    else return 0.98;
}
```
The final `getPrice()` has zero local variables and reads like a specification.
---
### Example 3: Replace Method with Method Object for an Irreducible Tangle
**Before** — `gamma()` has 3 interacting local variables; any extraction would require
multiple output parameters:
```java
int gamma(int inputVal, int quantity, int yearToDate) {
    int importantValue1 = (inputVal * quantity) + delta();
    int importantValue2 = (inputVal * yearToDate) + 100;
    if ((yearToDate - importantValue1) > 100) importantValue2 -= 20;
    int importantValue3 = importantValue2 * 7;
    return importantValue3 - 2 * importantValue1;
}
```
Create class `Gamma` with a field for the source object and a field for each parameter and
local variable. Add a constructor and a `compute()` method containing the original body.
Replace the original body with `return new Gamma(this, inputVal, quantity, yearToDate).compute();`
Now each fragment of `compute()` can be extracted without passing any parameters — they are
already fields. The `if` block becomes `importantThing()` with no arguments.
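
The description above can be sketched as follows. This is a minimal illustration, not the book's exact listing: the host class name `Account` and the body of `delta()` are assumptions added so the example compiles and runs.

```java
// Host class. `Account` and the value returned by delta() are hypothetical.
class Account {
    int delta() {
        return 10; // placeholder so the sketch is runnable
    }

    // The original method now only delegates to the method object.
    int gamma(int inputVal, int quantity, int yearToDate) {
        return new Gamma(this, inputVal, quantity, yearToDate).compute();
    }
}

// Method object: one field for the source object, one per parameter,
// and one per former local variable.
class Gamma {
    private final Account account;
    private final int inputVal;
    private final int quantity;
    private final int yearToDate;
    private int importantValue1;
    private int importantValue2;
    private int importantValue3;

    Gamma(Account source, int inputVal, int quantity, int yearToDate) {
        this.account = source;
        this.inputVal = inputVal;
        this.quantity = quantity;
        this.yearToDate = yearToDate;
    }

    int compute() {
        importantValue1 = (inputVal * quantity) + account.delta();
        importantValue2 = (inputVal * yearToDate) + 100;
        importantThing();
        importantValue3 = importantValue2 * 7;
        return importantValue3 - 2 * importantValue1;
    }

    // Extracted without parameters because the former locals are now fields.
    private void importantThing() {
        if ((yearToDate - importantValue1) > 100) importantValue2 -= 20;
    }
}
```

Because the former locals are fields, further extractions from `compute()` need no parameters or return values.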
---
## Decision Framework: Which Technique When?
```
Long method to decompose
│
├── Does a candidate fragment have zero modified local variables?
│ └── YES → Extract Method directly (pass read-only vars as params)
│
├── Does a candidate fragment have exactly one modified variable?
│ └── YES → Extract Method; return that variable's value
│
├── Does a candidate fragment have 2+ modified variables?
│ ├── Can you change temps to query methods? → Replace Temp with Query first
│ ├── Is a temp used for two different purposes? → Split Temporary Variable first
│ ├── Is a temp trivially assigned from a method call? → Inline Temp first
│ └── Still too many params after all the above? → Replace Method with Method Object
│
├── Is a temp needed only to name a complex sub-expression?
│ └── Prefer Extract Method (reusable) over Introduce Explaining Variable (local)
│ └── Use Introduce Explaining Variable only when extraction is blocked by other vars
│
├── Is the method body as clear as the method name?
│ └── YES → Inline Method (remove the indirection)
│
├── Is a parameter being reassigned inside the method?
│ └── YES → Remove Assignments to Parameters first
│
└── Is the algorithm correct but needlessly complex?
└── YES → Substitute Algorithm (only after method is small enough to test confidently)
```
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/composing-methods-mechanics.md` | Full step-by-step mechanics for all 9 techniques with edge cases | When a specific technique behaves unexpectedly |
| `references/local-variable-decision-tree.md` | Extended decision tree for local variable classification | When local variable analysis is ambiguous |
**Related skills:**
- `code-smell-diagnosis` — identifies Long Method and points here for execution
- `conditional-simplification-strategy` — simplifies complex conditionals exposed after
decomposition
- `build-refactoring-test-suite` — create the test safety net if none exists before starting
- `class-responsibility-realignment` — when decomposition reveals that extracted methods
belong in a different class (Feature Envy)
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-conditional-simplification-strategy`
- `clawhub install bookforge-build-refactoring-test-suite`
- `clawhub install bookforge-class-responsibility-realignment`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Apply the correct data organization refactoring when code smells in data structure design are diagnosed — Primitive Obsession, Data Clumps, Data Class, or ra...
---
name: data-organization-refactoring
description: |
Apply the correct data organization refactoring when code smells in data structure design are diagnosed — Primitive Obsession, Data Clumps, Data Class, or raw structural anti-patterns like magic numbers, positional arrays, and naked public fields. Covers the full Chapter 8 catalog: Replace Data Value with Object (primitive → first-class object); Change Value to Reference / Change Reference to Value (value vs. reference object decision); Self Encapsulate Field (internal field access via accessors); Encapsulate Field (public → private with accessors); Encapsulate Collection (raw collection → controlled add/remove protocol); Replace Array with Object (positional array → named-field object); Replace Magic Number with Symbolic Constant; Replace Record with Data Class (legacy record → typed wrapper); Duplicate Observed Data (domain data trapped in GUI → domain class + observer sync); Change Unidirectional Association to Bidirectional (one-way link → two-way when both ends need navigation); Change Bidirectional Association to Unidirectional (drop unnecessary back pointer). Use when: a field stores a raw primitive (string, int) but has behavior waiting to happen (formatting, validation, comparison); the same 2-4 data items travel together through method signatures and field lists (Data Clumps); a class exists only as a getter/setter bag with no behavior (Data Class); a collection field is exposed so callers can mutate it directly; positional arrays or records need to cross the boundary into object-oriented design; a numeric literal with special meaning appears in more than one place; a GUI class owns domain data that business methods need; a one-way association is insufficient or a two-way association has become unnecessarily complex. Type code refactorings (Replace Type Code with Class/Subclasses/State-Strategy) are handled by the sibling skill `type-code-refactoring-selector`.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/data-organization-refactoring
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on:
- code-smell-diagnosis
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [8]
tags: [refactoring, code-quality, data-modeling, encapsulation]
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "Source code containing the data structure problem — the class(es) with primitive fields, raw collections, exposed data, or positional arrays"
- type: document
description: "Code snippet, class description, or diagnosed smell report if no live codebase is accessible"
tools-required: [Read, Grep, Write]
tools-optional: [Bash]
mcps-required: []
environment: "Run inside a project directory. Read source files to locate the data structures; grep for field access patterns and method signatures to understand usage scope."
discovery:
goal: "Select the correct data organization refactoring for the presenting smell; execute the mechanics step by step; verify that clients use the new interface correctly and that the old exposure points are removed"
tasks:
- "Identify the presenting smell: Primitive Obsession, Data Clumps, Data Class, raw collection exposure, positional array, magic number, or structural record"
- "Apply the selection framework to pick the specific refactoring"
- "For value vs. reference decisions: determine whether the object has real-world identity or is defined purely by its data values"
- "Execute the mechanics of the selected refactoring in safe, compilable steps"
- "Verify that all callers use the new interface; confirm that the old exposure point is removed or made private"
- "Identify follow-on refactorings (Move Method to the new class; Replace Conditional with Polymorphism for type codes)"
audience:
roles: ["software-developer", "senior-developer", "tech-lead", "code-reviewer"]
experience: "intermediate — assumes working knowledge of object-oriented design and accessor patterns"
triggers:
- "Primitive Obsession or Data Clumps smell diagnosed — raw data needs a first-class object"
- "Data Class smell diagnosed — a class with only getters/setters needs behavior moved in"
- "A collection field is publicly settable or directly returned, allowing callers to mutate it"
- "An array where elements at index 0, 1, 2 mean different named things"
- "A numeric literal with a special meaning appears in more than one place"
- "Domain business logic is embedded in a GUI class and needs extraction"
- "A one-way object association is insufficient for a new feature; or a two-way link is creating zombie objects and coupling complexity"
not_for:
- "Type code refactorings (Replace Type Code with Class/Subclasses/State-Strategy) — use type-code-refactoring-selector instead"
- "Conditional simplification not caused by data structure problems — use conditional-simplification-strategy instead"
- "Class responsibility problems (Feature Envy, Shotgun Surgery, Divergent Change) — use class-responsibility-realignment instead"
- "New code written from scratch — these refactorings apply to existing data structures"
---
# Data Organization Refactoring
## When to Use
You have data structures in existing code that are making the code harder to read, harder to change, or actively attracting bugs. The data may be raw primitives with hidden behavior, loosely grouped fields that travel together but have no home object, or collections exposed for direct mutation by callers.
This skill applies when:
- A string, integer, or other primitive is doing work that a class should do — formatting, validation, comparison, area code extraction, currency conversion
- The same cluster of 2-4 fields appears together in field lists, parameter lists, and return types (Data Clumps smell)
- A class exists only as a getter/setter holder; other classes manipulate its data in excessive detail (Data Class smell)
- A collection field is returned directly so callers can add, remove, or replace elements without the owning class knowing
- An array uses positional index conventions (element 0 is name, element 1 is wins) that only comments explain
- A numeric literal with domain meaning appears in more than one place, making the meaning invisible
- GUI code contains domain data and business calculations that domain objects need access to
- A one-way object link needs to become bidirectional, or a bidirectional link is no longer earning its complexity cost
**The core insight from Fowler:** Data items start simple and grow. A telephone number starts as a string, but eventually needs formatting, area code extraction, and validation — it has become a first-class object. The signal is not the complexity of the current data item; it's the behavior that wants to live on it. When you find yourself adding the same behavior to the owner of the primitive rather than to a class that represents the concept, the primitive is overdue for promotion.
**Scope boundary:** Type code refactorings (Replace Type Code with Class, Replace Type Code with Subclasses, Replace Type Code with State/Strategy) are covered by the sibling skill `type-code-refactoring-selector`. When a magic number or type code enum is driving switch statement behavior, use that skill. When data merely needs a better structural home, use this one.
---
## Context and Input Gathering
### Required Input (must have — ask if missing)
- **The class or code fragment with the data problem.** Either a file path, a class name, or a pasted snippet. Why: data organization problems are structural — they require seeing the field declarations, the methods that use the fields, and the callers that access those methods. Without this, selection of the correct refactoring is guesswork.
- **The diagnosed smell** (if available). If `code-smell-diagnosis` has already been run, use its output to confirm which smell is present. Why: multiple smells can present similarly (Primitive Obsession vs. Data Clumps vs. Data Class). Knowing the smell name from diagnosis avoids picking the wrong refactoring.
- **Language and framework.** Why: collection encapsulation mechanics differ by language (Java unmodifiable views vs. Python property decorators vs. TypeScript readonly). Value object equality semantics differ. The mechanics steps must match the language.
### Observable Context (gather before asking)
Scan the code to orient:
```
Smell signals to look for:
- Fields that are primitives with related behavior scattered on the owner → Primitive Obsession
- Same 2-4 fields appearing together in 3+ places → Data Clumps
- Class with only getters/setters and no behavior methods → Data Class
- public field or collection returned directly → missing Encapsulate Field / Encapsulate Collection
- array[0], array[1], array[2] with comments naming each slot → Replace Array with Object
- Numeric literals (9.81, 0.85, 24) in domain calculations → Replace Magic Number with Symbolic Constant
- Business calculation methods on a class that extends a GUI framework → Duplicate Observed Data needed
- One class holds a reference to another but the referenced class cannot navigate back → unidirectional association
```
### Default Assumptions
- If multiple smells are present: prioritize by structural impact — replace primitives with objects first (enables other refactorings), then encapsulate, then handle associations
- If scope is unclear: focus on the class the user named; expand to callers only when verifying that the new interface is used correctly
- If the smell is borderline: do the delete-one-item test for Data Clumps (if removing one field from a group makes the others meaningless, it's a clump)
---
## Refactoring Selection Framework
**Start here.** Match the presenting symptom to the correct refactoring before executing mechanics.
```
SYMPTOM → REFACTORING
A primitive field has behavior that keeps growing on the owner class
→ Replace Data Value with Object
Then decide: does the resulting object have identity (real-world entity)?
YES → Change Value to Reference
NO → keep as value object (immutable, equality by value)
A field is accessed directly within the same class, and you need subclass
override flexibility or lazy initialization
→ Self Encapsulate Field
A public field is accessed by external classes
→ Encapsulate Field
(first step only — a class with just accessors is still a Data Class;
follow with Move Method to bring behavior in)
A collection field is returned directly, allowing callers to mutate it
→ Encapsulate Collection
An array where index position encodes meaning
→ Replace Array with Object
A numeric literal with domain meaning appears in 2+ places
→ Replace Magic Number with Symbolic Constant
Exception: if the literal is a type code driving switch behavior
→ use type-code-refactoring-selector instead
A legacy record or external API structure needs an object-oriented interface
→ Replace Record with Data Class
Domain data and business methods are trapped in a GUI class
→ Duplicate Observed Data
One class needs to navigate to the other but only a one-way link exists
→ Change Unidirectional Association to Bidirectional
A two-way association exists but one end no longer needs the other
→ Change Bidirectional Association to Unidirectional
The same 2-4 data items travel together in field lists and parameter lists
→ Extract Class (Data Clumps path — see code-smell-diagnosis for full Data Clumps mechanics)
Then: Introduce Parameter Object or Preserve Whole Object at call sites
```
---
## Process
### Step 1: Locate and Understand the Data Structure
**ACTION:** Read the class containing the problematic data. Identify the field(s), their type, how they are set, and how they are used. Then grep for all callers.
**WHY:** Data organization refactorings are usage-driven. The field declaration tells you what exists; the callers tell you what behavior the data is accumulating, which decides whether a value object or reference object is needed, and what methods need to move. Reading only the field declaration without understanding usage leads to incomplete refactorings where the data gets a new type but the behavior stays scattered on the wrong class.
Questions to answer before selecting mechanics:
- What type is the field currently? (int, String, array, raw collection)
- What operations are performed on it by the owner class? By callers?
- Does the field appear in method parameter lists? (Data Clumps signal)
- Is it publicly accessible? (Encapsulate Field needed)
- Does it change after the object is created? (relevant for value vs. reference decision)
- Do multiple objects need to represent the same conceptual instance? (Change Value to Reference trigger)
---
### Step 2: Execute the Selected Refactoring
Work through the mechanics of the refactoring selected in the framework above. Each refactoring below states its mechanics and the WHY for each step.
---
#### Replace Data Value with Object
**When:** A field is a primitive (string, int, float) but behavior keeps accumulating on the owner — formatting, parsing, comparison, validation.
**Mechanics:**
1. Create a new class for the value. Give it a final field of the same type as the current primitive. Add a getter and a constructor that takes the primitive as argument. WHY: the new class starts as a thin wrapper; the goal is to create a stable home for the behavior, not to design the perfect API upfront.
2. Compile.
3. Change the field type in the source class to the new class. WHY: this is the type boundary that forces all callers to go through the new class rather than manipulating the raw primitive.
4. Change the getter in the source class to delegate to the getter on the new class. WHY: callers that used the getter continue to work; the internal representation is now the new class.
5. If the field appears in the constructor, assign the field using the new class constructor. WHY: the creation path must build the right type from the start.
6. Update the setter to create a new instance of the new class. WHY: since value objects should be immutable, the setter replaces the whole object rather than mutating it in place.
7. Compile and test.
8. Rename the accessors on the source class to reflect what they now return. WHY: `getCustomerName()` is clearer than `getCustomer()` when the getter returns the name rather than the object itself.
**Follow-on decision:** After Replace Data Value with Object, determine whether the new object needs to be a reference object. If multiple owner objects need to share the same conceptual instance (e.g., all orders for the same customer should point to the same Customer object), apply Change Value to Reference next.
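
A minimal sketch of where these mechanics end up, assuming an `Order` that used to store its customer as a raw `String`. The class and method names are illustrative, not taken from a real codebase.

```java
// New value class: starts as a thin, immutable wrapper around the primitive.
final class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class Order {
    // Field type changed from String to the new class.
    private Customer customer;

    // The constructor builds the new type from the incoming primitive.
    Order(String customerName) {
        this.customer = new Customer(customerName);
    }

    // Getter delegates to the new class; the name says a name is returned.
    String getCustomerName() {
        return customer.getName();
    }

    // Setter replaces the whole value object instead of mutating it.
    void setCustomer(String customerName) {
        this.customer = new Customer(customerName);
    }
}
```

Callers still pass and receive strings, so this step changes no client code. Behavior such as formatting or validation now has a home on `Customer`.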
---
#### Value Object vs. Reference Object Decision
**This is the most consequential decision in data organization refactoring.** Getting it wrong causes aliasing bugs (mutable value objects) or unnecessary coordination complexity (reference objects where values would suffice).
**Value objects** (Date, Money, Currency, PhoneNumber):
- Defined entirely by their data — two instances with the same data are equal
- Immutable: to "change" a value object, you replace it with a new one
- Equality by value: override `equals()` and `hashCode()` based on fields
- Copyable freely: no aliasing risk because mutation is not possible
- Preferred for: amounts, measurements, identifiers, codes, coordinates
**Reference objects** (Customer, Account, Employee, Order):
- Represent a real-world entity with identity independent of their data
- One instance per real-world entity: two Customer objects with the same name are still two different customers
- Equality by identity (object reference, database key)
- Require a registry or factory to enforce the one-instance constraint
- Preferred for: entities whose state changes over time; entities that several other objects must share as a single instance
**Decision criteria:**
```
QUESTION → DIRECTION
Does changing this object's data need to be seen → Reference object
by all other objects that hold a reference to it? (aliasing is required)
Does the object represent a real-world entity with → Reference object
independent existence (customer, account, order)?
Is this a measurement, amount, code, or coordinate → Value object
defined purely by its data?
Would it be correct for two objects to each have → Value object
their own independent copy?
Is the object used in distributed or concurrent → Value object (safer)
contexts where shared mutable state is problematic?
```
**Change Value to Reference** (when a value object needs to become a reference object):
1. Apply Replace Constructor with Factory Method on the new class. WHY: a factory gives you control over object creation — the constructor can be made private, and the factory returns the single canonical instance.
2. Decide what object will serve as the registry (static field on the class, a separate registry object, or the container object). WHY: reference objects require a way to find the existing instance rather than creating a new one.
3. Decide whether objects are pre-created or created on demand. WHY: pre-created (loaded at startup from a database or config) requires handling the "not found" case; on-demand creates only when first requested.
4. Alter the factory method to return the existing instance from the registry. Rename the factory to convey that it retrieves rather than creates (e.g., `getNamed()` instead of `create()`). WHY: the name communicates the semantics — callers should know they are getting a shared instance.
5. Compile and test.
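
One possible shape for the result, assuming on-demand creation and a static map on the class as the registry. Both are choices made for this sketch, and `Customer` and `getNamed()` are illustrative names.

```java
import java.util.HashMap;
import java.util.Map;

class Customer {
    // Registry: a static field on the class (assumption for this sketch).
    // A plain HashMap is not thread-safe; concurrent use would need more care.
    private static final Map<String, Customer> REGISTRY = new HashMap<>();

    private final String name;

    // Private constructor: the factory is the only way to obtain an instance.
    private Customer(String name) {
        this.name = name;
    }

    // Factory named to signal retrieval of a shared instance.
    // Creates the instance on first request, then reuses it.
    static Customer getNamed(String name) {
        return REGISTRY.computeIfAbsent(name, Customer::new);
    }

    String getName() {
        return name;
    }
}
```

Two orders that ask for `Customer.getNamed("Ada")` now hold the same object, so a change to that customer is visible to both.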
**Change Reference to Value** (when a reference object is too awkward and should become a value object):
1. Verify the object is immutable or can be made immutable. Remove setting methods until no mutation remains. WHY: a mutable value object is a trap — callers who copy the reference see each other's changes, producing aliasing bugs that are very hard to diagnose.
2. If the object cannot become immutable, abandon this refactoring. WHY: the immutability requirement is not optional; a mutable value object is worse than the original reference object.
3. Implement `equals()` and `hashCode()` based on the object's data fields. WHY: value objects are equal when their data is equal; without overriding these methods, equality falls back to object identity, defeating the purpose of the conversion.
4. Compile and test.
5. Remove the factory method and make the constructor public. WHY: value objects are created freely; there is no registry, no single-instance constraint.
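
A minimal sketch of the end state for a small code-like object. The class name `CurrencyCode` is an assumption chosen to avoid confusion with `java.util.Currency`.

```java
import java.util.Objects;

// Immutable value object: no setters, final field, equality by data.
final class CurrencyCode {
    private final String code;

    // Constructor is freely callable; there is no registry to consult.
    CurrencyCode(String code) {
        this.code = Objects.requireNonNull(code, "code");
    }

    String getCode() {
        return code;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CurrencyCode)) return false;
        return code.equals(((CurrencyCode) other).code);
    }

    // Must agree with equals so hash-based collections behave correctly.
    @Override
    public int hashCode() {
        return code.hashCode();
    }
}
```

With both methods overridden, two separately constructed `CurrencyCode("USD")` instances compare equal and collide correctly in a `HashSet` or `HashMap`.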
---
#### Self Encapsulate Field
**When:** A class accesses its own field directly, and you need subclasses to be able to override how the value is produced (computed value, lazy initialization) without changing the field access code scattered through the class.
**Mechanics:**
1. Create getter and setter methods for the field. WHY: accessors are the hook point that subclasses override. Without them, the only way to intercept field access is to change every direct reference.
2. Find all references to the field within the same class and replace direct access with calls to the getter (reads) and setter (writes). WHY: once all internal access goes through the accessor, a subclass can override the getter to return a computed or cached value, and the rest of the class automatically picks up the new behavior.
3. Make the field private. WHY: private forces all access — internal and external — through the accessors. This prevents subclasses from bypassing the accessor by directly accessing the field.
4. In constructors, prefer direct field access or a separate `initialize()` method rather than the setter. WHY: setters often have behavior that is appropriate for changes after construction but not for initialization. Using the setter in the constructor can trigger that behavior prematurely.
5. Compile and test.
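
The payoff can be sketched with a range class and a subclass that caps the upper bound. The names `IntRange` and `CappedRange` and the capping rule are illustrative assumptions.

```java
class IntRange {
    private int low;
    private int high;

    // Constructor assigns fields directly rather than calling setters.
    IntRange(int low, int high) {
        this.low = low;
        this.high = high;
    }

    int getLow() {
        return low;
    }

    int getHigh() {
        return high;
    }

    void setLow(int low) {
        this.low = low;
    }

    void setHigh(int high) {
        this.high = high;
    }

    // Internal reads go through the accessors, not the fields.
    boolean includes(int value) {
        return value >= getLow() && value <= getHigh();
    }
}

// The subclass changes how the value is produced by overriding one accessor.
class CappedRange extends IntRange {
    private final int cap;

    CappedRange(int low, int high, int cap) {
        super(low, high);
        this.cap = cap;
    }

    @Override
    int getHigh() {
        return Math.min(super.getHigh(), cap);
    }
}
```

`includes()` picks up the capped value without modification, which is only possible because it never reads `high` directly.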
---
#### Encapsulate Field
**When:** A field is public and accessed directly by external classes, violating the encapsulation principle. The class cannot control what values are set or observe when the value changes.
**Mechanics:**
1. Create getter and setter methods for the field.
2. Find all external clients that reference the field directly. Replace reads with calls to the getter; replace writes with calls to the setter. WHY: direct field access bypasses the owning class entirely. Once callers go through accessors, the owning class can add validation, notification, or lazy initialization in the future without changing callers.
3. Compile and test after each client change.
4. Once all external references are replaced, declare the field private. WHY: the private modifier enforces that future code cannot bypass the accessor, preserving the encapsulation permanently.
5. After encapsulation, look at which methods on external classes use the new accessor to compute something about this object's data. Those methods are Feature Envy candidates — consider moving them to the class that owns the data.
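
A small sketch of the end state. The `Employee` class and the non-empty-name check are hypothetical; the check only illustrates the kind of rule that becomes possible once all writes go through the setter.

```java
class Employee {
    // Was: public String name;
    private String name;

    String getName() {
        return name;
    }

    // With all writes routed here, the class can start enforcing rules.
    void setName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        this.name = name;
    }
}
```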
---
#### Encapsulate Collection
**When:** A method returns a collection field directly (a list, set, or map), allowing callers to add, remove, or replace elements without the owning class knowing.
**Mechanics:**
1. Add `add(element)` and `remove(element)` methods to the owning class. WHY: these are the controlled mutation points. The owning class can enforce invariants (uniqueness, ordering, related state updates) in these methods.
2. Initialize the field to an empty collection in the field declaration. WHY: callers should not need to check for null before using the collection; an empty collection is always safe.
3. Find all callers of the setter. Modify the setter to use the add and remove methods rather than directly assigning the field, or have callers call add/remove directly and remove the setter. WHY: a setter that replaces the whole collection bypasses the controlled mutation protocol; the setter should be renamed to `initialize` to clarify it is for initial population only, or removed entirely.
4. Find all callers of the getter that then mutate the collection (e.g., `person.getCourses().add(...)`). Change them to call the new add/remove methods on the owning class. WHY: getter-then-mutate is the same as direct field access — it bypasses the owning class's control.
5. Once no caller mutates through the getter, change the getter to return an unmodifiable view or an immutable copy (Java: `Collections.unmodifiableSet()` returns a view; Python: `tuple()` or `frozenset()` return copies; TypeScript: a `readonly` array type, which is checked at compile time only). WHY: callers can no longer mutate the owning class's collection through the getter, so the encapsulation holds without relying on convention.
6. Look at client code that iterates the collection to perform operations on the elements. If those operations use only the owning class's data, they are candidates for Move Method — they belong on the owning class. WHY: Encapsulate Collection is the beginning; the end goal is a class that can answer questions about its own collection rather than leaking the collection for callers to query.
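
The mechanics above might end up looking like this in Java. The `Person` and `Course` classes and the `numberOfAdvancedCourses()` query are illustrative assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Course {
    private final String name;
    private final boolean advanced;

    Course(String name, boolean advanced) {
        this.name = name;
        this.advanced = advanced;
    }

    String getName() {
        return name;
    }

    boolean isAdvanced() {
        return advanced;
    }
}

class Person {
    // Initialized at declaration, so callers never see null.
    private final Set<Course> courses = new HashSet<>();

    // Controlled mutation points; invariants could be enforced here.
    void addCourse(Course course) {
        courses.add(course);
    }

    void removeCourse(Course course) {
        courses.remove(course);
    }

    // Read-only view: add/remove on the result throws
    // UnsupportedOperationException.
    Set<Course> getCourses() {
        return Collections.unmodifiableSet(courses);
    }

    // A query moved onto the owner instead of being computed by callers.
    int numberOfAdvancedCourses() {
        int count = 0;
        for (Course course : courses) {
            if (course.isAdvanced()) count++;
        }
        return count;
    }
}
```

There is no setter at all in this version; initial population goes through `addCourse()` like every other change.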
---
#### Replace Array with Object
**When:** An array is used to hold heterogeneous data where position encodes meaning — `row[0]` is the team name, `row[1]` is wins, `row[2]` is losses. Position-as-convention is fragile and invisible.
**Mechanics:**
1. Create a new class to represent the information in the array. Give it a public field holding the original array (temporarily). WHY: starting with a public field avoids having to change all callers immediately; the array stays inside the new class so existing code keeps working during the transition.
2. Change all sites that create the array to create the new class instead. WHY: the new class is now the entry point; the internal array is an implementation detail that will be replaced.
3. One by one, add a named getter and setter for each element of the array. Name the accessors after the purpose of that element, not after its index. WHY: the names replace the fragile index convention. `row.getName()` is self-documenting; `row[0]` requires the reader to remember the convention.
4. Change all callers to use the named accessors instead of array indexing. Compile and test after each element is converted. WHY: small steps mean each compile-and-test cycle confirms nothing broke. Changing all callers at once risks accumulating errors.
5. Once all external access uses named accessors, make the internal array private. WHY: no caller should be able to access by index anymore; private enforces the new API boundary.
6. Replace each array element with a proper named field in the class. Update the accessors to use the field instead of the array slot. Delete the array when all elements have been replaced. WHY: the array was the transitional mechanism; named fields are the real destination. Each field has a type appropriate to its domain meaning (not all strings).
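
A sketch of the final state for the team-row example, with the original array usage shown in comments. The class name `Performance` and the field types are assumptions for illustration.

```java
// Before (position encodes meaning):
//   String[] row = new String[3];
//   row[0] = "Liverpool";   // team name
//   row[1] = "15";          // wins
//   row[2] = "4";           // losses

// After: each slot is a named, properly typed field.
class Performance {
    private String name;
    private int wins;
    private int losses;

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }

    int getWins() {
        return wins;
    }

    void setWins(int wins) {
        this.wins = wins;
    }

    int getLosses() {
        return losses;
    }

    void setLosses(int losses) {
        this.losses = losses;
    }
}
```

Once wins and losses are integers rather than strings, callers no longer need to parse them before doing arithmetic.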
---
#### Replace Magic Number with Symbolic Constant
**When:** A literal number with special domain meaning appears in the code. The reader cannot tell from the literal what it means.
**Mechanics:**
1. Declare a constant and set it to the value of the magic number. Name the constant after the meaning, not the value (e.g., `GRAVITATIONAL_CONSTANT` not `NINE_POINT_EIGHT_ONE`). WHY: the name is the documentation. A constant named after its value provides no more information than the literal; a constant named after its meaning makes the code self-explanatory.
2. Find all occurrences of the magic number in the codebase. WHY: if the number appears in multiple places and you only replace some, the "constant" now has two representations that can drift.
3. Replace each occurrence with the constant, but only if the usage matches the meaning. WHY: the same numeric value can appear in the code for different reasons (e.g., the number 24 might be hours in a day in one place and an array size in another). Do not replace all occurrences blindly; check that each occurrence represents the same concept.
4. Compile. Verify by changing the constant value temporarily and confirming that all the expected places in the code change behavior — this confirms coverage is complete.
**Alternatives to consider first:**
- If the magic number is the length of an array, use `array.length` instead of a constant. WHY: `array.length` is always correct even if the array size changes; a constant can drift.
- If the magic number is a type code (ENGINEER = 1, SALESMAN = 2), use `type-code-refactoring-selector` instead. WHY: type codes need polymorphism, not just symbolic names.
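
A minimal before-and-after sketch using the constant name from the mechanics above. The `Physics` class and the `potentialEnergy()` method are illustrative.

```java
class Physics {
    // Named after its meaning, not its value.
    static final double GRAVITATIONAL_CONSTANT = 9.81;

    // Before: return mass * 9.81 * height;
    static double potentialEnergy(double mass, double height) {
        return mass * GRAVITATIONAL_CONSTANT * height;
    }
}
```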
---
#### Replace Record with Data Class
**When:** Code interfaces with a legacy record structure (from a traditional programming environment, an external API, or a database row) that needs an object-oriented wrapper.
**Mechanics:**
1. Create a class to represent the record. Give it a private field with a getter and a setter for each data item in the record. WHY: the class starts as a dumb data object that matches the external structure. This is intentional — it creates a stable interface between the legacy structure and the rest of the codebase.
2. Once the wrapper exists, apply Encapsulate Field to any public fields, and use Move Method to migrate behavior that operates on the record into the wrapper class. WHY: Replace Record with Data Class is the beginning, not the end. The real value comes when the wrapper grows into a real domain object with its own behavior.
---
#### Duplicate Observed Data
**When:** A GUI class (a window, form, or controller) contains both the domain data (e.g., start date, end date, length of interval) and the business calculations on that data. The business logic cannot be tested without the GUI; it cannot be reused in other contexts.
**This is the most complex refactoring in the chapter.** It requires the Observer pattern (or equivalent event listener mechanism). Apply it when the coupling between GUI and domain is blocking testability or reuse.
**Mechanics:**
1. If no domain class exists for the data, create one. Make it extend or implement an observable/event-source mechanism. WHY: the domain class needs a way to notify the GUI when its data changes; the GUI needs a way to react without the domain class knowing about the GUI.
2. Make the GUI class an observer of the domain class. Store the domain object as a field in the GUI class. WHY: the GUI subscribes to domain changes and updates its controls in response; this is the decoupling mechanism.
3. Apply Self Encapsulate Field on each domain data field within the GUI class. WHY: self-encapsulation is the prerequisite for redirecting the accessor to the domain class later. You cannot redirect scattered direct field access — you can only redirect calls through a single accessor.
4. In each field's event handler, add a call to the setter (using the current field value). WHY: this is the synchronization trigger — when the user changes a field in the GUI, the event fires, the setter runs, and the setter will eventually update the domain object.
5. Add the same field to the domain class with a getter and setter. In the domain setter, trigger notification to observers. WHY: the domain class now owns the canonical value. The notification is what keeps the GUI in sync when the domain changes.
6. Redirect the GUI's getter to read from the domain object. Redirect the GUI's setter to write to the domain object. WHY: the GUI is now a view of the domain data; it does not own the data itself.
7. Implement the observer's update method in the GUI to pull the current value from the domain and update the GUI control. WHY: this is the downstream sync path — when the domain changes (from any source, not just this GUI field), the GUI reflects the change.
8. Once all data is duplicated and synchronized, use Move Method to migrate the business calculation methods from the GUI class to the domain class. WHY: once the domain class has the data, the calculations naturally belong there. The GUI class becomes a pure view.
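
The end state of these mechanics can be sketched without a GUI toolkit. The sketch below is an illustration under assumptions, not Fowler's listing: `Interval`, `IntervalView`, the `String` fields standing in for text controls, and the plain `Runnable` listener list (in place of a toolkit observer mechanism) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Domain class: owns the canonical data and the calculation (steps 1, 5, 8).
class Interval {
    private final List<Runnable> observers = new ArrayList<>();
    private int start;
    private int end;

    void addObserver(Runnable observer) { observers.add(observer); }

    int getStart() { return start; }
    int getEnd() { return end; }

    // Domain setters notify observers so every view stays in sync.
    void setStart(int value) { start = value; notifyObservers(); }
    void setEnd(int value) { end = value; notifyObservers(); }

    // Business calculation that used to live in the GUI class.
    int getLength() { return end - start; }

    private void notifyObservers() {
        for (Runnable observer : observers) observer.run();
    }
}

// Stand-in for the GUI class; String fields play the role of text controls.
class IntervalView {
    private final Interval subject;
    String startField = "0";
    String endField = "0";
    String lengthField = "0";

    IntervalView(Interval subject) {
        this.subject = subject;
        subject.addObserver(this::update);   // step 2: observe the domain object
        update();
    }

    // Step 4: what a change or focus-lost handler would call.
    void startFieldEdited(String text) { setStart(text); }
    void endFieldEdited(String text) { setEnd(text); }

    // Step 6: the view's setters write through to the domain object.
    private void setStart(String text) { subject.setStart(Integer.parseInt(text)); }
    private void setEnd(String text) { subject.setEnd(Integer.parseInt(text)); }

    // Step 7: pull current values from the domain into the controls.
    private void update() {
        startField = String.valueOf(subject.getStart());
        endField = String.valueOf(subject.getEnd());
        lengthField = String.valueOf(subject.getLength());
    }
}
```

Because `Interval` has no reference to `IntervalView`, its `getLength()` calculation can be unit-tested directly, which is the testability gain this refactoring is meant to deliver.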
---
#### Change Unidirectional Association to Bidirectional
**When:** Two classes need to use each other's features, but only a one-way link exists. Adding the reverse link is needed for a new feature.
**Mechanics:**
1. Add a field for the back pointer on the class that currently has no reference. WHY: the back pointer is what makes the association bidirectional; without the field, the reverse navigation does not exist.
2. Decide which class will control the association. Apply this rule: for one-to-many associations, the object that holds the single reference controls; for component-composite associations, the composite controls; for many-to-many, either can control. WHY: one class controlling keeps the pointer-maintenance logic in one place, preventing inconsistency.
3. Create a restricted-access helper method on the non-controlling class to expose the back pointer for the controller's use only. Name it to signal its restricted purpose (e.g., `friendOrders()`). WHY: the helper lets the controlling class maintain consistency without making the back pointer fully public, limiting the surface area for misuse.
4. On the controlling class, modify the setter/modifier to update both pointers: first tell the old related object to remove this; then update this object's pointer; then tell the new related object to add this. WHY: both sides of the association must stay in sync. The three-step update (remove old, set new, add new) handles null cases and prevents a pointer being left dangling.
5. Compile and test. WHY: bidirectional associations are easy to get wrong — the three-step update and the helper method are both error-prone. Test that setting the association from one side is reflected on the other side.
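
A minimal sketch of the three-step update, assuming a one-to-many `Customer`/`Order` association in which `Order` holds the single reference and therefore controls the link. The class shapes are illustrative; only the `friendOrders()` helper name comes from the mechanics above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Customer {
    private final Set<Order> orders = new HashSet<>();

    // Restricted helper: meant to be called only by Order while it maintains the link.
    Set<Order> friendOrders() { return orders; }

    // Read-only view for everyone else.
    Set<Order> getOrders() { return Collections.unmodifiableSet(orders); }
}

class Order {
    private Customer customer;

    Customer getCustomer() { return customer; }

    // Controlling side: remove from the old customer, set the pointer, add to the new one.
    void setCustomer(Customer newCustomer) {
        if (customer != null) customer.friendOrders().remove(this);
        customer = newCustomer;
        if (customer != null) customer.friendOrders().add(this);
    }
}
```

The two null checks are what let the same setter handle the first assignment, a reassignment, and clearing the association.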
---
#### Change Bidirectional Association to Unidirectional
**When:** A two-way link exists but one end no longer needs the other. Bidirectional associations add complexity: they must be maintained in sync, they can create zombie objects (objects that cannot be garbage-collected because a back pointer keeps them alive), and they introduce coupling between packages.
**Mechanics:**
1. Examine all readers of the field holding the pointer you want to remove. For each reader, determine whether the object can be obtained another way — via a parameter, by traversing from another known object, or via Substitute Algorithm on the getter. WHY: before removing the pointer, you must confirm that every use can be satisfied without it. Finding an alternative is the feasibility gate.
2. If clients need the getter, apply Self Encapsulate Field first, then Substitute Algorithm on the getter body to obtain the object without the field. WHY: Self Encapsulate Field routes all access through one point; Substitute Algorithm changes what that one point returns, eliminating field reads without changing callers.
3. If clients can obtain the object another way, change each caller to get it from that other source. Compile and test after each change. WHY: one-at-a-time changes with compile-and-test between them keeps the code in a working state throughout the refactoring.
4. Once no reader uses the field, remove all assignments to the field and delete the field. WHY: an unread field is dead code; removing it eliminates the maintenance burden and prevents future confusion about why it exists.
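
One way step 1's "via a parameter" alternative might look, sketched under assumptions: `Order` previously read a customer back pointer to compute a discounted price, and the caller already has the customer in hand. The class shapes and the `getPriceFor` method are hypothetical.

```java
class Customer {
    private final double discount;

    Customer(double discount) { this.discount = discount; }

    double getDiscount() { return discount; }

    // The customer already knows which order it is asking about, so it passes itself in.
    double getPriceFor(Order order) {
        return order.getDiscountedPrice(this);
    }
}

class Order {
    private final double grossPrice;

    Order(double grossPrice) { this.grossPrice = grossPrice; }

    double getGrossPrice() { return grossPrice; }

    // Formerly read a customer field on Order; the customer now arrives as a parameter,
    // so the back pointer has one fewer reader.
    double getDiscountedPrice(Customer customer) {
        return getGrossPrice() * (1 - customer.getDiscount());
    }
}
```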
---
### Step 3: Verify the New Interface is Complete
**ACTION:** After mechanics are complete, confirm: (1) the old exposure point is gone or private; (2) all callers use the new interface; (3) no callers bypass the new interface through another path.
**WHY:** Data organization refactorings leave behind debris if not verified. A collection field that was encapsulated but still has one caller using the getter to mutate directly is not encapsulated. A primitive replaced with an object but still passed as a raw string through one old parameter path is still a primitive at that path.
Verification checklist:
- Grep for direct field access by external classes (should be zero)
- Grep for the old collection getter usage that modifies the result (should be zero)
- Grep for the removed positional array access patterns (should be zero)
- Grep for the magic number literal (should be zero or only in the constant declaration)
- Confirm the value object has `equals()` and `hashCode()` overridden (if applicable)
- Confirm the reference object's factory enforces single-instance retrieval (if applicable)
---
### Step 4: Identify Follow-On Refactorings
**ACTION:** Look for the next refactoring that the completed refactoring enables.
**WHY:** Data organization refactorings are rarely endpoints. Replace Data Value with Object creates a class that should have behavior moved into it. Encapsulate Collection reveals client code that should be moved to the owning class. The value of each refactoring compounds when the follow-on steps are taken.
Common follow-on sequences:
| Completed refactoring | Natural follow-on |
|---|---|
| Replace Data Value with Object | Move Method — migrate behavior from the old owner to the new class |
| Change Value to Reference | Ensure registry is consistent; check that all creation sites use the factory |
| Encapsulate Collection | Move Method — move iteration/query code from callers to the owning class |
| Replace Array with Object | Move Method — behavior operating on the array slots belongs on the new class |
| Encapsulate Field (on a Data Class) | Move Method — the Data Class smell is not resolved until behavior moves in |
| Duplicate Observed Data | Move Method — business calculation methods migrate from GUI to domain class |
---
## Key Principles
**1. Value vs. reference is a decision that often needs reversing.**
Fowler explicitly notes that this decision is not always clear and frequently needs to be undone. Start with a value object (simpler, no registry needed, no aliasing risk). Convert to a reference object only when the aliasing requirement becomes concrete — when two objects genuinely need to share the same instance so that changes to one are seen by the other.
**2. Immutability is not optional for value objects.**
A mutable value object is worse than no refactoring. If callers copy the reference and then mutate the object through it, they silently affect each other's state. Before completing Change Reference to Value, verify that all setters are removed. If the object cannot become immutable, keep it as a reference object.
**3. Encapsulate Collection is a two-step refactoring.**
The first step is the interface change (add/remove methods, unmodifiable getter). The second step — which most developers skip — is moving the collection-operating code from callers back to the owning class. A collection that is encapsulated but still iterated externally for every operation has the right interface but has not yet earned its encapsulation.
**4. Self Encapsulate Field before Duplicate Observed Data.**
Self Encapsulate Field is the prerequisite for Duplicate Observed Data. Without self-encapsulation, field access is scattered across the GUI class and cannot be redirected to the domain object in a controlled way. Always apply Self Encapsulate Field first, verify it compiles, then proceed to the duplication and synchronization steps.
**5. Magic number replacement requires meaning-matching, not value-matching.**
Replace the literal only where it represents the same concept as the constant's name. The same number value can appear in code for different reasons. Replacing all occurrences of `24` with `HOURS_PER_DAY` will be wrong wherever `24` means something else entirely.
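
A small illustration of meaning-matching, using hypothetical names. Both constants happen to equal 24, but they name different concepts and must be able to change independently.

```java
class Scheduling {
    static final int HOURS_PER_DAY = 24;
    static final int MAX_RETRIES = 24;   // same value, unrelated concept

    // 24 here means "hours in a day", so HOURS_PER_DAY is the right constant.
    static int hoursIn(int days) {
        return days * HOURS_PER_DAY;
    }

    // 24 here is a retry limit; using HOURS_PER_DAY would compile but mislead.
    static boolean mayRetry(int attemptsSoFar) {
        return attemptsSoFar < MAX_RETRIES;
    }
}
```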
---
## Examples
### Example 1: Primitive Obsession — Telephone Number
**Scenario:** An `Employee` class has a `String telephoneNumber` field. Methods on `Employee` and its callers format the number, extract the area code, and validate the format in multiple places.
**Selection:** Replace Data Value with Object — the primitive has accumulated behavior that belongs on a class.
**Execution:**
1. Create `TelephoneNumber` class with `String _number` field, constructor `TelephoneNumber(String number)`, and getter `getNumber()`.
2. Change `Employee._telephoneNumber` type from `String` to `TelephoneNumber`.
3. Update `Employee`'s getter: `String getTelephoneNumber() { return _telephoneNumber.getNumber(); }`
4. Update constructor: `_telephoneNumber = new TelephoneNumber(number);`
5. Update setter: `_telephoneNumber = new TelephoneNumber(number);`
6. Move `formatNumber()`, `getAreaCode()`, and `isValid()` from `Employee` to `TelephoneNumber`.
**Value vs. reference decision:** Is a TelephoneNumber a real-world entity with independent identity? No — it is defined by its digits. Two `TelephoneNumber` objects with the same string are equal. Keep it as a value object. Implement `equals()` and `hashCode()` on the number string.
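
The resulting value object might look like the sketch below. It uses modern field naming rather than the `_number` style above, and the area-code logic assumes a hypothetical `AAA-EEE-NNNN` format purely for illustration.

```java
import java.util.Objects;

// Immutable value object: no setters, equality defined by the number string.
final class TelephoneNumber {
    private final String number;

    TelephoneNumber(String number) {
        this.number = Objects.requireNonNull(number);
    }

    String getNumber() { return number; }

    // Assumption: the first three characters are the area code.
    String getAreaCode() { return number.substring(0, 3); }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof TelephoneNumber)) return false;
        return number.equals(((TelephoneNumber) other).number);
    }

    @Override public int hashCode() { return number.hashCode(); }
}

class Employee {
    private TelephoneNumber telephoneNumber;

    Employee(String number) { telephoneNumber = new TelephoneNumber(number); }

    // The String-based interface is preserved for existing callers.
    String getTelephoneNumber() { return telephoneNumber.getNumber(); }

    // "Changing" the number replaces the value object rather than mutating it.
    void setTelephoneNumber(String number) { telephoneNumber = new TelephoneNumber(number); }

    String getAreaCode() { return telephoneNumber.getAreaCode(); }
}
```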
---
### Example 2: Data Class with Collection — Person and Courses
**Scenario:** `Person` has a `Set _courses` field with `getCourses()` and `setCourses(Set)` methods. Callers do: `person.getCourses().add(new Course(...))` and iterate the set externally to count advanced courses.
**Selection:** Encapsulate Collection — the collection is directly mutable by callers.
**Execution:**
1. Add `addCourse(Course arg) { _courses.add(arg); }` and `removeCourse(Course arg) { _courses.remove(arg); }`.
2. Initialize: `private Set _courses = new HashSet();`
3. Change the setter to `initializeCourses(Set arg)` that asserts the collection is empty then calls `addCourse` for each element. Or remove the setter entirely if callers can use `addCourse` directly.
4. Find `person.getCourses().add(...)` callers and change to `person.addCourse(...)`.
5. Change getter: `public Set getCourses() { return Collections.unmodifiableSet(_courses); }`
6. Move the advanced-course-counting iteration into `Person` as `numberOfAdvancedCourses()` — the external iteration was Feature Envy on `Person`'s data.
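
Taken together, the steps could leave `Person` looking roughly like this sketch. The `Course` class shape is an assumption, and generics are used in place of the raw `Set` shown above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Course {
    private final String name;
    private final boolean advanced;

    Course(String name, boolean advanced) {
        this.name = name;
        this.advanced = advanced;
    }

    String getName() { return name; }
    boolean isAdvanced() { return advanced; }
}

class Person {
    private final Set<Course> courses = new HashSet<>();

    // All mutation goes through the owning class.
    void addCourse(Course course) { courses.add(course); }
    void removeCourse(Course course) { courses.remove(course); }

    // Read-only view: callers can iterate but not add or remove.
    Set<Course> getCourses() { return Collections.unmodifiableSet(courses); }

    // Formerly an external loop over getCourses().
    int numberOfAdvancedCourses() {
        int count = 0;
        for (Course course : courses) {
            if (course.isAdvanced()) count++;
        }
        return count;
    }
}
```

A caller that still tries `person.getCourses().add(...)` now fails fast with `UnsupportedOperationException`, which makes any missed call site from step 4 visible at test time.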
---
### Example 3: Value vs. Reference — Customer Object
**Scenario:** After applying Replace Data Value with Object to a customer name string in `Order`, the `Customer` class exists but each order creates its own Customer object. A requirement arrives: update the credit rating for a customer, and all their orders must see the change.
**Selection:** The aliasing requirement is now concrete — multiple orders need the same customer instance. Apply Change Value to Reference.
**Execution:**
1. Add factory method: `public static Customer create(String name) { return new Customer(name); }` Make constructor private.
2. Create registry: `private static Dictionary _instances = new Hashtable();` and a private `store()` method that puts `this` in the registry by name.
3. Pre-load known customers: `Customer.loadCustomers()` populates the registry at startup.
4. Change factory to retrieve: `public static Customer getNamed(String name) { return (Customer) _instances.get(name); }`
5. Change `Order` constructor and setter to use `Customer.getNamed(name)` instead of `new Customer(name)`.
Result: all orders pointing to the same customer name now share one Customer object. Changes to credit rating are visible everywhere.
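
A sketch of the end state, using `HashMap` instead of the older `Dictionary`/`Hashtable` pair. The customer names and the in-memory `loadCustomers()` body are placeholders; a real system would presumably load from storage.

```java
import java.util.HashMap;
import java.util.Map;

class Customer {
    // Registry: one shared instance per customer name.
    private static final Map<String, Customer> instances = new HashMap<>();

    private final String name;
    private int creditRating;

    // Private constructor: instances are created only while loading the registry.
    private Customer(String name) { this.name = name; }

    // Hypothetical startup hook with placeholder data.
    static void loadCustomers() {
        new Customer("Lemon Car Hire").store();
        new Customer("Associated Coffee Machines").store();
    }

    private void store() { instances.put(name, this); }

    // Returns the shared instance, or null if the name was never loaded.
    static Customer getNamed(String name) { return instances.get(name); }

    String getName() { return name; }
    int getCreditRating() { return creditRating; }
    void setCreditRating(int rating) { creditRating = rating; }
}

class Order {
    private final Customer customer;

    // Looks up the shared instance instead of constructing a new Customer.
    Order(String customerName) { customer = Customer.getNamed(customerName); }

    Customer getCustomer() { return customer; }
}
```

Note that `getNamed` returns `null` for an unknown name in this sketch; whether to create on demand, throw, or return a Null Object is a separate design decision.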
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/value-vs-reference-guide.md` | Detailed decision tree for value vs. reference with distributed systems considerations, aliasing risk patterns, and language-specific equality semantics | Step 2 — value/reference decision |
| `references/collection-encapsulation-patterns.md` | Language-specific collection encapsulation patterns: Java unmodifiable views, Python properties and frozenset, TypeScript readonly arrays | Step 2 — Encapsulate Collection mechanics |
**Sibling skill relationships:**
- `code-smell-diagnosis` — run first to identify which data smell is present before selecting a refactoring
- `type-code-refactoring-selector` — for type code integers and enums that drive switch statement behavior; not covered by this skill
- `class-responsibility-realignment` — when Feature Envy or Inappropriate Intimacy is the primary smell alongside data problems
- `method-decomposition-refactoring` — when Long Method is present in the same class; often co-occurs with Data Class smell
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-type-code-refactoring-selector`
- `clawhub install bookforge-class-responsibility-realignment`
- `clawhub install bookforge-method-decomposition-refactoring`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Select and apply the correct refactoring for complex or tangled conditional logic. Use when: a method has a complicated if-then-else that obscures why branch...
---
name: conditional-simplification-strategy
description: |
Select and apply the correct refactoring for complex or tangled conditional logic. Use when: a method has a complicated if-then-else that obscures why branching happens (not just what happens); a series of conditions all produce the same result; the same code fragment appears inside every branch; a boolean variable is being used as a control flag to track loop state; nested conditionals bury the normal execution path under special-case checks; a switch statement (or long if-else-if chain) branches on an object's type and new types are expected; a method's parameter controls which of several distinct operations runs; null checks are scattered throughout client code for the same object. Covers 8 conditional refactorings from Fowler: seven from Chapter 9 (Decompose Conditional, Consolidate Conditional Expression, Consolidate Duplicate Conditional Fragments, Remove Control Flag, Replace Nested Conditional with Guard Clauses, Replace Conditional with Polymorphism, Introduce Null Object) plus Replace Parameter with Explicit Methods from Chapter 10. Also covers the supporting technique Introduce Assertion for making implicit state assumptions explicit. Includes the key semantic distinction between guard clauses (rare special cases that exit) and if-else (equal-weight alternatives), and Fowler's rejection of the single-exit-point rule as a reason to avoid early returns.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/conditional-simplification-strategy
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on:
- code-smell-diagnosis
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [9]
tags: [refactoring, code-quality, conditionals, polymorphism]
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "Source code containing the conditional logic to refactor — the primary input"
- type: document
description: "Code snippet or method body if no live codebase is accessible"
tools-required: [Read, Grep, Write, Edit]
tools-optional: [Bash]
mcps-required: []
environment: "Run inside a project directory. Reading source files and grepping for related conditionals is the primary analysis method."
discovery:
goal: "Identify which of the 8 conditional refactorings applies to the target conditional; execute the selected refactoring correctly; leave the code with conditional logic that clearly separates the reason for branching from the details of each branch"
tasks:
- "Read the target conditional and classify its structural problem"
- "Select the correct refactoring using the decision framework"
- "Apply the refactoring mechanics step by step, compiling and testing after each step"
- "Verify the result: is the normal path visible? Does each branch communicate intent, not just action?"
audience:
roles: ["software-developer", "senior-developer", "tech-lead"]
experience: "intermediate — assumes working knowledge of object-oriented programming and refactoring basics"
triggers:
- "A method has a complex if-then-else that requires re-reading to understand"
- "Switch Statements smell is diagnosed by code-smell-diagnosis"
- "Nested conditionals hide which execution path is the normal one"
- "Null checks for the same object appear in multiple client code locations"
- "A control flag (boolean tracking loop state) makes loop termination logic hard to read"
not_for:
- "Performance optimization — use profiling-driven-performance-optimization instead"
- "Type code refactoring decisions — use type-code-refactoring-selector for the full type code decision tree"
- "General code smell diagnosis — use code-smell-diagnosis first to identify which conditional problem is present"
- "Conditionals that are simple, stable, and clear — not every conditional needs refactoring"
---
# Conditional Simplification Strategy
## When to Use
You have conditional logic that is hard to read, hard to extend, or hiding its own intent. This includes:
- A method-level conditional where the condition checks multiple things but the code only tells you *what* happens, not *why* the branching exists
- Several separate conditions that all produce the same result, scattered as sequential checks
- The same code fragment repeated in every branch of a conditional
- A boolean variable toggled to control loop exit — the code is structuring state instead of expressing intent
- Nested if-else blocks where finding the "normal" path requires reading through multiple levels of special cases
- A switch (or long if-else-if chain) that dispatches behavior based on an object's type, and new types are expected
- Null checks for the same object in multiple client methods
**The core insight from Fowler:** Conditional logic has two parts: the *switching logic* (which path to take) and the *details* of what each path does. When these are mixed inside the same method body, the reader must decode both simultaneously. The refactorings in this skill separate these concerns so each part is named and readable on its own.
**This skill depends on diagnosis.** If you arrived here from `code-smell-diagnosis` with a Switch Statements finding, use the decision framework in Step 1 to select the right refactoring. If you have a simpler conditional problem (not a Switch Statements smell), the framework also applies — start with the pattern that matches your code structure.
---
## Context and Input Gathering
### Required Input
- **The target conditional.** Either a file path + method name, or a pasted code block. Why: the selection framework requires reading the actual structure — the number of conditions, whether they share results, what the branches do, whether the type drives behavior in multiple places.
- **Whether the conditional appears in multiple places.** Ask or grep. Why: a conditional that switches on a type in one method might be the right candidate for Replace Parameter with Explicit Methods; the same switch appearing in five methods is the right candidate for Replace Conditional with Polymorphism. Location multiplicity changes the prescription.
### Observable Context (gather before asking)
```
Signals to grep for before reading the code:
- Same variable in switch/case across multiple files: grep for the type constant names
- Null checks: grep for "== null" or "!= null" on the same variable
- Control flags: grep for boolean variables set to true/false inside loop bodies
- Identical tail code in branches: visually scan each branch body for repeated statements
```
---
## Process
### Step 1: Classify the Conditional and Select the Refactoring
**ACTION:** Read the target conditional and match it to the pattern below. Use the first match that fits — the patterns are ordered from simplest to most structural.
**WHY:** The 8 refactorings are not interchangeable. Applying Decompose Conditional when the real problem is type-based dispatch produces a cleaner conditional that still grows incorrectly when types are added. Applying Replace Conditional with Polymorphism when there are only two stable cases and one method creates unnecessary class hierarchy. Selecting the right refactoring requires classifying the structural problem first.
---
#### Pattern 1: The condition expression itself is complex
**Signal:** The condition in the `if` statement (or the `else if`) is a multi-part boolean expression that requires re-reading to understand. The branch bodies may be simple. The *condition* is the hard part, not the branch logic.
**Refactoring: Decompose Conditional**
Extract the condition, the then-part, and the else-part into their own methods. Name the methods after *why* the branching happens, not *what* the expressions compute.
```
Before:
if (date.before(SUMMER_START) || date.after(SUMMER_END))
    charge = quantity * _winterRate + _winterServiceCharge;
else
    charge = quantity * _summerRate;
After:
if (notSummer(date))
    charge = winterCharge(quantity);
else
    charge = summerCharge(quantity);
```
Why this matters: `notSummer(date)` conveys the *intent* of the condition; the expression `date.before(SUMMER_START) || date.after(SUMMER_END)` conveys the *mechanics*. The method name reads like a comment that cannot go stale.
**Note:** If you find a nested conditional during Decompose Conditional, first check whether Replace Nested Conditional with Guard Clauses (Pattern 5) applies — guard clauses may eliminate the nesting before you need to decompose it.
---
#### Pattern 2: Multiple separate conditions all produce the same result
**Signal:** A sequence of `if` statements (or early `return 0;` checks) where each check is different but all lead to the same action. The checks feel like they belong together.
**Refactoring: Consolidate Conditional Expression, then Extract Method**
Step 1 — Verify no side effects exist in any condition (if side effects are present, consolidation is not safe).
Step 2 — Combine the checks into a single conditional, using `||` for sequential checks (or `&&` when the checks are nested `if` statements).
Step 3 — Apply Extract Method to give the combined condition a meaningful name.
```
Before:
if (_seniority < 2) return 0;
if (_monthsDisabled > 12) return 0;
if (_isPartTime) return 0;
// compute disability amount
After:
if (isNotEligibleForDisability()) return 0;
// compute disability amount
```
Why this matters: The three separate checks *communicate* that they are independent decisions. They are not — they are three ways of saying "this person does not qualify." The consolidated version makes the semantic unity visible and names it. Extract Method also sets up the consolidated check to be reused or overridden cleanly.
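
The extracted method itself is not shown above. A sketch of what Steps 2 and 3 might produce, assuming a hypothetical `DisabilityClaim` class and a placeholder amount in place of the real computation:

```java
class DisabilityClaim {
    private final int seniority;
    private final int monthsDisabled;
    private final boolean isPartTime;

    DisabilityClaim(int seniority, int monthsDisabled, boolean isPartTime) {
        this.seniority = seniority;
        this.monthsDisabled = monthsDisabled;
        this.isPartTime = isPartTime;
    }

    double disabilityAmount() {
        if (isNotEligibleForDisability()) return 0;
        return 100.0;   // placeholder for the real disability computation
    }

    // The three former checks, combined with || and given one name.
    private boolean isNotEligibleForDisability() {
        return seniority < 2 || monthsDisabled > 12 || isPartTime;
    }
}
```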
---
#### Pattern 3: The same code appears inside every branch
**Signal:** Looking at each branch of the conditional, you see the same statement (or statements) at the start or end of every branch. The code executes regardless of which branch is taken.
**Refactoring: Consolidate Duplicate Conditional Fragments**
Move the common code to before the conditional (if it appears at the start of all branches) or after (if it appears at the end of all branches).
```
Before:
if (isSpecialDeal()) {
    total = price * 0.95;
    send();
} else {
    total = price * 0.98;
    send();
}
After:
if (isSpecialDeal())
    total = price * 0.95;
else
    total = price * 0.98;
send();
```
Why this matters: Code inside both branches implies the branching decision controls it. Moving it out makes clear that the conditional only determines the price multiplier — `send()` always happens. This reduces duplication and makes the conditional's scope accurate.
If the common code is in the middle of branches (not at start or end), check whether the code before or after it changes anything, then move it to whichever end is safe. If more than one statement is common, extract them into a method.
---
#### Pattern 4: A boolean variable tracks when to stop processing
**Signal:** A variable (`found`, `done`, `flag`) initialized to `false` before a loop, set to `true` inside the loop to signal when to stop, and checked in the loop condition or inside the loop body to skip further processing. The variable exists to work around a single-exit-point constraint, not because the logic requires it.
**Refactoring: Remove Control Flag**
Replace the control flag with a `break` or `return` statement.
- If the flag only controls loop exit (no result value carried): replace `flag = true` with `break` inside the loop, then remove the flag and its condition check.
- If the flag also carries a result value: extract the loop into its own method; replace `flag = result_value` with `return result_value`; the method returns the found value directly.
```
Before:
boolean found = false;
for (int i = 0; i < people.length; i++) {
    if (!found) {
        if (people[i].equals("Don")) { sendAlert(); found = true; }
        if (people[i].equals("John")) { sendAlert(); found = true; }
    }
}
After (with break):
for (int i = 0; i < people.length; i++) {
    if (people[i].equals("Don")) { sendAlert(); break; }
    if (people[i].equals("John")) { sendAlert(); break; }
}
```
Why this matters: The control flag is an artifact of structured programming's one-exit-point rule. Fowler rejects this rule: "Clarity is the key principle: if the method is clearer with one exit point, use one exit point; otherwise don't." The `break` or `return` directly expresses the intent (stop processing when the condition is met) without requiring the reader to track an extra variable's state across iterations.
Prefer the `return` approach (extract into a method) even in languages that support `break`, because `return` makes it clear that no further code in the method executes after the match is found.
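
The `return` variant is described above but not shown. A minimal sketch, assuming the flag carried the matched name: the loop is extracted into a hypothetical `foundMiscreant` method, and an in-memory `alerts` list stands in for `sendAlert()`.

```java
import java.util.ArrayList;
import java.util.List;

class SecurityCheck {
    // Stand-in for sendAlert(): records alerts so the behavior is observable.
    final List<String> alerts = new ArrayList<>();

    void checkSecurity(String[] people) {
        String found = foundMiscreant(people);
        if (!found.isEmpty()) alerts.add("alert: " + found);
    }

    // The extracted loop returns as soon as a match is found; no flag is needed.
    private String foundMiscreant(String[] people) {
        for (String person : people) {
            if (person.equals("Don")) return "Don";
            if (person.equals("John")) return "John";
        }
        return "";
    }
}
```

Unlike the original loop, this sketch raises the alert in the caller rather than inside the loop, so the extracted method is a pure query.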
---
#### Pattern 5: Nested conditionals hide the normal execution path
**Signal:** A method with nested if-else blocks where the "normal" case — the path that runs for the typical, non-exceptional input — is buried inside `else` clauses. Reading the method requires tracking multiple nesting levels to find the main path. The branches before the normal case handle unusual or error conditions.
**Refactoring: Replace Nested Conditional with Guard Clauses**
For each unusual condition, replace its `else` wrapper with a guard clause: a check at the top of the method that returns (or throws) immediately if the unusual condition is true. The normal path falls through to the end.
```
Before:
double getPayAmount() {
    double result;
    if (_isDead) result = deadAmount();
    else {
        if (_isSeparated) result = separatedAmount();
        else {
            if (_isRetired) result = retiredAmount();
            else result = normalPayAmount();
        }
    }
    return result;
}
After:
double getPayAmount() {
    if (_isDead) return deadAmount();
    if (_isSeparated) return separatedAmount();
    if (_isRetired) return retiredAmount();
    return normalPayAmount();
}
```
**The critical semantic distinction:** An `if-else` construct communicates that both branches are *equally likely and equally important* — the reader gives equal weight to each leg. A guard clause communicates "this is rare and exceptional — handle it and get out." The if-else form is wrong for special cases because it visually equalizes things that are not equal in the domain.
Fowler: "The guard clause says, 'This is rare, and if it happens, do something and get out.'"
**Reversing conditions:** When the nesting goes the other way (the method does something only when conditions are all satisfied), reverse each condition to get the guard. Negate the condition, add the guard clause with an early return, remove the outer if wrapper.
Apply guard clauses one at a time. Compile and test after each replacement.
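
Reversing conditions can be sketched as follows. The `Investment` class, its fields, and the `ADJ_FACTOR` value are assumptions for illustration. Both forms are kept side by side so their equivalence (for ordinary, non-NaN inputs) can be tested.

```java
class Investment {
    static final double ADJ_FACTOR = 0.5;   // hypothetical value

    private final double capital;
    private final double intRate;
    private final double duration;
    private final double income;

    Investment(double capital, double intRate, double duration, double income) {
        this.capital = capital;
        this.intRate = intRate;
        this.duration = duration;
        this.income = income;
    }

    // Before: the real work happens only when every condition holds.
    double adjustedCapitalNested() {
        double result = 0.0;
        if (capital > 0.0) {
            if (intRate > 0.0 && duration > 0.0) {
                result = (income / duration) * ADJ_FACTOR;
            }
        }
        return result;
    }

    // After: each condition is negated and becomes a guard with an early return.
    // Negating (a && b) gives (!a || !b), hence the || in the second guard.
    double adjustedCapital() {
        if (capital <= 0.0) return 0.0;
        if (intRate <= 0.0 || duration <= 0.0) return 0.0;
        return (income / duration) * ADJ_FACTOR;
    }
}
```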
---
#### Pattern 6: A switch (or if-else-if chain) branches on object type and new types are expected
**Signal:** A `switch` statement (or a chain of `if (type == X) ... else if (type == Y)`) selects different behavior depending on the type of an object. The same switch appears in multiple methods, or the type set is expected to grow. Adding a new type means finding every switch and adding a case.
**Refactoring: Replace Conditional with Polymorphism**
Move each leg of the conditional into an overriding method on a subclass. Make the original method abstract on the superclass.
Prerequisites: You need an inheritance hierarchy. If one does not exist, create it first using Replace Type Code with Subclasses (if the type does not change after object creation) or Replace Type Code with State/Strategy (if the type changes at runtime, or if the class is already subclassed for another reason).
Once the hierarchy exists:
1. If the conditional is part of a larger method, use Extract Method to isolate it.
2. Use Move Method to place the conditional on the class at the top of the hierarchy.
3. For each leg of the conditional: copy the leg body into an overriding method on the appropriate subclass. Compile and test. Remove the copied leg from the original switch. Repeat until all legs are removed.
4. Declare the superclass method abstract.
```
Before (in EmployeeType):
int payAmount(Employee emp) {
    switch (getTypeCode()) {
        case ENGINEER: return emp.getMonthlySalary();
        case SALESMAN: return emp.getMonthlySalary() + emp.getCommission();
        case MANAGER: return emp.getMonthlySalary() + emp.getBonus();
        default: throw new RuntimeException("Incorrect Employee");
    }
}
After:
class Engineer... int payAmount(Employee emp) { return emp.getMonthlySalary(); }
class Salesman... int payAmount(Employee emp) { return emp.getMonthlySalary() + emp.getCommission(); }
class Manager... int payAmount(Employee emp) { return emp.getMonthlySalary() + emp.getBonus(); }
abstract class EmployeeType... abstract int payAmount(Employee emp);
```
**The polymorphism principle:** The caller does not need to know about the conditional behavior. Adding a new variant means adding a new class and implementing its method — the caller never changes. This is the reason object-oriented programs have fewer switch statements than procedural programs: the dispatch is handled by the language's method resolution mechanism rather than by explicit code.
**When NOT to use polymorphism:** If the conditional appears in only one place, affects only one method, and the type set is stable (not expected to grow), the structural investment of a hierarchy may not be justified. In that case, consider Pattern 7 (Replace Parameter with Explicit Methods) instead.
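
For reference, a compilable sketch of the polymorphic end state, including the delegation from `Employee` that the fragments above leave implicit. The constructor shape and the State/Strategy-style `type` field are assumptions.

```java
// Each subclass carries one leg of the former switch.
abstract class EmployeeType {
    abstract int payAmount(Employee emp);
}

class Engineer extends EmployeeType {
    @Override int payAmount(Employee emp) { return emp.getMonthlySalary(); }
}

class Salesman extends EmployeeType {
    @Override int payAmount(Employee emp) {
        return emp.getMonthlySalary() + emp.getCommission();
    }
}

class Manager extends EmployeeType {
    @Override int payAmount(Employee emp) {
        return emp.getMonthlySalary() + emp.getBonus();
    }
}

class Employee {
    private final EmployeeType type;
    private final int monthlySalary;
    private final int commission;
    private final int bonus;

    Employee(EmployeeType type, int monthlySalary, int commission, int bonus) {
        this.type = type;
        this.monthlySalary = monthlySalary;
        this.commission = commission;
        this.bonus = bonus;
    }

    int getMonthlySalary() { return monthlySalary; }
    int getCommission() { return commission; }
    int getBonus() { return bonus; }

    // No switch: method dispatch on the type object selects the behavior.
    int payAmount() { return type.payAmount(this); }
}
```

Adding a new employee type in this sketch means adding one subclass of `EmployeeType`; `Employee.payAmount()` and its callers stay unchanged.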
---
#### Pattern 7: A parameter controls which of several distinct operations runs, in a single method
**Signal:** A method takes a parameter (often a string, constant, or boolean) and uses it to select between a small number of clearly distinct operations. Each branch of the conditional does something completely different. The method's behavior is determined entirely by the parameter value, and the callers always pass a literal value (never a computed one). The type set is stable — no new variants are expected.
**Refactoring: Replace Parameter with Explicit Methods**
Create a separate method for each value of the parameter. Delete the conditional dispatch method. Update each call site to call the appropriate explicit method directly.
```
Before:
void setValue(String name, int value) {
if (name.equals("height")) _height = value;
if (name.equals("width")) _width = value;
}
// callers: setValue("height", 10); setValue("width", 5);
After:
void setHeight(int value) { _height = value; }
void setWidth(int value) { _width = value; }
// callers: setHeight(10); setWidth(5);
```
Why this matters: The explicit methods are statically checkable — the compiler catches invalid parameter names. Each method has a clear, single purpose. Callers communicate intent directly in the method name rather than encoding it in a string argument.
**Condition for use:** Only apply this when callers always pass a literal constant — never a variable. If callers compute the parameter value at runtime (e.g., passing a value from user input or a database), the callers need the dispatching method and you cannot eliminate it.
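The condition can be sketched as follows. The `Box` class and the choice to keep `setValue` as a thin dispatcher for runtime-valued callers are assumptions made for illustration, not part of the original example:

```java
// Sketch: explicit setters for callers that pass literals, plus a retained
// dispatcher for callers whose property name is only known at runtime.
class Box {
    private int height;
    private int width;

    void setHeight(int value) { height = value; }
    void setWidth(int value) { width = value; }

    int getHeight() { return height; }
    int getWidth() { return width; }

    // Kept only because some callers compute 'name' at runtime
    // (for example, from user input). It delegates to the explicit methods.
    void setValue(String name, int value) {
        switch (name) {
            case "height": setHeight(value); break;
            case "width":  setWidth(value);  break;
            default: throw new IllegalArgumentException("Unknown property: " + name);
        }
    }

    public static void main(String[] args) {
        Box box = new Box();
        box.setHeight(10);                                   // literal caller
        String runtimeName = args.length > 0 ? args[0] : "width";
        box.setValue(runtimeName, 5);                        // runtime caller
        System.out.println(box.getHeight() + " x " + box.getWidth());
    }
}
```

If no caller computes the name at runtime, `setValue` can be deleted entirely, which is the full form of the refactoring.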
---
#### Pattern 8: Null checks for the same object appear in multiple client methods
**Signal:** Multiple methods in client code contain checks of the form `if (object == null) do default thing; else object.doRealThing()`. The same default behavior is repeated wherever the object might be null. The null-handling is scattered rather than centralized.
**Refactoring: Introduce Null Object**
Create a null version of the class that implements the same interface and returns sensible default values for all methods. Replace null with instances of this null class. Client code stops checking for null and calls methods directly.
Step 1 — Create a subclass (or implement a Nullable interface) as the null version of the source class. Add an `isNull()` method that returns `true` on the null class and `false` on the real class.
Step 2 — Find all places that return null for the source type. Return an instance of the null class instead.
Step 3 — Find all `== null` checks on the source type. Replace each with `isNull()`. Compile and test incrementally — replace one source at a time.
Step 4 — For each check of the form `if (obj.isNull()) defaultValue; else obj.realBehavior()`: add the appropriate method to the null class returning `defaultValue`. Remove the condition. Client code calls `obj.realMethod()` unconditionally.
```
Before (in clients):
if (customer == null) plan = BillingPlan.basic();
else plan = customer.getPlan();
if (customer == null) name = "occupant";
else name = customer.getName();
After (NullCustomer.getName() returns "occupant", NullCustomer.getPlan() returns BillingPlan.basic()):
plan = customer.getPlan();
name = customer.getName();
```
Why this matters: The essence of polymorphism is that you ask an object to do something and it does the right thing based on its type — you do not ask what type it is first. Null objects extend this principle: the null object knows how to behave when it is absent, so the caller never needs to check. The repeated conditional is eliminated at the source, not patched at each call site.
**When Null Object fits:** Use this refactoring when most clients want the same default behavior for the absent object. Clients that need a different response can still call `isNull()` explicitly.
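A small sketch of both kinds of client, assuming a simplified `Customer` with only a name. The `Mailer` class and its two methods are hypothetical:

```java
class Customer {
    private final String name;

    Customer(String name) { this.name = name; }

    // Factory for the null version, so clients never construct it directly
    static Customer newNull() { return new NullCustomer(); }

    boolean isNull() { return false; }
    String getName() { return name; }
}

class NullCustomer extends Customer {
    NullCustomer() { super("occupant"); }  // shared default name

    @Override
    boolean isNull() { return true; }
}

class Mailer {
    // Typical client: relies on the shared default, no check needed
    static String greeting(Customer customer) {
        return "Dear " + customer.getName();
    }

    // Client that needs a different response: asks isNull() explicitly
    static boolean shouldSendInvoice(Customer customer) {
        return !customer.isNull();
    }

    public static void main(String[] args) {
        Customer absent = Customer.newNull();
        System.out.println(greeting(absent));
        System.out.println(shouldSendInvoice(absent));
    }
}
```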
---
### Step 2: Apply the Selected Refactoring
**ACTION:** Execute the mechanics for the selected refactoring, one small step at a time. Compile and test after each step.
**WHY:** Conditional refactorings are behavior-preserving transformations. Testing after each individual step (not after all steps) means any regression has a minimal search space — you know exactly which change caused it. Skipping intermediate tests and making all changes at once turns a refactoring session into a debugging session.
**Mechanics rule for all 8 refactorings:** Use Extract Method liberally. Extracting any condition, branch body, or common fragment into a named method is always safe (it is behavior-preserving) and always improves readability. When in doubt, extract.
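As an illustration of extracting a condition and its branch bodies, here is a hedged sketch. The `HeatingPlan` class and its tariff values (in cents) are assumptions chosen only to make the example runnable:

```java
class HeatingPlan {
    // Assumed tariff, in cents, for illustration only
    private static final int SUMMER_START_MONTH = 6;
    private static final int SUMMER_END_MONTH = 8;
    private static final int SUMMER_RATE = 10;
    private static final int WINTER_RATE = 15;
    private static final int WINTER_SERVICE_CHARGE = 500;

    // Before: the reader must decode the condition and both formulas
    static int chargeBefore(int month, int quantity) {
        if (month < SUMMER_START_MONTH || month > SUMMER_END_MONTH) {
            return quantity * WINTER_RATE + WINTER_SERVICE_CHARGE;
        }
        return quantity * SUMMER_RATE;
    }

    // After: the conditional only routes between named methods
    static int charge(int month, int quantity) {
        return isSummer(month) ? summerCharge(quantity) : winterCharge(quantity);
    }

    private static boolean isSummer(int month) {
        return month >= SUMMER_START_MONTH && month <= SUMMER_END_MONTH;
    }

    private static int summerCharge(int quantity) {
        return quantity * SUMMER_RATE;
    }

    private static int winterCharge(int quantity) {
        return quantity * WINTER_RATE + WINTER_SERVICE_CHARGE;
    }

    public static void main(String[] args) {
        System.out.println(charge(7, 100));  // a summer month
        System.out.println(charge(1, 100));  // a winter month
    }
}
```

Keeping `chargeBefore` alongside `charge` during the refactoring gives a cheap behavior-preservation check: both must agree for every month.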
---
### Step 3: Introduce Assertion for Implicit Assumptions (supporting technique)
**ACTION:** After refactoring the conditional, scan the remaining code for implicit assumptions — places where the code works only if some state is true, but that state is not checked or documented.
**WHY:** Assertions make assumptions explicit. They do not change the behavior of correct code: an assertion fails only when an assumption the code already depends on has been violated, and it fails at the point of violation instead of somewhere downstream. They serve as communication: the reader immediately sees what the code requires, rather than decoding the algorithm to discover it. They also help debugging, because the failure is reported close to its cause.
When to add an assertion:
- A method assumes at least one of several fields has a non-null or non-sentinel value
- A calculation assumes an input is positive (or within a range)
- A method assumes an object has been initialized before being called
```
Before (implicit):
double getExpenseLimit() {
// should have either expense limit or a primary project
return (_expenseLimit != NULL_EXPENSE) ?
_expenseLimit : _primaryProject.getMemberExpenseLimit();
}
After (explicit):
double getExpenseLimit() {
Assert.isTrue(_expenseLimit != NULL_EXPENSE || _primaryProject != null);
return (_expenseLimit != NULL_EXPENSE) ?
_expenseLimit : _primaryProject.getMemberExpenseLimit();
}
```
Assertions should be easily removable for production deployment. Use an assertion utility class rather than live `if` statements for assertion logic. Do not use assertions to check things that are not truly required — over-asserting creates duplicate logic that drifts from the real code. The test: if the assertion fails and the code still works correctly, the assertion is wrong.
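One possible shape for such a utility is sketched below. The `ENABLED` switch, the `IllegalStateException` choice, and the `Project` stub are assumptions; the `Assert.isTrue` in the example above may be implemented differently:

```java
// Hypothetical assertion utility with a single switch for production builds
final class Assert {
    // Assumed toggle: set to false to disable all checks
    static final boolean ENABLED = true;

    private Assert() {}

    static void isTrue(boolean condition, String message) {
        if (ENABLED && !condition) {
            throw new IllegalStateException("Assertion failed: " + message);
        }
    }
}

class Project {
    double getMemberExpenseLimit() { return 500.0; }  // placeholder value
}

class Employee {
    static final double NULL_EXPENSE = -1.0;
    private final double expenseLimit;
    private final Project primaryProject;

    Employee(double expenseLimit, Project primaryProject) {
        this.expenseLimit = expenseLimit;
        this.primaryProject = primaryProject;
    }

    double getExpenseLimit() {
        // Makes the implicit requirement visible at the top of the method
        Assert.isTrue(expenseLimit != NULL_EXPENSE || primaryProject != null,
                "employee needs an expense limit or a primary project");
        return (expenseLimit != NULL_EXPENSE)
                ? expenseLimit
                : primaryProject.getMemberExpenseLimit();
    }

    public static void main(String[] args) {
        System.out.println(new Employee(100.0, null).getExpenseLimit());
        System.out.println(new Employee(NULL_EXPENSE, new Project()).getExpenseLimit());
    }
}
```

Java also has a built-in `assert` statement, which is disabled by default and enabled with the `-ea` JVM flag. Note that with a utility method like this one, the condition expression is still evaluated even when `ENABLED` is false.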
---
## Key Principles
**1. Separate the switching logic from the branch details.**
A conditional should tell you *why* branching happens. The branch bodies should tell you *what* happens in each case. When both are mixed in one method, Extract Method to split them apart. The condition becomes a method call with a meaningful name; the branch bodies become method calls with meaningful names. The if-else statement is now pure routing logic.
**2. Guard clauses are semantically different from if-else.**
Using `if-else` when one branch is a special case is incorrect code communication — it tells the reader both paths are equally probable and important. Guard clauses (early returns) correctly communicate "this is exceptional; handle it and exit." Fowler explicitly rejects the single-exit-point rule: "Clarity is the key principle."
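The difference in communication can be seen in a short sketch. The `PayCalculator` class and the placeholder amounts are assumptions for illustration:

```java
class PayCalculator {
    // Nested form: reads as if all four paths were equally important
    static int payAmountNested(boolean isDead, boolean isSeparated, boolean isRetired) {
        int result;
        if (isDead) {
            result = deadAmount();
        } else {
            if (isSeparated) {
                result = separatedAmount();
            } else {
                if (isRetired) {
                    result = retiredAmount();
                } else {
                    result = normalPayAmount();
                }
            }
        }
        return result;
    }

    // Guard-clause form: special cases exit early, the normal path stands alone
    static int payAmount(boolean isDead, boolean isSeparated, boolean isRetired) {
        if (isDead) return deadAmount();
        if (isSeparated) return separatedAmount();
        if (isRetired) return retiredAmount();
        return normalPayAmount();
    }

    // Placeholder amounts, assumed for the example
    private static int deadAmount() { return 0; }
    private static int separatedAmount() { return 100; }
    private static int retiredAmount() { return 200; }
    private static int normalPayAmount() { return 1000; }

    public static void main(String[] args) {
        System.out.println(payAmount(false, false, false));
        System.out.println(payAmount(false, true, false));
    }
}
```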
**3. Adding types should not require finding conditionals.**
If adding a new variant of a type requires searching the codebase for every switch on that type and adding a case, Replace Conditional with Polymorphism. The polymorphic design means adding a new type = adding a new class that implements the behavior. No existing code changes. The caller does not need to know about type-specific conditional behavior.
**4. Null objects eliminate scattered default behavior.**
Null checks are repeated default-behavior decisions. When many callers share the same default, the decision belongs in the object, not in every caller. The null object embodies the default; callers stop making the decision.
**5. Apply one refactoring at a time, test between each.**
Conditional refactorings often chain — Consolidate Conditional Expression followed by Extract Method; Replace Conditional with Polymorphism preceded by Replace Type Code with Subclasses. Apply each step independently, verify behavior is preserved, then proceed. The chain is safe; the leap is not.
---
## Examples
### Example 1: Choosing Between Polymorphism and Explicit Methods
**Scenario:** A `Shape` class has this method:
```java
double area(String shapeType) {
if (shapeType.equals("circle")) return Math.PI * _radius * _radius;
if (shapeType.equals("rectangle")) return _width * _height;
if (shapeType.equals("triangle")) return 0.5 * _base * _height;
throw new RuntimeException("Unknown shape");
}
```
Two questions determine the refactoring:
1. Is `shapeType` computed at runtime at any call site, rather than always passed as a literal?
2. Are new shape types expected?
**If both answers are yes:** Pattern 6 — Replace Conditional with Polymorphism. Create `Circle`, `Rectangle`, `Triangle` subclasses; each implements `area()`. Adding `Pentagon` = adding a class.
**If the type set is stable and callers always pass literals:** Pattern 7 — Replace Parameter with Explicit Methods. Create `circleArea()`, `rectangleArea()`, `triangleArea()` methods. Simpler, statically checkable, no hierarchy overhead.
**Decision rule:** When types will grow or the conditional appears in multiple places → polymorphism. When types are fixed, the conditional is in one place, and callers always pass constants → explicit methods.
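A sketch of the polymorphic outcome follows. It uses an interface rather than subclasses of a concrete `Shape`, which is a simplifying assumption, and the `ShapeReport.totalArea` caller is hypothetical:

```java
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override public double area() { return Math.PI * radius * radius; }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;
    Rectangle(double width, double height) { this.width = width; this.height = height; }
    @Override public double area() { return width * height; }
}

final class Triangle implements Shape {
    private final double base;
    private final double height;
    Triangle(double base, double height) { this.base = base; this.height = height; }
    @Override public double area() { return 0.5 * base * height; }
}

class ShapeReport {
    // The caller has no knowledge of the individual shape types
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Rectangle(2, 3), new Triangle(4, 3) };
        System.out.println(totalArea(shapes));
    }
}
```

Each shape now carries only the fields its own formula needs, and the string parameter with its "Unknown shape" failure path is gone.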
---
### Example 2: Identifying a Null Object Opportunity
**Scenario:** Three separate methods in client code contain:
```java
if (customer == null) plan = BillingPlan.basic();
else plan = customer.getPlan();
if (customer == null) name = "occupant";
else name = customer.getName();
if (customer == null) weeksDelinquent = 0;
else weeksDelinquent = customer.getHistory().getWeeksDelinquentInLastYear();
```
**Classification:** Pattern 8 — the same object (`customer`) is null-checked in multiple clients, each providing a sensible default.
**Apply Introduce Null Object:**
```java
class NullCustomer extends Customer {
public boolean isNull() { return true; }
public String getName() { return "occupant"; }
public BillingPlan getPlan() { return BillingPlan.basic(); }
public PaymentHistory getHistory() { return PaymentHistory.newNull(); }
}
// class NullPaymentHistory: getWeeksDelinquentInLastYear() returns 0
// Site returns NullCustomer instead of null:
Customer getCustomer() {
return (_customer == null) ? Customer.newNull() : _customer;
}
// Clients become:
plan = customer.getPlan();
name = customer.getName();
weeksDelinquent = customer.getHistory().getWeeksDelinquentInLastYear();
```
The three conditional blocks disappear. Note that null objects often return other null objects — `NullCustomer.getHistory()` returns `NullPaymentHistory`, which itself returns sensible defaults.
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/refactoring-prescriptions.md` | Full prescription tree for all conditional refactorings with Fowler chapter references | Verifying the correct conditional branch for a borderline case |
**Dependency skill:**
- `code-smell-diagnosis` — diagnoses the Switch Statements smell and other structural problems that trigger this skill; provides the prioritized finding that this skill executes
**Related skills in the refactoring set:**
- `type-code-refactoring-selector` — when the conditional is driven by a type code and the primary decision is how to restructure the type (Replace Type Code with Subclasses vs. State/Strategy), not just the conditional dispatch
- `method-decomposition-refactoring` — when Long Method is the primary smell and conditional complexity is one contributor among several
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-type-code-refactoring-selector`
- `clawhub install bookforge-method-decomposition-refactoring`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Scan a codebase or code fragment for the 22 named code smells from Fowler's refactoring catalog and produce a prioritized diagnosis report with the specific...
---
name: code-smell-diagnosis
description: |
Scan a codebase or code fragment for the 22 named code smells from Fowler's refactoring catalog and produce a prioritized diagnosis report with the specific refactoring prescription for each smell. Use when: a developer wants to know what is wrong with existing code before touching it; a code review reveals structural problems but no clear fix; a class or method feels wrong but the exact smell is hard to name; a refactoring effort needs a starting point and a prioritized order of attack; a code author wants to justify a refactoring to a team by naming the specific smell and the prescribed remedy. Covers all 22 smells: Duplicated Code, Long Method, Large Class, Long Parameter List, Divergent Change, Shotgun Surgery, Feature Envy, Data Clumps, Primitive Obsession, Switch Statements, Parallel Inheritance Hierarchies, Lazy Class, Speculative Generality, Temporary Field, Message Chains, Middle Man, Inappropriate Intimacy, Alternative Classes with Different Interfaces, Incomplete Library Class, Data Class, Refused Bequest, Comments. Maps each smell to its Fowler-prescribed refactoring(s) including conditional branches (same class vs. sibling subclasses vs. unrelated classes for Duplicated Code; few cases vs. type code vs. null for Switch Statements; etc.).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/code-smell-diagnosis
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [3]
tags: [refactoring, code-quality, code-smells, software-design]
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "Source code files or directories to scan for smells — the primary input"
- type: document
description: "Code snippet, class description, or pull request diff if no live codebase is accessible"
tools-required: [Read, Grep, Write]
tools-optional: [Bash]
mcps-required: []
environment: "Run inside a project directory. Reading and grepping source files is the primary analysis method."
discovery:
goal: "Identify all named code smells present in the target code; map each smell to its prescribed refactoring(s) with correct conditional branching; produce a prioritized findings report the developer can act on immediately"
tasks:
- "Read and understand the structure of the target code"
- "Check each of the 22 smells against the code systematically"
- "Name each smell using Fowler's exact terminology"
- "Select the correct refactoring prescription based on the specific context (location, cause, severity)"
- "Prioritize findings by impact and ease — quick wins first, structural rework last"
- "Write the diagnosis report with smell name, evidence, prescription, and rationale"
audience:
roles: ["software-developer", "senior-developer", "tech-lead", "code-reviewer"]
experience: "intermediate — assumes working knowledge of object-oriented programming and reading code"
triggers:
- "Code feels wrong but the exact problem is hard to name"
- "Pre-refactoring audit: what should be fixed and in what order"
- "Code review reveals structural problems without a clear fix"
- "Inherited codebase needs a quality baseline before feature work begins"
- "A team needs to justify a refactoring effort with specific named problems"
not_for:
- "Performance profiling — use profiling-driven-performance-optimization instead"
- "Security vulnerability scanning — different concern from structural code quality"
- "Executing the refactoring transformations themselves — this skill diagnoses; sibling skills execute"
- "New code that hasn't been written yet — smells appear in existing code"
---
# Code Smell Diagnosis
## When to Use
You have existing source code — a class, a module, a method, or an entire service — and you need to know what is structurally wrong with it before deciding how to improve it.
This skill applies when:
- A developer says "this code is a mess" but cannot name what exactly is wrong
- A code review surfaces problems that need specific names and specific fixes, not just "clean this up"
- A refactoring effort is starting and needs a prioritized list of what to address first
- Code keeps breaking in the same places — a smell is likely attracting bugs
- A pull request adds complexity to already-complex code and needs a targeted intervention
**The core insight from Fowler and Beck:** Refactoring without diagnosis is guesswork. When you can name the smell precisely, the refactoring prescription follows directly. "This is Feature Envy" immediately implies Move Method. "This is a Switch Statement on a type code" immediately implies Replace Type Code with Subclasses or Replace Type Code with State/Strategy followed by Replace Conditional with Polymorphism. The name unlocks the remedy.
**This is the hub skill.** Sibling skills execute specific remedies once a smell is diagnosed:
- `type-code-refactoring-selector` — when Primitive Obsession or Switch Statements are present
- `conditional-simplification-strategy` — when Switch Statements or complex conditionals are present
- `class-responsibility-realignment` — when Feature Envy, Inappropriate Intimacy, or Shotgun Surgery are present
- `big-refactoring-planner` — when multiple smells indicate a systemic design problem
- `data-organization-refactoring` — when Data Clumps, Primitive Obsession, or Data Class are present
---
## Context and Input Gathering
### Required Input (must have — ask if missing)
- **The code to diagnose.** Either a directory path to scan, specific files, a class name, or a pasted code fragment. Why: the diagnosis is grounded in actual code — general impressions are insufficient for naming specific smells and prescribing specific refactorings.
- If a full codebase: start with the files the user identifies as problematic, or grep for common smell signals (methods > 20 lines, classes > 200 lines, parameter lists > 4 parameters)
- If a code snippet is provided directly: work from the snippet
- **Language and framework.** Why: smell signals differ by language. Long parameter lists are more prevalent in procedural-style Python than in Java with builder patterns. Switch statements in Java suggest polymorphism; pattern matching in a language with first-class ADTs is different.
- Check: file extensions, import statements, build files (`pom.xml`, `package.json`, `pyproject.toml`)
- If unclear, infer from file extensions
### Observable Context (gather before asking)
Scan the environment to orient the diagnosis:
```
Size signals (quick grep targets):
- Long methods: functions/methods exceeding 20-30 lines
- Long parameter lists: functions with 4+ parameters
- Large classes: classes exceeding 200 lines or with 10+ instance variables
- Duplicated code: identical or near-identical blocks appearing in multiple places
- Switch/if-elif chains: switch statements or long if/elif chains on the same variable
Structure signals:
- Classes with only getters/setters and no behavior → Data Class
- Methods referencing another class's data more than their own class → Feature Envy
- Subclasses that override many parent methods without using the parent's data → Refused Bequest
- Long chains like a.getB().getC().getD() → Message Chains
```
### Default Assumptions
- If scope is unclear: start with the files the user directly mentioned, then expand if needed
- If a smell is borderline: flag it as "weak signal" rather than omitting it — the developer decides whether to act
- If the codebase is large: focus on the highest-traffic, most-changed, or most-complained-about areas first
---
## Process
### Step 1: Read and Orient
**ACTION:** Read the target code — structure first, then detail.
**WHY:** Smell detection is pattern recognition, not line-by-line parsing. A structural read (class names, method names, field names, file organization) surfaces most smells before reading any implementation. Diving into implementation first causes you to miss forest-level smells (Large Class, Divergent Change, Shotgun Surgery) while fixating on method-level detail.
Structural questions to answer:
- How many classes/files? How large are they?
- What are the class names and their responsibilities? Do they have a clear single purpose?
- What are the method names? Are they long, numerous, or oddly named?
- What are the fields? Many instance variables? Clusters that appear together?
- What imports/dependencies exist? Does a class depend on many others?
Then read method bodies for the classes flagged as problematic.
---
### Step 2: Check Each of the 22 Smells
**ACTION:** Systematically evaluate each smell against the code. Do not skip smells — the ones you expect to be absent are often the most revealing when present.
**WHY:** Unsystematic diagnosis produces a partial list. Developers naturally notice some smells (long methods, duplication) and miss others (Divergent Change, Speculative Generality, Refused Bequest). Checking all 22 takes 10 minutes and prevents prescribing the wrong refactoring because a deeper smell was missed.
Use the full smell catalog in the reference file (`references/smell-catalog.md`) for detailed detection criteria. The diagnostic decision tree below gives the key signal and the prescribed remedy for each smell:
---
#### Group A: Bloaters — Code That Has Grown Too Large
**1. Duplicated Code**
- Signal: The same code structure appears in more than one place
- Branch by location:
- Same class, two methods → Extract Method; invoke from both places
- Sibling subclasses → Extract Method in both; Pull Up Field or Pull Up Method to parent. If code is similar but not identical: Extract Method to separate the similar from different parts, then consider Form Template Method
- Unrelated classes → Extract Class in one class; use the new component in the other. Or decide the method belongs in only one class and have the other invoke it
- Methods that do the same thing with different algorithms → choose the clearer algorithm and apply Substitute Algorithm
**2. Long Method**
- Signal: Method body exceeds 10-20 lines; comments precede blocks of code; conditionals and loops are nested
- Primary remedy: Extract Method — find clumps of code that go together and give them a name. Do this aggressively; the name is the value, not the line savings
- If extracting creates too many parameters: Replace Temp with Query to eliminate temporaries; Introduce Parameter Object or Preserve Whole Object to slim the parameter list
- If parameter/temp problem persists after those: Replace Method with Method Object — turn the method into its own class
- Conditionals → Decompose Conditional
- Loops → extract the loop and its body into its own method
**3. Large Class**
- Signal: Class has too many instance variables (10+), too many methods (20+), or too many lines (200+); prefixes or suffixes group variables logically
- Common variable groups → Extract Class (e.g., `depositAmount` and `depositCurrency` belong together)
- Subsets of variables not used all the time → Extract Class or Extract Subclass
- Too much code with redundancy → eliminate redundancy first; five 100-line methods with a lot of shared code can become five 10-line methods plus ten two-line extracted helpers
- GUI class with behavior → move data and behavior to a domain object; use Duplicate Observed Data to sync if needed
- Identifying use patterns → Extract Interface for each major use cluster; this reveals natural subclass boundaries
**4. Long Parameter List**
- Signal: Method takes 4+ parameters; parameters change frequently as caller needs change
- If a parameter's value can be gotten by making a request of an object the method already knows → Replace Parameter with Method (eliminate the parameter)
- If parameters are a data cluster from an existing object → Preserve Whole Object (pass the object, not its individual fields)
- If parameters have no natural home object → Introduce Parameter Object (create one)
- Exception: when you deliberately do NOT want to create a dependency between the called and calling objects — in those cases, passing parameters explicitly is correct even if the list is long
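The Introduce Parameter Object branch can be sketched like this. `DayRange`, `Ledger`, and the use of plain integers for days are assumptions made to keep the example self-contained:

```java
// Parameter object: the start/end pair that used to travel as two ints
final class DayRange {
    private final int start;
    private final int end;

    DayRange(int start, int end) {
        if (start > end) {
            throw new IllegalArgumentException("start must not be after end");
        }
        this.start = start;
        this.end = end;
    }

    // Behavior that naturally moves onto the new object
    boolean includes(int day) {
        return day >= start && day <= end;
    }
}

class Ledger {
    private final int[] days;
    private final int[] amounts;

    Ledger(int[] days, int[] amounts) {
        this.days = days;
        this.amounts = amounts;
    }

    // Before: int flowBetween(int startDay, int endDay)
    int flowBetween(DayRange range) {
        int total = 0;
        for (int i = 0; i < days.length; i++) {
            if (range.includes(days[i])) {
                total += amounts[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Ledger ledger = new Ledger(new int[] { 1, 5, 9 }, new int[] { 100, 200, 400 });
        System.out.println(ledger.flowBetween(new DayRange(2, 9)));
    }
}
```

Once the object exists, comparisons such as `includes` tend to migrate onto it, which often removes duplication in the callers as well.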
---
#### Group B: Object-Oriented Abusers — Wrong Use of OO Concepts
**5. Switch Statements**
- Signal: switch/case statement or long if-else-if chain that recurs in multiple places; the same type-code value drives branching throughout the code
- Most cases: consider polymorphism — Extract Method to extract the switch, Move Method to the class with the type code value, then decide:
- Type code does not affect behavior → Replace Type Code with Class
- Type code affects behavior and subclassing is possible → Replace Type Code with Subclasses; then Replace Conditional with Polymorphism
- Type code affects behavior but subclassing is not possible (class already subclassed or type changes at runtime) → Replace Type Code with State/Strategy; then Replace Conditional with Polymorphism
- Switch affects only a single method and type changes are not expected → Replace Parameter with Explicit Methods
- One of the cases is null → Introduce Null Object
**6. Parallel Inheritance Hierarchies** (special case of Shotgun Surgery)
- Signal: Every time you subclass one class, you must also subclass another; the two hierarchies share the same class name prefix
- Remedy: Make instances of one hierarchy refer to instances of the other; use Move Method and Move Field until one hierarchy disappears
**7. Refused Bequest**
- Signal: Subclass inherits methods and data from parent but does not use most of them; subclass overrides the parent's methods and throws exceptions or does nothing
- Weak form (ignoring implementation, supporting interface): usually acceptable, because the hierarchy is being used for code reuse. Nine times out of ten this smell is too faint to be worth cleaning
- Strong form (subclass refuses to support the superclass interface): Push Down Method and Push Down Field to create a sibling class; let the parent hold only what is genuinely common
- If subclass is reusing behavior but should not be in an is-a relationship → Replace Inheritance with Delegation
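A minimal sketch of Replace Inheritance with Delegation, assuming a hypothetical `IntStack` that previously extended a list class only to reuse its storage:

```java
import java.util.ArrayList;
import java.util.List;

// Before (not shown as code): IntStack extends ArrayList<Integer> and so
// inherits add(index, value), remove(index), clear() and more, none of
// which a stack is meant to support.

// After: the list is a private delegate; only stack operations are exposed.
class IntStack {
    private final List<Integer> items = new ArrayList<>();

    void push(int value) {
        items.add(value);
    }

    int pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return items.remove(items.size() - 1);  // remove by index, last element
    }

    boolean isEmpty() { return items.isEmpty(); }
    int size() { return items.size(); }

    public static void main(String[] args) {
        IntStack stack = new IntStack();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop());
        System.out.println(stack.size());
    }
}
```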
**8. Alternative Classes with Different Interfaces**
- Signal: Two classes do the same thing but have different method names for the same operations
- Rename Method to make signatures match; Move Method to bring behavior into alignment; if you have to redundantly move code → Extract Superclass
---
#### Group C: Change Preventers — Code That Makes Change Difficult
**9. Divergent Change**
- Signal: One class is changed in different ways for different reasons — adding a database changes methods A, B, C; adding a financial instrument changes methods D, E, F. The same class absorbs multiple axes of change
- Remedy: Identify everything that changes for a particular cause and Extract Class to put it all together
**10. Shotgun Surgery**
- Signal: One change requires making many small changes in many different classes; a single logical change is scattered across the codebase
- Remedy: Move Method and Move Field to consolidate the behavior into a single class. If no existing class is a good home, create one; Inline Class can often bring scattered behavior together
- Divergent Change vs. Shotgun Surgery: Divergent Change = one class, many changes. Shotgun Surgery = one change, many classes. Either way, you want one logical change to map to one class.
---
#### Group D: Couplers — Excessive Coupling Between Classes
**11. Feature Envy**
- Signal: A method seems more interested in another class than the one it belongs to; it calls half a dozen getter methods on another object to calculate some value
- Remedy: Move Method to the class whose data the method uses most
- If only part of the method has envy: Extract Method on the envious part; then Move Method
- Exception: some patterns intentionally violate this rule. Strategy and Visitor deliberately separate behavior from the data it uses so that the behavior can be changed easily. The rule of thumb is to keep together the things that change together, which is usually the data and the behavior that reads it
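The Move Method remedy can be sketched as follows. `Tariff`, `Invoice`, and the discount rule are hypothetical names and values chosen for illustration:

```java
class Tariff {
    private final int unitPrice;
    private final int discountThreshold;
    private final int discountPercent;

    Tariff(int unitPrice, int discountThreshold, int discountPercent) {
        this.unitPrice = unitPrice;
        this.discountThreshold = discountThreshold;
        this.discountPercent = discountPercent;
    }

    int getUnitPrice() { return unitPrice; }
    int getDiscountThreshold() { return discountThreshold; }
    int getDiscountPercent() { return discountPercent; }

    // After Move Method: the calculation lives with the data it uses
    int priceFor(int units) {
        int gross = units * unitPrice;
        if (units >= discountThreshold) {
            gross -= gross * discountPercent / 100;
        }
        return gross;
    }
}

class Invoice {
    // Before: envious method, built entirely from Tariff's getters
    static int chargeEnvious(Tariff tariff, int units) {
        int gross = units * tariff.getUnitPrice();
        if (units >= tariff.getDiscountThreshold()) {
            gross -= gross * tariff.getDiscountPercent() / 100;
        }
        return gross;
    }

    // After: Invoice asks Tariff to do the work
    static int charge(Tariff tariff, int units) {
        return tariff.priceFor(units);
    }

    public static void main(String[] args) {
        Tariff tariff = new Tariff(5, 10, 20);
        System.out.println(charge(tariff, 10));
    }
}
```

After the move, the three getters may no longer have any callers, which is a sign the data is now properly encapsulated.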
**12. Data Clumps**
- Signal: The same 3-4 data items appear together repeatedly — in field lists, parameter lists, and method signatures; deleting one of the items from the group would make the others meaningless
- First step: look for where the clumps appear as fields → Extract Class to turn the clumps into an object
- Then: check method signatures using Introduce Parameter Object or Preserve Whole Object to slim them down
- Test: delete one item from the group — if the others don't make sense without it, you have an object waiting to be born
**13. Message Chains**
- Signal: `a.getB().getC().getD().getValue()` — a client asks one object for another, then asks that object for another, navigating a chain. Any change to the intermediate structure requires changing the client
- Remedy: Hide Delegate — at various points in the chain, introduce a delegation method
- But avoid over-applying: turning every intermediate object into a middle man creates Middle Man smell. Use Extract Method to take a piece of code that uses the chain; then Move Method to push it down the chain closer to the data
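A short Hide Delegate sketch, assuming a hypothetical `Person` and `Department` pair where clients previously wrote `person.getDepartment().getManager()`:

```java
class Department {
    private final Person manager;

    Department(Person manager) { this.manager = manager; }

    Person getManager() { return manager; }
}

class Person {
    private final String name;
    private Department department;

    Person(String name) { this.name = name; }

    String getName() { return name; }
    void setDepartment(Department department) { this.department = department; }

    // Delegating method: clients no longer need to know Department exists
    Person getManager() {
        return department.getManager();
    }

    public static void main(String[] args) {
        Person boss = new Person("Ada");
        Person john = new Person("John");
        john.setDepartment(new Department(boss));
        System.out.println(john.getManager().getName());
    }
}
```

If `Person` accumulates many such one-line delegations, that is the point at which the Middle Man smell starts to appear.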
**14. Middle Man**
- Signal: A class delegates half or more of its methods to another class; the interface is hollow — it does nothing itself
- If too many methods delegate: Remove Middle Man; talk directly to the object that knows
- If only a few methods delegate: Inline Method to inline the delegation
- If there is additional behavior attached to the delegation: Replace Delegation with Inheritance to make the middle man a subclass of the real object
**15. Inappropriate Intimacy**
- Signal: Two classes know too much about each other's private parts; one class accesses another's private fields directly or depends heavily on its internal implementation
- Remedy: Move Method and Move Field to separate the pieces and reduce intimacy
- If a bidirectional association exists: Change Bidirectional Association to Unidirectional
- If common interests exist: Extract Class to put the common behavior in a safe shared place; use Hide Delegate to let another class act as go-between
- Inheritance overintimacy: if a subclass knows more about its parent than is appropriate → Replace Inheritance with Delegation
**16. Incomplete Library Class**
- Signal: A library class doesn't do exactly what you need; you can't modify it (it's not your code); the workaround is scattered through your code
- A few missing methods: Introduce Foreign Method — add the needed method to your own class; document that it belongs conceptually to the library class
- Lots of extra behavior needed: Introduce Local Extension — create a subclass or wrapper of the library class with all the needed additions
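As a sketch of Introduce Foreign Method, suppose the code needs a "next weekday" operation that `java.time.LocalDate` does not provide. The `DeliveryScheduler` class and the `nextWeekday` method are hypothetical:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

class DeliveryScheduler {
    // Foreign method: conceptually belongs on LocalDate, which this code
    // cannot modify. It takes the library object as its first argument.
    static LocalDate nextWeekday(LocalDate date) {
        LocalDate next = date.plusDays(1);
        while (next.getDayOfWeek() == DayOfWeek.SATURDAY
                || next.getDayOfWeek() == DayOfWeek.SUNDAY) {
            next = next.plusDays(1);
        }
        return next;
    }

    public static void main(String[] args) {
        // 2024-03-01 is a Friday, so the next weekday is the following Monday
        System.out.println(nextWeekday(LocalDate.of(2024, 3, 1)));
    }
}
```

If several such methods accumulate, that is the signal to move to Introduce Local Extension and gather them in one wrapper or subclass.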
---
#### Group E: Dispensables — Unnecessary Code That Should Not Exist
**17. Lazy Class**
- Signal: A class that isn't doing enough to justify its existence — it was added speculatively, was refactored down to almost nothing, or is a subclass that barely differs from its parent
- Subclasses not doing enough → Collapse Hierarchy
- Nearly useless components → Inline Class
**18. Speculative Generality**
- Signal: Abstract classes, hooks, special cases, and parameters added for future use that no current code actually exercises; the only callers are test cases
- Abstract classes not doing much → Collapse Hierarchy
- Unnecessary delegation → Inline Class
- Methods with unused parameters → Remove Parameter
- Methods with odd abstract names → Rename Method
- Detection: if the only users of a method or class are test cases, delete it and the test case that exercises it. Keep it only when it is a helper for a test case that exercises legitimate functionality
**19. Temporary Field**
- Signal: An instance variable is set only in certain circumstances — it's only valid during some algorithms and is otherwise empty or null; understanding why it's there when it seems unused is confusing
- Remedy: Extract Class to create a home for the orphan variables; put all the code that concerns those variables into the new class
- Use Introduce Null Object to create an alternative component for when the variables aren't valid
- Common pattern: a complicated algorithm that needs several fields but only during computation → those fields belong in a Method Object, not in the host class
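A minimal sketch of that last case, assuming a hypothetical `Invoice` class whose late-fee calculation used to park its intermediate values in instance fields. The class names, `grace_days`, and `daily_rate` are illustrative assumptions, not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    amount: float
    days_overdue: int

    def late_fee(self) -> float:
        # The host class keeps no scratch fields; it hands the work to a method object.
        return LateFeeCalculation(self).compute()


class LateFeeCalculation:
    """Method object: owns the fields that are only valid during the computation."""

    def __init__(self, invoice: Invoice) -> None:
        self.invoice = invoice
        self.grace_days = 10      # hypothetical policy value
        self.daily_rate = 0.001   # hypothetical policy value

    def compute(self) -> float:
        chargeable_days = max(0, self.invoice.days_overdue - self.grace_days)
        return self.invoice.amount * self.daily_rate * chargeable_days
```

The fields that were sometimes empty on `Invoice` are now always valid for the lifetime of a `LateFeeCalculation` instance.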
---
#### Group F: Comments — A Diagnostic Signal, Not a Smell Itself
**20. Comments** (used as deodorant)
- Signal: Comments that explain what a block of code does (not why); comments that compensate for confusing code; thickly commented code that would be unnecessary if the code were clearer
- Comments are not a smell themselves — they can be a sweet smell. The signal is comments used as deodorant to mask other smells
- If a comment explains what a block does → Extract Method; name the method after what the comment says
- If the method is extracted but still needs a comment to explain what it does → Rename Method
- If you need to state rules about required system state → Introduce Assertion
- Appropriate comments: explaining why something is done, noting uncertainty, documenting constraints that cannot be expressed in code
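As a rough illustration of the three remedies, the sketch below assumes a method that once carried "# calculate outstanding" and "# print details" section comments plus a comment stating that amounts are never negative. All function names are hypothetical.

```python
def outstanding_total(amounts: list[float]) -> float:
    # Extract Method: the name replaces the old "# calculate outstanding" comment.
    return sum(amounts)


def format_statement(name: str, outstanding: float) -> str:
    # Extract Method: the name replaces the old "# print details" comment.
    return f"name: {name}, amount: {outstanding:.2f}"


def statement_for(name: str, amounts: list[float]) -> str:
    # Introduce Assertion: the required state is checked in code, not described in a comment.
    assert all(a >= 0 for a in amounts), "amounts must be non-negative"
    return format_statement(name, outstanding_total(amounts))
```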
**The remaining two smells appear in object hierarchy design:**
**21. Data Class**
- Signal: A class with fields, getters, and setters — and nothing else. It exists only as a data holder; other classes manipulate it in far too much detail
- Public fields → Encapsulate Field immediately; check collection fields → Encapsulate Collection
- Fields that should not change → Remove Setting Method
- Find where getters and setters are used → Move Method to move behavior into the Data Class
- If you can't move a whole method → Extract Method; then Move Method
- Goal: the Data Class should take on responsibility rather than just being manipulated
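One possible end state, sketched with a hypothetical `Account` class: the fields are encapsulated, the limit has no setter, and the withdrawal rule that a client used to compute from getters now lives on the class itself.

```python
class Account:
    """Hypothetical former Data Class that now owns its own rule."""

    def __init__(self, balance: float, overdraft_limit: float) -> None:
        self._balance = balance                  # Encapsulate Field: no public attribute
        self._overdraft_limit = overdraft_limit  # Remove Setting Method: fixed after creation

    @property
    def balance(self) -> float:
        return self._balance

    def can_withdraw(self, amount: float) -> bool:
        # Move Method: this check used to live in a client that read both fields.
        return amount <= self._balance + self._overdraft_limit

    def withdraw(self, amount: float) -> None:
        if not self.can_withdraw(amount):
            raise ValueError("withdrawal exceeds balance plus overdraft limit")
        self._balance -= amount
```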
**22. Refused Bequest** — covered under Group B (Object-Oriented Abusers) above.
---
### Step 3: Prioritize Findings
**ACTION:** Rank all identified smells for the diagnosis report.
**WHY:** Not all smells are equal. Treating a cosmetic smell (Lazy Class) with the same urgency as a change-blocking smell (Shotgun Surgery) wastes time and de-prioritizes what matters. Prioritization makes the report actionable — the developer knows where to start.
**Prioritization criteria (apply in order):**
1. **Change preventers first** (Divergent Change, Shotgun Surgery, Parallel Inheritance Hierarchies) — these make every future feature harder. Fix them before adding new behavior.
2. **Couplers second** (Feature Envy, Inappropriate Intimacy, Message Chains, Middle Man) — high coupling spreads bugs. Fixing coupling makes subsequent refactorings easier.
3. **Bloaters third, starting with the most impactful** — Long Method and Large Class are the most common bug attractors. Long Parameter List follows.
4. **OO abusers** (Switch Statements, Refused Bequest) — significant structural investment but high payoff.
5. **Dispensables last** (Lazy Class, Speculative Generality, Temporary Field) — these are cleanup; do them after the structural work.
**Severity tiers for the report:**
- **HIGH** — the smell is actively making the code fragile, bug-prone, or difficult to change; address before the next feature
- **MEDIUM** — the smell is a real problem but not immediately blocking; address in the current cleanup cycle
- **LOW** — weak signal or borderline; flag for awareness; no immediate action required
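The ranking can be sketched as a two-part sort key. This is only an illustration: the `Finding` type and the group labels are hypothetical, and breaking ties within a group by severity tier is an assumption rather than something the criteria above state.

```python
from dataclasses import dataclass

# Group order follows the prioritization criteria; lower values sort first.
GROUP_RANK = {
    "change_preventer": 0,
    "coupler": 1,
    "bloater": 2,
    "oo_abuser": 3,
    "dispensable": 4,
}
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass(frozen=True)
class Finding:
    smell: str
    group: str
    severity: str


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Sort by smell group first, then by severity tier within the group (assumed tie-break)."""
    return sorted(findings, key=lambda f: (GROUP_RANK[f.group], SEVERITY_RANK[f.severity]))
```

Because `sorted` is stable, findings with the same group and severity keep the order in which they were recorded.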
---
### Step 4: Write the Diagnosis Report
**ACTION:** Produce the structured diagnosis report as the skill's deliverable.
**WHY:** The report is what the developer acts on. It must be specific enough to be actionable (naming the exact smell and the exact refactoring) and organized enough to be usable as a task list or review checklist. Vague reports ("this code needs work") produce no change. Named smells with prescribed remedies produce refactoring plans.
**Output format:**
```markdown
# Code Smell Diagnosis — [Target: class/module/directory name]
## Summary
Files scanned: [count]
Smells found: [N total — X HIGH, Y MEDIUM, Z LOW]
Primary cluster: [the dominant smell group — e.g., "Change Preventers + Bloaters"]
---
## Priority Findings
### Finding #N — [Smell Name] — [HIGH | MEDIUM | LOW]
**Location:** [file:line or class/method name]
**Evidence:** [what in the code signals this smell — be specific]
**Prescription:** [the Fowler-prescribed refactoring(s)]
**Why this prescription:** [the specific conditional branch that applies — e.g., "duplication is in sibling subclasses, not the same class, so Pull Up Method rather than just Extract Method"]
**Next step:** [the single first action to take]
---
[repeat per finding]
---
## Refactoring Sequence
[Ordered list of the prescribed refactorings in the recommended sequence.
Change preventers first. Dependencies between refactorings noted.]
## Related Skills
[Which sibling skills to invoke for execution]
```
---
## Key Principles
**1. Name the smell before prescribing the remedy.**
A generic prescription ("extract this into a method") without naming the smell misses the diagnostic value. The smell name carries the full decision tree with it. "Feature Envy" tells a developer exactly what class the method belongs in. "Switch Statement on a type code" tells them exactly which three refactorings to apply in sequence.
**2. Follow the conditional branches precisely.**
Most smells have context-dependent prescriptions. Duplicated Code in the same class has a different remedy than Duplicated Code in sibling subclasses. Switch Statements affecting a single method have a different remedy than Switch Statements scattered across the codebase. Getting the branch wrong produces the wrong refactoring.
**3. One smell often hides another.**
Data Clumps often reveal Feature Envy once turned into objects. Long Method is often Divergent Change in disguise. Shotgun Surgery is often caused by a missing class that Data Clumps would reveal. When you name one smell, check whether it points to a deeper one.
**4. Smells are not rules — they are signals.**
Fowler and Beck explicitly decline to give precise metrics. A 25-line method might be fine; a 10-line method might reek. The judgment is: does this code make the next change harder? Does it attract bugs? Is it difficult to understand? The smell is an indicator, not a verdict.
**5. Comments are a diagnostic tool, not a finding.**
Heavy commenting often signals other smells. Use comments as a guide to where Extract Method and Rename Method are needed — the smell is what the comment is masking, not the comment itself.
---
## Examples
### Example 1: Service Class Audit
**Scenario:** A developer inherits a `PaymentService` class (280 lines, 12 methods, 9 instance variables) and asks for a diagnosis before adding a new payment gateway.
**Step 1 — Structural read:**
- 280 lines, 12 methods: candidate for Large Class
- 9 instance variables: some look like they form clusters (`gatewayUrl`, `gatewayApiKey`, `gatewayTimeout` → Data Clumps candidate)
- Method names: `processPayment`, `validateCard`, `logTransaction`, `formatCurrency`, `sendWebhook`, `retryWebhook`, `buildGatewayRequest`, `parseGatewayResponse` — some methods clearly work on gateway concerns, others on local concerns
**Step 2 — Smell check (selected findings):**
*Divergent Change (HIGH):* The class changes when the gateway changes (3 methods: `buildGatewayRequest`, `parseGatewayResponse`, `processPayment`) AND when the webhook logic changes (2 methods: `sendWebhook`, `retryWebhook`) AND when logging changes (1 method: `logTransaction`). Three axes of change in one class.
*Data Clumps (MEDIUM):* `gatewayUrl`, `gatewayApiKey`, `gatewayTimeout` appear together in every gateway-related method signature. Deleting `gatewayUrl` makes the others meaningless.
*Long Method (MEDIUM):* `processPayment` is 55 lines with embedded comments marking sections ("// validate", "// build request", "// process", "// log").
**Step 3 — Prioritize:** Divergent Change first (it's a change preventer). Data Clumps second (will simplify gateway extraction). Long Method third (the extracted sections will become methods in the new class).
**Diagnosis report excerpt:**
```markdown
### Finding #1 — Divergent Change — HIGH
Location: PaymentService (entire class)
Evidence: The class changes for at least three distinct reasons:
(1) Gateway integration changes → buildGatewayRequest, parseGatewayResponse, processPayment
(2) Webhook delivery changes → sendWebhook, retryWebhook
(3) Logging changes → logTransaction
Prescription: Extract Class for each axis of change.
- Extract GatewayClient (gateway request/response)
- Extract WebhookDispatcher (send + retry logic)
- PaymentService retains orchestration only
Why this prescription: Divergent Change calls for Extract Class on each axis.
The test: "I have to change these N methods every time X happens."
Three different values of X = three different classes needed.
Next step: Extract GatewayClient first — it's the largest axis and will expose
the Data Clumps smell for remedy.
```
---
### Example 2: Single Method Diagnosis
**Scenario:** A code reviewer asks what's wrong with this Python method:
```python
def calculate_charge(customer_type, base_price, quantity, discount_code,
                     loyalty_years, is_weekend, tax_rate):
    if customer_type == 'enterprise':
        price = base_price * quantity * 0.85
    elif customer_type == 'retail':
        price = base_price * quantity
    elif customer_type == 'wholesale':
        price = base_price * quantity * 0.7
    else:
        price = base_price * quantity
    if discount_code == 'SUMMER10':
        price *= 0.90
    elif discount_code == 'VIP20':
        price *= 0.80
    if loyalty_years > 5:
        price *= 0.95
    if is_weekend:
        price *= 1.05
    return price * (1 + tax_rate)
```
**Step 2 — Smell check:**
*Long Parameter List (HIGH):* 7 parameters. Why this branch: there is no obvious existing object that holds these — `customer_type`, `loyalty_years` suggest a Customer object; `base_price`, `quantity`, `discount_code` suggest an Order. Two Introduce Parameter Object applications.
*Switch Statements (HIGH):* The `customer_type` branching recurs — any new customer type requires finding every switch on `customer_type` and adding a case. Why this branch: type code affects behavior, and customer type is unlikely to change at runtime → Replace Type Code with Subclasses on Customer, then Replace Conditional with Polymorphism for the pricing logic.
**Diagnosis report excerpt:**
```markdown
### Finding #1 — Switch Statements — HIGH
Location: calculate_charge(), lines 3-10 (customer_type branching)
Evidence: A switch on customer_type determines pricing multiplier.
This switch will exist wherever customer pricing is computed.
Every new customer type requires finding every such switch.
Prescription: Replace Type Code with Subclasses (EnterpriseCustomer,
RetailCustomer, WholesaleCustomer); then Replace Conditional with
Polymorphism (each subclass implements its own price multiplier method).
Why this branch: type code affects behavior; customer type doesn't change
at runtime for a given customer → subclassing is appropriate.
(If type changed at runtime, use Replace Type Code with State/Strategy instead.)
Next step: Introduce Parameter Object first (Finding #2) to create a
Customer object; then apply type code replacement on that object.
### Finding #2 — Long Parameter List — HIGH
Location: calculate_charge() signature (7 parameters)
Evidence: customer_type, loyalty_years → Customer data cluster.
base_price, quantity, discount_code → Order data cluster.
Deleting customer_type makes loyalty_years ambiguous without context.
Prescription: Introduce Parameter Object twice — Customer(customer_type,
loyalty_years) and Order(base_price, quantity, discount_code).
Then: Preserve Whole Object — pass Customer and Order, not their fields.
Next step: Define Customer and Order dataclasses; update the signature.
This unblocks the Switch Statement refactoring (Finding #1).
Refactoring sequence:
1. Introduce Parameter Object: Customer, Order
2. Replace Type Code with Subclasses on Customer
3. Replace Conditional with Polymorphism for pricing
4. Preserve Whole Object in calculate_charge signature
```
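For orientation, here is one way the code could look after the four-step sequence. It is a sketch, not the prescribed result: the multiplier method names are hypothetical, the base `Customer` default stands in for the original `else` branch, and the discount lookup is kept as a simple table because the findings do not address it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    loyalty_years: int

    def price_multiplier(self) -> float:
        # Default mirrors the original retail and fallback branches.
        return 1.0

    def loyalty_multiplier(self) -> float:
        return 0.95 if self.loyalty_years > 5 else 1.0


class RetailCustomer(Customer):
    """Uses the default multiplier."""


class EnterpriseCustomer(Customer):
    def price_multiplier(self) -> float:
        return 0.85


class WholesaleCustomer(Customer):
    def price_multiplier(self) -> float:
        return 0.7


@dataclass(frozen=True)
class Order:
    base_price: float
    quantity: int
    discount_code: Optional[str] = None

    def discount_multiplier(self) -> float:
        return {"SUMMER10": 0.90, "VIP20": 0.80}.get(self.discount_code, 1.0)


def calculate_charge(customer: Customer, order: Order,
                     is_weekend: bool, tax_rate: float) -> float:
    # Preserve Whole Object: the function receives Customer and Order, not their fields.
    price = order.base_price * order.quantity * customer.price_multiplier()
    price *= order.discount_multiplier()
    price *= customer.loyalty_multiplier()
    if is_weekend:
        price *= 1.05
    return price * (1 + tax_rate)
```

Adding a new customer type now means adding one subclass; `calculate_charge` itself does not change.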
---
### Example 3: Inheritance Hierarchy Diagnosis
**Scenario:** A codebase has `Animal`, `Dog`, `Cat`, `Bird` hierarchy. A developer notices that every time a new animal species is added, a parallel set of `AnimalSound`, `DogSound`, `CatSound`, `BirdSound` classes must also be added.
**Step 2 — Smell check:**
*Parallel Inheritance Hierarchies (HIGH):* Subclassing `Animal` always requires subclassing `AnimalSound`. The prefix pattern is exact (Dog/DogSound, Cat/CatSound).
*Prescription:* Make instances of one hierarchy refer to instances of the other. Use Move Method and Move Field to move the behavior from the `AnimalSound` hierarchy into the `Animal` hierarchy. Once all behavior is moved, the `AnimalSound` hierarchy disappears.
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/smell-catalog.md` | Full detection criteria for all 22 smells with code examples per language; borderline cases; false positive filters | Step 2 — systematic smell check |
| `references/refactoring-prescriptions.md` | Full prescription tree per smell with all conditional branches; cross-references to Fowler catalog chapters | Step 2 — selecting the correct refactoring |
**Hub skill relationships:**
- `type-code-refactoring-selector` — when Switch Statements or Primitive Obsession (type code variant) are diagnosed
- `conditional-simplification-strategy` — when Switch Statements or deeply nested conditionals are present
- `class-responsibility-realignment` — when Feature Envy, Inappropriate Intimacy, Divergent Change, or Shotgun Surgery are the primary findings
- `big-refactoring-planner` — when the diagnosis reveals systemic design problems requiring coordinated multi-step refactoring
- `data-organization-refactoring` — when Data Clumps, Primitive Obsession, or Data Class are primary findings
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-type-code-refactoring-selector`
- `clawhub install bookforge-conditional-simplification-strategy`
- `clawhub install bookforge-class-responsibility-realignment`
- `clawhub install bookforge-big-refactoring-planner`
- `clawhub install bookforge-data-organization-refactoring`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Redistribute methods and fields to the classes that own them, repair broken inheritance hierarchies, and extend unmodifiable library classes. Use when: a met...
---
name: class-responsibility-realignment
description: |
Redistribute methods and fields to the classes that own them, repair broken inheritance hierarchies, and extend unmodifiable library classes. Use when: a method uses another class's data far more than its own (Feature Envy — the method belongs in the other class); a single logical change forces edits across many files (Shotgun Surgery — scattered behavior needs consolidation); a class delegates half or more of its methods without adding value (Middle Man — remove the hollow layer or collapse it into inheritance); two classes share too much private knowledge about each other (Inappropriate Intimacy — separate the entangled pieces); a recurring group of fields travels together across class boundaries (Data Clumps — extract the group into its own class); a class has grown to serve two distinct responsibilities that change for different reasons (Extract Class — split it); a class that used to have a purpose has been refactored down to almost nothing (Inline Class — absorb it into its most active collaborator). For inheritance hierarchies: move common behavior upward (Pull Up Method, Pull Up Field) when subclasses duplicate it; push specialized behavior downward (Push Down Method, Push Down Field, Extract Subclass) when only some subclasses need it; create a shared abstraction over two similar but unrelated classes (Extract Superclass); extract a protocol-only contract (Extract Interface) when clients use only a subset of the class; merge a hierarchy that has converged (Collapse Hierarchy); move similar-but-not-identical subclass methods into a common template (Form Template Method); swap inheritance for delegation (Replace Inheritance with Delegation) when a subclass uses only part of the superclass interface or inherits inappropriate data; swap back (Replace Delegation with Inheritance) when the delegation covers the full interface and adds no control. 
For unmodifiable library classes: add one or two methods as foreign methods in the client class; add many methods via a local extension (subclass or wrapper). Decision rule for encapsulation balance: Hide Delegate to shield clients from internal structure; Remove Middle Man when the delegating layer has grown hollow. Decision rule for inheritance vs delegation: use Extract Subclass when variation is fixed at construction time and single-dimensional; use Extract Class (delegation) when variation is runtime-flexible or multi-dimensional — "if you want the class to vary in several different ways, you have to use delegation for all but one of them."
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/class-responsibility-realignment
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on:
- code-smell-diagnosis
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [7, 11]
tags: [refactoring, code-quality, object-oriented-design, responsibilities]
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "Source code — the class or set of classes whose responsibility assignment is being corrected"
- type: document
description: "Code-smell-diagnosis report naming Feature Envy, Shotgun Surgery, Middle Man, Inappropriate Intimacy, or a hierarchy smell — use this if no live codebase is accessible"
tools-required: [Read, Grep, Write, Edit]
tools-optional: [Bash]
mcps-required: []
environment: "Run inside a project directory. Read source files to assess which class owns which data; grep to find all callers before moving anything."
discovery:
goal: "Identify the correct home for each misplaced method or field; select the appropriate refactoring from the responsibility-redistribution, hierarchy-repair, or library-extension catalogs; execute step by step with compile-and-test after each move"
tasks:
- "Read the target class to determine what data and behavior it contains"
- "Identify which other classes the target class interacts with most heavily"
- "Classify the structural problem using the smell-to-refactoring decision framework"
- "Select and apply the correct refactoring mechanics"
- "Verify: each class now does one clearly named thing; no class acts as a pass-through without adding value"
audience:
roles: ["software-developer", "senior-developer", "tech-lead"]
experience: "intermediate — assumes familiarity with object-oriented design and basic refactoring mechanics"
triggers:
- "Feature Envy, Shotgun Surgery, Middle Man, or Inappropriate Intimacy diagnosed by code-smell-diagnosis"
- "A class has two distinct change axes (Divergent Change) and needs splitting"
- "Duplicated behavior across subclasses that should live in the superclass"
- "A subclass rejects its inheritance by throwing exceptions or ignoring parent data (Refused Bequest)"
- "A library class is missing one or more methods needed throughout the codebase"
- "A delegation layer has grown to cover the entire interface of the delegate"
not_for:
- "Type code and switch statement refactoring — use type-code-refactoring-selector instead"
- "Conditional logic simplification — use conditional-simplification-strategy instead"
- "Diagnosis of which smell is present — run code-smell-diagnosis first"
- "Performance optimization — use profiling-driven-performance-optimization instead"
---
# Class Responsibility Realignment
## When to Use
You have identified (via `code-smell-diagnosis` or direct inspection) that behavior or data is in the wrong class. The concrete signals are:
- A method calls six getter methods on another object to do its work — it belongs in that other class (Feature Envy)
- One logical change requires touching five files — the behavior is scattered and needs a home (Shotgun Surgery)
- A class does nothing but forward calls to another class — it is a hollow Middle Man
- Two classes share internal knowledge they should not — they are Inappropriately Intimate
- A cluster of fields migrates together across method signatures — it is a Data Clump waiting to be extracted
- A hierarchy has duplicate methods across siblings, or specialized behavior sitting in the wrong layer
- A library class is missing one or two methods you need repeatedly but cannot add directly
**The core insight from Fowler:** One of the most fundamental decisions in object design is deciding where to put responsibilities. Getting it wrong the first time is not a problem — refactoring exists to correct it. The moment you realize a method uses another class's data more than its own, you have the diagnosis and the prescription simultaneously: Move Method.
---
## Context and Input Gathering
### Required Input
- **The target class or classes.** Either a file path, a class name, or the output of a `code-smell-diagnosis` report. Why: responsibility decisions are made relative to the data each class owns — you cannot realign without knowing what each class contains.
- **The smell classification.** Which of the catalog smells applies (Feature Envy, Shotgun Surgery, Middle Man, Inappropriate Intimacy, Data Clumps, hierarchy smell, library gap). Why: each smell maps to a distinct set of refactorings; applying the wrong one produces a different structural problem.
### Observable Context
Before touching anything:
```
Scan targets:
- Which class does this method call getters on most? → Move Method candidate
- Which fields travel together in parameter lists? → Extract Class / Data Clumps
- How many files change together for one logical edit? → Shotgun Surgery scope
- How many methods on class A just call class B? → Middle Man threshold
- Which subclass methods are identical? → Pull Up Method candidates
- Does a subclass throw NotImplemented or ignore parent fields? → Refused Bequest
```
### Default Assumptions
- If the smell is ambiguous between Feature Envy and Inappropriate Intimacy: start with Move Method; see which class becomes more cohesive
- If Shotgun Surgery scope is large (>5 files): consolidate into a new class rather than an existing one — no existing class is the right home if the behavior is truly scattered
- If Move Field is needed alongside Move Method: move the field first (Fowler's preference — it stabilizes the data layout before moving the behavior)
---
## Process
### Step 1: Classify the Structural Problem
**ACTION:** Map the symptoms to one of the three problem clusters below.
**WHY:** Each cluster uses a different set of refactorings. Misclassification wastes a move or creates a new smell.
**Cluster A — Misplaced behavior (flat class graph)**
| Symptom | Refactoring |
|---------|------------|
| Method uses another class's data more than its own | Move Method |
| Field is referenced more by another class | Move Field |
| Class does too much; subset of fields/methods coheres separately | Extract Class |
| Class does too little; its responsibilities fit an absorbing class | Inline Class |
| Client navigates `a.getB().getC()` chains | Hide Delegate (add delegating method on server) |
| Server delegates more than half its interface, adds no value | Remove Middle Man |
| Library class missing 1–2 methods | Introduce Foreign Method |
| Library class missing 3+ methods | Introduce Local Extension |
**Cluster B — Hierarchy misalignment**
| Symptom | Refactoring |
|---------|------------|
| Identical method/field in two or more subclasses | Pull Up Method / Pull Up Field |
| Constructors in subclasses with mostly identical bodies | Pull Up Constructor Body |
| Superclass method only meaningful in one subclass | Push Down Method |
| Superclass field only used in one subclass | Push Down Field |
| Subset of features used only in some instances | Extract Subclass |
| Two unrelated classes share common behavior | Extract Superclass |
| Several clients use only a subset of one class's interface | Extract Interface |
| Subclass and superclass have converged — barely different | Collapse Hierarchy |
| Two subclass methods perform similar steps in the same order, but the steps differ | Form Template Method |
| Subclass uses only part of superclass interface, or inherits wrong data | Replace Inheritance with Delegation |
| Delegation covers the full interface, delegation methods are boilerplate | Replace Delegation with Inheritance |
**Cluster C — Encapsulation balance**
| Symptom | Action |
|---------|--------|
| Clients know too much about delegate's internals | Add Hide Delegate methods on server |
| Server has grown into a pass-through layer | Remove Middle Man — let clients talk to delegate |
---
### Step 2: Apply Cluster A — Misplaced Behavior
**ACTION:** Execute the mechanics for the identified refactoring(s).
**WHY:** Each step produces a compilable intermediate state; verifying after each move catches errors before they compound.
#### Move Method
When a method is more interested in another class than its own:
1. **Examine all features** the method uses. If several features from the same source class are involved, consider moving them as a group — moving a clutch of methods is sometimes easier than moving them one at a time.
2. **Check subclasses and superclasses** of the source class for other declarations. If other declarations exist, the move may be blocked unless the target class can also express the polymorphism.
3. **Declare the method in the target class.** A rename that makes better sense in the target context is appropriate.
4. **Copy the method body.** Adjust it to work in its new home — four options for referencing the source object: (a) move the feature to the target class as well, (b) create or use an existing reference from target to source, (c) pass the source object as a parameter, (d) pass the specific value as a parameter.
5. **Turn the source method into a simple delegation** (return `target.methodName()`). Compile and test.
6. **Decide whether to remove the delegation.** Leaving it is easier if there are many callers; removing it is cleaner if callers can be updated.
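The state after step 5 might look like the sketch below, loosely modeled on an account and account-type pairing. The class names and the charge values are illustrative assumptions.

```python
class AccountType:
    def __init__(self, premium: bool) -> None:
        self.premium = premium

    def overdraft_charge(self, days_overdrawn: int) -> float:
        # Step 4, option (d): the one value needed from the source is passed in.
        if self.premium:
            charge = 10.0
            if days_overdrawn > 7:
                charge += (days_overdrawn - 7) * 0.85
            return charge
        return days_overdrawn * 1.75


class Account:
    def __init__(self, account_type: AccountType, days_overdrawn: int) -> None:
        self.account_type = account_type
        self.days_overdrawn = days_overdrawn

    def overdraft_charge(self) -> float:
        # Step 5: the source method is now a simple delegation.
        return self.account_type.overdraft_charge(self.days_overdrawn)
```

If `overdraft_charge` later needs several values from `Account`, option (c), passing the source object itself, avoids a growing parameter list.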
#### Move Field
When a field is used more by another class:
1. **If the field is public,** use Encapsulate Field first. If many methods access it, use Self Encapsulate Field so only accessors touch the raw field.
2. **Create the field with getter/setter in the target class.** Compile.
3. **Determine how to reference the target object** from the source. Use an existing field or method; if none exists, create one (possibly temporary).
4. **Remove the field from the source class.** Replace all references with calls to the target's accessor.
5. Compile and test.
#### Extract Class
When a class is doing work that should be done by two:
1. **Decide the split.** A useful test: imagine removing a field or method and ask which other fields and methods would stop making sense. The features that become meaningless together belong together, and that group marks an extraction boundary.
2. **Create the new class.** Rename the old class if its name no longer matches its reduced scope.
3. **Move fields first** (Move Field), then **move methods** (Move Method) — start with lower-level methods (called, not calling) and work toward higher-level ones.
4. **Review and reduce interfaces** after each move. If a two-way link exists, examine whether it can become one-way.
5. **Decide visibility of the new class.** If exposing it, decide between reference object (mutable, aliasing possible) and immutable value object.
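A small sketch of the result, assuming a hypothetical `Person` whose phone fields and formatting logic were split out. Here the extracted class is exposed as an immutable value object, which is one of the two choices in step 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TelephoneNumber:
    """Extracted class, modeled as an immutable value object."""
    area_code: str
    number: str

    def formatted(self) -> str:
        return f"({self.area_code}) {self.number}"


class Person:
    def __init__(self, name: str, area_code: str, number: str) -> None:
        self.name = name
        # One-way link: Person knows its TelephoneNumber, not the other way round.
        self._office_phone = TelephoneNumber(area_code, number)

    def telephone_number(self) -> str:
        # Person delegates instead of formatting the fields itself.
        return self._office_phone.formatted()
```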
#### Inline Class
When a class is no longer pulling its weight:
1. Declare the public protocol of the source class on the absorbing class. Delegate all methods to the source class temporarily.
2. Change all external references from the source class to the absorbing class.
3. Use Move Method and Move Field to move features from the source to the absorbing class until nothing is left.
4. Delete the empty class.
#### Hide Delegate vs. Remove Middle Man — the encapsulation balance
Fowler's observation: these two are inverse refactorings and neither is permanently correct. "Refactoring means you never have to say you're sorry — you just fix it." Apply the rule by tracking:
- **Hide Delegate:** Client calls `manager = john.getDepartment().getManager()`. Add `getManager()` directly to Person so the client does not need to know about Department. Do this for each method on the delegate that clients use. Remove the delegate accessor if clients no longer need it.
- **Remove Middle Man:** After applying Hide Delegate repeatedly, Person has grown a long list of one-liner delegation methods — it is now a Middle Man. Add an accessor for the delegate. For each client use of a delegation method, redirect the client to call the delegate directly. Remove each delegation method as its clients migrate.
**The judgment call:** Hide more when the delegate's implementation is likely to change (clients should not be coupled to it). Remove the middle man when the delegating layer adds no value and the cost of maintaining all those one-liners exceeds the coupling cost.
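Both directions can be seen in one sketch. The Python names below are assumptions that mirror the `getDepartment().getManager()` example above: `get_manager` is the Hide Delegate method, and the public `department` attribute is the accessor that Remove Middle Man would have clients use instead.

```python
from typing import Optional


class Department:
    def __init__(self, manager: "Person") -> None:
        self.manager = manager


class Person:
    def __init__(self, name: str, department: Optional[Department] = None) -> None:
        self.name = name
        self.department = department  # delegate accessor (what Remove Middle Man relies on)

    def get_manager(self) -> Optional["Person"]:
        # Hide Delegate: clients ask Person and never touch Department.
        return self.department.manager if self.department else None


mary = Person("Mary")
john = Person("John", Department(mary))

hidden = john.get_manager()        # client is shielded from Department
direct = john.department.manager   # client talks to the delegate directly
```

Both calls return the same object; the choice is about how much of `Department` clients are allowed to depend on.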
#### Library Extension Strategy
When you cannot modify a class but need additional behavior from it:
**1–2 methods needed → Introduce Foreign Method**
1. Create a method in the client class that does what is needed.
2. Make an instance of the server class the first parameter.
3. Comment the method: `// foreign method; should be on [ServerClass]`. Why: this marks it for migration if you later gain access to the source, and enables grep-based discovery of all foreign methods for a given class.
4. The method must not access any features of the client class — it should feel like it belongs on the server.
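A minimal sketch of these four steps, using Python's `datetime.date` as the class that cannot be modified. `BillingReport` and its method names are hypothetical.

```python
from datetime import date, timedelta


class BillingReport:
    """Hypothetical client class that needs a 'next day' operation on date."""

    def __init__(self, period_end: date) -> None:
        self.period_end = period_end

    def next_period_start(self) -> date:
        return self._next_day(self.period_end)

    @staticmethod
    def _next_day(day: date) -> date:
        # foreign method; should be on datetime.date
        return day + timedelta(days=1)
```

Declaring the method as a `staticmethod` is one way to enforce step 4: it has no access to `BillingReport` state, so it cannot drift into using client features.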
**3+ methods needed → Introduce Local Extension**
1. Decide between subclass and wrapper:
- **Subclass** (preferred): simpler, less code. Constraint: apply at object-creation time. If the original is immutable, a copy constructor suffices and aliasing is not a problem.
- **Wrapper**: required when (a) you cannot control creation of the original objects, (b) the original is mutable and you need the extension and the original to share state, or (c) you need to apply the extension to existing objects. Tradeoff: must delegate all original class methods, and symmetrical equality checks (`a.equals(b)` where `a` is original and `b` is wrapper) cannot be fully hidden.
2. Add converting constructors — one that takes an instance of the original as argument.
3. Add new methods to the extension class. Move any existing foreign methods defined for this class onto the extension.
4. Replace uses of the original class with uses of the extension where the new methods are needed.
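The subclass variant could be sketched as follows, again extending `datetime.date`. `ExtendedDate`, `from_date`, `next_day`, and `is_weekend` are hypothetical names, and `from_date` plays the role of the converting constructor from step 2.

```python
from datetime import date, timedelta


class ExtendedDate(date):
    """Hypothetical local extension of datetime.date, built as a subclass."""

    @classmethod
    def from_date(cls, original: date) -> "ExtendedDate":
        # Converting constructor: copying is safe because date is immutable.
        return cls(original.year, original.month, original.day)

    def next_day(self) -> "ExtendedDate":
        following = date(self.year, self.month, self.day) + timedelta(days=1)
        return type(self).from_date(following)

    def is_weekend(self) -> bool:
        return self.weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday
```

Because the extension is a subclass, an `ExtendedDate` can be passed anywhere a `date` is expected, which is the main reason to prefer it over a wrapper when creation can be controlled.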
---
### Step 3: Apply Cluster B — Hierarchy Repair
**ACTION:** Select and apply the hierarchy refactoring identified in Step 1.
**WHY:** Hierarchy smells compound — duplicate methods in subclasses create two maintenance points; misplaced fields make Pull Up Method harder. Address the fields before the methods.
#### Pull Up Field / Pull Up Method
Fields first, then methods. Why: Pull Up Method may depend on the field being in the superclass already.
- **Pull Up Field:** Inspect that all uses of the candidate fields are equivalent across subclasses. Rename if needed. Create the field in the superclass; delete from subclasses. Mark protected if subclasses need direct access.
- **Pull Up Method:** Inspect methods to confirm they are identical (or can be made identical via algorithm substitution). If similar but not identical, see Form Template Method. If the method references a feature that is only on the subclass, generalize that feature first (Pull Up the feature, or declare an abstract method on the superclass). Copy one body to the superclass; delete from subclasses one at a time, compiling and testing after each deletion.
- **Pull Up Constructor Body:** Constructors cannot be inherited. Extract the common initialization code into a superclass constructor; call it via `super(...)` from each subclass. If common code must run after subclass-specific initialization, use Extract Method on the common post-init logic and Pull Up Method to put it in the superclass, then call it explicitly from the subclass constructor.
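A sketch of Pull Up Constructor Body, including the post-initialization case. The `Employee`/`Manager` classes, the `grade > 4` rule, and the `has_car` attribute are illustrative assumptions, not taken from this skill.

```python
class Employee:
    def __init__(self, name: str, employee_id: str):
        # Common initialization pulled up from the subclasses.
        self.name = name
        self.employee_id = employee_id

    def is_privileged(self) -> bool:
        return False

    def _assign_car_if_privileged(self) -> None:
        # Common post-init logic: Extract Method, then Pull Up Method.
        self.has_car = self.is_privileged()


class Manager(Employee):
    def __init__(self, name: str, employee_id: str, grade: int):
        super().__init__(name, employee_id)
        self.grade = grade                # subclass-specific state first
        self._assign_car_if_privileged()  # then the shared post-init step

    def is_privileged(self) -> bool:
        return self.grade > 4
```

The shared step cannot live in the superclass constructor here because it depends on `grade`, which is only set after `super().__init__` returns.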
#### Push Down Method / Push Down Field
When superclass behavior is only relevant to some subclasses:
- Declare the method/field in all subclasses that need it. Copy the body. Remove from the superclass. Remove from subclasses that do not need it. Compile and test.
- If the superclass is abstract and the method should still be accessible via a superclass variable, declare it abstract on the superclass instead of removing it.
#### Extract Subclass
When a class has features used only in some instances:
1. The main alternative is Extract Class (delegation vs. inheritance). **Choose Extract Subclass when:** the variation is fixed at construction time (the object's kind does not change after creation), the variation is single-dimensional (one axis of difference), and the class being extended is modifiable. Why: subclassing is simpler, requiring only Push Down Method and Push Down Field, but class-based behavior cannot change once the object is created. With Extract Class, behavior can be changed later by plugging in a different component.
2. **Choose Extract Class instead when:** the object's kind varies at runtime, or the class must vary in multiple dimensions simultaneously. Fowler: "If you want the class to vary in several different ways, you have to use delegation for all but one of them."
3. Define the new subclass. Provide constructors. Find all calls to the superclass constructor that should use the subclass — replace with subclass constructor calls.
4. Use Push Down Method and Push Down Field to move specialized features. Unlike Extract Class, work on methods first, data last.
5. Find boolean or type-code fields that now encode information already expressed by the hierarchy — eliminate them by replacing their getter with a polymorphic constant method.
#### Extract Superclass
When two classes share common behavior:
1. Create a blank abstract superclass. Make both classes its subclasses.
2. Move common fields first (Pull Up Field). Move common methods (Pull Up Method). If method signatures differ but purpose is the same, rename first.
3. If methods have different bodies doing the same thing, try Substitute Algorithm to unify them.
4. If methods have the same signature but different bodies that cannot be unified, declare the method abstract on the superclass.
5. Check clients — if they use only the common interface, change their type declarations to the superclass.
#### Extract Interface
When several clients use only a subset of a class's interface, or when a class needs to work with any class that can handle certain requests:
- Create an empty interface. Declare the common operations. Declare the relevant classes as implementing the interface. Update client type declarations to use the interface.
- Unlike Extract Superclass, Extract Interface cannot move common code — only the contract. If common code is needed too, combine with Extract Class (for the shared implementation).
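In Python, an extracted interface can be sketched with `typing.Protocol`. The names `Billable`, `Employee`, and `charge`, and the 5% special-skill uplift, are hypothetical. Note that a Protocol is matched structurally, so the "declare the classes as implementing the interface" step is implicit here rather than written out.

```python
from typing import Protocol


class Billable(Protocol):
    """The subset of Employee that the charging code relies on."""

    def rate(self) -> float: ...
    def has_special_skill(self) -> bool: ...


class Employee:
    def __init__(self, name: str, rate: float, special: bool = False):
        self.name = name
        self._rate = rate
        self._special = special

    def rate(self) -> float:
        return self._rate

    def has_special_skill(self) -> bool:
        return self._special


def charge(worker: Billable, days: int) -> float:
    # The client depends only on the Billable contract, not on Employee.
    base = worker.rate() * days
    return base * 1.05 if worker.has_special_skill() else base
```

Only the contract moved: `Billable` carries no implementation, so any shared code would still need Extract Class or Extract Superclass.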
#### Collapse Hierarchy
When a subclass and superclass have converged:
- Choose which class to remove (usually the subclass). Pull Up or Push Down all methods and fields to the surviving class. Update all references. Remove the empty class.
#### Form Template Method
When two subclass methods perform the same steps in the same order but the steps are implemented differently:
1. **Decompose each method** using Extract Method so that each extracted piece is either identical across subclasses or completely different — no partial overlap.
2. **Pull Up the identical extracted methods** to the superclass.
3. **Rename the different methods** so they have the same signature in both subclasses. Why: making the signatures identical makes the calling method identical across subclasses — it calls the same method names in the same order.
4. **Pull Up the calling method** on one of the subclasses. Declare the different methods as abstract on the superclass.
5. Remove the calling method from the other subclass.
Result: the superclass holds the invariant algorithm; subclasses supply only the steps that vary. This is the Template Method pattern — adding a new variant requires only a new subclass that overrides the abstract steps.
#### Replace Inheritance with Delegation
When a subclass uses only part of the superclass interface or inherits data that does not make sense for it (Refused Bequest, strong form):
1. **Create a field in the subclass** that refers to the superclass — initialize it to `this` so delegation and inheritance can coexist during the transition.
2. **Change each subclass method** to use the delegate field instead of implicit inheritance. Compile and test after each change. Note: methods that call `super` cannot be changed until the inheritance link is broken — they would recurse.
3. **Remove the inheritance declaration.** Change the delegate field from `this` to a new instance of the former superclass.
4. **For each superclass method used by clients,** add a simple delegating method on the subclass. Compile and test.
Contraindications: do not apply when the subclass uses all methods of the superclass (delegation is boilerplate without benefit), or when the delegate is shared and mutable (data sharing cannot be replicated with delegation).
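The end state of Replace Inheritance with Delegation might look like the sketch below, assuming a hypothetical `Stack` that used to inherit from `list` and therefore exposed operations such as `insert` that make no sense for a stack.

```python
class Stack:
    """After the refactoring: holds a list instead of inheriting from it."""

    def __init__(self):
        # Delegate field: a new instance of the former superclass.
        self._items = []

    def push(self, element) -> None:
        self._items.append(element)

    def pop(self):
        return self._items.pop()

    # Simple delegating methods for the features clients still use.
    def __len__(self) -> int:
        return len(self._items)

    def is_empty(self) -> bool:
        return not self._items
```

Clients keep `push`, `pop`, `len()`, and `is_empty`, while the rest of the list interface is no longer reachable through `Stack`.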
#### Replace Delegation with Inheritance
When a class delegates to another class for the full interface and the delegating methods are pure boilerplate:
1. Precondition: the delegating class uses all of the delegate's methods. If not, use Remove Middle Man or Extract Interface instead.
2. Precondition: the delegate is not shared and mutable. If it is, delegation must remain (you cannot replicate shared mutable state via inheritance).
3. Make the delegating class a subclass of the delegate.
4. Set the delegate field to the object itself. Remove simple delegation methods. Compile and test.
5. Replace remaining uses of the delegate field with direct calls. Remove the delegate field.
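The reverse direction, sketched under the assumption that a hypothetical `Employee` previously held a `Person` delegate and forwarded every `Person` method to it:

```python
class Person:
    def __init__(self, name: str = ""):
        self.name = name

    def last_name(self) -> str:
        return self.name.rsplit(" ", 1)[-1]


class Employee(Person):
    """Previously held a Person and forwarded name/last_name to it."""

    def __str__(self) -> str:
        # Former delegate calls are now direct calls on self.
        return f"Emp: {self.last_name()}"
```

The forwarding methods and the delegate field are gone; `Employee` gets the full `Person` interface through inheritance.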
---
### Step 4: Verify the Realignment
**ACTION:** Confirm the structural goals are achieved.
**WHY:** It is possible to complete all mechanics correctly but still have the smell — if the wrong refactoring was chosen for the symptom, or if a second smell was hidden beneath the first.
Verify each:
- Each class has a name that describes a single responsibility. If the name requires "and" to describe what it does, the split may be incomplete.
- No method calls another class's getters more than its own class's fields. If Feature Envy persists, a deeper extraction is needed.
- No class delegates more than half its public methods without adding any logic. If Middle Man persists, Remove Middle Man was not fully applied.
- Hierarchy: identical code does not appear in two sibling subclasses. Pull Up was not applied completely if it does.
- The delegation/inheritance decision is correct: delegation if the class varies in multiple dimensions or at runtime; inheritance if the variation is single-axis and fixed at construction.
---
## Key Principles
**1. Move Field before Move Method when doing both.**
A method moved before its data is moved will still reference the old class via getter calls. Moving the field first stabilizes the data layout; the method move then resolves cleanly.
**2. The encapsulation balance is not fixed.**
Hide Delegate protects clients from implementation changes. Remove Middle Man removes hollow layers. Apply whichever is appropriate given current coupling pressure — and reapply the inverse when conditions change.
**3. Inheritance vs. delegation: the variation-dimensions rule.**
Fowler's criterion: if you need the class to vary in only one way, subclassing (Extract Subclass) is simpler. If you need it to vary in several different ways, you must use delegation for all but one of them — a class can only have one superclass, but it can hold multiple delegate objects.
**4. Foreign methods are a workaround, not a destination.**
Comment them as `// foreign method; should be on [ServerClass]` so they can be found and migrated if you gain access to the server class. If the count grows beyond 2, promote to a local extension.
**5. Form Template Method is Extract Method + Pull Up.**
It is a composed refactoring, not a single step. The key is the decomposition: extracted methods must be either identical (candidates for Pull Up) or completely different (candidates for abstract methods). Partial overlap defeats the pattern.
---
## Examples
### Example 1: Feature Envy → Move Method
**Scenario:** `Account.overdraftCharge()` branches on `_type.isPremium()` and needs only `_daysOverdrawn` from its own class; most of its logic is about `AccountType`, not `Account`.
**Diagnosis:** Feature Envy. The method is more interested in `AccountType` than in `Account`.
**Apply Move Method:**
1. Copy `overdraftCharge` to `AccountType`. Pass `daysOverdrawn` as a parameter (option d — pass the specific value).
2. Replace `Account.overdraftCharge()` body with `return _type.overdraftCharge(_daysOverdrawn);`
3. Compile and test. Callers of `account.overdraftCharge()` still work — the source is now a delegating method.
4. Decision: several account types will be added, each needing its own overdraft rule. Remove the delegation in `Account` and redirect all callers to `_type.overdraftCharge(_daysOverdrawn)` directly.
**Result:** `AccountType` owns overdraft logic. New account types require changes only in `AccountType` — Shotgun Surgery is eliminated.
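A Python sketch of the state after step 2, with the delegating method still in place. The charge figures (10, 0.85, 1.75) and the seven-day threshold are illustrative assumptions; the scenario above does not specify them.

```python
class AccountType:
    def __init__(self, premium: bool):
        self._premium = premium

    def is_premium(self) -> bool:
        return self._premium

    def overdraft_charge(self, days_overdrawn: int) -> float:
        # Moved here from Account; the one value it needs is a parameter.
        if self.is_premium():
            result = 10.0
            if days_overdrawn > 7:
                result += (days_overdrawn - 7) * 0.85
            return result
        return days_overdrawn * 1.75


class Account:
    def __init__(self, account_type: AccountType, days_overdrawn: int):
        self._type = account_type
        self._days_overdrawn = days_overdrawn

    def overdraft_charge(self) -> float:
        # Step 2: the source method is now a delegating method.
        return self._type.overdraft_charge(self._days_overdrawn)
```

Step 4 would delete `Account.overdraft_charge` and have callers ask the account type directly.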
---
### Example 2: Data Clumps → Extract Class
**Scenario:** `Person` has `_officeAreaCode` and `_officeNumber` as separate fields. Both appear in method signatures throughout the codebase. Deleting `_officeAreaCode` makes `_officeNumber` meaningless without context.
**Diagnosis:** Data Clumps. These two fields form a natural object.
**Apply Extract Class:**
1. Create `TelephoneNumber` class.
2. Add a link from `Person` to `TelephoneNumber`: `private TelephoneNumber _officeTelephone = new TelephoneNumber()`.
3. Move Field: move `_officeAreaCode` to `TelephoneNumber`. Update `Person`'s accessors to delegate: `getOfficeAreaCode()` → `_officeTelephone.getAreaCode()`.
4. Move Field: move `_officeNumber`. Move Method: move `getTelephoneNumber()` to `TelephoneNumber`.
5. Decide visibility: expose `TelephoneNumber` as a reference object (aliasing allowed) or make it immutable (safer).
**Result:** `TelephoneNumber` is a named concept. It can be reused by `HomeAddress`, `MobileContact`, and others without duplicating the area-code/number pair.
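The extraction can be sketched in Python as below. Property names and the `(area) number` output format are assumptions made for the example.

```python
class TelephoneNumber:
    def __init__(self, area_code: str = "", number: str = ""):
        self.area_code = area_code
        self.number = number

    def telephone_number(self) -> str:
        return f"({self.area_code}) {self.number}"


class Person:
    def __init__(self, name: str):
        self.name = name
        self._office_telephone = TelephoneNumber()  # link to the new class

    # Person's accessors delegate to the extracted class.
    @property
    def office_area_code(self) -> str:
        return self._office_telephone.area_code

    @office_area_code.setter
    def office_area_code(self, value: str) -> None:
        self._office_telephone.area_code = value

    @property
    def office_number(self) -> str:
        return self._office_telephone.number

    @office_number.setter
    def office_number(self, value: str) -> None:
        self._office_telephone.number = value

    def telephone_number(self) -> str:
        return self._office_telephone.telephone_number()
```

Existing callers of `Person` are unaffected, since the accessors keep their names and simply forward to `TelephoneNumber`.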
---
### Example 3: Extract Subclass vs. Extract Class — the variation-dimensions decision
**Scenario:** `JobItem` has an `_isLabor` boolean that controls behavior in `getUnitPrice()`. Labor items use `employee.getRate()`; parts items use `_unitPrice`. The `_employee` field is null for parts items.
**Classification step:**
- Is the variation fixed at construction time? Yes — a job item is either a labor item or a parts item from the start.
- Is the variation single-dimensional? Yes — there is one axis (labor vs. parts).
- Conclusion: Extract Subclass.
**Apply Extract Subclass (creating LaborItem):**
1. Create `LaborItem extends JobItem`.
2. Push Down Method: `getEmployee()` moves to `LaborItem`.
3. Push Down Field: `_employee` moves to `LaborItem`.
4. Self Encapsulate Field `_isLabor`; replace its getter with a polymorphic constant: `JobItem.isLabor()` returns `false`, `LaborItem.isLabor()` returns `true`.
5. Apply Replace Conditional with Polymorphism on `getUnitPrice()`: `JobItem` returns `_unitPrice`; `LaborItem` returns `_employee.getRate()`.
6. Remove `_isLabor` field.
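The outcome of these six steps, sketched in Python. The constructor signatures and the `total_price` method are assumptions added so the example has something to exercise.

```python
class Employee:
    def __init__(self, rate: int):
        self._rate = rate

    def rate(self) -> int:
        return self._rate


class JobItem:
    def __init__(self, quantity: int, unit_price: int = 0):
        self._quantity = quantity
        self._unit_price = unit_price

    def is_labor(self) -> bool:
        # Polymorphic constant that replaces the removed _is_labor field.
        return False

    def unit_price(self) -> int:
        return self._unit_price

    def total_price(self) -> int:
        return self._quantity * self.unit_price()


class LaborItem(JobItem):
    def __init__(self, quantity: int, employee: Employee):
        super().__init__(quantity)
        self._employee = employee  # pushed down from JobItem

    def is_labor(self) -> bool:
        return True

    def unit_price(self) -> int:
        return self._employee.rate()
```

`JobItem` no longer carries a null `_employee` or a conditional in `unit_price`; each class answers for its own kind.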
**Contrast with delegation case:** If job items needed to vary along a second axis (e.g., taxable vs. non-taxable), subclassing could cover only one axis. The other axis would require a delegate: for example, `JobItem` keeps the `LaborItem` hierarchy for pricing and holds a `TaxStrategy` delegate for tax treatment. Alternatively, `JobItem` holds both a `TaxStrategy` and a `PricingStrategy` delegate, and the hierarchy handles neither axis.
---
### Example 4: Form Template Method
**Scenario:** `TextStatement.value()` and `HtmlStatement.value()` perform the same three steps (header, body loop, footer) but format them differently. The two methods are similar but not identical.
**Apply Form Template Method:**
1. Move both methods to subclasses of a new `Statement` superclass.
2. Extract Method on each differing step: `headerString()`, `eachRentalString()`, `footerString()` — extracted with identical signatures in both subclasses, different bodies.
3. The calling method `value()` becomes identical in both subclasses: it calls `headerString()`, loops calling `eachRentalString()`, and calls `footerString()`.
4. Pull Up Method on `value()` to `Statement`. Declare `headerString()`, `eachRentalString()`, `footerString()` as abstract on `Statement`.
5. Remove `value()` from both subclasses.
**Result:** Adding a `PdfStatement` requires only a new subclass that overrides three methods; the algorithm structure in `Statement` stays untouched.
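A compact Python sketch of the resulting structure. The parameters of `value()` (a customer name and a list of `(title, charge)` pairs) and the exact output strings are assumptions made to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class Statement(ABC):
    def value(self, customer, rentals):
        # Invariant algorithm: header, one line per rental, footer.
        result = self.header_string(customer)
        for title, charge in rentals:
            result += self.each_rental_string(title, charge)
        result += self.footer_string(sum(charge for _, charge in rentals))
        return result

    @abstractmethod
    def header_string(self, customer): ...

    @abstractmethod
    def each_rental_string(self, title, charge): ...

    @abstractmethod
    def footer_string(self, total): ...


class TextStatement(Statement):
    def header_string(self, customer):
        return f"Rental Record for {customer}\n"

    def each_rental_string(self, title, charge):
        return f"\t{title}\t{charge}\n"

    def footer_string(self, total):
        return f"Amount owed is {total}\n"


class HtmlStatement(Statement):
    def header_string(self, customer):
        return f"<h1>Rentals for <em>{customer}</em></h1>\n"

    def each_rental_string(self, title, charge):
        return f"<p>{title}: {charge}</p>\n"

    def footer_string(self, total):
        return f"<p>You owe <em>{total}</em></p>\n"
```

Neither subclass defines `value()`; both inherit the same sequence of calls and differ only in the three step methods.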
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/smell-catalog.md` | Full detection criteria for all 22 smells | Confirming the smell classification in Step 1 |
| `references/refactoring-prescriptions.md` | Full prescription trees per smell | Selecting the correct conditional branch for each refactoring |
**Hub skill relationships:**
- `code-smell-diagnosis` — diagnose before realigning; this skill executes what that skill prescribes
- `data-organization-refactoring` — when Data Clumps or Primitive Obsession are the primary smell (Extract Class for data, not behavior)
- `type-code-refactoring-selector` — when the hierarchy problem involves a type code (Replace Type Code with Subclasses / State / Strategy)
- `conditional-simplification-strategy` — when Replace Conditional with Polymorphism is needed after Extract Subclass
- `big-refactoring-planner` — when multiple smells indicate a systemic design problem requiring coordinated multi-step refactoring
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-data-organization-refactoring`
- `clawhub install bookforge-type-code-refactoring-selector`
- `clawhub install bookforge-conditional-simplification-strategy`
- `clawhub install bookforge-big-refactoring-planner`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build a sufficient automated test suite before refactoring existing code by applying a 6-step sequential construction workflow (test class → fixture → normal...
---
name: build-refactoring-test-suite
description: Build a sufficient automated test suite before refactoring existing code by applying a 6-step sequential construction workflow (test class → fixture → normal behavior → boundary conditions → expected errors → green-suite gate) and a bug-fix variant (write failing test first → reproduce → fix → verify green). Use this skill when you are about to refactor a class or module that lacks tests, when a bug report arrives and you need to pin it down before fixing it, when you want to establish the compile-and-test gate that makes every subsequent refactoring step safe to revert, or when you need to assess whether an existing test suite is adequate to protect a planned refactoring.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/build-refactoring-test-suite
metadata: {"openclaw":{"emoji":"🧪","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler"]
chapters: [4]
tags: [refactoring, testing, code-quality]
depends-on: []
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "A class, module, or file to be refactored, with or without existing tests."
tools-required: [Read, Write, Bash]
tools-optional: []
mcps-required: []
environment: "Working codebase with a test runner available. Output: a runnable test file that passes green, covering normal behavior, boundary conditions, and expected errors for the code under refactoring."
discovery:
goal: "Produce a fast, self-checking test suite that turns green before any refactoring step begins and re-runs in seconds after every atomic change."
tasks:
- "Create a test class/file for the code under refactoring"
- "Implement setup and teardown fixtures to isolate each test"
- "Write tests for all normal, expected-behavior paths"
- "Add boundary condition tests at edges and zero-value inputs"
- "Write tests that verify expected errors and exceptions are raised correctly"
- "Run the full suite and confirm green before proceeding to refactor"
audience: "developers, engineers, anyone refactoring existing production code without adequate test coverage"
when_to_use: "When a class or module needs to be refactored and does not yet have a test suite sufficient to detect regressions after each atomic change"
environment: "Existing codebase. The code under refactoring should be readable. A test runner (any language) must be available."
quality: placeholder
---
# Build Refactoring Test Suite
## When to Use
You are about to refactor code — extracting methods, moving fields, changing conditionals — and one of these is true:
- The code has no tests at all
- The existing tests are incomplete, untargeted, or only cover happy paths
- A bug has been reported and you need to pin it down before fixing and refactoring around it
- You are inheriting code from someone else and want a safety net before touching anything
This is the Level 0 foundation skill: every other refactoring mechanic in Fowler's catalog assumes this suite exists. Without it, you are refactoring blind. With it, every subsequent step is reversible — if a test turns red, you revert and try smaller steps.
**The core pattern:** build a self-checking test suite that runs in seconds, covers the code you are about to change, and can answer one question without human inspection: "Did I break anything?"
Before starting, confirm you have:
- Read access to the class or module being refactored
- A test runner installed for the language (pytest, JUnit, RSpec, Vitest, go test, etc.)
- The ability to run the test suite from the command line
---
## Context and Input Gathering
### Required Context
- **Target code:** The class, module, or file to be refactored. Read it fully before writing any tests.
- **Language and test framework:** Identify from the project structure (e.g., `pyproject.toml`, `package.json`, `pom.xml`, `go.mod`). Use the framework already in place — do not introduce a new one.
- **Existing tests:** Check for any test files that already cover the target. Run them first. If they all pass, extend rather than replace.
### Observable Context
Scan the target code for:
- **Public interface:** Every public method, function, or exported symbol is a testing target. Private internals are not — test through the public interface only.
- **Inputs and outputs:** What does each method take in and return? These define what to assert.
- **Error conditions:** What inputs should raise exceptions, return error codes, or produce empty results? These drive the error-path tests.
- **State mutations:** Does the class modify shared state? Fixtures must initialize and tear down that state per-test.
- **External dependencies:** Databases, files, network calls. These need either real test fixtures or test doubles (mocks/stubs). Prefer real fixtures for refactoring — mocks can hide regressions.
### Default Assumptions
- If no test framework exists → pick the language's idiomatic standard (pytest for Python, JUnit for Java, Vitest for TypeScript, etc.)
- If the code reads external files → create small, dedicated test data files in a `testdata/` or `fixtures/` directory
- If the code has database calls → prefer an in-memory or test-mode database over mocking; mocks test the mock, not the code
- If tests already exist and pass → run them first, then add missing coverage; do not re-implement passing tests
### Sufficiency Check
You are ready to start when:
1. You can read the target class/module completely
2. You know which test framework is in use
3. You know at least three things the code is supposed to do (its public contract)
If you cannot determine what the code is supposed to do (no comments, no documentation, unclear naming), read the calling code or integration tests first to reconstruct the intended behavior before writing unit tests.
---
## Process
### Step 1 — Create the Test Class/File
Create a dedicated test file for the code under refactoring. Place it where the project's test convention dictates (e.g., `tests/test_order.py`, `src/__tests__/Order.test.ts`, `OrderTest.java`).
**Why:** Each class under test needs its own test container. Mixing multiple classes into one test file makes isolation harder and failure messages harder to read. Using the project's existing naming convention ensures the test runner discovers the file automatically.
Minimal structure:
```
# Python
class TestOrder:
    pass
# Java
class OrderTest extends TestCase { }
# TypeScript
describe('Order', () => { })
# Go
func TestOrder(t *testing.T) { }
```
Run the empty test file immediately to confirm the runner finds and executes it without errors.
---
### Step 2 — Implement Setup and Teardown Fixtures
Before writing any test methods, define the shared state that every test will need. The test framework's `setUp`/`beforeEach`/`setup` hook runs before each test; `tearDown`/`afterEach`/`cleanup` runs after.
**Why:** Each test must be fully isolated — it must not depend on execution order, and it must not leave side effects that corrupt the next test. Setup creates a fresh environment; teardown cleans up resources (open files, database connections, temp files). Without this isolation, a failure in test 3 can cause test 4 to fail for unrelated reasons, making debugging misleading.
**Guidelines:**
- Initialize only what is shared across most tests in the fixture. Test-specific state belongs in the test method itself.
- If setup can fail (file not found, connection refused), let the error propagate — a setup failure is a hard stop, not a test failure.
- If teardown involves resource release (closing files, dropping test tables), do it unconditionally — use `finally` blocks or the framework's guaranteed cleanup mechanism.
```python
# Python example
class TestFileProcessor:
    def setup_method(self):
        self.input_file = open("testdata/sample.txt", "r")

    def teardown_method(self):
        self.input_file.close()
```
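A variant sketch that does not depend on a checked-in data file and shows unconditional cleanup with `finally`. It assumes pytest-style `setup_method`/`teardown_method` hooks; the file contents and the test name are made up for illustration.

```python
import os
import tempfile


class TestFileProcessor:
    def setup_method(self):
        # Create a fresh temp file per test so no test sees another's state.
        fd, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(fd, "w") as f:
            f.write("abcdefg")
        self.input_file = open(self.path, "r")

    def teardown_method(self):
        # Release resources unconditionally, even if closing fails.
        try:
            self.input_file.close()
        finally:
            os.remove(self.path)

    def test_first_character(self):
        assert self.input_file.read(1) == "a"
```

If `setup_method` itself raises, the error propagates and the test is reported as an error rather than a failure, which matches the hard-stop guideline above.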
---
### Step 3 — Write Tests for Normal / Expected Behavior
For each public method, test the central, intended behavior first — the happy path. Ask: "What is this method supposed to do when given valid, typical input?"
**Why:** Start with normal behavior so you confirm the code works correctly before probing its edges. If normal behavior tests fail, the code is broken before you even touch it — that is useful information and must be resolved before any refactoring begins.
**Rules:**
- One behavior per test method. Do not write omnibus tests that check five things in sequence — when one assertion fails, you cannot tell which behavior broke.
- Name tests descriptively: `test_read_returns_correct_character`, not `test1`. Descriptive names are the failure message.
- Assert the specific output, not just "no exception was raised." Confirm the actual value.
- Write the test, then verify it can fail: temporarily corrupt the assertion value (e.g., assert `'x' == result` instead of `'d' == result`). If it does not fail, the test is not testing what you think.
```python
def test_read_returns_correct_character(self):
    # advance past the first three characters
    for _ in range(3):
        self.input_file.read(1)
    ch = self.input_file.read(1)
    assert ch == 'd'  # fourth character in the test file
```
---
### Step 4 — Add Boundary Condition Tests
After normal behavior is covered, identify the boundaries where behavior could change or break. Boundary conditions are the most productive place to find bugs.
**Why:** Most bugs hide at the edges — the first item, the last item, the empty collection, the zero value, the maximum value. Fowler calls this "playing the part of an enemy to your own code" — actively trying to find the conditions under which the code will fail, rather than confirming it works for typical input.
**Common boundary categories:**
| Category | Examples |
|---|---|
| **Sequence edges** | First element, last element, element after the last |
| **Empty inputs** | Empty string, empty list, empty file, zero-length collection |
| **Zero / null values** | Zero quantity, null reference, None, empty optional |
| **Maximum / minimum values** | Integer overflow boundary, max string length, single-item list |
| **Repeated calls** | Reading past end-of-file twice, calling close twice |
For each boundary, write a separate test method. Add a descriptive message to assertions so that when a boundary test fails, the output tells you which boundary broke.
```python
def test_read_at_end_of_file_returns_empty_string(self):
    # consume all 141 characters
    for _ in range(141):
        self.input_file.read(1)
    result = self.input_file.read(1)
    # Python file objects signal end of file with "" rather than -1
    assert result == "", "read at end of file should return an empty string"

def test_read_from_empty_file_returns_empty_string(self):
    with open("testdata/empty.txt", "r") as empty:
        result = empty.read(1)
    assert result == "", "read from empty file should return an empty string"
```
---
### Step 5 — Write Tests for Expected Errors and Exceptions
Test that error conditions produce the correct error, not just that they do not crash silently. If the code's contract says "raises ValueError on negative input" or "raises IOError if the stream is closed," write a test that verifies exactly that.
**Why:** Errors are part of the public contract. Failing to raise the expected error — or raising the wrong one — is a bug. These tests also protect against future refactoring silently swallowing exceptions.
**Pattern:**
- Close the resource intentionally, then attempt an operation — expect the specific error.
- Use the framework's `pytest.raises`, `assertRaises`, or `expect { }.to raise_error` idiom.
- If the test body completes without the expected error, force an explicit failure: `fail("expected error was not raised")`.
```python
import pytest


def test_read_after_close_raises_value_error(self):
    self.input_file.close()
    # Python raises ValueError (not IOError) for I/O on a closed file
    with pytest.raises(ValueError):
        self.input_file.read(1)
    # if no ValueError is raised, pytest.raises fails the test automatically
```
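Where the framework has no raises-style helper, the explicit-failure pattern can be written by hand. This sketch uses `io.StringIO` as a stand-in stream (an assumption, chosen so the example needs no data file); its `read` raises `ValueError` once the stream is closed.

```python
import io


def test_read_after_close_raises_value_error():
    stream = io.StringIO("abc")
    stream.close()
    try:
        stream.read(1)
    except ValueError:
        return  # expected: reading a closed stream is rejected
    # Reached only if no error was raised, so fail explicitly.
    raise AssertionError("expected error was not raised")
```

Catching only the specific exception type matters: a different exception still propagates and fails the test, so raising the wrong error is detected too.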
---
### Step 6 — Run the Full Suite: Green Gate
Run the entire test suite. All tests must pass — green — before any refactoring step begins.
**Why:** This is the precondition that makes refactoring safe. If the suite is red before you start, you do not know whether a subsequent red result was caused by your change or by a pre-existing bug. You must start from a known-good baseline.
**What to do if tests are red before you start:**
1. Do not begin refactoring yet.
2. Determine whether the failure is a test bug (wrong assertion) or a production bug.
3. If it is a production bug, decide: fix it first, or document it as a known failure and exclude that test from the baseline. Do not silently ignore red tests.
4. Once all tests pass (or excluded failures are documented), the green gate is established.
**The compile-and-test gate (applies to every subsequent step):**
Once the suite is green and refactoring begins, apply this gate after every single atomic change — not after a batch of changes:
```
make one atomic change → compile/lint → run test suite
    green → continue to next change
    red → revert immediately, try a smaller step
```
"Atomic" means the smallest possible change that can be independently compiled and tested: extract one method, rename one variable, move one field. Never accumulate multiple changes before testing. Small steps keep the cost of reverting small.
If a language has a compiler, compile first — compilation errors caught before test execution are faster feedback than test failures.
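The gate can be automated with a few lines of Python. This is a sketch under stated assumptions: `suite_is_green` and `gate` are hypothetical helper names, and the `GREEN`/`RED` commands are stand-ins that only exit with a fixed status so the example runs without a real project.

```python
import subprocess
import sys


def suite_is_green(test_command) -> bool:
    """Run the test command; green means it exited with status 0."""
    completed = subprocess.run(test_command, capture_output=True, text=True)
    return completed.returncode == 0


def gate(test_command) -> str:
    """Decide what to do after one atomic change."""
    return "continue" if suite_is_green(test_command) else "revert"


# Stand-in commands so the sketch runs anywhere; a real project would pass
# its own runner, e.g. [sys.executable, "-m", "pytest", "-q"].
GREEN = [sys.executable, "-c", "raise SystemExit(0)"]
RED = [sys.executable, "-c", "raise SystemExit(1)"]
```

How to revert is left to the project's version control workflow; the point is that the decision is made by the exit status, not by reading output.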
---
## Bug-Fix Variant: Test-First Bug Reproduction
When fixing a bug rather than refactoring, use this variant workflow:
1. **Write a failing test that reproduces the bug.** Do this before touching production code. The test should fail because the bug exists.
2. **Confirm the test fails.** Run it. If it passes, the test is wrong — it is not actually testing the buggy behavior.
3. **Fix the production code** to make the test pass.
4. **Run the full suite.** All tests should be green. If new failures appeared, your fix introduced a regression.
**Why test first for bugs:** Writing the test first forces you to understand exactly what the bug is, not approximately what it is. It also prevents you from accidentally fixing a different problem and convincing yourself the bug is gone. And the test permanently guards against the same bug recurring.
**When a bug report arrives:**
- Start by writing a unit test that exposes the bug, not by opening the source file.
- If you need multiple tests to narrow the scope of the bug (to rule out related failures), write all of them before fixing anything.
- The unit tests become the regression suite for this bug forever.
---
## Test Adequacy Criteria
A test suite is sufficient for refactoring when it satisfies all four of these criteria:
| Criterion | What to Check |
|---|---|
| **Normal behavior covered** | Every public method has at least one test for its primary intended behavior |
| **Boundaries covered** | Each method has tests for: empty input, first/last element, value after the last, zero/null values |
| **Error paths covered** | Every documented error condition or exception has a test that verifies it is raised correctly |
| **Fast enough to run after every step** | The full suite completes in under 30 seconds. If it takes longer, it will not be run frequently enough |
**What you do not need:**
- 100% branch coverage — Fowler explicitly rejects coverage targets as the goal. The goal is testing where the risk is.
- Tests for simple accessors (getters/setters that do nothing but read/write a field) — too simple to fail.
- Tests for every combination in a class hierarchy — test each alternative independently; only test combinations where the alternatives interact in complex ways.
**Fowler's practical rule:** Test the areas you are most worried about going wrong. Concentrate effort where complexity is highest and where bugs would be hardest to find manually. It is better to run incomplete tests than to have no tests because a complete suite felt impossible to write.
---
## Key Principles
**1. Tests must be self-checking.**
Tests that print output to the console for a human to inspect are not self-checking. Every assertion must be evaluated by the framework automatically. The only acceptable output is a pass/fail signal — ideally a progress bar that turns red on failure.
**2. Tests must be fast.**
Slow tests do not get run. If the suite takes more than 30 seconds, developers will batch changes and run tests infrequently. Infrequent testing means bugs accumulate between runs, making them harder to isolate. For refactoring specifically, tests must be fast enough to run after every single atomic step.
**3. Each test must be isolated.**
A test must not depend on the results of any other test. Execution order must not matter. Use setup/teardown to ensure each test starts from an identical, known state.
**4. Verify that tests can fail.**
When you write a test, temporarily insert a wrong value into the assertion. If the test does not turn red, it is not exercising what you think. A test that cannot fail is not a test — it is false confidence.
**5. Incomplete tests beat no tests.**
The most common failure mode is paralysis: "I can't test everything perfectly, so I won't test anything." Write the tests for the risky areas first. Run them. An imperfect suite that runs frequently is vastly more valuable than a theoretically complete suite that never gets written.
**6. The compile-and-test gate is non-negotiable.**
Every atomic refactoring step ends with: compile + run suite. Red = revert. No exceptions. This is what makes refactoring safe to do in a production codebase.
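
The first three principles can be seen together in one small test class. This is only a sketch using Python's standard `unittest` module; `ShoppingCart` and its methods are hypothetical names invented for illustration, not code from the book.

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test; it stands in for your real code."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        if price < 0:
            raise ValueError("price must not be negative")
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


class ShoppingCartTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test starts from a fresh
        # cart and never depends on what another test left behind.
        self.cart = ShoppingCart()

    def test_total_of_two_items(self):
        self.cart.add("pen", 2)
        self.cart.add("book", 10)
        # Self-checking: the framework compares the values; nothing is printed.
        self.assertEqual(self.cart.total(), 12)

    def test_empty_cart_totals_zero(self):
        self.assertEqual(self.cart.total(), 0)

    def test_negative_price_is_rejected(self):
        with self.assertRaises(ValueError):
            self.cart.add("refund", -1)
```

To check principle 4, temporarily change `12` to another number and confirm the suite turns red before restoring it.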
---
## Examples
### Example 1: Adding tests before extracting methods from a billing class
**Situation:** You want to decompose a 200-line `calculate_invoice()` method into smaller methods but there are no tests.
**Setup fixture:** Create an `Invoice` object with known line items and tax rates.
**Normal behavior tests:** Assert that `calculate_invoice()` returns the correct total for a standard order.
**Boundary tests:** Empty order (zero line items), single item, order with a discount applied to zero-priced items.
**Error tests:** Negative quantity raises `ValueError`, unknown product code raises `KeyError`.
**Green gate:** All pass. Now decompose `calculate_invoice()` one extracted method at a time, running after each extraction.
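
A minimal sketch of what this fixture and these tests might look like. The `Invoice` class, its `add_line` method, the `PRICES` catalog, and the use of integer cents are all assumptions made so the example runs on its own; your real billing class will differ.

```python
import unittest


class Invoice:
    """Hypothetical stand-in for the billing class; amounts are in cents."""

    PRICES = {"widget": 500, "gadget": 1250}  # assumed product catalog

    def __init__(self, tax_percent):
        self.tax_percent = tax_percent
        self.lines = []

    def add_line(self, product_code, quantity):
        if quantity < 0:
            raise ValueError("quantity must not be negative")
        self.lines.append((product_code, quantity))

    def calculate_invoice(self):
        # Imagine this is the 200-line method; the tests only pin its result.
        subtotal = sum(self.PRICES[code] * qty for code, qty in self.lines)
        return subtotal + subtotal * self.tax_percent // 100


class CalculateInvoiceTest(unittest.TestCase):
    def setUp(self):
        self.invoice = Invoice(tax_percent=10)

    def test_standard_order(self):  # normal behavior
        self.invoice.add_line("widget", 2)
        self.invoice.add_line("gadget", 1)
        self.assertEqual(self.invoice.calculate_invoice(), 2475)

    def test_empty_order(self):  # boundary
        self.assertEqual(self.invoice.calculate_invoice(), 0)

    def test_negative_quantity(self):  # error path
        with self.assertRaises(ValueError):
            self.invoice.add_line("widget", -1)

    def test_unknown_product_code(self):  # error path
        self.invoice.add_line("no-such-product", 1)
        with self.assertRaises(KeyError):
            self.invoice.calculate_invoice()
```

Because the tests call only `calculate_invoice()`, they keep passing unchanged while the method body is split into smaller private methods.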
---
### Example 2: Bug-fix variant for a reported pricing error
**Situation:** A bug report says orders over $1,000 are applying the discount twice.
**Step 1 — Write failing test:**
```python
def test_discount_applied_once_for_large_order(self):
    # 10 x $150 = $1,500, which is over the $1,000 threshold
    order = Order(items=[Item("product-A", quantity=10, unit_price=150)])
    assert order.total_price() == 1350.00  # 10% discount applied once
```
**Step 2 — Run it.** It fails (returns 1215.00 — discount applied twice). Good. The test reproduces the bug.
**Step 3 — Fix the discount logic.**
**Step 4 — Run full suite.** All green including the new test. Bug is fixed and regression-protected.
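
For readers who want to run the test from Step 1, here is one possible shape of the fixed classes. Only the names `Order`, `Item`, and `total_price` come from the example; the constants, the `subtotal` helper, and the rule that the discount applies strictly above the threshold are assumptions.

```python
from dataclasses import dataclass

DISCOUNT_THRESHOLD = 1000  # assumed: discount applies to orders over $1,000
DISCOUNT_PERCENT = 10      # assumed from the 10% in the test comment


@dataclass
class Item:
    product_code: str
    quantity: int
    unit_price: int


class Order:
    def __init__(self, items):
        self.items = items

    def subtotal(self):
        return sum(item.quantity * item.unit_price for item in self.items)

    def total_price(self):
        subtotal = self.subtotal()
        if subtotal > DISCOUNT_THRESHOLD:
            # The discount is applied in exactly one place. The reported bug
            # would correspond to a second reduction somewhere else.
            return subtotal * (100 - DISCOUNT_PERCENT) / 100
        return subtotal
```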
---
### Example 3: Assessing an existing test suite before a large refactoring
**Situation:** A module has 12 tests. You want to refactor its data model.
**Audit checklist:**
- [ ] Does every public method have at least one test? — Check: 3 public methods, 12 tests → appears covered
- [ ] Are boundaries tested? — Check: no test for empty input, no test for maximum collection size → gap found
- [ ] Are error paths tested? — Check: no test for invalid state transition → gap found
- [ ] Does the suite run in under 30 seconds? — Check: 4.2 seconds → acceptable
**Action:** Add boundary and error path tests. Run. Green. Now proceed with the refactoring.
---
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler.
## Related BookForge Skills
- `refactoring-readiness-assessment` — Assess whether code is ready to refactor
- `code-smell-diagnosis` — Identify which smells to address first
- `method-decomposition-refactoring` — Apply once this test suite is green
Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Plan and execute architectural-scale refactoring campaigns that take weeks to months — the four named patterns for large-scale structural restructuring from...
---
name: big-refactoring-planner
description: |
Plan and execute architectural-scale refactoring campaigns that take weeks to months — the four named patterns for large-scale structural restructuring from Fowler and Beck's Chapter 12. Use when: an inheritance hierarchy is doing two distinct jobs and subclass names share the same adjective prefix at every level (Tease Apart Inheritance); a codebase written in an object-oriented language uses a procedural style with long methods on behavior-less classes and dumb data objects (Convert Procedural Design to Objects); GUI or window classes contain SQL queries, business rules, or pricing logic instead of just display code (Separate Domain from Presentation); a single class has accumulated so many conditional statements that every new case requires editing the same class in multiple places (Extract Hierarchy). Applies when code-smell-diagnosis has surfaced Parallel Inheritance Hierarchies, Data Class, or Large Class with deep conditional branching and the fix is too large for a single refactoring session. Distinguishes between the four patterns by structural signal, selects the correct pattern and variant, and produces a multi-week campaign plan with interleaved feature development milestones.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/refactoring/skills/big-refactoring-planner
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on:
- code-smell-diagnosis
source-books:
- id: refactoring
title: "Refactoring: Improving the Design of Existing Code"
authors: ["Martin Fowler", "Kent Beck"]
chapters: [12]
tags: [refactoring, code-quality, legacy-code, architecture]
execution:
tier: 2
mode: hybrid
inputs:
- type: codebase
description: "The target codebase or subsystem with the architectural-scale structural problem"
- type: document
description: "Code-smell-diagnosis report identifying which big refactoring pattern applies, if already run"
tools-required: [Read, Grep, Write]
tools-optional: [Bash]
mcps-required: []
environment: "Run inside a project directory. Read source files to identify the structural pattern; write the campaign plan as output."
discovery:
goal: "Identify which of the four big refactoring patterns applies; select the correct execution variant; produce a campaign plan with milestones, decision points, and interleaving strategy for working alongside ongoing feature development"
tasks:
- "Read the codebase to confirm the structural pattern from the signal list"
- "Select the correct pattern and variant based on the structural signals"
- "Identify the key decision points within the selected pattern"
- "Produce a phased campaign plan that can be executed alongside feature work"
- "Establish the stopping condition: when is the refactoring done enough?"
audience:
roles: ["software-developer", "senior-developer", "tech-lead", "architect"]
experience: "senior — assumes comfort with object-oriented design and familiarity with individual refactoring moves"
triggers:
- "An inheritance hierarchy has the same adjective prefix appearing at every level of subclasses"
- "Procedural code in an object-oriented language: long methods on classes with no data, dumb data objects with no behavior"
- "Window or GUI classes contain SQL statements, pricing logic, or business rules"
- "A single class requires editing in five different places every time a new case is added"
- "code-smell-diagnosis found Parallel Inheritance Hierarchies, Data Class, or a god class with pervasive conditionals"
- "A team cannot add a new feature type without touching the same class in multiple unrelated ways"
not_for:
- "Individual refactoring moves that can be completed in a single session — use method-decomposition-refactoring or conditional-simplification-strategy instead"
- "Performance optimization — use profiling-driven-performance-optimization"
- "Diagnosing which smells are present — run code-smell-diagnosis first"
- "Test coverage setup before starting refactoring — use build-refactoring-test-suite first"
---
# Big Refactoring Planner
## When to Use
You have a structural design problem that cannot be fixed in a single refactoring session. The individual moves — Extract Method, Move Method, Extract Class — are not in question. The challenge is that the problem is architectural: dozens of classes are tangled, or a hierarchy is doing multiple jobs, or procedural logic is spread across a codebase that nominally uses objects. The fix takes weeks to months, not minutes.
This skill is for those campaigns.
**Fowler and Beck's core principle:** "You refactor not because it is fun but because there are things you expect to be able to do with your programs if you refactor that you just can't do if you don't."
Big refactorings are done for a purpose — specifically because a particular kind of change that the team needs to make is blocked or costly without the restructuring. You are not refactoring for cleanliness. You are refactoring because the architecture is standing in the way of features you need to build.
**This skill applies to four named patterns:**
| Pattern | Core problem | Time scale |
|---------|-------------|------------|
| Tease Apart Inheritance | One hierarchy doing two independent jobs | Weeks |
| Convert Procedural Design to Objects | OO language used in procedural style | Weeks to months |
| Separate Domain from Presentation | Business logic embedded in GUI classes | Weeks |
| Extract Hierarchy | God class with accumulated conditionals | Weeks to months |
**Prerequisites:** Run `build-refactoring-test-suite` before starting. Big refactorings require a test suite that can catch regressions — the campaign moves incrementally and tests must confirm each step is safe.
---
## Context and Input Gathering
### Required Input (ask if missing)
- **The structural problem.** Either a code-smell-diagnosis report, a description of where the pain is ("every time we add a new deal type, we have to subclass in two hierarchies"), or a specific class or directory to examine. Why: the four patterns apply to distinct structural situations. Reading the code to confirm the signal is mandatory — big refactorings cannot be planned without inspecting the structure.
- **The purpose of the refactoring.** What feature work is blocked? What change is currently too costly? Why: Fowler and Beck are explicit that big refactorings must be done toward a purpose. A campaign without a purpose will be abandoned when pressure mounts. The purpose also determines when to stop — when the blocked change becomes easy, the refactoring has succeeded.
### Observable Context (read before asking)
Scan the codebase to identify which structural pattern is present:
```
Tease Apart Inheritance signals:
- Subclasses with identical prefix adjectives at every level
(TabularActiveDeal, TabularPassiveDeal — "Tabular" appears in both branches)
- Adding a new variation in one dimension requires adding subclasses in every branch
- Hierarchy depth exceeds 3 levels with cross-cutting concerns
Convert Procedural Design to Objects signals:
- Classes with only static methods or methods that take data objects as parameters
- Data classes (fields + accessors, no behavior) that are passed into procedure-style classes
- Long methods (50+ lines) on classes with few or no instance variables
- A single "calculator" or "processor" class that takes many different data objects
Separate Domain from Presentation signals:
- SQL statements inside window, panel, dialog, or view classes
- Window/form classes over 300 lines with business logic in event handlers
- Pricing, discount, or calculation logic in GUI event handlers
- java.sql imports in UI classes; database calls triggered directly by UI events
Extract Hierarchy signals:
- One class with 10+ boolean flags or type code fields
- Methods with large switch/case or if-elif chains that check the same flag
- Adding a new "type" or "mode" requires editing the same class in 5+ methods
- The class's behavior changes entirely based on a flag set at construction time
```
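
One of these signals, `java.sql` imports in UI classes, is simple enough to check mechanically. The sketch below is a rough heuristic, not part of the skill's required tooling: the function name, the file-name suffixes treated as "UI", and the dictionary input format are all assumptions.

```python
import re

# Assumed naming convention: UI classes end in one of these suffixes.
UI_NAME = re.compile(r"(Window|Panel|Dialog|View|Form)\.java$")
SQL_IMPORT = re.compile(r"^\s*import\s+java\.sql\.", re.MULTILINE)


def ui_files_with_sql(sources):
    """Return the names of UI-looking files whose text imports java.sql.

    `sources` maps a file name to its source text.
    """
    return sorted(
        name
        for name, text in sources.items()
        if UI_NAME.search(name) and SQL_IMPORT.search(text)
    )
```

A hit is only a prompt to read the class; confirming the pattern still requires inspecting the code in Step 1.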
---
## Process
### Step 1: Confirm the Pattern
**ACTION:** Read the target code, identify which structural pattern applies, and confirm the diagnosis.
**WHY:** The four patterns require different remedies. Applying Extract Hierarchy mechanics to a tangled inheritance hierarchy will make it worse. Applying Tease Apart Inheritance to a god class with conditionals will create a hierarchy where subclassing is not the right solution. Correct pattern identification determines whether the campaign succeeds.
Work through the pattern-confirmation checklist:
**Tease Apart Inheritance — confirm by answering:**
- Do classes at the same level of the hierarchy share the same adjective prefix? (e.g., `Tabular` appearing in both ActiveDeal and PassiveDeal branches)
- Is the hierarchy growing combinatorially? (adding 1 new deal type requires adding 2 new subclasses because presentation styles cross with deal types)
- Can you draw a two-dimensional grid where one axis is one job and the other axis is the other job?
If yes to all three: this is Tease Apart Inheritance.
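
The grid question can be answered on paper, but a few lines of code make the missing cells explicit. This is an illustrative sketch: `build_grid` and `missing_cells` are hypothetical helpers, and it assumes class names are formed as a style prefix followed by the deal type, with an empty prefix for the default style.

```python
def build_grid(class_names, styles, deal_types):
    """Map each (style, deal type) cell to the class filling it, or None."""
    grid = {}
    for style in styles:
        for deal in deal_types:
            candidate = f"{style}{deal}"
            grid[(style, deal)] = candidate if candidate in class_names else None
    return grid


def missing_cells(grid):
    """List the cells that no existing class covers."""
    return [cell for cell, name in grid.items() if name is None]


existing = {"ActiveDeal", "PassiveDeal", "TabularActiveDeal", "TabularPassiveDeal"}
grid = build_grid(existing, ["", "Tabular", "Chart"], ["ActiveDeal", "PassiveDeal"])
print(missing_cells(grid))
```

Adding the hypothetical `Chart` style leaves one empty cell per deal type, which is the combinatorial growth the second question asks about.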
**Convert Procedural Design to Objects — confirm by answering:**
- Are there classes whose methods all take data objects as their primary parameter (e.g., `calculatePrice(Order order)` on `OrderCalculator`)?
- Are the data objects pure data holders with accessors but no behavior?
- If you removed the "calculator" class, would its behavior need to live somewhere on the data objects?
If yes: this is Convert Procedural Design to Objects.
**Separate Domain from Presentation — confirm by answering:**
- Do GUI or window classes contain SQL, database calls, or business computation?
- Is business logic triggered directly by UI events?
- If you wanted to change the pricing algorithm, would you have to open a window class?
If yes: this is Separate Domain from Presentation.
**Extract Hierarchy — confirm by answering:**
- Does one class control its behavior almost entirely through flags or type codes checked in conditionals?
- Are the conditionals static during the object's lifetime — does the type or mode get set at construction and not change?
- Does adding a new type require editing conditional logic in five or more methods of the same class?
If yes: this is Extract Hierarchy. Then determine the variant:
- **Variant A (unclear variations):** You are not sure what all the subclasses should be — discover them one at a time.
- **Variant B (clear variations):** The full set of variations is already known upfront — create all subclasses at once.
---
### Step 2: Select Execution Strategy
**ACTION:** Based on the confirmed pattern, select the execution strategy and identify the key decision points.
**WHY:** Each pattern has a fixed sequence of moves, but the sequence has decision points that cannot be fully specified in advance — the specific code you encounter will determine which moves apply. Knowing the decision points before starting prevents mid-campaign confusion and ensures the team pushes through when the refactoring gets messy.
---
#### Strategy: Tease Apart Inheritance
**Problem:** One hierarchy is doing two jobs. (Example: `Deal` hierarchy also captures presentation style, creating `TabularActiveDeal`, `TabularPassiveDeal`.)
**Goal:** Two clean, focused hierarchies connected by delegation.
**Sequence:**
1. **Draw the two-dimensional grid.** Label the axes with the two jobs the hierarchy is doing. Every current subclass should map to a cell in the grid. Cells that are missing reveal gaps in the current implementation. Why: the grid makes the tangling visible and determines how many new classes will be needed in the extracted hierarchy.
2. **Decide which job stays in the original hierarchy.** The rule of thumb: leave the job with more code in place, so that less code has to move. Extract the job with less code into the new hierarchy. Why: moving less code reduces the risk of bugs during extraction. The job that stays will be simplified once the other job is extracted.
3. **Apply Extract Class at the common superclass** to create a new class representing the subsidiary job. Add an instance variable to the original superclass to hold a reference to the new object. Why: the new class becomes the root of the extracted hierarchy; the instance variable is the delegation link between the two hierarchies.
4. **Create subclasses of the extracted class** — one for each variation of the subsidiary job (e.g., `TabularPresentationStyle`, `SinglePresentationStyle`). Initialize the instance variable in each original subclass to the appropriate subclass of the extracted class. Why: until the subclasses are created, the extracted class is just an empty shell with no behavior — this step populates the second hierarchy.
5. **Move behavior into the extracted hierarchy.** Apply Move Method (and Move Field where needed) in each original subclass to transfer presentation-related behavior to the corresponding extracted subclass. Why: behavior must follow data — the extracted hierarchy is not useful until it carries the code that belongs to it.
6. **Eliminate empty original subclasses.** When a subclass has no more code of its own, delete it. Continue until all subclasses are gone from the dimension being extracted. Why: the original hierarchy should now contain only the classes representing the primary job; the second hierarchy handles the subsidiary job entirely through delegation.
7. **Look for further simplification.** After the hierarchies are separated, each can often be simplified further with Pull Up Method or Pull Up Field — logic that was previously tangled may now be clearly common to all subclasses. Why: separation frequently reveals that what appeared to be variation was actually duplication.
**Key decision point:** Which job stays? Lean toward leaving the job with more code in place. If the code is evenly split, choose the job that is semantically primary — the job that gives the class its name.
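
To make the target concrete, here is a sketch of the end state for the `Deal` example, written in Python for brevity. The class names follow the example above; the `describe`, `render`, and `present` methods and their output strings are invented for illustration.

```python
class PresentationStyle:
    """Root of the extracted hierarchy (the subsidiary job)."""

    def render(self, deal):
        raise NotImplementedError


class SinglePresentationStyle(PresentationStyle):
    def render(self, deal):
        return deal.describe()


class TabularPresentationStyle(PresentationStyle):
    def render(self, deal):
        return f"| {deal.describe()} |"


class Deal:
    """Root of the original hierarchy (the primary job)."""

    def __init__(self, name, style):
        self.name = name
        self.style = style  # delegation link added in step 3

    def describe(self):
        raise NotImplementedError

    def present(self):
        # Presentation is delegated rather than inherited.
        return self.style.render(self)


class ActiveDeal(Deal):
    def describe(self):
        return f"active: {self.name}"


class PassiveDeal(Deal):
    def describe(self):
        return f"passive: {self.name}"
```

With this shape, a new presentation style is one new `PresentationStyle` subclass and a new deal type is one new `Deal` subclass; neither forces a change in the other hierarchy.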
---
#### Strategy: Convert Procedural Design to Objects
**Problem:** An object-oriented language is being used in a procedural style. Behavior is concentrated in procedure classes; data is concentrated in data-holder classes with no behavior.
**Goal:** Behavior distributed into the data objects where it belongs.
**Sequence:**
1. **Turn each record type into a dumb data object with accessors.** If you have a relational database, create a class for each table with accessor methods. Why: the data classes are the correct eventual home for the behavior that currently lives in procedure classes. They must exist before behavior can move into them.
2. **Concentrate all procedural code into a single class.** If it is scattered across multiple procedure classes, consolidate. Make the class a singleton or make its methods static to allow easy invocation during the transition. Why: having all procedural code in one place makes it easier to track progress and apply Extract Method systematically. This is a temporary state — the goal is to empty this class.
3. **Apply Extract Method to each long procedure** to break it into smaller named operations. Immediately follow with Move Method to move each extracted operation to the data class whose data it primarily uses. Why: Extract Method creates handles on behavior units; Move Method is the act of distributing those units to their natural home. The procedure class shrinks with each Move Method applied.
4. **Continue until the procedure class is empty.** If the original class was a purely procedural class, delete it — this is a sign the refactoring is complete. Why: an empty procedure class is the evidence that behavior has been fully distributed. If the class cannot be deleted, some behavior genuinely has no home and a new class may need to be created.
**Key decision point:** What to do when a method uses data from several record types. Move the method to the record type whose data it uses most, then extract the portions that use other records into separate methods and move those to their own records.
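
A small sketch of a mid-campaign state, assuming hypothetical `Order` and `OrderLine` records and an `OrderCalculator` procedure class. The pricing logic has already been extracted and moved; the procedure class only forwards, which is the sign it is ready to be emptied and deleted.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    unit_price: int
    quantity: int

    def price(self):
        # Moved here because it uses only OrderLine data.
        return self.unit_price * self.quantity


@dataclass
class Order:
    lines: list = field(default_factory=list)

    def price(self):
        # Moved here because it mostly uses Order data; the per-line part
        # was extracted and moved to OrderLine.
        return sum(line.price() for line in self.lines)


class OrderCalculator:
    """Procedure class mid-campaign: it now only forwards to the data object."""

    @staticmethod
    def calculate_price(order):
        return order.price()
```

Once every caller uses `order.price()` directly, `OrderCalculator.calculate_price` has no remaining purpose and can be removed.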
---
#### Strategy: Separate Domain from Presentation
**Problem:** GUI classes contain domain logic. Business rules, SQL queries, and calculations are embedded in window or event handler classes.
**Goal:** Clean separation where GUI classes handle only presentation and domain classes carry all business logic.
**Sequence:**
1. **Create a domain class for each window.** If a window displays grid data, also create a class for the rows in the grid. Link the domain class to the window class via a reference field. Why: the domain class is the eventual home for all business logic currently in the window. It must exist before any logic can be moved.
2. **Examine each field and data element in the window.** Classify each one into one of three categories:
- *Pure GUI only (not used in domain logic):* Leave it on the window.
- *Domain data not displayed in the GUI:* Move it directly to the domain class using Move Field. Why: non-displayed domain data has no presentation dependency and can be moved immediately with low risk.
- *Domain data that is also displayed in the GUI:* Use Duplicate Observed Data — create a corresponding field on the domain class with synchronization logic. Why: these fields cannot be moved directly because the GUI needs them; duplication with sync is the safe intermediate step before the GUI can be updated to reference the domain object's field instead.
3. **Examine the logic in the window class.** Classify each logical operation:
- *Pure presentation logic:* Leave it in the window.
- *Domain logic:* Apply Extract Method to isolate it, then Move Method to transfer it to the domain class.
- *Mixed presentation and domain logic:* Apply Extract Method to separate the domain portion, then Move Method for the domain portion; leave the presentation portion in the window.
4. **Target SQL statements specifically.** Drive SQL statements and database logic toward the domain class. Moving SQL imports away from the window class is a useful completion signal — when the window class no longer imports `java.sql` (or equivalent), the separation is largely complete. Why: SQL in a window class is the clearest signal of domain logic in the wrong place; its removal confirms the domain class is carrying its responsibilities.
5. **Stop when risk is addressed.** The resulting domain classes will not be perfectly factored — they will hold the right logic but may need further refactoring (Data Clumps, Feature Envy). Stop the Separate Domain from Presentation campaign once the primary risk is eliminated. If the risk was the mixing of presentation and domain logic, that risk is gone when the separation is clean. Further refactoring of the domain classes is a separate, smaller campaign. Why: trying to fully factor the domain classes simultaneously with the separation campaign overextends the scope and increases the risk of abandonment.
**Key decision point:** What to do with data that is both displayed and used in domain logic. Always use Duplicate Observed Data as the intermediate step. Direct Move Field will break the GUI. Accept the duplication temporarily — it is easier to eliminate once the logic is clearly in the domain class.
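
The sketch below shows the general shape of Duplicate Observed Data under simplifying assumptions: plain strings stand in for real text widgets, and the `Order` and `OrderWindow` classes, the `subscribe` callback mechanism, and all method names are hypothetical.

```python
class Order:
    """Domain class: owns the quantity and the pricing rule."""

    def __init__(self, unit_price):
        self.unit_price = unit_price
        self._quantity = 0
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):
        self._quantity = value
        for notify in self._observers:
            notify(self)  # keep every display copy in sync

    def total(self):
        return self.unit_price * self._quantity


class OrderWindow:
    """Presentation class: display copies plus event routing only."""

    def __init__(self, order):
        self.order = order
        self.quantity_text = ""  # duplicated, observed copy of order.quantity
        self.total_text = ""
        order.subscribe(self.refresh)
        self.refresh(order)

    def on_quantity_edited(self, text):
        # Event handler: parse the input and hand it to the domain object.
        self.order.quantity = int(text)

    def refresh(self, order):
        self.quantity_text = str(order.quantity)
        self.total_text = str(order.total())
```

The window still holds a copy of the quantity, but the authoritative value and the calculation now live on the domain object.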
---
#### Strategy: Extract Hierarchy
**Problem:** A single class is doing too much, controlled by flags or type codes, with behavior that varies entirely based on those flags.
**Prerequisite check:** The conditional logic must be static during the object's lifetime. If the flags can change after construction, apply Extract Class first to create a separate object for the varying aspect before applying Extract Hierarchy. Why: Extract Hierarchy uses subclasses to represent variations. Subclass membership cannot change after instantiation — if a flag changes at runtime, the variation cannot be a subclass.
**Variant A — Unclear variations (discover one at a time):**
1. **Identify one recurring variation in the conditional logic.** Look for a flag or condition that recurs across multiple methods of the class. Why: starting with one variation keeps the scope manageable and prevents the campaign from becoming paralyzing.
2. **Create a subclass for that variation.** Apply Replace Constructor with Factory Method on the original class so that clients receive a subclass instance when the variation applies. Why: the factory method is necessary because constructors cannot return subclass instances; it also gives the campaign a controlled entry point for the new subclass.
3. **Copy conditional methods to the subclass.** For each method in the original class that has conditional logic based on the variation, copy the method to the subclass and simplify it — remove the branches that cannot apply to this subclass. Use Extract Method in the superclass if needed to isolate the conditional from the unconditional parts. Why: copying rather than moving allows the original class to continue working for all other variations while the subclass is being built.
4. **Continue isolating variations.** Pick the next variation, create the next subclass, repeat. Continue until the original class's conditional methods can be declared abstract — all variations are now handled by subclasses. Why: each new subclass removes one variation from the superclass; when the superclass has no remaining conditional logic for a method, that method becomes abstract.
5. **Delete superclass method bodies when all subclasses override them.** Make the superclass declarations abstract. Why: an abstract superclass with no concrete implementations is the evidence that the hierarchy is complete.
**Variant B — Clear variations (create all subclasses at once):**
1. **Create a subclass for each known variation.** Apply Replace Constructor with Factory Method to return the appropriate subclass for each variation.
2. **For each method with conditional logic, apply Replace Conditional with Polymorphism.** If the whole method varies, move it in full to each subclass. If only part of the method varies, apply Extract Method first to isolate the varying part, then move the extracted method. Why: Replace Conditional with Polymorphism is the terminal move — it eliminates the conditional by making each subclass responsible for its own case.
**Key decision point:** Which variant to use. Use Variant A when you are unsure how many variations exist or the existing conditional logic is inconsistent. Use Variant B when the full set of variations is well-understood and each variation is consistently represented in the conditional logic.
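
As an illustration of where either variant ends up, here is a sketch using a `BillingScheme` class. The subclass names, the `create` factory, the `kind` strings, and the rates are all invented for the example.

```python
from abc import ABC, abstractmethod


class BillingScheme(ABC):
    @staticmethod
    def create(kind):
        """Factory method: clients ask for a kind and receive a subclass."""
        schemes = {
            "business": BusinessBillingScheme,
            "residential": ResidentialBillingScheme,
        }
        return schemes[kind]()

    @abstractmethod
    def rate_per_unit(self):
        """Formerly an if/elif chain on a type code; now one method per subclass."""

    def charge(self, units):
        # Unconditional logic stays in the superclass.
        return units * self.rate_per_unit()


class BusinessBillingScheme(BillingScheme):
    def rate_per_unit(self):
        return 12


class ResidentialBillingScheme(BillingScheme):
    def rate_per_unit(self):
        return 8
```

In this sketch a new kind still needs one registration line in the factory; the conditional logic in the methods themselves is gone.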
---
### Step 3: Build the Campaign Plan
**ACTION:** Produce a phased campaign plan the team can execute over weeks or months.
**WHY:** Big refactorings that are not planned as campaigns get abandoned. The team hits a messy intermediate state, feature pressure mounts, and the refactoring is frozen half-done — a state that is worse than the starting point. A campaign plan with explicit milestones and interleaving rules allows the team to make progress every day without disrupting feature delivery.
**Campaign plan structure:**
```markdown
# [Pattern Name] Campaign Plan — [Target Class/System]
## Purpose
[The specific feature or change that is currently blocked. Why this refactoring now.]
## Pattern
[Selected pattern and variant, with one-sentence rationale for the selection]
## Team Alignment Required
Big refactorings require shared awareness. Every developer working in the affected
area must know:
1. This refactoring is "in play"
2. Which classes are being restructured
3. How to continue the refactoring when adding new code (new code goes into the
new structure, not the old one)
[List the classes and files "in play" for this campaign]
## Milestones
### Milestone 1: [Name] — [Estimated: N days]
Goal: [What structural state the code is in at the end of this milestone]
Steps:
1. [Specific move to apply]
2. [Specific move to apply]
...
Done when: [Specific, observable condition — e.g., "TabularPresentationStyle class exists
and holds all tabular layout methods"]
### Milestone 2: [Name] — [Estimated: N days]
[...]
## Interleaving Rules
- Refactoring steps happen during feature work, not in a dedicated freeze.
- Each milestone step is small enough to complete between feature commits.
- New features that touch the affected area use the new structure, not the old.
- Do not skip milestones to reach the end state — intermediate states must compile
and pass all tests.
## Decision Points
[List the key decisions that cannot be made until the code is read at that milestone:
e.g., "At Milestone 2: decide which job stays in the original hierarchy based on
line count of each dimension"]
## Stopping Condition
[The specific observable state that means the refactoring is done enough:
e.g., "When OrderWindow no longer imports java.sql"]
[The specific feature or change that was blocked must now be easy to make]
```
---
### Step 4: Identify Interleaving Strategy
**ACTION:** Establish the rules for how the campaign runs alongside feature development.
**WHY:** As Fowler and Beck put it, "You refactor not because it is fun but because there are things you expect to be able to do with your programs if you refactor that you just can't do if you don't." Big refactorings must coexist with feature delivery. The alternative, a two-month code freeze to refactor, is not available in production systems. The nibble-at-the-edges approach is the only viable one.
**Interleaving rules (apply to every big refactoring campaign):**
1. **Refactor to a purpose, not for cleanliness.** The campaign starts because a specific feature or change is blocked. Every step of the campaign should move toward making that feature easier. If a step does not contribute to that goal, question whether it is necessary.
2. **Nibble at the edges.** Each time a developer touches an affected file for feature work, they apply the next step of the refactoring. The refactoring rides along with feature development — it does not block it. A little today, a little tomorrow.
3. **Do as much as needed to achieve the real task.** You do not have to complete the entire campaign before shipping the feature. You have to complete enough of the campaign that the feature becomes easy to add. Stop when the purpose is achieved.
4. **Never leave tests failing overnight.** Each step of the campaign must leave the codebase in a state where all tests pass. An intermediate state with failing tests is a blocker for the whole team. The campaign must be executed in safe steps, each verifiable by the test suite.
5. **New code follows the new structure.** When adding new features in the affected area during the campaign, use the new structure (the extracted class, the new hierarchy, the domain class) — not the old one. This prevents the old structure from growing while the campaign is in progress.
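
Rule 4 amounts to a simple gate around every step. The sketch below expresses it with plain callables; `apply_step` and `run_campaign` are hypothetical helpers, and in a real project `run_tests` would invoke your test runner and `revert` would restore the last committed state.

```python
def apply_step(step, run_tests, revert):
    """Apply one refactoring step; keep it only if the suite stays green."""
    step()
    if run_tests():
        return True
    revert()
    return False


def run_campaign(steps, run_tests, revert):
    """Apply steps in order, stopping at the first one that turns the suite red.

    Returns the number of steps that were kept.
    """
    kept = 0
    for step in steps:
        if not apply_step(step, run_tests, revert):
            break
        kept += 1
    return kept
```

Stopping at the first red step is a deliberate choice: the failed step needs to be rethought before anything later is built on top of it.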
---
### Step 5: Define the Stopping Condition
**ACTION:** Establish when the campaign is done and communicate it to the team.
**WHY:** Big refactorings without a stopping condition run forever or collapse under pressure. The stopping condition is the purpose stated in terms of observable code state. It is the answer to "how will we know when we are done?"
**Stopping condition by pattern:**
- **Tease Apart Inheritance:** Done when the two hierarchies are fully separated, each class in the original hierarchy has a single job, and adding a new variation in one dimension does not require adding subclasses in the other.
- **Convert Procedural Design to Objects:** Done when the procedure class is empty or deleted, and all behavior lives in the data objects. (Optional secondary: when the data objects no longer need to be passed as parameters because they know how to perform their own operations.)
- **Separate Domain from Presentation:** Done when the window class contains no SQL, no database calls, and no business computation — only display logic and event routing. The domain class handles all business logic. (Observable signal: the window class no longer imports database libraries.)
- **Extract Hierarchy:** Done when the original god class is abstract, each variation is a subclass, and adding a new variation requires adding only one new subclass with its own implementations — not editing any existing class.
---
## Key Principles
**1. Big refactorings are purposeful, not cosmetic.**
The motivation for a big refactoring is always a specific blocked capability. "The code is messy" is not sufficient motivation: a campaign that runs for weeks without a concrete payoff will be abandoned. "We cannot add a new billing scheme without editing conditional logic in twelve methods of BillingScheme" is a sufficient motivation. The purpose drives the campaign and defines when it is complete.
**2. Steps cannot be fully specified in advance.**
Unlike the individual refactoring moves in Chapters 6-11, the steps of a big refactoring emerge as you do it. The pattern provides the strategy; the code provides the specifics. The plan must include decision points — places where you explicitly pause to assess the code and decide the next move based on what you find.
**3. Team agreement is mandatory.**
A big refactoring affects every developer working in the area. If one developer is teasing apart a hierarchy while another is subclassing the old tangled one, the campaign moves backward. Every developer must know that the refactoring is "in play" and must use the new structure when writing new code.
**4. Intermediate states are dangerous — tests must pass at every step.**
The period between starting and finishing a big refactoring is the most dangerous time. The code is in a partially restructured state. Without a test suite that passes after each step, the team cannot detect regressions. Run build-refactoring-test-suite before starting any campaign.
**5. Accumulation is the enemy.**
Fowler and Beck: "Accumulation of half-understood design decisions eventually chokes a program as a water weed chokes a canal." Big refactorings address accumulated design debt. They are not optional maintenance — they are necessary for the system to remain changeable.
---
## Examples
### Example 1: Tease Apart Inheritance — Deal Hierarchy
**Situation:** A codebase has a `Deal` hierarchy with subclasses `ActiveDeal`, `PassiveDeal`, `TabularActiveDeal`, and `TabularPassiveDeal`. Every time a new deal type is added, tabular and non-tabular subclasses must both be added. Every time a new presentation style is added, active and passive subclasses must both be added. The team needs to add a chart-based presentation style.
**Pattern confirmed:** Tease Apart Inheritance — same adjective prefix ("Tabular") appears in both branches; the hierarchy is growing combinatorially.
**Two-dimensional grid:**
```
             | Active Deal | Passive Deal |
-------------|-------------|--------------|
(single)     | X           | X            |
Tabular      | X           | X            |
```
**Decision — which job stays:** Deal type (Active/Passive) has more code; presentation style (single/tabular) has less. Extract presentation style.
**Campaign milestones:**
Milestone 1 (Day 1-2): Apply Extract Class at `Deal` superclass — create `PresentationStyle`. Add `presentation` instance variable to `Deal`. Tests pass.
Milestone 2 (Day 3-5): Create `TabularPresentationStyle` and `SinglePresentationStyle` subclasses of `PresentationStyle`. Initialize `presentation` in `TabularActiveDeal` and `TabularPassiveDeal` constructors to `TabularPresentationStyle`. Tests pass.
Milestone 3 (Day 6-10): Apply Move Method in each tabular subclass — move presentation-related methods to `TabularPresentationStyle`. When a subclass is empty, delete it. Apply Move Method in single subclasses similarly. Tests pass after each deletion.
Milestone 4 (Day 11-12): Apply Pull Up Method/Field to simplify each hierarchy now that they are separated. Tests pass.
**Stopping condition:** `TabularActiveDeal` and `TabularPassiveDeal` are deleted. Adding `ChartPresentationStyle` requires only one new class.
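The end state of this campaign can be sketched in code. The following is a minimal Python illustration (the source example is Java-flavoured; Python is used here only for brevity). The class names come from the example, while the methods `render`, `describe`, and `show` are hypothetical stand-ins for whatever presentation-related methods the real hierarchy contains.

```python
from abc import ABC, abstractmethod


class PresentationStyle(ABC):
    """The extracted job: how a deal is presented."""

    @abstractmethod
    def render(self, description: str) -> str: ...


class SinglePresentationStyle(PresentationStyle):
    def render(self, description: str) -> str:
        return description


class TabularPresentationStyle(PresentationStyle):
    def render(self, description: str) -> str:
        return f"| {description} |"


class Deal(ABC):
    """The job that stays: deal type. Presentation is delegated."""

    def __init__(self, presentation: PresentationStyle) -> None:
        self.presentation = presentation

    @abstractmethod
    def describe(self) -> str: ...

    def show(self) -> str:
        return self.presentation.render(self.describe())


class ActiveDeal(Deal):
    def describe(self) -> str:
        return "active deal"


class PassiveDeal(Deal):
    def describe(self) -> str:
        return "passive deal"


# The new variation the team needed: one class, no new Deal subclasses.
class ChartPresentationStyle(PresentationStyle):
    def render(self, description: str) -> str:
        return f"[chart: {description}]"


print(ActiveDeal(TabularPresentationStyle()).show())
print(PassiveDeal(ChartPresentationStyle()).show())
```

Any deal type can now be combined with any presentation style at construction time, which is what removes the combinatorial growth.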
---
### Example 2: Separate Domain from Presentation — Order Window
**Situation:** An `OrderWindow` class (400 lines) handles all GUI rendering and also contains SQL queries for fetching product prices, customer discount calculations, and order total computation. The team cannot unit test pricing logic because it is embedded in GUI event handlers.
**Pattern confirmed:** Separate Domain from Presentation — SQL statements in a window class, business calculation in event handlers.
**Purpose:** Make pricing logic unit-testable without a running GUI.
**Campaign milestones:**
Milestone 1 (Day 1): Create `Order` domain class linked from `OrderWindow`. Create `OrderLine` for grid rows. Tests pass (no behavior moved yet).
Milestone 2 (Day 2-5): Examine each field. The `customerCodes` field is not displayed, so apply Move Field to put it directly on `Order`. For displayed fields (`customerName`, `amount`), apply Duplicate Observed Data: add mirrored fields to `Order` and keep them synchronized with the window. Tests pass.
Milestone 3 (Day 6-15): Apply Extract Method to isolate SQL calls and pricing calculations in `OrderWindow`. Apply Move Method to transfer each extracted operation to `Order`. Drive all SQL toward `Order`. Tests pass after each move.
**Observable stopping signal:** `OrderWindow` no longer imports `java.sql`. Pricing methods exist on `Order` and can be unit tested without instantiating the GUI.
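To make the purpose concrete, here is a minimal Python sketch of what the extracted domain classes might look like once pricing has moved out of the window. `Order` and `OrderLine` are named in the example; the fields, the `add_line` and `total` methods, and the discount-as-fraction convention are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class OrderLine:
    unit_price: Decimal
    quantity: int

    def amount(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass
class Order:
    # Assumption: the discount is a fraction such as Decimal("0.10").
    customer_discount: Decimal = Decimal("0")
    lines: list[OrderLine] = field(default_factory=list)

    def add_line(self, unit_price: Decimal, quantity: int) -> None:
        self.lines.append(OrderLine(unit_price, quantity))

    def total(self) -> Decimal:
        subtotal = sum((line.amount() for line in self.lines), Decimal("0"))
        return subtotal * (Decimal("1") - self.customer_discount)


# No GUI and no database connection is needed to exercise the pricing logic.
order = Order(customer_discount=Decimal("0.10"))
order.add_line(Decimal("20.00"), 2)
order.add_line(Decimal("5.00"), 1)
print(order.total())
```

Because `Order` has no dependency on the window or on SQL, a plain unit test can construct it directly, which is the stated purpose of the campaign.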
---
### Example 3: Extract Hierarchy — Billing Scheme (Variant A)
**Situation:** `BillingScheme` has flags for disability status, lifeline status, and business type. Every method contains conditional logic checking these flags. Adding a new billing category requires editing conditional logic in eight methods.
**Pattern confirmed:** Extract Hierarchy, Variant A — variations not fully clear upfront; flags static during object lifetime.
**First variation picked:** disability scheme.
**Campaign milestones:**
Milestone 1: Apply Replace Constructor with Factory Method on `BillingScheme`. Create `DisabilityBillingScheme` subclass. Factory method returns `DisabilityBillingScheme` when disability flag is set. Tests pass.
Milestone 2: Copy `createBill` to `DisabilityBillingScheme`. Simplify — remove branches that check `disabilityScheme()` since the subclass is always in disability context. Tests pass.
Milestone 3: Continue for other methods with disability conditionals. Pick next variation (lifeline). Create `LifelineBillingScheme`. Repeat. Tests pass after each subclass iteration.
Milestone 4: When all variations are subclassed, declare `BillingScheme` abstract. Delete conditional method bodies from the superclass. Tests pass.
**Stopping condition:** Adding a new billing category requires creating one new subclass of `BillingScheme` — no edits to existing classes.
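The mid-campaign state after Milestones 1 and 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the rates (in cents per unit), the `units` parameter, and the snake_case names `create` and `create_bill` are hypothetical, and only the disability variation has been extracted so far.

```python
class BillingScheme:
    """Mid-campaign superclass: still concrete, still checking flags."""

    def __init__(self, disability: bool = False, lifeline: bool = False) -> None:
        self.disability = disability
        self.lifeline = lifeline

    @staticmethod
    def create(disability: bool = False, lifeline: bool = False) -> "BillingScheme":
        # Milestone 1: callers go through the factory, which picks the subclass.
        if disability:
            return DisabilityBillingScheme(lifeline=lifeline)
        return BillingScheme(lifeline=lifeline)

    def create_bill(self, units: int) -> int:
        # Hypothetical rates in cents per unit. The disability branch is no
        # longer reachable through the factory and is deleted in a later step.
        if self.disability:
            return units * 5
        if self.lifeline:
            return units * 7
        return units * 10


class DisabilityBillingScheme(BillingScheme):
    def __init__(self, lifeline: bool = False) -> None:
        super().__init__(disability=True, lifeline=lifeline)

    def create_bill(self, units: int) -> int:
        # Milestone 2: copied down, then simplified because the flag is always set.
        return units * 5


scheme = BillingScheme.create(disability=True)
print(type(scheme).__name__, scheme.create_bill(100))
```

Behavior is unchanged at this point, so the existing tests still pass. Each later iteration repeats the same move for the next flag until the superclass has no conditional bodies left.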
---
## References
| File | Contents | When to read |
|------|----------|--------------|
| `references/big-refactoring-patterns.md` | Full mechanics for all four patterns with decision tables and edge cases | Step 2 — selecting execution strategy |
**Related skills:**
- `code-smell-diagnosis` — run first to identify which pattern applies (Parallel Inheritance Hierarchies, Data Class, or Large Class god class)
- `build-refactoring-test-suite` — run before starting any campaign to establish the safety net
- `method-decomposition-refactoring` — for the Extract Method and Move Method steps within the campaign
- `conditional-simplification-strategy` — for the Replace Conditional with Polymorphism steps in Extract Hierarchy
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-code-smell-diagnosis`
- `clawhub install bookforge-build-refactoring-test-suite`
- `clawhub install bookforge-method-decomposition-refactoring`
- `clawhub install bookforge-conditional-simplification-strategy`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build a complete Negotiation One Sheet — a five-section preparation document that covers your aspirational goal, a counterpart-validating situation summary,...
---
name: negotiation-one-sheet-generator
description: |
Build a complete Negotiation One Sheet — a five-section preparation document that covers your aspirational goal, a counterpart-validating situation summary, a preemptive accusation audit, a calibrated question bank, and a list of noncash offers — before any negotiation, sales conversation, contract discussion, salary negotiation, or difficult ask. Use when you need to prepare for a high-stakes conversation in a single document, when you want to stop improvising and start with a battle-tested preparation framework, when you keep leaving deals on the table by aiming at your bottom line instead of your aspirational target, when you need to combine emotional preparation with offer strategy into one coherent plan, or when you are coaching someone else through a complex negotiation. Also use before any negotiation where you have 20+ minutes to prepare and want to walk in with every major tool loaded: counterpart profile, labels, questions, offer sequence, and noncash options. Produces negotiation-one-sheet.md — a complete, ready-to-use preparation document with all five sections filled. Works standalone with simplified inline processes or in full-depth mode by invoking the seven supporting Level 0 skills. The hub of the Never Split the Difference skill set.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/negotiation-one-sheet-generator
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [23]
tags: [negotiation, preparation, one-sheet, goal-setting, accusation-audit, calibrated-questions, noncash-offers, tactical-empathy, ackerman, bargaining, sales, salary-negotiation, hub-skill]
depends-on:
- counterpart-style-profiler
- accusation-audit-generator
- calibrated-questions-planner
- empathic-summary-planner
- ackerman-bargaining-planner
- black-swan-discovery
- commitment-verifier
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Situation brief — what the negotiation is about, who the counterpart is, what you want, what you know about their position and constraints"
- type: document
description: "Your aspirational goal — the best-case outcome you want. Not your walk-away point. If you don't know, the process will help you set one."
- type: document
description: "Any prior artifacts from Level 0 skills (optional) — counterpart-profile.md, accusation-audit.md, calibrated-questions.md, ackerman-plan.md, black-swan-report.md"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works from a free-text situation brief or a full document set from Level 0 skills. Richer context produces a more targeted one sheet."
discovery:
goal: "Produce negotiation-one-sheet.md — a complete five-section preparation document covering aspirational goal, situation summary, labels/accusation audit, calibrated questions, and noncash offers"
tasks:
- "Set an aspirational goal using the four-step process (not BATNA-anchored)"
- "Write a two-sentence situation summary that will trigger genuine validation from the counterpart"
- "Prepare 3-5 preemptive labels targeting the counterpart's anticipated emotions"
- "Prepare 3-5 calibrated How/What questions across three categories"
- "Identify noncash items the counterpart possesses that would add value"
- "Optionally invoke Level 0 skills for deep output in each section"
- "Write negotiation-one-sheet.md"
audience:
roles: ["salesperson", "founder", "manager", "consultant", "recruiter", "lawyer", "freelancer", "job-seeker", "anyone-who-negotiates"]
experience: "beginner to intermediate — no formal negotiation training required"
triggers:
- "User is preparing for a negotiation and wants a single comprehensive preparation document"
- "User wants to combine goal, empathy, questions, and offer strategy in one place"
- "User has used one or more Level 0 skills and wants to integrate their outputs into a final prep document"
- "User is coaching someone else through a negotiation and needs a complete framework"
- "User has 20+ minutes before a negotiation and wants to use that time optimally"
not_for:
- "Deep counterpart profiling — use counterpart-style-profiler for a full three-archetype classification"
- "In-depth accusation audit with delivery scripting — use accusation-audit-generator"
- "Full calibrated question bank with deployment sequencing — use calibrated-questions-planner"
- "Detailed Ackerman offer schedule with computed amounts — use ackerman-bargaining-planner"
- "Black Swan hypothesis mapping — use black-swan-discovery"
- "Verifying commitment quality after the negotiation — use commitment-verifier"
---
# Negotiation One Sheet Generator
## When to Use
You are preparing for a negotiation — a salary discussion, sales conversation, vendor contract, partnership deal, difficult ask, or any high-stakes conversation where you want more than your bottom line. You have at least 20 minutes to prepare and want to walk in with every major tool loaded.
The Negotiation One Sheet is a five-section preparation document. It replaces a script (which makes you rigid) with prepared tools (which make you adaptive). When pressure hits, you do not rise to the occasion; you fall to your highest level of preparation. This document is that level.
**Use this skill to build the full one sheet.** Use the Level 0 skills listed in `depends-on` when you want deep, standalone output for any individual section. The hub works in two modes:
- **Standalone mode:** Each section uses a simplified inline process. Produces a complete one sheet without invoking any other skill.
- **Integrated mode:** For sections where a Level 0 skill is available and loaded, invoke it to get deeper output, then copy the relevant artifact into the one sheet.
---
## Context & Input Gathering
### Required
- **The situation:** What are you negotiating? What does a good outcome look like? Who is the counterpart?
- **What you want:** Your aspirational target — or enough information to set one in Section I.
### Important
- **What you know about the counterpart:** Their role, what they care about, prior communication history, their likely objections.
- **Available noncash items:** Things you could offer that have high value to them and low cost to you.
### Optional (if Level 0 skills have been run)
- `counterpart-profile.md` — informs Section III (label calibration) and Section II (summary tone)
- `accusation-audit.md` — can replace Section III
- `calibrated-questions.md` — can replace Section IV
- `ackerman-plan.md` — informs Section I (goal feeds Ackerman target)
- `black-swan-report.md` — adds a Bonus Section VI
### Sufficiency Check
You need at minimum: the situation and a rough sense of what you want. Everything else can be inferred. If you have nothing, ask the user to describe the negotiation in 3-5 sentences before proceeding.
---
## Process
Work through the five sections sequentially. For each section, check whether a relevant Level 0 skill or its artifact is available. If yes, invoke the skill (or use its existing output). If no, use the inline process below.
---
### Section I: Goal
**Action:** Set a single aspirational goal — the best-case outcome, written down.
**WHY:** The most common preparation mistake is anchoring preparation to your walk-away point (the minimum acceptable outcome). Decades of goal-setting research confirm that people who set specific, challenging, but realistic goals get better outcomes than those who aim at their minimum. When a walk-away point becomes the focus of preparation, it becomes the psychological ceiling: the negotiator relaxes when they reach it and stops pushing. An aspirational goal does the opposite: it primes you to treat anything short of it as a loss, keeping you psychologically engaged throughout the negotiation. This is the core insight behind refusing to "split the difference": when you aim at your minimum, any split with the counterpart's position lands you below that minimum.
**Inline four-step process:**
1. **Define the range.** Think through both ends — the absolute worst acceptable outcome AND the best plausible outcome. Write both. The range gives you structure; you need both ends to feel grounded.
2. **Set the aspirational target.** Choose the high end as your goal. Make it specific and concrete (a number, a date, a set of terms). Write it down. Vague goals produce vague negotiations.
3. **Commit it externally.** Share the written goal with a colleague before the negotiation. Externalized commitments are harder to abandon: you psychologically resist lowering a target that another person has witnessed you set. A goal that exists only in your head is a goal you will abandon under pressure.
4. **Carry it in.** The written goal goes with you into the negotiation — physically or on screen. It is your anchor when the counterpart puts pressure on your bottom line.
**If `ackerman-plan.md` is available:** The target price in that document is your Section I goal. Paste it here.
**Section I output:**
```
**My Aspirational Goal:** [Specific, concrete best-case outcome]
**Why this is achievable:** [One sentence — what evidence or logic supports this target]
**My absolute floor (not on the one sheet, for mental reference only):** [Walk-away point]
```
---
### Section II: Situation Summary
**Action:** Write a 2-sentence situation summary that describes the facts from the counterpart's perspective — accurate enough that they would respond with "That's right" when they hear it.
**WHY:** The situation summary is not a self-serving description of why you deserve what you want. Its purpose is to demonstrate to the counterpart — at the opening of the conversation — that you understand their situation accurately. This triggers genuine validation ("That's right") rather than polite acknowledgment ("You're right"). The difference matters enormously: "That's right" means the counterpart feels understood; "You're right" means they want to end the conversation. A counterpart who feels understood lowers their emotional guard. A counterpart who feels misunderstood — or believes you are starting from a self-serving interpretation of the facts — immediately activates resistance that no amount of logic will overcome. Emotional preparation precedes rational persuasion. This is not optional: the emotional brain (System 1) evaluates whether you understand the situation before the rational brain (System 2) will engage with your arguments.
**Inline process:**
1. Write 1 sentence describing the situation from the counterpart's perspective: what are the facts as they see them, and what is at stake for them?
2. Write 1 sentence describing what they are likely trying to achieve: what does a good outcome look like from their side?
3. Check: would your counterpart respond "That's right" if they heard this? If they would respond "Well, not exactly..." — revise until they would agree.
4. Keep it to 2 sentences maximum. A longer summary signals you are making an argument, not demonstrating understanding.
**If `empathic-summary-planner` is available:** The "That's right" trigger statement in that skill's output is your Section II summary. Paste it here.
**Section II output:**
```
**Situation Summary (2 sentences):**
[Sentence 1 — facts from their perspective]
[Sentence 2 — what they are trying to accomplish]
**"That's right" test:** [Would they agree? If not, what needs to change?]
```
---
### Section III: Labels / Accusation Audit
**Action:** Prepare 3-5 preemptive labels that name the counterpart's anticipated negative feelings before they surface.
**WHY:** Emotional objections voiced by a counterpart have more force than the same objections that stay unspoken — because the counterpart has now committed to them publicly. Naming them first, before the counterpart does, drains the charge. When you say "It seems like you might be concerned this isn't going to be worth your time" before the counterpart thinks it, they cannot use that concern as a weapon. If the label is right, they feel understood. If it is wrong, they correct you — which still advances the conversation. The mechanism is neurological: naming a negative emotion engages the prefrontal cortex and reduces amygdala activation. The counterpart's emotional brain calms down. Their rational brain comes online. This is what makes emotional preparation more valuable than rational scripting.
**Inline process:**
1. List every accusation the counterpart might make — stated in their voice, uncensored. Include the extreme ones.
2. Convert each into a label using the formula: "It seems like \_\_\_\_" / "It sounds like \_\_\_\_" / "It looks like \_\_\_\_". Never "I feel" or "I think you feel."
3. Select the 3-5 most emotionally charged labels. Sequence from strongest to lightest.
4. Add the fill-in-the-blank templates below for any gaps:
**Universal label templates (fill in the blank):**
- "It seems like \_\_\_\_\_\_\_\_\_ is valuable to you."
- "It seems like you don't like \_\_\_\_\_\_\_\_\_."
- "It seems like you value \_\_\_\_\_\_\_\_\_."
- "It seems like \_\_\_\_\_\_\_\_\_ makes it easier."
- "It seems like you're reluctant to \_\_\_\_\_\_\_\_\_."
**If `counterpart-style-profiler` is available:** The counterpart's type (Analyst/Accommodator/Assertive) tells you which category of concerns to weight most heavily. Analysts fear surprises and insufficient data. Accommodators fear relationship damage and conflict. Assertives fear being ignored or not heard.
**If `accusation-audit-generator` is available:** The label bank in that skill's output replaces this section. Paste the 3-5 sequenced labels here.
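The label formula from step 2 of the inline process can be checked mechanically before labels go into the one sheet. The following is a minimal Python sketch. The helper name `is_label` is hypothetical, and the check looks only at the opening words, not at whether the label is emotionally accurate.

```python
# The three openers allowed by the formula in step 2.
LABEL_OPENERS = ("it seems like", "it sounds like", "it looks like")


def is_label(text: str) -> bool:
    """True when the text opens with one of the three neutral label formulas."""
    cleaned = text.strip().lstrip('"').lower()
    return cleaned.startswith(LABEL_OPENERS)


drafts = [
    "It seems like you're reluctant to commit before the budget review.",
    "I feel you are worried about the timeline.",
    "It sounds like the last vendor let you down.",
]
for draft in drafts:
    print("ok    " if is_label(draft) else "revise", draft)
```

Drafts flagged for revision are usually first-person statements ("I feel", "I think you feel") that need rephrasing into one of the three openers.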
**Section III output:**
```
**Anticipated Accusations (raw):**
1. [Their voice — worst-case thought]
2. [Their voice]
3. [Their voice]
**Labels (3-5, sequenced strongest to lightest):**
Label 1: "It seems like ___________." [Pause 3-5 seconds]
Label 2: "It sounds like ___________." [Pause 3-5 seconds]
Label 3: "It seems like ___________." [Pause 3-5 seconds]
[Label 4 if applicable]
[Label 5 if applicable]
```
---
### Section IV: Calibrated Questions
**Action:** Prepare 3-5 How/What questions across three categories: value-revealing, behind-the-table stakeholder discovery, and deal-killing issue identification.
**WHY:** A negotiator who only talks about their own position never finds out what the counterpart actually needs — which means they can never offer something that genuinely satisfies both sides. Calibrated questions shift the problem-solving burden to the counterpart, generate information, and give the counterpart a sense of control (because they are the one providing answers). "How" and "What" questions do this cleanly. "Why" questions — even well-intentioned ones — sound like accusations ("Why is that a concern for you?" implies their concern is unjustified) and produce defensiveness, not information. The three-category structure ensures you are not just asking about stated positions: you are also surfacing the stakeholders who can kill the deal from off-screen, and diagnosing problems that will derail implementation after a handshake.
**Inline process:**
**Category A — Value-revealing questions** (what matters to them, what success looks like):
- "What are we trying to accomplish?"
- "How is that worthwhile?"
- "What's the core issue here?"
- "How does that affect things?"
- "What's the biggest challenge you face?"
- "How does this fit into what the objective is?"
**Category B — Behind-the-table stakeholder questions** (who else has veto power):
- "How does this affect the rest of your team?"
- "How on board are the people not in this conversation?"
- "What do your colleagues see as their main challenges in this area?"
**Category C — Deal-killing issue questions** (what could prevent implementation):
- "What are we up against here?"
- "What happens if you do nothing?"
- "What does doing nothing cost you?"
- "How does making a deal with us affect things?"
- "How does this resonate with what your organization values?"
**Selection guidance:** Choose 1-2 questions from each category, for 3-5 in total. Ask them in groups of 2-3: similar questions from the same category help the counterpart think about the same issue from multiple angles without feeling interrogated. Prepare a follow-up label template for each question.
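The How/What rule for calibrated questions is simple enough to screen drafts automatically. The following is a minimal Python sketch, assuming a question is judged purely by its opening word. The helper name `is_calibrated` is hypothetical.

```python
import re

# Assumption: a calibrated question is recognized purely by its opening word.
_CALIBRATED = re.compile(r"(how|what)\b", re.IGNORECASE)


def is_calibrated(question: str) -> bool:
    """True for questions that open with How or What; False for Why and others."""
    return bool(_CALIBRATED.match(question.strip().lstrip('"')))


candidates = [
    "What's the biggest challenge you face?",
    "How does this affect the rest of your team?",
    "Why is that a concern for you?",
]
keep = [q for q in candidates if is_calibrated(q)]
print(keep)
```

A rejected "Why" question can usually be rephrased, for example "What makes that a concern?", rather than dropped.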
**Follow-up label templates (for after their answers):**
- "It seems like \_\_\_\_\_\_\_\_\_ is important."
- "It seems like you feel my organization is in a unique position to \_\_\_\_\_\_\_\_\_."
- "It seems like you're worried that \_\_\_\_\_\_\_\_\_."
**If `calibrated-questions-planner` is available:** The deployment groups in that skill's output replace this section. Paste the question bank here, selecting your 3-5 highest-priority questions.
**Section IV output:**
```
**Selected Questions:**
[Category A — Value-Revealing]
Q1: [How/What question]
→ Follow-up label: "It seems like ___________."
Q2: [How/What question]
→ Follow-up label: "It sounds like ___________."
[Category B — Behind-the-Table]
Q3: [How/What question about stakeholders]
→ Follow-up label: "It seems like ___________ has a stake in this."
[Category C — Deal-Killing Issues]
Q4: [How/What question about obstacles]
→ Follow-up label: "It seems like ___________ could be the real challenge."
[Q5 optional]
```
---
### Section V: Noncash Offers
**Action:** List noncash items your counterpart possesses — or that you possess — that could be traded to reach agreement when money alone has stalled.
**WHY:** A surprisingly high percentage of negotiations hinge on something other than price. Counterparts often have constraints, status needs, or interest in non-monetary outcomes that are far cheaper for the other side to provide than the gap in the cash number. The negotiator who only thinks about price leaves these trades on the table. Asking "what could they give that would almost make us do this for free?" forces creativity about value beyond money. It also produces your strongest noncash closing move: offering something of genuine value to the counterpart alongside your final price — especially when that price is a precise, non-round number — signals that you have exhausted your financial flexibility and are finding other ways to make the deal work. This makes your final number feel more credible and final.
**Inline process:**
1. Ask: "What could they give me that would almost make me do this for free?" List 3 candidates.
2. Ask: "What could I give them that costs me little but would be genuinely valuable to them?" List 3 candidates.
3. Select the 1-2 most viable options from each list.
4. Identify which item is most appropriate as a final-stage sweetener alongside your last offer.
**Examples by situation type:**
- Salary negotiation: Additional vacation days, remote flexibility, earlier performance review, professional development budget, equity acceleration
- Vendor/service purchase: Longer contract term, public testimonial, case study rights, referral to other buyers, priority support access
- Real estate: Flexible closing date, furniture items included, quick pre-approval, waived contingencies
- B2B sales: Speaking opportunity, advisory board seat, co-marketing, logo placement, named reference
**If `ackerman-bargaining-planner` is available:** The noncash item selected in that skill's output is your primary Section V item for the final offer stage. Add it here and expand the list.
**Section V output:**
```
**Noncash Items They Could Offer Me:**
1. [Item — why it's valuable to me, why it costs them little]
2. [Item]
3. [Item]
**Noncash Items I Could Offer Them:**
1. [Item — why it's valuable to them, why it costs me little]
2. [Item]
3. [Item]
**Best final-stage sweetener:** [The single most valuable noncash item to include with the last offer]
```
---
### Bonus Section VI: Black Swan Hypotheses (optional)
**If `black-swan-discovery` is available:** Paste the top 2-3 Black Swan hypotheses from that skill's `black-swan-report.md` here. These are the unknown unknowns that could explain surprising behavior or break the deal entirely. Carry them as live hypotheses — watch for signals during the conversation.
**If the skill is not available:** Skip this section or list 1-2 things that feel unexplained about the counterpart's behavior so far.
---
### Final Step: Write the One Sheet
**Action:** Produce `negotiation-one-sheet.md` by assembling all five (or six) sections into a single document. The one sheet must be short enough to review in 2 minutes before walking into the conversation. If any section exceeds one short paragraph or 5 bullet points, summarize it. Each section should contain 3-5 items maximum.
**WHY:** The one sheet fails if it becomes a binder. Its purpose is to be a live reference document — reviewed immediately before the conversation, carried into the room, consulted if the conversation stalls. It must be fast to read under pressure. Length is the enemy of use.
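The length limit can be enforced with a quick check before the file is written. The following is a minimal Python sketch, assuming sections are introduced by `## ` headings as in the output template and that list items are a reasonable proxy for section size. The function name `oversized_sections` is hypothetical, and the "one short paragraph" limit is not checked.

```python
import re

MAX_ITEMS = 5  # per-section limit stated in the Final Step

# Matches "- item", "* item", and "1. item" list lines.
_LIST_ITEM = re.compile(r"\s*(?:[-*]|\d+\.)\s+")


def oversized_sections(markdown: str, max_items: int = MAX_ITEMS) -> dict[str, int]:
    """Map each '## ' section title to its list-item count when it exceeds the limit."""
    counts: dict[str, int] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            counts[current] = 0
        elif current is not None and _LIST_ITEM.match(line):
            counts[current] += 1
    return {title: n for title, n in counts.items() if n > max_items}


sheet = """# Negotiation One Sheet: Example
## Section IV: Calibrated Questions
- q1
- q2
- q3
- q4
- q5
- q6
## Section V: Noncash Offers
1. a
2. b
"""
print(oversized_sections(sheet))
```

Any section the check returns should be summarized down to its 3-5 strongest items before the one sheet is finalized.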
---
## Inputs
| Input | Required | Format |
|---|---|---|
| Situation brief | Yes | Any — markdown, plain text, verbal description |
| Aspirational goal (or enough to set one) | Yes | A number, term, or outcome |
| Counterpart description | Recommended | Role, what they care about, prior history |
| Level 0 skill artifacts | Optional | counterpart-profile.md, accusation-audit.md, calibrated-questions.md, ackerman-plan.md, black-swan-report.md |
| Noncash items | Optional | Anything of value beyond money |
---
## Outputs
**Primary output:** `negotiation-one-sheet.md`
```markdown
# Negotiation One Sheet: [Deal / Conversation Name]
**Prepared by:** [Name / role]
**Counterpart:** [Name / organization / role]
**Date:** [Date of negotiation]
---
## Section I: Goal
**My Aspirational Goal:** [Specific, concrete best-case outcome — a number, a set of terms, a deadline]
**Why achievable:** [One sentence — the evidence or logic that makes this goal realistic, not just wishful]
*(Walk-away floor is known but not written here — it is not the target.)*
---
## Section II: Situation Summary
**Summary (2 sentences):**
[Sentence 1 — facts from their perspective]
[Sentence 2 — what they are trying to accomplish]
**"That's right" test:** [Would they agree? Note any uncertainty.]
---
## Section III: Labels / Accusation Audit
**Labels (3-5, sequenced strongest to lightest):**
Label 1: "It seems like ___________."
*[Pause 3-5 seconds after delivery.]*
Label 2: "It sounds like ___________."
*[Pause 3-5 seconds.]*
Label 3: "It seems like ___________."
*[Pause 3-5 seconds.]*
[Label 4 if applicable]
[Label 5 if applicable]
**Delivery note:** Open with the labels before any ask. Downward inflection — statement, not question. Do not fill the silence after each label.
---
## Section IV: Calibrated Questions
**Value-Revealing:**
- [How/What question] → Follow-up label: "It seems like ___________."
- [How/What question] → Follow-up label: "It sounds like ___________."
**Behind-the-Table Stakeholders:**
- [How/What question] → Follow-up label: "It seems like ___________ has a stake in this."
**Deal-Killing Issues:**
- [How/What question] → Follow-up label: "It seems like ___________ could be the real challenge."
[Optional 5th question]
**Deployment note:** Ask questions in groups of 2-3. Pause and label after each answer before asking the next.
---
## Section V: Noncash Offers
**Items I could offer them (high value to them / low cost to me):**
1. [Item]
2. [Item]
**Items they could offer me (high value to me / potentially low cost to them):**
1. [Item]
2. [Item]
**Final-stage sweetener:** [The single noncash item to pair with the last offer]
---
## Section VI: Black Swan Hypotheses (if applicable)
- [Hypothesis 1 — unknown unknown that could explain their behavior or constraints]
- [Hypothesis 2]
*(Watch for signals during the conversation. If a hypothesis confirms, pivot.)*
---
## Post-Negotiation Follow-Up
- [ ] Was a genuine "That's right" obtained on the situation summary?
- [ ] Were all labels delivered before the first ask?
- [ ] Were the calibrated questions asked, or did you skip to position-trading?
- [ ] Was the final offer accompanied by a noncash item?
- [ ] Any verbal commitments that need `commitment-verifier` analysis?
```
---
## Key Principles
**Preparation is the multiplier.** Every hour spent on a Negotiation One Sheet yields at least a 7:1 return in time that would otherwise go to renegotiating deals and clarifying implementation. The investment is not in the document; it is in the mental simulation the preparation forces. Negotiators who prepare a one sheet have already imagined the hard moments before they arrive.
**The aspirational goal is the engine; the walk-away point is just the guardrail.** BATNA (Best Alternative to a Negotiated Agreement) anchors negotiators to their minimum because the human brain, under the cognitive load of a live negotiation, gravitates toward the most psychologically significant number it has prepared. Preparing an aspirational target puts that number front-and-center instead. The walk-away point exists to prevent catastrophic outcomes — not to aim at.
**The One Sheet is a toolkit, not a script.** It holds prepared tools (labels, questions, offers), not predetermined lines. Scripts make you rigid and exploitable: they break the moment the counterpart says something you did not anticipate. Tools give you prepared responses that adapt to whatever direction the conversation takes.
**Emotional preparation (labels, accusation audit) must precede rational preparation (goals, numbers).** The counterpart's System 1 (fast, emotional brain) processes social signals before System 2 (slow, rational brain) can engage with facts. Calming System 1 with empathic labels opens the door for System 2 to evaluate your proposal rationally. This is why the section ordering of the one sheet is not arbitrary: a summary that triggers genuine validation (Section II) and labels that defuse anticipated hostility (Section III) are prerequisites to productive engagement with questions (Section IV) and offers (Sections I and V).
**The five sections are interdependent.** Goal (I) feeds the Ackerman offer sequence. The situation summary (II) uses the counterpart profile. Labels (III) are calibrated to the counterpart's emotional state. Questions (IV) surface deal-killers before they surface on their own. Noncash items (V) are the creative exit from purely positional bargaining. Prepare all five — they compound.
**The one sheet is a live reference document.** It must be reviewable in 2 minutes. Long preparation documents are not consulted under pressure. One page — two at most — that you can glance at when the conversation stalls or when you are tempted to split the difference.
---
## Examples
### Example 1: Salary Negotiation
**Scenario:** A senior engineer preparing to negotiate a 25% raise with their current employer. They know budget constraints exist but also know their work has driven significant retention impact. They have 30 minutes before the conversation.
**Trigger:** "Help me build a complete one sheet for my salary negotiation this afternoon."
**Process:**
- Section I: Aspirational goal set at 25% raise (above market mid-range for the role). Walk-away at 15%. Written and stated to a friend.
- Section II: Summary written from manager's perspective — acknowledges budget cycle pressure and that raises are being evaluated against team-wide equity, not just individual performance.
- Section III: Three labels prepared — strongest first: "It seems like any raise right now creates a precedent problem with the rest of the team." Second: "It sounds like the timing isn't great given what the budget situation looks like." Third: "It seems like you want to reward good work but you're constrained by what's possible."
- Section IV: Two value-revealing questions (for example: What does retention of high performers cost us relative to a raise?), one stakeholder question (How are decisions like this usually made? Is it just you, or does HR need to sign off?), one deal-killer question (What happens if this conversation doesn't go anywhere today?).
- Section V: Noncash items identified — earlier performance review cycle in exchange for lower raise increment now; remote work flexibility; professional development budget for conference attendance.
**Output:** `negotiation-one-sheet.md` with all five sections filled, goal set at 25%, two-sentence summary written from the manager's perspective, three sequenced labels, four calibrated questions with follow-up labels, and three noncash alternatives as final-stage creative options.
---
### Example 2: Vendor Contract Renewal
**Scenario:** A procurement manager renewing a SaaS contract. The vendor has sent a 30% price increase. The procurement manager wants to hold price flat and secure a 2-year term.
**Trigger:** "I have a contract renewal meeting tomorrow. The vendor wants 30% more. Build me a one sheet."
**Process:**
- Section I: Goal — hold price flat or max 5% increase, lock in 2-year term with price protection. Walk-away at 15% increase maximum.
- `counterpart-style-profiler` invoked — vendor account manager classified as Assertive. Adaptation: lead with label before counter-position. Get to the point quickly. Do not over-explain rationale.
- Section II: Summary from vendor's perspective — they need to justify pricing to their own leadership and have been under margin pressure. The account relationship has been stable but the procurement manager has not been vocal about value internally.
- `accusation-audit-generator` invoked — three labels prepared: strongest first: "It seems like you're in a position where you can't bring a flat renewal back to your team and have it look like a win." Second: "It sounds like the pricing increase is tied to something that happened at the platform cost level, not just a negotiating position." Third: "It seems like locking in a longer term creates some of its own complications for you internally."
- Section IV: Three questions — What would a 2-year commitment change about how you think about the pricing? How does this account fit into your team's goals for retention this year? What happens to the pricing structure if we decide to go to market?
- Section V: Noncash items — offer a public case study, agree to introductory reference calls for two new prospects, offer a multi-year term with annual increases capped at CPI.
**Output:** Complete `negotiation-one-sheet.md` integrating counterpart-profile and accusation-audit outputs, with Assertive-adapted delivery notes and a noncash offer structured around reference value rather than cash.
---
### Example 3: Partnership Deal with a Larger Company
**Scenario:** A startup founder negotiating a distribution partnership with a much larger company. Power asymmetry is significant. The larger company's champion is enthusiastic but internal approval is uncertain.
**Trigger:** "I'm meeting their VP next week. There's a real deal here but I don't know who else needs to approve this. Build me a one sheet."
**Process:**
- `black-swan-discovery` invoked — key hypothesis: the champion does not have final authority and there is an internal stakeholder (legal or procurement) who has not been surfaced. A second hypothesis: the larger company is evaluating two alternatives simultaneously and is using the meeting to gather competitive information.
- Section I: Goal — signed letter of intent with exclusivity clause, revenue share above 20%, and implementation timeline under 90 days. Walk-away: any deal without exclusivity protection.
- Section II: Summary from VP's perspective — their team needs a credible distribution solution before the end of the quarter. This partnership solves a problem they have already identified internally. The risk for them is picking a partner who underdelivers and creating an embarrassment.
- Section III: Three labels — strongest: "It seems like the real concern is what happens if this partnership doesn't work and your team has publicly backed it." Second: "It sounds like there are people not in this meeting whose buy-in you need before anything can move." Third: "It seems like the timeline is being driven by something on your side that we haven't fully talked about yet."
- Section IV: Value-revealing (What does success look like for your team 12 months from now?), stakeholder (How on board are the people who would actually implement this?), deal-killer (What would have to be true for this not to move forward?).
- `commitment-verifier` noted for post-meeting use — any "yes" from the VP in this meeting should be analyzed before treating as a commitment.
**Output:** `negotiation-one-sheet.md` with Black Swan hypotheses in Section VI, stakeholder-focused questions dominating Section IV, labels targeting the political risk of public backing, and a post-meeting action item to run commitment-verifier on the VP's responses.
---
## References
| File | Contents |
|------|----------|
| `references/one-sheet-framework.md` | Extended one sheet guidelines: when to use standalone vs. integrated mode, how to time-box preparation, how to adapt the one sheet for remote vs. in-person conversations, one sheet anti-patterns |
| `references/level0-skill-integration.md` | How to pass context between Level 0 skill artifacts and the one sheet: file naming conventions, copy-paste workflows, which sections each artifact feeds |
| `references/system1-system2-prep.md` | Why emotional preparation beats rational scripting: the influence arc (Active Listening → Empathy → Rapport → Influence → Behavioral Change), System 1 / System 2 neuroscience, and the ordering logic behind the one sheet's five sections |
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-counterpart-style-profiler`
- `clawhub install bookforge-accusation-audit-generator`
- `clawhub install bookforge-calibrated-questions-planner`
- `clawhub install bookforge-empathic-summary-planner`
- `clawhub install bookforge-ackerman-bargaining-planner`
- `clawhub install bookforge-black-swan-discovery`
- `clawhub install bookforge-commitment-verifier`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build an active listening summary and emotional validation script before any negotiation, sales conversation, difficult conversation, conflict resolution, or...
---
name: empathic-summary-planner
description: Build an active listening summary and emotional validation script before any negotiation, sales conversation, difficult conversation, conflict resolution, or persuasion attempt. Use this skill when you need to prepare a summary statement that triggers genuine agreement from a counterpart, create a listening script before a high-stakes conversation with an emotionally activated person, draft labels and paraphrasing language before a client call, build rapport with someone before making a request, prepare a stalled negotiation recovery plan using validation techniques, write an empathic opening before delivering difficult feedback, plan the listening sequence before a salary negotiation or sales discovery call, or generate a "that's right" trigger statement that signals real understanding — not just acknowledgment.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/empathic-summary-planner
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [2, 5]
tags: [negotiation, active-listening, emotional-validation, mirroring, labeling, summarizing, rapport, influence, tactical-empathy, that's-right]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Situation brief — a description of the conversation, the counterpart, what you want, and the emotional dynamics you expect"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Document set preferred: situation-brief.md, counterpart-profile.md, conversation-history.md. Works from a free-text description if no files provided."
discovery:
goal: "Produce empathic-summary-script.md: a situation summary, a label bank, and a 'that's right' trigger statement that validates the counterpart's worldview and moves the conversation toward influence."
tasks:
- "Identify the counterpart's emotional state, core concerns, and underlying needs from available context"
- "Select and sequence active listening techniques appropriate to the conversation stage"
- "Draft a paraphrase of the counterpart's situation in their own value frame"
- "Draft 2-3 emotion labels targeting their most activated feelings"
- "Combine paraphrase + labels into a summary statement designed to trigger genuine validation"
- "Flag 'you're right' warning patterns and provide detection criteria"
- "Produce the empathic-summary-script.md artifact"
audience: "salespeople, founders, managers, consultants, negotiators — anyone preparing for a conversation where trust and understanding must precede persuasion"
when_to_use: "Before any high-stakes conversation where the counterpart's emotional state is a barrier to rational agreement"
environment: "Document set (situation-brief.md, counterpart-profile.md, conversation-history.md) or free-text description"
quality: placeholder
---
# Empathic Summary Planner
## When to Use
You are preparing for a conversation — or you are in a stalled one — where the counterpart does not feel fully understood, and that gap is blocking progress. This skill applies when:
- Entering a negotiation where the counterpart has previously felt dismissed or steamrolled
- Preparing a sales discovery call where trust must be established before the pitch
- Recovering a stalled deal where no progress has been made despite multiple conversations
- Preparing for a difficult conversation where emotions are high (performance review, conflict, complaint)
- Delivering a request that requires the counterpart to feel understood first
- Trying to shift a counterpart who is agreeing superficially but not actually moving ("you're right" responses that lead nowhere)
The core mechanism: **genuine agreement — "that's right" — is triggered when a counterpart hears their own worldview reflected back accurately, with their emotional state named.** This requires combining paraphrase (their situation in their words) with labeling (their feelings named neutrally), producing a summary that validates their perspective without endorsing or rejecting their position.
This is not about being agreeable or soft. It is about advancing through the influence arc in the correct sequence. Influence attempts that skip the listening and rapport stages fail — not because they lack logic, but because the counterpart's emotional brain (System 1) has not been calmed, and an activated System 1 blocks access to rational System 2 reasoning.
Before starting, confirm you have:
- Enough context to describe the counterpart's situation from their perspective
- An understanding of what they care about, fear, or feel frustrated about
- A goal for what you want the conversation to achieve
---
## Context & Input Gathering
### Required Context
- **The situation:** What is the conversation about? What do you want to achieve?
- **The counterpart:** Who are they, what do they care about, what is at stake for them?
- **Their emotional state:** What are they feeling — frustration, distrust, skepticism, anxiety, urgency?
- **Conversation history:** What has been said before? What did they respond to, positively or negatively?
### Observable Context
If documents are provided, read them for:
- Statements the counterpart has made about their concerns, goals, or frustrations
- Prior agreements or concessions that have not moved the conversation forward
- Repeated phrases or themes in their language (these are primary paraphrase material)
- Signs of superficial agreement: "you're right," "sure," "I hear you" — without follow-through
### Default Assumptions
- If no counterpart profile is provided → assume a skeptical, experienced counterpart who has heard similar pitches or requests before
- If no conversation history is provided → assume no trust has been established and start from Active Listening
- If the counterpart's emotional state is unclear → default to naming uncertainty and frustration as the primary labels
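For an agent that gathers context programmatically, the three defaults above can be applied as a simple fill-in step. The sketch below is illustrative only: the `Context` class, its field names, and `apply_defaults` are hypothetical and not part of the skill.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Context:
    """Hypothetical container for the gathered context; None means 'not provided'."""
    counterpart_profile: Optional[str] = None
    conversation_history: Optional[str] = None
    emotional_state: Optional[List[str]] = None


def apply_defaults(ctx: Context) -> Context:
    """Return a copy of ctx with the documented default assumptions filled in."""
    return Context(
        # No profile: assume a skeptical, experienced counterpart.
        counterpart_profile=ctx.counterpart_profile
        or "skeptical, experienced; has heard similar pitches or requests before",
        # No history: assume no trust and start from Active Listening.
        conversation_history=ctx.conversation_history
        or "none; no trust established, start from Active Listening",
        # Unclear emotional state: default labels are uncertainty and frustration.
        emotional_state=ctx.emotional_state or ["uncertainty", "frustration"],
    )


filled = apply_defaults(Context(counterpart_profile="CFO, budget owner"))
print(filled.emotional_state)
```

Values the user did supply are kept untouched; only missing fields receive a default.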
### Sufficiency Check
Before generating the script, confirm you can answer: "What does this person believe about their situation, and what are they most worried about?" If you cannot answer that, gather more context or ask the user.
---
## Process
### Step 1: Map the Counterpart's Worldview
**ACTION:** Write a brief internal summary (not the output artifact) of the counterpart's situation from their perspective. Answer: What do they think is happening? What do they need? What are they afraid of? What have they said that matters most to them?
**WHY:** Active listening begins before the conversation. A negotiator who enters unprepared defaults to their own frame of reference and misses signals. Mapping the counterpart's worldview first forces perspective-taking: you cannot produce an accurate paraphrase if you are still inside your own assumptions. This step also surfaces hidden concerns — needs that the counterpart has not stated explicitly but which are evident from their behavior or history. The goal is not to be right about their worldview; it is to make an informed hypothesis that the conversation can correct.
**Format:** 3-5 bullet points written in the third person, from the counterpart's point of view:
- "They believe they have been waiting too long for a resolution."
- "They feel their previous concessions were not recognized or reciprocated."
- "They are most worried about [specific risk] going unaddressed."
---
### Step 2: Select the Active Listening Techniques for This Conversation
**ACTION:** Review the six active listening techniques and select which are appropriate given where the conversation is. Not all six are needed every time — the right combination depends on the counterpart's current emotional state and how much rapport already exists.
**WHY:** The techniques operate in progression. Early in a conversation (or with an emotionally activated counterpart), starting with paraphrasing or summarizing before the counterpart feels heard creates resistance. Pausing and mirroring create space for the counterpart to elaborate before you interpret. Labeling names what emerges from that elaboration. Paraphrasing confirms you understood the content. Summarizing combines both into a validation statement. Choosing the wrong technique for the stage is like trying to run before the counterpart is willing to walk. See `references/active-listening-techniques.md` for full mechanics of each technique.
**The six techniques, in progression order:**
| # | Technique | What It Does | When to Use |
|---|-----------|-------------|-------------|
| 1 | **Effective Pause** | Silence after their statement — signals you are processing, not rushing | Always; especially after they say something emotionally significant |
| 2 | **Minimal Encouragers** | Short verbal affirmations ("yes," "I see," "right") | During any listening period; keeps them talking without interrupting |
| 3 | **Verbal Mirroring** | Repeat their last 1-3 words as a question → silence | Early rapport-building; when you want elaboration without interpretation |
| 4 | **Labeling** | Name their emotional state: "It seems like..." | When you can observe a feeling — anger, hesitation, relief, frustration |
| 5 | **Paraphrasing** | Restate their meaning in your own words | When you want to confirm understanding of their position |
| 6 | **Summarizing** | Paraphrase + label combined into one statement | When you are ready to trigger "that's right" — the full validation |
**Selection rule:** Match the technique to the stage. If no prior rapport exists, start with pauses and mirroring. If partial rapport exists, move to labeling and paraphrasing. Only produce the summary when you have enough context to reflect their full perspective accurately.
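The selection rule can be read as a small lookup over the progression table. The following is a minimal sketch, assuming rapport is reduced to two hypothetical levels (`"none"` and `"partial"`) and that summarizing also requires at least partial rapport; both simplifications are assumptions, not part of the skill.

```python
from typing import List

# The six techniques in the progression order given in the table above.
TECHNIQUES = [
    "Effective Pause",
    "Minimal Encouragers",
    "Verbal Mirroring",
    "Labeling",
    "Paraphrasing",
    "Summarizing",
]


def select_techniques(rapport: str, has_full_context: bool = False) -> List[str]:
    """Pick techniques for the current stage (rapport levels are hypothetical)."""
    if rapport not in ("none", "partial"):
        raise ValueError("rapport must be 'none' or 'partial'")
    if rapport == "none":
        # No prior rapport: pauses, encouragers, and mirroring only.
        return TECHNIQUES[:3]
    # Partial rapport: add labeling and paraphrasing.
    selected = TECHNIQUES[:5]
    if has_full_context:
        # Summarize only once the full perspective can be reflected accurately.
        selected = TECHNIQUES[:6]
    return selected


print(select_techniques("partial", has_full_context=True))
```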
---
### Step 3: Draft the Verbal Mirror Script
**ACTION:** Write 2-3 verbal mirroring prompts based on key statements the counterpart has made or is likely to make. A mirror repeats the last 1-3 words of their statement, inflected slightly upward as a question. Follow each with a pause of at least 4 seconds.
**WHY:** Mirroring (repeating the last 1-3 words as a question) works because it extracts information without imposing a frame. Unlike direct questions which reveal what YOU think is important, a mirror invites the counterpart to elaborate on what THEY think is important. Under uncertainty — when you don't yet know what matters most to them — mirrors are safer than questions because they cannot lead the witness. When a counterpart elaborates in response to a mirror, they are revealing more than they planned — often the concern beneath the stated position. The upward inflection signals a question rather than a challenge. The silence that follows is not empty: it is pressure. The counterpart's brain processes the mirror as an implicit "tell me more," and fills the gap.
**Mirroring formula:**
- Counterpart says: "I've been waiting three weeks for a decision on this."
- Mirror: "Three weeks for a decision?" [Pause 4+ seconds]
- Counterpart: "Yes, and every time I follow up I get a different answer..." [elaboration revealed]
**Voice guidance:** Use the FM DJ voice — calm, measured, slightly slower than normal speech. Not flat or robotic, but unhurried. Never mirror with enthusiasm or urgency; it signals that you are pushing, not listening.
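A literal version of the mirroring formula is easy to sketch in code. The helper below (`mirror` is a hypothetical name) takes the last 1-3 words of a statement and turns them into a question. It is a simplification: the worked example above mirrors the salient phrase ("Three weeks for a decision?") rather than the literal final words, and that judgment still belongs to the person drafting the script.

```python
import string


def mirror(statement: str, n_words: int = 3) -> str:
    """Repeat the last n_words (1-3) of a statement as a question."""
    if not 1 <= n_words <= 3:
        raise ValueError("a mirror repeats the last 1-3 words")
    # Drop surrounding whitespace and trailing punctuation before splitting.
    words = statement.strip().rstrip(string.punctuation).split()
    if not words:
        raise ValueError("statement has no words to mirror")
    tail = " ".join(words[-n_words:])
    # Capitalize the first letter and end with a question mark.
    return tail[0].upper() + tail[1:] + "?"


print(mirror("I've been waiting three weeks for a decision on this."))
```

The pause of 4+ seconds that follows each mirror is a delivery instruction and is not modeled here.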
---
### Step 4: Draft the Label Bank
**ACTION:** Write 2-3 emotion labels targeting the counterpart's most activated feelings. Use the "It seems like..." / "It sounds like..." formula. Do not use "I feel" or "I think you feel." Sequence labels from most emotionally charged to least.
**WHY:** Labeling names a feeling rather than arguing with a position. When the counterpart's emotional state is unaddressed, it occupies their attention — they cannot engage with rational arguments because the emotional signal is still firing. Naming the feeling accurately turns the amygdala response down: research in affect labeling (Matthew Lieberman, UCLA) shows that verbalizing an emotional state reduces neural activation in the areas associated with threat response. The mechanism is not sympathy — you do not need to feel what they feel. You only need to demonstrate that you have accurately observed it. If the label is wrong, the counterpart corrects you, which opens dialogue. If it is right, they confirm it, and their guard comes down.
**Label formula:**
- "It seems like you've been carrying this situation longer than you expected."
- "It sounds like the timeline has been the most frustrating part."
- "It seems like you're not sure whether anything will actually change."
**Do not:** Move to paraphrasing or summarizing before labeling. A summary that skips the emotional acknowledgment is perceived as a pitch, not as listening.
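The label-bank rules (2-3 labels, approved stems only, strongest first) can be sketched as a small builder. The intensity scores are a hypothetical input the drafter assigns by judgment; `build_label_bank` and the alternation of the two stems are illustrative choices, not prescribed by the skill.

```python
from typing import List, Tuple

STEMS = ("It seems like", "It sounds like")


def build_label_bank(observations: List[Tuple[str, int]]) -> List[str]:
    """Turn (observation, intensity) pairs into labels, most charged first."""
    if not 2 <= len(observations) <= 3:
        raise ValueError("a label bank holds 2-3 labels")
    for clause, _ in observations:
        # Labels describe the counterpart; first-person framing is not allowed.
        if clause.lower().startswith(("i feel", "i think")):
            raise ValueError("do not use 'I feel' or 'I think you feel'")
    ranked = sorted(observations, key=lambda o: o[1], reverse=True)
    return [
        f"{STEMS[i % 2]} {clause.rstrip('.')}."
        for i, (clause, _) in enumerate(ranked)
    ]


bank = build_label_bank([
    ("the timeline has been the most frustrating part", 6),
    ("you've been carrying this situation longer than you expected", 9),
    ("you're not sure whether anything will actually change", 4),
])
for label in bank:
    print(label)
```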
---
### Step 5: Write the Paraphrase
**ACTION:** Restate the counterpart's situation in your own words — content only, no feelings. The paraphrase should be accurate enough that a neutral observer would recognize it as a fair representation of what the counterpart has communicated.
**WHY:** Paraphrasing proves comprehension. Labeling proves emotional attunement. Together, they form the components of the summary. Paraphrasing alone, without labeling, can feel clinical — "you understood the facts but not the feeling." Labeling alone, without paraphrasing, can feel vague — "you noticed the emotion but not the substance." The combination is what produces genuine validation.
**Paraphrase format:** 2-4 sentences in your own words that capture:
- Their stated situation or position
- What they care most about
- What has not been resolved from their perspective
---
### Step 6: Write the Summary Statement (the "That's Right" Trigger)
**ACTION:** Combine the paraphrase (their situation) and the strongest label (their feeling) into a single summary statement. Read it aloud (or internally) and ask: "Would a reasonable counterpart hear this and feel accurately understood — not agreed with, but understood?"
**WHY:** "That's right" is different from "you're right." "That's right" means: "You have accurately described my situation and how I feel about it." It is a genuine validation signal — the counterpart is not conceding to you, they are confirming that you see them. This is the trigger for the rapport stage of the influence arc. Until the counterpart feels genuinely understood, they are in a defensive posture. Once they reach "that's right," their emotional guard drops and influence becomes possible. The summary is the most powerful tool in the listening arsenal precisely because it is the synthesis: it shows the counterpart that you have both heard the content and felt the weight of their situation.
**Warning — the "you're right" trap:** "You're right" is the worst possible response to your summary. It means the counterpart is dismissing you, not confirming genuine understanding. Signs of a "you're right" dismissal:
- The conversation does not change after they say it
- They immediately return to their prior position
- Their tone does not shift — the statement is flat, not relieved
- They say it after you have made a lengthy argument (they are ending the argument, not agreeing with it)
If you receive "you're right" without behavioral change, do not proceed. Rebuild rapport with additional mirroring and labeling before attempting the summary again.
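The four warning signs can be treated as a checklist. The sketch below uses hypothetical names (`ResponseSignals`, `looks_like_dismissal`, `next_move`) and makes a conservative assumption that any single warning sign counts as a dismissal; a real conversation calls for judgment rather than a boolean rule.

```python
from dataclasses import dataclass


@dataclass
class ResponseSignals:
    """Observations made after the counterpart responds to the summary."""
    conversation_changed: bool
    returned_to_prior_position: bool
    tone_shifted: bool
    followed_lengthy_argument: bool


def looks_like_dismissal(s: ResponseSignals) -> bool:
    """True if any of the four listed warning signs is present (assumption)."""
    return any([
        not s.conversation_changed,
        s.returned_to_prior_position,
        not s.tone_shifted,
        s.followed_lengthy_argument,
    ])


def next_move(s: ResponseSignals) -> str:
    """Map the signals to the action recommended in the text."""
    if looks_like_dismissal(s):
        return "rebuild rapport: more mirroring and labeling"
    return "proceed to influence"


print(next_move(ResponseSignals(False, True, False, True)))
```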
**Summary template:**
> "It sounds like [paraphrase of their situation — 2-3 sentences]. And on top of that, it seems like [strongest label — their emotional experience]."
---
### Step 7: Write the Empathic Summary Script
**ACTION:** Produce the `empathic-summary-script.md` artifact. This is the deliverable — a ready-to-use document the user can review, rehearse, or adapt for the conversation.
**WHY:** Writing the script makes the listening sequence a deliberate plan rather than improvisation. Under pressure, people default to pushing their own agenda. Having the mirroring prompts, labels, and summary written out prevents this. It also allows the user to identify gaps: if any component of the summary feels forced or inaccurate, that is a signal that more context about the counterpart is needed before the conversation.
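When assembling the script, the summary statement from Step 6 can be filled mechanically from the paraphrase and the strongest label. This is a minimal sketch; `build_summary` is a hypothetical helper, and it assumes both inputs are written as clauses that read naturally after "It sounds like" and "it seems like".

```python
def build_summary(paraphrase: str, strongest_label: str) -> str:
    """Combine a paraphrase and the strongest label using the Step 6 template."""
    situation = paraphrase.strip().rstrip(".")
    feeling = strongest_label.strip().rstrip(".")
    if not situation or not feeling:
        raise ValueError("both a paraphrase and a label are required")
    return (
        f"It sounds like {situation}. "
        f"And on top of that, it seems like {feeling}."
    )


print(build_summary(
    "you have been waiting three weeks for a decision",
    "the timeline has been the most frustrating part",
))
```

If either input feels forced when dropped into the template, that is the gap signal described above: more context about the counterpart is needed before the conversation.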
---
## Inputs
| Input | Required | Format |
|---|---|---|
| Situation description | Yes | Any — markdown, plain text, verbal description |
| Goal for the conversation | Yes | One sentence minimum |
| Counterpart description | Yes | Role, stakes, emotional state, prior history if available |
| Conversation history or prior statements | Strongly recommended | Markdown or plain text |
| Counterpart's key phrases or repeated concerns | Optional | Any |
---
## Outputs
Produce `empathic-summary-script.md` with the following structure:
```markdown
# Empathic Summary Script
**Situation:** [One-sentence description]
**Counterpart:** [Who they are, what they care about, what they feel]
**Conversation Goal:** [What you want to achieve]
**Current Stage in Influence Arc:** [Active Listening / Empathy / Rapport / Influence]
---
## Counterpart's Worldview (Internal Map — Do Not Deliver)
- [What they believe about the situation]
- [What they need]
- [What they are afraid of]
- [What has frustrated them most]
---
## Verbal Mirrors (2-3)
**Mirror 1:** "[Last 1-3 words of a key statement]?"
*Pause. 4+ seconds. Do not fill the silence.*
**Mirror 2:** "[Last 1-3 words]?"
*Pause. 4+ seconds.*
[Mirror 3 if applicable]
---
## Label Bank (2-3 Labels, Sequenced Strongest to Lightest)
**Label 1 (Most Charged):**
> "It seems like [strongest emotional observation]."
*Pause. 3-5 seconds.*
**Label 2:**
> "It sounds like [second emotional observation]."
*Pause. 3-5 seconds.*
[Label 3 if applicable]
---
## Paraphrase
> "[Their situation in your words — 2-4 sentences capturing what they care about and what is unresolved]"
---
## Summary Statement ("That's Right" Trigger)
> "It sounds like [paraphrase of situation]. And on top of that, it seems like [strongest label]."
*Deliver slowly. Pause after. Wait for a full response — do not rush to your next point.*
---
## "That's Right" vs "You're Right" — Detection Guide
| Signal | "That's Right" (genuine) | "You're Right" (dismissal) |
|--------|--------------------------|---------------------------|
| Tone | Relieved, engaged | Flat, polite |
| What follows | New information, elaboration | Return to prior position |
| Body language | Relaxed, leans in | Unchanged or closed |
| Next move | Safe to move to influence | Rebuild rapport — more mirroring |
---
## Notes
- If you receive "that's right" → proceed to influence stage (share your perspective, make your ask)
- If you receive "you're right" → do not proceed; return to mirroring and labeling
- If the counterpart corrects a label → accept the correction and update your script: "You're right — help me understand what you're actually feeling about this"
- Do not skip to the summary before completing at least one mirroring exchange and one label delivery
```
---
## Key Principles
- **Influence requires rapport. Rapport requires empathy. Empathy requires listening.** This progression follows the FBI's Behavioral Change Stairway Model (BCSM): Active Listening → Empathy → Rapport → Influence → Behavioral Change. Each stage gates the next — attempting to influence before establishing rapport through empathic listening causes resistance, not compliance. The "That's right" moment signals the transition from Rapport to Influence: the counterpart has felt genuinely understood, and is now open to considering new possibilities. Skipping stages does not accelerate the process — it resets it. A counterpart who does not feel understood cannot be persuaded; they can only be pressured, and pressure produces resistance or false agreement.
*WHY:* The BCSM was developed by the FBI's Crisis Negotiation Unit to explain why direct influence attempts fail in high-stakes situations. The stages are not philosophical — they describe a neurological sequence. The emotional brain (System 1) must be calmed before the rational brain (System 2) can engage. Labels and mirroring target System 1 directly. Without that, even perfectly logical arguments cannot be received.
- **"That's right" is the goal, not "yes."** A "yes" can be counterfeit (to escape the conversation), confirmatory (acknowledgment without commitment), or genuine. "That's right" is nearly always genuine — it means the counterpart has heard themselves accurately reflected back and confirmed it. It signals that rapport has been established and the conversation is ready for influence.
*WHY:* The distinction matters because pursuing "yes" prematurely triggers the "you're right" pattern — the counterpart gives you the word you want while thinking something entirely different. "That's right" is harder to fake because it requires the counterpart to recognize their own worldview in your words.
- **Verbal mirroring extracts information without interpretation.** Repeating the last 1-3 words of a statement — with a slight upward inflection and then silence — causes the counterpart to elaborate. They fill the silence with what they actually think. This is more valuable than any question you could ask, because the counterpart is now revealing their reasoning rather than responding to your frame.
*WHY:* Questions impose a frame (they define what kinds of answers are relevant). Mirrors impose no frame — they simply signal "tell me more about that." Under uncertainty, this is the safest tool: you cannot misinterpret what you have not yet heard.
- **Labels name the feeling; they do not agree with the position.** "It seems like this has been a frustrating process" does not mean you think the process was unfair, or that you caused the frustration, or that you will fix it. It means you observed a feeling and named it. This distinction protects you legally and professionally while still delivering the emotional validation the counterpart needs.
*WHY:* Counterparts frequently mistake emotional validation for concession. They are not the same. You can fully acknowledge that a counterpart is frustrated without agreeing that their position is correct or their demand is reasonable. Separating these two things — and doing so cleanly — is what makes labeling a precision tool rather than appeasement.
- **The FM DJ voice is a physiological prerequisite, not a stylistic choice.** Deliver labels and summaries in the "late-night FM DJ voice": calm, slow, with a downward-inflecting tone. Downward inflection signals certainty and safety; upward inflection (question tone) invites challenge. The calm voice activates the counterpart's parasympathetic nervous system, reducing their threat response, and that calming has to happen before an emotional label can land. In Mehrabian's communication model, tone of voice is credited with about 38% of the emotional signal in spoken communication. The same label that reads as understanding when delivered slowly reads as confrontation when delivered in an urgent, high-energy voice, and triggers more defensiveness, not less.
*WHY:* The counterpart's threat-detection system is sensitive to vocal signals before it processes content. A calm voice signals that the speaker is not threatened and is not threatening. This is why the Late-Night FM DJ voice is the default for de-escalation: it is physiologically safe, and physiological safety is the container that makes labels and mirrors land as listening rather than interrogation.
- **A corrected label is still a successful label.** If you label incorrectly, the counterpart will correct you. "No, it's not that I'm frustrated — I'm just exhausted by how long this has taken." This is productive: you now have more accurate information, the counterpart has expressed themselves, and the act of correcting you has engaged them in the conversation. There is no way to lose on a label.
*WHY:* Incorrect labels work because the mechanism is not accuracy — it is attention. By attempting a label, you signal that you are trying to understand. The counterpart's response to the attempt (whether confirmation or correction) always provides more information than silence.
---
## Examples
### Example 1: Stalled Sales Deal (Abu Sayyaf / Schilling kidnapping parallel)
**Scenario:** A software vendor has been in negotiations with a procurement team for four months. The procurement lead has been unresponsive for three weeks after verbally agreeing to terms. The vendor needs to restart the conversation without creating pressure.
**Trigger:** "This deal has been stalled for three weeks. The procurement lead agreed verbally but now won't respond. Help me plan how to restart the conversation without pushing them away."
**Process:**
- Counterpart worldview mapped: they likely have internal pressures (budget freeze, leadership review, other priorities) that have made the agreement harder to execute than expected. They may also feel embarrassed that they over-committed.
- Technique selection: start with mirroring (allow them to explain the delay), then labeling (name the difficulty they are in), then summary if rapport is established
- Labels drafted targeting the most likely feelings: pressure, embarrassment, uncertainty
- Summary written to reflect their constraints without applying pressure for commitment
**Output (`empathic-summary-script.md` excerpt):**
```
Mirror: "Three weeks since we last spoke?" [Pause 4+ seconds]
Label 1: "It seems like things have gotten more complicated on your end since we last talked."
[Pause 3-5 seconds]
Label 2: "It sounds like you might be in a situation where moving forward isn't fully in your hands right now."
[Pause 3-5 seconds]
Summary: "It sounds like this decision has more moving parts than it did when we first discussed it, and that some of those parts aren't resolved yet. And on top of that, it seems like the pressure to get this right might be making it harder to move at all."
[Wait for response. If "that's right" → proceed to: "I'd like to understand what would make it easier for you to move forward. What do you need from me?"]
[If "you're right" → mirror the last few words of their response and restart labeling]
```
---
### Example 2: Pharma Sales Representative
**Scenario:** A pharmaceutical sales rep needs to get a doctor to consider prescribing a new medication. The doctor has been dismissive in prior visits — polite but not engaging. The rep suspects the doctor is skeptical of the efficacy data and tired of sales calls.
**Trigger:** "I have a 10-minute appointment with a doctor who has seen me three times and hasn't changed anything. Help me plan an approach that actually opens a conversation."
**Process:**
- Counterpart worldview: the doctor has heard the pitch and does not believe it adds enough value over their current prescribing habits. They are protecting their time and may be skeptical of vendor-provided data.
- Technique selection: open with a mirror on their skepticism (do not pitch), label the frustration with sales calls, paraphrase their actual clinical situation
- "That's right" trigger: a summary that reflects their patient-care goals, not the product features
**Output (`empathic-summary-script.md` excerpt):**
```
Opening (do not pitch): "I know you've heard a lot about this medication already."
Mirror: "Heard a lot already?" [Pause 4+ seconds — let them fill it]
Label 1: "It seems like a lot of these conversations end up being variations of the same pitch."
[Pause 3-5 seconds]
Label 2: "It sounds like what would actually matter to you is whether this changes outcomes for a specific type of patient — not just feature comparisons."
[Pause 3-5 seconds]
Summary: "It sounds like you've already evaluated a lot of options and you're not looking for another vendor's take on their own data. And on top of that, it seems like the only thing worth your time is something that addresses a patient population that isn't currently being served well."
[Wait for "that's right" before sharing any clinical data]
```
---
### Example 3: Chase Manhattan Bank Robbery — Verbal Mirroring Under Pressure
**Scenario:** A negotiator is dealing with a bank robber who claims he is not in charge and keeps deflecting decisions to "the others." The negotiator needs to identify who is actually in control without confronting the deception directly.
**Trigger:** "Help me plan a mirroring sequence to get the bank robber to reveal the actual decision-maker without accusing him of lying."
**Process:**
- Counterpart worldview: the robber is managing a deception (claiming multiple accomplices and shared control to confuse negotiators and buy time). He is calm and controlled — indicating he has a plan.
- Technique selection: mirroring only at this stage — no labels or summary. The goal is information extraction, not rapport.
- Mirrors targeted at the "others" language to reveal inconsistencies
**Output (`empathic-summary-script.md` excerpt):**
```
Robber: "I have to check with the others before we can agree to anything."
Mirror: "Check with the others?" [Pause 4+ seconds]
Robber: "Yeah, they're in charge of the timeline."
Mirror: "In charge of the timeline?" [Pause 4+ seconds]
Robber: "Look, I'm just the one talking to you — the decisions aren't mine."
Mirror: "Not yours to make?" [Pause 4+ seconds]
[Each mirror causes the robber to elaborate, revealing contradictions in the "multiple decision-makers" story. The negotiator collects information without confronting the lie — confrontation would end dialogue.]
Note: Do not advance to labeling or summary until the mirroring phase has produced enough information to form an accurate worldview map.
```
---
## References
- [references/active-listening-techniques.md](references/active-listening-techniques.md) — Full mechanics of all 6 techniques: rules, edge cases, common errors, examples for each
- [references/bcsm-influence-arc.md](references/bcsm-influence-arc.md) — Behavioral Change Stairway Model: 5 stages, gating conditions, failure modes at each stage
- [references/three-voices-guide.md](references/three-voices-guide.md) — FM DJ voice, positive/playful, and assertive: selection rules, inflection guidance, when each fails
- [references/thats-right-vs-youre-right.md](references/thats-right-vs-youre-right.md) — Detection criteria, recovery steps, case study examples
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
---
name: counterpart-style-profiler
description: |
Profile a negotiation counterpart's communication style and generate a tailored adaptation strategy. Use when asking "how should I approach this person?", "what communication style does my counterpart prefer?", "why is this negotiation not working?", "how do I adapt to this person's personality in negotiation?", or "what type of negotiator am I dealing with?" Also use for: diagnosing why previous conversations stalled or backfired; identifying whether warmth, data, or directness will land better; assessing self-type to avoid projecting your preferences onto the counterpart; preparing counterpart profiles for a negotiation one-sheet. Classifies counterparts into one of three communication archetypes (Analyst, Accommodator, Assertive) using observable behavioral signals, then produces a specific adaptation strategy covering communication tempo, information delivery, relationship style, and risk areas. Works from conversation history, emails, meeting notes, colleague descriptions, or any observable behavior data.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/counterpart-style-profiler
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [9]
tags: [negotiation, counterpart-profiling, communication-style, negotiator-types, analyst, accommodator, assertive, personality-assessment, adaptation-strategy, bargaining, conflict-resolution, sales, stakeholder-management]
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Observable counterpart behavior — conversation history, emails, meeting notes, colleague descriptions, or a written situation brief describing how the counterpart has communicated"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works from pasted text, document files, or a verbal description of counterpart behavior."
discovery:
goal: "Classify a counterpart's communication archetype and produce an actionable adaptation strategy that changes how you communicate — tempo, information density, relationship-building approach, and risk areas to avoid"
tasks:
- "Collect behavioral signals from counterpart history"
- "Score counterpart against three archetype profiles"
- "Identify the dominant archetype (and any secondary blend)"
- "Assess user's own type to flag projection risk"
- "Produce a counterpart-profile.md with classification, adaptation strategy, and risk areas"
audience:
roles: ["salesperson", "founder", "manager", "consultant", "recruiter", "lawyer", "freelancer", "anyone-who-negotiates"]
experience: "any — no formal negotiation training required"
triggers:
- "User describes a counterpart and asks how to approach them"
- "User reports that their usual style isn't working with a specific person"
- "User is preparing a negotiation and wants to tailor their communication approach"
- "User wants to understand why a previous negotiation stalled or went sideways"
- "User asks whether to lead with data, relationship-building, or directness"
not_for:
- "Generating the full negotiation one-sheet — use negotiation-one-sheet-generator"
- "Designing calibrated questions — use calibrated-questions-planner"
- "Planning the Ackerman offer sequence — use ackerman-bargaining-planner"
- "Discovering hidden constraints or unknown unknowns — use black-swan-discovery"
---
# Counterpart Style Profiler
## When to Use
You are preparing for a negotiation, sales conversation, or difficult discussion and need to tailor your communication approach to the specific person you are dealing with. You have some observable data about how they communicate — emails, meeting behavior, prior conversation history, or secondhand descriptions — and want to translate that into a concrete adaptation strategy before the conversation begins.
This skill works in two scenarios:
1. **Pre-negotiation preparation:** You have some information about the counterpart and want a communication blueprint before first contact or before a critical conversation.
2. **Mid-negotiation diagnosis:** A conversation has stalled, or your usual approach isn't working, and you need to identify why and adjust.
**Do not use this skill if:** You have no information about the counterpart whatsoever. The profile requires at least a few observable signals to be reliable. When you must go in blind, skip classification and adopt the Analyst archetype defaults (data-driven, slow pace, minimal warmth) as a low-risk starting posture.
---
## Context & Input Gathering
### Required
- **Observable behavior:** At least 2–3 signals about how the counterpart communicates. Examples:
- Written: email tone, response time, level of detail, use of pleasantries
- Spoken: speaking pace, tendency to interrupt, whether they ask questions or make statements
- Meeting behavior: how they react to silence, whether they bring data or tell stories
- Colleague descriptions: "He's all business," "She wants to be your friend," "He always needs to think before answering"
### Important
- **Context of the relationship:** First contact, existing working relationship, adversarial history?
- **Stakes and setting:** High-pressure price negotiation, collaborative partnership discussion, employment offer?
- **Your own communication style:** What type are you? This is needed to flag projection risk.
### Defaults
- If communication history is available, use it as primary evidence.
- If only secondhand description is available, treat the classification as provisional and note it in the output.
- If no information is available at all, use Analyst defaults as a conservative starting posture — explain this choice in the output.
### Sufficiency check
If you have fewer than 2 behavioral signals, ask the user to describe one specific interaction with the counterpart before proceeding. A single data point is not enough to classify reliably.
---
## Process
### Step 1: Extract Behavioral Signals
**Action:** Read all available input about the counterpart. List every observable behavioral signal that relates to communication style. Separate signals into three categories: *pace and silence*, *relationship and warmth*, *directness and information style*.
**WHY:** Different archetypes manifest across different behavioral dimensions. An Analyst's signature is silence and deliberate pace. An Accommodator's signature is warmth and relationship emphasis. An Assertive's signature is time pressure and directness. Grouping signals by dimension prevents one strong signal from drowning out contradictory evidence in other dimensions. Missing any category risks misclassification.
Signals to look for:
- **Pace and silence:** Response time to emails/messages; comfort with pauses in conversation; how often they rush to fill silence; whether they make fast or slow decisions
- **Relationship and warmth:** How much time they spend on small talk; whether they ask personal questions; how they open and close communications; whether they express appreciation or warmth spontaneously
- **Directness and information style:** Whether they lead with conclusions or build up to them; how they react to detailed data vs. summaries; whether they interrupt or let others finish; how they express disagreement
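
One way to keep the three categories separate is to tag each signal with its dimension as it is recorded, then check for empty categories before scoring. The sketch below is illustrative only: the function name `group_signals` and the dimension keys are hypothetical and not part of the skill.

```python
# Hypothetical helper: the names and keys here are illustrative assumptions.
DIMENSIONS = (
    "pace_and_silence",
    "relationship_and_warmth",
    "directness_and_information_style",
)


def group_signals(signals):
    """Group (dimension, observation) pairs and report empty dimensions."""
    grouped = {dimension: [] for dimension in DIMENSIONS}
    for dimension, observation in signals:
        if dimension not in grouped:
            raise ValueError(f"unknown dimension: {dimension!r}")
        grouped[dimension].append(observation)
    # An empty dimension is a gap in the evidence, not evidence of absence.
    missing = [d for d in DIMENSIONS if not grouped[d]]
    return grouped, missing


signals = [
    ("pace_and_silence", "took five days to answer the last email"),
    ("directness_and_information_style", "sends numbered, detailed questions"),
    ("directness_and_information_style", "opened with 'let's get right to the numbers'"),
]
grouped, missing = group_signals(signals)
print(missing)  # ['relationship_and_warmth']
```

A non-empty `missing` list is a prompt to ask the user about that dimension before moving on to Step 2.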
---
### Step 2: Score Against Three Archetype Profiles
**Action:** For each of the three archetypes below, score the counterpart 1–5 based on how well the behavioral signals match. A score of 5 means strong match across all three signal dimensions. A score of 1 means signals contradict this archetype.
**WHY:** Scoring all three archetypes — not just the most obvious one — prevents premature commitment to a misclassification. Many counterparts show partial signals from two archetypes (especially Analyst/Assertive blends). Scoring explicitly surfaces mixed profiles and prevents the most common error: projecting your own type onto the counterpart.
**Archetype profiles:**
**Analyst**
- *Pace and silence:* Responds slowly, needs time to think, comfortable with long silences — does not rush to fill them. May not respond to messages immediately. Makes decisions carefully and slowly.
- *Relationship and warmth:* Reserved about personal topics. Not cold, but does not initiate small talk or express warmth readily. Relationship is built through reliability and demonstrated competence, not social bonding.
- *Directness and information style:* Leads with data, facts, and logic. Distrusts assertions without supporting evidence. Asks probing questions. Prefers comprehensive information before committing. May seem skeptical or unemotional. Does not like surprises.
- *Vulnerability:* Silence is thinking time, not a signal to be pushed. Rushing an Analyst or providing insufficient data produces resistance. Surprises feel like ambushes and create distrust.
- *Self-type note:* If you are an Analyst, watch for projecting your preference for data and deliberation onto counterparts who communicate very differently.
**Accommodator**
- *Pace and silence:* Conversationally active — tends to fill silences, enjoys dialogue, will often prolong conversations. Moves at a relationship pace rather than a task pace.
- *Relationship and warmth:* Leads with relationship. Uses first names, expresses genuine interest in your life and situation, sends friendly messages, may bring up personal topics before business. The relationship is the goal, not just a means.
- *Directness and information style:* Often agrees verbally without following through. Says "yes" or "sounds good" easily — watch for confirmation without commitment. Dislikes conflict and will often signal agreement to avoid tension, even if they haven't actually committed. May overpromise.
- *Vulnerability:* Their verbal agreement is often aspirational, not contractual. Pushing an Accommodator too hard or fast damages the relationship, which is their primary concern. Silence from an Accommodator is unusual and signals something is wrong.
- *Self-type note:* If you are an Accommodator, you may read warmth as commitment and miss the gap between their verbal enthusiasm and actual follow-through.
**Assertive**
- *Pace and silence:* Fast communicators. Gets to the point quickly. May interrupt. Treats time as money — long preambles or relationship-building before getting to the point feel like waste. Responds to emails quickly and expects quick responses.
- *Relationship and warmth:* Not relationship-first, but not hostile — they are direct because they respect your time. They value being heard and understood above being liked. Getting to "yes" is the goal; the relationship is a by-product.
- *Directness and information style:* States positions clearly and bluntly. Does not hedge. Reacts poorly to perceived evasion or over-explanation. Expects you to be equally direct. Needs to feel that they have been heard and understood before they will listen to your position.
- *Vulnerability:* They need to feel heard before they can hear you. Launching directly into your position or counter-argument before acknowledging theirs produces defensiveness and entrenchment. Warmth without substance reads as weakness or manipulation.
- *Self-type note:* If you are an Assertive, you may project bluntness as a universal virtue and come across as aggressive with Analysts and Accommodators.
**Scoring template:**
```
Analyst: [1–5] — [one-sentence rationale citing specific signals]
Accommodator: [1–5] — [one-sentence rationale citing specific signals]
Assertive: [1–5] — [one-sentence rationale citing specific signals]
```
---
### Step 3: Classify the Dominant Archetype
**Action:** Identify the highest-scoring archetype as the primary classification. If the top two scores are within 1 point of each other, classify as a blend and name both, highest first (e.g., "Analyst-Assertive blend"). If all three scores are similar, classify as adaptive and note that this counterpart may shift style by context.
**WHY:** A single dominant classification generates cleaner, more actionable adaptation guidance. Blends are real and common — acknowledging them prevents the user from forcing a misfit strategy. Adaptive counterparts require a different approach: observe first, adapt to the style they present in the specific conversation rather than relying on a fixed profile.
**Blend rules:**
- Analyst + Assertive: Data-driven and direct. Give them facts quickly, without preamble. They want comprehensive information delivered efficiently.
- Analyst + Accommodator: Methodical and warm. Build relationship slowly through competence and follow-through. Don't rush.
- Accommodator + Assertive: Relationship-forward but decisive. Lead with genuine warmth, then be direct about what you need. Watch for the Accommodator tendency to say yes without follow-through despite the Assertive pace.
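
The classification rule in this step can be sketched as a small function. This is a minimal illustration, not a prescribed implementation: `classify` and its parameters are hypothetical names, and treating "similar" as a spread of 1 point or less between the highest and lowest score is an assumption, since the skill only gives a number for the blend case.

```python
# Sketch of the Step 3 rule. The thresholds are parameters because the
# skill only says "within 1 point" for blends; reading "similar" as a
# max-min spread of 1 or less is an assumption made for this example.
ARCHETYPES = ("Analyst", "Accommodator", "Assertive")


def classify(scores, blend_margin=1, adaptive_spread=1):
    """Return 'Adaptive', '<A>-<B> blend', or a single archetype name."""
    if set(scores) != set(ARCHETYPES):
        raise ValueError("scores must cover exactly the three archetypes")
    if any(not 1 <= value <= 5 for value in scores.values()):
        raise ValueError("each score must be between 1 and 5")

    # Highest score first; ties keep the order given in `scores`.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (first, top), (second, runner_up), (_, lowest) = ranked

    if top - lowest <= adaptive_spread:
        return "Adaptive"
    if top - runner_up <= blend_margin:
        return f"{first}-{second} blend"
    return first


print(classify({"Analyst": 4, "Accommodator": 1, "Assertive": 3}))
# Analyst-Assertive blend
```

The adaptive check runs first so that three near-equal scores are not reported as a two-way blend.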
---
### Step 4: Assess User's Own Type and Flag Projection Risk
**Action:** Ask the user to identify their own communication style (or infer it from how they have described the situation and their frustration with the counterpart). Score the user's self-described type against the same archetypes. Compare user type to counterpart type. Flag any projection risks.
**WHY:** The most common counterpart profiling error is projecting your own style. Analysts assume everyone needs time and data. Accommodators assume everyone wants relationship before business. Assertives assume directness is universally respected. Identifying the user's own type enables a specific warning: "You are an Assertive; your counterpart is an Analyst. Your instinct to get to the point quickly will feel like pressure to them. Pause and give them time to process."
**Projection risk table:**
| Your Type | Their Type | Risk |
|-----------|-----------|------|
| Analyst | Accommodator | You may neglect relationship-building that they require before they trust your data |
| Analyst | Assertive | You may over-explain; they want conclusions first, supporting data on request |
| Accommodator | Analyst | Your warmth may feel irrelevant; they need evidence, not rapport |
| Accommodator | Assertive | Your tendency toward lengthy relationship conversations will frustrate them |
| Assertive | Analyst | Your pace will feel like pressure; they will shut down or become evasive |
| Assertive | Accommodator | Your bluntness may damage the relationship they need before saying yes |
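
The table above maps directly onto a lookup keyed by the pair (your type, their type). The sketch below assumes that reading; `projection_risk` is a hypothetical name, and returning `None` for same-type or unknown pairs is a choice made for this example because the table has no row for them.

```python
# Risk text copied from the projection risk table; the lookup itself is
# an illustrative assumption about how an agent might apply it.
PROJECTION_RISKS = {
    ("Analyst", "Accommodator"): "You may neglect relationship-building that they require before they trust your data",
    ("Analyst", "Assertive"): "You may over-explain; they want conclusions first, supporting data on request",
    ("Accommodator", "Analyst"): "Your warmth may feel irrelevant; they need evidence, not rapport",
    ("Accommodator", "Assertive"): "Your tendency toward lengthy relationship conversations will frustrate them",
    ("Assertive", "Analyst"): "Your pace will feel like pressure; they will shut down or become evasive",
    ("Assertive", "Accommodator"): "Your bluntness may damage the relationship they need before saying yes",
}


def projection_risk(your_type, their_type):
    """Return the table's risk text, or None when the table has no row."""
    # Same-type pairs and "Unknown" self-types are not in the table.
    return PROJECTION_RISKS.get((your_type, their_type))


print(projection_risk("Assertive", "Analyst"))
# Your pace will feel like pressure; they will shut down or become evasive
```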
---
### Step 5: Generate Adaptation Strategy
**Action:** Produce a concrete adaptation strategy for the classified counterpart type. The strategy must cover four areas: communication tempo, information delivery, relationship approach, and risk areas to avoid. Include 2–3 specific do/don't examples for the classified type.
**WHY:** Generic advice ("communicate clearly," "build rapport") does not change behavior. Specific, type-matched rules do. An Analyst needs to hear "send supporting data before your ask, not after" — not "be thorough." An Assertive needs to hear "lead with a label that shows you understand their position before making your counter" — not "be direct." The adaptation strategy should be specific enough that the user can change what they say in the next conversation.
**Strategy template by type:**
*Analyst adaptation:*
- Tempo: Slow down. Build in response time. Do not follow up immediately after sending information.
- Information delivery: Front-load data and evidence. Provide supporting materials before or with your ask, not after they push back. Do not surprise them with new information mid-conversation.
- Relationship: Build trust through reliability and accuracy, not warmth. Follow through on every commitment you make.
- Risk areas: Do not interpret silence as resistance or discomfort — it is processing time. Do not pressure for a decision before they are ready. Avoid vague assertions ("this is a great deal") without supporting evidence.
*Accommodator adaptation:*
- Tempo: Match their conversational pace. Allow for relationship talk before business. Do not skip to the ask.
- Information delivery: Frame information in terms of relationship impact ("I want to make sure this works for both of us") and how it helps them achieve their goals. They respond to being seen and understood.
- Relationship: Invest in genuine relationship-building. Ask about them. Remember prior personal details. Express appreciation explicitly.
- Risk areas: Treat verbal agreement as a starting point, not a commitment — verify with specific follow-up ("So just to confirm, you'll have this done by Friday?"). Do not misread warmth as a "yes." Silence from an Accommodator is unusual and warrants gentle inquiry.
*Assertive adaptation:*
- Tempo: Match their pace. Get to the point. Do not open with pleasantries if they haven't. Respond to messages promptly.
- Information delivery: Lead with your conclusion, then offer supporting data if they ask. Do not build up slowly to a point — state it.
- Relationship: Earn their respect by being direct and competent, not by being warm. They will respect you more for pushing back on something wrong than for agreeing.
- Risk areas: Do not launch into your position before acknowledging theirs — they need to feel heard first. Use a brief label before making a counter-offer ("It sounds like timeline is the main concern — here's what I can do on that"). Avoid anything that reads as weakness: excessive hedging, premature concessions, apologetic framing.
---
### Step 6: Write the Output Artifact
**Action:** Produce a `counterpart-profile.md` document containing the full classification, scoring, adaptation strategy, projection risk note, and the user's self-type assessment.
**WHY:** The output artifact must be written to disk, not just explained in the conversation. It serves as a reference document the user reads immediately before a conversation. A mental note of "they're probably an Analyst" will not survive the pressure of a live negotiation. A one-page document they review beforehand will.
---
## Inputs / Outputs
### Inputs
- Behavioral observations about the counterpart (required — at minimum 2–3 signals)
- Communication history: emails, meeting notes, conversation transcripts (optional but improves accuracy)
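- User's own communication style / self-assessment (optional as an input: if it is not provided, infer it from how the user describes the situation, because the projection risk check in Step 4 still runs)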
- Relationship context: first contact, ongoing relationship, adversarial history (optional)
### Outputs
**Primary output:** `counterpart-profile.md`
```markdown
# Counterpart Profile: [Name / Role]
**Date:** [date]
**Situation:** [brief context — what negotiation, what's at stake]
---
## Classification
**Primary Type:** [Analyst / Accommodator / Assertive]
**Confidence:** [High / Medium / Provisional — note if based on limited signals]
**Blend:** [If applicable — e.g., "Analyst-Assertive blend"]
### Scoring
| Archetype | Score (1–5) | Key Signals |
|-----------|------------|-------------|
| Analyst | [X] | [signals cited] |
| Accommodator | [X] | [signals cited] |
| Assertive | [X] | [signals cited] |
---
## Behavioral Signals Observed
- [Signal 1 — with source: email on [date], meeting on [date], etc.]
- [Signal 2]
- [Signal 3]
- [Add as many as available]
---
## Adaptation Strategy
### Communication Tempo
[Specific guidance — slow down / match pace / etc.]
### Information Delivery
[What to lead with, what to defer, how to structure asks]
### Relationship Approach
[How to build trust with this specific type]
### Language to Use
- "[Example phrase or framing that works for this type]"
- "[Example phrase or framing that works for this type]"
### Language to Avoid
- "[Example phrase or pattern that backfires for this type]"
- "[Example phrase or pattern that backfires for this type]"
---
## Risk Areas
- [Risk 1 — specific to this type and this situation]
- [Risk 2]
- [Risk 3 — self-type projection risk if applicable]
---
## Self-Type Assessment
**Your Type:** [Analyst / Accommodator / Assertive / Unknown]
**Projection Risk:** [Specific warning based on your type vs. their type]
---
## Next Steps
- [ ] Review this profile immediately before the conversation
- [ ] Use [skill reference] for calibrated question design
- [ ] Use [skill reference] for offer sequencing
```
---
## Key Principles
**Observable behavior overrides assumptions.** Do not classify based on role, title, culture, or demographic. Classify based on what the counterpart has actually said and done. A senior executive may be an Accommodator; a junior analyst may be an Assertive. The signals are in the behavior, not the position.
**Your own type is the biggest source of profiling error.** You will naturally notice and weight signals that match your own communication preferences. An Assertive user will remember the counterpart's blunt moments and underweight their warmth. Explicitly identifying your own type and running the projection risk check is not optional — it corrects the most common classification failure.
**Type mismatches cause avoidable failures.** Treating an Analyst like an Assertive (rushing, pushing for quick decision) produces evasion and withdrawal. Treating an Assertive like an Accommodator (leading with warmth, taking time for relationship) reads as weakness or evasion. These failures are entirely preventable once the counterpart's type is known. The adaptation is not subtle — it changes what you say in the first 60 seconds.
**Accommodate the counterpart's style, not your comfort.** Adapting to an Assertive when you are an Accommodator may feel unnatural. Slowing down for an Analyst when you are an Assertive may feel like wasted time. This discomfort is the adaptation working correctly. The goal is not comfortable communication — it is effective communication for this specific person.
**The Assertive needs to be heard before they can hear you.** This is the most consistently violated rule for Assertive counterparts. Launching into your position before acknowledging theirs triggers defensiveness, not engagement. A single label ("It sounds like timeline is the main issue here") before your counter costs nothing and changes everything.
**Silence means different things for each type — misreading it is one of the most common type-mismatch failures.** For an Analyst, silence means they are processing — wait patiently and do not rush to fill it. For an Accommodator, silence means discomfort or suppressed disagreement — gently label it ("It seems like something's on your mind") rather than letting it sit. For an Assertive, silence from you is interpreted as having nothing to say — they will fill it with their own position, which is actually useful (let them talk). Applying the wrong silence response to the wrong type causes immediate damage: rushing an Analyst's silence reads as pressure; ignoring an Accommodator's silence lets the unspoken objection harden; filling silence too quickly with an Assertive robs you of information.
**When dealing with an Assertive counterpart, always acknowledge their position with a label BEFORE presenting your counter-position.** Assertives do not process new information until they feel heard. If you counter before acknowledging, they will simply repeat their position louder. The sequence is: label their position ("It sounds like you feel strongly that X"), wait for confirmation, THEN present your alternative. Skipping the label and going straight to the counter is the single most consistent source of escalation with Assertive counterparts.
---
## Examples
### Example 1: Pricing Negotiation with a Procurement Manager
**Scenario:** A sales rep preparing for a final pricing negotiation with a procurement manager. The manager has sent three highly detailed emails with numbered questions, requested a full pricing breakdown with line items, took five days to respond to the last message, and opened the first meeting with "Let's get right to the numbers."
**Trigger:** "How should I approach this negotiation? He's very analytical but also seems impatient."
**Process:**
- Step 1: Signals — detailed numbered questions (directness, information style), full pricing breakdown requested (data preference), 5-day response time (deliberate pace), "get right to the numbers" (time-efficient, task-focused)
- Step 2: Analyst=4 (data-driven, detailed, deliberate pace), Assertive=3 (direct opener, time-efficiency signal), Accommodator=1 (no warmth signals)
- Step 3: Primary type: Analyst with Assertive secondary — classify as Analyst-Assertive blend
- Step 4: User self-identifies as Accommodator → projection risk: user will want to build warmth before numbers; counterpart wants numbers immediately
- Step 5: Tempo — be prompt but don't rush decision. Information — lead with a complete pricing breakdown before the meeting, not in response to pushback. Relationship — skip extended small talk; open with "I've prepared the full breakdown you asked for." Risk — don't interpret slow response time as doubt; don't pad with warmth that reads as evasion
**Output:** `counterpart-profile.md` classifying as Analyst-Assertive blend, adaptation strategy leading with data delivery, projection risk warning for Accommodator user, language examples ("Here's the complete breakdown — happy to walk through any line item you want to dig into" vs. avoid "I just want to make sure we have a good relationship here before we get into numbers").
---
### Example 2: Salary Negotiation with a Hiring Manager
**Scenario:** A candidate preparing for salary negotiation with a hiring manager who has been very friendly throughout the interview process, asked personal questions about the candidate's family and career journey, said "I'm sure we can make this work" multiple times, but has not yet made a concrete offer after three conversations.
**Trigger:** "She keeps saying yes but nothing is moving. How should I approach asking for a number?"
**Process:**
- Step 1: Signals — personal questions (high warmth), "sure we can make this work" (verbal agreement without commitment), three conversations without concrete offer (avoidance of conflict), friendly throughout (relationship emphasis)
- Step 2: Accommodator=5 (all signals match), Analyst=1 (no data or deliberation signals), Assertive=1 (no directness or pace signals)
- Step 3: Primary type: Accommodator (high confidence)
- Step 4: User self-identifies as Analyst → projection risk: user may over-explain rationale and data for the salary ask; the manager needs to feel the relationship is good before committing
- Step 5: Tempo — allow relationship talk before business; don't rush. Information delivery — frame the ask in terms of the relationship and mutual success, not market data. Relationship — acknowledge the warmth explicitly before making the ask. Risk — verbal "yes" from an Accommodator is not a commitment; close with a specific follow-up: "So can we agree to [specific number] and have an offer letter by [date]?" Silence means something is wrong — check in.
**Output:** `counterpart-profile.md` classifying as Accommodator, adaptation strategy prioritizing relationship acknowledgment before the ask, specific language for closing ("I really appreciate how collaborative this process has been — I'd love to confirm the number and timeline so I can get excited about next steps"), and explicit warning that "I'm sure we can make this work" is not a commitment.
---
### Example 3: Partnership Negotiation with a Co-founder
**Scenario:** A founder trying to negotiate equity terms with a potential co-founder. The potential co-founder responds to messages within minutes, jumps to conclusions quickly, stated his equity expectations in the first meeting ("I need at least 30%"), interrupted twice when the founder was explaining the rationale, and pushed back hard when the founder said "let me think about it."
**Trigger:** "Every time I try to explain my reasoning, he cuts me off. What am I doing wrong?"
**Process:**
- Step 1: Signals — minute-level response time (fast pace), stated position immediately (direct, no preamble), interrupted twice (impatient with build-up), pushed back on "let me think about it" (time-is-money, wants momentum)
- Step 2: Assertive=5 (all signals match strongly), Analyst=1, Accommodator=1
- Step 3: Primary type: Assertive (high confidence)
- Step 4: User self-identifies as Analyst → projection risk: user's instinct to explain full rationale before stating position is exactly what causes the interruptions
- Step 5: Tempo — respond quickly, match his pace. Information delivery — lead with the number, offer rationale only if asked. Relationship — respect is earned by being direct and holding a position, not by explaining it. Risk — do not launch into rationale before acknowledging his 30% position; use a label first: "It sounds like you see your contribution as worth 30% of the outcome — I want to make sure I understand what's driving that before responding." Then state your position directly.
**Output:** `counterpart-profile.md` classifying as Assertive, adaptation strategy emphasizing label-first approach before counter-positions, explicit example language ("It sounds like you see your contribution as worth 30% of the outcome", followed by a direct statement of the user's own number), and projection risk warning for the Analyst user to front-load conclusion, not rationale.
---
## References
| File | Contents |
|------|----------|
| `references/type-profiles.md` | Full behavioral signal inventory per type; diagnostic question bank; cross-cultural considerations; blend profiles; adaptation scripts by context (email, phone, in-person, high-stakes vs. routine) |
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Verify whether a counterpart's agreement is a real commitment or a polite escape. Use when someone asks "how do I know if they really mean yes?", "they agree...
---
name: commitment-verifier
description: |
Verify whether a counterpart's agreement is a real commitment or a polite escape. Use when someone asks "how do I know if they really mean yes?", "they agreed but I'm not sure they'll follow through", "my counterpart said yes but something feels off", "how do I tell if someone is stringing me along?", or "they said you're right — is that agreement?" Also use for: detecting when a verbal commitment won't survive contact with implementation, distinguishing genuine alignment from social pressure compliance, spotting deception signals in a counterpart's language or delivery, checking whether the decision-maker in the room actually has authority to commit, or preparing verification questions before a closing conversation. Analyzes an agreement interaction — conversation transcript, notes, or recalled exchange — and classifies each yes-type, flags verbal deception indicators, surfaces channel mismatches (words vs. tone vs. body language), and generates Rule of Three follow-up questions to confirm genuine commitment. Works for sales closes, contract negotiations, vendor agreements, hiring decisions, partnership deals, project sign-offs, and any high-stakes conversation where the difference between a real yes and a polite yes determines whether effort is wasted. Pair with calibrated-questions-planner (to design verification questions) and empathic-summary-planner (to build the rapport that makes genuine commitment possible).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/commitment-verifier
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [4, 5, 8]
tags: [negotiation, commitment, verification, deception-detection, yes-types, rule-of-three, pinocchio-effect, body-language, tone, baseline-deviation, closing, sales, agreement-quality, counterpart-analysis]
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Conversation transcript, notes, or recalled exchange — the interaction in which the agreement was made, including any relevant context leading up to it"
- type: document
description: "Agreement statements — the specific things the counterpart said that constitute the 'yes' (exact words matter for deception signal analysis)"
- type: document
description: "Behavioral observations (optional) — anything you noticed about their tone, body language, energy, eye contact, posture, or pace during or after the agreement"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works with pasted notes, recalled conversations, or written transcripts."
discovery:
goal: "Classify the quality of each agreement made, flag deception signals in language and delivery, and produce verification questions that will reveal whether commitment is genuine before you invest further resources"
tasks:
- "Classify each identified agreement as Counterfeit, Confirmation, or Commitment yes"
- "Analyze verbal deception indicators: word count inflation, pronoun avoidance, sentence complexity, third-person distancing"
- "Check for 7-38-55 channel mismatches between words, tone, and observed body language"
- "Identify baseline deviations from the counterpart's normal communication style"
- "Generate Rule of Three verification sequence for each unconfirmed commitment"
- "Flag any behind-the-table stakeholders whose absence makes the commitment structurally unreliable"
- "Write commitment-assessment.md with full analysis and verification action plan"
audience:
roles: ["salesperson", "founder", "manager", "consultant", "account-executive", "procurement", "recruiter", "negotiator", "partnerships-manager"]
experience: "beginner to intermediate — no prior negotiation training required"
triggers:
- "User has received a yes but is uncertain whether it will translate to action"
- "User wants to verify commitment before investing significant time or resources in implementation"
- "User noticed a mismatch between what the counterpart said and how they said it"
- "User is dealing with a counterpart who tends to agree verbally without following through"
- "User needs to distinguish between social compliance and genuine decision-maker commitment"
- "User wants to design closing questions that surface real objections before they appear as cancellations"
not_for:
- "Generating the initial negotiation preparation document — use negotiation-one-sheet-generator"
- "Designing the questions to ask during negotiation — use calibrated-questions-planner"
- "Building the empathic summary that produces genuine alignment — use empathic-summary-planner first"
- "Profiling a counterpart's communication style — use counterpart-style-profiler"
quality: placeholder
---
# Commitment Verifier
## When to Use
You have received an agreement — a "yes," a verbal commitment, a signed-off plan, or an "I'm on board" — and something makes you uncertain whether it will survive contact with reality. The counterpart was agreeable, but you are not sure if they were agreeing to get out of the conversation, acknowledging a point without committing to act on it, or making a genuine commitment they intend to honor.
This skill works both pre-close (verifying before you proceed) and post-close (diagnosing why a commitment is stalling). You bring the conversation notes or transcript; the skill classifies what kind of yes you received, flags signals of deception or hedging, and produces a verification sequence to confirm genuine commitment or surface the real objection.
**Input this skill needs:** What was said (the agreement statements), how it was said (tone, energy, body language if observed), and any context about the relationship and interaction.
**Do not use this skill if:** You have not yet had the commitment conversation — first build rapport using `empathic-summary-planner` and generate your question bank using `calibrated-questions-planner`. This skill works on agreements already made, not on conversations yet to happen.
---
## Context & Input Gathering
### Required
- **The agreement statements:** The specific words the counterpart used to agree. Exact phrasing matters — "absolutely" and "yeah, sure" carry different signals.
- **What was agreed to:** The substance of the commitment — what they said they would do, decide, or approve.
- **When and how it came up:** Was it volunteered or prompted? After pressure or naturally? At the end of a long conversation or early?
### Important
- **Their tone and energy during agreement:** Did they sound enthusiastic, flat, distracted, over-eager? Any notable change from their normal pattern?
- **Body language if observed:** Eye contact, posture, pace of response, physical engagement or withdrawal.
- **What came before the agreement:** Were there unresolved objections, pending approvals, or stakeholders mentioned who weren't present?
### Observable
- **Follow-through signals:** Have they taken any action since agreeing — sent a follow-up, copied colleagues, initiated next steps?
- **Pronoun patterns:** Did they use "I will" or "we'll need to," "I'm committed" or "that sounds good"? Pronoun choice is a key deception signal.
- **Word count:** Did their agreement come with an unusually long explanation? Did they over-justify?
### Defaults (used if not provided)
- **Baseline behavior:** Assume the current sample is the only behavioral data available — flag deviations from internal consistency within the conversation itself.
- **Stakeholder completeness:** Assume that unless the decision-maker's identity and authority have been explicitly confirmed, at least one unconfirmed approver exists.
### Sufficiency check
If you have the agreement statements and a basic description of how the conversation went, proceed. Behavioral observations enrich the analysis but are not required.
---
## Process
### Step 1: Classify Each Agreement as One of Three Yes Types
**Action:** For each statement the counterpart made that constitutes a "yes," classify it as one of three types based on its diagnostic signals:
**Counterfeit Yes** — an agreement to escape the conversation, not to commit. The counterpart has no intention of following through. Signals: vague phrasing ("sounds good," "we'll see," "let's circle back"), no specifics on next steps, agreement came under pressure or was given quickly to end the discussion, tone flat or rushed.
**Confirmation Yes** — an acknowledgment that a fact is true, not a commitment to act. The counterpart understands and agrees with what you said but has not agreed to do anything. Signals: "yes, that's right," "I understand," "correct" — statements that confirm a point without specifying action. Often follows a question that asks for confirmation rather than action.
**Commitment Yes** — a genuine agreement to take a specific action. Signals: specific next steps named ("I'll send the contract by Friday"), ownership language ("I will," "I'm going to," "I'll handle"), the counterpart introduced details or logistics without being asked, their energy level rose with the agreement.
**WHY:** Most negotiations are lost not at the objection stage but at the false close — when you treat a Counterfeit or Confirmation yes as a Commitment and stop working the deal. The counterpart who says "yes" to make you feel good and then goes quiet is not deceiving maliciously — they are following the social instinct to avoid conflict. Distinguishing yes types before you invest in implementation protects your time and prevents the confusion of wondering why a "committed" counterpart has gone dark.
**IF** the agreement statement is ambiguous → mark it as unclassified and flag it for Rule of Three verification in Step 4.
**IF** the agreement was made under time pressure or at the end of a long conversation → weight toward Counterfeit or Confirmation until verified.
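
As a rough illustration of the classification above, the following Python sketch applies simple phrase matching. The phrase lists reuse the signals quoted in this step; the function name `classify_yes`, the `under_pressure` parameter, and the precedence order (ownership language first) are assumptions made for illustration. Real classification still needs judgment about tone and context.

```python
import re
from typing import Tuple

# Phrase lists reuse the signals quoted in Step 1; extend them for real use.
OWNERSHIP = ("i will", "i'll", "i'm going to")
CONFIRMATION = ("that's right", "i understand", "correct")
COUNTERFEIT = ("sounds good", "we'll see", "let's circle back")


def _contains(text: str, phrases: Tuple[str, ...]) -> bool:
    """True if any phrase occurs in text as whole words."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def classify_yes(statement: str, under_pressure: bool = False) -> Tuple[str, bool]:
    """Return (yes_type, needs_rule_of_three) for one agreement statement."""
    text = statement.lower().replace("\u2019", "'")  # normalize curly apostrophes
    if _contains(text, OWNERSHIP):
        yes_type = "Commitment"
    elif _contains(text, CONFIRMATION):
        yes_type = "Confirmation"
    elif _contains(text, COUNTERFEIT):
        yes_type = "Counterfeit"
    else:
        yes_type = "Unclassified"  # ambiguous: flag for Step 4
    # Assumption: pressure does not change the label, it only forces verification.
    needs_verification = yes_type != "Commitment" or under_pressure
    return yes_type, needs_verification
```

Anything other than an unpressured Commitment comes back flagged for the Rule of Three sequence in Step 4.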
---
### Step 2: Analyze Verbal Deception Indicators (Pinocchio Effect)
**Action:** Read the counterpart's agreement statements and flag the presence of any of these four verbal deception indicators:
1. **Word count inflation** — Did they use significantly more words than necessary to express agreement? Genuine commitment tends to be direct ("I'll do it"). Deceptive or uncertain agreement often comes with elaborate justification, qualifications, or tangential explanation.
2. **First-person pronoun avoidance** — Did they use "I" or avoid it? Statements like "we'll get that sorted" or "that will happen" or "it should be fine" distance the speaker from ownership of the commitment. Genuine commitments tend to be first-person: "I will," "I'm going to," "I'll take care of that."
3. **Third-person distancing** — Did they refer to themselves in the third person, or refer to "the company" / "the team" / "our process" as the actor rather than themselves? This diffuses responsibility and signals they are not personally committed to the outcome.
4. **Sentence complexity increase** — Did the agreement come out in compound, qualified sentences full of conditionals ("as long as everything goes smoothly," "assuming the timeline holds," "provided nothing changes")? Genuine commitment is usually simple and direct. Complexity at the commitment moment is a hedge.
**WHY:** When people are not telling the whole truth — whether they're actively deceiving or simply uncertain — their language changes in predictable ways. They use more words because they're constructing rather than recalling. They avoid first-person pronouns because committing the "I" feels more psychologically binding. They add complexity and qualifiers because they're leaving themselves escape routes. These signals do not prove deception; they flag uncertainty worth verifying. Even when the counterpart is not consciously lying, these patterns often signal that they're not fully committed — they may not know yet whether they can actually deliver on what they're agreeing to.
**Flag each signal found.** A single signal is a soft flag. Two or more signals in the same statement is a strong flag requiring Rule of Three verification.
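
The flagging rule above (one signal is soft, two or more is strong) can be sketched in Python as follows. The detectors are deliberately crude: the word-count threshold `MAX_DIRECT_WORDS` and the phrase lists are assumptions chosen for illustration, not values from the source.

```python
import re
from typing import List

# Assumed threshold and phrase lists; tune them against real transcripts.
MAX_DIRECT_WORDS = 12
FIRST_PERSON = {"i", "i'll", "i'm", "i've", "i'd"}
GROUP_ACTORS = ("the company", "the team", "our process")
CONDITIONALS = ("as long as", "assuming", "provided")


def verbal_flags(statement: str) -> List[str]:
    """Return the verbal deception indicators found in one statement."""
    text = statement.lower().replace("\u2019", "'")
    words = re.findall(r"[a-z']+", text)
    flags = []
    if len(words) > MAX_DIRECT_WORDS:
        flags.append("word count inflation")
    if not any(w in FIRST_PERSON for w in words):
        flags.append("first-person pronoun avoidance")
    if any(actor in text for actor in GROUP_ACTORS):
        flags.append("third-person distancing")
    if any(cond in text for cond in CONDITIONALS):
        flags.append("sentence complexity")
    return flags


def rate(flags: List[str]) -> str:
    """Map the number of flags to the rating used in the output template."""
    if not flags:
        return "Clean"
    return "Soft flags" if len(flags) == 1 else "Strong flags"
```

For example, `rate(verbal_flags("We'll get that sorted"))` yields a soft flag for pronoun avoidance only, while a statement that also names "the team" and adds "assuming..." is rated strong.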
---
### Step 3: Check for 7-38-55 Channel Mismatches
**Action:** If you have tone or body language observations, apply the 7-38-55 communication channel framework:
- **7% of meaning is carried by words** — what was literally said
- **38% is carried by tone** — the pace, volume, energy, warmth, or flatness of delivery
- **55% is carried by body language** — posture, eye contact, physical engagement or withdrawal
**Mismatch rule:** When the words say yes but the tone is flat, rushed, or disengaged — discount the words. When the words say yes but the body language shows withdrawal (leaning back, reduced eye contact, closed posture) — discount the words. Alignment across all three channels is the strongest signal of genuine commitment.
**If observations are available, rate each channel:**
- Words: what they said (positive/neutral/negative)
- Tone: how they sounded (engaged/flat/rushed/warm)
- Body language: how they appeared (open/closed/distracted/energized)
**If channels conflict** — flag as a channel mismatch and escalate to Rule of Three verification.
**WHY:** The 7-38-55 percentages are a rule of thumb rather than a measurement, but they point to a real asymmetry in communication: people have conscious control over their words but significantly less control over their tone and body language under stress or uncertainty. A counterpart who has decided to say yes to end a difficult conversation will choose positive words, but their voice and body often betray the discomfort or lack of conviction underneath. The channels where people have less conscious control carry more signal about their actual internal state than the channel (words) where they have the most control.
**IF** no behavioral observations are available → skip this step and note "channel data unavailable — rely on verbal signals only."
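
A minimal sketch of the mismatch rule, assuming each channel has already been rated with the labels listed above. The function name `channel_alignment` and the grouping of labels into "weak" sets are illustrative assumptions. The sketch only covers the case this step describes, where the words say yes but another channel does not.

```python
from typing import Optional

# Ratings that undercut a verbal yes (assumed grouping of the labels above).
WEAK_TONES = {"flat", "rushed"}
WEAK_BODY = {"closed", "distracted"}


def channel_alignment(words: str, tone: Optional[str], body: Optional[str]) -> str:
    """Compare the three channel ratings for one agreement statement."""
    if tone is None and body is None:
        # No behavioral observations: Step 3 is skipped.
        return "channel data unavailable: rely on verbal signals only"
    says_yes = words == "positive"
    tone_off = tone in WEAK_TONES
    body_off = body in WEAK_BODY
    if says_yes and (tone_off or body_off):
        return "Mismatch detected"  # discount the words, escalate to Step 4
    return "Aligned"
```

A positive statement delivered in a flat tone returns "Mismatch detected" even when the body language was rated open, which matches the instruction to discount the words whenever either of the other channels conflicts.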
---
### Step 4: Apply the Rule of Three Verification Protocol
**Action:** For each agreement that was classified as Counterfeit or Confirmation, or flagged by deception indicators or channel mismatch, design a three-confirmation sequence using three different techniques:
**Confirmation 1 — Direct restatement:** Ask the counterpart to restate the commitment in their own words. "Just so we're aligned — can you walk me through what you're planning to do from here?" A genuine commitment can be restated easily. A Counterfeit yes often produces vague or deflecting answers.
**Confirmation 2 — Implementation framing:** Ask a "how" question about the mechanics of execution. "How are you planning to handle [the specific commitment]?" This forces them to think through implementation. If they can't describe the implementation, they haven't actually committed.
**Confirmation 3 — Obstacle surfacing:** Ask what could get in the way. "What obstacles do you see on your end?" or "What would make this harder to pull off than expected?" A Commitment yes engages with this question practically — the counterpart thinks about obstacles and discusses them. A Counterfeit yes often deflects: "Oh, I don't think there'll be any issues."
**WHY:** A genuine commitment is stable under three different framings. Counterfeit and Confirmation yeses, being social rather than substantive, tend to crack under even mild re-examination — not because the counterpart is malicious, but because the commitment was never solidly formed. The Rule of Three works because it's too cognitively expensive to maintain an uncommitted yes across three differently-structured confirmation requests. Each time you ask, they must reconstruct the same false commitment from a different angle — and the cracks show. Three confirmations also create a record: if a counterpart confirms three times and still doesn't follow through, the issue is execution capacity or external constraint, not intent — and that changes how you respond.
**Space the three confirmations out** — do not ask them in a row in one conversation. Use them across different touchpoints (end of meeting, follow-up email, kick-off call) unless urgency requires otherwise.
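
The verification sequence can be generated mechanically once a commitment has been flagged. The sketch below is one possible shape: the `Confirmation` dataclass and the `rule_of_three_plan` function are hypothetical names, while the questions and touchpoints are taken from this step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Confirmation:
    number: int
    technique: str
    question: str
    touchpoint: str


def rule_of_three_plan(commitment: str) -> List[Confirmation]:
    """Build three differently framed confirmations for one commitment."""
    return [
        Confirmation(
            1, "Direct restatement",
            "Can you walk me through what you're planning to do from here?",
            "End of meeting",
        ),
        Confirmation(
            2, "Implementation framing",
            f"How are you planning to handle {commitment}?",
            "Follow-up email",
        ),
        Confirmation(
            3, "Obstacle surfacing",
            "What obstacles do you see on your end?",
            "Kick-off call",
        ),
    ]
```

Each entry uses a different technique and a different touchpoint, so the plan respects the spacing guidance by construction. Collapse the touchpoints only when urgency requires it.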
---
### Step 5: Detect Behind-the-Table Stakeholders via Pronoun Analysis
**Action:** Review the agreement statements for pronouns that signal the presence of decision-makers who were not at the table.
**Signals to flag:**
- "We'll need to..." (who is "we"?)
- "That should be fine with us" (who else is "us"?)
- "Our process requires..." (whose process? Who controls it?)
- Absence of "I" ownership with substituted organizational pronouns ("the company needs," "our team would have to")
**If plural or organizational pronouns appear**, generate a stakeholder confirmation question: "When you say 'we' — who else is involved in this decision?"
**WHY:** In most organizational negotiations, the person agreeing is not the only person whose buy-in is required. When a counterpart uses collective pronouns at the commitment moment, it often signals — consciously or not — that they are speaking for a group they have not yet actually consulted. Detecting this before you proceed prevents the most common form of deal collapse: a genuine-seeming yes from someone who then has to "check with" people who say no. The pronoun pattern is often an involuntary signal: the counterpart is using "we" because they know they haven't secured internal alignment, and the language reflects that uncertainty.
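
The pronoun scan lends itself to a small pattern match. This sketch assumes English text and a short, illustrative list of group markers; `stakeholder_signals` and `stakeholder_question` are hypothetical helper names.

```python
import re
from typing import List, Optional

# Plural and organizational markers from the signal list above (not exhaustive).
GROUP_PATTERN = re.compile(
    r"\b(?:we|us|our|the company|the team)\b", re.IGNORECASE
)


def stakeholder_signals(statement: str) -> List[str]:
    """Return the group markers found in an agreement statement, in order."""
    return [m.group(0).lower() for m in GROUP_PATTERN.finditer(statement)]


def stakeholder_question(statement: str) -> Optional[str]:
    """Return the confirmation question if any group marker is present."""
    if stakeholder_signals(statement):
        return "When you say 'we', who else is involved in this decision?"
    return None
```

The word-boundary match also catches contractions such as "We'll", because the apostrophe counts as a boundary. A statement with only first-person ownership language returns `None`, meaning no stakeholder question is needed.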
---
### Step 6: Special Case — "You're Right" vs. "That's Right"
**Action:** If the counterpart said "you're right" — flag it immediately as a non-commitment signal.
**"You're right"** = dismissal. The counterpart is agreeing with your position to end the point, not agreeing to do something. It is a way of acknowledging your argument without accepting your conclusion. It typically signals that you have been pushing your position rather than drawing out theirs, and they are placating you to move the conversation forward.
**"That's right"** = genuine acknowledgment. The counterpart is saying that your summary or paraphrase of their position is accurate. This is a signal of felt understanding — they believe you have correctly represented their perspective — which creates the conditions for genuine movement.
**WHY:** The distinction matters because "you're right" is often followed by no change in behavior — the counterpart has given you a verbal win that costs them nothing. "That's right" is usually preceded by an accurate empathic summary, which means you have done the work of understanding them first. The sequence that produces "that's right" (active listening → paraphrase → label) is what builds the platform for genuine Commitment yes. If you got "you're right," the path to real commitment runs through empathic listening first.
**If "you're right" was the key agreement statement** → do not treat as commitment. Route to `empathic-summary-planner` to rebuild rapport, then re-approach the commitment.
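
For transcripts processed in bulk, the two phrases can be told apart with a simple check. This is a sketch only: the function name `acknowledgment_signal` and the returned labels are assumptions, and it matches literal phrases without reading context.

```python
def acknowledgment_signal(statement: str) -> str:
    """Distinguish "you're right" (dismissal) from "that's right" (felt understanding)."""
    text = statement.lower().replace("\u2019", "'")  # normalize curly apostrophes
    if "you're right" in text or "you are right" in text:
        # Do not treat as commitment; rebuild rapport first.
        return "non-commitment: route to empathic-summary-planner"
    if "that's right" in text or "that is right" in text:
        return "felt understanding: conditions for genuine movement"
    return "no signal"
```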
---
### Step 7: Write the Output Artifact
**Action:** Write `commitment-assessment.md` with the full analysis: yes-type classifications, deception signal flags, channel analysis, verification questions, and a summary action plan.
**WHY:** A written artifact converts subjective impressions ("something felt off") into structured evidence that supports a clear decision about how to proceed. It also creates a record — if the counterpart later claims they never committed, the assessment documents what was said and how it was evaluated.
---
## Inputs / Outputs
### Inputs
- Agreement statements — what the counterpart said (required)
- Conversation context — how and when the agreement was made (required)
- Behavioral observations — tone, body language, energy (optional)
- Stakeholder context — who was and wasn't in the room (optional)
### Outputs
**File:** `commitment-assessment.md`
**Template:**
```markdown
# Commitment Assessment: [Situation / Deal Name]
**Prepared:** [Date]
**Counterpart:** [Name / role / organization]
**Interaction:** [Meeting, call, email, etc. — date and context]
---
## Agreement Inventory
| # | Agreement Statement (verbatim or recalled) | Yes Type | Confidence | Flags |
|---|-------------------------------------------|----------|------------|-------|
| 1 | "[exact words]" | Counterfeit / Confirmation / Commitment | High / Medium / Low | [signal flags] |
| 2 | "[exact words]" | Counterfeit / Confirmation / Commitment | High / Medium / Low | [signal flags] |
---
## Deception Signal Analysis
### Verbal Indicators (Pinocchio Effect)
- [ ] Word count inflation: [describe if present]
- [ ] First-person pronoun avoidance: [describe if present]
- [ ] Third-person distancing: [describe if present]
- [ ] Sentence complexity / over-qualification: [describe if present]
**Overall verbal signal rating:** Clean / Soft flags / Strong flags
---
## Channel Analysis (7-38-55)
| Channel | Observation | Signal |
|---------|-------------|--------|
| Words (7%) | [what they said] | Positive / Neutral / Negative |
| Tone (38%) | [how they sounded] | Engaged / Warm / Flat / Rushed / N/A |
| Body language (55%) | [how they appeared] | Open / Energized / Closed / Distracted / N/A |
**Channel alignment:** Aligned / Mismatch detected
**Mismatch notes:** [describe any conflict between channels]
---
## Stakeholder Completeness
- Decision-maker confirmed in room: Yes / No / Unknown
- Plural pronoun signals: [list any "we" / "our" / "the team" usage at commitment moment]
- Stakeholder confirmation question needed: Yes / No
- Question: "[How does this decision involve the rest of your team?]"
---
## Rule of Three Verification Plan
*For each unconfirmed commitment, three confirmations required across different touchpoints.*
**Commitment 1:** [Agreement statement to verify]
| Confirmation | Technique | Question to Ask | Touchpoint |
|-------------|-----------|-----------------|------------|
| #1 | Direct restatement | "[Can you walk me through what you're planning to do from here?]" | End of next meeting |
| #2 | Implementation framing | "[How are you planning to handle X?]" | Follow-up email |
| #3 | Obstacle surfacing | "[What obstacles do you see on your end?]" | Kick-off call |
---
## Verdict and Action Plan
**Overall commitment quality:** Genuine / Uncertain / Likely Counterfeit
**Recommended next step:**
- [ ] Proceed — commitment signals are strong across all channels
- [ ] Verify — run Rule of Three sequence before investing further
- [ ] Rebuild — route to empathic-summary-planner to rebuild rapport before re-approaching commitment
- [ ] Surface stakeholders — confirm who else must approve before treating this as closed
**Notes:**
[Any additional context or observations]
```
---
## Key Principles
**Three types of yes require three different responses.** Treating a Counterfeit or Confirmation yes as a Commitment is the most common reason follow-through fails. Counterfeit yes needs rebuilding — find the real objection underneath the escape. Confirmation yes needs extension — the counterpart understood; now you need to move them from acknowledgment to action. Only Commitment yes allows you to proceed.
**WHY:** Most "commitment failures" are actually classification errors. A salesperson who loses a "closed" deal was usually working with a Counterfeit yes without knowing it. Misclassification causes the wrong intervention: you reinvest in a relationship that needs a different kind of attention, or you disengage from a counterpart who was genuinely committed but needed operational support to execute.
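
The three responses can be written down as a lookup so that the classification from Step 1 always leads to the matching intervention. The `NEXT_STEP` table and `next_step` function are hypothetical names. Treating unrecognized labels as "verify first" is an assumption of this sketch.

```python
# One response per yes type, as described in the principle above.
NEXT_STEP = {
    "Counterfeit": "Rebuild: find the real objection underneath the escape",
    "Confirmation": "Extend: move the counterpart from acknowledgment to action",
    "Commitment": "Proceed",
}


def next_step(yes_type: str) -> str:
    """Return the intervention that matches a classified yes type."""
    # Assumption: anything unclassified is verified before further investment.
    return NEXT_STEP.get(yes_type, "Verify: run the Rule of Three sequence first")
```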
**Verbal deception indicators are signals of uncertainty, not lies.** When a counterpart's language inflates in word count, avoids first-person pronouns, or adds qualifiers at the commitment moment, they are not necessarily deceiving you — they may be uncertain themselves. The signals flag that the commitment is not fully formed, whether the reason is social pressure compliance, unresolved internal objections, or genuine inability to commit on behalf of others.
**WHY:** Treating deception signals as moral failures produces the wrong response — confrontation or withdrawal. Treating them as uncertainty signals produces the right response — Rule of Three verification that gives the counterpart a chance to surface the real constraint or solidify their commitment. The goal is not to catch them lying; it is to understand what is actually true.
**The channel with the least conscious control carries the most signal.** Words are chosen deliberately. Tone is semi-conscious. Body language is largely automatic. When these channels conflict, the channel the counterpart controls least — body language, then tone — is usually closer to their actual internal state. This asymmetry means that a flat, rushed voice delivering an enthusiastic "absolutely" is telling you more with the tone than with the word.
**WHY:** Under social pressure to agree, people default to positive words because that is the path of least resistance. But the emotional state underneath — discomfort, uncertainty, lack of conviction — surfaces in channels they are not consciously managing. The 7-38-55 framework is not a precise formula; it is a reminder to weight the channels that are harder to fake.
**"You're right" ends conversations; "that's right" advances them.** A counterpart who says "you're right" is closing down a thread that makes them uncomfortable. A counterpart who says "that's right" in response to your summary of their position is opening up — they feel understood, and that feeling creates movement. The difference tells you whether you've been pushing your position (producing escape responses) or drawing out theirs (producing genuine alignment).
**WHY:** Negotiation is lost at the point of pushing, not at the point of asking. When you make your case and the counterpart says "you're right," you have pushed them into a corner where agreement costs them nothing and signals nothing. When you summarize their position accurately and they say "that's right," they have moved toward you voluntarily, which is the only movement that sticks.
**A Counterfeit yes is too cognitively expensive to fake three times.** A Counterfeit yes can survive one confirmation request. It rarely survives the Rule of Three: three differently-framed requests across different touchpoints. The effort required to maintain the same false commitment under restatement, implementation questioning, and obstacle surfacing exceeds the social benefit of avoiding the conversation, so the real position surfaces.
**WHY:** This is why spacing the three confirmations across touchpoints (rather than asking them in sequence in one conversation) produces better information. In a single conversation, a counterpart can maintain a false commitment through social pressure. Across multiple touchpoints — when they have had time to think — the gap between what they said and what they're actually able to deliver becomes harder to bridge.
---
## Examples
### Example 1: Sales Close — Distinguishing Counterfeit from Commitment
**Scenario:** An account executive has been working a $200K software deal for four months. At the end of the final demo, the VP of Finance says: "This looks great. I think we can make this work. I don't see why we wouldn't move forward." The executive wants to know whether to treat this as a closed deal.
**Trigger:** "They said yes but I want to make sure before I start onboarding."
**Process:**
- Step 1: Classify — "I think we can make this work" is Confirmation at best. "I don't see why we wouldn't" is a double negative hedge, not a positive commitment. No specific action stated. No ownership pronoun ("I will"). Classify as Confirmation/Unverified.
- Step 2: Pinocchio check — "I don't see why we wouldn't" is over-complex for what a simple yes would be. No first-person commitment pronoun. Word count above what a genuine commitment requires. Two soft flags.
- Step 3: Channel check — the executive noted the VP's energy was lower at the end of the meeting than during the demo. Channel mismatch between positive words and reduced energy.
- Step 4: Rule of Three — Confirmation 1: "To make sure I have this right — can you walk me through what the path to signing looks like from here?" Confirmation 2: "How does your procurement process handle something at this contract size?" Confirmation 3: "What could slow this down on your end?"
- Step 5: Stakeholder check — "we" usage flagged. Question: "When you say 'we' — who else will be involved in the final sign-off?"
**Output:** `commitment-assessment.md` classifies the agreement as Unverified, flags two deception signals and a channel mismatch, generates the Rule of Three sequence and a stakeholder confirmation question. The follow-up reveals that legal approval is required — a two-week process the executive can now plan around rather than be surprised by.
---
### Example 2: Vendor Commitment — Detecting Scope Creep Risk Before It Happens
**Scenario:** A project manager has just finished a scope definition meeting with a new vendor. The vendor's lead said: "Absolutely, we can handle all of that. Our team does this kind of thing all the time. I'll make sure the team is fully aligned before we kick off." The PM wants to verify before approving the contract.
**Trigger:** "How do I know they can actually deliver what they committed to?"
**Process:**
- Step 1: Classify — "Absolutely, we can handle all of that" has positive energy but no specifics. "I'll make sure the team is fully aligned" is a process commitment (aligning the team), not a delivery commitment. Classify Agreement 1 as Confirmation, Agreement 2 as Commitment (to an internal action), flag for verification.
- Step 2: Pinocchio check — "Our team does this kind of thing all the time" is unsolicited justification at the commitment moment — word count inflation signal. "I'll make sure the team is fully aligned" uses first-person ownership — clean signal. Mixed result.
- Step 3: Channel check — PM noted the vendor lead spoke faster after the scope list was read. Pace increase = possible stress signal. Mild channel flag.
- Step 4: Rule of Three for Agreement 1 — Confirmation 1: "Can you describe how your team would approach the [most complex deliverable] specifically?" Confirmation 2: "How have you handled situations like this in previous projects?" Confirmation 3: "What would make delivery on this harder than expected?"
- Step 5: No plural pronouns at commitment. Stakeholder concern low.
**Output:** `commitment-assessment.md` flags word count inflation and a mild pace mismatch, and requests implementation framing in the kick-off call. The implementation question reveals that the vendor has not done one component of the scope before, a gap that now surfaces before contract signing.
---
### Example 3: Internal Alignment — Verifying "You're Right" from a Reluctant Stakeholder
**Scenario:** A product manager needs sign-off from the VP of Engineering before launching a feature. After presenting the plan, the VP said: "You're right, we can't keep delaying this. I'll support it." The PM is uncertain whether this means the VP will actively support the launch or just stop blocking it.
**Trigger:** "I need to know if she's actually on board or just agreeing to end the conversation."
**Process:**
- Step 1: Classify — "You're right" is the diagnostic red flag. Per the "you're right" rule: this is a dismissal response, not a Commitment yes. "I'll support it" uses first-person and specifies action — classify that component as Commitment/Unverified.
- Step 2: Pinocchio check — "You're right, we can't keep delaying this" has a complaint embedded (the delay frustration) that was not addressed — a sign the underlying concern is unresolved. The "I'll support it" is short and direct — clean signal.
- Step 3: Channel check — the PM noted the VP did not make eye contact when saying "you're right." Maintained eye contact during "I'll support it." Channel mismatch on first statement, alignment on second.
- Step 4: The "you're right" component is a dismissal — route to empathic summary first. The "I'll support it" component is worth verifying. Rule of Three — Confirmation 1: "What specifically would your support look like during the launch period?" Confirmation 2: "How would you want to handle it if the launch surfaces issues?" Confirmation 3: "What concerns about the plan would you want us to address before we go?"
- Step 5: No stakeholder flags.
**Output:** `commitment-assessment.md` identifies the "you're right" as dismissal and separates it from the "I'll support it" commitment. Recommends a 10-minute empathic summary conversation to resolve the underlying delay frustration before the Rule of Three verification. The follow-up reveals the VP's real concern: resource allocation during the launch window — which the PM can now address directly.
---
## References
| File | Contents |
|------|----------|
| `references/yes-type-guide.md` | Diagnostic signal checklist for all three yes types; common statement patterns with classifications; edge cases and ambiguous examples; escalation criteria |
| `references/deception-signal-reference.md` | Full Pinocchio Effect signal library with examples; 7-38-55 channel weight rationale and application; baseline deviation detection method; pronoun analysis guide; "you're right" vs "that's right" full case breakdown |
| `references/rule-of-three-library.md` | 15+ ready-to-use verification questions across all three confirmation types; sequencing guide for different contexts (sales, internal, vendor, partnership); timing recommendations for distributed confirmation |
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Generate a bank of open-ended strategic questions (how/what questions) for a negotiation, sales conversation, difficult discussion, or conflict situation. Us...
---
name: calibrated-questions-planner
description: |
Generate a bank of open-ended strategic questions (how/what questions) for a negotiation, sales conversation, difficult discussion, or conflict situation. Use when someone asks "what questions should I ask in my negotiation?", "how do I get more information without seeming pushy?", "how do I find out who else is involved in this decision?", "what should I ask to understand their constraints?", or "how do I stop the other side from stonewalling me?" Also use for: designing interview questions that reveal unstated priorities, discovering hidden stakeholders who could kill a deal, identifying deal-breaking issues before they surface, uncovering the real decision-making process behind a stated position, or preparing questions for any high-stakes conversation where you need the other party to think and talk. Produces a situation-specific question bank organized by category (value-revealing, behind-the-table stakeholder, deal-killing issue), with follow-up label templates and deployment sequencing. Works for sales calls, job negotiations, vendor negotiations, partnership discussions, client discovery, conflict resolution, and any scenario where understanding the counterpart's full picture is critical. Pair with accusation-audit-generator (to defuse objections before asking) and commitment-verifier (to verify answers reveal real commitment).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/calibrated-questions-planner
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [7, 8, 23]
tags: [negotiation, questions, open-ended-questions, stakeholder-discovery, deal-killers, how-what-questions, active-listening, information-gathering, sales, conflict-resolution, strategic-questions, decision-makers, hidden-constraints]
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Situation brief describing the negotiation, deal, or conversation — what you want, who is involved, what you know about their position and constraints"
- type: document
description: "Goals and constraints — your target outcome, your walk-away point, any known constraints on either side"
- type: document
description: "Stakeholder map (optional) — known decision-makers and influencers, especially those not directly at the table"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works with pasted briefs or document files."
discovery:
goal: "Produce a situation-specific bank of open-ended strategic questions that shift problem-solving to the counterpart, reveal hidden constraints and stakeholders, and uncover deal-killing issues before they surface"
tasks:
- "Analyze the situation to identify information gaps and unknown constraints"
- "Generate 5-8 how/what questions organized across three categories"
- "Map behind-the-table stakeholders and generate questions to surface them"
- "Identify likely deal-killing issues and generate diagnostic questions"
- "Produce follow-up label templates for each question"
- "Sequence the questions into deployment groups of 2-3"
audience:
roles: ["salesperson", "founder", "manager", "consultant", "recruiter", "procurement", "negotiator", "account-executive", "product-manager"]
experience: "beginner to intermediate — no negotiation training required"
triggers:
- "User is preparing for a negotiation, sales call, or difficult conversation and needs questions to ask"
- "User is stuck and needs the other party to think about solutions rather than just objecting"
- "User needs to find out who else is involved in the decision without being confrontational"
- "User wants to uncover hidden constraints, budget limits, or approval requirements"
- "User needs to diagnose whether a deal-killing issue exists before it surfaces too late"
- "User wants to avoid yes/no questions that shut down conversation"
not_for:
- "Generating the full negotiation preparation document — use negotiation-one-sheet-generator for the complete 5-section prep"
- "Designing price negotiation offers — use ackerman-bargaining-planner"
- "Verifying whether the counterpart's yes is genuine — use commitment-verifier"
- "Defusing hostility before questions can land — use accusation-audit-generator first"
quality: placeholder
---
# Calibrated Questions Planner
## When to Use
You are preparing for a negotiation, sales conversation, discovery call, partnership discussion, or any high-stakes conversation where you need to understand the full picture on the other side — their priorities, constraints, decision process, and the stakeholders who are not in the room.
You want to ask questions that make the counterpart think and talk, rather than questions they can deflect with a yes or no. You need information without seeming demanding. You may also need to surface hidden stakeholders or deal-blocking issues before they derail the deal.
**Input this skill needs:** A situation brief (what the deal is, who is involved, what you know so far) and your goals. If you have a stakeholder map or conversation history, include them.
**Do not use this skill if:** You need to write a complete negotiation plan — use `negotiation-one-sheet-generator`. You need to handle an accusatory or hostile counterpart before questions can land — run `accusation-audit-generator` first to defuse the situation.
---
## Context & Input Gathering
### Required
- **The situation:** What is the negotiation or conversation? (Deal type, stakes, industry context)
- **What you want:** Your target outcome. Be specific — vague goals produce vague questions.
- **What you know about the other side:** Their stated position, what they've said they want, any known constraints.
### Important
- **Who is visible at the table:** The person(s) you are directly speaking with and their role.
- **What you don't know:** Information gaps — what you suspect exists but haven't confirmed (budget, authority, timeline, competing priorities, internal politics).
### Observable (tell the skill what you can observe)
- **Their communication style so far:** Are they forthcoming? Evasive? Technical? Relationship-focused?
- **Any signals of constraint:** Have they said things like "I need to check with someone," "that's not something I can decide," or "we have some internal considerations"?
### Defaults (used if not provided)
- **Stakeholder map:** Assume at least one decision-maker exists who is not present in the conversation.
- **Deal-killing issues:** Assume budget approval, authority limits, and internal alignment are all potentially unresolved unless confirmed.
### Sufficiency check
If you have the situation, your goal, and at least one known information gap, proceed. Missing stakeholder map or conversation history is acceptable — the skill will generate exploratory questions to fill these gaps.
---
## Process
### Step 1: Identify the Information Gaps
**Action:** Read the situation brief and list 4-6 specific things you do not know but need to know to reach your goal. Organize them into three buckets:
- (A) What does this situation actually mean to them — their real priorities and what success looks like for their side?
- (B) Who else is involved in making or blocking this decision?
- (C) What issues could kill this deal that have not been raised yet?
**WHY:** Open-ended strategic questions must be targeted at real unknowns, not asked generically. A question that reveals something you already know wastes conversation time. A question that hits a genuine unknown produces information you can act on. Mapping unknowns first prevents generic question generation and ensures every question has a purpose.
**IF** you have a stakeholder map → cross-check it: identify any decision-makers not confirmed as participants in the conversation.
**IF** you have conversation history → scan for deflections, vague answers, or mentions of "internal" considerations — these signal bucket B and C unknowns.
**IF** you have nothing → default to all three buckets as unknown territory.
---
### Step 2: Generate Bucket A — Value-Revealing Questions
**Action:** Write 2-3 open-ended how/what questions that surface what this situation means to the counterpart — their priorities, what they value most, and what a successful outcome looks like on their side.
**WHY:** You cannot negotiate effectively toward the counterpart's real interests unless you know what they are. Stated positions ("we need a 20% discount") often conceal real interests ("we need to hit budget before Q3 close"). Questions that make the counterpart articulate their own success criteria reveal negotiating room that a pure price-focused approach misses. Phrasing as "how" or "what" — not "why" — keeps the question open without triggering defensiveness. "Why" sounds accusatory; "what" and "how" sound collaborative.
**Question starters to use:**
- "What does success look like for you here?"
- "How does this fit into what you're trying to accomplish?"
- "What matters most to you in how this gets resolved?"
- "What would need to happen for this to work for your side?"
- "How will you know this was the right decision?"
**Constraint:** Every question must start with "how" or "what." Never "why." Never a closed yes/no question.
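The constraint above is mechanical enough to check automatically. A minimal sketch, assuming a hypothetical helper named `is_calibrated` (not part of the skill itself), that accepts only questions whose first word is "how" or "what":

```python
import re

# Allow leading quotes or whitespace, then require "how" or "what" as a whole word.
_OPENER = re.compile(r"^\W*(how|what)\b", re.IGNORECASE)


def is_calibrated(question: str) -> bool:
    """True if the question opens with 'how' or 'what' (case-insensitive)."""
    return _OPENER.match(question) is not None


def rejected(questions: list[str]) -> list[str]:
    """Return the questions that break the how/what constraint."""
    return [q for q in questions if not is_calibrated(q)]


drafts = [
    "What does success look like for you here?",
    "Why did you choose that vendor?",
    "Is the budget approved?",
]
print(rejected(drafts))
```

The check covers only the opening word. It will not catch a "how" or "what" question that is still effectively closed, so a read-through of the surviving questions is still needed.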
---
### Step 3: Generate Bucket B — Behind-the-Table Stakeholder Questions
**Action:** Write 2-3 questions that surface decision-makers, approvers, and deal-killers who are not visibly present in the conversation.
**WHY:** In most organizational negotiations, the person across the table is rarely the only person whose approval matters. Budget owners, legal teams, senior executives, technical reviewers, and implementation teams can all veto a deal from off-stage. Discovering these stakeholders before the final negotiation moment — when they would otherwise appear as a surprise "we need to check with..." — lets you address their concerns proactively. Asking about the team's perspective rather than demanding to speak to the boss keeps the question collaborative, not threatening.
**Ready-to-use questions (adapt as needed):**
- "How does this affect the rest of your team?"
- "What do the people not on this call see as their main concerns?"
- "How on board are the colleagues who aren't part of this conversation?"
- "What does your [manager/board/legal team] need to see to feel comfortable with this?"
- "Who else will be involved in making this final?"
**IF** you already know the decision structure → tailor questions to specific roles (e.g., "What does your CFO need to see?").
**IF** you do not know the decision structure → use the broadest versions to surface the structure first.
---
### Step 4: Generate Bucket C — Deal-Killing Issue Discovery Questions
**Action:** Write 1-2 questions that surface potential deal-blocking issues — budget constraints, approval limits, timing problems, competing priorities, or internal resistance — before they appear as a final objection.
**WHY:** Deal-killing issues that surface at the end of a negotiation, when you believe agreement is close, are far more damaging than issues surfaced early. A budget problem discovered in week one can be worked around; the same problem discovered in week eight, after both sides have invested time and built commitment, often causes the deal to collapse with hard feelings. Questions that invite the counterpart to raise potential problems early signal collaborative intent and protect both sides from wasted effort.
**Ready-to-use questions (adapt as needed):**
- "What could get in the way of this moving forward?"
- "What are the biggest obstacles you see on your side?"
- "How does this fit with your current priorities and budget cycle?"
- "What would make this impossible to approve?"
- "What concerns do you have that we haven't addressed yet?"
---
### Step 5: Add Follow-Up Label Templates
**Action:** For each question, write a follow-up label: an "It seems like..." or "It sounds like..." statement to use after the counterpart answers, before asking the next question.
**WHY:** A question without a follow-up creates a rapid-fire interrogation feeling. After the counterpart answers, pausing to reflect their answer back as a label — "It seems like the timeline pressure is the real constraint here" — demonstrates that you heard them, builds rapport, and often prompts deeper elaboration without requiring another question. Labels after answers are as important as the questions themselves.
**Label template format:**
```
After [Question]: "It seems like [paraphrase of what they said or implied]..."
```
**IF** you don't yet know what they'll say → write a label template with a blank: "It seems like [X] is the real concern here..." — fill in X after you hear their answer.
---
### Step 6: Sequence into Deployment Groups
**Action:** Organize the full question bank into groups of 2-3, ordered from rapport-building to deal-diagnostic. Indicate which group to ask first, second, and third.
**WHY:** Asking multiple questions in sequence without pausing for answers overwhelms the counterpart and signals an interrogation rather than a conversation. Grouping into 2-3 question sets with natural breaks — where you use a label or summary before continuing — keeps the pacing conversational. Starting with value-revealing questions (bucket A) builds understanding and rapport before surfacing potential problems (bucket C). This sequencing also ensures that if the conversation is cut short, you've gathered the most important information first.
**Deployment order:**
1. Group 1: 2 Bucket A questions (value-revealing) + labels
2. Group 2: 2 Bucket B questions (stakeholder discovery) + labels
3. Group 3: 1-2 Bucket C questions (deal-killer discovery) + labels
**IF** the relationship is new → open with the most rapport-friendly bucket A question.
**IF** the relationship is established → you can move to bucket B earlier.
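The default deployment order can be sketched as a small grouping routine. The `Question` type, the `GROUP_PLAN` table, and the per-group cap of two questions are assumptions made for this illustration; the relationship-based adjustments in the IF rules above are left to the user.

```python
from dataclasses import dataclass


@dataclass
class Question:
    text: str
    bucket: str  # "A" (value-revealing), "B" (stakeholder), or "C" (deal-killer)


# Bucket order and maximum questions per group, mirroring the default deployment order.
GROUP_PLAN = [("A", 2), ("B", 2), ("C", 2)]


def deployment_groups(bank: list[Question]) -> list[list[Question]]:
    """Split a question bank into ordered deployment groups, skipping empty buckets."""
    groups = []
    for bucket, limit in GROUP_PLAN:
        picked = [q for q in bank if q.bucket == bucket][:limit]
        if picked:
            groups.append(picked)
    return groups


bank = [
    Question("What could get in the way of this moving forward?", "C"),
    Question("What does success look like for you here?", "A"),
    Question("How does this affect the rest of your team?", "B"),
    Question("What matters most to you in how this gets resolved?", "A"),
]
for number, group in enumerate(deployment_groups(bank), start=1):
    print(f"Group {number}:", [q.text for q in group])
```

Questions beyond the cap are dropped from the groups rather than deleted from the bank, so they remain available as spares if the conversation runs long.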
---
### Step 7: Write the Output Artifact
**Action:** Write `calibrated-questions.md` with all questions organized by category, numbered, with follow-up label templates and deployment group markers.
**WHY:** A written document that the user can refer to during the conversation prevents forgetting key questions under pressure, allows adaptation in real time, and creates a record for post-conversation analysis. The format (numbered, categorized, with labels) mirrors how the questions should be deployed — it is not just a list but a deployment script.
**Output format:** See Outputs section below.
---
## Inputs / Outputs
### Inputs
- Situation brief (required)
- Your goals and constraints (required)
- Stakeholder map (optional)
- Conversation history (optional)
### Outputs
**File:** `calibrated-questions.md`
**Template:**
```markdown
# Calibrated Questions: [Situation Name]
**Prepared for:** [Your name / role]
**Counterpart:** [Name / role / organization]
**Goal:** [Your target outcome in one sentence]
---
## Deployment Group 1 — Value-Revealing Questions
*Ask these first. Pause and label after each answer before asking the next.*
**Q1.** [Question — starts with How or What]
→ Follow-up label: "It seems like [blank]..."
**Q2.** [Question — starts with How or What]
→ Follow-up label: "It sounds like [blank]..."
---
## Deployment Group 2 — Behind-the-Table Stakeholder Questions
*Ask these after establishing understanding. Listen for signs of hidden approvers.*
**Q3.** [Question about team or colleagues]
→ Follow-up label: "It seems like [blank] has a stake in this..."
**Q4.** [Question about who else is involved]
→ Follow-up label: "It sounds like there are people not in this conversation who matter..."
---
## Deployment Group 3 — Deal-Killing Issue Discovery
*Ask these once rapport is established. Surface problems now, not at the end.*
**Q5.** [Question about obstacles or concerns]
→ Follow-up label: "It seems like [blank] could be the real challenge here..."
**Q6.** [Optional: question about internal constraints or approvals]
→ Follow-up label: "It sounds like [blank] is what would make this hard to move forward..."
---
## Anti-Patterns to Avoid During the Conversation
- Do not ask "Why" questions — they trigger defensiveness. Every "Why" question contains an implicit accusation. "Why did you choose that vendor?" sounds like "Justify your decision." Rewrite: "What led you to choose that vendor?" — same information, zero accusation. The shift from "Why" to "What/How" removes the judgment while preserving the inquiry.
- Do not ask multiple questions without pausing for the answer
- Do not ask yes/no questions when you need understanding
- Do not skip the label after the answer — it signals you heard them
---
## Notes
[Space for observations made during the conversation]
```
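One way to fill the question slots of the template programmatically is sketched below. `render_group` is a hypothetical helper written for this illustration; it reproduces only the heading, the numbered question line, and the follow-up label line, and leaves the rest of the template (header fields, anti-patterns, notes) to be written separately.

```python
def render_group(title: str, questions: list[tuple[str, str]], start: int = 1) -> str:
    """Render one deployment group; `questions` holds (question, label) pairs."""
    lines = [f"## {title}"]
    for number, (question, label) in enumerate(questions, start=start):
        lines.append(f"**Q{number}.** {question}")
        lines.append(f'→ Follow-up label: "{label}"')
    return "\n".join(lines)


section = render_group(
    "Deployment Group 3",
    [(
        "What could get in the way of this moving forward?",
        "It seems like [blank] could be the real challenge here...",
    )],
    start=5,  # continue numbering after the four questions in groups 1 and 2
)
print(section)
```

The `start` parameter keeps question numbers continuous across groups, matching the Q1 to Q6 numbering in the template.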
---
## Key Principles
**How and What questions shift the problem-solving burden.** When you ask "How am I supposed to do that?" after a counterpart makes an unreasonable demand, you neither accept the demand nor reject it outright; you return the problem to them. Instead of arguing against their position, you ask them to solve your constraint, so they must now think through a problem they created. This forces them to consider your perspective and often produces creative solutions, workable for both sides, that you would never have discovered by presenting your own counter-proposal immediately. "How am I supposed to do that?" is the single most powerful calibrated question in negotiation: at Harvard, this single question defeated two negotiation professors in a mock hostage scenario.
**WHY:** The counterpart works harder on a problem they were invited to solve than one they were handed. When people create solutions, they feel ownership over those solutions — making them far more likely to follow through than if you had suggested the same solution yourself.
**"Why" is an accusation in disguise.** Even well-intentioned "why" questions — "Why is that a problem for you?" — activate defensiveness because they implicitly challenge the counterpart's judgment or motives. "What" and "how" questions ask the counterpart to describe rather than justify, which keeps them in a collaborative frame.
**WHY:** "Why" forces the counterpart to defend a position, which causes them to entrench it. "What" and "how" invite the counterpart to explain a situation, which often reveals more information and opens more options.
**Behind-the-table stakeholders are the most common deal-killers.** In organizational negotiations, the person you are talking to almost never has full authority. Budget owners, legal counsel, senior executives, and technical approvers all have veto power and are frequently not in the conversation. Discovering them early through collaborative questions prevents the "I need to check with someone" surprise that kills deals in their final hours.
**WHY:** A counterpart who says "I need to check with my boss" at the final stage is not being evasive — they may genuinely lack authority. But discovering this in hour one versus hour forty saves enormous time and effort on both sides and allows you to design the conversation to address the hidden stakeholders' concerns proactively.
**Questions in groups of 2-3 with labels in between.** Never ask a string of questions without pausing to acknowledge the answer. Each answer deserves a label — a reflection of what you heard — before the next question. This sequencing signals that you are listening, not interrogating, and often produces more complete answers than the next question would have generated.
**WHY:** Labels after answers serve two functions: they confirm understanding (reducing misinterpretation risk) and they create a moment of feeling heard, which makes the counterpart more willing to continue sharing. An interrogation-style rapid-fire sequence closes people off; a listen-label-question rhythm opens them.
**Calibrated questions create the illusion of control, and that illusion is the point.** The counterpart believes the answer was their idea. Solutions the counterpart proposes themselves tend to be carried out more reliably than solutions imposed on them, and this ownership effect means they will fight harder to implement "their" solution than to comply with yours.
**WHY:** When people feel they arrived at a decision through their own reasoning, they commit to it. When they feel a solution was handed to them, they comply at best — and resist at worst. Calibrated questions engineer the conditions for the counterpart to reach your preferred conclusion on their own terms.
**Surface deal-killers early, not late.** Asking "What could get in the way of this?" in the first conversation feels collaborative and signals you care about making the deal work. Discovering the same obstacle in week eight, after both parties have invested time and emotion, triggers frustration and often derails deals that could have been saved with early awareness.
**WHY:** Early discovery of problems gives both sides time to problem-solve. Late discovery, after commitment has built, is experienced as a loss: both parties feel they are giving something up, and loss aversion makes that loss feel larger than the value of the deal itself. Early questions protect both sides from this dynamic.
---
## Examples
### Example 1: Enterprise Software Sales — Uncovering Hidden Budget Constraints
**Scenario:** An account executive is closing a $120K annual contract. The prospect's main contact (a VP of Operations) has been enthusiastic throughout three months of evaluation. Two days before the expected close, the VP says "we need a bit more time." The executive suspects a budget approval issue but has not confirmed it.
**Trigger:** "What questions should I ask to find out what's really going on without being pushy?"
**Process:**
- Step 1: Gaps — unknown whether VP has budget authority, unknown who else must approve, unknown whether there's a competing priority or budget freeze
- Step 2: Bucket A — "What does success look like for your organization if this goes live in Q1?"
- Step 3: Bucket B — "What does your finance team or CFO need to see before something like this moves forward?" / "How on board are the colleagues who haven't been part of our conversations?"
- Step 4: Bucket C — "What could get in the way of moving forward by end of quarter?" / "What would make this impossible to approve right now?"
- Steps 5-6: Sequence — start with bucket A to rebuild rapport, then bucket B to surface the approval process, then bucket C to name the obstacle directly
**Output:** `calibrated-questions.md` with 6 questions across 3 groups, each with a follow-up label template. The bucket C question ("What would make this impossible to approve right now?") surfaces the budget freeze directly, allowing the executive to offer flexible payment terms rather than losing the deal.
---
### Example 2: Salary Negotiation — Discovering Real Constraints and Decision Structure
**Scenario:** A candidate is negotiating a job offer. The recruiter has offered $115K. The candidate's target is $130K. The recruiter says "that's the top of the band." The candidate does not know whether this is a firm limit or an opening position, or who has authority to go above the band.
**Trigger:** "How do I find out if there's any real flexibility without ruining the relationship?"
**Process:**
- Step 1: Gaps — unknown whether band is truly fixed or negotiable, unknown who has authority to approve exceptions, unknown what the full package looks like (equity, bonus, title)
- Step 2: Bucket A — "What does the full compensation picture look like at this level?" / "How does the company think about the tradeoff between base and total compensation?"
- Step 3: Bucket B — "What would need to happen for an exception to the band to be considered?" / "How does a decision like that typically get made?"
- Step 4: Bucket C — "What would make this offer impossible to adjust?"
- Steps 5-6: Sequence — open with bucket A (expands the conversation beyond base salary, may reveal equity or bonus levers), then bucket B to understand decision process, save bucket C as a last probe
**Output:** `calibrated-questions.md` with 5 questions. The bucket B question ("What would need to happen for an exception to be considered?") reveals that the hiring manager can approve exceptions with VP sign-off — and the recruiter offers to make that call.
---
### Example 3: Partnership Negotiation — Surfacing Deal-Killing Misalignment Early
**Scenario:** A startup founder is negotiating a distribution partnership with a large retail chain. After two months of discussion, both sides appear aligned. The founder is about to present final terms and suspects there may be exclusivity requirements from the retailer's side that would conflict with other distribution agreements — but nothing has been said.
**Trigger:** "Before I present final terms, what questions should I ask to make sure there are no surprises?"
**Process:**
- Step 1: Gaps — unknown whether retailer requires exclusivity, unknown who in the retailer's organization must approve the deal, unknown whether procurement or legal has different requirements than the business development contact
- Step 2: Bucket A — "What does a successful partnership look like for your team 12 months from now?" / "What matters most to your organization in how this gets structured?"
- Step 3: Bucket B — "Who else in your organization will be involved in finalizing this?" / "What does your legal or procurement team typically need to review?"
- Step 4: Bucket C — "What would make this structure impossible for your side to agree to?" / "What concerns do you have that we haven't discussed?"
- Steps 5-6: Sequence — start with bucket A to reinforce shared vision, move to bucket B to surface legal and procurement review (rather than discovering it after terms are presented), use bucket C to explicitly invite deal-killing issues to the surface now
**Output:** `calibrated-questions.md` with 6 questions. The bucket C question surfaces the retailer's exclusivity requirement before final terms are presented, allowing both sides to redesign the agreement structure rather than walking away after a failed close.
---
## References
| File | Contents |
|------|----------|
| `references/question-bank.md` | Full library of 15+ ready-to-use calibrated questions across all three categories; anti-pattern reference table; "why" vs "how/what" rewrites for common questions; follow-up label templates; deployment sequencing guide; Harvard mock hostage case study breakdown |
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Identify the hidden unknowns that will determine whether your negotiation succeeds or fails before you ever make an offer. Use when someone asks "why is my c...
---
name: black-swan-discovery
description: |
Identify the hidden unknowns that will determine whether your negotiation succeeds or fails before you ever make an offer. Use when someone asks "why is my counterpart acting irrationally?", "why does this deal keep stalling for no apparent reason?", "what am I missing about this negotiation?", "how do I find out what the other side really wants?", or "why won't they just say yes when the deal is clearly good for them?" Also use for: diagnosing a stalled sales cycle where the prospect keeps deflecting, investigating why a candidate rejected an offer that seemed strong, uncovering hidden constraints before entering a high-stakes contract renegotiation, mapping leverage before a complex partnership discussion, or rebuilding a broken negotiation relationship. Produces a black-swan-report.md with a hypothesis map of unknown unknowns in all three categories (worldview mismatches, hidden constraints, hidden agendas), a leverage inventory across all three leverage types, and a prioritized bank of investigation questions to surface what you do not yet know. Pair with counterpart-style-profiler (to refine worldview hypotheses by type) and calibrated-questions-planner (to convert investigation questions into a deployment-ready set).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/black-swan-discovery
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [5, 10]
tags: [negotiation, black-swan, unknown-unknowns, leverage, hidden-constraints, hidden-agendas, worldview, loss-aversion, irrational-behavior, discovery, deal-diagnosis, pre-negotiation]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Deal history — what has been offered, what responses have been received, what has stalled or confused you"
- type: document
description: "Counterpart behavior observations — anything that seemed irrational, inconsistent, or unexpectedly emotional"
- type: document
description: "Stakeholder map (if available) — who else may be involved in or affected by the decision"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works with pasted deal summaries or document files. Richer context produces more targeted hypotheses."
discovery:
goal: "Produce a black-swan-report.md that maps hypothesized unknown unknowns across three categories, inventories available leverage, and generates a prioritized question bank to surface the hidden information that is blocking the deal"
tasks:
- "Gather deal history, counterpart behavior observations, and stakeholder context"
- "Diagnose potential unknown unknowns in all three Black Swan categories"
- "Flag any irrational or contradictory behavior as a diagnostic signal"
- "Map available leverage across all three leverage types"
- "Generate 6-10 investigation questions targeting the highest-priority unknowns"
- "Produce the black-swan-report.md artifact"
audience: "salespeople, founders, managers, consultants — anyone stuck in a negotiation that is not moving for reasons they cannot explain"
when_to_use: "When a negotiation is stalling, when counterpart behavior seems irrational, or when preparing for a high-stakes deal where the full picture is unclear"
environment: "Document set (deal-history.md, counterpart-observations.md) or free-text description"
quality: placeholder
---
# Black Swan Discovery
## When to Use
You are in a negotiation that is not moving the way it should. The deal looks reasonable on paper, but the counterpart keeps deflecting, delaying, or saying no without a clear explanation. Or you are preparing for a high-stakes negotiation and you suspect there is important information you do not have.
This skill applies when:
- A prospect or counterpart is behaving in ways that seem irrational — rejecting good offers, raising unrelated objections, going silent, or escalating without explanation
- A deal has stalled and you cannot diagnose why
- You are entering a negotiation where the counterpart's real priorities, constraints, or motivations are unclear
- You have received a surprising no after what felt like a strong offer
- You want to map what leverage you actually have before making your next move
- You are rebuilding a previously failed or contentious negotiation
The core principle: **apparent irrationality is almost always a signal, not a fact.** When a counterpart behaves in ways that do not make sense given what you know, they are usually operating on information, constraints, or motivations you do not yet have. The goal of this skill is to surface those unknown unknowns — called "Black Swans" in negotiation practice — so you can address them directly rather than responding to symptoms.
**Black Swans are unknown unknowns: pieces of information whose existence you are not even aware of.** A known unknown is a question you already know you need to answer ("I don't know what their budget is"), and you can resolve it by asking directly. An unknown unknown is an undisclosed constraint, hidden relationship, or worldview difference that you do not yet know to ask about. Surfacing it requires a different approach: you create conditions for unexpected information to emerge, through face time, active listening, and watching for behaviors that don't make sense under your current model. These are the deal-killers that blindside experienced negotiators who are otherwise well-prepared.
Before starting, confirm you have:
- A description of the negotiation situation and what you want
- At least some observation of counterpart behavior — ideally including something that seemed puzzling or out of proportion
- A rough sense of who else might be involved beyond the person you are talking to
---
## Context & Input Gathering
### Required Context
- **The deal:** What is being negotiated? What have you offered? What has the counterpart said?
- **The gap:** What is preventing agreement? What does the counterpart say when they do not move?
- **The confusion:** What behavior or response surprised or puzzled you?
### Observable Context
If documents are provided, read them for:
- Inconsistencies between what the counterpart says they want and how they respond to offers that give it to them
- Unexpected escalation, emotion, or energy — positive or negative — around specific topics
- Statements that reveal unstated assumptions ("We'd never do that" without explaining why)
- Evidence of other stakeholders or decision-makers referenced but not introduced
- Deadlines, budget cycles, reporting relationships, or organizational constraints mentioned in passing
- Prior failed deals or contentious history that might be shaping the counterpart's current behavior
### Default Assumptions
- If no behavior observations are provided → assume some form of stall or deflection is occurring (most common Black Swan signal)
- If no stakeholder map is provided → assume at least one undisclosed stakeholder exists (boss, board, partner, budget owner)
- If the deal is described as "going well" with no confusion → treat this skill as pre-emptive mapping rather than diagnosis
### Sufficiency Check
Before generating the report, confirm you can answer: "What is the most puzzling or inexplicable thing this counterpart has done?" If there is nothing puzzling, you may still run the skill as a prevention exercise. If there is something puzzling, that specific behavior is your primary entry point for hypothesis generation.
---
## Process
### Step 1: Audit for Irrational Behavior Signals
**ACTION:** Review the deal history and behavior observations. Flag every instance where the counterpart's behavior seems disproportionate, inconsistent, or unexplainable given what you know.
**WHY:** Irrational behavior is not a dead end — it is the most reliable diagnostic signal in negotiation. When a counterpart acts in ways that do not serve their apparent interests, they are almost always responding to information you do not have. The common error is to dismiss unexplained behavior as personality, stubbornness, or bad faith. Skilled negotiators treat every instance of apparent irrationality as a question: "What would they have to believe or be constrained by for this behavior to make sense?" That reframe is what makes Black Swan discovery possible. Without this audit step, the hypotheses generated in Step 2 will be generic rather than targeted.
**Flag as signals:**
- Rejecting an offer that clearly meets their stated criteria
- Sudden escalation or emotional reaction to a neutral proposal
- Agreeing in conversation but not following through with action
- Deflecting or delaying on one specific issue while being cooperative on others
- Introducing new objections after previous objections have been resolved
- Expressing concern about something seemingly unrelated to the stated deal terms
---
### Step 2: Generate Hypotheses Across All Three Black Swan Categories
**ACTION:** For each flagged signal (and proactively if no signals are present), generate at least one hypothesis in each of the three categories below. Write each hypothesis as a testable statement.
**WHY:** Most negotiators mentally default to looking for one type of explanation — usually "they're being difficult" or "their budget is the constraint." The three-category framework forces you to search in places you would not naturally look. Worldview mismatches are invisible to both parties until one of them names them. Hidden constraints are rarely disclosed voluntarily. Hidden agendas are almost never mentioned because they are often embarrassing or politically sensitive. Generating hypotheses in all three categories is what makes the investigation comprehensive rather than confirmation-seeking.
#### Category 1: Worldview Mismatch
The counterpart is operating from fundamentally different assumptions about the situation — its context, norms, or what a "good outcome" looks like. Neither party is necessarily wrong; they are working from different maps of the same territory.
**Diagnostic signals:** Confusion when you state something you consider obvious; strong reactions to framing that seems neutral to you; repeated return to a principle or concern that seems disconnected from the specific deal terms; reluctance to engage with options that are standard in your context.
**Example hypotheses:**
- "They may believe this type of deal normally works differently and our structure looks unusual or suspicious to them."
- "They may have a fundamentally different risk tolerance than we do, shaped by a prior experience we don't know about."
- "They may define success for this negotiation in terms we have not heard — reputation, precedent-setting, relationship signal — not just the financial terms."
#### Category 2: Hidden Constraints
The counterpart cannot agree due to a constraint they have not disclosed — budget, timing, authority, process, or external dependency. They often will not disclose it voluntarily because it feels embarrassing, risky, or irrelevant to them.
**Diagnostic signals:** Stalls that coincide with organizational events (quarter end, budget cycle, leadership change); escalation to someone new in the conversation without explanation; requests for time that seem unrelated to the complexity of the decision; reluctance to commit on a specific dimension (price, timeline, scope) even when others are resolved.
**Example hypotheses:**
- "They may not have approval authority for this deal size and cannot admit it without losing face."
- "They may have a budget constraint that resets at a specific date, making timing the real variable."
- "They may be waiting on an internal decision that is blocking this one — a reorg, a competing priority, a budget freeze."
- "The actual decision-maker may not be the person we are talking to, and that person has not been brought into the conversation."
#### Category 3: Hidden Agenda
The counterpart has a goal, interest, or priority unrelated to (or only partially related to) the stated deal — one they are not disclosing because it is politically sensitive, personally motivated, or awkward to raise directly.
**Diagnostic signals:** Interest in elements of the deal that seem peripheral to their stated goal; unusual energy around non-financial terms; resistance to transparency or documentation that would be neutral in a straightforward deal; behavior that serves their personal position rather than their organization's interest; references to internal relationships or politics without elaboration.
**Example hypotheses:**
- "They may need this deal to succeed for personal career reasons — being seen as the person who closed it — and our current structure does not give them that."
- "They may be in conflict with an internal stakeholder and closing this deal would shift internal power in a way they want or fear."
- "They may need to demonstrate to their organization that they pushed back hard, regardless of the outcome — the negotiation itself is part of their deliverable."
- "There may be a competing relationship they are protecting — a vendor, a partner, a colleague — that this deal would threaten."
---
### Step 3: Map Available Leverage
**ACTION:** For each credible hypothesis from Step 2, identify which type of leverage is most relevant and what form it takes in this specific situation.
**WHY:** Leverage is not a fixed property of the deal; it is relative to what the counterpart values and fears. The same negotiation can contain all three leverage types simultaneously, but they apply to different counterpart concerns. Mapping leverage before deploying it prevents the most common error: using the wrong type of leverage for the counterpart's actual situation. Using negative leverage on a counterpart who is driven by a hidden agenda (not by fear of loss) is ineffective and damages the relationship. Using normative leverage on a counterpart who has a genuine financial constraint is similarly ineffective.
#### Leverage Type 1: Positive Leverage
The counterpart wants something you can provide that they cannot easily get elsewhere. This is leverage you have when the deal is genuinely attractive to them.
**Activation condition:** The counterpart has demonstrated genuine interest in the outcome, not just in the negotiation process.
**Forms:** Your product/service/offer is differentiated; you have unique access, timing, or relationship value; the deal solves a specific problem they have urgency around.
**Application:** Use positive leverage to reinforce the value of moving forward — frame what they stand to gain if the deal closes. Most effective when combined with a concrete deadline or alternative that makes the opportunity feel limited.
#### Leverage Type 2: Negative Leverage
The counterpart fears what happens if the deal does not close — loss of the opportunity, loss of face, a worse outcome with an alternative. This is leverage based on what they stand to lose.
**Activation condition:** The counterpart has a real and recognized downside from a failed deal — not hypothetical.
**Forms:** A genuine alternative offer or option you can pursue; a time constraint that makes delay costly for them; the cost of the status quo (not closing this deal means the problem continues).
**Application:** Use sparingly and indirectly. Stating negative leverage explicitly often reads as a threat and triggers defensiveness. Instead, frame it as sharing your own constraints: "I need to make a decision by Friday on whether to go in a different direction."
**Loss aversion principle (Kahneman & Tversky):** Negative leverage is disproportionately powerful because of loss aversion, the psychological principle that losses are felt roughly twice as intensely as equivalent gains. A $100K loss stings about twice as much as a $100K gain satisfies, so implied negative consequences (what they stand to lose) carry roughly twice the motivational weight of equivalent positive offers (what they stand to gain). "You stand to lose the early pricing" activates more urgency than "You can lock in the lower rate." Frame negative leverage as consequences, not threats: "It seems like if this doesn't work out, the pilot investment would be difficult to justify internally." Use this asymmetry deliberately when the counterpart is not moving on a deal that serves their interests, but only when the counterpart already recognizes the loss as real.
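
The asymmetry described above can be sketched as simple arithmetic. This is a minimal illustration, not a full prospect-theory model: the function name `felt_value` and the flat coefficient of 2.0 are assumptions taken from the "roughly twice" figure in the text.

```python
# Illustrative only: a linear loss-aversion weighting, not a full prospect-theory model.
LOSS_AVERSION = 2.0  # assumed coefficient, from the "roughly twice" rule of thumb


def felt_value(amount: float, loss_aversion: float = LOSS_AVERSION) -> float:
    """Return the subjective weight of a gain (positive) or a loss (negative)."""
    if amount >= 0:
        return amount
    return amount * loss_aversion


gain = felt_value(100_000)     # a $100K gain, weighted as-is
loss = felt_value(-100_000)    # a $100K loss, weighted by the assumed coefficient
ratio = abs(loss) / gain       # how much heavier the loss feels than the gain
```

Under the assumed coefficient, the same dollar amount framed as a loss carries twice the weight of the gain framing, which is why the "you stand to lose" wording tends to create more urgency.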
#### Leverage Type 3: Normative Leverage
The counterpart has stated standards, values, or commitments that their proposed behavior contradicts. This leverage uses the gap between their stated principles and their actions.
**Activation condition:** The counterpart has made statements — about fairness, their organization's values, how they treat partners, what their word means — that their current position violates.
**Forms:** Prior commitments ("In our last call you said that X was important to you — how does this decision align with that?"); organizational stated values; professional norms in their industry; precedents they themselves set in prior deals.
**Application:** Frame the discrepancy as a genuine question, not an accusation. "Help me understand how this fits with what you told me about your process" invites reflection without triggering defensiveness. This leverage type often works on hidden agenda situations because it surfaces the gap between the counterpart's public position and their private motivation.
---
### Step 4: Design the Investigation Question Bank
**ACTION:** For each high-priority hypothesis (focus on 3-5), write 1-2 calibrated investigation questions designed to surface that specific unknown without triggering defensiveness.
**WHY:** The purpose of the investigation questions is not to confirm your hypotheses — it is to create space for information you do not have to emerge. Hypotheses are starting points, not conclusions. The questions must be open (How/What, not Why or Yes/No), must convey genuine curiosity rather than accusation, and must be structured so that any answer — including "that's not it" — gives you useful information. Closed questions (Did you have approval? Is budget the issue?) allow the counterpart to shut the inquiry down with a simple denial. Open questions require elaboration that reveals the underlying reality whether or not the specific hypothesis was correct.
**Question design rules:**
- Start with How or What — never Why (reads as accusatory)
- Express genuine curiosity, not challenge: "Help me understand..." or "What would it take for..."
- Target one hypothesis per question
- After asking, use a label to create space: "It seems like there might be more to this."
- Sequence from least sensitive to most sensitive — build trust before surfacing the deepest hypotheses
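
The wording rules above can be checked mechanically before a question goes into the bank. The sketch below is one possible check, assuming hypothetical names (`check_question`, `OPEN_STARTERS`, `CLOSED_STARTERS`) and a simple first-word heuristic; it cannot judge tone or sequencing, only the opening of the question.

```python
import re

# Hypothetical helper: the starter lists are assumptions that mirror the rules above.
OPEN_STARTERS = ("how", "what", "help me understand")
CLOSED_STARTERS = (
    "did", "do", "does", "is", "are", "was", "were",
    "can", "could", "will", "would", "have", "has",
)


def check_question(question: str) -> list[str]:
    """Return a list of rule violations for one investigation question."""
    problems = []
    text = question.strip().lower()
    first_word = re.split(r"\W+", text, maxsplit=1)[0]
    if first_word == "why":
        problems.append("starts with 'Why' (reads as accusatory)")
    elif first_word in CLOSED_STARTERS:
        problems.append("closed yes/no question (invites a simple denial)")
    elif not text.startswith(OPEN_STARTERS):
        problems.append("does not start with How, What, or 'Help me understand'")
    return problems


# Usage: an empty list means the question passes the opening-word rules.
ok = check_question("What does the approval process look like from here?")
closed = check_question("Is budget the issue?")
```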
**Example question bank structure:**
| Hypothesis | Investigation Question | Black Swan Category |
|---|---|---|
| Hidden decision authority | "What does the approval process look like from here?" | Constraint |
| Worldview on deal structure | "How does this compare to how you typically structure these arrangements?" | Worldview |
| Personal stake in outcome | "What would success look like for you personally in how this gets resolved?" | Agenda |
| Budget timing constraint | "How does your organization's planning cycle affect timing on decisions like this?" | Constraint |
| Competing internal relationship | "What other considerations are you weighing as you think about this?" | Agenda |
---
### Step 5: Write the Black Swan Report
**ACTION:** Produce the `black-swan-report.md` artifact with the full hypothesis map, leverage inventory, and investigation question bank.
**WHY:** Documenting the hypotheses and leverage map before the next conversation serves two functions. First, it prevents the most dangerous negotiation failure mode: discovering the key unknown unknowns in the middle of the conversation and responding reactively rather than strategically. Second, it forces intellectual honesty — written hypotheses are easier to revise as new information arrives than assumptions held loosely in memory. The report also functions as a brief that can be shared with colleagues or revisited after a conversation to update based on what was learned.
---
## Inputs
| Input | Required | Format |
|---|---|---|
| Deal description | Yes | Any — markdown, plain text, verbal summary |
| What you want from the negotiation | Yes | One sentence minimum |
| Counterpart description | Yes | Role, organization, relationship history |
| Counterpart behavior observations | Yes | What has happened, what has been puzzling |
| Prior offers and responses | Optional | Summary or transcript |
| Stakeholder map | Optional | Who else is involved or affected |
| Counterpart profile | Optional | Output from counterpart-style-profiler |
---
## Outputs
Produce `black-swan-report.md` with the following structure:
```markdown
# Black Swan Report
**Deal:** [One-sentence description of what is being negotiated]
**Counterpart:** [Who they are, role, organization]
**Status:** [Current state — stalled, progressing, first contact, etc.]
**Primary Signal:** [The most puzzling or inexplicable behavior observed]
---
## Irrational Behavior Signals
| Behavior | Why It Seems Irrational | What It Might Signal |
|---|---|---|
| [Specific observed behavior] | [Why it doesn't fit their stated interests] | [Category — worldview / constraint / agenda] |
---
## Black Swan Hypotheses
### Category 1: Worldview Mismatches
- **Hypothesis 1a:** [What assumption mismatch might explain the observed behavior?]
- *Signal it would explain:* [Which behavior]
- *How to test:* [Observation or question that would confirm/disconfirm]
- **Hypothesis 1b:** [Alternative worldview hypothesis]
### Category 2: Hidden Constraints
- **Hypothesis 2a:** [What constraint might they not be disclosing?]
- *Signal it would explain:* [Which behavior]
- *How to test:* [Question or timing observation]
- **Hypothesis 2b:** [Alternative constraint hypothesis]
### Category 3: Hidden Agendas
- **Hypothesis 3a:** [What personal or political motivation might be at play?]
- *Signal it would explain:* [Which behavior]
- *How to test:* [Observation or indirect question]
---
## Leverage Inventory
| Leverage Type | What You Have | Activation Condition | Risk |
|---|---|---|---|
| Positive | [What they want that you can provide] | [When it applies] | [What could undermine it] |
| Negative | [What they stand to lose if this fails] | [When it's real, not hypothetical] | [Overuse = threat = defensiveness] |
| Normative | [What they've said that their current behavior contradicts] | [Specific prior statements or values] | [Must be framed as question, not accusation] |
---
## Investigation Question Bank
**Priority questions (next conversation):**
1. [Most important — targets highest-priority hypothesis]
*WHY this question:* [What unknown it surfaces]
*Follow with:* [Label to create space for elaboration]
2. [Second priority question]
*WHY this question:* [What unknown it surfaces]
3. [Third priority question]
*WHY this question:* [What unknown it surfaces]
**Secondary questions (if time or if primary questions open new threads):**
4. [Question targeting secondary hypothesis]
5. [Question targeting normative leverage opportunity]
---
## Discovery Strategy
**What to do before the next conversation:**
- [Specific preparation step — research, stakeholder outreach, leverage validation]
**What to observe in the next conversation:**
- [Specific signal that would confirm or disconfirm a key hypothesis]
**If the primary hypothesis is confirmed:**
- [How to address it — acknowledge the constraint, reframe the offer, introduce normative leverage]
**If the primary hypothesis is wrong:**
- [What to look for next — which secondary hypothesis becomes more likely]
```
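
If the report is assembled programmatically, the hypothesis section of the template above could be generated from structured data. The following is a rough sketch under stated assumptions: the `Hypothesis` dataclass, its field names, and `render_hypotheses` are hypothetical, and the sub-bullets are indented two spaces, which the template may or may not intend.

```python
from dataclasses import dataclass

# Section titles copied from the report template above.
CATEGORIES = ("Worldview Mismatches", "Hidden Constraints", "Hidden Agendas")


@dataclass
class Hypothesis:
    category: int   # 1, 2, or 3, matching the category numbering in the template
    statement: str  # the testable hypothesis
    signal: str     # the observed behavior it would explain
    test: str       # the question or observation that would confirm or disconfirm it


def render_hypotheses(hypotheses: list[Hypothesis]) -> str:
    """Render the '## Black Swan Hypotheses' section as Markdown."""
    lines = ["## Black Swan Hypotheses"]
    for number, title in enumerate(CATEGORIES, start=1):
        lines.append(f"### Category {number}: {title}")
        in_category = [h for h in hypotheses if h.category == number]
        for index, h in enumerate(in_category):
            label = f"{number}{chr(ord('a') + index)}"  # 1a, 1b, 2a, ...
            lines.append(f"- **Hypothesis {label}:** {h.statement}")
            lines.append(f"  - *Signal it would explain:* {h.signal}")
            lines.append(f"  - *How to test:* {h.test}")
    return "\n".join(lines)
```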
---
## Key Principles
- **Apparent irrationality is a signal, not a conclusion.** When your counterpart acts in ways that do not serve their stated interests, the correct response is "What am I missing?" not "They're being unreasonable." Almost every instance of apparent irrationality is a window into information you do not yet have.
*WHY:* People do not generally act against their own interests. When behavior looks irrational from your perspective, the most likely explanation is that you are missing context — a constraint, a belief, a relationship, a fear — that makes the behavior internally consistent. Dismissing the behavior as "crazy" closes off the inquiry. Treating it as a signal keeps it open.
- **Unknown unknowns are not the same as known unknowns.** A known unknown is something you know you don't know ("I don't know their budget"). An unknown unknown is something you don't know you don't know. Black Swans are the second type — which is why they require active hypothesizing, not just direct questions about the things you already know to ask about.
*WHY:* Known unknowns can be addressed with direct questions. Unknown unknowns cannot be directly asked about because you do not yet know they exist. This is why the three-category hypothesis framework exists: it forces you to search in categories (worldview, constraints, agendas) where information is systematically undisclosed rather than merely unknown.
- **Face time surfaces what documents do not.** The most important Black Swans are rarely discoverable through written communication. They emerge in body language, pauses, inconsistencies between tone and content, and the emotional energy around specific topics. This is why high-stakes negotiations require in-person or video time — not just more email.
*WHY:* Written communication is filtered. People edit their words before sending. In real-time conversation, especially face-to-face, the emotional and nonverbal signals that reveal hidden constraints and agendas are much harder to suppress. The counterpart who writes a completely neutral email often reveals in the first thirty seconds of a call that something else is going on.
- **Losses weigh roughly twice as much as equivalent gains.** When designing leverage or framing choices for your counterpart, a potential loss activates approximately twice the urgency of an equivalent potential gain. Use this asymmetry deliberately, not manipulatively: frame the real cost of inaction rather than fabricating artificial urgency.
*WHY:* Prospect theory (Kahneman and Tversky) demonstrates that the psychological pain of losing something is roughly twice the pleasure of gaining the equivalent. In negotiation, this means "you might lose the early pricing by waiting" is approximately twice as motivating as "you can save money by deciding now." Both statements can be true simultaneously — the framing determines which motivational system you are activating.
- **All three leverage types can apply simultaneously.** Many negotiators think in terms of one dominant leverage type. In practice, most negotiations contain all three, each applying to a different counterpart concern. Mapping all three before the conversation lets you choose the right tool for the specific hypothesis you are investigating.
*WHY:* Using negative leverage on a counterpart who has a hidden agenda (not a fear of loss) is ineffective and often damages the relationship. The right leverage type for each situation depends on what the counterpart actually cares about — which is only discoverable through the hypothesis and investigation process.
- **Normative leverage is almost always available.** Every counterpart has stated values, prior commitments, and professional standards. Surfacing a gap between those statements and their current behavior — framed as a genuine question — is one of the most powerful and least adversarial ways to create movement. It requires the counterpart to reconcile their stated principles with their actions.
*WHY:* People are strongly motivated to maintain consistency between their stated values and their behavior (consistency bias). When a counterpart's behavior contradicts something they have said — about fairness, process, their organization's standards — gently surfacing that discrepancy creates internal pressure to resolve it. This does not require any threat or external pressure; the inconsistency itself creates the leverage.
---
## Examples
### Example 1: The Korean MBA Student and the Ex-Boss
**Scenario:** A Korean MBA student needs a letter of recommendation from a former employer who has become distant and difficult to reach. The student cannot understand why the ex-boss is unresponsive. The relationship had been positive during employment.
**Trigger:** "My former boss agreed to write a recommendation but keeps delaying. We haven't worked together in two years. I can't figure out what's going on."
**Process:**
- Signal audit: Agreement followed by prolonged inaction despite reminders. Behavior is inconsistent with a simple yes.
- Category 1 hypothesis: The ex-boss may assume that helping a former employee makes sense only if the relationship has ongoing value — a networking norm the student has not activated.
- Category 2 hypothesis: The ex-boss may have a time constraint (travel, project deadline) that they are too embarrassed to name after already agreeing.
- Category 3 hypothesis (discovered via investigation): The ex-boss may need something in return — specifically, a connection to the MBA program's network or a way to be associated with the student's future success. The letter is not just a favor; it is an implicit exchange the student has not acknowledged.
**Black Swans discovered:** (1) Worldview mismatch — the ex-boss operates on implicit reciprocity norms the student has not addressed; (2) Hidden agenda — the ex-boss wants to be positioned as a champion in the MBA community, not just a reference provider. When the student acknowledged both (offered to introduce the ex-boss to program contacts and framed the letter as a mutual professional visibility opportunity), the letter arrived within a week.
**Key leverage deployed:** Normative (the ex-boss had agreed and their professional reputation as a mentor was at stake) and Positive (access to MBA network contacts).
---
### Example 2: Stalled Enterprise Sales Deal
**Scenario:** A SaaS company's sales rep has been in a six-month sales cycle with a VP of Operations at a mid-size logistics company. The deal has gone through legal review, the technical evaluation was positive, and the VP has expressed enthusiasm in every conversation. But the final approval keeps getting pushed.
**Trigger:** "This deal should have closed three months ago. The champion loves the product, legal is done, pricing is agreed. Every time I ask about timing, she says 'soon' but nothing happens."
**Process:**
- Signal audit: Sustained enthusiasm combined with inability to execute. Gap between stated support and ability to close. Timing questions deflected with vague answers.
- Category 2 hypothesis (highest priority): Hidden budget constraint — the VP may not have authorization for this deal size and has been hoping to resolve that internally without disclosing it.
- Category 2 secondary hypothesis: There is a competing internal priority consuming the executive sponsor's attention and political capital.
- Category 3 hypothesis: The VP needs visible executive buy-in from her own boss before committing — either for political protection or because the actual decision-maker is above her.
**Investigation questions designed:**
1. "What does the approval process look like from here on your end?" (surfaces authority constraint without accusation)
2. "What would need to be true for this to move in the next thirty days?" (opens space for constraint disclosure)
3. "Who else in your organization would want to weigh in on a decision like this?" (surfaces undisclosed stakeholder)
**Outcome:** Question 3 revealed a CFO who had budget veto authority and had not been introduced into the process. The VP had been trying to shield the deal from CFO scrutiny but could not close without his approval. Once the sales rep offered to support the VP in presenting the ROI case to the CFO directly, the deal closed within three weeks.
---
### Example 3: Partnership Negotiation Breakdown
**Scenario:** A startup founder is negotiating a distribution partnership with a larger company. Initial conversations were extremely positive, but after a term sheet was sent, communication became formal and slow. The other party's lead negotiator has been replaced by someone the founder has never met.
**Trigger:** "Everything was going great, then we sent the terms and it all went cold. Now there's a new person involved who seems hostile for no reason. I don't understand what changed."
**Process:**
- Signal audit: Sudden shift from warmth to formality after terms were sent; personnel change; new representative displays hostility without stated cause.
- Category 1 hypothesis: The term sheet may have violated an unstated norm in how the other company handles formal proposals. Perhaps they expected a different structure, or the founder went around the appropriate channel.
- Category 3 hypothesis (highest priority): The original champion may have been moved off the deal for internal reasons, and the new representative has a mandate or a personal motivation to renegotiate from a lower position.
- Category 3 secondary: The new representative may have been assigned specifically because the deal was seen internally as unfavorable, and the warmth in early conversations was not accurately representing the other organization's actual position.
**Leverage mapped:**
- Positive: Limited — the deal terms are now under review and the new representative has not expressed the same enthusiasm.
- Normative: High — the prior representative made specific positive statements about deal structure that are on record. The new representative is bound by what was discussed.
- Negative: Moderate — the startup has alternative distribution options that can be made more visible.
**Investigation questions designed:**
1. "Help me understand how this differs from what you typically look for in a partnership arrangement." (tests worldview mismatch)
2. "What are the most important factors for your team in how this is structured?" (resets the conversation to their priorities without the previous framing)
3. "What would make this work better from your perspective?" (opens space for the new representative to state their actual mandate)
---
## References
- [references/black-swan-categories.md](references/black-swan-categories.md) — Extended diagnostic signals per category, with real-world examples across sales, employment, and partnership negotiations
- [references/leverage-types-guide.md](references/leverage-types-guide.md) — Detailed activation conditions, risk profiles, and combination strategies for all three leverage types
- [references/loss-aversion-framing.md](references/loss-aversion-framing.md) — Prospect theory background, framing formulas, the 2x rule application in negotiation contexts
- [references/face-time-tactics.md](references/face-time-tactics.md) — How to extract Black Swan information in live conversation: observation checklist, contradiction signals, emotional energy mapping
- [references/investigation-question-templates.md](references/investigation-question-templates.md) — 20 ready-to-use investigation questions across all three categories with situation-specific variants
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build a complete price-negotiation offer schedule using the Ackerman bargaining model. Use when someone asks "how do I negotiate a lower price?", "what shoul...
---
name: ackerman-bargaining-planner
description: |
Build a complete price-negotiation offer schedule using the Ackerman bargaining model. Use when someone asks "how do I negotiate a lower price?", "what should my opening offer be?", "how do I avoid splitting the difference?", "how do I structure my counter-offers so I don't give too much away?", or "how do I make my final offer feel credible?" Also use for: designing a salary negotiation offer sequence, structuring a vendor price reduction campaign, planning a real estate purchase offer ladder, deciding how far to push on a freelance rate, or building an offer schedule for any purchase or contract negotiation where you are trying to reach a specific target price. Produces a situation-specific ackerman-plan.md with your 4-stage offer sequence showing actual dollar amounts computed from your target price, scripted phrases for each stage, fairness-challenge responses, a noncash item list for the final offer, and anti-patterns to avoid. Works for any negotiation where you are the buyer (or the party making offers). Pair with calibrated-questions-planner (to generate questions to use between offers) and counterpart-style-profiler (to adapt delivery pace and tone).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/ackerman-bargaining-planner
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: never-split-the-difference
title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
authors: ["Chris Voss"]
chapters: [6, 9]
tags: [negotiation, price-negotiation, ackerman-model, offer-sequence, anchoring, loss-aversion, fairness-framing, concession-strategy, salary-negotiation, procurement, bargaining, noncash-offers, anti-patterns]
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Situation brief — what you are buying or negotiating, the counterpart's current ask or opening position, and the context (B2B, consumer, salary, real estate, etc.)"
- type: document
description: "Your target price — the specific number you want to reach. This is NOT your walk-away point. It is your optimistic goal."
- type: document
description: "Noncash items available — non-monetary concessions you could offer at the final stage (faster payment, testimonial, referral, extended contract, flexibility on timing, etc.)"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Works with pasted briefs or document files."
discovery:
goal: "Produce a 4-stage offer schedule with computed dollar amounts, scripted phrases per stage, fairness-challenge responses, and a noncash item for the final offer — so the negotiator can execute with confidence and without improvising in the moment"
tasks:
- "Gather target price, current counterpart ask, and available noncash items"
- "Compute the 4 Ackerman offer amounts (65%, 85%, 95%, 100% of target)"
- "Write scripted phrases for each stage transition"
- "Generate fairness-challenge responses for all three fairness challenge types"
- "Select or generate one noncash item for the final offer"
- "Write the ackerman-plan.md artifact"
audience:
roles: ["buyer", "salesperson", "founder", "manager", "consultant", "recruiter", "procurement", "freelancer", "job-seeker", "real-estate-buyer"]
experience: "beginner to intermediate — no negotiation training required"
triggers:
- "User is preparing to negotiate a price and wants a structured offer sequence"
- "User wants to know what their opening offer should be"
- "User has been countered and wants to know how much to concede"
- "User wants their final offer to feel like a firm limit, not a round number"
- "User is being pressured to split the difference and wants a better alternative"
- "User has been accused of being unfair during a negotiation"
- "User wants to make their final offer feel credible without lying"
not_for:
- "Generating the full negotiation preparation document — use negotiation-one-sheet-generator for the 5-section prep"
- "Designing the questions to ask between offers — use calibrated-questions-planner for that"
- "Understanding the counterpart's communication style — use counterpart-style-profiler"
- "Defusing hostility before negotiation begins — use accusation-audit-generator first"
- "Seller-side negotiations where you are receiving offers — adapt the framework manually"
quality: placeholder
---
# Ackerman Bargaining Planner
## When to Use
You are preparing to negotiate a price — a purchase, a salary, a vendor contract, a freelance rate, or any deal where you need to move a counterpart from their current ask down to your target number.
You want a structured offer sequence that signals seriousness, creates credible commitment, and avoids the most common mistakes: splitting the difference, opening too close to your goal, using round numbers, making equal-sized concessions, or revealing your deadline.
**Input this skill needs:** Your target price (the number you want to reach — optimistic, not your walk-away point), the counterpart's current ask or opening position, and a list of noncash items you could offer as a final sweetener.
**Do not use this skill if:** You need a full negotiation preparation document — use `negotiation-one-sheet-generator`. You need to defuse hostility before you can make any offer — use `accusation-audit-generator` first. You are negotiating something where price is not the primary variable (use `calibrated-questions-planner` to find out what actually matters to them first).
---
## Context & Input Gathering
### Required
- **Your target price:** The specific number you want to reach. Be concrete — a vague target produces a vague plan. This is your optimistic goal, not your minimum acceptable price.
- **Counterpart's current ask:** Their stated price or opening position. If they have not yet made an ask, note that — you may be making the first offer.
- **Situation context:** What are you negotiating? (Purchase, salary, vendor contract, freelance rate, real estate, etc.) Who is the counterpart?
### Important
- **Noncash items available:** What non-monetary concessions could you offer? Examples: early payment, a testimonial or referral, extended contract term, flexibility on delivery schedule, exclusivity, training, or any item of genuine value to them that costs you little.
- **Counterpart type (if known):** Are they an Analyst (methodical, data-driven), Accommodator (relationship-focused, wants harmony), or Assertive (direct, time-focused)? If unknown, proceed: the skill works without this. Use `counterpart-style-profiler` if you want to adapt delivery.
### Observable (tell the skill what you can observe)
- **Their anchoring behavior:** Did they open extreme? Did they hold firm or move quickly?
- **Their deadline signals:** Have they mentioned timing constraints, deadlines, or urgency? (Important: protect your own deadline from disclosure.)
- **Their fairness language:** Have they used the word "fair" or accused you of being unreasonable?
### Defaults (used if not provided)
- **Noncash item:** If no items are provided, the skill will generate 3 plausible options based on the situation type for you to choose from.
- **Counterpart's ask:** If unknown, the skill will generate an anchor-response strategy and a recommended opening offer.
### Sufficiency check
If you have your target price and the situation context, proceed. A missing counterpart ask or a missing noncash item list is acceptable: the skill will generate estimates and options.
---
## Process
### Step 1: Establish Your True Target
**Action:** Confirm the target price the user wants to reach. Record it as a single specific number — not a range. If the user provides a range ("I want to pay between $40K and $50K"), set the target at the optimistic end ($40K). This number becomes the 100% anchor for all subsequent calculations.
**WHY:** Ranges invite the counterpart's brain to anchor on the number most favorable to them. A single optimistic target forces you to plan and execute toward your actual goal, not a compromise. Research on goal-setting in negotiations shows that negotiators with specific, ambitious single targets consistently outperform those with ranges or minimum-acceptable-outcome targets. This is the most common mistake: people confuse their walk-away point with their target and negotiate from the wrong number from the start.
**IF** user provides a walk-away point instead of a target → ask them to separate the two. Walk-away is the point at which you exit; target is what you are aiming for. The Ackerman model operates from the target, not the floor.
---
### Step 2: Compute the 4-Stage Offer Schedule
**Action:** Calculate the four Ackerman offer amounts using the following formula, then present them as a numbered schedule with actual dollar amounts:
```
Stage 1 — Opening offer: TARGET × 0.65
Stage 2 — Second offer: TARGET × 0.85
Stage 3 — Third offer: TARGET × 0.95
Stage 4 — Final offer: TARGET × 1.00 (use a non-round number near target)
```
**Increment psychology (critical):** The gap between each offer shrinks deliberately:
- Stage 1 → Stage 2: +20 percentage points
- Stage 2 → Stage 3: +10 percentage points
- Stage 3 → Stage 4: +5 percentage points
**Non-round number rule:** Make the final offer (Stage 4) a precise, non-round number that is at or near your target. If your target is $40,000, use $39,893 or $40,127 — not $40,000. If your target is $85/hour, use $84.35 — not $85.
**WHY:** The decreasing increment pattern communicates to the counterpart's subconscious that you are approaching your absolute limit. Each smaller concession signals "I have less room left." This is not a trick; it is honest signaling about your actual flexibility. Equal-sized concessions do the opposite: they suggest infinite room ("they moved $1,000 each time, so they can move $1,000 more"). The precise non-round final number creates the impression that the number was calculated from real constraints, because specific numbers imply research and limits, while round numbers imply padding. A counterpart receiving $39,893 thinks "that's an oddly specific number, so they must have a real reason for it." A counterpart receiving $40,000 thinks "that's a round placeholder, so they can probably go higher."
**Example calculation (target = $40,000):**
```
Stage 1: $40,000 × 0.65 = $26,000
Stage 2: $40,000 × 0.85 = $34,000
Stage 3: $40,000 × 0.95 = $38,000
Stage 4: $40,000 × 1.00 = $40,000 → offer $39,893 (non-round, near target)
```
**IF** the counterpart's ask is below the Stage 1 opening → do not use this model as designed (you are already above their ask). Instead, start at Stage 3 or Stage 4 and use anchoring techniques from the Key Principles section.
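The Step 2 arithmetic can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the function name `ackerman_schedule` and the `final_offer` parameter are hypothetical, and the non-round Stage 4 number is assumed to be picked by the negotiator rather than generated.

```python
from decimal import Decimal

# Multipliers from the Step 2 formula: 65%, 85%, 95%, 100% of target.
ACKERMAN_MULTIPLIERS = (
    Decimal("0.65"),
    Decimal("0.85"),
    Decimal("0.95"),
    Decimal("1.00"),
)


def ackerman_schedule(target, final_offer=None):
    """Return (stage, amount, increment) tuples for a buyer's target price.

    final_offer is the hand-picked non-round Stage 4 number (an assumption
    of this sketch). If it is omitted, Stage 4 falls back to the raw target.
    """
    target = Decimal(str(target))
    amounts = [target * multiplier for multiplier in ACKERMAN_MULTIPLIERS]
    if final_offer is not None:
        amounts[-1] = Decimal(str(final_offer))

    schedule = []
    previous = None
    for stage, amount in enumerate(amounts, start=1):
        # Stage 1 has no prior offer, so its increment is None.
        increment = None if previous is None else amount - previous
        schedule.append((stage, amount, increment))
        previous = amount
    return schedule


# Usage with the worked example above (target = $40,000, final offer $39,893).
for stage, amount, increment in ackerman_schedule(40000, final_offer=39893):
    step = "" if increment is None else f" (+${increment:,.0f})"
    print(f"Stage {stage}: ${amount:,.0f}{step}")
```

`Decimal` is used here instead of floats so that the percentage multiplications stay exact for currency amounts; that is a design choice of this sketch, not a requirement of the model.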
---
### Step 3: Write Scripted Phrases for Each Stage
**Action:** For each of the 4 stages, write one scripted transition phrase the negotiator can say when moving to that offer. Each phrase must include: an empathic acknowledgment (label or calibrated question), then the offer amount.
**WHY:** The offer itself is only part of the communication. How you deliver each concession determines whether the counterpart feels heard or steamrolled. An empathic acknowledgment before each offer triggers the counterpart's cooperative instincts and makes the concession feel like a response to their situation rather than a mechanical formula. Calibrated questions between offers shift the problem-solving burden: instead of you explaining why you can't pay more, they explain what would make this work — which often reveals creative solutions.
**Stage 1 phrase template (opening offer):**
```
"I appreciate that this is an important [deal/purchase/hire] for both of us.
I want to be respectful of your time, so I'll be straightforward —
my current position is [STAGE 1 AMOUNT]."
```
**Stage 2 phrase template (after counterpart counters):**
```
"I hear you. It sounds like [label their stated concern or constraint].
I've looked at this again, and I can get to [STAGE 2 AMOUNT]."
```
**Stage 3 phrase template (after second counter):**
```
"I understand this isn't where you'd like to land, and I genuinely want to make this work.
How am I supposed to get to your number?
[Pause for answer]
The most I can do is [STAGE 3 AMOUNT]."
```
**Stage 4 phrase template (final offer — use with noncash item):**
```
"I've pushed as far as I possibly can.
[STAGE 4 AMOUNT — non-round number] is my absolute limit,
and I'd like to include [NONCASH ITEM] to make this work for you."
```
**Timing rule:** Do not rush from stage to stage. A label or calibrated question must precede each concession. The counterpart should be doing more talking than you between stages.
---
### Step 4: Generate Fairness-Challenge Responses
**Action:** Write a scripted response for each of the three uses of "fair": the two challenges the counterpart may deploy (Types 1 and 2) and the one you deploy proactively (Type 3). Identify which counterpart challenge is most likely given the situation context, and mark it as the primary scenario.
**WHY:** The word "fair" is the most potent weapon in a negotiator's arsenal because it triggers loss aversion and puts the accused party on the defensive. Understanding the three distinct uses of "fair" — each with a different intent and a different required response — prevents you from either caving out of guilt or dismissing a genuine concern. Treating all three the same way is a high-cost mistake.
**Type 1 — Accusatory fairness (weaponized):**
*Signal:* "We just want what's fair." Said early in negotiation, often without evidence.
*Their intent:* Destabilize you. Put you on the defensive. Force an emotional concession.
*Response:*
```
"I understand — fairness matters to both of us.
I want you to feel this has been a fair process.
Can you help me understand specifically what feels unfair to you right now?"
```
*Why this works:* Returning the word "fair" without defensiveness takes away their weapon. Asking them to specify what feels unfair forces them to either articulate a real concern (which you can address) or reveal that the accusation was purely tactical.
**Type 2 — Genuine fairness (relational check-in):**
*Signal:* "I want to make sure we're both getting a fair deal." Said in a collaborative tone.
*Their intent:* Genuinely calibrate whether the deal feels balanced. Often used by Accommodator types.
*Response:*
```
"I appreciate you saying that — it matters to me too.
I've tried to be transparent throughout this process.
Is there something specific about the terms that doesn't feel right to you?"
```
**Type 3 — Inoculative fairness (offered by you proactively):**
*Signal:* You use it first.
*Your intent:* Disarm before they can weaponize it.
*How to deploy it:*
```
"I want to treat you fairly throughout this conversation.
If at any point something I propose seems off to you, I want you to tell me."
```
*Why this works:* Offering fairness before they demand it removes their ability to use it as a pressure tactic. It also signals confidence — you wouldn't offer to be fair if you planned to be unfair.
---
### Step 5: Select the Noncash Final Item
**Action:** From the user's provided noncash item list, select the one item most likely to be genuinely valuable to the counterpart based on the situation context. If no list was provided, generate 3 plausible options and ask the user to select one.
**WHY:** The noncash item serves two functions simultaneously. First, it provides the counterpart with a genuine concession that does not cost you money — preserving the financial terms of your final offer while giving them something to "win." People need to feel they got something, not just that they capitulated. Second, including a noncash item alongside a precise non-round final offer signals that you have exhausted your financial flexibility and are finding creative alternatives — confirming that the number is real.
**Noncash item selection criteria:**
- High perceived value to counterpart, low cost to you
- Relevant to their situation (not generic)
- Deliverable — something you can actually provide
**Examples by situation type:**
- Purchasing: Faster payment terms, favorable payment schedule, public testimonial, referral to similar buyers
- Salary negotiation: Extra PTO days, remote work flexibility, earlier performance review, professional development budget, equity accelerator
- Vendor/freelance: Longer contract term (predictable revenue), case study rights, priority queue access, introduction to other clients
- Real estate: Flexible closing date, furniture items, appliance upgrades, quick pre-approval
---
### Step 6: Write the Ackerman Plan Artifact
**Action:** Write `ackerman-plan.md` containing the complete offer schedule, scripted phrases, fairness responses, noncash item, and anti-patterns reminder.
**WHY:** A written plan with computed numbers and scripted phrases prevents the most common negotiation failure mode: improvising under pressure. When a counterpart pushes back hard, the human nervous system triggers the fawn or freeze response — and negotiators abandon their plan, concede too quickly, or split the difference just to end the discomfort. Having the script in front of you anchors the plan and gives you words to say when your brain goes blank. The anti-patterns section is included as a live reminder because the temptation to violate them will be highest at the moment of greatest emotional pressure.
**Output format:** See Outputs section below.
---
## Inputs / Outputs
### Inputs
- Target price (required)
- Counterpart's current ask or opening position (important — needed for calibration)
- Situation context (required)
- Noncash items available (important — skill can generate options if missing)
- Counterpart type/style (optional)
### Outputs
**File:** `ackerman-plan.md`
**Template:**
```markdown
# Ackerman Bargaining Plan: [Situation Name]
**Prepared for:** [Your name / role]
**Counterpart:** [Name / organization / role]
**Situation:** [What you are negotiating]
**Your target price:** [TARGET]
**Counterpart's current ask:** [THEIR ASK]
---
## Offer Schedule
| Stage | Multiplier | Amount | Increment |
|-------|-----------|--------|-----------|
| 1 — Opening offer | 65% of target | [AMOUNT] | — |
| 2 — Second offer | 85% of target | [AMOUNT] | +[X] from Stage 1 |
| 3 — Third offer | 95% of target | [AMOUNT] | +[X] from Stage 2 |
| 4 — Final offer | ~100% of target | [NON-ROUND AMOUNT] | +[X] from Stage 3 |
**Non-round final offer rationale:** [Explain the logic if needed — e.g., "calculated based on 12-month depreciation estimate"]
---
## Scripted Phrases
**Stage 1 — Opening:**
> "[Script from Step 3]"
**Stage 2 — After first counter:**
> "[Script from Step 3]"
**Stage 3 — After second counter:**
> "[Script from Step 3]"
**Stage 4 — Final offer:**
> "[Script from Step 3] + [NONCASH ITEM]"
---
## Fairness Challenge Responses
**If they say "We just want what's fair" (accusatory):**
> "[Response from Step 4]"
**If they raise fairness collaboratively:**
> "[Response from Step 4]"
**Inoculative fairness — use proactively at the start:**
> "[Proactive fairness statement from Step 4]"
---
## Noncash Final Item
**Selected item:** [ITEM]
**Why it works for them:** [Brief explanation]
**How to present it:** Include with Stage 4 offer as an add-on, not a trade.
---
## Anti-Patterns — Do Not Do These
- **Do not split the difference.** If they are at $50K and you are at $40K, $45K is not a solution: it rewards their extreme anchor and leaves you paying $5K more than your target.
- **Do not use round numbers.** Round numbers signal padding. The final offer in particular must be a precise, non-round number.
- **Do not make equal-sized concessions.** Equal increments ($5K, $5K, $5K) signal infinite room. Shrink each increment.
- **Do not reveal your deadline.** Deadlines create urgency — but only for the person whose deadline is known. If you say "I need this done by Friday," you have handed them leverage.
- **Do not fixate on your walk-away point.** Negotiating from your minimum acceptable outcome anchors you low. Negotiate from your target and walk away if the target proves unreachable.
- **Do not make three or more rapid concessions.** Pace your concessions. A calibrated question or label between each offer is not optional — it is structurally required.
---
## Notes
[Space for observations made during the negotiation — what they said, what surprised you, what to adjust for the next conversation]
```
---
## Key Principles
**The 4-stage formula is a psychological signaling system, not just a math formula.** The percentages (65→85→95→100) work because of what the decreasing increments communicate: you are running out of room. Each smaller concession is an honest signal about your diminishing flexibility. The formula tells the counterpart's subconscious a coherent story: "this person started with room to move, they've moved, and now they're almost done." That story produces the settlement psychology that ends negotiations.
**WHY:** Counterpart decision-making during price negotiation is not rational — it is emotional and narrative-driven. The Ackerman formula exploits this by creating a credible narrative arc through the structure of your concessions alone. You are not arguing that your price is fair; you are demonstrating through behavior that you are at your limit.
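The shrinking-increment property described above can be checked mechanically before a plan is used. The helper below is a hypothetical sketch (the name `concessions_shrink` is not from the source) that assumes a buyer-side schedule, where each offer is higher than the last.

```python
def concessions_shrink(offers):
    """Return True if offers rise and each concession is smaller than the previous one."""
    # Concession sizes between consecutive offers.
    steps = [later - earlier for earlier, later in zip(offers, offers[1:])]
    rising = all(step > 0 for step in steps)
    shrinking = all(after < before for before, after in zip(steps, steps[1:]))
    return rising and shrinking


# The 65/85/95/100 schedule for a $40,000 target passes the check.
print(concessions_shrink([26000, 34000, 38000, 39893]))

# Equal $5,000 steps fail it: they signal that more room remains.
print(concessions_shrink([25000, 30000, 35000, 40000]))
```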
**Loss aversion is twice as powerful as equivalent gain motivation.** Losses activate the emotional brain (System 1) approximately twice as strongly as equivalent gains. A counterpart who feels they might lose something they already have will negotiate harder than a counterpart who might gain the same thing. This means framing your offers and final position in terms of what they stand to lose if the deal falls apart is more motivating than framing it in terms of what they gain by accepting.
**WHY:** This is not manipulation — it is accurate communication. If the deal fails, they lose the certainty of a signed agreement, the time already invested, and any relationship goodwill built during negotiation. Naming those losses honestly is legitimate leverage. The key distinction: you are describing real consequences, not fabricating threats.
**Splitting the difference is a trap, not a compromise.** When both parties split the difference, the party who opened with a more extreme anchor wins more of the value. Splitting rewards extreme anchoring and punishes reasonable opening positions. It also produces outcomes that neither party is satisfied with — both feel they gave up the same amount, but neither feels they won.
**WHY:** The negotiator who proposes splitting the difference frames the midpoint as "fair" — but the midpoint is only fair relative to the two anchors, not relative to the actual value of the deal. If your anchor was accurate and theirs was inflated, the midpoint is inflated. The Ackerman model prevents this by anchoring your offers against your target, not their ask.
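A short worked example makes the midpoint argument concrete. The figures below are hypothetical and chosen only for illustration: the buyer's position is held fixed while the counterpart's opening ask becomes more extreme.

```python
def midpoint(your_offer, their_ask):
    """Price reached if both sides split the difference."""
    return (your_offer + their_ask) / 2


your_offer = 35_000  # hypothetical buyer position, held constant

# Each extra $10,000 in their opening ask drags the "fair" midpoint
# $5,000 further in their favor, even though nothing about the deal changed.
for their_ask in (45_000, 50_000, 60_000):
    print(f"ask ${their_ask:,} -> split at ${midpoint(your_offer, their_ask):,.0f}")
```

The midpoint depends only on the two anchors, so the side that inflates its anchor is rewarded for doing so.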
**Extreme anchoring makes your target seem reasonable by comparison.** Opening at 65% of your target — far below it — shifts the entire perceived range of the negotiation. When your Stage 1 offer is $26,000 and your target is $40,000, your final offer at $39,893 looks generous. If you had opened at $35,000, $39,893 would look like a small concession from an already-tight position. The opening offer shapes the psychological midpoint.
**WHY:** This is the anchoring effect in action. The first number spoken in a negotiation has disproportionate influence on where the settlement lands. Your opening offer is not your first concession — it is your opening anchor. It sets the range. Open extreme, then move systematically toward your target.
**Never reveal your deadline.** Deadlines create urgency — and urgency belongs to whoever needs the deal done faster. If you reveal your timeline ("I need to close this by end of quarter"), you have handed the counterpart the ability to slow-walk the negotiation until your deadline is imminent, then extract last-minute concessions under time pressure.
**WHY:** Your deadline is a constraint that limits your options. Any constraint that only you know about is a private vulnerability; any constraint both parties know about is leverage for the party without the constraint. Protect your deadline the same way you protect your walk-away point.
**The noncash item at the final stage creates a face-saving exit for the counterpart.** When you make your final non-round offer accompanied by a noncash item, the counterpart can tell themselves — and their boss — "I got the price plus X." This psychological face-saving function is as important as the financial function of the item. People need to feel they won something to agree to stop fighting.
**WHY:** Negotiations fail not because parties cannot find a number that works, but because one party cannot accept feeling like they lost. The noncash item gives the counterpart a "win" to report — which removes the psychological barrier to saying yes to your actual target price.
---
## Examples
### Example 1: Toyota 4Runner Purchase
**Scenario:** A buyer wants to purchase a used Toyota 4Runner. The dealer is asking $30,000. The buyer's research shows fair market value is approximately $25,000, which becomes the target price.
**Trigger:** "How do I negotiate the price on this truck without insulting the dealer?"
**Process:**
- Step 1: Target = $25,000
- Step 2: Offer schedule:
  - Stage 1: $25,000 × 0.65 = $16,250 (opening)
  - Stage 2: $25,000 × 0.85 = $21,250
  - Stage 3: $25,000 × 0.95 = $23,750
  - Stage 4: $25,000 × 1.00 = $25,000, delivered as $24,893 (non-round final)
- Step 3: Each stage preceded by a label or calibrated question to the dealer
- Step 4: Fairness inoculation used proactively at the opening
- Step 5: Noncash item — immediate cash payment with same-day title transfer (saves dealer floor-plan carrying costs)
- Step 6: Plan written with scripted phrases
**Output:** `ackerman-plan.md` with the offer schedule showing actual amounts, Stage 4 delivery script including "I have the cash ready today and can do the title transfer this afternoon — that's worth something to you in terms of floor-plan savings," and a fairness-challenge response prepared for the dealer's likely "that's way below what we have in this truck."
**Result type:** Buyer reaches $24,893 — $5,107 below the ask — by moving systematically and making the dealer feel each concession was extracted through real effort.
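The offer schedule in this example can be reproduced with a short script. This is a minimal sketch, assuming the buyer-side percentages shown above (65%, 85%, 95%) and treating the non-round final offer as a number the negotiator picks by hand; the function name `ackerman_schedule` and its parameters are illustrative, not part of the skill.

```python
STAGE_PERCENTS = (65, 85, 95)  # Stages 1-3 as a percentage of the target


def ackerman_schedule(target: int, final_offer: int) -> list[int]:
    """Return the four buyer-side offers for a target price.

    Stages 1-3 are computed from the target. Stage 4 is a hand-picked
    non-round number at or just under the target, so it is passed in
    rather than computed.
    """
    if not 0 < final_offer <= target:
        raise ValueError("final offer should sit at or just under the target")
    offers = [target * pct // 100 for pct in STAGE_PERCENTS]
    offers.append(final_offer)
    return offers


offers = ackerman_schedule(target=25_000, final_offer=24_893)
increments = [later - earlier for earlier, later in zip(offers, offers[1:])]

for stage, offer in enumerate(offers, start=1):
    print(f"Stage {stage}: ${offer:,}")
print("Increments:", increments)
```

Printing the increments makes the shrinking concession sizes visible: each raise is smaller than the one before it, which is the opposite of the equal-sized concessions that signal there is still room to move.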
---
### Example 2: Kidnapping Ransom Negotiation (Haiti)
**Scenario:** A Haitian kidnapping gang holds a victim and opens with a $150,000 ransom demand. The negotiating team's actual upper limit is $5,000, established as the target price. The goal is to use the Ackerman model to move the gang to a number the team can actually pay.
**Trigger:** Operational scenario — structured offer sequence needed under extreme pressure.
**Process:**
- Step 1: Target = $5,000
- Step 2: Offer schedule:
  - Stage 1: $5,000 × 0.65 = $3,250
  - Stage 2: $5,000 × 0.85 = $4,250
  - Stage 3: $5,000 × 0.95 = $4,750
  - Stage 4: $5,000 × 1.00 = $5,000, delivered as $4,751 (non-round)
- Step 3: Calibrated questions used between stages: "How am I supposed to get that kind of money together?" "What would help us move this forward?" Labels used to acknowledge the gang's stated urgency without accepting their framing
- Step 5: Noncash item — a Christian burial for the victim (no cost to the team, high cultural value to the negotiating parties given Haiti's religious context)
- Step 6: Written plan maintained under pressure
**Output:** `ackerman-plan.md` used as a live script during multi-day negotiation. Settlement reached near the $5,000 target, with the burial offer accepted as the face-saving noncash item at the final stage.
**Key lesson:** The formula works under maximum emotional pressure precisely because it removes the need to improvise. When the gang demands more, the script provides the next move. The negotiator does not need to calculate in real time — the plan already computed every number.
---
### Example 3: Freelance Rate Negotiation
**Scenario:** A UX designer is negotiating her project rate. A startup wants to pay $8,000 for a 6-week engagement. The designer's target rate is $15,000 based on scope analysis and market comparables.
**Trigger:** "The client said $8,000 is their budget. How do I get to $15,000 without them walking away?"
**Process:**
- Step 1: Target = $15,000
- Step 2: Offer schedule:
  - Stage 1: $15,000 × 0.65 = $9,750 (opening counter to their $8,000)
  - Stage 2: $15,000 × 0.85 = $12,750
  - Stage 3: $15,000 × 0.95 = $14,250
  - Stage 4: $15,000 × 1.00 = $15,000, delivered as $14,850 (non-round final)
- Step 3: Calibrated questions between stages: "What's driving the $8,000 figure — is that a hard budget cap or where you started?" (Stage 1→2 transition); "How am I supposed to cover [scope element] at that number?" (Stage 2→3)
- Step 4: Fairness inoculation — "I want to make sure we're both comfortable with the terms, so I'll be transparent about how I arrived at my numbers."
- Step 5: Noncash item — two rounds of free revisions after delivery (high perceived value to client nervous about scope creep; low cost to designer who plans one revision cycle anyway)
- Step 6: Written plan
**Output:** `ackerman-plan.md` with the offer schedule showing that the designer's opening counter ($9,750) is above the client's budget but below her target — which reframes the midpoint. Stage 4 script: "The absolute most I can do is $14,850, and I'll include two full revision rounds after delivery so you're not locked into the first cut."
---
## References
| File | Contents |
|------|----------|
| `references/formula-reference.md` | Ackerman percentages table, increment psychology, non-round number examples across common negotiation contexts, noncash item library by situation type, deadline protection tactics, extreme anchoring examples |
| `references/anti-patterns.md` | Full anti-pattern analysis: splitting the difference (why it rewards bad anchoring), BATNA-fixation (how walk-away point becomes psychological ceiling), equal-sized concessions (signaling infinite room), round numbers (padding signal), deadline disclosure (urgency transfer), over-scripting (rigidity trap) |
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Generate a preemptive objection audit and emotion-label bank before any high-stakes negotiation, difficult conversation, salary discussion, sales pitch, or c...
---
name: accusation-audit-generator
description: Generate a preemptive objection audit and emotion-label bank before any high-stakes negotiation, difficult conversation, salary discussion, sales pitch, or conflict resolution. Use this skill when you need to defuse anticipated resistance before speaking, prepare labels for counterpart objections before a job offer negotiation, neutralize defensive reactions before presenting bad news, write preemptive acknowledgments before a pitch to a skeptical audience, prepare for a difficult performance review or client escalation, anticipate accusations before a contract renegotiation, build a delivery script for labeling counterpart frustrations, or create an accusation audit for a negotiation one-sheet.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/never-split-the-difference/skills/accusation-audit-generator
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
  - id: never-split-the-difference
    title: "Never Split the Difference: Negotiating as if Your Life Depended on It"
    authors: ["Chris Voss"]
    chapters: [3, 23]
tags: [negotiation, emotion-labeling, accusation-audit, tactical-empathy, objection-handling, conflict-resolution, sales, difficult-conversations]
depends-on: []
execution:
  tier: 1
  mode: hybrid
  inputs:
    - type: document
      description: "Situation brief — a description of the negotiation or conversation, the counterpart, what you want, and what tensions or objections you expect"
  tools-required: [Read, Write]
  tools-optional: []
  mcps-required: []
  environment: "Any agent environment. Document set preferred: situation-brief.md, counterpart-profile.md. Works from a free-text description if no files provided."
discovery:
  goal: "Produce a ready-to-deliver accusation audit: a list of 3-5 emotion labels anticipating the counterpart's worst-case feelings, plus a delivery script."
  tasks:
    - "Anticipate the counterpart's negative emotions and unstated objections"
    - "Convert each anticipated negative into a label statement using the 'It seems like...' formula"
    - "Sequence labels from strongest to lightest for opening delivery"
    - "Add a pause instruction after each label"
    - "Produce the accusation-audit.md artifact"
  audience: "salespeople, founders, managers, consultants, freelancers — anyone preparing for a high-stakes or difficult conversation"
  when_to_use: "Before any conversation where you expect resistance, defensiveness, or negative emotions from your counterpart"
  environment: "Document set (situation-brief.md, counterpart-profile.md) or free-text description"
quality: placeholder
---
# Accusation Audit Generator
## When to Use
You are preparing for a conversation where your counterpart is likely to feel resistance, frustration, suspicion, or resentment — and you want to defuse those emotions before they derail the discussion. This skill applies when:
- Entering a salary, rate, or price negotiation where you expect pushback
- Delivering difficult news (restructuring, scope reduction, price increase, rejection)
- Opening a sales call with a skeptical or burned prospect
- Re-entering a stalled or previously contentious negotiation
- Managing a client escalation or complaint situation
- Preparing a difficult performance conversation with a direct report
The core pattern: **you name the counterpart's worst-case feelings first, before they do.** This drains the emotional charge from anticipated objections and signals that you understand their perspective — making it safe for them to listen instead of defend.
This is distinct from sympathy. Sympathy is feeling what they feel (joining them in the emotion). Active listening with emotional validation (tactical empathy) is recognizing and naming what they feel without being swept into it. Naming the emotion is the action; feeling it with them is not required or useful.
Before starting, confirm you have:
- A clear description of the situation and what you want
- Enough knowledge of the counterpart to anticipate their concerns (profile, history, stakes for them)
---
## Context & Input Gathering
### Required Context
- **The situation:** What is the conversation about? What do you want out of it?
- **The counterpart:** Who are they, what do they care about, what's at stake for them?
- **The anticipated negatives:** What do you expect them to feel — anger, fear, distrust, resentment, embarrassment, overwhelm?
### Observable Context
If documents are provided, read them for:
- Prior friction points or unresolved complaints in conversation history
- Explicit objections or concerns the counterpart has raised before
- Power dynamics and what the counterpart stands to lose
- Any promises made and not kept, or expectations set and not met
### Default Assumptions
- If no counterpart profile is provided → assume a skeptical, experienced counterpart who has been in similar conversations before (most challenging scenario)
- If no prior history is provided → assume the counterpart has generic concerns about fairness, respect, and getting a bad deal
- If fewer than 3 anticipated negatives can be identified → ask the user to describe the worst-case accusations they fear the counterpart might make
### Sufficiency Check
Before generating the audit, confirm you have enough to answer: "What is this person most afraid of, most frustrated about, and most suspicious of?" If you cannot answer that, gather more context first.
---
## Process
### Step 1: List Every Anticipated Accusation
**ACTION:** Write out every negative feeling or accusation the counterpart might have — stated as a raw complaint in their voice. Do not filter. Include the accusations that feel embarrassing or extreme.
**WHY:** The most dangerous objections are the ones that go unsaid — they fester and resurface as deal-killers. This step forces you to surface the worst-case perspective. Research in affect labeling (UCLA, Matthew Lieberman) shows that naming a negative emotion reduces its intensity by engaging the prefrontal cortex and dampening the amygdala response. You cannot label what you have not named. Including extreme accusations is deliberate: counterparts who hear their harshest unspoken thought voiced aloud by you often respond with surprise, then relief. The defusion effect is strongest for the most emotionally charged accusations.
**Format:** Write each in first-person counterpart voice:
- "You're only here because you want to take advantage of me."
- "This is a waste of my time."
- "You don't actually care about solving my problem."
- "You're going to lowball me."
- "I've heard this pitch before and it always disappoints."
Target: 5-10 raw accusations, more is better at this stage.
---
### Step 2: Convert Each Accusation into a Label Statement
**ACTION:** Rewrite each accusation as a third-person observation using the label formula. Select the 3-5 most charged accusations to convert.
**WHY:** The label formula shifts the statement from personal claim to neutral observation. "I think you feel cheated" implies judgment from you. "It seems like you feel this isn't fair" reflects back the counterpart's likely experience without endorsing or denying it. If the label is wrong, the counterpart corrects you — still valuable because it opens dialogue. If it is right, the counterpart feels understood — which reduces the emotional activation driving resistance. The third-person framing ("It seems like...") also protects you: you are not admitting fault or agreeing, you are naming what you observe.
Use the observational form ("It seems like...") rather than a first-person claim ("I think you feel..."). A first-person claim such as "I feel that you're frustrated" makes the conversation about YOUR perception rather than THEIR reality, so the counterpart ends up arguing with your feelings instead of reflecting on their own.
**Label formula:**
- "It seems like..." (primary — most neutral)
- "It sounds like..." (for labels based on what they have said)
- "It looks like..." (for labels based on observable behavior)
**Never use:** "I feel..." or "I think you feel..." — these make the label about you, not the counterpart. They also trigger the counterpart to argue with your feelings rather than reflect on theirs.
**Conversion examples:**
| Raw Accusation | Label Statement |
|---|---|
| "You're going to lowball me." | "It seems like you're worried the number I bring won't reflect what this is worth." |
| "You don't care about my situation." | "It sounds like you feel your concerns haven't been taken seriously in the past." |
| "This is a bait-and-switch." | "It seems like you've had experiences where the final terms didn't match the original pitch." |
| "You're only here for your own benefit." | "It seems like you're concerned this deal benefits us more than it benefits you." |
| "I've already decided — this won't change my mind." | "It looks like you've put a lot of thought into your position and you're not looking to be talked out of it." |
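The label formula is mechanical enough to check automatically before a draft goes into the audit. The sketch below is one possible check, not part of the skill itself: `check_label` and the two opener tuples are hypothetical names, and the opener lists simply mirror the formula and the "Never use" rule above.

```python
ALLOWED_OPENERS = ("it seems like", "it sounds like", "it looks like")
BANNED_OPENERS = ("i feel", "i think you feel")


def check_label(label: str) -> str:
    """Classify a draft label as 'ok', 'banned', or 'unrecognized'."""
    text = label.strip().strip('"').lower()
    if text.startswith(BANNED_OPENERS):
        return "banned"  # makes the label about the speaker
    if text.startswith(ALLOWED_OPENERS):
        return "ok"
    return "unrecognized"  # rewrite so it starts with one of the three openers


drafts = [
    "It seems like you're concerned this deal benefits us more than it benefits you.",
    "I think you feel cheated.",
    "You seem upset.",
]
for draft in drafts:
    print(f"{check_label(draft):>12}  {draft}")
```

A check like this only catches the opener. Whether the label actually reflects the counterpart's experience still needs a human read.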
---
### Step 3: Sequence for Maximum Defusion
**ACTION:** Order the 3-5 selected labels from most emotionally charged to least. The most provocative accusation goes first.
**WHY:** Counterparts enter difficult conversations with their emotional guard up. Opening with the mildest label first leaves the biggest charge untouched — it remains the elephant in the room, distracting them from what you say next. Opening with the strongest label signals courage and transparency: "I know what you're thinking, and I'm not afraid to say it." This establishes credibility and disarms the defensive posture before you have asked for anything. After the heaviest label lands and is acknowledged (or corrected), lighter labels feel easy by comparison.
**Sequencing rule:** If you are uncertain which label is most charged, choose the one that most directly names a fear about your motives or fairness — those trigger the strongest defensive reactions.
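The ordering step can be expressed as a sort. This is a sketch under stated assumptions: the 1-10 `charge` score and the `about_motives_or_fairness` flag are hypothetical fields that whoever prepares the audit assigns by judgment; the skill does not define a scoring scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    text: str
    charge: int  # hypothetical 1-10 score for how emotionally loaded the accusation is
    about_motives_or_fairness: bool = False


def sequence_labels(labels: list[Label], limit: int = 5) -> list[Label]:
    """Order labels from most to least charged and keep at most `limit`.

    Ties go to the label that names a fear about motives or fairness,
    following the sequencing rule above.
    """
    ranked = sorted(
        labels,
        key=lambda label: (label.charge, label.about_motives_or_fairness),
        reverse=True,
    )
    return ranked[:limit]


drafts = [
    Label("It looks like you've put a lot of thought into your position.", charge=5),
    Label(
        "It seems like you're concerned this deal benefits us more than it benefits you.",
        charge=8,
        about_motives_or_fairness=True,
    ),
    Label("It sounds like you feel your concerns haven't been taken seriously in the past.", charge=8),
]
ordered = sequence_labels(drafts)
for position, label in enumerate(ordered, start=1):
    print(position, label.text)
```

Two labels share the top score here, and the tie-break puts the one about motives first, so the opening line is the concern that the deal is one-sided.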
---
### Step 4: Add Delivery Instructions
**ACTION:** For each label, append a pause instruction. Add a voice guidance note for the opening of the delivery.
**WHY:** The label only works if the counterpart has space to respond. Rushing past the label with the next sentence signals that you are not actually listening, only performing empathy. The silence after the label is where the counterpart processes recognition ("they understand what I'm feeling") and decides to lower their guard. Pause for at least 3-5 seconds and do not fill the gap. If the silence feels uncomfortable, let it sit: that discomfort is the counterpart processing. Voice delivery matters too: a calm, slow, downward-inflecting tone (not questioning, not tentative) signals confidence and safety, while an upward-inflecting delivery sounds like you are seeking approval, which undermines the label.
**Voice guidance:**
- **Default delivery tone:** Warm, even, measured. Not apologetic. Not overly soft.
- **For the opening label (most charged):** Slow down. Downward inflection at the end of the sentence — it lands as a statement, not a question. This signals that you are calm and not threatened by naming it.
- **Avoid:** Uptalk, rushed delivery, apologetic hedges ("I don't know if this is right, but..."). These undermine the authority of the label.
---
### Step 5: Write the Accusation Audit
**ACTION:** Produce the `accusation-audit.md` artifact with the full label sequence and delivery script.
**WHY:** Having the labels written out in delivery order prevents in-the-moment improvisation under stress. Skilled practitioners rehearse the labels before high-stakes conversations. The written artifact also surfaces gaps: if you cannot write a label that feels honest and non-manipulative, that is a signal the label is either too vague or the situation requires more preparation.
---
## Inputs
| Input | Required | Format |
|---|---|---|
| Situation description | Yes | Any — markdown, plain text, verbal description |
| What you want from the conversation | Yes | One sentence minimum |
| Counterpart description | Yes | Role, stakes, prior history if available |
| Prior objections or stated concerns | Optional | Any |
| Conversation history | Optional | Markdown or plain text |
---
## Outputs
Produce `accusation-audit.md` with the following structure:
```markdown
# Accusation Audit
**Situation:** [One-sentence description]
**Counterpart:** [Who they are and what they care about]
**Goal:** [What you want from this conversation]
---
## Anticipated Accusations (Raw)
1. [Counterpart's worst-case thought, in their voice]
2. [Second accusation]
3. [Third accusation]
4. [Fourth accusation]
5. [Fifth accusation — include more if identified]
---
## Label Bank (3-5 Labels, Sequenced)
**Label 1 (Most Charged):**
> "It seems like [strongest anticipated negative]."
*Pause. Wait 3-5 seconds. Do not fill the silence.*
**Label 2:**
> "It sounds like [second anticipated negative]."
*Pause. Wait 3-5 seconds.*
**Label 3:**
> "It seems like [third anticipated negative]."
*Pause. Wait 3-5 seconds.*
[Label 4 and 5 if applicable]
---
## Delivery Script
**Opening (before any ask or proposal):**
Deliver the labels in sequence. Use a calm, even, downward-inflecting tone. Do not apologize for naming the feelings. Do not move to your ask until the counterpart has responded to at least one label.
Sample opening sequence:
[Insert Label 1]
[Wait for response]
[Insert Label 2 if still needed]
[Wait for response]
[Transition: "I want to make sure I understand your situation before I tell you what I'm thinking."]
---
## Notes
- If a label is wrong, the counterpart will correct you. Accept the correction: "You're right — help me understand what you're actually concerned about." Wrong labels are still productive.
- If the counterpart responds with "That's right" or "Exactly" — you have a genuine confirmation of understanding. Proceed.
- Do not use more than 5 labels in a single opening. Beyond that, it becomes a recital, not a conversation.
```
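The Label Bank portion of the template above can be filled in programmatically once the labels are sequenced. The following is a minimal sketch: `render_label_bank` is a hypothetical helper, and it covers only that one section, not the full `accusation-audit.md` artifact.

```python
def render_label_bank(labels: list[str]) -> str:
    """Render already-sequenced labels as the 'Label Bank' section."""
    if not 3 <= len(labels) <= 5:
        raise ValueError("the audit expects 3-5 labels")
    lines = [f"## Label Bank ({len(labels)} Labels, Sequenced)"]
    for number, label in enumerate(labels, start=1):
        if number == 1:
            heading = "**Label 1 (Most Charged):**"
            pause = "*Pause. Wait 3-5 seconds. Do not fill the silence.*"
        else:
            heading = f"**Label {number}:**"
            pause = "*Pause. Wait 3-5 seconds.*"
        lines += [heading, f'> "{label}"', pause]
    return "\n".join(lines) + "\n"


section = render_label_bank(
    [
        "It seems like you've had experiences with vendors that didn't deliver what they promised.",
        "It sounds like the last time you invested time in evaluating a new tool, it didn't go well.",
        "It seems like you might be wondering whether this call is going to be different.",
    ]
)
print(section)
```

The length check enforces the 3-5 label limit at render time, so an over-long audit fails loudly instead of producing a recital.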
---
## Key Principles
- **Name the accusation before they do.** Counterparts who feel their worst-case thinking has been preemptively acknowledged have no reason to raise it defensively. The emotional charge dissipates before it fires.
*WHY:* The brain's threat-detection system (the amygdala) stays activated while a negative emotion goes unnamed. Naming it engages the prefrontal cortex, the reasoning brain, which reduces the intensity of the emotional response. This mechanism is called affect labeling: when you say "It seems like you're frustrated," the counterpart shifts from emotional reaction toward cognitive processing. Affect labeling research (UCLA, Matthew Lieberman) shows that even brief verbal labeling measurably reduces amygdala reactivity, which is why labeling de-escalates. A label has to be identified before it can be spoken, which is why Step 1 requires surfacing every accusation first.
- **Use "It seems like..." — never "I feel..."** The formula keeps the label as an observation about their experience, not a claim about yours. This prevents the counterpart from arguing with your feelings and keeps the focus on understanding theirs.
*WHY:* "I feel you're upset" centers the speaker. "It seems like you're upset" centers the counterpart's experience. The distinction signals that you are observing, not projecting. It also keeps you from conceding anything: you are not admitting fault, only reflecting what you observe.
- **Wrong labels still work.** If you mislabel the emotion, the counterpart corrects you — which opens dialogue, reveals their actual concern, and demonstrates that you are genuinely listening rather than following a script.
*WHY:* A corrected label is still valuable because: (1) it produces more information about what the counterpart actually feels, (2) the act of correction engages the counterpart actively instead of passively, and (3) the willingness to be corrected signals humility, not weakness.
- **Pause after every label.** The label requires a response to function. Filling the silence prevents the counterpart from processing the recognition and expressing their feeling. The pause is not empty — it is where the emotional defusion happens.
*WHY:* The mechanism requires the counterpart to internally confirm or deny the label. That internal process takes 3-5 seconds minimum. Interrupting it with more words cancels the effect.
- **Tactical empathy (active listening with emotional validation) is not sympathy.** You do not need to feel what they feel, agree with their position, or validate their conclusions. You only need to demonstrate that you understand their emotional state. These are separable. Sympathy means feeling the counterpart's pain and making concessions to relieve it. Tactical empathy means understanding and labeling their emotions while maintaining your position. The label "It seems like you feel this is unfair" acknowledges their emotion without agreeing that it IS unfair — you are naming their experience, not endorsing their conclusion.
*WHY:* Sympathy draws you into the counterpart's emotional frame and impairs your judgment. Emotional validation keeps you grounded while demonstrating understanding. The counterpart does not need you to suffer with them — they need evidence that you see their situation accurately. Once they believe you see it, they can engage rationally.
- **The accusation audit opens the conversation — it is not the pitch.** Labels come first, before any proposal, ask, or argument. Moving to your ask before the counterpart has felt heard activates resistance, not receptiveness.
*WHY:* The Behavioral Change Stairway Model (Active Listening → Empathy → Rapport → Influence → Behavioral Change) treats empathy as a prerequisite for influence. Skipping to influence before rapport is established reliably produces defensiveness.
---
## Examples
### Example 1: Salary Negotiation
**Scenario:** A marketing manager is negotiating a 20% raise with their boss, who has signaled budget constraints. The marketing manager expects pushback about timing and the company's cost situation.
**Trigger:** "Help me prepare for tomorrow's salary conversation. I'm asking for a 20% raise and I know my manager is going to push back hard."
**Process:**
- Raw accusations identified: "You're asking too much given the budget freeze," "You're being greedy," "You don't understand how tight things are," "You're threatening to leave if we don't pay you," "Why now — the timing is terrible."
- Top 3 labels selected and sequenced, strongest first: motives (ultimatum), then awareness of budget pressure, then timing
- Delivery script drafted with pause instructions and transition to the actual ask
**Output (`accusation-audit.md` excerpt):**
```
Label 1: "It seems like you might wonder whether this is an ultimatum rather than a conversation."
[Pause 3-5 seconds]
Label 2: "It sounds like you might be concerned that I'm not aware of the pressures the team is facing."
[Pause 3-5 seconds]
Label 3: "It seems like this is a difficult time to be having a conversation about compensation."
[Pause 3-5 seconds]
Transition: "I want to be straightforward about what I'm thinking and why, and I'm genuinely open to how we get there."
```
---
### Example 2: Apartment Subletting Request
**Scenario:** A renter wants to sublet their apartment for six months while working abroad. They expect the landlord to refuse based on liability and lease terms.
**Trigger:** "I need to ask my landlord to let me sublet for six months. He's going to say no. Help me prepare."
**Process:**
- Raw accusations: "You're trying to get around the lease," "You'll get a stranger into my property," "I'll be liable for anything that goes wrong," "You're making this my problem," "You're going to cause damage and disappear."
- Labels converted using the formula, ordered from most to least charged
- Delivery script prepared for the phone call
**Output (`accusation-audit.md` excerpt):**
```
Label 1: "It seems like the idea of someone you don't know in your property is a real concern."
[Pause 3-5 seconds]
Label 2: "It sounds like you might be worried this would create liability issues for you."
[Pause 3-5 seconds]
Label 3: "It seems like this request might feel like I'm trying to get around the terms we agreed to."
[Pause 3-5 seconds]
Transition: "I'd like to walk you through exactly what I'm proposing and how I want to protect your interests throughout."
```
---
### Example 3: Sales Call with a Burned Prospect
**Scenario:** A SaaS sales rep is calling a prospect who tried a competitor's product, had a bad experience, and has been unresponsive to outreach for two months. The rep got a reluctant 20-minute call booked.
**Trigger:** "This prospect got burned by our competitor and I think they'll shut me down in the first two minutes. Help me prepare an opening."
**Process:**
- Raw accusations: "You're just like the last vendor," "This is going to waste my time," "Whatever you say will sound like the last pitch," "I've already made up my mind," "You're going to overpromise."
- Labels generated; strongest accusation (past bad experience → distrust) goes first
- Voice guidance added: FM DJ tone — slow, calm, no enthusiasm, no salesperson energy
**Output (`accusation-audit.md` excerpt):**
```
Label 1: "It seems like you've had experiences with vendors that didn't deliver what they promised."
[Pause 3-5 seconds]
Label 2: "It sounds like the last time you invested time in evaluating a new tool, it didn't go well."
[Pause 3-5 seconds]
Label 3: "It seems like you might be wondering whether this call is going to be different or just more of the same."
[Pause 3-5 seconds]
Transition: "I'm not going to pitch you today. I want to understand what actually happened and whether what we do is even relevant to your situation."
```
---
## References
- [references/emotion-labeling-mechanics.md](references/emotion-labeling-mechanics.md) — Affect labeling research, labeling formula variations, common mistakes
- [references/five-label-templates.md](references/five-label-templates.md) — All 5 fill-in-the-blank label templates with example completions
- [references/tactical-empathy-vs-sympathy.md](references/tactical-empathy-vs-sympathy.md) — Distinction between validation and sympathy; why sympathy undermines negotiation
- [references/three-voices-guide.md](references/three-voices-guide.md) — FM DJ voice vs positive/playful vs assertive: when to use each, inflection rules
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Never Split the Difference: Negotiating as if Your Life Depended on It by Chris Voss.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Run a structured reflection or debrief after any learning experience, project, procedure, or performance to turn raw experience into durable skill. Use this...
---
name: structured-reflection-protocol
description: Run a structured reflection or debrief after any learning experience, project, procedure, or performance to turn raw experience into durable skill. Use this skill whenever the user wants to do an after-action review, write a learning journal entry, debrief a session, run a post-mortem, reflect on what went well and what to improve, turn a recent experience into a lesson they will remember, create a reflection document after completing a course chapter or training, or consolidate learning from a recent event — even if they do not use the words "reflection" or "retrieval." Works for students, professionals, coaches, clinicians, writers, teachers, and anyone learning from experience. Do NOT use this skill to build a spaced repetition quiz system (use retrieval-practice-study-system) or to analyze an external document for content (use a different skill).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/make-it-stick/skills/structured-reflection-protocol
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
  - id: make-it-stick
    title: "Make It Stick: The Science of Successful Learning"
    authors: ["Peter C. Brown", "Henry L. Roediger III", "Mark A. McDaniel"]
    chapters: [2, 4, 8]
tags: ["learning-science", "cognitive-psychology", "evidence-based-learning", "reflection", "experiential-learning"]
depends-on: []
execution:
  tier: 1
  mode: hybrid
  inputs:
    - type: document
      description: "Description of a recent experience, session notes, procedure record, or brief summary of what just happened"
  tools-required: [Write]
  tools-optional: [Read]
  mcps-required: []
  environment: "Any agent environment. Write access needed to save the reflection document."
---
# Structured Reflection Protocol
## When to Use
You have just completed something — a surgery, a class session, a project sprint, a difficult conversation, a writing draft, a practice run — and you want to convert that raw experience into durable, retrievable learning before it fades.
Typical entry points:
- Just finished a procedure, class, training session, or performance
- Writing a learning journal and want more than a summary
- Running an after-action review with a team or for yourself
- Noticing a recurring mistake and wanting to break the pattern
- Finishing a course chapter or reading assignment and wanting it to actually stick
Before starting, verify:
- Is there a specific, bounded experience to reflect on? (If the user wants to reflect on "everything lately," narrow to one recent event first.)
- Is this for individual use or a team debrief? (Team debriefs need all participants contributing; flag if only one person's perspective is available.)
**Mode: Hybrid** — The agent structures the reflection, asks the four questions, and produces the output document. The human supplies the experience content. The agent connects dots, surfaces patterns, and generates the "strategies for next time" section.
## Context & Input Gathering
### Required Context (must have — ask if missing)
- **What just happened:** A description of the experience being reflected on. Even a rough sentence works: "I taught my first class session today" or "We just finished the sprint review."
-> Check prompt for: descriptions of recent events, project completions, session summaries, procedure notes
-> If missing, ask: "What experience do you want to reflect on? Give me a brief description of what happened."
- **Domain or role:** What kind of practitioner is this person? (student, surgeon, coach, writer, etc.)
-> Shapes which reflection format to use and the vocabulary of the output
-> If missing, infer from context; note the assumption
### Observable Context (gather from environment)
- **Prior reflection documents:** Any previous reflection files or learning journals
-> Look for: `reflection-*.md`, `learning-journal.md`, `after-action-*.md`
-> If found: reference them for pattern recognition ("this is the third time you've noted X")
- **Goals or intentions set before the experience:** What the person was trying to accomplish
-> Look for: planning documents, session outlines, stated objectives
-> If unavailable: derive from the experience description
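The file lookup above can be sketched in a few lines. This is a minimal illustration, assuming the reflections sit directly in one working directory; the function name `find_prior_reflections` is hypothetical, and only the three filename patterns come from this skill.

```python
from pathlib import Path

# Filename patterns listed under "Prior reflection documents".
PATTERNS = ("reflection-*.md", "learning-journal.md", "after-action-*.md")


def find_prior_reflections(workdir: str) -> list[Path]:
    """Return matching files in workdir, sorted by name (non-recursive search)."""
    root = Path(workdir)
    found = set()
    for pattern in PATTERNS:
        # Path.glob yields every entry in root that matches the pattern.
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)
```

An empty result corresponds to the default assumption below: treat this as the first reflection in a series.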
### Default Assumptions
- If no domain specified -> use domain-neutral language; adapt examples to match whatever domain the user mentions
- If no prior reflections available -> treat as first reflection in a series
- If the experience was negative or difficult -> reflect without judgment; the protocol works regardless of outcome
- If the user is in a hurry -> offer the short-form output (four answers + three action items) rather than the full document
### Sufficiency Threshold
```
SUFFICIENT when ALL of these are true:
- At least one experience is described (even briefly)
- The time frame is clear (this happened recently, not months ago)
PROCEED WITH DEFAULTS when:
- Domain or role is vague
- No prior reflections exist
MUST ASK when:
- No experience at all has been described
- The event is so old that memory is likely unreliable (use a note: "recall may be approximate")
```
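The threshold above reads as a small decision function. The sketch below is one possible encoding, not a prescribed implementation; the `ReflectionContext` fields and the returned labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReflectionContext:
    # All field names are illustrative assumptions, not part of the skill.
    experience: str = ""
    is_recent: bool = True
    domain: str = ""
    has_prior_reflections: bool = False


def assess(ctx: ReflectionContext) -> tuple[str, list[str]]:
    """Map the gathered context to a decision label plus notes for the agent."""
    if not ctx.experience.strip():
        return "MUST_ASK", ["No experience described; ask what happened."]
    if not ctx.is_recent:
        return "MUST_ASK", ["recall may be approximate"]
    notes = []
    if not ctx.domain.strip():
        notes.append("Domain is vague; use domain-neutral language.")
    if not ctx.has_prior_reflections:
        notes.append("No prior reflections; treat as first in a series.")
    return ("PROCEED_WITH_DEFAULTS" if notes else "SUFFICIENT"), notes
```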
## Process
### Step 1: Gather the Experience
**ACTION:** Ask the user to describe what just happened, or read the input they have already provided. Establish the bounded event to reflect on.
**WHY:** Reflection requires a specific target. Vague, open-ended reflection ("I've been learning a lot lately") rarely produces retrievable insights because the mind cannot reconstruct specific episodes. The neurosurgeon Mike Ebersold described his reflection practice as always starting from a specific surgery: "Something would come up in surgery that I had difficulty with, and then I'd go home that night thinking about what happened." The bounded experience is the anchor.
**IF** the user provides a file or notes -> read them and summarize the key events in 3-5 bullet points before proceeding. Show this summary to the user and ask if anything important is missing.
**IF** the user describes the experience verbally in the prompt -> paraphrase it back in 2-3 sentences to confirm understanding before asking the reflection questions.
**OUTPUT:** A 2-5 sentence statement of what the experience was, who was involved, and what the intended outcome had been.
---
### Step 2: Run the Four Reflection Questions
**ACTION:** Work through each of the four questions in order. For each question, prompt the user for their answer, then elaborate and deepen it before moving to the next.
**WHY:** These four questions are not arbitrary. Each one activates a different cognitive mechanism that strengthens learning:
- Question 1 (what went well) consolidates the memory of successful strategies, making them easier to retrieve and repeat.
- Question 2 (what could go better) forces retrieval of what was difficult — the effortful retrieval itself strengthens the memory trace.
- Question 3 (what does this remind me of) is elaboration: connecting new experience to prior knowledge creates multiple retrieval paths and deepens understanding.
- Question 4 (strategies for next time) is mental rehearsal: visualizing the corrected action consolidates it before the next performance, the same mechanism Ebersold used to pre-solve surgical problems the night before returning to the OR.
**The four questions:**
#### Q1: What went well?
Prompt to the user: "What worked in this experience? What did you do that you would do exactly the same way next time?"
**Agent role:**
- Affirm specific behaviors, not vague positives ("your pacing was effective" not "it went well")
- If the user says "nothing," push gently: "What was the outcome? Did you complete the task? What enabled that?"
- Identify 1-3 concrete behaviors or decisions that contributed to success
**WHY this question first:** Starting with success is not merely motivational. It retrieves the memory in a positive state, reducing the defensiveness that causes people to shut down before they reach the harder questions. It also identifies what to protect — the strategies worth preserving.
#### Q2: What could have gone better?
Prompt to the user: "Where did you struggle? What would you change if you could do it again right now?"
**Agent role:**
- Press for specificity: "At which exact moment did things feel difficult?"
- Distinguish between knowledge gaps (didn't know what to do) and execution gaps (knew what to do but couldn't execute under pressure)
- If the user blames external factors entirely, acknowledge them but redirect: "Granted the conditions were difficult — what could you have done differently within those conditions?"
**WHY this question matters:** Difficulty is where the most durable learning lives. The brain assigns higher priority to encoding surprising, effortful, or failed attempts because they signal situations that need future preparation. Surfacing what went wrong is not self-criticism — it is the specific mechanism by which expert practitioners build the dense situational awareness that novices lack.
#### Q3: What does this remind you of? What earlier experience or knowledge does it connect to?
Prompt to the user: "Have you encountered something like this before — in this domain or a different one? What principles or frameworks does this experience illuminate or challenge?"
**Agent role:**
- Offer connections the user may not have seen: "This sounds similar to [X pattern]. Does that comparison feel right?"
- Surface both same-domain analogies ("this is like the time you...") and cross-domain analogies ("this has the structure of...")
- If the user draws a blank, offer a prompt: "What does this remind you of from a book, a past job, a different field?"
**WHY this question is the multiplier:** Elaboration — connecting new learning to prior knowledge — is one of the most powerful learning mechanisms available. Every connection created is an additional retrieval path. A piece of learning with many connections is far more durable and accessible than isolated information. This is why expert practitioners can solve problems that novices cannot: their knowledge is richly interconnected, not just voluminous.
#### Q4: What strategies will you use next time?
Prompt to the user: "If you faced the exact same situation tomorrow, what would you do differently? What specific technique, preparation step, or adjustment would you make?"
**Agent role:**
- Push for concrete, executable strategies, not vague intentions ("I will prepare a list of three fallback questions before entering any difficult conversation" not "I will be better prepared")
- Include mental rehearsal: "Walk me through the moment where things got hard. Now describe what the corrected version looks like."
- If the user identifies multiple strategies, help prioritize to the 2-3 most impactful
**WHY mental rehearsal matters:** Visualization and mental rehearsal activate many of the same neural pathways as physical practice. Ebersold's reflection practice was not just verbal: he would mentally walk through the corrected surgical technique, seeing his hands working, before attempting it in the OR. This pre-consolidates the improved pattern. Football coach Vince Dooley used reflection and mental rehearsal with his players to lock in playbook adjustments before the next game.
**OUTPUT after Step 2:** Four completed, elaborated answers — specific behaviors, honest analysis of difficulty, at least two connections to prior knowledge, and 2-3 concrete strategies for next time.
---
### Step 3: Select and Apply the Reflection Format
**ACTION:** Based on the domain, time available, and what was learned in Steps 1-2, select the most appropriate output format and produce the reflection document.
**WHY:** Different contexts benefit from different reflection structures. A 10-minute free-recall session serves a student differently than a structured after-action review serves a surgical team. The format should fit the practitioner's context, not the other way around.
**Format A: Free Recall (10 minutes, blank page)**
Best for: Students, individual learners, anyone with less than 15 minutes, first reflection in a new domain.
Instructions to the user:
1. Close all notes, books, and references.
2. Set a timer for 10 minutes.
3. Write everything you can remember from the experience (or, for a course session, from the material just covered) — facts, sequences, confusing parts, surprising moments, anything.
4. Do not worry about organization. The retrieval effort is the point.
5. After 10 minutes, review what you wrote. Note what was easy to recall and what was absent.
**Agent role:** After the user completes free recall, read their output and identify:
- What they recalled easily (well-consolidated)
- What they recalled with uncertainty (needs one more retrieval session)
- What was absent entirely (likely needs re-encoding from the source)
This calibration is the most accurate feedback mechanism available — it shows the learner exactly where their memory is reliable versus where it only feels reliable.
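The three-way sort described above can be sketched as follows. This assumes the agent has already reduced the source to a list of key points and marked each recalled point as confident or uncertain; the function name, the input shape, and the bucket labels are all hypothetical.

```python
def calibrate(source_points: list[str], recalled: dict[str, bool]) -> dict[str, list[str]]:
    """Sort key points into the three free-recall buckets.

    recalled maps a point to True (recalled easily) or False (recalled with
    uncertainty); points missing from the dict were not recalled at all.
    """
    return {
        "well_consolidated": [p for p in source_points if recalled.get(p) is True],
        "needs_another_retrieval": [p for p in source_points if recalled.get(p) is False],
        "needs_re_encoding": [p for p in source_points if p not in recalled],
    }
```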
**Format B: Learning Paragraph (Wenderoth Method)**
Best for: Students in courses, practitioners in training programs, weekly reflection practice.
Biology professor Mary Pat Wenderoth assigns weekly "learning paragraphs" in which students reflect on what they learned the previous week and characterize how their class learning connects to life outside class. This is a structured elaboration exercise, not a summary.
Structure:
1. In 1-2 sentences: What was the most important thing you learned this week/in this session?
2. In 2-3 sentences: How does it connect to what you already knew before this course/experience?
3. In 1-2 sentences: Where else does this appear — in your work, your life, another field?
4. In 1-2 sentences: What question does this raise that you have not yet answered?
**Agent role:** Read the completed paragraph and flag weak elaborations ("this connects to things I already know" is not an elaboration — ask the user to name what specifically).
**Format C: Structured Debrief (Ebersold Post-Procedure Method)**
Best for: Clinical procedures, high-stakes performances, team after-action reviews, any complex multi-step event.
This is the format Mike Ebersold used after difficult surgeries. It is structured around the gap between planned and actual performance.
Structure:
```
DEBRIEF RECORD
Experience: [procedure, project, session name]
Date: [date]
Participants: [if team]
WHAT WAS PLANNED
- Intended approach:
- Expected difficulties:
- Preparation steps taken:
WHAT ACTUALLY HAPPENED
- Where the plan held:
- Where the plan broke down:
- Unexpected events:
TECHNICAL ANALYSIS
- Root cause of any gap between plan and execution:
- Knowledge gap vs. execution gap:
- Environmental factors beyond control:
IMPROVEMENTS FOR NEXT TIME
- Technique adjustment:
- Preparation adjustment:
- Mental rehearsal target (what to visualize before next attempt):
WHAT TO TEACH OR SHARE
- What would be useful for a colleague or student to know?
```
**Agent role:** Fill in the fields that do not require the user's input, using the experience description. Fill in the analysis sections with the outputs from Step 2. Present the completed document for review.
---
### Step 4: Produce and Save the Reflection Output
**ACTION:** Compile the four-question answers and the selected format into a single, dated reflection document. If a file path or working directory is available, write it to disk.
**WHY:** Reflection documents are only as useful as their retrievability. Notes that exist only in conversation history become inaccessible within days. Writing the document to a file preserves it for future pattern recognition — noticing, for example, that "rushed preparation" appears in four consecutive after-action reviews signals a systemic habit to change, not just an isolated incident.
**Output document structure:**
```markdown
# Reflection: [Experience Name]
**Date:** [date]
**Domain/Role:** [domain]
**Duration of experience:** [approximate]
## What Happened
[2-5 sentence description]
## Four Questions
### What went well?
[Specific behaviors and decisions that worked — not vague positives]
### What could have gone better?
[Specific difficulty moments, honest analysis — knowledge gap vs. execution gap]
### What does this remind me of?
[Connections to prior experiences, frameworks, cross-domain analogies]
### Strategies for next time
1. [Concrete, executable strategy]
2. [Concrete, executable strategy]
3. [Optional third strategy]
## Mental Rehearsal Target
[A one-paragraph description of the corrected action, written in present tense as if performing it correctly right now]
## Action Items
- [ ] [Specific follow-up action — study, practice, consult, prepare]
- [ ] [If applicable: share this learning with whom, by when]
```
**IF** a working directory is available -> write to `reflection-[YYYY-MM-DD]-[slug].md`
**ELSE** -> present the completed document directly in the conversation
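One way to build the `reflection-[YYYY-MM-DD]-[slug].md` name is sketched below. The slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) and the `untitled` fallback are assumptions; the skill fixes only the overall filename shape.

```python
import re
from datetime import date


def reflection_filename(title: str, on: date) -> str:
    """Build a dated, slugged filename for a reflection document."""
    # Assumed slug rule: lowercase, collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"reflection-{on.isoformat()}-{slug or 'untitled'}.md"
```

For example, `reflection_filename("Case Presentation: Rounds!", date(2024, 3, 5))` gives `reflection-2024-03-05-case-presentation-rounds.md`.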
---
## Examples
**Scenario: Medical student after a difficult patient case presentation**
Trigger: "I just finished presenting a complex case in rounds. It went okay but I know I fumbled the differential diagnosis section. I want to make sure I learn from this."
Process:
1. Experience: Case presentation in medical rounds, differential diagnosis section was weak.
2. Four questions:
- Q1: "I stayed calm and presented the history clearly. Attendings noted my physical exam findings were thorough."
- Q2: "When asked why I hadn't included X in the differential, I blanked. I hadn't thought through the less common causes."
- Q3: "This reminds me of how I studied anatomy — I memorized structures but not their clinical implications. Same pattern: knowing facts but not knowing what to do with them."
- Q4: "Next time, before any presentation, I'll generate my differential by ruling out the dangerous diagnoses first, not just listing the common ones."
3. Format: Learning Paragraph (student context, moderate time available).
Output (abbreviated):
```
## Four Questions
### What went well?
Presented the history in a clear chronological sequence. Physical exam findings
were complete and well-organized. Remained composed under direct questioning.
### What could have gone better?
The differential diagnosis was shallow — I listed common causes but had not
systematically worked through the high-acuity diagnoses to exclude first.
When challenged on a rare diagnosis, I had no reasoning to offer. This is
an execution gap: I know the framework (dangerous diagnoses first) but
did not apply it under pressure.
### What does this remind me of?
This is the same pattern as my anatomy studying: I memorized structures but
missed their clinical relevance. Isolated facts without clinical reasoning
frameworks. Also reminds me of debugging code — you have to exclude the
catastrophic failures first before assuming it's a minor issue.
### Strategies for next time
1. For every presentation, explicitly work through the "must not miss" diagnoses
before listing common ones. Write them down even if I'm confident they're excluded.
2. Practice presenting differentials out loud to a study partner twice per week —
the retrieval pressure of a live audience surfaces blanks that solo review misses.
## Mental Rehearsal Target
Standing at the whiteboard, I've just presented the history. Before listing
my differential, I pause, say "Let me start with the diagnoses we need to rule
out," and work through the dangerous possibilities with brief reasoning for why
each is or isn't supported by the data. I finish with the likely common cause.
The attending asks about the rare diagnosis. I cite the two findings that made
me downgrade its probability.
```
---
**Scenario: Writing teacher after a workshop session that lost the room**
Trigger: "I ran a 90-minute writing workshop today and lost the group around the 45-minute mark. They were engaged at the start but then checked out. I need to figure out what happened."
Process:
1. Experience: 90-minute writing workshop, engagement dropped at midpoint.
2. Four questions:
- Q1: "The opening exercise worked — everyone participated and the energy was high."
- Q2: "Around 45 minutes I shifted from exercises to explanation. Too much theory at once. I can see now that I talked for 20 minutes straight."
- Q3: "This reminds me of the generation effect from learning science — learners retain more when they attempt a task before being shown the solution. I did the opposite: I explained the concept then had them practice."
- Q4: "Flip the sequence every 15 minutes: attempt first, explain second. Keep explanations under 5 minutes."
3. Format: Structured Debrief (teaching context, identifying a repeatable technique error).
---
**Scenario: Undercover police detective after a difficult surveillance operation**
Trigger: "Just finished a long undercover operation. We got what we needed but I made a cover story mistake at the 3-hour mark that almost burned me. I want to do a proper debrief before I forget the details."
Process:
1. Experience: Multi-hour undercover surveillance, cover story error at hour 3.
2. Four questions:
- Q1: "Initial contact and rapport-building went well. The target accepted my presence without suspicion."
- Q2: "At hour 3, fatigue caused a factual inconsistency in my cover story — I cited a location I had not visited yet in the timeline. The target noticed the hesitation."
- Q3: "Reminds me of the 'seven-thousand-and-one' rule from jump school training: the moment you stop counting, you're in trouble. Sustained performance under stress requires explicit cuing, not willpower."
- Q4: "Build in a 2-hour check: step away, review cover story facts, reset. Do not rely on memory under sustained stress without a structured refresh point."
3. Format: Structured Debrief (high-stakes procedure context, team operational record).
---
## Key Principles
- **Reflection is retrieval practice, not journaling** — The difference between a reflection that builds skill and a reflection that just feels good is effortful retrieval. The questions must surface specific memories, not vague impressions. "It went okay" is not a retrievable insight. "I used this specific technique at this moment and it produced this result" is.
- **Difficulty is data, not failure** — The questions are designed to surface struggle because that is where the most durable learning lives. A reflection that only records what went well is incomplete. What went wrong — and specifically why — is the material the brain will prioritize encoding for future use.
- **Mental rehearsal is not optional decoration** — Visualizing the corrected action before the next performance is a distinct cognitive step, not a summary of what was learned. The surgeon who thinks through the technique that night returns to the OR the next day with a pre-practiced neural pathway, not just an intention to do better.
- **Written beats remembered** — Reflection documents compound in value over time. A single reflection is useful. A month of reflections lets you see whether the same difficulty appears repeatedly, which tells you something very different than any single event could. Write it down.
- **Free recall first, then review sources** — When possible, attempt to write down everything you can remember before consulting notes or references. The retrieval attempt — even imperfect — strengthens the memory more than rereading ever will. The gaps you discover during free recall are your exact study targets.
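The "written beats remembered" principle pays off once saved reflections can be compared. The sketch below counts how many reflections mention each difficulty, assuming each saved reflection has been reduced to a list of short difficulty tags. That tagging step is a hypothetical addition and is not defined by this skill.

```python
from collections import Counter


def recurring_difficulties(reflections: list[list[str]], min_count: int = 2) -> dict[str, int]:
    """Count reflections that mention each difficulty tag; keep the recurring ones.

    reflections holds one list of tags per reflection document. A tag is
    counted at most once per document.
    """
    counts = Counter(tag for tags in reflections for tag in set(tags))
    return {tag: n for tag, n in counts.items() if n >= min_count}
```

A tag that recurs across several documents, such as "rushed preparation", points to a habit to change, not an isolated incident.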
## References
- For spaced retrieval practice as a study system (recurring self-quizzing schedule), see the `retrieval-practice-study-system` skill
- For the science behind retrieval, spacing, interleaving, and generation effects, see [references/cognitive-mechanisms.md](references/cognitive-mechanisms.md)
- For domain-specific reflection templates (clinical, writing, coaching, law enforcement), see [references/domain-reflection-templates.md](references/domain-reflection-templates.md)
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Make It Stick: The Science of Successful Learning by Peter C. Brown, Henry L. Roediger III, Mark A. McDaniel.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Design a complete self-quizzing study system for any subject, course, or learning goal. Use this skill whenever the user wants to study more effectively, sto...
---
name: retrieval-practice-study-system
description: Design a complete self-quizzing study system for any subject, course, or learning goal. Use this skill whenever the user wants to study more effectively, stop wasting time rereading notes, build a study schedule from learning material, prepare for exams, create flashcard decks with a spacing system, design a practice-quiz regimen, or turn any document into a retrieval-based learning plan — even if they don't mention "retrieval practice" or "spaced repetition." Works for students at any level, professionals upskilling, lifelong learners, and coaches designing training programs. Do NOT use this skill to evaluate whether a textbook or course is good (that is a different task), or to build automated quiz software (that requires a coding skill).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/make-it-stick/skills/retrieval-practice-study-system
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: make-it-stick
title: "Make It Stick: The Science of Successful Learning"
authors: ["Peter C. Brown", "Henry L. Roediger III", "Mark A. McDaniel"]
chapters: [2, 8]
tags: ["learning-science", "cognitive-psychology", "evidence-based-learning", "study-skills", "self-testing", "active-recall"]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Study material, course outline, textbook chapters, lecture notes, or learning objectives"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment with file read/write access."
discovery:
goal: "Design a complete retrieval-based study system — quiz questions, spacing schedule, mastery signals, and anti-pattern guide — so the learner can replace passive rereading with active self-testing."
tasks:
- "Analyze study material and extract key concepts, terms, and relationships"
- "Generate a prioritized set of self-quiz questions (short-answer and concept-connection format)"
- "Build a spaced repetition schedule calibrated to the content and timeline"
- "Define mastery signals and a Leitner-style progression system"
- "Produce a one-page anti-pattern comparison: retrieval practice vs. rereading"
audience: ["students", "lifelong learners", "teachers", "trainers", "coaches"]
triggers:
- "I need to study for an exam"
- "How do I make this material actually stick?"
- "Create flashcards for this content"
- "I keep rereading my notes but nothing is sticking"
- "Build a study plan for this course"
- "I want to learn this subject deeply, not just cram"
not_for:
- "Evaluating the quality of a course or textbook"
- "Building quiz software or automated learning platforms"
- "Summarizing a book (use a summarizer skill)"
environment: "Document-based: study guides, lecture notes, course outlines, PDF chapters, learning objectives"
quality:
completeness_score:
accuracy_score:
value_delta_score:
---
# Retrieval Practice Study System
## When to Use
You have a body of material to learn and want to build a structured, science-backed study system. Typical situations:
- The user has a course, textbook, or document they need to master by a deadline
- The user is frustrated that rereading does not produce real retention
- The user wants flashcards but does not know how to space them or signal mastery
- The user needs a study schedule that goes beyond "review the night before"
- A teacher or trainer wants to design a low-stakes quiz regimen for their students
Before starting, verify:
- Is the study material available to read? (If not, ask the user to share it or describe it)
- What is the learning goal and timeline? (Exam date, job start date, presentation deadline)
**Mode: Hybrid** — The agent designs all study materials, question sets, and the schedule. The human executes the daily practice sessions.
## Context & Input Gathering
### Required Context (must have — ask if missing)
- **Study material:** What must be learned? This is the source for generating questions.
→ Check for: uploaded files, linked documents, pasted notes, chapter summaries in the prompt
→ If missing, ask: "Please share the material you want to study — lecture notes, textbook chapters, or a course outline."
- **Learning goal:** What does mastery look like? This determines question depth.
→ Check for: exam format (multiple-choice, essay, practical), job competency requirements, certification criteria
→ If missing, ask: "What will you be tested on or need to do with this knowledge?"
- **Timeline:** When is the deadline or exam? This sets the spacing schedule.
→ Check for: dates mentioned in prompt, course syllabi, exam announcements
→ If missing, ask: "How much time do you have before you need to know this material?"
### Observable Context (gather from environment)
- **Existing notes or highlights:** Prior study attempts that reveal what the user already knows
→ Look for: annotated files, highlighted PDFs, previous flashcard decks
→ If unavailable: treat all material as new
- **Subject domain:** Affects question type (factual recall vs. concept-application vs. procedure)
→ Look for: course name, subject tags, discipline cues in the material
### Default Assumptions
- If no exam format specified → generate a mix: 60% short-answer, 40% concept-connection
- If no timeline specified → design a 4-week schedule with daily 20-minute sessions
- If no prior knowledge indicated → assume starting from zero
### Sufficiency Threshold
```
SUFFICIENT when ALL of these are true:
✓ Study material is available (or described in enough detail to generate questions)
✓ Learning goal is clear (what the learner must be able to do)
✓ Timeline is known (or default 4-week schedule is acceptable)
BLOCK if: no material and no description — cannot generate meaningful questions without content
```
## Process
### Step 1 — Analyze the Material and Extract Learning Targets
Read the study material and identify:
- **Key concepts:** Core ideas the learner must understand (not just recognize)
- **Critical terms:** Vocabulary with precise meanings that affect application
- **Relationships:** How concepts connect, cause each other, or contrast
- **Procedures:** Step-by-step processes or decision rules
**WHY:** Retrieval practice is most effective when questions target the deep structure of the material — the underlying principles — not just surface facts. Identifying learning targets first ensures the question set is prioritized, not exhaustive.
Output: A numbered list of 10-20 learning targets, ranked by importance to the learning goal.
---
### Step 2 — Generate the Self-Quiz Question Set
For each learning target, write 1-3 questions. Prefer:
- **Short-answer questions** that require the learner to produce the answer (not recognize it): "What happens to memory retention when retrieval is delayed by one week?"
- **Concept-connection questions** that require relating ideas: "How does spacing interact with retrieval practice to strengthen long-term memory?"
- **Application questions** that transfer learning to a new scenario: "A colleague is preparing a presentation. Which study strategy would you recommend and why?"
Avoid:
- True/false questions (recognition is weaker than recall)
- Questions answerable by a single memorized word
**WHY:** Research demonstrates that questions requiring the learner to produce an answer (short-answer, essay) yield significantly stronger long-term retention than recognition-based formats (multiple choice, true/false). The cognitive effort of generating an answer strengthens the neural pathway to that memory. When multiple-choice is necessary (e.g., matching a certification exam format), write questions with plausible distractors that require discrimination, not guessing.
**IF** the material is primarily procedural (e.g., a clinical protocol, a coding pattern):
→ Write sequence-recall questions: "List the steps of X in order" and error-identification questions: "What is wrong with this approach?"
**IF** the material is primarily conceptual (e.g., economic theory, learning science):
→ Weight toward explanation questions: "Explain why X happens" and comparison questions: "How does X differ from Y?"
Output: A question bank file (`quiz-questions.md`) with questions grouped by learning target.
---
### Step 3 — Build the Spacing Schedule
Construct a tiered review schedule based on the timeline and confidence level:
**Tier schedule (adjust for your timeline):**
- **Session 0 (Day 1):** Initial study of material + immediate self-quiz (all questions, no peeking)
- **Session 1 (Day 2-3):** Review all missed questions from Session 0 + skim correct ones
- **Session 2 (Day 5-7):** Quiz all questions again; retire cards answered correctly twice in a row to the "monthly" pile
- **Session 3 (Day 10-14):** Quiz remaining active questions; add any new material
- **Monthly review:** Pull the "retired" pile once a month and re-quiz — anything missed re-enters the active deck
**For short timelines (exam in under 2 weeks):**
- Compress to every-other-day quizzing
- Prioritize the highest-weight learning targets
- Do not retire cards until after the exam
**WHY:** Spacing practice (leaving time between retrieval sessions) forces the brain to reconstruct the memory from long-term storage rather than working memory. This reconstruction process, which feels effortful and even frustrating, is precisely what strengthens long-term retention. In the 1978 comparison summarized in `references/research-evidence.md`, crammers forgot 50% of what they had recalled within two days, while the spaced-retrieval group forgot only 13% over the same period.
Output: A study calendar file (`study-schedule.md`) with specific dates, session content, and time estimates.
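The tier schedule above can be turned into concrete calendar dates with a few lines of Python. This is a minimal sketch, not a prescribed implementation: the function names are hypothetical, and picking the earliest day of each range (Day 2 for "Day 2-3", Day 5 for "Day 5-7", Day 10 for "Day 10-14") is an assumption you can shift later.

```python
from datetime import date, timedelta

# Day offsets from the first study day, using the earliest day of each
# range in the tier schedule (an assumption; later days in each range also fit).
TIER_OFFSETS = {
    "Session 0: initial study + immediate self-quiz": 0,   # Day 1
    "Session 1: review missed questions": 1,               # Day 2-3
    "Session 2: quiz all questions again": 4,              # Day 5-7
    "Session 3: quiz remaining active questions": 9,       # Day 10-14
}

def build_schedule(start: date) -> list[tuple[date, str]]:
    """Return (date, session label) pairs for the four tiered sessions."""
    return [(start + timedelta(days=offset), label)
            for label, offset in TIER_OFFSETS.items()]

def compressed_schedule(start: date, exam: date) -> list[date]:
    """Every-other-day quiz dates for short timelines, stopping before the exam."""
    days_available = (exam - start).days
    return [start + timedelta(days=d) for d in range(0, days_available, 2)]

if __name__ == "__main__":
    for day, label in build_schedule(date(2024, 1, 1)):
        print(day.isoformat(), label)
```

The monthly review of retired cards is left out of the sketch because it depends on which cards have been retired, not on the calendar alone.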
---
### Step 4 — Set Up the Leitner Box Progression
Organize flashcards (physical or digital) into 3-5 boxes with escalating review intervals:
| Box | Review frequency | Entry rule | Exit rule |
|-----|-----------------|------------|-----------|
| Box 1 (Active) | Every session | All new cards start here | Answered correctly once → Box 2 |
| Box 2 | Every other session | From Box 1 | Correct again → Box 3 |
| Box 3 | Once a week | From Box 2 | Correct again → Box 4 |
| Box 4 | Once a month | From Box 3 | Correct again → Box 5 |
| Box 5 (Mastered) | Once a semester / before high-stakes events | From Box 4 | Stays here unless missed → back to Box 1 |
**Critical rule:** If a card is answered incorrectly at any box level, it returns immediately to Box 1.
**WHY:** The Leitner system (a physical implementation of spaced repetition) ensures that difficult material receives more practice and that time is not wasted on easy material. The "any miss → Box 1" rule prevents learners from deceiving themselves about mastery: the moment a card is missed, it is treated as unlearned.
Output: Instructions for setting up the Leitner system in `study-schedule.md`, including the starting box assignment for all cards.
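The entry and exit rules in the table reduce to a single transition function. The sketch below assumes the full 5-box setup; `next_box` and `MAX_BOX` are hypothetical names used only for illustration.

```python
MAX_BOX = 5  # Box 5 is the "Mastered" box in the table above

def next_box(current_box: int, answered_correctly: bool) -> int:
    """Apply the Leitner rules: promote one box on a correct answer,
    return to Box 1 on any miss, and keep mastered cards in Box 5."""
    if not 1 <= current_box <= MAX_BOX:
        raise ValueError(f"box must be between 1 and {MAX_BOX}")
    if not answered_correctly:
        return 1  # critical rule: any miss sends the card back to Box 1
    return min(current_box + 1, MAX_BOX)

if __name__ == "__main__":
    print(next_box(2, True))   # promoted to Box 3
    print(next_box(4, False))  # missed, back to Box 1
```

For a 3-box or 4-box variant, changing `MAX_BOX` is enough; the demotion rule stays the same.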
---
### Step 5 — Define Mastery Signals
Mastery for a given concept is declared when ALL of these are true:
1. The learner answers the question correctly **without hesitation** on 3 consecutive sessions
2. The learner can explain the concept **in their own words** (not by reciting the source text)
3. The learner can connect the concept to at least one other concept in the course
4. The concept has been in Box 4 or Box 5 for at least one review cycle
**Warning signals** (study is not working — change approach):
- Answering correctly immediately after reading but unable to recall 24 hours later → increase spacing interval
- Feeling confident but scoring below 70% on a practice test → the fluency illusion is active; reduce rereading and increase self-quizzing
- Unable to answer any question after a session → material is too complex; break learning targets into smaller sub-concepts
**WHY:** Without explicit mastery criteria, learners commonly experience the "fluency illusion" — the feeling of knowing that arises from familiarity with the text, not from actual command of the material. Familiarity is not retrievability. Defining mastery signals forces the learner to test their knowledge against objective criteria rather than subjective feeling.
Output: A mastery checklist section at the bottom of `study-schedule.md`.
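The four mastery criteria are an all-or-nothing check, which a small record type makes explicit. This is a sketch under stated assumptions: the `ConceptRecord` fields are hypothetical, "without hesitation" and "in their own words" are judged by the learner and recorded as plain values, and the counter only counts sessions answered correctly without hesitation.

```python
from dataclasses import dataclass

@dataclass
class ConceptRecord:
    consecutive_correct_sessions: int  # correct AND unhesitating, in a row
    explained_in_own_words: bool       # self-assessed or checked by a peer
    linked_concepts: int               # other concepts the learner can connect it to
    current_box: int                   # Leitner box the card sits in now
    review_cycles_in_box_4_or_5: int   # completed review cycles in Box 4 or 5

def is_mastered(record: ConceptRecord) -> bool:
    """Mastery requires ALL four criteria from Step 5 to hold."""
    return (
        record.consecutive_correct_sessions >= 3
        and record.explained_in_own_words
        and record.linked_concepts >= 1
        and record.current_box >= 4
        and record.review_cycles_in_box_4_or_5 >= 1
    )

if __name__ == "__main__":
    card = ConceptRecord(3, True, 2, 4, 1)
    print(is_mastered(card))
```

Because the check uses `and` throughout, a single unmet criterion keeps the concept in active review, which matches the intent of guarding against the fluency illusion.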
---
### Step 6 — Produce the Anti-Pattern Comparison
Write a one-page summary contrasting retrieval practice with rereading, specific to the learner's material:
**Rereading (what feels productive but is not):**
- Highlights and color-coded notes create visual familiarity
- Fluency with the text mimics the feeling of understanding
- Results in 50-70% forgetting within one week
- Produces overconfidence: learners believe they know material they cannot recall
**Retrieval practice (what feels harder but works):**
- Self-quizzing feels awkward and slow — this discomfort is a signal it is working
- Effort during recall strengthens the memory pathway
- A single retrieval session boosts one-week recall by about 11 percentage points (39% vs. 28% with no test); three sessions "immunize" against forgetting (53% recall)
- Correcting wrong answers after a retrieval attempt produces better learning than never having tried
**WHY:** Learners who understand the mechanism are more likely to tolerate the discomfort of self-quizzing. The awkward feeling of struggling to recall is cognitively identical to the process that makes memories durable. Naming this feeling in advance reduces the temptation to abandon the system.
Output: Anti-pattern guide appended to `quiz-questions.md`.
---
## Inputs
| Input | Required | Description |
|-------|----------|-------------|
| Study material | Yes | Text, notes, chapters, or course outline to learn |
| Learning goal | Yes | What the learner must be able to do with this knowledge |
| Timeline | Yes (or accept default 4-week) | Days until exam or competency is needed |
| Existing flashcards / notes | No | Prior study artifacts that can seed the question bank |
| Exam format | No | Affects question type weighting |
## Outputs
| Output | Format | Description |
|--------|--------|-------------|
| `quiz-questions.md` | Markdown | Full question bank grouped by learning target, with anti-pattern guide |
| `study-schedule.md` | Markdown | Day-by-day schedule, Leitner box setup, mastery checklist |
### Output Template: study-schedule.md
```markdown
# Study Schedule: [Subject]
**Learning goal:** [What you will be able to do]
**Exam / deadline:** [Date]
**Total sessions:** [N]
## Leitner Box Setup
- Box 1 (Active): [N] cards — review every session
- Box 2: [N] cards — review every other session
- Box 3: [N] cards — review weekly
- Box 4: [N] cards — review monthly
- Box 5 (Mastered): empty at start
## Session Calendar
| Date | Box(es) to quiz | Time estimate | Notes |
|------|----------------|---------------|-------|
| [Day 1] | All (Box 1) | 30 min | First pass — expect to miss most |
| [Day 2-3] | Box 1 (misses only) | 20 min | Focus on gaps |
| ... | ... | ... | ... |
## Mastery Checklist
For each learning target, check off when:
- [ ] Correct 3 consecutive sessions without hesitation
- [ ] Can explain in own words
- [ ] Can connect to at least one other concept
- [ ] In Box 4 or Box 5 for one full cycle
```
## Key Principles
**1. Retrieval, not review, is the learning event**
Reading creates familiarity; retrieval creates memory. After the first reading, every additional hour spent rereading yields far less retention than the same hour spent self-quizzing. The act of pulling a memory from storage — not the act of encoding it — is what makes it durable.
**2. Desirable difficulty: effortful = effective**
The discomfort of struggling to recall something is not a sign of failure; it is the mechanism of learning. Research consistently shows that more effortful retrieval produces stronger retention than easy retrieval. If self-quizzing feels smooth and easy, the spacing interval is probably too short.
**3. Spacing beats massing**
Distributing practice across days produces dramatically better long-term retention than the same amount of practice compressed into one session. The reason: spaced sessions require the brain to reconstruct the memory from long-term storage, reinforcing the neural pathway each time. Cramming draws from short-term memory and fades quickly.
**4. Errors are learning events, not failures**
Getting a question wrong, then checking the correct answer and trying again, produces better learning than never having made the error. Wrong answers followed by corrective feedback are more effective than rereading alone. Do not avoid questions you expect to miss — seek them out.
**5. Calibration beats confidence**
The fluency illusion (feeling like you know material because you can read it fluently) is one of the most reliable predictors of exam failure. Self-quizzing provides an objective measure of what you actually know, not what you feel you know. Use quiz scores, not reading speed or highlighting volume, to guide study decisions.
**6. Interleaving deepens discrimination**
Mixing study of different topics or problem types — rather than blocking one topic at a time — helps the brain learn to identify which approach applies to which situation. This is harder and slower than blocked practice but produces superior transfer to new problems.
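One simple way to interleave a question bank is a round-robin across topics, so that consecutive questions come from different topics. The sketch below is an illustration only: the function name and the sample topics are hypothetical, and the source does not prescribe a particular mixing algorithm.

```python
from itertools import zip_longest

def interleave(questions_by_topic: dict[str, list[str]]) -> list[str]:
    """Round-robin across topics: take the first question of every topic,
    then the second of every topic, and so on, skipping exhausted topics."""
    mixed: list[str] = []
    for round_of_questions in zip_longest(*questions_by_topic.values()):
        mixed.extend(q for q in round_of_questions if q is not None)
    return mixed

if __name__ == "__main__":
    bank = {
        "storage": ["S1", "S2"],
        "replication": ["R1"],
        "consistency": ["C1", "C2"],
    }
    print(interleave(bank))
```

A blocked session would instead concatenate the lists topic by topic; shuffling the whole bank is another reasonable option when topics have very different sizes.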
## Examples
### Example 1: Medical Student Preparing for a Physiology Exam
**Scenario:** A first-year medical student has four weeks before a comprehensive physiology exam. The course covers twelve organ systems. After two weeks of rereading notes and highlighting textbooks, they scored 65% on a practice exam. They need to change their approach.
**Trigger:** "I have a physiology exam in four weeks. I've been rereading my notes but not retaining anything. Help me study."
**Process:**
1. Agent reads the student's notes and identifies 60 learning targets across the twelve systems
2. Agent generates a 90-question bank weighted toward concept-connection and mechanism questions (e.g., "Trace the pathway by which decreased blood pressure triggers aldosterone release")
3. Agent builds a 4-week spaced schedule: daily 25-minute sessions in weeks 1-2, 30-minute sessions in weeks 3-4 with backward-reaching review
4. Agent sets up a 4-box Leitner system and designates mastery criteria for each organ system
5. Agent writes an anti-pattern guide specific to physiology: why reading the textbook description of the cardiac cycle is not the same as being able to describe it
**Output:** `quiz-questions.md` (90 questions, grouped by system) and `study-schedule.md` (28-day calendar, Leitner assignments, mastery checklist)
---
### Example 2: Professional Upskilling in Data Systems
**Scenario:** An engineer wants to deeply learn distributed systems concepts from a technical book. They have 6 weeks, no fixed exam, but a system design interview in 42 days. They've been reading chapters but feel like concepts slip away within a day.
**Trigger:** "I'm reading a technical book on distributed systems for a system design interview. Nothing is sticking. Can you build me a study system?"
**Process:**
1. Agent reads the chapter summaries and identifies 35 learning targets (replication strategies, consensus protocols, storage engines, etc.)
2. Agent generates a 50-question bank mixing definition recall ("What is the difference between strong and eventual consistency?") and application questions ("You are designing a leaderboard for a global gaming platform. Which consistency model would you choose and why?")
3. Agent builds a 6-week schedule with interleaving across topic areas (storage + replication + consistency alternated, not blocked)
4. Agent sets up a 3-box Leitner system (simpler, given the timeline) and defines mastery as "able to explain the concept whiteboard-style without notes"
**Output:** `quiz-questions.md` and `study-schedule.md` optimized for interview preparation
---
### Example 3: Teacher Designing a Classroom Retrieval System
**Scenario:** A middle school social studies teacher wants to integrate low-stakes quizzing into their unit on ancient civilizations. They have 8 weeks of unit material and want a system that does not feel punitive to students.
**Trigger:** "I'm a teacher. Help me design a quiz system for my ancient civilizations unit that helps students retain information without stressing them out."
**Process:**
1. Agent reads the unit outline and identifies 40 learning targets across Egypt, Mesopotamia, India, and China
2. Agent generates a bank of 80 short-answer questions suitable for classroom use, organized into three quiz sets (pre-lesson, post-lesson, pre-exam), with approximately 5 questions per quiz
3. Agent builds an 8-week schedule with three quiz points per unit chapter: beginning of class (prior reading), end of class (today's lesson), and 24 hours before the chapter test
4. Agent writes teacher notes explaining the 79%-vs-92% research result (Columbia Middle School science study) and how to frame quizzes to students as "practice, not a grade"
5. Agent produces a student-facing anti-pattern card explaining why rereading notes feels productive but is not
**Output:** `quiz-questions.md` (80 questions in three quiz sets) and `study-schedule.md` (8-week calendar with classroom timing)
---
## References
- `references/research-evidence.md` — Key empirical studies: testing effect research (1917, 1939, 1978, 2005 Columbia Middle School, 2007 eighth-grade science), forgetting curve data, cramming vs. spacing retention comparisons
- `references/leitner-system-guide.md` — Full Leitner box implementation guide including physical card setup and digital app equivalents
- `references/mastery-signals.md` — Extended mastery criteria for different subject types (procedural, conceptual, declarative)
- `references/anti-patterns.md` — Full comparison of 6 ineffective study strategies vs. retrieval practice alternatives
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Make It Stick: The Science of Successful Learning by Peter C. Brown, Henry L. Roediger III, Mark A. McDaniel.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
FILE:references/anti-patterns.md
# Study Anti-Patterns vs. Retrieval Practice
## The 6 Most Common Counterproductive Study Habits
### 1. Rereading and Highlighting
**What it feels like:** Productive. The text becomes familiar. Highlighted passages feel significant.
**What it produces:** Familiarity, not retrievability. The brain becomes fluent with the text, which mimics understanding.
**The problem:** Fluency with a text is not the same as ability to recall it under test conditions. Students who rely on rereading routinely overestimate how much they know.
**Retrieval practice alternative:** After one reading, close the book and write down everything you can recall. Then reread only to correct errors.
### 2. Cramming (Massed Practice)
**What it feels like:** Intensive and effective. Material feels immediately accessible after a long cramming session.
**What it produces:** Short-term availability in working memory that fades within 48-72 hours.
**The data:** After cramming, learners forget 50% of recalled material within two days. After spaced retrieval practice, learners forget only 13% over the same period.
**Retrieval practice alternative:** Distribute study across sessions with increasing intervals. Accept that each session will feel harder as forgetting sets in — this difficulty is the mechanism of long-term encoding.
### 3. Passive Highlighting and Underlining
**What it feels like:** Active engagement with the text.
**What it produces:** A colorful book.
**The problem:** The act of drawing a highlighter across text requires no retrieval and creates no memory trace beyond visual familiarity.
**Retrieval practice alternative:** Read, close the book, write what you recall. Use highlighting only to mark what you could not recall — the gaps, not the familiar passages.
### 4. Study Groups Focused on Coverage (Not Testing)
**What it feels like:** Collaborative and efficient — the expert explains, others listen.
**What it produces:** Passive learning for everyone except the person explaining.
**Retrieval practice alternative:** Convert study groups to testing groups. Everyone attempts to answer before anyone explains. Textbooks remain closed. The goal is retrieval, not coverage.
### 5. Practice Tests Used Only to Identify Gaps (Not as Learning Events)
**What it feels like:** Efficient — you find out what you don't know and go study it.
**What it produces:** A catalog of gaps, but the practice test itself is not used as the learning event it is.
**The insight:** The act of attempting to retrieve an answer — even incorrectly — changes the memory of that content. Students who answer questions and then correct their errors learn the material better than students who simply reread the correct answers.
**Retrieval practice alternative:** Treat every practice test question as a learning event. Attempt an answer before looking it up. Correct errors actively. Return to missed questions in the next session.
### 6. Feeling of Knowing vs. Ability to Recall
**What it feels like:** Reading a passage and thinking "I know this."
**What it produces:** A false signal of mastery.
**The fluency illusion:** Recognizing information when you encounter it is far easier than retrieving it when you need it. The only way to calibrate whether you actually know something is to try to produce the answer without looking.
**Retrieval practice alternative:** Never say "I know this" after rereading. Say it only after you have successfully produced the answer from memory without prompting.
## The Core Inversion
Every ineffective study habit has one thing in common: the learner is **consuming** information (reading, listening, watching). Effective study requires **producing** information (recalling, explaining, applying). The shift from consumption to production is the single most important change a learner can make.
FILE:references/leitner-system-guide.md
# Leitner Box System: Implementation Guide
## What It Is
The Leitner box is a physical implementation of spaced repetition invented by German science journalist Sebastian Leitner. It organizes flashcards into boxes with escalating review intervals, so difficult material gets more practice and time is not wasted on easy material.
## Physical Setup (Index Cards)
Use 5 dividers or physical boxes labeled 1-5. Write one question on the front of each card, the answer on the back.
## Review Intervals (Default)
| Box | Review Frequency | Cards Start Here |
|-----|-----------------|-----------------|
| 1 | Every study session | All new cards |
| 2 | Every other session | Promoted from Box 1 |
| 3 | Once per week | Promoted from Box 2 |
| 4 | Once per month | Promoted from Box 3 |
| 5 | Once per semester | Promoted from Box 4 |
## The Golden Rule
**Any card answered incorrectly at any box level returns immediately to Box 1.**
This is not punitive — it is the mechanism that prevents the fluency illusion.
## Promotion Rule
- Answered correctly once → promote to next box
- Answered incorrectly → return to Box 1 regardless of current box
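To decide which boxes to pull on a given day, the default intervals can be expressed in days. The sketch below rests on loud assumptions: one study session per day, a 30-day month, and a 120-day semester; the names `INTERVAL_DAYS` and `boxes_due` are hypothetical.

```python
# Review interval per box, in days. Assumes one study session per day;
# the 30-day month and 120-day semester are illustrative approximations.
INTERVAL_DAYS = {1: 1, 2: 2, 3: 7, 4: 30, 5: 120}

def boxes_due(day_number: int) -> list[int]:
    """Boxes to review on a given study day, counting the first day as day 1.
    A box is due whenever the day number is a multiple of its interval."""
    if day_number < 1:
        raise ValueError("day_number starts at 1")
    return [box for box, interval in INTERVAL_DAYS.items()
            if day_number % interval == 0]

if __name__ == "__main__":
    for day in (1, 2, 7, 14, 30):
        print(day, boxes_due(day))
```

Box 1 is due every day under these assumptions, so missed cards that return to Box 1 are seen again in the very next session.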
## Digital Equivalents
- **Anki** (free, cross-platform): its default scheduler is derived from the SM-2 algorithm, which, like Leitner, lengthens the interval after each correct answer but adapts intervals per card instead of using fixed boxes
- **Quizlet** (freemium): supports basic spaced repetition
- **RemNote** (freemium): combines notes and flashcards with spaced repetition
- **Physical cards** remain effective and have the advantage of requiring the learner to produce cards manually, which is itself a form of retrieval practice
## Adapting for Short Timelines
- Compress to 3 boxes: Active (daily), Practice (every other day), Mastered (weekly)
- Do not retire cards to Box 5 until after the exam or deadline
- Treat any Box 3 card as "nearly mastered" and revisit weekly
## Common Mistakes
- **Stopping at Box 2 and feeling done:** Mastery requires reaching Box 4+ with at least one correct review cycle
- **Not returning missed cards to Box 1:** Allowing incorrect cards to stay in higher boxes creates a false sense of progress
- **Never reviewing Box 5:** Mastered material still requires periodic retrieval; reviewing Box 5 once per semester prevents long-term forgetting
FILE:references/mastery-signals.md
# Mastery Signals by Subject Type
## Universal Mastery Criteria (apply to all subjects)
1. Correct answer on 3 consecutive sessions without hesitation
2. Can explain in own words (not reciting source text)
3. Can connect to at least one other concept in the material
4. Concept has been in Box 4 or Box 5 for at least one review cycle
## By Subject Type
### Declarative Knowledge (facts, dates, terminology, classifications)
- Mastery: Correct production (not recognition) 3 times with 48+ hours between sessions
- Test format: Short-answer, fill-in-blank (not multiple choice)
- Warning: If you can recognize the answer but not produce it, you are not at mastery
### Conceptual Understanding (theories, mechanisms, relationships, principles)
- Mastery: Can explain to someone unfamiliar with the subject using an analogy or example
- Test format: Explanation questions, comparison questions, "why does X happen" questions
- Warning: If you can recite the definition but cannot apply it to a new scenario, you are not at mastery
### Procedural Knowledge (protocols, algorithms, clinical steps, formulas)
- Mastery: Can perform or write the steps from memory in correct order without prompting
- Test format: Sequence recall, "what comes next," error identification
- Warning: If steps only come to you while looking at the first step, you are not at mastery
### Application and Transfer (using knowledge in new contexts)
- Mastery: Can recognize which concept or procedure applies when given an unfamiliar scenario
- Test format: Case-based questions, novel problems not seen during study
- Warning: The hardest mastery level to achieve — requires interleaved practice across problem types
## Warning Signals and Corrective Actions
| Warning Signal | Likely Cause | Corrective Action |
|---------------|-------------|-------------------|
| Correct immediately after reading, forgotten 24 hours later | Spacing interval too short; drawing from working memory | Increase gap between sessions to 48+ hours |
| Feeling confident but scoring below 70% on practice test | Fluency illusion from rereading | Reduce rereading; increase self-quizzing ratio |
| Unable to answer any questions after a session | Learning targets too complex | Break each target into 2-3 sub-concepts; add prerequisite cards |
| Correct in isolation but incorrect in mixed practice | Lack of interleaving | Mix question types in each session; avoid blocking by topic |
| Correct for months then suddenly wrong | Long-term forgetting (natural) | This is why Box 5 exists; return to Box 1 and rebuild |
## Self-Calibration Protocol
At the start of each study session:
1. Without looking at notes, write down everything you can recall about today's topics
2. Check against your notes — what did you miss?
3. Focus the session on misses, not on what you already retrieved successfully
FILE:references/research-evidence.md
# Research Evidence: The Testing Effect
## Key Studies
### 1917 — First Large-Scale Investigation (Gates)
Children in grades 3-8 studied brief biographies. Groups who spent ~60% of study time in silent self-recitation showed better retention than those who reread. **60% self-recitation was the optimal split.**
### 1939 — Iowa Study (N=3,000 Sixth Graders)
- Articles studied then tested at various intervals before a final test two months later
- Longer delay before first test → greater forgetting
- After a student took one test, forgetting nearly stopped on subsequent tests
- **Key finding: A single retrieval event dramatically slows subsequent forgetting**
### 1978 — Cramming vs. Spacing Comparison
- Massed practice (cramming) → higher score on immediate test
- Two days later: crammers had forgotten 50% of what they recalled on the initial test
- Spaced retrieval group forgot only 13% over the same period
- **Cramming forgetting rate: 50%; Spaced retrieval forgetting rate: 13%**
### 2006-2007 — Columbia Middle School Study (Agarwal, Bain, Roediger)
- 6 social studies classes, 3 semesters, topics: ancient Egypt, Mesopotamia, India, China
- Three conditions compared: material that was quizzed, material that was restudied as statements of fact, and material that received no additional review
- **Results: Students scored a full grade level higher on quizzed material**
- Rereading as statements of fact produced no advantage over no additional review
- Study extended to 8th grade science (genetics, evolution, anatomy):
- Non-quizzed material: **79% (C+)**
- Quizzed material: **92% (A-)**
- Effect persisted 8 months later at end-of-year exams
### Single Test Effect
- One test in a class produces a **large improvement in final exam scores**
- Three tests "immunize" against forgetting (53% recall maintained vs. 39% for one-test group vs. 28% for no-test group after one week)
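The one-week recall figures above are easier to compare as gains over the no-test baseline. The short calculation below uses only the three percentages quoted in this section; the variable names are illustrative.

```python
# One-week recall percentages quoted above, by number of tests taken.
recall_after_one_week = {"no test": 28, "one test": 39, "three tests": 53}

baseline = recall_after_one_week["no test"]

# Gain in percentage points relative to the no-test group.
gain_in_points = {
    group: recall - baseline
    for group, recall in recall_after_one_week.items()
    if group != "no test"
}

if __name__ == "__main__":
    for group, gain in gain_in_points.items():
        print(f"{group}: +{gain} percentage points over no test")
```

The gains are percentage points, not relative percentages: one test adds 11 points and three tests add 25 points over the no-test group.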
### Generation Effect
- Filling in missing letters (foot → s\_\_e, completing "shoe") produces better recall than studying the complete pair
- Delaying retrieval (20 intervening items between study and test) produces better recall than retrieving immediately after studying the pair
- **Mechanism: greater cognitive effort required = stronger memory consolidation**
## Forgetting Curve Context
- Within hours of initial exposure: ~70% of new information is forgotten
- Forgetting slows after the initial loss
- A single retrieval event resets this curve
## Feedback and Error Correction
- Giving feedback after wrong answers strengthens retention more than testing alone
- Slightly delayed feedback (not immediate) produces better long-term learning than immediate correction
- Students who are tested frequently rate their classes more favorably at semester end
- 89% of students in retrieval-practice classrooms report it increased their learning
Design a concrete practice schedule that will actually make learning stick — not just feel productive. Use this skill when the user is preparing for a test,...
---
name: practice-schedule-designer
description: "Design a concrete practice schedule that will actually make learning stick — not just feel productive. Use this skill when the user is preparing for a test, building a new skill, training others, or planning a study program and needs to decide how to structure practice sessions over time. Triggers include: user is relying on marathon study sessions or cramming before a deadline; user practices one topic exhaustively before moving to the next; user feels they know material during practice but forgets it on tests or in real situations; user wants to know how often to review flashcards or revisit past material; user needs to design a training curriculum for a team or class; user is switching between topics during study and wants to know if that is helping or hurting; user is preparing for a performance context (exam, job, sport) and must choose between depth on one skill versus breadth across many. This skill does NOT address memorization technique or recall strategy — use retrieval-practice-study-system for those."
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/make-it-stick/skills/practice-schedule-designer
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: make-it-stick
title: "Make It Stick: The Science of Successful Learning"
authors: ["Peter C. Brown", "Henry L. Roediger III", "Mark A. McDaniel"]
chapters: [3, 4, 8]
tags: ["learning-science", "cognitive-psychology", "evidence-based-learning", "spaced-repetition", "interleaving", "practice-design"]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "User's learning goal, material list, available time, and current practice approach (if any)"
tools-required: [Write]
tools-optional: [Read]
mcps-required: []
environment: "Any agent environment; user describes their learning situation in text form or answers guided questions"
discovery:
goal: "Diagnose the user's current practice approach, classify it against the four practice types, then produce a concrete personalized schedule with spacing intervals, sequencing patterns, and tracking guidance"
tasks:
- "Gather learning goal, material type, timeline, and current practice approach"
- "Diagnose current practice type and identify which anti-patterns are present"
- "Select optimal practice strategy based on learning goal, material type, and time horizon"
- "Set spacing intervals calibrated to material difficulty and available time"
- "Design interleaving and variation patterns appropriate to the content"
- "Build a concrete schedule document the user can follow immediately"
- "Warn against the familiarity trap and the blocked-practice illusion"
audience: "Students, teachers, trainers, coaches, and lifelong learners who need to design or redesign their practice regime"
triggers:
- "User is cramming or marathon-studying before a deadline"
- "User practices one topic exhaustively before moving to the next"
- "User performs well during practice but forgets material on tests or in real situations"
- "User wants to know how often to revisit material"
- "User needs to design a training curriculum for a class or team"
- "User is alternating topics during study and wants to know if that helps"
- "User is preparing for a high-stakes performance context"
---
# Practice Schedule Designer
## When to Use
You have material to learn and a timeline to learn it in. What you need now is a practice structure — not just more time studying, but the right pattern of when, how often, and in what order to practice.
This skill is about **schedule design**, not study technique. It tells you how to arrange your practice sessions over time, which types of practice to combine, and how to set spacing intervals for your specific situation.
**Preconditions to verify:**
- Does the user know what they are trying to learn? If the material is completely undefined, ask them to identify it before continuing.
- Does the user have a rough timeline (days, weeks, months)? Interval recommendations depend on it.
**This skill does NOT cover:**
- How to execute retrieval practice within a session (use `retrieval-practice-study-system`)
- How to memorize facts efficiently (flashcard systems, mnemonics)
- How to manage motivation or study environment
---
## The Core Counterintuitive Principle
Before designing any schedule, establish this with the user if they seem unaware of it:
**Feeling productive during practice is not the same as learning durably.**
Massed practice — repeating one thing many times in a row — produces fast visible improvement. That improvement is real but shallow: it rests on short-term memory and fades quickly. Researchers call this "momentary strength." The techniques that build "habit strength" — the kind of learning that is still there weeks later when you need it — feel slower and harder during practice. You sense the effort but not the benefit the effort is creating.
This is why people persist in practicing the wrong way even after they have seen evidence that it does not work. They trust the feeling of progress over the data on retention. The schedule you design here will sometimes feel less productive than the old way. That discomfort is the signal that the learning is durable.
---
## Context and Input Gathering
### Required (ask if missing)
- **What are you learning?** Subject, skill, or material set (e.g., Spanish vocabulary, calculus problem types, sales pitch, guitar chord transitions, medical diagnosis protocols).
- **What is your timeline?** When do you need to perform or be tested? (e.g., exam in 3 weeks, job interview in 10 days, ongoing professional development with no deadline).
- **How much practice time do you have per week?** Total hours available, and how those break into sessions (e.g., 1 hour daily vs. 4 hours on weekends).
- **What does your current practice look like?** Walk through a recent session: what did you do first, second, and for how long? The answer reveals which practice type the user currently relies on and which anti-patterns are present.
### Useful (gather if available)
- **Performance context:** What does success look like in the moment of performance? (e.g., unseen exam questions, real-world patient encounters, live athletic competition, a job interview). This determines whether interleaving and variation are especially critical.
- **Material structure:** Is the content a set of similar items (vocabulary, flashcards), or a set of different problem types that require choosing the right approach?
- **Current mastery level:** Beginner (everything is new) vs. intermediate (some material familiar, some not) vs. advanced (maintaining a high level).
---
## Step 1 — Diagnose Current Practice Type
**Why:** Most learners default to massed practice without knowing it. The diagnosis names the pattern, which makes the problem concrete and motivates the schedule change.
Map what the user described onto one of these four types:
| Type | Definition | Recognition Signal |
|---|---|---|
| **Massed** | Long unbroken sessions on one topic; cramming; re-reading | "I study [topic] for 2 hours then move on" or "I cram the night before" |
| **Spaced** | Same material revisited across sessions with time gaps | "I review it again a few days later" |
| **Interleaved** | Multiple topics or problem types mixed within one session | "I mix different subjects in the same sitting" |
| **Varied** | Same skill practiced in different contexts, formats, or conditions | "I practice [skill] in different scenarios or with different examples" |
**Anti-patterns to flag explicitly:**
- **Cramming:** Massed practice compressed into a single session before a deadline. Effective for next-day recall; ineffective for retention beyond 48 hours.
- **Blocked practice:** Practicing the same drill or problem type in a fixed sequence before switching. Feels like variety because the station changes, but is still massed within each station. Common in sports (always running the same drill from the same position) and math courses (doing 20 problems of type A before moving to type B).
- **Familiarity trap:** Stopping practice on material that feels familiar, mistaking recognition for mastery. Recognized by statements like "I already know this one" or skipping flashcards that seem obvious.
State the diagnosis explicitly:
> "Your current practice is primarily [type]. You are experiencing [anti-pattern if present]. Here is what that costs you: [specific retention or transfer consequence]."
---
## Step 2 — Select the Optimal Practice Strategy
**Why:** The right mix of practice types depends on learning goal, material structure, and time horizon. There is no single correct answer — the decision framework below makes the choice explicit and defensible.
### Decision Framework
**Start with your primary learning goal:**
**Goal A — Memorize a fixed set of items** (vocabulary, dates, formulas, anatomical names, legal definitions)
- Primary strategy: **Spaced practice**
- Add: **Interleaving** if the items are similar enough to be confused with each other
- Do not add variation until items are partially learned
**Goal B — Learn to solve problems of a specific type** (math problems, diagnosis protocols, coding patterns)
- Primary strategy: **Interleaved practice** across problem types
- Add: **Spaced** intervals between sessions
- Warning: blocked practice will make performance during practice look better but test performance will be worse
**Goal C — Build a skill that must transfer to unpredictable real-world conditions** (athletic performance, clinical judgment, language conversation, negotiation)
- Primary strategy: **Varied practice** — deliberately change context, format, and conditions across sessions
- Add: **Interleaving** of related sub-skills
- Add: **Spaced** intervals
- Massed practice of a fixed drill will not transfer; practice the skill under conditions that vary as widely as the real performance conditions will
**Goal D — Maintain mastery of material already learned** (ongoing professional skills, language retention, athletic fundamentals)
- Primary strategy: **Spaced practice** with long intervals (monthly)
- Material that is well-mastered needs low-frequency review; the Leitner-box principle applies: the better your mastery, the less frequent the practice, but the material never disappears completely from the rotation
**Time horizon modifier:**
| Horizon | Implication |
|---|---|
| Less than 3 days | Spacing intervals are short (hours); prioritize retrieval practice over rereading |
| 1–4 weeks | Set intervals of 1 day → 3 days → 1 week; interleave topics within sessions |
| 1–3 months | Set intervals of 1 day → 1 week → 2–3 weeks → monthly; build in variation |
| Ongoing (no deadline) | Leitner-box style: frequency tracks mastery level; monthly review of well-mastered material |
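
The interval ladders in the table can be turned into calendar dates with a few lines of Python. This is a minimal sketch, not part of the source method: `review_dates` and `LADDERS` are hypothetical names, each arrow is assumed to mean "days after the previous review" (the table does not say whether the gaps are cumulative), and "2–3 weeks" and "monthly" are rounded to 14 and 30 days.

```python
from datetime import date, timedelta

# Interval ladders copied from the time-horizon table, in days.
# Assumption: "2-3 weeks" is taken as 14 days and "monthly" as 30 days.
LADDERS = {
    "1-4 weeks": [1, 3, 7],
    "1-3 months": [1, 7, 14, 30],
}

def review_dates(first_session: date, gaps_in_days: list[int], deadline: date) -> list[date]:
    """Return review dates, treating each gap as days since the previous review."""
    dates = []
    current = first_session
    for gap in gaps_in_days:
        current = current + timedelta(days=gap)
        if current > deadline:
            break  # never schedule a review after the performance date
        dates.append(current)
    return dates

plan = review_dates(date(2025, 3, 3), LADDERS["1-4 weeks"], deadline=date(2025, 3, 24))
for d in plan:
    print(d.isoformat())
```

Reviews that would land after the deadline are dropped rather than squeezed into a shorter gap, which keeps the remaining gaps at their intended length.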
---
## Step 3 — Set Spacing Intervals
**Why:** The spacing interval determines how much forgetting occurs between sessions. Some forgetting is desirable — the effort of retrieval after a small gap strengthens the memory. Too little gap and you are resting on short-term memory. Too much gap and retrieval approaches relearning from scratch, which is inefficient.
### Interval Calibration Rules
**Rule 1 — The minimum interval is "enough forgetting."**
If you can recall something effortlessly with no hesitation, you reviewed it too soon. A productive session has some difficulty; some items should require real effort to retrieve.
**Rule 2 — The maximum interval is "not so much forgetting that retrieval becomes relearning."**
If you cannot recall an item at all and must look it up as if encountering it for the first time, the interval was too long.
**Rule 3 — Sleep is a consolidation amplifier.**
At least one sleep period between practice sessions significantly aids memory consolidation. This means even in a compressed timeline, practice on Monday and practice on Tuesday are better than two sessions on Monday.
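
The three rules can be read as a feedback loop on the interval. The sketch below is one possible encoding, offered under stated assumptions: the function name, the three recall labels, and the 2.0 and 0.5 multipliers are illustrative and do not come from the source. The one-day floor reflects Rule 3 and would need relaxing for material reviewed within minutes or hours.

```python
def next_interval_days(current_days: float, recall: str,
                       lengthen: float = 2.0, shorten: float = 0.5,
                       min_days: float = 1.0) -> float:
    """Adjust a spacing interval from how the last retrieval felt.

    recall is one of:
      "effortless" - no hesitation, so the review came too soon (Rule 1)
      "effortful"  - recalled with real effort, the productive zone
      "failed"     - had to look it up as if new, so the gap was too long (Rule 2)
    The multipliers are illustrative assumptions, not values from the source.
    """
    if recall == "effortless":
        proposed = current_days * lengthen
    elif recall == "failed":
        proposed = current_days * shorten
    elif recall == "effortful":
        proposed = current_days
    else:
        raise ValueError(f"unknown recall outcome: {recall!r}")
    # Rule 3: keep at least one sleep period between sessions.
    return max(min_days, proposed)

print(next_interval_days(3, "effortless"))
print(next_interval_days(4, "failed"))
```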
**Recommended starting intervals by material type:**
| Material | First review | Second review | Third review | Ongoing |
|---|---|---|---|---|
| Names and faces, arbitrary associations | Within minutes (high forgetting rate) | Same day | Next day | Weekly, then monthly |
| New concepts from a text or lecture | Within 24 hours | 3–5 days later | 1–2 weeks later | Monthly |
| Problem-solving skills | 1–2 days | 1 week | 2–3 weeks | Monthly |
| Complex judgment skills (clinical, athletic) | Next session | 1 week | 3 weeks | Monthly; plus real-world practice |
**Leitner-box logic for flashcard-style material:**
Divide material into difficulty tiers based on current mastery:
- **Tier 1 (errors frequent):** Review every session
- **Tier 2 (mostly correct):** Review every other session
- **Tier 3 (reliably correct):** Review weekly
- **Tier 4 (mastered):** Review monthly — but never remove from rotation until the learning goal is fully met
When you answer incorrectly, move the item back one tier toward Tier 1 (more frequent review). When you answer correctly in consecutive sessions, move it forward one tier toward Tier 4 (less frequent review).
---
## Step 4 — Design the Interleaving and Variation Pattern
**Why:** Interleaving and variation do the work that spacing cannot do alone. Spacing builds retention of individual items. Interleaving builds discrimination — the ability to recognize which type of problem you are facing and select the right approach. Variation builds transfer — the ability to apply a skill in conditions different from where you practiced it.
### Interleaving Design Rules
**When to interleave:** When the material contains two or more distinct problem types, subject areas, or skill categories that the learner must eventually distinguish between. Do not interleave before the learner has a basic grasp of each type — a small amount of initial blocked practice to introduce a new problem type is acceptable.
**How to interleave:** Within a single session, rotate through topics or problem types without completing a full block of any one type. The switch should happen before the learner feels fully on top of the current topic. That feeling of incompleteness is the point: the interruption forces the learner to start fresh retrieval on return, which builds the discrimination ability needed for tests and real situations.
**Do not confuse blocked practice with interleaving:** If your session has 20 minutes on topic A, then 20 minutes on topic B, then 20 minutes on topic C — that is blocked practice across the session, not interleaving. True interleaving mixes A, B, and C within each segment.
> Example: A student preparing for a statistics exam should not work 30 problems of hypothesis testing, then 30 problems of regression, then 30 problems of ANOVA. Instead: do 5 problems, switch types, do 5 more of a different type, switch again. The session will feel slower and less satisfying. The exam results will be better.
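
The rotation in the statistics example can be sketched as a small scheduler. This is an illustration only: `interleave` is a hypothetical helper, the chunk size of 5 comes from the example above, and the problem labels are placeholders.

```python
from collections import deque

def interleave(problem_sets: dict[str, list[str]], chunk: int = 5) -> list[tuple[str, str]]:
    """Rotate through problem types, taking at most `chunk` problems per turn."""
    queues = deque((name, deque(items)) for name, items in problem_sets.items())
    order = []
    while queues:
        name, items = queues.popleft()
        for _ in range(min(chunk, len(items))):
            order.append((name, items.popleft()))
        if items:  # type still has problems left, so it rejoins the rotation
            queues.append((name, items))
    return order

sets = {
    "hypothesis testing": [f"H{i}" for i in range(1, 31)],
    "regression": [f"R{i}" for i in range(1, 31)],
    "ANOVA": [f"A{i}" for i in range(1, 31)],
}
session = interleave(sets)
# Type of the 1st, 6th, and 11th problem in the session.
print([name for name, _ in session[:15:5]])
```

Two limitations of this sketch: the rotation order is fixed, so shuffling the type order on each pass would be closer to the unpredictable order of a real exam, and when one type has far more problems than the others, its leftover problems end up back to back at the end of the session.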
### Variation Design Rules
**When to vary:** When the learning goal requires transfer to unpredictable real-world conditions such as athletic performance, clinical encounters, language conversation, or professional problem-solving. If the test conditions will differ from practice conditions, the practice conditions must themselves vary from session to session.
**How to vary:** Change one or more of the following across sessions: context (location, sequence, trigger conditions), format (problem presentation, question phrasing), conditions (speed, materials available, teammate or patient involved), or level of challenge (harder examples, more ambiguous cases).
**Blocked practice warning for motor and procedural skills:** Always practicing a drill from the same position, in the same sequence, at the same speed locks the skill to those conditions and prevents transfer. Vary the starting position, the sequence, the speed, and the context systematically.
> Example: A professional learning a sales pitch should not practice it the same way from the same prompt every time. Vary the opener, the objection presented, the simulated client's emotional state, and the medium (phone vs. in-person vs. video call). This is uncomfortable — it will feel like the learning is not taking hold. That discomfort is producing a more robust skill.
---
## Step 5 — Build the Practice Schedule Document
**Why:** An abstract plan does not change behavior. A concrete schedule the user can open on Monday morning and follow without re-reading instructions does.
Produce a schedule document with the following components:
**Header block:**
- Learning goal
- Timeline (start date and performance date)
- Material list (what is being practiced)
- Total sessions planned
- Session length
**Session-by-session plan** (for schedules 4 weeks or shorter, list every session; for longer schedules, provide a repeating weekly template plus interval triggers):
For each session:
- Date or week number
- Duration
- Topics to cover and rotation sequence (for interleaved sessions)
- What to retrieve (specific problem types, flashcard tiers, or skill variants)
- What to check (which mastery tier items are due for review)
**Interval trigger rules** (so the user can adapt if they miss a session):
- "If you miss a session, do not double up the next day. Resume the schedule; the gap is not wasted — some forgetting during the gap is acceptable."
- "If an item in Tier 1 is still in Tier 1 after 3 sessions, flag it for deeper investigation or alternate explanation — repetition alone will not fix a conceptual gap."
**Anti-pattern warning reminders** (embed in the schedule):
- At the start of every session: "Resist the urge to review items you feel you already know. Quiz yourself before checking. The familiarity trap is the most common reason well-prepared learners underperform."
- At the midpoint of any block: "If this is starting to feel effortless, switch topics or problem types now — not when you have finished the block."
---
## Worked Examples
### Example A: Exam Preparation — 3-Week Timeline
**Situation:** University student, organic chemistry final exam in 21 days. Currently: reads lecture notes for 2 hours then attempts a few problems at the end of each session. Covers one reaction type per session.
**Diagnosis:** Massed practice with blocked problem-solving. The reading-then-practice pattern means retrieval happens only after review, so short-term memory is carrying the work. One reaction type per session = blocked practice.
**Selected strategy:** Interleaved practice across reaction types, spaced across sessions.
**Spacing intervals:** Sessions 6 days per week, 90 minutes each.
**Week 1:** Introduce all 6 reaction types, two per session, so that each type appears in 2 of the week's 6 sessions (brief blocked intro, ~15 min per type). By end of week, all types have been seen.
**Weeks 2–3:** Every session mixes problem types. 6 problems per sitting, each from a different reaction type in random order. Also retrieve prior lecture concepts: at the start of each session, answer 3 questions from Week 1 material before working new problems.
**Interval triggers:** Any reaction type answered incorrectly moves to "daily review" status. Any answered correctly 3 times in a row in interleaved conditions moves to "every-other-session" status.
---
### Example B: Professional Skill — Ongoing Sales Training
**Situation:** A team of 8 new sales agents needs to learn 4 skill areas over 6 months: prospecting, product knowledge, objection handling, and business planning. Current approach: one full day of training on each skill area before moving to the next.
**Diagnosis:** Massed curriculum design. Full-day blocks on single topics will produce rapid performance during training but weak retention and poor transfer when agents face real customers who mix topics spontaneously.
**Selected strategy:** Spiraling interleaved curriculum with spaced return to all 4 skill areas.
**Design:** Weekly sessions cycle through all 4 areas, returning to each with new examples and more complex scenarios that require applying earlier learning in a new context. No single session devotes more than 25% of time to one skill area.
**Variation:** Role-play scenarios in each session change the client profile, objection type, and product involved. After month 2, agents practice with no script, varying the opener.
**Tracking:** Weekly 5-question quiz covering one item from each of the 4 areas plus one item from a randomly selected prior week. Agents track their own error rate per area. Any area below 70% correct triggers additional retrieval practice in the next session.
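
The 70% trigger is simple to compute once quiz results are recorded per area. A minimal sketch follows, assuming results are accumulated over several weeks (a single weekly quiz has only one item per area) and using a hypothetical function name and made-up sample data.

```python
def areas_needing_practice(results: dict[str, list[bool]], threshold: float = 0.70) -> list[str]:
    """Return skill areas whose share of correct quiz answers is below threshold."""
    flagged = []
    for area, answers in results.items():
        if not answers:
            continue  # no quiz data yet for this area
        accuracy = sum(answers) / len(answers)
        if accuracy < threshold:
            flagged.append(area)
    return flagged

# Illustrative data only: four weeks of one-item-per-area quiz results.
history = {
    "prospecting": [True, True, False, True],          # 75%
    "product knowledge": [True, True, True, True],     # 100%
    "objection handling": [False, True, False, True],  # 50%
    "business planning": [True, False, True, False],   # 50%
}
print(areas_needing_practice(history))
```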
---
### Example C: Athletic Skill — Learning a Technical Movement
**Situation:** Tennis player improving their second serve. Currently practices 50 serves in a row at the end of every session.
**Diagnosis:** Massed practice (blocked repetition from the same position). Will encode a serve that works in practice conditions but deteriorates under match pressure and varied court position.
**Selected strategy:** Varied practice with spacing.
**Design:** Second serve practice is distributed within each session (not saved for the end). Each serve set uses a different starting court position, a different target zone, and a different immediately preceding shot (serve after a baseline rally, serve cold at the start, serve under a 30-second time limit). No more than 10 consecutive serves from the same position.
**Spacing:** Serve technique reviewed every session but never as a marathon block. Maintenance once per session (15 minutes), interleaved with other technical elements.
---
## Quick Reference: Practice Type Selection
```
LEARNING GOAL → PRIMARY STRATEGY
------------------------------------------------------
Memorize fixed items → Spaced (+ interleave if confusable)
Solve typed problems → Interleaved (+ spaced sessions)
Transfer to unpredictable use → Varied (+ interleaved + spaced)
Maintain existing mastery → Spaced (long intervals, Leitner logic)
TIME HORIZON → STARTING INTERVAL
------------------------------------------------------
< 3 days → Hours between sessions; prioritize retrieval
1–4 weeks → 1 day → 3 days → 1 week
1–3 months → 1 day → 1 week → 2–3 weeks → monthly
Ongoing → Mastery-based (Leitner tiers)
ANTI-PATTERN CHECK → CORRECTION
------------------------------------------------------
Cramming → Break into sessions with gaps
Blocked practice → Interleave within sessions
Familiarity trap → Quiz before reviewing; never skip "known" items
```
---
## References
- `references/practice-type-comparison.md`: Full research evidence for each of the four practice types, including the geometry study (89% blocked vs. 60% interleaved during practice; 20% blocked vs. 63% interleaved on delayed test), the surgical resident spaced training study, and the beanbag motor learning experiment
- `references/spacing-interval-guide.md`: Detailed interval tables by material type, cognitive load, and learner experience level; guidance for compressing or expanding schedules when time constraints change
- `references/leitner-box-implementation.md`: Step-by-step Leitner box setup for flashcard-style material; digital and physical implementations; troubleshooting stalled items
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Make It Stick: The Science of Successful Learning by Peter C. Brown, Henry L. Roediger III, Mark A. McDaniel.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
FILE:references/leitner-box-implementation.md
# Leitner Box Implementation
## Overview
The Leitner box is a physical or logical system invented by German science journalist Sebastian Leitner for managing spaced retrieval practice of flashcard-style material. Its core principle: the better your mastery of an item, the less frequently you need to practice it — but items you know well never leave the system until the learning goal is fully met.
The system solves two common problems:
1. Learners spend equal time on items they know well and items they struggle with — an inefficient allocation.
2. Learners stop practicing items that feel familiar, triggering the familiarity trap and slow undetected forgetting.
---
## Physical Setup: Four Tiers
### Materials
- 4 physical containers (boxes, folders, file drawer sections, or divided notebook sections)
- Label them Tier 1 through Tier 4
- One card or slip per item to be learned
### Tier Definitions and Review Frequency
| Tier | Mastery Level | Review Frequency |
|---|---|---|
| Tier 1 | Errors frequent; not yet reliable | Every practice session |
| Tier 2 | Mostly correct but not consistent | Every other session |
| Tier 3 | Reliably correct in same-topic blocks | Once per week |
| Tier 4 | Reliably correct in mixed sessions | Once per month |
All new items start in Tier 1.
---
## Movement Rules
### Advancing (Tier 1 → 2 → 3 → 4)
Move an item forward one tier when:
- You answer it correctly in an **interleaved session** (mixed with other topics)
- You do so in two or more consecutive sessions without error
Do not advance based on blocked performance (answering correctly when all cards in the session are from the same topic). Correct recall in massed conditions may reflect short-term memory from recent exposure, not durable learning.
### Demoting (Any tier → Tier 1)
Move an item back to Tier 1 when:
- You answer it incorrectly in any session, regardless of its current tier
This rule is strict for a reason: an error on an item you have been practicing indicates that durable learning has not yet been achieved for that item. The item needs more frequent retrieval.
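
The movement rules above fit in a short state update. The following is a sketch under stated assumptions rather than a reference implementation: `Card` and `record_answer` are hypothetical names, the streak resets after each promotion, and a correct answer in a blocked session leaves the streak unchanged (the rules do not say whether it should break it).

```python
from dataclasses import dataclass

MAX_TIER = 4
STREAK_TO_ADVANCE = 2  # "two or more consecutive sessions without error"

@dataclass
class Card:
    prompt: str
    tier: int = 1     # all new items start in Tier 1
    streak: int = 0   # consecutive correct answers in interleaved sessions

def record_answer(card: Card, correct: bool, interleaved: bool) -> Card:
    """Apply the movement rules to one card after one review."""
    if not correct:
        card.tier = 1      # any error sends the item back to Tier 1
        card.streak = 0
        return card
    if not interleaved:
        return card        # blocked-session successes do not count toward advancing
    card.streak += 1
    if card.streak >= STREAK_TO_ADVANCE and card.tier < MAX_TIER:
        card.tier += 1
        card.streak = 0    # assumption: a fresh streak is needed for the next tier
    return card

card = Card("capital of Texas")
record_answer(card, correct=True, interleaved=True)
record_answer(card, correct=True, interleaved=True)
print(card.tier)
```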
---
## Session Structure
### Session Composition
Each session contains:
1. **All Tier 1 items** (reviewed every session)
2. **Tier 2 items** (every other session — check the schedule)
3. **Tier 3 items** (if it is the weekly review day)
4. **Tier 4 items** (if it is the monthly review date)
Shuffle all items from all tiers scheduled for today into a single deck before beginning. Do not separate them into tier-by-tier blocks — the mixing is what makes the session interleaved and what validates advancement decisions.
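
Session composition can be sketched as "collect the due tiers, then shuffle once". The code below is illustrative: `build_session_deck` is a hypothetical helper, and treating even-numbered sessions as the "every other session" for Tier 2 is an assumption, since the text leaves that schedule to the learner.

```python
import random
from typing import Optional

def build_session_deck(cards_by_tier: dict[int, list[str]], session_number: int,
                       weekly_review_day: bool, monthly_review_day: bool,
                       rng: Optional[random.Random] = None) -> list[str]:
    """Collect every tier due today and shuffle them into one deck."""
    due_tiers = [1]                      # Tier 1: every session
    if session_number % 2 == 0:          # Tier 2: every other session (assumed: even-numbered)
        due_tiers.append(2)
    if weekly_review_day:                # Tier 3: weekly review day
        due_tiers.append(3)
    if monthly_review_day:               # Tier 4: monthly review date
        due_tiers.append(4)
    deck = [card for tier in due_tiers for card in cards_by_tier.get(tier, [])]
    (rng or random).shuffle(deck)        # one mixed deck, never tier-by-tier blocks
    return deck

boxes = {1: ["a", "b"], 2: ["c"], 3: ["d"], 4: ["e"]}
deck = build_session_deck(boxes, session_number=2, weekly_review_day=True,
                          monthly_review_day=False, rng=random.Random(0))
print(sorted(deck))
```

Passing a seeded `random.Random` makes a session reproducible for testing; in normal use the argument can be omitted.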
### Minimum Session Requirements
- Never skip a session that contains Tier 1 items. These are the items most at risk of being forgotten.
- If time is short, prioritize Tier 1 and postpone the Tier 4 monthly review to the next session rather than dropping it. Dropping it altogether is the familiarity trap in action.
---
## Troubleshooting Stalled Items
### Symptom: An item stays in Tier 1 for more than 3–4 sessions without advancing
This indicates the item is not being learned by repetition alone. Possible causes and remedies:
**Cause 1 — The item requires context, not just recall**
Symptoms: You can recall the item when you see the card, but you cannot generate it from memory in a real situation.
Remedy: Add context to the card. Instead of "Q: What is the capital of Texas? A: Austin" — try "Q: If you are driving from San Antonio to Dallas and stop at the state capital, where do you stop? A: Austin." Force the retrieval to mirror the context where you need the knowledge.
**Cause 2 — The item is being confused with a similar item**
Symptoms: You confuse this item with one or two other items consistently.
Remedy: Create a comparison card that explicitly contrasts the two confusable items. Practice them in consecutive slots (not blocks — just consecutively to highlight the difference), then return to interleaved practice.
**Cause 3 — The item lacks a hook**
Symptoms: The item feels arbitrary; there is no connection to anything you already know.
Remedy: Generate an elaboration — a story, analogy, image, or connection to prior knowledge. Write the hook on the back of the card. The hook does not need to be logical; it needs to be memorable. Even a silly or exaggerated image aids encoding.
**Cause 4 — The item is too big**
Symptoms: The item requires recalling a multi-step process or a long list.
Remedy: Break it into smaller cards. Each card should require a single, specific retrieval act. Multi-step cards often mask partial mastery as failure.
---
## Adapting to Digital Tools
Spaced-repetition software (Anki, RemNote, SuperMemo, etc.) implements equivalent logic with algorithmic precision:
- Individual intervals per item (not tier-based)
- Performance data adjusts intervals automatically
- Retention rate tracking over time
**When to use digital vs. physical:**
| Situation | Recommendation |
|---|---|
| Large material set (> 200 items) | Digital — manual tracking becomes impractical |
| Small material set (< 50 items) | Physical — simpler, no setup overhead, easier to inspect |
| Material requiring images, diagrams | Digital — better display support |
| Study group or shared curriculum | Physical — easier to share and modify collectively |
| Long-term retention (years) | Digital — algorithmic intervals adapt better over very long time horizons |
### Anki Configuration for Leitner-Style Behavior
Default Anki settings are close to Leitner logic. Adjust:
- **New cards per day:** Set low (10–20) to avoid overwhelming Tier 1 with simultaneous new items
- **Review order:** Set to "Random order" within due items — not "Due date" which can cluster items from the same topic
- **Interval modifier:** Default is 100%; increase to 120–130% only if you are reliably hitting 90%+ correct, to extend intervals further
Anki's "Ease" factor automatically reduces intervals for difficult items and extends them for easy items — equivalent to the Leitner tier system but at item-level granularity.
---
## The Familiarity Trap in Leitner Systems
The Leitner box's biggest failure mode is the learner stopping review of Tier 3 and Tier 4 items because they "feel like they know them." This is the familiarity trap: recognition does not equal recall under pressure.
Rules to prevent this:
1. Monthly Tier 4 review is non-negotiable. Schedule it on the calendar, not "when I feel like it."
2. If you skip a Tier 4 review for two or more months, demote those items to Tier 3 automatically.
3. Before any high-stakes performance (exam, presentation, job interview), pull all items from Tier 3 and Tier 4 and quiz yourself under timed, interleaved conditions. The goal is to confirm mastery, not to review — if mastery is there, the session will be short.
FILE:references/practice-type-comparison.md
# Practice Type Comparison: Research Evidence
## The Four Practice Types
### Massed Practice
Massed practice means concentrating study or drill into a single unbroken session or closely spaced repetitions. It is the default approach for most learners and is actively recommended by many educators and coaches.
**Why it feels effective:** During a massed session, performance improves visibly and rapidly. Researchers call this "momentary strength" — a real but temporary gain that rests on short-term memory rather than consolidated long-term storage.
**What it actually produces:** Fast acquisition, fast forgetting. The rapid gains do not survive the interval between practice and performance. Researchers describe massed practice as analogous to binge-and-purge: a large amount goes in, but most comes out quickly.
**When it is appropriate:** Massed practice has one defensible use: initial introduction to a completely new concept or skill type. A brief blocked period (15–20 minutes) to establish a working model of a new problem type before interleaving begins is acceptable and reduces early confusion.
---
### Spaced Practice
Spaced practice distributes study across multiple sessions with gaps between them. The gaps allow memory consolidation — the neurological process by which new learning is stabilized, connected to prior knowledge, and encoded in long-term memory.
**Key research evidence:**
- A study of 38 surgical residents learning microsurgery divided participants into two groups: one completed four lessons in a single day (standard schedule); the other completed the same four lessons with one week between sessions. Tested one month later, the spaced group outperformed the massed group on all measures — elapsed surgery time, number of hand movements, and successful procedure completion. 16% of the massed group damaged the tissue beyond repair and could not complete the surgery; none of the spaced group did.
- The mechanism: embedding new learning in long-term memory requires consolidation, a process that unfolds over hours and may take several days. Rapid-fire practice relies on short-term memory. The effort of retrieving learning after a small forgetting interval re-triggers consolidation and further strengthens the memory trace.
**Optimal spacing:** Enough time for some forgetting to occur, but not so much that retrieval becomes relearning. At minimum, one sleep period between sessions. In practice: 1 day → 3–5 days → 1–2 weeks → monthly.
---
### Interleaved Practice
Interleaved practice mixes two or more subjects or problem types within a single session, switching before each topic is fully exhausted.
**Key research evidence:**
- College students learning to compute volumes of four geometric solids: massed group averaged 89% correct during practice; interleaved group averaged 60%. On a delayed test one week later: massed group scored 20%; interleaved group scored 63%. The interleaved approach produced 215% better performance on the delayed test despite appearing worse during acquisition.
- Studies of art attribution (matching paintings to artists) and bird species classification both confirmed that interleaved exposure to multiple categories produced better discrimination ability than studying one category exhaustively before moving to the next. The massed-practice participants learned the common features of individual categories; the interleaved participants learned the distinguishing differences between categories — which is what real-world classification requires.
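
The 215% figure in the geometry study is a relative improvement, not a difference in percentage points. A short check of the arithmetic, using only the scores quoted above (the function name is hypothetical):

```python
def relative_change_percent(baseline: float, other: float) -> float:
    """Percent change of `other` relative to `baseline`."""
    return (other - baseline) / baseline * 100

# Scores quoted above, in percent correct; the massed group is the baseline.
practice = relative_change_percent(89, 60)   # interleaved vs. massed during practice
delayed = relative_change_percent(20, 63)    # interleaved vs. massed one week later
print(round(practice, 1), round(delayed, 1))  # prints: -32.6 215.0
```

During practice the interleaved group scored about a third lower than the massed group, while on the delayed test it scored 43 percentage points higher, which is 215% of the massed group's 20%.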
**Why interleaving works:** When you practice one problem type until you have it "down cold" before moving to another, you never practice the critical skill of identifying which type you are facing. Real exams and real situations present problems in unpredictable order. Interleaving forces continuous discrimination, building the sorting ability that transfers to performance contexts.
**Common confusion — blocked practice vs. interleaving:** Blocked practice (moving from station to station in a fixed sequence) is often mistaken for interleaving. The distinction: in blocked practice, you complete a full set of one type before switching. In interleaving, you switch before any type is complete. A station-rotation drill in sports or a textbook-chapter-by-chapter study plan are both blocked practice.
---
### Varied Practice
Varied practice changes the conditions, format, or context of practice across sessions, even when the core skill being practiced remains the same.
**Key research evidence:**
- Eight-year-olds practiced throwing beanbags into buckets: one group always used a 3-foot distance; the other group alternated between 2-foot and 4-foot distances. All children were then tested on the 3-foot distance — which only the massed group had actually practiced. The varied-practice group outperformed the massed group on this specific distance they had never practiced.
- The mechanism: varied practice appears to be consolidated in a different neural region than massed practice — one associated with higher-order motor skill learning rather than simple habit formation. The inference is that varied practice produces a more flexible, broadly applicable skill representation.
- Neuroimaging studies support the interpretation that varied practice "encodes learning in a more flexible representation that can be applied more broadly."
**Application to cognitive skills:** The beanbag effect extends to cognitive learning. In an anagram study, participants who practiced a word using multiple anagram variants outperformed those who practiced the same anagram repeatedly — even when both groups were tested on the specific anagram the massed group had practiced.
**Blocked practice as a named anti-pattern:** Blocked practice in motor skills involves practicing the same drill in the same sequence from the same position. The hockey example: a team that practices a one-touch pass always from the same position on the ice is practicing for a scenario that will never occur in a game. Effective varied practice requires changing position, sequence, speed, and context systematically.
---
## Momentary Strength vs. Habit Strength
This distinction is the conceptual key to understanding why learners persist in massed practice despite its ineffectiveness.
**Momentary strength:** The heightened ability to perform a skill that builds during an acquisition session. Real, measurable, visible — and based primarily on short-term memory. Fades quickly after the session ends.
**Habit strength:** The underlying durable learning that persists across time and transfers to new conditions. Builds slowly, produces no visible improvement signal during practice, and requires spaced, interleaved, or varied practice conditions.
The techniques that build habit strength (spacing, interleaving, variation) feel less productive and often produce slower visible gains during practice. Learners consistently rate massed practice as more effective even after experiencing evidence that interleaved practice produced better test results. The subjective feeling of productivity is a systematically misleading signal when evaluating practice quality.
---
## Sources
- Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). *Make It Stick: The Science of Successful Learning*. Harvard University Press. Chapters 3, 4, 8.
- Rohrer, D., & Taylor, K. (2007). The shuffling of mathematics problems improves learning. *Instructional Science*, 35, 481–498. (geometry volume study)
- Kornell, N., & Bjork, R. A. (2008). Learning concepts and categories: Is spacing the "enemy of induction"? *Psychological Science*, 19, 585–592. (art attribution study)
- Shea, J. B., & Morgan, R. L. (1979). Contextual interference effects on the acquisition, retention, and transfer of a motor skill. *Journal of Experimental Psychology: Human Learning and Memory*, 5, 179–187. (motor variability studies)
FILE:references/spacing-interval-guide.md
# Spacing Interval Guide
## Core Principles
### The Forgetting Curve and Optimal Spacing
The goal of spacing is not to avoid forgetting — it is to engineer a controlled amount of forgetting that forces effortful retrieval. The retrieval effort re-triggers memory consolidation, making the trace stronger than it was before. This is why "a little forgetting between practice sessions can be a good thing."
The two failure modes:
- **Too little spacing:** Session B occurs before any meaningful forgetting from Session A. Retrieval requires no effort because short-term memory is still active. The session feels highly productive. Little new consolidation occurs.
- **Too much spacing:** So much has been forgotten that retrieving the material is essentially relearning it. Retrieval fails outright. The session requires looking up answers rather than recalling them, reducing the retrieval benefit.
The optimal zone is: "Effort required, but recall is possible."
### Sleep as a Consolidation Amplifier
Memory consolidation is strongly linked to sleep. At minimum, a single sleep period between practice sessions produces meaningfully better long-term retention than two sessions on the same day. Even in compressed schedules, "Monday morning and Monday evening" is inferior to "Monday and Tuesday" when long-term retention matters.
For material that requires deep integration (complex problem-solving, judgment skills, creative skills), multiple sleep periods before re-testing are even better than a single night.
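A minimal sketch of how the "at least one sleep period" check could be automated. The function name `sleep_periods_between` is hypothetical, and the code assumes the learner sleeps once per night, so each calendar-day boundary between two sessions counts as one sleep period. A session held after midnight without sleeping would be miscounted.

```python
from datetime import datetime


def sleep_periods_between(first: datetime, second: datetime) -> int:
    """Count calendar-day boundaries between two sessions.

    Assumption (not from the guide): one sleep period per night,
    so each date change equals one sleep period.
    """
    if second < first:
        raise ValueError("second session must not precede the first")
    return (second.date() - first.date()).days


# 1 January 2024 was a Monday.
monday_morning = datetime(2024, 1, 1, 9, 0)
monday_evening = datetime(2024, 1, 1, 20, 0)
tuesday_morning = datetime(2024, 1, 2, 9, 0)

print(sleep_periods_between(monday_morning, monday_evening))   # 0
print(sleep_periods_between(monday_morning, tuesday_morning))  # 1
```

A result of 0 flags a pair of sessions that falls short of the one-sleep-period minimum for the same material.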
---
## Interval Tables by Material Type
### Names, Faces, and Arbitrary Associations
These associations have a steep forgetting curve — most loss occurs within the first 24 hours.
| Session | Interval After Previous |
|---|---|
| Initial exposure | — |
| First review | Within 10–30 minutes of initial exposure |
| Second review | Same day (4–8 hours later) |
| Third review | Next day |
| Fourth review | 3 days later |
| Fifth review | 1 week later |
| Ongoing maintenance | Monthly |
**Note:** If performance in any review session falls below 60% correct, reset the interval to the previous level.
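The reset rule can be sketched as a small step function over the table's levels. This is one reading of "reset the interval to the previous level" (step back one row); the names `INTERVALS_MIN` and `next_level`, and the choice of the lower bound of each range, are illustrative assumptions rather than part of the guide.

```python
# Intervals from the table above, in minutes after the previous session.
# Lower bounds are used where the table gives a range (an assumption).
INTERVALS_MIN = [
    ("first review", 10),             # within 10-30 minutes
    ("second review", 4 * 60),        # same day, 4-8 hours later
    ("third review", 24 * 60),        # next day
    ("fourth review", 3 * 24 * 60),   # 3 days later
    ("fifth review", 7 * 24 * 60),    # 1 week later
    ("maintenance", 30 * 24 * 60),    # monthly, approximated as 30 days
]

RESET_THRESHOLD = 0.60  # below 60% correct, step back one level


def next_level(level: int, fraction_correct: float) -> int:
    """Return the index into INTERVALS_MIN to use before the next review.

    `level` is the index of the interval that preceded the review
    that was just scored.
    """
    if fraction_correct < RESET_THRESHOLD:
        return max(level - 1, 0)                     # never below the first row
    return min(level + 1, len(INTERVALS_MIN) - 1)    # cap at maintenance


# A 50% score after the "fourth review" interval (index 3)
# sends the learner back to the "third review" interval (index 2).
label, minutes = INTERVALS_MIN[next_level(3, 0.50)]
print(label, minutes)  # third review 1440
```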
---
### New Concepts from Text or Lecture
Standard material encountered in academic or professional reading.
| Session | Interval After Previous |
|---|---|
| Initial reading or lecture | — |
| First retrieval (self-quiz without notes) | Same day or within 24 hours |
| Second retrieval | 3–5 days later |
| Third retrieval | 1–2 weeks later |
| Ongoing maintenance | Monthly |
**Key:** The first retrieval should occur without reviewing the source material first. Attempt to recall, then check. The failed retrieval attempts are not wasted — they prepare the mind to encode the corrected information more deeply.
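To turn the table above into calendar dates, a sketch along these lines may help. The gap values are assumptions picked from within each range (1 day, 3 days, 7 days), and `retrieval_dates` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Gap in days after the previous session. Each value is one point
# chosen from the table's range (an assumption, not a prescription).
GAPS_DAYS = [
    ("first retrieval", 1),   # same day or within 24 hours
    ("second retrieval", 3),  # 3-5 days later
    ("third retrieval", 7),   # 1-2 weeks later
]


def retrieval_dates(initial: date) -> list[tuple[str, date]]:
    """Return (label, date) pairs, each gap measured from the previous session."""
    schedule = []
    current = initial
    for label, gap in GAPS_DAYS:
        current = current + timedelta(days=gap)
        schedule.append((label, current))
    return schedule


for label, day in retrieval_dates(date(2024, 1, 1)):
    print(label, day.isoformat())
# first retrieval 2024-01-02
# second retrieval 2024-01-05
# third retrieval 2024-01-12
```

Monthly maintenance is left out of the sketch because it repeats indefinitely.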
---
### Problem-Solving Skills (Math, Logic, Diagnosis)
Material where the goal is not to recognize a fact but to apply a procedure or select the correct approach.
| Session | Interval After Previous |
|---|---|
| Introduction to problem type | — |
| First solo practice (interleaved with other types) | 1–2 days |
| Second practice (interleaved) | 1 week |
| Third practice (interleaved) | 2–3 weeks |
| Ongoing maintenance | Monthly |
**Note:** Interleaving is especially important here. Do not wait until mastery of one type is complete before introducing others. The point is to build discrimination ability, which requires encountering multiple types together.
---
### Complex Judgment and Transfer Skills
Clinical diagnosis, athletic performance, negotiation, language production, leadership decisions.
| Session | Interval After Previous |
|---|---|
| Initial exposure and structured practice | — |
| First varied practice (changed conditions) | Next session (1–2 days) |
| Second varied practice | 1 week |
| Third varied practice | 3 weeks |
| Real-world application and reflection | As soon as available |
| Ongoing maintenance | Monthly; plus reflection after each real-world encounter |
**Note:** Real-world encounters count as retrieval practice sessions when followed by reflection. A doctor who sees a patient and then reviews the encounter systematically is performing spaced retrieval practice.
---
## Adapting Intervals for Time Constraints
### Compressed Schedule (Exam in < 1 Week)
When the timeline is short, spacing intervals must be compressed. The principle remains the same — some gap between sessions is better than none.
- Minimum gap: One sleep period between sessions covering the same material.
- Use every available gap strategically: an evening session followed by a session the next morning has one sleep period between them, which is better than two sessions back-to-back on the same morning.
- Prioritize retrieval (self-testing without notes) over re-reading. In compressed timelines, re-reading returns near zero value; retrieval practice returns the highest value per minute.
- If the exam is tomorrow: do a retrieval session tonight (testing yourself without looking), sleep, and do a brief review in the morning. Do not cram through the night: sleep deprivation impairs consolidation of what you studied during the prior week.
### Extended Schedule (3+ Months)
- Start with short intervals and lengthen them as mastery increases.
- Use the Leitner-box principle: items that are reliably retrieved correctly in interleaved conditions get moved to longer intervals (weekly → bi-weekly → monthly).
- Revisit all material at least monthly regardless of apparent mastery. The familiarity trap is most dangerous for well-learned material: it feels unnecessary to review, so review stops, and slow forgetting goes undetected until a performance moment reveals the gap.
### When You Miss Sessions
Missing sessions is unavoidable. The correct response is not to double up (which produces a massed session) but to resume the schedule and accept that some additional forgetting occurred. The forgetting is not catastrophic; it means the next retrieval session will require slightly more effort, which will strengthen the memory trace.
Do not extend the overall schedule proportionally for every missed session — this creates a never-ending plan. Instead, identify the highest-priority material (lowest mastery, most important for performance), concentrate additional retrieval on that, and allow lower-priority material to be reviewed at its scheduled interval even if some sessions were missed.
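The triage described above can be sketched as a simple sort. The `Topic` fields, the 0.0-1.0 mastery scale, and the 1-3 importance scale are all hypothetical; the guide only says to favor material with the lowest mastery and the highest importance.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    mastery: float   # fraction correct on the last retrieval, 0.0-1.0 (assumed scale)
    importance: int  # 1 = low, 3 = high (assumed scale)


def catch_up_priorities(topics: list[Topic], extra_slots: int) -> list[str]:
    """Pick the topics that get additional retrieval after missed sessions.

    Lowest mastery first; ties are broken by higher importance.
    Topics not selected simply stay on their scheduled interval.
    """
    ranked = sorted(topics, key=lambda t: (t.mastery, -t.importance))
    return [t.name for t in ranked[:extra_slots]]


topics = [
    Topic("cardiac murmurs", 0.9, 3),
    Topic("drug interactions", 0.4, 1),
    Topic("ECG reading", 0.4, 3),
    Topic("lab ranges", 0.7, 2),
]
print(catch_up_priorities(topics, 2))  # ['ECG reading', 'drug interactions']
```

Ranking mastery ahead of importance is a design choice for this sketch; a weighted score would be an equally reasonable alternative.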
---
## Leitner Box Implementation
The Leitner box is a mastery-tracking system for flashcard-style material that automatically calibrates review frequency to current mastery level.
### Physical Setup
Use four labeled containers (boxes, folders, or sections of a binder):
- **Box 1:** Review every session (errors frequent)
- **Box 2:** Review every other session (mostly correct)
- **Box 3:** Review once per week (reliably correct in blocked conditions)
- **Box 4:** Review once per month (reliably correct in interleaved conditions)
All new material starts in Box 1.
### Movement Rules
- **Correct answer in interleaved conditions:** Move the item forward one box (toward less-frequent review)
- **Incorrect answer:** Move the item back to Box 1 (most frequent review)
- **Correct answer in blocked conditions (consecutive same-topic):** Do not advance — this may reflect short-term memory, not durable learning. Advance only when correct in a mixed session.
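The three movement rules above can be written as a single function. This is a minimal sketch; `move_card` and its parameter names are illustrative, and the cap at Box 4 is an assumption that follows from the four-box setup.

```python
LAST_BOX = 4  # the setup above uses four boxes


def move_card(box: int, correct: bool, interleaved: bool) -> int:
    """Return the box a card belongs in after one review.

    box         -- current box, 1 through LAST_BOX
    correct     -- whether the answer was right
    interleaved -- whether the card was reviewed in a mixed-topic session
    """
    if not correct:
        return 1                      # any miss goes back to Box 1
    if not interleaved:
        return box                    # blocked success: do not advance
    return min(box + 1, LAST_BOX)     # interleaved success: move forward one box


print(move_card(2, correct=True, interleaved=True))    # 3
print(move_card(2, correct=True, interleaved=False))   # 2
print(move_card(3, correct=False, interleaved=True))   # 1
```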
### Why Interleaved Conditions for Advancement
The test for "mastered" must match real performance conditions. If a vocabulary card is answered correctly when all the cards in the session are Spanish vocabulary, that is a weaker signal than answering it correctly when it appears among cards from three different subjects. Only advance when the retrieval is performed in mixed conditions.
### Digital Implementation
Spaced-repetition software (Anki, RemNote, etc.) implements comparable logic algorithmically, using performance data to schedule each item's next review. The key difference from a paper Leitner box: digital systems track intervals precisely and account for each item's difficulty separately, while paper boxes use approximate tier-based intervals.
Both work. The paper system is easier to understand and control; the digital system scales better for large material sets.