@clawhub-quochungto-93dad49abd
---
name: recovery-mechanism-design
description: |
  Design recovery mechanisms for a software system or component: select the right rollback control strategy from a three-mechanism decision framework (key rotation, deny list, and minimum acceptable security version number / downgrade prevention), set rate-of-change policy that decouples rollout velocity from rollout content, eliminate wall-clock time dependencies from recovery paths, design an explicit revocation mechanism with safe failure behavior (distributing cached revocation lists rather than failing open), and provision emergency access for use when normal access paths are completely unavailable. Use when designing a new system's update or rollback architecture, reviewing an existing release pipeline for security-reliability tradeoffs, defining rollback policy for self-updating firmware or system software, designing a revocation mechanism for credentials or certificates, or planning emergency access infrastructure before an incident occurs. Output: a recovery mechanism design document with rollback control strategy per component, rate-of-change policy, revocation mechanism design, and emergency access plan.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/building-secure-and-reliable-systems/skills/recovery-mechanism-design
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
  - id: building-secure-and-reliable-systems
    title: "Building Secure and Reliable Systems"
    authors: ["Heather Adkins", "Betsy Beyer", "Paul Blankinship", "Piotr Lewandowski", "Ana Oprea", "Adam Stubblefield"]
    chapters: [9]
tags:
  - security
  - reliability
  - recovery
  - rollback
  - revocation
  - update-mechanism
depends-on:
  - resilience-and-blast-radius-design
execution:
  tier: 2
  mode: full
  inputs:
    - type: context
      description: "System or component description: what software or firmware updates itself, whether updates are cryptographically signed, what credentials or certificates it uses, and the existing rollout or release pipeline if one exists."
  tools-required: [Read, Write]
  tools-optional: []
  mcps-required: []
  environment: "Works from a system description, design document, or release pipeline specification. Output: recovery mechanism design document with rollback control strategy, rate-of-change policy, revocation mechanism design, and emergency access plan."
discovery:
  goal: "Produce a recovery mechanism design document that addresses rollback control, rate-of-change decoupling, time-dependency elimination, credential revocation, and emergency access: the five architectural decisions that determine whether a system can be safely recovered after a failure or compromise."
  tasks:
    - "Select rollback control mechanism (key rotation, deny list, minimum acceptable security version number, or combination) for each self-updating component"
    - "Design rate-of-change policy that decouples rollout velocity from rollout content using a separate rate-limiting service"
    - "Identify and eliminate wall-clock time dependencies from recovery paths; replace with rates, epoch/version advancement, or validity lists"
    - "Design explicit revocation mechanism with locally cached distribution and safe failure behavior"
    - "Design emergency access infrastructure for access controls, network, and communications that survives normal-path failures"
  audience: "security engineers, platform engineers, SREs, and architects designing or reviewing update mechanisms, credential management, or incident response infrastructure"
  when_to_use: "When designing a new system's rollback architecture, reviewing an existing release pipeline for security-reliability tradeoffs, specifying rollback policy for self-updating firmware or system software, designing a revocation mechanism for credentials or certificates, or planning emergency access infrastructure before an incident occurs"
  environment: "System description or design document. Knowledge of whether updates are cryptographically signed, what credentials/certificates the system uses, and the existing rollout pipeline is needed."
  quality: placeholder
---
# Recovery Mechanism Design
## When to Use
Apply this skill when:
- Designing a new component's update or rollback architecture from scratch
- Reviewing an existing release pipeline for the security-reliability tradeoff: can the system roll back safely, and does rolling back reintroduce vulnerabilities?
- Defining rollback policy for self-updating firmware or system software (package managers, BIOS, bootloaders, embedded firmware)
- Designing a credential or certificate revocation mechanism
- Provisioning emergency access infrastructure before an incident — not during one
**The core pattern:** Recovery is not a single operation. It is five cooperating design decisions made before any incident occurs: (1) choosing how rollbacks are controlled to prevent reintroducing vulnerabilities, (2) decoupling rollout velocity from rollout policy so speed can be adjusted without code changes, (3) eliminating wall-clock time dependencies that break recovery under clock skew, (4) designing an explicit revocation mechanism that distributes state rather than requiring real-time coordination, and (5) provisioning emergency access infrastructure that survives normal-path failures.
This skill depends on `resilience-and-blast-radius-design`. The compartmentalization and failure domain concepts from that skill apply here: revocation should be compartmentalized (each node protects itself during KRL updates, which bounds a bad push to half of the fleet), and emergency access must be designed as a low-dependency component that avoids the failure domains of the primary access path.
Before starting, confirm you have:
- A description of the component(s) to be designed for recovery
- Whether updates are cryptographically signed (required for the rollback control mechanisms)
- The existing release pipeline, if one exists
- Whether the component is self-updating (updates itself by overwriting its own executable or firmware image)
---
## Context and Input Gathering
### Required Context
- **Component description:** What software or firmware component is being designed? Is it application software, system software (package manager, OS), or firmware (BIOS, NIC, embedded controller)?
- **Update signature model:** Are updates cryptographically signed? Does the signature cover the component image and its version metadata?
- **Self-updating flag:** Does the component update itself (overwriting its own executable), or is it updated by an external package manager?
- **Credential/certificate usage:** What credentials or certificates does the system use? Are they issued by an internal certificate authority? What is their current validity model (expiration time, or revocation list)?
### Observable Context
If a system description is provided, scan for:
- Hard-coded rollback policies ("always allow" or "never allow") — both are known failure modes
- Certificate or credential validity tied to wall-clock expiration without a revocation fallback
- A single global rollout rate limit embedded in the rollout actuator — policy and mechanism are coupled
- Emergency access paths that depend on the same SSO or credential service as normal access — no isolation
- Revocation mechanisms that fail open when the revocation service is unavailable
### Default Assumptions
- If update signatures are not mentioned: treat the system as unsigned and note that the rollback control mechanisms in Step 2 require signing as a prerequisite
- If rollback policy is unspecified: treat the system as "allow arbitrary rollbacks" and apply the full mechanism selection process
- If no revocation mechanism exists: start with locally cached revocation list distribution rather than a centralized validity database
- If no emergency access exists: apply the emergency access design in Step 6, beginning with access controls (highest priority)
---
## Process
### Step 1 — Classify the Component and Its Recovery Context
Before selecting mechanisms, establish the recovery context that governs all downstream decisions.
#### 1a — Component Type
| Type | Characteristics | Recovery complexity |
|---|---|---|
| Application software | Updated by an external package manager; killed and restarted by external tooling | Lower — rollback is managed outside the component |
| System software | Self-updating (package management daemon, OS image); updates itself by overwriting its executable | Higher — must be able to update itself without losing the ability to continue updating |
| Firmware | Self-updating with hardware constraints; may use one-time-programmable memory for version tracking; spare parts introduce old-version risk | Highest — rollback may be physically infeasible; hardware key storage is limited |
**Why:** Self-updating components present the hardest recovery design problem. They can actively prevent themselves from being updated if maliciously modified. The rollback control mechanism must work even for these components, but the "intended behavior" (what rollback means) is itself ambiguous.
#### 1b — Known State Definition
Recovery means returning to a known good state. Define that state before choosing mechanisms:
- What constitutes the intended state of this component? (Version number, cryptographic hash of the image, configuration values, firmware parameters)
- Is the intended state captured and continuously monitored against the deployed state?
- Does state tracking cover in-memory state (daemon configuration loaded at startup) as well as on-disk state?
**Why:** The more thoroughly the intended state is encoded and compared against the actual state, the easier it is to detect deviations, trigger automated repair, and confirm recovery is complete. Systems that do not know their intended state cannot verify that recovery succeeded.
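
The comparison of intended state against deployed state can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `IntendedState` record, its fields, and `find_deviations` are hypothetical names, and a real system would read the deployed values from the host rather than take them as arguments.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IntendedState:
    """Hypothetical record of what a component is supposed to look like."""
    version: str
    image_sha256: str
    config: dict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def find_deviations(intended: IntendedState, version: str,
                    image: bytes, config: dict) -> list[str]:
    """Return a readable list of differences between deployed and intended state."""
    deviations = []
    if version != intended.version:
        deviations.append(f"version: expected {intended.version}, found {version}")
    if sha256_hex(image) != intended.image_sha256:
        deviations.append("image hash mismatch")
    # Only keys named in the intended state are checked; the config passed in
    # should be the loaded (in-memory) values, not just what is on disk.
    for key, expected in intended.config.items():
        actual = config.get(key)
        if actual != expected:
            deviations.append(f"config[{key}]: expected {expected!r}, found {actual!r}")
    return deviations
```

An empty list is the signal that recovery is complete; any entry is a deviation to repair or investigate.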
**Output for this step:** Component type classification and intended state definition.
---
### Step 2 — Select Rollback Control Mechanism
For self-updating components (and application software where security patch coverage is required), a bare "allow arbitrary rollbacks" policy reintroduces known vulnerabilities. A bare "never allow rollbacks" policy removes the path to a stable state when a bad update is deployed. The three mechanisms below provide controlled middle ground.
**Prerequisite for all three mechanisms:** Updates are cryptographically signed and the signature covers the component image and its version metadata.
#### The Three-Mechanism Comparison
| Mechanism | How it works | Best for | Key limitation |
|---|---|---|---|
| **Key rotation** | Rotating the update signing key invalidates older releases that were signed with the old key | Long-term hygiene; recovering from signing key compromise | Sudden rotation disrupts reliability; gradual rotation (dual-key overlap) adds complexity; hardware devices with OTP key storage have very limited rotation budget |
| **Deny list** | A list of version identifiers (hashes or labels) that components refuse to install | Quick incident response; blocking specific bad versions | Vulnerable to "unzipping" (incremental rollback through versions not yet on the list); list grows without bound; garbage collection is operationally complex |
| **Minimum acceptable security version number (downgrade prevention)** | Each release carries a security version number (SVN); a separately maintained floor value (stored in component state, not inside the release) prevents installation of releases with an SVN below the floor | Permanent exclusion of vulnerable version ranges; garbage-collecting deny list entries | Floor value is ratcheted forward by new releases (not by humans directly); requires ordered version values (not cryptographic hashes) |
#### Mechanism Selection Decision Framework
**Start with key rotation** — all healthy organizations should implement key rotation regardless of the other choices. It is foundational cryptographic hygiene and validates the rotation process before it is needed in an emergency.
**Add deny lists** for incident response velocity. During an active incident, quickly appending a version identifier to a deny list is faster than raising the minimum acceptable security version number floor. Use deny lists as a rapid-response tool; use downgrade prevention as the cleanup mechanism afterward.
**Add downgrade prevention (minimum acceptable security version number)** when:
- The deny list has grown large enough to be operationally burdensome
- You want to permanently exclude all versions below a security milestone, rather than maintaining individual entries
- You are designing the system from scratch and want the cleanest long-term architecture
**For self-updating components, combine all three** where feasible, but introduce one at a time. Each mechanism adds operational complexity. Introducing all three simultaneously makes it difficult to identify bugs or corner cases in any individual mechanism.
#### Implementation Notes
**Deny list storage:** Encode the deny list outside the self-updating component (`ComponentState[DenyList]`, not `Release[DenyList]`). A list stored inside the component is lost during updates and does not survive component replacement. The external list is the union of entries from all releases installed on that component.
**Minimum acceptable security version number floor storage:** Store the floor value in external component state (`ComponentState[MASVN]`), not inside any release. The floor value is ratcheted forward automatically: each time a new release initializes, it compares its own MASVN against the stored floor and updates the floor to the higher value. This means the floor only rises — it cannot be lowered by installing an older release.
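
The two storage rules above can be sketched together. This is an illustrative model under stated assumptions: `Release`, `ComponentState`, and their methods are hypothetical names, signature verification is assumed to have already succeeded, and persistence of the component state is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Release:
    label: str
    svn: int                       # security version number of this release
    masvn: int                     # floor this release asks the component to enforce
    deny_list: frozenset = frozenset()


@dataclass
class ComponentState:
    """State kept outside any release, so it survives updates and rollbacks."""
    masvn: int = 0
    deny_list: set = field(default_factory=set)

    def may_install(self, release: Release) -> bool:
        if release.label in self.deny_list:
            return False
        return release.svn >= self.masvn

    def install(self, release: Release) -> bool:
        if not self.may_install(release):
            return False
        # Ratchet: the floor only rises, and deny entries only accumulate
        # (the stored list is the union over every release ever installed).
        self.masvn = max(self.masvn, release.masvn)
        self.deny_list |= release.deny_list
        return True


state = ComponentState()
previous = Release("i-1", svn=4, masvn=4)
patch = Release("i", svn=5, masvn=4)
follow_up = Release("i+1", svn=5, masvn=5)

state.install(previous)
state.install(patch)
rollback_before = state.install(previous)   # still allowed: floor is 4
state.install(follow_up)                    # floor advances to 5
rollback_after = state.install(previous)    # refused: SVN 4 is below the floor
```

The usage lines follow the three-release sequence from Example 2: rollback to the vulnerable release stays possible until the follow-up release raises the stored floor.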
**Key rotation for hardware:** Hardware devices with one-time-programmable (OTP) memory for storing public keys have a very limited key storage budget. Plan the number of key rotations needed across the device's lifetime before finalizing the hardware design. For spare parts that may have old firmware, support multiple signatures per release (old key and new key simultaneously) during the rotation transition window.
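
The dual-signature transition window can be sketched as below. One loud assumption: HMAC is used here only as a stdlib stand-in for a real asymmetric signature scheme, so the "keys" are shared secrets rather than public keys. The key identifiers and function names are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, image: bytes) -> bytes:
    """Stand-in for a real signature; a production design would use asymmetric keys."""
    return hmac.new(key, image, hashlib.sha256).digest()


def accepts(trusted_keys: dict[str, bytes], image: bytes,
            signatures: dict[str, bytes]) -> bool:
    """Accept the release if any attached signature verifies under a trusted key."""
    for key_id, signature in signatures.items():
        key = trusted_keys.get(key_id)
        if key is not None and hmac.compare_digest(sign(key, image), signature):
            return True
    return False


OLD_KEY, NEW_KEY = b"old-signing-key", b"new-signing-key"
image = b"firmware image bytes"

# During the transition window, each release carries both signatures.
transition_release = {"k1": sign(OLD_KEY, image), "k2": sign(NEW_KEY, image)}
# After the window closes, releases carry only the new signature.
later_release = {"k2": sign(NEW_KEY, image)}

spare_part = {"k1": OLD_KEY}     # shipped before rotation, trusts only the old key
current_device = {"k2": NEW_KEY}
```

A spare part that trusts only the old key accepts the transition release and can use it to move onto the new key, while a release signed only with the new key is rejected by that spare part.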
**Output for this step:** Selected mechanism(s) per component, with rationale. Implementation notes for deny list storage location and minimum acceptable security version number floor management.
---
### Step 3 — Design the Rate-of-Change Policy
Recovery speed and recovery risk are in direct tension: deploying changes faster reduces the window of exposure, but also reduces the time available for testing and increases the risk of deploying a broken patch. The correct architectural response is to decouple the *ability* to change quickly from the *policy* governing how quickly changes happen.
#### 3a — Build Update Velocity Independent of Update Policy
Design the update system to operate at the maximum conceivable speed. Then add a separate, independent rate-limiting service that constrains the actual rate of change according to current policy.
**Why:** If the rollout system's speed is determined by its internal policy parameters, then changing the policy requires modifying the rollout system — which is a code change subject to its own review, build, and release cycle. This makes emergency acceleration dependent on the system that is already under stress. Separating rate from action means that responding to an emergency requires only changing a rate limit, not rewriting infrastructure.
**Design the rate-limiting service as:**
- An independent, standalone microservice with minimal dependencies
- The single point of authority for approving changes at a given rate
- A collector of change logs for auditing
- Simple enough to test rigorously in isolation
**Rate-limit token design:** The rate-limiting service issues short-lived cryptographic tokens asserting that it has reviewed and approved a change at a certain time. The rollout actuator validates these tokens before applying changes. Because the actuator cannot apply a change without a valid token, change actuation stays decoupled from change-rate governance at the architectural level.
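
A minimal sketch of the token exchange, under several assumptions: the token format (JSON claims plus an HMAC), the shared key between service and actuator, and all names here are hypothetical, and the current time is passed in explicitly so the logic can be tested without a clock.

```python
import hashlib
import hmac
import json
from typing import Optional


class RateLimiter:
    """Hypothetical approval service: at most max_per_window approvals per window."""

    def __init__(self, key: bytes, max_per_window: int, window_s: float, ttl_s: float):
        self._key = key
        self.max_per_window = max_per_window    # policy knob, adjustable at runtime
        self._window_s = window_s
        self._ttl_s = ttl_s
        self._approved_at: list[float] = []
        self.audit_log: list[dict] = []

    def approve(self, change_id: str, now: float) -> Optional[bytes]:
        # Forget approvals that have aged out of the window.
        self._approved_at = [t for t in self._approved_at if now - t < self._window_s]
        if len(self._approved_at) >= self.max_per_window:
            return None
        self._approved_at.append(now)
        claims = {"change": change_id, "expires": now + self._ttl_s}
        self.audit_log.append(claims)
        body = json.dumps(claims, sort_keys=True).encode()
        mac = hmac.new(self._key, body, hashlib.sha256).hexdigest().encode()
        return body + b"." + mac


def actuator_accepts(key: bytes, token: bytes, change_id: str, now: float) -> bool:
    """The actuator applies a change only if the token is authentic and unexpired."""
    body, _, mac = token.rpartition(b".")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(mac, expected):
        return False
    claims = json.loads(body)
    return claims["change"] == change_id and now < claims["expires"]
```

Emergency acceleration in this sketch is a single assignment to `max_per_window` on the running service; the actuator code does not change.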
#### 3b — Backstop Rate Limit for Epoch/Version Advancement
If the system uses epoch or version advancement (see Step 4 on time dependencies), hardcode a backstop rate limit for how frequently the epoch or version can advance — for example, no more than once per second for a 64-bit integer counter. This is an exception to the "make policy external" principle, but is justified: it is difficult to imagine a legitimate reason to advance system state more than once per second, and the backstop prevents an adversary with temporary system control from forcing an epoch rollover that renders the version floor mechanism useless.
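
To make the backstop concrete, the sketch below works out how long a counter lasts at the capped rate and shows one way to enforce the cap. The class and function names are hypothetical, and the caller is assumed to supply a monotonic time source (elapsed seconds, not calendar time).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_rollover(bits: int, max_advances_per_second: float) -> float:
    """Years needed to exhaust a counter of the given width at the capped rate."""
    return (2 ** bits) / max_advances_per_second / SECONDS_PER_YEAR


class EpochCounter:
    # Hard-coded backstop: deliberately a constant, not external policy.
    MIN_SECONDS_BETWEEN_ADVANCES = 1.0

    def __init__(self):
        self.epoch = 0
        self._last_advance = None

    def advance(self, monotonic_now: float) -> bool:
        """Advance by one unless the previous advance was too recent."""
        if (self._last_advance is not None
                and monotonic_now - self._last_advance < self.MIN_SECONDS_BETWEEN_ADVANCES):
            return False
        self.epoch += 1
        self._last_advance = monotonic_now
        return True
```

At one advance per second, a 64-bit counter lasts on the order of 5.8 × 10^11 years, whereas a 32-bit counter would last only about 136 years, which is why the width and the backstop are chosen together.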
**Output for this step:** Rate-limiting service design, token mechanism, and backstop rate limit value for epoch/version advancement if applicable.
---
### Step 4 — Eliminate Wall-Clock Time Dependencies
Wall-clock time is a form of external state that the system cannot control. Recovery operations — replaying transactions, validating certificates, correlating logs across systems — are all vulnerable to clock skew, misconfigured NTP, certificate expiration edge cases, and leap-second bugs.
**The fundamental problem:** When recovering from a crash or rolling back to a previous state, any system component that checks wall-clock time may find that time has moved in unexpected ways. A recovery involving digitally signed transactions may fail if those transactions were signed by certificates that have since expired. Rolling back a database may require transaction replay — which fails if the database expects monotonically increasing timestamps.
#### 4a — Replace Wall-Clock Time with Rate-Based or Epoch-Based Alternatives
For each place in the system where wall-clock time is used, evaluate replacement with one of:
| Alternative | How it works | When to use |
|---|---|---|
| **Rates** | Express time-dependent limits as rates (events per second) rather than absolute timestamps | For throttling, rate limiting, and frequency policies |
| **Epoch or version advancement** | A monotonically increasing integer counter that advances by explicit policy action, not by clock progression | For validity windows, certificate freshness, ordering of events across distributed components |
| **Validity lists** | Explicit lists of valid-versus-revoked items, pushed periodically by a controlled process | For certificate validity, credential validity, and key status (see Step 5 for revocation design) |
**Why epoch advancement beats wall-clock time for validity:** With epoch advancement, certificates still age out, but aging is driven by a counter you control rather than by clocks. You can halt epoch advancement in an emergency (freeze the epoch counter while investigating an issue), so there is no pressure to skip validity checking while clocks are untrustworthy. And because the validity system depends on your controlled push mechanism rather than on coordinated clocks across the fleet, recovery is not complicated by time.
**Caution on epoch rollover:** Aggressively incremented epoch values can roll over or overflow. Choose a sufficiently large integer range (64-bit) and apply a backstop rate limit (see Step 3b) to prevent an adversary from forcing rollover.
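
Epoch-based validity can be sketched as below. The `Credential` fields, the `EpochAuthority` class, and the half-open validity window are assumptions made for illustration; the source does not prescribe a particular encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str
    issued_epoch: int
    lifetime_epochs: int


class EpochAuthority:
    """Hypothetical owner of the epoch counter; advancement is an explicit action."""

    def __init__(self):
        self.epoch = 0
        self.frozen = False

    def advance(self) -> int:
        # While frozen (for example during an investigation), nothing ages.
        if not self.frozen:
            self.epoch += 1
        return self.epoch


def is_fresh(cred: Credential, current_epoch: int) -> bool:
    """Valid from the issuing epoch for lifetime_epochs epochs, with no clock involved."""
    return cred.issued_epoch <= current_epoch < cred.issued_epoch + cred.lifetime_epochs
```

Freezing the authority during an incident keeps every currently valid credential valid for as long as the investigation takes, without touching any clock.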
#### 4b — Audit for Wall-Clock Time Code Smells
Flag for removal or replacement:
- Fixed dates or time offsets in code (a "time bomb" — the code changes behavior at a specific real-world date without any human action)
- Certificate expiration times set far in the future ("not our problem anymore" engineering)
- Unauthenticated NTP dependencies (an attacker who controls NTP can manipulate time-based validity)
- Database recovery processes that require monotonically increasing timestamps (vulnerable to rollback-induced time inversion)
**Exception:** Wall-clock time is appropriate for deliberately time-bounded access — for example, requiring employees to re-authenticate daily, or enforcing a time-limited access grant. In these cases, design a repair path for the time-validation system that does not itself depend on wall-clock time.
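
A first pass over a codebase for the smells listed above can be automated with a simple text scan. The sketch below is a heuristic only: the two patterns are assumptions chosen for Python source, they will produce false positives and miss indirect uses, and every finding still needs human review against the exception above.

```python
import re

# Hypothetical, deliberately small pattern set; extend per language and codebase.
SMELLS = {
    "wall-clock read": re.compile(r"\b(time\.time|datetime\.(?:now|utcnow|today))\s*\("),
    "hard-coded date": re.compile(r"\b(19|20)\d{2}-\d{2}-\d{2}\b"),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line number, smell name) pairs for lines matching any pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SMELLS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


sample = (
    "if datetime.now() > cutoff:\n"
    "    disable_feature()\n"
    'CERT_EXPIRES = "2031-01-01"\n'
    "epoch = epoch + 1\n"
)
findings = scan(sample)
```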
**Output for this step:** List of wall-clock time dependencies in the system, replacements selected for each, and any legitimate exceptions with a documented repair path.
---
### Step 5 — Design the Revocation Mechanism
A revocation mechanism stops access by entities whose credentials have been compromised. It is most valuable during an active compromise — and that is precisely when it is most likely to be exercised under stress and with incomplete information. Design it before the incident.
#### 5a — Choose the Distribution Model
| Model | How it works | Risk |
|---|---|---|
| **Centralized validity database** | Every system checks credential validity against a central database before allowing access | If the database is down, all dependent systems fail; there is strong temptation to fail open, which creates an attack vector (an attacker mounts a denial-of-service attack on the validity database so that revoked credentials are accepted again) |
| **Locally cached revocation list** | A revocation list is pushed periodically to all nodes; nodes use their local cache to make validity decisions | Nodes proceed on best-available information; revocation is eventually consistent rather than instantaneous; no single point of failure for access |
**Recommendation:** Prefer locally cached revocation list distribution over a centralized validity database. The centralized database creates a single point of failure and introduces the temptation to fail open. Distributing revocation data to nodes gives each node independence while maintaining eventual revocation of compromised credentials.
#### 5b — Define Safe Failure Behavior
If the revocation service is unavailable or a node cannot reach it:
- **Do not fail open.** Failing open when the revocation service is unavailable lets an attacker who conducts a denial-of-service attack on the revocation service reuse revoked credentials during the outage.
- **Use the cached list.** Nodes should proceed based on their most recent cached revocation list, with monitoring to detect nodes that have not received a recent update.
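
The two rules above can be sketched as a node-local cache. The class name, the version-numbered pushes, and the staleness threshold are assumptions for illustration; staleness is measured in missed list versions rather than elapsed time, in keeping with Step 4.

```python
class RevocationCache:
    """Hypothetical node-local copy of the revocation list."""

    def __init__(self, max_missed_pushes: int = 3):
        self.revoked: set[str] = set()
        self.list_version = 0
        self.max_missed_pushes = max_missed_pushes

    def apply_push(self, version: int, revoked_ids: set[str]) -> None:
        # Ignore replays of older lists; only newer versions replace the cache.
        if version > self.list_version:
            self.revoked = set(revoked_ids)
            self.list_version = version

    def is_allowed(self, credential_id: str) -> bool:
        # Decided purely from the cache: an unreachable revocation service
        # neither blocks access nor re-admits a credential already revoked.
        return credential_id not in self.revoked

    def is_stale(self, latest_published_version: int) -> bool:
        """Monitoring hook: has this node missed too many pushes?"""
        return latest_published_version - self.list_version > self.max_missed_pushes
```

Note that `is_allowed` makes no network call, so a denial-of-service attack on the distribution service delays new revocations but cannot undo ones the node already holds.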
#### 5c — Revocation at Scale: Self-Protection During KRL Updates
When updating a Key Revocation List (KRL) file, a naive implementation — blindly replacing the old file with the new one — allows a single push to revoke every valid credential in the infrastructure. An attacker with partial system control can use this against you.
**Safeguard:** Require each node to evaluate a new KRL before applying it, and to refuse any KRL that would revoke its own credentials. A KRL that revokes every node's credentials is therefore ignored by every node. An attacker's best strategy becomes revoking half of the machines, so even a malicious KRL push can affect at most half of the fleet, preserving a recovery base.
**Why this matters for recovery:** This self-protection bounds the worst-case blast radius of a revocation push. You can use the half of the infrastructure that still works to remediate the affected half, which is far easier than recovering everything from scratch.
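
The self-protection check can be sketched as follows. This is a simplified model: each node is assumed to hold exactly one credential identified by its own name, KRLs are modeled as sets of revoked identifiers, and `apply_krl` and `push_to_fleet` are hypothetical names.

```python
def apply_krl(current_krl: frozenset, new_krl: frozenset,
              own_key_ids: frozenset) -> frozenset:
    """Return the KRL a node should use after evaluating a pushed update."""
    if new_krl & own_key_ids:
        # The update would revoke this node's own credentials: refuse it.
        return current_krl
    return new_krl


def push_to_fleet(fleet: dict, new_krl: frozenset) -> dict:
    """Push one KRL to every node; fleet maps node name to its current KRL."""
    return {node: apply_krl(krl, new_krl, frozenset({node}))
            for node, krl in fleet.items()}


fleet = {f"host{i}": frozenset() for i in range(10)}

# A push that names every host is ignored by every host.
after_total = push_to_fleet(fleet, frozenset(fleet))

# A push that names three hosts is applied only by the other seven.
after_partial = push_to_fleet(fleet, frozenset({"host0", "host1", "host2"}))
applied = [node for node, krl in after_partial.items() if krl]
```

One way to read the half-fleet bound in this model: the hosts that refused a push and the hosts that applied it each remain consistent among themselves, and the larger of the two groups always contains at least half of the fleet.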
#### 5d — Avoid Special-Purpose Emergency Revocation Lists
The temptation during incident response is to build a separate, faster emergency revocation list to supplement the normal one. Avoid this: rarely-used mechanisms are less likely to work when most needed, and a separate emergency mechanism adds complexity without proportional benefit.
**Instead:** Shard the normal revocation list. Revoking a credential during an emergency requires updating only the relevant shard, not the entire list. Because the system always uses a multi-part revocation list (even under normal conditions), the mechanism is exercised regularly and is reliable under emergency conditions.
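
A sharded list can be sketched as below. The shard count, the hash-based shard assignment, and the class name are assumptions; the point illustrated is only that a single revocation touches, and requires pushing, one shard.

```python
import hashlib

NUM_SHARDS = 16   # hypothetical; chosen per fleet size and push cost


def shard_of(credential_id: str) -> int:
    """Stable shard assignment derived from the credential identifier."""
    digest = hashlib.sha256(credential_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


class ShardedRevocationList:
    def __init__(self):
        self.shards = [set() for _ in range(NUM_SHARDS)]
        self.shard_versions = [0] * NUM_SHARDS

    def revoke(self, credential_id: str) -> int:
        """Revoke a credential and return the index of the only shard to push."""
        idx = shard_of(credential_id)
        self.shards[idx].add(credential_id)
        self.shard_versions[idx] += 1
        return idx

    def is_revoked(self, credential_id: str) -> bool:
        return credential_id in self.shards[shard_of(credential_id)]
```

Routine and emergency revocations go through the same `revoke` call, so the emergency path is exercised every time the normal path is.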
#### 5e — Remove Wall-Clock Time from Certificate Validation
Explicit revocation removes the dependency on accurate time for certificate validation. A certificate that is "expired" by wall-clock time but present on no revocation list is effectively valid under a pure revocation model. This eliminates the failure mode where clock skew accidentally invalidates legitimate certificates during a recovery, or where an attacker manipulates clocks to revalidate expired certificates.
**Output for this step:** Revocation distribution model, safe failure behavior definition, self-protection safeguard for KRL updates, sharding strategy, and whether wall-clock time is eliminated from certificate validity.
---
### Step 6 — Design Emergency Access Infrastructure
When normal access methods are completely unavailable — the SSO service is down, the VPN is broken, the primary credential service is unreachable — responders must still be able to act. Emergency access infrastructure is what makes recovery possible when the access path itself has failed.
#### 6a — Access Controls
Emergency access cannot depend on the same dynamic credential services as normal access. Design an alternative access path that:
- Uses offline-provisioned credentials (not derived from SSO or federated identity providers that may be unavailable during the outage)
- Achieves equivalent access policy security, even if with reduced convenience or features
- Is restricted to the minimum set of people who need it immediately (limit attack surface, operational cost, and usability degradation)
- Has pre-provisioned credentials with explicitly managed lifetime — credentials issued proactively on a fixed schedule (not activated on demand at incident start) avoid the race condition where a credential expires just as an outage begins
**Credential lifetime tradeoff:** Short-lived emergency credentials enforce good security hygiene, but if the outage outlasts the credential lifetime, responders are locked out. Set credential lifetime to exceed the anticipated maximum outage duration, even though this extends the window during which a compromised credential is valid.
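
The tradeoff can be turned into a rough sizing rule. The sketch below rests on an assumption not stated in the source: credentials are reissued every `reissue_interval_days`, and an outage may both begin just before a scheduled reissue and prevent that reissue. Under that assumption the newest credential has only the lifetime minus the reissue interval left when the outage starts.

```python
def worst_case_remaining_days(lifetime_days: float,
                              reissue_interval_days: float) -> float:
    """Validity left on the newest credential if an outage starts just before reissue."""
    return lifetime_days - reissue_interval_days


def minimum_lifetime_days(max_outage_days: float,
                          reissue_interval_days: float) -> float:
    """Smallest lifetime that still covers the longest anticipated outage."""
    return max_outage_days + reissue_interval_days


# Illustrative numbers only: 30-day reissue schedule, 14-day worst-case outage.
needed = minimum_lifetime_days(max_outage_days=14, reissue_interval_days=30)
slack = worst_case_remaining_days(needed, reissue_interval_days=30)
```

With these illustrative inputs the lifetime must be at least 44 days, which leaves exactly the 14 days of worst-case outage covered.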
#### 6b — Network
Emergency network access must survive failures in the layers outside your control:
- Prefer static network access controls over dynamic protocols (software-defined networking and dynamic routing protocols have dependencies that may be unavailable during an outage)
- Implement sufficient monitoring to detect where inside the network access breaks — not just that it is broken, but at which layer
- For self-contained emergency access, deploy geographically distributed access points so that responders in different regions can independently reach their nearest rack and radiate recovery outward, rather than requiring global coordination
**Why geographic distribution matters:** During a global outage, global coordination may be impossible. If each geographic region can begin recovery independently on the infrastructure it can reach, recovery propagates outward from working points rather than waiting for a globally coordinated restart.
#### 6c — Communications
Emergency communication channels must be evaluated for availability when the primary chat or collaboration service is itself unavailable or compromised:
- Select a communication technology with as few dependencies as possible — the tool must be reachable by responders even if systems outside your control are broken
- If the communication system is outsourced, confirm that it is reachable even when your infrastructure layers are broken
- Consider: is the system reachable if an attacker is eavesdropping on your network? Does it provide sufficient authentication and confidentiality for incident response use?
#### 6d — Responder Habits and Continuous Validation
Emergency access technology provides no benefit if responders do not know how to use it under stress. The cognitive load of switching to unfamiliar tools during a high-stress incident can render emergency access effectively unusable.
**Minimize the distinction between normal and emergency processes:** When emergency access uses the same underlying tools as normal access (for example, an emergency mode added to the browser extension used for normal remote access), responders can draw on habit rather than recall during an incident.
**Require regular exercises:**
- Define a minimum period between required emergency access exercises for each responder
- Integrate emergency access procedures into normal on-call duties where possible
- Track credential-refresh and training completion; alert when a responder's emergency credentials are approaching expiration
- Make practicing emergency access mandatory if regular equivalent activity is not occurring
**Why validation is non-optional:** Humans, not technology, are most likely to render emergency access ineffective. A responder who has not used the emergency access path in six months and has let their credentials expire provides no incident response capacity, even if the infrastructure is sound.
**Output for this step:** Emergency access design covering access controls (credential model, provisioning, lifetime), network (static vs. dynamic, geographic distribution), communications (tool selection, dependency audit), and responder validation plan.
---
### Step 7 — Produce the Recovery Mechanism Design Document
Synthesize findings from Steps 1–6 into a structured document.
**Document sections:**
1. **Component inventory and types** — which components are application software, system software, or firmware; whether updates are signed; intended state definition per component
2. **Rollback control strategy** — selected mechanism(s) per component (key rotation / deny list / downgrade prevention), rationale, and implementation notes (storage location for deny list and version floor)
3. **Rate-of-change policy** — rate-limiting service design, token mechanism, backstop rate limit, and emergency acceleration procedure (change rate limit only, not rollout infrastructure)
4. **Time dependency audit** — wall-clock time dependencies identified, replacements selected (rates / epoch advancement / validity lists), legitimate exceptions with repair path
5. **Revocation mechanism** — distribution model, safe failure behavior, KRL self-protection safeguard, sharding strategy, wall-clock time elimination decision
6. **Emergency access plan** — access control credential model and lifetime, network access design, communications tool selection, responder exercise schedule and validation plan
---
## Key Principles
### The Rollback Policy Triangle
Three rollback policies exist, and only one is correct:
- **Allow arbitrary rollbacks:** Insecure. Any rollback may reintroduce a known vulnerability, and exploits for older versions tend to be more stable and already weaponized.
- **Never allow rollbacks:** Unreliable. A bad update cannot be undone. The build system must always generate a valid forward target.
- **Controlled rollback with mechanism:** Deploy one or more of the three mechanisms to hold the floor while preserving the ability to recover from bad updates.
No organization should accept either extreme for production systems.
### Emergency Systems Must Be Normal Systems
Emergency push systems that are separate from the normal rollout pipeline will not work when needed, because they are never exercised. Design the emergency push path to be the normal push path operating at maximum rate with adjusted rate-limit policy. The same principle applies to emergency access — use the same platform with an emergency mode rather than a separate tool.
### Rate Separates Speed from Content
The rate-of-change policy is not part of the update mechanism itself. It is a separate service. This separation is what allows a security team to respond to an emergency by adjusting a rate limit rather than rewriting and redeploying rollout infrastructure during the incident.
### Revocation Must Not Fail Open
A revocation mechanism that fails open under load is an attack vector, not a safety mechanism. A distributed denial-of-service attack on the revocation service reactivates all revoked credentials for the duration of the attack. Use locally cached revocation state so that node decisions are independent of real-time service availability.
### Know the Intended State
Recovery from any error category — random, accidental, malicious, or software — requires returning to a known good state. The prerequisite for recovery is knowing what that state is. Encode the intended state per service, per host, per device. Monitor deviations continuously. Recovery is then the automated or manual process of repairing deviations — not a heroic reconstruction from incomplete memory.
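As a rough illustration, assuming intended and observed state are both available as simple host-to-version mappings (a simplification; real state is richer), deviation detection reduces to a comparison. The `find_deviations` name is hypothetical.

```python
def find_deviations(
    intended: dict[str, str], observed: dict[str, str]
) -> dict[str, tuple]:
    """Return {host: (intended_version, observed_version)} for every mismatch."""
    return {
        host: (want, observed.get(host))
        for host, want in intended.items()
        if observed.get(host) != want
    }


intended = {"web-1": "v5", "web-2": "v5", "db-1": "v3"}
observed = {"web-1": "v5", "web-2": "v4"}  # db-1 is not reporting at all
deviations = find_deviations(intended, observed)
```

A host that is missing from `observed` shows up with `None` as its observed version, so silent hosts are reported as deviations too.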
---
## Examples
### Example 1 — Rate-Limiting Service Decoupling (Google Linux Distribution Rollout)
Google initially built rollout tooling around a monthly cadence for its internal Linux base image, bundling timing and content together. As security patches multiplied (each package needing its own patch cadence), the monthly assumption broke the tooling's design.
The solution separated three concerns: the rollout rate and timing, the configuration store defining each machine's target state, and the rollout actuator applying updates per machine. Each concern was developed and updated independently. Emergency releases became a matter of adjusting rate limits on the existing rollout service, not rewriting infrastructure. The result: simpler, more useful, and safer — and the same tooling handles both normal cadence and emergency releases.
### Example 2 — Minimum Acceptable Security Version Number in a Three-Release Sequence
- Release i−1 runs with SVN 4, MASVN 4. A vulnerability is discovered.
- Release i ships SVN 5 (security patch), but MASVN remains 4. The patch is deployed and proven stable.
- Release i+1 ships SVN 5, MASVN 5. Once deployed, `ComponentState[MASVN]` advances to 5. Release i−1 (SVN 4) can no longer be installed on any component that has received release i+1.
**Effect:** The security patch is now mandatory. Rollback to the vulnerable version is permanently blocked — on a per-component basis, without global coordination.
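The sequence above can be traced with a small sketch. The `Release` type and the `can_install` and `install` helpers are hypothetical names, and the sketch assumes the stored floor is updated at install time, which may differ from when a real component commits it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    name: str
    svn: int    # security version number of this release
    masvn: int  # minimum acceptable SVN this release asserts


def can_install(release: Release, component_masvn: int) -> bool:
    """A release is installable only if its SVN is at or above the stored floor."""
    return release.svn >= component_masvn


def install(release: Release, component_masvn: int) -> int:
    """Install the release and return the component's new stored MASVN."""
    if not can_install(release, component_masvn):
        raise PermissionError(
            f"{release.name}: SVN {release.svn} < MASVN {component_masvn}"
        )
    # The floor only ever moves up.
    return max(component_masvn, release.masvn)


r_prev = Release("i-1", svn=4, masvn=4)
r_cur = Release("i", svn=5, masvn=4)
r_next = Release("i+1", svn=5, masvn=5)

state = install(r_prev, component_masvn=4)       # floor stays at 4
state = install(r_cur, state)                    # patch deployed, floor still 4
rollback_ok_before = can_install(r_prev, state)  # rollback to i-1 still possible
state = install(r_next, state)                   # floor advances to 5
rollback_ok_after = can_install(r_prev, state)   # rollback to i-1 now blocked
```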
### Example 3 — Revocation List Self-Protection During KRL Update
A KRL push is intercepted or corrupted such that the new KRL would revoke every valid SSH credential in the infrastructure. Without self-protection, a single push takes down the entire fleet.
With self-protection: each node evaluates the incoming KRL before applying it. Any KRL that would revoke the node's own credentials is refused. The worst-case outcome is that half of the fleet refuses the KRL push. The other half remains functional and can be used as a recovery base to remediate the first half.
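A sketch of the per-node check, assuming a KRL can be reduced to a set of revoked key identifiers (real KRL formats are more involved, and `should_apply_krl` is a hypothetical name):

```python
def should_apply_krl(own_key_ids: set[str], krl_revoked_ids: set[str]) -> bool:
    """Refuse any KRL that would revoke one of this node's own credentials."""
    return own_key_ids.isdisjoint(krl_revoked_ids)


node_keys = {"node-a-host-key"}
good_krl = {"stolen-laptop-key"}
bad_krl = {"stolen-laptop-key", "node-a-host-key"}  # corrupted or malicious push

apply_good = should_apply_krl(node_keys, good_krl)
apply_bad = should_apply_krl(node_keys, bad_krl)
```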
### Example 4 — Emergency Access with Locally Isolated Credentials
Google's corporate network uses SSO, short-term credentials, and multi-party authorization. A failure in any of these components could prevent all employee remote access, including incident responders.
To address this, offline credentials were provisioned and alternative authentication algorithms deployed that do not depend on SSO. These credentials are pre-provisioned on a fixed schedule (not activated on demand), so they are already valid when an incident starts rather than having to be created during it. They are restricted to the minimum set of responders who need immediate access, while the broader organization waits for the normal access control services to be restored.
---
## References
- Chapter 9: Design for Recovery, *Building Secure and Reliable Systems* (pp. 183–215) — primary source for all mechanisms in this skill
- Chapter 8: Design for Resilience — compartmentalization, failure domains, and low-dependency component design (referenced by `resilience-and-blast-radius-design`)
- Chapter 7: Design for a Changing Landscape — rate-of-change tradeoffs during security vulnerability response
- Chapter 17: Identifying and Responding to Incidents — when to use revocation during active compromise
- Chapter 18: Recovery and Aftermath — full complexity of recovery from a serious targeted compromise
- *Site Reliability Engineering* book, Chapter 7 — automation and host management patterns for intended state tracking
- See `references/rollback-mechanism-comparison.md` for extended pseudocode examples for deny list and downgrade prevention implementation
- See `references/emergency-access-checklist.md` for a per-responder emergency credential provisioning and exercise checklist
Cross-references:
- `resilience-and-blast-radius-design` — failure domain design, compartmentalization axes, and low-dependency component patterns that emergency access must satisfy
- `security-change-rollout-planning` — rate-of-change tradeoffs during security patch deployment and rollout acceleration triggers
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-resilience-and-blast-radius-design`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Analyze a system's access patterns and design least-privilege controls: classify data and APIs by risk, select the narrowest API surface for each operation,...
---
name: least-privilege-access-design
description: |
Analyze a system's access patterns and design least-privilege controls: classify data and APIs by risk, select the narrowest API surface for each operation, define authorization policies with multi-party approval for sensitive actions, establish emergency access override procedures, and optionally introduce a controlled-access production proxy. Use when reviewing access controls for an existing system, designing authorization for a new service, auditing whether engineers have more permissions than their roles require, deciding whether to use a bastion or proxy for privileged operations, or hardening administrative API surfaces against insider mistakes and external compromise. Produces an access classification report, API surface recommendations, authorization policy decisions, and emergency override guidelines.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/building-secure-and-reliable-systems/skills/least-privilege-access-design
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: building-secure-and-reliable-systems
title: "Building Secure and Reliable Systems"
authors: ["Heather Adkins", "Betsy Beyer", "Paul Blankinship", "Piotr Lewandowski", "Ana Oprea", "Adam Stubblefield"]
chapters: [3, 5]
tags:
- security
- access-control
- least-privilege
- authorization
- zero-trust
- multi-party-authorization
- audit-logging
- api-design
- administrative-api
- breakglass
- proxies
- sre
- reliability
execution:
tier: 2
mode: full
inputs:
- type: codebase
description: "Service codebase, infrastructure config, IAM policy files, or API definitions revealing current access patterns and permission grants"
- type: document
description: "Architecture diagram, role/permission inventory, runbook, or written system description if no codebase is directly accessible"
tools-required: [Read, Write]
tools-optional: [Grep, Bash]
mcps-required: []
environment: "Run inside a project directory with codebase, config, or architecture artifacts. Falls back to structured interview with the engineer."
discovery:
goal: "Produce a written access classification report: data/API risk ratings, API surface recommendations, authorization policy decisions, emergency override guidelines, and a controlled-access proxy recommendation if applicable"
tasks:
- "Inventory all data stores and APIs; classify each by Public / Sensitive / Highly Sensitive"
- "Classify each access type (read / write / infrastructure) and assign a risk level per cell"
- "Evaluate current API surface against least-privilege: identify oversized APIs and recommend narrow functional replacements"
- "Select authorization controls for each risk level: ACL, multi-party authorization, temporary access, structured business justification"
- "Define emergency access override policy: who can invoke it, under what conditions, and how it is audited"
- "Recommend a controlled-access production proxy if fine-grained controls are unavailable or insufficient"
- "Design audit log strategy: granularity, structured justification, auditor selection"
audience:
roles: ["security-engineer", "software-engineer", "site-reliability-engineer", "platform-engineer", "tech-lead", "software-architect"]
experience: "intermediate-to-advanced — assumes familiarity with IAM concepts, API design, and distributed systems"
triggers:
- "Reviewing or auditing access controls for an existing system"
- "Designing authorization for a new service or administrative API"
- "Deciding whether a bastion host or production proxy is needed for privileged operations"
- "Hardening a system where engineers have more permissions than their roles require"
- "Post-incident review reveals an outage caused by an overly permissive admin operation"
- "Preparing for a security review or compliance audit"
- "Reducing the blast radius of a potential account compromise"
not_for:
- "Authentication mechanism selection (e.g., OAuth vs. mTLS) — covered separately"
- "Network topology and firewall rule design"
- "Application-layer threat modeling — use adversary-profiling-and-threat-modeling"
---
## When to Use
Use this skill when you need to systematically reduce the damage any one user, automation, or compromised credential can cause — by granting only the access needed and no more.
Invoke it for:
- Any service where an engineer with a production role could accidentally or maliciously cause an outage or data breach
- Systems where admin tooling uses broad, interactive APIs (SSH to hosts, root shells, POSIX-level access) rather than narrow functional APIs
- Designing new administrative APIs for a service where the access model hasn't been explicitly defined
- Hardening automation credentials: automation roles often accumulate unnecessary permissions over time
- Evaluating whether an emergency override (breakglass) policy is necessary and properly governed
Do not invoke it for selecting the cryptographic authentication mechanism, designing network segmentation, or full threat modeling — those are separate concerns.
---
## Context and Input Gathering
Before designing least-privilege controls, gather the following:
1. **Data inventory**: What data stores does the system hold or access? What is in each? Who currently has access?
2. **API inventory**: What interfaces does the system expose — user-facing, administrative, setup/teardown, maintenance/emergency? For each: what can a caller read, write, or modify?
3. **Role inventory**: What human roles (engineers, SREs, support staff, on-call) and automated roles (CI/CD, batch jobs, monitoring agents) access the system? What do they actually need?
4. **Current access breadth**: Are any roles granted interactive shell access, broad IAM policies, or "owner"-level credentials? Does any automation run as a privileged user beyond what its task requires?
5. **Authorization mechanism in place**: ACL? IAM policy? Role-based groups? Is there a shared authorization library, or does each service roll its own?
6. **Audit coverage**: Are administrative actions logged? Is each log entry attributable to a specific person and action? Is there a review process?
7. **Emergency access story**: How do on-call engineers recover from a bad policy update or auth system failure? Is there a procedure, and is it tested?
If a codebase is available, search for:
- SSH `ForceCommand` entries, sudo rules, or `authorized_keys` files
- IAM policy documents, role bindings, or service account configurations
- Any invocation of root, admin, or infrastructure-level APIs in automation scripts
- Logging calls around sensitive operations — are they structured or freeform?
---
## Process
### Step 1 — Classify Data and APIs by Risk
**WHY**: Not all data and actions carry the same blast radius. Treating everything uniformly either over-controls low-risk operations (hurting productivity) or under-controls high-risk ones (accepting unnecessary exposure). A classification framework makes the trade-off explicit and consistently applied.
Classify each data store and API using the access classification matrix. For each resource, determine its sensitivity category and then assess risk by access type:
**Sensitivity categories**:
| Category | Definition |
|---|---|
| **Public** | Open to anyone in the organization; limited business impact if exposed |
| **Sensitive** | Limited to groups with a documented business purpose; medium impact if exposed or corrupted |
| **Highly Sensitive** | No permanent access; high impact if exposed, corrupted, or deleted (PII, cryptographic secrets, billing data, user credentials) |
**Risk by access type** (per Table 5-1, Chapter 5):
| | Read access | Write access | Infrastructure access |
|---|---|---|---|
| **Public** | Low risk | Low risk | High risk |
| **Sensitive** | Medium/high risk | Medium risk | High risk |
| **Highly Sensitive** | High risk | High risk | High risk |
Infrastructure access (the ability to change ACLs, reduce logging levels, gain direct shell access, restart services, or otherwise affect service availability) is high risk for all sensitivity levels. Even when the data itself is public, infrastructure-level access can enable catastrophic abuse because it bypasses normal access controls.
Output of this step: a classification table listing each data store, API group, and role, with its assigned sensitivity category and the risk level per access type.
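The risk matrix above can be encoded as a lookup so the classification table is generated rather than hand-filled. This is a sketch: the key spellings, the `risk_for` helper, and the inventory rows are illustrative assumptions.

```python
# Risk ratings transcribed from the matrix above: (sensitivity, access) -> risk.
RISK = {
    ("public", "read"): "low",
    ("public", "write"): "low",
    ("public", "infrastructure"): "high",
    ("sensitive", "read"): "medium/high",
    ("sensitive", "write"): "medium",
    ("sensitive", "infrastructure"): "high",
    ("highly_sensitive", "read"): "high",
    ("highly_sensitive", "write"): "high",
    ("highly_sensitive", "infrastructure"): "high",
}


def risk_for(sensitivity: str, access: str) -> str:
    """Look up the risk level for one cell; unknown combinations raise KeyError."""
    return RISK[(sensitivity, access)]


# Hypothetical inventory: (resource, sensitivity, access type granted to a role).
inventory = [
    ("docs-site", "public", "write"),
    ("billing-db", "highly_sensitive", "read"),
    ("docs-site", "public", "infrastructure"),
]
report = [(name, sens, acc, risk_for(sens, acc)) for name, sens, acc in inventory]
```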
### Step 2 — Evaluate and Narrow the API Surface
**WHY**: A large API surface is the root cause of most over-privilege. When users or automation connect via a broad interface (an interactive shell, a general-purpose admin API, a root-level process), the system can't distinguish what they actually need from what they could do. Narrowing the API to the minimum set of operations required makes it possible to grant the minimum permission and to audit actions precisely.
For each administrative API or access pathway, assess:
- **API surface size**: How many distinct operations can a caller perform? An interactive SSH session exposes the entire POSIX API. A custom RPC method that pushes a validated config file exposes exactly one operation.
- **Auditability**: Can you capture what the caller did in a meaningful way? "User opened an SSH session" is not auditable. "User pushed config hash `abc123`" is.
- **Ability to express least privilege**: Can you grant access to exactly this action for exactly this resource without granting broader access?
- **Preexisting infrastructure**: Does a narrow API already exist (e.g., a software update mechanism, a config push API)?
Use the API selection tradeoff matrix (per Table 5-2, Chapter 5 — configuration distribution example):
| API approach | API surface | Auditability | Can express least privilege | Complexity |
|---|---|---|---|---|
| POSIX API via SSH | Large | Poor | Poor | High |
| Software update / package manager API | Varies | Good | Varies | High, but reusable |
| Custom scoped command (e.g., SSH ForceCommand) | Small | Good | Good | Low |
| Custom HTTP/RPC sidecar | Small | Good | Good | Medium |
**Design rule**: Make each API endpoint do one thing well. When you need a new operation, build a new narrow endpoint rather than extending an existing broad one. This applies equally to user-facing APIs and administrative APIs.
**For existing systems with broad APIs** (e.g., SSH access to all hosts):
1. Identify the specific operations that are actually performed through the broad interface
2. Build a narrow API for each operation category, with input validation and structured logging
3. Restrict the broad interface to a controlled emergency override path (see Step 5)
4. Progressively migrate callers to the narrow API
### Step 3 — Select Authorization Controls Per Risk Level
**WHY**: The appropriate authorization control depends on the risk of the action. Binary yes/no ACLs are sufficient for low-risk reads; high-risk writes on sensitive data require additional controls that distribute trust across multiple parties and create an auditable record.
Match each classified operation to one or more of the following controls:
**Access control list (ACL) / group membership** — appropriate for:
- Low and medium risk reads
- Operations where a single authorized user making the decision is acceptable
- Implementation: role-based group membership checked at the API boundary; prefer a shared authorization library over per-service custom logic
**Multi-party authorization (multi-person approval)** — appropriate for:
- High-risk writes and all infrastructure-level operations
- Sensitive operations where unilateral action by a single person (even authorized) is unacceptable
- Benefits: prevents unintentional mistakes, deters insider abuse, increases attacker cost (must compromise multiple accounts or craft a request that passes peer review), provides an audit trail that is tamper-resistant
- Design requirement: ensure the authorization system and social dynamics both allow approvers to say no. If approvers feel unable to reject suspicious requests from managers or senior engineers, the control provides no security value. Provide an escalation path to a security team.
- Pitfall: ensure approvers have enough context to make an informed decision. Show the specific action, target, and parameters — not just a generic "approve this request."
**Business justification (structured)** — appropriate for:
- Access to sensitive customer data by support staff (tie to a specific ticket or case number)
- Operations that are permitted but should be associated with a documented business need
- Implementation: require a structured reference (ticket ID, incident number, customer case ID) rather than free-text, so access can be programmatically correlated to the justification and flagged when it doesn't match
**Temporary access** — appropriate for:
- All sensitive and highly sensitive operations where continuous standing access is unnecessary
- On-call rotations, time-bounded task assignments
- Benefit: reduces ambient authority — if a user never holds continuous access to sensitive resources, a credential compromise has a limited time window of damage
- Implementation: expiring group memberships, on-demand access request workflows, scheduled access windows tied to on-call shifts
**Three-factor authorization** — appropriate for:
- Extremely high-risk operations where broad workstation compromise is a realistic threat model
- Requires authorization from a separate, hardened device (e.g., a managed mobile device) that an attacker who has compromised the primary workstation cannot easily also control
- Note: this protects against broad infrastructure compromise, not against insider threats (the same person controls both approvals)
For highly sensitive infrastructure operations, combine controls: multi-party authorization + temporary access + structured business justification.
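The core invariant of multi-party authorization is small enough to sketch: the approver must be a different person from the requester and must belong to an eligible group. The function and group names below are hypothetical, and a real system would also bind the approval to the specific action, target, and parameters.

```python
def approval_is_valid(
    requester: str, approver: str, eligible_approvers: set[str]
) -> bool:
    """True only if a distinct, eligible person approved the request."""
    return approver != requester and approver in eligible_approvers


admin_leads = {"alice", "bob"}

self_approved = approval_is_valid("alice", "alice", admin_leads)
peer_approved = approval_is_valid("alice", "bob", admin_leads)
outsider_approved = approval_is_valid("alice", "mallory", admin_leads)
```

Rejecting self-approval is what forces an attacker to compromise two accounts instead of one.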
### Step 4 — Design the Audit Strategy
**WHY**: Authorization controls are only as effective as the audit mechanism that detects when they are circumvented or abused. The value of a narrow API comes not just from preventing misuse, but from making every action attributable and reviewable. Without deliberate audit design, audit logs become noise that nobody reviews.
**Audit log requirements**:
- Each log entry must answer: **Who** did **what** to **which resource**, **when**, and **why** (structured justification)
- Use structured data, not free-text — this enables programmatic analysis and correlation across events
- Associate audit events with structured justifications (ticket IDs, incident numbers) so that access patterns can be verified against documented need
**Granularity**: Small functional APIs provide the largest audit advantage. "User pushed config with hash `abc123` to host group `web-frontend`" enables strong assertions. "User opened SSH session" does not. Interactive session transcripts (bash history, `script(1)`) appear comprehensive but can be bypassed by any user who is aware of their existence.
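A sketch of what one structured entry might look like for the config push example. The field names, the `audit_entry` helper, and the ticket ID format are assumptions for illustration, not a schema from the source.

```python
import json
from datetime import datetime, timezone


def audit_entry(who, action, resource, justification, when=None):
    """Serialize one audit event with who / what / which resource / when / why."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "who": who,
            "what": action,
            "resource": resource,
            "when": when.isoformat(),
            "why": justification,  # structured reference, not free text
        },
        sort_keys=True,
    )


entry = audit_entry(
    who="alice",
    action={"op": "push_config", "config_hash": "abc123"},
    resource="host-group:web-frontend",
    justification={"type": "ticket", "id": "TICKET-1234"},
    when=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Because `why` is a structured reference, a later job can join audit events against the ticket system instead of parsing prose.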
**Auditor selection**:
- For best-practice audits (are controls being followed correctly?): use team-level peer review. Teammates have context to identify unusual patterns and create cultural pressure to use proper channels rather than emergency overrides. Distribute this responsibility broadly.
- For security breach detection (has an adversary compromised an account?): use a centralized security team with cross-team visibility. Individual teams may not notice the connection between anomalous actions across different services.
**Emergency override audit**: Emergency override (breakglass) events must always be reviewed. Weekly team review of all emergency override usage from the previous shift is a practical pattern — it creates cultural accountability and signals when the narrow API is insufficient for real operational needs (which should trigger a fix to the normal API, not normalization of emergency override use).
### Step 5 — Define the Emergency Access Override Policy
**WHY**: Any authorization system can fail. A bad policy update, a misconfigured ACL, or an urgent production incident may require access that the normal authorization path cannot provide in time. Without a pre-defined, tested emergency access mechanism, engineers will improvise — which introduces uncontrolled risk. With a well-designed one, you get a controlled escape valve that is tightly audited.
Define the emergency access override policy with the following properties:
**Access restriction**: Emergency override access should be available only to the team directly responsible for the service's operational SLA (typically the SRE team). It should not be broadly available to all engineers.
**Location restriction** (for zero trust network access): If the service uses zero trust network access (access based on user and device credentials, not network location), the emergency override for bypassing the zero trust control should be available only from specific, physically secured locations with additional physical access controls — sometimes called "panic rooms." This is an intentional exception to the "network location doesn't grant trust" principle, offset by physical controls.
**Monitoring**: All uses of emergency override must be logged and reviewed. Emergency override use should be rare and surprising. Routine use signals that the normal API is inadequate and must be fixed.
**Testing**: The emergency override mechanism must be tested regularly by the team responsible for the service. A mechanism that has never been tested may not work when it is needed.
**Graceful failure**: Design the authorization system to fail in a known, diagnosable way. When a caller is denied access, the denial message should include information proportional to the caller's privilege level — nothing for unprivileged callers (no information disclosure), remediation steps for authorized callers who are incorrectly denied. Provide a denial token that can be used to open a support ticket rather than requiring the caller to describe the failure from memory.
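The graceful-failure behavior can be sketched as a denial builder. This assumes an opaque random token is safe to hand to every caller, which is a design assumption of this sketch rather than something the source specifies; `denial_response` is a hypothetical name.

```python
import uuid


def denial_response(caller_is_privileged: bool, remediation: str) -> dict:
    """Build a denial whose detail is proportional to the caller's privilege."""
    response = {
        "status": "denied",
        # Opaque token the caller can quote in a support ticket.
        "denial_token": uuid.uuid4().hex,
    }
    if caller_is_privileged:
        # Only callers who should have had access see how to fix the denial.
        response["remediation"] = remediation
    return response


hint = "Request membership in the on-call group, then retry."
for_outsider = denial_response(False, hint)
for_oncall = denial_response(True, hint)
```

The server would log the token alongside the full denial reason, so support staff can look up the cause without the caller having to describe it.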
### Step 6 — Evaluate Whether a Controlled-Access Production Proxy Is Needed
**WHY**: When fine-grained controls for backend services are not available — because the service is third-party, legacy, or too costly to modify — a controlled-access production proxy can layer authorization, auditing, rate limiting, and multi-party approval on top of the existing interface without requiring changes to the underlying system.
A controlled-access production proxy is appropriate when:
- Direct modification of the target system to add authorization controls is impractical
- Multiple teams need audited, rate-limited access to the same production infrastructure
- You need multi-party approval for specific operations without changing each service
- You want to enforce that no human directly accesses a production system except through the proxy
A controlled-access production proxy provides:
- **Single audit point**: every operation against the fleet is logged at the proxy, regardless of which tool or engineer initiates it
- **Multi-party authorization enforcement**: the proxy checks for peer approval before forwarding the request
- **Rate limiting**: restricts the blast radius of mistakes (e.g., limiting the rate at which machine restarts can be issued)
- **Compatibility**: works with third-party systems that cannot be modified, by controlling behavior at the proxy layer
Proxy risks and mitigations:
- **Single point of failure**: run multiple instances for redundancy; ensure all dependencies have acceptable SLAs and documented emergency contacts
- **Policy configuration errors**: generate policy templates or auto-generate settings that are secure by default; review changes to policy config with the same rigor as code changes
- **Circumvention pressure**: engineers will attempt to bypass the proxy for convenience. Address this by working closely with the team to ensure emergency override paths are available for genuine emergencies, while maintaining that the proxy is the required channel for normal operations
---
## Key Principles
**Least privilege applies to humans, automation, and machines equally.** The objective extends through all authentication and authorization layers. Automation credentials often accumulate permissions over time — review them with the same rigor as human roles.
**Avoid ambient authority.** Users and automation should not hold standing access to sensitive resources they do not currently need. Temporary access that expires is always preferable to permanent standing access.
**Design for the realistic threat model, not the idealized one.** Engineers make typos. Accounts get compromised. Credentials are phished. A system that requires perfect human execution to remain secure is not secure. Design to limit the damage of realistic failure modes.
**Small APIs make everything else possible.** Narrow, functional APIs are the prerequisite for meaningful audit logs, meaningful least privilege, and meaningful multi-party authorization. A system built on broad interactive APIs cannot be audited or constrained effectively regardless of other controls.
**Authorization infrastructure should be shared, not per-service.** Separate authorization logic into a shared library or service. This enables org-wide controls (multi-party authorization, multi-factor authorization) to be added at a single layer rather than requiring changes to every service. Standardization also enables team mobility and consistent policy reasoning.
**Culture enforces what technology cannot.** Multi-party authorization only works if approvers feel genuinely empowered to reject suspicious requests. Emergency override use only remains rare if teams review it regularly and treat frequent use as a signal that the normal API needs improvement. Controls without cultural reinforcement become rubber stamps.
---
## Examples
### Example 1 — Controlled-Access Production Proxy (Safe Proxy Pattern)
**Scenario**: An infrastructure team runs a fleet of production servers. Engineers use command-line tools directly against the fleet for administrative tasks. These tools are potentially dangerous — an incorrect scope selector could stop multiple service frontends simultaneously. Centralized logging and authorization are not enforced.
**Problem**: No audit trail. Engineers can issue arbitrary commands to production. A single engineer account compromise or fat-finger mistake could take down significant fleet capacity.
**Least-privilege design**:
1. **Classify**: The fleet's command-line interface is an infrastructure-level API. By the classification matrix: high risk for all sensitivity levels.
2. **Narrow the API**: Introduce a controlled-access production proxy that exposes a defined set of RPCs. Each RPC corresponds to a specific administrative action (restart a job, push a config, check status). The underlying servers are reconfigured to accept connections only from the proxy — not directly from engineer workstations.
3. **Authorization policy**: The proxy enforces an ACL per RPC. For sensitive operations (e.g., stopping a job in production), the proxy triggers multi-party authorization — an engineer in the `admin` group must have a peer in `admin-leads` approve before the command executes.
4. **Audit**: The proxy logs every RPC, the identity of the caller, the parameters, and whether it was approved. Log entries are structured and associated with the requester's ticket or incident number.
5. **Emergency override**: Engineers who genuinely need direct access during an incident can invoke an emergency override mechanism. All emergency override uses are reviewed weekly by the team.
**Result**: ~13% of production outages that would have been caused by direct human access to production become preventable. The blast radius of any single mistake or compromise is bounded by the rate limiter and multi-party approval requirements.
**Proxy policy structure** (conceptual):
```
config {
  proxy_role = 'admin-proxy'
  tools {
    restart_job {
      allow = ['group:admin']
      require_approval_from = ['group:admin-leads']
      rate_limit = { max_per_minute = 5 }
    }
    status_check {
      allow = ['group:admin', 'group:sre']
      # No approval required — read-only, low risk
    }
  }
}
```
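To make the conceptual policy concrete, here is a sketch of how a proxy might evaluate it. The `POLICY` dictionary mirrors the config above; the `authorize` function, its parameters, and the reason strings are hypothetical, and the check that the approver differs from the caller is omitted for brevity.

```python
POLICY = {
    "restart_job": {
        "allow": {"group:admin"},
        "require_approval_from": {"group:admin-leads"},
        "max_per_minute": 5,
    },
    "status_check": {
        "allow": {"group:admin", "group:sre"},
    },
}


def authorize(tool, caller_groups, approver_groups=frozenset(), calls_last_minute=0):
    """Return (allowed, reason) for one request against POLICY."""
    rule = POLICY.get(tool)
    if rule is None:
        return False, "unknown tool"
    if not rule["allow"] & set(caller_groups):
        return False, "caller not in allow list"
    required = rule.get("require_approval_from")
    if required and not required & set(approver_groups):
        return False, "peer approval required"
    limit = rule.get("max_per_minute")
    if limit is not None and calls_last_minute >= limit:
        return False, "rate limit exceeded"
    return True, "ok"


unapproved = authorize("restart_job", {"group:admin"})
approved = authorize("restart_job", {"group:admin"}, {"group:admin-leads"})
```

Tools that are not listed are denied by default, which keeps the proxy's API surface limited to what the policy names.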
### Example 2 — Configuration Distribution API Design
**Scenario**: An automation system needs to push a validated configuration file to all web servers in a fleet. The naive approach: SSH to each host as the user the web server runs as, write the file, restart the process.
**Problem**: The SSH approach exposes the entire POSIX API. The automation role can read any data on the host, stop the web server permanently, start arbitrary binaries, or cause a coordinated outage of the entire fleet. A compromise of the automation credential is equivalent to a compromise of every web server.
**Least-privilege design using Table 5-2 logic**:
1. **Classify**: Writing web server configuration is a write to a Sensitive resource (a bad config can degrade a public-facing service), which is medium risk per the matrix. Infrastructure access (the ability to restart the service) is high risk.
2. **Evaluate API options**:
- POSIX API via SSH: large surface, poor auditability, poor least-privilege expression — reject
- Software update API (e.g., package manager): good auditability, reusable infrastructure, but complexity is high and convergence timing may not meet requirements
- Custom SSH ForceCommand: small surface, good auditability, low complexity — viable
- Custom HTTP receiver (sidecar): small surface, good auditability, medium complexity — preferred for scale
3. **Design the narrow API**: A small sidecar process accepts a configuration payload via an authenticated RPC, validates its structure and signature, writes the file to the correct path, and restarts the web server. The automation role is authorized only to call this single RPC — it cannot read other files or run other processes.
4. **Segment trust further**: The signing of the configuration is performed by a separate role (the code review / release system), independent from the automation role that pushes it. Even if the push automation is compromised, it cannot push an arbitrary config — only content that has been signed by the release system.
5. **Audit**: Each push logs the config hash, the target host group, and the result. Rejected configs (invalid signature, schema validation failure) are logged for investigation.
**Result**: A compromise of the push automation credential cannot write arbitrary content to hosts or run arbitrary processes. The blast radius is limited to pushing a valid (signed) config — which itself requires compromise of the signing system.
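The sidecar's validate-before-write step can be sketched as follows. Note the substitution: this uses an HMAC from Python's standard library as a stand-in for the release system's signature. With a shared HMAC key the verifier could also sign, so a real deployment would more likely use an asymmetric signature where the sidecar holds only a public key. The function names and key are hypothetical.

```python
import hashlib
import hmac


def sign_config(payload: bytes, key: bytes) -> str:
    """Release-system side: produce a hex MAC over the exact config bytes."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def accept_config(payload: bytes, signature: str, key: bytes) -> bool:
    """Sidecar side: accept the config only if the MAC matches."""
    expected = sign_config(payload, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


release_key = b"demo-key-for-illustration-only"
config = b"worker_processes 4;\n"
signature = sign_config(config, release_key)

valid_push = accept_config(config, signature, release_key)
tampered_push = accept_config(config + b"# injected\n", signature, release_key)
```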
### Example 3 — Support Staff Access to Customer Data
**Scenario**: Customer support representatives need to access customer account records to resolve tickets. Currently, all support staff have read access to all customer records for all customers at all times.
**Problem**: Overly broad read access to highly sensitive data. A support staff compromise, or a malicious insider, can exfiltrate all customer data without any specific trigger.
**Least-privilege design**:
1. **Classify**: Customer records are highly sensitive. Read access is high risk.
2. **Authorization control**: Replace standing read access with structured business justification — access to a customer's record is only permitted when a support case for that customer is open and assigned to this representative.
3. **Implementation path** (incremental):
- Phase 1: Require a support ticket ID for any customer data access. Log the association between the ticket and the access event.
- Phase 2: Validate that the ticket exists, is open, and is assigned to the requesting representative before granting access.
- Phase 3: Restrict access to only the specific customer's data associated with the open ticket, rather than all customers.
- Phase 4: Add time bounds — access expires when the ticket is closed.
4. **Audit**: Every customer data access is logged with the associated ticket ID. A programmatic check verifies that access events correspond to open tickets. Anomalies (access with no ticket, access after ticket closure, bulk reads) trigger alerts.
**Result**: The data surface exposed to any single support interaction is the minimum needed to resolve that case. A compromised support account can only access data for currently open tickets assigned to it — not the entire customer database.
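The Phase 1 to Phase 4 checks can be combined into a single authorization predicate. The sketch below is illustrative only: the `Ticket` fields and the `may_read_customer` name are hypothetical, and a real system would look the ticket up in the ticketing backend rather than receive it as an argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    customer_id: str
    assignee: str
    is_open: bool


def may_read_customer(rep: str, customer_id: str, ticket: Optional[Ticket]) -> bool:
    """Allow access only with an open ticket assigned to this rep for this customer."""
    if ticket is None:            # Phase 1: a ticket must be supplied
        return False
    return (
        ticket.is_open                         # Phases 2 and 4: open now, expires on close
        and ticket.assignee == rep             # Phase 2: assigned to the requester
        and ticket.customer_id == customer_id  # Phase 3: scoped to that customer
    )
```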
---
## References
- [access-classification-matrix.md](references/access-classification-matrix.md) — Full access classification matrix (Public/Sensitive/Highly Sensitive × Read/Write/Infrastructure) with risk ratings and control recommendations
- [api-selection-tradeoffs.md](references/api-selection-tradeoffs.md) — API design options for administrative operations with tradeoff analysis (API surface, auditability, least-privilege expressibility, complexity)
- [authorization-policy-framework.md](references/authorization-policy-framework.md) — How to design and ship authorization policies: shared library patterns, policy language tradeoffs, pitfalls
- [controlled-access-proxy-design.md](references/controlled-access-proxy-design.md) — Detailed design guide for controlled-access production proxies: architecture, policy structure, failure modes, redundancy
- [emergency-override-policy-template.md](references/emergency-override-policy-template.md) — Template for defining an emergency access override policy: eligibility, invocation, monitoring, testing schedule
- [audit-log-design.md](references/audit-log-design.md) — Audit log design: structured justification, granularity, auditor selection, programmatic verification
Cross-references:
- `adversary-profiling-and-threat-modeling` — identify which adversaries and attack paths make least-privilege controls most valuable
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Use when you need to set up an incident response team from scratch, design an IR team charter, define severity and priority models for incidents, create IR p...
---
name: incident-response-team-setup
description: Use when you need to set up an incident response team from scratch, design an IR team charter, define severity and priority models for incidents, create IR playbooks, build a structured testing program, design tabletop exercises, or answer "how do we build and validate our incident response capability."
tags: [security, incident-response, disaster-planning, team-design, tabletop-exercise]
---
# Incident Response Team Setup
Guides building an incident response (IR) team and testing program from zero using a 9-step process: staffing model selection, role catalog definition, team charter writing, severity and priority model design, operating parameters, response plan development, playbook creation, a 3-tier testing program, and a feedback loop. Consumes the risk register produced by `disaster-risk-assessment` to calibrate severity levels against real exposure. Output: IR team charter, severity/priority models, response plan templates, playbook structure, and a testing program design.
## When to Use
- Building an IR team for the first time or formalizing an ad hoc response function
- Redesigning an existing IR team after a significant organizational or threat environment change
- Establishing severity and priority models before the next incident season
- Designing a tabletop exercise or a disaster recovery test program
- Scoping which incidents to handle in-house vs. outsource
- Establishing IR training requirements and response decision criteria
**Prerequisite:** A completed risk register from `disaster-risk-assessment`. The P×I rankings and scenario list from that register feed directly into severity model calibration in Step 4. Without it, severity thresholds will be guesswork rather than evidence-based.
## Process
### Step 1 — Choose a staffing model
Select one of three models or a hybrid based on budget, organizational size, and incident complexity:
| Model | Description | Trade-offs |
|-------|-------------|------------|
| Dedicated full-time IR team | Employees whose primary role is incident response | Always available, appropriately trained, has system access; higher cost |
| Dual-hat (existing staff + IR duties) | Engineers handle regular work plus IR when incidents arise | Lower cost, leverages domain knowledge; responders may be unavailable or fatigued |
| Outsourced | Third parties perform IR activities | Access to specialist skills (e.g., forensics) without headcount; external responders may not be immediately available and lack system context |
**Why this matters:** Outsourcing can add appreciable delay to response during active incidents. Response time is a function of the staffing model, so choose the model deliberately rather than by default. Many organizations use a hybrid: in-house for most IR, outsourced for specialized functions like forensics where full-time staff is not cost-effective.
**Avoid single points of failure regardless of model.** Incidents do not respect vacation schedules or time zones. Establish on-call rotations, empower deputies to approve emergency code fixes and configuration changes, and appoint delegates across time zones for multinational organizations.
### Step 2 — Define the role catalog
Identify which roles your team needs. Roles are not individuals — one person may hold multiple roles during an incident, and rotational staffing across shifts is recommended to reduce fatigue.
**Core command roles** (detailed in the incident-command skill):
- **Incident commander (IC):** Leads the response to an individual incident. Coordinates all responders and owns the overall response direction.
- **Operational lead (OL):** Directs technical response operations — the engineering execution arm of the IC.
- **Communications lead (CL):** Owns all internal and external communications during the incident.
- **Reliability lead (RL):** Coordinates site reliability and infrastructure responders.
**Supporting roles:**
- **Site reliability engineers (SREs):** Reconfigure impacted systems or implement code fixes.
- **Security engineers:** Review the security impact and work with SREs and privacy engineers to secure the system.
- **Forensic specialists:** Perform event reconstruction and attribution — determine what happened and how.
- **Privacy engineers:** Address impact on technical privacy matters.
- **Legal:** Provide counsel on applicable laws, statutes, regulations, and contracts.
- **Customer support:** Respond to customer inquiries or proactively reach out to affected customers.
- **Public relations:** Respond to public inquiries and coordinate with the communications lead on media statements.
**Why define roles explicitly:** Knowing who holds each role before an incident eliminates the coordination overhead of figuring it out under pressure. An individual may hold multiple roles, but the roles must be assigned — not assumed.
**Identify a champion:** Designate a person with sufficient organizational seniority to commit resources and remove roadblocks. The champion helps assemble the team and resolves competing priorities between IR work and regular operational commitments.
### Step 3 — Write the team charter
The charter is the IR team's governing document. It must contain three elements:
**Mission statement (one sentence):** A single sentence describing the types of incidents the team handles. This allows anyone to quickly understand what the team does without reading the entire charter.
**Scope:** Describe the environment the team covers — technologies, end users, products, and stakeholders. Clearly define:
- Which incidents are handled by internally staffed resources
- Which incidents are assigned to outsourced teams
- What is explicitly out of scope (e.g., individual customer inquiries about firewall configurations may belong to customer support, not the IR team)
**Definition of success:** How does the organization know when an incident response is complete and can be declared done? Define the done criteria explicitly. Without them, incident close is ambiguous and teams may disengage prematurely.
**Team morale consideration:** Review scope and workload together when establishing the charter. Overworked teams experience productivity drops and attrition. For dedicated or cross-functional virtual response teams alike, sustainable workload must be part of the charter conversation.
### Step 4 — Establish severity and priority models
Use both models concurrently — they are related but serve different purposes. Calibrate severity thresholds against the risk register from `disaster-risk-assessment`: scenarios with high P×I rankings should map to severity 0 or 1.
**Severity model** — categorizes incidents by their impact on the organization:
| Severity | Label | Example |
|----------|-------|---------|
| 0 | Most severe | Unauthorized access across production network |
| 1 | High | Confirmed breach of a single critical system |
| 2 | Medium | Temporary unavailability of security logs |
| 3 | Low | Suspected (unconfirmed) anomalous access |
| 4 | Least severe | Informational alert with no confirmed impact |
Assign severity ratings using the risk register categories. Not every incident deserves a critical or moderate severity rating — accurate ratings ensure incident commanders can correctly prioritize when multiple incidents are reported simultaneously.
**Priority model** — defines how quickly personnel must respond:
| Priority | Response tempo |
|----------|----------------|
| 0 | Immediate response; team members drop all other work |
| 1 | Urgent; respond before end of current shift |
| 2 | High; respond within the business day |
| 3 | Normal; handle within the week |
| 4 | Routine; handle as operational work allows |
**Critical distinction — severity is fixed, priority changes:**
Severity reflects the incident's actual impact on the organization and typically remains fixed throughout the incident's lifecycle. Priority reflects operational tempo and can change as the situation evolves. During early triage and implementation of a critical fix, priority may be 0. Once the fix is in place, priority can lower to 1 or 2 as engineering teams perform cleanup work. Misaligned priority ratings across teams cause coordination failures — one team responding at priority 0 tempo while another treats the same incident as priority 2 will operate at different speeds, delaying proper response.
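One way to make the distinction hard to get wrong in tooling is to model severity as read-only and priority as mutable. The class below is a hypothetical sketch, not part of any named incident tracker. It is also stricter than the text, which says severity "typically" stays fixed: here it cannot change after creation.

```python
class Incident:
    """Severity is set once at creation; priority may change as work progresses."""

    LEVELS = range(5)  # 0 (most severe / most urgent) through 4

    def __init__(self, title: str, severity: int, priority: int) -> None:
        if severity not in self.LEVELS or priority not in self.LEVELS:
            raise ValueError("severity and priority must be between 0 and 4")
        self.title = title
        self._severity = severity
        self.priority = priority
        self.priority_history = [priority]

    @property
    def severity(self) -> int:
        # No setter is defined, so reassigning severity raises AttributeError.
        return self._severity

    def set_priority(self, priority: int) -> None:
        """Record a change in response tempo, e.g. 0 during the fix, 2 during cleanup."""
        if priority not in self.LEVELS:
            raise ValueError("priority must be between 0 and 4")
        self.priority = priority
        self.priority_history.append(priority)
```

Keeping a priority history gives every team the same view of the current tempo, which addresses the misalignment problem described above.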
### Step 5 — Define operating parameters
Operating parameters describe the day-to-day functioning of the IR team and ensure that severity 0 and priority 0 incidents receive timely responses.
Define at minimum:
- **Initial response time target:** How quickly must someone acknowledge a reported incident? (e.g., within 5 minutes, 30 minutes, 1 hour, or next business day — set per severity level)
- **Triage time target:** How quickly must the team complete initial triage and develop a response schedule?
- **Service level objectives (SLOs):** When must incident response interrupt regular day-to-day engineering work? This keeps IR from being deprioritized during busy operational periods.
- **On-call rotation structure:** How are on-call duties load-balanced across the team?
**Why operating parameters matter for distributed or virtual teams:** When an IR team includes members from multiple organizations or outsourced partners, each group may have different assumptions about response speed. Explicit operating parameters force alignment before an incident, not during one.
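Writing the parameters down as data makes them checkable. In the sketch below, the durations echo the examples given for the initial response time target, but the mapping from severity level to duration is an assumption, and 24 hours is only a rough stand-in for "next business day".

```python
from datetime import datetime, timedelta

# Hypothetical acknowledgement targets per severity level.
ACK_TARGETS = {
    0: timedelta(minutes=5),
    1: timedelta(minutes=30),
    2: timedelta(hours=1),
    3: timedelta(days=1),  # stand-in for "next business day"
    4: timedelta(days=1),
}


def ack_overdue(severity: int, reported_at: datetime, now: datetime) -> bool:
    """True if an unacknowledged incident has exceeded its response target."""
    return now - reported_at > ACK_TARGETS[severity]
```

A shared table like this gives in-house and outsourced responders one definition of "late" to agree on before an incident.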
### Step 6 — Develop response plans
Response plans guide decision-making during severe incidents when responders are working quickly with limited information. Develop plans covering:
- **Incident reporting:** How does an incident get reported to the IR team? What are the reporting channels for engineers, customers, administrators, and automated alerts?
- **Triage:** Who responds to the initial report and begins triaging? What is the handoff process?
- **Service level objectives:** Reference the SLOs established in Step 5 so responders know the expected tempo.
- **Roles and responsibilities:** Clear definitions for each role during the response.
- **Outreach:** How does the IR team reach engineering teams and participants who may need to assist?
- **Communications plan:** Communication during an incident does not happen without advance planning. The plan must specify:
- How to inform leadership (email, text, phone call — and what information to include)
- How to conduct intra-organization communication (chat rooms, videoconferencing, bug tracking tools)
- How to communicate with external stakeholders such as regulators or law enforcement (partner with legal; maintain an index of contact details and communication methods per external stakeholder)
- How to communicate with customer-facing teams without tipping off an adversary if the primary communication system is compromised or unavailable
**Backup communication channels are not optional:** Adversaries who compromise an email or instant messaging server can monitor IR coordination threads, sidestep detection, and observe mitigation efforts. If the communication system is offline, the team may be unable to contact stakeholders at other sites. The communications section of every response plan must cover backup communication methods.
Each response plan should contain high-level procedures referencing specific playbooks for detailed execution. Outline the overarching approach for each class of incident — the playbook contains step-by-step instructions.
### Step 7 — Create detailed playbooks
Playbooks complement response plans with specific, procedural instructions from beginning to end. They are team-specific, procedural in nature, and must be frequently revised. Examples of what playbooks cover:
- How to grant responders emergency temporary administrative access (breakglass procedures)
- How to output and parse particular logs for analysis
- How to fail over a system and when to implement graceful degradation
- Criteria for when to notify senior leadership and when to work with localized engineering teams
**Access and currency:** Store playbooks and response plans in a location accessible during a disaster — if company servers go offline, cloud-hosted documentation or printed offline copies must remain available. Set a review cadence (minimum: annually; after any significant infrastructure or configuration change) because threat postures change and new vulnerabilities emerge.
**Incident tracking:** Identify a suitable system for tracking information and retaining incident data. Security and privacy incident teams may want a system with need-to-know access controls; reliability response teams may prefer broader company access for coordination.
**Training for all engineers, not just IR team members:** Train all engineers who may assist the IR team on the IR roles and their responsibilities. Use the Incident Management at Google (IMAG) framework, which is based on the Incident Command System, as a reference structure for role assignments (incident commander, operational lead, communications lead). Establish a finite time limit — such as 15 minutes — for a first responder to grapple with an incident before escalating to the IR team. Pre-establish decision criteria for high-pressure choices (e.g., whether to take a compromised system offline vs. preserve it for forensics) so responders are not making gut decisions under stress.
### Step 8 — Build the testing program
Testing validates that your materials work before a real incident. Run tests at a minimum annually. The program has three tiers:
**Tier 1 — Automated system auditing**
Audit all critical systems and their dependencies (backup systems, logging systems, software updaters, alert generators, communication systems) to verify they are operating correctly. A full audit confirms:
- Backups are created, stored safely, stored for the appropriate retention period, and stored with correct permissions — conduct data recovery and validation exercises periodically
- Event logs are stored correctly and for a period appropriate to the organization's risk level (the industry average for detecting intrusions is approximately 200 days; logs deleted before detection cannot be used for investigation)
- Critical vulnerabilities are patched in a timely fashion — audit both automatic and manual patch processes
- Alerts fire correctly — validate each alert rule, and account for dependencies (e.g., how are alerts impacted if an SMTP server goes offline during a network outage?)
- Communication tools retain failover capability and message history needed for postmortems
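The log retention check in the list above lends itself to automation. The following is a minimal sketch under stated assumptions: `LogStore` and `retention_findings` are hypothetical names, and the 200-day default reuses the detection figure quoted above as a floor, which an organization should replace with a value matched to its own risk level.

```python
from dataclasses import dataclass

DETECTION_WINDOW_DAYS = 200  # approximate detection time quoted in the text


@dataclass(frozen=True)
class LogStore:
    name: str
    retention_days: int


def retention_findings(stores, minimum_days=DETECTION_WINDOW_DAYS):
    """Return one finding per log store whose retention is shorter than the minimum."""
    return [
        f"{store.name}: retains {store.retention_days} days, below {minimum_days}"
        for store in stores
        if store.retention_days < minimum_days
    ]
```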
**Tier 2 — Nonintrusive tabletop exercises**
Tabletop exercises test documented procedures and team decision-making without taking systems offline. They can also serve as a proxy when end-to-end production testing is not feasible (e.g., testing an earthquake response without causing an earthquake).
Design parameters for a standard tabletop exercise:
- **Duration:** 60 minutes
- **Decision points:** 10–20 storyline branch points structured like a "choose your own adventure" — each participant decision affects the subsequent scenario state (e.g., if the team takes a compromised email server offline, participants cannot send email notifications for the rest of the scenario)
- **Believability:** Base scenarios on realistic attack vectors and known vulnerabilities so participants engage without suspending disbelief
- **Artifacts:** Provide realistic artifacts — log files, customer reports, alert screenshots — to increase immersion and realism
- **Participants:** Include the full range: frontline engineers following playbooks, senior leadership making business-level decisions, public relations professionals coordinating external communications, and legal providing guidance on public statements
- **Active demonstration:** Participants should demonstrate procedures, not just describe them. If a playbook calls for escalating to forensics and blocking hostile traffic, the responder should carry out those steps during the exercise — building muscle memory
- **Facilitator preparation:** The facilitator must be deeply familiar with the scenario and typical responses in order to improvise and nudge responders in the right direction as the exercise unfolds
- **Outcomes:** Conclude with actionable feedback — what worked, what did not, and concrete improvement recommendations with assigned owners. Create action items to address each recommendation. An exercise without implemented lessons is entertainment, not preparation.
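The "choose your own adventure" structure can be prepared as a small graph in which each choice leads to a next node and changes the scenario state. The sketch below is illustrative: the node names, prompts, and `email_available` flag are invented for the example and loosely follow the email server case mentioned in the decision points item.

```python
# Each node has a prompt and choices; a choice maps to (next_node, state_effects).
SCENARIO = {
    "start": {
        "prompt": "Alerts suggest the email server is compromised.",
        "choices": {
            "take_offline": ("contained", {"email_available": False}),
            "keep_running": ("monitoring", {}),
        },
    },
    "contained": {"prompt": "The server is offline. Notify stakeholders.", "choices": {}},
    "monitoring": {"prompt": "The adversary may be reading your mail.", "choices": {}},
}


def step(node: str, choice: str, state: dict):
    """Apply a participant decision and return the next node and updated state."""
    next_node, effects = SCENARIO[node]["choices"][choice]
    return next_node, {**state, **effects}


def can_send_email(state: dict) -> bool:
    """Email notifications work unless an earlier decision disabled them."""
    return state.get("email_available", True)
```

A facilitator can use the carried state to enforce consequences: once the team takes the server offline, any later step that relies on email has to fall back to a backup channel.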
**Tier 3 — Fault injection and disaster recovery testing**
Production-environment testing validates that systems handle failure modes correctly under real-world constraints. This is where IR teams observe how their responses affect actual production environments.
Sub-types to include in the program:
- **Single-system fault injection:** Inject faults at the component level without disrupting the entire system. Use fault injection frameworks (e.g., the Envoy HTTP proxy fault injection filter) to return arbitrary errors for a percentage of traffic or delay requests for a specific time period. This tests timeout handling and isolates team dependencies.
- **Human resource testing:** Test what happens when key personnel are unavailable. IR teams that rely on individuals with institutional knowledge rather than documented processes are fragile — validate that documented processes function when those individuals are absent.
- **Multicomponent testing:** Test simultaneous failure of multiple dependent components. When failovers occur, verify that the system respects existing access control lists and that authorization services fail closed (not open).
- **System-wide failovers:** Test full failover to secondary or disaster-recovery datacenters. Until you actually fail over, you cannot confirm that your failover strategy will protect your business and security posture. For cloud-hosted services, test what happens when an entire availability zone or region fails.
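For single-system fault injection, the idea of failing a percentage of traffic or delaying requests can be illustrated with a plain function wrapper. This is a generic sketch and not the Envoy filter's configuration or API. The names `with_faults` and `InjectedFault` are hypothetical.

```python
import random
import time


class InjectedFault(Exception):
    """Raised in place of a real response when a fault is injected."""


def with_faults(handler, error_rate=0.0, delay_seconds=0.0, rng=None):
    """Wrap a handler so a fraction of calls fail and every call can be delayed."""
    if rng is None:
        rng = random.Random()

    def wrapped(*args, **kwargs):
        if delay_seconds > 0:
            time.sleep(delay_seconds)      # exercises caller timeout handling
        if rng.random() < error_rate:      # random() is in [0.0, 1.0)
            raise InjectedFault("injected error")
        return handler(*args, **kwargs)

    return wrapped
```

Passing a seeded `random.Random` makes a test run repeatable, which helps when comparing how callers behave before and after a fix.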
**Google's disaster recovery test (DiRT) program — combined reliability and security test:**
During one annual disaster recovery test (part of Google's DiRT program), site reliability engineers tested whether breakglass credentials — emergency credentials that can bypass normal access controls when standard access control list services are down — actually worked to gain emergency access to the corporate and production networks. The DiRT team simultaneously looped in the signals detection team. When engineers engaged the breakglass procedure, the detection team was able to confirm that the correct alert fired and that the access request was recognized as legitimate. This combined test validated both the reliability of the emergency access path and the integrity of the security alerting system in a single exercise — demonstrating that reliability and security testing can be designed to reinforce each other rather than running as separate programs.
### Step 9 — Establish a feedback loop
Testing without feedback is entertainment. After every test and every live incident:
- Measure responses: track time taken at each stage of the response to identify corrective measures
- Write blameless postmortems focused on process and system improvement
- Create feedback loops for improving existing plans and developing new ones
- Collect artifacts from exercises and feed gaps back into signal detection
- Save logs and relevant materials from security exercises for forensic analysis
- Evaluate even "failed" tests — what partial successes can you build on?
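Measuring time taken at each stage needs only a timestamped list of stage transitions. The helper below is a small sketch. The stage names in the usage example are assumptions, not a prescribed lifecycle.

```python
from datetime import datetime, timedelta


def stage_durations(events):
    """Given chronological (stage, timestamp) pairs, return time spent in each stage.

    The final event only marks the end of the previous stage, so it has no entry.
    """
    return {
        name: next_time - start
        for (name, start), (_, next_time) in zip(events, events[1:])
    }


timeline = [
    ("detected", datetime(2024, 1, 1, 12, 0)),
    ("acknowledged", datetime(2024, 1, 1, 12, 4)),
    ("mitigated", datetime(2024, 1, 1, 12, 34)),
    ("resolved", datetime(2024, 1, 1, 14, 0)),
]
durations = stage_durations(timeline)
```

Comparing these durations against the operating parameters from Step 5 shows which stage most needs corrective measures.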
## Key Principles
**Severity is fixed; priority is variable.** Confusing the two causes teams to treat the same incident at different operational tempos. Severity describes what happened; priority describes how fast to respond right now.
**Roles are not individuals.** One person can hold multiple roles. The goal is to ensure every role is explicitly assigned, not that every role maps to a unique person.
**Communication plans must survive compromise of primary channels.** Design backup communication methods before an incident, not during one when an adversary may be monitoring your primary channels.
**Testing has diminishing returns when it stays comfortable.** Automated audits catch configuration drift; tabletops build decision-making muscle memory; fault injection and disaster recovery tests expose the gaps that neither audit nor simulation reveals. All three tiers are necessary.
**The risk register drives severity calibration.** High P×I scenarios from `disaster-risk-assessment` should map to severity 0 and 1 — this connects the abstract risk model to the operational response model.
**Training extends beyond the IR team.** Any engineer who may encounter an incident first is part of the response system. Train them on escalation criteria and time limits (15-minute window before escalating) so the IR team is engaged at the right time.
## References
- *Building Secure and Reliable Systems* (Adkins, Beyer, Blankinship, et al., Google/O'Reilly, 2020)
- Chapter 16 "Disaster Planning" — pp. 367–385
- "Setting Up an Incident Response Team" (pp. 367–373): staffing models, role catalog, team charter, severity/priority models, operating parameters, response plan structure, playbook guidance
- "Prestaging Systems and People Before an Incident" (pp. 373–376): configuring systems, training, processes and procedures
- "Testing Systems and Response Plans" (pp. 376–382): automated auditing, tabletop exercises (design parameters pp. 378–379), production environment testing (pp. 379–381), evaluating responses (p. 382)
- "Google Examples" (pp. 383–385): disaster recovery test (DiRT) breakglass + security alerting combined test (p. 384)
- Depends on: `disaster-risk-assessment` (risk register feeds severity model calibration)
- Related: `incident-command` (IMAG framework, IC/OL/CL/RL role execution detail)
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Design DoS-resistant systems or respond to an active denial-of-service attack. Use this skill when designing a new service and want to evaluate its attack su...
---
name: dos-defense-and-mitigation
description: Design DoS-resistant systems or respond to an active denial-of-service attack. Use this skill when designing a new service and want to evaluate its attack surface and build in layered defenses, assessing whether a production system's architecture is DoS-hardened, investigating a traffic spike to determine whether it is an attack or a self-inflicted surge, detecting a client retry storm and needing to apply backoff and jitter fixes, building or reviewing a DoS mitigation system (detection + response pipeline), or deciding how to respond strategically to an ongoing attack without leaking information about your defenses to the adversary.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/building-secure-and-reliable-systems/skills/dos-defense-and-mitigation
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: building-secure-and-reliable-systems
title: "Building Secure and Reliable Systems"
authors: ["Heather Adkins", "Betsy Beyer", "Paul Blankinship", "Piotr Lewandowski", "Ana Oprea", "Adam Stubblefield"]
chapters: [10]
tags: [security, reliability, dos-mitigation, load-management, resilience]
depends-on: [resilience-and-blast-radius-design]
execution:
tier: 2
mode: full
inputs:
- type: context
description: "System architecture description, traffic anomaly report, incident alert, or design proposal. For active attacks: include available traffic telemetry (request rates, source IPs, User-Agent distribution, affected endpoints)."
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Works from a system description or active incident context. Output: DoS defense assessment with architectural recommendations and mitigation playbook, OR active incident response plan."
discovery:
goal: "Produce a DoS defense assessment that identifies attack surface weaknesses, evaluates architectural and service-layer defenses, designs monitoring and graceful degradation behavior, and delivers a strategic response plan — including self-inflicted surge detection."
tasks:
- "Model the attacker's strategy: identify the weakest link in the request dependency chain"
- "Audit layered defense coverage across shared infrastructure (edge, network LB, application LB)"
- "Evaluate service-level design for caching, request minimization, and egress bandwidth controls"
- "Define monitoring and alerting thresholds that minimize false positives and maximize actionability"
- "Design graceful degradation modes and quality-of-service priorities for attack conditions"
- "Design or review the DoS mitigation system (detection + response pipeline, fail-static behavior)"
- "Plan a strategic response that avoids leaking defense fingerprints to the adversary"
- "Detect and fix self-inflicted surges: client retry storms and organic traffic misidentification"
audience: "engineers, SREs, security engineers, and architects at intermediate-to-advanced level"
when_to_use: "When designing a new service for DoS resilience, reviewing a system's attack posture, responding to an active traffic anomaly, fixing client retry behavior, or building a DoS mitigation system"
environment: "System architecture description or active incident telemetry. Knowledge of service capacity and traffic baselines is needed for monitoring thresholds."
quality: placeholder
---
# DoS Defense and Mitigation
## When to Use
Apply this skill when:
- Designing a new service and evaluating its attack surface to choose cost-effective layered defenses
- Auditing an existing system's DoS posture before a launch or after an incident
- Responding to an active traffic anomaly — including determining whether it is an attack or a legitimate surge
- Clients are caught in retry loops that compound a server outage into a denial of service
- Building or reviewing an automated DoS mitigation system (detection + response pipeline)
- Deciding how to respond to an attack without teaching the attacker how your defenses work
**The core economic model:** A DoS attack is a supply-and-demand imbalance. The adversary drives demand above your supply capacity. Simply absorbing all attacks by overprovisioning is rarely the most cost-effective approach. Instead, eliminate as much attack traffic as possible at each successive layer, so the expensive inner layers only have to handle what the cheap outer layers could not stop.
**Dependency note:** This skill builds on load shedding, throttling, and graceful degradation concepts from `resilience-and-blast-radius-design`. If those mechanisms are not yet defined for the system under review, complete that skill first.
Before starting, confirm you have:
- A description of the system's architecture or active traffic telemetry
- Traffic baselines (normal request rate, peak request rate, request cost distribution)
- Ownership of the affected service layers, or contacts for layers you cannot change
---
## Context and Input Gathering
### Required Context
- **System description or traffic telemetry:** Architecture diagram or prose describing how requests flow through the system, or — for active incidents — current traffic metrics, source IP distribution, and User-Agent samples.
- **Capacity baseline:** What is the normal request rate? What rate causes degradation? What rate causes outright failure?
- **Service dependency chain:** What external services does this service depend on? (DNS, databases, third-party APIs)
### Observable Context
If a system description is provided, scan for:
- A single constrained resource (network bandwidth, CPU, memory, a database backend) that an attacker could saturate
- Missing defenses at one or more layers: edge, network load balancer, application load balancer, service frontend
- No caching proxy between the edge and the application backend
- Client libraries that retry on error without exponential backoff or jitter
- No request rate monitoring at the service level — only CPU or memory
For active traffic anomalies, additionally scan for:
- Requests all sharing the same User-Agent, character set, or access pattern
- Traffic bursts correlated with an external event (news story, earthquake, live broadcast)
- Traffic originating from a geographically concentrated set of IPs or ASNs
- DNS retry rate spiking to 30x normal, a classic sign of a recursive DNS retry storm
### Default Assumptions
- If the system has no existing DoS defenses: start with shared infrastructure defenses before service-layer changes — they are cheaper and protect multiple services at once
- If the traffic anomaly is ambiguous (attack or legitimate): do NOT immediately block; investigate the request distribution before taking action
- If the DoS mitigation system does not exist: design it to fail static (policy unchanged on controller failure) rather than fail open or fail closed
- If client retry behavior is uncontrolled: treat exponential backoff + jitter as a mandatory fix, not an optimization
### Sufficiency Check
You have enough to proceed when:
1. You can trace a request from the internet through all dependency layers to the backend
2. You know the system's capacity at each layer
3. You know whether traffic anomaly investigation is needed (active incident) or this is a design review
---
## Process
### Step 1 — Model the Attacker's Strategy
Understanding how an attacker would approach the system lets you find its weakest points before they do.
**Map the request dependency chain** for a typical user request:
1. DNS query resolves the IP address of the service
2. Network carries the request to service frontends
3. Service frontends interpret and route the request
4. Service backends (databases, caches, third-party APIs) generate the response
An attack that disrupts any link in this chain disrupts the service. The attacker will target the link with the lowest cost-to-disrupt.
**Attacker resource efficiency:** A sophisticated attacker will not flood with simple requests — they will generate requests that are more expensive to answer than to send. Examples: triggering search functionality, initiating sessions that exhaust connection state, or exploiting high-cost API endpoints.
**Attack types by scale requirement:**
| Attack type | How it works | Primary defense |
|---|---|---|
| Volumetric flood | Saturate bandwidth or CPU with high packet/request rate | Edge throttling, anycast dispersal |
| Amplification (DDoS) | Spoof victim IP; small requests generate large responses from third-party servers (DNS, NTP, memcache) | Router ACLs blocking UDP from abusable protocols; network-layer filtering |
| Application-layer | Legitimate-looking requests targeting expensive operations | Application-layer rate limiting; CAPTCHA challenges |
| Botnet / DDoS | Distributed attack from many machines — cannot be blocked by single-source filtering | Shared infrastructure defenses; collaboration with upstream providers |
**Threat model priority:** Use the number of machines an attacker would need to control to cause user-visible disruption as a proxy for attack cost. Prioritize defending the attack vectors that are cheapest for an adversary to mount against your specific architecture.
**Output for this step:** Dependency chain map with the weakest link annotated. Attack type assessment ranked by attacker cost-to-mount.
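The machine-count proxy from the threat model priority can be sketched as a small calculation. This is a minimal illustration, not a sizing tool: the `Link` class, the link names, and every capacity and per-machine rate below are hypothetical values invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One link in the request dependency chain (all numbers are hypothetical)."""
    name: str
    capacity_rps: float              # rate the link sustains before users notice
    attacker_rps_per_machine: float  # rate one attacking machine can generate against it

    def machines_to_disrupt(self) -> float:
        # Proxy for attack cost: how many machines the attacker must control.
        return self.capacity_rps / self.attacker_rps_per_machine


chain = [
    Link("dns", 50_000, 500),
    Link("network", 2_000_000, 1_000),
    Link("frontend", 100_000, 200),
    Link("search-backend", 5_000, 100),
]

# Cheapest link to disrupt comes first: that is the weakest link to annotate.
ranked = sorted(chain, key=Link.machines_to_disrupt)
weakest = ranked[0]
```

With these made-up figures the expensive search backend is the cheapest target, which matches the observation that attackers prefer requests that cost more to answer than to send.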
---
### Step 2 — Audit Layered Defense Coverage (Defendable Architecture)
Layered defenses eliminate attack traffic as early as possible, protecting expensive inner layers from having to absorb what cheaper outer layers can stop.
**The three-layer stack to evaluate:**
```
Internet traffic
|
[Edge routers] ← throttle high-bandwidth attacks; drop suspicious traffic via ACLs
|
[Network load balancers] ← throttle packet-flood attacks; protect application load balancers
|
[Application load balancers] ← throttle application-specific attacks; protect service frontends
|
[Service frontends]
|
[Backends / databases]
```
**For each layer, ask:**
- Does it have a mechanism to throttle or drop attack traffic before passing it downstream?
- Can it be overwhelmed by a style of attack the outer layer allows through?
- Is it stateless or stateful? Stateful components (firewalls with connection tracking) are vulnerable to state exhaustion attacks — use router ACLs instead.
**Shared infrastructure advantage:** Defenses at the network and load-balancer layers protect every service behind them. A single investment covers a broad range of services. This is the most cost-effective place to deploy defenses — do not skip it in favor of service-level-only fixes.
**Anycast for geographic distribution:** If a large DDoS targets a single datacenter, anycast routing automatically disperses traffic across all locations announcing the same IP address. No reactive system is needed — traffic is naturally absorbed across the global footprint.
**Caching proxies near the edge:** Deploy caching proxies close to the edge with correct `Cache-Control` headers. Cached responses require zero backend processing. This reduces both attack impact and normal operating costs.
**Amplification defense:** Router ACLs that throttle or block UDP traffic from protocols used for amplification (DNS, NTP, memcache) stop reflected amplification attacks at the edge. These attacks are identifiable by their well-known source ports.
**Output for this step:** Layer-by-layer defense inventory table. Mark each layer: defended / partially defended / undefended. Identify the first undefended layer that attack traffic reaches.
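The amplification defense relies on reflected traffic arriving from well-known source ports. A rough sketch of that classification, assuming the standard ports for DNS (53), NTP (123), and memcached (11211); the function name and the choice to return a protocol label are illustrative assumptions, and a real ACL would throttle rather than drop legitimate DNS responses:

```python
from typing import Optional

# Well-known UDP source ports of protocols commonly abused for reflection.
AMPLIFICATION_SOURCE_PORTS = {53: "DNS", 123: "NTP", 11211: "memcached"}


def reflection_protocol(protocol: str, source_port: int) -> Optional[str]:
    """Return the abusable protocol name if the packet looks like reflected
    amplification traffic (UDP from a well-known source port), else None."""
    if protocol.lower() != "udp":
        return None
    return AMPLIFICATION_SOURCE_PORTS.get(source_port)
```

Packets that match are candidates for edge throttling; the decision of how hard to throttle each protocol is left to the router policy.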
---
### Step 3 — Evaluate Service-Level Design (Defendable Services)
Service and application design choices have a significant impact on how well a service survives a DoS attack — and how much it costs to run in normal operation.
**Three design levers to evaluate:**
**Caching proxies (highest impact)**
- Use `Cache-Control` and related headers so proxy servers can serve repeated requests without hitting the application backend
- Applies to static images, CSS, JavaScript, and often the homepage itself
- Why: every cached response served at the edge is one less request reaching the backend; this reduces DoS impact proportionally to cache hit rate
**Minimize application requests**
- Reduce the number of requests a page requires: combine multiple small images into a single sprite; serve bundled assets
- Fewer legitimate requests per session = cleaner signal for anomaly detection (bots making many more requests than real users stand out more clearly)
- Why: each request consumes server resources; reducing the baseline consumption increases the margin available to absorb attack traffic
**Minimize egress bandwidth**
- Resize images to the minimum size needed for display; serve appropriately compressed formats
- Rate-limit or deprioritize responses to requests for unavoidably large resources
- Why: while most attacks target ingress bandwidth, egress saturation attacks (requesting a large resource repeatedly) are possible; minimizing response sizes limits exposure
**Output for this step:** Service-level defense checklist with gap annotations. Estimate the cache hit rate for the highest-traffic endpoints.
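The relationship between cache hit rate and backend exposure can be made concrete with a one-line estimate. This is a simplified sketch that assumes every cache hit is served entirely at the proxy; the function name and the traffic figures are hypothetical.

```python
def backend_rps(total_rps: float, cache_hit_rate: float) -> float:
    """Estimate the request rate that still reaches the backend.

    Assumes cached responses need no backend work at all.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return total_rps * (1.0 - cache_hit_rate)


# A hypothetical 10,000 rps flood against an endpoint with a 75% hit rate.
residual = backend_rps(10_000, 0.75)
```

Use the same estimate with the measured hit rate of each high-traffic endpoint to see which ones leave the backend most exposed.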
---
### Step 4 — Define Monitoring and Alerting
Outage resolution time is dominated by mean time to detection (MTTD) and mean time to repair (MTTR). A DoS attack may appear as a spike in CPU, memory exhaustion, or error rate — not obviously as a traffic anomaly — unless request-rate monitoring is in place.
**Minimum monitoring requirements:**
- Request rate per endpoint (not just aggregate) — attack traffic often concentrates on specific endpoints
- CPU utilization and memory usage at service frontends
- Network bandwidth in and out at each layer
- DNS query rate (for detecting recursive retry storms — a 30x spike is a strong signal)
- Syncookie trigger rate (synflood indicator)
**Alerting principles:**
- Alert when demand exceeds service capacity AND automated DoS defenses have engaged, not before. Pre-capacity alerts create noise and pull responders into attacks that the automated defenses would have handled without human intervention.
- For network-layer attacks: alert only if a link becomes saturated, not for all high-bandwidth events
- For synflood: alert if syncookies are triggered, not for all new connection attempts
- Do NOT page on request rate alone if the service is still healthy — distinguish "high traffic" from "service degradation"
**Why this matters:** Noisy alerts that fire before human action is required train teams to ignore pages. Alert only when human intervention may actually change the outcome.
**Output for this step:** Monitoring metric list with alert thresholds and escalation conditions. Flag any layer with no request-rate visibility.
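The alerting principles above reduce to a small predicate. The sketch below is one possible encoding, assuming the monitoring system already exposes these five signals; the `Signals` fields and `should_page` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Snapshot of monitoring signals (field names are illustrative)."""
    demand_rps: float
    capacity_rps: float
    defenses_engaged: bool
    link_saturated: bool
    syncookies_triggered: bool


def should_page(s: Signals) -> bool:
    # Page on demand only when it exceeds capacity AND defenses have engaged.
    over_capacity = s.demand_rps > s.capacity_rps and s.defenses_engaged
    # Network layer: page on a saturated link. Synflood: page on syncookies.
    return over_capacity or s.link_saturated or s.syncookies_triggered
```

High request rate alone never pages here, which keeps "high traffic" separate from "service degradation".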
---
### Step 5 — Design Graceful Degradation Under Attack
When absorbing an attack is not feasible, the goal is to reduce user-facing impact to the minimum. This step relies on the load shedding and throttling mechanisms defined in `resilience-and-blast-radius-design`.
**Throttle, do not block outright:**
- Use network ACLs to throttle (not hard-block) suspicious traffic during an active attack
- Retain visibility into throttled traffic so you can identify legitimate users caught in the filter and adjust
- Hard-blocking removes your visibility into the attack and risks impacting legitimate users who share an IP or network path with attackers
**Quality-of-service (QoS) prioritization:**
- Assign higher QoS to critical user-facing traffic and security-critical operations
- Deprioritize batch copies, background sync, and other low-value traffic during attack conditions
- Released bandwidth from lower-priority queues becomes available to high-priority traffic
**Application degraded modes:**
- Define explicit degraded operating modes ahead of time — not during an incident
- Examples from Google production:
- Blogger: serve read-only mode, disable comments
- Web Search: serve reduced feature set (disable spelling correction, related search)
- DNS: answer as many queries as possible; designed to never crash under any load
**CAPTCHA as a mitigation bridge:**
- Automated defenses (IP throttling, CAPTCHA challenges) provide immediate mitigation while giving the incident response team time to analyze and design a custom response
- CAPTCHA challenges should issue browser cookies with a long-term exemption to avoid repeatedly challenging legitimate users
- Exemption cookies should contain: pseudo-anonymous identifier, challenge type, timestamp, solving IP address, and a signature (to prevent forgery and botnet sharing)
- False positives are unavoidable when blocking by IP — NAT and shared addresses are common. CAPTCHA is the lowest-friction way to allow legitimate users behind a blocked address to bypass the block
**Output for this step:** Degraded mode definitions per service component. QoS priority assignments for critical traffic. CAPTCHA/challenge strategy if applicable.
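The exemption cookie fields listed above can be illustrated with a signed token. This is a minimal sketch using an HMAC from the Python standard library; the key, the one-week lifetime, the field names, and the `issue_cookie` / `verify_cookie` helpers are all assumptions for illustration, not the format used by any production system.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical signing key
MAX_AGE_SECONDS = 7 * 24 * 3600             # assumed exemption lifetime


def _sign(body: str) -> str:
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()


def issue_cookie(challenge_type: str, solver_ip: str, now=None) -> str:
    payload = {
        "id": secrets.token_hex(8),   # pseudo-anonymous identifier
        "challenge": challenge_type,  # which challenge was solved
        "ts": int(time.time() if now is None else now),
        "ip": solver_ip,              # binds the exemption to the solving address
    }
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    return body + "." + _sign(body)


def verify_cookie(cookie: str, client_ip: str, now=None) -> bool:
    body, sep, signature = cookie.rpartition(".")
    if not sep:
        return False
    # Constant-time comparison rejects forged or altered cookies.
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        return False
    payload = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    fresh = 0 <= current - payload["ts"] <= MAX_AGE_SECONDS
    return fresh and payload["ip"] == client_ip
```

Checking the IP at verification time is what stops a botnet from sharing one solved challenge across many machines; the timestamp lets old exemptions expire.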
---
### Step 6 — Design or Review the DoS Mitigation System
An automated DoS mitigation system provides fast, consistent response that does not depend on human reaction time. It must be designed to handle its own failure modes safely.
**Two required components:**
**Detection:**
- Statistical traffic sampling at all endpoints, aggregated to a central control system
- Control system identifies anomalies (traffic volume, distribution, request patterns) that may indicate attacks
- Works in conjunction with load balancers that understand service capacity — so the system can determine whether traffic volume warrants a response
- Requires sampling, not full logging — at attack volumes, full logging is itself a resource exhaustion risk
**Response:**
- Ability to implement a defense mechanism against a detected anomaly, most commonly by providing a set of IP addresses or traffic patterns to block or challenge
- Response must be fast: seconds, not minutes. Attacks cause immediate outages; the mitigation system must respond at machine speed
**Failure mode design:**
- **Fail static:** If the controller fails, the policy does not change. This allows the system to survive an attack that also targets the control plane — a real risk when the control plane uses the same infrastructure. Fail static is preferable to fail open (attack traffic flows) or fail closed (all traffic blocked, service outage).
- **Canary deployment:** Apply new automated responses to a subset of production infrastructure before deploying everywhere. Because attacks cause immediate outages, the canary window can be very short — as little as 1 second — but it must exist to guard against configuration errors.
- **Resilient control plane:** The DoS mitigation system itself must not depend on infrastructure that may be impacted by the attack. This extends to the incident response team's communication tools — if Slack or Gmail are under attack, have backup communication channels and playbook storage.
**Output for this step:** Detection + response component design. Failure mode policy (fail static confirmed). Canary deployment plan for automated responses.
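Fail-static behavior can be shown with an enforcement point that keeps its last known policy whenever the controller is unreachable. The sketch below is a simplified assumption of how such a component might look; `PolicyEnforcer`, `refresh`, and the block-list representation are hypothetical names.

```python
class PolicyEnforcer:
    """Enforcement point that fails static when the controller is unavailable."""

    def __init__(self, initial_policy):
        self._blocked = frozenset(initial_policy)

    def refresh(self, fetch_policy):
        """Try to pull a new block list; keep the current one on any failure."""
        try:
            self._blocked = frozenset(fetch_policy())
        except Exception:
            # Fail static: neither clear the list (fail open)
            # nor block everything (fail closed).
            pass
        return self._blocked

    def is_blocked(self, ip: str) -> bool:
        return ip in self._blocked
```

A short canary step would sit in front of `refresh` in practice, applying the fetched policy to a subset of enforcers before the rest.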
---
### Step 7 — Plan a Strategic Response
Responding purely reactively — filtering the attack traffic signature immediately — teaches the adversary what your defenses can see. A strategic response exploits the adversary's uncertainty about your capabilities.
**Do not expose your detection method:**
Example: An attack arrived with `User-Agent: I AM BOT NET`. Rather than dropping all traffic with that string (which would teach the attacker to change User-Agents), enumerate the IPs sending that traffic and intercept all of their requests with CAPTCHAs — including any future requests, even with a changed User-Agent. This blocked the botnet's A/B testing capability.
**Adversary capability inference:**
- Small amplification attack → attacker likely has a single server, limited to spoofed packets
- HTTP DDoS fetching the same page repeatedly → attacker likely has a botnet
- Traffic that looks exactly like legitimate users → attacker may be a scraper, not a volumetric attacker
**When the adversary may not be an attacker:**
- Unexpected traffic that matches real user behavior (browser distribution, geographic location) may be legitimate surge, not an attack
- Before applying adversarial defenses, verify the traffic profile against the "self-inflicted attack" patterns in Step 8
**Collaboration and escalation:**
- DoS mitigation providers can scrub certain traffic types upstream, before it reaches your infrastructure
- Network providers can perform upstream filtering closer to the attack source
- Network operator communities can coordinate to identify and filter attack sources
- Do not treat DoS defense as a problem to solve alone at the service layer
**Output for this step:** Response plan that avoids revealing the detection method. Adversary capability assessment. Escalation path to upstream providers if needed.
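The User-Agent example in this step separates detection (match the signature) from response (challenge the source). A rough sketch of that separation, assuming requests are available as dictionaries with `ip` and `user_agent` keys; the helper names and sample addresses are hypothetical.

```python
def enumerate_sources(requests, signature):
    """Collect the IPs that sent traffic matching the attack signature."""
    return {r["ip"] for r in requests if r["user_agent"] == signature}


def action_for(request, flagged_ips):
    # Respond by source, not by signature, so a changed User-Agent does not
    # reveal which signal the defense keyed on.
    return "captcha" if request["ip"] in flagged_ips else "allow"


observed = [
    {"ip": "192.0.2.1", "user_agent": "I AM BOT NET"},
    {"ip": "192.0.2.2", "user_agent": "Mozilla/5.0"},
]
flagged = enumerate_sources(observed, "I AM BOT NET")
```

A later request from `192.0.2.1` is still challenged after it switches to a browser-like User-Agent, while `192.0.2.2` is unaffected.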
---
### Step 8 — Detect and Fix Self-Inflicted Surges
Not all traffic spikes are attacks. During an incident, the natural instinct is to look for an adversary — but a self-inflicted surge can look identical to a volumetric attack and will be worsened by adversarial countermeasures.
**Two categories of self-inflicted surge:**
**Organic traffic surge (synchronized user behavior):**
- Cause: an external event synchronizes user actions — a natural disaster (earthquake, storm) drives users to check news, social media, or safety services simultaneously; a live broadcast or game show drives viewers to interact with a service in unison
- Signal: traffic matches real user browser distribution, geographic distribution, and request patterns
- Key example: In 2009, Google Web Search received a burst of traffic for German words with identical character prefixes, arriving in three waves. SREs initially suspected a botnet conducting a dictionary attack. Investigation revealed the traffic matched real German browsers from Germany. Root cause: a televised game show challenged contestants to find word completions that returned the most Google search results. Viewers at home played along simultaneously. The "attack" was addressed with a design change — launching a word-completion suggestion feature — rather than a security response.
- Response: design changes that reduce the demand spike (precomputation, caching, autocompletion); NOT rate limiting or blocking
**Client retry storm (misbehaving software):**
- Cause: a server returns errors; clients retry immediately; when many clients are in the retry loop simultaneously, the resulting demand prevents the server from recovering
- This is especially severe for services like authoritative DNS servers, where recursive DNS clients controlled by other organizations retry aggressively and can reach 30x normal traffic during an outage
- Fix — exponential backoff + jitter (mandatory, not optional):
- **Exponential backoff:** double the wait period after each failed attempt (e.g., 1s → 2s → 4s → 8s). This limits the total request rate from any single client.
- **Jitter:** add a random duration to each wait period. Without jitter, all clients in the retry loop synchronize on the same retry cadence — producing synchronized bursts. Jitter de-synchronizes retries, smoothing demand into a constant low rate.
- Both are required: backoff alone does not prevent synchronization; jitter alone does not prevent escalating load from a single client
- If you do not control the client: the best response is to answer as many requests as possible while keeping the server healthy via upstream request throttling. Each successful response allows one client to escape its retry loop.
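The backoff and jitter rules above can be combined in a few lines. This is a minimal sketch, assuming a 1-second base, a 60-second cap, and jitter drawn uniformly between zero and the current backoff; those parameters and the `backoff_delays` name are illustrative choices, not values from the source.

```python
import random


def backoff_delays(attempts, base=1.0, cap=60.0, rng=random):
    """Yield one wait time per retry: exponential backoff plus random jitter."""
    for attempt in range(attempts):
        backoff = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, 8s, ... up to cap
        jitter = rng.uniform(0, backoff)         # de-synchronizes clients
        yield backoff + jitter


# Each client draws its own jitter, so retries spread out instead of bursting.
delays = list(backoff_delays(4, rng=random.Random(0)))
```

Passing a seeded `random.Random` makes the schedule reproducible in tests; production clients would use the default generator.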
**Distinguishing attack from self-inflicted surge — diagnostic checklist:**
| Signal | Suggests attack | Suggests self-inflicted surge |
|---|---|---|
| Requests match real browser/OS distribution | No | Yes |
| Requests originate from expected geographic regions | No | Yes |
| Requests target a diverse set of queries/endpoints | No (attacks are focused) | Yes |
| Traffic arrives in correlated waves | Possible (botnet scanning) | Yes (event-driven) |
| Traffic correlates with a known external event | No | Yes |
| DNS retry rate is 30x normal | No | Yes (retry storm) |
**Output for this step:** Self-inflicted surge assessment: organic event or retry storm. Fix: design change (organic) or backoff + jitter implementation (retry storm).
---
## Deliverable
Produce a DoS defense assessment report with the following sections:
1. **Attack surface map** — dependency chain with weakest link annotated; attack type assessment by attacker cost
2. **Layered defense inventory** — coverage at edge, network LB, application LB, and service layer; gaps marked
3. **Service-level design assessment** — caching, request minimization, egress bandwidth findings
4. **Monitoring and alerting plan** — metrics, thresholds, alert conditions
5. **Graceful degradation design** — degraded operating modes, QoS priorities, CAPTCHA strategy
6. **DoS mitigation system design** — detection + response architecture, fail-static policy, canary plan
7. **Strategic response plan** — response approach, adversary capability inference, escalation path
8. **Self-inflicted surge assessment** — organic event or retry storm diagnosis, fix recommendations
---
## Key Principles
### The Economics of DoS Defense
Simply absorbing attacks by overprovisioning is not cost-effective at scale. The defender's strategy is to eliminate attack traffic at each layer for the minimum cost, so the expensive inner layers see only residual attack volume. Shared infrastructure defenses (edge, network LB) are the highest-leverage investment because they protect all services at once.
### Layered Defense: Eliminate Early, Protect Deep
Each defense layer should be able to handle the attack traffic that breaches the outer layer. Defenses near the edge are cheap (bandwidth is shared); defenses deep in the stack are expensive (CPU, database connections). Drop as early as possible.
### Fail Static — Never Fail Open or Closed in the Mitigation System
A DoS mitigation controller that fails open lets attack traffic through. One that fails closed creates a self-inflicted outage. Failing static — freezing the current policy — is the correct tradeoff: the system continues functioning at whatever state it was in when the controller went down, without making things worse in either direction.
### Strategic Response: Do Not Teach the Adversary
Immediately dropping traffic matching the attack signature reveals exactly what your detection sees. Instead, respond in ways that do not fingerprint your detection method — for example, challenging all traffic from identified sources rather than only traffic matching the current attack signature. This blocks the adversary's ability to A/B test your defenses.
### Self-Inflicted Surges Require Different Responses Than Attacks
Applying rate limiting and blocking to an organic surge or a client retry storm will worsen the situation. Always verify the traffic profile before applying adversarial countermeasures. The correct response to a retry storm is to serve as many requests as possible while backoff + jitter propagates; the correct response to an organic surge is a design change that reduces demand.
### Backoff and Jitter Are Both Required
Exponential backoff without jitter still produces synchronized bursts when many clients fail simultaneously. Jitter without backoff de-synchronizes retries but does not reduce each client's retry rate, so total load stays high. Both are necessary. At Google, exponential backoff with jitter is standard in all client software.
---
## Examples
### Example 1 — Strategic Response: The User-Agent Botnet
Google received an attack where all traffic contained `User-Agent: I AM BOT NET`. Rather than dropping that User-Agent (which would immediately teach the attacker to use `User-Agent: Chrome`), SREs enumerated all IPs sending that traffic and applied CAPTCHA challenges to all of their requests — regardless of User-Agent. This prevented the attacker from using A/B testing to discover which signals the defense was keying on, and blocked future requests even after the User-Agent changed.
### Example 2 — Self-Inflicted Surge: The German TV Game Show
In 2009, Google Search received a burst of traffic for German words with identical character prefixes, arriving in three waves roughly 10 minutes apart. Initial suspicion: a botnet conducting a dictionary attack. Investigation found the traffic originated from machines in Germany and matched real browser distributions. Root cause: a televised game show challenged contestants to find word completions with the most Google search results — and viewers at home searched along. The response was a design change: adding word-completion suggestions as users type, which reduced the number of queries users submitted. No adversarial countermeasures were needed.
### Example 3 — Client Retry Storm: DNS Outage
An authoritative DNS server experiences an outage. Recursive DNS servers controlled by external organizations immediately begin retrying, escalating to 30x normal traffic. This prevents the server from recovering, because each attempted recovery is overwhelmed by the retry flood. The correct response is to serve as many requests as possible (each successful answer lets one DNS resolver escape its retry loop) while applying upstream request throttling to preserve server health. The long-term fix is for all clients to implement exponential backoff with jitter, but the server operator cannot enforce this on resolvers run by other organizations.
### Example 4 — CAPTCHA Exemption Cookie Design
When blocking by IP, legitimate users behind the same NAT as an attacker are blocked. A CAPTCHA challenge allows them to prove they are human and receive a browser-based exemption cookie. The cookie must contain: a pseudo-anonymous identifier (allows abuse detection and revocation), the challenge type (allows requiring harder challenges for suspicious behaviors), a timestamp (allows expiring old cookies), the solving IP address (prevents botnets from sharing a single exemption across many machines), and a cryptographic signature (prevents forgery).
---
## References
- Chapter 10: Mitigating Denial-of-Service Attacks, *Building Secure and Reliable Systems* (pp. 217–229)
- Chapter 8: Design for Resilience (load shedding, throttling, graceful degradation — prerequisite concepts)
- Chapter 9: Design for Recovery (recovery after a DoS causes an outage)
- *Site Reliability Engineering* book, Chapter 22: Addressing Cascading Failures (retry storms, load shedding)
- Project Shield (Google): shared DoS defense infrastructure illustrating economy-of-scale defense
- See `resilience-and-blast-radius-design` skill for load shedding and throttling implementation
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-resilience-and-blast-radius-design`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Use when you need to assess disaster risk for a system or organization, perform structured risk analysis before disaster planning, identify which disasters t...
--- name: disaster-risk-assessment description: Use when you need to assess disaster risk for a system or organization, perform structured risk analysis before disaster planning, identify which disasters to plan for, build a prioritized risk register, quantify probability and impact of failure scenarios, or answer "what disasters should we prepare for and in what order." tags: [security, reliability, disaster-planning, risk-assessment, incident-readiness] --- # Disaster Risk Assessment Produces a scored, prioritized risk register using a quantitative Probability × Impact matrix. Covers 7 disaster types across 3 themes (Environmental, Infrastructure Reliability, Security) with 18+ pre-seeded scenarios. Output drives response plan prioritization, incident response team scoping, and disaster recovery test selection. ## When to Use - Starting disaster planning for a new or existing system - Preparing for a disaster recovery test or tabletop exercise - Scoping an incident response team's charter and coverage - Evaluating how a change in infrastructure (new datacenter, cloud migration) shifts risk exposure - Conducting a per-site risk review for a multi-location organization - Revisiting a prior assessment after a significant organizational or threat environment change **Prerequisite:** Know your system's architecture and its key dependencies (networking, authentication, storage, third-party services). Risk ratings are only as good as the system inventory behind them. ## Context & Input Gathering Before scoring, establish three inputs: **1. System inventory with criticality classification** Classify every system the risk could affect into one of three tiers. This classification determines how much impact a given disaster actually has on operations. | Tier | Label | Definition | |------|-------|------------| | 1 | Mission-essential | Absence causes total operational disruption. Organization cannot function. 
| | 2 | Mission-important | Absence significantly degrades operations but does not halt them. | | 3 | Nonessential | Absence has minimal operational impact. Tolerable downtime. | Ask: which services, if offline for 24 hours, would be catastrophic (Tier 1), serious (Tier 2), or acceptable (Tier 3)? **2. Geographic and infrastructure context** Risk ratings are location-dependent. A site in Los Angeles warrants a higher earthquake probability than one in Hamburg. A site in the southeastern US warrants higher hurricane probability. A single-ISP facility warrants higher internet connectivity loss probability than one with redundant circuits. Collect: - Physical datacenter location(s) - Existing fault-tolerance controls (redundant power, redundant ISP, UPS, generators) - Known historical incidents at this site or in this region **3. Scope boundary** Decide whether you are assessing at the organizational level (global) or per site. Large organizations should do both — a site that hosts only Tier 3 systems warrants a different response plan than one hosting Tier 1 systems. ## Process ### Step 1 — Start with the pre-seeded risk taxonomy The matrix in Appendix A of *Building Secure and Reliable Systems* groups disaster scenarios into three themes. Use these as your starting point rather than an empty list. Pre-seeded scenarios prevent the common failure mode of omitting non-obvious risks (e.g., emerging zero-day vulnerabilities, insider intellectual property theft). 
**Environmental theme** (natural events that affect physical infrastructure) - Earthquake - Flood - Fire - Hurricane / severe storm **Infrastructure Reliability theme** (component and service failures) - Power outage - Loss of internet connectivity - Authentication system down - High system latency / infrastructure slowdown **Security theme** (adversarial and vulnerability-driven events) - System compromise (external attacker gaining unauthorized access) - Insider theft of intellectual property - Distributed denial-of-service (DDoS) / denial-of-service (DoS) attack - Misuse of system resources (e.g., cryptocurrency mining) - Vandalism / website defacement - Phishing attack - Software security bug - Hardware security bug - Emerging serious vulnerability (e.g., Meltdown/Spectre, Heartbleed class) Add organization-specific scenarios beyond this list. Examples: ransomware targeting backup systems, supply chain compromise of a build pipeline, regulatory action requiring emergency data deletion. ### Step 2 — Score each scenario using the P×I scales For each scenario, assign two values independently, then compute the ranking. **Probability of occurrence within a year (P)** | Value | Label | |-------|-------| | 0.0 | Almost never | | 0.2 | Unlikely | | 0.4 | Somewhat unlikely | | 0.6 | Likely | | 0.8 | Highly likely | | 1.0 | Inevitable | Score probability based on your specific location, historical data, and existing controls. A site with a generator and UPS reduces power outage probability; a site on a flood plain increases flood probability. **Impact to organization if risk occurs (I)** | Value | Label | |-------|-------| | 0.0 | Negligible | | 0.2 | Minimal | | 0.5 | Moderate | | 0.8 | Severe | | 1.0 | Critical | Score impact relative to the Tier 1/2/3 systems affected. If a disaster only affects Tier 3 systems, impact is at most Moderate. If it takes down a Tier 1 system with no failover, impact is Severe or Critical. 
**Ranking = Probability × Impact** A power outage scored P=0.6, I=0.8 produces Ranking=0.48. A hurricane at P=0.2, I=1.0 produces Ranking=0.20. Sort the completed register from highest to lowest ranking. ### Step 3 — Populate the risk register Create one row per scenario. Minimum columns: | Theme | Risk | Probability (P) | Impact (I) | Ranking (P×I) | Systems Impacted | Tier | |-------|------|-----------------|------------|---------------|------------------|------| | Environmental | Earthquake | — | — | — | — | — | | Environmental | Flood | — | — | — | — | — | | Environmental | Fire | — | — | — | — | — | | Environmental | Hurricane | — | — | — | — | — | | Infrastructure Reliability | Power outage | — | — | — | — | — | | Infrastructure Reliability | Loss of internet connectivity | — | — | — | — | — | | Infrastructure Reliability | Authentication system down | — | — | — | — | — | | Infrastructure Reliability | High system latency / infrastructure slowdown | — | — | — | — | — | | Security | System compromise | — | — | — | — | — | | Security | Insider theft of intellectual property | — | — | — | — | — | | Security | DDoS/DoS attack | — | — | — | — | — | | Security | Misuse of system resources | — | — | — | — | — | | Security | Vandalism / website defacement | — | — | — | — | — | | Security | Phishing attack | — | — | — | — | — | | Security | Software security bug | — | — | — | — | — | | Security | Hardware security bug | — | — | — | — | — | | Security | Emerging serious vulnerability | — | — | — | — | — | Fill in scores, sort by Ranking descending. ### Step 4 — Review for outliers before finalizing Sorting by ranking is a starting heuristic, not a final answer. Perform a manual outlier review: - **Low-probability, high-impact outliers:** A scenario ranked 0.10 (P=0.1, I=1.0) may still demand a response plan because the consequence is catastrophic. Flag any scenario with I=1.0 regardless of ranking. 
- **Hidden dependencies:** A seemingly low-impact risk may become critical if it disables a monitoring or logging system that other incident responses depend on.
- **Correlated risks:** An earthquake can simultaneously trigger power outage, connectivity loss, and fire. Assess whether scenarios cluster and whether the combined impact exceeds individual rankings.
- **Expert review:** Solicit review from someone outside the team who can identify risks with hidden factors or dependencies. Groupthink tends to underweight unfamiliar scenarios.

### Step 5 — Document scope, assumptions, and review cadence

Record alongside the register:

- Date of assessment
- Location(s) assessed
- Existing controls assumed (e.g., "assumes redundant ISP, UPS, and diesel generator")
- Owner responsible for next review
- Planned review cadence (minimum: annually; recommended: after any major infrastructure change or post-incident)

## Key Principles

**Quantification counters groupthink.** Intuitive risk assessment tends to weight salient scenarios (recent news events, memorable near-misses) over statistically more likely ones. A scored matrix forces explicit probability and impact estimates, making invisible assumptions visible and debatable.

**Probability is infrastructure-dependent, not universal.** A cloud-hosted system with multi-region failover has a different authentication system downtime probability than a single on-premises deployment. Score after accounting for existing controls — but also model what happens if a control fails.

**Ratings must evolve with the system.** Risk posture changes when the organization adds redundant internet circuits, migrates to a different cloud region, or discovers a new vulnerability class. Schedule reviews; do not treat the register as a one-time artifact.

**Low probability does not mean no plan.** Scenarios with I=0.8 or I=1.0 warrant response plans even if their ranking is low.
The ranking guides where to invest preparation effort first, not which risks to ignore entirely.

**Assess dependencies alongside primary systems.** Key operational functions include their underlying dependencies — networking, authentication, application-layer components. A mission-essential service that depends on a Tier 3 authentication system effectively elevates that dependency to Tier 1 during an incident.

**Multi-location organizations need per-site assessments.** Global rankings mask site-specific exposure. A site in earthquake country has different environmental risk than headquarters. Run the matrix per site and aggregate.

## Examples

**Example: SaaS company, single US West Coast datacenter, no redundant power**

| Theme | Risk | P | I | Ranking | Systems Impacted |
|-------|------|---|---|---------|-----------------|
| Security | System compromise | 0.6 | 1.0 | 0.60 | Auth service (T1), API (T1) |
| Infrastructure | Power outage | 0.6 | 0.8 | 0.48 | All systems |
| Security | Software security bug | 0.6 | 0.8 | 0.48 | API (T1) |
| Security | Phishing attack | 0.8 | 0.5 | 0.40 | Email (T2), SSO (T1) |
| Infrastructure | Loss of internet connectivity | 0.4 | 1.0 | 0.40 | All externally facing (T1) |
| Security | DDoS/DoS attack | 0.4 | 0.8 | 0.32 | API (T1) |
| Environmental | Earthquake | 0.4 | 0.8 | 0.32 | All systems |
| Security | Emerging serious vulnerability | 0.2 | 1.0 | 0.20 | All systems |
| Environmental | Flood | 0.2 | 0.5 | 0.10 | On-prem equipment (T2) |

**Outlier flag:** Emerging serious vulnerability ranks 0.20 but Impact=1.0. Flag for mandatory response plan despite low ranking. Earthquake and internet connectivity loss are correlated — their combined impact may be higher than either alone.

**Example: Adjusting for existing controls**

After adding a backup ISP: Loss of internet connectivity drops from P=0.4 to P=0.2, Ranking drops from 0.40 to 0.20.
After adding UPS and generator: Power outage drops from P=0.6 to P=0.2, Ranking drops from 0.48 to 0.16. Re-run the matrix when controls change to confirm prioritization remains valid.

## References

- *Building Secure and Reliable Systems* (Adkins, Beyer, Blankinship, Lewandowski, Oprea, and Stubblefield, Google/O'Reilly, 2020)
- Chapter 16 "Disaster Planning" — pp. 363–382: disaster type taxonomy (p. 364), disaster risk analysis methodology (p. 366), system criticality classification (p. 366), dynamic response strategy phases (p. 365)
- Appendix A "A Disaster Risk Assessment Matrix" — pp. 499–500: Table A-1 with full probability scale, impact scale, pre-seeded scenario taxonomy, and Ranking = P×I formula
- Next steps after completing the register: incident response team setup (Chapter 16, pp. 367–375), response plan development (pp. 371–373), disaster recovery test planning (pp. 376–382)

## License

This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/). Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — *Building Secure and Reliable Systems* by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, and Adam Stubblefield.

## Related BookForge Skills

This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Profile the likely adversaries targeting a system and produce a structured threat model with prioritized threat scenarios. Use when: designing a new system o...
---
name: adversary-profiling-and-threat-modeling
description: |
Profile the likely adversaries targeting a system and produce a structured threat model with prioritized threat scenarios. Use when: designing a new system or service and need to identify who might attack it and how; evaluating whether existing security controls address the right threats; preparing a threat model document for a security review, compliance audit, or architecture decision record; assessing insider risk for a system that handles sensitive data or privileged operations; or mapping attack lifecycle stages to defensive controls. Applies the three adversary frameworks — attacker motivations, attacker profiles, and attack lifecycle stages — alongside a four-dimension actor-motive-action-target threat scenario matrix to produce ranked threat scenarios. Distinct from vulnerability assessment (which audits specific technical flaws) and penetration testing (which actively exploits weaknesses). Produces: adversary profile summary, insider risk matrix, threat scenario list ranked by likelihood and impact, and per-stage defensive control recommendations.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/building-secure-and-reliable-systems/skills/adversary-profiling-and-threat-modeling
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
depends-on: []
source-books:
- id: building-secure-and-reliable-systems
title: "Building Secure and Reliable Systems"
authors: ["Heather Adkins", "Betsy Beyer", "Paul Blankinship", "Piotr Lewandowski", "Ana Oprea", "Adam Stubblefield"]
chapters: [2]
tags:
- security
- threat-modeling
- adversary-analysis
- risk-assessment
- insider-threat
- attack-lifecycle
- kill-chain
- threat-intelligence
- security-design
- nation-state
- criminal-actors
- hacktivism
- vulnerability-research
- tactics-techniques-procedures
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "System design document, architecture diagram, or description of the system being modeled — including what data it stores, what services it exposes, and who uses it"
- type: codebase
description: "Optional: existing codebase, infrastructure config, or security policies to enrich the threat model with concrete system context"
tools-required: [Read, Write]
tools-optional: [Grep, Bash]
mcps-required: []
environment: "Runs in conversation or project context. Works from a system description provided by the user. Produces a threat model document."
discovery:
goal: "Produce a structured threat model: adversary profile summary, insider risk matrix, prioritized threat scenario list, and per-stage defensive control recommendations"
tasks:
- "Identify the system's assets, data types, and exposure surface"
- "Apply motivation framework to determine which attacker motivations are relevant"
- "Match system profile to attacker profiles and assess each profile's likelihood"
- "Map insider categories (first-party, third-party, related) and generate insider threat scenarios using the actor-motive-action-target matrix"
- "Build external threat scenarios using the same four-dimension matrix"
- "Plot prioritized scenarios against attack lifecycle stages"
- "Recommend per-stage defensive controls"
audience:
roles: ["software-engineer", "site-reliability-engineer", "security-engineer", "architect", "tech-lead"]
experience: "intermediate-to-advanced — assumes familiarity with system design but not necessarily with formal threat modeling"
triggers:
- "User is designing a new system and wants to identify who might attack it"
- "User needs a threat model document for a security review or compliance process"
- "User wants to understand what defenses to prioritize given their likely adversaries"
- "User is assessing insider risk for a system with privileged access or sensitive data"
- "User wants to map their system's threat landscape before designing security controls"
- "User is preparing for a red team exercise or penetration test scope definition"
not_for:
- "Auditing specific technical vulnerabilities in code — use a vulnerability assessment process"
- "Actively exploiting or testing defenses — use a penetration testing skill or red team process"
- "Responding to an active incident — use an incident response skill"
---
# Adversary Profiling and Threat Modeling
## When to Use
You are helping a user build, design, or secure a system and need to produce a structured threat model — a document that identifies who is likely to attack the system, what motivates them, how they would proceed, and what defenses to prioritize.
This skill applies when the threat landscape is undefined (new system), underexamined (inherited system with no documented threats), or needs to be formalized (security review, compliance, architecture decision). It produces a concrete threat model document with ranked threat scenarios, not a general discussion of security concepts.
Threat modeling complements, but does not replace, vulnerability assessment (which finds specific technical flaws) and penetration testing (which validates exploitability). Run this skill first, then use the outputs to scope those activities.
---
## Context & Input Gathering
### Required Context (must have — ask if missing)
**1. What does the system do and what data does it handle?**
Why: The system's purpose and data profile are the primary signals for which attacker motivations apply. A payment processor attracts financially motivated criminals. A government contractor attracts nation-state actors. A messaging platform with political users attracts hacktivists. A system with no sensitive data and low public profile may realistically attract only hobbyists and automated scanners. Without this, the threat model cannot be grounded.
- Check prompt or environment for: system description, architecture docs, README, data classification labels, privacy policy, API surface description
- If missing, ask: "What does this system do, what data does it store or process, and who uses it? For example: a financial transaction API storing customer payment data, used by 500k retail customers."
**2. Who has privileged access to the system (insiders)?**
Why: Insider threats are among the most impactful risk categories because insiders already have access, so they bypass the entry stage of an attack entirely. Every system has insiders. The categories are: first-party (employees, interns, executives, board), third-party (contractors, vendors, open-source contributors, API partners), and related (family members, roommates, household members with physical access to devices). Skipping insider modeling produces an incomplete threat model.
- Check prompt or environment for: org chart references, team descriptions, third-party integrations, open-source contribution policy, remote work / work-from-home context
- If missing, ask: "Who has trusted access to this system or its data? This includes employees, contractors, third-party vendors with API access, and open-source contributors if applicable."
**3. Is this system or organization potentially a target of interest to sophisticated actors?**
Why: Determines whether nation-state adversaries belong in the threat model. Signals include: processing data that intelligence agencies value (communications, location, financial), supplying technology used by governments or militaries, operating in a regulated or politically sensitive sector, or being a supplier to a higher-value target. Organizations often do not realize they are attractive targets — a fitness app may reveal military base locations; a software vendor may be targeted to reach its downstream customers.
- Check prompt or environment for: customer or partner descriptions, industry sector, any government or defense relationships, supply chain position
- If missing, ask: "Does your organization handle data that a government or intelligence agency would value (user communications, location data, financial records)? Do you supply technology to government or defense customers, or to other companies that do?"
### Optional Context (enriches the model)
- **Known security incidents or near-misses:** Existing incident history is the strongest signal of which attacker profiles are already active.
- **Regulatory or compliance requirements:** GDPR, SOC 2, HIPAA, PCI-DSS often mandate specific threat categories be addressed.
- **Current security posture:** Existing controls (MFA, logging, access reviews) determine which attack lifecycle stages are already partially defended.
- **Public exposure:** Is the system internet-facing? Does it have a public bug bounty program? High public exposure attracts hobbyists and vulnerability researchers.
---
## Process
### Step 1 — Profile the System's Attractiveness to Each Attacker Motivation
Assess the system against all eight attacker motivations to determine which are plausible.
**Why:** Attackers are primarily human and their actions are goal-directed. Knowing which motivations apply eliminates implausible threat actors and focuses the model on realistic scenarios. A hobbyist is unlikely to target an obscure internal tool; a nation-state actor is unlikely to target a small consumer app with no sensitive data or geopolitical relevance. Eliminating implausible motivations prevents the threat model from becoming so broad it is useless.
Work through each motivation and rate it: **High** (a primary attacker motivation given this system), **Medium** (plausible but not primary), or **Not applicable**.
| Motivation | What it drives attackers to do | Assess for this system |
|---|---|---|
| **Fun** | Undermine security for the challenge of it | Public-facing systems with any technical complexity; hobbyists; low-barrier automated scanning |
| **Fame** | Gain notoriety by demonstrating technical skill | Systems where a breach would be publicly visible or embarrassing |
| **Activism** | Make a political statement; disrupt or deface | Systems operated by organizations with political opponents or controversial products/customers |
| **Financial gain** | Steal money, data for sale, or enable fraud | Any system handling payments, credentials, PII, or data with resale value |
| **Coercion** | Force the victim to act against their interest | Systems where disruption (ransomware, DDoS) is severely damaging and payment is preferable to downtime |
| **Manipulation** | Spread misinformation; alter data or behavior | Systems that publish content, display search results, or make automated decisions at scale |
| **Espionage** | Steal secrets; long-term persistent access for intelligence | Systems with proprietary IP, research data, user communications, or government relationships |
| **Destruction** | Sabotage; data deletion; taking the system offline | Critical infrastructure, competitors, or systems operated by organizations with powerful enemies |
**Output of this step:** A table listing each motivation with a High/Medium/Not Applicable rating and a one-sentence justification.
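The Step 1 output can be captured as a small data structure so that no motivation is silently skipped. The following is a minimal sketch, not a prescribed format: the names `Rating`, `MotivationAssessment`, `validate`, and `to_markdown` are hypothetical, and the only facts taken from this skill are the eight motivations and the three-value rating scale.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

# The eight motivations from the table above, in table order.
MOTIVATIONS = [
    "Fun", "Fame", "Activism", "Financial gain",
    "Coercion", "Manipulation", "Espionage", "Destruction",
]


class Rating(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    NOT_APPLICABLE = "Not applicable"


@dataclass(frozen=True)
class MotivationAssessment:
    """One row of the Step 1 output table (hypothetical structure)."""
    motivation: str
    rating: Rating
    justification: str


def validate(assessments: List[MotivationAssessment]) -> None:
    """Raise ValueError unless every motivation is rated exactly once."""
    seen = [a.motivation for a in assessments]
    missing = [m for m in MOTIVATIONS if m not in seen]
    unknown = [m for m in seen if m not in MOTIVATIONS]
    duplicates = sorted({m for m in seen if seen.count(m) > 1})
    if missing or unknown or duplicates:
        raise ValueError(
            f"missing={missing} unknown={unknown} duplicates={duplicates}"
        )
    for a in assessments:
        if not a.justification.strip():
            raise ValueError(f"{a.motivation}: justification is required")


def to_markdown(assessments: List[MotivationAssessment]) -> str:
    """Render validated assessments as a Markdown table in table order."""
    validate(assessments)
    order = {m: i for i, m in enumerate(MOTIVATIONS)}
    rows = sorted(assessments, key=lambda a: order[a.motivation])
    lines = ["| Motivation | Rating | Justification |", "|---|---|---|"]
    lines += [
        f"| {a.motivation} | {a.rating.value} | {a.justification} |"
        for a in rows
    ]
    return "\n".join(lines)
```

Failing validation when a motivation is missing is a deliberate choice in this sketch: rating a motivation "Not applicable" with a justification is an explicit decision, while omitting it is not.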
---
### Step 2 — Match Attacker Profiles to the System
Map the applicable motivations to attacker profiles and assess each profile's likelihood.
**Why:** Motivations identify the *why*; profiles identify the *who*. Different profiles have different capabilities, resources, and behaviors that directly determine which attack methods are feasible and which defenses are effective. A criminal actor gravitates toward the lowest-cost approach and will move to an easier target if you raise the cost of attacking you. A nation-state actor will invest significant resources and cannot be deterred by cost alone. Knowing the profile shapes the defensive strategy.
Assess each of the following profiles:
**Hobbyists and automated scanners**
Curious technologists motivated by fun or learning. Generally follow personal ethics; rarely cross into criminal behavior. However, their discoveries may be picked up by more motivated actors. Automated scanners (bots, vulnerability scanners for sale) now amplify the effective capability of low-skill actors.
- Likely for: any internet-facing system.
- Defensive priority: patch known vulnerabilities promptly; implement rate limiting; use CAPTCHA or behavioral analysis to distinguish bots from humans.
**Vulnerability researchers**
Security professionals who find and report flaws, motivated by financial reward (bug bounties) or professional reputation. Operate within disclosure norms; typically notify the organization before going public. Can be allies if engaged constructively.
- Likely for: systems with a public bug bounty program, or systems operated by organizations with public security commitments.
- Defensive priority: run a Vulnerability Reward Program; have a clear disclosure policy; respond quickly to reports.
**Governments and law enforcement (nation-state actors)**
Intelligence agencies, military cyber units, and law enforcement with the goal of intelligence gathering, military disruption, or policing. Have significant resources and can sustain long-term operations. Typically use sophisticated methods but also rely on simple techniques (phishing) when those work.
- Likely for: organizations handling user communications, location data, financial records, or government/military supply chains.
- Defensive priority: invest in long-term defense in depth; protect the most sensitive assets even at high cost; build layered detection that catches persistent access, not just initial compromise.
**Activists (hacktivists)**
Groups or individuals using technical attacks to advance political or social causes. Methods range from website defacement and DDoS to data theft and publication. Vocal and often seek public credit. Do not always have high technical skill — DDoS-for-hire services are inexpensive.
- Likely for: organizations with politically controversial products, customers, or public positions.
- Defensive priority: DDoS mitigation; backup and rapid restore; hardened content delivery.
**Criminal actors**
Motivated by financial gain. Use the most cost-effective attack methods available, including social engineering, phishing, ransomware, and credential stuffing. Will move to easier targets if attack cost exceeds expected return. May operate alone or be hired by organizations (political campaigns, competitors, criminal enterprises).
- Likely for: any system handling money, credentials, or data with resale value.
- Defensive priority: raise attack cost; implement MFA; make your system a more expensive target than alternatives.
**Automation and AI-assisted attacks**
Emerging category: automated systems that discover and exploit vulnerabilities without direct human control. Currently most relevant for large-scale credential stuffing, vulnerability scanning, and opportunistic exploitation of known CVEs.
- Likely for: any internet-facing system; grows more relevant over time.
- Defensive priority: automated configuration management; resilient-by-default system design; continuously updated vulnerability patching.
**Output of this step:** A profile assessment table with likelihood rating (High/Medium/Low/Not applicable) for each profile and 1–2 sentences on what that attacker would target in this specific system.
---
### Step 3 — Build the Insider Threat Scenario Matrix
Apply the four-dimension actor-motive-action-target framework to insider categories to generate concrete insider threat scenarios.
**Why:** Insider threats are frequently undermodeled because security thinking defaults to external attackers. Yet insiders already have access — they skip the entry stage of an attack entirely. Insider actions span the full range from deliberate malice to accidental damage, and the defenses (least privilege, multi-party authorization, business justifications, auditing) are the same regardless of intent. Planning for both malicious and accidental insider actions is essential because intent is often impossible to determine after the fact.
**The three insider categories:**
- **First-party insiders:** Employees, interns, executives, board members — those brought in to meet business objectives and granted direct system access.
- **Third-party insiders:** Contractors, vendors, open-source contributors, commercial partners, auditors — insiders whom few or no people in the organization have met personally. Open-source contributors who submit malicious code changes are a growing subcategory.
- **Related insiders:** Friends, family, roommates — people with physical access to an insider's device or workspace. Remote and work-from-home arrangements expand this category.
**Build a scenario matrix using four dimensions:**
| Dimension | Example values |
|---|---|
| **Actor/Role** | Engineering, Operations, Legal, Marketing, Executives, Contractor, Vendor, Open-source contributor, Family member |
| **Motive** | Accidental, Negligent, Compromised (account takeover), Financial, Ideological, Retaliatory, Vanity |
| **Action** | Data access (read), Exfiltration (copy/send), Deletion, Modification, Injection (malicious code/config), Leak to press |
| **Target** | User data, Source code, Documents, Logs, Infrastructure, Services, Financial records |
Generate at least 5 concrete scenarios by combining one item from each dimension. Include both malicious and accidental scenarios. Examples:
- An **engineer** with source code access is **dissatisfied after a negative performance review** and **injects a backdoor** into production that **exfiltrates user credentials**.
- An **SRE** preparing an emergency change is **working without enough sleep** and **accidentally deletes** the production **database**.
- A **contractor** with API access is **compromised** by a third-party phishing campaign and their credentials are used to **exfiltrate** **source code**.
- An **open-source contributor** submits a malicious change list that **injects** a dependency backdoor affecting **all downstream users**.
- A **family member** uses an employee's unlocked laptop and **accidentally installs malware** that **prevents the employee from responding** to an on-call incident.
**Output of this step:** A table of 5–8 insider threat scenarios, each with Actor, Motive, Action, Target, and a brief impact statement.
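Combining one value from each dimension is a Cartesian product, which makes it easy to enumerate candidates mechanically before a human selects the plausible ones. The sketch below assumes the example values from the dimension table; the grouping of motives into accidental and malicious sets and the helper names (`all_combinations`, `sample_candidates`) are illustrative assumptions, not part of the source framework.

```python
import itertools
import random
from typing import Dict, List, NamedTuple


class Scenario(NamedTuple):
    actor: str
    motive: str
    action: str
    target: str


# Example values copied from the four-dimension table above.
DIMENSIONS: Dict[str, List[str]] = {
    "actor": ["Engineering", "Operations", "Legal", "Marketing", "Executives",
              "Contractor", "Vendor", "Open-source contributor", "Family member"],
    "motive": ["Accidental", "Negligent", "Compromised", "Financial",
               "Ideological", "Retaliatory", "Vanity"],
    "action": ["Data access", "Exfiltration", "Deletion", "Modification",
               "Injection", "Leak to press"],
    "target": ["User data", "Source code", "Documents", "Logs",
               "Infrastructure", "Services", "Financial records"],
}

# Assumed grouping, used only to guarantee a mix of scenario types.
ACCIDENTAL_MOTIVES = {"Accidental", "Negligent"}
MALICIOUS_MOTIVES = {"Financial", "Ideological", "Retaliatory", "Vanity"}


def all_combinations(dims: Dict[str, List[str]]) -> List[Scenario]:
    """Enumerate every actor x motive x action x target combination."""
    product = itertools.product(
        dims["actor"], dims["motive"], dims["action"], dims["target"]
    )
    return [Scenario(*combo) for combo in product]


def sample_candidates(dims: Dict[str, List[str]], count: int = 5,
                      seed: int = 0) -> List[Scenario]:
    """Pick `count` distinct candidates with at least one accidental and
    one malicious motive. A human still decides which are plausible."""
    if count < 2:
        raise ValueError("count must be at least 2 to cover both motive types")
    rng = random.Random(seed)
    pool = all_combinations(dims)
    accidental = [s for s in pool if s.motive in ACCIDENTAL_MOTIVES]
    malicious = [s for s in pool if s.motive in MALICIOUS_MOTIVES]
    picks = [rng.choice(accidental), rng.choice(malicious)]
    remaining = [s for s in pool if s not in picks]
    picks += rng.sample(remaining, count - len(picks))
    return picks
```

Most generated combinations will be implausible for a given system (for example, Marketing injecting into Infrastructure). The enumeration is a prompt for discussion; each kept scenario still needs a written impact statement.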
---
### Step 4 — Build External Threat Scenarios
Apply the same four-dimension matrix to external attacker profiles from Step 2.
**Why:** Combining attacker profiles (who) with the actor-motive-action-target framework produces concrete, testable threat scenarios rather than abstract risk statements. Concrete scenarios map directly to defensive controls — each scenario implies specific technical and procedural mitigations. Abstract risk statements ("we might be hacked") cannot drive security investment decisions.
For each High or Medium attacker profile from Step 2, generate at least one scenario:
| Dimension | Values for external threats |
|---|---|
| **Actor** | Criminal gang, Hacktivist collective, Nation-state intelligence agency, Automated scanner, Former employee (acting externally), Hired attacker |
| **Motive** | Financial gain (ransomware, credential theft), Espionage (IP theft, data collection), Activism (defacement, disruption), Coercion (DDoS for ransom), Destruction (sabotage) |
| **Action** | Phishing to credential theft, Exploiting unpatched CVE, Supply chain compromise, Social engineering of support staff, DDoS, Injecting malicious dependency |
| **Target** | User accounts, Admin credentials, Payment data, Source code, Cryptographic keys, API endpoints, Third-party integrations |
**Output of this step:** A table of 5–8 external threat scenarios covering the relevant attacker profiles.
---
### Step 5 — Map Scenarios to Attack Lifecycle Stages and Assign Defenses
Plot prioritized threat scenarios against the five attack lifecycle stages and identify the defensive controls that interrupt each stage.
**Why:** Attackers must succeed at every stage in sequence to achieve their goal — defenders only need to interrupt one stage. Mapping scenarios to lifecycle stages identifies where in the attack chain a defense can be applied most cost-effectively. A multi-stage attack also provides multiple detection opportunities; building detection into each stage maximizes the chance of catching an attacker before they reach their goal.
**The five attack lifecycle stages:**
| Stage | What the attacker does | Example attack action |
|---|---|---|
| **Reconnaissance** | Surveys target to understand weak points | Search for employee email addresses, enumerate public APIs, read job postings to infer tech stack |
| **Entry** | Gains initial access to systems, accounts, or the network | Sends phishing emails leading to credential compromise; exploits a known CVE in a public endpoint |
| **Lateral movement** | Moves from initial access point to higher-value systems | Logs into internal systems using stolen credentials; pivots from compromised workstation to production servers |
| **Persistence** | Ensures ongoing access survives detection and remediation | Installs a backdoor; creates a secondary admin account; modifies startup scripts |
| **Goal execution** | Takes the action that achieves the attack objective | Exfiltrates data; encrypts files for ransom; deletes production systems; publishes stolen documents |
For each high-priority scenario from Steps 3 and 4, identify which stage is the natural intervention point and what defense interrupts it:
**Reconnaissance defenses:** Employee security awareness training; minimize public information exposure (job postings, error messages that reveal tech stack); monitor for public OSINT gathering.
**Entry defenses:** Phishing-resistant multi-factor authentication (hardware security keys for high-privilege accounts); restrict VPN access to organization-managed devices; patch known vulnerabilities promptly.
**Lateral movement defenses:** Enforce least privilege — employees can only access systems required for their role; require re-authentication for sensitive system access; segment the network; implement zero-trust network access (no implicit trust based on network location alone).
**Persistence defenses:** Application allowlisting (only permit authorized software to run); monitor for unexpected new accounts or scheduled tasks; automated integrity checking of critical system files.
**Goal execution defenses:** Least-privilege access to sensitive data; data loss prevention controls; monitoring for large data transfers or bulk deletions; enable rapid recovery (tested backups, runbooks for data restoration).
**Output of this step:** A table mapping each prioritized threat scenario to the stage where intervention is most effective and the specific defensive control recommended.
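One way to produce that table is to keep the stage-to-defense catalog as data and look up each scenario's intervention stage. This is a sketch under assumptions: the catalog holds an abbreviated subset of the defenses listed above, and `defensive_map` and `uncovered_stages` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# The five lifecycle stages, in attack order.
STAGES = ["Reconnaissance", "Entry", "Lateral movement",
          "Persistence", "Goal execution"]

# Abbreviated subset of the per-stage defenses described above.
DEFENSES: Dict[str, List[str]] = {
    "Reconnaissance": ["Security awareness training",
                       "Minimize public information exposure"],
    "Entry": ["Multi-factor authentication",
              "Prompt patching of known vulnerabilities"],
    "Lateral movement": ["Least privilege", "Network segmentation"],
    "Persistence": ["Application allowlisting",
                    "Monitoring for unexpected accounts or scheduled tasks"],
    "Goal execution": ["Data loss prevention",
                       "Tested backups and restore runbooks"],
}


def defensive_map(scenarios: List[Tuple[str, str]]) -> str:
    """Render (scenario description, intervention stage) pairs as a table."""
    lines = ["| Scenario | Stage | Recommended Defense |", "|---|---|---|"]
    for description, stage in scenarios:
        if stage not in DEFENSES:
            raise ValueError(f"unknown lifecycle stage: {stage!r}")
        lines.append(
            f"| {description} | {stage} | {'; '.join(DEFENSES[stage])} |"
        )
    return "\n".join(lines)


def uncovered_stages(scenarios: List[Tuple[str, str]]) -> List[str]:
    """List stages where no prioritized scenario has an intervention."""
    used = {stage for _, stage in scenarios}
    return [s for s in STAGES if s not in used]
```

`uncovered_stages` is a quick check for gaps: a stage with no planned intervention is also a stage with no planned detection opportunity.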
---
### Step 6 — Produce the Threat Model Document
Compile the outputs of Steps 1–5 into a structured threat model document.
**Why:** A threat model is only useful if it is documented, shareable, and actionable. The document becomes the shared reference for security investment decisions, architecture reviews, penetration test scoping, and compliance evidence. Without a written artifact, the modeling exercise has no lasting impact.
**Threat model document structure:**
```
1. System Summary
- System name and purpose
- Key assets (data, services, infrastructure)
- Exposure surface (internet-facing endpoints, privileged user count)
2. Attacker Motivation Assessment
- Table: motivation × High/Medium/Not Applicable × justification
3. Attacker Profile Assessment
- Table: profile × likelihood × what they would target in this system
4. Insider Threat Scenarios
- Matrix table: Actor | Motive | Action | Target | Impact
5. External Threat Scenarios
- Matrix table: Actor | Motive | Action | Target | Impact
6. Prioritized Threat Scenario List
- Ranked by: (likelihood × impact)
- Top 5–10 scenarios with priority ranking
7. Attack Lifecycle Defensive Map
- Table: Scenario | Stage | Recommended Defense
8. Open Questions and Assumptions
- What information would change this model?
- What is assumed but not verified?
```
**Output of this step:** A completed threat model document written to the project directory or presented to the user as a structured response.
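Section 6 of the template ranks scenarios by likelihood × impact, but the ratings used in earlier steps are labels rather than numbers. A simple approach is to map the labels onto an ordinal scale before multiplying. The 1 to 3 mapping and the tie-break on impact below are assumptions for illustration; this skill does not define a numeric scale for threat scenarios.

```python
from dataclasses import dataclass
from typing import List

# Assumed ordinal scale; adjust to whatever scale the team agrees on.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


@dataclass(frozen=True)
class RankedScenario:
    name: str
    likelihood: str  # "Low", "Medium", or "High"
    impact: str      # "Low", "Medium", or "High"

    @property
    def score(self) -> int:
        """Ordinal likelihood multiplied by ordinal impact."""
        return LEVELS[self.likelihood] * LEVELS[self.impact]


def prioritize(scenarios: List[RankedScenario],
               top_n: int = 10) -> List[RankedScenario]:
    """Sort by score descending, breaking ties in favor of higher impact."""
    ordered = sorted(
        scenarios,
        key=lambda s: (s.score, LEVELS[s.impact]),
        reverse=True,
    )
    return ordered[:top_n]
```

Ordinal products only order scenarios relative to each other. They are not probabilities, so a score of 6 does not mean twice the risk of a score of 3.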
---
## Key Principles
**1. Assume you are a target before you know why.**
Organizations routinely do not realize they are attractive targets until after a breach. A small company may hold data that a nation-state wants, or may be a stepping stone to a higher-value target in the supply chain. Assess attractiveness from the adversary's perspective, not the organization's self-assessment.
**2. Attack sophistication does not predict success.**
Even well-resourced attackers choose the simplest available path to their goal. Phishing remains one of the most effective entry techniques regardless of the attacker's technical sophistication. Prioritize defenses against simple, proven attack methods (MFA, patching, access controls) before investing in defenses against exotic scenarios.
**3. Plan for both malicious and accidental insider actions.**
Insider intent is often impossible to determine after the fact. An accidental database deletion and a retaliatory one have the same impact. Design defenses (least privilege, multi-party authorization, auditing) that protect against both simultaneously.
**4. Focus on attacker methods when attribution is uncertain.**
Attackers can disguise their identity and motivation (NotPetya appeared to be ransomware but was sabotage). Do not build defenses that depend on knowing who the attacker is. Instead, map defenses to the attacker's *methods* (tactics, techniques, and procedures) — these are harder to disguise and more stable across attribution changes.
**5. Raising attack cost is a legitimate and measurable defense.**
Criminal actors choose the easiest target. If you make your system significantly more expensive to attack than alternatives, they shift to easier targets. Nation-state actors cannot be deterred by cost, but forcing them to spend significant resources increases their risk of detection and attribution. "Make the attack more expensive than the reward" is a concrete design goal.
---
## Examples
### Example 1 — Payment processing API
**User request:** "We're building a payment processing API. We need to understand who might attack it and what we should prioritize."
**Step 1 — Motivation assessment (abbreviated):**
- Financial gain: **High** — payment data and transaction manipulation are direct paths to money.
- Coercion: **High** — ransomware or DDoS against a payment processor causes immediate revenue loss; operators are likely to pay to restore service.
- Espionage: **Medium** — customer payment data has resale value; transaction patterns could be valuable to competitors.
- Destruction: **Low** — no obvious political or military motive unless the client base includes sanctioned entities.
**Step 2 — Profile assessment (abbreviated):**
- Criminal actors: **High** — direct financial return from credential theft, card data, or ransomware.
- Nation-state: **Medium** — if processing cross-border transactions or serving government clients; intelligence agencies value financial data.
- Hobbyists/automated: **High** — any public API endpoint will face automated scanning.
**Step 3 — Key insider scenarios:**
- A **payment operations engineer** is **compromised** via phishing and their admin credentials are used to **exfiltrate** bulk **card data**.
- An **executive** with access to financial reporting **accidentally leaks** quarterly **revenue data** before public disclosure.
- A **contractor** building the reconciliation module **injects fraudulent transaction records** to redirect small amounts to an external account over months.
**Step 4 — Key external scenarios:**
- A **criminal gang** motivated by **financial gain** uses **credential stuffing** against merchant API keys to **initiate fraudulent transactions**.
- An **automated scanner** exploits a **known CVE** in an unpatched dependency to gain **initial access** to the payment gateway.
**Step 5 — Top defensive priorities:**
- Entry: Hardware MFA for all admin accounts; phishing-resistant authentication for operations staff.
- Lateral movement: Least privilege — payment engineers cannot access bulk card data outside of specific, logged operations.
- Goal execution: Data loss prevention on bulk card data exports; anomaly detection on transaction velocity.
- Insider: Multi-party authorization for any operation affecting more than N transactions; business justification logging for bulk data access.
---
### Example 2 — Internal developer platform
**User request:** "We run an internal platform where engineers submit code that gets deployed to production. What threats should we be designing for?"
**Step 1 — Motivation assessment (abbreviated):**
- Espionage: **High** — source code is proprietary IP; the deployment pipeline is a high-value target for supply chain attacks.
- Destruction: **Medium** — a disgruntled insider or external attacker could disrupt production deployments.
- Financial gain: **Medium** — source code may be sold; credentials to the deployment system could be used for ransomware.
**Step 2 — Profile assessment (abbreviated):**
- Nation-state: **High** (if the company has proprietary technology) — source code is a prime espionage target.
- Criminal actors: **Medium** — ransomware via supply chain compromise is increasing.
- Hobbyists/researchers: **Low** — internal-only system with no public exposure.
- Insider threat: **High** — all engineers are first-party insiders with significant access.
**Step 3 — Key insider scenarios:**
- An **engineer** with a **retaliatory** motive (fired for cause) uses retained credentials to **delete** production **infrastructure** before access is revoked.
- A **third-party open-source contributor** **injects a malicious dependency** into a shared library that **exfiltrates environment variables** from production containers.
- An **SRE** working a late-night incident is **negligent** and **accidentally modifies** the wrong **production configuration**, causing an outage.
**Step 4 — Key external scenarios:**
- A **nation-state actor** uses **spear-phishing** against a senior engineer to steal their credentials and gain **persistent access** to the source code repository.
- A **criminal actor** compromises a **third-party dependency** in the build pipeline (supply chain attack) to inject malware into production artifacts.
**Step 5 — Top defensive priorities:**
- Entry: Hardware MFA for all engineers; device health checks before VPN access.
- Lateral movement: Zero-trust network access — production access requires per-session authorization, not standing access.
- Persistence: Code review requirements for all changes; dependency pinning and integrity verification; automated build provenance.
- Insider/retaliatory: Access revocation runbook with sub-hour SLA for departing employees; audit logs for all production access.
---
### Example 3 — Consumer fitness tracking app
**User request:** "We run a fitness app with GPS route tracking. We're small and not a bank or defense contractor — do we really need to threat model?"
**Step 1 — Motivation assessment:**
This illustrates the "you may not realize you're a target" principle. In a real 2018 incident, a fitness app's public heatmap of user activity routes revealed the locations of undisclosed military bases, because some of the users whose GPS routes it collected were active-duty military personnel. A fitness app with location data may attract intelligence agency interest it would never anticipate.
- Espionage: **Medium** (unexpected) — aggregated location data has intelligence value.
- Financial gain: **High** — user accounts with PII and payment data are resale targets.
- Coercion: **Low** — service disruption is inconvenient but not mission-critical for most users.
**Key insight for the user:** Even organizations that do not consider themselves high-value targets may hold data that is attractive to nation-state actors, criminal gangs, or others. The fitness app scenario demonstrates that the adversary's view of your data's value differs from your own view. Run the motivation assessment from the adversary's perspective, not yours.
---
## References
See `references/` for:
- `attacker-motivation-worksheet.md` — Guided worksheet for systematically rating each motivation against a system's profile
- `attacker-profile-reference.md` — Detailed capability, resource, and behavioral descriptions for each profile
- `insider-category-taxonomy.md` — Extended insider category taxonomy with worked examples of related-insider scenarios
- `threat-scenario-matrix-template.md` — Blank actor-motive-action-target matrix for generating scenarios
- `attack-lifecycle-defense-map.md` — Comprehensive per-stage defense catalog drawn from the book's chapters on access controls, logging, and incident response
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, Adam Stubblefield.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Audit and score an existing offer or service using the four-driver perceived value formula (Hormozi's "Value Equation"): Dream Outcome × Perceived Likelihood...
---
name: value-equation-offer-audit
description: |
Audit and score an existing offer or service using the four-driver perceived value formula (Hormozi's "Value Equation"): Dream Outcome × Perceived Likelihood of Achievement ÷ (Time Delay × Effort & Sacrifice). Use this skill when an offer is underpriced relative to its actual value, getting price objections, or failing to convert despite being genuinely good. Scores each driver on a binary 0/1 rubric, identifies which drivers are dragging value down, and produces concrete improvement actions for each weak driver. Triggers include: "why won't people pay for this?", "how do I raise my price?", "my offer isn't converting", "I need to make my service more compelling", "how do I justify my premium", "should I be charging more?", "what makes my offer valuable?", "how do I compete against cheaper alternatives?". Applies to: consulting, coaching, courses, agencies, productized services, physical products, SaaS, any offer where perceived value determines willingness to pay.
model: sonnet
context: 200k
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Offer description, sales page, service overview, pricing page, or pitch deck"
- type: none
description: "Skill also works from a verbal description of the offer and target customer"
tools-required: [Read, Write, TodoWrite]
tools-optional: [Grep]
environment: "Run from any directory; document access enables concrete rewrite suggestions"
depends-on: []
---
# Value Equation Offer Audit
## When to Use
Use this skill when you are:
- **Diagnosing why an offer isn't converting** — leads are interested but not buying, or price objections are frequent
- **Preparing to raise prices** — need to identify which value drivers to strengthen before justifying a higher price point
- **Comparing your offer to a cheaper competitor** — two similar offers at different prices; need to understand the perceived value gap
- **Designing a new offer from scratch** — want to build in all four value drivers before launching
- **Reviewing existing marketing copy or sales messaging** — checking whether the language communicates each driver clearly or leaves value on the table
- **Evaluating a Done-For-You vs. Do-It-Yourself vs. Done-With-You pricing structure** — deciding which delivery model to use for a given customer segment
Trigger phrases users commonly say:
- "Why won't people pay what I'm asking?"
- "I keep getting 'too expensive' — but I know we deliver results."
- "How do I make my offer more compelling without discounting?"
- "What's missing from this offer?"
- "My competitor charges less, how do I justify my premium?"
- "I want to raise my price — where do I start?"
Preconditions: you have at least one of:
- A written description of the offer, service, or product
- A sales page, landing page, or pitch deck
- A verbal description from the user of what they sell, who they sell to, and what outcome they deliver
**Agent:** Before starting, identify whether you are auditing an existing offer (AUDIT mode) or designing a new one (DESIGN mode). The process below covers AUDIT mode with design guidance embedded. For a full build-out of a new offer, use `grand-slam-offer-creation` after completing this audit.
## Context & Input Gathering
### Input Sufficiency Check
```
User prompt → Extract: what is the offer? who is the customer? what outcome is promised?
↓
Environment → Scan for: sales pages, offer docs, pricing docs, pitch materials
↓
Gap analysis → Do I know: (1) what the offer is, (2) who buys it, (3) what outcome is delivered,
(4) current price or target price?
↓
Missing critical info? ──YES──→ ASK (one question at a time)
│
NO
↓
PROCEED
```
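The flow above can be sketched in Python as a small gap check. This is only an illustration under stated assumptions: the dictionary keys (`offer`, `customer_outcome`, `price`) and both function names are hypothetical, the questions are abridged from the Required Context section, and a missing price is treated as non-critical because the Default Assumptions section says to proceed without one.

```python
from typing import Dict, List, Optional

# Hypothetical field labels; question text is abridged from "Required Context".
CRITICAL_QUESTIONS: Dict[str, str] = {
    "offer": "Can you describe what you offer, how it's delivered, "
             "and what specific result your customer gets?",
    "customer_outcome": "Who is your ideal buyer, and what is the single "
                        "result they most want from this offer?",
}


def next_question(context: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the first missing-info question, or None when the audit can proceed."""
    for field, question in CRITICAL_QUESTIONS.items():
        if not (context.get(field) or "").strip():
            return question  # ask one question at a time
    return None


def default_assumptions(context: Dict[str, Optional[str]]) -> List[str]:
    """Collect notes for non-critical gaps that do not block the audit."""
    notes: List[str] = []
    if not (context.get("price") or "").strip():
        notes.append("No price given: pricing recommendations are directional.")
    return notes
```

Calling `next_question` repeatedly, after each user answer is stored in the context, reproduces the "ask one question at a time" loop; the audit proceeds once it returns `None`.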
### Required Context (must have — ask if missing)
- **Offer description:**
→ Check prompt or files for: what the customer receives, how it is delivered, what the core promise is
→ If missing, ask: "Can you describe what you offer, how it's delivered, and what specific result your customer gets? For example: 'a 12-week coaching program for freelance designers that helps them go from $3k/mo to $10k/mo clients.'"
- **Target customer and their dream outcome:**
→ Check for: who buys this, what they ultimately want, what "success" looks like for them
→ If missing, ask: "Who is your ideal buyer, and what is the single result they most want from this offer?"
### Observable Context (gather from environment)
- **Existing offer materials:** Read any sales pages, proposal templates, or offer documents present.
→ Look for: how the outcome is described, what objections are preemptively addressed, what proof is offered, what friction is mentioned or unaddressed
- **Pricing context:** Note current price if stated, or ask for it if needed to calibrate the audit.
### Default Assumptions
- If no price is given: proceed with the audit; note that pricing recommendations will be directional, not absolute
- If no customer segment is specified: infer from the offer description and flag the assumption
- If delivery format is ambiguous: ask whether this is Done-For-You (DFY), Done-With-You (DWY), or Do-It-Yourself (DIY) — this directly affects the Effort & Sacrifice score
## Process
Use `TodoWrite` to track steps before beginning.
```
TodoWrite([
{ id: "1", content: "Extract offer core: outcome, customer, delivery, price", status: "pending" },
{ id: "2", content: "Score Driver 1: Dream Outcome (0 or 1)", status: "pending" },
{ id: "3", content: "Score Driver 2: Perceived Likelihood of Achievement (0 or 1)", status: "pending" },
{ id: "4", content: "Score Driver 3: Time Delay (0 or 1)", status: "pending" },
{ id: "5", content: "Score Driver 4: Effort & Sacrifice (0 or 1)", status: "pending" },
{ id: "6", content: "Identify top improvement actions per weak driver", status: "pending" },
{ id: "7", content: "Check delivery model alignment (DFY/DWY/DIY)", status: "pending" },
{ id: "8", content: "Produce scored audit summary and priority action list", status: "pending" }
])
```
---
### Step 1: Extract the Offer Core
**ACTION:** From the offer description or documents, extract and write down:
1. The specific result or transformation promised (the Dream Outcome in concrete terms)
2. Who the customer is and what their current state is
3. How the offer is delivered (Done-For-You / Done-With-You / Do-It-Yourself)
4. Current price (if known) or target price range
5. Time from purchase to first result, and time to full result
**WHY:** The four value drivers operate on the customer's *perception*, not objective reality. Before scoring, you need to know both what is actually delivered AND how it is currently communicated — because the gap between those two is often where value is being lost. A genuinely excellent offer can score poorly if it fails to communicate its drivers clearly. Equally, a weak offer cannot be saved by communication alone. Separating "what it is" from "how it is communicated" is essential for diagnosing the right problem.
**Output:** A single-paragraph offer summary in this format:
> "[Offer name] helps [customer segment] go from [current state] to [dream outcome] via [delivery method], in [time to first result / full result], at [price]. Delivery requires [what the customer must do]."
Mark Step 1 complete in TodoWrite.
---
### Step 2: Score Driver 1 — Dream Outcome
**Definition:** The dream outcome is the gap between where the customer is now and where they most want to be. It is not a feature list — it is the *feeling* and *status* the customer expects to experience after achieving the result. Offers that tap into status, respect, security, love, or freedom score higher than offers framed around features or mechanics.
**Scoring rubric:**
- **1 (value achieved):** The offer is clearly framed around a specific, emotionally resonant outcome the customer deeply wants. The outcome increases their status in some meaningful way (professionally, socially, financially, physically). The customer would describe the outcome in their own words, not the seller's jargon.
- **0 (missing):** The offer is described in terms of what is included ("8 modules," "weekly calls," "template library") rather than what the customer becomes or achieves. The outcome is vague ("grow your business," "feel better") or framed in seller-centric language.
**ACTION:**
1. Read the current offer description.
2. Ask: "Does this description make a customer *feel* the outcome, or just list what they receive?"
3. Identify the deepest desire the offer actually addresses. Probe one level deeper than the surface: weight loss → feeling attractive and confident → increased status among peers. The deeper the desire the offer connects to, the higher its dream outcome score.
4. Assign score: 0 or 1.
**IF score = 0:** Note the specific language changes needed to reframe the offer around the outcome. The fix is almost always repositioning, not rebuilding the offer.
**Levers for increasing Dream Outcome:**
- Name the specific end state in the customer's language, not the seller's
- Quantify the outcome where possible: "$10k/month" beats "more revenue"; "close your first enterprise deal" beats "improve your sales"
- Connect the outcome to status: describe how the customer will be perceived by peers, family, or clients once they achieve it
- Address the *real* desire beneath the stated desire (the minivan example: a customer who "doesn't care about status" still chose the option that increased status among the people they care about)
Mark Step 2 complete in TodoWrite.
---
### Step 3: Score Driver 2 — Perceived Likelihood of Achievement
**Definition:** Perceived likelihood of achievement is the customer's belief that *they specifically* will get the promised result if they buy this offer. It is not about whether the result is real — it is about the customer's confidence that it will happen for them. People pay for certainty. A surgeon's 10,000th patient pays more than their first, even if both receive the same procedure, because the perceived likelihood of a good outcome is higher.
**Scoring rubric:**
- **1 (value achieved):** The offer includes strong, specific social proof that the target customer identifies with (case studies from similar people, testimonials with specific outcomes, before/after data). It pre-answers the customer's internal objection: "But will this work for *me*?" It reduces perceived risk through guarantees, proof of track record, or credentialing. The customer leaves the offer description more confident than when they arrived.
- **0 (missing):** The offer relies on general claims without proof. Testimonials are vague ("This was amazing!") or from customers who don't resemble the target buyer. No mechanism is explained for why this works. The customer is expected to take the result on faith.
**ACTION:**
1. Identify all proof elements currently in the offer: testimonials, case studies, data, credentials, methodology explanations.
2. Evaluate each for specificity and relatability to the target customer.
3. Ask: "Would a skeptical, first-time buyer read this and think: 'Yes, this will work for someone like me'?"
4. Assign score: 0 or 1.
**IF score = 0:** Identify the fastest path to adding specific, credible proof. If proof is thin, the offer needs structural components that increase confidence: money-back guarantees, risk reversal, detailed methodology explanation, or a visible track record. See `guarantee-design-and-selection` for specific guarantee structures.
**Levers for increasing Perceived Likelihood of Achievement:**
- Testimonials from customers who match the target buyer (same role, same problem, similar starting point)
- Specific outcome data: "14 out of 17 clients hit $10k/month within 90 days" beats "many clients see great results"
- Mechanism explanation: explain *why* this works (the system, the process, the proprietary method) — customers who understand the mechanism trust the outcome more
- Risk reversal: guarantees do not just reduce risk, they signal that the seller is confident enough to stake money on the result
- Credentials and track record: years in domain, number of clients served, recognizable past clients or employers
Mark Step 3 complete in TodoWrite.
---
### Step 4: Score Driver 3 — Time Delay
**Definition:** Time delay is the gap between when a customer pays and when they first experience meaningful value. Two components matter: (1) the long-term outcome (the ultimate result they bought for) and (2) short-term wins that occur on the path to that outcome. Customers *buy* for the long-term outcome but *stay* for the short-term wins. The shorter the gap to first value, the higher the perceived value of the offer.
**Critical insight — fast beats free:** Speed is so valuable that many companies have built entire businesses by charging for what others give away — simply by delivering it faster. FedEx vs. USPS. Uber vs. walking. Private DMV renewal vs. waiting in line. If an offer competes in a market that has a free alternative, the winning strategy is almost always speed, not price reduction.
**Scoring rubric:**
- **1 (value achieved):** The offer delivers a meaningful quick win within the first 24–72 hours of purchase — an emotional confirmation that the customer made the right decision. The path to the long-term outcome includes visible milestones that reassure the customer they're on track. The time to full result is clearly stated and is competitive for the category.
- **0 (missing):** Value delivery begins slowly, with no early wins designed in. The customer must wait weeks or months before seeing any evidence the offer is working. No milestones are communicated. The time to full result is vague or long without justification.
**ACTION:**
1. Map the current delivery timeline: when does the customer first experience something? When do they first see a result?
2. Ask: "What is the fastest meaningful win I could deliver within the first 7 days, and ideally within the first 24–72 hours?"
3. Check: is the time to full result competitive with alternatives in this market? (Including doing nothing, DIY, or competing offers)
4. Assign score: 0 or 1.
**IF score = 0:** Design a specific fast-win component to add to the offer. This is often the highest-leverage improvement available — it costs little to deliver, dramatically increases retention and referrals, and raises perceived value without changing price. Examples: a quick-start guide that delivers one concrete result on day one; a first-session output that gives the customer something immediately usable; a 48-hour diagnostic that shows the customer exactly where they are and what to do next.
**Gym example:** A gym client's long-term goal (adding $239k/year in revenue) takes months to achieve. The fast-win solution was getting their first ad live and closing a $2,000 sale within their first 7 days. This immediate win reinforced their purchase decision and built the trust needed to follow through the full program.
**Levers for decreasing Time Delay:**
- Design an explicit "fast win" deliverable for the first 24–72 hours (not just onboarding paperwork)
- Structure the delivery sequence so the customer experiences value *while* progressing toward the full outcome
- Communicate milestones clearly: "By week 2 you will have X; by week 6 you will have Y"
- Consider whether a higher-touch delivery option (DFY vs. DWY) reduces the customer's wait time enough to justify premium pricing
- If the full outcome genuinely takes time, make the short-term experience vivid: what will they feel, see, and receive along the way?
Mark Step 4 complete in TodoWrite.
---
### Step 5: Score Driver 4 — Effort & Sacrifice
**Definition:** Effort & Sacrifice is everything the customer must give up, endure, or do in order to receive the outcome — beyond simply paying the price. This includes time, energy, learning curves, lifestyle changes, social discomfort, and risk. Done-For-You services command premium prices primarily because they eliminate this driver almost entirely. The customer's ideal scenario is to say yes, hand over money, and wake up with the outcome — zero effort on their part.
**Warning — AP-5: the beginner marketer top-of-equation trap:** Most new offer designers focus exclusively on Drivers 1 and 2 (Dream Outcome and Perceived Likelihood) because they are easy to work on — you simply make bigger claims. This is the lazy approach, and it produces an arms race of promises that becomes increasingly easy to ignore. The companies that dominate their markets — Apple, Amazon, Netflix — win by obsessing over the *bottom* of the equation: making delivery faster, more seamless, and more effortless than anyone else. When your offer has strong top drivers but weak bottom drivers, improving Drivers 3 and 4 (that is, reducing Time Delay and Effort & Sacrifice) is almost always the highest-leverage move available, because those improvements are harder for competitors to copy and rarer in most markets.
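For reference, the "top" and "bottom" of the equation correspond to the numerator and denominator of the formula given in this skill's description. Written out in LaTeX (the label "Value" on the left-hand side is shorthand for perceived value):

```latex
\[
\text{Value} =
\frac{\text{Dream Outcome} \times \text{Perceived Likelihood of Achievement}}
     {\text{Time Delay} \times \text{Effort \& Sacrifice}}
\]
```

Drivers 1 and 2 sit in the numerator, so increasing them raises value; Drivers 3 and 4 sit in the denominator, so value rises as they shrink.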
**Scoring rubric:**
- **1 (value achieved):** The customer's required effort is minimal, clearly bounded, and feels worth the outcome. Every friction point in the delivery process has been addressed or removed. The offer anticipates what customers find hard or uncomfortable and either eliminates it (DFY) or makes it easy (DWY). The marketing honestly names the sacrifice required and frames it as minimal compared to alternatives.
- **0 (missing):** The offer requires significant ongoing effort from the customer, and this is either unaddressed in the marketing or presented as a virtue ("hard work pays off"). Friction points are hidden and only discovered after purchase. The customer's experience of getting to the outcome is harder than they expected.
**ACTION:**
1. List every action, change, and discomfort the customer must endure to receive the full outcome. Be exhaustive — include: time, scheduling, learning new tools or concepts, lifestyle changes, social friction, emotional discomfort, administrative tasks.
2. For each item on that list, ask: "Can this be eliminated? Can it be reduced? Can it be done for them?"
3. Compare your list to what the marketing currently communicates about required effort.
4. Assign score: 0 or 1.
**Fitness vs. liposuction comparison:** Both deliver the same outcome (reduced body fat). Fitness requires: waking up earlier, 5–10 hours/week, dietary restriction, physical discomfort, embarrassment, risk of injury, meal prep, new food costs. Liposuction requires: falling asleep and being sore for 2–4 weeks. This is why liposuction commands $25,000 while gym memberships struggle to hold $29/month. The outcome is identical; the effort and sacrifice differential is enormous.
**Levers for decreasing Effort & Sacrifice:**
- Move from DIY to DWY to DFY where feasible — each step dramatically reduces customer effort and justifies a price increase
- Eliminate administrative friction: intake forms, scheduling complexity, and onboarding steps are all effort costs before value is received
- Anticipate and pre-solve the most common points where customers get stuck or give up
- Reframe necessary effort as trivially small compared to the alternative (as liposuction marketing does against gym memberships)
- Consider psychological solutions over logical ones: reducing the *perception* of effort is often as powerful as reducing actual effort (the London Underground dotted map increased rider satisfaction more than faster trains — at a fraction of the cost)
Mark Step 5 complete in TodoWrite.
---
### Step 6: Identify Improvement Actions per Weak Driver
**ACTION:** For each driver scored at 0, produce 2–3 specific, concrete actions the user can take to improve it. Prioritize actions by:
1. Speed of implementation (can this be done this week?)
2. Leverage (does fixing this likely unlock price increases or conversion improvements?)
3. Cost (does it require building new delivery infrastructure or just repositioning existing content?)
**Prioritization guidance:**
- If Drivers 3 or 4 (Time Delay and Effort & Sacrifice) are scored 0: prioritize these first. They are harder to fake, more competitively durable, and are where most markets are weakest. This is the antidote to AP-5.
- If Driver 2 (Perceived Likelihood) is scored 0: this is often the fastest fix — adding specific case studies and testimonials is low-cost and high-impact.
- If Driver 1 (Dream Outcome) is scored 0: this is a messaging and positioning problem, not a product problem. The fix is rewriting, not rebuilding.
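The driver-level guidance above can be sketched as a simple ordering function. This is a hedged illustration, not part of the skill: the function name and the `PRIORITY_ORDER` constant are hypothetical, the tie between Drivers 3 and 4 is broken by driver number as an assumption, and the sketch deliberately ignores the speed, leverage, and cost criteria, which still need human judgment.

```python
from typing import Dict, List

# Driver numbers follow the skill: 1 Dream Outcome, 2 Perceived Likelihood,
# 3 Time Delay, 4 Effort & Sacrifice.
# Assumption: Drivers 3 and 4 come first (3 before 4), then 2, then 1.
PRIORITY_ORDER: List[int] = [3, 4, 2, 1]


def order_weak_drivers(scores: Dict[int, int]) -> List[int]:
    """Return the drivers scored 0, ordered by the prioritization guidance."""
    weak = [driver for driver, score in scores.items() if score == 0]
    return sorted(weak, key=PRIORITY_ORDER.index)


# Usage: an offer weak on Dream Outcome, Time Delay, and Effort & Sacrifice.
ordered = order_weak_drivers({1: 0, 2: 1, 3: 0, 4: 0})
```

The result is a starting order only; a low-effort messaging fix for Driver 1 may still be worth doing first when it can ship this week.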
Mark Step 6 complete in TodoWrite.
---
### Step 7: Evaluate Delivery Model Alignment
**ACTION:** Assess whether the current delivery model (DFY / DWY / DIY) is optimally aligned with the target customer's preferences and willingness to pay.
**WHY:** The delivery model is one of the highest-leverage decisions in offer design. It directly affects all four value drivers simultaneously: DFY maximizes Perceived Likelihood (experts do it) and minimizes Effort & Sacrifice (the customer does almost nothing), while DWY and DIY shift those drivers in the opposite direction. Most businesses default to one delivery model without consciously choosing it — which means they often leave money on the table by not offering a DFY tier, or price-compress themselves by not offering a DIY tier for price-sensitive buyers.
**DFY / DWY / DIY pricing ladder:**
- **Done-For-You (DFY):** Highest price. Customer pays for expertise, speed, and zero personal effort. Best for customers whose time is expensive or whose confidence in DIY is low. Perceived Likelihood is highest because an expert handles execution.
- **Done-With-You (DWY):** Mid-tier. Customer learns and participates, but with guidance. Balances price sensitivity with personal involvement. Appropriate for customers who want to build capability, not just get the outcome.
- **Do-It-Yourself (DIY):** Lowest price. Customer receives tools, systems, and knowledge. Effort & Sacrifice is highest here, so Driver 4 is the hardest to score well. Appropriate for price-sensitive buyers or those who value developing the skill themselves.
**Questions to resolve:**
- Is there a DFY option for customers willing to pay a premium? If not, is there a reason to add one?
- If the current offer is DWY or DIY, is the pricing reflecting the customer's burden (high Effort & Sacrifice)?
- Would a tiered structure (DIY / DWY / DFY at different price points) capture more of the market without cannibalizing the premium tier?
Mark Step 7 complete in TodoWrite.
---
### Step 8: Produce the Scored Audit Summary
**ACTION:** Write the final audit output.
**Format:**
```
VALUE EQUATION AUDIT — [Offer Name]
Offer summary: [one-paragraph summary from Step 1]
DRIVER SCORES:
┌────────────────────────────────────────┬───────┬───────────────────────┐
│ Driver                                 │ Score │ Status                │
├────────────────────────────────────────┼───────┼───────────────────────┤
│ 1. Dream Outcome                       │  /1   │ [Strong / Needs work] │
│ 2. Perceived Likelihood                │  /1   │ [Strong / Needs work] │
│ 3. Time Delay (lower = better)         │  /1   │ [Strong / Needs work] │
│ 4. Effort & Sacrifice (lower = better) │  /1   │ [Strong / Needs work] │
├────────────────────────────────────────┼───────┼───────────────────────┤
│ OVERALL                                │  /4   │ [Verdict]             │
└────────────────────────────────────────┴───────┴───────────────────────┘
OVERALL VERDICT:
4/4 — Strong perceived value. Offer is positioned to command premium pricing. Review messaging and guarantee structure.
3/4 — Good foundation. One driver is limiting value. Fix the weak driver before raising price.
2/4 — Significant value gap. Two drivers are dragging conversion and price tolerance down. Priority improvements required.
1/4 or below — Fundamental offer problem. Perceived value is low regardless of quality. Consider offer redesign using grand-slam-offer-creation.
PRIORITY ACTIONS:
1. [Driver X — specific, concrete action, estimated effort: low/medium/high]
2. [Driver Y — specific, concrete action, estimated effort: low/medium/high]
3. [Driver Z — specific, concrete action, estimated effort: low/medium/high]
DELIVERY MODEL NOTE:
[Assessment of whether current DFY/DWY/DIY structure is appropriate, and whether a pricing ladder is warranted]
NEXT STEPS:
[If score < 3/4]: Use grand-slam-offer-creation to rebuild offer components around weak drivers.
[If score = 3/4]: Address priority actions above, then re-audit.
[If score = 4/4]: Use premium-pricing-strategy to identify the price ceiling for this offer.
[If Driver 2 is weak]: Use guarantee-design-and-selection to add risk-reversal that boosts perceived likelihood.
```
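The mapping from the four binary scores to a verdict and next steps can be sketched as follows. This is a minimal sketch under stated assumptions: the driver keys and the function name are hypothetical, the verdict strings are shortened from the format above, and the next-step entries are the names of the skills the format refers to.

```python
from typing import Dict, List, Tuple

# Hypothetical keys for the four drivers, in the order used by the audit.
DRIVERS = ("dream_outcome", "perceived_likelihood", "time_delay", "effort_sacrifice")

VERDICTS = {
    4: "Strong perceived value",
    3: "Good foundation",
    2: "Significant value gap",
}


def summarize_audit(scores: Dict[str, int]) -> Tuple[int, str, List[str]]:
    """Return (total, verdict, next steps) for four 0/1 driver scores."""
    for name in DRIVERS:
        if scores.get(name) not in (0, 1):
            raise ValueError(f"{name} must be scored 0 or 1")
    total = sum(scores[name] for name in DRIVERS)
    # Totals of 1 or 0 fall through to the "1/4 or below" verdict.
    verdict = VERDICTS.get(total, "Fundamental offer problem")
    if total < 3:
        steps = ["grand-slam-offer-creation"]
    elif total == 3:
        steps = ["address priority actions, then re-audit"]
    else:
        steps = ["premium-pricing-strategy"]
    if scores["perceived_likelihood"] == 0:
        steps.append("guarantee-design-and-selection")
    return total, verdict, steps


# Usage: Dream Outcome 0, Perceived Likelihood 1, Time Delay 0, Effort & Sacrifice 0.
result = summarize_audit({
    "dream_outcome": 0,
    "perceived_likelihood": 1,
    "time_delay": 0,
    "effort_sacrifice": 0,
})
```

The scores still come from the rubric judgments in Steps 2 through 5; the sketch only keeps the verdict and next-step wording consistent across audits.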
**HANDOFF TO HUMAN** — present the scored audit and priority action list. Ask: "Which of these actions do you want to tackle first? I can help you rewrite the offer framing, design a fast-win component, or work on proof elements."
Mark Step 8 complete in TodoWrite.
---
## Examples
### Example 1: Photography Studio — 5x Ticket Increase
**Trigger:** "I run a photography studio. My average client pays $300 per session. I want to charge $1,500 but I don't know how to justify it."
**Offer summary:** Portrait photography sessions for families and professionals. 1-hour session, 20 edited photos delivered in 2 weeks. Customer selects prints or digital files. Price: $300.
**Audit:**
| Driver | Score | Finding |
|---|---|---|
| Dream Outcome | 0 | Offer is described as "photos" and "editing." The actual dream outcome — looking beautiful, feeling proud, capturing a milestone — is absent from all marketing. |
| Perceived Likelihood | 1 | Portfolio is strong with clear before/after quality evidence. Customer can see the output they will receive. |
| Time Delay | 0 | 2-week delivery with no interim touchpoints. Customer pays and then waits with no confirmation they made the right decision. |
| Effort & Sacrifice | 0 | Customer must schedule around studio hours, travel to the studio, manage children during the session, and wait 2 weeks for any value. No guidance on how to prepare, what to wear, or what to expect. |
**Overall: 1/4**
**Priority actions:**
1. **Dream Outcome (messaging reframe, low effort):** Rewrite all copy to center on the outcome: "Family portraits that stop you in your tracks — the kind your kids will find in a drawer 30 years from now and cry over." Frame every package around the memory and status the photo creates, not the technical specs.
2. **Time Delay (fast win design, medium effort):** Send a preview of 3 favorite shots within 24 hours of the session. This gives the customer an immediate emotional win and confirmation. Full gallery at 2 weeks, but the customer stops waiting after day one.
3. **Effort & Sacrifice (friction removal, medium effort):** Add a preparation guide ("What to wear, how to prep kids, what to expect") to eliminate the anxiety of not knowing. Offer a mobile session option to eliminate travel. These changes reduce the perceived effort of booking, not just the session itself.
**Result:** Implementing all three actions fixes the drivers that were suppressing price tolerance. The audit score rises from 1/4 to 3–4/4 after the changes, which supports a $1,500 price. The studio's profitability improves as a result: same hours, higher perceived value delivered, and dramatically higher revenue per client.
---
### Example 2: Business Coaching — First 7-Day Fast Win
**Trigger:** "I'm a business coach for gym owners. I charge $2,000/month but I'm getting a lot of churn after month 2. People say they're not seeing results fast enough."
**Offer summary:** Monthly coaching program for gym owners targeting $239k/year incremental revenue. Weekly calls, strategy sessions, and ad setup guidance. Full results expected over 6–12 months. Price: $2,000/month.
**Audit:**
| Driver | Score | Finding |
|---|---|---|
| Dream Outcome | 1 | "$239k additional annual revenue" is specific, quantified, and highly motivating to gym owners. |
| Perceived Likelihood | 1 | Coach has track record of achieving this with multiple clients. Case studies present. |
| Time Delay | 0 | 6–12 months to full outcome. No explicit fast-win milestones designed in. Customer is left wondering "is this working?" after month 1. Churn at month 2 is the direct symptom. |
| Effort & Sacrifice | 0 | Gym owners must implement ad strategies, manage campaigns, change sales conversations, and train staff — all while running their existing business. Required effort is high and largely unaddressed. |
**Overall: 2/4**
**Priority actions:**
1. **Time Delay — design a 7-day fast win (medium effort):** Within the first week, get the gym owner's first ad live and close a $2,000 membership sale using the new sales framework. This is the single highest-leverage change available. It answers "is this working?" before the customer has time to doubt their decision. Churn at month 2 drops because the customer already has evidence they are on the right path.
2. **Effort & Sacrifice — reduce implementation friction (medium effort):** Provide done-for-you ad templates, scripts for sales conversations, and a week-by-week implementation calendar. The coaching should move from "here's what to do" to "here's the exact thing to copy and use today." Consider adding a monthly DFY ad management option at a higher tier for owners who do not want to manage campaigns.
3. **Effort & Sacrifice — acknowledge and reframe required effort (low effort):** In the sales conversation, be explicit: "Here is exactly what you will need to do in the first 30 days, and here is what we do for you. Most owners tell us it takes 3–4 hours per week." Naming the effort honestly, and showing it is bounded, is more effective than hiding it and having customers discover it post-purchase.
---
### Example 3: Software Tool — Competing Against Free
**Trigger:** "We have a project management tool for freelancers. There are free alternatives. How do we justify charging $49/month?"
**Offer summary:** Project management SaaS for freelancers. Features include time tracking, invoice generation, client portal, and project templates. Onboarding takes 2–3 hours. Price: $49/month.
**Audit:**
| Driver | Score | Finding |
|---|---|---|
| Dream Outcome | 0 | Tool is described by features ("time tracking," "invoices"), not by outcome ("get paid faster," "look more professional to clients," "spend less time on admin"). |
| Perceived Likelihood | 0 | No case studies showing specific results (e.g., "freelancers using this tool get paid 8 days faster on average"). Free tools have more social proof by volume. |
| Time Delay | 0 | 2–3 hour onboarding before any value is experienced. Competitor free tools can be used in minutes. |
| Effort & Sacrifice | 0 | Migration from existing tools, learning curve, a 2–3 hour setup, and no automation of existing workflow. High friction relative to free alternatives. |
**Overall: 0/4**
**Fast beats free principle in action:** When competing against free tools, the winning strategy is never price reduction — it is speed and effortlessness. FedEx vs. USPS. Uber vs. walking. The free alternative is available; the question is: which customers will pay for faster, easier delivery of the same outcome?
**Priority actions:**
1. **Time Delay — reduce onboarding to under 15 minutes (high effort, high leverage):** Free tools win on friction-of-starting. If the paid tool can deliver a working setup faster than the free alternative, it wins on time delay. Build an onboarding wizard that imports existing projects from common tools (Trello, Notion) and generates a first invoice template in under 10 minutes.
2. **Dream Outcome — reframe around professional status (low effort):** Change all copy from features to outcomes: "Send invoices that make clients say 'this feels like a real agency'" and "Get paid in 3 days instead of 30." Both are about status and money — the real dream outcomes for freelancers.
3. **Perceived Likelihood — add specific proof (medium effort):** Add one prominent case study with data: "Julia, freelance designer, reduced time on invoicing from 4 hours/week to 20 minutes and got paid 11 days faster." This is the kind of proof that makes the $49/month feel like a bargain against the outcome.
---
## Key Principles
- **Perception is the only currency.** Improving the actual offer matters only if the improvement is perceived by the customer. The London Underground's dotted arrival-time map increased rider satisfaction more than faster trains — at a fraction of the cost. Always ask: "How will the customer *perceive* this change?" before investing in building it.
- **The bottom of the equation is where markets are most underdeveloped.** Most competitors focus on Dream Outcome and Perceived Likelihood — making bigger promises and stacking social proof. Very few systematically address Time Delay and Effort & Sacrifice. This is where durable competitive advantage lives. The best companies in every category win by making their delivery faster, more seamless, and more effortless than anyone else.
- **Fast beats free.** When competing in a market with free alternatives, the winning move is speed and effortlessness — not price reduction. People pay for the elimination of waiting and effort. Any offer that delivers faster or requires less work than the free alternative has a viable premium.
- **Done-For-You always commands a premium — even for the same outcome.** The reason DFY services cost 10–100x more than DIY versions of the same result is Effort & Sacrifice and Perceived Likelihood operating together: the customer does nothing (effort = 0) and trusts an expert to deliver (likelihood = high). If your offer is currently DWY or DIY and you are struggling to hold price, the highest-leverage option is adding a DFY tier.
- **Binary scoring forces honest diagnosis.** Rating each driver 0 or 1 — not 0.7 or "pretty good" — forces a clear verdict on whether each driver is actually contributing to perceived value. Partial credit is how weak offers justify staying unchanged. If a driver is not clearly and confidently delivering value to the customer, score it 0 and fix it.
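
The binary scoring rule and the NEXT STEPS routing from the audit template can be sketched in Python. This is a minimal illustration, not part of the skill: the function name `audit`, the driver labels used as dictionary keys, and the shape of the returned dictionary are all assumptions made for the example.

```python
# Hypothetical helper: totals a binary value-equation audit and routes to
# the follow-up skills named in the audit template's NEXT STEPS block.
DRIVERS = ("Dream Outcome", "Perceived Likelihood", "Time Delay", "Effort & Sacrifice")


def audit(scores: dict) -> dict:
    """Return the total (0-4), the weak drivers, and suggested next steps."""
    missing = [d for d in DRIVERS if d not in scores]
    if missing:
        raise ValueError(f"missing drivers: {missing}")
    # Binary scoring: partial credit such as 0.7 is rejected outright.
    if any(scores[d] not in (0, 1) for d in DRIVERS):
        raise ValueError("binary scoring only: each driver is 0 or 1")

    total = sum(scores[d] for d in DRIVERS)
    weak = [d for d in DRIVERS if scores[d] == 0]

    if total == 4:
        next_steps = ["premium-pricing-strategy"]
    elif total == 3:
        next_steps = ["address priority actions, then re-audit"]
    else:
        next_steps = ["grand-slam-offer-creation"]
    # Driver 2 (Perceived Likelihood) has its own remedy when weak.
    if scores["Perceived Likelihood"] == 0:
        next_steps.append("guarantee-design-and-selection")

    return {"total": total, "weak_drivers": weak, "next_steps": next_steps}


# Scores taken from the photography studio example.
studio = audit({
    "Dream Outcome": 0,
    "Perceived Likelihood": 1,
    "Time Delay": 0,
    "Effort & Sacrifice": 0,
})
print(f"{studio['total']}/4", studio["weak_drivers"], studio["next_steps"])
```

Rejecting any value other than 0 or 1 is the point of the sketch: the check makes it impossible to record a "pretty good" driver, which mirrors the principle above.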
## References
- For building out a new offer from scratch after identifying gaps: `grand-slam-offer-creation`
- For designing guarantees that improve Driver 2 (Perceived Likelihood of Achievement): `guarantee-design-and-selection`
- For setting and justifying premium prices once all four drivers are strong: `premium-pricing-strategy`
- Source: *$100M Offers*, Alex Hormozi, Chapter 6 "Value Offer: The Value Equation," pages 78–93
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — $100M Offers: How To Make Offers So Good People Feel Stupid Saying No by Alex Hormozi.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Score and select a target market before building any offer. Use this skill when starting a new business, evaluating a niche, choosing between customer segmen...
---
name: target-market-selection
description: Score and select a target market before building any offer. Use this skill when starting a new business, evaluating a niche, choosing between customer segments, questioning why an existing offer is underperforming despite good execution, or deciding how narrowly to specialize. Activates on phrases like "who should I sell to," "is this a good market," "should I niche down," "I'm not getting traction," "which audience should I focus on," or any request to validate, score, or compare potential target markets.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/100m-offers/skills/target-market-selection
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: 100m-offers
title: "$100M Offers: How To Make Offers So Good People Feel Stupid Saying No"
authors: ["Alex Hormozi"]
chapters: [4]
tags: [market-selection, entrepreneurship, niche-strategy, business-strategy]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Business concept, service description, or target customer hypothesis — the market or niche to evaluate"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment. Document set preferred: business descriptions, offer drafts, market research notes."
---
# Target Market Selection
## When to Use
You are determining who to sell to — before building an offer, before writing copy, or before spending more effort on a market that may not support success. Typical triggers:
- Starting a new business and need to choose a customer segment
- Evaluating whether your current market is holding you back despite a good product
- Deciding between two or more candidate niches or customer types
- Questioning why a great offer is not gaining traction (bad market may be the cause)
- Deciding how narrowly to specialize within a broader category
- Pivoting away from a declining or saturated market
This skill runs **before** offer design. Market selection is the highest-leverage decision in business. A great offer in the wrong market will fail. A mediocre offer in the right market will still generate revenue.
Priority hierarchy: **Market quality > Offer strength > Sales and persuasion skill**
---
## Context & Input Gathering
### Required Context (must have — ask if missing)
- **Business concept or service description:** What you are selling or intending to sell
→ Check prompt for: service type, product description, expertise area, problem solved
→ Ask if missing: "What product or service are you planning to offer, and what problem does it solve?"
- **Target customer hypothesis:** Who you currently think the customer is (even if uncertain)
→ Check prompt for: industry, role, demographic, company size, life situation
→ Ask if missing: "Who do you currently think your ideal customer is? Be as specific as you can — industry, role, life stage, problem they face."
### Observable Context (gather from environment)
- **Market trend data:** Any existing research, articles, or notes about the market's growth or decline
- **Competitor presence:** Signs of active demand (competitors, advertising, active communities)
- **Existing customer data:** Any past clients that reveal who is already paying
### Default Assumptions
- If no target hypothesis exists → evaluate the three macro-markets (Health, Wealth, Relationships) first to identify which one applies, then narrow from there
- If comparing multiple markets → score each against all four indicators; recommend the highest-scoring market
- If market is already chosen → run as a diagnostic; flag any indicators that score poorly
---
## Process
### Step 1: Anchor to a Macro-Market
**ACTION:** Identify which of the three universal macro-markets the business concept fits into: Health, Wealth, or Relationships.
**WHY:** These three markets exist because the pain of lacking them is universal and permanent — humans will always need to improve their physical condition, earn more money, and improve their relationships. Any business that cannot be placed in one of these buckets is attempting to create demand rather than channel existing demand, which is an order of magnitude harder. Placing yourself inside a macro-market confirms you are working with existing human desire, not against it.
| Macro-Market | Example sub-niches |
|---|---|
| **Health** | Weight loss, fitness, chronic illness, mental health, longevity, sleep, pain relief |
| **Wealth** | Business growth, investing, career advancement, sales skills, real estate, financial planning |
| **Relationships** | Dating, marriage, parenting, leadership, networking, communication, conflict resolution |
**IF the concept does not fit cleanly into one macro-market** → flag this as a structural risk; the business may be attempting to create demand rather than serve it.
**IF the concept fits multiple macro-markets** → identify the primary pain driver and anchor there. A business coach serving executives touches both Wealth and Relationships — anchor to whichever pain is the dominant purchase motivation.
---
### Step 2: Score the Market on Four Indicators
**ACTION:** Rate the candidate market on each of the four market quality indicators using the scoring rubric below. Produce a score from 1 (weak) to 3 (strong) for each indicator. Total score: 4–12.
**WHY:** You need to channel demand, not create it. The four indicators identify whether a market already has the conditions for demand to exist and be converted. Missing even one indicator creates a structural obstacle that no offer quality or sales skill can fully overcome — as the newspaper market example demonstrates: three strong indicators could not save a business from a market shrinking 25% per year.
---
#### Indicator 1: Massive Pain
**Score 1 — Low pain:** The audience experiences minor inconvenience or wants a "nice to have." They are not actively seeking a solution. No urgency.
**Score 2 — Moderate pain:** The audience has a real problem but manages it; they would buy if the offer found them but do not actively search for solutions.
**Score 3 — High pain:** The audience suffers acutely and actively seeks solutions. The problem affects their daily life, income, relationships, or health in a way they cannot ignore. They are already spending money on partial solutions.
**Diagnostic questions:**
- Are people in this market actively complaining about this problem in online communities, forums, or social media?
- Are they already paying for imperfect alternatives because no perfect solution exists?
- Would this audience immediately understand why someone would sell a solution to this problem, without explanation?
**IF pain score is 1** → stop; this market will not support meaningful revenue without extensive (and expensive) demand education.
---
#### Indicator 2: Purchasing Power
**Score 1 — Low purchasing power:** The audience cannot afford to pay what the service is worth. They may want it but lack the money or access to money.
**Score 2 — Moderate purchasing power:** The audience can pay but requires price justification; high-ticket offers will face resistance.
**Score 3 — High purchasing power:** The audience has disposable income, business revenue, or access to financing sufficient to pay premium prices without hardship.
**Diagnostic questions:**
- Can this audience afford what I need to charge to make the business viable?
- Do they already spend money in this category (competitors exist, products are sold)?
- Is the problem tied to income generation or cost reduction — which makes ROI framing easy?
**Anti-pattern — Purchasing Power Trap:** A market can have intense pain, easy targeting, and strong growth, yet still fail commercially if the audience cannot pay. Unemployed job seekers are a classic example: massive pain (joblessness), easy to target (LinkedIn, job boards), growing during recessions — but they cannot pay for resume help at prices that make the business viable. Purchasing power is non-negotiable.
---
#### Indicator 3: Easy to Target (Reachability)
**Score 1 — Hard to target:** The audience is dispersed, not organized into identifiable communities, channels, or associations. Advertising reaches too broad a group to be efficient.
**Score 2 — Moderately targetable:** The audience can be reached but requires significant creative effort or indirect channels.
**Score 3 — Easy to target:** The audience is self-organized. They belong to identifiable associations, follow specific publications or influencers, gather in online communities, attend niche events, or are reachable via narrowly defined ad targeting criteria.
**Diagnostic questions:**
- Is there a specific Facebook group, subreddit, LinkedIn group, or forum where this audience gathers?
- Are there industry associations, trade publications, or conferences serving this niche?
- Can I name 2–3 influencers, podcasts, or media channels that this exact audience consumes?
**WHY this matters beyond marketing:** Easy targeting also makes your messaging more resonant. When you know exactly where your audience gathers, you learn their exact language, fears, and aspirations — which directly improves offer design and copy.
---
#### Indicator 4: Growing Market
**Score 1 — Declining market:** The total number of potential customers is shrinking. Market contraction creates a headwind that no offer can overcome at scale.
**Score 2 — Flat market:** Stable size; the business can grow by capturing share from competitors, but the rising tide will not help.
**Score 3 — Growing market:** The number of potential customers is increasing. External forces (demographics, technology trends, regulatory changes, economic shifts) are creating new entrants into the market. A growing market is a tailwind — it makes everything easier.
**Diagnostic questions:**
- Is the number of people who could potentially become customers increasing or decreasing year over year?
- Are there demographic, technological, or economic trends driving more people into this market?
- Are competitors launching and growing, or consolidating and exiting?
**Anti-pattern — Declining Market Blindness:** Entrepreneurs are problem-solvers by nature. They will try harder, iterate faster, and find new angles when a market resists — often failing to recognize that the market itself is the problem. Lloyd's newspaper software business had a great product, great offer, and strong sales skills; the market was shrinking 25% per year. No amount of effort could overcome that headwind. When a market is declining, pivot the skill set to a growing market rather than fighting the current.
---
#### Scoring Summary
| Indicator | Score (1–3) |
|---|---|
| Massive Pain | |
| Purchasing Power | |
| Easy to Target | |
| Growing Market | |
| **Total** | **/12** |
**Interpretation:**
- **10–12:** Exceptional market. Move to offer design immediately.
- **7–9:** Solid market. Identify the weak indicator(s) and assess whether they are structural or addressable.
- **5–6:** Problematic market. At least one indicator is critically weak. Do not build an offer until the weak indicator is resolved or the market is changed.
- **4 (the minimum, every indicator at 1):** Bad market. Stop. Redirect to a different niche or macro-market.
**Any single indicator scoring 1 is a potential deal-breaker** — evaluate whether it can be remedied before proceeding.
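
The scorecard arithmetic and band lookup can be sketched in Python as follows. The `MarketScore` class and its method names are hypothetical, introduced only for illustration. One assumption is made explicit in the code: a total of 4 (every indicator at 1) is treated as the bad-market case.

```python
from dataclasses import dataclass

INDICATORS = ("Massive Pain", "Purchasing Power", "Easy to Target", "Growing Market")


@dataclass(frozen=True)
class MarketScore:
    """Hypothetical container for one market's four indicator scores (1-3 each)."""
    name: str
    pain: int
    purchasing_power: int
    easy_to_target: int
    growing: int

    def scores(self) -> dict:
        values = (self.pain, self.purchasing_power, self.easy_to_target, self.growing)
        return dict(zip(INDICATORS, values))

    def total(self) -> int:
        values = list(self.scores().values())
        if any(v not in (1, 2, 3) for v in values):
            raise ValueError("each indicator must be scored 1, 2, or 3")
        return sum(values)  # always between 4 and 12

    def band(self) -> str:
        total = self.total()
        if total >= 10:
            return "Exceptional market"
        if total >= 7:
            return "Solid market"
        if total >= 5:
            return "Problematic market"
        return "Bad market"  # assumption: only reachable when total == 4

    def risk_flags(self) -> list:
        # Any single indicator at 1 is a potential deal-breaker.
        return [name for name, value in self.scores().items() if value == 1]


print_media = MarketScore("Print media software", 2, 1, 3, 1)
print(print_media.total(), print_media.band(), print_media.risk_flags())
```

The print media scores used here come from the declining-market example later in this skill. They total 7, which lands in the solid band, yet two indicators are flagged. This is why the risk flags are reported separately rather than folded into the total.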
---
### Step 3: Validate the Niche Depth
**ACTION:** Determine whether to serve the macro-market broadly or niche down to a specific sub-segment. Apply the niche-depth rule.
**WHY:** Niching down increases the perceived relevance of any offer to the audience, which allows dramatically higher pricing for effectively the same core service. The same time management content priced at $19 as a generic course can be repriced at $1,997 when positioned for a highly specific audience (outbound B2B power tools sales reps), because the audience perceives it as built exactly for them. Specificity signals understanding, and understanding signals value. For most businesses under $10M in annual revenue, niching down will generate more profit than serving a broader audience — because conversion rates, pricing, and referral rates all improve with specificity.
**Niche-depth decision rule:**
- **Under $10M in annual revenue:** Default to niching down. Stay narrow. Serve fewer people more completely.
- **At or above $10M in annual revenue:** Evaluate whether the total addressable market (TAM) of the current niche can support further growth, or whether expansion up-market, down-market, or into adjacent niches is warranted. Do not expand prematurely — many businesses at $1M–$3M believe they have hit their ceiling when they have not.
**How to niche:**
Start with the macro-market category, then apply one or more of these specificity dimensions:
1. **Who specifically** (role, demographic, life stage): not "business owners" → "microgym owners with 50–200 members"
2. **What specific problem** (pain sub-type): not "marketing help" → "acquiring first 100 paying customers"
3. **What specific context** (industry, platform, geography): not "outbound sales training" → "outbound B2B sales training for power tools distributors"
**IF the niche feels "too small"** → challenge that assumption. Companies regularly scale to $30M+ serving a single narrow niche (chiropractors, gyms, plumbers, solar installers, roofers, salon owners). Narrowness is a feature, not a limitation, up to $10M.
---
### Step 4: Test Market Commitment Readiness
**ACTION:** Before finalizing market selection, assess whether you can commit to this market through the natural failure-and-iteration cycle.
**WHY:** The primary cause of market selection failure is not choosing the wrong market — it is abandoning a workable market before making 100 genuine offer attempts. Both dentists and chiropractors represent multi-billion dollar markets; either would work. The fatal error is switching between them before exhausting the offer iteration space. Every market switch resets positioning, reputation, referral networks, and customer feedback cycles — compounding the time cost of failure. Commit to one market. Iterate the offer, not the audience.
**Commitment readiness checklist:**
- [ ] I can articulate the specific pain of this audience without guessing
- [ ] I have access to this audience (through existing network, community, or paid channels)
- [ ] I am willing to make at least 50–100 genuine offers to this market before concluding it does not work
- [ ] I understand that if my first offer fails, the offer needs to change — not necessarily the market
- [ ] I am not currently serving a different market simultaneously (divided focus produces divided results)
**IF checklist has multiple unchecked items** → resolve access and commitment gaps before proceeding to offer design.
**Anti-pattern — Niche Hopping:** Switching markets at the first sign of resistance is the single most common cause of entrepreneurial stagnation. The impulse to hop niches (from dentists to chiropractors, from e-commerce to coaches) is driven by the false belief that the market is the problem when the offer has not been sufficiently tested. All markets have friction. The grass is not greener in the new niche — it is just unfamiliar, which temporarily masks its own friction. Stay. Iterate the offer.
---
### Step 5: Produce the Market Selection Report
**ACTION:** Synthesize Steps 1–4 into a structured market assessment. Produce a go/no-go recommendation with supporting rationale.
**WHY:** A written assessment forces explicit scoring and prevents post-hoc rationalization. It also creates a reference point for future iterations — if the market underperforms, you can return to the assessment and identify which indicator was the actual weak point.
**Output template:**
```
## Market Assessment: [Market Name / Niche Description]
**Macro-market:** [Health / Wealth / Relationships]
**Specific niche:** [Exact customer segment being evaluated]
### Four-Indicator Scorecard
| Indicator | Score (1–3) | Rationale |
|-------------------|-------------|----------------------------------|
| Massive Pain | | |
| Purchasing Power | | |
| Easy to Target | | |
| Growing Market | | |
| **Total** | **/12** | |
### Niche Depth Assessment
- Current revenue stage: [Under / Over $10M]
- Recommended niche level: [Broad / Narrow / Hyper-specific]
- Suggested niche formulation: [Who + What problem + What context]
### Risk Flags
- [Any indicator scoring 1, with explanation]
- [Any structural risk: declining market, targeting difficulty, purchasing power gap]
### Go / No-Go Recommendation
**[GO / NO-GO / CONDITIONAL GO]**
Rationale: [2–4 sentences explaining the recommendation]
Next step: [If GO → proceed to premium-pricing-strategy or grand-slam-offer-creation]
[If NO-GO → identify alternative markets to evaluate]
[If CONDITIONAL GO → specify what must change before proceeding]
```
---
## Examples
### Example 1: Strong market, clear niche (GO)
**Input:** "I want to help restaurant owners increase their revenue. Is this a good market?"
**Process:**
- Macro-market: Wealth (business revenue growth)
- Pain: High (3) — restaurant margins are notoriously thin; owners are in constant pain about revenue, labor costs, and slow nights
- Purchasing power: Moderate (2) — small restaurant owners have limited cash but are willing to pay when ROI is clear
- Easy to target: High (3) — restaurant owners have associations (National Restaurant Association), industry-specific Facebook groups, trade publications (Nation's Restaurant News), and events
- Growing market: Moderate (2) — the restaurant industry is stable but competitive; growth depends on sub-niche (fast casual growing, fine dining flat)
- **Total: 10/12**
**Niche recommendation:** Narrow further. "Restaurant owners" is too broad. Recommend: "Fast casual restaurant owners with 1–3 locations wanting to increase repeat customer visits" — higher pain specificity, clearer ROI framing, tighter targeting.
**Recommendation: GO** — solid market with room to niche for premium pricing. Proceed to offer design.
---
### Example 2: Declining market, correct diagnosis (NO-GO)
**Input:** "I'm selling software services to print media companies — newspapers and magazines. I've been at it for 3 years and can't seem to grow."
**Process:**
- Macro-market: Wealth (B2B software)
- Pain: Moderate (2) — print media companies have operational problems but are more focused on survival than growth
- Purchasing power: Low (1) — print media ad revenue has collapsed; discretionary technology spend is minimal
- Easy to target: High (3) — industry is well-organized with associations and publications
- Growing market: Low (1) — print media circulation is declining 15–25% per year across the sector
- **Total: 7/12** with two indicators at 1
**Diagnosis:** Two critical weak indicators. Purchasing power is structurally constrained by declining ad revenue. Market growth is negative. These are not fixable through offer iteration. The market itself is the problem.
**Recommendation: NO-GO** — pivot the existing skill set to a growing market. The same software capabilities that served print media could serve digital media companies, local news startups, or content marketing agencies — all growing, all with purchasing power. Run a new market assessment on the pivot target before building a new offer.
---
### Example 3: Good macro-market, needs niching (CONDITIONAL GO)
**Input:** "I'm a relationship coach. Who should I target?"
**Process:**
- Macro-market: Relationships — confirmed universal demand
- Broad "relationship coaching" scores: Pain 2, Purchasing Power 2, Easy to Target 1, Growing 2 → **Total: 7/12**
- The weak point is targeting: "relationship coaching" is too diffuse to reach efficiently via any channel
**Niche candidates evaluated:**
| Niche | Pain | Purchasing Power | Easy to Target | Growing | Total |
|---|---|---|---|---|---|
| College students (relationships) | 2 | 1 | 2 | 2 | 7 |
| Newly divorced professionals (40–55) | 3 | 3 | 3 | 3 | 12 |
| Couples in early marriage (1–5 years) | 2 | 2 | 2 | 2 | 8 |
**Recommended niche:** Newly divorced professionals aged 40–55. Higher pain (life disruption), higher purchasing power (mid-career income), easy to target (divorce attorney referral networks, specific Facebook groups, therapist partnerships), and demographically growing (large boomer cohort entering this life stage).
**Recommendation: CONDITIONAL GO** — proceed only after narrowing to a specific sub-segment. Do not launch as a generic relationship coach.
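
When several niches are compared, the ranking can be reproduced with a few lines of Python. The sketch below uses the scores from the niche candidates table above. The `rank_niches` helper and its tuple layout are hypothetical, chosen only to show the comparison.

```python
# Scores copied from the niche candidates table, in the order
# (pain, purchasing power, easy to target, growing), each 1-3.
candidates = {
    "College students (relationships)": (2, 1, 2, 2),
    "Newly divorced professionals (40-55)": (3, 3, 3, 3),
    "Couples in early marriage (1-5 years)": (2, 2, 2, 2),
}


def rank_niches(niches: dict) -> list:
    """Return (niche, total, has_indicator_at_1) tuples, highest total first."""
    ranked = [
        (name, sum(scores), 1 in scores)
        for name, scores in niches.items()
    ]
    return sorted(ranked, key=lambda row: row[1], reverse=True)


ranked = rank_niches(candidates)
for name, total, deal_breaker in ranked:
    flag = " (has an indicator at 1)" if deal_breaker else ""
    print(f"{total:>2}/12  {name}{flag}")
```

Sorting by total puts the recommended niche first, and the third tuple field keeps a low purchasing-power score visible even when the totals are close.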
---
## Key Principles
- **Market quality trumps all other factors.** Offer strength and sales skill only matter if the market has the capacity to respond. A great offer in a dying market still dies. Assess market quality before investing in offer design.
- **You are channeling demand, not creating it.** Look for markets where people are already spending money, already complaining, already seeking solutions. Demand creation is an order of magnitude more expensive and slow than demand channeling. The three macro-markets (Health, Wealth, Relationships) always have demand.
- **Specificity enables premium pricing.** The same core service priced at $19 for a generic audience can command $1,997 when positioned for a specific avatar with a specific problem in a specific context. Niching down is not limiting — it is a pricing strategy. Do not broaden until revenue warrants it (typically $10M+).
- **Commit to one market; iterate the offer, not the audience.** The failure that feels like a bad market is usually an untested offer. Make 50–100 genuine attempts before concluding the market is wrong. Every market switch resets your positioning, network, and learning — compounding the cost of starting over.
- **A single weak indicator can sink a business.** Four strong indicators produce a great market. But one indicator scoring 1 — especially purchasing power or market growth — can block revenue no matter how well the other three perform. Score honestly. Do not rationalize weak indicators away.
---
## References
- **Next step after market selection:** Use `premium-pricing-strategy` to set price points and avoid the commodity trap; use `grand-slam-offer-creation` to build an offer tailored to the selected niche.
- **Offer evaluation:** Use `value-equation-offer-audit` to assess whether the offer delivers enough perceived value for the chosen market's price sensitivity.
- For extended niche-depth examples and the full niche-pricing multiplier framework, see `references/niche-pricing-examples.md` (when available).
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — $100M Offers: How To Make Offers So Good People Feel Stupid Saying No by Alex Hormozi.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Add scarcity and urgency layers to a completed offer to increase conversion rate and perceived value without changing the offer itself. Use this skill when y...
---
name: scarcity-and-urgency-tactics
description: |
Add scarcity and urgency layers to a completed offer to increase conversion rate and perceived value without changing the offer itself. Use this skill when you have a defined offer but conversion is sluggish, when prospects say "I'll think about it" or "maybe later," when you are launching a promotion and need a legitimate reason to act now, or when you want to build pent-up demand between sales cycles. Scarcity limits quantity (how many can buy); urgency limits time (when they can buy). Used together they create the psychological conditions where action now is more compelling than waiting. Trigger phrases: "how do I get people to buy now", "how do I stop prospects from procrastinating", "add urgency to my offer", "create scarcity for my service", "how do I run a limited-time offer", "set up a launch deadline", "get more people off the fence", "use fear of missing out", "make my offer feel exclusive", "why aren't people buying after the call", "how do I fill my cohort faster", "people keep saying they'll think about it". Applies to: coaching, consulting, group programs, online courses, agencies, local services, physical products, SaaS trials, any business that takes new clients or customers. Run this skill after `grand-slam-offer-creation`. Layer on top of output from `bonus-stacking-system` and `offer-naming-magic-formula` before going live.
tags: [scarcity, urgency, demand-generation, offers, sales]
depends-on: [grand-slam-offer-creation]
---
# Scarcity and Urgency Tactics
## When to Use
Use this skill when you have a complete offer (built with `grand-slam-offer-creation`) and need to increase the percentage of qualified prospects who take action within a defined window.
Specific triggers:
- **Prospects stall after expressing interest** — "sounds great, let me think about it" is the most expensive sentence in sales
- **Low conversion on calls or landing pages** — the offer is understood but not acted on
- **Running a launch, promotion, or enrollment cycle** — you need a clear open/close window
- **You want to build pent-up demand** — intentionally underselling to increase desire for future cycles
- **You are pricing up** — announcing a price increase creates a legitimate near-term deadline
**What this skill produces:** A written scarcity and urgency implementation plan for your specific offer, including the tactic type(s) selected, the real constraints behind each, word-for-word scripts for sales conversations and copy, and an ethical validation checklist.
**What this skill does not cover:** Offer structure, bonus design, guarantee framing, or offer naming. See `grand-slam-offer-creation`, `bonus-stacking-system`, and `offer-naming-magic-formula`.
**Critical distinction:**
- Scarcity = limits on **quantity** ("only X spots available")
- Urgency = limits on **time** ("only available until X date")
Both work on loss aversion — fear of loss is psychologically stronger than desire for gain.
> **Warning (Anti-Pattern AP-6):** Fake scarcity and fake urgency are credibility destroyers. If you say "only 5 spots left" every week indefinitely, or run a "48-hour deadline" that resets the moment it expires, prospects notice — and they tell others. One fake constraint undoes months of trust-building. Every tactic in this skill must be real. If you can't make it real, don't use it.
---
## Context & Input Gathering
Before selecting tactics, answer these questions. If you don't know the answer to a question, default to the simplest tactic that you can honestly implement.
**About your offer:**
1. What is the offer? (service, product, program, course, physical item)
2. Do you have a real maximum capacity — a number beyond which quality degrades or you physically cannot deliver? (Yes/No + number)
3. Do you start clients in groups or individually?
4. How often do you open enrollment? (weekly, monthly, quarterly, never)
5. Are you planning to raise prices, change bonuses, or run a promotion in the near future?
6. Is the opportunity you're offering time-sensitive by nature (market arbitrage, seasonal window, trend)?
**About your prospect:**
7. What stage are they at? (cold traffic, warm lead, post-call follow-up)
8. What is their most common objection when they don't buy? (price, timing, doubt)
Use answers to match the tactic matrix in the Process section.
---
## Process
### Step 1: Choose your scarcity type (limits quantity)
Scarcity works by publicly constraining how many units or spots are available. The psychology: people want what they cannot have. A limited item carries implied demand. The moment supply is perceived as fixed, buyers shift from "should I?" to "can I still get it?"
**Rule before selecting:** Only use scarcity you can enforce honestly. If you say 10 spots, you must turn away the 11th person (or have a credible waiting list).
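One way to make this rule hard to break is to encode the cap so the overflow buyer is waitlisted automatically. The following is a minimal sketch, assuming a hypothetical `CappedEnrollment` helper; the class name, method names, and the cap of 10 are illustrative and not from the source.

```python
from dataclasses import dataclass, field


@dataclass
class CappedEnrollment:
    """Hypothetical helper: enforces a hard cap and waitlists everyone after it."""
    cap: int
    enrolled: list[str] = field(default_factory=list)
    waitlist: list[str] = field(default_factory=list)

    def enroll(self, name: str) -> str:
        # Never exceed the announced cap; overflow goes to the waiting list.
        if len(self.enrolled) < self.cap:
            self.enrolled.append(name)
            return "enrolled"
        self.waitlist.append(name)
        return "waitlisted"

    def spots_left(self) -> int:
        # The number you can honestly quote in copy or on a call.
        return self.cap - len(self.enrolled)


offer = CappedEnrollment(cap=10)
results = [offer.enroll(f"client-{i}") for i in range(1, 12)]
print(results[-1], offer.spots_left(), offer.waitlist)
```

Quoting `spots_left()` rather than a hand-typed number keeps the "X spots remain" claim tied to actual enrollments.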
---
#### Scarcity Type 1: Limited Supply of Seats/Slots
You cap the number of clients you will take at a given service level. This is the most durable scarcity because it is structurally real — you genuinely cannot deliver quality past a certain client load.
Three sub-variants:
**1a. Total Business Cap ("Only accepting X clients total")**
- Set an absolute maximum for your highest service tier.
- "My agency services 25 clients total. Period." Create a waiting list. When a slot opens, waiting list members jump in with no price resistance because they've been waiting for access, not debating value.
- Periodically, increase capacity by 10-20%, then cap again. Each expansion is a mini-launch event.
- Best for: highest-tier, highest-margin service levels. Agencies, high-touch coaching, done-for-you services.
**1b. Growth Rate Cap ("Only accepting X new clients per week")**
- Set a weekly intake limit based on your real onboarding capacity.
- Script: *"We only take 5 new clients per week and 3 spots are already filled for this week. I have 6 more calls scheduled, so if this is a fit you'd want to act today or secure a spot for next week before those fill."*
- Why it works: this is literally true for most service businesses. You can only onboard so many people at once without degrading the experience. Saying it out loud converts a real operational constraint into a sales asset.
- Best for: service businesses running sales calls continuously.
**1c. Cohort Cap ("Only accepting X clients per cohort")**
- You batch starts: monthly, quarterly, or twice yearly.
- "We take 100 clients four times a year. Doors open, then close."
- Creates two natural urgency events per cohort: the opening (act now to get this batch) and the warning (X spots remain before doors close).
- Best for: group programs, courses, mastermind groups, any service that benefits from participants starting together.
**Pro tip — always sell out:** Have fewer spots available than you think you can sell. It is better to consistently sell out fast than to fail to fill a larger cohort. Selling out quickly compounds over time — prospects remember it and move faster next cycle. Always announce the sell-out publicly so fence-sitters see social proof that others thought it was worth it.
---
#### Scarcity Type 2: Limited Supply of Bonuses
Rather than limiting access to the core offer, you limit access to specific bonus components.
- "The first 10 people to enroll get a live strategy session with me personally — after that it's async only."
- "The next 5 clients get the implementation workshop included. After that it's available as an add-on."
- Why this works: it creates a two-tier incentive structure. The core offer remains available but the best version of the offer is not. This motivates action without excluding buyers who decide slightly later.
- Pairs naturally with `bonus-stacking-system`. Apply scarcity to your highest-perceived-value bonus items.
- Requirement: you must actually remove or price the bonus separately once the threshold is hit.
---
#### Scarcity Type 3: Never Available Again
This applies when an item, offer configuration, or price genuinely will not be offered again.
- Physical product limited release: "We made 100 units of the mint chocolate protein bar flavor. Once they're gone, they're gone."
- Service at current configuration: "We're retiring this tier in Q3 — after that it becomes the [higher-priced tier]."
- Price lock: "You can lock in today's rate. If you come back in 60 days the price will be [higher amount]."
- Why this works: removes the option of waiting. The prospect cannot delay and re-evaluate — the offer genuinely expires.
- Critical requirement: it must actually be true. If you bring the same thing back next month with a different name, you've taught your audience not to believe your deadlines.
- "Never again" scarcity is the most powerful but also the most integrity-expensive. Reserve it for genuine discontinuations.
**Extreme scarcity variant:** Offer a very small number of 1-on-1 access slots (direct message, Zoom, phone). Cap at a tiny number. Price it very high. The scarcity is real by design because it is access to your time, which is genuinely finite. This attracts the top 1% of buyers who self-select by commitment level.
**"Once you're out, you can never come back" variant:** Cap a service level and add a rule that departing clients cannot re-enter. Makes current members think harder about leaving, and makes prospects think harder about the decision to join. Most effective in small, tight-knit groups (mastermind-sized); loses effectiveness as groups scale past ~50.
---
### Step 2: Choose your urgency type (limits time)
Urgency limits when a prospect can take action, not how many can. It creates a deadline: a specific moment after which the window closes. The core insight: in a week-long campaign, the largest share of sales (up to 50-60% of the total) arrives in the last 4 hours of the last day. Those 4 hours are under 3% of the 168-hour week, yet they can produce more than half the revenue. Deadlines drive decisions.
**Rule before selecting:** Make deadlines visible and real. A countdown timer that resets is not a deadline — it is a lie. Real deadlines printed in copy and on landing pages convert because they are verifiable.
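The difference between a real deadline and a resetting timer can be shown in a few lines. This is a sketch under stated assumptions: the `PROMO_DEADLINE` date and both function names are hypothetical, chosen only to illustrate the contrast.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed deadline (an assumption, not from the source).
PROMO_DEADLINE = datetime(2030, 1, 31, 23, 59, tzinfo=timezone.utc)


def time_left(now: datetime, deadline: datetime = PROMO_DEADLINE) -> timedelta:
    """Real deadline: remaining time shrinks for everyone and reaches zero once."""
    return max(deadline - now, timedelta(0))


def resetting_time_left(now: datetime) -> timedelta:
    """Anti-pattern: a '48-hour' timer anchored to the visit time never expires."""
    return (now + timedelta(hours=48)) - now


first_visit = datetime(2030, 1, 30, 12, 0, tzinfo=timezone.utc)
later_visit = first_visit + timedelta(days=7)

print(time_left(first_visit), time_left(later_visit))
print(resetting_time_left(first_visit), resetting_time_left(later_visit))
```

A countdown driven by `time_left` is verifiable because every visitor sees the same end point; the resetting version shows 48 hours no matter when the prospect arrives.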
---
#### Urgency Type 1: Cohort-Based Rolling Urgency
You start new clients on a defined schedule (weekly, biweekly, monthly). The next start date is the natural deadline.
Script — baseline version:
*"If you sign up today, I can get you in with our next group that kicks off on Monday. Otherwise you'll have to wait until our next kickoff date."*
Script — stronger version (when a spot just opened):
*"I actually had a client who signed up a few weeks ago drop out, so I have an opening for our next cohort that kicks off on Monday. If you're pretty sure you're going to do this sooner or later, might as well get in on it now so you can start reaping the rewards sooner rather than paying the same and waiting."*
Why it works: it converts a real operational fact (clients start on Mondays) into a meaningful cost of delay. Waiting is not neutral — waiting means another week without results, at the same price.
If someone wants to buy but your cohort just started, either (a) offer them a fast-track onboarding bonus to join now, or (b) explain that they'll have more time to prepare, review materials, and potentially get an extended payment plan since the start date is further out. You always call the shots.
Best for: any business that batches onboarding. The less frequent the cohort, the more powerful the urgency.
---
#### Urgency Type 2: Rolling Seasonal Urgency
Wrap your promotion in a seasonal or calendar-based name with a hard end date. The promotion itself can remain the same — the name and date are what change.
Calendar sequence example:
- January: "New Year Transformation Offer — ends Jan 31"
- February: "Valentine's Fresh Start Promo — ends Feb 28"
- March: "Sexy by Spring Special — ends March 31"
- April: "Spring Forward Offer — ends April 30"
Each is the same core promotion. But each has a named identity, a start, and a real end. The end date is verifiable. When the date passes, the promotion ends (or the name changes and the old one is genuinely gone).
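If you keep the promotion calendar in a script or spreadsheet, the end dates can be generated rather than typed. A minimal sketch, assuming each promotion runs to the last day of its month; the `promo_end_date` function and the year are illustrative.

```python
import calendar
from datetime import date


def promo_end_date(year: int, month: int) -> date:
    """Last calendar day of the month, used as the promotion's hard end date."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


# Promotion names come from the calendar sequence; the year is an assumption.
promos = {
    1: "New Year Transformation Offer",
    2: "Valentine's Fresh Start Promo",
    3: "Sexy by Spring Special",
    4: "Spring Forward Offer",
}
for month, name in promos.items():
    end = promo_end_date(2025, month)
    print(f"{name}: ends {end:%b} {end.day}")
```

Generating the date once and reusing it in ads, landing pages, and scripts keeps the deadline identical everywhere it appears.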
Why this works: deadlines drive decisions. Simply having a visible end date causes a subset of prospects who were sitting on the fence to make a decision rather than default to "later." You are not manipulating them — you are removing the indefinite delay option that costs both of you.
Execution: put the date in your ad copy, on your landing page, and in your sales conversations. Make it visible everywhere.
Best for: businesses selling year-round who want consistent conversion pressure without a hard cohort model. Especially effective for local businesses, which should vary their marketing more frequently than national advertisers.
---
#### Urgency Type 3: Pricing or Bonus-Based Urgency
Use this when you cannot honestly create urgency around the core service (e.g., a roofing company cannot say they won't fix roofs after Friday), but you can create urgency around the pricing or bonuses attached to this purchase.
Script:
*"Yes, let's get you started today so you can take advantage of the discount you came in for. I'm not sure how long we'll be running it — we change these every four weeks or so, and this is one of the better ones we've run in a while."*
What to wrap urgency around:
- A current discount or promotional price
- A bonus component (free onboarding, extra workshop, implementation call valued at $X)
- A payment plan that won't be available after a certain date
- An upcoming price increase ("The price is going up on [date]. Get in now at today's rate.")
Why this maintains integrity: you are not lying about whether you'll serve them — you're being truthful that the current pricing or bonus configuration expires. They can still buy later; they just won't get the same deal.
Price increase variant: if you are genuinely planning to raise prices, announce it in advance. "The price goes up on [date]. Get in at today's rate." This is honest, creates urgency, and signals confidence in your value. Never raise prices without telling your pipeline first: the announcement brings a cash influx from fence-sitters and signals market strength at the same time.
---
#### Urgency Type 4: Exploding Opportunity
Some offers are inherently time-decaying — the opportunity is most valuable when acted on immediately, and less valuable (or worthless) later.
Examples:
- A market arbitrage window (buy low on one platform, sell high on another) that will close as more people find it
- Getting into an emerging platform or channel before competitors saturate it
- A job offer with a declining compensation package for each day the candidate waits to decide
- An early-adopter pricing tier before a product matures and prices normalize
Script approach: make the time decay explicit and quantified if possible. "Every week you wait on this, the market inefficiency closes further. The people who acted in month one are already seeing X% returns. Month three entrants are seeing Y%."
Best for: genuine arbitrage or first-mover situations. Do not fabricate opportunity decay — it is easily falsified and destroys trust. Only use this when the time decay is real and verifiable.
---
### Step 3: Select and combine
Scarcity and urgency work independently but stack together. The most effective implementations combine one scarcity type with one urgency type.
Common combinations:
| Business type | Scarcity tactic | Urgency tactic |
|---|---|---|
| Coaching/consulting (1-on-1) | Total business cap (X clients max) | Cohort-based rolling (next start date) |
| Group program | Cohort cap (X per cohort) | Rolling seasonal (named promotion deadline) |
| Local service business | Bonus scarcity (first X get bonus) | Pricing-based (promotion ends on date) |
| Physical product | Never again (limited release) | Rolling seasonal (end date) |
| High-ticket agency | Growth rate cap (X new per week) | Pricing-based (rate increase date) |
If you are early in business and have no real capacity constraints yet: start with the growth rate cap (pick a small honest number like 3-5 per week) and cohort-based urgency (pick a real weekly start day). Both are immediately implementable and truthful.
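The combination table can be treated as a simple lookup. The sketch below copies the pairs from the table; the `recommend` function is hypothetical, and falling back to the early-stage default for unlisted business types is an assumption, not a rule from the source.

```python
# (scarcity tactic, urgency tactic) per business type, copied from the table.
COMBINATIONS = {
    "coaching/consulting (1-on-1)": ("Total business cap", "Cohort-based rolling"),
    "group program": ("Cohort cap", "Rolling seasonal"),
    "local service business": ("Bonus scarcity", "Pricing-based"),
    "physical product": ("Never again", "Rolling seasonal"),
    "high-ticket agency": ("Growth rate cap", "Pricing-based"),
}

# Starting point for early-stage businesses with no real capacity constraint yet.
EARLY_STAGE_DEFAULT = ("Growth rate cap", "Cohort-based rolling")


def recommend(business_type: str) -> tuple[str, str]:
    """Return the (scarcity, urgency) pair; unlisted types get the early-stage default."""
    return COMBINATIONS.get(business_type.strip().lower(), EARLY_STAGE_DEFAULT)


print(recommend("Group program"))
print(recommend("brand-new freelancer"))
```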
---
### Step 4: Write your implementation scripts
For each selected tactic, write out:
1. **The constraint statement** — the one sentence that states what is limited and why it is real
2. **The sales script** — what you say on a call or in follow-up when the prospect hesitates
3. **The copy block** — a 2-3 sentence version for landing pages, emails, and ads
4. **The enforcement rule** — what you will actually do when the limit is hit
Template:
```
Constraint: [We only take X new clients per week / cohort starts Monday / promotion ends [date] / only Y spots left]
Reason it's real: [actual capacity limit / operational batch / planned price change / finite inventory]
Sales script: "If you're thinking about doing this, [constraint statement]. [Cost of delay]. [Invitation to decide]."
Copy block: [2-3 sentences suitable for email or landing page]
Enforcement: If limit is hit, I will [turn away / waitlist / close the promotion / remove the bonus].
```
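If you write plans for several tactics, the template can be filled programmatically so every plan carries all four parts. This is a sketch only: the `TacticPlan` class, its field names, and the sample values are hypothetical illustrations of the template, not part of the source.

```python
from dataclasses import dataclass


@dataclass
class TacticPlan:
    constraint: str      # what is limited
    reason: str          # why the limit is real
    cost_of_delay: str   # what waiting costs the prospect
    invitation: str      # the invitation to decide
    copy_block: str      # 2-3 sentences for email or landing page
    enforcement: str     # what you do when the limit is hit

    def render(self) -> str:
        script = (
            f"If you're thinking about doing this, {self.constraint}. "
            f"{self.cost_of_delay} {self.invitation}"
        )
        return "\n".join([
            f"Constraint: {self.constraint}",
            f"Reason it's real: {self.reason}",
            f'Sales script: "{script}"',
            f"Copy block: {self.copy_block}",
            f"Enforcement: If limit is hit, I will {self.enforcement}.",
        ])


plan = TacticPlan(
    constraint="we only take 5 new clients per week",
    reason="actual onboarding capacity limit",
    cost_of_delay="Waiting means starting a week later at the same price.",
    invitation="Do you want one of this week's spots?",
    copy_block="We onboard 5 new clients per week. When spots fill, enrollment moves to next week.",
    enforcement="waitlist new buyers for the following week",
)
print(plan.render())
```

Because every field is required, a plan with no enforcement rule cannot be created, which mirrors the requirement that each tactic be enforceable.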
---
### Step 5: Ethical validation checklist
Before using any tactic, confirm all boxes are checked:
- [ ] The constraint is factually true right now
- [ ] I will actually enforce it (turn away buyers, close the offer, remove the bonus) when the threshold is hit
- [ ] If I run this same promotion again, it will have a genuinely new constraint (new date, new cohort, new price)
- [ ] I am not saying "limited spots" indefinitely without actually filling up
- [ ] I am not running a countdown timer that resets
- [ ] The urgency language I use refers to a real deadline visible to the prospect (date, cohort start, price increase date)
- [ ] If questioned directly ("Is this actually limited?"), I can answer truthfully
If any box is unchecked, revise the tactic or choose a different one.
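The checklist is an all-or-nothing gate, which can be sketched as follows. The item wording is abbreviated from the checklist, and the `unchecked` and `tactic_is_usable` functions are hypothetical names introduced for illustration.

```python
CHECKLIST = [
    "constraint is factually true right now",
    "I will actually enforce it when the threshold is hit",
    "a repeat promotion will have a genuinely new constraint",
    "not claiming 'limited spots' indefinitely",
    "no countdown timer that resets",
    "urgency language refers to a real, visible deadline",
    "can answer 'Is this actually limited?' truthfully",
]


def unchecked(answers: dict[str, bool]) -> list[str]:
    """Return every checklist item not explicitly confirmed as True."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


def tactic_is_usable(answers: dict[str, bool]) -> bool:
    # Any unchecked box means: revise the tactic or choose a different one.
    return not unchecked(answers)


answers = {item: True for item in CHECKLIST}
answers["no countdown timer that resets"] = False
print(tactic_is_usable(answers), unchecked(answers))
```

Missing answers count as unchecked, so a tactic that has not been reviewed fails the gate by default.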
---
## Examples
### Example 1: Cohort cap + cohort urgency (group coaching program)
A fitness coach runs a 12-week online program. She opens enrollment to 20 clients per cohort, four times per year.
Scarcity: Cohort cap — 20 spots per quarter. She announces it, fills it, and closes it publicly.
Urgency: Cohort start date — "Our next group kicks off March 1. After that, doors don't open again until June."
Script on a sales call: *"We're at 14 out of 20 spots for the March cohort. If you want in before the group closes, now is the time. If not, the next opening is June — I'm happy to put you on the waitlist, but March people will be done by then and you'd be starting from scratch."*
Result: prospects who were vaguely interested but undecided now have two reasons to act: the March cohort is filling up, and waiting until June means three more months of living with the problem.
### Example 2: Arnold fundraiser — price increase + seat scarcity (Case Study CS-13)
At the After School All Stars fundraiser, the event organizer (acting on advice from a donor) cut the number of tickets available from the prior year and raised the price from $15,000 to $25,000 per ticket. Demand was higher than ever. The event raised nearly $5,400,000 from 100 people — $54,000 per head — the most successful night in the charity's history. Items that might not have sold at a different venue for $10,000 sold for $100,000 in that environment. The mechanism: by cutting supply and raising price simultaneously, the organizer created perceived exclusivity and triggered competitive demand. People want what they cannot have. People want what other people want. Scarcity and urgency made the products more valuable without changing the products.
### Example 3: Pricing urgency for a local service business (roofing, dental, gym)
A gym owner runs the same core membership year-round but wraps each month's promotion in a seasonal name with a hard end date.
January: "New Year Transformation Offer — enrollment closes Jan 31"
February: "Valentine's Body Goal Promo — closes Feb 28"
The promotion may be similar month to month, but each has a real end date. When January 31 arrives, the January offer closes. On February 1, the new name and date go live. Conversion pressure is maintained continuously without fake urgency. The gym owner can honestly say "this offer ends on the 31st."
---
## Key Principles
**1. Scarcity = quantity. Urgency = time. They are distinct levers.**
Using both together creates the strongest conversion environment. Using them interchangeably dilutes both.
**2. The last 4 hours can produce 50-60% of sales in a week-long campaign.**
Experienced marketers report this pattern consistently. Do not panic when the first 97% of your deadline period produces little activity. The deadline is doing its job. Do not extend the deadline, because extensions train your audience not to act.
**3. Pent-up demand is an asset, not a problem.**
Selling fewer units than you could sell is not failure — it is investment. Each person who wanted to buy but couldn't becomes a primed buyer for your next cycle, willing to move faster and pay more. Satisfying all demand now kills future demand.
**4. Honest scarcity implies social proof.**
Telling someone "we're 80% full" communicates two things: you have a limit AND a lot of other people already decided to buy. Social proof and scarcity arrive in the same sentence.
**5. Fake urgency is a credibility tax with compound interest.**
Every fake deadline trains your audience to ignore the next one. Over time, your real deadlines produce no response because you have taught people they don't matter. Build urgency equity by always enforcing what you say.
**6. The Hormozi Law on demand timing:**
"The longer you delay the ask, the bigger the ask you can make." Running cohorts less frequently — quarterly instead of monthly — creates more pent-up demand and allows higher prices. The runway determines the plane size.
**7. Supply control is the foundation of premium pricing.**
Authorities and celebrities charge extraordinary rates not because their work is worth more per unit of time, but because perceived supply is extremely low. You control your supply. Defining and communicating your limits is not arrogance — it is pricing strategy.
---
## References
- `grand-slam-offer-creation` — build the core offer before layering scarcity and urgency
- `bonus-stacking-system` — identifies which bonuses are candidates for limited-supply scarcity (Type 2)
- `offer-naming-magic-formula` — seasonal urgency requires a named promotion; naming skill produces the wrapper
- `premium-pricing-strategy` — scarcity is a precondition for premium pricing; read alongside this skill
**Source:** $100M Offers, Alex Hormozi — Chapters 12 (Scarcity) and 13 (Urgency), pages 141-157
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — $100M Offers by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-grand-slam-offer-creation`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Use when a business is pricing by copying competitors, losing deals on price, or struggling to grow margins — to diagnose commoditization, justify a premium...
---
name: premium-pricing-strategy
description: Use when a business is pricing by copying competitors, losing deals on price, or struggling to grow margins — to diagnose commoditization, justify a premium price point with ROI evidence, and calculate the niche-specific multiplier that unlocks value-driven purchasing.
tags: [pricing, offers, differentiation, sales, entrepreneurship]
depends-on: [target-market-selection]
---
# Premium Pricing Strategy
## When to Use
Use this skill when any of the following are true:
- You are setting the initial price for a new offer
- Prospects compare your offer to competitors and choose on price
- You are considering a price cut to win more customers
- Revenue is growing but profit is flat or shrinking
- You feel guilty or uncertain quoting your current price
- You are entering a niche market and want to size the pricing opportunity
Do NOT use this skill if you have not yet completed `target-market-selection` — the right niche is a prerequisite because niche specificity is a direct price multiplier (see Step 3).
---
## Context & Input Gathering
Before running the process, collect the following from the business description or pricing documents:
1. **Current price** — What is the business charging today (or planning to charge)?
2. **Competitor pricing range** — What does the low-price player charge? What does the highest-price player charge?
3. **Target customer avatar** — Who specifically is being served? (Role, industry, situation)
4. **Quantifiable result** — What measurable outcome does the customer achieve? (Revenue increase, time saved, cost avoided)
5. **Time to result** — How long does the customer take to achieve that outcome?
6. **Customer effort required** — How many hours per week does the customer have to invest?
7. **Payment structure** — Upfront, recurring, or performance-based?
If items 4-6 are unknown, flag them as required inputs before proceeding — they are essential for the ROI framing conversation in Step 4.
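The "flag before proceeding" rule for items 4-6 can be sketched as a small check. The dictionary keys and the `missing_roi_inputs` function are hypothetical names chosen for this illustration.

```python
# Inputs that the ROI framing in Step 4 cannot work without (items 4-6).
REQUIRED_FOR_ROI = {
    "quantifiable_result": "Quantifiable result (item 4)",
    "time_to_result": "Time to result (item 5)",
    "customer_effort": "Customer effort required (item 6)",
}


def missing_roi_inputs(inputs: dict[str, object]) -> list[str]:
    """Return labels for required inputs that are absent or left blank."""
    return [
        label
        for key, label in REQUIRED_FOR_ROI.items()
        if inputs.get(key) in (None, "")
    ]


gathered = {
    "current_price": 2000,
    "quantifiable_result": "Average revenue increase per client",
    "time_to_result": "",
}
print(missing_roi_inputs(gathered))
```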
---
## Process
### Step 1 — Diagnose for Commodity Trap
**Action:** Check whether the offer is currently positioned as a commodity (undifferentiated, price-driven purchase).
Run this checklist:
- [ ] Does the sales conversation include the phrase "we're competitive on price"?
- [ ] Do prospects ask "what's your rate?" before asking about results?
- [ ] Does your marketing look substantially similar to competitors'?
- [ ] Are you priced at, or slightly below, the market average?
- [ ] Has a prospect ever switched to a cheaper competitor for "basically the same thing"?
**WHY:** A commoditized offer forces market-efficiency pricing. In an efficient market, competition drives prices down until margins are just enough to keep the lights on — "just enough" traps the owner in a business that barely sustains itself. This is the commodity trap: the marketplace prices you at the point where staying alive is the best you can hope for.
**IF** 3 or more boxes are checked: the offer is commoditized. Proceed to Step 2 immediately — pricing changes alone will not work; the offer must be differentiated first (see `grand-slam-offer-creation`).
**IF** fewer than 3 boxes are checked: the offer has differentiation signals. Proceed to Step 2 to validate and strengthen the pricing rationale.
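The threshold rule above reduces to a single comparison. A minimal sketch, assuming a hypothetical `diagnose_commodity_trap` function that takes the count of checked boxes from the five-item checklist:

```python
def diagnose_commodity_trap(checked_boxes: int, total_boxes: int = 5) -> str:
    """Apply the Step 1 rule: 3 or more checked boxes means commoditized."""
    if not 0 <= checked_boxes <= total_boxes:
        raise ValueError(f"checked_boxes must be between 0 and {total_boxes}")
    if checked_boxes >= 3:
        return "commoditized: differentiate the offer before changing price"
    return "differentiation signals present: validate the pricing rationale"


for count in (2, 3):
    print(count, "->", diagnose_commodity_trap(count))
```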
> **Anti-pattern AP-1 (Commodity Trap):** Never price by surveying competitors and going slightly below. Those competitors you are copying are operating at thin or negative margins. Copying their pricing means copying their financial outcome. The six-step commodity pricing process — look at marketplace, see what others offer, take the average, go slightly below, add "a little more," end up at "more for less" — is a guaranteed path to market efficiency pricing, not profitable differentiation.
---
### Step 2 — Build the Premium Pricing Rationale
**Action:** Construct the case for why charging above market rate is the correct business and ethical decision. This is the internal conviction work — you must believe the price before you can defend it.
Work through the Virtuous Cycle of Price argument:
**When you lower your price, five things happen:**
1. Client emotional investment decreases (it didn't cost them much, so they don't prioritize it)
2. Client perceived value decreases (cheap signals low quality)
3. Client results decrease (low investment = low follow-through)
4. Client demandingness increases (low-paying clients are the hardest to serve)
5. Your margin for fulfillment decreases (less revenue = less ability to invest in delivery quality)
**When you raise your price, the same five factors reverse:**
1. Client emotional investment increases
2. Client perceived value increases
3. Client results improve (they paid enough that it stings — they show up)
4. You attract clients who are easiest to satisfy and cost less to serve
5. Your margin multiplies — enabling better systems, better people, better outcomes
**WHY:** Premium pricing is not just a revenue strategy — it is a service quality strategy. Higher prices attract clients who are more invested, produce better results, and require less hand-holding. The profit margin that premium pricing creates funds the delivery improvements that justify the premium. The cycle is self-reinforcing.
**Empirical anchor:** In a blind wine-tasting study, participants rated the same wine highest when it carried the highest price tag. Price itself signals value. The goal is not to be slightly above market — it is to be so much higher that a prospect thinks "this must be something entirely different." That creates a category of one with monopoly-level pricing power.
> **Anti-pattern AP-4 (Underpricing Trap):** "I'll start low to build clients, then raise prices later." This is the underpricing cascade. Low prices attract low-investment clients who get poor results (because they aren't invested), which destroys social proof, which makes it harder to raise prices, which forces more low-price clients to fill capacity. There is no strategic benefit to being the second-cheapest in your marketplace. There is a strategic benefit to being the most expensive.
---
### Step 3 — Calculate the Niche Pricing Multiplier
**Action:** Determine how much price expansion is achievable by increasing niche specificity. Apply the niche drill-down to the current offer.
**The multiplier mechanism:** The same core product, repositioned for a progressively narrower audience, commands dramatically higher prices — not because the content changes, but because the perceived relevance and ROI specificity increase.
Work through the niche drill-down using this template:
| Specificity Level | Offer Name Pattern | Price Signal |
|---|---|---|
| Generic | [Core topic] | Commodity price |
| Role-specific | [Core topic] for [Role] | 3-5x |
| Role + context | [Core topic] for [Role] in [Industry] | 15-25x |
| Role + context + situation | [Core topic] for [Role] in [Industry] with [Specific situation] | 50-100x |
**Reference example from source (Time Management):**
- "Time Management" → $19
- "Time Management for Sales Professionals" → $99 (5x)
- "Time Management for Outbound B2B Sales" → $499 (26x)
- "Time Management for Outbound B2B Power Tools & Gardening Sales Reps" → $1,997 (105x)
The core content is the same. The price multiplier comes from the buyer's perception: "This is made exactly for me." A power tools outbound sales rep will happily pay $1,000–$2,000 because one extra deal pays for the program many times over.
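The multipliers in the reference example follow directly from dividing each price by the generic $19 price. The short sketch below reproduces that arithmetic; the `multiplier` function name is an assumption, and the prices are the ones listed in the example.

```python
# Prices from the Time Management reference example, most generic first.
PRICES = [
    ("Time Management", 19),
    ("Time Management for Sales Professionals", 99),
    ("Time Management for Outbound B2B Sales", 499),
    ("Time Management for Outbound B2B Power Tools & Gardening Sales Reps", 1997),
]


def multiplier(price: float, base_price: float) -> float:
    """How many times the generic (commodity) price a niche price represents."""
    return price / base_price


base = PRICES[0][1]
for name, price in PRICES:
    print(f"${price:,} = {multiplier(price, base):.1f}x  {name}")
```

Running the same calculation on your own drill-down shows whether each added layer of specificity is actually moving the price or only lengthening the offer name.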
**Apply to the current offer:**
1. Write the offer name at the generic level. Note the commodity price.
2. Add one layer of specificity (role or industry). Estimate new price.
3. Add a second layer (context or situation). Estimate new price.
4. Add a third layer (specific problem or counterintuitive mechanism). Estimate final price.
5. Select the specificity level that matches the niche chosen in `target-market-selection`.
**WHY:** Niche specificity converts a generic solution into a targeted ROI instrument. The more specific the avatar, the more precisely you can tie the outcome to their measurable business result, and the easier it is to justify a premium price with a concrete payback calculation.
**IF** the niche is already highly specific (Level 3 or 4): proceed directly to Step 4 to frame the ROI conversation.
**IF** the niche is generic (Level 1): return to `target-market-selection` to narrow the avatar before pricing. A generic offer cannot command niche pricing.
> **Anti-pattern AP-9 (Entrepreneur Job-Buying Trap):** Staying generic to "serve everyone" feels like growth but is actually a pricing ceiling. When you serve everyone, your messaging speaks to no one specifically, your price must reflect the generic average, and you can never charge niche premium rates. Niching down feels like shrinking the market — it actually multiplies revenue per customer.
---
### Step 4 — Frame the ROI Conversation
**Action:** Construct the pricing conversation that converts a skeptical prospect (or internal stakeholder) by anchoring to return on investment rather than cost.
This is a structured dialogue, not a pitch. The goal is to make the prospect calculate the answer themselves.
**The ROI framing sequence:**
1. **State the measurable outcome and the evidence base.**
"Based on [N] clients in your situation, the average result is [specific outcome] over [timeframe]."
2. **Ask the ROI question.**
"If I could reliably deliver [outcome] for you, would [price] be worth paying?"
3. **Neutralize the effort concern.**
"And to get that result, you'd only need to invest [hours/week] of your time."
4. **Clarify the timeline.**
"The [outcome] typically materializes within [timeframe]."
5. **Address payment structure.**
"And [payment terms — upfront, deferred, performance-based]."
6. **Let the prospect close themselves.**
If the ROI is genuine, the math does the persuasion. Your job is to present the math accurately, not to pressure.
**Reference example from source (father conversation):**
- "If I made you $239,000 extra this year, would you pay me $42,000?" (The $239,000 figure was the average topline revenue increase for a gym in the program over 11 months.)
- "For sure — if I knew I was going to make that back."
- "About 15 hours a week of work."
- "Eleven months."
- "Nothing upfront. Just pay as you start making the money."
- Response: "Oh — well then, yeah, I would do it."
The prospect did not object to $42,000. He objected to uncertainty. Once the uncertainty was resolved with data, the price was irrelevant.
**WHY:** Price resistance is almost always ROI uncertainty, not price sensitivity. Prospects are not asking "is this too expensive?" They are asking "will I get my money back?" The ROI framing conversation converts price objections into investment calculations. When the math works, the price becomes justified on its own terms.
**Checklist before running this conversation:**
- [ ] Do you have documented evidence of outcomes (survey data, case studies, averages across clients)?
- [ ] Is the outcome quantifiable in the prospect's currency (revenue, profit, hours saved)?
- [ ] Is the price a fraction of the outcome (ideally 10-20% of the value delivered)?
- [ ] Do you genuinely believe the client will achieve the result?
**IF** you cannot answer yes to all four: do not raise the price yet. Build the evidence base first, then price on the evidence.
---
### Step 5 — Set and Validate the Price Point
**Action:** Arrive at a specific recommended price with a rationale.
Use this framework:
1. **Identify the value delivered** (annual revenue increase, cost savings, time saved × hourly rate, or risk avoided).
2. **Price at 10–20% of that value.** This keeps the price-to-value gap large enough that the client always perceives a deal.
3. **Check against competitor ceiling.** Aim to be 3x or more above the next-highest competitor — this triggers the "category of one" perception.
4. **Confirm against niche multiplier** from Step 3 — the final price should align with the specificity level of the offer.
5. **State the price confidently without a discount framing.** The price is what it is because the value is what it is.
**Output format:**
> **Recommended price:** $[X]
> **Value delivered:** $[Y] (based on [evidence source])
> **Price-to-value ratio:** [X/Y]% — client receives approximately $[Y/X] for every $1 spent
> **Vs. market:** [Xm]x above low-price competitor, [Xh]x above high-price competitor
> **Payment structure:** [Upfront / Deferred / Performance-based]
> **Rationale:** [2-3 sentences connecting niche specificity, measurable outcome, and evidence base]
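
The ratio fields in the output format can be computed rather than estimated. A rough sketch, assuming the 10-20% guideline from the framework above and using the $239,000 / $42,000 figures from the Step 4 reference example; the function name `price_summary` and its parameters are hypothetical:

```python
def price_summary(value_delivered, price, low_pct=10, high_pct=20):
    """Compare a price with the 10-20%-of-value guideline."""
    ratio_pct = price / value_delivered * 100
    return {
        # Price range implied by the guideline.
        "band": (value_delivered * low_pct / 100, value_delivered * high_pct / 100),
        # Price-to-value ratio, as a percentage.
        "ratio_pct": round(ratio_pct, 1),
        # Dollars of value the client receives per dollar spent.
        "return_per_dollar": round(value_delivered / price, 2),
        "within_band": low_pct <= ratio_pct <= high_pct,
    }


summary = price_summary(value_delivered=239_000, price=42_000)

if __name__ == "__main__":
    print(summary)
```

For these inputs the price sits at about 17.6% of the value delivered, inside the guideline band of $23,900 to $47,800.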
---
## Examples
### Example A: Generic Fitness Coach → Niche Premium Offer
**Before:** Generic "weight loss coaching" priced at $197/month to compete with local gyms.
**Niche drill-down:**
- "Weight loss coaching" → $197/month (commodity)
- "Weight loss for shift nurses" → $997/month (nurse's hourly wage × overtime means $1k/month is less than 2 overtime shifts)
**ROI frame:** "Nurses on night shift have a 40% higher obesity rate and 3x higher cardiovascular risk. If this program reduces your health risk and improves your sleep quality enough to eliminate 2 sick days per year, you've already recovered the cost. Most participants also reduce their overtime reliance by 1 shift/month."
**Recommended price:** $997/month
**Rationale:** Same core program, niche-repositioned for an avatar with high pain, clear purchasing power, and a specific measurable outcome (health metrics, sick days, overtime reduction).
---
### Example B: Agency Services → Differentiated Offer
**Before:** Marketing agency charging $1,000/month retainer (commodity offer), losing clients to cheaper alternatives.
**Diagnosis:** Clients can compare retainer to retainer — pure commodity. When a cheaper option appears, the value discrepancy causes churn.
**Differentiated offer structure:** "Pay one time. No retainer. I generate and work your leads. Pay only when people show up. Guarantee: 20 qualified appointments in your first month or your next month is free."
**ROI frame (lead generation agency example from source):**
- Commodity offer: $1,000 upfront + $1,000/month, 5 clients closed per $10,000 ad spend → ROAS 0.5:1 (losing money on acquisition)
- Differentiated offer: $3,997 one-time, same ad spend → 28 clients closed → ROAS 11.2:1
**Multiplier:** Same ad spend. 2.5x response rate (more compelling offer). 2.3x close rate (more value). 4x price. Net result: 22.4x more cash collected.
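
The ROAS figures above can be reproduced from the client counts and prices. A small sketch, assuming ROAS here means upfront cash collected divided by ad spend (which is what the 0.5:1 figure implies, since it counts only the $1,000 upfront payment); the function name `roas` is illustrative:

```python
def roas(clients_closed, price, ad_spend):
    """Return on ad spend: upfront cash collected per dollar of ad spend."""
    return clients_closed * price / ad_spend


AD_SPEND = 10_000
commodity = roas(clients_closed=5, price=1_000, ad_spend=AD_SPEND)
differentiated = roas(clients_closed=28, price=3_997, ad_spend=AD_SPEND)
cash_multiple = differentiated / commodity

if __name__ == "__main__":
    print(f"Commodity ROAS:      {commodity:.1f}:1")
    print(f"Differentiated ROAS: {differentiated:.1f}:1")
    print(f"Cash multiple:       {cash_multiple:.1f}x")
```

Multiplying the rounded factors (2.5 x 2.3 x 4) gives 23 rather than 22.4; the difference is rounding in the individual multipliers. The 22.4x figure comes from the unrounded client counts and prices.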
---
## Key Principles
1. **Price is a signal, not just a number.** Raising price increases perceived value — empirically, not just theoretically. Price too high creates allure ("there must be something different here"). Price at market creates comparison. Price below market creates suspicion.
2. **The goal is not the most customers — it is the most profit.** Getting people to buy is not the objective. Making money is. These are different optimization targets with different pricing strategies.
3. **You can only lower price to $0, but you can raise price infinitely.** There is no strategic advantage to being the second-cheapest. There is a clear strategic advantage to being the most expensive.
4. **Margin funds quality.** Low prices destroy the margin needed to invest in better delivery. Premium prices fund the systems, people, and attention that justify the premium. The virtuous cycle requires the premium to begin spinning.
5. **Conviction precedes permission.** You cannot charge a price you do not believe in. Conviction comes from documented results. Document results, then price on the evidence.
6. **Niche specificity is a price multiplier, not a market constraint.** Niching down feels like reducing the addressable market. In practice, it multiplies revenue per customer. For most businesses under $10M/year, a narrower offer that serves fewer clients at premium prices is more profitable than a broader offer that serves more clients cheaply.
---
## References
- `value-equation-offer-audit` — Quantify the four value dimensions (dream outcome, perceived likelihood, time delay, effort/sacrifice) that underpin the price-to-value gap established here
- `target-market-selection` — Prerequisite: niche selection determines specificity level and the ROI baseline used in Step 4
- `grand-slam-offer-creation` — Build the differentiated offer structure that escapes the commodity trap diagnosed in Step 1; premium pricing requires a premium offer
**Source:** $100M Offers, Alex Hormozi — Chapters 3 ("Pricing: The Commodity Problem"), 4 ("Pricing: Finding the Right Market — A Starving Crowd"), 5 ("Pricing: Charge What It's Worth"), pages 38–74
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills), $100M Offers by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-target-market-selection`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Apply when you need to name a new offer, program, service, or promotion — or when an existing offer's response rate has dropped and you suspect the name is t...
---
name: offer-naming-magic-formula
description: Apply when you need to name a new offer, program, service, or promotion — or when an existing offer's response rate has dropped and you suspect the name is the bottleneck. Generates 3-5 testable offer name variants using the five-component naming framework (magnetic reason, target audience, goal, time interval, container word) and produces a prioritized offer refresh plan to combat audience fatigue.
tags: [naming, copywriting, offers, marketing, branding]
depends-on: [grand-slam-offer-creation]
---
# Offer Naming: MAGIC Formula
## When to Use
Use this skill when you:
- Have a completed offer (value stack, pricing, bonuses) and need a compelling name before promoting it
- Notice declining response rates, click-through rates, or lead volume on a campaign that used to perform well
- Are re-entering a market after a seasonal break and want a fresh angle on the same core offer
- Want to generate multiple name variants to A/B test in ads or landing pages
- Are naming individual bonuses, sub-items, or components within a larger bundle
Do not use this skill to redesign the offer itself. The underlying product, pricing, and value stack remain unchanged. You are only changing the wrapper — the external perception of what the offer is called.
## Context and Input Gathering
Before generating names, collect:
1. **The core offer** — what is actually delivered (program, service, product, consultation, challenge, etc.)
2. **Target audience** — who specifically (demographics, geography, role, pain point identity)
3. **Primary dream outcome** — the single most desirable result the buyer wants
4. **Expected or proven timeframe** — how long until they see results
5. **Promotional hook or reason** — is there a season, event, anniversary, or discount angle to lead with?
6. **Current name (if refreshing)** — what name has fatigued and why you believe it has
If any of these are unclear, ask before generating names. A vague goal produces a vague name.
## Process
### Step 1: Map the Five Components
Work through each component of the naming framework (MAGIC) and generate 3-5 options per component. You will mix and match these in Step 2.
---
**M — Magnetic Reason (Attention)**
A word or short phrase at the start of the name that answers: *Why is this offer available right now?* or *What's in it for me?*
This creates the hook that causes a prospect to stop scrolling. It does not need to be a logical business reason — seasonal anchors, community events, and promotional mechanics all work.
Options include:
- Discount or access signals: Free, 88% Off, $X Off, Giveaway, Bonus
- Seasonal or calendar anchors: Spring, Back to School, Halloween, New Year, Grand Opening, Anniversary
- Emotional or identity hooks: New Management, New Building, Celebration, Challenge
If your offer is free or discounted, lead with that. Scarcity and discount are the strongest magnetic reasons. See `scarcity-and-urgency-tactics` for how to create urgency that reinforces this component.
---
**A — Avatar (Discrimination)**
Call out exactly who this offer is for. The more specific, the higher the conversion — especially in local or niche markets.
- For local markets: go sub-city, not city. Not "Dallas Moms" — "Lakeway Moms." Not "Chicago Dentists" — "Hinsdale Dentists."
- For online markets: use role, identity, or pain point. "Salon Owners," "Retired Athletes," "Brick and Mortar Business Owners"
- Omit the avatar only if your offer has extremely broad appeal and adding it would make the name too long
The implicit-egotism effect explains why this works: people are drawn to things that resemble them. Seeing their own identity in a name dramatically increases the likelihood they will read the next word.
---
**G — Goal (Purpose)**
State the dream outcome in as few words as possible. Make it specific and tangible — not a process, but a result.
- Weak: "Improve Your Health"
- Strong: "Pain Free," "Celebrity Smile," "Never Out of Breath," "Little Black Dress," "First Client," "7 Figure"
Use numbers where they add credibility and specificity. Emotions, events, and status markers all work. The more the prospect can visualize the end state, the stronger the pull.
---
**I — Interval (Timeline)**
State how long the program takes or how quickly results arrive. This sets expectation, reduces perceived risk, and creates a defined commitment window.
- Format: "4 Hour," "21 Day," "6 Week," "3 Month," "90 Day," "12 Week"
- Compliance note: If your goal component makes a quantifiable claim (e.g., income or weight loss), many advertising platforms will reject combining a stated outcome with a stated duration — it implies a guarantee. In that case, use a non-quantifiable goal with the interval ("Make Your First Sale in 10 Days" rather than "$10,000 in 10 Days"), or omit the interval. Check platform ad policies before publishing.
---
**C — Container Word (Method)**
The final word signals that this is a bundled, systematic offering — not a single session or commodity service. It elevates perceived value by implying a complete system.
A strong container word makes the offer feel proprietary and hard to comparison-shop.
High-performing container words: Challenge, Blueprint, Bootcamp, Intensive, Masterclass, Accelerator, Sprint, Fast Track, Shortcut, System, Transformation, Deep Dive, Detox, Reset, Liftoff, Launch, Game Plan, Cheatcode, Incubator, Mastermind, Workshop, Comeback
Match the energy of the container word to your audience's appetite for intensity:
- High-intensity audiences (athletes, sales teams): Bootcamp, Assault, Attack, Explosion
- Professional audiences: Blueprint, Intensive, Masterclass, Workshop
- Mass-market audiences: Challenge, System, Program, Reset
---
### Step 2: Assemble Name Variants
Combine components into 5-8 candidate names. Rules:
- Use 3 to 5 components per name. You rarely need all five; names that use every component are usually too long.
- Order does not have to follow M-A-G-I-C. Rearrange for rhythm and punch.
- Shorter and punchier beats longer and complete. If two names carry equal meaning, prefer the shorter one.
- Test rhyming: names that rhyme are more memorable and more likely to be repeated by word-of-mouth. Use a rhyming dictionary (search "rhyming dictionary") on your goal and container words.
- Test alliteration: make two or more words start with the same letter or sound. Easier than rhyming and nearly as sticky. Examples: Make Money Masterclass, Debt Detox, Life Coach Liftoff, Big Booty Bootcamp.
- Do not force rhyme or alliteration if it produces an awkward name. It is a bonus, not a requirement.
Name length target: 3-6 words is the sweet spot. Under 3 words often lacks specificity. Over 7 words is hard to say aloud or remember.
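
Mixing and matching can be done mechanically before you apply judgment about rhythm. A minimal sketch, assuming one option is taken from each chosen component and joined in a chosen order; the component options are taken from examples in this skill, while `assemble` and the dictionary layout are hypothetical:

```python
from itertools import product


def assemble(components, order):
    """Build every name that takes one option from each listed component."""
    pools = [components[key] for key in order]
    return [" ".join(parts) for parts in product(*pools)]


components = {
    "M": ["Free"],
    "G": ["Stress Release", "Pain Free"],
    "I": ["Six-Week", "42-Day"],
    "C": ["Challenge", "Fast Track"],
}

# Four components, arranged M-I-G-C for rhythm rather than strict M-A-G-I-C order.
candidates = assemble(components, order=["M", "I", "G", "C"])

# Keep names inside the 3-6 word sweet spot (hyphenated terms count as one word).
shortlist = [name for name in candidates if 3 <= len(name.split()) <= 6]

if __name__ == "__main__":
    for name in shortlist:
        print(name)
```

The output is raw material only: read each candidate aloud and discard the awkward ones before moving on to selection.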
---
### Step 3: Select 3-5 Names to Test
From your candidate list, choose the 3-5 names that best satisfy:
1. Clarity — a stranger immediately understands who it is for and what they will get
2. Specificity — contains at least one concrete detail (audience, outcome, or timeframe)
3. Distinctiveness — sounds different from what competitors are running
4. Sayability — easy to say aloud without stumbling
Run 2-3 of these names simultaneously in your advertising. Track response rate (clicks, leads, cost per lead) for each. The name with the lowest cost per lead becomes your control. Test new names against the control, not against each other.
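
Choosing the control is a simple comparison of cost per lead. A sketch with made-up numbers; the spend and lead figures below are hypothetical test data, and `pick_control` is an illustrative name:

```python
def pick_control(results):
    """Return (control name, cost-per-lead table) from {name: (spend, leads)}."""
    # Names with zero leads have no defined cost per lead, so they are skipped.
    cpl = {
        name: spend / leads
        for name, (spend, leads) in results.items()
        if leads > 0
    }
    return min(cpl, key=cpl.get), cpl


# Hypothetical results: name -> (ad spend in dollars, leads generated).
results = {
    "Free Six-Week Stress Release Challenge": (500, 40),
    "42-Day Relaxing Holidays Challenge": (500, 25),
    "Six-Week Stress Release Challenge": (500, 10),
}
control, cpl = pick_control(results)

if __name__ == "__main__":
    for name, cost in sorted(cpl.items(), key=lambda item: item[1]):
        print(f"${cost:6.2f} per lead  {name}")
    print("Control:", control)
```

In later rounds, include only the current control and the new challengers in `results`, so each new name is compared against the control.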
---
### Step 4: Apply the Formula to Sub-Items and Bonuses
Apply the same naming process to every item in your bonus stack (see `bonus-stacking-system`). A named bonus ("The 5-Day Product Launch Blueprint") converts better than an unnamed bonus ("Product Training"). The name communicates value before the prospect reads the description.
---
### Deliverable
At the end of this process you should have:
- A naming worksheet with 3-5 named offer variants ready to run in ads
- Each name annotated with the MAGIC components it uses
- A designated A/B test plan: which 2-3 names run first, what metric determines the winner, and which new names will be tested against the winning control
---
## Offer Fatigue Remediation Sequence
When lead volume drops or cost per acquisition rises on a previously performing offer, work through these steps in order. Do not skip ahead. The further down the list you go, the more operationally disruptive the change.
**Step 1 — Refresh the creative (ad images and video)**
Swap the visual assets in your ads while keeping all copy identical. Most platform algorithms favor fresh creative and will re-distribute your ad to new audiences. Cost: low. Time: 1-2 days.
**Step 2 — Refresh the ad body copy**
Keep the headline and offer unchanged; rewrite the supporting copy (hooks, story, social proof) with a different angle. Addresses copy blindness without touching the offer itself.
**Step 3 — Rename the offer (change the wrapper)**
Apply the MAGIC formula to generate a new name for the identical core offer. Example: "Six-Week Stress Release Challenge" becomes "42-Day Relaxing Holidays Challenge." Same sessions, same price, same delivery — different wrapper. This is the highest-leverage low-cost refresh available.
**Step 4 — Change the seasonal or reason-why component**
Update the magnetic reason to match the current calendar or a new promotional hook. Example: "Holiday Hangover" becomes "New Year New You." The goal and container word stay the same; only the reason-why changes.
**Step 5 — Adjust the time interval**
Change the stated duration of the program. Six weeks becomes 28 days or eight weeks. This creates a perception of novelty and may also improve conversion for audiences who found the original duration too long or too short.
**Step 6 — Change the promotional enhancer**
Modify the free or discount component. If you were offering 50% off, switch to a free bonus. If you were offering a free session, switch to a dollar-amount discount. The underlying offer economics may be similar; the framing changes.
**Step 7 — Restructure the monetization model**
Change the price points, payment structure, or sequence of offers presented to prospects. This is the most operationally heavy intervention and should only be attempted after exhausting steps 1-6.
Important: Most offer fatigue resolves at steps 1-3. Jumping to step 7 when step 3 would have worked wastes operational capacity and introduces unnecessary business risk. Work the list in order.
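
The "work the list in order" rule can be expressed as a lookup that always returns the shallowest step not yet tried. A minimal sketch; the step labels are copied from the sequence above, and `next_step` is a hypothetical helper:

```python
FATIGUE_STEPS = [
    "Refresh the creative (ad images and video)",
    "Refresh the ad body copy",
    "Rename the offer (change the wrapper)",
    "Change the seasonal or reason-why component",
    "Adjust the time interval",
    "Change the promotional enhancer",
    "Restructure the monetization model",
]


def next_step(steps_tried):
    """Return (number, label) of the lowest-numbered untried step, or None."""
    for number, label in enumerate(FATIGUE_STEPS, start=1):
        if number not in steps_tried:
            return number, label
    return None  # all seven steps have been tried


if __name__ == "__main__":
    print(next_step({1, 2}))  # creative and copy already refreshed
```

Because the lookup scans from the top, a skipped step is surfaced before any deeper one: if steps 1 and 3 were tried, it returns step 2.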
---
## Warning: AP-8 — Lazy Naming and Offer Fatigue Misdiagnosis
The most common and costly naming mistake is treating the name as an afterthought. The same offer, named differently, routinely produces 2x to 10x differences in response rate. "Free Six-Week Stress Release Challenge" and "Float Tank Center Session" may describe identical services — but only one of them gets clicked.
Lazy naming symptoms:
- The name describes what you do rather than what the client gets ("Consulting Package" vs. "7-Figure Agency Blueprint")
- The name uses internal jargon the prospect does not use to describe their own problem
- The name has no specificity — no audience, no outcome, no timeframe
- The offer has been running for over a year with no name variation
Offer fatigue misdiagnosis: before restructuring your pricing, delivery, or value stack, verify that the name is not the problem. Run a new name on the identical offer for two to four weeks. If lead volume recovers, the offer was never fatigued — it was just stale in its wrapper.
---
## Multi-Industry Examples
The examples below demonstrate how the same five components combine differently across industries.
### Wellness and Fitness
| Name | M | A | G | I | C |
|------|---|---|---|---|---|
| Free Six-Week Lean-By-Halloween Challenge | Free | — | Lean By Halloween | Six-Week | Challenge |
| 88% Off 12-Week Bikini Blueprint | 88% Off | — | Bikini | 12-Week | Blueprint |
| Free 21-Day Mommy Makeover | Free | Moms | Makeover | 21-Day | — |
| Six-Week Stress Release Challenge | — | — | Stress Release | Six-Week | Challenge |
| (Free!) Bend Over Pain Free in 42 Days Healing Fast Track | Free | — | Pain Free | 42 Days | Fast Track |
| Six-Pack Fast Track | — | — | Six-Pack | — | Fast Track |
### Medical and Dental
| Name | M | A | G | I | C |
|------|---|---|---|---|---|
| $2,000-Off Celebrity Smile Transformation | $2,000 Off | — | Celebrity Smile | — | Transformation |
| Lakeway Moms — 12 Months to a Perfect Smile | — | Lakeway Moms | Perfect Smile | 12 Months | — |
| Back to School Free Braces Giveaway | Free / Back to School | — | Braces | — | Giveaway |
| Grand Opening Free X-Ray and Treatment — Instant Relief | Free / Grand Opening | — | Instant Relief | — | — |
| Back Sore No More! 90-Day Rapid Healing Intensive (81% off) | 81% Off | — | No More Back Pain | 90 Day | Intensive |
### Coaching and Business
| Name | M | A | G | I | C |
|------|---|---|---|---|---|
| 5 Clients in 5 Days Blueprint | — | — | 5 Clients | 5 Days | Blueprint |
| 7-Figure Agency 12-Week Intensive | — | Agency Owners | 7 Figure | 12 Week | Intensive |
| 14-Day Find Your Perfect Product Launch | — | — | Find Your Perfect Product | 14 Day | Launch |
| Fill Your Gym in 30 Days (Free!) | Free | Gym Owners | Fill Your Gym | 30 Days | — |
| Make Money Masterclass | — | — | Make Money | — | Masterclass |
### Pattern Observations
- Wellness names frequently lead with the discount or free offer (magnetic reason first) because the audience is price-sensitive and aspirational goals are highly visual
- Medical names often embed the avatar (hyper-local: "Lakeway Moms") because local trust is the primary purchase driver
- Coaching names often lead with the goal and interval because the audience is outcome-oriented and wants to know the timeline before the price
- Container words shift by audience sophistication: "Challenge" for mass market, "Intensive" or "Masterclass" for professional buyers, "Blueprint" for analytical buyers
---
## Key Principles
1. **You are changing the wrapper, not the offer.** The product, price, and delivery remain identical. The name is the exterior perception — the cover of the book.
2. **Specificity converts.** A name that clearly identifies a person, a result, and a timeframe will almost always outperform a generic name, even if the generic name sounds more "professional."
3. **3 to 5 components is the optimal range.** Fewer than three components produces names that lack enough signal. Using all five often produces names that are too long to say or remember.
4. **Order follows rhythm, not formula.** Rearrange MAGIC components to produce the punchiest, most natural-sounding sequence. Read the name aloud. If you stumble, reorder.
5. **Test, do not predict.** No expert can reliably predict which name will win. Generate 3-5 strong candidates, run them simultaneously, and let the market decide.
6. **Exhaust shallow changes before deep ones.** When an offer fatigues, change the creative first, then the copy, then the name. Only restructure the underlying offer if all surface-level changes have failed.
7. **Name your bonuses too.** Every item in your value stack is a mini-offer. A named bonus with a clear outcome communicates more value than a generic module title.
---
## References
- `grand-slam-offer-creation` — Build the underlying value stack before naming it; naming a weak offer does not fix the offer
- `bonus-stacking-system` — Apply the naming formula to each bonus and sub-item in the stack
- `scarcity-and-urgency-tactics` — Use seasonal anchors and limited availability as magnetic reason-why components
- Source: $100M Offers, Alex Hormozi, Chapter 16 "Enhancing the Offer: Naming," pages 185-195
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills), $100M Offers by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-grand-slam-offer-creation`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Design, select, and word a risk-reversal guarantee for a product or service offer. Use this skill when the user wants to add a guarantee to an offer, asks "w...
---
name: guarantee-design-and-selection
description: Design, select, and word a risk-reversal guarantee for a product or service offer. Use this skill when the user wants to add a guarantee to an offer, asks "what kind of guarantee should I offer," says prospects are hesitant or objecting to the price or risk, wants to reduce refund fear without killing conversion, asks how to guarantee results, wonders whether to offer a money-back guarantee, wants to switch from a retainer pricing model to a performance model, needs to improve offer conversion rate, asks "what happens if they don't get results," wants to stack multiple guarantees, or is designing a new high-ticket offer and needs a risk-reversal mechanism — even if they don't explicitly mention "guarantee" or "risk reversal." This skill produces a guarantee recommendation with type, wording, and ROI projection. For building the full offer stack see grand-slam-offer-creation. For auditing perceived value before writing the guarantee see value-equation-offer-audit.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/100m-offers/skills/guarantee-design-and-selection
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: 100m-offers
title: "$100M Offers"
authors: ["Alex Hormozi"]
chapters: [15]
tags: [guarantees, risk-reversal, offers, sales, pricing]
depends-on: [value-equation-offer-audit]
execution:
tier: 2
mode: hybrid
inputs:
- type: context
description: "Description of the product or service, current pricing model, target customer, and any known objections"
tools-required: [Write]
tools-optional: [Read]
mcps-required: []
environment: "Any agent environment"
---
# Guarantee Design and Selection
## When to Use
You are designing or improving an offer and need a guarantee that reverses buyer risk, increases conversion, and protects the business from catastrophic refund exposure. Typical triggers:
- You have an offer but prospects keep hesitating at the risk of not getting results
- You are building a new high-ticket service and need a risk-reversal mechanism
- You want to test whether a stronger guarantee could lift your conversion rate
- You are switching from retainer pricing to a performance or revshare model
- You want to stack multiple guarantees to build an exceptionally compelling offer
- Your current guarantee is weak ("satisfaction guaranteed") and you want to sharpen it
**Preconditions to verify:**
- Is the product or service capable of delivering what it promises? A guarantee on a poor product will backfire into mass refunds.
- Does the user know their approximate current close rate and refund rate? (Needed for ROI math in Step 4)
- Does the user know their fulfillment cost per customer? (Needed for type selection in Step 3)
**This skill does NOT cover:**
- Building the full offer (core deliverable, bonuses, pricing) — use `grand-slam-offer-creation`
- Auditing whether the underlying perceived value is strong enough to sell — use `value-equation-offer-audit`
- Stacking bonuses alongside the guarantee — cross-reference `bonus-stacking-system`
## Context & Input Gathering
### Required Context (must have — ask if missing)
- **Product or service description:** What is being sold, and what specific outcome does it produce?
-> Check prompt for: deliverable names, service type, offer description
-> If missing, ask: "What are you selling and what result does the customer get?"
- **Pricing tier and business model:** Low-ticket B2C product, high-ticket B2B service, coaching program, digital course, agency, SaaS, etc.
-> Check prompt for: price points, mentions of "retainer," "subscription," "course," "coaching"
-> If missing, ask: "What is the price and how do customers pay? (one-time, retainer, performance-based)"
- **Fulfillment cost:** Is there significant cost to deliver the service — staff time, ad spend, materials, travel?
-> If high fulfillment cost: steer toward Conditional or Anti-Guarantee (not Unconditional)
-> If low fulfillment cost (digital, info product): Unconditional is viable
### Observable Context (gather from environment)
- **Existing offer document or sales page:** Look for a file describing the current offer
-> If found: read it before recommending a guarantee type
- **Current refund or cancellation data:** Any indication of baseline refund rate
-> If available: use in ROI projection (Step 4)
### Default Assumptions
- If fulfillment cost is unknown: assume moderate — recommend Conditional as the default safe choice
- If close rate is unknown: use 100 baseline sales for ROI math and note the assumption
- If customer segment is unclear: assume B2C for ticket < $1,000, B2B for ticket of $1,000 or more
### Sufficiency Threshold
```
SUFFICIENT when ALL of these are true:
- Product/service description is known
- Approximate price point is known
- Fulfillment cost structure is known (high vs low)
PROCEED WITH DEFAULTS when:
- Price and product are known but fulfillment cost is unclear
MUST ASK when:
- Product or service is completely undefined
```
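
The threshold above can be read as a three-way decision. A sketch under one stated assumption: the source does not say what to do when the product is known but the price is not, so this version treats that case as MUST ASK, because price is listed under Required Context. The function name and parameters are hypothetical:

```python
def sufficiency(product_known, price_known, fulfillment_cost_known):
    """Map the three required inputs to a gathering decision."""
    if product_known and price_known and fulfillment_cost_known:
        return "SUFFICIENT"
    if product_known and price_known:
        # Fulfillment cost unclear: fall back to the default assumptions
        # (moderate cost, Conditional as the safe choice).
        return "PROCEED WITH DEFAULTS"
    # Product undefined, or price missing (assumption, see lead-in).
    return "MUST ASK"


if __name__ == "__main__":
    print(sufficiency(True, True, False))
```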
## Process
### Step 1: Understand the Core Guarantee Structure
**ACTION:** Before selecting a type, confirm understanding of what makes a guarantee effective. Every strong guarantee follows this template:
> **"If you do not get [X result] in [Y time period], we will [Z consequence]."**
The Z component — what happens if they fail — is what gives the guarantee power. Without it, you just have a vague claim. Always complete all three parts before moving to type selection.
**Examples of weak vs strong guarantee wording:**
| Weak (incomplete) | Strong (complete) |
|---|---|
| "We guarantee 20 clients." | "You will get 20 clients in your first 30 days, or we give you your money back plus your advertising dollars spent with us." |
| "Satisfaction guaranteed." | "If at any time you don't feel you received $500 in value and service from us, I will write you a check the day you tell me." |
| "Results guaranteed." | "If you don't lose your first 5 pounds in 14 days, we continue your program at no charge until you do." |
**WHY:** The "or we will Z" clause is what triggers the prospect's imagination to picture themselves succeeding. Without a consequence, the guarantee is noise. With it, the prospect mentally simulates the scenario where everything goes well — which is the psychological moment of purchase.
**IF** the user already has guarantee wording that omits the Z clause -> flag this before proceeding. Completing the template alone can lift conversion without changing the guarantee type.
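
One way to enforce the three-part template is to refuse to render wording until X, Y, and Z are all filled in. A minimal sketch; the `Guarantee` class and its field names are hypothetical, and the example text is adapted from the weak vs strong table above:

```python
from dataclasses import dataclass


@dataclass
class Guarantee:
    result: str       # X: the promised result
    period: str       # Y: the time period
    consequence: str  # Z: what you do if the result is not reached

    def missing_parts(self):
        """List the template parts that are still empty."""
        parts = {
            "X (result)": self.result,
            "Y (time period)": self.period,
            "Z (consequence)": self.consequence,
        }
        return [label for label, text in parts.items() if not text.strip()]

    def wording(self):
        """Render the template; raise if any part is missing."""
        missing = self.missing_parts()
        if missing:
            raise ValueError("Incomplete guarantee, missing: " + ", ".join(missing))
        return (f"If you do not get {self.result} in {self.period}, "
                f"we will {self.consequence}.")


weak = Guarantee(result="20 clients", period="", consequence="")
strong = Guarantee(
    result="20 clients",
    period="your first 30 days",
    consequence="give you your money back plus your advertising dollars spent with us",
)

if __name__ == "__main__":
    print(weak.missing_parts())
    print(strong.wording())
```

Running `missing_parts()` on existing wording is a quick way to spot a guarantee that omits the Z clause before moving on to type selection.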
### Step 2: Identify Which of the Four Guarantee Types Applies
**ACTION:** Map the business context to one of the four guarantee types. Read all four descriptions, then score each against the user's situation.
---
#### Type 1: Unconditional Guarantee
**What the customer gets:** A refund with no questions asked and no requirements. They pay, try it, and can get their money back for any reason — full refund, partial refund, or refund plus bonus amount.
**Variants:**
- Full money back ("no questions asked" within X days)
- Partial refund (50%)
- Refund of ancillary costs (ad spend, travel, materials)
- Refund plus competitor's program paid for
- Refund plus additional cash payment ($500, $1,000)
- Named creative guarantee ("Club a Baby Seal Guarantee: after 30 days, if you wouldn't club a baby seal to stay, you don't pay a penny")
**Best for:** Low-ticket consumer products and digital offers where fulfillment cost is low. The more conditions you add to an unconditional guarantee, the weaker it becomes. Works best when you are highly confident in your product and your customers.
**Risk:** You bear full risk of both refund cost AND fulfillment cost. If someone does not achieve results for any reason — including their own lack of effort — you still pay. High consumer-facing volume businesses absorb this well; high-cost-to-deliver services cannot.
**Selection criteria:**
- Ticket price: low to mid (under ~$3,000 for services, any price for digital products)
- Fulfillment cost: low (no significant variable cost per customer)
- Customer behavior: B2C or mass market, where most people won't bother refunding
---
#### Type 2: Conditional Guarantee
**What the customer gets:** A strong outcome guarantee — often better than a money-back — but the customer must satisfy specific conditions (complete key actions that drive success) to qualify.
**Variants:**
- Outsized refund (double or triple money back) if conditions are met
- Service guarantee: you keep working for them free until X is achieved (no time limit)
- Modified service guarantee: you extend service for an additional Y period free of charge
- Credit-based guarantee: refund given as credit toward any service you offer
- Personal service guarantee: you work one-on-one with them free until they reach X (strongest conditional)
- Hotel and airfare guarantee: refund product price plus travel costs if attending an event
- Wage-payment guarantee: pay their hourly rate if they don't find the session valuable
- Release of service guarantee: let them out of their contract with no cancellation fee
- Delayed second payment: don't bill the second installment until they achieve their first outcome
- First outcome guarantee: cover their ancillary costs (ad spend, materials) until they get their first result
**Best for:** High-ticket services, coaching, agencies, and B2B where (1) the customer must take action to succeed, (2) you want to reduce refund risk while still offering a compelling guarantee, and (3) you know the key actions that produce success.
**Key insight:** In the ideal conditional guarantee, 100% of customers qualify for it — because 100% of them followed the conditions — but 100% of them achieved the result and therefore don't want it. This structure also creates better customer outcomes because the conditions ARE the success path.
**Selection criteria:**
- Ticket price: any, but especially high-ticket (> $3,000)
- Fulfillment cost: moderate to high
- Business model: services, coaching, programs, agencies
- You know the key actions that produce results for your customers
**Pro tip (Unconditional vs Conditional by business type):** Bigger, broader guarantees work better with lower-ticket B2C businesses (many people won't bother to claim them). The higher the ticket and the more B2B the context, the more you want specific conditional guarantees. These may or may not include refunds and may or may not have time limits.
---
#### Type 3: Anti-Guarantee
**What the customer gets:** Explicit notice that all sales are final. No refund is possible.
**How to use it:** You must own this position with a compelling "reason why" that the customer can immediately understand and think "Yes, that makes sense." The reason should show your vulnerability — something you expose by working with them that you cannot take back.
**Example framing:** "We are going to show you the proprietary process we use right now to generate leads in our own business — our funnels, ads, and live metrics. Because we're exposing the inner workings of our operation, all sales are final."
**Another example framing:** "If you're the type of customer who needs a guarantee before taking a jump, you are not the type of person we want to work with. We want motivated self-starters who are not looking for a way out before they even begin."
**Best for:** Products or services where value is delivered instantly upon access (code, proprietary data, confidential methodology, information that once seen cannot be unseen). Also works for high-ticket services that require heavy customization, where refunds would mean absorbing full labor cost with nothing to show.
**Selection criteria:**
- Product is consumable or permanently transfers knowledge/access on delivery
- You have a genuine and believable reason why a refund is impossible
- You are targeting serious, committed buyers (anti-guarantee can actually filter out low-quality prospects)
---
#### Type 4: Implied Guarantee (Performance Models)
**What the customer gets:** A pricing structure where you do not get paid unless the customer gets the outcome. No performance, no payment. The guarantee is structural — built into the deal itself.
**Variants:**
- Performance: $X per sale made, $X per pound lost, $X per show
- Revshare: 10–25% of top-line revenue or revenue growth from baseline
- Profit-share: X% of gross or net profit generated
- Ratchets: 10% if over X, 20% if over Y, 30% if over Z (escalating performance fees)
- Bonuses/Triggers: receive X when Y event occurs
- Hybrid floor + performance: "the greater of $1,000/mo or 10% of revenue generated"
- Ramp model: fixed retainer for first 3 months to cover setup, then switch to 100% performance
**Net effect:** If you don't perform, they don't pay. If you perform exceptionally, you are very well compensated.
**Best for:** Agencies, consultants, media buyers, and service providers who generate quantifiable outcomes (revenue, leads, weight lost, deals closed). Requires outcome transparency — both parties must be able to measure the result and trust the tracking.
**Why it is powerful:** Creates perfect incentive alignment. You are accountable to results. Low performers are naturally weeded out. The agency/consultant case study (CS-12): agencies switching from retainer models to performance models have gone from $20k/month to $200k+/month in a matter of months because the client has no reason to say no — the risk is entirely on the service provider.
**Selection criteria:**
- Outcome is quantifiable (revenue, leads, conversions, measurable physical result)
- You have a transparent measurement mechanism both parties trust
- You are confident in your ability to deliver — this model rewards the best performers most
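Two of the Type 4 variants above, the hybrid floor and the ratchet, are simple enough to express as arithmetic. The sketch below is illustrative only. The function names and the tier thresholds are hypothetical (the source leaves X, Y, and Z unspecified), and it assumes the ratchet rate applies to all revenue rather than only the portion above each threshold.

```python
def hybrid_fee(revenue: float, floor: float = 1_000.0, rate: float = 0.10) -> float:
    """Hybrid floor + performance: the greater of a flat floor or a revenue share."""
    return max(floor, rate * revenue)


def ratchet_fee(revenue: float, tiers: list[tuple[float, float]]) -> float:
    """Ratchet: use the rate of the highest threshold that revenue exceeds.

    `tiers` is a list of (threshold, rate) pairs. Assumption: the chosen rate
    applies to all revenue, not only to the portion above the threshold.
    """
    rate = 0.0
    for threshold, tier_rate in sorted(tiers):
        if revenue > threshold:
            rate = tier_rate
    return rate * revenue


# Hypothetical thresholds standing in for X, Y, and Z.
TIERS = [(10_000.0, 0.10), (50_000.0, 0.20), (100_000.0, 0.30)]

print(hybrid_fee(5_000))            # the floor applies
print(hybrid_fee(50_000))           # the revenue share applies
print(ratchet_fee(60_000, TIERS))   # the 20% tier applies
```

If the deal intends marginal tiers instead (each rate applied only to revenue inside its band), the loop would need to sum per-band fees; agree on which reading applies before signing.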
---
### Step 3: Select the Best Guarantee Type
**ACTION:** Use this decision framework to pick the type (or combination) that fits:
```
START HERE:
Is your fulfillment cost HIGH (significant labor, ad spend, materials per customer)?
YES -> Conditional, Anti-Guarantee, or Implied/Performance. NOT Unconditional.
NO -> Any type is viable. Start with Unconditional or Conditional.
Is the outcome quantifiable and trackable?
YES -> Implied/Performance is worth serious consideration.
NO -> Conditional or Unconditional.
Is the product delivered upon access (knowledge, code, methodology)?
YES -> Anti-Guarantee with compelling reason why.
NO -> Continue.
Is this high-ticket B2B (> $3,000 and business buyer)?
YES -> Conditional with specific conditions tied to success behaviors.
NO (low-ticket consumer) -> Unconditional or named creative guarantee.
Do you want maximum conversion and are confident in delivery?
YES -> Unconditional (or stacked: unconditional short-window + conditional long-window).
UNCERTAIN -> Conditional with conditions that mirror the success path.
```
**Pro tip on naming:** Give your guarantee a compelling name. Avoid "satisfaction guarantee" or "money-back guarantee." Use vivid, specific language. Example: instead of "30 Day Money Back Satisfaction Guarantee," use "In 30 days, if you wouldn't jump into shark-infested waters to get our product back, we'll return every dollar you paid."
### Step 4: Run the ROI Math
**ACTION:** Calculate whether the stronger guarantee is financially worth it. Do not skip this step — the math is what separates emotion from business logic.
**The formula:**
```
Net Sales (baseline) = Total Sales × (1 - Refund Rate)
Net Sales (with guarantee) = New Total Sales × (1 - New Refund Rate)
ROI Multiple = Net Sales (with guarantee) ÷ Net Sales (baseline)
```
**Working example from the source material:**
```
Baseline: 100 sales × (1 - 5% refund) = 95 net sales
With guarantee: 130 sales × (1 - 10% refund) = 117 net sales
ROI multiple: 117 ÷ 95 = 1.23x (23% net revenue increase)
```
The conversion lifted 30% and the refund rate doubled — yet net revenue still grew 23%.
**Rule of thumb:** For a stronger guarantee to NOT be worth it, the increase in refunds would have to cancel out every additional sale. Using the 5% baseline refund rate from the example, a 5% lift in sales would be wiped out only if the refund rate rose to roughly 9.5%, which is an implausible near-doubling of refunds. In practice, the stronger guarantee almost always wins on net.
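The formula and the break-even reasoning above can be checked with a short Python sketch. The function names are illustrative, not from the source; the inputs are the working example's numbers.

```python
def net_sales(total_sales: float, refund_rate: float) -> float:
    """Sales that remain after refunds."""
    return total_sales * (1 - refund_rate)


def roi_multiple(base_sales: float, base_refund: float,
                 new_sales: float, new_refund: float) -> float:
    """Net sales with the guarantee divided by baseline net sales."""
    return net_sales(new_sales, new_refund) / net_sales(base_sales, base_refund)


def break_even_refund_rate(base_sales: float, base_refund: float,
                           new_sales: float) -> float:
    """Refund rate at which the stronger guarantee only matches the baseline."""
    return 1 - net_sales(base_sales, base_refund) / new_sales


# Working example: 100 sales at 5% refunds vs 130 sales at 10% refunds.
multiple = roi_multiple(100, 0.05, 130, 0.10)
limit = break_even_refund_rate(100, 0.05, 130)
print(round(multiple, 2))   # 1.23
print(f"{limit:.1%}")       # 26.9%
```

With a 30% lift in sales, the refund rate would have to climb from 5% to about 27% before the stronger guarantee stopped paying for itself.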
**For high-cost fulfillment services:** Adjust the formula to include fulfillment cost:
```
Net Revenue (baseline) = (Sales × Price) - (Sales × Fulfillment Cost) - (Refunds × Price)
Net Revenue (with guarantee) = (New Sales × Price) - (New Sales × Fulfillment Cost) - (New Refunds × Price + Guarantee Cost)
```
**ACTION:** Plug in the user's numbers. If current close rate and refund rate are unknown, use these conservative defaults: 100 baseline sales, 5% baseline refund rate, 20% conversion lift from guarantee, refund rate doubles. Present the calculation explicitly so the user can adjust assumptions.
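For the high-cost fulfillment case, a minimal Python sketch of the adjusted formula follows, using the conservative defaults above. The price and per-customer fulfillment costs are hypothetical placeholders, and `guarantee_cost` is read here as a lump-sum extra (for example, bonus payouts); if the user's guarantee cost scales per refund, adjust accordingly.

```python
def net_revenue(sales: float, price: float, fulfillment_cost: float,
                refund_rate: float, guarantee_cost: float = 0.0) -> float:
    """Revenue minus fulfillment, refunds, and any extra guarantee payouts.

    Assumption: guarantee_cost is a lump sum, not a per-refund amount.
    """
    refunds = sales * refund_rate
    return sales * price - sales * fulfillment_cost - (refunds * price + guarantee_cost)


PRICE = 1_000.0  # hypothetical price point

# Defaults: 100 baseline sales, 5% refunds, 20% lift, refund rate doubles.
for cost in (300.0, 600.0):  # hypothetical low vs high fulfillment cost
    baseline = net_revenue(100, PRICE, cost, 0.05)
    with_guarantee = net_revenue(120, PRICE, cost, 0.10)
    print(cost, baseline, with_guarantee, round(with_guarantee / baseline, 2))
```

Under these placeholder numbers the gain shrinks as fulfillment cost rises, which is why the decision framework steers high-cost services away from unconditional guarantees.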
### Step 5: Consider Stacking
**ACTION:** Evaluate whether the offer would benefit from stacking two guarantees.
Stacking means layering guarantees for different time windows or different conditions. Examples:
- **Unconditional short + Conditional long:** "No questions asked, full refund within 30 days. OR: complete all modules and implement the framework within 90 days and don't double your leads — we'll give you triple your money back."
- **Two conditional outcomes sequenced:** "You'll generate $10,000 by day 60, and $30,000 by day 90, as long as you complete steps 1, 2, and 3."
- **Implied + Conditional hybrid:** Fixed base fee for setup month, then 100% performance after that.
**WHY stacking works:** Multiple guarantees future-pace the prospect through a timeline of outcomes. It makes the seller appear deeply convinced the customer will succeed. It shifts risk further from buyer to seller at each stage, which increases the perceived "unfairness" of not buying.
**When to stack:** When a single guarantee feels insufficient for the price point, or when the sales conversation reveals the prospect has multiple distinct fears (immediate risk AND long-term outcome risk).
### Step 6: Write the Final Guarantee
**ACTION:** Produce the complete guarantee recommendation with these components:
1. **Type selected** and rationale (1-2 sentences)
2. **Wording** — complete "If you do not get X in Y, we will Z" sentence
3. **Name** — a compelling, memorable name for the guarantee
4. **Conditions** (if Conditional) — specific customer actions required to qualify
5. **ROI projection** — calculated using Step 4 formula with stated assumptions
6. **Delivery script** — a 2-3 sentence way to present the guarantee during the sales conversation
## Examples
**Scenario A: Online fitness coaching program, $497, B2C**
Context: 100 current sales/month, 4% refund rate, digital delivery, low fulfillment cost.
Process: Low ticket, B2C, low fulfillment cost. Unconditional viable. ROI check: baseline 100 × 96% = 96 net sales. With guarantee: projected 135 × 92% = 124.2 net sales (8% refund rate). Multiple = 1.29x (29% net revenue gain). Stack: unconditional 30-day + conditional 90-day for those who follow the program.
Output:
- **Type:** Stacked (Unconditional 30-day + Conditional 90-day)
- **Name:** "The No-Excuse Guarantee"
- **Wording:** "Try the program for 30 days, no questions asked — if it's not for you, you pay nothing. OR: follow the program exactly as outlined for 90 days. If you don't lose at least 12 pounds, I will refund every dollar AND pay for your next program."
- **ROI:** 96 baseline net sales → ~124 projected net sales = 1.29x improvement (assumes 35% conversion lift, refund rate doubling)
---
**Scenario B: Marketing agency, $10,000/month retainer, B2B**
Context: Agency currently on retainer model. Clients often hesitate due to "what if it doesn't work." High fulfillment cost (staff time).
Process: High ticket B2B, high fulfillment cost. Unconditional too risky. Performance/Implied model ideal because outcomes (leads, revenue) are trackable. This is the CS-12 case study scenario.
Output:
- **Type:** Implied/Performance (retainer-to-performance transition)
- **Name:** "We Only Win When You Win"
- **Wording:** "We charge $1,000/month for the first 3 months to cover setup. After that, we take 15% of the revenue we generate for you — nothing if we generate nothing."
- **Transition pitch:** "We've helped agencies like ours go from $20k/month to $200k+/month by making this switch. The reason clients agree immediately is simple: if we don't perform, they owe us nothing."
- **ROI:** Baseline at $10k/mo flat. If agency generates $100k/mo in results: performance fee = $15k/mo, a 50% revenue increase, with zero client objection to the price.
---
**Scenario C: Business consultant selling proprietary methodology, $25,000 one-time**
Context: Methodology includes confidential internal playbook and live financial models from the consultant's own business.
Process: High ticket, knowledge instantly transferred on delivery, consultant has genuine "reason why" for no refunds. Anti-Guarantee is appropriate and actually strengthens perceived exclusivity.
Output:
- **Type:** Anti-Guarantee
- **Name:** "All Access, All Final"
- **Wording:** "This engagement gives you full access to the live playbooks, ad accounts, and financial models from our own operating businesses. Because you're seeing exactly how we run our company, all sales are final — we can't un-show you what you've seen."
- **Delivery script:** "I want to be upfront about one thing: this is all-sales-final. The reason is that within the first session, you'll have access to our actual numbers, our actual funnels, and our proprietary frameworks. Once you have that, you have it. That's also why this works."
## Key Principles
- **Risk reversal is the single greatest objection handler** — The number one reason people don't buy is fear that the product won't do what it says. A guarantee directly addresses that fear. Changing the quality of a guarantee alone can lift conversions 2–4x.
- **Always say the guarantee boldly — even if you don't have one** — State your position clearly and give the reason why. Vague, hedged guarantee language performs worse than an explicit anti-guarantee with a strong rationale.
- **Guarantees are enhancers, not foundations** — A guarantee on a poor product or weak sales process will accelerate refunds, not conversions. Confirm the offer delivers real results before strengthening the guarantee.
- **The "or what" clause is what gives a guarantee teeth** — Without a specified consequence, a guarantee is just a claim. "We guarantee results" is meaningless. "If you don't get results in 30 days, we refund you in full plus pay for a competitor's program" is a guarantee.
- **The stronger the guarantee, the higher the net increase in purchases — even if refunds increase** — Do the math. A guarantee that doubles your refund rate but lifts sales by 30% still wins by 23% on net. Don't be afraid of the refund number in isolation.
- **Conditions should mirror the success path** — The best conditional guarantee has conditions that are the exact actions a customer must take to succeed. In a perfect world, 100% qualify and 100% achieve the result — meaning nobody claims it. This also improves customer outcomes.
- **AP-7 Warning — Wrong-customer magnet:** Guarantees can attract the wrong type of customer. A person who buys primarily because of the guarantee — not because they want the outcome — is likely uncommitted to doing the work. This leads to: high refund rates, difficult customer relationships, poor case studies, and burnout. Use conditional guarantees to filter for customers willing to act. Use anti-guarantees for high-commitment, self-selecting audiences. Never use a strong unconditional guarantee to prop up a weak product or to compensate for poor targeting.
## References
- For building the full offer before designing the guarantee: `grand-slam-offer-creation`
- For auditing perceived value and the value equation, (Dream Outcome × Perceived Likelihood) ÷ (Time Delay × Effort & Sacrifice): `value-equation-offer-audit`
- For stacking bonuses alongside the guarantee to complete the offer: `bonus-stacking-system`
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — $100M Offers by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-value-equation-offer-audit`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build a complete, differentiated offer bundle from scratch using a 5-step process: define the target customer's dream outcome, map every obstacle they face,...
---
name: grand-slam-offer-creation
description: |
Build a complete, differentiated offer bundle from scratch using a 5-step process: define the target customer's dream outcome, map every obstacle they face, convert those obstacles into named solution components, select the highest-value delivery formats for each, then trim low-value items and stack the remainder into a final offer with assigned dollar values and a single price. Use this skill when starting a new offer, when an existing offer is being commoditized (competing on price), when conversion is poor despite genuine quality, or when a business needs to escape the "race to the bottom" pricing dynamic. Trigger phrases: "how do I create an offer", "build me a product", "what should I include in my offer", "how do I stop competing on price", "design a new service package", "make my offer irresistible", "what should my program include", "how do I package my services", "what should I charge for", "create an offer from scratch", "help me build a coaching program", "escape commoditization". Applies to: coaching, consulting, agencies, courses, productized services, gyms, clinics, SaaS, any business where the offer structure determines price and conversion. This is the hub skill for offer creation — run it before guarantee design, bonus stacking, scarcity/urgency framing, or offer naming.
tags: [offers, product-design, value-creation, sales, entrepreneurship]
depends-on: [value-equation-offer-audit, target-market-selection]
---
# Grand Slam Offer Creation
## When to Use
Use this skill when you need to build or rebuild a complete offer from scratch. Specifically:
- **Starting a new product or service** — you know what you do but not how to package it into an irresistible offer
- **Competing on price against cheaper alternatives** — your current offer looks identical to competitors and buyers negotiate you down
- **Low conversion despite genuine quality** — your service is good but prospects don't perceive the value before they buy
- **Existing offer lacks differentiation** — buyers can compare your price directly to others, which always ends badly for margins
- **After completing `value-equation-offer-audit`** — you've identified weak value drivers and need to build the offer components that address them
**What this skill produces:** A complete offer document with named components, assigned perceived dollar values, a stacked total value, and a single purchase price — structured so the buyer experiences a massive value-to-price gap that makes saying no feel irrational.
**What this skill does not cover:** Pricing, guarantee design, bonus stacking, scarcity/urgency, or naming. Run `premium-pricing-strategy`, `guarantee-design-and-selection`, `bonus-stacking-system`, `scarcity-and-urgency-tactics`, and `offer-naming-magic-formula` after completing this skill.
**Precondition:** You must know (1) who you serve and (2) what outcome they ultimately want. If you don't yet have a defined target market, run `target-market-selection` first.
## Context & Input Gathering
### Input Sufficiency Check
```
User prompt → Extract: who is the target customer? what is their dream outcome?
↓
Environment → Scan for: existing offer docs, sales pages, service descriptions
↓
Gap analysis → Do I know: (1) who the customer is, (2) what they most want to achieve,
(3) what business/service is being offered?
↓
Missing critical info? ──YES──→ ASK (one question at a time, max 2 questions)
│
NO
↓
PROCEED with 5-step process
```
### Required Context (ask if missing)
- **Target customer:** Who buys this? What is their current situation?
→ If missing, ask: "Who is your ideal buyer and what is their current frustrating situation before they find you?"
- **Dream outcome:** What does the customer ultimately want to achieve — the destination, not the journey?
→ If missing, ask: "What specific result does your ideal customer most want? (e.g., 'lose 20 pounds in 6 weeks', 'sign 3 new clients per month', 'launch their first product')"
- **What you do / your core capability:** What do you know how to deliver?
→ Usually stated in the user's initial prompt. If not, ask: "What is the core service or expertise you are packaging?"
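The sufficiency check and the required-context list above amount to a small gating routine. The sketch below is one possible reading in Python; the field names and the `questions_to_ask` function are hypothetical, and the question strings are abbreviated from the list above.

```python
REQUIRED_CONTEXT = {
    "target_customer": "Who is your ideal buyer and what is their current "
                       "frustrating situation before they find you?",
    "dream_outcome": "What specific result does your ideal customer most want?",
    "core_capability": "What is the core service or expertise you are packaging?",
}
MAX_QUESTIONS = 2  # the check above caps clarifying questions at two


def questions_to_ask(known: dict[str, str]) -> list[str]:
    """Return questions for missing fields, capped at MAX_QUESTIONS.

    An empty list means there is enough context to proceed with the process.
    """
    missing = [
        question
        for field, question in REQUIRED_CONTEXT.items()
        if not known.get(field, "").strip()
    ]
    return missing[:MAX_QUESTIONS]


context = {"target_customer": "Local gym owners", "core_capability": "Lead generation"}
pending = questions_to_ask(context)
print(pending if pending else "PROCEED with 5-step process")
```

The returned questions are meant to be asked one at a time, in order, re-running the check after each answer.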
### Default Assumptions
- If no specific market is stated: infer from the offer description and flag the assumption
- If the user has an existing offer: treat it as the starting point for Step 1, not as the finished product
- If the user is unsure what to charge: build the offer first (Steps 1–5), then stack the component values — the price follows from perceived value, not from cost or habit
## Process
Use `TodoWrite` to track steps before beginning:
- Step 1: Define dream outcome | Step 2: Map all problems | Step 3: Convert to solution statements | Step 4: Generate delivery vehicles | Step 5a: Trim | Step 5b: Stack + assign values | Final: Output offer document
---
### Strategic Framing: The Sales-to-Fulfillment Continuum
Before building, establish the right mindset for offer design.
Every offer sits on a continuum between two extremes:
```
Easy to Sell ←────────────────────────────────→ Hard to Sell
Hard to Fulfill ←──────────────────────────────→ Easy to Fulfill
```
- **Maximum sales ease** (over-delivering on everything) makes the offer irresistible but can make the business unsustainable
- **Maximum fulfillment ease** (bare-minimum delivery) makes the business easy to run but kills sales
- **The goal is a sweet spot:** an offer that is genuinely easy to sell because it over-delivers on perceived value, but structured using low-cost, high-leverage delivery vehicles so margins remain strong
**Practical guidance for new offers:** When building your first version, bias toward over-delivering. Generate demand first. Once you have buyers saying yes and cash flowing, optimize fulfillment. It is always easier to remove from an offer that is selling than to fix an offer that is not.
**Warning — AP-10 (Fulfillment Imbalance):** Do not design an offer so heavy that fulfillment destroys the business. If you cannot deliver the offer profitably at scale, you will either burn out or quietly stop honoring the promise. Every component you add in Step 4 must pass the filter in Step 5: is the cost-to-deliver acceptable at volume? The goal is high perceived value to the buyer at low actual cost to you. "One-to-many" delivery vehicles (guides, videos, templates, automated systems) are the highest-leverage format because they cost the same whether you serve 10 clients or 10,000.
---
### Step 1: Define the Dream Outcome
**ACTION:** Identify the specific end state the customer most wants to reach — not what your service does, but where the customer arrives.
1. Write down the customer's dream outcome in concrete, measurable terms:
- What does "success" look like for them in specific numbers or observable states?
- What is the largest result they could reasonably expect?
- Add a time component: what is the minimum viable timeframe in which this result could occur?
2. Check: are you selling the flight or the vacation?
→ "A gym membership" = the flight (the mechanism)
→ "Lose 20 pounds in 6 weeks" = the vacation (the destination)
Always sell the vacation. The mechanism is irrelevant to the buyer.
3. Write your dream outcome statement in this format:
> "[Specific measurable result] in [timeframe]"
> Example: "Lose 20 pounds in 6 weeks"
> Example: "Get 20 new gym clients in 30 days"
> Example: "Sign 3 enterprise software deals in 90 days"
**WHY:** Everything in Steps 2–5 is built to deliver this specific outcome. The more precise your dream outcome, the more precisely you can map the obstacles between the customer and that outcome, which is where all value in the offer comes from. A vague outcome ("grow your business") produces vague problems which produce vague solutions which produce a mediocre offer.
**IF/THEN:**
- If the user has multiple possible outcomes: pick the one that is most emotionally resonant and commercially significant to the target customer. You can always add variations later.
- If the dream outcome is vague: push one level deeper. "Lose weight" → "lose 20 pounds" → "fit into my wedding dress in 8 weeks." The deeper the specificity, the higher the perceived value of an offer that addresses it.
Mark Step 1 complete in TodoWrite.
---
### Step 2: List All Problems (Obstacle Mapping)
**ACTION:** Generate an exhaustive list of every obstacle the target customer faces on the path from where they are now to the dream outcome.
**The 4-bucket problem framework:** Every obstacle a customer faces falls into one of four categories — these map directly to the four drivers in the value equation. Use them as prompts, not constraints:
| Bucket | Core fear | Customer says... |
|--------|-----------|-----------------|
| **Dream Outcome** | "It won't be worth it financially" | "Is this even possible for someone like me?" |
| **Perceived Likelihood** | "It won't work for me specifically" | "I'll start and then quit. External factors will derail me." |
| **Effort & Sacrifice** | "It will be too hard / I'll hate it" | "This requires too much discipline / I'll suck at it." |
| **Time** | "It will take too long / I'm too busy" | "I don't have time for this. It's not convenient." |
**Technique:** For each thing the customer must *do* to reach the dream outcome, generate every reason they might not be able to do it, sustain it, or complete it. Think in sequence — what happens immediately before they start? What happens immediately after? What are the next steps after that?
**Gym example (Step 2 in action):**
- Dream outcome: Lose 20 pounds in 6 weeks
- Things they must do: buy healthy food → cook healthy food → eat healthy food → exercise regularly → stay consistent → handle social situations
- For "buy healthy food": it's hard and confusing / takes too much time / is expensive / is unsustainable when traveling or when family has different needs
- For "cook healthy food": hard, time-consuming, expensive, unsustainable, family conflicts, no idea what to do when traveling
- For "exercise": confusing, embarrassing, risk of injury, don't know what to do, don't like it, too busy
- (Repeat for every item they must do)
**Guidance:** Aim for exhaustiveness. More problems = more solutions = more components = more perceived value. Repetition across buckets is normal. Do not filter yet — filtering is Step 5.
**Output:** A written list organized by task (e.g., "Buying food: [list]", "Cooking: [list]"). Aim for 20–50+ problems.
**WHY:** The problems list is the complete map of every reason a prospect might say no or quit. If any one item on this list goes unsolved, it becomes a potential lost sale or a client who fails and cancels. Solving all of them makes the offer impossible to compare to commoditized alternatives that solve only some.
Mark Step 2 complete in TodoWrite.
---
### Step 3: Convert Problems into Solution Statements
**ACTION:** Take every problem from Step 2 and flip it into a positive solution statement using the frame: *"What would I need to show someone to solve this problem?"*
**The conversion formula:** Reverse each problem element into outcome-oriented language.
- "Buying healthy food is hard, confusing, I won't like it" → "How to make buying healthy food easy and enjoyable, so that anyone can do it"
- "Cooking takes too much time" → "How to cook meals in under 5 minutes"
- "This is expensive, it's not worth it" → "How eating healthy is actually cheaper than unhealthy food"
- "It's unsustainable" → "How to make eating healthy last forever"
- "My family's needs will get in the way" → "How to cook this despite your family's concerns"
- "I won't know what to do when I travel" → "How to travel and still eat healthy"
**Format for each conversion:**
> [Problem statement] → [Solution statement beginning with "How to..."]
**WHY:** The solutions list is not the offer yet — it is the *checklist* of what the offer must accomplish. Each solution statement tells you exactly what a buyer needs to believe you can deliver. This step transforms the creative chaos of Step 2 into a structured list of deliverables. It also prevents common offer-building errors: adding components based on what's easy to create rather than what solves a real obstacle.
**Critical rule:** Solve every problem. Do not self-censor — write all solutions even if unsure how to deliver them yet. Filtering for feasibility is Step 5. One unsolved obstacle can be the single reason a sale is lost.
**Output:** A solution list mirroring the problem list from Step 2, one "How to..." statement per problem. These become the offer component building blocks in Step 4.
Mark Step 3 complete in TodoWrite.
---
### Step 4: Generate Delivery Vehicles ("The How")
**ACTION:** For each solution statement from Step 3, brainstorm every possible way you could deliver that solution. This is the most important step in the process — this is what you will actually provide in exchange for money.
**Goal:** Generate the most expansive possible list before filtering. Think divergently: if money and time were no constraint, how many different ways could you deliver each solution?
**The 6-dimension delivery vehicle cheat codes:** For each solution, run through all six dimensions to generate variations:
| Dimension | Questions | Options |
|-----------|-----------|---------|
| **1. Level of attention** | How many people receive this at once? | 1-on-1 / Small group / One-to-many |
| **2. Customer effort level** | How much does the customer do themselves? | Do-It-Yourself (DIY) / Done-With-You (DWY) / Done-For-You (DFY) |
| **3. Live delivery medium** | If delivering live, what channel? | In-person / Phone / Email / Text / Video call / Chat |
| **4. Recorded consumption format** | If recorded, how does the customer consume it? | Video / Audio / Written |
| **5. Speed and availability** | When and how quickly is this available? | 24/7 / 9–5 / Within 5 minutes / Within 1 hour / Within 24 hours / Monday–Friday |
| **6. The 10x/1/10th test** | Expand the solution space in both directions: | If the customer paid 10x your price ($100,000), what would you provide? If they paid 1/10th the price and you still had to make them successful, what's the leanest possible version? |
**How to use the 10x/1/10th test:** This is a divergent thinking tool, not a commitment to delivery. Ask: "If this customer paid me $100,000, what would I do for them?" The answer pushes you toward ideas you wouldn't normally consider. Then ask: "If they paid $100, how could I still get them the result?" This often surfaces elegant, scalable, low-cost solutions. Both directions generate ideas. You keep the ones you will actually deliver in Step 5.
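The table can also be read as a small combinatorial generator. The Python sketch below crosses dimensions 1 to 4 for a single solution statement so that no combination is overlooked. The option lists are copied from the table; the function name `delivery_variants` and the wording of each generated line are illustrative assumptions, not part of the source method.

```python
from itertools import product

# Options copied from the delivery vehicle table (dimensions 1-4).
ATTENTION = ["1-on-1", "Small group", "One-to-many"]
EFFORT = ["DIY", "DWY", "DFY"]
LIVE_MEDIA = ["In-person", "Phone", "Email", "Text", "Video call", "Chat"]
RECORDED_FORMATS = ["Video", "Audio", "Written"]


def delivery_variants(solution: str) -> list[str]:
    """Return one candidate description per combination of options.

    Hypothetical helper: the source asks you to brainstorm by hand; this
    only enumerates the raw combinations as a starting list.
    """
    channels = [f"live via {m}" for m in LIVE_MEDIA]
    channels += [f"recorded as {f}" for f in RECORDED_FORMATS]
    return [
        f"{solution}: {attention}, {effort}, {channel}"
        for attention, effort, channel in product(ATTENTION, EFFORT, channels)
    ]


variants = delivery_variants("How to buy healthy food fast, easy, cheaply")
print(len(variants))  # 3 attention x 3 effort x 9 channels = 81
print(variants[0])
```

Most of the 81 combinations will be discarded in Step 5. The point here is only to widen the list before filtering.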
**Example (for the problem "Buying healthy food is hard and confusing"):**
*One-on-one delivery options:*
- In-person grocery shopping trip where I take the client to the store and teach them
- Personalized grocery list, taught 1-on-1
- Full-service shopping: I buy their food for them entirely
- Text support while they shop — they text me pictures and I guide them
- Phone call scheduled for when they're at the store
*Small group options:*
- Group grocery shopping trip
- Group class: how to build a weekly shopping list
- Shared grocery delivery service
*One-to-many / scalable options:*
- Recorded grocery store walkthrough video
- DIY grocery calculator tool (spreadsheet or app)
- Pre-made weekly grocery lists for each meal plan tier
- Grocery buddy system (pair clients together)
- Pre-made Instacart lists — one click delivers their week
**After generating:** You will have a monster list of 50–100+ delivery vehicle options. This is correct. Filtering happens in Step 5.
**WHY:** The same solution can cost 100 hours per client or 1 minute per client depending on delivery format. Most businesses default to one format without considering the full menu. This step surfaces all options so you can choose optimally in Step 5.
Mark Step 4 complete in TodoWrite.
---
### Step 5: Trim and Stack (Offer Optimization)
This step has two sub-parts: first trim the list to the optimal components, then stack them into the final offer with assigned values.
#### Step 5a: Trim — Apply the Cost-Value Filter
**ACTION:** Take the full list of delivery vehicles from Step 4 and apply a two-axis filter:
**The cost-value 2x2:**
```
                     HIGH VALUE
                          │
         KEEP             │           KEEP
      (low cost,          │        (high cost,
      high value)         │        high value)
                          │
LOW COST ─────────────────┼───────────────── HIGH COST
                          │
        REMOVE            │        REMOVE FIRST
      (low cost,          │        (high cost,
      low value)          │        low value)
                          │
                      LOW VALUE
```
**Remove first:** High cost, low value items — these drain resources without meaningfully moving the buyer's perceived value.
**Remove second:** Low cost, low value items — these add complexity without payoff and dilute the offer by making it look padded.
**Keep:** Both categories of high-value items:
- Low cost, high value = the best components in the offer. Prioritize these. Examples: templates, guides, recorded videos, automated systems — created once, delivered infinitely.
- High cost, high value = keep if the cost is acceptable at your scale, or if they are essential to the dream outcome and no lower-cost alternative exists. Reserve high-cost, high-touch components (1-on-1 sessions, DFY services) for premium tiers or use them sparingly.
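The quadrant rules above can be expressed as a small lookup. The following Python sketch is one possible encoding, not a prescribed tool: the `Component` class and `trim_action` function are hypothetical names, and the cost and value ratings assigned to the sample items are illustrative assumptions rather than ratings from the source.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    cost: str   # "low" or "high": what it costs you to fulfill
    value: str  # "low" or "high": value as perceived by the buyer


def trim_action(c: Component) -> str:
    """Map a component to its quadrant action from the cost-value 2x2."""
    if c.value == "high":
        if c.cost == "low":
            return "keep (prioritize)"
        return "keep (if cost is acceptable)"
    if c.cost == "high":
        return "remove first"
    return "remove second"


# Ratings below are illustrative assumptions, not taken from the source.
candidates = [
    Component("Recorded grocery store walkthrough video", "low", "high"),
    Component("Full-service shopping", "high", "high"),
    Component("Group grocery shopping trip", "high", "low"),
    Component("Shared grocery delivery service", "low", "low"),
]

for c in candidates:
    print(f"{c.name}: {trim_action(c)}")
```

Rating each item as only "low" or "high" is deliberately coarse. It forces a decision per component instead of a debate over exact scores.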
**How to evaluate "high value":** Ask which components most directly:
1. Increase the customer's financial outcome
2. Increase their confidence they will succeed
3. Reduce the effort and sacrifice required
4. Reduce the time to first result
**The fulfillment imbalance check (AP-10):** After filtering, ask: "Can I deliver this profitably at 50 clients? 200?" If no, restructure high-cost components into scalable formats. Bias toward "one-to-many" components (videos, tools, templates, guides) — created once, delivered infinitely, highest value-to-cost ratio in the offer.
Mark Step 5a complete in TodoWrite.
#### Step 5b: Stack — Build the Final Offer Bundle
**ACTION:** Take the trimmed components and assemble them into a final offer bundle.
**Format:** For each component write: (1) the problem it solves, (2) a compelling outcome-oriented name (not a format description — "Foolproof Bargain Grocery System" beats "Grocery Guide" — see `offer-naming-magic-formula`), (3) perceived value — what a motivated buyer would pay to solve only this problem standalone, (4) the delivery vehicles from Step 4.
**Final stack format:**
```
OFFER COMPONENT LIST
─────────────────────────────────────────────────────
Problem solved → Named component → Perceived value
─────────────────────────────────────────────────────
[Problem 1] → [Component Name 1] ........... $[XXX]
[Problem 2] → [Component Name 2] ........... $[XXX]
[Problem 3] → [Component Name 3] ........... $[XXX]
...
─────────────────────────────────────────────────────
TOTAL PERCEIVED VALUE: $[total stacked]
YOUR PRICE: $[price]
VALUE-TO-PRICE RATIO: [ratio]:1
─────────────────────────────────────────────────────
```
**The grand slam threshold:** The value-to-price ratio should feel almost unreasonable to the buyer. 4:1 is a floor; 7:1–10:1 is common in well-structured offers. The bundle must accomplish three things: (1) solve *all* perceived problems — missing one can be the single reason someone doesn't buy, (2) give you conviction that what you sell is one-of-a-kind, (3) make direct price comparison to competitors impossible.
**WHY:** The stacked offer shifts the buyer's decision from "is this worth the price?" to "which of these problems am I willing to let stay unsolved?" — a fundamentally easier sales conversation.
Mark Step 5b complete in TodoWrite. Mark Step 7 (final output) in progress.
---
### Final Output: The Offer Document
After completing all steps, produce a clean offer document in this format:
```
OFFER: [Working title — will be refined with offer-naming-magic-formula]
Target customer: [who they are and their starting situation]
Dream outcome: [specific result + timeframe]
─────────────────────────────────────────────────────────────────────
WHAT THEY GET
─────────────────────────────────────────────────────────────────────
[Component 1 name]: [one-sentence description of what it does]
Solves: [problem from Step 2]
Consists of: [delivery vehicles]
Perceived value: $[amount]
[Component 2 name]: [one-sentence description]
Solves: [problem]
Consists of: [delivery vehicles]
Perceived value: $[amount]
[... all components]
─────────────────────────────────────────────────────────────────────
TOTAL PERCEIVED VALUE: $[stacked total]
PRICE: $[price]
VALUE-TO-PRICE MULTIPLE: [X]:1
─────────────────────────────────────────────────────────────────────
NEXT STEPS:
→ Guarantee: guarantee-design-and-selection (reduces risk perception, boosts Driver 2)
→ Bonuses: bonus-stacking-system (present components as bonuses to increase perceived value)
→ Scarcity/urgency: scarcity-and-urgency-tactics (add ethical urgency to accelerate decision)
→ Name: offer-naming-magic-formula (turn the working title into a compelling, memorable name)
```
**HANDOFF TO HUMAN:** Present the completed offer document. Ask: "Does this capture all the problems your customer faces? Are there obstacles we haven't solved yet? The goal is for a prospect to read this list and have no remaining reasons to say no."
Mark Step 7 complete in TodoWrite.
---
## Examples
### Example 1: Agency Comparison — Commodity vs. Grand Slam (Before/After)
This example illustrates the financial transformation from a commoditized offer to a differentiated one, using identical advertising spend.
**The scenario:** A lead generation agency serving brick-and-mortar businesses. Two versions of the same underlying service — same work, same team, same ad budget.
**Commodity offer (price-driven, "race to the bottom"):**
> "$1,000 down, then $1,000/month retainer for agency services."
The pitch is: "You pay us. We work. Maybe you get results. Maybe you don't."
This is a reasonable offer, but it is identical to every other agency. The client can compare it directly to 50 competitors. The pressure to match the cheapest competitor is permanent.
**Grand Slam Offer (value-driven, incomparable):**
> "Pay one time. No recurring fee. No retainer. Just cover ad spend. I'll generate and work your leads. Only pay me if people show up. I guarantee 20 clients in month one, or next month is free. Plus: daily sales coaching, tested scripts, tested price points, sales recordings — and the entire industry playbook, free."
This offer cannot be compared to the commodity offer. The decision is not "which agency is cheaper?" but "do I want these 20 guaranteed clients or not?"
**Results at the same $10,000 ad spend:**
| Metric | Commodity | Grand Slam | Change |
|--------|-----------|------------|--------|
| Response rate | 0.013% | 0.033% | 2.5x more respond |
| Appointments booked | 40 | 100 | Result |
| Show rate | 75% | 75% | Unchanged |
| Closing % | 16% | 37% | 2.3x more close |
| Sales closed | 5 | 28 | Result |
| Price | $1,000 | $3,997 | 4x higher price |
| Total collected | $5,000 | $112,000 | 22.4x more cash |
| Return on ad spend | 0.5:1 | 11.2:1 | Get paid to acquire customers |
**Breakdown:** Same eyeballs. 2.5x more respond (compelling offer). 2.3x more close (value is obvious). 4x higher price (no comparison point). The rounded factors multiply to roughly 23x; the collected totals give $112,000 / $5,000 = **22.4x more cash**. The fulfillment is the same. Only the offer structure changed.
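The table's arithmetic can be retraced with a few lines of Python. This is a sketch under two assumptions: sales are rounded to whole customers, and the function name `funnel` is hypothetical. All inputs come from the table. Note that 28 sales at $3,997 is $111,916, which the table rounds to $112,000.

```python
def funnel(appointments: int, show_rate: float, close_rate: float,
           price: int, ad_spend: int) -> dict:
    """Walk the table's funnel: appointments -> shows -> sales -> cash."""
    # Assumption: round to whole customers, as the table does.
    sales = round(appointments * show_rate * close_rate)
    collected = sales * price
    return {"sales": sales, "collected": collected,
            "roas": collected / ad_spend}


commodity = funnel(40, 0.75, 0.16, 1_000, 10_000)
grand_slam = funnel(100, 0.75, 0.37, 3_997, 10_000)
cash_multiple = grand_slam["collected"] / commodity["collected"]

print(commodity)
print(grand_slam)
print(round(cash_multiple, 1))
```

Because the three improvements multiply rather than add, a modest gain at each stage compounds into a large difference in cash collected.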
---
### Example 2: Gym Owner — Full 5-Step Walkthrough
This is the complete process applied to a real business that went from failing to sell a $99/month bootcamp to successfully selling a $599 bundle worth $4,351 in perceived value.
**Starting situation:** Gym owner can't sell $99/month memberships. "LA Fitness is $29/month. This is expensive." Even free trials failed.
#### Step 1: Dream Outcome
Realization: "I'm not selling a gym membership. I'm not selling the flight. I'm selling the vacation."
Dream outcome: **Lose 20 pounds in 6 weeks.**
- Big dream outcome: lose 20 pounds
- Time component: 6 weeks
#### Step 2: Problems List (partial, illustrative)
For "buying healthy food":
1. Hard, confusing, won't like it
2. Takes too much time
3. Expensive
4. Unsustainable (family needs, travel)
For "cooking healthy food":
1. Hard, time-consuming, confusing
2. Takes too much time
3. Expensive and not worth it
4. Unsustainable; family conflicts; travel
For "exercising regularly":
1. Hard, confusing, intimidating
2. Will injure myself
3. Too time-consuming
4. Don't know what to do; will plateau
For "sticking with it":
1. Will fall off when life gets hard
2. Embarrassing to be seen at the gym
3. No one keeps me accountable
For "social situations":
1. Can't eat out without ruining progress
2. Feels left out at social events
#### Step 3: Solution Statements (partial)
- Buying food is hard → How to buy healthy food fast, easy, cheaply
- Cooking takes too long → How to cook healthy meals in under 5 minutes
- Exercise is confusing → Easy-to-follow exercise system adjusted to your exact needs
- Sticking with it is hard → System that works without your permission, even for people who hate the gym
- Can't eat out → How to eat out 100% of the time and still hit your goal
#### Step 4: Delivery Vehicles (trimmed selection)
- 1-on-1 Nutrition Orientation (explain the full system)
- Recorded grocery store walkthrough video
- DIY Grocery Calculator
- Pre-made weekly grocery list for each plan
- Grocery Buddy System (pair clients together)
- Pre-made Instacart lists for one-click delivery
- Meal prep instructions + Meal prep calculator
- Personalized meal plan
- 5-minute meal guides (breakfast, lunch, dinner)
- Family size meal options
- Fat-burning workouts calibrated to individual needs
- Travel eating and workout blueprint
- Accountability system (check-ins, community)
- Eating-out guide for restaurants
#### Step 5: Final Stacked Offer
| Problem solved | Named component | Perceived value |
|----------------|-----------------|----------------|
| Buying food | Foolproof Bargain Grocery System — saves hundreds/month, takes less time than your current routine | $1,000 |
| Cooking | Ready-in-5-Minute Busy Parent Cooking Guide — eat healthy even with no time, get 200 hours/year back | $600 |
| Eating | Personalized "Lick Your Fingers Good" Meal Plan — easier to follow than eating what you used to cheat with | $500 |
| Exercise | Fat Burning Workouts Proven to Burn More Fat Than Doing It Alone — calibrated so you never plateau or risk injury | $699 |
| Traveling | Ultimate Tone-Up-While-You-Travel Eating and Workout Blueprint — amazing workouts with no equipment | $199 |
| Accountability | "Never Fall Off" Accountability System — works without your permission, even for people who hate coming to the gym | $1,000 |
| Social eating | "Live It Up While Slimming Down" Eating Out System — freedom to eat out and live life without feeling like the odd man out | $349 |
| **TOTAL VALUE** | | **$4,351** (as stated in the source; the listed values sum to $4,347) |
| **PRICE** | | **$599** |
| **VALUE MULTIPLE** | | **7.3:1** |
**Result:** The gym went from failing to sell $99/month memberships to selling $599 bundles. The facilities using this system eventually sold the same bundle for $2,400–$5,200 as they refined and added components over time.
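The stack totals can be checked mechanically. The Python sketch below sums the perceived values listed in the table and compares the ratio to the 4:1 floor from the grand slam threshold. Component names are shortened for readability, and the constant names are illustrative. Summing the listed values gives $4,347, slightly under the $4,351 total stated in the table; both round to the same 7.3:1 multiple.

```python
# Perceived values copied from the stacked offer table.
stack = {
    "Foolproof Bargain Grocery System": 1_000,
    "Ready-in-5-Minute Busy Parent Cooking Guide": 600,
    'Personalized "Lick Your Fingers Good" Meal Plan': 500,
    "Fat Burning Workouts": 699,
    "Tone-Up-While-You-Travel Blueprint": 199,
    '"Never Fall Off" Accountability System': 1_000,
    '"Live It Up While Slimming Down" Eating Out System': 349,
}
PRICE = 599
FLOOR_RATIO = 4.0  # "4:1 is a floor" from the grand slam threshold

total_value = sum(stack.values())
ratio = total_value / PRICE

print(f"Total perceived value: ${total_value:,}")
print(f"Value-to-price ratio: {ratio:.1f}:1")
print("Clears the 4:1 floor:", ratio >= FLOOR_RATIO)
```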
---
## Key Principles
- **The offer is the product, not the service.** A commodity business does the same work as a Grand Slam Offer business. The fulfillment is the same. What changes is how that work is packaged, named, and valued. Packaging determines pricing — not quality, effort, or years of experience.
- **Solve every perceived problem, not just the obvious ones.** One unsolved obstacle can be the single reason someone doesn't buy. The goal is to design an offer where the prospect runs out of objections before they run out of reasons to say yes. If you find yourself insisting prospects must handle a problem themselves, you are leaving sales on the table.
- **The divergent phase must stay divergent.** Steps 2–4 are about generating the maximum possible list. Filtering is for Step 5. Self-censoring during generation produces a mediocre offer with the obvious components. The best offer components often emerge from pushing past the first ten ideas.
- **Perceived value is not cost plus margin.** Assign values based on what the market charges to solve that problem, or what the outcome is worth to the buyer — not how long the component takes to create. A template built in 2 hours that solves a $500 problem is a $500 component.
- **One-to-many components are the profit engine.** The offers with the best economics are built primarily from components that are created once and delivered infinitely: recorded training, automated tools, templates, reference guides, community systems. High-cost, high-touch components (1-on-1 sessions, DFY delivery) belong in the stack only where they are irreplaceable or in premium tiers.
- **Escaping commodity pricing is the entire point.** When your offer cannot be directly compared to a competitor's, you control the pricing conversation. When it can be compared, you will always be pressured to match the cheapest option in the market.
- **Build first, optimize later.** Bias the first version toward over-delivery. Create cash flow, serve clients, and learn what they value most. Then replace high-cost components with lower-cost alternatives. Optimization without a proven offer is premature.
## References
- For auditing the perceived value of the offer you just built: `value-equation-offer-audit`
- For designing a guarantee that boosts perceived likelihood of achievement: `guarantee-design-and-selection`
- For presenting selected offer components as bonuses to increase perceived value: `bonus-stacking-system`
- For adding ethical scarcity and urgency without manipulation: `scarcity-and-urgency-tactics`
- For turning the working offer title into a compelling, memorable name: `offer-naming-magic-formula`
- For setting and testing the right price for the completed offer: `premium-pricing-strategy`
- Source: *$100M Offers*, Alex Hormozi, Chapters 8–10, pages 97–127 (offer creation process) and pages 43–47 (agency commodity vs. Grand Slam comparison)
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills), *$100M Offers* by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-value-equation-offer-audit`
- `clawhub install bookforge-target-market-selection`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Build and present a bonus stack that makes your core offer feel irresistible by applying an 11-point bonus quality checklist, a before/after-objection deploy...
---
name: bonus-stacking-system
description: Build and present a bonus stack that makes your core offer feel irresistible by applying an 11-point bonus quality checklist, a before/after-objection deployment sequence for one-on-one sales, and a 3-step partner bonus system that sources free high-value bonuses from adjacent businesses. Use this skill when your grand slam offer is drafted and you need to increase perceived value without discounting the price; when prospects stall at close and you need a structured way to layer bonuses against their specific objections; when you want to source third-party bonuses at zero cost and turn them into affiliate revenue; or when you need a complete bonus stack document with named bonuses, assigned values, and a presentation sequence ready for use in sales conversations.
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/100m-offers/skills/bonus-stacking-system
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: draft
source-books:
- id: 100m-offers
title: "$100M Offers: How To Make Offers So Good People Feel Stupid Saying No"
authors: ["Alex Hormozi"]
chapters: [14]
tags: [bonuses, offers, value-creation, sales]
depends-on: [grand-slam-offer-creation, value-equation-offer-audit]
execution:
tier: 2
mode: hybrid
inputs:
- type: document
description: "Completed grand slam offer or deliverables list from the grand-slam-offer-creation skill. Optionally: a list of prospect objections."
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Works from a finalized offer and a list of deliverables. Output: a bonus stack document with named bonuses, assigned values, and a presentation sequence."
discovery:
goal: "Produce a complete bonus stack that eclipses the value of the core offer, addresses every anticipated objection, and includes at least one partner bonus sourced at zero cost."
tasks:
- "Audit every deliverable against the 11-point bonus quality checklist"
- "Name each bonus with a benefit in the title"
- "Assign a dollar value to each bonus with justification"
- "Order bonuses by impact: strongest first, objection-busters held in reserve"
- "Identify adjacent businesses and draft partner bonus outreach"
- "Document the before/after-objection deployment sequence for sales conversations"
audience: "beginner-to-intermediate entrepreneurs building or refining a premium offer"
when_to_use: "When a core offer is drafted and needs bonus stacking to increase conversion, or when prospects are stalling at close and bonuses are needed to dissolve objections"
environment: "A finalized offer with at least one deliverable and a known price point."
quality: placeholder
---
# Bonus Stacking System
## When to Use
Your core offer exists — you know what you deliver and at what price — but conversion is lower than it should be, or you sense the offer feels thin relative to its price. This skill applies when:
- You have completed the grand slam offer framework and are now ready to present it in a way that makes the price feel small compared to the value
- You are preparing for a sales conversation and want bonuses ready to deploy if a prospect objects
- You want to increase perceived value without cutting your price (discounting is always the wrong move — it trains buyers that your price is negotiable)
- You want to source partner bonuses from adjacent businesses at zero cost and potentially earn affiliate commissions from them
- You need a written bonus stack document — named bonuses, assigned values, presentation sequence — that you or a salesperson can use consistently
**The core pattern:** A single offer presented as one thing is worth less than the same offer broken into its named component parts and stacked as bonuses. Enumeration creates value. Prospects cannot value what they cannot see. Bonuses make the invisible visible, and each additional bonus expands the price-to-value gap until buying feels like the obvious choice.
Before starting, confirm you have:
- A completed offer with at least one deliverable (use `grand-slam-offer-creation` first if not)
- A price point anchored to the core offer
- A basic sense of what obstacles prevent your prospect from buying (use `value-equation-offer-audit` to identify these if needed)
---
## Context and Input Gathering
### Required Context
- **Deliverables list:** Every product, service, or outcome included in your current offer — this is raw material for your bonus stack. Many items you already deliver are invisible to the buyer and need only be named and valued.
- **Core offer price:** The price you will anchor at the start of the sales conversation. Bonuses expand value around this anchor; the anchor must be set first.
### Observable Context
If a draft offer document is provided, scan for:
- Deliverables buried inside service descriptions — these are candidates to be extracted and named as standalone bonuses
- Outcomes that are fast, easy, or done-for-the-buyer — these score high on the value equation (low effort/time = high value) and make strong bonuses
- Problems the buyer will encounter after purchasing that you could pre-solve — these become "next logical need" bonuses
### Default Assumptions
- If the deliverables list is empty → use the grand slam offer creation skill first; you cannot stack what you have not built
- If the buyer's objections are unknown → assume the three most common: time, effort, and a secondary problem they expect to hit after solving the main one
- If you are unsure whether an item belongs in the core offer or as a bonus → make it a bonus; the "wow factor" rule applies: if it is short but high quality or value, it reads as more impressive as a bonus than as an implied inclusion
- If no partner bonuses have been negotiated yet → complete Steps 1-3 first, then run Step 4 as a separate effort
### Sufficiency Check
You have enough to proceed when:
1. You have at least three distinct deliverables or outcomes to work with
2. You know your core offer price
3. You can name at least two anticipated buyer objections or obstacles
---
## Process
### Step 1 — Audit Every Deliverable Against the 11-Point Bonus Quality Checklist
Take each deliverable and score it. High-scoring items become named bonuses. Low-scoring items either stay as silent inclusions in the core offer or get improved before presenting.
For each deliverable, check:
1. **Always offer it.** Every deliverable should be visible. If you are delivering it anyway, name it and give it a value. Unnamed value is invisible value.
2. **Give it a benefit-in-the-title name.** "Sales Script" is a deliverable. "The 7-Figure Close Script That Eliminates Price Objections In Under 60 Seconds" is a bonus. The name does selling before you say a word.
3. **Connect it to the buyer's specific issue.** State how this bonus directly addresses their problem. Generic bonuses feel like filler. Specific bonuses feel like gifts.
4. **Explain what it is.** State clearly what the buyer receives: a checklist, a template, a video, a live session, a report. Ambiguity reduces perceived value.
5. **Explain how you discovered or created it.** Origin story creates credibility. "I built this after losing $200k so you don't have to" is worth more than "here's a checklist."
6. **Explain how it makes their life faster, easier, or lower effort.** This is the value equation in action: value rises as time-to-result drops and effort required drops. Always frame bonuses through this lens.
7. **Provide proof.** A stat, a past client result, or a personal experience. Proof converts a claimed value into a believed value.
8. **Paint the mental image.** Describe the buyer's life after using this bonus as if they have already experienced the benefit. Future pacing converts abstract value into felt value.
9. **Assign a price tag and justify it.** Every bonus needs a dollar value — not arbitrary, but defensible. "This normally costs $X because Y" is the minimum. If you sell this item separately, use that price. If you do not, price it at what it would cost to hire someone to produce the equivalent outcome.
10. **Address a specific objection or anticipated obstacle.** The best bonuses do not add value generically — they dissolve the specific reason the buyer hesitates. Map each bonus to a "I can't or won't succeed because..." belief and show why that belief is wrong.
11. **Solve the next logical need.** After the buyer succeeds with your core offer, what problem do they hit next? Bonuses that pre-solve that next problem extend the buyer's loyalty and remove a reason to look elsewhere.
**Bonus format upgrade — tools and checklists beat trainings.** A checklist or template requires less effort and time from the buyer than a course or training. Per the value equation, lower effort means higher value. Prefer deliverables that are immediately usable over deliverables that require the buyer to invest time learning. If you have training content, record it once, then create a checklist or swipe file that extracts the core action without requiring the buyer to watch it.
**Value eclipsing rule.** The total stated value of your bonuses should exceed the price of the core offer. This is not deception — it is psychology. When bonuses alone are worth more than the price, the core offer reads as if it were free. The buyer's subconscious also reasons: "If these are the extras, the main thing must be even more valuable." Both effects work in your favor.
**Scarcity and urgency amplifiers (optional, apply carefully):**
- **Scarcity version:** "These bonuses are only available through this program — they are never sold separately." Limits access, not time. Works when true.
- **Urgency version:** "If you join today, I will add [bonus X] valued at $1,000. I do this to reward people who act quickly." Limits window, not access. Only use real urgency; manufactured urgency destroys trust.
---
### Step 2 — Name, Value, and Order the Stack
With your audited bonuses, build the stack document:
**For each bonus, write one entry:**
```
Bonus Name: [Benefit-in-the-title name]
What it is: [One sentence — format + content]
Why it matters: [How it addresses their specific issue or obstacle]
How you created it: [Origin — cost you paid, experience it encodes]
What it does for them: [Faster/easier/lower effort — value equation framing]
Proof: [Stat, client result, or personal experience]
Assigned value: $[amount] — because [justification]
Objection it destroys: [Specific belief it refutes]
```
**Ordering rule:** Place the strongest, most emotionally resonant bonus first — it sets the standard for what follows. Order remaining bonuses by perceived value, descending. Hold two or three bonuses in reserve; these are your objection-busters deployed only if the prospect does not buy on first ask (see Step 3). The buyer does not know these exist until they are needed.
**Total value calculation:** Sum all bonus values. Confirm the total exceeds the core offer price. If it does not, either add more bonuses or create a partner bonus (Step 4) to close the gap.
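The ordering rule and the total value check can be sketched in Python as follows. Several things here are assumptions: the `Bonus` class and `build_stack` function are hypothetical names, the dollar values and core price are made up for illustration, and sorting by assigned value stands in for the human judgment of which bonus is most emotionally resonant.

```python
from dataclasses import dataclass


@dataclass
class Bonus:
    name: str
    value: int             # assigned dollar value
    reserve: bool = False  # held back as an objection-buster


def build_stack(bonuses: list[Bonus], core_price: int) -> dict:
    """Order presented bonuses by value (descending) and check the total."""
    presented = sorted((b for b in bonuses if not b.reserve),
                       key=lambda b: b.value, reverse=True)
    held = [b for b in bonuses if b.reserve]
    total = sum(b.value for b in bonuses)  # "Sum all bonus values"
    return {"presented": presented, "reserve": held, "total": total,
            "exceeds_price": total > core_price}


# Hypothetical values and price, for illustration only.
bonuses = [
    Bonus("The Front Desk Revenue Script", 497),
    Bonus("The 6-Week Member Retention Email Sequence", 997),
    Bonus("The Monthly Fitness Challenge Template", 297, reserve=True),
]
result = build_stack(bonuses, core_price=1_500)

print([b.name for b in result["presented"]])
print(result["total"], result["exceeds_price"])
```

If `exceeds_price` comes back `False`, that is the signal to add bonuses or source a partner bonus before presenting the stack.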
---
### Step 3 — Deploy the Before/After-Objection Sequence in Sales Conversations
The deployment sequence differs depending on whether the buyer says yes or no on first ask.
**First ask — always ask for the sale before presenting bonuses.**
State the core offer and price. Ask for the sale. Do not volunteer bonuses before the prospect has responded. Offering bonuses before the ask signals that you lack confidence in your core price.
**If the prospect says yes:**
- Complete the transaction
- Then reveal the additional bonuses they are going to receive
- Frame this as a reward for their decision: "Now that you're in, here's everything else you're getting..."
- This creates a post-purchase "wow experience" that reinforces the buying decision and reduces buyer's remorse
**If the prospect does not buy:**
1. Identify the objection — the specific reason they are not buying. Do not guess; ask: "What's holding you back?"
2. Match a held-in-reserve bonus to that exact objection. The bonus should directly address the concern, not just add general value.
3. Present the bonus: "I understand [their concern]. I actually have something that addresses exactly that. I'm going to add [Bonus Name] — normally $X — for free. Does that make this feel fair?"
4. Ask again. Do not apologize. Adding a bonus is a move from strength — you are giving more, not retreating on price.
5. If they still object, identify the new objection and match another bonus. Repeat once more if needed.
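The matching step in this sequence is essentially a lookup from objection to reserve bonus. The Python sketch below shows that lookup under stated assumptions: `respond_to_objection` is a hypothetical helper, the bonus names are invented for illustration, and objections are reduced to single keywords, which a real conversation would not be.

```python
def respond_to_objection(objection: str, reserve: dict[str, str]) -> str:
    """Pick the held-in-reserve bonus mapped to this exact objection.

    `reserve` maps an objection keyword to a bonus name. A matched bonus
    is removed so that each one is offered only once.
    """
    bonus = reserve.pop(objection, None)
    if bonus is None:
        return "No matching reserve bonus: ask again what is holding them back."
    return f"Add '{bonus}' for free, then ask for the sale again."


# Hypothetical reserve, keyed by two of the default objections.
reserve = {
    "time": "Done-in-a-Weekend Setup Checklist",
    "effort": "Fill-in-the-Blank Script Pack",
}

print(respond_to_objection("time", reserve))
print(respond_to_objection("time", reserve))   # already used
print(respond_to_objection("price", reserve))  # nothing mapped
```

Nothing in the lookup ever changes the price. Every branch either adds a bonus or goes back to identifying the objection.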
**The reciprocity mechanism:** Humans struggle to refuse someone who has just given them something. Each bonus added to address an objection creates social pressure toward yes. The buyer must actively reject goodwill — a much harder psychological position than simply declining a straightforward offer.
**Never discount the core price.** Discounting teaches the buyer that your prices are always negotiable, which is permanently damaging. Adding bonuses to close a deal increases value without undermining price integrity. You remain in a position of strength.
---
### Step 4 — Build the Partner Bonus System
Partner bonuses are high-value additions to your offer sourced from adjacent businesses at zero cost to you. The partner gets free exposure to your buyers (their ideal prospects). You get valuable bonuses that may also generate affiliate commissions.
**The three-step partner bonus system:**
**Step A — Identify adjacent businesses.**
Ask: What does my customer need next, after they start working with me? What businesses serve that need? List every category — these are your partner targets. The only constraint: partners must not be direct competitors.
Example for a business coaching client: bookkeeper, attorney, email marketing software, ad agency, copywriter, paid traffic specialist, productivity app.
**Step B — Negotiate the exchange.**
Approach each business with the same framing: "I send you my highest-quality prospects — people who have already bought, are motivated, and need exactly what you offer. In exchange, you give my clients a free session, a discount, or access to your product as a bonus. You pay nothing. I pay nothing. My clients get more value."
Target outcomes (aim for one or both):
- A group discount or free access to give your buyers
- An affiliate commission paid to you for each client you refer
**Step C — Create a grand slam offer with each partner.**
Once a partner relationship is established, apply the same offer-building logic to the partner's product. Help them create a compelling framing for the bonus so it reads as high-value, not as a promotional filler item. A well-framed partner bonus is worth more than a poorly-framed internal bonus.
**Revenue math:** Your bonuses can become direct revenue streams. If your offer is $400 and you negotiate affiliate commissions from five partners, those commissions can exceed the $400 price. The buyer acquisition cost you already paid then generates additional profit from referrals — at zero incremental effort.
---
### Step 5 — Assemble the Final Bonus Stack Document
Write the complete stack as a deliverable you can use in sales conversations, sales pages, or handoff documents for a sales team.
**Structure:**
```
OFFER PRICE: $[amount]
CORE DELIVERABLE: [One-sentence description of what you deliver]
BONUS STACK:
Bonus 1: [Name] — Value: $[amount]
[Two-sentence description: what it is + what it does for them]
Bonus 2: [Name] — Value: $[amount]
[Two-sentence description]
[Continue for all bonuses]
TOTAL BONUS VALUE: $[sum]
TOTAL OFFER VALUE: $[core + bonuses]
YOU PAY: $[price]
OBJECTION RESERVE (do not present unless prospect declines first ask):
Reserve Bonus A: [Name] — deploys against: [specific objection]
Reserve Bonus B: [Name] — deploys against: [specific objection]
```
This document is the deliverable. Review it against the value eclipsing rule before finalizing: total stated bonus value must exceed the core offer price.
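The value eclipsing check is simple arithmetic, so it can be run before the document goes out. The sketch below is a minimal illustration, not part of the source method: the `Bonus` class, the `check_value_eclipsing` function, and the dollar figures are all hypothetical. It sums the stated bonus values and compares the total with the offer price, using a strict "greater than" because the rule says the bonus value must exceed the price.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bonus:
    name: str
    stated_value: float  # dollar value shown to the buyer


def check_value_eclipsing(price: float, bonuses: List[Bonus]) -> dict:
    """Compare the total stated bonus value with the offer price."""
    total_bonus_value = sum(b.stated_value for b in bonuses)
    return {
        "price": price,
        "total_bonus_value": total_bonus_value,
        # Rule from the text: stated bonus value must exceed the price.
        "passes": total_bonus_value > price,
        # How much stated value is still missing, if any.
        "shortfall": max(0.0, price - total_bonus_value),
    }


# Hypothetical stack for a $400 offer (names and values are made up).
stack = [Bonus("Bonus A", 150), Bonus("Bonus B", 120), Bonus("Bonus C", 180)]
result = check_value_eclipsing(400, stack)
print(result)
```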
---
## Examples
### Case Study: DFY-to-DWY Gym Business Pivot
A gym owner originally sold done-for-you (DFY) services at a high price with thin margins. Transitioning to a done-with-you (DWY) model meant the core deliverable changed from physical execution to coaching and systems.
The offer remained priced similarly, but the bonus stack allowed the pivot to succeed:
- The gym owner extracted every template, checklist, and script they had built over years of DFY work and packaged them as standalone bonuses: "The 6-Week Member Retention Email Sequence," "The Front Desk Revenue Script," "The Monthly Fitness Challenge Template."
- Each had been part of the invisible DFY execution. As named bonuses with assigned values, they made the DWY offer feel richer than the DFY offer even at a lower labor cost.
- Partner bonuses from a supplement company (group discount + commission to the gym owner) made the total stated bonus value exceed the offer price — fulfilling the value eclipsing rule.
- Objection reserve bonuses targeting "I don't have time to manage this myself" dissolved the main transition objection.
**The lesson:** Transitioning delivery models is also a re-presentation problem. Bonuses make the same underlying value visible in a new form, allowing a pivot without a price reduction.
### Partner Bonus Example: Pain Clinic
A pain clinic with a $400 offer negotiated partner bonuses from adjacent health businesses. The value of partner bonuses alone exceeded the $400 price:
- Chiropractor: 2 free adjustments ($100 value) + $100 affiliate commission per patient referred
- Low-inflammation food company: product discount ($50 savings) + free product affiliate benefit
- Orthotics company: discounts ($150 savings) + $100 per person referred
- Health club: free personal training session + one month pool membership ($100 value) + $50 per signup referred
- Pharmacy: $100/month in prescription savings
Total affiliate commissions on a single client: ~$350 — nearly covering the $400 offer price in downstream revenue.
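The figures in this example can be tallied directly. The sketch below is only a worked sum over the numbers itemized in the list above; the variable names are hypothetical, and it assumes one month of pharmacy savings and treats the food company's affiliate benefit as unquantified. Note that the cash commissions itemized in the list add up to $250, so the ~$350 total presumably includes amounts that the list does not break out.

```python
PRICE = 400

# (partner, stated buyer value in $, itemized cash commission in $ or None)
partners = [
    ("Chiropractor", 100, 100),
    ("Low-inflammation food company", 50, None),  # affiliate benefit not given in cash
    ("Orthotics company", 150, 100),
    ("Health club", 100, 50),
    ("Pharmacy", 100, None),  # assumption: count one month of savings
]

total_value = sum(value for _, value, _ in partners)
itemized_commissions = sum(c for _, _, c in partners if c is not None)

print(f"Stated partner bonus value: ${total_value} (offer price ${PRICE})")
print(f"Itemized cash commissions: ${itemized_commissions}")
print("Bonus value exceeds price:", total_value > PRICE)
```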
---
## Key Principles
**Enumerate to create value.** Value that is not named does not exist in the buyer's mind. Every deliverable you provide silently is value you are giving away without receiving credit for. Bonuses make implicit value explicit.
**Bonuses over discounts, always.** Discounting tells the market your price was wrong. Adding bonuses tells the market your offer is generous. One undermines your positioning; the other reinforces it. When you feel pressure to close, add value — never subtract price.
**Tools and checklists over trainings.** Lower buyer effort equals higher perceived value. A one-page checklist that produces results in 15 minutes is worth more to a buyer than a 3-hour training covering the same ground. Build for usability, not impressiveness.
**Value eclipsing is psychological leverage.** When the stated bonus value exceeds the price, the buyer's subconscious concludes two things: (1) the bonuses alone are worth the price, so anything in the core offer is pure gain; (2) the core offer must be even more valuable than the bonuses, because these are just the extras. Neither conclusion requires you to say anything — the math communicates it.
**The bonus vault compounds over time.** Every recorded workshop, every client result, every tool you build becomes a bonus asset. Treat them as inventory. A library of 20 validated bonus assets means you can build and customize a stack for any new offer in under an hour.
---
## References
- `grand-slam-offer-creation` — Build the core offer that this skill enhances. The deliverables list from that skill is the raw material for your bonus stack.
- `value-equation-offer-audit` — Diagnose which elements of your offer feel high-effort or slow. These are the bonuses to prioritize: tools or shortcuts that address exactly those friction points score highest.
- `offer-naming-magic-formula` — Apply benefit-driven naming to each bonus. The bonus name does as much selling as the bonus itself.
- `guarantee-design-and-selection` — Guarantees and bonuses work together. A strong guarantee removes fear of loss; a strong bonus stack removes doubt about value. Pair them for maximum conversion impact.
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — $100M Offers: How To Make Offers So Good People Feel Stupid Saying No by Alex Hormozi.
## Related BookForge Skills
Install related skills from ClawHub:
- `clawhub install bookforge-grand-slam-offer-creation`
- `clawhub install bookforge-value-equation-offer-audit`
Or install the full book set from GitHub: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
Write, rewrite, or audit customer interview questions so they extract honest behavioral data instead of false validation. Use this skill whenever the user ne...
---
name: conversation-question-designer
description: Write, rewrite, or audit customer interview questions so they extract honest behavioral data instead of false validation. Use this skill whenever the user needs to prepare a conversation script for customer discovery, draft questions for an upcoming interview, fix questions that keep producing useless or vague answers, rewrite biased or leading questions into past-focused behavior-revealing ones, check whether their interview questions will trigger compliments instead of facts, or build a question list from learning goals — even if they don't mention "question design" or "The Mom Test." Do NOT use this skill to analyze conversation notes after a meeting (use conversation-data-quality-analyzer) or to decide which questions matter most strategically (use question-importance-prioritizer).
version: 1.0.0
homepage: https://github.com/bookforge-ai/bookforge-skills/tree/main/books/the-mom-test/skills/conversation-question-designer
metadata: {"openclaw":{"emoji":"📚","homepage":"https://github.com/bookforge-ai/bookforge-skills"}}
status: verified
source-books:
- id: the-mom-test
title: "The Mom Test"
authors: ["Rob Fitzpatrick"]
chapters: [1]
tags: [customer-discovery, interview-questions, customer-conversations, validation, question-design]
depends-on: []
execution:
tier: 1
mode: hybrid
inputs:
- type: document
description: "Product idea description, draft question list, or conversation topic"
tools-required: [Read, Write]
tools-optional: []
mcps-required: []
environment: "Any agent environment with file read/write access."
---
# Conversation Question Designer
## When to Use
You are preparing for a customer conversation and need questions that will produce honest, useful data rather than false validation. Typical situations:
- The user has a list of draft interview questions and wants them reviewed or improved
- The user is about to talk to potential customers and needs a conversation script
- The user wants to validate a product idea but is not sure what to ask
- The user has a conversation topic or learning goal but no specific questions yet
- The user received unhelpful answers from previous conversations and wants to fix their approach
Before starting, verify:
- Does the user have a product idea or problem area to explore? (If not, help them articulate one first)
- Does the user know who they will be talking to? (Customer type affects which questions matter most)
**Mode: Hybrid** — The agent designs the question script and analyzes/rewrites questions. The human conducts the actual conversation.
## Context & Input Gathering
### Required Context (must have — ask if missing)
- **Product idea or problem area:** What is the user building or exploring? This shapes which behaviors and past experiences to ask about.
→ Check prompt for: product descriptions, problem statements, startup ideas, feature concepts
→ Check environment for: `product-idea.md`, `README.md`, pitch documents
→ If still missing, ask: "What product idea or problem area are you exploring? A sentence or two is enough."
- **Learning goals:** What does the user want to learn from these conversations? This determines which questions matter most.
→ Check prompt for: assumptions to validate, unknowns, hypotheses, "I want to find out..."
→ Check environment for: `learning-log.md`, `question-script.md`, previous conversation notes
→ If still missing, ask: "What are the 1-3 most important things you want to learn from these conversations?"
### Observable Context (gather from environment)
- **Existing question drafts:** Check for prepared questions the user wants reviewed
→ Look for: `question-script.md`, question lists in the prompt, bullet-pointed questions
→ If unavailable: generate questions from scratch based on learning goals
- **Customer segment:** Who will the user be talking to?
→ Look for: `customer-segments.md`, persona descriptions, target market references
→ If unavailable: ask "Who will you be talking to? (e.g., small business owners, enterprise CTOs, parents)"
- **Previous conversation notes:** Past learnings that should inform new questions
→ Look for: `conversation-notes/`, `learning-log.md`
→ If unavailable: assume first round of conversations
### Default Assumptions
- If no customer type specified → design questions generic enough for early exploration and note this limitation
- If no stage specified → assume pre-product (learning phase, not selling)
- If no prior conversations → assume this is the first batch and questions should start broad
### Sufficiency Threshold
```
SUFFICIENT when ALL of these are true:
- Product idea or problem area is known
- At least 1 learning goal is identified
- Customer type is known or defaulted
PROCEED WITH DEFAULTS when:
- Product idea is known but learning goals are vague
- Customer type is approximate ("probably small businesses")
MUST ASK when:
- No product idea or problem area at all
- User provides questions but no context on what they are building
```
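
The threshold above amounts to a three-way classification. The sketch below is one possible reading, not a prescribed implementation: the `Readiness` enum, the `assess_context` function, and its flags are hypothetical names, and it assumes that missing learning goals are handled like vague ones and a missing customer type like an approximate one.

```python
from enum import Enum
from typing import List, Optional


class Readiness(Enum):
    SUFFICIENT = "sufficient"
    PROCEED_WITH_DEFAULTS = "proceed with defaults"
    MUST_ASK = "must ask"


def assess_context(
    product_idea: Optional[str],
    learning_goals: List[str],
    customer_type: Optional[str],
    goals_are_vague: bool = False,
    customer_is_approximate: bool = False,
) -> Readiness:
    """Classify the gathered context against the sufficiency threshold."""
    # No product idea or problem area: there is nothing to design around.
    if not product_idea or not product_idea.strip():
        return Readiness.MUST_ASK
    # Assumption: missing goals are treated like vague goals.
    if not learning_goals or goals_are_vague:
        return Readiness.PROCEED_WITH_DEFAULTS
    # Assumption: a missing customer type is treated like an approximate one.
    if customer_type is None or customer_is_approximate:
        return Readiness.PROCEED_WITH_DEFAULTS
    return Readiness.SUFFICIENT


print(assess_context("PM tool for freelancers", ["current tools"], "freelancers"))
print(assess_context("PM tool for freelancers", [], None))
print(assess_context(None, ["current tools"], "freelancers"))
```

The "user provides questions but no context" case falls under the first branch, since it also means no product idea is known.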
## Process
### Step 1: Extract Learning Goals
**ACTION:** Identify the 3 most important things the user needs to learn from upcoming conversations. If the user has provided learning goals, validate them. If not, derive them from their product idea and assumptions.
**WHY:** Questions without clear learning goals produce scattered, unusable data. Every question in the script must connect to a specific learning goal, otherwise the conversation wanders and the user leaves with opinions instead of facts.
**IF** the user has stated learning goals → validate they are about customer behavior/problems (not about validating the solution)
**IF** learning goals are solution-focused (e.g., "Would people use my app?") → reframe toward the underlying behavior: "How are people currently solving this problem?"
**OUTPUT:** A numbered list of 3 learning goals, each framed as a behavior or fact to discover (not an opinion to collect).
### Step 2: Audit Existing Questions (if provided)
**ACTION:** Evaluate each question against the 3 customer conversation quality rules (The Mom Test):
| Rule | Test | Failure Pattern |
|------|------|----------------|
| **Rule 1: Their life, not your idea** | Does this question ask about the customer's actual behavior, problems, or workflow — or does it ask them to evaluate your concept? | Questions containing "my idea," "this product," "would you use," "do you think" |
| **Rule 2: Specifics in the past, not generics about the future** | Does this question ask about concrete events that already happened — or does it ask for predictions, hypotheticals, or generalizations? | Questions containing "would you," "will you," "do you ever," "how much would you pay," future tense |
| **Rule 3: Listen more than talk** | Does this question invite an open-ended response — or does it lead toward a specific answer? | Yes/no questions, questions with the answer embedded, multi-part questions that overwhelm |
**WHY:** Bad questions are the default. Most people naturally ask opinion-seeking, future-hypothetical, or idea-validating questions because those feel productive. But the answers to those questions are worthless — people are overly optimistic about the future and will lie to avoid hurting your feelings. Auditing against these 3 rules catches the specific failure modes before the conversation happens.
**FOR EACH** question in the user's draft list:
1. Rate as PASS, FAIL, or FIXABLE against each of the 3 rules
2. If FAIL or FIXABLE → identify which rule(s) it violates and why
3. Write a rewritten version that passes all 3 rules
4. Note what learning goal this question serves (or flag it as goalless)
**IF** no existing questions are provided → skip to Step 3.
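
A keyword screen can serve as a first pass before the manual audit. The sketch below is a rough heuristic under stated assumptions: the phrase lists are copied from the Failure Pattern column, the yes/no detection (a question that opens with an auxiliary verb) is an assumption of this example, and `screen_question` is a hypothetical name. It only flags candidates for review; the PASS, FAIL, or FIXABLE rating still needs judgment.

```python
import re

# Phrase lists taken from the "Failure Pattern" column of the rules table.
RULE_1_PHRASES = ["my idea", "this product", "would you use", "do you think"]
RULE_2_PHRASES = ["would you", "will you", "do you ever", "how much would you pay"]

# Assumption: a question that opens with an auxiliary verb is a yes/no question.
YES_NO_OPENER = re.compile(
    r"^(do|does|did|is|are|was|were|would|will|can|could|should|have|has)\b"
)


def screen_question(question: str) -> dict:
    """Flag which rules a question may violate, based on surface patterns only."""
    text = question.strip().lower()
    violations = []
    if any(phrase in text for phrase in RULE_1_PHRASES):
        violations.append("Rule 1")
    if any(phrase in text for phrase in RULE_2_PHRASES):
        violations.append("Rule 2")
    if YES_NO_OPENER.match(text):
        violations.append("Rule 3")
    return {"question": question, "violations": violations}


for q in [
    "Would you pay $10/month for it?",
    "How do you manage your projects?",
    "Walk me through how you managed your most recent client project.",
]:
    print(screen_question(q))
```

The second question passes the screen even though it is too generic to be useful as written, which is why the screen cannot replace the rule-by-rule audit.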
### Step 3: Generate the Question Script
**ACTION:** Build a structured conversation script organized by learning goal. For each learning goal, create 3-5 questions that progress from broad to specific.
**WHY:** Conversations need structure to produce usable data, but rigid scripts kill natural flow. Organizing by learning goal lets the human navigate flexibly — they can follow interesting threads while keeping track of what they still need to learn. Starting broad prevents premature zoom (asking about a specific problem before confirming the person even cares about that area).
**Question design rules — apply to every question:**
1. **Ask about their life, not your idea.** Frame questions around their current behavior, workflows, problems, and goals. Never mention your product concept unless deliberately testing a later-stage hypothesis.
2. **Ask about specifics in the past, not generics about the future.** Replace "Would you..." with "When was the last time..." Replace "Do you usually..." with "Talk me through what happened last time..." Past behavior is the only reliable predictor of future behavior.
3. **Use open questions that invite stories.** "Talk me through..." and "Tell me about the last time..." produce richer data than "Do you..." or "How often do you..."
4. **Include motivation-revealing questions.** "Why do you bother?" and "What are the implications?" separate must-solve problems from nice-to-haves.
5. **Include commitment-testing questions.** "What have you already tried to solve this?" and "How are you dealing with it now?" reveal whether the problem is painful enough to drive action.
6. **End with network-expanding questions.** "Who else should I talk to?" and "Is there anything else I should have asked?" generate warm introductions and catch blind spots.
**Script structure per learning goal:**
```
## Learning Goal: [goal description]
### Opener (broad, non-leading)
- [Question that explores whether this area matters to them at all]
### Depth questions (specific, past-focused)
- [Question about concrete past behavior]
- [Question about current workflow/workaround]
- [Question about implications/severity]
### Commitment signal (reveals real vs stated priority)
- [Question about what they have tried or are spending]
```
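
If the script is assembled programmatically, each learning goal maps naturally onto a small record. The sketch below is an illustrative assumption rather than a required format: `GoalSection` and its `render` method are hypothetical, and the sample questions are borrowed from the executive-report scenario in the Examples section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoalSection:
    goal: str
    opener: str
    depth: List[str] = field(default_factory=list)
    commitment: str = ""

    def render(self) -> str:
        """Render one learning goal in the script structure shown above."""
        lines = [
            f"## Learning Goal: {self.goal}",
            "### Opener (broad, non-leading)",
            f"- {self.opener}",
            "### Depth questions (specific, past-focused)",
        ]
        lines += [f"- {q}" for q in self.depth]
        lines += [
            "### Commitment signal (reveals real vs stated priority)",
            f"- {self.commitment}",
        ]
        return "\n".join(lines)


section = GoalSection(
    goal="Current reporting workflow",
    opener="Tell me about the last executive report you put together.",
    depth=[
        "What was the most annoying part of that process?",
        "What happened after you delivered it? Any follow-up requests or revisions?",
    ],
    commitment="Have you tried any tools or shortcuts to speed this up?",
)
rendered = section.render()
print(rendered)
```

Keeping the goal on the record makes goalless questions hard to add by accident, since every question has to live inside some section.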
### Step 4: Add Anti-Bias Safeguards
**ACTION:** Review the complete script for common bias traps and add inline warnings where the human might accidentally slip into bad patterns during the conversation.
**WHY:** Even with good prepared questions, conversations go off-script. The human needs to recognize danger signals in real-time. Adding warnings at specific points in the script acts as a field guide during the conversation.
**Bias traps to flag:**
| Trap | Signal | Recovery |
|------|--------|----------|
| **Compliment fishing** | You just described your idea and they said "That sounds great!" | Deflect: "Thanks — but tell me, how are you currently handling this?" |
| **Accepting vague enthusiasm** | They say "I would definitely use that" or "That is exactly what I need" | Anchor: "When was the last time this problem came up? Walk me through what happened." |
| **Pitching instead of asking** | You have been talking for more than 30 seconds without asking a question | Stop and ask: "Sorry, I got excited. Can I ask — how are you dealing with this right now?" |
| **Leading toward your solution** | Your question contains your product's features or approach | Reframe around their problem, not your solution |
| **Accepting generic claims** | They say "I usually..." or "I always..." or "I never..." | Anchor: "Can you tell me about the most recent specific time?" |
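
For rehearsal or for reviewing a transcript afterwards, the verbal signals in this table can be expressed as a small lookup. This is a sketch under assumptions: the cue phrases are the ones quoted in the Signal column, the recoveries are paraphrased, and `TRAPS` and `suggest_recovery` are hypothetical names. The two traps that depend on the interviewer's own behavior (pitching, leading) are left out because they cannot be detected from what the customer says.

```python
from typing import Optional, Tuple

# (trap name, cue phrases quoted in the table, paraphrased recovery)
TRAPS = [
    ("Compliment fishing", ["sounds great"],
     "Deflect: ask how they are currently handling this."),
    ("Accepting vague enthusiasm", ["i would definitely", "exactly what i need"],
     "Anchor: ask when the problem last came up and what happened."),
    ("Accepting generic claims", ["i usually", "i always", "i never"],
     "Anchor: ask about the most recent specific time."),
]


def suggest_recovery(utterance: str) -> Optional[Tuple[str, str]]:
    """Return (trap, recovery) for the first matching cue, or None."""
    text = utterance.lower()
    for trap, cues, recovery in TRAPS:
        if any(cue in text for cue in cues):
            return trap, recovery
    return None


print(suggest_recovery("That sounds great!"))
print(suggest_recovery("I would definitely use that."))
print(suggest_recovery("Last Tuesday I spent two hours on it."))
```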
### Step 5: Produce the Deliverable
**ACTION:** Write the final question script as a structured document the user can reference during their conversation.
**WHY:** The deliverable must be usable in the field — not a theoretical analysis. It should fit on a single page, be scannable during a live conversation, and clearly connect each question to what the user is trying to learn.
**Output format:**
```markdown
# Conversation Question Script
## Context
- **Product/Problem Area:** [from input]
- **Target Customer:** [from input]
- **Date Prepared:** [today]
## Learning Goals
1. [Goal 1]
2. [Goal 2]
3. [Goal 3]
## Question Script
### Learning Goal 1: [description]
**Opener:** [broad question]
**Depth:**
- [specific past-focused question]
- [workflow/workaround question]
- [implication/severity question]
**Commitment signal:** [what-have-you-tried question]
> WATCH OUT: [relevant bias trap warning]
### Learning Goal 2: [description]
[same structure]
### Learning Goal 3: [description]
[same structure]
## Closing Questions (use for every conversation)
- "Who else should I talk to?" — generates warm intros
- "Is there anything else I should have asked?" — catches blind spots
## Quick Reference: Bias Recovery
| If you hear... | Do this... |
|----------------|-----------|
| "That sounds great!" | Deflect → ask about their current process |
| "I would definitely..." | Anchor → ask about most recent specific instance |
| "I usually/always/never..." | Anchor → "Tell me about the last time" |
| Feature requests | Dig → "Why do you want that? What would it let you do?" |
## Questions Rewritten (if audit was performed)
| # | Original (FAIL) | Rewritten (PASS) | Rule Violated |
|---|-----------------|------------------|---------------|
| 1 | [original] | [rewritten] | [rule 1/2/3] |
```
**IF** the user provided a file path or working directory → write the output to `question-script.md`
**ELSE** → present the output directly in the conversation
## Examples
**Scenario: Founder with draft questions about a project management tool**
Trigger: "I'm building a project management tool for freelancers. Here are my interview questions: 1) Do you think a simpler project management tool would be useful? 2) Would you pay $10/month for it? 3) How do you manage your projects?"
Process:
1. Extract learning goals: (a) Do freelancers have project management pain? (b) What do they currently use? (c) What do they already spend, in time or money, on managing projects?
2. Audit existing questions:
- Q1 FAIL (Rule 1: asks them to evaluate your idea; Rule 2: hypothetical)
- Q2 FAIL (Rule 1: about your product; Rule 2: hypothetical pricing)
- Q3 FIXABLE (Rule 2: too generic, needs past-tense anchoring)
3. Generate rewritten script with depth questions
Output (abbreviated):
```
## Questions Rewritten
| # | Original (FAIL) | Rewritten (PASS) | Rule Violated |
|---|-----------------|------------------|---------------|
| 1 | "Do you think a simpler PM tool would be useful?" | "Tell me about the last project you managed. What went well and what was frustrating?" | Rule 1, Rule 2 |
| 2 | "Would you pay $10/month for it?" | "How much time do you spend on project admin each week? What tools are you paying for right now?" | Rule 1, Rule 2 |
| 3 | "How do you manage your projects?" | "Walk me through how you managed your most recent client project from start to finish." | Rule 2 |
## Learning Goal 1: Do freelancers have project management pain?
Opener: "What does a typical work week look like for you?"
Depth:
- "Tell me about the last time a project got disorganized. What happened?"
- "What are the implications when project details fall through the cracks?"
- "How seriously do you take project organization compared to, say, finding new clients?"
Commitment signal: "Have you looked into any tools or systems to help with this?"
```
---
**Scenario: Product manager preparing for first customer discovery conversations**
Trigger: "We're exploring a new feature for our analytics platform that would auto-generate executive reports. I need to prepare questions for talking to marketing directors. No existing questions yet."
Process:
1. Identify learning goals: (a) How do marketing directors currently create executive reports? (b) How painful is the current process? (c) Who else is involved in reporting?
2. Skip audit (no existing questions)
3. Generate full question script from scratch
4. Add anti-bias safeguards (flag the risk of pitching the auto-generate feature)
Output (abbreviated):
```
## Learning Goal 1: Current reporting workflow
Opener: "Tell me about the last executive report you put together."
Depth:
- "Walk me through the steps — where did the data come from, who was involved, how long did it take?"
- "What was the most annoying part of that process?"
- "What happened after you delivered it? Any follow-up requests or revisions?"
Commitment signal: "Have you tried any tools or shortcuts to speed this up?"
> WATCH OUT: Do NOT mention "auto-generated reports" during this section.
> If they mention reporting pain, resist the urge to pitch — instead ask
> "What have you tried to make it faster?"
```
---
**Scenario: Rewriting a single biased question**
Trigger: "Is this a good question to ask potential customers: 'Would you buy an app that helps you track your calories more easily?'"
Process:
1. Identify that the question violates Rule 1 (about your app) and Rule 2 (hypothetical future)
2. Rewrite to pass all 3 rules
3. Provide the rewritten version with explanation
Output:
```
FAIL — violates Rule 1 (asks about your idea, not their life) and Rule 2 (hypothetical future).
The answer will almost always be "yes" because people are optimistic about
hypotheticals and want to be supportive. This tells you nothing.
Rewritten:
- "How are you tracking what you eat right now?" (their current behavior)
- "Walk me through what you did for food tracking yesterday." (specific past)
- "What else have you tried? What did you like and hate about it?" (commitment signal)
- "How much are you spending on health/nutrition apps right now?" (price anchor from reality)
These questions reveal whether they actually care enough to have tried
solving this — which is far more predictive than whether they say "yes"
to a hypothetical.
```
## Key Principles
- **Opinions are worthless; behavior is everything** — People will tell you your idea is great to avoid hurting your feelings. But they cannot fake what they have actually done in the past. Always ask about concrete past actions, never about future intentions or abstract opinions. A question like "Would you use X?" is answered with optimism. "When did you last try to solve X?" is answered with facts.
- **The best questions make your idea invisible** — If the person you are talking to does not even know you have a product idea, they cannot lie to you about it. The moment you reveal your concept, every response becomes contaminated by their desire to be supportive (or contrarian). Keep the conversation about their life, their problems, their workflow.
- **Generic claims need anchoring to specifics** — When someone says "I always" or "I usually" or "I would," that is a generic claim (vague, non-specific feedback). It sounds like data but it is noise. Anchor it to reality by asking "When was the last time that happened?" or "Can you walk me through a specific example?" The specific story either confirms the generic claim or contradicts it.
- **Every question must serve a learning goal** — Questions without a clear purpose produce interesting but useless conversations. Before adding any question to the script, verify: "If they answer this, which of my 3 learning goals does it advance?" Goalless questions waste scarce conversation time.
- **Premature specificity kills discovery** — If you zoom into your specific problem area before confirming the person even cares about that area, you get false validation. Start broad ("What are the biggest challenges in your role?") and only zoom in when they independently raise the topic you care about. If they do not mention it unprompted, it is probably not a top priority for them.
## References
- For the complete 14-question good/bad rubric with fix patterns, see [question-quality-rubric.md](references/question-quality-rubric.md)
- For analyzing conversation quality after conversations happen, use the `conversation-data-quality-analyzer` skill
- For prioritizing which questions matter most given business risk, use the `question-importance-prioritizer` skill
## License
This skill is licensed under [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/).
Source: [BookForge](https://github.com/bookforge-ai/bookforge-skills) — The Mom Test by Rob Fitzpatrick.
## Related BookForge Skills
This skill is standalone. Browse more BookForge skills: [bookforge-skills](https://github.com/bookforge-ai/bookforge-skills)
FILE:references/question-quality-rubric.md
# Question Quality Rubric
Complete evaluation rubric for customer discovery questions. Use this reference when auditing a question list or when you need the full reasoning behind why a question passes or fails.
## The 3 Rules (Customer Conversation Quality Rules / "The Mom Test")
| # | Rule | What it prevents |
|---|------|-----------------|
| 1 | Talk about their life instead of your idea | Prevents compliment-fishing and false validation |
| 2 | Ask about specifics in the past instead of generics or opinions about the future | Prevents optimism bias and hypothetical data |
| 3 | Talk less and listen more | Prevents leading, pitching, and confirmation bias |
## 14-Question Scored Rubric
### BAD Questions
**"Do you think it's a good idea?"**
- Verdict: FAIL (Rule 1, Rule 2)
- Why it fails: Asks for an opinion about your idea. Only the market can tell if an idea is good. Unless they are a deep industry expert, this is self-indulgent noise with a high risk of false positives.
- Fix: Ask them to show you how they currently do the thing your idea relates to. Ask which parts they love and hate. Ask what other tools and processes they tried before settling on this one. Are they actively searching for a replacement? Take all that information and decide for yourself whether it is a good idea.
- Rule of thumb: Opinions are worthless.
**"Would you buy a product which did X?"**
- Verdict: FAIL (Rule 1, Rule 2)
- Why it fails: Asks for a hypothetical future purchase from overly optimistic people who want to make you happy. The answer is almost always "yes," which makes it worthless.
- Fix: Ask how they currently solve X and how much it costs them in time and money. Ask them to talk you through what happened the last time X came up. If they have not tried to solve the problem, ask why not. If they have not even looked for solutions already, they are not going to buy yours.
- Rule of thumb: Anything involving the future is an over-optimistic lie.
**"How much would you pay for X?"**
- Verdict: FAIL (Rule 1, Rule 2)
- Why it fails: Same as above, but the number creates a false sense of rigor and precision. People are bad at predicting their future spending behavior.
- Fix: Ask about their life as it currently is. How much does the problem cost them? How much do they currently pay to solve it? How big is the budget they have allocated? If you are far enough along, literally ask for money — a deposit or pre-order tells the truth.
- Rule of thumb: People will lie to you if they think it is what you want to hear.
**"Would you pay X for a product which did Y?"**
- Verdict: FAIL (Rule 1, Rule 2)
- Why it fails: Adding a number does not help. Still hypothetical, still about your idea, still produces unreliable answers. People are overly optimistic about what they would do and want to make you happy.
- Fix: Ask about what they currently do, not what they believe they might do in the future. Price your product in terms of value to the customer rather than cost to you. You cannot quantify their perceived value without understanding their current financial reality.
### SORT-OF-OKAY Questions
**"What would your dream product do?"**
- Verdict: FIXABLE (Rule 2 — sort of)
- Why it is risky: On its own, this just collects feature requests — which is like building your product by committee. People know their problems but do not know how to solve them.
- When it works: Only if you ask good follow-ups. Treat it as the "set" before the spike in volleyball — use it to set up deeper questions about why they want each feature, what it would let them do, and how they are coping without it.
- Fix: Follow every feature request with: "Why do you want that?" and "What would that let you do?" and "How are you coping without it?"
- Rule of thumb: People know what their problems are, but they do not know how to solve those problems.
### GOOD Questions
**"Why do you bother?"**
- Verdict: PASS (all 3 rules)
- Why it works: Gets from the perceived problem to the real motivation. Great for uncovering the actual goal behind a stated need.
- Example: Finance people were asking for better messaging tools. "Why do you bother?" led to "so we can be certain that we're all working off the latest version." The solution ended up being less like a messaging tool and more like Dropbox.
- Rule of thumb: You are shooting blind until you understand their goals.
**"What are the implications of that?"**
- Verdict: PASS (all 3 rules)
- Why it works: Distinguishes between I-will-pay-to-solve-that problems and kind-of-annoying-but-I-can-deal-with-it problems. Some problems have big, costly implications. Others exist but do not actually matter. Also gives you a pricing signal.
- Example: Someone described a workflow with emotionally loaded terms ("DISASTER") but when asked about implications, said "Oh, we just ended up throwing a bunch of interns at the problem — it's actually working pretty well."
- Rule of thumb: Some problems don't actually matter.
**"Talk me through the last time that happened."**
- Verdict: PASS (all 3 rules)
- Why it works: Requests a concrete, specific story from the past. People cannot be wishy-washy when recounting a real event. You learn about their actual actions instead of their stated opinions. Being walked through their full workflow answers many questions simultaneously: how they spend their days, what tools they use, who they talk to, what the constraints are.
- Rule of thumb: Watching someone do a task will show you where the problems and inefficiencies really are, not where the customer thinks they are.
**"Talk me through your workflow."**
- Verdict: PASS (all 3 rules)
- Why it works: Same principle as above. Being walked through a real workflow reveals how your product fits into their day, which other tools you need to integrate with, and what constraints exist that they would never think to mention.
**"What else have you tried?"**
- Verdict: PASS (all 3 rules)
- Why it works: Reveals what they are currently using, how much it costs, what they love and hate about it, and how big of a pain it would be to switch. If they have not tried anything, that is a strong signal that the problem is not painful enough to drive action.
- Important: A future-promise statement ("I would definitely pay for something that solves this") without any past commitment to back it up is a red flag. If they say it happens all the time but have never looked for a solution, they will not buy yours.
- Rule of thumb: If they have not looked for ways of solving it already, they are not going to look for (or buy) yours.
**"How are you dealing with it now?"**
- Verdict: PASS (all 3 rules)
- Why it works: Gives you a price anchor and reveals current workflow. If they are paying 100/month for a duct-tape workaround, you know what ballpark you are playing in. If they spent 120,000 this year on agency fees to maintain a site you are replacing, you should not be positioning yourself as the cheap option.
- Rule of thumb: While it is rare for someone to tell you precisely what they will pay you, they will often show you what it is worth to them.
**"Where does the money come from?"**
- Verdict: PASS (all 3 rules)
- Why it works: In a business context, this is a must-ask. It leads to a conversation about whose budget the purchase will come from and who else within their company holds the power to torpedo the deal. Without this, your future pitches will hit unseen snags.
**"Who else should I talk to?"**
- Verdict: PASS (all 3 rules)
- Why it works: End every conversation with this. If you are onto something interesting and treating people well, your leads will quickly multiply via warm introductions. If someone does not want to make introductions, you have learned something: either you screwed up the meeting (too formal, too pitchy, too clingy) or they do not actually care about the problem you are solving.
- Rule of thumb: People want to help you, but will rarely do so unless you give them an excuse to do so.
**"Is there anything else I should have asked?"**
- Verdict: PASS (all 3 rules)
- Why it works: By the end of the meeting, they understand what you are trying to do. Since you are new to the industry, they will often be sitting there thinking about the most important point while you are asking about something else entirely. This question gives them permission to fix your line of questioning. Use it as a crutch early on, then discard it as you become more skilled.
## Rules of Thumb (Quick Reference)
1. Opinions are worthless
2. Anything involving the future is an over-optimistic lie
3. People will lie to you if they think it is what you want to hear
4. People know what their problems are, but they do not know how to solve them
5. If they have not looked for ways of solving it already, they are not going to look for (or buy) yours
6. You are shooting blind until you understand their goals
7. Some problems do not actually matter
8. Watching someone do a task will show you where the real problems are
9. While it is rare for someone to tell you precisely what they will pay, they will often show you what it is worth to them
10. People want to help you, but will rarely do so unless you give them an excuse
## Fluff-Inducing Questions to Avoid
These question patterns almost always produce vague, non-specific feedback (fluff) rather than concrete facts:
- "Do you ever...?"
- "Would you ever...?"
- "What do you usually...?"
- "Do you think you...?"
- "Might you...?"
- "Could you see yourself...?"
Replace with past-tense, specific alternatives:
- "Do you ever..." → "When was the last time you...?"
- "Would you ever..." → "Have you ever tried to...?"
- "What do you usually..." → "What did you do last time...?"