@clawhub-tangweigang-jpg-8679fec286
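Self-hosted cryptocurrency portfolio tracking: automatically aggregates assets across multiple exchanges and on-chain wallets, computes position profit and loss in real time, and generates tax reports.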
---
name: rotki-crypto-tracker
description: |-
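  Self-hosted cryptocurrency portfolio tracking: automatically aggregates assets across multiple exchanges and on-chain wallets, computes position profit and loss in real time, and generates tax reports.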
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
  version: "v6.1"
  blueprint_id: "finance-bp-095"
  compiled_at: "2026-04-22T13:00:41.524812+00:00"
  capability_markets: "crypto"
  capability_activities: "crypto-trading"
  sop_version: "crystal-compilation-v6.1"
---
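# Rotki Crypto Tracker (rotki-crypto-tracker)
> Self-hosted cryptocurrency portfolio tracking: automatically aggregates assets across multiple exchanges and on-chain wallets, computes position profit and loss in real time, and generates tax reports.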
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (1 total)
### Sphinx Documentation Configuration (`UC-101`)
Configure Sphinx documentation builder settings for the rotki project including version, author, and extension modules
**Triggers**: documentation, sphinx, configuration
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
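As an illustration of how `SL-03` and `SL-05` could be enforced before any signal reaches the trading stage, here is a minimal sketch. The regular expression and the `SignalSketch` class are assumptions made for this example (they are not ZVT's `TradingSignal` or its ID parser), and the allowed characters in the code part may need widening for some markets.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape for SL-03: entity_type_exchange_code, e.g. stock_sh_600000.
ENTITY_ID_RE = re.compile(r"[a-z]+_[a-z]+_[A-Za-z0-9.]+")


def is_valid_entity_id(entity_id: str) -> bool:
    """Return True when the ID has the three underscore-separated parts."""
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


@dataclass
class SignalSketch:
    """Hypothetical stand-in for a trading signal, used only to show SL-05."""

    entity_id: str
    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        if not is_valid_entity_id(self.entity_id):
            raise ValueError(f"SL-03 violated: bad entity id {self.entity_id!r}")
        sizing = (self.position_pct, self.order_money, self.order_amount)
        # SL-05: exactly one sizing field may be set.
        if sum(value is not None for value in sizing) != 1:
            raise ValueError("SL-05 violated: set exactly one sizing field")
```

Rejecting the object at construction time means a malformed signal halts the run early instead of surfacing later as an ambiguous order size.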
## Top Anti-Patterns (13 total)
- **`AP-CRYPTO-TRADING-001`**: Float Arithmetic for Monetary Values
- **`AP-CRYPTO-TRADING-002`**: Missing Market Initialization Before Access
- **`AP-CRYPTO-TRADING-003`**: Bypassing API Facade Layer
All 13 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-095. Evidence verify ratio = 47.0% and audit fail total = 36. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative data (source of truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 13 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-095` blueprint at 2026-04-22T13:00:41.524812+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Sphinx Documentation Configuration
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **13**
## ccxt (1)
### `AP-CRYPTO-TRADING-002` — Missing Market Initialization Before Access <sub>(high)</sub>
Attempting to access market data via symbol lookups before load_markets() is called leaves self.markets empty, causing KeyError or BadSymbol exceptions on all trading operations and data retrieval. This breaks the entire trading workflow at the first market interaction.
## cryptofeed (3)
### `AP-CRYPTO-TRADING-009` — Applying Order Book Deltas Before Snapshot <sub>(high)</sub>
Processing order book delta messages before receiving a snapshot for the symbol applies updates to an uninitialized or stale book state. Price levels are incorrectly added/removed, corrupting the local book representation with no way to recover without full reset.
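A minimal sketch of the snapshot-first rule, assuming a hypothetical one-sided `BookSketch` class (this is not cryptofeed's implementation). It simply refuses deltas until a snapshot has arrived; real feeds often buffer early deltas and reconcile them by sequence number instead.

```python
from decimal import Decimal
from typing import Dict


class BookSketch:
    """Hypothetical local book holding one side as price -> size."""

    def __init__(self) -> None:
        self.levels: Dict[Decimal, Decimal] = {}
        self.has_snapshot = False

    def on_snapshot(self, levels: Dict[Decimal, Decimal]) -> None:
        """Replace the whole book state and mark it usable."""
        self.levels = dict(levels)
        self.has_snapshot = True

    def on_delta(self, price: Decimal, size: Decimal) -> bool:
        """Apply one level update; return False if no snapshot exists yet."""
        if not self.has_snapshot:
            return False  # never mutate an uninitialized book
        if size == 0:
            self.levels.pop(price, None)  # size 0 removes the level
        else:
            self.levels[price] = size
        return True
```

A caller that receives `False` would typically request a fresh snapshot rather than keep feeding deltas.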
### `AP-CRYPTO-TRADING-010` — Silent HTTP Error Handling <sub>(medium)</sub>
Ignoring non-200 HTTP response status codes without raising exceptions causes silent failures for data requests. Market data is missing or corrupted, failed requests are not retried, and downstream consumers receive incomplete data with no indication of failure.
### `AP-CRYPTO-TRADING-011` — Missing Sequence Number Validation <sub>(medium)</sub>
Not validating that order book sequence numbers increment by exactly 1 allows out-of-order or missing messages to corrupt local book state. Stale or incorrect price levels persist in the book, leading to wrong trading signals and corrupted market depth data.
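The check described above can be sketched as follows. `SequenceChecker` and `SequenceGap` are names invented for this example, and the strict "+1" rule follows the description here; individual exchanges may number messages differently.

```python
from typing import Optional


class SequenceGap(Exception):
    """Raised when a message arrives out of order or one was missed."""


class SequenceChecker:
    """Hypothetical per-symbol guard for strictly incrementing sequences."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None

    def check(self, seq: int) -> None:
        """Accept seq only if it follows the previous one by exactly 1."""
        if self.last_seq is not None and seq != self.last_seq + 1:
            expected = self.last_seq + 1
            self.last_seq = None  # forget state so the caller must resync
            raise SequenceGap(f"expected {expected}, got {seq}")
        self.last_seq = seq
```

On `SequenceGap` the local book should be discarded and rebuilt from a new snapshot, since its state can no longer be trusted.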
## hummingbot (5)
### `AP-CRYPTO-TRADING-005` — Unvalidated Collateral for Order Execution <sub>(high)</sub>
Submitting orders without checking collateral requirements including order cost, percent fees, and fixed fees against available balance causes orders to exceed margin. This triggers immediate liquidation or forced position closure at unfavorable prices with partial or total loss of collateral.
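A minimal sketch of the pre-submission check, assuming a simple spot-style cost model with no leverage. The function names and parameters are illustrative and are not hummingbot's budget checker API.

```python
from decimal import Decimal


def required_collateral(
    amount: Decimal, price: Decimal, percent_fee: Decimal, fixed_fee: Decimal
) -> Decimal:
    """Order cost plus percentage fee plus fixed fee, all in quote currency."""
    order_cost = amount * price
    return order_cost + order_cost * percent_fee + fixed_fee


def can_submit(
    available: Decimal,
    amount: Decimal,
    price: Decimal,
    percent_fee: Decimal,
    fixed_fee: Decimal,
) -> bool:
    """True only when the available balance covers cost and both fee types."""
    return required_collateral(amount, price, percent_fee, fixed_fee) <= available
```

For example, 2 units at 100 with a 0.1% fee and a fixed fee of 1 needs 201.2 of collateral, so a balance of exactly 200 is not enough.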
### `AP-CRYPTO-TRADING-006` — Close Order Placed Before Open Order Fills <sub>(high)</sub>
Placing a close order before verifying the open order is fully filled causes mismatched position sizes. The executor attempts to close a larger or smaller position than actually exists, leading to unintended directional exposure and potential losses exceeding the configured risk parameters.
### `AP-CRYPTO-TRADING-007` — Arbitrage Across Non-Interchangeable Tokens <sub>(high)</sub>
Executing arbitrage trades between tokens that appear similar but are not interchangeable causes permanent loss of funds. The received tokens cannot be used to close the opposing position, stranding capital and creating one-sided exposure with no recovery path.
### `AP-CRYPTO-TRADING-008` — Skipping Triple Barrier Evaluations <sub>(high)</sub>
Omitting the control_stop_loss, control_take_profit, or control_time_limit call from the control_barriers cycle leaves positions unprotected. The barrier checks never trigger, so positions stay open beyond the configured risk tolerance and losses exceed the configured thresholds.
### `AP-CRYPTO-TRADING-012` — Wrong Position Key for Perpetual Modes <sub>(medium)</sub>
Using only trading_pair as the position key in HEDGE mode causes the long and short sides of the same pair to collide and overwrite each other. Position tracking becomes incorrect, leading to wrong order matching and potential financial loss when the system misidentifies position direction.
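One way to avoid the collision is to include the side in the key whenever the account is in hedge mode. The enums and the key format below are assumptions for this sketch, not hummingbot's own types.

```python
from enum import Enum


class PositionMode(Enum):
    ONEWAY = "ONEWAY"
    HEDGE = "HEDGE"


class PositionSide(Enum):
    LONG = "LONG"
    SHORT = "SHORT"


def position_key(trading_pair: str, mode: PositionMode, side: PositionSide) -> str:
    """Build a dictionary key that stays unique per side in HEDGE mode."""
    if mode is PositionMode.HEDGE:
        return f"{trading_pair}:{side.name}"
    # In one-way mode a pair holds a single net position.
    return trading_pair
```

With this key, `BTC-USDT` long and short positions occupy separate entries in hedge mode while one-way mode keeps a single entry per pair.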
## rotki (3)
### `AP-CRYPTO-TRADING-003` — Bypassing API Facade Layer <sub>(high)</sub>
Directly accessing internal service methods without routing through the RestAPI facade bypasses authentication, task tracking, and error handling mechanisms. Anonymous requests can execute privileged operations, creating critical security vulnerabilities where unauthorized users access sensitive financial data or execute trades.
### `AP-CRYPTO-TRADING-004` — Non-Checksummed EVM Addresses <sub>(high)</sub>
Passing lowercase or mixed-case Ethereum addresses to RPC nodes causes InvalidAddress exceptions since nodes enforce EIP-55 checksum format. This results in RemoteError failures that halt all blockchain data collection for the affected chain, with no graceful degradation or fallback.
### `AP-CRYPTO-TRADING-013` — Overwriting User-Customized Event Classifications <sub>(medium)</sub>
Re-decoding operations silently replace user-modified events marked as CUSTOMIZED without explicit user action. User edits to event classifications are permanently lost, causing incorrect accounting treatment and potential tax reporting errors that may not be detected until audit.
## rotki, hummingbot, cryptofeed, ccxt (1)
### `AP-CRYPTO-TRADING-001` — Float Arithmetic for Monetary Values <sub>(high)</sub>
Using Python float type instead of Decimal for price, amount, balance, PnL, and other financial calculations causes precision errors due to binary floating-point representation. Rounding errors compound across multiple calculations, leading to incorrect order sizing, wrong profit/loss reporting, and potentially incorrect trading decisions or tax calculations.
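The precision problem is easy to reproduce with the standard library alone. The short example below contrasts `float` with `decimal.Decimal` and also shows why `Decimal` values should be built from strings rather than from floats.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)  # False

# Decimal built from strings keeps the intended decimal digits.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True

# Building a Decimal from a float inherits the float's error.
print(Decimal(0.1) == Decimal("0.1"))  # False
```

In practice this means parsing exchange payloads as strings and converting them straight to `Decimal`, without passing through `float` on the way.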
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-095--rotki
**Scan date**: 2026-04-22
**Stats**: {'total_files': 8, 'total_classes': 31, 'total_functions': 0, 'total_stages': 8}
## Modules (8)
- [rest_api_layer](components/rest_api_layer.md): 5 classes
- [blockchain_data_collection](components/blockchain_data_collection.md): 5 classes
- [transaction_decoding](components/transaction_decoding.md): 4 classes
- [exchange_integration](components/exchange_integration.md): 3 classes
- [history_event_management](components/history_event_management.md): 3 classes
- [accounting_&_pnl_calculation](components/accounting_-_pnl_calculation.md): 5 classes
- [database_layer](components/database_layer.md): 3 classes
- [asset_registry_&_resolution](components/asset_registry_-_resolution.md): 3 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
  business_decisions_count: 170
  fatal_constraints_count: 59
  non_fatal_constraints_count: 305
  use_cases_count: 1
  semantic_locks_count: 12
  preconditions_count: 4
  evidence_quality_rules_count: 2
  traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
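To make `SL-02` and `SL-06` concrete, the sketch below maps a `filter_result` column to actions and then delays every action by one bar. It uses plain lists instead of a DataFrame so it stays self-contained; the function names are hypothetical and this is not how ZVT implements the shift internally.

```python
import math
from typing import List, Optional, Sequence


def to_action(flag: Optional[float]) -> str:
    """SL-06: True means BUY, False means SELL, None/NaN means NO_ACTION."""
    if flag is None or (isinstance(flag, float) and math.isnan(flag)):
        return "NO_ACTION"
    return "BUY" if flag else "SELL"


def next_bar_actions(filter_result: Sequence[Optional[float]]) -> List[str]:
    """SL-02: a signal computed on bar t is executed on bar t + 1."""
    if not filter_result:
        return []
    actions = [to_action(flag) for flag in filter_result]
    # The first bar has no earlier signal; the last signal has no bar yet.
    return ["NO_ACTION"] + actions[:-1]
```

Shifting by one bar is what prevents look-ahead: the decision for a bar only ever depends on data that was complete before that bar opened.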
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **1**
## `KUC-101`
**Source**: `docs/conf.py`
Configure Sphinx documentation builder settings for the rotki project including version, author, and extension modules.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-CRYPTO-TRADING-001` — Decimal Type for All Monetary Values
**From**: rotki, hummingbot, cryptofeed, ccxt · **Applicable to**: crypto-trading
All four projects mandate Decimal type for price, amount, balance, quantity, and PnL fields. Float arithmetic causes rounding errors that compound across financial calculations, leading to incorrect order sizing and reporting. Always use Decimal for any value representing money in crypto trading systems.
## `CW-CRYPTO-TRADING-002` — Initialize Data Structures Before Access
**From**: ccxt, cryptofeed, rotki · **Applicable to**: crypto-trading
Projects consistently require explicit initialization before data access: load_markets() before symbol lookups, check symbol population before mapping access, establish RPC connections before queries. Skipping initialization causes KeyError, AttributeError, or silent data corruption that breaks downstream operations.
## `CW-CRYPTO-TRADING-003` — Precise String Arithmetic for Financial Calculations
**From**: ccxt · **Applicable to**: crypto-trading
CCXT mandates Precise.string_* static methods (string_mul, string_div, string_add, string_sub) for monetary calculations to avoid floating-point precision errors. This is especially critical for high-precision exchange data where rounding errors cause incorrect order costs, fees, and balances that may result in financial loss.
## `CW-CRYPTO-TRADING-004` — Respect Exchange Rate Limits
**From**: ccxt · **Applicable to**: crypto-trading
Disabling rate limiting via enableRateLimit=False causes HTTP 429 responses and potential temporary or permanent API key suspension by exchanges. CCXT enforces rate limits per IP/API key pair, and bypassing throttle() gates results in compliance violations that disrupt all trading activity until exchanges lift bans.
## `CW-CRYPTO-TRADING-005` — Inverse Contract Price Adjustment
**From**: ccxt, hummingbot · **Applicable to**: crypto-trading
Perpetual swap cost calculations require applying inverse price adjustment (1/price) before multiplying by contractSize for inverse contracts. Incorrect cost calculation causes wrong position sizing, leading to unexpected liquidation or insufficient margin for perpetual trading positions.
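The difference between the two contract types comes down to one line of arithmetic. The helper below is a sketch with an invented name (`contract_cost`) and assumes cost is quoted in the settlement currency: quote currency for linear contracts, base currency for inverse ones.

```python
from decimal import Decimal


def contract_cost(
    amount: Decimal, price: Decimal, contract_size: Decimal, inverse: bool
) -> Decimal:
    """Cost of `amount` contracts; inverse contracts use 1/price."""
    if price <= 0:
        raise ValueError("price must be positive")
    if inverse:
        # Inverse: amount * contract_size * (1 / price)
        return amount * contract_size / price
    # Linear: amount * contract_size * price
    return amount * contract_size * price
```

As a worked example, 10 inverse contracts of size 100 at a price of 50000 cost 1000 / 50000 = 0.02 in the base currency, whereas 2 linear contracts of size 0.01 at the same price cost 1000 in the quote currency.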
## `CW-CRYPTO-TRADING-006` — Strict Connection Lifecycle Ordering
**From**: cryptofeed, ccxt · **Applicable to**: crypto-trading
Both projects enforce strict execution order for connection operations: cryptofeed requires authenticate -> subscribe -> message handler sequence, while ccxt mandates connect -> on_connected_callback -> subscriptions -> on_close_callback. Out-of-order operations cause subscription failures and no data flow through connections.
## `CW-CRYPTO-TRADING-007` — Validate Input Data Structure Before Processing
**From**: rotki, cryptofeed · **Applicable to**: crypto-trading
Rotki validates EVM address checksum format before RPC calls; cryptofeed checks Symbols.populated() before symbol mapping access. Validating data structure before processing prevents downstream crashes (KeyError, InvalidAddress) and data corruption that is harder to debug when symptoms appear in unrelated code paths.
## `CW-CRYPTO-TRADING-008` — Validate Order Sizes Against Exchange Minimums
**From**: hummingbot · **Applicable to**: crypto-trading
DCAExecutor amounts must be validated against min_notional_size and amounts_quote/prices against min_order_size before execution. Orders below exchange minimums are rejected, breaking strategy execution and potentially leaving positions partially unfilled at unfavorable prices.
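A minimal sketch of that validation, assuming each level is given as a quote amount and a price. The function name and the error tuples are illustrative and do not mirror hummingbot's `DCAExecutor` code.

```python
from decimal import Decimal
from typing import List, Sequence, Tuple


def validate_levels(
    amounts_quote: Sequence[Decimal],
    prices: Sequence[Decimal],
    min_notional_size: Decimal,
    min_order_size: Decimal,
) -> List[Tuple[int, str]]:
    """Return (level index, reason) for every level below an exchange minimum."""
    problems: List[Tuple[int, str]] = []
    for index, (quote, price) in enumerate(zip(amounts_quote, prices)):
        if quote < min_notional_size:
            problems.append((index, "below min_notional_size"))
        elif quote / price < min_order_size:  # base amount = quote / price
            problems.append((index, "below min_order_size"))
    return problems
```

Running the check over all levels before the first order is placed avoids a partially filled ladder when only the later levels are too small.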
FILE:references/components/accounting_-_pnl_calculation.md
# accounting_&_pnl_calculation (5 classes)
## `Accountant.process_history`
`accounting_&_pnl_calculation/accountant-process-history.py:0`
## `CostBasisCalculator.get_cost_basis`
`accounting_&_pnl_calculation/costbasiscalculator-get-cost-basis.py:0`
## `AccountingPot.reset`
`accounting_&_pnl_calculation/accountingpot-reset.py:0`
## `Cost basis calculation method`
`accounting_&_pnl_calculation/cost-basis-calculation-method.py:0`
## `Accounting rules engine`
`accounting_&_pnl_calculation/accounting-rules-engine.py:0`
FILE:references/components/asset_registry_-_resolution.md
# asset_registry_&_resolution (3 classes)
## `Asset.resolve`
`asset_registry_&_resolution/asset-resolve.py:0`
## `Asset.__init__`
`asset_registry_&_resolution/asset-init.py:0`
## `Oracle mappings`
`asset_registry_&_resolution/oracle-mappings.py:0`
FILE:references/components/blockchain_data_collection.md
# blockchain_data_collection (5 classes)
## `ChainsAggregator.get_history`
`blockchain_data_collection/chainsaggregator-get-history.py:0`
## `EvmNodeInquirer.query`
`blockchain_data_collection/evmnodeinquirer-query.py:0`
## `TaskManager._maybe_schedule`
`blockchain_data_collection/taskmanager-maybe-schedule.py:0`
## `Chain-specific inquirers`
`blockchain_data_collection/chain-specific-inquirers.py:0`
## `Indexer providers`
`blockchain_data_collection/indexer-providers.py:0`
FILE:references/components/database_layer.md
# database_layer (3 classes)
## `DBHandler.write`
`database_layer/dbhandler-write.py:0`
## `DBHandler.transient_write`
`database_layer/dbhandler-transient-write.py:0`
## `Database driver`
`database_layer/database-driver.py:0`
FILE:references/components/exchange_integration.md
# exchange_integration (3 classes)
## `ExchangeManager.get_exchange`
`exchange_integration/exchangemanager-get-exchange.py:0`
## `ExchangeInterface.query_trades`
`exchange_integration/exchangeinterface-query-trades.py:0`
## `Exchange implementations`
`exchange_integration/exchange-implementations.py:0`
FILE:references/components/history_event_management.md
# history_event_management (3 classes)
## `DBHistoryEvents.get_history_events`
`history_event_management/dbhistoryevents-get-history-events.py:0`
## `HistoryBaseEntry.get_event_direction`
`history_event_management/historybaseentry-get-event-direction.py:0`
## `Event type mappings`
`history_event_management/event-type-mappings.py:0`
FILE:references/components/rest_api_layer.md
# rest_api_layer (5 classes)
## `Rotkehlchen.query_async`
`rest_api_layer/rotkehlchen-query-async.py:0`
## `APIServer.start`
`rest_api_layer/apiserver-start.py:0`
## `HistoryService.process_history`
`rest_api_layer/historyservice-process-history.py:0`
## `Exchange connectors`
`rest_api_layer/exchange-connectors.py:0`
## `Price oracles`
`rest_api_layer/price-oracles.py:0`
FILE:references/components/transaction_decoding.md
# transaction_decoding (4 classes)
## `EVMTransactionDecoder.decode_transactions`
`transaction_decoding/evmtransactiondecoder-decode-transaction.py:0`
## `EvmDecoderInterface.decode_event`
`transaction_decoding/evmdecoderinterface-decode-event.py:0`
## `Protocol decoders`
`transaction_decoding/protocol-decoders.py:0`
## `Decoding rule priority`
`transaction_decoding/decoding-rule-priority.py:0`
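Automated portfolio rebalancing and trade execution that follows the sell-before-buy principle, supports multi-market asset allocation, and computes minimum trade sizes, taxes, and fees.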
---
name: robo-advisor-python
description: |-
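  Automated portfolio rebalancing and trade execution that follows the sell-before-buy principle, supports multi-market asset allocation, and computes minimum trade sizes, taxes, and fees.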
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
  version: "v6.1"
  blueprint_id: "finance-bp-066"
  compiled_at: "2026-04-22T13:00:21.762032+00:00"
  capability_markets: "multi-market"
  capability_activities: "portfolio-analytics"
  sop_version: "crystal-compilation-v6.1"
---
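# Robo-Advisor (robo-advisor-python)
> Automated portfolio rebalancing and trade execution that follows the sell-before-buy principle, supports multi-market asset allocation, and computes minimum trade sizes, taxes, and fees.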
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (0 total)
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
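`SL-01` matters for rebalancing because the proceeds of sells usually fund the buys. The sketch below, with a hypothetical `run_cycle` function and orders modeled as plain dictionaries, processes all sells first and only then attempts buys against the updated cash balance. Fees and partial fills are ignored for brevity.

```python
from decimal import Decimal
from typing import Dict, List, Tuple


def run_cycle(cash: Decimal, orders: List[Dict]) -> Tuple[Decimal, List[str]]:
    """Execute sells before buys; skip buys that the cash cannot cover."""
    sells = [order for order in orders if order["side"] == "sell"]
    buys = [order for order in orders if order["side"] == "buy"]
    executed: List[str] = []
    for order in sells:
        cash += order["money"]  # proceeds become available immediately
        executed.append(order["id"])
    for order in buys:
        if order["money"] <= cash:
            cash -= order["money"]
            executed.append(order["id"])
    return cash, executed
```

With 100 in cash, a buy of 500 listed before a sell of 450 would fail if processed in list order, but succeeds here because the sell is handled first.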
## Top Anti-Patterns (14 total)
- **`AP-PORTFOLIO-ANALYTICS-001`**: Division by zero in price ratio calculations corrupts rebalancing
- **`AP-PORTFOLIO-ANALYTICS-002`**: Look-ahead bias from unshifted signal generation and position calculations
- **`AP-PORTFOLIO-ANALYTICS-003`**: Non-positive-semidefinite covariance matrix breaks CVXPY optimization
All 14 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-066. Evidence verify ratio = 72.7% and audit fail total = 20. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative data (source of truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 14 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-066` blueprint at 2026-04-22T13:00:21.762032+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
  - Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **14**
## finance-bp-066--wealthbot (2)
### `AP-PORTFOLIO-ANALYTICS-001` — Division by zero in price ratio calculations corrupts rebalancing <sub>(high)</sub>
When calculating price_diff using current_price divided by old_price without validating old_price is non-zero, the result is NaN or INF. This corrupts portfolio rebalancing calculations in wealthbot, causing incorrect buy/sell decisions based on invalid prices_diff values. The same issue appears in getPricesDiff() where divide-by-zero when old_price equals zero produces NaN/infinity that propagates to all subsequent trade decisions.
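A guard for this case can be very small. The sketch below uses a hypothetical `price_ratio` helper that returns `None` when the old price is missing or non-positive, so the caller has to decide explicitly what to do instead of propagating an invalid number.

```python
from decimal import Decimal
from typing import Optional


def price_ratio(current_price: Decimal, old_price: Optional[Decimal]) -> Optional[Decimal]:
    """Return current/old, or None when old_price cannot be a valid divisor."""
    if old_price is None or old_price <= 0:
        return None
    return current_price / old_price
```

A rebalancer would then skip, or flag for review, any security whose ratio is `None` rather than trade on it.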
### `AP-PORTFOLIO-ANALYTICS-004` — Incorrect portfolio value tracking destroys time-series integrity <sub>(high)</sub>
Updating existing ClientPortfolioValue records instead of creating new ones destroys the time-series integrity needed for billing calculations and historical reconciliation. This creates data corruption where billing calculations and historical reporting against custodian records will fail to match. Portfolio value records must be linked to parent ClientPortfolio via proper relationships to avoid orphaned records.
## finance-bp-068--xalpha (1)
### `AP-PORTFOLIO-ANALYTICS-006` — FIFO sell order violation corrupts cost basis and XIRR <sub>(high)</sub>
Processing positions out of chronological order in FIFO sell operations causes incorrect cost basis assignment, leading to inaccurate realized gains/losses and wrong XIRR calculation. Chinese funds have tiered redemption fees based on holding periods, so FIFO violations result in incorrect holding period calculation and wrong redemption fee being applied, causing direct financial loss.
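The FIFO rule can be sketched as follows. `fifo_sell` is a hypothetical helper (not xalpha's API) that sorts lots by purchase date, consumes the oldest first, and reports the holding period of each consumed slice so that a tiered redemption fee could be looked up per slice.

```python
from datetime import date
from decimal import Decimal
from typing import List, Tuple

Lot = Tuple[date, Decimal, Decimal]  # (buy_date, shares, unit_cost)


def fifo_sell(lots: List[Lot], shares_to_sell: Decimal, sell_date: date):
    """Consume the oldest lots first.

    Returns (consumed, remaining) where consumed holds
    (shares, unit_cost, holding_days) and remaining holds the leftover lots.
    """
    consumed = []
    remaining = []
    to_sell = shares_to_sell
    for buy_date, shares, unit_cost in sorted(lots, key=lambda lot: lot[0]):
        take = min(shares, to_sell)
        if take > 0:
            consumed.append((take, unit_cost, (sell_date - buy_date).days))
            to_sell -= take
        if shares > take:
            remaining.append((buy_date, shares - take, unit_cost))
    if to_sell > 0:
        raise ValueError("cannot sell more shares than are held")
    return consumed, remaining
```

Because the sort happens inside the function, the result does not depend on the order in which lots were recorded.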
## finance-bp-068--xalpha, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib (1)
### `AP-PORTFOLIO-ANALYTICS-010` — Missing DataFrame schema validation causes KeyError propagation <sub>(medium)</sub>
Passing non-DataFrame objects (numpy arrays, lists) where DataFrame is expected causes NameError, AttributeError, or TypeError in downstream pandas operations. xalpha's fundinfo.price requires specific columns (date, netvalue, totvalue, comment), PyPortfolioOpt and Riskfolio-Lib require index alignment between expected returns and covariance matrix. Missing columns cause backtest calculations to fail with NaN values or KeyError.
## finance-bp-082--stock-screener (1)
### `AP-PORTFOLIO-ANALYTICS-007` — Score validation bypass allows invalid composite calculations <sub>(medium)</sub>
Accepting scores outside the 0-100 range in screener results corrupts ranking and rating logic, causing unpredictable screening results that violate the fundamental score contract. When combined with division-by-zero guards that return 0.0 for empty screener lists, this creates unpredictable behavior where invalid scores produce wrong composite calculations and incorrect Strong Buy/Buy/Watch/Pass ratings.
## finance-bp-093--PyPortfolioOpt (1)
### `AP-PORTFOLIO-ANALYTICS-008` — Convex optimization constraints violate DCP rules <sub>(high)</sub>
Using non-convex objectives or DCP-violating expressions in CVXPY optimization causes DCPError, completely preventing portfolio optimization from running. Similarly, providing non-callable constraints or invalid bounds formats (not matching n_assets length) causes TypeError. Feasibility violations like setting target_volatility below global minimum or target_return above maximum achievable return make problems infeasible.
## finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib (1)
### `AP-PORTFOLIO-ANALYTICS-003` — Non-positive-semidefinite covariance matrix breaks CVXPY optimization <sub>(high)</sub>
Passing a non-positive-semidefinite covariance matrix to CVXPY optimization with assume_PSD=True produces incorrect results because the solver assumes validity without verification. This causes Cholesky decomposition to fail or produce garbage weights, preventing portfolio optimization from running entirely. Riskfolio-Lib and PyPortfolioOpt both require explicit PSD validation before optimization.
## finance-bp-106--pyfolio-reloaded (2)
### `AP-PORTFOLIO-ANALYTICS-005` — Allocation denominator excludes cash, corrupting portfolio composition <sub>(medium)</sub>
When computing allocation percentages excluding cash from the denominator, portfolio allocation percentages will not sum to 100%, misrepresenting the portfolio's actual composition. Additionally, concentration metrics become artificially skewed when including cash (a non-position asset), producing misleading diversification assessments that could lead to inappropriate risk management decisions.
### `AP-PORTFOLIO-ANALYTICS-009` — Transaction data corruption from missing columns and invalid dates <sub>(medium)</sub>
Extracting round trips from transactions DataFrame without validating required columns (amount, price, symbol) causes KeyError exceptions. When open_dt is not strictly less than close_dt, negative or zero duration values indicate data corruption causing incorrect holding period statistics. Similarly, non-normalized transaction timestamps cause intra-day trades to be incorrectly split across days.
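A minimal sketch of the two checks described above, using plain dicts in place of DataFrame rows. The helper names are hypothetical; only the required column names (amount, price, symbol) and the `open_dt < close_dt` rule come from the text.

```python
from datetime import datetime

REQUIRED_COLUMNS = {"amount", "price", "symbol"}


def validate_transactions(rows: list[dict]) -> None:
    """Raise ValueError naming the missing columns instead of failing later with KeyError."""
    for i, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - row.keys()
        if missing:
            raise ValueError(f"transaction {i} is missing columns: {sorted(missing)}")


def validate_round_trip(open_dt: datetime, close_dt: datetime) -> None:
    """A round trip must open strictly before it closes."""
    if not open_dt < close_dt:
        raise ValueError(f"open_dt {open_dt} must be strictly before close_dt {close_dt}")


validate_transactions([{"amount": 10, "price": 2.5, "symbol": "AAA"}])
validate_round_trip(datetime(2024, 1, 2), datetime(2024, 1, 5))
```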
## finance-bp-107--empyrical-reloaded (1)
### `AP-PORTFOLIO-ANALYTICS-011` — Wrong annualization factors distort cross-frequency metric comparison <sub>(high)</sub>
Applying incorrect annualization factors for daily, weekly, monthly, quarterly, or yearly data produces metrics that are not comparable across return frequencies, causing invalid strategy comparisons and misallocated capital. The Sharpe ratio must be annualized with the factor that matches the return frequency and computed with the sample standard deviation (ddof=1); otherwise it yields misleading risk-adjusted return estimates.
## finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit (1)
### `AP-PORTFOLIO-ANALYTICS-012` — Misaligned time series in alpha/beta calculation produces invalid factor analysis <sub>(high)</sub>
Passing returns and factor_returns to alpha_beta functions without verifying data alignment on index labels (pd.Series) or length equality (np.ndarray) produces incorrect alpha/beta values due to correlation computed between mismatched periods. Including benchmark ticker in the asset ticker list causes circular correlation producing meaningless beta values of approximately 1.0.
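The alignment step can be sketched without pandas by keying both series on their date labels and keeping only the labels they share. This is an illustrative helper, not the alpha_beta function of empyrical-reloaded or FinanceToolkit; the name `aligned_beta` and the dict layout are assumptions.

```python
from statistics import mean


def aligned_beta(returns: dict[str, float], factor_returns: dict[str, float]) -> float:
    """Beta of `returns` on `factor_returns`, computed only over shared date labels."""
    common = sorted(returns.keys() & factor_returns.keys())
    if len(common) < 2:
        raise ValueError("need at least two overlapping observations")
    r = [returns[d] for d in common]
    f = [factor_returns[d] for d in common]
    r_bar, f_bar = mean(r), mean(f)
    covariance = sum((a - r_bar) * (b - f_bar) for a, b in zip(r, f))
    variance = sum((b - f_bar) ** 2 for b in f)
    if variance == 0:
        raise ValueError("factor returns have zero variance; beta is undefined")
    return covariance / variance


# Hypothetical series whose date ranges only partly overlap.
asset = {"2024-01-02": 0.02, "2024-01-03": -0.02, "2024-01-04": 0.04, "2024-01-05": 0.50}
factor = {"2024-01-01": 0.30, "2024-01-02": 0.01, "2024-01-03": -0.01, "2024-01-04": 0.02}
beta = aligned_beta(asset, factor)
```

Pairing the two lists positionally would match each asset return with the factor return from a different date; intersecting on labels avoids that.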
## finance-bp-108--finmarketpy (1)
### `AP-PORTFOLIO-ANALYTICS-013` — Forward-filling spot prices creates look-ahead bias in TRI construction <sub>(high)</sub>
Forward-filling spot prices creates look-ahead bias where future prices are used to calculate historical returns, invalidating all TRI-based backtest results. The total return index construction requires multiplicative cumulation using cumprod (not cumsum) with base value 100, as additive cumulation allows negative cumulative returns to break the index chain.
## finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded (1)
### `AP-PORTFOLIO-ANALYTICS-002` — Look-ahead bias from unshifted signal generation and position calculations <sub>(high)</sub>
Generating trading signals from current-period technical indicators (RSI, moving averages) without lagging them by one bar (signal.shift(1) in pandas, or equivalently pairing each signal with next-period returns via returns.shift(-1)) creates look-ahead bias, causing live trading returns to fall far below backtested results. Similarly, when estimating intraday positions from transactions without applying shift(1) to EOD positions, day-start positions are contaminated with end-of-day values, making results unrepresentative of actual trading.
## finance-bp-117--Riskfolio-Lib, finance-bp-093--PyPortfolioOpt (1)
### `AP-PORTFOLIO-ANALYTICS-014` — Unsupported solver selection breaks advanced risk calculations <sub>(medium)</sub>
Selecting a solver that lacks the required cone support (power cone, exponential cone) makes CVXPY raise SolverError, or leaves the optimizer returning None, which breaks the risk calculations. CLARABEL, MOSEK, and SCS support the power cone needed for RLVaR/RLDaR calculations, while CLARABEL, MOSEK, SCS, and ECOS support the exponential cone needed for EVaR calculations. Riskfolio-Lib and PyPortfolioOpt both require careful solver selection.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-066--wealthbot
**Scan date**: 2026-04-22
**Stats**: 5 files, 19 classes, 0 functions, 5 stages
## Modules (5)
- [price_collection](components/price_collection.md): 4 classes
- [portfolio_analysis](components/portfolio_analysis.md): 4 classes
- [trade_decision_engine](components/trade_decision_engine.md): 5 classes
- [trade_execution](components/trade_execution.md): 5 classes
- [portfolio_value_update](components/portfolio_value_update.md): 1 class
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 114
fatal_constraints_count: 28
non_fatal_constraints_count: 165
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
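A minimal sketch of a format check for this lock. The regular expression is an assumption inferred from IDs such as `stock_sh_600000` (lowercase entity type and exchange, then a non-empty code); it is not taken from the ZVT source, and `parse_entity_id` is a hypothetical helper.

```python
import re

# Assumed shape: <entity_type>_<exchange>_<code>, e.g. stock_sh_600000.
ENTITY_ID_PATTERN = re.compile(
    r"^(?P<entity_type>[a-z]+)_(?P<exchange>[a-z]+)_(?P<code>\S+)$"
)


def parse_entity_id(entity_id: str) -> tuple[str, str, str]:
    """Split an entity ID into (entity_type, exchange, code) or raise ValueError."""
    match = ENTITY_ID_PATTERN.match(entity_id)
    if match is None:
        raise ValueError(
            f"invalid entity ID {entity_id!r}; expected entity_type_exchange_code"
        )
    return match["entity_type"], match["exchange"], match["code"]


parts = parse_entity_id("stock_sh_600000")
```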
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
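The "exactly one of" rule can be enforced at construction time. The dataclass below is a standalone illustration and not ZVT's TradingSignal class; the name `SignalSizing` is hypothetical, while the three field names come from the lock itself.

```python
from dataclasses import dataclass
from typing import Optional

SIZING_FIELDS = ("position_pct", "order_money", "order_amount")


@dataclass(frozen=True)
class SignalSizing:
    """Order sizing where exactly one of the three fields may be set."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        provided = [name for name in SIZING_FIELDS if getattr(self, name) is not None]
        if len(provided) != 1:
            raise ValueError(
                f"exactly one of {SIZING_FIELDS} must be set, got {provided or 'none'}"
            )


sizing = SignalSizing(position_pct=0.5)
```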
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: run `python3 -m pip install zvt`, then run `python3 -m zvt.init_dirs` to initialize the data directories.
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: run the recorder first with `python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000` (replace with your target entity IDs).
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: run `python3 -m zvt.init_dirs`.
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: check directory permissions with `chmod u+w ~/.zvt`, or set the `ZVT_HOME` environment variable to a writable location.
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **0**
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-PORTFOLIO-ANALYTICS-001` — Defensive zero-division guards with explicit handling
**From**: finance-bp-066--wealthbot, finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt · **Applicable to**: portfolio-analytics
Always guard division operations with explicit zero-value checks before executing. In price ratio calculations, filter out securities where old_price is zero before calling getPricesDiff. In composite score calculations, guard against total_weight of zero and return 0.0 for empty input lists. This prevents NaN/infinity propagation that corrupts downstream calculations and crashes pipelines.
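A minimal sketch of both guards. The function names are hypothetical and are not wealthbot's getPricesDiff or the stock-screener's scoring code; returning `None` for an undefined ratio is an assumption chosen so callers can filter the security out explicitly.

```python
from typing import Optional, Sequence


def price_ratio(current_price: float, old_price: float) -> Optional[float]:
    """Return current_price / old_price, or None when old_price is zero."""
    if old_price == 0:
        return None  # caller should skip this security rather than trade on NaN/inf
    return current_price / old_price


def composite_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average score; 0.0 for empty input or zero total weight."""
    total_weight = sum(weights)
    if not scores or total_weight == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical (current, old) price pairs; the second one has no usable old price.
prices = {"AAA": (110.0, 100.0), "BBB": (5.0, 0.0)}
ratios = {sym: price_ratio(cur, old) for sym, (cur, old) in prices.items()}
tradable = {sym: r for sym, r in ratios.items() if r is not None}
```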
## `CW-PORTFOLIO-ANALYTICS-002` — Covariance matrix positive-semidefiniteness verification
**From**: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Always verify covariance matrix is positive-semidefinite before passing to CVXPY optimization. Apply eigenvalue clipping if violated, as non-PSD matrices cause Cholesky decomposition failures. Both PyPortfolioOpt and Riskfolio-Lib enforce this constraint to prevent optimizer from finding mathematically invalid solutions or crashing entirely.
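One way to run such a check before optimization is to attempt a Cholesky factorization, sketched below in pure Python. Note the assumptions: this tests strict positive-definiteness (a slightly stronger condition than PSD), it does not perform the eigenvalue clipping mentioned above, and `is_positive_definite` is a hypothetical helper rather than a PyPortfolioOpt or Riskfolio-Lib function.

```python
import math


def is_positive_definite(matrix: list[list[float]], tol: float = 1e-12) -> bool:
    """True if `matrix` is symmetric and its Cholesky factorization succeeds."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    if any(abs(matrix[i][j] - matrix[j][i]) > 1e-9 for i in range(n) for j in range(i)):
        return False
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False  # non-positive pivot: factorization breaks down
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


valid_cov = [[1.0, 0.5], [0.5, 1.0]]
invalid_cov = [[1.0, 2.0], [2.0, 1.0]]  # implied correlation of 2.0 is impossible
```

A matrix that fails this check would need repair (for example the eigenvalue clipping the text recommends) before being handed to the optimizer.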
## `CW-PORTFOLIO-ANALYTICS-003` — Geometric compounding for cumulative returns
**From**: finance-bp-068--xalpha, finance-bp-106--pyfolio-reloaded, finance-bp-107--empyrical-reloaded · **Applicable to**: portfolio-analytics
Compute cumulative returns using geometric compounding via cumprod(1 + returns), never arithmetic cumulation via cumsum. An arithmetic cumulative sum ignores compounding (for example, +10% followed by -10% sums to 0% but compounds to -1%), so cumulative returns diverge significantly from actual portfolio performance over volatile periods. This principle applies to total return index construction and any cumulative performance calculation.
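A minimal standard-library sketch of the same calculation, equivalent in spirit to `(1 + returns).cumprod() - 1` on a pandas Series. The function name and sample returns are illustrative only.

```python
import operator
from itertools import accumulate


def cumulative_returns(returns: list[float]) -> list[float]:
    """Compounded cumulative return after each period."""
    growth = accumulate((1.0 + r for r in returns), operator.mul)
    return [g - 1.0 for g in growth]


period_returns = [0.10, -0.10, 0.05]
compounded = cumulative_returns(period_returns)
arithmetic_total = sum(period_returns)  # what cumsum would report at the end
```

Here the compounded path ends at 1.1 × 0.9 × 1.05 − 1 = 3.95%, while the arithmetic sum reports 5%.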
## `CW-PORTFOLIO-ANALYTICS-004` — Temporal shift enforcement to prevent look-ahead bias
**From**: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded · **Applicable to**: portfolio-analytics
Enforce proper temporal shifting in signal generation and position calculations. Lag signals by one bar before applying them (in pandas, shift(1) on the signal; a shift(-1) belongs only on the return series being aligned to the signal), and apply shift(1) when estimating intraday positions from EOD data. Forward-fill carry data and backward-fill only old data gaps, never forward-fill spot prices. Violations cause live trading returns to diverge from backtested results.
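A minimal sketch of the one-bar lag behind next-bar execution, written with plain lists. The `lag` helper is hypothetical and mimics what a pandas `shift(1)` does to a column: each value moves one row later and the first row has no position.

```python
from typing import Optional, Sequence


def lag(values: Sequence, periods: int = 1) -> list:
    """Move values `periods` rows later, padding the start with None."""
    if periods < 0:
        raise ValueError("a negative lag would read future rows")
    n = len(values)
    k = min(periods, n)
    return [None] * k + list(values[: n - k])


def strategy_returns(signals: Sequence[int], asset_returns: Sequence[float]) -> list[float]:
    """Return earned in each bar by the position decided in the previous bar."""
    positions: list[Optional[int]] = lag(signals, 1)
    return [0.0 if p is None else p * r for p, r in zip(positions, asset_returns)]


signals = [1, 0, 1, 1]                     # decided at the close of each bar
asset_returns = [0.01, 0.02, -0.03, 0.04]  # realized over each bar
positions = lag(signals)
pnl = strategy_returns(signals, asset_returns)
```

The signal computed on the first bar only earns the second bar's return, so the first bar's own return is never credited to a decision that could not yet have been made.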
## `CW-PORTFOLIO-ANALYTICS-005` — DCP-compliant convex optimization construction
**From**: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Use only DCP-compliant convex objectives and constraints in CVXPY. Provide constraints as callable functions accepting weight variables, use valid bounds formats matching n_assets length, and verify target parameters (volatility, return) are within feasible ranges. Non-convex or infeasible problems fail with DCPError or OptimizationError, preventing optimization entirely.
## `CW-PORTFOLIO-ANALYTICS-006` — Correct Sharpe ratio formula with risk-free rate subtraction
**From**: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit · **Applicable to**: portfolio-analytics
Calculate Sharpe ratio using (mean returns - risk_free) / std(returns) * sqrt(annualization) with sample standard deviation (ddof=1). Subtract risk-free rate from asset returns before dividing by volatility. Incorrect Sharpe ratio calculation produces misleading risk-adjusted return estimates, causing poor investment decisions based on faulty performance attribution.
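The formula above can be sketched with the standard library, where `statistics.stdev` is the sample standard deviation (ddof=1). The annualization factors in the table are the conventional values (252 trading days, 52 weeks, 12 months, 4 quarters) and are stated here as an assumption; `risk_free` is assumed to be a per-period rate at the same frequency as the returns.

```python
import math
from statistics import mean, stdev

# Assumed conventional periods-per-year factors.
ANNUALIZATION = {"daily": 252, "weekly": 52, "monthly": 12, "quarterly": 4, "yearly": 1}


def sharpe_ratio(returns: list[float], risk_free: float = 0.0, period: str = "daily") -> float:
    """Annualized Sharpe ratio of per-period returns."""
    if period not in ANNUALIZATION:
        raise ValueError(f"unknown period {period!r}; valid options: {sorted(ANNUALIZATION)}")
    if len(returns) < 2:
        raise ValueError("need at least two returns for a sample standard deviation")
    excess = [r - risk_free for r in returns]
    volatility = stdev(excess)  # sample standard deviation, ddof=1
    if volatility == 0:
        return float("nan")
    return mean(excess) / volatility * math.sqrt(ANNUALIZATION[period])


monthly_sharpe = sharpe_ratio([0.01, 0.03, -0.01, 0.02], risk_free=0.002, period="monthly")
```

Subtracting a constant per-period risk-free rate shifts the mean but leaves the standard deviation unchanged, which is why the text can divide by std(returns).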
## `CW-PORTFOLIO-ANALYTICS-007` — Immutable FIFO position tracking with chronological ordering
**From**: finance-bp-068--xalpha, finance-bp-066--wealthbot · **Applicable to**: portfolio-analytics
Maintain FIFO position tracking with strictly increasing date order for position entries. Use copy() function to create independent copies before mutating remtable to avoid side effects. Enforce chronological ordering in sell operations to ensure correct cost basis and holding period calculation, particularly important for funds with tiered fees by holding period.
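A minimal sketch of a FIFO sell that checks lot ordering and works on a copy. The lot layout (dicts with `date`, `shares`, `unit_cost`) and the function name are assumptions for illustration and do not reproduce xalpha's remtable structure.

```python
from copy import deepcopy
from datetime import date


def fifo_sell(lots: list[dict], shares_to_sell: float) -> tuple[list[dict], float]:
    """Sell from the oldest lots first; return (remaining lots, cost basis sold)."""
    dates = [lot["date"] for lot in lots]
    if any(earlier >= later for earlier, later in zip(dates, dates[1:])):
        raise ValueError("lots must be in strictly increasing date order")
    remaining = deepcopy(lots)  # never mutate the caller's table
    cost_basis = 0.0
    to_sell = shares_to_sell
    for lot in remaining:
        if to_sell <= 0:
            break
        take = min(lot["shares"], to_sell)
        cost_basis += take * lot["unit_cost"]
        lot["shares"] -= take
        to_sell -= take
    if to_sell > 0:
        raise ValueError("cannot sell more shares than are held")
    return [lot for lot in remaining if lot["shares"] > 0], cost_basis


lots = [
    {"date": date(2024, 1, 1), "shares": 100.0, "unit_cost": 1.0},
    {"date": date(2024, 2, 1), "shares": 100.0, "unit_cost": 2.0},
]
remaining_lots, basis = fifo_sell(lots, 150.0)
```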
## `CW-PORTFOLIO-ANALYTICS-008` — Validation at system boundaries with descriptive errors
**From**: finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Enforce validation at system boundaries with descriptive error messages. Validate expected returns matches covariance matrix dimensions, score values are within [0, 100], confidence values within [0, 1], and required DataFrame columns are present. Invalid inputs should raise ValueError with descriptive messages listing valid options to prevent silent failures or corrupted calculations.
## `CW-PORTFOLIO-ANALYTICS-009` — Decimal rounding for monetary calculations
**From**: finance-bp-068--xalpha, finance-bp-107--empyrical-reloaded · **Applicable to**: portfolio-analytics
Use Decimal with explicit rounding (myround) for each monetary calculation to avoid floating-point errors that cause share miscalculation and incorrect cost basis. This prevents rounding errors from propagating to XIRR and portfolio valuation calculations. Direct floating-point operations in financial calculations accumulate errors that become material over many transactions.
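A minimal sketch of explicit Decimal rounding. `round_money` is a hypothetical helper and is not xalpha's myround; the half-up rounding mode and the two-decimal default are assumptions.

```python
from decimal import ROUND_HALF_UP, Decimal


def round_money(value, places: int = 2) -> Decimal:
    """Round a monetary value to `places` decimals using half-up rounding."""
    quantum = Decimal(1).scaleb(-places)  # e.g. Decimal("0.01") for places=2
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# Hypothetical purchase: 1000.00 invested at a net value of 1.2345 per share.
shares = round_money(Decimal("1000.00") / Decimal("1.2345"))
cost_check = round_money(shares * Decimal("1.2345"))
```

Passing values through `str` before constructing the Decimal avoids importing the binary representation error of a float literal such as 2.675.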
## `CW-PORTFOLIO-ANALYTICS-010` — Cash flow sign convention enforcement
**From**: finance-bp-106--pyfolio-reloaded, finance-bp-068--xalpha · **Applicable to**: portfolio-analytics
Mark cash outflows as negative and cash inflows as positive in cftable. Incorrect cash flow signs cause NPV calculation to invert, producing negative returns for profitable trades and vice versa. Verify sum of round trip PnLs equals total realized transaction dollars to catch sign convention errors before they corrupt performance attribution.
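The effect of the sign convention can be seen in a small net present value calculation. This is an illustrative sketch with a hypothetical `npv` helper and evenly spaced periods, not xalpha's cftable or XIRR code.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value of evenly spaced cash flows; index 0 is today."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


# Buy for 100 (outflow, negative), sell for 110 one period later (inflow, positive).
correct = npv(0.0, [-100.0, 110.0])
flipped = npv(0.0, [100.0, -110.0])  # same trade with the signs inverted
```

With the signs flipped, the profitable trade shows up as a loss of the same size, which is the inversion the text warns about.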
FILE:references/components/portfolio_analysis.md
# portfolio_analysis (4 classes)
## `CeModel.getModelEntities`
`portfolio_analysis/cemodel-getmodelentities.py:0`
## `Rebalancer.updatePortfolioValues`
`portfolio_analysis/rebalancer-updateportfoliovalues.py:0`
## `Job.getRebalanceType`
`portfolio_analysis/job-getrebalancetype.py:0`
## `rebalance_type`
`portfolio_analysis/rebalance-type.py:0`
FILE:references/components/portfolio_value_update.md
# portfolio_value_update (1 class)
## `Rebalancer.updatePortfolioValues`
`portfolio_value_update/rebalancer-updateportfoliovalues.py:0`
FILE:references/components/price_collection.md
# price_collection (4 classes)
## `Rebalancer.updateSecurities`
`price_collection/rebalancer-updatesecurities.py:0`
## `BaseRebalancer.getPricesDiff`
`price_collection/baserebalancer-getpricesdiff.py:0`
## `Rebalancer.setIsCurrent`
`price_collection/rebalancer-setiscurrent.py:0`
## `price_source`
`price_collection/price-source.py:0`
FILE:references/components/trade_decision_engine.md
# trade_decision_engine (5 classes)
## `Trade.buyOrSell`
`trade_decision_engine/trade-buyorsell.py:0`
## `TradeData.getTradeData`
`trade_decision_engine/tradedata-gettradedata.py:0`
## `deviation_threshold`
`trade_decision_engine/deviation-threshold.py:0`
## `price_threshold_buy`
`trade_decision_engine/price-threshold-buy.py:0`
## `price_threshold_sell`
`trade_decision_engine/price-threshold-sell.py:0`
FILE:references/components/trade_execution.md
# trade_execution (5 classes)
## `Requests.placeOrder`
`trade_execution/requests-placeorder.py:0`
## `Trade.buy`
`trade_execution/trade-buy.py:0`
## `Trade.sell`
`trade_execution/trade-sell.py:0`
## `order_type`
`trade_execution/order-type.py:0`
## `duration`
`trade_execution/duration.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-066-v5.3
version: v6.1
blueprint_id: finance-bp-066
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:21.762032+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- multi-market
activities:
- portfolio-analytics
upgraded_from: finance-bp-066-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:13.424714+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-066--wealthbot/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-066--wealthbot/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-PORTFOLIO-ANALYTICS-001
title: Division by zero in price ratio calculations corrupts rebalancing
description: When calculating price_diff using current_price divided by old_price without validating old_price is non-zero,
the result is NaN or INF. This corrupts portfolio rebalancing calculations in wealthbot, causing incorrect buy/sell decisions
based on invalid prices_diff values. The same issue appears in getPricesDiff() where divide-by-zero when old_price equals
zero produces NaN/infinity that propagates to all subsequent trade decisions.
project_source: finance-bp-066--wealthbot
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-002
title: Look-ahead bias from unshifted signal generation and position calculations
description: Generating trading signals from current-period technical indicators (RSI, moving averages) without lagging
them by one bar (signal.shift(1) in pandas, or equivalently pairing each signal with next-period returns via returns.shift(-1))
creates look-ahead bias, causing live trading returns to fall far below backtested results. Similarly, when estimating
intraday positions from transactions without applying shift(1) to EOD positions, day-start positions are contaminated
with end-of-day values, making results unrepresentative of actual trading.
project_source: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-003
title: Non-positive-semidefinite covariance matrix breaks CVXPY optimization
description: Passing a non-positive-semidefinite covariance matrix to CVXPY optimization with assume_PSD=True produces incorrect
results because the solver assumes validity without verification. Depending on the code path, the Cholesky decomposition
fails outright or the optimizer returns garbage weights. Riskfolio-Lib and PyPortfolioOpt both require explicit PSD validation
before optimization.
project_source: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-004
title: Incorrect portfolio value tracking destroys time-series integrity
description: Updating existing ClientPortfolioValue records instead of creating new ones destroys the time-series integrity
needed for billing calculations and historical reconciliation. This creates data corruption where billing calculations
and historical reporting against custodian records will fail to match. Portfolio value records must be linked to parent
ClientPortfolio via proper relationships to avoid orphaned records.
project_source: finance-bp-066--wealthbot
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-005
title: Allocation denominator excludes cash, corrupting portfolio composition
description: When allocation percentages are computed with cash excluded from the denominator, they no longer sum to 100%
of total portfolio value and misrepresent the portfolio's actual composition. Conversely, concentration metrics become
artificially skewed when cash (a non-position asset) is counted as a position, producing misleading diversification assessments
that could lead to inappropriate risk management decisions.
project_source: finance-bp-106--pyfolio-reloaded
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-006
title: FIFO sell order violation corrupts cost basis and XIRR
description: Processing positions out of chronological order in FIFO sell operations causes incorrect cost basis assignment,
leading to inaccurate realized gains/losses and wrong XIRR calculation. Chinese funds have tiered redemption fees based
on holding periods, so FIFO violations result in incorrect holding period calculation and wrong redemption fee being applied,
causing direct financial loss.
project_source: finance-bp-068--xalpha
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-007
title: Score validation bypass allows invalid composite calculations
description: Accepting scores outside the 0-100 range in screener results corrupts ranking and rating logic and violates
the fundamental score contract. Combined with division-by-zero guards that return 0.0 for empty screener lists, invalid
scores produce wrong composite calculations and incorrect Strong Buy/Buy/Watch/Pass ratings.
project_source: finance-bp-082--stock-screener
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-008
title: Convex optimization constraints violate DCP rules
description: Using non-convex objectives or DCP-violating expressions in CVXPY optimization causes DCPError, completely
preventing portfolio optimization from running. Similarly, providing non-callable constraints or invalid bounds formats
(not matching n_assets length) causes TypeError. Feasibility violations like setting target_volatility below global minimum
or target_return above maximum achievable return make problems infeasible.
project_source: finance-bp-093--PyPortfolioOpt
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-009
title: Transaction data corruption from missing columns and invalid dates
description: Extracting round trips from transactions DataFrame without validating required columns (amount, price, symbol)
causes KeyError exceptions. When open_dt is not strictly less than close_dt, negative or zero duration values indicate
data corruption causing incorrect holding period statistics. Similarly, non-normalized transaction timestamps cause intra-day
trades to be incorrectly split across days.
project_source: finance-bp-106--pyfolio-reloaded
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-010
title: Missing DataFrame schema validation causes KeyError propagation
description: Passing non-DataFrame objects (numpy arrays, lists) where DataFrame is expected causes NameError, AttributeError,
or TypeError in downstream pandas operations. xalpha's fundinfo.price requires specific columns (date, netvalue, totvalue,
comment), PyPortfolioOpt and Riskfolio-Lib require index alignment between expected returns and covariance matrix. Missing
columns cause backtest calculations to fail with NaN values or KeyError.
project_source: finance-bp-068--xalpha, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-011
title: Wrong annualization factors distort cross-frequency metric comparison
description: Applying incorrect annualization factors for daily, weekly, monthly, quarterly, or yearly data produces metrics
that are not comparable across return frequencies, causing invalid strategy comparisons and misallocated capital. The
Sharpe ratio must be annualized with the factor that matches the return frequency and computed with the sample standard
deviation (ddof=1); otherwise it yields misleading risk-adjusted return estimates.
project_source: finance-bp-107--empyrical-reloaded
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-012
title: Misaligned time series in alpha/beta calculation produces invalid factor analysis
description: Passing returns and factor_returns to alpha_beta functions without verifying data alignment on index labels
(pd.Series) or length equality (np.ndarray) produces incorrect alpha/beta values due to correlation computed between mismatched
periods. Including benchmark ticker in the asset ticker list causes circular correlation producing meaningless beta values
of approximately 1.0.
project_source: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-013
title: Forward-filling spot prices creates look-ahead bias in TRI construction
description: Forward-filling spot prices creates look-ahead bias where future prices are used to calculate historical returns,
invalidating all TRI-based backtest results. The total return index construction requires multiplicative cumulation using
cumprod (not cumsum) with base value 100, as additive cumulation allows negative cumulative returns to break the index
chain.
project_source: finance-bp-108--finmarketpy
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-014
title: Unsupported solver selection breaks advanced risk calculations
description: Selecting a solver that lacks the required cone support (power cone, exponential cone) makes CVXPY raise
SolverError, or leaves the optimizer returning None, which breaks the risk calculations. CLARABEL, MOSEK, and SCS support
the power cone needed for RLVaR/RLDaR calculations, while CLARABEL, MOSEK, SCS, and ECOS support the exponential cone
needed for EVaR calculations. Riskfolio-Lib and PyPortfolioOpt both require careful solver selection.
project_source: finance-bp-117--Riskfolio-Lib, finance-bp-093--PyPortfolioOpt
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
cross_project_wisdom:
- wisdom_id: CW-PORTFOLIO-ANALYTICS-001
source_project: finance-bp-066--wealthbot, finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt
pattern_name: Defensive zero-division guards with explicit handling
description: Always guard division operations with explicit zero-value checks before executing. In price ratio calculations,
filter out securities where old_price is zero before calling getPricesDiff. In composite score calculations, guard against
total_weight of zero and return 0.0 for empty input lists. This prevents NaN/infinity propagation that corrupts downstream
calculations and crashes pipelines.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-002
source_project: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: Covariance matrix positive-semidefiniteness verification
description: Always verify covariance matrix is positive-semidefinite before passing to CVXPY optimization. Apply eigenvalue
clipping if violated, as non-PSD matrices cause Cholesky decomposition failures. Both PyPortfolioOpt and Riskfolio-Lib
enforce this constraint to prevent optimizer from finding mathematically invalid solutions or crashing entirely.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-003
source_project: finance-bp-068--xalpha, finance-bp-106--pyfolio-reloaded, finance-bp-107--empyrical-reloaded
pattern_name: Geometric compounding for cumulative returns
description: Compute cumulative returns using geometric compounding via cumprod(1 + returns), never arithmetic cumulation
via cumsum. An arithmetic cumulative sum ignores compounding (for example, +10% followed by -10% sums to 0% but compounds
to -1%), so cumulative returns diverge significantly from actual portfolio performance over volatile periods. This principle
applies to total return index construction and any cumulative performance calculation.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-004
source_project: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded
pattern_name: Temporal shift enforcement to prevent look-ahead bias
description: Enforce proper temporal shifting in signal generation and position calculations. Lag signals by one bar before
applying them (in pandas, shift(1) on the signal; a shift(-1) belongs only on the return series being aligned to the signal),
and apply shift(1) when estimating intraday positions from EOD data. Forward-fill carry data and backward-fill only old
data gaps, never forward-fill spot prices. Violations cause live trading returns to diverge from backtested results.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-005
source_project: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: DCP-compliant convex optimization construction
description: Use only DCP-compliant convex objectives and constraints in CVXPY. Provide constraints as callable functions
accepting weight variables, use valid bounds formats matching n_assets length, and verify target parameters (volatility,
return) are within feasible ranges. Non-convex or infeasible problems fail with DCPError or OptimizationError, preventing
optimization entirely.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-006
source_project: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit
pattern_name: Correct Sharpe ratio formula with risk-free rate subtraction
description: Calculate Sharpe ratio using (mean returns - risk_free) / std(returns) * sqrt(annualization) with sample standard
deviation (ddof=1). Subtract risk-free rate from asset returns before dividing by volatility. Incorrect Sharpe ratio calculation
produces misleading risk-adjusted return estimates, causing poor investment decisions based on faulty performance attribution.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-007
source_project: finance-bp-068--xalpha, finance-bp-066--wealthbot
pattern_name: Immutable FIFO position tracking with chronological ordering
description: Maintain FIFO position tracking with strictly increasing date order for position entries. Use copy() function
to create independent copies before mutating remtable to avoid side effects. Enforce chronological ordering in sell operations
to ensure correct cost basis and holding period calculation, particularly important for funds with tiered fees by holding
period.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-008
source_project: finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: Validation at system boundaries with descriptive errors
description: Enforce validation at system boundaries with descriptive error messages. Validate that expected returns match
covariance matrix dimensions, score values are within [0, 100], confidence values are within [0, 1], and required DataFrame
columns are present. Invalid inputs should raise ValueError with descriptive messages listing valid options to prevent
silent failures or corrupted calculations.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-009
source_project: finance-bp-068--xalpha, finance-bp-107--empyrical-reloaded
pattern_name: Decimal rounding for monetary calculations
description: Use Decimal with explicit rounding (myround) for each monetary calculation to avoid floating-point errors that
cause share miscalculation and incorrect cost basis. This prevents rounding errors from propagating to XIRR and portfolio
valuation calculations. Direct floating-point operations in financial calculations accumulate errors that become material
over many transactions.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-010
source_project: finance-bp-106--pyfolio-reloaded, finance-bp-068--xalpha
pattern_name: Cash flow sign convention enforcement
description: Mark cash outflows as negative and cash inflows as positive in cftable. Incorrect cash flow signs cause NPV
calculation to invert, producing negative returns for profitable trades and vice versa. Verify sum of round trip PnLs
equals total realized transaction dollars to catch sign convention errors before they corrupt performance attribution.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
domain_constraints_injected: []
resources_injected: {}
component_capability_map:
project: finance-bp-066--wealthbot
scan_date: '2026-04-22'
stats:
total_files: 5
total_classes: 19
total_functions: 0
total_stages: 5
modules:
price_collection:
class_count: 4
stage_id: price_collection
stage_order: 1
responsibility: Fetch and store current security prices from Tradier API; ensures all securities have the latest market
prices for portfolio valuation
classes:
- name: Rebalancer.updateSecurities
file: price_collection/rebalancer-updatesecurities.py
line: 0
kind: required_method
signature: ''
- name: BaseRebalancer.getPricesDiff
file: price_collection/baserebalancer-getpricesdiff.py
line: 0
kind: required_method
signature: ''
- name: Rebalancer.setIsCurrent
file: price_collection/rebalancer-setiscurrent.py
line: 0
kind: required_method
signature: ''
- name: price_source
file: price_collection/price-source.py
line: 0
kind: replaceable_point
design_decision_count: 2
portfolio_analysis:
class_count: 4
stage_id: portfolio_analysis
stage_order: 2
responsibility: Calculate portfolio value, model deviation, and determine required rebalancing actions based on target
allocations
classes:
- name: CeModel.getModelEntities
file: portfolio_analysis/cemodel-getmodelentities.py
line: 0
kind: required_method
signature: ''
- name: Rebalancer.updatePortfolioValues
file: portfolio_analysis/rebalancer-updateportfoliovalues.py
line: 0
kind: required_method
signature: ''
- name: Job.getRebalanceType
file: portfolio_analysis/job-getrebalancetype.py
line: 0
kind: required_method
signature: ''
- name: rebalance_type
file: portfolio_analysis/rebalance-type.py
line: 0
kind: replaceable_point
design_decision_count: 3
trade_decision_engine:
class_count: 5
stage_id: trade_decision
stage_order: 3
responsibility: Determine which securities to buy or sell based on price movements and risk tolerance thresholds
classes:
- name: Trade.buyOrSell
file: trade_decision_engine/trade-buyorsell.py
line: 0
kind: required_method
signature: ''
- name: TradeData.getTradeData
file: trade_decision_engine/tradedata-gettradedata.py
line: 0
kind: required_method
signature: ''
- name: deviation_threshold
file: trade_decision_engine/deviation-threshold.py
line: 0
kind: replaceable_point
- name: price_threshold_buy
file: trade_decision_engine/price-threshold-buy.py
line: 0
kind: replaceable_point
- name: price_threshold_sell
file: trade_decision_engine/price-threshold-sell.py
line: 0
kind: replaceable_point
design_decision_count: 4
trade_execution:
class_count: 5
stage_id: trade_execution
stage_order: 4
responsibility: Place orders via Tradier API and record resulting transactions, lots, and positions
classes:
- name: Requests.placeOrder
file: trade_execution/requests-placeorder.py
line: 0
kind: required_method
signature: ''
- name: Trade.buy
file: trade_execution/trade-buy.py
line: 0
kind: required_method
signature: ''
- name: Trade.sell
file: trade_execution/trade-sell.py
line: 0
kind: required_method
signature: ''
- name: order_type
file: trade_execution/order-type.py
line: 0
kind: replaceable_point
- name: duration
file: trade_execution/duration.py
line: 0
kind: replaceable_point
design_decision_count: 5
portfolio_value_update:
class_count: 1
stage_id: portfolio_value_update
stage_order: 5
responsibility: Record post-trade portfolio valuation for historical tracking and billing calculations
classes:
- name: Rebalancer.updatePortfolioValues
file: portfolio_value_update/rebalancer-updateportfoliovalues.py
line: 0
kind: required_method
signature: ''
design_decision_count: 2
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.7272727272727273
evidence_invalid: 21
evidence_verified: 56
evidence_auto_fixed: 0
audit_coverage: 48/48 (100%)
audit_pass_rate: 3/48 (6%)
audit_fail_total: 20
audit_finance_universal:
pass: 1
warn: 11
fail: 8
audit_subdomain_totals:
pass: 2
warn: 14
fail: 12
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-066. Evidence verify ratio
= 72.7% and audit fail total = 20. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-066-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt, then run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc: []
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries: []
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 114
fatal_constraints_count: 28
non_fatal_constraints_count: 165
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'Don''t forget SL-11: a Recorder subclass MUST declare both the provider and data_schema class attributes;
otherwise initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9), using the standard Appel parameters rather than adaptive ones because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose a registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering rather than current-only filtering because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose a drawer_rects subclass override for custom annotations rather than hardcoded markers, which allows traders
to define entry/exit visuals without modifying the base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 27 source groups: BillManager(1),
CashCalculationManager(1), CeModelManager(2), ClientAllocationValuesManager(2), ClientPortfolioManager(1), FeeManager(3),
and 21 more.'
key_decisions: 114 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-066
type: B
summary: Billing sum calculation using average account values and day count
- id: BD-067
type: B
summary: Account cash calculation as sum of free cash across all accounts before the date
- id: BD-063
type: B/RC
summary: Commission range extracted as min/max from model security assignments
- id: BD-065
type: B
summary: Risk rating adjustment when deleting model parent (child ratings decrement if above parent)
- id: BD-054
type: B/RC
summary: Dollar-to-percent and percent-to-dollar conversions for allocation
- id: BD-058
type: B/BA
summary: Portfolio variance calculation (dollar and percent) vs target
- id: BD-064
type: B
summary: Portfolio total value = latest ClientPortfolioValue or sum of ClientAccount values
- id: BD-051
type: B/BA
summary: Tiered Fee Calculation with proportional tier splitting
- id: BD-055
type: B
summary: Proportional fee billing based on days worked in period
- id: BD-060
type: B/BA
summary: Minimum fee floor enforcement after tiered calculation
- id: BD-057
type: B/BA
summary: Short-term vs long-term gain classification by 365-day threshold
- id: BD-062
type: B/DK
summary: Security allocation aggregation by subclass using SUM and current price
- id: BD-061
type: B/BA
summary: Tax Loss Harvesting savings calculation with differential tax rates
- id: BD-056
type: B/DK
summary: Gain/Loss percentage calculation using cost basis ratio
- id: BD-053
type: B/BA
summary: Risk tolerance score with additive points and closest model match
- id: BD-052
type: B
summary: Annualized TWR using geometric mean formula
- id: BD-059
type: B
summary: Investment gain = ending value - beginning value - contributions + withdrawals
- id: BD-007
type: B/BA
summary: 'Lot status tracking: INITIAL=1, IS_OPEN=2, CLOSED=3, DIVIDED=4'
- id: BD-008
type: B/BA
summary: 'Position status: INITIAL=1, IS_OPEN=2, IS_CLOSE=3, NOT_VERIFIED=4'
- id: BD-043
type: B/RC
summary: wasRebalancerDiff flag tracks if lot was modified by rebalancer
- id: BD-017
type: B
summary: Billing specs support TIER=1 (percentage of AUM) and FLAT=2 (fixed) fee types
- id: BD-018
type: B
summary: Fee tier infinity defined as 1000000000000 (1 trillion) for unbounded top tier
- id: BD-019
type: B/BA
summary: Minimum billing fee enforced as floor on calculated tiered fees
- id: BD-020
type: B
summary: RIA fees calculated per quarter using account value at period end date
- id: BD-035
type: B
summary: 'Relationship types: License Fee (0) vs TAMP (1) for RIA business model'
- id: BD-036
type: B
summary: FeeManager calculates tiered fees by iterating sorted fee tiers
- id: BD-044
type: B/BA
summary: Fee tiers use INFINITY constant to represent unbounded top tier
- id: BD-048
type: B/DK
summary: Account value calculated at period end date for billing
- id: BD-038
type: B/BA
summary: System account default source is 'sample' for demo accounts
- id: BD-082
type: BA
summary: 'INTERACTION: BD-003 (fee structure: $250 fixed + 10%) × BD-002 (minimum trade sizes: $50-$150) → Minimum economically
viable trade is $2,500 (breakeven point where 10% fee = $250)'
- id: BD-083
type: B/BA
summary: 'INTERACTION: BD-057 (365-day short-term threshold) × BD-022 (wash sale lot-level tracking) × BD-040 (TLH buy-back
flag) → Tax lot age determines both gain classification AND wash sale exposure simulta'
- id: BD-084
type: BA
summary: 'INTERACTION: BD-011 (tolerance bands 2-21% config) ≠ BD-068 (15% hardcoded thresholds) → Configurable drift
bands don''t match actual rebalancing triggers, creating silent override'
- id: BD-085
type: RC
summary: 'INTERACTION: BD-080 (hardcoded API key fallback) undermines BD-076 (BaseVoter authorization pattern) → Security
bypass negates authorization contract for fallback scenarios'
- id: BD-086
type: B/BA
summary: 'RISK CASCADE: BD-021 (TLH threshold) → BD-061 (TLH savings formula) → BD-059 (cash flow TWR) → BD-052 (annualized
TWR) → BD-064 (portfolio value) → BD-048 (period-end billing) → BD-066 (billing sum) →'
- id: BD-087
type: BA
summary: 'RISK CASCADE: BD-068 (15% threshold) + BD-011 (2-21% bands) + BD-003 ($250 fixed fee) + BD-002 ($50 min trades)
→ Dead zone where drift exceeds tolerance but trades are uneconomical'
- id: BD-088
type: B/BA
summary: 'HIDDEN DEPENDENCY: BD-052 (annualized TWR) depends on BD-059 (investment gain) which depends on BD-064 (portfolio
value) which has implicit settlement date dependency without explicit day count conven'
- id: BD-089
type: B/BA
summary: 'INTERACTION: BD-045 (qualified vs non-qualified segregation) × BD-025 (municipal substitution) × BD-057 (365-day
threshold) → Tax-aware model decisions require coordinated lot tracking across account '
- id: BD-090
type: BA
summary: 'INTERACTION: BD-023 (managed level: household/account) × BD-049 (household rebalancing) × BD-031 (trade recon
grouping by account/subclass/security) → Household aggregation creates reconciliation comp'
- id: BD-076
type: BA
summary: BaseVoter abstract class defines OWNER/OPERATOR attributes; concrete voters must implement vote()
- id: BD-077
type: DK
summary: 'Workflow Entity extends Model/Workflow base class, delegates all methods via parent::'
- id: BD-030
type: BA/M
summary: Client-to-system account type adapter converts account groups to system types
- id: BD-070
type: DK
summary: getClientAccounts()->first() assumes client always has at least one account
- id: BD-071
type: DK
summary: getGroups()->first() assumes user always belongs to at least one group
- id: BD-078
type: B
summary: SignableObjectRepositoryInterface implemented by 4 repositories with document signature logic
- id: BD-072
type: BA/DK
summary: Client registration steps are hardcoded arrays with implicit ordering 0-7
- id: BD-073
type: B/BA
summary: 'Workflow client_status progression: DEFAULT(0)→ENVELOPE_CREATED(1)→...→ACCOUNT_FUNDED(7)'
- id: BD-079
type: B/BA
summary: 'Job REBALANCE_TYPE constants: FULL(0)→REQUIRED_CASH(1)→FULL_AND_TLH(2)→NO_ACTIONS(3)→INITIAL(4)'
- id: BD-074
type: B
summary: Rebalancer extends BaseRebalancer and uses Trade trait for composition
- id: BD-075
type: BA
summary: AbstractFormHandler defines success() contract; all handlers must implement it
- id: BD-081
type: BA
summary: 'RebalancerQueue status constants: STATUS_SELL=''sell'', STATUS_BUY=''buy'''
- id: BD-010
type: B
summary: Asset class types limited to STOCKS and BONDS enumeration
- id: BD-011
type: B/BA
summary: Tolerance bands defined per asset class and subclass with default range 2-21%
- id: BD-012
type: B
summary: 'Rebalancing methods: Asset Class level (1) or Subclass level (2)'
- id: BD-013
type: B/BA
summary: 'Rebalancing frequencies: Quarterly, Semi-Annual, Annual, Tolerance Bands'
- id: BD-023
type: B/DK
summary: 'Account managed levels: Account=1, Household=2, Account or Household=3'
- id: BD-025
type: B/RC
summary: Security assignment allows preferred flag and municipal substitution
- id: BD-026
type: B
summary: 'Model types: STRATEGY and CUSTOM with inheritance support'
- id: BD-028
type: B/BA
summary: CeModel copy preserves assumption settings including commission min/max
- id: BD-029
type: B/BA
summary: Subclass expected performance tracked for return assumptions
- id: BD-033
type: B/BA
summary: Expected performance values range from 3% to 10% in fixture data
- id: BD-045
type: B/RC
summary: Qualified vs non-qualified account segregation in model entities
- id: BD-049
type: B/RC
summary: Rebalancing at household level aggregates all client accounts
- id: BD-068
type: BA/DK
summary: Rebalancer uses hardcoded 15% deviation thresholds for buy/sell triggers
- id: BD-069
type: BA/M
summary: Hardcoded model_deviation = 4 in Rebalancer.php:179 for portfolio value updates
- id: BD-080
type: RC
summary: 'Role-based API key fallback: admin/ria/admin→hardcoded [email protected] account'
- id: BD-GAP-001
type: DK
summary: 'Missing: as-of vs processing time'
- id: BD-GAP-002
type: M
summary: 'Missing: Matrix ill-conditioning and stability'
- id: BD-GAP-003
type: DK
summary: 'Missing: Random seed full coverage'
- id: BD-GAP-004
type: DK
summary: 'Missing: Model and data version snapshot binding'
- id: BD-GAP-005
type: RC
summary: 'Missing: Price and quantity precision (tick/lot size)'
- id: BD-GAP-006
type: B
summary: 'Missing: : 7'
- id: BD-GAP-007
type: B
summary: 'Missing: Backtest Overfitting Protection'
- id: BD-GAP-008
type: B
summary: 'Missing: Factor IC Demean & Group Alignment'
- id: BD-GAP-009
type: M
summary: 'Missing: Transition Matrix Time-Homogeneity'
- id: BD-GAP-010
type: B
summary: 'Missing: Provider Priority & Credential Isolation'
- id: BD-GAP-011
type: B
summary: 'Missing: Feature Extraction Time Boundaries'
- id: BD-GAP-012
type: DK
summary: 'Missing: Add timezone annotation to all datetime fields and implement UTC normalization'
- id: BD-GAP-013
type: RC
summary: 'Missing: Migrate monetary values from float to DECIMAL(19,4) type for precision'
- id: BD-GAP-014
type: B
summary: 'Missing: Implement walk-forward analysis framework with out-of-sample testing'
- id: BD-GAP-015
type: DK
summary: 'Missing: Add configurable random seed for all stochastic components'
- id: BD-GAP-016
type: RC
summary: 'Missing: Add security-specific tick_size and lot_size fields with order quantity validation'
- id: BD-GAP-017
type: B
summary: 'Missing: : 7'
- id: BD-GAP-018
type: M
summary: 'Missing: Covariance Matrix PSD Repair'
- id: BD-GAP-019
type: B
summary: 'Missing: Covariance Estimator Selection (Ledoit-Wolf vs Sample)'
- id: BD-GAP-020
type: B
summary: 'Missing: VaR/CVaR Confidence Level & Window'
- id: BD-GAP-021
type: B
summary: 'Missing: PD/LGD/EAD Estimation (IRB vs Standard)'
- id: BD-GAP-022
type: B
summary: 'Missing: Vasicek Single-Factor Asset Correlation'
- id: BD-GAP-023
type: B
summary: 'Missing: Stress Test Scenario Macro Variables'
- id: BD-GAP-024
type: M
summary: 'Missing: Implement covariance matrix shrinkage (Ledoit-Wolf) and PSD repair'
- id: BD-015
type: B/BA
summary: Risk tolerance scoring starts at 50 points base, adds question points
- id: BD-016
type: B/BA
summary: Portfolio model selection by closest risk rating match
- id: BD-027
type: B
summary: Risk rating must be zero or positive, enforced by setter validation
- id: BD-037
type: B
summary: Questionnaire answers use point-based scoring with withdraw-age special handling
- id: BD-050
type: B/BA
summary: Risk question sequence default is 100 for ordering
- id: BD-005
type: RC
summary: Short-term capital gains threshold is 365 days per IRS regulations
- id: BD-006
type: B/BA
summary: Capital gains classification implemented using 365-day threshold in Lot entity
- id: BD-021
type: B/BA
summary: Tax loss harvesting enabled with configurable threshold (dollar and percent)
- id: BD-022
type: B/RC
summary: Wash sale tracking flagged at lot level
- id: BD-040
type: B/RC
summary: TLH buy-back original security option controlled by tlh_buy_back_original flag
- id: BD-042
type: B/RC
summary: Cost basis known flag tracks whether purchase price is available
- id: BD-001
type: B/RC
summary: Trade execution follows strict SELL before BUY ordering within each rebalancer action
- id: BD-002
type: B/BA
summary: 'Default minimum trade sizes: $50 buy, $150 initial buy, $50 sell for equity securities'
- id: BD-003
type: B/BA
summary: Security transaction fees include fixed ($250) and percentage (10%) components
- id: BD-004
type: B/BA
summary: Redemption penalties apply for 21-day holding period with 15 fixed + 5% fee
- id: BD-009
type: B/RC
summary: 'Transaction statuses: IN_PROGRESS, PLACED, VERIFIED, NOT_POSTED'
- id: BD-014
type: B/BA
summary: 'Job rebalance types: FULL=0, REQUIRED_CASH=1, FULL_AND_TLH=2, NO_ACTIONS=3, INITIAL=4'
- id: BD-031
type: B/RC
summary: Trade recon groups by account, subclass, security for aggregation
- id: BD-032
type: B/RC
summary: RebalancerQueue SQL distinguishes 'AS' (all shares) vs 'S' (some shares) quantity types
- id: BD-034
type: B/BA
summary: Subclass priority ordering determines processing sequence
- id: BD-041
type: B
summary: RebalancerAction status uses same constants as Job rebalance_type
- id: BD-046
type: B
summary: RebalancerQueue amount represents target trade value in dollars
- id: BD-047
type: B
summary: RebalancerQueue quantity represents number of shares
- id: BD-024
type: B
summary: 'Portfolio processing modes: Straight-Through=1, Collaborative=2'
- id: BD-039
type: B/BA
summary: Client account default process_step starts at 0
resources:
packages:
- name: pandas
version_pin: ==1.5.3
- name: numpy
version_pin: ==1.24.4
- name: matplotlib
version_pin: '>=2'
- name: requests
version_pin: ==2.31.0
- name: scipy
version_pin: '>=1.3.0'
- name: scikit-learn
version_pin: '>1.4.2'
- name: pytest
version_pin: '>=8.3'
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When calculating prices_diff using current_price divided by old_price
action: allow old_price to be zero without validation
severity: fatal
kind: domain_rule
modality: must_not
consequence: Division by zero will produce INF or NaN, corrupting portfolio rebalancing calculations and causing incorrect
buy/sell decisions based on invalid prices_diff values
stage_ids:
- price_collection
- id: finance-C-002
when: When storing SecurityPrice entity with source='tradier'
action: validate that the quote->last price value exists before persisting
severity: fatal
kind: domain_rule
modality: must
consequence: Storing null or missing price values will corrupt price history, causing incorrect portfolio valuations and
rebalancing decisions based on invalid price data
stage_ids:
- price_collection
- id: finance-C-005
when: When persisting new SecurityPrice with is_current=true
action: reset the is_current flag to false for all existing prices of the same security first
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Multiple prices marked as is_current=true causes ambiguous price lookups, breaking downstream portfolio valuation
and rebalancing calculations that rely on single source of truth
stage_ids:
- price_collection
- id: finance-C-014
when: When calculating security allocation amounts
action: Divide percent value by 100 before multiplying with portfolio total value
severity: fatal
kind: domain_rule
modality: must
consequence: Allocation amounts will be 100x larger than intended, causing over-investment in securities and potential
margin calls or regulatory violations
stage_ids:
- portfolio_analysis
- id: finance-C-015
when: When calculating price change ratios for rebalancing decisions
action: Allow price_diff to be zero or undefined when old_price is missing
severity: fatal
kind: domain_rule
modality: must_not
consequence: Division by zero or undefined price will cause runtime error or incorrect rebalancing signals, triggering
unauthorized trades or missing required trades
stage_ids:
- portfolio_analysis
- id: finance-C-019
when: When accessing client accounts for initial rebalancing
action: Check that getClientAccounts() returns non-empty collection before calling first()
severity: fatal
kind: domain_rule
modality: must
consequence: Calling first() on empty ArrayCollection will return false, causing account number access to fail with TypeError,
preventing any initial rebalancing
stage_ids:
- portfolio_analysis
- id: finance-C-031
when: When calculating price change ratio in getPricesDiff()
action: perform the division when old_price equals zero
severity: fatal
kind: domain_rule
modality: must_not
consequence: Division by zero causes NaN/infinity values in prices_diff, causing all subsequent trade decisions to be
invalid or unpredictable
stage_ids:
- trade_decision
- id: finance-C-050
when: When placing orders via Tradier API
action: Prefix account numbers with 'VA' before sending to broker endpoint
severity: fatal
kind: domain_rule
modality: must
consequence: Orders submitted with incorrect account numbers will be rejected by Tradier, causing failed trade execution
and portfolio misalignment
stage_ids:
- trade_execution
- id: finance-C-051
when: When creating Lot records on buy transactions
action: Set cost_basis to the latest price retrieved from price cache via security ID
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect cost basis values will cause inaccurate gain/loss calculations, leading to incorrect tax reporting
and wash sale violations
stage_ids:
- trade_execution
- id: finance-C-052
when: When executing buy or sell orders
action: Use security symbol as primary identifier in API calls
severity: fatal
kind: domain_rule
modality: must
consequence: Using internal security IDs instead of symbols in Tradier API calls will cause order failures and invalid
position records
stage_ids:
- trade_execution
- id: finance-C-066
when: When implementing updatePortfolioValues to record post-trade portfolio value
action: create a new ClientPortfolioValue record instead of updating existing records
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Updating existing records destroys the time-series integrity needed for billing calculations and historical
reconciliation against custodian records
stage_ids:
- portfolio_value_update
- id: finance-C-074
when: When linking ClientPortfolioValue to the owning ClientPortfolio
action: associate the new ClientPortfolioValue record with its parent ClientPortfolio via the clientPortfolio relationship
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Orphaned portfolio value records cannot be queried by client, breaking billing calculations and historical
reporting
stage_ids:
- portfolio_value_update
- id: finance-C-077
when: When calculating price ratio in portfolio_analysis using current_price divided by old_price from price_collection
action: Pass old_price=0 to division; must filter out securities where old_price is zero before calling getPricesDiff
severity: fatal
kind: domain_rule
modality: must_not
consequence: Division by zero causes fatal PHP error, crashing the rebalance pipeline and leaving Job and RebalancerAction
records in inconsistent started state
- id: finance-C-080
when: When executing trades via the Tradier API
action: Bypass the API order placement; must use placeOrder function to submit market orders through the brokerage gateway
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Trades execute at unverified prices without brokerage integration, causing settlement failures and potential
securities law violations
- id: finance-C-084
when: When passing price data from Tradier API to rebalancing logic
action: Trust the API response without validating that quote contains 'last' field; must check isset($quote->last) before
using price
severity: fatal
kind: resource_boundary
modality: must_not
consequence: NULL or missing price values propagate through calculations, causing invalid price ratios and potentially
placing orders with zero or NULL amounts
- id: finance-C-088
when: When calculating prices_diff ratio (current_price / old_price)
action: Use float division and store result; prices_diff > 1 means price increased, prices_diff < 1 means price decreased
severity: fatal
kind: domain_rule
modality: must
consequence: Inverted ratio logic causes buy/sell signals to fire in opposite directions, actively harming portfolio allocation
instead of improving it
- id: finance-C-091
when: When deciding whether to sell or buy based on prices_diff thresholds
action: Execute sell when prices_diff > 1.15 (security appreciated beyond tolerance) and buy when prices_diff < 0.85 (security
depreciated beyond tolerance)
severity: fatal
kind: domain_rule
modality: must
consequence: Swapped buy/sell signals actively move portfolio away from target allocation, causing inverse rebalancing
and customer losses
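# Illustrative sketch (Python, kept in comments so this YAML stays valid).
# It shows one way the ratio direction in finance-C-088, the 1.15 / 0.85
# thresholds in finance-C-091, and the zero-price guard in finance-C-031
# could fit together. The function name trade_signal and the "hold" label
# are assumptions for illustration, not names from the source system.
#
#   def trade_signal(current_price: float, old_price: float | None) -> str:
#       # Guard first: a missing or zero old_price must never reach the division.
#       if old_price is None or old_price <= 0:
#           raise ValueError("old_price must be a positive number")
#       prices_diff = current_price / old_price  # > 1 means the price rose
#       if prices_diff > 1.15:
#           return "sell"   # appreciated beyond tolerance
#       if prices_diff < 0.85:
#           return "buy"    # depreciated beyond tolerance
#       return "hold"       # inside the 0.85-1.15 band: no trade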
- id: finance-C-092
when: When implementing a Rebalancer subclass for broker integration
action: Prefix account numbers with 'VA' before calling Tradier API endpoints
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Tradier API will reject account identifiers, causing all trade orders to fail with invalid account errors
- id: finance-C-095
when: When routing trade execution through the Rebalancer API layer
action: Use security symbols as primary identifier in Tradier API calls rather than internal security IDs
severity: fatal
kind: domain_rule
modality: must
consequence: Tradier API returns validation errors for unrecognized security identifiers, causing all trades to fail
- id: finance-C-101
when: When deploying or testing the wealthbot system in non-production environments
action: Configure tradier_sandbox parameter to true to route API calls to sandbox.tradier.com endpoint
severity: fatal
kind: resource_boundary
modality: must
consequence: Live trades execute on real accounts using real capital during testing, causing irreversible financial transactions
- id: finance-C-114
when: When calculating capital gains for tax reporting and tax lot optimization
action: Classify assets held less than 365 days as short-term capital gains (taxed as ordinary income) and assets held
365+ days as long-term capital gains per IRS holding period regulations
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect holding period classification violates IRS tax regulations, resulting in inaccurate tax calculations
that may trigger penalties or audit findings
derived_from_bd_id: BD-005
- id: finance-C-119
when: When applying Markov chain models for price state predictions
action: Assume transition matrix time-homogeneity is satisfied without explicit validation — framework does not implement
time-homogeneity verification for Markov transition matrices
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Assuming time-homogeneous transitions without validation produces systematically incorrect state predictions
when underlying market regimes shift, degrading both backtest reliability and live trading performance
derived_from_bd_id: BD-GAP-009
- id: finance-C-125
when: When calculating capital gains classification for tax lot accounting
action: 'Apply exactly 365-day threshold at lot level: lots held 365 days or more are long-term; lots held fewer than
365 days are short-term; use this boundary for IRS holding period compliance'
severity: fatal
kind: domain_rule
modality: must
consequence: Using incorrect threshold (e.g., 366 days or 364 days) misclassifies short-term vs long-term gains, causing
incorrect tax calculations and potential IRS compliance issues; wrong classification triggers wrong tax rates affecting
net returns
derived_from_bd_id: BD-006
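# Illustrative sketch (Python, kept in comments so this YAML stays valid).
# It applies the lot-level 365-day boundary stated in finance-C-114 and
# finance-C-125. The function name classify_lot and the returned labels
# are hypothetical.
#
#   from datetime import date
#
#   def classify_lot(acquired: date, sold: date) -> str:
#       held_days = (sold - acquired).days
#       # 365 days or more is long-term; fewer than 365 is short-term.
#       return "long_term" if held_days >= 365 else "short_term"
#
#   classify_lot(date(2023, 1, 1), date(2024, 1, 1))  # held 365 days
#   classify_lot(date(2023, 1, 2), date(2024, 1, 1))  # held 364 days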
- id: finance-C-127
when: When tracking wash sale adjustments for IRS compliance on securities transactions
action: 'Implement lot-level wash sale tracking: flag lots where a substantially identical security was purchased within
the 30-day window before or after a loss sale; adjust the cost basis of replacement lots to disallow the loss deduction'
severity: fatal
kind: domain_rule
modality: must
consequence: Position-level tracking misses lot-level detail, causing wash sale adjustments to apply to wrong shares;
this results in disallowed loss claims triggering IRS penalties, interest, and potential audit exposure
derived_from_bd_id: BD-022
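# Illustrative sketch (Python, kept in comments so this YAML stays valid).
# It checks the 30-day window named in finance-C-127 for one loss sale and
# one replacement purchase. The function name is hypothetical, and the
# sketch assumes both dates refer to substantially identical securities;
# cost-basis adjustment of the replacement lot is not shown.
#
#   from datetime import date
#
#   def within_wash_sale_window(loss_sale: date, purchase: date) -> bool:
#       # True when the purchase falls within 30 days before or after the loss sale.
#       return abs((purchase - loss_sale).days) <= 30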
- id: finance-C-129
when: When implementing API fallback credential logic in authorization scenarios
action: Use hardcoded fallback API key ([email protected]) as a bypass for normal credential management; the BaseVoter
pattern must enforce OWNER/OPERATOR authorization without fallback bypass
severity: fatal
kind: domain_rule
modality: must_not
consequence: Hardcoded fallback bypasses authorization checks, allowing privileged operations under a known account regardless
of actual authorization state; this creates security vulnerability with audit trail gaps and regulatory exposure in
production environments
derived_from_bd_id: BD-085
- id: finance-C-151
when: When implementing rebalancing calculation logic for household-level aggregation
action: Aggregate all related client accounts within a household for unified rebalancing to prevent siloed rebalancing
that creates household-level imbalances
severity: fatal
kind: domain_rule
modality: must
consequence: Rebalancing individual accounts without household aggregation creates conflicting positions across related
accounts, eliminating tax optimization benefits and causing household-level diversification failures
derived_from_bd_id: BD-049
- id: finance-C-169
when: When implementing annualized return calculation in TwrCalculatorManager
action: Use geometric mean formula ((1 + TWR)^(365/interval) - 1) * 100 for annualization — do not replace with simple
linear scaling TWR * 365/days
severity: fatal
kind: domain_rule
modality: must
consequence: Using simple annualization instead of geometric mean inflates reported returns and destroys cross-period
comparability by ignoring compounding effects, causing misleading performance attribution and incorrect strategy comparisons
derived_from_bd_id: BD-052
- id: finance-C-171
when: When calculating investment gain for performance attribution in TwrCalculatorManager
action: 'Apply formula: investment_gains = ending_value - beginning_value - contributions + withdrawals — do not use simple
ending minus beginning'
severity: fatal
kind: domain_rule
modality: must
consequence: Computing gain as simple ending minus beginning ignores cash flow impact on portfolio size, distorting performance
attribution and TWR calculations by attributing deposit/withdrawal effects to investment decisions
derived_from_bd_id: BD-059
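# Illustrative sketch (Python, kept in comments so this YAML stays valid).
# It writes out the two formulas quoted in finance-C-169 and finance-C-171.
# Function and parameter names are assumptions; twr is taken to be a
# fraction (0.05 for 5%) and interval_days the length of the measured period.
#
#   def annualized_twr_percent(twr: float, interval_days: float) -> float:
#       # Geometric annualization, not linear scaling by 365 / interval_days.
#       return ((1 + twr) ** (365 / interval_days) - 1) * 100
#
#   def investment_gains(ending_value: float, beginning_value: float,
#                        contributions: float, withdrawals: float) -> float:
#       # Remove cash-flow effects so only investment performance remains.
#       return ending_value - beginning_value - contributions + withdrawals
#
# For a 5% return over half a year (interval_days=182.5), the geometric
# formula gives roughly 10.25%, while linear scaling would report 10%.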
regular:
- id: finance-C-003
when: When fetching security prices from external API
action: resolve API gateway to the correct Tradier endpoint based on sandbox mode
severity: high
kind: resource_boundary
modality: must
consequence: Using wrong API endpoint causes connection failures, missing price updates, and portfolio valuations based
on stale price data
stage_ids:
- price_collection
- id: finance-C-004
when: When storing monetary price values in SecurityPrice entity
action: store prices as float type for monetary calculations
severity: high
kind: domain_rule
modality: must_not
consequence: Float precision errors in price calculations compound over multiple trades, causing small but cumulative
discrepancies in portfolio value and trade sizing
stage_ids:
- price_collection
- id: finance-C-006
when: When calculating prices_diff for portfolio rebalancing
action: filter out zero-priced entries before using prices_diff in buyOrSell calculations
severity: high
kind: domain_rule
modality: must
consequence: Zero prices produce invalid prices_diff ratios (0 or INF), causing erroneous buy/sell signals and potentially
placing trades with zero value
stage_ids:
- price_collection
- id: finance-C-007
when: When fetching security prices from Tradier API
action: use valid API credentials configured via TRADIER_API_KEY and TRADIER_API_SECRET environment variables
severity: high
kind: resource_boundary
modality: must
consequence: Invalid or missing API credentials cause authentication failures, preventing price updates and leaving portfolios
valued at stale or zero prices
stage_ids:
- price_collection
- id: finance-C-008
when: When processing price records for rebalancing decisions
action: fetch exactly the 2 most recent price records per security, ordered by datetime descending
severity: high
kind: architecture_guardrail
modality: must
consequence: Fetching incorrect number of prices breaks prices_diff calculation, causing wrong comparison between current
and previous prices and incorrect buy/sell decisions
stage_ids:
- price_collection
- id: finance-C-009
when: When using Tradier API in production environment
action: enable TRADIER_SANDBOX mode which routes requests to sandbox.tradier.com
severity: high
kind: claim_boundary
modality: must_not
consequence: Sandbox mode returns simulated or delayed price data, causing portfolio valuations to diverge from actual
market conditions and leading to incorrect investment decisions
stage_ids:
- price_collection
- id: finance-C-010
when: When storing SecurityPrice with source='tradier'
action: record the exact datetime of price capture for temporal ordering
severity: medium
kind: architecture_guardrail
modality: must
consequence: Missing or incorrect datetime breaks chronological price ordering, causing wrong price comparisons and corrupted
historical price analysis
stage_ids:
- price_collection
- id: finance-C-011
when: When implementing price collection stage
action: claim real-time price capabilities when using polling-based Tradier API
severity: medium
kind: claim_boundary
modality: must_not
consequence: Polling-based API has inherent delays (typically 15 minutes for Tradier), misrepresenting real-time capabilities
leads to incorrect system assumptions and potential regulatory compliance issues
stage_ids:
- price_collection
- id: finance-C-012
when: When performing backtested rebalancing using stored prices
action: claim that simulated rebalancing results predict actual live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtested rebalancing ignores slippage, liquidity constraints, and execution timing that occur in live markets,
overstating expected returns and misleading clients
stage_ids:
- price_collection
- id: finance-C-013
when: When implementing CeModel target allocations
action: Validate that the sum of all CeModelEntity percent values equals exactly 100
severity: high
kind: domain_rule
modality: must
consequence: Portfolio allocation will not match target model, causing incorrect position sizing and improper diversification
across asset classes
stage_ids:
- portfolio_analysis
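# Illustrative sketch (Python, kept in comments so this YAML stays valid).
# It combines the sum-to-100 check in finance-C-013 with the divide-by-100
# step in finance-C-014. The function name, the dict-of-percents input and
# the small float tolerance are assumptions for illustration.
#
#   def allocation_amounts(percents: dict[str, float],
#                          total_value: float) -> dict[str, float]:
#       # Target percents are whole-number percentages that must total 100.
#       if abs(sum(percents.values()) - 100) > 1e-9:
#           raise ValueError("target percents must sum to 100")
#       # Divide by 100 before multiplying, or amounts come out 100x too large.
#       return {symbol: total_value * pct / 100
#               for symbol, pct in percents.items()}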
- id: finance-C-016
when: When setting rebalance type on Job entity
action: Use only the predefined REBALANCE_TYPE_* constants from Job entity
severity: high
kind: architecture_guardrail
modality: must
consequence: Using string values or arbitrary integers will cause Rebalancer.php switch statement at lines 64-67 to fail
matching, resulting in no rebalancing actions being executed
stage_ids:
- portfolio_analysis
- id: finance-C-017
when: When creating RebalancerAction records
action: Associate each action with a valid Job entity that tracks execution state
severity: high
kind: architecture_guardrail
modality: must
consequence: Rebalancer actions will lack job tracking, preventing error recovery and audit logging, leading to untraceable
financial discrepancies
stage_ids:
- portfolio_analysis
- id: finance-C-018
when: When completing rebalancing operations
action: Create a new ClientPortfolioValue record with current timestamp and calculated totals
severity: high
kind: architecture_guardrail
modality: must
consequence: Portfolio valuation history will be incomplete, breaking performance tracking and compliance reporting, making
it impossible to audit rebalancing decisions
stage_ids:
- portfolio_analysis
- id: finance-C-020
when: When configuring model deviation threshold for rebalancing
action: Hardcode model_deviation to 4% for all clients regardless of risk profile
severity: medium
kind: resource_boundary
modality: must_not
consequence: Aggressive traders will trigger excessive rebalancing with unnecessary tax events; conservative clients will
have portfolios too far from target allocation
stage_ids:
- portfolio_analysis
- id: finance-C-021
when: When using Tradier API for live trading
action: Assume sandbox and production endpoints behave identically or are interchangeable
severity: high
kind: resource_boundary
modality: must_not
consequence: Orders meant for sandbox may execute with real money on the production endpoint, or production orders may be
sent to sandbox and never reach the market, causing financial loss or missed trading opportunities
stage_ids:
- portfolio_analysis
- id: finance-C-022
when: When accessing security prices for rebalancing
action: Retrieve only prices marked as is_current=true for current valuation
severity: high
kind: domain_rule
modality: must
consequence: Using stale prices will cause incorrect position valuation, leading to wrong rebalancing amounts and potential
regulatory violations
stage_ids:
- portfolio_analysis
- id: finance-C-023
when: When evaluating rebalancing decisions
action: Claim that backtested rebalancing returns predict live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting simulated rebalancing results as expected live returns violates regulatory guidance and misleads
clients about actual investment outcomes
stage_ids:
- portfolio_analysis
- id: finance-C-024
when: When processing rebalancing for REQUIRED_CASH type
action: Use getInvestableCash() instead of getTotalValue() for cash-only rebalancing
severity: high
kind: architecture_guardrail
modality: must
consequence: Using total value for cash-only rebalancing will over-invest available cash, causing overdrafts or margin
requirements
stage_ids:
- portfolio_analysis
- id: finance-C-025
when: When making risk tolerance adjustments to rebalancing
action: Calculate point score as (sum of questionnaire answer points / 100) + 1 before threshold comparison
severity: high
kind: architecture_guardrail
modality: must
consequence: Incorrect point calculation will cause wrong risk tolerance adjustment, triggering excessive or insufficient
rebalancing trades
stage_ids:
- portfolio_analysis
- id: finance-C-026
when: When validating price change thresholds
action: Use 0.15 threshold for risk-adjusted comparison and 0.85/1.15 for absolute buy/sell triggers
severity: high
kind: domain_rule
modality: must
consequence: Incorrect thresholds will cause over-trading (too sensitive) or portfolio drift (too insensitive), both causing
financial harm to clients
stage_ids:
- portfolio_analysis
- id: finance-C-027
when: When deciding to implement per-RIA model deviation
action: Skip configuration because '4% works fine for our test clients'
severity: medium
kind: rationalization_guard
modality: must_not
consequence: Different RIA firms have varying risk appetites and regulatory requirements; a single hardcoded threshold
may violate client investment policy statements
stage_ids:
- portfolio_analysis
- id: finance-C-028
when: When implementing client account access in initialRebalance
action: Skip null check because 'all clients have at least one account in our test data'
severity: high
kind: rationalization_guard
modality: must_not
consequence: New clients without accounts will cause complete rebalancing failure, blocking onboarding workflow and requiring
manual intervention
stage_ids:
- portfolio_analysis
- id: finance-C-029
when: When testing price processing logic
action: Skip validation that old_price exists because 'prices are always updated daily'
severity: high
kind: rationalization_guard
modality: must_not
consequence: Market holidays, API outages, or new securities will cause division by zero, crashing the rebalancing process
or producing NaN values
stage_ids:
- portfolio_analysis
- id: finance-C-030
when: When presenting rebalancing action queue to users
action: Present simulated rebalancing as confirmed live trades
severity: high
kind: claim_boundary
modality: must_not
consequence: Clients will believe trades have executed when they are only queued for execution, potentially causing confusion
and regulatory issues
stage_ids:
- portfolio_analysis
- id: finance-C-032
when: When computing risk tolerance point from questionnaire answers
action: normalize point value by dividing by 100 and adding 1 before using in threshold comparison
severity: high
kind: domain_rule
modality: must
consequence: Without normalization, the point value is too large causing threshold comparisons to fail, triggering trades
when portfolio drift is within acceptable tolerance
stage_ids:
- trade_decision
- id: finance-C-033
when: When implementing trade decision logic
action: apply deviation threshold check before price threshold checks
severity: high
kind: domain_rule
modality: must
consequence: Current code has redundant if-else branches that produce identical outcomes regardless of deviation check,
causing unnecessary API calls and potential order failures
stage_ids:
- trade_decision
- id: finance-C-034
when: When placing orders through Tradier API
action: use market orders only for execution
severity: high
kind: resource_boundary
modality: must
consequence: Market orders execute at current market price which may differ from the price used in decision calculation,
especially during volatile periods when rebalancing triggers
stage_ids:
- trade_decision
- id: finance-C-035
when: When accessing price data for trade decisions
action: require historical price to exist for each security before triggering any buyOrSell decisions
severity: high
kind: resource_boundary
modality: must
consequence: Without historical price, old_price defaults to 0 causing division by zero, resulting in corrupted prices_diff
values that lead to incorrect trade decisions
stage_ids:
- trade_decision
- id: finance-C-036
when: When connecting to Tradier API for live trading
action: skip sandbox validation before production deployment
severity: high
kind: resource_boundary
modality: must_not
consequence: Untested orders may fail due to authentication, insufficient funds, or account restrictions, resulting in
failed rebalancing and potential compliance violations
stage_ids:
- trade_decision
- id: finance-C-037
when: When calculating prices_diff for trade decisions
action: use oldest available price as baseline when historical data spans multiple days
severity: medium
kind: resource_boundary
modality: must
consequence: Using the most recent two prices may span different time periods depending on data availability, causing
inconsistent price_diff calculations and unpredictable rebalancing behavior
stage_ids:
- trade_decision
- id: finance-C-038
when: When processing questionnaire answers for risk tolerance
action: proceed with empty or null answers in risk tolerance calculation
severity: high
kind: operational_lesson
modality: must_not
consequence: Empty questionnaire answers result in point=0, causing the threshold to be 1.0 instead of the expected risk-adjusted
value, breaking risk-based trade throttling
stage_ids:
- trade_decision
- id: finance-C-039
when: When triggering trades based on deviation threshold
action: apply 15% (0.15) deviation threshold correctly in the trade decision formula
severity: high
kind: operational_lesson
modality: must
consequence: Incorrect threshold application causes excessive trading when deviation is less than 15%, increasing transaction
costs and tax events, or failing to rebalance when drift exceeds tolerance
stage_ids:
- trade_decision
- id: finance-C-040
when: When deciding to buy or sell securities
action: apply price thresholds of 1.15 (15% rise triggers sell) and 0.85 (15% drop triggers buy)
severity: high
kind: operational_lesson
modality: must
consequence: Incorrect price threshold application causes buy/sell signals at wrong price levels, breaking the band-pass
filter that prevents excessive turnover during minor price movements
stage_ids:
- trade_decision
- id: finance-C-041
when: When making trade decisions in volatile markets
action: trigger trades when prices_diff falls within 0.85-1.15 band
severity: high
kind: operational_lesson
modality: must_not
consequence: Trading within the price band incurs transaction costs and potential tax events without meaningful portfolio
improvement, eroding returns especially in volatile sideways markets
stage_ids:
- trade_decision
- id: finance-C-042
when: When portfolio requires rebalancing action
action: execute trades only after passing both deviation threshold and price threshold checks
severity: high
kind: architecture_guardrail
modality: must
consequence: Current implementation executes trades when either threshold is met, causing excessive trading that contradicts
the dual-gate design intended to reduce unnecessary portfolio turnover
stage_ids:
- trade_decision
- id: finance-C-043
when: When executing trades through the Rebalancer
action: route all order placement through the placeOrder() method to ensure consistent API handling
severity: high
kind: architecture_guardrail
modality: must
consequence: Direct API calls that skip placeOrder() bypass its authentication headers and error handling, potentially exposing
account credentials or failing silently
stage_ids:
- trade_decision
- id: finance-C-044
when: When accessing price data for trade calculations
action: retrieve prices only through the getLatestPriceBySecurityId() method to ensure consistent caching
severity: medium
kind: architecture_guardrail
modality: must
consequence: Direct database queries for prices bypass the cached prices array, causing redundant API calls and potential
race conditions with concurrent rebalancing processes
stage_ids:
- trade_decision
- id: finance-C-045
when: When presenting rebalancing results to users
action: claim real-time trading capability
severity: high
kind: claim_boundary
modality: must_not
consequence: The system uses polling-based price updates, not real-time streaming, so trades execute based on potentially
stale price data with delay between decision and execution
stage_ids:
- trade_decision
- id: finance-C-046
when: When displaying backtested or sandbox rebalancing results
action: present results as guaranteed live trading outcomes
severity: high
kind: claim_boundary
modality: must_not
consequence: Sandbox API responses are simulated and may not reflect actual market conditions, order fills, or slippage;
presenting them as live results misleads users about expected performance
stage_ids:
- trade_decision
- id: finance-C-047
when: When evaluating rebalancing performance
action: claim that backtest results predict future trading returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Past portfolio rebalancing does not guarantee future results due to market conditions, fees, tax events,
and execution differences between simulation and live trading
stage_ids:
- trade_decision
- id: finance-C-048
when: When developing new trade decision logic
action: assume that simple-looking logic is correct without testing edge cases
severity: high
kind: rationalization_guard
modality: must_not
consequence: The redundant if-else pattern in buyOrSell() appears simple but masks a critical logic error; assuming it
works leads to production failures and incorrect trades
stage_ids:
- trade_decision
- id: finance-C-049
when: When price history is missing for a security
action: skip the security without logging or alerting
severity: medium
kind: rationalization_guard
modality: must_not
consequence: Silently skipping securities with missing price history leads to untracked positions, portfolio allocation
drift, and inability to verify whether rebalancing actually occurred for all securities
stage_ids:
- trade_decision
- id: finance-C-053
when: When creating TransactionType records for buy/sell operations
action: Create new TransactionType instances on each trade
severity: high
kind: domain_rule
modality: must_not
consequence: Creating duplicate TransactionType records wastes database rows and breaks referential integrity with existing
transaction data fixtures
stage_ids:
- trade_execution
- id: finance-C-054
when: When retrieving latest price for cost basis calculation
action: Verify price exists in cache before using it for order placement and cost basis
severity: high
kind: domain_rule
modality: must
consequence: If price is not found in cache, getLatestPriceBySecurityId() returns null, causing market orders with null
prices and zero cost basis records
stage_ids:
- trade_execution
- id: finance-C-055
when: When integrating with Tradier API
action: Claim real-time trading capability when using polling-based HTTP client
severity: high
kind: claim_boundary
modality: must_not
consequence: The system uses Symfony HTTPClient polling, not WebSocket or push notifications, so it cannot claim real-time
trading execution
stage_ids:
- trade_execution
- id: finance-C-056
when: When processing wash sale tracking
action: Claim complete wash sale detection when implementation hardcodes wash_sale=false
severity: high
kind: claim_boundary
modality: must_not
consequence: The wash sale detection is explicitly incomplete per TODO comment - all lots are marked with wash_sale=false
regardless of actual tax lot rules
stage_ids:
- trade_execution
- id: finance-C-057
when: When placing orders via Tradier API
action: Send order_type=market and duration=day parameters to broker
severity: medium
kind: resource_boundary
modality: must
consequence: Market orders with day duration are required; deviating from this may cause unexpected fill behavior or order
rejection
stage_ids:
- trade_execution
- id: finance-C-058
when: When configuring Tradier API endpoints
action: Use sandbox endpoint for testing and production endpoint for live trading
severity: high
kind: resource_boundary
modality: must
consequence: Using the wrong endpoint sends orders to the wrong environment; sandbox orders won't execute, and production
orders can't be tested
stage_ids:
- trade_execution
- id: finance-C-059
when: When placing orders via Tradier API
action: Include Authorization Bearer token in request headers
severity: high
kind: resource_boundary
modality: must
consequence: Missing authorization token will cause 401 Unauthorized responses from Tradier API and all orders will fail
stage_ids:
- trade_execution
- id: finance-C-060
when: When implementing trade execution for Wealthbot
action: Persist Lot and Position records atomically with Transaction via flush()
severity: high
kind: architecture_guardrail
modality: must
consequence: Without flush(), database may be in inconsistent state if subsequent operations fail - orphaned Lots without
Positions or vice versa
stage_ids:
- trade_execution
- id: finance-C-061
when: When creating Position records
action: Set Position status to POSITION_STATUS_IS_OPEN on new positions
severity: medium
kind: architecture_guardrail
modality: must
consequence: Incorrect position status prevents proper tracking of open vs closed positions in portfolio accounting
stage_ids:
- trade_execution
- id: finance-C-062
when: When creating Lot records
action: Set Lot status to LOT_IS_OPEN and link to Position via setPosition()
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing Lot-Position linkage breaks tax lot accounting and wash sale tracking functionality
stage_ids:
- trade_execution
- id: finance-C-063
when: When implementing wash sale detection
action: Skip wash sale detection when buying same security within 30 days of a loss sale
severity: high
kind: operational_lesson
modality: must_not
consequence: IRS wash sale rules require adjusting cost basis when substantially identical securities are purchased within
30 days before or after a loss sale
stage_ids:
- trade_execution
- id: finance-C-064
when: When recording transactions in the database
action: Set transaction tx_date to current datetime
severity: medium
kind: operational_lesson
modality: must
consequence: Incorrect transaction dates will cause wrong settlement calculations and portfolio value tracking errors
stage_ids:
- trade_execution
- id: finance-C-065
when: When presenting trade execution results
action: Claim guaranteed order fills or specific price execution
severity: medium
kind: claim_boundary
modality: must_not
consequence: Market orders may experience slippage; presenting simulated execution prices as guaranteed results misleads
users about actual trading outcomes
stage_ids:
- trade_execution
- id: finance-C-067
when: When implementing portfolio value recording logic
action: set each of the cash components (total_in_securities, total_cash_in_accounts, total_cash_in_money_market) equal
to the post-trade total value
severity: high
kind: domain_rule
modality: must
consequence: The simplified accounting model assumes all cash flows to money market after trades; violating this causes
billing_cash calculation errors and historical reconciliation failures
stage_ids:
- portfolio_value_update
- id: finance-C-068
when: When recording model_deviation in ClientPortfolioValue
action: hardcode model_deviation to exactly 4 (percent) for every rebalance record
severity: high
kind: domain_rule
modality: must
consequence: Model deviation controls trading band tolerance; incorrect values cause improper trade sizing or regulatory
compliance violations
stage_ids:
- portfolio_value_update
- id: finance-C-069
when: When persisting portfolio value records after trade execution
action: set sas_cash, cash_buffer, and billing_cash to zero in the new ClientPortfolioValue record
severity: high
kind: domain_rule
modality: must
consequence: Non-zero cash allocations at this stage misrepresent cash position, leading to incorrect billing fee calculations
and potential regulatory reporting errors
stage_ids:
- portfolio_value_update
- id: finance-C-070
when: When implementing financial calculations for portfolio value tracking
action: use floating-point types for monetary calculations, even though the current ORM mapping declares all currency fields as float
severity: high
kind: resource_boundary
modality: must_not
consequence: Floating-point precision errors in monetary calculations accumulate over time, causing discrepancies in billing
fees and custodian reconciliation
stage_ids:
- portfolio_value_update
- id: finance-C-071
when: When completing a rebalance job execution
action: update Job entity with finished_at timestamp and is_error=false upon successful completion
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing job completion timestamps cause monitoring systems to report hung jobs and prevent proper audit trail
for regulatory compliance
stage_ids:
- portfolio_value_update
- id: finance-C-072
when: When setting error state in job or rebalancer action
action: set is_error to true when marking successful job completion; error state must only be set on actual failure conditions
severity: high
kind: domain_rule
modality: must_not
consequence: Incorrect error state reporting misleads auditors and compliance reviewers about actual system failures and
trade execution outcomes
stage_ids:
- portfolio_value_update
- id: finance-C-073
when: When maintaining historical portfolio value records for billing
action: record each ClientPortfolioValue with a datetime timestamp to enable chronological ordering
severity: high
kind: operational_lesson
modality: must
consequence: Without time-series data, billing fee calculations cannot correctly apply tiered rates based on historical
portfolio values, causing revenue loss or client disputes
stage_ids:
- portfolio_value_update
- id: finance-C-075
when: When presenting backtest or historical portfolio value results
action: claim that simulated portfolio values equal expected live trading returns; time-series records are historical
snapshots, not forward projections
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting historical portfolio values as guarantees of future returns violates financial regulations and
misleads clients about expected investment outcomes
stage_ids:
- portfolio_value_update
- id: finance-C-076
when: When modifying updatePortfolioValues or ClientPortfolioValue entity
action: skip adding setDate() call; new records must always include timestamp for billing audit trail
severity: high
kind: rationalization_guard
modality: must_not
consequence: Omitting date from new ClientPortfolioValue records breaks chronological queries and prevents billing fee
calculations from finding relevant portfolio values
stage_ids:
- portfolio_value_update
- id: finance-C-078
when: When processing SecurityPrice records for price_diff calculation
action: Query exactly the 2 most recent prices per security, ordered by datetime DESC, to ensure old_price represents
the previous trading session
severity: high
kind: domain_rule
modality: must
consequence: Price diff ratios become meaningless if prices come from non-consecutive sessions, causing incorrect buy/sell
signals and potential regulatory violations
- id: finance-C-079
when: When passing trade decisions from portfolio_analysis to trade_execution
action: Only include securities where prices_diff deviates beyond 15% tolerance band (prices_diff > 1.15 OR prices_diff
< 0.85)
severity: high
kind: domain_rule
modality: must
consequence: Excessive trading from over-sensitive thresholds causes unnecessary transaction costs, tax events, and potential
wash sale violations
- id: finance-C-081
when: When persisting executed trade records to database
action: Persist Position, Lot, and Transaction entities together in same database flush to maintain referential integrity
severity: high
kind: architecture_guardrail
modality: must
consequence: Partial record creation leaves orphaned positions or lots without transactions, corrupting cost basis tracking
and tax lot reports
- id: finance-C-082
when: When creating Lot records for executed trades
action: Set cost_basis to the latest price retrieved from the shared prices array via getLatestPriceBySecurityId
severity: high
kind: domain_rule
modality: must
consequence: Incorrect cost basis causes wrong gain/loss calculations, leading to tax filing errors and potential IRS
penalties
- id: finance-C-083
when: When updating portfolio values after trade execution
action: Record model_deviation value from the rebalance analysis with each new ClientPortfolioValue record
severity: medium
kind: domain_rule
modality: must
consequence: Missing model deviation tracking prevents regulatory audit of portfolio drift and violates fiduciary duty
documentation requirements
- id: finance-C-085
when: When transferring trade decision data between portfolio_analysis and trade_decision stages
action: Pass the security_id from the model entity to enable security lookup in trade execution phase
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing security_id prevents security entity resolution, causing null-reference errors and failed order placement
- id: finance-C-086
when: When marking Job and RebalancerAction as completed after rebalance execution
action: Set finished_at timestamp and is_error=false only after all trades and portfolio updates have been persisted
successfully
severity: medium
kind: architecture_guardrail
modality: must
consequence: Premature job completion marking causes incorrect operational status, potentially triggering duplicate rebalance
runs or skipped scheduled jobs
- id: finance-C-087
when: When the system uses Tradier brokerage API for live trading
action: Claim real-time trading capability; Tradier provides delayed market data (typically 15-minute delay) and sandbox
mode is available for testing
severity: high
kind: claim_boundary
modality: must_not
consequence: Marketing real-time capability when using delayed data sources violates securities advertising regulations
and misleads clients about trade execution timing
- id: finance-C-089
when: When performing initial rebalance for a new ClientPortfolio
action: Use getValueSum across all client accounts to calculate total portfolio value before computing position sizes
severity: high
kind: domain_rule
modality: must
consequence: Incorrect total value causes mis-sized initial positions, leading to portfolio allocation errors at inception
that compound over time
- id: finance-C-090
when: When passing ClientPortfolio entity to portfolio value update
action: Pass the same ClientPortfolio instance that was analyzed, not a re-loaded instance with potentially different
state
severity: medium
kind: architecture_guardrail
modality: must
consequence: State mismatch between analyzed portfolio and recorded portfolio causes model deviation calculations to reference
incorrect baseline allocation
- id: finance-C-093
when: When recording job execution history for audit trail compliance
action: Set started_at timestamp at job creation, finished_at timestamp at job completion, and is_error flag to track
execution status
severity: high
kind: architecture_guardrail
modality: must
consequence: Audit trail becomes incomplete, making it impossible to prove regulatory compliance for automated trading
decisions
- id: finance-C-094
when: When representing a portfolio position for a specific account, security, and date
action: Aggregate multiple Lots under a single Position entity for cost basis and tax reporting purposes
severity: high
kind: architecture_guardrail
modality: must
consequence: Tax lot tracking breaks, preventing accurate gain/loss calculations and wash sale detection
- id: finance-C-096
when: When determining rebalance actions for client portfolios
action: Apply hardcoded drift thresholds of 0.85 (buy trigger) and 1.15 (sell trigger) to price deviation ratios
severity: high
kind: domain_rule
modality: must
consequence: Portfolio drift may exceed acceptable thresholds without triggering rebalancing, causing prolonged portfolio
misalignment with target allocations
- id: finance-C-097
when: When initializing a new Job entity for rebalancing operations
action: 'Set rebalance_type to one of the defined constants: FULL(0), REQUIRED_CASH(1), FULL_AND_TLH(2), NO_ACTIONS(3),
or INITIAL(4)'
severity: high
kind: domain_rule
modality: must
consequence: Invalid rebalance type causes undefined rebalancing behavior or silent failures in trade execution
- id: finance-C-098
when: When managing client onboarding workflow state transitions
action: 'Progress client_status through the defined sequence: DEFAULT(0)→ENVELOPE_CREATED(1)→ENVELOPE_OPENED(2)→ENVELOPE_COMPLETED(3)→PORTFOLIO_PROPOSED(4)→PORTFOLIO_CLIENT_ACCEPTED(5)→ACCOUNT_OPENED(6)→ACCOUNT_FUNDED(7)'
severity: high
kind: architecture_guardrail
modality: must
consequence: Clients may proceed to trade execution before completing required paperwork, violating regulatory requirements
- id: finance-C-099
when: When implementing trade execution logic in Rebalancer classes
action: Reuse trade execution logic through the Trade trait composition rather than duplicating buy/sell logic
severity: medium
kind: architecture_guardrail
modality: must
consequence: Duplicated trade logic diverges over time, causing inconsistent order placement and potential order execution
failures
- id: finance-C-100
when: When fetching security price data for trading decisions
action: Retrieve the two most recent price records and compute price_diff as current_price / old_price ratio
severity: high
kind: domain_rule
modality: must
consequence: Without historical price comparison, drift calculations produce invalid ratios, triggering incorrect buy/sell
decisions
- id: finance-C-102
when: When presenting or marketing this system as a wealth management solution
action: Claim support for high-frequency trading strategies — the system uses polling-based price updates and is designed
for periodic rebalancing
severity: high
kind: claim_boundary
modality: must_not
consequence: Users deploy the system for latency-sensitive strategies where it cannot perform, leading to poor trading
outcomes
- id: finance-C-103
when: When presenting or marketing this system for international markets
action: Claim support for non-US markets without Tradier broker support — Tradier only supports US securities
severity: high
kind: claim_boundary
modality: must_not
consequence: Users attempt to trade international securities through US-focused broker integration, causing all orders
to fail
- id: finance-C-104
when: When presenting or marketing this system for real-time trading applications
action: Claim real-time streaming price feed support — the system uses batch polling for price updates with inherent latency
severity: high
kind: claim_boundary
modality: must_not
consequence: Users rely on stale price data for time-sensitive trading decisions, causing trades at outdated prices
- id: finance-C-105
when: When integrating this system with brokerage or custodian services
action: Claim support for custodians other than Tradier — the Rebalancer is hardcoded to Tradier API endpoints
severity: high
kind: claim_boundary
modality: must_not
consequence: Users attempt integration with unsupported brokers, causing all API calls to fail with authentication or
endpoint errors
- id: finance-C-106
when: When offering this software to end clients or making public claims about the system
action: Present simulated or sandbox test results as proof of live trading capability — the system is offered as-is without
guarantees
severity: high
kind: claim_boundary
modality: must_not
consequence: Clients make capital allocation decisions based on unverified backtest or test environment results, leading
to unexpected live trading outcomes
- id: finance-C-107
when: When deploying the wealthbot system in production
action: Run on PHP 7.4 with Symfony 4.4 and Doctrine ORM 2.7 as specified in composer.json requirements
severity: high
kind: resource_boundary
modality: must
consequence: Compatibility issues arise with other PHP or framework versions, causing runtime errors or unpredictable
behavior
- id: finance-C-108
when: When calculating short-term vs long-term capital gains for tax reporting
action: Classify a Lot as short-term if held less than 365 days, long-term if held 365 days or more
severity: high
kind: domain_rule
modality: must
consequence: Incorrect tax classification leads to inaccurate tax reporting and potential IRS penalties
- id: finance-C-109
when: When implementing asset class handling in portfolio models and allocation calculations
action: Assign asset classes outside the STOCKS and BONDS enumeration — only these two values are supported
severity: high
kind: domain_rule
modality: must_not
consequence: Assigning unrecognized asset class values causes allocation calculation failures, producing incorrect portfolio
weights that lead to inappropriate investment decisions
derived_from_bd_id: BD-010
- id: finance-C-110
when: When configuring minimum trade sizes for equity securities
action: Verify that default minimums ($50 buy, $150 initial buy, $50 sell) match the actual broker fee structure and investor
contribution patterns before using them in backtesting or live trading
severity: medium
kind: operational_lesson
modality: should
consequence: Using inappropriate minimum trade sizes causes fee dilution on small transactions or prevents meaningful
portfolio establishment, distorting backtest results
derived_from_bd_id: BD-002
- id: finance-C-111
when: When implementing rebalancer action processing logic
action: Execute all SELL orders within each atomic rebalancer action before executing any BUY orders; do not interleave
or parallelize sell and buy executions
severity: high
kind: domain_rule
modality: must
consequence: Buy-before-sell or interleaved execution can cause cash account overdrafts in live trading when capital is
insufficient, resulting in rejected orders and backtest-live inconsistency
derived_from_bd_id: BD-001
- id: finance-C-112
when: When implementing concrete form handler subclasses extending AbstractFormHandler
action: Implement the success() method in every concrete handler; a concrete PHP class that leaves an abstract method
unimplemented triggers a fatal error, preventing form submission completion
severity: medium
kind: operational_lesson
modality: must
consequence: A missing success() implementation triggers a fatal error when the handler class is loaded, causing form
submission workflows to fail
derived_from_bd_id: BD-075
- id: finance-C-113
when: When integrating with multiple custodians that use different account classification taxonomies
action: Use client-to-system account type adapter to normalize account groups — do not bypass the adapter for direct custodian
integration or assume universal type compatibility
severity: high
kind: architecture_guardrail
modality: must
consequence: Bypassing the account type adapter causes mismatched account classifications across custodians, leading to
incorrect portfolio aggregation and reporting
derived_from_bd_id: BD-030
- id: finance-C-115
when: When implementing or configuring portfolio rebalancing logic
action: Implement hybrid rebalancing combining both asset class level and subclass level in a single rebalancing run —
rebalancing method must be either asset class level OR subclass level exclusively
severity: high
kind: domain_rule
modality: must_not
consequence: Hybrid rebalancing violates the either-or boundary, causing either duplicate rebalancing actions or conflicting
allocation targets that lead to incorrect portfolio distributions
derived_from_bd_id: BD-012
- id: finance-C-116
when: When calculating security transaction fees for backtesting or live trading
action: Apply both fixed fee component ($250) and percentage component (10%) to each transaction — verify the dual-component
structure matches actual broker fee schedules
severity: medium
kind: operational_lesson
modality: should
consequence: Using only the percentage component underestimates transaction costs by approximately $250 per trade, making
high-frequency strategies appear 5-15% more profitable than they actually are in live trading
derived_from_bd_id: BD-003
- id: finance-C-117
when: When implementing concrete voter classes extending BaseVoter
action: Implement the vote() method returning boolean and handle both OWNER and OPERATOR attribute types as defined in
BaseVoter; vote() must not remain unimplemented
severity: high
kind: domain_rule
modality: must
consequence: A missing vote() implementation causes a fatal error at runtime, crashing authorization checks; adding
new attribute types without updating all concrete voters creates silent denial where legitimate access requests are
rejected without errors
derived_from_bd_id: BD-076
- id: finance-C-118
when: When implementing user authorization logic using group membership
action: Validate that every user belongs to at least one group before calling getGroups()->first(); implement explicit
checks or use null-safe alternatives so that an empty collection is handled instead of dereferenced
severity: high
kind: domain_rule
modality: must
consequence: Calling getGroups()->first() on a user with no group memberships yields no group object, causing authorization
failures that prevent users from accessing protected resources even when they should have valid permissions
derived_from_bd_id: BD-071
- id: finance-C-120
when: When applying Markov chain models to time-varying financial phenomena
action: Use time-inhomogeneous transition matrices with regime detection when data spans periods where market conditions
change — for long historical windows or volatile instruments, do not apply time-homogeneous Markov assumptions
severity: high
kind: domain_rule
modality: must
consequence: Applying time-homogeneous Markov models to non-stationary financial data produces incorrect transition probabilities,
leading to flawed state predictions that cause poor trading decisions and financial losses
derived_from_bd_id: BD-GAP-009
- id: finance-C-121
when: When configuring API keys for admin/ria/admin roles in production environments
action: Rely on the hardcoded [email protected] fallback account for privileged operations — specific API keys must
be configured; fallback usage must be logged and alerted as a security event
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Using the hardcoded fallback bypasses normal API credential rotation practices, creating a persistent security
vulnerability if the hardcoded credentials are compromised or become outdated
derived_from_bd_id: BD-080
- id: finance-C-122
when: When implementing billing fee calculation logic for RIA clients
action: Support both TIER=1 (percentage of AUM) and FLAT=2 (fixed periodic) fee calculation paths based on the fee type
parameter; TIER calculates fee as percentage of assets under management while FLAT charges a fixed periodic amount
severity: high
kind: domain_rule
modality: must
consequence: Incorrect fee type handling causes billing to apply wrong calculation method, resulting in either overcharging
clients (reputation damage, refunds) or undercharging (revenue loss); strategies relying on accurate fee projection
will have incorrect profitability estimates
derived_from_bd_id: BD-017
- id: finance-C-123
when: When implementing tiered fee structure calculations for clients with large AUM
action: Use INFINITY constant of 1000000000000 (1 trillion) as the upper boundary for unbounded top tier; do not use NULL,
actual infinity, or zero for the highest tier ceiling
severity: high
kind: domain_rule
modality: must
consequence: Using NULL for tier ceiling causes calculation errors when processing high-AUM clients; using actual infinity
risks numeric overflow; both scenarios produce incorrect fee calculations and potential billing failures for large accounts
derived_from_bd_id: BD-018
- id: finance-C-124
when: When implementing redemption penalty logic for equity securities
action: 'Apply 21-day holding period check: if shares held less than 21 days, calculate penalty as $15 fixed fee plus
5% of redemption amount; skip penalty if holding period exceeds 21 days'
severity: high
kind: domain_rule
modality: must
consequence: Incorrect holding period calculation applies or omits penalties incorrectly, causing either investor disputes
(overcharged) or regulatory non-compliance (undercharged); accumulated errors in penalty calculation distort net redemption
returns and tax basis adjustments
derived_from_bd_id: BD-004
- id: finance-C-126
when: When calculating gain or loss percentages for performance reporting and tax triggers
action: 'Calculate percentage using formula: (marketValue - costBasis) / costBasis; this normalizes gain/loss figures
to enable scale comparison across positions of different sizes'
severity: high
kind: domain_rule
modality: must
consequence: Using absolute dollar gain instead of percentage misleads performance reporting for different position sizes;
a $100 gain on a $1000 position (10%) differs materially from $100 gain on $100000 position (0.1%), causing incorrect
rebalancing and tax trigger decisions
derived_from_bd_id: BD-056
- id: finance-C-128
when: When processing rebalancer queue operations using status constants
action: Use STATUS_SELL='sell' and STATUS_BUY='buy' string constants for queue status values; do not replace with boolean
flags or numeric codes in queue processing logic
severity: medium
kind: operational_lesson
modality: must
consequence: Replacing string constants with boolean flags or magic numbers breaks queue processing readability and extensibility;
existing code consuming these statuses will fail with undefined constant errors, causing rebalancing operations to halt
derived_from_bd_id: BD-081
- id: finance-C-130
when: When implementing or refactoring trade reconciliation aggregation logic
action: Group reconciliation by account, subclass, and security dimensions to capture all relevant aggregation levels
severity: high
kind: domain_rule
modality: must
consequence: Incorrect grouping dimensions will produce wrong position aggregations across reconciliation reporting, leading
to materially incorrect account positions
derived_from_bd_id: BD-031
- id: finance-C-131
when: When implementing RIA fee calculations or evaluating trade profitability
action: Verify that minimum economically viable trade size of $2,500 is enforced before triggering rebalancing; trades
below this threshold lose money on fees alone (fixed $250 + 10%)
severity: medium
kind: operational_lesson
modality: should
consequence: Trades below $2,500 will result in negative fee economics where the RIA pays to rebalance rather than earns;
small rebalancing adjustments within the 15% deviation band are likely unprofitable
derived_from_bd_id: BD-082
- id: finance-C-132
when: When configuring or implementing rebalancing tolerance bands
action: Verify that configured tolerance bands align with actual rebalancing triggers; the hardcoded 15% threshold acts
as a floor on all configured bands, making bands below 15% ineffective
severity: high
kind: operational_lesson
modality: must
consequence: A subclass configured with 5% tolerance will not trigger rebalancing until 15% drift, causing client expectation
mismatch and delayed corrective action
derived_from_bd_id: BD-084
- id: finance-C-133
when: When implementing or refactoring price collection and timestamp handling
action: Assume price data timestamps are unambiguous; the framework does not clearly distinguish between as-of date and
processing timestamp
severity: high
kind: claim_boundary
modality: must_not
consequence: Without clear as-of vs processing time distinction, prices may be attributed to incorrect trading days, causing
incorrect historical valuations and P&L calculations
derived_from_bd_id: BD-GAP-001
- id: finance-C-134
when: When implementing price data ingestion and storage
action: Define and enforce separate as_of_date and processed_timestamp fields for every price record; validate that as_of_date
represents the market date and processed_timestamp represents the database write time
severity: high
kind: domain_rule
modality: must
consequence: Without explicit timestamp semantics, backtest results will reflect incorrect price attributions, potentially
overstating or understating returns
derived_from_bd_id: BD-GAP-001
- id: finance-C-135
when: When implementing or refactoring any stochastic components
action: Assume stochastic operations produce consistent results across runs; the framework does not verify random seed
coverage for all random number generators
severity: high
kind: claim_boundary
modality: must_not
consequence: Without proper random seed management, backtest results become non-reproducible, making it impossible to
validate strategy performance consistently
derived_from_bd_id: BD-GAP-003
- id: finance-C-136
when: When implementing any stochastic operations or random sampling
action: Initialize random seed explicitly at system startup and propagate seed configuration to every random number generator;
document each stochastic operation requiring seed management
severity: high
kind: domain_rule
modality: must
consequence: Inconsistent random seeds across runs produce non-reproducible backtest results, preventing strategy validation
and regulatory audit trails
derived_from_bd_id: BD-GAP-003
- id: finance-C-137
when: When implementing or refactoring monetary value storage and calculations
action: Assume float or double types are sufficient for storing monetary values; floating-point representation causes
precision loss in financial calculations
severity: high
kind: claim_boundary
modality: must_not
consequence: Floating-point precision errors in monetary values cause rounding discrepancies that accumulate over high-volume
transactions, resulting in incorrect fee calculations and reconciliation failures
derived_from_bd_id: BD-GAP-013
- id: finance-C-138
when: When implementing monetary value storage and calculations
action: Migrate all monetary value fields from float/double to DECIMAL(19,4) type; verify that all calculations preserve
precision and use decimal arithmetic instead of binary floating-point
severity: high
kind: domain_rule
modality: must
consequence: Float-based monetary storage causes rounding errors that lead to incorrect billing amounts and reconciliation
discrepancies in production
derived_from_bd_id: BD-GAP-013
- id: finance-C-139
when: When implementing risk rating assignment logic in portfolio model classification
action: Enforce non-negative risk rating via setter validation - reject any negative values at object construction to
prevent invalid portfolio model state
severity: high
kind: domain_rule
modality: must
consequence: Allowing negative risk ratings would break score-mapping algorithms and produce invalid portfolio model classifications
throughout the entire system
derived_from_bd_id: BD-027
- id: finance-C-140
when: When implementing billing fee calculation for RIA business model relationships
action: 'Use correct relationship type enumeration values: License Fee (0) for direct RIA relationship, TAMP (1) for third-party
manager overlay - verify type before fee arrangement and reporting'
severity: high
kind: domain_rule
modality: must
consequence: Using wrong relationship type values causes incorrect fee treatment and billing calculations, leading to
revenue discrepancies and compliance issues with fee reporting
derived_from_bd_id: BD-035
- id: finance-C-141
when: When implementing tiered fee calculation logic in FeeManager
action: Sort fee tiers by breakpoint from lowest to highest before iterating - apply marginal rates in tier order to calculate
correct cumulative fees across AUM levels
severity: high
kind: domain_rule
modality: must
consequence: Unsorted tier processing produces incorrect cumulative fees by applying marginal rates in wrong order, leading
to billing errors and client disputes
derived_from_bd_id: BD-036
- id: finance-C-142
when: When implementing rebalancing frequency selection logic for portfolio management
action: 'Select rebalancing frequency only from the defined set: Quarterly, Semi-Annual, Annual, or Tolerance Bands -
invalid frequency selection breaks rebalancing strategy execution'
severity: high
kind: domain_rule
modality: must
consequence: Using invalid rebalancing frequency causes incorrect rebalancing strategy selection, leading to either excessive
trading costs from over-rebalancing or portfolio drift from under-rebalancing
derived_from_bd_id: BD-013
- id: finance-C-143
when: When implementing rebalancer job type execution logic
action: 'Map job rebalance type enumeration correctly: FULL (0) for complete reallocation, REQUIRED_CASH (1) for distributions
only, FULL_AND_TLH (2) for tax-loss harvesting, NO_ACTIONS (3) for validation only, INITIAL (4) for baseline establishment'
severity: high
kind: domain_rule
modality: must
consequence: Using incorrect job rebalance type values causes rebalancer to execute wrong action set, potentially triggering
unauthorized trades or missing required rebalancing
derived_from_bd_id: BD-014
- id: finance-C-144
when: When implementing tax lot gain/loss calculation for tax reporting
action: Check the cost-basis-known flag before attempting gain/loss calculation - proceed only when the flag indicates the purchase
price is available; track lots with unknown cost basis separately for position reporting but not for tax reporting
severity: high
kind: domain_rule
modality: must
consequence: Attempting gain/loss calculation without known cost basis produces incorrect tax calculations, potentially
causing compliance violations and incorrect tax liability reporting
derived_from_bd_id: BD-042
- id: finance-C-145
when: When implementing questionnaire risk scoring logic in the risk assessment stage
action: Implement point-based scoring that accumulates responses into a risk score, with special handling for withdraw-age
to adjust scoring for liquidity needs and time horizon impact
severity: high
kind: domain_rule
modality: must
consequence: Incorrect risk scoring causes clients to be assigned to inappropriate portfolio models, leading to misaligned
risk tolerance and potential regulatory compliance issues in wealth management
derived_from_bd_id: BD-037
- id: finance-C-146
when: When defining status constants for RebalancerAction and Job entities
action: Use identical constant values for rebalance_type in Job and status in RebalancerAction to maintain cross-entity
status alignment
severity: high
kind: domain_rule
modality: must
consequence: Mismatched status constants between Job and RebalancerAction cause state translation errors, resulting in
incorrect rebalancing status reporting and potential duplicate or missed trades
derived_from_bd_id: BD-041
- id: finance-C-147
when: When implementing rebalancing queue entry logic in the trading stage
action: Store amount in RebalancerQueue as target trade value in dollars, not share quantity; conversion to share quantity
must occur at execution time using current price
severity: high
kind: domain_rule
modality: must
consequence: Storing share quantity instead of dollar value causes position sizing to be incorrect when price changes
between queue entry and execution, leading to unintended over/under-allocation
derived_from_bd_id: BD-046
- id: finance-C-148
when: When implementing portfolio model selection based on client risk score
action: Map client risk score to nearest available portfolio model using closest-match algorithm with ties resolving to
lower risk rating
severity: medium
kind: operational_lesson
modality: must
consequence: Using an exact-match-only algorithm leaves clients whose scores match no model exactly without coverage,
causing allocation failures or assignment to wildly inappropriate risk levels
derived_from_bd_id: BD-016
- id: finance-C-149
when: When calculating tiered billing fees for registered investment advisors
action: Apply minimum billing fee as a floor AFTER calculating tiered percentage fees, ensuring RIA receives baseline
compensation regardless of low AUM accounts
severity: high
kind: operational_lesson
modality: must
consequence: Applying floor before tiered calculation or omitting the floor causes RIA to receive inadequate compensation
on small accounts, breaking economic expectations and potentially causing revenue shortfalls
derived_from_bd_id: BD-019
- id: finance-C-150
when: When implementing tax loss harvesting activation logic
action: Enable TLH only when BOTH configurable dollar threshold AND percent threshold are exceeded; use whichever threshold
is more conservative for the specific account context
severity: high
kind: operational_lesson
modality: must
consequence: Activating TLH without threshold checks harvests trivial losses with disproportionate wash sale risk, causing
unnecessary trading costs and potential wash sale rule violations that eliminate tax benefits
derived_from_bd_id: BD-021
- id: finance-C-152
when: When implementing allocation value conversion between dollar amounts and percentages
action: 'Use exact formulas for bidirectional conversion: dollar-to-percent as (value / totalAmount) * 100, and percent-to-dollar
as (value * totalAmount) / 100'
severity: high
kind: domain_rule
modality: must
consequence: Using different or approximated conversion formulas causes allocation display and calculation mismatches,
leading to confusing user interfaces and incorrect position sizing across the system
derived_from_bd_id: BD-054
- id: finance-C-153
when: When implementing commission cost estimation for model securities
action: Extract commission range as min/max across all securities in the model by finding minimum and maximum transaction
fee values (not average)
severity: high
kind: domain_rule
modality: must
consequence: Using average fee instead of min/max range masks the full cost spectrum, causing cost estimation to be inaccurate
for clients and violating fee disclosure requirements
derived_from_bd_id: BD-063
- id: finance-C-154
when: When implementing model deletion logic in CeModelManager
action: Decrement risk ratings of child models when their rating exceeds the deleted parent's rating, ensuring no orphaned
child has a rating higher than the new top-level ancestor
severity: high
kind: domain_rule
modality: must
consequence: Without proper child rating adjustment, deleted parent scenarios leave orphaned children with risk ratings
exceeding their new ancestors, corrupting the hierarchical risk rating system and causing incorrect risk model assignments
derived_from_bd_id: BD-065
- id: finance-C-155
when: When implementing account cash calculation in CashCalculationManager
action: Sum free cash balances across all linked accounts before the specified date, excluding pending trades, reserved
amounts, and funds in transit
severity: high
kind: domain_rule
modality: must
consequence: Using total cash including pending trades overstates available funds, causing withdrawal requests or rebalancing
to fail in live trading and creating inconsistent cash flow projections between backtest and live execution
derived_from_bd_id: BD-067
- id: finance-C-156
when: When implementing fee tier logic using the INFINITY constant
action: Use INFINITY constant to represent unbounded top fee tier, ensuring no special-case handling for highest tier
and proper tier boundary calculations
severity: medium
kind: operational_lesson
modality: must
consequence: Using NULL or an arbitrarily large value for the unbounded tier causes calculation errors or database constraint
failures, resulting in incorrect fee calculations that either overcharge clients or undercharge the firm
derived_from_bd_id: BD-044
- id: finance-C-157
when: When implementing risk tolerance scoring and model matching logic
action: Calculate risk tolerance by starting at 50 and summing additive points across questionnaire answers, then select
the model with minimum absolute difference between score and model risk rating
severity: high
kind: domain_rule
modality: must
consequence: Using mode-based matching instead of additive scoring with closest-match selection causes incorrect risk
model assignments, exposing clients to inappropriate risk profiles and potential regulatory compliance issues
derived_from_bd_id: BD-053
- id: finance-C-158
when: When implementing order validation logic in the price_collection stage
action: Assume the framework provides security-specific tick_size and lot_size fields with order quantity validation —
these fields do not exist in the current implementation
severity: high
kind: claim_boundary
modality: must_not
consequence: Without tick_size and lot_size validation, orders may be submitted with invalid quantities causing rejection
at broker API level, leading to missed trades and position tracking gaps in production
derived_from_bd_id: BD-GAP-016
- id: finance-C-159
when: When implementing order submission in the price_collection stage
action: 'Add security-specific tick_size and lot_size fields to the security data model and implement order quantity validation:
quantity must be divisible by lot_size and price must match tick_size increments'
severity: high
kind: domain_rule
modality: must
consequence: Orders with quantities not matching lot_size increments will be rejected by exchanges, causing order failures
and incomplete position fills that corrupt portfolio accounting
derived_from_bd_id: BD-GAP-016
- id: finance-C-160
when: When implementing portfolio workflow routing logic
action: Modify portfolio processing mode enum values (Straight-Through=1, Collaborative=2) or consolidate modes — changing
these values alters workflow routing and may bypass required compliance checkpoints
severity: high
kind: domain_rule
modality: must_not
consequence: Changing processing mode values can route portfolios incorrectly, bypassing advisor review for Collaborative
accounts or forcing unnecessary review for Straight-Through accounts, violating RIA compliance requirements
derived_from_bd_id: BD-024
- id: finance-C-161
when: When implementing model type handling in portfolio_value_update
action: Maintain model_type enum as STRATEGY or CUSTOM, and preserve inheritance validation for CUSTOM models derived
from STRATEGY templates — do not allow model_type changes without proper inheritance revalidation
severity: high
kind: architecture_guardrail
modality: must
consequence: Changing model types without inheritance revalidation breaks template derivation, causing CUSTOM models to
lose reference to parent strategies and corrupting portfolio allocation calculations
derived_from_bd_id: BD-026
- id: finance-C-162
when: When implementing position status updates in the accounting stage
action: Use all four position status values (INITIAL=1, IS_OPEN=2, IS_CLOSE=3, NOT_VERIFIED=4) correctly - IS_OPEN for
active positions, IS_CLOSE for closing transactions, NOT_VERIFIED for positions pending external confirmation; do not
assume positions are verified by default
severity: high
kind: domain_rule
modality: must
consequence: Assuming positions are verified by default causes unconfirmed positions to be treated as settled, leading
to incorrect portfolio valuation and potential settlement failures when external confirmations do not arrive
derived_from_bd_id: BD-008
- id: finance-C-163
when: When implementing rebalancing logic using tolerance bands in portfolio_value_update
action: Verify tolerance band values against actual account risk tolerance before applying — the default 2-21% range is
a starting estimate that should be customized per account; equity-focused accounts typically need tighter bands (3-5%)
while fixed-income accounts tolerate wider drift (15-20%)
severity: medium
kind: operational_lesson
modality: should
consequence: Using default tolerance bands without verification causes incorrect rebalancing triggers — too tight creates
excessive trading costs, too loose allows dangerous portfolio drift from target allocation
derived_from_bd_id: BD-011
- id: finance-C-164
when: When implementing trade execution or settlement quantity calculation in RebalancerQueue
action: 'Use correct quantity type in SQL queries: ''AS'' (all shares) for full position liquidation, ''S'' (some shares)
for partial adjustment — never substitute one for the other'
severity: high
kind: domain_rule
modality: must
consequence: Using 'S' instead of 'AS' causes partial adjustment logic to execute when full liquidation was intended,
resulting in incorrect trade sizing that leaves residual positions and misaligns portfolio allocation
derived_from_bd_id: BD-032
- id: finance-C-165
when: When implementing tax-loss harvesting logic and wash sale compliance checks
action: 'Respect the tlh_buy_back_original flag: when disabled, prevent repurchase of the same security after harvesting
losses; when enabled, allow repurchase after wash sale period — never hardcode repurchase behavior without checking
this flag'
severity: high
kind: domain_rule
modality: must
consequence: Ignoring the tlh_buy_back_original flag can cause wash sale violations where the same security is repurchased
too soon, disallowing the tax loss deduction and resulting in unexpected tax liability
derived_from_bd_id: BD-040
- id: finance-C-166
when: When using default rebalancing thresholds ($250 fixed fee, $50 min trades, 15% drift threshold) for small accounts
below $17,000
action: 'Verify that rebalancing is economically viable before triggering: check if expected trade size would meet minimum
economic threshold ($2,500 trade minimum derived from fee structure) — if not, either alert the user to the dead zone
or adjust thresholds proportionally'
severity: medium
kind: operational_lesson
modality: should
consequence: The risk cascade of BD-068 (15% threshold) + BD-003 ($250 fixed fee) + BD-002 ($50 min trades) creates a
dead zone where drift exceeding tolerance cannot be economically corrected, leaving small accounts permanently misaligned
without any resolution path
derived_from_bd_id: BD-087
- id: finance-C-167
when: When executing household-level rebalancing that spans multiple accounts
action: Check both account-level and household-level drift states after executing cross-account trades — verify net household
positions reconcile properly across accounts since account-level grouping (BD-031) does not capture aggregate household
allocation
severity: medium
kind: operational_lesson
modality: should
consequence: Household aggregation during rebalancing creates reconciliation mismatches where individual accounts appear
balanced while the household view shows drift, causing tax lot optimization issues and requiring manual reconciliation
of cross-account positions
derived_from_bd_id: BD-090
- id: finance-C-168
when: When implementing or modifying RebalancerQueue execution logic
action: Interpret quantity field as share count, not dollar amount — use amount field for dollar-targeting strategies
severity: high
kind: domain_rule
modality: must
consequence: Using the quantity field as a dollar value causes position size miscalculations because dollar figures are treated as
share counts, resulting in orders for incorrect share volumes that either fail execution or create unintended portfolio
exposure
derived_from_bd_id: BD-047
- id: finance-C-170
when: When computing fees in FeeManager for billing periods with partial service
action: Prorate fees using formula (fee * daysWorked) / daysInPeriod, rounded to 2 decimal places — never charge full
period fee for partial service
severity: high
kind: domain_rule
modality: must
consequence: Charging full-period fees to clients with mid-period account openings or closures overcharges them for unused
service days, causing financial losses to clients and potential compliance violations with fee transparency requirements
derived_from_bd_id: BD-055
- id: finance-C-172
when: When determining portfolio total value in ClientPortfolioManager
action: First check for cached latest ClientPortfolioValue record before falling back to summing linked ClientAccount
values — always attempt cached lookup first for performance
severity: high
kind: domain_rule
modality: must
consequence: Skipping cached ClientPortfolioValue lookup and always summing ClientAccount values causes unnecessary database
queries on every call, degrading system performance for portfolios with many linked accounts
derived_from_bd_id: BD-064
- id: finance-C-173
when: When implementing model copying functionality for CeModel
action: Verify that model copying preserves transaction cost assumptions including commission min/max settings, and test
that copied models produce consistent net return calculations without re-entering parameters
severity: medium
kind: operational_lesson
modality: should
consequence: If model copy does not preserve commission min/max settings, copied models will use default values causing
net return projections to diverge from the original, leading to incorrect portfolio performance estimates
derived_from_bd_id: BD-028
- id: finance-C-174
when: When configuring expected return assumptions for portfolio projections
action: Verify expected performance tracking is implemented at the asset subclass level rather than asset class level,
as different subclasses have different return profiles requiring granular projection modeling
severity: high
kind: operational_lesson
modality: must
consequence: Using asset-class-level expected returns instead of subclass-level granularity causes projection precision
loss, because each subclass has distinct return characteristics that class-level aggregation would mask
derived_from_bd_id: BD-029
- id: finance-C-175
when: When setting expected performance values for return projections
action: Validate that expected performance values fall within the 3% to 10% realistic range (3% floor prevents undue pessimism;
10% ceiling reflects market expectations), and reject or flag values outside this range as unrealistic
severity: medium
kind: operational_lesson
modality: should
consequence: Unbounded expected performance values can produce unrealistic portfolio projections, either overly optimistic
or arbitrarily pessimistic, so that long-term planning rests on flawed assumptions
derived_from_bd_id: BD-033
- id: finance-C-176
when: When implementing trade execution logic with multiple asset subclasses requiring action
action: Implement subclass priority ordering using numeric sequence to ensure consistent processing order across runs,
as this determines which positions rebalance first when capital is constrained and affects cash flow timing
severity: medium
kind: operational_lesson
modality: should
consequence: Without consistent subclass priority ordering, trade execution sequence becomes unpredictable causing different
cash flow timing and settlement order, leading to inconsistent portfolio rebalancing results across backtest runs
derived_from_bd_id: BD-034
- id: finance-C-177
when: When implementing price_collection stage functionality
action: 'Assume the framework implements the missing '': 7'' functionality — this feature is absent from the framework
as confirmed by code audit'
severity: high
kind: claim_boundary
modality: must_not
consequence: 'Missing functionality '': 7'' in price_collection stage causes undefined behavior when the feature is expected
by strategy code, leading to silent failures or incorrect results'
derived_from_bd_id: BD-GAP-006
- id: finance-C-178
when: When implementing backtesting logic in price_collection stage
action: Implement walk-forward analysis, out-of-sample testing, or Monte Carlo simulations to detect and mitigate overfitting
in strategy parameter selection
severity: high
kind: domain_rule
modality: must
consequence: Without backtest overfitting protection, strategies that appear highly profitable in-sample fail catastrophically
in live trading, causing significant capital losses
derived_from_bd_id: BD-GAP-006
- id: finance-C-179
when: When implementing quantitative strategy backtesting logic
action: Assume the framework automatically prevents backtest overfitting — the framework does not implement walk-forward
analysis, out-of-sample validation, or overfitting detection
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit overfitting protection, strategies optimized on historical data exhibit severe curve-fitting,
causing live trading returns to diverge dramatically from backtested results
derived_from_bd_id: BD-GAP-007
- id: finance-C-180
when: When designing and evaluating quantitative trading strategies
action: Implement walk-forward analysis or rolling window out-of-sample testing with at least 30% holdout ratio to validate
strategy robustness before live deployment
severity: high
kind: domain_rule
modality: must
consequence: Strategies deployed without overfitting protection systematically overfit to historical patterns, leading
to live trading losses that can exceed the entire backtested profit
derived_from_bd_id: BD-GAP-007
- id: finance-C-181
when: When computing factor IC (Information Coefficient) metrics
action: Assume the framework performs IC demean and group alignment automatically — the framework does not implement cross-sectional
IC calculation with proper demeaning
severity: high
kind: claim_boundary
modality: must_not
consequence: Without IC demean and group alignment, factor IC calculations include systematic biases that misrepresent
true predictive power, causing incorrect factor selection and strategy allocation
derived_from_bd_id: BD-GAP-008
- id: finance-C-182
when: When calculating factor Information Coefficients for multi-factor strategies
action: Implement cross-sectional IC demean per group — subtract group-mean IC from each factor's IC value before aggregating,
ensuring individual factor contributions are properly isolated
severity: high
kind: domain_rule
modality: must
consequence: Factor IC values calculated without demeaning and group alignment contain systematic biases that distort
factor ranking, leading to suboptimal factor selection and portfolio construction
derived_from_bd_id: BD-GAP-008
- id: finance-C-183
when: When configuring multi-provider data sources for backtesting
action: Assume the framework implements provider priority routing or credential isolation — the framework does not provide
built-in provider failover or secure credential management
severity: high
kind: claim_boundary
modality: must_not
consequence: Without provider priority and credential isolation, data source failures cause backtest interruptions, and
shared credentials create security vulnerabilities in production environments
derived_from_bd_id: BD-GAP-010
- id: finance-C-184
when: When implementing multi-provider data collection architecture
action: Implement provider priority configuration with explicit fallback routing and isolate each provider's credentials
in separate secure storage (environment variables or secrets manager)
severity: high
kind: domain_rule
modality: must
consequence: Missing provider priority routing causes backtest failures when the primary data source becomes unavailable,
and credential leakage creates security risks in production deployments
derived_from_bd_id: BD-GAP-010
- id: finance-C-185
when: When implementing feature extraction logic for quantitative signals
action: Assume the framework handles time boundary conditions for feature extraction — the framework does not implement
lookahead bias protection or explicit temporal boundary validation
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit time boundary handling, feature extraction inadvertently uses future data (lookahead bias),
causing backtest results that cannot be replicated in live trading
derived_from_bd_id: BD-GAP-011
- id: finance-C-186
when: When designing feature extraction pipelines for time-series data
action: Implement explicit time boundary validation — verify features at timestamp T use only data available up to T,
validate no future data leaks via unit tests with synthetic future data
severity: high
kind: domain_rule
modality: must
consequence: Feature extraction without time boundary validation introduces lookahead bias that inflates backtest returns
by 10-50%, causing live trading performance to fall far below backtested results
derived_from_bd_id: BD-GAP-011
- id: finance-C-187
when: When implementing or evaluating backtesting results in the price_collection stage
action: Assume the framework implements walk-forward analysis with out-of-sample testing — this capability is confirmed
absent from the framework
severity: high
kind: claim_boundary
modality: must_not
consequence: Strategies validated only on in-sample data will appear profitable in backtesting but fail in live trading,
causing actual capital losses when deployed
derived_from_bd_id: BD-GAP-014
- id: finance-C-188
when: When validating strategy performance in the price_collection stage
action: 'Implement walk-forward analysis: divide historical data into rolling windows with train periods followed by out-of-sample
test periods, execute strategy only on test periods, and aggregate performance metrics from non-overlapping test results
to detect overfitting'
severity: high
kind: domain_rule
modality: must
consequence: Without walk-forward out-of-sample validation, backtest equity curves are overfitted to historical noise,
making live trading returns fall far below backtested results
derived_from_bd_id: BD-GAP-014
- id: finance-C-189
when: When implementing or modifying job scheduling and execution logic that references REBALANCE_TYPE constants
action: 'Preserve the exact ordinal sequence: FULL(0), REQUIRED_CASH(1), FULL_AND_TLH(2), NO_ACTIONS(3), INITIAL(4) —
the integer values and their relative ordering must not be altered'
severity: high
kind: domain_rule
modality: must
consequence: Changing REBALANCE_TYPE ordinals causes portfolio rebalancing operations to execute the wrong job type, resulting
in incorrect capital allocation, unintended position sizing, or failed TLH harvesting
derived_from_bd_id: BD-079
- id: finance-C-190
when: When evaluating tax lots for tax-loss harvesting decisions
action: 'Check both conditions simultaneously: (1) verify lot age against 365-day threshold for long-term classification,
AND (2) verify lot is not within 30-day wash sale window from any replacement purchase — lots aged 335-365 days must
still be checked for wash sale exposure before harvesting'
severity: high
kind: domain_rule
modality: must
consequence: Harvesting lots without wash sale window verification causes disallowed losses under IRS rules, triggering
additional tax liability and permanent reduction of TLH benefits by 5-15% annually
derived_from_bd_id: BD-083
- id: finance-C-191
when: When implementing or refactoring the TLH savings calculation chain
action: 'Preserve the complete dependency chain: TLH threshold (BD-021) → savings formula (BD-061) → cash flow TWR (BD-059)
→ annualized TWR (BD-052) → portfolio value (BD-064) → period-end billing (BD-048) → billing sum (BD-066) → min fee
floor (BD-060) → min billing fee (BD-019); do not isolate any component from its dependent calculations'
severity: high
kind: domain_rule
modality: must
consequence: TLH thresholds set too low cause excessive wash sale trading whose losses are not captured in savings estimates,
overstating performance by 5-15% and causing incorrect fee billing that triggers client disputes
derived_from_bd_id: BD-086
- id: finance-C-192
when: When calculating time-weighted returns across custodians or time periods
action: Define and document explicit settlement date convention (T+N) and day count convention (act/360, 30/360, act/365)
before any interval calculations; verify that all TWR components use the same conventions consistently
severity: high
kind: domain_rule
modality: must
consequence: Without explicit date conventions, portfolio value intervals are calculated differently across custodians,
months with varying day counts, and leap years, so TWR comparisons end up comparing fundamentally inconsistent calculations
derived_from_bd_id: BD-088
- id: finance-C-193
when: When implementing lot selection and account placement logic
action: 'Coordinate three decisions jointly: qualified vs non-qualified account segregation (BD-045), municipal bond substitution
eligibility restricted to taxable accounts (BD-025), and 365-day threshold for long-term vs short-term gains (BD-057);
apply account-type-dependent tax treatment as a filter before any lot selection'
severity: high
kind: domain_rule
modality: must
consequence: Ignoring account-type coordination causes municipal bonds to be placed in qualified accounts (suboptimal tax treatment),
short-term trades to occur in taxable accounts (higher tax rates), and lots to be selected without tax awareness across accounts, reducing
household-level tax efficiency by 2-5%
derived_from_bd_id: BD-089
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-066 / UC-101
version: v5.3
intent_keywords: []
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: no candidate field had 2-7 distinct values; all capabilities collapsed into single group
groups:
- group_id: all
name: All Capabilities
description: ''
emoji: 📦
uc_count: 0
ucs: []
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
- uc_id: UC-101
beginner_prompt: Try capability UC-101
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try capability UC-102
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 0 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
- Custom Transformer + Accumulator factor with per-entity rolling state
- Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
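XVA valuation and risk-metric calculation for OTC derivatives portfolios, with CVA/DVA/FVA measures and exposure-profile generation; provides SIMM margin sensitivity analysis and is compatible with multiple pricing-engine configurations.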
---
name: reactive-pricing-engine
description: |-
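XVA valuation and risk-metric calculation for OTC derivatives portfolios, with CVA/DVA/FVA measures and exposure-profile generation; provides SIMM margin sensitivity analysis and is compatible with multiple pricing-engine configurations.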
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-104"
compiled_at: "2026-04-22T13:00:49.125318+00:00"
capability_markets: "unspecified"
capability_activities: "finance-analytics"
sop_version: "crystal-compilation-v6.1"
---
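# XVA Pricing Engine (reactive-pricing-engine)
> XVA valuation and risk-metric calculation for OTC derivatives portfolios, with CVA/DVA/FVA measures and exposure-profile generation; provides SIMM margin sensitivity analysis and is compatible with multiple pricing-engine configurations.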
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (19 total)
### Dynamic SIMM Exposure Analysis (`UC-101`)
Analyzes collateralized vs uncollateralized counterparty exposure dynamics for risk management and margin calculations
**Triggers**: initial margin, SIMM, collateral
### XVA Valuation and Sensitivity Reporting (`UC-102`)
Computes and visualizes XVA metrics including CVA, DVA, FVA, and exposure profiles for OTC derivatives portfolio
**Triggers**: XVA, CVA, FVA
### Portfolio NPV Cashflow and Curve Reporting (`UC-106`)
Generates comprehensive portfolio reports including NPV, cashflows, and yield curves for trade valuation
**Triggers**: NPV, cashflow, curves
For all **19** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
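The trigger is a conjunction: a use-case term must match and an action verb must be present. A minimal Python sketch follows, assuming case-insensitive substring matching (the source does not say how matching is performed). `should_execute` and the sample term lists are hypothetical names for illustration; the action verbs are copied from the trigger string.

```python
# Action verbs copied verbatim from the execute trigger.
ACTION_VERBS = ("run", "execute", "跑", "执行", "backtest", "fetch", "collect")


def should_execute(user_text: str, positive_terms: list[str]) -> bool:
    """Return True only when a positive term AND an action verb both appear.

    Substring matching is an assumption; a real intent router may tokenize.
    """
    text = user_text.lower()
    has_term = any(term.lower() in text for term in positive_terms)
    has_verb = any(verb in text for verb in ACTION_VERBS)
    return has_term and has_verb
```

Under these assumptions, "Run the SIMM report" triggers execution for the terms `["SIMM"]`, while "What is SIMM?" does not, because no action verb is present.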
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-104. Evidence verify ratio = 38.9% and audit fail total = 7. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
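| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative source (source-of-truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 0 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |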
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-104` blueprint at 2026-04-22T13:00:49.125318+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Pricing Sensitivity Method Comparison
  - XVA Valuation and Sensitivity Reporting
  - Dynamic SIMM Exposure Analysis
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **0**
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-104--Engine
**Scan date**: 2026-04-22
**Stats**: 7 files, 34 classes, 0 functions, 7 stages
## Modules (7)
- [xml_configuration_stage](components/xml_configuration_stage.md): 5 classes
- [market_data_loading](components/market_data_loading.md): 4 classes
- [pricing_&_npv_calculation](components/pricing_-_npv_calculation.md): 6 classes
- [monte_carlo_simulation_&_exposure](components/monte_carlo_simulation_-_exposure.md): 5 classes
- [xva_&_margin_calculation](components/xva_-_margin_calculation.md): 5 classes
- [post-processing_&_visualization](components/post-processing_-_visualization.md): 5 classes
- [regression_testing_framework](components/regression_testing_framework.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 78
fatal_constraints_count: 44
non_fatal_constraints_count: 148
use_cases_count: 19
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
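One way to honor this ordering is to stable-sort each cycle's orders so that sells come first. This is a minimal sketch: the `Order` record and the `order_for_cycle` helper are hypothetical and not part of any library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    entity_id: str
    action: str  # "buy" or "sell"


def order_for_cycle(orders: list[Order]) -> list[Order]:
    """Return the orders with every sell ahead of every buy.

    sorted() is stable, so the relative order within the sells and within
    the buys is preserved.
    """
    return sorted(orders, key=lambda o: 0 if o.action == "sell" else 1)
```

Running sells first means the cash they free up is available before any buy in the same cycle is sized.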
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
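Next-bar execution means a signal computed from bar t is acted on at bar t+1. The sketch below shows the idea on plain lists, assuming one signal per bar; `shift_to_next_bar` is a hypothetical helper, not a library function.

```python
from typing import Optional


def shift_to_next_bar(signals: list[Optional[bool]]) -> list[Optional[bool]]:
    """Delay every signal by one bar.

    The first bar has no earlier signal, so it gets None (no action); the
    last signal is dropped because there is no later bar to execute it on.
    """
    if not signals:
        return []
    return [None] + signals[:-1]
```

For example, `[True, None, False]` becomes `[None, True, None]`: the buy signal from the first bar executes on the second, and the final sell signal has no bar left to execute on.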
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
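A small validator makes the format concrete. This sketch assumes the three parts are separated by underscores and that only the trailing code part may itself contain an underscore; `parse_entity_id` is a hypothetical helper.

```python
def parse_entity_id(entity_id: str) -> tuple[str, str, str]:
    """Split an ID into (entity_type, exchange, code) or raise ValueError."""
    parts = entity_id.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected entity_type_exchange_code, got {entity_id!r}")
    entity_type, exchange, code = parts
    return entity_type, exchange, code
```

With this helper, `stock_sh_600000` parses to `("stock", "sh", "600000")`, while a bare code such as `600000` is rejected.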
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
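The exactly-one rule can be enforced at construction time. The sketch below uses a hypothetical `SizedSignal` dataclass as a stand-in; it is not the real `TradingSignal` class, and only the three field names come from the lock text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SizedSignal:
    """Hypothetical stand-in for a trading signal's sizing fields."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        # Collect the names of the sizing fields the caller actually set.
        provided = [
            name
            for name in ("position_pct", "order_money", "order_amount")
            if getattr(self, name) is not None
        ]
        if len(provided) != 1:
            raise ValueError(
                f"exactly one sizing field is required, got {provided or 'none'}"
            )
```

Comparing against `None` rather than testing truthiness means an explicit `0` still counts as a provided value.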
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
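Because the column is tri-state, a plain truthiness test would wrongly treat `None` as a sell. The sketch below maps a single cell to an action for plain Python values only (NumPy boolean scalars would need extra handling); `action_for` is a hypothetical helper.

```python
import math
from typing import Optional


def action_for(filter_result: object) -> Optional[str]:
    """Map a filter_result cell to "buy", "sell", or None (no action)."""
    if filter_result is None:
        return None
    if isinstance(filter_result, float) and math.isnan(filter_result):
        return None
    if filter_result is True:
        return "buy"
    if filter_result is False:
        return "sell"
    # Anything else (0, 1, strings, ...) is outside the three-state contract.
    raise ValueError(f"unexpected filter_result value: {filter_result!r}")
```

Identity checks (`is True`, `is False`) keep integers such as `0` and `1` from silently passing as signals.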
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
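The seed's execution protocol runs preconditions in declared order and halts on the first fatal failure, reporting that check's on_fail message. A minimal runner sketch follows: `run_preconditions` and the `(id, check, on_fail)` tuple shape are hypothetical, and any callables passed in are placeholders rather than the real PC-01 to PC-04 commands.

```python
from typing import Callable, Optional


def run_preconditions(
    checks: list[tuple[str, Callable[[], None], str]],
) -> Optional[tuple[str, str]]:
    """Run checks in order; return (id, on_fail) for the first failure.

    Each check signals failure by raising any exception. Returns None when
    every check passes. Later checks are not run after a failure.
    """
    for pc_id, check, on_fail in checks:
        try:
            check()
        except Exception:
            return pc_id, on_fail
    return None
```

Returning the failing ID together with its remediation text lets the caller show the on_fail message to the user without re-reading the check list.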
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **19**
## `KUC-101`
**Source**: `Examples/InitialMargin/ore_dynamicsimm.ipynb`
Analyzes collateralized vs uncollateralized counterparty exposure dynamics for risk management and margin calculations.
## `KUC-102`
**Source**: `Examples/Legacy/Example_56/ore.ipynb`
Computes and visualizes XVA metrics including CVA, DVA, FVA, and exposure profiles for OTC derivatives portfolio.
## `KUC-103`
**Source**: `Examples/Legacy/Example_61/ore.ipynb`
Compares different sensitivity computation methods (bump-and-reval, automatic differentiation, complex step) for accuracy and performance validation.
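To make the comparison concrete, here is a self-contained sketch of two of the named methods, bump-and-reval (as a central finite difference) and complex step, applied to a toy discount factor exp(-rT). It is illustrative only and is not taken from the ORE notebook; `discount_factor`, `bump_and_reval`, and `complex_step` are hypothetical names, and the step sizes are assumptions.

```python
import cmath
import math


def discount_factor(rate: complex, maturity: float = 5.0) -> complex:
    """Toy pricing function: exp(-rate * maturity). Accepts complex rates."""
    return cmath.exp(-rate * maturity)


def bump_and_reval(rate: float, bump: float = 1e-4) -> float:
    """Central finite difference: reprice at rate +/- bump."""
    up = discount_factor(rate + bump).real
    down = discount_factor(rate - bump).real
    return (up - down) / (2.0 * bump)


def complex_step(rate: float, step: float = 1e-20) -> float:
    """Complex-step derivative: Im(f(x + i*step)) / step, no subtraction."""
    return discount_factor(complex(rate, step)).imag / step


exact = -5.0 * math.exp(-0.03 * 5.0)  # analytic d/dr of exp(-5r) at r = 0.03
```

The complex-step estimate avoids the subtractive cancellation that limits how small a bump can be, which is why it can use a far smaller step than bump-and-reval. It does require the pricing function to accept complex inputs.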
## `KUC-104`
**Source**: `Examples/ORE-Python/Notebooks/Dependencies/dependencies.ipynb`
Explores market object dependencies and portfolio structure by examining curves, conventions, and pricing engine configurations.
## `KUC-105`
**Source**: `Examples/ORE-Python/Notebooks/Example_1/hello.ipynb`
Demonstrates basic instrument setup and valuation for a commodity forward contract using price and discount curves.
## `KUC-106`
**Source**: `Examples/ORE-Python/Notebooks/Example_1/ore.ipynb`
Generates comprehensive portfolio reports including NPV, cashflows, and yield curves for trade valuation.
## `KUC-107`
**Source**: `Examples/ORE-Python/Notebooks/Example_2/ore.ipynb`
Analyzes counterparty exposure profiles (EPE, ENE) for individual swaps and netting sets to assess credit risk.
## `KUC-108`
**Source**: `Examples/ORE-Python/Notebooks/Example_3/ore.ipynb`
Visualizes Monte Carlo simulation paths and NPV cubes for XVA computation, including scenario data analysis.
## `KUC-109`
**Source**: `Examples/ORE-Python/Notebooks/Example_3/progress.ipynb`
Monitors ORE simulation progress by reading log files and displaying execution status in Jupyter.
## `KUC-110`
**Source**: `Examples/ORE-Python/Notebooks/Example_4/ore.ipynb`
Analyzes simulated scenario data to compute covariance matrices and risk factor correlations from Monte Carlo output.
## `KUC-111`
**Source**: `Examples/ORE-Python/Notebooks/Example_5/ore.ipynb`
Visualizes custom payoff functions and analyzes trade-level exposure for exotic instruments like barrier options.
## `KUC-112`
**Source**: `Examples/ORE-Python/Notebooks/Example_6/ore.ipynb`
Demonstrates dynamic valuation date changes and interactive scenario analysis using Jupyter widgets for what-if analysis.
## `KUC-113`
**Source**: `Examples/ORE-Python/Notebooks/Example_7/ore.ipynb`
Calculates regulatory margin requirements using SIMM methodology based on CRIF-formatted risk sensitivities.
## `KUC-114`
**Source**: `Examples/ORE-Python/Notebooks/Example_8/analytics.ipynb`
Sets up complete market data infrastructure including curves, conventions, pricing engines, and term structure configurations.
## `KUC-115`
**Source**: `Examples/ORE-Python/Notebooks/Example_9/ore.ipynb`
Extracts and inspects term structures including discount curves, forward curves, and zero rates for pricing validation.
## `KUC-116`
**Source**: `Examples/ORE-Python/Notebooks/QuasiMonteCarloMethods.ipynb`
Demonstrates quasi-Monte Carlo methods using low-discrepancy sequences (Sobol) for more efficient European option pricing.
## `KUC-117`
**Source**: `Examples/Performance/ore_aadsensi.ipynb`
Compares automatic differentiation sensitivity computation performance against traditional bump-and-reval and complex step methods.
## `KUC-118`
**Source**: `Examples/Performance/ore_cvasensi.ipynb`
Computes CVA sensitivities to risk factors and analyzes credit valuation adjustment exposure profiles.
## `KUC-119`
**Source**: `FrontEnd/Python/Visualization/npvcube/ore_jupyter_dashboard.ipynb`
Provides interactive Jupyter dashboard for launching ORE, visualizing NPV cube data, and filtering by netting sets.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **0**
FILE:references/components/market_data_loading.md
# market_data_loading (4 classes)
## `CSVLoader`
`market_data_loading/csvloader.py:0`
## `InMemoryLoader`
`market_data_loading/inmemoryloader.py:0`
## `Market construction`
`market_data_loading/market-construction.py:0`
## `loader_implementation`
`market_data_loading/loader-implementation.py:0`
FILE:references/components/monte_carlo_simulation_-_exposure.md
# monte_carlo_simulation_&_exposure (5 classes)
## `Simulation configuration`
`monte_carlo_simulation_&_exposure/simulation-configuration.py:0`
## `Cube storage and retrieval`
`monte_carlo_simulation_&_exposure/cube-storage-and-retrieval.py:0`
## `FinancialSystem.consistent_input`
`monte_carlo_simulation_&_exposure/financialsystem-consistent-input.py:0`
## `simulation_engine`
`monte_carlo_simulation_&_exposure/simulation-engine.py:0`
## `scenario_output`
`monte_carlo_simulation_&_exposure/scenario-output.py:0`
FILE:references/components/post-processing_-_visualization.md
# post-processing_&_visualization (5 classes)
## `OutputPlotter`
`post-processing_&_visualization/outputplotter.py:0`
## `compare_files`
`post-processing_&_visualization/compare-files.py:0`
## `OutputFiles.from_folder`
`post-processing_&_visualization/outputfiles-from-folder.py:0`
## `plot_format`
`post-processing_&_visualization/plot-format.py:0`
## `comparison_tolerance`
`post-processing_&_visualization/comparison-tolerance.py:0`
FILE:references/components/pricing_-_npv_calculation.md
# pricing_&_npv_calculation (6 classes)
## `VanillaOption.setPricingEngine`
`pricing_&_npv_calculation/vanillaoption-setpricingengine.py:0`
## `AnalyticEuropeanEngine`
`pricing_&_npv_calculation/analyticeuropeanengine.py:0`
## `HestonModel`
`pricing_&_npv_calculation/hestonmodel.py:0`
## `PlainInMemoryReport.getReport`
`pricing_&_npv_calculation/plaininmemoryreport-getreport.py:0`
## `pricing_engine`
`pricing_&_npv_calculation/pricing-engine.py:0`
## `model_type`
`pricing_&_npv_calculation/model-type.py:0`
FILE:references/components/regression_testing_framework.md
# regression_testing_framework (4 classes)
## `TestExamples dynamic test generation`
`regression_testing_framework/testexamples-dynamic-test-generation.py:0`
## `compare_files_df`
`regression_testing_framework/compare-files-df.py:0`
## `test_parallelism`
`regression_testing_framework/test-parallelism.py:0`
## `sample_count_override`
`regression_testing_framework/sample-count-override.py:0`
FILE:references/components/xml_configuration_stage.md
# xml_configuration_stage (5 classes)
## `Parameters.fromFile`
`xml_configuration_stage/parameters-fromfile.py:0`
## `OreExample.__init__`
`xml_configuration_stage/oreexample-init.py:0`
## `OreExample.run`
`xml_configuration_stage/oreexample-run.py:0`
## `ore_exe_location`
`xml_configuration_stage/ore-exe-location.py:0`
## `xml_configuration_file`
`xml_configuration_stage/xml-configuration-file.py:0`
FILE:references/components/xva_-_margin_calculation.md
# xva_&_margin_calculation (5 classes)
## `expectedSimmEvolution`
`xva_&_margin_calculation/expectedsimmevolution.py:0`
## `netCubeFilter`
`xva_&_margin_calculation/netcubefilter.py:0`
## `CreditPortfolioModel`
`xva_&_margin_calculation/creditportfoliomodel.py:0`
## `margin_methodology`
`xva_&_margin_calculation/margin-methodology.py:0`
## `evaluation_date`
`xva_&_margin_calculation/evaluation-date.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-104-v5.3
version: v6.1
blueprint_id: finance-bp-104
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:49.125318+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- unspecified
activities:
- finance-analytics
upgraded_from: finance-bp-104-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:27.856671+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-104--Engine/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-104--Engine/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
domain_constraints_injected: []
resources_injected: {}
known_use_cases:
- kuc_id: KUC-101
source_file: Examples/InitialMargin/ore_dynamicsimm.ipynb
business_problem: Analyzes collateralized vs uncollateralized counterparty exposure dynamics for risk management and margin
calculations.
intent_keywords:
- initial margin
- SIMM
- collateral
- exposure
- variation margin
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-102
source_file: Examples/Legacy/Example_56/ore.ipynb
business_problem: Computes and visualizes XVA metrics including CVA, DVA, FVA, and exposure profiles for OTC derivatives
portfolio.
intent_keywords:
- XVA
- CVA
- FVA
- exposure
- counterparty risk
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-103
source_file: Examples/Legacy/Example_61/ore.ipynb
business_problem: Compares different sensitivity computation methods (bump-and-reval, automatic differentiation, complex
step) for accuracy and performance validation.
intent_keywords:
- sensitivity
- AD
- automatic differentiation
- complex step
- bump and reval
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-104
source_file: Examples/ORE-Python/Notebooks/Dependencies/dependencies.ipynb
business_problem: Explores market object dependencies and portfolio structure by examining curves, conventions, and pricing
engine configurations.
intent_keywords:
- market data
- curves
- conventions
- portfolio
- dependencies
stage: data_collection
data_domain: market_data
type: data_pipeline
- kuc_id: KUC-105
source_file: Examples/ORE-Python/Notebooks/Example_1/hello.ipynb
business_problem: Demonstrates basic instrument setup and valuation for a commodity forward contract using price and discount
curves.
intent_keywords:
- commodity
- forward
- pricing engine
- price curve
- discount curve
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-106
source_file: Examples/ORE-Python/Notebooks/Example_1/ore.ipynb
business_problem: Generates comprehensive portfolio reports including NPV, cashflows, and yield curves for trade valuation.
intent_keywords:
- NPV
- cashflow
- curves
- portfolio
- valuation
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-107
source_file: Examples/ORE-Python/Notebooks/Example_2/ore.ipynb
business_problem: Analyzes counterparty exposure profiles (EPE, ENE) for individual swaps and netting sets to assess credit
risk.
intent_keywords:
- exposure
- EPE
- ENE
- netting set
- swap
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-108
source_file: Examples/ORE-Python/Notebooks/Example_3/ore.ipynb
business_problem: Visualizes Monte Carlo simulation paths and NPV cubes for XVA computation, including scenario data analysis.
intent_keywords:
- XVA
- scenario
- simulation
- Monte Carlo
- NPV cube
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-109
source_file: Examples/ORE-Python/Notebooks/Example_3/progress.ipynb
business_problem: Monitors ORE simulation progress by reading log files and displaying execution status in Jupyter.
intent_keywords:
- progress
- monitoring
- simulation
- log
stage: data_collection
data_domain: mixed
type: monitoring
- kuc_id: KUC-110
source_file: Examples/ORE-Python/Notebooks/Example_4/ore.ipynb
business_problem: Analyzes simulated scenario data to compute covariance matrices and risk factor correlations from Monte
Carlo output.
intent_keywords:
- scenario
- covariance
- correlation
- risk factor
- eigenvalue
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-111
source_file: Examples/ORE-Python/Notebooks/Example_5/ore.ipynb
business_problem: Visualizes custom payoff functions and analyzes trade-level exposure for exotic instruments like barrier
options.
intent_keywords:
- payoff
- barrier option
- custom
- exposure
- visualization
stage: factor_computation
data_domain: trading_data
type: research_analysis
- kuc_id: KUC-112
source_file: Examples/ORE-Python/Notebooks/Example_6/ore.ipynb
business_problem: Demonstrates dynamic valuation date changes and interactive scenario analysis using Jupyter widgets for
what-if analysis.
intent_keywords:
- valuation date
- scenario
- interactive
- what-if
- dynamic
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-113
source_file: Examples/ORE-Python/Notebooks/Example_7/ore.ipynb
business_problem: Calculates regulatory margin requirements using SIMM methodology based on CRIF-formatted risk sensitivities.
intent_keywords:
- SIMM
- margin
- CRIF
- initial margin
- regulatory
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-114
source_file: Examples/ORE-Python/Notebooks/Example_8/analytics.ipynb
business_problem: Sets up complete market data infrastructure including curves, conventions, pricing engines, and term structure
configurations.
intent_keywords:
- market data
- configuration
- curves
- conventions
- pricing engine
stage: data_collection
data_domain: market_data
type: data_pipeline
- kuc_id: KUC-115
source_file: Examples/ORE-Python/Notebooks/Example_9/ore.ipynb
business_problem: Extracts and inspects term structures including discount curves, forward curves, and zero rates for pricing
validation.
intent_keywords:
- term structure
- discount curve
- forward curve
- zero rate
- ibor
stage: factor_computation
data_domain: market_data
type: reporting
- kuc_id: KUC-116
source_file: Examples/ORE-Python/Notebooks/QuasiMonteCarloMethods.ipynb
business_problem: Demonstrates quasi-Monte Carlo methods using low-discrepancy sequences (Sobol) for more efficient European
option pricing.
intent_keywords:
- quasi-Monte Carlo
- Sobol
- low discrepancy
- option pricing
- variance reduction
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-117
source_file: Examples/Performance/ore_aadsensi.ipynb
business_problem: Compares automatic differentiation sensitivity computation performance against traditional bump-and-reval
and complex step methods.
intent_keywords:
- AAD
- sensitivity
- performance
- automatic differentiation
- Greeks
stage: factor_computation
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-118
source_file: Examples/Performance/ore_cvasensi.ipynb
business_problem: Computes CVA sensitivities to risk factors and analyzes credit valuation adjustment exposure profiles.
intent_keywords:
- CVA
- sensitivity
- credit
- scenario
- exposure
stage: factor_computation
data_domain: trading_data
type: reporting
- kuc_id: KUC-119
source_file: FrontEnd/Python/Visualization/npvcube/ore_jupyter_dashboard.ipynb
business_problem: Provides interactive Jupyter dashboard for launching ORE, visualizing NPV cube data, and filtering by
netting sets.
intent_keywords:
- dashboard
- NPV cube
- visualization
- interactive
- netting set
stage: factor_computation
data_domain: trading_data
type: reporting
component_capability_map:
project: finance-bp-104--Engine
scan_date: '2026-04-22'
stats:
total_files: 7
total_classes: 34
total_functions: 0
total_stages: 7
modules:
xml_configuration_stage:
class_count: 5
stage_id: xml_configuration
stage_order: 1
responsibility: 'Parses XML input files (ore.xml, simulation.xml, portfolio.xml) to define the ORE execution pipeline.
WHY: Decoupled configuration enables flexible workflow definition without code changes, allowing non-programmers to
modify execution behavior.'
classes:
- name: Parameters.fromFile
file: xml_configuration_stage/parameters-fromfile.py
line: 0
kind: required_method
signature: ''
- name: OreExample.__init__
file: xml_configuration_stage/oreexample-init.py
line: 0
kind: required_method
signature: ''
- name: OreExample.run
file: xml_configuration_stage/oreexample-run.py
line: 0
kind: required_method
signature: ''
- name: ore_exe_location
file: xml_configuration_stage/ore-exe-location.py
line: 0
kind: replaceable_point
- name: xml_configuration_file
file: xml_configuration_stage/xml-configuration-file.py
line: 0
kind: replaceable_point
design_decision_count: 3
market_data_loading:
class_count: 4
stage_id: market_data_loading
stage_order: 2
responsibility: 'Loads market data (yield curves, volatility surfaces, FX spots, credit curves) into ORE''s market object.
WHY: Separates data ingestion from pricing to enable consistent revaluation and supports both file-based and programmatic
data injection.'
classes:
- name: CSVLoader
file: market_data_loading/csvloader.py
line: 0
kind: required_method
signature: ''
- name: InMemoryLoader
file: market_data_loading/inmemoryloader.py
line: 0
kind: required_method
signature: ''
- name: Market construction
file: market_data_loading/market-construction.py
line: 0
kind: required_method
signature: ''
- name: loader_implementation
file: market_data_loading/loader-implementation.py
line: 0
kind: replaceable_point
design_decision_count: 2
pricing_&_npv_calculation:
class_count: 6
stage_id: pricing_npv
stage_order: 3
responsibility: 'Prices trades using configured pricing engines and calculates Net Present Values for the entire portfolio.
WHY: Core valuation step that feeds all downstream risk calculations, XVA, and reporting.'
classes:
- name: VanillaOption.setPricingEngine
file: pricing_&_npv_calculation/vanillaoption-setpricingengine.py
line: 0
kind: required_method
signature: ''
- name: AnalyticEuropeanEngine
file: pricing_&_npv_calculation/analyticeuropeanengine.py
line: 0
kind: required_method
signature: ''
- name: HestonModel
file: pricing_&_npv_calculation/hestonmodel.py
line: 0
kind: required_method
signature: ''
- name: PlainInMemoryReport.getReport
file: pricing_&_npv_calculation/plaininmemoryreport-getreport.py
line: 0
kind: required_method
signature: ''
- name: pricing_engine
file: pricing_&_npv_calculation/pricing-engine.py
line: 0
kind: replaceable_point
- name: model_type
file: pricing_&_npv_calculation/model-type.py
line: 0
kind: replaceable_point
design_decision_count: 3
monte_carlo_simulation_&_exposure:
class_count: 5
stage_id: simulation_exposure
stage_order: 4
responsibility: 'Runs Monte Carlo simulations to generate future exposure profiles for trades and netting sets. WHY:
Enables XVA calculations that require the full distribution of future portfolio values, not just point estimates.'
classes:
- name: Simulation configuration
file: monte_carlo_simulation_&_exposure/simulation-configuration.py
line: 0
kind: required_method
signature: ''
- name: Cube storage and retrieval
file: monte_carlo_simulation_&_exposure/cube-storage-and-retrieval.py
line: 0
kind: required_method
signature: ''
- name: FinancialSystem.consistent_input
file: monte_carlo_simulation_&_exposure/financialsystem-consistent-input.py
line: 0
kind: required_method
signature: ''
- name: simulation_engine
file: monte_carlo_simulation_&_exposure/simulation-engine.py
line: 0
kind: replaceable_point
- name: scenario_output
file: monte_carlo_simulation_&_exposure/scenario-output.py
line: 0
kind: replaceable_point
design_decision_count: 3
xva_&_margin_calculation:
class_count: 5
stage_id: xva_calculation
stage_order: 5
responsibility: 'Computes Credit Valuation Adjustment (CVA), Debt Valuation Adjustment (DVA), Funding Valuation Adjustment
(FVA), and regulatory margins (SA-CCR, SIMM, DIM). WHY: Critical for trading desk P&L attribution, risk management,
and regulatory capital compliance.'
classes:
- name: expectedSimmEvolution
file: xva_&_margin_calculation/expectedsimmevolution.py
line: 0
kind: required_method
signature: ''
- name: netCubeFilter
file: xva_&_margin_calculation/netcubefilter.py
line: 0
kind: required_method
signature: ''
- name: CreditPortfolioModel
file: xva_&_margin_calculation/creditportfoliomodel.py
line: 0
kind: required_method
signature: ''
- name: margin_methodology
file: xva_&_margin_calculation/margin-methodology.py
line: 0
kind: replaceable_point
- name: evaluation_date
file: xva_&_margin_calculation/evaluation-date.py
line: 0
kind: replaceable_point
design_decision_count: 3
post-processing_&_visualization:
class_count: 5
stage_id: post_processing
stage_order: 6
responsibility: 'Transforms raw ORE output into analysis-ready formats and generates publication-quality plots. WHY:
Bridges the gap between engine output and human-interpretable results for risk management decision-making.'
classes:
- name: OutputPlotter
file: post-processing_&_visualization/outputplotter.py
line: 0
kind: required_method
signature: ''
- name: compare_files
file: post-processing_&_visualization/compare-files.py
line: 0
kind: required_method
signature: ''
- name: OutputFiles.from_folder
file: post-processing_&_visualization/outputfiles-from-folder.py
line: 0
kind: required_method
signature: ''
- name: plot_format
file: post-processing_&_visualization/plot-format.py
line: 0
kind: replaceable_point
- name: comparison_tolerance
file: post-processing_&_visualization/comparison-tolerance.py
line: 0
kind: replaceable_point
design_decision_count: 3
regression_testing_framework:
class_count: 4
stage_id: regression_testing
stage_order: 7
responsibility: 'Validates ORE outputs against expected results and manages test suite execution. WHY: Ensures numerical
stability across versions, platforms, and configuration changes to prevent silent regressions.'
classes:
- name: TestExamples dynamic test generation
file: regression_testing_framework/testexamples-dynamic-test-generation.py
line: 0
kind: required_method
signature: ''
- name: compare_files_df
file: regression_testing_framework/compare-files-df.py
line: 0
kind: required_method
signature: ''
- name: test_parallelism
file: regression_testing_framework/test-parallelism.py
line: 0
kind: replaceable_point
- name: sample_count_override
file: regression_testing_framework/sample-count-override.py
line: 0
kind: replaceable_point
design_decision_count: 3
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.3888888888888889
evidence_invalid: 44
evidence_verified: 28
evidence_auto_fixed: 0
audit_coverage: 47/47 (100%)
audit_pass_rate: 12/47 (25%)
audit_fail_total: 7
audit_finance_universal:
pass: 0
warn: 0
fail: 0
audit_subdomain_totals:
pass: 12
warn: 28
fail: 7
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-104. Evidence verify ratio
= 38.9% and audit fail total = 7. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-104-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt, then run: python3 -m zvt.init_dirs to initialize the data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc:
- UC-101
- UC-102
- UC-103
- UC-105
- UC-106
- UC-107
- UC-108
- UC-110
- UC-111
- UC-112
- UC-113
- UC-115
- UC-116
- UC-117
- UC-118
- UC-119
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries:
- uc_id: UC-101
name: Dynamic SIMM Exposure Analysis
positive_terms:
- initial margin
- SIMM
- collateral
- exposure
- variation margin
data_domain: trading_data
negative_terms:
- XVA
- NPV
- pricing
ambiguity_question: Are you looking for margin/SIMM calculations or valuation/pricing analytics?
- uc_id: UC-102
name: XVA Valuation and Sensitivity Reporting
positive_terms:
- XVA
- CVA
- FVA
- exposure
- counterparty risk
data_domain: trading_data
negative_terms:
- NPV
- pricing
- margin
ambiguity_question: Do you need XVA/counterparty risk metrics or basic portfolio valuations?
- uc_id: UC-103
name: Pricing Sensitivity Method Comparison
positive_terms:
- sensitivity
- AD
- automatic differentiation
- complex step
- bump and reval
data_domain: financial_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you comparing sensitivity computation methods or analyzing risk metrics?
- uc_id: UC-104
name: Market Data and Portfolio Dependency Analysis
positive_terms:
- market data
- curves
- conventions
- portfolio
- dependencies
data_domain: market_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you setting up market data infrastructure or running risk calculations?
- uc_id: UC-105
name: Commodity Forward Valuation Setup
positive_terms:
- commodity
- forward
- pricing engine
- price curve
- discount curve
data_domain: financial_data
negative_terms:
- portfolio
- XVA
- exposure
ambiguity_question: Are you learning instrument setup or need portfolio-level analytics?
- uc_id: UC-106
name: Portfolio NPV Cashflow and Curve Reporting
positive_terms:
- NPV
- cashflow
- curves
- portfolio
- valuation
data_domain: trading_data
negative_terms:
- XVA
- exposure
- sensitivity
ambiguity_question: Do you need basic portfolio valuations or risk/exposure metrics?
- uc_id: UC-107
name: Netting Set Exposure Analysis
positive_terms:
- exposure
- EPE
- ENE
- netting set
- swap
data_domain: trading_data
negative_terms:
- NPV
- pricing
- margin
ambiguity_question: Are you analyzing credit exposure or computing trade valuations?
- uc_id: UC-108
name: XVA Scenario Path Visualization
positive_terms:
- XVA
- scenario
- simulation
- Monte Carlo
- NPV cube
data_domain: trading_data
negative_terms:
- NPV
- pricing
ambiguity_question: Are you analyzing XVA scenarios or basic trade valuations?
- uc_id: UC-109
name: Simulation Progress Monitoring
positive_terms:
- progress
- monitoring
- simulation
- log
data_domain: mixed
negative_terms:
- NPV
- exposure
- reporting
ambiguity_question: Are you monitoring execution progress or analyzing results?
- uc_id: UC-110
name: Scenario Dump Covariance Analysis
positive_terms:
- scenario
- covariance
- correlation
- risk factor
- eigenvalue
data_domain: financial_data
negative_terms:
- NPV
- XVA
ambiguity_question: Are you analyzing risk factor correlations or computing valuations?
- uc_id: UC-111
name: Custom Payoff and Exposure Analysis
positive_terms:
- payoff
- barrier option
- custom
- exposure
- visualization
data_domain: trading_data
negative_terms:
- portfolio
- XVA
ambiguity_question: Are you analyzing exotic instrument payoffs or standard trade valuations?
- uc_id: UC-112
name: Dynamic Valuation Date Scenario Analysis
positive_terms:
- valuation date
- scenario
- interactive
- what-if
- dynamic
data_domain: financial_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you running what-if scenario analysis or standard risk reports?
- uc_id: UC-113
name: SIMM Margin Calculation with CRIF Data
positive_terms:
- SIMM
- margin
- CRIF
- initial margin
- regulatory
data_domain: trading_data
negative_terms:
- XVA
- NPV
ambiguity_question: Are you calculating SIMM margin or XVA metrics?
- uc_id: UC-114
name: Market Data Configuration and Analytics Setup
positive_terms:
- market data
- configuration
- curves
- conventions
- pricing engine
data_domain: market_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you configuring market data or running analytics?
- uc_id: UC-115
name: Term Structure and Discount Factor Extraction
positive_terms:
- term structure
- discount curve
- forward curve
- zero rate
- ibor
data_domain: market_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you inspecting term structures or computing risk metrics?
- uc_id: UC-116
name: Quasi-Monte Carlo Option Pricing
positive_terms:
- quasi-Monte Carlo
- Sobol
- low discrepancy
- option pricing
- variance reduction
data_domain: financial_data
negative_terms:
- portfolio
- XVA
ambiguity_question: Are you exploring numerical methods for option pricing or analyzing portfolios?
- uc_id: UC-117
name: AAD Sensitivity Performance Analysis
positive_terms:
- AAD
- sensitivity
- performance
- automatic differentiation
- Greeks
data_domain: financial_data
negative_terms:
- XVA
- exposure
ambiguity_question: Are you benchmarking sensitivity computation methods or running risk reports?
- uc_id: UC-118
name: CVA Sensitivity Scenario Analysis
positive_terms:
- CVA
- sensitivity
- credit
- scenario
- exposure
data_domain: trading_data
negative_terms:
- NPV
- pricing
ambiguity_question: Are you analyzing CVA sensitivities or basic valuations?
- uc_id: UC-119
name: NPV Cube Visualization Dashboard
positive_terms:
- dashboard
- NPV cube
- visualization
- interactive
- netting set
data_domain: trading_data
negative_terms:
- XVA
- sensitivity
ambiguity_question: Are you building custom dashboards or running standard risk reports?
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched a single use case with a confidence gap > 20% over the next candidate and no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 78
fatal_constraints_count: 44
non_fatal_constraints_count: 148
use_cases_count: 19
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'SL-11: a Recorder subclass MUST declare both the provider and data_schema class attributes;
otherwise initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9), choosing standard Appel parameters over adaptive ones because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator; swapping the order causes NaN propagation. SL-12: result_df
must contain a filter_result OR score_result column, or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose a registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering over current-only filtering because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks the sell-before-buy order because the available_long check in sim_account depends on it; this was chosen
over symmetric ordering to prevent implicit leverage. BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: backtesting without the A-share T+1 constraint overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose a drawer_rects subclass override for custom annotations instead of hardcoded markers, which lets traders
define entry/exit visuals without modifying the base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 22 source groups: Examples/Exposure(2),
Examples/InitialMargin(1), Examples/ORE-Python/ExampleScripts(12), ORE-SWIG/test(4), PythonIntegration(4), Tools/PythonOreRunner(2),
and 16 more.'
key_decisions: 78 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-027
type: B/BA
summary: PCA on yield curve scenarios
- id: BD-028
type: B/BA
summary: Log-linear discount-to-zero-rate conversion
- id: BD-029
type: B/BA
summary: 365.25 day count convention
- id: BD-030
type: B/BA
summary: Black-Scholes-Merton for vanilla options
- id: BD-031
type: B/DK
summary: Heston stochastic volatility model
- id: BD-032
type: B
summary: COS method for Heston pricing
- id: BD-033
type: B/BA
summary: Finite difference Black-Scholes PDE
- id: BD-034
type: B
summary: Binomial tree with multiple algorithms
- id: BD-035
type: B/RC
summary: Monte Carlo with variance reduction
- id: BD-036
type: B/DK
summary: Hull-White one-factor model for swaptions
- id: BD-037
type: B/BA
summary: Levenberg-Marquardt optimization
- id: BD-038
type: B/BA
summary: End criteria for optimization
- id: BD-039
type: B
summary: Gaussian1dSwaptionEngine discretization
- id: BD-040
type: B/BA
summary: Implied volatility Newton solver
- id: BD-041
type: B/BA
summary: Calibration basket generation modes
- id: BD-042
type: B/BA
summary: Black constant volatility surface
- id: BD-043
type: B/DK
summary: Black variance surface by moneyness
- id: BD-044
type: B
summary: FX Vanna-Volga smile calibration
- id: BD-045
type: B/BA
summary: QLE Swaption vol cube interpolation
- id: BD-023
type: B/BA
summary: MARS for conditional expectation estimation
- id: BD-024
type: B/BA
summary: HNSW approximate nearest neighbors
- id: BD-025
type: B
summary: Fourier basis features for regression
- id: BD-026
type: B/DK
summary: KNN weighted smoothing
- id: BD-021
type: B/BA
summary: Directional graph for counterparty network
- id: BD-022
type: B/BA
summary: EEPE-based systemic risk contribution
- id: BD-GAP-001
type: B
summary: TradeGenerator organizes instruments into currency-segmented portfolios (oisPortfolio keyed by currency), enabling
parallel curve bootstrapping per currency
- id: BD-GAP-003
type: B/BA
summary: Day count conventions are mapped per currency (and optionally per tenor), encoding the regulatory and market
practice for each currency jurisdiction
- id: BD-GAP-006
type: B
summary: ORE API routes analytics requests to specialized setup handlers (setupNPV, setupSensitivity, setupStress, setupXVA,
setupExposure) rather than a generic handler, encoding domain-specific parameter str
- id: BD-046
type: BA
summary: Shift horizon defaults to 0.0 across ALL model builders (LGM, CR-LGM, InfDK, InfJY) as architecture
- id: BD-049
type: BA
summary: EngineFactory lazy-resets builders only when buildersDirty_ flag is set; initial state has dirty=true
- id: BD-054
type: BA
summary: FileList._parse() silently catches FileNotFoundError and continues; no error propagation
- id: BD-GAP-002
type: B
summary: Floating legs support configurable basis spread (bool parameter) while fixed legs do not, reflecting the structural
difference between floating coupons (can have basis) and fixed coupons (basis is zer
- id: BD-GAP-004
type: BA
summary: ORE API accepts CSV uploads for results post-processing (postCSV) rather than mandating structured JSON output,
prioritizing analyst flexibility with spreadsheet tools
- id: BD-GAP-007
type: B/RC
summary: State scenarios are stored in gzip-compressed files with separate extraction methods by key and date index,
trading compression efficiency for random access by scenario dimension
- id: BD-GAP-008
type: T
summary: 'Performance utilities provide dual plotting modes: individual NPV path visualization (plotNpvPaths) and aggregated
date-level scenario extraction (getNpvScenarios), supporting both path-specific analy'
- id: BD-GAP-009
type: T
summary: Chart generation in Excel export supports configurable row and column headers, allowing dynamic reorientation
of data tables to match analyst spreadsheet layouts
- id: BD-056
type: BA/M
summary: 'INTERACTION: [BD-052] × [BD-055] → Systemic pricing risk from inconsistent shift horizon validation and application
across model families'
- id: BD-057
type: B
summary: 'INTERACTION: [BD-019] × [BD-008] → Test-production sample divergence creates false regression confidence'
- id: BD-058
type: BA
summary: 'INTERACTION: [BD-009] × [BD-017] → Dual parallelism controls create resource allocation ambiguity'
- id: BD-059
type: BA
summary: 'INTERACTION: [BD-027] × [BD-028] → Hidden dependency: PCA requires consistent rate conversion; log-linear
approximation affects factor quality'
- id: BD-060
type: B/BA
summary: 'INTERACTION: [BD-015] × [BD-019] → Reduced sample count amplifies tolerance calibration requirements for MC
comparisons'
- id: BD-061
type: B/BA
summary: 'INTERACTION: [BD-046] × [BD-052] → Default shift=0.0 interacts with model-specific shift application creating
inconsistent initialization'
- id: BD-062
type: BA/DK
summary: 'INTERACTION: [BD-051] × [BD-010] → Mirrored FX trades required for systemic risk analysis but creates tension
with netting set independence'
- id: BD-063
type: BA/DK
summary: 'INTERACTION: [BD-023] × [BD-024] × [BD-025] → MARS, HNSW, and Fourier features create complex interdependent
ML pipeline with implicit assumptions'
- id: BD-GAP-010
type: T
summary: Legacy Excel integration supports multiple data import formats (CSV, XML, space-delimited) reflecting the heterogeneous
data sources in institutional trading environments
- id: BD-048
type: B
summary: All TradeBuilder subclasses MUST be stateless; constructor returns shared_ptr without state
- id: BD-050
type: B
summary: ShiftedLognormal volatility type requires shift() lookup in 4 permutations; other types ignore shift
- id: BD-055
type: BA/M
summary: ShiftHorizon must be non-negative (>=0); this is validated in ALL model builders, but InfJyBuilder uses different logic
- id: BD-004
type: BA/DK
summary: Loaders abstract data source (CSV vs in-memory) to allow flexible market data injection
- id: BD-005
type: B/DK
summary: CSV format allows non-programmers to update market data without recompilation
- id: BD-047
type: B
summary: OreBasic.__init__ calls parse_input() but NOT parse_output(); parse_output must be called explicitly after
run()
- id: BD-051
type: B
summary: FinancialSystem requires mirrored trade pairs for FxForward/Swap; validation enforced by consistent_input()
- id: BD-053
type: M
summary: OreExample._locate_ore_exe() hardcodes 15+ path patterns for Windows vs Unix; path order matters
- id: BD-052
type: DK
summary: Shift applied as -parametrization->H(shiftHorizon) in LGM but as direct value in CR-LGM and Inflation models
- id: BD-014
type: BA/DK
summary: OutputFiles uses factory pattern from_folder() to auto-discover output files
- id: BD-015
type: BA
summary: Comparison uses datacompy for DataFrame equality with tolerance support
- id: BD-016
type: B
summary: OutputPlotter extends dict for lazy loading of plot data
- id: BD-017
type: B
summary: EXAMPLES_PARALLEL env var controls ThreadPoolExecutor workers for output processing
- id: BD-006
type: B/BA
summary: Engine pattern allows swapping pricing models without changing trade representation
- id: BD-007
type: BA
summary: Reports stored in memory and retrieved by name after ORE run
- id: BD-018
type: BA
summary: Dynamic test generation allows adding examples without modifying test code
- id: BD-019
type: B
summary: OVERWRITE_SCENARIOGENERATOR_SAMPLES=50 for regression testing (not 1000+ production)
- id: BD-020
type: BA/DK
summary: Per-example comparison_config.json allows file-specific tolerances and exclusions
- id: BD-GAP-005
type: B/BA
summary: ORE API implements explicit directory traversal protection (is_directory_traversal check) before serving files,
preventing path injection attacks on the analytics engine
- id: BD-008
type: B/DK
summary: Exposure simulation uses scenario files to cache market paths for efficiency
- id: BD-009
type: BA
summary: Parallel execution via ThreadPoolExecutor with EXAMPLES_PARALLEL env var control
- id: BD-010
type: BA
summary: FinancialSystem builds counterparty graph via networkx for systemic risk analysis
- id: BD-001
type: BA
summary: XML-based configuration decouples engine from workflow definition
- id: BD-002
type: B
summary: OreExample class provides a unified interface for running ORE across all examples
- id: BD-003
type: B
summary: ORE_EXAMPLES_USE_PYTHON env var switches between native exe and Python wrapper
- id: BD-GAP-011
type: DK
summary: 'Missing: Point-in-Time data availability'
- id: BD-GAP-012
type: B
summary: 'Missing: Covariance estimator selection and shrinkage'
- id: BD-GAP-013
type: B
summary: 'Missing: Factor IC demean and grouping alignment'
- id: BD-GAP-014
type: B
summary: 'Missing: NPL portfolio EBA field completeness'
- id: BD-GAP-015
type: B
summary: 'Missing: Provider priority and credential isolation'
- id: BD-011
type: BA/DK
summary: SIMM calculation done post-simulation using scenarioToMarket pipeline
- id: BD-012
type: B
summary: Multiple ORE runs with different configurations for sensitivity analysis (ore.xml, ore1.xml, ore2.xml)
- id: BD-013
type: BA
summary: SA-CCR and CPM run as separate examples enabling modular capital calculation
resources:
packages:
- name: pandas
version_pin: ==1.5.3
- name: numpy
version_pin: ==1.24.4
- name: matplotlib
version_pin: '>=2'
- name: requests
version_pin: ==2.31.0
- name: scipy
version_pin: '>=1.3.0'
- name: scikit-learn
version_pin: '>1.4.2'
- name: pytest
version_pin: '>=8.3'
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When parsing ore.xml with Parameters.fromFile()
action: 'include a Setup node containing mandatory parameters: asofDate, inputPath, and outputPath'
severity: fatal
kind: domain_rule
modality: must
consequence: OREApp initialization fails with a parse error or undefined behavior because the XML parser requires the
Setup section with essential date and path configuration
stage_ids:
- xml_configuration
- id: finance-C-002
when: When executing XMLDocument::fromFile()
action: pass an empty or non-existent file path to XMLDocument constructor
severity: fatal
kind: domain_rule
modality: must_not
consequence: ORE terminates with a QL_REQUIRE exception stating 'Failed to open file' or 'File X is empty', preventing
any analytics from running
stage_ids:
- xml_configuration
- id: finance-C-003
when: When configuring ore.xml for OREApp
action: use an <ORE> element as the root (top-level container) of the XML document
severity: fatal
kind: domain_rule
modality: must
consequence: Parameters parsing fails because Parameters::fromXML() explicitly checks for the 'ORE' node name at xmlutils.cpp:76
stage_ids:
- xml_configuration
- id: finance-C-011
when: When OREApp initialization encounters missing setup group
action: throw QL_REQUIRE exception stating 'parameter group setup missing'
severity: fatal
kind: domain_rule
modality: must
consequence: ORE terminates with an opaque error instead of identifying the missing Setup section in the XML configuration
stage_ids:
- xml_configuration
- id: finance-C-012
when: When the ore.xml file contains malformed XML
action: catch rapidxml::parse_error and call handle_rapidxml_parse_error() for user-friendly diagnostics
severity: fatal
kind: domain_rule
modality: must
consequence: ORE crashes with raw rapidxml exception details instead of providing actionable error messages about XML
syntax errors
stage_ids:
- xml_configuration
- id: finance-C-016
when: When parsing market data CSV files, loader expects exactly 3 tokens per data line
action: 'Provide data rows with exactly three whitespace/comma/semicolon/tab-delimited fields: date, key name, and numeric
value'
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid CSVLoader lines cause QL_REQUIRE to throw parsing errors, preventing market data from loading and
blocking all downstream pricing calculations
stage_ids:
- market_data_loading
- id: finance-C-018
when: When building yield curves, loader must provide market quotes for all as-of dates matching the valuation date
action: Verify market data file contains entries with dates matching the evaluationDate used for pricing
severity: fatal
kind: domain_rule
modality: must
consequence: TodaysMarket initialization fails with 'no quotes available for date' warning when data is missing for the
valuation date, causing null curve handles and NaN pricing results
stage_ids:
- market_data_loading
- id: finance-C-027
when: When building market object, loader must be provided as non-null shared_ptr to avoid runtime crashes
action: Pass valid Loader instance (CSVLoader or InMemoryLoader) to TodaysMarket constructor, never pass null loaders
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Null loader causes QL_REQUIRE to throw 'Loader is null' error, preventing market initialization and blocking
all valuations
stage_ids:
- market_data_loading
- id: finance-C-036
when: When pricing a trade and generating NPV reports
action: verify that NPV values are finite using std::isfinite before writing to report
severity: fatal
kind: domain_rule
modality: must
consequence: Non-finite NPV values (NaN, Inf, -Inf) propagate through the risk calculation pipeline, corrupting XVA, exposure,
and sensitivity outputs that depend on trade valuations
stage_ids:
- pricing_npv
- id: finance-C-037
when: When implementing trade builders for portfolio construction
action: verify all AbstractTradeBuilder subclasses are stateless and produce new trade instances via factory methods
severity: fatal
kind: domain_rule
modality: must
consequence: Stateful trade builders cause shared_ptr reuse bugs across concurrent portfolio builds, leading to duplicate
trades or data corruption when multiple threads process the same builder instance
stage_ids:
- pricing_npv
- id: finance-C-039
when: When calling VanillaOption.NPV() method
action: verify a pricing engine has been attached via setPricingEngine() before invoking NPV()
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Calling NPV() without an attached pricing engine throws an exception or returns uninitialized values, causing
silent failures in the risk calculation pipeline
stage_ids:
- pricing_npv
- id: finance-C-043
when: When configuring pricing engine parameters via EngineData XML
action: verify each trade type has both model and engine specified to enable proper engine binding
severity: fatal
kind: resource_boundary
modality: must
consequence: Missing model or engine specification in EngineData causes EngineFactory to fail during portfolio build,
preventing risk calculations from running
stage_ids:
- pricing_npv
- id: finance-C-044
when: When performing FX conversion during NPV calculation
action: use current FX spot rates from SimMarket for conversion, dividing by numeraire for proper discounting
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect FX conversion or numeraire division produces wrong base currency NPV values, leading to misstated
portfolio values and incorrect risk reporting
stage_ids:
- pricing_npv
- id: finance-C-049
when: When validating NPV cube dimensions after portfolio valuation
action: verify cube dimensions (numIds, numDates, samples, depth) match expected portfolio and date grid sizes
severity: fatal
kind: domain_rule
modality: must
consequence: Dimension mismatch between NPV cube and portfolio causes index out-of-bounds errors or silent data truncation
in exposure aggregation
stage_ids:
- pricing_npv
- id: finance-C-050
when: When filtering simulation samples from rawcube.csv
action: account for the 1-based sample indexing convention in rawcube versus 0-based in scenariodata
severity: fatal
kind: domain_rule
modality: must
consequence: Exposure calculations will produce incorrect results when filterSample values are misaligned between rawcube
and scenariodata conventions
stage_ids:
- simulation_exposure
- id: finance-C-051
when: When extracting NPV values from the raw cube
action: filter on depth == 0 to extract primary simulation results representing current portfolio state
severity: fatal
kind: domain_rule
modality: must
consequence: Raw cube filter returns incorrect NPV values when depth != 0 is included, potentially including auxiliary
calculations
stage_ids:
- simulation_exposure
- id: finance-C-052
when: When generating EPE/ENE exposure CSV files
action: include columns Date, NettingSetId, EPE, ENE, EEPE, ENEE as required by downstream XVA calculations
severity: fatal
kind: domain_rule
modality: must
consequence: Downstream XVA and collateral calculations fail when required exposure columns are missing from netting set
exposure CSVs
stage_ids:
- simulation_exposure
- id: finance-C-053
when: When reconstructing Market from scenario file
action: parse simulation.xml to extract currencies, indices, curve tenors, FX vols, and swaption vols matching scenario
dump structure
severity: fatal
kind: domain_rule
modality: must
consequence: scenarioToMarket fails to reconstruct valid market data when simulation.xml market configuration does not
match scenario dump columns
stage_ids:
- simulation_exposure
- id: finance-C-064
when: When implementing netCubeFilter for XVA aggregation
action: filter depth == 0 for trade-level NPV aggregation to netting-set level
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect depth filtering causes NPV values to be incorrectly aggregated, leading to wrong XVA calculations
and mispriced counterparty risk exposure
stage_ids:
- xva_calculation
- id: finance-C-065
when: When implementing expectedSimmEvolution for SIMM averaging
action: divide InitialMargin by number of samples to compute expected value across simulation paths
severity: fatal
kind: domain_rule
modality: must
consequence: SIMM evolution computed without sample averaging produces path-specific results instead of expected values,
causing incorrect IM forecasting
stage_ids:
- xva_calculation
- id: finance-C-066
when: When computing SIMM evolution from multi-sample cube
action: aggregate samples by summing before averaging InitialMargin values
severity: fatal
kind: domain_rule
modality: must
consequence: Sample aggregation without proper summation-then-division causes incorrect expected SIMM values across simulation
paths
stage_ids:
- xva_calculation
- id: finance-C-067
when: When validating XVA report output format
action: 'include #Id (or #TradeId), NettingSetId, CVA, DVA, FVA (or FBA/FCA), KVA columns in output'
severity: fatal
kind: domain_rule
modality: must
consequence: Missing required columns in XVA report causes downstream risk reporting systems to fail and regulatory submissions
to be rejected
stage_ids:
- xva_calculation
- id: finance-C-068
when: When validating SIMM evolution DataFrame output
action: return DataFrame with Date (or AsOfDate), NettingSetId (or Portfolio), SIMM columns
severity: fatal
kind: domain_rule
modality: must
consequence: SIMM evolution output missing required columns causes IM forecasting pipelines to fail
stage_ids:
- xva_calculation
- id: finance-C-069
when: When implementing SA-CCR capital calculation
action: output AddOn, NPV, EAD (Exposure at Default), RW (Risk Weight), CC (Capital Charge) columns per netting set
severity: fatal
kind: domain_rule
modality: must
consequence: SA-CCR output missing regulatory required fields causes non-compliance with Basel CCR capital requirements
stage_ids:
- xva_calculation
- id: finance-C-088
when: When configuring CSV comparison in comparison_config.json
action: Define keys field to specify join columns for DataFrame comparison
severity: fatal
kind: domain_rule
modality: must
consequence: DataFrame comparison fails with warning 'The comparison configuration must contain a keys field', causing
regression tests to abort without meaningful diff output
stage_ids:
- post_processing
- id: finance-C-089
when: When plotting exposure data with OutputPlotter
action: Verify exposure CSV files contain 'Date', 'EPE', and 'ENE' columns
severity: fatal
kind: domain_rule
modality: must
consequence: KeyError raised when accessing df['Date'] or df['EPE']/df['ENE'], causing plot generation to fail with pandas
KeyError
stage_ids:
- post_processing
- id: finance-C-099
when: When implementing CSV file comparison configuration
action: Define non-empty keys field containing non-empty string column names for join-based comparison
severity: fatal
kind: domain_rule
modality: must
consequence: Comparison fails silently or produces incorrect results when keys are missing, causing false pass/fail signals
in regression testing
stage_ids:
- regression_testing
- id: finance-C-115
when: When constructing the Market object from XML configuration
action: use the same asofDate across all market data components (yield curves, volatility surfaces, FX spots, credit
curves)
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Mixed date market data causes incorrect trade pricing, leading to materially wrong NPV calculations and XVA
results across the portfolio
stage_ids:
- market_data_loading
- pricing_npv
- id: finance-C-116
when: When transferring NPV cube data between simulation and XVA stages
action: verify cube trade IDs match the portfolio trade IDs exactly
severity: fatal
kind: domain_rule
modality: must
consequence: NPVCube index mismatch with portfolio causes exposure aggregation to fail or produce incorrect CVA/DVA allocations
stage_ids:
- simulation_exposure
- xva_calculation
- id: finance-C-117
when: When computing exposure profiles for post-processing
action: use the same CubeInterpretation that was used during simulation cube generation
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Inconsistent cube interpretation causes incorrect MPOR cashflow retrieval, leading to wrong collateral exposure
calculations
stage_ids:
- simulation_exposure
- post_processing
- id: finance-C-118
when: When loading pre-built cubes for XVA calculation
action: provide both the NPV cube and corresponding AggregationScenarioData from the same simulation run
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Mismatched scenario data causes numeraire values to be applied to incorrect dates/samples, corrupting all
exposure and XVA calculations
stage_ids:
- simulation_exposure
- xva_calculation
- id: finance-C-120
when: When computing netting set level exposures from trade level
action: apply the netting rules of each netting set before aggregating exposures
severity: fatal
kind: domain_rule
modality: must
consequence: Without proper netting, EPE/ENE calculations overstate exposure, leading to excessive capital reserves and
incorrect CVA/DVA allocation
stage_ids:
- pricing_npv
- simulation_exposure
- id: finance-C-121
when: When transferring aggregation scenario data between simulation and XVA
action: preserve the exact date grid indices and sample counts in AggregationScenarioData
severity: fatal
kind: domain_rule
modality: must
consequence: Scenario data dimension mismatch causes index out of bounds errors or silent misalignment in numeraire retrieval
during XVA calculation
stage_ids:
- simulation_exposure
- xva_calculation
- id: finance-C-125
when: When loading market data from marketdata.txt and curves.csv
action: parse dates using the same date format as todaysmarket.xml conventions
severity: fatal
kind: domain_rule
modality: must
consequence: Date parsing mismatch causes market data to be associated with wrong dates, corrupting curve construction
and trade pricing
stage_ids:
- xml_configuration
- market_data_loading
- id: finance-C-127
when: When building the EngineFactory for trade pricing
action: link the pricing engine factory to the Market object from the same as-of date
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Pricing engines built with mismatched market data produce incorrect NPVs, corrupting the entire risk calculation
pipeline
stage_ids:
- market_data_loading
- pricing_npv
- id: finance-C-130
when: When computing XVA for netting sets with collateral
action: use the same collateral balance evolution from MPOR calculation in both exposure and XVA stages
severity: fatal
kind: domain_rule
modality: must
consequence: Collateral balance inconsistency causes wrong COLVA calculation and incorrect net exposure reporting
stage_ids:
- simulation_exposure
- xva_calculation
- id: finance-C-132
when: When extracting portfolio trades for simulation exposure
action: preserve netting set assignments from the original portfolio for trade-level exposure aggregation
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Lost netting set assignments prevent proper netting and produce overstated exposures for capital calculation
stage_ids:
- pricing_npv
- simulation_exposure
- id: finance-C-142
when: When implementing TradeBuilder subclasses for custom trade types
action: Maintain any internal state in TradeBuilder subclasses; all TradeBuilder implementations must be stateless,
returning shared_ptr to newly constructed objects
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Stateful TradeBuilders cause shared_ptr reuse bugs where trades share incorrect state, leading to incorrect
pricing and risk calculations across portfolios
- id: finance-C-145
when: When initializing ORE application for pricing and risk analytics
action: Call initBuilders() with registerOREAnalytics=true before creating the OREApp instance; this registers all analytic
builders (XVA, SIMM, VaR, etc.) in the EngineBuilderFactory
severity: fatal
kind: architecture_guardrail
modality: must
consequence: OREApp fails to find registered analytics because builders are not registered, preventing any pricing or
risk calculations from executing
- id: finance-C-153
when: When using ORE for regulatory reporting and capital calculation
action: Validate SIMM and SA-CCR calculations independently — ORE provides implementation guidance but regulatory capital
calculations require compliance verification against official regulatory specifications
severity: fatal
kind: claim_boundary
modality: must
consequence: Regulatory capital figures reported to authorities may not meet compliance requirements if ORE implementation
details diverge from official ISDA SIMM or SA-CCR specifications
- id: finance-C-155
when: When implementing pricing logic for exotic derivatives in XVA calculations
action: Use Monte Carlo with variance reduction techniques (antithetic variates, control variates) to keep pricing error
below 0.1% with 100k paths; verify the control variate is highly correlated with the payoff
severity: fatal
kind: domain_rule
modality: must
consequence: Plain Monte Carlo without variance reduction requires 1M+ paths to achieve equivalent accuracy, causing unacceptable
computational overhead in XVA calculations; exotic derivatives priced with insufficient paths produce inaccurate risk
metrics that may violate regulatory capital requirements
derived_from_bd_id: BD-035
- id: finance-C-171
when: When pricing options with Black-Scholes-Merton model
action: Apply Black-Scholes-Merton to American options, options with discrete dividends, or long-dated options exceeding
2 years — these require alternative models (binomial tree, finite difference, or Monte Carlo)
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Black-Scholes-Merton assumes constant volatility and log-normal prices, systematically mispricing American
options by ignoring early exercise premium and misvaluing discrete dividend impacts
derived_from_bd_id: BD-030
- id: finance-C-189
when: When implementing the Heston stochastic volatility model for equity options pricing
action: Validate that the Feller condition (2*kappa*theta > sigma^2) is satisfied to ensure variance process positivity; reject
parameter sets violating this condition
severity: fatal
kind: domain_rule
modality: must
consequence: Parameter sets violating the Feller condition allow the variance process to reach zero, which can produce invalid
option prices and potentially negative values that corrupt XVA calculations for exotic equity structures
derived_from_bd_id: BD-031
- id: finance-C-190
when: When using the Hull-White one-factor short-rate model for interest rate swaption pricing
action: Use Hull-White one-factor for structures requiring spread dynamics between curve tenors; the model must be restricted
to vanilla swaptions because single-factor models cannot capture cross-tenor correlations
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Using Hull-White one-factor for non-vanilla swaption structures causes systematic mispricing due to inability
to model tenor spread dynamics, leading to incorrect IMA FRTB risk factor calculations
derived_from_bd_id: BD-036
regular:
- id: finance-C-004
when: When initializing OreExample with dry=True
action: skip subprocess execution of ore.exe and return immediately from run() method
severity: high
kind: architecture_guardrail
modality: must
consequence: Dry-run mode fails to bypass ore.exe execution, causing unnecessary process spawning and delayed testing
feedback
stage_ids:
- xml_configuration
- id: finance-C-005
when: When auto-detecting ore executable location
action: exceed 3 path pattern iterations when ORE_EXAMPLES_USE_PYTHON is not set
severity: medium
kind: resource_boundary
modality: must_not
consequence: Auto-detection mechanism performs excessive filesystem checks, causing slow initialization especially on
network-mounted directories
stage_ids:
- xml_configuration
- id: finance-C-006
when: When ore.exe cannot be located by _locate_ore_exe()
action: print 'ORE executable not found' message and terminate execution with quit()
severity: high
kind: operational_lesson
modality: must
consequence: Missing executable detection fails silently, causing downstream test failures with cryptic error messages
instead of clear diagnostic
stage_ids:
- xml_configuration
- id: finance-C-007
when: When setting up OreExample with ORE_EXAMPLES_USE_PYTHON environment variable
action: check ORE_EXAMPLES_USE_PYTHON key in os.environ and set use_python flag accordingly
severity: medium
kind: architecture_guardrail
modality: must
consequence: Environment variable-based switching fails, causing ore.exe to be invoked instead of Python wrapper in CI/CD
environments requiring Python mode
stage_ids:
- xml_configuration
- id: finance-C-008
when: When ore.xml specifies a relative inputPath
action: resolve referenced files (portfolio.xml, simulation.xml) relative to the ore.xml directory
severity: high
kind: domain_rule
modality: must
consequence: Configuration file resolution fails because referenced XML files are not found, causing errors equivalent
to MissingResourceException
stage_ids:
- xml_configuration
- id: finance-C-009
when: When running ore.exe from command line
action: pass exactly one argument (the ore.xml path) to ore executable
severity: high
kind: resource_boundary
modality: must
consequence: ORE prints usage message and returns -1 without executing, failing automated test pipelines that expect successful
runs
stage_ids:
- xml_configuration
- id: finance-C-010
when: When constructing OREApp with Parameters from ore.xml
action: call initBuilders() before instantiating OREApp to register all analytic types
severity: high
kind: architecture_guardrail
modality: must
consequence: Analytics fail to execute because required analytic builders (XVA, SIMM, Sensitivity) are not registered
in the factory system
stage_ids:
- xml_configuration
- id: finance-C-013
when: When documenting ore.exe capabilities
action: claim that ore.exe provides real-time market data or live trading execution
severity: high
kind: claim_boundary
modality: must_not
consequence: Users may incorrectly expect live trading capabilities, leading to operational failures when they attempt
to execute real trades
stage_ids:
- xml_configuration
- id: finance-C-014
when: When presenting ORE backtest results to stakeholders
action: imply that simulated portfolio returns guarantee equivalent live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtested XVA and exposure calculations do not account for execution slippage, counterparty delays, or market
impact, leading to overconfident risk estimates
stage_ids:
- xml_configuration
- id: finance-C-015
when: When configuring ORE for parallel execution
action: validate that the specified nThreads parameter is a positive integer and within system resource limits
severity: medium
kind: operational_lesson
modality: must
consequence: Invalid thread count causes thread spawning failures or excessive context switching, degrading simulation
performance
stage_ids:
- xml_configuration
- id: finance-C-017
when: When implementing market data loading, loader must skip comment lines beginning with '#'
action: Filter out or skip any line where the first non-whitespace character is '#' before parsing
severity: high
kind: domain_rule
modality: must
consequence: Comment lines containing market data format headers or documentation will be incorrectly parsed as data entries,
corrupting the market object with invalid quotes
stage_ids:
- market_data_loading
- id: finance-C-019
when: When loading fixings data, dates must be strictly less than evaluation date unless implyTodaysFixings is enabled
action: Load historical fixings with dates before evaluationDate, excluding today's fixings when implyTodaysFixings=false
severity: high
kind: domain_rule
modality: must_not
consequence: Including future or today's unconfirmed fixings without proper flag causes pricing to use provisional data,
producing unreliable XVA calculations
stage_ids:
- market_data_loading
- id: finance-C-020
when: When loading market data, CSV loader must accept multiple delimiter formats for flexibility
action: 'Parse CSV files using delimiters: comma, semicolon, tab, and space with token compression enabled'
severity: high
kind: domain_rule
modality: must
consequence: Data files created with different regional settings (European comma-decimal vs US period-decimal) fail to
parse correctly, producing incorrect token counts
stage_ids:
- market_data_loading
- id: finance-C-021
when: When loading FX market data, loader must enforce currency dominance ordering for consistent triangulation
action: Follow fxDominance hierarchy (XAU,XAG,XPT,XPD,EUR,GBP,AUD,NZD,USD,CAD,CHF,ZAR,...JPY,IDR,KRW) when storing FX
spot pairs
severity: high
kind: architecture_guardrail
modality: must
consequence: Duplicate FX pairs stored in wrong order (e.g., both EUR/USD and USD/EUR) cause triangulation errors and
incorrect cross-currency valuations
stage_ids:
- market_data_loading
- id: finance-C-022
when: When configuring volatility surfaces, curve configuration must match available quotes in the loader
action: Verify volatility curve configuration specifies strike/expiry quotes that exist in marketdata.txt, with wildcard
patterns matching actual quote IDs
severity: high
kind: domain_rule
modality: must
consequence: Missing volatility surface data points cause 'missing expiry/strike will be excluded' warnings and incomplete
vol surfaces, leading to interpolation errors in option pricing
stage_ids:
- market_data_loading
- id: finance-C-023
when: When implementing market data loading, loader must enforce unique market datum entries using set-based storage
action: Store market data in std::set containers with SharedPtrMarketDatumComparator to ensure uniqueness and sorted retrieval
severity: medium
kind: domain_rule
modality: must
consequence: Duplicate market data entries cause ambiguous quote resolution, where get() operations may return inconsistent
results between calls
stage_ids:
- market_data_loading
- id: finance-C-024
when: When implementing market data loading, loader must support lazy loading capability for large datasets
action: Implement lazyBuild parameter to defer curve construction until first access, reducing initialization time for
large market datasets
severity: medium
kind: resource_boundary
modality: should
consequence: Eager construction of all curves at initialization causes excessive startup time when only a subset of curves
is needed for pricing
stage_ids:
- market_data_loading
- id: finance-C-025
when: When loading market data, parser must reject invalid market datum names with specific format errors
action: Validate parseMarketDatum input against expected format (TYPE/NAME/EXPIRY/STRIKE), throwing QuantLib::Error for
malformed keys
severity: high
kind: domain_rule
modality: must
consequence: Malformed market datum keys silently pass validation and cause downstream curve building failures with cryptic
errors
stage_ids:
- market_data_loading
- id: finance-C-026
when: When loading market data, this stage must use file-based CSV loading rather than live market data feeds
action: Load market observables from CSV/text files (marketdata.txt, curves.csv, fixings.txt), not from real-time data
providers
severity: medium
kind: claim_boundary
modality: must_not
consequence: Presenting file-based static market data as real-time creates misleading impressions about data freshness
and pricing accuracy
stage_ids:
- market_data_loading
- id: finance-C-028
when: When loading market data for XVA calculations, loader must support continueOnError mode for partial data scenarios
action: Enable continueOnError flag in TodaysMarket to log build errors but continue with available data rather than failing
completely
severity: medium
kind: operational_lesson
modality: should
consequence: Strict error handling causes entire XVA calculation to fail when a single non-critical curve is missing,
preventing partial results
stage_ids:
- market_data_loading
- id: finance-C-029
when: When adding fixings to loader, duplicate entries must be logged and skipped rather than causing errors
action: Insert fixings into std::set container which automatically deduplicates, logging warnings for each duplicate found
severity: medium
kind: domain_rule
modality: must
consequence: Duplicate fixings cause unexpected behavior in index bootstrap calculations, producing inconsistent forward
rate estimates
stage_ids:
- market_data_loading
- id: finance-C-030
when: When building volatility surfaces, loader must handle missing data gracefully with proper logging
action: Log warnings for missing volatility surface expiry/strike combinations and exclude them from construction rather
than throwing errors
severity: high
kind: operational_lesson
modality: must
consequence: Incomplete volatility surfaces with unlogged missing points cause silent interpolation errors in option pricing,
producing systematically biased results
stage_ids:
- market_data_loading
- id: finance-C-031
when: When parsing date fields in market data files, loader must support multiple date formats
action: Accept dates in YYYYMMDD, YYYY-MM-DD, YYYY/MM/DD, and YYYY.MM.DD formats through parseDate() function
severity: high
kind: domain_rule
modality: must
consequence: Date parsing failures cause entire market data files to be rejected, blocking pricing with cryptic 'Invalid
date format' errors
stage_ids:
- market_data_loading
- id: finance-C-032
when: When implementing market data loading, developers must not skip data validation even for seemingly correct test
data
action: Always run parseMarketDatum validation on input data; never assume test data is correctly formatted without verification
severity: high
kind: rationalization_guard
modality: must_not
consequence: Skipping validation for 'simple' test cases allows malformed data to pass, creating false confidence in production
data quality
stage_ids:
- market_data_loading
- id: finance-C-033
when: When loading market data, loader must not assume token compression handles all whitespace variations
action: Use boost::token_compress_on explicitly to handle multiple consecutive delimiters, not just single delimiters
severity: high
kind: domain_rule
modality: must_not
consequence: Files with inconsistent spacing (e.g., '20160205 FX/RATE/EUR/USD 1.10') produce wrong token counts when
token compression is not enabled
stage_ids:
- market_data_loading
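# Illustration for the token-compression rule above (a Python analogy, not the ORE C++ loader; the sample
# line and its spacing are made up). With runs of spaces between fields:
#   "20160205   FX/RATE/EUR/USD  1.10".split(" ")  yields 6 tokens, three of them empty
#   "20160205   FX/RATE/EUR/USD  1.10".split()     yields 3 tokens
# This mirrors the difference between splitting without and with boost::token_compress_on.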
- id: finance-C-034
when: When loading market data, ORE must not claim real-time data capability since it only supports static file-based
loading
action: Document that market data is loaded from files with possible delay, not from live exchange feeds
severity: medium
kind: claim_boundary
modality: must_not
consequence: Marketing ORE as real-time capable when it only supports batch file loading creates compliance risk and misaligned
user expectations
stage_ids:
- market_data_loading
- id: finance-C-035
when: When loading market data, loader must handle dividend data with 3-5 tokens for different dividend date specifications
action: 'Accept dividend entries with format: Date Name Value [PayDate] [AnnouncementDate], where PayDate and AnnouncementDate
are optional'
severity: high
kind: domain_rule
modality: must
consequence: Incorrect dividend data causes equity curve miscalibration, producing wrong dividend yield adjustments in
equity option pricing
stage_ids:
- market_data_loading
- id: finance-C-038
when: When configuring Monte Carlo pricing engines for European options
action: set requiredTolerance parameter to achieve accuracy within 1 basis point of analytic benchmarks
severity: high
kind: domain_rule
modality: must
consequence: MC engine without proper tolerance produces inaccurate NPV values that misrepresent trade economics, leading
to incorrect risk assessments and potential financial losses in live trading decisions
stage_ids:
- pricing_npv
- id: finance-C-040
when: When using the engine swapping pattern for model comparison
action: compare AnalyticEuropeanEngine vs MCEuropeanEngine results and verify differences remain within 1 basis point
tolerance for European options
severity: medium
kind: operational_lesson
modality: should
consequence: Engine swapping without tolerance verification may introduce numerical discrepancies between pricing models
that compound through risk calculations
stage_ids:
- pricing_npv
- id: finance-C-041
when: When building the NPV cube for simulation exposure calculation
action: set T0 values first using setT0() before setting other scenario values using set() method
severity: high
kind: architecture_guardrail
modality: must
consequence: Incorrect ordering of T0 vs scenario value setting in NPVCube causes undefined behavior or incorrect exposure
calculations in downstream XVA computations
stage_ids:
- pricing_npv
- id: finance-C-042
when: When retrieving NPV reports from AnalyticsManager
action: use getReport() method with the exact report name to retrieve in-memory reports after analytics execution
severity: high
kind: architecture_guardrail
modality: must
consequence: Incorrect report retrieval causes null pointer exceptions or empty data, breaking post-processing workflows
that depend on NPV report data
stage_ids:
- pricing_npv
- id: finance-C-045
when: When presenting NPV calculation results from backtesting
action: claim that backtest NPV results represent expected live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting backtest NPV as indicative of live trading performance creates misleading expectations, ignoring
execution costs, slippage, and market impact that differ between simulation and reality
stage_ids:
- pricing_npv
- id: finance-C-046
when: When handling pricing failures for individual trades in portfolio
action: catch exceptions from trade->instrument()->NPV() and write Null<Real>() values to report with structured error
logging
severity: high
kind: architecture_guardrail
modality: must
consequence: Uncaught pricing exceptions cause portfolio valuation to abort entirely, losing all NPV results instead of
isolating the failing trade
stage_ids:
- pricing_npv
- id: finance-C-047
when: When configuring the NPV report output format
action: 'include #TradeId column matching portfolio trade IDs and Maturity column for each trade entry'
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing or incorrectly named ID columns prevent downstream systems from matching NPV reports to specific
trades, breaking trade-level risk attribution
stage_ids:
- pricing_npv
- id: finance-C-048
when: When using EngineBuilder caching for performance optimization
action: reset cached engines when market data changes to prevent stale pricing using outdated curves
severity: high
kind: resource_boundary
modality: must
consequence: Cached engines with stale market data produce incorrect NPV values that do not reflect current market conditions,
causing systematic mispricing
stage_ids:
- pricing_npv
- id: finance-C-054
when: When configuring Monte Carlo simulation samples
action: use SCENARIOGENERATOR_SAMPLES >= 50 for regression testing and >= 1000 for production exposure calculations
severity: high
kind: resource_boundary
modality: must
consequence: Insufficient samples produce statistically unreliable exposure estimates with high variance, leading to incorrect
XVA calculations
stage_ids:
- simulation_exposure
- id: finance-C-055
when: When running exposure simulations in parallel
action: configure EXAMPLES_PARALLEL environment variable to control ThreadPoolExecutor max_workers for resource-constrained
environments
severity: medium
kind: resource_boundary
modality: should
consequence: Uncontrolled parallel execution may exhaust system resources in CI environments or exceed available memory
for large portfolios
stage_ids:
- simulation_exposure
- id: finance-C-056
when: When configuring the simulation engine for exposure calculation
action: select simulation_engine from supported variants (classic or AMC) as defined in ore.xml analytics configuration
severity: high
kind: resource_boundary
modality: must
consequence: Wrong simulation engine selection produces inconsistent exposure profiles between trade types, particularly
for exotic instruments requiring AMC
stage_ids:
- simulation_exposure
- id: finance-C-057
when: When running exposure simulations for post-processing analysis
action: write scenario files to disk via scenariodump configuration to enable scenario reuse without re-running expensive
Monte Carlo
severity: medium
kind: operational_lesson
modality: should
consequence: Re-running Monte Carlo for every post-processing change wastes computation time; scenario caching enables
efficient SIMM/DIM comparison
stage_ids:
- simulation_exposure
- id: finance-C-058
when: When configuring random number generation for simulation
action: use a non-zero Seed value to ensure reproducible simulation results and regression test compatibility
severity: high
kind: architecture_guardrail
modality: must
consequence: Zero seed produces unpredictable simulation paths, breaking regression testing and preventing comparison
against expected outputs
stage_ids:
- simulation_exposure
- id: finance-C-059
when: When computing systemic risk metrics from netting set exposures
action: build FinancialSystem counterparty graph via NetworkX using NettingSetId relationships from XVA output
severity: medium
kind: architecture_guardrail
modality: must
consequence: Systemic risk analysis produces incorrect rho+ and rho- metrics when netting set relationships are not properly
extracted from XVA output
stage_ids:
- simulation_exposure
- id: finance-C-060
when: When validating simulation input consistency
action: verify every trade has a correctly mirrored counterpart trade with matching trade IDs and opposite payer/receiver
orientation
severity: high
kind: architecture_guardrail
modality: must
consequence: Unmirrored trades in the portfolio produce incorrect netting set aggregations and, in turn, incorrect
exposure calculations
stage_ids:
- simulation_exposure
- id: finance-C-061
when: When reporting Monte Carlo simulation exposure results
action: claim that simulation-derived EPE/ENE values represent actual expected live trading exposure
severity: high
kind: claim_boundary
modality: must_not
consequence: Monte Carlo simulation uses stochastic modeling with idealized assumptions; presenting simulated exposures
as guaranteed future values misleads risk stakeholders
stage_ids:
- simulation_exposure
- id: finance-C-062
when: When presenting NPV cube results
action: present multi-dimensional NPV cube outputs as direct representations of actual portfolio values
severity: medium
kind: claim_boundary
modality: must_not
consequence: NPV cube contains simulated valuations across scenarios that may not reflect actual market conditions at
future dates; presenting these as real valuations misrepresents risk
stage_ids:
- simulation_exposure
- id: finance-C-063
when: When configuring scenario output for post-simulation analysis
action: verify scenario_output setting matches the intended analysis workflow, either writing scenario files to disk or
using in-memory processing
severity: medium
kind: resource_boundary
modality: must
consequence: Incorrect scenario_output configuration leads to either unnecessary disk I/O for memory-resident analysis
or failed post-processing when scenarios are not persisted
stage_ids:
- simulation_exposure
- id: finance-C-070
when: When calculating SIMM on simulated market scenarios
action: discount InitialMargin by numeraire before aggregating across simulation dates
severity: high
kind: domain_rule
modality: must
consequence: Undiscounted SIMM values cause incorrect IM forecasting due to inconsistent discounting across time periods
stage_ids:
- xva_calculation
- id: finance-C-071
when: When filtering samples in XVA calculation
action: use filterSample parameter consistently across netCubeFilter, rawCubeFilter, and SIMM cube generation
severity: high
kind: architecture_guardrail
modality: must
consequence: Inconsistent sample filtering causes mismatched data between NPV aggregation and SIMM calculation
stage_ids:
- xva_calculation
- id: finance-C-072
when: When running SIMM calculation post-simulation
action: use scenarioToMarket to convert simulated market scenarios into ORE market data format
severity: high
kind: architecture_guardrail
modality: must
consequence: Without proper scenario conversion, SIMM cannot be computed on simulated market paths, preventing IM forecasting
analysis
stage_ids:
- xva_calculation
- id: finance-C-073
when: When executing SA-CCR and CPM as separate analytic runs
action: run SA-CCR and Credit Portfolio Model as independent modules for modular regulatory capital calculation
severity: medium
kind: architecture_guardrail
modality: should
consequence: Coupling SA-CCR with CPM causes inability to run independent capital calculations and methodology comparison
stage_ids:
- xva_calculation
- id: finance-C-074
when: When aggregating trade-level NPVs to netting-set level
action: filter by both nettingSetId and depth==0 to ensure only direct positions are included in aggregation
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing nettingSetId filter causes NPVs from all netting sets to be aggregated together, corrupting counterparty-level
risk calculations
stage_ids:
- xva_calculation
- id: finance-C-075
when: When configuring XVA analysis with multiple runs
action: use different ORE XML configurations for sensitivity analysis without modifying instrument-level settings
severity: medium
kind: operational_lesson
modality: must
consequence: Re-running with same config prevents sensitivity analysis and stress testing capabilities
stage_ids:
- xva_calculation
- id: finance-C-076
when: When computing SA-CVA as post-processor for XVA sensitivity
action: verify CVA sensitivities are computed before SA-CVA calculation or loaded from pre-computed file
severity: high
kind: resource_boundary
modality: must
consequence: SA-CVA calculation without input sensitivities fails or produces zero capital figures
stage_ids:
- xva_calculation
- id: finance-C-077
when: When computing BA-CVA using SA-CCR internal results
action: run SA-CCR analytic first to produce EAD values required for BA-CVA calculation
severity: high
kind: resource_boundary
modality: must
consequence: BA-CVA calculation fails without SA-CCR EAD input, preventing basic approach CVA capital computation
stage_ids:
- xva_calculation
- id: finance-C-078
when: When computing XVA under stressed market conditions
action: apply separate shift scenarios (parallel curve shifts) to revalue portfolio and compute stressed XVA
severity: medium
kind: resource_boundary
modality: must
consequence: Without stress testing, portfolio XVA undervalues tail risk and fails regulatory stress test requirements
stage_ids:
- xva_calculation
- id: finance-C-079
when: When using ORE for XVA calculations in production
action: claim real-time trading capability based on backtest simulation results
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting simulated XVA results as live trading proof misleads stakeholders about actual counterparty risk
in real portfolios
stage_ids:
- xva_calculation
- id: finance-C-080
when: When presenting SA-CCR capital figures from simulation
action: equate simulated EAD values with actual regulatory capital requirements without model validation
severity: high
kind: claim_boundary
modality: must_not
consequence: Simulated SA-CCR capital figures may differ from actual regulatory calculations due to model approximations,
requiring validation against exchange or internal model requirements
stage_ids:
- xva_calculation
- id: finance-C-081
when: When validating XVA calculations against regulatory formulas
action: compare SA-CCR output against external regulatory calculator for same inputs before using in compliance reporting
severity: high
kind: operational_lesson
modality: must
consequence: Using unvalidated SA-CCR implementation for regulatory capital reporting risks non-compliance with Basel
CCR requirements
stage_ids:
- xva_calculation
- id: finance-C-082
when: When setting up SIMM calculation for regulatory IM
action: use ISDA SIMM version compatible with current regulatory requirements (2.4, 2.5A, 2.6)
severity: high
kind: resource_boundary
modality: must
consequence: Using outdated SIMM version may not comply with current regulatory margin requirements for OTC derivatives
stage_ids:
- xva_calculation
- id: finance-C-083
when: When computing Dynamic SIMM with regression-based DIM
action: verify simulation configuration (Euler discretization, time steps per year, date grid) matches between brute-force
and AMCCG calculation
severity: high
kind: operational_lesson
modality: must
consequence: Mismatched simulation configurations cause incorrect SIMM evolution benchmarking and validation failures
stage_ids:
- xva_calculation
- id: finance-C-084
when: When parsing SIMM cube with depth levels
action: map margin type (All, Delta, Vega, Curvature) to corresponding depth integer values (0, 1, 2, 3)
severity: high
kind: domain_rule
modality: must
consequence: Incorrect depth mapping causes wrong SIMM component to be selected, corrupting delta/vega/curvature breakdown
stage_ids:
- xva_calculation
- id: finance-C-085
when: When implementing XVA P&L Explain
action: compute XVA change between two evaluation dates using full revaluation rather than sensitivity-based approximation
severity: medium
kind: architecture_guardrail
modality: must
consequence: Sensitivity-based XVA P&L Explain produces incorrect attribution by using linear approximation instead of
full revaluation
stage_ids:
- xva_calculation
- id: finance-C-086
when: When processing scenario data for SIMM calculation
action: extract reference dates from scenario CSV and convert aggregation data to fixing file format
severity: high
kind: operational_lesson
modality: must
consequence: Missing scenario date extraction causes SIMM calculation to fail on implied market scenarios
stage_ids:
- xva_calculation
- id: finance-C-087
when: When running multiple XVA examples in parallel
action: configure EXAMPLES_PARALLEL environment variable or limit to 1 to avoid output file conflicts
severity: medium
kind: resource_boundary
modality: must
consequence: Parallel execution without proper configuration causes output file corruption and incorrect XVA results
stage_ids:
- xva_calculation
- id: finance-C-090
when: When performing regression testing on Monte Carlo output files
action: Configure tolerance parameters (abs_tol or rel_tol) for numerical columns
severity: high
kind: domain_rule
modality: must
consequence: Floating-point comparison fails due to minor numerical differences in stochastic simulation results, causing
false test failures despite valid output
stage_ids:
- post_processing
- id: finance-C-091
when: When using OutputFiles.from_folder() for file discovery
action: Verify output folder contains files with .csv extension only
severity: medium
kind: resource_boundary
modality: must
consequence: Only files ending with '.csv' are discovered; XML, JSON, or binary output files are silently ignored, causing
incomplete test coverage
stage_ids:
- post_processing
- id: finance-C-092
when: When generating plots with OutputPlotter
action: Set plot format to .pdf (currently hardcoded at line 37)
severity: low
kind: resource_boundary
modality: must
consequence: Plots are always saved as PDF regardless of user preference; cannot generate PNG or SVG formats for web publishing
or reports requiring raster graphics
stage_ids:
- post_processing
- id: finance-C-093
when: When initializing OutputPlotter for plot generation
action: Call parse_output() before accessing self.plots or self.output.csv
severity: high
kind: architecture_guardrail
modality: must
consequence: self.output and self.plots are initialized as None, so accessing them before parse_output() populates them
raises AttributeError
stage_ids:
- post_processing
- id: finance-C-094
when: When executing compare_files() on CSV file pairs
action: Provide comparison configuration specifying keys for DataFrame join
severity: high
kind: architecture_guardrail
modality: must
consequence: Without config, comparison falls back to join_columns=None which causes datacompy.Compare() to fail, returning
incorrect comparison results
stage_ids:
- post_processing
- id: finance-C-095
when: When loading ORE output for post-processing analysis
action: Access CSV data through OutputFiles.csv dictionary keys derived from file names
severity: medium
kind: architecture_guardrail
modality: must
consequence: Using an incorrect key to access CSV data raises KeyError; keys are file names without .csv extension (e.g.,
'npv', 'xva', 'exposure_nettingset_CPTY_A')
stage_ids:
- post_processing
- id: finance-C-096
when: When presenting regression test results to stakeholders
action: Claim that backtest output equals expected live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting simulation results as guaranteed live trading outcomes violates risk management best practices
and misleads decision-makers about actual expected performance
stage_ids:
- post_processing
- id: finance-C-097
when: When configuring comparison tolerance for regression testing
action: Set tolerance values too loose to mask genuine numerical regressions
severity: high
kind: operational_lesson
modality: must_not
consequence: Setting rel_tol above 1e-6 or abs_tol above 0.1 for critical exposure metrics hides genuine numerical regressions
in Monte Carlo simulations
stage_ids:
- post_processing
- id: finance-C-098
when: When discovering output files with OutputFiles.from_folder()
action: Assume files are returned in any particular order
severity: medium
kind: operational_lesson
modality: must_not
consequence: os.listdir() returns files in arbitrary OS-dependent order; relying on order for sequential processing causes
inconsistent analysis results
stage_ids:
- post_processing
- id: finance-C-100
when: When running regression tests on examples
action: Run with full simulation samples instead of OVERWRITE_SCENARIOGENERATOR_SAMPLES=50 for non-exempt examples
severity: medium
kind: domain_rule
modality: must_not
consequence: Non-exempt examples consume excessive CI time with full simulation, causing delayed feedback and resource
waste
stage_ids:
- regression_testing
- id: finance-C-101
when: When comparing numerical values in regression tests
action: Apply both absolute tolerance (abs_tol) and relative tolerance (rel_tol) checks for floating-point comparisons
severity: high
kind: domain_rule
modality: must
consequence: Numerical values that differ slightly due to floating-point precision are incorrectly flagged as failures,
causing test flakiness across platforms
stage_ids:
- regression_testing
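# Sketch of a combined tolerance check for the rule above. This is the form used by Python's math.isclose;
# whether the ORE comparison tooling uses exactly this form is an assumption:
#   abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
# Example with illustrative values: a = 100.0, b = 100.00005, rel_tol = 1e-6, abs_tol = 0 gives a threshold of
# about 1e-4 against a difference of about 5e-5, so the values match. For a = 0 and b = 1e-9 the relative
# threshold is 1e-15, so the pair only matches when abs_tol is at least 1e-9.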
- id: finance-C-102
when: When configuring test-specific comparison settings
action: Merge test-specific comparison_config.json with the default configuration file
severity: high
kind: architecture_guardrail
modality: must
consequence: Test-specific tolerances and exclusions are ignored, causing incorrect pass/fail signals for examples with
known acceptable variations
stage_ids:
- regression_testing
- id: finance-C-103
when: When executing regression test suite
action: Use deterministic sample count by exempting performance-sensitive examples from OVERWRITE_SCENARIOGENERATOR_SAMPLES
override
severity: medium
kind: architecture_guardrail
modality: must
consequence: Performance examples produce non-reproducible results due to reduced sample count, causing inconsistent pass/fail
signals between runs
stage_ids:
- regression_testing
- id: finance-C-104
when: When defining column-level comparison tolerances
action: Specify column-specific tolerances in comparison_config.json column_settings array
severity: medium
kind: domain_rule
modality: must
consequence: Column groups without explicit tolerances use default comparison, causing false failures for numerically-sensitive
columns like IM amounts and SIMM data
stage_ids:
- regression_testing
- id: finance-C-105
when: When running regression tests in parallel
action: Produce identical pass/fail results to sequential execution for the same codebase
severity: high
kind: architecture_guardrail
modality: must
consequence: Parallel execution introduces race conditions or non-determinism, causing flaky test results that vary between
CI runs
stage_ids:
- regression_testing
- id: finance-C-106
when: When generating tests for new examples
action: Automatically register tests through add_utest() without manual test code modification
severity: medium
kind: operational_lesson
modality: must
consequence: Manual test registration creates maintenance burden and causes new examples to be missed in regression suite
stage_ids:
- regression_testing
- id: finance-C-107
when: When configuring file comparison keys
action: Use duplicate key names in the keys array for CSV comparison
severity: high
kind: domain_rule
modality: must_not
consequence: Duplicate keys cause incorrect DataFrame joins and produce misleading comparison results
stage_ids:
- regression_testing
- id: finance-C-108
when: When defining per-example comparison configuration
action: Place comparison_config.json in the example test directory for example-specific overrides
severity: medium
kind: operational_lesson
modality: must
consequence: Known acceptable variations are not excluded, causing intentional differences to fail regression tests
stage_ids:
- regression_testing
- id: finance-C-109
when: When running regression test suite on baseline platform
action: Achieve test pass rate of at least 95% with known failures documented in comparison_config
severity: high
kind: operational_lesson
modality: must
consequence: Low pass rate indicates fundamental platform incompatibility or regression in core functionality
stage_ids:
- regression_testing
- id: finance-C-110
when: When a test comparison fails
action: Produce diff files with column-level granularity showing exact value differences
severity: high
kind: architecture_guardrail
modality: must
consequence: Without column-level diffs, developers cannot quickly identify which columns caused test failures
stage_ids:
- regression_testing
- id: finance-C-111
when: When defining JSON file comparison paths
action: Include parent key path in nested key specifications using forward slash separator
severity: high
kind: domain_rule
modality: must
consequence: Nested keys are not matched correctly, causing diffs to be incorrectly ignored or flagged
stage_ids:
- regression_testing
- id: finance-C-112
when: When using optional columns in comparison configuration
action: Verify optional columns exist in both DataFrames or are absent from both
severity: medium
kind: domain_rule
modality: must
consequence: Columns present in one file but not the other cause comparison to fail even when optional_cols behavior is
desired
stage_ids:
- regression_testing
- id: finance-C-113
when: When configuring tolerance for regression testing
action: Accept that small numerical differences are expected across platforms and use specified tolerances
severity: medium
kind: claim_boundary
modality: must
consequence: Overly strict tolerances cause cross-platform CI failures, misrepresenting stable code as regressed
stage_ids:
- regression_testing
- id: finance-C-114
when: When setting up parallel test execution
action: Configure pytest -n parameter appropriately for available CPU cores without exceeding resource limits
severity: low
kind: resource_boundary
modality: must
consequence: Excessive parallelism causes resource exhaustion, while insufficient parallelism increases test duration
stage_ids:
- regression_testing
- id: finance-C-119
when: When generating XVA reports for comparison against baseline
action: Use the same CSV quote character and null string conventions as the baseline ExpectedOutput
severity: high
kind: domain_rule
modality: must
consequence: Mismatched CSV formatting causes regression test comparison failures even when numeric values are correct
stage_ids:
- post_processing
- regression_testing
- id: finance-C-122
when: When writing trade exposure CSVs to post-processing
action: Include the Date column and all EPE/ENE profile columns as defined in ReportWriter::writeTradeExposures
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing columns cause visualization and analysis tools to fail, preventing proper risk assessment
stage_ids:
- simulation_exposure
- post_processing
- id: finance-C-123
when: When transferring XVA results for post-processing
action: Include both trade-level and netting-set-level CVA/DVA/FVA values in the XVA report
severity: high
kind: domain_rule
modality: must
consequence: Missing granularity prevents proper risk attribution and allocation analysis required for regulatory reporting
stage_ids:
- xva_calculation
- post_processing
- id: finance-C-124
when: When comparing output files during regression testing
action: Apply the tolerance thresholds defined in comparison_config.json for each output file type
severity: high
kind: resource_boundary
modality: must
consequence: Using default tolerances instead of file-specific tolerances causes false test failures or missed regressions
in critical output files
stage_ids:
- post_processing
- regression_testing
- id: finance-C-126
when: When computing credit migration files for visualization
action: Generate credit migration output with pdf and cdf columns for each requested time step
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing credit migration distributions prevent analysis of credit migration risk and regulatory stress testing
stage_ids:
- xva_calculation
- post_processing
- id: finance-C-128
when: When writing netting set exposure CSVs
action: Include EPE, ENE, PFE, and EEPE columns for each netting set ID
severity: high
kind: domain_rule
modality: must
consequence: Incomplete exposure columns prevent proper netting set risk analysis and regulatory reporting
stage_ids:
- simulation_exposure
- post_processing
- id: finance-C-129
when: When running regression tests
action: Skip comparison of files that are in the ExpectedOutput directory
severity: high
kind: claim_boundary
modality: must_not
consequence: Skipping file comparison masks output inconsistencies and prevents detection of silent result changes
stage_ids:
- post_processing
- regression_testing
- id: finance-C-131
when: When processing margin evolution CSVs for post-processing
action: Preserve the chronological ordering of margin schedule dates
severity: high
kind: domain_rule
modality: must
consequence: Out-of-order margin dates cause incorrect IM evolution tracking and wrong MVA calculations
stage_ids:
- xva_calculation
- post_processing
- id: finance-C-133
when: When writing scenario dump files for SIMM calculation
action: Include every risk factor key that is simulated in the scenario generator configuration
severity: high
kind: domain_rule
modality: must
consequence: Missing risk factors in the scenario dump cause incomplete SIMM calculation and incorrect regulatory capital
requirements
stage_ids:
- simulation_exposure
- xva_calculation
- id: finance-C-134
when: When running regression testing after code changes
action: Assume that floating-point tolerance values from comparison_config.json are sufficient for every numeric column
severity: medium
kind: rationalization_guard
modality: must_not
consequence: Using generic tolerances instead of column-specific tolerances causes false passes on critical columns or
false failures on noise columns
stage_ids:
- post_processing
- regression_testing
- id: finance-C-135
when: When parsing CSV output files produced by ORE reports
action: Use the '#Id' column for trade identifiers and the 'Date' column for timestamps; all CSV output files follow this convention
severity: high
kind: domain_rule
modality: must
consequence: Custom parsers fail to read ORE output files because they look for incorrect column names, causing data extraction
errors in downstream analysis
- id: finance-C-136
when: When executing ORE from example directories on Linux/Unix systems
action: Locate the ORE executable at ../../build/App/ore relative to the example directory (with fallback paths for ../../../build/App/ore
and ../../App/build/ore)
severity: high
kind: architecture_guardrail
modality: must
consequence: ORE examples fail to execute because the ore executable cannot be found, preventing risk analytics workflows
from running
- id: finance-C-137
when: When testing ORE example configurations without executing the full risk engine
action: Set the dry=True flag on the OreExample class to skip actual ORE execution while still validating all other setup and parsing
steps
severity: medium
kind: architecture_guardrail
modality: must
consequence: Dry-run tests attempt actual ORE execution instead of validating configuration, causing slow test runs and
unnecessary computation
- id: finance-C-138
when: When running ORE example test suite for regression testing
action: Use pytest with -n 16 flag to enable 16-way parallelism via pytest-xdist for faster execution across the ~80 example
configurations
severity: medium
kind: operational_lesson
modality: must
consequence: Test suite runs sequentially, taking up to 16x longer than necessary and delaying feedback on regressions in risk analytics
functionality
- id: finance-C-139
when: When configuring scenario generator samples for fast regression testing
action: Set SCENARIOGENERATOR_SAMPLES=50 for fast regression testing, while production runs should use 1000+ samples for
statistical significance
severity: high
kind: operational_lesson
modality: must
consequence: Fast regression tests produce statistically insignificant results that may mask real issues in exposure simulation
and XVA calculations
- id: finance-C-140
when: When parsing market datum quotes for correlation, rates, and other instruments
action: Use TYPE/NAME/EXPIRY/STRIKE format (e.g., 'CORRELATION/RATE/INDEX1/INDEX2/1Y/ATM') for each market quote according
to the parsing convention
severity: high
kind: domain_rule
modality: must
consequence: Market data loader fails to parse quotes, causing incomplete term structure construction and incorrect pricing
across derivatives portfolios
- id: finance-C-141
when: When using OreBasic Python wrapper for ORE configurations
action: Call parse_input() explicitly in constructor for input parsing, but call parse_output() only after run() completes
— parse_output() is NOT called automatically in __init__
severity: high
kind: architecture_guardrail
modality: must
consequence: Attempting to access output data before running ORE returns empty results because output files do not exist
yet, leading to incorrect analysis
- id: finance-C-143
when: When overriding EngineFactory model or engine parameters after construction
action: Call setModelParameterOverrides() or setEngineParameterOverrides() which sets buildersDirty_=true, then trigger
resetBuilders() before using the factory — initial state has dirty=true
severity: high
kind: architecture_guardrail
modality: must
consequence: Parameter overrides are ignored because buildersDirty_ remains true but resetBuilders() is only called lazily
on next builder() call, causing stale pricing engines to be used
- id: finance-C-144
when: When validating ShiftHorizon parameter across different model builders
action: Verify ShiftHorizon is non-negative (>=0) — LGMBuilder and CrLGMBuilder throw exceptions for negative values,
while InfJyBuilder issues warnings only
severity: medium
kind: domain_rule
modality: must
consequence: Inconsistent ShiftHorizon validation causes some models to fail silently (InfJyBuilder warning) while others
throw hard errors, leading to model-specific bugs in risk simulations
- id: finance-C-146
when: When running multi-threaded valuation in ORE
action: Configure nThreads parameter in ore.xml — ORE's multi-threaded engine works around QuantLib's thread-safety limitations
by splitting portfolio into parts processed by separate threads
severity: medium
kind: resource_boundary
modality: must
consequence: Valuation runs on single thread despite available CPU cores, extending execution time unnecessarily for large
portfolios with many trades
- id: finance-C-147
when: When using GPU acceleration for exposure simulation
action: Claim production-ready GPU speedup for exposure calculations — GPU implementation for conditional expectation
calculations is work-in-progress with no current speedup vs CPU benchmark
severity: high
kind: resource_boundary
modality: must_not
consequence: Users are promised GPU acceleration benefits that do not materialize because the dominant conditional expectation
calculations still run on CPU, leading to disappointment and misallocated computational resources
- id: finance-C-148
when: When presenting ORE's computational performance capabilities
action: Claim production-ready AAD for CVA sensitivities — AAD implementation for CVA sensitivities is described as proof-of-concept
in the documentation
severity: high
kind: claim_boundary
modality: must_not
consequence: Users build production workflows on AAD-based CVA sensitivity capabilities that have not been validated for
production use, risking incorrect regulatory capital calculations
- id: finance-C-149
when: When presenting or reporting ORE's pricing and risk analytics capabilities
action: Claim support for high-frequency trading scenarios requiring sub-millisecond latency — ORE is a batch-oriented
risk engine designed for end-of-day and periodic risk calculations
severity: high
kind: claim_boundary
modality: must_not
consequence: Users build real-time trading systems expecting sub-millisecond responses that ORE cannot provide, leading
to critical failures in latency-sensitive trading environments
- id: finance-C-150
when: When presenting or reporting ORE's pricing and risk analytics capabilities
action: Claim support for real-time streaming market data integration — ORE uses batch-oriented architecture with CSV
and XML file-based market data loading
severity: high
kind: claim_boundary
modality: must_not
consequence: Users build real-time market data pipelines expecting live streaming quotes that ORE cannot consume, causing
complete system architecture failures
- id: finance-C-151
when: When presenting or reporting ORE's pricing and risk analytics capabilities
action: Claim support for non-derivative portfolios (bonds, equities) without trade wrapper conversion — ORE requires
trades to be wrapped as derivative instruments for risk analysis
severity: medium
kind: claim_boundary
modality: must_not
consequence: Users attempt to analyze plain bonds and equity positions directly in ORE, receiving errors or incorrect
risk measures due to missing trade wrapper conversion
- id: finance-C-152
when: When presenting or reporting ORE's pricing and risk analytics capabilities
action: Claim support for simple single-instrument pricing without risk simulation — ORE is designed for portfolio-level
XVA and risk simulation, not standalone trade pricing
severity: medium
kind: claim_boundary
modality: must_not
consequence: Users deploy ORE for simple pricing tasks better suited to QuantLib directly, incurring unnecessary complexity
and computational overhead while missing features available in simpler tools
- id: finance-C-154
when: When running ORE multiple times in the same process
action: Verify singleton cleanup via CleanUpThreadLocalSingletons and CleanUpThreadGlobalSingletons is executed between
runs to reset IndexManager, DividendManager, ObservationMode, ComputeEnvironment, and other thread-local/global state
severity: high
kind: architecture_guardrail
modality: must
consequence: Stale singleton state from previous ORE runs causes incorrect pricing, corrupted scenario data, and memory
leaks in long-running applications
- id: finance-C-156
when: When implementing pricing engine configuration in trade workflows
action: Set pricing engine on trade object before any pricing/NPV calculation; engine must be set after trade construction
but before valuation calls
severity: high
kind: domain_rule
modality: must
consequence: Pricing without an explicitly set engine may use default model assumptions that differ from intended strategy;
model comparison workflows produce incorrect results if engine configuration order is violated, leading to wrong risk
assessments
derived_from_bd_id: BD-006
- id: finance-C-157
when: When validating shiftHorizon parameter across model builder implementations
action: Validate shiftHorizon >= 0 consistently across all model builders, including InfJyBuilder; reject negative values
with an error instead of a warning for uniform runtime behavior
severity: high
kind: domain_rule
modality: must
consequence: InfJyBuilder accepts negative shiftHorizon with a warning while other model builders reject it outright,
causing inconsistent validation results in multi-model workflows where users expect uniform behavior for identical parameter
values
derived_from_bd_id: BD-055
- id: finance-C-158
when: When implementing shift horizon configuration across LGM, CR-LGM, and Inflation model families
action: Assume uniform shift handling when switching between LGM, CR-LGM, and Inflation models; shift is applied via -parametrization->H(shiftHorizon)
in LGM but as direct value in CR-LGM and Inflation models
severity: medium
kind: claim_boundary
modality: must_not
consequence: Assuming uniform shift application across model families leads to incorrect parameter interpretation; the
same shiftHorizon value produces different model behavior depending on which model family is selected, causing silent
pricing errors in multi-model portfolios
derived_from_bd_id: BD-052
- id: finance-C-159
when: When documenting or implementing multi-model workflows using LGM, CR-LGM, and Inflation models
action: 'Document each model''s shift application method explicitly: LGM uses -parametrization->H(shiftHorizon) transformation,
CR-LGM and Inflation models use direct value assignment; provide migration guide for users switching between model families'
severity: medium
kind: domain_rule
modality: must
consequence: Without explicit documentation, extension developers implement custom models with incorrect assumptions about
shift handling, leading to incompatible model implementations that cannot be compared or hedged correctly across model
families
derived_from_bd_id: BD-052
- id: finance-C-160
when: When implementing or modifying ORE executable discovery logic in ore_examples_helper.py
action: Maintain the hardcoded path pattern search order with Windows paths (lines 77-113) checked before Unix paths (lines
114-141); first matching pattern wins; verify ORE executables are placed in one of the supported installation locations
severity: high
kind: domain_rule
modality: must
consequence: ORE executable discovery fails silently when executable is not found in any of the 15+ hardcoded path patterns;
in mixed Windows/Unix environments, searching wrong path order first causes unnecessary filesystem scans before finding
the executable
derived_from_bd_id: BD-053
- id: finance-C-161
when: When implementing regression tests for scenario generation
action: Maintain OVERWRITE_SCENARIOGENERATOR_SAMPLES=50 for regression test execution — do not increase to production
values (1000+) as this would make CI prohibitively slow, and do not decrease below 50 as this would lose statistical
significance for algorithmic regression detection
severity: high
kind: domain_rule
modality: must
consequence: Increasing sample count to production levels causes regression test suite to exceed CI time limits, blocking
deployment pipelines; decreasing below 50 causes regression tests to fail to detect algorithmic regressions, allowing
defective code to reach production
derived_from_bd_id: BD-019
- id: finance-C-162
when: When implementing scenario generation with periodic risk factor patterns
action: Use Fourier basis expansion with order=40 for capturing seasonal volatility cycles and intra-year correlation
shifts — do not change to lower orders without re-validating frequency resolution for daily data
severity: high
kind: architecture_guardrail
modality: must
consequence: Using polynomial basis instead of Fourier order 40 causes extrapolation instability in scenario generation,
producing invalid risk factor paths for seasonal patterns and corrupting P&L attribution accuracy
derived_from_bd_id: BD-025
- id: finance-C-163
when: When implementing PCA-based scenario generation for yield curve movements
action: Apply PCA to yield curve using 252-day rolling window annualized covariance matching Basel market risk horizon,
and retain components explaining at least 95% of variance
severity: high
kind: domain_rule
modality: must
consequence: Retaining fewer PCA components than the 95% variance threshold requires discards material yield curve dynamics and
degrades scenario accuracy, leading to regulatory capital miscalculation and potential Basel compliance violations
derived_from_bd_id: BD-027
- id: finance-C-164
when: When implementing discount factor to zero rate conversion for yield curve building
action: Apply log-linear discount-to-zero-rate conversion for maturities exceeding 10 years; long-dated instruments with high
curvature call for an alternative interpolation method (cubic spline) instead
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Using log-linear approximation for maturities beyond 10 years causes accuracy degradation that introduces
material errors in CSA-based exposure calculations, resulting in incorrect XVA valuations and potential regulatory capital
miscalculations
derived_from_bd_id: BD-028
- id: finance-C-165
when: When implementing Black variance surface for FX options pricing
action: Parameterize Black variance surface by moneyness (not strike) to capture volatility smile consistent with FX market
practice — interpolation in moneyness space better captures smile dynamics
severity: high
kind: domain_rule
modality: must
consequence: Using strike-based parameterization instead of moneyness causes interpolation artifacts at boundaries that
corrupt FX option pricing, leading to systematic mispricing in CVA calculations and incorrect hedging decisions
derived_from_bd_id: BD-043
- id: finance-C-166
when: When implementing Heston model pricing using the COS method
action: Use N=200 terms in the COS series expansion to achieve accuracy within 0.01% of quasi-exact values required for
XVA grid computations
severity: high
kind: domain_rule
modality: must
consequence: Reducing N below 200 terms compromises the 0.01% accuracy threshold needed for millions of Heston evaluations
in XVA grid computations, causing systematic pricing errors that accumulate across the portfolio
derived_from_bd_id: BD-032
- id: finance-C-167
when: When implementing binomial tree option pricing for model uncertainty measurement
action: Use at least two distinct binomial tree algorithms (Cox-Ross-Rubinstein, Jarrow-Rudd, or Leisen-Reimer) to enable
cross-validation and model risk quantification under FRTB IMA requirements
severity: high
kind: architecture_guardrail
modality: must
consequence: Using only a single binomial tree algorithm eliminates the ability to quantify model uncertainty, violating
FRTB IMA requirements and potentially misestimating risk for regulatory capital calculations
derived_from_bd_id: BD-034
- id: finance-C-168
when: When configuring Gaussian1dSwaptionEngine for GSR model discretization
action: Use exactly 64 time steps and a 7.0 standard deviation cutoff for swaption pricing discretization to ensure convergence
across swaption maturities and coverage of 99.99% of the distribution
severity: high
kind: domain_rule
modality: must
consequence: Using fewer than 64 time steps causes insufficient convergence on long-dated swaptions, leading to inaccurate
calibrated values and incorrect FRTB IMA risk calculations
derived_from_bd_id: BD-039
- id: finance-C-169
when: When calibrating FX smile for derivative pricing
action: Use Vanna-Volga method with three liquid strikes (25 delta put, ATM, 25 delta call premium) for FX smile calibration
to capture correlation between FX spot and volatility dynamics
severity: high
kind: domain_rule
modality: must
consequence: Replacing Vanna-Volga with simpler models like SABR loses accuracy for correlation-sensitive FX derivatives
such as barrier options, systematically mispricing the portfolio
derived_from_bd_id: BD-044
- id: finance-C-170
when: When calculating time deltas for initial margin calculations under SIMM
action: Use the ACT/365.25 day count convention consistently across all currency pairs for SIMM compliance and multi-currency
portfolio margin accuracy
severity: high
kind: domain_rule
modality: must
consequence: Using ACT/360 or 30/360 breaks SIMM compliance for USD LIBOR legacy trades, causing incorrect margin calculations
that may result in regulatory penalties or insufficient margin buffers
derived_from_bd_id: BD-029
- id: finance-C-172
when: When implementing finite difference solver for Black-Scholes PDE
action: Use exactly 801 time steps and 800 spatial grid points with stability condition dt < (dS^2)/(sigma^2*S^2) to ensure
convergence for early exercise boundary detection
severity: high
kind: domain_rule
modality: must
consequence: Reducing grid resolution below 801x800 violates the stability condition, causing numerical oscillation or
divergence in early exercise boundary detection and incorrect American option pricing
derived_from_bd_id: BD-033
- id: finance-C-173
when: When initializing model builders (LGM, CR-LGM, InfDK, InfJY) for pricing discretization
action: Explicitly set shiftHorizon to a non-zero calibrated value during model initialization when controlling pricing
discretization is required — do not rely on implicit defaults
severity: medium
kind: domain_rule
modality: must
consequence: Without explicit shiftHorizon configuration, pricing accuracy and risk sensitivities are affected for instruments
requiring shift adjustments, causing incorrect FRTB regulatory capital calculations
derived_from_bd_id: BD-046
- id: finance-C-174
when: When managing builder state via EngineFactory for repeated pricing calls
action: Set the buildersDirty_ flag to true whenever configuration changes so that stale builder state is refreshed
during the lazy reset; do not assume builders automatically reflect configuration updates
severity: medium
kind: operational_lesson
modality: must
consequence: Failing to set the dirty flag after configuration changes causes the factory to reuse stale builder state,
leading to incorrect pricing results that do not reflect updated market data or model parameters
derived_from_bd_id: BD-049
- id: finance-C-175
when: When processing file lists via FileList._parse() for batch configuration
action: Assume the framework validates configuration completeness when FileList._parse() is called — the method silently
catches FileNotFoundError without propagating exceptions or providing warnings
severity: medium
kind: claim_boundary
modality: must_not
consequence: Missing configuration files are silently excluded from processing, causing incomplete analysis with no error
indication and making configuration errors difficult to diagnose
derived_from_bd_id: BD-054
- id: finance-C-176
when: When parsing file lists for batch processing
action: Implement explicit error handling that raises exceptions or logs warnings with FAILURE level when expected files
are not found, rather than silently continuing with incomplete file lists
severity: medium
kind: operational_lesson
modality: must
consequence: Without explicit error handling, batch processing proceeds with incomplete file lists producing misleading
results that appear valid but lack critical configuration data
derived_from_bd_id: BD-054
- id: finance-C-177
when: When implementing or refactoring OreBasic class usage patterns
action: Call parse_output() explicitly AFTER run() completes, NOT in __init__. Never attempt to parse output before market
data is populated by run()
severity: high
kind: domain_rule
modality: must
consequence: Calling parse_output() before run() will produce empty or incorrect results since output data structures
are not initialized until run() completes, making backtest results unreliable
derived_from_bd_id: BD-047
- id: finance-C-178
when: When implementing TradeBuilder subclasses for engine factory operations
action: Keep TradeBuilder instances completely stateless; constructor must return shared_ptr without any internal state
that persists between build requests
severity: high
kind: architecture_guardrail
modality: must
consequence: Stateful builders cause non-deterministic trade generation and race conditions in multi-threaded scenarios
where builder instances are reused across EngineFactory calls, corrupting trade output
derived_from_bd_id: BD-048
- id: finance-C-179
when: When configuring volatility type for interest rate model calibration
action: Apply shift() lookup ONLY for ShiftedLognormal volatility type; Normal and Black-Scholes types must NOT apply
shift parameter
severity: high
kind: architecture_guardrail
modality: must
consequence: Applying shift to non-ShiftedLognormal volatility types produces incorrect pricing results; the 4-permutation
pattern must be maintained for each interest rate model calibration
derived_from_bd_id: BD-050
- id: finance-C-180
when: When calibrating interest rate models (e.g., Hull-White) to swaption volatility surface
action: Use Levenberg-Marquardt optimization algorithm combining Gauss-Newton and gradient descent methods; verify initial
parameters are within convergence radius
severity: high
kind: domain_rule
modality: must
consequence: Incorrect optimization algorithm selection causes convergence failures or poor calibration quality, producing
inaccurate risk factors that invalidate regulatory capital calculations
derived_from_bd_id: BD-037
- id: finance-C-181
when: When implementing model calibration optimization loops
action: 'Set explicit termination criteria: max iterations=1000, max stationary state=10, function tolerance=1e-8, gradient
tolerance=1e-8, x-tolerance=1e-8 to ensure reproducibility'
severity: medium
kind: domain_rule
modality: must
consequence: Missing or incorrect termination criteria causes infinite loops or premature convergence, degrading calibration
quality and breaking reproducibility required for audit compliance
derived_from_bd_id: BD-038
- id: finance-C-182
when: When computing implied volatility from market prices for SABR/SVI surface fitting
action: Use Newton-Raphson solver with tolerance 1e-6 and max 1000 iterations; verify initial guess is between 0% and
200%
severity: high
kind: domain_rule
modality: must
consequence: Using bisection instead of Newton-Raphson causes 10x slower convergence; incorrect initial guess or solver
tolerance produces inaccurate implied volatility for deep ITM/OTM options
derived_from_bd_id: BD-040
- id: finance-C-183
when: When selecting calibration instruments for model calibration
action: 'Choose calibration basket mode based on portfolio complexity: Naive for simple portfolios, MaturityStrikeByDeltaGamma
for complex portfolios requiring better vol surface shape capture'
severity: medium
kind: domain_rule
modality: should
consequence: Using Naive basket mode for complex portfolios under-captures vol surface shape, degrading model calibration
quality required for accurate regulatory capital calculation
derived_from_bd_id: BD-041
- id: finance-C-184
when: When processing results for automated downstream systems requiring structured data
action: Use structured JSON output instead of CSV when downstream systems require precise decimal representation or when
processing instruments with high decimal precision
severity: medium
kind: operational_lesson
modality: should
consequence: CSV format causes precision loss for decimal-heavy instruments and breaks automation with systems requiring
structured JSON, potentially corrupting financial calculations in downstream systems
derived_from_bd_id: BD-GAP-004
- id: finance-C-185
when: When initializing model builders with default shiftHorizon parameter across different model families (LG, CR-LGM,
Inflation)
action: Verify that applying shiftHorizon=0.0 produces mathematically consistent results across model families, or explicitly
use model-specific defaults with documented behavior differences
severity: high
kind: operational_lesson
modality: must
consequence: Applying the same shiftHorizon default value (0.0) to different model families produces mathematically inconsistent
risk metrics despite documented consistency intentions, leading to incorrect hedging and capital allocation decisions
derived_from_bd_id: BD-061
- id: finance-C-186
when: When implementing or refactoring plot data loading in post-processing
action: Preserve lazy loading behavior via dict-based interface — do not refactor OutputPlotter to eagerly load each exposure
data into memory at initialization
severity: high
kind: architecture_guardrail
modality: must
consequence: Eagerly loading large Monte Carlo exposure files consumes excessive memory during initialization; for files
exceeding available RAM, this causes out-of-memory errors that prevent any visualization from running, defeating the
purpose of exposure analysis
derived_from_bd_id: BD-016
- id: finance-C-187
when: When using MARS for conditional expectation estimation in risk factor modeling
action: Verify the training window spans at least one full market cycle to enable stable knot selection; using shorter
windows risks unstable coefficient estimation
severity: high
kind: domain_rule
modality: must
consequence: MARS models trained on incomplete market cycles produce unstable knots that cause misestimated conditional
expectations, leading to incorrect risk factor valuations and potential regulatory capital miscalculations
derived_from_bd_id: BD-023
- id: finance-C-188
when: When implementing HNSW approximate nearest neighbor search for basket matching and scenario reduction
action: Maintain recall rate above 95% for regulatory acceptance; configure ef_construction and M parameters to balance
speed against accuracy requirements
severity: high
kind: domain_rule
modality: must
consequence: HNSW implementations with recall rates below 95% cause suboptimal basket matching in calibration, introducing
pricing errors that fail Basel IV SA-CCR regulatory acceptance criteria
derived_from_bd_id: BD-024
- id: finance-C-191
when: When implementing NPL portfolio data processing in the analytics engine
action: Assume the framework automatically validates NPL portfolio EBA field completeness — the framework does not implement
EBA field validation for NPL data; missing fields will cause silent data corruption in downstream analytics
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit EBA field validation, NPL portfolio records with missing required fields propagate through
the analytics pipeline, causing incorrect regulatory reporting and risk calculations
derived_from_bd_id: BD-GAP-014
- id: finance-C-192
when: When processing NPL portfolio data in the analytics engine
action: Implement explicit EBA field validation that checks that every required field is populated before processing, and
fail with an explicit error if any required field is missing or null
severity: high
kind: domain_rule
modality: must
consequence: Without explicit field validation, NPL portfolio records with incomplete EBA fields silently propagate through
the system, corrupting downstream risk calculations and regulatory reports
derived_from_bd_id: BD-GAP-014
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-104 / Dynamic SIMM Exposure Analysis
version: v5.3
intent_keywords:
- initial margin
- SIMM
- collateral
- exposure
- variation margin
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: auto-grouped by UC.type (4 distinct values, balanced distribution)
groups:
- group_id: reporting
name: Reporting
description: ''
emoji: 📋
uc_count: 9
ucs:
- uc_id: UC-101
name: Dynamic SIMM Exposure Analysis
short_description: Analyzes collateralized vs uncollateralized counterparty exposure dynamics for risk management
and margin calculations
sample_triggers:
- initial margin
- SIMM
- collateral
- uc_id: UC-102
name: XVA Valuation and Sensitivity Reporting
short_description: Computes and visualizes XVA metrics including CVA, DVA, FVA, and exposure profiles for OTC derivatives
portfolio
sample_triggers:
- XVA
- CVA
- FVA
- uc_id: UC-106
name: Portfolio NPV Cashflow and Curve Reporting
short_description: Generates comprehensive portfolio reports including NPV, cashflows, and yield curves for trade
valuation
sample_triggers:
- NPV
- cashflow
- curves
- uc_id: UC-107
name: Netting Set Exposure Analysis
short_description: Analyzes counterparty exposure profiles (EPE, ENE) for individual swaps and netting sets to assess
credit risk
sample_triggers:
- exposure
- EPE
- ENE
- uc_id: UC-108
name: XVA Scenario Path Visualization
short_description: Visualizes Monte Carlo simulation paths and NPV cubes for XVA computation, including scenario
data analysis
sample_triggers:
- XVA
- scenario
- simulation
- uc_id: UC-113
name: SIMM Margin Calculation with CRIF Data
short_description: Calculates regulatory margin requirements using SIMM methodology based on CRIF-formatted risk
sensitivities
sample_triggers:
- SIMM
- margin
- CRIF
- uc_id: UC-115
name: Term Structure and Discount Factor Extraction
short_description: Extracts and inspects term structures including discount curves, forward curves, and zero rates
for pricing validation
sample_triggers:
- term structure
- discount curve
- forward curve
- uc_id: UC-118
name: CVA Sensitivity Scenario Analysis
short_description: Computes CVA sensitivities to risk factors and analyzes credit valuation adjustment exposure
profiles
sample_triggers:
- CVA
- sensitivity
- credit
- uc_id: UC-119
name: NPV Cube Visualization Dashboard
short_description: Provides interactive Jupyter dashboard for launching ORE, visualizing NPV cube data, and filtering
by netting sets
sample_triggers:
- dashboard
- NPV cube
- visualization
- group_id: research_analysis
name: Research Analysis
description: ''
emoji: 📦
uc_count: 7
ucs:
- uc_id: UC-103
name: Pricing Sensitivity Method Comparison
short_description: Compares different sensitivity computation methods (bump-and-reval, automatic differentiation,
complex step) for accuracy and performance validation
sample_triggers:
- sensitivity
- AD
- automatic differentiation
- uc_id: UC-105
name: Commodity Forward Valuation Setup
short_description: Demonstrates basic instrument setup and valuation for a commodity forward contract using price
and discount curves
sample_triggers:
- commodity
- forward
- pricing engine
- uc_id: UC-110
name: Scenario Dump Covariance Analysis
short_description: Analyzes simulated scenario data to compute covariance matrices and risk factor correlations
from Monte Carlo output
sample_triggers:
- scenario
- covariance
- correlation
- uc_id: UC-111
name: Custom Payoff and Exposure Analysis
short_description: Visualizes custom payoff functions and analyzes trade-level exposure for exotic instruments like
barrier options
sample_triggers:
- payoff
- barrier option
- custom
- uc_id: UC-112
name: Dynamic Valuation Date Scenario Analysis
short_description: Demonstrates dynamic valuation date changes and interactive scenario analysis using Jupyter widgets
for what-if analysis
sample_triggers:
- valuation date
- scenario
- interactive
- uc_id: UC-116
name: Quasi-Monte Carlo Option Pricing
short_description: Demonstrates quasi-Monte Carlo methods using low-discrepancy sequences (Sobol) for more efficient
European option pricing
sample_triggers:
- quasi-Monte Carlo
- Sobol
- low discrepancy
- uc_id: UC-117
name: AAD Sensitivity Performance Analysis
short_description: Compares automatic differentiation sensitivity computation performance against traditional bump-and-reval
and complex step methods
sample_triggers:
- AAD
- sensitivity
- performance
- group_id: data_pipeline
name: Data Pipeline
description: ''
emoji: 📊
uc_count: 2
ucs:
- uc_id: UC-104
name: Market Data and Portfolio Dependency Analysis
short_description: Explores market object dependencies and portfolio structure by examining curves, conventions,
and pricing engine configurations
sample_triggers:
- market data
- curves
- conventions
- uc_id: UC-114
name: Market Data Configuration and Analytics Setup
short_description: Sets up complete market data infrastructure including curves, conventions, pricing engines, and
term structure configurations
sample_triggers:
- market data
- configuration
- curves
- group_id: monitoring
name: Monitoring
description: ''
emoji: 📦
uc_count: 1
ucs:
- uc_id: UC-109
name: Simulation Progress Monitoring
short_description: Monitors ORE simulation progress by reading log files and displaying execution status in Jupyter
sample_triggers:
- progress
- monitoring
- simulation
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-101
beginner_prompt: Try Dynamic SIMM Exposure Analysis
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try XVA Valuation and Sensitivity Reporting
auto_selected: true
- uc_id: UC-103
beginner_prompt: Try pricing sensitivity method comparison
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 19 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- Pricing Sensitivity Method Comparison
- XVA Valuation and Sensitivity Reporting
- Dynamic SIMM Exposure Analysis
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
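Calls the QuantLib engine through SWIG bindings to price financial derivatives such as options, swaps, and bonds, with support for finite-difference pricing of American options and validation of multi-asset strategies such as basket spread options.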
---
name: quantlib-derivatives
description: |-
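Calls the QuantLib engine through SWIG bindings to price financial derivatives such as options, swaps, and bonds, with support for finite-difference pricing of American options and validation of multi-asset strategies such as basket spread options.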
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-123"
compiled_at: "2026-04-22T13:01:00.865366+00:00"
capability_markets: "global"
capability_activities: "derivatives-pricing"
sop_version: "crystal-compilation-v6.1"
---
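# QuantLib Derivatives Pricing (quantlib-derivatives)
> Calls the QuantLib engine through SWIG bindings to price financial derivatives such as options, swaps, and bonds, with support for finite-difference pricing of American options and validation of multi-asset strategies such as basket spread options.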
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (35 total)
### Market Element Observability Test (`UC-101`)
Verifies that market quotes (SimpleQuote) properly notify registered observers when their values change, ensuring reactive pricing systems work correctly
**Triggers**: market element, observer pattern, quote observability
### Joint Nordic Calendar Holidays Test (`UC-109`)
Tests joint calendar functionality combining multiple country calendars to determine valid business days for cross-border trading
**Triggers**: joint calendar, holidays, business days
### Currency Constructor Test (`UC-113`)
Tests currency class construction including default, standard (EUR), and bespoke currencies for multi-currency instrument pricing
**Triggers**: currency, multi-currency, currency constructor
For all **35** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
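
As an illustration of how SL-03 and SL-05 from the table above might be enforced when a signal object is built, here is a minimal sketch. `SignalSketch` and `ENTITY_ID_RE` are hypothetical names introduced for this example, and the regular expression is only an assumed reading of the `entity_type_exchange_code` format; neither is taken from the framework itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed reading of SL-03: three underscore-separated parts,
# e.g. stock_sh_600000. The real framework may be stricter.
ENTITY_ID_RE = re.compile(r"^[a-z]+_[a-z]+_[A-Za-z0-9]+$")


@dataclass
class SignalSketch:
    """Hypothetical stand-in for a trading signal (not the framework class)."""

    entity_id: str
    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        # SL-03: reject malformed entity IDs up front.
        if not ENTITY_ID_RE.match(self.entity_id):
            raise ValueError(f"SL-03 violation: bad entity_id {self.entity_id!r}")
        # SL-05: exactly one sizing field may be set.
        sizing = (self.position_pct, self.order_money, self.order_amount)
        if sum(value is not None for value in sizing) != 1:
            raise ValueError("SL-05 violation: set exactly one sizing field")
```

Failing in the constructor means a malformed signal never reaches the execution stage, which matches the `halt` behavior listed for both locks.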
## Top Anti-Patterns (15 total)
- **`AP-DERIVATIVES-PRICING-001`**: Instrument NPV called without attached pricing engine
- **`AP-DERIVATIVES-PRICING-002`**: BSM forward price ignores dividend yield
- **`AP-DERIVATIVES-PRICING-003`**: Negative discount factors passed to log-domain interpolation
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-123. Evidence verify ratio = 10.2% and audit fail total = 9. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
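| [references/seed.yaml](references/seed.yaml) | Full V6+ authoritative content (source of truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | Full set of KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |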
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-123` blueprint at 2026-04-22T13:01:00.865366+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Three-Asset Basket Spread Option Test
  - American Quanto Option FDM Pricing Test
  - Market Element Observability Test
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## FinancePy (finance-bp-101) (3)
### `AP-DERIVATIVES-PRICING-003` — Negative discount factors passed to log-domain interpolation <sub>(high)</sub>
When Numba-jitted interpolation functions perform log transformation on discount factors, negative or zero values cause domain errors. This occurs because log(-x) and log(0) are mathematically undefined. The consequence is runtime crashes in jitted functions and complete failure of discount curve interpolation, blocking all downstream pricing calculations.
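
One way to avoid the domain error is to validate discount factors before they reach any log-domain routine. The sketch below uses only the standard library; `log_discount_factors` is a hypothetical helper name, not a function from FinancePy or Numba.

```python
import math
from typing import Sequence


def log_discount_factors(discount_factors: Sequence[float]) -> list[float]:
    """Return log(df) for each factor, rejecting values outside the log domain."""
    for index, df in enumerate(discount_factors):
        # NaN, inf, zero, and negatives are all rejected before math.log runs.
        if not math.isfinite(df) or df <= 0.0:
            raise ValueError(f"discount factor at index {index} is invalid: {df!r}")
    return [math.log(df) for df in discount_factors]
```

Running the check outside the jitted function keeps the error message readable and points at the offending curve node.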
### `AP-DERIVATIVES-PRICING-004` — Non-monotonic time points in discount curve interpolation <sub>(high)</sub>
Interpolation over non-monotonically increasing time points produces undefined behavior at crossing times, causing discount factors to be incorrectly computed where time values overlap. This corrupts the entire term structure because the bootstrap algorithm cannot determine which discount factor corresponds to which maturity. The consequence is incorrect present value calculations across all downstream products priced against the curve.
### `AP-DERIVATIVES-PRICING-005` — Bootstrap calibration instruments not in maturity order <sub>(high)</sub>
When building yield curves from market instruments (deposits, FRAs, swaps), the instruments must be provided in strictly increasing maturity order. Out-of-order instruments cause the bootstrap algorithm to solve for discount factors at incorrect time points, corrupting the entire term structure. The consequence is wrong forward rates and discount factors that propagate into all priced instruments.
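
The two ordering requirements above (strictly increasing time points and maturity-ordered instruments) can be guarded with a small pre-bootstrap step. This is a sketch under the assumption that each instrument is represented as a `(maturity_in_years, quote)` pair; `sort_by_maturity` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def sort_by_maturity(
    instruments: Sequence[Tuple[float, float]],
) -> list[Tuple[float, float]]:
    """Sort (maturity_in_years, quote) pairs and require strictly increasing maturities."""
    ordered = sorted(instruments, key=lambda pair: pair[0])
    for (prev_t, _), (curr_t, _) in zip(ordered, ordered[1:]):
        # After sorting, the only remaining violation is a repeated maturity.
        if curr_t <= prev_t:
            raise ValueError(f"duplicate maturity {curr_t!r}")
    return ordered
```

Sorting fixes out-of-order input, while the duplicate check rejects overlapping time points that sorting alone cannot resolve.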
## QuantLib-SWIG (finance-bp-123) (4)
### `AP-DERIVATIVES-PRICING-001` — Instrument NPV called without attached pricing engine <sub>(high)</sub>
Calling NPV() on a derivatives instrument without first calling setPricingEngine() fails, because the Instrument class delegates all valuation logic to the attached PricingEngine. QuantLib reports a "null pricing engine" error, which the SWIG Python bindings surface as a RuntimeError, rather than returning a price. The consequence is a valuation run that aborts; if calling code swallows the exception and substitutes a default value, the result is silently incorrect pricing that appears valid, potentially leading to bad trading decisions.
### `AP-DERIVATIVES-PRICING-006` — Option Exercise type mismatches VanillaOption constructor <sub>(high)</sub>
VanillaOption requires both a StrikedTypePayoff and a matching Exercise object. Using the wrong Exercise type (e.g., AmericanExercise for a European option) causes compilation failures in C++ or runtime errors in the SWIG bindings. The consequence is that the pricing system cannot initialize options, blocking all option pricing workflows.
### `AP-DERIVATIVES-PRICING-013` — Evaluation date not set before QuantLib term structure construction <sub>(medium)</sub>
QuantLib requires ql.Settings.instance().evaluationDate to be set before constructing yield term structures and instruments. Without an explicit evaluation date, the curve reference date becomes undefined, causing date calculations to fail or produce incorrect settlement dates. The consequence is wrong discount factors and NPV calculations across the entire portfolio.
### `AP-DERIVATIVES-PRICING-014` — Market quotes passed without QuoteHandle wrapper <sub>(medium)</sub>
QuantLib's observer pattern requires all market quotes to be wrapped in QuoteHandle before passing to rate helpers. Raw quote values bypass the observable notification mechanism, causing dependent instruments to never recalculate when market data updates. The consequence is stale pricing that doesn't reflect current market conditions.
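
To make the stale-pricing mechanism concrete, here is a toy, pure-Python sketch of an observable quote and a caching instrument. `ToyQuote` and `ToyInstrument` are hypothetical classes written for this illustration; they mimic the idea only and are not QuantLib's `SimpleQuote`, `QuoteHandle`, or `Instrument`.

```python
from typing import Callable, List, Optional


class ToyQuote:
    """Toy observable value; a stand-in for illustration only."""

    def __init__(self, value: float) -> None:
        self._value = value
        self._observers: List[Callable[[], None]] = []

    def register(self, callback: Callable[[], None]) -> None:
        self._observers.append(callback)

    def value(self) -> float:
        return self._value

    def set_value(self, value: float) -> None:
        self._value = value
        for notify in self._observers:
            notify()  # tell every dependent object that its cache is stale


class ToyInstrument:
    """Caches its price and recalculates only after a notification."""

    def __init__(self, quote: ToyQuote, notional: float) -> None:
        self._quote = quote
        self._notional = notional
        self._cached: Optional[float] = None
        quote.register(self._invalidate)

    def _invalidate(self) -> None:
        self._cached = None

    def price(self) -> float:
        if self._cached is None:
            self._cached = self._notional * self._quote.value()
        return self._cached


quote = ToyQuote(1.10)
linked = ToyInstrument(quote, notional=100.0)
price_before = linked.price()
raw_snapshot = 100.0 * quote.value()  # plain float copy: no link to the quote
quote.set_value(1.20)
price_after = linked.price()  # recalculated because the cache was invalidated
```

After `set_value`, `price_after` reflects the new quote while `raw_snapshot` still holds the old number, which is the toy equivalent of passing a raw value instead of an observable handle.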
## arch (finance-bp-124) (2)
### `AP-DERIVATIVES-PRICING-007` — NaN/inf values in ARCH model input data <sub>(high)</sub>
ARCH model estimation relies on recursive variance computations and scipy.optimize. Non-finite input values (NaN, inf) cause optimizers to produce NaN results and recursive variance calculations to fail. The consequence is complete model estimation failure with meaningless outputs that appear valid, leading to incorrect volatility forecasts and risk misestimation.
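
A cheap guard is to reject non-finite observations before they reach the estimator. The sketch below is standard-library only; `require_finite` is a hypothetical helper and is not part of the arch package.

```python
import math
from typing import Iterable


def require_finite(values: Iterable[float], name: str = "returns") -> list[float]:
    """Return the data as floats, or raise if any entry is NaN or infinite."""
    data = [float(v) for v in values]
    bad_positions = [i for i, v in enumerate(data) if not math.isfinite(v)]
    if bad_positions:
        raise ValueError(f"{name} has non-finite values at positions {bad_positions}")
    return data
```

Reporting the positions, rather than silently dropping rows, leaves the decision about imputation or removal with the caller.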
### `AP-DERIVATIVES-PRICING-008` — ARCH parameter array concatenation in wrong order <sub>(high)</sub>
ARCHModel is composed of three components (mean, volatility, distribution) and requires parameter arrays concatenated in a fixed order: [mean_params, volatility_params, distribution_params]. Incorrect ordering causes _parse_parameters to assign the wrong values to the wrong components, producing mathematically invalid models (e.g., volatility parameters interpreted as distribution parameters). The consequence is invalid conditional variance forecasts.
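
The fixed ordering can be kept in one place by pairing a pack helper with its inverse. This is a sketch only: `pack_params` and `unpack_params` are hypothetical names that mirror the ordering described above and are not the arch package's own `_parse_parameters`.

```python
from typing import Sequence, Tuple


def pack_params(
    mean: Sequence[float],
    volatility: Sequence[float],
    distribution: Sequence[float],
) -> list[float]:
    """Concatenate in the fixed order: mean, then volatility, then distribution."""
    return [*mean, *volatility, *distribution]


def unpack_params(
    params: Sequence[float], n_mean: int, n_volatility: int, n_distribution: int
) -> Tuple[list[float], list[float], list[float]]:
    """Split a packed vector back into its three components by length."""
    if len(params) != n_mean + n_volatility + n_distribution:
        raise ValueError("parameter vector length does not match component sizes")
    vol_end = n_mean + n_volatility
    return list(params[:n_mean]), list(params[n_mean:vol_end]), list(params[vol_end:])
```

Routing every concatenation through a single helper makes an accidental reordering a one-line bug instead of a scattered one.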
## py_vollib (finance-bp-127) (6)
### `AP-DERIVATIVES-PRICING-002` — BSM forward price ignores dividend yield <sub>(high)</sub>
When calculating option prices on dividend-paying stocks using BSM, the forward price must be adjusted as F = S * exp((r-q)*t). Omitting the dividend yield adjustment (using F = S * exp(r*t)) causes systematic mispricing for all dividend-paying assets. The consequence is consistently wrong option prices that diverge from market prices, leading to arbitrage opportunities and trading losses.
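
The adjustment F = S * exp((r-q)*t) is small to write but easy to forget. The sketch below shows it next to the unadjusted version; `forward_price` is a hypothetical helper, and the sample inputs are illustrative values rather than market data.

```python
import math


def forward_price(spot: float, r: float, q: float, t: float) -> float:
    """Forward price with continuous dividend yield: F = S * exp((r - q) * t)."""
    return spot * math.exp((r - q) * t)


# Illustrative inputs (assumed): 5% rate, 2% dividend yield, one year.
with_dividends = forward_price(100.0, r=0.05, q=0.02, t=1.0)
without_dividends = forward_price(100.0, r=0.05, q=0.0, t=1.0)
overstatement = without_dividends - with_dividends  # error from dropping q
```

For a positive dividend yield the unadjusted forward is always too high, so calls come out overpriced and puts underpriced.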
### `AP-DERIVATIVES-PRICING-009` — Zero or negative time-to-expiration in option pricing <sub>(high)</sub>
Option pricing formulas (Black-Scholes, Black model) compute sqrt(t) in the denominator. Zero time causes division by zero; negative time produces NaN in d1/d2 calculations. The consequence is invalid option prices (NaN, inf) that break downstream Greeks calculations and hedging workflows.
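
A guard placed in front of the d1/d2 calculation turns the division-by-zero and NaN cases into an explicit error. This is a minimal sketch assuming the standard Black-Scholes d1/d2 with continuous dividend yield; `black_scholes_d1_d2` is a hypothetical helper name.

```python
import math
from typing import Tuple


def black_scholes_d1_d2(
    spot: float, strike: float, r: float, q: float, sigma: float, t: float
) -> Tuple[float, float]:
    """Compute d1 and d2, refusing inputs that would divide by zero or yield NaN."""
    if not (math.isfinite(t) and t > 0.0):
        raise ValueError(f"time to expiration must be positive, got {t!r}")
    if spot <= 0.0 or strike <= 0.0 or sigma <= 0.0:
        raise ValueError("spot, strike, and sigma must all be positive")
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (r - q + 0.5 * sigma * sigma) * t) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return d1, d2
```

Callers that need a value exactly at expiry can catch the error and fall back to the intrinsic payoff, which keeps that choice visible instead of hiding it inside the formula.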
### `AP-DERIVATIVES-PRICING-010` — Black model applies spot price instead of forward price <sub>(high)</sub>
The Black model is designed for options on futures/forwards and expects the futures price F as input, not the spot price S. Using spot directly causes incorrect pricing because the Black formula treats F as driftless under the pricing measure, with the cost of carry already embedded in F rather than applied as a drift on S. The consequence is systematically wrong forward option prices.
### `AP-DERIVATIVES-PRICING-011` — Missing discount factor in Black model pricing <sub>(medium)</sub>
Black model pricing must apply time value discounting with deflater = exp(-r*t) to undiscounted option prices. Omitting the discount factor produces forward option prices that exceed their fair value by the risk-free compounding amount. The consequence is violation of time value of money principles and prices that cannot be used for fair valuation or hedging.
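
The two Black-model points above (forward price as input, and the `exp(-r*t)` deflater applied to the undiscounted price) fit in one short function. This is a textbook Black-76 sketch written with the standard library; `black76_price` and `_norm_cdf` are hypothetical names and this is not py_vollib's implementation.

```python
import math


def _norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black76_price(
    flag: str, forward: float, strike: float, r: float, sigma: float, t: float
) -> float:
    """Discounted Black-76 price; `forward` is the futures/forward price, not spot."""
    if flag not in ("c", "p"):
        raise ValueError(f"flag must be 'c' or 'p', got {flag!r}")
    if min(forward, strike, sigma, t) <= 0.0:
        raise ValueError("forward, strike, sigma, and t must all be positive")
    sqrt_t = math.sqrt(t)
    d1 = (math.log(forward / strike) + 0.5 * sigma * sigma * t) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    if flag == "c":
        undiscounted = forward * _norm_cdf(d1) - strike * _norm_cdf(d2)
    else:
        undiscounted = strike * _norm_cdf(-d2) - forward * _norm_cdf(-d1)
    deflater = math.exp(-r * t)  # omitting this leaves a forward-valued price
    return deflater * undiscounted
```

A quick sanity check for any implementation is put-call parity: call minus put should equal `exp(-r*t) * (forward - strike)`.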
### `AP-DERIVATIVES-PRICING-012` — Flag value other than 'c'/'p' passed to py_vollib without validation <sub>(medium)</sub>
The py_vollib binary_flag dict contains only the keys 'c' and 'p'. Passing any other flag value raises a KeyError, because the library does not validate the flag before the lookup. The consequence is unhandled exceptions in production systems when flag values come from external sources with unexpected formats.
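A small normalization step at the system boundary avoids the bare KeyError. The sketch below is an assumption about how a caller might do this; `BINARY_FLAG` here is a local stand-in with the usual call/put signs, and `normalize_flag` is a hypothetical helper.

```python
# Local stand-in for the 'c'/'p' mapping described above (assumed values).
BINARY_FLAG = {"c": 1, "p": -1}


def normalize_flag(raw: object) -> str:
    """Return 'c' or 'p', or raise a descriptive ValueError."""
    if not isinstance(raw, str):
        raise ValueError(f"option flag must be a string, got {type(raw).__name__}")
    flag = raw.strip().lower()
    if flag not in BINARY_FLAG:
        raise ValueError(f"option flag must be 'c' or 'p', got {raw!r}")
    return flag


normalized = normalize_flag(" C ")  # whitespace and case are tolerated
```

Inputs such as `"call"` are rejected rather than guessed at, so upstream format changes surface as a clear error instead of a silent mapping.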
### `AP-DERIVATIVES-PRICING-015` — Implied volatility computed without proper bounds validation <sub>(medium)</sub>
When computing implied volatility, option prices outside theoretical bounds (below intrinsic value or above maximum) must raise appropriate exceptions. Returning invalid IV values (negative volatility or extreme values) violates mathematical definitions and leads to incorrect pricing, risk calculations, and hedging ratios. The consequence is systemic pricing errors across all vol-dependent derivatives.
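The bounds check can be sketched for the Black (forward-based) setting: a discounted call must lie between `D * max(F - K, 0)` and `D * F`, and a put between `D * max(K - F, 0)` and `D * K`, where `D = exp(-r*t)`. The helper names below are hypothetical, and a plain `ValueError` stands in for whatever exception types the real library raises.

```python
import math


def price_bounds(flag: str, F: float, K: float, t: float, r: float) -> tuple[float, float]:
    """Return (intrinsic, maximum) discounted price bounds for a Black-model option."""
    deflater = math.exp(-r * t)
    if flag == "c":
        return deflater * max(F - K, 0.0), deflater * F
    if flag == "p":
        return deflater * max(K - F, 0.0), deflater * K
    raise ValueError("flag must be 'c' or 'p'")


def check_price_in_bounds(price: float, flag: str, F: float, K: float, t: float, r: float) -> None:
    """Raise before any root-finding if no volatility can reproduce the price."""
    lower, upper = price_bounds(flag, F, K, t, r)
    if price < lower:
        raise ValueError(f"price {price} is below intrinsic value {lower}")
    if price > upper:
        raise ValueError(f"price {price} is above the maximum {upper}")


lower, upper = price_bounds("c", 105.0, 100.0, 0.5, 0.03)
check_price_in_bounds(8.0, "c", 105.0, 100.0, 0.5, 0.03)  # within bounds, no exception
```

Running this check first means the implied-volatility solver is only ever asked for a root that exists.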
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-123--QuantLib-SWIG
**Scan date**: 2026-04-22
**Stats**: {'total_files': 5, 'total_classes': 27, 'total_functions': 0, 'total_stages': 5}
## Modules (5)
- [swig_interface_definition](components/swig_interface_definition.md): 4 classes
- [market_data_construction](components/market_data_construction.md): 6 classes
- [instrument_definition](components/instrument_definition.md): 7 classes
- [pricing_engine_attachment](components/pricing_engine_attachment.md): 6 classes
- [npv_calculation_and_results](components/npv_calculation_and_results.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 141
fatal_constraints_count: 33
non_fatal_constraints_count: 151
use_cases_count: 35
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
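Two of the locks above (`SL-03` and `SL-05`) lend themselves to cheap pre-flight checks. The sketch below is illustrative only: the regular expression is an assumption inferred from examples such as `stock_sh_600000`, and both helper names are hypothetical rather than ZVT APIs.

```python
import re

# Assumed shape for SL-03: entity_type, exchange, code, joined by underscores.
ENTITY_ID_RE = re.compile(r"^[a-z][a-z0-9]*_[a-z][a-z0-9]*_[A-Za-z0-9.]+$")


def is_valid_entity_id(entity_id: str) -> bool:
    """SL-03: entity IDs follow entity_type_exchange_code."""
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


def check_signal_sizing(position_pct=None, order_money=None, order_amount=None) -> str:
    """SL-05: exactly one sizing field may be set; return the name of that field."""
    fields = {
        "position_pct": position_pct,
        "order_money": order_money,
        "order_amount": order_amount,
    }
    given = [name for name, value in fields.items() if value is not None]
    if len(given) != 1:
        raise ValueError(f"expected exactly one sizing field, got {given or 'none'}")
    return given[0]


sizing = check_signal_sizing(position_pct=0.2)
```

Because both locks are fatal, failing fast here is cheaper than discovering the violation in the middle of a backtest.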
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **35**
## `KUC-101`
**Source**: `Python/test/test_marketelements.py`
Verifies that market quotes (SimpleQuote) properly notify registered observers when their values change, ensuring reactive pricing systems work correctly.
## `KUC-102`
**Source**: `Python/test/test_americanquantooption.py`
Tests finite difference method pricing for American quanto options, which have payoff dependent on both equity and foreign exchange movements.
## `KUC-103`
**Source**: `Python/test/test_basket_option.py`
Validates pricing of basket options involving multiple underlying assets and spread options, which are used for correlation trading and cross-asset strategies.
## `KUC-104`
**Source**: `Python/test/test_cms.py`
Tests CMS (Constant Maturity Swap) pricing engines that handle swap-rate based coupons, used in structured products and interest rate derivatives.
## `KUC-105`
**Source**: `Python/test/test_bonds.py`
Tests fixed rate bond pricing with scheduled coupon payments, used for government and corporate bond valuation and analysis.
## `KUC-106`
**Source**: `Python/test/test_bondfunctions.py`
Tests bond analytics functions including yield calculation, duration, convexity, and other fixed income risk measures.
## `KUC-107`
**Source**: `Python/test/test_termstructures.py`
Tests yield term structure creation, interpolation methods, and forward rate calculations essential for pricing interest rate derivatives.
## `KUC-108`
**Source**: `Python/test/test_options.py`
Tests finite difference method pricing for options under Heston Hull-White stochastic volatility model, capturing equity-interest rate correlation.
## `KUC-109`
**Source**: `Python/test/test_calendars.py`
Tests joint calendar functionality combining multiple country calendars to determine valid business days for cross-border trading.
## `KUC-110`
**Source**: `Python/test/test_sabr.py`
Tests SABR (Stochastic Alpha, Beta, Rho) volatility model for smile fitting in interest rate and FX markets.
## `KUC-111`
**Source**: `Python/test/test_volatilities.py`
Tests swaption volatility term structure handling for interest rate derivatives pricing and calibration.
## `KUC-112`
**Source**: `Python/test/test_futures.py`
Tests perpetual futures pricing models used in cryptocurrency exchanges for continuous futures contracts without expiration.
## `KUC-113`
**Source**: `Python/test/test_currencies.py`
Tests currency class construction including default, standard (EUR), and bespoke currencies for multi-currency instrument pricing.
## `KUC-114`
**Source**: `Python/test/test_swap.py`
Tests interest rate swap pricing engines with historical rate fixings, used for vanilla IRS valuation and curve bootstrapping validation.
## `KUC-115`
**Source**: `Python/test/test_integrals.py`
Tests numerical integration methods used in pricing analytics for computing expected values and option premiums.
## `KUC-116`
**Source**: `Python/test/test_swaption.py`
Tests swaption pricing with various settlement types and calculates Greeks for interest rate option risk management.
## `KUC-117`
**Source**: `Python/test/test_ode.py`
Tests ordinary differential equation solvers (Runge-Kutta) used in pricing models for term structure evolution and boundary value problems.
## `KUC-118`
**Source**: `Python/test/test_slv.py`
Tests Stochastic Local Volatility (SLV) process generation combining Heston stochastic vol with local vol for more accurate calibration.
## `KUC-119`
**Source**: `Python/test/test_coupons.py`
Tests Ibor index coupon construction and floating leg generation for interest rate derivatives including rate averaging methods.
## `KUC-120`
**Source**: `Python/test/test_fdm.py`
Tests finite difference method meshers and solvers used for PDE-based option pricing across various boundary conditions.
## `KUC-121`
**Source**: `Python/test/test_daycounters.py`
Tests day count conventions including Business252 used for interest accrual calculations in Brazilian and other markets.
## `KUC-122`
**Source**: `Python/test/test_ratehelpers.py`
Tests rate helpers used in bootstrapping yield curves from market instruments like deposits, futures, and swaps.
## `KUC-123`
**Source**: `Python/test/test_fxforward.py`
Tests FX forward pricing with cross-currency discount curves, used for currency hedging and FX derivative valuation.
## `KUC-124`
**Source**: `Python/test/test_solvers1d.py`
Tests root-finding algorithms (Brent, Bisection, Secant, etc.) used for implied volatility calculation and calibration.
## `KUC-125`
**Source**: `Python/test/test_extrapolation.py`
Tests Richardson extrapolation for improving numerical accuracy, used in pricing model convergence acceleration.
## `KUC-126`
**Source**: `Python/test/test_equityindex.py`
Tests equity index valuation with dividend yields and interest rates, used for index derivative pricing.
## `KUC-127`
**Source**: `Python/test/test_instruments.py`
Tests instrument observability pattern for stocks, ensuring market data changes properly propagate through to instrument valuations.
## `KUC-128`
**Source**: `Python/test/test_linear_algebra.py`
Tests array and matrix operations used throughout QuantLib for numerical computations in pricing and risk models.
## `KUC-129`
**Source**: `Python/test/test_date.py`
Tests date arithmetic operations including periods, schedules, and date adjustments used throughout financial calculations.
## `KUC-130`
**Source**: `Python/test/test_blackformula.py`
Tests Black-Scholes-Merton formula implementation for vanilla option pricing, the foundational model for equity derivatives.
## `KUC-131`
**Source**: `Python/test/test_inflation.py`
Tests inflation term structures and CPI index handling for inflation-linked derivatives like inflation swaps and bonds.
## `KUC-132`
**Source**: `Python/test/test_money.py`
Tests Money class comparison and arithmetic operations for proper handling of multi-currency quantities.
## `KUC-133`
**Source**: `Python/test/test_capfloor.py`
Tests cap and floor pricing engines for interest rate risk management, used for hedging rate movements and callable structures.
## `KUC-134`
**Source**: `Python/test/test_settings.py`
Tests global settings management including evaluation date control and fixing behavior configuration.
## `KUC-135`
**Source**: `Python/test/test_iborindex.py`
Tests Ibor index fixing management and historical fixings handling for floating rate instruments.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-DERIVATIVES-PRICING-001` — Strict input validation before financial calculations
**From**: FinancePy, QuantLib-SWIG · **Applicable to**: derivatives-pricing
Both FinancePy and QuantLib-SWIG enforce strict validation of all input parameters before any financial computation. FinancePy validates day count types, date arguments, tolerance parameters, and max iterations. QuantLib-SWIG validates exercise types and swap direction enums. This pattern prevents corrupted calculations and provides clear error messages. Apply this pattern by validating all inputs at function entry points.
## `CW-DERIVATIVES-PRICING-002` — Bootstrap requires ordered instrument calibration
**From**: FinancePy, QuantLib-SWIG · **Applicable to**: derivatives-pricing
Both FinancePy and QuantLib-SWIG require calibration instruments to be provided in strict maturity order for curve bootstrapping. FinancePy enforces monotonically increasing time points and validates instrument sequencing (deposits before FRAs before swaps). QuantLib-SWIG uses bootstrap helpers (DepositRateHelper, FraRateHelper, SwapRateHelper) that assume ordered inputs. This ensures the bootstrap algorithm solves for discount factors at mathematically correct time points.
## `CW-DERIVATIVES-PRICING-003` — Handle pattern for lazy evaluation chains
**From**: QuantLib-SWIG · **Applicable to**: derivatives-pricing
QuantLib-SWIG requires wrapping market data (quotes, term structures) in Handle objects to enable lazy evaluation and automatic recalculation. QuoteHandle for market quotes and Handle for term structures enable the observer pattern. When market data updates, all dependent instruments automatically recalculate. This pattern is essential for live pricing systems where prices must reflect current market conditions.
## `CW-DERIVATIVES-PRICING-004` — Parameter composition requires fixed ordering and partitioning
**From**: arch · **Applicable to**: derivatives-pricing
arch enforces a strict parameter composition pattern where mean, volatility, and distribution parameters must be concatenated in fixed order with explicit offset partitioning. The offsets array partitions the unified parameter vector into components. This pattern prevents parameter assignment errors that would corrupt model components. Apply this when composing financial models from multiple sub-components.
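The offset-based partitioning can be illustrated with a few lines of plain Python. This is a sketch of the general pattern, not arch's internal `_parse_parameters`; the function name, argument layout, and sample values are assumptions.

```python
def split_parameters(params, counts):
    """Split a flat vector into (mean, volatility, distribution) blocks.

    counts is (n_mean, n_volatility, n_distribution), in that fixed order.
    """
    if len(counts) != 3:
        raise ValueError("counts must have three entries: mean, volatility, distribution")
    if len(params) != sum(counts):
        raise ValueError(f"expected {sum(counts)} parameters, got {len(params)}")
    offsets = [0]
    for n in counts:
        offsets.append(offsets[-1] + n)  # cumulative boundaries of each block
    return tuple(list(params[offsets[i]:offsets[i + 1]]) for i in range(3))


# Assumed example: constant mean (1), GARCH(1,1) (3), Student's t (1).
mean_p, vol_p, dist_p = split_parameters([0.05, 0.01, 0.10, 0.85, 8.0], (1, 3, 1))
```

The length check catches a missing or extra parameter, but not a permutation: a vector concatenated in the wrong order still splits without error, which is why the ordering has to be fixed by convention.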
## `CW-DERIVATIVES-PRICING-005` — Strict mathematical constraint enforcement
**From**: arch, py_vollib · **Applicable to**: derivatives-pricing
Both arch and py_vollib enforce strict mathematical constraints: arch enforces volatility model stationarity constraints (A.dot(params) - b >= 0) for SLSQP optimization; py_vollib validates implied volatility is positive and option prices within intrinsic/maximum bounds. Violating these constraints produces mathematically invalid results. Always enforce domain constraints on all financial model parameters.
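To make the inequality form `A.dot(params) - b >= 0` concrete, the sketch below checks it for a GARCH(1,1)-style parameter vector `[omega, alpha, beta]`. The matrix encodes non-negativity of each parameter plus `alpha + beta <= 1`; it is written out by hand for illustration and is an assumption, not copied from arch.

```python
def satisfies_constraints(A, b, params) -> bool:
    """True if every row of A.dot(params) - b is non-negative."""
    return all(
        sum(a_ij * p_j for a_ij, p_j in zip(row, params)) - b_i >= 0.0
        for row, b_i in zip(A, b)
    )


# Rows: omega >= 0, alpha >= 0, beta >= 0, and -alpha - beta >= -1.
A = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, -1.0, -1.0],
]
b = [0.0, 0.0, 0.0, -1.0]

stationary = satisfies_constraints(A, b, [0.01, 0.10, 0.85])  # alpha + beta = 0.95
explosive = satisfies_constraints(A, b, [0.01, 0.30, 0.80])   # alpha + beta = 1.10
```

The same check is useful on starting values before handing them to a constrained optimizer, since an infeasible start can stall or mislead the search.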
## `CW-DERIVATIVES-PRICING-006` — Forward price adjustment for dividend yield in BSM
**From**: py_vollib · **Applicable to**: derivatives-pricing
py_vollib demonstrates the correct BSM implementation: compute forward price F = S * exp((r-q)*t) to adjust for continuous dividend yield before passing to the pricing engine. This pattern is essential for all options on dividend-paying assets. Forgetting the dividend adjustment causes systematic mispricing for the entire equity derivatives book.
## `CW-DERIVATIVES-PRICING-007` — Monotonicity validation for interpolation arrays
**From**: FinancePy · **Applicable to**: derivatives-pricing
FinancePy enforces strictly monotonically increasing time arrays before interpolation operations. This prevents undefined behavior at crossing times and ensures each time point maps to exactly one discount factor. Apply this validation whenever implementing interpolation over financial time series (discount curves, volatility surfaces, forward rates).
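A minimal version of this validation, assuming times are given as a plain sequence of year fractions, looks like the following. `require_strictly_increasing` is a hypothetical helper, not a FinancePy function.

```python
def require_strictly_increasing(times) -> None:
    """Raise ValueError unless each time is strictly greater than the previous one."""
    for i, (prev, curr) in enumerate(zip(times, times[1:]), start=1):
        # 'not curr > prev' also rejects NaN, which compares False to everything.
        if not curr > prev:
            raise ValueError(f"times[{i}]={curr!r} is not greater than times[{i - 1}]={prev!r}")


require_strictly_increasing([0.25, 0.5, 1.0, 2.0])  # valid grid, no exception
```

Duplicated pillars such as `[0.5, 0.5]` are rejected as well, because two discount factors at the same time point would make the interpolant ambiguous.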
## `CW-DERIVATIVES-PRICING-008` — Production vs reference implementation selection
**From**: py_vollib · **Applicable to**: derivatives-pricing
py_vollib explicitly distinguishes between ref_python (slow, educational) and production (fast, C-based lets_be_rational) implementations. Using the reference implementation in production causes 10-100x performance degradation. Always select the appropriate implementation tier based on use case requirements—reference for testing/education, optimized for production trading systems.
FILE:references/components/instrument_definition.md
# instrument_definition (7 classes)
## `VanillaOption construction`
`instrument_definition/vanillaoption-construction.py:0`
## `PlainVanillaPayoff definition`
`instrument_definition/plainvanillapayoff-definition.py:0`
## `EuropeanExercise specification`
`instrument_definition/europeanexercise-specification.py:0`
## `VanillaSwap construction`
`instrument_definition/vanillaswap-construction.py:0`
## `Schedule generation`
`instrument_definition/schedule-generation.py:0`
## `Payoff type`
`instrument_definition/payoff-type.py:0`
## `Exercise type`
`instrument_definition/exercise-type.py:0`
FILE:references/components/market_data_construction.md
# market_data_construction (6 classes)
## `SimpleQuote construction`
`market_data_construction/simplequote-construction.py:0`
## `FlatForward curve creation`
`market_data_construction/flatforward-curve-creation.py:0`
## `BlackConstantVol setup`
`market_data_construction/blackconstantvol-setup.py:0`
## `PiecewiseFlatForward bootstrapping`
`market_data_construction/piecewiseflatforward-bootstrapping.py:0`
## `Curve bootstrap method`
`market_data_construction/curve-bootstrap-method.py:0`
## `Volatility surface type`
`market_data_construction/volatility-surface-type.py:0`
FILE:references/components/npv_calculation_and_results.md
# npv_calculation_and_results (4 classes)
## `Instrument.NPV`
`npv_calculation_and_results/instrument-npv.py:0`
## `VanillaOption.errorEstimate`
`npv_calculation_and_results/vanillaoption-errorestimate.py:0`
## `VanillaSwap.fairRate`
`npv_calculation_and_results/vanillaswap-fairrate.py:0`
## `impliedVolatility calculation`
`npv_calculation_and_results/impliedvolatility-calculation.py:0`
FILE:references/components/pricing_engine_attachment.md
# pricing_engine_attachment (6 classes)
## `setPricingEngine attachment`
`pricing_engine_attachment/setpricingengine-attachment.py:0`
## `BlackScholesMertonProcess creation`
`pricing_engine_attachment/blackscholesmertonprocess-creation.py:0`
## `AnalyticEuropeanEngine initialization`
`pricing_engine_attachment/analyticeuropeanengine-initialization.py:0`
## `HestonModel calibration`
`pricing_engine_attachment/hestonmodel-calibration.py:0`
## `Pricing method`
`pricing_engine_attachment/pricing-method.py:0`
## `Model`
`pricing_engine_attachment/model.py:0`
FILE:references/components/swig_interface_definition.md
# swig_interface_definition (4 classes)
## `ql.i module inclusion`
`swig_interface_definition/ql-i-module-inclusion.py:0`
## `PyObserver callback registration`
`swig_interface_definition/pyobserver-callback-registration.py:0`
## `Handle template instantiation`
`swig_interface_definition/handle-template-instantiation.py:0`
## `Monte Carlo random sequence`
`swig_interface_definition/monte-carlo-random-sequence.py:0`
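Provides factor computation, storage, and tear sheet analysis for the A-share market, with zero-copy Pandas/Polars data conversion and QIFI account backtest simulation, suited to multi-source quantitative research.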
---
name: quantaxis-data-platform
description: |-
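Provides factor computation, storage, and tear sheet analysis for the A-share market, with zero-copy Pandas/Polars data conversion and QIFI account backtest simulation, suited to multi-source quantitative research.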
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-090"
compiled_at: "2026-04-22T13:00:37.999318+00:00"
capability_markets: "cn-astock"
capability_activities: "backtesting, factor-research"
sop_version: "crystal-compilation-v6.1"
---
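# QuantAxis Data Platform (quantaxis-data-platform)
> Provides factor computation, storage, and tear sheet analysis for the A-share market, with zero-copy Pandas/Polars data conversion and QIFI account backtest simulation, suited to multi-source quantitative research.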
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (8 total)
### Moving Average Factor Computation (`UC-101`)
Computes and stores a 5-day moving average (MA5) factor for daily stock data, enabling technical indicator analysis across multiple stocks using Click
**Triggers**: factor, moving average, MA5
### Factor Tear Sheet Analysis (`UC-102`)
Retrieves pre-computed MA5 factor data from ClickHouse and generates comprehensive tear sheets for factor performance analysis and visualization in re
**Triggers**: tear sheet, factor analysis, visualization
### Zero-Copy Data Bridge Conversion (`UC-103`)
Demonstrates efficient zero-copy conversion between Pandas and Polars dataframes, and shared memory-based cross-process data transmission for high-per
**Triggers**: pandas, polars, conversion
For all **8** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (25 total)
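- **`AP-ZVT-183`**: Ex-rights factor of inf/NaN used directly in multiplication makes price adjustment fail silently
- **`AP-ZVT-179`**: Exceptions swallowed after a third-party data API quota is exceeded, leaving data silently missing
- **`AP-ZVT-183B`**: Misuse of HFQ (backward-adjusted) and QFQ (forward-adjusted) bar tables causes factor drift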
All 25 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-090. Evidence verify ratio = 67.7% and audit fail total = 30. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
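| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative record (source-of-truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 25 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | Full set of KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |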
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-090` blueprint at 2026-04-22T13:00:37.999318+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Zero-Copy Data Bridge Conversion', 'Factor Tear Sheet Analysis', 'Moving Average Factor Computation', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **25**
## qlib (9)
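### `AP-QLIB-1930` — Backtest results do not depend on the model: a shared dataset object lets the first model's predictions overwrite the rest <sub>(high)</sub>
When several models in Qlib reuse the same already-fitted DatasetH instance, the dataset's internal normalization parameters (the statistics determined by fit_start_time/fit_end_time) are frozen after the first fit. Switching models without re-initializing the dataset means every model effectively uses the same prediction signal. The symptom is a backtest equity curve that is identical whether the model is LightGBM, XGBoost, or a DNN. This is the most dangerous anti-pattern of the "experiments appear to run, but every conclusion is invalid" kind.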
Source: https://github.com/microsoft/qlib/issues/1930
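### `AP-QLIB-2090` — Dual configuration of fit_start_time and the train segment causes implicit data leakage <sub>(high)</sub>
Qlib's DatasetH has two "training data ranges": the handler's fit_start_time/fit_end_time (which set the range the normalizer is fitted on) and segments.train (which sets the range the model is trained on). A common mistake is letting fit_end_time extend over the valid/test segments, so the normalization statistics (mean, standard deviation) include future data and introduce look-ahead bias. The two are configured independently but are semantically coupled, and the documentation does not state explicitly that fit_end_time must be <= train_end.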
Source: https://github.com/microsoft/qlib/issues/2090
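The coupling between the two ranges can be enforced with a one-line date comparison before the dataset is built. The sketch below assumes both dates are available as ISO strings; `check_no_normalizer_leakage` is a hypothetical helper and does not read Qlib's actual config objects.

```python
from datetime import date


def check_no_normalizer_leakage(fit_end_time: str, train_end: str) -> None:
    """Raise if the normalizer's fit window extends past the training segment."""
    if date.fromisoformat(fit_end_time) > date.fromisoformat(train_end):
        raise ValueError(
            f"fit_end_time {fit_end_time} is after train end {train_end}: "
            "normalization statistics would include validation/test data"
        )


# Assumed example dates: normalizer fitted only on the training window.
check_no_normalizer_leakage("2014-12-31", "2014-12-31")
```

Running the check wherever the workflow config is assembled makes the otherwise implicit rule `fit_end_time <= train_end` explicit.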
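### `AP-QLIB-2036` — MACD factor formula error in the docs: DEA is divided by CLOSE one extra time, giving inconsistent units <sub>(high)</sub>
The Alpha formula example in the official Qlib documentation defines MACD's DEA as EMA(DIF, 9) / CLOSE, but DIF is already dimensionless (it has already been divided by CLOSE), so dividing by CLOSE again gives DEA units of 1/price. A MACD factor built from the documented formula differs markedly from the correct formula after cross-sectional standardization, and its IC drops. Documentation-level formula errors like this get copied verbatim into production factor libraries by many users.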
Source: https://github.com/microsoft/qlib/issues/2036
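### `AP-QLIB-2184` — Custom A-share data imported without NaN-filling suspension days as the convention requires, causing downstream factor noise <sub>(high)</sub>
By Qlib convention, the open/close/high/low/volume/factor fields on suspension days should all be NaN so that the framework can recognize and skip them during factor computation. If a user-built A-share dataset keeps the previous day's price on suspension days (common in data exported directly from Eastmoney/Wind), price momentum factors show "false signals" during the suspension (price unchanged but factor non-zero). Qlib does not validate this convention, so the error flows silently into the training data.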
Source: https://github.com/microsoft/qlib/issues/2184
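### `AP-QLIB-1892` — PIT (Point-In-Time) financial data collector depends on an external stock-list API, so full A-share coverage is incomplete <sub>(high)</sub>
Qlib's PIT data collector (point-in-time snapshots of financial data) calls get_hs_stock_symbols() at initialization to obtain the Shanghai/Shenzhen stock list. The function depends on the Eastmoney API, which often returns only a partial list rather than all 5000+ stocks, and the function raises ValueError outright when the list is incomplete. If users follow the documented steps, the financial dataset covers only some stocks, and backtests based on PIT financial factors suffer serious survivorship bias (stocks that were not collected are implicitly excluded).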
Source: https://github.com/microsoft/qlib/issues/1892
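### `AP-QLIB-2097` — Full-market instrument="all" runs out of memory on a 32GB machine while CSI300 works <sub>(medium)</sub>
When loading Alpha158 features, Qlib loads the entire feature matrix for the specified universe into memory at once. Memory use with instrument="csi300" (300 stocks) versus instrument="all" (5000+ stocks) differs by roughly 16x. A 32GB machine running the full market hits OOM right at the init_instance_by_config stage, and the error message gives no hint of a memory problem. Users easily mistake it for a configuration error, when what is actually needed is batched loading or streaming feature computation.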
Source: https://github.com/microsoft/qlib/issues/2097
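### `AP-QLIB-1984` — LightGBM label-dimension check never fires, so multi-label training fails silently <sub>(medium)</sub>
Qlib's gbdt.py uses y.values.ndim == 2 to detect multi-label targets, but a Series taken from a DataFrame always has ndim 1, so the condition is always False. Multi-label training therefore never takes the squeeze branch; it goes straight into LightGBM training and raises a semantically unclear error deeper down. Users attempting a custom multi-label task cannot trace the error message back to this root cause.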
Source: https://github.com/microsoft/qlib/issues/1984
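### `AP-QLIB-1915` — After dump_bin of custom CSV data, DataHandler reports Length mismatch while D.features works <sub>(high)</sub>
Qlib has two data access paths: D.features (reads the binary files directly) and DataHandler/DataHandlerLP (with a processor pipeline). If custom A-share CSV data is dumped with dump_bin using a field order or symbol format (e.g. 600000.SH vs SH600000) that does not match Qlib's conventions, the DataHandler processors trigger Length mismatch during align/reindex, while D.features succeeds because it bypasses the processors. This inconsistency between the two paths leads users to believe the data was imported correctly.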
Source: https://github.com/microsoft/qlib/issues/1915
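### `AP-QLIB-1949` — Colab/Linux multiprocessing backend conflicts with Qlib ParallelExt, leaving DataHandler completely unusable <sub>(medium)</sub>
In non-fork environments (Windows or Google Colab), when DataHandler loads features in parallel with joblib, ParallelExt fails at initialization when it accesses the _backend_args attribute (AttributeError). The root cause is that joblib 1.5+ removed this internal attribute and Qlib's compatibility layer was not updated. The symptom is a multi-level nested exception from the D.features call, and users cannot tell from the stack trace whether the problem lies in the parallel backend or in the data.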
Source: https://github.com/microsoft/qlib/issues/1949
## vnpy (4)
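### `AP-VNPY-3691` — Bar generator's first bar has a misaligned timestamp, producing a wrong signal for the first period <sub>(high)</sub>
When vnpy's BarGenerator synthesizes N-minute bars, the first bar it pushes is timestamped with "the minute of the current tick" rather than "the end of a complete N-minute period". Concretely, a tick at 09:59 triggers the push of an incomplete 5-minute bar (which should not have been pushed until 10:04). If a strategy filters in on_bar directly with datetime.minute % 5, the first bar happens to pass, but it contains less than one full period of data, and using it for signal computation produces a wrong entry signal.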
Source: https://github.com/vnpy/vnpy/issues/3691
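### `AP-VNPY-3669` — Incremental save of Alpha module history raises SchemaError because old and new DataFrame schemas are incompatible <sub>(medium)</sub>
When the vnpy Alpha module saves bar data to a Parquet file, it runs polars.concat directly on the newly downloaded data (which may contain Float64 columns) and the existing file (historical Int64 columns). Polars is strictly typed and does not allow implicit type promotion, so it raises SchemaError. The root cause is that different data sources/versions return inconsistent field types (e.g. volume is an integer in some feeds and a float in others) and there is no schema alignment step before concat. This affects every historical-data build workflow that uses vnpy alpha for backtesting.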
Source: https://github.com/vnpy/vnpy/issues/3669
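### `AP-VNPY-3685` — Spread trading module's run_backtesting() fails silently under Jupyter, so results cannot be trusted <sub>(high)</sub>
In vnpy 4.10, run_backtesting() in the SpreadTrading module hits an event loop conflict under Jupyter (asyncio already running), so part of the backtest engine logic does not execute, yet no exception is raised and seemingly normal backtest statistics are returned. The same code has no such problem in command-line Python. vnpy 4.x moved some IO to async, which is incompatible with Jupyter's event loop; this is a hidden trap in which the backtest result looks correct but is actually incomplete.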
Source: https://github.com/vnpy/vnpy/issues/3685
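### `AP-VNPY-3700` — Install script does not use a venv, so the global numpy version is downgraded and other dependencies break <sub>(medium)</sub>
vnpy's install.bat installs directly into the system/conda base environment and forcibly downgrades numpy to <2.0 to satisfy vnpy's dependencies, breaking other quant tools that depend on numpy 2.x (such as newer versions of scipy and pytorch). There is no requirements.txt, so the dependency boundary is opaque. In a quant research environment where multiple tools coexist, vnpy's install script is a common source of "global environment pollution".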
Source: https://github.com/vnpy/vnpy/issues/3700
## zipline (6)
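### `AP-ZIPLINE-138` — Backtest prices are unadjusted, and the tutorial chart misleads users about strategy returns <sub>(high)</sub>
The Zipline tutorial uses an AAPL price chart for its demonstration, but the bundle stores unadjusted (raw) prices rather than prices adjusted for splits and dividends. The historical prices shown in the chart differ from actual market prices by about 4x (Apple's cumulative split factor), and users mistake "the price quadrupled" for strategy return. The A-share case is worse: the price jump around an ex-rights date forms a huge "signal" in unadjusted data, drawing technical indicators into false breakout signals on the ex-rights date.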
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/138
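### `AP-ZIPLINE-235` — Default fill at the current bar's close understates live slippage and inflates backtest returns <sub>(high)</sub>
After a signal fires on a bar, Zipline's default slippage model fills at the close of that same bar (current bar close fill). In live trading a signal can only be filled near the open of the next bar (T+1 order execution). Taking A-share daily bars as an example, backtesting at the close overstates daily return by about 0.1-0.3% on average compared with filling at the next day's open, and the annualized gap can exceed 30%. Explicitly configure the slippage model as VolumeShareSlippage or FixedSlippage and set a reasonable volume_limit.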
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/235
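### `AP-ZIPLINE-190` — Calendar start_session set to a non-trading day triggers DateOutOfBounds with no hint on how to fix it <sub>(medium)</sub>
When registering a bundle or running an algorithm, if the start_session parameter falls on a non-trading day (such as New Year's Day, 1998-01-01), Zipline's Calendar validation raises DateOutOfBounds("cannot be earlier than the first session"). The error message shows only the first day of the trading calendar and does not suggest changing the value to the first trading day. In the A-share case, with an SSE/SZSE calendar, a start_date that falls on the day after the last trading day before Spring Festival (a holiday) triggers the same kind of error, and debugging is very costly.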
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/190
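### `AP-ZIPLINE-181` — With a stale asset db, Pipeline reports "no assets traded", sending users off to check the data range <sub>(high)</sub>
Zipline's asset database (SQLite) records the start/end trading dates of each stock. If an old Quandl or self-built bundle is used without re-ingesting, Pipeline raises "Failed to find any assets with country_code 'US' that traded between [dates]" when backtesting a new date range. In the A-share case, if only price data is updated after re-downloading quotes and the asset db is not rebuilt, the date ranges of delisted and newly listed stocks are not updated, and Pipeline filtering quietly excludes those stocks, producing survivorship bias.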
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/181
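### `AP-ZIPLINE-285` — week_start()/week_end() silently fail under custom (non-US) calendars <sub>(medium)</sub>
date_rules.week_start() and date_rules.week_end() in Zipline's schedule_function rely on the trading calendar's logic for identifying the first and last day of the week, but under non-US calendars (such as ASX or SSE) that logic is incompatible with the offset calculation built for the NYSE calendar, so the schedule never fires or fires on the wrong date. In the A-share case, with the SSE calendar, in weeks containing long consecutive holidays such as Spring Festival, week_start may skip the whole holiday week without rebalancing, and users cannot discover the missed schedule from the logs.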
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/285
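### `AP-ZIPLINE-240` — Backtest dates must be in UTC; passing a naive datetime raises a deep AssertionError <sub>(medium)</sub>
Zipline internally requires all timestamps to be UTC-aware datetimes. When a user passes a naive datetime (no timezone information, e.g. pd.Timestamp('2020-01-01')), no error is raised at the entry point; instead, AssertionError: Algorithm should have a utc datetime is triggered deep in algorithm execution, where the deep stack makes it hard to locate. A-share developers importing data from local CST time hit this trap very easily and need an explicit tz_localize('UTC') at bundle registration.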
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/240
## zvt (6)
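### `AP-ZVT-183` — Ex-rights factor of inf/NaN used directly in multiplication makes price adjustment fail silently <sub>(high)</sub>
When computing the forward-adjustment factor, ZVT derives qfq_factor from the new/old price ratio. When old==0 (first day of a new listing, or missing data) the factor is inf; when kdata.open itself is None (suspension day not filled) the multiplication raises TypeError. The result: adjustment for the entire entity is interrupted and all subsequent bars are lost, but the main flow only logs ERROR without stopping, so users often do not know that data for many stocks is already corrupted.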
Source: https://github.com/zvtvz/zvt/issues/183
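A defensive ratio helper shows one way to keep undefined factors out of the multiplication. This is a sketch with a hypothetical function name, not ZVT code. Note that plain Python float division by zero raises ZeroDivisionError, whereas NumPy/pandas division yields inf, so both cases are handled explicitly.

```python
import math
from typing import Optional


def safe_adjust_factor(new_price: Optional[float], old_price: Optional[float]) -> Optional[float]:
    """Return new_price / old_price, or None when the ratio is undefined."""
    if new_price is None or old_price is None:
        return None  # unfilled suspension day
    if old_price == 0:
        return None  # first listing day or missing data
    factor = new_price / old_price
    return factor if math.isfinite(factor) else None


factor = safe_adjust_factor(11.0, 10.0)
```

Returning `None` pushes the decision to the caller, which can then count and report skipped rows instead of only logging an error and moving on.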
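### `AP-ZVT-179` — Exceptions swallowed after a third-party data API quota is exceeded, leaving data silently missing <sub>(high)</sub>
When ZVT uses JoinQuant's jqdatasdk to bulk-fetch bars for the whole market (4000+ stocks), it hits JoinQuant's daily maximum query limit (error: 已超过每日最大查询数量, i.e. "daily maximum query count exceeded"). ZVT catches the exception and moves on to the next entity, so once the quota is hit, that day's data for all remaining stocks is silently missing. If a backtest uses this incomplete database, factor results carry a systematic bias, with no warning.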
Source: https://github.com/zvtvz/zvt/issues/179
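The failure mode above comes from continuing past a quota error. A minimal sketch of the alternative: stop at the first quota error and report what is still missing. `QuotaExceeded` and `fetch_one` are hypothetical stand-ins, not jqdatasdk or ZVT names.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for the provider's daily-limit error."""


def fetch_until_quota(entity_ids, fetch_one):
    """Fetch entities in order and stop at the first quota error.

    Returns (results, pending) so the caller can see exactly which
    entities still lack data instead of discovering silent gaps later.
    """
    results = {}
    for position, entity_id in enumerate(entity_ids):
        try:
            results[entity_id] = fetch_one(entity_id)
        except QuotaExceeded:
            return results, list(entity_ids[position:])
    return results, []
```

A non-empty `pending` list can be persisted and retried the next day, and the backtest can refuse to run while it is non-empty.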
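### `AP-ZVT-161` — Whole-market batch factor computation on SQLite triggers "too many SQL variables" <sub>(medium)</sub>
When computing multi-stock factors such as VolumeUpMaFactor, ZVT puts every entity_id into the IN clause of a single SQL statement. Querying the whole A-share market (5000+ stocks) at once hits SQLite's default limit SQLITE_MAX_VARIABLE_NUMBER=999 (the default before SQLite 3.32.0; later builds default to 32766). Raising max_allowed_packet (a MySQL parameter) has no effect; the root cause is SQLite's cap on bound variables. The correct fix is to query in batches, but early ZVT versions did not handle this boundary.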
Source: https://github.com/zvtvz/zvt/issues/161
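The batched query mentioned above can be sketched with the standard-library `sqlite3` module. The table name `kdata` and its columns are assumptions for illustration and do not reflect ZVT's schema.

```python
import sqlite3


def select_in_chunks(conn, entity_ids, chunk_size=500):
    """Run the IN query in batches so no statement exceeds the variable limit."""
    rows = []
    for start in range(0, len(entity_ids), chunk_size):
        chunk = entity_ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        sql = (
            "SELECT entity_id, close FROM kdata "
            f"WHERE entity_id IN ({placeholders})"
        )
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows
```

A chunk size well under 999 stays within the limit even on older SQLite builds.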
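### `AP-ZVT-129` — Wildcard imports hide API changes between versions; enums such as AdjustType vanish <sub>(medium)</sub>
ZVT's documentation examples use `from zvt import *` to import every symbol. After a ZVT upgrade refactors enums (for example moving AdjustType into a submodule), the wildcard import no longer includes the symbol and triggers AttributeError. Users assume an installation problem, when in fact it is a breaking API change between versions that is not noted in the CHANGELOG, and the wildcard import hides where the symbol came from. Import enum classes explicitly.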
Source: https://github.com/zvtvz/zvt/issues/129
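### `AP-ZVT-187` — Backtest engine does not stop early on an empty data-layer result, causing cascading null-reference crashes <sub>(medium)</sub>
When ZVT Trader finds the data empty after load_data completes, it does not exit early; it passes the empty DataFrame to the selector, which triggers a chain of NoneType failures downstream. The stack is deep and the root cause hard to locate, so users assume a bug in their strategy logic. The root cause is a misconfigured time window (start/end outside the range covered by the database) with no effective validation.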
Source: https://github.com/zvtvz/zvt/issues/187
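### `AP-ZVT-183B` — Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) K-line table causes factor drift <sub>(high)</sub>
ZVT provides three separate tables: Stock1dKdata (unadjusted), Stock1dHfqKdata (backward-adjusted), and Stock1dQfqKdata (forward-adjusted). When users mix two tables while computing price momentum or moving-average factors (e.g. unadjusted prices for the moving average and backward-adjusted prices for returns), factor values jump around ex-rights dates. ZVT does not check adjustment-type consistency across tables, so the mix passes silently.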
Source: https://github.com/zvtvz/zvt/issues/183
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-090--QUANTAXIS
**Scan date**: 2026-04-22
**Stats**: {'total_files': 7, 'total_classes': 46, 'total_functions': 0, 'total_stages': 7}
## Modules (7)
- [data_collection_&_storage](components/data_collection_-_storage.md): 6 classes
- [data_processing_&_resampling](components/data_processing_-_resampling.md): 7 classes
- [factor_computation_&_analysis](components/factor_computation_-_analysis.md): 7 classes
- [strategy_execution_engine](components/strategy_execution_engine.md): 8 classes
- [order_management_&_execution](components/order_management_-_execution.md): 7 classes
- [real-time_messaging_infrastructure](components/real-time_messaging_infrastructure.md): 7 classes
- [cross-language_data_bridge](components/cross-language_data_bridge.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 124
fatal_constraints_count: 35
non_fatal_constraints_count: 163
use_cases_count: 8
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
## Domain Constraints Injected (71)
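- **`SHARED-CN-ASTOCK-T1-001`** <sub>(fatal)</sub>: A-share stocks settle T+1: shares bought on day T can be sold no earlier than T+1, while cash from a sale on day T can be reused to buy the same day. A backtest framework that does not enforce the T+1 position lock overstates turnover and win rate, and is especially damaging to the realism of intraday reversal strategies.
- **`SHARED-CN-ASTOCK-T1-002`** <sub>(fatal)</sub>: Main-board stocks in Shanghai and Shenzhen have a daily price limit of ±10% (±5% for ST/SST stocks). When a stock is locked limit-up, sell-side liquidity disappears and buy orders cannot fill; when it is locked limit-down, buy-side liquidity disappears and sell orders cannot fill. A backtest that assumes fills at any price on such days systematically overstates executability. Limit-lock detection should be enforced in the fill-simulation layer.
- **`SHARED-CN-ASTOCK-T1-003`** <sub>(high)</sub>: On the STAR Market and ChiNext (after the August 2020 reform) the normal daily limit is ±20%; on the Beijing Stock Exchange it is ±30%; newly listed stocks have no price limit for the first 5 trading days. Applying a uniform ±10% filter to all stocks wrongly excludes or includes fills on these boards.
- **`SHARED-CN-ASTOCK-T1-004`** <sub>(high)</sub>: ST/*ST stocks have a ±5% daily limit and very poor liquidity, so fill assumptions for normal stocks must not be reused. Leaving historical ST stocks (which eventually delisted) out of the backtest creates survivorship bias; including them without applying the ST price limit simulates fills incorrectly.
- **`SHARED-CN-ASTOCK-T1-005`** <sub>(medium)</sub>: During the A-share opening call auction (9:15-9:25) and closing call auction (14:57-15:00), the price is set by the maximum-volume principle rather than continuous matching. Assuming immediate full fills at the open or close understates real slippage risk, most visibly for large orders.
- **`SHARED-CN-ASTOCK-T1-006`** <sub>(high)</sub>: Trading suspensions: during long A-share suspensions (which could last months before 2018), the capital in the position is locked and cannot be rebalanced; backtests commonly ignore this opportunity cost. Filter out suspension days (volume == 0 or is_suspended == True) before factor computation and emit no signals during a suspension.
- **`SHARED-CN-ASTOCK-T1-007`** <sub>(high)</sub>: Newly listed stocks have no price limit for the first 5 trading days (first-day gains can exceed 300%) and lack a full history (moving-average, volatility and turnover factors cannot be computed). Filter out stocks listed for fewer than N trading days (typically 60-252) before factor computation.
- **`SHARED-CN-ASTOCK-T1-008`** <sub>(high)</sub>: New rules on A-share program trading (effective 7 July 2025): an account that submits or cancels ≥ 300 orders per second, or ≥ 20,000 orders in a day, is classified as high-frequency trading and must report to the exchange. An AI-generated quant strategy above these frequencies cannot run compliantly; flag this at strategy design time.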
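- **`SHARED-CN-ASTOCK-ADJ-001`** <sub>(fatal)</sub>: The price gap on an ex-rights/ex-dividend date is a bookkeeping adjustment, not a real loss. Choice of adjustment: unadjusted prices inflate strategy losses; forward-adjusted (qfq) prices embed future dividend information in historical prices (lookahead bias); backward-adjusted (hfq) prices accumulate from the listing date as the base and are the best choice for quant backtests.
- **`SHARED-CN-ASTOCK-ADJ-002`** <sub>(fatal)</sub>: A-share listed companies disclose financial reports with a statutory delay: annual reports by 30 April of the following year, semi-annual reports by 31 August, and quarterly reports by 30 April (Q1) and 31 October (Q3). When using financial data in a backtest, the availability time must be the actual disclosure date (announcement_date), not the end of the accounting period; otherwise point-in-time lookahead bias is introduced.
- **`SHARED-CN-ASTOCK-ADJ-003`** <sub>(high)</sub>: Bonus shares, capitalization of reserves, and rights issues increase the share count after the ex-rights date: the recorded historical holding stays the same while the price shrinks proportionally. If the backtest system does not adjust held share counts in step, it shows a spurious loss or gain on the ex-rights date.
- **`SHARED-CN-ASTOCK-ADJ-004`** <sub>(medium)</sub>: Block trades versus auction trades: block trades can execute at up to a 10% discount to the market price (main board), but that price does not affect the next day's auction open. Block-trade data appears after the close; mixing it into intraday OHLCV data contaminates the normal calculation of close price and volume.
- **`SHARED-CN-ASTOCK-ADJ-005`** <sub>(fatal)</sub>: Short-selling limits under margin trading and securities lending: A-share retail investors cannot short directly, the pool of lendable securities is limited (mainly large-cap blue chips; lendable small and mid caps are extremely scarce), and securities-lending rates are far above margin-financing rates. A backtest that assumes any stock can be shorted produces unexecutable strategies whose live results diverge badly from the backtest.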
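- **`SHARED-CN-ASTOCK-FX-001`** <sub>(high)</sub>: For stocks bought through Shanghai/Shenzhen-Hong Kong Stock Connect (northbound), aggregate foreign ownership is capped at 30%, with a warning line at 28%. When foreign ownership reaches 28%, the Hong Kong exchange (SEHK) suspends new buy orders in that stock until the ratio falls to 26%. Strategies heavily weighted toward stocks favored by foreign investors (consumer and healthcare leaders) need to monitor the foreign ownership ratio.
- **`SHARED-CN-ASTOCK-FX-002`** <sub>(high)</sub>: 5% disclosure rule: a single investor holding more than 5% of a listed company's issued shares must report to the CSRC and the exchange and make an announcement within 3 days, and may not trade the stock during that period or for 2 days after the announcement. A quant stock-selection system that ignores this rule is forced to stop buying once a large position crosses the 5% threshold.
- **`SHARED-CN-ASTOCK-FX-003`** <sub>(high)</sub>: The "double ten" rule for mutual funds: a single fund may hold no more than 10% of its net assets in one stock, and all funds under the same manager together may hold no more than 10% of a company's issued shares. A quant portfolio deployed in a mutual fund must enforce these compliance caps as optimizer constraints.
- **`SHARED-CN-ASTOCK-FX-004`** <sub>(fatal)</sub>: Insider-trading boundary: all input data to an AI-assisted quant system must come from publicly disclosed information. Automated trades triggered through non-public channels (private data services, inside tips, advance knowledge of a restructuring) constitute insider trading under Articles 80-83 of the Securities Law and the Guidelines for the Recognition of Insider Trading (《内幕交易行为认定指引》).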
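- **`SHARED-CN-ASTOCK-MKT-001`** <sub>(fatal)</sub>: Survivorship bias: using current A-share constituents (e.g. today's CSI 300) as the historical backtest universe omits stocks that were once in the index but were dropped for poor performance or delisted. A-share delistings accelerated over 2020-2024 (a record 41 per year), so the bias keeps growing. Use point-in-time historical snapshots.
- **`SHARED-CN-ASTOCK-MKT-002`** <sub>(medium)</sub>: Index reconstitution effect: the CSI 300, CSI 500 and similar indices rebalance twice a year (June and December). Added stocks typically rise noticeably between the announcement and effective dates (passive funds are forced to buy), and removed stocks do the opposite. The backtest universe should use historical constituent snapshots and mark the rebalance windows.
- **`SHARED-CN-ASTOCK-MKT-003`** <sub>(high)</sub>: Strategy crowding: when many quant private funds use similar factor models, holdings overlap heavily and a market shock triggers collective selling and a stampede. The A-share quant crisis of February 2024 is the typical case (small-cap indices fell more than 10% in a single day). Monitor the overlap between the factor's long holdings and those of mainstream quant funds.
- **`SHARED-CN-ASTOCK-MKT-004`** <sub>(high)</sub>: A-share quant hedging strategies commonly go long or short IF/IC/IM index futures to hedge systematic risk. However, A-share index futures trade at a persistent discount (futures price < spot); the annualized discount on IC can reach 10-20%. A backtest that counts only price returns and ignores the futures discount or premium seriously overstates the net return of a hedged strategy.
- **`SHARED-CN-ASTOCK-MKT-005`** <sub>(high)</sub>: The monthly momentum factor in A-shares runs in the opposite direction to US equities: the best performers over the past month tend to reverse in the following month (a reversal effect rather than momentum). Sell-side research (Huatai Securities, Soochow Securities) and academic papers both confirm that applying the US monthly momentum factor directly to A-shares produces systematic losses.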
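- **`SHARED-CN-ASTOCK-BF-001`** <sub>(medium)</sub>: The disposition effect (Shefrin & Statman 1985) is especially pronounced among A-share retail investors, who tend to sell winners too early and hold losers too long. Empirical research by the Shanghai Stock Exchange finds the effect in more than 90% of individual accounts. An AI assistant should not indulge the intuition of "holding a loser until it breaks even"; it should give disciplined stop-loss and take-profit advice based on quantitative signals.
- **`SHARED-CN-ASTOCK-BF-002`** <sub>(medium)</sub>: The A-share market is retail-dominated (individual accounts make up over 80% of trading volume) and herding is strong: retail investors tend to follow the crowd, which produces irrational price swings (such as the leveraged boom and bust of 2015). Quant strategies should avoid using volume rankings, popularity rankings, or other indicators that may amplify herding as primary factors.
- **`SHARED-CN-ASTOCK-BF-003`** <sub>(medium)</sub>: Overconfidence (Barber & Odean 2000) is more severe among A-share retail investors: average annual retail turnover exceeds 500%, and institutions significantly outperform retail investors over the long run. High-turnover strategies often have lower net returns after trading costs. AI should not encourage frequent trading; it should recommend trading on low-frequency, high-quality signals.
- **`SHARED-CN-ASTOCK-BF-004`** <sub>(medium)</sub>: A-share calendar effects: the Spring Festival effect (prices tend to rise in the 5 days before and the 1-3 days after the holiday) and the turn-of-month effect (the first 1-5 trading days of a month outperform mid-month and month-end) have academic support (e.g. Nanjing University of Finance and Economics). Strategies should lower signal confidence in these calendar windows or evaluate the contribution of calendar-driven returns separately.
- **`SHARED-CN-ASTOCK-BF-005`** <sub>(high)</sub>: Strategy capacity: small-cap and micro-cap A-shares trade only a few million yuan a day on average, so large orders cause severe price impact and real capacity may be only tens of millions of yuan. Backtest results cannot be extrapolated to portfolios of hundreds of millions; add a cap on participation as a share of traded volume.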
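- **`SHARED-CN-ASTOCK-COST-001`** <sub>(fatal)</sub>: Full A-share trading cost structure (after the August 2023 adjustment): stamp duty of 0.05% on sells only; commission of about 0.01% on both sides (minimum 5 yuan); transfer fee (Shanghai market) of 0.001%; slippage and impact cost of 0.1%-0.5% per trade for small caps. Annualized returns from a backtest that ignores costs are misleading, especially for high-frequency and high-turnover strategies.
- **`SHARED-CN-ASTOCK-COST-002`** <sub>(high)</sub>: Market impact cost is usually missing from backtests entirely, yet it can be the largest cost in live trading. Buying 1 million yuan of an A-share small cap can push the price up by more than 1%. Impact cost follows a power law in trade size rather than a linear relation; estimate it with the Almgren-Chriss model or a simplified version.
- **`SHARED-CN-ASTOCK-COST-003`** <sub>(medium)</sub>: New rules on share reductions by major shareholders, directors, supervisors and senior executives (CSRC Order No. 224, May 2024): a shareholder with 5% or more who sells through centralized auction must disclose the reduction plan 15 trading days in advance and may sell no more than 1% of total shares within 3 months. Selling pressure from lock-up expirations is a systematic risk factor specific to A-shares; ignoring the unlock calendar in a backtest understates the risk of holding the affected stocks.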
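- **`SHARED-CN-ASTOCK-DATA-001`** <sub>(high)</sub>: The A-share trading calendar differs from the civil calendar: statutory holiday swaps create make-up working days (Saturdays worked), and there are ad hoc market closures (an emergency closure from 8 to 10 July 2015 during the market crash). Deriving A-share trading days from a generic weekday calendar produces errors; use a dedicated A-share trading calendar (such as exchange_calendars or tushare's trading-day API).
- **`SHARED-CN-ASTOCK-DATA-002`** <sub>(medium)</sub>: After a delisting, an A-share ticker code may be reused by a new stock (very rare, but it happens). Using the bare code (e.g. '000001') as the primary key for historical data, without the exchange suffix ('.SZ') or a listing date range, can confuse historical data with the current stock; take particular care in long-horizon backtests.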
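- **`SHARED-BT-LAB-001`** <sub>(fatal)</sub>: Lookahead bias: when simulating a trading decision at historical time t, do not use information that only becomes known after t. The most common forms are: (1) computing a signal from the close and filling at the close on the same day; (2) stamping an indicator computed after the close of day T onto that same bar; (3) using the day's high or low as the fill price. Signal computation and execution must be aligned in time: compute the signal after the close on day T and fill at the open on T+1.
- **`SHARED-BT-LAB-002`** <sub>(high)</sub>: Indicator warmup period: rolling-window indicators are NaN for the first N bars, and those bars must not take part in signal computation or position decisions. Require each indicator's warmup_period to equal its longest lookback, and keep positions at zero during warmup.
- **`SHARED-BT-LAB-003`** <sub>(fatal)</sub>: Time-series splits for ML/DL models must follow time order, TRAIN < VALID < TEST; random k-fold splitting is not allowed (it mixes future data into the training set). Use TimeSeriesSplit or walk-forward validation.
- **`SHARED-BT-LAB-004`** <sub>(fatal)</sub>: Fill assumptions at the open, high, or low: assuming in a daily backtest that one can sell at the day's high or buy at the day's low (e.g. a momentum strategy that takes profit at the high) is plain lookahead, because the intraday high and low are only confirmed after the close. Fill prices may only be the open or the previous day's close (with slippage).
- **`SHARED-BT-LAB-005`** <sub>(high)</sub>: Off-by-one alignment: pandas operations such as rolling and shift easily introduce subtle one-period offsets. Document the observation time of every series in code and verify key alignment relations with assert.
- **`SHARED-BT-LAB-006`** <sub>(high)</sub>: Overfitting: the more backtests are run, the higher the probability of overfitting. Bailey et al. (2014) show that the expected maximum Sharpe ratio found across trials grows with the number of backtests, even when the true Sharpe ratio is zero. Use walk-forward validation instead of exhaustive in-sample parameter search, and report the Deflated Sharpe Ratio (DSR) rather than the peak Sharpe.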
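- **`SHARED-BT-SURV-001`** <sub>(fatal)</sub>: Survivorship bias: using the current market constituents as the historical backtest universe omits stocks that once existed but were later delisted, removed, or merged, and systematically overstates historical strategy returns. The backtest universe must use a point-in-time snapshot.
- **`SHARED-BT-SURV-002`** <sub>(high)</sub>: In-sample / out-of-sample split: strategy development and parameter selection must be done in-sample; out-of-sample data is for final validation only and must not be looked at repeatedly while tuning continues (that turns the out-of-sample set into a new in-sample set and repeats the overfitting).
- **`SHARED-BT-SURV-003`** <sub>(high)</sub>: Filling suspended or missing data: suspension-day prices must not simply be forward-filled with the previous close, because that lets zero-volume days take part in factor computation and signal generation. Filter out missing trading days explicitly in the factor layer instead of filling.
- **`SHARED-BT-SURV-004`** <sub>(high)</sub>: Extreme-value contamination: raw market data may contain source errors (such as late ex-rights adjustments or extreme prices from manual entry mistakes). Feeding it into factor computation uncleaned produces extreme signals that contaminate the whole cross-section. Filter 3-sigma outliers at the pipeline entry.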
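- **`SHARED-BT-COST-001`** <sub>(fatal)</sub>: Trading costs (commission + stamp duty/transfer tax + transfer fee) must be configured explicitly when the backtest is initialized; zero-cost defaults are not allowed. Performance metrics from a backtest that ignores costs are misleading, and high-turnover strategies suffer most (round-trip costs often consume 50%+ of gross returns).
- **`SHARED-BT-COST-002`** <sub>(high)</sub>: Slippage modeling: a backtest without slippage assumes every order fills at the ideal price; high-frequency strategies then lose heavily in live trading as fill prices deteriorate. Configure at least a fixed spread or proportional slippage; for large orders use a volume-participation model (e.g. no more than 5% of daily volume).
- **`SHARED-BT-COST-003`** <sub>(high)</sub>: Turnover must appear in the backtest performance report and be analyzed together with costs. When monthly turnover exceeds 50% (600%+ annualized), net returns are highly sensitive to cost assumptions, and each 10 bps change in cost can flip the profit-or-loss conclusion, so a cost sensitivity analysis is mandatory.
- **`SHARED-BT-COST-004`** <sub>(medium)</sub>: Position sizing must respect capital constraints: the backtest should simulate actual share counts (rounded) under a fixed amount of capital rather than assume fractional shares. For small caps, the minimum trading unit (A-shares: 100 shares per lot) makes achievable holdings deviate from target weights; simulate the rounding effect in the backtest.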
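- **`SHARED-BT-TIME-001`** <sub>(high)</sub>: Timestamp timezone consistency: when merging multiple data sources, mixing UTC and local time is a common source of data corruption. Convert all timestamps to a single timezone at the pipeline entry (storing in UTC and displaying in the market's local timezone is recommended); never mix timezones midway through the pipeline.
- **`SHARED-BT-TIME-002`** <sub>(high)</sub>: Trading-calendar alignment: when merging data from different markets or frequencies (e.g. daily prices + weekly factors), reindex/merge against an explicit trading calendar; do not outer-join and then fillna, which creates spurious rows on non-trading days (holidays).
- **`SHARED-BT-TIME-003`** <sub>(high)</sub>: Boundary check for incremental updates: when updating historical data incrementally, query the database for the latest stored date and download only data after it. Re-downloading and appending existing data creates rows with duplicate timestamps and corrupts the time order in backtests. Verify before and after each update that there are no duplicates (index.duplicated().any() == False).
- **`SHARED-BT-TIME-004`** <sub>(medium)</sub>: Distorted performance attribution: a poorly chosen benchmark distorts the Alpha/Beta calculation. Choose a passive benchmark the strategy could actually invest in (such as an HS300 ETF) rather than a price index that cannot be bought directly (such as the HS300 index). A price index excludes reinvested dividends and understates the benchmark's holding return.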
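- **`SHARED-BT-PERF-001`** <sub>(medium)</sub>: Maximum drawdown must be computed from the net-value series (portfolio value), not from a cumulative-return series. Summing log returns and reading the result as a percentage misstates drawdown depth, because a log return differs from the simple return (a 50% loss is a log return of about -0.69).
- **`SHARED-BT-PERF-002`** <sub>(medium)</sub>: Sharpe ratio annualization convention: annualized Sharpe = daily Sharpe × sqrt(252) (equities, 252 trading days) or × sqrt(365) (crypto, 365 days). Systems use different defaults; confirm the annualization factor before comparing across systems, otherwise the Sharpe ratios are not comparable.
- **`SHARED-BT-PERF-003`** <sub>(medium)</sub>: Calmar and Sortino ratios are better risk-adjusted return measures than the Sharpe ratio: Sharpe assumes normally distributed returns, whereas A-share and crypto returns are markedly left-skewed and fat-tailed, so it understates downside risk. Quant evaluation should report Sortino (downside volatility only) and Calmar (annualized return / maximum drawdown) together, and should not rely on Sharpe alone.
- **`SHARED-BT-PERF-004`** <sub>(medium)</sub>: Backtest performance attribution should be decomposed into alpha (active return), beta (market return), factor-exposure return (style/sector), and idiosyncratic return (stock selection). A backtest without attribution cannot tell a good strategy from one that merely had the right beta in a favorable market.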
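- **`SHARED-FR-IC-001`** <sub>(high)</sub>: IC (information coefficient) is the core measure of a factor's predictive power, defined as the Spearman rank correlation between factor values and next-period returns (ICIR = mean(IC) / std(IC)). |IC| > 0.05 is taken as preliminary evidence of predictive power, and ICIR > 0.5 as stable. Reporting backtest performance without computing IC is a typical case of missing evidence for factor validity.
- **`SHARED-FR-IC-002`** <sub>(high)</sub>: IC decay analysis: a factor's predictive power usually decays as the holding period lengthens. Compute IC series at 1/5/10/20 days to identify the factor's optimal holding period. A factor with high 1-day IC that decays quickly by 20 days is a short-term factor and unsuitable for monthly rebalancing, and vice versa. Using the wrong holding period seriously hurts live performance.
- **`SHARED-FR-IC-003`** <sub>(high)</sub>: Harvey, Liu & Zhu (2016) warn that academia has found 300+ "significant" factors, many of which are false discoveries under multiple testing. A valid factor requires: t-stat > 3.0 (rather than the traditional 1.96); or independent replication across periods or markets; or a clear economic rationale. Factors meeting none of these are very likely products of data mining.
- **`SHARED-FR-IC-004`** <sub>(high)</sub>: Factor turnover control: a factor with high IC but high turnover can have negative net IC after trading costs. Compute a turnover-adjusted effective IC: net_IC = IC - turnover × cost_per_turn. Target turnover ≤ 50% (monthly).
- **`SHARED-FR-IC-005`** <sub>(medium)</sub>: Factor half-life is a core parameter of signal strength and directly determines the optimal rebalancing frequency. Half-life < 5 days: rebalance daily or weekly; 5-20 days: weekly or biweekly; > 20 days: monthly. Rebalancing a short-term factor monthly lets much of the alpha dissipate within the holding period.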
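- **`SHARED-FR-NEUT-001`** <sub>(high)</sub>: Industry neutralization: if factor values are not neutralized against industry means, industry-rotation returns leak into factor returns and it becomes hard to tell whether the factor itself or industry exposure drove the result. The operation: factor_neutral = factor - industry_mean(factor).
- **`SHARED-FR-NEUT-002`** <sub>(high)</sub>: Market-cap neutralization: the small-cap effect (small caps outperforming large caps) is one of the most persistent anomalies in financial history and contaminates almost every un-neutralized factor. If a factor is highly correlated with market cap, selection tilts systematically toward small caps and returns come from size exposure rather than the factor. Neutralize for industry and market cap together (Fama-MacBeth regression or the residual method).
- **`SHARED-FR-NEUT-003`** <sub>(high)</sub>: Outlier handling (Winsorize/MAD): raw factor values usually contain extremes that distort quantile analysis (such as Q1/Q10 deciles). Winsorize raw values (clip to [1%, 99%] or 3-sigma) or apply MAD (median absolute deviation) clipping before ranking or neutralizing.
- **`SHARED-FR-NEUT-004`** <sub>(medium)</sub>: Factor orthogonalization: when several factors are combined into a composite score, combining highly correlated factors is equivalent to overweighting a single factor and dilutes signal diversity. Apply Gram-Schmidt orthogonalization or PCA before combining to remove multicollinearity among factors.
- **`SHARED-FR-NEUT-005`** <sub>(medium)</sub>: Missing-data filling: filling NaN in factor computation (suspensions, new listings, data gaps) with the cross-sectional mean introduces lookahead bias (the mean itself contains future information), while dropping the rows entirely creates survivorship bias. The correct approach is to fill with the cross-sectional median (the median over all stocks on that day, which does not depend on the future) or to exclude the stock for that day.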
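- **`SHARED-FR-PORT-001`** <sub>(high)</sub>: Quantile analysis: factor evaluation should use the long-short return spread (top minus bottom) between Q1/Q5 (quintile) or Q1/Q10 (decile) groups as the primary metric, not the long-only return. The monotonicity test across groups, long Q1 and short Q5, is core evidence of factor validity: monotonically increasing/decreasing > non-monotonic >> effective on the long side only.
- **`SHARED-FR-PORT-002`** <sub>(medium)</sub>: Alpha decay test: the stability of a factor's monthly IC across regimes (bull, bear, range-bound) is important evidence of robustness. A factor whose IC holds in only one market state is unsuitable for all-weather deployment; show the IC time series in rolling 12-month segments to identify periods when the factor fails.
- **`SHARED-FR-PORT-003`** <sub>(medium)</sub>: Turnover-aware selection: for stocks whose factor rank sits near the middle (49th-51st percentile), small rank moves trigger rebalancing and generate many wasteful trades. Set a buffer zone in selection: rebalance only when the rank change exceeds a threshold.
- **`SHARED-FR-PORT-004`** <sub>(medium)</sub>: Statistical significance of quantile returns (bootstrap test): a factor's quantile spread (Q1-Q5) can be large in historical data and still be due to chance; test significance with a bootstrap or t-test (p-value < 0.05). Quantile returns from short backtests (< 3 years) are especially unreliable.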
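- **`SHARED-FR-XFER-001`** <sub>(high)</sub>: Cross-market portability: a factor that works in one market does not necessarily work in another. Applying a US equity factor to A-shares, or an equity factor to futures or crypto, requires independent IC validation; do not assume cross-market generality. A-share-specific anomalies (such as the reversal effect and ST price anomalies) do not exist in US equities.
- **`SHARED-FR-XFER-002`** <sub>(medium)</sub>: Stability of factor validity over time: factors that once worked fade as markets learn and arbitrage them away (McLean & Pontiff 2016 find an average decay of 58% after publication). Re-evaluate factor IC regularly (quarterly or annually) and replace or down-weight factors that have stopped working.
- **`SHARED-FR-XFER-003`** <sub>(medium)</sub>: Interaction between factors and the macro environment: the interest-rate cycle, business cycle and market sentiment significantly affect factor validity. Value factors (low P/B) work better when rates are rising; momentum factors work better in trending markets and fail in range-bound ones. Before deploying a factor, assess how well the current macro environment matches the conditions in which it works best.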
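
As an illustration of the cost structure listed under `SHARED-CN-ASTOCK-COST-001`, the sketch below computes the explicit cost of a single trade from the rates quoted there. The function name and its defaults are assumptions for illustration; the transfer fee is applied to every trade here for simplicity, and slippage and impact are left out.

```python
def a_share_trade_cost(
    amount,
    side,
    commission_rate=0.0001,      # about 0.01% per side
    min_commission=5.0,          # minimum 5 yuan per trade
    stamp_duty_rate=0.0005,      # 0.05%, sells only
    transfer_fee_rate=0.00001,   # 0.001%
):
    """Return the explicit cost in yuan of one trade of `amount` yuan."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    commission = max(amount * commission_rate, min_commission)
    stamp_duty = amount * stamp_duty_rate if side == "sell" else 0.0
    transfer_fee = amount * transfer_fee_rate
    return commission + stamp_duty + transfer_fee


round_trip = a_share_trade_cost(100_000, "buy") + a_share_trade_cost(100_000, "sell")
```

For small orders the 5 yuan minimum commission dominates, which is one reason cost assumptions should be tested against the strategy's actual order sizes.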
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
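Two of the locks above (`SL-03` and `SL-05`) are simple enough to check mechanically. The following is a minimal sketch; the regular expression is an assumed reading of "entity_type_exchange_code" based on the examples `stock_sh_600000` and `stockus_nasdaq_AAPL`, and both helper names are hypothetical rather than part of zvt.

```python
import re

# Assumed shape: lowercase entity type, lowercase exchange, alphanumeric code.
ENTITY_ID_RE = re.compile(r"[a-z]+_[a-z]+_[A-Za-z0-9]+")


def is_valid_entity_id(entity_id: str) -> bool:
    """SL-03: entity IDs follow entity_type_exchange_code."""
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


def has_exactly_one_sizing(position_pct=None, order_money=None, order_amount=None) -> bool:
    """SL-05: exactly one of the three sizing fields is set."""
    provided = (position_pct, order_money, order_amount)
    return sum(value is not None for value in provided) == 1
```

Checks like these can run before any order is generated, so a violation halts early instead of surfacing inside the trading loop.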
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **8**
## `KUC-101`
**Source**: `examples/factoranalysis.py`
Computes and stores a 5-day moving average (MA5) factor for daily stock data, enabling technical indicator analysis across multiple stocks using ClickHouse database storage.
## `KUC-102`
**Source**: `examples/featureanalysis.ipynb`
Retrieves pre-computed MA5 factor data from ClickHouse and generates comprehensive tear sheets for factor performance analysis and visualization in research environments.
## `KUC-103`
**Source**: `examples/qadatabridge_example.py`
Demonstrates efficient zero-copy conversion between Pandas and Polars dataframes, and shared memory-based cross-process data transmission for high-performance data pipelines.
## `KUC-104`
**Source**: `examples/qarsbridge_example.py`
Provides a bridge interface to QARS2 Rust-powered backend components for high-performance account management, trading operations, and backtesting with 100x account and 10x backtest speed improvements.
## `KUC-105`
**Source**: `examples/qifiaccountexample.py`
Simulates trading account operations including order placement, trade execution, and daily settlement for backtesting trading strategies with persistent account state management.
## `KUC-106`
**Source**: `examples/resource_manager_example.py`
Provides unified context manager-based resource management for MongoDB, RabbitMQ, ClickHouse, and Redis connections with automatic connection handling and exception safety.
## `KUC-107`
**Source**: `examples/scheduleserver.py`
Implements a web-based task scheduler allowing users to add, query, and manage periodic or cron-based jobs through HTTP endpoints for automated trading task execution.
## `KUC-108`
**Source**: `examples/test_ckread_qifi.py`
Provides direct SQL query access to QIFI trading data stored in ClickHouse, enabling retrieval of account information, orders, trades, and positions for analysis.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
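## `CW-BT-001` — Cerebro as a unified orchestration engine
**From**: backtrader · **Applicable to**: backtesting
backtrader uses Cerebro as a single entry point that manages the lifecycle of data feeds, strategies, analyzers and observers, and a single cerebro.run() can run multiple strategies over multiple data sources. zvt's StockTrader currently binds one set of factors per instance and has no unified layer for orchestrating multiple strategies; borrowing the Cerebro pattern would let users combine several Trader instances in one runner for side-by-side evaluation.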
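## `CW-BT-002` — Pluggable Analyzers for performance evaluation
**From**: backtrader · **Applicable to**: backtesting
backtrader ships plug-and-play Analyzers such as SharpeRatio, DrawDown, TimeReturn and TradeAnalyzer, so any performance metric can be attached without changing strategy code. zvt's performance evaluation is currently weak and has no standardized Analyzer interface; borrowing this pattern would let users get a risk-adjusted return report with a call like cerebro.addanalyzer(SharpeRatio).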
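## `CW-BT-003` — Sizer: position sizing separated from signals
**From**: backtrader · **Applicable to**: backtesting
backtrader abstracts position sizing (how many shares or what fraction to buy on each entry) into a separate Sizer, fully decoupled from signal logic; FixedSize, PercentSizer and others are built in, and users can write their own. zvt currently has no explicit Sizer concept, and position-control logic is scattered across hooks such as Trader.on_profit_control; a Sizer interface would let strategy signals and money-management rules evolve independently and be reused in combination.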
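## `CW-BT-004` — Full set of order types (Limit/Stop/OCO/Bracket)
**From**: backtrader · **Applicable to**: backtesting
backtrader supports a rich set of order types, including Market, Limit, Stop, StopLimit, OCO (one-cancels-other) and Bracket (a paired take-profit and stop-loss), and simulates slippage and commission schemes. zvt backtests currently fill mainly at market and lack simulation of limit orders and combined orders; for high-frequency or live-trading integration, fuller order types would make backtests considerably more realistic.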
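## `CW-BT-005` — Data resampling and replaying
**From**: backtrader · **Applicable to**: backtesting
backtrader can resample fine-grained data (such as 1 min) into a coarser timeframe (such as 1 day) on the fly and drive the strategy in sync, or replay tick by tick to simulate how each OHLC bar forms, enabling detailed intraday backtests. zvt currently handles multiple timeframes by pre-recording K-lines at each level and has no runtime resampling; borrowing this pattern would support backtests over any combination of time granularities without recording data twice.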
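## `CW-VN-003` — Built-in visualization in the CTA backtest engine
**From**: vnpy · **Applicable to**: backtesting
vnpy's cta_backtester has a GUI that shows the strategy's equity curve, maximum drawdown, daily P&L and trade details directly, with no need for Jupyter Notebook. zvt currently visualizes backtest results by calling Plotly through the draw_result method, but has no unified backtest report page; borrowing this pattern would allow packaging a ready-to-use strategy performance dashboard.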
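## `CW-VN-004` — vnpy.alpha ML factor research lab (Lab)
**From**: vnpy · **Applicable to**: factor-research
vnpy.alpha.lab in vnpy 4.0 provides an integrated workflow for data management, model training, signal generation and strategy backtesting, with a standardized training interface and visual comparison for algorithms such as Lasso, LightGBM and MLP. zvt's ML capability currently has a single entry point, MaStockMLMachine, and no standardized Lab framework; borrowing the Lab pattern would establish a standard pipeline of feature engineering → training → signals → backtest and lower the barrier to ML experiments.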
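## `CW-QL-001` — Point-in-Time database (prevents future-data leakage)
**From**: qlib · **Applicable to**: backtesting
qlib's Point-in-Time Provider guarantees that a query at time t returns only data that was actually known at t (report publication delays and revision history are handled correctly), eliminating look-ahead bias in backtests. zvt currently timestamps financial data by reporting period and has no "publication date" dimension, so stock selection may be biased by financial data from the future; introducing the PIT pattern would make backtests much more trustworthy.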
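## `CW-QL-002` — Recorder + Experiment for experiment management (MLflow style)
**From**: qlib · **Applicable to**: factor-research
qlib's workflow module provides Experiment/Recorder, which automatically record each training run's hyperparameters, features, metrics and predictions, and support cross-experiment comparison and model versioning. zvt currently has no ML experiment tracking, and each rerun overwrites the previous results; borrowing the Recorder pattern would persist the parameters and results of every factor experiment for quick reproduction and version comparison.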
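## `CW-QL-003` — Nested Decision Framework (multi-level nested decision execution)
**From**: qlib · **Applicable to**: backtesting
qlib supports nesting a high-frequency execution layer (minute-level order slicing) inside a low-frequency decision layer (daily portfolio rebalancing); the two layers are optimized independently yet run together, enabling intraday optimal-execution algorithms (such as TWAP or VWAP rebalancing). zvt backtests currently make only daily-bar fill assumptions and do not model execution algorithms; borrowing the nested framework would let zvt separate two questions: which stocks to hold when, and how to build the position with minimal impact cost.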
FILE:references/components/cross-language_data_bridge.md
# cross-language_data_bridge (4 classes)
## `SharedMemoryWriter.write`
`cross-language_data_bridge/sharedmemorywriter-write.py:0`
## `SharedMemoryReader.read`
`cross-language_data_bridge/sharedmemoryreader-read.py:0`
## `convert_pandas_to_polars.convert`
`cross-language_data_bridge/convert-pandas-to-polars-convert.py:0`
## `data_format`
`cross-language_data_bridge/data-format.py:0`
FILE:references/components/data_collection_-_storage.md
# data_collection_&_storage (6 classes)
## `QA_SU_save_stock_day.save`
`data_collection_&_storage/qa-su-save-stock-day-save.py:0`
## `QA_Tdx_Executor.execute`
`data_collection_&_storage/qa-tdx-executor-execute.py:0`
## `QA_Fetcher.fetch`
`data_collection_&_storage/qa-fetcher-fetch.py:0`
## `Parallelism.run`
`data_collection_&_storage/parallelism-run.py:0`
## `fetch_engine`
`data_collection_&_storage/fetch-engine.py:0`
## `storage_backend`
`data_collection_&_storage/storage-backend.py:0`
FILE:references/components/data_processing_-_resampling.md
# data_processing_&_resampling (7 classes)
## `_quotation_base.resample`
`data_processing_&_resampling/quotation-base-resample.py:0`
## `QA_DataStruct_Stock_day.adjust`
`data_processing_&_resampling/qa-datastruct-stock-day-adjust.py:0`
## `QA_DataStruct_Stock_min.resample`
`data_processing_&_resampling/qa-datastruct-stock-min-resample.py:0`
## `QA_data_min_resample.resample`
`data_processing_&_resampling/qa-data-min-resample-resample.py:0`
## `QA_data_stock_to_fq.calculate`
`data_processing_&_resampling/qa-data-stock-to-fq-calculate.py:0`
## `fq_method`
`data_processing_&_resampling/fq-method.py:0`
## `resample_agg`
`data_processing_&_resampling/resample-agg.py:0`
FILE:references/components/factor_computation_-_analysis.md
# factor_computation_&_analysis (7 classes)
## `QASingleFactor_DailyBase.calc`
`factor_computation_&_analysis/qasinglefactor-dailybase-calc.py:0`
## `QASingleFactor_DailyBase.register`
`factor_computation_&_analysis/qasinglefactor-dailybase-register.py:0`
## `QAFeatureAnalysis.analyze`
`factor_computation_&_analysis/qafeatureanalysis-analyze.py:0`
## `QAFeatureBacktest.generate_signals`
`factor_computation_&_analysis/qafeaturebacktest-generate-signals.py:0`
## `QA_indicator_MA.calculate`
`factor_computation_&_analysis/qa-indicator-ma-calculate.py:0`
## `factor_type`
`factor_computation_&_analysis/factor-type.py:0`
## `storage`
`factor_computation_&_analysis/storage.py:0`
FILE:references/components/order_management_-_execution.md
# order_management_&_execution (7 classes)
## `QA_Order.create`
`order_management_&_execution/qa-order-create.py:0`
## `QA_OrderQueue.settle`
`order_management_&_execution/qa-orderqueue-settle.py:0`
## `QA_Position.update`
`order_management_&_execution/qa-position-update.py:0`
## `QIFI_Account.sync`
`order_management_&_execution/qifi-account-sync.py:0`
## `MARKET_PRESET.get_commission`
`order_management_&_execution/market-preset-get-commission.py:0`
## `order_model`
`order_management_&_execution/order-model.py:0`
## `commission_calc`
`order_management_&_execution/commission-calc.py:0`
FILE:references/components/real-time_messaging_infrastructure.md
# real-time_messaging_infrastructure (7 classes)
## `base_ps.connect`
`real-time_messaging_infrastructure/base-ps-connect.py:0`
## `publisher_routing.publish`
`real-time_messaging_infrastructure/publisher-routing-publish.py:0`
## `subscriber.subscribe`
`real-time_messaging_infrastructure/subscriber-subscribe.py:0`
## `QA_AsyncScheduler.schedule`
`real-time_messaging_infrastructure/qa-asyncscheduler-schedule.py:0`
## `QA_Thread.execute`
`real-time_messaging_infrastructure/qa-thread-execute.py:0`
## `broker`
`real-time_messaging_infrastructure/broker.py:0`
## `exchange_type`
`real-time_messaging_infrastructure/exchange-type.py:0`
FILE:references/components/strategy_execution_engine.md
# strategy_execution_engine (8 classes)
## `QAStrategyCtaBase.on_bar`
`strategy_execution_engine/qastrategyctabase-on-bar.py:0`
## `QAStrategyStockBase.send_order`
`strategy_execution_engine/qastrategystockbase-send-order.py:0`
## `QARSStrategy.run`
`strategy_execution_engine/qarsstrategy-run.py:0`
## `QARSBacktest.execute`
`strategy_execution_engine/qarsbacktest-execute.py:0`
## `QA_Engine.run`
`strategy_execution_engine/qa-engine-run.py:0`
## `strategy_base`
`strategy_execution_engine/strategy-base.py:0`
## `execution_mode`
`strategy_execution_engine/execution-mode.py:0`
## `data_feed`
`strategy_execution_engine/data-feed.py:0`
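An AI quant platform based on Microsoft qlib: covers predictive models, factor mining (Alpha158/TFT), portfolio optimization and multi-frequency backtesting. Supports the A-share, US and Hong Kong markets.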
---
name: qlib-ai-quant
description: |-
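An AI quant platform based on Microsoft qlib: covers predictive models, factor mining (Alpha158/TFT),
portfolio optimization and multi-frequency backtesting. Supports the A-share, US and Hong Kong markets.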
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-087"
compiled_at: "2026-04-22T11:06:12.650493+00:00"
capability_markets: "multi-market"
capability_activities: "backtesting, factor-research"
sop_version: "crystal-compilation-v6.1"
---
# Qlib AI Quant (qlib-ai-quant)
> AI-driven quant strategies with Microsoft qlib: predictive models, portfolio optimization, and Alpha158/TFT feature engineering, end to end.
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (38 total)
### Multi-Frequency Data Resampling Instrument Processor (`UC-101`)
Resampling high-frequency 1-minute data to lower frequencies (e.g., daily) for downstream feature computation and model training
**Triggers**: resample, frequency conversion, 1min
### Specific Minute Selection Data Resampling (`UC-102`)
Resampling 1-minute data to daily frequency by extracting a specific minute point from each day for feature generation
**Triggers**: resample, specific minute, 1min to day
### Multi-Frequency Feature Handler (`UC-103`)
Loading and processing data with both daily frequency features and 15-minute frequency features for models that leverage multiple time scales
**Triggers**: multi-frequency, 15min, day frequency
For all **38** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (25 total)
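- **`AP-ZVT-183`**: An inf/NaN ex-rights factor goes straight into multiplication, so price adjustment fails silently
- **`AP-ZVT-179`**: Exceptions are swallowed after a third-party data API hits its quota, leaving silent gaps
- **`AP-ZVT-183B`**: Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) K-line table causes factor drift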
All 25 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-087. Evidence verify ratio = 53.9% and audit fail total = 16. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
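| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 25 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest or trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |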
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-087` blueprint at 2026-04-22T11:06:12.650493+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Multi-Frequency Feature Handler', 'Specific Minute Selection Data Resampling', 'Multi-Frequency Data Resampling Instrument Processor', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **25**
## qlib (9)
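### `AP-QLIB-1930` — Backtest results independent of the model: a shared dataset object causes predictions to be overwritten by the first model <sub>(high)</sub>
When several models in Qlib reuse the same already-fitted DatasetH instance, the dataset's internal normalization parameters (the statistics determined by fit_start_time/fit_end_time) are frozen after the first fit. Switching models without re-initializing the dataset means every model actually uses the same set of prediction signals. The symptom: whether you swap in LightGBM, XGBoost, or a DNN, the backtest equity curve is identical. This is the most dangerous anti-pattern of the "experiment appears to run, but every conclusion is invalid" kind.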
Source: https://github.com/microsoft/qlib/issues/1930
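### `AP-QLIB-2090` — Dual configuration of fit_start_time and the train segment causes implicit data leakage <sub>(high)</sub>
Qlib's DatasetH has two "training data ranges": the handler's fit_start_time/fit_end_time (which set the range used to fit the normalizer) and segments.train (which sets the model's training range). A common mistake is letting fit_end_time extend over the valid/test segments, so the normalization statistics (mean, standard deviation) include future data and introduce look-ahead bias. The two are configured independently but are semantically coupled, and the documentation does not state explicitly that fit_end_time must be <= the end of the train segment.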
Source: https://github.com/microsoft/qlib/issues/2090
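The coupling described in this entry can be checked before any dataset is built. The sketch below is a standalone guard, not part of Qlib; `check_fit_window` and the date values are hypothetical, and only the `segments` layout mirrors the usual handler/dataset configuration.

```python
from datetime import date


def check_fit_window(fit_end_time: str, segments: dict) -> None:
    """Raise ValueError if the normalizer's fit window runs past the train segment."""
    train_end = date.fromisoformat(segments["train"][1])
    fit_end = date.fromisoformat(fit_end_time)
    if fit_end > train_end:
        raise ValueError(
            f"fit_end_time {fit_end} is after train end {train_end}: "
            "normalization statistics would include validation/test data"
        )


# Hypothetical config values, shaped like a handler/dataset config
segments = {
    "train": ("2008-01-01", "2014-12-31"),
    "valid": ("2015-01-01", "2016-12-31"),
    "test": ("2017-01-01", "2020-08-01"),
}
check_fit_window("2014-12-31", segments)  # passes silently
```

Calling the guard with a `fit_end_time` inside the valid or test segment raises immediately, which is cheaper than discovering the leak after a full backtest.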
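### `AP-QLIB-2036` — MACD factor formula error in the docs: DEA is divided by CLOSE one extra time, making units inconsistent <sub>(high)</sub>
The Alpha formula example in Qlib's official documentation defines the MACD DEA as EMA(DIF, 9) / CLOSE, but DIF is already dimensionless (it has been divided by CLOSE), so dividing by CLOSE again gives DEA units of 1/price. After cross-sectional standardization, a MACD factor built from the documented formula differs markedly from the correct one, and its IC drops. Documentation-level formula errors like this get copied verbatim into production factor libraries by many users.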
Source: https://github.com/microsoft/qlib/issues/2036
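### `AP-QLIB-2184` — Custom A-share data imported without NaN-filling suspended days as the convention requires, causing downstream factor noise <sub>(high)</sub>
Qlib's convention is that on suspended trading days the open/close/high/low/volume/factor fields are all NaN, so the framework can recognize and skip those days during factor computation. If a user-built A-share dataset keeps the previous day's price on suspended days (common in data exported directly from Eastmoney or Wind), price-momentum factors produce "false signals" during the suspension (the price does not change, yet the factor is non-zero). Qlib does not validate this convention, so the error flows silently into the training data.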
Source: https://github.com/microsoft/qlib/issues/2184
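A minimal pre-import cleanup for the convention above might look like the following. It assumes, purely for illustration, that a suspended day can be recognized by zero (or missing) volume; real vendor data may need an explicit suspension flag instead. `mask_suspended` is a hypothetical helper, not a Qlib function.

```python
import numpy as np
import pandas as pd

PRICE_COLS = ["open", "close", "high", "low", "volume", "factor"]


def mask_suspended(df: pd.DataFrame) -> pd.DataFrame:
    """Set all price fields to NaN on rows that look like suspended days."""
    out = df.copy()
    # Assumption: zero or missing volume marks a suspension
    suspended = out["volume"].fillna(0) == 0
    out.loc[suspended, PRICE_COLS] = np.nan
    return out


raw = pd.DataFrame(
    {
        "open": [10.0, 10.0, 10.5],
        "close": [10.0, 10.0, 10.8],
        "high": [10.2, 10.0, 10.9],
        "low": [9.9, 10.0, 10.4],
        "volume": [1000.0, 0.0, 1500.0],  # middle row carries yesterday's price
        "factor": [1.0, 1.0, 1.0],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)
clean = mask_suspended(raw)
```

After masking, the middle row is entirely NaN, so a momentum factor computed on `clean` has no value for that day instead of a spurious one.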
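### `AP-QLIB-1892` — The PIT (Point-In-Time) financial data collector depends on an external stock-list API, so full A-share coverage is incomplete <sub>(high)</sub>
Qlib's PIT data collector (point-in-time snapshots of financial data) calls get_hs_stock_symbols() at initialization to obtain the Shanghai/Shenzhen stock list. The function depends on the Eastmoney API, which often returns only a partial list rather than all 5000+ stocks, and the function raises ValueError directly when the list is incomplete. Users who follow the documented steps end up with a financial dataset covering only some stocks, and backtests based on PIT financial factors carry severe survivorship bias (uncollected stocks are implicitly excluded).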
Source: https://github.com/microsoft/qlib/issues/1892
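### `AP-QLIB-2097` — Whole-market instrument="all" OOMs on a 32GB machine while CSI300 runs fine <sub>(medium)</sub>
When loading Alpha158 features, Qlib loads the full feature matrix for the specified universe into memory at once. Memory use for instrument="csi300" (300 stocks) and instrument="all" (5000+ stocks) differs by about 16x. On a 32GB machine a whole-market run OOMs during init_instance_by_config, and the error message gives no hint of a memory problem. Users easily mistake it for a configuration error; what is actually needed is batched loading or streaming feature computation.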
Source: https://github.com/microsoft/qlib/issues/2097
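### `AP-QLIB-1984` — The LightGBM label-dimension check never fires, so multi-label training fails silently <sub>(medium)</sub>
Qlib's gbdt.py uses y.values.ndim == 2 to detect multi-label input, but the ndim of a Series taken from a DataFrame is always 1, so the condition is always False. Multi-label training therefore never takes the squeeze branch; it goes straight into LightGBM training and raises a semantically unclear error deeper in. Users who try a custom multi-label task cannot trace the error message back to this root cause.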
Source: https://github.com/microsoft/qlib/issues/1984
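The dimension behaviour this entry relies on is easy to reproduce with plain pandas. The snippet below is not the code from gbdt.py; the column names and the `is_multi_label` helper are hypothetical and only illustrate why a check on a Series can never see two dimensions.

```python
import pandas as pd

labels = pd.DataFrame({"LABEL0": [0.01, -0.02], "LABEL1": [0.03, 0.00]})

single = labels["LABEL0"]              # Series: values.ndim is always 1
multi = labels[["LABEL0", "LABEL1"]]   # DataFrame: values.ndim is 2


def is_multi_label(y) -> bool:
    """Hypothetical helper: 2-D input with more than one column counts as multi-label."""
    return y.values.ndim == 2 and y.values.shape[1] > 1
```

A check of this shape only works if the label is kept as a DataFrame up to the point where it is tested; once a single column has been selected as a Series, the information is gone.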
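### `AP-QLIB-1915` — After dump_bin of custom CSV data, DataHandler reports Length mismatch while D.features works <sub>(high)</sub>
Qlib has two data access paths: D.features (reads the binary files directly) and DataHandler/DataHandlerLP (with a processor pipeline). If custom A-share CSV data is dumped with dump_bin using a field order or symbol format (e.g. 600000.SH vs SH600000) that does not match Qlib's conventions, the DataHandler processors hit a Length mismatch during align/reindex, whereas D.features succeeds because it bypasses the processors. This inconsistency between the two paths leads users to believe the data was imported correctly.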
Source: https://github.com/microsoft/qlib/issues/1915
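### `AP-QLIB-1949` — Colab/Linux multiprocessing backend conflicts with Qlib ParallelExt, making DataHandler completely unusable <sub>(medium)</sub>
In non-fork environments (Windows or Google Colab), when DataHandler loads features in parallel with joblib, ParallelExt fails at initialization while accessing the _backend_args attribute (AttributeError). The root cause is that joblib 1.5+ removed this internal attribute and Qlib's compatibility layer was not updated. The symptom is a D.features call that raises a multi-level nested exception, and users cannot tell from the stack trace whether the problem lies in the parallel backend or in the data.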
Source: https://github.com/microsoft/qlib/issues/1949
## vnpy (4)
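### `AP-VNPY-3691` — Bar generator's first bar has a misaligned timestamp, producing a wrong signal for the first period <sub>(high)</sub>
When vnpy's BarGenerator synthesizes N-minute bars, the first bar it pushes is stamped with "the minute of the current tick" rather than "the end of a complete N-minute period". Concretely, a tick at 09:59 triggers the push of an incomplete 5-minute bar (which should not have been pushed until 10:04). If the strategy filters in on_bar with datetime.minute % 5, the first bar happens to pass the filter, but it contains less than one full period of data, and using it for signal computation produces a wrong entry signal.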
Source: https://github.com/vnpy/vnpy/issues/3691
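One way to guard against the partial first bar is to count how many 1-minute bars went into each window and discard windows that are short. The sketch below is independent of vnpy's BarGenerator; `complete_bars` is a hypothetical helper and assumes gap-free 1-minute stamps and a window that divides 60.

```python
from datetime import datetime, timedelta


def complete_bars(minute_stamps, window=5):
    """Return (window_end, n_minutes) only for windows that saw all `window` minutes."""
    out, count = [], 0
    for t in minute_stamps:
        count += 1
        if (t.minute + 1) % window == 0:  # t is the last minute of its window
            if count == window:
                out.append((t, count))
            count = 0  # start the next window either way
    return out


# Data starts at 09:59, so the 09:55-09:59 window holds a single minute
start = datetime(2024, 1, 2, 9, 59)
stamps = [start + timedelta(minutes=i) for i in range(11)]
bars = complete_bars(stamps)
```

With data starting at 09:59, the first window is dropped and the first bar returned is the one ending at 10:04, matching the behaviour the entry says a strategy should expect.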
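### `AP-VNPY-3669` — Alpha module: incompatible schemas between new and old DataFrames during incremental saves cause SchemaError <sub>(medium)</sub>
When the vnpy Alpha module saves bar data to a Parquet file, it passes the newly downloaded data (which may contain Float64 columns) and the existing file (historical Int64 columns) straight to polars.concat. polars is strongly typed and does not allow implicit type promotion, so it raises SchemaError. The root cause is that different data sources or versions return inconsistent field types (for example, volume is an integer from some feeds and a float from others) and there is no schema-alignment step before concat. This affects every historical-data build flow that uses vnpy alpha for backtesting.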
Source: https://github.com/vnpy/vnpy/issues/3669
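### `AP-VNPY-3685` — Spread trading run_backtesting() fails silently under Jupyter, so results cannot be trusted <sub>(high)</sub>
In vnpy 4.10, run_backtesting() of the SpreadTrading module hits an event-loop conflict under Jupyter (asyncio already running). Part of the backtest engine's logic does not execute, yet no exception is raised and seemingly normal backtest statistics are returned. The same code has no such problem in command-line Python. vnpy 4.x moved some IO to async, which is incompatible with Jupyter's event loop; this is a hidden trap of the "backtest results look correct but are actually incomplete" kind.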
Source: https://github.com/vnpy/vnpy/issues/3685
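### `AP-VNPY-3700` — Install script does not use a venv, so the global numpy is downgraded and other dependencies break <sub>(medium)</sub>
vnpy's install.bat installs directly into the system or conda base environment and forces numpy down to <2.0 to satisfy vnpy's dependencies, breaking other quant tools that depend on numpy 2.x (such as recent scipy and pytorch). There is no requirements.txt, so the dependency boundary is opaque. In a quant research environment where several tools coexist, vnpy's install script is a common source of "global environment pollution".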
Source: https://github.com/vnpy/vnpy/issues/3700
## zipline (6)
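### `AP-ZIPLINE-138` — Backtest prices are unadjusted; the tutorial chart misleads users into misjudging strategy returns <sub>(high)</sub>
The Zipline tutorial uses an AAPL price chart as its demo, but the bundle stores unadjusted (raw) prices rather than prices adjusted for splits and dividends. The historical prices in the chart differ from actual market prices by roughly 4x (Apple's cumulative split factor), and users mistake "price up 4x" for strategy return. The A-share case is worse: price jumps around ex-rights dates form huge "signals" in unadjusted data and lead technical indicators to produce false breakout signals on the ex-rights date.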
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/138
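### `AP-ZIPLINE-235` — Default fill at the current bar's close understates live slippage and inflates backtest returns <sub>(high)</sub>
After a signal fires on a bar, Zipline's default slippage model fills at that same bar's close (current bar close fill). In live trading the signal can only be filled near the next bar's open (T+1 order execution). Taking A-share daily bars as an example, backtesting with close-price fills overstates daily returns by about 0.1-0.3% on average compared with next-day-open fills, and the annualized gap can exceed 30%. Explicitly configure the slippage model as VolumeShareSlippage or FixedSlippage and set a reasonable volume_limit.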
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/235
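### `AP-ZIPLINE-190` — Setting the calendar start_session to a non-trading day triggers DateOutOfBounds with no hint on how to fix it <sub>(medium)</sub>
When registering a bundle or running an algorithm, if the start_session parameter happens to be a non-trading day (e.g. 1998-01-01, New Year's Day), Zipline's calendar validation raises DateOutOfBounds("cannot be earlier than the first session"). The error message only shows the first day of the trading calendar and does not suggest "use the first trading day instead". In the A-share case: with the SSE/SZSE calendar, a start_date that falls on the day after the last session before Spring Festival (a holiday) triggers the same kind of error, and debugging is very costly.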
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/190
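A simple defence is to snap the requested start date to the first session on or after it before handing it to the backtester. The sketch below works on a plain sorted list of session dates and does not call any Zipline or calendar-library API; `first_session_on_or_after` and the three sample sessions (illustrating a Spring Festival gap) are assumptions made for the example.

```python
from bisect import bisect_left
from datetime import date


def first_session_on_or_after(sessions: list, start: date) -> date:
    """Return the earliest trading session that is >= start (sessions must be sorted)."""
    i = bisect_left(sessions, start)
    if i == len(sessions):
        raise ValueError(f"{start} is after the last known session {sessions[-1]}")
    return sessions[i]


# Illustrative sessions around a long holiday gap
sessions = [date(2024, 2, 8), date(2024, 2, 19), date(2024, 2, 20)]
start_session = first_session_on_or_after(sessions, date(2024, 2, 10))
```

Here a start date inside the holiday resolves to the first session after it, and a start date that is already a session is returned unchanged.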
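### `AP-ZIPLINE-181` — After the asset db goes stale, Pipeline reports "no assets traded" and sends users looking at the data range <sub>(high)</sub>
Zipline's asset database (SQLite) records the start/end trading dates of every stock. If an old Quandl or self-built bundle is used without re-ingesting, Pipeline raises "Failed to find any assets with country_code 'US' that traded between [dates]" when backtesting a newer date range. In the A-share case: if prices are re-downloaded but only the price data is updated and the asset db is not rebuilt, the date ranges of delisted and newly listed stocks are not updated, Pipeline filtering quietly excludes those stocks, and survivorship bias results.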
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/181
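### `AP-ZIPLINE-285` — week_start()/week_end() fail silently under custom (non-US) calendars <sub>(medium)</sub>
date_rules.week_start() and date_rules.week_end() in Zipline's schedule_function rely on the trading calendar's logic for the first and last session of a week, but under non-US calendars (such as ASX or SSE) that logic is incompatible with the offset computation built for the NYSE calendar, so the schedule either never fires or fires on the wrong date. In the A-share case: with the SSE calendar, in weeks containing long holidays such as Spring Festival, week_start may skip the entire holiday week without rebalancing, and users cannot spot the missed schedule from the logs.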
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/285
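### `AP-ZIPLINE-240` — Backtest dates must be in UTC; passing a naive datetime raises a deep AssertionError <sub>(medium)</sub>
Zipline internally requires every timestamp to be a UTC-aware datetime. When a user passes a naive datetime (no timezone information, e.g. pd.Timestamp('2020-01-01')), no error is raised at the entry point; instead, deep inside algorithm execution, an "AssertionError: Algorithm should have a utc datetime" fires, with a stack so deep it is hard to locate. A-share developers importing data in local CST time hit this trap very easily and need an explicit tz_localize('UTC') when registering the bundle.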
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/240
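The two conversions that usually matter here are sketched below with plain pandas: attaching UTC to a daily session label, and converting an intraday timestamp recorded in exchange-local time. The helper names are hypothetical, and `Asia/Shanghai` is an assumption standing in for the local CST zone mentioned above.

```python
import pandas as pd


def session_to_utc(day) -> pd.Timestamp:
    """Daily session label: attach UTC without shifting the calendar date."""
    ts = pd.Timestamp(day)
    return ts.tz_localize("UTC") if ts.tzinfo is None else ts.tz_convert("UTC")


def intraday_to_utc(ts, local_tz: str = "Asia/Shanghai") -> pd.Timestamp:
    """Wall-clock timestamp in exchange-local time: localize first, then convert."""
    ts = pd.Timestamp(ts)
    if ts.tzinfo is None:
        ts = ts.tz_localize(local_tz)
    return ts.tz_convert("UTC")


start = session_to_utc("2020-01-01")
open_bar = intraday_to_utc("2020-01-02 09:30")
```

Keeping the two cases separate avoids a common slip: converting a date-only session label from local time to UTC moves it to the previous calendar day.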
## zvt (6)
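### `AP-ZVT-183` — An ex-rights factor of inf/NaN goes straight into multiplication, so price adjustment fails silently <sub>(high)</sub>
When computing the forward-adjustment factor, ZVT derives qfq_factor from the new/old price ratio. When old == 0 (first day of a new listing, or missing data) the factor is inf; when kdata.open itself is None (a suspended day that was never filled) the multiplication raises TypeError. The result: adjustment for the whole entity is aborted and all subsequent bars are lost, but the main flow only logs ERROR and does not stop, so users often do not know that data for many stocks is already corrupted.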
Source: https://github.com/zvtvz/zvt/issues/183
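A defensive version of the ratio and the multiplication can be sketched in a few lines. This is not ZVT's implementation; `safe_factor` and `adjust` are hypothetical helpers that return `None` for any value that cannot be adjusted, so the caller can skip that bar instead of aborting the whole entity.

```python
import math


def safe_factor(new_price, old_price):
    """Return new/old, or None when the ratio is undefined or non-finite."""
    if new_price is None or old_price is None or old_price == 0:
        return None
    factor = new_price / old_price
    return factor if math.isfinite(factor) else None


def adjust(price, factor):
    """Apply the factor; propagate None instead of raising TypeError."""
    if price is None or factor is None:
        return None
    return price * factor


adjusted = adjust(10.0, safe_factor(9.0, 10.0))   # normal bar
skipped = adjust(None, safe_factor(9.0, 10.0))    # suspended day with open=None
```

The caller still has to decide what to do with `None` results (log and count them, for instance), but the remaining bars of the entity are no longer lost.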
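### `AP-ZVT-179` — Exceptions are swallowed after a third-party data API quota is exceeded, so data goes missing silently <sub>(high)</sub>
When ZVT uses JoinQuant's jqdatasdk to bulk-fetch bars for the whole market (4000+ stocks), it hits JoinQuant's daily maximum query count (the error reports that the daily maximum number of queries has been exceeded). ZVT catches the exception and moves on to the next entity, so after the quota is hit the day's data for all remaining stocks is silently missing. A backtest that uses this incomplete database gets systematically biased factor values, with no warning.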
Source: https://github.com/zvtvz/zvt/issues/179
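### `AP-ZVT-161` — Whole-market batch factor computation on SQLite triggers "too many SQL variables" <sub>(medium)</sub>
When computing multi-stock factors such as VolumeUpMaFactor, ZVT puts every entity_id into the IN clause of a single SQL statement. Querying the whole A-share market (5000+ stocks) at once exceeds SQLite's default limit SQLITE_MAX_VARIABLE_NUMBER=999. Raising max_allowed_packet (a MySQL parameter) has no effect; the root cause is SQLite's cap on the number of variables. The correct fix is to query in batches, but early ZVT versions did not handle this boundary.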
Source: https://github.com/zvtvz/zvt/issues/161
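The batching fix can be illustrated with the standard-library `sqlite3` module. The table name `kdata`, its columns, and `query_in_chunks` are hypothetical; the chunk size of 500 is simply chosen to stay under the 999-variable limit quoted above.

```python
import sqlite3


def query_in_chunks(conn, entity_ids, chunk_size=500):
    """Run the IN (...) query in batches so no statement exceeds the variable limit."""
    rows = []
    for i in range(0, len(entity_ids), chunk_size):
        chunk = entity_ids[i:i + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        sql = f"SELECT entity_id, close FROM kdata WHERE entity_id IN ({placeholders})"
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows


# In-memory demo table with 5000 hypothetical entities
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kdata (entity_id TEXT, close REAL)")
ids = [f"stock_sz_{n:06d}" for n in range(5000)]
conn.executemany("INSERT INTO kdata VALUES (?, ?)", [(e, 1.0) for e in ids])
rows = query_in_chunks(conn, ids)
```

Each statement binds at most `chunk_size` parameters, so the same code works regardless of how the SQLite build was configured.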
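### `AP-ZVT-129` — Wildcard imports hide API changes between versions; enums such as AdjustType vanish without explanation <sub>(medium)</sub>
ZVT's documentation examples import every symbol with `from zvt import *`. When a ZVT upgrade refactors the enums (for example, moving AdjustType into a submodule), the wildcard import no longer includes the symbol and an AttributeError is raised. Users assume an installation problem, when in fact it is a breaking API change between versions that was not flagged in the CHANGELOG, and the wildcard import hides where the symbol actually came from. Import enum classes explicitly.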
Source: https://github.com/zvtvz/zvt/issues/129
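### `AP-ZVT-187` — The backtest engine does not terminate early on an empty result from the data layer, causing a cascade of null-pointer-style crashes <sub>(medium)</sub>
After load_data completes and the data turns out to be empty, ZVT's Trader does not exit early; it passes the empty DataFrame into the selector computation, which sets off a chain of NoneType failures. The stack is deep and the root cause hard to locate, so users assume a problem in their strategy logic. The root cause is a misconfigured data time window (start/end outside the range covered by the database) with no effective validation.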
Source: https://github.com/zvtvz/zvt/issues/187
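### `AP-ZVT-183B` — Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) bar table makes factor values drift <sub>(high)</sub>
ZVT provides three separate tables: Stock1dKdata (unadjusted), Stock1dHfqKdata (backward-adjusted), and Stock1dQfqKdata (forward-adjusted). When users mix two of them while computing price-momentum or moving-average factors (for example, unadjusted prices for the moving average and backward-adjusted prices for returns), factor values jump around ex-rights dates. ZVT does not check adjustment-type consistency across tables, so the mix passes silently.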
Source: https://github.com/zvtvz/zvt/issues/183
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-087--qlib
**Scan date**: 2026-04-22
**Stats**: 11 files, 60 classes, 0 functions, 11 stages
## Modules (11)
- [data_expression_layer](components/data_expression_layer.md): 5 classes
- [data_operators](components/data_operators.md): 6 classes
- [data_provider_system](components/data_provider_system.md): 5 classes
- [dataset_and_handler](components/dataset_and_handler.md): 6 classes
- [model_training](components/model_training.md): 6 classes
- [trading_strategy](components/trading_strategy.md): 6 classes
- [backtest_execution](components/backtest_execution.md): 7 classes
- [nested_execution](components/nested_execution.md): 4 classes
- [rl_trading](components/rl_trading.md): 6 classes
- [workflow_and_recording](components/workflow_and_recording.md): 6 classes
- [rolling_backtest](components/rolling_backtest.md): 3 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 169
fatal_constraints_count: 61
non_fatal_constraints_count: 273
use_cases_count: 38
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
## Domain Constraints Injected (39)
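- **`SHARED-BT-LAB-001`** <sub>(fatal)</sub>: Look-ahead bias: when simulating a trading decision at historical time t, information that only becomes known after t must not be used. The most common forms: (1) computing a signal from the close and filling at the same day's close; (2) marking an indicator computed after the close of day T on that same bar; (3) assuming fills at the day's high/low. Signal computation and execution time must be aligned: compute the signal after the close of day T, fill at the open of day T+1.
- **`SHARED-BT-LAB-002`** <sub>(high)</sub>: Indicator warmup period: rolling-window indicators are NaN for the first N bars, and those bars must not take part in signal computation or position decisions. The indicator's warmup_period is required to equal the longest lookback, and positions should be zero during warmup.
- **`SHARED-BT-LAB-003`** <sub>(fatal)</sub>: Time-series splits for ML/DL models must follow chronological order, TRAIN < VALID < TEST. Random k-fold splitting must not be used (it mixes future data into the training set). Use TimeSeriesSplit or walk-forward validation.
- **`SHARED-BT-LAB-004`** <sub>(fatal)</sub>: Open/high/low fill assumptions: a daily backtest that assumes selling at the day's high or buying at the day's low (e.g. a momentum strategy that "takes profit at the high") is plain look-ahead, because the intraday high/low is only confirmed after the close. The fill price may only be the open or the previous day's close (with slippage).
- **`SHARED-BT-LAB-005`** <sub>(high)</sub>: Off-by-one alignment: pandas operations such as rolling/shift easily introduce subtle one-period offset errors. Record the "observation time" of each series explicitly in code and verify key time-alignment relations with assert.
- **`SHARED-BT-LAB-006`** <sub>(high)</sub>: Overfitting: the more backtests are run, the higher the probability of overfitting. Bailey et al. (2014) show that the expected maximum Sharpe ratio across trials grows with the number of backtests, even when the true Sharpe ratio is zero. Use walk-forward validation instead of exhaustive in-sample parameter search, and report the Deflated Sharpe Ratio (DSR) rather than the peak Sharpe.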
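- **`SHARED-BT-SURV-001`** <sub>(fatal)</sub>: Survivorship bias: using today's index constituents as the historical backtest universe omits stocks that once existed but were later delisted, removed, or merged, and systematically overstates the strategy's historical returns. The backtest universe must use historical point-in-time snapshots (point-in-time universe).
- **`SHARED-BT-SURV-002`** <sub>(high)</sub>: In-sample / out-of-sample split: strategy development and parameter selection must be completed in-sample. Out-of-sample data is for final validation only; do not "look" at it repeatedly and keep tuning (that turns the out-of-sample set into a new in-sample set and repeats the overfitting).
- **`SHARED-BT-SURV-003`** <sub>(high)</sub>: Filling suspended or missing data: prices on suspended days must not simply be forward-filled with the previous close, because that lets "zero-volume" days take part in factor computation and signal generation during the backtest. Filter out missing trading days explicitly in the factor computation layer; do not fill.
- **`SHARED-BT-SURV-004`** <sub>(high)</sub>: Extreme-value contamination: raw market data may contain data-source errors (such as ex-rights adjustments applied late, or extreme prices from manual entry mistakes). Feeding it into factor computation without cleaning produces extreme signals that contaminate the whole cross-section. Filter 3-sigma outliers at the pipeline entry.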
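- **`SHARED-BT-COST-001`** <sub>(fatal)</sub>: Transaction costs (commission + stamp duty/transfer tax + transfer fee) must be configured at backtest initialization; a zero-cost default must not be used. Performance metrics from a backtest that ignores costs are deceptive, especially for high-turnover strategies (round-trip costs often consume 50%+ of gross returns).
- **`SHARED-BT-COST-002`** <sub>(high)</sub>: Slippage modelling: a backtest without slippage assumes every order fills at the ideal price, and high-frequency strategies then suffer serious losses live as fill prices deteriorate. Configure at least a fixed spread or proportional slippage; large orders should use a volume-share model (e.g. no more than 5% of daily volume).
- **`SHARED-BT-COST-003`** <sub>(high)</sub>: Turnover must be shown in the backtest performance report and analysed together with costs. When monthly turnover exceeds 50% (600%+ annualized), net returns are extremely sensitive to the cost assumption: every 10 bps change in cost can flip the profit/loss conclusion, so a cost sensitivity analysis is mandatory.
- **`SHARED-BT-COST-004`** <sub>(medium)</sub>: Position sizing must respect the capital constraint: the backtest should simulate the actual number of shares held (rounded) under a fixed amount of capital, not assume fractional shares. For small-cap stocks, the minimum trading unit (A-shares: 100 shares per lot) makes the achievable position deviate from the target weight, and the backtest should simulate this rounding effect.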
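- **`SHARED-BT-TIME-001`** <sub>(high)</sub>: Consistent timestamp timezones: when merging several data sources, mixing UTC and local time is a common source of data corruption. All timestamps must be converted to one timezone at the pipeline entry (UTC for storage and the market's local timezone for display are recommended); different timezones must not be mixed midway through the pipeline.
- **`SHARED-BT-TIME-002`** <sub>(high)</sub>: Trading-calendar alignment: when merging data from different markets or frequencies (e.g. daily prices + weekly factors), reindex/merge against an explicit trading calendar. Do not outer-join and then fillna, which creates fake rows on non-trading days (holidays).
- **`SHARED-BT-TIME-003`** <sub>(high)</sub>: Boundary checks for incremental updates: when updating historical data incrementally, query the database for the latest stored date and download only data after that date. Re-downloading and appending existing data produces rows with duplicate timestamps and corrupts the backtest's time ordering. Verify that there are no duplicates before and after the update (index.duplicated().any() == False).
- **`SHARED-BT-TIME-004`** <sub>(medium)</sub>: Distorted performance attribution: a poorly chosen benchmark distorts the Alpha/Beta calculation. Choose a passive benchmark the strategy could actually invest in (such as an HS300 ETF) rather than a price index that cannot be invested in directly (such as the HS300 index). A price index excludes reinvested dividends and understates the benchmark's holding return.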
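- **`SHARED-BT-PERF-001`** <sub>(medium)</sub>: Max drawdown must be computed on the net value series (portfolio value), not on a cumulative return series. A cumulative sum of log returns measures ln(V_trough / V_peak) rather than the percentage drawdown, so reading it directly misstates the depth (a 50% drawdown appears as -0.693); convert back with exp() before reporting.
- **`SHARED-BT-PERF-002`** <sub>(medium)</sub>: Sharpe ratio annualization convention: annualized Sharpe = daily Sharpe × sqrt(252) (equities, 252 trading days) or × sqrt(365) (crypto, 365 days). Defaults differ between systems; confirm the annualization factor before comparing across systems, otherwise the Sharpe ratios are not comparable.
- **`SHARED-BT-PERF-003`** <sub>(medium)</sub>: Calmar and Sortino ratios are better risk-adjusted return measures than the Sharpe ratio: Sharpe assumes normally distributed returns, while return distributions in A-share and crypto markets are markedly left-skewed and fat-tailed, so Sharpe understates downside risk. A quantitative evaluation should report both Sortino (downside volatility only) and Calmar (annualized return / max drawdown) and should not rely on Sharpe alone.
- **`SHARED-BT-PERF-004`** <sub>(medium)</sub>: Performance attribution should be decomposed into alpha (active return), beta (market return), factor exposure return (style/sector), and idiosyncratic return (stock selection). A backtest without attribution cannot distinguish "a good strategy" from "a tailwind market where the beta happened to be right".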
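- **`SHARED-FR-IC-001`** <sub>(high)</sub>: The IC (information coefficient) is the core measure of a factor's predictive power, defined as the Spearman rank correlation between factor values and next-period returns; ICIR = mean(IC) / std(IC). |IC| > 0.05 is taken as preliminary evidence of predictive power, and ICIR > 0.5 as stable. Reporting backtest performance without computing IC is a typical case of missing proof of factor validity.
- **`SHARED-FR-IC-002`** <sub>(high)</sub>: IC decay analysis: a factor's predictive power usually decays as the holding period lengthens. Compute the IC series at 1/5/10/20 days to identify the factor's optimal holding period. A factor whose IC is high at 1 day but decays quickly by 20 days is a short-term factor and unsuitable for monthly rebalancing, and vice versa. Using the wrong holding period seriously damages the factor's live performance.
- **`SHARED-FR-IC-003`** <sub>(high)</sub>: Harvey, Liu & Zhu (2016) warn that academia has found 300+ "significant" factors, many of which are false discoveries under multiple testing. Requirements for factor validity: t-stat > 3.0 (rather than the traditional 1.96); or independent replication in different periods or markets; or a clear economic rationale. A factor that meets none of these is very likely a data-mining artefact.
- **`SHARED-FR-IC-004`** <sub>(high)</sub>: Factor turnover control: a factor with high IC but high turnover may have a negative net IC after transaction costs. Compute the turnover-adjusted effective IC: net_IC = IC - turnover × cost_per_turn. Target turnover ≤ 50% (monthly).
- **`SHARED-FR-IC-005`** <sub>(medium)</sub>: The factor half-life is the core parameter of signal strength and directly determines the optimal rebalancing frequency. Half-life < 5 days: rebalance daily or weekly; 5-20 days: weekly or biweekly; > 20 days: monthly. Rebalancing a short-term factor monthly lets much of the alpha dissipate during the holding period.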
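- **`SHARED-FR-NEUT-001`** <sub>(high)</sub>: Industry neutralization: if factor values are not neutralized against the industry mean, factor returns mix in industry-rotation returns, and it is hard to tell whether the factor itself or the industry exposure drove the result. The operation: factor_neutral = factor - industry_mean(factor).
- **`SHARED-FR-NEUT-002`** <sub>(high)</sub>: Market-cap neutralization: the small-cap effect (small caps outperforming large caps) is one of the most persistent anomalies in financial history and contaminates almost every non-neutralized factor. If a factor is highly correlated with market cap, stock selection tilts systematically toward small caps and the return comes from size exposure rather than the factor itself. Neutralize for industry and market cap together (Fama-MacBeth regression or the residual method).
- **`SHARED-FR-NEUT-003`** <sub>(high)</sub>: Outlier handling (Winsorize/MAD): raw factor values usually contain extreme values, which distort quantile analysis (such as Q1/Q10 deciles). Winsorize the raw factor values (clip to [1%, 99%] or 3-sigma) or apply MAD (median absolute deviation) clipping before ranking or neutralizing.
- **`SHARED-FR-NEUT-004`** <sub>(medium)</sub>: Factor orthogonalization: when several factors are combined into a composite score, combining highly correlated factors amounts to over-weighting a single factor and dilutes signal diversity. Orthogonalize the factors before combining (Gram-Schmidt or PCA) to remove multicollinearity between them.
- **`SHARED-FR-NEUT-005`** <sub>(medium)</sub>: Missing-data filling: for NaN values in factor computation (suspensions, new listings, data gaps), filling with a mean computed over the full sample introduces look-ahead bias (that mean includes future information), and dropping the rows entirely creates survivorship bias. The correct approach is to fill with the cross-sectional median (the median across all stocks on that day, which does not depend on the future) or to exclude the stock for that day.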
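- **`SHARED-FR-PORT-001`** <sub>(high)</sub>: Quantile analysis: factor evaluation should use the long-short return spread (top minus bottom) between Q1/Q5 (quintile) or Q1/Q10 (decile) groups as the primary metric, not the long-only return alone. The "monotonicity" test of long Q1 versus short Q5 is the core evidence of factor validity: monotonically increasing/decreasing > non-monotonic >> effective on the long side only.
- **`SHARED-FR-PORT-002`** <sub>(medium)</sub>: Alpha decay test: the stability of a factor's monthly IC across regimes (bull, bear, range-bound) is important evidence of robustness. A factor whose IC holds only in one particular market state is unsuitable for all-weather deployment; show the IC time series in segments (rolling 12M) to identify periods when the factor fails.
- **`SHARED-FR-PORT-003`** <sub>(medium)</sub>: Turnover-aware selection: for stocks ranked near the middle (49th-51st percentile), small rank fluctuations trigger rebalancing and generate a large amount of wasted transaction cost. Set a buffer zone during selection: rebalance only when the rank change exceeds a threshold.
- **`SHARED-FR-PORT-004`** <sub>(medium)</sub>: Statistical significance of group returns (bootstrap test): a factor's quantile return spread (Q1-Q5) may be due to chance even when it is large on historical data, so significance needs to be checked with a bootstrap or t-test (p-value < 0.05). Quantile returns from short backtest periods (< 3 years) are especially unreliable.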
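- **`SHARED-FR-XFER-001`** <sub>(high)</sub>: Cross-market portability: a factor that works in one market does not necessarily work in another. Applying a US equity factor directly to A-shares, or an equity factor to futures or crypto, requires independent IC validation; cross-market generality must not be assumed. A-share-specific anomalies (such as the reversal effect and ST price anomalies) do not carry over directly to US equities.
- **`SHARED-FR-XFER-002`** <sub>(medium)</sub>: Stability of factor validity over time: a factor that once worked gradually stops working as the market learns and arbitrages it away (McLean & Pontiff 2016 show an average decay of 58% after publication). Re-evaluate factor IC regularly (quarterly or yearly) and replace or down-weight failed factors promptly.
- **`SHARED-FR-XFER-003`** <sub>(medium)</sub>: Interaction between factors and the macro environment: the interest-rate cycle, the business cycle, and market sentiment significantly affect factor validity. Value factors (low P/B) work better when rates are rising; momentum factors work better in trending markets and fail in range-bound ones. Before deploying a factor, assess how well the current macro environment matches the environment in which the factor works best.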
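
Several of the constraints above reduce to a few lines of arithmetic. The following is a minimal standard-library sketch of four of them: next-bar execution (`SHARED-BT-LAB-001`), max drawdown on net value (`SHARED-BT-PERF-001`), Sharpe annualization (`SHARED-BT-PERF-002`), and lot rounding (`SHARED-BT-COST-004`). All function names are hypothetical and the inputs are plain Python lists rather than any framework's data structures.

```python
import math


def next_bar_positions(signals):
    """A signal computed at the close of bar t is only acted on at bar t+1."""
    return [0] + list(signals[:-1])


def max_drawdown(net_values):
    """Worst peak-to-trough decline of a net value series, as a negative fraction."""
    peak, worst = net_values[0], 0.0
    for v in net_values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


def annualized_sharpe(period_returns, periods_per_year=252):
    """Mean/std of per-period returns scaled by sqrt(periods_per_year); risk-free rate taken as 0."""
    n = len(period_returns)
    mean = sum(period_returns) / n
    var = sum((r - mean) ** 2 for r in period_returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def round_to_lot(target_shares, lot_size=100):
    """Round a target share count down to a whole number of lots."""
    return int(target_shares // lot_size) * lot_size
```

For example, `next_bar_positions([1, 0, 1])` gives `[0, 1, 0]`, and `annualized_sharpe(returns, 365)` applies the 365-day convention to the same per-period statistics.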
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
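Three of the locks above (`SL-03`, `SL-05`, `SL-06`) are simple structural rules that can be checked before any trading code runs. The sketch below uses only the standard library; the function names are hypothetical and none of this is ZVT's own validation code.

```python
import math


def is_valid_entity_id(entity_id: str) -> bool:
    """SL-03: three non-empty parts, entity_type_exchange_code."""
    parts = entity_id.split("_", 2)
    return len(parts) == 3 and all(parts)


def check_signal_sizing(position_pct=None, order_money=None, order_amount=None) -> None:
    """SL-05: exactly one sizing field may be set on a trading signal."""
    given = [v for v in (position_pct, order_money, order_amount) if v is not None]
    if len(given) != 1:
        raise ValueError(f"expected exactly one sizing field, got {len(given)}")


def decode_filter_result(value):
    """SL-06: True means buy, False means sell, None/NaN means no action."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return "buy" if bool(value) else "sell"
```

Running such checks at the boundary where signals are created keeps a violation from surfacing later as a confusing error inside the trader.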
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **38**
## `KUC-101`
**Source**: `examples/benchmarks/LightGBM/features_resample_N.py`
Resampling high-frequency 1-minute data to lower frequencies (e.g., daily) for downstream feature computation and model training.
## `KUC-102`
**Source**: `examples/benchmarks/LightGBM/features_sample.py`
Resampling 1-minute data to daily frequency by extracting a specific minute point from each day for feature generation.
## `KUC-103`
**Source**: `examples/benchmarks/LightGBM/multi_freq_handler.py`
Loading and processing data with both daily frequency features and 15-minute frequency features for models that leverage multiple time scales.
## `KUC-104`
**Source**: `examples/benchmarks/TFT/tft.py`
Training and evaluating a Temporal Fusion Transformer model using Alpha158 features for stock return prediction.
## `KUC-105`
**Source**: `examples/benchmarks/TFT/data_formatters/qlib_Alpha158.py`
Formatting Alpha158 dataset specifically for Temporal Fusion Transformer models with proper column definitions and transformations.
## `KUC-106`
**Source**: `examples/benchmarks/TFT/libs/hyperparam_opt.py`
Optimizing hyperparameters for Temporal Fusion Transformer models using random search, supporting both single-GPU and distributed training.
## `KUC-107`
**Source**: `examples/benchmarks/TRA/example.py`
Training a Temporal Routing Adaptor (TRA) model for stock prediction using LSTM-based architecture with configurable seed for reproducibility.
## `KUC-108`
**Source**: `examples/benchmarks/TRA/src/dataset.py`
Creating time series slices for the TRA model, handling multi-index pandas data with instrument and datetime levels for sequential model training.
## `KUC-109`
**Source**: `examples/benchmarks/TRA/src/model.py`
Implementing the Temporal Routing Adaptor (TRA) neural network model with configurable LSTM cells and loss functions for stock prediction.
## `KUC-110`
**Source**: `examples/benchmarks/TRA/Reports.ipynb`
Analyzing and reporting backtest results for TRA model including MSE, MAE, IC metrics and top-N ranking performance.
## `KUC-111`
**Source**: `examples/benchmarks_dynamic/DDG-DA/workflow.py`
Running domain adaptation workflow for dynamic trading strategies using DDG-DA method to adapt models across different market regimes.
## `KUC-112`
**Source**: `examples/benchmarks_dynamic/baseline/rolling_benchmark.py`
Running rolling window benchmarks for model evaluation using fixed configurations with Linear and LightGBM models on Alpha158 features.
## `KUC-113`
**Source**: `examples/data_demo/data_cache_demo.py`
Demonstrating how to serialize and cache processed data handlers to disk to avoid redundant data preprocessing operations.
## `KUC-114`
**Source**: `examples/data_demo/data_mem_resuse_demo.py`
Demonstrating how to reuse processed data in memory across multiple training iterations to improve efficiency.
## `KUC-115`
**Source**: `examples/highfreq/highfreq_handler.py`
Handling high-frequency 1-minute market data with specific feature configurations for short-term trading models.
## `KUC-116`
**Source**: `examples/highfreq/highfreq_ops.py`
Implementing operators for high-frequency data processing including DayLast (daily last value), forward/backward fill, date extraction, and null handling.
## `KUC-117`
**Source**: `examples/highfreq/highfreq_processor.py`
Normalizing high-frequency price and volume data using median-based scaling for consistent feature ranges across instruments.
## `KUC-118`
**Source**: `examples/highfreq/workflow.py`
Executing end-to-end high-frequency trading workflow from data loading through model training to signal generation for minute-level strategies.
## `KUC-119`
**Source**: `examples/hyperparameter/LightGBM/hyperparameter_158.py`
Optimizing LightGBM hyperparameters using Optuna for Alpha158 feature dataset to find best model configuration for stock prediction.
## `KUC-120`
**Source**: `examples/hyperparameter/LightGBM/hyperparameter_360.py`
Optimizing LightGBM hyperparameters using Optuna for the Alpha360 feature dataset (raw price/volume fields over a 60-day lookback window).
## `KUC-121`
**Source**: `examples/model_interpreter/feature.py`
Extracting and analyzing feature importance from trained GBDT models to understand which factors drive model predictions.
## `KUC-122`
**Source**: `examples/model_rolling/task_manager_rolling.py`
Managing rolling training tasks across multiple time periods with task generation, collection, and multiprocessing support.
## `KUC-123`
**Source**: `examples/nested_decision_execution/workflow.py`
Running backtests for nested decision trading strategies that execute at multiple frequencies (daily and 30-minute) with portfolio analysis.
## `KUC-124`
**Source**: `examples/online_srv/online_management_simulate.py`
Simulating online model management with rolling tasks, including model updates and signal generation for live trading scenarios.
## `KUC-125`
**Source**: `examples/online_srv/rolling_online_management.py`
Managing online models with rolling updates including first training, routine updates, strategy addition, and signal preparation for production.
## `KUC-126`
**Source**: `examples/online_srv/update_online_pred.py`
Updating online model predictions with first training and subsequent prediction refreshes for production deployment.
## `KUC-127`
**Source**: `examples/orderbook_data/create_dataset.py`
Importing raw orderbook data (ticks, orders, transactions) into Qlib's data storage using Arctic time-series database.
## `KUC-128`
**Source**: `examples/orderbook_data/example.py`
Demonstrating how to use imported orderbook data in Qlib with custom providers and non-aligned time series handling.
## `KUC-129`
**Source**: `examples/portfolio/prepare_riskdata.py`
Preparing portfolio risk model data using Structured Covariance Estimator for risk management and portfolio optimization.
## `KUC-130`
**Source**: `examples/rl/simple_example.ipynb`
Implementing a simple reinforcement learning simulator and state interpreter for RL-based trading agent development.
## `KUC-131`
**Source**: `examples/rl_order_execution/scripts/gen_pickle_data.py`
Generating pickle-format training data for reinforcement learning order execution from high-frequency backtest results.
## `KUC-132`
**Source**: `examples/rl_order_execution/scripts/gen_training_orders.py`
Generating synthetic training orders for RL order execution by sampling from historical volume distributions with train/valid/test splits.
## `KUC-133`
**Source**: `examples/rl_order_execution/scripts/merge_orders.py`
Merging multiple order files into consolidated train/valid/test datasets for RL order execution model training.
## `KUC-134`
**Source**: `examples/rolling_process_data/rolling_handler.py`
Creating a data handler specifically designed for rolling window data processing with configurable time ranges.
## `KUC-135`
**Source**: `examples/rolling_process_data/workflow.py`
Executing rolling window workflows for model training and evaluation with pre-processor caching to optimize repeated runs.
## `KUC-136`
**Source**: `examples/run_all_model.py`
Running comprehensive benchmarks across multiple models with standardized configurations for fair model comparison and evaluation.
## `KUC-137`
**Source**: `examples/tutorial/detailed_workflow.ipynb`
Learning Qlib's detailed workflow including data access, feature computation, model training, and portfolio analysis through interactive examples.
## `KUC-138`
**Source**: `examples/workflow_by_code.py`
Building Qlib workflows programmatically using Python code instead of YAML configuration for fine-grained control over the pipeline.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
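## `CW-BT-001` — Cerebro as a unified orchestration engine
**From**: backtrader · **Applicable to**: backtesting
backtrader uses Cerebro as its single entry point, managing the lifecycle of data feeds, strategies, analyzers, and observers in one place, and supports running multiple strategies and multiple data sources in one cerebro.run(). zvt's StockTrader currently binds to only one set of factors per instantiation and lacks a unified orchestration layer for multi-strategy portfolios; borrowing the Cerebro pattern would let users combine several Trader instances in one runner for side-by-side evaluation.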
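## `CW-BT-002` — Pluggable Analyzers for performance evaluation
**From**: backtrader · **Applicable to**: backtesting
backtrader ships plug-and-play Analyzers such as SharpeRatio, DrawDown, TimeReturn, and TradeAnalyzer, so any performance metric can be attached without modifying strategy code. zvt's performance evaluation is currently weak and has no standardized Analyzer interface; borrowing this pattern would let users get a risk-adjusted return report with just cerebro.addanalyzer(SharpeRatio).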
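## `CW-BT-003` — Sizer: position sizing as a separate concern
**From**: backtrader · **Applicable to**: backtesting
backtrader abstracts position sizing (how many shares or what fraction to buy on each entry) into a separate Sizer that is fully decoupled from signal logic; FixedSize, PercentSizer, and others are built in, and users can define their own. zvt currently has no explicit Sizer concept, and position-control logic is scattered across hooks such as Trader.on_profit_control; introducing a Sizer interface would let strategy signals and money-management rules evolve independently and be reused in combination.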
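To make the separation concrete, here is a minimal sketch of what a Sizer-style interface could look like. It is a hypothetical illustration, not backtrader's classes and not an existing zvt API; the class names only echo the built-ins mentioned above, and the 100-share lot size is an assumption.

```python
from typing import Protocol


class Sizer(Protocol):
    """Decides how many shares to buy; knows nothing about signals."""

    def size(self, cash: float, price: float) -> int: ...


class FixedSize:
    """Always buy the same number of shares, or nothing if cash is short."""

    def __init__(self, shares: int) -> None:
        self.shares = shares

    def size(self, cash: float, price: float) -> int:
        return self.shares if self.shares * price <= cash else 0


class PercentSizer:
    """Spend a percentage of available cash, rounded down to whole lots."""

    def __init__(self, percent: float, lot_size: int = 100) -> None:
        self.percent = percent
        self.lot_size = lot_size

    def size(self, cash: float, price: float) -> int:
        budget = cash * self.percent / 100.0
        lots = int(budget // (price * self.lot_size))
        return lots * self.lot_size
```

A strategy would then emit only "buy X" and let whichever Sizer is configured turn that into a share count, so the same signal logic can be tested under different money-management rules.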
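## `CW-BT-004` — Full set of order types (Limit/Stop/OCO/Bracket)
**From**: backtrader · **Applicable to**: backtesting
backtrader supports a rich set of order types: Market, Limit, Stop, StopLimit, OCO (one cancels the other), and Bracket (a paired take-profit and stop-loss), and it simulates fill slippage and commission schemes. zvt backtests currently support mainly market-price fills and lack simulation of limit orders and combined orders; for high-frequency or live-trading integration, richer order types would greatly improve backtest realism.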
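## `CW-BT-005` — Data resampling and replaying
**From**: backtrader · **Applicable to**: backtesting
backtrader can resample finer-grained data (e.g. 1 min) into coarser bars (e.g. 1 day) on the fly and drive the strategy in sync, or replay tick by tick to simulate how an OHLC bar forms, enabling fine-grained intraday backtests. zvt currently handles multiple timeframes by pre-recording bars at each level and lacks dynamic resampling at run time; borrowing this pattern would support backtests over any combination of time granularities without recording the data again.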
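## `CW-VN-003` — Built-in visualization in the CTA backtest engine
**From**: vnpy · **Applicable to**: backtesting
vnpy's cta_backtester provides a GUI that directly shows the strategy's equity curve, max drawdown, daily P&L, and trade details, with no Jupyter Notebook required. zvt currently visualizes backtest results by calling Plotly through the draw_result method, but has no unified backtest report page; borrowing this pattern would allow packaging a ready-to-use strategy performance dashboard.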
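## `CW-VN-004` — vnpy.alpha ML factor research lab (Lab)
**From**: vnpy · **Applicable to**: factor-research
vnpy.alpha.lab in vnpy 4.0 provides an integrated workflow for data management, model training, signal generation, and strategy backtesting, with a standardized training interface and visual comparison for algorithms such as Lasso, LightGBM, and MLP. zvt's ML capability currently has a single entry point, MaStockMLMachine, and lacks a formal Lab framework; borrowing the Lab pattern would establish a standard "feature engineering → training → signal → backtest" pipeline and lower the barrier to ML experiments.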
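## `CW-QL-001` — Point-in-Time database (preventing future data leakage)
**From**: qlib · **Applicable to**: backtesting
qlib's Point-in-Time Provider guarantees that a query at a given time t returns only data that was actually knowable at t (publication delays and revision history of financial reports are handled correctly), eliminating look-ahead bias from this source in backtests. zvt currently timestamps financial data by reporting period and lacks a "publication date" dimension, so there is a potential bias from selecting stocks with financial data that had not yet been published; introducing the PIT pattern would greatly improve backtest credibility.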
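The point-in-time idea comes down to filtering on the publication date rather than the reporting period. The sketch below is a toy illustration with made-up records and a hypothetical `as_of` helper; it is not qlib's PIT provider or zvt's schema.

```python
from datetime import date

# Made-up records: each report has a reporting period and a later publication date
reports = [
    {"period": date(2023, 12, 31), "published": date(2024, 3, 28), "eps": 1.20},
    {"period": date(2024, 3, 31), "published": date(2024, 4, 29), "eps": 0.35},
]


def as_of(records, t: date):
    """Return the latest-period report that had already been published by t, or None."""
    known = [r for r in records if r["published"] <= t]
    return max(known, key=lambda r: r["period"]) if known else None


visible = as_of(reports, date(2024, 4, 1))
```

On 2024-04-01 only the annual report is visible even though the first quarter has already ended, and before the first publication date nothing is visible at all.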
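## `CW-QL-002` — Recorder + Experiment for experiment management (MLflow style)
**From**: qlib · **Applicable to**: factor-research
qlib's workflow module provides Experiment/Recorder, which automatically record the hyperparameters, features, metrics, and predictions of every training run and support cross-experiment comparison and model versioning. zvt currently has no ML experiment tracking, and each rerun overwrites the previous result; borrowing the Recorder pattern would persist the parameters and results of every factor experiment and support quick reproduction and version comparison.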
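## `CW-QL-003` — Nested Decision Framework (multi-level nested decision execution)
**From**: qlib · **Applicable to**: backtesting
qlib supports nesting a high-frequency execution layer (minute-level order splitting) inside a low-frequency decision layer (daily portfolio rebalancing). The two layers are optimized independently and can run in combination, enabling intraday execution algorithms (such as TWAP or VWAP rebalancing). zvt backtests currently only have daily-bar fill assumptions and lack execution-algorithm modelling; borrowing the nested framework would let zvt separate two questions: "which stocks to hold and when" and "how to build the position with minimal market impact".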
FILE:references/components/backtest_execution.md
# backtest_execution (7 classes)
## `Exchange.deal_order`
`backtest_execution/exchange-deal-order.py:0`
## `Exchange.get_quote`
`backtest_execution/exchange-get-quote.py:0`
## `Position.update`
`backtest_execution/position-update.py:0`
## `Account.update`
`backtest_execution/account-update.py:0`
## `SimulatorExecutor.execute`
`backtest_execution/simulatorexecutor-execute.py:0`
## `Price Model`
`backtest_execution/price-model.py:0`
## `Position Type`
`backtest_execution/position-type.py:0`
FILE:references/components/data_expression_layer.md
# data_expression_layer (5 classes)
## `Expression.load`
`data_expression_layer/expression-load.py:0`
## `Feature.compute`
`data_expression_layer/feature-compute.py:0`
## `PFeature.compute`
`data_expression_layer/pfeature-compute.py:0`
## `Expression.get_longest_back_rolling`
`data_expression_layer/expression-get-longest-back-rolling.py:0`
## `Feature Provider Backend`
`data_expression_layer/feature-provider-backend.py:0`
FILE:references/components/data_operators.md
# data_operators (6 classes)
## `ElemOperator._load_internal`
`data_operators/elemoperator-load-internal.py:0`
## `PairOperator._load_internal`
`data_operators/pairoperator-load-internal.py:0`
## `Rolling._load_internal`
`data_operators/rolling-load-internal.py:0`
## `Ref._load_internal`
`data_operators/ref-load-internal.py:0`
## `If._load_internal`
`data_operators/if-load-internal.py:0`
## `Rolling Functions`
`data_operators/rolling-functions.py:0`
FILE:references/components/data_provider_system.md
# data_provider_system (5 classes)
## `CalendarProvider.calendar`
`data_provider_system/calendarprovider-calendar.py:0`
## `InstrumentProvider.instrument`
`data_provider_system/instrumentprovider-instrument.py:0`
## `FeatureProvider.features`
`data_provider_system/featureprovider-features.py:0`
## `LocalProvider.inst_calculator`
`data_provider_system/localprovider-inst-calculator.py:0`
## `Storage Backend`
`data_provider_system/storage-backend.py:0`
FILE:references/components/dataset_and_handler.md
# dataset_and_handler (6 classes)
## `DatasetH.setup_data`
`dataset_and_handler/dataseth-setup-data.py:0`
## `TSDataSampler.sample`
`dataset_and_handler/tsdatasampler-sample.py:0`
## `DataHandlerLP.get_data`
`dataset_and_handler/datahandlerlp-get-data.py:0`
## `Alpha158.load`
`dataset_and_handler/alpha158-load.py:0`
## `Processor Pipeline`
`dataset_and_handler/processor-pipeline.py:0`
## `Data Sampler`
`dataset_and_handler/data-sampler.py:0`
FILE:references/components/model_training.md
# model_training (6 classes)
## `Model.fit`
`model_training/model-fit.py:0`
## `Model.predict`
`model_training/model-predict.py:0`
## `LGBModel.fit`
`model_training/lgbmodel-fit.py:0`
## `TrainerR.train`
`model_training/trainerr-train.py:0`
## `Model Backend`
`model_training/model-backend.py:0`
## `Training Strategy`
`model_training/training-strategy.py:0`
FILE:references/components/nested_execution.md
# nested_execution (4 classes)
## `NestedExecutor.execute`
`nested_execution/nestedexecutor-execute.py:0`
## `TradeRange.clip_time_range`
`nested_execution/traderange-clip-time-range.py:0`
## `NestedExecutor._init_sub_trading`
`nested_execution/nestedexecutor-init-sub-trading.py:0`
## `Execution Hierarchy`
`nested_execution/execution-hierarchy.py:0`
FILE:references/components/rl_trading.md
# rl_trading (6 classes)
## `RLStrategy.step`
`rl_trading/rlstrategy-step.py:0`
## `RLIntStrategy.generate_trade_decision`
`rl_trading/rlintstrategy-generate-trade-decision.py:0`
## `SAOEIntStrategy.step`
`rl_trading/saoeintstrategy-step.py:0`
## `Simulator.step`
`rl_trading/simulator-step.py:0`
## `RL Policy`
`rl_trading/rl-policy.py:0`
## `State Interpreter`
`rl_trading/state-interpreter.py:0`
FILE:references/components/rolling_backtest.md
# rolling_backtest (3 classes)
## `Rolling.run`
`rolling_backtest/rolling-run.py:0`
## `Rolling.basic_task`
`rolling_backtest/rolling-basic-task.py:0`
## `Ensemble Strategy`
`rolling_backtest/ensemble-strategy.py:0`
FILE:references/components/trading_strategy.md
# trading_strategy (6 classes)
## `BaseStrategy.generate_trade_decision`
`trading_strategy/basestrategy-generate-trade-decision.py:0`
## `TopkDropoutStrategy.generate_trade_decision`
`trading_strategy/topkdropoutstrategy-generate-trade-decis.py:0`
## `WeightStrategyBase.generate_target_weight_position`
`trading_strategy/weightstrategybase-generate-target-weigh.py:0`
## `EnhancedIndexingStrategy.generate_trade_decision`
`trading_strategy/enhancedindexingstrategy-generate-trade-.py:0`
## `Decision Logic`
`trading_strategy/decision-logic.py:0`
## `Risk Constraint`
`trading_strategy/risk-constraint.py:0`
FILE:references/components/workflow_and_recording.md
# workflow_and_recording (6 classes)
## `QlibRecorder.start`
`workflow_and_recording/qlibrecorder-start.py:0`
## `RecordTemp.generate`
`workflow_and_recording/recordtemp-generate.py:0`
## `SignalRecord.generate`
`workflow_and_recording/signalrecord-generate.py:0`
## `SigAnaRecord.generate`
`workflow_and_recording/siganarecord-generate.py:0`
## `PortAnaRecord.generate`
`workflow_and_recording/portanarecord-generate.py:0`
## `Recording Strategy`
`workflow_and_recording/recording-strategy.py:0`
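Portfolio performance analysis built on pyfolio-reloaded: generate a tear sheet in one step (Sharpe, drawdown, annualized return, turnover, per-stock round-trip trades, sector attribution). Suited to standardized post-backtest reporting.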
---
name: pyfolio-performance
description: |-
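Portfolio performance analysis built on pyfolio-reloaded: generate a tear sheet in one step (Sharpe, drawdown,
annualized return, turnover, per-stock round-trip trades, sector attribution). Suited to standardized post-backtest reporting.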
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-106"
compiled_at: "2026-04-22T13:00:50.454770+00:00"
capability_markets: "multi-market"
capability_activities: "backtesting, factor-research"
sop_version: "crystal-compilation-v6.1"
---
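# Pyfolio Performance Analysis (pyfolio-performance)
> Get a tear sheet for your backtest results in one step: Sharpe, drawdown, annualized return, turnover, and sector attribution, all charted for you.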
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (7 total)
### Sphinx Documentation Deployment (`UC-101`)
Automates the build and deployment of Sphinx-generated documentation for the pyfolio library, ensuring consistent documentation deployment across envi
**Triggers**: documentation, deploy, sphinx
### Sphinx Documentation Configuration (`UC-102`)
Configures Sphinx documentation build settings including theme, extensions, and project metadata for generating pyfolio library documentation
**Triggers**: documentation, sphinx config, configuration
### Round Trip Trade Analysis with Tear Sheets (`UC-103`)
Analyzes individual round trip trades (entry/exit) in a portfolio, computing profitability metrics by trade and sector to understand trading efficienc
**Triggers**: round trip, trade analysis, tear sheet
For all **7** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (25 total)
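- **`AP-ZVT-183`**: An ex-rights factor of inf/NaN goes straight into multiplication, so price adjustment fails silently
- **`AP-ZVT-179`**: Exceptions are swallowed after a third-party data API hits its quota, leaving data silently missing
- **`AP-ZVT-183B`**: Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) bar table causes factor drift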
All 25 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-106. Evidence verify ratio = 41.5% and audit fail total = 16. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 25 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-106` blueprint at 2026-04-22T13:00:50.454770+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Round Trip Trade Analysis with Tear Sheets', 'Sphinx Documentation Configuration', 'Sphinx Documentation Deployment', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **25**
## qlib (9)
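### `AP-QLIB-1930`: Backtest results independent of the model: a shared dataset object causes predictions to be overwritten by the first model <sub>(high)</sub>
When several models in Qlib reuse the same already-fitted DatasetH instance, the dataset's internal normalization parameters (the statistics determined by fit_start_time/fit_end_time) are frozen after the first fit. Switching models without re-initializing the dataset leaves every model using the same prediction signal. The symptom: the backtest equity curve is identical whether you use LightGBM, XGBoost, or a DNN. This is the most dangerous kind of anti-pattern: the experiment appears to run, but every conclusion is invalid.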
Source: https://github.com/microsoft/qlib/issues/1930
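### `AP-QLIB-2090`: Configuring both fit_start_time and the train segment causes implicit data leakage <sub>(high)</sub>
Qlib's DatasetH has two "training data ranges": the handler's fit_start_time/fit_end_time (the range used to fit the normalizers) and segments.train (the range used to train the model). A common mistake is to let fit_end_time extend over the valid/test segments, so the normalization statistics (mean, standard deviation) include future data and introduce look-ahead bias. The two are configured independently but are semantically coupled, and the documentation does not state explicitly that fit_end_time must be <= train_end.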
Source: https://github.com/microsoft/qlib/issues/2090
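The coupling between `fit_end_time` and `segments.train` can be guarded with a small pre-flight check. The sketch below assumes the two settings are available as an ISO date string and a plain dict of `(start, end)` tuples; `check_fit_window` is a hypothetical helper, not part of Qlib.

```python
from datetime import date


def check_fit_window(fit_end_time: str, segments: dict) -> None:
    """Raise if the normalizer's fit window extends past the training segment."""
    fit_end = date.fromisoformat(fit_end_time)
    train_end = date.fromisoformat(segments["train"][1])
    if fit_end > train_end:
        raise ValueError(
            f"fit_end_time {fit_end} is after train end {train_end}: "
            "normalization statistics would include valid/test data"
        )


# Illustrative segment boundaries (assumed values, not taken from any config).
segments = {
    "train": ("2008-01-01", "2014-12-31"),
    "valid": ("2015-01-01", "2016-12-31"),
    "test": ("2017-01-01", "2020-08-01"),
}
check_fit_window("2014-12-31", segments)  # consistent: returns without error
```

Running such a check before building the dataset turns a silent leak into an immediate configuration error.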
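### `AP-QLIB-2036`: Documentation error in the MACD factor formula: DEA is divided by CLOSE one extra time, making the units inconsistent <sub>(high)</sub>
The Alpha formula example in Qlib's official documentation defines MACD's DEA as EMA(DIF, 9) / CLOSE. But DIF is already dimensionless (it has been divided by CLOSE), so dividing by CLOSE again gives DEA units of 1/price. After cross-sectional standardization, a MACD factor built from this documented formula differs markedly from the correct one, and its IC drops. Documentation-level formula errors like this get copied directly into production factor libraries by many users.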
Source: https://github.com/microsoft/qlib/issues/2036
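### `AP-QLIB-2184`: Custom A-share data imported without filling suspension days with NaN as the convention requires, causing downstream factor noise <sub>(high)</sub>
Qlib's convention is that on suspension days the open/close/high/low/volume/factor fields should all be NaN, so the framework can recognize and skip those days during factor computation. If a user-built A-share dataset keeps the previous day's price on suspension days (common in data exported directly from Eastmoney/Wind), price momentum factors show "false signals" during the suspension (the price is unchanged but the factor is non-zero). Qlib does not validate this convention, so the error flows silently into the training data.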
Source: https://github.com/microsoft/qlib/issues/2184
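A minimal sketch of applying the NaN convention before import, assuming a per-stock daily DataFrame with the six fields named above and assuming that zero (or missing) volume marks a suspension day. `mask_suspended` is a hypothetical helper, and the zero-volume rule is an illustration only; a real dataset may need an explicit suspension flag.

```python
import numpy as np
import pandas as pd

FIELDS = ["open", "close", "high", "low", "volume", "factor"]


def mask_suspended(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with all quote fields set to NaN on suspension days.

    Assumption: a row with zero or missing volume is a suspension day.
    """
    # Cast to float first so integer columns (e.g. volume) can hold NaN.
    out = df.astype({col: float for col in FIELDS})
    suspended = out["volume"].fillna(0) == 0
    out.loc[suspended, FIELDS] = np.nan
    return out


raw = pd.DataFrame(
    {
        "open": [10.0, 10.2, 10.2],
        "close": [10.2, 10.2, 10.5],
        "high": [10.3, 10.2, 10.6],
        "low": [9.9, 10.2, 10.1],
        "volume": [120000, 0, 98000],  # middle row carries yesterday's price
        "factor": [1.0, 1.0, 1.0],
    },
    index=pd.to_datetime(["2024-03-01", "2024-03-04", "2024-03-05"]),
)
clean = mask_suspended(raw)
```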
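### `AP-QLIB-1892`: The PIT (Point-In-Time) financial data collector depends on an external stock-list API and fails to fetch the full A-share universe <sub>(high)</sub>
Qlib's PIT data collector (point-in-time snapshots of financial data) calls get_hs_stock_symbols() at initialization to get the Shanghai/Shenzhen stock list. That function depends on the Eastmoney API, which often returns only a partial list rather than the full 5000+ stocks, and the function raises ValueError outright when the list is incomplete. If users follow the documented steps, the financial dataset covers only part of the market, and backtests built on PIT financial factors suffer severe survivorship bias (uncollected stocks are implicitly excluded).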
Source: https://github.com/microsoft/qlib/issues/1892
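### `AP-QLIB-2097`: Full-market instrument="all" hits OOM on a 32GB machine while CSI300 runs fine <sub>(medium)</sub>
When loading Alpha158 features, Qlib loads the entire feature matrix for the specified universe into memory at once. Memory use with instrument="csi300" (300 stocks) and with instrument="all" (5000+ stocks) differs by roughly 16x. A 32GB machine running the full market goes OOM during init_instance_by_config, and the error message does not point to memory. Users tend to suspect a configuration error, when what is actually needed is batched loading or streaming feature computation.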
Source: https://github.com/microsoft/qlib/issues/2097
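### `AP-QLIB-1984`: The LightGBM model's label-dimension check never fires, so multi-label training fails silently <sub>(medium)</sub>
Qlib's gbdt.py uses y.values.ndim == 2 to detect multiple labels, but a Series taken from a DataFrame always has ndim 1, so the condition is always False. Multi-label training therefore never takes the squeeze branch; it goes straight into LightGBM training and raises an obscure error deeper in the stack. Users attempting a custom multi-label task cannot trace the error message back to this root cause.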
Source: https://github.com/microsoft/qlib/issues/1984
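The `ndim` behavior behind this anti-pattern is easy to reproduce with pandas alone. The snippet below is an illustration of the pitfall, not Qlib's code; the label column names and the `is_multi_label` helper are hypothetical.

```python
import pandas as pd

labels = pd.DataFrame({"LABEL0": [0.01, -0.02], "LABEL1": [0.03, 0.00]})

single = labels["LABEL0"]             # selecting one column yields a Series
multi = labels[["LABEL0", "LABEL1"]]  # selecting a list yields a DataFrame

# A Series is always 1-D, so an `ndim == 2` test on it can never be True.
series_ndim = single.values.ndim
frame_ndim = multi.values.ndim


def is_multi_label(y) -> bool:
    """Decide from the container's shape instead of a Series' ndim."""
    return isinstance(y, pd.DataFrame) and y.shape[1] > 1
```

Checking the container type and column count makes the multi-label case explicit, so it can be rejected or handled at the entry point rather than deep inside training.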
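### `AP-QLIB-1915`: After dump_bin of custom CSV data, DataHandler reports Length mismatch while D.features works <sub>(high)</sub>
Qlib has two data access paths: D.features (reads the binary files directly) and DataHandler/DataHandlerLP (with a processor pipeline). If the field order or symbol format (for example 600000.SH vs SH600000) of custom A-share CSV data does not match Qlib's conventions at dump_bin time, the DataHandler processors trigger a Length mismatch during align/reindex, while D.features succeeds because it bypasses the processors. This inconsistency between the two paths leads users to believe the data was imported correctly.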
Source: https://github.com/microsoft/qlib/issues/1915
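### `AP-QLIB-1949`: Colab/Linux multiprocessing backend conflicts with Qlib's ParallelExt, making DataHandler completely unusable <sub>(medium)</sub>
In non-fork environments (Windows or Google Colab), when DataHandler loads features in parallel with joblib, ParallelExt fails to access the _backend_args attribute at initialization (AttributeError). The root cause is that joblib 1.5+ removed this internal attribute and Qlib's compatibility layer was not updated. The symptom is a multi-layer nested exception from the D.features call; users cannot tell from the stack trace whether the problem is the parallel backend or the data.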
Source: https://github.com/microsoft/qlib/issues/1949
## vnpy (4)
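### `AP-VNPY-3691`: The bar generator's first bar has a misaligned timestamp, producing a wrong signal in the first period <sub>(high)</sub>
When vnpy's BarGenerator synthesizes N-minute bars, the first bar it pushes is stamped with the minute of the current tick rather than the end of a complete N-minute period. Concretely, a tick at 09:59 triggers the push of an incomplete 5-minute bar (which should not be pushed until 10:04). If the strategy filters in on_bar directly with datetime.minute % 5, the first bar happens to pass the filter, but it contains less than a full period of data and produces a wrong entry signal when used for signal computation.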
Source: https://github.com/vnpy/vnpy/issues/3691
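### `AP-VNPY-3669`: Incompatible schemas between new and old DataFrames raise SchemaError during incremental saves of Alpha-module historical data <sub>(medium)</sub>
When saving bar data to Parquet files, vnpy's Alpha module concatenates newly downloaded data (which may contain Float64 columns) with the existing file (historical Int64 columns) directly via polars.concat. Polars is strictly typed and does not allow implicit type promotion, so it raises SchemaError. The root cause is that different data sources/versions return inconsistent field types (for example, volume is an integer in some feeds and a float in others), and there is no schema alignment step before concat. This affects every historical data build flow that uses vnpy alpha for backtesting.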
Source: https://github.com/vnpy/vnpy/issues/3669
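### `AP-VNPY-3685`: run_backtesting() in the spread trading module fails silently under Jupyter, making results untrustworthy <sub>(high)</sub>
In vnpy 4.10, run_backtesting() in the SpreadTrading module hits an event-loop conflict under Jupyter (asyncio already running). Part of the backtest engine's logic does not execute, yet no exception is raised, and the returned backtest statistics look normal. The same code has no such problem in command-line Python. vnpy 4.x moved some IO to async, which is incompatible with Jupyter's event loop; this is a hidden trap where backtest results look correct but are actually incomplete.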
Source: https://github.com/vnpy/vnpy/issues/3685
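### `AP-VNPY-3700`: Install script does not use a venv, so the global numpy version is downgraded and other dependencies break <sub>(medium)</sub>
vnpy's install.bat installs directly into the system/conda base environment and forces numpy down to <2.0 to satisfy vnpy's dependencies, breaking other quant tools that depend on numpy 2.x (such as newer scipy or pytorch). There is no requirements.txt, so dependency boundaries are opaque. In a quant research environment where multiple tools coexist, vnpy's install script is a common source of global environment pollution.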
Source: https://github.com/vnpy/vnpy/issues/3700
## zipline (6)
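### `AP-ZIPLINE-138`: Backtest prices are unadjusted, and the tutorial chart misleads users about strategy returns <sub>(high)</sub>
The Zipline tutorial uses an AAPL price chart for its demo, but the bundle stores unadjusted (raw) prices rather than prices adjusted for splits and dividends. The historical prices in the chart differ from actual market prices by a factor of about 4 (Apple's cumulative split factor), and users mistake "the price quadrupled" for strategy return. The A-share case is worse: price jumps around ex-rights dates form huge "signals" in unadjusted data and lead technical indicators to produce false breakout signals on ex-rights dates.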
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/138
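### `AP-ZIPLINE-235`: Default fill at the current bar's close understates live slippage and inflates backtest returns <sub>(high)</sub>
After a signal fires on a bar, Zipline's default slippage model fills at the close of that same bar (current bar close fill). In live trading the signal can only be filled near the open of the next bar (T+1 order execution). For A-share daily bars, backtesting at the close rather than at the next day's open overstates daily return by about 0.1-0.3% on average, and the annualized gap can exceed 30%. Explicitly configure the slippage model as VolumeShareSlippage or FixedSlippage and set a reasonable volume_limit.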
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/235
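### `AP-ZIPLINE-190`: A calendar start_session set to a non-trading day triggers DateOutOfBounds with no hint on how to fix it <sub>(medium)</sub>
When registering a bundle or running an algorithm, if the start_session parameter falls on a non-trading day (such as New Year's Day, 1998-01-01), Zipline's calendar validation raises DateOutOfBounds("cannot be earlier than the first session"). The error message only shows the calendar's first session and does not suggest changing the value to the first trading day. In the A-share case: with the SSE/SZSE calendar, a start_date that falls on the day after the last trading day before Spring Festival (a holiday) triggers the same kind of error, and debugging is very costly.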
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/190
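### `AP-ZIPLINE-181`: With a stale asset db, Pipeline reports "no assets traded" and sends users off to check the data range <sub>(high)</sub>
Zipline's asset database (SQLite) records the start/end trading dates of each stock. If an old Quandl or self-built bundle is used without re-ingesting, Pipeline raises "Failed to find any assets with country_code 'US' that traded between [dates]" when backtesting a newer date range. In the A-share case: if only price data is updated after re-downloading quotes and the asset db is not rebuilt, the date ranges of delisted/newly listed stocks are not updated, Pipeline filtering quietly excludes those stocks, and survivorship bias results.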
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/181
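### `AP-ZIPLINE-285`: week_start()/week_end() fail silently under custom (non-US) calendars <sub>(medium)</sub>
date_rules.week_start() and date_rules.week_end() in Zipline's schedule_function depend on the trading calendar's logic for finding the first/last day of the week. Under non-US calendars (such as ASX or SSE) that logic is incompatible with the offset computation designed for the NYSE calendar, so the schedule never fires or fires on the wrong date. In the A-share case: with the SSE calendar, in weeks containing long consecutive holidays such as Spring Festival, week_start may skip the entire holiday week without rebalancing, and users cannot detect the missed schedule from the logs.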
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/285
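### `AP-ZIPLINE-240`: Backtest dates must be in UTC; passing a naive datetime raises a deep AssertionError <sub>(medium)</sub>
Zipline internally requires every timestamp to be a UTC-aware datetime. When the user passes a naive datetime (no timezone information, such as pd.Timestamp('2020-01-01')), no error is raised at the entry point; instead an AssertionError: Algorithm should have a utc datetime is triggered deep in algorithm execution, where the stack depth makes it hard to locate. A-share developers importing data in local CST time hit this trap very easily; call tz_localize('UTC') explicitly when registering the bundle.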
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/240
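A small normalizing helper can apply the `tz_localize('UTC')` step once, at the boundary, before dates reach the backtest. This is a sketch using pandas only; `ensure_utc` is a hypothetical name, and whether a given Zipline version expects UTC-aware or naive sessions is taken here from the description above.

```python
import pandas as pd


def ensure_utc(ts) -> pd.Timestamp:
    """Return a UTC-aware Timestamp.

    Naive inputs are labeled as UTC (no clock shift); aware inputs are
    converted to UTC.
    """
    ts = pd.Timestamp(ts)
    if ts.tzinfo is None:
        return ts.tz_localize("UTC")
    return ts.tz_convert("UTC")


start = ensure_utc("2020-01-01")
end = ensure_utc(pd.Timestamp("2020-12-31", tz="UTC"))
```

Labeling a naive session date as UTC keeps the calendar date unchanged, which is usually what a daily backtest wants; converting from local CST instead would move midnight to the previous UTC day.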
## zvt (6)
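### `AP-ZVT-183`: An ex-rights factor of inf/NaN goes straight into multiplication, so price adjustment fails silently <sub>(high)</sub>
When ZVT computes the forward-adjustment (qfq) factor, it derives qfq_factor from the new/old price ratio. When old==0 (a new stock's first day, or missing data) the factor is inf; when kdata.open itself is None (a suspension day left unfilled) the multiplication raises TypeError. The result: adjustment for the entire entity is aborted and all subsequent bars are lost, but the main flow only logs ERROR and keeps running, so users often do not know that data for many stocks is corrupted.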
Source: https://github.com/zvtvz/zvt/issues/183
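One way to keep a bad factor from aborting a whole entity is to validate the inputs per bar and skip only the affected value. The sketch below is not ZVT's implementation; `safe_adjust` and its arguments are hypothetical names for the new/old price ratio described above.

```python
import math
from typing import Optional


def safe_adjust(
    price: Optional[float], new: Optional[float], old: Optional[float]
) -> Optional[float]:
    """Return price * (new / old), or None when no finite factor exists."""
    # Missing inputs (e.g. an unfilled suspension day) or a zero divisor
    # cannot produce a usable factor.
    if price is None or new is None or old is None or old == 0:
        return None
    factor = new / old
    if not math.isfinite(factor):  # catches NaN and inf inputs
        return None
    return price * factor


adjusted = [
    safe_adjust(10.0, 5.0, 10.0),  # normal case
    safe_adjust(10.0, 5.0, 0.0),   # old == 0: skipped instead of inf
    safe_adjust(None, 5.0, 10.0),  # missing open: skipped instead of TypeError
]
```

Callers can then count the `None` results per entity and raise or alert above a threshold, rather than relying on a log line.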
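### `AP-ZVT-179`: Exceptions are swallowed after a third-party data API hits its quota, leaving data silently missing <sub>(high)</sub>
When ZVT uses JoinQuant's jqdatasdk to bulk-fetch bars for the full market (4000+ stocks), it hits JoinQuant's daily maximum query count (error: 已超过每日最大查询数量, i.e. the daily maximum query count has been exceeded). ZVT catches the exception and continues with the next entity, so that day's data for every stock after the limit is hit goes silently missing. A backtest that uses this incomplete database produces systematically biased factor results, with no warning.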
Source: https://github.com/zvtvz/zvt/issues/179
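### `AP-ZVT-161`: Full-market batch factor computation on SQLite triggers a too many SQL variables error <sub>(medium)</sub>
When computing multi-stock factors such as VolumeUpMaFactor, ZVT puts every entity_id into the IN clause of a single SQL statement. Querying the full A-share market (5000+ stocks) at once exceeds SQLite's limit SQLITE_MAX_VARIABLE_NUMBER=999 (the default before SQLite 3.32.0). Raising max_allowed_packet (a MySQL parameter) has no effect; the root cause is SQLite's variable-count limit. The correct fix is to query in batches, but early ZVT versions did not handle this boundary.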
Source: https://github.com/zvtvz/zvt/issues/161
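The batched-query fix can be sketched with the standard library's `sqlite3` module. The table name `kdata`, its columns, and the helper names are hypothetical; the chunk size of 500 is simply a value below the 999-variable limit.

```python
import sqlite3
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_by_entity_ids(conn, entity_ids: Sequence[str], chunk_size: int = 500):
    """Query in batches so no statement exceeds the bound-variable limit."""
    rows = []
    for chunk in chunked(entity_ids, chunk_size):
        placeholders = ",".join("?" for _ in chunk)
        sql = f"SELECT entity_id, close FROM kdata WHERE entity_id IN ({placeholders})"
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows


# In-memory demo with more IDs than a single IN clause could take at 999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kdata (entity_id TEXT, close REAL)")
ids = [f"stock_sh_{600000 + i}" for i in range(1200)]
conn.executemany("INSERT INTO kdata VALUES (?, ?)", [(i, 10.0) for i in ids])
rows = fetch_by_entity_ids(conn, ids)
```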
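### `AP-ZVT-129`: Wildcard imports hide API changes between versions, and enums such as AdjustType vanish without explanation <sub>(medium)</sub>
ZVT's documentation examples import every symbol with `from zvt import *`. After a ZVT upgrade restructures enums (for example moving AdjustType into a submodule), the wildcard import no longer includes the symbol and an AttributeError results. Users assume an installation problem, when in fact a breaking API change between versions was not flagged in the CHANGELOG and the wildcard import obscured where the symbol came from. Import enum classes explicitly.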
Source: https://github.com/zvtvz/zvt/issues/129
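### `AP-ZVT-187`: The backtest engine does not stop early on an empty data-layer result, causing a cascade of null-reference crashes <sub>(medium)</sub>
When ZVT's Trader finds the data empty after load_data completes, it does not exit early; it passes the empty DataFrame into selector computation, which sets off a chain of NoneType failures. The stack trace is deep and the root cause hard to find, so users assume the strategy logic is at fault. The root cause is a misconfigured data time window (start/end outside the database's coverage) with no effective validation.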
Source: https://github.com/zvtvz/zvt/issues/187
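### `AP-ZVT-183B`: Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) bar table causes factor drift <sub>(high)</sub>
ZVT provides three separate tables: Stock1dKdata (unadjusted), Stock1dHfqKdata (backward-adjusted), and Stock1dQfqKdata (forward-adjusted). Users mix two tables when computing price momentum or moving-average factors (for example, unadjusted prices for the moving average and backward-adjusted prices for returns), which makes factor values jump around ex-rights dates. ZVT does not check adjustment-type consistency across tables, so mixing passes silently.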
Source: https://github.com/zvtvz/zvt/issues/183
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-106--pyfolio-reloaded
**Scan date**: 2026-04-22
**Stats**: {'total_files': 8, 'total_classes': 43, 'total_functions': 0, 'total_stages': 8}
## Modules (8)
- [data_preprocessing](components/data_preprocessing.md): 5 classes
- [position_analysis](components/position_analysis.md): 5 classes
- [transaction_analysis](components/transaction_analysis.md): 4 classes
- [round_trip_analysis](components/round_trip_analysis.md): 6 classes
- [performance_attribution](components/performance_attribution.md): 4 classes
- [returns_analysis](components/returns_analysis.md): 6 classes
- [capacity_analysis](components/capacity_analysis.md): 5 classes
- [report_generation](components/report_generation.md): 8 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 142
fatal_constraints_count: 50
non_fatal_constraints_count: 193
use_cases_count: 7
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
## Domain Constraints Injected (39)
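- **`SHARED-BT-LAB-001`** <sub>(fatal)</sub>: Lookahead bias: when simulating a trading decision at historical time t, do not use information that only becomes known after t. The most common forms: (1) computing a signal from the close and filling at that same day's close; (2) stamping an indicator computed after the close of day T on that same bar; (3) using the day's high/low as the fill assumption. Signal computation and fill time must be aligned: compute the signal after the close of day T and fill at the open of day T+1.
- **`SHARED-BT-LAB-002`** <sub>(high)</sub>: Indicator warmup period: rolling-window indicators are NaN for the first N bars, and those bars must not take part in signal computation or position decisions. The indicator's warmup_period is required to equal the longest lookback, and positions should be zero during warmup.
- **`SHARED-BT-LAB-003`** <sub>(fatal)</sub>: Time-series splits for ML/DL models must follow chronological order: TRAIN < VALID < TEST. Do not use random k-fold splits (they mix future data into the training set). Use TimeSeriesSplit or walk-forward validation.
- **`SHARED-BT-LAB-004`** <sub>(fatal)</sub>: Open/high/low fill assumptions: assuming in a daily-bar backtest that you can sell at each day's high or buy at its low (for example a momentum strategy that "takes profit at the high") is plain lookahead, because the intraday high/low is only confirmed after the close. The fill price may only be the open or the previous day's close (with slippage).
- **`SHARED-BT-LAB-005`** <sub>(high)</sub>: Off-by-one alignment: operations such as pandas rolling/shift easily introduce subtle one-period offset errors. Record the "observation time" of each series explicitly in code and verify key time-alignment relationships with assert.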
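- **`SHARED-BT-LAB-006`** <sub>(high)</sub>: Overfitting: the more backtests you run, the higher the probability of overfitting. Bailey et al. (2014) show that the expected maximum Sharpe ratio found across trials rises with the number of backtests even when the true Sharpe is zero, so the best in-sample result is inflated. Use walk-forward validation instead of exhaustive in-sample parameter search, and report the Deflated Sharpe Ratio (DSR) rather than the peak Sharpe.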
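- **`SHARED-BT-SURV-001`** <sub>(fatal)</sub>: Survivorship bias: using the current market constituents as the historical backtest universe omits stocks that once existed but were later delisted, removed, or merged, and systematically overstates historical strategy returns. The backtest universe must use point-in-time snapshots (point-in-time universe).
- **`SHARED-BT-SURV-002`** <sub>(high)</sub>: In-sample / out-of-sample split: strategy development and parameter selection must be completed in-sample. Out-of-sample data is for final validation only; do not "look" at out-of-sample data repeatedly and keep tuning (that turns out-of-sample into new in-sample and repeats the overfitting).
- **`SHARED-BT-SURV-003`** <sub>(high)</sub>: Fill strategy for suspensions/missing data: do not simply forward-fill suspension-day prices with the previous close, because that lets "zero-volume" days take part in factor computation and signal generation in the backtest. Filter out missing trading days explicitly at the factor computation layer and do not fill.
- **`SHARED-BT-SURV-004`** <sub>(high)</sub>: Extreme value contamination: raw market data may contain data-source errors (such as ex-rights events not adjusted in time, or extreme prices from manual entry errors). Feeding it into factor computation uncleaned produces extreme signals that contaminate the whole cross-section. Filter 3-sigma outliers at the pipeline entry.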
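- **`SHARED-BT-COST-001`** <sub>(fatal)</sub>: Transaction costs (commission + stamp duty/transfer tax + transfer fee) must be configured at backtest initialization; a zero-cost default is not allowed. Performance metrics from a backtest that ignores costs are deceptive, especially for high-turnover strategies (round-trip costs often consume 50%+ of gross return).
- **`SHARED-BT-COST-002`** <sub>(high)</sub>: Slippage modeling: a backtest without slippage assumes every order fills at the ideal price, and high-frequency strategies then lose heavily in live trading as fill prices deteriorate. Configure at least a fixed spread or proportional slippage; large orders should use a volume-share model (for example, no more than 5% of daily volume).
- **`SHARED-BT-COST-003`** <sub>(high)</sub>: Turnover must be shown in the backtest performance report and analyzed together with costs. When monthly turnover exceeds 50% (600%+ annualized), net return is extremely sensitive to cost assumptions; every 10bps change in cost may flip the strategy's profit/loss conclusion, so a cost sensitivity analysis is required.
- **`SHARED-BT-COST-004`** <sub>(medium)</sub>: Position sizing must respect capital constraints: the backtest should simulate the actual number of shares held (rounded) under a fixed amount of capital rather than assume fractional shares. For small-cap stocks, the minimum trading unit (A-shares: 100 shares per lot) makes the achievable position deviate from the target weight, and the backtest should simulate this rounding effect.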
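- **`SHARED-BT-TIME-001`** <sub>(high)</sub>: Unify timestamp time zones: when merging multiple data sources, mixing UTC and local time is a common source of data corruption. Convert all timestamps to one time zone at the pipeline entry (UTC for storage and the market's local time zone for display are recommended); do not mix time zones midway through the pipeline.
- **`SHARED-BT-TIME-002`** <sub>(high)</sub>: Trading calendar alignment: when merging data from different markets or frequencies (such as daily prices + weekly factors), reindex/merge against an explicit trading calendar. Do not outer join and then fillna, which creates spurious rows on non-trading days (holidays).
- **`SHARED-BT-TIME-003`** <sub>(high)</sub>: Incremental update boundary check: when updating historical data incrementally, query the database for the latest stored date and download only data after that date. Re-downloading and appending existing data produces rows with duplicate timestamps and corrupts the time ordering in backtests. Verify before and after each update that there are no duplicates (index.duplicated().any() == False).
- **`SHARED-BT-TIME-004`** <sub>(medium)</sub>: Distorted performance attribution: a poorly chosen benchmark distorts the Alpha/Beta calculation. Choose a passive benchmark the strategy could actually invest in (such as an HS300 ETF) rather than a price index that cannot be invested in directly (such as the HS300 index). A price index excludes dividend reinvestment and understates the benchmark return of holding the constituents.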
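- **`SHARED-BT-PERF-001`** <sub>(medium)</sub>: Max drawdown must be computed from the net value series (portfolio value), not from a cumulative return series. Summing log returns misstates the drawdown depth, because a running sum of log returns is not a net value and log returns diverge from simple returns as losses grow.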
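- **`SHARED-BT-PERF-002`** <sub>(medium)</sub>: Sharpe ratio annualization convention: annualized Sharpe = daily Sharpe × sqrt(252) (equities, 252 trading days) or × sqrt(365) (crypto, 365 days). Defaults differ between systems; confirm the annualization factor before comparing across systems, otherwise the Sharpe values are not comparable.
- **`SHARED-BT-PERF-003`** <sub>(medium)</sub>: Calmar Ratio / Sortino Ratio are better risk-adjusted return measures than the Sharpe Ratio: Sharpe assumes normally distributed returns, while return distributions in A-share/crypto markets are markedly left-skewed and fat-tailed, so Sharpe understates downside risk. Quantitative evaluation should report Sortino (downside volatility only) and Calmar (annualized return / max drawdown) as well, and should not rely on Sharpe alone.
- **`SHARED-BT-PERF-004`** <sub>(medium)</sub>: Backtest performance attribution should decompose returns into alpha (active return), beta (market return), factor exposure return (style/sector), and idiosyncratic return (stock selection). A backtest without attribution cannot distinguish "a good strategy" from "a favorable market in which the beta happened to be right".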
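- **`SHARED-FR-IC-001`** <sub>(high)</sub>: IC (information coefficient) is the core measure of a factor's predictive power, defined as the Spearman rank correlation between factor values and next-period returns (ICIR = mean(IC) / std(IC)). |IC| > 0.05 is taken as preliminary evidence of predictive power, and ICIR > 0.5 as stable. Reporting backtest performance without computing IC is a typical case of missing evidence for factor validity.
- **`SHARED-FR-IC-002`** <sub>(high)</sub>: IC decay analysis: a factor's predictive power usually decays as the holding period lengthens. Compute the IC series at 1/5/10/20 days to identify the factor's optimal holding period. A factor whose IC is high at 1 day but decays quickly by 20 days is a short-term factor and does not suit a monthly rebalancing strategy, and vice versa. Using the wrong holding period seriously hurts the factor's live performance.
- **`SHARED-FR-IC-003`** <sub>(high)</sub>: Harvey, Liu & Zhu (2016) warn that academia has found 300+ "significant" factors, many of which are false discoveries under multiple testing. Factor validity requires one of: t-stat > 3.0 (rather than the traditional 1.96); independent replication in different periods/markets; or a clear economic rationale. Factors that meet none of these conditions are very likely data-mining artifacts.
- **`SHARED-FR-IC-004`** <sub>(high)</sub>: Factor turnover control: for a factor with high IC but high turnover, net IC after transaction costs may be negative. Compute the turnover-adjusted effective IC: net_IC = IC - turnover × cost_per_turn. Target turnover ≤ 50% (monthly).
- **`SHARED-FR-IC-005`** <sub>(medium)</sub>: Factor half-life is a core parameter of signal strength and directly determines the optimal rebalancing frequency. Half-life < 5 days: rebalance daily or weekly; 5-20 days: weekly or biweekly; > 20 days: monthly. Rebalancing a short-term factor monthly by mistake lets much of the alpha dissipate during the holding period.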
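- **`SHARED-FR-NEUT-001`** <sub>(high)</sub>: Industry neutralization: if factor values are not neutralized against industry means, industry rotation returns get mixed into factor returns, and it becomes hard to tell whether the factor itself or the industry exposure drove the return. The operation: factor_neutral = factor - industry_mean(factor).
- **`SHARED-FR-NEUT-002`** <sub>(high)</sub>: Market cap neutralization: the small-cap effect (small caps outperforming large caps) is one of the most persistent anomalies in financial history and contaminates almost every un-neutralized factor. If a factor is highly correlated with market cap, stock selection tilts systematically toward small caps and the return comes from size exposure rather than the factor. Neutralize for both industry and market cap (Fama-MacBeth regression or the residual method).
- **`SHARED-FR-NEUT-003`** <sub>(high)</sub>: Outlier handling (Winsorize/MAD): raw factor values usually contain extreme values, which distort quantile analysis (such as Q1/Q10 deciles). Winsorize raw factor values (clip to [1%, 99%] or 3-sigma) or apply MAD (median absolute deviation) clipping before ranking/neutralizing.
- **`SHARED-FR-NEUT-004`** <sub>(medium)</sub>: Factor orthogonalization: when several factors are combined into a composite score, combining highly correlated factors is equivalent to overweighting a single factor and dilutes signal diversity. Before combining, apply Gram-Schmidt orthogonalization or PCA to remove multicollinearity between factors.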
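- **`SHARED-FR-NEUT-005`** <sub>(medium)</sub>: Missing data fill strategy: NaN values in factor computation (suspension/new listing/data gaps) introduce lookahead bias if filled with a mean that includes later observations (such a mean carries future information), and introduce survivorship bias if dropped entirely. The correct approach is to fill with the cross-sectional median (the median over all stocks on that day, which does not depend on the future) or to exclude the stock for that day.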
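- **`SHARED-FR-PORT-001`** <sub>(high)</sub>: Quantile analysis: factor evaluation should use the long-short return spread (top minus bottom spread) between Q1/Q5 (quintile) or Q1/Q10 (decile) groups as the primary metric, rather than plain long-only return. The "monotonicity" test across groups, long Q1 and short Q5, is core evidence of factor validity: monotonically increasing/decreasing > non-monotonic >> effective on the long side only.
- **`SHARED-FR-PORT-002`** <sub>(medium)</sub>: Alpha decay test: the stability of a factor's monthly IC across regimes (bull/bear/range-bound markets) is important evidence of factor robustness. A factor whose IC holds only in one particular market state is not suited to all-weather deployment; show the IC time series in segments (rolling 12M) to identify periods when the factor fails.
- **`SHARED-FR-PORT-003`** <sub>(medium)</sub>: Turnover-aware selection: for stocks ranked near the middle (49th-51st percentile), small rank fluctuations trigger rebalancing and generate a lot of wasted transaction cost. Set a buffer zone at selection time: rebalance only when the rank change exceeds a threshold.
- **`SHARED-FR-PORT-004`** <sub>(medium)</sub>: Statistical significance of group returns (bootstrap test): a factor's quantile return spread (Q1-Q5 spread), however large on historical data, may be due to chance and needs a bootstrap or t-test for significance (p-value < 0.05). Quantile returns from short backtest periods (< 3 years) are especially unreliable.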
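- **`SHARED-FR-XFER-001`** <sub>(high)</sub>: Cross-market portability check: a factor that works in one market does not necessarily work in another. Applying US equity factors directly to A-shares, or equity factors to futures/crypto, requires independent IC validation; cross-market generality cannot be assumed. Anomalies specific to A-shares (such as the reversal effect and ST price anomalies) do not exist in US equities.
- **`SHARED-FR-XFER-002`** <sub>(medium)</sub>: Stability of factor validity over time: factors that once worked gradually stop working as the market learns and arbitrage sets in (McLean & Pontiff 2016 show that factors decay 58% on average after publication). Re-evaluate factor IC regularly (every quarter/year) and promptly replace or down-weight factors that have stopped working.
- **`SHARED-FR-XFER-003`** <sub>(medium)</sub>: Interaction between factors and the macroeconomic environment: the rate cycle, economic cycle, and market sentiment significantly affect factor validity. Value factors (low P/B) work better when rates are rising; momentum factors work better in trending markets and fail in range-bound ones. Before deploying a factor, assess how well the current macro environment matches the environment in which the factor performs best.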
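
The IC and ICIR definitions in `SHARED-FR-IC-001` can be sketched with pandas and NumPy. The layout (rows are dates, columns are stocks), the synthetic data, and the helper names `daily_ic` and `icir` are assumptions for illustration; `fwd_ret` must already hold the next-period return aligned to each factor date.

```python
import numpy as np
import pandas as pd


def daily_ic(factor: pd.DataFrame, fwd_ret: pd.DataFrame) -> pd.Series:
    """Cross-sectional Spearman IC per date (rows = dates, columns = stocks)."""
    # Use only stocks that have both a factor value and a forward return.
    valid = factor.notna() & fwd_ret.notna()
    # Spearman correlation equals the Pearson correlation of the ranks.
    f_rank = factor.where(valid).rank(axis=1)
    r_rank = fwd_ret.where(valid).rank(axis=1)
    return f_rank.corrwith(r_rank, axis=1)


def icir(ic: pd.Series) -> float:
    """ICIR = mean(IC) / std(IC) over the IC time series."""
    return float(ic.mean() / ic.std())


# Synthetic example: the forward return is partly driven by the factor.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=60, freq="B")
stocks = [f"s{i:03d}" for i in range(50)]
factor = pd.DataFrame(rng.normal(size=(60, 50)), index=dates, columns=stocks)
noise = pd.DataFrame(rng.normal(size=(60, 50)), index=dates, columns=stocks)
fwd_ret = 0.5 * factor + noise

ic = daily_ic(factor, fwd_ret)
print(round(ic.mean(), 3), round(icir(ic), 3))
```

Repeating the calculation with forward returns at 1/5/10/20 days gives the decay profile described in `SHARED-FR-IC-002`.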
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
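A minimal pandas sketch of next-bar execution: the signal is computed from bar t's close and acted on at bar t+1 via a one-bar shift. The prices and the two-bar moving-average rule are made up for illustration and are not part of any ZVT API.

```python
import pandas as pd

close = pd.Series(
    [10.0, 10.5, 10.2, 10.8, 11.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

# The signal on bar t uses bar t's close, so it is known only after t closes.
signal = close > close.rolling(2).mean()

# Shift by one bar: the position held on bar t reflects the signal from t-1.
position = signal.shift(1, fill_value=False)
```

Without the shift, the backtest would enter on the same bar whose close produced the signal, which is the look-ahead this lock forbids.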
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
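The exactly-one rule can be enforced at construction time. The dataclass below is a hypothetical stand-in that borrows only the three field names from the lock; it is not ZVT's `TradingSignal` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalSizing:
    """Hypothetical container for the three mutually exclusive sizing fields."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[int] = None

    def __post_init__(self) -> None:
        fields = {
            "position_pct": self.position_pct,
            "order_money": self.order_money,
            "order_amount": self.order_amount,
        }
        provided = [name for name, value in fields.items() if value is not None]
        if len(provided) != 1:
            raise ValueError(
                f"exactly one sizing field is required, got {provided or 'none'}"
            )


ok = SignalSizing(position_pct=0.2)  # valid: one field set
```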
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **7**
## `KUC-101`
**Source**: `docs/deploy.py`
Automates the build and deployment of Sphinx-generated documentation for the pyfolio library, ensuring consistent documentation deployment across environments.
## `KUC-102`
**Source**: `docs/source/conf.py`
Configures Sphinx documentation build settings including theme, extensions, and project metadata for generating pyfolio library documentation.
## `KUC-103`
**Source**: `src/pyfolio/examples/round_trip_tear_sheet_example.ipynb`
Analyzes individual round trip trades (entry/exit) in a portfolio, computing profitability metrics by trade and sector to understand trading efficiency.
## `KUC-104`
**Source**: `src/pyfolio/examples/sector_mappings_example.ipynb`
Generates position and round trip tear sheets with sector-based profitability analysis, allowing comparison of trading performance across industry sectors.
## `KUC-105`
**Source**: `src/pyfolio/examples/single_stock_example.ipynb`
Downloads historical price data for a single stock and generates a returns tear sheet with in-sample/out-of-sample comparison to evaluate stock performance.
## `KUC-106`
**Source**: `src/pyfolio/examples/slippage_example.ipynb`
Evaluates the impact of transaction slippage on portfolio performance by generating a full tear sheet with slippage modeling, helping understand trading cost implications.
## `KUC-107`
**Source**: `src/pyfolio/examples/zipline_algo_example.ipynb`
Implements and backtests the On-Line Portfolio Moving Average Reversion (OLMAR) algorithm using Zipline, demonstrating mean-reversion portfolio management across multiple stocks.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
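## `CW-BT-001`: Cerebro unified orchestration engine
**From**: backtrader · **Applicable to**: backtesting
backtrader uses Cerebro as its single entry point to manage the lifecycle of data feeds, strategies, analyzers, and observers, and supports running multiple strategies plus multiple data sources in one cerebro.run(). zvt's StockTrader currently binds only one set of factors per instantiation and lacks a unified orchestration layer for multi-strategy combinations; borrowing the Cerebro pattern would let users combine several Trader instances in one runner for side-by-side evaluation.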
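## `CW-BT-002`: Pluggable performance evaluation with Analyzers
**From**: backtrader · **Applicable to**: backtesting
backtrader provides plug-and-play Analyzers such as SharpeRatio, DrawDown, TimeReturn, and TradeAnalyzer, which attach any performance metric without modifying strategy code. zvt's performance evaluation is currently weak and has no standardized Analyzer interface; borrowing this pattern would give users a risk-adjusted return report from a single cerebro.addanalyzer(SharpeRatio) call.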
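## `CW-BT-003`: Sizer: separating position sizing
**From**: backtrader · **Applicable to**: backtesting
backtrader abstracts position sizing (how many shares or what proportion to buy on each entry) into a separate Sizer, fully decoupled from signal logic; FixedSize, PercentSizer, and others are built in, and users can define their own. zvt currently has no explicit Sizer concept, and position control logic is scattered across hooks such as Trader.on_profit_control; introducing a Sizer interface would let strategy signals and money management rules evolve independently and be reused in combination.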
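## `CW-BT-004`: Full set of Order types (Limit/Stop/OCO/Bracket)
**From**: backtrader · **Applicable to**: backtesting
backtrader supports a rich set of order types, including Market, Limit, Stop, StopLimit, OCO (one-cancels-other), and Bracket (a paired take-profit and stop-loss order), and simulates fill slippage and commission schemes. zvt backtests currently support mainly market-price fills and lack simulation of limit orders and combined orders; for high-frequency or live-trading integration scenarios, fuller order types would make backtests considerably more realistic.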
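## `CW-BT-005`: Data resampling and replaying (Resampling & Replaying)
**From**: backtrader · **Applicable to**: backtesting
backtrader can resample lower-timeframe data (such as 1 min) into a higher timeframe (such as 1 day) on the fly and drive the strategy in sync, or replay tick by tick to simulate how the OHLC bar forms, enabling fine-grained intraday backtests. zvt currently handles multiple timeframes by pre-recording bars at each level and lacks dynamic resampling at run time; borrowing this pattern would support backtests over any combination of time granularities without recording data twice.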
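## `CW-VN-003` — CTA backtesting engine with built-in visualization
**From**: vnpy · **Applicable to**: backtesting
vnpy's cta_backtester provides a graphical interface that directly shows the strategy equity curve, maximum drawdown, daily P&L, and trade details, with no Jupyter Notebook required. zvt currently visualizes backtest results by calling the draw_result method, which uses Plotly, but it has no unified backtest report page; adopting this pattern would make it possible to package an out-of-the-box strategy performance dashboard.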
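## `CW-VN-004` — vnpy.alpha ML factor research lab (Lab)
**From**: vnpy · **Applicable to**: factor-research
vnpy 4.0's vnpy.alpha.lab provides an integrated workflow covering data management, model training, signal generation, and strategy backtesting, with a standardized training interface and visual comparison for algorithms such as Lasso, LightGBM, and MLP. zvt's ML capability currently has only one entry point, MaStockMLMachine, and lacks a standardized Lab framework; adopting the Lab pattern would establish a standard "feature engineering → training → signal → backtest" pipeline and lower the barrier to ML experiments.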
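## `CW-QL-001` — Point-in-Time database (prevents future-data leakage)
**From**: qlib · **Applicable to**: backtesting
qlib's Point-in-Time Provider guarantees that a query at a given time t returns only data that was actually known at time t (financial-report publication delays and revision history are handled correctly), eliminating look-ahead bias in backtests. zvt currently timestamps financial data by reporting period and has no "publication date" dimension, so stock selection may be biased by financial-report data that was not yet available; introducing the PIT pattern would substantially improve backtest credibility.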
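The point-in-time idea can be illustrated with a minimal sketch. The `Filing` record and the `as_of` helper below are hypothetical names invented for this example; they are not qlib's or zvt's API. The sketch only assumes that each figure carries both a reporting period and a publication date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Filing:
    report_period: date  # fiscal period the figure describes
    publish_date: date   # date the figure became publicly known
    value: float


def as_of(filings, query_date):
    """Return the latest-period filing already published on query_date, or None."""
    known = [f for f in filings if f.publish_date <= query_date]
    if not known:
        return None
    # Prefer the most recent period; among revisions of it, the latest one published.
    return max(known, key=lambda f: (f.report_period, f.publish_date))


filings = [
    Filing(date(2023, 12, 31), date(2024, 3, 20), 1.0),
    Filing(date(2024, 3, 31), date(2024, 4, 25), 2.0),
    Filing(date(2023, 12, 31), date(2024, 5, 10), 1.1),  # later revision
]
snapshot = as_of(filings, date(2024, 4, 1))
```

Filtering on `publish_date` rather than `report_period` is what keeps a backtest on 2024-04-01 from seeing the first-quarter figure that was only published on 2024-04-25.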
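## `CW-QL-002` — Recorder + Experiment tracking (MLflow style)
**From**: qlib · **Applicable to**: factor-research
qlib's workflow module provides Experiment and Recorder, which automatically record the hyperparameters, features, metrics, and predictions of each model training run, and support cross-experiment comparison and model version management. zvt currently has no ML experiment tracking, and each rerun overwrites the previous results; adopting the Recorder pattern would persist the parameters and results of each factor experiment, supporting quick reproduction and version comparison.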
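## `CW-QL-003` — Nested Decision Framework (multi-level nested decision execution)
**From**: qlib · **Applicable to**: backtesting
qlib supports nesting a high-frequency execution layer (minute-level order splitting) inside a low-frequency decision layer (daily portfolio rebalancing); the two layers are optimized independently and can run together, enabling intraday optimal-execution algorithms such as TWAP or VWAP rebalancing. zvt backtests currently assume only daily-bar fills and do not model execution algorithms; adopting the nested framework would let zvt separate two questions: which stocks to hold and when, and how to build the position with minimal market impact.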
FILE:references/components/capacity_analysis.md
# capacity_analysis (5 classes)
## `days_to_liquidate_positions`
`capacity_analysis/days-to-liquidate-positions.py:0`
## `get_low_liquidity_transactions`
`capacity_analysis/get-low-liquidity-transactions.py:0`
## `apply_slippage_penalty`
`capacity_analysis/apply-slippage-penalty.py:0`
## `volume_consumption_limit`
`capacity_analysis/volume-consumption-limit.py:0`
## `slippage_impact_coefficient`
`capacity_analysis/slippage-impact-coefficient.py:0`
FILE:references/components/data_preprocessing.md
# data_preprocessing (5 classes)
## `check_intraday`
`data_preprocessing/check-intraday.py:0`
## `estimate_intraday`
`data_preprocessing/estimate-intraday.py:0`
## `adjust_returns_for_slippage`
`data_preprocessing/adjust-returns-for-slippage.py:0`
## `intraday_estimation_strategy`
`data_preprocessing/intraday-estimation-strategy.py:0`
## `slippage_model`
`data_preprocessing/slippage-model.py:0`
FILE:references/components/performance_attribution.md
# performance_attribution (4 classes)
## `perf_attrib`
`performance_attribution/perf-attrib.py:0`
## `compute_exposures`
`performance_attribution/compute-exposures.py:0`
## `_align_and_warn`
`performance_attribution/align-and-warn.py:0`
## `factor_model`
`performance_attribution/factor-model.py:0`
FILE:references/components/position_analysis.md
# position_analysis (5 classes)
## `get_percent_alloc`
`position_analysis/get-percent-alloc.py:0`
## `get_top_long_short_abs`
`position_analysis/get-top-long-short-abs.py:0`
## `get_sector_exposures`
`position_analysis/get-sector-exposures.py:0`
## `get_max_median_position_concentration`
`position_analysis/get-max-median-position-concentration.py:0`
## `sector_mapping_source`
`position_analysis/sector-mapping-source.py:0`
FILE:references/components/report_generation.md
# report_generation (8 classes)
## `create_full_tear_sheet`
`report_generation/create-full-tear-sheet.py:0`
## `create_simple_tear_sheet`
`report_generation/create-simple-tear-sheet.py:0`
## `create_returns_tear_sheet`
`report_generation/create-returns-tear-sheet.py:0`
## `create_position_tear_sheet`
`report_generation/create-position-tear-sheet.py:0`
## `create_round_trip_tear_sheet`
`report_generation/create-round-trip-tear-sheet.py:0`
## `create_capacity_tear_sheet`
`report_generation/create-capacity-tear-sheet.py:0`
## `report_format`
`report_generation/report-format.py:0`
## `tear_sheet_subset`
`report_generation/tear-sheet-subset.py:0`
FILE:references/components/returns_analysis.md
# returns_analysis (6 classes)
## `get_max_drawdown`
`returns_analysis/get-max-drawdown.py:0`
## `gen_drawdown_table`
`returns_analysis/gen-drawdown-table.py:0`
## `rolling_beta`
`returns_analysis/rolling-beta.py:0`
## `forecast_cone_bootstrap`
`returns_analysis/forecast-cone-bootstrap.py:0`
## `rolling_window_size`
`returns_analysis/rolling-window-size.py:0`
## `cone_method`
`returns_analysis/cone-method.py:0`
FILE:references/components/round_trip_analysis.md
# round_trip_analysis (6 classes)
## `extract_round_trips`
`round_trip_analysis/extract-round-trips.py:0`
## `_groupby_consecutive`
`round_trip_analysis/groupby-consecutive.py:0`
## `add_closing_transactions`
`round_trip_analysis/add-closing-transactions.py:0`
## `gen_round_trip_stats`
`round_trip_analysis/gen-round-trip-stats.py:0`
## `round_trip_grouping_window`
`round_trip_analysis/round-trip-grouping-window.py:0`
## `matching_algorithm`
`round_trip_analysis/matching-algorithm.py:0`
FILE:references/components/transaction_analysis.md
# transaction_analysis (4 classes)
## `get_txn_vol`
`transaction_analysis/get-txn-vol.py:0`
## `get_turnover`
`transaction_analysis/get-turnover.py:0`
## `plot_txn_time_hist`
`transaction_analysis/plot-txn-time-hist.py:0`
## `turnover_denominator`
`transaction_analysis/turnover-denominator.py:0`
Prices European options and computes Greeks using the BSM and Black models, with support for continuous dividend yield adjustment.
---
name: py-vollib-options-pricing
description: |-
Prices European options and computes Greeks using the BSM and Black models, with support for continuous dividend yield adjustment.
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-127"
compiled_at: "2026-04-22T13:01:03.585888+00:00"
capability_markets: "global"
capability_activities: "derivatives-pricing"
sop_version: "crystal-compilation-v6.1"
---
# BSM Options Pricing (py-vollib-options-pricing)
> Prices European options and computes Greeks using the BSM and Black models, with support for continuous dividend yield adjustment.
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (1 total)
### Sphinx Documentation Configuration for py_vollib (`UC-101`)
Configures automated documentation generation for the py_vollib options pricing library, enabling consistent API documentation, code examples, and coverage reporting for developers.
**Triggers**: documentation, sphinx, api docs
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-DERIVATIVES-PRICING-001`**: Instrument NPV called without attached pricing engine
- **`AP-DERIVATIVES-PRICING-002`**: BSM forward price ignores dividend yield
- **`AP-DERIVATIVES-PRICING-003`**: Negative discount factors passed to log-domain interpolation
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-127. Evidence verify ratio = 54.8% and audit fail total = 10. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
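| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source (source-of-truth) | Required reading when behavior or a decision is in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | Full set of KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest or trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |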
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-127` blueprint at 2026-04-22T13:01:03.585888+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Sphinx Documentation Configuration for py_vollib', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder', 'Institutional fund holdings tracker via joinquant_fund_runner pattern', 'Custom Transformer + Accumulator factor with per-entity rolling state']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## FinancePy (finance-bp-101) (3)
### `AP-DERIVATIVES-PRICING-003` — Negative discount factors passed to log-domain interpolation <sub>(high)</sub>
When Numba-jitted interpolation functions perform log transformation on discount factors, negative or zero values cause domain errors. This occurs because log(-x) and log(0) are mathematically undefined. The consequence is runtime crashes in jitted functions and complete failure of discount curve interpolation, blocking all downstream pricing calculations.
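A minimal sketch of guarding the log domain before interpolating, in plain Python rather than Numba. The function name `log_linear_discount` is hypothetical and is not FinancePy's API; it assumes the time grid is already strictly increasing.

```python
import math


def log_linear_discount(t, times, dfs):
    """Interpolate a discount factor at time t, linearly in log(df)."""
    if len(times) != len(dfs) or len(times) < 2:
        raise ValueError("need at least two (time, df) pairs of equal length")
    # Guard the log domain up front: log(df) is undefined for df <= 0.
    for df in dfs:
        if not (math.isfinite(df) and df > 0.0):
            raise ValueError(f"discount factor must be positive and finite, got {df!r}")
    if not (times[0] <= t <= times[-1]):
        raise ValueError("t lies outside the curve's time range")
    for i in range(1, len(times)):
        if t <= times[i]:
            w = (t - times[i - 1]) / (times[i] - times[i - 1])
            log_df = (1.0 - w) * math.log(dfs[i - 1]) + w * math.log(dfs[i])
            return math.exp(log_df)


mid_df = log_linear_discount(0.5, [0.0, 1.0], [1.0, 0.81])
```

Validating every discount factor before the first `math.log` call turns a domain error deep inside the interpolation into an early, descriptive `ValueError`.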
### `AP-DERIVATIVES-PRICING-004` — Non-monotonic time points in discount curve interpolation <sub>(high)</sub>
Interpolation over non-monotonically increasing time points produces undefined behavior at crossing times, causing discount factors to be incorrectly computed where time values overlap. This corrupts the entire term structure because the bootstrap algorithm cannot determine which discount factor corresponds to which maturity. The consequence is incorrect present value calculations across all downstream products priced against the curve.
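One way to enforce this before any interpolation is a small precondition check. The helper name `require_strictly_increasing` is hypothetical, offered only as a sketch of the idea.

```python
def require_strictly_increasing(times):
    """Return times unchanged, or raise if any point fails to exceed its predecessor."""
    for i in range(1, len(times)):
        if not (times[i] > times[i - 1]):
            raise ValueError(
                f"times[{i}]={times[i]!r} is not greater than "
                f"times[{i - 1}]={times[i - 1]!r}"
            )
    return times


grid = require_strictly_increasing([0.0, 0.25, 0.5, 1.0])
```

Writing the comparison as `not (a > b)` also rejects NaN entries, which would slip through a check written as `a <= b`.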
### `AP-DERIVATIVES-PRICING-005` — Bootstrap calibration instruments not in maturity order <sub>(high)</sub>
When building yield curves from market instruments (deposits, FRAs, swaps), the instruments must be provided in strictly increasing maturity order. Out-of-order instruments cause the bootstrap algorithm to solve for discount factors at incorrect time points, corrupting the entire term structure. The consequence is wrong forward rates and discount factors that propagate into all priced instruments.
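As a sketch of one possible defence, calibration inputs can be normalised into maturity order before they reach a bootstrapper. The `(maturity_in_years, label)` pair layout and the name `order_by_maturity` are assumptions made for this example, not part of FinancePy or QuantLib. Sorting automatically is a design choice; a stricter caller could reject unordered input instead.

```python
def order_by_maturity(instruments):
    """Sort (maturity_in_years, label) pairs and reject duplicate maturities."""
    ordered = sorted(instruments, key=lambda pair: pair[0])
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[0] == prev[0]:
            raise ValueError(f"duplicate maturity {cur[0]!r}: {prev[1]!r} and {cur[1]!r}")
    return ordered


curve_inputs = order_by_maturity([(2.0, "swap"), (0.25, "deposit"), (0.5, "fra")])
```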
## QuantLib-SWIG (finance-bp-123) (4)
### `AP-DERIVATIVES-PRICING-001` — Instrument NPV called without attached pricing engine <sub>(high)</sub>
Calling NPV() on a derivatives instrument without first calling setPricingEngine() returns uninitialized garbage values or throws null pointer exceptions. This occurs because the Instrument class relies on the attached PricingEngine to perform actual valuation logic. The consequence is silently incorrect pricing results that appear valid, potentially leading to bad trading decisions.
### `AP-DERIVATIVES-PRICING-006` — Option Exercise type mismatches VanillaOption constructor <sub>(high)</sub>
VanillaOption requires both a StrikedTypePayoff and a matching Exercise object. Using the wrong Exercise type (e.g., AmericanExercise for a European option) causes compilation failures in C++ or runtime errors in SWIG bindings. The consequence is that the pricing system cannot initialize options, blocking all option pricing workflows.
### `AP-DERIVATIVES-PRICING-013` — Evaluation date not set before QuantLib term structure construction <sub>(medium)</sub>
QuantLib requires ql.Settings.instance().evaluationDate to be set before constructing yield term structures and instruments. Without an explicit evaluation date, the curve reference date becomes undefined, causing date calculations to fail or produce incorrect settlement dates. The consequence is wrong discount factors and NPV calculations across the entire portfolio.
### `AP-DERIVATIVES-PRICING-014` — Market quotes passed without QuoteHandle wrapper <sub>(medium)</sub>
QuantLib's observer pattern requires all market quotes to be wrapped in QuoteHandle before passing to rate helpers. Raw quote values bypass the observable notification mechanism, causing dependent instruments to never recalculate when market data updates. The consequence is stale pricing that doesn't reflect current market conditions.
## arch (finance-bp-124) (2)
### `AP-DERIVATIVES-PRICING-007` — NaN/inf values in ARCH model input data <sub>(high)</sub>
ARCH model estimation relies on recursive variance computations and scipy optimize. Non-finite input values (NaN, inf) cause optimizers to produce NaN results and recursive variance calculations to fail. The consequence is complete model estimation failure with meaningless outputs that appear valid, leading to incorrect volatility forecasts and risk misestimation.
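A minimal pre-estimation check, sketched with the standard library only. `require_finite` is a hypothetical helper, not part of the arch package.

```python
import math


def require_finite(returns):
    """Return the series as a list, or raise if it contains NaN or infinity."""
    bad = [i for i, x in enumerate(returns) if not math.isfinite(x)]
    if bad:
        raise ValueError(f"{len(bad)} non-finite value(s), first at position {bad[0]}")
    return list(returns)


clean = require_finite([0.01, -0.02, 0.005])
```

Running this before fitting makes the failure visible at the data boundary instead of surfacing later as NaN parameter estimates.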
### `AP-DERIVATIVES-PRICING-008` — ARCH parameter array concatenation in wrong order <sub>(high)</sub>
ARCHModel composes from three components (mean, volatility, distribution) and requires parameter arrays concatenated in fixed order: [mean_params, volatility_params, distribution_params]. Incorrect ordering causes _parse_parameters to assign wrong values to wrong components, producing mathematically invalid models (e.g., volatility parameters interpreted as distribution parameters). The consequence is invalid conditional variance forecasts.
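The fixed ordering can be made explicit with offsets into the flat vector. The sketch below is illustrative only: `split_params` and its arguments are hypothetical and do not reproduce arch's `_parse_parameters`.

```python
def split_params(params, n_mean, n_vol, n_dist):
    """Partition a flat vector laid out as [mean | volatility | distribution]."""
    offsets = [0, n_mean, n_mean + n_vol, n_mean + n_vol + n_dist]
    if len(params) != offsets[-1]:
        raise ValueError(f"expected {offsets[-1]} parameters, got {len(params)}")
    return {
        "mean": list(params[offsets[0]:offsets[1]]),
        "volatility": list(params[offsets[1]:offsets[2]]),
        "distribution": list(params[offsets[2]:offsets[3]]),
    }


parts = split_params([0.05, 0.01, 0.1, 0.85, 8.0], n_mean=1, n_vol=3, n_dist=1)
```

Because the slices are driven purely by position, swapping two blocks in the input silently reassigns their values, which is why the length check alone cannot catch an ordering mistake.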
## py_vollib (finance-bp-127) (6)
### `AP-DERIVATIVES-PRICING-002` — BSM forward price ignores dividend yield <sub>(high)</sub>
When calculating option prices on dividend-paying stocks using BSM, the forward price must be adjusted as F = S * exp((r-q)*t). Omitting the dividend yield adjustment (using F = S * exp(r*t)) causes systematic mispricing for all dividend-paying assets. The consequence is consistently wrong option prices that diverge from market prices, leading to arbitrage opportunities and trading losses.
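The adjustment can be sketched in plain Python. This is a from-scratch illustration of the textbook BSM formula written in forward form, not py_vollib's implementation; `bsm_price` and `norm_cdf` are names chosen for this example.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(flag, S, K, t, r, sigma, q):
    """European option price with continuous dividend yield q."""
    if flag not in ("c", "p"):
        raise ValueError("flag must be 'c' or 'p'")
    F = S * math.exp((r - q) * t)  # dividend-adjusted forward
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    deflater = math.exp(-r * t)
    if flag == "c":
        return deflater * (F * norm_cdf(d1) - K * norm_cdf(d2))
    return deflater * (K * norm_cdf(-d2) - F * norm_cdf(-d1))


call = bsm_price("c", 100.0, 95.0, 0.5, 0.03, 0.2, 0.02)
put = bsm_price("p", 100.0, 95.0, 0.5, 0.03, 0.2, 0.02)
```

A quick sanity check for any implementation is put-call parity with dividends: `call - put` should equal `S * exp(-q*t) - K * exp(-r*t)`. Setting `q = 0` on a dividend-paying asset raises the forward and therefore overstates the call.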
### `AP-DERIVATIVES-PRICING-009` — Zero or negative time-to-expiration in option pricing <sub>(high)</sub>
Option pricing formulas (Black-Scholes, Black model) compute sqrt(t) in the denominator. Zero time causes division by zero; negative time produces NaN in d1/d2 calculations. The consequence is invalid option prices (NaN, inf) that break downstream Greeks calculations and hedging workflows.
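A guard placed where d1 and d2 are computed makes the failure explicit. The helper below is a hypothetical sketch, not py_vollib code.

```python
import math


def d1_d2(F, K, t, sigma):
    """Compute d1 and d2 for forward F, rejecting inputs that break sqrt(t)."""
    if not (math.isfinite(t) and t > 0.0):
        raise ValueError(f"time to expiration must be positive and finite, got {t!r}")
    if not (math.isfinite(sigma) and sigma > 0.0):
        raise ValueError(f"volatility must be positive and finite, got {sigma!r}")
    vol_sqrt_t = sigma * math.sqrt(t)  # the denominator that fails at t <= 0
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * t) / vol_sqrt_t
    return d1, d1 - vol_sqrt_t


d1, d2 = d1_d2(100.0, 100.0, 1.0, 0.2)
```

Callers that need a value at expiry can catch the error and fall back to the option's intrinsic value, which keeps that decision out of the pricing formula itself.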
### `AP-DERIVATIVES-PRICING-010` — Black model applies spot price instead of forward price <sub>(high)</sub>
The Black model is designed for options on futures and forwards and expects the futures price F as input, not the spot price S. Passing spot directly causes incorrect pricing because the Black formula treats its input as a forward-style price that is a driftless lognormal martingale under the pricing measure; a spot price, which drifts at r - q under the risk-neutral measure, must first be converted to a forward. The consequence is systematically wrong prices for options on forwards and futures.
### `AP-DERIVATIVES-PRICING-011` — Missing discount factor in Black model pricing <sub>(medium)</sub>
Black model pricing must apply time value discounting with deflater = exp(-r*t) to undiscounted option prices. Omitting the discount factor produces forward option prices that exceed their fair value by the risk-free compounding amount. The consequence is violation of time value of money principles and prices that cannot be used for fair valuation or hedging.
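The role of the deflater can be shown by splitting the calculation in two. This is a standalone sketch of the Black formula; `undiscounted_black` and `black_price` are hypothetical names and not py_vollib's internals.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def undiscounted_black(flag, F, K, t, sigma):
    """Expected payoff at expiry under the Black model, before discounting."""
    if flag not in ("c", "p"):
        raise ValueError("flag must be 'c' or 'p'")
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if flag == "c":
        return F * norm_cdf(d1) - K * norm_cdf(d2)
    return K * norm_cdf(-d2) - F * norm_cdf(-d1)


def black_price(flag, F, K, t, r, sigma):
    """Present value: the undiscounted price scaled by deflater = exp(-r*t)."""
    deflater = math.exp(-r * t)
    return deflater * undiscounted_black(flag, F, K, t, sigma)


pv = black_price("c", 100.0, 100.0, 1.0, 0.05, 0.2)
fv = undiscounted_black("c", 100.0, 100.0, 1.0, 0.2)
```

With a positive rate, `pv / fv` equals `exp(-r*t)`, so skipping the deflater overstates the price by exactly that compounding factor.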
### `AP-DERIVATIVES-PRICING-012` — Invalid flag parameter ('c'/'p') passed to py_vollib without validation <sub>(medium)</sub>
py_vollib's binary_flag dict contains only the keys 'c' and 'p'. Passing any other flag value raises a KeyError. The library lacks input validation and crashes on invalid inputs. The consequence is unhandled exceptions in production systems when flag values come from external sources with unexpected formats.
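A thin normalisation layer at the system boundary avoids the crash. `normalize_flag` and its alias table are assumptions made for this sketch; which spellings to accept is a decision for the calling system.

```python
def normalize_flag(raw):
    """Map an externally supplied option type to 'c' or 'p', or raise ValueError."""
    if not isinstance(raw, str):
        raise ValueError(f"option flag must be a string, got {type(raw).__name__}")
    aliases = {"c": "c", "call": "c", "p": "p", "put": "p"}  # hypothetical alias set
    key = raw.strip().lower()
    if key not in aliases:
        raise ValueError(f"unrecognised option flag {raw!r}; expected 'c' or 'p'")
    return aliases[key]


flag = normalize_flag(" Call ")
```

Raising a `ValueError` with the offending input is easier to diagnose in logs than a bare `KeyError` from inside the pricing call.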
### `AP-DERIVATIVES-PRICING-015` — Implied volatility computed without proper bounds validation <sub>(medium)</sub>
When computing implied volatility, option prices outside theoretical bounds (below intrinsic value or above maximum) must raise appropriate exceptions. Returning invalid IV values (negative volatility or extreme values) violates mathematical definitions and leads to incorrect pricing, risk calculations, and hedging ratios. The consequence is systemic pricing errors across all vol-dependent derivatives.
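The bounds themselves are simple to check before any root-finding starts. The sketch below works in forward terms for a European option; the exception classes and `check_price_bounds` are hypothetical names defined here for illustration, not the exceptions py_vollib raises.

```python
import math


class PriceBelowIntrinsic(ValueError):
    """Quoted price is below the discounted intrinsic value."""


class PriceAboveMaximum(ValueError):
    """Quoted price is at or above the model's upper bound."""


def check_price_bounds(flag, price, F, K, t, r):
    """Return price if an implied volatility can exist for it, else raise."""
    if flag not in ("c", "p"):
        raise ValueError("flag must be 'c' or 'p'")
    deflater = math.exp(-r * t)
    if flag == "c":
        lower, upper = deflater * max(F - K, 0.0), deflater * F
    else:
        lower, upper = deflater * max(K - F, 0.0), deflater * K
    if price < lower:
        raise PriceBelowIntrinsic(f"price {price} < intrinsic {lower}")
    if price >= upper:
        raise PriceAboveMaximum(f"price {price} >= maximum {upper}")
    return price


ok = check_price_bounds("c", 5.0, 100.0, 100.0, 1.0, 0.0)
```

Rejecting out-of-bounds prices up front means a solver never has to return a negative or runaway volatility for an input that has no valid solution.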
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-127--py_vollib
**Scan date**: 2026-04-22
**Stats**: {'total_files': 5, 'total_classes': 23, 'total_functions': 0, 'total_stages': 5}
## Modules (5)
- [reference_implementations](components/reference_implementations.md): 5 classes
- [option_pricing](components/option_pricing.md): 4 classes
- [analytical_greeks](components/analytical_greeks.md): 6 classes
- [numerical_greeks](components/numerical_greeks.md): 4 classes
- [implied_volatility](components/implied_volatility.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 117
fatal_constraints_count: 45
non_fatal_constraints_count: 124
use_cases_count: 1
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **1**
## `KUC-101`
**Source**: `docs/conf.py`
Configures automated documentation generation for the py_vollib options pricing library, enabling consistent API documentation, code examples, and coverage reporting for developers.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-DERIVATIVES-PRICING-001` — Strict input validation before financial calculations
**From**: FinancePy, QuantLib-SWIG · **Applicable to**: derivatives-pricing
Both FinancePy and QuantLib-SWIG enforce strict validation of all input parameters before any financial computation. FinancePy validates day count types, date arguments, tolerance parameters, and max iterations. QuantLib-SWIG validates exercise types and swap direction enums. This pattern prevents corrupted calculations and provides clear error messages. Apply this pattern by validating all inputs at function entry points.
## `CW-DERIVATIVES-PRICING-002` — Bootstrap requires ordered instrument calibration
**From**: FinancePy, QuantLib-SWIG · **Applicable to**: derivatives-pricing
Both FinancePy and QuantLib-SWIG require calibration instruments to be provided in strict maturity order for curve bootstrapping. FinancePy enforces monotonically increasing time points and validates instrument sequencing (deposits before FRAs before swaps). QuantLib-SWIG uses bootstrap helpers (DepositRateHelper, FraRateHelper, SwapRateHelper) that assume ordered inputs. This ensures the bootstrap algorithm solves for discount factors at mathematically correct time points.
## `CW-DERIVATIVES-PRICING-003` — Handle pattern for lazy evaluation chains
**From**: QuantLib-SWIG · **Applicable to**: derivatives-pricing
QuantLib-SWIG requires wrapping market data (quotes, term structures) in Handle objects to enable lazy evaluation and automatic recalculation. QuoteHandle for market quotes and Handle for term structures enable the observer pattern. When market data updates, all dependent instruments automatically recalculate. This pattern is essential for live pricing systems where prices must reflect current market conditions.
## `CW-DERIVATIVES-PRICING-004` — Parameter composition requires fixed ordering and partitioning
**From**: arch · **Applicable to**: derivatives-pricing
arch enforces a strict parameter composition pattern where mean, volatility, and distribution parameters must be concatenated in fixed order with explicit offset partitioning. The offsets array partitions the unified parameter vector into components. This pattern prevents parameter assignment errors that would corrupt model components. Apply this when composing financial models from multiple sub-components.
## `CW-DERIVATIVES-PRICING-005` — Strict mathematical constraint enforcement
**From**: arch, py_vollib · **Applicable to**: derivatives-pricing
Both arch and py_vollib enforce strict mathematical constraints: arch enforces volatility model stationarity constraints (A.dot(params) - b >= 0) for SLSQP optimization; py_vollib validates implied volatility is positive and option prices within intrinsic/maximum bounds. Violating these constraints produces mathematically invalid results. Always enforce domain constraints on all financial model parameters.
## `CW-DERIVATIVES-PRICING-006` — Forward price adjustment for dividend yield in BSM
**From**: py_vollib · **Applicable to**: derivatives-pricing
py_vollib demonstrates the correct BSM implementation: compute forward price F = S * exp((r-q)*t) to adjust for continuous dividend yield before passing to the pricing engine. This pattern is essential for all options on dividend-paying assets. Forgetting the dividend adjustment causes systematic mispricing for the entire equity derivatives book.
## `CW-DERIVATIVES-PRICING-007` — Monotonicity validation for interpolation arrays
**From**: FinancePy · **Applicable to**: derivatives-pricing
FinancePy enforces strictly monotonically increasing time arrays before interpolation operations. This prevents undefined behavior at crossing times and ensures each time point maps to exactly one discount factor. Apply this validation whenever implementing interpolation over financial time series (discount curves, volatility surfaces, forward rates).
## `CW-DERIVATIVES-PRICING-008` — Production vs reference implementation selection
**From**: py_vollib · **Applicable to**: derivatives-pricing
py_vollib explicitly distinguishes between ref_python (slow, educational) and production (fast, C-based lets_be_rational) implementations. Using the reference implementation in production causes 10-100x performance degradation. Always select the appropriate implementation tier based on use case requirements—reference for testing/education, optimized for production trading systems.
FILE:references/components/analytical_greeks.md
# analytical_greeks (6 classes)
## `analytical.delta`
`analytical_greeks/analytical-delta.py:0`
## `analytical.theta`
`analytical_greeks/analytical-theta.py:0`
## `analytical.gamma`
`analytical_greeks/analytical-gamma.py:0`
## `analytical.vega`
`analytical_greeks/analytical-vega.py:0`
## `analytical.rho`
`analytical_greeks/analytical-rho.py:0`
## `d1_d2_source`
`analytical_greeks/d1-d2-source.py:0`
FILE:references/components/implied_volatility.md
# implied_volatility (4 classes)
## `black_scholes.implied_volatility`
`implied_volatility/black-scholes-implied-volatility.py:0`
## `black.implied_volatility`
`implied_volatility/black-implied-volatility.py:0`
## `black.normalised_implied_volatility`
`implied_volatility/black-normalised-implied-volatility.py:0`
## `iv_algorithm`
`implied_volatility/iv-algorithm.py:0`
FILE:references/components/numerical_greeks.md
# numerical_greeks (4 classes)
## `numerical_greeks.delta`
`numerical_greeks/numerical-greeks-delta.py:0`
## `numerical_greeks.theta`
`numerical_greeks/numerical-greeks-theta.py:0`
## `numerical_greeks.gamma`
`numerical_greeks/numerical-greeks-gamma.py:0`
## `step_size`
`numerical_greeks/step-size.py:0`
FILE:references/components/option_pricing.md
# option_pricing (4 classes)
## `black.black`
`option_pricing/black-black.py:0`
## `black_scholes.black_scholes`
`option_pricing/black-scholes-black-scholes.py:0`
## `black_scholes_merton.black_scholes_merton`
`option_pricing/black-scholes-merton-black-scholes-merto.py:0`
## `pricing_backend`
`option_pricing/pricing-backend.py:0`
FILE:references/components/reference_implementations.md
# reference_implementations (5 classes)
## `black.black`
`reference_implementations/black-black.py:0`
## `black_scholes.black_scholes`
`reference_implementations/black-scholes-black-scholes.py:0`
## `black_scholes_merton.black_scholes_merton`
`reference_implementations/black-scholes-merton-black-scholes-merto.py:0`
## `implied_volatility.implied_volatility`
`reference_implementations/implied-volatility-implied-volatility.py:0`
## `norm_cdf_source`
`reference_implementations/norm-cdf-source.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-127-v5.3
version: v6.1
blueprint_id: finance-bp-127
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:01:03.585888+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- global
activities:
- derivatives-pricing
upgraded_from: finance-bp-127-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:35.266651+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-127--py_vollib/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-127--py_vollib/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-DERIVATIVES-PRICING-001
title: Instrument NPV called without attached pricing engine
description: Calling NPV() on a derivatives instrument without first calling setPricingEngine() returns uninitialized garbage
values or throws null pointer exceptions. This occurs because the Instrument class relies on the attached PricingEngine
to perform actual valuation logic. The consequence is silently incorrect pricing results that appear valid, potentially
leading to bad trading decisions.
project_source: QuantLib-SWIG (finance-bp-123)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-002
title: BSM forward price ignores dividend yield
description: When calculating option prices on dividend-paying stocks using BSM, the forward price must be adjusted as F
= S * exp((r-q)*t). Omitting the dividend yield adjustment (using F = S * exp(r*t)) causes systematic mispricing for all
dividend-paying assets. The consequence is consistently wrong option prices that diverge from market prices, leading to
arbitrage opportunities and trading losses.
project_source: py_vollib (finance-bp-127)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-003
title: Negative discount factors passed to log-domain interpolation
description: When Numba-jitted interpolation functions perform log transformation on discount factors, negative or zero
values cause domain errors. This occurs because log(-x) and log(0) are mathematically undefined. The consequence is runtime
crashes in jitted functions and complete failure of discount curve interpolation, blocking all downstream pricing calculations.
project_source: FinancePy (finance-bp-101)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-004
title: Non-monotonic time points in discount curve interpolation
description: Interpolation over non-monotonically increasing time points produces undefined behavior at crossing times,
causing discount factors to be incorrectly computed where time values overlap. This corrupts the entire term structure
because the bootstrap algorithm cannot determine which discount factor corresponds to which maturity. The consequence
is incorrect present value calculations across all downstream products priced against the curve.
project_source: FinancePy (finance-bp-101)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-005
title: Bootstrap calibration instruments not in maturity order
description: When building yield curves from market instruments (deposits, FRAs, swaps), the instruments must be provided
in strictly increasing maturity order. Out-of-order instruments cause the bootstrap algorithm to solve for discount factors
at incorrect time points, corrupting the entire term structure. The consequence is wrong forward rates and discount factors
that propagate into all priced instruments.
project_source: FinancePy (finance-bp-101)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-006
title: Option Exercise type mismatches VanillaOption constructor
description: VanillaOption requires both a StrikedTypePayoff and a matching Exercise object. Using the wrong Exercise type (e.g.,
AmericanExercise for a European option) causes compilation failures in C++ or runtime errors in SWIG bindings. The consequence
is that the pricing system cannot initialize options, which blocks all option pricing workflows.
project_source: QuantLib-SWIG (finance-bp-123)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-007
title: NaN/inf values in ARCH model input data
description: ARCH model estimation relies on recursive variance computations and scipy optimize. Non-finite input values
(NaN, inf) cause optimizers to produce NaN results and recursive variance calculations to fail. The consequence is complete
model estimation failure with meaningless outputs that appear valid, leading to incorrect volatility forecasts and risk
misestimation.
project_source: arch (finance-bp-124)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-008
title: ARCH parameter array concatenation in wrong order
description: 'ARCHModel composes from three components (mean, volatility, distribution) and requires parameter arrays concatenated
in fixed order: [mean_params, volatility_params, distribution_params]. Incorrect ordering causes _parse_parameters to
assign wrong values to wrong components, producing mathematically invalid models (e.g., volatility parameters interpreted
as distribution parameters). The consequence is invalid conditional variance forecasts.'
project_source: arch (finance-bp-124)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-009
title: Zero or negative time-to-expiration in option pricing
description: Option pricing formulas (Black-Scholes, Black model) compute sqrt(t) in the denominator. Zero time causes division
by zero; negative time produces NaN in d1/d2 calculations. The consequence is invalid option prices (NaN, inf) that break
downstream Greeks calculations and hedging workflows.
project_source: py_vollib (finance-bp-127)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-010
title: Black model applies spot price instead of forward price
description: The Black model is designed for options on futures/forwards and expects the futures price F as input, not the spot
price S. Using spot directly causes incorrect pricing because the Black formula assumes the underlying is a forward or
futures price, which has zero drift under the risk-neutral measure, whereas the spot price drifts at the risk-free rate
(less any yield). The consequence is systematically wrong forward option prices.
project_source: py_vollib (finance-bp-127)
severity: high
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-011
title: Missing discount factor in Black model pricing
description: Black model pricing must apply time value discounting with deflater = exp(-r*t) to undiscounted option prices.
Omitting the discount factor produces forward option prices that exceed their fair value by the risk-free compounding
amount. The consequence is violation of time value of money principles and prices that cannot be used for fair valuation
or hedging.
project_source: py_vollib (finance-bp-127)
severity: medium
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-012
title: Flag value other than 'c' or 'p' passed to py_vollib without validation
description: The py_vollib binary_flag dict contains only the keys 'c' and 'p'. Passing any other flag value raises a KeyError,
because the library does not validate the flag before the lookup. The consequence is unhandled exceptions in production
systems when flag values come from external sources with unexpected formats.
project_source: py_vollib (finance-bp-127)
severity: medium
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-013
title: Evaluation date not set before QuantLib term structure construction
description: QuantLib expects ql.Settings.instance().evaluationDate to be set before constructing yield term structures
and instruments. Without an explicit evaluation date QuantLib falls back to the system date, so the curve reference date
changes from run to run and date calculations can fail or produce incorrect settlement dates. The consequence is wrong
discount factors and NPV calculations across the entire portfolio.
project_source: QuantLib-SWIG (finance-bp-123)
severity: medium
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-014
title: Market quotes passed without QuoteHandle wrapper
description: QuantLib's observer pattern requires all market quotes to be wrapped in QuoteHandle before passing to rate
helpers. Raw quote values bypass the observable notification mechanism, causing dependent instruments to never recalculate
when market data updates. The consequence is stale pricing that doesn't reflect current market conditions.
project_source: QuantLib-SWIG (finance-bp-123)
severity: medium
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
- id: AP-DERIVATIVES-PRICING-015
title: Implied volatility computed without proper bounds validation
description: When computing implied volatility, option prices outside theoretical bounds (below intrinsic value or above
maximum) must raise appropriate exceptions. Returning invalid IV values (negative volatility or extreme values) violates
mathematical definitions and leads to incorrect pricing, risk calculations, and hedging ratios. The consequence is systemic
pricing errors across all vol-dependent derivatives.
project_source: py_vollib (finance-bp-127)
severity: medium
applicable_to_tags:
markets:
- global
activities:
- derivatives-pricing
_source_file: anti-patterns/derivatives-pricing.yaml
cross_project_wisdom:
- wisdom_id: CW-DERIVATIVES-PRICING-001
source_project: FinancePy, QuantLib-SWIG
pattern_name: Strict input validation before financial calculations
description: Both FinancePy and QuantLib-SWIG enforce strict validation of all input parameters before any financial computation.
FinancePy validates day count types, date arguments, tolerance parameters, and max iterations. QuantLib-SWIG validates
exercise types and swap direction enums. This pattern prevents corrupted calculations and provides clear error messages.
Apply this pattern by validating all inputs at function entry points.
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-002
source_project: FinancePy, QuantLib-SWIG
pattern_name: Bootstrap requires ordered instrument calibration
description: Both FinancePy and QuantLib-SWIG require calibration instruments to be provided in strict maturity order for
curve bootstrapping. FinancePy enforces monotonically increasing time points and validates instrument sequencing (deposits
before FRAs before swaps). QuantLib-SWIG uses bootstrap helpers (DepositRateHelper, FraRateHelper, SwapRateHelper) that
assume ordered inputs. This ensures the bootstrap algorithm solves for discount factors at mathematically correct time
points.
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-003
source_project: QuantLib-SWIG
pattern_name: Handle pattern for lazy evaluation chains
description: QuantLib-SWIG requires wrapping market data (quotes, term structures) in Handle objects to enable lazy evaluation
and automatic recalculation. QuoteHandle for market quotes and Handle for term structures enable the observer pattern.
When market data updates, all dependent instruments automatically recalculate. This pattern is essential for live pricing
systems where prices must reflect current market conditions.
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-004
source_project: arch
pattern_name: Parameter composition requires fixed ordering and partitioning
description: arch enforces a strict parameter composition pattern where mean, volatility, and distribution parameters must
be concatenated in fixed order with explicit offset partitioning. The offsets array partitions the unified parameter vector
into components. This pattern prevents parameter assignment errors that would corrupt model components. Apply this when
composing financial models from multiple sub-components.
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-005
source_project: arch, py_vollib
pattern_name: Strict mathematical constraint enforcement
description: 'Both arch and py_vollib enforce strict mathematical constraints: arch enforces volatility model stationarity
constraints (A.dot(params) - b >= 0) for SLSQP optimization; py_vollib validates implied volatility is positive and option
prices within intrinsic/maximum bounds. Violating these constraints produces mathematically invalid results. Always enforce
domain constraints on all financial model parameters.'
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-006
source_project: py_vollib
pattern_name: Forward price adjustment for dividend yield in BSM
description: 'py_vollib demonstrates the correct BSM implementation: compute forward price F = S * exp((r-q)*t) to adjust
for continuous dividend yield before passing to the pricing engine. This pattern is essential for all options on dividend-paying
assets. Forgetting the dividend adjustment causes systematic mispricing for the entire equity derivatives book.'
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-007
source_project: FinancePy
pattern_name: Monotonicity validation for interpolation arrays
description: FinancePy enforces strictly monotonically increasing time arrays before interpolation operations. This prevents
undefined behavior at crossing times and ensures each time point maps to exactly one discount factor. Apply this validation
whenever implementing interpolation over financial time series (discount curves, volatility surfaces, forward rates).
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
- wisdom_id: CW-DERIVATIVES-PRICING-008
source_project: py_vollib
pattern_name: Production vs reference implementation selection
description: py_vollib explicitly distinguishes between ref_python (slow, educational) and production (fast, C-based lets_be_rational)
implementations. Using the reference implementation in production causes 10-100x performance degradation. Always select
the appropriate implementation tier based on use case requirements—reference for testing/education, optimized for production
trading systems.
applicable_to_activity: derivatives-pricing
_source_file: cross-project-wisdom/derivatives-pricing.yaml
domain_constraints_injected: []
resources_injected: {}
known_use_cases:
- kuc_id: KUC-101
source_file: docs/conf.py
business_problem: Configures automated documentation generation for the py_vollib options pricing library, enabling consistent
API documentation, code examples, and coverage reporting for developers.
intent_keywords:
- documentation
- sphinx
- api docs
- readthedocs
- docs generation
stage: reporting
data_domain: mixed
type: reporting
component_capability_map:
project: finance-bp-127--py_vollib
scan_date: '2026-04-22'
stats:
total_files: 5
total_classes: 23
total_functions: 0
total_stages: 5
modules:
reference_implementations:
class_count: 5
stage_id: reference_implementations
stage_order: 1
responsibility: Provide pure Python reference implementations using scipy.stats.norm.cdf for verification and sanity
checking against the optimized production core. They are not intended for production use but enable numerical validation.
classes:
- name: black.black
file: reference_implementations/black-black.py
line: 0
kind: required_method
signature: ''
- name: black_scholes.black_scholes
file: reference_implementations/black-scholes-black-scholes.py
line: 0
kind: required_method
signature: ''
- name: black_scholes_merton.black_scholes_merton
file: reference_implementations/black-scholes-merton-black-scholes-merto.py
line: 0
kind: required_method
signature: ''
- name: implied_volatility.implied_volatility
file: reference_implementations/implied-volatility-implied-volatility.py
line: 0
kind: required_method
signature: ''
- name: norm_cdf_source
file: reference_implementations/norm-cdf-source.py
line: 0
kind: replaceable_point
design_decision_count: 2
option_pricing:
class_count: 4
stage_id: pricing
stage_order: 2
responsibility: Calculate option prices using Black, Black-Scholes, and Black-Scholes-Merton models. Delegates to py_lets_be_rational
for performance while exposing a clean public API.
classes:
- name: black.black
file: option_pricing/black-black.py
line: 0
kind: required_method
signature: ''
- name: black_scholes.black_scholes
file: option_pricing/black-scholes-black-scholes.py
line: 0
kind: required_method
signature: ''
- name: black_scholes_merton.black_scholes_merton
file: option_pricing/black-scholes-merton-black-scholes-merto.py
line: 0
kind: required_method
signature: ''
- name: pricing_backend
file: option_pricing/pricing-backend.py
line: 0
kind: replaceable_point
design_decision_count: 4
analytical_greeks:
class_count: 6
stage_id: greeks_analytical
stage_order: 3
responsibility: Compute option Greeks (delta, theta, gamma, vega, rho) using closed-form Black-Scholes formulas. Following
industry convention, theta is scaled per day and vega and rho are scaled per percentage point.
classes:
- name: analytical.delta
file: analytical_greeks/analytical-delta.py
line: 0
kind: required_method
signature: ''
- name: analytical.theta
file: analytical_greeks/analytical-theta.py
line: 0
kind: required_method
signature: ''
- name: analytical.gamma
file: analytical_greeks/analytical-gamma.py
line: 0
kind: required_method
signature: ''
- name: analytical.vega
file: analytical_greeks/analytical-vega.py
line: 0
kind: required_method
signature: ''
- name: analytical.rho
file: analytical_greeks/analytical-rho.py
line: 0
kind: required_method
signature: ''
- name: d1_d2_source
file: analytical_greeks/d1-d2-source.py
line: 0
kind: replaceable_point
design_decision_count: 4
numerical_greeks:
class_count: 4
stage_id: greeks_numerical
stage_order: 4
responsibility: Compute option Greeks via finite difference approximation. Serves as verification of analytical formulas
and fallback for non-standard models or exotic payoffs.
classes:
- name: numerical_greeks.delta
file: numerical_greeks/numerical-greeks-delta.py
line: 0
kind: required_method
signature: ''
- name: numerical_greeks.theta
file: numerical_greeks/numerical-greeks-theta.py
line: 0
kind: required_method
signature: ''
- name: numerical_greeks.gamma
file: numerical_greeks/numerical-greeks-gamma.py
line: 0
kind: required_method
signature: ''
- name: step_size
file: numerical_greeks/step-size.py
line: 0
kind: replaceable_point
design_decision_count: 4
implied_volatility:
class_count: 4
stage_id: implied_volatility
stage_order: 5
responsibility: Solve for the volatility that produces a given option price using the iterative solver in lets_be_rational
(Householder iterations from a rational initial guess). Raises exceptions for out-of-bounds prices that violate intrinsic value constraints.
classes:
- name: black_scholes.implied_volatility
file: implied_volatility/black-scholes-implied-volatility.py
line: 0
kind: required_method
signature: ''
- name: black.implied_volatility
file: implied_volatility/black-implied-volatility.py
line: 0
kind: required_method
signature: ''
- name: black.normalised_implied_volatility
file: implied_volatility/black-normalised-implied-volatility.py
line: 0
kind: required_method
signature: ''
- name: iv_algorithm
file: implied_volatility/iv-algorithm.py
line: 0
kind: replaceable_point
design_decision_count: 3
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.5480769230769231
evidence_invalid: 47
evidence_verified: 57
evidence_auto_fixed: 0
audit_coverage: 23/23 (100%)
audit_pass_rate: 2/23 (8%)
audit_fail_total: 10
audit_finance_universal:
pass: 2
warn: 6
fail: 8
audit_subdomain_totals:
pass: 0
warn: 5
fail: 2
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-127. Evidence verify ratio
= 54.8% and audit fail total = 10. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-127-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt, then run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc:
- UC-101
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries:
- uc_id: UC-101
name: Sphinx Documentation Configuration for py_vollib
positive_terms:
- documentation
- sphinx
- api docs
- readthedocs
- docs generation
data_domain: mixed
negative_terms:
- trading
- backtesting
- screening
- factor computation
- live trading
- options pricing
ambiguity_question: Are you looking to configure project documentation (docs/conf.py), or are you trying to implement
a trading strategy, run factor analysis, or process market data?
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 117
fatal_constraints_count: 45
non_fatal_constraints_count: 124
use_cases_count: 1
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'SL-11: a Recorder subclass MUST declare both the provider and data_schema class attributes, otherwise
initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9) — chose standard Appel parameters not adaptive because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering not current-only because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose drawer_rects subclass override for custom annotations not hardcoded markers — allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 44 source groups: architecture(1),
calibration(1), core_algorithm(1), default_value(5), delta_calculation(1), design(1), and 38 more.'
key_decisions: 117 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-GAP-001
type: B
summary: 'Dual implementation strategy: providing both analytical (closed-form) and numerical (finite difference) Greeks
for the same option parameters'
- id: BD-GAP-002
type: B
summary: Finite difference epsilon set to 0.0001 for numerical Greek computation
- id: BD-017
type: B
summary: Delegate core pricing to py_lets_be_rational for speed and precision
- id: BD-087
type: DK/B
summary: 'The ''b'' cost-of-carry parameter defaults encode the entire model selection: Black=0, Black-Scholes=r, BSM=r-q'
- id: BD-090
type: DK
summary: Theta divided by 365 for per-day convention (multiply the reported value by 365 to recover the textbook annualized theta)
- id: BD-091
type: DK
summary: Vega and rho multiplied by 0.01 for per-percent convention (1% change, not 100% change)
- id: BD-095
type: DK
summary: BSM numerical greeks lambda computes dividend q from b via r-b, but bs_merton pricing expects q directly
- id: BD-097
type: DK
summary: Numerical theta uses 0.00001 as minimum time step when t < 1/365 to avoid negative time
- id: BD-055
type: B
summary: 'Black delta: N(d1)*exp(-r*t) for calls, -N(-d1)*exp(-r*t) for puts'
- id: BD-GAP-003
type: T
summary: Standardized function signature (flag, S, K, t, r, sigma) across all Greek calculations with 'flag' as the first
parameter
- id: BD-040
type: B
summary: Use Drezner-Wesolowsky method for bivariate normal distribution (CBND)
- id: BD-041
type: B/BA
summary: Use correlation thresholds 0.3 and 0.75 to select CBND quadrature order
- id: BD-042
type: B/BA
summary: Cap CND evaluation at |x|>37 to prevent underflow/overflow
- id: BD-043
type: B
summary: Use Abramowitz-Stegun polynomial coefficients in CND approximation
- id: BD-029
type: B/BA
summary: In BSM delta, multiply by exp(-q*t) to adjust for dividend yield
- id: BD-030
type: B
summary: In BSM gamma, include exp(-q*t) in numerator
- id: BD-031
type: B
summary: In BSM vega, multiply by exp(-q*t) factor
- id: BD-051
type: B/BA
summary: 'BSM theta includes additional dividend term: -q*S*exp(-q*t)*N(±d1)'
- id: BD-024
type: B
summary: At t=0, set delta to {c:0.5, p:-0.5} when S=K, {c:1, p:0} when S>K
- id: BD-045
type: B
summary: 'For numerical gamma at t=0: return infinity if S=K, else 0'
- id: BD-021
type: B/DK
summary: Raise PriceIsAboveMaximum exception when sigma equals FLOAT_MAX
- id: BD-022
type: B/DK
summary: Raise PriceIsBelowIntrinsic exception when sigma equals MINUS_FLOAT_MAX
- id: BD-018
type: B
summary: Calculate forward price as S/numpy.exp(-r*t) equivalent to S*exp(r*t)
- id: BD-046
type: B/BA
summary: 'Discount forward price calculation: F = S * exp((r-q)*t) for BSM'
- id: BD-100
type: BA
summary: 'INTERACTION: [BD-089] × [BD-003] × [BD-092] → Complete IV calculation chain failure if binary_flag mapping
is wrong'
- id: BD-101
type: B/BA
summary: 'INTERACTION: [BD-094] vs [BD-055]/[BD-071] → Black delta discount factor inconsistency creates model confusion'
- id: BD-102
type: BA/DK
summary: 'INTERACTION: [BD-005] × [BD-009] × [BD-025] → Theta instability amplification near 1-day expiry'
- id: BD-103
type: B
summary: 'INTERACTION: [BD-029] × [BD-030] × [BD-031] × [BD-051] → BSM dividend factor cascade across all Greeks'
- id: BD-104
type: BA
summary: 'INTERACTION: [BD-085] × [BD-093] × [BD-003] × [BD-020] → IV solver chain dependency'
- id: BD-105
type: DK
summary: 'INTERACTION: [BD-098] × [BD-046] × [BD-018] → Forward price symbol confusion across models'
- id: BD-106
type: T
summary: 'INTERACTION: [BD-099] × [BD-011] → ref_python serves dual role as reference and utility, creating coupling'
- id: BD-004
type: B
summary: BSM delta includes exp(-q*t) factor for dividend yield
- id: BD-005
type: B/BA
summary: Theta divided by 365 (per-day convention)
- id: BD-006
type: B/BA
summary: Vega/Rho multiplied by 0.01 (per-percent convention)
- id: BD-013
type: B
summary: Divide theta by 365 to express as daily change (per calendar day)
- id: BD-014
type: B/BA
summary: Multiply vega by 0.01 to express per 1% change in implied volatility
- id: BD-015
type: B
summary: Multiply rho by 0.01 to express per 1% change in interest rate
- id: BD-044
type: B
summary: Gamma and vega are identical for calls and puts (symmetric)
- id: BD-007
type: BA/M
summary: Uses pricing_function parameter for generic computation
- id: BD-008
type: B/BA
summary: dS = 0.01 (1% bump) hardcoded as step size
- id: BD-009
type: B
summary: Theta capped at 1/365 to avoid singularity at expiry
- id: BD-023
type: B/BA
summary: Use numerical delta step size dS = 0.01 for finite difference approximation
- id: BD-025
type: B/BA
summary: For t<=1/365, approximate theta via 0.00001 time step instead of 1/365
- id: BD-048
type: B
summary: For numerical rho, shift both r and b by 0.01 in opposite directions
- id: BD-GAP-005
type: M
summary: Analytical Greeks compute all first-order derivatives (delta, theta, vega, rho) plus gamma using closed-form
formulas without error bounds
- id: BD-010
type: B
summary: Uses FLOAT_MAX/MINUS_FLOAT_MAX as sentinel values
- id: BD-020
type: B/RC
summary: Use transformed rational guess algorithm for implied volatility
- id: BD-047
type: B/BA
summary: 'Undiscount option price before IV calculation: undiscounted = price / deflater'
- id: BD-094
type: DK
summary: Black delta has discount factor exp(-r*t) but Black-Scholes delta does not - inconsistent delta definitions
- id: BD-012
type: B/BA
summary: Use 'c'/'p' string flags instead of numeric flags, converted to binary via {CALL:1, PUT:-1}
- id: BD-054
type: B
summary: Use binary_flag lookup for converting 'c'/'p' to internal numeric representation
- id: BD-056
type: B
summary: Use float() conversion on all numeric inputs to ensure type consistency
- id: BD-089
type: B/BA
summary: 'Global binary_flag mapping: {''c'':1, ''p'':-1} must match py_lets_be_rational convention throughout'
- id: BD-092
type: BA/DK
summary: FLOAT_MAX/MINUS_FLOAT_MAX sentinel values from lets_be_rational must map to custom exceptions
- id: BD-096
type: BA
summary: 'Numerical greeks edge case: t=0 returns finite values for delta and for gamma away from the money (0), but float(''inf'') for gamma at ATM (S=K)'
- id: BD-098
type: DK
summary: 'Forward price formula: F=S/exp(-r*t) in black_scholes; F=S*exp((r-q)*t) in BSM - different interpretations
of ''F'''
- id: BD-019
type: B
summary: Bridge Black-Scholes to Black via F = S/deflater, then multiply back
- id: BD-053
type: B/RC
summary: d2 = d1 - sigma*sqrt(t) relationship holds across all models
- id: BD-026
type: B/BA
summary: Use b=r (cost-of-carry = risk-free rate) for Black-Scholes model
- id: BD-027
type: B/BA
summary: Use b=0 (cost-of-carry = 0) for Black futures option model
- id: BD-028
type: B/BA
summary: Use b=r-q for Black-Scholes-Merton model to account for continuous dividends
- id: BD-016
type: B
summary: Pre-compute PDF constant ONE_OVER_SQRT_TWO_PI = 0.3989... for performance
- id: BD-093
type: DK
summary: 'Implied volatility requires price first: BSM IV must compute F=S*exp((r-q)*t) before calling lets_be_rational'
- id: BD-088
type: T
summary: 'Two-implementation architecture: fast (py_lets_be_rational) vs reference (scipy/brentq) for cross-validation'
- id: BD-099
type: T
summary: Analytical greeks import d1/d2 from ref_python modules, creating hidden coupling
- id: BD-052
type: B
summary: Provide limited-iteration IV calculation for time-critical applications
- id: BD-001
type: B
summary: BSM delegates to Black model via F=S*exp((r-q)*t)
- id: BD-002
type: B/RC
summary: Flag uses 'c'/'p' strings mapped to +1/-1 via binary_flag dict
- id: BD-003
type: B
summary: undiscounted_black forwards to py_lets_be_rational.black
- id: BD-039
type: B
summary: Implement normalised_black for time-value put-call invariant transformation
- id: BD-057
type: B
summary: Use Black (1976) futures option model with discounted price calculation
- id: BD-058
type: B/BA
summary: Use transformed rational guess algorithm for implied volatility calculation
- id: BD-078
type: B
summary: Use normalised (time-value invariant) Black pricing for comparative analysis
- id: BD-085
type: B/BA
summary: Convert price to undiscounted form before Black IV calculation
- id: BD-086
type: B
summary: Raise PriceIsAboveMaximum/PriceIsBelowIntrinsic exceptions for out-of-bounds IV
- id: BD-066
type: B/BA
summary: Scale vega and rho by 0.01 for per-1% change convention
- id: BD-067
type: B/BA
summary: Divide theta by 365 to express as daily decay
- id: BD-070
type: B
summary: Set cost_of_carry b=0 for Black model (futures options)
- id: BD-077
type: B
summary: Apply analytical rho formula using price-times-time sensitivity
- id: BD-084
type: B
summary: Use analytical theta formula with three-term decomposition
- id: BD-068
type: B/BA
summary: Use Black-Scholes (1973) equity option model with spot price
- id: BD-083
type: B
summary: Apply put-call parity for Black-Scholes via forward price conversion
- id: BD-069
type: B
summary: Use Black-Scholes-Merton model with continuous dividend yield q
- id: BD-074
type: B
summary: Convert BSM price to Black implied volatility using forward adjustment
- id: BD-060
type: B
summary: Custom polynomial approximation for cumulative normal distribution
- id: BD-061
type: B
summary: Use Drezner-Wesolowsky (1990) algorithm with Gauss-Hermite quadrature for bivariate normal
- id: BD-062
type: B
summary: Use central finite difference for numerical delta calculation
- id: BD-063
type: B
summary: Use central finite difference for numerical gamma calculation
- id: BD-064
type: B/BA
summary: Use symmetric finite difference for numerical vega calculation
- id: BD-065
type: B/BA
summary: Use 1/365 years (one calendar day) time step for theta approximation
- id: BD-076
type: B/DK
summary: Handle special cases at expiry (t=0) with analytical deltas
- id: BD-079
type: B/RC
summary: Use Gauss-Hermite quadrature with adaptive node count based on correlation
- id: BD-081
type: B
summary: Return infinity for gamma at expiry when S=K (digital option behavior)
- id: BD-059
type: B
summary: Use Brent's method root-finding for implied volatility in reference Python
- id: BD-073
type: B/BA
summary: Calculate d1 and d2 using log-moneyness with volatility term structure
- id: BD-075
type: B
summary: Use scipy.stats.norm.cdf for reference Python implementations
- id: BD-080
type: B/BA
summary: Set implied volatility search bounds to [1e-12, 100] for brentq
- id: BD-071
type: B/DK
summary: Use analytical delta formula with discounted N(d1) for Black model
- id: BD-082
type: B
summary: 'Use BSM d1 formula with cost of carry: (ln(S/K) + (r-q+σ²/2)*t) / (σ√t)'
- id: BD-072
type: B
summary: Use analytical delta formula with exp(-q*t) for BSM model
- id: BD-011
type: BA
summary: ref_python mirrors production API exactly
- id: BD-032
type: B
summary: Provide ref_python module as pure-Python reference without LetsBeRational
- id: BD-033
type: B/BA
summary: Use scipy brentq for ref_python implied volatility with a=1e-12, b=100
- id: BD-GAP-006
type: DK
summary: 'Missing: Timezone explicit annotation'
- id: BD-GAP-007
type: DK
summary: 'Missing: Random seed full coverage'
- id: BD-GAP-008
type: B
summary: 'Missing: Currency/unit explicit annotation'
- id: BD-GAP-009
type: RC
summary: 'Missing: Price/quantity precision (tick/lot)'
- id: BD-GAP-010
type: B
summary: 'Missing: Model calibration residuals and convergence diagnostics'
- id: BD-GAP-011
type: DK
summary: 'Missing: Timezone explicit annotation'
- id: BD-038
type: B/BA
summary: 'For Black rho, use simplified formula: -t * black(flag, F, K, t, r, sigma) * .01'
- id: BD-GAP-004
type: BA/DK
summary: Time-to-expiry parameter 't' used directly (not as T-t time-from-now) in Black-Scholes calculations
- id: BD-037
type: B/BA
summary: Set maxiter=1000 for brentq implied volatility solver
- id: BD-036
type: B/BA
summary: Use xtol=1e-15, rtol=1e-15 for brentq implied volatility solver
- id: BD-034
type: B/BA
summary: Set numerical vs analytical comparison epsilon = 0.01 for most greeks
- id: BD-035
type: B/BA
summary: Set numerical vs analytical comparison epsilon = 0.0001 for rho
- id: BD-049
type: B
summary: Test against Hull textbook examples (Examples 17.1-17.7) for validation
- id: BD-050
type: B
summary: Test against Haug's Complete Guide to Option Pricing Formulas
resources:
packages:
- name: py_lets_be_rational
version_pin: latest
- name: numpy
version_pin: latest
- name: scipy
version_pin: latest
- name: pandas
version_pin: latest
- name: simplejson
version_pin: latest
- name: numba
version_pin: latest
- name: pytest
version_pin: latest
- name: sphinx
version_pin: latest
- name: sphinx_rtd_theme
version_pin: latest
- name: recommonmark
version_pin: latest
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install py_lets_be_rational
- python3 -m pip install numpy
- python3 -m pip install scipy
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-005
when: When using ref_python for production calculations
action: use ref_python implementations in production systems
severity: fatal
kind: resource_boundary
modality: must_not
consequence: ref_python is 10-100x slower than production (uses pure Python scipy instead of optimized lets_be_rational
C implementation), causing severe performance degradation
stage_ids:
- reference_implementations
- id: finance-C-012
when: When considering ref_python for mission-critical trading
action: deploy ref_python in any production trading system
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Pure Python implementation lacks the speed and reliability guarantees of the optimized lets_be_rational C
implementation, risking financial losses
stage_ids:
- reference_implementations
- id: finance-C-015
when: When implementing option pricing with BSM model
action: adjust forward price for dividend yield using F = S * exp((r-q)*t)
severity: fatal
kind: domain_rule
modality: must
consequence: Without dividend yield adjustment, BSM forward price will be incorrect, causing systematic mispricing for
dividend-paying stocks. The option price will be systematically higher or lower than the true theoretical value.
stage_ids:
- pricing
- id: finance-C-016
when: When calling py_vollib pricing functions with flag parameter
action: pass any flag value other than 'c' or 'p'
severity: fatal
kind: domain_rule
modality: must_not
consequence: Passing invalid flag values causes KeyError exception since binary_flag dict only contains keys 'c' and 'p'.
The library has no input validation and will crash on invalid flags.
stage_ids:
- pricing
- id: finance-C-017
when: When calculating option prices with time-to-expiration
action: verify time parameter t is positive and expressed in years as continuous time fraction
severity: fatal
kind: domain_rule
modality: must
consequence: Negative or zero time causes sqrt(t) to produce NaN in the d1/d2 calculations, or division by zero in the
denominator, resulting in invalid option prices that break downstream Greeks calculations.
stage_ids:
- pricing
- id: finance-C-021
when: When converting flag parameter for underlying pricing library
action: use binary_flag dict to convert 'c'/'p' to +1/-1 before passing to py_lets_be_rational
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Passing string flags directly to py_lets_be_rational causes type errors since it expects integer flag values
(+1 for call, -1 for put).
stage_ids:
- pricing
- id: finance-C-022
when: When implementing Black model pricing
action: apply time value discounting with deflater = exp(-r*t) to undiscounted price
severity: fatal
kind: domain_rule
modality: must
consequence: Missing the discount factor results in forward option prices that exceed their fair value by the risk-free
compounding amount, violating time value of money principles.
stage_ids:
- pricing
- id: finance-C-023
when: When using py_vollib.black for options on spot assets
action: pass spot price S directly to Black model without converting to forward price
severity: fatal
kind: resource_boundary
modality: must_not
consequence: Black model expects futures price F, not spot price S. Passing spot directly causes incorrect pricing because
Black formula assumes the underlying follows geometric Brownian motion with zero cost-of-carry, not spot dynamics.
stage_ids:
- pricing
- id: finance-C-067
when: When computing implied volatility from option prices
action: Return a positive IV value (volatility is mathematically defined as positive)
severity: fatal
kind: domain_rule
modality: must
consequence: A negative implied volatility would violate the mathematical definition of volatility as the standard deviation
of log returns, producing incorrect pricing and risk calculations.
stage_ids:
- implied_volatility
- id: finance-C-068
when: When input option price exceeds theoretical maximum
action: Raise PriceIsAboveMaximum exception
severity: fatal
kind: domain_rule
modality: must
consequence: An option price above the theoretical maximum violates no-arbitrage price bounds and would produce invalid IV
results that could lead to incorrect trading decisions.
stage_ids:
- implied_volatility
- id: finance-C-069
when: When input option price is below intrinsic value
action: Raise PriceIsBelowIntrinsic exception
severity: fatal
kind: domain_rule
modality: must
consequence: An option price below intrinsic value (max(S-K,0) for calls) cannot be valid since no arbitrage would allow
purchasing the option for less than its guaranteed payoff.
stage_ids:
- implied_volatility
- id: finance-C-071
when: When calculating BSM implied volatility
action: Compute forward price F = S * exp((r-q)*t) before calling lets_be_rational
severity: fatal
kind: domain_rule
modality: must
consequence: Omitting forward price calculation or using spot price instead of forward would cause IV mis-calculation
for dividend-paying assets, producing systematically biased volatility estimates.
stage_ids:
- implied_volatility
- id: finance-C-072
when: When calling the lets_be_rational IV solver
action: Undiscount the option price before passing to lets_be_rational using deflater = exp(-r*t)
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Passing a discounted price to lets_be_rational (which expects undiscounted values) would produce incorrect
IV results that diverge significantly from theoretical values.
stage_ids:
- implied_volatility
- id: finance-C-073
when: When lets_be_rational returns sentinel values
action: Translate FLOAT_MAX to PriceIsAboveMaximum and MINUS_FLOAT_MAX to PriceIsBelowIntrinsic
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Returning FLOAT_MAX or MINUS_FLOAT_MAX directly to callers would expose implementation details and violate
the API contract, causing downstream numeric overflow errors.
stage_ids:
- implied_volatility
- id: finance-C-078
when: When implementing IV calculation with time to expiration
action: Express time t in years (fraction of 365-day year) for correct annualization
severity: fatal
kind: domain_rule
modality: must
consequence: Using time in days or months without proper conversion would produce IV values that are systematically scaled,
causing all derived Greeks and hedging ratios to be incorrect.
stage_ids:
- implied_volatility
- id: finance-C-079
when: When implementing the Black-Scholes pricing function that receives d1/d2 values from reference_implementations
action: Verify the d1 formula uses (log(S/K) + (r + sigma^2/2)*t) / (sigma*sqrt(t)) and d2 uses d1 - sigma*sqrt(t) with
parameters (S, K, t, r, sigma)
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect d1/d2 values will corrupt analytical greeks calculations (delta, theta, gamma, vega, rho), producing
wrong option risk metrics that could lead to significant financial losses in trading decisions
- id: finance-C-080
when: When computing d1(S,K,t,r,sigma) for option pricing or Greeks calculation
action: Allow t (time to expiration) to be zero, which causes division by zero in sigma*sqrt(t)
severity: fatal
kind: domain_rule
modality: must_not
consequence: Division by zero when computing d1 or d2 at t=0 will raise ZeroDivisionError, crashing all dependent option
pricing and Greeks calculations
- id: finance-C-081
when: When passing pricing_function from pricing to numerical greeks
action: 'Verify the pricing_function callable accepts exactly seven parameters: (flag, S, K, t, r, sigma, b) in that order'
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Mismatched function signature will cause TypeError at runtime when numerical Greeks (delta, theta, vega,
rho, gamma) invoke the pricing function with wrong argument count
- id: finance-C-082
when: When computing forward price F = S*exp((r-q)*t) for implied volatility calculation
action: Use the helpers.forward_price function which computes S/numpy.exp(-r*t) for Black-Scholes (q=0) or the BSM formula
F = S*exp((r-q)*t) for dividend-paying stocks
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect forward price calculation will corrupt implied volatility root-finding, producing wrong IV values
that mislead trading decisions on option strikes and expirations
- id: finance-C-083
when: When computing undiscounted option price for implied volatility root-finding
action: Divide the option price by exp(-r*t) deflater to get undiscounted_option_price = price / exp(-r*t)
severity: fatal
kind: domain_rule
modality: must
consequence: Without proper deflation, implied volatility calculation will converge to wrong sigma values, causing systematic
mispricing of all options in the portfolio
- id: finance-C-084
when: When analytical Greeks functions import and use d1/d2 from reference_implementations
action: Import d1 and d2 specifically from py_vollib.ref_python.black_scholes (not compute locally) to verify formula
consistency between pricing and Greeks
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using different d1/d2 implementations in pricing vs Greeks will cause internal model inconsistency, producing
Greeks that don't match the priced options
- id: finance-C-086
when: When passing option flag parameter between stages
action: Use lowercase 'c' for call and 'p' for put; the binary_flag helper converts to 1 (call) or -1 (put) for py_lets_be_rational
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Wrong flag value will compute call price for put parameters or vice versa, causing 100% wrong pricing and
Greeks for the specified option type
- id: finance-C-087
when: When numerical Greeks performs bump-and-reprice calculations near t=0
action: Pass t=0 directly to pricing_function; instead use t=0.00001 as the floor value for theta/gamma calculations
severity: fatal
kind: domain_rule
modality: must_not
consequence: Direct t=0 will cause division by zero in d1 formula during bump-and-reprice, producing NaN or infinity values
for all numerical Greeks
- id: finance-C-092
when: When passing underlying price (S or F) between option pricing stages
action: Verify S (spot price) is used for Black-Scholes while F (futures price) is used for Black futures model; these
are mathematically distinct inputs
severity: fatal
kind: domain_rule
modality: must
consequence: Using spot where forward price is required (or vice versa) will produce wrong d1 values and corrupted pricing/Greeks
for futures options
- id: finance-C-093
when: When implementing or calling any option pricing function across Black, Black-Scholes, and Black-Scholes-Merton models
action: Pass flag parameter as 'c' (call) or 'p' (put), which will be mapped to +1/-1 via the binary_flag dictionary from
py_vollib.helpers
severity: fatal
kind: domain_rule
modality: must
consequence: Pricing and implied volatility calculations will be inverted — call prices computed as put prices and vice
versa, causing fundamentally incorrect option valuations
- id: finance-C-094
when: When calculating option theta across any model (Black, Black-Scholes, Black-Scholes-Merton)
action: Scale the theta result by dividing by 365 to convert from annualized to per-day convention
severity: fatal
kind: domain_rule
modality: must
consequence: Theta will report annual change instead of daily change — 365x larger than expected — causing incorrect time
decay calculations and risk management errors
- id: finance-C-095
when: When calculating option vega or rho across any model (Black, Black-Scholes, Black-Scholes-Merton)
action: Scale the vega/rho result by multiplying by 0.01 to report change per 1% move rather than per 100% move
severity: fatal
kind: domain_rule
modality: must
consequence: Vega and rho will report change per 100% move instead of per 1% move — 100x larger than industry-standard
convention — causing severe risk miscalculation
- id: finance-C-097
when: When using cost-of-carry parameter b to select between option pricing models
action: Set b=0 for Black futures option model, b=r for Black-Scholes stock model, b=r-q for Black-Scholes-Merton with
continuous dividend yield q
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using incorrect b value will invoke wrong pricing model — the b default values encode the model selection,
and changing them collapses three distinct models into one
- id: finance-C-099
when: When calculating forward prices across different option pricing models
action: Use S*exp(r*t) for Black-Scholes forward and S*exp((r-q)*t) for Black-Scholes-Merton forward, where q is the continuous
dividend yield
severity: fatal
kind: domain_rule
modality: must
consequence: Forward price mismatch will cause incorrect option pricing — the dividend adjustment in BSM forward formula
is essential for accuracy with dividend-paying stocks
- id: finance-C-100
when: When using py_vollib for American options pricing
action: Claim that py_vollib supports American options pricing — it implements only European option pricing models (Black,
Black-Scholes, Black-Scholes-Merton)
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Users will build American options trading systems expecting early exercise premium to be captured, but py_vollib
will return European-style prices that exclude this critical component, causing systematic undervaluation of ITM puts
and of deep ITM calls on dividend-paying underlyings
- id: finance-C-101
when: When using py_vollib for exotic options pricing
action: Claim that py_vollib supports exotic options pricing — it provides only vanilla European option pricing
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Users will incorrectly price barrier options, Asian options, binary options, or other exotics using European
formulas, producing meaningless results for non-vanilla payoffs
- id: finance-C-102
when: When using py_vollib for models requiring Monte Carlo simulation
action: Claim that py_vollib supports Monte Carlo simulation — it provides only analytical closed-form solutions
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Users will attempt to price path-dependent or high-dimensional options expecting simulation capabilities,
but py_vollib contains no Monte Carlo engine, producing no results or incorrect analytical approximations
- id: finance-C-104
when: When using py_vollib for options with early exercise features
action: Claim that py_vollib supports early exercise features — it implements only European options without exercise timing
flexibility
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Users will price American options, Bermudan options, or callable structures expecting early exercise premium,
but py_vollib returns European prices that ignore this value component, causing systematic mispricing
- id: finance-C-107
when: When using numerical Greeks calculation with the pricing_function parameter
action: Verify pricing_function signature accepts (flag, S, K, t, r, sigma, b) in that exact order — numerical Greeks
pass these arguments positionally
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Numerical Greeks will silently compute incorrect values using misaligned arguments, causing hedging ratios
and risk measures to be based on wrong sensitivities
- id: finance-C-112
when: When passing time-to-expiration parameter t to any py_vollib pricing function
action: Express time t in years (not days or months) — t=0.25 represents 3 months, t=1.0 represents 1 year
severity: fatal
kind: domain_rule
modality: must
consequence: Option prices and Greeks will be computed with incorrect time scaling — passing t=30 expecting 30 days will
actually compute for t=30 years, producing wildly incorrect and useless results
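# Illustrative sketch for finance-C-112 (not part of py_vollib): converting a day count into the
# year fraction t before calling a pricing function. The helper name and the 365-day basis are
# assumptions made for this example; uncomment to run.
#
#   def years_to_expiry(days: float, basis: float = 365.0) -> float:
#       """Return time to expiration in years, e.g. 30 days -> 30/365."""
#       if days < 0:
#           raise ValueError("days must be non-negative")
#       return days / basis
#
#   t = years_to_expiry(30)   # roughly 0.082 years, not t=30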
- id: finance-C-122
when: When configuring the cost-of-carry parameter 'b' for option pricing models
action: Set b=0 for Black model (futures), b=r for Black-Scholes (no dividends), b=r-q for BSM (continuous dividend yield)
— incorrect b value produces wrong model output
severity: fatal
kind: domain_rule
modality: must
consequence: Using b=r when b=r-q is required causes Black-Scholes pricing instead of BSM, systematically mispricing dividend-paying
equity options; hedge ratios and expected returns will be incorrect
derived_from_bd_id: BD-087
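# Illustrative sketch for finance-C-122 (not part of py_vollib): choosing the cost-of-carry b per model.
# The function name and the model labels are hypothetical; only the b values come from the rule above.
#
#   def cost_of_carry(model: str, r: float, q: float = 0.0) -> float:
#       """Return b for the given model: Black -> 0, Black-Scholes -> r, BSM -> r - q."""
#       if model == "black":
#           return 0.0
#       if model == "black_scholes":
#           return r
#       if model == "black_scholes_merton":
#           return r - q
#       raise ValueError(f"unknown model: {model!r}")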
- id: finance-C-152
when: When calculating delta for Black model (futures/forward options)
action: 'Apply Black delta formula with exp(-r*t) discount factor: call delta = N(d1)*exp(-r*t), put delta = -N(-d1)*exp(-r*t)
— do not use BS delta formula without discount factor for Black model options'
severity: fatal
kind: domain_rule
modality: must
consequence: Using BS delta without the exp(-r*t) factor overstates Black model futures option delta, causing delta hedging
in Black space to mismatch spot exposure and resulting in incorrect hedge ratios that accumulate losses over time
derived_from_bd_id: BD-055
- id: finance-C-153
when: When calculating implied volatility using the Black model
action: 'Undiscount the option price before passing to the IV solver: undiscounted = price * exp(r*t) — the Black model
operates in forward-space and requires undiscounted forward option prices as input'
severity: fatal
kind: domain_rule
modality: must
consequence: Passing discounted prices to the Black IV solver produces incorrect implied volatility because the Black
pricing function assumes forward prices, causing systematic mispricing that affects all downstream risk calculations
derived_from_bd_id: BD-047
- id: finance-C-158
when: When pricing options on futures contracts
action: Use Black (1976) futures option model with discounted price calculation — do not substitute Black-Scholes (1973)
or other equity models
severity: fatal
kind: domain_rule
modality: must
consequence: Black-Scholes assumes cost-of-carry b=r for equities; applying it to futures options, where b=0, systematically
misprices by ignoring the futures-specific discount factor, causing 1-5% pricing errors
derived_from_bd_id: BD-057
- id: finance-C-164
when: When pricing equity options without dividends using spot price
action: Use Black-Scholes (1973) model — do not confuse with Black-76 (futures) or BSM (dividend-paying stocks)
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using Black-76 for equity options ignores cost-of-carry; using BSM for non-dividend stocks adds an unnecessary
dividend yield parameter, which causes 0.1-2% pricing errors if q is set to a nonzero value
derived_from_bd_id: BD-068
- id: finance-C-170
when: When configuring cost_of_carry parameter for Black model (futures options)
action: Set b=0 for Black model because futures prices already incorporate cost of carry; verify b=r-q is NOT used as
this would double-count cost of carry
severity: fatal
kind: domain_rule
modality: must
consequence: Setting b=r-q in Black model incorrectly accounts for cost of carry that is already embedded in futures prices,
leading to systematic pricing errors for futures options
derived_from_bd_id: BD-070
- id: finance-C-174
when: When implementing option type flag conversion or using py_lets_be_rational integration
action: Use binary_flag mapping {'c':1, 'p':-1} consistently across all model variants; verify flag conversion preserves
this exact encoding when calling external libraries
severity: fatal
kind: domain_rule
modality: must
consequence: Any deviation in binary_flag mapping (e.g., {c:1, p:1} or {c:0, p:1}) causes incorrect option type detection,
producing wrong prices or IV values that lead to significant trading losses
derived_from_bd_id: BD-089
- id: finance-C-178
when: When implementing any BSM Greeks (delta, gamma, vega, theta) calculations
action: Include exp(-q*t) dividend adjustment factor in every Greek calculation; the dividend yield q must be consistent
across all Greeks, because if q is wrong, every Greek is systematically wrong by this factor
severity: fatal
kind: domain_rule
modality: must
consequence: The dividend factor cascade means errors in dividend yield q multiply across all Greeks simultaneously; for
high-dividend stocks (5% yield), each Greek sensitivity can be off by 0-5%, causing portfolio delta hedging to systematically
over or under hedge by this cumulative factor
derived_from_bd_id: BD-103
- id: finance-C-182
when: When implementing Black futures option pricing model
action: Set cost-of-carry parameter b=0 for Black model — Black applies to futures/forward options where underlying has
zero cost-of-carry
severity: fatal
kind: domain_rule
modality: must
consequence: Using Black model with b≠0 violates the fundamental futures pricing assumption and produces incorrect option
prices; the forward/futures pricing relationship assumes b=0, so prices diverge from market values
derived_from_bd_id: BD-027
- id: finance-C-183
when: When implementing Black-Scholes-Merton option pricing model for dividend-paying stocks
action: Set cost-of-carry parameter b=r-q where q represents continuous dividend yield; verify q>0 when using BSM
severity: fatal
kind: domain_rule
modality: must
consequence: Using Black-Scholes (b=r) instead of BSM (b=r-q) for dividend-paying stocks underprices puts and overprices
calls; the dividend yield reduces expected stock growth and must be reflected in the cost-of-carry
derived_from_bd_id: BD-028
regular:
- id: finance-C-001
when: When implementing ref_python functions
action: use scipy.stats.norm.cdf for cumulative normal distribution calculations
severity: high
kind: domain_rule
modality: must
consequence: Using alternative CDF implementations could cause reference implementation results to diverge from production
values, breaking the verification purpose
stage_ids:
- reference_implementations
- id: finance-C-002
when: When implementing ref_python implied volatility
action: use scipy.optimize.brentq with bounds a=1e-12, b=100, xtol=1e-15, rtol=1e-15, maxiter=1000
severity: high
kind: domain_rule
modality: must
consequence: Different root-finding parameters may cause implied volatility to fail convergence or produce different results
than expected, compromising numerical verification
stage_ids:
- reference_implementations
- id: finance-C-003
when: When implementing ref_python black pricing functions
action: validate flag parameter as 'c' or 'p' and raise exception for invalid values
severity: high
kind: domain_rule
modality: must
consequence: Invalid flag values would silently compute incorrect results, causing verification failures to go undetected
stage_ids:
- reference_implementations
- id: finance-C-004
when: When implementing ref_python d1 calculation
action: guard against zero denominator (sigma * sqrt(t) = 0) to prevent division by zero
severity: high
kind: domain_rule
modality: must
consequence: Zero denominator causes division by zero error, crashing the reference implementation and preventing any
verification
stage_ids:
- reference_implementations
- id: finance-C-006
when: When depending on ref_python for real-time calculations
action: claim ref_python provides real-time capable option pricing
severity: high
kind: claim_boundary
modality: must_not
consequence: Pure Python scipy implementation is significantly slower than production C implementation, making real-time
pricing claims false
stage_ids:
- reference_implementations
- id: finance-C-007
when: When comparing ref_python outputs with production
action: allow floating-point tolerance for numerical comparison due to different CDF implementations
severity: medium
kind: domain_rule
modality: must
consequence: scipy.stats.norm.cdf and lets_be_rational.norm_cdf may produce slightly different floating-point results,
causing false verification failures without tolerance
stage_ids:
- reference_implementations
- id: finance-C-008
when: When implementing ref_python functions
action: maintain identical function signatures to corresponding production functions
severity: high
kind: architecture_guardrail
modality: must
consequence: Mismatched signatures prevent drop-in replacement testing and automated verification, breaking the core purpose
of ref_python
stage_ids:
- reference_implementations
- id: finance-C-009
when: When implementing ref_python implied volatility
action: use brentq root-finding algorithm (not newton) for numerical verification purposes
severity: medium
kind: architecture_guardrail
modality: must
consequence: Different algorithms expose implementation differences for verification; using the same algorithm would defeat
the purpose of having a reference implementation
stage_ids:
- reference_implementations
- id: finance-C-010
when: When verifying ref_python implied volatility convergence
action: verify scipy.optimize.brentq converges for valid inputs within maxiter=1000 iterations
severity: medium
kind: operational_lesson
modality: must
consequence: brentq may fail to converge for edge cases (e.g., deep ITM options, extreme volatilities), causing verification
to fail silently
stage_ids:
- reference_implementations
- id: finance-C-011
when: When testing ref_python against production
action: expect bit-exact matches between ref_python and production outputs
severity: medium
kind: operational_lesson
modality: must_not
consequence: Different CDF implementations (scipy vs lets_be_rational) and different IV algorithms (brentq vs Newton-Raphson
with rational bounds) produce small floating-point differences
stage_ids:
- reference_implementations
- id: finance-C-013
when: When using brentq for implied volatility in ref_python
action: specify volatility search bounds of a=1e-12 (lower) and b=100 (upper) to bracket the solution
severity: high
kind: domain_rule
modality: must
consequence: If the bounds do not bracket the root (f(a) and f(b) have the same sign), brentq raises ValueError, preventing
any implied volatility calculation
stage_ids:
- reference_implementations
- id: finance-C-014
when: When implementing ref_python greeks calculations
action: apply convention scaling (theta/365, vega*0.01, rho*0.01) as documented in code comments
severity: medium
kind: domain_rule
modality: must
consequence: Greeks without proper convention scaling will differ from industry-standard values, causing verification
failures against textbook examples
stage_ids:
- reference_implementations
- id: finance-C-018
when: When pricing options on dividend-paying stocks
action: use Black-Scholes-Merton model with q parameter instead of Black-Scholes
severity: high
kind: domain_rule
modality: must
consequence: Using BS without dividend adjustment overprices calls and underprices puts on dividend-paying stocks, leading
to systematic losses when writing options or missing profitable opportunities when buying.
stage_ids:
- pricing
- id: finance-C-019
when: When verifying BSM pricing correctness
action: verify BSM with q=0 produces identical results to BS
severity: high
kind: architecture_guardrail
modality: must
consequence: If BSM(q=0) differs from BS, there is a bug in the dividend yield handling or forward price calculation that
will cause mispricing for all BSM calculations.
stage_ids:
- pricing
- id: finance-C-020
when: When calling black_scholes or black_scholes_merton
action: delegate underlying computation to py_lets_be_rational.black via undiscounted_black
severity: high
kind: architecture_guardrail
modality: must
consequence: Direct implementation without delegation to py_lets_be_rational loses the performance and precision benefits
of Jaeckel's optimized LetsBeRational algorithm, producing slower and potentially less accurate results.
stage_ids:
- pricing
- id: finance-C-024
when: When using py_vollib.ref_python for production pricing
action: use ref_python implementation for industrial or production option pricing
severity: high
kind: resource_boundary
modality: must_not
consequence: ref_python is orders of magnitude slower than the optimized py_lets_be_rational implementation and lacks
the precision guarantees, making it unsuitable for production trading systems.
stage_ids:
- pricing
- id: finance-C-025
when: When relying on py_vollib for option pricing
action: claim theoretical prices are actual market prices or guaranteed execution values
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting theoretical Black-Scholes prices as real market prices misleads stakeholders. The model assumes
constant volatility, continuous trading, and no transaction costs that don't exist in real markets.
stage_ids:
- pricing
- id: finance-C-026
when: When comparing backtested option strategy returns to live trading
action: claim backtest results will replicate in live trading without adjustment for slippage and fees
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtested option strategy returns that ignore bid-ask spreads, slippage, and execution delays will dramatically
overestimate live trading performance, leading to strategies that appear profitable in backtests but lose money in live
trading.
stage_ids:
- pricing
- id: finance-C-027
when: When pricing options near expiration with very small t
action: apply numerical stability safeguards for sqrt(t) near zero
severity: high
kind: operational_lesson
modality: must
consequence: Very small t values cause numerical instability in sqrt(t) denominator, producing extreme or NaN Greeks values
that cause downstream calculations to fail or produce meaningless results.
stage_ids:
- pricing
- id: finance-C-028
when: When verifying Black model correctness
action: validate ATM options match known benchmark values within tolerance
severity: high
kind: operational_lesson
modality: must
consequence: ATM option pricing errors indicate fundamental problems with volatility surface input, time calculation,
or discount factor application that will affect all options regardless of moneyness.
stage_ids:
- pricing
- id: finance-C-029
when: When validating put-call parity for BSM
action: verify C - P = S*exp(-q*t) - K*exp(-r*t) holds within numerical tolerance
severity: high
kind: operational_lesson
modality: must
consequence: Put-call parity violation indicates bugs in discount factor calculation, dividend handling, or forward price
conversion that will cause arbitrage opportunities in the pricing model.
stage_ids:
- pricing
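# Illustrative sketch for finance-C-029: a standalone put-call parity check. It uses a textbook BSM
# formula written with math.erf rather than py_vollib itself; the function names, sample inputs, and
# the 1e-9 tolerance are assumptions made for this example. Uncomment to run.
#
#   import math
#
#   def norm_cdf(x: float) -> float:
#       return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
#
#   def bsm_price(flag: str, S: float, K: float, t: float, r: float, sigma: float, q: float) -> float:
#       d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
#       d2 = d1 - sigma * math.sqrt(t)
#       if flag == "c":
#           return S * math.exp(-q * t) * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)
#       if flag == "p":
#           return K * math.exp(-r * t) * norm_cdf(-d2) - S * math.exp(-q * t) * norm_cdf(-d1)
#       raise ValueError("flag must be 'c' or 'p'")
#
#   S, K, t, r, sigma, q = 100.0, 95.0, 0.5, 0.03, 0.2, 0.02
#   lhs = bsm_price("c", S, K, t, r, sigma, q) - bsm_price("p", S, K, t, r, sigma, q)
#   rhs = S * math.exp(-q * t) - K * math.exp(-r * t)
#   assert abs(lhs - rhs) < 1e-9   # C - P = S*exp(-q*t) - K*exp(-r*t)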
- id: finance-C-030
when: When verifying option price monotonicity
action: verify ITM calls have higher prices than OTM calls with same expiry and volatility
severity: high
kind: operational_lesson
modality: must
consequence: Violation of call price monotonicity with respect to spot price indicates bugs in intrinsic value calculation
or forward price adjustment that create arbitrage opportunities.
stage_ids:
- pricing
- id: finance-C-031
when: When executing production trading strategies based on py_vollib
action: account for transaction costs, bid-ask spreads, and slippage not modeled by Black-Scholes
severity: high
kind: operational_lesson
modality: must
consequence: Black-Scholes assumes frictionless markets with zero transaction costs. Ignoring bid-ask spreads and fees
in live trading can turn profitable theoretical strategies into losing ones, especially for high-frequency option trading.
stage_ids:
- pricing
- id: finance-C-032
when: When implementing Black model with deep ITM options
action: use Black model without handling numerical underflow for near-zero option prices
severity: medium
kind: operational_lesson
modality: must_not
consequence: The time value of deep ITM options approaching intrinsic value may underflow to zero in floating-point calculations,
causing implied volatility calculations to fail or produce infinite values.
stage_ids:
- pricing
- id: finance-C-053
when: When computing numerical gamma for an option at expiry
action: handle the singularity by returning float('inf') when S equals K and t equals zero
severity: high
kind: domain_rule
modality: must
consequence: Without singularity handling, numerical gamma calculation will cause division by zero or NaN when time to
expiry is zero and the option is at-the-money
stage_ids:
- greeks_numerical
- id: finance-C-054
when: When computing numerical delta
action: 'enforce delta bounds: call options must yield values in [0,1] and put options must yield values in [-1,0]'
severity: high
kind: domain_rule
modality: must
consequence: Delta values outside these bounds indicate incorrect finite difference calculation or pricing function errors,
leading to incorrect hedging ratios in live trading
stage_ids:
- greeks_numerical
- id: finance-C-055
when: When computing numerical gamma for long option positions
action: produce positive gamma values for both calls and puts regardless of moneyness
severity: high
kind: domain_rule
modality: must
consequence: Negative gamma would indicate concave option pricing which violates the convexity property of option values,
leading to incorrect risk assessments and potential hedging losses
stage_ids:
- greeks_numerical
- id: finance-C-056
when: When computing numerical theta
action: return negative values for long option positions (both calls and puts) as time passes
severity: high
kind: domain_rule
modality: must
consequence: Positive theta for long options would indicate time value is increasing, which violates the fundamental principle
that options lose time value as expiry approaches
stage_ids:
- greeks_numerical
- id: finance-C-057
when: When implementing the finite difference step size for delta and gamma
action: use dS = 0.01 (1% bump) as the industry-standard symmetric finite difference step size
severity: medium
kind: resource_boundary
modality: must
consequence: Using non-standard step sizes may cause numerical instability, increase approximation error, or produce unreliable
Greeks that deviate significantly from analytical values
stage_ids:
- greeks_numerical
- id: finance-C-058
when: When adjusting the finite difference step size
action: verify step_size balances approximation accuracy against pricing_function smoothness
severity: medium
kind: resource_boundary
modality: should
consequence: Step sizes too small cause numerical noise from floating-point precision limits; step sizes too large increase
truncation error, leading to inaccurate Greeks
stage_ids:
- greeks_numerical
- id: finance-C-059
when: When computing theta for options near expiry
action: cap theta calculation at 1/365 to prevent numerical singularity
severity: high
kind: operational_lesson
modality: must
consequence: Without capping, theta approaches infinity as time to expiry approaches zero, causing overflow errors and
meaningless results in risk calculations
stage_ids:
- greeks_numerical
- id: finance-C-060
when: When computing delta at expiry
action: 'apply special boundary conditions: return 0.5 at-the-money, 1.0 for deep ITM calls, 0.0 for deep OTM calls, and
symmetric negatives for puts'
severity: high
kind: operational_lesson
modality: must
consequence: Standard central difference formula is undefined at t=0, causing incorrect delta values of 0 or NaN for at-the-money
options at expiry
stage_ids:
- greeks_numerical
- id: finance-C-061
when: When comparing numerical Greeks against analytical Greeks
action: tolerate finite difference approximation error below 0.0001 (epsilon)
severity: medium
kind: operational_lesson
modality: must
consequence: Strict equality would cause valid numerical approximations to fail, forcing unnecessary use of analytical
formulas where numerical Greeks are needed for exotic payoffs
stage_ids:
- greeks_numerical
- id: finance-C-062
when: When computing BSM numerical greeks
action: derive cost-of-carry b from r and q via b = r - q (dividend yield adjustment)
severity: high
kind: architecture_guardrail
modality: must
consequence: Incorrect cost-of-carry parameterization causes numerical Greeks to use wrong pricing model, producing Greeks
that do not match the BSM framework
stage_ids:
- greeks_numerical
- id: finance-C-063
when: When using numerical Greeks with any pricing model
action: pass pricing_function as a parameter to enable strategy pattern for generic computation
severity: high
kind: architecture_guardrail
modality: must
consequence: Hardcoding a specific pricing model prevents numerical Greeks from working with exotic options or alternative
pricing frameworks
stage_ids:
- greeks_numerical
- id: finance-C-064
when: When presenting numerical Greeks results
action: claim numerical Greeks are more accurate than analytical Greeks for standard BSM options
severity: medium
kind: claim_boundary
modality: must_not
consequence: Numerical Greeks are approximations with inherent finite difference error; presenting them as equivalent
to analytical Greeks misleads users about precision
stage_ids:
- greeks_numerical
- id: finance-C-065
when: When using numerical Greeks as primary calculation method
action: claim numerical Greeks equal analytical Greeks without specifying the approximation error tolerance
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit error tolerance disclosure, users may assume perfect accuracy and make incorrect hedging
or risk management decisions
stage_ids:
- greeks_numerical
- id: finance-C-066
when: When numerical Greeks produce infinity values
action: allow float('inf') gamma values to propagate to downstream risk calculations without warning
severity: high
kind: domain_rule
modality: must_not
consequence: Infinite gamma at expiry causes risk systems to fail or produce meaningless hedging ratios that could lead
to significant trading losses
stage_ids:
- greeks_numerical
- id: finance-C-070
when: When implementing IV calculation that must be mathematically consistent
action: 'Verify IV calculation is reversible: price(IV(price)) approximately equals original price'
severity: high
kind: domain_rule
modality: must
consequence: Non-reversible IV calculations indicate fundamental mathematical inconsistencies in the pricing model, causing
systematic pricing errors in trading strategies.
stage_ids:
- implied_volatility
- id: finance-C-075
when: When validating IV solver output bounds
action: Verify returned IV is within reasonable bounds (typically 0 < IV < 10 for most markets)
severity: high
kind: domain_rule
modality: must
consequence: An IV exceeding reasonable bounds (e.g., >10 or negative) would indicate input data errors or market anomalies
that could lead to incorrect pricing and substantial trading losses.
stage_ids:
- implied_volatility
- id: finance-C-076
when: When computing implied volatility for production trading systems
action: Claim IV results are guaranteed to converge or are suitable for real-time trading without validation
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting IV as guaranteed accurate without edge case handling would mislead traders into making decisions
based on potentially incorrect volatility estimates.
stage_ids:
- implied_volatility
- id: finance-C-077
when: When passing option prices to the IV solver
action: Use 'c' or 'p' string values for flag parameter, mapped to binary values internally
severity: high
kind: architecture_guardrail
modality: must
consequence: Passing invalid flag values or incorrect binary values directly would cause incorrect option type handling
and produce wrong IV calculations.
stage_ids:
- implied_volatility
- id: finance-C-085
when: When Black-Scholes-Merton numerical greeks compute b parameter (cost of carry)
action: For Black-Scholes (no dividends), set b = r; for BSM (continuous dividends), compute q = r - b and pass q to the
pricing function
severity: high
kind: domain_rule
modality: must
consequence: Wrong cost-of-carry parameter will misprice options and compute incorrect Greeks, leading to hedge ratios
that don't offset risk properly
- id: finance-C-088
when: When using reference_implementations for production option pricing
action: Use py_vollib.ref_python implementations in production systems; use py_vollib.black_scholes or py_vollib.black
(optimized with py_lets_be_rational) instead
severity: high
kind: resource_boundary
modality: must_not
consequence: ref_python uses pure NumPy/SciPy which is 100-1000x slower than py_lets_be_rational; production systems will
have unacceptable latency for real-time pricing
- id: finance-C-089
when: When computing analytical theta with time to expiration t
action: Divide the analytical theta formula by 365 to convert from annual to daily change (1 day = 1/365 years)
severity: high
kind: domain_rule
modality: must
consequence: Without /365 division, theta will be 365x too large, causing systematic over-hedging of time decay and incorrect
P&L attribution
- id: finance-C-090
when: When computing analytical vega with implied volatility sigma
action: Multiply the analytical vega formula by 0.01 to express sensitivity per 1% change in volatility (not per 100%
change)
severity: high
kind: domain_rule
modality: must
consequence: Without *0.01 scaling, vega will be 100x too large, causing over-sensitivity estimates and excessive volatility
hedging
- id: finance-C-091
when: When computing analytical rho with interest rate r
action: Multiply the analytical rho formula by 0.01 to express sensitivity per 1% change in interest rate
severity: high
kind: domain_rule
modality: must
consequence: Without *0.01 scaling, rho will be 100x too large, causing incorrect interest rate sensitivity estimates
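# Illustrative sketch for finance-C-089, C-090 and C-091: applying the reporting conventions to raw
# (textbook) partial derivatives. The function name is hypothetical; the factors come from the rules above.
#
#   def to_convention(raw_theta: float, raw_vega: float, raw_rho: float) -> tuple[float, float, float]:
#       """Annual theta -> per day; vega and rho per unit change -> per 1% change."""
#       return raw_theta / 365.0, raw_vega * 0.01, raw_rho * 0.01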
- id: finance-C-096
when: When calculating implied volatility and the input price violates intrinsic bounds
action: Raise PriceIsAboveMaximum when price exceeds maximum theoretical value, or PriceIsBelowIntrinsic when price is
below intrinsic value
severity: high
kind: architecture_guardrail
modality: must
consequence: Invalid implied volatility results will be silently returned without error indication, leading to downstream
calculations using mathematically impossible volatility values
- id: finance-C-098
when: When comparing delta values between Black and Black-Scholes models
action: Directly compare Black delta with Black-Scholes delta as equivalent quantities — Black delta includes discount
factor exp(-r*t) while BS delta does not
severity: high
kind: domain_rule
modality: must_not
consequence: Delta hedging strategies will use incorrect hedge ratios — Black delta and BS delta are mathematically different
definitions that cannot be compared directly
- id: finance-C-103
when: When using py_vollib for options with discrete dividend handling
action: Claim that py_vollib supports discrete dividend handling — q parameter represents continuous dividend yield, not
discrete cash dividends
severity: high
kind: claim_boundary
modality: must_not
consequence: Users will pass discrete dividend amounts expecting proper stock price adjustment, but py_vollib's BSM model
uses continuous yield approximation that fails to capture discrete dividend timing effects, causing material pricing
errors near ex-dates
- id: finance-C-105
when: When presenting or reporting option pricing or Greeks calculated by py_vollib
action: Claim that theoretical option prices and Greeks equal actual market values — py_vollib provides idealized model
outputs that assume constant volatility, continuous trading, and frictionless markets
severity: high
kind: claim_boundary
modality: must_not
consequence: Users will make trading and risk management decisions based on idealized model outputs, ignoring market microstructure,
bid-ask spreads, transaction costs, and volatility smile effects that cause real-world option prices to deviate significantly
from theoretical values
- id: finance-C-106
when: When calculating implied volatility from option prices
action: 'Verify input option price satisfies intrinsic bounds — for calls: max(0, F-K)*exp(-r*t) <= price <= F*exp(-r*t),
for puts: max(0, K-F)*exp(-r*t) <= price <= K*exp(-r*t)'
severity: high
kind: domain_rule
modality: must
consequence: Implied volatility calculation will fail or return meaningless values when price violates intrinsic bounds,
causing downstream risk calculations to use invalid volatility inputs
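# Illustrative sketch for finance-C-106: checking the intrinsic bounds before calling an IV solver.
# The function name is hypothetical and this is not py_vollib's own validation; F is the forward price.
#
#   import math
#
#   def price_within_bounds(flag: str, price: float, F: float, K: float, r: float, t: float) -> bool:
#       df = math.exp(-r * t)   # discount factor
#       if flag == "c":
#           lower, upper = max(0.0, F - K) * df, F * df
#       elif flag == "p":
#           lower, upper = max(0.0, K - F) * df, K * df
#       else:
#           raise ValueError("flag must be 'c' or 'p'")
#       return lower <= price <= upper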
- id: finance-C-108
when: When deploying py_vollib in production trading or risk management systems
action: Claim or imply that py_vollib provides production-ready, legally-compliant financial calculations — the library
explicitly disclaims all warranties and is provided 'as is' without any guarantee of fitness for any purpose
severity: high
kind: claim_boundary
modality: must_not
consequence: Users will deploy library in regulated environments expecting legally-compliant calculations, but the MIT
license explicitly disclaims all warranties and the library has not been audited for regulatory compliance, exposing
users to legal and financial liability
- id: finance-C-109
when: When using py_vollib.ref_python as the primary pricing engine
action: Use py_vollib.ref_python for production or industrial option pricing — the README explicitly states it is 'not
recommended for serious use' and is provided 'purely as a reference implementation for sanity checking'
severity: medium
kind: resource_boundary
modality: must_not
consequence: Option pricing will be orders of magnitude slower (no Numba optimization, no lets_be_rational C implementation)
and potentially less numerically stable than the production py_vollib implementation
- id: finance-C-110
when: When installing or depending on py_vollib
action: Install py_lets_be_rational as a required dependency — py_vollib's core implied volatility calculations depend
on Peter Jäckel's lets_be_rational algorithms (either C or Python implementation)
severity: high
kind: resource_boundary
modality: must
consequence: Implied volatility calculations will fail with import errors if py_lets_be_rational is not installed, breaking
any system relying on the library's core functionality
- id: finance-C-111
when: When expecting py_vollib to match textbook Black-Scholes formulas exactly
action: Expect py_vollib theta/vega/rho to match textbook formulas without adjustment; py_vollib divides theta by 365
and multiplies vega/rho by 0.01 to match industry convention, while textbooks report raw partial derivatives
severity: medium
kind: domain_rule
modality: must_not
consequence: Comparison tests will fail — py_vollib returns per-day theta and per-1%-move vega/rho while textbook formulas
return annual theta and per-100%-move vega/rho, causing apparent discrepancies that are actually convention differences
- id: finance-C-113
when: When using the framework's analytical theta calculation for backtesting
action: Change the theta division by 365 convention — the formula divides annual theta by 365 to report daily decay; removing
this division or changing the divisor produces annualized theta values incompatible with industry dashboards
severity: medium
kind: domain_rule
modality: must_not
consequence: Strategies relying on theta values for daily P&L attribution will receive annualized values (365x larger),
causing incorrect position sizing, wrong stop-loss calibration, and systematic overestimation of time decay costs
derived_from_bd_id: BD-005
- id: finance-C-114
when: When implementing Black76 pricing calculations
action: Use py_lets_be_rational.black for undiscounted_black computation — this delegates to Jaeckel's highly-optimized
C implementation for numerical accuracy and speed
severity: high
kind: architecture_guardrail
modality: must
consequence: Reimplementing Black76 in pure Python would be 100-1000x slower and risk numerical instability in edge cases
that the validated C extension handles correctly
derived_from_bd_id: BD-003
- id: finance-C-115
when: When implementing BSM delta calculation for equity options with continuous dividend yield
action: Include exp(-q*t) factor in delta calculation to capture reduced spot-price sensitivity from dividend yield
severity: high
kind: domain_rule
modality: must
consequence: Omitting the dividend yield factor causes delta to overstate the probability of finishing in-the-money, leading
to incorrect hedge ratios and potential over-hedging or under-hedging of portfolio positions
derived_from_bd_id: BD-004
- id: finance-C-116
when: When implementing or validating vega and rho calculations
action: Multiply raw vega and rho derivatives by 0.01 to convert to per-percent convention (per 1% = 100 bps change in
volatility/rate, not per unit change)
severity: high
kind: domain_rule
modality: must
consequence: Using unscaled vega/rho (per unit change in volatility or rate) makes the Greeks 100x larger than market convention expects;
strategies relying on these values will miscalculate position sizes and hedge ratios
derived_from_bd_id: BD-006
- id: finance-C-117
when: When implementing numerical Greeks calculations using finite differences
action: Use dS = 0.01 (1% relative bump) as the step size; a relative bump scales correctly across all strike prices, unlike
an absolute bump
severity: high
kind: domain_rule
modality: must
consequence: Using absolute bump (fixed dollar amount) causes incorrect finite difference approximations for strikes far
from spot price; deep OTM options may show zero sensitivity due to bump being too small relative to strike
derived_from_bd_id: BD-008
- id: finance-C-118
when: When handling implied volatility calculation boundary conditions
action: Raise PriceIsBelowIntrinsic exception when sigma reaches MINUS_FLOAT_MAX — do not return 0 or NaN as this masks
invalid state
severity: high
kind: domain_rule
modality: must
consequence: Returning 0 or NaN for mathematically impossible prices makes it impossible to distinguish between 'not yet
converged' and 'intrinsically invalid'; callers cannot implement proper validation workflows
derived_from_bd_id: BD-022
- id: finance-C-119
when: When implementing implied volatility calculation
action: Use the transformed rational guess algorithm for numerical inversion of the pricing function — do not substitute
bisection (slow convergence) or Newton-Raphson (unstable near boundaries)
severity: high
kind: domain_rule
modality: must
consequence: Using bisection causes slow convergence especially for near-expiry options; Newton-Raphson exhibits instability
at strike/maturity boundaries, leading to missed convergence or wrong IV values in edge cases
derived_from_bd_id: BD-020
- id: finance-C-120
when: When handling numerical Greeks at t=0 (expiry)
action: Implement special handling for ATM gamma at t=0 — check for inf results and return a large finite value or raise
a distinct exception rather than propagating infinity silently
severity: medium
kind: operational_lesson
modality: should
consequence: At t=0 with S=K (ATM), gamma mathematically approaches infinity; numerical approximation returns inf which
propagates through portfolio risk calculations and causes downstream systems to fail or display meaningless values
derived_from_bd_id: BD-096
- id: finance-C-121
when: When calculating theta for options with time to expiry near 1 day (t <= 1/365)
action: Verify theta calculation continuity across the t=1/365 boundary — the interaction of 365 divisor, 1/365 cap, and
0.00001 minimum timestep creates discontinuity; use consistent theta values or document the discontinuity
severity: medium
kind: operational_lesson
modality: should
consequence: Same option at t=0.0027 (just below 1/365) vs t=0.00274 (just above) shows significantly different theta
due to the discrete minimum timestep branch; backtests will show inconsistent theta exposure near end-of-day
derived_from_bd_id: BD-102
- id: finance-C-123
when: When implementing or refactoring option pricing error handling logic
action: Replace FLOAT_MAX/MINUS_FLOAT_MAX sentinel values with NaN — these extreme float values carry semantic meaning
(indicating overflow/underflow boundary) that would be lost with NaN
severity: high
kind: domain_rule
modality: must_not
consequence: Replacing sentinels with NaN loses information about which edge case occurred, preventing caller-side distinction
between positive overflow, negative overflow, and other invalid inputs; backtest results may silently propagate invalid
values
derived_from_bd_id: BD-010
- id: finance-C-124
when: When implementing theta calculation in options pricing
action: Divide theta by 365 to convert annual theta to per-calendar-day sensitivity — do not use 252 (trading days) as
theta represents calendar-time decay
severity: high
kind: domain_rule
modality: must
consequence: Using 252 trading days instead of 365 calendar days overestimates daily theta decay by about 45% (365/252 ≈ 1.45), causing strategies
that sell theta to appear more profitable in backtests than in live trading
derived_from_bd_id: BD-013
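# Illustrative arithmetic for the divisor choice (the annual theta of -7.30 is an assumed value, not taken from the source):
#   per calendar day: -7.30 / 365 = -0.0200
#   per trading day:  -7.30 / 252 ≈ -0.0290
#   ratio: 365 / 252 ≈ 1.448, so a 252 divisor inflates reported daily decay by roughly 45%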
- id: finance-C-125
when: When implementing rho calculation in options pricing
action: Multiply rho by 0.01 to express sensitivity per 1% change in interest rate — do not remove this scaling as rho
must align with percent-quoted rates
severity: high
kind: domain_rule
modality: must
consequence: Removing the 0.01 scaling factor changes rho from a per-1% to a per-unit (100 percentage point) rate change, causing
100x miscalculation of interest rate sensitivity; strategies hedging rate risk will be wildly over- or under-hedged
derived_from_bd_id: BD-015
- id: finance-C-126
when: When implementing vega calculation in options pricing
action: Multiply vega by 0.01 to express sensitivity per 1 percentage point change in implied volatility — do not remove
this scaling as market IV is quoted in percent
severity: high
kind: domain_rule
modality: must
consequence: Without the 0.01 scaling, vega reports per-decimal-volatility sensitivity instead of per-percent, causing
100x miscalculation; volatility hedging strategies will be significantly over- or under-sized
derived_from_bd_id: BD-014
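# Illustrative arithmetic for the 0.01 vega scaling (inputs are assumed: S=100, K=100, t=1, r=0.05, sigma=0.20, no dividends):
#   d1 = (ln(100/100) + (0.05 + 0.20^2 / 2) * 1) / (0.20 * sqrt(1)) = 0.35
#   raw vega = S * n(d1) * sqrt(t) ≈ 100 * 0.37524 * 1 ≈ 37.52   (price change per 1.00 change in sigma)
#   scaled vega = 37.52 * 0.01 ≈ 0.3752                           (price change per 1 percentage point of sigma)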
- id: finance-C-127
when: When using numerical delta finite difference approximation
action: Use fixed step size dS = 0.01 for finite difference approximation — do not change to relative/percentage-based
step sizes
severity: high
kind: domain_rule
modality: must
consequence: Using relative dS (percentage of spot) instead of fixed 0.01 causes different perturbation magnitudes across
strikes, introducing inconsistent numerical precision; ATM options near $100 get dS=1 while deep ITM options near $10
get dS=0.1
derived_from_bd_id: BD-023
- id: finance-C-128
when: When computing numerical theta for options near expiration (t <= 1/365)
action: Use time step dt = 0.00001 (not 1/365 or 0.001) when t < 1/365 to avoid numerical instability near expiration
severity: high
kind: domain_rule
modality: must
consequence: Using standard dt = 1/365 for near-expiry options causes theta instability with extremely large values and
discontinuous behavior at t=1/365 boundary; strategies exploiting same-day expiry theta decay will show erratic backtest
results
derived_from_bd_id: BD-025
- id: finance-C-129
when: When implementing or maintaining the greeks_numerical module
action: Preserve the minimum time step 0.00001 for numerical theta when t < 1/365 — do not use proportional dt or remove
the floor
severity: medium
kind: operational_lesson
modality: must
consequence: Removing the minimum time step floor causes negative time in finite difference calculation, producing invalid
theta values; near-expiry options will show discontinuous theta at the t=1/365 boundary
derived_from_bd_id: BD-097
- id: finance-C-130
when: When implementing or refactoring delta calculation
action: Use black_delta_with_discount for Black model and bs_delta_without_discount for Black-Scholes — do not assume
delta definitions are interchangeable between models
severity: high
kind: architecture_guardrail
modality: must
consequence: Black delta includes exp(-r*t) discount factor as it measures sensitivity to forward price; Black-Scholes
delta excludes it as it measures sensitivity to spot price; swapping functions causes 10-50% delta error for longer-dated
options
derived_from_bd_id: BD-094
- id: finance-C-131
when: When using BSM numerical greeks with cost-of-carry parameter b
action: Verify q = r - b relationship holds between cost-of-carry b and dividend yield q — do not use mismatched values
where q != r - b
severity: high
kind: domain_rule
modality: must
consequence: BSM numerical greeks compute dividend yield via q_from_b_lambda (q = r - b); if q and b are passed with inconsistent
relationship, analytical prices and numerical greeks use different dividend assumptions, producing wrong Greeks for
dividend-sensitive strategies
derived_from_bd_id: BD-095
- id: finance-C-132
when: When computing forward prices or passing F parameter across modules
action: Do not use Black-Scholes F = S*exp(r*t) formula in BSM context or BSM F = S*exp((r-q)*t) in Black-Scholes context
— verify which model interprets 'F'
severity: high
kind: architecture_guardrail
modality: must
consequence: Black-Scholes F incorporates only risk-free rate (F = S*exp(r*t)) while BSM F includes dividend yield (F
= S*exp((r-q)*t)); using wrong forward formula causes IV computation or pricing errors of 5-30% for high-dividend stocks
derived_from_bd_id: BD-098
- id: finance-C-133
when: When implementing option delta calculation at expiration (t=0)
action: 'Set delta to {call: 0.5, put: -0.5} when spot equals strike (S=K), and to {call: 1, put: 0} when spot exceeds
strike (S>K); do not evaluate the analytical formula at t=0 because it divides by sqrt(t)'
severity: high
kind: domain_rule
modality: must
consequence: Using analytical delta formula at t=0 causes division by zero and incorrect exercise boundary; for American
options, wrong delta at S>K vs S=K determines immediate exercise decisions, causing mispriced exercise strategies
derived_from_bd_id: BD-024
- id: finance-C-134
when: When calculating BSM gamma for dividend-paying stock options
action: Include exp(-q*t) factor in the gamma numerator; gamma = exp(-q*t) * n(d1) / (S * sigma * sqrt(t)), where n is
the standard normal PDF
severity: high
kind: domain_rule
modality: must
consequence: Omitting exp(-q*t) from gamma overstates gamma exposure for dividend-paying stocks; portfolios hedged with
unadjusted gamma will have residual dividend-related gamma risk, leading to hedge failures during dividend events
derived_from_bd_id: BD-030
- id: finance-C-135
when: When calculating BSM vega for dividend-paying stock options
action: Multiply vega by exp(-q*t) factor; vega = S * exp(-q*t) * n(d1) * sqrt(t), where n is the standard normal PDF (before the 0.01 per-percent scaling)
severity: high
kind: domain_rule
modality: must
consequence: Omitting exp(-q*t) from vega overstates volatility exposure for high-dividend stocks; hedges using unadjusted
vega will allocate excessive capital to volatility trades, causing systematic over-hedging losses
derived_from_bd_id: BD-031
- id: finance-C-136
when: When calculating implied volatility using the Jäckel transformed rational approximation
action: Use normalised_black with dimensionless inputs (F/K ratio and sigma*sqrt(T)) for the invariant transformation
— do not use direct pricing that lacks the time/rate invariance property
severity: high
kind: domain_rule
modality: must
consequence: Using direct pricing without normalised transformation breaks the Jäckel algorithm's strike-normalized lookup,
causing implied volatility calculations to produce inconsistent results across different maturities and interest rates
derived_from_bd_id: BD-039
- id: finance-C-137
when: When calculating BSM delta for dividend-paying stock options
action: Multiply delta by exp(-q*t) to adjust for dividend yield — delta_call = exp(-q*t) * N(d1), delta_put = -exp(-q*t)
* N(-d1)
severity: high
kind: domain_rule
modality: must
consequence: Using unadjusted BS delta for dividend-paying stocks overstates delta exposure; this causes put-call parity
violations and miscalculated hedge ratios, leading to unhedged directional risk in dividend-paying stock options
derived_from_bd_id: BD-029
- id: finance-C-138
when: When using scipy brentq for implied volatility solving in ref_python
action: Set brentq bounds to a=1e-12 (near-zero volatility floor) and b=100 (extreme high vol ceiling); these bounds
let the solver find a valid IV anywhere from near-zero volatility up to 10000% implied vol
severity: medium
kind: operational_lesson
modality: must
consequence: Changing brentq bounds to narrower values causes IV solver failures for deep out-of-money or extreme volatility
scenarios; strategies relying on these IV values will encounter runtime errors or default to incorrect volatility assumptions
derived_from_bd_id: BD-033
- id: finance-C-139
when: When testing numerical greeks against analytical greeks for delta, gamma, vega, and theta
action: Set epsilon tolerance to 0.01 (1% relative error) — this accounts for finite-difference approximation noise while
catching significant implementation bugs
severity: medium
kind: operational_lesson
modality: should
consequence: Using epsilon tighter than 0.01 causes false test failures from numerical noise; using epsilon looser than
0.01 misses real bugs in numerical greek implementations, allowing incorrect hedging values to pass testing
derived_from_bd_id: BD-034
- id: finance-C-140
when: When testing numerical rho against analytical rho
action: Set epsilon tolerance to 0.0001 (0.01% relative error); rho requires tighter tolerance than other greeks because
its small magnitude makes the default 0.01 tolerance too loose
severity: high
kind: domain_rule
modality: must
consequence: Using the default 0.01 epsilon for rho causes rho tests to pass with unacceptably large absolute errors relative
to rho's small values, masking real implementation bugs in interest rate sensitivity calculations
derived_from_bd_id: BD-035
- id: finance-C-141
when: When configuring brentq convergence tolerance for implied volatility solver
action: Set xtol=1e-15 and rtol=1e-15 for near-machine-precision results — this ensures ref_python IV matches py_lets_be_rational
within floating-point accuracy
severity: medium
kind: operational_lesson
modality: should
consequence: Loosening brentq tolerance causes IV rounding errors that propagate through risk calculations; even small
IV errors compound through delta and gamma hedges, leading to systematic hedging discrepancies in live trading
derived_from_bd_id: BD-036
- id: finance-C-142
when: When implementing or refactoring bivariate normal CDF calculations for basket options or correlation-dependent greeks
action: Use the Drezner-Wesolowsky algorithm for CBND calculation — this specific algorithm provides the required balance
of speed and accuracy for correlation sensitivity calculations
severity: high
kind: domain_rule
modality: must
consequence: Substituting alternative algorithms like Gaussian quadrature changes pricing accuracy and speed characteristics,
potentially causing basket option prices to differ from expected values by amounts that accumulate in correlation trading
strategies
derived_from_bd_id: BD-040
- id: finance-C-143
when: When implementing or refactoring standard normal CDF calculations
action: Use Abramowitz-Stegun polynomial coefficients for CND approximation with accuracy target <7.5e-8 — do not substitute
with lookup tables, rational functions, or alternative polynomial approximations
severity: high
kind: domain_rule
modality: must
consequence: Using alternative polynomial coefficients changes the CND accuracy profile, potentially causing greeks calculations
to differ from expected values especially for deep ITM/OTM options where small CDF errors amplify into significant delta/gamma
errors
derived_from_bd_id: BD-043
- id: finance-C-144
when: When calculating numerical gamma for options at expiration (t=0)
action: Return infinity when spot equals strike (S=K at t=0) and return 0 otherwise — do not use finite difference methods
at t=0 as they produce unstable results
severity: high
kind: domain_rule
modality: must
consequence: Using finite differences at t=0 produces unreliable gamma values that either oscillate wildly or collapse
to zero, making same-day expiry risk management impossible and causing strategies to misjudge their gamma exposure near
market close
derived_from_bd_id: BD-045
- id: finance-C-145
when: When calculating numerical rho using finite differences for interest rate sensitivity
action: Shift r (risk-free rate) and b (cost-of-carry) together, both by +0.01 and then both by -0.01, in a central difference; moving them
in step keeps q = r - b fixed, so the finite difference captures rate sensitivity at a constant dividend yield
severity: high
kind: domain_rule
modality: must
consequence: Shifting only one parameter changes both the discount factor and forward price simultaneously, causing numerical
rho to include forward-price sensitivity in addition to pure rate sensitivity, leading to incorrect interest rate hedging
derived_from_bd_id: BD-048
- id: finance-C-146
when: When calculating Black model rho for futures options
action: 'Use the simplified formula: rho = -t * black(flag, F, K, t, r, sigma) * 0.01 — do not use the general d1/d2 derivative
formula which is more complex and less numerically stable'
severity: high
kind: domain_rule
modality: must
consequence: Using the general d1/d2 derivative formula introduces numerical instability in rho calculation, causing futures
options to show incorrect interest rate sensitivity that leads to improper hedging and P&L attribution errors
derived_from_bd_id: BD-038
- id: finance-C-147
when: When calculating bivariate normal CDF for correlation-dependent calculations
action: 'Select CBND quadrature order based on correlation thresholds: use 6-point for |ρ|<0.3, 12-point for 0.3<=|ρ|<=0.75,
and 20-point for |ρ|>0.75 — these thresholds were empirically determined to keep maximum error below 1e-6'
severity: high
kind: operational_lesson
modality: must
consequence: Using fixed quadrature order either sacrifices speed (unnecessarily high order for low correlations) or accuracy
(insufficient order for high correlations near ±1), causing basket options to price incorrectly where correlation accuracy
matters most
derived_from_bd_id: BD-041
- id: finance-C-148
when: When evaluating cumulative normal distribution for extreme spot prices or deep ITM/OTM options
action: Cap CND evaluation at |x|>37 by returning the limiting value (0 for negative, 1 for positive) — do not evaluate
the polynomial directly as IEEE double precision loses significance beyond this threshold
severity: high
kind: operational_lesson
modality: must
consequence: Evaluating the CND polynomial without capping causes numerical overflow that propagates NaN values through
all dependent calculations, making greeks and prices undefined for extreme market conditions like market crashes or
black swan events
derived_from_bd_id: BD-042
- id: finance-C-149
when: When calculating forward prices for Black-Scholes-Merton implied volatility
action: 'Use dividend-adjusted forward formula: F = S * exp((r-q)*t) where r is risk-free rate and q is dividend yield
— do not use the simpler BS forward formula F = S * exp(r*t) which ignores dividends'
severity: high
kind: domain_rule
modality: must
consequence: Using BS forward formula without dividend adjustment causes implied volatility calculation to systematically
misinterpret option prices, resulting in IV values that are systematically biased and causing incorrect strategy signals
for dividend-paying stocks
derived_from_bd_id: BD-046
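# Illustrative arithmetic for the two forward formulas (inputs are assumed: S=100, r=0.05, q=0.03, t=1):
#   dividend-adjusted: F = 100 * exp((0.05 - 0.03) * 1) = 100 * exp(0.02) ≈ 102.02
#   unadjusted:        F = 100 * exp(0.05 * 1)          = 100 * exp(0.05) ≈ 105.13
#   the unadjusted forward is about 3% too high here, and that gap feeds directly into the implied volatility inversion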
- id: finance-C-150
when: When implementing implied volatility calculation for time-critical applications using the limited-iteration variant
action: Use the limited-iteration IV method as a default replacement for standard IV calculation — the limited-iteration
variant sacrifices 1-2 decimals of precision for ~50% speed improvement and must be explicitly opted-in for latency-critical
use cases only
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Defaulting to limited-iteration IV introduces systematic pricing errors of 1-2 decimals that propagate to
all Greeks calculations, causing risk management systems to misvalue portfolios and potentially execute hedges at incorrect
levels
derived_from_bd_id: BD-052
- id: finance-C-151
when: When converting option type flags ('c'/'p') to internal numeric representation
action: Use the binary_flag function from py_vollib.helpers for all option type conversions; do not use inline string
comparison (flag == 'c') or equivalent logic scattered across pricing functions
severity: high
kind: domain_rule
modality: must
consequence: Inline string comparisons bypass centralized validation and may introduce inconsistent flag handling across
different pricing functions, causing silent errors when option type is checked differently in various code paths
derived_from_bd_id: BD-054
- id: finance-C-154
when: When calculating theta for Black-Scholes-Merton (BSM) dividend-paying stock options
action: 'Include the dividend term in BSM theta calculation: theta adds +q*S*exp(-q*t)*N(d1) for calls and -q*S*exp(-q*t)*N(-d1) for puts
to the standard BS theta term; do not use the BS theta formula for dividend-paying stocks'
severity: high
kind: domain_rule
modality: must
consequence: BSM theta without the dividend term systematically miscalculates theta for dividend-paying stocks, causing
the model to incorrectly estimate time decay and leading to poor theta-based trading decisions and hedging errors
derived_from_bd_id: BD-051
- id: finance-C-155
when: When calculating implied volatility using the transformed rational guess algorithm
action: Verify input price is within arbitrage-free bounds before initializing the transformed rational guess search —
standard implementation limited to 50 iterations, limited-iteration variant limited to 10 iterations
severity: high
kind: domain_rule
modality: must
consequence: The transformed rational guess algorithm requires arbitrage-free bounds for valid convergence; feeding prices
outside valid bounds causes the solver to fail or converge to incorrect values, producing meaningless IV estimates
derived_from_bd_id: BD-058
- id: finance-C-156
when: When calculating vega numerically using finite difference methods
action: Use symmetric (central) finite difference with step of 0.01 in volatility space — do not use asymmetric forward
or backward differences which introduce bias
severity: high
kind: architecture_guardrail
modality: must
consequence: Asymmetric finite differences introduce systematic bias into vega calculations, causing the Greeks to mismeasure
volatility sensitivity and leading to incorrect hedge ratios and risk estimates
derived_from_bd_id: BD-064
- id: finance-C-157
when: When implementing input validation for options pricing functions
action: Apply float() conversion to each numeric input to ensure IEEE 754 double precision
severity: high
kind: domain_rule
modality: must
consequence: 'Without explicit float() conversion, integer or numpy array inputs cause silent type errors: integer division
in Python 3, array broadcasting in numpy operations, and precision loss for large integers exceeding 53-bit mantissa'
derived_from_bd_id: BD-056
- id: finance-C-159
when: When implementing implied volatility calculation using Brent's method
action: Use Brent's method with search bounds [1e-12, 100] — do not use Newton-Raphson or narrower bounds [0.01, 5]
severity: high
kind: architecture_guardrail
modality: must
consequence: Newton-Raphson diverges near boundaries causing convergence failures; narrow bounds [0.01, 5] reject options
with IV >500% commonly seen in deep ITM/OTM contracts, returning NaN instead of valid IV
derived_from_bd_id: BD-059
- id: finance-C-160
when: When computing cumulative normal distribution for options pricing
action: Apply the custom polynomial approximation to extreme z-scores beyond ±6, where its error exceeds 1e-6; restrict it
to z-scores within [-6, +6]
severity: high
kind: domain_rule
modality: must_not
consequence: Custom polynomial accuracy degrades below 1e-6 for extreme z-scores; using it for tail probabilities causes
cumulative distribution errors exceeding 1e-5, distorting delta and probability calculations
derived_from_bd_id: BD-060
- id: finance-C-161
when: When calculating bivariate normal distribution for correlation-dependent Greeks
action: Use Drezner-Wesolowsky (1990) algorithm with fixed 11x4 Gauss-Hermite quadrature nodes achieving 7-digit accuracy
severity: high
kind: architecture_guardrail
modality: must
consequence: Alternative numerical integration methods (Simpson's rule, direct integration) fail to maintain 7-digit accuracy
for tail correlations, causing errors in basket options and correlation-dependent strategies
derived_from_bd_id: BD-061
- id: finance-C-162
when: When calculating vega or rho Greeks for options
action: Scale vega and rho by 0.01 to convert from per-unit-sigma to per-1% change convention — verify output matches
industry standard reporting
severity: high
kind: domain_rule
modality: must
consequence: Without 0.01 scaling, vega reports per 100% vol move instead of per 1% move; this 100x misinterpretation
causes portfolio risk models to overstate Greek exposure by a factor of 100, leading to significant hedging errors
derived_from_bd_id: BD-066
- id: finance-C-163
when: When calculating theta for daily mark-to-market reporting
action: Divide annual theta by 365 to express as daily decay — do not use 365.25 or leave theta in annual terms
severity: medium
kind: operational_lesson
modality: must
consequence: 'Annual theta misreported as daily causes P&L attribution errors: strategy daily decay appears 0.07% higher
than actual (365 vs 365.25), distorting performance attribution for long-dated options strategies'
derived_from_bd_id: BD-067
- id: finance-C-165
when: When calculating d1 and d2 parameters for Black-Scholes or Black-76 pricing
action: Calculate d1 = [ln(F/K) + (σ²/2)T] / (σ√T) and d2 = d1 - σ√T using log-moneyness with volatility term structure
— handle special cases σ=0 or T=0 with boundary conditions
severity: high
kind: domain_rule
modality: must
consequence: Incorrect d1/d2 formula produces wrong delta and probability calculations; delta errors of 1-5% cause hedging
ratios to be miscalculated, leading to gamma scalping losses
derived_from_bd_id: BD-073
- id: finance-C-166
when: When setting up implied volatility root-finding for brentq
action: Set search bounds to [1e-12, 100] to handle both near-zero vol (1e-12 approximates zero) and extreme vol up to
10000% — do not use narrower bounds [0.01, 5]
severity: high
kind: operational_lesson
modality: must
consequence: Narrow bounds [0.01, 5] reject options with implied volatility exceeding 500%, causing implied volatility
calculation to return NaN for high-volatility regimes and deep ITM/OTM options
derived_from_bd_id: BD-080
- id: finance-C-167
when: When implementing or refactoring numerical delta calculation logic
action: Use central finite difference with step size h=0.01 to achieve O(h²) accuracy — the method requires (V(S+h) -
V(S-h)) / (2*h) formulation
severity: high
kind: domain_rule
modality: must
consequence: Using forward or backward difference with O(h) accuracy introduces systematic delta hedging errors that accumulate
over many rebalancing intervals, causing live PnL to diverge from backtested results
derived_from_bd_id: BD-062
- id: finance-C-168
when: When implementing numerical gamma calculation
action: Use central finite difference with 3-point stencil capturing curvature — the method requires (V(S+h) - 2*V(S)
+ V(S-h)) / h² formulation
severity: high
kind: domain_rule
modality: must
consequence: Using 2-point stencil or forward/backward difference fails to capture option gamma curvature, causing systematic
delta hedging errors especially near ATM where gamma is highest
derived_from_bd_id: BD-063
- id: finance-C-169
when: When configuring cost_of_carry parameter for Black-Scholes-Merton model
action: Set b = r - q for equity options where q is the continuous dividend yield; for discrete dividends, use binomial
tree or stock price adjustment method instead of continuous approximation
severity: high
kind: architecture_guardrail
modality: must
consequence: Using b = r (that is, q = 0) in the BSM model excludes the dividend yield adjustment, systematically mispricing options on dividend-paying
equities and causing hedging losses in live trading
derived_from_bd_id: BD-069
- id: finance-C-171
when: When implementing or calling analytical delta calculation for BSM model
action: 'Apply dividend yield adjustment: call delta = exp(-q*t) * N(d1), put delta = -exp(-q*t) * N(-d1); verify exp(-q*t)
factor is included for all equity options with dividend yield q'
severity: high
kind: domain_rule
modality: must
consequence: Omitting exp(-q*t) adjustment causes delta to misrepresent true equity sensitivity, leading to incorrect
hedge ratios and accumulating PnL losses in live delta-hedged positions
derived_from_bd_id: BD-072
- id: finance-C-172
when: When converting BSM implied volatility to Black implied volatility
action: 'Apply forward price adjustment: F = S * exp((r-q)*T) before calling Black IV solver; the BSM price must be converted
using undiscounted form P_undiscounted = P_BSM * exp(r*T)'
severity: high
kind: domain_rule
modality: must
consequence: Using BSM IV directly without forward adjustment creates inconsistent volatility surfaces across models,
causing incorrect vol quotes and hedge ratios when switching between Black and BSM
derived_from_bd_id: BD-074
- id: finance-C-173
when: When converting option price to undiscounted form before Black implied volatility calculation
action: 'Undo the discounting: P_undiscounted = P_discounted * exp(r*T) before calling Black IV solver; Black model expects
undiscounted option prices under F*K convention'
severity: high
kind: domain_rule
modality: must
consequence: Passing discounted price to Black IV solver causes systematic IV calculation errors since Black model expects
undiscounted prices, leading to wrong vol surfaces and hedging ratios
derived_from_bd_id: BD-085
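# Illustrative arithmetic for undiscounting (inputs are assumed: discounted call price 10.4506, r=0.05, T=1):
#   P_undiscounted = 10.4506 * exp(0.05 * 1) ≈ 10.4506 * 1.05127 ≈ 10.986
#   the Black IV solver would then be given 10.986 rather than 10.4506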
- id: finance-C-175
when: When using delta values interchangeably between Black and BSM/BS models
action: Assume Black delta and BS delta are comparable without adjustment; Black delta includes exp(-r*t) discount factor
while BS delta does not — they measure different sensitivities
severity: high
kind: domain_rule
modality: must_not
consequence: Treating Black delta as equivalent to BS delta without accounting for the discount factor causes systematic
hedging errors, as Black delta reflects forward contract sensitivity while BS delta reflects spot sensitivity
derived_from_bd_id: BD-101
- id: finance-C-176
when: When implementing the BSM d1 calculation for option Greeks
action: 'Use the cost of carry formula: d1 = (ln(S/K) + (r-q+σ²/2)*t) / (σ√t) — the q term represents dividend yield and
must be included; as q→0, reduces to Black-Scholes; as q→r, reduces to Black model'
severity: high
kind: domain_rule
modality: must
consequence: Omitting the cost-of-carry term q causes systematic errors in all Greeks calculations; for high-dividend
stocks (5% yield), errors of 0-5% per Greek propagate through delta hedging, leading to systematic over or under-hedging
in live trading
derived_from_bd_id: BD-082
- id: finance-C-177
when: When implementing put-call parity pricing logic in py_vollib.black_scholes
action: 'Maintain put-call parity via forward price conversion: C - P = S - K*exp(-rT); use this relationship to price
puts from calls (or vice versa) rather than computing independently'
severity: high
kind: domain_rule
modality: must
consequence: Violating put-call parity either creates exploitable arbitrage opportunities or indicates a fundamental pricing
error; independent put pricing without parity check leads to inconsistent prices that cannot be reconciled with arbitrage-free
market conditions
derived_from_bd_id: BD-083
- id: finance-C-179
when: When implementing gamma calculation for options approaching expiry
action: Return infinity (or a very large value) for gamma when spot price S equals strike K at expiry — must NOT clip
gamma to a finite value
severity: high
kind: domain_rule
modality: must
consequence: Clipping gamma to a finite value at S=K expiry loses critical information about hedging difficulty; at-the-money
gamma mathematically explodes to infinity at expiry, reflecting digital option payoff behavior, and clipping masks this
risk in live hedging scenarios
derived_from_bd_id: BD-081
- id: finance-C-180
when: When implementing forward price calculation logic in option pricing
action: Use S/numpy.exp(-r*t) formula for forward price calculation rather than S*exp(r*t) — this leverages numpy's handling
of negative exponents to prevent numerical underflow
severity: high
kind: domain_rule
modality: must
consequence: Using S*exp(r*t) directly can cause numerical underflow for very small rates or long tenors, producing zero
or near-zero forward prices that lead to incorrect option valuations and significant financial loss
derived_from_bd_id: BD-018
- id: finance-C-181
when: When implementing Black-Scholes option pricing model
action: Set cost-of-carry parameter b equal to risk-free rate r (b=r) — this parameterization assumes non-dividend-paying
stocks with continuous cost-of-carry equal to risk-free return
severity: high
kind: domain_rule
modality: must
consequence: Using b≠r violates Black-Scholes model assumptions, producing systematically incorrect option prices for
stocks with dividends or futures where cost-of-carry differs from risk-free rate, causing significant trading losses
derived_from_bd_id: BD-026
- id: finance-C-184
when: When computing forward price using helper functions across different option pricing models
action: 'Verify model context before computing forward price: Black/BS uses F=S/exp(-r*t) while BSM uses F=S*exp((r-q)*t);
do not use forward_price helper without specifying model type'
severity: high
kind: domain_rule
modality: must
consequence: Using wrong forward price formula cascades to incorrect d1/d2 calculations and all downstream Greeks (delta,
gamma, theta, vega, rho); wrong Greeks lead to incorrect hedging signals and position management
derived_from_bd_id: BD-105
- id: finance-C-185
when: When processing datetime values in option pricing calculations
action: Assume the framework handles timezone conversions automatically — the framework does not implement timezone annotation;
naive datetime values are treated as system local time
severity: high
kind: claim_boundary
modality: must_not
consequence: Timezone mismatches cause time-to-maturity (TTM) calculations to be incorrect, producing wrong option prices
and Greeks that diverge from actual market values
derived_from_bd_id: BD-GAP-006
- id: finance-C-186
when: When implementing reference implementations with datetime values
action: Add explicit timezone annotations to every datetime/timestamp field using pytz or zoneinfo; verify that all timestamps
are UTC-aware with timezone info attached before any calculations
severity: high
kind: domain_rule
modality: must
consequence: Without timezone annotation, datetime values are treated as naive/system local, causing TTM miscalculation
that produces incorrect option prices across different trading sessions
derived_from_bd_id: BD-GAP-006
- id: finance-C-187
when: When implementing stochastic processes or Monte Carlo simulations in option pricing
action: Assume random seeds are set consistently across all stochastic components; the framework does not enforce random
seed coverage for reproducibility
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit seed control, stochastic results are non-reproducible across runs, making backtesting validation
impossible and strategy performance evaluation unreliable
derived_from_bd_id: BD-GAP-007
- id: finance-C-188
when: When implementing stochastic processes in reference implementations
action: Set random seeds explicitly for all random number generators using np.random.seed() or np.random.default_rng(seed)
before stochastic operations; document seed values in configuration and verify full coverage across all stochastic
components
severity: high
kind: domain_rule
modality: must
consequence: Without reproducible seeds, Monte Carlo simulations produce different results on each run, making backtesting
validation impossible and strategy comparison unreliable
derived_from_bd_id: BD-GAP-007
- id: finance-C-189
when: When implementing pricing models that require numerical calibration (e.g., PDE-based pricing, finite difference
methods, calibration to market data)
action: Assume the framework provides model calibration residual tracking and convergence diagnostics — the framework
does not implement calibration diagnostics; unverified calibration may silently fail
severity: high
kind: claim_boundary
modality: must_not
consequence: Without calibration residual tracking, numerical models may converge to incorrect parameters without triggering
any error, leading to systematic pricing errors that accumulate silently across all positions
derived_from_bd_id: BD-GAP-010
- id: finance-C-190
when: When calibrating numerical pricing models to market data in production backtesting
action: 'Implement explicit calibration diagnostics: define convergence tolerance (relative residual norm < 1e-6 or absolute
residual < 1e-8), track iteration-by-iteration residual history, log convergence metrics at each iteration, and halt
calibration with explicit error if max_iterations reached without convergence tolerance met'
severity: high
kind: domain_rule
modality: must
consequence: Without explicit residual monitoring and convergence checks, calibration may converge to wrong parameters
causing 5-20% pricing errors on exotic derivatives or structured products, with errors silently propagating to P&L reports
derived_from_bd_id: BD-GAP-010
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-127 / Sphinx Documentation Configuration for py_vollib
version: v5.3
intent_keywords:
- documentation
- sphinx
- api docs
- readthedocs
- docs generation
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: no candidate field had 2-7 distinct values; all capabilities collapsed into single group
groups:
- group_id: all
name: All Capabilities
description: ''
emoji: 📦
uc_count: 1
ucs:
- uc_id: UC-101
name: Sphinx Documentation Configuration for py_vollib
short_description: Configures automated documentation generation for the py_vollib options pricing library, enabling
consistent API documentation, code examples, and cov
sample_triggers:
- documentation
- sphinx
- api docs
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-101
beginner_prompt: Try sphinx documentation configuration for py_vollib
auto_selected: true
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
- uc_id: UC-101
beginner_prompt: Try capability UC-101
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 1 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- Sphinx Documentation Configuration for py_vollib
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
- Custom Transformer + Accumulator factor with per-entity rolling state
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
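Provides a multi-strategy portfolio optimization framework supporting mean-variance, Black-Litterman, and Hierarchical Risk Parity (HRP) algorithms, with built-in comparison of multiple covariance estimation methods.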
---
name: portfolio-optimization
description: |-
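Provides a multi-strategy portfolio optimization framework supporting mean-variance, Black-Litterman, and Hierarchical Risk Parity (HRP) algorithms, with built-in comparison of multiple covariance estimation methods.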
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-093"
compiled_at: "2026-04-22T13:00:40.212744+00:00"
capability_markets: "multi-market"
capability_activities: "portfolio-analytics"
sop_version: "crystal-compilation-v6.1"
---
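# Portfolio Optimization (portfolio-optimization)
> Provides a multi-strategy portfolio optimization framework supporting mean-variance, Black-Litterman, and Hierarchical Risk Parity (HRP) algorithms, with built-in comparison of multiple covariance estimation methods.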
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (6 total)
### Risk Model Comparison Analysis (`UC-101`)
Compares multiple covariance estimation methods (sample, semicovariance, exponential, Ledoit-Wolf variants, oracle approximating) to evaluate which pr
**Triggers**: risk model comparison, covariance estimation methods, portfolio risk analysis
### Basic Mean-Variance Optimization (`UC-102`)
Constructs a minimum volatility portfolio using mean-variance optimization with CAPM-based expected returns and compares sample covariance vs Ledoit-W
**Triggers**: mean-variance optimization, minimum volatility portfolio, Efficient Frontier
### Mean-Variance Optimization with Transaction Costs (`UC-103`)
Implements advanced mean-variance optimization that accounts for broker transaction costs when rebalancing from an initial portfolio allocation, using
**Triggers**: transaction cost optimization, portfolio rebalancing, semicovariance risk
For all **6** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (14 total)
- **`AP-PORTFOLIO-ANALYTICS-001`**: Division by zero in price ratio calculations corrupts rebalancing
- **`AP-PORTFOLIO-ANALYTICS-002`**: Look-ahead bias from unshifted signal generation and position calculations
- **`AP-PORTFOLIO-ANALYTICS-003`**: Non-positive-semidefinite covariance matrix breaks CVXPY optimization
All 14 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-093. Evidence verify ratio = 34.6% and audit fail total = 40. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source of truth | Must read when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 14 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons learned | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-093` blueprint at 2026-04-22T13:00:40.212744+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Mean-Variance Optimization with Transaction Costs', 'Basic Mean-Variance Optimization', 'Risk Model Comparison Analysis', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **14**
## finance-bp-066--wealthbot (2)
### `AP-PORTFOLIO-ANALYTICS-001` — Division by zero in price ratio calculations corrupts rebalancing <sub>(high)</sub>
When price_diff is calculated as current_price divided by old_price without validating that old_price is non-zero, the result is NaN or infinity. This corrupts portfolio rebalancing calculations in wealthbot, causing incorrect buy/sell decisions based on invalid price_diff values. The same issue appears in getPricesDiff(), where a zero old_price produces NaN or infinity that propagates to all subsequent trade decisions.
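A minimal sketch of the guard this anti-pattern calls for, assuming prices arrive as plain floats. The helper name `price_ratio` is hypothetical and is not wealthbot's API; returning `None` for an undefined ratio is one possible convention.

```python
import math
from typing import Optional


def price_ratio(current_price: float, old_price: float) -> Optional[float]:
    """Return current_price / old_price, or None when the ratio is undefined.

    Hypothetical helper: a zero or non-finite input yields None so the caller
    must handle the missing value explicitly instead of trading on NaN/inf.
    """
    if old_price == 0:
        return None
    if not (math.isfinite(old_price) and math.isfinite(current_price)):
        return None
    return current_price / old_price


# The second pair has a zero old price and is rejected rather than propagated.
ratios = [price_ratio(c, o) for c, o in [(110.0, 100.0), (5.0, 0.0)]]
```

Callers can then skip or flag positions whose ratio is `None` before any rebalancing decision is made.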
### `AP-PORTFOLIO-ANALYTICS-004` — Incorrect portfolio value tracking destroys time-series integrity <sub>(high)</sub>
Updating existing ClientPortfolioValue records instead of creating new ones destroys the time-series integrity needed for billing calculations and historical reconciliation. This creates data corruption where billing calculations and historical reporting against custodian records will fail to match. Portfolio value records must be linked to parent ClientPortfolio via proper relationships to avoid orphaned records.
## finance-bp-068--xalpha (1)
### `AP-PORTFOLIO-ANALYTICS-006` — FIFO sell order violation corrupts cost basis and XIRR <sub>(high)</sub>
Processing positions out of chronological order in FIFO sell operations causes incorrect cost basis assignment, leading to inaccurate realized gains/losses and wrong XIRR calculation. Chinese funds have tiered redemption fees based on holding periods, so FIFO violations result in incorrect holding period calculation and wrong redemption fee being applied, causing direct financial loss.
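The chronological matching described above can be sketched as follows. This is an illustration under stated assumptions, not xalpha's implementation: lots are `[buy_date, units, unit_cost]` lists with ISO date strings, and `fifo_sell` is a hypothetical name.

```python
from collections import deque


def fifo_sell(lots, quantity):
    """Consume `quantity` units from the oldest lots first.

    `lots` is a deque of [buy_date, units, unit_cost] ordered oldest-first and
    is mutated in place. Returns a list of (buy_date, units_sold, unit_cost).
    """
    if quantity > sum(units for _, units, _ in lots):
        raise ValueError("sell quantity exceeds holdings")
    matched = []
    remaining = quantity
    while remaining > 0:
        buy_date, units, unit_cost = lots[0]
        take = min(units, remaining)
        matched.append((buy_date, take, unit_cost))
        remaining -= take
        if take == units:
            lots.popleft()          # lot fully consumed
        else:
            lots[0][1] = units - take  # partial fill, keep the remainder
    return matched


# Sort by buy date first so the oldest lot is always at the front.
lots = deque(sorted([["2024-03-01", 50, 1.2], ["2024-01-01", 100, 1.0]]))
matched = fifo_sell(lots, 120)
cost_basis = sum(units * cost for _, units, cost in matched)
```

Because each matched slice keeps its buy date, the holding period per slice is available for any tiered redemption fee lookup.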
## finance-bp-068--xalpha, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib (1)
### `AP-PORTFOLIO-ANALYTICS-010` — Missing DataFrame schema validation causes KeyError propagation <sub>(medium)</sub>
Passing non-DataFrame objects (numpy arrays, lists) where DataFrame is expected causes NameError, AttributeError, or TypeError in downstream pandas operations. xalpha's fundinfo.price requires specific columns (date, netvalue, totvalue, comment), PyPortfolioOpt and Riskfolio-Lib require index alignment between expected returns and covariance matrix. Missing columns cause backtest calculations to fail with NaN values or KeyError.
## finance-bp-082--stock-screener (1)
### `AP-PORTFOLIO-ANALYTICS-007` — Score validation bypass allows invalid composite calculations <sub>(medium)</sub>
Accepting scores outside the 0-100 range in screener results corrupts ranking and rating logic, causing unpredictable screening results that violate the fundamental score contract. When combined with division-by-zero guards that return 0.0 for empty screener lists, this creates unpredictable behavior where invalid scores produce wrong composite calculations and incorrect Strong Buy/Buy/Watch/Pass ratings.
## finance-bp-093--PyPortfolioOpt (1)
### `AP-PORTFOLIO-ANALYTICS-008` — Convex optimization constraints violate DCP rules <sub>(high)</sub>
Using non-convex objectives or DCP-violating expressions in CVXPY optimization causes DCPError, completely preventing portfolio optimization from running. Similarly, providing non-callable constraints or invalid bounds formats (not matching n_assets length) causes TypeError. Feasibility violations like setting target_volatility below global minimum or target_return above maximum achievable return make problems infeasible.
## finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib (1)
### `AP-PORTFOLIO-ANALYTICS-003` — Non-positive-semidefinite covariance matrix breaks CVXPY optimization <sub>(high)</sub>
Passing a non-positive-semidefinite covariance matrix to CVXPY optimization with assume_PSD=True produces incorrect results because the solver assumes validity without verification. This causes Cholesky decomposition to fail or produce garbage weights, preventing portfolio optimization from running entirely. Riskfolio-Lib and PyPortfolioOpt both require explicit PSD validation before optimization.
## finance-bp-106--pyfolio-reloaded (2)
### `AP-PORTFOLIO-ANALYTICS-005` — Allocation denominator excludes cash, corrupting portfolio composition <sub>(medium)</sub>
When allocation percentages are computed with cash excluded from the denominator, they will not sum to 100%, misrepresenting the portfolio's actual composition. Additionally, concentration metrics become artificially skewed when cash (a non-position asset) is included, producing misleading diversification assessments that could lead to inappropriate risk management decisions.
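A small sketch of an allocation calculation whose percentages sum to 100%, assuming a long-only portfolio with non-negative cash. The function name and input shape are hypothetical and do not reflect pyfolio-reloaded's API.

```python
def allocations(position_values, cash):
    """Return weights for each position and for cash, over total portfolio value.

    Assumes long-only positions; cash is part of the denominator and is
    reported as its own line so it can be excluded from concentration metrics.
    """
    total = sum(position_values.values()) + cash
    if total <= 0:
        raise ValueError("total portfolio value must be positive")
    weights = {name: value / total for name, value in position_values.items()}
    weights["cash"] = cash / total
    return weights


weights = allocations({"AAA": 6000.0, "BBB": 3000.0}, cash=1000.0)
```

Keeping cash as a separate key lets a concentration metric drop it explicitly while the reported allocation still covers the whole portfolio.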
### `AP-PORTFOLIO-ANALYTICS-009` — Transaction data corruption from missing columns and invalid dates <sub>(medium)</sub>
Extracting round trips from transactions DataFrame without validating required columns (amount, price, symbol) causes KeyError exceptions. When open_dt is not strictly less than close_dt, negative or zero duration values indicate data corruption causing incorrect holding period statistics. Similarly, non-normalized transaction timestamps cause intra-day trades to be incorrectly split across days.
## finance-bp-107--empyrical-reloaded (1)
### `AP-PORTFOLIO-ANALYTICS-011` — Wrong annualization factors distort cross-frequency metric comparison <sub>(high)</sub>
Applying incorrect annualization factors (wrong values for daily, weekly, monthly, quarterly, or yearly frequencies) produces metrics that are not comparable across return frequencies, causing invalid strategy comparisons and misallocated capital. The Sharpe ratio formula must use the correct annualization factor with the sample standard deviation (ddof=1); otherwise it produces misleading risk-adjusted return estimates.
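A minimal sketch of a frequency-aware Sharpe ratio using the sample standard deviation. The periods-per-year table holds commonly used conventions and is an assumption, not a quotation of empyrical-reloaded's constants; `annualized_sharpe` is a hypothetical name.

```python
import math
import statistics

# Assumed conventions for periods per year (not taken from the source).
PERIODS_PER_YEAR = {"daily": 252, "weekly": 52, "monthly": 12, "quarterly": 4, "yearly": 1}


def annualized_sharpe(returns, frequency, risk_free_per_period=0.0):
    """Annualized Sharpe ratio for per-period simple returns."""
    factor = PERIODS_PER_YEAR[frequency]  # KeyError on an unknown frequency
    excess = [r - risk_free_per_period for r in returns]
    vol = statistics.stdev(excess)  # sample standard deviation (ddof=1)
    if vol == 0:
        return float("nan")
    return statistics.mean(excess) / vol * math.sqrt(factor)


sharpe = annualized_sharpe([0.01, 0.02, 0.03], "monthly")
```

Looking the factor up by an explicit frequency label, rather than passing a bare number, makes a mismatch between data frequency and annualization harder to introduce silently.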
## finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit (1)
### `AP-PORTFOLIO-ANALYTICS-012` — Misaligned time series in alpha/beta calculation produces invalid factor analysis <sub>(high)</sub>
Passing returns and factor_returns to alpha_beta functions without verifying data alignment on index labels (pd.Series) or length equality (np.ndarray) produces incorrect alpha/beta values due to correlation computed between mismatched periods. Including benchmark ticker in the asset ticker list causes circular correlation producing meaningless beta values of approximately 1.0.
## finance-bp-108--finmarketpy (1)
### `AP-PORTFOLIO-ANALYTICS-013` — Forward-filling spot prices creates look-ahead bias in TRI construction <sub>(high)</sub>
Forward-filling spot prices creates look-ahead bias where future prices are used to calculate historical returns, invalidating all TRI-based backtest results. The total return index construction requires multiplicative cumulation using cumprod (not cumsum) with base value 100, as additive cumulation allows negative cumulative returns to break the index chain.
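The multiplicative cumulation with base value 100 can be sketched as below, assuming per-period simple returns are already available. The function name is hypothetical and this is not finmarketpy's code.

```python
from itertools import accumulate


def total_return_index(period_returns, base=100.0):
    """Build a total return index by compounding (1 + r) from a base value."""
    growth = accumulate((1.0 + r for r in period_returns), lambda a, b: a * b)
    return [base] + [base * g for g in growth]


# +10%, -50%, +20% compounds to 100 * 1.1 * 0.5 * 1.2.
tri = total_return_index([0.10, -0.50, 0.20])
```

Summing the same returns instead would give 100 * (1 + 0.10 - 0.50 + 0.20) = 80, which overstates the index relative to the compounded value of 66.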
## finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded (1)
### `AP-PORTFOLIO-ANALYTICS-002` — Look-ahead bias from unshifted signal generation and position calculations <sub>(high)</sub>
Generating trading signals from current-period technical indicators (RSI, moving averages) and trading on them in the same bar, without lagging the signal by one bar (for example with shift(1)), creates look-ahead bias, causing live trading returns to fall far below backtested results. Similarly, when estimating intraday positions from transactions without applying shift(1) to EOD positions, day-start positions are contaminated with end-of-day values, making results unrepresentative of actual trading.
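A minimal sketch of lagging a signal by one bar so that a decision made at the close of bar t is only applied to the return of bar t+1. It uses plain lists to stay self-contained; the helper names are hypothetical, and the first bar is filled with a flat position (0) where pandas' `Series.shift(1)` would give NaN.

```python
def lag_one_bar(signals, fill=0):
    """Shift a signal series forward by one bar (first bar gets `fill`)."""
    return [fill] + list(signals[:-1])


def strategy_returns(signals, asset_returns):
    """Per-bar strategy returns using the previous bar's signal as position."""
    positions = lag_one_bar(signals)
    return [p * r for p, r in zip(positions, asset_returns)]


signals = [1, 1, 0, 1]                     # decided at each bar's close
asset_returns = [0.01, 0.02, -0.03, 0.04]  # return realized over each bar
pnl = strategy_returns(signals, asset_returns)
```

Multiplying the unshifted signal by the same bar's return would credit the strategy with returns it could not have captured.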
## finance-bp-117--Riskfolio-Lib, finance-bp-093--PyPortfolioOpt (1)
### `AP-PORTFOLIO-ANALYTICS-014` — Unsupported solver selection breaks advanced risk calculations <sub>(medium)</sub>
Using a solver that does not support the required cone type (power cone, exponential cone) causes CVXPY to fail with SolverError; the risk function then returns None and downstream risk calculations break. CLARABEL, MOSEK, and SCS support the power cone needed for RLVaR/RLDaR calculations, while CLARABEL, MOSEK, SCS, and ECOS support the exponential cone needed for EVaR calculations. Riskfolio-Lib and PyPortfolioOpt both require careful solver selection.
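One way to avoid a late SolverError is to choose the solver up front from a preference list. The helper below is a hypothetical sketch in plain Python; it takes the list of installed solver names as an argument so that it runs without CVXPY.

```python
def pick_solver(preferred, installed):
    """Return the first name in `preferred` that also appears in `installed`."""
    available = {name.upper() for name in installed}
    for name in preferred:
        if name.upper() in available:
            return name.upper()
    raise RuntimeError(
        f"none of the preferred solvers {list(preferred)} is installed; "
        f"found: {sorted(available)}"
    )


# Prefer CLARABEL, fall back to SCS; both appear in the lists above.
solver = pick_solver(["CLARABEL", "SCS"], installed=["OSQP", "SCS"])
```

With CVXPY available, `installed` would typically come from `cvxpy.installed_solvers()`.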
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-093--PyPortfolioOpt
**Scan date**: 2026-04-22
**Stats**: 7 files, 37 classes, 0 functions, 7 stages
## Modules (7)
- [data_input_&_preprocessing](components/data_input_-_preprocessing.md): 6 classes
- [portfolio_optimization](components/portfolio_optimization.md): 11 classes
- [hierarchical_portfolio_optimization](components/hierarchical_portfolio_optimization.md): 2 classes
- [critical_line_algorithm_optimization](components/critical_line_algorithm_optimization.md): 4 classes
- [black-litterman_optimization](components/black-litterman_optimization.md): 5 classes
- [discrete_allocation](components/discrete_allocation.md): 4 classes
- [visualization](components/visualization.md): 5 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 123
fatal_constraints_count: 43
non_fatal_constraints_count: 153
use_cases_count: 6
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **6**
## `KUC-101`
**Source**: `cookbook/1-RiskReturnModels.ipynb`
Compares multiple covariance estimation methods (sample, semicovariance, exponential, Ledoit-Wolf variants, oracle approximating) to evaluate which provides the most accurate risk estimates for portfolio construction.
## `KUC-102`
**Source**: `cookbook/2-Mean-Variance-Optimisation.ipynb`
Constructs a minimum volatility portfolio using mean-variance optimization with CAPM-based expected returns and compares sample covariance vs Ledoit-Wolf shrinkage estimators.
## `KUC-103`
**Source**: `cookbook/3-Advanced-Mean-Variance-Optimisation.ipynb`
Implements advanced mean-variance optimization that accounts for broker transaction costs when rebalancing from an initial portfolio allocation, using semicovariance as the risk model.
## `KUC-104`
**Source**: `cookbook/4-Black-Litterman-Allocation.ipynb`
Combines market equilibrium prior returns with investor views using the Black-Litterman model to generate more realistic expected returns and construct portfolios that reflect both market consensus and proprietary opinions.
## `KUC-105`
**Source**: `cookbook/5-Hierarchical-Risk-Parity.ipynb`
Constructs a diversified portfolio using Hierarchical Risk Parity (HRP), which uses clustering/dendrogram analysis to group correlated assets and allocate weights without requiring covariance matrix inversion.
## `KUC-106`
**Source**: `docs/conf.py`
Sphinx documentation configuration file for building PyPortfolioOpt package documentation, not a functional portfolio strategy.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-PORTFOLIO-ANALYTICS-001` — Defensive zero-division guards with explicit handling
**From**: finance-bp-066--wealthbot, finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt · **Applicable to**: portfolio-analytics
Always guard division operations with explicit zero-value checks before executing. In price ratio calculations, filter out securities where old_price is zero before calling getPricesDiff. In composite score calculations, guard against total_weight of zero and return 0.0 for empty input lists. This prevents NaN/infinity propagation that corrupts downstream calculations and crashes pipelines.
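Both guards can be sketched in a few lines of plain Python. The function names are hypothetical and the code does not reproduce wealthbot's `getPricesDiff` or the stock-screener scoring code.

```python
def composite_score(scores, weights):
    """Weighted average that returns 0.0 for empty input or zero total weight."""
    total_weight = sum(weights)
    if not scores or total_weight == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def price_ratios(old_prices, new_prices):
    """Map symbol -> new/old ratio, skipping symbols whose old price is zero."""
    return {
        symbol: new_prices[symbol] / old
        for symbol, old in old_prices.items()
        if old != 0 and symbol in new_prices
    }


ratios = price_ratios({"AAA": 10.0, "BBB": 0.0}, {"AAA": 15.0, "BBB": 5.0})
score = composite_score([80, 60], [1, 3])
```

The zero-priced symbol is dropped before any division happens, so no NaN or infinity reaches later steps.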
## `CW-PORTFOLIO-ANALYTICS-002` — Covariance matrix positive-semidefiniteness verification
**From**: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Always verify covariance matrix is positive-semidefinite before passing to CVXPY optimization. Apply eigenvalue clipping if violated, as non-PSD matrices cause Cholesky decomposition failures. Both PyPortfolioOpt and Riskfolio-Lib enforce this constraint to prevent optimizer from finding mathematically invalid solutions or crashing entirely.
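A minimal NumPy sketch of the check and the eigenvalue clipping, assuming a small tolerance for round-off. `nearest_psd` is a hypothetical helper written for illustration; it is not the routine either library uses.

```python
import numpy as np


def nearest_psd(cov, tol=1e-10):
    """Return `cov` symmetrized, with negative eigenvalues clipped to zero if needed."""
    cov = np.asarray(cov, dtype=float)
    sym = (cov + cov.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    if eigvals.min() >= -tol:
        return sym  # already positive-semidefinite within tolerance
    clipped = np.clip(eigvals, 0.0, None)
    return eigvecs @ np.diag(clipped) @ eigvecs.T


# An inconsistent correlation pattern: A~B and B~C strongly positive, A~C strongly negative.
bad = np.array([[1.0, 0.9, -0.9], [0.9, 1.0, 0.9], [-0.9, 0.9, 1.0]])
fixed = nearest_psd(bad)
```

Clipping changes the matrix, so the repaired diagonal no longer matches the original variances exactly; whether that is acceptable depends on the use case.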
## `CW-PORTFOLIO-ANALYTICS-003` — Geometric compounding for cumulative returns
**From**: finance-bp-068--xalpha, finance-bp-106--pyfolio-reloaded, finance-bp-107--empyrical-reloaded · **Applicable to**: portfolio-analytics
Compute cumulative returns using geometric compounding via cumprod(1 + returns), never arithmetic cumulation via cumsum. An arithmetic cumulative sum ignores compounding (+10% followed by -10% sums to 0%, while the portfolio is actually down 1%), so cumulative returns diverge from actual portfolio performance, most visibly over volatile periods. This principle applies to total return index construction and any cumulative performance calculation.
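The gap between the two methods can be checked numerically with the standard library. The return series is made up for illustration.

```python
import math

returns = [0.50, -0.50]

# Arithmetic cumulation: the two moves appear to cancel out.
arithmetic_total = sum(returns)

# Geometric compounding: 1.5 * 0.5 = 0.75, i.e. a 25% loss.
geometric_total = math.prod(1.0 + r for r in returns) - 1.0
```

Here the sum reports a flat result while the compounded result is a loss of one quarter of the starting capital.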
## `CW-PORTFOLIO-ANALYTICS-004` — Temporal shift enforcement to prevent look-ahead bias
**From**: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded · **Applicable to**: portfolio-analytics
Enforce proper temporal shifting in signal generation and position calculations. Use shift(-1) for exit signals to prevent look-ahead bias, and shift(1) when estimating intraday positions from EOD data. Forward-fill carry data and backward-fill only old data gaps, never forward-fill spot prices. Violations cause live trading returns to diverge from backtested results.
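For the end-of-day case, a short pandas sketch: the position held at the start of a day is taken to be the one held at the previous close. The tickers and share counts are made up, and treating the first day as starting flat is an assumption.

```python
import pandas as pd

# Hypothetical end-of-day share counts for two made-up tickers.
eod_positions = pd.DataFrame(
    {"AAA": [100, 150, 0], "BBB": [0, 50, 50]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# shift(1) moves each row one day later: day t starts with day t-1's closing position.
# The first day has no prior close, so it is assumed to start at zero.
day_start_positions = eod_positions.shift(1).fillna(0)
```

Without the shift, the day-start figure for 2024-01-03 would already include the shares bought during that day.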
## `CW-PORTFOLIO-ANALYTICS-005` — DCP-compliant convex optimization construction
**From**: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Use only DCP-compliant convex objectives and constraints in CVXPY. Provide constraints as callable functions accepting weight variables, use valid bounds formats matching n_assets length, and verify target parameters (volatility, return) are within feasible ranges. Non-convex or infeasible problems fail with DCPError or OptimizationError, preventing optimization entirely.
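The bounds-format check can be done before any problem is built. The sketch below is a hypothetical pure-Python helper, not PyPortfolioOpt's own validation; it accepts either one `(lower, upper)` pair for all assets or one pair per asset.

```python
def normalize_bounds(bounds, n_assets):
    """Expand `bounds` to one (lower, upper) pair per asset, or raise."""
    single_pair = (
        isinstance(bounds, tuple)
        and len(bounds) == 2
        and not isinstance(bounds[0], (tuple, list))
    )
    pairs = [bounds] * n_assets if single_pair else [tuple(b) for b in bounds]
    if len(pairs) != n_assets:
        raise TypeError(f"expected {n_assets} bound pairs, got {len(pairs)}")
    for lower, upper in pairs:
        if lower > upper:
            raise ValueError(f"lower bound {lower} exceeds upper bound {upper}")
    return pairs


long_only = normalize_bounds((0, 1), n_assets=3)
per_asset = normalize_bounds([(0, 0.5), (0, 1)], n_assets=2)
```

Failing here gives a message that names the mismatch, instead of an error raised deep inside the optimizer.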
## `CW-PORTFOLIO-ANALYTICS-006` — Correct Sharpe ratio formula with risk-free rate subtraction
**From**: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit · **Applicable to**: portfolio-analytics
Calculate Sharpe ratio using (mean returns - risk_free) / std(returns) * sqrt(annualization) with sample standard deviation (ddof=1). Subtract risk-free rate from asset returns before dividing by volatility. Incorrect Sharpe ratio calculation produces misleading risk-adjusted return estimates, causing poor investment decisions based on faulty performance attribution.
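A standard-library sketch of the formula, assuming `risk_free` is expressed per period (the same frequency as the returns) and defaulting to 252 periods per year. Both defaults are assumptions, and `sharpe_ratio` here is not the empyrical-reloaded or FinanceToolkit function.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio using the sample standard deviation (ddof=1)."""
    excess = [r - risk_free for r in returns]
    vol = statistics.stdev(excess)  # divides by n - 1
    if vol == 0:
        return float("nan")  # undefined when returns do not vary
    return statistics.mean(excess) / vol * math.sqrt(periods_per_year)


daily_returns = [0.01, 0.02, -0.01, 0.03]
sharpe = sharpe_ratio(daily_returns)
sharpe_with_rf = sharpe_ratio(daily_returns, risk_free=0.001)
```

Subtracting a constant risk-free rate shifts the mean but leaves the standard deviation unchanged, so the ratio falls as the rate rises.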
## `CW-PORTFOLIO-ANALYTICS-007` — Immutable FIFO position tracking with chronological ordering
**From**: finance-bp-068--xalpha, finance-bp-066--wealthbot · **Applicable to**: portfolio-analytics
Maintain FIFO position tracking with strictly increasing date order for position entries. Use copy() function to create independent copies before mutating remtable to avoid side effects. Enforce chronological ordering in sell operations to ensure correct cost basis and holding period calculation, particularly important for funds with tiered fees by holding period.
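The copy-then-consume pattern might look like the sketch below. The lot layout (`date`, `shares`, `cost`) and the name `fifo_sell` are hypothetical and do not mirror xalpha's `remtable` structure.

```python
from copy import deepcopy
from datetime import date


def fifo_sell(lots, shares_to_sell):
    """Consume lots oldest-first; return (remaining, consumed) without mutating `lots`."""
    remaining = deepcopy(lots)  # work on an independent copy
    dates = [lot["date"] for lot in remaining]
    if any(a >= b for a, b in zip(dates, dates[1:])):
        raise ValueError("lots must be in strictly increasing date order")
    consumed = []
    left = shares_to_sell
    while left > 0:
        if not remaining:
            raise ValueError("cannot sell more shares than are held")
        lot = remaining[0]
        take = min(lot["shares"], left)
        consumed.append({"date": lot["date"], "shares": take, "cost": lot["cost"]})
        lot["shares"] -= take
        left -= take
        if lot["shares"] == 0:
            remaining.pop(0)
    return remaining, consumed


lots = [
    {"date": date(2024, 1, 1), "shares": 100, "cost": 1.0},
    {"date": date(2024, 2, 1), "shares": 50, "cost": 1.2},
]
remaining, consumed = fifo_sell(lots, 120)
```

The `consumed` list keeps each lot's purchase date, which is what a holding-period-based fee calculation would need.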
## `CW-PORTFOLIO-ANALYTICS-008` — Validation at system boundaries with descriptive errors
**From**: finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib · **Applicable to**: portfolio-analytics
Enforce validation at system boundaries with descriptive error messages. Validate that expected returns match the covariance matrix dimensions, that score values are within [0, 100] and confidence values within [0, 1], and that required DataFrame columns are present. Invalid inputs should raise ValueError with a descriptive message listing the valid options, to prevent silent failures or corrupted calculations.
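A sketch of such boundary checks in plain Python, using nested lists in place of arrays so that it stays self-contained. The function names and message wording are hypothetical.

```python
def validate_dimensions(expected_returns, cov_matrix):
    """Raise ValueError unless cov_matrix is n x n for n expected returns."""
    n = len(expected_returns)
    if len(cov_matrix) != n or any(len(row) != n for row in cov_matrix):
        raise ValueError(
            f"covariance matrix must be {n}x{n} to match {n} expected returns"
        )


def validate_range(name, value, low, high):
    """Raise ValueError unless low <= value <= high."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be within [{low}, {high}], got {value}")


validate_dimensions([0.05, 0.07], [[0.04, 0.01], [0.01, 0.09]])
validate_range("score", 87, 0, 100)
validate_range("confidence", 0.6, 0, 1)
```

Each message states the expected shape or range, so a caller can correct the input without reading the validator.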
## `CW-PORTFOLIO-ANALYTICS-009` — Decimal rounding for monetary calculations
**From**: finance-bp-068--xalpha, finance-bp-107--empyrical-reloaded · **Applicable to**: portfolio-analytics
Use Decimal with explicit rounding (myround) for each monetary calculation to avoid floating-point errors that cause share miscalculation and incorrect cost basis. This prevents rounding errors from propagating to XIRR and portfolio valuation calculations. Direct floating-point operations in financial calculations accumulate errors that become material over many transactions.
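A sketch with the standard `decimal` module. `round_money` is a hypothetical stand-in, not xalpha's `myround`; the two-decimal precision and the half-up rounding mode are assumptions.

```python
from decimal import ROUND_HALF_UP, Decimal


def round_money(value, places="0.01"):
    """Round to `places` with half-up rounding, going through str to avoid float noise."""
    return Decimal(str(value)).quantize(Decimal(places), rounding=ROUND_HALF_UP)


# Binary floats cannot represent 2.675 exactly, so built-in rounding goes down.
float_rounded = round(2.675, 2)
decimal_rounded = round_money(2.675)

# Shares bought with 1000 at a net value of 1.2345, rounded once at this step.
shares = round_money(Decimal("1000") / Decimal("1.2345"))
```

Rounding at each monetary step, rather than once at the end, keeps every intermediate figure at the precision a statement would show.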
## `CW-PORTFOLIO-ANALYTICS-010` — Cash flow sign convention enforcement
**From**: finance-bp-106--pyfolio-reloaded, finance-bp-068--xalpha · **Applicable to**: portfolio-analytics
Mark cash outflows as negative and cash inflows as positive in cftable. Incorrect cash flow signs cause NPV calculation to invert, producing negative returns for profitable trades and vice versa. Verify sum of round trip PnLs equals total realized transaction dollars to catch sign convention errors before they corrupt performance attribution.
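The effect of a flipped sign convention on NPV can be shown in a few lines. The cash flow layout `(years_from_start, amount)` and the function `npv` are hypothetical illustrations, not xalpha's `cftable` format.

```python
def npv(rate, cashflows):
    """Net present value; outflows are negative and inflows are positive."""
    return sum(amount / (1.0 + rate) ** t for t, amount in cashflows)


# Invest 1000 now (outflow), receive 1100 after one year (inflow).
correct = [(0, -1000.0), (1, 1100.0)]
# The same trade recorded with every sign inverted.
flipped = [(t, -amount) for t, amount in correct]

npv_correct = npv(0.05, correct)
npv_flipped = npv(0.05, flipped)
```

With the signs inverted, a trade that beats the 5% discount rate shows a negative NPV of the same magnitude.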
FILE:references/components/black-litterman_optimization.md
# black-litterman_optimization (5 classes)
## `BlackLittermanModel.bl_weights`
`black-litterman_optimization/blacklittermanmodel-bl-weights.py:0`
## `BlackLittermanModel.optimize`
`black-litterman_optimization/blacklittermanmodel-optimize.py:0`
## `BlackLittermanModel.idzorek_method`
`black-litterman_optimization/blacklittermanmodel-idzorek-method.py:0`
## `prior specification`
`black-litterman_optimization/prior-specification.py:0`
## `omega computation`
`black-litterman_optimization/omega-computation.py:0`
FILE:references/components/critical_line_algorithm_optimization.md
# critical_line_algorithm_optimization (4 classes)
## `CLA.max_sharpe`
`critical_line_algorithm_optimization/cla-max-sharpe.py:0`
## `CLA.min_volatility`
`critical_line_algorithm_optimization/cla-min-volatility.py:0`
## `CLA.efficient_frontier`
`critical_line_algorithm_optimization/cla-efficient-frontier.py:0`
## `frontier generation points`
`critical_line_algorithm_optimization/frontier-generation-points.py:0`
FILE:references/components/data_input_-_preprocessing.md
# data_input_&_preprocessing (6 classes)
## `mean_historical_return`
`data_input_&_preprocessing/mean-historical-return.py:0`
## `CovarianceShrinkage.__init__`
`data_input_&_preprocessing/covarianceshrinkage-init.py:0`
## `return_model`
`data_input_&_preprocessing/return-model.py:0`
## `return_model method`
`data_input_&_preprocessing/return-model-method.py:0`
## `risk_matrix method`
`data_input_&_preprocessing/risk-matrix-method.py:0`
## `ledoit_wolf shrinkage target`
`data_input_&_preprocessing/ledoit-wolf-shrinkage-target.py:0`
FILE:references/components/discrete_allocation.md
# discrete_allocation (4 classes)
## `DiscreteAllocation.greedy_portfolio`
`discrete_allocation/discreteallocation-greedy-portfolio.py:0`
## `DiscreteAllocation.lp_portfolio`
`discrete_allocation/discreteallocation-lp-portfolio.py:0`
## `allocation algorithm`
`discrete_allocation/allocation-algorithm.py:0`
## `reinvest flag`
`discrete_allocation/reinvest-flag.py:0`
FILE:references/components/hierarchical_portfolio_optimization.md
# hierarchical_portfolio_optimization (2 classes)
## `HRPOpt.optimize`
`hierarchical_portfolio_optimization/hrpopt-optimize.py:0`
## `linkage_method`
`hierarchical_portfolio_optimization/linkage-method.py:0`
FILE:references/components/portfolio_optimization.md
# portfolio_optimization (11 classes)
## `EfficientFrontier.min_volatility`
`portfolio_optimization/efficientfrontier-min-volatility.py:0`
## `EfficientFrontier.max_sharpe`
`portfolio_optimization/efficientfrontier-max-sharpe.py:0`
## `EfficientFrontier.efficient_risk`
`portfolio_optimization/efficientfrontier-efficient-risk.py:0`
## `EfficientFrontier.efficient_return`
`portfolio_optimization/efficientfrontier-efficient-return.py:0`
## `BaseConvexOptimizer.add_objective`
`portfolio_optimization/baseconvexoptimizer-add-objective.py:0`
## `BaseConvexOptimizer.add_constraint`
`portfolio_optimization/baseconvexoptimizer-add-constraint.py:0`
## `BaseOptimizer.clean_weights`
`portfolio_optimization/baseoptimizer-clean-weights.py:0`
## `portfolio_performance`
`portfolio_optimization/portfolio-performance.py:0`
## `optimization method`
`portfolio_optimization/optimization-method.py:0`
## `solver backend`
`portfolio_optimization/solver-backend.py:0`
## `custom objective function`
`portfolio_optimization/custom-objective-function.py:0`
FILE:references/components/visualization.md
# visualization (5 classes)
## `plot_efficient_frontier`
`visualization/plot-efficient-frontier.py:0`
## `plot_weights`
`visualization/plot-weights.py:0`
## `plot_dendrogram`
`visualization/plot-dendrogram.py:0`
## `plot_covariance`
`visualization/plot-covariance.py:0`
## `visualization library`
`visualization/visualization-library.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-093-v5.3
version: v6.1
blueprint_id: finance-bp-093
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:40.212744+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- multi-market
activities:
- portfolio-analytics
upgraded_from: finance-bp-093-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:23.298469+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-093--PyPortfolioOpt/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-093--PyPortfolioOpt/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-PORTFOLIO-ANALYTICS-001
title: Division by zero in price ratio calculations corrupts rebalancing
description: When calculating price_diff using current_price divided by old_price without validating old_price is non-zero,
the result is NaN or INF. This corrupts portfolio rebalancing calculations in wealthbot, causing incorrect buy/sell decisions
based on invalid prices_diff values. The same issue appears in getPricesDiff() where divide-by-zero when old_price equals
zero produces NaN/infinity that propagates to all subsequent trade decisions.
project_source: finance-bp-066--wealthbot
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-002
title: Look-ahead bias from unshifted signal generation and position calculations
description: Generating trading signals from current-period technical indicators (RSI, moving averages) without proper shift(-1)
creates look-ahead bias, causing live trading returns to fall far below backtested results. Similarly, when estimating
intraday positions from transactions without applying shift(1) to EOD positions, day-start positions are contaminated
with end-of-day values, making results unrepresentative of actual trading.
project_source: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-003
title: Non-positive-semidefinite covariance matrix breaks CVXPY optimization
description: Passing a non-positive-semidefinite covariance matrix to CVXPY optimization with assume_PSD=True produces incorrect
results because the solver assumes validity without verification. This causes Cholesky decomposition to fail or produce
garbage weights, preventing portfolio optimization from running entirely. Riskfolio-Lib and PyPortfolioOpt both require
explicit PSD validation before optimization.
project_source: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-004
title: Incorrect portfolio value tracking destroys time-series integrity
description: Updating existing ClientPortfolioValue records instead of creating new ones destroys the time-series integrity
needed for billing calculations and historical reconciliation. This creates data corruption where billing calculations
and historical reporting against custodian records will fail to match. Portfolio value records must be linked to parent
ClientPortfolio via proper relationships to avoid orphaned records.
project_source: finance-bp-066--wealthbot
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-005
title: Allocation denominator excludes cash, corrupting portfolio composition
description: When computing allocation percentages excluding cash from the denominator, portfolio allocation percentages
will not sum to 100%, misrepresenting the portfolio's actual composition. Additionally, concentration metrics become artificially
skewed when including cash (a non-position asset), producing misleading diversification assessments that could lead to
inappropriate risk management decisions.
project_source: finance-bp-106--pyfolio-reloaded
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-006
title: FIFO sell order violation corrupts cost basis and XIRR
description: Processing positions out of chronological order in FIFO sell operations causes incorrect cost basis assignment,
leading to inaccurate realized gains/losses and wrong XIRR calculation. Chinese funds have tiered redemption fees based
on holding periods, so FIFO violations result in incorrect holding period calculation and wrong redemption fee being applied,
causing direct financial loss.
project_source: finance-bp-068--xalpha
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-007
title: Score validation bypass allows invalid composite calculations
description: Accepting scores outside the 0-100 range in screener results corrupts ranking and rating logic, causing unpredictable
screening results that violate the fundamental score contract. When combined with division-by-zero guards that return
0.0 for empty screener lists, this creates unpredictable behavior where invalid scores produce wrong composite calculations
and incorrect Strong Buy/Buy/Watch/Pass ratings.
project_source: finance-bp-082--stock-screener
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-008
title: Convex optimization constraints violate DCP rules
description: Using non-convex objectives or DCP-violating expressions in CVXPY optimization causes DCPError, completely
preventing portfolio optimization from running. Similarly, providing non-callable constraints or invalid bounds formats
(not matching n_assets length) causes TypeError. Feasibility violations like setting target_volatility below global minimum
or target_return above maximum achievable return make problems infeasible.
project_source: finance-bp-093--PyPortfolioOpt
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-009
title: Transaction data corruption from missing columns and invalid dates
description: Extracting round trips from transactions DataFrame without validating required columns (amount, price, symbol)
causes KeyError exceptions. When open_dt is not strictly less than close_dt, negative or zero duration values indicate
data corruption causing incorrect holding period statistics. Similarly, non-normalized transaction timestamps cause intra-day
trades to be incorrectly split across days.
project_source: finance-bp-106--pyfolio-reloaded
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-010
title: Missing DataFrame schema validation causes KeyError propagation
description: Passing non-DataFrame objects (numpy arrays, lists) where DataFrame is expected causes NameError, AttributeError,
or TypeError in downstream pandas operations. xalpha's fundinfo.price requires specific columns (date, netvalue, totvalue,
comment), PyPortfolioOpt and Riskfolio-Lib require index alignment between expected returns and covariance matrix. Missing
columns cause backtest calculations to fail with NaN values or KeyError.
project_source: finance-bp-068--xalpha, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-011
title: Wrong annualization factors distort cross-frequency metric comparison
description: Applying incorrect annualization factors (wrong values for daily, weekly, monthly, quarterly, yearly frequencies)
produces non-comparable metrics across different return frequencies, causing invalid strategy comparisons and misallocated
capital. The Sharpe ratio formula must use correct annualization with sample standard deviation (ddof=1); otherwise it
produces misleading risk-adjusted return estimates.
project_source: finance-bp-107--empyrical-reloaded
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-012
title: Misaligned time series in alpha/beta calculation produces invalid factor analysis
description: Passing returns and factor_returns to alpha_beta functions without verifying data alignment on index labels
(pd.Series) or length equality (np.ndarray) produces incorrect alpha/beta values due to correlation computed between mismatched
periods. Including benchmark ticker in the asset ticker list causes circular correlation producing meaningless beta values
of approximately 1.0.
project_source: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-013
title: Forward-filling spot prices creates look-ahead bias in TRI construction
description: Forward-filling spot prices creates look-ahead bias where future prices are used to calculate historical returns,
invalidating all TRI-based backtest results. The total return index construction requires multiplicative cumulation using
cumprod (not cumsum) with base value 100, as additive cumulation allows negative cumulative returns to break the index
chain.
project_source: finance-bp-108--finmarketpy
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
- id: AP-PORTFOLIO-ANALYTICS-014
title: Unsupported solver selection breaks advanced risk calculations
description: Using a solver that does not support the required cone type (power cone, exponential cone) causes CVXPY to fail
with SolverError; the risk function then returns None and downstream risk calculations break. CLARABEL, MOSEK, and SCS support
the power cone needed for RLVaR/RLDaR calculations, while CLARABEL, MOSEK, SCS, and ECOS support the exponential cone needed
for EVaR calculations. Riskfolio-Lib and PyPortfolioOpt both require careful solver selection.
project_source: finance-bp-117--Riskfolio-Lib, finance-bp-093--PyPortfolioOpt
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- portfolio-analytics
_source_file: anti-patterns/portfolio-analytics.yaml
cross_project_wisdom:
- wisdom_id: CW-PORTFOLIO-ANALYTICS-001
source_project: finance-bp-066--wealthbot, finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt
pattern_name: Defensive zero-division guards with explicit handling
description: Always guard division operations with explicit zero-value checks before executing. In price ratio calculations,
filter out securities where old_price is zero before calling getPricesDiff. In composite score calculations, guard against
total_weight of zero and return 0.0 for empty input lists. This prevents NaN/infinity propagation that corrupts downstream
calculations and crashes pipelines.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-002
source_project: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: Covariance matrix positive-semidefiniteness verification
description: Always verify covariance matrix is positive-semidefinite before passing to CVXPY optimization. Apply eigenvalue
clipping if violated, as non-PSD matrices cause Cholesky decomposition failures. Both PyPortfolioOpt and Riskfolio-Lib
enforce this constraint to prevent optimizer from finding mathematically invalid solutions or crashing entirely.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-003
source_project: finance-bp-068--xalpha, finance-bp-106--pyfolio-reloaded, finance-bp-107--empyrical-reloaded
pattern_name: Geometric compounding for cumulative returns
description: Compute cumulative returns using geometric compounding via cumprod(1 + returns), never arithmetic cumulation
via cumsum. An arithmetic cumulative sum ignores compounding (+10% followed by -10% sums to 0%, while the portfolio is
actually down 1%), so cumulative returns diverge from actual portfolio performance, most visibly over volatile periods.
This principle applies to total return index construction and
any cumulative performance calculation.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-004
source_project: finance-bp-108--finmarketpy, finance-bp-106--pyfolio-reloaded
pattern_name: Temporal shift enforcement to prevent look-ahead bias
description: Enforce proper temporal shifting in signal generation and position calculations. Use shift(-1) for exit signals
to prevent look-ahead bias, and shift(1) when estimating intraday positions from EOD data. Forward-fill carry data and
backward-fill only old data gaps, never forward-fill spot prices. Violations cause live trading returns to diverge from
backtested results.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-005
source_project: finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: DCP-compliant convex optimization construction
description: Use only DCP-compliant convex objectives and constraints in CVXPY. Provide constraints as callable functions
accepting weight variables, use valid bounds formats matching n_assets length, and verify target parameters (volatility,
return) are within feasible ranges. Non-convex or infeasible problems fail with DCPError or OptimizationError, preventing
optimization entirely.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-006
source_project: finance-bp-107--empyrical-reloaded, finance-bp-118--FinanceToolkit
pattern_name: Correct Sharpe ratio formula with risk-free rate subtraction
description: Calculate Sharpe ratio using (mean returns - risk_free) / std(returns) * sqrt(annualization) with sample standard
deviation (ddof=1). Subtract risk-free rate from asset returns before dividing by volatility. Incorrect Sharpe ratio calculation
produces misleading risk-adjusted return estimates, causing poor investment decisions based on faulty performance attribution.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
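# Illustrative sketch of the Sharpe formula above (not taken from the cited projects). It assumes `returns`
# is a pandas Series of per-period simple returns, `risk_free` is the per-period risk-free rate, and
# `periods_per_year` is the annualization factor (for example 252 for daily data); all names are hypothetical.
#   import numpy as np
#   sharpe = (returns.mean() - risk_free) / returns.std(ddof=1) * np.sqrt(periods_per_year)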
- wisdom_id: CW-PORTFOLIO-ANALYTICS-007
source_project: finance-bp-068--xalpha, finance-bp-066--wealthbot
pattern_name: Immutable FIFO position tracking with chronological ordering
description: Maintain FIFO position tracking with strictly increasing date order for position entries. Use copy()
to create independent copies before mutating remtable, avoiding side effects. Enforce chronological ordering in sell operations
to ensure correct cost basis and holding period calculation, which is particularly important for funds with fees tiered by
holding period.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-008
source_project: finance-bp-082--stock-screener, finance-bp-093--PyPortfolioOpt, finance-bp-117--Riskfolio-Lib
pattern_name: Validation at system boundaries with descriptive errors
description: Enforce validation at system boundaries with descriptive error messages. Validate that expected returns match
the covariance matrix dimensions, that score values are within [0, 100] and confidence values within [0, 1], and that required
DataFrame columns are present. Invalid inputs should raise ValueError with descriptive messages listing valid options to prevent
silent failures or corrupted calculations.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
- wisdom_id: CW-PORTFOLIO-ANALYTICS-009
source_project: finance-bp-068--xalpha, finance-bp-107--empyrical-reloaded
pattern_name: Decimal rounding for monetary calculations
description: Use Decimal with explicit rounding (myround) for each monetary calculation to avoid floating-point errors that
cause share miscalculation and incorrect cost basis. This prevents rounding errors from propagating to XIRR and portfolio
valuation calculations. Direct floating-point operations in financial calculations accumulate errors that become material
over many transactions.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
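# Illustrative sketch using Python's standard decimal module. The `myround` helper named above belongs to the
# source project and is not reproduced here; rounding to 2 places with ROUND_HALF_UP is an assumption.
#   from decimal import Decimal, ROUND_HALF_UP
#   amount = Decimal("1000")                   # construct from strings, not floats
#   price = Decimal("1.2345")
#   shares = (amount / price).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)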
- wisdom_id: CW-PORTFOLIO-ANALYTICS-010
source_project: finance-bp-106--pyfolio-reloaded, finance-bp-068--xalpha
pattern_name: Cash flow sign convention enforcement
description: Mark cash outflows as negative and cash inflows as positive in cftable. Incorrect cash flow signs cause NPV
calculation to invert, producing negative returns for profitable trades and vice versa. Verify sum of round trip PnLs
equals total realized transaction dollars to catch sign convention errors before they corrupt performance attribution.
applicable_to_activity: portfolio-analytics
_source_file: cross-project-wisdom/portfolio-analytics.yaml
domain_constraints_injected: []
resources_injected: {}
known_use_cases:
- kuc_id: KUC-101
source_file: cookbook/1-RiskReturnModels.ipynb
business_problem: Compares multiple covariance estimation methods (sample, semicovariance, exponential, Ledoit-Wolf variants,
oracle approximating) to evaluate which provides the most accurate risk estimates for portfolio construction.
intent_keywords:
- risk model comparison
- covariance estimation methods
- portfolio risk analysis
- Ledoit-Wolf shrinkage
- risk matrix computation
stage: factor_computation
data_domain: market_data
type: research_analysis
- kuc_id: KUC-102
source_file: cookbook/2-Mean-Variance-Optimisation.ipynb
business_problem: Constructs a minimum volatility portfolio using mean-variance optimization with CAPM-based expected returns
and compares sample covariance vs Ledoit-Wolf shrinkage estimators.
intent_keywords:
- mean-variance optimization
- minimum volatility portfolio
- Efficient Frontier
- CAPM returns
- portfolio construction
stage: portfolio_optimization
data_domain: market_data
type: trading_strategy
- kuc_id: KUC-103
source_file: cookbook/3-Advanced-Mean-Variance-Optimisation.ipynb
business_problem: Implements advanced mean-variance optimization that accounts for broker transaction costs when rebalancing
from an initial portfolio allocation, using semicovariance as the risk model.
intent_keywords:
- transaction cost optimization
- portfolio rebalancing
- semicovariance risk
- minimum volatility with costs
- initial portfolio adjustment
stage: portfolio_optimization
data_domain: market_data
type: trading_strategy
- kuc_id: KUC-104
source_file: cookbook/4-Black-Litterman-Allocation.ipynb
business_problem: Combines market equilibrium prior returns with investor views using the Black-Litterman model to generate
more realistic expected returns and construct portfolios that reflect both market consensus and proprietary opinions.
intent_keywords:
- Black-Litterman model
- investor views integration
- market implied prior returns
- Bayesian portfolio allocation
- equilibrium returns
stage: portfolio_optimization
data_domain: market_data
type: trading_strategy
- kuc_id: KUC-105
source_file: cookbook/5-Hierarchical-Risk-Parity.ipynb
business_problem: Constructs a diversified portfolio using Hierarchical Risk Parity (HRP), which uses clustering/dendrogram
analysis to group correlated assets and allocate weights without requiring covariance matrix inversion.
intent_keywords:
- Hierarchical Risk Parity
- HRP optimization
- dendrogram clustering
- diversified portfolio
- correlation-based allocation
stage: portfolio_optimization
data_domain: market_data
type: trading_strategy
- kuc_id: KUC-106
source_file: docs/conf.py
business_problem: Sphinx documentation configuration file for building PyPortfolioOpt package documentation, not a functional
portfolio strategy.
intent_keywords:
- documentation setup
- Sphinx configuration
- package documentation
- doc build
- autodoc
stage: data_collection
data_domain: mixed
type: extension_example
component_capability_map:
project: finance-bp-093--PyPortfolioOpt
scan_date: '2026-04-22'
stats:
total_files: 7
total_classes: 37
total_functions: 0
total_stages: 7
modules:
data_input_&_preprocessing:
class_count: 6
stage_id: data_input
stage_order: 1
responsibility: Converts raw price data into returns and estimates expected returns/covariance matrices for optimization.
Normalizes to annual frequency, assuming 252 trading days per year.
classes:
- name: mean_historical_return
file: data_input_&_preprocessing/mean-historical-return.py
line: 0
kind: required_method
signature: ''
- name: CovarianceShrinkage.__init__
file: data_input_&_preprocessing/covarianceshrinkage-init.py
line: 0
kind: required_method
signature: ''
- name: return_model
file: data_input_&_preprocessing/return-model.py
line: 0
kind: required_method
signature: ''
- name: return_model method
file: data_input_&_preprocessing/return-model-method.py
line: 0
kind: replaceable_point
- name: risk_matrix method
file: data_input_&_preprocessing/risk-matrix-method.py
line: 0
kind: replaceable_point
- name: ledoit_wolf shrinkage target
file: data_input_&_preprocessing/ledoit-wolf-shrinkage-target.py
line: 0
kind: replaceable_point
design_decision_count: 3
portfolio_optimization:
class_count: 11
stage_id: portfolio_optimization
stage_order: 2
responsibility: Finds optimal asset weights based on mean-variance or alternative risk objectives using convex optimization
via cvxpy. Core optimization engine of the library.
classes:
- name: EfficientFrontier.min_volatility
file: portfolio_optimization/efficientfrontier-min-volatility.py
line: 0
kind: required_method
signature: ''
- name: EfficientFrontier.max_sharpe
file: portfolio_optimization/efficientfrontier-max-sharpe.py
line: 0
kind: required_method
signature: ''
- name: EfficientFrontier.efficient_risk
file: portfolio_optimization/efficientfrontier-efficient-risk.py
line: 0
kind: required_method
signature: ''
- name: EfficientFrontier.efficient_return
file: portfolio_optimization/efficientfrontier-efficient-return.py
line: 0
kind: required_method
signature: ''
- name: BaseConvexOptimizer.add_objective
file: portfolio_optimization/baseconvexoptimizer-add-objective.py
line: 0
kind: required_method
signature: ''
- name: BaseConvexOptimizer.add_constraint
file: portfolio_optimization/baseconvexoptimizer-add-constraint.py
line: 0
kind: required_method
signature: ''
- name: BaseOptimizer.clean_weights
file: portfolio_optimization/baseoptimizer-clean-weights.py
line: 0
kind: required_method
signature: ''
- name: portfolio_performance
file: portfolio_optimization/portfolio-performance.py
line: 0
kind: required_method
signature: ''
- name: optimization method
file: portfolio_optimization/optimization-method.py
line: 0
kind: replaceable_point
- name: solver backend
file: portfolio_optimization/solver-backend.py
line: 0
kind: replaceable_point
- name: custom objective function
file: portfolio_optimization/custom-objective-function.py
line: 0
kind: replaceable_point
design_decision_count: 6
hierarchical_portfolio_optimization:
class_count: 2
stage_id: hierarchical_optimization
stage_order: 3
responsibility: Non-convex optimization using hierarchical clustering to avoid estimation error in traditional mean-variance.
Uses scipy for clustering with inverse-variance weighting within clusters.
classes:
- name: HRPOpt.optimize
file: hierarchical_portfolio_optimization/hrpopt-optimize.py
line: 0
kind: required_method
signature: ''
- name: linkage_method
file: hierarchical_portfolio_optimization/linkage-method.py
line: 0
kind: replaceable_point
design_decision_count: 2
critical_line_algorithm_optimization:
class_count: 4
stage_id: cla_optimization
stage_order: 4
responsibility: Alternative mean-variance optimizer using the Critical Line Algorithm (CLA), following de Prado's implementation.
Generates the full efficient frontier efficiently in pure numpy, without a cvxpy dependency.
classes:
- name: CLA.max_sharpe
file: critical_line_algorithm_optimization/cla-max-sharpe.py
line: 0
kind: required_method
signature: ''
- name: CLA.min_volatility
file: critical_line_algorithm_optimization/cla-min-volatility.py
line: 0
kind: required_method
signature: ''
- name: CLA.efficient_frontier
file: critical_line_algorithm_optimization/cla-efficient-frontier.py
line: 0
kind: required_method
signature: ''
- name: frontier generation points
file: critical_line_algorithm_optimization/frontier-generation-points.py
line: 0
kind: replaceable_point
design_decision_count: 2
black-litterman_optimization:
class_count: 5
stage_id: black_litterman
stage_order: 5
responsibility: Bayesian approach combining market equilibrium prior with investor views. Reduces extreme weights from
traditional MVO by blending views with implied equilibrium returns.
classes:
- name: BlackLittermanModel.bl_weights
file: black-litterman_optimization/blacklittermanmodel-bl-weights.py
line: 0
kind: required_method
signature: ''
- name: BlackLittermanModel.optimize
file: black-litterman_optimization/blacklittermanmodel-optimize.py
line: 0
kind: required_method
signature: ''
- name: BlackLittermanModel.idzorek_method
file: black-litterman_optimization/blacklittermanmodel-idzorek-method.py
line: 0
kind: required_method
signature: ''
- name: prior specification
file: black-litterman_optimization/prior-specification.py
line: 0
kind: replaceable_point
- name: omega computation
file: black-litterman_optimization/omega-computation.py
line: 0
kind: replaceable_point
design_decision_count: 3
discrete_allocation:
class_count: 4
stage_id: discrete_allocation
stage_order: 6
responsibility: Converts continuous weights from optimizer into actual share counts for trading. Bridges optimization
theory to execution by handling rounding and lot sizes.
classes:
- name: DiscreteAllocation.greedy_portfolio
file: discrete_allocation/discreteallocation-greedy-portfolio.py
line: 0
kind: required_method
signature: ''
- name: DiscreteAllocation.lp_portfolio
file: discrete_allocation/discreteallocation-lp-portfolio.py
line: 0
kind: required_method
signature: ''
- name: allocation algorithm
file: discrete_allocation/allocation-algorithm.py
line: 0
kind: replaceable_point
- name: reinvest flag
file: discrete_allocation/reinvest-flag.py
line: 0
kind: replaceable_point
design_decision_count: 3
visualization:
class_count: 5
stage_id: visualization
stage_order: 7
responsibility: Plots efficient frontiers, dendrograms, and weight allocations. Supports both matplotlib and plotly
backends with lazy imports to avoid hard dependencies.
classes:
- name: plot_efficient_frontier
file: visualization/plot-efficient-frontier.py
line: 0
kind: required_method
signature: ''
- name: plot_weights
file: visualization/plot-weights.py
line: 0
kind: required_method
signature: ''
- name: plot_dendrogram
file: visualization/plot-dendrogram.py
line: 0
kind: required_method
signature: ''
- name: plot_covariance
file: visualization/plot-covariance.py
line: 0
kind: required_method
signature: ''
- name: visualization library
file: visualization/visualization-library.py
line: 0
kind: replaceable_point
design_decision_count: 2
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.34615384615384615
evidence_invalid: 68
evidence_verified: 36
evidence_auto_fixed: 0
audit_coverage: 61/61 (100%)
audit_pass_rate: 1/61 (1%)
audit_fail_total: 40
audit_finance_universal:
pass: 0
warn: 6
fail: 14
audit_subdomain_totals:
pass: 1
warn: 14
fail: 26
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-093. Evidence verify ratio
= 34.6% and audit fail total = 40. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-093-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc:
- UC-101
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries:
- uc_id: UC-101
name: Risk Model Comparison Analysis
positive_terms:
- risk model comparison
- covariance estimation methods
- portfolio risk analysis
- Ledoit-Wolf shrinkage
- risk matrix computation
data_domain: market_data
negative_terms:
- portfolio optimization
- Black-Litterman
- Hierarchical Risk Parity
- minimum variance
- expected returns
ambiguity_question: Are you comparing different risk/covariance estimation techniques, or are you trying to construct
an optimized portfolio?
- uc_id: UC-102
name: Basic Mean-Variance Optimization
positive_terms:
- mean-variance optimization
- minimum volatility portfolio
- Efficient Frontier
- CAPM returns
- portfolio construction
data_domain: market_data
negative_terms:
- Black-Litterman views
- Hierarchical Risk Parity
- transaction costs
- risk model comparison
- screening criteria
ambiguity_question: Do you want a basic minimum variance portfolio using standard mean-variance, or are you incorporating
market views/transaction costs?
- uc_id: UC-103
name: Mean-Variance Optimization with Transaction Costs
positive_terms:
- transaction cost optimization
- portfolio rebalancing
- semicovariance risk
- minimum volatility with costs
- initial portfolio adjustment
data_domain: market_data
negative_terms:
- Black-Litterman
- Hierarchical Risk Parity
- risk model comparison
- screening
- live trading execution
ambiguity_question: Are you optimizing a new portfolio from scratch, or rebalancing from an existing allocation with transaction
cost considerations?
- uc_id: UC-104
name: Black-Litterman Bayesian Portfolio Allocation
positive_terms:
- Black-Litterman model
- investor views integration
- market implied prior returns
- Bayesian portfolio allocation
- equilibrium returns
data_domain: market_data
negative_terms:
- Hierarchical Risk Parity
- mean-variance without views
- risk model comparison
- transaction costs
- pure quantitative screening
ambiguity_question: Do you have specific market views/beliefs you want to incorporate, or are you looking for an unconstrained
optimization approach?
- uc_id: UC-105
name: Hierarchical Risk Parity Portfolio
positive_terms:
- Hierarchical Risk Parity
- HRP optimization
- dendrogram clustering
- diversified portfolio
- correlation-based allocation
data_domain: market_data
negative_terms:
- Mean-Variance optimization
- Black-Litterman
- minimum variance
- risk model comparison
- CAPM expected returns
ambiguity_question: Do you want a machine learning/clustering-based approach (HRP), or a traditional mean-variance/Black-Litterman
optimization?
- uc_id: UC-106
name: PyPortfolioOpt Documentation Build Configuration
positive_terms:
- documentation setup
- Sphinx configuration
- package documentation
- doc build
- autodoc
data_domain: mixed
negative_terms:
- portfolio optimization
- risk models
- trading strategy
- market data
- factor computation
ambiguity_question: This is a documentation/infrastructure file, not a use case for portfolio construction. Are you looking
for actual portfolio strategies?
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 123
fatal_constraints_count: 43
non_fatal_constraints_count: 153
use_cases_count: 6
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'Don''t forget SL-11: a Recorder subclass MUST declare both the provider and data_schema class attributes;
otherwise initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9) — chose standard Appel parameters not adaptive because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering not current-only because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose drawer_rects subclass override for custom annotations not hardcoded markers — allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions:
- id: BD-020
type: M
summary: Lazy import of matplotlib/plotly
- id: BD-021
type: BA
summary: 'Efficient frontier range: linspace from GMV to max-0.0001'
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 15 source groups: bayesian_update(2),
black_litterman(3), cla_optimization(2), data_input(17), discrete_allocation(10), global(9), and 9 more.'
key_decisions: 121 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-028
type: B/BA
summary: Tau (weight-on-views) defaults to 0.05 in Black-Litterman
- id: BD-047
type: B
summary: Zero confidence in Idzorek method results in 1e6 uncertainty
- id: BD-014
type: BA/DK
summary: tau=0.05 default weight-on-views
- id: BD-015
type: BA/DK
summary: risk_aversion=1 default
- id: BD-016
type: B
summary: Idzorek method for view uncertainty
- id: BD-012
type: B
summary: No cvxpy dependency in CLA
- id: BD-013
type: B/BA
summary: Golden section search for lambda
- id: BD-001
type: BA/DK
summary: frequency=252 hardcoded as default
- id: BD-002
type: BA
summary: Compounding=True default for mean_historical_return
- id: BD-003
type: B
summary: Positive semidefinite matrix check with Cholesky decomposition
- id: BD-024
type: B/BA
summary: Frequency defaults to 252 (trading days per year)
- id: BD-059
type: B
summary: NaN prices filled forward with ffill() in get_latest_prices
- id: BD-067
type: B/BA
summary: Pseudo-prices initialize to 1.0 for returns-to-prices conversion
- id: BD-068
type: B/BA
summary: NaN rows dropped with 'all' (only if all values in the row are NaN)
- id: BD-GAP-001
type: RC
summary: 'Missing: Time semantics framework: Implement as_of/evaluation_date concept for reproducible optimization
runs with temporal data'
- id: BD-GAP-002
type: DK
summary: 'Missing: Trading calendar integration: Replace hardcoded frequency=252 with configurable calendar-aware
annualization'
- id: BD-GAP-003
type: DK
summary: 'Missing: Point-in-time data markers: Add release_date/publish_date fields to track data vintage'
- id: BD-GAP-004
type: DK
summary: 'Missing: Stale data detection: Implement max_age/cache_ttl policy with configurable expiry warnings'
- id: BD-GAP-005
type: B
summary: 'Missing: Signal alignment validation: Add shift/lag check to prevent look-ahead in returns computation'
- id: BD-GAP-006
type: B
summary: 'Missing: Audit trail for optimization runs: Add immutable event log for weights/outputs'
- id: BD-GAP-007
type: B
summary: 'Missing: Currency/unit annotation: Add currency_denomination field to price data'
- id: BD-GAP-008
type: RC
summary: 'Missing: Settlement convention T+N: Add T+N settlement modeling to DiscreteAllocation'
- id: BD-GAP-009
type: RC
summary: 'Missing: Tick size/lot size enforcement: Add min_qty and tick_size constraints'
- id: BD-GAP-010
type: DK
summary: 'Missing: Rebalance trigger mechanism: Add calendar-based or drift-threshold rebalancing'
- id: BD-017
type: BA
summary: total_portfolio_value=10000 default
- id: BD-018
type: B/BA
summary: 'Two allocation methods: greedy vs LP'
- id: BD-019
type: B
summary: RMSE error calculation
- id: BD-034
type: B
summary: Short ratio auto-derived from input weights when not specified
- id: BD-035
type: B/BA
summary: Total portfolio value defaults to $10,000 for discrete allocation
- id: BD-040
type: B/BA
summary: ECOS_BB solver used as default for integer programming (LP)
- id: BD-043
type: B/BA
summary: Greedy allocation counter limit is 10 iterations
- id: BD-051
type: B/BA
summary: RMSE (not MAE) used for allocation error measurement
- id: BD-069
type: B/DK
summary: Long/short portfolios split into separate sub-allocations
- id: BD-070
type: B
summary: Reinvest option controls whether short proceeds fund longs
- id: BD-105
type: BA/DK
summary: 'INTERACTION: [BD-001 frequency=252] × [BD-065 annualization scaling] → 20% mis-scaling for non-US calendars'
- id: BD-106
type: BA/DK
summary: 'INTERACTION: [BD-031 compounding=True] × [BD-002 geometric mean] → compound overstatement in volatile markets'
- id: BD-107
type: B/BA
summary: 'INTERACTION: [BD-027 beta=0.95 CVaR] × [BD-092 CDaR confidence] × [BD-093 CVaR confidence] → triple tail-risk
amplification'
- id: BD-108
type: B
summary: 'INTERACTION: [BD-003 Cholesky PSD check] → [BD-065 covariance annualization] → [BD-008 cvxpy optimization]
→ silent failure cascade'
- id: BD-109
type: BA/DK
summary: 'INTERACTION: [BD-059 ffill() prices] → [BD-017 total_portfolio_value=10000] → [BD-018 greedy vs LP] → stale
price allocation error'
- id: BD-110
type: B/BA
summary: 'INTERACTION: [BD-014 tau=0.05] × [BD-028 tau=0.05] × [BD-096 BL weight on views] → over-trusting equilibrium
priors'
- id: BD-111
type: BA
summary: 'INTERACTION: [BD-007 strategy pattern] × [BD-009 no re-solving] → custom objective state confusion'
- id: BD-112
type: BA/DK
summary: 'INTERACTION: [BD-032 log_returns=False] × [BD-076 log returns vs simple] → inconsistent return convention
handling'
- id: BD-113
type: BA/M
summary: 'INTERACTION: [BD-033 weights_sum_to_one=True] × [BD-042 transaction cost k=0.001] → artificial portfolio turnover
encouragement'
- id: BD-010
type: B
summary: Inverse-variance weighting within clusters
- id: BD-011
type: B/BA
summary: Bi-section recursion for allocation
- id: BD-022
type: B/BA
summary: Weight bounds default to (0, 1) for long-only portfolios
- id: BD-052
type: B/BA
summary: OrderedDict used for deterministic weight output
- id: BD-054
type: B/DK
summary: Weights output as plain Python floats (not numpy float64)
- id: BD-066
type: B/BA
summary: Sharpe ratio annualization assumes same std as mean
- id: BD-004
type: BA
summary: weight_bounds=(0,1) long-only default
- id: BD-005
type: BA/DK
summary: risk_free_rate=0.0 default
- id: BD-006
type: BA
summary: cutoff=1e-4 for clean_weights
- id: BD-007
type: B/RC
summary: Strategy pattern via add_objective/add_constraint
- id: BD-008
type: B
summary: cvxpy for convex optimization
- id: BD-009
type: B
summary: No re-solving after objectives added
- id: BD-023
type: B/BA
summary: Risk-free rate defaults to 0.0 across all optimization methods
- id: BD-029
type: B/BA
summary: Risk aversion defaults to 1.0 for quadratic utility
- id: BD-033
type: B/BA
summary: Weights sum to one constraint is True by default
- id: BD-037
type: B/BA
summary: Efficient frontier computed with 100 points by default
- id: BD-038
type: B/BA
summary: Golden section tolerance is 1e-9 for CLA solver
- id: BD-039
type: B/BA
summary: Market neutral defaults to False (long-only portfolios)
- id: BD-041
type: B/BA
summary: L2 regularization gamma defaults to 1
- id: BD-042
type: B/BA
summary: Transaction cost parameter k defaults to 0.001 (10 basis points)
- id: BD-044
type: B
summary: CVXPY solver accepts 'optimal' or 'optimal_inaccurate' status
- id: BD-045
type: B
summary: Weights rounded to 16 decimals internally after optimization
- id: BD-048
type: B
summary: Minimum volatility uses variance (not standard deviation) as objective
- id: BD-049
type: B
summary: Max Sharpe uses variable substitution (tangency portfolio trick)
- id: BD-050
type: B/BA
summary: Expected returns may be omitted only for min_volatility
- id: BD-053
type: B/BA
summary: Single linkage method is default for HRP clustering
- id: BD-055
type: B
summary: Dummy zero covariance matrix passed to CDaR/CVaR variants
- id: BD-056
type: B/BA
summary: Return objective uses negative=True by default (minimize pattern)
- id: BD-057
type: B
summary: Global minimum variance used to validate efficient_risk target
- id: BD-058
type: B/BA
summary: Max return caches result to avoid redundant optimization
- id: BD-060
type: B
summary: CLA requires solving full frontier before individual methods
- id: BD-061
type: B/BA
summary: Sector constraints warn if shorts are allowed
- id: BD-062
type: B
summary: Objective/constraint changes after solve raise InstantiationError
- id: BD-063
type: B
summary: Distance matrix computed as sqrt((1-corr)/2) for HRP
- id: BD-064
type: B/BA
summary: Inverse variance weighting used within HRP clusters
- id: BD-025
type: B
summary: Weight cleanup cutoff is 1e-4 (weights below this become zero)
- id: BD-026
type: B/BA
summary: Weight rounding defaults to 5 decimal places
- id: BD-072
type: B/BA
summary: Annualization factor for return/covariance estimation
- id: BD-073
type: B/BA
summary: Compounding method for return estimation
- id: BD-074
type: B/BA
summary: EMA span for exponentially-weighted returns
- id: BD-075
type: B/BA
summary: CAPM beta estimation method
- id: BD-076
type: B/BA
summary: Log returns vs simple returns
- id: BD-077
type: B/RC
summary: Covariance matrix positivity fix method
- id: BD-078
type: B/BA
summary: Semicovariance benchmark level
- id: BD-079
type: B/BA
summary: Exponential covariance span
- id: BD-080
type: B/BA
summary: Ledoit-Wolf shrinkage target
- id: BD-081
type: B
summary: Manual shrinkage parameter
- id: BD-082
type: B/RC
summary: PSD verification via Cholesky
- id: BD-083
type: B
summary: Sharpe ratio calculation
- id: BD-084
type: B
summary: L2 regularization for sparse weights
- id: BD-085
type: B/BA
summary: Quadratic utility function
- id: BD-086
type: B
summary: Transaction cost model
- id: BD-087
type: B
summary: Ex-ante tracking error
- id: BD-088
type: B/BA
summary: Default weight bounds
- id: BD-089
type: B
summary: Max Sharpe ratio optimization method
- id: BD-090
type: B
summary: Global minimum volatility formula
- id: BD-091
type: B/BA
summary: Semivariance benchmark threshold
- id: BD-092
type: B/BA
summary: Conditional Drawdown at Risk confidence level
- id: BD-093
type: B/BA
summary: Conditional VaR confidence level
- id: BD-094
type: B
summary: Hierarchical clustering linkage method
- id: BD-095
type: B/BA
summary: Cluster variance weighting
- id: BD-096
type: B/DK
summary: Black-Litterman weight on views
- id: BD-097
type: B/BA
summary: Black-Litterman default omega calculation
- id: BD-098
type: B/BA
summary: Market-implied prior calculation
- id: BD-099
type: B/BA
summary: Critical Line Algorithm tolerance
- id: BD-100
type: B/BA
summary: CLA efficient frontier points
- id: BD-101
type: B
summary: Greedy vs LP allocation method
- id: BD-102
type: B
summary: Weight cleanup rounding
- id: BD-103
type: B
summary: Nonconvex optimization solver
- id: BD-104
type: B/BA
summary: Deviation risk parity objective
- id: BD-031
type: B/BA
summary: Compounding (geometric mean) defaults to True for returns
- id: BD-032
type: B/BA
summary: Log returns default to False (simple returns used)
- id: BD-071
type: B/BA
summary: Equal-weighted market used as default proxy for market in CAPM
- id: BD-027
type: B/BA
summary: Beta confidence level defaults to 0.95 for CVaR/CDaR
- id: BD-030
type: B/BA
summary: Exponential covariance span defaults to 180 days
- id: BD-036
type: B/BA
summary: Semicovariance benchmark defaults to ~0.000079 (daily risk-free)
- id: BD-046
type: B/BA
summary: Non-PSD matrix fix defaults to 'spectral' method
- id: BD-065
type: B/BA
summary: Annualization applied by multiplying daily covariance by frequency
resources:
packages:
- name: cvxpy>=1.1.19
version_pin: latest
- name: numpy>=1.26.0,<3.0.0
version_pin: latest
- name: pandas>=1.0.0,<4.0.0
version_pin: latest
- name: scipy>=1.3.0
version_pin: latest
- name: scikit-learn>=0.24.1
version_pin: latest
- name: scikit-base<0.14.0
version_pin: latest
- name: matplotlib>=3.2.0
version_pin: latest
- name: plotly>=5.0.0,<7
version_pin: latest
- name: ecos>=2.0.14,<2.1
version_pin: latest
- name: cvxopt
version_pin: latest
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install "cvxpy>=1.1.19"
- python3 -m pip install "numpy>=1.26.0,<3.0.0"
- python3 -m pip install "pandas>=1.0.0,<4.0.0"
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When implementing mean-variance portfolio optimization
action: Verify covariance matrix passes is_positive_semidefinite check before optimization
severity: fatal
kind: domain_rule
modality: must
consequence: Non-positive semidefinite covariance matrix causes Cholesky decomposition to fail with LinAlgError, preventing
optimizer from running
stage_ids:
- data_input
- id: finance-C-002
when: When computing covariance and expected returns
action: Verify expected returns series has the same index as covariance matrix columns
severity: fatal
kind: domain_rule
modality: must
consequence: Index mismatch between expected returns and covariance matrix causes portfolio optimization to fail with
dimension error or produces incorrect weights
stage_ids:
- data_input
- id: finance-C-003
when: When processing raw price data
action: Verify no NaN values exist in output covariance matrix
severity: fatal
kind: domain_rule
modality: must
consequence: NaN values in covariance matrix cause portfolio optimization to produce NaN/inf weights or fail entirely
stage_ids:
- data_input
- id: finance-C-017
when: When defining covariance matrix input
action: provide a positive semidefinite covariance matrix to cvxpy optimization
severity: fatal
kind: domain_rule
modality: must
consequence: cvxpy quad_form with assume_PSD=True will produce incorrect results if the covariance matrix is not actually
positive semidefinite, causing the optimizer to find mathematically invalid portfolios
stage_ids:
- portfolio_optimization
- id: finance-C-018
when: When optimizing with max_sharpe
action: verify at least one asset's expected return exceeds the risk-free rate
severity: fatal
kind: domain_rule
modality: must
consequence: When all expected returns are below the risk-free rate, the Sharpe maximization problem has no feasible solution
and cvxpy will fail to converge
stage_ids:
- portfolio_optimization
- id: finance-C-019
when: When implementing portfolio optimization objectives
action: use convex objectives that satisfy DCP (Disciplined Convex Programming) rules
severity: fatal
kind: domain_rule
modality: must
consequence: Non-convex objectives or DCP-violating expressions cause cvxpy to raise DCPError, completely preventing portfolio
optimization from running
stage_ids:
- portfolio_optimization
- id: finance-C-020
when: When defining custom constraints
action: provide constraints as callable functions (e.g., lambda) that accept weight variable
severity: fatal
kind: domain_rule
modality: must
consequence: Non-callable constraints cause TypeError and prevent optimization from executing, breaking the entire portfolio
optimization workflow
stage_ids:
- portfolio_optimization
- id: finance-C-022
when: When setting weight bounds
action: provide bounds as tuple/list with exactly 2 elements or a collection matching n_assets length
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid bounds format causes TypeError, preventing optimizer initialization and blocking all subsequent optimization
calls
stage_ids:
- portfolio_optimization
- id: finance-C-023
when: When targeting a specific volatility
action: set target_volatility >= global minimum volatility of the covariance matrix
severity: fatal
kind: domain_rule
modality: must
consequence: Setting target_volatility below the global minimum volatility makes the efficient_risk problem infeasible,
causing OptimizationError
stage_ids:
- portfolio_optimization
- id: finance-C-024
when: When defining expected returns
action: verify expected returns matches the number of assets in cov_matrix
severity: fatal
kind: domain_rule
modality: must
consequence: Mismatched dimensions between expected_returns and cov_matrix cause ValueError, preventing EfficientFrontier
initialization
stage_ids:
- portfolio_optimization
- id: finance-C-025
when: When using cvxpy optimization
action: modify objectives or constraints after initial optimization is solved
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Modifying the problem after solving raises InstantiationError, protecting users from stale state that would
produce misleading results
stage_ids:
- portfolio_optimization
- id: finance-C-037
when: When validating portfolio weights after optimization
action: verify that optimization status is OPTIMAL or OPTIMAL_INACCURATE before trusting weights
severity: fatal
kind: domain_rule
modality: must
consequence: Weights from failed optimizations (infeasible, unbounded, or solver errors) may be NaN or invalid, leading
to incorrect portfolio allocations
stage_ids:
- portfolio_optimization
- id: finance-C-041
when: When calculating efficient_return
action: verify target_return does not exceed the maximum achievable return from the optimizer
severity: fatal
kind: domain_rule
modality: must
consequence: Setting target_return above the maximum possible return makes the problem infeasible, raising ValueError
during optimization
stage_ids:
- portfolio_optimization
- id: finance-C-042
when: When initializing HRPOpt for hierarchical portfolio optimization
action: provide at least one of returns or cov_matrix inputs
severity: fatal
kind: domain_rule
modality: must
consequence: ValueError is raised when neither returns nor cov_matrix is provided, preventing portfolio optimization from
proceeding
stage_ids:
- hierarchical_optimization
- id: finance-C-043
when: When providing returns data to HRPOpt
action: pass returns as pandas DataFrame, not numpy array
severity: fatal
kind: domain_rule
modality: must
consequence: TypeError is raised when numpy array is passed instead of DataFrame, causing initialization failure
stage_ids:
- hierarchical_optimization
- id: finance-C-044
when: When selecting linkage method for hierarchical clustering
action: use linkage_method parameter value recognized by scipy.cluster.hierarchy
severity: fatal
kind: domain_rule
modality: must
consequence: ValueError is raised if linkage_method is not in scipy's _LINKAGE_METHODS, causing optimization to fail
stage_ids:
- hierarchical_optimization
- id: finance-C-047
when: When validating portfolio weights output from hierarchical optimization
action: verify weights sum to 1.0 after optimization
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Portfolio weights not summing to 1.0 would cause incorrect capital allocation, either over-allocating or
under-allocating total portfolio value
stage_ids:
- hierarchical_optimization
- id: finance-C-048
when: When calling portfolio_performance() before optimization
action: call optimize() first to compute weights, otherwise raise ValueError
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Attempting to calculate portfolio performance without prior optimization raises ValueError, as weights are
None before optimize() is called
stage_ids:
- hierarchical_optimization
- id: finance-C-058
when: When presenting hierarchical portfolio optimization results
action: claim that backtest results equal expected live trading performance
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Presenting backtest returns as guaranteed live trading results violates financial regulations and misleads
investors about expected performance
stage_ids:
- hierarchical_optimization
- id: finance-C-061
when: When running CLA._solve on a covariance matrix
action: verify covariance matrix is positive semidefinite before CLA._solve execution
severity: fatal
kind: domain_rule
modality: must
consequence: Matrix inversion via np.linalg.inv(covarF) will fail with LinAlgError if covariance matrix is not positive
semidefinite, causing _solve to crash
stage_ids:
- cla_optimization
- id: finance-C-062
when: When computing portfolio weights, weights must sum to approximately 1.0
action: verify computed weights satisfy sum-to-one constraint within tolerance 10e-10
severity: fatal
kind: domain_rule
modality: must
consequence: Violation of sum-to-one constraint produces invalid portfolio allocations where total investment exceeds
or falls short of available capital
stage_ids:
- cla_optimization
- id: finance-C-076
when: When implementing Black-Litterman matrix computations
action: maintain strict matrix dimensional consistency between cov_matrix (NxN), Q (Kx1), P (KxN), pi (Nx1), and omega
(KxK)
severity: fatal
kind: domain_rule
modality: must
consequence: Linear system solver fails with LinAlgError when dimensions mismatch, producing incorrect posterior returns
or crash at runtime
stage_ids:
- black_litterman
- id: finance-C-077
when: When specifying tau weight-on-views parameter
action: constrain tau to range (0, 1] to ensure valid weight blending between prior and views
severity: fatal
kind: domain_rule
modality: must
consequence: Negative or zero tau causes the posterior covariance calculation to produce invalid matrix values, leading
to meaningless portfolio weights
stage_ids:
- black_litterman
- id: finance-C-078
when: When specifying risk_aversion parameter
action: constrain risk_aversion to positive values (> 0) for valid portfolio optimization
severity: fatal
kind: domain_rule
modality: must
consequence: Non-positive risk_aversion causes division by zero or invalid weight calculations, producing degenerate portfolios
stage_ids:
- black_litterman
- id: finance-C-079
when: When specifying view_confidences for Idzorek method
action: constrain confidence values to [0, 1] range inclusive
severity: fatal
kind: domain_rule
modality: must
consequence: Confidence outside [0,1] produces invalid omega uncertainty matrix, causing posterior returns to diverge
from expected Bayesian blend
stage_ids:
- black_litterman
- id: finance-C-084
when: When specifying views on asset universe
action: verify all view tickers exist in the covariance matrix asset universe
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Providing views on assets not in the universe raises ValueError during view parsing, preventing model initialization
stage_ids:
- black_litterman
- id: finance-C-085
when: When using Idzorek method for view uncertainty
action: provide view_confidences vector with exactly K elements matching the number of views
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Mismatched view_confidences length causes assertion error or invalid omega computation, corrupting posterior
estimates
stage_ids:
- black_litterman
- id: finance-C-086
when: When using pi='market' prior specification
action: pass market_caps keyword argument with capitalizations for all assets
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Missing market_caps when pi='market' raises ValueError preventing model initialization
stage_ids:
- black_litterman
- id: finance-C-093
when: When implementing discrete_allocation algorithms
action: produce fractional share counts
severity: fatal
kind: domain_rule
modality: must_not
consequence: Fractional shares cannot be traded in most markets, causing the allocation to be unexecutable or requiring
manual adjustment that deviates from the optimizer's intended weights
stage_ids:
- discrete_allocation
- id: finance-C-094
when: When constructing DiscreteAllocation instances
action: accept NaN values in weights or latest_prices inputs
severity: fatal
kind: domain_rule
modality: must_not
consequence: NaN values propagate through calculations causing corrupted allocation results and potentially silent failures
in trading execution
stage_ids:
- discrete_allocation
- id: finance-C-095
when: When initializing DiscreteAllocation with total_portfolio_value
action: set total_portfolio_value to zero or negative values
severity: fatal
kind: domain_rule
modality: must_not
consequence: Zero or negative portfolio value causes division by zero or invalid share count calculations, resulting in
undefined behavior
stage_ids:
- discrete_allocation
- id: finance-C-096
when: When specifying short_ratio parameter
action: provide negative short_ratio values
severity: fatal
kind: domain_rule
modality: must_not
consequence: Negative short_ratio produces invalid leverage calculations and may cause unexpected short position sizing
stage_ids:
- discrete_allocation
- id: finance-C-099
when: When working with portfolios containing short positions
action: handle longs and shorts as separate sub-portfolios with normalized weights
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Improper handling of long/short segregation leads to incorrect position sizing and risk exposure that deviates
from the optimizer's intended allocation
stage_ids:
- discrete_allocation
- id: finance-C-106
when: When weights contain tickers not present in latest_prices
action: silently skip those tickers without raising an error
severity: fatal
kind: domain_rule
modality: must_not
consequence: Missing prices for tickers in weights causes KeyError at runtime when accessing self.latest_prices[ticker],
crashing allocation generation
stage_ids:
- discrete_allocation
- id: finance-C-123
when: When providing covariance matrix to any EfficientFrontier optimizer
action: Verify the covariance matrix is positive semidefinite before passing to cvxpy optimization
severity: fatal
kind: domain_rule
modality: must
consequence: Non-positive semidefinite covariance matrix causes cvxpy optimization to fail with DCPError, preventing portfolio
optimization from completing
- id: finance-C-124
when: When computing expected returns for portfolio optimization
action: Provide expected returns for all optimization methods except min_volatility
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Omitting expected returns for methods other than min_volatility causes ValueError exceptions, preventing
portfolio optimization from executing
- id: finance-C-126
when: When modifying optimization objectives or constraints after an optimizer has been solved
action: Add objectives or constraints to an already-solved optimization problem instance
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: Modifying solved problems leads to InstantiationError exceptions and unpredictable optimization behavior,
as cvxpy problem structure becomes invalid
- id: finance-C-137
when: When implementing custom objectives and constraints via add_objective/add_constraint APIs
action: Verify all custom objectives and constraints are convex functions for DCP (Disciplined Convex Programming) compliance
— non-convex custom objectives will cause solver failures or incorrect optimization results
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Non-convex custom objectives violate DCP rules and cause the underlying convex optimizer to fail or produce
undefined results, making portfolio optimization non-deterministic and unreliable
derived_from_bd_id: BD-007
- id: finance-C-145
when: When implementing discrete portfolio allocation with short positions
action: Use the LP (linear programming) allocation method when short_ratio constraint is required — greedy method does
not support short_ratio constraint and will ignore short position limits
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using greedy allocation with short_ratio constraint silently ignores the constraint, allowing unlimited short
positions in live trading with counterparty and margin risk
derived_from_bd_id: BD-018
- id: finance-C-146
when: When initializing portfolio optimizer without explicit weight bounds
action: Use default weight bounds of (0, 1) for long-only portfolios — this enforces conservative investment management
and prevents unintended short selling or leverage
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Unbounded or incorrect weight bounds allow negative weights (short positions) or excessive leverage, violating
institutional mandates and causing margin calls in live trading
derived_from_bd_id: BD-022
- id: finance-C-163
when: When implementing covariance matrix processing for portfolio optimization
action: Implement numerical safeguards for PSD matrix handling across the Cholesky check → annualization → optimization
pipeline; set deterministic cvxpy parameters (warm_start=False) and validate that annualization does not introduce non-PSD
behavior
severity: fatal
kind: architecture_guardrail
modality: must
consequence: The silent failure cascade occurs when Cholesky check passes but annualization introduces subtle non-PSD
behavior that cvxpy handles differently across platforms, causing non-deterministic optimization results where the same
inputs produce different portfolios on different machines or Python versions
derived_from_bd_id: BD-108
- id: finance-C-183
when: When annualizing returns, Sharpe ratios, or risk metrics for non-US market portfolios
action: Verify that frequency parameter matches the actual trading calendar of the market being backtested; do not rely
on the default 252 value for markets with different holiday schedules (UK, HK, EU) - use exchange-specific trading day
counts or pass explicit frequency parameter
severity: fatal
kind: domain_rule
modality: must
consequence: The interaction between frequency=252 and annualization scaling creates approximately 20% mis-calculation
of risk-adjusted returns for international portfolios, causing Sharpe ratios and expected annual returns to be systematically
over- or under-stated by up to 20%
derived_from_bd_id: BD-105
- id: finance-C-198
when: When converting periodic returns to annualized equivalents for portfolio optimization
action: Verify annualization factor matches data frequency — use 252 for daily data, 52 for weekly data. The factor must
be explicitly specified and consistent between return estimation and covariance calculation
severity: fatal
kind: domain_rule
modality: must
consequence: Mismatched annualization factors (e.g., 252 on daily data but 365 on weekly) cause expected returns and risk
estimates to be miscalibrated, leading to portfolio allocations that are either overly aggressive or overly conservative
relative to actual risk-adjusted targets
derived_from_bd_id: BD-072
regular:
- id: finance-C-004
when: When providing price data to PyPortfolioOpt
action: Verify all prices are positive values within a reasonable range
severity: high
kind: domain_rule
modality: must
consequence: Zero or negative prices produce invalid returns (division by zero or infinity), leading to corrupted expected
return estimates
stage_ids:
- data_input
- id: finance-C-005
when: When using Cholesky decomposition for PSD validation
action: Add 1e-16 numerical stability constant to diagonal before Cholesky check
severity: high
kind: domain_rule
modality: must
consequence: Without 1e-16 regularization, near-singular matrices fail Cholesky check due to floating-point precision
errors, causing valid PSD matrices to be rejected
stage_ids:
- data_input
- id: finance-C-006
when: When selecting frequency parameter for annualization
action: Adjust frequency parameter to match the actual trading calendar of the market
severity: high
kind: resource_boundary
modality: must
consequence: Using US market default (252) for non-US markets produces incorrect annualization, causing expected returns
and risk estimates to be systematically miscalibrated
stage_ids:
- data_input
- id: finance-C-007
when: When using CovarianceShrinkage or Ledoit-Wolf methods
action: Verify scikit-learn is installed in the environment
severity: high
kind: resource_boundary
modality: must
consequence: Missing scikit-learn dependency causes ImportError when attempting to use CovarianceShrinkage.ledoit_wolf()
or shrunk_covariance()
stage_ids:
- data_input
- id: finance-C-008
when: When using min_cov_determinant covariance estimation
action: Use min_cov_determinant, which is deprecated and will be removed in v1.5
severity: medium
kind: resource_boundary
modality: must_not
consequence: Code that calls the deprecated function will break after upgrading to PyPortfolioOpt v1.5
stage_ids:
- data_input
- id: finance-C-009
when: When selecting return estimation method
action: Understand that compounding=True (CAGR) and compounding=False (arithmetic mean) produce significantly different
expected return estimates
severity: medium
kind: operational_lesson
modality: must
consequence: Using CAGR (default) produces lower expected returns for volatile assets than arithmetic mean, leading to
suboptimal portfolio allocations
stage_ids:
- data_input
- id: finance-C-010
when: When using exponential covariance with short span
action: Use span values less than 10 for exponential covariance
severity: low
kind: operational_lesson
modality: should_not
consequence: Very short span (less than 10 days) produces noisy covariance estimates with insufficient smoothing, leading
to unstable portfolio weights
stage_ids:
- data_input
- id: finance-C-011
when: When processing returns data
action: Pass returns_data=True only when the data is already arithmetic (simple) returns, not log returns
severity: high
kind: architecture_guardrail
modality: must
consequence: Passing log returns with returns_data=True causes incorrect CAGR calculations because the code assumes arithmetic
returns, producing systematically wrong expected returns
stage_ids:
- data_input
- id: finance-C-012
when: When calculating CAPM expected returns
action: Verify risk_free_rate parameter matches the time period corresponding to the frequency parameter
severity: medium
kind: architecture_guardrail
modality: must
consequence: Mismatch between risk-free rate frequency and data frequency causes incorrect CAPM return estimates, leading
to misallocated portfolios
stage_ids:
- data_input
- id: finance-C-013
when: When using factory functions for return/risk models
action: Pass supported method names to return_model() and risk_matrix() functions
severity: high
kind: architecture_guardrail
modality: must
consequence: Unsupported method names cause NotImplementedError, preventing portfolio optimization from running
stage_ids:
- data_input
- id: finance-C-014
when: When presenting expected returns to stakeholders
action: Claim that historical expected return estimates accurately predict future returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Historical returns are known to be poor predictors of future performance; overstating their predictive accuracy
misleads investors and violates financial advisory standards
stage_ids:
- data_input
- id: finance-C-015
when: When reporting backtested portfolio performance
action: Present backtest returns as expected live trading returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtested returns exclude transaction costs, slippage, and market impact; presenting them as live returns
overstates expected performance and violates compliance requirements
stage_ids:
- data_input
- id: finance-C-016
when: When documenting PyPortfolioOpt capabilities
action: Claim real-time trading support for a backtesting-only library
severity: high
kind: claim_boundary
modality: must_not
consequence: PyPortfolioOpt is a portfolio optimization library for analysis; claiming live trading capability misleads
users and may cause financial harm if used without proper execution infrastructure
stage_ids:
- data_input
- hierarchical_optimization
- id: finance-C-021
when: When calling clean_weights
action: verify that weights have been computed before cleaning, otherwise raise AttributeError
severity: high
kind: domain_rule
modality: must
consequence: Calling clean_weights on an unoptimized portfolio raises AttributeError because the weights are still None,
so downstream code cannot access portfolio allocations
stage_ids:
- portfolio_optimization
- id: finance-C-026
when: When adding objectives after optimization instance creation
action: call add_objective on an optimizer instance that has already solved a problem
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Adding objectives to a solved problem raises InstantiationError to prevent unintended side effects from stale
optimization state
stage_ids:
- portfolio_optimization
- id: finance-C-027
when: When adding constraints after optimization instance creation
action: call add_constraint on an optimizer instance that has already solved a problem
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Adding constraints to a solved problem raises InstantiationError to prevent unintended side effects from
stale optimization state
stage_ids:
- portfolio_optimization
- id: finance-C-028
when: When requiring market neutrality
action: change weight_bounds to allow negative weights before setting market_neutral=True
severity: medium
kind: architecture_guardrail
modality: must
consequence: Market-neutral portfolios require weights summing to zero, which is impossible with long-only bounds, causing
RuntimeWarning and automatic bounds amendment to (-1, 1)
stage_ids:
- portfolio_optimization
- id: finance-C-029
when: When modifying market_neutral flag after instantiation
action: create a new optimizer instance instead of reusing the same one
severity: high
kind: architecture_guardrail
modality: must
consequence: Reusing the same instance with different market_neutral values raises InstantiationError, preventing confusion
from stale optimization parameters
stage_ids:
- portfolio_optimization
- id: finance-C-030
when: When using max_sharpe with additional objectives
action: be aware that variable transformation for Sharpe maximization may cause additional objectives to not work as expected
severity: medium
kind: resource_boundary
modality: must
consequence: max_sharpe uses a variable substitution that transforms the optimization problem, causing additional objectives
added via add_objective to behave differently than intended
stage_ids:
- portfolio_optimization
- id: finance-C-031
when: When using scipy nonconvex_objective
action: provide an initial_guess to avoid local minima convergence issues
severity: medium
kind: operational_lesson
modality: must
consequence: Without explicit initial_guess, scipy uses equal weights which may converge to suboptimal local minima in
nonconvex optimization problems
stage_ids:
- portfolio_optimization
- id: finance-C-032
when: When cleaning weights with default cutoff
action: understand that 1e-4 cutoff removes floating-point noise but may affect portfolios with genuinely tiny weights
severity: low
kind: domain_rule
modality: must
consequence: Weights with absolute value below 1e-4 are set to exactly zero, which may meaningfully affect portfolio composition
when small allocations are intentional
stage_ids:
- portfolio_optimization
- id: finance-C-033
when: When using sector constraints with shorting
action: be aware that sector constraints may not produce reasonable results with short positions
severity: medium
kind: operational_lesson
modality: must
consequence: Sector constraints with negative weight bounds can produce mathematically valid but economically nonsensical
portfolios due to offsetting long and short positions
stage_ids:
- portfolio_optimization
- id: finance-C-034
when: When optimizing for minimum volatility
action: provide expected_returns as None or accept that the optimizer ignores expected returns
severity: low
kind: resource_boundary
modality: must
consequence: min_volatility only minimizes portfolio variance without considering expected returns, so providing mu is
misleading since it won't affect optimization
stage_ids:
- portfolio_optimization
- id: finance-C-035
when: When using cvxpy as the optimization backend
action: accept that only convex optimization problems are supported via the main API
severity: medium
kind: resource_boundary
modality: must
consequence: Non-convex objectives require nonconvex_objective with scipy backend, which is documented as prone to local
minima and generally not recommended
stage_ids:
- portfolio_optimization
- id: finance-C-036
when: When presenting portfolio optimization results
action: claim that backtested Sharpe ratios or expected returns will be achieved in live trading
severity: high
kind: claim_boundary
modality: must_not
consequence: Historical returns used for expected_returns are estimates that do not guarantee future performance; presenting
them as predictions misleads investors about likely live trading outcomes
stage_ids:
- portfolio_optimization
- id: finance-C-038
when: When computing portfolio performance
action: use risk_free_rate consistent with the frequency of expected_returns (annual vs daily)
severity: high
kind: domain_rule
modality: must
consequence: Mismatched risk-free rate frequency causes Sharpe ratio calculation errors, producing misleading risk-adjusted
return metrics
stage_ids:
- portfolio_optimization
- id: finance-C-039
when: When requesting market neutrality, which is supported only by efficient_risk and efficient_return
action: expect market-neutral portfolios to work with max_sharpe or min_volatility
severity: high
kind: resource_boundary
modality: must_not
consequence: Market neutrality is not supported for max_sharpe and min_volatility because these objectives are not invariant
with respect to leverage, and attempting it causes errors
stage_ids:
- portfolio_optimization
- id: finance-C-040
when: When setting weight rounding
action: use positive integer rounding values or None to disable rounding
severity: high
kind: domain_rule
modality: must
consequence: Non-integer or zero rounding values cause ValueError, preventing weight cleaning from completing
stage_ids:
- portfolio_optimization
- id: finance-C-045
when: When computing distance matrix from correlation values
action: clip correlation values before sqrt to prevent NaN from negative arguments
severity: high
kind: domain_rule
modality: must
consequence: Without np.clip, correlation values slightly exceeding 1.0 (due to floating-point precision) would cause
sqrt to produce NaN, corrupting the distance matrix
stage_ids:
- hierarchical_optimization
- id: finance-C-046
when: When computing cluster variance for inverse-variance weighting
action: normalize inverse variance weights to sum to 1.0 within each cluster
severity: high
kind: domain_rule
modality: must
consequence: Without normalization, cluster weights would not properly allocate risk contributions proportionally across
assets within a cluster
stage_ids:
- hierarchical_optimization
- id: finance-C-049
when: When returning weights from hierarchical optimization
action: return weights as OrderedDict with ticker keys for consistent ordering
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without OrderedDict, dictionary iteration order could vary across Python versions, causing inconsistent weight
assignments for the same portfolio
stage_ids:
- hierarchical_optimization
- id: finance-C-050
when: When performing bi-section allocation in hierarchical clustering
action: process cluster pairs in groups of two during recursive allocation
severity: high
kind: architecture_guardrail
modality: must
consequence: Bi-section loop assumes even number of clusters after each split; odd number of clusters causes IndexError
when accessing second_cluster
stage_ids:
- hierarchical_optimization
- id: finance-C-051
when: When using scipy hierarchical clustering for HRP
action: provide a valid linkage_method from scipy.cluster.hierarchy._LINKAGE_METHODS
severity: high
kind: resource_boundary
modality: must
consequence: Invalid linkage_method causes ValueError and stops optimization; only single, complete, average, weighted,
centroid, median, ward are supported
stage_ids:
- hierarchical_optimization
- id: finance-C-052
when: When computing cluster variance with zero-variance assets
action: handle zero diagonal values in covariance matrix to prevent division by zero
severity: high
kind: resource_boundary
modality: must
consequence: Assets with zero variance cause np.diag(cov_slice) to return zeros, resulting in division by zero when computing
1/diag values
stage_ids:
- hierarchical_optimization
- id: finance-C-053
when: When cleaning weights with clean_weights()
action: use cutoff threshold of 1e-4 to zero out negligible weights
severity: medium
kind: resource_boundary
modality: must
consequence: Without proper cutoff, negligible weights (e.g., 1e-15) remain non-zero, causing unnecessary precision in
allocation calculations
stage_ids:
- hierarchical_optimization
- id: finance-C-054
when: When computing correlation matrix from covariance for HRP input
action: round correlation matrix to 6 decimal places to avoid floating-point artifacts
severity: medium
kind: resource_boundary
modality: must
consequence: Unrounded correlation values may contain floating-point precision errors that corrupt distance matrix computation
stage_ids:
- hierarchical_optimization
- id: finance-C-055
when: When plotting dendrogram before optimization
action: trigger automatic optimization with RuntimeWarning when clusters are None
severity: medium
kind: operational_lesson
modality: must
consequence: Without auto-optimization, plotting dendrogram on unoptimized HRPOpt would fail silently or produce empty
visualization
stage_ids:
- hierarchical_optimization
- id: finance-C-056
when: When selecting linkage method for financial time series
action: prefer 'single' linkage as default due to its robustness with noisy financial data
severity: low
kind: operational_lesson
modality: should
consequence: Other linkage methods (complete, average) may produce overly balanced clusters that don't reflect true asset
correlations in financial data
stage_ids:
- hierarchical_optimization
- id: finance-C-057
when: When using clean_weights() for discrete allocation
action: set rounding parameter to 5 decimal places minimum for accurate discrete allocation
severity: medium
kind: operational_lesson
modality: must
consequence: Insufficient rounding precision causes rounding errors in discrete allocation, leading to leftover cash that
doesn't match expected portfolio value
stage_ids:
- hierarchical_optimization
- id: finance-C-060
when: When making contributions to hierarchical optimization module
action: include unit tests covering linkage method variations and edge cases
severity: high
kind: rationalization_guard
modality: must
consequence: Without tests, regressions in hierarchical clustering logic would go undetected, potentially causing incorrect
portfolio allocations
stage_ids:
- hierarchical_optimization
- id: finance-C-063
when: When implementing CLA initialization, expected_returns input must be validated
action: validate expected_returns is pd.Series, list, or np.ndarray before reshaping
severity: high
kind: domain_rule
modality: must
consequence: Unvalidated input causes reshape operation to fail or produce incorrect array dimensions, leading to optimization
failure
stage_ids:
- cla_optimization
- id: finance-C-064
when: When implementing efficient_frontier generation, frontier points must be sorted by increasing risk
action: return sigma values in ascending order so that the frontier has monotonically increasing risk
severity: high
kind: domain_rule
modality: must
consequence: Unsorted sigma values produce non-Pareto-optimal frontier, violating mean-variance efficiency assumption
and causing incorrect portfolio selection
stage_ids:
- cla_optimization
- id: finance-C-065
when: When implementing CLA weight bounds, lower and upper bounds must be respected
action: enforce weight bounds (lB, uB) constraints during optimization and purge phase
severity: high
kind: domain_rule
modality: must
consequence: Bound violations produce infeasible portfolios where individual asset weights exceed specified limits, violating
investor constraints
stage_ids:
- cla_optimization
- id: finance-C-066
when: When using CLA.efficient_frontier, the points parameter has a default value of 100
action: modify the default points value from 100 unless specifically required for performance
severity: medium
kind: resource_boundary
modality: must_not
consequence: Too few frontier points produce a choppy, discontinuous efficient frontier; too many points cause unnecessary
computation without quality improvement
stage_ids:
- cla_optimization
- id: finance-C-067
when: When implementing CLA, the set_weights method must raise NotImplementedError
action: override set_weights to raise NotImplementedError as CLA computes weights internally
severity: high
kind: architecture_guardrail
modality: must
consequence: Calling set_weights would corrupt internal CLA state since weights are computed by _solve, not set externally
stage_ids:
- cla_optimization
- id: finance-C-068
when: When implementing CLA, _solve must be called before accessing weights
action: call _solve lazily if self.w is empty before max_sharpe, min_volatility, or efficient_frontier
severity: high
kind: architecture_guardrail
modality: must
consequence: Accessing weights before _solve produces AttributeError or returns None, causing portfolio performance calculation
to fail
stage_ids:
- cla_optimization
- id: finance-C-069
when: When implementing CLA, it must inherit from BaseOptimizer (not BaseConvexOptimizer)
action: extend the BaseOptimizer class to keep a pure numpy implementation without a cvxpy dependency
severity: high
kind: architecture_guardrail
modality: must
consequence: Inheriting from BaseConvexOptimizer adds unnecessary cvxpy dependency and cvxpy-specific state variables
that conflict with CLA's pure numpy implementation
stage_ids:
- cla_optimization
- id: finance-C-070
when: When accessing weights after CLA.efficient_frontier, weights[0] shape must match expected dimensionality
action: verify weights are returned as (n_assets, 1) column vectors to match CLA internal representation
severity: medium
kind: domain_rule
modality: must
consequence: Mismatched weight array shape causes downstream operations to fail or produce incorrect allocations
stage_ids:
- cla_optimization
- id: finance-C-071
when: When computing efficient frontier, CLA must generate continuous Pareto-optimal curve
action: interpolate between turning points using linspace to create smooth frontier without gaps
severity: high
kind: domain_rule
modality: must
consequence: Discontinuous frontier produces invalid portfolios at certain risk levels and violates mean-variance efficiency
assumptions
stage_ids:
- cla_optimization
- id: finance-C-072
when: When implementing CLA, the golden section search tolerance is 1.0e-9
action: modify the golden section search tolerance from the hardcoded 1.0e-9 value
severity: medium
kind: resource_boundary
modality: must_not
consequence: Relaxed tolerance produces less accurate Sharpe ratio maximization; stricter tolerance causes excessive iterations
without meaningful improvement
stage_ids:
- cla_optimization
- id: finance-C-073
when: When using CLA for portfolio optimization, expected live trading returns do not equal backtested returns
action: claim that CLA optimization results will match live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting backtested CLA returns as expected live trading returns violates financial modeling principles
and misleads investors about likely outcomes
stage_ids:
- cla_optimization
- id: finance-C-074
when: When implementing CLA._purge_num_err, numerical tolerance must be 10e-10
action: change the numerical purge tolerance from the hardcoded 10e-10 value
severity: medium
kind: resource_boundary
modality: must_not
consequence: Modified tolerance causes either false positive purging of valid turning points or failure to detect invalid
weight constraints
stage_ids:
- cla_optimization
- id: finance-C-075
when: When implementing CLA, weights must be accessed only after _solve completes
action: access self.weights before _solve has run (i.e., before calling max_sharpe or min_volatility)
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Accessing weights before optimization returns None or uninitialized values, causing portfolio performance
calculations to fail
stage_ids:
- cla_optimization
- id: finance-C-080
when: When validating omega matrix for linear system solving
action: verify omega matrix is invertible (positive definite for diagonal case) before np.linalg.solve
severity: high
kind: domain_rule
modality: must
consequence: Singular omega causes np.linalg.solve to fail, requiring fallback to lstsq which may produce numerically
unstable results
stage_ids:
- black_litterman
- id: finance-C-081
when: When providing prior estimate pi to Black-Litterman model
action: provide a prior that includes zero-variance assets without flagging it, as this causes numerical instability
severity: high
kind: domain_rule
modality: must_not
consequence: Zero-variance assets in prior lead to infinite or undefined market-implied returns, corrupting the posterior
estimation
stage_ids:
- black_litterman
- id: finance-C-082
when: When using default tau=0.05 parameter
action: understand that tau scales the uncertainty of the prior, so a small value such as 0.05 keeps the posterior close
to the prior, and calibrate it based on view quality
severity: medium
kind: operational_lesson
modality: should
consequence: Rigid adherence to default tau without considering view reliability may underweight or overweight investor
views inappropriately
stage_ids:
- black_litterman
- id: finance-C-083
when: When using default risk_aversion=1 parameter
action: calibrate risk_aversion to personal utility function rather than assuming moderate risk aversion of 1
severity: high
kind: operational_lesson
modality: must
consequence: Using default risk_aversion=1 when actual risk tolerance differs leads to suboptimal portfolio allocation
that may cause significant losses
stage_ids:
- black_litterman
- id: finance-C-087
when: When using Black-Litterman bl_weights() method
action: normalize weights by their sum to ensure a fully invested portfolio
severity: high
kind: architecture_guardrail
modality: must
consequence: Unnormalized weights produce portfolios that are not fully invested, violating the expected sum-to-one constraint
stage_ids:
- black_litterman
- id: finance-C-088
when: When implementing Black-Litterman views with relative views
action: verify picking matrix P rows for relative views sum to zero
severity: high
kind: architecture_guardrail
modality: must
consequence: Non-zero sum picking matrix rows produce incorrect relative view interpretation, corrupting posterior returns
for relative views
stage_ids:
- black_litterman
- id: finance-C-089
when: When using Black-Litterman posterior returns in live trading
action: claim that returns from backtest-optimized weights equal expected live trading returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtest returns do not equal expected live trading returns due to transaction costs, slippage, and market
impact not modeled in BL optimization
stage_ids:
- black_litterman
- id: finance-C-090
when: When using Black-Litterman as the sole alpha generation method
action: claim BL-generated views are equivalent to fundamental analysis or proprietary research
severity: medium
kind: claim_boundary
modality: must_not
consequence: BL model only combines market equilibrium with user-specified views; the quality of outputs directly depends
on input view quality which must come from validated research
stage_ids:
- black_litterman
- id: finance-C-091
when: When using BL with no prior (pi=None)
action: proceed without awareness that zero prior essentially treats views as absolute truth
severity: medium
kind: operational_lesson
modality: should_not
consequence: No-prior mode produces extreme weights dominated entirely by user views without market equilibrium anchoring,
increasing sensitivity to view errors
stage_ids:
- black_litterman
- id: finance-C-092
when: When implementing Black-Litterman with cov_matrix not as DataFrame
action: verify market cap index aligns to cov_matrix column order to avoid silent misalignment
severity: medium
kind: architecture_guardrail
modality: should
consequence: Non-DataFrame cov_matrix with misaligned market caps produces incorrect market-implied prior returns in wrong
asset order
stage_ids:
- black_litterman
- id: finance-C-097
when: When allocating shares in greedy algorithm second round
action: exceed 10 iterations when searching for affordable assets
severity: high
kind: resource_boundary
modality: must_not
consequence: Greedy algorithm terminates prematurely with suboptimal allocation when no affordable asset can be found
within the iteration limit, leaving funds unallocated
stage_ids:
- discrete_allocation
- id: finance-C-098
when: When using lp_portfolio with integer programming
action: verify cvxpy with a mixed-integer programming solver is available
severity: high
kind: resource_boundary
modality: must
consequence: LP portfolio requires cvxpy with mixed-integer solver support; without it, allocation falls back to greedy
method which may produce suboptimal results
stage_ids:
- discrete_allocation
- id: finance-C-100
when: When returning allocation from any allocation method
action: filter out zero-position tickers before returning the allocation dict
severity: high
kind: architecture_guardrail
modality: must
consequence: Zero-position tickers in allocation dict cause confusion in downstream processing and may incorrectly influence
portfolio value calculations
stage_ids:
- discrete_allocation
- id: finance-C-101
when: When calculating discrete allocation RMSE error
action: compute error only from tickers present in the original weights list
severity: high
kind: domain_rule
modality: must
consequence: An RMSE computed over the wrong set of tickers produces misleading tracking error metrics that
don't reflect the actual deviation from intended weights
stage_ids:
- discrete_allocation
- id: finance-C-102
when: When the LP solver fails to find an optimal solution
action: fall back to greedy_portfolio method which is more robust
severity: high
kind: operational_lesson
modality: must
consequence: LP solver failure without fallback leaves the user without any allocation, blocking trading execution entirely
stage_ids:
- discrete_allocation
- id: finance-C-103
when: When allocating small portfolios
action: expect higher RMSE and larger leftover funds due to rounding granularity
severity: medium
kind: operational_lesson
modality: should
consequence: Small portfolio values relative to share prices cause significant weight approximation errors, leading to
portfolio performance that diverges from the optimizer's target
stage_ids:
- discrete_allocation
- id: finance-C-104
when: When presenting discrete allocation results
action: claim the allocation achieves exact target weights without acknowledging rounding error
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting discrete allocation as exact weight replication misleads users about precision; in reality, RMSE
error is inherent due to integer share constraints
stage_ids:
- discrete_allocation
- id: finance-C-105
when: When using the latest_prices parameter
action: provide prices corresponding to the most recent market close to avoid stale data
severity: high
kind: resource_boundary
modality: must
consequence: Using outdated prices causes share counts to be calculated against incorrect valuations, leading to over/under-allocation
when actual trading occurs
stage_ids:
- discrete_allocation
- id: finance-C-107
when: When implementing discrete allocation in trading systems
action: present backtest allocation results as guaranteed live trading outcomes
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtest allocations use historical prices while live trading uses current prices; slippage, liquidity, and
execution delays cause actual fills to differ from discrete allocation calculations
stage_ids:
- discrete_allocation
- id: finance-C-108
when: When importing the plotting module without matplotlib installed
action: raise ImportError with message instructing pip or poetry installation
severity: high
kind: resource_boundary
modality: must
consequence: Plotting functions fail immediately with ImportError when matplotlib is not available, preventing any visualization
output
stage_ids:
- visualization
- id: finance-C-109
when: When importing plotting module without plotly installed
action: raise ImportError with message about plotly installation requirement
severity: high
kind: resource_boundary
modality: must
consequence: Interactive plotting mode fails with ImportError when plotly is not installed, blocking plotly-based visualizations
stage_ids:
- visualization
- id: finance-C-110
when: When passing non-EfficientFrontier and non-CLA objects to plot_efficient_frontier
action: raise NotImplementedError specifying EfficientFrontier or CLA as valid inputs
severity: high
kind: domain_rule
modality: must
consequence: Invalid optimizer objects cause NotImplementedError, preventing silent failures or incorrect plots
stage_ids:
- visualization
- id: finance-C-111
when: When using ef_param not in {'utility', 'risk', 'return'}
action: raise NotImplementedError with valid parameter list
severity: high
kind: domain_rule
modality: must
consequence: Invalid ef_param values trigger NotImplementedError, ensuring only supported frontier parameterizations are
used
stage_ids:
- visualization
- id: finance-C-112
when: When computing default returns range for efficient frontier
action: subtract 0.0001 from max_return to avoid numerical boundary issues
severity: high
kind: domain_rule
modality: must
consequence: Without the epsilon subtraction, numerical issues at the maximum return boundary cause plotting failures
or infinite optimization loops
stage_ids:
- visualization
- id: finance-C-113
when: When plotting dendrogram with unoptimized HRPOpt object
action: emit RuntimeWarning and automatically call optimize() before plotting
severity: medium
kind: operational_lesson
modality: must
consequence: Unoptimized HRPOpt produces empty or invalid dendrogram; automatic optimization ensures valid hierarchical
structure visualization
stage_ids:
- visualization
- id: finance-C-114
when: When portfolio optimization fails for a specific parameter value
action: catch exceptions.OptimizationError and skip that point; catch ValueError and emit a UserWarning
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without exception handling, frontier plotting aborts on first optimization failure; graceful skipping produces
partial frontier plots
stage_ids:
- visualization
- id: finance-C-115
when: When calling plotting functions in automated tests
action: skip test if matplotlib is not installed via _check_soft_dependencies
severity: medium
kind: operational_lesson
modality: must
consequence: Tests fail catastrophically without soft dependency checks; skip decorator allows test suite completion without
matplotlib
stage_ids:
- visualization
- id: finance-C-116
when: When using plot_covariance with large portfolios
action: set show_tickers=False to avoid cluttered axis labels
severity: low
kind: operational_lesson
modality: should
consequence: With show_tickers=True for large portfolios, axis labels overlap making the plot unreadable and the covariance
matrix interpretation difficult
stage_ids:
- visualization
- id: finance-C-117
when: When using plot_dendrogram with large portfolios
action: set show_tickers=False to avoid cluttered x-axis labels
severity: low
kind: operational_lesson
modality: should
consequence: With show_tickers=True for large portfolios, dendrogram labels overlap making hierarchical structure unreadable
stage_ids:
- visualization
- id: finance-C-118
when: When presenting plots as investment recommendations
action: present optimized portfolio weights or frontier plots as guaranteed future performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting backtested efficient frontiers or optimized weights as predictions of future returns violates
the project disclaimer and misleads investors
stage_ids:
- visualization
- id: finance-C-119
when: When adding new plotting functionality
action: write pytest unit tests covering core functionality, warnings, errors, and edge cases
severity: high
kind: rationalization_guard
modality: must
consequence: Without comprehensive tests, plotting bugs go undetected causing silent failures in portfolio visualization
workflows
stage_ids:
- visualization
- id: finance-C-120
when: When contributing new visualization features
action: seek early feedback via GitHub issue before implementation
severity: medium
kind: rationalization_guard
modality: must
consequence: Without early feedback, API changes may be rejected after implementation, wasting development effort on incompatible
contributions
stage_ids:
- visualization
- id: finance-C-122
when: When calculating expected returns or covariance matrices from price data
action: Use consistent frequency parameter (default 252 for trading days) for annualization of both expected returns and
covariance
severity: high
kind: domain_rule
modality: must
consequence: Inconsistent annualization causes misaligned risk/return estimates, leading to incorrect portfolio allocations
with either excessive risk or diminished returns
- id: finance-C-125
when: When constructing market-neutral portfolios with EfficientFrontier
action: Allow negative weights by setting weight_bounds to include negative values (e.g., (-1, 1))
severity: high
kind: architecture_guardrail
modality: must
consequence: Market neutrality constraint with long-only bounds causes RuntimeWarning and potentially invalid optimization
results, producing portfolios that do not meet market-neutral specifications
- id: finance-C-127
when: When using the max_sharpe optimization method
action: Use market_neutral=True with max_sharpe optimization
severity: high
kind: architecture_guardrail
modality: must_not
consequence: max_sharpe optimization is not invariant with respect to leverage and cannot support market neutrality, leading
to mathematically invalid or degenerate portfolio allocations
- id: finance-C-128
when: When cleaning portfolio weights for output or discrete allocation
action: Treat weights with absolute value below 1e-4 as exactly zero to preserve numerical stability
severity: medium
kind: domain_rule
modality: must
consequence: Near-zero weights cause unnecessary trading friction, precision errors in discrete allocation, and degraded
Sharpe ratio calculations
- id: finance-C-129
when: When presenting or reporting this system's backtested or optimized returns to users
action: Claim that expected returns or Sharpe ratios derived from historical data equal guaranteed future investment performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Users make live capital allocation decisions based on inflated expected returns, leading to severe underperformance
when actual market conditions diverge from historical patterns
- id: finance-C-130
when: When building or marketing investment systems using PyPortfolioOpt
action: Claim that PyPortfolioOpt supports real-time trading execution, live order placement, or direct exchange connectivity
severity: high
kind: claim_boundary
modality: must_not
consequence: Users deploying PyPortfolioOpt for live trading without execution infrastructure experience service failures,
missed trades, and financial losses from unexecuted portfolio allocations
- id: finance-C-131
when: When applying PyPortfolioOpt to construct investment portfolios
action: Claim that the system provides valid optimization for investors without quantitative expected return and covariance
inputs
severity: high
kind: claim_boundary
modality: must_not
consequence: Investors without quantitative alpha inputs (expected returns/covariance) receive meaningless portfolios
driven entirely by estimation noise rather than genuine information
- id: finance-C-132
when: When solving portfolio optimization problems with PyPortfolioOpt
action: Claim that PyPortfolioOpt supports non-convex optimization problems outside standard portfolio allocation
severity: medium
kind: claim_boundary
modality: must_not
consequence: Users attempting non-convex portfolio problems receive suboptimal or failed results due to cvxpy's convex
optimization limitations, leading to financial losses from suboptimal allocations
- id: finance-C-133
when: When calculating discrete allocations from continuous portfolio weights
action: Use current market prices (not historical prices) for converting continuous weights to share counts
severity: high
kind: domain_rule
modality: must
consequence: Using stale prices causes incorrect share counts, portfolio value mismatches, and potential over/under-allocation
relative to target portfolio size
- id: finance-C-134
when: When adding sector or custom constraints to an optimization problem
action: Verify weight bounds allow negative values when using sector constraints with market neutrality
severity: medium
kind: architecture_guardrail
modality: must
consequence: Sector constraints with long-only bounds produce unreasonable results for portfolios allowing shorts, creating
allocation conflicts and optimization failures
- id: finance-C-135
when: When calculating portfolio performance metrics
action: Use the same risk_free_rate for portfolio_performance that was used in max_sharpe optimization
severity: medium
kind: architecture_guardrail
modality: must
consequence: Mismatched risk-free rates cause incorrect Sharpe ratio calculations, leading to invalid performance comparisons
and misguided rebalancing decisions
- id: finance-C-136
when: When implementing portfolio allocation logic in hierarchical optimization
action: Use bi-section recursion algorithm for weight allocation — this ensures O(n log n) complexity and deterministic
results matching the framework's documented behavior; do not replace with naive full optimization approaches
severity: high
kind: architecture_guardrail
modality: must
consequence: Replacing bi-section recursion with full optimization increases time complexity from O(n log n) to O(n^2),
causing unacceptable latency for real-time allocation with large portfolios; different algorithms also produce different
allocation results
derived_from_bd_id: BD-011
- id: finance-C-138
when: When using the framework's default compounding=True parameter for mean_historical_return
action: Verify that compounding=True matches the strategy's return calculation requirements; CAGR (geometric) and arithmetic
mean produce significantly different expected returns, affecting optimization outcomes
severity: medium
kind: operational_lesson
modality: should
consequence: Default CAGR compounding produces lower expected returns than arithmetic mean for volatile assets, causing
the optimizer to select more conservative portfolios than intended for long-term strategies
derived_from_bd_id: BD-002
- id: finance-C-139
when: When using the framework's default frequency=252 parameter for annualization
action: Verify that frequency=252 matches the actual trading calendar of the market being analyzed — for non-US markets,
override with market-specific value (e.g., 250 for some European markets, actual trading days for others)
severity: high
kind: domain_rule
modality: must
consequence: Hardcoded frequency=252 incorrectly annualizes returns for non-US markets, causing systematic miscalculation
of expected returns and risk metrics by 1-5% depending on the market's actual trading days
derived_from_bd_id: BD-001
- id: finance-C-140
when: When annualizing returns for data with periods other than daily (weekly, monthly, quarterly)
action: Assume the framework provides calendar-aware annualization; the framework applies frequency=252 to all
annualization unless overridden, which introduces significant errors for non-daily data periods
severity: high
kind: claim_boundary
modality: must_not
consequence: Without calendar-aware annualization, monthly data annualization with 252 factor introduces 21x error, causing
expected returns to be massively overestimated and leading to fundamentally wrong portfolio allocations
derived_from_bd_id: BD-GAP-002
- id: finance-C-141
when: When processing temporal data for portfolio optimization
action: Implement external timestamp tracking for optimization runs using as_of/evaluation_date concepts; record the
data snapshot timestamp with each optimization result so that runs are reproducible across different data timestamps
severity: high
kind: domain_rule
modality: must
consequence: Without time semantics tracking, optimizations are not reproducible across different data timestamps; the
same code rerun later produces different results as historical data updates, violating production audit requirements
derived_from_bd_id: BD-GAP-002
- id: finance-C-142
when: When running portfolio optimization with temporal data
action: Assume the framework implements time semantics with as_of/evaluation_date concept for reproducible optimization
runs — the framework does not track data timestamps internally
severity: high
kind: claim_boundary
modality: must_not
consequence: Without time semantics framework, optimizations are not reproducible across different data timestamps without
external tracking, making it impossible to audit or replicate production optimization decisions
derived_from_bd_id: BD-GAP-001
- id: finance-C-143
when: When running portfolio optimization workflows in production environments
action: 'Implement external timestamp tracking: store as_of_date and evaluation_date metadata alongside each optimization
result, and validate data snapshot consistency before each optimization run so that results are reproducible'
severity: high
kind: domain_rule
modality: must
consequence: Without implementing external time semantics tracking, production optimization runs cannot be audited or
replicated when historical data updates, violating regulatory reproducibility requirements for investment management
systems
derived_from_bd_id: BD-GAP-001
- id: finance-C-144
when: When implementing hierarchical risk parity cluster weight calculations
action: Use inverse-variance weighting (1/cluster_variance) for weights within each cluster boundary to achieve equal-risk
contribution across cluster pairs
severity: high
kind: domain_rule
modality: must
consequence: Using equal weighting within clusters removes variance awareness, causing unequal risk contribution across
assets and breaking the hierarchical risk parity objective
derived_from_bd_id: BD-010
- id: finance-C-147
when: When preparing covariance matrix inputs for quadratic optimization
action: Verify covariance matrix is positive semi-definite (PSD) via Cholesky decomposition before passing to quadratic
optimizer — non-PSD matrices indicate numerical instability or data quality issues
severity: high
kind: domain_rule
modality: must
consequence: Non-PSD covariance matrices cause Cholesky decomposition failures in quadratic optimizers, producing NaN/Inf
portfolio weights that cannot be executed in live trading
derived_from_bd_id: BD-082
- id: finance-C-148
when: When using clean_weights with the default cutoff threshold for portfolio optimization
action: Verify that weights below cutoff=1e-4 are genuinely negligible and not positions with small but economically meaningful
allocations — review the sum of zeroed weights to assess impact
severity: medium
kind: operational_lesson
modality: should
consequence: The 1e-4 threshold removes floating-point noise but may eliminate genuinely small positions, causing portfolio
normalization to overweight remaining assets and diverge from intended allocation
derived_from_bd_id: BD-006
- id: finance-C-149
when: When running discrete allocation for real portfolio backtesting
action: Set total_portfolio_value to actual portfolio size rather than relying on default 10000 — share counts scale linearly
with this parameter and affect strategy capacity analysis
severity: medium
kind: operational_lesson
modality: should
consequence: Using default 10000 for large portfolios produces unrealistically small share counts, making strategy capacity
appear lower than actual and causing suboptimal capital deployment decisions
derived_from_bd_id: BD-017
- id: finance-C-150
when: When initializing Black-Litterman model without explicit tau parameter
action: Verify that tau=0.05 (5% weight on views, 95% on equilibrium) matches your confidence level in views vs market
consensus — adjust tau based on conviction in the view predictions
severity: high
kind: operational_lesson
modality: must
consequence: Default tau=0.05 blends 95% market equilibrium with 5% investor views; using incorrect tau causes portfolio
to either over-rely on uncertain views or underweight valuable private information
derived_from_bd_id: BD-014
- id: finance-C-151
when: When computing Black-Litterman equilibrium returns without explicit risk_aversion
action: Calibrate risk_aversion parameter to your specific utility function rather than using default 1.0 — verify that
implied expected return magnitudes match market expectations
severity: high
kind: operational_lesson
modality: must
consequence: Default risk_aversion=1.0 produces equilibrium returns that may not match your risk tolerance, causing portfolio
to either over-concentrate in high-variance assets or underallocate to return drivers
derived_from_bd_id: BD-015
- id: finance-C-152
when: When processing price data inputs for portfolio optimization or backtesting
action: Assume the framework automatically detects or validates data freshness — the framework does not implement stale
data detection or cache TTL policies
severity: high
kind: claim_boundary
modality: must_not
consequence: Without stale data detection, outdated prices are used silently causing portfolio weights to be based on
stale market data, leading to execution failures and tracking errors in live trading
derived_from_bd_id: BD-GAP-004
- id: finance-C-153
when: When fetching price data for portfolio calculations
action: 'Implement a data freshness check: verify that price timestamp is within max_age threshold (e.g., trading_hours_check
or cache_ttl), and emit a warning or reject data if stale — use the timestamp field to compare against current datetime'
severity: high
kind: domain_rule
modality: must
consequence: Using outdated prices without freshness validation causes portfolio weights to diverge from intended strategy,
resulting in execution failures when stale orders cannot be filled at expected prices
derived_from_bd_id: BD-GAP-004
- id: finance-C-154
when: When implementing discrete allocation with get_latest_prices for price lookup
action: Use forward-fill (ffill) to propagate last known price for NaN values — do not use backfill, zero-fill, or drop
NaN values
severity: high
kind: domain_rule
modality: must
consequence: Using backfill would introduce look-ahead bias by using future prices; using zero would cause zero-value
allocations; dropping would cause allocation failures for any asset with missing data points
derived_from_bd_id: BD-059
- id: finance-C-155
when: When implementing Critical Line Algorithm portfolio optimization
action: Use lazy solving pattern — compute the full efficient frontier once, then return specific points from cached results
via individual method calls
severity: medium
kind: architecture_guardrail
modality: must
consequence: Eager solving would recompute the entire frontier on each method call, increasing computation time proportionally
to the number of allocations requested; lazy evaluation amortizes expensive frontier computation
derived_from_bd_id: BD-060
- id: finance-C-156
when: When implementing or modifying portfolio optimization after solve()
action: Raise InstantiationError when objective function or constraints are modified after solving — do not silently re-solve
with modified parameters
severity: high
kind: domain_rule
modality: must
consequence: Silent re-solve after structural changes would produce stale results that appear valid but represent a different
optimization problem than intended, leading to wrong portfolio allocations without any warning
derived_from_bd_id: BD-062
- id: finance-C-157
when: When implementing Hierarchical Risk Parity distance matrix computation
action: Compute distance as sqrt((1-corr)/2) — the angular transformation from correlation to ultrametric distance is
mathematically required for valid hierarchical clustering
severity: high
kind: domain_rule
modality: must
consequence: Using alternative formulas like 1-corr or absolute correlation would produce incorrect distance metrics,
breaking the ultrametric property and causing fundamentally different cluster structures in the portfolio hierarchy
derived_from_bd_id: BD-063
- id: finance-C-158
when: When implementing discrete allocation for portfolios with short positions
action: Respect the reinvest flag — when reinvest=False, short sale proceeds must NOT increase long allocation capacity;
when reinvest=True, short proceeds fund additional long positions (130/30-style)
severity: medium
kind: domain_rule
modality: must
consequence: Forcing reinvest=True for long-only strategies would introduce unintended leverage by using short proceeds
to fund oversized longs; for long-short strategies, disabling reinvest would incorrectly reduce allocation capacity
derived_from_bd_id: BD-070
- id: finance-C-159
when: When configuring semicovariance risk model for portfolio optimization
action: Verify that the semicovariance benchmark matches actual risk-free rate used in the strategy (default 0.000079
daily = 2% annual) — adjust benchmark parameter explicitly if the strategy uses different risk-free rate assumptions
severity: medium
kind: operational_lesson
modality: should
consequence: Using the default 2% annual benchmark when the strategy assumes different risk-free rates would mismeasure
downside risk, causing semicovariance to penalize returns incorrectly relative to the investor's actual opportunity
cost
derived_from_bd_id: BD-036
- id: finance-C-160
when: When configuring efficient frontier visualization or allocation precision
action: Verify that 100 points provides adequate granularity for the use case — increase to 200+ for detailed visualization,
decrease to 50 for faster iteration during parameter tuning
severity: low
kind: operational_lesson
modality: should
consequence: Using 100 points by default may be unnecessarily slow for quick parameter scans while potentially too coarse
for smooth visualization of complex efficient frontiers with multiple constraints
derived_from_bd_id: BD-037
- id: finance-C-161
when: When configuring efficient frontier for long-short or market neutral strategies
action: Explicitly set market_neutral parameter — the default False value assumes long-only portfolios; enabling market
neutrality requires margin capability and explicit parameter configuration
severity: high
kind: domain_rule
modality: must
consequence: Relying on the default market_neutral=False when implementing long-short strategies would constrain the optimizer
to long-only, producing portfolios that miss significant alpha opportunities from short positions
derived_from_bd_id: BD-039
- id: finance-C-162
when: When configuring integer programming solver for discrete allocation with binary share constraints
action: Use ECOS_BB solver by default for open-source deployments — if commercial solvers (Gurobi, SCIP) are available,
they may provide faster convergence for large allocation problems
severity: medium
kind: operational_lesson
modality: should
consequence: Using non-ECOS_BB solvers without verification may fail for some integer programming structures; commercial
solvers provide better performance but require licenses unavailable in open-source deployments
derived_from_bd_id: BD-040
- id: finance-C-164
when: When implementing portfolio optimization with EfficientFrontier or using the max_sharpe/max_quadratic_utility
objectives
action: Understand that quadratic utility (mean-variance optimization) is the core objective function; the risk_aversion
parameter controls the return-risk trade-off. Do NOT substitute with a different utility function without explicitly
configuring it and understanding its mathematical implications
severity: high
kind: domain_rule
modality: must
consequence: Changing the utility function fundamentally alters the optimal portfolio weights. Using a different utility
without understanding its properties leads to portfolios that may not align with investor preferences, causing suboptimal
risk-adjusted returns in live trading
derived_from_bd_id: BD-085
- id: finance-C-165
when: When configuring portfolio optimization weight bounds
action: Verify that the effective weight bounds match your investment mandate. EfficientFrontier defaults to (0, 1) (long-only),
but a bound passed as None is replaced with -1 or 1, allowing 100% shorts; set explicit bounds based on liquidity and regulatory constraints
severity: high
kind: operational_lesson
modality: must
consequence: Bounds of -1 to 1 permit large short positions that most retail and institutional mandates prohibit. Backtesting
with permissive bounds produces portfolios that cannot be traded live, causing execution failures or regulatory violations
derived_from_bd_id: BD-088
- id: finance-C-166
when: When optimizing portfolio using EfficientCVaR for tail risk management
action: Verify that the 95% confidence level for CVaR aligns with your regulatory requirements and risk management framework.
FRTB internal models use expected shortfall at a 97.5% confidence level, so the 95% default needs justification in that setting
severity: high
kind: domain_rule
modality: must
consequence: The default 95% confidence level differs from the 97.5% expected shortfall standard under FRTB for market risk
capital calculations. Using an unjustified confidence level may fail regulatory review and result in incorrect risk assessment
that underestimates or overestimates tail losses
derived_from_bd_id: BD-093
- id: finance-C-167
when: When implementing Hierarchical Risk Parity (HRP) portfolio construction
action: Replace the cluster variance weighting heuristic with equal weighting or custom logic without explicit justification.
This heuristic balances diversification with risk contribution; changing it produces materially different risk allocations
severity: high
kind: domain_rule
modality: must_not
consequence: Cluster variance weighting assigns higher weights to lower-variance clusters. Substituting equal weighting
or another heuristic without understanding the implications produces portfolios with different risk concentration that
may underperform in volatile markets
derived_from_bd_id: BD-095
- id: finance-C-168
when: When implementing the Black-Litterman model for portfolio optimization
action: Understand that default_omega calculates uncertainty proportional to market-cap-weighted covariance. Do NOT modify
omega without understanding its role in view weighting; incorrect omega leads to views being over-weighted or under-weighted
relative to market equilibrium
severity: high
kind: domain_rule
modality: must
consequence: Omega determines how much weight each investor view receives versus market equilibrium. Using an incorrectly
specified omega distorts the blended return estimates, causing portfolios to deviate from both the market prior and
the intended tilts
derived_from_bd_id: BD-097
- id: finance-C-169
when: When implementing the Black-Litterman model to derive equilibrium returns
action: Understand that market-implied prior uses reverse optimization from market-cap weights, assuming CAPM validity
and market efficiency. Using historical returns or custom priors requires explicit justification and may produce inconsistent
equilibrium estimates
severity: high
kind: domain_rule
modality: must
consequence: Reverse optimization derives equilibrium returns from market-cap weights using implied risk premiums. Substituting
this with historical returns or custom priors without understanding CAPM assumptions produces equilibrium estimates
that may contradict market wisdom
derived_from_bd_id: BD-098
- id: finance-C-170
when: When using the Critical Line Algorithm (CLA) for cardinality-constrained optimization
action: Verify that the default tolerance of 1e-5 provides sufficient precision for your application. Tighter tolerance
(1e-7) increases computation time but improves precision; looser tolerance (1e-3) is acceptable only for rapid prototyping
severity: medium
kind: operational_lesson
modality: should
consequence: CLA tolerance of 1e-5 balances precision with computation time. Using excessive tolerance may cause convergence
to suboptimal solutions with incorrect weight allocations, while unnecessarily tight tolerance significantly increases
computation without meaningful precision gains
derived_from_bd_id: BD-099
- id: finance-C-171
when: When generating efficient frontier visualization using CLA
action: Verify that 50 frontier points provide sufficient granularity for your analysis. For detailed analysis, use 100+
points; for summary views, 20 points may suffice but may miss curvature details
severity: low
kind: operational_lesson
modality: should
consequence: The default 50 points may miss fine curvature details in complex efficient frontiers with multiple bend points.
Using too few points produces misleading visualizations that obscure optimal risk-return trade-offs
derived_from_bd_id: BD-100
- id: finance-C-172
when: When selecting or configuring portfolio optimization objectives in pypfopt
action: Assume that the risk_parity objective function allocates capital equally or inversely to volatility — risk parity
balances asset contribution to portfolio risk, not capital allocation, producing different weight distributions than
inverse-variance or equal-weight approaches
severity: medium
kind: operational_lesson
modality: must_not
consequence: Misunderstanding risk parity as capital parity leads to incorrect portfolio construction; inverse-variance
weighting or equal-weighting produces materially different allocations that do not balance risk contribution, causing
backtest results to diverge significantly from intended risk-balanced strategy
derived_from_bd_id: BD-104
- id: finance-C-173
when: When configuring multiple tail-risk optimization objectives simultaneously (CVaR and CDaR)
action: Set identical 95% confidence levels across CVaR, CDaR, and other tail-risk measures without de-confliction — differentiated
confidence levels or explicit de-confliction logic should be applied to prevent cascading over-conservatism
severity: high
kind: operational_lesson
modality: must_not
consequence: Optimizing CVaR and CDaR simultaneously at 95% creates cascading effect where portfolios become overly conservative
relative to intended risk budget; users expecting balanced tail-risk management find counterintuitive allocations that
underperform the risk-adjusted target
derived_from_bd_id: BD-107
- id: finance-C-174
when: When implementing Black-Litterman portfolio optimization with high-conviction views
action: Assume that strong investor views will dominate the optimization and significantly deviate from market-cap weights
— three reinforcing tau=0.05 parameters heavily favor equilibrium priors, limiting the optimizer's ability to incorporate
conviction views
severity: high
kind: domain_rule
modality: must_not
consequence: Reinforced 5% tau parameters in Black-Litterman heavily weight market equilibrium over investor views; users
expecting optimizer to reflect strong conviction views find allocations remain close to equilibrium, causing backtest
performance to undermatch the expected alpha from high-conviction views
derived_from_bd_id: BD-110
- id: finance-C-175
when: When implementing or using DiscreteAllocation for portfolio backtesting
action: Assume the framework implements T+N settlement date tracking; the framework treats all trades as same-day settlement,
ignoring actual settlement conventions
severity: high
kind: claim_boundary
modality: must_not
consequence: Ignoring settlement dates causes capital to be incorrectly freed before it is actually available, overstating
capital efficiency and potentially allowing over-trading that would be impossible in live trading
derived_from_bd_id: BD-GAP-008
- id: finance-C-176
when: When implementing capital tracking in DiscreteAllocation
action: 'Track settlement dates for each trade and adjust available capital only after the settlement period has elapsed
(e.g., for T+1: subtract allocated capital on day N but only free it for re-use on day N+1; for T+2: free on day N+2)'
severity: high
kind: domain_rule
modality: must
consequence: Without settlement date tracking, backtested capital efficiency is inflated as funds appear available before
they actually settle, causing live trading failures when attempting to execute trades that exceed available settled
capital
derived_from_bd_id: BD-GAP-008
- id: finance-C-177
when: When implementing or using DiscreteAllocation for portfolio backtesting
action: Assume the framework enforces minimum quantity and tick size constraints — DiscreteAllocation generates share
counts without rounding to valid lot sizes or checking minimum quantity requirements
severity: high
kind: claim_boundary
modality: must_not
consequence: Without tick size/lot size enforcement, DiscreteAllocation may generate share counts that cannot be executed
in live markets (e.g., 73 shares when lot size is 100), causing order rejections and backtest-live divergence
derived_from_bd_id: BD-GAP-009
- id: finance-C-178
when: When implementing share count rounding in DiscreteAllocation
action: Round allocated share quantities to valid lot sizes using round(qty / lot_size) * lot_size and enforce minimum
quantity thresholds, filtering out allocations where qty < min_qty before passing to order execution
severity: high
kind: domain_rule
modality: must
consequence: Without lot size rounding and min_qty enforcement, orders with invalid share counts will be rejected by exchanges,
causing live trading failures and backtest results that cannot be reproduced in production
derived_from_bd_id: BD-GAP-009
- id: finance-C-179
when: When implementing objective addition logic in BaseConvexOptimizer
action: Trigger re-solving or automatic resolution after add_objective() is called — the optimizer must remain in unsolved
state until explicitly solved via a dedicated solve() call
severity: high
kind: domain_rule
modality: must_not
consequence: Auto-re-solving after add_objective introduces stale state issues where objectives are solved with inconsistent
constraint sets, producing non-deterministic optimization results that vary based on call order
derived_from_bd_id: BD-009
- id: finance-C-180
when: When implementing Sharpe ratio optimization in CLA (Critical Line Algorithm)
action: Use golden section search for lambda optimization; do not replace with gradient-dependent methods like Newton's
method, as the Sharpe ratio objective may not be differentiable in all regions
severity: medium
kind: architecture_guardrail
modality: must
consequence: Replacing golden section search with gradient-dependent methods introduces silent failures when the Sharpe
ratio objective function has non-differentiable regions, causing optimization to converge to incorrect solutions
derived_from_bd_id: BD-013
- id: finance-C-181
when: When using the framework's default frequency parameter (252) for annualization in backtesting
action: Verify that frequency=252 matches the actual market trading calendar being backtested; adjust to the correct number
of trading days for non-US markets (e.g., UK ~252-253, HK ~245-250) or pass explicit frequency parameter
severity: high
kind: operational_lesson
modality: must
consequence: Using 252 trading days for non-US markets causes an annualization error of a few percent (about 3% for a
245-day calendar) due to different holiday schedules and trading days, leading to miscalculated Sharpe ratios and expected
returns that do not match actual market performance
derived_from_bd_id: BD-024
- id: finance-C-182
when: When implementing custom objectives using the strategy pattern for portfolio optimization
action: Assume that custom objectives can be incrementally modified after initial solve - the optimizer does not support
re-solving with modified objectives; design custom objectives to be self-contained and complete on first instantiation
severity: high
kind: operational_lesson
modality: must_not
consequence: Attempting to modify custom objectives after initial solve causes silent failures where the optimizer does
not re-compute, leading to backtest results that do not reflect the intended objective changes and creating inconsistency
between backtest and live trading
derived_from_bd_id: BD-111
- id: finance-C-184
when: When configuring CVaR or CDaR risk estimation parameters
action: Verify that beta=0.95 (95% confidence, 5% tail) matches the target risk management standards; 99% confidence captures
rarer tail events but requires more data and has higher estimation variance; adjust beta to align with regulatory requirements
or internal risk policies if 5% tail is insufficient
severity: medium
kind: operational_lesson
modality: should
consequence: Default 95% confidence level measures only the 5% worst-case losses; strategies requiring coverage of rarer
tail events (1% worst cases) will systematically underestimate extreme drawdown risk, leading to inadequate capital
reserves in live trading
derived_from_bd_id: BD-027
- id: finance-C-185
when: When configuring quadratic utility optimization with risk aversion parameter
action: Verify that risk_aversion=1.0 (moderate risk tolerance) aligns with the target risk-return preferences; risk_aversion=2.0
represents typical market price of risk and produces more conservative portfolios; scale the parameter based on investor
risk tolerance and regulatory constraints
severity: medium
kind: operational_lesson
modality: should
consequence: Default risk_aversion=1.0 produces moderate concentration/dispersion; strategies requiring more conservative
portfolios (higher risk aversion) will systematically take excessive risk, while aggressive strategies (lower risk aversion)
may underexploit return opportunities
derived_from_bd_id: BD-029
- id: finance-C-186
when: When processing portfolios containing thinly traded or gap-prone assets with greedy allocation and default capital
action: 'Implement staleness detection for price data: check for time gaps exceeding expected market hours between consecutive
prices, validate that price changes align with plausible market movements for the asset class, and flag or filter positions
where close price deviates more than 20% from previous close (potential data artifact or stale quote)'
severity: high
kind: operational_lesson
modality: must
consequence: The interaction of forward-filled stale prices (BD-059) flowing through default $10k capital (BD-017) into
greedy allocation (BD-018) creates systematic misallocation for thinly traded assets; stale prices in fast-moving markets
cause orders sized based on outdated values, resulting in over/under-allocation relative to true market conditions
derived_from_bd_id: BD-109
- id: finance-C-187
when: When implementing or documenting return calculations in backtesting
action: Use log_returns=False with simple returns consistently — verify that any documentation referencing 'log returns'
as 'theoretically superior' is corrected to match the actual implementation default, or explicitly document the log/simple
toggle behavior
severity: medium
kind: operational_lesson
modality: must
consequence: The framework defaults to simple returns (log_returns=False) but documentation describes log returns as 'theoretically
superior'; this contradiction causes users to misinterpret backtest results when strategies are designed expecting log-return
calculations
derived_from_bd_id: BD-112
- id: finance-C-188
when: When implementing post-processing logic for portfolio weight cleanup
action: Set weight cleanup cutoff exactly at 1e-4 — weights below this threshold must become zero; do not change to higher
values like 1e-3 as this would eliminate meaningful small-cap exposures below 0.1% allocation
severity: high
kind: domain_rule
modality: must
consequence: Raising the cutoff above 1e-4 eliminates small-cap positions that may be economically significant over time;
while 0.01% seems small, cumulative returns from small positions add up in long-running strategies
derived_from_bd_id: BD-025
- id: finance-C-189
when: When implementing discrete allocation from continuous portfolio weights
action: Auto-derive short_ratio from the sum of absolute negative weights when not explicitly specified — compute short_ratio
= sum(|w|) over all weights w < 0; do not use fixed defaults or manual specification without preserving the original short exposure
severity: high
kind: domain_rule
modality: must
consequence: Manual short_ratio specification or fixed defaults can diverge from the original long/short portfolio's intended
exposure; allocating shares without matching short exposure creates unintended leverage or risk imbalance in live trading
derived_from_bd_id: BD-034
- id: finance-C-190
when: When implementing portfolio optimization solver result validation
action: Accept both 'optimal' and 'optimal_inaccurate' solver statuses as valid results — do not reject 'optimal_inaccurate'
solutions; only fail when status is neither 'optimal' nor 'optimal_inaccurate'
severity: high
kind: domain_rule
modality: must
consequence: Strictly accepting only 'optimal' status causes optimization failures for ill-conditioned problems where
near-optimal solutions are numerically unavoidable; rejecting these usable results forces strategies to fail when mathematically
sound portfolios exist
derived_from_bd_id: BD-044
- id: finance-C-191
when: When using the framework's default L2 regularization gamma parameter for portfolio optimization
action: Verify that gamma=1 matches the actual optimization requirements for the portfolio; adjust gamma if asset correlation
structure differs significantly from typical institutional portfolios
severity: medium
kind: operational_lesson
modality: should
consequence: Gamma=1 produces moderate regularization that may over-regularize low-correlation portfolios or under-regularize
highly-correlated ones, leading to suboptimal risk-adjusted returns compared to properly tuned gamma values
derived_from_bd_id: BD-041
- id: finance-C-192
when: When using the framework's default transaction cost parameter k for portfolio optimization
action: Verify that k=0.001 (10 basis points) matches actual round-trip transaction costs for the target execution venue;
adjust to 0.0025 for high-impact retail trades or 0.0005 for high-liquidity ETFs
severity: high
kind: domain_rule
modality: must
consequence: Default k=0.001 assumes institutional execution costs; using this for retail accounts with higher commissions
and spreads causes backtest to underestimate true trading costs, producing artificially optimistic backtest results
derived_from_bd_id: BD-042
- id: finance-C-193
when: When fixing non-positive-semidefinite covariance matrices in portfolio risk estimation
action: Verify the covariance matrix is near-PSD before applying spectral (eigenvalue clipping) method; for severely non-PSD
matrices from illiquid assets or sparse data, use Higham's algorithm instead
severity: high
kind: domain_rule
modality: must
consequence: Spectral method clips negative eigenvalues which may introduce significant pricing errors for severely non-PSD
matrices, causing the optimizer to produce unstable portfolios that perform poorly in live trading
derived_from_bd_id: BD-046
- id: finance-C-194
when: When calling the minimum variance optimization method
action: Pass expected returns to the min_volatility optimization method — min_volatility only accepts covariance matrix
and does not support expected returns inputs
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Passing expected returns to min_volatility causes framework errors since the method only uses covariance
matrix; backtesting pipeline will fail at runtime
derived_from_bd_id: BD-050
- id: finance-C-195
when: When implementing portfolio optimization objectives in backtesting
action: Include transaction cost parameters (fixed and/or proportional costs) in portfolio optimization objectives — the
cost model is essential for realistic portfolio implementation and prevents trivial rebalancing to theoretical optimums
severity: high
kind: domain_rule
modality: must
consequence: Ignoring transaction costs causes backtests to assume frictionless trading, systematically overestimating
returns by 1-5% per rebalance cycle and creating significant live trading discrepancy
derived_from_bd_id: BD-086
- id: finance-C-196
when: When optimizing benchmark-aware portfolio strategies
action: Enforce tracking error bounds — the target tracking error constrains how far portfolio weights can deviate from
benchmark weights and controls active risk
severity: high
kind: domain_rule
modality: must
consequence: Removing tracking error constraints allows unrestricted deviation from benchmark, potentially causing the
portfolio to behave like an unconstrained strategy instead of an index-aware active fund, leading to regulatory and
client mandate violations
derived_from_bd_id: BD-087
- id: finance-C-197
when: When estimating expected returns using CAPM model without explicit market index
action: Verify that equal-weighted average of asset returns is an acceptable market proxy for the strategy — this assumes
each asset contributes equally to market movement, which is theoretically inferior to value-weighted indices
severity: medium
kind: operational_lesson
modality: should
consequence: Equal-weighted market proxy systematically overweights small-cap assets and underweights large-cap assets
compared to cap-weighted indices like S&P 500, leading to biased expected return estimates that cause portfolio allocations
to diverge from optimal value-weighted benchmarks
derived_from_bd_id: BD-071
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-093 / Risk Model Comparison Analysis
version: v5.3
intent_keywords:
- risk model comparison
- covariance estimation methods
- portfolio risk analysis
- Ledoit-Wolf shrinkage
- risk matrix computation
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: auto-grouped by UC.type (3 distinct values, balanced distribution)
groups:
- group_id: research_analysis
name: Research Analysis
description: ''
emoji: 📦
uc_count: 1
ucs:
- uc_id: UC-101
name: Risk Model Comparison Analysis
short_description: Compares multiple covariance estimation methods (sample, semicovariance, exponential, Ledoit-Wolf
variants, oracle approximating) to evaluate which pr
sample_triggers:
- risk model comparison
- covariance estimation methods
- portfolio risk analysis
- group_id: trading_strategy
name: Trading Strategy
description: ''
emoji: 📦
uc_count: 4
ucs:
- uc_id: UC-102
name: Basic Mean-Variance Optimization
short_description: Constructs a minimum volatility portfolio using mean-variance optimization with CAPM-based expected
returns and compares sample covariance vs Ledoit-W
sample_triggers:
- mean-variance optimization
- minimum volatility portfolio
- Efficient Frontier
- uc_id: UC-103
name: Mean-Variance Optimization with Transaction Costs
short_description: Implements advanced mean-variance optimization that accounts for broker transaction costs when
rebalancing from an initial portfolio allocation, using
sample_triggers:
- transaction cost optimization
- portfolio rebalancing
- semicovariance risk
- uc_id: UC-104
name: Black-Litterman Bayesian Portfolio Allocation
short_description: Combines market equilibrium prior returns with investor views using the Black-Litterman model
to generate more realistic expected returns and construc
sample_triggers:
- Black-Litterman model
- investor views integration
- market implied prior returns
- uc_id: UC-105
name: Hierarchical Risk Parity Portfolio
short_description: Constructs a diversified portfolio using Hierarchical Risk Parity (HRP), which uses clustering/dendrogram
analysis to group correlated assets and allo
sample_triggers:
- Hierarchical Risk Parity
- HRP optimization
- dendrogram clustering
- group_id: extension_example
name: Extension Example
description: ''
emoji: 📦
uc_count: 1
ucs:
- uc_id: UC-106
name: PyPortfolioOpt Documentation Build Configuration
short_description: Sphinx documentation configuration file for building PyPortfolioOpt package documentation, not
a functional portfolio strategy
sample_triggers:
- documentation setup
- Sphinx configuration
- package documentation
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-101
beginner_prompt: Try risk model comparison analysis
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try basic mean-variance optimization
auto_selected: true
- uc_id: UC-103
beginner_prompt: Try mean-variance optimization with transaction costs
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 6 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- Mean-Variance Optimization with Transaction Costs
- Basic Mean-Variance Optimization
- Risk Model Comparison Analysis
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
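Computes technical analysis indicators (RSI, MACD, Bollinger Bands, KAMA, etc.) using the pandas-ta library, with support for multi-market data visualization and custom parameter tuning.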
---
name: pandas-ta-indicators
description: |-
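Computes technical analysis indicators (RSI, MACD, Bollinger Bands, KAMA, etc.) using the pandas-ta library, with support for multi-market data visualization and custom parameter tuning.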
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-122"
compiled_at: "2026-04-22T13:01:00.198579+00:00"
capability_markets: "multi-market"
capability_activities: "technical-analysis"
sop_version: "crystal-compilation-v6.1"
---
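# Pandas-TA Technical Indicators (pandas-ta-indicators)
> Computes technical analysis indicators (RSI, MACD, Bollinger Bands, KAMA, etc.) using the pandas-ta library, with support for multi-market data visualization and custom parameter tuning.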
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (2 total)
### Sphinx Documentation Configuration (`UC-101`)
Configures the Sphinx documentation builder for the Technical Analysis Library, enabling automated generation of API documentation
**Triggers**: documentation, sphinx, config
### Technical Analysis Features Visualization (`UC-102`)
Explores and visualizes various technical analysis indicators (Bollinger Bands, Keltner Channel, Donchian Channel, MACD) on historical price data to u
**Triggers**: visualize, technical indicators, charting
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
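
As an illustration of SL-05, the check below counts how many sizing fields are set on a signal. It is only a sketch: `has_exactly_one_sizing` is a hypothetical helper (not part of ZVT), and it assumes an unset field is represented as `None`.

```python
from typing import Optional


def has_exactly_one_sizing(
    position_pct: Optional[float] = None,
    order_money: Optional[float] = None,
    order_amount: Optional[float] = None,
) -> bool:
    """Return True when exactly one sizing field is set (hypothetical SL-05 check)."""
    provided = [v for v in (position_pct, order_money, order_amount) if v is not None]
    return len(provided) == 1


# One field set: acceptable under SL-05.
ok = has_exactly_one_sizing(position_pct=0.2)
# Two fields set, or none at all: should be rejected before the signal is emitted.
ambiguous = has_exactly_one_sizing(position_pct=0.2, order_money=10_000.0)
empty = has_exactly_one_sizing()
```

A caller would typically halt when the helper returns `False`, matching the "halt" action in the table.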
## Top Anti-Patterns (15 total)
- **`AP-TECHNICAL-ANALYSIS-001`**: C FFI Type Mismatch with Non-float64 Arrays
- **`AP-TECHNICAL-ANALYSIS-002`**: Multidimensional Array Memory Access Violations
- **`AP-TECHNICAL-ANALYSIS-003`**: Ignoring TA_RetCode Error Status from C Calls
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-122. Evidence verify ratio = 72.5% and audit fail total = 34. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
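| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative source (source of truth) | Must read when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | Full set of KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |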
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-122` blueprint at 2026-04-22T13:01:00.198579+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Technical Analysis Features Visualization
  - Sphinx Documentation Configuration
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-109--ta-lib-python (8)
### `AP-TECHNICAL-ANALYSIS-001` — C FFI Type Mismatch with Non-float64 Arrays <sub>(high)</sub>
Passing non-float64 (NPY_DOUBLE) numpy arrays to TA-Lib C functions causes memory corruption or silent incorrect calculations. The C FFI layer expects precisely float64 precision, and type mismatches propagate undetected, producing wrong indicator values that may silently corrupt trading strategies. Root cause is not validating array dtype before the C function call.
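
One way to guard against this is to normalize the dtype before any array reaches the C layer. The sketch below uses only NumPy; `to_float64` is a hypothetical helper name, and where exactly to call it in a pipeline is left as an assumption.

```python
import numpy as np


def to_float64(values) -> np.ndarray:
    """Return `values` as a float64 ndarray; no copy is made if it already is one."""
    return np.asarray(values, dtype=np.float64)


# float32 prices, as often produced by compact on-disk formats
close_f32 = np.array([10.0, 10.5, 11.0], dtype=np.float32)
close_f64 = to_float64(close_f32)

# Integer input (for example, volume) is converted as well.
volume_f64 = to_float64([100, 250, 175])
```

Calling the helper at the boundary, rather than deep inside indicator code, keeps the conversion visible and makes an unexpected dtype easy to log.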
### `AP-TECHNICAL-ANALYSIS-002` — Multidimensional Array Memory Access Violations <sub>(high)</sub>
Passing multidimensional numpy arrays to TA-Lib C functions causes segmentation faults and memory access violations due to incorrect stride calculations. The C layer assumes contiguous 1-dimensional memory layouts, and higher-dimensional inputs break its internal pointer arithmetic, leading to crashes or silent memory corruption.
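
A defensive shape check before the call might look like the following sketch. `require_1d_contiguous` is a hypothetical name, and the 4x3 price matrix is made-up illustration data.

```python
import numpy as np


def require_1d_contiguous(arr: np.ndarray) -> np.ndarray:
    """Reject non-1-D input and return a C-contiguous 1-D array."""
    if arr.ndim != 1:
        raise ValueError(f"expected a 1-D array, got shape {arr.shape}")
    # Copies only when the input is a strided view; otherwise returns it as-is.
    return np.ascontiguousarray(arr)


# Rows are bars, columns are (high, low, close); values are illustrative only.
hlc = np.arange(12, dtype=np.float64).reshape(4, 3)

# A column slice is 1-D but strided, so it is copied into contiguous memory.
close = require_1d_contiguous(hlc[:, 2])
```

Passing the whole `hlc` matrix to the helper raises `ValueError` instead of letting a 2-D buffer reach native code.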
### `AP-TECHNICAL-ANALYSIS-003` — Ignoring TA_RetCode Error Status from C Calls <sub>(high)</sub>
When TA-Lib C functions return non-zero TA_RetCode values (indicating errors like uninitialized library, invalid parameters, or out-of-range inputs), ignoring these codes silently propagates invalid computation results. This leads to incorrect technical indicator values feeding into trading strategies without any warning, potentially causing significant financial loss.
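
The pattern of turning a status code into an exception can be sketched as below. The names `RetCode`, `IndicatorError`, and `check_ret_code` are hypothetical and are not the wrapper's real API; the enum is an illustrative subset that assumes 0 means success, with 1 being the not-initialized code cited elsewhere in this file.

```python
from enum import IntEnum


class RetCode(IntEnum):
    """Illustrative subset of status codes (assumption: 0 means success)."""
    SUCCESS = 0
    LIB_NOT_INITIALIZE = 1


class IndicatorError(RuntimeError):
    """Raised when a native call reports a non-zero status."""


def check_ret_code(code: int, func_name: str) -> None:
    """Raise IndicatorError for any non-zero status instead of ignoring it."""
    if code == RetCode.SUCCESS:
        return
    try:
        label = RetCode(code).name
    except ValueError:
        label = "UNKNOWN"  # code outside the illustrative subset
    raise IndicatorError(f"{func_name} failed with return code {code} ({label})")
```

Every native call site would pass its status through `check_ret_code` before using the output buffers, so a failed call can never feed values into a strategy.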
### `AP-TECHNICAL-ANALYSIS-004` — Mismatched Array Lengths in Multi-Input Functions <sub>(high)</sub>
When calculating indicators that require multiple input arrays (e.g., open, high, low, close, volume), providing arrays of different lengths causes out-of-bounds memory access. TA-Lib iterates assuming identical sizes, and length mismatches produce garbage values or segmentation faults, corrupting the entire indicator output.
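
A length check that runs before any multi-input indicator call is cheap. The sketch below is library-agnostic; `check_equal_lengths` is a hypothetical helper name.

```python
from typing import Mapping, Sequence


def check_equal_lengths(inputs: Mapping[str, Sequence[float]]) -> int:
    """Return the shared length, or raise ValueError naming every input's length."""
    lengths = {name: len(series) for name, series in inputs.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"input length mismatch: {lengths}")
    # An empty mapping has no inputs, so its shared length is reported as 0.
    return next(iter(lengths.values()), 0)


n = check_equal_lengths({
    "high": [11.0, 12.0, 13.0],
    "low": [9.0, 10.0, 11.0],
    "close": [10.0, 11.0, 12.0],
})
```

Including every input's length in the error message makes it clear which series was truncated upstream.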
### `AP-TECHNICAL-ANALYSIS-011` — Stale Cached Outputs Without Invalidation <sub>(medium)</sub>
Caching computed indicator outputs without invalidating when inputs, parameters, or input_names change causes stale results to be returned even when underlying data has changed. This produces incorrect indicator values that silently propagate into trading strategies, leading to wrong signals based on outdated calculations.
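
One way to avoid stale results is to make the cache key depend on everything that affects the output. The sketch below keys on the indicator name, a hash of the input data, and the parameters; `IndicatorCache` and `sma` are hypothetical illustrations, not part of any library discussed here.

```python
import hashlib
from typing import Callable, Dict, List, Sequence, Tuple


class IndicatorCache:
    """Cache indicator outputs keyed on name, input data, and parameters."""

    def __init__(self) -> None:
        self._store: Dict[Tuple, List[float]] = {}
        self.misses = 0  # number of times the indicator was actually computed

    @staticmethod
    def _fingerprint(values: Sequence[float]) -> str:
        # Any change in the data changes the hash, which invalidates the entry.
        return hashlib.sha256(repr(list(values)).encode()).hexdigest()

    def get(self, name: str, func: Callable, values: Sequence[float], **params):
        key = (name, self._fingerprint(values), tuple(sorted(params.items())))
        if key not in self._store:
            self.misses += 1
            self._store[key] = func(values, **params)
        return self._store[key]


def sma(values: Sequence[float], window: int) -> List[float]:
    """Simple moving average over full windows only."""
    return [sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))]


cache = IndicatorCache()
first = cache.get("sma", sma, [1.0, 2.0, 3.0, 4.0], window=2)
again = cache.get("sma", sma, [1.0, 2.0, 3.0, 4.0], window=2)   # cache hit
wider = cache.get("sma", sma, [1.0, 2.0, 3.0, 4.0], window=3)   # new parameters
fresh = cache.get("sma", sma, [1.0, 2.0, 3.0, 5.0], window=2)   # new data
```

Hashing the full input is linear in its length; for very large series a cheaper version token could be substituted, at the cost of having to maintain it correctly.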
### `AP-TECHNICAL-ANALYSIS-012` — Concurrent Access Without Thread-Local State <sub>(high)</sub>
Using shared Function instances across multiple threads without thread-local storage causes race conditions where concurrent threads share state. This leads to data corruption, incorrect results, and non-deterministic indicator values when multiple threads compute indicators simultaneously on the same instance.
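
The standard-library `threading.local` gives each thread its own instance. In the sketch below, `RunningMean` is a deliberately stateful, hypothetical stand-in for an indicator object, and `get_indicator` is likewise an assumed helper name.

```python
import threading


class RunningMean:
    """Stateful stand-in for an indicator object (hypothetical)."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, x: float) -> float:
        self.total += x
        self.count += 1
        return self.total / self.count


_local = threading.local()


def get_indicator() -> RunningMean:
    """Return this thread's private instance, creating it on first use."""
    if not hasattr(_local, "indicator"):
        _local.indicator = RunningMean()
    return _local.indicator


def worker(values, results, slot) -> None:
    indicator = get_indicator()
    last = None
    for v in values:
        last = indicator.update(v)
    results[slot] = last


results = [None, None]
threads = [
    threading.Thread(target=worker, args=([1.0, 2.0, 3.0], results, 0)),
    threading.Thread(target=worker, args=([10.0, 20.0, 30.0], results, 1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because neither thread can see the other's `RunningMean`, each result depends only on that thread's own inputs regardless of scheduling.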
### `AP-TECHNICAL-ANALYSIS-013` — Using Python Lists Instead of NumPy Arrays for Stream Functions <sub>(medium)</sub>
Stream functions require numpy.ndarray inputs due to direct C API access via PyArray_TYPE() and PyArray_FLAGS(). Passing plain Python lists or other sequences causes runtime errors because the C layer cannot access the underlying C arrays. This breaks real-time indicator calculations that expect efficient numpy buffer access.
### `AP-TECHNICAL-ANALYSIS-014` — Library Not Initialized Before C Function Calls <sub>(high)</sub>
Calling TA-Lib C functions without prior library initialization returns TA_RetCode=1 (TA_LIB_NOT_INITIALIZE), so every function call fails. If the return code is not checked, the failure is silent: no indicator output is produced and batch calculation pipelines break. Perform the initialization step explicitly before any function calls.
## finance-bp-122--ta-python (7)
### `AP-TECHNICAL-ANALYSIS-005` — Time-Series Index Reindexing Breaks Alignment <sub>(high)</sub>
Reindexing or resetting the DataFrame/Series index after computing technical indicators breaks temporal alignment with original price data and other features. This causes look-ahead bias, shifts indicator values to incorrect timestamps, and corrupts time-series datasets when used in backtesting or feature engineering pipelines.
### `AP-TECHNICAL-ANALYSIS-006` — NaN/Inf/Zero Propagation Corrupts Indicator Values <sub>(high)</sub>
Failing to clean input data of NaN, infinite values, or zero prices causes cascading corruption through rolling window calculations. Division-by-zero errors on zero prices produce NaN that propagates into all subsequent indicator values, corrupting entire datasets. Invalid values also cause incorrect boolean mask classifications when compared with np.inf directly.
### `AP-TECHNICAL-ANALYSIS-007` — EMA Smoothing Parameter Divergence from TA Standards <sub>(medium)</sub>
Using the pandas default adjust=True in ewm() when implementing EMA-based indicators produces the weight-normalized variant (referred to here as the Yahoo Finance variant) instead of the standard recursive exponential smoothing found in technical analysis textbooks. The two forms differ most in the early bars of a series, which shifts signal thresholds, diverges from widely accepted indicator calculations, and leads to inconsistent trading signals.
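The difference between the two smoothing modes can be checked directly. The following is a minimal sketch with made-up prices and an arbitrary span of 3; it is not taken from either project, and it only compares `adjust=False` against a hand-written recursion.

```python
import pandas as pd

# Illustrative prices only.
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])
span = 3

# Recursive form: ema[t] = alpha * x[t] + (1 - alpha) * ema[t-1], seeded with x[0].
ema_recursive = close.ewm(span=span, adjust=False).mean()

# pandas default: a weighted average of all past values with normalized weights.
ema_adjusted = close.ewm(span=span, adjust=True).mean()

# Hand-rolled recursion to confirm what adjust=False computes.
alpha = 2.0 / (span + 1)
manual = [close.iloc[0]]
for x in close.iloc[1:]:
    manual.append(alpha * x + (1 - alpha) * manual[-1])

# Largest gap between the two variants over this short series.
max_gap = (ema_recursive - ema_adjusted).abs().max()
```

Both variants agree on the first bar and converge as the series grows, so the gap matters mostly for short histories and warm-up periods.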
### `AP-TECHNICAL-ANALYSIS-008` — False Claims: Indicator Calculation as Trading Signal <sub>(high)</sub>
Presenting technical indicator values as real-time trading signals or guaranteed future performance misleads users about the tool's capabilities. The library calculates historical indicators from OHLCV data; claiming these as trading signals leads to improper trading decisions. Backtest results also do not guarantee future performance due to look-ahead bias and market regime changes.
### `AP-TECHNICAL-ANALYSIS-009` — Functional vs OOP API Implementation Divergence <sub>(medium)</sub>
When both functional wrappers (e.g., rsi()) and OOP classes (e.g., RSIIndicator) are provided, diverging implementations produce different indicator values for the same inputs. This causes confusion, test failures, and breaks user code that expects consistent behavior across APIs. The functional wrapper must delegate to the class implementation to ensure equivalence.
### `AP-TECHNICAL-ANALYSIS-010` — Bollinger Bands Using Sample Std Deviation <sub>(medium)</sub>
Using pandas default ddof=1 (sample standard deviation) for Bollinger Bands produces wider bands than John Bollinger's original specification, which uses population standard deviation. This causes overestimation of volatility, incorrect trading signal thresholds, and divergence from the canonical indicator calculation that traders expect.
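To see how much the choice of `ddof` widens the bands, here is a small sketch using illustrative prices, a 5-bar window, and a multiplier of 2. The variable names are placeholders, not identifiers from the library.

```python
import pandas as pd

# Illustrative prices only.
close = pd.Series([20.0, 21.0, 19.0, 22.0, 23.0, 21.0])
window, n_dev = 5, 2

mavg = close.rolling(window).mean()
std_population = close.rolling(window).std(ddof=0)  # divides by N
std_sample = close.rolling(window).std()            # pandas default ddof=1, divides by N - 1

upper_population = mavg + n_dev * std_population
upper_sample = mavg + n_dev * std_sample

# For a full window the sample std is larger by a factor of sqrt(N / (N - 1)).
ratio = (std_sample / std_population).dropna()
```

With a 5-bar window the sample-based bands are about 12% wider; the gap shrinks as the window grows, which is why the discrepancy is easy to miss with the common 20-bar setting.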
### `AP-TECHNICAL-ANALYSIS-015` — Stateful Wrapper Functions Leak State Across Calls <sub>(medium)</sub>
When functional wrapper functions retain internal state between calls, different input series contaminate each other's results through data leakage. This produces incorrect indicator values when the same wrapper function is called sequentially with different data, as cached state from previous calls affects new computations.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-122--ta-python
**Scan date**: 2026-04-22
**Stats**: 5 files, 29 classes, 0 functions, 5 stages
## Modules (5)
- [data_input](components/data_input.md): 6 classes
- [indicator_computation](components/indicator_computation.md): 8 classes
- [feature_aggregation](components/feature_aggregation.md): 8 classes
- [functional_api_layer](components/functional_api_layer.md): 6 classes
- [data_cleanup_utility](components/data_cleanup_utility.md): 1 class
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 135
fatal_constraints_count: 19
non_fatal_constraints_count: 171
use_cases_count: 2
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run `python3 -m pip install zvt`, then run `python3 -m zvt.init_dirs` to initialize the data directories.
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run the recorder first: `python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000` (replace with your target entity IDs).
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run `python3 -m zvt.init_dirs`.
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions (`chmod u+w ~/.zvt`), or set the `ZVT_HOME` environment variable to a writable location.
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **2**
## `KUC-101`
**Source**: `docs/conf.py`
Configures the Sphinx documentation builder for the Technical Analysis Library, enabling automated generation of API documentation.
## `KUC-102`
**Source**: `examples_to_use/visualize_features.ipynb`
Explores and visualizes various technical analysis indicators (Bollinger Bands, Keltner Channel, Donchian Channel, MACD) on historical price data to understand their patterns and behavior.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **9**
## `CW-TECHNICAL-ANALYSIS-001` — Explicit Input Validation Before Computation
**From**: finance-bp-109--ta-lib-python, finance-bp-122--ta-python · **Applicable to**: technical-analysis
Both projects require rigorous pre-computation validation: dtype checking (float64 for C FFI, numeric for pandas), dimension checking (1D arrays for C layer), and length validation. This defensive pattern prevents silent failures and memory corruption. Apply this pattern whenever interfacing with external C libraries or computing indicators on potentially malformed input data.
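A minimal sketch of such a pre-computation check is shown below. `validate_inputs` is a hypothetical helper written for illustration, not a function from either project; it only assumes NumPy.

```python
import numpy as np


def validate_inputs(*arrays: np.ndarray) -> None:
    """Raise before any array reaches a C-level routine (hypothetical helper)."""
    if not arrays:
        raise ValueError("at least one input array is required")
    for i, arr in enumerate(arrays):
        if not isinstance(arr, np.ndarray):
            raise TypeError(f"input {i} is {type(arr).__name__}, expected numpy.ndarray")
        if arr.dtype != np.float64:
            raise TypeError(f"input {i} has dtype {arr.dtype}, expected float64")
        if arr.ndim != 1:
            raise ValueError(f"input {i} has {arr.ndim} dimensions, expected 1")
    # All inputs of a multi-input indicator must have the same length.
    lengths = {arr.shape[0] for arr in arrays}
    if len(lengths) > 1:
        raise ValueError(f"input lengths differ: {sorted(lengths)}")


# Usage: matching float64 vectors pass silently.
high = np.array([2.0, 3.0, 4.0])
low = np.array([1.0, 2.0, 3.0])
validate_inputs(high, low)
```

Raising early keeps the failure in Python, where it carries a readable message, rather than letting a malformed buffer reach code that cannot report the problem.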
## `CW-TECHNICAL-ANALYSIS-002` — Index Preservation Throughout Indicator Pipeline
**From**: finance-bp-109--ta-lib-python, finance-bp-122--ta-python · **Applicable to**: technical-analysis
Preserving the original DataFrame/Series index without reindexing or reset is critical for temporal alignment. When constructing output Series, use index=self._close.index to maintain alignment with price data. This prevents look-ahead bias and ensures downstream features correctly reference their corresponding timestamps.
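As an illustration, the sketch below computes a 3-bar moving average on a raw NumPy array and then reattaches the original index. The data and the choice of `np.convolve` are assumptions made for the example; only the `index=<price series>.index` idea comes from the text above.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=5, freq="D")
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0], index=dates)

# A computation that drops down to raw NumPy loses the index
# and returns fewer values than it received (3 for 5 inputs).
raw = np.convolve(close.to_numpy(), np.ones(3) / 3, mode="valid")

# Pad the warm-up period with NaN and reattach the ORIGINAL index.
padded = np.concatenate([np.full(len(close) - len(raw), np.nan), raw])
sma_aligned = pd.Series(padded, index=close.index, name="sma_3")

# Anti-pattern: a fresh RangeIndex no longer lines up with the prices.
sma_misaligned = pd.Series(raw)

# Aligned output can be joined to prices without shifting any timestamps.
frame = pd.concat([close.rename("close"), sma_aligned], axis=1)
```

Each value in `sma_aligned` sits on the timestamp of the last bar in its window, so the value on the third day uses only the first three closes.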
## `CW-TECHNICAL-ANALYSIS-003` — Data Cleaning Before Indicator Computation
**From**: finance-bp-122--ta-python · **Applicable to**: technical-analysis
Indicators like RSI, MACD, and Bollinger Bands produce incorrect results when fed NaN, inf, or zero values. Remove rows with zero prices (to prevent division-by-zero), filter out infinite values by keeping only entries below exp(709) (about 8.2e307, close to the largest finite float64), and apply dropna to DataFrames before passing them to indicator functions. This keeps invalid values from propagating through rolling window calculations.
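The cleaning steps can be sketched as follows. `clean_prices` is a hypothetical helper, not the library's own `dropna`; comparing the absolute value against the threshold so that negative infinity is also caught is an assumption of this sketch.

```python
import math

import numpy as np
import pandas as pd


def clean_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows whose numeric columns hold NaN, +/-inf, huge, or zero values (hypothetical helper)."""
    out = df.copy()
    numeric = out.select_dtypes(include=np.number).columns
    values = out[numeric]
    # Mask oversized magnitudes (this also catches +inf and -inf) and exact zeros.
    bad = (values.abs() >= math.exp(709)) | (values == 0.0)
    out[numeric] = values.mask(bad)
    # Rows that now contain NaN, or contained NaN to begin with, are removed.
    return out.dropna()


# Usage with illustrative data: only the first and fifth rows are clean.
prices = pd.DataFrame(
    {
        "close": [1.0, 0.0, np.inf, np.nan, 2.0, -np.inf],
        "volume": [10.0, 10.0, 10.0, 10.0, 10.0, 10.0],
    }
)
cleaned = clean_prices(prices)
```

The surviving rows keep their original index labels, so the cleaned frame still lines up with any other data keyed on the same index.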
## `CW-TECHNICAL-ANALYSIS-004` — Error Code Propagation from C to Python Layer
**From**: finance-bp-109--ta-lib-python · **Applicable to**: technical-analysis
Always call _ta_check_success and raise exceptions on non-zero TA_RetCode return values from C function calls. This pattern ensures that errors like uninitialized library, invalid parameters, or out-of-range inputs propagate as proper Python exceptions instead of silently producing garbage values. Never ignore return codes from the underlying C library.
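The shape of this pattern can be sketched in plain Python. `check_success`, `IndicatorError`, and the code table below are hypothetical stand-ins written for illustration; they are not the wrapper's actual `_ta_check_success`, and only codes 0 and 1 are taken from the text above.

```python
# Hypothetical subset of return codes: 0 means success and
# 1 is the TA_LIB_NOT_INITIALIZE code mentioned in this document.
RET_CODE_NAMES = {
    0: "TA_SUCCESS",
    1: "TA_LIB_NOT_INITIALIZE",
}


class IndicatorError(RuntimeError):
    """Raised when a wrapped C call reports a non-zero return code."""


def check_success(function_name: str, ret_code: int) -> None:
    """Turn a non-zero return code into a Python exception."""
    if ret_code == 0:
        return
    name = RET_CODE_NAMES.get(ret_code, "unknown error")
    raise IndicatorError(f"{function_name} failed with error code {ret_code} ({name})")


# Usage: a successful call passes through, a failed one raises.
check_success("SMA", 0)
try:
    check_success("SMA", 1)
except IndicatorError as exc:
    message = str(exc)
```

Calling the check immediately after every wrapped call means a failure surfaces at the call site instead of as a wrong number several stages downstream.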
## `CW-TECHNICAL-ANALYSIS-005` — Thread-Local Storage for Concurrent Indicator Access
**From**: finance-bp-109--ta-lib-python · **Applicable to**: technical-analysis
When the same Function instance may be accessed from multiple threads, use thread-local storage to maintain isolated state per thread. This prevents race conditions, state corruption, and non-deterministic results when concurrent threads compute indicators simultaneously. The pattern is essential for any multi-threaded trading system or async processing pipeline.
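A minimal sketch of per-thread state using the standard library's `threading.local` follows. `ThreadSafeIndicator` is a hypothetical class for illustration and does not mirror the wrapper's real `Function` class.

```python
import threading


class ThreadSafeIndicator:
    """Hypothetical indicator holder whose mutable state is kept per thread."""

    def __init__(self) -> None:
        self._local = threading.local()

    def set_input(self, values: list[float]) -> None:
        # Each thread sees only the values it stored itself.
        self._local.values = list(values)

    def mean(self) -> float:
        values = self._local.values
        return sum(values) / len(values)


shared = ThreadSafeIndicator()
results: dict[str, float] = {}
barrier = threading.Barrier(2)


def worker(name: str, values: list[float]) -> None:
    shared.set_input(values)
    barrier.wait()  # both threads have written before either one reads
    results[name] = shared.mean()


threads = [
    threading.Thread(target=worker, args=("a", [1.0, 2.0, 3.0])),
    threading.Thread(target=worker, args=("b", [10.0, 20.0, 30.0])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The barrier forces the interleaving that would corrupt a plain shared attribute: with ordinary instance state, whichever thread wrote last would determine both results.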
## `CW-TECHNICAL-ANALYSIS-006` — Functional Wrapper Delegates to OOP Implementation
**From**: finance-bp-122--ta-python · **Applicable to**: technical-analysis
Functional wrapper functions like rsi() and ema_indicator() should instantiate the corresponding Indicator class and call its result method, not reimplement logic. This ensures OOP and functional APIs produce identical outputs. Any divergence causes test failures and breaks user code that switches between API styles. Validate equivalence in test suites.
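The delegation pattern can be sketched with a made-up indicator. `RangeIndicator` and `price_range` are hypothetical names chosen for this example and are not part of the library.

```python
import pandas as pd


class RangeIndicator:
    """Hypothetical indicator: rolling high-low range over `window` bars."""

    def __init__(self, high: pd.Series, low: pd.Series, window: int = 3) -> None:
        self._high = high
        self._low = low
        self._window = window

    def price_range(self) -> pd.Series:
        highest = self._high.rolling(self._window).max()
        lowest = self._low.rolling(self._window).min()
        return (highest - lowest).rename("range")


def price_range(high: pd.Series, low: pd.Series, window: int = 3) -> pd.Series:
    """Functional wrapper: builds a fresh instance per call and delegates to it."""
    return RangeIndicator(high=high, low=low, window=window).price_range()


# Equivalence check of the kind a test suite would run.
high = pd.Series([2.0, 3.0, 4.0, 5.0])
low = pd.Series([1.0, 1.0, 2.0, 3.0])
pd.testing.assert_series_equal(price_range(high, low), RangeIndicator(high, low).price_range())
```

Because the wrapper creates a new instance on every call and keeps nothing at module level, it also avoids carrying state from one input series to the next.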
## `CW-TECHNICAL-ANALYSIS-007` — Standard TA Textbook Parameters for EMA Calculations
**From**: finance-bp-122--ta-python · **Applicable to**: technical-analysis
When implementing EMA-based indicators, use adjust=False in pandas ewm() to match standard recursive exponential smoothing from technical analysis textbooks, not the Yahoo Finance variant. Also use ddof=0 for Bollinger Bands standard deviation per the original specification. Deviations produce different signal thresholds that diverge from what traders expect.
## `CW-TECHNICAL-ANALYSIS-008` — Cache Invalidation on Any Input Change
**From**: finance-bp-109--ta-lib-python · **Applicable to**: technical-analysis
Set outputs_valid flag to False whenever inputs, parameters, or input_names change. This pattern prevents returning stale cached outputs when underlying data or parameters have been modified. Implement proper cache invalidation to ensure computed indicators always reflect the current state.
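A minimal sketch of the flag-based invalidation is shown below. `CachedIndicator` and its moving-average computation are hypothetical; only the idea of an `outputs_valid` flag that every setter clears comes from the text above.

```python
class CachedIndicator:
    """Hypothetical indicator that recomputes only after inputs or parameters change."""

    def __init__(self, values: list[float], window: int) -> None:
        self._values = list(values)
        self._window = window
        self._outputs: list[float] = []
        self._outputs_valid = False
        self.compute_count = 0  # exposed only to make the caching visible

    def set_values(self, values: list[float]) -> None:
        self._values = list(values)
        self._outputs_valid = False  # new data invalidates the cache

    def set_window(self, window: int) -> None:
        self._window = window
        self._outputs_valid = False  # new parameters invalidate the cache

    @property
    def outputs(self) -> list[float]:
        if not self._outputs_valid:
            w = self._window
            # Simple moving average over each full window.
            self._outputs = [
                sum(self._values[i - w + 1 : i + 1]) / w
                for i in range(w - 1, len(self._values))
            ]
            self._outputs_valid = True
            self.compute_count += 1
        return self._outputs


# Usage: repeated reads hit the cache, any setter forces a recompute.
indicator = CachedIndicator([1.0, 2.0, 3.0, 4.0], window=2)
first = indicator.outputs
second = indicator.outputs
indicator.set_window(4)
third = indicator.outputs
```

Routing every mutation through a setter that clears the flag is what makes the cache safe; a direct write to the underlying attributes would bypass the invalidation.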
## `CW-TECHNICAL-ANALYSIS-009` — Library Initialization Before First Use
**From**: finance-bp-109--ta-lib-python, finance-bp-122--ta-python · **Applicable to**: technical-analysis
Explicitly initialize the TA-Lib C library before any function calls. Without initialization, all function calls fail with TA_RetCode=1 (TA_LIB_NOT_INITIALIZE). This is a critical setup step that must be performed once before the indicator computation pipeline begins, typically at application startup or when first loading the library.
FILE:references/components/data_cleanup_utility.md
# data_cleanup_utility (1 class)
## `dropna`
`data_cleanup_utility/dropna.py:0`
FILE:references/components/data_input.md
# data_input (6 classes)
## `add_all_ta_features`
`data_input/add-all-ta-features.py:0`
## `add_volume_ta`
`data_input/add-volume-ta.py:0`
## `add_trend_ta`
`data_input/add-trend-ta.py:0`
## `add_momentum_ta`
`data_input/add-momentum-ta.py:0`
## `add_volatility_ta`
`data_input/add-volatility-ta.py:0`
## `Column naming convention`
`data_input/column-naming-convention.py:0`
FILE:references/components/feature_aggregation.md
# feature_aggregation (8 classes)
## `BollingerBands.bollinger_mavg`
`feature_aggregation/bollingerbands-bollinger-mavg.py:0`
## `BollingerBands.bollinger_hband/bollinger_lband`
`feature_aggregation/bollingerbands-bollinger-hband-bollinger.py:0`
## `BollingerBands.bollinger_pband`
`feature_aggregation/bollingerbands-bollinger-pband.py:0`
## `MACD.macd`
`feature_aggregation/macd-macd.py:0`
## `MACD.macd_signal`
`feature_aggregation/macd-macd-signal.py:0`
## `MACD.macd_diff`
`feature_aggregation/macd-macd-diff.py:0`
## `StochasticOscillator.stoch/stoch_signal`
`feature_aggregation/stochasticoscillator-stoch-stoch-signal.py:0`
## `Derived output selection`
`feature_aggregation/derived-output-selection.py:0`
FILE:references/components/functional_api_layer.md
# functional_api_layer (6 classes)
## `rsi`
`functional_api_layer/rsi.py:0`
## `macd`
`functional_api_layer/macd.py:0`
## `bollinger_mavg`
`functional_api_layer/bollinger-mavg.py:0`
## `money_flow_index`
`functional_api_layer/money-flow-index.py:0`
## `on_balance_volume`
`functional_api_layer/on-balance-volume.py:0`
## `API style`
`functional_api_layer/api-style.py:0`
FILE:references/components/indicator_computation.md
# indicator_computation (8 classes)
## `RSIIndicator.rsi`
`indicator_computation/rsiindicator-rsi.py:0`
## `MACD.macd/macd_signal/macd_diff`
`indicator_computation/macd-macd-macd-signal-macd-diff.py:0`
## `BollingerBands.bollinger_mavg/hband/lband/wband/pband/hband_indicator/lband_indicator`
`indicator_computation/bollingerbands-bollinger-mavg-hband-lban.py:0`
## `MFIIndicator.money_flow_index`
`indicator_computation/mfiindicator-money-flow-index.py:0`
## `IndicatorMixin._check_fillna`
`indicator_computation/indicatormixin-check-fillna.py:0`
## `Fill strategy`
`indicator_computation/fill-strategy.py:0`
## `Rolling window minimum periods`
`indicator_computation/rolling-window-minimum-periods.py:0`
## `EMA smoothing method`
`indicator_computation/ema-smoothing-method.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-122-v5.3
version: v6.1
blueprint_id: finance-bp-122
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:01:00.198579+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- multi-market
activities:
- technical-analysis
upgraded_from: finance-bp-122-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:33.510071+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-122--ta-python/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-122--ta-python/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-TECHNICAL-ANALYSIS-001
title: C FFI Type Mismatch with Non-float64 Arrays
description: Passing non-float64 (NPY_DOUBLE) numpy arrays to TA-Lib C functions causes memory corruption or silent incorrect
calculations. The C FFI layer expects precisely float64 precision, and type mismatches propagate undetected, producing
wrong indicator values that may silently corrupt trading strategies. Root cause is not validating array dtype before the
C function call.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-002
title: Multidimensional Array Memory Access Violations
description: Passing multidimensional numpy arrays to TA-Lib C functions causes segmentation faults and memory access violations
due to incorrect stride calculations. The C layer assumes contiguous 1-dimensional memory layouts, and higher-dimensional
inputs break its internal pointer arithmetic, leading to crashes or silent memory corruption.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-003
title: Ignoring TA_RetCode Error Status from C Calls
description: When TA-Lib C functions return non-zero TA_RetCode values (indicating errors like uninitialized library, invalid
parameters, or out-of-range inputs), ignoring these codes silently propagates invalid computation results. This leads
to incorrect technical indicator values feeding into trading strategies without any warning, potentially causing significant
financial loss.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-004
title: Mismatched Array Lengths in Multi-Input Functions
description: When calculating indicators that require multiple input arrays (e.g., open, high, low, close, volume), providing
arrays of different lengths causes out-of-bounds memory access. TA-Lib iterates assuming identical sizes, and length mismatches
produce garbage values or segmentation faults, corrupting the entire indicator output.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-005
title: Time-Series Index Reindexing Breaks Alignment
description: Reindexing or resetting the DataFrame/Series index after computing technical indicators breaks temporal alignment
with original price data and other features. This causes look-ahead bias, shifts indicator values to incorrect timestamps,
and corrupts time-series datasets when used in backtesting or feature engineering pipelines.
project_source: finance-bp-122--ta-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-006
title: NaN/Inf/Zero Propagation Corrupts Indicator Values
description: Failing to clean input data of NaN, infinite values, or zero prices causes cascading corruption through rolling
window calculations. Division-by-zero errors on zero prices produce NaN that propagates into all subsequent indicator
values, corrupting entire datasets. Invalid values also cause incorrect boolean mask classifications when compared with
np.inf directly.
project_source: finance-bp-122--ta-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-007
title: EMA Smoothing Parameter Divergence from TA Standards
description: Using pandas adjust=True (the default) for ewm() when implementing EMA-based indicators produces Yahoo Finance
variant smoothing instead of standard recursive exponential smoothing per technical analysis textbooks. This causes different
signal thresholds and divergence from widely-accepted indicator calculations, leading to inconsistent trading signals.
project_source: finance-bp-122--ta-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-008
title: 'False Claims: Indicator Calculation as Trading Signal'
description: Presenting technical indicator values as real-time trading signals or guaranteed future performance misleads
users about the tool's capabilities. The library calculates historical indicators from OHLCV data; claiming these as trading
signals leads to improper trading decisions. Backtest results also do not guarantee future performance due to look-ahead
bias and market regime changes.
project_source: finance-bp-122--ta-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-009
title: Functional vs OOP API Implementation Divergence
description: When both functional wrappers (e.g., rsi()) and OOP classes (e.g., RSIIndicator) are provided, diverging implementations
produce different indicator values for the same inputs. This causes confusion, test failures, and breaks user code that
expects consistent behavior across APIs. The functional wrapper must delegate to the class implementation to ensure equivalence.
project_source: finance-bp-122--ta-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-010
title: Bollinger Bands Using Sample Std Deviation
description: Using pandas default ddof=1 (sample standard deviation) for Bollinger Bands produces wider bands than John
Bollinger's original specification, which uses population standard deviation. This causes overestimation of volatility,
incorrect trading signal thresholds, and divergence from the canonical indicator calculation that traders expect.
project_source: finance-bp-122--ta-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-011
title: Stale Cached Outputs Without Invalidation
description: Caching computed indicator outputs without invalidating when inputs, parameters, or input_names change causes
stale results to be returned even when underlying data has changed. This produces incorrect indicator values that silently
propagate into trading strategies, leading to wrong signals based on outdated calculations.
project_source: finance-bp-109--ta-lib-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-012
title: Concurrent Access Without Thread-Local State
description: Using shared Function instances across multiple threads without thread-local storage causes race conditions
where concurrent threads share state. This leads to data corruption, incorrect results, and non-deterministic indicator
values when multiple threads compute indicators simultaneously on the same instance.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-013
title: Using Python Lists Instead of NumPy Arrays for Stream Functions
description: Stream functions require numpy.ndarray inputs due to direct C API access via PyArray_TYPE() and PyArray_FLAGS().
Passing plain Python lists or other sequences causes runtime errors because the C layer cannot access the underlying C
arrays. This breaks real-time indicator calculations that expect efficient numpy buffer access.
project_source: finance-bp-109--ta-lib-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-014
title: Library Not Initialized Before C Function Calls
description: Calling TA-Lib C functions without prior library initialization returns TA_RetCode=1 (TA_LIB_NOT_INITIALIZE),
causing all function calls to fail. This is a silent failure mode that produces no output indicators, breaking batch calculation
pipelines unless the initialization step is explicitly performed before any function calls.
project_source: finance-bp-109--ta-lib-python
severity: high
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
- id: AP-TECHNICAL-ANALYSIS-015
title: Stateful Wrapper Functions Leak State Across Calls
description: When functional wrapper functions retain internal state between calls, different input series contaminate each
other's results through data leakage. This produces incorrect indicator values when the same wrapper function is called
sequentially with different data, as cached state from previous calls affects new computations.
project_source: finance-bp-122--ta-python
severity: medium
applicable_to_tags:
markets:
- multi-market
activities:
- technical-analysis
_source_file: anti-patterns/technical-analysis.yaml
cross_project_wisdom:
- wisdom_id: CW-TECHNICAL-ANALYSIS-001
source_project: finance-bp-109--ta-lib-python, finance-bp-122--ta-python
pattern_name: Explicit Input Validation Before Computation
description: 'Both projects require rigorous pre-computation validation: dtype checking (float64 for C FFI, numeric for
pandas), dimension checking (1D arrays for C layer), and length validation. This defensive pattern prevents silent failures
and memory corruption. Apply this pattern whenever interfacing with external C libraries or computing indicators on potentially
malformed input data.'
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-002
source_project: finance-bp-109--ta-lib-python, finance-bp-122--ta-python
pattern_name: Index Preservation Throughout Indicator Pipeline
description: Preserving the original DataFrame/Series index without reindexing or reset is critical for temporal alignment.
When constructing output Series, use index=self._close.index to maintain alignment with price data. This prevents look-ahead
bias and ensures downstream features correctly reference their corresponding timestamps.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-003
source_project: finance-bp-122--ta-python
pattern_name: Data Cleaning Before Indicator Computation
description: Indicators like RSI, MACD, and Bollinger Bands produce incorrect results when fed NaN, inf, or zero values.
Remove rows with zero prices (to prevent division-by-zero), filter out infinite values using the exp(709) threshold as
the maximum float64, and apply dropna to DataFrames before passing to indicator functions. This ensures clean propagation
through rolling window calculations.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-004
source_project: finance-bp-109--ta-lib-python
pattern_name: Error Code Propagation from C to Python Layer
description: Always call _ta_check_success and raise exceptions on non-zero TA_RetCode return values from C function calls.
This pattern ensures that errors like uninitialized library, invalid parameters, or out-of-range inputs propagate as proper
Python exceptions instead of silently producing garbage values. Never ignore return codes from the underlying C library.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-005
source_project: finance-bp-109--ta-lib-python
pattern_name: Thread-Local Storage for Concurrent Indicator Access
description: When the same Function instance may be accessed from multiple threads, use thread-local storage to maintain
isolated state per thread. This prevents race conditions, state corruption, and non-deterministic results when concurrent
threads compute indicators simultaneously. The pattern is essential for any multi-threaded trading system or async processing
pipeline.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-006
source_project: finance-bp-122--ta-python
pattern_name: Functional Wrapper Delegates to OOP Implementation
description: Functional wrapper functions like rsi() and ema_indicator() should instantiate the corresponding Indicator
class and call its result method, not reimplement logic. This ensures OOP and functional APIs produce identical outputs.
Any divergence causes test failures and breaks user code that switches between API styles. Validate equivalence in test
suites.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-007
source_project: finance-bp-122--ta-python
pattern_name: Standard TA Textbook Parameters for EMA Calculations
description: When implementing EMA-based indicators, use adjust=False in pandas ewm() to match standard recursive exponential
smoothing from technical analysis textbooks, not the Yahoo Finance variant. Also use ddof=0 for Bollinger Bands standard
deviation per the original specification. Deviations produce different signal thresholds that diverge from what traders
expect.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-008
source_project: finance-bp-109--ta-lib-python
pattern_name: Cache Invalidation on Any Input Change
description: Set outputs_valid flag to False whenever inputs, parameters, or input_names change. This pattern prevents returning
stale cached outputs when underlying data or parameters have been modified. Implement proper cache invalidation to ensure
computed indicators always reflect the current state.
applicable_to_activity: technical-analysis
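# Illustrative sketch for CW-TECHNICAL-ANALYSIS-008. CachedIndicator is a hypothetical class, not the
# TA-Lib Function implementation; only the outputs_valid flag name comes from the description above.
#
#   class CachedIndicator:
#       def __init__(self, compute):
#           self._compute = compute
#           self._inputs = None
#           self._outputs = None
#           self.outputs_valid = False
#
#       def set_inputs(self, inputs):
#           self._inputs = inputs
#           self.outputs_valid = False  # any input change invalidates the cache
#
#       @property
#       def outputs(self):
#           if not self.outputs_valid:  # recompute only when the cache is stale
#               self._outputs = self._compute(self._inputs)
#               self.outputs_valid = True
#           return self._outputs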
_source_file: cross-project-wisdom/technical-analysis.yaml
- wisdom_id: CW-TECHNICAL-ANALYSIS-009
source_project: finance-bp-109--ta-lib-python, finance-bp-122--ta-python
pattern_name: Library Initialization Before First Use
description: Explicitly initialize the TA-Lib C library before any function calls. Without initialization, all function
calls fail with TA_RetCode=1 (TA_LIB_NOT_INITIALIZE). This is a critical setup step that must be performed once before
the indicator computation pipeline begins, typically at application startup or when first loading the library.
applicable_to_activity: technical-analysis
_source_file: cross-project-wisdom/technical-analysis.yaml
domain_constraints_injected: []
resources_injected: {}
known_use_cases:
- kuc_id: KUC-101
source_file: docs/conf.py
business_problem: Configures the Sphinx documentation builder for the Technical Analysis Library, enabling automated generation
of API documentation.
intent_keywords:
- documentation
- sphinx
- config
- api docs
stage: reporting
data_domain: mixed
type: reporting
- kuc_id: KUC-102
source_file: examples_to_use/visualize_features.ipynb
business_problem: Explores and visualizes various technical analysis indicators (Bollinger Bands, Keltner Channel, Donchian
Channel, MACD) on historical price data to understand their patterns and behavior.
intent_keywords:
- visualize
- technical indicators
- charting
- feature exploration
- research
stage: factor_computation
data_domain: market_data
type: research_analysis
component_capability_map:
project: finance-bp-122--ta-python
scan_date: '2026-04-22'
stats:
total_files: 5
total_classes: 29
total_functions: 0
total_stages: 5
modules:
data_input:
class_count: 6
stage_id: data_input
stage_order: 1
responsibility: Accept raw OHLCV pandas DataFrames and validate column presence. Acts as the entry point for all feature
engineering pipelines, orchestrating the in-place addition of technical indicator columns to the input DataFrame.
classes:
- name: add_all_ta_features
file: data_input/add-all-ta-features.py
line: 0
kind: required_method
signature: ''
- name: add_volume_ta
file: data_input/add-volume-ta.py
line: 0
kind: required_method
signature: ''
- name: add_trend_ta
file: data_input/add-trend-ta.py
line: 0
kind: required_method
signature: ''
- name: add_momentum_ta
file: data_input/add-momentum-ta.py
line: 0
kind: required_method
signature: ''
- name: add_volatility_ta
file: data_input/add-volatility-ta.py
line: 0
kind: required_method
signature: ''
- name: Column naming convention
file: data_input/column-naming-convention.py
line: 0
kind: replaceable_point
design_decision_count: 4
indicator_computation:
class_count: 8
stage_id: indicator_computation
stage_order: 2
responsibility: Compute mathematical transformations of OHLCV data to produce technical indicators. Each indicator class
encapsulates one TA formula with a consistent __init__ → _run → <result> pattern across all 41 indicator classes.
classes:
- name: RSIIndicator.rsi
file: indicator_computation/rsiindicator-rsi.py
line: 0
kind: required_method
signature: ''
- name: MACD.macd/macd_signal/macd_diff
file: indicator_computation/macd-macd-macd-signal-macd-diff.py
line: 0
kind: required_method
signature: ''
- name: BollingerBands.bollinger_mavg/hband/lband/wband/pband/hband_indicator/lband_indicator
file: indicator_computation/bollingerbands-bollinger-mavg-hband-lban.py
line: 0
kind: required_method
signature: ''
- name: MFIIndicator.money_flow_index
file: indicator_computation/mfiindicator-money-flow-index.py
line: 0
kind: required_method
signature: ''
- name: IndicatorMixin._check_fillna
file: indicator_computation/indicatormixin-check-fillna.py
line: 0
kind: required_method
signature: ''
- name: Fill strategy
file: indicator_computation/fill-strategy.py
line: 0
kind: replaceable_point
- name: Rolling window minimum periods
file: indicator_computation/rolling-window-minimum-periods.py
line: 0
kind: replaceable_point
- name: EMA smoothing method
file: indicator_computation/ema-smoothing-method.py
line: 0
kind: replaceable_point
design_decision_count: 7
feature_aggregation:
class_count: 8
stage_id: feature_aggregation
stage_order: 3
responsibility: Combine multiple related indicator outputs into single API calls. Each indicator may produce multiple
derived outputs (e.g., MACD line, signal, histogram) from a single _run() computation to avoid recomputing rolling
windows.
classes:
- name: BollingerBands.bollinger_mavg
file: feature_aggregation/bollingerbands-bollinger-mavg.py
line: 0
kind: required_method
signature: ''
- name: BollingerBands.bollinger_hband/bollinger_lband
file: feature_aggregation/bollingerbands-bollinger-hband-bollinger.py
line: 0
kind: required_method
signature: ''
- name: BollingerBands.bollinger_pband
file: feature_aggregation/bollingerbands-bollinger-pband.py
line: 0
kind: required_method
signature: ''
- name: MACD.macd
file: feature_aggregation/macd-macd.py
line: 0
kind: required_method
signature: ''
- name: MACD.macd_signal
file: feature_aggregation/macd-macd-signal.py
line: 0
kind: required_method
signature: ''
- name: MACD.macd_diff
file: feature_aggregation/macd-macd-diff.py
line: 0
kind: required_method
signature: ''
- name: StochasticOscillator.stoch/stoch_signal
file: feature_aggregation/stochasticoscillator-stoch-stoch-signal.py
line: 0
kind: required_method
signature: ''
- name: Derived output selection
file: feature_aggregation/derived-output-selection.py
line: 0
kind: replaceable_point
design_decision_count: 2
functional_api_layer:
class_count: 6
stage_id: functional_api
stage_order: 4
responsibility: Provide stateless function wrappers for each indicator, enabling both OOP and functional usage patterns.
Functions instantiate classes internally and call single result methods, serving both OOP users needing multiple outputs
and functional users wanting single calls.
classes:
- name: rsi
file: functional_api_layer/rsi.py
line: 0
kind: required_method
signature: ''
- name: macd
file: functional_api_layer/macd.py
line: 0
kind: required_method
signature: ''
- name: bollinger_mavg
file: functional_api_layer/bollinger-mavg.py
line: 0
kind: required_method
signature: ''
- name: money_flow_index
file: functional_api_layer/money-flow-index.py
line: 0
kind: required_method
signature: ''
- name: on_balance_volume
file: functional_api_layer/on-balance-volume.py
line: 0
kind: required_method
signature: ''
- name: API style
file: functional_api_layer/api-style.py
line: 0
kind: replaceable_point
design_decision_count: 2
data_cleanup_utility:
class_count: 1
stage_id: data_cleanup
stage_order: 5
responsibility: Remove rows with invalid numeric values (NaN, inf, zero) that would corrupt indicator calculations.
Acts as data quality gate before and after indicator computation.
classes:
- name: dropna
file: data_cleanup_utility/dropna.py
line: 0
kind: required_method
signature: ''
design_decision_count: 2
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.7254901960784313
evidence_invalid: 28
evidence_verified: 74
evidence_auto_fixed: 0
audit_coverage: 47/47 (100%)
audit_pass_rate: 1/47 (2%)
audit_fail_total: 34
audit_finance_universal:
pass: 1
warn: 5
fail: 14
audit_subdomain_totals:
pass: 0
warn: 7
fail: 20
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-122. Evidence verify ratio
= 72.5% and audit fail total = 34. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-122-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt, then run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc:
- UC-101
- UC-102
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries:
- uc_id: UC-101
name: Sphinx Documentation Configuration
positive_terms:
- documentation
- sphinx
- config
- api docs
data_domain: mixed
negative_terms:
- trading
- screening
- backtesting
- indicators
ambiguity_question: Are you looking to build documentation or to implement a trading strategy?
- uc_id: UC-102
name: Technical Analysis Features Visualization
positive_terms:
- visualize
- technical indicators
- charting
- feature exploration
- research
data_domain: market_data
negative_terms:
- backtesting
- live trading
- screening
- signal generation
ambiguity_question: Do you want to explore and visualize indicator behavior, or do you need automated screening/trading
signals?
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
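# Illustrative arithmetic for SL-02, assuming a daily level whose level.to_second() returns 86400
# and ignoring trading-calendar gaps. The date is a made-up example.
#
#   from datetime import datetime, timedelta
#
#   happen_timestamp = datetime(2024, 1, 2)
#   # The signal becomes executable one full bar after it was generated.
#   due_timestamp = happen_timestamp + timedelta(seconds=86400)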
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 135
fatal_constraints_count: 19
non_fatal_constraints_count: 171
use_cases_count: 2
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'SL-11: a Recorder subclass MUST declare both the provider and data_schema class attributes; otherwise
initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9) — chose standard Appel parameters not adaptive because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering not current-only because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose drawer_rects subclass override for custom annotations not hardcoded markers — allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 28 source groups: algorithm_selection(2),
data_cleanup(1), data_input(25), default_value(6), feature_aggregation(1), functional_api(1), and 22 more.'
key_decisions: 135 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-041
type: B/BA
summary: Keltner original_version=True uses SMA of typical price
- id: BD-GAP-005
type: B/BA
summary: Volume moving average uses Simple Moving Average (SMA) with 20-period default rather than exponential weighting,
prioritizing interpretability over recency sensitivity
- id: BD-042
type: B/DK
summary: DropNA keeps only values < exp(709) and != 0.0, then calls dropna()
- id: BD-GAP-007
type: DK
summary: 'Missing: as-of vs processing time'
- id: BD-GAP-008
type: DK
summary: 'Missing: Trading calendar vs natural calendar'
- id: BD-GAP-009
type: DK
summary: 'Missing: Timezone explicit annotation + UTC'
- id: BD-GAP-010
type: RC
summary: 'Missing: float vs Decimal for currency'
- id: BD-GAP-011
type: DK
summary: 'Missing: Stale data detection and expiry'
- id: BD-GAP-012
type: B
summary: 'Missing: PnL conservation (realized + unrealized)'
- id: BD-GAP-013
type: B
summary: 'Missing: Train/test time split integrity'
- id: BD-GAP-014
type: DK
summary: 'Missing: Random seed full coverage'
- id: BD-GAP-015
type: DK
summary: 'Missing: Model and data version snapshot binding'
- id: BD-GAP-016
type: B
summary: 'Missing: Immutable event log'
- id: BD-GAP-017
type: RC
summary: 'Missing: Settlement and delivery time convention'
- id: BD-GAP-018
type: RC
summary: 'Missing: Price and quantity precision (tick/lot)'
- id: BD-GAP-019
type: B
summary: 'Missing: Cost Model Completeness'
- id: BD-GAP-020
type: B
summary: 'Missing: Funding Cost Modeling (Carry)'
- id: BD-GAP-021
type: M/DK
summary: 'Missing: Day Count and Compounding Conventions'
- id: BD-GAP-022
type: M
summary: 'Missing: Transition Matrix Time Homogeneity'
- id: BD-GAP-023
type: DK
summary: 'Missing: Versioned Writes & Snapshot Semantics'
- id: BD-GAP-024
type: B
summary: 'Missing: Provider Priority & Credential Isolation'
- id: BD-GAP-025
type: B
summary: 'Missing: type'
- id: BD-GAP-026
type: B
summary: 'Missing: Implied Volatility Solver'
- id: BD-GAP-027
type: B
summary: 'Missing: Finite Difference Grid Stability (CFL)'
- id: BD-GAP-028
type: B
summary: 'Missing: Arbitrage-Free Constraint Verification'
- id: BD-GAP-029
type: B
summary: 'Missing: PD/LGD/EAD Estimation Methods'
- id: BD-GAP-030
type: B
summary: 'Missing: NPL Portfolio EBA Field Completeness'
- id: BD-GAP-031
type: B
summary: 'Missing: type'
- id: BD-082
type: BA
summary: '_fillna=False class-level default in IndicatorMixin encodes business rule: NaN handling is opt-in'
- id: BD-083
type: BA
summary: 'min_periods logic tied to _fillna: 0 if fillna else window — affects first valid index position'
- id: BD-084
type: BA/DK
summary: 'Fill value by indicator type encodes semantic meaning: 50 for oscillators, 0 for momentum, 20 for ADX'
- id: BD-092
type: B/BA
summary: CumulativeReturnIndicator uses first price as base; cannot recompute after data updates
- id: BD-093
type: BA/DK
summary: PSAR step=0.02, max_step=0.20 defaults encode specific market assumptions
- id: BD-094
type: BA/DK
summary: MACD defaults (fast=12, slow=26, signal=9) encode standard market cycle assumptions
- id: BD-GAP-001
type: B
summary: Volume features use On-Balance Volume (OBV) with cumulative signed volume logic rather than raw volume totals,
enabling direction-aware momentum tracking
- id: BD-GAP-004
type: BA/M
summary: Feature aggregation combines volume, momentum, and volatility indicators into a single normalized output dictionary,
allowing downstream consumers to select desired features without code changes
- id: BD-097
type: BA/DK
summary: 'INTERACTION: [BD-001] × [BD-044] × [BD-046] → Canonical RSI Implementation (Wilder equivalence)'
- id: BD-098
type: B/BA
summary: 'INTERACTION: [BD-082] × [BD-083] × [BD-033] → Coupled NaN propagation system with boundary conditions'
- id: BD-099
type: BA
summary: 'INTERACTION: [BD-090] × [BD-070] × [BD-030] → Keltner Channel dependency on internal ATR with Wilder smoothing'
- id: BD-100
type: B
summary: 'INTERACTION: [BD-034/BD-081] × [BD-070,BD-065,BD-067,BD-048] → True Range single point of failure cascades'
- id: BD-101
type: B
summary: 'INTERACTION: [BD-086] × [BD-009/BD-010] × [BD-051] → shift(1) hardcoded conflicts with multi-period return
definitions'
- id: BD-102
type: B/BA
summary: 'INTERACTION: [BD-003] × [BD-089] → Bollinger Bands ddof=0 conflicts with pandas default sample std'
- id: BD-103
type: B/BA
summary: 'INTERACTION: [BD-087] × [BD-044] × [BD-058] → adjust=False creates pandas ewm incompatibility'
- id: BD-104
type: BA
summary: 'INTERACTION: [BD-091] × [BD-004] × [BD-050] → KAMA np.roll() boundary issue corrupts Efficiency Ratio calculation'
- id: BD-001
type: B/BA
summary: RSI default window=14, EMA-based calculation using alpha=1/window
- id: BD-002
type: B/BA
summary: 'MACD default windows: slow=26, fast=12, signal=9'
- id: BD-003
type: B/BA
summary: 'Bollinger Bands default: window=20, window_dev=2, ddof=0 (population std)'
- id: BD-004
type: B/BA
summary: 'KAMA default: window=10, pow1=2, pow2=30 for efficiency ratio'
- id: BD-005
type: B/BA
summary: CCI default constant=0.015 (the conventional CCI scaling constant)
- id: BD-006
type: B/BA
summary: 'PSAR default: step=0.02, max_step=0.20 for acceleration factor'
- id: BD-007
type: B/BA
summary: NVI initial value starts at 1000 (cumulative base)
- id: BD-013
type: B/DK
summary: TRIX triple EMA smoothing then ROC on third EMA
- id: BD-014
type: B/BA
summary: Mass Index = sum over slow window of EMA(amplitude) / EMA(EMA(amplitude))
- id: BD-015
type: B/BA
summary: 'Ichimoku defaults: window1=9 (conversion), window2=26 (base), window3=52 (span B)'
- id: BD-016
type: B
summary: 'KST: 4 ROC periods (10,15,20,30) weighted 1:2:3:4, smoothed with SMA'
- id: BD-017
type: B/DK
summary: DPO shift = int((window/2) + 1) to detrend price
- id: BD-018
type: B/DK
summary: 'Ultimate Oscillator weights: 4:2:1 (window1:window2:window3)'
- id: BD-019
type: B/BA
summary: Williams %R multiplied by -100 (inverted negative scale)
- id: BD-020
type: B/DK
summary: Stochastic Oscillator smooth_window=3 for %D signal line
- id: BD-021
type: B/DK
summary: 'TSI double EMA smoothing: window_slow=25, window_fast=13'
- id: BD-022
type: B/BA
summary: Force Index default window=13 (EMA smoothing)
- id: BD-023
type: B/BA
summary: 'VPT: cumulative (close_pct_change * volume), no default smoothing'
- id: BD-024
type: B/BA
summary: VWAP uses rolling window (default 14) for cumulative calculation
- id: BD-025
type: B/BA
summary: ROC default window=12 periods
- id: BD-026
type: B/BA
summary: 'Awesome Oscillator: window1=5 (fast SMA), window2=34 (slow SMA) on median price'
- id: BD-027
type: B/BA
summary: Aroon default window=25 periods
- id: BD-028
type: B/BA
summary: Vortex Indicator default window=14 periods
- id: BD-029
type: B/DK
summary: Donchian Channel middle band = (high + low) / 2 of rolling window
- id: BD-030
type: B/BA
summary: Keltner Channel multiplier=2 (ATR-based bands)
- id: BD-031
type: B/BA
summary: 'Ulcer Index: Rolling max period default=14 for drawdown calculation'
- id: BD-032
type: B/BA
summary: 'STC: window_slow=50, window_fast=23, cycle=10, smooth1=3, smooth2=3'
- id: BD-035
type: B/BA
summary: ADX default window=14 for directional movement smoothing
- id: BD-036
type: B/BA
summary: 'AccDist Index (CLV): ((Close - Low) - (High - Close)) / (High - Low)'
- id: BD-037
type: B/BA
summary: 'OBV: add volume on up day, subtract on down day, cumulative sum'
- id: BD-038
type: B/BA
summary: Chaikin Money Flow default window=20 periods
- id: BD-039
type: B/BA
summary: 'StochRSI: window=14, smooth1=3, smooth2=3 (KAMA-like smoothing)'
- id: BD-040
type: B
summary: 'Percent Price Oscillator (PPO): same as MACD but expressed as percentage'
- id: BD-GAP-002
type: B/BA
summary: Rate of Change (ROC) uses percentage-based calculation with configurable period defaulting to 12 bars, enabling
cross-asset comparability
- id: BD-090
type: T
summary: KeltnerChannel creates internal AverageTrueRange instance when original_version=False
- id: BD-045
type: B/BA
summary: add_all_ta_features expects 76 columns (vectorized) or 94 columns (full)
- id: BD-086
type: B/BA
summary: 'shift(1) hardcoded across 18+ locations: all indicators assume a 1-period lookback'
- id: BD-089
type: B/BA
summary: BollingerBands uses ddof=0 (population std) while pandas defaults to ddof=1 (sample std)
- id: BD-091
type: B
summary: 'np.roll() used in KAMA: wraps first element with last element causing spurious first-value calculation'
- id: BD-033
type: B
summary: 'FillNA: replace inf with NaN, then ffill, then bfill or fillna(value)'
- id: BD-088
type: T
summary: 'add_all_ta_features calls indicators in fixed sequence: volume→volatility→trend→momentum→others'
- id: BD-GAP-006
type: B
summary: Bollinger Band feature extraction returns upper band, middle band, lower band, and bandwidth percentage, enabling
both breakout and mean-reversion strategy support
- id: BD-GAP-003
type: B/BA
summary: Bollinger Bands use 20-period SMA with 2 standard deviation bands as the default configuration, representing
the most widely-used technical analysis parameters
- id: BD-085
type: T
summary: 'Dual-API pattern: every indicator has both Class and Function wrapper APIs'
- id: BD-087
type: B/BA
summary: 'EMA uses adjust=False: traditional recursive smoothing, not the pandas default weighted-average form (adjust=True)'
- id: BD-095
type: T
summary: _true_range static method in IndicatorMixin used by ATR, Ultimate Oscillator, Vortex
- id: BD-096
type: B/BA
summary: NegativeVolumeIndex initializes to 1000 and uses explicit loop, not cumulative sum
- id: BD-011
type: B
summary: Vectorized mode excludes MFI, NVI, ADX, CCI, Aroon, PSAR (non-vectorized only)
- id: BD-012
type: B/BA
summary: Typical Price = (High + Low + Close) / 3 for MFI, VWAP, CCI
- id: BD-034
type: B
summary: True Range = max(High-Low, |High-PrevClose|, |Low-PrevClose|)
- id: BD-009
type: B/BA
summary: 'Daily Return: (close/prev_close) - 1, multiplied by 100 for percentage'
- id: BD-010
type: B/BA
summary: 'Log Return: ln(close) - ln(prev_close), multiplied by 100'
- id: BD-043
type: B/DK
summary: SMA min_periods = 0 if fillna else periods (allows partial calculation)
- id: BD-008
type: B/BA
summary: EoM multiplier = 100000000 (100M) for scale normalization
- id: BD-044
type: B/DK
summary: EMA uses adjust=False for traditional (non-Yahoo Finance) smoothing
- id: BD-046
type: B/DK
summary: RSI uses EMA smoothing with alpha=1/window for up/down direction components
- id: BD-047
type: B/DK
summary: TSI uses double EMA smoothing (recursive ewm) on price difference and absolute difference
- id: BD-048
type: B
summary: Ultimate Oscillator uses weighted average of 3 timeframe BP/TR ratios
- id: BD-049
type: B/DK
summary: Stochastic Oscillator %K uses rolling min/max of high/low over window
- id: BD-050
type: B
summary: KAMA uses Efficiency Ratio (ER) to adapt smoothing constant for each period
- id: BD-051
type: B/BA
summary: 'Rate of Change uses percentage change formula: (current - past) / past * 100'
- id: BD-052
type: B/DK
summary: Awesome Oscillator uses SMA of median price (H+L)/2 with windows 5 and 34
- id: BD-053
type: B/BA
summary: Williams %R uses inverse stochastic formula with -100 scaling
- id: BD-054
type: B/DK
summary: Stochastic RSI applies stochastic normalization to RSI values themselves
- id: BD-055
type: B/BA
summary: Percentage Price Oscillator computes ((fast_ema - slow_ema) / slow_ema) * 100
- id: BD-056
type: B
summary: MACD uses EMA difference (fast minus slow) with signal line as EMA of MACD
- id: BD-059
type: B/DK
summary: 'WMA uses linear weights: weight_i = 2*i/(n*(n+1)) for i from 1 to n'
- id: BD-060
type: B
summary: TRIX applies triple EMA then computes ROC of the triple-smoothed series
- id: BD-061
type: B/BA
summary: Mass Index uses EMA ratio (ema1/ema2) summed over window_slow
- id: BD-062
type: B/BA
summary: Ichimoku Conversion/Baseline uses the midpoint of rolling extremes, (rolling high + rolling low) / 2
- id: BD-063
type: B/BA
summary: KST uses weighted sum of multiple ROC smoothed values with fixed weights 1:2:3:4
- id: BD-064
type: B/DK
summary: Detrended Price Oscillator subtracts the SMA from the close shifted by (window/2 + 1) periods
- id: BD-065
type: B/BA
summary: CCI uses Mean Absolute Deviation (MAD) in denominator with constant=0.015
- id: BD-066
type: B/BA
summary: ADX uses Wilder smoothing (recursive EMA equivalent) for DX averaging
- id: BD-067
type: B/BA
summary: Vortex Indicator uses sum of abs(price movement) divided by true range sum
- id: BD-068
type: B/BA
summary: PSAR uses parabolic formula with acceleration factor step=0.02, max_step=0.20
- id: BD-069
type: B/BA
summary: STC applies stochastic to MACD values, then EMA smooths twice with cycle=10
- id: BD-057
type: B/DK
summary: SMA uses rolling window mean with min_periods control for fillna behavior
- id: BD-058
type: B/RC
summary: EMA uses pandas ewm with span parameter and adjust=False for recursive smoothing with alpha = 2/(span + 1)
- id: BD-081
type: B
summary: 'True Range uses max of three: H-L, |H-prev_C|, |L-prev_C|'
- id: BD-070
type: B/BA
summary: ATR uses Wilder smoothing (recursive formula) for true range averaging
- id: BD-071
type: B/BA
summary: Bollinger Bands use population std (ddof=0) with SMA centerline
- id: BD-072
type: B/BA
summary: Keltner Channel uses SMA of typical price (H+L+C)/3 as centerline by default
- id: BD-073
type: B/BA
summary: Ulcer Index computes RMS of percentage drawdowns from rolling maximum
- id: BD-074
type: B/BA
summary: Accumulation/Distribution uses CLV = ((C-L) - (H-C))/(H-L) multiplied by volume
- id: BD-075
type: B/DK
summary: On-Balance Volume adds volume when close rises, subtracts when close falls
- id: BD-076
type: B/BA
summary: Chaikin Money Flow sums Money Flow Volume (CLV * volume) over window divided by sum of volume over window
- id: BD-077
type: B/DK
summary: Force Index uses (close - prev_close) * volume with EMA smoothing
- id: BD-078
type: B/BA
summary: Ease of Movement uses (H.diff + L.diff)*(H-L)/(2*V) scaled by 1e8 for readability
- id: BD-079
type: B/BA
summary: Money Flow Index applies RSI formula to typical price * volume * direction
- id: BD-080
type: B/BA
summary: VWAP computes sum(typical_price * volume) / sum(volume) over rolling window
resources:
packages:
- name: pandas
version_pin: ==1.5.3
- name: numpy
version_pin: ==1.24.4
- name: matplotlib
version_pin: '>=2'
- name: requests
version_pin: ==2.31.0
- name: scipy
version_pin: '>=1.3.0'
- name: scikit-learn
version_pin: '>1.4.2'
- name: pytest
version_pin: '>=8.3'
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-003
when: When calculating technical indicators
action: preserve the input DataFrame index without reindexing
severity: fatal
kind: domain_rule
modality: must
consequence: Reindexing breaks temporal alignment with other features, causes data corruption in time-series datasets,
and may shift indicator values to incorrect timestamps
stage_ids:
- data_input
- id: finance-C-010
when: When using ta library for financial analysis
action: claim the library provides real-time trading signals or live trading capability
severity: fatal
kind: claim_boundary
modality: must_not
consequence: The library is a feature engineering tool that calculates technical indicators from historical OHLCV data;
presenting these as trading signals could lead to financial losses if users trade without proper risk management
stage_ids:
- data_input
- id: finance-C-011
when: When calculating technical indicators on financial data
action: present backtest results as guaranteed future trading performance
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Historical indicator values do not guarantee future performance; backtest results are affected by look-ahead
bias, survivorship bias, and market regime changes
stage_ids:
- data_input
- id: finance-C-013
when: When computing Bollinger Bands standard deviation
action: use ddof=0 (population standard deviation) for consistency with original Bollinger specification
severity: fatal
kind: domain_rule
modality: must
consequence: Using sample std (ddof=1) would produce wider bands than specified by John Bollinger, causing overestimation
of volatility and incorrect trading signal thresholds
stage_ids:
- indicator_computation
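# Illustrative sketch for finance-C-013, not part of the compiled constraints. It assumes a pandas
# Series of closing prices; the names close, mavg, mstd, hband and lband are hypothetical, and the
# 20-period window with 2 standard deviations follows BD-GAP-003.
#   import pandas as pd
#   close = pd.Series([float(v) for v in range(1, 41)])
#   mavg = close.rolling(20).mean()
#   mstd = close.rolling(20).std(ddof=0)  # population std; pandas defaults to ddof=1
#   hband = mavg + 2 * mstd
#   lband = mavg - 2 * mstd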
- id: finance-C-017
when: When implementing EMA-based indicators requiring traditional exponential smoothing
action: use adjust=False in ewm() to implement standard recursive exponential smoothing per TA textbooks
severity: fatal
kind: domain_rule
modality: must
consequence: Using adjust=True (pandas default) implements Yahoo Finance variant smoothing which produces different results
from standard technical analysis textbooks, causing signal divergence from expected indicator behavior
stage_ids:
- indicator_computation
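# Illustrative sketch for finance-C-017, not part of the compiled constraints. It assumes a pandas
# Series named close (hypothetical) and an arbitrary span of 12.
#   import pandas as pd
#   close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
#   ema = close.ewm(span=12, adjust=False).mean()
#   # With adjust=False pandas applies the recursion
#   #   y[0] = x[0];  y[t] = (1 - alpha) * y[t-1] + alpha * x[t],  alpha = 2 / (span + 1)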
- id: finance-C-018
when: When returning computed indicator Series from result methods
action: preserve the input Series index object to maintain temporal alignment in downstream calculations
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Creating a new Series with reindexed or reset index breaks temporal alignment with original price data, causing
look-ahead bias or misaligned signals in trading strategies
stage_ids:
- indicator_computation
- id: finance-C-042
when: When implementing functional wrapper functions like rsi(), ema_indicator()
action: instantiate the corresponding Indicator class and call the single result method
severity: fatal
kind: domain_rule
modality: must
consequence: Diverging implementations produce inconsistent results between OOP and functional API, causing users to receive
different indicator values for the same inputs
stage_ids:
- functional_api
- id: finance-C-043
when: When implementing functional wrapper functions
action: retain any instance state between function calls
severity: fatal
kind: domain_rule
modality: must_not
consequence: Stateful wrappers cause data leakage across calls, producing incorrect indicator values when the same function
is called with different input series
stage_ids:
- functional_api
- id: finance-C-047
when: When providing functional wrappers for the public API
action: verify functional wrapper produces identical output to class method with same parameters
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Test suite validation failure causing regression in user code that depends on equivalence between rsi() and
RSIIndicator().rsi()
stage_ids:
- functional_api
- id: finance-C-055
when: When implementing data cleanup for financial time series indicators
action: Use np.inf directly in comparisons (use math.exp(709) as the float64 upper threshold instead)
severity: fatal
kind: domain_rule
modality: must_not
consequence: Comparing values with np.inf in boolean masks can produce unexpected results in pandas, causing valid large
values to be incorrectly classified and corrupting indicator calculations
stage_ids:
- data_cleanup
- id: finance-C-056
when: When processing price or volume data before indicator computation
action: Remove rows with zero values in numeric columns; zero prices cause division-by-zero errors
severity: fatal
kind: domain_rule
modality: must
consequence: Zero values in price columns will cause division-by-zero errors when computing returns, ratios, or percentage-based
indicators, leading to NaN propagation or infinite values
stage_ids:
- data_cleanup
- id: finance-C-058
when: When cleaning up DataFrames for indicator computation
action: Filter out rows containing inf or -inf values in numeric columns using the exp(709) threshold
severity: fatal
kind: domain_rule
modality: must
consequence: Infinite values propagate through rolling window calculations and produce NaN in all subsequent indicator
values, corrupting the entire dataset
stage_ids:
- data_cleanup
- id: finance-C-066
when: When using the ta library for financial analysis
action: Apply dropna to any DataFrame before passing it to ta indicator functions
severity: fatal
kind: resource_boundary
modality: must
consequence: Indicators like RSI, MACD, Bollinger Bands, and other calculations will produce incorrect results when fed
data containing NaN, inf, or zero values, leading to false trading signals
stage_ids:
- data_cleanup
- id: finance-C-070
when: When passing internal pandas.Series from _run() computation to wrapper method accessors
action: return Series that preserve the original close price index for temporal alignment
severity: fatal
kind: domain_rule
modality: must
consequence: If returned indicator Series lose the original time index, downstream feature aggregation will misalign indicators
with each other and with price data, causing incorrect backtest results
stage_ids:
- indicator_computation
- feature_aggregation
- id: finance-C-081
when: When implementing any technical indicator class that outputs a Series
action: preserve the input Series index without reindexing — use index=self._close.index when constructing output Series
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Output Series index misalignment causes downstream features or trading logic to reference incorrect timestamps,
leading to wrong signal generation and corrupted DataFrames
- id: finance-C-083
when: When implementing rolling-window calculations with fillna
action: set min_periods=0 when fillna=True, otherwise set min_periods equal to the window size
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect min_periods causes premature rolling calculations with insufficient data, producing unreliable
indicator values at the start of the time series
- id: finance-C-086
when: When implementing any price-change or return calculation
action: use shift(1) for 1-period lookback to prevent look-ahead bias — no abstraction for configurable lookback periods
exists
severity: fatal
kind: domain_rule
modality: must
consequence: Missing shift(1) causes look-ahead bias where current price is compared against itself, producing incorrect
RSI, momentum, and return calculations that inflate backtest performance
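# Illustrative sketch for finance-C-086, not part of the compiled constraints. It assumes a pandas
# Series named close (hypothetical) and reuses the percentage return formula from BD-009.
#   import pandas as pd
#   close = pd.Series([100.0, 102.0, 101.0])
#   prev_close = close.shift(1)  # previous bar's close; the first row is NaN
#   daily_return = (close / prev_close - 1) * 100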
- id: finance-C-121
when: When implementing or refactoring True Range calculation logic
action: Modify the _true_range calculation implementation; this function is the foundational input for ATR, Vortex,
and Ultimate Oscillator (3 independent indicators, per BD-095)
severity: fatal
kind: architecture_guardrail
modality: must_not
consequence: A bug or change to _true_range silently propagates to three independent indicators, creating systematic errors
across volatility, trend, and momentum calculations without obvious single source of failure
derived_from_bd_id: BD-100
- id: finance-C-174
when: When implementing Williams %R indicator
action: Apply positive scaling (0 to 100) or drop the multiplication by -100; the indicator MUST output values in the range
0 to -100, where values above -20 indicate overbought and values below -80 indicate oversold
severity: fatal
kind: domain_rule
modality: must_not
consequence: Removing the negative sign moves the output to the 0 to 100 range, so thresholds defined at -20 and -80
no longer apply and strategies emit wrong trading signals in live execution
derived_from_bd_id: BD-019
regular:
- id: finance-C-001
when: When implementing wrapper functions for data input
action: accept column NAME strings instead of Series objects
severity: high
kind: architecture_guardrail
modality: must
consequence: Accepting Series objects directly would break the in-place DataFrame mutation design, preventing fluent API
chaining where users expect df to be modified without explicit reassignment
stage_ids:
- data_input
- id: finance-C-002
when: When adding indicator columns to DataFrame
action: use naming pattern {colprefix}{category}_{indicator_name}
severity: high
kind: domain_rule
modality: must
consequence: Using inconsistent naming patterns breaks downstream feature selection, causes column name conflicts, and
prevents reproducible experiments
stage_ids:
- data_input
- id: finance-C-004
when: When calling add_all_ta_features with vectorized=True
action: produce exactly 76 indicator features
severity: high
kind: resource_boundary
modality: must
consequence: Incorrect feature count indicates non-vectorized indicators were incorrectly included or vectorized indicators
were skipped, breaking the expected feature set for performance-critical applications
stage_ids:
- data_input
- id: finance-C-005
when: When calling add_all_ta_features with vectorized=False
action: produce exactly 94 indicator features
severity: high
kind: resource_boundary
modality: must
consequence: Incorrect feature count indicates non-vectorized indicators (ADX, CCI, Aroon, PSAR, KAMA, MFI, NVI, ATR,
UI) were incorrectly skipped or included twice, breaking the expected full feature set
stage_ids:
- data_input
- id: finance-C-006
when: When providing column names to wrapper functions
action: raise KeyError with informative message for missing OHLCV columns
severity: high
kind: domain_rule
modality: must
consequence: Pandas df[colname] access raises KeyError for missing columns, but without explicit validation the error
message may not clearly indicate which required OHLCV column is missing
stage_ids:
- data_input
- id: finance-C-007
when: When adding all TA features via add_all_ta_features
action: process indicators in fixed volume→volatility→trend→momentum→others order
severity: medium
kind: architecture_guardrail
modality: must
consequence: Processing in non-deterministic order causes irreproducible column ordering, breaks tests that check column
positions, and may cause issues with feature dependencies if later categories depend on earlier computations
stage_ids:
- data_input
- id: finance-C-008
when: When using vectorized=True parameter
action: include non-vectorized indicators (ADX, CCI, Aroon, PSAR, KAMA, MFI, NVI, ATR, UI)
severity: high
kind: resource_boundary
modality: must_not
consequence: Non-vectorized indicators contain iterative Python loops that significantly degrade performance; enabling
them defeats the purpose of vectorized=True
stage_ids:
- data_input
- id: finance-C-009
when: When adding multiple indicator sets to the same DataFrame
action: use colprefix parameter to namespace different indicator sets
severity: high
kind: operational_lesson
modality: must
consequence: Without colprefix, multiple calls to add_all_ta_features will overwrite columns with the same names, causing
silent data loss and incorrect feature values
stage_ids:
- data_input
- id: finance-C-012
when: When the data contains NaN values before adding TA features
action: clean NaN values using ta.utils.dropna or handle fillna parameter
severity: medium
kind: operational_lesson
modality: must
consequence: NaN values propagate through indicator calculations, producing NaN outputs for rows that could otherwise
have valid indicator values, reducing the effective dataset size
stage_ids:
- data_input
- id: finance-C-014
when: When implementing new price-derived indicators that can't have negative values
action: use fillna value=-1 to indicate missing data as impossible price levels
severity: high
kind: domain_rule
modality: must
consequence: Using other fillna values like 0 or 50 would corrupt data semantics, as -1 is the sentinel value distinguishing
missing price-derived data from legitimate non-negative indicator values
stage_ids:
- indicator_computation
- id: finance-C-015
when: When implementing normalized oscillator indicators with 0-100 range
action: use fillna value=50 to fill NaN values with the neutral midpoint of the oscillator range
severity: high
kind: domain_rule
modality: must
consequence: Using other fillna values would misrepresent indicator state during warmup periods, potentially causing false
overbought/oversold signals during backtesting
stage_ids:
- indicator_computation
- id: finance-C-016
when: When implementing difference or ratio indicators where 0 is meaningful
action: use fillna value=0 to fill NaN values preserving the semantic meaning of zero change
severity: high
kind: domain_rule
modality: must
consequence: Using non-zero fillna would introduce artificial bias in return/difference calculations, corrupting strategy
performance metrics
stage_ids:
- indicator_computation
- id: finance-C-019
when: When implementing rolling window calculations with fillna=True
action: set min_periods=0 to start producing values immediately from the first element
severity: high
kind: architecture_guardrail
modality: must
consequence: Using default min_periods=window with fillna=True would leave excessive NaN values at series start, reducing
effective indicator history and causing warmup period issues
stage_ids:
- indicator_computation
- id: finance-C-020
when: When implementing rolling window calculations with fillna=False (default)
action: set min_periods equal to window size to ensure sufficient data for calculation
severity: high
kind: architecture_guardrail
modality: must
consequence: Using min_periods=0 without fillna=True would produce values computed from too few data samples,
corrupting early indicator values
stage_ids:
- indicator_computation
- id: finance-C-021
when: When implementing shift operations on series at series start
action: use fill_value=series.mean() to provide meaningful edge value instead of NaN propagation
severity: high
kind: architecture_guardrail
modality: must
consequence: shift() leaves NaN in the vacated positions unless fill_value is given; omitting it causes NaN propagation that corrupts
TRIX, KST, Ichimoku, and Vortex indicators at series start
stage_ids:
- indicator_computation
- id: finance-C-022
when: When creating new technical indicator classes
action: inherit from IndicatorMixin and implement __init__ → _run → <result> pattern for consistent API
severity: high
kind: architecture_guardrail
modality: must
consequence: Deviating from the established pattern breaks interoperability with wrapper functions (add_all_ta_features)
and violates user expectations for 41+ indicator implementations
stage_ids:
- indicator_computation
- id: finance-C-023
when: When handling NaN values across indicator calculations
action: use _check_fillna method from IndicatorMixin to centralize fillna logic and replace infinities with NaN first
severity: high
kind: architecture_guardrail
modality: must
consequence: Bypassing _check_fillna allows infinite values to propagate through calculations, causing division errors
or corrupted indicator outputs
stage_ids:
- indicator_computation
- id: finance-C-024
when: When computing True Range for ATR, ADX, or Vortex indicators
action: use the shared _true_range static method to uphold the DRY principle and keep the formula implementation consistent
severity: medium
kind: architecture_guardrail
modality: must
consequence: Reimplementing True Range formula separately risks divergence between indicators that must share identical
calculation (Wilder's specification)
stage_ids:
- indicator_computation
- id: finance-C-025
when: When using shift() without explicit fill_value in pandas operations
action: rely on implicit fill_value behavior in newer pandas versions without providing explicit fill_value
severity: high
kind: operational_lesson
modality: must_not
consequence: shift() without an explicit fill_value fills the vacated positions with NaN, breaking
indicators that depend on edge value propagation
stage_ids:
- indicator_computation
- id: finance-C-026
when: When initializing IndicatorMixin subclasses
action: set _fillna class attribute to True as default, as this can mask calculation errors from insufficient data
severity: high
kind: operational_lesson
modality: must_not
consequence: Default fillna=True hides NaN propagation issues during development, causing silent data corruption that
only manifests in production with incomplete data
stage_ids:
- indicator_computation
- id: finance-C-027
when: When computing technical indicators as input for trading strategies
action: claim that indicator values alone constitute trading signals or guarantees of profit
severity: high
kind: claim_boundary
modality: must_not
consequence: Presenting indicator values as trading signals implies financial advice, violates regulatory requirements,
and exposes users to uncontrolled market risk
stage_ids:
- indicator_computation
- id: finance-C-028
when: When evaluating indicator performance based on historical calculations
action: claim that historical indicator values predict future market behavior or guarantee similar results
severity: medium
kind: claim_boundary
modality: must_not
consequence: Backtested indicator performance does not indicate future returns; presenting historical calculations as
predictive overstates library capabilities
stage_ids:
- indicator_computation
- id: finance-C-029
when: When using fillna=True for production trading systems
action: silently accept NaN replacement as real data without explicit logging of fill locations
severity: high
kind: operational_lesson
modality: must_not
consequence: Filled NaN values treated as real observations can cause strategy to trade on fabricated indicator readings,
leading to unexpected position sizing or entry/exit timing
stage_ids:
- indicator_computation
- id: finance-C-030
when: When implementing percentage band (pband) calculations for channel indicators
action: Return NaN where the channel range (hband - lband) equals zero to prevent division by zero
severity: high
kind: domain_rule
modality: must
consequence: Division by zero produces inf values which corrupt downstream calculations and statistical analysis, leading
to invalid trading signals or corrupted feature data
stage_ids:
- feature_aggregation
- id: finance-C-031
when: When implementing derived getter methods for an indicator class
action: Preserve the input Series index across all derived outputs to ensure temporal alignment
severity: high
kind: domain_rule
modality: must
consequence: Misaligned indices cause incorrect feature values when joining or merging indicator outputs, producing data
corruption that is difficult to debug
stage_ids:
- feature_aggregation
- id: finance-C-032
when: When computing oscillator difference outputs (macd_diff, kst_diff, ppo_hist, etc.)
action: Compute the difference as primary_value minus derived_signal at each data point
severity: high
kind: domain_rule
modality: must
consequence: Incorrect difference calculation produces wrong histogram values, causing misinterpretation of momentum direction
and leading to bad trading decisions
stage_ids:
- feature_aggregation
- id: finance-C-033
when: When designing multi-output indicator classes that cache _run() results
action: Call _run() once during __init__ and store all derived values as instance attributes for retrieval by getter
methods
severity: high
kind: architecture_guardrail
modality: must
consequence: Repeated computation of expensive rolling window operations causes severe performance degradation on long
time series, making the library unusable for large datasets
stage_ids:
- feature_aggregation
- id: finance-C-034
when: When implementing percentage band for KeltnerChannel or DonchianChannel
action: Apply the same division-by-zero protection as BollingerBands using .where(hband != lband, np.nan)
severity: high
kind: domain_rule
modality: must
consequence: KeltnerChannel and DonchianChannel pband implementations produce inf values when channels are flat (hband
equals lband), corrupting downstream feature calculations
stage_ids:
- feature_aggregation
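# Illustrative sketch for finance-C-030 and finance-C-034, not part of the compiled constraints. It assumes
# pandas Series named close, hband and lband (hypothetical) that share one index.
#   import numpy as np
#   import pandas as pd
#   close = pd.Series([10.0, 11.0, 12.0])
#   hband = pd.Series([12.0, 11.0, 13.0])
#   lband = pd.Series([9.0, 11.0, 11.0])  # the middle row is a flat channel (hband == lband)
#   pband = ((close - lband) / (hband - lband)).where(hband != lband, np.nan)
#   # Rows where the channel is flat are set to NaN, so no inf reaches downstream features.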
- id: finance-C-035
when: When implementing derived getter methods for multi-output indicators
action: Recompute rolling window calculations within getter methods; all computation must occur in _run()
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Placing computation logic in getter methods causes redundant expensive operations when users call multiple
derived outputs, leading to O(n*m) complexity instead of O(n)
stage_ids:
- feature_aggregation
- id: finance-C-036
when: When using ta library for financial feature engineering
action: Claim or imply that technical indicators predict future price movements or guarantee profitable trading
severity: medium
kind: claim_boundary
modality: must_not
consequence: Misrepresenting indicator outputs as trading signals creates regulatory and ethical risks, potentially leading
users to make harmful financial decisions
stage_ids:
- feature_aggregation
- id: finance-C-037
when: When implementing the _check_fillna utility method for NaN handling
action: Replace inf and -inf values with np.nan before applying fillna logic to prevent infinite value propagation
severity: high
kind: domain_rule
modality: must
consequence: Unchecked inf values propagate through mathematical operations, corrupting statistical summaries and visualization
while causing crashes in numerical algorithms
stage_ids:
- feature_aggregation
- id: finance-C-038
when: When implementing channel indicator classes with offset parameter
action: Apply the same offset shift to all derived band outputs (mband, pband, wband) for consistency
severity: medium
kind: architecture_guardrail
modality: must
consequence: Inconsistent offset application across derived outputs breaks mathematical relationships (e.g., pband = (close
- lband) / (hband - lband)) causing incorrect feature values
stage_ids:
- feature_aggregation
- id: finance-C-039
when: When implementing oscillator-based indicators with signal lines (MACD, PPO, KST)
action: Cache all intermediate EMA calculations (emafast, emaslow, macd_signal/ppo_signal) as instance attributes during
_run()
severity: high
kind: architecture_guardrail
modality: must
consequence: Recomputing EMA calculations for each derived getter causes quadratic time complexity and memory waste, especially
for long time series
stage_ids:
- feature_aggregation
- id: finance-C-040
when: When accessing derived outputs from an indicator class instance
action: Expect computation to occur on each getter call; all values are pre-computed in _run() during __init__
severity: medium
kind: resource_boundary
modality: must_not
consequence: Users expecting lazy computation may inadvertently modify input data after indicator instantiation, leading
to stale cached results and incorrect indicators
stage_ids:
- feature_aggregation
- id: finance-C-041
when: When implementing the fillna parameter behavior for indicator getters
action: Apply fillna consistently across all derived outputs from the same indicator instance
severity: high
kind: domain_rule
modality: must
consequence: Inconsistent NaN handling across derived outputs creates mathematical inconsistencies (e.g., bollinger_wband
numerator/denominator mismatch) corrupting normalized indicators
stage_ids:
- feature_aggregation
- id: finance-C-044
when: When comparing outputs between class and functional interfaces
action: expect wrapper default parameters to match class defaults
severity: high
kind: domain_rule
modality: must_not
consequence: Using default parameters with rsi() will produce different results than RSIIndicator(), leading to incorrect
backtesting or trading signal generation
stage_ids:
- functional_api
- id: finance-C-045
when: When implementing functional wrappers for multi-output indicators like MACD
action: provide separate wrapper functions for each output method
severity: medium
kind: architecture_guardrail
modality: must
consequence: Users lose access to secondary outputs like MACD signal line and histogram, reducing analytical capability
and requiring OOP usage
stage_ids:
- functional_api
- id: finance-C-046
when: When using the functional API for analytical workflows
action: expect the functional wrapper to maintain any internal state across calls
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Each function call computes results independently; streaming calculations requiring state continuity cannot
be achieved with functional API alone
stage_ids:
- functional_api
- id: finance-C-048
when: When extending the library with new indicators
action: provide both class-based and functional wrapper implementations
severity: medium
kind: operational_lesson
modality: should
consequence: Users who need a simple single-call interface cannot use the new indicators, which reduces adoption and makes the
API experience inconsistent
stage_ids:
- functional_api
- id: finance-C-049
when: When testing functional wrappers for correctness
action: verify wrapper produces same output as corresponding class method with identical parameters
severity: high
kind: operational_lesson
modality: must
consequence: Undetected divergence between functional and OOP interfaces causes silent data corruption in user workflows
stage_ids:
- functional_api
- id: finance-C-050
when: When documenting default parameter differences between wrappers and classes
action: document each wrapper with its specific default values prominently
severity: medium
kind: operational_lesson
modality: must
consequence: Users unaware of default parameter mismatch use wrong window values, producing incorrect indicator readings
that propagate to trading decisions
stage_ids:
- functional_api
- id: finance-C-051
when: When considering live trading capabilities of the library
action: claim real-time data streaming support for functional API
severity: high
kind: claim_boundary
modality: must_not
consequence: The ta library is a pure feature engineering library for historical data; claiming real-time capability misleads
users into inappropriate live trading deployment
stage_ids:
- functional_api
- id: finance-C-052
when: When comparing functional API performance to OOP
action: imply functional wrappers have inherent performance advantage
severity: low
kind: claim_boundary
modality: must_not
consequence: Each functional wrapper creates a new class instance, so performance characteristics are equivalent; misleading
claims lead to inappropriate architecture decisions
stage_ids:
- functional_api
- id: finance-C-053
when: When providing a stateless functional interface
action: claim functional wrappers can replace OOP for all use cases
severity: medium
kind: claim_boundary
modality: must_not
consequence: Multi-output indicators like MACD require OOP to access all results in one instantiation; functional API
users must make multiple calls, missing the efficiency benefit
stage_ids:
- functional_api
- id: finance-C-054
when: When choosing between OOP and functional programming patterns
action: treat the functional API as a substitute for understanding the class-based implementation
severity: low
kind: rationalization_guard
modality: should_not
consequence: Hiding class instantiation behind functional wrappers obscures the computational model, making it harder
to debug issues or extend functionality
stage_ids:
- functional_api
- id: finance-C-057
when: When validating output from the data cleanup function
action: Assume non-numeric columns are checked for invalid values; only numeric columns are validated
severity: high
kind: domain_rule
modality: must_not
consequence: Non-numeric columns (dates, strings) are intentionally skipped by the function. Assuming all columns are
checked would miss data quality issues in non-numeric columns
stage_ids:
- data_cleanup
- id: finance-C-059
when: When implementing data cleanup utility functions
action: Create a copy of the input DataFrame before modifying it to avoid side effects
severity: high
kind: architecture_guardrail
modality: must
consequence: Mutating the input DataFrame would cause data loss in the caller's scope, leading to incorrect results in
downstream processing that expects the original data to be intact
stage_ids:
- data_cleanup
- id: finance-C-060
when: When computing technical indicators after data cleanup
action: Preserve the original row order in the output DataFrame; do not reorder valid rows
severity: high
kind: architecture_guardrail
modality: must
consequence: Reordering rows would break the temporal sequence of financial data, causing indicators to be calculated
on incorrect time periods and producing meaningless results
stage_ids:
- data_cleanup
- id: finance-C-061
when: When using the dropna function to clean financial data
action: Call dropna before computing technical indicators to verify data quality
severity: high
kind: operational_lesson
modality: must
consequence: Computing indicators on uncleaned data will propagate NaN and inf values through rolling calculations, corrupting
all subsequent indicator values and making the dataset unusable for analysis
stage_ids:
- data_cleanup
- id: finance-C-062
when: When estimating data loss from the cleanup process
action: Compare input and output DataFrame shapes to determine the number of invalid rows removed
severity: medium
kind: operational_lesson
modality: must
consequence: Without tracking shape reduction, invalid rows may silently corrupt a large portion of the dataset, and users
will not realize the extent of data quality issues in their source data
stage_ids:
- data_cleanup
- id: finance-C-063
when: When validating the data cleanup output
action: Verify that the output DataFrame contains no rows with NaN, inf, -inf, or zero in any numeric column
severity: high
kind: operational_lesson
modality: must
consequence: Any remaining invalid numeric values will propagate through indicator calculations, producing corrupted results
that appear valid but are mathematically incorrect
stage_ids:
- data_cleanup
- id: finance-C-064
when: When claiming capabilities of the data cleanup function
action: Claim the function handles all data quality issues; it only handles NaN, inf, and zero in numeric columns
severity: high
kind: claim_boundary
modality: must_not
consequence: Misrepresenting the scope of data cleanup would lead users to assume their data is fully validated when negative
prices, invalid strings in numeric columns, or non-numeric column issues remain unhandled
stage_ids:
- data_cleanup
- id: finance-C-065
when: When evaluating whether to skip data cleanup
action: Skip the data cleanup step even if the dataset appears clean; NaN/inf/zero values may be hidden
severity: high
kind: rationalization_guard
modality: must_not
consequence: Skipping cleanup assuming data is clean will cause division-by-zero errors and NaN propagation when computing
returns, ratios, and percentage-based indicators on hidden invalid values
stage_ids:
- data_cleanup
- id: finance-C-067
when: When passing pandas.Series from DataFrame columns (close, high, low, volume) to indicator constructors
action: pass pandas.Series objects with numeric dtype that share the same index alignment
severity: high
kind: domain_rule
modality: must
consequence: Indicators compute incorrect values or raise exceptions when input Series indices are misaligned, causing
wrong technical analysis results in downstream trading decisions
stage_ids:
- data_input
- indicator_computation
- id: finance-C-068
when: When computing indicators that perform division operations on price data
action: handle division by zero cases to prevent inf/nan propagation
severity: high
kind: domain_rule
modality: must
consequence: Division by zero (e.g., when high equals low in Accumulation/Distribution) produces inf values that corrupt
all downstream indicator calculations and trading signals
stage_ids:
- data_input
- indicator_computation
- id: finance-C-069
when: When extracting columns from DataFrame as input to indicator constructors
action: extract Series using df[colname] notation, not df.colname, to preserve explicit column name strings
severity: medium
kind: architecture_guardrail
modality: must
consequence: Attribute access (df.High) instead of item access (df['High']) fails when column names contain spaces or special
characters and can silently return a DataFrame method when a name collides with one, breaking the entire analysis pipeline
stage_ids:
- data_input
- indicator_computation
- id: finance-C-071
when: When using shift() operations in indicator computations
action: account for the NaN values introduced at the beginning of shifted Series due to look-back nature
severity: medium
kind: domain_rule
modality: must
consequence: shift(1) produces NaN at index 0, which propagates through dependent calculations and can cause indicators
to report misleading values for early data points
stage_ids:
- indicator_computation
- feature_aggregation
- id: finance-C-072
when: When passing Series through _check_fillna() before returning indicator values
action: apply fillna consistently across all outputs of the same indicator family to maintain semantic consistency
severity: medium
kind: operational_lesson
modality: must
consequence: Inconsistent NaN handling between related indicators (e.g., MACD line filled but MACD signal not filled)
creates data integrity issues that corrupt machine learning feature matrices
stage_ids:
- indicator_computation
- feature_aggregation
- id: finance-C-073
when: When assigning named pandas.Series to df[colname] via wrapper methods
action: use meaningful column names with prefixes to avoid naming conflicts with existing DataFrame columns
severity: high
kind: architecture_guardrail
modality: must
consequence: Overwriting original OHLCV columns with indicator results destroys the source data and prevents recalculating
different indicators without reloading
stage_ids:
- feature_aggregation
- data_input
- id: finance-C-074
when: When passing DataFrame with numeric columns to indicator constructors after adding features
action: verify that all new indicator columns have a numeric dtype (float64) compatible with pandas mathematical operations
severity: high
kind: domain_rule
modality: must
consequence: Non-numeric indicator columns cause TypeError in subsequent rolling/ewm operations, breaking iterative indicator
chains like VWAP that depend on intermediate results
stage_ids:
- feature_aggregation
- data_input
- id: finance-C-075
when: When passing pandas.DataFrame with NaN/inf/zero values from raw data to data_cleanup
action: clean or validate numeric columns to prevent inf propagation through indicator calculations
severity: high
kind: domain_rule
modality: must
consequence: Raw data with inf values (from division by zero before data collection) corrupts all downstream indicators
and produces NaN in cumulative calculations like OBV and ADI
stage_ids:
- data_input
- data_cleanup
- id: finance-C-076
when: When using the dropna utility function for data cleanup
action: filter out rows whose numeric columns contain values at or above exp(709) (the maximum threshold) or equal to 0.0, then drop NaN
severity: medium
kind: operational_lesson
modality: must
consequence: Skipping the numeric range filtering step allows extreme values (overflow indicators) to remain in data,
causing numerical instability in rolling mean and exponential weighted mean calculations
stage_ids:
- data_input
- data_cleanup
- id: finance-C-077
when: When returning cleaned DataFrame from data_cleanup to data_input for indicator computation
action: preserve column order and all non-numeric columns (like timestamps) that are needed for time series alignment
severity: medium
kind: architecture_guardrail
modality: must
consequence: Losing the original index or timestamp column breaks temporal alignment between cleaned data and any external
signals or signals computed from pre-cleaned segments
stage_ids:
- data_cleanup
- data_input
- id: finance-C-078
when: When applying rolling window operations to compute indicators
action: verify input Series have at least window length data points to produce meaningful results
severity: high
kind: domain_rule
modality: must
consequence: Short Series produce all NaN values for rolling means, causing ATR, Bollinger Bands, and other window-based
indicators to return invalid results without raising errors
stage_ids:
- data_input
- indicator_computation
- data_cleanup
- id: finance-C-079
when: When chaining multiple indicator computations that depend on each other
action: pass indicator output Series directly as input to subsequent indicators without explicit validation of their numeric
properties
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Using raw indicator outputs (which may contain NaN/inf from edge cases) as inputs to nested indicators amplifies
numerical errors and produces unreliable composite indicators
stage_ids:
- feature_aggregation
- data_input
- id: finance-C-080
when: When performing cumulative operations like cumsum() on indicator results
action: verify input Series contains no NaN values at the starting point to prevent cumulative error propagation
severity: high
kind: domain_rule
modality: must
consequence: A NaN at the start of a cumulative sum (e.g., OBV, ADI) propagates through the entire Series when computed with numpy.cumsum
or with skipna=False, making the entire indicator useless
stage_ids:
- indicator_computation
- feature_aggregation
- id: finance-C-082
when: When implementing any indicator output method
action: set the name attribute on every returned Series to an indicator-specific string (e.g., 'rsi', 'macd', 'macd_signal')
severity: high
kind: architecture_guardrail
modality: must
consequence: Unnamed Series cause DataFrame column naming failures and break downstream feature selection by name, leading
to KeyError exceptions in calling code
- id: finance-C-084
when: When implementing Bollinger Bands or any band indicator
action: use ddof=0 for the population standard deviation calculation (not pandas default ddof=1 for sample std)
severity: high
kind: domain_rule
modality: must
consequence: Using sample std (ddof=1) instead of population std (ddof=0) overestimates volatility relative to the standard definition,
making Bollinger Band boundaries slightly too wide and shifting breakout signals in backtesting
- id: finance-C-085
when: When implementing EMA-based indicators
action: use adjust=False in pandas ewm() to get the traditional recursive (Wilder-style) exponential smoothing
severity: high
kind: domain_rule
modality: must
consequence: Using adjust=True renormalizes the weights over the available history, so early values differ from the recursive
form used in traditional technical analysis; RSI and other indicators will not match textbook calculations
- id: finance-C-087
when: When using fillna to handle missing values in price-derived indicators
action: fill with value=-1 for volatility bands and similar price-derived indicators that cannot have negative price or
range values
severity: medium
kind: domain_rule
modality: must
consequence: Filling with a value inside the valid price range instead of the -1 sentinel lets artificial fill values be mistaken for
valid price data in downstream calculations
- id: finance-C-088
when: When using fillna to handle missing values in normalized oscillators
action: fill with value=50 for RSI, Stochastic, and other normalized oscillators (neutral midpoint of 0-100 range)
severity: medium
kind: domain_rule
modality: must
consequence: Filling with a value other than the neutral 50 makes missing data read as overbought or oversold in
downstream logic
- id: finance-C-089
when: When using fillna to handle missing values in difference/ratio indicators
action: fill with value=0 for MACD, PPO, and other difference/ratio indicators that have neutral zero-crossing semantics
severity: medium
kind: domain_rule
modality: must
consequence: Filling with a nonzero value makes missing data look like a momentum reading and can create spurious
zero-crossing signals in downstream trading logic
- id: finance-C-090
when: When using add_all_ta_features with vectorized=True
action: expect exactly 76 features to be added to the DataFrame (excludes non-vectorized indicators like ADX, CCI, Aroon,
PSAR, ATR, Ulcer Index, MFI, NVI)
severity: high
kind: resource_boundary
modality: must
consequence: Mismatch between expected and actual feature count causes downstream ML pipelines or feature selection logic
to fail silently with wrong number of inputs
- id: finance-C-092
when: When presenting or reporting this library's backtested indicator values to users
action: claim that technical indicator outputs can be used directly as trading signals without additional threshold optimization
and risk management — this library provides feature engineering only
severity: high
kind: claim_boundary
modality: must_not
consequence: Users may blindly trade on raw indicator values (e.g., RSI > 70 = sell) without understanding overbought/oversold
dynamics, leading to significant financial losses in sideways markets
- id: finance-C-093
when: When presenting or reporting this system's backtested indicator accuracy to users
action: claim that historical indicator values equal expected future indicator values; all technical indicators exhibit
regime dependency and their predictive power degrades as market conditions change
severity: high
kind: claim_boundary
modality: must_not
consequence: Users allocate capital based on historically profitable indicator configurations that may no longer work
due to market regime changes, causing significant underperformance
- id: finance-C-094
when: When building systems based on this blueprint
action: claim support for real-time trading systems — this is a feature engineering library that computes indicators from
historical OHLCV DataFrames, with no live data feed integration or order execution capability
severity: high
kind: claim_boundary
modality: must_not
consequence: Users build live trading systems expecting real-time indicator updates, leading to system failures when no
real-time data pipeline exists
- id: finance-C-095
when: When building systems based on this blueprint
action: claim support for backtesting frameworks — this library computes indicators but contains no backtesting engine,
position tracking, portfolio management, or performance attribution logic
severity: high
kind: claim_boundary
modality: must_not
consequence: Users expect this library to provide backtesting capabilities, leading to incomplete backtesting implementations
that miss key components like transaction costs and slippage modeling
- id: finance-C-096
when: When building systems based on this blueprint
action: claim support for non-OHLCV data analysis; all indicators expect Open, High, Low, Close, Volume columns and
compute price-derived features using financial domain formulas
severity: high
kind: claim_boundary
modality: must_not
consequence: Users apply this library to non-price data (e.g., web traffic, sensor readings) producing meaningless technical
indicator values that have no valid financial interpretation
- id: finance-C-097
when: When building systems based on this blueprint
action: claim support for streaming data processing; all indicators are designed for batch DataFrame computation with
no support for incremental/online learning or partial series updates
severity: high
kind: claim_boundary
modality: must_not
consequence: Users implement streaming pipelines expecting incremental indicator updates, leading to full DataFrame recomputation
on each tick and severe performance degradation
- id: finance-C-098
when: When building systems based on this blueprint
action: claim support for non-financial time series analysis — indicator formulas (RSI, MACD, Bollinger Bands, etc.) are
mathematically defined for financial price data with specific range and behavior assumptions
severity: high
kind: claim_boundary
modality: must_not
consequence: Users apply financial indicators to non-financial time series (e.g., temperature, inventory counts) producing
values that have no meaningful interpretation and may lead to incorrect business decisions
- id: finance-C-099
when: When users call add_all_ta_features without first cleaning their DataFrame
action: pass DataFrames with NaN values to indicator functions without prior cleaning — the README explicitly states 'You
should clean or fill NaN values in your dataset before add technical analysis features'
severity: medium
kind: operational_lesson
modality: should_not
consequence: Uncleaned DataFrames with NaN values cause incorrect rolling calculations and produce unreliable indicator
values that may lead to wrong trading decisions
- id: finance-C-102
when: When building transition matrix-based models for regime detection or state prediction
action: Assume the framework provides transition matrix time homogeneity validation or enforcement — the framework does
not implement this capability; matrices computed from different time periods are used as-is without homogeneity testing
severity: high
kind: claim_boundary
modality: must_not
consequence: Without time homogeneity validation, non-stationary transition matrices cause regime models to incorrectly
assume consistent state transition probabilities, leading to systematic prediction errors in changing market conditions
derived_from_bd_id: BD-GAP-022
- id: finance-C-103
when: When computing transition matrices for regime analysis
action: 'Implement time homogeneity testing: compute separate transition matrices for rolling windows, then apply chi-square
test (p<0.05) or KL divergence threshold to detect stationarity breaks; if homogeneity fails, segment data or use time-varying
transition models'
severity: high
kind: domain_rule
modality: must
consequence: Non-stationary transition matrices without homogeneity validation cause regime models to extrapolate outdated
transition probabilities, producing systematically biased predictions that diverge from actual market behavior
derived_from_bd_id: BD-GAP-022
- id: finance-C-104
when: When implementing financial calculations involving interest accrual, bond pricing, or option valuation
action: Assume the framework handles day count conventions (Actual/360, 30/360, Actual/365) or compounding frequency (annual,
semi-annual, continuous) — these are not implemented; default Python float arithmetic is used without convention enforcement
severity: high
kind: claim_boundary
modality: must_not
consequence: Without day count convention handling, interest calculations and bond pricing produce incorrect values that
vary by convention, causing systematic pricing errors that accumulate over time and lead to significant valuation discrepancies
derived_from_bd_id: BD-GAP-021
- id: finance-C-105
when: When implementing financial instruments requiring precise time-based calculations
action: Explicitly specify day_count convention parameter (e.g., 'actual/360', '30/360', 'actual/365') and compounding_frequency
(e.g., 'annual', 'semi-annual', 'continuous') in all interest and pricing calculations; use libraries like QuantLib
or implement convention-aware formulas
severity: high
kind: domain_rule
modality: must
consequence: Financial instruments using incorrect day count or compounding conventions produce pricing errors ranging
from 0.01% to 0.5% per period, which compound significantly in high-notional trades and long-dated instruments
derived_from_bd_id: BD-GAP-021
- id: finance-C-106
when: When implementing currency-denominated calculations for financial reporting, pricing, or reconciliation
action: Use Python Decimal type for all currency values instead of float, because float precision errors cause rounding discrepancies
in financial calculations; initialize Decimal from a string (Decimal('0.1')), not a float (Decimal(0.1)), so the binary
representation error of the float is not carried into the Decimal
severity: high
kind: domain_rule
modality: must
consequence: Using float for currency causes representation errors (e.g., 0.1 + 0.2 != 0.3 in float), leading to reconciliation
failures, incorrect P&L reporting, and potential regulatory compliance issues in audit trails
derived_from_bd_id: BD-GAP-010
- id: finance-C-107
when: When implementing or optimizing KST (Know Sure Thing) indicator calculations
action: Use exactly 4 ROC periods (10, 15, 20, 30) with weights 1, 2, 3, 4 respectively, smoothed with SMA — do not alter
ROC periods or weight ratios
severity: high
kind: domain_rule
modality: must
consequence: Changing KST ROC periods or weights produces different momentum signals, causing the indicator to generate
trading signals at different price points than intended and breaking strategy reproducibility
derived_from_bd_id: BD-016
- id: finance-C-108
when: When implementing NaN handling for financial time series data
action: 'Apply the three-step FillNA sequence: replace inf with NaN first, then forward-fill (ffill) to propagate last
valid value, then backward-fill (bfill) or fillna(value) for leading NaNs — use forward-fill only in live trading scenarios
where backward-fill would introduce look-ahead bias'
severity: high
kind: domain_rule
modality: must
consequence: Using dropna-only or forward-fill-only breaks time series continuity required for rolling calculations, causing
NaN values in rolling windows and breaking indicator computations mid-series
derived_from_bd_id: BD-033
- id: finance-C-109
when: When using the framework's default MACD parameters for momentum calculations
action: Verify that MACD slow=26, fast=12, signal=9 matches the target market's trading patterns (~252 trading days/year);
for crypto or markets with different trading patterns, adjust windows to half and full market cycles accordingly
severity: medium
kind: operational_lesson
modality: should
consequence: Using standard 12/26/9 MACD on markets with different trading day counts produces delayed or premature crossover
signals, causing momentum strategies to enter/exit positions at suboptimal price points
derived_from_bd_id: BD-002
- id: finance-C-110
when: When implementing Detrended Price Oscillator calculations
action: Use formula int((window/2) + 1) for DPO shift to center the SMA at window/2 position, ensuring DPO peaks and troughs
align with actual price cycle turning points
severity: high
kind: domain_rule
modality: must
consequence: Using alternative shift formulas misaligns DPO indicator peaks with actual price cycle turning points, causing
momentum signals to fire at wrong price levels and corrupting trading decisions
derived_from_bd_id: BD-017
- id: finance-C-111
when: When using the framework's default min_periods parameter for rolling calculations
action: 'Verify that min_periods logic aligns with _fillna strategy: min_periods=0 when fillna=True (immediate computation
after filling), min_periods=window when fillna=False (full window required) — verify first valid index positions match
between backtesting and live trading'
severity: medium
kind: operational_lesson
modality: should
consequence: Incorrect min_periods creates different indicator availability windows between backtesting and live trading,
causing strategies to execute on indicators that were not yet valid in backtest or fail to load in live trading
derived_from_bd_id: BD-083
- id: finance-C-112
when: When processing timestamped financial data in the data input stage
action: 'Assume processing time equals as-of time — these represent fundamentally different concepts: as-of time is when
an event occurred, processing time is when data was recorded or ingested'
severity: high
kind: claim_boundary
modality: must_not
consequence: Confusing as-of time with processing time causes timestamp-based joins and historical reconstruction to produce
incorrect data associations, corrupting backtest signals and breaking audit trails
derived_from_bd_id: BD-GAP-007
- id: finance-C-113
when: When processing timestamped financial data in the data input stage
action: 'Implement explicit as-of and processing time fields: as_of_time field captures event timestamp, processing_time
field captures ingestion timestamp; use as_of_time for all time-series operations and joins'
severity: high
kind: domain_rule
modality: must
consequence: Without explicit time field separation, timestamp-based operations use wrong time references, causing historical
data reconstruction to produce incorrect market states and corrupting backtesting accuracy
derived_from_bd_id: BD-GAP-007
- id: finance-C-114
when: When calculating date-based offsets or duration for trading strategies
action: Assume natural calendar calculations apply to trading strategies; natural calendar arithmetic counts non-trading days (weekends,
holidays) and produces incorrect date offsets for trading operations
severity: high
kind: claim_boundary
modality: must_not
consequence: Using natural calendar (e.g., 30 days) for trading operations causes position calculations to span wrong
number of trading days, corrupting time-series alignment and causing signals to fire on non-trading days
derived_from_bd_id: BD-GAP-008
- id: finance-C-115
when: When implementing date-based calculations for trading operations
action: 'Use trading calendar that accounts for market-specific non-trading days: replace natural calendar date arithmetic
with trading-day counting functions (e.g., add_trading_days(date, n) instead of date + timedelta(days=n))'
severity: high
kind: domain_rule
modality: must
consequence: Without trading calendar support, date-based position sizing and duration calculations include non-trading
days, causing strategies to reference incorrect historical prices and corrupting backtest results
derived_from_bd_id: BD-GAP-008
- id: finance-C-116
when: When modeling settlement and delivery timing for financial instruments
action: Assume T+0 settlement or same-day delivery; the framework does not implement settlement timing conventions, treating
all trades as if settled immediately upon execution
severity: high
kind: claim_boundary
modality: must_not
consequence: Assuming T+0 settlement causes cash flow calculations to ignore settlement delays, creating apparent liquidity
that does not exist and causing settlement failures when positions are reused before funds settle
derived_from_bd_id: BD-GAP-017
- id: finance-C-117
when: When implementing trade execution or position management logic
action: 'Define settlement_time convention per instrument: T+1 for A-shares, T+1 for US equities (T+2 before May 2024), T+0 for same-day futures;
implement settlement_delay tracking and prevent position reuse until settlement completes'
severity: high
kind: domain_rule
modality: must
consequence: Without settlement convention tracking, the framework allows same-day position reuse that violates market
settlement rules, causing failed settlements and potential regulatory violations in live trading
derived_from_bd_id: BD-GAP-017
- id: finance-C-118
when: When modeling price or quantity values for backtesting
action: Assume continuous decimal precision for prices and quantities — the framework treats prices as continuous floats
without enforcing discrete tick sizes or lot minimums
severity: high
kind: claim_boundary
modality: must_not
consequence: Continuous price models allow fractional tick values that cannot execute in real markets, causing backtest
results to assume fills at prices unattainable in live trading due to tick size constraints
derived_from_bd_id: BD-GAP-018
- id: finance-C-119
when: When implementing order sizing or price validation for backtesting
action: 'Enforce tick_size and lot_size constraints: round order prices to nearest tick (e.g., price = round(raw_price
/ tick_size) * tick_size), round quantities to nearest lot (quantity = round(raw_quantity / lot_size) * lot_size), reject
orders below minimum lot'
severity: high
kind: domain_rule
modality: must
consequence: Without tick/lot enforcement, backtested orders assume execution at continuous prices that cannot actually
trade, creating systematic overestimation of fills and underestimation of transaction costs in live trading
derived_from_bd_id: BD-GAP-018
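# Illustrative sketch for finance-C-119 (Python, kept as comments so this file stays valid YAML).
# The helper names round_to_step and size_order are hypothetical, not framework API.
# Decimal is an assumption made here to avoid binary float artifacts when snapping to a tick or lot grid.
#
#   from decimal import Decimal
#
#   def round_to_step(raw, step):
#       # Snap raw to the nearest multiple of step.
#       raw, step = Decimal(str(raw)), Decimal(str(step))
#       return (raw / step).quantize(Decimal("1")) * step
#
#   def size_order(raw_price, raw_quantity, tick_size, lot_size):
#       price = round_to_step(raw_price, tick_size)
#       quantity = round_to_step(raw_quantity, lot_size)
#       if quantity < Decimal(str(lot_size)):
#           raise ValueError("order below minimum lot")
#       return price, quantity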
- id: finance-C-120
when: When implementing or refactoring KAMA indicator calculations
action: Use np.roll() for boundary handling in KAMA indicator — np.roll() wraps the last element to position 0, causing
spurious first-value calculation from wrapped data
severity: high
kind: domain_rule
modality: must_not
consequence: The first KAMA value incorporates data from the wrapped last element, producing incorrect indicator values
that corrupt trading signal generation for short price series
derived_from_bd_id: BD-091
- id: finance-C-122
when: When implementing or refactoring shift operations in return and ROC calculations
action: Change the hardcoded shift(1) lookback period in return calculations — shift(1) is hardcoded across 18+ locations
and conflicts with multi-period return definitions (Daily Return, Log Return, ROC)
severity: high
kind: domain_rule
modality: must_not
consequence: Changing shift(1) to make it configurable breaks the documented flexibility; users cannot compute multi-period
returns as documented when shift() is modified, causing backtest-live inconsistency
derived_from_bd_id: BD-101
- id: finance-C-123
when: When implementing log return calculations in backtesting
action: Calculate log returns as ln(close) - ln(prev_close), multiplied by 100 for percentage convention — verify time-additivity
for multi-period compounding is preserved
severity: high
kind: domain_rule
modality: must
consequence: Using simple returns instead of log returns breaks multi-period compounding equivalence; accumulated return
errors grow significantly over longer backtest periods, causing live trading returns to diverge from backtested results
derived_from_bd_id: BD-010
- id: finance-C-124
when: When implementing Typical Price calculations for MFI, VWAP, or CCI indicators
action: Calculate Typical Price as (High + Low + Close) / 3 — do not use OHLC4 or weighted averages as they change the
price representation used by MFI, VWAP, and CCI
severity: high
kind: domain_rule
modality: must
consequence: Using OHLC4 or weighted averages instead of (H+L+C)/3 changes the price normalization baseline for MFI, VWAP,
and CCI, causing volume-weighted and momentum signals to diverge from expected values
derived_from_bd_id: BD-012
- id: finance-C-125
when: When implementing Mass Index indicator for trend reversal detection
action: Calculate Mass Index as EMA(amplitude) / EMA(EMA(amplitude)) summed over the slow window (default 25); values
above 27 indicate a potential reversal, and a subsequent drop below 26.5 confirms the reversal entry
severity: medium
kind: domain_rule
modality: must
consequence: Using incorrect EMA ratio or window length breaks the reversal signal thresholds; Mass Index values become
incomparable to standard 27/26.5 reversal boundaries, causing false or missed trend reversal signals
derived_from_bd_id: BD-014
- id: finance-C-126
when: When implementing Ichimoku Cloud indicator calculations
action: 'Use default windows of window1=9 (conversion line), window2=26 (base line), window3=52 (span B) — these derive
from standard Ichimoku time periods: 9=1.5 weeks, 26=1 month, 52=2 months'
severity: medium
kind: domain_rule
modality: must
consequence: Using non-standard windows breaks the Ichimoku cloud formation and signal generation; custom windows produce
different conversion/base line crossovers and cloud boundaries that do not match standard Ichimoku interpretation
derived_from_bd_id: BD-015
- id: finance-C-127
when: When implementing RSI indicator calculations
action: Apply EMA smoothing with alpha=1/window to RSI directional components — window must be positive integer >= 1;
alpha is fixed at 1/window and not adjustable
severity: high
kind: domain_rule
modality: must
consequence: Using standard EMA with span parameter or different alpha values breaks Wilder's smoothing equivalence; RSI
values become incomparable to standard 30/70 overbought/oversold thresholds, corrupting momentum signals
derived_from_bd_id: BD-046
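# Illustrative sketch for finance-C-127 (Python with pandas, kept as comments so this file stays valid YAML).
# The function name wilder_rsi is hypothetical; warm-up and min_periods handling in the actual
# framework may differ from this minimal version.
#
#   import pandas as pd
#
#   def wilder_rsi(close: pd.Series, window: int = 14) -> pd.Series:
#       delta = close.diff()
#       # alpha = 1 / window with adjust=False gives the recursive (Wilder-style) average.
#       gain = delta.clip(lower=0).ewm(alpha=1 / window, adjust=False).mean()
#       loss = (-delta.clip(upper=0)).ewm(alpha=1 / window, adjust=False).mean()
#       return 100 - 100 / (1 + gain / loss)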
- id: finance-C-128
when: When implementing TSI indicator calculations
action: Apply double EMA smoothing (recursive ewm with span=window_slow then span=window_fast) to price difference and
absolute difference — window_slow must be >= window_fast; both must be positive integers
severity: high
kind: domain_rule
modality: must
consequence: Using single EMA, triple EMA, or different smoothing order breaks TSI's two-stage noise reduction design;
momentum signals become unreliable and incomparable to standard TSI interpretation
derived_from_bd_id: BD-047
- id: finance-C-129
when: When implementing Stochastic Oscillator %K calculations
action: Calculate Stochastic %K using rolling min/max of high/low over the window — window must be positive integer >=
1; requires sufficient high/low data for the lookback period
severity: high
kind: domain_rule
modality: must
consequence: Using alternative range calculations (e.g., close-only or EMA-smoothed) instead of rolling min/max breaks
Stochastic %K normalization; price position within range becomes incomparable to standard 80/20 overbought/oversold
levels
derived_from_bd_id: BD-049
- id: finance-C-130
when: When implementing train/test data split logic for model development
action: Assume train/test time split integrity is automatically enforced — the framework does not implement temporal data
leakage prevention; test data timestamps may overlap with or precede training data timestamps
severity: high
kind: claim_boundary
modality: must_not
consequence: Without proper time-based split enforcement, future information leaks into training data, causing model performance
to appear inflated in backtesting but fail catastrophically in production deployment
derived_from_bd_id: BD-GAP-013
- id: finance-C-131
when: When implementing train/test data split logic for model development
action: Implement temporal separation by validating that all training samples have timestamps strictly before their corresponding
test samples; use a time-based split function that enforces no temporal overlap or data contamination between splits
severity: high
kind: domain_rule
modality: must
consequence: Without explicit temporal separation enforcement, future data can contaminate training sets, causing models
to memorize temporal patterns that don't exist at prediction time
derived_from_bd_id: BD-GAP-013
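# Illustrative sketch for finance-C-131 (Python with pandas, kept as comments so this file stays valid YAML).
# The function name time_split and the column name "timestamp" are assumptions, not framework API.
#
#   import pandas as pd
#
#   def time_split(df: pd.DataFrame, cutoff: pd.Timestamp, ts_col: str = "timestamp"):
#       train = df[df[ts_col] < cutoff]
#       test = df[df[ts_col] >= cutoff]
#       # Guard against leakage: every training row must precede every test row.
#       if not train.empty and not test.empty:
#           assert train[ts_col].max() < test[ts_col].min()
#       return train, test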
- id: finance-C-132
when: When implementing event logging in the backtesting framework
action: Assume event logs provide immutability guarantees — the framework does not implement write-once or cryptographic
integrity verification for logged events; logs can be modified or deleted after creation
severity: high
kind: claim_boundary
modality: must_not
consequence: Without immutable event logging, audit trails can be altered retroactively, making regulatory compliance
verification impossible and invalidating post-hoc trade reconstruction
derived_from_bd_id: BD-GAP-016
- id: finance-C-133
when: When implementing event logging in the backtesting framework
action: Implement append-only event logging with cryptographic hash chains or database write-protections to verify logged
events cannot be modified or deleted after creation
severity: high
kind: domain_rule
modality: must
consequence: Without immutable event logging, trade records can be tampered with, violating audit requirements and making
it impossible to prove backtest result integrity
derived_from_bd_id: BD-GAP-016
- id: finance-C-134
when: When implementing cost calculations in the backtesting framework
action: Assume the framework's cost model is complete — the framework does not implement slippage, market impact, or crossing
spread costs; backtest P&L will overstate actual returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Incomplete cost modeling causes backtest returns to exceed actual trading returns by 0.1-0.5% per trade in
liquid markets and up to 1-2% in illiquid markets
derived_from_bd_id: BD-GAP-019
- id: finance-C-135
when: When implementing cost calculations in the backtesting framework
action: Implement comprehensive cost model including slippage (0.01-0.05% for liquid stocks), market impact proportional
to order size, and crossing spread costs for limit orders
severity: high
kind: domain_rule
modality: must
consequence: Without comprehensive cost modeling, strategies that appear profitable in backtesting become unprofitable
in live trading due to unmodeled transaction costs
derived_from_bd_id: BD-GAP-019
- id: finance-C-136
when: When implementing position carry cost calculations for overnight positions
action: Assume funding costs are automatically modeled — the framework does not implement margin interest, short borrow
rates, or dividend adjustments; strategies with overnight exposure will show inflated returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Unmodeled carry costs cause overnight strategies to appear 0.01-0.05% per day more profitable than reality,
accumulating to significant discrepancies over multi-week holding periods
derived_from_bd_id: BD-GAP-020
- id: finance-C-137
when: When implementing position carry cost calculations for overnight positions
action: Implement funding cost model with configurable margin_rate, short_rebate_rate, and dividend_adjustment parameters;
apply costs daily based on position value and current rates
severity: high
kind: domain_rule
modality: must
consequence: Without carry cost modeling, long-short strategies that rely on overnight funding appear profitable but may
lose money when actual margin costs are applied
derived_from_bd_id: BD-GAP-020
- id: finance-C-138
when: When using the Chaikin Money Flow indicator with default parameters
action: Verify CMF window=20 matches the trading instrument's cycle length; adjust window if sampling frequency differs
from daily data (e.g., intraday data may require window proportional to trading session length)
severity: medium
kind: operational_lesson
modality: should
consequence: Using a window size that doesn't match the sampling frequency causes CMF to measure money flow over an incorrect
time horizon, leading to misleading overbought/oversold signals and poor trade timing
derived_from_bd_id: BD-038
- id: finance-C-139
when: When implementing or adjusting StochRSI overbought/oversold thresholds
action: Use 80/20 thresholds instead of default 70/30 when applying StochRSI, since the nested oscillator structure causes
more frequent extreme readings; adjust further for high-volatility instruments if signals are too frequent
severity: medium
kind: operational_lesson
modality: should
consequence: Using RSI-style thresholds (70/30) for StochRSI causes excessive overbought/oversold signals due to the nested
oscillator amplification, leading to overtrading and eroded profits in live trading
derived_from_bd_id: BD-039
- id: finance-C-140
when: When configuring Keltner Channel middle line calculation method
action: Verify original_version parameter matches the intended trading system specification; original_version=True uses
SMA (Keltner 1960 specification for backward compatibility), original_version=False uses EMA (modern adaptation with
faster response)
severity: medium
kind: domain_rule
modality: must
consequence: Using the wrong Keltner version creates inconsistent backtest behavior when comparing against historical
trading systems that relied on the original SMA-based Keltner approach, making live trading results non-reproducible
derived_from_bd_id: BD-041
- id: finance-C-141
when: When implementing or refactoring Ichimoku Conversion Line (tenkan-sen) calculation
action: Replace the midpoint formula (high+low)/2 with single price (close) or asymmetric high/low combinations; the midpoint
of rolling extremes is the defining characteristic of tenkan-sen per Ichimoku specification
severity: medium
kind: domain_rule
modality: must_not
consequence: Using close price or asymmetric high/low instead of midpoint fundamentally alters the conversion line's sensitivity
and trend detection, producing different buy/sell signals that invalidate historical strategy backtests
derived_from_bd_id: BD-062
- id: finance-C-142
when: When implementing or refactoring Know Sure Thing (KST) momentum indicator calculation
action: Replace the fixed weight structure 1:2:3:4 with equal weights or arbitrary coefficients; the graduated weight
scheme reflects Martin Pring's theory that longer-term ROC carries greater predictive significance
severity: medium
kind: architecture_guardrail
modality: must_not
consequence: Using equal or arbitrary weights breaks the KST's theoretical foundation, fundamentally altering signal timing
and smoothness; backtests calibrated on the original weight structure will produce inconsistent live trading results
derived_from_bd_id: BD-063
- id: finance-C-143
when: When calculating Bollinger Bands or comparing results with pandas DataFrame.std()
action: Verify that ddof=0 (population std) is explicitly specified for Bollinger Bands calculations to match the ta library
rationale; be aware that pandas DataFrame.std() defaults to ddof=1 (sample std), causing different band widths in cross-library
validation
severity: medium
kind: operational_lesson
modality: should
consequence: Using pandas default ddof=1 when comparing with ta library Bollinger Bands produces wider bands, causing
cross-library validation failures and potential misinterpretation of volatility signals
derived_from_bd_id: BD-102
- id: finance-C-144
when: When implementing or extending EMA-based indicators in the framework
action: Verify adjust=False (Wilder smoothing) is consistently used for all EMA calculations; do not assume compatibility
with pandas DataFrame.ewm(adjust=True) default, which produces different smoothing behavior
severity: high
kind: domain_rule
modality: must
consequence: Assuming pandas ewm compatibility causes systematic calculation differences across all 80+ EMA-using indicators,
making backtest results incompatible with standard pandas workflows and creating silent discrepancies hard to debug
derived_from_bd_id: BD-103
- id: finance-C-145
when: When using the ROC (Rate of Change) indicator with default 12-period parameter
action: Verify that the default period of 12 bars matches strategy requirements for short-term momentum capture; validate
that price data does not approach zero, as ROC calculation produces extreme values when divisor approaches zero
severity: medium
kind: operational_lesson
modality: should
consequence: Using ROC with near-zero prices produces extreme percentage values that could trigger incorrect trading signals,
causing strategies to misread momentum and execute trades at unfavorable prices
derived_from_bd_id: BD-GAP-002
- id: finance-C-146
when: When implementing or modifying Bollinger Bands calculations in the framework
action: Maintain 20-period SMA with 2 standard deviation bands as the default configuration; changing these parameters
fundamentally alters the indicator's behavior and breaks compatibility with industry-standard trading platforms
severity: high
kind: domain_rule
modality: must
consequence: Modifying Bollinger Bands default parameters without explicit rationale causes backtest results to diverge
from standard trading platform outputs, making strategy validation unreliable and performance claims non-reproducible
derived_from_bd_id: BD-GAP-003
- id: finance-C-147
when: When implementing volume moving average calculations in the framework
action: Use Simple Moving Average (SMA) with 20-period default for volume calculations; do not substitute exponential
weighting without explicit business rationale, as SMA ensures predictable and human-readable threshold interpretation
severity: medium
kind: domain_rule
modality: must
consequence: Replacing SMA with exponential weighting amplifies recent volume spikes and introduces signal interpretation
lag, causing volume-based strategies to generate different entry/exit signals than originally designed
derived_from_bd_id: BD-GAP-005
- id: finance-C-148
when: When using the framework's default Bollinger Bands parameters for backtesting or live trading
action: Verify that ddof=0 (population standard deviation) matches the intended statistical assumption for the strategy;
for sample-based statistical analysis, explicitly set ddof=1 and revalidate band thresholds
severity: medium
kind: operational_lesson
modality: should
consequence: Using population std (ddof=0) instead of sample std (ddof=1) produces narrower bands, potentially generating
false Bollinger Band breakouts and triggering premature mean-reversion entries that consistently lose money
derived_from_bd_id: BD-003
- id: finance-C-149
when: When implementing or refactoring Ultimate Oscillator calculations
action: Apply the 4:2:1 weighting ratio across the three timeframes (short_period:medium_period:long_period) as designed
by Larry Williams to balance short-term, medium-term, and long-term buying pressure
severity: high
kind: domain_rule
modality: must
consequence: Using incorrect weighting ratios alters oscillator sensitivity across timeframes, causing divergences and
false signals that lead to poor entry and exit timing decisions
derived_from_bd_id: BD-018
- id: finance-C-150
when: When implementing or refactoring Stochastic Oscillator indicator logic
action: Apply smooth_window=3 as the default SMA smoothing for %D signal line calculation to reduce raw %K volatility
and generate clear crossover signals
severity: high
kind: domain_rule
modality: must
consequence: Removing or changing smooth_window from 3 causes the indicator to behave as raw %K instead of smoothed %D,
introducing excessive noise and false crossover signals that degrade strategy performance
derived_from_bd_id: BD-020
- id: finance-C-151
when: When implementing or modifying Keltner Channel indicators in backtesting or live trading
action: Assume Keltner Channel operates independently of its internal ATR component — the band calculations depend entirely
on the hidden ATR instance using Wilder smoothing, and modifying Keltner without awareness of this dependency causes
silent calculation changes
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtested strategies using Keltner bands without awareness of hidden ATR dependency produce results that
diverge from live trading when ATR parameters or smoothing assumptions change silently
derived_from_bd_id: BD-099
- id: finance-C-152
when: When implementing strategies using Keltner Channel indicators
action: Access the internal KeltnerChannel.indicators['atr'] instance and verify Wilder smoothing (ema_span=2*atr_period-1)
matches strategy assumptions; when modifying ATR parameters, revalidate Keltner band thresholds and multiplier=2 accordingly
severity: high
kind: domain_rule
modality: must
consequence: Strategies relying on Keltner band width for volatility breakout or mean reversion signals fail silently
when the internal ATR smoothing differs from expectations, causing persistent backtest-live inconsistency in position
sizing and entry decisions
derived_from_bd_id: BD-099
- id: finance-C-153
when: When implementing or modifying KAMA indicator calculations in backtesting
action: Validate that np.roll() boundary handling does not corrupt Efficiency Ratio calculations; use explicit slicing
or initialization logic for the first N values instead of relying on np.roll() wrapped data
severity: medium
kind: operational_lesson
modality: should
consequence: np.roll() wraps array boundaries, causing the first N KAMA values to incorporate wrapped last-element data,
which corrupts the Efficiency Ratio and produces incorrect adaptive smoothing constants for trend signals near the series
start
derived_from_bd_id: BD-104
- id: finance-C-154
when: When using MACD indicator with default parameters for backtesting
action: 'Verify that MACD parameters match the intended market frequency: fast=12 (approx 2-week), slow=26 (approx 1-month),
signal=9 represent short-term momentum conventions; document any custom period justification'
severity: medium
kind: operational_lesson
modality: should
consequence: Using MACD with mismatched periods for the target market frequency produces momentum signals at wrong time
scales, causing trades to enter/exit at suboptimal points and reducing strategy profitability
derived_from_bd_id: BD-094
- id: finance-C-155
when: When calculating RSI values for overbought/oversold signals
action: 'Maintain the exact Wilder RSI parameter combination: window=14, alpha=1/14, and adjust=False must all be preserved
together; changing any single parameter without the others breaks Wilder''s exponential smoothing equivalence'
severity: medium
kind: operational_lesson
modality: should
consequence: Changing RSI parameters individually without maintaining the Wilder equivalence produces non-standard RSI
values, causing overbought/oversold signals to trigger at incorrect thresholds and leading to wrong trade entries
derived_from_bd_id: BD-097
- id: finance-C-156
when: When processing timestamp data in backtesting
action: Assume timezone handling is explicit or that UTC conversion is applied automatically — the framework lacks explicit
timezone annotation
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit timezone annotation, timestamps from different sources may use inconsistent timezone references,
causing overnight gap calculations and True Range measurements to produce incorrect volatility estimates across sessions
derived_from_bd_id: BD-GAP-009
- id: finance-C-157
when: When ingesting timestamp data at the data input boundary
action: Annotate all timestamps with explicit timezone (prefer UTC); convert all timestamps to UTC-aware datetime types
at ingestion; store timestamps with timezone info to prevent misalignment across sessions
severity: high
kind: domain_rule
modality: must
consequence: Without timezone-aware timestamps, overnight gap calculations produce incorrect True Range values, which
propagates errors into ATR, position sizing, and risk management calculations throughout the backtest
derived_from_bd_id: BD-GAP-009
- id: finance-C-158
when: When calculating volatility measures using True Range
action: 'Use the complete True Range formula: max(High-Low, |High-PrevClose|, |Low-PrevClose|); do not simplify to High-Low
only; for the first bar without previous close, use High-Low as the fallback'
severity: high
kind: domain_rule
modality: must
consequence: Simplified High-Low range misses overnight gaps, causing volatility underestimation; this propagates into
ATR, Bollinger Bands, and position sizing calculations, leading to incorrect risk estimates and over-leveraged positions
derived_from_bd_id: BD-034
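# Illustrative sketch for finance-C-158 (Python with pandas, kept as comments so this file stays valid YAML).
# The function name true_range is hypothetical. The row-wise max skips NaN by default, so the first
# bar (which has no previous close) falls back to High-Low.
#
#   import pandas as pd
#
#   def true_range(high: pd.Series, low: pd.Series, close: pd.Series) -> pd.Series:
#       prev_close = close.shift(1)
#       candidates = pd.concat(
#           [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
#           axis=1,
#       )
#       return candidates.max(axis=1)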
- id: finance-C-159
when: When processing data inputs in the framework
action: Assume the framework automatically detects stale data and handles data expiry — the framework does not implement
stale data detection or expiry mechanisms
severity: high
kind: claim_boundary
modality: must_not
consequence: Without stale data detection, the framework may process outdated market data that no longer reflects current
market conditions, causing strategies to trade on expired information and produce incorrect signals
derived_from_bd_id: BD-GAP-011
- id: finance-C-160
when: When implementing data ingestion pipelines
action: Implement data freshness validation by checking timestamp fields (e.g., data_date, timestamp) against current
time, and reject or flag data older than the configured stale_threshold (e.g., > 5 minutes for intraday data)
severity: high
kind: operational_lesson
modality: must
consequence: Without data freshness validation, stale market data causes strategies to generate signals based on outdated
prices, leading to failed trade execution or positions based on expired information
derived_from_bd_id: BD-GAP-011
- id: finance-C-161
when: When implementing any stochastic operations or random sampling in backtesting
action: Assume reproducibility is guaranteed without explicit random seed configuration — the framework does not automatically
set random seeds for all random number generators
severity: high
kind: claim_boundary
modality: must_not
consequence: Without explicit random seed coverage, backtest results become non-reproducible; different runs produce varying
equity curves and performance metrics, making it impossible to validate strategy consistency or compare strategy improvements
derived_from_bd_id: BD-GAP-014
- id: finance-C-162
when: When running backtests or simulations with stochastic components
action: Set random seed explicitly via framework seed function before stochastic operations, and verify that all RNG sources
(numpy.random, random, torch.random, etc.) are seeded consistently
severity: high
kind: operational_lesson
modality: must
consequence: Without full random seed coverage, backtests produce non-reproducible results across runs; the same strategy
may show different Sharpe ratios and drawdowns, making it impossible to verify strategy stability or compare parameter
changes
derived_from_bd_id: BD-GAP-014
- id: finance-C-163
when: When loading models or datasets for strategy execution
action: Assume the framework automatically binds and tracks model versions to data versions — the framework does not implement
version snapshot binding between models and data
severity: high
kind: claim_boundary
modality: must_not
consequence: Without model-data version binding, running a model trained on older data with newer market conditions may
produce unpredictable results, and it becomes impossible to reproduce specific backtest runs when model or data versions
change
derived_from_bd_id: BD-GAP-015
- id: finance-C-164
when: When implementing model versioning and data management
action: Implement version snapshot binding by capturing model version hash and data version hash at model load time, and
log them alongside backtest results to verify reproducibility
severity: high
kind: domain_rule
modality: must
consequence: Without version snapshot binding, mixing different model versions with different data versions causes non-reproducible
backtests; changing model or data without tracking produces inconsistent results that cannot be validated or audited
derived_from_bd_id: BD-GAP-015
- id: finance-C-165
when: When implementing PPO (Percentage Price Oscillator) indicator calculations
action: Calculate PPO using the formula ((fast_EMA - slow_EMA) / slow_EMA) * 100 to normalize momentum as a percentage
of price, enabling cross-asset comparison of momentum signals
severity: high
kind: domain_rule
modality: must
consequence: Incorrect PPO formula produces wrong momentum normalization; using absolute MACD values instead of percentage
causes PPO signals to be incomparable across different price levels and assets, leading to suboptimal asset selection
in multi-asset strategies
derived_from_bd_id: BD-040
- id: finance-C-166
when: When implementing Ultimate Oscillator calculations
action: Use periods 7, 14, and 28 with corresponding weights 4, 2, and 1 to calculate the weighted average of BP/TR ratios
across three timeframes
severity: high
kind: domain_rule
modality: must
consequence: Using incorrect Ultimate Oscillator periods or weights (not 7/14/28 with 4/2/1 weighting) produces false
signals that deviate from the standard formula, reducing the oscillator's effectiveness at filtering noise and confirming
momentum across multiple timeframes
derived_from_bd_id: BD-048
- id: finance-C-167
when: When implementing or modifying Awesome Oscillator calculation in ta/momentum.py
action: Use median price (H+L)/2 rather than close price, combined with fixed windows w1=5 and w2=34 (representing one
week and one month of daily data)
severity: high
kind: domain_rule
modality: must
consequence: Using close instead of median price reduces noise reduction effect; using different windows deviates from
Bill Williams' original formula and produces non-standard indicator values that may not match mainstream technical analysis
libraries
derived_from_bd_id: BD-052
- id: finance-C-168
when: When implementing Stochastic RSI calculation in ta/momentum.py
action: Apply stochastic oscillator normalization formula to RSI values themselves (not raw prices), producing output
normalized to [0,1] range within the rolling window's RSI extremes
severity: high
kind: domain_rule
modality: must
consequence: Applying stochastic normalization to prices instead of RSI values fundamentally changes the indicator meaning,
producing a momentum oscillator instead of an RSI position indicator; strategies expecting RSI extremes analysis will
receive incorrect signals
derived_from_bd_id: BD-054
- id: finance-C-169
when: When implementing SMA calculation in ta/utils.py
action: Use rolling window mean with min_periods parameter to allow partial calculations before window is complete; verify
min_periods <= periods and both are positive integers >= 1
severity: high
kind: domain_rule
modality: must
consequence: Requiring full periods for all SMA calculations would produce excessive NAs at series start, breaking downstream
indicators and strategies that depend on SMA values being available earlier
derived_from_bd_id: BD-057
- id: finance-C-170
when: When implementing WMA calculation in ta/trend.py
action: Apply linear weight formula weight_i = 2*i/(n*(n+1)) where i ranges from 1 to n, ensuring weights sum to 1.0 and
are strictly increasing (most recent price gets highest weight)
severity: high
kind: domain_rule
modality: must
consequence: Using non-linear weights (exponential, square root, or equal weights) changes WMA's sensitivity to recent
price changes, altering signal timing and potentially causing strategies to generate trades at different points than
expected
derived_from_bd_id: BD-059
- id: finance-C-171
when: When implementing volume-based features in feature_aggregation
action: Use On-Balance Volume (OBV) cumulative signed volume logic where positive days add to running total and negative
days subtract, preserving directional volume information
severity: high
kind: domain_rule
modality: must
consequence: Using raw volume totals loses directional information; strategies expecting OBV divergence detection (price
up but volume down signals weakness) will fail to detect momentum divergence, missing critical trend reversal signals
derived_from_bd_id: BD-GAP-001
- id: finance-C-172
when: When implementing Bollinger Band feature extraction
action: 'Return all four values: upper_band, middle_band, lower_band, and bandwidth_percentage; the bandwidth_percentage
enables volatility regime detection independent of absolute price levels'
severity: high
kind: domain_rule
modality: must
consequence: Omitting bandwidth_percentage breaks volatility regime detection strategies that rely on normalized band
width; dropping any band loses either breakout or mean-reversion signal support
derived_from_bd_id: BD-GAP-006
- id: finance-C-173
when: When middle band equals zero in Bollinger Band calculation
action: Handle division by zero gracefully — bandwidth calculation (bandwidth = (upper - lower) / middle) must not raise
an exception; return null or skip the record when middle_band is zero
severity: high
kind: domain_rule
modality: must
consequence: Division by zero crashes the feature extraction when middle band is zero (occurs during certain market conditions
or data artifacts); unhandled exception prevents backtest from running
derived_from_bd_id: BD-GAP-006
- id: finance-C-175
when: When using Williams %R output for trading signals
action: 'Apply the standard Williams %R boundaries: values above -20 signal overbought (potential sell), values below
-80 signal oversold (potential buy); these thresholds are specific to the negative-scaled convention'
severity: high
kind: operational_lesson
modality: must
consequence: Using positive-scaled thresholds (e.g., above 80 for overbought) on negative-scaled output produces inverted
signals, causing the strategy to buy when overbought and sell when oversold
derived_from_bd_id: BD-019
- id: finance-C-176
when: When implementing Force Index indicator
action: Use default window of 13 for EMA smoothing as specified by Elder — this window provides sufficient smoothing to
filter daily noise while remaining responsive to meaningful force changes
severity: high
kind: domain_rule
modality: must
consequence: Using non-13 window sizes changes the indicator's smoothing characteristics; shorter windows increase sensitivity
to daily fluctuations while longer windows follow longer trends, altering the intended signal behavior
derived_from_bd_id: BD-022
- id: finance-C-177
when: When implementing Volume Price Trend indicator
action: Calculate VPT as cumulative sum of (close_pct_change * volume) with no default smoothing applied — preserve the
raw cumulative nature of the indicator
severity: high
kind: domain_rule
modality: must
consequence: Applying default EMA smoothing to VPT fundamentally changes its interpretation from cumulative money flow
to smoothed momentum; the indicator no longer represents true cumulative flow behavior
derived_from_bd_id: BD-023
- id: finance-C-178
when: When using VPT output for trading decisions
action: Apply custom moving average smoothing externally if needed — the raw unsmoothed VPT shows true cumulative flow;
if smoothed version is required, apply user-defined MA after obtaining the base VPT value
severity: medium
kind: operational_lesson
modality: should
consequence: Misinterpreting unsmoothed VPT volatility as noise may lead to unnecessary filtering or over-smoothing, obscuring
genuine cumulative flow divergences that indicate trend reversals
derived_from_bd_id: BD-023
- id: finance-C-179
when: When implementing or refactoring Aroon indicator calculations in backtesting
action: Verify that the Aroon window parameter matches the intended strategy timeframe; default window=25 periods assumes
one trading month lookback; adjust window to align with specific trend capture objectives (e.g., 14 for shorter-term,
50+ for longer-term trends)
severity: medium
kind: operational_lesson
modality: should
consequence: Using default Aroon window=25 with a short-term strategy causes laggy signals and missed opportunities; using
a short window with a long-term strategy introduces excessive noise and false signals
derived_from_bd_id: BD-027
- id: finance-C-180
when: When implementing or refactoring Vortex Indicator calculations in backtesting
action: Verify that the Vortex window parameter matches the intended strategy timeframe; default window=14 periods assumes
two trading weeks; adjust window to balance responsiveness vs reliability for specific market conditions
severity: medium
kind: operational_lesson
modality: should
consequence: Using default VI window=14 with highly volatile assets causes whipsaw trades and excessive false signals;
using a longer window reduces responsiveness, causing delayed entries and exits
derived_from_bd_id: BD-028
- id: finance-C-181
when: When implementing or refactoring Keltner Channel calculations in backtesting
action: Verify that the Keltner Channel multiplier matches the intended volatility capture; default multiplier=2 assumes
95% price envelope under normal distribution; adjust multiplier based on asset volatility profile (higher volatility
assets may need multiplier >2, lower volatility may need <2)
severity: high
kind: operational_lesson
modality: must
consequence: Using default multiplier=2 on high-volatility assets creates bands too tight, generating excessive false
breakout signals; using too large a multiplier on low-volatility assets creates bands too wide, missing valid breakout
opportunities
derived_from_bd_id: BD-030
- id: finance-C-182
when: When implementing or refactoring Ulcer Index calculations in backtesting
action: Verify that the Ulcer Index rolling max period matches the intended risk measurement timeframe; default period=14
assumes two trading weeks drawdown measurement; adjust period to align with strategy holding period and risk tolerance
severity: medium
kind: operational_lesson
modality: should
consequence: Using default Ulcer period=14 with long-holding strategies causes underestimation of maximum drawdown risk;
using longer periods with short-holding strategies causes over-sensitivity to temporary dips and premature risk alerts
derived_from_bd_id: BD-031
- id: finance-C-183
when: When implementing or refactoring Schaff Trend Cycle (STC) calculations in backtesting
action: Verify that all STC parameters match the intended strategy characteristics; default (window_slow=50, window_fast=23,
cycle=10, smooth1=3, smooth2=3) assumes medium-term trend following; adjust parameters based on asset class and trading
frequency (shorter cycles for high-frequency, longer for swing trading)
severity: high
kind: operational_lesson
modality: must
consequence: Using default STC parameters mismatched to strategy timeframe causes systematic signal timing errors; short
cycle values with long-term strategies create noise, long cycle values with short-term strategies cause lag and missed
opportunities
derived_from_bd_id: BD-032
- id: finance-C-184
when: When implementing or modifying the CCI indicator calculation
action: Use any denominator other than Mean Absolute Deviation (MAD) — CCI requires MAD for proper outlier insensitivity
and the constant 0.015 to ensure 70-80% of values fall within [-100, 100]
severity: high
kind: domain_rule
modality: must_not
consequence: Using standard deviation instead of MAD would increase CCI sensitivity to outliers, causing significant deviation
from expected statistical distribution and producing unreliable overbought/oversold signals
derived_from_bd_id: BD-065
- id: finance-C-185
when: When configuring the CCI indicator parameters
action: Verify window parameter is a positive integer >= 1 and constant parameter is exactly 0.015
severity: high
kind: domain_rule
modality: must
consequence: CCI constant=0.015 (approximately 1/66.7) is calibrated to produce 70-80% values in [-100, 100]; incorrect
constant produces non-standard CCI distribution breaking threshold-based trading strategies
derived_from_bd_id: BD-065
- id: finance-C-186
when: When implementing or modifying the ADX indicator calculation
action: Replace Wilder smoothing (recursive EMA equivalent) with a standard EMA for DX averaging; a standard EMA produces different lag characteristics
and breaks consistency with Wilder's original DM/TR methodology
severity: high
kind: domain_rule
modality: must_not
consequence: Replacing Wilder smoothing with standard EMA alters the lag profile of ADX values, causing systematic divergence
from Welles Wilder's original formula and breaking strategies calibrated to ADX signal levels
derived_from_bd_id: BD-066
- id: finance-C-187
when: When configuring the ADX indicator parameters
action: Verify window parameter is a positive integer >= 1 (standard value is 14 per Wilder's methodology)
severity: medium
kind: domain_rule
modality: must
consequence: ADX window controls DX averaging; non-standard window values alter smoothing characteristics and produce
ADX values incompatible with standard interpretation thresholds (e.g., ADX > 25 for trend strength)
derived_from_bd_id: BD-066
- id: finance-C-188
when: When implementing or modifying the Vortex Indicator calculation
action: Compute VI from close-to-close changes or alternative range measures instead of the sum of absolute price movement divided by the
True Range sum; doing so alters VI's volatility sensitivity and breaks the upward/downward movement isolation
severity: high
kind: domain_rule
modality: must_not
consequence: Alternative price movement calculations change VI's normalization, causing VI+ and VI- crossover signals
to occur at different price levels than expected and breaking mean-reversion strategies using VI thresholds
derived_from_bd_id: BD-067
- id: finance-C-189
when: When configuring the Vortex Indicator parameters
action: Verify window parameter is a positive integer >= 1; VI+ and VI- are separate indicators that must be compared
together (not in isolation)
severity: medium
kind: domain_rule
modality: must
consequence: VI+ and VI- crossover timing depends on window size; using only one component or incorrect window breaks
momentum reversal detection that relies on VI signal line crossings
derived_from_bd_id: BD-067
- id: finance-C-190
when: When implementing or modifying the PSAR indicator calculation
action: Use parabolic stop-and-reverse formula with acceleration factor step=0.02 and max_step=0.20 to achieve gradual
acceleration without excessive sensitivity
severity: high
kind: domain_rule
modality: must
consequence: Different step/max_step values alter stop-tightening speed and reversal frequency; excessive step values
cause premature reversals while insufficient values result in stops too loose to be useful
derived_from_bd_id: BD-068
- id: finance-C-191
when: When configuring PSAR indicator parameters
action: Verify step is in range (0, max_step] and max_step is positive; larger max_step values increase reversal sensitivity
severity: high
kind: domain_rule
modality: must
consequence: PSAR max_step=0.20 caps acceleration factor to control reversal frequency; unbounded or larger max_step causes
erratic stop behavior making the indicator impractical for risk management
derived_from_bd_id: BD-068
- id: finance-C-192
when: When implementing or modifying the STC indicator calculation
action: Apply stochastic normalization to MACD values, then perform double EMA smoothing with cycle=10 — single EMA or
different cycle lengths alter smoothness and signal timing
severity: high
kind: domain_rule
modality: must
consequence: STC combines MACD momentum with oscillator properties using double smoothing; single EMA or incorrect cycle
breaks the noise-filtering design causing excessive false signals in volatile markets
derived_from_bd_id: BD-069
- id: finance-C-193
when: When configuring STC indicator parameters
action: Verify cycle parameter is a positive integer (standard value is 10); cycle controls the EMA smoothing window affecting
both noise filtering and signal lag
severity: medium
kind: domain_rule
modality: must
consequence: STC cycle=10 balances smoothness against responsiveness; non-standard cycle values shift the STC centerline
crossover timing breaking strategies calibrated to standard 50-level signals
derived_from_bd_id: BD-069
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
i+1 < len(result.trade_log) and result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-122 / Sphinx Documentation Configuration
version: v5.3
intent_keywords:
- documentation
- sphinx
- config
- api docs
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: auto-grouped by UC.type (2 distinct values, balanced distribution)
groups:
- group_id: reporting
name: Reporting
description: ''
emoji: 📋
uc_count: 1
ucs:
- uc_id: UC-101
name: Sphinx Documentation Configuration
short_description: Configures the Sphinx documentation builder for the Technical Analysis Library, enabling automated
generation of API documentation
sample_triggers:
- documentation
- sphinx
- config
- group_id: research_analysis
name: Research Analysis
description: ''
emoji: 📦
uc_count: 1
ucs:
- uc_id: UC-102
name: Technical Analysis Features Visualization
short_description: Explores and visualizes various technical analysis indicators (Bollinger Bands, Keltner Channel,
Donchian Channel, MACD) on historical price data to u
sample_triggers:
- visualize
- technical indicators
- charting
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-101
beginner_prompt: Try sphinx documentation configuration
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try technical analysis features visualization
auto_selected: true
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 2 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- Technical Analysis Features Visualization
- Sphinx Documentation Configuration
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
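Validates the core workflows of the Frappe Lending loan module, including automated testing of loan application creation, disbursement schedule generation, repayment processing, and closure refunds.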
---
name: p2p-lending-data
description: |-
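Validates the core workflows of the Frappe Lending loan module, including automated testing of loan application creation, disbursement schedule generation, repayment processing, and closure refunds.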
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-072"
compiled_at: "2026-04-22T13:00:26.108289+00:00"
capability_markets: "global"
capability_activities: "credit-risk"
sop_version: "crystal-compilation-v6.1"
---
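# P2P Lending Tests (p2p-lending-data)
> Validates the core workflows of the Frappe Lending loan module, including automated testing of loan application creation, disbursement schedule generation, repayment processing, and closure refunds.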
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (18 total)
### Test Infrastructure Setup for Lending Module (`UC-101`)
Provides shared test utilities and setup functions needed by all lending module tests, including master initialization, loan product creation, and cu
**Triggers**: test setup, lending test utils, test infrastructure
### Loan Refund and Closure Testing (`UC-102`)
Tests the loan closure process when a borrower requests a refund of excess amounts after repaying the loan
**Triggers**: loan refund, loan closure, excess amount refund
### Loan Application Creation Testing (`UC-103`)
Tests the creation and processing of loan applications including rate of interest configuration and applicant details
**Triggers**: loan application, loan request, apply for loan
For all **18** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
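
As a rough illustration of how two of the locks above could be checked before a signal is accepted, the sketch below validates SL-03 and SL-05. The names `check_entity_id` and `check_signal_sizing` and the dict-based signal are hypothetical, not part of ZVT, and the regular expression assumes three underscore-separated parts (lowercase type, lowercase exchange, alphanumeric code).

```python
import re

# Assumed shape for SL-03: entity_type_exchange_code, e.g. stock_sh_600000.
ENTITY_ID_RE = re.compile(r"[a-z0-9]+_[a-z0-9]+_[A-Za-z0-9.]+")

# SL-05: exactly one of these sizing fields may be set on a signal.
SIZING_FIELDS = ("position_pct", "order_money", "order_amount")


def check_entity_id(entity_id: str) -> bool:
    """Return True when entity_id has the three-part form required by SL-03."""
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


def check_signal_sizing(signal: dict) -> bool:
    """Return True when exactly one sizing field is set (not None), per SL-05."""
    set_fields = [name for name in SIZING_FIELDS if signal.get(name) is not None]
    return len(set_fields) == 1
```

A caller would typically halt on a `False` result, since both locks are marked `halt` in the table.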
## Top Anti-Patterns (14 total)
- **`AP-CREDIT-RISK-001`**: Empty DataFrame passed to bucketing pipeline
- **`AP-CREDIT-RISK-002`**: Multi-dimensional target array causing WoE shape mismatch
- **`AP-CREDIT-RISK-003`**: OptimalBucketer receiving high-cardinality numerical features
All 14 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-072. Evidence verify ratio = 69.5% and audit fail total = 24. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative data (source of truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 14 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When full examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-072` blueprint at 2026-04-22T13:00:26.108289+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Loan Application Creation Testing
  - Loan Refund and Closure Testing
  - Test Infrastructure Setup for Lending Module
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **14**
## finance-bp-050--skorecard (5)
### `AP-CREDIT-RISK-001` — Empty DataFrame passed to bucketing pipeline <sub>(high)</sub>
When preparing input data for bucketing, passing an empty DataFrame with zero rows or zero columns causes immediate ValueError at validation stage. This prevents any downstream processing and blocks the entire credit risk scoring pipeline from executing. The root cause is missing defensive validation before data enters the bucketing workflow.
### `AP-CREDIT-RISK-002` — Multi-dimensional target array causing WoE shape mismatch <sub>(high)</sub>
When providing target variable y to bucketers without normalizing to 1D numpy array through _check_y validation, downstream Weight of Evidence calculations fail with shape mismatches. The consequence is corrupted bucket tables with incorrect credit risk scores that misrepresent default probability estimates.
### `AP-CREDIT-RISK-003` — OptimalBucketer receiving high-cardinality numerical features <sub>(high)</sub>
When implementing prebucketing for OptimalBucketer on numerical features without reducing to at most 100 unique values, the system raises NotPreBucketedError and blocks the entire bucketing pipeline. Similarly, AsIsNumericalBucketer fails with the same error for columns exceeding 100 unique values, preventing feature transformation in production scoring.
### `AP-CREDIT-RISK-004` — Special values distorting optimal bin boundaries <sub>(high)</sub>
When implementing fit() for bucketers without filtering special values from X before computing bin boundaries using _filter_specials_for_fit(), outlier special values distort optimal bin boundaries. This causes incorrect weight-of-evidence calculations and unreliable credit risk scores that misrepresent borrower default probabilities.
### `AP-CREDIT-RISK-005` — Two-phase bucketing ordering violation causing special value loss <sub>(high)</sub>
When fitting a BucketingProcess with two-phase bucketing without fitting prebucketing_pipeline before bucketing_pipeline, special value remapping fails because pre-bucket labels are unavailable. Additionally, not using _find_remapped_specials() after prebucketing causes special values to lose their correct bucket mappings, resulting in runtime errors.
## finance-bp-072--lending (3)
### `AP-CREDIT-RISK-006` — Loan amount exceeding product and collateral limits <sub>(high)</sub>
When validating the loan amount on a loan application, failing to enforce that loan_amount does not exceed the maximum_loan_amount set by the loan product, or the value of the proposed securities, allows disbursements beyond product or collateral limits and exposes the lender to uncollateralized risk. This violates lending policy and creates direct financial loss exposure through unauthorized lending.
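
The limit check described above might look like the following minimal sketch. It is not Frappe Lending's actual validation code; `validate_loan_amount` and `security_value` are hypothetical names, and treating `security_value=None` as "unsecured loan" is an assumption of this example.

```python
from decimal import Decimal


def validate_loan_amount(loan_amount, maximum_loan_amount, security_value=None):
    """Raise ValueError when the requested amount exceeds either limit.

    maximum_loan_amount: cap taken from the loan product.
    security_value: lendable value of the proposed securities, or None when
    the loan is unsecured (an assumption of this sketch).
    """
    amount = Decimal(str(loan_amount))
    if amount <= 0:
        raise ValueError("loan_amount must be positive")
    if amount > Decimal(str(maximum_loan_amount)):
        raise ValueError("loan_amount exceeds the product's maximum_loan_amount")
    if security_value is not None and amount > Decimal(str(security_value)):
        raise ValueError("loan_amount exceeds the value of proposed securities")
    return amount
```

Raising before any disbursement record is created keeps an over-limit request from reaching the ledger at all.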
### `AP-CREDIT-RISK-007` — Disbursement validation failures creating unauthorized exposure <sub>(high)</sub>
When implementing loan disbursement validation without checking disbursed amount against loan limit, assigned security value, available limit amount, and limit applicability dates, unauthorized disbursements occur. For Line of Credit loans, disbursement outside approved periods or exceeding available limits creates unauthorized lending exposure and regulatory compliance violations.
### `AP-CREDIT-RISK-008` — Interest accrual on written-off loans inflating income <sub>(high)</sub>
When processing interest accrual for Written Off loans without verifying posting_date is on or after the loan write-off date, interest is artificially inflated on non-performing assets. This misrepresents loan portfolio value, violates provisioning requirements, and creates false income reporting that misleads stakeholders about actual financial performance.
## finance-bp-112--openLGD (2)
### `AP-CREDIT-RISK-009` — Loop index errors in federated parameter averaging <sub>(high)</sub>
When implementing federated parameter averaging logic, using the final index n instead of the loop variable k causes only the last server's weight to be applied repeatedly. Additionally, skipping the first server by starting loop index at 1 excludes valid parameters from averaging, breaking federated convergence and producing incorrect LGD estimates across all nodes.
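
To make the two indexing mistakes concrete, here is a small sketch of a weighted average that avoids both. It is an illustration, not openLGD's code; `average_parameters` and its plain-list inputs are hypothetical.

```python
def average_parameters(server_params, weights=None):
    """Weighted average of per-server parameter vectors.

    server_params: list of equal-length lists, one per server.
    weights: optional list of per-server weights; equal weights if omitted.
    """
    n = len(server_params)
    if n == 0:
        raise ValueError("no server parameters to average")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("weights must have one entry per server")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    dim = len(server_params[0])
    averaged = [0.0] * dim
    # Start at 0 so the first server is included, and index with the loop
    # variable k (not the final index n) so every server contributes.
    for k in range(n):
        if len(server_params[k]) != dim:
            raise ValueError("parameter vectors differ in length")
        for j in range(dim):
            averaged[j] += weights[k] * server_params[k][j] / total
    return averaged
```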
### `AP-CREDIT-RISK-010` — API response format inconsistency breaking federated coordination <sub>(high)</sub>
When implementing GET /start and POST /update endpoints for LGD estimation without consistent 'intercept' and 'coefficient' keys in JSON responses, the federated coordinator fails to parse responses causing KeyError. Different return key names (e.g., 'coef' instead of 'coefficient') break both standalone and federated execution paths.
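One way to keep the key names consistent is to route every endpoint through the same builder and have the coordinator validate before reading. The sketch below uses only the standard library; `build_response` and `parse_response` are hypothetical helpers, and only the `intercept` and `coefficient` key names come from the description above.

```python
import json

REQUIRED_KEYS = ("intercept", "coefficient")


def build_response(intercept: float, coefficient: float) -> str:
    """Single place where the response body is shaped, shared by /start and /update."""
    return json.dumps({"intercept": float(intercept), "coefficient": float(coefficient)})


def parse_response(body: str) -> tuple[float, float]:
    """Coordinator-side parsing that fails with a clear message instead of a bare KeyError."""
    payload = json.loads(body)
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"response is missing required keys: {missing}")
    return float(payload["intercept"]), float(payload["coefficient"])
```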
## finance-bp-119--transitionMatrix (4)
### `AP-CREDIT-RISK-011` — Invalid transition probabilities corrupting Markov matrices <sub>(high)</sub>
When generating synthetic Markov chain data or estimating transition matrices with probabilities outside [0, 1] or row sums not equal to 1.0, the resulting matrices violate the fundamental mathematical definition of a stochastic transition matrix. This corrupts all downstream Markov chain modeling and credit curve generation, producing unreliable credit risk estimates.
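The two conditions named above (entries in [0, 1], each row summing to 1.0) can be checked before a matrix is passed downstream. This is a dependency-free sketch over nested lists; the function name and tolerance are assumptions, and it is not transitionMatrix's own validation routine.

```python
import math


def is_stochastic(matrix: list[list[float]], tol: float = 1e-9) -> bool:
    """True if the matrix is square, every entry lies in [0, 1], and every row sums to 1."""
    n = len(matrix)
    if n == 0:
        return False
    for row in matrix:
        if len(row) != n:
            return False
        # The chained comparison is False for NaN, so NaN entries are rejected too.
        if any(not (0.0 <= p <= 1.0) for p in row):
            return False
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True
```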
### `AP-CREDIT-RISK-012` — Unsorted event data causing incorrect transition matrix estimates <sub>(high)</sub>
When feeding generated data to cohort or duration estimators without sorting by entity ID first, then by ascending time, incorrect timepoint assignment occurs in estimators, leading to wrong transition counts. Unsorted data also causes the Aalen-Johansen algorithm to process events out of temporal order, producing incorrect transition matrices that violate the Markov property.
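The required ordering is entity ID first, then ascending time. A small sketch with plain tuples follows; the `(entity_id, time, state)` layout is an assumption for illustration, and the estimators themselves may expect a different container such as a DataFrame.

```python
def sort_events(events: list[tuple[int, float, int]]) -> list[tuple[int, float, int]]:
    """Order (entity_id, time, state) records by entity, then by ascending time."""
    return sorted(events, key=lambda event: (event[0], event[1]))


def is_sorted_for_estimation(events: list[tuple[int, float, int]]) -> bool:
    """Cheap guard to run just before handing data to an estimator."""
    keys = [(event[0], event[1]) for event in events]
    return all(earlier <= later for earlier, later in zip(keys, keys[1:]))
```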
### `AP-CREDIT-RISK-013` — Zero-count division causing NaN in transition matrices <sub>(high)</sub>
When normalizing counts to produce transition probabilities without checking source state population count is greater than zero before division, division by zero occurs and causes NaN values in the transition matrix. These NaN values corrupt all downstream matrix operations including generator matrix computation and credit curve generation.
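A sketch of the guarded normalization. What to do with a state that was never observed as a source is a modeling choice; treating it as absorbing (probability 1 of staying put) is an assumption made here so that the row still sums to 1, not something the description above prescribes.

```python
def normalize_counts(counts: list[list[float]]) -> list[list[float]]:
    """Turn a square matrix of transition counts into row-stochastic probabilities."""
    probabilities = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total > 0:
            probabilities.append([count / total for count in row])
        else:
            # Assumption: an unobserved source state is treated as absorbing,
            # which avoids 0/0 and keeps the row sum equal to 1.
            probabilities.append([1.0 if j == i else 0.0 for j in range(len(row))])
    return probabilities
```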
### `AP-CREDIT-RISK-014` — Wrong matrix logarithm method producing invalid generator matrices <sub>(medium)</sub>
When implementing generator() method without using scipy.linalg.logm for matrix logarithm computation, using numpy.log or other approximation methods produces invalid generator matrices with row sums not equal to zero. This violates the mathematical definition of an infinitesimal generator, causing incorrect continuous-time Markov chain modeling.
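To see why an element-wise logarithm cannot stand in for a matrix logarithm, the two can be compared on a two-state chain, where the matrix logarithm has a closed form. The sketch below is standard-library only and is a special case for illustration: for P = [[1-a, a], [b, 1-b]] with 0 < a + b < 1, log P = (-ln(1 - a - b) / (a + b)) * [[-a, a], [b, -b]]. For general matrices the description above names scipy.linalg.logm; the function names here are hypothetical.

```python
import math


def two_state_generator(a: float, b: float) -> list[list[float]]:
    """Matrix logarithm of P = [[1-a, a], [b, 1-b]]; valid only for a, b >= 0 and 0 < a + b < 1."""
    s = a + b
    if a < 0 or b < 0 or not 0.0 < s < 1.0:
        raise ValueError("closed form requires a, b >= 0 and 0 < a + b < 1")
    scale = -math.log(1.0 - s) / s
    return [[-a * scale, a * scale], [b * scale, -b * scale]]


def elementwise_log(matrix: list[list[float]]) -> list[list[float]]:
    """Log of each entry separately, which is what an element-wise log computes."""
    return [[math.log(p) for p in row] for row in matrix]
```

The rows of `two_state_generator(a, b)` sum to zero by construction, while the rows of `elementwise_log` applied to the same transition matrix generally do not.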
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-072--lending
**Scan date**: 2026-04-22
**Stats**: {'total_files': 11, 'total_classes': 44, 'total_functions': 0, 'total_stages': 11}
## Modules (11)
- [loan_application_processing](components/loan_application_processing.md): 5 classes
- [loan_booking](components/loan_booking.md): 4 classes
- [loan_disbursement](components/loan_disbursement.md): 4 classes
- [interest_accrual](components/interest_accrual.md): 4 classes
- [loan_demand_generation](components/loan_demand_generation.md): 3 classes
- [loan_repayment_processing](components/loan_repayment_processing.md): 4 classes
- [loan_restructure](components/loan_restructure.md): 4 classes
- [loan_classification_&_npa_processing](components/loan_classification_-_npa_processing.md): 4 classes
- [loan_write-off](components/loan_write-off.md): 4 classes
- [balance_adjustments_&_refunds](components/balance_adjustments_-_refunds.md): 4 classes
- [loan_security_management](components/loan_security_management.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 131
fatal_constraints_count: 87
non_fatal_constraints_count: 261
use_cases_count: 18
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **18**
## `KUC-101`
**Source**: `lending/tests/test_utils.py`
Provides shared test utilities and setup functions needed by each lending module's tests, including master initialization, loan product creation, and customer setup.
## `KUC-102`
**Source**: `lending/loan_management/doctype/loan_refund/test_loan_refund.py`
Tests the loan closure process when a borrower requests a refund of excess amounts after repaying the loan.
## `KUC-103`
**Source**: `lending/loan_management/doctype/loan_application/test_loan_application.py`
Tests the creation and processing of loan applications including rate of interest configuration and applicant details.
## `KUC-104`
**Source**: `lending/loan_management/doctype/loan_security_deposit/test_loan_security_deposit.py`
Tests security deposit adjustments for secured loans where borrowers pledge collateral.
## `KUC-105`
**Source**: `lending/loan_management/doctype/loan/test_loan.py`
Comprehensive testing of the core Loan doctype including loan closure requests, security unpledging, disbursement amounts, interest accrual, repayment calculations, and loan classification.
## `KUC-106`
**Source**: `lending/loan_management/doctype/loan_adjustment/test_loan_adjustment.py`
Tests loan balance adjustments after loan disbursement, including interest accrual processing and demand generation.
## `KUC-107`
**Source**: `lending/loan_management/doctype/loan_repayment_repost/test_loan_repayment_repost.py`
Tests the reposting of loan repayments when backdated transactions occur, ensuring repayment allocations are correctly reset.
## `KUC-108`
**Source**: `lending/loan_management/doctype/loan_security_shortfall/test_loan_security_shortfall.py`
Tests scenarios where pledged security value falls below required margin, triggering shortfall alerts and collateral top-up requirements.
## `KUC-109`
**Source**: `lending/loan_management/doctype/loan_security_assignment/test_loan_security_assignment.py`
Tests the assignment and pledging of security assets when taking a secured loan.
## `KUC-110`
**Source**: `lending/loan_management/doctype/loan_restructure/test_loan_restructure.py`
Tests loan restructuring operations including term modifications, rate changes, and loan classification updates for distressed loans.
## `KUC-111`
**Source**: `lending/loan_management/doctype/sanctioned_loan_amount/test_sanctioned_loan_amount.py`
Tests sanctioned loan amount limits for secured loans based on pledged security values and LTV ratios.
## `KUC-112`
**Source**: `lending/loan_management/doctype/loan_repayment_schedule/test_loan_repayment_schedule.py`
Tests loan repayment schedule generation and moratorium period calculations, especially after loan restructuring.
## `KUC-113`
**Source**: `lending/loan_management/doctype/loan_interest_accrual/test_loan_interest_accrual.py`
Tests interest accrual processing including batch processing, freeze date handling, and daily/monthly accrual frequencies.
## `KUC-114`
**Source**: `lending/loan_management/doctype/loan_disbursement/test_loan_disbursement.py`
Tests loan disbursement processes including creation of sales invoices for disbursement charges.
## `KUC-115`
**Source**: `lending/loan_origination/doctype/loan_document_type/test_loan_document_type.py`
Integration tests for LoanDocumentType doctype used in loan origination workflows.
## `KUC-116`
**Source**: `lending/loan_origination/doctype/loan_purpose/test_loan_purpose.py`
Integration tests for LoanPurpose doctype defining loan purpose categories in origination.
## `KUC-117`
**Source**: `lending/loan_origination/doctype/loan_origination_settings/test_loan_origination_settings.py`
Integration tests for LoanOriginationSettings doctype containing configuration for the loan origination process.
## `KUC-118`
**Source**: `lending/loan_origination/doctype/loan_lead/test_loan_lead.py`
Integration tests for LoanLead doctype used in tracking potential loan applicants before application submission.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-CREDIT-RISK-001` — Strict input DataFrame schema validation
**From**: finance-bp-050--skorecard, finance-bp-112--openLGD · **Applicable to**: credit-risk
Both skorecard and openLGD require strict validation that input DataFrames contain exactly the expected columns (X/Y for openLGD, specified variable names for skorecard). This pattern is critical when data flows through multiple transformation stages where downstream modules access columns by name without defensive checking. Always validate column existence before pipeline execution.
## `CW-CREDIT-RISK-002` — Explicit random_state for ML model reproducibility
**From**: finance-bp-112--openLGD · **Applicable to**: credit-risk
In federated learning scenarios with SGDRegressor, omitting random_state causes non-deterministic results due to random data shuffling and weight initialization. This breaks federated learning convergence guarantees. Always set random_state explicitly when reproducibility across nodes or runs is required for regulatory auditability.
## `CW-CREDIT-RISK-003` — Mandatory data sorting before multi-stage estimation
**From**: finance-bp-050--skorecard, finance-bp-119--transitionMatrix · **Applicable to**: credit-risk
Both skorecard's two-phase bucketing and transitionMatrix's Aalen-Johansen estimator require data to be in a specific order before processing. Skorecard requires prebucketing before bucketing; transitionMatrix requires sorting by entity ID then time. Violating this ordering produces incorrect results or runtime errors. Always establish and enforce processing order in multi-stage pipelines.
## `CW-CREDIT-RISK-004` — Consistent API response key naming across all endpoints
**From**: finance-bp-112--openLGD · **Applicable to**: credit-risk
In federated systems with multiple API endpoints (/start, /update), all responses must use identical key names for parameters (intercept, coefficient). Inconsistency causes coordination loop failures in downstream consumers. Define a schema contract upfront and enforce key naming consistency across all response types.
## `CW-CREDIT-RISK-005` — Cardinality bounds checking before array operations
**From**: finance-bp-050--skorecard, finance-bp-119--transitionMatrix · **Applicable to**: credit-risk
Both skorecard's bucketers (max 100 unique values) and transitionMatrix's matrix operations (state cardinality matching matrix dimensions) require strict cardinality validation before creating numpy arrays or performing computations. Violations cause NotPreBucketedError or index out-of-bounds errors. Always validate cardinality constraints before array initialization.
## `CW-CREDIT-RISK-006` — Financial validation gates before transaction execution
**From**: finance-bp-072--lending · **Applicable to**: credit-risk
Lending systems require validation that disbursement amounts do not exceed limits, collateral values, or authorized periods before any transaction executes. These are financial loss prevention controls, not optional business logic. Missing these validations creates unauthorized exposure and regulatory compliance violations that cannot be remedied retroactively.
## `CW-CREDIT-RISK-007` — Mathematical constraint validation for probability outputs
**From**: finance-bp-050--skorecard, finance-bp-119--transitionMatrix · **Applicable to**: credit-risk
Credit risk models must validate mathematical constraints on outputs: skorecard's WoE requires valid bin assignments, transitionMatrix's transition matrices require row sums equal to 1.0, and generator matrices require row sums equal to 0.0. Invalid mathematical properties corrupt downstream risk calculations. Validate constraints before returning results.
## `CW-CREDIT-RISK-008` — Port-to-ID mapping consistency in distributed model serving
**From**: finance-bp-112--openLGD · **Applicable to**: credit-risk
When deploying distributed model servers, port numbers must map deterministically to server IDs (e.g., port 5001 maps to server ID 1). Computation of ID from port must be consistent across all components. Inconsistencies cause incorrect data directory selection and model parameter mismatches. Document and validate port-ID mappings during deployment.
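A sketch of keeping the mapping in one shared function so that every component derives the same ID. The base port of 5000 is inferred from the single example above (5001 maps to 1) and is an assumption, as are the function name and the range check.

```python
BASE_PORT = 5000  # assumption: chosen so that port 5001 maps to server ID 1


def server_id_from_port(port: int, n_servers: int) -> int:
    """Derive the server ID from its port and reject ports outside the deployed range."""
    server_id = port - BASE_PORT
    if not 1 <= server_id <= n_servers:
        raise ValueError(f"port {port} does not map to a server ID in 1..{n_servers}")
    return server_id
```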
FILE:references/components/balance_adjustments_-_refunds.md
# balance_adjustments_&_refunds (4 classes)
## `LoanBalanceAdjustment.validate`
`balance_adjustments_&_refunds/loanbalanceadjustment-validate.py:0`
## `LoanRefund.validate`
`balance_adjustments_&_refunds/loanrefund-validate.py:0`
## `LoanBalanceAdjustment.validate_if_restructure_in_process`
`balance_adjustments_&_refunds/loanbalanceadjustment-validate-if-restru.py:0`
## `refund_type`
`balance_adjustments_&_refunds/refund-type.py:0`
FILE:references/components/interest_accrual.md
# interest_accrual (4 classes)
## `LoanInterestAccrual.validate`
`interest_accrual/loaninterestaccrual-validate.py:0`
## `ProcessLoanInterestAccrual.on_submit`
`interest_accrual/processloaninterestaccrual-on-submit.py:0`
## `Loan.make_suspense_journal_entry`
`interest_accrual/loan-make-suspense-journal-entry.py:0`
## `accrual_frequency`
`interest_accrual/accrual-frequency.py:0`
FILE:references/components/loan_application_processing.md
# loan_application_processing (5 classes)
## `LoanApplication.validate`
`loan_application_processing/loanapplication-validate.py:0`
## `LoanApplication.before_save`
`loan_application_processing/loanapplication-before-save.py:0`
## `LoanApplication.check_sanctioned_amount_limit`
`loan_application_processing/loanapplication-check-sanctioned-amount-.py:0`
## `repayment_calculation`
`loan_application_processing/repayment-calculation.py:0`
## `customer_creation`
`loan_application_processing/customer-creation.py:0`
FILE:references/components/loan_booking.md
# loan_booking (4 classes)
## `Loan.validate`
`loan_booking/loan-validate.py:0`
## `LoanController.loan_accounting_enabled`
`loan_booking/loancontroller-loan-accounting-enabled.py:0`
## `Loan.get_migration_date_for_import`
`loan_booking/loan-get-migration-date-for-import.py:0`
## `accounting_mode`
`loan_booking/accounting-mode.py:0`
FILE:references/components/loan_classification_-_npa_processing.md
# loan_classification_&_npa_processing (4 classes)
## `ProcessLoanClassification.on_submit`
`loan_classification_&_npa_processing/processloanclassification-on-submit.py:0`
## `Loan.get_oldest_unpaid_demand_date`
`loan_classification_&_npa_processing/loan-get-oldest-unpaid-demand-date.py:0`
## `Loan.unmark_npa`
`loan_classification_&_npa_processing/loan-unmark-npa.py:0`
## `dpd_threshold`
`loan_classification_&_npa_processing/dpd-threshold.py:0`
FILE:references/components/loan_demand_generation.md
# loan_demand_generation (3 classes)
## `LoanDemand.validate`
`loan_demand_generation/loandemand-validate.py:0`
## `ProcessLoanDemand.on_submit`
`loan_demand_generation/processloandemand-on-submit.py:0`
## `batch_size`
`loan_demand_generation/batch-size.py:0`
FILE:references/components/loan_disbursement.md
# loan_disbursement (4 classes)
## `LoanDisbursement.validate`
`loan_disbursement/loandisbursement-validate.py:0`
## `LoanDisbursement.on_update`
`loan_disbursement/loandisbursement-on-update.py:0`
## `LoanDisbursement.set_cyclic_date`
`loan_disbursement/loandisbursement-set-cyclic-date.py:0`
## `schedule_type`
`loan_disbursement/schedule-type.py:0`
FILE:references/components/loan_repayment_processing.md
# loan_repayment_processing (4 classes)
## `LoanRepayment.validate`
`loan_repayment_processing/loanrepayment-validate.py:0`
## `LoanRepayment.create_repost`
`loan_repayment_processing/loanrepayment-create-repost.py:0`
## `LoanDemand.allocate`
`loan_repayment_processing/loandemand-allocate.py:0`
## `allocation_order`
`loan_repayment_processing/allocation-order.py:0`
FILE:references/components/loan_restructure.md
# loan_restructure (4 classes)
## `LoanRestructure.validate`
`loan_restructure/loanrestructure-validate.py:0`
## `LoanRestructure.allocate_security_deposit`
`loan_restructure/loanrestructure-allocate-security-deposi.py:0`
## `LoanRestructure.treatment_of_normal_interest`
`loan_restructure/loanrestructure-treatment-of-normal-inte.py:0`
## `restructure_type`
`loan_restructure/restructure-type.py:0`
FILE:references/components/loan_security_management.md
# loan_security_management (4 classes)
## `LoanSecurityAssignment.validate`
`loan_security_management/loansecurityassignment-validate.py:0`
## `LoanSecurityAssignment.post_haircut_amount`
`loan_security_management/loansecurityassignment-post-haircut-amou.py:0`
## `LoanSecurityAssignment.update_loan_securities_values`
`loan_security_management/loansecurityassignment-update-loan-secur.py:0`
## `haircut_calculation`
`loan_security_management/haircut-calculation.py:0`
FILE:references/components/loan_write-off.md
# loan_write-off (4 classes)
## `LoanWriteOff.validate`
`loan_write-off/loanwriteoff-validate.py:0`
## `LoanWriteOff.make_loan_waivers`
`loan_write-off/loanwriteoff-make-loan-waivers.py:0`
## `LoanWriteOff.write_off_suspense_entries`
`loan_write-off/loanwriteoff-write-off-suspense-entries.py:0`
## `closure_on_write_off`
`loan_write-off/closure-on-write-off.py:0`
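OpenSanctions watchlist compliance screening: crawling, deduplication, matching, and versioned archiving of international sanctions lists, PEP (politically exposed persons), and high-risk person data. Suited to KYC and AML due diligence.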
---
name: opensanctions-watchlist
description: |-
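OpenSanctions watchlist compliance screening: crawling, deduplication, matching, and versioned archiving of international sanctions lists, PEP (politically exposed persons), and high-risk person data.
Suited to KYC and AML due diligence.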
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-071"
compiled_at: "2026-04-22T13:00:25.342470+00:00"
capability_markets: "global"
capability_activities: "regtech-compliance"
sop_version: "crystal-compilation-v6.1"
---
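# Sanctions List Screening (opensanctions-watchlist)
> Real-time screening of international sanctions lists, PEPs (politically exposed persons), and high-risk individuals; essential for compliance KYC/AML scenarios.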
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (60 total)
### Dataset Crawling (ETL) (`UC-101`)
Automates the extraction, transformation, and loading of data from external sources into the OpenSanctions data store with optional validation and dat
**Triggers**: crawl, extract, load
### Wikidata Updates Review (`UC-103`)
Interactively reviews and applies Wikidata updates to OpenSanctions datasets, allowing manual curation of proposed entity matches
**Triggers**: wikidata, update, review
### Database Statement Loading (`UC-104`)
Loads dataset statements from the archive into a SQL database for querying and analysis, with configurable batch sizes
**Triggers**: load, database, sql
For all **60** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-REGTECH-001`**: Missing attribute initialization on data structures
- **`AP-REGTECH-002`**: Self-loops in transaction graphs violate domain rules
- **`AP-REGTECH-003`**: Unvalidated floating-point inputs cause runtime crashes
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-071. Evidence verify ratio = 26.8% and audit fail total = 35. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative record (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-071` blueprint at 2026-04-22T13:00:25.342470+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Database Statement Loading', 'Wikidata Updates Review', 'Dataset Crawling (ETL)', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-060--AMLSim (1)
### `AP-REGTECH-011` — Mismatched configuration parameters across coupled components <sub>(medium)</sub>
When TransactionGenerator and Nominator use different degree_threshold values, Nominator identifies hub accounts using different criteria than TransactionGenerator. This causes incorrect fan-in/fan-out candidate selection. Consequence: AML typology patterns placed on wrong accounts, invalidating simulation results.
## finance-bp-060--AMLSim, finance-bp-067--firesale_stresstest (1)
### `AP-REGTECH-002` — Self-loops in transaction graphs violate domain rules <sub>(high)</sub>
When generating directed transaction graphs or AML typologies, allowing source == destination edges creates self-loops. In AML simulation, self-loops represent accounts sending money to themselves, which is not a valid money laundering pattern. In fire-sale models, self-loops cause undefined behavior. Consequence: corrupted graph topology and invalid typology validation.
## finance-bp-060--AMLSim, finance-bp-071--opensanctions (1)
### `AP-REGTECH-001` — Missing attribute initialization on data structures <sub>(high)</sub>
When loading account lists or creating entity dictionaries, failing to initialize required list/dict attributes (e.g., normal_models, statement IDs) causes KeyError or ValueError at runtime. The code path that reads these structures assumes they exist, but the initialization path omits them. Consequence: pipeline crashes or data loss for affected entities.
## finance-bp-062--ifrs9 (3)
### `AP-REGTECH-005` — Incorrect amortization windows violate IFRS 9 compliance <sub>(high)</sub>
Stage 1 ECL requires exactly 12-month amortization (11 zero-indexed iterations) while Stage 2/3 requires full remaining tenor (tenor-1 iterations). Using identical windows for all stages causes ECL over/understatement. Consequence: regulatory non-compliance and materially incorrect loan loss provisions.
### `AP-REGTECH-010` — Incorrect cumulative PD ordering corrupts lifetime ECL term structure <sub>(high)</sub>
Using cumprod(1-conPD) without shift(1) and fillna(1) produces corrupted first-period survival probability. This cascades into all subsequent marginal and cumulative PD calculations, violating IFRS 9 lifetime ECL requirements. Consequence: systematically incorrect provisions across all remaining tenor periods.
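The shift matters because the survival probability that multiplies a period's conditional PD is the probability of surviving up to the start of that period, which is 1 for the first period. A dependency-free sketch of the same computation follows; the function names are illustrative, and the list operations mirror `cumprod`, `shift(1)`, and `fillna(1)`.

```python
from itertools import accumulate
from operator import mul


def survival_at_period_start(conditional_pd: list[float]) -> list[float]:
    """Probability of having survived all earlier periods, one value per period."""
    if not conditional_pd:
        return []
    survive_each = [1.0 - p for p in conditional_pd]
    survival_at_end = list(accumulate(survive_each, mul))  # cumprod(1 - conPD)
    # shift(1) moves each value one period later; fillna(1) sets the first period to 1.
    return [1.0] + survival_at_end[:-1]


def marginal_pd(conditional_pd: list[float]) -> list[float]:
    """Unconditional probability of defaulting in each period."""
    starts = survival_at_period_start(conditional_pd)
    return [s * p for s, p in zip(starts, conditional_pd)]
```

With a conditional PD of 0.5 in each of three periods, the marginal PDs are 0.5, 0.25, and 0.125; they sum to 0.875, which equals one minus the end-of-horizon survival probability of 0.125.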
### `AP-REGTECH-015` — Missing EAD component in ECL formula produces incomplete provisions <sub>(high)</sub>
IFRS 9 requires ECL = PD x LGD x EAD. When the EAD module is missing or not integrated, the ECL calculation is incomplete and unusable for provisioning. Consequence: regulatory rejection of ECL calculations, blocking of provisioning and reporting processes.
## finance-bp-062--ifrs9, finance-bp-067--firesale_stresstest (2)
### `AP-REGTECH-003` — Unvalidated floating-point inputs cause runtime crashes <sub>(high)</sub>
When parsing CSV files or computing statistical functions on raw data, failing to validate inputs against acceptable ranges (e.g., DDP near 0 or 1 for norm.ppf, unvalidated floats from CSV) causes ValueError or infinite/NaN values. Consequence: entire model crashes before simulation or corrupted downstream calculations.
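A sketch of validating and clipping a probability before applying the inverse normal CDF. It uses the standard library's `statistics.NormalDist.inv_cdf` as a stand-in for `norm.ppf`; the clipping epsilon and the function name are assumptions, and the right epsilon depends on the model.

```python
import math
from statistics import NormalDist

EPS = 1e-6  # assumption: smallest distance from 0 or 1 that the model tolerates


def safe_probit(p: float, eps: float = EPS) -> float:
    """Inverse standard normal CDF with inputs validated and boundary values clipped."""
    if not math.isfinite(p) or not 0.0 <= p <= 1.0:
        raise ValueError(f"probability must be a finite number in [0, 1], got {p!r}")
    # Clip only the boundary cases; values at exactly 0 or 1 have no finite quantile.
    clipped = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(clipped)
```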
### `AP-REGTECH-004` — Division by zero in financial calculations produces inf/NaN <sub>(high)</sub>
When calculating ratios like DDP (downgrade observations / total observations) or price impact denominators (total_quantities), zero-denominator cases are not guarded. The resulting inf/NaN propagates through all downstream calculations, corrupting CCI, ECL, or market clearing. Consequence: systematic data corruption across the entire calculation pipeline.
## finance-bp-067--firesale_stresstest (4)
### `AP-REGTECH-006` — Wrong leverage formula in threshold-based decisions <sub>(high)</sub>
Computing leverage as equity-to-liabilities (E/L) instead of equity-to-assets (E/A) produces different values. This causes deleveraging triggers and insolvency detection to fire at wrong thresholds. Consequence: zombie banks continue operating with negative equity, or healthy banks unnecessarily deleverage.
### `AP-REGTECH-007` — Confusing deleveraging buffer threshold with insolvency threshold <sub>(high)</sub>
Banks below 3% leverage are insolvent and must default, but deleveraging should trigger at 4% buffer. Using the same threshold eliminates the buffer zone, causing immediate default with no intermediate corrective action. Consequence: excessive bank failures amplify systemic contagion.
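The two leverage anti-patterns above can be illustrated together: leverage computed as equity over assets, and two distinct thresholds so that a buffer zone exists between healthy and insolvent. The 3% and 4% values come from the text; the function names, the state labels, and the treatment of values exactly at a threshold are assumptions.

```python
INSOLVENCY_THRESHOLD = 0.03  # below this leverage the bank defaults
BUFFER_THRESHOLD = 0.04      # below this leverage the bank starts deleveraging


def leverage(equity: float, assets: float) -> float:
    """Leverage as equity-to-assets (E/A), not equity-to-liabilities."""
    if assets <= 0:
        raise ValueError("assets must be positive to compute leverage")
    return equity / assets


def classify_bank(equity: float, assets: float) -> str:
    """Map leverage to an action, keeping the buffer zone between the two thresholds."""
    lev = leverage(equity, assets)
    if lev < INSOLVENCY_THRESHOLD:
        return "default"
    if lev < BUFFER_THRESHOLD:
        return "deleverage"
    return "healthy"
```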
### `AP-REGTECH-013` — Order-dependent execution creates first-mover advantage bias <sub>(medium)</sub>
Without separating step() and act() phases, first-acting banks sell assets before others decide, creating systematic first-mover advantage. This distorts the competitive equilibrium and fire-sale dynamics. Consequence: unreliable systemic risk estimates that understate contagion for late-acting banks.
### `AP-REGTECH-014` — Immediate asset sales cause double-selling and undefined state <sub>(medium)</sub>
Executing asset sales immediately rather than queuing them to a buffer allows multiple banks holding the same asset to sell simultaneously without accounting for concurrent intentions. Consequence: undefined price impact and incorrect cash transfers in market clearing.
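A sketch of the two-phase pattern that addresses both the ordering bias and the immediate-sale problem: every bank decides in `step()`, then every bank submits to a shared queue in `act()`, and the market sees the aggregate once. The `Bank` class, the fixed sale fraction, and `run_round` are hypothetical simplifications, not the stress-test model's actual classes.

```python
class Bank:
    def __init__(self, name: str, holdings: float) -> None:
        self.name = name
        self.holdings = holdings
        self.pending_sale = 0.0

    def step(self, fraction: float) -> None:
        """Decide how much to sell; nothing is executed in this phase."""
        self.pending_sale = self.holdings * fraction

    def act(self, sale_queue: list[tuple[str, float]]) -> None:
        """Submit the decided sale to the shared queue instead of selling directly."""
        if self.pending_sale > 0:
            sale_queue.append((self.name, self.pending_sale))
            self.holdings -= self.pending_sale
            self.pending_sale = 0.0


def run_round(banks: list[Bank], fraction: float) -> float:
    """Run one round and return the total quantity queued for market clearing."""
    sale_queue: list[tuple[str, float]] = []
    for bank in banks:   # phase 1: all banks decide on the same information
        bank.step(fraction)
    for bank in banks:   # phase 2: all banks submit their orders
        bank.act(sale_queue)
    return sum(quantity for _, quantity in sale_queue)
```

Because price impact would be computed from the returned total, the order in which banks appear in the list does not change the outcome.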
## finance-bp-071--opensanctions (3)
### `AP-REGTECH-008` — Cache keys omit request body for state-changing methods <sub>(high)</sub>
Using only URL for cache fingerprints on POST/PATCH requests means different request bodies return identical cached content. This causes stale data, missing entities, and data corruption in compliance screening pipelines. Consequence: sanctions matches missed or false positives from stale entity data.
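A sketch of a cache fingerprint that covers the method and body as well as the URL. The function name and the choice of SHA-256 are assumptions for illustration; this is not the crawler's actual cache implementation.

```python
import hashlib
from typing import Optional


def cache_fingerprint(method: str, url: str, body: Optional[bytes] = None) -> str:
    """Hash method, URL, and request body so different POST/PATCH payloads never share a key."""
    digest = hashlib.sha256()
    digest.update(method.upper().encode("utf-8"))
    digest.update(b"\0")  # separator so adjacent fields cannot run together
    digest.update(url.encode("utf-8"))
    if body is not None:
        digest.update(b"\0")
        digest.update(body)
    return digest.hexdigest()
```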
### `AP-REGTECH-009` — ID collision in entity construction creates false sanctions matches <sub>(high)</sub>
When constructing entity IDs from source identifiers, insufficient identifying attributes cause different real-world entities to receive identical IDs. The database then merges them into one entity. Consequence: a sanctioned entity's ID matches an innocent entity, causing false positive compliance alerts.
### `AP-REGTECH-012` — Reverse property assignment corrupts entity construction <sub>(medium)</sub>
Stub (reverse) properties represent inverse relationships and raise InvalidData when directly assigned. Attempting to add values to stub properties instead of forward properties causes ValueError, aborting entity construction. Consequence: entities lost from output, incomplete compliance datasets.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-071--opensanctions
**Scan date**: 2026-04-22
**Stats**: {'total_files': 6, 'total_classes': 26, 'total_functions': 0, 'total_stages': 6}
## Modules (6)
- [data_collection](components/data_collection.md): 4 classes
- [entity_construction](components/entity_construction.md): 5 classes
- [statement_storage](components/statement_storage.md): 4 classes
- [data_exporting](components/data_exporting.md): 5 classes
- [delta_computation](components/delta_computation.md): 4 classes
- [archival_and_publishing](components/archival_and_publishing.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
  business_decisions_count: 127
  fatal_constraints_count: 56
  non_fatal_constraints_count: 208
  use_cases_count: 60
  semantic_locks_count: 12
  preconditions_count: 4
  evidence_quality_rules_count: 2
  traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
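Several of these locks can be checked mechanically before any order is generated. The following is a minimal sketch covering `SL-01`, `SL-03`, and `SL-05`; `SignalSketch`, `is_valid_entity_id`, and `order_for_execution` are hypothetical helpers written for this illustration (they are not ZVT classes), and the regular expression is an assumed reading of the `entity_type_exchange_code` format.

```python
import re
from dataclasses import dataclass
from typing import Optional

# SL-03: entity_type_exchange_code, e.g. "stock_sh_600000".
# The exact character classes are an assumption made for this sketch.
ENTITY_ID_PATTERN = re.compile(r"[a-z]+_[a-z]+_[A-Za-z0-9.]+")


def is_valid_entity_id(entity_id: str) -> bool:
    return ENTITY_ID_PATTERN.fullmatch(entity_id) is not None


@dataclass
class SignalSketch:
    """Hypothetical stand-in for a trading signal; not the ZVT class."""
    entity_id: str
    side: str  # "buy" or "sell"
    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        if not is_valid_entity_id(self.entity_id):
            raise ValueError(f"SL-03 violated: {self.entity_id!r}")
        sizing = (self.position_pct, self.order_money, self.order_amount)
        if sum(value is not None for value in sizing) != 1:
            raise ValueError("SL-05 violated: set exactly one sizing field")


def order_for_execution(signals: list[SignalSketch]) -> list[SignalSketch]:
    """SL-01: sells first, buys second; sorted() keeps order within a side."""
    return sorted(signals, key=lambda s: 0 if s.side == "sell" else 1)
```

Raising at construction time mirrors the `fatal` violation level: a malformed signal never reaches the execution step.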
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: run `python3 -m pip install zvt`, then run `python3 -m zvt.init_dirs` to initialize the data directories.
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: run the recorder first with `python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000` (replace with your target entity IDs).
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: run `python3 -m zvt.init_dirs`.
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: check directory permissions with `chmod u+w ~/.zvt`, or set the `ZVT_HOME` environment variable to a writable location.
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **60**
## `KUC-101`
**Source**: `zavod/zavod/cli/etl.py`
Automates the extraction, transformation, and loading of data from external sources into the OpenSanctions data store with optional validation and data clearing.
## `KUC-103`
**Source**: `zavod/zavod/cli/wd_up.py`
Interactively reviews and applies Wikidata updates to OpenSanctions datasets, allowing manual curation of proposed entity matches.
## `KUC-104`
**Source**: `zavod/zavod/cli/util.py`
Loads dataset statements from the archive into a SQL database for querying and analysis, with configurable batch sizes.
## `KUC-106`
**Source**: `zavod/zavod/cli/dedupe.py`
Generates deduplication candidates by cross-referencing entities across datasets using configurable blocking strategies and matching algorithms.
## `KUC-107`
**Source**: `zavod/zavod/cli/archive.py`
Manages dataset versions in the archive, tracking publication history and maintaining version lineage for reproducibility.
## `KUC-108`
**Source**: `datasets/_analysis/ann_pep_positions/test_analyzer.py`
Analyzes politically exposed persons (PEP) positions to determine influence levels and occupancy status across government roles.
## `KUC-109`
**Source**: `datasets/ba/companies/test_crawler.py`
Cleans company names by removing patterns like 'dissolved', 'liquidated' and splitting full/short names for consistent entity representation.
## `KUC-110`
**Source**: `contrib/test_index.py`
Validates dataset index files to ensure proper structure, type assignments, and collection/source/external relationships.
## `KUC-111`
**Source**: `zavod/zavod/tests/test_entity.py`
Core functionality for creating, populating, and managing entity objects with schema validation and property management.
## `KUC-112`
**Source**: `zavod/zavod/tests/test_store.py`
Manages persistent storage and retrieval of entities with support for views, external data, and entity state tracking.
## `KUC-113`
**Source**: `zavod/zavod/tests/test_tune.py`
Optimizes and tunes extraction models (e.g., name extraction) by comparing results and validating performance.
## `KUC-114`
**Source**: `zavod/zavod/tests/test_archive.py`
Publishes dataset artifacts to the archive backend, managing file resources and version tracking for releases.
## `KUC-115`
**Source**: `zavod/zavod/tests/test_dataset.py`
Defines and manages dataset metadata including HTTP settings, naming rules, and schema configurations.
## `KUC-116`
**Source**: `zavod/zavod/tests/test_logs.py`
Redacts sensitive information from logs to protect confidential data while maintaining operational visibility.
## `KUC-117`
**Source**: `zavod/zavod/tests/test_assertions.py`
Defines and validates data quality assertions such as minimum entity counts, property value requirements, and schema constraints.
## `KUC-118`
**Source**: `zavod/zavod/tests/test_context.py`
Manages the crawling context including entity creation, ID generation, URL handling, and issue logging during data extraction.
## `KUC-119`
**Source**: `zavod/zavod/tests/test_validate.py`
Executes data validators that check for dangling references, self-references, empty entities, and assertion compliance.
## `KUC-120`
**Source**: `zavod/zavod/tests/test_publish.py`
Orchestrates the complete dataset publishing workflow including crawling, storage, export, and archive publication.
## `KUC-121`
**Source**: `zavod/zavod/tests/test_dedupe.py`
Resolves duplicate entities by managing resolver edges, making merge decisions, and handling cluster operations.
## `KUC-122`
**Source**: `zavod/zavod/tests/test_cli.py`
Executes CLI commands for crawling, exporting, validating, and managing datasets through the command-line interface.
## `KUC-123`
**Source**: `zavod/zavod/tests/tools/test_export_catalog.py`
Exports dataset catalog information including collection hierarchies and cross-references to index files.
## `KUC-124`
**Source**: `zavod/zavod/tests/tools/test_load_db.py`
Bulk loads dataset statements into a SQL database with batching support for efficient large-scale data ingestion.
## `KUC-125`
**Source**: `zavod/zavod/tests/tools/test_dump_file.py`
Dumps dataset entities with resolved canonical IDs to files, handling merged entities and deduplication state.
## `KUC-126`
**Source**: `zavod/zavod/tests/exporters/test_statistics.py`
Exports dataset statistics including entity counts, schema distributions, country coverage, and target counts.
## `KUC-127`
**Source**: `zavod/zavod/tests/exporters/test_delta.py`
Exports incremental changes between dataset versions, tracking additions, modifications, and deletions.
## `KUC-128`
**Source**: `zavod/zavod/tests/exporters/test_securities.py`
Exports securities/financial instrument data including ISINs, tickers, and company identifiers in standardized formats.
## `KUC-129`
**Source**: `zavod/zavod/tests/exporters/test_metadata.py`
Exports dataset metadata and catalog information to index files for discovery and resource listing.
## `KUC-130`
**Source**: `zavod/zavod/tests/exporters/test_exporters.py`
Exports entities in multiple formats including FollowTheMoney JSON, CSV, nested JSON, and Senzing formats.
## `KUC-131`
**Source**: `zavod/zavod/tests/exporters/test_nested.py`
Exports target entities with nested relationships in a hierarchical JSON format preserving entity associations.
## `KUC-132`
**Source**: `zavod/zavod/tests/exporters/test_senzing.py`
Exports entities in Senzing G2 format with standardized entity records for entity resolution systems.
## `KUC-133`
**Source**: `zavod/zavod/tests/exporters/test_maritime.py`
Exports maritime vessel data including IMO numbers, vessel types, and ownership information.
## `KUC-134`
**Source**: `zavod/zavod/tests/integration/test_edges.py`
Detects and manages entity relationships and edges in the knowledge graph for ownership and association tracking.
## `KUC-135`
**Source**: `zavod/zavod/tests/runtime/test_resources.py`
Manages dataset resources including downloaded files, checksums, and resource metadata tracking.
## `KUC-136`
**Source**: `zavod/zavod/tests/runtime/test_loader.py`
Dynamically loads dataset entry point functions for custom crawling and processing logic.
## `KUC-137`
**Source**: `zavod/zavod/tests/runtime/test_issues.py`
Logs and tracks issues encountered during crawling including warnings, errors, and entity-specific problems.
## `KUC-138`
**Source**: `zavod/zavod/tests/runtime/test_timestamps.py`
Indexes and tracks entity timestamps for first_seen, last_seen, and statement-level temporal tracking.
## `KUC-139`
**Source**: `zavod/zavod/tests/stateful/test_positions.py`
Categorizes political positions by occupancy status (current, former, unknown) and determines influence levels.
## `KUC-140`
**Source**: `zavod/zavod/tests/stateful/test_review.py`
Manages review workflows for human-in-the-loop data validation with source tracking and acceptance workflows.
## `KUC-141`
**Source**: `zavod/zavod/tests/enrich/test_local_enricher.py`
Enriches entities with additional data from local enrichment datasets based on configurable matching rules.
## `KUC-142`
**Source**: `zavod/zavod/tests/enrich/test_enrichment.py`
Enriches entities with data from external enricher services through matching and expansion operations.
## `KUC-143`
**Source**: `zavod/zavod/tests/extract/test_names.py`
Extracts and evaluates name components from unstructured text with feedback-based metric scoring.
## `KUC-144`
**Source**: `zavod/zavod/tests/extract/test_zyte_api.py`
Fetches HTML, JSON, and text content from websites using the Zyte API with browser rendering support.
## `KUC-145`
**Source**: `zavod/zavod/tests/helpers/test_identification.py`
Creates identification documents (passports, licenses) linked to entities with number validation.
## `KUC-146`
**Source**: `zavod/zavod/tests/helpers/test_xml.py`
Removes XML namespaces from parsed documents to simplify element selection and processing.
## `KUC-147`
**Source**: `zavod/zavod/tests/helpers/test_pdf.py`
Extracts tabular data from PDF documents with configurable page settings and border detection.
## `KUC-148`
**Source**: `zavod/zavod/tests/helpers/test_sanctions.py`
Creates sanction list entries linked to entities with program identification and authority information.
## `KUC-149`
**Source**: `zavod/zavod/tests/helpers/test_dates.py`
Parses and extracts dates from various formats with month replacement and date application utilities.
## `KUC-150`
**Source**: `zavod/zavod/tests/helpers/test_numbers.py`
Applies consistent number formatting to entity properties including unit suffixes and decimal handling.
## `KUC-151`
**Source**: `zavod/zavod/tests/helpers/test_securities.py`
Creates financial security entities with ISIN validation and country code extraction.
## `KUC-152`
**Source**: `zavod/zavod/tests/helpers/test_cryptos.py`
Extracts cryptocurrency wallet addresses (BTC, ETH, TRON) from unstructured text for blockchain analysis.
## `KUC-153`
**Source**: `zavod/zavod/tests/helpers/test_change.py`
Detects changes in files and URLs by comparing content against expected hashes.
## `KUC-154`
**Source**: `zavod/zavod/tests/helpers/test_html.py`
Extracts tabular data from HTML documents using XPath selectors with header detection.
## `KUC-155`
**Source**: `zavod/zavod/tests/helpers/test_addresses.py`
Creates standardized address entities from component parts with country code resolution.
## `KUC-156`
**Source**: `zavod/zavod/tests/helpers/test_text.py`
Processes unstructured text with multi-split delimiters, bracket removal, and empty value filtering.
## `KUC-157`
**Source**: `zavod/zavod/tests/helpers/test_positions.py`
Creates political position entities with topics, dates, and organizational affiliations.
## `KUC-158`
**Source**: `zavod/zavod/tests/helpers/test_excel.py`
Parses Excel files (.xls and .xlsx) with cell type detection and date conversion.
## `KUC-159`
**Source**: `zavod/zavod/tests/extract/names/test_clean.py`
Handles name components with language tagging and multi-value management for person names.
## `KUC-160`
**Source**: `zavod/zavod/tests/helpers/names/test_names.py`
Applies parsed names to entities with support for first/last names, aliases, and review workflow integration.
## `KUC-161`
**Source**: `zavod/zavod/tests/helpers/names/test_regularity.py`
Checks name regularity against configured rules including character filters, null word detection, and length validation.
## `KUC-162`
**Source**: `zavod/zavod/tests/helpers/names/test_derive_originals.py`
Maps extracted name values back to original source values when names contain multiple variants.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-REGTECH-001` — Input bounds validation before statistical computation
**From**: finance-bp-062--ifrs9, finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Statistical functions like norm.ppf() and cumprod() have strict input requirements that, if violated, produce infinite or NaN values corrupting entire pipelines. Always validate inputs against domain constraints (DDP in (0,1), counts > 0) before passing to statistical functions. Apply to any statistical or inverse-CDF computation.
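A minimal sketch of such a guard is shown below. It uses the standard library's `statistics.NormalDist.inv_cdf` as a stand-in for `norm.ppf()`, so that the example has no third-party dependency; `safe_inverse_normal` and `validate_counts` are hypothetical helper names.

```python
import math
from statistics import NormalDist


def safe_inverse_normal(p: float) -> float:
    """Inverse standard-normal CDF with an explicit open-interval check."""
    if math.isnan(p) or not 0.0 < p < 1.0:
        raise ValueError(f"probability must lie strictly in (0, 1), got {p!r}")
    return NormalDist().inv_cdf(p)


def validate_counts(counts: list[float]) -> None:
    """Reject non-positive or non-finite counts before they are aggregated."""
    for index, value in enumerate(counts):
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"count at index {index} must be > 0, got {value!r}")


z_median = safe_inverse_normal(0.5)
```

Failing loudly at the boundary keeps a single bad input from spreading infinities or NaNs through later aggregation steps.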
## `CW-REGTECH-002` — Graph/topology invariant verification before construction
**From**: finance-bp-060--AMLSim, finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Before constructing graph structures (transaction networks, transition matrices), verify invariants: sum(in-degrees) = sum(out-degrees), matrix row sums = 1.0, degree sequence length divisibility. This catches data corruption early before expensive graph construction operations. Apply to any bipartite or directed graph generation.
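Two of the invariants named above can be checked in a few lines before any graph object is built. This is a sketch only; the function names are hypothetical, the tolerance is an assumed default, and the degree-sequence divisibility check is left out because the text does not state the divisor.

```python
import math


def check_degree_sequences(in_degrees: list[int], out_degrees: list[int]) -> None:
    """Every directed edge has one source and one target, so totals must match."""
    if len(in_degrees) != len(out_degrees):
        raise ValueError("degree sequences must have one entry per node")
    if sum(in_degrees) != sum(out_degrees):
        raise ValueError(
            f"sum(in)={sum(in_degrees)} differs from sum(out)={sum(out_degrees)}"
        )


def check_row_stochastic(matrix: list[list[float]], tol: float = 1e-9) -> None:
    """Each row of a transition matrix must be non-negative and sum to 1.0."""
    for index, row in enumerate(matrix):
        if any(p < 0 for p in row):
            raise ValueError(f"row {index} contains a negative probability")
        if not math.isclose(math.fsum(row), 1.0, abs_tol=tol):
            raise ValueError(f"row {index} sums to {math.fsum(row)}, expected 1.0")


check_degree_sequences([1, 2, 0], [0, 1, 2])
check_row_stochastic([[0.9, 0.1], [0.25, 0.75]])
```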
## `CW-REGTECH-003` — Regulatory amortization window discipline
**From**: finance-bp-062--ifrs9 · **Applicable to**: regtech-compliance
IFRS 9 mandates different ECL calculation windows: exactly 12-month for Stage 1 (11 zero-indexed iterations), full remaining tenor for Stage 2/3. Mixing these up violates compliance requirements. Always encode stage-specific window logic explicitly rather than reusing a single loop variable across stages.
## `CW-REGTECH-004` — Fingerprint composition must include all request dimensions
**From**: finance-bp-071--opensanctions · **Applicable to**: regtech-compliance
Cache keys must include all request parameters that affect response content: URL, HTTP method, authentication headers, and request body for state-changing methods. POST requests with different bodies returning identical cache is a silent data corruption bug. Always compose fingerprints from the union of all content-affecting parameters.
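One way to compose such a fingerprint is sketched below. `request_fingerprint` and its parameters are hypothetical and do not reproduce the source project's cache code; the length prefix is a design choice made here so that adjacent parts cannot run together.

```python
import hashlib
from typing import Optional


def request_fingerprint(
    method: str,
    url: str,
    body: bytes = b"",
    auth: Optional[str] = None,
) -> str:
    """Hash every request dimension that can change the response content."""
    digest = hashlib.sha256()
    parts = (method.upper().encode(), url.encode(), (auth or "").encode(), body)
    for part in parts:
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


first = request_fingerprint("POST", "https://example.com/search", b'{"q": "alpha"}')
second = request_fingerprint("POST", "https://example.com/search", b'{"q": "beta"}')
```

Here `first` and `second` differ even though the URL is identical, which is the property that a URL-only cache key lacks.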
## `CW-REGTECH-005` — Floating-point zero-equivalence with explicit epsilon tolerance
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
IEEE 754 floating-point precision causes exact zero comparisons to fail in financial calculations. Always use eps=1e-9 tolerance for zero-equivalence checks in market clearing, leverage ratios, and price impact calculations. This prevents division-by-zero crashes and incorrect cash transfers.
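A short sketch of the tolerance check follows. `EPS` takes the `1e-9` value from the text; `is_zero` and `safe_ratio` are hypothetical helpers, and returning a caller-supplied default for a zero denominator is an assumption of this example.

```python
EPS = 1e-9


def is_zero(value: float, eps: float = EPS) -> bool:
    """Treat anything within eps of zero as zero."""
    return abs(value) < eps


def safe_ratio(numerator: float, denominator: float, default: float) -> float:
    """Divide only when the denominator is meaningfully non-zero."""
    if is_zero(denominator):
        return default
    return numerator / denominator


residual = 0.1 + 0.2 - 0.3          # tiny non-zero rounding residue
exact_check = residual == 0.0       # False: exact comparison misses it
tolerant_check = is_zero(residual)  # True: epsilon comparison catches it
```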
## `CW-REGTECH-006` — Stage classification threshold ordering enforcement
**From**: finance-bp-062--ifrs9 · **Applicable to**: regtech-compliance
IFRS 9 SICR thresholds must be ordered: BUCKETS 2-3 trigger Stage 2, BUCKETS >=4 trigger Stage 3. Applying thresholds in wrong order or omitting absolute DPD triggers causes material ECL misstatement. Validate threshold ordering and document bucket-to-stage mapping explicitly.
## `CW-REGTECH-007` — Initialization-before-use dependency ordering
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Operational dependencies must initialize before dependent objects use them: AssetMarket before bank registration, CSV file existence before parsing, entity ID before statement addition. Violations cause AttributeError or FileNotFoundError that abort entire initialization. Always encode dependency ordering explicitly in initialization sequences.
## `CW-REGTECH-008` — Sufficient entity ID collision prevention
**From**: finance-bp-071--opensanctions · **Applicable to**: regtech-compliance
Entity IDs must include enough identifying attributes (dataset prefix, source, identifier type, document number) to guarantee uniqueness. Collisions create false equivalence between unrelated entities, directly causing false positive sanctions matches. Include the maximum available discriminating attributes in ID construction.
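The idea can be sketched as a hash over all discriminating attributes. `make_entity_id` below is a hypothetical helper written for illustration, not the source project's ID function; the separator and the normalization rules are assumptions.

```python
import hashlib
from typing import Optional


def make_entity_id(dataset_prefix: str, *parts: Optional[str]) -> str:
    """Derive a stable ID from a dataset prefix plus identifying attributes."""
    # Keep each attribute in its position so a missing value cannot shift
    # later values into an earlier slot and collide with another entity.
    cleaned = [(part or "").strip().lower() for part in parts]
    if not any(cleaned):
        raise ValueError("at least one identifying attribute is required")
    digest = hashlib.sha1("\x1f".join(cleaned).encode("utf-8")).hexdigest()
    return f"{dataset_prefix}-{digest}"


german = make_entity_id("xx", "passport", "A123", "de")
french = make_entity_id("xx", "passport", "A123", "fr")
```

With the issuing country included, the two documents that share the number `A123` get different IDs; dropping that attribute would merge them.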
## `CW-REGTECH-009` — Hub selection with candidate removal before addition
**From**: finance-bp-060--AMLSim · **Applicable to**: regtech-compliance
When selecting hub accounts for typology placement, always call remove_typology_candidate BEFORE add_node for each selected account. Reversing this order causes hub self-selection (accounts choosing themselves) and duplicate assignment across overlapping patterns. Apply to any allocation algorithm with candidate pooling.
## `CW-REGTECH-010` — Insolvency detection before operational decisions
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Banks below the insolvency threshold (3% leverage) must trigger default immediately, not enter the deleveraging decision logic. Checking operational thresholds before insolvency creates zombie banks with negative equity. Always gate operational decisions on prior insolvency state.
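The ordering can be made explicit with a small decision function. The 3% insolvency threshold comes from the text; `DELEVERAGE_BUFFER`, the action names, and the zero-assets rule are hypothetical values chosen only to make this sketch runnable.

```python
INSOLVENCY_LEVERAGE = 0.03  # from the text: 3% leverage
DELEVERAGE_BUFFER = 0.04    # hypothetical operational threshold


def decide(equity: float, assets: float) -> str:
    """Return the action for one bank, checking insolvency first."""
    # Assumption: a bank with no assets is treated as having zero leverage.
    leverage = equity / assets if assets > 0 else 0.0
    if leverage < INSOLVENCY_LEVERAGE:
        return "default"      # gate: never reaches the deleveraging logic
    if leverage < DELEVERAGE_BUFFER:
        return "deleverage"
    return "hold"


actions = [decide(-1.0, 100.0), decide(3.5, 100.0), decide(10.0, 100.0)]
```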
FILE:references/components/archival_and_publishing.md
# archival_and_publishing (4 classes)
## `publish_dataset`
`archival_and_publishing/publish-dataset.py:0`
## `ArchiveBackend.upload`
`archival_and_publishing/archivebackend-upload.py:0`
## `ArchiveBackend.backfill`
`archival_and_publishing/archivebackend-backfill.py:0`
## `Archive backend`
`archival_and_publishing/archive-backend.py:0`
FILE:references/components/data_collection.md
# data_collection (4 classes)
## `Context.fetch_response`
`data_collection/context-fetch-response.py:0`
## `Context.http`
`data_collection/context-http.py:0`
## `HTTP.configure`
`data_collection/http-configure.py:0`
## `HTTP backend`
`data_collection/http-backend.py:0`
FILE:references/components/data_exporting.md
# data_exporting (5 classes)
## `export_data`
`data_exporting/export-data.py:0`
## `Exporter.feed`
`data_exporting/exporter-feed.py:0`
## `consolidate_entity`
`data_exporting/consolidate-entity.py:0`
## `Export format`
`data_exporting/export-format.py:0`
## `Consolidation strategy`
`data_exporting/consolidation-strategy.py:0`
FILE:references/components/delta_computation.md
# delta_computation (4 classes)
## `HashDelta.compute`
`delta_computation/hashdelta-compute.py:0`
## `HashDelta.backfill`
`delta_computation/hashdelta-backfill.py:0`
## `DeltaExporter.write`
`delta_computation/deltaexporter-write.py:0`
## `Hash algorithm`
`delta_computation/hash-algorithm.py:0`
FILE:references/components/entity_construction.md
# entity_construction (5 classes)
## `Context.make`
`entity_construction/context-make.py:0`
## `Entity.add_cast`
`entity_construction/entity-add-cast.py:0`
## `value_clean`
`entity_construction/value-clean.py:0`
## `Entity ID generation`
`entity_construction/entity-id-generation.py:0`
## `Value cleaning pipeline`
`entity_construction/value-cleaning-pipeline.py:0`
FILE:references/components/statement_storage.md
# statement_storage (4 classes)
## `Context.emit`
`statement_storage/context-emit.py:0`
## `Store.assemble`
`statement_storage/store-assemble.py:0`
## `TimeStampIndex.record`
`statement_storage/timestampindex-record.py:0`
## `Statement serialization format`
`statement_storage/statement-serialization-format.py:0`
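Fetch real-time quotes and historical data across global equities, cryptocurrencies, forex, commodities, and other markets, with technical indicator calculation, macroeconomic data tracking, and asset ratio analysis.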
---
name: openbb-terminal
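description: |-
  Fetch real-time quotes and historical data across global equities, cryptocurrencies, forex, commodities, and other markets, with technical indicator calculation, macroeconomic data tracking, and asset ratio analysis.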
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
  version: "v6.1"
  blueprint_id: "finance-bp-097"
  compiled_at: "2026-04-22T13:00:43.142714+00:00"
  capability_markets: "multi-market"
  capability_activities: "data-sourcing"
  sop_version: "crystal-compilation-v6.1"
---
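# OpenBB Financial Terminal (openbb-terminal)
> Fetch real-time quotes and historical data across global equities, cryptocurrencies, forex, commodities, and other markets, with technical indicator calculation, macroeconomic data tracking, and asset ratio analysis.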
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (19 total)
### Momentum Trading Strategy Backtesting (`UC-101`)
Tests a dual moving average crossover strategy to identify optimal buy/sell signals for multiple stocks based on short-term vs long-term momentum
**Triggers**: momentum trading, moving average crossover, backtesting
### Ethereum Trend Analysis (`UC-102`)
Analyzes Ethereum price trends using technical indicators (moving averages, volatility) to identify patterns and trading opportunities in crypto markets
**Triggers**: Ethereum analysis, crypto trend, moving averages
### Copper to Gold Ratio Analysis (`UC-103`)
Tracks the copper/gold ratio over time and correlates it with US Treasury yields to identify economic cycle indicators
**Triggers**: commodity ratio, copper gold ratio, treasury yields
For all **19** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (14 total)
- **`AP-DATA-SOURCING-001`**: Missing or invalid User-Agent headers for SEC API requests
- **`AP-DATA-SOURCING-002`**: Ignoring external API rate limits causing IP blocking
- **`AP-DATA-SOURCING-003`**: No HTTP timeout configuration causing indefinite hangs
All 14 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-097. Evidence verify ratio = 32.0% and audit fail total = 65. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative record (source-of-truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 14 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-097` blueprint at 2026-04-22T13:00:43.142714+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **use_cases**: ['Copper to Gold Ratio Analysis', 'Ethereum Trend Analysis', 'Momentum Trading Strategy Backtesting', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **14**
## finance-bp-070--edgartools (2)
### `AP-DATA-SOURCING-004` — Invalidating XBRL period types for balance sheet analysis <sub>(high)</sub>
Balance sheets represent point-in-time snapshots (instant periods), not ranges (duration periods). Using duration periods for balance sheet statements causes stockholder equity and other line items to show nonsensical date ranges, corrupting financial calculations that depend on accurate period associations.
### `AP-DATA-SOURCING-012` — Large document parsing without streaming causing OOM errors <sub>(high)</sub>
SEC filings can exceed 160MB, and parsing large documents in memory without streaming causes OOM errors that crash the entire service for all users. Documents exceeding 10MB require switching to streaming parsers to prevent extreme memory usage.
## finance-bp-070--edgartools, finance-bp-079--akshare, finance-bp-084--eastmoney, finance-bp-114--edgar-crawler (1)
### `AP-DATA-SOURCING-002` — Ignoring external API rate limits causing IP blocking <sub>(high)</sub>
Multiple financial data sources (SEC EDGAR, Sina, Eastmoney, TuShare) enforce strict rate limits (10 req/sec, 120 calls/minute). Exceeding these triggers temporary IP blocks lasting 10-60 minutes, causing complete data unavailability. Immediate retry attempts during blocks extend the block duration significantly.
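A client-side throttle is one way to stay under such limits. The class below is a minimal sketch of a minimum-interval limiter; `MinIntervalLimiter` is a hypothetical name, and the injectable `clock` and `sleep` arguments exist only so the behavior can be checked without real waiting.

```python
import time
from typing import Callable


class MinIntervalLimiter:
    """Space calls so that at most max_per_second start in any second."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")

    def wait(self) -> float:
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return delay


limiter = MinIntervalLimiter(max_per_second=10)
limiter.wait()  # call before each outgoing request
```

A throttle like this only prevents exceeding the limit; once a block has been triggered, the text above implies that retries should pause rather than continue immediately.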
## finance-bp-070--edgartools, finance-bp-114--edgar-crawler (1)
### `AP-DATA-SOURCING-001` — Missing or invalid User-Agent headers for SEC API requests <sub>(high)</sub>
SEC EDGAR requires valid User-Agent identity with contact information in headers. Without this, requests are rejected with 403 Forbidden errors, completely blocking all filing access. Both edgartools and edgar-crawler enforce this constraint as fundamental to any data retrieval operation.
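A small helper can make the header impossible to forget. This sketch uses only the standard library; `build_identified_request` is a hypothetical name, the URL and contact string are placeholders, and the `@` check is an assumed minimal test for "contains contact information".

```python
import urllib.request


def build_identified_request(url: str, contact: str) -> urllib.request.Request:
    """Build a request whose User-Agent carries a name and contact email."""
    if "@" not in contact:
        raise ValueError("contact should include an email address")
    return urllib.request.Request(url, headers={"User-Agent": contact})


request = build_identified_request(
    "https://example.com/filing.json",       # placeholder URL
    "Example Research admin@example.com",    # placeholder identity
)
```

Routing every outgoing call through one builder keeps the identity string in a single place.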
## finance-bp-079--akshare (4)
### `AP-DATA-SOURCING-003` — No HTTP timeout configuration causing indefinite hangs <sub>(high)</sub>
HTTP requests to external financial data sources (Yahoo, Sina, Eastmoney) without timeout values can hang indefinitely on blocked connections. This freezes the entire application and prevents data collection from all other sources, creating cascading failures across the system.
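The fix is to pass an explicit timeout on every call and to treat a timeout as an ordinary, recoverable outcome. The sketch below uses the standard library's `urllib.request.urlopen`; `fetch_bytes` and the 10-second default are assumptions made for illustration.

```python
import socket
import urllib.error
import urllib.request
from typing import Optional


def fetch_bytes(url: str, timeout_seconds: float = 10.0) -> Optional[bytes]:
    """Fetch a URL with a hard timeout; return None if the source is unreachable."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout):
        # The caller can skip this source and continue with the others.
        return None


payload = fetch_bytes("data:text/plain,ok", timeout_seconds=1.0)
```

The `data:` URL is used here only so the example runs without network access; real calls would pass an `https://` URL.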
### `AP-DATA-SOURCING-005` — Malformed or empty JSON responses causing silent failures <sub>(medium)</sub>
Financial API responses containing malformed JSON raise unhandled ValueError exceptions, crashing downstream processing. Similarly, empty JSON responses (empty dict, list, null) masquerading as valid data cause silent failures producing empty DataFrames or misleading results in financial analysis.
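Both failure modes can be turned into explicit errors at the parsing boundary. In this sketch, `parse_payload` and `EmptyPayloadError` are hypothetical names; whether an empty response should be an error or a valid "no data" result depends on the endpoint, so treating it as an error is an assumption.

```python
import json
from typing import Any


class EmptyPayloadError(ValueError):
    """Raised when a response decodes to null, {} or []."""


def parse_payload(text: str) -> Any:
    """Decode a JSON response, rejecting malformed and empty payloads."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON at position {exc.pos}") from exc
    if payload is None or payload == {} or payload == []:
        raise EmptyPayloadError("response decoded to an empty value")
    return payload


quote = parse_payload('{"symbol": "sh600000", "price": 7.5}')
```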
### `AP-DATA-SOURCING-006` — Source-specific symbol mapping errors causing data corruption <sub>(high)</sub>
Stock symbols require source-specific formatting (sh/sz prefixes for Sina, numeric codes for THS, etc.). Incorrect symbol mapping causes API calls to return empty results or wrong data, corrupting financial datasets with missing records or entirely incorrect tickers being stored.
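As an illustration, a prefixing helper can refuse codes it has no rule for instead of guessing. The leading-digit rule below is a simplified rule of thumb and an assumption of this sketch; `to_prefixed_symbol` is a hypothetical name, and each data source's own documentation remains the authority on its symbol format.

```python
def to_prefixed_symbol(code: str) -> str:
    """Map a 6-digit A-share code to an 'sh'/'sz'-prefixed symbol."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a 6-digit code, got {code!r}")
    if code.startswith("6"):
        return "sh" + code       # assumed: codes starting with 6 trade in Shanghai
    if code.startswith(("0", "3")):
        return "sz" + code       # assumed: codes starting with 0 or 3 trade in Shenzhen
    # Refuse to guess: a wrong prefix would silently fetch the wrong data.
    raise ValueError(f"no exchange rule for code {code!r}")


symbols = [to_prefixed_symbol(c) for c in ("600000", "000001", "300750")]
```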
### `AP-DATA-SOURCING-013` — Column mapping length mismatch causing DataFrame errors <sub>(medium)</sub>
Column mapping constants with length mismatch against actual API response columns cause ValueError exceptions during DataFrame construction. Raw field names (f1, f2, f12) must be mapped to meaningful names (最新价, 涨跌幅) with exact column count alignment.
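A length check before renaming surfaces the mismatch with a readable message. The sketch below works on plain dictionaries to stay dependency-free; `rename_columns` is a hypothetical helper, and the pairing of `f1`/`f2` with the English names is made up for illustration rather than taken from any real API.

```python
from typing import Any


def rename_columns(
    rows: list[dict[str, Any]],
    raw_fields: list[str],
    names: list[str],
) -> list[dict[str, Any]]:
    """Rename raw API fields, failing fast when the mapping is misaligned."""
    if len(raw_fields) != len(names):
        raise ValueError(
            f"{len(raw_fields)} raw fields but {len(names)} target names"
        )
    mapping = dict(zip(raw_fields, names))
    renamed = []
    for row in rows:
        missing = [field for field in raw_fields if field not in row]
        if missing:
            raise KeyError(f"response row lacks fields: {missing}")
        renamed.append({mapping[field]: row[field] for field in raw_fields})
    return renamed


# Illustrative pairing only; not the real meaning of these field codes.
table = rename_columns(
    [{"f1": 7.5, "f2": 1.2}], ["f1", "f2"], ["latest_price", "change_pct"]
)
```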
## finance-bp-103--ArcticDB (3)
### `AP-DATA-SOURCING-007` — Using unsupported DataFrame types with time-series storage <sub>(high)</sub>
ArcticDB does not support MultiIndex columns, PyArrow-backed pandas DataFrames, or timedelta64 columns. Attempting to write these DataFrame types raises ArcticDbNotYetImplemented exceptions, causing write failures and permanent data loss if not properly handled before storage operations.
### `AP-DATA-SOURCING-008` — Non-atomic storage writes causing concurrent access corruption <sub>(high)</sub>
Storage backends without atomic write_if_none operations can cause data corruption under concurrent multi-writer access. Similarly, updating reference keys before atom keys complete allows readers to access incomplete or missing data, breaking version chain integrity.
### `AP-DATA-SOURCING-014` — Pruning snapshot-protected versions breaking point-in-time recovery <sub>(high)</sub>
Deleting or pruning versions that are referenced by existing snapshots breaks historical data access. Snapshots provide point-in-time recovery capabilities, and removing their referenced versions causes read failures when users attempt to access data from specific snapshots.
## finance-bp-114--edgar-crawler (1)
### `AP-DATA-SOURCING-010` — 8-K filing item numbering scheme mismatch for historical filings <sub>(medium)</sub>
8-K filings use obsolete item numbering (1-12) before 2004-08-23 and new numbering (1.01-9.01) after. Using the wrong numbering scheme causes no matches for historical filings, resulting in empty item sections and complete extraction failure for pre-2004 data.
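Selecting the scheme from the filing date can be sketched as follows. The cutover date comes from the text above; the scheme labels and the choice to treat the cutover day itself as using the new numbering are assumptions.

```python
from datetime import date

# Cutover date quoted in the text above.
NEW_SCHEME_START = date(2004, 8, 23)


def item_scheme(filing_date: date) -> str:
    """Return which 8-K item numbering scheme applies to a filing date."""
    # Assumption: filings dated on the cutover day use the new numbering.
    if filing_date < NEW_SCHEME_START:
        return "legacy_1_to_12"
    return "new_1.01_to_9.01"
```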
## finance-bp-128--yfinance (2)
### `AP-DATA-SOURCING-009` — Missing timezone-aware DatetimeIndex causing DST offset errors <sub>(high)</sub>
Price history DataFrames returned without timezone-aware DatetimeIndex cause incorrect timestamp interpretation when combined with other timezone-aware data. This leads to 23-25 hour offset errors during daylight saving time transitions, corrupting historical price calculations.
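A simple guard is to refuse naive timestamps at the boundary and normalize everything to UTC before combining datasets. The sketch below uses plain `datetime` objects and a fixed UTC offset for the example; `to_utc` is a hypothetical helper, not a yfinance function.

```python
from datetime import datetime, timezone


def to_utc(ts: datetime) -> datetime:
    """Convert an aware timestamp to UTC; reject naive timestamps outright."""
    if ts.tzinfo is None or ts.utcoffset() is None:
        raise ValueError("naive timestamp: attach a time zone before combining data")
    return ts.astimezone(timezone.utc)
```

For example, 09:30 at a fixed UTC-5 offset becomes 14:30 UTC, while `to_utc(datetime(2024, 1, 2, 9, 30))` raises instead of guessing the zone.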
### `AP-DATA-SOURCING-011` — Yahoo Finance missing crumb authentication causing 401/403 errors <sub>(high)</sub>
Yahoo Finance API requires crumb and cookie authentication with every request. Without proper crumb management, API calls return 401 Unauthorized or HTML error pages instead of JSON data, breaking all downstream price and financial data processing.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-097--OpenBB
**Scan date**: 2026-04-22
**Stats**: {'total_files': 7, 'total_classes': 36, 'total_functions': 0, 'total_stages': 7}
## Modules (7)
- [extension_loading_&_registry](components/extension_loading_-_registry.md): 5 classes
- [data_acquisition_(fetcher_tet_pattern)](components/data_acquisition_-fetcher_tet_pattern.md): 6 classes
- [command_registration_&_routing](components/command_registration_-_routing.md): 5 classes
- [command_execution_&_parameter_building](components/command_execution_-_parameter_building.md): 5 classes
- [query_interface_(provider_abstraction)](components/query_interface_-provider_abstraction.md): 5 classes
- [results_container_(obbject)](components/results_container_-obbject.md): 5 classes
- [cli_interface_layer](components/cli_interface_layer.md): 5 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 164
fatal_constraints_count: 24
non_fatal_constraints_count: 248
use_cases_count: 19
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
## Domain Constraints Injected (16)
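- **`SHARED-DS-RL-001`** <sub>(fatal)</sub>: Rate limiting with exponential-backoff retry: every external data API call must enforce rate-limit control and retry with exponential backoff and jitter. Retrying immediately after a 429/503 response is an anti-pattern: it adds load on the server and can trigger an IP ban. Use at most 3-5 retries, a backoff base of 1-2 seconds, and a maximum backoff of 60 seconds.
- **`SHARED-DS-RL-002`** <sub>(high)</sub>: Batch API calls must cap concurrency (max_workers) rather than run unbounded in parallel. Free APIs (akshare, the free tushare tier) typically allow 1-3 concurrent requests; paid APIs also have ceilings (tushare is points-based, and different point levels map to different concurrency limits). Exceeding the limit triggers 429 responses or an IP ban. Control concurrency explicitly with asyncio.Semaphore or the max_workers parameter of ThreadPoolExecutor.
- **`SHARED-DS-RL-003`** <sub>(high)</sub>: API token / credential safety: data-source API keys (a tushare token, for example; akshare needs no token, but other commercial sources do) must not be hard-coded; read them from environment variables or a configuration file. A hard-coded token committed to Git leaks the token and can incur charges.
- **`SHARED-DS-RL-004`** <sub>(medium)</sub>: Request throttling: batch requests to the same API should leave a minimum interval between calls (some akshare endpoints require ≥ 0.5 s; the free tushare tier allows 200 calls per minute). A plain sleep in code is less precise than a token-bucket algorithm; a mature library such as ratelimit or slowapi is recommended.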
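- **`SHARED-DS-MISS-001`** <sub>(high)</sub>: Missing data on suspension days: a suspended stock has no trades during the suspension, so the database shows date gaps. Do not forward-fill the missing dates (that fabricates volume); instead mark the rows with is_suspended=True, set volume and turnover to 0, and keep the previous day's close as the price. Factor computation must filter out rows with is_suspended=True.
- **`SHARED-DS-MISS-002`** <sub>(medium)</sub>: Historical data boundary for newly listed stocks: a new stock appears in the database from its first trading day and has no history before listing. If a factor's lookback period exceeds the number of days since listing, every factor value is NaN. Record each stock's listing date (list_date) during collection and start collection from that date rather than from a fixed start date.
- **`SHARED-DS-MISS-003`** <sub>(high)</sub>: Data completeness for delisted stocks: mainstream sources (akshare/tushare) still return pre-delisting history for delisted stocks, but nothing after the delisting date. Historical stock universes must include delisted stocks (otherwise survivorship bias creeps in), and collection must handle the delisting-date cutoff explicitly.
- **`SHARED-DS-MISS-004`** <sub>(high)</sub>: Cross-source reconciliation: the same data point (e.g. the closing price) fetched from different sources (akshare/tushare/baostock) can differ slightly (different adjustment methods, holiday handling, or ex-dividend adjustment timing). The pipeline should run cross-source reconciliation checks and, when a difference exceeds a threshold (e.g. 0.1%), log an alert for manual confirmation.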
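- **`SHARED-DS-TIME-001`** <sub>(high)</sub>: Timestamp precision and type consistency: store timestamps in one data type (timestamp, not varchar/int). Mixing string dates ('2024-01-15') with Timestamp objects is a common source of subtle bugs in comparisons, indexing, and merges; coerce to one type at the pipeline entry point.
- **`SHARED-DS-TIME-002`** <sub>(high)</sub>: Trading time versus calendar time: the "date" of daily bar data usually refers to a trading day (day T), whereas the "time" of news and announcement data is calendar time. When merging the two, map calendar time to the next available trading day; otherwise an announcement published on day T is treated as already usable intraday on day T, which is a look-ahead problem.
- **`SHARED-DS-TIME-003`** <sub>(medium)</sub>: Daylight saving time (DST): when collecting US or European equity data, the DST switch days (March/November) make the same HH:MM map to a different UTC time; if unhandled, the time series drifts by one hour on that day. Always store in UTC and convert to the market's local time zone for display.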
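- **`SHARED-DS-INCR-001`** <sub>(high)</sub>: Idempotent incremental updates: data-update scripts must be idempotent (repeated runs yield the same result). If a script fails midway because of a network interruption, re-running it must not create duplicate rows or gaps. Implementation: write to a staging table first, validate, then UPSERT into the main table instead of a direct INSERT/APPEND.
- **`SHARED-DS-INCR-002`** <sub>(high)</sub>: Data integrity checks (checksums / row counts): after every update, validate key fields: whether the row count is in the expected range, whether prices are positive, and whether dates are continuous (no missing trading days). A pipeline without automatic validation is the root of "silent rot".
- **`SHARED-DS-INCR-003`** <sub>(medium)</sub>: Data versioning: pipeline outputs should be versioned. When a source revises historical data (e.g. restated financials), keep the old version traceable rather than silently overwriting it, so that versions can be diffed and historical backtests reproduced.
- **`SHARED-DS-INCR-004`** <sub>(medium)</sub>: Alignment to trading-calendar boundaries: after collection, verify that coverage for every stock/asset is consistent with the trading calendar. Each stock should have one row per trading day (a suspension marker, not a missing row). Checking the NaN ratio through pivot_table is an effective quick diagnostic.
- **`SHARED-DS-INCR-005`** <sub>(medium)</sub>: Caching: static or rarely updated data that is read often (stock info, industry classification, index constituents) should be cached locally to avoid repeating API calls on every run. Caches must have an expiry (TTL) so that stale industry classifications or outdated constituent lists are not used.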
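The staging-then-UPSERT flow described in `SHARED-DS-INCR-001` can be sketched with the standard-library `sqlite3` module. The table names (`bars`, `bars_staging`), the column layout, and the positive-price check are assumptions chosen for illustration; a real pipeline would validate more fields.

```python
import sqlite3


def upsert_bars(conn: sqlite3.Connection, rows) -> None:
    """Load rows into a staging table, validate them, then UPSERT into bars."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bars ("
        "entity_id TEXT, day TEXT, close REAL, PRIMARY KEY (entity_id, day))"
    )
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS bars_staging "
        "(entity_id TEXT, day TEXT, close REAL)"
    )
    conn.execute("DELETE FROM bars_staging")
    conn.executemany("INSERT INTO bars_staging VALUES (?, ?, ?)", rows)
    # Validate in staging so bad batches never touch the main table.
    bad = conn.execute(
        "SELECT COUNT(*) FROM bars_staging WHERE close IS NULL OR close <= 0"
    ).fetchone()[0]
    if bad:
        conn.rollback()
        raise ValueError(f"{bad} staged rows have a missing or non-positive close")
    # "WHERE true" avoids a parsing ambiguity in SQLite's INSERT ... SELECT upsert.
    conn.execute(
        "INSERT INTO bars (entity_id, day, close) "
        "SELECT entity_id, day, close FROM bars_staging WHERE true "
        "ON CONFLICT(entity_id, day) DO UPDATE SET close = excluded.close"
    )
    conn.commit()
```

Running the function twice with the same rows leaves the main table unchanged, which is the idempotency property the constraint asks for.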
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
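Next-bar execution amounts to delaying every signal by one bar. A minimal sketch on a plain list, assuming one signal per bar in chronological order (`shift_to_next_bar` is a hypothetical helper, not a ZVT function):

```python
def shift_to_next_bar(signals: list) -> list:
    """Delay each signal by one bar so bar t's signal executes at bar t+1."""
    if not signals:
        return []
    # The first bar has nothing to execute; the last signal has no next bar yet.
    return [None] + list(signals[:-1])
```

With signals `["BUY", None, "SELL"]` the executable orders become `[None, "BUY", None]`: the final `SELL` waits for a bar that has not arrived.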
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
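A format check can run before any query is issued. The regular expression below is an assumption inferred from the examples used in this document (`stock_sh_600000`, `stockus_nasdaq_AAPL`); the authoritative rule may be stricter or looser.

```python
import re

# Assumed shape: lowercase entity type, lowercase exchange, alphanumeric code.
ENTITY_ID_PATTERN = re.compile(r"([a-z]+)_([a-z]+)_([A-Za-z0-9]+)")


def parse_entity_id(entity_id: str) -> tuple[str, str, str]:
    """Split an entity ID into (entity_type, exchange, code) or raise ValueError."""
    match = ENTITY_ID_PATTERN.fullmatch(entity_id)
    if match is None:
        raise ValueError(f"{entity_id!r} does not match entity_type_exchange_code")
    return match.group(1), match.group(2), match.group(3)
```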
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
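The "exactly one of" rule is easy to enforce at construction time. The dataclass below is a hypothetical stand-in that only models the three sizing fields; it is not ZVT's `TradingSignal` class.

```python
from dataclasses import dataclass
from typing import Optional

SIZING_FIELDS = ("position_pct", "order_money", "order_amount")


@dataclass
class SignalSizing:
    """Hypothetical container for the three mutually exclusive sizing fields."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        provided = [name for name in SIZING_FIELDS if getattr(self, name) is not None]
        if len(provided) != 1:
            raise ValueError(
                f"exactly one of {SIZING_FIELDS} must be set, got {provided}"
            )
```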
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
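The three-valued semantics are easy to break with a truthiness test, because `not None` evaluates to `True` and would turn "no action" into a sell. A sketch of an explicit mapping for plain Python values (the action labels are assumptions; values such as `numpy.bool_` would need converting first):

```python
import math


def action_for(filter_result) -> str:
    """Map a filter_result cell to an action label without truthiness tests."""
    if filter_result is None:
        return "NO_ACTION"
    if isinstance(filter_result, float) and math.isnan(filter_result):
        return "NO_ACTION"
    if filter_result is True:
        return "BUY"
    if filter_result is False:
        return "SELL"
    raise ValueError(f"unexpected filter_result value: {filter_result!r}")
```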
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
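For reference, a plain-Python MACD with the locked parameters as defaults might look like the sketch below. It assumes the common recursive EMA seeded with the first value and a histogram defined as DIF minus DEA; some charting conventions scale the histogram by 2, and a library implementation may seed the EMA differently.

```python
def ema(values: list[float], span: int) -> list[float]:
    """Recursive exponential moving average seeded with the first value."""
    if not values:
        return []
    alpha = 2 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def macd(closes: list[float], fast: int = 12, slow: int = 26, signal: int = 9):
    """Return (dif, dea, hist) lists for a close-price series."""
    dif = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    dea = ema(dif, signal)
    hist = [d - e for d, e in zip(dif, dea)]  # assumed: no factor of 2
    return dif, dea, hist
```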
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **19**
## `KUC-101`
**Source**: `examples/BacktestingMomentumTrading.ipynb`
Tests a dual moving average crossover strategy to identify optimal buy/sell signals for multiple stocks based on short-term vs long-term momentum.
## `KUC-102`
**Source**: `examples/EthereumTrendAnalysis.ipynb`
Analyzes Ethereum price trends using technical indicators (moving averages, volatility) to identify patterns and trading opportunities in crypto markets.
## `KUC-103`
**Source**: `examples/copperToGoldRatio.ipynb`
Tracks the copper/gold ratio over time and correlates it with US Treasury yields to identify economic cycle indicators.
## `KUC-104`
**Source**: `examples/currencyExchangeRateForecasting.ipynb`
Uses ARIMA, SARIMAX, and LSTM models to forecast EUR/USD exchange rate movements for forex trading decisions.
## `KUC-105`
**Source**: `examples/financialStatements.ipynb`
Compares financial statements (balance sheets) from multiple data providers to ensure data consistency and quality for fundamental analysis.
## `KUC-106`
**Source**: `examples/findSymbols.ipynb`
Searches for stocks, ETFs, and indices using various filters (sector, industry, country, recommendation) to find investment candidates.
## `KUC-107`
**Source**: `examples/googleColab.ipynb`
Explores OpenBB platform capabilities for retrieving and analyzing options chains data for derivatives trading.
## `KUC-108`
**Source**: `examples/impliedEarningsMove.ipynb`
Calculates the expected stock price movement around earnings announcements using options straddle pricing.
## `KUC-109`
**Source**: `examples/loadHistoricalPriceData.ipynb`
Retrieves and resamples historical OHLCV data for stocks with various intervals and date ranges for further analysis.
## `KUC-110`
**Source**: `examples/mAndAImpact.ipynb`
Analyzes stock performance metrics (returns, volatility, beta) around M&A announcements to assess deal impact.
## `KUC-111`
**Source**: `examples/openbb-apachebeam/tests/test_obb_pipeline.py`
Demonstrates scalable data processing using Apache Beam to fetch financial data from OpenBB for enterprise ETL workflows.
## `KUC-112`
**Source**: `examples/openbbPlatformAsLLMTools.ipynb`
Exposes OpenBB functions as tools for LLM agents (LangChain) to enable natural language financial data queries.
## `KUC-113`
**Source**: `examples/openbb_vs_langchain.ipynb`
Compares native OpenBB capabilities with LangChain tools for financial data retrieval and analysis.
## `KUC-114`
**Source**: `examples/platform_standardization.ipynb`
Tests OpenBB platform standardization by comparing data models and outputs across different providers.
## `KUC-115`
**Source**: `examples/portfolioOptimizationUsingModernPortfolioTheory.ipynb`
Optimizes cryptocurrency portfolio allocation using Modern Portfolio Theory to maximize risk-adjusted returns.
## `KUC-116`
**Source**: `examples/riskReturnAnalysis.ipynb`
Analyzes risk and return characteristics across multiple asset classes (equities, bonds, real estate, commodities) for portfolio construction.
## `KUC-117`
**Source**: `examples/sectorRotationStrategy.ipynb`
Implements a sector rotation strategy using ETF performance to dynamically allocate to top-performing sectors.
## `KUC-118`
**Source**: `examples/streamlit/news.py`
Provides a Streamlit-based dashboard for displaying and monitoring financial news from multiple sources.
## `KUC-119`
**Source**: `examples/usdLiquidityIndex.ipynb`
Constructs a USD Liquidity Index by combining Federal Reserve monetary aggregates from FRED for macroeconomic analysis.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-DATA-SOURCING-001` — Exponential backoff retry with rate limit detection
**From**: finance-bp-079--akshare, finance-bp-114--edgar-crawler · **Applicable to**: data-sourcing
Implement retry logic with exponential backoff specifically for HTTP 429 rate limit responses. Retrying immediately on rate limit errors makes the block worse. Separating retry logic for transient network errors (TimeoutError, ConnectionError) from permanent errors (ValueError, KeyError) prevents resource waste and avoids masking underlying bugs.
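A minimal sketch of this separation: only errors listed as transient are retried, with capped exponential backoff and full jitter, while anything else propagates on the first failure. `RateLimitError` is a hypothetical exception standing in for however a client signals HTTP 429, and the default limits are illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error a client raises on HTTP 429."""


TRANSIENT_ERRORS = (TimeoutError, ConnectionError, RateLimitError)


def call_with_retry(func, max_retries=4, base=1.0, cap=60.0, sleep=time.sleep):
    """Call func, retrying transient errors with capped, jittered backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TRANSIENT_ERRORS:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last transient error
            # Full jitter: wait a random time up to the capped exponential delay.
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    # Permanent errors (ValueError, KeyError, ...) are never caught above.
```

Passing `sleep` as a parameter keeps the function testable without real waiting.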
## `CW-DATA-SOURCING-002` — Strict date format validation and standardization
**From**: finance-bp-070--edgartools, finance-bp-079--akshare, finance-bp-084--eastmoney · **Applicable to**: data-sourcing
Validate date formats strictly (YYYY-MM-DD pattern with leap year and month-end checks) before processing XBRL or API data. Convert date strings between formats (YYYYMMDD to YYYY-MM-DD) when storing to databases. Invalid dates corrupt downstream financial calculations.
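Strict validation and conversion can share one helper. The sketch below accepts only the two formats named above, relies on `datetime.strptime` to reject impossible calendar dates (leap years, month ends), and always returns `YYYY-MM-DD`; the function name is hypothetical.

```python
import re
from datetime import datetime

# Shape check first, so loosely formatted input such as "2024-1-5" is rejected.
_DATE_FORMATS = (
    (re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"), "%Y-%m-%d"),
    (re.compile(r"[0-9]{8}"), "%Y%m%d"),
)


def normalize_date(text: str) -> str:
    """Return YYYY-MM-DD for input given as YYYY-MM-DD or YYYYMMDD."""
    for pattern, fmt in _DATE_FORMATS:
        if pattern.fullmatch(text):
            # strptime raises ValueError for dates such as 2023-02-29.
            return datetime.strptime(text, fmt).date().isoformat()
    raise ValueError(f"unsupported date format: {text!r}")
```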
## `CW-DATA-SOURCING-003` — XBRL fact attribute completeness enforcement
**From**: finance-bp-070--edgartools, finance-bp-114--edgar-crawler · **Applicable to**: data-sourcing
Extract and validate all essential XBRL fact attributes (concept, value, period, unit) from every fact. Missing attributes cause financial analysis queries to return incomplete or misleading results. Period type (instant vs duration) must be correctly distinguished for accurate balance sheet rendering.
## `CW-DATA-SOURCING-004` — Streaming parser threshold for large documents
**From**: finance-bp-070--edgartools, finance-bp-128--yfinance · **Applicable to**: data-sourcing
Implement streaming parser activation when documents exceed configurable thresholds (10MB default). This prevents OOM errors on large NPORT-P filings or bulk document downloads. Also require timezone information for time-series data to prevent DST offset corruption.
## `CW-DATA-SOURCING-005` — Data accuracy disclaimer requirements
**From**: finance-bp-079--akshare, finance-bp-128--yfinance, finance-bp-097--OpenBB · **Applicable to**: data-sourcing
Always present scraped or third-party financial data with proper caveats about accuracy limitations and delays. Claims of guaranteed accuracy, real-time capabilities, or Yahoo/provider affiliation violate terms of service and can lead to user financial losses from reliance on delayed or incorrect data.
## `CW-DATA-SOURCING-006` — Atomic write ordering for versioned storage
**From**: finance-bp-103--ArcticDB · **Applicable to**: data-sourcing
Write atom keys (TABLE_DATA, TABLE_INDEX, VERSION) before updating mutable reference keys (VERSION_REF, SNAPSHOT_REF). Never modify atom keys after writing to preserve content-addressed storage invariants. This prevents readers from accessing incomplete data in multi-writer scenarios.
## `CW-DATA-SOURCING-007` — HTTP status code validation before data processing
**From**: finance-bp-079--akshare, finance-bp-097--OpenBB · **Applicable to**: data-sourcing
Always validate HTTP response status codes before processing response data. Error responses (404, 500) may contain HTML error pages that corrupt downstream JSON parsing. Explicitly check for HTTP 429 and raise RateLimitError for proper handling by callers.
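The check order matters: status first, body second. A sketch that takes an already-received status code and body, so it stays independent of any HTTP client; `RateLimitError`, `HTTPStatusError`, and `decode_response` are hypothetical names.

```python
import json


class RateLimitError(Exception):
    """Hypothetical error for HTTP 429 so callers can back off."""


class HTTPStatusError(Exception):
    """Hypothetical error for any other non-2xx status."""


def decode_response(status_code: int, body: str):
    """Validate the status code before attempting to parse the body as JSON."""
    if status_code == 429:
        raise RateLimitError("rate limited (HTTP 429)")
    if not 200 <= status_code < 300:
        # Error bodies are often HTML pages; never hand them to the JSON parser.
        raise HTTPStatusError(f"unexpected HTTP status {status_code}")
    return json.loads(body)
```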
## `CW-DATA-SOURCING-008` — Quality gates for financial recommendations
**From**: finance-bp-084--eastmoney · **Applicable to**: data-sourcing
Apply fundamental quality filters (ROE thresholds, OCF/Profit ratios, debt ratios) before generating financial recommendations. Without quality gates, low-quality stocks may be recommended for positions, leading to investment losses. Separate on-demand computation from scheduled pre-computation to handle API rate limits.
FILE:references/components/cli_interface_layer.md
# cli_interface_layer (5 classes)
## `PlatformController.call`
`cli_interface_layer/platformcontroller-call.py:0`
## `ArgparseTranslator.translate`
`cli_interface_layer/argparsetranslator-translate.py:0`
## `ReferenceToArgumentsProcessor._build_custom_groups`
`cli_interface_layer/referencetoargumentsprocessor-build-cust.py:0`
## `Registry._get_by_index`
`cli_interface_layer/registry-get-by-index.py:0`
## `Data processing command`
`cli_interface_layer/data-processing-command.py:0`
FILE:references/components/command_execution_-_parameter_building.md
# command_execution_&_parameter_building (5 classes)
## `CommandRunner.run`
`command_execution_&_parameter_building/commandrunner-run.py:0`
## `StaticCommandRunner.execute`
`command_execution_&_parameter_building/staticcommandrunner-execute.py:0`
## `ParametersBuilder.validate_kwargs`
`command_execution_&_parameter_building/parametersbuilder-validate-kwargs.py:0`
## `ExecutionContext.update_command_context`
`command_execution_&_parameter_building/executioncontext-update-command-context.py:0`
## `Output callback`
`command_execution_&_parameter_building/output-callback.py:0`
FILE:references/components/command_registration_-_routing.md
# command_registration_&_routing (5 classes)
## `Router.command`
`command_registration_&_routing/router-command.py:0`
## `Router.include_router`
`command_registration_&_routing/router-include-router.py:0`
## `SignatureInspector.complete`
`command_registration_&_routing/signatureinspector-complete.py:0`
## `CommandMap.get_command`
`command_registration_&_routing/commandmap-get-command.py:0`
## `Command function`
`command_registration_&_routing/command-function.py:0`
FILE:references/components/data_acquisition_-fetcher_tet_pattern.md
# data_acquisition_(fetcher_tet_pattern) (6 classes)
## `Fetcher.transform_query`
`data_acquisition_(fetcher_tet_pattern)/fetcher-transform-query.py:0`
## `Fetcher.extract_data`
`data_acquisition_(fetcher_tet_pattern)/fetcher-extract-data.py:0`
## `Fetcher.aextract_data`
`data_acquisition_(fetcher_tet_pattern)/fetcher-aextract-data.py:0`
## `Fetcher.transform_data`
`data_acquisition_(fetcher_tet_pattern)/fetcher-transform-data.py:0`
## `QueryExecutor.execute`
`data_acquisition_(fetcher_tet_pattern)/queryexecutor-execute.py:0`
## `Data fetcher`
`data_acquisition_(fetcher_tet_pattern)/data-fetcher.py:0`
FILE:references/components/extension_loading_-_registry.md
# extension_loading_&_registry (5 classes)
## `ExtensionLoader.load_extensions`
`extension_loading_&_registry/extensionloader-load-extensions.py:0`
## `ExtensionLoader._sorted_entry_points`
`extension_loading_&_registry/extensionloader-sorted-entry-points.py:0`
## `RegistryLoader.load`
`extension_loading_&_registry/registryloader-load.py:0`
## `Provider backend`
`extension_loading_&_registry/provider-backend.py:0`
## `OBBject extension`
`extension_loading_&_registry/obbject-extension.py:0`
FILE:references/components/query_interface_-provider_abstraction.md
# query_interface_(provider_abstraction) (5 classes)
## `Query.execute`
`query_interface_(provider_abstraction)/query-execute.py:0`
## `Query.filter_extra_params`
`query_interface_(provider_abstraction)/query-filter-extra-params.py:0`
## `ProviderInterface.get_provider`
`query_interface_(provider_abstraction)/providerinterface-get-provider.py:0`
## `ProviderInterface._generate_params_dc`
`query_interface_(provider_abstraction)/providerinterface-generate-params-dc.py:0`
## `Provider`
`query_interface_(provider_abstraction)/provider.py:0`
FILE:references/components/results_container_-obbject.md
# results_container_(obbject) (5 classes)
## `OBBject.to_dataframe`
`results_container_(obbject)/obbject-to-dataframe.py:0`
## `OBBject.to_polars`
`results_container_(obbject)/obbject-to-polars.py:0`
## `OBBject.to_dict`
`results_container_(obbject)/obbject-to-dict.py:0`
## `OBBject.to_numpy`
`results_container_(obbject)/obbject-to-numpy.py:0`
## `Output format`
`results_container_(obbject)/output-format.py:0`
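Run high-performance multi-market backtests with NautilusTrader's config-driven BacktestNode, with support for Parquet data catalogs and external CSV data import; strategies can move directly to live trading.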
---
name: nautilus-algo-trading
description: |-
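Run high-performance multi-market backtests with NautilusTrader's config-driven BacktestNode, with support for Parquet data catalogs and external CSV data import; strategies can move directly to live trading.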
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-098"
compiled_at: "2026-04-22T13:00:43.978060+00:00"
capability_markets: "multi-market"
capability_activities: "backtesting, factor-research"
sop_version: "crystal-compilation-v6.1"
---
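# Nautilus Algorithmic Backtesting (nautilus-algo-trading)
> Run high-performance multi-market backtests with NautilusTrader's config-driven BacktestNode, with support for Parquet data catalogs and external CSV data import; strategies can move directly to live trading.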
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (73 total)
### High-Level Backtest with Parquet Data Catalog (`UC-101`)
Users need to run backtests using a config-driven approach with the BacktestNode, enabling reproducible production workflows that can transition to li
**Triggers**: backtest, BacktestNode, Parquet catalog
### Low-Level Backtest with Direct Component Access (`UC-102`)
Users need fine-grained control over backtesting components including custom execution algorithms like TWAP for sophisticated order execution simulati
**Triggers**: BacktestEngine, TWAP, direct control
### Quickstart EMA Crossover Backtest (`UC-103`)
New users need a minimal example to run their first backtest quickly, demonstrating the core strategy and market data handling pattern
**Triggers**: quickstart, first backtest, EMA crossover
For all **73** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (25 total)
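- **`AP-ZVT-183`**: An ex-rights factor of inf/NaN goes straight into the multiplication, so price adjustment fails silently
- **`AP-ZVT-179`**: Exceptions swallowed after a third-party data API quota is exceeded, leaving data silently missing
- **`AP-ZVT-183B`**: Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) bar table causes factor drift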
All 25 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-098. Evidence verify ratio = 33.3% and audit fail total = 20. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source (source-of-truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 25 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When full examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-098` blueprint at 2026-04-22T13:00:43.978060+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Quickstart EMA Crossover Backtest', 'Low-Level Backtest with Direct Component Access', 'High-Level Backtest with Parquet Data Catalog', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **25**
## qlib (9)
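### `AP-QLIB-1930` — Backtest results independent of the model: a shared dataset object lets the first model's predictions overwrite the rest <sub>(high)</sub>
When several models in Qlib reuse one already-fitted DatasetH instance, the dataset's internal normalization parameters (the statistics determined by fit_start_time/fit_end_time) are frozen after the first fit. Switching models without re-initializing the dataset means every model actually uses the same prediction signal. The symptom: the backtest equity curve is identical whether you use LightGBM, XGBoost, or a DNN. This is the most dangerous anti-pattern of the "the experiment appears to run, but every conclusion is invalid" kind.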
Source: https://github.com/microsoft/qlib/issues/1930
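### `AP-QLIB-2090` — Configuring both fit_start_time and the train segment causes implicit data leakage <sub>(high)</sub>
Qlib's DatasetH has two "training data ranges": the handler's fit_start_time/fit_end_time (which sets the range the normalizer is fitted on) and segments.train (which sets the range the model is trained on). A common mistake is letting fit_end_time extend over the valid/test segments, so the normalization statistics (mean, standard deviation) include future data and introduce look-ahead bias. The two are configured independently but are semantically coupled, and the documentation does not state clearly that fit_end_time must be <= train_end.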
Source: https://github.com/microsoft/qlib/issues/2090
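The coupling between the two ranges can be asserted before any dataset is built. The helper below is a hypothetical pre-flight check on ISO date strings, not part of Qlib; it assumes the caller extracts `fit_end_time` and the end of the train segment from its own configuration.

```python
from datetime import date


def check_fit_window(fit_end_time: str, train_end: str) -> None:
    """Raise if the normalizer would be fitted on data after the train segment."""
    if date.fromisoformat(fit_end_time) > date.fromisoformat(train_end):
        raise ValueError(
            f"fit_end_time {fit_end_time} is later than train end {train_end}: "
            "normalization statistics would include validation/test data"
        )
```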
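### `AP-QLIB-2036` — MACD factor formula error in the docs: DEA is divided by CLOSE one extra time, making the units inconsistent <sub>(high)</sub>
The Alpha formula example in Qlib's official documentation defines MACD's DEA as EMA(DIF, 9) / CLOSE, but DIF is already dimensionless (already divided by CLOSE), so dividing by CLOSE again gives DEA units of 1/price. After cross-sectional standardization, a MACD factor built from this documented formula differs markedly from the correct one, and its IC drops. Documentation-level formula errors like this get copied verbatim into production factor libraries by many users.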
Source: https://github.com/microsoft/qlib/issues/2036
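### `AP-QLIB-2184` — Custom A-share data imported without filling suspension days with NaN as the convention requires, causing downstream factor noise <sub>(high)</sub>
By Qlib convention, the open/close/high/low/volume/factor fields should all be NaN on suspension days so that the framework can recognize and skip them during factor computation. If a user-built A-share dataset keeps the previous day's price on suspension days (common for data exported directly from Eastmoney/Wind), price-momentum factors show "false signals" during the suspension (the price is unchanged but the factor is non-zero). Qlib does not validate this convention, so the error flows silently into the training data.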
Source: https://github.com/microsoft/qlib/issues/2184
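### `AP-QLIB-1892` — The PIT (point-in-time) financial data collector depends on an external stock-list API and fails to fetch the full A-share universe <sub>(high)</sub>
Qlib's PIT data collector (point-in-time snapshots of financial data) calls get_hs_stock_symbols() at initialization to fetch the Shanghai and Shenzhen stock list. The function depends on the Eastmoney API, which often returns only part of the list rather than all 5000+ stocks, and the function raises ValueError outright when the list is incomplete. Users who follow the documented steps end up with a financial dataset covering only some stocks, and backtests based on PIT financial factors carry severe survivorship bias (stocks that were never collected are implicitly excluded).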
Source: https://github.com/microsoft/qlib/issues/1892
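### `AP-QLIB-2097` — Full-market instrument="all" runs out of memory on a 32 GB machine while CSI300 works <sub>(medium)</sub>
When loading Alpha158 features, Qlib loads the entire feature matrix of the specified universe into memory at once. Memory use with instrument="csi300" (300 stocks) and instrument="all" (5000+ stocks) differs by roughly 16x. A 32 GB machine running the full market hits OOM in the init_instance_by_config stage, and the error message does not point to memory. Users easily mistake it for a configuration error, when what is needed is batched loading or streaming feature computation.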
Source: https://github.com/microsoft/qlib/issues/2097
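### `AP-QLIB-1984` — LightGBM label-dimension check never triggers, so multi-label training fails silently <sub>(medium)</sub>
Qlib's gbdt.py uses y.values.ndim == 2 to detect multiple labels, but a Series taken from a DataFrame always has ndim 1, so the condition is always False. Multi-label training therefore never takes the squeeze branch; it goes straight into LightGBM training and raises a semantically unclear error deeper down. Users attempting a custom multi-label task cannot trace the error message back to this root cause.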
Source: https://github.com/microsoft/qlib/issues/1984
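### `AP-QLIB-1915` — After dump_bin of custom CSV data, DataHandler reports Length mismatch while D.features works <sub>(high)</sub>
Qlib has two data access paths: D.features (reads the binary files directly) and DataHandler/DataHandlerLP (with a processor pipeline). If the field order or symbol format (e.g. 600000.SH vs SH600000) of custom A-share CSV data does not follow Qlib's conventions at dump_bin time, the DataHandler processors raise Length mismatch during align/reindex, while D.features succeeds because it bypasses the processors. This inconsistency between the two paths leads users to believe the data was imported correctly.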
Source: https://github.com/microsoft/qlib/issues/1915
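### `AP-QLIB-1949` — Colab/Linux multiprocessing backend conflicts with Qlib's ParallelExt, leaving DataHandler completely unusable <sub>(medium)</sub>
In non-fork environments (Windows or Google Colab), when DataHandler loads features in parallel through joblib, ParallelExt fails at initialization while accessing the _backend_args attribute (AttributeError). The root cause is that joblib 1.5+ removed this internal attribute and Qlib's compatibility layer was not updated. The symptom is a multi-level nested exception from the D.features call; users cannot tell from the stack trace whether the problem is the parallel backend or the data.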
Source: https://github.com/microsoft/qlib/issues/1949
## vnpy (4)
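### `AP-VNPY-3691` — Bar generator's first bar has a misaligned timestamp, producing a wrong signal in the first period <sub>(high)</sub>
When vnpy's BarGenerator synthesizes N-minute bars, the first bar it pushes is stamped with "the minute of the current tick" rather than "the end time of a complete N-minute period". Concretely, a tick at 09:59 triggers an incomplete 5-minute bar (which should not have been pushed until 10:04). If the strategy filters directly with datetime.minute % 5 in on_bar, the first bar happens to pass, but it contains less than one full period of data and yields a wrong entry signal when used for signal computation.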
Source: https://github.com/vnpy/vnpy/issues/3691
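### `AP-VNPY-3669` — Incremental saves of Alpha-module history fail with SchemaError when new and old DataFrame schemas are incompatible <sub>(medium)</sub>
When saving bar data to Parquet files, vnpy's Alpha module concatenates newly downloaded data (which may contain Float64 columns) with the stored file (historical Int64 columns) directly via polars.concat. polars is strongly typed and does not allow implicit type promotion, so it raises SchemaError. The root cause is that different data sources/versions return inconsistent field types (e.g. volume is an integer from some feeds and a float from others) and there is no schema-alignment step before the concat. This affects every historical-data build workflow that uses vnpy alpha for backtesting.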
Source: https://github.com/vnpy/vnpy/issues/3669
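### `AP-VNPY-3685` — Spread-trading run_backtesting() fails silently under Jupyter, so results cannot be trusted <sub>(high)</sub>
In vnpy 4.10, run_backtesting() in the SpreadTrading module hits an event-loop conflict under Jupyter (asyncio already running), so part of the backtest engine's logic does not execute, yet no exception is raised and seemingly normal backtest statistics are returned. The same code has no such problem in command-line Python. vnpy 4.x moved some I/O to async, which is incompatible with Jupyter's event loop; this is a hidden trap where "the backtest results look correct but are actually incomplete".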
Source: https://github.com/vnpy/vnpy/issues/3685
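### `AP-VNPY-3700` — Install script does not use a venv, so the global numpy is downgraded and other dependencies break <sub>(medium)</sub>
vnpy's install.bat installs directly into the system/conda base environment and forces numpy down to <2.0 to satisfy vnpy's dependencies, breaking other quant tools that depend on numpy 2.x (e.g. newer scipy and pytorch releases). There is no requirements.txt, so the dependency boundary is opaque. In a quant research environment where several tools coexist, vnpy's install script is a common source of "global environment pollution".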
Source: https://github.com/vnpy/vnpy/issues/3700
## zipline (6)
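### `AP-ZIPLINE-138` — Backtest prices are unadjusted, and the tutorial chart misleads users about strategy returns <sub>(high)</sub>
The Zipline tutorial demonstrates with an AAPL price chart, but the bundle stores unadjusted (raw) prices rather than prices adjusted for splits and dividends. The historical prices in the chart differ from actual market prices by roughly 4x (Apple's cumulative split factor), and users mistake "the price quadrupled" for strategy return. The A-share case is worse: price jumps around ex-rights dates form huge "signals" in unadjusted data and lure technical indicators into false breakout signals on the ex-rights date.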
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/138
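### `AP-ZIPLINE-235` — Default fill at the current bar's close understates live slippage and inflates backtest returns <sub>(high)</sub>
After a signal fires on a bar, Zipline's default slippage model fills at the close of that same bar (current bar close fill). In live trading, the signal can only be filled near the open of the next bar (T+1 order execution). Taking A-share daily bars as an example, backtesting at the close overstates daily return by about 0.1-0.3% on average compared with filling at the next day's open, and the annualized gap can exceed 30%. Configure the slippage model explicitly as VolumeShareSlippage or FixedSlippage and set a reasonable volume_limit.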
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/235
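### `AP-ZIPLINE-190` — Setting the calendar start_session to a non-trading day raises DateOutOfBounds with no hint on how to fix it <sub>(medium)</sub>
When registering a bundle or running an algorithm, if start_session falls on a non-trading day (e.g. New Year's Day, 1998-01-01), calendar validation raises DateOutOfBounds("cannot be earlier than the first session"). The message only shows the calendar's first session and does not suggest using the first trading day instead. In the A-share case, with the SSE/SZSE calendar, a start_date that falls on the day after the last trading day before Spring Festival (a holiday) triggers the same kind of error, and it is very costly to debug.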
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/190
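### `AP-ZIPLINE-181` — With a stale asset db, Pipeline reports "no assets traded", sending users off to check the data range <sub>(high)</sub>
Zipline's asset database (SQLite) records the start/end trading dates of each stock. If an old Quandl or self-built bundle is used without re-ingesting, Pipeline raises "Failed to find any assets with country_code 'US' that traded between [dates]" when backtesting a newer date range. In the A-share case, if only price data is updated after re-downloading quotes and the asset db is not rebuilt, the date ranges of delisted/newly listed stocks are not updated, and Pipeline filtering silently excludes those stocks, producing survivorship bias.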
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/181
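### `AP-ZIPLINE-285` — week_start()/week_end() fail silently under custom (non-US) calendars <sub>(medium)</sub>
date_rules.week_start() and date_rules.week_end() in Zipline's schedule_function rely on the trading calendar's logic for the first/last day of the week, but under non-US calendars (e.g. ASX, SSE) that logic is incompatible with the offset calculation for the NYSE calendar, so the schedule never fires or fires on the wrong date. In the A-share case, with the SSE calendar, in weeks containing long consecutive holidays such as Spring Festival, week_start may skip the whole holiday week without rebalancing, and users cannot see the missed schedule in the logs.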
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/285
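### `AP-ZIPLINE-240` — Backtest dates must be in UTC; a naive datetime raises an AssertionError deep in the stack <sub>(medium)</sub>
Zipline internally requires every timestamp to be a UTC-aware datetime. When a user passes a naive datetime (no timezone information, such as pd.Timestamp('2020-01-01')), Zipline does not reject it at the entry point. Instead it raises AssertionError: Algorithm should have a utc datetime deep inside algorithm execution, where the stack depth makes the cause hard to locate. A-share developers importing data in local CST time hit this trap very easily; call tz_localize('UTC') explicitly when registering the bundle.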
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/240
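The same guard can be applied at the data-loading boundary with the standard library alone. The sketch below is one possible entry-point check, not part of Zipline: `to_utc` is a hypothetical helper, and CST is modeled as a fixed UTC+8 offset. It converts from the declared source time zone instead of relabeling the value as UTC, which is a variation on the tz_localize('UTC') advice above.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time modeled as a fixed UTC+8 offset (no DST).
CST = timezone(timedelta(hours=8), "CST")


def to_utc(ts, assume_tz=None):
    """Return ts as a UTC-aware datetime, rejecting naive input early.

    A naive datetime is accepted only when the caller states which
    time zone it was recorded in via assume_tz.
    """
    if ts.tzinfo is None or ts.utcoffset() is None:
        if assume_tz is None:
            raise ValueError(f"naive datetime {ts!r}: pass assume_tz explicitly")
        ts = ts.replace(tzinfo=assume_tz)
    return ts.astimezone(timezone.utc)


local_close = datetime(2020, 1, 2, 15, 0)        # 15:00 local, naive
utc_close = to_utc(local_close, assume_tz=CST)   # 07:00 UTC on the same day
```

Failing at the boundary with a clear message keeps the problem out of the deep call stack described above.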
## zvt (6)
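### `AP-ZVT-183` — Adjustment factors of inf/NaN enter the multiplication directly, so price adjustment fails silently <sub>(high)</sub>
When computing the forward-adjustment factor, ZVT derives qfq_factor from the new/old price ratio. When old == 0 (a new stock's first day, or missing data) the factor is inf; when kdata.open itself is None (a suspension day left unfilled) the multiplication raises TypeError. As a result, adjustment for the whole entity aborts and all subsequent bars are lost, but the main flow only logs ERROR and keeps running, so users often do not know that data for a large number of stocks is already corrupted.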
Source: https://github.com/zvtvz/zvt/issues/183
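A defensive version of the ratio and the multiplication might look like the sketch below. It is not ZVT's code: `safe_qfq_factor` and `adjust` are hypothetical helpers that return None for undefined cases so that one bad bar can be skipped without aborting the rest of the entity.

```python
import math


def safe_qfq_factor(new_price, old_price):
    """Return new_price / old_price, or None when the ratio is undefined."""
    if new_price is None or old_price is None:
        return None
    if old_price == 0:
        return None
    factor = new_price / old_price
    # Reject NaN and +/-inf produced by non-finite inputs.
    return factor if math.isfinite(factor) else None


def adjust(price, factor):
    """Apply an adjustment factor, propagating missing values as None."""
    if price is None or factor is None:
        return None
    return price * factor


factor = safe_qfq_factor(9.0, 10.0)
adjusted_open = adjust(20.0, factor)
```

Callers can then count the None results per entity and raise once a threshold is crossed, so corruption is reported instead of only logged.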
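### `AP-ZVT-179` — Exceptions are swallowed after a third-party data API hits its quota, leaving data silently missing <sub>(high)</sub>
When ZVT uses JoinQuant's jqdatasdk to bulk-fetch bars for the whole market (4000+ stocks), it hits JoinQuant's daily maximum query limit (error: 已超过每日最大查询数量, "daily maximum query count exceeded"). ZVT catches the exception and moves on to the next entity, so after the limit is hit, that day's data for every remaining stock is silently missing. A backtest run on this incomplete database produces systematically biased factor values, with no warning.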
Source: https://github.com/zvtvz/zvt/issues/179
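A fetch loop that stops loudly on quota exhaustion, instead of continuing to the next entity, could be sketched as follows. Everything here is hypothetical: `QuotaExceededError` stands in for whatever exception the data provider raises, and `fetch_one` is a caller-supplied function.

```python
class QuotaExceededError(RuntimeError):
    """Hypothetical stand-in for the provider's quota error."""


def fetch_all(entity_ids, fetch_one):
    """Fetch every entity, aborting with a summary once the quota is hit."""
    results = {}
    for pos, entity_id in enumerate(entity_ids):
        try:
            results[entity_id] = fetch_one(entity_id)
        except QuotaExceededError as exc:
            # Every later request would fail too, so report what is missing.
            missing = entity_ids[pos:]
            raise RuntimeError(
                f"quota exhausted: {len(missing)} entities not fetched, "
                f"starting at {missing[0]}"
            ) from exc
    return results


def demo_fetch(entity_id):
    """Fake provider call: fails on the third entity."""
    if entity_id == "stock_sz_000003":
        raise QuotaExceededError("daily limit reached")
    return {"entity_id": entity_id, "close": 1.0}


ids = ["stock_sz_000001", "stock_sz_000002", "stock_sz_000003", "stock_sz_000004"]
try:
    fetch_all(ids, demo_fetch)
    outcome = "complete"
except RuntimeError as err:
    outcome = str(err)
```

The error message names the first missing entity, which gives a resume point for the next run.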
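### `AP-ZVT-161` — Whole-market batch factor computation on SQLite triggers the "too many SQL variables" error <sub>(medium)</sub>
When computing multi-stock factors such as VolumeUpMaFactor, ZVT concatenates every entity_id into the IN clause of a single SQL statement. Querying the whole A-share market (5000+ stocks) at once exceeds SQLite's default limit SQLITE_MAX_VARIABLE_NUMBER=999 (the default in SQLite releases before 3.32.0). Raising max_allowed_packet (a MySQL parameter) has no effect, because the root cause is SQLite's cap on bound variables. The correct fix is to query in batches, but early ZVT versions did not handle this boundary.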
Source: https://github.com/zvtvz/zvt/issues/161
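The batching fix can be shown with the standard-library sqlite3 module. This is a minimal sketch under stated assumptions: the `kdata` table, its columns, and the `query_in_batches` helper are hypothetical and are not ZVT's schema or API.

```python
import sqlite3


def query_in_batches(conn, entity_ids, batch_size=900):
    """Run an IN (...) query in chunks that stay under the variable limit."""
    rows = []
    for start in range(0, len(entity_ids), batch_size):
        batch = entity_ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        sql = f"SELECT entity_id, close FROM kdata WHERE entity_id IN ({placeholders})"
        rows.extend(conn.execute(sql, batch).fetchall())
    return rows


# In-memory demo database with 5000 hypothetical entities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kdata (entity_id TEXT PRIMARY KEY, close REAL)")
ids = [f"stock_sz_{i:06d}" for i in range(5000)]
conn.executemany("INSERT INTO kdata VALUES (?, ?)", [(e, 1.0) for e in ids])
rows = query_in_batches(conn, ids)
```

A batch size of 900 leaves headroom below the 999 limit for any additional bound parameters in the statement.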
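### `AP-ZVT-129` — Wildcard imports hide API changes between versions, and enums such as AdjustType vanish without explanation <sub>(medium)</sub>
ZVT's documentation examples import every symbol with `from zvt import *`. After a ZVT upgrade refactors the enums (for example, moving AdjustType into a submodule), the wildcard import no longer includes that symbol, which triggers an AttributeError. Users assume an installation problem, when the actual cause is a breaking API change between versions that was not noted in the CHANGELOG, with the wildcard import obscuring where the symbol came from. Import enum classes explicitly.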
Source: https://github.com/zvtvz/zvt/issues/129
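### `AP-ZVT-187` — The backtest engine does not stop early on an empty data-layer result, causing a cascade of null-reference crashes <sub>(medium)</sub>
When ZVT Trader finds the data empty after load_data completes, it does not exit early. It passes the empty DataFrame into the selector computation, which triggers a chain of failing NoneType operations downstream. The stack trace is deep and the root cause is hard to locate, so users assume a strategy-logic problem. The root cause is a misconfigured data time window (start/end outside the range covered by the database) with no effective validation.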
Source: https://github.com/zvtvz/zvt/issues/187
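A window check of the kind that is missing here can run before any data is loaded. The sketch below assumes the stored coverage range is known up front; `require_overlap` is a hypothetical helper, not a ZVT function.

```python
from datetime import date


def require_overlap(start, end, covered_start, covered_end):
    """Fail fast when [start, end] cannot return any stored bars.

    Returns the requested window clipped to the covered range.
    """
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    if end < covered_start or start > covered_end:
        raise ValueError(
            f"window {start}..{end} is outside stored data "
            f"{covered_start}..{covered_end}"
        )
    return max(start, covered_start), min(end, covered_end)


# Hypothetical request that partly precedes the stored data.
window = require_overlap(
    date(2019, 6, 1), date(2020, 6, 1),
    covered_start=date(2020, 1, 1), covered_end=date(2023, 12, 31),
)
```

Raising here points directly at the configuration error, instead of at a NoneType failure several layers later.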
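### `AP-ZVT-183B` — Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) bar table causes factor drift <sub>(high)</sub>
ZVT provides three separate tables: Stock1dKdata (unadjusted), Stock1dHfqKdata (backward-adjusted), and Stock1dQfqKdata (forward-adjusted). Users mix two tables when computing price momentum or moving-average factors (for example, unadjusted prices for the moving average and backward-adjusted prices for returns), which makes factor values jump around ex-rights dates. ZVT does not check adjustment-type consistency across tables, so the mix passes silently.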
Source: https://github.com/zvtvz/zvt/issues/183
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-098--nautilus_trader
**Scan date**: 2026-04-22
**Stats**: {'total_files': 8, 'total_classes': 60, 'total_functions': 0, 'total_stages': 8}
## Modules (8)
- [data_collection_&_ingestion](components/data_collection_-_ingestion.md): 7 classes
- [factor_computation_/_strategy_processing](components/factor_computation_-_strategy_processing.md): 11 classes
- [risk_management](components/risk_management.md): 4 classes
- [order_execution_&_venue_routing](components/order_execution_-_venue_routing.md): 8 classes
- [trading_coordination_&_portfolio](components/trading_coordination_-_portfolio.md): 8 classes
- [backtest_simulation](components/backtest_simulation.md): 8 classes
- [state_persistence_&_cache](components/state_persistence_-_cache.md): 8 classes
- [accounting_&_settlement](components/accounting_-_settlement.md): 6 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
  business_decisions_count: 137
  fatal_constraints_count: 68
  non_fatal_constraints_count: 217
  use_cases_count: 73
  semantic_locks_count: 12
  preconditions_count: 4
  evidence_quality_rules_count: 2
  traceback_scenarios_count: 5
```
## Domain Constraints Injected (39)
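- **`SHARED-BT-LAB-001`** <sub>(fatal)</sub>: Lookahead bias: when simulating a trading decision at historical time t, do not use information that only becomes known after t. The most common forms are: (1) computing the signal from the close price and filling at the same day's close; (2) stamping an indicator computed after the close of day T onto that same bar; (3) using the day's high or low as the assumed fill price. Signal computation and fill time must be aligned: compute the signal after the close of day T and fill at the open of day T+1.
- **`SHARED-BT-LAB-002`** <sub>(high)</sub>: Warmup period handling: rolling-window indicators are NaN for the first N bars, and those bars must not take part in signal computation or position decisions. The indicator's warmup_period must equal the longest lookback period, and positions must be zero during warmup.
- **`SHARED-BT-LAB-003`** <sub>(fatal)</sub>: Time-series data for ML/DL models must be split in chronological order, TRAIN < VALID < TEST. Do not use random k-fold splits, which mix future data into the training set. Use TimeSeriesSplit or walk-forward validation.
- **`SHARED-BT-LAB-004`** <sub>(fatal)</sub>: Open/high/low fill assumptions: assuming in a daily-bar backtest that one can sell at the day's high or buy at the day's low (for example, a momentum strategy that "takes profit at the high") is obvious lookahead, because the intraday high and low are only confirmed after the close. The fill price may only be the open price or the previous day's close (with slippage).
- **`SHARED-BT-LAB-005`** <sub>(high)</sub>: Off-by-one data alignment: operations such as pandas rolling/shift easily introduce a subtle one-period offset error. Record the "observation time" of every series explicitly in code and verify key time-alignment relationships with assert.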
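- **`SHARED-BT-LAB-006`** <sub>(high)</sub>: Overfitting: the more backtests are run, the higher the probability of overfitting. Bailey et al. (2014) show that the expected maximum Sharpe ratio across trials rises with the number of backtests even when the strategy has no true edge, so the best observed Sharpe is inflated. Use walk-forward validation instead of exhaustive in-sample parameter search, and report the Deflated Sharpe Ratio (DSR) rather than the peak Sharpe.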
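- **`SHARED-BT-SURV-001`** <sub>(fatal)</sub>: Survivorship bias: using the current market constituents as the stock universe of a historical backtest omits stocks that once existed but were later delisted, removed, or merged, and systematically overstates the strategy's historical return. The backtest universe must use point-in-time snapshots (point-in-time universe).
- **`SHARED-BT-SURV-002`** <sub>(high)</sub>: In-sample / out-of-sample split: strategy development and parameter selection must be completed in-sample. Out-of-sample data is for final validation only; do not keep tuning after repeatedly "looking" at it, which turns the out-of-sample set into a new in-sample set and repeats the overfitting.
- **`SHARED-BT-SURV-003`** <sub>(high)</sub>: Fill policy for suspended or missing data: do not simply forward-fill suspension-day prices with the previous close, because that lets "zero-volume" days take part in factor computation and signal generation in the backtest. Filter out missing trading days explicitly at the factor-computation layer and do not fill them.
- **`SHARED-BT-SURV-004`** <sub>(high)</sub>: Extreme-value contamination: raw market data may contain data-source errors (such as an ex-rights adjustment applied late, or extreme prices from manual entry mistakes). Feeding it into factor computation without cleaning produces extreme signals that contaminate the whole cross-section. Filter 3-sigma outliers at the pipeline entry point.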
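- **`SHARED-BT-COST-001`** <sub>(fatal)</sub>: Trading costs (commission + stamp duty/transfer tax + transfer fee) must be configured when the backtest is initialized; a zero-cost default is not allowed. Performance metrics from a backtest that ignores costs are deceptive, especially for high-turnover strategies, where round-trip costs often consume 50%+ of gross return.
- **`SHARED-BT-COST-002`** <sub>(high)</sub>: Slippage modeling: a backtest without slippage assumes every order fills at the ideal price, and a high-frequency strategy will then lose heavily in live trading as fill prices deteriorate. Configure at least a fixed spread or proportional slippage; large orders should use a volume-share model (for example, no more than 5% of daily volume).
- **`SHARED-BT-COST-003`** <sub>(high)</sub>: Turnover must be shown in the backtest performance report and analyzed together with costs. When monthly turnover exceeds 50% (600%+ annualized), net return is extremely sensitive to the cost assumption: every 10 bps change in cost can flip the conclusion on whether the strategy is profitable, so a cost sensitivity analysis is mandatory.
- **`SHARED-BT-COST-004`** <sub>(medium)</sub>: Position sizing must respect capital constraints: the backtest should simulate the actual whole-number share count held under a fixed amount of capital rather than assume fractional shares. For small-cap stocks, the minimum trading unit (A-shares: 100 shares per lot) makes the achievable position deviate from the target weight, and the backtest should simulate this rounding effect.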
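- **`SHARED-BT-TIME-001`** <sub>(high)</sub>: Unified timestamp time zone: when merging multiple data sources, mixing UTC with local time is a common source of data corruption. Convert all timestamps to a single time zone at the pipeline entry point (UTC for storage and the market's local time zone for display are recommended), and do not mix time zones partway through the pipeline.
- **`SHARED-BT-TIME-002`** <sub>(high)</sub>: Trading-calendar alignment: when merging data from different markets or frequencies (such as daily prices + weekly factors), reindex/merge against an explicit trading calendar. Do not outer-join and then fillna, which creates spurious rows on non-trading days (holidays).
- **`SHARED-BT-TIME-003`** <sub>(high)</sub>: Boundary check for incremental updates: when updating historical data incrementally, query the database for the latest stored date and download only data after that date. Re-downloading existing data and appending it produces rows with duplicate timestamps and breaks the time order of the backtest. Verify that there are no duplicates before and after each update (index.duplicated().any() == False).
- **`SHARED-BT-TIME-004`** <sub>(medium)</sub>: Distorted performance attribution: a poorly chosen benchmark distorts the Alpha/Beta calculation. Use a passive benchmark the strategy can actually invest in (such as an HS300 ETF) rather than a price index that is not directly investable (such as the HS300 index). A price index excludes reinvested dividends and understates the return of holding the benchmark.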
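- **`SHARED-BT-PERF-001`** <sub>(medium)</sub>: Max drawdown must be computed from the net-value series (portfolio value), not from a cumulative-return series. Summing log returns misstates drawdown depth, because a log return differs from the simple return for the same move (a 50% decline is a log return of about -0.693).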
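- **`SHARED-BT-PERF-002`** <sub>(medium)</sub>: Sharpe ratio annualization convention: annualized Sharpe = daily Sharpe × sqrt(252) (equities, 252 trading days) or × sqrt(365) (crypto, 365 days). Defaults differ across systems, so confirm the annualization factor before comparing across systems; otherwise the Sharpe ratios are not comparable.
- **`SHARED-BT-PERF-003`** <sub>(medium)</sub>: Calmar and Sortino ratios are better risk-adjusted return metrics than the Sharpe ratio: Sharpe assumes normally distributed returns, while return distributions in the A-share and crypto markets are markedly left-skewed with fat tails, so Sharpe understates downside risk. A quantitative evaluation should report both Sortino (downside volatility only) and Calmar (annualized return / max drawdown) and should not rely on Sharpe alone.
- **`SHARED-BT-PERF-004`** <sub>(medium)</sub>: Backtest performance attribution should be decomposed into alpha (active return), beta (market return), factor-exposure return (style/sector), and idiosyncratic return (stock selection). A backtest without attribution cannot tell "a good strategy" apart from "a favorable market where beta happened to be right".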
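- **`SHARED-FR-IC-001`** <sub>(high)</sub>: The IC (information coefficient) is the core measure of a factor's predictive power, defined as the Spearman rank correlation between factor values and next-period returns (ICIR = mean(IC) / std(IC)). |IC| > 0.05 is treated as preliminary evidence of predictive power, and ICIR > 0.5 as stable. Reporting backtest performance without computing IC is a typical case of missing proof of factor validity.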
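- **`SHARED-FR-IC-002`** <sub>(high)</sub>: IC decay analysis: a factor's predictive power usually decays as the holding period lengthens. Compute the IC series at 1/5/10/20 days to identify the factor's optimal holding period. A factor whose IC is high at 1 day but decays quickly by 20 days is a short-term factor and is unsuitable for a monthly rebalancing strategy, and vice versa. Using the wrong holding period severely damages live factor performance.
- **`SHARED-FR-IC-003`** <sub>(high)</sub>: Harvey, Liu & Zhu (2016) warn that academia has found 300+ "significant" factors, many of which are false discoveries under multiple testing. Factor validity requires a t-stat > 3.0 (rather than the traditional 1.96), or independent replication across periods or markets, or clear economic logic. A factor that meets none of these conditions is very likely a product of data mining.
- **`SHARED-FR-IC-004`** <sub>(high)</sub>: Factor turnover control: a factor with high IC but high turnover may have a negative net IC after trading costs. Compute the turnover-adjusted effective IC: net_IC = IC - turnover × cost_per_turn. Target turnover ≤ 50% (monthly).
- **`SHARED-FR-IC-005`** <sub>(medium)</sub>: The factor half-life is the core parameter of signal strength and directly determines the optimal rebalancing frequency. Half-life < 5 days: daily or weekly rebalancing; 5-20 days: weekly or biweekly; > 20 days: monthly. Rebalancing a short-term factor monthly by mistake lets much of the alpha dissipate during the holding period.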
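- **`SHARED-FR-NEUT-001`** <sub>(high)</sub>: Industry neutralization: if factor values are not neutralized against the industry mean, factor returns are mixed with industry-rotation returns, and it is hard to tell whether the factor itself or industry exposure drove the return. The operation is: factor_neutral = factor - industry_mean(factor).
- **`SHARED-FR-NEUT-002`** <sub>(high)</sub>: Market-cap neutralization: the small-cap effect (small caps outperforming large caps) is one of the most persistent anomalies in financial history and contaminates almost every factor that is not neutralized. If a factor is highly correlated with market cap, stock selection tilts systematically toward small caps and the return comes from size exposure rather than from the factor itself. Neutralize for industry and market cap together (Fama-MacBeth regression or the residual method).
- **`SHARED-FR-NEUT-003`** <sub>(high)</sub>: Outlier handling (Winsorize/MAD): raw factor values usually contain extreme values, which distort quantile analysis (such as Q1/Q10 deciles). Winsorize raw factor values (clip to [1%, 99%] or 3-sigma) or apply MAD (median absolute deviation) clipping before ranking or neutralizing.
- **`SHARED-FR-NEUT-004`** <sub>(medium)</sub>: Factor orthogonalization: when several factors are combined into a composite score, combining highly correlated factors is equivalent to overweighting a single factor and dilutes signal diversity. Before combining, apply Gram-Schmidt orthogonalization or PCA to remove multicollinearity among factors.
- **`SHARED-FR-NEUT-005`** <sub>(medium)</sub>: Missing-data fill policy: filling NaN values in factor computation (suspensions, new listings, data gaps) with the cross-sectional mean introduces lookahead bias (the mean itself contains future information), and dropping them entirely introduces survivorship bias. The correct approach is to use the cross-sectional median (the median across all stocks on that day, which does not depend on the future) or to exclude the stock for that day.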
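- **`SHARED-FR-PORT-001`** <sub>(high)</sub>: Quantile analysis: factor evaluation should use the long-short return spread (top minus bottom spread) between Q1/Q5 (quintile) or Q1/Q10 (decile) groups as the primary metric rather than the long-only return. The "monotonicity" test of going long Q1 and short Q5 is core evidence of factor validity: monotonically increasing/decreasing > non-monotonic >> effective on the long side only.
- **`SHARED-FR-PORT-002`** <sub>(medium)</sub>: Alpha decay test: the stability of a factor's monthly IC across regimes (bull, bear, and range-bound markets) is important evidence of robustness. A factor whose IC holds only in one particular market state is unsuitable for all-weather deployment. Show the IC time series in segments (rolling 12M) to identify periods when the factor fails.
- **`SHARED-FR-PORT-003`** <sub>(medium)</sub>: Turnover-aware selection: for stocks ranked near the middle (49th-51st percentile), small rank fluctuations trigger rebalancing and generate a large amount of wasted trading cost. Set a buffer zone in stock selection and rebalance only when the rank change exceeds a threshold.
- **`SHARED-FR-PORT-004`** <sub>(medium)</sub>: Statistical significance of group returns (bootstrap test): a quantile return spread (Q1-Q5 spread) can be large in historical data and still be due to chance, so test its significance with a bootstrap or t-test (p-value < 0.05). Quantile returns from a short backtest period (< 3 years) are especially unreliable.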
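- **`SHARED-FR-XFER-001`** <sub>(high)</sub>: Cross-market factor portability: a factor that works in one market does not necessarily work in another. Applying US equity factors directly to A-shares, or equity factors to futures or crypto, requires independent IC validation; cross-market generality must not be assumed. Anomalies specific to A-shares (such as the reversal effect and ST price anomalies) do not exist in US equities.
- **`SHARED-FR-XFER-002`** <sub>(medium)</sub>: Stability of factor validity over time: a factor that once worked gradually stops working as the market learns and arbitrages it away (McLean & Pontiff 2016 show that factor returns decay by 58% on average after publication). Re-evaluate factor IC regularly (quarterly or yearly), and promptly replace or down-weight factors that have failed.
- **`SHARED-FR-XFER-003`** <sub>(medium)</sub>: Interaction between factors and the macro environment: the interest-rate cycle, the business cycle, and market sentiment significantly affect factor validity. The value factor (low P/B) works better when rates are rising; the momentum factor works better in trending markets and fails in range-bound markets. Before deploying a factor, assess how well the current macro environment matches the environment in which the factor performs best.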
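
Two of the performance constraints above (`SHARED-BT-PERF-001` and `SHARED-BT-PERF-002`) reduce to short formulas. The sketch below is a standard-library illustration with hypothetical helper names and made-up portfolio values; it takes the risk-free rate as zero and assumes strictly positive portfolio values.

```python
import math
import statistics


def max_drawdown(values):
    """Largest peak-to-trough decline of a portfolio value series, as a fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, 1.0 - v / peak)
    return worst


def annualized_sharpe(period_returns, periods_per_year):
    """Per-period mean/stdev scaled by sqrt(periods_per_year); risk-free rate = 0."""
    mean = statistics.fmean(period_returns)
    stdev = statistics.stdev(period_returns)
    return mean / stdev * math.sqrt(periods_per_year)


# Hypothetical portfolio values, for illustration only.
values = [100.0, 120.0, 90.0, 110.0, 80.0]
returns = [b / a - 1.0 for a, b in zip(values, values[1:])]

mdd = max_drawdown(values)                      # measured from the 120 peak to 80
sharpe_equity = annualized_sharpe(returns, 252)
sharpe_crypto = annualized_sharpe(returns, 365)
```

The same return series yields two different annualized Sharpe values depending on the factor, which is why the factor has to be confirmed before comparing numbers across systems.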
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
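
Several of these locks can be checked mechanically before a run. The sketch below covers `SL-03`, `SL-05`, and `SL-06` using plain Python values; the function names are hypothetical, and the entity ID check only verifies the three-part shape, not whether the type or exchange exists.

```python
def is_valid_entity_id(entity_id):
    """SL-03 shape check: entity_type_exchange_code with three non-empty parts."""
    parts = entity_id.split("_", 2)
    return len(parts) == 3 and all(parts)


def has_exactly_one_sizing(position_pct=None, order_money=None, order_amount=None):
    """SL-05: exactly one of the three sizing fields may be set."""
    provided = [v for v in (position_pct, order_money, order_amount) if v is not None]
    return len(provided) == 1


def action_for(filter_result):
    """SL-06: True=BUY, False=SELL, anything else (None/NaN)=NO ACTION."""
    if filter_result is True:
        return "BUY"
    if filter_result is False:
        return "SELL"
    return "NO ACTION"


checks = {
    "entity_id": is_valid_entity_id("stock_sh_600000"),
    "sizing": has_exactly_one_sizing(position_pct=0.2),
    "action": action_for(None),
}
```

The identity comparisons in `action_for` are deliberate: they keep NaN and other truthy or falsy values from being read as a buy or sell signal.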
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **73**
## `KUC-101`
**Source**: `docs/getting_started/backtest_high_level.py`
Users need to run backtests using a config-driven approach with the BacktestNode, enabling reproducible production workflows that can transition to live trading.
## `KUC-102`
**Source**: `docs/getting_started/backtest_low_level.py`
Users need fine-grained control over backtesting components including custom execution algorithms like TWAP for sophisticated order execution simulation.
## `KUC-103`
**Source**: `docs/getting_started/quickstart.py`
New users need a minimal example to run their first backtest quickly, demonstrating the core strategy and market data handling pattern.
## `KUC-104`
**Source**: `docs/how_to/data_catalog_databento.py`
Users need to efficiently store and query historical market data from Databento using Nautilus Parquet data catalog for backtests and research.
## `KUC-105`
**Source**: `docs/how_to/loading_external_data.py`
Users with historical CSV data from external vendors need to load it into the Parquet data catalog for backtesting with BacktestNode.
## `KUC-106`
**Source**: `docs/tutorials/backtest_fx_bars.py`
Users need to backtest FX strategies with realistic rollover interest simulation and probabilistic fill modeling for USD/JPY.
## `KUC-107`
**Source**: `docs/tutorials/backtest_orderbook_binance.py`
Users need to backtest order book imbalance strategies using Binance exchange depth data for high-frequency trading signal generation.
## `KUC-108`
**Source**: `docs/tutorials/backtest_orderbook_bybit.py`
Users need to backtest order book imbalance strategies using Bybit exchange depth data for high-frequency trading signal generation.
## `KUC-109`
**Source**: `examples/backtest/architect_ax_book_imbalance.py`
Users need to backtest an order book imbalance strategy using Architect AX exchange data loaded from Databento for crypto perpetual contracts.
## `KUC-110`
**Source**: `examples/backtest/architect_ax_mean_reversion.py`
Users need to backtest a mean reversion strategy using Bollinger Bands indicators on Architect AX exchange crypto perpetuals.
## `KUC-111`
**Source**: `examples/backtest/betfair_backtest_orderbook_imbalance.py`
Users need to backtest order book imbalance strategies on Betfair sports betting exchange using NautilusTrader.
## `KUC-112`
**Source**: `examples/backtest/bitmex_grid_market_maker.py`
Users need to backtest a grid market making strategy on BitMEX using Tardis quote data for XBTUSD perpetual.
## `KUC-113`
**Source**: `examples/backtest/crypto_ema_cross_ethusdt_trade_ticks.py`
Users need to backtest an EMA crossover strategy with TWAP execution algorithm on Binance ETHUSDT using trade tick data.
## `KUC-114`
**Source**: `examples/backtest/crypto_ema_cross_ethusdt_trailing_stop.py`
Users need to backtest an EMA crossover strategy with trailing stop protection on Binance ETHUSDT trade tick data.
## `KUC-115`
**Source**: `examples/backtest/crypto_ema_cross_with_binance_provider.py`
Users need to backtest an EMA cross with trailing stop strategy using Binance futures instrument provider for real instrument data.
## `KUC-116`
**Source**: `examples/backtest/crypto_orderbook_imbalance.py`
Users need to backtest an order book imbalance strategy on Binance BTCUSDT using order book delta data for high-frequency signals.
## `KUC-117`
**Source**: `examples/backtest/databento_cme_quoter.py`
Users need to backtest a simple quoter strategy that provides liquidity on CME futures markets using Databento historical data.
## `KUC-118`
**Source**: `examples/backtest/databento_ema_cross_long_only_aapl_bars.py`
Users need to backtest a long-only EMA crossover strategy on AAPL equity bars using Databento market data.
## `KUC-119`
**Source**: `examples/backtest/databento_ema_cross_long_only_spy_trades.py`
Users need to backtest a long-only EMA crossover strategy on SPY ETF trade tick data using Databento.
## `KUC-120`
**Source**: `examples/backtest/databento_ema_cross_long_only_tsla_trades.py`
Users need to backtest a long-only EMA crossover strategy on TSLA equity trade tick data using Databento.
## `KUC-121`
**Source**: `examples/backtest/example_01_load_bars_from_custom_csv/run_example.py`
Users need to load custom bar data from CSV files into NautilusTrader for backtesting with their own historical data.
## `KUC-122`
**Source**: `examples/backtest/example_02_use_clock_timer/run_example.py`
Users need to understand how to implement timer-based logic in their strategies for scheduled actions independent of market data.
## `KUC-123`
**Source**: `examples/backtest/example_03_bar_aggregation/run_example.py`
Users need to learn how to aggregate raw tick data into different bar resolutions within their strategies.
## `KUC-124`
**Source**: `examples/backtest/example_04_using_data_catalog/run_example.py`
Users need to use the Parquet data catalog for organizing, storing, and retrieving backtest market data efficiently.
## `KUC-125`
**Source**: `examples/backtest/example_05_using_portfolio/run_example.py`
Users need to understand how to work with the portfolio module to track positions, orders, and manage multi-instrument strategies.
## `KUC-126`
**Source**: `examples/backtest/example_06_using_cache/run_example.py`
Users need to store and retrieve custom data objects in the cache for stateful strategy logic across backtest runs.
## `KUC-127`
**Source**: `examples/backtest/example_07_using_indicators/run_example.py`
Users need to learn how to use built-in technical indicators (MovingAverages, etc.) in their trading strategies.
## `KUC-128`
**Source**: `examples/backtest/example_08_cascaded_indicator/run_example.py`
Users need to implement strategies that chain indicators together, such as EMA of EMA, for more sophisticated signal generation.
## `KUC-129`
**Source**: `examples/backtest/example_09_messaging_with_msgbus/run_example.py`
Users need to implement inter-component communication using the message bus for decoupled strategy architecture.
## `KUC-130`
**Source**: `examples/backtest/example_10_messaging_with_actor_data/run_example.py`
Users need to publish and subscribe to custom data types between actors for building complex multi-component trading systems.
## `KUC-131`
**Source**: `examples/backtest/example_11_messaging_with_actor_signals/run_example.py`
Users need to use the Signal mechanism to emit and receive trading signals between actors for signal-based strategies.
## `KUC-132`
**Source**: `examples/backtest/fx_ema_cross_audusd_bars_from_ticks.py`
Users need to backtest EMA crossover on AUD/USD with rollover interest simulation, building bars from raw tick data.
## `KUC-133`
**Source**: `examples/backtest/fx_ema_cross_audusd_ticks.py`
Users need to backtest EMA crossover on AUD/USD using raw tick data with fill model and rollover interest simulation.
## `KUC-134`
**Source**: `examples/backtest/fx_ema_cross_bracket_gbpusd_bars_external.py`
Users need to backtest EMA crossover with bracket orders (profit target, stop loss) on GBP/USD using external bar data.
## `KUC-135`
**Source**: `examples/backtest/fx_ema_cross_bracket_gbpusd_bars_internal.py`
Users need to backtest EMA crossover with bracket orders on GBP/USD using internally generated bars from quote ticks.
## `KUC-136`
**Source**: `examples/backtest/fx_market_maker_gbpusd_bars.py`
Users need to backtest a volatility-based market making strategy on GBP/USD with dual-sided quotes and rollover interest.
## `KUC-137`
**Source**: `examples/backtest/model_configs_example.py`
Users need to understand how to configure backtests including venues, data sources, fee models, fill models, and latency models.
## `KUC-138`
**Source**: `examples/backtest/notebooks/databento_backtest_with_data_client.py`
Users need to integrate Databento data client with BacktestNode for streaming historical data into backtests.
## `KUC-139`
**Source**: `examples/backtest/notebooks/databento_download.py`
Users need to download and cache historical market data from Databento for offline backtesting and research.
## `KUC-140`
**Source**: `examples/backtest/notebooks/databento_futures_settlement.py`
Users need to backtest futures trading strategies that handle contract settlement and expiry correctly using Databento data.
## `KUC-141`
**Source**: `examples/backtest/notebooks/databento_option_exercise.py`
Users need to backtest options strategies including exercise and assignment mechanics using Databento market data.
## `KUC-142`
**Source**: `examples/backtest/notebooks/databento_option_greeks.py`
Users need to analyze option Greeks (delta, gamma, vega, theta) and create performance tearsheets for options strategies.
## `KUC-143`
**Source**: `examples/backtest/notebooks/databento_test_order_book_deltas.py`
Users need to test order book delta data handling from Databento for high-frequency and microstructure strategies.
## `KUC-144`
**Source**: `examples/backtest/notebooks/databento_test_request_bars.py`
Users need to request and process OHLCV bar data from Databento for backtesting bar-based strategies.
## `KUC-145`
**Source**: `examples/backtest/polymarket_simple_quoter.py`
Users need to backtest a quoter strategy on Polymarket prediction market using historical trade data from their APIs.
## `KUC-146`
**Source**: `examples/backtest/synthetic_data_pnl_test.py`
Users need to test P&L calculations and portfolio accounting using synthetic futures contract data with proper settlement.
## `KUC-147`
**Source**: `examples/live/architect_ax/ax_book_imbalance.py`
Users need to run an order book imbalance strategy live on the Architect AX exchange for real-time crypto perpetual trading.
## `KUC-148`
**Source**: `examples/live/architect_ax/ax_data_tester.py`
Users need to test and validate market data streaming from the Architect AX exchange before running live strategies.
## `KUC-149`
**Source**: `examples/live/architect_ax/ax_exec_tester.py`
Users need to test order execution and fill simulation on the Architect AX exchange before running production strategies.
## `KUC-150`
**Source**: `examples/live/architect_ax/ax_mean_reversion.py`
Users need to run a Bollinger Bands mean reversion strategy live on the Architect AX exchange for crypto perpetuals.
## `KUC-151`
**Source**: `examples/live/betfair/betfair.py`
Users need to run an order book imbalance strategy live on Betfair sports betting exchange.
## `KUC-152`
**Source**: `examples/live/binance/binance_futures_demo_exec_tester.py`
Users need to test Binance futures execution functionality including order submission and fill confirmation.
## `KUC-153`
**Source**: `examples/live/binance/binance_spot_and_futures_market_maker.py`
Users need to run a volatility market maker strategy simultaneously on Binance spot and futures markets.
## `KUC-154`
**Source**: `examples/live/binance/binance_spot_ema_cross_bracket_algo.py`
Users need to run an EMA crossover strategy with bracket orders and TWAP execution algorithm live on Binance spot.
## `KUC-155`
**Source**: `examples/live/binance/binance_spot_exec_tester.py`
Users need to test order execution functionality on Binance spot market.
## `KUC-156`
**Source**: `examples/live/bybit/bybit_ema_cross.py`
Users need to run an EMA crossover strategy live on Bybit for crypto perpetual trading.
## `KUC-157`
**Source**: `examples/live/bybit/bybit_ema_cross_bracket_algo.py`
Users need to run EMA crossover with bracket orders and TWAP execution algorithm live on Bybit.
## `KUC-158`
**Source**: `examples/live/bybit/bybit_ema_cross_stop_entry.py`
Users need to run EMA crossover strategy with stop entry orders live on Bybit.
## `KUC-159`
**Source**: `examples/live/bybit/bybit_ema_cross_with_trailing_stop.py`
Users need to run EMA crossover with trailing stop protection live on Bybit.
## `KUC-160`
**Source**: `examples/live/bybit/bybit_option_chain.py`
Users need to subscribe to option chain data from Bybit including strikes and expiry for BTC options trading.
## `KUC-161`
**Source**: `examples/live/bybit/bybit_option_greeks.py`
Users need to subscribe to live option Greeks (delta, gamma, vega, theta, IV) from Bybit for options trading.
## `KUC-162`
**Source**: `examples/live/bybit/bybit_options_data_collector.py`
Users need to collect and store historical options market data from Bybit for research and backtesting.
## `KUC-163`
**Source**: `examples/live/databento/databento_data_tester.py`
Users need to test and validate Databento live market data streaming including bars and order book data.
## `KUC-164`
**Source**: `examples/live/deribit/deribit_data_tester.py`
Users need to test market data streaming and options data from Deribit exchange for crypto options trading.
## `KUC-165`
**Source**: `examples/live/dydx/dydx_exec_tester.py`
Users need to test order execution on dYdX v4 including short-term and long-term orders with custom tags.
## `KUC-166`
**Source**: `examples/live/dydx/dydx_market_maker.py`
Users need to run a volatility market maker strategy live on dYdX v4 perpetual markets.
## `KUC-167`
**Source**: `examples/live/hyperliquid/hyperliquid_exec_tester.py`
Users need to test order execution functionality on Hyperliquid exchange for crypto perpetual trading.
## `KUC-168`
**Source**: `examples/live/interactive_brokers/connect_with_dockerized_gateway.py`
Users need to connect NautilusTrader to Interactive Brokers using a dockerized Gateway for traditional asset trading.
## `KUC-169`
**Source**: `examples/live/interactive_brokers/connect_with_tws.py`
Users need to connect NautilusTrader to Interactive Brokers via Trader Workstation (TWS) for trading.
## `KUC-170`
**Source**: `examples/live/interactive_brokers/contract_download.py`
Users need to download contract details from Interactive Brokers including futures and options instruments.
## `KUC-171`
**Source**: `examples/live/interactive_brokers/historical_download.py`
Users need to download historical market data from Interactive Brokers for backtesting and research.
## `KUC-172`
**Source**: `examples/live/interactive_brokers/notebooks/bracket_order_example.py`
Users need to understand how to place bracket orders (with profit target and stop loss) on Interactive Brokers.
## `KUC-173`
**Source**: `examples/live/interactive_brokers/notebooks/oca_group_example.py`
Users need to understand how to use One-Cancels-All (OCA) groups for order management on Interactive Brokers.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
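## `CW-BT-001` — Cerebro unified orchestration engine
**From**: backtrader · **Applicable to**: backtesting
backtrader uses Cerebro as the single entry point that manages the lifecycle of data feeds, strategies, analyzers, and observers, and a single cerebro.run() can run multiple strategies over multiple data sources. zvt's StockTrader currently binds only one set of factors per instance and lacks a unified orchestration layer for multi-strategy combinations. Borrowing the Cerebro pattern would let users combine several Trader instances in one runner for side-by-side evaluation.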
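## `CW-BT-002` — Pluggable Analyzers for performance evaluation
**From**: backtrader · **Applicable to**: backtesting
backtrader provides plug-and-play Analyzers such as SharpeRatio, DrawDown, TimeReturn, and TradeAnalyzer, which attach arbitrary performance metrics without changing strategy code. zvt's performance evaluation is currently weak and has no standardized Analyzer interface. Borrowing this pattern would let users get a risk-adjusted return report with a call like cerebro.addanalyzer(SharpeRatio).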
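## `CW-BT-003` — Sizer: position sizing as a separate concern
**From**: backtrader · **Applicable to**: backtesting
backtrader abstracts position sizing (how many shares, or what proportion, to buy on each entry) into a separate Sizer that is fully decoupled from signal logic; FixedSize, PercentSizer, and others are built in, and users can define their own. zvt currently has no explicit Sizer concept, and position-control logic is scattered across hooks such as Trader.on_profit_control. Introducing a Sizer interface would let strategy signals and money-management rules evolve independently and be reused in combination.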
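## `CW-BT-004` — Full set of order types (Limit/Stop/OCO/Bracket)
**From**: backtrader · **Applicable to**: backtesting
backtrader supports a rich set of order types, including Market, Limit, Stop, StopLimit, OCO (one-cancels-other), and Bracket (a paired take-profit and stop-loss), and simulates fill slippage and commission schemes. zvt backtests currently support mainly market-order fills and lack limit orders and combined-order simulation. For high-frequency or live-trading integration scenarios, completing the order types would greatly improve backtest realism.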
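## `CW-BT-005` — Data resampling and replaying (Resampling & Replaying)
**From**: backtrader · **Applicable to**: backtesting
backtrader can resample lower-level data (such as 1 min) into a higher level (such as 1 day) on the fly and drive the strategy in sync, or replay tick by tick to simulate how each OHLC bar forms, enabling fine-grained intraday backtests. zvt currently implements multiple timeframes by pre-recording bars at each level and lacks dynamic resampling at runtime. Borrowing this pattern would support backtests over any combination of time granularities without recording the data again.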
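## `CW-VN-003` — Built-in visualization in the CTA backtest engine
**From**: vnpy · **Applicable to**: backtesting
vnpy's cta_backtester provides a GUI that directly shows the strategy's equity curve, max drawdown, daily P&L, and trade details, with no Jupyter Notebook required. zvt currently visualizes backtest results by calling Plotly through the draw_result method and has no unified backtest report page. Borrowing this pattern would allow packaging an out-of-the-box strategy performance dashboard.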
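## `CW-VN-004` — vnpy.alpha ML factor research lab (Lab)
**From**: vnpy · **Applicable to**: factor-research
vnpy.alpha.lab in vnpy 4.0 provides an integrated workflow for data management, model training, signal generation, and strategy backtesting, with a standardized training interface and visual comparison for algorithms such as Lasso, LightGBM, and MLP. zvt's ML capability currently has a single entry point, MaStockMLMachine, and lacks a standardized Lab framework. Borrowing the Lab pattern would establish a standard "feature engineering → training → signal → backtest" pipeline and lower the barrier to ML experiments.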
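## `CW-QL-001` — Point-in-Time database (preventing future-data leakage)
**From**: qlib · **Applicable to**: backtesting
qlib's Point-in-Time Provider guarantees that a query at a given time t returns only data that was actually known at t (reporting delays and revision history are handled correctly), which eliminates look-ahead bias in backtests. zvt currently uses the reporting period as the timestamp of financial data and lacks a "publication date" dimension, so stock selection may be biased by financial-report data from the future. Introducing the PIT pattern would greatly improve backtest credibility.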
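The point-in-time idea can be illustrated with a publication-date filter. This is a minimal sketch, not qlib's or zvt's API: the `Report` record, the `latest_known` helper, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    report_period: date   # period the figures describe
    publish_date: date    # day the figures became public
    eps: float


def latest_known(reports, as_of):
    """Return the most recent report that was already published on as_of."""
    visible = [r for r in reports if r.publish_date <= as_of]
    if not visible:
        return None
    return max(visible, key=lambda r: (r.report_period, r.publish_date))


# Hypothetical filings: the Q1 report is published on 2024-04-29.
reports = [
    Report(date(2023, 12, 31), date(2024, 3, 28), 1.20),
    Report(date(2024, 3, 31), date(2024, 4, 29), 0.35),
]
seen = latest_known(reports, date(2024, 4, 15))
```

On 2024-04-15 the query returns the 2023 annual report, because the Q1 figures describe a period that has ended but have not yet been published. Filtering on `report_period` alone would have leaked them into the backtest.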
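## `CW-QL-002` — Recorder + Experiment management (MLflow style)
**From**: qlib · **Applicable to**: factor-research
qlib's workflow module provides Experiment/Recorder, which automatically records the hyperparameters, features, metrics, and predictions of every training run and supports cross-experiment comparison and model version management. zvt currently lacks ML experiment tracking, and each rerun overwrites the previous results. Borrowing the Recorder pattern would persist the parameters and results of every factor experiment, supporting quick reproduction and version comparison.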
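## `CW-QL-003` — Nested Decision Framework (multi-level nested decision execution)
**From**: qlib · **Applicable to**: backtesting
qlib supports nesting a high-frequency execution layer (minute-level order splitting) inside a low-frequency decision layer (daily portfolio rebalancing). The two layers are optimized independently and can run in combination, enabling intraday optimal-execution algorithms (such as TWAP and VWAP rebalancing). zvt backtests currently assume fills at the daily-bar level only and lack execution-algorithm modeling. Borrowing the nested framework would let zvt separate the question of "which stocks to hold and when" from "how to build the position with minimal market impact".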
FILE:references/components/accounting_-_settlement.md
# accounting_&_settlement (6 classes)
## `Account.apply_fill`
`accounting_&_settlement/account-apply-fill.py:0`
## `Account.calculate_margin`
`accounting_&_settlement/account-calculate-margin.py:0`
## `Account.apply_settlement`
`accounting_&_settlement/account-apply-settlement.py:0`
## `LeveragedMarginModel.calculate`
`accounting_&_settlement/leveragedmarginmodel-calculate.py:0`
## `Account type`
`accounting_&_settlement/account-type.py:0`
## `MarginModel`
`accounting_&_settlement/marginmodel.py:0`
FILE:references/components/backtest_simulation.md
# backtest_simulation (8 classes)
## `BacktestEngine.run`
`backtest_simulation/backtestengine-run.py:0`
## `BacktestEngine.add_data`
`backtest_simulation/backtestengine-add-data.py:0`
## `BacktestEngine.add_instrument`
`backtest_simulation/backtestengine-add-instrument.py:0`
## `SimulatedExchange.process_order`
`backtest_simulation/simulatedexchange-process-order.py:0`
## `BacktestEngine.results`
`backtest_simulation/backtestengine-results.py:0`
## `FillModel`
`backtest_simulation/fillmodel.py:0`
## `LatencyModel`
`backtest_simulation/latencymodel.py:0`
## `FeeModel`
`backtest_simulation/feemodel.py:0`
FILE:references/components/data_collection_-_ingestion.md
# data_collection_&_ingestion (7 classes)
## `DataEngine.subscribe_data`
`data_collection_&_ingestion/dataengine-subscribe-data.py:0`
## `DataEngine.subscribe_order_book_deltas`
`data_collection_&_ingestion/dataengine-subscribe-order-book-deltas.py:0`
## `DataEngine.request_data`
`data_collection_&_ingestion/dataengine-request-data.py:0`
## `DataCatalog.query`
`data_collection_&_ingestion/datacatalog-query.py:0`
## `BarAggregator.on_tick`
`data_collection_&_ingestion/baraggregator-on-tick.py:0`
## `DataBackend`
`data_collection_&_ingestion/databackend.py:0`
## `DataCatalog`
`data_collection_&_ingestion/datacatalog.py:0`
FILE:references/components/factor_computation_-_strategy_processing.md
# factor_computation_/_strategy_processing (11 classes)
## `Strategy.on_bar`
`factor_computation_/_strategy_processing/strategy-on-bar.py:0`
## `Strategy.on_quote_tick`
`factor_computation_/_strategy_processing/strategy-on-quote-tick.py:0`
## `Strategy.on_trade_tick`
`factor_computation_/_strategy_processing/strategy-on-trade-tick.py:0`
## `Strategy.submit_order`
`factor_computation_/_strategy_processing/strategy-submit-order.py:0`
## `Strategy.cancel_order`
`factor_computation_/_strategy_processing/strategy-cancel-order.py:0`
## `OrderFactory.with_pnl_protection`
`factor_computation_/_strategy_processing/orderfactory-with-pnl-protection.py:0`
## `Actor.on_start`
`factor_computation_/_strategy_processing/actor-on-start.py:0`
## `Actor.on_stop`
`factor_computation_/_strategy_processing/actor-on-stop.py:0`
## `Strategy logic`
`factor_computation_/_strategy_processing/strategy-logic.py:0`
## `Clock`
`factor_computation_/_strategy_processing/clock.py:0`
## `OrderFactory`
`factor_computation_/_strategy_processing/orderfactory.py:0`
FILE:references/components/order_execution_-_venue_routing.md
# order_execution_&_venue_routing (8 classes)
## `ExecutionEngine.submit_order`
`order_execution_&_venue_routing/executionengine-submit-order.py:0`
## `ExecutionEngine.modify_order`
`order_execution_&_venue_routing/executionengine-modify-order.py:0`
## `ExecutionEngine.cancel_order`
`order_execution_&_venue_routing/executionengine-cancel-order.py:0`
## `OrderEmulator.transform_order`
`order_execution_&_venue_routing/orderemulator-transform-order.py:0`
## `ExecAlgorithm.apply`
`order_execution_&_venue_routing/execalgorithm-apply.py:0`
## `ExecAlgorithm`
`order_execution_&_venue_routing/execalgorithm.py:0`
## `OrderEmulator`
`order_execution_&_venue_routing/orderemulator.py:0`
## `ExecutionClient`
`order_execution_&_venue_routing/executionclient.py:0`
FILE:references/components/risk_management.md
# risk_management (4 classes)
## `RiskEngine.check_order`
`risk_management/riskengine-check-order.py:0`
## `RiskEngine.check_cancel`
`risk_management/riskengine-check-cancel.py:0`
## `RiskEngine.check_modify`
`risk_management/riskengine-check-modify.py:0`
## `Risk rule set`
`risk_management/risk-rule-set.py:0`
FILE:references/components/state_persistence_-_cache.md
# state_persistence_&_cache (8 classes)
## `Cache.instrument`
`state_persistence_&_cache/cache-instrument.py:0`
## `Cache.orders`
`state_persistence_&_cache/cache-orders.py:0`
## `Cache.position`
`state_persistence_&_cache/cache-position.py:0`
## `Cache.account`
`state_persistence_&_cache/cache-account.py:0`
## `Cache.quote_ticks`
`state_persistence_&_cache/cache-quote-ticks.py:0`
## `CacheDatabaseFacade.flush`
`state_persistence_&_cache/cachedatabasefacade-flush.py:0`
## `CacheDatabaseFacade.load`
`state_persistence_&_cache/cachedatabasefacade-load.py:0`
## `CacheDatabaseFacade`
`state_persistence_&_cache/cachedatabasefacade.py:0`
FILE:references/components/trading_coordination_-_portfolio.md
# trading_coordination_&_portfolio (8 classes)
## `Portfolio.apply_fill`
`trading_coordination_&_portfolio/portfolio-apply-fill.py:0`
## `Portfolio.position`
`trading_coordination_&_portfolio/portfolio-position.py:0`
## `Position.apply_fill`
`trading_coordination_&_portfolio/position-apply-fill.py:0`
## `Position.unrealized_pnl`
`trading_coordination_&_portfolio/position-unrealized-pnl.py:0`
## `Trader.register`
`trading_coordination_&_portfolio/trader-register.py:0`
## `Trader.start`
`trading_coordination_&_portfolio/trader-start.py:0`
## `MarginModel`
`trading_coordination_&_portfolio/marginmodel.py:0`
## `FeeModel`
`trading_coordination_&_portfolio/feemodel.py:0`
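Develop and backtest quantitative trading strategies with the companion notebooks to Machine Learning for Trading (2nd edition), covering time-series machine-learning analysis of multi-market financial data.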
---
name: ml4t-book-notebooks
description: |-
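Develop and backtest quantitative trading strategies with the companion notebooks to Machine Learning for Trading (2nd edition), covering time-series machine-learning analysis of multi-market financial data.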
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-121"
compiled_at: "2026-04-22T13:00:59.543591+00:00"
capability_markets: "multi-market"
capability_activities: "time-series-ml"
sop_version: "crystal-compilation-v6.1"
---
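# ML4T Trading Tutorial (ml4t-book-notebooks)
> Develop and backtest quantitative trading strategies with the companion notebooks to Machine Learning for Trading (2nd edition), covering time-series machine-learning analysis of multi-market financial data.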
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (0 total)
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-TIME-SERIES-ML-001`**: TimeSeries values array dimensionality mismatch
- **`AP-TIME-SERIES-ML-002`**: Non-floating-point dtype in TimeSeries values
- **`AP-TIME-SERIES-ML-003`**: Irregular or non-monotonic time index
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-121. Evidence verify ratio = 31.8% and audit fail total = 35. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | Full V6+ authoritative spec (source of truth) | Required reading when a behavior or decision is disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest or trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-121` blueprint at 2026-04-22T13:00:59.543591+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
  - Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-102--Darts (7)
### `AP-TIME-SERIES-ML-001` — TimeSeries values array dimensionality mismatch <sub>(high)</sub>
When constructing a TimeSeries with a values array that is not expanded to exactly 3 dimensions (time×component×sample), downstream model operations expecting the standard 3D shape will fail with dimension mismatches. This causes all downstream models to receive incorrectly formatted data tensors, leading to complete pipeline failure or silent data corruption.
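The 3D invariant can be sketched with plain NumPy. The helper `to_3d` below is a hypothetical illustration and not the Darts constructor: it assumes 1D input means a single component and a single sample, and 2D input means (time, component) with a single sample.

```python
import numpy as np


def to_3d(values) -> np.ndarray:
    """Expand 1D or 2D input to (time, component, sample); reject anything else."""
    arr = np.asarray(values, dtype=np.float64)
    if arr.ndim == 1:
        arr = arr[:, np.newaxis, np.newaxis]  # (time,) -> (time, 1, 1)
    elif arr.ndim == 2:
        arr = arr[:, :, np.newaxis]           # (time, comp) -> (time, comp, 1)
    elif arr.ndim != 3:
        raise ValueError(f"expected 1 to 3 dimensions, got {arr.ndim}")
    return arr


univariate = to_3d([1, 2, 3])
multivariate = to_3d(np.zeros((5, 2)))
```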
### `AP-TIME-SERIES-ML-002` — Non-floating-point dtype in TimeSeries values <sub>(high)</sub>
When setting TimeSeries values dtype to integer or non-floating-point types, numerical operations produce incorrect results during financial calculations. Financial forecasts require float64 or float32 precision to handle decimal computations accurately; integer dtypes truncate precision and cause accumulation of rounding errors that compound across time steps.
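A small NumPy illustration of the truncation problem, using made-up prices: writing a fractional value into an integer array silently drops the fraction, while a float array preserves it.

```python
import numpy as np

prices_int = np.array([100, 101, 103])         # integer dtype inferred
prices_int[1] = 101.75                         # fraction is silently dropped

prices_float = prices_int.astype(np.float64)   # convert before doing arithmetic
prices_float[1] = 101.75                       # fraction is kept
```

Casting to a float dtype once, at the point where data enters the pipeline, avoids having to reason about this at every later step.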
### `AP-TIME-SERIES-ML-003` — Irregular or non-monotonic time index <sub>(high)</sub>
When TimeSeries time index is not strictly monotonically increasing with a well-defined frequency and no gaps, downstream models produce incorrect forecasts due to temporal misalignment. Gap detection methods fail, and any temporal aggregation or differencing operations will produce meaningless results.
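One way to check these index properties before building a series is sketched below with pandas. The function name `validate_time_index` is an assumption for illustration; it relies on `pd.infer_freq` returning `None` when the spacing is irregular.

```python
import pandas as pd


def validate_time_index(index: pd.DatetimeIndex) -> str:
    """Return the inferred frequency, or raise if the index is unusable."""
    if not index.is_monotonic_increasing or not index.is_unique:
        raise ValueError("time index must be strictly increasing")
    freq = pd.infer_freq(index)  # None when the spacing is irregular (gaps)
    if freq is None:
        raise ValueError("time index has no well-defined frequency")
    return freq


daily = pd.date_range("2024-01-01", periods=5, freq="D")
daily_freq = validate_time_index(daily)
```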
### `AP-TIME-SERIES-ML-004` — Time index and values length mismatch at construction <sub>(high)</sub>
When the time index length does not equal the values array first dimension length, TimeSeries construction fails with ValueError at construction time, preventing any data from being loaded into the system. This typically occurs when importing data from CSV or DataFrame sources where column alignment assumptions are incorrect.
### `AP-TIME-SERIES-ML-005` — Missing abstract method implementations in ForecastingModel subclasses <sub>(high)</sub>
When implementing ForecastingModel subclasses without implementing all required abstract methods (fit, predict, min_train_samples, _target_window_lengths, extreme_lags, supports_multivariate, supports_transferable_series_prediction), Python's ABC abstractmethod enforcement causes TypeError at instantiation time, preventing any model from being created.
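The mechanism is Python's standard `abc` behaviour and can be reproduced without Darts. `BaseForecaster` below is a hypothetical stand-in with only two abstract methods, not the real `ForecastingModel` class.

```python
from abc import ABC, abstractmethod


class BaseForecaster(ABC):  # hypothetical stand-in, not the Darts class
    @abstractmethod
    def fit(self, series): ...

    @abstractmethod
    def predict(self, n): ...


class Incomplete(BaseForecaster):
    def fit(self, series):
        return self
    # predict() is not implemented, so the class stays abstract.


try:
    Incomplete()
    instantiated = True
except TypeError:
    # Raised at instantiation time, before any fitting can happen.
    instantiated = False
```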
### `AP-TIME-SERIES-ML-006` — fit() method not returning self for chaining <sub>(medium)</sub>
When fit() method does not return self for method chaining, the fluent interface pattern expected by users breaks at lines 209, 2932, and 3069 where chaining is attempted. Users encounter AttributeError when trying to chain operations like model.fit(series).predict(n_periods).
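A toy forecaster shows the chaining contract. `NaiveMean` is a hypothetical class written for this illustration; the only point is the `return self` at the end of `fit()`.

```python
class NaiveMean:
    """Hypothetical toy forecaster: predicts the training mean."""

    def fit(self, series):
        self.mean_ = sum(series) / len(series)
        return self  # without this, .predict() would be called on None

    def predict(self, n):
        return [self.mean_] * n


forecast = NaiveMean().fit([1.0, 2.0, 3.0]).predict(2)
```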
### `AP-TIME-SERIES-ML-007` — Frequency inference failure with insufficient timesteps <sub>(medium)</sub>
When using fill_missing_dates with fewer than 3 time steps, frequency inference fails with ValueError because at least 3 consecutive timestamps are required to determine a unique constant frequency. Irregular time series cannot be gap-filled without this minimum data.
## finance-bp-121--machine-learning-for-trading (8)
### `AP-TIME-SERIES-ML-008` — Look-ahead bias from random train/test splits <sub>(high)</sub>
When implementing cross-validation for financial time series using random K-fold or standard train_test_split without temporal ordering, future information leaks into training data. This look-ahead bias artificially inflates backtest performance metrics and leads to significant live trading losses when the model encounters truly unseen data.
### `AP-TIME-SERIES-ML-009` — Missing purge gap contaminating validation results <sub>(high)</sub>
When using a walk-forward split without a purge (embargo) gap between the train and test periods, outcomes that overlap the split boundary contaminate validation results. Without the gap, seemingly good backtest results do not generalize to live performance because information leaks across the boundary.
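The two cross-validation issues above (random splits and a missing purge gap) can be sketched together in a small generator. The function name, its parameters and the exact scheme are assumptions for illustration; this is not the book's `MultipleTimeSeriesCV` implementation.

```python
def purged_walk_forward(n_samples, n_splits, test_size, purge):
    """Yield (train_idx, test_idx) ranges ordered in time with a purge gap.

    Test windows walk backwards from the end of the sample; every training
    window ends `purge` observations before its test window starts.
    """
    for k in range(n_splits):
        test_end = n_samples - k * test_size
        test_start = test_end - test_size
        train_end = test_start - purge
        if train_end <= 0:
            break  # not enough history left for another split
        yield range(0, train_end), range(test_start, test_end)


splits = list(purged_walk_forward(n_samples=100, n_splits=3, test_size=10, purge=5))
```

The purge length would normally be at least the forecast horizon of the label, so that no training label is computed from prices inside the test window.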
### `AP-TIME-SERIES-ML-010` — Hardcoded credentials in source code <sub>(high)</sub>
When scraping content from websites requiring authentication by hardcoding credentials in source code files, exposed credentials lead to unauthorized access, potential account termination, and security breaches. Credentials should be loaded from environment variables or secure configuration files, never committed to version control.
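A minimal sketch of reading credentials from the environment instead of source code. The variable names (`SCRAPER_USERNAME`, `SCRAPER_PASSWORD`) and the helper are hypothetical.

```python
import os


def load_credentials(prefix: str = "SCRAPER") -> tuple[str, str]:
    """Read username/password from environment variables, never from source."""
    try:
        return os.environ[f"{prefix}_USERNAME"], os.environ[f"{prefix}_PASSWORD"]
    except KeyError as missing:
        # Fail loudly rather than falling back to a default baked into the code.
        raise RuntimeError(f"missing environment variable: {missing}") from None
```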
### `AP-TIME-SERIES-ML-011` — TA-Lib infinite values causing ML model failures <sub>(high)</sub>
When computing technical indicators using TA-Lib (RSI, MACD, ATR) without handling edge cases, division-by-zero produces infinite values that corrupt the feature DataFrame. Gradient-based ML models (neural networks) cannot process infinite values, causing training to fail or produce NaN gradients.
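The sanitization step can be sketched with pandas alone. The feature values below are made up (the infinities are inserted by hand rather than produced by TA-Lib) and only show the replace-then-drop pattern.

```python
import numpy as np
import pandas as pd

# Hand-made feature frame with infinities standing in for indicator edge cases.
features = pd.DataFrame(
    {"rsi": [55.0, np.inf, 48.0], "atr": [1.2, 0.9, -np.inf]}
)

# Turn +/-inf into NaN first, then drop incomplete rows before training.
clean = features.replace([np.inf, -np.inf], np.nan).dropna()
```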
### `AP-TIME-SERIES-ML-012` — MultiIndex structure lost during feature engineering <sub>(high)</sub>
When flattening or renaming the (ticker, date) MultiIndex during feature engineering for multi-ticker trading, downstream stages (prediction_modeling, backtesting) fail because they expect MultiIndex for proper temporal train/test splits. Data corruption occurs silently when multi-ticker data is treated as single-ticker.
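A short pandas sketch of a feature computed per ticker while the (ticker, date) MultiIndex stays intact. The tickers and prices are invented for illustration.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["AAA", "BBB"], pd.date_range("2024-01-01", periods=3)],
    names=["ticker", "date"],
)
prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 20.0, 19.0, 21.0]}, index=idx)

# Grouping on the index level keeps the MultiIndex and prevents the first
# BBB row from being compared with the last AAA row.
prices["ret_1d"] = prices.groupby(level="ticker")["close"].pct_change()
```

Calling `reset_index()` and computing `pct_change()` on the flat frame would instead produce a spurious return at the boundary between tickers.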
### `AP-TIME-SERIES-ML-013` — Missing TA-Lib C library dependency <sub>(high)</sub>
When installing TA-Lib via pip install ta-lib alone without compiling the underlying C library, import fails because the Python package is merely a wrapper around compiled native code. This causes immediate runtime failure for any code attempting to import talib for technical indicator computation.
### `AP-TIME-SERIES-ML-014` — Trading calendar minutes_per_day mismatch <sub>(high)</sub>
When configuring an extended-hours trading calendar with an incorrect minutes_per_day (e.g., using 390, the regular NYSE session length, instead of 960 for a 16-hour extended session), minute bar alignment with the calendar fails. Backtest prices do not correspond to actual trading times, producing meaningless results that don't reflect real market microstructure.
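As a sanity check, the session length in minutes can be computed directly from the open and close times. The helper and the session times below are assumptions for illustration: a regular 09:30 to 16:00 session and an extended 04:00 to 20:00 session.

```python
from datetime import datetime


def session_minutes(open_hhmm: str, close_hhmm: str) -> int:
    """Minutes between session open and close on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(close_hhmm, fmt) - datetime.strptime(open_hhmm, fmt)
    return int(delta.total_seconds() // 60)


regular = session_minutes("09:30", "16:00")
extended = session_minutes("04:00", "20:00")
```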
### `AP-TIME-SERIES-ML-015` — Zipline bundle ingest function signature mismatch <sub>(high)</sub>
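When implementing a Zipline bundle ingest function with an incorrect parameter count or order, Zipline fails with TypeError during bundle ingest because the ingestion pipeline calls the function with a fixed argument list of 11 parameters (environ, asset_db_writer, minute_bar_writer, daily_bar_writer, adjustment_writer, calendar, start_session, end_session, cache, show_progress, output_dir). Backtesting cannot run at all when bundle ingestion fails, blocking all downstream work.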
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-121--machine-learning-for-trading
**Scan date**: 2026-04-22
**Stats**: {'total_files': 7, 'total_classes': 41, 'total_functions': 0, 'total_stages': 7}
## Modules (7)
- [alternative_data_collection](components/alternative_data_collection.md): 6 classes
- [market_data_ingestion](components/market_data_ingestion.md): 7 classes
- [feature_engineering](components/feature_engineering.md): 5 classes
- [prediction_modeling](components/prediction_modeling.md): 6 classes
- [backtesting](components/backtesting.md): 6 classes
- [reinforcement_learning_trading](components/reinforcement_learning_trading.md): 7 classes
- [multiple_testing_correction](components/multiple_testing_correction.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 152
fatal_constraints_count: 58
non_fatal_constraints_count: 187
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
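Three of the locks above (SL-02, SL-03, SL-05) lend themselves to simple guard functions. The sketch below is illustrative only: the function names are hypothetical, and the entity-ID regular expression is an assumed reading of "entity_type_exchange_code" based on the examples `stock_sh_600000` and `stockus_nasdaq_AAPL`, not ZVT's own validator.

```python
import re

# Assumed pattern: lowercase type, lowercase exchange, alphanumeric code.
ENTITY_ID = re.compile(r"[a-z]+_[a-z]+_[A-Za-z0-9.]+")


def check_entity_id(entity_id: str) -> bool:
    """SL-03: entity_type_exchange_code."""
    return ENTITY_ID.fullmatch(entity_id) is not None


def check_exactly_one(position_pct=None, order_money=None, order_amount=None) -> bool:
    """SL-05: exactly one sizing field may be set."""
    provided = [v for v in (position_pct, order_money, order_amount) if v is not None]
    return len(provided) == 1


def next_bar(signals: list) -> list:
    """SL-02: the signal from bar t is executed on bar t+1."""
    return ([None] + signals[:-1]) if signals else []
```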
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **0**
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-TIME-SERIES-ML-001` — 3D TimeSeries dimensionality invariant
**From**: finance-bp-102--Darts · **Applicable to**: time-series-ml
Always expand TimeSeries values to exactly 3 dimensions (n_timesteps, n_components, n_samples) regardless of input format. This invariant enables uniform downstream processing regardless of whether the data is univariate (1 component), single-sample, or multivariate probabilistic series with multiple samples.
## `CW-TIME-SERIES-ML-002` — Strict time index validation
**From**: finance-bp-102--Darts · **Applicable to**: time-series-ml
Validate time index at construction: must be strictly monotonically increasing, have a well-defined frequency, no holes by default, and length must match values first dimension. This prevents silent data corruption in all downstream temporal operations.
## `CW-TIME-SERIES-ML-003` — MultiIndex preservation in multi-ticker pipelines
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
Maintain (ticker, date) MultiIndex structure throughout the entire feature engineering and prediction pipeline for multi-ticker trading systems. Downstream stages depend on this structure for proper temporal train/test splits that respect per-ticker time boundaries.
## `CW-TIME-SERIES-ML-004` — Purged walk-forward cross-validation
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
Use a purged walk-forward split with an embargo gap for financial time series validation. Random splits cause look-ahead bias, while splits without purge gaps contaminate results with overlapping outcomes. The purge gap prevents information leakage across train/test boundaries.
## `CW-TIME-SERIES-ML-005` — TA-Lib edge case sanitization
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
Always replace infinite values with NaN and call dropna before ML model training when using TA-Lib technical indicators. RSI, MACD, ATR and other indicators produce inf values during division-by-zero edge cases, which corrupt gradient-based model training.
## `CW-TIME-SERIES-ML-006` — Fluent forecasting model interface
**From**: finance-bp-102--Darts · **Applicable to**: time-series-ml
Implement fit() returning self and predict() on ForecastingModel subclasses to support method chaining. This fluent interface pattern is expected by users for idiomatic usage like model.fit(series).predict(n_periods).
## `CW-TIME-SERIES-ML-007` — Zipline bundle signature contract
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
When implementing Zipline bundle ingest functions, the function must accept exactly 11 parameters in the specified order: environ, asset_db_writer, minute_bar_writer, daily_bar_writer, adjustment_writer, calendar, start_session, end_session, cache, show_progress, output_dir. This contract is enforced by Zipline's ingestion pipeline.
## `CW-TIME-SERIES-ML-008` — Calendar minutes_per_day alignment
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
When configuring trading calendars for backtesting, set minutes_per_day to match the total trading minutes including extended hours (390 for the regular NYSE session, 960 for a 16-hour extended session starting 4:00 AM). This ensures minute bar alignment with actual trading times in the backtest.
## `CW-TIME-SERIES-ML-009` — Deterministic series detection
**From**: finance-bp-121--machine-learning-for-trading · **Applicable to**: time-series-ml
A TimeSeries is deterministic when n_samples equals 1, otherwise probabilistic. This distinction matters for methods like to_json and gaps detection which execute differently depending on whether the series contains probabilistic predictions or point estimates.
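Under the 3D layout (n_timesteps, n_components, n_samples) described earlier, the check reduces to looking at the last axis. The function below is a hypothetical sketch on a raw NumPy array, not the library's own property.

```python
import numpy as np


def is_deterministic(values: np.ndarray) -> bool:
    """True when the sample axis of a (time, component, sample) array has size 1."""
    if values.ndim != 3:
        raise ValueError("expected a 3D array")
    return values.shape[2] == 1


point_forecast = np.zeros((10, 1, 1))
sampled_forecast = np.zeros((10, 1, 500))
```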
## `CW-TIME-SERIES-ML-010` — Minimum training sample enforcement
**From**: finance-bp-102--Darts · **Applicable to**: time-series-ml
Enforce min_train_series_length at fit time to prevent underfitting with insufficient historical data. Models should raise ValueError with clear messaging when training series length is below the model's minimum requirement, preventing silent poor forecasts.
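A minimal sketch of the fit-time guard. `MinLengthForecaster` and its threshold of 3 are hypothetical; the point is that the check runs before any fitting work and raises a descriptive `ValueError`.

```python
class MinLengthForecaster:
    """Hypothetical model that refuses to fit on too little history."""

    min_train_series_length = 3

    def fit(self, series):
        if len(series) < self.min_train_series_length:
            raise ValueError(
                f"need at least {self.min_train_series_length} points, "
                f"got {len(series)}"
            )
        self.last_ = series[-1]  # trivial "model": remember the last value
        return self
```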
FILE:references/components/alternative_data_collection.md
# alternative_data_collection (6 classes)
## `OpenTableSpider.parse`
`alternative_data_collection/opentablespider-parse.py:0`
## `parse_html`
`alternative_data_collection/parse-html.py:0`
## `MonitorDownloadsExtension.periodic_task`
`alternative_data_collection/monitordownloadsextension-periodic-task.py:0`
## `Spider implementation`
`alternative_data_collection/spider-implementation.py:0`
## `Downloader middleware`
`alternative_data_collection/downloader-middleware.py:0`
## `Item pipeline`
`alternative_data_collection/item-pipeline.py:0`
FILE:references/components/backtesting.md
# backtesting (6 classes)
## `TradingSimulator.take_step`
`backtesting/tradingsimulator-take-step.py:0`
## `get_backtest_data`
`backtesting/get-backtest-data.py:0`
## `TradingEnvironment.step`
`backtesting/tradingenvironment-step.py:0`
## `Backtesting framework`
`backtesting/backtesting-framework.py:0`
## `Cost model`
`backtesting/cost-model.py:0`
## `Position sizing`
`backtesting/position-sizing.py:0`
FILE:references/components/feature_engineering.md
# feature_engineering (5 classes)
## `DataSource.load_data`
`feature_engineering/datasource-load-data.py:0`
## `preprocess_data`
`feature_engineering/preprocess-data.py:0`
## `get_ohlcv_by_ticker`
`feature_engineering/get-ohlcv-by-ticker.py:0`
## `Feature set`
`feature_engineering/feature-set.py:0`
## `Normalization`
`feature_engineering/normalization.py:0`
FILE:references/components/market_data_ingestion.md
# market_data_ingestion (7 classes)
## `algoseek_to_bundle`
`market_data_ingestion/algoseek-to-bundle.py:0`
## `stooq_jp_to_bundle`
`market_data_ingestion/stooq-jp-to-bundle.py:0`
## `AlgoSeekCalendar`
`market_data_ingestion/algoseekcalendar.py:0`
## `metadata_frame`
`market_data_ingestion/metadata-frame.py:0`
## `Bundle ingest function`
`market_data_ingestion/bundle-ingest-function.py:0`
## `Exchange calendar`
`market_data_ingestion/exchange-calendar.py:0`
## `Adjustment data`
`market_data_ingestion/adjustment-data.py:0`
FILE:references/components/multiple_testing_correction.md
# multiple_testing_correction (4 classes)
## `get_analytical_max_sr`
`multiple_testing_correction/get-analytical-max-sr.py:0`
## `get_numerical_max_sr`
`multiple_testing_correction/get-numerical-max-sr.py:0`
## `simulate`
`multiple_testing_correction/simulate.py:0`
## `DSR calculation method`
`multiple_testing_correction/dsr-calculation-method.py:0`
FILE:references/components/prediction_modeling.md
# prediction_modeling (6 classes)
## `MultipleTimeSeriesCV.split`
`prediction_modeling/multipletimeseriescv-split.py:0`
## `get_backtest_data`
`prediction_modeling/get-backtest-data.py:0`
## `spearmanr`
`prediction_modeling/spearmanr.py:0`
## `Model type`
`prediction_modeling/model-type.py:0`
## `CV strategy`
`prediction_modeling/cv-strategy.py:0`
## `Alpha selection`
`prediction_modeling/alpha-selection.py:0`
FILE:references/components/reinforcement_learning_trading.md
# reinforcement_learning_trading (7 classes)
## `TradingEnvironment.step`
`reinforcement_learning_trading/tradingenvironment-step.py:0`
## `TradingEnvironment.reset`
`reinforcement_learning_trading/tradingenvironment-reset.py:0`
## `TradingEnvironment.render`
`reinforcement_learning_trading/tradingenvironment-render.py:0`
## `DataSource.get_observation`
`reinforcement_learning_trading/datasource-get-observation.py:0`
## `RL algorithm`
`reinforcement_learning_trading/rl-algorithm.py:0`
## `Episode length`
`reinforcement_learning_trading/episode-length.py:0`
## `Observation features`
`reinforcement_learning_trading/observation-features.py:0`
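Run ALM (asset-liability management) simulations, generate portfolio-return and cash-flow reports, and calibrate the EIOPA risk-free yield curve with the Smith-Wilson method for corporate bond pricing.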
---
name: macro-economic-model
description: |-
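Run ALM (asset-liability management) simulations, generate portfolio-return and cash-flow reports, and calibrate the EIOPA risk-free yield curve with the Smith-Wilson method for corporate bond pricing.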
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-077"
compiled_at: "2026-04-22T13:00:28.955724+00:00"
capability_markets: "global"
capability_activities: "macro-data"
sop_version: "crystal-compilation-v6.1"
---
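# Macro-Economic Model (macro-economic-model)
> Run ALM (asset-liability management) simulations, generate portfolio-return and cash-flow reports, and calibrate the EIOPA risk-free yield curve with the Smith-Wilson method for corporate bond pricing.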
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (13 total)
### ALM Portfolio Summary Report Generator (`UC-101`)
Running a comprehensive Asset-Liability Management (ALM) simulation and visualizing portfolio returns, cash flows, and EIOPA yield curves for a mixed
**Triggers**: ALM simulation, portfolio summary, yield curve visualization
### ALM Cash Flow Visualization Dashboard (`UC-111`)
Creating visual summaries of ALM simulation results including dividend, coupon, liability cash flows, notional returns, and terminal cash flows over t
**Triggers**: cash flow charts, portfolio visualization, ALM reporting
### EIOPA Risk-Free Curve Projection and Calibration (`UC-102`)
Projecting forward interest rates and calibrating the EIOPA risk-free yield curve for long-term insurance liability discounting using Smith-Wilson met
**Triggers**: EIOPA curve, forward rate projection, yield curve calibration
For all **13** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (14 total)
- **`AP-MACRO-DATA-001`**: SEC EDGAR Rate Limit Violation
- **`AP-MACRO-DATA-002`**: Temporal Knowledge Graph Look-Ahead Bias
- **`AP-MACRO-DATA-003`**: Technical Indicator Look-Ahead Bias via Missing Shift
All 14 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-077. Evidence verify ratio = 42.6% and audit fail total = 34. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | Full V6+ authoritative spec (source of truth) | Required reading when a behavior or decision is disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 14 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest or trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-077` blueprint at 2026-04-22T13:00:28.955724+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Corporate Bond Portfolio Pricing', 'EIOPA Risk-Free Curve Projection and Calibration', 'ALM Portfolio Summary Report Generator', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **14**
## finance-bp-074--FinRobot (1)
### `AP-MACRO-DATA-001` — SEC EDGAR Rate Limit Violation <sub>(high)</sub>
SEC API calls issued without a rate-limiting decorator can exceed SEC EDGAR's limit of 10 requests per second. EDGAR then blocks the client's IP, which cuts off access to financial filings and halts the data collection pipeline. FinRobot demonstrates that the SEC enforces this limit strictly, and that a missing User-Agent header compounds the problem by causing silent request failures.
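The decorator approach mentioned above can be sketched as follows. This is a minimal, single-threaded illustration, not FinRobot's implementation: `rate_limited`, `fetch_filing`, and `EDGAR_HEADERS` are hypothetical names, and the HTTP call itself is left as a placeholder.

```python
import time
from functools import wraps


def rate_limited(max_per_second, clock=time.monotonic, sleep=time.sleep):
    """Space out calls so that at most max_per_second calls start per second."""
    min_interval = 1.0 / max_per_second

    def decorator(func):
        last_call = [None]  # start time of the previous call (not thread-safe)

        @wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            if last_call[0] is not None:
                wait = min_interval - (now - last_call[0])
                if wait > 0:
                    sleep(wait)
                    now = clock()
            last_call[0] = now
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical header value: a descriptive User-Agent with contact details.
EDGAR_HEADERS = {"User-Agent": "Example Research admin@example.com"}


@rate_limited(max_per_second=10)
def fetch_filing(url):
    # Placeholder: a real implementation would issue the HTTP request here
    # and send EDGAR_HEADERS with every call.
    return {"url": url, "headers": EDGAR_HEADERS}
```

The `clock` and `sleep` parameters exist only so the throttle can be exercised without real waiting; a multi-threaded collector would additionally need a lock around `last_call`.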
## finance-bp-077--Open_Source_Economic_Model (2)
### `AP-MACRO-DATA-004` — EIOPA Non-Compliant Curve Extrapolation <sub>(high)</sub>
When implementing the Smith-Wilson algorithm for EIOPA Solvency II calculations, using non-EIOPA compliant formulas or incorrect convergence point calculations violates regulatory specifications. The convergence point must use max(U+40, 60) years per EIOPA paragraph 120. Non-compliant formulas will fail regulatory audits for insurance liability calculations and produce incorrect risk-free rates, leading to materially wrong liability valuations.
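The convergence point rule quoted above is small enough to state directly in code. The sketch assumes that `U` denotes the last liquid point in years; the function name is illustrative and does not come from the OSEM code base.

```python
def convergence_point(last_liquid_point_years: float) -> float:
    """Convergence point in years, following the max(U + 40, 60) rule."""
    return max(last_liquid_point_years + 40, 60)


# A 20-year last liquid point yields 60; a 30-year one yields 70.
points = [convergence_point(u) for u in (20, 30)]
```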
### `AP-MACRO-DATA-009` — CSV BOM Encoding Corruption in Data Import <sub>(medium)</sub>
Importing a CSV portfolio file that begins with a UTF-8 BOM without specifying the 'utf-8-sig' encoding leaves the BOM attached to the first column name read by pandas. Field lookups by the expected column name then raise KeyError, and the portfolio data fails to load.
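The effect can be reproduced with the standard library alone. The sketch below uses `csv.DictReader` rather than pandas to stay dependency-free; the column names `asset_id` and `nominal` are made up for illustration.

```python
import csv
import io

# BOM-prefixed CSV bytes, as some spreadsheet tools write them.
raw = "\ufeffasset_id,nominal\nA1,100\n".encode("utf-8")


def read_rows(data: bytes, encoding: str):
    """Decode the bytes with the given encoding and parse them as CSV."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.DictReader(text))


plain = read_rows(raw, "utf-8")      # first key keeps the BOM character
safe = read_rows(raw, "utf-8-sig")   # BOM is stripped during decoding

bom_leaked = "asset_id" not in plain[0]
asset_id = safe[0]["asset_id"]
```

With plain `utf-8`, `plain[0]["asset_id"]` would raise KeyError because the actual key is `"\ufeffasset_id"`.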
## finance-bp-080--FinDKG (3)
### `AP-MACRO-DATA-002` — Temporal Knowledge Graph Look-Ahead Bias <sub>(high)</sub>
When implementing temporal data splitting for knowledge graphs, using non-temporal train/val/test splits causes the model to see future events during training. The violation of train_edges occurring before val_edges and test_edges temporally results in inflated metrics that do not reflect real-world performance. This produces overfit models that fail catastrophically when deployed for actual temporal prediction tasks.
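One way to guarantee the ordering described above is to split on timestamp cutoffs rather than on random indices. The sketch assumes edges are `(head, relation, tail, timestamp)` tuples; the function name and tuple layout are illustrative, not FinDKG's data format.

```python
def temporal_split(edges, val_start, test_start):
    """Split edges by timestamp so train < val < test holds by construction."""
    if val_start >= test_start:
        raise ValueError("val_start must be earlier than test_start")
    train = [e for e in edges if e[3] < val_start]
    val = [e for e in edges if val_start <= e[3] < test_start]
    test = [e for e in edges if e[3] >= test_start]
    return train, val, test


edges = [
    ("a", "r", "b", 5),
    ("b", "r", "c", 1),
    ("c", "r", "d", 9),
    ("d", "r", "a", 3),
    ("a", "r", "c", 7),
]
train, val, test = temporal_split(edges, val_start=4, test_start=8)
```

Splitting on cutoffs also avoids placing edges that share a timestamp on both sides of a boundary, which a split by row count could do.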
### `AP-MACRO-DATA-008` — DGL Graph Attribute Propagation Failure in Temporal Batching <sub>(medium)</sub>
When implementing temporal knowledge graph data collation without propagating graph attributes (num_relations, num_all_nodes, time_interval) to subgraph variants created by collate_fn, downstream model components encounter missing attribute errors. The EmbeddingUpdater and EdgeModel expect these attributes on all graph objects including subgraphs, causing training to fail with AttributeError.
### `AP-MACRO-DATA-014` — Temporal DataLoader Shuffling Breaking Graph Ordering <sub>(medium)</sub>
When configuring DataLoader for temporal knowledge graph training with shuffle=True, the temporal ordering required for cumulative graph construction is violated. The model receives edges in non-chronological order, breaking the prior_G, batch_G, cumulative_G construction logic that depends on edges_before_batch occurring before edges_in_batch.
## finance-bp-083--Economic-Dashboard (3)
### `AP-MACRO-DATA-003` — Technical Indicator Look-Ahead Bias via Missing Shift <sub>(high)</sub>
When SMA crossover detection (golden/death cross) does not use shift(1) to compare the current bar's state with the prior bar's state, the detection relies on current-bar data and introduces look-ahead bias. Signals appear to fire on the same bar as the cross, producing unrealistic backtest results that fail in live trading. Rationalizing this with 'we need the current bar signal immediately' lets future information leak into current signals.
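A dependency-free sketch of one reading of this rule follows: a cross is detected by comparing each bar's state with the prior bar's state, and the resulting signal is delayed by one bar before it can be acted on. In pandas the same two steps are usually written with `shift(1)`. The window lengths and function names are illustrative assumptions.

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def golden_cross_signals(closes, fast=2, slow=3):
    """Return per-bar booleans: True where a golden cross may be acted on."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    above = [
        None if f is None or s is None else f > s
        for f, s in zip(fast_ma, slow_ma)
    ]
    crossed = [False] * len(closes)
    for i in range(1, len(closes)):
        # Cross completes at bar i: fast was not above slow at i - 1, is at i.
        crossed[i] = above[i] is True and above[i - 1] is False
    # Delay by one bar: a cross completed at bar i is actionable at bar i + 1.
    return [False] + crossed[:-1]


signals = golden_cross_signals([10, 9, 8, 9, 11, 12])
```

In this example the fast average moves above the slow one on the fifth bar, so the first actionable signal is on the sixth.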
### `AP-MACRO-DATA-010` — OHLCV Data Quality Validation Failure <sub>(medium)</sub>
Calculating technical indicators from OHLCV data without first verifying the required columns (open, high, low, close, volume) raises ValueError for any ticker with a missing column. This blocks downstream regime classification and pattern detection for every ticker with incomplete data.
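A validation step of this kind can be sketched as below. The helper names are hypothetical; the check takes any iterable of column names, so it could be called with `df.columns` from a pandas DataFrame.

```python
REQUIRED_OHLCV = ("open", "high", "low", "close", "volume")


def missing_ohlcv_columns(columns):
    """Return the required OHLCV columns absent from `columns` (case-insensitive)."""
    present = {str(c).lower() for c in columns}
    return [c for c in REQUIRED_OHLCV if c not in present]


def validate_ohlcv(columns, ticker):
    """Raise a descriptive ValueError before any indicator is computed."""
    missing = missing_ohlcv_columns(columns)
    if missing:
        raise ValueError(f"{ticker}: missing OHLCV columns {missing}")
```

Running the check per ticker lets a pipeline skip or report incomplete tickers explicitly instead of failing partway through indicator calculation.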
### `AP-MACRO-DATA-011` — Inconsistent Primary Key Schema Causing JOIN Failures <sub>(medium)</sub>
Storing derived features in DuckDB under a primary key schema that differs from the technical_features table prevents JOIN operations between the tables, which breaks the regime classification and pattern detection pipelines. The composite primary key (ticker, date) must be consistent across all feature tables to support efficient querying and data integrity.
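The consistent-key requirement can be illustrated with an in-memory database. The sketch swaps in the standard-library `sqlite3` module for DuckDB so it runs without extra packages; `derived_features`, the feature columns, and the row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE technical_features (
        ticker TEXT NOT NULL,
        date   TEXT NOT NULL,
        sma_20 REAL,
        PRIMARY KEY (ticker, date)
    );
    CREATE TABLE derived_features (
        ticker TEXT NOT NULL,
        date   TEXT NOT NULL,
        regime TEXT,
        PRIMARY KEY (ticker, date)
    );
    """
)
# Dummy rows; the values carry no meaning.
conn.execute("INSERT INTO technical_features VALUES ('AAA', '2024-01-02', 1.0)")
conn.execute("INSERT INTO derived_features VALUES ('AAA', '2024-01-02', 'bull')")

# Both tables share the (ticker, date) key, so the join is one-to-one.
rows = conn.execute(
    """
    SELECT t.ticker, t.date, t.sma_20, d.regime
    FROM technical_features AS t
    JOIN derived_features AS d
      ON t.ticker = d.ticker AND t.date = d.date
    """
).fetchall()
```

Because both tables declare the same composite key, a second row for the same ticker and date is rejected at insert time instead of silently duplicating join results.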
## finance-bp-105--open-climate-investing (5)
### `AP-MACRO-DATA-005` — Factor Regression Using Raw Returns Instead of Excess Returns <sub>(high)</sub>
When computing returns for CAPM/Fama-French factor regression, using raw stock returns instead of subtracting the risk-free rate (Rf) violates standard financial econometric methodology. CAPM/FF regression requires excess returns (Return - Rf); using raw returns produces incorrect beta estimates that misrepresent a stock's systematic risk exposure. This leads to fundamentally flawed risk attribution and portfolio construction decisions.
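The excess-return step can be sketched for the single-factor CAPM case, where beta is the covariance of stock and market excess returns divided by the variance of market excess returns. The function below is an illustration in plain Python, not the open-climate-investing regression code, and it assumes all inputs are decimal returns for the same periods.

```python
def capm_beta(stock_returns, market_returns, risk_free):
    """Estimate CAPM beta from excess returns (all inputs as decimals)."""
    if not (len(stock_returns) == len(market_returns) == len(risk_free)):
        raise ValueError("all series must have the same length")
    # Subtract the risk-free rate from both sides before regressing.
    ex_stock = [r - rf for r, rf in zip(stock_returns, risk_free)]
    ex_mkt = [m - rf for m, rf in zip(market_returns, risk_free)]
    mean_s = sum(ex_stock) / len(ex_stock)
    mean_m = sum(ex_mkt) / len(ex_mkt)
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(ex_stock, ex_mkt))
    var = sum((m - mean_m) ** 2 for m in ex_mkt)
    if var == 0:
        raise ValueError("market excess returns have zero variance")
    return cov / var
```

When the market series is already an excess-return factor (as with the Fama-French Mkt-RF column), only the stock returns need the subtraction.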
### `AP-MACRO-DATA-006` — Percentage vs Decimal Unit Mismatch in Factor Data <sub>(high)</sub>
When importing Fama-French factors from CSV files, failing to divide percentage-formatted factors (e.g., 5.2) by 100 before regression leaves the coefficients mis-scaled by a factor of 100. This produces statistically invalid inference and meaningless factor loadings. The same issue applies to risk-free rate values, corrupting all CAPM beta calculations downstream.
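A unit conversion step might look like the sketch below. It assumes rows are dictionaries keyed by the usual Fama-French column names (`Mkt-RF`, `SMB`, `HML`, `RF`) and that every listed column is in percent; both points should be checked against the actual file.

```python
PERCENT_COLUMNS = ("Mkt-RF", "SMB", "HML", "RF")


def percent_to_decimal(rows, columns=PERCENT_COLUMNS):
    """Return new rows with the listed columns divided by 100."""
    return [
        {key: (value / 100.0 if key in columns else value) for key, value in row.items()}
        for row in rows
    ]


# Illustrative values only.
converted = percent_to_decimal([{"date": "202401", "Mkt-RF": 5.2, "RF": 0.4}])
```

Applying the conversion once, at import time, keeps every later regression input in the same unit as the decimal stock returns.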
### `AP-MACRO-DATA-007` — Insufficient Regression Observations for Statistical Validity <sub>(medium)</sub>
When implementing factor regression analysis, using fewer than 20 data points after filtering (inner join, winsorization, date range) produces unreliable or undefined t-statistics and p-values. OLS with insufficient observations produces meaningless regression coefficients, making it impossible to distinguish significant factor exposures from noise. This commonly occurs when combining multiple data sources with missing values.
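A guard for this case can run right after the data sources are aligned. The sketch assumes both inputs are dictionaries keyed by date; the names and the default threshold of 20 simply mirror the description above.

```python
MIN_OBSERVATIONS = 20


def aligned_observations(returns_by_date, factors_by_date, minimum=MIN_OBSERVATIONS):
    """Inner-join two date-keyed mappings and require enough rows to regress."""
    common = sorted(set(returns_by_date) & set(factors_by_date))
    if len(common) < minimum:
        raise ValueError(
            f"only {len(common)} aligned observations; need at least {minimum}"
        )
    return [(d, returns_by_date[d], factors_by_date[d]) for d in common]
```

Counting after the join, rather than on either source alone, catches the case where each source looks complete but their overlap is small.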
### `AP-MACRO-DATA-012` — Frequency Column Enforcement Missing in Time Series Schema <sub>(medium)</sub>
When creating PostgreSQL schema for time series tables without explicit frequency column enforcement of 'MONTHLY' or 'DAILY' text values, mixed frequency data corrupts regression calculations. Combining incompatible data frequencies produces statistically invalid regression results. The database must enforce frequency consistency to prevent silent data corruption.
### `AP-MACRO-DATA-013` — PostgreSQL Fork in Multiprocessing Context <sub>(medium)</sub>
When parallel regression execution uses the fork start method with psycopg2 database connections, child processes inherit the parent's open connections and end up with corrupted connection state. This surfaces as 'connection already closed' errors, failed database writes, and incomplete factor regression calculations.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-077--Open_Source_Economic_Model
**Scan date**: 2026-04-22
**Stats**: {'total_files': 8, 'total_classes': 36, 'total_functions': 0, 'total_stages': 8}
## Modules (8)
- [configuration_and_data_import](components/configuration_and_data_import.md): 4 classes
- [risk-free_curve_construction](components/risk-free_curve_construction.md): 7 classes
- [asset_cash_flow_generation](components/asset_cash_flow_generation.md): 6 classes
- [liability_cash_flow_generation](components/liability_cash_flow_generation.md): 3 classes
- [asset_pricing_and_valuation](components/asset_pricing_and_valuation.md): 6 classes
- [main_time-stepping_loop](components/main_time-stepping_loop.md): 5 classes
- [portfolio_rebalancing](components/portfolio_rebalancing.md): 3 classes
- [results_export](components/results_export.md): 2 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 139
fatal_constraints_count: 82
non_fatal_constraints_count: 201
use_cases_count: 13
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
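Two of the locks above, `SL-03` and `SL-05`, lend themselves to a quick pre-flight check. The sketch below is a hedged illustration: the regular expression is an assumption inferred from the example IDs in this document (`stock_sh_600000`, `stockus_nasdaq_AAPL`), and `SignalSizing` is a hypothetical stand-in, not ZVT's `TradingSignal` class.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape of entity_type_exchange_code; adjust to the real ID rules.
ENTITY_ID_PATTERN = re.compile(r"[a-z]+_[a-z]+_[A-Za-z0-9]+")


def is_valid_entity_id(entity_id: str) -> bool:
    """SL-03: entity IDs follow entity_type_exchange_code."""
    return ENTITY_ID_PATTERN.fullmatch(entity_id) is not None


@dataclass
class SignalSizing:
    """SL-05: exactly one sizing field may be set."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self):
        provided = [
            name
            for name in ("position_pct", "order_money", "order_amount")
            if getattr(self, name) is not None
        ]
        if len(provided) != 1:
            raise ValueError(f"expected exactly one sizing field, got {provided}")
```

Running such checks before a signal is constructed turns a lock violation into an immediate error rather than a silently mis-sized order.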
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **13**
## `KUC-101`
**Source**: `Archive/_SUMMARY OF THE EXAMPLE RUN.ipynb`
Running a comprehensive Asset-Liability Management (ALM) simulation and visualizing portfolio returns, cash flows, and EIOPA yield curves for a mixed asset portfolio.
## `KUC-102`
**Source**: `Documentation/Archive/_PROJECTION OF THE RISK FREE CURVE AND RECALIBRATION_v2.ipynb`
Projecting forward interest rates and calibrating the EIOPA risk-free yield curve for long-term insurance liability discounting using Smith-Wilson methodology.
## `KUC-103`
**Source**: `Documentation/Archive/_PROTOTYPE BOND PRICING_v2.ipynb`
Pricing a corporate bond portfolio using EIOPA term structures and sector spreads for ALM asset valuation.
## `KUC-104`
**Source**: `Documentation/Archive/_PROTOTYPE EQUITY PRICING_v2.ipynb`
Valuing an equity portfolio by modeling dividend cash flows and terminal values using EIOPA discount curves for ALM analysis.
## `KUC-105`
**Source**: `Documentation/Equity_Class_Dev.ipynb`
Developing core data structures for equity share modeling including dividend yields, growth rates, frequency settings, and terminal value calculations.
## `KUC-106`
**Source**: `LIABILITY_PROTOTYPE.ipynb`
Modeling insurance liability cash flows including policy capitalization, premium growth, mortality rates, and retirement cash flows for ALM liability projections.
## `KUC-107`
**Source**: `Liability_Dev/LIABILITY_PROTOTYPE.ipynb`
Prototype development for insurance liability modeling including policy capitalization functions and premium growth calculations.
## `KUC-108`
**Source**: `_PROJECTION OF THE RISK FREE CURVE AND RECALIBRATION_v2.ipynb`
Projecting and recalibrating risk-free yield curves using EIOPA methodology with configurable spread adjustments for ALM asset and liability valuation.
## `KUC-109`
**Source**: `_PROTOTYPE BOND PRICING_v2.ipynb`
Pricing corporate bond portfolios by generating coupon, maturity, and notional cash flows using EIOPA term structure curves.
## `KUC-110`
**Source**: `_PROTOTYPE EQUITY PRICING_v3.ipynb`
Comprehensive equity portfolio valuation generating dividend flows, terminal cash flows, and unit-based market prices for ALM asset projections.
## `KUC-111`
**Source**: `_SUMMARY CHARTS FOR OSEM RUN.ipynb`
Creating visual summaries of ALM simulation results including dividend, coupon, liability cash flows, notional returns, and terminal cash flows over time.
## `KUC-112`
**Source**: `_TRADING.ipynb`
Processing trading operations across equity, bond, and liability portfolios including cash flow matching, portfolio rebalancing, and expiry handling for ALM execution.
## `KUC-113`
**Source**: `docs/conf.py`
Sphinx documentation configuration for the Open Source Economic Modeling (OSEM) project.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-MACRO-DATA-001` — Temporal Ordering Enforcement
**From**: finance-bp-080--FinDKG, finance-bp-083--Economic-Dashboard · **Applicable to**: macro-data
Across temporal knowledge graphs and financial time series, strict temporal ordering must be enforced in train/val/test splits and data loading. Train edges must occur strictly before validation edges, which must occur strictly before test edges. DataLoaders must never shuffle temporal data. Apply this pattern whenever implementing any time-series ML pipeline to prevent look-ahead bias that inflates evaluation metrics.
## `CW-MACRO-DATA-002` — Regulatory Formula Compliance
**From**: finance-bp-077--Open_Source_Economic_Model, finance-bp-105--open-climate-investing · **Applicable to**: macro-data
When implementing financial calculations subject to regulatory oversight (EIOPA Solvency II, CAPM, Fama-French), use exact formula specifications from authoritative sources. The Smith-Wilson convergence point must follow EIOPA paragraph 120, factor regressions must use excess returns with properly scaled inputs. Apply this pattern when calculations will be used for regulatory reporting or investment decision-making.
## `CW-MACRO-DATA-003` — Strict Data Schema Enforcement
**From**: finance-bp-083--Economic-Dashboard, finance-bp-077--Open_Source_Economic_Model · **Applicable to**: macro-data
Financial data pipelines require strict schema validation at ingestion points. OHLCV requires specific columns, CSV imports require exact column names matching field access, INI files require specific sections. Missing or malformed schema elements should fail loudly rather than produce silent corruption. Apply this pattern during data import to catch errors early before downstream calculations use bad data.
## `CW-MACRO-DATA-004` — Composite Primary Key Uniqueness
**From**: finance-bp-105--open-climate-investing, finance-bp-080--FinDKG, finance-bp-083--Economic-Dashboard · **Applicable to**: macro-data
Time-series financial databases require composite primary keys (ticker, date) to ensure uniqueness and enable efficient querying. Inconsistent primary keys across tables break JOIN operations essential for feature merging. Apply this pattern when designing any financial database schema involving time-series measurements with multiple entities.
## `CW-MACRO-DATA-005` — External API Rate Limiting
**From**: finance-bp-074--FinRobot · **Applicable to**: macro-data
When accessing external financial APIs (SEC EDGAR, data vendors), strict rate limiting must be implemented before deployment. SEC EDGAR enforces 10 requests per second with IP blocking consequences. Use decorators and proper User-Agent headers. Apply this pattern when integrating any external financial data API to prevent service disruption that blocks critical data access.
## `CW-MACRO-DATA-006` — Graph Attribute Propagation in Batching
**From**: finance-bp-080--FinDKG, finance-bp-105--open-climate-investing · **Applicable to**: macro-data
When creating subgraph variants during batch collation in graph-based ML, all metadata attributes (num_nodes, num_relations, time_interval) must be explicitly propagated to each subgraph. Downstream model components expect these attributes on all graph objects. Apply this pattern whenever implementing custom collate functions for graph neural networks to prevent training failures.
## `CW-MACRO-DATA-007` — Statistical Validity Thresholds
**From**: finance-bp-105--open-climate-investing, finance-bp-083--Economic-Dashboard · **Applicable to**: macro-data
Factor regressions and statistical calculations require minimum observation counts (typically 20+) for reliable inference. Inner joins, winsorization, and date filtering reduce observations; pipeline validation must check for sufficient data points before regression. Apply this pattern whenever computing regression statistics to ensure results are meaningful rather than spurious.
## `CW-MACRO-DATA-008` — Data Type Strictness for ML Operations
**From**: finance-bp-080--FinDKG, finance-bp-077--Open_Source_Economic_Model · **Applicable to**: macro-data
Graph operations and time calculations require strict dtype consistency (float32 for time values, integer for node types, boolean for masks). Type mismatches cause silent failures in edge_subgraph, degree calculations, and time interval transformations. Apply this pattern when preparing data for graph neural networks or any numerical ML pipeline to catch dtype issues early.
FILE:references/components/asset_cash_flow_generation.md
# asset_cash_flow_generation (6 classes)
## `EquityShare.generate_dividend_dates`
`asset_cash_flow_generation/equityshare-generate-dividend-dates.py:0`
## `EquityShare.terminal_amount`
`asset_cash_flow_generation/equityshare-terminal-amount.py:0`
## `CorpBond.coupon_amount`
`asset_cash_flow_generation/corpbond-coupon-amount.py:0`
## `EquitySharePortfolio.unique_dates_profile`
`asset_cash_flow_generation/equityshareportfolio-unique-dates-profil.py:0`
## `terminal_value_model`
`asset_cash_flow_generation/terminal-value-model.py:0`
## `dividend_growth`
`asset_cash_flow_generation/dividend-growth.py:0`
FILE:references/components/asset_pricing_and_valuation.md
# asset_pricing_and_valuation (6 classes)
## `price_share`
`asset_pricing_and_valuation/price-share.py:0`
## `price_bond`
`asset_pricing_and_valuation/price-bond.py:0`
## `calibrate_bond_portfolio`
`asset_pricing_and_valuation/calibrate-bond-portfolio.py:0`
## `Curves.RetrieveRates`
`asset_pricing_and_valuation/curves-retrieverates.py:0`
## `pricing_model`
`asset_pricing_and_valuation/pricing-model.py:0`
## `spread_calibration`
`asset_pricing_and_valuation/spread-calibration.py:0`
FILE:references/components/configuration_and_data_import.md
# configuration_and_data_import (4 classes)
## `ImportData.get_configuration`
`configuration_and_data_import/importdata-get-configuration.py:0`
## `Settings.__post_init__`
`configuration_and_data_import/settings-post-init.py:0`
## `ImportData.get_EquityShare`
`configuration_and_data_import/importdata-get-equityshare.py:0`
## `input_format`
`configuration_and_data_import/input-format.py:0`
FILE:references/components/liability_cash_flow_generation.md
# liability_cash_flow_generation (3 classes)
## `Liability.__init__`
`liability_cash_flow_generation/liability-init.py:0`
## `LiabilityClasses.unique_dates_profile`
`liability_cash_flow_generation/liabilityclasses-unique-dates-profile.py:0`
## `liability_model`
`liability_cash_flow_generation/liability-model.py:0`
FILE:references/components/main_time-stepping_loop.md
# main_time-stepping_loop (5 classes)
## `set_dates_of_interest`
`main_time-stepping_loop/set-dates-of-interest.py:0`
## `process_expired_cf`
`main_time-stepping_loop/process-expired-cf.py:0`
## `create_cashflow_dataframe`
`main_time-stepping_loop/create-cashflow-dataframe.py:0`
## `time_step`
`main_time-stepping_loop/time-step.py:0`
## `equity_growth_model`
`main_time-stepping_loop/equity-growth-model.py:0`
FILE:references/components/portfolio_rebalancing.md
# portfolio_rebalancing (3 classes)
## `trade`
`portfolio_rebalancing/trade.py:0`
## `trading_strategy`
`portfolio_rebalancing/trading-strategy.py:0`
## `asset_allocation`
`portfolio_rebalancing/asset-allocation.py:0`
FILE:references/components/results_export.md
# results_export (2 classes)
## `summary_df.to_csv`
`results_export/summary-df-to-csv.py:0`
## `output_format`
`results_export/output-format.py:0`
FILE:references/components/risk-free_curve_construction.md
# risk-free_curve_construction (7 classes)
## `Curves.SetObservedTermStructure`
`risk-free_curve_construction/curves-setobservedtermstructure.py:0`
## `Curves.CalcFwdRates`
`risk-free_curve_construction/curves-calcfwdrates.py:0`
## `Curves.ProjectForwardRate`
`risk-free_curve_construction/curves-projectforwardrate.py:0`
## `Curves.CalibrateProjected`
`risk-free_curve_construction/curves-calibrateprojected.py:0`
## `BisectionAlpha.find_alpha`
`risk-free_curve_construction/bisectionalpha-find-alpha.py:0`
## `curve_algorithm`
`risk-free_curve_construction/curve-algorithm.py:0`
## `calibration_method`
`risk-free_curve_construction/calibration-method.py:0`
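Provides survival analysis and Cox proportional hazards modeling built on the lifelines library, with support for residual diagnostics, custom parametric regression models, time-lagged conversion rate analysis, and proportional hazards assumption testing.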
---
name: lifelines-survival-analysis
description: |-
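Provides survival analysis and Cox proportional hazards modeling built on the lifelines library, with support for residual diagnostics, custom parametric regression models, time-lagged conversion rate analysis, and proportional hazards assumption testing.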
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-126"
compiled_at: "2026-04-22T13:01:02.914448+00:00"
capability_markets: "global"
capability_activities: "insurance-actuarial"
sop_version: "crystal-compilation-v6.1"
---
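# Survival Analysis Modeling (lifelines-survival-analysis)
> Provides survival analysis and Cox proportional hazards modeling built on the lifelines library, with support for residual diagnostics, custom parametric regression models, time-lagged conversion rate analysis, and proportional hazards assumption testing.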
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (19 total)
### Cox Model Residual Analysis (`UC-101`)
Diagnosing Cox proportional hazards model fit by computing and visualizing martingale, deviance, and delta_beta residuals to identify outliers and inf
**Triggers**: cox residuals, martingale residual, deviance residual
### Time-Lagged Conversion Rate Analysis (`UC-103`)
Modeling marketing conversion rates where there is a time lag between initial contact and conversion event, requiring specialized survival analysis te
**Triggers**: conversion rate, time-lagged, marketing
### Piecewise Exponential Survival Models (`UC-104`)
Fitting piecewise exponential survival models that allow different hazard rates in different time intervals, useful when hazard is non-constant over t
**Triggers**: piecewise exponential, varying hazard, breakpoints
For all **19** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-INSURANCE-001`**: Implicit numeric format assumptions without validation
- **`AP-INSURANCE-002`**: Triangle axis construction with invalid temporal ordering
- **`AP-INSURANCE-003`**: Cumulative/incremental triangle representation misuse
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-126. Evidence verify ratio = 53.0% and audit fail total = 27. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative definition (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When full examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-126` blueprint at 2026-04-22T13:01:02.914448+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Time-Lagged Conversion Rate Analysis', 'Custom Parametric Regression Models', 'Cox Model Residual Analysis', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-063--chainladder-python (4)
### `AP-INSURANCE-002` — Triangle axis construction with invalid temporal ordering <sub>(high)</sub>
Development dates are created without verifying they are strictly greater than origin dates, or development lags are calculated with incorrect formulas (e.g., using wrong divisor for monthly difference). This creates logically impossible triangle cells where development <= origin, corrupting the fundamental data structure and producing wrong loss development patterns.
### `AP-INSURANCE-003` — Cumulative/incremental triangle representation misuse <sub>(high)</sub>
Link ratios are computed on incremental triangles instead of cumulative form, or cum_to_incr/incr_to_cum conversions are not properly inverse-applied. This produces link ratios near 1.0 regardless of actual claims development, leading to misleading development factors and incorrect IBNR estimates.
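The relationship between the two representations can be shown on a single origin row. The helpers below are standalone illustrations in plain Python and are not the chainladder-python methods of the same names; the claim amounts are made up.

```python
from itertools import accumulate


def incr_to_cum(row):
    """Incremental amounts per development period -> cumulative amounts."""
    return list(accumulate(row))


def cum_to_incr(row):
    """Cumulative amounts -> incremental amounts (inverse of incr_to_cum)."""
    if not row:
        return []
    return [row[0]] + [later - earlier for earlier, later in zip(row, row[1:])]


def link_ratios(cumulative_row):
    """Age-to-age factors; only meaningful on cumulative amounts."""
    return [later / earlier for earlier, later in zip(cumulative_row, cumulative_row[1:])]


incremental = [100.0, 50.0, 25.0]
cumulative = incr_to_cum(incremental)   # [100.0, 150.0, 175.0]
ratios = link_ratios(cumulative)        # first factor is 150 / 100 = 1.5
```

A round trip through both conversions should reproduce the input exactly, which makes a cheap sanity check before any development factors are estimated.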
### `AP-INSURANCE-004` — Including incomplete latest diagonal in development analysis <sub>(high)</sub>
Link ratio computation includes the latest diagonal which contains incomplete/in-progress development data. Without excluding this diagonal via valuation_date filtering, development factor estimation uses partial data that biases IBNR estimates. The latest diagonal must be excluded to capture true historical development patterns.
### `AP-INSURANCE-015` — Triangle grain transformation with incompatible parameters <sub>(medium)</sub>
Triangle grain() method is called without setting is_cumulative attribute, or origin grain is made finer than development grain. These produce invalid triangular data structures with misaligned periods and undefined behavior, corrupting actuarial reserving calculations.
## finance-bp-064--insurance_python (2)
### `AP-INSURANCE-005` — EIOPA calibration workflow violations <sub>(high)</sub>
Smith-Wilson calibration workflow is violated in multiple ways: calibration step is skipped before extrapolation, different alpha values are used for calibration vs extrapolation, or convergence point T uses incorrect formula. These violations produce mathematically inconsistent rate curves where observed points do not match market data and extrapolated rates violate EIOPA specifications.
### `AP-INSURANCE-006` — Missing iteration bounds causing infinite loops <sub>(high)</sub>
Root-finding algorithms like bisection for alpha calibration lack maxIter parameters. When the algorithm fails to converge (e.g., no sign change in Galfa at interval bounds), the application freezes indefinitely, causing service disruption. This is especially critical in regulatory compliance workflows where calibration must complete.
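A bounded bisection might look like the sketch below. It is a generic routine, not the insurance_python code; `bisect_root`, `tol`, and `max_iter` are illustrative names, and the default limits are assumptions.

```python
def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f on [lo, hi] with an explicit iteration bound."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        # Without a sign change the interval may not bracket a root at all.
        raise ValueError("no sign change on [lo, hi]; root is not bracketed")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    # Fail loudly instead of looping forever.
    raise RuntimeError(f"bisection did not converge within {max_iter} iterations")


root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
```

Both failure modes raise, so a caller in a calibration workflow can report the problem rather than hang.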
## finance-bp-064--insurance_python, finance-bp-126--lifelines (1)
### `AP-INSURANCE-007` — Invalid financial/mathematical constraints not validated <sub>(high)</sub>
Correlation coefficients outside [-1,1], non-positive-semidefinite covariance matrices, negative durations, or entry times >= duration are not validated before use. These cause Cholesky decomposition failures, imaginary values in sqrt(1-rho²), or logically impossible scenarios, producing NaN prices or corrupted at-risk calculations.
## finance-bp-065--pyliferisk (4)
### `AP-INSURANCE-008` — None values propagated to arithmetic operations <sub>(high)</sub>
Critical parameters like interest rate i are passed as None to actuarial calculations. In pyliferisk, Actuarial.__init__ with i=None causes TypeError in (1/(1+i)) and commutation arrays remain empty. Bare except clauses catch these TypeErrors and silently return 0, masking the fundamental issue and producing incorrect but seemingly valid results.
### `AP-INSURANCE-009` — Stub function implementations and duplicate definitions <sub>(high)</sub>
Critical insurance functions like deferred temporary annuities are implemented as empty stubs (only 'pass' statement) or have duplicate definitions where the second shadows the first. This causes functions to return None instead of calculated values, breaking increasing annuity and premium calculations silently in production.
### `AP-INSURANCE-010` — Dispatcher routing to undefined functions <sub>(medium)</sub>
Complex function dispatchers (like annuity()) handle many parameter combinations but call functions that do not exist (e.g., qtaaxn, qtaxn). This causes NameError at runtime when specific parameter combinations are requested, preventing deferred temporary increasing annuity calculations entirely.
### `AP-INSURANCE-014` — Actuarial convention violations in life table construction <sub>(high)</sub>
Life tables violate standard actuarial conventions: using incorrect radix (not 100000), failing to append 0 to lx array for complete extinction, or using wrong payment adjustment formula for fractional annuities. These violations scale all derived quantities (dx, ex, reserves, premiums) incorrectly.
## finance-bp-065--pyliferisk, finance-bp-064--insurance_python (1)
### `AP-INSURANCE-001` — Implicit numeric format assumptions without validation <sub>(high)</sub>
Data formats like per-mille qx values or rate-to-price conversions are applied implicitly without validation. In pyliferisk, qx values stored as per-mille (qx*1000) are used directly as probabilities yielding 1000x errors. In insurance_python, rates are converted to prices using p=(1+r)^(-M) without verifying input format. This causes material miscalculations in reserve and premium calculations.
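One way to make the format explicit is to require the caller to state it and then range-check the result. The helper below is a hypothetical sketch, not part of pyliferisk or insurance_python.

```python
def qx_to_probability(qx_values, per_mille):
    """Convert mortality rates to probabilities, with the input format stated explicitly."""
    scale = 1000.0 if per_mille else 1.0
    probabilities = [q / scale for q in qx_values]
    for age, p in enumerate(probabilities):
        if not 0.0 <= p <= 1.0:
            raise ValueError(
                f"qx at index {age} is {p}, outside [0, 1]; check the per_mille flag"
            )
    return probabilities


# Hypothetical per-mille inputs: 1.5 per thousand becomes 0.0015.
probs = qx_to_probability([1.5, 2.0], per_mille=True)
```

Passing the same values with `per_mille=False` raises, which is the 1000x error surfacing early instead of flowing into reserves.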
## finance-bp-126--lifelines (3)
### `AP-INSURANCE-011` — Survival function monotonicity not enforced <sub>(high)</sub>
Non-parametric survival curve estimators do not verify that S(t) is monotonically non-increasing across timeline values. Violations produce mathematically invalid survival curves where probability of survival increases over time, or S(0) is not initialized to 1.0, breaking interpretation as probability distribution.
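The two invariants can be checked directly on the estimated values. The sketch below assumes the first element of the list corresponds to t = 0 and is not taken from lifelines; `validate_survival_curve` is a hypothetical helper.

```python
def validate_survival_curve(survival, tol=1e-12):
    """Raise ValueError unless S(0) == 1 and S(t) is non-increasing within [0, 1]."""
    if not survival:
        raise ValueError("survival curve is empty")
    if abs(survival[0] - 1.0) > tol:
        raise ValueError(f"S(0) must equal 1.0, got {survival[0]}")
    for i, (prev, cur) in enumerate(zip(survival, survival[1:]), start=1):
        if not 0.0 <= cur <= 1.0:
            raise ValueError(f"S at index {i} is {cur}, outside [0, 1]")
        if cur > prev + tol:
            raise ValueError(f"S increases at index {i}: {prev} -> {cur}")


validate_survival_curve([1.0, 0.9, 0.9, 0.5])  # flat steps are allowed
```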
### `AP-INSURANCE-012` — Input data corruption via inplace operations <sub>(medium)</sub>
User-provided DataFrames are modified inplace using .pop() operations without first creating a copy. This permanently corrupts user data by removing columns, violating data isolation principles and potentially affecting downstream analysis on the original data.
### `AP-INSURANCE-013` — Interval censoring bounds not validated <sub>(medium)</sub>
Lower and upper bounds for interval-censored data are not validated, allowing upper_bound < lower_bound. Invalid interval bounds produce undefined survival probability calculations, potentially negative time intervals in the likelihood function, and corrupt NPMLE estimation.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-126--lifelines
**Scan date**: 2026-04-22
**Stats**: {'total_files': 6, 'total_classes': 28, 'total_functions': 0, 'total_stages': 6}
## Modules (6)
- [data_input_&_validation](components/data_input_-_validation.md): 4 classes
- [non-parametric_survival_curve_estimation](components/non-parametric_survival_curve_estimation.md): 4 classes
- [regression_model_fitting](components/regression_model_fitting.md): 6 classes
- [statistical_inference_&_model_evaluation](components/statistical_inference_-_model_evaluation.md): 5 classes
- [prediction_from_fitted_models](components/prediction_from_fitted_models.md): 5 classes
- [visualization](components/visualization.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 132
fatal_constraints_count: 53
non_fatal_constraints_count: 160
use_cases_count: 19
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
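A format check for this lock could be sketched as follows. The regular expression is an assumption (lowercase alphanumeric entity type and exchange, free-form code); it is not ZVT's own validation, and `parse_entity_id` is a hypothetical helper.

```python
import re

# Assumed shape: <entity_type>_<exchange>_<code>, e.g. stock_sh_600000.
ENTITY_ID_RE = re.compile(
    r"^(?P<entity_type>[a-z0-9]+)_(?P<exchange>[a-z0-9]+)_(?P<code>.+)$"
)


def parse_entity_id(entity_id):
    """Split an entity ID into its three parts, or raise ValueError."""
    match = ENTITY_ID_RE.match(entity_id)
    if match is None:
        raise ValueError(
            f"entity ID {entity_id!r} does not match entity_type_exchange_code"
        )
    return match.groupdict()


parts = parse_entity_id("stock_sh_600000")
```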
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
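The exactly-one rule can be enforced at construction time. `SignalSizing` below is a hypothetical stand-in that carries only the three sizing fields; it is not ZVT's `TradingSignal` class.

```python
from dataclasses import dataclass
from typing import Optional

_SIZING_FIELDS = ("position_pct", "order_money", "order_amount")


@dataclass(frozen=True)
class SignalSizing:
    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self):
        # Count the fields the caller actually set.
        provided = [name for name in _SIZING_FIELDS if getattr(self, name) is not None]
        if len(provided) != 1:
            raise ValueError(
                f"exactly one of {_SIZING_FIELDS} must be set, got {provided or 'none'}"
            )


sizing = SignalSizing(position_pct=0.5)
```

Comparing against `None` rather than truthiness keeps an explicit `0.0` distinguishable from an unset field.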
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
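The three-way semantics are easy to collapse by accident: in Python `not None` is true and `bool(float("nan"))` is also true, so a plain `if value:` test would read NaN as BUY and an `else` branch would read None as SELL. A minimal sketch of an explicit mapping, using a hypothetical helper and plain Python values:

```python
import math


def action_from_filter_result(value):
    """Map a filter_result cell to an action, keeping None/NaN distinct from False."""
    if value is None:
        return "NO_ACTION"
    if isinstance(value, float) and math.isnan(value):
        return "NO_ACTION"
    if isinstance(value, bool):
        return "BUY" if value else "SELL"
    raise TypeError(f"unsupported filter_result value: {value!r}")


actions = [action_from_filter_result(v) for v in (True, False, None, float("nan"))]
```

Values read from a NumPy-backed column are not instances of Python `bool`, so they would need converting before being passed to a helper like this one.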
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **19**
## `KUC-101`
**Source**: `examples/Cox residuals.ipynb`
Diagnosing Cox proportional hazards model fit by computing and visualizing martingale, deviance, and delta_beta residuals to identify outliers and influential observations.
## `KUC-102`
**Source**: `examples/Custom Regression Models.ipynb`
Creating user-defined parametric survival regression models by subclassing ParametricRegressionFitter to implement custom hazard functions for specialized applications.
## `KUC-103`
**Source**: `examples/Modelling time-lagged conversion rates.ipynb`
Modeling marketing conversion rates where there is a time lag between initial contact and conversion event, requiring specialized survival analysis techniques.
## `KUC-104`
**Source**: `examples/Piecewise Exponential Models and Creating Custom Models.ipynb`
Fitting piecewise exponential survival models that allow different hazard rates in different time intervals, useful when hazard is non-constant over time.
## `KUC-105`
**Source**: `examples/Proportional hazard assumption.ipynb`
Validating that the proportional hazards assumption holds for Cox models using statistical tests and visual diagnostics, and applying remedies like stratification or splines when violations occur.
## `KUC-106`
**Source**: `examples/B-splines.ipynb`
Using B-splines in Cox proportional hazards models to flexibly model non-linear relationships between covariates and hazard without assuming specific functional forms.
## `KUC-107`
**Source**: `examples/SaaS churn and piecewise regression models.ipynb`
Predicting customer churn in SaaS subscription business using survival analysis, accounting for varying churn rates across customer tenure segments.
## `KUC-108`
**Source**: `examples/US Presidential Cabinet survival.ipynb`
Analyzing tenure survival of US presidential cabinet members across administrations, examining factors affecting cabinet turnover and duration.
## `KUC-109`
**Source**: `examples/aalen_and_cook_simulation.py`
Simulation study for Aalen-Johansen estimator and comparison with Cox/Weibull models to understand multi-state survival dynamics.
## `KUC-110`
**Source**: `examples/copula_frailty_weibull_model.py`
Modeling dependent competing risks using copula-based frailty to account for unobserved heterogeneity that creates correlation between multiple failure types.
## `KUC-111`
**Source**: `examples/cox_spline_custom_knots.py`
Fitting Cox proportional hazards model with spline-based baseline hazard using user-specified knot locations for flexible survival modeling.
## `KUC-112`
**Source**: `examples/crowther_royston_clements_splines.py`
Implementing flexible parametric accelerated failure time models using Royston-Clements splines for flexible baseline survival estimation.
## `KUC-113`
**Source**: `examples/cure_model.py`
Modeling survival data where a proportion of subjects will never experience the event (cured), using Weibull survival with cure component.
## `KUC-114`
**Source**: `examples/haft_model.py`
Implementing heteroscedastic accelerated failure time models where variance of log-survival time depends on covariates, providing more flexible survival modeling.
## `KUC-115`
**Source**: `examples/left_censoring_experiments.py`
Handling left-censored survival data where the true event time may have occurred before the observation window began, common in environmental or detection-limited data.
## `KUC-116`
**Source**: `examples/mixture_cure_model.py`
Modeling survival with multiple cure pathways using mixture models to represent different subgroups with different probabilities of experiencing the event.
## `KUC-117`
**Source**: `examples/Solving a mixture of exponentials and binning using interval censoring.ipynb`
Fitting mixture of exponential distributions to interval-censored data, where exact event times are unknown but fall within observed intervals.
## `KUC-118`
**Source**: `experiments/Experiments on primary and secondary shelf life.ipynb`
Analyzing product shelf life using competing risks framework to model both primary shelf life (time to first event) and secondary shelf life (time between events).
## `KUC-119`
**Source**: `docs/images/dist_script.py`
Visualizing Weibull survival functions to understand how shape and scale parameters affect survival curves for educational and model selection purposes.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-INSURANCE-001` — Validate input data format and type before computation
**From**: finance-bp-063--chainladder-python, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Both triangle construction and survival analysis require strict input validation: numeric types for triangle columns, valid event indicators (0/1), no NaN/Inf values, and correct temporal ordering. This prevents downstream numerical failures and ensures mathematical validity of actuarial computations.
## `CW-INSURANCE-002` — Initialize probability distributions to boundary values
**From**: finance-bp-065--pyliferisk, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Survival probability S(0) must equal 1.0 and life table lx must start at standard radix (100000) and end at 0. Properly initializing boundary values ensures actuarial quantities have correct scale and interpretation as probability distributions.
## `CW-INSURANCE-003` — Include iteration limits in numerical root-finding
**From**: finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Bisection and other root-finding algorithms must include maxIter parameters and verify interval contains valid root (sign change). This prevents infinite loops when calibration fails, ensuring service availability in regulatory compliance workflows.
## `CW-INSURANCE-004` — Avoid bare except clauses that mask TypeErrors
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Bare except clauses that catch all exceptions including TypeError and return default values (0 or None) mask fundamental parameter errors. Use specific exception handling and validate inputs upfront to fail fast with clear error messages.
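The contrast can be shown with the discount factor 1/(1+i) mentioned in the anti-patterns. Both functions below are illustrative sketches with hypothetical names, not pyliferisk code.

```python
def discount_factor(i):
    """Return v = 1 / (1 + i), rejecting missing or invalid rates upfront."""
    if i is None:
        raise ValueError("interest rate i is required, got None")
    if isinstance(i, bool) or not isinstance(i, (int, float)):
        raise TypeError(f"interest rate must be a number, got {type(i).__name__}")
    if i <= -1:
        raise ValueError(f"interest rate must be greater than -1, got {i}")
    return 1.0 / (1.0 + i)


def discount_factor_masked(i):
    """Anti-pattern, shown only for contrast: the bare except hides the None input."""
    try:
        return 1.0 / (1.0 + i)
    except:  # noqa: E722 (deliberately bad)
        return 0


v = discount_factor(0.25)
masked = discount_factor_masked(None)  # silently yields 0
```

The masked version returns a plausible-looking number for a missing rate, which is exactly what makes the error hard to trace later.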
## `CW-INSURANCE-005` — Preserve standard radix and extinction conventions in life tables
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Life insurance calculations rely on industry-standard conventions: radix of 100000 at age 0 and lx[-1]=0 for complete extinction. Deviating from these conventions scales all derived quantities incorrectly and breaks interoperability with other actuarial systems.
## `CW-INSURANCE-006` — Ensure workflow step ordering and parameter consistency
**From**: finance-bp-063--chainladder-python, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Multi-step algorithms (triangle transformations, Smith-Wilson calibration) require strict step ordering: compute calibration vector before extrapolation, use consistent alpha values throughout. Violating workflow order produces undefined or mathematically inconsistent results.
## `CW-INSURANCE-007` — Validate probability bounds for confidence intervals
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Confidence interval bounds must be constrained to [0,1] for probability estimates. Use fillna and formula constraints to ensure CI bounds remain valid probability ranges, preventing invalid statistical inference from actuarial models.
## `CW-INSURANCE-008` — Validate matrix properties before decomposition
**From**: finance-bp-065--pyliferisk, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
A matrix must be verified as positive semi-definite before Cholesky decomposition. Invalid matrices cause math domain errors or invalid correlated samples. Similarly, correlation coefficients must be validated against the [-1,1] bounds before use in sqrt(1-rho²).
## `CW-INSURANCE-009` — Make defensive copies of input DataFrames
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
User-provided DataFrames should be copied before inplace modifications (.pop(), .drop()). This preserves user data integrity and prevents side effects from leaking into caller code, maintaining data isolation principles.
## `CW-INSURANCE-010` — Exclude incomplete diagonals from historical analysis
**From**: finance-bp-063--chainladder-python · **Applicable to**: insurance-actuarial
The latest diagonal in claims triangles contains incomplete development data from the current period. Excluding this diagonal via valuation_date filtering ensures development factors capture only completed, reliable historical patterns for unbiased IBNR estimation.
FILE:references/components/data_input_-_validation.md
# data_input_&_validation (4 classes)
## `BaseFitter.fit`
`data_input_&_validation/basefitter-fit.py:0`
## `CensoringType.detect`
`data_input_&_validation/censoringtype-detect.py:0`
## `_check_complete_separation`
`data_input_&_validation/check-complete-separation.py:0`
## `column_validation`
`data_input_&_validation/column-validation.py:0`
FILE:references/components/non-parametric_survival_curve_estimation.md
# non-parametric_survival_curve_estimation (4 classes)
## `KaplanMeierFitter.fit`
`non-parametric_survival_curve_estimation/kaplanmeierfitter-fit.py:0`
## `NelsonAalenFitter.fit`
`non-parametric_survival_curve_estimation/nelsonaalenfitter-fit.py:0`
## `GeneralizedGammaFitter.fit`
`non-parametric_survival_curve_estimation/generalizedgammafitter-fit.py:0`
## `confidence_interval_method`
`non-parametric_survival_curve_estimation/confidence-interval-method.py:0`
FILE:references/components/prediction_from_fitted_models.md
# prediction_from_fitted_models (5 classes)
## `predict_survival_function`
`prediction_from_fitted_models/predict-survival-function.py:0`
## `predict_hazard`
`prediction_from_fitted_models/predict-hazard.py:0`
## `predict_cumulative_hazard`
`prediction_from_fitted_models/predict-cumulative-hazard.py:0`
## `median_survival_times`
`prediction_from_fitted_models/median-survival-times.py:0`
## `prediction_method`
`prediction_from_fitted_models/prediction-method.py:0`
FILE:references/components/regression_model_fitting.md
# regression_model_fitting (6 classes)
## `CoxPHFitter.fit`
`regression_model_fitting/coxphfitter-fit.py:0`
## `WeibullAFTFitter.fit`
`regression_model_fitting/weibullaftfitter-fit.py:0`
## `ParametricRegressionFitter._scipy_fit`
`regression_model_fitting/parametricregressionfitter-scipy-fit.py:0`
## `optimizer`
`regression_model_fitting/optimizer.py:0`
## `baseline_hazard`
`regression_model_fitting/baseline-hazard.py:0`
## `tie_handling`
`regression_model_fitting/tie-handling.py:0`
FILE:references/components/statistical_inference_-_model_evaluation.md
# statistical_inference_&_model_evaluation (5 classes)
## `StatisticalResult.summary`
`statistical_inference_&_model_evaluation/statisticalresult-summary.py:0`
## `concordance_index`
`statistical_inference_&_model_evaluation/concordance-index.py:0`
## `k_fold_cross_validation`
`statistical_inference_&_model_evaluation/k-fold-cross-validation.py:0`
## `logrank_test`
`statistical_inference_&_model_evaluation/logrank-test.py:0`
## `scoring_method`
`statistical_inference_&_model_evaluation/scoring-method.py:0`
FILE:references/components/visualization.md
# visualization (4 classes)
## `plot_survival_function`
`visualization/plot-survival-function.py:0`
## `add_at_risk_counts`
`visualization/add-at-risk-counts.py:0`
## `qq_plot`
`visualization/qq-plot.py:0`
## `plot_style`
`visualization/plot-style.py:0`
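A bytecode-driven double-entry accounting engine that supports real-time multi-currency account balance queries and FIFO allocation tracking of funding sources.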
---
name: ledger-plaintext-accounting
description: |-
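A bytecode-driven double-entry accounting engine that supports real-time multi-currency account balance queries and FIFO allocation tracking of funding sources.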
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-073"
compiled_at: "2026-04-22T13:00:26.836559+00:00"
capability_markets: "global"
capability_activities: "accounting"
sop_version: "crystal-compilation-v6.1"
---
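# Ledger Plain-Text Accounting (ledger-plaintext-accounting)
> A bytecode-driven double-entry accounting engine that supports real-time multi-currency account balance queries and FIFO allocation tracking of funding sources.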
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (0 total)
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-ACCOUNTING-001`**: Using floating-point arithmetic for monetary amounts
- **`AP-ACCOUNTING-002`**: Skipping initialization calls before VM/script execution
- **`AP-ACCOUNTING-003`**: Mixing different asset types in monetary operations
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-073. Evidence verify ratio = 85.9% and audit fail total = 0. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | Full V6+ authoritative source (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons learned | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-073` blueprint at 2026-04-22T13:00:26.836559+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
  - Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-073--ledger (7)
### `AP-ACCOUNTING-002` — Skipping initialization calls before VM/script execution <sub>(high)</sub>
Executing Numscript VM without first calling ResolveResources() and ResolveBalances() causes panics with ErrResourcesNotInitialized or ErrBalancesNotInitialized. This prevents any script execution and leaves transactions in an unrunnable state, blocking financial operations entirely.
### `AP-ACCOUNTING-003` — Mixing different asset types in monetary operations <sub>(high)</sub>
Performing addition, subtraction, or take operations on amounts with different asset types produces invalid financial calculations. This violates the fundamental accounting principle that amounts in different currencies cannot be combined, leading to corrupted account balances and failed reconciliations.
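A value type that carries its asset makes the mismatch impossible to ignore. `Monetary` below is a hypothetical Python sketch with integer minor units; it does not reproduce the ledger project's own types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monetary:
    asset: str    # e.g. "USD", "EUR"
    amount: int   # minor units (cents), an assumption of this sketch

    def _check_same_asset(self, other):
        if other.asset != self.asset:
            raise ValueError(f"cannot combine {self.asset} with {other.asset}")

    def __add__(self, other):
        if not isinstance(other, Monetary):
            return NotImplemented
        self._check_same_asset(other)
        return Monetary(self.asset, self.amount + other.amount)

    def __sub__(self, other):
        if not isinstance(other, Monetary):
            return NotImplemented
        self._check_same_asset(other)
        return Monetary(self.asset, self.amount - other.amount)


total = Monetary("USD", 1500) + Monetary("USD", 250)
```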
### `AP-ACCOUNTING-004` — Missing insufficient funds validation <sub>(high)</sub>
Failing to detect when account balance cannot cover a requested withdrawal or transfer allows overdrafts beyond permitted limits. This causes real monetary losses, account balance violations, and potential regulatory compliance issues in global markets.
### `AP-ACCOUNTING-005` — Non-atomic transaction commit/rollback <sub>(high)</sub>
Processing database operations without atomic commit/rollback leaves partial state when failures occur. This corrupts account balances and volumes, violating double-entry bookkeeping integrity and making audit trails unreliable for global regulatory compliance.
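The all-or-nothing behavior, together with the insufficient-funds check from `AP-ACCOUNTING-004`, can be sketched with the standard-library `sqlite3` module. SQLite is used here only as a stand-in; the table layout and the `transfer` helper are hypothetical and unrelated to the ledger project's storage layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances ("
    " account TEXT PRIMARY KEY,"
    " amount INTEGER NOT NULL CHECK (amount >= 0))"  # rejects overdrafts
)
conn.executemany("INSERT INTO balances VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, source, destination, amount):
    """Move funds atomically: both updates commit together or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (amount, destination),
        )
        conn.execute(
            "UPDATE balances SET amount = amount - ? WHERE account = ?",
            (amount, source),
        )


def snapshot(conn):
    return dict(conn.execute("SELECT account, amount FROM balances"))
```

In this sketch the credit runs first on purpose: when the debit then violates the `CHECK` constraint, the rollback also undoes the credit, so no partial state survives.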
### `AP-ACCOUNTING-006` — On-demand posting generation causing double-spending <sub>(high)</sub>
Computing postings on-demand rather than accumulating them during transaction execution fails to track already-spent funds within the same transaction. This creates double-spending vulnerabilities that violate atomic transaction semantics and can result in significant financial losses.
### `AP-ACCOUNTING-007` — Log insertion after transaction commit breaking event sourcing <sub>(high)</sub>
Committing the transaction before inserting the audit log breaks the event sourcing pattern fundamental to accounting integrity. This makes it impossible to rebuild state from logs and violates audit requirements necessary for global financial compliance.
### `AP-ACCOUNTING-008` — Incomplete transaction log hash chaining <sub>(high)</sub>
Computing log hashes without including the previous log hash breaks the immutable audit trail chain. This allows undetected tampering with historical transaction records, compromising financial integrity and regulatory audit compliance.
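A chained log can be sketched as below. The payload encoding and the use of SHA-256 over `previous_hash + payload` are assumptions made for illustration, not the ledger project's actual hashing scheme.

```python
import hashlib
import json


def log_hash(payload, previous_hash):
    """Hash a log entry together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(previous_hash.encode("ascii") + body).hexdigest()


def build_chain(payloads):
    chain, previous = [], ""
    for payload in payloads:
        current = log_hash(payload, previous)
        chain.append({"payload": payload, "previous_hash": previous, "hash": current})
        previous = current
    return chain


def verify_chain(chain):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    previous = ""
    for entry in chain:
        if entry["previous_hash"] != previous:
            return False
        if log_hash(entry["payload"], previous) != entry["hash"]:
            return False
        previous = entry["hash"]
    return True


chain = build_chain([{"id": 1, "amount": 100}, {"id": 2, "amount": -40}])
```

Because each hash covers the previous one, rewriting an old entry requires recomputing every later hash, which is what makes tampering detectable.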
## finance-bp-073--ledger, finance-bp-129--beancount (1)
### `AP-ACCOUNTING-001` — Using floating-point arithmetic for monetary amounts <sub>(high)</sub>
Representing currency values with float64 or similar floating-point types causes precision loss during arithmetic operations. Rounding errors accumulate over multiple transactions, leading to incorrect balance calculations and potential financial losses. This violates the fundamental requirement that monetary calculations must be exact.
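The precision loss is easy to reproduce, and the standard-library `decimal` module is one way to avoid it in Python (integer minor units are another). A minimal sketch with made-up amounts:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so the error shows up after three additions.
float_total = sum([0.1] * 3)

# Decimal values built from strings keep the amounts exact.
exact_total = sum([Decimal("0.10")] * 3, Decimal("0"))

assert float_total != 0.3
assert exact_total == Decimal("0.30")
```

Constructing `Decimal` from a string matters: `Decimal(0.1)` would faithfully copy the float's binary approximation.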
## finance-bp-078--fava_investor (4)
### `AP-ACCOUNTING-009` — Incorrect row data access patterns on query results <sub>(high)</sub>
Using dictionary notation (row['column_name']) on namedtuple query results raises TypeError, because namedtuples support attribute access and positional indexing but not string keys. This breaks every module query that indexes rows by column name, causing asset allocation, tax loss harvesting, and other critical financial computations to fail.
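The access patterns can be compared on a standard-library namedtuple. The `Row` type and its fields below are hypothetical and stand in for a query result row.

```python
from collections import namedtuple

Row = namedtuple("Row", ["account", "balance"])
row = Row(account="Assets:Cash", balance=100)

by_attribute = row.account              # attribute access works
by_position = row[1]                    # positional access works
as_mapping = row._asdict()["balance"]   # convert first if a mapping is needed

try:
    row["account"]                      # string keys are not supported
    string_key_error = None
except TypeError as exc:
    string_key_error = exc
```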
### `AP-ACCOUNTING-010` — Missing bidirectional inference for fund relationship declarations <sub>(medium)</sub>
When relationship A→B is declared but B→A is not inferred, the TLH partner list becomes incomplete. This leads to suboptimal tax-loss harvesting decisions where only some funds show all valid swap options, reducing potential tax savings for investors.
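Inferring the reverse direction amounts to taking the symmetric closure of the declared pairs. The sketch below uses a plain dict and placeholder fund names; it is not fava_investor's data model.

```python
def symmetric_partners(declared):
    """Given declared A -> [B, ...] relationships, also record B -> A."""
    partners = {}
    for fund, others in declared.items():
        partners.setdefault(fund, set())
        for other in others:
            partners[fund].add(other)
            partners.setdefault(other, set()).add(fund)
    return partners


# Only FUND_A declares its partners; the reverse links are inferred.
partners = symmetric_partners({"FUND_A": ["FUND_B", "FUND_C"]})
```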
### `AP-ACCOUNTING-011` — Wash sale comparison within substantially identical groups <sub>(high)</sub>
Comparing a ticker to itself in its own substantially identical group falsely triggers wash sale warnings. This incorrectly blocks valid tax-loss harvesting transactions, causing investors to miss opportunities to realize tax losses and offset capital gains.
### `AP-ACCOUNTING-012` — Missing substantially identical tickers in wash sale queries <sub>(high)</sub>
Omitting substantially identical fund tickers from the wash sale comparison set allows purchases of similar funds within the 30-day window. This triggers unintended wash sales that disallow tax loss claims on subsequent sales of the original position.
## finance-bp-129--beancount (3)
### `AP-ACCOUNTING-013` — Using parsed entries with MISSING sentinel values for calculations <sub>(high)</sub>
Using parsed entries directly that contain MISSING sentinel values for balance or cost computations causes runtime errors or silent zero-value calculations. This results in incorrect portfolio valuations and reconciliation failures, compromising financial reporting accuracy.
### `AP-ACCOUNTING-014` — Underspecified interpolation with multiple missing values per currency <sub>(high)</sub>
Having more than one missing value per currency group creates an underdetermined system with no unique solution during interpolation. This causes InterpolationError and transaction failure, blocking balance calculations for affected accounts.
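For a single currency group the rule reduces to: at most one amount may be missing, and it is filled with the value that makes the group sum to zero. The helper below is a simplified sketch of that idea, not beancount's interpolation code, and it raises a plain `ValueError` rather than the library's `InterpolationError`.

```python
from decimal import Decimal


def fill_missing_amount(amounts):
    """Fill the single None in a one-currency posting group so the group sums to zero."""
    missing = [index for index, amount in enumerate(amounts) if amount is None]
    if len(missing) > 1:
        # Two unknowns and one equation: no unique solution.
        raise ValueError(f"{len(missing)} missing amounts; at most one can be interpolated")
    filled = list(amounts)
    if missing:
        known_total = sum((a for a in amounts if a is not None), Decimal("0"))
        filled[missing[0]] = -known_total
    return filled


filled = fill_missing_amount([Decimal("100.00"), Decimal("-25.50"), None])
```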
### `AP-ACCOUNTING-015` — Violating accounting identity in opening balance transactions <sub>(high)</sub>
Creating opening balance transactions where the total balance of summarized entries does not equal exactly zero violates the fundamental accounting identity (Assets = Liabilities + Equity). This causes the balance sheet to be fundamentally incorrect with non-zero total assets and liabilities.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-073--ledger
**Scan date**: 2026-04-22
**Stats**: {'total_files': 10, 'total_classes': 24, 'total_functions': 0, 'total_stages': 10}
## Modules (10)
- [numscript_script_execution](components/numscript_script_execution.md): 3 classes
- [transaction_processing](components/transaction_processing.md): 3 classes
- [volume_accounting](components/volume_accounting.md): 1 class
- [log_chain_creation](components/log_chain_creation.md): 1 class
- [chart/schema_enforcement](components/chart-schema_enforcement.md): 3 classes
- [controller_store](components/controller_store.md): 2 classes
- [async_block_hashing](components/async_block_hashing.md): 3 classes
- [pipeline_replication](components/pipeline_replication.md): 4 classes
- [bucket_cleanup](components/bucket_cleanup.md): 2 classes
- [worker_fx_module](components/worker_fx_module.md): 2 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 99
fatal_constraints_count: 82
non_fatal_constraints_count: 149
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
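One way to picture next-bar execution is a one-bar shift of the signal series. The helper below is a hypothetical sketch on plain lists, using the True/False/None convention from `SL-06`; it is not the framework's own implementation:

```python
def shift_to_next_bar(signals):
    """Delay each signal by one bar.

    signals[t] is computed from data up to and including bar t, so the
    earliest bar it may trade on is t + 1. The first bar gets no action.
    """
    if not signals:
        return []
    return [None] + list(signals[:-1])
```

A signal on the final bar is dropped because there is no later bar to execute it on.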
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
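A minimal sketch of a format check, assuming lowercase `entity_type` and `exchange` parts and an alphanumeric `code` (the character classes are an assumption made for illustration; the lock itself only fixes the three-part order):

```python
import re

# Assumed character classes; adjust if real codes contain other characters.
ENTITY_ID = re.compile(
    r"^(?P<entity_type>[a-z]+)_(?P<exchange>[a-z]+)_(?P<code>[A-Za-z0-9]+)$"
)


def parse_entity_id(entity_id):
    """Split an entity ID into its three parts, or raise ValueError."""
    match = ENTITY_ID.match(entity_id)
    if match is None:
        raise ValueError(
            f"{entity_id!r} does not follow entity_type_exchange_code"
        )
    return match.groupdict()
```

For example, `parse_entity_id("stock_sh_600000")` yields the parts `stock`, `sh`, and `600000`.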
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
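The exactly-one rule can be checked with a small validator. The function name is hypothetical; only the three field names come from the lock:

```python
def validate_sizing(position_pct=None, order_money=None, order_amount=None):
    """Return the single sizing field that was set, or raise ValueError."""
    candidates = {
        "position_pct": position_pct,
        "order_money": order_money,
        "order_amount": order_amount,
    }
    provided = {name: value for name, value in candidates.items() if value is not None}
    if len(provided) != 1:
        raise ValueError(
            f"expected exactly one sizing field, got {sorted(provided)}"
        )
    return next(iter(provided.items()))
```

Comparing against `None` rather than truthiness keeps a legitimate value of `0` from being mistaken for an unset field.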
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **0**
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-ACCOUNTING-001` — Use exact-precision integer types for monetary representation
**From**: finance-bp-073--ledger, finance-bp-129--beancount · **Applicable to**: accounting
Both the Numscript ledger and the Beancount parser mandate using Decimal (beancount) or MonetaryInt based on big.Int (ledger) instead of floating-point. This pattern ensures no rounding errors accumulate in financial calculations, critical for audit compliance in global markets.
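A short Python illustration of the difference, using only the standard library. The integer variant stores minor units (cents), which is an assumption about representation made for this example:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2

# Decimal built from strings keeps the exact decimal value.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Integer minor units (cents) are exact by construction.
cents_sum = 10 + 20

float_is_exact = float_sum == 0.3
decimal_is_exact = decimal_sum == Decimal("0.3")
```

Here `float_is_exact` is `False` while `decimal_is_exact` is `True`. Note that the `Decimal` values are constructed from strings; constructing them from floats would carry the float's error along.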
## `CW-ACCOUNTING-002` — Mandatory initialization sequence before execution
**From**: finance-bp-073--ledger · **Applicable to**: accounting
The Numscript VM requires a strict initialization sequence: ResolveResources() then ResolveBalances() must both be called before Execute(). Skipping any step causes panics. This teaches that VM/script execution requires careful state setup—always verify prerequisites before running financial logic.
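The ordering requirement can be sketched as a small state guard. `ScriptMachine` and `NotInitializedError` are hypothetical Python analogues written for illustration; they mirror the method names above but are not the ledger's API:

```python
class NotInitializedError(RuntimeError):
    """Raised when a step runs before its prerequisites."""


class ScriptMachine:
    """Hypothetical analogue of a VM with a strict setup order."""

    def __init__(self):
        self._resources_ready = False
        self._balances_ready = False

    def resolve_resources(self):
        self._resources_ready = True

    def resolve_balances(self):
        # Balances depend on resources, so enforce the order here too.
        if not self._resources_ready:
            raise NotInitializedError("resources not initialized")
        self._balances_ready = True

    def execute(self):
        if not self._resources_ready:
            raise NotInitializedError("resources not initialized")
        if not self._balances_ready:
            raise NotInitializedError("balances not initialized")
        return "executed"
```

Raising a typed error at each step turns a skipped prerequisite into a diagnosable failure instead of a crash.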
## `CW-ACCOUNTING-003` — Dual idempotency key strategy
**From**: finance-bp-073--ledger · **Applicable to**: accounting
Using both IdempotencyKey and IdempotencyHash together ensures robust duplicate detection: IdempotencyKey prevents exact retries while IdempotencyHash catches retries with different input parameters that would otherwise incorrectly succeed. Single-key approaches leave gaps in financial transaction safety.
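One possible reading of the dual-key strategy, sketched with an in-memory store. The class, the exception, and the choice of SHA-256 over a JSON encoding are assumptions made for illustration:

```python
import hashlib
import json


class IdempotencyConflict(ValueError):
    """Same key reused with a different payload."""


class IdempotentStore:
    """Hypothetical in-memory store keyed by idempotency key."""

    def __init__(self):
        self._seen = {}  # key -> (payload hash, stored result)

    def submit(self, key, payload, handler):
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()
        if key in self._seen:
            stored_digest, stored_result = self._seen[key]
            if stored_digest != digest:
                # The key matches but the inputs differ: reject the request.
                raise IdempotencyConflict(f"key {key!r} reused with new payload")
            return stored_result  # exact retry: replay the earlier result
        result = handler(payload)
        self._seen[key] = (digest, result)
        return result
```

The key alone would silently replay the first result for a changed payload; the hash is what exposes that mismatch.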
## `CW-ACCOUNTING-004` — Log-before-commit event sourcing pattern
**From**: finance-bp-073--ledger · **Applicable to**: accounting
In the transaction processing pipeline, the log must be inserted before committing the transaction to maintain event sourcing integrity. This ensures the audit trail can always reconstruct state and supports rollback scenarios, critical for regulatory compliance in global accounting.
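A minimal sketch of the ordering, using the standard library's `sqlite3` as a stand-in database. The table names, the `record_transfer` helper, and the `fail_log` switch (present only to simulate a failure) are hypothetical:

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings (source TEXT, destination TEXT, amount INTEGER)"
    )
    conn.execute("CREATE TABLE logs (payload TEXT)")
    return conn


def record_transfer(conn, source, destination, amount, fail_log=False):
    """Write the posting and its audit log as one atomic unit.

    The log row is inserted before the commit, so a committed posting
    always has a matching log entry.
    """
    with conn:  # commits on success, rolls back on any exception
        conn.execute(
            "INSERT INTO postings (source, destination, amount) VALUES (?, ?, ?)",
            (source, destination, amount),
        )
        if fail_log:
            raise RuntimeError("simulated log failure")
        conn.execute(
            "INSERT INTO logs (payload) VALUES (?)",
            (f"{source}->{destination}:{amount}",),
        )
```

If the log insert fails, the posting is rolled back with it, so the log never lags behind committed state.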
## `CW-ACCOUNTING-005` — Read Committed isolation with FOR UPDATE locks
**From**: finance-bp-073--ledger · **Applicable to**: accounting
When implementing balance operations, use Read Committed isolation level combined with FOR UPDATE row locks. This prevents concurrent transactions from creating inconsistent balances (e.g., both succeeding when they should fail due to insufficient funds), ensuring data integrity under concurrent load.
## `CW-ACCOUNTING-006` — Transitive closure for equivalence relationships
**From**: finance-bp-078--fava_investor · **Applicable to**: accounting
When building commodity groups or substantially identical fund relationships, apply transitive closure to infer complete equivalence. If A equals B and B equals C, then A, B, and C form one group. This ensures wash sale detection and TLH calculations are complete and accurate across all declared relationships.
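Transitive closure over declared pairs can be computed with a union-find structure. The sketch below uses placeholder tickers and a hypothetical `build_groups` helper; it is not fava_investor's implementation:

```python
def build_groups(pairs):
    """Merge declared equivalence pairs into complete groups."""
    parent = {}

    def find(ticker):
        parent.setdefault(ticker, ticker)
        while parent[ticker] != ticker:
            # Path halving keeps later lookups short.
            parent[ticker] = parent[parent[ticker]]
            ticker = parent[ticker]
        return ticker

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for ticker in list(parent):
        groups.setdefault(find(ticker), set()).add(ticker)
    return sorted(sorted(group) for group in groups.values())
```

Declaring only `("A", "B")` and `("B", "C")` is enough for `A` and `C` to land in the same group.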
## `CW-ACCOUNTING-007` — Canonical representative selection for relationship groups
**From**: finance-bp-078--fava_investor · **Applicable to**: accounting
When selecting a representative for a substantially identical fund group, always return the same representative ticker for any member of that group. Inconsistent representative selection causes non-deterministic calculations where the same ticker gets different partners depending on which group member is queried.
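A deterministic choice can be as simple as taking the lexicographically smallest member. That rule is an assumption chosen for illustration, and the tickers are placeholders:

```python
def representative(ticker, groups):
    """Return one stable representative for the group containing ticker.

    groups: iterable of sets of tickers. A ticker that belongs to no
    group represents itself.
    """
    for group in groups:
        if ticker in group:
            return min(group)  # same answer whichever member is queried
    return ticker
```

Any total order works, as long as the result depends only on the group's membership and not on which member was asked about.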
## `CW-ACCOUNTING-008` — Immutable monetary objects with __slots__
**From**: finance-bp-129--beancount · **Applicable to**: accounting
Constructing Amount or Position objects using immutable Decimal values with __slots__ = () pattern prevents accidental mutation of monetary values after creation. This immutability ensures financial calculations remain consistent throughout transaction processing and audit trails.
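The pattern can be sketched with a namedtuple subclass. This `Amount` is a simplified illustration, not beancount's class definition:

```python
from collections import namedtuple
from decimal import Decimal

_AmountBase = namedtuple("_AmountBase", ["number", "currency"])


class Amount(_AmountBase):
    """Immutable (number, currency) pair."""

    # An empty __slots__ suppresses the per-instance __dict__, so no new
    # attributes can be attached to an instance after creation.
    __slots__ = ()


price = Amount(Decimal("1.50"), "USD")
# Changes go through _replace, which returns a new object.
doubled = price._replace(number=price.number * 2)
```

Both `price.number = ...` and `price.note = ...` raise `AttributeError`, while `price` itself is unchanged by `_replace`.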
## `CW-ACCOUNTING-009` — Eliminate all MISSING values before presenting parsed data as complete
**From**: finance-bp-129--beancount · **Applicable to**: accounting
Parsed entries with MISSING sentinel values are incomplete and cannot be used for financial reporting. All MISSING values must be resolved through booking and interpolation before claiming parsed entries are ready for balance calculations or realized/unrealized gains computation.
## `CW-ACCOUNTING-010` — Strict schema compatibility across class hierarchies
**From**: finance-bp-078--fava_investor, finance-bp-129--beancount · **Applicable to**: accounting
When extending base classes with additional functionality (like ScaledNAV extending RelateTickers), maintain compatibility with existing metadata schemas. Schema divergence causes extended classes to miss relationships declared for the base class, breaking wash sale detection and TLH recommendations.
FILE:references/components/async_block_hashing.md
# async_block_hashing (3 classes)
## `AsyncBlockRunner.Run`
`async_block_hashing/asyncblockrunner-run.py:0`
## `Block size`
`async_block_hashing/block-size.py:0`
## `Schedule`
`async_block_hashing/schedule.py:0`
FILE:references/components/bucket_cleanup.md
# bucket_cleanup (2 classes)
## `BucketCleanupRunner.Run`
`bucket_cleanup/bucketcleanuprunner-run.py:0`
## `Retention period`
`bucket_cleanup/retention-period.py:0`
FILE:references/components/chart-schema_enforcement.md
# chart/schema_enforcement (3 classes)
## `ChartOfAccounts.ValidatePosting`
`chart/schema_enforcement/chartofaccounts-validateposting.py:0`
## `ChartOfAccounts.FindAccountSchema`
`chart/schema_enforcement/chartofaccounts-findaccountschema.py:0`
## `Chart pattern engine`
`chart/schema_enforcement/chart-pattern-engine.py:0`
FILE:references/components/controller_store.md
# controller_store (2 classes)
## `Store.GetBalances`
`controller_store/store-getbalances.py:0`
## `Store.LockLedger`
`controller_store/store-lockledger.py:0`
FILE:references/components/log_chain_creation.md
# log_chain_creation (1 class)
## `N/A`
`log_chain_creation/n-a.py:0`
FILE:references/components/numscript_script_execution.md
# numscript_script_execution (3 classes)
## `Machine.Run`
`numscript_script_execution/machine-run.py:0`
## `Machine.Execute`
`numscript_script_execution/machine-execute.py:0`
## `Store implementation`
`numscript_script_execution/store-implementation.py:0`
FILE:references/components/pipeline_replication.md
# pipeline_replication (4 classes)
## `Manager.Run`
`pipeline_replication/manager-run.py:0`
## `PipelineHandler.Run`
`pipeline_replication/pipelinehandler-run.py:0`
## `Driver type`
`pipeline_replication/driver-type.py:0`
## `Pull interval`
`pipeline_replication/pull-interval.py:0`
FILE:references/components/transaction_processing.md
# transaction_processing (3 classes)
## `logProcessor.runTx`
`transaction_processing/logprocessor-runtx.py:0`
## `logProcessor.fetchLogWithIK`
`transaction_processing/logprocessor-fetchlogwithik.py:0`
## `Schema enforcement mode`
`transaction_processing/schema-enforcement-mode.py:0`
FILE:references/components/volume_accounting.md
# volume_accounting (1 class)
## `N/A`
`volume_accounting/n-a.py:0`
FILE:references/components/worker_fx_module.md
# worker_fx_module (2 classes)
## `NewFXModule`
`worker_fx_module/newfxmodule.py:0`
## `Feature toggles`
`worker_fx_module/feature-toggles.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-073-v5.3
version: v6.1
blueprint_id: finance-bp-073
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:26.836559+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- global
activities:
- accounting
upgraded_from: finance-bp-073-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:15.969288+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-073--ledger/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-073--ledger/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-ACCOUNTING-001
title: Using floating-point arithmetic for monetary amounts
description: Representing currency values with float64 or similar floating-point types causes precision loss during arithmetic
operations. Rounding errors accumulate over multiple transactions, leading to incorrect balance calculations and potential
financial losses. This violates the fundamental requirement that monetary calculations must be exact.
project_source: finance-bp-073--ledger, finance-bp-129--beancount
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-002
title: Skipping initialization calls before VM/script execution
description: Executing Numscript VM without first calling ResolveResources() and ResolveBalances() causes panics with ErrResourcesNotInitialized
or ErrBalancesNotInitialized. This prevents any script execution and leaves transactions in an unrunnable state, blocking
financial operations entirely.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-003
title: Mixing different asset types in monetary operations
description: Performing addition, subtraction, or take operations on amounts with different asset types produces invalid
financial calculations. This violates the fundamental accounting principle that amounts in different currencies cannot
be combined, leading to corrupted account balances and failed reconciliations.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-004
title: Missing insufficient funds validation
description: Failing to detect when account balance cannot cover a requested withdrawal or transfer allows overdrafts beyond
permitted limits. This causes real monetary losses, account balance violations, and potential regulatory compliance issues
in global markets.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-005
title: Non-atomic transaction commit/rollback
description: Processing database operations without atomic commit/rollback leaves partial state when failures occur. This
corrupts account balances and volumes, violating double-entry bookkeeping integrity and making audit trails unreliable
for global regulatory compliance.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-006
title: On-demand posting generation causing double-spending
description: Computing postings on-demand rather than accumulating them during transaction execution fails to track already-spent
funds within the same transaction. This creates double-spending vulnerabilities that violate atomic transaction semantics
and can result in significant financial losses.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-007
title: Log insertion after transaction commit breaking event sourcing
description: Committing the transaction before inserting the audit log breaks the event sourcing pattern fundamental to
accounting integrity. This makes it impossible to rebuild state from logs and violates audit requirements necessary for
global financial compliance.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-008
title: Incomplete transaction log hash chaining
description: Computing log hashes without including the previous log hash breaks the immutable audit trail chain. This allows
undetected tampering with historical transaction records, compromising financial integrity and regulatory audit compliance.
project_source: finance-bp-073--ledger
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-009
title: Incorrect row data access patterns on query results
description: Using dictionary notation (row['column_name']) on namedtuple query results raises TypeError since namedtuples
only support attribute access. This breaks all module queries expecting attribute-style access, causing asset allocation,
tax loss harvesting, and other critical financial computations to fail.
project_source: finance-bp-078--fava_investor
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-010
title: Missing bidirectional inference for fund relationship declarations
description: When relationship A→B is declared but B→A is not inferred, the TLH partner list becomes incomplete. This leads
to suboptimal tax-loss harvesting decisions where only some funds show all valid swap options, reducing potential tax
savings for investors.
project_source: finance-bp-078--fava_investor
severity: medium
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-011
title: Wash sale comparison within substantially identical groups
description: Comparing a ticker to itself in its own substantially identical group falsely triggers wash sale warnings.
This incorrectly blocks valid tax-loss harvesting transactions, causing investors to miss opportunities to realize tax
losses and offset capital gains.
project_source: finance-bp-078--fava_investor
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-012
title: Missing substantially identical tickers in wash sale queries
description: Omitting substantially identical fund tickers from the wash sale comparison set allows purchases of similar
funds within the 30-day window. This triggers unintended wash sales that disallow tax loss claims on subsequent sales
of the original position.
project_source: finance-bp-078--fava_investor
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-013
title: Using parsed entries with MISSING sentinel values for calculations
description: Using parsed entries directly that contain MISSING sentinel values for balance or cost computations causes
runtime errors or silent zero-value calculations. This results in incorrect portfolio valuations and reconciliation failures,
compromising financial reporting accuracy.
project_source: finance-bp-129--beancount
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-014
title: Underspecified interpolation with multiple missing values per currency
description: Having more than one missing value per currency group creates an underdetermined system with no unique solution
during interpolation. This causes InterpolationError and transaction failure, blocking balance calculations for affected
accounts.
project_source: finance-bp-129--beancount
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
- id: AP-ACCOUNTING-015
title: Violating accounting identity in opening balance transactions
description: Creating opening balance transactions where the total balance of summarized entries does not equal exactly
zero violates the fundamental accounting identity (Assets = Liabilities + Equity). This causes the balance sheet to be
fundamentally incorrect with non-zero total assets and liabilities.
project_source: finance-bp-129--beancount
severity: high
applicable_to_tags:
markets:
- global
activities:
- accounting
_source_file: anti-patterns/accounting.yaml
cross_project_wisdom:
- wisdom_id: CW-ACCOUNTING-001
source_project: finance-bp-073--ledger, finance-bp-129--beancount
pattern_name: Use exact-precision integer types for monetary representation
description: Both the Numscript ledger and the Beancount parser mandate using Decimal (beancount) or MonetaryInt based on big.Int
(ledger) instead of floating-point. This pattern ensures no rounding errors accumulate in financial calculations, critical
for audit compliance in global markets.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-002
source_project: finance-bp-073--ledger
pattern_name: Mandatory initialization sequence before execution
description: 'The Numscript VM requires a strict initialization sequence: ResolveResources() then ResolveBalances() must
both be called before Execute(). Skipping any step causes panics. This teaches that VM/script execution requires careful
state setup—always verify prerequisites before running financial logic.'
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-003
source_project: finance-bp-073--ledger
pattern_name: Dual idempotency key strategy
description: 'Using both IdempotencyKey and IdempotencyHash together ensures robust duplicate detection: IdempotencyKey
prevents exact retries while IdempotencyHash catches retries with different input parameters that would otherwise incorrectly
succeed. Single-key approaches leave gaps in financial transaction safety.'
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-004
source_project: finance-bp-073--ledger
pattern_name: Log-before-commit event sourcing pattern
description: In the transaction processing pipeline, the log must be inserted before committing the transaction to maintain
event sourcing integrity. This ensures the audit trail can always reconstruct state and supports rollback scenarios, critical
for regulatory compliance in global accounting.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-005
source_project: finance-bp-073--ledger
pattern_name: Read Committed isolation with FOR UPDATE locks
description: When implementing balance operations, use Read Committed isolation level combined with FOR UPDATE row locks.
This prevents concurrent transactions from creating inconsistent balances (e.g., both succeeding when they should fail
due to insufficient funds), ensuring data integrity under concurrent load.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-006
source_project: finance-bp-078--fava_investor
pattern_name: Transitive closure for equivalence relationships
description: When building commodity groups or substantially identical fund relationships, apply transitive closure to infer
complete equivalence. If A equals B and B equals C, then A, B, and C form one group. This ensures wash sale detection
and TLH calculations are complete and accurate across all declared relationships.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-007
source_project: finance-bp-078--fava_investor
pattern_name: Canonical representative selection for relationship groups
description: When selecting a representative for a substantially identical fund group, always return the same representative
ticker for any member of that group. Inconsistent representative selection causes non-deterministic calculations where
the same ticker gets different partners depending on which group member is queried.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-008
source_project: finance-bp-129--beancount
pattern_name: Immutable monetary objects with __slots__
description: Constructing Amount or Position objects using immutable Decimal values with __slots__ = () pattern prevents
accidental mutation of monetary values after creation. This immutability ensures financial calculations remain consistent
throughout transaction processing and audit trails.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-009
source_project: finance-bp-129--beancount
pattern_name: Eliminate all MISSING values before presenting parsed data as complete
description: Parsed entries with MISSING sentinel values are incomplete and cannot be used for financial reporting. All
MISSING values must be resolved through booking and interpolation before claiming parsed entries are ready for balance
calculations or realized/unrealized gains computation.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
- wisdom_id: CW-ACCOUNTING-010
source_project: finance-bp-078--fava_investor, finance-bp-129--beancount
pattern_name: Strict schema compatibility across class hierarchies
description: When extending base classes with additional functionality (like ScaledNAV extending RelateTickers), maintain
compatibility with existing metadata schemas. Schema divergence causes extended classes to miss relationships declared
for the base class, breaking wash sale detection and TLH recommendations.
applicable_to_activity: accounting
_source_file: cross-project-wisdom/accounting.yaml
domain_constraints_injected: []
resources_injected: {}
component_capability_map:
project: finance-bp-073--ledger
scan_date: '2026-04-22'
stats:
total_files: 10
total_classes: 24
total_functions: 0
total_stages: 10
modules:
numscript_script_execution:
class_count: 3
stage_id: numscript_execution
stage_order: 1
responsibility: Compiles and executes Numscript DSL programs to generate double-entry postings with balance tracking.
This stage is the entry point for financial logic, converting human-readable scripts into executable postings that
move funds between accounts.
classes:
- name: Machine.Run
file: numscript_script_execution/machine-run.py
line: 0
kind: required_method
signature: ''
- name: Machine.Execute
file: numscript_script_execution/machine-execute.py
line: 0
kind: required_method
signature: ''
- name: Store implementation
file: numscript_script_execution/store-implementation.py
line: 0
kind: replaceable_point
design_decision_count: 5
transaction_processing:
class_count: 3
stage_id: transaction_processing
stage_order: 2
responsibility: Validates, persists, and tracks transactions with idempotency and schema enforcement. This stage ensures
all financial operations are recorded atomically with full audit support.
classes:
- name: logProcessor.runTx
file: transaction_processing/logprocessor-runtx.py
line: 0
kind: required_method
signature: ''
- name: logProcessor.fetchLogWithIK
file: transaction_processing/logprocessor-fetchlogwithik.py
line: 0
kind: required_method
signature: ''
- name: Schema enforcement mode
file: transaction_processing/schema-enforcement-mode.py
line: 0
kind: replaceable_point
design_decision_count: 5
volume_accounting:
class_count: 1
stage_id: volume_accounting
stage_order: 3
responsibility: Tracks Input/Output volumes per account/asset for balance calculation. Provides the foundation for all
balance queries and ensures financial integrity through double-entry bookkeeping.
classes:
- name: N/A
file: volume_accounting/n-a.py
line: 0
kind: required_method
signature: ''
design_decision_count: 4
log_chain_creation:
class_count: 1
stage_id: log_creation
stage_order: 4
responsibility: Creates immutable, cryptographically chained logs for audit trail. Ensures tamper-evidence through hash
chaining where each log includes the previous log's hash.
classes:
- name: N/A
file: log_chain_creation/n-a.py
line: 0
kind: required_method
signature: ''
design_decision_count: 4
chart/schema_enforcement:
class_count: 3
stage_id: schema_enforcement
stage_order: 5
responsibility: Validates account addresses and postings against hierarchical chart of accounts. Enables controlled
account structure with parameterized patterns and default metadata inheritance.
classes:
- name: ChartOfAccounts.ValidatePosting
file: chart/schema_enforcement/chartofaccounts-validateposting.py
line: 0
kind: required_method
signature: ''
- name: ChartOfAccounts.FindAccountSchema
file: chart/schema_enforcement/chartofaccounts-findaccountschema.py
line: 0
kind: required_method
signature: ''
- name: Chart pattern engine
file: chart/schema_enforcement/chart-pattern-engine.py
line: 0
kind: replaceable_point
design_decision_count: 3
controller_store:
class_count: 2
stage_id: controller_store
stage_order: 6
responsibility: DB-backed store implementing Controller interface with transactions and locking. Provides data access
layer with PostgreSQL transactions, advisory locks, and pagination.
classes:
- name: Store.GetBalances
file: controller_store/store-getbalances.py
line: 0
kind: required_method
signature: ''
- name: Store.LockLedger
file: controller_store/store-lockledger.py
line: 0
kind: required_method
signature: ''
design_decision_count: 4
async_block_hashing:
class_count: 3
stage_id: async_block_hashing
stage_order: 7
responsibility: Periodic background job that groups logs into blocks and computes cumulative hashes. Provides batched
block creation for efficient querying and additional integrity verification.
classes:
- name: AsyncBlockRunner.Run
file: async_block_hashing/asyncblockrunner-run.py
line: 0
kind: required_method
signature: ''
- name: Block size
file: async_block_hashing/block-size.py
line: 0
kind: replaceable_point
- name: Schedule
file: async_block_hashing/schedule.py
line: 0
kind: replaceable_point
design_decision_count: 4
pipeline_replication:
class_count: 4
stage_id: pipeline_replication
stage_order: 8
responsibility: Continuous log streaming to external systems (Kafka, HTTP, OTLP) via driver plugins. Enables horizontal
scaling by allowing replicas to consume logs at their own pace.
classes:
- name: Manager.Run
file: pipeline_replication/manager-run.py
line: 0
kind: required_method
signature: ''
- name: PipelineHandler.Run
file: pipeline_replication/pipelinehandler-run.py
line: 0
kind: required_method
signature: ''
- name: Driver type
file: pipeline_replication/driver-type.py
line: 0
kind: replaceable_point
- name: Pull interval
file: pipeline_replication/pull-interval.py
line: 0
kind: replaceable_point
design_decision_count: 5
bucket_cleanup:
class_count: 2
stage_id: bucket_cleanup
stage_order: 9
responsibility: Periodic cleanup of soft-deleted buckets after retention period expires. Provides recovery window for
accidentally deleted ledgers while ensuring eventual cleanup.
classes:
- name: BucketCleanupRunner.Run
file: bucket_cleanup/bucketcleanuprunner-run.py
line: 0
kind: required_method
signature: ''
- name: Retention period
file: bucket_cleanup/retention-period.py
line: 0
kind: replaceable_point
design_decision_count: 3
worker_fx_module:
class_count: 2
stage_id: worker_fx_module
stage_order: 10
responsibility: Wires together all worker components via Uber FX dependency injection. Provides cohesive worker binary
combining async block hashing, replication, and cleanup workers.
classes:
- name: NewFXModule
file: worker_fx_module/newfxmodule.py
line: 0
kind: required_method
signature: ''
- name: Feature toggles
file: worker_fx_module/feature-toggles.py
line: 0
kind: replaceable_point
design_decision_count: 4
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.8586956521739131
evidence_invalid: 13
evidence_verified: 79
evidence_auto_fixed: 0
audit_coverage: 22/22 (100%)
audit_pass_rate: 12/22 (54%)
audit_fail_total: 0
audit_finance_universal:
pass: 8
warn: 6
fail: 0
audit_subdomain_totals:
pass: 4
warn: 4
fail: 0
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-073. Evidence verify ratio
= 85.9% and audit fail total = 0. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-073-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc: []
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries: []
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 99
fatal_constraints_count: 82
non_fatal_constraints_count: 149
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'Don''t forget SL-11: a Recorder subclass MUST declare both provider and data_schema class attributes;
otherwise initialization fails with an assertion error (finance-C-001 fatal violation).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9) — chose standard Appel parameters not adaptive because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering not current-only because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose drawer_rects subclass override for custom annotations not hardcoded markers — allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 16 source groups: async_block_hashing(3),
bucket_cleanup(2), configuration(9), controller_store(5), data_model(13), data_retention(1), and 10 more.'
key_decisions: 99 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-020
type: BA
summary: Block hashing runs asynchronously via PostgreSQL stored procedure in CRON-scheduled worker
- id: BD-021
type: M
summary: CRON-based scheduling with configurable interval controls block creation frequency
- id: BD-022
type: BA
summary: Feature flag controls per-ledger opt-in for block hashing feature
- id: BD-026
type: B/BA
summary: Soft delete + 30-day delayed hard delete pattern for bucket cleanup
- id: BD-027
type: T
summary: Bucket cleanup continues on individual failure to avoid blocking other cleanups
- id: BD-039
type: T
summary: Ledger names must match pattern '^[0-9a-zA-Z_-]{1,63}$' - alphanumeric with hyphens/underscores, max 63 characters
- id: BD-040
type: B/DK
summary: 'Reserved ledger names: ''_'', ''_info'', ''_healthcheck'' cannot be created by users'
- id: BD-041
type: B/BA
summary: Default bucket name is '_default' when not explicitly specified
- id: BD-046
type: B/BA
summary: MOVES_HISTORY feature defaults to ON - funds movements are tracked per account/asset
- id: BD-047
type: B/BA
summary: HASH_LOGS feature defaults to SYNC - logs are hashed synchronously on write
- id: BD-051
type: B/RC
summary: 'Schema enforcement has two modes: Strict (fail) and Audit (warn)'
- id: BD-055
type: T
summary: Bulk parallel execution capped at 10 workers
- id: BD-057
type: T
summary: Async block hasher runs on cron schedule with configurable MaxBlockSize
- id: BD-059
type: T
summary: Parallel bucket migrations default to 10 workers with 5-second retry
- id: BD-018
type: B
summary: Advisory lock on ledger ID using hashtext prevents concurrent modifications
- id: BD-019
type: B
summary: GetBalances returns locked balances for transaction duration to prevent double-spend
- id: BD-035
type: B/RC
summary: Controller upserts accounts after log is created; log exists but accounts may be missing on failure
- id: BD-95
type: B/BA
summary: 'INTERACTION: [BD-035] × [BD-005] → Account upsert after log creation can leave orphaned logs with missing
account metadata in log-first architecture'
- id: BD-99
type: B/BA
summary: 'INTERACTION: [BD-018] × [BD-019] × [BD-007] → Advisory lock serialization combined with deadlock retry creates
performance bottleneck under contention; high contention amplifies retry delays exponenti'
- id: BD-042
type: B
summary: 'Transactions have dual volume tracking: PostCommitVolumes (immutable) and PostCommitEffectiveVolumes (updated
on past-dated inserts)'
- id: BD-043
type: M
summary: Balance is calculated as Input minus Output (not cumulative sum)
- id: BD-044
type: T
summary: Account addresses use ':' separator for chart segment hierarchy
- id: BD-062
type: T
summary: Reversion metadata uses 'com.formance.spec/state/reverts' key
- id: BD-064
type: B/RC
summary: Chart segment validation regex allows '$' prefix for variable segments
- id: BD-071
type: M/BA
summary: Monetary amounts use big.Int for arbitrary precision
- id: BD-073
type: M
summary: Balance calculation uses Input - Output formula via big.Int subtraction
- id: BD-075
type: M
summary: SHA256 hashing for log chain integrity and idempotency signatures
- id: BD-076
type: M/BA
summary: Arbitrary precision integers (big.Int) for all financial amounts
- id: BD-078
type: M
summary: PostCommitVolumes tracks cumulative account/asset volumes after transaction
- id: BD-079
type: M
summary: ComputePostCommitEffectiveVolumes finds most recent move per account/asset
- id: BD-080
type: M
summary: JSON encoder feeds canonical data to hash function for reproducible signatures
- id: BD-089
type: M/BA
summary: 'Log ID chaining: next ID = previous.ID + 1'
- id: BD-058
type: B/BA
summary: Bucket cleanup uses retention period - hard delete after cutoff time
- id: BD-93
type: B/B
summary: 'INTERACTION: [BD-047] × [BD-020] → SYNC hashing default contradicts async block hasher worker - SYNC means
immediate hashing but async CRON worker performs batch hashing separately'
- id: BD-012
type: B/RC
summary: SHA256 hash chain includes previous log hash for tamper-evident log chain
- id: BD-013
type: BA
summary: Memento pattern excludes computed postCommitVolumes from hash computation
- id: BD-014
type: T
summary: Log types are Go integer constants, not strings, for type safety
- id: BD-001
type: T
summary: VM uses opcode-based instruction set (bytecode) instead of direct AST interpretation
- id: BD-002
type: BA
summary: Store interface abstracts balance queries, allowing multiple implementations
- id: BD-003
type: B/BA
summary: Funding model tracks source parts with FIFO allocation for partial credit/debit
- id: BD-004
type: M
summary: Postings accumulate during execution in single pass, preventing double-spending
- id: BD-031
type: B/BA
summary: 'VM execution requires strict phase ordering: SetVarsFromJSON → ResolveResources → ResolveBalances → Execute'
- id: BD-034
type: B
summary: Funding concatenation merges adjacent funding parts with same account into single part
- id: BD-036
type: B
summary: TAKE opcode rejects mismatched assets but not negative amounts; TAKE_MAX rejects both
- id: BD-037
type: T
summary: StaticStore enables in-memory VM testing without database dependency
- id: BD-050
type: B
summary: Transactions support DryRun mode - execute without committing
- id: BD-052
type: B
summary: RevertTransaction supports Force flag - allow negative balances after revert
- id: BD-053
type: B/RC
summary: RevertTransaction supports AtEffectiveDate flag - use original timestamp
- id: BD-054
type: B/BA
summary: Bulk operations default to atomic mode - all succeed or all fail
- id: BD-067
type: B/RC
summary: Funding.Take processes parts in order and merges consecutive same-account parts
- id: BD-072
type: B
summary: Transaction reversal creates new transaction with swapped source/destination and negated amounts
- id: BD-083
type: B/M
summary: Transaction reversal swaps source/destination and negates via reverse postings
- id: BD-085
type: M
summary: SQL ON CONFLICT upsert for volume accumulation on duplicate accounts
- id: BD-090
type: M
summary: Volume updates per posting add amounts using big.Int.Add()
- id: BD-091
type: M
summary: SubtractPostings negates posting amounts for pre-commit volume calculation
- id: BD-94
type: B/B
summary: 'INTERACTION: [BD-036] × [BD-045] → TAKE opcode allows negative amounts silently but posting validation rejects
them, creating inconsistent enforcement boundaries'
- id: BD-97
type: B/BA
summary: 'INTERACTION: [BD-003] × [BD-067] → Funding FIFO ordering assumption breaks when concat merges same-account
parts, potentially corrupting fund allocation semantics'
- id: BD-023
type: B
summary: Pipeline replication uses pull-based polling with cursor stored as lastLogID
- id: BD-024
type: M
summary: Push retry with exponential backoff prevents thundering herd on downstream failures
- id: BD-025
type: T
summary: Driver factory pattern enables plugin architecture for new exporter types
- id: BD-038
type: M
summary: Pagination with configurable page size prevents unbounded queries in replication
- id: BD-068
type: T
summary: Replication pipeline pulls logs every 10s with 100 log page size
- id: BD-069
type: T
summary: Exporters are lazily stopped when no pipelines use them
- id: BD-070
type: B
summary: Pipeline reset re-enables pipeline with lastLogID set to nil
- id: BD-065
type: T
summary: Variable placeholders use 'variable' syntax in filter templates
- id: BD-066
type: B/BA
summary: 'Default query pagination: sort desc by ID, max page size enforced'
- id: BD-074
type: M
summary: SQL SUM aggregation for volume totals across accounts/assets
- id: BD-081
type: T
summary: Template variable extraction uses regex pattern ^([a-z_]+)$
- id: BD-082
type: M
summary: SQL first_value window function for point-in-time volume queries
- id: BD-087
type: M
summary: SQL DISTINCT ON for deduplicating account/asset volume queries
- id: BD-088
type: T
summary: Variable substitution falls back from json.Number to float64 then big.Float to big.Int
- id: BD-092
type: T
summary: 'Variable name parsing: lowercase letter start, alphanum or underscore tail'
- id: BD-015
type: B
summary: Account addresses use hierarchical colon-separated format (e.g., bank:checking:alice)
- id: BD-016
type: BA
summary: Variable account segments use regex patterns for dynamic validation
- id: BD-017
type: BA
summary: Schema versioning allows multiple schema versions to coexist during migration
- id: BD-005
type: B
summary: 'Log-first architecture: all state changes produce immutable logs as source of truth'
- id: BD-006
type: B/BA
summary: Idempotency enforced via IdempotencyKey + IdempotencyHash validation
- id: BD-007
type: DK
summary: Deadlock retry with exponential backoff handles PostgreSQL deadlock errors
- id: BD-008
type: B/BA
summary: 'Schema validation has three enforcement modes: strict, warn, and audit'
- id: BD-032
type: B
summary: Postings.Reverse() swaps source/destination AND reverses array order for transaction reversal
- id: BD-98
type: B/B
summary: 'INTERACTION: [BD-006] × [BD-049] → Idempotency hash depends on idempotency key''s uniqueness guarantee, creating
circular dependency where hash collision detection requires hash to already exist'
- id: BD-045
type: B
summary: Posting amounts must be non-negative (zero allowed)
- id: BD-048
type: B
summary: Transaction references must be unique per ledger
- id: BD-049
type: B
summary: Idempotency keys include hash of input to detect conflicting parameters
- id: BD-056
type: B
summary: Atomic and Parallel bulk options are mutually exclusive
- id: BD-060
type: B
summary: Importer rejects logs if ledger is not empty or logs are out of order
- id: BD-061
type: B/BA
summary: World account is the system root - not subject to insufficient funds check
- id: BD-063
type: B
summary: Reverted transactions cannot be reverted again (ErrAlreadyReverted)
- id: BD-077
type: B/BA
summary: Numeric variables must be integers only - no decimal values allowed
- id: BD-084
type: B
summary: Posting amounts must be non-negative (>= 0)
- id: BD-086
type: T
summary: 'Decimal validation: math.Floor(v) == v for float64 variable values'
- id: BD-009
type: B
summary: Balance = Input - Output computed field, never stored separately
- id: BD-010
type: BA
summary: PostCommitVolumes computed in-flight from logs, never persisted
- id: BD-011
type: B
summary: WORLD account is special sink/source for external asset flows
- id: BD-033
type: B/RC
summary: PostCommitEffectiveVolumes computed by transaction timestamp, not insertion order
- id: BD-96
type: B/BA
summary: 'INTERACTION: [BD-042] × [BD-033] × [BD-010] → Dual volume tracking with timestamp-based ordering causes effective
volumes to retroactively change on backdated inserts, cascading to all historical bal'
- id: BD-028
type: T
summary: FX dependency injection modules compose all worker components
- id: BD-029
type: T
summary: FX lifecycle hooks ensure proper startup/shutdown ordering for all workers
- id: BD-030
type: T
summary: Graceful shutdown via stop channels for clean goroutine termination
resources:
packages:
- name: github.com/jackc/pgx/v5
version_pin: latest
- name: github.com/uptrace/bun
version_pin: latest
- name: github.com/formancehq/go-libs/v4
version_pin: latest
- name: github.com/nats-io/nats.go
version_pin: latest
- name: github.com/formancehq/numscript
version_pin: latest
- name: go.opentelemetry.io/otel
version_pin: latest
- name: github.com/ThreeDotsLabs/watermill
version_pin: latest
- name: github.com/ClickHouse/clickhouse-go/v2
version_pin: latest
- name: github.com/olivere/elastic/v7
version_pin: latest
- name: github.com/spf13/viper
version_pin: latest
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- go get github.com/jackc/pgx/v5
- go get github.com/uptrace/bun
- go get github.com/formancehq/go-libs/v4
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When implementing Numscript monetary calculations
action: use MonetaryInt based on *big.Int to represent monetary amounts
severity: fatal
kind: domain_rule
modality: must
consequence: Using float64 for monetary amounts causes rounding errors, leading to incorrect balance calculations and
potential financial losses due to precision loss in currency computations
stage_ids:
- numscript_execution
- id: finance-C-002
when: When executing Numscript VM
action: call ResolveResources() before Execute() to initialize variables and constants
severity: fatal
kind: domain_rule
modality: must
consequence: Execute panics with ErrResourcesNotInitialized when called before resource resolution, preventing any script
execution
stage_ids:
- numscript_execution
- id: finance-C-003
when: When executing Numscript VM
action: call ResolveBalances() before Execute() to populate account balances
severity: fatal
kind: domain_rule
modality: must
consequence: Execute panics with ErrBalancesNotInitialized when called before balance resolution, causing script execution
to fail
stage_ids:
- numscript_execution
- id: finance-C-004
when: When performing monetary arithmetic operations
action: mix different asset types in addition, subtraction, or take operations
severity: fatal
kind: domain_rule
modality: must_not
consequence: Cross-asset monetary operations produce invalid financial calculations, violating the fundamental principle
that amounts in different currencies cannot be combined
stage_ids:
- numscript_execution
- id: finance-C-005
when: When parsing portion values in Numscript
action: validate portions are within 0% to 100% range inclusive
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid portion values outside 0-100% range cause incorrect allocation calculations, distributing more or
less than 100% of funds
stage_ids:
- numscript_execution
- id: finance-C-009
when: When implementing balance tracking during execution
action: request negative balance values from the Store
severity: fatal
kind: resource_boundary
modality: must_not
consequence: Negative balance values returned from Store indicate data corruption, causing the VM to reject the transaction
and fail
stage_ids:
- numscript_execution
- id: finance-C-011
when: When handling insufficient funds scenario
action: return ErrInsufficientFund when account balance cannot cover the requested amount
severity: fatal
kind: domain_rule
modality: must
consequence: Failing to detect insufficient funds allows overdraft beyond permitted limits, causing real monetary losses
and account balance violations
stage_ids:
- numscript_execution
- id: finance-C-013
when: When processing send operations in Numscript
action: accumulate postings during execution rather than computing on-demand
severity: fatal
kind: architecture_guardrail
modality: must
consequence: On-demand posting generation risks double-spending by not tracking already-spent funds within the same transaction,
violating atomic transaction semantics
stage_ids:
- numscript_execution
- id: finance-C-020
when: When assembling multiple fundings
action: verify all fundings have matching asset types
severity: fatal
kind: domain_rule
modality: must
consequence: Assembling fundings with different assets produces invalid Funding structures that corrupt subsequent monetary
operations
stage_ids:
- numscript_execution
- id: finance-C-022
when: When implementing idempotency for transaction processing
action: use both IdempotencyKey and IdempotencyHash for duplicate detection
severity: fatal
kind: domain_rule
modality: must
consequence: Without IdempotencyHash validation, retries with different input parameters will incorrectly succeed, causing
duplicate or inconsistent financial transactions to be recorded in the ledger
stage_ids:
- transaction_processing
- id: finance-C-023
when: When implementing transaction processing
action: perform atomic commit/rollback for database operations
severity: fatal
kind: domain_rule
modality: must
consequence: Failed transactions will leave partial state in the database, causing inconsistent account balances and volumes
that violate double-entry bookkeeping integrity
stage_ids:
- transaction_processing
- id: finance-C-025
when: When implementing log hash computation
action: chain hashes by including previous log hash in current log hash computation
severity: fatal
kind: domain_rule
modality: must
consequence: Without hash chaining, the immutable audit trail can be compromised, allowing undetected tampering with historical
transaction records
stage_ids:
- transaction_processing
- id: finance-C-033
when: When implementing transaction processing pipeline
action: insert log before committing transaction to maintain event sourcing integrity
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Committing before log insertion breaks the event sourcing pattern, making it impossible to rebuild state
from logs and violating audit requirements
stage_ids:
- transaction_processing
- id: finance-C-035
when: When implementing transaction log processing
action: use Read Committed isolation level and FOR UPDATE locks for balance operations
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Without proper locking, concurrent transactions can create inconsistent balances (e.g., -400 USD when two
200 USD transactions both succeed with -200 limit)
stage_ids:
- transaction_processing
- id: finance-C-041
when: When implementing balance calculation for an account/asset pair
action: calculate balance as Input minus Output using big.Int subtraction
severity: fatal
kind: domain_rule
modality: must
consequence: Storing Balance as a separate field creates synchronization risk where Input, Output, and Balance can diverge,
causing incorrect account balances and financial integrity violations
stage_ids:
- volume_accounting
- id: finance-C-042
when: When reading PostCommitVolumes from a committed transaction
action: modify or store PostCommitVolumes, since it is computed in-flight
severity: fatal
kind: domain_rule
modality: must_not
consequence: Storing PostCommitVolumes introduces stale data risk where query-time values differ from committed state,
breaking the single source of truth principle
stage_ids:
- volume_accounting
- id: finance-C-043
when: When creating a posting with a monetary amount
action: allow negative amounts in postings
severity: fatal
kind: domain_rule
modality: must_not
consequence: Negative amounts allow reversal transactions without proper double-entry bookkeeping, potentially creating
invalid financial states where money is created from nothing
stage_ids:
- volume_accounting
- id: finance-C-044
when: When tracking Input and Output volumes for an account
action: use big.Int type for all volume calculations to maintain integer precision
severity: fatal
kind: domain_rule
modality: must
consequence: Using floating-point types for monetary calculations introduces rounding errors that compound across transactions,
leading to penny-level discrepancies in final balances
stage_ids:
- volume_accounting
- id: finance-C-045
when: When executing concurrent transactions on the same bounded source accounts
action: acquire database row locks using FOR UPDATE before reading balances
severity: fatal
kind: domain_rule
modality: must
consequence: Without FOR UPDATE locking, concurrent transactions can both read the same balance and overdraw an account
beyond its overdraft limit, violating business rules and causing financial losses
stage_ids:
- volume_accounting
- id: finance-C-048
when: When the accounts_volumes table has no rows for a newly queried account
action: insert a placeholder row with zero volumes before acquiring locks to prevent lock-free reads
severity: fatal
kind: domain_rule
modality: must
consequence: Without placeholder insertion, concurrent transactions can both pass overdraft checks on never-used accounts,
creating balances that violate overdraft limits
stage_ids:
- volume_accounting
- id: finance-C-052
when: When computing log hash for tamper-evident chain
action: include the previous log's hash in the SHA256 digest before the current log's data
severity: fatal
kind: domain_rule
modality: must
consequence: Without including previous hash, the chain is broken and any log modification won't be detectable through
hash comparison, eliminating the tamper-evident property of the audit trail
stage_ids:
- log_creation
- id: finance-C-054
when: When creating new log IDs
action: use a database sequence (nextval) for log ID generation to ensure strictly increasing IDs with no gaps
severity: fatal
kind: domain_rule
modality: must
consequence: Generating IDs at the application layer creates race conditions and potential gaps in the sequence, breaking
the audit trail's completeness requirement
stage_ids:
- log_creation
- id: finance-C-059
when: When implementing concurrent log insertion
action: acquire advisory lock on the ledger before reading previous log to prevent race conditions
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Without proper locking, concurrent transactions may read the same 'previous log', compute identical hashes,
and create divergent chains that cannot be merged
stage_ids:
- log_creation
- id: finance-C-060
when: When inserting logs with HASH_LOGS in SYNC mode
action: verify that the returned hash matches the computed hash before considering insertion successful
severity: fatal
kind: architecture_guardrail
modality: must
consequence: If hash mismatch occurs after insert, the audit trail is compromised without immediate detection, potentially
allowing tampered logs to be committed
stage_ids:
- log_creation
- id: finance-C-064
when: When defining chart of accounts segment names
action: use only alphanumeric characters, underscores, and hyphens, with an optional $ or . prefix, matching the regex ^(\$|\.)?[a-zA-Z0-9_-]+$
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid segment names cause chart parsing to fail, preventing the chart of accounts from being loaded and
blocking all account validations
stage_ids:
- schema_enforcement
- id: finance-C-066
when: When defining variable segment patterns in chart
action: provide valid regex patterns that compile successfully via regexp.Compile
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid regex patterns cause chart loading to fail with 'invalid pattern regex' error, preventing schema
enforcement from functioning
stage_ids:
- schema_enforcement
- id: finance-C-068
when: When defining chart structure at root level
action: define variable segments or .self (account) at the root level of chart
severity: fatal
kind: domain_rule
modality: must_not
consequence: Root-level accounts or variable segments cause chart parsing errors, preventing the ledger schema from loading
stage_ids:
- schema_enforcement
- id: finance-C-069
when: When placing patterns on chart segments
action: attach .pattern property to fixed (non-variable) segments
severity: fatal
kind: domain_rule
modality: must_not
consequence: Fixed segments with patterns cause chart loading to fail with 'cannot have a pattern on a fixed segment'
error
stage_ids:
- schema_enforcement
- id: finance-C-070
when: When using chart pattern engine backend
action: provide account validation logic that implements the FindAccountSchema and ValidatePosting interface contracts
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Non-conforming pattern engines break ValidateWithSchema calls, causing schema enforcement to fail silently
or throw exceptions
stage_ids:
- schema_enforcement
- id: finance-C-071
when: When processing postings against schema
action: call Chart.ValidatePosting for each posting before committing transactions
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Skipping ValidatePosting allows invalid account addresses to pass through, corrupting the ledger with accounts
outside the defined chart
stage_ids:
- schema_enforcement
- id: finance-C-073
when: When running in strict schema enforcement mode
action: block log insertion when schema validation fails instead of logging warnings
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Schema violations slip through in strict mode if not blocked, defeating compliance requirements and allowing
invalid data
stage_ids:
- schema_enforcement
- id: finance-C-074
when: When inserting new schemas
action: verify chart of accounts is present in schema definition
severity: fatal
kind: domain_rule
modality: must
consequence: Missing chart causes NewSchema to return 'missing chart of accounts' error, blocking schema registration
stage_ids:
- schema_enforcement
- id: finance-C-075
when: When validating postings against chart
action: check both source and destination accounts match defined chart patterns
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Only validating one side allows invalid accounts on the unchecked side, enabling transactions from/to undefined
accounts
stage_ids:
- schema_enforcement
- id: finance-C-083
when: When defining two variable segments with same prefix
action: define multiple variable segments under the same parent segment (e.g., $userID and $otherID)
severity: fatal
kind: domain_rule
modality: must_not
consequence: Chart loading fails with 'cannot have two variable segments with the same prefix' error
stage_ids:
- schema_enforcement
- id: finance-C-086
when: When using variable segment labels
action: prefix variable segment keys with $ character in chart JSON definition
severity: fatal
kind: domain_rule
modality: must
consequence: Missing $ prefix causes variable segment to be parsed as fixed segment, breaking dynamic account matching
stage_ids:
- schema_enforcement
- id: finance-C-087
when: When implementing balance queries with multiple account-asset pairs
action: sort accountsVolumes by account and asset in stable order before locking
severity: fatal
kind: domain_rule
modality: must
consequence: Without consistent ordering, concurrent transactions can deadlock when each acquires locks in different orders,
causing PostgreSQL to abort one transaction with ErrDeadlockDetected and potential data inconsistency
stage_ids:
- controller_store
- id: finance-C-088
when: When implementing balance queries within a SQL transaction
action: use SELECT FOR UPDATE to acquire row-level locks on balance rows
severity: fatal
kind: domain_rule
modality: must
consequence: Without FOR UPDATE locks, concurrent transactions can read the same balance before either commits, causing double-spend
vulnerabilities where the same funds are credited to multiple accounts
stage_ids:
- controller_store
- id: finance-C-090
when: When using the Store for concurrent write transactions
action: verify LockLedger is called before any GetBalances operation
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Without proper ledger-level advisory locking, concurrent transactions modifying the same ledger can cause
race conditions in balance updates and lead to inconsistent account states
stage_ids:
- controller_store
- id: finance-C-095
when: When using Store operations with PostgreSQL
action: expect Store to work with databases that lack pg_advisory_lock support
severity: fatal
kind: resource_boundary
modality: must_not
consequence: Store implementation relies on PostgreSQL-specific advisory lock functions that do not exist in MySQL, SQLite,
or other databases, causing runtime panics
stage_ids:
- controller_store
- id: finance-C-100
when: When implementing async block creation logic
action: chain block hashes with the previous block hash using the PostgreSQL digest function
severity: fatal
kind: domain_rule
modality: must
consequence: Block hash chain integrity is broken, allowing undetected tampering with log sequences in the ledger's audit
trail
stage_ids:
- async_block_hashing
- id: finance-C-101
when: When implementing async block creation logic
action: only include logs with ID greater than the previous block's max_log_id
severity: fatal
kind: domain_rule
modality: must
consequence: Duplicate logs are included in multiple blocks, causing incorrect hash chains and violating the immutability
of the audit trail
stage_ids:
- async_block_hashing
- id: finance-C-109
when: When computing block hashes in the PostgreSQL procedure
action: use SHA256 cryptographic hash via public.digest function for log aggregation
severity: fatal
kind: domain_rule
modality: must
consequence: Using a non-cryptographic or weak hash algorithm compromises the integrity verification properties of the
block chain
stage_ids:
- async_block_hashing
- id: finance-C-112
when: When implementing the replication pipeline log fetching logic
action: use ascending order (OrderAsc) when querying logs for export
severity: fatal
kind: domain_rule
modality: must
consequence: Logs delivered to external systems out of chronological order will cause data inconsistency in downstream
consumers, leading to incorrect financial state reconstruction
stage_ids:
- pipeline_replication
- id: finance-C-113
when: When implementing log export cursor management
action: update lastLogID only after successful export (exporter.Accept returns without error)
severity: fatal
kind: domain_rule
modality: must
consequence: Updating the cursor before a successful export causes logs to be skipped if the export fails; downstream systems
may miss critical financial events
stage_ids:
- pipeline_replication
- id: finance-C-116
when: When configuring batcher driver settings
action: set FlushInterval when MaxItems is 0 to ensure logs are flushed even with low traffic
severity: fatal
kind: resource_boundary
modality: must
consequence: Logs will never be delivered when MaxItems=0 and FlushInterval=0, causing permanent data loss in downstream
systems
stage_ids:
- pipeline_replication
- id: finance-C-120
when: When handling export failures in the pipeline
action: retry failed exports with configured PushRetryPeriod delay
severity: fatal
kind: operational_lesson
modality: must
consequence: Dropping logs on export failure causes permanent data loss; downstream systems miss critical financial events
stage_ids:
- pipeline_replication
- id: finance-C-122
when: When implementing pipeline crash recovery
action: resume from lastLogID stored in pipeline state
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Starting from the beginning after a crash causes duplicate log delivery, corrupting downstream financial data
stage_ids:
- pipeline_replication
- id: finance-C-128
when: When soft-deleting a bucket via DeleteBucket
action: Set deleted_at timestamp on all ledgers in that bucket
severity: fatal
kind: domain_rule
modality: must
consequence: Soft-deleted buckets will not be tracked for retention, causing immediate hard deletion instead of the intended
30-day recovery window
stage_ids:
- bucket_cleanup
- id: finance-C-129
when: When determining which buckets to hard delete
action: Only select buckets where deleted_at is older than the configured retention period
severity: fatal
kind: domain_rule
modality: must
consequence: Buckets deleted within the retention period will be permanently destroyed, preventing accidental deletion
recovery
stage_ids:
- bucket_cleanup
- id: finance-C-131
when: When configuring the bucket cleanup retention period
action: Set retention period to a value greater than zero
severity: fatal
kind: domain_rule
modality: must
consequence: Retention period of zero or negative causes immediate hard deletion, eliminating the recovery window for
accidentally deleted buckets
stage_ids:
- bucket_cleanup
- id: finance-C-134
when: When configuring the bucket cleanup schedule
action: Verify a valid cron schedule is configured
severity: fatal
kind: resource_boundary
modality: must
consequence: Missing or nil schedule causes worker validation to fail, preventing bucket cleanup from running
stage_ids:
- bucket_cleanup
- id: finance-C-140
when: When implementing WorkerConfiguration validation
action: enforce bucket cleanup retention period to be greater than zero
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid retention period configuration leads to unpredictable bucket cleanup behavior, potentially causing
premature deletion of buckets before data is safely retained
stage_ids:
- worker_fx_module
- id: finance-C-141
when: When implementing WorkerConfiguration validation
action: enforce bucket cleanup schedule to be configured (non-nil CRON spec)
severity: fatal
kind: domain_rule
modality: must
consequence: Missing bucket cleanup schedule causes the cleanup runner to fail on startup, leaving orphaned buckets in
the database indefinitely
stage_ids:
- worker_fx_module
- id: finance-C-142
when: When implementing worker goroutine lifecycle
action: use context.WithoutCancel to detach worker context from parent lifecycle
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Worker goroutines terminate unexpectedly when parent context is cancelled during startup or normal operation,
causing incomplete block hashing, replication failures, or bucket cleanup gaps
stage_ids:
- worker_fx_module
- id: finance-C-143
when: When implementing worker graceful shutdown
action: implement stop channel mechanism that confirms goroutine termination before returning
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Without confirmation, the stop call returns immediately while the goroutine continues running, causing resource
leaks, duplicate work during restart, and potential data corruption in concurrent operations
stage_ids:
- worker_fx_module
- id: finance-C-144
when: When registering worker lifecycle hooks via fx
action: attach OnStop hook to worker.Stop method for proper shutdown sequencing
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Workers continue running after FX signals shutdown, leading to orphaned goroutines, database connection leaks,
and inability to cleanly restart or redeploy the service
stage_ids:
- worker_fx_module
- id: finance-C-146
when: When implementing replication manager shutdown
action: wait for all pipeline goroutines to complete before returning from Stop
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Stop returns before pipelines finish processing, causing data loss, incomplete replication to external systems,
and goroutine leaks during shutdown
stage_ids:
- worker_fx_module
- id: finance-C-150
when: When wiring worker modules via Uber FX
action: compose storage.NewFXModule before worker.NewFXModule to ensure database readiness
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Workers attempt database operations before connection is ready, causing startup failures and inability to
hash blocks, replicate logs, or cleanup buckets
stage_ids:
- worker_fx_module
- id: finance-C-156
when: When implementing double-entry bookkeeping transactions in Formance Ledger
action: 'Verify postings balance: sum of all source amounts must equal sum of all destination amounts per transaction'
severity: fatal
kind: domain_rule
modality: must
consequence: An unbalanced transaction creates money from nothing or destroys money, violating fundamental accounting
principles and causing financial ledger inconsistency
- id: finance-C-157
when: When storing monetary amounts in the ledger
action: Use non-negative big integers (github.com/formancehq/go-libs/pkg/monetary) for all amounts
severity: fatal
kind: domain_rule
modality: must
consequence: Negative amounts would violate the ledger's non-negative invariant, potentially allowing overdraft exploitation
or invalid financial state
- id: finance-C-158
when: When chaining logs in the immutable audit trail
action: Compute each log's hash including the previous log's hash to create a cryptographically linked chain
severity: fatal
kind: domain_rule
modality: must
consequence: Breaking the log chain invalidates the audit trail integrity, allowing tampering with historical transactions
without detection
- id: finance-C-160
when: When computing account balances
action: Calculate balance as Input minus Output (computed dynamically, never stored)
severity: fatal
kind: domain_rule
modality: must
consequence: Storing balances would create consistency drift between the computed balance and the transaction history,
causing incorrect financial reporting
- id: finance-C-163
when: When creating or modifying transactions via the VM
action: 'Execute VM instructions in strict order: SetVarsFromJSON → ResolveResources → ResolveBalances → Execute'
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Skipping or reordering execution phases causes undefined VM state, leading to incorrect postings or runtime
errors
- id: finance-C-164
when: When processing ledger logs through the log processor
action: 'Follow strict transaction order: BeginTX → runLog (validateSchema) → InsertLog → Commit'
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Deviating from this order breaks the idempotency and audit trail consistency, potentially duplicating or
losing transactions
- id: finance-C-165
when: When handling idempotency in transaction processing
action: Use idempotency key combined with computed hash to uniquely identify and deduplicate requests
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Missing idempotency validation causes duplicate transactions, double-spending, and inconsistent ledger state
- id: finance-C-172
when: When deploying Formance Ledger in environments without PostgreSQL support
action: Claim Formance Ledger can run without PostgreSQL as its storage backend
severity: fatal
kind: claim_boundary
modality: must_not
consequence: PostgreSQL is the only supported storage layer; claiming otherwise leads to deployment failures and data
loss
- id: finance-C-174
when: When importing ledger data via the import functionality
action: Verify the target ledger is empty before import (no existing transactions)
severity: fatal
kind: operational_lesson
modality: must
consequence: Importing into a non-empty ledger causes transaction ID conflicts and inconsistent state, corrupting the
ledger's audit trail
- id: finance-C-175
when: When using concurrent transactions affecting the same bounded source accounts
action: Rely on PostgreSQL FOR UPDATE locks and the INSERT...ON CONFLICT pattern to prevent overdraft race conditions
severity: fatal
kind: operational_lesson
modality: must
consequence: Without proper locking, concurrent transactions can both pass overdraft checks and result in balance violations
(e.g., two transactions each for 200 USD when limit is -200 USD)
- id: finance-C-182
when: When implementing log creation and hash chain maintenance
action: Include the previous log hash in SHA256 computation to maintain a tamper-evident hash chain — each log must reference
its predecessor's hash
severity: fatal
kind: domain_rule
modality: must
consequence: Without previous log hash chaining, modifications to historical logs go undetected; this violates audit compliance
requirements and allows unauthorized changes to slip through without detection
derived_from_bd_id: BD-012
- id: finance-C-185
when: When implementing posting accumulation logic in transaction execution
action: Accumulate postings in a single pass during execution and validate that sum of postings equals zero before committing
— this prevents double-spending by ensuring balance invariant holds throughout
severity: fatal
kind: domain_rule
modality: must
consequence: Multi-pass execution or delayed balance validation allows double-spending scenarios where the same funds
are committed multiple times before detection; single-pass accumulation with immediate validation is required for correctness
derived_from_bd_id: BD-004
- id: finance-C-186
when: When implementing or modifying monetary amount handling in the data model
action: Use float64 or any fixed-precision decimal type for monetary amounts; all monetary calculations must use big.Int
for arbitrary-precision integer arithmetic
severity: fatal
kind: domain_rule
modality: must_not
consequence: float64 introduces rounding errors in financial calculations that accumulate over time; strategies that appear
profitable may be losing money due to precision loss, particularly in high-frequency or high-volume scenarios
derived_from_bd_id: BD-071
- id: finance-C-195
when: When implementing financial amount calculations or choosing data types in the data model
action: Use arbitrary-precision integers (big.Int) for all financial amounts; do not use float64 or fixed-point decimal
types, which introduce precision loss affecting financial accuracy
severity: fatal
kind: domain_rule
modality: must
consequence: Using float64 for financial amounts causes precision loss where calculations like 0.1 + 0.2 produce 0.30000000000000004,
leading to incorrect transaction amounts and irreconcilable balance discrepancies
derived_from_bd_id: BD-076
- id: finance-C-201
when: When configuring schema enforcement mode in controllers
action: Verify the actual number of schema enforcement modes implemented matches the documented modes — evidence shows
discrepancy between stated three modes (strict, warn, audit) and code showing two modes (Strict, Audit)
severity: fatal
kind: domain_rule
modality: must
consequence: Configuration relying on a third mode (warn) that doesn't exist in implementation will silently fail or behave
unexpectedly, allowing invalid schema entries to pass validation unnoticed
derived_from_bd_id: BD-051
- id: finance-C-202
when: When implementing or modifying Memento hash computation for transaction history
action: Include postCommitVolumes in Memento hash computation; these fields are computed post-commit, so including them
in the hash means any volume formula change invalidates every historical memento verification
severity: fatal
kind: domain_rule
modality: must_not
consequence: Including postCommitVolumes in hash breaks historical integrity verification — all mementos created before
the formula change become unverifiable, causing audit failures and loss of tamper-evident history
derived_from_bd_id: BD-013
- id: finance-C-204
when: When implementing or modifying balance calculation logic
action: Change the balance formula to Output minus Input; balance must remain Input minus Output via big.Int.Sub, because
the reversed formula negates every account balance and breaks all account queries
severity: fatal
kind: domain_rule
modality: must_not
consequence: Reversing the balance formula (Output - Input instead of Input - Output) produces negated balances for every
account, making all balance queries return incorrect negative values and corrupting entire ledger state
derived_from_bd_id: BD-043
- id: finance-C-212
when: When calculating account balances in the ledger data model
action: Use Input - Output formula via big.Int subtraction for balance calculation — do not use Output - Input (which
negates balances) or absolute value (which loses direction information)
severity: fatal
kind: domain_rule
modality: must
consequence: Using incorrect balance formula negates all balances or loses direction information, causing every account
balance to be calculated wrong and transactions to fail validation
derived_from_bd_id: BD-073
- id: finance-C-220
when: When implementing fund allocation logic in Funding.Take
action: Process funding parts in deterministic order and merge consecutive parts with the same account — ordering must
be reproducible for auditability
severity: fatal
kind: domain_rule
modality: must
consequence: Random ordering or separate handling of same-account parts would cause non-deterministic fund allocation,
making audit trails impossible to verify and breaking regulatory compliance requirements
derived_from_bd_id: BD-067
- id: finance-C-222
when: When implementing log chain integrity or idempotency signature hashing
action: Use SHA-256 for hash computation — changing hash algorithm would invalidate existing chain data and break idempotency
guarantees
severity: fatal
kind: domain_rule
modality: must
consequence: Switching to SHA-512, BLAKE3, or Merkle trees would change hash outputs, breaking compatibility with existing
chains and causing all historical log verifications to fail
derived_from_bd_id: BD-075
- id: finance-C-224
when: When implementing signature generation for reproducible transaction signatures
action: Use canonical JSON serialization so the hash function receives deterministic input; this ensures reproducible
signatures across implementations
severity: fatal
kind: domain_rule
modality: must
consequence: Using Protocol Buffers, msgpack, or custom binary encoding breaks signature reproducibility because non-canonical
serialization produces different byte streams for identical logical data
derived_from_bd_id: BD-080
- id: finance-C-226
when: When implementing bulk operation configuration
action: Enable both atomic and parallel bulk options simultaneously — these are mutually exclusive options that cannot
coexist
severity: fatal
kind: domain_rule
modality: must_not
consequence: Enabling both atomic and parallel bulk operations breaks ACID consistency guarantees; parallel execution
cannot provide all-or-nothing behavior, leading to partial state changes that corrupt backtest data integrity
derived_from_bd_id: BD-056
- id: finance-C-227
when: When implementing log import logic for ledger data
action: Allow importing logs into a non-empty ledger or import logs that are out of order — the importer must reject such
operations to preserve append-only audit trail integrity
severity: fatal
kind: domain_rule
modality: must_not
consequence: Allowing non-empty imports or out-of-order log ingestion corrupts the audit trail; backtest results become
non-reproducible and transaction history loses its chronological integrity, making regulatory audits impossible
derived_from_bd_id: BD-060
- id: finance-C-228
when: When implementing transaction revert logic
action: Allow a transaction to be reverted more than once — reverted transactions must be rejected with ErrAlreadyReverted
error on subsequent revert attempts
severity: fatal
kind: domain_rule
modality: must_not
consequence: Double-revert operations create ambiguous transaction history where the same transaction is reversed multiple
times, breaking audit trail integrity and making it impossible to trace the true state of funds at any point in time
derived_from_bd_id: BD-063
regular:
- id: finance-C-006
when: When executing Numscript VM
action: verify stack is empty after execution completes
severity: high
kind: domain_rule
modality: must
consequence: Non-empty stack after execution indicates unconsumed values, signaling a logic error in the compiled program
that could lead to incorrect results
stage_ids:
- numscript_execution
- id: finance-C-007
when: When taking funds from a funding source
action: process funding parts in FIFO order to maintain deterministic execution
severity: high
kind: domain_rule
modality: must
consequence: Non-FIFO processing of funding parts causes non-deterministic results and may violate the expected distribution
of funds across multiple source accounts
stage_ids:
- numscript_execution
- id: finance-C-008
when: When using Store interface for balance queries
action: implement GetBalances and GetAccount methods for custom storage backends
severity: high
kind: resource_boundary
modality: must
consequence: Missing Store interface implementation causes runtime errors during VM balance resolution, halting all transaction
processing
stage_ids:
- numscript_execution
- id: finance-C-010
when: When testing Numscript VM behavior
action: use StaticStore to enable testing without database dependency
severity: medium
kind: resource_boundary
modality: must
consequence: Tests requiring live database connections are slow, flaky, and cannot run in isolated CI environments, reducing
test coverage and reliability
stage_ids:
- numscript_execution
- id: finance-C-012
when: When handling WORLD account in Numscript
action: treat WORLD as unbounded source/sink that never requires balance checks
severity: high
kind: architecture_guardrail
modality: must
consequence: WORLD represents external flows (money entering/leaving the system), requiring special handling as an unlimited
source or destination without balance constraints
stage_ids:
- numscript_execution
- id: finance-C-014
when: When compiling and executing Numscript programs
action: 'follow the mandated execution sequence: NewMachine -> SetVars -> ResolveResources -> ResolveBalances -> Execute'
severity: high
kind: architecture_guardrail
modality: must
consequence: Skipping or reordering execution phases leads to uninitialized state panics, missing resources, or stale
balance data
stage_ids:
- numscript_execution
- id: finance-C-015
when: When defining account addresses in Numscript
action: validate addresses match the defined account pattern
severity: high
kind: domain_rule
modality: must
consequence: Invalid account addresses cause transaction failures and potential data corruption when attempting to post
to malformed account identifiers
stage_ids:
- numscript_execution
- id: finance-C-016
when: When running Numscript VM tests
action: call ResolveResources before ResolveBalances so that resource dependencies are resolved before balance queries use them
severity: high
kind: operational_lesson
modality: must
consequence: Reversing the resolution order causes resource dependencies to be unresolved when balance queries attempt
to use them, leading to test failures
stage_ids:
- numscript_execution
- id: finance-C-017
when: When using multi-source allocations with portions
action: verify total portions sum to exactly 100% for complete allocation
severity: high
kind: domain_rule
modality: must
consequence: Portions not summing to 100% cause funds to be undistributed or over-distributed, breaking the invariant
that all sent amounts must be fully accounted for
stage_ids:
- numscript_execution
- id: finance-C-018
when: When handling metadata in Numscript execution
action: allow metadata key overrides between script and request parameters
severity: high
kind: domain_rule
modality: must_not
consequence: Allowing metadata key overrides causes silent data loss where user-provided metadata is discarded without
warning, breaking audit trails
stage_ids:
- numscript_execution
- id: finance-C-019
when: When implementing OP_TAKE_MAX opcode
action: allow partial takes when requested amount exceeds available funds
severity: high
kind: architecture_guardrail
modality: must
consequence: OP_TAKE_MAX is designed to take maximum available without error, unlike OP_TAKE which fails on insufficient
funds; failing on partial availability violates this semantic contract
stage_ids:
- numscript_execution
- id: finance-C-021
when: When using overdraft limits in Numscript
action: track cumulative balance including overdraft across multiple sends to same account
severity: high
kind: domain_rule
modality: must
consequence: Not tracking cumulative balance causes overdraft limits to be violated across multiple sends, allowing accounts
to go further negative than permitted
stage_ids:
- numscript_execution
- id: finance-C-024
when: When implementing transaction schema validation
action: respect schema enforcement mode (strict vs audit) for posting validation
severity: high
kind: domain_rule
modality: must
consequence: In strict mode, schema violations will block transactions; in audit mode, violations will only be logged
but transactions will proceed, potentially violating business rules
stage_ids:
- transaction_processing
- id: finance-C-026
when: When implementing transaction reference handling
action: enforce unique transaction references per ledger via database constraint
severity: high
kind: domain_rule
modality: must
consequence: Duplicate references will create ambiguous transaction identification, making audit trails unreliable and
causing reconciliation failures
stage_ids:
- transaction_processing
- id: finance-C-027
when: When implementing transaction processing
action: implement deadlock retry with exponential backoff for PostgreSQL operations
severity: high
kind: resource_boundary
modality: must
consequence: Without deadlock retry, concurrent transactions will fail immediately on deadlock, causing transaction failures
and degraded service availability during high contention periods
stage_ids:
- transaction_processing
- id: finance-C-028
when: When implementing idempotency key storage
action: limit idempotency key length to 256 characters as enforced by database schema
severity: medium
kind: resource_boundary
modality: must
consequence: Idempotency keys exceeding 256 characters will cause database constraint violations, resulting in failed
transactions
stage_ids:
- transaction_processing
- id: finance-C-029
when: When implementing transaction processing with schema
action: require schema version specification when ledger has schemas in strict enforcement mode
severity: high
kind: resource_boundary
modality: must
consequence: Without schema version specification, transactions may execute against unintended schema versions, violating
business rules enforced by specific schema versions
stage_ids:
- transaction_processing
- id: finance-C-030
when: When implementing idempotency retry logic
action: fetch existing log from database when IdempotencyKeyConflict error occurs
severity: high
kind: operational_lesson
modality: must
consequence: Without fetching the existing log on conflict, the system will lose track of the previously created transaction,
causing duplicate processing attempts
stage_ids:
- transaction_processing
- id: finance-C-031
when: When implementing transaction processing
action: return identical response for duplicate idempotency key submissions
severity: high
kind: operational_lesson
modality: must
consequence: Different responses for same idempotency key will confuse clients, causing them to retry or take incorrect
actions based on perceived failures
stage_ids:
- transaction_processing
- id: finance-C-032
when: When implementing bulk transaction processing
action: allow atomic and parallel options simultaneously
severity: high
kind: operational_lesson
modality: must_not
consequence: Atomic parallel execution creates undefined ordering semantics, potentially causing inconsistent state when
some transactions in the batch fail
stage_ids:
- transaction_processing
- id: finance-C-034
when: When implementing transaction processing
action: compute idempotency hash from request inputs only, excluding computed fields
severity: high
kind: architecture_guardrail
modality: must
consequence: Including computed fields (like transaction ID, timestamps) in idempotency hash will cause valid retries
to fail with hash mismatch errors
stage_ids:
- transaction_processing
- id: finance-C-036
when: When implementing transaction reversal
action: check balance sufficiency after reversal calculation before committing
severity: high
kind: architecture_guardrail
modality: must
consequence: Without balance checks after reversal, accounts can go below zero unexpectedly, causing overdrafts and potential
financial losses
stage_ids:
- transaction_processing
- id: finance-C-037
when: When implementing transaction API endpoints
action: set Idempotency-Hit header to true on successful idempotent request reuse
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without proper header indication, clients cannot distinguish between new transactions and cached responses,
causing duplicate transaction attempts
stage_ids:
- transaction_processing
- id: finance-C-038
when: When documenting transaction processing capabilities
action: claim real-time trading support or zero-latency execution guarantees
severity: medium
kind: claim_boundary
modality: must_not
consequence: Performance claims without latency bounds will mislead users about execution guarantees, causing inappropriate
use cases and SLA violations
stage_ids:
- transaction_processing
- id: finance-C-039
when: When presenting transaction processing results
action: present simulated or backtest results as equivalent to live execution
severity: high
kind: claim_boundary
modality: must_not
consequence: Simulated results do not account for slippage, network latency, partial fills, and market impact, causing
unrealistic expectations and potential financial losses
stage_ids:
- transaction_processing
- id: finance-C-040
when: When implementing schema validation for transactions
action: block transactions in audit enforcement mode for schema violations
severity: medium
kind: claim_boundary
modality: must_not
consequence: Blocking in audit mode will break existing workflows that rely on gradual schema adoption, causing unexpected
transaction failures
stage_ids:
- transaction_processing
- id: finance-C-046
when: When handling the WORLD account in transaction postings
action: treat WORLD as an unbounded overdraft source that completes double-entry for external asset flows
severity: high
kind: domain_rule
modality: must
consequence: WORLD account is the ledger boundary representing external systems. Treating it as bounded or tracking its
balance creates incorrect external flow accounting
stage_ids:
- volume_accounting
- id: finance-C-047
when: When processing transactions involving the WORLD account
action: apply overdraft restrictions to the WORLD account source
severity: high
kind: domain_rule
modality: must_not
consequence: WORLD represents external asset sources with unlimited supply. Restricting it would block valid external-to-internal
asset flows
stage_ids:
- volume_accounting
- id: finance-C-049
when: When querying balances from multiple accounts concurrently
action: sort accounts by address before acquiring locks to prevent deadlocks
severity: high
kind: domain_rule
modality: must
consequence: Without consistent lock ordering, concurrent transactions acquiring locks in different orders can deadlock,
blocking all transactions on affected accounts
stage_ids:
- volume_accounting
- id: finance-C-050
when: When handling PostCommitEffectiveVolumes for transactions with past timestamps
action: allow PostCommitEffectiveVolumes to be updated when inserting transactions in the past
severity: medium
kind: architecture_guardrail
modality: should
consequence: PostCommitEffectiveVolumes reflect the state at TransactionData.Timestamp. When inserting past-dated transactions,
previously committed volumes must be recalculated to maintain temporal accuracy
stage_ids:
- volume_accounting
- id: finance-C-051
when: When calculating balance for WORLD account reporting
action: track or report WORLD account balance as it represents external systems with unlimited supply
severity: high
kind: claim_boundary
modality: must_not
consequence: WORLD is a virtual account representing the ledger boundary to external systems. Reporting its balance implies
the ledger controls infinite external resources
stage_ids:
- volume_accounting
- id: finance-C-053
when: When implementing log payload hashing
action: include postCommitVolumes or postCommitEffectiveVolumes fields in the hash computation
severity: high
kind: domain_rule
modality: must_not
consequence: Including computed postCommitVolumes fields in the hash breaks event sourcing principles because these are
derived values, not causal decisions. Any state rebuild would produce different hashes for identical causal inputs
stage_ids:
- log_creation
- id: finance-C-055
when: When defining log type handling
action: use typed constants (iota) rather than string comparisons to prevent typos and type errors
severity: high
kind: domain_rule
modality: must
consequence: Using string literals for log types risks silent typos that pass compilation but cause incorrect log type
handling at runtime, panicking the application
stage_ids:
- log_creation
- id: finance-C-056
when: When enabling HASH_LOGS feature with SYNC mode for high throughput
action: expect linear scalability because the advisory lock creates a serialization bottleneck
severity: high
kind: resource_boundary
modality: must_not
consequence: The pg_advisory_xact_lock on ledger ID serializes all log insertions, causing throughput to plateau or degrade
under concurrent write load
stage_ids:
- log_creation
- id: finance-C-057
when: When requiring maximum write throughput
action: configure HASH_LOGS feature to ASYNC mode so background worker computes hashes without blocking writes
severity: medium
kind: resource_boundary
modality: must
consequence: SYNC hashing blocks every transaction waiting for advisory lock, limiting throughput to sequential writes
per ledger
stage_ids:
- log_creation
- id: finance-C-058
when: When computing log hash for deterministic results
action: maintain consistent field ordering in the hash struct because the JSON encoder's output depends on field order
severity: high
kind: operational_lesson
modality: must
consequence: Inconsistent field ordering produces different hashes for structurally identical logs, breaking hash chain
verification and audit integrity
stage_ids:
- log_creation
- id: finance-C-061
when: When handling idempotency in log creation
action: verify that IdempotencyHash matches the computed hash so that only requests with identical inputs are treated as duplicates
severity: high
kind: architecture_guardrail
modality: must
consequence: Without IdempotencyHash verification, duplicate requests with different inputs could be incorrectly treated
as idempotent, leading to incorrect ledger state
stage_ids:
- log_creation
- id: finance-C-062
when: When describing the tamper-detection properties of the log chain
action: claim that HASH_LOGS SYNC mode provides immediate integrity verification; hash computation happens at insert
time only
severity: medium
kind: claim_boundary
modality: must_not
consequence: The hash chain provides tamper evidence but does not include active monitoring. Tampering would only be detected
during explicit verification, not at the moment of modification
stage_ids:
- log_creation
- id: finance-C-063
when: When presenting log chain results in ASYNC hashing mode
action: present logs as having verified integrity before the background hasher has processed them
severity: high
kind: claim_boundary
modality: must_not
consequence: In ASYNC mode, recent logs may have empty or stale hashes until the background worker processes them, misleading
users about actual audit trail integrity
stage_ids:
- log_creation
- id: finance-C-065
when: When creating hierarchical account addresses
action: use colon (:) as the segment separator for address path construction
severity: high
kind: domain_rule
modality: must
consequence: Addresses split on wrong separator cause FindAccountSchema to fail matching, rejecting valid accounts or
accepting invalid ones
stage_ids:
- schema_enforcement
- id: finance-C-067
when: When assigning metadata or rules to chart segments
action: place .metadata or .rules properties on non-account intermediate segments
severity: high
kind: domain_rule
modality: must_not
consequence: Invalid chart structure passes validation but creates accounts with unreachable metadata definitions, causing
metadata inheritance failures
stage_ids:
- schema_enforcement
- id: finance-C-072
when: When creating new accounts via postings
action: inherit default metadata from chart account schema through AccountsWithDefaultMetadata
severity: high
kind: architecture_guardrail
modality: must
consequence: Accounts created without default metadata miss required fields, breaking compliance requirements and audit
trails
stage_ids:
- schema_enforcement
- id: finance-C-076
when: When specifying schema version in transaction parameters
action: request a non-existent schema version that will trigger ErrSchemaNotFound
severity: high
kind: resource_boundary
modality: must_not
consequence: A non-existent schema version causes the transaction to fail with an error that includes a hint of the latest available version
stage_ids:
- schema_enforcement
- id: finance-C-077
when: When operating in audit schema enforcement mode
action: emit telemetry traces and logs for schema validation failures instead of blocking transactions
severity: medium
kind: operational_lesson
modality: should
consequence: Audit mode without logging fails to capture compliance evidence of schema violations, reducing audit trail
effectiveness
stage_ids:
- schema_enforcement
- id: finance-C-078
when: When requiring schema specification
action: provide a schema version when the ledger has schemas and the input payload requires one
severity: high
kind: resource_boundary
modality: must
consequence: Missing schema version when ledger has schemas causes strict mode to error with ErrSchemaNotSpecified
stage_ids:
- schema_enforcement
- id: finance-C-079
when: When multiple schema versions coexist
action: use FindSchema to lookup specific version or FindLatestSchemaVersion for newest
severity: high
kind: resource_boundary
modality: must
consequence: Incorrect schema version lookup returns wrong chart, causing valid accounts to be rejected or invalid ones
accepted
stage_ids:
- schema_enforcement
- id: finance-C-080
when: When claiming real-time schema enforcement
action: present audit mode compliance logs as guaranteed blocking enforcement
severity: high
kind: claim_boundary
modality: must_not
consequence: Audit mode only logs warnings, allowing schema violations through and misrepresenting compliance posture
stage_ids:
- schema_enforcement
- id: finance-C-081
when: When enforcing schemas on new ledgers
action: require schema validation when no schema has been inserted yet
severity: high
kind: claim_boundary
modality: must_not
consequence: Ledger without schema cannot validate against non-existent chart, breaking transaction creation
stage_ids:
- schema_enforcement
- id: finance-C-082
when: When inserting duplicate schema versions
action: attempt to insert a schema with version that already exists
severity: medium
kind: resource_boundary
modality: must_not
consequence: Duplicate schema version insertion fails with ErrSchemaAlreadyExists, leaving no schema for subsequent operations
stage_ids:
- schema_enforcement
- id: finance-C-084
when: When storing patterns for variable segments
action: save pattern as string pointer (nil allowed for unconstrained variables)
severity: high
kind: architecture_guardrail
modality: must
consequence: Non-pointer pattern storage prevents optional pattern semantics, forcing all variable segments to be constrained
stage_ids:
- schema_enforcement
- id: finance-C-085
when: When using patternless variable segments
action: match any alphanumeric segment value without regex constraint
severity: medium
kind: domain_rule
modality: must
consequence: Patternless variable segments matching invalid characters allow unexpected account addresses through validation
stage_ids:
- schema_enforcement
- id: finance-C-089
when: When executing multiple ledger operations within a single transaction
action: follow the established operation ordering sequence used in CommitTransaction
severity: high
kind: domain_rule
modality: must
consequence: Deviating from the established operation order (volumes first, then transactions, then logs) can cause deadlocks
when concurrent transactions execute operations in conflicting orders
stage_ids:
- controller_store
- id: finance-C-091
when: When adapting the Store interface for different consumers
action: use vmStoreAdapter or numscriptRewriteAdapter to translate between Store and consumer-specific interfaces
severity: high
kind: architecture_guardrail
modality: must
consequence: Direct Store access without proper adapter translation can cause type mismatches in balance queries and lead
to incorrect financial calculations
stage_ids:
- controller_store
- id: finance-C-092
when: When handling deadlocks in transaction processing
action: retry the entire transaction when ErrDeadlockDetected is returned
severity: high
kind: operational_lesson
modality: must
consequence: Without retry logic, failed transactions due to deadlocks can result in lost operations and require manual
intervention to reconcile account states
stage_ids:
- controller_store
- id: finance-C-093
when: When adding a second ledger to an existing bucket
action: update the aloneInBucket atomic flag to false for all stores in the bucket
severity: medium
kind: operational_lesson
modality: must
consequence: Without updating the shared aloneInBucket flag, query optimization decisions will be incorrect, causing unnecessary
WHERE clause filtering and degraded sequential scan plans
stage_ids:
- controller_store
- id: finance-C-094
when: When implementing pagination for large result sets
action: enforce PageSize bounds by defaulting to bunpaginate.QueryDefaultPageSize when PageSize is 0
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without default bounds, queries with no PageSize specified can return excessive results, causing memory exhaustion
and network degradation
stage_ids:
- controller_store
- id: finance-C-096
when: When calling LockLedger outside of an active transaction
action: verify that the connection is returned to the pool only after the explicit release function is called
severity: high
kind: resource_boundary
modality: must
consequence: Without proper connection management, holding a dedicated connection during advisory lock can exhaust the
connection pool, blocking new requests
stage_ids:
- controller_store
- id: finance-C-097
when: When presenting transaction results from the store
action: claim atomicity guarantees across multiple ledgers in different buckets
severity: medium
kind: claim_boundary
modality: must_not
consequence: Store operations are scoped to single ledger; cross-ledger consistency must be managed at application layer,
not within Store transactions
stage_ids:
- controller_store
- id: finance-C-098
when: When performing transaction validation before commit
action: validate idempotency key hashes match when logs are re-read after ErrIdempotencyKeyConflict
severity: high
kind: domain_rule
modality: must
consequence: Without idempotency hash validation, retrying after conflict can execute different operations than the original
request, causing inconsistent account states
stage_ids:
- controller_store
- id: finance-C-099
when: When implementing GetBalances within a transaction
action: insert zero-value balance rows for missing accounts before acquiring locks
severity: high
kind: domain_rule
modality: must
consequence: Without pre-inserting zero balances, accounts that don't exist yet cannot be locked, allowing concurrent
transactions to create duplicate or inconsistent balance entries
stage_ids:
- controller_store
- id: finance-C-102
when: When configuring async block hashing
action: set HASH_LOGS feature to 'ASYNC' for ledgers to be processed by the async block hasher
severity: high
kind: architecture_guardrail
modality: must
consequence: Ledgers with HASH_LOGS = 'SYNC' or 'DISABLED' are silently skipped, creating gaps in the block chain for
those ledgers
stage_ids:
- async_block_hashing
- id: finance-C-103
when: When setting up the async block hasher CRON schedule
action: use a valid CRON expression that aligns with the business requirement for block creation frequency
severity: high
kind: resource_boundary
modality: must
consequence: Blocks are created too infrequently, causing delayed integrity verification, or too frequently, causing unnecessary
database load
stage_ids:
- async_block_hashing
- id: finance-C-104
when: When setting the async block hasher max block size
action: configure MaxBlockSize to a positive integer value for batch processing
severity: high
kind: resource_boundary
modality: must
consequence: Invalid block size causes the procedure to fail or process all remaining logs in a single potentially unbounded
block
stage_ids:
- async_block_hashing
- id: finance-C-105
when: When implementing graceful shutdown for the async block hasher
action: drain the stopChannel so that the current block finishes processing before the hasher stops
severity: high
kind: operational_lesson
modality: must
consequence: Incomplete block processing leaves logs unprocessed, creating gaps in the hash chain that compromise audit
integrity
stage_ids:
- async_block_hashing
- id: finance-C-106
when: When the async block hasher is enabled
action: claim real-time integrity verification equivalent to SYNC mode
severity: medium
kind: claim_boundary
modality: must_not
consequence: ASYNC mode creates blocks on CRON schedule, introducing a delay between log creation and hash verification
that does not provide the same immediate integrity guarantees as SYNC mode
stage_ids:
- async_block_hashing
- id: finance-C-107
when: When running async block hasher alongside SYNC log hashing
action: enable at most one of the SYNC and ASYNC hashing modes for a given ledger, never both simultaneously
severity: high
kind: domain_rule
modality: must
consequence: Duplicate hash computations occur with different mechanisms, potentially producing inconsistent results and
confusing the audit trail
stage_ids:
- async_block_hashing
- id: finance-C-108
when: When the ASYNC mode ledger receives new transactions
action: handle logs inserted after block creation without blocking the transaction commit
severity: medium
kind: architecture_guardrail
modality: must
consequence: Blocking transaction commits causes high-throughput clients to experience increased latency and reduced throughput
due to serialization on hash computation
stage_ids:
- async_block_hashing
- id: finance-C-110
when: When executing the async block hasher Run loop
action: recalculate next CRON execution time after each run completes
severity: medium
kind: operational_lesson
modality: must
consequence: Fixed interval scheduling causes drift from the intended CRON schedule, leading to inconsistent block creation
timing
stage_ids:
- async_block_hashing
- id: finance-C-111
when: When migrating ledgers between buckets
action: preserve the logs_blocks table with correct block chaining to maintain audit continuity
severity: high
kind: domain_rule
modality: must
consequence: Breaking the block chain during migration creates gaps that invalidate the cumulative hash integrity verification
stage_ids:
- async_block_hashing
- id: finance-C-114
when: When implementing the pagination logic
action: set the next poll interval to 0 when HasMore is true to drain all pending logs
severity: high
kind: domain_rule
modality: must
consequence: Waiting for full PullInterval when more logs exist causes unnecessary delay in log delivery, potentially
impacting real-time financial processing
stage_ids:
- pipeline_replication
- id: finance-C-115
when: When configuring pipeline export retry behavior
action: implement exponential backoff for failed export retries to prevent thundering herd
severity: high
kind: resource_boundary
modality: must
consequence: Without backoff, repeated retry attempts during downstream outages can overwhelm external systems, causing
cascading failures
stage_ids:
- pipeline_replication
- id: finance-C-117
when: When configuring batcher MaxItems parameter
action: verify MaxItems is non-negative (>= 0)
severity: high
kind: resource_boundary
modality: must
consequence: Negative MaxItems causes undefined batching behavior and potential memory leaks in log buffering
stage_ids:
- pipeline_replication
- id: finance-C-118
when: When implementing driver acceptance logic
action: block Accept calls until driver reports ready state
severity: high
kind: resource_boundary
modality: must
consequence: Accepting logs before driver is ready causes undefined export behavior and potential log loss
stage_ids:
- pipeline_replication
- id: finance-C-119
when: When implementing driver initialization
action: retry driver start indefinitely on failure (unless context is canceled)
severity: high
kind: operational_lesson
modality: must
consequence: Without retry, transient driver initialization failures cause permanent pipeline failure and stop log replication
stage_ids:
- pipeline_replication
- id: finance-C-121
when: When implementing pipeline state persistence
action: persist pipeline state asynchronously to avoid blocking the export loop
severity: medium
kind: operational_lesson
modality: must
consequence: Synchronous state persistence during export causes latency spikes and potential thundering herd on crash
recovery
stage_ids:
- pipeline_replication
- id: finance-C-123
when: When implementing pipeline synchronization with storage
action: periodically sync enabled pipelines from storage to handle configuration changes
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing periodic sync causes pipelines to remain running after being disabled or fail to start newly enabled
pipelines
stage_ids:
- pipeline_replication
- id: finance-C-124
when: When implementing driver factory for pipeline exporters
action: wrap factory with batching layer to buffer logs before export
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without batching, each log triggers a separate network call, causing high latency and potential network saturation
stage_ids:
- pipeline_replication
- id: finance-C-125
when: When describing the replication pipeline capabilities
action: claim real-time log delivery when using pull-based polling architecture
severity: high
kind: claim_boundary
modality: must_not
consequence: Pull-based polling has inherent delay of up to PullInterval + processing time; claiming real-time delivery
sets false expectations for financial use cases requiring immediate notification
stage_ids:
- pipeline_replication
- id: finance-C-126
when: When describing export delivery guarantees
action: claim exactly-once delivery without idempotency handling in downstream systems
severity: medium
kind: claim_boundary
modality: must_not
consequence: At-least-once delivery with cursor-based deduplication still risks duplicates on crash during state persistence;
claiming exactly-once misleads users about reliability
stage_ids:
- pipeline_replication
- id: finance-C-127
when: When describing batcher behavior
action: claim bounded delivery latency when MaxItems is configured without FlushInterval
severity: medium
kind: claim_boundary
modality: must_not
consequence: With MaxItems > 0 and no FlushInterval, logs wait indefinitely if traffic is low, causing unbounded latency
for low-volume ledgers
stage_ids:
- pipeline_replication
- id: finance-C-130
when: When hard deleting a bucket
action: Drop the PostgreSQL schema with CASCADE and delete all of the bucket's ledger records from _system.ledgers
severity: high
kind: domain_rule
modality: must
consequence: Orphaned data will remain in PostgreSQL, consuming storage and potentially causing confusion about bucket
state
stage_ids:
- bucket_cleanup
- id: finance-C-132
when: When processing multiple buckets for cleanup
action: Continue processing remaining buckets even if one bucket fails
severity: high
kind: architecture_guardrail
modality: must
consequence: One corrupted or locked bucket will block cleanup of all other expired buckets, causing storage accumulation
stage_ids:
- bucket_cleanup
- id: finance-C-133
when: When running the bucket cleanup worker
action: Execute cleanup according to a configured cron schedule
severity: high
kind: resource_boundary
modality: must
consequence: Without a schedule, cleanup only runs once at startup, allowing storage to accumulate indefinitely
stage_ids:
- bucket_cleanup
- id: finance-C-135
when: When implementing bucket cleanup
action: Use the system store interface for all bucket operations
severity: high
kind: architecture_guardrail
modality: must
consequence: Direct database access bypasses transactional guarantees and consistency checks, leading to data corruption
stage_ids:
- bucket_cleanup
- id: finance-C-136
when: When a bucket fails to hard delete
action: Log the failure and record the bucket name for later investigation
severity: medium
kind: operational_lesson
modality: must
consequence: Failed deletions go unnoticed, accumulating unrecoverable data and potentially causing system storage exhaustion
stage_ids:
- bucket_cleanup
- id: finance-C-137
when: When restoring a soft-deleted bucket
action: Set deleted_at back to NULL on all ledgers in the bucket
severity: high
kind: domain_rule
modality: must
consequence: Incorrect restore operation will cause the bucket to be immediately hard-deleted upon next cleanup cycle
stage_ids:
- bucket_cleanup
- id: finance-C-138
when: When fetching ledgers without the includeDeleted flag
action: Filter out ledgers where deleted_at IS NOT NULL
severity: high
kind: architecture_guardrail
modality: must
consequence: Soft-deleted ledgers appear in normal queries, causing confusion and potential data integrity issues
stage_ids:
- bucket_cleanup
- id: finance-C-139
when: When setting the default retention period
action: Use 30 days as the default retention period to provide adequate recovery window
severity: medium
kind: resource_boundary
modality: should
consequence: Shorter retention periods increase risk of accidental data loss; users need time to discover and recover
from mistakes
stage_ids:
- bucket_cleanup
- id: finance-C-145
when: When implementing worker goroutine error handling
action: panic when the worker's Run method returns an error (unrecoverable failure)
severity: high
kind: architecture_guardrail
modality: must
consequence: Silent error swallowing allows the worker to continue running in a failed state, producing incorrect hashes,
missing replications, or orphaned buckets without operator awareness
stage_ids:
- worker_fx_module
- id: finance-C-147
when: When configuring replication pipeline defaults
action: set pull interval to at least 5 seconds to avoid excessive database load
severity: high
kind: resource_boundary
modality: must
consequence: The default pull interval is 10 seconds; setting it too low causes excessive polling, database CPU
saturation, and degraded transaction processing performance
stage_ids:
- worker_fx_module
- id: finance-C-148
when: When configuring replication pipeline defaults
action: set push retry period to at least 10 seconds to avoid overwhelming external systems
severity: high
kind: resource_boundary
modality: must
consequence: The default push retry period is 10 seconds; more aggressive retries overwhelm downstream systems during
outages and cause cascading failures
stage_ids:
- worker_fx_module
- id: finance-C-149
when: When configuring replication pipeline defaults
action: set logs page size to at least 100 for efficient batch processing
severity: medium
kind: resource_boundary
modality: must
consequence: The default logs page size is 100; very small page sizes cause excessive round trips and network
overhead
stage_ids:
- worker_fx_module
- id: finance-C-151
when: When implementing bucket cleanup retention
action: set default retention period to at least 30 days for data safety
severity: high
kind: resource_boundary
modality: must
consequence: Shorter retention periods risk permanent data loss before users can recover from accidental deletions or
perform compliance audits
stage_ids:
- worker_fx_module
- id: finance-C-152
when: When implementing worker stop cancellation
action: handle context cancellation in stop channel operations to avoid hanging shutdown
severity: high
kind: architecture_guardrail
modality: must
consequence: Shutdown hangs indefinitely when context deadline is exceeded during stop, preventing clean service restart
and causing deployment timeouts
stage_ids:
- worker_fx_module
- id: finance-C-153
when: When implementing replication pipeline error handling
action: continue processing other buckets when one bucket cleanup fails
severity: high
kind: domain_rule
modality: must
consequence: Single bucket cleanup failure causes entire cleanup batch to abort, leaving other expired buckets undeleted
and accumulating storage bloat
stage_ids:
- worker_fx_module
- id: finance-C-154
when: When implementing worker configuration via command-line flags
action: provide sensible defaults for all worker configuration parameters
severity: high
kind: resource_boundary
modality: must
consequence: Missing defaults cause worker to fail startup without explicit configuration, preventing automated deployment
and requiring manual intervention per environment
stage_ids:
- worker_fx_module
- id: finance-C-155
when: When managing pipelines dynamically based on feature flags
action: stop pipelines that are disabled or deleted during periodic synchronization
severity: high
kind: architecture_guardrail
modality: must
consequence: Disabled pipelines continue consuming resources and producing stale data, causing confusion and potential
data inconsistency in downstream systems
stage_ids:
- worker_fx_module
- id: finance-C-159
when: When validating account addresses in postings
action: Validate account addresses match the patterns defined in the ledger's chart of accounts schema
severity: high
kind: domain_rule
modality: must
consequence: Postings to invalid accounts violate the chart of accounts contract, breaking balance tracking and audit
trail integrity
- id: finance-C-161
when: When handling external monetary flows with the WORLD account
action: Treat WORLD account credits as no-ops and do not track WORLD balances
severity: high
kind: domain_rule
modality: must
consequence: Tracking WORLD balances would incorrectly include external funding sources in balance calculations, breaking
accounting correctness
- id: finance-C-162
when: When computing effective volumes for time-sensitive queries
action: Use transaction timestamp ordering, not insertion order, for effective volume calculations
severity: high
kind: domain_rule
modality: must
consequence: Using insertion order would cause incorrect balance calculations when transactions are backdated, violating
temporal accuracy requirements
- id: finance-C-166
when: When storing computed fields in transaction logs for hashing
action: Include postCommitVolumes or postCommitEffectiveVolumes in the log hash computation
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Including computed volumes in hash would create inconsistent hashes when volumes are recalculated, breaking
audit trail integrity
- id: finance-C-167
when: When executing funding allocations via Funding.Take()
action: Process funding parts in FIFO order, with remainder going to later parts
severity: high
kind: domain_rule
modality: must
consequence: Non-FIFO processing changes the allocation semantics, potentially overallocating from early funders and underallocating
from late funders
- id: finance-C-168
when: When resuming pipeline replication after failure
action: Store and use lastLogID as the cursor for resumable recovery
severity: high
kind: architecture_guardrail
modality: must
consequence: Without persistent lastLogID, pipeline restart loses position in the log stream, potentially missing or duplicating
replicated transactions
- id: finance-C-169
when: When using Formance Ledger as a data storage solution
action: Claim Formance Ledger supports simple key-value storage use cases
severity: high
kind: claim_boundary
modality: must_not
consequence: Users expecting simple key-value storage will encounter complexity overhead (PostgreSQL dependency, schema
validation, immutable logs) without benefit
- id: finance-C-170
when: When considering non-financial applications of Formance Ledger
action: Claim Formance Ledger is suitable for non-financial data tracking
severity: high
kind: claim_boundary
modality: must_not
consequence: Non-financial use cases impose unnecessary overhead (double-entry bookkeeping, monetary precision, audit
trails) without appropriate domain fit
- id: finance-C-171
when: When evaluating Formance Ledger for applications without audit requirements
action: Claim Formance Ledger is suitable for applications without need for audit trails
severity: high
kind: claim_boundary
modality: must_not
consequence: Applications not requiring immutable audit trails pay unnecessary complexity cost (log chaining, hash verification,
PostgreSQL dependency) for unused features
- id: finance-C-173
when: When presenting or reporting Formance Ledger's transaction history
action: Claim that the ledger provides real-time data consistency guarantees across all read replicas
severity: high
kind: claim_boundary
modality: must_not
consequence: The pipeline replication uses polling (10-second default interval) with eventual consistency; users expecting
real-time consistency will observe stale reads
- id: finance-C-176
when: When processing ledger features
action: Assume feature flags are stable across versions — some can be added or removed
severity: medium
kind: operational_lesson
modality: should_not
consequence: Using features that may be removed in future versions creates upgrade fragility; best practice is to explicitly
configure required features
- id: finance-C-177
when: When working with bucket isolation for multi-tenant deployments
action: Understand that buckets use PostgreSQL schemas and enable horizontal scaling isolation
severity: medium
kind: resource_boundary
modality: must
consequence: Improper bucket configuration can lead to schema conflicts or reduced horizontal scaling effectiveness
- id: finance-C-178
when: When implementing the log-first architecture for transaction processing
action: Produce immutable logs as the source of truth for every state change; derive all balance and volume state from
log recomputation and never update it in place
severity: high
kind: domain_rule
modality: must
consequence: State-first updates with in-place modifications break the audit trail and temporal queries; historical balances
cannot be reconstructed, making compliance audits and error investigation impossible
derived_from_bd_id: BD-005
- id: finance-C-179
when: When implementing or refactoring the hashing mechanism for logs and blocks
action: Assume a unified synchronous hashing mode applies to both log hashing and block hashing — these are different
operations at different stages with different timing requirements; BD-047 SYNC applies to log hashing while BD-020 ASYNC
applies to block hashing via CRON worker
severity: high
kind: domain_rule
modality: must_not
consequence: Treating block hashing as synchronous causes the hasher to block on each block, defeating the purpose of
batch background processing; or forcing all hashing to async breaks the immediate tamper-evident guarantees for individual
logs
derived_from_bd_id: BD-93
- id: finance-C-180
when: When configuring or validating ledger naming in the system
action: Allow users to create ledgers with reserved names '_', '_info', '_healthcheck' — these names are reserved for
system endpoints and health checks
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Creating ledgers with reserved names causes routing conflicts with system endpoints; health checks fail for
affected ledgers, breaking monitoring and observability
derived_from_bd_id: BD-040
- id: finance-C-181
when: When implementing or modifying transaction reversal logic
action: Reverse a transaction by swapping source/destination and negating amounts via reverse postings — do not use a
separate reversal transaction type or compensation log
severity: high
kind: domain_rule
modality: must
consequence: Using a separate reversal type or compensation log breaks bilateral swap semantics; the transaction history
no longer reflects proper reversals, causing incorrect balance calculations and audit trail inconsistencies
derived_from_bd_id: BD-083
- id: finance-C-183
when: When implementing funding allocation logic for partial credit or debit operations
action: Allocate funds using FIFO (First-In-First-Out) ordering across multiple sources — consume sources in the order
they were added, not by random or priority-based selection
severity: high
kind: domain_rule
modality: must
consequence: Non-FIFO allocation violates business intent for fund consumption ordering; strategies relying on predictable
source depletion ordering produce incorrect results, potentially exhausting wrong funding sources first
derived_from_bd_id: BD-003
- id: finance-C-184
when: When implementing or substituting the Store interface for balance queries in numscript execution
action: Verify that the Store implementation is deterministic and read-only during VM execution; confirm that any custom Store
implementation maintains these properties, or document deviations clearly
severity: medium
kind: operational_lesson
modality: should
consequence: Non-deterministic or write-capable Store implementations cause inconsistent execution results; strategies
that pass with StaticStore may fail with live storage, making backtest-to-production transitions unreliable
derived_from_bd_id: BD-002
- id: finance-C-187
when: When implementing or refactoring balance tracking logic in volume accounting
action: Maintain Balance as a computed field derived from Input minus Output, never stored as a separate persisted field;
any refactoring must preserve this computed relationship where negative balance (overdraft) is allowed by the formula
severity: high
kind: domain_rule
modality: must
consequence: Changing Balance to a stored field creates synchronization risk between balance and volume fields, potentially
causing double-entry bookkeeping violations and incorrect account states
derived_from_bd_id: BD-009
- id: finance-C-188
when: When handling external asset flows in double-entry bookkeeping
action: Route all external asset movements through the WORLD account as sink/source; the WORLD account must never persist
received value, and when used as a source, must require all funds to be available
severity: high
kind: domain_rule
modality: must
consequence: Without WORLD account routing, external transactions break double-entry bookkeeping and create untraceable
asset flows that cannot be reconciled against external records
derived_from_bd_id: BD-011
- id: finance-C-189
when: When implementing or validating NumScript TAKE opcode operations
action: Validate negative amounts in TAKE opcode like TAKE_MAX does; enforce validation at a single layer to prevent the
enforcement gap where TAKE silently accepts negative amounts but posting validation later rejects them
severity: high
kind: architecture_guardrail
modality: must
consequence: The split enforcement between VM execution and posting validation causes silent failures where negative TAKE
amounts execute in the VM potentially accumulating incorrect balances before failing at persistence, corrupting intermediate
VM state
derived_from_bd_id: BD-94
- id: finance-C-190
when: When implementing transaction processing that requires idempotency protection
action: Use IdempotencyKey combined with IdempotencyHash validation to uniquely identify transaction intent; verify that
hash of inputs correctly represents transaction identity and handles collision scenarios
severity: medium
kind: operational_lesson
modality: must
consequence: Using only idempotency key without hash validation increases collision risk, allowing duplicate transactions
to slip through and causing double credit/debit operations
derived_from_bd_id: BD-006
- id: finance-C-191
when: When computing or querying PostCommitEffectiveVolumes that depend on transaction ordering
action: Order PostCommitEffectiveVolumes by transaction timestamp, not insertion order; handle backdated transactions
that retroactively affect effective volumes by recomputing affected periods
severity: high
kind: domain_rule
modality: must
consequence: Using insertion order instead of transaction timestamp violates business time semantics, causing backdated
transactions to have incorrect effective volume impacts and breaking audit trail integrity
derived_from_bd_id: BD-033
- id: finance-C-192
when: When implementing volume querying or caching logic in volume accounting
action: Compute PostCommitVolumes in-flight from logs rather than persisting them; if caching, implement invalidation
so that queries do not return outdated volume values
severity: medium
kind: operational_lesson
modality: should
consequence: Persisting volumes creates stale data risk where queries return outdated values after recent transactions,
leading to incorrect balance reporting and potential financial reconciliation failures
derived_from_bd_id: BD-010
- id: finance-C-193
when: When configuring block creation scheduling in async block hashing
action: Use CRON-based scheduling with configurable interval to control block creation frequency; the interval must be
explicitly configured and non-zero to balance block availability latency against compute overhead
severity: high
kind: architecture_guardrail
modality: must
consequence: Using event-driven scheduling on every log insert causes higher compute overhead and potential thundering
herd; using CRON with explicit intervals ensures controlled block creation rate
derived_from_bd_id: BD-021
- id: finance-C-194
when: When implementing retry logic for downstream push operations in pipeline replication
action: Use exponential backoff for retries to prevent thundering herd on downstream failures; the retry period must increase
exponentially rather than using fixed intervals that can cause resonance
severity: high
kind: architecture_guardrail
modality: must
consequence: Using fixed retry intervals causes resonance with downstream recovery times, creating thundering herd scenarios
where all retries arrive simultaneously and overwhelm recovering services
derived_from_bd_id: BD-024
- id: finance-C-196
when: When implementing log ID generation or changing ID assignment logic in the data model
action: Generate log IDs sequentially using next = previous.ID + 1 to maintain ID ordering assumptions throughout the
system; do not switch to UUID or distributed ID generation methods
severity: high
kind: architecture_guardrail
modality: must
consequence: Using UUID or distributed ID generation breaks ID ordering assumptions throughout the system, causing chain
verification failures and invalidating dependent ordering logic that relies on sequential ID progression
derived_from_bd_id: BD-089
- id: finance-C-197
when: When implementing idempotency validation with hash-based conflict detection
action: Use a separate conflict detection mechanism (not idempotency hash) or log input hash in log ID computation instead
of idempotency key — the circular dependency where hash collision detection requires hash to already exist creates a
vulnerability where different inputs producing the same hash cause one transaction to be rejected as duplicate of another
severity: high
kind: domain_rule
modality: must
consequence: Two different transactions with different parameters but colliding idempotency hashes will cause one to be
incorrectly rejected as a duplicate, leading to silent transaction loss and data inconsistency in replicated ledgers
derived_from_bd_id: BD-98
- id: finance-C-198
when: When configuring schema validation mode in ledger controllers
action: Explicitly specify schema enforcement mode (strict/warn/audit) in configuration rather than relying on defaults;
verify that the framework exposes all three modes in its public API, as warn mode validates without failing and audit
mode logs without blocking
severity: medium
kind: operational_lesson
modality: should
consequence: Running with incorrect enforcement mode allows invalid schema entries to enter the ledger — strict mode silently
downgrades to warn, causing data quality issues that corrupt ledger integrity over time
derived_from_bd_id: BD-008
- id: finance-C-199
when: When implementing bucket cleanup with soft delete + 30-day delayed hard delete pattern
action: Verify that the 30-day recovery window matches actual business requirements — document whether deletions are expected
to be mostly accidental before relying on this assumption
severity: medium
kind: operational_lesson
modality: should
consequence: If deletions are intentional rather than accidental (e.g., user-initiated account closures), the 30-day window
creates unnecessary storage overhead and potential GDPR/compliance issues with retaining data longer than needed
derived_from_bd_id: BD-026
- id: finance-C-200
when: When processing transactions where log creation succeeds but account upsert fails
action: Implement reconciliation logic to detect orphaned logs (log exists but accounts missing) and trigger account recovery
— validate account existence before any operation requiring account consistency
severity: high
kind: domain_rule
modality: must
consequence: Logs exist without corresponding accounts, causing ledger queries to return incomplete or inconsistent state
— operations depending on account existence will fail unpredictably
derived_from_bd_id: BD-035
- id: finance-C-203
when: When deploying regex patterns for variable account segment validation
action: Validate regex patterns against known valid and invalid account examples before deploying to production — verify
patterns correctly encode business rules and test edge cases including empty strings, special characters, and maximum
length boundaries
severity: high
kind: operational_lesson
modality: must
consequence: Invalid regex patterns silently allow malformed accounts to pass validation, corrupting ledger data with
accounts that don't follow business rules and may cause downstream processing errors
derived_from_bd_id: BD-016
- id: finance-C-205
when: When implementing ledger balance update operations that modify account state
action: Acquire advisory lock on ledger ID using hashtext before modifying balances to serialize concurrent writes
severity: high
kind: domain_rule
modality: must
consequence: Concurrent modifications without proper locking cause race conditions where balance updates are lost or corrupted,
leading to incorrect account states and potential financial loss in live transactions
derived_from_bd_id: BD-018
- id: finance-C-206
when: When querying account balances within a transaction context for sufficient funds validation
action: Return locked balances for the duration of the transaction to prevent double-spend; do not use stale, unlocked, or inconsistent
balance values
severity: high
kind: domain_rule
modality: must
consequence: Using unlocked balances causes double-spend vulnerabilities where the same funds are committed multiple times,
resulting in financial loss and account reconciliation failures
derived_from_bd_id: BD-019
- id: finance-C-207
when: When executing NumScript VM instructions
action: 'Execute phases in strict order: SetVarsFromJSON → ResolveResources → ResolveBalances → Execute. Do not skip,
reorder, or parallelize phases'
severity: high
kind: operational_lesson
modality: must
consequence: Phase ordering violations cause initialization errors or silent state corruption, leading to incorrect execution
results and potentially executing transactions with uninitialized variables
derived_from_bd_id: BD-031
- id: finance-C-208
when: When reverting a transaction in the ledger for corrections
action: Use AtEffectiveDate flag to preserve original timestamp during RevertTransaction — do not use current time for
reverted transaction timestamps
severity: high
kind: domain_rule
modality: must
consequence: Using current time instead of original timestamp corrupts temporal audit trail integrity, making regulatory
audits and historical corrections unreliable and potentially non-compliant
derived_from_bd_id: BD-053
- id: finance-C-209
when: When implementing chart segment validation for parameterized chart definitions
action: Use '$' prefix to mark variable segments in chart definitions — validate using ChartSegmentRegexp which expects
dollar-prefixed variable markers
severity: medium
kind: domain_rule
modality: must
consequence: Using different variable markers breaks existing chart definitions that depend on '$' syntax for parameterized
matching, causing validation failures for all affected charts
derived_from_bd_id: BD-064
- id: finance-C-210
when: When running schema migration with multiple ledger schema versions coexisting
action: Verify that version conflict resolution is deterministic; confirm that migration logic handles concurrent schema versions
predictably without race conditions
severity: medium
kind: operational_lesson
modality: should
consequence: Non-deterministic version conflict resolution causes inconsistent schema states, potentially corrupting ledger
data during migration and leading to query failures
derived_from_bd_id: BD-017
- id: finance-C-211
when: When implementing block hashing workflow using async PostgreSQL CRON worker
action: Verify that async block hashing handles failures gracefully — check retry logic, CRON scheduling consistency,
and hash chain integrity after worker restarts
severity: high
kind: operational_lesson
modality: must
consequence: Async block hashing failures cause block gaps or inconsistent hash chains, compromising data integrity and
making historical verification impossible
derived_from_bd_id: BD-020
- id: finance-C-213
when: When calculating volume totals across accounts or assets in queries
action: Use SQL SUM aggregation for volume totals to maintain consistency guarantees — do not substitute with application-level
aggregation or map-reduce that may alter precision or consistency
severity: high
kind: domain_rule
modality: must
consequence: Using non-SQL aggregation methods alters consistency guarantees and precision, causing volume totals to differ
from on-chain records and breaking reconciliation
derived_from_bd_id: BD-074
- id: finance-C-214
when: When implementing pipeline replication logic
action: Use pull-based polling with cursor stored as lastLogID — each exporter tracks its own cursor, not the ledger tracking
consumers
severity: high
kind: domain_rule
modality: must
consequence: Switching to push-based replication would require the ledger to track consumer state, breaking crash recovery
guarantees and potentially causing data loss or duplication during leader failover
derived_from_bd_id: BD-023
- id: finance-C-215
when: When implementing transaction reversal logic via Postings.Reverse()
action: Swap source/destination fields AND reverse the array order — both operations are required for correct multi-posting
transaction reversal
severity: high
kind: domain_rule
modality: must
consequence: Implementing only a simple swap would fail to correctly reverse transactions containing multiple postings,
causing mismatched source-destination pairs and corrupted ledger state
derived_from_bd_id: BD-032
- id: finance-C-216
when: When implementing funding allocation in numscript execution
action: Merge adjacent funding parts with the same account into a single part — allocation semantics depend on consolidated
parts
severity: high
kind: domain_rule
modality: must
consequence: Keeping separate parts for multiple sends to the same account would change allocation semantics, potentially
causing incorrect fund distribution and audit mismatches
derived_from_bd_id: BD-034
- id: finance-C-217
when: When enabling or configuring MOVES_HISTORY feature for account tracking
action: Verify that the default ON setting (tracking funds movements per account/asset) matches deployment compliance
requirements — if audit trails are not needed, explicitly disable MOVES_HISTORY
severity: medium
kind: operational_lesson
modality: should
consequence: Running with MOVES_HISTORY ON by default incurs storage costs without explicit review; deployments that don't
need audit trails pay unnecessary storage overhead, while those needing audit trails may not verify data completeness
derived_from_bd_id: BD-046
- id: finance-C-218
when: When configuring HASH_LOGS feature for log chain integrity
action: Verify that the default SYNC setting (synchronous hashing on write) matches throughput requirements; if higher
throughput is required and reduced integrity guarantees are acceptable, explicitly switch to ASYNC
severity: medium
kind: operational_lesson
modality: should
consequence: Running with HASH_LOGS SYNC by default may limit throughput; deployments with relaxed integrity requirements
during import batch processing may unnecessarily sacrifice performance
derived_from_bd_id: BD-047
- id: finance-C-219
when: When executing bulk operations in numscript
action: Verify that the atomic mode default (all succeed or all fail) matches operational requirements; if partial success
is acceptable, explicitly choose parallel mode with an understanding of the consistency tradeoffs
severity: medium
kind: operational_lesson
modality: should
consequence: Bulk operations default to atomic mode for data integrity; choosing parallel mode without understanding consistency
tradeoffs may lead to partial state in multi-step operations
derived_from_bd_id: BD-054
- id: finance-C-221
when: When evaluating feature flag checks for block hashing
action: Verify that feature flag evaluation is consistent across all code paths; HASH_LOGS feature flag checks must be evaluated
identically in every execution context (sync, async, worker)
severity: high
kind: domain_rule
modality: must
consequence: Inconsistent feature flag evaluation between code paths could cause some blocks to be hashed while others
are not, breaking the log chain integrity and causing verification failures
derived_from_bd_id: BD-022
- id: finance-C-223
when: When tracking account and asset volumes via PostCommitVolumes
action: Use cumulative sum with incremental updates — volume queries depend on cumulative state, not point-in-time snapshots
or event sourcing
severity: high
kind: domain_rule
modality: must
consequence: Switching to point-in-time snapshots or event sourcing would alter volume tracking semantics, breaking existing
volume queries and causing incorrect balance calculations
derived_from_bd_id: BD-078
- id: finance-C-225
when: When processing monetary values in ledger calculations
action: Verify that all numeric variables represent whole units only; confirm that integer-only arithmetic matches business
requirements before deployment
severity: medium
kind: operational_lesson
modality: should
consequence: Decimal values in integer-only system cause precision loss, truncation errors accumulate in high-frequency
transactions, leading to balance discrepancies of fractional units
derived_from_bd_id: BD-077
- id: finance-C-229
when: When implementing transaction reversal in double-entry accounting
action: Create a new reversal transaction with source and destination fields swapped and all amounts negated; do not
merely mark the original transaction as reversed or create only an offset entry
severity: high
kind: domain_rule
modality: must
consequence: Simplifying reversal to just marking original as reversed breaks the double-entry accounting invariant where
debits must equal credits; this causes accounting imbalances that manifest as phantom funds or missing money in backtest
results
derived_from_bd_id: BD-072
- id: finance-C-230
when: When implementing posting validation in backtesting
action: Validate that posting amounts are non-negative (>= 0) — reject any attempt to create postings with negative amounts
or use a separate sign field
severity: high
kind: domain_rule
modality: must
consequence: Allowing negative posting amounts fundamentally changes transaction semantics; financial calculations that
rely on non-negative amounts will produce incorrect equity curves and return estimates in backtesting
derived_from_bd_id: BD-084
- id: finance-C-231
when: When implementing multi-source funding allocation with Concat() operations
action: Verify that FIFO ordering semantics are preserved when Concat() merges consecutive same-account parts — trace
original part indices and verify fund allocation follows the documented FIFO sequence even after concatenation
severity: medium
kind: operational_lesson
modality: should
consequence: When Concat merges non-adjacent same-account segments, FIFO fund allocation semantics break silently; funds
are allocated in the wrong order causing backtested strategy returns to diverge from live trading results without any
obvious error
derived_from_bd_id: BD-97
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-073 / UC-101
version: v5.3
intent_keywords: []
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: no candidate field had 2-7 distinct values; all capabilities collapsed into single group
groups:
- group_id: all
name: All Capabilities
description: ''
emoji: 📦
uc_count: 0
ucs: []
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
- uc_id: UC-101
beginner_prompt: Try capability UC-101
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try capability UC-102
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 0 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
- Custom Transformer + Accumulator factor with per-entity rolling state
- Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
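Set up a multi-market quantitative research and backtesting environment on the LEAN engine, with QuantBook historical data retrieval, technical indicator computation, and custom factor modeling.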
---
name: lean-cloud-backtest
description: |-
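Set up a multi-market quantitative research and backtesting environment on the LEAN engine, with QuantBook historical data retrieval, technical indicator computation, and custom factor modeling.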
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-100"
compiled_at: "2026-04-22T13:00:45.713977+00:00"
capability_markets: "multi-market"
capability_activities: "backtesting, factor-research"
sop_version: "crystal-compilation-v6.1"
---
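# LEAN Cloud Backtest (lean-cloud-backtest)
> Set up a multi-market quantitative research and backtesting environment on the LEAN engine, with QuantBook historical data retrieval, technical indicator computation, and custom factor modeling.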
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (8 total)
### C# QuantBook Research Environment Setup (`UC-101`)
Provides a foundational C# research environment template for loading QuantBook and fetching historical data across multiple asset classes for analysis
**Triggers**: C#, QuantBook, research environment
### Python QuantBook Basic Research with Indicators (`UC-102`)
Provides a Python research environment template demonstrating QuantBook setup, historical data fetching, price plotting, and Bollinger Bands indicator
**Triggers**: Python, QuantBook, Bollinger Bands
### C# Comprehensive QuantBook API and Data Fetching (`UC-103`)
Comprehensive C# template demonstrating QuantBook API cloud connectivity, project listing, and multiple methods for fetching historical data with diff
**Triggers**: C#, QuantBook, API
For all **8** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (25 total)
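- **`AP-ZVT-183`**: Adjustment factors of inf/NaN are multiplied in directly, so price adjustment fails silently
- **`AP-ZVT-179`**: Exceptions from a rate-limited third-party data API are swallowed, leaving silent data gaps
- **`AP-ZVT-183B`**: Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) K-line table causes factor drift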
All 25 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-100. Evidence verify ratio = 23.0% and audit fail total = 20. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source (source of truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 25 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-100` blueprint at 2026-04-22T13:00:45.713977+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['C# Comprehensive QuantBook API and Data Fetching', 'Python QuantBook Basic Research with Indicators', 'C# QuantBook Research Environment Setup', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **25**
## qlib (9)
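### `AP-QLIB-1930` — Backtest results are independent of the model: a shared dataset object lets the first model's predictions overwrite the rest <sub>(high)</sub>
When several Qlib models reuse one already-fitted DatasetH instance, the dataset's internal normalization parameters (the statistics determined by fit_start_time/fit_end_time) are frozen after the first fit. Switching models without reinitializing the dataset means every model effectively uses the same prediction signal. The symptom is that the backtest equity curve is identical whether you use LightGBM, XGBoost, or a DNN. This is the most dangerous form of the anti-pattern in which the experiment appears to run but every conclusion is invalid.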
Source: https://github.com/microsoft/qlib/issues/1930
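### `AP-QLIB-2090` — Double configuration of fit_start_time and the train segment causes implicit data leakage <sub>(high)</sub>
Qlib's DatasetH has two "training data ranges": the handler's fit_start_time/fit_end_time (which set the range the normalizer is fitted on) and segments.train (which sets the range the model is trained on). A common mistake is letting fit_end_time cover the valid/test segments, so the normalization statistics (mean, standard deviation) include future data and introduce look-ahead bias. The two are configured independently but are semantically coupled, and the documentation does not state that fit_end_time must be <= train_end.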
Source: https://github.com/microsoft/qlib/issues/2090
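
The coupling described above can be checked before any training runs. The following is a minimal sketch, assuming the two ranges are available as plain dates; `check_fit_window` and the `config` layout are hypothetical and are not part of Qlib.

```python
from datetime import date


def check_fit_window(fit_end_time: date, train_end: date) -> None:
    """Raise if the normalizer's fit window extends past the training segment."""
    if fit_end_time > train_end:
        raise ValueError(
            f"fit_end_time {fit_end_time} is later than train end {train_end}; "
            "normalization statistics would include validation/test data"
        )


# Hypothetical config values, shaped loosely like a handler + segments pair.
config = {
    "fit_end_time": date(2014, 12, 31),
    "segments": {"train": (date(2008, 1, 1), date(2014, 12, 31))},
}
check_fit_window(config["fit_end_time"], config["segments"]["train"][1])
```

Running the check at configuration time turns a silent leak into an immediate error.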
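### `AP-QLIB-2036` — MACD factor formula is wrong in the docs: DEA is divided by CLOSE one extra time, making the units inconsistent <sub>(high)</sub>
The Alpha formula example in Qlib's official documentation defines the MACD DEA as EMA(DIF, 9) / CLOSE, but DIF is already dimensionless (it has been divided by CLOSE), so dividing by CLOSE again gives DEA units of 1/price. A MACD factor built from the documented formula differs markedly from the correct one after cross-sectional standardization, and its IC drops. Documentation-level formula errors like this get copied straight into production factor libraries by many users.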
Source: https://github.com/microsoft/qlib/issues/2036
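
The unit argument can be illustrated in plain Python. This is a sketch under stated assumptions, not Qlib's expression engine: `ema` and `normalized_macd` are hypothetical helpers, the EMA is seeded with the first value, and the input is assumed to be a non-empty list of positive closes.

```python
def ema(values: list[float], span: int) -> list[float]:
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def normalized_macd(close, fast=12, slow=26, signal=9):
    """Return (DIF, DEA), both dimensionless.

    DIF is divided by close once; DEA is a plain EMA of DIF and is NOT
    divided by close a second time.
    """
    ema_fast, ema_slow = ema(close, fast), ema(close, slow)
    dif = [(f - s) / c for f, s, c in zip(ema_fast, ema_slow, close)]
    dea = ema(dif, signal)
    return dif, dea
```

Because both outputs are dimensionless, multiplying every close by a constant leaves them unchanged, which is a quick sanity check for any MACD factor definition.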
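### `AP-QLIB-2184` — Custom A-share data imported without NaN-filling suspended days as the convention requires, causing downstream factor noise <sub>(high)</sub>
Qlib's convention is that the open/close/high/low/volume/factor fields are all NaN on suspended days so the framework can recognize and skip them during factor computation. If a user-built A-share dataset keeps the previous day's price on suspended days (common in data exported directly from Eastmoney/Wind), price momentum factors produce "false signals" during the suspension (the price is unchanged but the factor is non-zero). Qlib does not validate this convention, so the error flows silently into the training data.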
Source: https://github.com/microsoft/qlib/issues/2184
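
A pre-import cleaning step might look like the sketch below. It assumes a row counts as suspended when its volume is 0 or missing, which is an illustrative heuristic; a real dataset may carry an explicit suspension flag. `mask_suspended` is a hypothetical helper, not a Qlib function.

```python
import math

PRICE_FIELDS = ("open", "close", "high", "low", "volume", "factor")


def mask_suspended(rows: list[dict]) -> list[dict]:
    """Return copies of rows with all price fields set to NaN on suspended days.

    Assumption: volume of 0 or None marks a suspended day.
    """
    out = []
    for row in rows:
        cleaned = dict(row)
        volume = row.get("volume")
        if volume is None or volume == 0:
            for field in PRICE_FIELDS:
                cleaned[field] = math.nan
        out.append(cleaned)
    return out
```

The function copies each row, so the exported source data stays untouched and the masking can be audited afterwards.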
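### `AP-QLIB-1892` — The PIT (Point-In-Time) financial data collector depends on an external stock-list API, so full A-share coverage is incomplete <sub>(high)</sub>
Qlib's PIT data collector (point-in-time snapshots of financial data) calls get_hs_stock_symbols() at initialization to fetch the Shanghai/Shenzhen stock list. The function depends on the Eastmoney API, which often returns only part of the list rather than all 5000+ stocks, and the function raises ValueError outright when the list is incomplete. Users who follow the documented steps end up with a financial dataset that covers only some stocks, and backtests on PIT financial factors carry severe survivorship bias (stocks that were not collected are implicitly excluded).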
Source: https://github.com/microsoft/qlib/issues/1892
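### `AP-QLIB-2097` — Full-market instrument="all" runs out of memory on a 32GB machine, while CSI300 works <sub>(medium)</sub>
When loading Alpha158 features, Qlib loads the entire feature matrix for the chosen universe into memory at once. Memory use for instrument="csi300" (300 stocks) and instrument="all" (5000+ stocks) differs by roughly 16x. A 32GB machine running the full market hits OOM during init_instance_by_config, and the error message does not point to memory. Users easily mistake this for a configuration error; the actual fix is to load in batches or use streaming feature computation.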
Source: https://github.com/microsoft/qlib/issues/2097
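### `AP-QLIB-1984` — LightGBM label-dimension check never fires, so multi-label training fails silently <sub>(medium)</sub>
Qlib's gbdt.py uses y.values.ndim == 2 to detect multi-label data, but the ndim of a Series taken from the DataFrame is always 1, so the condition is always False. Multi-label training therefore skips the squeeze branch, goes straight into LightGBM training, and raises an unclear error deeper in the stack. Users attempting a custom multi-label task cannot trace the error message back to this root cause.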
Source: https://github.com/microsoft/qlib/issues/1984
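### `AP-QLIB-1915` — After dump_bin of custom CSV data, DataHandler reports Length mismatch while D.features works <sub>(high)</sub>
Qlib has two data access paths: D.features (reads the binary files directly) and DataHandler/DataHandlerLP (with a processor pipeline). If custom A-share CSV data is dumped with a field order or symbol format (e.g. 600000.SH vs SH600000) that does not match Qlib's conventions, the DataHandler processors trigger a Length mismatch during align/reindex, while D.features succeeds because it bypasses the processors. This inconsistency between the two paths leads users to believe the data was imported correctly.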
Source: https://github.com/microsoft/qlib/issues/1915
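### `AP-QLIB-1949` — Colab/Linux multiprocessing backend conflicts with Qlib ParallelExt, making DataHandler completely unusable <sub>(medium)</sub>
In non-fork environments (Windows or Google Colab), when DataHandler loads features in parallel with joblib, ParallelExt fails at initialization while accessing the _backend_args attribute (AttributeError). The root cause is that joblib 1.5+ removed this internal attribute and Qlib's compatibility layer was not updated. The symptom is a multi-level nested exception from the D.features call, and the stack trace does not tell users whether the problem lies in the parallel backend or in the data.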
Source: https://github.com/microsoft/qlib/issues/1949
## vnpy (4)
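### `AP-VNPY-3691` — The bar generator's first bar has a misaligned timestamp, so the first period's signal is wrong <sub>(high)</sub>
When vnpy's BarGenerator synthesizes N-minute bars, the first bar it pushes is stamped with the minute of the current tick rather than the end of a complete N-minute period. Concretely, a tick at 09:59 triggers an incomplete 5-minute bar (which should not be pushed until 10:04). If a strategy filters in on_bar with datetime.minute % 5, the first bar happens to pass the filter even though it holds less than one full period of data, and using it for signal computation produces a false entry signal.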
Source: https://github.com/vnpy/vnpy/issues/3691
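
One way to guard against the partial first bar is to keep only windows that are fully covered by 1-minute data. The sketch below is independent of vnpy; `complete_windows` is a hypothetical helper, and it assumes timestamps have zero seconds and that `n` divides 60.

```python
from datetime import datetime, timedelta


def complete_windows(minute_bars: list[datetime], n: int = 5) -> list[datetime]:
    """Return the start times of n-minute windows fully covered by 1-minute bars.

    A window is kept only if all n of its minutes are present, so a window
    that begins part-way through (e.g. a first tick at 09:59) is dropped.
    """
    seen = set(minute_bars)
    starts = {t - timedelta(minutes=t.minute % n) for t in minute_bars}
    return sorted(
        s for s in starts
        if all(s + timedelta(minutes=k) in seen for k in range(n))
    )
```

With a first bar at 09:59 followed by 10:00 through 10:04, only the 10:00 window survives, so the strategy never sees the one-minute "5-minute" bar.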
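### `AP-VNPY-3669` — Incremental saves of Alpha-module historical data raise SchemaError because old and new DataFrame schemas are incompatible <sub>(medium)</sub>
When saving K-line data to Parquet files, the vnpy Alpha module runs polars.concat directly on newly downloaded data (which may contain Float64 columns) and the stored file (historical Int64 columns). polars is strictly typed and does not allow implicit type promotion, so it raises SchemaError. The root cause is that different data sources/versions return inconsistent field types (for example, volume is an integer from some feeds and a float from others), and there is no schema alignment step before the concat. This affects every historical data build for backtests that use vnpy alpha.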
Source: https://github.com/vnpy/vnpy/issues/3669
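### `AP-VNPY-3685` — SpreadTrading run_backtesting() fails silently in Jupyter, and the results cannot be trusted <sub>(high)</sub>
In vnpy 4.10, run_backtesting() in the SpreadTrading module hits an event loop conflict under Jupyter (asyncio already running). Part of the backtest engine logic does not run, yet no exception is raised and seemingly normal backtest statistics are returned. The same code has no problem in command-line Python. vnpy 4.x moved some IO to async, which is incompatible with Jupyter's event loop; this is a hidden trap in which backtest results look correct but are actually incomplete.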
Source: https://github.com/vnpy/vnpy/issues/3685
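### `AP-VNPY-3700` — Install script does not use a venv, so the global numpy is downgraded and other dependencies break <sub>(medium)</sub>
vnpy's install.bat installs directly into the system/conda base environment and forces numpy down to <2.0 to satisfy vnpy's dependencies, breaking other quant tools that depend on numpy 2.x (such as recent versions of scipy and pytorch). There is no requirements.txt, so the dependency boundary is opaque. In research environments where several tools coexist, vnpy's install script is a common source of global environment pollution.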
Source: https://github.com/vnpy/vnpy/issues/3700
## zipline (6)
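### `AP-ZIPLINE-138` — Backtest prices are unadjusted, and the tutorial chart misleads users about strategy returns <sub>(high)</sub>
The Zipline tutorial uses an AAPL price chart for its demo, but the bundle stores unadjusted prices (raw price) rather than prices adjusted for splits/dividends. The historical prices in the chart differ from actual market prices by about 4x (Apple's cumulative split factor), and users mistake "the price quadrupled" for strategy returns. The A-share case is worse: the price jump around an ex-rights date forms a huge "signal" in unadjusted data and lures technical indicators into false breakout signals on that date.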
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/138
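### `AP-ZIPLINE-235` — Default fill at the current bar's close understates live slippage and inflates backtest returns <sub>(high)</sub>
After a signal fires on a bar, Zipline's default slippage model fills at that same bar's close (current bar close fill). In live trading a signal can only be filled near the next bar's open (T+1 order execution). For A-share daily bars, backtesting with the close overstates daily returns by about 0.1-0.3% on average compared with filling at the next day's open, and the annualized gap can exceed 30%. Explicitly configure the slippage model as VolumeShareSlippage or FixedSlippage and set a reasonable volume_limit.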
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/235
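
The next-bar convention can be sketched without any backtesting framework. `next_bar_fills` is a hypothetical helper, not a Zipline API; it assumes `signals` and `opens` have the same length and leaves slippage and costs out on purpose.

```python
def next_bar_fills(signals: list[int], opens: list[float]) -> list[tuple[int, int, float]]:
    """Map each non-zero signal on bar t to a fill at the open of bar t + 1.

    signals[t] is +1 (buy), -1 (sell) or 0 (no action), computed after bar t closes.
    Returns (fill_bar_index, side, fill_price); a signal on the last bar has no
    next bar and is left unfilled.
    """
    fills = []
    for t, side in enumerate(signals[:-1]):
        if side != 0:
            fills.append((t + 1, side, opens[t + 1]))
    return fills
```

A buy signal on bar 1 is filled at the open of bar 2, and a signal on the final bar produces no fill at all, which keeps the backtest from trading on prices it could not have reached.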
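### `AP-ZIPLINE-190` — A calendar start_session on a non-trading day triggers DateOutOfBounds with no hint on how to fix it <sub>(medium)</sub>
When registering a bundle or running an algorithm, if start_session falls on a non-trading day (such as New Year's Day, 1998-01-01), Zipline's calendar validation raises DateOutOfBounds("cannot be earlier than the first session"). The error message shows only the calendar's first session and does not suggest changing the value to the first trading day. A-share case: with the SSE/SZSE calendars, a start_date that falls on the day after the last trading day before Spring Festival (a holiday) triggers the same kind of error, and it is very costly to debug.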
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/190
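
Snapping the requested start date to the first available session avoids the error before the framework sees it. This is a sketch that assumes the calendar's sessions are already available as a sorted list of dates; `first_session_on_or_after` is a hypothetical helper and the session dates in the usage lines are illustrative only.

```python
from bisect import bisect_left
from datetime import date


def first_session_on_or_after(sessions: list[date], start: date) -> date:
    """Return the first trading session >= start from a sorted session list."""
    i = bisect_left(sessions, start)
    if i == len(sessions):
        raise ValueError(f"{start} is after the last known session")
    return sessions[i]


# Illustrative session list; a real one would come from the exchange calendar.
sessions = [date(1998, 1, 2), date(1998, 1, 5), date(1998, 1, 6)]
start = first_session_on_or_after(sessions, date(1998, 1, 1))
```

Here a requested start of 1998-01-01 resolves to the first listed session, so the value passed on is always a session the calendar accepts.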
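### `AP-ZIPLINE-181` — With a stale asset db, Pipeline reports "no assets traded" and misleads users into checking the data range <sub>(high)</sub>
Zipline's asset database (SQLite) records the start/end trading dates of each stock. If an old Quandl or self-built bundle is used without re-ingesting, Pipeline raises "Failed to find any assets with country_code 'US' that traded between [dates]" when backtesting a newer date range. A-share case: if only price data is updated after re-downloading quotes and the asset db is not rebuilt, the date ranges of delisted and newly listed stocks are not updated, and Pipeline filtering quietly excludes those stocks, producing survivorship bias.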
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/181
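### `AP-ZIPLINE-285` — week_start()/week_end() silently fail under custom (non-US) calendars <sub>(medium)</sub>
date_rules.week_start() and date_rules.week_end() in Zipline's schedule_function rely on the trading calendar's logic for the first/last day of the week, but under non-US calendars (such as ASX or SSE) that logic is incompatible with the offset calculation used for the NYSE calendar, so the schedule either never fires or fires on the wrong date. A-share case: with the SSE calendar, in weeks that contain long consecutive holidays such as Spring Festival, week_start may skip the entire holiday week without rebalancing, and the logs give users no indication that the schedule did not fire.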
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/285
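### `AP-ZIPLINE-240` — Backtest dates must be in UTC; passing a naive datetime raises a deep AssertionError <sub>(medium)</sub>
Zipline requires every timestamp to be a UTC-aware datetime internally. When a user passes a naive datetime (no timezone information, e.g. pd.Timestamp('2020-01-01')), no error is raised at the entry point; instead an AssertionError: Algorithm should have a utc datetime fires deep inside algorithm execution, where the stack depth makes it hard to locate. A-share developers importing data in local CST time hit this trap very easily and need to call tz_localize('UTC') explicitly when registering the bundle.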
Source: https://github.com/stefan-jansen/zipline-reloaded/issues/240
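
A small normalization step at the data boundary can make the timezone explicit. The sketch below uses only the standard library and assumes naive timestamps are wall-clock China Standard Time; whether they should instead be labeled as UTC directly, as the tz_localize('UTC') advice does, depends on what the source data actually represents. `to_utc` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # China Standard Time, fixed UTC+8 with no DST


def to_utc(ts: datetime, local_tz: timezone = CST) -> datetime:
    """Return a UTC-aware datetime; naive inputs are interpreted in local_tz."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=local_tz)
    return ts.astimezone(timezone.utc)
```

Already-aware inputs pass through with a plain conversion, so the function is safe to apply to every timestamp entering the pipeline.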
## zvt (6)
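### `AP-ZVT-183` — Adjustment factors of inf/NaN are multiplied in directly, so price adjustment fails silently <sub>(high)</sub>
ZVT computes qfq_factor, the forward-adjustment factor, as the ratio of new to old price. When old == 0 (a new listing's first day or missing data) the factor is inf; when kdata.open itself is None (a suspended day left unfilled) the multiplication raises TypeError. The result is that adjustment for the whole entity is interrupted and all subsequent K-lines are lost, but the main flow only logs an ERROR and continues, so users often do not know that data for many stocks is already corrupted.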
Source: https://github.com/zvtvz/zvt/issues/183
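
Both failure modes can be guarded explicitly. The sketch below is not ZVT code; `safe_adjust_factor` and `adjust_open` are hypothetical helpers that return None instead of propagating inf, NaN, or a TypeError, leaving the caller to decide how to record the gap.

```python
import math
from typing import Optional


def safe_adjust_factor(new_price: Optional[float], old_price: Optional[float]) -> Optional[float]:
    """Return new_price / old_price, or None when the ratio is not usable."""
    if new_price is None or old_price is None or old_price == 0:
        return None
    factor = new_price / old_price
    return factor if math.isfinite(factor) else None


def adjust_open(open_price: Optional[float], factor: Optional[float]) -> Optional[float]:
    """Apply the factor to an open price, skipping unfilled suspended days."""
    if open_price is None or factor is None:
        return None
    return open_price * factor
```

Returning None per bar keeps the rest of the entity's K-lines intact, and the None values can be counted and reported instead of being lost in an error log.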
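### `AP-ZVT-179` — Exceptions from a rate-limited third-party data API are swallowed, leaving silent data gaps <sub>(high)</sub>
When ZVT uses JoinQuant's jqdatasdk to bulk-fetch K-lines for the whole market (4000+ stocks), it hits JoinQuant's daily maximum query count (error: "已超过每日最大查询数量", meaning the daily maximum number of queries has been exceeded). ZVT catches the exception and moves on to the next entity, so after the limit is hit, that day's data for every remaining stock is silently missing. A backtest on the incomplete database produces systematically biased factor values with no warning.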
Source: https://github.com/zvtvz/zvt/issues/179
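
A fetch loop can stop at the first quota error and report exactly which entities are missing. This is a sketch with hypothetical names: `QuotaExceeded` stands in for whatever error the provider raises, and `fetch_one` is any callable supplied by the caller; neither belongs to ZVT or jqdatasdk.

```python
class QuotaExceeded(Exception):
    """Hypothetical marker for a provider's daily-limit error."""


def fetch_all(entity_ids, fetch_one):
    """Fetch every entity; stop at the first quota error and report what is missing.

    Returns (done, missing): results keyed by entity id, and the ids that were
    never fetched because the quota ran out.
    """
    done, missing = {}, []
    for i, entity_id in enumerate(entity_ids):
        try:
            done[entity_id] = fetch_one(entity_id)
        except QuotaExceeded:
            missing = list(entity_ids[i:])
            break
    return done, missing
```

A non-empty `missing` list gives the caller a concrete signal to block the backtest or resume the fetch the next day.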
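### `AP-ZVT-161` — Full-market batch factor computation on SQLite triggers the too many SQL variables error <sub>(medium)</sub>
When computing multi-stock factors such as VolumeUpMaFactor, ZVT puts every entity_id into the IN clause of a single SQL statement. Querying the whole A-share market (5000+ stocks) at once hits SQLite's default limit SQLITE_MAX_VARIABLE_NUMBER=999. Raising max_allowed_packet (a MySQL parameter) has no effect, because the root cause is SQLite's variable limit. The correct fix is to query in batches, but early ZVT versions did not handle this boundary.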
Source: https://github.com/zvtvz/zvt/issues/161
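
Batching the IN clause can be sketched with the standard-library sqlite3 module. The table name `kdata`, its columns, and the batch size of 900 are assumptions for illustration; ZVT's own schema and query layer differ.

```python
import sqlite3


def query_in_batches(conn: sqlite3.Connection, entity_ids: list[str], batch_size: int = 900):
    """Run an IN (...) query in chunks so no statement exceeds the variable limit."""
    rows = []
    for start in range(0, len(entity_ids), batch_size):
        batch = entity_ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        # Hypothetical table and columns, used only for this example.
        sql = f"SELECT entity_id, close FROM kdata WHERE entity_id IN ({placeholders})"
        rows.extend(conn.execute(sql, batch).fetchall())
    return rows
```

A batch size below 999 stays under the limit quoted above regardless of how the SQLite library in use was compiled.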
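### `AP-ZVT-129` — Wildcard imports hide API changes between versions, and enums such as AdjustType vanish unexpectedly <sub>(medium)</sub>
ZVT's documentation examples use `from zvt import *` to import every symbol. After a ZVT upgrade refactors the enums (for example, moving AdjustType into a submodule), the wildcard import no longer includes the symbol and an AttributeError is raised. Users assume the installation is broken, when in fact it is an API breaking change between versions that the CHANGELOG does not flag, and the wildcard import hides where the symbol actually comes from. Import enum classes explicitly.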
Source: https://github.com/zvtvz/zvt/issues/129
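### `AP-ZVT-187` — The backtest engine does not stop early on an empty data layer, causing a cascading null-pointer-style crash <sub>(medium)</sub>
When the ZVT Trader finds the data empty after load_data completes, it does not exit early. It passes the empty DataFrame into the selector computation, which sets off a chain of crashes on NoneType operations. The stack trace is deep and the root cause is hard to find, so users assume the strategy logic is at fault. The actual root cause is a misconfigured data time window (start/end outside the range the database covers) with no effective validation.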
Source: https://github.com/zvtvz/zvt/issues/187
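### `AP-ZVT-183B` — Using the wrong HFQ (backward-adjusted) or QFQ (forward-adjusted) K-line table causes factor drift <sub>(high)</sub>
ZVT provides three separate tables: Stock1dKdata (unadjusted), Stock1dHfqKdata (backward-adjusted), and Stock1dQfqKdata (forward-adjusted). Users mix two tables when computing price momentum or moving-average factors (for example, moving averages from unadjusted data and returns from backward-adjusted data), which makes factor values jump around ex-rights dates. ZVT does not check that adjustment types are consistent across tables, so the mix passes silently.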
Source: https://github.com/zvtvz/zvt/issues/183
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-100--LEAN
**Scan date**: 2026-04-22
**Stats**: {'total_files': 7, 'total_classes': 22, 'total_functions': 0, 'total_stages': 7}
## Modules (7)
- [universe_selection](components/universe_selection.md): 3 classes
- [alpha_generation](components/alpha_generation.md): 3 classes
- [portfolio_construction](components/portfolio_construction.md): 3 classes
- [risk_management](components/risk_management.md): 2 classes
- [execution](components/execution.md): 3 classes
- [order_lifecycle_management](components/order_lifecycle_management.md): 4 classes
- [algorithm_manager_(orchestration)](components/algorithm_manager_-orchestration.md): 4 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 142
fatal_constraints_count: 44
non_fatal_constraints_count: 241
use_cases_count: 8
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
## Domain Constraints Injected (39)
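- **`SHARED-BT-LAB-001`** <sub>(fatal)</sub>: Look-ahead bias: when simulating a trading decision at historical time t, do not use information that only becomes known after t. The most common forms are: (1) computing a signal from the close and filling at the same day's close; (2) stamping an indicator computed after the close of day T on that same bar; (3) assuming fills at the day's high/low. Signal computation and fill time must be aligned: compute the signal after the close of day T and fill at the open of day T+1.
- **`SHARED-BT-LAB-002`** <sub>(high)</sub>: Indicator warmup period: rolling-window indicators are NaN for the first N bars, and those bars must not take part in signal computation or position decisions. Require each indicator's warmup_period to equal its longest lookback, and keep positions at zero during warmup.
- **`SHARED-BT-LAB-003`** <sub>(fatal)</sub>: Time-series splits for ML/DL models must follow chronological order, TRAIN < VALID < TEST. Do not use random k-fold splits, which mix future data into the training set. Use TimeSeriesSplit or walk-forward validation.
- **`SHARED-BT-LAB-004`** <sub>(fatal)</sub>: Open/high/low fill assumptions: assuming in a daily backtest that you can sell at the day's high or buy at the day's low (for example, a momentum strategy that takes profit at the high) is plain look-ahead, because the intraday high/low is only confirmed after the close. Fill prices may only be the open or the previous day's close (with slippage).
- **`SHARED-BT-LAB-005`** <sub>(high)</sub>: Off-by-one data alignment: pandas operations such as rolling/shift easily introduce subtle one-period offset errors. Record each series' observation time explicitly in the code and use assert to verify the key time alignment relationships.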
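- **`SHARED-BT-LAB-006`** <sub>(high)</sub>: Overfitting: the more backtests you run, the higher the probability of overfitting. Bailey et al. (2014) show that the expected maximum Sharpe ratio found across backtests rises with the number of trials even when the true Sharpe is zero, so the best in-sample result overstates real performance. Use walk-forward validation instead of exhaustive in-sample parameter search, and report the Deflated Sharpe Ratio (DSR) rather than the peak Sharpe.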
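- **`SHARED-BT-SURV-001`** <sub>(fatal)</sub>: Survivorship bias: using today's index constituents as the historical backtest universe omits stocks that once existed but were later delisted, removed, or merged, and systematically overstates historical strategy returns. The backtest universe must use point-in-time snapshots (point-in-time universe).
- **`SHARED-BT-SURV-002`** <sub>(high)</sub>: In-sample / out-of-sample split: strategy development and parameter selection must be completed in-sample; out-of-sample data is for final validation only. Do not keep tuning after repeatedly looking at out-of-sample data, which turns it into new in-sample data and repeats the overfitting.
- **`SHARED-BT-SURV-003`** <sub>(high)</sub>: Fill policy for suspended/missing data: do not simply forward-fill suspended-day prices with the previous close, because that lets zero-volume days take part in factor computation and signal generation. Filter out missing trading days explicitly at the factor computation layer and do not fill them.
- **`SHARED-BT-SURV-004`** <sub>(high)</sub>: Extreme value contamination: raw market data may contain source errors (such as ex-rights adjustments applied late, or extreme prices from manual entry mistakes). Feeding it into factor computation uncleaned produces extreme signals that contaminate the whole cross-section. Filter 3-sigma outliers at the pipeline entry.
- **`SHARED-BT-COST-001`** <sub>(fatal)</sub>: Transaction costs (commission + stamp duty/transfer tax + transfer fee) must be configured at backtest initialization; a zero-cost default is not allowed. Performance metrics from a backtest that ignores costs are deceptive, especially for high-turnover strategies (round-trip costs often consume 50%+ of gross returns).
- **`SHARED-BT-COST-002`** <sub>(high)</sub>: Slippage modeling: a backtest without slippage assumes every order fills at the ideal price, and high-frequency strategies then lose heavily in live trading as fill prices deteriorate. Configure at least a fixed spread or proportional slippage; large orders should use a volume-share model (for example, no more than 5% of daily volume).
- **`SHARED-BT-COST-003`** <sub>(high)</sub>: Turnover must appear in the backtest performance report and be analyzed together with costs. When monthly turnover exceeds 50% (600%+ annualized), net returns are extremely sensitive to cost assumptions: every 10 bps change in cost can flip the conclusion on whether the strategy is profitable, so a cost sensitivity analysis is required.
- **`SHARED-BT-COST-004`** <sub>(medium)</sub>: Position sizing must respect capital constraints: the backtest should simulate the actual (rounded) share counts held under a fixed amount of capital instead of assuming fractional shares. For small caps, the minimum trading unit (A-shares: 100 shares per lot) makes the achievable position deviate from the target weight, and the backtest should simulate this rounding effect.
- **`SHARED-BT-TIME-001`** <sub>(high)</sub>: Consistent timestamp timezones: when merging several data sources, mixing UTC and local time is a common source of data corruption. Convert all timestamps to one timezone at the pipeline entry (UTC for storage and market-local time for display are recommended); do not mix timezones partway through the pipeline.
- **`SHARED-BT-TIME-002`** <sub>(high)</sub>: Trading calendar alignment: when merging data from different markets or frequencies (such as daily prices + weekly factors), reindex/merge against an explicit trading calendar. Do not outer join and then fillna, which creates spurious rows on non-trading days (holidays).
- **`SHARED-BT-TIME-003`** <sub>(high)</sub>: Boundary check for incremental updates: when updating historical data incrementally, query the database for the latest stored date and download only data after it. Re-downloading existing data and appending it creates rows with duplicate timestamps and corrupts backtest ordering. Verify before and after each update that there are no duplicates (index.duplicated().any() == False).
- **`SHARED-BT-TIME-004`** <sub>(medium)</sub>: Distorted performance attribution: a poorly chosen benchmark distorts the Alpha/Beta calculation. Use a passive benchmark the strategy could actually invest in (such as an HS300 ETF) rather than a price index that cannot be invested in directly (such as the HS300 index). A price index excludes dividend reinvestment and understates the benchmark's holding return.
- **`SHARED-BT-PERF-001`** <sub>(medium)</sub>: Max drawdown must be computed from the net value series (portfolio value), not from a cumulative return series. Summing log returns understates drawdown depth (because log returns are smaller than simple returns during declines).
- **`SHARED-BT-PERF-002`** <sub>(medium)</sub>: Sharpe ratio annualization convention: annualized Sharpe = daily Sharpe × sqrt(252) (equities, 252 trading days) or × sqrt(365) (crypto, 365 days). Defaults differ between systems, so confirm the annualization factor before comparing across systems; otherwise the Sharpe values are not comparable.
- **`SHARED-BT-PERF-003`** <sub>(medium)</sub>: Calmar ratio / Sortino ratio are better risk-adjusted return metrics than the Sharpe ratio: Sharpe assumes normally distributed returns, while returns in A-share and crypto markets are markedly left-skewed (fat-tailed), so it understates downside risk. A quantitative evaluation should report both Sortino (downside volatility only) and Calmar (annualized return / max drawdown) and not rely on Sharpe alone.
- **`SHARED-BT-PERF-004`** <sub>(medium)</sub>: Backtest performance attribution should be broken down into alpha (active return), beta (market return), factor exposure return (style/sector), and idiosyncratic return (stock selection). A backtest without attribution cannot tell a genuinely good strategy from one whose beta happened to suit a favorable market.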
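- **`SHARED-FR-IC-001`** <sub>(high)</sub>: IC (information coefficient) is the core measure of a factor's predictive power, defined as the Spearman rank correlation between factor values and next-period returns; ICIR = mean(IC) / std(IC) over the IC time series. |IC| > 0.05 is taken as preliminary evidence of predictive power, and ICIR > 0.5 as evidence of stability. Reporting backtest performance without computing IC is a typical case of missing proof of factor validity.
- **`SHARED-FR-IC-002`** <sub>(high)</sub>: IC decay analysis: a factor's predictive power usually decays as the holding period lengthens. Compute IC series at 1/5/10/20-day horizons to identify the factor's optimal holding period. A factor whose IC is high at 1 day but decays quickly by 20 days is a short-horizon factor and is unsuitable for monthly rebalancing, and vice versa. Using the wrong holding period severely hurts a factor's live performance.
- **`SHARED-FR-IC-003`** <sub>(high)</sub>: Harvey, Liu & Zhu (2016) warn that academia has found 300+ "significant" factors, many of which are false discoveries under multiple testing. Factor validity requires t-stat > 3.0 (rather than the conventional 1.96), or independent replication in different periods/markets, or a clear economic rationale. Factors that meet none of these conditions are very likely data-mining artifacts.
- **`SHARED-FR-IC-004`** <sub>(high)</sub>: Factor turnover control: a factor with high IC but high turnover may have negative net IC after transaction costs. Compute the turnover-adjusted effective IC: net_IC = IC - turnover × cost_per_turn. Target turnover ≤ 50% (monthly frequency).
- **`SHARED-FR-IC-005`** <sub>(medium)</sub>: Factor half-life is a core parameter of signal strength and directly determines the optimal rebalancing frequency. Half-life < 5 days: daily or weekly rebalancing; 5-20 days: weekly or biweekly; > 20 days: monthly. Rebalancing a short-horizon factor monthly lets much of its alpha dissipate during the holding period.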
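- **`SHARED-FR-NEUT-001`** <sub>(high)</sub>: Industry neutralization: if factor values are not neutralized against industry means, industry-rotation returns leak into factor returns, making it hard to tell whether the factor itself or industry exposure drove the returns. Operation: factor_neutral = factor - industry_mean(factor).
- **`SHARED-FR-NEUT-002`** <sub>(high)</sub>: Market-cap neutralization: the small-cap effect (small caps outperforming large caps) is one of the most persistent anomalies in financial history and contaminates almost every non-neutralized factor. If a factor is highly correlated with market cap, stock selection systematically tilts toward small caps, and returns come from size exposure rather than the factor itself. Neutralize for both industry and market cap (Fama-MacBeth regression or the residual method).
- **`SHARED-FR-NEUT-003`** <sub>(high)</sub>: Outlier handling (Winsorize/MAD): raw factor values usually contain extreme values, which distort quantile analysis (e.g., Q1/Q10 deciles). Winsorize raw factor values (clip to [1%, 99%] or 3-sigma) or apply MAD (median absolute deviation) clipping before ranking or neutralizing.
- **`SHARED-FR-NEUT-004`** <sub>(medium)</sub>: Factor orthogonalization: when several factors are combined into a composite score, combining highly correlated factors is equivalent to over-weighting a single factor and dilutes signal diversity. Before combining, apply Gram-Schmidt orthogonalization or PCA to remove multicollinearity among the factors.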
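- **`SHARED-FR-NEUT-005`** <sub>(medium)</sub>: Missing-data fill policy: for NaNs in factor calculation (suspensions / new listings / data gaps), filling with a mean that draws on later dates introduces look-ahead bias (the mean itself contains future information), while dropping the rows entirely produces survivorship bias. The correct approach is to fill with the cross-sectional median (the median across all stocks on that day, which does not depend on the future) or to exclude that stock for that day.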
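- **`SHARED-FR-PORT-001`** <sub>(high)</sub>: Quantile analysis: factor evaluation should use the long-short return spread (top minus bottom) between Q1/Q5 (quintile) or Q1/Q10 (decile) groups as the primary metric, rather than the long-only return. The monotonicity test across quantile groups (long Q1, short Q5) is the core evidence of factor validity: monotonic increasing/decreasing > non-monotonic >> effective on the long side only.
- **`SHARED-FR-PORT-002`** <sub>(medium)</sub>: Alpha decay test: the stability of a factor's monthly IC across regimes (bull / bear / range-bound markets) is important evidence of robustness. A factor whose IC holds only in one particular market state is unsuitable for all-weather deployment; show the IC time series in rolling 12M segments to identify the periods in which the factor fails.
- **`SHARED-FR-PORT-003`** <sub>(medium)</sub>: Turnover-aware selection: for stocks whose factor rank sits near the middle (49th-51st percentile), small rank fluctuations trigger rebalancing and generate a large amount of wasted transaction cost. Set a buffer zone at selection time: rebalance only when the rank change exceeds a threshold.
- **`SHARED-FR-PORT-004`** <sub>(medium)</sub>: Statistical significance of quantile returns (bootstrap test): a factor's quantile return spread (Q1-Q5 spread) may be due to chance even when it is large on historical data; test its significance with a bootstrap or t-test (p-value < 0.05). Quantile returns from short backtest samples (< 3 years) are especially unreliable.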
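- **`SHARED-FR-XFER-001`** <sub>(high)</sub>: Cross-market portability check: a factor that works in one market does not necessarily work in another. Applying a US equity factor directly to A-shares, or an equity factor to futures/crypto, requires independent IC validation; cross-market generality must not be assumed. Anomalies documented in A-shares (such as the reversal effect and ST-stock price anomalies) may not carry over to US equities.
- **`SHARED-FR-XFER-002`** <sub>(medium)</sub>: Stability of factor validity over time: factors that once worked gradually lose effectiveness through market learning and arbitrage (McLean & Pontiff 2016 report an average 58% decline after publication). Re-evaluate factor IC periodically (quarterly/annually), and replace or down-weight failed factors promptly.
- **`SHARED-FR-XFER-003`** <sub>(medium)</sub>: Interaction between factors and the macro environment: the rate cycle, business cycle, and market sentiment significantly affect factor effectiveness. The value factor (low P/B) is more effective when rates are rising; the momentum factor is more effective in trending markets and fails in range-bound markets. Before deploying a factor, assess how well the current macro environment matches the environment in which the factor performs best.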
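
The IC and ICIR definitions in `SHARED-FR-IC-001` can be sketched in plain Python as follows. This is a minimal illustration using only the standard library; the function names (`average_ranks`, `rank_ic`, `icir`) and the sample data are hypothetical and are not part of any library named in this document. It assumes one cross-section per date and no constant series (a constant series would make the correlation undefined).

```python
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_ic(factor_values, next_returns):
    """Spearman rank correlation for one cross-section (one date)."""
    return pearson(average_ranks(factor_values), average_ranks(next_returns))


def icir(ic_series):
    """Mean IC divided by the sample standard deviation of IC."""
    return mean(ic_series) / stdev(ic_series)


# Hypothetical data: three dates, four assets each.
factor_by_date = [[0.1, 0.4, 0.3, 0.9], [0.2, 0.1, 0.8, 0.5], [0.7, 0.3, 0.2, 0.6]]
returns_by_date = [[0.00, 0.02, 0.01, 0.03], [0.01, -0.01, 0.02, 0.03], [0.02, 0.00, 0.01, -0.01]]

ic_series = [rank_ic(f, r) for f, r in zip(factor_by_date, returns_by_date)]
print(ic_series, icir(ic_series))
```

Computing the same series at 1/5/10/20-day forward returns gives the IC decay profile described in `SHARED-FR-IC-002`.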
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
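
As an illustration of how a lock such as `SL-05` could be enforced at construction time, here is a minimal sketch. `SignalSizing` is a hypothetical stand-in written for this example; it is not zvt's `TradingSignal` class, and only the three field names come from the lock text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalSizing:
    """Hypothetical stand-in for the sizing fields of a trading signal."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self):
        # Collect the names of the sizing fields that were actually supplied.
        provided = [
            name
            for name in ("position_pct", "order_money", "order_amount")
            if getattr(self, name) is not None
        ]
        if len(provided) != 1:
            raise ValueError(
                f"SL-05: expected exactly one sizing field, got {provided or 'none'}"
            )


sizing = SignalSizing(position_pct=0.2)  # accepted: exactly one field set
```

Checking in `__post_init__` means an invalid combination fails where the signal is created rather than later in the trading cycle.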
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: run `python3 -m pip install zvt`, then run `python3 -m zvt.init_dirs` to initialize the data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: run the recorder first: `python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000` (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: run `python3 -m zvt.init_dirs`
- **`PC-04`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: check directory permissions with `chmod u+w ~/.zvt`, or set the `ZVT_HOME` environment variable to a writable location
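
The one-liners for `PC-03` and `PC-04` can be hard to read. The sketch below restates the same two checks as functions, using the same `ZVT_HOME` lookup and `.write_test` probe file as the preconditions above; the function names are hypothetical and were chosen for this example.

```python
import os
from pathlib import Path


def resolve_zvt_home() -> Path:
    """Same lookup as PC-03/PC-04: $ZVT_HOME, else ~/.zvt."""
    return Path(os.environ.get("ZVT_HOME", Path.home() / ".zvt"))


def check_zvt_home(zvt_home: Path) -> list[str]:
    """Return the IDs of failed preconditions (empty list means all passed)."""
    failures = []
    if not zvt_home.exists():
        failures.append("PC-03")
        return failures  # PC-04 cannot pass without the directory
    probe = zvt_home / ".write_test"
    try:
        # Create and remove a probe file to confirm the directory is writable.
        probe.touch()
        probe.unlink()
    except OSError:
        failures.append("PC-04")
    return failures


if __name__ == "__main__":
    print(check_zvt_home(resolve_zvt_home()))
```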
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **8**
## `KUC-101`
**Source**: `Research/BasicCSharpQuantBookTemplate.ipynb`
Provides a foundational C# research environment template for loading QuantBook and fetching historical data across multiple asset classes for analysis.
## `KUC-102`
**Source**: `Research/BasicQuantBookTemplate.ipynb`
Provides a Python research environment template demonstrating QuantBook setup, historical data fetching, price plotting, and Bollinger Bands indicator calculation.
## `KUC-103`
**Source**: `Research/KitchenSinkCSharpQuantBookTemplate.ipynb`
Comprehensive C# template demonstrating QuantBook API cloud connectivity, project listing, and multiple methods for fetching historical data with different parameters and asset types.
## `KUC-104`
**Source**: `Research/KitchenSinkQuantBookTemplate.ipynb`
Most comprehensive Python template showing QuantBook API connectivity, multi-asset data (including options), option history filtering, multiple data fetching methods, and price plotting.
## `KUC-105`
**Source**: `Tests/Research/RegressionTemplates/BasicTemplateCustomDataTypeHistoryResearchCSharp.ipynb`
Demonstrates how to create and implement a custom data type in C# extending DynamicData, including custom source fetching from remote CSV files for research analysis.
## `KUC-106`
**Source**: `Tests/Research/RegressionTemplates/BasicTemplateCustomDataTypeHistoryResearchPython.ipynb`
Shows how to implement a custom data type in Python using PythonData base class, defining custom GetSource and Reader methods to fetch data from remote CSV for research.
## `KUC-107`
**Source**: `Tests/Research/RegressionTemplates/BasicTemplateResearchCSharp.ipynb`
Minimal C# template demonstrating QuantBook setup with date-range based historical data fetching for research analysis.
## `KUC-108`
**Source**: `Tests/Research/RegressionTemplates/BasicTemplateResearchPython.ipynb`
Python research template showing QuantBook setup, date-range historical data fetching, and Bollinger Bands indicator history retrieval for technical analysis.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
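## `CW-BT-001` — Cerebro unified orchestration engine
**From**: backtrader · **Applicable to**: backtesting
backtrader uses Cerebro as the single entry point that manages the lifecycle of data feeds, strategies, analyzers, and observers, and it supports running multiple strategies and data sources in one cerebro.run() call. zvt's StockTrader currently binds only one set of factors per instance and lacks a unified multi-strategy orchestration layer; borrowing the Cerebro pattern would let users combine multiple Trader instances in one runner for side-by-side evaluation.
## `CW-BT-002` — Pluggable Analyzer performance evaluation
**From**: backtrader · **Applicable to**: backtesting
backtrader provides plug-and-play Analyzers such as SharpeRatio, DrawDown, TimeReturn, and TradeAnalyzer, which attach arbitrary performance metrics without modifying strategy code. zvt's performance evaluation is currently weak and has no standardized Analyzer interface; borrowing this pattern would let users obtain a risk-adjusted return report with a call like cerebro.addanalyzer(SharpeRatio).
## `CW-BT-003` — Sizer: position sizing separated from signals
**From**: backtrader · **Applicable to**: backtesting
backtrader abstracts position sizing (how many shares or what fraction to buy on each entry) into a separate Sizer that is fully decoupled from signal logic; FixedSize, PercentSizer, and others are built in, and users can define their own. zvt currently has no explicit Sizer concept, and position-control logic is scattered across hooks such as Trader.on_profit_control; introducing a Sizer interface would let strategy signals and money-management rules evolve independently and be reused in combination.
## `CW-BT-004` — Full set of order types (Limit/Stop/OCO/Bracket)
**From**: backtrader · **Applicable to**: backtesting
backtrader supports a rich set of order types, including Market, Limit, Stop, StopLimit, OCO (one-cancels-other), and Bracket (a paired take-profit and stop-loss order), and it simulates fill slippage and commission schemes. zvt backtests currently support mainly market-price fills and lack limit orders and combined-order simulation; for high-frequency or live-trading integration, fuller order-type support would substantially improve backtest realism.
## `CW-BT-005` — Data resampling and replaying
**From**: backtrader · **Applicable to**: backtesting
backtrader can resample lower-timeframe data (e.g., 1 min) into a higher timeframe (e.g., 1 day) at run time and drive the strategy in sync, or replay it tick by tick to simulate how each OHLC bar forms, enabling fine-grained intraday backtests. zvt currently handles multiple timeframes by pre-recording K-lines at each level and lacks run-time dynamic resampling; borrowing this pattern would support backtests over arbitrary timeframe combinations without recording the data twice.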
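## `CW-VN-003` — Built-in visualization in the CTA backtesting engine
**From**: vnpy · **Applicable to**: backtesting
vnpy's cta_backtester provides a GUI that directly shows the strategy's net-value curve, max drawdown, daily P&L, and trade details, with no Jupyter Notebook required. zvt's backtest visualization currently relies on calling the draw_result method, which uses Plotly, and there is no unified backtest report page; borrowing this pattern would allow packaging an out-of-the-box strategy performance dashboard.
## `CW-VN-004` — vnpy.alpha ML factor research lab (Lab)
**From**: vnpy · **Applicable to**: factor-research
vnpy 4.0's vnpy.alpha.lab provides an integrated workflow covering data management, model training, signal generation, and strategy backtesting, with standardized training interfaces and visual comparison for algorithms such as Lasso, LightGBM, and MLP. zvt's ML capability currently has a single entry point, MaStockMLMachine, and lacks a standardized Lab framework; borrowing the Lab pattern would establish a standard "feature engineering → training → signal → backtest" pipeline and lower the barrier to ML experiments.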
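## `CW-QL-001` — Point-in-Time database (preventing future-data leakage)
**From**: qlib · **Applicable to**: backtesting
qlib's Point-in-Time Provider guarantees that a query at time t returns only data that was actually known at t (report publication delays and revision history are handled correctly), which removes this source of look-ahead bias from backtests. zvt currently uses the reporting period as the timestamp for financial data and lacks a "publication date" dimension, so stock selection may use financial-report data that was not yet available; introducing the PIT pattern would substantially improve backtest credibility.
## `CW-QL-002` — Recorder + Experiment management (MLflow style)
**From**: qlib · **Applicable to**: factor-research
qlib's workflow module provides Experiment/Recorder, which automatically record the hyperparameters, features, metrics, and predictions of every training run and support cross-experiment comparison and model version management. zvt currently lacks an ML experiment tracking mechanism, and each rerun overwrites the previous results; borrowing the Recorder pattern would persist the parameters and results of each factor experiment, supporting fast reproduction and version comparison.
## `CW-QL-003` — Nested Decision Framework (multi-level nested decision execution)
**From**: qlib · **Applicable to**: backtesting
qlib supports nesting a high-frequency execution layer (minute-level order splitting) inside a low-frequency decision layer (daily portfolio rebalancing); the two layers are optimized independently yet run together, enabling intraday optimal-execution algorithms (e.g., TWAP or VWAP rebalancing). zvt backtests currently assume daily-bar fills only and do not model execution algorithms; borrowing the nested framework would let zvt separate two questions: "which stocks to hold and when" and "how to build the position at minimum market impact."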
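
The point-in-time idea in `CW-QL-001` can be illustrated with a small sketch: each record carries both the period it describes and the date it became public, and a query filters on the publication date. This is not qlib's or zvt's API; the `Report` class, the `latest_known` function, and the sample figures are hypothetical, and revision handling is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Report:
    period_end: date     # reporting period the figures describe
    published_on: date   # date the figures became public
    eps: float


def latest_known(reports: list[Report], as_of: date) -> Optional[Report]:
    """Return the most recent report already published on or before as_of."""
    visible = [r for r in reports if r.published_on <= as_of]
    return max(visible, key=lambda r: r.period_end, default=None)


reports = [
    Report(date(2023, 12, 31), date(2024, 3, 28), eps=1.10),
    Report(date(2024, 3, 31), date(2024, 4, 29), eps=0.30),
]

# On 2024-04-15 the Q1 period has ended, but its report is not yet public,
# so the query falls back to the previous report.
print(latest_known(reports, date(2024, 4, 15)))
```

Keying the lookup on `period_end` alone would return the Q1 figures two weeks before they were published, which is the bias the entry above describes.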
FILE:references/components/algorithm_manager_-orchestration.md
# algorithm_manager_(orchestration) (4 classes)
## `AlgorithmManager.Run`
`algorithm_manager_(orchestration)/algorithmmanager-run.py:0`
## `AlgorithmManager.OnFrameworkData`
`algorithm_manager_(orchestration)/algorithmmanager-onframeworkdata.py:0`
## `AlgorithmManager.ProcessSynchronousEvents`
`algorithm_manager_(orchestration)/algorithmmanager-processsynchronousevent.py:0`
## `Transaction handler`
`algorithm_manager_(orchestration)/transaction-handler.py:0`
FILE:references/components/alpha_generation.md
# alpha_generation (3 classes)
## `AlphaModel.Update`
`alpha_generation/alphamodel-update.py:0`
## `AlphaModel.OnSecuritiesChanged`
`alpha_generation/alphamodel-onsecuritieschanged.py:0`
## `Alpha generation strategy`
`alpha_generation/alpha-generation-strategy.py:0`
FILE:references/components/execution.md
# execution (3 classes)
## `ExecutionModel.Execute`
`execution/executionmodel-execute.py:0`
## `ExecutionModel.OnOrderEvent`
`execution/executionmodel-onorderevent.py:0`
## `Order execution strategy`
`execution/order-execution-strategy.py:0`
FILE:references/components/order_lifecycle_management.md
# order_lifecycle_management (4 classes)
## `SecurityTransactionManager.UpdateOrder`
`order_lifecycle_management/securitytransactionmanager-updateorder.py:0`
## `VolumeShareSlippageModel.get_slippage_approximation`
`order_lifecycle_management/volumeshareslippagemodel-get-slippage-ap.py:0`
## `Slippage model`
`order_lifecycle_management/slippage-model.py:0`
## `Fee model`
`order_lifecycle_management/fee-model.py:0`
FILE:references/components/portfolio_construction.md
# portfolio_construction (3 classes)
## `PortfolioConstructionModel.CreateTargets`
`portfolio_construction/portfolioconstructionmodel-createtargets.py:0`
## `PortfolioConstructionModel.DetermineTargetPercent`
`portfolio_construction/portfolioconstructionmodel-determinetarg.py:0`
## `Weighting methodology`
`portfolio_construction/weighting-methodology.py:0`
FILE:references/components/risk_management.md
# risk_management (2 classes)
## `RiskManagementModel.ManageRisk`
`risk_management/riskmanagementmodel-managerisk.py:0`
## `Risk control mechanism`
`risk_management/risk-control-mechanism.py:0`
FILE:references/components/universe_selection.md
# universe_selection (3 classes)
## `FundamentalUniverseSelectionModel.Select`
`universe_selection/fundamentaluniverseselectionmodel-select.py:0`
## `UniverseSelectionModel.GetNextRefreshTimeUtc`
`universe_selection/universeselectionmodel-getnextrefreshtim.py:0`
## `Security selection logic`
`universe_selection/security-selection-logic.py:0`
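Actuarial loss reserving with chainladder-python: from historical claims triangles to IBNR reserves and tail-parameter fitting. Supports multiple lines of business: reinsurance, catastrophe, and general liability.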
---
name: insurance-loss-reserving
description: |-
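Actuarial loss reserving with chainladder-python: from historical claims triangles to IBNR reserves and
tail-parameter fitting. Supports multiple lines of business: reinsurance, catastrophe, and general liability.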
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-063"
compiled_at: "2026-04-22T13:00:20.366204+00:00"
capability_markets: "global"
capability_activities: "insurance-actuarial"
sop_version: "crystal-compilation-v6.1"
---
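# Insurance Loss Reserving (insurance-loss-reserving)
> Estimate IBNR reserves from historical claims triangles with the chain ladder method; it works for reinsurance, catastrophe, and general liability lines.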
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (0 total)
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-INSURANCE-001`**: Implicit numeric format assumptions without validation
- **`AP-INSURANCE-002`**: Triangle axis construction with invalid temporal ordering
- **`AP-INSURANCE-003`**: Cumulative/incremental triangle representation misuse
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-063. Evidence verify ratio = 56.5% and audit fail total = 15. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative record (source-of-truth) | Required reading when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-063` blueprint at 2026-04-22T13:00:20.366204+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
  - Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-063--chainladder-python (4)
### `AP-INSURANCE-002` — Triangle axis construction with invalid temporal ordering <sub>(high)</sub>
Development dates are created without verifying they are strictly greater than origin dates, or development lags are calculated with incorrect formulas (e.g., using wrong divisor for monthly difference). This creates logically impossible triangle cells where development <= origin, corrupting the fundamental data structure and producing wrong loss development patterns.
### `AP-INSURANCE-003` — Cumulative/incremental triangle representation misuse <sub>(high)</sub>
Link ratios are computed on incremental triangles instead of cumulative form, or cum_to_incr/incr_to_cum conversions are not properly inverse-applied. This produces link ratios near 1.0 regardless of actual claims development, leading to misleading development factors and incorrect IBNR estimates.
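
To make the cumulative/incremental distinction concrete, here is a minimal sketch in plain Python for a single origin period. It does not use the chainladder-python API; the helper names mirror the `incr_to_cum` and `cum_to_incr` conversions mentioned above but are re-implemented here for illustration, and the loss figures are hypothetical.

```python
from itertools import accumulate


def incr_to_cum(row):
    """Running total of incremental payments for one origin period."""
    return list(accumulate(row))


def cum_to_incr(row):
    """Inverse of incr_to_cum: first value, then successive differences."""
    return [row[0]] + [b - a for a, b in zip(row, row[1:])]


def link_ratios(cumulative_row):
    """Age-to-age factors: cumulative value at lag k+1 divided by lag k."""
    return [b / a for a, b in zip(cumulative_row, cumulative_row[1:])]


# Hypothetical incremental paid losses for one origin year.
incremental = [1000.0, 500.0, 250.0, 125.0]
cumulative = incr_to_cum(incremental)

# The two conversions must be exact inverses of each other.
assert cum_to_incr(cumulative) == incremental

print(link_ratios(cumulative))   # development factors, computed on cumulative data
print(link_ratios(incremental))  # misuse: ratios of increments, not development factors
```

A round-trip assertion like the one above is a cheap guard to run before any link ratio is estimated.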
### `AP-INSURANCE-004` — Including incomplete latest diagonal in development analysis <sub>(high)</sub>
Link ratio computation includes the latest diagonal which contains incomplete/in-progress development data. Without excluding this diagonal via valuation_date filtering, development factor estimation uses partial data that biases IBNR estimates. The latest diagonal must be excluded to capture true historical development patterns.
### `AP-INSURANCE-015` — Triangle grain transformation with incompatible parameters <sub>(medium)</sub>
Triangle grain() method is called without setting is_cumulative attribute, or origin grain is made finer than development grain. These produce invalid triangular data structures with misaligned periods and undefined behavior, corrupting actuarial reserving calculations.
## finance-bp-064--insurance_python (2)
### `AP-INSURANCE-005` — EIOPA calibration workflow violations <sub>(high)</sub>
Smith-Wilson calibration workflow is violated in multiple ways: calibration step is skipped before extrapolation, different alpha values are used for calibration vs extrapolation, or convergence point T uses incorrect formula. These violations produce mathematically inconsistent rate curves where observed points do not match market data and extrapolated rates violate EIOPA specifications.
### `AP-INSURANCE-006` — Missing iteration bounds causing infinite loops <sub>(high)</sub>
Root-finding algorithms like bisection for alpha calibration lack maxIter parameters. When the algorithm fails to converge (e.g., no sign change in Galfa at interval bounds), the application freezes indefinitely, causing service disruption. This is especially critical in regulatory compliance workflows where calibration must complete.
## finance-bp-064--insurance_python, finance-bp-126--lifelines (1)
### `AP-INSURANCE-007` — Invalid financial/mathematical constraints not validated <sub>(high)</sub>
Correlation coefficients outside [-1,1], non-positive-semidefinite covariance matrices, negative durations, or entry times >= duration are not validated before use. These cause Cholesky decomposition failures, imaginary values in sqrt(1-rho²), or logically impossible scenarios, producing NaN prices or corrupted at-risk calculations.
## finance-bp-065--pyliferisk (4)
### `AP-INSURANCE-008` — None values propagated to arithmetic operations <sub>(high)</sub>
Critical parameters like interest rate i are passed as None to actuarial calculations. In pyliferisk, Actuarial.__init__ with i=None causes TypeError in (1/(1+i)) and commutation arrays remain empty. Bare except clauses catch these TypeErrors and silently return 0, masking the fundamental issue and producing incorrect but seemingly valid results.
### `AP-INSURANCE-009` — Stub function implementations and duplicate definitions <sub>(high)</sub>
Critical insurance functions like deferred temporary annuities are implemented as empty stubs (only 'pass' statement) or have duplicate definitions where the second shadows the first. This causes functions to return None instead of calculated values, breaking increasing annuity and premium calculations silently in production.
### `AP-INSURANCE-010` — Dispatcher routing to undefined functions <sub>(medium)</sub>
Complex function dispatchers (like annuity()) handle many parameter combinations but call functions that do not exist (e.g., qtaaxn, qtaxn). This causes NameError at runtime when specific parameter combinations are requested, preventing deferred temporary increasing annuity calculations entirely.
### `AP-INSURANCE-014` — Actuarial convention violations in life table construction <sub>(high)</sub>
Life tables violate standard actuarial conventions: using incorrect radix (not 100000), failing to append 0 to lx array for complete extinction, or using wrong payment adjustment formula for fractional annuities. These violations scale all derived quantities (dx, ex, reserves, premiums) incorrectly.
## finance-bp-065--pyliferisk, finance-bp-064--insurance_python (1)
### `AP-INSURANCE-001` — Implicit numeric format assumptions without validation <sub>(high)</sub>
Data formats like per-mille qx values or rate-to-price conversions are applied implicitly without validation. In pyliferisk, qx values stored as per-mille (qx*1000) are used directly as probabilities yielding 1000x errors. In insurance_python, rates are converted to prices using p=(1+r)^(-M) without verifying input format. This causes material miscalculations in reserve and premium calculations.
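
One way to avoid the implicit-format problem is to make the unit an explicit argument and validate the result. The sketch below is an illustration only; `qx_to_probability` and its `unit` parameter are hypothetical names, not part of pyliferisk or insurance_python.

```python
def qx_to_probability(qx_values, unit):
    """Convert mortality rates to probabilities, with the unit stated explicitly.

    unit: "probability" (values in [0, 1]) or "per_mille" (values in [0, 1000]).
    """
    scale = {"probability": 1.0, "per_mille": 1000.0}
    if unit not in scale:
        raise ValueError(f"unknown unit {unit!r}")
    probs = [q / scale[unit] for q in qx_values]
    # Any value outside [0, 1] means the declared unit does not match the data.
    bad = [p for p in probs if not 0.0 <= p <= 1.0]
    if bad:
        raise ValueError(f"values outside [0, 1] after scaling from {unit}: {bad}")
    return probs


print(qx_to_probability([0.5, 1.2, 35.0], unit="per_mille"))
```

Passing the same per-mille values with `unit="probability"` raises immediately, instead of producing the 1000x error described above. Small per-mille values that all happen to lie below 1 would still pass the range check, so the declared unit remains the primary safeguard.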
## finance-bp-126--lifelines (3)
### `AP-INSURANCE-011` — Survival function monotonicity not enforced <sub>(high)</sub>
Non-parametric survival curve estimators do not verify that S(t) is monotonically non-increasing across timeline values. Violations produce mathematically invalid survival curves where probability of survival increases over time, or S(0) is not initialized to 1.0, breaking interpretation as probability distribution.
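
A check of this kind can be sketched as a small validator run on an estimator's output. The function below is a hypothetical helper, not a lifelines API; it assumes the survival values are ordered along the timeline, starting at t = 0.

```python
def validate_survival_curve(survival, tol=1e-12):
    """Check S(0) == 1 and that S(t) stays in [0, 1] and never increases."""
    if not survival:
        raise ValueError("empty survival curve")
    if abs(survival[0] - 1.0) > tol:
        raise ValueError(f"S(0) must be 1.0, got {survival[0]}")
    # Compare each value with its predecessor along the timeline.
    for i, (prev, cur) in enumerate(zip(survival, survival[1:]), start=1):
        if cur > prev + tol:
            raise ValueError(f"S(t) increases at index {i}: {prev} -> {cur}")
        if not 0.0 <= cur <= 1.0:
            raise ValueError(f"S(t) outside [0, 1] at index {i}: {cur}")
    return True


validate_survival_curve([1.0, 0.9, 0.9, 0.5])  # passes: flat steps are allowed
```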
### `AP-INSURANCE-012` — Input data corruption via inplace operations <sub>(medium)</sub>
User-provided DataFrames are modified inplace using .pop() operations without first creating a copy. This permanently corrupts user data by removing columns, violating data isolation principles and potentially affecting downstream analysis on the original data.
### `AP-INSURANCE-013` — Interval censoring bounds not validated <sub>(medium)</sub>
Lower and upper bounds for interval-censored data are not validated, allowing upper_bound < lower_bound. Invalid interval bounds produce undefined survival probability calculations, potentially negative time intervals in the likelihood function, and corrupt NPMLE estimation.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-063--chainladder-python
**Scan date**: 2026-04-22
**Stats**: {'total_files': 6, 'total_classes': 37, 'total_functions': 0, 'total_stages': 6}
## Modules (6)
- [triangle_data_structure](components/triangle_data_structure.md): 6 classes
- [development_pattern_estimation](components/development_pattern_estimation.md): 6 classes
- [reserving_methods](components/reserving_methods.md): 8 classes
- [tail_factor_estimation](components/tail_factor_estimation.md): 6 classes
- [triangle_adjustments](components/triangle_adjustments.md): 6 classes
- [workflow_and_ensemble](components/workflow_and_ensemble.md): 5 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 103
fatal_constraints_count: 76
non_fatal_constraints_count: 199
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
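
As an illustration of `SL-03`, the sketch below splits an ID of the form `entity_type_exchange_code` into its three parts. `parse_entity_id` is a hypothetical helper written for this example, not a zvt function, and it assumes that only the code part may contain further underscores.

```python
from typing import NamedTuple


class EntityId(NamedTuple):
    entity_type: str
    exchange: str
    code: str


def parse_entity_id(entity_id: str) -> EntityId:
    """Split 'entity_type_exchange_code' into its three parts, per SL-03."""
    # Assumption: split at most twice so any underscore in the code is kept.
    parts = entity_id.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"SL-03: expected entity_type_exchange_code, got {entity_id!r}")
    return EntityId(*parts)


print(parse_entity_id("stock_sh_600000"))
```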
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: run `python3 -m pip install zvt`, then run `python3 -m zvt.init_dirs` to initialize the data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: run the recorder first: `python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000` (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: run `python3 -m zvt.init_dirs`
- **`PC-04`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: check directory permissions with `chmod u+w ~/.zvt`, or set the `ZVT_HOME` environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **0**
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-INSURANCE-001` — Validate input data format and type before computation
**From**: finance-bp-063--chainladder-python, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Both triangle construction and survival analysis require strict input validation: numeric types for triangle columns, valid event indicators (0/1), no NaN/Inf values, and correct temporal ordering. This prevents downstream numerical failures and ensures mathematical validity of actuarial computations.
## `CW-INSURANCE-002` — Initialize probability distributions to boundary values
**From**: finance-bp-065--pyliferisk, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Survival probability S(0) must equal 1.0 and life table lx must start at standard radix (100000) and end at 0. Properly initializing boundary values ensures actuarial quantities have correct scale and interpretation as probability distributions.
## `CW-INSURANCE-003` — Include iteration limits in numerical root-finding
**From**: finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Bisection and other root-finding algorithms must include maxIter parameters and verify interval contains valid root (sign change). This prevents infinite loops when calibration fails, ensuring service availability in regulatory compliance workflows.
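
A bounded bisection along these lines might look like the sketch below. It is a generic illustration, not the calibration routine from insurance_python; the function name, the tolerance, and the default `max_iter` are assumptions made for this example.

```python
def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find x in [lo, hi] with f(x) close to 0, failing loudly instead of looping forever."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on [lo, hi]; bisection cannot bracket a root")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half-interval whose endpoints still have opposite signs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    raise RuntimeError(f"bisection did not converge within {max_iter} iterations")


root = bisect_root(lambda x: x * x - 2.0, 0.0, 2.0)
print(root)
```

Both failure modes surface as exceptions the caller can handle: a missing sign change is rejected before the loop starts, and a slow convergence stops after `max_iter` steps.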
## `CW-INSURANCE-004` — Avoid bare except clauses that mask TypeErrors
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Bare except clauses that catch all exceptions including TypeError and return default values (0 or None) mask fundamental parameter errors. Use specific exception handling and validate inputs upfront to fail fast with clear error messages.
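
The contrast can be shown with the discount factor 1/(1+i). The two functions below are hypothetical examples written for this note, not code from pyliferisk: the first reproduces the masking behavior, the second validates the input up front.

```python
def discount_factor_masked(i):
    """Anti-pattern: a bare except turns a missing rate into a plausible-looking 0."""
    try:
        return 1 / (1 + i)
    except:  # noqa: E722 - shown only to illustrate the problem
        return 0


def discount_factor(i):
    """Validate up front so a missing or invalid rate fails with a clear message."""
    if i is None:
        raise ValueError("interest rate i is required, got None")
    if i <= -1:
        raise ValueError(f"interest rate must be greater than -1, got {i}")
    return 1 / (1 + i)


print(discount_factor_masked(None))  # silently returns 0
print(discount_factor(0.25))
```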
## `CW-INSURANCE-005` — Preserve standard radix and extinction conventions in life tables
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Life insurance calculations rely on industry-standard conventions: radix of 100000 at age 0 and lx[-1]=0 for complete extinction. Deviating from these conventions scales all derived quantities incorrectly and breaks interoperability with other actuarial systems.
## `CW-INSURANCE-006` — Ensure workflow step ordering and parameter consistency
**From**: finance-bp-063--chainladder-python, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Multi-step algorithms (triangle transformations, Smith-Wilson calibration) require strict step ordering: compute calibration vector before extrapolation, use consistent alpha values throughout. Violating workflow order produces undefined or mathematically inconsistent results.
## `CW-INSURANCE-007` — Validate probability bounds for confidence intervals
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Confidence interval bounds must be constrained to [0,1] for probability estimates. Use fillna and formula constraints to ensure CI bounds remain valid probability ranges, preventing invalid statistical inference from actuarial models.
## `CW-INSURANCE-008` — Validate matrix properties before decomposition
**From**: finance-bp-065--pyliferisk, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Positive semi-definite matrices must be verified before Cholesky decomposition. Invalid matrices cause math domain errors or invalid correlated samples. Similarly, correlation coefficients must be validated to [-1,1] bounds before use in sqrt(1-rho²).
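
The sketch below shows one way to surface both failures as explicit errors, using only the standard library. It is an illustration under stated assumptions: the input matrix is assumed symmetric, and this routine requires strict positive definiteness, so a singular positive semi-definite matrix is also rejected. The function names are hypothetical.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor; raises ValueError if not positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                # A non-positive pivot would put a negative number under sqrt.
                if pivot <= 0.0:
                    raise ValueError(
                        f"matrix is not positive definite (pivot {pivot} at row {i})"
                    )
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def validate_correlation(rho):
    """Reject correlations that would make sqrt(1 - rho**2) undefined."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError(f"correlation must lie in [-1, 1], got {rho}")
    return rho


factor = cholesky([[1.0, 0.5], [0.5, 1.0]])
print(factor)
```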
## `CW-INSURANCE-009` — Make defensive copies of input DataFrames
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
User-provided DataFrames should be copied before inplace modifications (.pop(), .drop()). This preserves user data integrity and prevents side effects from leaking into caller code, maintaining data isolation principles.
## `CW-INSURANCE-010` — Exclude incomplete diagonals from historical analysis
**From**: finance-bp-063--chainladder-python · **Applicable to**: insurance-actuarial
The latest diagonal in claims triangles contains incomplete development data from the current period. Excluding this diagonal via valuation_date filtering ensures development factors capture only completed, reliable historical patterns for unbiased IBNR estimation.
FILE:references/components/development_pattern_estimation.md
# development_pattern_estimation (6 classes)
## `Development.fit_transform`
`development_pattern_estimation/development-fit-transform.py:0`
## `WeightedRegression.fit`
`development_pattern_estimation/weightedregression-fit.py:0`
## `TweedieGLM.fit`
`development_pattern_estimation/tweedieglm-fit.py:0`
## `average`
`development_pattern_estimation/average.py:0`
## `sigma_interpolation`
`development_pattern_estimation/sigma-interpolation.py:0`
## `groupby`
`development_pattern_estimation/groupby.py:0`
FILE:references/components/reserving_methods.md
# reserving_methods (8 classes)
## `Chainladder.fit_predict`
`reserving_methods/chainladder-fit-predict.py:0`
## `MackChainladder.fit_predict`
`reserving_methods/mackchainladder-fit-predict.py:0`
## `BornhuetterFerguson.fit_predict`
`reserving_methods/bornhuetterferguson-fit-predict.py:0`
## `CapeCod.fit_predict`
`reserving_methods/capecod-fit-predict.py:0`
## `Benktander.fit_predict`
`reserving_methods/benktander-fit-predict.py:0`
## `n_iters`
`reserving_methods/n-iters.py:0`
## `apriori`
`reserving_methods/apriori.py:0`
## `trend`
`reserving_methods/trend.py:0`
FILE:references/components/tail_factor_estimation.md
# tail_factor_estimation (6 classes)
## `TailCurve.fit_transform`
`tail_factor_estimation/tailcurve-fit-transform.py:0`
## `TailBondy.fit_transform`
`tail_factor_estimation/tailbondy-fit-transform.py:0`
## `TailConstant.fit_transform`
`tail_factor_estimation/tailconstant-fit-transform.py:0`
## `curve`
`tail_factor_estimation/curve.py:0`
## `attachment_age`
`tail_factor_estimation/attachment-age.py:0`
## `decay`
`tail_factor_estimation/decay.py:0`
FILE:references/components/triangle_adjustments.md
# triangle_adjustments (6 classes)
## `Trend.fit_transform`
`triangle_adjustments/trend-fit-transform.py:0`
## `BootstrapODPSample.fit_transform`
`triangle_adjustments/bootstrapodpsample-fit-transform.py:0`
## `ParallelogramOLF.fit_transform`
`triangle_adjustments/parallelogramolf-fit-transform.py:0`
## `BerquistSherman.fit_transform`
`triangle_adjustments/berquistsherman-fit-transform.py:0`
## `hat_adj`
`triangle_adjustments/hat-adj.py:0`
## `axis`
`triangle_adjustments/axis.py:0`
FILE:references/components/triangle_data_structure.md
# triangle_data_structure (6 classes)
## `Triangle.__init__`
`triangle_data_structure/triangle-init.py:0`
## `Triangle.to_frame`
`triangle_data_structure/triangle-to-frame.py:0`
## `TriangleSlicer.loc`
`triangle_data_structure/triangleslicer-loc.py:0`
## `TriangleSlicer.iloc`
`triangle_data_structure/triangleslicer-iloc.py:0`
## `array_backend`
`triangle_data_structure/array-backend.py:0`
## `array_backend_priority`
`triangle_data_structure/array-backend-priority.py:0`
FILE:references/components/workflow_and_ensemble.md
# workflow_and_ensemble (5 classes)
## `Pipeline.fit_transform`
`workflow_and_ensemble/pipeline-fit-transform.py:0`
## `GridSearch.fit`
`workflow_and_ensemble/gridsearch-fit.py:0`
## `VotingChainladder.fit_predict`
`workflow_and_ensemble/votingchainladder-fit-predict.py:0`
## `scoring`
`workflow_and_ensemble/scoring.py:0`
## `weights`
`workflow_and_ensemble/weights.py:0`
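Decompose interest-rate time series and run statistical inference on them using Singular Spectrum Analysis and the stationary bootstrap, build NSS curve models, and calibrate interest-rate derivative parameters.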
---
name: insurance-actuarial-python
description: |-
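Decompose interest-rate time series and run statistical inference on them using Singular Spectrum Analysis and the stationary bootstrap, build NSS curve models, and calibrate interest-rate derivative parameters.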
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-064"
compiled_at: "2026-04-22T13:00:20.990803+00:00"
capability_markets: "global"
capability_activities: "insurance-actuarial"
sop_version: "crystal-compilation-v6.1"
---
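# Insurance Actuarial Modeling (insurance-actuarial-python)
> Decompose interest-rate time series and run statistical inference on them using Singular Spectrum Analysis and the stationary bootstrap, build NSS curve models, and calibrate interest-rate derivative parameters.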
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (2 total)
### Singular Spectrum Analysis Time Series Decomposition (`UC-101`)
Decomposes time series data into interpretable components (trend, seasonality, noise) using Singular Spectrum Analysis to identify underlying patterns
**Triggers**: SSA, singular spectrum analysis, time series decomposition
### Stationary Bootstrap for Interest Rate Swap Inference (`UC-102`)
Applies stationary bootstrap resampling method to Italian swap rate data for statistical inference, enabling confidence interval estimation and hypothesis testing on interest rate derivatives.
**Triggers**: stationary bootstrap, swap rates, resampling
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-INSURANCE-001`**: Implicit numeric format assumptions without validation
- **`AP-INSURANCE-002`**: Triangle axis construction with invalid temporal ordering
- **`AP-INSURANCE-003`**: Cumulative/incremental triangle representation misuse
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-064. Evidence verify ratio = 11.6% and audit fail total = 40. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
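| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative source (source-of-truth) | Required reading whenever behavior or a decision is disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When a complete example is needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up an API |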
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-064` blueprint at 2026-04-22T13:00:20.990803+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['Stationary Bootstrap for Interest Rate Swap Inference', 'Singular Spectrum Analysis Time Series Decomposition', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder', 'Institutional fund holdings tracker via joinquant_fund_runner pattern']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-063--chainladder-python (4)
### `AP-INSURANCE-002` — Triangle axis construction with invalid temporal ordering <sub>(high)</sub>
Development dates are created without verifying they are strictly greater than origin dates, or development lags are calculated with incorrect formulas (e.g., using wrong divisor for monthly difference). This creates logically impossible triangle cells where development <= origin, corrupting the fundamental data structure and producing wrong loss development patterns.
### `AP-INSURANCE-003` — Cumulative/incremental triangle representation misuse <sub>(high)</sub>
Link ratios are computed on incremental triangles instead of cumulative form, or cum_to_incr/incr_to_cum conversions are not properly inverse-applied. This produces link ratios near 1.0 regardless of actual claims development, leading to misleading development factors and incorrect IBNR estimates.
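The difference is easy to see on a single origin row. The sketch below uses plain lists and hypothetical helper names (`incr_to_cum`, `cum_to_incr`, `link_ratios`); it does not use the chainladder-python `Triangle` API.

```python
from itertools import accumulate


def incr_to_cum(incr: list[float]) -> list[float]:
    """Running total of incremental amounts along the development axis."""
    return list(accumulate(incr))


def cum_to_incr(cum: list[float]) -> list[float]:
    """Inverse of incr_to_cum: first value, then successive differences."""
    if not cum:
        return []
    return [cum[0]] + [b - a for a, b in zip(cum, cum[1:])]


def link_ratios(cum: list[float]) -> list[float]:
    """Age-to-age factors; only meaningful on cumulative amounts."""
    return [b / a for a, b in zip(cum, cum[1:])]


incremental = [100.0, 50.0, 25.0]
cumulative = incr_to_cum(incremental)   # [100.0, 150.0, 175.0]
factors = link_ratios(cumulative)       # first factor is 150 / 100 = 1.5
```

Calling `link_ratios(incremental)` here would return ratios of 0.5, which describe payment decay rather than claims development. Asserting that `cum_to_incr(incr_to_cum(x)) == x` is a cheap round-trip check that the two conversions are true inverses.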
### `AP-INSURANCE-004` — Including incomplete latest diagonal in development analysis <sub>(high)</sub>
Link ratio computation includes the latest diagonal which contains incomplete/in-progress development data. Without excluding this diagonal via valuation_date filtering, development factor estimation uses partial data that biases IBNR estimates. The latest diagonal must be excluded to capture true historical development patterns.
### `AP-INSURANCE-015` — Triangle grain transformation with incompatible parameters <sub>(medium)</sub>
Triangle grain() method is called without setting is_cumulative attribute, or origin grain is made finer than development grain. These produce invalid triangular data structures with misaligned periods and undefined behavior, corrupting actuarial reserving calculations.
## finance-bp-064--insurance_python (2)
### `AP-INSURANCE-005` — EIOPA calibration workflow violations <sub>(high)</sub>
Smith-Wilson calibration workflow is violated in multiple ways: calibration step is skipped before extrapolation, different alpha values are used for calibration vs extrapolation, or convergence point T uses incorrect formula. These violations produce mathematically inconsistent rate curves where observed points do not match market data and extrapolated rates violate EIOPA specifications.
### `AP-INSURANCE-006` — Missing iteration bounds causing infinite loops <sub>(high)</sub>
Root-finding algorithms like bisection for alpha calibration lack maxIter parameters. When the algorithm fails to converge (e.g., no sign change in Galfa at interval bounds), the application freezes indefinitely, causing service disruption. This is especially critical in regulatory compliance workflows where calibration must complete.
## finance-bp-064--insurance_python, finance-bp-126--lifelines (1)
### `AP-INSURANCE-007` — Invalid financial/mathematical constraints not validated <sub>(high)</sub>
Correlation coefficients outside [-1,1], non-positive-semidefinite covariance matrices, negative durations, or entry times >= duration are not validated before use. These cause Cholesky decomposition failures, imaginary values in sqrt(1-rho²), or logically impossible scenarios, producing NaN prices or corrupted at-risk calculations.
## finance-bp-065--pyliferisk (4)
### `AP-INSURANCE-008` — None values propagated to arithmetic operations <sub>(high)</sub>
Critical parameters like interest rate i are passed as None to actuarial calculations. In pyliferisk, Actuarial.__init__ with i=None causes TypeError in (1/(1+i)) and commutation arrays remain empty. Bare except clauses catch these TypeErrors and silently return 0, masking the fundamental issue and producing incorrect but seemingly valid results.
### `AP-INSURANCE-009` — Stub function implementations and duplicate definitions <sub>(high)</sub>
Critical insurance functions like deferred temporary annuities are implemented as empty stubs (only 'pass' statement) or have duplicate definitions where the second shadows the first. This causes functions to return None instead of calculated values, breaking increasing annuity and premium calculations silently in production.
### `AP-INSURANCE-010` — Dispatcher routing to undefined functions <sub>(medium)</sub>
Complex function dispatchers (like annuity()) handle many parameter combinations but call functions that do not exist (e.g., qtaaxn, qtaxn). This causes NameError at runtime when specific parameter combinations are requested, preventing deferred temporary increasing annuity calculations entirely.
### `AP-INSURANCE-014` — Actuarial convention violations in life table construction <sub>(high)</sub>
Life tables violate standard actuarial conventions: using incorrect radix (not 100000), failing to append 0 to lx array for complete extinction, or using wrong payment adjustment formula for fractional annuities. These violations scale all derived quantities (dx, ex, reserves, premiums) incorrectly.
## finance-bp-065--pyliferisk, finance-bp-064--insurance_python (1)
### `AP-INSURANCE-001` — Implicit numeric format assumptions without validation <sub>(high)</sub>
Data formats like per-mille qx values or rate-to-price conversions are applied implicitly without validation. In pyliferisk, qx values stored as per-mille (qx*1000) are used directly as probabilities yielding 1000x errors. In insurance_python, rates are converted to prices using p=(1+r)^(-M) without verifying input format. This causes material miscalculations in reserve and premium calculations.
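One way to make the format explicit is to convert at the boundary and reject values that cannot belong to the declared format. The sketch below is illustrative: both function names are hypothetical, and the `max_plausible_rate` threshold used to catch rates passed as percentages is an assumption, not a rule from either project.

```python
def qx_from_per_mille(values: list[float]) -> list[float]:
    """Convert per-mille mortality rates (0..1000) to probabilities (0..1)."""
    for v in values:
        if not 0.0 <= v <= 1000.0:
            raise ValueError(f"per-mille qx must lie in [0, 1000], got {v!r}")
    return [v / 1000.0 for v in values]


def price_from_rate(r: float, maturity: float, max_plausible_rate: float = 1.0) -> float:
    """Zero-coupon price p = (1 + r) ** (-M) for a rate given as a decimal fraction."""
    if r <= -1.0:
        raise ValueError("rate must be greater than -1")
    # Assumption: a rate above 100% is far more likely a percentage passed by mistake.
    if r > max_plausible_rate:
        raise ValueError(f"rate {r!r} looks like a percentage; pass a decimal fraction")
    return (1.0 + r) ** (-maturity)


qx = qx_from_per_mille([5.0, 12.5])
price = price_from_rate(0.05, 10)
```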
## finance-bp-126--lifelines (3)
### `AP-INSURANCE-011` — Survival function monotonicity not enforced <sub>(high)</sub>
Non-parametric survival curve estimators do not verify that S(t) is monotonically non-increasing across timeline values. Violations produce mathematically invalid survival curves where probability of survival increases over time, or S(0) is not initialized to 1.0, breaking interpretation as probability distribution.
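A check along these lines can run on any estimated curve before it is reported. This is a sketch under stated assumptions: `validate_survival_curve` is a hypothetical helper, the curve is a plain list ordered by time starting at t = 0, and the tolerance `tol` for floating-point noise is an arbitrary choice. It is not part of lifelines.

```python
def validate_survival_curve(s: list[float], tol: float = 1e-12) -> None:
    """Raise ValueError unless s starts at 1.0, stays in [0, 1] and never increases."""
    if not s:
        raise ValueError("survival curve must not be empty")
    if abs(s[0] - 1.0) > tol:
        raise ValueError(f"S(0) must equal 1.0, got {s[0]!r}")
    for k in range(1, len(s)):
        if not 0.0 <= s[k] <= 1.0:
            raise ValueError(f"S at index {k} is outside [0, 1]: {s[k]!r}")
        if s[k] > s[k - 1] + tol:
            raise ValueError(f"S increases at index {k}: {s[k - 1]!r} -> {s[k]!r}")


validate_survival_curve([1.0, 0.9, 0.9, 0.5])  # passes: flat steps are allowed
```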
### `AP-INSURANCE-012` — Input data corruption via inplace operations <sub>(medium)</sub>
User-provided DataFrames are modified inplace using .pop() operations without first creating a copy. This permanently corrupts user data by removing columns, violating data isolation principles and potentially affecting downstream analysis on the original data.
### `AP-INSURANCE-013` — Interval censoring bounds not validated <sub>(medium)</sub>
Lower and upper bounds for interval-censored data are not validated, allowing upper_bound < lower_bound. Invalid interval bounds produce undefined survival probability calculations, potentially negative time intervals in the likelihood function, and corrupt NPMLE estimation.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-064--insurance_python
**Scan date**: 2026-04-22
**Stats**: {'total_files': 6, 'total_classes': 20, 'total_functions': 0, 'total_stages': 6}
## Modules (6)
- [yield_curve_fitting](components/yield_curve_fitting.md): 5 classes
- [alpha_parameter_calibration](components/alpha_parameter_calibration.md): 2 classes
- [interest_rate_simulation](components/interest_rate_simulation.md): 4 classes
- [option_pricing](components/option_pricing.md): 3 classes
- [time_series_analysis_(ssa)](components/time_series_analysis_-ssa.md): 4 classes
- [stationary_bootstrap_resampling](components/stationary_bootstrap_resampling.md): 2 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 107
fatal_constraints_count: 35
non_fatal_constraints_count: 125
use_cases_count: 2
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **2**
## `KUC-101`
**Source**: `singular_spectrum_analysis/SSA_Example.ipynb`
Decomposes time series data into interpretable components (trend, seasonality, noise) using Singular Spectrum Analysis to identify underlying patterns in financial data.
## `KUC-102`
**Source**: `stationary_bootstrap/Stationary Bootstrap Italian Swap Example.ipynb`
Applies stationary bootstrap resampling method to Italian swap rate data for statistical inference, enabling confidence interval estimation and hypothesis testing on interest rate derivatives.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-INSURANCE-001` — Validate input data format and type before computation
**From**: finance-bp-063--chainladder-python, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Both triangle construction and survival analysis require strict input validation: numeric types for triangle columns, valid event indicators (0/1), no NaN/Inf values, and correct temporal ordering. This prevents downstream numerical failures and ensures mathematical validity of actuarial computations.
## `CW-INSURANCE-002` — Initialize probability distributions to boundary values
**From**: finance-bp-065--pyliferisk, finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Survival probability S(0) must equal 1.0 and life table lx must start at standard radix (100000) and end at 0. Properly initializing boundary values ensures actuarial quantities have correct scale and interpretation as probability distributions.
## `CW-INSURANCE-003` — Include iteration limits in numerical root-finding
**From**: finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Bisection and other root-finding algorithms must include maxIter parameters and verify interval contains valid root (sign change). This prevents infinite loops when calibration fails, ensuring service availability in regulatory compliance workflows.
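A bounded bisection might look like the sketch below. It is a generic illustration, not the insurance_python implementation: the name `bisect_root`, the default tolerance, and the default `max_iter` are assumptions, and the caller is expected to pass `lo < hi`.

```python
from typing import Callable


def bisect_root(
    f: Callable[[float], float],
    lo: float,
    hi: float,
    tol: float = 1e-10,
    max_iter: int = 200,
) -> float:
    """Find a root of f on [lo, hi] by bisection, with a hard iteration cap."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    # Without a sign change the interval does not bracket a root.
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change on [lo, hi]; root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or 0.5 * (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    raise RuntimeError(f"bisection did not converge within {max_iter} iterations")


root = bisect_root(lambda x: x * x - 2.0, 0.0, 2.0)
```

Both failure modes surface as exceptions, so a calibration job can report the problem and stop instead of hanging.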
## `CW-INSURANCE-004` — Avoid bare except clauses that mask TypeErrors
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Bare except clauses that catch all exceptions including TypeError and return default values (0 or None) mask fundamental parameter errors. Use specific exception handling and validate inputs upfront to fail fast with clear error messages.
## `CW-INSURANCE-005` — Preserve standard radix and extinction conventions in life tables
**From**: finance-bp-065--pyliferisk · **Applicable to**: insurance-actuarial
Life insurance calculations rely on industry-standard conventions: radix of 100000 at age 0 and lx[-1]=0 for complete extinction. Deviating from these conventions scales all derived quantities incorrectly and breaks interoperability with other actuarial systems.
## `CW-INSURANCE-006` — Ensure workflow step ordering and parameter consistency
**From**: finance-bp-063--chainladder-python, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Multi-step algorithms (triangle transformations, Smith-Wilson calibration) require strict step ordering: compute calibration vector before extrapolation, use consistent alpha values throughout. Violating workflow order produces undefined or mathematically inconsistent results.
## `CW-INSURANCE-007` — Validate probability bounds for confidence intervals
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
Confidence interval bounds must be constrained to [0,1] for probability estimates. Use fillna and formula constraints to ensure CI bounds remain valid probability ranges, preventing invalid statistical inference from actuarial models.
## `CW-INSURANCE-008` — Validate matrix properties before decomposition
**From**: finance-bp-065--pyliferisk, finance-bp-064--insurance_python · **Applicable to**: insurance-actuarial
Verify that a covariance or correlation matrix is positive semi-definite before attempting Cholesky decomposition. Invalid matrices cause math domain errors or invalid correlated samples. Likewise, validate that every correlation coefficient lies in [-1,1] before using it in sqrt(1-rho²).
## `CW-INSURANCE-009` — Make defensive copies of input DataFrames
**From**: finance-bp-126--lifelines · **Applicable to**: insurance-actuarial
User-provided DataFrames should be copied before inplace modifications (.pop(), .drop()). This preserves user data integrity and prevents side effects from leaking into caller code, maintaining data isolation principles.
## `CW-INSURANCE-010` — Exclude incomplete diagonals from historical analysis
**From**: finance-bp-063--chainladder-python · **Applicable to**: insurance-actuarial
The latest diagonal in claims triangles contains incomplete development data from the current period. Excluding this diagonal via valuation_date filtering ensures development factors capture only completed, reliable historical patterns for unbiased IBNR estimation.
FILE:references/components/alpha_parameter_calibration.md
# alpha_parameter_calibration (2 classes)
## `BisectionAlpha.find_alpha`
`alpha_parameter_calibration/bisectionalpha-find-alpha.py:0`
## `convergence_point_formula`
`alpha_parameter_calibration/convergence-point-formula.py:0`
FILE:references/components/interest_rate_simulation.md
# interest_rate_simulation (4 classes)
## `BrownianMotion.simulate`
`interest_rate_simulation/brownianmotion-simulate.py:0`
## `ssaBasic.decompose`
`interest_rate_simulation/ssabasic-decompose.py:0`
## `random_number_generator`
`interest_rate_simulation/random-number-generator.py:0`
## `svd_implementation`
`interest_rate_simulation/svd-implementation.py:0`
FILE:references/components/option_pricing.md
# option_pricing (3 classes)
## `Swaption.price`
`option_pricing/swaption-price.py:0`
## `ZeroCouponBond.price_Vasicek_Two_Factor`
`option_pricing/zerocouponbond-price-vasicek-two-factor.py:0`
## `integration_method`
`option_pricing/integration-method.py:0`
FILE:references/components/stationary_bootstrap_resampling.md
# stationary_bootstrap_resampling (2 classes)
## `N/A`
`stationary_bootstrap_resampling/n-a.py:0`
## `block_length_algorithm`
`stationary_bootstrap_resampling/block-length-algorithm.py:0`
FILE:references/components/time_series_analysis_-ssa.md
# time_series_analysis_(ssa) (4 classes)
## `ssaBasic.fit`
`time_series_analysis_(ssa)/ssabasic-fit.py:0`
## `ssaBasic.forecast`
`time_series_analysis_(ssa)/ssabasic-forecast.py:0`
## `ssaBasic.reconstruct`
`time_series_analysis_(ssa)/ssabasic-reconstruct.py:0`
## `forecast_method`
`time_series_analysis_(ssa)/forecast-method.py:0`
FILE:references/components/yield_curve_fitting.md
# yield_curve_fitting (5 classes)
## `NSSMinimize.compute`
`yield_curve_fitting/nssminimize-compute.py:0`
## `NSSGoodFit.objective`
`yield_curve_fitting/nssgoodfit-objective.py:0`
## `SWCalibrate`
`yield_curve_fitting/swcalibrate.py:0`
## `SWExtrapolate`
`yield_curve_fitting/swextrapolate.py:0`
## `root_finding_method`
`yield_curve_fitting/root-finding-method.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-064-v5.3
version: v6.1
blueprint_id: finance-bp-064
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:20.990803+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- global
activities:
- insurance-actuarial
upgraded_from: finance-bp-064-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:12.931688+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-064--insurance_python/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-064--insurance_python/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
anti_patterns:
- id: AP-INSURANCE-001
title: Implicit numeric format assumptions without validation
description: Data formats like per-mille qx values or rate-to-price conversions are applied implicitly without validation.
In pyliferisk, qx values stored as per-mille (qx*1000) are used directly as probabilities yielding 1000x errors. In insurance_python,
rates are converted to prices using p=(1+r)^(-M) without verifying input format. This causes material miscalculations
in reserve and premium calculations.
project_source: finance-bp-065--pyliferisk, finance-bp-064--insurance_python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-002
title: Triangle axis construction with invalid temporal ordering
description: Development dates are created without verifying they are strictly greater than origin dates, or development
lags are calculated with incorrect formulas (e.g., using wrong divisor for monthly difference). This creates logically
impossible triangle cells where development <= origin, corrupting the fundamental data structure and producing wrong loss
development patterns.
project_source: finance-bp-063--chainladder-python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-003
title: Cumulative/incremental triangle representation misuse
description: Link ratios are computed on incremental triangles instead of cumulative form, or cum_to_incr/incr_to_cum conversions
are not properly inverse-applied. This produces link ratios near 1.0 regardless of actual claims development, leading
to misleading development factors and incorrect IBNR estimates.
project_source: finance-bp-063--chainladder-python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-004
title: Including incomplete latest diagonal in development analysis
description: Link ratio computation includes the latest diagonal which contains incomplete/in-progress development data.
Without excluding this diagonal via valuation_date filtering, development factor estimation uses partial data that biases
IBNR estimates. The latest diagonal must be excluded to capture true historical development patterns.
project_source: finance-bp-063--chainladder-python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-005
title: EIOPA calibration workflow violations
description: 'Smith-Wilson calibration workflow is violated in multiple ways: calibration step is skipped before extrapolation,
different alpha values are used for calibration vs extrapolation, or convergence point T uses incorrect formula. These
violations produce mathematically inconsistent rate curves where observed points do not match market data and extrapolated
rates violate EIOPA specifications.'
project_source: finance-bp-064--insurance_python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-006
title: Missing iteration bounds causing infinite loops
description: Root-finding algorithms like bisection for alpha calibration lack maxIter parameters. When the algorithm fails
to converge (e.g., no sign change in Galfa at interval bounds), the application freezes indefinitely, causing service
disruption. This is especially critical in regulatory compliance workflows where calibration must complete.
project_source: finance-bp-064--insurance_python
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-007
title: Invalid financial/mathematical constraints not validated
description: Correlation coefficients outside [-1,1], non-positive-semidefinite covariance matrices, negative durations,
or entry times >= duration are not validated before use. These cause Cholesky decomposition failures, imaginary values
in sqrt(1-rho²), or logically impossible scenarios, producing NaN prices or corrupted at-risk calculations.
project_source: finance-bp-064--insurance_python, finance-bp-126--lifelines
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-008
title: None values propagated to arithmetic operations
description: Critical parameters like interest rate i are passed as None to actuarial calculations. In pyliferisk, Actuarial.__init__
with i=None causes TypeError in (1/(1+i)) and commutation arrays remain empty. Bare except clauses catch these TypeErrors
and silently return 0, masking the fundamental issue and producing incorrect but seemingly valid results.
project_source: finance-bp-065--pyliferisk
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-009
title: Stub function implementations and duplicate definitions
description: Critical insurance functions like deferred temporary annuities are implemented as empty stubs (only 'pass'
statement) or have duplicate definitions where the second shadows the first. This causes functions to return None instead
of calculated values, breaking increasing annuity and premium calculations silently in production.
project_source: finance-bp-065--pyliferisk
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-010
title: Dispatcher routing to undefined functions
description: Complex function dispatchers (like annuity()) handle many parameter combinations but call functions that do
not exist (e.g., qtaaxn, qtaxn). This causes NameError at runtime when specific parameter combinations are requested,
preventing deferred temporary increasing annuity calculations entirely.
project_source: finance-bp-065--pyliferisk
severity: medium
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-011
title: Survival function monotonicity not enforced
description: Non-parametric survival curve estimators do not verify that S(t) is monotonically non-increasing across timeline
values, or that S(0) is initialized to 1.0. Violations produce mathematically invalid survival curves in which the probability
of survival increases over time, breaking their interpretation as a probability distribution.
project_source: finance-bp-126--lifelines
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-012
title: Input data corruption via inplace operations
description: User-provided DataFrames are modified inplace using .pop() operations without first creating a copy. This permanently
corrupts user data by removing columns, violating data isolation principles and potentially affecting downstream analysis
on the original data.
project_source: finance-bp-126--lifelines
severity: medium
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-013
title: Interval censoring bounds not validated
description: Lower and upper bounds for interval-censored data are not validated, allowing upper_bound < lower_bound. Invalid
interval bounds produce undefined survival probability calculations, potentially negative time intervals in the likelihood
function, and corrupt NPMLE estimation.
project_source: finance-bp-126--lifelines
severity: medium
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-014
title: Actuarial convention violations in life table construction
description: 'Life tables violate standard actuarial conventions: using incorrect radix (not 100000), failing to append
0 to lx array for complete extinction, or using wrong payment adjustment formula for fractional annuities. These violations
scale all derived quantities (dx, ex, reserves, premiums) incorrectly.'
project_source: finance-bp-065--pyliferisk
severity: high
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
- id: AP-INSURANCE-015
title: Triangle grain transformation with incompatible parameters
description: Triangle grain() method is called without setting is_cumulative attribute, or origin grain is made finer than
development grain. These produce invalid triangular data structures with misaligned periods and undefined behavior, corrupting
actuarial reserving calculations.
project_source: finance-bp-063--chainladder-python
severity: medium
applicable_to_tags:
markets:
- global
activities:
- insurance-actuarial
_source_file: anti-patterns/insurance.yaml
cross_project_wisdom:
- wisdom_id: CW-INSURANCE-001
source_project: finance-bp-063--chainladder-python, finance-bp-126--lifelines
pattern_name: Validate input data format and type before computation
description: 'Both triangle construction and survival analysis require strict input validation: numeric types for triangle
columns, valid event indicators (0/1), no NaN/Inf values, and correct temporal ordering. This prevents downstream numerical
failures and ensures mathematical validity of actuarial computations.'
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-002
source_project: finance-bp-065--pyliferisk, finance-bp-126--lifelines
pattern_name: Initialize probability distributions to boundary values
description: Survival probability S(0) must equal 1.0 and life table lx must start at standard radix (100000) and end at
0. Properly initializing boundary values ensures actuarial quantities have correct scale and interpretation as probability
distributions.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-003
source_project: finance-bp-064--insurance_python
pattern_name: Include iteration limits in numerical root-finding
description: Bisection and other root-finding algorithms must include maxIter parameters and verify interval contains valid
root (sign change). This prevents infinite loops when calibration fails, ensuring service availability in regulatory compliance
workflows.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-004
source_project: finance-bp-065--pyliferisk
pattern_name: Avoid bare except clauses that mask TypeErrors
description: Bare except clauses that catch all exceptions including TypeError and return default values (0 or None) mask
fundamental parameter errors. Use specific exception handling and validate inputs upfront to fail fast with clear error
messages.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-005
source_project: finance-bp-065--pyliferisk
pattern_name: Preserve standard radix and extinction conventions in life tables
description: 'Life insurance calculations rely on industry-standard conventions: radix of 100000 at age 0 and lx[-1]=0 for
complete extinction. Deviating from these conventions scales all derived quantities incorrectly and breaks interoperability
with other actuarial systems.'
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-006
source_project: finance-bp-063--chainladder-python, finance-bp-064--insurance_python
pattern_name: Ensure workflow step ordering and parameter consistency
description: 'Multi-step algorithms (triangle transformations, Smith-Wilson calibration) require strict step ordering: compute
calibration vector before extrapolation, use consistent alpha values throughout. Violating workflow order produces undefined
or mathematically inconsistent results.'
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-007
source_project: finance-bp-126--lifelines
pattern_name: Validate probability bounds for confidence intervals
description: Confidence interval bounds must be constrained to [0,1] for probability estimates. Use fillna and formula constraints
to ensure CI bounds remain valid probability ranges, preventing invalid statistical inference from actuarial models.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-008
source_project: finance-bp-065--pyliferisk, finance-bp-064--insurance_python
pattern_name: Validate matrix properties before decomposition
description: Covariance matrices must be verified to be positive semi-definite before Cholesky decomposition. Invalid matrices
cause math domain errors or invalid correlated samples. Similarly, correlation coefficients must be validated to lie within
[-1,1] before use in sqrt(1-rho²).
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-009
source_project: finance-bp-126--lifelines
pattern_name: Make defensive copies of input DataFrames
description: User-provided DataFrames should be copied before inplace modifications (.pop(), .drop()). This preserves user
data integrity and prevents side effects from leaking into caller code, maintaining data isolation principles.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
- wisdom_id: CW-INSURANCE-010
source_project: finance-bp-063--chainladder-python
pattern_name: Exclude incomplete diagonals from historical analysis
description: The latest diagonal in claims triangles contains incomplete development data from the current period. Excluding
this diagonal via valuation_date filtering ensures development factors capture only completed, reliable historical patterns
for unbiased IBNR estimation.
applicable_to_activity: insurance-actuarial
_source_file: cross-project-wisdom/insurance.yaml
domain_constraints_injected: []
resources_injected: {}
known_use_cases:
- kuc_id: KUC-101
source_file: singular_spectrum_analysis/SSA_Example.ipynb
business_problem: Decomposes time series data into interpretable components (trend, seasonality, noise) using Singular Spectrum
Analysis to identify underlying patterns in financial data.
intent_keywords:
- SSA
- singular spectrum analysis
- time series decomposition
- scree plot
- trend extraction
stage: research_analysis
data_domain: financial_data
type: research_analysis
- kuc_id: KUC-102
source_file: stationary_bootstrap/Stationary Bootstrap Italian Swap Example.ipynb
business_problem: Applies the stationary bootstrap resampling method to Italian swap rate data for statistical inference, enabling
confidence interval estimation and hypothesis testing on interest rate derivatives.
intent_keywords:
- stationary bootstrap
- swap rates
- resampling
- confidence intervals
- interest rate derivatives
stage: research_analysis
data_domain: financial_data
type: research_analysis
component_capability_map:
project: finance-bp-064--insurance_python
scan_date: '2026-04-22'
stats:
total_files: 6
total_classes: 20
total_functions: 0
total_stages: 6
modules:
yield_curve_fitting:
class_count: 5
stage_id: yield_curve_fitting
stage_order: 1
responsibility: Interpolate and extrapolate interest rate curves from observed market data using regulatory-grade algorithms
(EIOPA-compliant). This stage provides the foundational zero-coupon rates needed by downstream pricing and simulation
stages.
classes:
- name: NSSMinimize.compute
file: yield_curve_fitting/nssminimize-compute.py
line: 0
kind: required_method
signature: ''
- name: NSSGoodFit.objective
file: yield_curve_fitting/nssgoodfit-objective.py
line: 0
kind: required_method
signature: ''
- name: SWCalibrate
file: yield_curve_fitting/swcalibrate.py
line: 0
kind: required_method
signature: ''
- name: SWExtrapolate
file: yield_curve_fitting/swextrapolate.py
line: 0
kind: required_method
signature: ''
- name: root_finding_method
file: yield_curve_fitting/root-finding-method.py
line: 0
kind: replaceable_point
design_decision_count: 1
alpha_parameter_calibration:
class_count: 2
stage_id: alpha_calibration
stage_order: 2
responsibility: Find optimal convergence speed parameter alpha using bisection root-finding to satisfy EIOPA tolerance
constraints for Solvency II regulatory compliance.
classes:
- name: BisectionAlpha.find_alpha
file: alpha_parameter_calibration/bisectionalpha-find-alpha.py
line: 0
kind: required_method
signature: ''
- name: convergence_point_formula
file: alpha_parameter_calibration/convergence-point-formula.py
line: 0
kind: replaceable_point
design_decision_count: 2
interest_rate_simulation:
class_count: 4
stage_id: interest_rate_simulation
stage_order: 3
responsibility: Simulate stochastic paths for interest rates using short-rate models (Vasicek, Hull-White, Dothan),
enabling Monte Carlo pricing and risk analysis.
classes:
- name: BrownianMotion.simulate
file: interest_rate_simulation/brownianmotion-simulate.py
line: 0
kind: required_method
signature: ''
- name: ssaBasic.decompose
file: interest_rate_simulation/ssabasic-decompose.py
line: 0
kind: required_method
signature: ''
- name: random_number_generator
file: interest_rate_simulation/random-number-generator.py
line: 0
kind: replaceable_point
- name: svd_implementation
file: interest_rate_simulation/svd-implementation.py
line: 0
kind: replaceable_point
design_decision_count: 4
option_pricing:
class_count: 3
stage_id: option_pricing
stage_order: 4
responsibility: Price financial derivatives (swaptions, zero-coupon bonds) under stochastic interest rate models using
Monte Carlo simulation.
classes:
- name: Swaption.price
file: option_pricing/swaption-price.py
line: 0
kind: required_method
signature: ''
- name: ZeroCouponBond.price_Vasicek_Two_Factor
file: option_pricing/zerocouponbond-price-vasicek-two-factor.py
line: 0
kind: required_method
signature: ''
- name: integration_method
file: option_pricing/integration-method.py
line: 0
kind: replaceable_point
design_decision_count: 2
time_series_analysis_(ssa):
class_count: 4
stage_id: time_series_analysis
stage_order: 5
responsibility: Non-parametric decomposition and forecasting of time series using Singular Spectrum Analysis, enabling
signal extraction and uncertainty quantification.
classes:
- name: ssaBasic.fit
file: time_series_analysis_(ssa)/ssabasic-fit.py
line: 0
kind: required_method
signature: ''
- name: ssaBasic.forecast
file: time_series_analysis_(ssa)/ssabasic-forecast.py
line: 0
kind: required_method
signature: ''
- name: ssaBasic.reconstruct
file: time_series_analysis_(ssa)/ssabasic-reconstruct.py
line: 0
kind: required_method
signature: ''
- name: forecast_method
file: time_series_analysis_(ssa)/forecast-method.py
line: 0
kind: replaceable_point
design_decision_count: 4
stationary_bootstrap_resampling:
class_count: 2
stage_id: resampling_bootstrap
stage_order: 6
responsibility: Resample dependent time series while preserving stationarity using random block lengths, enabling statistical
inference for autocorrelated data.
classes:
- name: N/A
file: stationary_bootstrap_resampling/n-a.py
line: 0
kind: required_method
signature: ''
- name: block_length_algorithm
file: stationary_bootstrap_resampling/block-length-algorithm.py
line: 0
kind: replaceable_point
design_decision_count: 1
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.11578947368421053
evidence_invalid: 84
evidence_verified: 11
evidence_auto_fixed: 0
audit_coverage: 61/61 (100%)
audit_pass_rate: 0/61 (0%)
audit_fail_total: 40
audit_finance_universal:
pass: 0
warn: 7
fail: 13
audit_subdomain_totals:
pass: 0
warn: 14
fail: 27
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-064. Evidence verify ratio
= 11.6% and audit fail total = 40. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-064-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt, then run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc: []
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries:
- uc_id: UC-101
name: Singular Spectrum Analysis Time Series Decomposition
positive_terms:
- SSA
- singular spectrum analysis
- time series decomposition
- scree plot
- trend extraction
data_domain: financial_data
negative_terms:
- trading strategy
- MACD
- moving average crossover
- screening
- live trading
- stationary bootstrap
ambiguity_question: Are you looking to decompose time series data into components (trend, seasonality, noise) for pattern
recognition, or do you need a trading signal generation strategy?
- uc_id: UC-102
name: Stationary Bootstrap for Interest Rate Swap Inference
positive_terms:
- stationary bootstrap
- swap rates
- resampling
- confidence intervals
- interest rate derivatives
data_domain: financial_data
negative_terms:
- SSA
- singular spectrum
- trading strategy
- MACD
- screening
- live trading
ambiguity_question: Are you interested in bootstrap resampling methods for statistical inference on interest rate data,
or are you looking for time series decomposition techniques like SSA?
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched a single use case with a confidence gap > 20% over the next candidate and no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 107
fatal_constraints_count: 35
non_fatal_constraints_count: 125
use_cases_count: 2
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'Don''t forget SL-11: a Recorder subclass MUST declare both provider and data_schema class attributes;
otherwise initialization fails with an assertion error (fatal violation finance-C-001).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9), choosing standard Appel parameters over adaptive ones because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose a registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering over current-only filtering because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because the available_long check in sim_account depends on it; this was chosen
over symmetric ordering to prevent implicit leverage. BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
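# Illustrative note (not from the source): a worked instance of the next-bar rule in this stage,
# assuming a daily level for which level.to_second() returns 86400.
#   happen_timestamp = 2024-01-02 00:00:00
#   due_timestamp    = happen_timestamp + 86400 s = 2024-01-03 00:00:00
# A signal computed on the 2024-01-02 bar is therefore executed no earlier than the following bar.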
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose drawer_rects subclass override for custom annotations not hardcoded markers — allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 48 source groups: alpha_calibration(11),
black_sholes(1), block_size_formula(1), cash_flow_matrix(1), contract_size(1), convergence_criteria(1), and 42 more.'
key_decisions: 107 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-004
type: BA
summary: Bisection root-finding over Newton-Raphson for alpha calibration
- id: BD-005
type: B/DK
summary: Convergence point T = max(U+40, 60) from EIOPA spec
- id: BD-019
type: B/BA
summary: Set Ultimate Forward Rate (UFR) to 4.2% as long-term convergence target
- id: BD-020
type: B/BA
summary: Set alpha convergence parameter to 0.142068 for EIOPA example data
- id: BD-022
type: B/BA
summary: Set convergence point T = max(U+40, 60) for alpha calibration
- id: BD-023
type: B/BA
summary: Use bisection root-finding to calibrate alpha to satisfy Tau tolerance
- id: BD-033
type: B/BA
summary: Use default mean reversion speed a=1.0 for Vasicek processes
- id: BD-040
type: B/BA
summary: Use SSA embedding dimension L0 = N/2 for time series decomposition
- id: BD-050
type: B
summary: Balance swap and rates calibration with relative error in objective function
- id: BD-054
type: B/BA
summary: Use MLE (maximum likelihood estimation) for Vasicek one-factor parameter validation
- id: BD-070
type: B/BA
summary: Use bisection root-finding algorithm to find optimal alpha for Smith-Wilson convergence
- id: BD-071
type: B
summary: 'Use exact discretization for Black-Scholes GBM: S[t+dt] = S[t]*exp((mu-0.5*sigma^2)*dt + sigma*sqrt(dt)*Z)'
- id: BD-038
type: B
summary: Calculate optimal block size B* = (2*G^2/DSB)^(1/3) * n^(1/3) for bootstrap
- id: BD-056
type: B/RC
summary: Use identity matrix as cash flow matrix for ZCB bonds in SW calibration
- id: BD-048
type: B/BA
summary: Default 10% notional for swaption pricing example
- id: BD-024
type: B/BA
summary: Set Tau tolerance to 0.0001 (0.01%) for alpha calibration convergence
- id: BD-072
type: B
summary: Use custom Cholesky-Banachiewicz decomposition for variance-covariance matrix square root
- id: BD-030
type: B/BA
summary: Simulate Vasicek two-factor model with default correlation rho=0.5
- id: BD-045
type: B
summary: Use Cholesky decomposition for correlated Brownian motion generation
- id: BD-028
type: B/DK
summary: Use 5 data points (1,2,5,10,25yr) for NSS yield curve calibration
- id: BD-039
type: B/BA
summary: Require minimum 9 observations for stationary bootstrap calibration
- id: BD-084
type: B/BA
summary: EIOPA convergence point T defaults to max(U+40, 60) in bisection_alpha
- id: BD-085
type: B/BA
summary: BrownianMotion x0 defaults to 0 in Vasicek two-factor model
- id: BD-093
type: BA/M
summary: SSA OptimalLength minimum time series size is 9 elements
- id: BD-073
type: B/RC
summary: Use moment-matched lognormal approximation for Dothan model discretization
- id: BD-098
type: B/BA
summary: 'INTERACTION: BD-018 (EIOPA Smith-Wilson mandate) × BD-019 (UFR 4.2%) → Amplified regulatory compliance requiring
both exact algorithm and specific parameter'
- id: BD-099
type: BA/M
summary: 'INTERACTION: BD-005/BD-084 (Convergence point formula) × BD-022/BD-023/BD-024 (Alpha calibration) → Risk cascade
where T boundary affects bisection convergence reliability'
- id: BD-100
type: BA/M
summary: 'INTERACTION: BD-045 (Cholesky for correlated paths) × BD-030 (rho=0.5 default) → Hidden dependency where correlation
parameter silently requires matrix decomposition'
- id: BD-101
type: M
summary: 'INTERACTION: BD-090 (Stationary bootstrap code duplication) × BD-097 (Smith-Wilson code duplication) → Amplified
maintenance risk creating parallel defect propagation vectors'
- id: BD-102
type: B/BA
summary: 'INTERACTION: BD-032 (Initial rate 10%) × BD-033 (Mean reversion a=1.0) × BD-034 (Volatility sigma=0.2) → Hidden
dependency where stress-test initial conditions interact with parameter assumptions'
- id: BD-103
type: T
summary: 'INTERACTION: BD-094 (Undefined Calibrator attributes) × BD-062/BD-063 (Nelder-Mead with SSE calibration) →
Risk cascade creating calibration failure under all conditions'
- id: BD-104
type: BA/M
summary: 'INTERACTION: BD-035 (100 MC scenarios) × BD-042 (100 bootstrap samples) → Amplification of undersampling bias
across simulation and uncertainty quantification'
- id: BD-105
type: B/BA
summary: 'INTERACTION: BD-053 (Nominal = Real + Inflation decomposition) × BD-030/BD-066 (Correlated rate generation)
→ Contradiction where decomposition assumption conflicts with correlation structure'
- id: BD-106
type: BA
summary: 'INTERACTION: BD-040 (L0=N/2 default) × BD-091 (L0<N/2 invariant) × BD-058 (Window length balancing) → Risk
cascade where default parameter sits exactly at boundary constraint'
- id: BD-107
type: B
summary: 'INTERACTION: BD-067 (Euler-Maruyama discretization) × BD-080 (Exact Vasicek discretization) → Contradiction
in discretization standards across Vasicek implementations'
- id: BD-074
type: B/BA
summary: Use analytical discretization for Hull-White one-factor model with time-dependent theta
- id: BD-094
type: T
summary: Calibrator class in vasicek_two_factor has undefined methods used as static
- id: BD-027
type: B/BA
summary: Initialize all 6 NSS parameters at 0.1 for Nelder-Mead starting point
- id: BD-032
type: B/BA
summary: Set default initial interest rates to 10% for both real and nominal processes
- id: BD-006
type: B/DK
summary: DataFrame output with Time as index
- id: BD-007
type: BA/M
summary: Cholesky decomposition for correlated Brownian motion generation
- id: BD-008
type: B
summary: Custom Cholesky implementation over numpy.linalg.cholesky
- id: BD-009
type: BA/M
summary: L0 embedding dimension defaults to N/2 when not provided
- id: BD-053
type: B
summary: Model nominal interest rate as real rate plus inflation rate
- id: BD-043
type: B/BA
summary: Calculate 95% confidence intervals using 2.5th and 97.5th percentiles
- id: BD-089
type: B/BA
summary: All interest rate simulators return DataFrames with 'Time' as index
- id: BD-091
type: BA
summary: SSA L0 must be < N/2 by design (embedding dimension constraint)
- id: BD-092
type: BA
summary: SSA r0 must be < L+1 for recursive forecast to work correctly
- id: BD-036
type: B/BA
summary: Use trapezoidal kernel for stationary bootstrap spectral estimation
- id: BD-029
type: B
summary: Use sum of squared residuals as NSS goodness-of-fit objective function
- id: BD-018
type: B/DK
summary: Use EIOPA's Smith-Wilson algorithm for interest rate term structure interpolation/extrapolation
- id: BD-075
type: B/BA
summary: Use Nelder-Mead simplex algorithm for Nelson-Siegel-Svensson parameter fitting
- id: BD-076
type: B
summary: Use sum of squared errors (Euclidean distance) for NSS goodness-of-fit measure
- id: BD-055
type: B/DK
summary: Use trapz (trapezoidal) integration for bond pricing under stochastic rates
- id: BD-026
type: B/DK
summary: Use Nelder-Mead simplex algorithm for Nelson-Siegel-Svensson optimization
- id: BD-051
type: B/BA
summary: Set Nelder-Mead max iterations=1000, max function evaluations=5000 for calibration
- id: BD-010
type: BA/M
summary: Monte Carlo integration for two-factor bond pricing
- id: BD-011
type: B
summary: Swaption payer/receiver derived from boolean not payer
- id: BD-035
type: B
summary: Use Monte Carlo with 100 scenarios for zero-coupon bond pricing
- id: BD-083
type: B
summary: SWCalibrate() MUST be called before SWExtrapolate() in Smith-Wilson pipeline
- id: BD-087
type: B
summary: 'Vasicek two-factor pricing pipeline: BrownianMotion -> ZeroCouponBond.price_Vasicek_Two_Factor'
- id: BD-088
type: B/DK
summary: SSA pipeline requires embedding/SVD before reconstruction/forecast
- id: BD-095
type: B/BA
summary: BisectionAlpha requires xStart < xEnd and opposite-sign function values
- id: BD-086
type: BA/M
summary: Correlated Brownian motion uses Cholesky decomposition exclusively
- id: BD-096
type: BA
summary: All yield curve fitting uses Nelder-Mead simplex optimization exclusively
- id: BD-052
type: B/BA
summary: Use 6-month (0.5yr) floating leg frequency for swap/swaption pricing
- id: BD-046
type: B
summary: Use vectorized operations for Black-Scholes simulation instead of loop
- id: BD-021
type: B/DK
summary: Extrapolate yield curve to 65 years maturity for pension liability calculations
- id: BD-016
type: BA
summary: Automatic block length selection via Politis-White 2004 method
- id: BD-017
type: BA/M
summary: Trapezoidal spectral window for block size estimation
- id: BD-077
type: B
summary: Use Politis-White automatic block length selection for stationary bootstrap
- id: BD-078
type: B/BA
summary: Use trapezoidal kernel function for spectral density estimation in bootstrap
- id: BD-079
type: B/BA
summary: 'Use Politis-White Bstar formula: Bstar = (2*Ghat^2/DSBhat)^(1/3) * n^(1/3)'
- id: BD-044
type: B/BA
summary: Use OLS regression of reconstructed signal on original for SSA bootstrap residuals
- id: BD-047
type: B/DK
summary: 'Use Geometric Brownian Motion formula: S(t+dt) = S(t) * exp((mu-0.5*sigma^2)*dt + sigma*sqrt(dt)*Z)'
- id: BD-025
type: B/BA
summary: 'Set bisection search bounds for alpha: xStart=0.05, xEnd=0.5'
- id: BD-068
type: B/BA
summary: Use matrix inversion (numpy.linalg.inv) for Smith-Wilson calibration vector computation
- id: BD-069
type: B/DK
summary: Use Wilson kernel function (Heart of Wilson) with alpha convergence parameter
- id: BD-037
type: B/BA
summary: Set autocorrelation significance threshold c=2 for bootstrap block selection
- id: BD-049
type: B/BA
summary: Default 10% fixed rate for swaption contracts
- id: BD-090
type: DK
summary: stationary_bootstrap and stationary_bootstrap_calibration contain identical code
- id: BD-097
type: M/BA
summary: smith_wilson and bisection_alpha share nearly identical SWCalibrate/SWExtrapolate/SWHeart code
- id: BD-012
type: BA
summary: Embedding dimension L0 auto-adjusts to N/2 with warning when L0 > N/2
- id: BD-013
type: B
summary: Hankel matrix construction for time series embedding
- id: BD-014
type: BA/M
summary: Weighted correlation for SSA component separability assessment
- id: BD-015
type: BA/M
summary: Bootstrap sampling for SSA forecast uncertainty quantification
- id: BD-031
type: B
summary: Use 52 time periods with dt=0.1 (5.2 year horizon) for interest rate simulation
- id: BD-057
type: B/BA
summary: Use SVD (Singular Value Decomposition) for time series decomposition instead of PCA or Fourier transform
- id: BD-058
type: B/BA
summary: Use Hankel matrix embedding with window length L0 set to N/2 as default
- id: BD-059
type: B/BA
summary: Use percentile-based bootstrap confidence intervals (97.5th and 2.5th) for forecast uncertainty
- id: BD-060
type: B/BA
summary: Use OLS (Ordinary Least Squares via numpy.linalg.lstsq) for bootstrap residual calculation
- id: BD-061
type: B
summary: Use weighted correlation (w-correlation) for assessing component separability
- id: BD-042
type: B/BA
summary: Generate 100 bootstrap samples for SSA forecast confidence intervals
- id: BD-041
type: B
summary: Validate L0 < N/2 to prevent overfitting in SSA reconstruction
- id: BD-080
type: B/BA
summary: 'Use exact discretization for Vasicek one-factor model: r[t] = r[t-1]*exp(-a*dt) + lam*(1-exp(-a*dt)) + sigma*sqrt((1-exp(-2*a*dt))/(2*a))*Z'
- id: BD-081
type: B/BA
summary: Use two-step OLS for initial Vasicek parameter estimation before MLE refinement
- id: BD-082
type: B/BA
summary: 'Use closed-form MLE formulas for Vasicek parameters: MLmu, MLlam, MLsigma derived from sufficient statistics'
- id: BD-062
type: B/BA
summary: Use Nelder-Mead simplex algorithm for Vasicek calibration instead of Levenberg-Marquardt or BFGS
- id: BD-063
type: B
summary: Use sum of squared relative errors as calibration objective function
- id: BD-064
type: B/BA
summary: Use closed-form Vasicek zero-coupon bond pricing formula
- id: BD-065
type: B/BA
summary: Use Monte Carlo simulation with trapezoidal integration for zero-coupon bond pricing
- id: BD-066
type: B/BA
summary: 'Use correlated Brownian motion via conditional formula: Z3 = rho*Z1 + sqrt(1-rho^2)*Z2'
- id: BD-067
type: B/BA
summary: Use Euler-Maruyama discretization for two-factor Vasicek SDE
- id: BD-034
type: B/BA
summary: Use default volatility sigma=0.2 for interest rate simulations
- id: BD-001
type: B/DK
summary: Local imports of SWHeart inside SWCalibrate/SWExtrapolate
- id: BD-002
type: B/BA
summary: Matrix formulation using numpy for readability over loops
- id: BD-003
type: BA/DK
summary: Nelder-Mead simplex for NSS optimization over gradient methods
resources:
packages:
- name: numpy
version_pin: latest
- name: scipy
version_pin: latest
- name: pandas
version_pin: latest
- name: matplotlib
version_pin: latest
- name: seaborn
version_pin: latest
- name: pytest
version_pin: latest
- name: IPython
version_pin: latest
- name: datetime
version_pin: latest
- name: warnings
version_pin: latest
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install numpy
- python3 -m pip install scipy
- python3 -m pip install pandas
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When implementing the Smith-Wilson calibration vector calculation
action: Compute calibration vector b using EIOPA paragraph 149 specification with matrix inverse of (Q' * H * Q)
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect calibration vector causes interpolated/extrapolated rates to deviate from EIOPA-compliant values,
invalidating downstream insurance reserve calculations
stage_ids:
- yield_curve_fitting
- id: finance-C-002
when: When implementing the Wilson heart function for matrix operations
action: 'Calculate H matrix using formula: 0.5 * (α*(u+v) + exp(-α*(u+v)) - α*|u-v| - exp(-α*|u-v|)) per EIOPA paragraph
132'
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect Wilson heart function causes wrong H matrix values, propagating errors through all calibration
and extrapolation calculations
stage_ids:
- yield_curve_fitting
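# Illustrative check of the Wilson heart formula in finance-C-002 (example inputs, not EIOPA data):
# with α = 0.1, u = 1, v = 2,
#   H = 0.5 * (0.1*3 + exp(-0.3) - 0.1*1 - exp(-0.1))
#     = 0.5 * (0.3 + 0.740818 - 0.1 - 0.904837)
#     ≈ 0.01799
# The expression is symmetric in u and v, so swapping them gives the same value.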
- id: finance-C-003
when: When implementing rate to price conversion for zero-coupon bonds
action: Transform observed rates to ZCB prices using p = (1+r)^(-M) formula before calibration
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect rate-to-price conversion produces wrong bond prices, causing calibration vector b to be miscalculated
and invalidating all derived rates
stage_ids:
- yield_curve_fitting
- id: finance-C-004
when: When converting extrapolated bond prices back to interest rates
action: Convert derived prices to rates using r = p^(-1/M) - 1 formula per EIOPA paragraph 147
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect price-to-rate conversion produces wrong yield values, causing reported interest rates to be incorrect
for regulatory and pricing purposes
stage_ids:
- yield_curve_fitting
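# Illustrative round trip for finance-C-003 and finance-C-004 (example inputs, not source data):
#   r = 0.042, M = 10   ->  p = (1 + 0.042)^(-10) ≈ 0.6627
#   p ≈ 0.6627, M = 10  ->  r = p^(-1/10) - 1 ≈ 0.042
# Recovering the input rate after converting to a price and back is a quick sanity check on both formulas.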
- id: finance-C-005
when: When implementing the root-finding algorithm for alpha calibration
action: Include a maxIter parameter to prevent infinite loops when bisection method fails to converge
severity: fatal
kind: domain_rule
modality: must
consequence: Missing iteration limit causes infinite loop, freezing the application when alpha cannot be calibrated from
given market data
stage_ids:
- yield_curve_fitting
- id: finance-C-011
when: When executing the Smith-Wilson workflow
action: Call SWCalibrate to compute vector b before calling SWExtrapolate
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Missing calibration step causes undefined or incorrect extrapolation results due to absent calibration vector
stage_ids:
- yield_curve_fitting
- id: finance-C-016
when: When implementing alpha calibration for Solvency II compliance
action: use the calibrated alpha value consistently in the subsequent SWExtrapolate call
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using different alpha values between calibration and extrapolation violates EIOPA specifications and produces
incorrect extrapolated rates that do not satisfy the convergence tolerance Tau, invalidating the Solvency II regulatory
submission
stage_ids:
- alpha_calibration
- id: finance-C-017
when: When implementing EIOPA convergence point calculation
action: compute convergence point T as max(U + 40, 60) where U is the maximum observed maturity
severity: fatal
kind: domain_rule
modality: must
consequence: Using any formula other than max(U+40, 60) for convergence point T violates EIOPA paragraphs 120 and 157,
causing the calibrated alpha to be computed against an incorrect convergence target and invalidating regulatory compliance
stage_ids:
- alpha_calibration
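# Illustrative values for T = max(U + 40, 60) in finance-C-017, with U in years (example inputs only):
#   U = 15  ->  T = max(55, 60) = 60
#   U = 20  ->  T = max(60, 60) = 60
#   U = 30  ->  T = max(70, 60) = 70
# The 60-year floor binds whenever the longest observed maturity is 20 years or less.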
- id: finance-C-018
when: When calibrating alpha using the bisection method
action: verify the bisection interval bounds (xStart, xEnd) contain a sign change in Galfa
severity: fatal
kind: domain_rule
modality: must
consequence: Without sign changes in Galfa at the interval bounds, the bisection method fails to find a root, causing
infinite loop timeout or incorrect alpha values that do not satisfy the Tau tolerance
stage_ids:
- alpha_calibration
- id: finance-C-020
when: When calling Galfa with the calibrated alpha
action: verify that |Galfa(M_Obs, r_Obs, ufr, alpha_calibrated, Tau)| is within Precision tolerance
severity: fatal
kind: domain_rule
modality: must
consequence: If Galfa(calibrated_alpha) exceeds the Precision tolerance, the alpha calibration has not actually satisfied
the Tau convergence constraint, meaning the EIOPA tolerance requirements are violated
stage_ids:
- alpha_calibration
- id: finance-C-022
when: When computing the calibration vector b in SWCalibrate
action: use the same alpha value that will be used in SWExtrapolate for the same dataset
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Computing calibration vector b with a different alpha than used in extrapolation produces mathematically
inconsistent rate curves where observed points do not match market prices
stage_ids:
- alpha_calibration
- id: finance-C-032
when: When passing dt (time step) parameter to simulation functions
action: express dt as a fraction of year where dt > 0 (e.g., 0.1 = ~36 days)
severity: fatal
kind: domain_rule
modality: must
consequence: Negative or zero dt produces invalid number of time steps, causing index errors or infinite loops. dt=0 causes
division by zero in time discretization calculations.
stage_ids:
- interest_rate_simulation
- id: finance-C-033
when: When providing variance-covariance matrix to correlated Brownian motion
action: verify the matrix is positive semi-definite before passing to Cholesky decomposition
severity: fatal
kind: domain_rule
modality: must
consequence: Non-positive-semidefinite matrix causes Cholesky to fail with math domain error or produce invalid correlated
samples, invalidating all downstream risk calculations.
stage_ids:
- interest_rate_simulation
- id: finance-C-043
when: When implementing Monte Carlo pricing with two-factor Vasicek model
action: validate correlation coefficient rho is within [-1, 1] bounds
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid correlation values will cause undefined behavior in Brownian motion generation (sqrt(1-rho²) becomes
imaginary), producing NaN prices or silent numerical failures
stage_ids:
- option_pricing
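# Illustrative check of the rho bound in finance-C-043 (example inputs only), using the sqrt(1 - rho^2) term:
#   rho = 0.5  ->  1 - 0.25 = 0.75, sqrt(0.75) ≈ 0.8660
#   rho = 1.2  ->  1 - 1.44 = -0.44, which has no real square root
# Rejecting rho outside [-1, 1] before generating paths avoids the NaN described in the consequence.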
- id: finance-C-044
when: When implementing Monte Carlo pricing with trapz integration
action: validate nScen (number of Monte Carlo scenarios) is a positive integer
severity: fatal
kind: domain_rule
modality: must
consequence: Zero or negative nScen causes division by zero in mean calculation; non-integer nScen causes silent data
corruption through broadcasting errors
stage_ids:
- option_pricing
- id: finance-C-048
when: When computing bond option prices under stochastic interest rate models
action: use numerical integration (trapz) because no closed-form solution exists for two-factor model
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Attempting to use closed-form pricing for two-factor model produces mathematically incorrect bond prices
that diverge from market values
stage_ids:
- option_pricing
- id: finance-C-058
when: When implementing SSA embedding dimension L0
action: Set L0 strictly less than N/2; if the supplied value violates this, adjust it automatically and emit a warning
severity: fatal
kind: domain_rule
modality: must
consequence: If L0 >= N/2, the SSA algorithm crashes because the Hankel matrix becomes singular and cannot be properly
decomposed via SVD, producing invalid reconstruction results.
stage_ids:
- time_series_analysis
- id: finance-C-059
when: When configuring recursive SSA forecast
action: Verify r0 < L+1 for the selected singular vectors to span the forecast subspace
severity: fatal
kind: domain_rule
modality: must
consequence: If max(r0) >= L+1, the forecast recursion produces meaningless values because the right singular vectors
cannot span the required subspace for linear recursive forecasting.
stage_ids:
- time_series_analysis
- id: finance-C-069
when: When implementing time-series resampling with stationary bootstrap
action: validate input data is a 1-dimensional numpy ndarray before processing
severity: fatal
kind: domain_rule
modality: must
consequence: Non-1D array input causes undefined behavior in index operations, leading to incorrect bootstrap samples
or silent data corruption
stage_ids:
- resampling_bootstrap
- id: finance-C-070
when: When computing autocorrelation for block length calibration
action: use at least 9 elements in the input time series for meaningful bootstrap results
severity: fatal
kind: domain_rule
modality: must
consequence: Time series shorter than 9 elements produce unreliable autocorrelation estimates, causing suboptimal block
length selection and invalid statistical inference
stage_ids:
- resampling_bootstrap
- id: finance-C-071
when: When calling stationary_bootstrap with calibration-computed block length
action: verify block length m is positive before passing to the resampling algorithm
severity: fatal
kind: domain_rule
modality: must
consequence: Non-positive block length causes division by zero in accept probability calculation, producing NaN values
in bootstrap output
stage_ids:
- resampling_bootstrap
- id: finance-C-078
when: When applying stationary bootstrap to non-stationary time series
action: apply stationary bootstrap directly without first transforming data to stationarity
severity: fatal
kind: domain_rule
modality: must_not
consequence: Stationary bootstrap assumes weak dependence and stationarity; applying it to trending or unit-root data
produces invalid resamples that mix temporal states
stage_ids:
- resampling_bootstrap
- id: finance-C-081
when: When passing observed rates and maturities from yield_curve_fitting to alpha_calibration
action: verify r_Obs and M_Obs are numpy arrays with matching dimensions (n x 1 column vectors)
severity: fatal
kind: domain_rule
modality: must
consequence: Dimension mismatch causes Smith-Wilson calibration to produce invalid calibration vector b, leading to incorrect
yield curve extrapolation results
- id: finance-C-082
when: When alpha_calibration returns the optimal alpha to yield_curve_fitting for recalibration
action: return alpha as a positive floating-point value within the search bounds (xStart, xEnd)
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Invalid alpha value causes matrix inversion failure in SWCalibrate or produces degenerate yield curve that
violates EIOPA regulations
- id: finance-C-084
when: When yield_curve_fitting passes zero-coupon prices and rates to option_pricing
action: verify rates are annual rates (not log returns or discount factors) and maturities are in years
severity: fatal
kind: domain_rule
modality: must
consequence: Incorrect rate format causes swaption prices to be miscalculated, leading to significant financial losses
in live trading scenarios
- id: finance-C-092
when: When implementing Smith-Wilson interest rate curve algorithms
action: Use annual decimal representation for rates (0.042 = 4.2%) and maturity vectors as n x 1 column vectors for EIOPA-compliant
matrix operations
severity: fatal
kind: domain_rule
modality: must
consequence: Matrix operations will fail due to shape mismatch, producing incorrect interpolated/extrapolated interest
rates that violate EIOPA technical specifications
- id: finance-C-094
when: When using Smith-Wilson algorithm for interest rate curve fitting
action: Call SWCalibrate() before SWExtrapolate(), and verify the alpha parameter used in calibration matches the alpha
parameter passed to extrapolation
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Extrapolated rates will be mathematically incorrect because the calibration vector b is specific to the alpha
value used during calibration; mismatched alpha produces invalid curve fits
- id: finance-C-095
when: When using SSA (Singular Spectrum Analysis) for time series forecasting
action: Complete the __init__ embedding and SVD decomposition before calling forecast() or reconstruction() — the four-stage
pipeline (embedding → SVD → grouping → hankelization) must execute in sequence
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Forecast and reconstruction methods will raise AttributeError or produce meaningless results because required
matrices (U, S, V, H) are not initialized
- id: finance-C-096
when: When performing correlated Brownian motion simulations requiring Cholesky decomposition
action: Verify the variance-covariance matrix is positive semi-definite before passing to the Cholesky decomposition function
severity: fatal
kind: resource_boundary
modality: must
consequence: Cholesky decomposition will fail with negative square root or produce invalid correlation structures, causing
Monte Carlo simulations to produce incorrect or undefined results
- id: finance-C-098
when: When specifying SSA reconstruction indices r0
action: Verify max(r0) is strictly less than L+1, where L is the embedding dimension
severity: fatal
kind: resource_boundary
modality: must
consequence: SSA recursive forecast fails because the selected right singular vectors do not form a valid subspace for
the recursive projection algorithm
- id: finance-C-112
when: When implementing or selecting interest rate term structure interpolation and extrapolation methods
action: Use EIOPA-compliant Smith-Wilson algorithm for interest rate term structure interpolation and extrapolation —
do not use cubic splines, piecewise linear interpolation, or other non-EIOPA methods
severity: fatal
kind: domain_rule
modality: must
consequence: Non-EIOPA interpolation methods violate Solvency II SCR calculation requirements and produce yield curves
that do not converge to the Ultimate Forward Rate at long maturities as mandated by EIOPA technical specifications
derived_from_bd_id: BD-018
- id: finance-C-120
when: When implementing EIOPA-compliant term structure calculations
action: Modify BD-018 (Smith-Wilson algorithm) or BD-019 (UFR 4.2%) independently — both decisions must be changed together
to maintain regulatory compliance
severity: fatal
kind: domain_rule
modality: must_not
consequence: Changing only the Smith-Wilson algorithm without updating the UFR parameter leaves the convergence target
undefined, breaking EIOPA regulatory compliance for insurance liability calculations
derived_from_bd_id: BD-098
- id: finance-C-127
when: When implementing equity price simulation using Geometric Brownian Motion
action: 'Use the exact GBM formula: S(t+dt) = S(t) * exp((mu - 0.5*sigma^2)*dt + sigma*sqrt(dt)*Z); verify the drift correction
term (mu-0.5*sigma^2) is included, not just mu*dt'
severity: fatal
kind: domain_rule
modality: must
consequence: Omitting the drift correction term causes simulated prices to follow a log-normal distribution with incorrect
mean, leading to systematically biased option pricing and incorrect Greeks calculations
derived_from_bd_id: BD-047
- id: finance-C-145
when: When implementing or refactoring the Smith-Wilson curve fitting pipeline
action: Execute SWCalibrate() to completion before calling SWExtrapolate(); SWHeart matrix must be fully computed and
available as input to the extrapolation function
severity: fatal
kind: domain_rule
modality: must
consequence: Calling SWExtrapolate before SWCalibrate causes runtime exceptions with SWHeart=None, producing undefined
behavior and preventing the Smith-Wilson algorithm from converging
derived_from_bd_id: BD-083
- id: finance-C-146
when: When implementing or refactoring the Vasicek two-factor pricing pipeline
action: Execute simulate_Vasicek_Two_Factor() to completion before calling price_Vasicek_Two_Factor(); all Brownian motion
paths must be generated and stored before pricing calculations begin
severity: fatal
kind: domain_rule
modality: must
consequence: Calling price_Vasicek_Two_Factor before simulation completes results in pricing with empty or undefined rate
paths, causing NaN values or zero present values
derived_from_bd_id: BD-087
regular:
- id: finance-C-006
when: When using numpy float64 for interest rate calculations
action: Accept numpy float64 precision limits for monetary calculations in insurance actuarial work
severity: medium
kind: resource_boundary
modality: must
consequence: Floating-point rounding errors in interest rate calculations compound over long maturities, causing material
discrepancies in insurance reserve valuations
stage_ids:
- yield_curve_fitting
- id: finance-C-007
when: When computing the matrix (Q' * H * Q) for inversion
action: Verify the matrix is numerically invertible with acceptable condition number
severity: high
kind: resource_boundary
modality: must
consequence: Near-singular or ill-conditioned matrix causes numerical instability, producing unreliable calibration vectors
and incorrect yield curve estimates
stage_ids:
- yield_curve_fitting
- id: finance-C-008
when: When configuring alpha convergence parameter
action: Provide bounded search range (xStart, xEnd) for bisection root-finding algorithm
severity: high
kind: resource_boundary
modality: must
consequence: Unbounded alpha search causes numerical overflow or failure to find valid convergence speed, preventing calibration
for certain yield curve shapes
stage_ids:
- yield_curve_fitting
- id: finance-C-009
when: When reusing calibration vector b for multiple extrapolation calls
action: Reuse the same calibration vector b with identical observed data (M_Obs, ufr, alpha) for different target maturities
severity: low
kind: operational_lesson
modality: must
consequence: Recalculating b for each target maturity wastes computation and may produce inconsistent rates if alpha drifts
between calls
stage_ids:
- yield_curve_fitting
- id: finance-C-010
when: When ensuring extrapolated rates converge to the ultimate forward rate
action: Set Tau (tolerance) parameter appropriately to control maximum deviation from ufr at convergence point
severity: high
kind: operational_lesson
modality: must
consequence: Improper Tau causes extrapolated rates to diverge from ufr at long maturities, violating EIOPA convergence
requirements for insurance regulations
stage_ids:
- yield_curve_fitting
- id: finance-C-012
when: When importing the Wilson heart function in calibration and extrapolation modules
action: Import SWHeart locally inside each function to avoid circular dependencies while maintaining encapsulation
severity: high
kind: architecture_guardrail
modality: must
consequence: Improper import strategy causes circular import errors or breaks encapsulation of EIOPA paragraph references
across modules
stage_ids:
- yield_curve_fitting
- id: finance-C-013
when: When providing observed rates as inputs
action: Verify input rates are annual decimals (e.g., 0.042 = 4.2%) and not in percentage format
severity: high
kind: claim_boundary
modality: must
consequence: Percentage-formatted rates (e.g., 4.2 instead of 0.042) cause approximately 100x amplification in all calculated
prices and rates, producing invalid yield curves
stage_ids:
- yield_curve_fitting
- id: finance-C-014
when: When claiming regulatory compliance with EIOPA standards
action: Claim EIOPA compliance for the algorithm implementation only, not for the entire insurance pricing system
severity: high
kind: claim_boundary
modality: must_not
consequence: Overclaiming EIOPA compliance exposes the insurer to regulatory scrutiny and potential penalties for system-level
gaps in actuarial controls
stage_ids:
- yield_curve_fitting
- id: finance-C-015
when: When presenting yield curve fitting results
action: Present interpolated/extrapolated rates as estimated values derived from observed data, not as actual market quotes
severity: medium
kind: claim_boundary
modality: must_not
consequence: Misrepresenting derived rates as market quotes misleads stakeholders about data reliability and violates
actuarial transparency requirements
stage_ids:
- yield_curve_fitting
- id: finance-C-019
when: When implementing BisectionAlpha for Solvency II regulatory calculations
action: set xStart bound to at least 0.05 per EIOPA recommendations
severity: high
kind: operational_lesson
modality: must
consequence: Setting xStart below 0.05 may produce alpha values that converge too slowly, failing to meet regulatory requirements
for reasonable extrapolation behavior under Solvency II framework
stage_ids:
- alpha_calibration
- id: finance-C-021
when: When using the bisection_alpha module alongside the smith_wilson module
action: mix SWCalibrate/SWExtrapolate from different module directories
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Using SWCalibrate from bisection_alpha with SWExtrapolate from smith_wilson (or vice versa) may produce inconsistent
results due to duplicated implementations with potentially different numerical behaviors
stage_ids:
- alpha_calibration
- id: finance-C-023
when: When using the Smith-Wilson algorithm for EIOPA regulatory calculations
action: use Decimal or verify float precision is sufficient for monetary calculations
severity: medium
kind: domain_rule
modality: must
consequence: Standard Python floats may introduce rounding errors in rate calculations that compound through matrix operations,
potentially causing small but systematic deviations in calibrated alpha that violate the tight Tau tolerance
stage_ids:
- alpha_calibration
- id: finance-C-024
when: When the bisection method reaches maxIter iterations
action: return an unvalidated alpha value without warning the user
severity: high
kind: operational_lesson
modality: must_not
consequence: Returning a non-converged alpha value silently causes downstream yield curve calculations to use an unvalidated
parameter that does not satisfy EIOPA tolerance requirements
stage_ids:
- alpha_calibration
- id: finance-C-025
when: When validating alpha calibration results for regulatory submissions
action: present the backtest calibration results as equivalent to live regulatory calculations
severity: high
kind: claim_boundary
modality: must_not
consequence: Alpha values calibrated offline on historical market data cannot guarantee the same convergence properties
when applied to current market conditions; regulatory submissions require real-time recalibration
stage_ids:
- alpha_calibration
- id: finance-C-026
when: When using the Smith-Wilson algorithm with EIOPA specifications
action: document the alpha value and Tau tolerance used in calibration documentation
severity: medium
kind: operational_lesson
modality: must
consequence: Without documented alpha and Tau values, regulatory auditors cannot verify that the calibrated yield curve
meets EIOPA convergence requirements, potentially invalidating Solvency II submissions
stage_ids:
- alpha_calibration
- id: finance-C-027
when: When computing the convergence gap Galfa
action: use np.abs() around the denominator to handle sign changes correctly
severity: high
kind: domain_rule
modality: must
consequence: Without absolute value, the denominator (1 - K*exp(alpha*T)) can become negative, causing the Galfa function
to return positive values even when convergence is not achieved
stage_ids:
- alpha_calibration
- id: finance-C-028
when: When implementing the Galfa function from EIOPA specifications
action: verify Tau parameter represents the allowed difference between ufr and actual curve
severity: high
kind: domain_rule
modality: must
consequence: Passing Tau with incorrect interpretation (e.g., as a multiplier instead of absolute tolerance) causes the
bisection to target an incorrect convergence goal, invalidating the calibration
stage_ids:
- alpha_calibration
- id: finance-C-030
when: When performing matrix inversion in SWCalibrate
action: check that Q.transpose() @ H @ Q is non-singular before inversion
severity: high
kind: resource_boundary
modality: must
consequence: Singular matrix in calibration causes np.linalg.inv to raise LinAlgError, preventing alpha calibration from
completing; this can occur if input maturities are linearly dependent
stage_ids:
- alpha_calibration
- id: finance-C-031
when: When implementing interest rate simulation functions
action: set the DataFrame index to 'Time' using set_index('Time', inplace=True)
severity: high
kind: architecture_guardrail
modality: must
consequence: Without 'Time' as index, downstream plotting and time-series analysis functions will fail or produce incorrect
results, breaking the consistent interface contract across simulation modules.
stage_ids:
- interest_rate_simulation
- id: finance-C-034
when: When simulating mean-reverting interest rate paths with Vasicek or Hull-White models
action: keep the speed-of-reversion parameter a strictly positive (a > 0) so that simulated paths mean-revert
severity: high
kind: domain_rule
modality: must
consequence: When a=0, the Vasicek model becomes a random walk without mean-reversion, and the variance formula contains
division by zero, producing infinite variance or NaN values.
stage_ids:
- interest_rate_simulation
- id: finance-C-035
when: When initializing Brownian motion starting points
action: explicitly specify non-zero starting points since x0 defaults to 0
severity: high
kind: operational_lesson
modality: must
consequence: Default x0=0 causes all paths to start at zero rather than the intended rate, producing incorrect simulation
paths and biased Monte Carlo pricing results.
stage_ids:
- interest_rate_simulation
- id: finance-C-036
when: When presenting results from stochastic interest rate simulations
action: claim that single simulation paths represent expected outcomes or guaranteed returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Each Monte Carlo path is one realization from random draws. Single-path results overstate precision, mislead
stakeholders, and violate financial modeling best practices requiring multiple scenario aggregation.
stage_ids:
- interest_rate_simulation
- id: finance-C-037
when: When using the Vasicek model for interest rate simulation
action: document that negative interest rates are mathematically possible due to normally-distributed noise
severity: medium
kind: claim_boundary
modality: must
consequence: Vasicek model allows negative spreads due to Gaussian noise assumption. In credit markets where negative
spreads are economically meaningless, using Vasicek without acknowledging this limitation leads to invalid pricing and
risk mismeasurement.
stage_ids:
- interest_rate_simulation
- id: finance-C-038
when: When selecting the number of Monte Carlo simulation paths
action: use sufficient paths to achieve statistical convergence of price/risk estimates
severity: medium
kind: operational_lesson
modality: should
consequence: Insufficient paths produce high-variance estimates, causing unstable pricing and unreliable risk metrics
that may mislead decision-making in actuarial applications.
stage_ids:
- interest_rate_simulation
- id: finance-C-039
when: When implementing correlated Brownian motion generation
action: use Cholesky decomposition or equivalent method to transform independent normals into correlated samples
severity: high
kind: architecture_guardrail
modality: must
consequence: Incorrect correlation structure produces biased multi-factor risk estimates, leading to under- or over-estimation
of portfolio risk and incorrect hedge ratios.
stage_ids:
- interest_rate_simulation
- id: finance-C-040
when: When setting random seed for reproducible simulation results
action: document the seed value used and its purpose (model validation, testing, or audit trail)
severity: medium
kind: operational_lesson
modality: should
consequence: Without documented seed, simulation results cannot be replicated for model validation, regulatory audit,
or debugging purposes, violating actuarial standards for model reproducibility.
stage_ids:
- interest_rate_simulation
- id: finance-C-041
when: When using correlated Brownian motion generation
action: assume the custom Cholesky implementation is optimized for high-throughput production use
severity: low
kind: resource_boundary
modality: must_not
consequence: The explicit three-nested-loop Cholesky implementation at CorBM.py:48-56 is O(n^3) without vectorization.
For large variance-covariance matrices, numpy.linalg.cholesky provides significantly better performance.
stage_ids:
- interest_rate_simulation
- id: finance-C-042
when: When running correlated Brownian motion generation
action: skip input validation assuming valid variance-covariance matrix structure
severity: high
kind: resource_boundary
modality: must_not
consequence: Without input validation, non-symmetric matrices or non-numeric values cause cryptic runtime errors or silent
incorrect results. The code explicitly notes no input testing is implemented.
stage_ids:
- interest_rate_simulation
- id: finance-C-045
when: When implementing interest rate path integration for bond pricing
action: validate that the volatility sigma and the rate parameters are non-negative
severity: high
kind: domain_rule
modality: must
consequence: Negative volatility produces invalid square root calculations; negative rates may indicate miscalibration
and lead to mathematically invalid bond prices
stage_ids:
- option_pricing
- id: finance-C-046
when: When pricing zero-coupon bonds using Monte Carlo simulation
action: receive interest rate simulation data from the interest_rate_simulation stage only
severity: high
kind: architecture_guardrail
modality: must
consequence: Using rate paths from other sources bypasses model assumptions, causing mispriced bonds and inconsistent
pricing across the system
stage_ids:
- option_pricing
- id: finance-C-047
when: When implementing Swaption payer/receiver direction logic
action: derive receiver from 'not payer' boolean for single source of truth
severity: high
kind: architecture_guardrail
modality: must
consequence: Separate payer and receiver booleans can diverge, causing inconsistent swaption pricing based on wrong option
direction
stage_ids:
- option_pricing
- id: finance-C-049
when: When implementing Monte Carlo pricing with confidence interval estimation
action: calculate standard error or confidence intervals from simulation results
severity: high
kind: domain_rule
modality: must
consequence: Omitting confidence intervals violates the stage output contract and prevents users from assessing pricing
uncertainty from Monte Carlo sampling error
stage_ids:
- option_pricing
- id: finance-C-050
when: When using trapezoidal integration for bond pricing
action: verify integration time step dt divides T evenly
severity: medium
kind: domain_rule
modality: must
consequence: Non-integer N = T/dt causes truncation in rate path length, missing the final time step and underestimating
the discount integral
stage_ids:
- option_pricing
- id: finance-C-051
when: When deploying Monte Carlo pricing for live financial decisions
action: claim exact pricing accuracy for Monte Carlo point estimates
severity: high
kind: claim_boundary
modality: must_not
consequence: Monte Carlo estimates have inherent sampling variance; presenting np.mean() as exact price overstates precision
and may lead to suboptimal trading decisions
stage_ids:
- option_pricing
- id: finance-C-052
when: When setting the number of Monte Carlo scenarios for pricing
action: increase nScen until confidence interval width falls below acceptable tolerance
severity: medium
kind: operational_lesson
modality: should
consequence: Insufficient scenarios produce wide confidence intervals making the price estimate unreliable for risk management
decisions
stage_ids:
- option_pricing
- id: finance-C-053
when: When implementing Monte Carlo simulation for financial derivatives pricing
action: set random seed for reproducible results when debugging or testing
severity: medium
kind: operational_lesson
modality: must
consequence: Unseeded random number generation causes non-deterministic prices, making unit tests flaky and debugging
impossible across different execution environments
stage_ids:
- option_pricing
- id: finance-C-054
when: When initializing zero-coupon bond pricing models
action: initialize with zero-coupon bond prices from yield_curve_fitting stage
severity: high
kind: architecture_guardrail
modality: must
consequence: Using ad-hoc or hardcoded initial prices bypasses market calibration, causing systematic mispricing relative
to observable market instruments
stage_ids:
- option_pricing
- id: finance-C-055
when: When configuring the Monte Carlo integration method for option pricing
action: replace only the integration_method parameter, preserving rate path generation and expectation calculation
severity: medium
kind: architecture_guardrail
modality: must
consequence: Changing integration method without preserving the overall Monte Carlo structure produces mathematically
incorrect pricing
stage_ids:
- option_pricing
- id: finance-C-056
when: When implementing swaption pricing
action: accept string values for payer/receiver direction in pricing calculations
severity: high
kind: domain_rule
modality: must_not
consequence: String-based direction causes silent type coercion failures, potentially flipping swaption payer/receiver
and producing inverse payoff calculations
stage_ids:
- option_pricing
- id: finance-C-057
when: When presenting Monte Carlo pricing results to stakeholders
action: present Monte Carlo estimated prices as guaranteed market prices
severity: high
kind: claim_boundary
modality: must_not
consequence: Monte Carlo estimates are stochastic approximations subject to sampling variance; presenting them as deterministic
market prices violates financial modeling best practices
stage_ids:
- option_pricing
- id: finance-C-060
when: When providing time series input to ssaBasic
action: Pass only 1D numpy arrays as row vectors; raise TypeError for multi-dimensional arrays
severity: high
kind: domain_rule
modality: must
consequence: Passing multi-dimensional arrays causes incorrect Hankel matrix construction, leading to silent data corruption
and meaningless decomposition results.
stage_ids:
- time_series_analysis
- id: finance-C-061
when: When performing SSA with minimal time series data
action: Provide at least 9 elements in the time series for meaningful decomposition
severity: medium
kind: operational_lesson
modality: must
consequence: Time series shorter than 9 elements cannot produce meaningful SSA components, as the Hankel matrix dimensions
become too small for interpretable singular value decomposition.
stage_ids:
- time_series_analysis
- id: finance-C-062
when: When using OOP-style SSA implementation
action: Use class-based interface for ssaBasic rather than functional style
severity: high
kind: operational_lesson
modality: must
consequence: ssaBasic implements SSA using OOP pattern (class ssaBasic with methods) unlike other functional-style modules
in the repository. Mixing patterns causes interface mismatches and integration failures.
stage_ids:
- time_series_analysis
- id: finance-C-063
when: When configuring weighted correlation for separability assessment
action: Use Toeplitz weights in w-correlation calculation to correctly measure component separability
severity: medium
kind: architecture_guardrail
modality: must
consequence: Standard SSA requires Toeplitz weighting for w-correlation; incorrect weights produce misleading separability
measures, causing wrong component grouping decisions.
stage_ids:
- time_series_analysis
- id: finance-C-064
when: When generating bootstrap forecast uncertainty
action: Use residual bootstrap to preserve autocorrelation structure in SSA residuals
severity: high
kind: architecture_guardrail
modality: must
consequence: Non-residual bootstrap destroys autocorrelation structure in SSA residuals, producing confidence intervals
that do not reflect true forecast uncertainty.
stage_ids:
- time_series_analysis
- id: finance-C-065
when: When constructing trajectory matrix for SSA
action: Build Hankel matrix preserving time ordering via anti-diagonal constant structure
severity: high
kind: architecture_guardrail
modality: must
consequence: Hankel structure preserves time ordering in trajectory matrix; incorrect matrix construction destroys temporal
correlations and produces meaningless decomposition results.
stage_ids:
- time_series_analysis
- id: finance-C-066
when: When validating grouping parameter G0 in ssaBasic
action: Verify G0 length equals embedding dimension L0+1 with no gaps in group indices
severity: high
kind: domain_rule
modality: must
consequence: Invalid G0 causes index errors or incorrect component grouping, producing mathematically undefined behavior
in the SSA reconstruction pipeline.
stage_ids:
- time_series_analysis
- id: finance-C-067
when: When selecting forecast method in SSA
action: Claim real-time or live trading capability based on SSA backtest results
severity: high
kind: claim_boundary
modality: must_not
consequence: 'SSA backtest results do not reflect live trading performance due to inherent limitations: bootstrap sampling
provides uncertainty estimates, not execution guarantees; market conditions change between backtest and live periods.'
stage_ids:
- time_series_analysis
- id: finance-C-068
when: When presenting SSA forecast results
action: Present bootstrap confidence intervals as precise probability bounds
severity: medium
kind: claim_boundary
modality: must_not
consequence: Bootstrap confidence intervals (97.5th/2.5th percentiles) are approximate uncertainty estimates based on
residual resampling, not exact probability coverage. Mispresenting them leads to incorrect risk assessment.
stage_ids:
- time_series_analysis
- id: finance-C-072
when: When performing bootstrap resampling on autocorrelated data
action: use Politis-White 2004 optimal block length algorithm to minimize MSE for dependent data
severity: high
kind: architecture_guardrail
modality: must
consequence: Arbitrary block length selection leads to either excessive variance (too small blocks) or excessive bias
(too large blocks), degrading statistical inference quality
stage_ids:
- resampling_bootstrap
- id: finance-C-073
when: When integrating SSA residuals with bootstrap calibration
action: compute autocorrelation from SSA-reconstructed residuals before passing to OptimalLength
severity: high
kind: architecture_guardrail
modality: must
consequence: Using raw data instead of SSA residuals introduces trend components into autocorrelation, causing block length
to be inappropriately large for the noise component
stage_ids:
- resampling_bootstrap
- id: finance-C-074
when: When maintaining bootstrap calibration code across directories
action: consolidate duplicated OptimalLength, mlag, and lam implementations into a shared module
severity: high
kind: operational_lesson
modality: must
consequence: Duplicated code leads to divergent implementations over time, causing inconsistent block length results when
switching between stationary_bootstrap/ and stationary_bootstrap_calibration/ directories
stage_ids:
- resampling_bootstrap
- id: finance-C-075
when: When generating reproducible bootstrap samples for testing
action: set numpy random seed before calling stationary_bootstrap for deterministic test results
severity: medium
kind: operational_lesson
modality: must
consequence: Without seed control, bootstrap output is non-deterministic, causing flaky tests that pass/fail randomly
across test runs
stage_ids:
- resampling_bootstrap
- id: finance-C-076
when: When computing trapezoidal kernel weights for block length formula
action: verify lag-window input to lam() is within [-1, 1] range for proper spectral estimation
severity: high
kind: domain_rule
modality: must
consequence: Values outside [-1, 1] produce zero kernel weights, corrupting the Ghat/DSBhat computation and yielding incorrect
Bstar block length
stage_ids:
- resampling_bootstrap
- id: finance-C-077
when: When claiming statistical validity of bootstrap confidence intervals
action: claim exact replication of population parameters from finite bootstrap samples
severity: high
kind: claim_boundary
modality: must_not
consequence: Bootstrap provides asymptotic approximation to sampling distribution; finite samples and non-stationary data
produce confidence intervals with unknown coverage properties
stage_ids:
- resampling_bootstrap
- id: finance-C-079
when: When exporting bootstrap results as empirical evidence
action: present bootstrap-derived distributions as equivalent to analytically computed distributions
severity: medium
kind: claim_boundary
modality: must_not
consequence: Bootstrap distributions are model-free approximations with sampling error; presenting them as definitive
probabilities misleads stakeholders about uncertainty quantification
stage_ids:
- resampling_bootstrap
- id: finance-C-080
when: When handling lagged correlation matrix construction in mlag
action: delete rows containing zero-padding before computing autocorrelation to avoid correlation with zero artifacts
severity: high
kind: domain_rule
modality: must
consequence: Zero-padding rows create artificial zero-correlation entries, biasing the threshold-based mhat selection
downward and producing suboptimal block length
stage_ids:
- resampling_bootstrap
- id: finance-C-083
when: When interest_rate_simulation passes rate paths to option_pricing for Monte Carlo pricing
action: verify the DataFrame has 'Time' as index and 'Interest Rate' (or equivalent rate column) as values
severity: high
kind: domain_rule
modality: must
consequence: Incorrect column names cause Monte Carlo integration to fail or price zero-coupon bonds using wrong interest
rate series
- id: finance-C-085
when: When time_series_analysis passes SSA-reconstructed residuals to resampling_bootstrap
action: verify the residuals are provided as a 1-dimensional numpy array (not a DataFrame or 2D array)
severity: high
kind: domain_rule
modality: must
consequence: Non-1D array causes autocorrelation calculation to fail in OptimalLength, producing invalid bootstrap block
length that corrupts all downstream resampled series
- id: finance-C-086
when: When time_series_analysis extracts autocorrelation structure for bootstrap block length selection
action: verify the input time series has at least 9 observations (minimum data requirement for Politis-White method)
severity: high
kind: resource_boundary
modality: must
consequence: Insufficient data points cause OptimalLength to produce unreliable block length estimates, invalidating the
entire bootstrap statistical inference
- id: finance-C-087
when: When alpha_calibration performs iterative convergence with yield_curve_fitting
action: allow infinite iteration loops without convergence check
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Bisection algorithm enters infinite loop when tolerance Tau is unreachable, causing CPU exhaustion and system
hang
- id: finance-C-088
when: When Vasicek/Hull-White/Dothan simulation passes rate paths to pricing models
action: verify the time index is evenly spaced at the dt increment, with no gaps
severity: high
kind: domain_rule
modality: must
consequence: Gaps in time index cause trapezoidal integration to produce incorrect bond prices, leading to systematic
mispricing
- id: finance-C-089
when: When Smith-Wilson calibration vector b is passed between yield_curve_fitting iterations
action: preserve the calibration vector as an n x 1 numpy array without reshaping
severity: high
kind: domain_rule
modality: must
consequence: Vector reshaping causes Wilson function heart calculation to fail due to broadcasting mismatch, producing
NaN rates
- id: finance-C-090
when: When option_pricing receives zero-coupon rates from yield_curve_fitting
action: verify the ufr (ultimate forward rate) parameter matches the one used in calibration
severity: high
kind: architecture_guardrail
modality: must
consequence: Mismatched ufr causes Smith-Wilson extrapolation to produce inconsistent curve shapes, invalidating all derived
swaption prices
- id: finance-C-091
when: When implementing interest rate simulation functions (Vasicek, Hull-White, Dothan, Black-Scholes)
action: Return results as a pandas DataFrame with Time as the index
severity: high
kind: domain_rule
modality: must
consequence: Consumers expecting consistent DataFrame interface across modules will receive inconsistent output formats,
causing integration failures in downstream analysis pipelines
- id: finance-C-093
when: When implementing continuous-time interest rate models (Vasicek, Hull-White, Dothan)
action: Express time step dt as a fraction of year (dt=0.1 represents approximately 36 days) to enable proper model discretization
severity: high
kind: domain_rule
modality: must
consequence: If dt is not expressed in years, the stochastic differential equations are discretized on the wrong time scale,
causing simulated interest rates to diverge significantly from expected model outputs
- id: finance-C-097
when: When configuring SSA embedding dimension L0
action: Set L0 to a value less than or equal to N/2, where N is the length of the time series
severity: high
kind: resource_boundary
modality: must
consequence: With L0 greater than N/2, SSA decomposition produces mathematically invalid trajectory matrices; reconstruction
and forecast accuracy degrade significantly with insufficient degrees of freedom
- id: finance-C-099
when: When using stationary bootstrap for time series resampling
action: Provide positive values for block length parameter m and sample_length
severity: high
kind: resource_boundary
modality: must
consequence: With non-positive m or sample_length, the bootstrap algorithm raises ValueError and produces no valid resampled
output; negative or zero parameters are mathematically undefined for block resampling
- id: finance-C-100
when: When presenting or reporting this system's backtested or simulated returns to users
action: Claim that simulated returns equal expected live trading returns — simulations ignore market impact, financing
costs, execution delays, slippage, and liquidity constraints
severity: high
kind: claim_boundary
modality: must_not
consequence: Users make live capital allocation decisions based on inflated simulation returns, leading to severe underperformance
in actual trading and potential financial loss exceeding initial investment
- id: finance-C-101
when: When using this system as the basis for external capability claims
action: Claim real-time trading support, live market data integration capability, or production-grade risk management
system functionality — this is a collection of actuarial models for research and analytical purposes only
severity: high
kind: claim_boundary
modality: must_not
consequence: Users deploying these models in live trading systems without proper safeguards will experience unhandled
latency, data feed failures, and regulatory compliance violations
- id: finance-C-102
when: When presenting this system's interest rate curve interpolations or extrapolations as financial advice
action: Present Smith-Wilson or Nelson-Siegel-Svensson curve outputs as guaranteed market predictions or risk-free rate
estimates without proper regulatory disclosure and model risk documentation
severity: high
kind: claim_boundary
modality: must_not
consequence: Regulatory violations occur when actuarial models are presented without model risk disclosure; users may
make sub-optimal capital allocation decisions based on uncalibrated curve fits
- id: finance-C-103
when: When validating financial model parameters in stochastic simulations
action: Verify volatility parameters (sigma) are non-negative and mean reversion parameters (a) are non-negative to maintain
mathematical validity of Ornstein-Uhlenbeck processes
severity: high
kind: domain_rule
modality: must
consequence: Negative volatility produces undefined Gaussian increments; negative mean reversion speed causes divergence
rather than mean reversion, producing unbounded interest rate simulations
- id: finance-C-104
when: When performing Monte Carlo pricing using stochastic interest rate models
action: Set a random seed or use reproducible random number generation when repeatability is required for model validation
or regulatory documentation
severity: medium
kind: operational_lesson
modality: must
consequence: Monte Carlo results become non-reproducible, preventing model validation, audit trail requirements, and regulatory
compliance documentation
- id: finance-C-105
when: When performing SSA cross-validation
action: Verify qInSample is strictly between 0 and 1 to maintain a valid train-test split for time series
cross-validation
severity: medium
kind: resource_boundary
modality: must
consequence: Cross-validation produces invalid train-test splits when qInSample equals 0 or 1, causing division by zero
or empty test sets with undefined RMSE calculations
- id: finance-C-106
when: When using this actuarial model library
action: Present these models as production-ready financial software with warranties — the README explicitly states 'This
software is provided on an as is basis, without warranties or conditions of any kind' (singular_spectrum_analysis/README.md:46)
severity: high
kind: claim_boundary
modality: must_not
consequence: Users relying on production SLAs or warranty claims will have no legal recourse when models produce unexpected
outputs; regulatory audits will fail due to unsupported software in critical systems
- id: finance-C-107
when: When fixing defects or updating algorithms in any duplicated code location
action: 'Apply identical fixes to all duplicated locations: stationary_bootstrap_calibrate.py (2 locations), Smith-Wilson
SWCalibrate/SWExtrapolate/SWHeart (2 locations), and Cholesky decomposition (2+ locations from BD-045/BD-072); use
grep to identify all identical code blocks before applying changes'
severity: high
kind: architecture_guardrail
modality: must
consequence: Fixing a bug in only one location of duplicated code leaves the defect active in all other copies, causing
inconsistent behavior across modules and potential silent failures in downstream calculations
derived_from_bd_id: BD-101
- id: finance-C-108
when: When implementing or refactoring the Nelson-Siegel-Svensson curve fitting implementation
action: Use sum of squared residuals as the objective function for NSS goodness-of-fit — do not replace with Huber loss,
absolute deviation, or other loss functions
severity: high
kind: domain_rule
modality: must
consequence: Changing the objective function to Huber loss or absolute deviation alters the curve fitting behavior, producing
different yield curve shapes that affect actuarial discounting and regulatory valuations under Solvency II frameworks
derived_from_bd_id: BD-029
- id: finance-C-109
when: When configuring Monte Carlo simulation parameters for two-factor Vasicek interest rate model
action: Use exactly 52 time periods with dt=0.1 (5.2 year horizon) for swaption pricing — do not change to weekly steps
(dt=0.02) or other discretization schemes without revalidation
severity: high
kind: domain_rule
modality: must
consequence: Changing dt or period count affects Monte Carlo convergence and pricing accuracy; weekly steps increase computation
5x without proportional accuracy gain for 5-year instruments
derived_from_bd_id: BD-031
- id: finance-C-110
when: When calibrating alpha parameter in Smith-Wilson alpha calibration
action: Set convergence point T = max(U+40, 60) where U is the last liquid maturity — do not use a fixed 60-year convergence
point as this violates the adaptive convergence distance requirement
severity: high
kind: domain_rule
modality: must
consequence: Using a fixed 60-year convergence point instead of max(U+40, 60) causes incorrect curve fitting when the
last liquid maturity exceeds 20 years, as the convergence distance falls below the required 40-year minimum
derived_from_bd_id: BD-022
- id: finance-C-111
when: When implementing or modifying the alpha calibration algorithm in Smith-Wilson calibration
action: Use bisection root-finding method for alpha calibration to satisfy Tau tolerance — do not replace with Newton-Raphson
or other gradient-based methods
severity: high
kind: architecture_guardrail
modality: must
consequence: Newton-Raphson methods are sensitive to initial guesses and may not converge for ill-conditioned Tau functions,
causing alpha calibration to fail or produce incorrect values that violate regulatory tolerance requirements
derived_from_bd_id: BD-023
- id: finance-C-113
when: When setting the projection horizon for pension liability calculations using Smith-Wilson term structure
action: Extrapolate yield curve to at least 65 years maturity for pension liability calculations — do not truncate at
40 or 50 years as this would miss annuity payments and deferred pension obligations
severity: high
kind: domain_rule
modality: must
consequence: Truncating the yield curve at 40-50 years for a typical retirement age of 65 with life expectancy 85-90 causes
pension liability cash flows to be improperly discounted, significantly underestimating or overestimating long-term
obligations
derived_from_bd_id: BD-021
- id: finance-C-114
when: When configuring SSA embedding dimension parameter L0 for trajectory matrix construction
action: Verify that L0 (embedding dimension) satisfies 1 <= L0 < N/2, where N is the series length, before running
SSA decomposition
severity: high
kind: domain_rule
modality: must
consequence: Setting L0 >= N/2 destroys the Hankel matrix structure required for valid SVD decomposition, causing degenerate
singular vectors and corrupted SSA component extraction
derived_from_bd_id: BD-091
- id: finance-C-115
when: When configuring SSA forecast reconstruction with parameter r0 for singular value selection
action: Verify that r0 (number of singular values for reconstruction) satisfies r0 < L+1, where L is the embedding
window size, before running the recursive forecast
severity: high
kind: domain_rule
modality: must
consequence: Setting r0 >= L+1 creates an underdetermined system in the recursive prediction algorithm, causing forecast
trajectories to become unstable or divergent
derived_from_bd_id: BD-092
- id: finance-C-116
when: When pricing bonds using the two-factor Vasicek interest rate model
action: Use Monte Carlo integration with trapezoidal approximation (integrate.trapz) for bond pricing — do not use analytical
approximations or higher-dimensional quadrature as they introduce systematic bias or become intractable
severity: high
kind: domain_rule
modality: must
consequence: Analytical approximations introduce systematic bias in two-factor Vasicek bond pricing; without Monte Carlo
integration, pricing errors propagate to option valuations and hedging strategies
derived_from_bd_id: BD-010
- id: finance-C-117
when: When assessing component separability in Singular Spectrum Analysis decomposition
action: Use weighted (Toeplitz) correlation for w-correlation calculation between SSA components — do not use unweighted
standard correlation as it biases separability assessment toward dominant components
severity: high
kind: domain_rule
modality: must
consequence: Unweighted correlation misrepresents minor signal contributions and biases separability assessment, leading
to incorrect component grouping decisions in SSA reconstruction
derived_from_bd_id: BD-014
- id: finance-C-118
when: When implementing or refactoring any interest rate simulator method (simulate_X)
action: Return DataFrame with 'Time' column as the DataFrame index — never use RangeIndex or custom index types
severity: high
kind: architecture_guardrail
modality: must
consequence: Downstream portfolio aggregation, scenario analysis, and stress testing code assumes uniform Time-indexed
DataFrames; using a different index causes silent iteration failures or incorrect results across simulators
derived_from_bd_id: BD-089
- id: finance-C-119
when: When using BisectionAlpha to find roots
action: Verify initial bounds satisfy xStart < xEnd AND f(xStart) * f(xEnd) < 0 (opposite-sign function values) before
calling the algorithm
severity: high
kind: domain_rule
modality: must
consequence: Violating the bracketing interval requirements causes the bisection algorithm to either iterate indefinitely
without converging or return incorrect root values, corrupting downstream financial calculations
derived_from_bd_id: BD-095
- id: finance-C-121
when: When implementing or modifying rate generation and decomposition logic
action: Use BD-053 (nominal = real + inflation additive decomposition) with BD-030/BD-066 (bivariate normal correlated
generation) simultaneously — these are mutually exclusive modeling assumptions
severity: high
kind: domain_rule
modality: must_not
consequence: Combining additive decomposition with bivariate normal generation creates a contradiction where implied inflation
becomes negatively correlated with real rates, violating monetary policy intuition and producing economically inconsistent
scenario paths
derived_from_bd_id: BD-105
- id: finance-C-122
when: When implementing correlated Brownian motion generation using Cholesky decomposition
action: Verify that the covariance matrix is positive-definite before applying Cholesky decomposition; if the matrix is
not positive-definite, use eigenvalue decomposition as a fallback or reject with an error
severity: high
kind: domain_rule
modality: must
consequence: Cholesky decomposition fails with LinAlgError on non-positive-definite matrices, causing simulation to abort;
generated paths will have incorrect correlation structure if eigenvalue fallback is used without explicit handling
derived_from_bd_id: BD-086
- id: finance-C-123
when: When calibrating SSA OptimalLength for time series decomposition
action: Verify input time series has at least 9 elements before invoking SSA OptimalLength; if length < 9, reject calibration
with clear error message stating minimum requirement not met
severity: high
kind: domain_rule
modality: must
consequence: SSA OptimalLength with fewer than 9 elements produces trajectory matrices too small for SVD extraction, yielding
statistically insignificant singular components and unreliable forecasting results
derived_from_bd_id: BD-093
- id: finance-C-124
when: When calibrating alpha parameters using bisection root-finding with Galfa convergence point formula T=max(U+40,60)
action: Monitor convergence behavior when U approaches liquid maturity limits; implement maximum iteration limits and
convergence tolerance checks; warn when U > 20 that bisection may exhibit slow convergence or false-positive convergence
severity: high
kind: architecture_guardrail
modality: must
consequence: Large U values cause T to approach 100+ years, making Galfa error function extremely flat near zero; bisection
algorithm may converge prematurely to incorrect alpha values, producing unreliable calibration results
derived_from_bd_id: BD-099
- id: finance-C-125
when: When implementing SSA reconstruction validation logic
action: Enforce that embedding dimension L0 is strictly less than N/2 where N is the time series length; reject configurations
violating this constraint to maintain Toeplitz matrix full rank
severity: high
kind: domain_rule
modality: must
consequence: Setting L0 >= N/2 causes Toeplitz matrix rank deficiency, leading to numerical instability, eigenvalue clustering,
and overfitting to noise in component separation; actuarial forecasts become unreliable and non-reproducible
derived_from_bd_id: BD-041
- id: finance-C-126
when: When initializing the Nelder-Mead optimizer for Nelson-Siegel-Svensson yield curve calibration
action: Verify that initial parameter values are set to 0.1 for each of the 6 NSS parameters (theta0-theta5); if using a different
initialization strategy, document the rationale and assess convergence behavior
severity: medium
kind: operational_lesson
modality: should
consequence: Using a suboptimal starting point can cause the Nelder-Mead optimizer to converge to local minima instead
of the global market-observed yield curve, leading to incorrect discount factors and bond pricing errors in live trading
derived_from_bd_id: BD-027
- id: finance-C-128
when: When implementing bond pricing integration under Vasicek short-rate models
action: Use trapezoidal integration (trapz) for bond pricing integrals; verify grid spacing is uniform and sufficient
for second-order accuracy; do not use Simpson's rule which requires odd grid points
severity: high
kind: domain_rule
modality: must
consequence: Using non-trapezoidal integration methods may introduce numerical errors in bond pricing calculations, causing
inaccurate valuations that fail regulatory reporting requirements
derived_from_bd_id: BD-055
- id: finance-C-129
when: When implementing or modifying spectral density estimation for stationary bootstrap confidence intervals
action: Use trapezoidal kernel for spectral density estimation — do not replace with Parzen or Bartlett kernels which
produce wider intervals with higher bias at spectral boundaries
severity: high
kind: domain_rule
modality: must
consequence: Replacing trapezoidal kernel with Parzen or Bartlett increases spectral leakage and produces wider, more
conservative confidence intervals, potentially causing underconfidence in valid trading signals or rejection of profitable
strategies
derived_from_bd_id: BD-036
- id: finance-C-130
when: When calibrating block size parameters for stationary bootstrap procedures
action: Set autocorrelation significance threshold c=2 for block bootstrap parameter estimation — this corresponds to
approximately 5% significance level for identifying genuinely dependent observations
severity: high
kind: domain_rule
modality: must
consequence: Using a different threshold (e.g., c=1.96 or c=2.5) changes which autocorrelations are considered significant,
altering block size calculation and potentially invalidating confidence intervals or producing unreliable statistical
inference
derived_from_bd_id: BD-037
- id: finance-C-131
when: When processing time series data for stationary bootstrap calibration
action: Enforce minimum 9 observations requirement before running bootstrap calibration — reject or flag datasets with
fewer observations as insufficient for reliable spectral density estimation
severity: high
kind: operational_lesson
modality: must
consequence: Running bootstrap calibration with fewer than 9 observations produces unreliable spectral estimates with
insufficient blocks for meaningful resampling, leading to invalid confidence intervals that misrepresent uncertainty
in backtest results
derived_from_bd_id: BD-039
- id: finance-C-132
when: When configuring SSA embedding dimension for time series decomposition
action: Set SSA embedding dimension L0 to N/2 (half the series length) for trend-noise separation — do not use L0=N/3
as it may miss medium-frequency cyclical components in liability cash flow patterns
severity: high
kind: domain_rule
modality: must
consequence: Using an incorrect embedding dimension (N/3 or other values) causes either over-fragmentation or under-capture
of signal components, leading to poor trend-noise separation and inaccurate forecasts in actuarial or financial time
series analysis
derived_from_bd_id: BD-040
- id: finance-C-133
when: When generating bootstrap samples for SSA forecast confidence intervals
action: Use exactly 100 bootstrap replications for SSA forecast confidence interval estimation — verify that 100 samples
provides adequate percentile accuracy (approximately 1.25x critical value accuracy for 95% intervals) for actuarial
reporting
severity: medium
kind: operational_lesson
modality: should
consequence: Using fewer than 100 bootstrap samples degrades percentile estimate precision for confidence intervals, potentially
producing misleading uncertainty bounds that fail to meet actuarial reporting accuracy requirements
derived_from_bd_id: BD-042
- id: finance-C-134
when: When implementing or modifying the Vasicek two-factor calibration objective function
action: Use sum of squared relative errors as the calibration objective function — this normalizes instrument contributions
by their price level, preventing large-maturity bonds from dominating the objective and ensuring calibration fits across
the entire yield curve simultaneously
severity: high
kind: domain_rule
modality: must
consequence: Using absolute squared errors causes calibration to be dominated by long-maturity instruments with large
absolute prices, resulting in poor fit at short maturities and unreliable multi-curve yield estimation that propagates
into incorrect derivative pricing
derived_from_bd_id: BD-063
- id: finance-C-135
when: When implementing geometric Brownian motion simulation for Black-Scholes option pricing
action: Use exact discretization formula S[t+dt] = S[t]*exp((mu-0.5*sigma^2)*dt + sigma*sqrt(dt)*Z) — do not replace with
Euler-Maruyama, Milstein, or other approximate discretization methods
severity: high
kind: domain_rule
modality: must
consequence: Euler-Maruyama discretization introduces systematic drift underestimation for path-dependent options and
long-dated derivatives, causing option strategies to be mispriced by 5-15% on average and producing non-reproducible
backtest results
derived_from_bd_id: BD-071
- id: finance-C-136
when: When implementing correlated asset simulation using variance-covariance matrix decomposition
action: Use Cholesky-Banachiewicz row-wise decomposition for the variance-covariance matrix square root — maintain the
row-based approach that enables memory-efficient partial computation for leading correlations
severity: high
kind: architecture_guardrail
modality: must
consequence: Using Cholesky-Crout column-based decomposition without verifying row-wise equivalence produces incorrect
correlation structures in multi-asset simulation paths, invalidating diversification benefits and causing portfolio
risk misestimation by 10-30%
derived_from_bd_id: BD-072
- id: finance-C-137
when: When implementing Nelson-Siegel-Svensson yield curve calibration goodness-of-fit measurement
action: Use sum of squared errors (Euclidean distance) as the NSS goodness-of-fit objective — this provides uniform weighting
across maturity points and ensures convex, interpretable calibration results
severity: high
kind: domain_rule
modality: must
consequence: Using weighted least squares without proper heteroscedasticity calibration distorts fit quality at short
maturities where economic signals are most informative, leading to incorrect yield curve shapes and flawed strategy
signals for interest rate derivatives
derived_from_bd_id: BD-076
- id: finance-C-138
when: When implementing stationary bootstrap resampling for yield curve time series analysis
action: Use Politis-White automatic block length selection for stationary bootstrap — this estimates optimal block size
from the data's dependence structure without manual tuning
severity: high
kind: architecture_guardrail
modality: must
consequence: Using fixed block lengths that don't adapt to the data's dependence structure causes invalid bootstrap inference,
producing unreliable confidence intervals and misleading strategy backtest results that don't generalize to live trading
derived_from_bd_id: BD-077
- id: finance-C-139
when: When implementing confidence interval calculation for SSA forecast distributions
action: Calculate 95% confidence intervals using 2.5th and 97.5th empirical percentiles of the bootstrap distribution
— do not substitute parametric methods assuming normality
severity: high
kind: operational_lesson
modality: must
consequence: Parametric confidence intervals assume normal distribution, but financial returns exhibit fat tails and skewness;
using parametric CI systematically underestimates uncertainty for extreme outcomes, causing backtest intervals to exclude
real losses
derived_from_bd_id: BD-043
- id: finance-C-140
when: When implementing bootstrap residual estimation for SSA forecasting
action: Use OLS regression of SSA-reconstructed signal on original series to compute residuals — do not apply moving block
bootstrap on residuals
severity: high
kind: operational_lesson
modality: must
consequence: SSA OLS residuals are white noise by construction; applying moving block bootstrap introduces autocorrelation
structure that does not exist, distorting the bootstrap distribution and causing forecast intervals to misrepresent
true uncertainty
derived_from_bd_id: BD-044
- id: finance-C-141
when: When demonstrating swaption pricing calculations in educational examples
action: Use 10% notional (relative scale) for swaption pricing examples — do not change to absolute monetary values like
EUR 100m
severity: medium
kind: operational_lesson
modality: must
consequence: Large absolute notional values distract from pricing mechanics by forcing attention on number magnitude rather
than rate sensitivity and Greeks; learners miss the core valuation concepts buried in unwieldy numbers
derived_from_bd_id: BD-048
- id: finance-C-142
when: When demonstrating swaption premium sensitivity in educational examples
action: Use 10% fixed rate as out-of-the-money strike in swaption examples — avoid changing to ATM strike at current forward
rate
severity: medium
kind: operational_lesson
modality: should
consequence: ATM strikes produce near-zero intrinsic value, obscuring the relationship between moneyness and premium;
learners cannot visualize option value changes when the starting point has no optionality
derived_from_bd_id: BD-049
- id: finance-C-143
when: When calibrating 4-parameter two-factor Vasicek model using Nelder-Mead optimizer
action: Set max iterations=1000 and max function evaluations=5000 for Nelder-Mead calibration stopping criteria — do not
reduce below these thresholds without validation
severity: high
kind: operational_lesson
modality: must
consequence: Insufficient iterations or evaluations cause premature optimizer termination on complex 4-parameter calibration,
leading to suboptimal parameter estimates that produce systematically biased swaption prices
derived_from_bd_id: BD-051
- id: finance-C-144
when: When pricing interest rate swaps or swaptions using the framework's default payment frequency
action: Verify that the 6-month (0.5yr) floating leg frequency matches the actual instrument being priced; for non-EUR
instruments or custom structures, explicitly specify the correct payment frequency parameter
severity: medium
kind: operational_lesson
modality: should
consequence: Using 6-month frequency for non-EUR swaps or instruments with quarterly/monthly conventions causes systematic
mispricing, producing discount factors that do not match quoted swap prices
derived_from_bd_id: BD-052
- id: finance-C-147
when: When pricing long-dated pension liabilities or instruments spanning 40+ years using both Vasicek one-factor and
two-factor models
action: Use inconsistent discretization methods across Vasicek model variants; one-factor model uses exact discretization
while two-factor model uses Euler-Maruyama, producing systematically different pricing for identical instruments
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Euler-Maruyama discretization carries an O(dt) systematic drift bias, whereas exact discretization introduces
no discretization error; for 40+ year liabilities, this discrepancy produces materially different pricing between models,
violating internal consistency requirements
derived_from_bd_id: BD-107
- id: finance-C-148
when: When using the closed-form Vasicek zero-coupon bond pricing formula for pricing or calibration
action: Use the closed-form formula for time-varying parameters (a, lambda, r0) — the analytical solution assumes constant
parameters within each evaluation interval; must switch to numerical methods for time-varying parameter scenarios
severity: high
kind: domain_rule
modality: must_not
consequence: Applying the closed-form Vasicek formula with time-varying parameters systematically misprices bonds, as
the formula derivation assumes constant drift and volatility over each interval; accumulated pricing errors can reach
10-50bp for rapidly changing rate environments
derived_from_bd_id: BD-064
- id: finance-C-149
when: When validating the Vasicek closed-form pricing implementation
action: Verify that parameters a, lambda, and r0 are held constant within each evaluation interval before using the analytical
formula
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without validating parameter constancy, the closed-form solution produces incorrect bond prices as the mathematical
derivation assumes continuous compounding with fixed coefficients over the evaluation period
derived_from_bd_id: BD-064
- id: finance-C-150
when: When configuring Monte Carlo simulation parameters for zero-coupon bond pricing
action: Configure at least 10,000 simulation paths to achieve pricing accuracy within 1 basis point (bp)
severity: high
kind: domain_rule
modality: must
consequence: Using fewer than 10,000 Monte Carlo paths introduces excessive sampling variance, causing bond price estimates
to diverge by more than 1bp from the true value; this error compounds in calibration loops where prices are evaluated
thousands of times
derived_from_bd_id: BD-065
- id: finance-C-151
when: When using trapezoidal integration for computing integrated quantities in bond pricing
action: Verify the integrand is sufficiently smooth before applying trapezoidal rule — the method assumes smooth behavior
and may underperform for discontinuous or highly oscillatory payoffs
severity: medium
kind: operational_lesson
modality: should
consequence: Trapezoidal integration on non-smooth integrands produces systematic integration errors that accumulate across
the pricing calculation, leading to biased bond valuations especially near cash flow discontinuities
derived_from_bd_id: BD-065
- id: finance-C-152
when: When generating correlated Brownian motion increments using the conditional formula Z3 = rho*Z1 + sqrt(1-rho^2)*Z2
action: Validate that |rho| < 1 before generating correlated samples — the formula requires a valid correlation coefficient;
when |rho| approaches 1, switch to Cholesky decomposition for numerical stability
severity: high
kind: domain_rule
modality: must
consequence: Setting |rho| >= 1 causes sqrt(1-rho^2) to become imaginary or zero, breaking the correlated Brownian motion
generation; even near-singular correlations (|rho| > 0.99) introduce numerical instability that distorts multi-factor
interest rate simulations
derived_from_bd_id: BD-066
- id: finance-C-153
when: When simulating two-factor Vasicek paths using Euler-Maruyama discretization
action: Use a time step dt <= 1 day (dt <= 1/252 years) to maintain pricing accuracy within 5 basis points in typical
rate environments
severity: high
kind: domain_rule
modality: must
consequence: Using larger time steps with Euler-Maruyama accumulates discretization error of order O(dt) in both the mean
and the second moment, causing bond prices to deviate by more than 5bp from the true value; the drift bias compounds over long
simulation horizons
derived_from_bd_id: BD-067
- id: finance-C-154
when: When implementing Smith-Wilson calibration using numpy.linalg.inv for matrix inversion
action: Apply numpy.linalg.inv directly for Wilson matrices with condition number > 1e10 — use Cholesky decomposition
or LU decomposition with pivoting for numerical stability in near-singular cases
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Direct matrix inversion of near-singular Wilson matrices (extremely long or short maturities) produces unreliable
calibration vectors with large numerical errors, causing bond prices to deviate significantly from market values
derived_from_bd_id: BD-068
- id: finance-C-155
when: When selecting the calibration method for Smith-Wilson algorithm
action: Consider Cholesky decomposition for positive-definite Wilson matrices as an alternative to direct inversion;
Cholesky is more numerically stable and has the same O(n^3) complexity with better constant factors for positive-definite
cases
severity: medium
kind: operational_lesson
modality: should
consequence: Direct matrix inversion may introduce numerical instabilities that degrade calibration accuracy, especially
when the Wilson matrix approaches singularity due to extreme maturity constraints
derived_from_bd_id: BD-068
- id: finance-C-156
when: When calibrating Smith-Wilson alpha using the bisection root-finding algorithm
action: Verify that a valid bracketing interval exists where the Wilson error function changes sign before running bisection,
and confirm monotonicity of the error function in alpha across the interval
severity: medium
kind: operational_lesson
modality: should
consequence: Without verifying bracketing and monotonicity, bisection may fail to converge or converge to the wrong root,
causing incorrect alpha values that distort bond and swap pricing calculations throughout the system
derived_from_bd_id: BD-070
- id: finance-C-157
when: When fitting Nelson-Siegel-Svensson parameters using the Nelder-Mead simplex algorithm
action: Initialize parameters with economically meaningful starting values derived from level-slope-curvature interpretation,
and verify sufficient function evaluations (200-500) to avoid premature convergence to local minima in poorly conditioned
regions
severity: medium
kind: operational_lesson
modality: should
consequence: Poor starting values or premature convergence leads to suboptimal NSS parameters, distorting the yield curve
shape and compromising the accuracy of interpolated rates and forward rate calculations
derived_from_bd_id: BD-075
- id: finance-C-158
when: When estimating spectral density using the trapezoidal kernel for block length calibration
action: Verify that the spectral density is smooth without sharp peaks before applying the trapezoidal kernel, and avoid
using it for processes with strong cyclical components where Parzen or Bartlett kernels may perform better
severity: medium
kind: operational_lesson
modality: should
consequence: Applying trapezoidal kernel to non-smooth spectral density or cyclical processes produces inconsistent block
length estimates, leading to unreliable bootstrap confidence intervals and incorrect statistical inference
derived_from_bd_id: BD-078
- id: finance-C-159
when: When applying closed-form MLE formulas (MLmu, MLlam, MLsigma) to Vasicek parameter estimation
action: Verify that interest rate innovations follow a normal distribution by performing normality tests (e.g., Jarque-Bera);
if fat tails are detected, switch to robust MLE with t-distributed errors or quasi-MLE for misspecified distributions
severity: medium
kind: operational_lesson
modality: should
consequence: Closed-form MLE assumes normally distributed innovations; in practice, interest rate returns exhibit fat
tails and outliers that cause the estimator to underweight tail risk, leading to overconfident parameter estimates and
underpriced tail risk in hedging
derived_from_bd_id: BD-082
- id: finance-C-160
when: When calibrating EIOPA risk-free term structure curves using bisection_alpha
action: Verify that the convergence point T is set to max(U+40, 60) where U is the last observable maturity; for regulatory
reporting under EIOPA-BoS-14/065, verify T >= max(U+40, 60) with T=60 minimum floor for short-dated curves
severity: high
kind: operational_lesson
modality: must
consequence: EIOPA regulation requires terminal convergence point T to be at least 40 years beyond the last observable
maturity U with a 60-year floor; using incorrect T values fails regulatory compliance and may invalidate solvency calculations
for insurance companies
derived_from_bd_id: BD-084
- id: finance-C-161
when: When initializing the Vasicek two-factor model BrownianMotion component for path simulation
action: Verify that x0=0 (zero-mean starting point) matches the intended initial condition for the short-rate process;
explicitly set x0 to a different value if the process should start from non-equilibrium or historical initial state
severity: medium
kind: operational_lesson
modality: should
consequence: Default x0=0 assumes the process starts from equilibrium; starting from a non-equilibrium initial condition
with x0=0 introduces systematic bias in early path simulations that affects option pricing and hedging ratios
derived_from_bd_id: BD-085
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check Mixin.query_data return schema and DataFrame MultiIndex reset_index() before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-064 / Singular Spectrum Analysis Time Series Decomposition
version: v5.3
intent_keywords:
- SSA
- singular spectrum analysis
- time series decomposition
- scree plot
- trend extraction
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: no candidate field had 2-7 distinct values; all capabilities collapsed into single group
groups:
- group_id: all
name: All Capabilities
description: ''
emoji: 📦
uc_count: 2
ucs:
- uc_id: UC-101
name: Singular Spectrum Analysis Time Series Decomposition
short_description: Decomposes time series data into interpretable components (trend, seasonality, noise) using Singular
Spectrum Analysis to identify underlying patterns
sample_triggers:
- SSA
- singular spectrum analysis
- time series decomposition
- uc_id: UC-102
name: Stationary Bootstrap for Interest Rate Swap Inference
short_description: Applies stationary bootstrap resampling method to Italian swap rate data for statistical inference,
enabling confidence interval estimation and hypoth
sample_triggers:
- stationary bootstrap
- swap rates
- resampling
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-101
beginner_prompt: Try singular spectrum analysis time series decomposition
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try stationary bootstrap for interest rate swap inference
auto_selected: true
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see all 2 capabilities.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me
what you want; I''ll write the code, you don''t have to dig docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don''t bother for serious work.)'
use_cases:
- Stationary Bootstrap for Interest Rate Swap Inference
- Singular Spectrum Analysis Time Series Decomposition
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds
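Computes IFRS 9 expected credit loss (ECL), with Vasicek single-factor forward-looking adjustment, Kaplan-Meier survival analysis for PD estimation, and loan amortization schedule generation, to meet Basel III impairment compliance requirements.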
---
name: ifrs9-loss-engine
description: |-
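Computes IFRS 9 expected credit loss (ECL), with Vasicek single-factor forward-looking adjustment, Kaplan-Meier survival analysis for PD estimation, and loan amortization schedule generation, to meet Basel III impairment compliance requirements.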
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-062"
compiled_at: "2026-04-22T13:00:19.657886+00:00"
capability_markets: "global"
capability_activities: "regtech-compliance"
sop_version: "crystal-compilation-v6.1"
---
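# IFRS 9 Loss Engine (ifrs9-loss-engine)
> Computes IFRS 9 expected credit loss (ECL), with Vasicek single-factor forward-looking adjustment, Kaplan-Meier survival analysis for PD estimation, and loan amortization schedule generation, to meet Basel III impairment compliance requirements.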
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (42 total)
### ECL Limit Level Truncation Analysis (`UC-101`)
Calculates Expected Credit Loss (ECL) at the limit/tranche level by computing remaining tenor and projecting loan balances with interest, supporting IFRS 9 impairment calculations.
**Triggers**: ECL, Expected Credit Loss, limit level
### Loan Amortization Schedule Calculator (`UC-102`)
Computes loan amortization schedules by iteratively calculating interest amounts and remaining balances after each payment, determining total repayment terms.
**Triggers**: amortization, loan, payment schedule
### Amortization Schedule with NumPy Financial (`UC-103`)
Generates amortization schedules using numpy-financial library functions (PMT, PPMT, IPMT) for calculating periodic payments, principal, and interest breakdown.
**Triggers**: amortization, numpy-financial, PMT
For all **42** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (15 total)
- **`AP-REGTECH-001`**: Missing attribute initialization on data structures
- **`AP-REGTECH-002`**: Self-loops in transaction graphs violate domain rules
- **`AP-REGTECH-003`**: Unvalidated floating-point inputs cause runtime crashes
All 15 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-062. Evidence verify ratio = 80.0% and audit fail total = 15. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative source (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 15 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons worth borrowing | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When you need complete examples |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-062` blueprint at 2026-04-22T13:00:19.657886+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - Amortization Schedule with NumPy Financial
  - Loan Amortization Schedule Calculator
  - ECL Limit Level Truncation Analysis
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **15**
## finance-bp-060--AMLSim (1)
### `AP-REGTECH-011` — Mismatched configuration parameters across coupled components <sub>(medium)</sub>
When TransactionGenerator and Nominator use different degree_threshold values, Nominator identifies hub accounts using different criteria than TransactionGenerator. This causes incorrect fan-in/fan-out candidate selection. Consequence: AML typology patterns placed on wrong accounts, invalidating simulation results.
## finance-bp-060--AMLSim, finance-bp-067--firesale_stresstest (1)
### `AP-REGTECH-002` — Self-loops in transaction graphs violate domain rules <sub>(high)</sub>
When generating directed transaction graphs or AML typologies, allowing source == destination edges creates self-loops. In AML simulation, self-loops represent accounts sending money to themselves, which is not a valid money laundering pattern. In fire-sale models, self-loops cause undefined behavior. Consequence: corrupted graph topology and invalid typology validation.
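A minimal sketch of a self-loop guard, assuming edges are plain `(source, destination)` tuples. The `Edge` alias and both helper names are hypothetical and do not come from AMLSim or the fire-sale model.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (source_account, destination_account); hypothetical shape


def drop_self_loops(edges: Iterable[Edge]) -> List[Edge]:
    """Return only the edges whose source and destination differ."""
    return [(src, dst) for src, dst in edges if src != dst]


def assert_no_self_loops(edges: Iterable[Edge]) -> None:
    """Raise ValueError on the first self-loop instead of silently dropping it."""
    for src, dst in edges:
        if src == dst:
            raise ValueError(f"self-loop on account {src!r}")


edges = [("A", "B"), ("B", "B"), ("C", "A")]
clean = drop_self_loops(edges)
```

Whether to drop or reject is a design choice: dropping keeps a generator running, while rejecting surfaces the upstream bug that produced the edge.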
## finance-bp-060--AMLSim, finance-bp-071--opensanctions (1)
### `AP-REGTECH-001` — Missing attribute initialization on data structures <sub>(high)</sub>
When loading account lists or creating entity dictionaries, failing to initialize required list/dict attributes (e.g., normal_models, statement IDs) causes KeyError or ValueError at runtime. The code path that reads these structures assumes they exist, but the initialization path omits them. Consequence: pipeline crashes or data loss for affected entities.
## finance-bp-062--ifrs9 (3)
### `AP-REGTECH-005` — Incorrect amortization windows violate IFRS 9 compliance <sub>(high)</sub>
Stage 1 ECL requires exactly 12-month amortization (11 zero-indexed iterations) while Stage 2/3 requires full remaining tenor (tenor-1 iterations). Using identical windows for all stages causes ECL over/understatement. Consequence: regulatory non-compliance and materially incorrect loan loss provisions.
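The stage-dependent window can be sketched as a small helper. The function name is hypothetical, and capping Stage 1 at the remaining tenor for loans shorter than 12 months is an assumption that the text above does not state.

```python
def amortization_iterations(stage: int, remaining_tenor_months: int) -> int:
    """Number of zero-indexed amortization iterations for an IFRS 9 stage.

    Stage 1 uses a 12-month window (11 iterations); Stage 2 and 3 use the
    full remaining tenor (tenor - 1 iterations).
    """
    if stage not in (1, 2, 3):
        raise ValueError(f"unknown IFRS 9 stage: {stage}")
    if remaining_tenor_months < 1:
        raise ValueError("remaining tenor must be at least one month")
    lifetime_iterations = remaining_tenor_months - 1
    if stage == 1:
        # Assumption: a loan with under 12 months left cannot amortize past maturity.
        return min(11, lifetime_iterations)
    return lifetime_iterations
```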
### `AP-REGTECH-010` — Incorrect cumulative PD ordering corrupts lifetime ECL term structure <sub>(high)</sub>
Using cumprod(1-conPD) without shift(1) and fillna(1) produces corrupted first-period survival probability. This cascades into all subsequent marginal and cumulative PD calculations, violating IFRS 9 lifetime ECL requirements. Consequence: systematically incorrect provisions across all remaining tenor periods.
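The ordering issue is easier to see in a small worked example. The sketch below uses only the standard library and is an illustration, not the project's code: survival at the start of each period is the running product of `1 - conPD` shifted by one period, with the first period filled with 1. In pandas terms this corresponds to the `cumprod` / `shift(1)` / `fillna(1)` sequence named above. The function name is hypothetical.

```python
from itertools import accumulate
from operator import mul
from typing import List, Sequence, Tuple


def pd_term_structure(
    conditional_pd: Sequence[float],
) -> Tuple[List[float], List[float], List[float]]:
    """Return (survival at period start, marginal PD, cumulative PD)."""
    for p in conditional_pd:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"conditional PD out of range: {p}")
    # Survival to the END of each period: running product of (1 - conPD).
    survival_to_end = list(accumulate((1.0 - p for p in conditional_pd), mul))
    # Shift by one period and fill the first slot with 1.0:
    # every exposure is alive at the start of period 1.
    survival_at_start = [1.0, *survival_to_end][: len(conditional_pd)]
    marginal = [s * p for s, p in zip(survival_at_start, conditional_pd)]
    cumulative = list(accumulate(marginal))
    return survival_at_start, marginal, cumulative


survival, marginal, cumulative = pd_term_structure([0.1, 0.2, 0.5])
```

Skipping the shift would multiply the first period's conditional PD by 0.9 instead of 1.0, which understates every marginal PD that follows.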
### `AP-REGTECH-015` — Missing EAD component in ECL formula produces incomplete provisions <sub>(high)</sub>
IFRS 9 requires ECL = PD x LGD x EAD. When the EAD module is missing or not integrated, the ECL calculation is incomplete and unusable for provisioning. Consequence: regulatory rejection of ECL calculations, blocking of provisioning and reporting processes.
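As a rough illustration of the formula, the sketch below sums `PD x LGD x EAD` over periods. It assumes a constant LGD, a per-period EAD series, and no discounting; all three are simplifications made for the example rather than the blueprint's method, and the function name is hypothetical.

```python
from typing import Sequence


def lifetime_ecl(marginal_pd: Sequence[float], lgd: float, ead: Sequence[float]) -> float:
    """Sum of marginal PD x LGD x EAD across periods (undiscounted sketch)."""
    if len(marginal_pd) != len(ead):
        raise ValueError("marginal_pd and ead must cover the same periods")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError(f"LGD out of range: {lgd}")
    return sum(p * lgd * e for p, e in zip(marginal_pd, ead))
```

Raising on a length mismatch keeps a missing or misaligned EAD series from silently producing a partial provision.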
## finance-bp-062--ifrs9, finance-bp-067--firesale_stresstest (2)
### `AP-REGTECH-003` — Unvalidated floating-point inputs cause runtime crashes <sub>(high)</sub>
When parsing CSV files or computing statistical functions on raw data, failing to validate inputs against acceptable ranges (e.g., DDP near 0 or 1 for norm.ppf, unvalidated floats from CSV) causes ValueError or infinite/NaN values. Consequence: entire model crashes before simulation or corrupted downstream calculations.
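One way to guard these inputs is to validate at the parsing boundary and clip probabilities before the inverse normal CDF. The sketch below uses the standard library's `statistics.NormalDist().inv_cdf` as a stand-in for `norm.ppf`; the helper names and the `eps` default are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def parse_probability(text: str) -> float:
    """Parse a CSV cell into a finite probability in [0, 1]."""
    value = float(text)  # raises ValueError on non-numeric text
    if not math.isfinite(value):
        raise ValueError(f"non-finite value: {text!r}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"probability out of range: {value}")
    return value


def safe_probit(p: float, eps: float = 1e-6) -> float:
    """Inverse normal CDF with the input clipped away from 0 and 1."""
    if not math.isfinite(p) or not 0.0 <= p <= 1.0:
        raise ValueError(f"invalid probability: {p}")
    clipped = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(clipped)
```

Clipping changes extreme values slightly, so the chosen `eps` should be recorded alongside the model output.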
### `AP-REGTECH-004` — Division by zero in financial calculations produces inf/NaN <sub>(high)</sub>
When calculating ratios like DDP (downgrade observations / total observations) or price impact denominators (total_quantities), zero-denominator cases are not guarded. The resulting inf/NaN propagates through all downstream calculations, corrupting CCI, ECL, or market clearing. Consequence: systematic data corruption across the entire calculation pipeline.
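A minimal sketch of a guarded ratio, assuming the caller prefers an explicit `None` over `inf` or `NaN` when there are no observations. The function name is hypothetical.

```python
from typing import Optional


def downgrade_rate(downgrades: int, total: int) -> Optional[float]:
    """Downgrade observations divided by total observations.

    Returns None when there are no observations, so the missing value is
    handled explicitly instead of propagating inf/NaN downstream.
    """
    if downgrades < 0 or total < 0 or downgrades > total:
        raise ValueError(f"inconsistent counts: {downgrades} of {total}")
    if total == 0:
        return None
    return downgrades / total
```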
## finance-bp-067--firesale_stresstest (4)
### `AP-REGTECH-006` — Wrong leverage formula in threshold-based decisions <sub>(high)</sub>
Computing leverage as equity-to-liabilities (E/L) instead of equity-to-assets (E/A) produces different values. This causes deleveraging triggers and insolvency detection to fire at wrong thresholds. Consequence: zombie banks continue operating with negative equity, or healthy banks unnecessarily deleverage.
### `AP-REGTECH-007` — Confusing deleveraging buffer threshold with insolvency threshold <sub>(high)</sub>
Banks below 3% leverage are insolvent and must default, but deleveraging should trigger at 4% buffer. Using the same threshold eliminates the buffer zone, causing immediate default with no intermediate corrective action. Consequence: excessive bank failures amplify systemic contagion.
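The two leverage anti-patterns above can be illustrated together: leverage is computed as equity over assets, and the 3% insolvency and 4% buffer thresholds are kept as separate constants. The function names and action labels are hypothetical, and treating a ratio exactly at a threshold as not below it is an assumption.

```python
INSOLVENCY_THRESHOLD = 0.03  # below this the bank defaults
BUFFER_THRESHOLD = 0.04      # below this the bank starts deleveraging


def leverage_ratio(equity: float, assets: float) -> float:
    """Equity-to-assets (E/A), not equity-to-liabilities."""
    if assets <= 0:
        raise ValueError("assets must be positive")
    return equity / assets


def bank_action(equity: float, assets: float) -> str:
    """Return 'default', 'deleverage', or 'hold' for one bank."""
    leverage = leverage_ratio(equity, assets)
    if leverage < INSOLVENCY_THRESHOLD:
        return "default"
    if leverage < BUFFER_THRESHOLD:
        return "deleverage"
    return "hold"
```

Keeping the two thresholds distinct preserves the buffer zone in which a bank can take corrective action before it defaults.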
### `AP-REGTECH-013` — Order-dependent execution creates first-mover advantage bias <sub>(medium)</sub>
Without separating step() and act() phases, first-acting banks sell assets before others decide, creating systematic first-mover advantage. This distorts the competitive equilibrium and fire-sale dynamics. Consequence: unreliable systemic risk estimates that understate contagion for late-acting banks.
### `AP-REGTECH-014` — Immediate asset sales cause double-selling and undefined state <sub>(medium)</sub>
Executing asset sales immediately rather than queuing them to a buffer allows multiple banks holding the same asset to sell simultaneously without accounting for concurrent intentions. Consequence: undefined price impact and incorrect cash transfers in market clearing.
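The step/act separation and the sale buffer can be sketched together. This is an illustrative toy, not the fire-sale model's code: the `Bank` class, the rule of selling half of any asset priced below 1.0, and `run_round` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Sale = Tuple[str, str, float]  # (bank name, asset, quantity)


@dataclass
class Bank:
    name: str
    holdings: Dict[str, float]
    wants_to_sell: Dict[str, float] = field(default_factory=dict)

    def step(self, price_snapshot: Dict[str, float]) -> None:
        """Decide only: read a frozen price snapshot, touch no shared state."""
        self.wants_to_sell = {
            asset: qty * 0.5  # hypothetical decision rule
            for asset, qty in self.holdings.items()
            if price_snapshot[asset] < 1.0
        }

    def act(self, sale_queue: List[Sale]) -> None:
        """Queue the decided sales; nothing is executed here."""
        for asset, qty in self.wants_to_sell.items():
            sale_queue.append((self.name, asset, qty))


def run_round(banks: List[Bank], prices: Dict[str, float]):
    snapshot = dict(prices)        # every bank sees the same prices
    for bank in banks:
        bank.step(snapshot)
    queue: List[Sale] = []
    for bank in banks:
        bank.act(queue)
    totals: Dict[str, float] = {}  # market clearing would work from these totals
    for _, asset, qty in queue:
        totals[asset] = totals.get(asset, 0.0) + qty
    return queue, totals
```

Because all decisions are made against one snapshot and sales are only queued, the totals handed to market clearing do not depend on the order in which banks are iterated.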
## finance-bp-071--opensanctions (3)
### `AP-REGTECH-008` — Cache keys omit request body for state-changing methods <sub>(high)</sub>
Using only URL for cache fingerprints on POST/PATCH requests means different request bodies return identical cached content. This causes stale data, missing entities, and data corruption in compliance screening pipelines. Consequence: sanctions matches missed or false positives from stale entity data.
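A minimal sketch of a fingerprint that covers method, URL, and body, using `hashlib.sha256`. The function name and the NUL separator are assumptions made for the example and are not the opensanctions implementation.

```python
import hashlib
from typing import Optional


def cache_fingerprint(method: str, url: str, body: Optional[bytes] = None) -> str:
    """Hash method, URL, and request body into one cache key."""
    digest = hashlib.sha256()
    digest.update(method.upper().encode("utf-8"))
    digest.update(b"\x00")  # separator so fields cannot run together
    digest.update(url.encode("utf-8"))
    digest.update(b"\x00")
    digest.update(body or b"")
    return digest.hexdigest()


key_a = cache_fingerprint("POST", "https://example.org/search", b'{"q": "alice"}')
key_b = cache_fingerprint("POST", "https://example.org/search", b'{"q": "bob"}')
```

Here `key_a` and `key_b` differ even though the URL is identical, so two different queries can no longer share a cached response.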
### `AP-REGTECH-009` — ID collision in entity construction creates false sanctions matches <sub>(high)</sub>
When constructing entity IDs from source identifiers, insufficient identifying attributes cause different real-world entities to receive identical IDs. The database then merges them into one entity. Consequence: a sanctioned entity's ID matches an innocent entity, causing false positive compliance alerts.
### `AP-REGTECH-012` — Reverse property assignment corrupts entity construction <sub>(medium)</sub>
Stub (reverse) properties represent inverse relationships and raise InvalidData when directly assigned. Attempting to add values to stub properties instead of forward properties causes ValueError, aborting entity construction. Consequence: entities lost from output, incomplete compliance datasets.
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-062--ifrs9
**Scan date**: 2026-04-22
**Stats**: 9 files, 32 classes, 0 functions, 9 stages
## Modules (9)
- [transaction_staging](components/transaction_staging.md): 3 classes
- [transition_matrix_estimation](components/transition_matrix_estimation.md): 4 classes
- [survival_analysis_for_pd](components/survival_analysis_for_pd.md): 4 classes
- [forward-looking_pd_adjustment](components/forward-looking_pd_adjustment.md): 5 classes
- [lgd_model_estimation](components/lgd_model_estimation.md): 3 classes
- [exposure_at_default_(ead)_estimation](components/exposure_at_default_-ead-_estimation.md): 3 classes
- [ifrs_9_staging_classification](components/ifrs_9_staging_classification.md): 3 classes
- [amortization_schedule_generation](components/amortization_schedule_generation.md): 4 classes
- [ecl_computation](components/ecl_computation.md): 3 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 117
fatal_constraints_count: 54
non_fatal_constraints_count: 175
use_cases_count: 42
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
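A small sketch of enforcing this ordering within one cycle, assuming orders are `(action, entity_id)` tuples. That shape and the function name are hypothetical and are not ZVT's order objects.

```python
from typing import List, Tuple

Order = Tuple[str, str]  # (action, entity_id); hypothetical shape


def order_cycle(orders: List[Order]) -> List[Order]:
    """Put sells ahead of buys; the stable sort keeps relative order otherwise."""
    for action, _ in orders:
        if action not in ("buy", "sell"):
            raise ValueError(f"unknown action: {action}")
    return sorted(orders, key=lambda order: 0 if order[0] == "sell" else 1)


cycle = order_cycle([
    ("buy", "stock_sh_600000"),
    ("sell", "stock_sz_000001"),
    ("buy", "stock_sz_000002"),
])
```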
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
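Next-bar execution amounts to shifting signals forward by one bar. The sketch below assumes signals are held in a plain list with one entry per bar; the function name is hypothetical.

```python
from typing import List, Optional, Sequence


def next_bar_executions(signals: Sequence[Optional[str]]) -> List[Optional[str]]:
    """Shift signals one bar forward: a signal seen on bar t executes on bar t+1."""
    # The first bar has no prior signal, so nothing can execute on it.
    return [None, *signals][: len(signals)]


executions = next_bar_executions(["buy", None, "sell"])
```

The signal generated on the last bar is dropped because no later bar exists to execute it on.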
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
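A minimal format check, assuming the three parts are separated by underscores and that only the code part may itself contain further underscores. The parser name is hypothetical.

```python
from typing import Tuple


def parse_entity_id(entity_id: str) -> Tuple[str, str, str]:
    """Split 'entity_type_exchange_code' into its three parts or raise ValueError."""
    parts = entity_id.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"malformed entity id: {entity_id!r}")
    entity_type, exchange, code = parts
    return entity_type, exchange, code
```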
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
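The exactly-one rule can be checked at construction time. The dataclass below is a hypothetical stand-in that borrows the three field names from the lock; it is not ZVT's `TradingSignal` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignalSizing:
    """Hypothetical stand-in carrying only the three sizing fields."""

    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[float] = None

    def __post_init__(self) -> None:
        provided = [
            name
            for name in ("position_pct", "order_money", "order_amount")
            if getattr(self, name) is not None
        ]
        if len(provided) != 1:
            raise ValueError(f"expected exactly one sizing field, got {provided}")
```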
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **42**
## `KUC-101`
**Source**: `ECL/ECLLimitLevel_Truncate.ipynb`
Calculates Expected Credit Loss (ECL) at the limit/tranche level by computing remaining tenor and projecting loan balances with interest, supporting IFRS 9 impairment calculations.
## `KUC-102`
**Source**: `ECL/abAmortization.ipynb`
Computes loan amortization schedules by iteratively calculating interest amounts and remaining balances after each payment, determining total repayment terms.
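A rough sketch of the iterative calculation, assuming monthly compounding at `annual_rate / 12` and a fixed payment. The function name, the row layout, and the `max_periods` safety cap are hypothetical and are not taken from the notebook.

```python
from typing import List, Tuple

Row = Tuple[int, float, float, float]  # (period, interest, principal paid, balance)


def amortization_schedule(
    principal: float, annual_rate: float, payment: float, max_periods: int = 1200
) -> List[Row]:
    """Iterate interest and remaining balance until the loan is repaid."""
    monthly_rate = annual_rate / 12.0
    if payment <= principal * monthly_rate:
        raise ValueError("payment does not cover the first period's interest")
    balance = principal
    rows: List[Row] = []
    period = 0
    while balance > 1e-9 and period < max_periods:
        period += 1
        interest = balance * monthly_rate
        # The final payment is reduced so the balance ends at exactly zero.
        principal_paid = min(payment - interest, balance)
        balance -= principal_paid
        rows.append((period, interest, principal_paid, balance))
    return rows


schedule = amortization_schedule(principal=1000.0, annual_rate=0.12, payment=500.0)
```

The number of rows gives the repayment term, and summing the interest column gives the total interest paid.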
## `KUC-103`
**Source**: `ECL/amortization.ipynb`
Generates amortization schedules using numpy-financial library functions (PMT, PPMT, IPMT) for calculating periodic payments, principal, and interest breakdown.
## `KUC-104`
**Source**: `LGD/LGDFunctionModel.ipynb`
Develops Loss Given Default (LGD) models using statistical regression techniques with 12-month forward observation windows for non-default populations.
## `KUC-105`
**Source**: `PD/APIBOTMacro.ipynb`
Fetches macroeconomic variables from Bank of Thailand's API including GDP, employment, and trade indicators for integration into credit risk models.
## `KUC-106`
**Source**: `PD/BOTNPLData.ipynb`
Cleans and processes raw NPL (Non-Performing Loan) data from Bank of Thailand, handling inconsistent column structures and formatting for credit risk analysis.
## `KUC-107`
**Source**: `PD/BayesCalibration.ipynb`
Calibrates credit rating transition matrices using Bayesian optimization methods to ensure row stochasticity and alignment with observed migration patterns.
## `KUC-108`
**Source**: `PD/CHAIDSegmentation.ipynb`
Segments used car loan customers using CHAID (Chi-squared Automatic Interaction Detection) decision tree algorithm with optimal binning for credit scoring development.
## `KUC-109`
**Source**: `PD/CooksDStudent.ipynb`
Identifies influential outliers and high-leverage points in PD regression models using Cook's distance and studentized residuals analysis.
## `KUC-110`
**Source**: `PD/HACAdjustment.ipynb`
Adjusts PD regression models for heteroscedasticity and autocorrelation using HAC (Heteroscedasticity and Autocorrelation Consistent) standard errors.
## `KUC-111`
**Source**: `PD/KMVModel.ipynb`
Implements the KMV-Merton structural model to estimate default probabilities from equity prices and market capitalization using option pricing theory.
## `KUC-112`
**Source**: `PD/LassoSelection.ipynb`
Selects optimal macroeconomic variables for PD models using LASSO (Least Absolute Shrinkage and Selection Operator) regression with time series cross-validation.
## `KUC-113`
**Source**: `PD/NRTMatrix.ipynb`
Constructs credit rating transition matrices for non-retail portfolios by counting rating migrations and computing conditional transition probabilities.
## `KUC-114`
**Source**: `PD/NelsonSiegelCurve.ipynb`
Fits Nelson-Siegel-Svensson curves to PD term structures for modeling the relationship between probability of default and time to maturity.
## `KUC-115`
**Source**: `PD/PDAssumedLGD.ipynb`
Defines assumed PD and LGD risk weights by asset class (Strong, Good, Satisfactory, Weak, Default) for regulatory capital and expected loss calculations.
## `KUC-116`
**Source**: `PD/PDCalibration.ipynb`
Calibrates Probability of Default from Through-the-Cycle (TTC) to Point-in-Time (PIT) estimates using central tendency adjustments and risk grade mapping.
## `KUC-117`
**Source**: `PD/PROCVARCLUS.ipynb`
Performs hierarchical variable clustering using VarClusHi algorithm to reduce multicollinearity among macroeconomic variables in PD models.
## `KUC-118`
**Source**: `PD/SilhouetteAnalysis.ipynb`
Evaluates optimal number of clusters in K-means segmentation using silhouette score analysis to determine best customer groupings for PD modeling.
## `KUC-119`
**Source**: `PD/TheoreticalMigrationMatrix.ipynb`
Constructs theoretically grounded migration matrices using parametric distributions and optimization to ensure mathematical consistency and economic plausibility.
## `KUC-120`
**Source**: `PD/allCombinations.ipynb`
Generates all possible combinations of macroeconomic variables within clusters for exhaustive model selection in PD development.
## `KUC-121`
**Source**: `PD/autoCorrTest.ipynb`
Tests for autocorrelation in PD model residuals using ACF plots, Ljung-Box test, and Durbin-Watson statistics to validate time series assumptions.
## `KUC-122`
**Source**: `PD/cci.ipynb`
Calculates Credit Conversion Index (CCI) by measuring downgrade rates and converting to a normalized z-score for stress testing and early warning systems.
## `KUC-123`
**Source**: `PD/chainLadder.ipynb`
Applies actuarial chain ladder methodology to estimate PD curves over time and aging buckets using weighted average calculations.
## `KUC-124`
**Source**: `PD/clusterSelection.ipynb`
Selects representative variables from clusters based on lowest R-square ratio and highest correlation for parsimonious PD model specification.
## `KUC-125`
**Source**: `PD/distributionSelection.ipynb`
Fits and compares multiple statistical distributions (Gamma, Log-normal, Weibull, etc.) to empirical PD risk grade curves for model selection.
## `KUC-126`
**Source**: `PD/externalMatrix.ipynb`
Integrates external credit ratings (e.g., Moody's) and their transition matrices with internal rating systems for PD benchmarking and calibration.
## `KUC-127`
**Source**: `PD/gammaFitting.ipynb`
Fits Gamma cumulative distribution functions to PD risk grade curves using scipy optimization to model default probability term structure.
## `KUC-128`
**Source**: `PD/generatorMatrix.ipynb`
Constructs generator (infinitesimal) matrices for continuous-time credit migration models enabling computation of transition probabilities over arbitrary time horizons.
## `KUC-129`
**Source**: `PD/heteroTest.ipynb`
Tests for heteroscedasticity in PD regression models using White's test and Breusch-Pagan test to validate homoscedasticity assumptions.
## `KUC-130`
**Source**: `PD/lifetimeCalibration.ipynb`
Calibrates lifetime PD curves using through-the-cycle (TTC) reference data and tracks calibration stability over time for IFRS 9 lifetime ECL calculations.
## `KUC-131`
**Source**: `PD/limitPDCurves.ipynb`
Applies limiting constraints to PD curves ensuring monotonicity, non-exceedance of regulatory floors, and compliance with Basel/IFRS 9 requirements.
## `KUC-132`
**Source**: `PD/multicolTest.ipynb`
Tests for multicollinearity among macroeconomic variables using Variance Inflation Factor (VIF) and correlation matrices in PD regression models.
## `KUC-133`
**Source**: `PD/normalityTest.ipynb`
Tests normality of regression residuals using Shapiro-Wilk, Anderson-Darling tests, QQ plots, and histograms for PD model assumption validation.
## `KUC-134`
**Source**: `PD/simplifiedCCI.ipynb`
Computes simplified Credit Conversion Index using z-score normalization of downgrade probabilities for portfolio-level credit stress monitoring.
## `KUC-135`
**Source**: `PD/survivalAnalysis.ipynb`
Applies survival analysis techniques to estimate lifetime default probabilities using Kaplan-Meier curves and Cox proportional hazards for IFRS 9 staging.
## `KUC-136`
**Source**: `PD/timeSeriesKMeans.ipynb`
Clusters macroeconomic time series using Dynamic Time Warping (DTW) distance with K-means algorithm to identify similar economic patterns for PD modeling.
## `KUC-137`
**Source**: `PD/timeSeriesStationary.ipynb`
Analyzes stationarity of macroeconomic time series using ADF tests and seasonal decomposition for proper model specification in PD regression.
## `KUC-138`
**Source**: `PD/transitionMatrix.ipynb`
Constructs empirical credit rating transition matrices from transaction-level data by tracking 12-month forward aging status and computing migration probabilities.
## `KUC-139`
**Source**: `PD/univariateAnalysis1.ipynb`
Performs univariate regression analysis on individual macroeconomic variables to assess their predictive power for ODR (Observed Default Rate) before multivariate modeling.
## `KUC-140`
**Source**: `PD/vasicekBaselRho.ipynb`
Estimates Basel Rho parameter (asset correlation) in the Vasicek single-factor model for regulatory capital calculation and portfolio credit risk.
## `KUC-141`
**Source**: `PD/vasicekTransitionMatrix.ipynb`
Calibrates credit rating transition matrices using the Vasicek asymptotic single risk factor model ensuring consistency with Basel regulatory requirements.
## `KUC-142`
**Source**: `Staging/backwardTransaction.ipynb`
Processes backward-looking transaction data to create performance windows and default flags for credit scoring model development in staging environments.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **10**
## `CW-REGTECH-001` — Input bounds validation before statistical computation
**From**: finance-bp-062--ifrs9, finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Statistical functions like norm.ppf() and cumprod() have strict input requirements that, if violated, produce infinite or NaN values corrupting entire pipelines. Always validate inputs against domain constraints (DDP in (0,1), counts > 0) before passing to statistical functions. Apply to any statistical or inverse-CDF computation.
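A minimal sketch of the guard described above. It uses the standard library's `statistics.NormalDist.inv_cdf` as a stand-in for `norm.ppf()`; the helper name `safe_inverse_normal` is hypothetical and does not come from either source project.

```python
from statistics import NormalDist


def safe_inverse_normal(p: float) -> float:
    """Return the standard normal quantile of p, rejecting values outside (0, 1)."""
    # Comparisons with NaN are always False, so NaN is rejected here as well.
    if not (0.0 < p < 1.0):
        raise ValueError(f"probability must lie strictly in (0, 1), got {p!r}")
    return NormalDist().inv_cdf(p)
```

Raising at the boundary keeps an out-of-range default probability from reaching later steps as an infinite or NaN value.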
## `CW-REGTECH-002` — Graph/topology invariant verification before construction
**From**: finance-bp-060--AMLSim, finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Before constructing graph structures (transaction networks, transition matrices), verify invariants: sum(in-degrees) = sum(out-degrees), matrix row sums = 1.0, degree sequence length divisibility. This catches data corruption early before expensive graph construction operations. Apply to any bipartite or directed graph generation.
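The two cheapest invariants named above can be checked in a few lines before any construction starts. This is a sketch under assumed names (`check_row_stochastic`, `check_degree_balance`) and an assumed tolerance, not code from the source projects.

```python
import math


def check_row_stochastic(matrix: list[list[float]], tol: float = 1e-9) -> None:
    """Raise ValueError unless every row is non-negative and sums to 1.0 within tol."""
    for i, row in enumerate(matrix):
        if any(p < 0 for p in row):
            raise ValueError(f"row {i} contains a negative probability")
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(f"row {i} sums to {sum(row)}, expected 1.0")


def check_degree_balance(in_degrees: list[int], out_degrees: list[int]) -> None:
    """Raise ValueError unless total in-degree equals total out-degree."""
    if sum(in_degrees) != sum(out_degrees):
        raise ValueError("sum(in-degrees) != sum(out-degrees)")
```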
## `CW-REGTECH-003` — Regulatory amortization window discipline
**From**: finance-bp-062--ifrs9 · **Applicable to**: regtech-compliance
IFRS 9 mandates different ECL calculation windows: exactly 12-month for Stage 1 (11 zero-indexed iterations), full remaining tenor for Stage 2/3. Mixing these up violates compliance requirements. Always encode stage-specific window logic explicitly rather than reusing a single loop variable across stages.
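One way to keep the stage-specific window explicit is to compute it in a single function rather than inline in the ECL loop. The function name and the cap at the remaining tenor for Stage 1 are assumptions made for this sketch.

```python
def ecl_horizon_months(stage: int, remaining_tenor_months: int) -> int:
    """Number of monthly periods to include in the ECL sum for a given stage."""
    if remaining_tenor_months < 0:
        raise ValueError("remaining tenor cannot be negative")
    if stage == 1:
        # Stage 1: 12-month ECL, capped at the remaining life (assumption).
        return min(12, remaining_tenor_months)
    if stage in (2, 3):
        # Stage 2/3: lifetime ECL over the full remaining tenor.
        return remaining_tenor_months
    raise ValueError(f"unknown IFRS 9 stage: {stage!r}")
```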
## `CW-REGTECH-004` — Fingerprint composition must include all request dimensions
**From**: finance-bp-071--opensanctions · **Applicable to**: regtech-compliance
Cache keys must include all request parameters that affect response content: URL, HTTP method, authentication headers, and request body for state-changing methods. A cache that returns the same entry for POST requests with different bodies is a silent data corruption bug. Always compose fingerprints from the union of all content-affecting parameters.
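A possible fingerprint function is sketched below. The name `request_fingerprint`, the single `auth` string parameter, and the choice of SHA-256 are illustrative assumptions, not the opensanctions implementation.

```python
import hashlib
import json


def request_fingerprint(method: str, url: str, auth: str = "", body: bytes = b"") -> str:
    """Hash every request dimension that can change the response content."""
    method = method.upper()
    # Only state-changing methods contribute their body to the key.
    body_part = body.hex() if method in {"POST", "PUT", "PATCH", "DELETE"} else ""
    payload = json.dumps([method, url, auth, body_part])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Serializing the parts as a JSON list keeps the field boundaries unambiguous, so two different requests cannot produce the same key by concatenation.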
## `CW-REGTECH-005` — Floating-point zero-equivalence with explicit epsilon tolerance
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
IEEE 754 floating-point precision causes exact zero comparisons to fail in financial calculations. Always use eps=1e-9 tolerance for zero-equivalence checks in market clearing, leverage ratios, and price impact calculations. This prevents division-by-zero crashes and incorrect cash transfers.
## `CW-REGTECH-006` — Stage classification threshold ordering enforcement
**From**: finance-bp-062--ifrs9 · **Applicable to**: regtech-compliance
IFRS 9 SICR thresholds must be ordered: BUCKETS 2-3 trigger Stage 2, BUCKETS >=4 trigger Stage 3. Applying thresholds in wrong order or omitting absolute DPD triggers causes material ECL misstatement. Validate threshold ordering and document bucket-to-stage mapping explicitly.
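The bucket-to-stage mapping above can be written so that the highest threshold is tested first. Treating buckets below 2 as Stage 1 is an assumption of this sketch, as is the function name.

```python
def stage_from_bucket(bucket: int) -> int:
    """Map a delinquency bucket to an IFRS 9 stage, checking the highest threshold first."""
    if bucket < 0:
        raise ValueError(f"bucket cannot be negative: {bucket}")
    if bucket >= 4:
        return 3  # buckets >= 4 trigger Stage 3
    if bucket >= 2:
        return 2  # buckets 2-3 trigger Stage 2
    return 1      # assumption: lower buckets remain in Stage 1
```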
## `CW-REGTECH-007` — Initialization-before-use dependency ordering
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Operational dependencies must initialize before dependent objects use them: AssetMarket before bank registration, CSV file existence before parsing, entity ID before statement addition. Violations cause AttributeError or FileNotFoundError that abort entire initialization. Always encode dependency ordering explicitly in initialization sequences.
## `CW-REGTECH-008` — Sufficient entity ID collision prevention
**From**: finance-bp-071--opensanctions · **Applicable to**: regtech-compliance
Entity IDs must include enough identifying attributes (dataset prefix, source, identifier type, document number) to guarantee uniqueness. Collisions create false equivalence between unrelated entities, directly causing false positive sanctions matches. Include the maximum available discriminating attributes in ID construction.
## `CW-REGTECH-009` — Hub selection with candidate removal before addition
**From**: finance-bp-060--AMLSim · **Applicable to**: regtech-compliance
When selecting hub accounts for typology placement, always call remove_typology_candidate BEFORE add_node for each selected account. Reversing this order causes hub self-selection (accounts choosing themselves) and duplicate assignment across overlapping patterns. Apply to any allocation algorithm with candidate pooling.
## `CW-REGTECH-010` — Insolvency detection before operational decisions
**From**: finance-bp-067--firesale_stresstest · **Applicable to**: regtech-compliance
Banks below the insolvency threshold (3% leverage) must trigger default immediately, not enter the deleveraging decision logic. Checking operational thresholds before insolvency creates zombie banks with negative equity. Always gate operational decisions on prior insolvency state.
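The ordering can be sketched as a single decision function in which the insolvency check always runs first. The 3% threshold comes from the text; the `deleveraging_threshold` parameter and the returned labels are hypothetical.

```python
INSOLVENCY_THRESHOLD = 0.03  # 3% leverage, as stated above


def bank_action(leverage: float, deleveraging_threshold: float) -> str:
    """Decide a bank's action, gating operational logic on insolvency first."""
    if leverage < INSOLVENCY_THRESHOLD:
        return "default"      # insolvent banks never reach the deleveraging branch
    if leverage < deleveraging_threshold:
        return "deleverage"
    return "hold"
```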
FILE:references/components/amortization_schedule_generation.md
# amortization_schedule_generation (4 classes)
## `amortization.ipynb`
`amortization_schedule_generation/amortization-ipynb.py:0`
## `abAmortization.ipynb`
`amortization_schedule_generation/abamortization-ipynb.py:0`
## `compounding`
`amortization_schedule_generation/compounding.py:0`
## `schedule_type`
`amortization_schedule_generation/schedule-type.py:0`
FILE:references/components/ecl_computation.md
# ecl_computation (3 classes)
## `ECLLimitLevel_Truncate.ipynb`
`ecl_computation/ecllimitlevel-truncate-ipynb.py:0`
## `discount_rate`
`ecl_computation/discount-rate.py:0`
## `probability_weighting`
`ecl_computation/probability-weighting.py:0`
FILE:references/components/exposure_at_default_-ead-_estimation.md
# exposure_at_default_(ead)_estimation (3 classes)
## `ECLLimitLevel_Truncate.ipynb`
`exposure_at_default_(ead)_estimation/ecllimitlevel-truncate-ipynb.py:0`
## `ead_model`
`exposure_at_default_(ead)_estimation/ead-model.py:0`
## `limit_treatment`
`exposure_at_default_(ead)_estimation/limit-treatment.py:0`
FILE:references/components/forward-looking_pd_adjustment.md
# forward-looking_pd_adjustment (5 classes)
## `cci.ipynb`
`forward-looking_pd_adjustment/cci-ipynb.py:0`
## `vasicekTransitionMatrix.ipynb`
`forward-looking_pd_adjustment/vasicektransitionmatrix-ipynb.py:0`
## `simplifiedCCI.ipynb`
`forward-looking_pd_adjustment/simplifiedcci-ipynb.py:0`
## `macro_model`
`forward-looking_pd_adjustment/macro-model.py:0`
## `correlation_parameter`
`forward-looking_pd_adjustment/correlation-parameter.py:0`
FILE:references/components/ifrs_9_staging_classification.md
# ifrs_9_staging_classification (3 classes)
## `backwardTransaction.ipynb`
`ifrs_9_staging_classification/backwardtransaction-ipynb.py:0`
## `sicr_method`
`ifrs_9_staging_classification/sicr-method.py:0`
## `stage_thresholds`
`ifrs_9_staging_classification/stage-thresholds.py:0`
FILE:references/components/lgd_model_estimation.md
# lgd_model_estimation (3 classes)
## `LGDFunctionModel.ipynb`
`lgd_model_estimation/lgdfunctionmodel-ipynb.py:0`
## `lgd_definition`
`lgd_model_estimation/lgd-definition.py:0`
## `discount_method`
`lgd_model_estimation/discount-method.py:0`
FILE:references/components/survival_analysis_for_pd.md
# survival_analysis_for_pd (4 classes)
## `survivalAnalysis.ipynb`
`survival_analysis_for_pd/survivalanalysis-ipynb.py:0`
## `lifetimeCalibration.ipynb`
`survival_analysis_for_pd/lifetimecalibration-ipynb.py:0`
## `survival_function`
`survival_analysis_for_pd/survival-function.py:0`
## `calibration_method`
`survival_analysis_for_pd/calibration-method.py:0`
FILE:references/components/transaction_staging.md
# transaction_staging (3 classes)
## `backwardTransaction.ipynb`
`transaction_staging/backwardtransaction-ipynb.py:0`
## `lag_period`
`transaction_staging/lag-period.py:0`
## `exclusion_rules`
`transaction_staging/exclusion-rules.py:0`
FILE:references/components/transition_matrix_estimation.md
# transition_matrix_estimation (4 classes)
## `transitionMatrix.ipynb`
`transition_matrix_estimation/transitionmatrix-ipynb.py:0`
## `NRTMatrix.ipynb`
`transition_matrix_estimation/nrtmatrix-ipynb.py:0`
## `status_mapping`
`transition_matrix_estimation/status-mapping.py:0`
## `matrix_normalization`
`transition_matrix_estimation/matrix-normalization.py:0`
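Runs cryptocurrency market-making and arbitrage strategies with the Hummingbot framework, supporting automated trading scenarios such as funding rate arbitrage, liquidity provision, and price monitoring.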
---
name: hummingbot-market-maker
description: |-
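Runs cryptocurrency market-making and arbitrage strategies with the Hummingbot framework, supporting automated trading scenarios such as funding rate arbitrage, liquidity provision, and price monitoring.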
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-096"
compiled_at: "2026-04-22T13:00:42.333686+00:00"
capability_markets: "crypto"
capability_activities: "crypto-trading"
sop_version: "crystal-compilation-v6.1"
---
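# Hummingbot Market-Making Bot (hummingbot-market-maker)
> Runs cryptocurrency market-making and arbitrage strategies with the Hummingbot framework, supporting automated trading scenarios such as funding rate arbitrage, liquidity provision, and price monitoring.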
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (14 total)
### Funding Rate Arbitrage (`UC-101`)
Exploits funding rate differences between perpetual exchanges (e.g., Hyperliquid and Binance) to generate risk-adjusted returns using leverage
**Triggers**: funding rate, arbitrage, perpetual
### XRPL Triggered Liquidity Provision (`UC-103`)
Provides liquidity on the XRP Ledger (XRPL) decentralized exchange when the price crosses user-defined target levels
**Triggers**: xrpl, ripple, liquidity
### Simple Cross Exchange Market Making (XEMM) (`UC-108`)
Places maker orders on one exchange and immediately hedges filled orders on another exchange to capture the spread
**Triggers**: xemm, cross-exchange, market making
For all **14** use cases, see [references/USE_CASES.md](references/USE_CASES.md).
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
## Top Anti-Patterns (13 total)
- **`AP-CRYPTO-TRADING-001`**: Float Arithmetic for Monetary Values
- **`AP-CRYPTO-TRADING-002`**: Missing Market Initialization Before Access
- **`AP-CRYPTO-TRADING-003`**: Bypassing API Facade Layer
All 13 anti-patterns: [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md)
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-096. Evidence verify ratio = 46.3% and audit fail total = 30. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ complete authoritative record (source-of-truth) | Required reading when behavior or decisions are disputed |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 13 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project lessons learned | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When complete examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest or trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-096` blueprint at 2026-04-22T13:00:42.333686+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**: ['XRPL Triggered Liquidity Provision', 'Price Logging Example', 'Funding Rate Arbitrage', 'A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney', 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader', 'Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout', 'Index composition data collection (SZ1000, SZ2000) with EM recorder']
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **13**
## ccxt (1)
### `AP-CRYPTO-TRADING-002` — Missing Market Initialization Before Access <sub>(high)</sub>
Attempting to access market data via symbol lookups before load_markets() is called leaves self.markets empty, causing KeyError or BadSymbol exceptions on all trading operations and data retrieval. This breaks the entire trading workflow at the first market interaction.
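The failure mode can be made explicit with a guard that refuses lookups until markets are loaded. `MarketRegistry` below is a hypothetical stand-in written for illustration. It is not the ccxt API, and its `load_markets` takes a dictionary instead of fetching from an exchange.

```python
class MarketRegistry:
    """Hypothetical stand-in for an exchange client; not the ccxt API."""

    def __init__(self) -> None:
        self._markets: dict[str, dict] | None = None  # None means "not loaded yet"

    def load_markets(self, markets: dict[str, dict]) -> None:
        """Populate the registry; a real client would fetch this from the exchange."""
        self._markets = dict(markets)

    def market(self, symbol: str) -> dict:
        """Look up a symbol, failing loudly if markets were never loaded."""
        if self._markets is None:
            raise RuntimeError("call load_markets() before any symbol lookup")
        if symbol not in self._markets:
            raise KeyError(f"unknown symbol: {symbol}")
        return self._markets[symbol]
```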
## cryptofeed (3)
### `AP-CRYPTO-TRADING-009` — Applying Order Book Deltas Before Snapshot <sub>(high)</sub>
Processing order book delta messages before receiving a snapshot for the symbol applies updates to an uninitialized or stale book state. Price levels are incorrectly added/removed, corrupting the local book representation with no way to recover without full reset.
### `AP-CRYPTO-TRADING-010` — Silent HTTP Error Handling <sub>(medium)</sub>
Ignoring non-200 HTTP response status codes without raising exceptions causes silent failures for data requests. Market data is missing or corrupted, failed requests are not retried, and downstream consumers receive incomplete data with no indication of failure.
### `AP-CRYPTO-TRADING-011` — Missing Sequence Number Validation <sub>(medium)</sub>
Not validating that order book sequence numbers increment by exactly 1 allows out-of-order or missing messages to corrupt local book state. Stale or incorrect price levels persist in the book, leading to wrong trading signals and corrupted market depth data.
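The snapshot-first rule and the sequence check can be combined in one small book class. This is a simplified, bid-side-only sketch with hypothetical names (`LocalBook`, `apply_snapshot`, `apply_delta`); it does not reproduce cryptofeed's internals.

```python
from decimal import Decimal


class LocalBook:
    """Bid-side book that rejects deltas until a snapshot arrives."""

    def __init__(self) -> None:
        self.bids: dict[Decimal, Decimal] = {}
        self.last_seq: int | None = None  # None means "no valid snapshot"

    def apply_snapshot(self, seq: int, bids: dict[Decimal, Decimal]) -> None:
        """Replace the whole book and record the snapshot's sequence number."""
        self.bids = dict(bids)
        self.last_seq = seq

    def apply_delta(self, seq: int, price: Decimal, size: Decimal) -> bool:
        """Apply one delta; return False when the book must be re-synced."""
        if self.last_seq is None or seq != self.last_seq + 1:
            self.last_seq = None  # gap or missing snapshot: force a fresh snapshot
            return False
        if size == 0:
            self.bids.pop(price, None)  # size 0 removes the price level
        else:
            self.bids[price] = size
        self.last_seq = seq
        return True
```

A `False` return is the caller's signal to discard the local state and request a new snapshot rather than continue with a corrupted book.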
## hummingbot (5)
### `AP-CRYPTO-TRADING-005` — Unvalidated Collateral for Order Execution <sub>(high)</sub>
Submitting orders without checking collateral requirements including order cost, percent fees, and fixed fees against available balance causes orders to exceed margin. This triggers immediate liquidation or forced position closure at unfavorable prices with partial or total loss of collateral.
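A simplified pre-submission check might look like the sketch below. It ignores leverage and margin tiers, and the function and parameter names are assumptions, not Hummingbot's budget-checker API.

```python
from decimal import Decimal


def has_sufficient_collateral(
    balance: Decimal,
    price: Decimal,
    amount: Decimal,
    percent_fee: Decimal,
    fixed_fee: Decimal,
) -> bool:
    """Check order cost plus percent and fixed fees against the available balance."""
    order_cost = price * amount
    required = order_cost + order_cost * percent_fee + fixed_fee
    return required <= balance
```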
### `AP-CRYPTO-TRADING-006` — Close Order Placed Before Open Order Fills <sub>(high)</sub>
Placing a close order before verifying the open order is fully filled causes mismatched position sizes. The executor attempts to close a larger or smaller position than actually exists, leading to unintended directional exposure and potential losses exceeding the configured risk parameters.
### `AP-CRYPTO-TRADING-007` — Arbitrage Across Non-Interchangeable Tokens <sub>(high)</sub>
Executing arbitrage trades between tokens that appear similar but are not interchangeable causes permanent loss of funds. The received tokens cannot be used to close the opposing position, stranding capital and creating one-sided exposure with no recovery path.
### `AP-CRYPTO-TRADING-008` — Skipping Triple Barrier Evaluations <sub>(high)</sub>
Omitting control_stop_loss, control_take_profit, or control_time_limit calls in the control_barriers cycle leaves positions unprotected. Losses exceed configured thresholds as barrier checks never trigger, positions remain open beyond risk tolerance, resulting in amplified losses.
### `AP-CRYPTO-TRADING-012` — Wrong Position Key for Perpetual Modes <sub>(medium)</sub>
Using trading_pair only as the position key in HEDGE mode causes different position sides to collide and overwrite each other. Position tracking becomes incorrect, leading to wrong order matching and potential financial loss when the system misidentifies position direction.
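The collision disappears once the key includes the position side in HEDGE mode. The function below is an illustrative sketch with string-typed modes and sides; the exact key format is an assumption.

```python
def position_key(trading_pair: str, position_mode: str, side: str) -> str:
    """Build a dictionary key for tracking an open perpetual position."""
    if position_mode == "HEDGE":
        # Long and short positions coexist, so the side must be part of the key.
        return f"{trading_pair}{side}"
    # One-way mode holds at most one position per pair.
    return trading_pair
```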
## rotki (3)
### `AP-CRYPTO-TRADING-003` — Bypassing API Facade Layer <sub>(high)</sub>
Directly accessing internal service methods without routing through the RestAPI facade bypasses authentication, task tracking, and error handling mechanisms. Anonymous requests can execute privileged operations, creating critical security vulnerabilities where unauthorized users access sensitive financial data or execute trades.
### `AP-CRYPTO-TRADING-004` — Non-Checksummed EVM Addresses <sub>(high)</sub>
Passing lowercase or mixed-case Ethereum addresses to RPC nodes causes InvalidAddress exceptions since nodes enforce EIP-55 checksum format. This results in RemoteError failures that halt all blockchain data collection for the affected chain, with no graceful degradation or fallback.
### `AP-CRYPTO-TRADING-013` — Overwriting User-Customized Event Classifications <sub>(medium)</sub>
Re-decoding operations silently replace user-modified events marked as CUSTOMIZED without explicit user action. User edits to event classifications are permanently lost, causing incorrect accounting treatment and potential tax reporting errors that may not be detected until audit.
## rotki, hummingbot, cryptofeed, ccxt (1)
### `AP-CRYPTO-TRADING-001` — Float Arithmetic for Monetary Values <sub>(high)</sub>
Using Python float type instead of Decimal for price, amount, balance, PnL, and other financial calculations causes precision errors due to binary floating-point representation. Rounding errors compound across multiple calculations, leading to incorrect order sizing, wrong profit/loss reporting, and potentially incorrect trading decisions or tax calculations.
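The difference is visible with the smallest possible example. The snippet below uses only the standard library; `order_cost` is a hypothetical helper that takes strings so that no float ever enters the calculation.

```python
from decimal import Decimal

float_total = 0.1 + 0.2                             # binary float: not exactly 0.3
decimal_total = Decimal("0.1") + Decimal("0.2")     # exact decimal arithmetic


def order_cost(price: str, amount: str) -> Decimal:
    """Multiply price by amount exactly, parsing both from strings."""
    return Decimal(price) * Decimal(amount)
```

Constructing `Decimal` from a string matters: `Decimal(0.1)` would inherit the float's binary rounding error.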
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-096--hummingbot
**Scan date**: 2026-04-22
**Stats**: {'total_files': 8, 'total_classes': 41, 'total_functions': 0, 'total_stages': 8}
## Modules (8)
- [market_data_layer](components/market_data_layer.md): 5 classes
- [position_management_layer](components/position_management_layer.md): 4 classes
- [execution_control_(controllers)](components/execution_control_-controllers.md): 6 classes
- [executor_execution_layer](components/executor_execution_layer.md): 6 classes
- [orchestration_layer](components/orchestration_layer.md): 5 classes
- [backtesting_layer](components/backtesting_layer.md): 5 classes
- [cli_interface_layer](components/cli_interface_layer.md): 5 classes
- [gateway_integration](components/gateway_integration.md): 5 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 149
fatal_constraints_count: 55
non_fatal_constraints_count: 248
use_cases_count: 14
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **14**
## `KUC-101`
**Source**: `scripts/v2_funding_rate_arb.py`
Exploits funding rate differences between perpetual exchanges (e.g., Hyperliquid and Binance) to generate risk-adjusted returns using leverage.
## `KUC-102`
**Source**: `scripts/log_price_example.py`
Demonstrates how to retrieve and log market prices (best bid, best ask, mid price) from multiple exchanges for monitoring purposes.
## `KUC-103`
**Source**: `scripts/xrpl_liquidity_example.py`
Provides liquidity on the XRP Ledger (XRPL) decentralized exchange when the price crosses user-defined target levels.
## `KUC-104`
**Source**: `scripts/download_order_book_and_trades.py`
Collects and exports historical order book snapshots and trade data to CSV files for backtesting and analysis.
## `KUC-105`
**Source**: `scripts/external_events_example.py`
Demonstrates how to use the MQTT external events plugin to subscribe to external topics and receive/process messages.
## `KUC-106`
**Source**: `scripts/amm_data_feed_example.py`
Fetches and displays real-time price data from AMM DEX connectors (Jupiter, Uniswap) via the Gateway for analysis.
## `KUC-107`
**Source**: `scripts/screener_volatility.py`
Scans multiple trading pairs to identify highly volatile instruments using Bollinger Bands width, percentage, and NATR indicators.
## `KUC-108`
**Source**: `scripts/simple_xemm.py`
Places maker orders on one exchange and immediately hedges filled orders on another exchange to capture the spread.
## `KUC-109`
**Source**: `scripts/format_status_example.py`
Demonstrates how to add custom status display formatting to a strategy using format_status method and market status dataframes.
## `KUC-110`
**Source**: `scripts/simple_pmm.py`
Provides liquidity by placing bid and ask orders around the mid price with configurable spreads, refreshing at set intervals.
## `KUC-111`
**Source**: `scripts/v2_with_controllers.py`
Runs a multi-controller strategy with features like cash-out scheduling, dynamic config reloading, and drawdown protection.
## `KUC-112`
**Source**: `scripts/candles_example.py`
Demonstrates how to initialize and consume candles data feeds for multiple trading pairs and timeframes without requiring market trading.
## `KUC-113`
**Source**: `scripts/xrpl_arb_example.py`
Exploits price differences between XRPL decentralized exchange and centralized exchanges for risk-free profit opportunities.
## `KUC-114`
**Source**: `scripts/simple_vwap.py`
Executes large buy or sell orders using Volume Weighted Average Price algorithm, splitting orders to minimize market impact.
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **8**
## `CW-CRYPTO-TRADING-001` — Decimal Type for All Monetary Values
**From**: rotki, hummingbot, cryptofeed, ccxt · **Applicable to**: crypto-trading
All four projects mandate Decimal type for price, amount, balance, quantity, and PnL fields. Float arithmetic causes rounding errors that compound across financial calculations, leading to incorrect order sizing and reporting. Always use Decimal for any value representing money in crypto trading systems.
## `CW-CRYPTO-TRADING-002` — Initialize Data Structures Before Access
**From**: ccxt, cryptofeed, rotki · **Applicable to**: crypto-trading
Projects consistently require explicit initialization before data access: load_markets() before symbol lookups, check symbol population before mapping access, establish RPC connections before queries. Skipping initialization causes KeyError, AttributeError, or silent data corruption that breaks downstream operations.
## `CW-CRYPTO-TRADING-003` — Precise String Arithmetic for Financial Calculations
**From**: ccxt · **Applicable to**: crypto-trading
CCXT mandates Precise.string_* static methods (string_mul, string_div, string_add, string_sub) for monetary calculations to avoid floating-point precision errors. This is especially critical for high-precision exchange data where rounding errors cause incorrect order costs, fees, and balances that may result in financial loss.
## `CW-CRYPTO-TRADING-004` — Respect Exchange Rate Limits
**From**: ccxt · **Applicable to**: crypto-trading
Disabling rate limiting via enableRateLimit=False causes HTTP 429 responses and potential temporary or permanent API key suspension by exchanges. CCXT enforces rate limits per IP/API key pair, and bypassing throttle() gates results in compliance violations that disrupt all trading activity until exchanges lift bans.
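
For code paths that do not go through a library's built-in limiter, a client-side throttle keeps requests spaced out. The following is a sketch under stated assumptions: the `Throttle` class is hypothetical and is not CCXT's internal throttler; the clock and sleep functions are injectable only to make the behavior easy to verify.

```python
import time
from typing import Callable


class Throttle:
    """Hypothetical throttle enforcing a minimum gap between requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: float | None = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last_call = self._clock()
        return delay
```

Calling `throttle.wait()` immediately before each request is enough; the first call never sleeps.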
## `CW-CRYPTO-TRADING-005` — Inverse Contract Price Adjustment
**From**: ccxt, hummingbot · **Applicable to**: crypto-trading
Perpetual swap cost calculations require applying inverse price adjustment (1/price) before multiplying by contractSize for inverse contracts. Incorrect cost calculation causes wrong position sizing, leading to unexpected liquidation or insufficient margin for perpetual trading positions.
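
The adjustment can be illustrated as follows. This is a sketch of the arithmetic only, assuming a hypothetical `contract_cost` helper; it is not the cost function of ccxt or hummingbot, and the contract sizes are example values.

```python
from decimal import Decimal


def contract_cost(amount: str, price: str, contract_size: str, inverse: bool) -> Decimal:
    """Hypothetical helper: cost of `amount` contracts at `price`."""
    a, p, size = Decimal(amount), Decimal(price), Decimal(contract_size)
    # Inverse contracts are settled in the base currency, so the price is inverted first.
    effective_price = Decimal("1") / p if inverse else p
    return a * effective_price * size


# Linear: 2 contracts of 0.001 BTC each at 50000, cost expressed in the quote currency.
linear_cost = contract_cost("2", "50000", "0.001", inverse=False)

# Inverse: 10 contracts of 100 USD each at 50000, cost expressed in the base currency.
inverse_cost = contract_cost("10", "50000", "100", inverse=True)
```

Forgetting the inversion in the second case would yield 50,000,000 instead of 0.02, which shows how large the sizing error can be.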
## `CW-CRYPTO-TRADING-006` — Strict Connection Lifecycle Ordering
**From**: cryptofeed, ccxt · **Applicable to**: crypto-trading
Both projects enforce strict execution order for connection operations: cryptofeed requires authenticate -> subscribe -> message handler sequence, while ccxt mandates connect -> on_connected_callback -> subscriptions -> on_close_callback. Out-of-order operations cause subscription failures and no data flow through connections.
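
A small state machine is one way to enforce such an ordering. The sketch below models the authenticate -> subscribe -> message handler sequence with a hypothetical `FeedConnection` class; it is not code from cryptofeed or ccxt.

```python
from enum import Enum, auto


class Stage(Enum):
    NEW = auto()
    AUTHENTICATED = auto()
    SUBSCRIBED = auto()


class FeedConnection:
    """Hypothetical connection that rejects out-of-order lifecycle calls."""

    def __init__(self) -> None:
        self.stage = Stage.NEW
        self.channels: list[str] = []
        self.received: list[str] = []

    def _require(self, expected: Stage, action: str) -> None:
        if self.stage is not expected:
            raise RuntimeError(
                f"{action} called in stage {self.stage.name}, expected {expected.name}"
            )

    def authenticate(self) -> None:
        self._require(Stage.NEW, "authenticate")
        self.stage = Stage.AUTHENTICATED

    def subscribe(self, channel: str) -> None:
        self._require(Stage.AUTHENTICATED, "subscribe")
        self.channels.append(channel)
        self.stage = Stage.SUBSCRIBED

    def handle_message(self, message: str) -> None:
        self._require(Stage.SUBSCRIBED, "handle_message")
        self.received.append(message)
```

Raising immediately on an out-of-order call turns a silent "no data flows" symptom into an error at the call site.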
## `CW-CRYPTO-TRADING-007` — Validate Input Data Structure Before Processing
**From**: rotki, cryptofeed · **Applicable to**: crypto-trading
Rotki validates EVM address checksum format before RPC calls; cryptofeed checks Symbols.populated() before symbol mapping access. Validating data structure before processing prevents downstream crashes (KeyError, InvalidAddress) and data corruption that is harder to debug when symptoms appear in unrelated code paths.
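
As an illustration of validating structure up front, the sketch below checks only the shape of an EVM address (a `0x` prefix followed by 40 hexadecimal characters). It is an assumption-laden stand-in, not rotki's validator: a full EIP-55 checksum check needs Keccak-256, which the Python standard library does not provide, so the `looks_like_evm_address` helper deliberately stops at the structural check.

```python
import re

# Structural pattern only: "0x" followed by exactly 40 hex characters.
_EVM_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def looks_like_evm_address(value: str) -> bool:
    """Hypothetical pre-check to run before any RPC call is made."""
    return _EVM_ADDRESS.fullmatch(value) is not None


candidates = [
    "0x" + "ab" * 20,   # well-formed shape
    "0x1234",           # too short
    "ab" * 21,          # missing the 0x prefix
]
accepted = [address for address in candidates if looks_like_evm_address(address)]
```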
## `CW-CRYPTO-TRADING-008` — Validate Order Sizes Against Exchange Minimums
**From**: hummingbot · **Applicable to**: crypto-trading
DCAExecutor amounts must be validated against min_notional_size and amounts_quote/prices against min_order_size before execution. Orders below exchange minimums are rejected, breaking strategy execution and potentially leaving positions partially unfilled at unfavorable prices.
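
The two checks described above can be sketched as a pre-flight validation. The function `validate_dca_levels` is hypothetical and is not hummingbot's implementation; the parameter names follow the wording of this entry, and the thresholds are example values.

```python
from decimal import Decimal


def validate_dca_levels(
    amounts_quote: list[Decimal],
    prices: list[Decimal],
    min_notional_size: Decimal,
    min_order_size: Decimal,
) -> list[str]:
    """Return a list of violations; an empty list means every level is acceptable."""
    errors: list[str] = []
    for level, (quote, price) in enumerate(zip(amounts_quote, prices)):
        if quote < min_notional_size:
            errors.append(f"level {level}: notional {quote} below minimum {min_notional_size}")
        base_amount = quote / price  # order size in base currency
        if base_amount < min_order_size:
            errors.append(f"level {level}: size {base_amount} below minimum {min_order_size}")
    return errors


violations = validate_dca_levels(
    amounts_quote=[Decimal("50"), Decimal("5")],
    prices=[Decimal("25000"), Decimal("24000")],
    min_notional_size=Decimal("10"),
    min_order_size=Decimal("0.001"),
)
```

Collecting all violations before any order is sent avoids the partially filled state that this entry warns about.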
FILE:references/components/backtesting_layer.md
# backtesting_layer (5 classes)
## `BacktestingEngineBase.run_backtesting`
`backtesting_layer/backtestingenginebase-run-backtesting.py:0`
## `ExecutorSimulatorBase.simulate_fill`
`backtesting_layer/executorsimulatorbase-simulate-fill.py:0`
## `PositionExecutorSimulator.evaluate_barriers`
`backtesting_layer/positionexecutorsimulator-evaluate-barri.py:0`
## `DCAExecutorSimulator.get_break_even_price`
`backtesting_layer/dcaexecutorsimulator-get-break-even-pric.py:0`
## `backtesting_resolution`
`backtesting_layer/backtesting-resolution.py:0`
FILE:references/components/cli_interface_layer.md
# cli_interface_layer (5 classes)
## `HummingbotApplication.start`
`cli_interface_layer/hummingbotapplication-start.py:0`
## `CreateCommand.create`
`cli_interface_layer/createcommand-create.py:0`
## `StartCommand.start`
`cli_interface_layer/startcommand-start.py:0`
## `ConfigCommand.config`
`cli_interface_layer/configcommand-config.py:0`
## `rate_source`
`cli_interface_layer/rate-source.py:0`
FILE:references/components/execution_control_-controllers.md
# execution_control_(controllers) (6 classes)
## `ControllerBase.determine_executor_actions`
`execution_control_(controllers)/controllerbase-determine-executor-action.py:0`
## `DirectionalTradingControllerBase.evaluate_signals`
`execution_control_(controllers)/directionaltradingcontrollerbase-evaluat.py:0`
## `MarketMakingControllerBase.get_level_configs`
`execution_control_(controllers)/marketmakingcontrollerbase-get-level-con.py:0`
## `ControllerBase.update_processed_data`
`execution_control_(controllers)/controllerbase-update-processed-data.py:0`
## `trading_strategy`
`execution_control_(controllers)/trading-strategy.py:0`
## `execution_strategy`
`execution_control_(controllers)/execution-strategy.py:0`
FILE:references/components/executor_execution_layer.md
# executor_execution_layer (6 classes)
## `ExecutorBase.execute_actions`
`executor_execution_layer/executorbase-execute-actions.py:0`
## `PositionExecutor.control_barriers`
`executor_execution_layer/positionexecutor-control-barriers.py:0`
## `DCAExecutor.execute_trade`
`executor_execution_layer/dcaexecutor-execute-trade.py:0`
## `ArbitrageExecutor.check_arbitrage_opportunity`
`executor_execution_layer/arbitrageexecutor-check-arbitrage-opport.py:0`
## `executor_type`
`executor_execution_layer/executor-type.py:0`
## `close_type`
`executor_execution_layer/close-type.py:0`
FILE:references/components/gateway_integration.md
# gateway_integration (5 classes)
## `GatewayHttpClient.get_balance`
`gateway_integration/gatewayhttpclient-get-balance.py:0`
## `GatewayHttpClient.amm_swap`
`gateway_integration/gatewayhttpclient-amm-swap.py:0`
## `GatewayHttpClient.clmm_position_info`
`gateway_integration/gatewayhttpclient-clmm-position-info.py:0`
## `GatewayHttpClient.get_allowances`
`gateway_integration/gatewayhttpclient-get-allowances.py:0`
## `gateway_chain`
`gateway_integration/gateway-chain.py:0`
FILE:references/components/market_data_layer.md
# market_data_layer (5 classes)
## `MarketDataProvider.get_price_by_indicator`
`market_data_layer/marketdataprovider-get-price-by-indicato.py:0`
## `MarketDataProvider.get_order_book`
`market_data_layer/marketdataprovider-get-order-book.py:0`
## `MarketDataProvider.get_candles`
`market_data_layer/marketdataprovider-get-candles.py:0`
## `RateOracle.get_rate`
`market_data_layer/rateoracle-get-rate.py:0`
## `price_data_sources`
`market_data_layer/price-data-sources.py:0`
FILE:references/components/orchestration_layer.md
# orchestration_layer (5 classes)
## `ExecutorOrchestrator.execute_actions`
`orchestration_layer/executororchestrator-execute-actions.py:0`
## `StrategyV2Base.on_tick`
`orchestration_layer/strategyv2base-on-tick.py:0`
## `MarketsRecorder.save_filled_order`
`orchestration_layer/marketsrecorder-save-filled-order.py:0`
## `PositionHold.get_unified_position`
`orchestration_layer/positionhold-get-unified-position.py:0`
## `backtesting_engine`
`orchestration_layer/backtesting-engine.py:0`
FILE:references/components/position_management_layer.md
# position_management_layer (4 classes)
## `PerpetualTrading.get_position`
`position_management_layer/perpetualtrading-get-position.py:0`
## `BudgetChecker.adjust_candidate`
`position_management_layer/budgetchecker-adjust-candidate.py:0`
## `PerpetualTrading.get_funding_info`
`position_management_layer/perpetualtrading-get-funding-info.py:0`
## `position_mode`
`position_management_layer/position-mode.py:0`
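Provides specialized computation of quantitative finance metrics such as annualized volatility, exponentially weighted moving average (EMA), and exponentially weighted standard deviation, with flexible dimension enum-to-string overrides, for financial time series analysis and asset pricing modeling.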
---
name: gs-quant-pricing
description: |-
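Provides specialized computation of quantitative finance metrics such as annualized volatility, exponentially weighted moving average (EMA), and exponentially weighted standard deviation, with flexible dimension enum-to-string overrides, for financial time series analysis and asset pricing modeling.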
license: Proprietary. See LICENSE.txt in project root.
compatibility: Designed for Doramagic-host ecosystem (Claude Code / openclaw / Cursor). Requires Python 3.12+ with uv package manager.
metadata:
version: "v6.1"
blueprint_id: "finance-bp-020"
compiled_at: "2026-04-22T13:00:16.803564+00:00"
capability_markets: "unspecified"
capability_activities: "finance-analytics"
sop_version: "crystal-compilation-v6.1"
---
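# GS Quant Risk Pricing (gs-quant-pricing)
> Provides specialized computation of quantitative finance metrics such as annualized volatility, exponentially weighted moving average (EMA), and exponentially weighted standard deviation, with flexible dimension enum-to-string overrides, for financial time series analysis and asset pricing modeling.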
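
To make the named metrics concrete, here is a standard-library sketch of an EMA and an annualized volatility. It rests on stated assumptions and is not the gs-quant implementation: the smoothing convention (`alpha` applied to the newest value), the use of simple returns, the sample standard deviation, and the 252 trading days per year are all illustrative choices, and the function names are hypothetical.

```python
import math
import statistics


def ema(values: list[float], alpha: float) -> list[float]:
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def annualized_volatility(prices: list[float], periods_per_year: int = 252) -> float:
    """Sample standard deviation of simple returns, scaled to one year."""
    returns = [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


smoothed = ema([1.0, 2.0, 3.0], alpha=0.5)
vol = annualized_volatility([100.0, 110.0, 99.0])
```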
## Pipeline
`data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization`
## Top Use Cases (0 total)
**Execute trigger**: `When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)`
## What I'll Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
## Semantic Locks (Fatal)
| ID | Rule | On Violation |
|---|---|---|
| `SL-01` | Execute sell orders before buy orders in every trading cycle | halt |
| `SL-02` | Trading signals MUST use next-bar execution (no look-ahead) | halt |
| `SL-03` | Entity IDs MUST follow format entity_type_exchange_code | halt |
| `SL-04` | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt |
| `SL-05` | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt |
| `SL-06` | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt |
| `SL-07` | Transformer MUST run BEFORE Accumulator in factor pipeline | halt |
| `SL-08` | MACD parameters locked: fast=12, slow=26, signal=9 | halt |
Full lock definitions: [references/LOCKS.md](references/LOCKS.md)
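
Locks `SL-01` and `SL-02` can be illustrated with plain Python lists. This is a sketch under stated assumptions: the helpers `shift_to_next_bar` and `sells_first`, the order dictionaries, and the second entity ID are hypothetical and are not part of any referenced framework.

```python
def shift_to_next_bar(signals: list) -> list:
    """SL-02: a signal computed on bar t becomes an order on bar t+1."""
    if not signals:
        return []
    # Bar 0 has no prior signal, so nothing executes there.
    return [None] + list(signals[:-1])


def sells_first(orders: list[dict]) -> list[dict]:
    """SL-01: stable sort placing sell orders ahead of buy orders in one cycle."""
    return sorted(orders, key=lambda order: order["side"] != "sell")


signals = [True, None, False, True]
executed = shift_to_next_bar(signals)

cycle = [
    {"side": "buy", "entity_id": "stock_sh_600000"},
    {"side": "sell", "entity_id": "stock_sz_000001"},
]
ordered = sells_first(cycle)
```

Executing sells first frees cash before buys are sized, and the one-bar shift keeps a signal from trading on the same bar that produced it.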
## Evidence Quality Notice
> [QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-020. Evidence verify ratio = 57.7% and audit fail total = 4. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).
## Reference Files
| File | Contents | When to Load |
|---|---|---|
| [references/seed.yaml](references/seed.yaml) | V6+ full authoritative source of truth | Must read when behavior or decisions are in dispute |
| [references/ANTI_PATTERNS.md](references/ANTI_PATTERNS.md) | 0 cross-project anti-patterns | Before starting implementation |
| [references/WISDOM.md](references/WISDOM.md) | Cross-project wisdom | When making architecture decisions |
| [references/CONSTRAINTS.md](references/CONSTRAINTS.md) | Domain + fatal constraints | When rules conflict |
| [references/USE_CASES.md](references/USE_CASES.md) | All KUC-* business scenarios | When full examples are needed |
| [references/LOCKS.md](references/LOCKS.md) | SL-* + preconditions + hints | Before generating backtest/trading code |
| [references/COMPONENTS.md](references/COMPONENTS.md) | AST component map (split by module) | When looking up APIs |
---
*Compiled by Doramagic crystal-compilation-v6.1 from `finance-bp-020` blueprint at 2026-04-22T13:00:16.803564+00:00.*
*See [human_summary.md](human_summary.md) for non-technical overview.*
FILE:human_summary.md
# Human Summary
> I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
## What I Can Do
- **tagline**: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow. Just tell me what you want; I'll write the code, you don't have to dig docs. (Heads up: ZVT natively supports A-share, HK, and crypto. US stocks — stockus_nasdaq_AAPL — are half-baked; don't bother for serious work.)
- **use_cases**:
  - A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
  - End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader
  - Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
  - Index composition data collection (SZ1000, SZ2000) with EM recorder
  - Institutional fund holdings tracker via joinquant_fund_runner pattern
  - Custom Transformer + Accumulator factor with per-entity rolling state
  - Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
## What I Ask You
- Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
- Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
- Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
- Time range: start_timestamp and end_timestamp for backtest period
- Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
FILE:references/ANTI_PATTERNS.md
# Anti-Patterns (Cross-Project)
Total: **0**
FILE:references/COMPONENTS.md
# Component Capability Map
**Project**: finance-bp-020--gs-quant
**Scan date**: 2026-04-22
**Stats**: {'total_files': 10, 'total_classes': 59, 'total_functions': 0, 'total_stages': 10}
## Modules (10)
- [data_ingestion_layer](components/data_ingestion_layer.md): 6 classes
- [instrument_modeling](components/instrument_modeling.md): 5 classes
- [pricing_context_management](components/pricing_context_management.md): 5 classes
- [risk_analysis_&_measures](components/risk_analysis_-_measures.md): 5 classes
- [backtesting_engine](components/backtesting_engine.md): 7 classes
- [analytics_processing](components/analytics_processing.md): 6 classes
- [time_series_analysis](components/time_series_analysis.md): 6 classes
- [risk_model_management](components/risk_model_management.md): 6 classes
- [reporting_&_performance_analytics](components/reporting_-_performance_analytics.md): 8 classes
- [entity_management](components/entity_management.md): 5 classes
FILE:references/CONSTRAINTS.md
# Constraints
## preservation_manifest
```yaml
required_objects:
business_decisions_count: 97
fatal_constraints_count: 46
non_fatal_constraints_count: 239
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
```
FILE:references/LOCKS.md
# Semantic Locks + Preconditions
## Semantic Locks (12)
### `SL-01` <sub>(on_violation: fatal)</sub>
Execute sell orders before buy orders in every trading cycle
### `SL-02` <sub>(on_violation: fatal)</sub>
Trading signals MUST use next-bar execution (no look-ahead)
### `SL-03` <sub>(on_violation: fatal)</sub>
Entity IDs MUST follow format entity_type_exchange_code
### `SL-04` <sub>(on_violation: fatal)</sub>
DataFrame index MUST be MultiIndex (entity_id, timestamp)
### `SL-05` <sub>(on_violation: fatal)</sub>
TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount
### `SL-06` <sub>(on_violation: fatal)</sub>
filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION
### `SL-07` <sub>(on_violation: fatal)</sub>
Transformer MUST run BEFORE Accumulator in factor pipeline
### `SL-08` <sub>(on_violation: fatal)</sub>
MACD parameters locked: fast=12, slow=26, signal=9
### `SL-09` <sub>(on_violation: warning)</sub>
Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001
### `SL-10` <sub>(on_violation: fatal)</sub>
A-share equity trading is T+1 (no same-day close of buy positions)
### `SL-11` <sub>(on_violation: fatal)</sub>
Recorder subclass MUST define provider AND data_schema class attributes
### `SL-12` <sub>(on_violation: fatal)</sub>
Factor result_df MUST contain either 'filter_result' OR 'score_result' column
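
Locks `SL-05` and `SL-06` lend themselves to mechanical checks. The sketch below assumes hypothetical helpers `check_exactly_one_sizing` and `action_for`; the field names come from the lock text, while the functions themselves are illustrative and not part of any referenced library.

```python
import math


def check_exactly_one_sizing(position_pct=None, order_money=None, order_amount=None) -> str:
    """SL-05: exactly one sizing field may be set; return the name of that field."""
    fields = {
        "position_pct": position_pct,
        "order_money": order_money,
        "order_amount": order_amount,
    }
    provided = [name for name, value in fields.items() if value is not None]
    if len(provided) != 1:
        raise ValueError(f"expected exactly one sizing field, got {provided}")
    return provided[0]


def action_for(filter_result) -> str:
    """SL-06: True means BUY, False means SELL, None or NaN means no action."""
    if filter_result is None:
        return "NO_ACTION"
    if isinstance(filter_result, float) and math.isnan(filter_result):
        return "NO_ACTION"
    return "BUY" if filter_result else "SELL"
```

The NaN check comes before the truthiness test because `bool(float("nan"))` is `True`, which would otherwise be misread as a buy signal.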
## Preconditions (4)
- **`PC-01`**: `python3 -c 'import zvt; print(zvt.__version__)'` → on_fail: Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories
- **`PC-02`**: `python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1); assert df is not None and len(df) > 0, 'No kdata found'"` → on_fail: Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace with your target entity IDs)
- **`PC-03`**: `python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); assert zvt_home.exists(), f'ZVT home not found: {zvt_home}'"` → on_fail: Run: python3 -m zvt.init_dirs
- **`PC-04`**: `python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home() / '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"` → on_fail: Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location
FILE:references/USE_CASES.md
# Known Use Cases (KUC)
Total: **0**
FILE:references/WISDOM.md
# Cross-Project Wisdom
Total: **0**
FILE:references/components/analytics_processing.md
# analytics_processing (6 classes)
## `BaseProcessor.process`
`analytics_processing/baseprocessor-process.py:0`
## `BaseProcessor.update`
`analytics_processing/baseprocessor-update.py:0`
## `DataGrid.poll`
`analytics_processing/datagrid-poll.py:0`
## `VolatilityProcessor.compute`
`analytics_processing/volatilityprocessor-compute.py:0`
## `Processor type`
`analytics_processing/processor-type.py:0`
## `Returns type`
`analytics_processing/returns-type.py:0`
FILE:references/components/backtesting_engine.md
# backtesting_engine (7 classes)
## `Strategy.run`
`backtesting_engine/strategy-run.py:0`
## `GenericEngine.run_backtest`
`backtesting_engine/genericengine-run-backtest.py:0`
## `PredefinedAssetEngine.run`
`backtesting_engine/predefinedassetengine-run.py:0`
## `BackTest.get_results`
`backtesting_engine/backtest-get-results.py:0`
## `TransactionCostModel`
`backtesting_engine/transactioncostmodel.py:0`
## `CashAccrualModel`
`backtesting_engine/cashaccrualmodel.py:0`
## `BacktestEngine`
`backtesting_engine/backtestengine.py:0`
FILE:references/components/data_ingestion_layer.md
# data_ingestion_layer (6 classes)
## `Dataset.get_data`
`data_ingestion_layer/dataset-get-data.py:0`
## `Dataset.get_data_series`
`data_ingestion_layer/dataset-get-data-series.py:0`
## `GsDataApi.query`
`data_ingestion_layer/gsdataapi-query.py:0`
## `MarqueeDataIngestionLibrary.create_dataset`
`data_ingestion_layer/marqueedataingestionlibrary-create-datas.py:0`
## `DataProvider`
`data_ingestion_layer/dataprovider.py:0`
## `MissingDataStrategy`
`data_ingestion_layer/missingdatastrategy.py:0`
FILE:references/components/entity_management.md
# entity_management (5 classes)
## `Entity.get`
`entity_management/entity-get.py:0`
## `PositionedEntity.get_latest_position_set`
`entity_management/positionedentity-get-latest-position-set.py:0`
## `Country.get_region_mapping`
`entity_management/country-get-region-mapping.py:0`
## `Entity.get_entitlements`
`entity_management/entity-get-entitlements.py:0`
## `Entity identifier type`
`entity_management/entity-identifier-type.py:0`
FILE:references/components/instrument_modeling.md
# instrument_modeling (5 classes)
## `Priceable.resolve`
`instrument_modeling/priceable-resolve.py:0`
## `Priceable.calc`
`instrument_modeling/priceable-calc.py:0`
## `Portfolio.price`
`instrument_modeling/portfolio-price.py:0`
## `Grid.parameterize`
`instrument_modeling/grid-parameterize.py:0`
## `Instrument resolution`
`instrument_modeling/instrument-resolution.py:0`
FILE:references/components/pricing_context_management.md
# pricing_context_management (5 classes)
## `PricingContext.__enter__`
`pricing_context_management/pricingcontext-enter.py:0`
## `PricingContext.__exit__`
`pricing_context_management/pricingcontext-exit.py:0`
## `ContextBase.push`
`pricing_context_management/contextbase-push.py:0`
## `ContextBaseWithDefault.default_value`
`pricing_context_management/contextbasewithdefault-default-value.py:0`
## `Context default behavior`
`pricing_context_management/context-default-behavior.py:0`
FILE:references/components/reporting_-_performance_analytics.md
# reporting_&_performance_analytics (8 classes)
## `Report.run`
`reporting_&_performance_analytics/report-run.py:0`
## `PerformanceReport.get_results`
`reporting_&_performance_analytics/performancereport-get-results.py:0`
## `FactorRiskReport.get_results`
`reporting_&_performance_analytics/factorriskreport-get-results.py:0`
## `FactorRiskReport.get_view`
`reporting_&_performance_analytics/factorriskreport-get-view.py:0`
## `PortfolioManager.get_aum`
`reporting_&_performance_analytics/portfoliomanager-get-aum.py:0`
## `PerformanceReport.get_brinson_attribution`
`reporting_&_performance_analytics/performancereport-get-brinson-attributio.py:0`
## `Report type`
`reporting_&_performance_analytics/report-type.py:0`
## `Return format`
`reporting_&_performance_analytics/return-format.py:0`
FILE:references/components/risk_analysis_-_measures.md
# risk_analysis_&_measures (5 classes)
## `RiskMeasure.calculate`
`risk_analysis_&_measures/riskmeasure-calculate.py:0`
## `FloatWithInfo.value`
`risk_analysis_&_measures/floatwithinfo-value.py:0`
## `DataFrameWithInfo.filter`
`risk_analysis_&_measures/dataframewithinfo-filter.py:0`
## `aggregate_risk`
`risk_analysis_&_measures/aggregate-risk.py:0`
## `Risk measure`
`risk_analysis_&_measures/risk-measure.py:0`
FILE:references/components/risk_model_management.md
# risk_model_management (6 classes)
## `RiskModel.get_data`
`risk_model_management/riskmodel-get-data.py:0`
## `FactorRiskModel.get_asset_factor_attribution`
`risk_model_management/factorriskmodel-get-asset-factor-attribu.py:0`
## `MacroRiskModel.get_fair_value_gap`
`risk_model_management/macroriskmodel-get-fair-value-gap.py:0`
## `ThematicRiskModel.get_exposure`
`risk_model_management/thematicriskmodel-get-exposure.py:0`
## `FactorRiskModel.upload_partial_data`
`risk_model_management/factorriskmodel-upload-partial-data.py:0`
## `Return format`
`risk_model_management/return-format.py:0`
FILE:references/components/time_series_analysis.md
# time_series_analysis (6 classes)
## `ExtendedSeries.returns`
`time_series_analysis/extendedseries-returns.py:0`
## `LinearRegression.fit`
`time_series_analysis/linearregression-fit.py:0`
## `RollingLinearRegression.rolling_fit`
`time_series_analysis/rollinglinearregression-rolling-fit.py:0`
## `vol_swap_volatility`
`time_series_analysis/vol-swap-volatility.py:0`
## `Returns type`
`time_series_analysis/returns-type.py:0`
## `Mean type`
`time_series_analysis/mean-type.py:0`
FILE:references/seed.yaml
meta:
id: finance-bp-020-v5.3
version: v6.1
blueprint_id: finance-bp-020
sop_version: crystal-compilation-v6.1
source_language: en
compiled_at: '2026-04-22T13:00:16.803564+00:00'
target_host: openclaw
authoritative_artifact:
primary: seed.yaml
non_authoritative_derivatives:
- SKILL.md (host-generated summary, may lag)
- HEARTBEAT.md (host telemetry)
- memory/*.md (host conversational memory)
rule: On any behavioral decision (preconditions check, OV assertion, EQ rule firing, spec_lock verification), agents MUST
re-read seed.yaml. Derivatives are for UI display only and may be out-of-date.
execution_protocol:
install_trigger:
- Execute resources.host_adapter.install_recipes[] in declared order
- Verify each package with import check before proceeding
execute_trigger: When user intent matches intent_router.uc_entries[].positive_terms AND user uses action verb (run/execute/跑/执行/backtest/fetch/collect)
on_execute:
- Reload seed.yaml (do not rely on SKILL.md or cached summaries)
- Run preconditions[] in declared order; halt on first fatal failure with on_fail message to user
- Enter context_state_machine.CA1_MEMORY_CHECKED state
- Evaluate evidence_quality.enforcement_rules[]; prepend user_disclosure_template
- Translate user_facing_fields to user locale per locale_contract
- "[V6 READING ORDER]\nThis crystal contains the following V6 layers. Before answering any business question, the host\
\ MUST read them in order:\n 1. anti_patterns[] — cross-project anti-patterns (with AP-* ids)\n 2. cross_project_wisdom[]\
\ — cross-project wisdom (with CW-* ids)\n 3. domain_constraints_injected[] — domain constraints (SHARED-* ids)\n \
\ 4. known_use_cases[] — concrete business scenarios (KUC-* ids)\n 5. component_capability_map — AST component map\
\ (by module)\n\nWhen answering user questions, proactively cite relevant AP-*/CW-*/SHARED-*/KUC-* ids with source text.\
\ Examples: T+1 rules -> cite SHARED-* constraint; model comparison -> warn via AP-*; follow-holdings strategy -> cite\
\ KUC-* with example file."
workspace_resolution:
scripts_path: '{host_workspace}/scripts/'
skills_path: '{host_workspace}/skills/'
trace_path: '{host_workspace}/.trace/'
capability_tags:
markets:
- unspecified
activities:
- finance-analytics
upgraded_from: finance-bp-020-v1.seed.yaml
upgraded_at: '2026-04-22T13:20:10.885044+00:00'
v6_inputs:
ast_mind_map: knowledge/sources/finance/finance-bp-020--gs-quant/v6_inputs/ast_mind_map.yaml
anti_patterns: null
cross_project_wisdom: null
examples_kuc: knowledge/sources/finance/finance-bp-020--gs-quant/v6_inputs/examples_kuc.yaml
shared_pools_dir: knowledge/sources/finance/_shared
domain_constraints_injected: []
resources_injected: {}
component_capability_map:
project: finance-bp-020--gs-quant
scan_date: '2026-04-22'
stats:
total_files: 10
total_classes: 59
total_functions: 0
total_stages: 10
modules:
data_ingestion_layer:
class_count: 6
stage_id: data_ingestion
stage_order: 1
responsibility: Retrieves market data from Marquee/Goldman Sachs APIs. Provides time-series, coordinate-based, and batch
data retrieval with caching and async support. This layer exists because quantitative analysis requires reliable,
performant access to market data with proper handling of data quality issues.
classes:
- name: Dataset.get_data
file: data_ingestion_layer/dataset-get-data.py
line: 0
kind: required_method
signature: ''
- name: Dataset.get_data_series
file: data_ingestion_layer/dataset-get-data-series.py
line: 0
kind: required_method
signature: ''
- name: GsDataApi.query
file: data_ingestion_layer/gsdataapi-query.py
line: 0
kind: required_method
signature: ''
- name: MarqueeDataIngestionLibrary.create_dataset
file: data_ingestion_layer/marqueedataingestionlibrary-create-datas.py
line: 0
kind: required_method
signature: ''
- name: DataProvider
file: data_ingestion_layer/dataprovider.py
line: 0
kind: replaceable_point
- name: MissingDataStrategy
file: data_ingestion_layer/missingdatastrategy.py
line: 0
kind: replaceable_point
design_decision_count: 3
instrument_modeling:
class_count: 5
stage_id: instrument_modeling
stage_order: 2
responsibility: Represents financial instruments (swaps, options, equities) with resolution, pricing, and risk calculation
capabilities. This layer exists to provide a unified abstraction for diverse financial products that can be priced
and risk-analyzed consistently.
classes:
- name: Priceable.resolve
file: instrument_modeling/priceable-resolve.py
line: 0
kind: required_method
signature: ''
- name: Priceable.calc
file: instrument_modeling/priceable-calc.py
line: 0
kind: required_method
signature: ''
- name: Portfolio.price
file: instrument_modeling/portfolio-price.py
line: 0
kind: required_method
signature: ''
- name: Grid.parameterize
file: instrument_modeling/grid-parameterize.py
line: 0
kind: required_method
signature: ''
- name: Instrument resolution
file: instrument_modeling/instrument-resolution.py
line: 0
kind: replaceable_point
design_decision_count: 3
pricing_context_management:
class_count: 5
stage_id: pricing_context
stage_order: 3
responsibility: Thread-local and async-aware context for market data and pricing parameters. Enables nested contexts
with shadowing for scenario analysis. This layer exists to provide thread-safe, composable pricing environments that
can be entered and exited safely.
classes:
- name: PricingContext.__enter__
file: pricing_context_management/pricingcontext-enter.py
line: 0
kind: required_method
signature: ''
- name: PricingContext.__exit__
file: pricing_context_management/pricingcontext-exit.py
line: 0
kind: required_method
signature: ''
- name: ContextBase.push
file: pricing_context_management/contextbase-push.py
line: 0
kind: required_method
signature: ''
- name: ContextBaseWithDefault.default_value
file: pricing_context_management/contextbasewithdefault-default-value.py
line: 0
kind: required_method
signature: ''
- name: Context default behavior
file: pricing_context_management/context-default-behavior.py
line: 0
kind: replaceable_point
design_decision_count: 4
risk_analysis_&_measures:
class_count: 5
stage_id: risk_analysis
stage_order: 4
responsibility: Calculates risk measures (greeks, VaR, scenario P&L) for instruments and portfolios with result aggregation.
This layer exists to provide standardized, comparable risk quantification across all instrument types with full audit
trails.
classes:
- name: RiskMeasure.calculate
file: risk_analysis_&_measures/riskmeasure-calculate.py
line: 0
kind: required_method
signature: ''
- name: FloatWithInfo.value
file: risk_analysis_&_measures/floatwithinfo-value.py
line: 0
kind: required_method
signature: ''
- name: DataFrameWithInfo.filter
file: risk_analysis_&_measures/dataframewithinfo-filter.py
line: 0
kind: required_method
signature: ''
- name: aggregate_risk
file: risk_analysis_&_measures/aggregate-risk.py
line: 0
kind: required_method
signature: ''
- name: Risk measure
file: risk_analysis_&_measures/risk-measure.py
line: 0
kind: replaceable_point
design_decision_count: 3
backtesting_engine:
class_count: 7
stage_id: backtesting
stage_order: 5
responsibility: Historical simulation of trading strategies with triggers, actions, transaction costs, and risk hedging.
Supports systematic and predefined asset strategies. This layer exists to enable quantitative strategy validation
against historical data with realistic cost modeling and risk management.
classes:
- name: Strategy.run
file: backtesting_engine/strategy-run.py
line: 0
kind: required_method
signature: ''
- name: GenericEngine.run_backtest
file: backtesting_engine/genericengine-run-backtest.py
line: 0
kind: required_method
signature: ''
- name: PredefinedAssetEngine.run
file: backtesting_engine/predefinedassetengine-run.py
line: 0
kind: required_method
signature: ''
- name: BackTest.get_results
file: backtesting_engine/backtest-get-results.py
line: 0
kind: required_method
signature: ''
- name: TransactionCostModel
file: backtesting_engine/transactioncostmodel.py
line: 0
kind: replaceable_point
- name: CashAccrualModel
file: backtesting_engine/cashaccrualmodel.py
line: 0
kind: replaceable_point
- name: BacktestEngine
file: backtesting_engine/backtestengine.py
line: 0
kind: replaceable_point
design_decision_count: 6
analytics_processing:
class_count: 6
stage_id: analytics_processing
stage_order: 6
responsibility: Applies analytical processors (volatility, correlation, returns) to market data with dependency resolution
and async execution. This layer exists to provide composable, reusable analytical computations with automatic parallelization
and shared sub-expression optimization.
classes:
- name: BaseProcessor.process
file: analytics_processing/baseprocessor-process.py
line: 0
kind: required_method
signature: ''
- name: BaseProcessor.update
file: analytics_processing/baseprocessor-update.py
line: 0
kind: required_method
signature: ''
- name: DataGrid.poll
file: analytics_processing/datagrid-poll.py
line: 0
kind: required_method
signature: ''
- name: VolatilityProcessor.compute
file: analytics_processing/volatilityprocessor-compute.py
line: 0
kind: required_method
signature: ''
- name: Processor type
file: analytics_processing/processor-type.py
line: 0
kind: replaceable_point
- name: Returns type
file: analytics_processing/returns-type.py
line: 0
kind: replaceable_point
design_decision_count: 4
time_series_analysis:
class_count: 6
stage_id: timeseries_analysis
stage_order: 7
responsibility: 'Statistical and econometric analysis of financial time series: returns, volatility, correlation, regression
models. This layer exists to provide specialized time series operations optimized for financial data with proper handling
of datetime indices and financial calculations.'
classes:
- name: ExtendedSeries.returns
file: time_series_analysis/extendedseries-returns.py
line: 0
kind: required_method
signature: ''
- name: LinearRegression.fit
file: time_series_analysis/linearregression-fit.py
line: 0
kind: required_method
signature: ''
- name: RollingLinearRegression.rolling_fit
file: time_series_analysis/rollinglinearregression-rolling-fit.py
line: 0
kind: required_method
signature: ''
- name: vol_swap_volatility
file: time_series_analysis/vol-swap-volatility.py
line: 0
kind: required_method
signature: ''
- name: Returns type
file: time_series_analysis/returns-type.py
line: 0
kind: replaceable_point
- name: Mean type
file: time_series_analysis/mean-type.py
line: 0
kind: replaceable_point
design_decision_count: 3
risk_model_management:
class_count: 6
stage_id: risk_models
stage_order: 8
responsibility: Manages Marquee risk models (factor, macro, thematic) with data retrieval and upload capabilities. This
layer exists to provide access to sophisticated risk factor models that decompose portfolio risk into explainable
components.
classes:
- name: RiskModel.get_data
file: risk_model_management/riskmodel-get-data.py
line: 0
kind: required_method
signature: ''
- name: FactorRiskModel.get_asset_factor_attribution
file: risk_model_management/factorriskmodel-get-asset-factor-attribu.py
line: 0
kind: required_method
signature: ''
- name: MacroRiskModel.get_fair_value_gap
file: risk_model_management/macroriskmodel-get-fair-value-gap.py
line: 0
kind: required_method
signature: ''
- name: ThematicRiskModel.get_exposure
file: risk_model_management/thematicriskmodel-get-exposure.py
line: 0
kind: required_method
signature: ''
- name: FactorRiskModel.upload_partial_data
file: risk_model_management/factorriskmodel-upload-partial-data.py
line: 0
kind: required_method
signature: ''
- name: Return format
file: risk_model_management/return-format.py
line: 0
kind: replaceable_point
design_decision_count: 3
reporting_&_performance_analytics:
class_count: 8
stage_id: reporting
stage_order: 9
responsibility: 'Retrieves and analyzes portfolio performance data: P&L, attribution, factor risk, thematic exposure.
This layer exists to provide standardized reporting formats for portfolio analytics with both raw data access and
formatted views.'
classes:
- name: Report.run
file: reporting_&_performance_analytics/report-run.py
line: 0
kind: required_method
signature: ''
- name: PerformanceReport.get_results
file: reporting_&_performance_analytics/performancereport-get-results.py
line: 0
kind: required_method
signature: ''
- name: FactorRiskReport.get_results
file: reporting_&_performance_analytics/factorriskreport-get-results.py
line: 0
kind: required_method
signature: ''
- name: FactorRiskReport.get_view
file: reporting_&_performance_analytics/factorriskreport-get-view.py
line: 0
kind: required_method
signature: ''
- name: PortfolioManager.get_aum
file: reporting_&_performance_analytics/portfoliomanager-get-aum.py
line: 0
kind: required_method
signature: ''
- name: PerformanceReport.get_brinson_attribution
file: reporting_&_performance_analytics/performancereport-get-brinson-attributio.py
line: 0
kind: required_method
signature: ''
- name: Report type
file: reporting_&_performance_analytics/report-type.py
line: 0
kind: replaceable_point
- name: Return format
file: reporting_&_performance_analytics/return-format.py
line: 0
kind: replaceable_point
design_decision_count: 3
entity_management:
class_count: 5
stage_id: entity_management
stage_order: 10
responsibility: Manages first-class entities (assets, countries, KPIs) with data coordinates and entitlements. This
layer exists to provide unified access to financial entities with proper typing, validation, and coordinate resolution
across the platform.
classes:
- name: Entity.get
file: entity_management/entity-get.py
line: 0
kind: required_method
signature: ''
- name: PositionedEntity.get_latest_position_set
file: entity_management/positionedentity-get-latest-position-set.py
line: 0
kind: required_method
signature: ''
- name: Country.get_region_mapping
file: entity_management/country-get-region-mapping.py
line: 0
kind: required_method
signature: ''
- name: Entity.get_entitlements
file: entity_management/entity-get-entitlements.py
line: 0
kind: required_method
signature: ''
- name: Entity identifier type
file: entity_management/entity-identifier-type.py
line: 0
kind: replaceable_point
design_decision_count: 2
data_flow_hints: []
locale_contract:
source_language: en
user_facing_fields:
- human_summary.what_i_can_do.tagline
- human_summary.what_i_can_do.use_cases[]
- human_summary.what_i_auto_fetch[]
- human_summary.what_i_ask_you[]
- evidence_quality.user_disclosure_template
- post_install_notice.message_template.positioning
- post_install_notice.message_template.capability_catalog.groups[].name
- post_install_notice.message_template.capability_catalog.groups[].description
- post_install_notice.message_template.capability_catalog.groups[].ucs[].name
- post_install_notice.message_template.capability_catalog.groups[].ucs[].short_description
- post_install_notice.message_template.call_to_action
- post_install_notice.message_template.featured_entries[].beginner_prompt
- post_install_notice.message_template.more_info_hint
- preconditions[].description
- preconditions[].on_fail
- intent_router.uc_entries[].name
- intent_router.uc_entries[].ambiguity_question
- architecture.pipeline
- architecture.stages[].narrative.does_what
- architecture.stages[].narrative.key_decisions
- architecture.stages[].narrative.common_pitfalls
- constraints.fatal[].consequence
- constraints.regular[].consequence
- output_validator.assertions[].failure_message
- acceptance.hard_gates[].on_fail
- skill_crystallization.action
locale_detection_order:
- explicit_user_declaration
- first_message_language
- system_locale
translation_enforcement:
trigger: on_first_user_message
action: Render user_facing_fields in detected locale, preserving all IDs (BD-/SL-/UC-/finance-C-) and code identifiers
verbatim
violation_code: LOCALE-01
violation_signal: User receives untranslated English Human Summary when detected locale != en
evidence_quality:
declared:
evidence_coverage_ratio: 1.0
evidence_verify_ratio: 0.5769230769230769
evidence_invalid: 33
evidence_verified: 45
evidence_auto_fixed: 0
audit_coverage: 30/30 (100%)
audit_pass_rate: 15/30 (50%)
audit_fail_total: 4
audit_finance_universal:
pass: 9
warn: 8
fail: 3
audit_subdomain_totals:
pass: 6
warn: 3
fail: 1
enforcement_rules:
- id: EQ-01
trigger: declared.evidence_verify_ratio < 0.5
action: MUST invoke traceback lookup for all cited BD-IDs in output before emitting business code — read LATEST.yaml sections
for each BD referenced
violation_code: EQ-01-V
violation_signal: Generated script references BD-IDs but no tool_call to read LATEST.yaml preceded code generation
user_disclosure_template: '[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-020. Evidence verify ratio
= 57.7% and audit fail total = 4. Generated results may have uncaptured requirement gaps. Verify critical decisions against
source files (LATEST.yaml / LATEST.jsonl).'
traceback:
source_files:
blueprint: LATEST.yaml
constraints: LATEST.jsonl
mandatory_lookup_scenarios:
- id: TB-01
condition: Two constraints have apparently conflicting enforcement rules
lookup_target: LATEST.jsonl — find both constraint IDs, compare `consequence` + `evidence_refs` to determine priority
- id: TB-02
condition: A business decision rationale is unclear or disputed
lookup_target: LATEST.yaml — locate BD-ID under business_decisions, read `rationale` + `alternative_considered` fields
- id: TB-03
condition: evidence_invalid > 0 in evidence_quality.declared
lookup_target: LATEST.yaml _enrich_meta — cross-check specific BD `evidence_refs` fields for invalid markers
- id: TB-04
condition: User asks where a rule comes from
lookup_target: LATEST.jsonl — find constraint by ID, read `confidence.evidence_refs` for source file + line number
- id: TB-05
condition: Generated code does not match expected ZVT API behavior
lookup_target: LATEST.yaml stages[].required_methods — verify method signature and evidence locator in source code
degraded_lookup:
no_fs_access: 'Ask the user to paste the relevant LATEST.yaml section or LATEST.jsonl lines for the BD-/finance-C- IDs
in question. Crystal ID: finance-bp-020-v5.0.'
trace_schema:
event_types:
- precondition_check
- spec_lock_check
- evidence_rule_fired
- evidence_rule_skipped
- locale_translation_emitted
- hard_gate_passed
- hard_gate_failed
- skill_emitted
- false_completion_claim
preconditions:
- id: PC-01
description: zvt package installed and importable
check_command: python3 -c 'import zvt; print(zvt.__version__)'
on_fail: 'Run: python3 -m pip install zvt then re-run: python3 -m zvt.init_dirs to initialize data directories'
severity: fatal
- id: PC-02
description: K-data exists for target entities (required before backtesting)
check_command: python3 -c "from zvt.api.kdata import get_kdata; df = get_kdata(entity_ids=['stock_sh_600000'], limit=1);
assert df is not None and len(df) > 0, 'No kdata found'"
on_fail: 'Run recorder first: python3 -m zvt.recorders.em.em_stock_kdata_recorder --entity_ids stock_sh_600000 (replace
with your target entity IDs)'
severity: fatal
applies_to_uc: []
- id: PC-03
description: ZVT data directory initialized (~/.zvt or ZVT_HOME)
check_command: 'python3 -c "import os; from pathlib import Path; zvt_home = Path(os.environ.get(''ZVT_HOME'', Path.home()
/ ''.zvt'')); assert zvt_home.exists(), f''ZVT home not found: {zvt_home}''"'
on_fail: 'Run: python3 -m zvt.init_dirs'
severity: fatal
- id: PC-04
description: SQLite write permission for ZVT data directory
check_command: python3 -c "import os, tempfile; from pathlib import Path; zvt_home = Path(os.environ.get('ZVT_HOME', Path.home()
/ '.zvt')); test_f = zvt_home / '.write_test'; test_f.touch(); test_f.unlink()"
on_fail: 'Check directory permissions: chmod u+w ~/.zvt or set ZVT_HOME environment variable to a writable location'
severity: warn
intent_router:
uc_entries: []
context_state_machine:
states:
- id: CA1_MEMORY_CHECKED
entry: Task started
exit: All memory queries attempted and recorded; memory_unavailable set if failed
timeout: 30s — skip memory, mark memory_unavailable=true, proceed to CA2
- id: CA2_GAPS_FILLED
entry: CA1 complete
exit: 'All FATAL-priority required inputs answered: target market (A-share/HK/US), data source, time range, strategy type'
timeout: NOT skippable — FATAL inputs MUST be user-answered before proceeding
- id: CA3_PATH_SELECTED
entry: CA2 complete
exit: intent_router matched single use case with confidence gap > 20% over next candidate, no data_domain ambiguity
timeout: Trigger ambiguity_question for top-2 candidates, await user selection
- id: CA4_EXECUTING
entry: CA3 complete + user explicit confirmation received
exit: All hard gates G1-Gn passed and output files written
timeout: NOT skippable — user confirmation of execution path required
enforcement: Code generation is PROHIBITED before CA4_EXECUTING. Any regression to an earlier state MUST be announced to the user.
The SL-01 buy/sell ordering check runs at CA4 entry.
spec_lock_registry:
semantic_locks:
- id: SL-01
description: Execute sell orders before buy orders in every trading cycle
locked_value: sell() called before buy() in each Trader.run() iteration
violation_is: fatal
source_bd_ids:
- BD-018
- id: SL-02
description: Trading signals MUST use next-bar execution (no look-ahead)
locked_value: due_timestamp = happen_timestamp + level.to_second()
violation_is: fatal
source_bd_ids:
- BD-014
- BD-025
- id: SL-03
description: Entity IDs MUST follow format entity_type_exchange_code
locked_value: stock_sh_600000 | stockhk_hk_0700 | stockus_nasdaq_AAPL
violation_is: fatal
source_bd_ids: []
- id: SL-04
description: DataFrame index MUST be MultiIndex (entity_id, timestamp)
locked_value: df.index.names == ['entity_id', 'timestamp']
violation_is: fatal
source_bd_ids: []
- id: SL-05
description: 'TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount'
locked_value: XOR enforcement in trading/__init__.py:68
violation_is: fatal
source_bd_ids: []
- id: SL-06
description: 'filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION'
locked_value: factor.py:475 order_type_flag mapping
violation_is: fatal
source_bd_ids: []
- id: SL-07
description: Transformer MUST run BEFORE Accumulator in factor pipeline
locked_value: 'compute_result(): transform at :403 before accumulator at :409'
violation_is: fatal
source_bd_ids: []
- id: SL-08
description: 'MACD parameters locked: fast=12, slow=26, signal=9'
locked_value: factors/algorithm.py:30 macd(slow=26, fast=12, n=9)
violation_is: fatal
source_bd_ids:
- BD-036
- id: SL-09
description: 'Default transaction costs: buy_cost=0.001, sell_cost=0.001, slippage=0.001'
locked_value: sim_account.py:25 SimAccountService default costs
violation_is: warning
source_bd_ids:
- BD-029
- id: SL-10
description: A-share equity trading is T+1 (no same-day close of buy positions)
locked_value: sim_account.available_long filters by trading_t
violation_is: fatal
source_bd_ids: []
- id: SL-11
description: Recorder subclass MUST define provider AND data_schema class attributes
locked_value: contract/recorder.py:71 Meta; register_schema decorator
violation_is: fatal
source_bd_ids: []
- id: SL-12
description: Factor result_df MUST contain either 'filter_result' OR 'score_result' column
locked_value: result_df.columns.intersection({'filter_result', 'score_result'}) non-empty
violation_is: fatal
source_bd_ids: []
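# Illustrative note on SL-02 (a sketch under stated assumptions, not taken from the ZVT source).
# Assuming a daily level for which level.to_second() returns 86400, a signal with
# happen_timestamp = 2024-01-02 00:00:00 gets due_timestamp = 2024-01-03 00:00:00,
# so the order is filled on the next bar rather than on the bar that produced the signal.
# In Python terms (variable names hypothetical):
#   due_timestamp = happen_timestamp + datetime.timedelta(seconds=86400)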
implementation_hints:
- id: IH-01
hint: 'Use AdjustType enum exactly: qfq (pre-adjust), hfq (post-adjust), bfq (none) — contract/__init__.py:121'
- id: IH-02
hint: For A-share kdata, default to hfq for long-term analysis (dividend-adjusted) — trader.py:538 StockTrader
- id: IH-03
hint: SQLite connection MUST use check_same_thread=False for multi-threaded recorders
- id: IH-04
hint: Accumulator state serialization uses JSON with custom encoder/decoder hooks — contract/base_service.py
- id: IH-05
hint: Factor.level MUST match TargetSelector.level (enforced at add_factor) — factors/target_selector.py:84
preservation_manifest:
required_objects:
business_decisions_count: 97
fatal_constraints_count: 46
non_fatal_constraints_count: 239
use_cases_count: 0
semantic_locks_count: 12
preconditions_count: 4
evidence_quality_rules_count: 2
traceback_scenarios_count: 5
architecture:
pipeline: data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
stages:
- id: data_collection
narrative:
does_what: TimeSeriesDataRecorder and FixedCycleDataRecorder fetch OHLCV and fundamental data from providers (eastmoney,
joinquant, baostock, akshare) and persist domain objects (Stock1dKdata, BalanceSheet) to SQLite via df_to_db().
key_decisions: BD-002 chose evaluate_start_end_size_timestamps for incremental fetch (not full refresh) because comparing
to get_latest_saved_record avoids redundant API calls; BD-003 chose get_data_map field transformation to keep domain
schema provider-agnostic.
common_pitfalls: 'Don''t forget SL-11: a Recorder subclass MUST declare both provider and data_schema class attributes;
otherwise initialization fails with an assertion error (finance-C-001 fatal violation).'
business_decisions: []
- id: data_storage
narrative:
does_what: StorageBackend persists DataFrames to per-provider SQLite databases at {data_path}/{provider}/{provider}_{db_name}.db
using path templates from _get_path_template; Mixin.record_data and Mixin.query_data provide uniform read/write interface.
key_decisions: BD-004 chose StorageBackend abstraction (not hardcoded SQLite) to allow future cloud storage swap; BD-006
derives db_name from data_schema __tablename__ for per-domain database isolation.
common_pitfalls: SL-04 violation (wrong DataFrame index) causes factor pipeline failures downstream; always ensure df.index.names
== ['entity_id', 'timestamp'] before calling record_data.
business_decisions: []
- id: factor_computation
narrative:
does_what: Factor.compute() applies Transformer (stateless, e.g. MacdTransformer) then Accumulator (stateful, e.g. MaStatsAccumulator)
to produce filter_result or score_result columns; EntityStateService persists per-entity rolling state across batches.
key_decisions: BD-007 chose Factor inheriting DataReader for composable data access; SL-08 locks MACD at (fast=12, slow=26,
n=9), choosing standard Appel parameters over adaptive ones because interpretability matters for practitioners.
common_pitfalls: 'SL-07: Transformer MUST run before Accumulator — swapping order causes NaN propagation; SL-12: result_df
must contain filter_result OR score_result column or TargetSelector silently drops all signals.'
business_decisions: []
- id: target_selection
narrative:
does_what: TargetSelector.add_factor() registers Factor instances; get_targets() returns entity_ids passing threshold
filter at a specific timestamp, enabling point-in-time historical backtesting without look-ahead.
key_decisions: BD-012 chose registrable factor list (not hardcoded) for runtime customization; BD-013 chose timestamp-specific
filtering over current-only filtering because backtests need historical point-in-time correctness.
common_pitfalls: Factor.level MUST match TargetSelector.level (IH-05); mismatched levels cause silent empty target lists
that look like no signals but are actually level-mismatch bugs.
business_decisions: []
- id: trading_execution
narrative:
does_what: Trader.run() calls sell() before buy() each cycle, generates TradingSignals with due_timestamp = happen_timestamp
+ level.to_second() for next-bar execution, and applies on_profit_control() for stop-loss/take-profit before regular
target selection.
key_decisions: SL-01 locks sell-before-buy order because available_long check in sim_account depends on it — chose this
over symmetric ordering to prevent implicit leverage; BD-039 chose long=AND/short=OR multi-level logic to reflect
risk asymmetry.
common_pitfalls: 'SL-02 violation (immediate execution instead of next-bar) introduces look-ahead bias and makes backtest
results unreproducible in live trading; SL-10: A-share T+1 constraint — backtesting without it overstates returns.'
business_decisions: []
- id: visualization
narrative:
does_what: Drawer.draw() combines kline main chart with factor overlays and Rect annotations for entry/exit signals
using Plotly; Drawable interface on Factor enables consistent chart rendering across data types.
key_decisions: BD-019 chose a drawer_rects subclass override for custom annotations over hardcoded markers, which allows traders
to define entry/exit visuals without modifying base drawing logic.
common_pitfalls: draw_result=True by default (BD-055) is fine for development but set draw_result=False in production/headless
environments to avoid Plotly server startup overhead.
business_decisions: []
- id: cross_cutting_concerns
narrative:
does_what: 'Invariants and utilities that span multiple pipeline stages — collected from 15 source groups: architecture(12),
business_rule(8), data_ingestion(11), default_value(10), global(9), gs_quant.backtests.order(2), and 9 more.'
key_decisions: 97 BDs merged here because they apply to more than one main stage (e.g. algorithm helpers, default value
choices, ordering contracts, error handling). Agent should inspect individual BD summaries and link back to affected
main stages via shared IDs.
common_pitfalls: Cross-cutting concerns frequently surface as bugs when changes to one main stage unintentionally break
another. Check constraints referencing these BDs and verify invariants still hold after any stage-local modification.
business_decisions:
- id: BD-GAP-001
type: B/BA
summary: InMemoryApiRequestCache uses max_size=1000 entries and TTL=3600 seconds (1 hour) for API response caching
- id: BD-GAP-002
type: B/BA
summary: RiskApi defaults max_concurrent=100 for parallel risk calculations and uses DEFAULT_TIMEOUT for backtest operations
- id: BD-GAP-003
type: B/BA
summary: SecMasterIdentifiers mapping defaults to output_type=frozenset([SecMasterIdentifiers.GSID]) for security lookups
- id: BD-GAP-004
type: B/BA
summary: DataApi build_query defaults to format='MessagePack' for efficient binary serialization over JSON
- id: BD-GAP-008
type: B/BA
summary: GsAssetApi.get_many_assets defaults limit=100 and get_many_assets_scroll defaults scroll='1m' and limit=1000
- id: BD-GAP-010
type: B
summary: Dataset.get_data_bulk constrains request_batch_size between 0 and 5 (>0 and <5) for parallel queries
- id: BD-GAP-014
type: B
summary: ApiWithCustomSession allows session override via factory/supplier pattern rather than direct session injection
- id: BD-GAP-015
type: B/DK
summary: IntradayFactorDataSource provides choice of data sources for intraday risk model factor data
- id: BD-GAP-016
type: B/BA
summary: TransactionCostModel hierarchy uses algebraic Model base class with operator overloading for cost combination
- id: BD-GAP-017
type: B/BA
summary: CoordinatesData methods default to MarketDataVendor.Goldman_Sachs as primary data source
- id: BD-GAP-018
type: B
summary: PretradeExecutionOptimization polls get_pretrade_execution_optimization with max_attempts=15
- id: BD-GAP-019
type: B/BA
summary: ThreadPoolManager.initialize allows configurable max_workers for async task parallelization
- id: BD-GAP-005
type: T
summary: SharpeRatioProcessor defaults currency to USD for risk-free rate lookup
- id: BD-GAP-006
type: T
summary: VolatilityProcessor, BetaProcessor, and CorrelationProcessor default to w=Window(None, 0), meaning no rolling
window is applied unless specified
- id: BD-GAP-007
type: BA
summary: PlotRunner defaults to RelativeDate('-1y') to RelativeDate('-1b') time range (1 year to previous business day)
- id: BD-GAP-009
type: T
summary: PerformanceHedgeQuery defaults max_return_deviation=5, max_adv_percentage=15, max_leverage=100, max_weight=100,
market_participation_rate=10, exclude_target_assets=True
- id: BD-GAP-011
type: BA
summary: 'ContentApi.get_contents defaults order_by={''direction'': OrderBy.DESC, ''field'': ''createdTime''} and limit=10'
- id: BD-GAP-012
type: T
summary: PortfolioApi.get_positions defaults position_type='close' (close prices) for position lookups
- id: BD-GAP-013
type: T
summary: FXImpliedCorrProcessor defaults tenor='3m' (3-month) for foreign exchange correlation calculations
- id: BD-GAP-020
type: BA
summary: ScaleShape enum limits marker shapes to PIPE and DIAMOND for spot markers in visualization scales
- id: BD-001
type: B/BA
summary: DimensionsOverride converts enum keys to string values
- id: BD-GAP-021
type: B
summary: 'Missing: PnL conservation'
- id: BD-GAP-022
type: DK
summary: 'Missing: Random seed full coverage'
- id: BD-GAP-023
type: B
summary: 'Missing: Immutable event log'
- id: BD-GAP-024
type: B
summary: 'Missing: Status: Absent'
- id: BD-GAP-025
type: DK
summary: 'Missing: Status: Absent'
- id: BD-GAP-026
type: RC
summary: 'Missing: Status: Absent'
- id: BD-GAP-027
type: B
summary: 'Missing: Absent'
- id: BD-GAP-028
type: DK
summary: 'Missing: Add comprehensive random seed management across all stochastic components (simulation, ML models,
sampling)'
- id: BD-GAP-029
type: RC
summary: 'Missing: Add model parameter versioning with effective_date and valid_from timestamps for all quantitative
models'
- id: BD-GAP-030
type: B
summary: 'Missing: PnL conservation'
- id: BD-039
type: BA
summary: ContextBaseWithDefault.default_value() auto-instantiates class when accessed
- id: BD-043
type: B
summary: PriceableImpl._pricing_context returns nullcontext if PricingContext is_entered - prevents re-entrancy
- id: BD-044
type: DK
summary: RiskKey.ex_measure strips risk_measure from key; ex_historical_diddle clears date - enables cross-measure comparison
- id: BD-046
type: BA
summary: GenericDataSource.missing_data_strategy defaults to fail - unexpected missing data raises RuntimeError
- id: BD-048
type: BA/M
summary: default_transaction_cost() factory returns ConstantTransactionModel(0) - zero cost by default
- id: BD-049
type: BA
summary: Instrument.quantity_ defaults to 1 in InstrumentBase - each instrument has unit quantity unless specified
- id: BD-053
type: BA
summary: HedgeAction.risk_percentage defaults to 100 - full hedge unless specified otherwise
- id: BD-054
type: BA
summary: RebalanceAction.priceable.unresolved check in __post_init__ - requires pre-resolved priceables
- id: BD-056
type: BA/DK
summary: _default_pricing_location raises MqValueError for unsupported currencies - explicit location required
- id: BD-058
type: B/BA
summary: ResultInfo.composition_info enforces market.location equality - cannot compose across locations
- id: BD-059
type: BA/DK
summary: 'INTERACTION: BD-002 × BD-023 → Annualization factor inconsistency between 252 trading days (volatility) and
365 calendar days (frequency mapping)'
- id: BD-060
type: BA/M
summary: 'INTERACTION: BD-030 × BD-005 → Information Ratio uses population std (ddof=0) while z-score normalization
uses sample std (ddof=1)'
- id: BD-061
type: B/DK
summary: 'INTERACTION: BD-008 × BD-021 → Winsorization at 2.5σ caps data before Bollinger Bands at ±2σ are evaluated for
mean-reversion signals'
- id: BD-062
type: B
summary: 'INTERACTION: BD-040 × BD-045 × BD-050 → Execution ordering risk cascade: CalcType hierarchy + path_dependent
requirements + aggregate max dominate execution'
- id: BD-063
type: BA/DK
summary: 'INTERACTION: BD-048 × BD-037 → Zero default transaction cost contradicts TWAP arithmetic mean execution assumption'
- id: BD-064
type: B/BA
summary: 'INTERACTION: BD-029 × BD-008 → GARCH normal distribution assumption conflicts with winsorization''s implicit
heavy-tailed distribution treatment'
- id: BD-065
type: BA/DK
summary: 'INTERACTION: BD-017 × BD-014 → Epidemiological model fitting chains ODE solver (LSODA) with optimizer (Levenberg-Marquardt)
creating coupled accuracy dependency'
- id: BD-066
type: BA
summary: 'INTERACTION: BD-046 × BD-012 → Missing data strategy defaults to fail, but rolling product uses nanprod, which silently
treats NaN as 1'
- id: BD-067
type: BA/DK
summary: 'INTERACTION: BD-002 × BD-006 × BD-007 × BD-005 × BD-013 → Universal ddof=1 (Bessel correction) creates hidden
consistency contract across all variance-based calculations'
- id: BD-037
type: B/BA
summary: TWAP execution uses arithmetic mean of fixings as executed price
- id: BD-038
type: B/BA
summary: BTIC TWAP uses mean of BTIC fixings for execution price
- id: BD-014
type: B
summary: SIR compartmental model uses Levenberg-Marquardt (leastsq) via lmfit.minimize for fitting
- id: BD-015
type: B
summary: SEIR adds exposed compartment with sigma transition rate, same LM fitting
- id: BD-016
type: B
summary: SEAIRD adds age stratification (K groups), quarantine (T), mortality (M) compartments
- id: BD-017
type: B
summary: ODE integration uses scipy.integrate.odeint (LSODA algorithm) for model solving
- id: BD-018
type: B
summary: Residual = solution - data (L2 norm minimized), with optional fit_period tail truncation
- id: BD-024
type: B
summary: EMA Crossover uses difference of two EMAs, no explicit signal line
- id: BD-025
type: B
summary: AR model via OLS on lagged values using numpy lstsq
- id: BD-026
type: B
summary: Augmented Dickey-Fuller test for stationarity with AIC-based lag selection
- id: BD-027
type: B
summary: Engle-Granger two-step cointegration test using OLS residuals
- id: BD-028
type: B/BA
summary: Variance targeting estimator scales realized var by window mean, annualizes
- id: BD-029
type: B/BA
summary: GARCH(1,1) via rugarch with constant mean, normal distribution for returns
- id: BD-030
type: B
summary: Information Ratio = mean(excess) / std(excess) using population std (np.std)
- id: BD-031
type: B/DK
summary: Bear market IR filters benchmark returns < 0 before computing excess and std
- id: BD-032
type: B/DK
summary: Bull market IR filters benchmark returns > 0 before computing excess and std
- id: BD-033
type: B/DK
summary: 'Bear market conditional std: if market down at period end, compute rolling std else NaN'
- id: BD-034
type: B/BA
summary: 'Bull market conditional std: if market up at period end, compute rolling std else NaN'
- id: BD-035
type: B/BA
summary: Modigliani ratio rescales excess return by Sharpe ratio of benchmark
- id: BD-036
type: B
summary: Factor-attributed PnL summed across factors via np.sum along axis=1
- id: BD-002
type: B/BA
summary: Annualized rolling volatility uses simple std with ddof=1 and 252-day annualization
- id: BD-003
type: B/BA
summary: Exponential moving average uses span=14 default (beta=0.94, alpha=0.06), no adjustment
- id: BD-004
type: B/BA
summary: Exponentially weighted std uses beta=0.75 (alpha=0.25), with debiasing factor
- id: BD-005
type: B/BA
summary: Rolling z-scores computed with sample std (ddof=1) over rolling window
- id: BD-006
type: B/DK
summary: Rolling variance uses Bessel correction (ddof=1) for unbiased sample variance
- id: BD-007
type: B/DK
summary: Rolling covariance uses Bessel correction (ddof=1) for unbiased sample covariance
- id: BD-008
type: B/BA
summary: Winsorization caps values at ±2.5 sigma from sample mean
- id: BD-009
type: B/BA
summary: Rolling percentile rank uses 'mean' interpolation between tied values
- id: BD-010
type: B
summary: OLS linear regression via statsmodels OLS with optional intercept, filters NaN/inf
- id: BD-011
type: B/DK
summary: Rolling OLS via statsmodels RollingOLS with fixed window, optional intercept
- id: BD-012
type: B/DK
summary: Rolling product computes product of (1+r) values over window for cumulative return
- id: BD-013
type: B
summary: Autocorrelation uses unbiased Var(X) estimator (ddof=1) per Bartlett's formula
- id: BD-019
type: B/BA
summary: Seasonal decomposition via statsmodels seasonal_decompose, additive or multiplicative
- id: BD-020
type: B
summary: Seasonally adjusted = trend + resid (additive) or trend * resid (multiplicative)
- id: BD-021
type: B/BA
summary: Bollinger Bands at ±2 standard deviations from 20-period SMA
- id: BD-022
type: B
summary: Exponential volatility = annualized exp-weighted std of log returns
- id: BD-023
type: B/BA
summary: 'Frequency-to-period mapping converts index freq to number of points: daily=365, business=5, weekly=52, monthly=12'
- id: BD-042
type: BA
summary: __subclasses__ registry pattern in Action/Trigger/DataSource - subclasses auto-register via __init_subclass__
- id: BD-057
type: DK
summary: AddWeightedTradeAction._calc_type set to semi_path_dependent in __post_init__ - not a class default
- id: BD-050
type: B/RC
summary: AggregateTriggerRequirements.calc_type takes max of child types - path_dependent dominates hierarchy
- id: BD-040
type: B
summary: 'CalcType enum enforces execution order: simple→semi_path_dependent→path_dependent'
- id: BD-045
type: B
summary: GenericEngine._process_triggers_and_actions_for_date requires __ensure_risk_results before path_dependent actions
- id: BD-051
type: BA
summary: ExitTradeAction transaction_cost_exit defaults to transaction_cost in __post_init__
- id: BD-041
type: B
summary: ActionHandler factory pattern maps Actions to Handlers - get_action_handler must return correct handler type
- id: BD-047
type: DK
summary: historical_risk_key() creates LocationOnlyMarket stripping market details for historical comparison
- id: BD-052
type: B/BA
summary: BuySell.flip dict in instruments - hedge directions must be opposite of original trade
- id: BD-055
type: B
summary: BaseProcessor uses async update with recursive parent traversal - calculate() propagates up tree
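# Worked example for BD-002 and BD-059 (a sketch under stated assumptions, not the library's code):
#   sigma_annual = std(daily_returns, ddof=1) * sqrt(252)
# With an assumed daily sample standard deviation of 0.01, sigma_annual = 0.01 * sqrt(252) ≈ 0.1587.
# If the same series were instead annualized with 365 periods (the daily value listed in BD-023),
# the result would be 0.01 * sqrt(365) ≈ 0.1910, roughly 20% higher. That gap is the kind of
# inconsistency BD-059 flags, so it is worth checking which factor a given calculation uses.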
resources:
packages:
- name: numpy
version_pin: latest
- name: pandas
version_pin: latest
- name: scipy
version_pin: latest
- name: requests
version_pin: latest
- name: httpx
version_pin: latest
- name: websockets
version_pin: latest
- name: msgpack
version_pin: latest
- name: dataclasses_json
version_pin: latest
- name: backoff
version_pin: latest
- name: pydash
version_pin: latest
strategy_scaffold:
entry_point_name: run_backtest
output_path: result.csv
execution_mode: backtest
conditional_entry_points:
backtest:
entry_point_name: run_backtest
output_path: result.csv
collector:
entry_point_name: run_collector
output_path: result.json
factor:
entry_point_name: run_factor
output_path: result.parquet
training:
entry_point_name: run_training
output_path: result.json
serving:
entry_point_name: run_server
output_path: result.json
research:
entry_point_name: run_research
output_path: result.json
tail_template: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest() #\
\ implement above\n from validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\"\
)\n# === END DO NOT MODIFY ==="
host_adapter:
target: openclaw
timeout_seconds: 1800
shell_operator_restriction: 'exec tool intercepts && / ; / | — never chain: ''pip install X && python Y''. Use separate
exec calls.'
install_recipes:
- python3 -m pip install numpy
- python3 -m pip install pandas
- python3 -m pip install scipy
- python3 -m pip install zvt
credential_injection: JoinQuant/QMT credentials require user-side '!' prefix shell login. Never hardcode credentials in
generated scripts.
path_resolution: '{workspace} resolves to ~/.openclaw/workspace/doramagic at execution time.'
file_io_tooling: Use openclaw 'write' tool for .py/.sql files; 'exec' tool for python3 /absolute/path/script.py (absolute
paths only).
constraints:
fatal:
- id: finance-C-001
when: When building DataQuery with start parameter of type datetime
action: verify end parameter is also of type datetime
severity: fatal
kind: domain_rule
modality: must
consequence: Mixed datetime and date types for start/end cause query construction to fail with ValueError, resulting in
invalid API requests that return no data or incorrect data
stage_ids:
- data_ingestion
- id: finance-C-010
when: When handling HTTP responses from data API
action: validate status code is 2xx before processing response
severity: fatal
kind: domain_rule
modality: must
consequence: Processing non-2xx responses as valid data causes silent failures where error messages are treated as actual
data, corrupting downstream analysis
stage_ids:
- data_ingestion
- id: finance-C-013
when: When constructing DataQuery with start and end dates
action: mix date and datetime types in the same query
severity: fatal
kind: domain_rule
modality: must_not
consequence: Type mixing causes ValueError at query construction time, preventing data retrieval entirely for the affected
query
stage_ids:
- data_ingestion
- id: finance-C-015
when: When authenticating with the data API
action: handle 401 status codes by re-authenticating before retrying
severity: fatal
kind: resource_boundary
modality: must
consequence: Without proper 401 handling, expired tokens cause all subsequent API requests to fail, stopping data ingestion
entirely
stage_ids:
- data_ingestion
- id: finance-C-016
when: When implementing DataProvider for Dataset
action: implement all abstract methods from the DataApi interface
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Missing abstract method implementations cause NotImplementedError at runtime when Dataset attempts data retrieval
through the provider
stage_ids:
- data_ingestion
- id: finance-C-017
when: When defining instruments without explicit quantity
action: default quantity to 1 for unit quantity instruments
severity: fatal
kind: domain_rule
modality: must
consequence: Instruments will be treated as having zero position size, causing dollar_price and calc() to return zero
values for all risk measures, leading to incorrect portfolio valuations
stage_ids:
- instrument_modeling
- id: finance-C-018
when: When resolving instruments under HistoricalPricingContext
action: call resolve with in_place=True under HistoricalPricingContext
severity: fatal
kind: domain_rule
modality: must_not
consequence: Historical pricing requires immutable resolved values for each date point; in-place mutation corrupts the
temporal sequence and produces incorrect historical valuations
stage_ids:
- instrument_modeling
- id: finance-C-019
when: When resolving instruments under MultiScenario Context
action: call resolve with in_place=True under MultiScenario Context
severity: fatal
kind: domain_rule
modality: must_not
consequence: Multi-scenario analysis requires independent instrument states for each scenario; in-place mutation causes
cross-scenario contamination and incorrect scenario comparisons
stage_ids:
- instrument_modeling
- id: finance-C-020
when: When implementing portfolio aggregation logic
action: aggregate individual instrument values through PortfolioRiskResult.aggregate()
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Incorrect aggregation produces wrong portfolio-level risk measures, causing mispriced positions and potential
regulatory compliance violations
stage_ids:
- instrument_modeling
- id: finance-C-056
when: When composing risk results from different market locations
action: compose results with different market locations in the same operation
severity: fatal
kind: domain_rule
modality: must_not
consequence: Risk calculations from different markets (NYC, LDN, HKG) cannot be meaningfully combined as they use different
market data conventions, causing corrupted or misleading aggregated risk figures
stage_ids:
- risk_analysis
- id: finance-C-057
when: When aggregating FloatWithInfo values with mismatched units
action: add FloatWithInfo values with different unit dictionaries
severity: fatal
kind: domain_rule
modality: must_not
consequence: Adding quantities with incompatible units (e.g., USD and EUR notionals) produces mathematically meaningless
results and violates dimensional analysis principles
stage_ids:
- risk_analysis
- id: finance-C-058
when: When aggregating risk results with different units for the same risk measure
action: aggregate results that have conflicting unit definitions for the same risk_measure
severity: fatal
kind: domain_rule
modality: must_not
consequence: Aggregating IRDelta results with different units (e.g., SomeCurrency/bp vs SomeCurrency) produces incorrect
portfolio-level risk figures that misrepresent actual exposure
stage_ids:
- risk_analysis
- id: finance-C-059
when: When aggregating risk results where any individual result is in error
action: aggregate results that contain ErrorValue objects
severity: fatal
kind: domain_rule
modality: must_not
consequence: Including erroneous calculations in aggregation propagates the error silently or throws an exception, making
the entire portfolio risk calculation unreliable
stage_ids:
- risk_analysis
- id: finance-C-062
when: When propagating risk results through arithmetic operations
action: preserve the RiskKey through every arithmetic operation (add, mul) on FloatWithInfo and SeriesWithInfo
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Losing RiskKey during computation breaks audit trails and makes it impossible to trace risk figures back
to their market data sources, violating regulatory compliance requirements
stage_ids:
- risk_analysis
- id: finance-C-063
when: When creating DataFrames containing risk data
action: use DataFrameWithInfo class that embeds RiskKey for provenance tracking
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using plain pandas DataFrames for risk data loses the RiskKey metadata, making it impossible to track which
market conditions generated each risk figure
stage_ids:
- risk_analysis
- id: finance-C-072
when: When backtesting FX options without setting premium=0
action: Set premium=0 on FXOption instruments before running backtest
severity: fatal
kind: domain_rule
modality: must
consequence: FX option resolution sets premium that makes DollarPrice zero, causing the backtest P&L to be meaningless
and all price-based calculations to fail
stage_ids:
- backtesting
- id: finance-C-073
when: When implementing a backtest that accesses market data
action: Access data at a future date relative to the current backtest state
severity: fatal
kind: domain_rule
modality: must_not
consequence: Clock.time_check() raises RuntimeError preventing look-ahead bias; accessing future data corrupts backtest
validity by introducing information not available at decision time
stage_ids:
- backtesting
- id: finance-C-074
when: When running a BackTest object without an engine
action: Pass BackTest object to GenericEngine.run_backtest() or engine subclass for execution
severity: fatal
kind: resource_boundary
modality: must
consequence: BackTest is a data container with no execute() method; without an engine, the backtest remains unexecuted
and produces no results, trades, or P&L
stage_ids:
- backtesting
- id: finance-C-075
when: When using an action type not supported by the engine
action: Use only actions registered in GenericEngineActionFactory or verify engine supports the action
severity: fatal
kind: resource_boundary
modality: must
consequence: GenericEngine.get_action_handler() raises RuntimeError 'Action X not supported by engine', causing backtest
execution to fail
stage_ids:
- backtesting
- id: finance-C-081
when: When implementing custom Action subclasses
action: Register the action in the engine's action_impl_map through GenericEngineActionFactory or engine subclass
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Unregistered actions raise RuntimeError at execution time, preventing backtest from running with the custom
action
stage_ids:
- backtesting
- id: finance-C-086
when: When using GenericEngine for backtesting
action: Set up GsSession authentication before calling run_backtest()
severity: fatal
kind: resource_boundary
modality: must
consequence: GenericEngine pricing calls require GS Marquee API authentication; unauthenticated calls raise exceptions,
preventing backtest execution
stage_ids:
- backtesting
- id: finance-C-087
when: When using EquityVolEngine
action: Use only EqOption or EqVarianceSwap instruments with the engine
severity: fatal
kind: resource_boundary
modality: must
consequence: EquityVolEngine raises MqValueError for unsupported equity instruments, causing backtest to fail at initialization
stage_ids:
- backtesting
- id: finance-C-104
when: When implementing statistical functions on time series data
action: Verify input series index is monotonically increasing before processing
severity: fatal
kind: domain_rule
modality: must
consequence: Rolling window calculations will produce incorrect results if the time series index is not sorted chronologically,
causing statistical metrics to be misaligned with their time periods.
stage_ids:
- timeseries_analysis
- id: finance-C-105
when: When calculating returns from price series
action: Use the lag function with LagMode.TRUNCATE to shift prices backward before computing returns
severity: fatal
kind: domain_rule
modality: must
consequence: Returns calculation will include current period data in the denominator, creating look-ahead bias where future
prices influence current period returns.
stage_ids:
- timeseries_analysis
- id: finance-C-107
when: When computing volatility swap pricing
action: Apply logarithmic returns by default as specified by the vol_swap_volatility convention
severity: fatal
kind: domain_rule
modality: must
consequence: Volatility swap payoff calculations will be incorrect if arithmetic returns are used instead of logarithmic
returns, leading to mispriced volatility swap contracts.
stage_ids:
- timeseries_analysis
- id: finance-C-108
when: When creating Window objects for rolling calculations
action: Set window size to zero or negative values
severity: fatal
kind: domain_rule
modality: must_not
consequence: Rolling window calculations will crash or produce undefined behavior with zero or negative window sizes,
causing analytics pipeline failures.
stage_ids:
- timeseries_analysis
- id: finance-C-124
when: When processing returns with logarithmic type
action: Apply natural logarithm to both current and lagged prices before subtraction
severity: fatal
kind: domain_rule
modality: must
consequence: Logarithmic returns will be calculated incorrectly, breaking time-additivity property and causing portfolio
return aggregation errors.
stage_ids:
- timeseries_analysis
- id: finance-C-143
when: When scheduling a report without valid report ID and Position Source ID
action: schedule reports that lack valid identifiers
severity: fatal
kind: domain_rule
modality: must_not
consequence: Scheduling reports without valid IDs raises MqValueError('Can only schedule reports with valid IDs and Position
Source IDs.')
stage_ids:
- reporting
- id: finance-C-144
when: When scheduling a non-Portfolio report without explicit start and end dates
action: provide explicit schedule start and end dates for asset-level or other non-Portfolio reports
severity: fatal
kind: domain_rule
modality: must
consequence: Non-Portfolio reports without explicit dates raise MqValueError('Must specify schedule start and end dates
for report.')
stage_ids:
- reporting
- id: finance-C-145
when: When scheduling a report for a portfolio that has no positions
action: schedule reports for portfolios with no position history
severity: fatal
kind: domain_rule
modality: must_not
consequence: Attempting to schedule a report for an empty portfolio raises MqValueError('Cannot schedule reports for a
portfolio with no positions.')
stage_ids:
- reporting
- id: finance-C-161
when: When creating an Entity object with a specific entity_type
action: pass a valid EntityType enum value to ensure proper endpoint routing
severity: fatal
kind: domain_rule
modality: must
consequence: Invalid entity_type causes EntityType() constructor to raise ValueError, breaking entity creation and all
downstream data access
stage_ids:
- entity_management
- id: finance-C-162
when: When calling PositionedEntity methods with a non-PORTFOLIO/ASSET entity_type
action: use EntityType values other than PORTFOLIO or ASSET for position operations
severity: fatal
kind: domain_rule
modality: must_not
consequence: PositionedEntity raises NotImplementedError at runtime when entity_type is not PORTFOLIO or ASSET, causing
all position queries to fail
stage_ids:
- entity_management
- id: finance-C-169
when: When accessing entity entitlements in gs_quant library
action: assume all entities have entitlements or bypass entitlement checks
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Accessing entities without proper entitlements violates platform security policies and may expose unauthorized
financial data
stage_ids:
- entity_management
- id: finance-C-176
when: When entity type validation fails during Entity.get
action: pass invalid string values to EntityType constructor
severity: fatal
kind: rationalization_guard
modality: must_not
consequence: Invalid entity_type strings raise ValueError in EntityType enum constructor, causing immediate failure of
entity retrieval operations
stage_ids:
- entity_management
- id: finance-C-178
when: When resolving instruments with unresolved parameters
action: call resolve() method with in_place=False when using HistoricalPricingContext to get separate copies per date
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Using in_place=True with HistoricalPricingContext causes all dates to share the same resolved instrument
state, corrupting historical calculations
stage_ids:
- data_ingestion
- instrument_modeling
- id: finance-C-180
when: When creating PricingContext for historical calculations
action: set pricing_date to a date more than 5 days in the future (use RollFwd Scenario for future pricing)
severity: fatal
kind: resource_boundary
modality: must_not
consequence: Setting pricing_date too far in the future raises ValueError and prevents any pricing calculations from executing
stage_ids:
- instrument_modeling
- pricing_context
- id: finance-C-196
when: When constructing FX options for backtesting
action: set premium=0 explicitly to get meaningful DollarPrice values; otherwise resolution sets a zeroing premium making
all option values appear as 0
severity: fatal
kind: rationalization_guard
modality: must
consequence: Backtest P&L becomes meaningless when all option values show zero, making strategy performance evaluation
impossible
stage_ids:
- instrument_modeling
- pricing_context
- id: finance-C-198
when: When accessing any GS Quant API for pricing, data, or risk calculations
action: Initialize a GsSession using GsSession.use() with valid OAuth2 credentials before making any API calls
severity: fatal
kind: architecture_guardrail
modality: must
consequence: API calls will fail with MqUninitialisedError when GsSession.current is accessed without prior initialization,
preventing any pricing, risk, or data operations
- id: finance-C-199
when: When calculating prices, resolving instruments, or computing risk measures
action: Enter a PricingContext (as context manager or via PricingContext.current) before calling calc(), price(), dollar_price(),
or resolve() methods
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Risk calculations return PricingFuture objects that block indefinitely or return incomplete results when
no PricingContext is entered to manage the pricing date and market data location
- id: finance-C-200
when: When setting a pricing_date in PricingContext
action: Set a pricing_date that is more than 5 calendar days in the future — future pricing requires the RollFwd Scenario
severity: fatal
kind: domain_rule
modality: must_not
consequence: PricingContext raises ValueError at construction time, blocking the pricing workflow entirely
- id: finance-C-207
when: When claiming or representing the capabilities of gs_quant for regulatory, compliance, or marketing purposes
action: Claim gs_quant provides financial advisory services, investment management, or real-time trading execution — it
is a quantitative analytics toolkit for modeling, pricing, and backtesting derivative instruments, and does not constitute
financial advice or a trading platform
severity: fatal
kind: claim_boundary
modality: must_not
consequence: Regulatory violations occur when the system is misrepresented as a licensed trading platform or advisory
service, exposing the organization to SEC/FINRA enforcement actions and legal liability
- id: finance-C-221
when: When implementing or reordering action execution in GenericEngine
action: Call __ensure_risk_results and wait for completion before executing any action with calc_type=path_dependent —
this ordering is mandatory regardless of declaration order or perceived optimization opportunities
severity: fatal
kind: domain_rule
modality: must
consequence: Skipping __ensure_risk_results before path_dependent actions produces incorrect hedge ratios and trigger
conditions, leading to backtest results that diverge from live trading behavior
derived_from_bd_id: BD-045
- id: finance-C-223
when: When implementing or modifying CalcType ordering, path_dependent action requirements, or aggregate calculation type
inheritance
action: Maintain the execution ordering contract formed by CalcType hierarchy (BD-040), __ensure_risk_results requirement
(BD-045), and max-child-calcType inheritance (BD-050) — any aggregate containing path_dependent components must trigger
full path_dependent requirements for the entire aggregate
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Violating the CalcType cascade causes silent failures where path_dependent strategies compute without complete
risk results, producing incorrect hedge ratios and trigger conditions that are extremely difficult to debug
derived_from_bd_id: BD-062
- id: finance-C-259
when: When modeling volatility for equity returns with GARCH(1,1)
action: Assume normal distribution for equity returns — use t-distribution (degrees of freedom ~4-6) or Skew-t to capture
fat tails and asymmetric volatility; normal distribution systematically underestimates tail risk
severity: fatal
kind: domain_rule
modality: must_not
consequence: Normal distribution underestimates 1% VaR by 40-60% for equity returns due to fat tails; a strategy calibrated
with normal GARCH will undersize tail hedges and exceed loss limits in live trading
derived_from_bd_id: BD-029
- id: finance-C-273
when: When setting up a session for any gs_quant API call
action: Initialize an active GsSession using GsSession.use() before making any instrument resolution, pricing, or data
queries; all API communication flows through the authenticated session
severity: fatal
kind: architecture_guardrail
modality: must
consequence: Any API call (instrument resolution, pricing, data queries) without an active session will fail or produce
unexpected results.
stage_ids:
- session_initialization
- id: finance-C-277
when: When selecting a backtest engine for an OTC multi-asset strategy
action: Use GenericEngine as the default engine — EquityVolEngine is restricted to EqOption and EqVarianceSwap only and
will not work for IRSwap, IRSwaption, FXOption, FXForward, or other instruments
severity: fatal
kind: resource_boundary
modality: must
consequence: EquityVolEngine only supports EqOption and EqVarianceSwap instruments. Using it with other instruments will
cause the backtest to fail.
stage_ids:
- backtesting
regular:
- id: finance-C-002
when: When implementing get_data_series with symbol_dimensions
action: verify symbol_dimensions length equals 1
severity: high
kind: domain_rule
modality: must
consequence: get_data_series raises MqValueError('get_data_series only valid for symbol_dimensions of length 1') when
called with multi-dimensional datasets, breaking time-series analysis workflows
stage_ids:
- data_ingestion
- id: finance-C-003
when: When using ApiRequestCache for GsDataApi
action: initialize cache via set_api_request_cache before use
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without explicit cache initialization, _api_request_cache remains None and no caching occurs, causing redundant
API calls on every identical request
stage_ids:
- data_ingestion
- id: finance-C-004
when: When making API requests through GsDataApi
action: expect real-time data without considering data latency
severity: high
kind: claim_boundary
modality: must_not
consequence: Market data from Marquee APIs has inherent latency. Treating delayed data as real-time causes trading signals
to be generated on stale information, leading to financial losses in live trading
stage_ids:
- data_ingestion
- id: finance-C-005
when: When using Dataset.provider for data retrieval
action: access data only through provider interface, not directly
severity: medium
kind: architecture_guardrail
modality: must
consequence: Direct access to underlying data sources bypasses the provider abstraction, preventing vendor swapping and
mock testing capabilities defined in the architecture
stage_ids:
- data_ingestion
- id: finance-C-006
when: When configuring InMemoryApiRequestCache
action: set explicit max_size and ttl_in_seconds to prevent unbounded memory growth
severity: high
kind: resource_boundary
modality: must
consequence: Without proper TTL and size limits, TTLCache can grow unbounded causing memory exhaustion in long-running
applications
stage_ids:
- data_ingestion
- id: finance-C-007
when: When executing bulk data retrieval with get_data_bulk
action: set request_batch_size to at least 1 and strictly less than 5
severity: high
kind: domain_rule
modality: must
consequence: Invalid request_batch_size values cause batch processing to fail or produce unexpected results in parallel
data extraction
stage_ids:
- data_ingestion
- id: finance-C-008
when: When constructing cache keys for API requests
action: properly serialize datetime objects and query objects to JSON
severity: medium
kind: domain_rule
modality: must
consequence: Incorrect cache key serialization causes cache misses for semantically identical queries, defeating the purpose
of caching and increasing API load
stage_ids:
- data_ingestion
- id: finance-C-009
when: When using coordinate-based data retrieval
action: properly sort coordinate data fields in expected order
severity: medium
kind: architecture_guardrail
modality: must
consequence: Unsorted coordinate data fields cause inconsistent results across queries with identical data but different
field ordering
stage_ids:
- data_ingestion
- id: finance-C-011
when: When processing data series with multiple symbols
action: group results by symbol_dimension before returning series
severity: high
kind: architecture_guardrail
modality: must
consequence: Without proper grouping, multi-symbol data series returns misleading results showing mixed asset data as
a single time series
stage_ids:
- data_ingestion
- id: finance-C-012
when: When extending MarketDataResponseFrame
action: preserve DataFrame constructor and finalize semantics
severity: medium
kind: architecture_guardrail
modality: must
consequence: Breaking DataFrame inheritance causes pandas operations to return plain DataFrames instead of MarketDataResponseFrame,
losing market-data-specific attributes
stage_ids:
- data_ingestion
- id: finance-C-014
when: When using async data retrieval methods
action: await every async operation before accessing its result
severity: high
kind: operational_lesson
modality: must
consequence: Accessing unresolved async results causes race conditions where partial or uninitialized data is processed,
leading to incorrect analysis
stage_ids:
- data_ingestion
- id: finance-C-021
when: When traversing nested portfolio hierarchies
action: use all_instruments and all_portfolios properties for recursive hierarchy traversal
severity: high
kind: architecture_guardrail
modality: must
consequence: Direct priceables iteration misses nested instruments in fund-of-funds structures, causing incomplete valuations
and incorrect risk attribution
stage_ids:
- instrument_modeling
- id: finance-C-022
when: When saving portfolios via save/save_as_quote/save_to_shadowbook
action: include nested portfolios in portfolios that need to be persisted
severity: high
kind: architecture_guardrail
modality: must_not
consequence: ValueError is raised for portfolios with nested portfolios, blocking persistence operations
stage_ids:
- instrument_modeling
- id: finance-C-023
when: When aggregating nested PortfolioRiskResult objects
action: maintain correct risk_measure and risk_key alignment across portfolio hierarchies
severity: high
kind: architecture_guardrail
modality: must
consequence: Mismatched risk keys across nested portfolios produce incorrect aggregate values, corrupting portfolio-level
risk reporting
stage_ids:
- instrument_modeling
- id: finance-C-024
when: When implementing instrument resolution providers
action: handle ErrorValue results gracefully in resolution callbacks
severity: high
kind: architecture_guardrail
modality: must
consequence: Unresolved instruments produce null values or silent failures, leading to incorrect pricing downstream
stage_ids:
- instrument_modeling
- id: finance-C-025
when: When adding PortfolioRiskResult instances with overlapping instruments
action: add results that overlap on risk measures, instruments, or dates
severity: high
kind: domain_rule
modality: must_not
consequence: ValueError is raised preventing double-counting of positions which would inflate portfolio values incorrectly
stage_ids:
- instrument_modeling
- id: finance-C-026
when: When instrument parsing fails to resolve text to instrument
action: raise ValueError with clear error message when instrument specification is invalid
severity: high
kind: domain_rule
modality: must
consequence: Invalid instrument specifications silently pass through pricing, producing NaN or incorrect values in risk
calculations
stage_ids:
- instrument_modeling
- id: finance-C-027
when: When using Portfolio calc() method
action: clone portfolio before returning in PortfolioRiskResult to prevent in-place mutation side effects
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without cloning, resolved portfolios hold references that can be corrupted if the original portfolio is later
modified
stage_ids:
- instrument_modeling
- id: finance-C-028
when: When calculating DollarPrice for instruments
action: convert local currency values to USD using specified FX rates
severity: high
kind: resource_boundary
modality: must
consequence: Non-USD instruments report incorrect dollar values, causing portfolio PnL calculations to be misstated
stage_ids:
- instrument_modeling
- id: finance-C-029
when: When checking if market data is available for instrument resolution
action: handle cases where market data is unavailable by returning None or ErrorValue
severity: medium
kind: resource_boundary
modality: must
consequence: Instruments without market data return null resolved values, leading to zero-valued positions in portfolio
calculations
stage_ids:
- instrument_modeling
- id: finance-C-030
when: When resolving instruments with market() method
action: expect synchronous market data retrieval for all instrument types
severity: medium
kind: resource_boundary
modality: must_not
consequence: Not all instruments support market() method; calling it returns incomplete or None market data
stage_ids:
- instrument_modeling
- id: finance-C-031
when: When querying price() method for local currency present value
action: assume price() returns values in local currency for all instruments
severity: medium
kind: resource_boundary
modality: must_not
consequence: price() is not yet supported on all instrument types; unsupported instruments raise NotImplementedError
stage_ids:
- instrument_modeling
- id: finance-C-032
when: When using resolve() with in_place=True default
action: consider using in_place=False when instrument needs to be reused in multiple contexts
severity: medium
kind: operational_lesson
modality: should
consequence: In-place resolution mutates the original instrument, which may cause unintended side effects if the same
instrument is used in multiple portfolios or scenarios
stage_ids:
- instrument_modeling
- id: finance-C-033
when: When creating instruments via from_quick_entry
action: validate that instrument specification resolves to exactly one instrument type
severity: medium
kind: domain_rule
modality: must
consequence: Ambiguous instrument specifications silently pick the first matching type, causing incorrect instrument modeling
stage_ids:
- instrument_modeling
- id: finance-C-034
when: When implementing price() with currency parameter
action: pass currency parameter to Price risk measure when currency is specified
severity: medium
kind: architecture_guardrail
modality: must
consequence: Without passing currency, Price uses base currency which may not match the requested reporting currency
stage_ids:
- instrument_modeling
- id: finance-C-035
when: When performing division or multiplication on PortfolioRiskResult
action: operate with non-numeric types on PortfolioRiskResult
severity: high
kind: domain_rule
modality: must_not
consequence: PortfolioRiskResult raises ValueError for invalid operand types, causing calculation failures
stage_ids:
- instrument_modeling
- id: finance-C-036
when: When accessing PricingContext.current without prior initialization
action: raise MqUninitialisedError to signal context is uninitialized
severity: high
kind: domain_rule
modality: must
consequence: Attempting to access PricingContext.current without entering a context or setting a default raises MqUninitialisedError,
preventing pricing operations from proceeding with undefined state
stage_ids:
- pricing_context
- id: finance-C-037
when: When setting PricingContext.current while in a nested context
action: directly assign current property when another context is entered
severity: high
kind: domain_rule
modality: must_not
consequence: Assigning PricingContext.current = context while inside a with block for a different context raises MqValueError,
breaking context stack integrity
stage_ids:
- pricing_context
- id: finance-C-038
when: When specifying a pricing_date beyond today plus tolerance
action: reject future pricing dates with ValueError and guidance to use RollFwd
severity: high
kind: domain_rule
modality: must
consequence: Specifying a future pricing_date raises ValueError, directing users to use RollFwd Scenario to roll pricing_date
to future dates
stage_ids:
- pricing_context
- id: finance-C-039
when: When using thread_local storage for context state
action: isolate context state per thread using threading.local()
severity: high
kind: architecture_guardrail
modality: must
consequence: Thread-local storage ensures each thread maintains independent context path, preventing context leakage between
concurrent threads
stage_ids:
- pricing_context
- id: finance-C-040
when: When entering a PricingContext
action: use context manager protocol via with statement or async with
severity: high
kind: architecture_guardrail
modality: must
consequence: PricingContext must be entered using with/as or async with syntax to properly manage push/pop lifecycle and
attribute state preservation
stage_ids:
- pricing_context
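# Illustration for finance-C-040 (a minimal sketch, not the library's code): why entering a
# context through a with statement keeps push/pop balanced. DemoContext and _stack are
# hypothetical names standing in for the real context path.
#
#   class DemoContext:
#       _stack = []                           # assumed stand-in for the context path
#
#       def __enter__(self):
#           DemoContext._stack.append(self)   # push on entry
#           return self
#
#       def __exit__(self, exc_type, exc, tb):
#           DemoContext._stack.pop()          # pop on exit, even if the body raised
#           return False                      # do not swallow exceptions
#
#   with DemoContext() as outer:
#       with DemoContext() as inner:
#           assert DemoContext._stack[-1] is inner
#       assert DemoContext._stack[-1] is outer
#   assert DemoContext._stack == []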
- id: finance-C-041
when: When nesting PricingContext instances
action: properly shadow outer context parameters through inheritance mechanism
severity: medium
kind: architecture_guardrail
modality: must
consequence: 'Nested contexts shadow outer parameters correctly: active_context inherits from entered contexts without
set_parameters_only, while properties inherit from prior context via _inherited_val'
stage_ids:
- pricing_context
- id: finance-C-042
when: When exiting a PricingContext
action: reset all attributes to pre-entry state using saved attrs_on_entry
severity: high
kind: architecture_guardrail
modality: must
consequence: On context exit, _on_exit restores all parameters to their pre-entry values via __reset_atts, ensuring outer
context state is preserved correctly
stage_ids:
- pricing_context
- id: finance-C-043
when: When managing context state with ContextBaseWithDefault
action: use singleton pattern through thread_local for default context
severity: medium
kind: architecture_guardrail
modality: must
consequence: ContextBaseWithDefault.default_value() returns a singleton instance per thread via thread_local caching,
reducing boilerplate for simple use cases
stage_ids:
- pricing_context
- id: finance-C-044
when: When using set_parameters_only=True context
action: recognize that such contexts are invisible to active_context resolution
severity: medium
kind: architecture_guardrail
modality: must
consequence: Contexts with set_parameters_only=True do not become active_context even when entered, allowing nested contexts
to inherit parameters without triggering calculations
stage_ids:
- pricing_context
- id: finance-C-045
when: When resolving properties on an unentered context
action: inherit values from would-be prior context for display purposes
severity: low
kind: architecture_guardrail
modality: must
consequence: When context is not yet entered, _inherited_val retrieves values from PricingContext.current to maintain
consistency in property getters
stage_ids:
- pricing_context
- id: finance-C-046
when: When using PricingContext for calculations
action: claim results represent live trading execution
severity: medium
kind: claim_boundary
modality: must_not
consequence: PricingContext provides simulated pricing based on market data coordinates and parameters; results reflect
backtesting scenarios, not actual trade execution or live market conditions
stage_ids:
- pricing_context
- id: finance-C-047
when: When accessing default PricingContext without initialization
action: raise MqUninitialisedError when no default context exists
severity: high
kind: domain_rule
modality: must
consequence: Accessing PricingContext.current when no context is entered and no default exists raises MqUninitialisedError,
preventing silent failures with undefined state
stage_ids:
- pricing_context
- id: finance-C-048
when: When setting market.location and market_data_location simultaneously
action: specify conflicting location values between market and market_data_location
severity: high
kind: domain_rule
modality: must_not
consequence: Passing both market with a different location and market_data_location raises ValueError to prevent contradictory
pricing context configuration
stage_ids:
- pricing_context
- id: finance-C-049
when: When providing market dated in the future
action: reject future market dates with ValueError directing to RollFwd Scenario
severity: high
kind: domain_rule
modality: must
consequence: Specifying a CloseMarket, OverlayMarket, or RelativeMarket with future date raises ValueError, directing
users to use RollFwd Scenario for forward pricing
stage_ids:
- pricing_context
- id: finance-C-050
when: When creating HistoricalPricingContext with both start and dates
action: supply both start parameter and dates iterable simultaneously
severity: medium
kind: domain_rule
modality: must_not
consequence: Providing both start/end and dates parameters raises ValueError, as these are mutually exclusive date specification
methods
stage_ids:
- pricing_context
- id: finance-C-051
when: When accessing default values for batch configuration
action: use hardcoded defaults when parameters are not explicitly set
severity: medium
kind: resource_boundary
modality: must
consequence: 'Default values: _max_concurrent=1000, _max_per_batch=1000, _dates_per_batch=1, is_async=False, is_batch=False,
use_cache=False ensure consistent behavior without explicit configuration'
stage_ids:
- pricing_context
- id: finance-C-052
when: When supporting Python versions prior to 3.7
action: provide nullcontext fallback implementation for contextlib compatibility
severity: low
kind: resource_boundary
modality: must
consequence: For Python <3.7 without contextlib.nullcontext, a custom nullcontext class is provided to ensure backward
compatibility
stage_ids:
- pricing_context
- id: finance-C-053
when: When entering async pricing context
action: use __aenter__ and __aexit__ for proper async lifecycle management
severity: high
kind: architecture_guardrail
modality: must
consequence: Async context managers require __aenter__/__aexit__ for proper coroutine-based push/pop and async hook execution
stage_ids:
- pricing_context
- id: finance-C-054
when: When using PricingContext for instrument resolution or pricing
action: enter the context using with statement before calling calc or resolution methods
severity: high
kind: operational_lesson
modality: must
consequence: PricingContext must be entered via context manager before instrument resolution and pricing operations; accessing
.current without entering raises MqUninitialisedError
stage_ids:
- pricing_context
- id: finance-C-055
when: When pricing with async mode in spawned threads
action: preserve context attributes on spawned threads using saved attrs_for_request
severity: high
kind: operational_lesson
modality: must
consequence: When dispatching async requests, attributes are saved to attrs_for_request dict to ensure spawned threads
access correct values even after context exits
stage_ids:
- pricing_context
- id: finance-C-060
when: When computing risk for a pricing_date more than 5 days in the future
action: set PricingContext pricing_date beyond 5 calendar days from today
severity: high
kind: resource_boundary
modality: must_not
consequence: PricingContext rejects future dates beyond a 5-day tolerance, forcing users to use RollFwd Scenario for forward
projections instead of direct date specification
stage_ids:
- risk_analysis
- id: finance-C-061
when: When setting both market and market_data_location with conflicting values
action: set conflicting market.location and market_data_location values
severity: high
kind: resource_boundary
modality: must_not
consequence: PricingContext rejects conflicting market data location settings, preventing ambiguous or inconsistent pricing
contexts from being created
stage_ids:
- risk_analysis
- id: finance-C-064
when: When composing scalar risk values over time
action: use ScalarWithInfo.compose() which validates market location consistency via ResultInfo.composition_info()
severity: high
kind: architecture_guardrail
modality: must
consequence: Composing scalar values without market location validation can silently combine data from different pricing
locations, producing misleading time-series of risk figures
stage_ids:
- risk_analysis
- id: finance-C-065
when: When aggregating heterogeneous result types without explicit permission
action: set allow_heterogeneous_types=True when aggregating mixed FloatWithInfo, SeriesWithInfo, and DataFrameWithInfo
severity: high
kind: architecture_guardrail
modality: must
consequence: Silently aggregating incompatible types can cause type conversion errors or silent data loss, producing incorrect
portfolio risk totals
stage_ids:
- risk_analysis
- id: finance-C-066
when: When aggregating risk results with different pricing context parameters
action: aggregate results where the ex_historical_diddle property of RiskKey differs
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Aggregating results with different historical diddle settings produces inconsistent risk figures that mix
simulated and actual historical scenarios
stage_ids:
- risk_analysis
- id: finance-C-067
when: When presenting backtest simulation results
action: present backtest returns as if they represent guaranteed live trading performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtest results are simulated using historical market data and do not account for slippage, liquidity constraints,
or market impact that would affect actual trading returns
stage_ids:
- risk_analysis
- id: finance-C-068
when: When caching pricing results
action: cache ErrorValue or LiveMarket pricing results in the pricing cache
severity: high
kind: operational_lesson
modality: must_not
consequence: Caching error values would cause downstream calculations to reuse failed computations, propagating pricing
failures through the entire risk analysis
stage_ids:
- risk_analysis
- id: finance-C-069
when: When using ErrorValue objects
action: access arbitrary attributes on ErrorValue objects
severity: medium
kind: operational_lesson
modality: must_not
consequence: ErrorValue raises AttributeError for any attribute access except 'error', 'risk_key', 'unit', 'request_id',
and 'raw_value', causing unexpected runtime exceptions if not properly handled
stage_ids:
- risk_analysis
- id: finance-C-070
when: When performing subtract_risk operation
action: verify left and right DataFrames have identical column structure before calling subtract_risk
severity: medium
kind: domain_rule
modality: must
consequence: subtract_risk uses assert to validate column structure, which will throw AssertionError if column names or
the 'value' column are not identical
stage_ids:
- risk_analysis
- id: finance-C-071
when: When implementing a backtest without explicit transaction costs
action: Set transaction_cost parameter on each action to reflect realistic trading costs
severity: high
kind: domain_rule
modality: must
consequence: Backtest will report inflated profitability because default_transaction_cost() returns ConstantTransactionModel(0),
silently ignoring all transaction costs in the simulation
stage_ids:
- backtesting
- id: finance-C-076
when: When defining exit transaction costs on HedgeAction, ExitTradeAction, or RebalanceAction
action: Explicitly set transaction_cost_exit parameter if different from entry cost
severity: high
kind: operational_lesson
modality: must
consequence: Exit costs default to entry costs if not explicitly overridden, causing backtest to overstate profitability
when actual exit costs differ from entry costs
stage_ids:
- backtesting
- id: finance-C-077
when: When implementing partial hedges with HedgeAction
action: Explicitly set risk_percentage parameter to desired hedge ratio below 100
severity: high
kind: operational_lesson
modality: must
consequence: Partial hedges default to risk_percentage=100 (full hedge), causing unexpected full hedging behavior when
only partial hedging was intended
stage_ids:
- backtesting
- id: finance-C-078
when: When using AggregateTriggerRequirements with mixed calc_type child triggers
action: Design strategy accounting for path_dependent child triggers dominating the aggregation
severity: medium
kind: operational_lesson
modality: must
consequence: AggregateTriggerRequirements.calc_type returns path_dependent if ANY child trigger has that type, causing
the entire aggregate to be evaluated with path-dependent computation cost and complexity
stage_ids:
- backtesting
- id: finance-C-079
when: When constructing a Strategy with multiple triggers
action: Place entry triggers before hedge triggers in the strategy trigger list
severity: high
kind: architecture_guardrail
modality: must
consequence: Triggers are evaluated in order; if hedge triggers fire before entry triggers, hedges are sized incorrectly
or missed on newly entered positions
stage_ids:
- backtesting
- id: finance-C-080
when: When defining multiple actions on a single trigger
action: Order exit actions before add/entry actions within a trigger's action list
severity: high
kind: architecture_guardrail
modality: must
consequence: Actions execute sequentially; adding positions before exiting old positions causes unintended double exposure
and incorrect portfolio state
stage_ids:
- backtesting
- id: finance-C-082
when: When backtesting and presenting results to stakeholders
action: Claim backtest returns equal expected live trading returns
severity: high
kind: claim_boundary
modality: must_not
consequence: Backtest simulates historical execution with no slippage, partial fills, or liquidity constraints; live trading
will have materially different results due to market impact, execution latency, and changing conditions
stage_ids:
- backtesting
- id: finance-C-083
when: When configuring transaction costs in a backtest
action: Present backtest results as real execution proof without accounting for model limitations
severity: high
kind: claim_boundary
modality: must_not
consequence: TransactionCostModel approximates real costs; assuming model accuracy equals actual trading costs misleads
stakeholders about expected live performance
stage_ids:
- backtesting
- id: finance-C-084
when: When relying on default transaction costs for profitability assessment
action: Skip explicit transaction cost modeling based on the assumption costs are negligible
severity: high
kind: rationalization_guard
modality: must_not
consequence: Frequent trading strategies can incur substantial transaction costs that default to zero, causing backtest
profitability to be overstated by potentially 100% or more
stage_ids:
- backtesting
- id: finance-C-085
when: When modifying risk_percentage away from the default 100%
action: Skip validation that the partial hedge ratio matches intended risk management strategy
severity: medium
kind: rationalization_guard
modality: must_not
consequence: Changing risk_percentage without business justification causes unintended hedge ratios, potentially leaving
positions over-hedged or under-hedged relative to mandate
stage_ids:
- backtesting
- id: finance-C-088
when: When running backtests with missing market data
action: Define MissingDataStrategy (fill_forward, interpolate, or fail) for data sources
severity: high
kind: domain_rule
modality: must
consequence: MissingDataStrategy defaults to 'fail', causing RuntimeError when market data is unavailable at required
dates, preventing backtest completion
stage_ids:
- backtesting
- id: finance-C-089
when: When implementing a processor that computes values from children data
action: check ProcessorResult.success before accessing data
severity: high
kind: domain_rule
modality: must
consequence: Accessing children_data without checking success flag leads to AttributeError or incorrect results when processing
fails
stage_ids:
- analytics_processing
- id: finance-C-090
when: When date range filtering is applied to processor results
action: use numpy datetime64 conversion for timestamp comparisons
severity: high
kind: domain_rule
modality: must
consequence: Comparing pandas Series index with date/datetime objects without conversion causes TypeError or silent filtering
failures
stage_ids:
- analytics_processing
- id: finance-C-091
when: When a processor receives empty or None data from query
action: return ProcessorResult with success=False and error message
severity: high
kind: domain_rule
modality: must
consequence: Silent handling of empty data causes NaN propagation through the grid, producing misleading outputs
stage_ids:
- analytics_processing
- id: finance-C-092
when: When building the processor dependency graph
action: set parent reference and parent_attr on child processors
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing parent references break the recursive calculation chain, leaving dependent processors uncalculated
stage_ids:
- analytics_processing
- id: finance-C-093
when: When a leaf processor calculation completes
action: traverse up the parent chain to recalculate dependent processors
severity: high
kind: architecture_guardrail
modality: must
consequence: Skipping parent traversal leaves parent processors with stale values, causing incorrect aggregated results
stage_ids:
- analytics_processing
- id: finance-C-094
when: When processing with a measure_processor=True flag
action: apply date range filtering to the processor result
severity: medium
kind: architecture_guardrail
modality: must_not
consequence: Applying date range mask to measure processor results corrupts cross-sectional calculations that require
full date coverage
stage_ids:
- analytics_processing
- id: finance-C-095
when: When setting polling_time on DataGrid
action: set value to 0 or >= 10000 milliseconds
severity: high
kind: resource_boundary
modality: must
consequence: Invalid polling_time causes MqValueError, preventing DataGrid from being used in live dashboards
stage_ids:
- analytics_processing
- id: finance-C-096
when: When querying data with RelativeDate parameters
action: resolve RelativeDate to concrete date before query execution
severity: high
kind: operational_lesson
modality: must
consequence: Unresolved RelativeDate causes API query failures or returns empty datasets
stage_ids:
- analytics_processing
- id: finance-C-097
when: When using ProcessPoolExecutor for processor execution
action: pass pool to every update() and calculate() call in the chain
severity: medium
kind: operational_lesson
modality: must
consequence: Omitting pool in parent.calculate() prevents parallel execution, degrading performance for large grids
stage_ids:
- analytics_processing
- id: finance-C-098
when: When initializing DataGrid
action: call initialize() before calling poll()
severity: high
kind: architecture_guardrail
modality: must
consequence: Calling poll() without initialize() causes AttributeError due to uninitialized _cells and _data_queries
stage_ids:
- analytics_processing
- id: finance-C-099
when: When aggregating data queries for execution
action: group queries by dataset_id to minimize API calls
severity: medium
kind: architecture_guardrail
modality: must
consequence: Ungrouped queries cause excessive API roundtrips, significantly degrading DataGrid response time
stage_ids:
- analytics_processing
- id: finance-C-100
when: When handling special cells (EntityProcessor, CoordinateProcessor)
action: process them in _process_special_cells before data queries
severity: high
kind: architecture_guardrail
modality: must
consequence: Skipping special cell processing causes entity data to be missing from the grid output
stage_ids:
- analytics_processing
- id: finance-C-101
when: When polling DataGrid for live updates
action: claim that data is real-time or guarantee immediate consistency
severity: medium
kind: claim_boundary
modality: must_not
consequence: Presenting polling-based data as real-time misleads users about data freshness and system capabilities
stage_ids:
- analytics_processing
- id: finance-C-102
when: When returning numeric values in DataGrid
action: round values according to column format precision
severity: low
kind: domain_rule
modality: must
consequence: Unrounded float values cause display inconsistency and potential JSON serialization issues
stage_ids:
- analytics_processing
- id: finance-C-103
when: When creating a DataCoordinate with DataFrequency.DAILY
action: include start and end date parameters in the DataQuery
severity: medium
kind: operational_lesson
modality: must
consequence: Missing date range causes API to return all available data, potentially causing memory issues or slow responses
stage_ids:
- analytics_processing
- id: finance-C-106
when: When processing price or return series for volatility calculation
action: Handle empty series gracefully by returning the empty series immediately
severity: high
kind: domain_rule
modality: must
consequence: Functions operating on empty series will crash or produce NaN results, breaking downstream calculations and
causing silent failures in financial analytics pipelines.
stage_ids:
- timeseries_analysis
- id: finance-C-109
when: When applying the vol_swap_volatility function
action: Use annualization_factor parameter explicitly instead of relying on auto-detection
severity: high
kind: resource_boundary
modality: must
consequence: Volatility swap valuations will use incorrect annualization scaling, causing systematic mispricing of approximately
2-7% depending on the chosen factor vs the industry standard of 252.
stage_ids:
- timeseries_analysis
- id: finance-C-110
when: When using string-based window specifications in time series functions
action: Use business day offset patterns (b or B) in string-based window definitions
severity: high
kind: operational_lesson
modality: must_not
consequence: Business day offsets cause ValueError in lag operations because the system cannot reliably determine business
days without explicit calendar information, leading to calculation failures.
stage_ids:
- timeseries_analysis
- id: finance-C-111
when: When defining Window objects for volatility calculations
action: Set ramp-up value equal to window size minus one (r = w - 1) for vol_swap_volatility
severity: high
kind: operational_lesson
modality: must
consequence: Volatility swap rolling window will not include the correct number of returns, producing systematically biased
volatility estimates for the swap payoff calculation.
stage_ids:
- timeseries_analysis
- id: finance-C-112
when: When using the lag function with time-based offsets
action: Verify input series has a DatetimeIndex for string-based lag offsets
severity: high
kind: architecture_guardrail
modality: must
consequence: Lagging by relative dates (like '1y', '1m', '1d') requires DatetimeIndex to compute date arithmetic; non-datetime
indices will cause type errors in the lag operation.
stage_ids:
- timeseries_analysis
- id: finance-C-113
when: When using LinearRegression on financial time series
action: Filter out NaN, infinite, and missing values before fitting the model
severity: high
kind: architecture_guardrail
modality: must
consequence: OLS regression will fail or produce degenerate results when NaN or infinite values are present in the input
data, causing the model to be uncomputable.
stage_ids:
- timeseries_analysis
- id: finance-C-114
when: When using string window definitions like '1m', '1d', '1w'
action: Convert window strings to proper DateOffset objects via _to_offset function
severity: high
kind: architecture_guardrail
modality: must
consequence: String-based windows will not be properly interpreted as time durations, causing rolling calculations to
use incorrect window boundaries or fail entirely.
stage_ids:
- timeseries_analysis
- id: finance-C-115
when: When computing rolling statistics with DateOffset windows
action: Use rolling_offset function for time-based windows instead of simple integer-based rolling
severity: high
kind: architecture_guardrail
modality: must
consequence: Time-based rolling calculations will use observation counts instead of time durations, producing incorrect
statistical values for sparse or irregular time series.
stage_ids:
- timeseries_analysis
- id: finance-C-116
when: When extending pandas Series for financial time series
action: Preserve datetime index and custom attributes through all operations using __finalize__
severity: high
kind: architecture_guardrail
modality: must
consequence: Financial time series operations will lose critical datetime information or custom metadata, corrupting downstream
calculations that depend on temporal alignment.
stage_ids:
- timeseries_analysis
- id: finance-C-117
when: When calculating mean using QUADRATIC mean type
action: Square values before averaging, then take the square root of the result
severity: high
kind: domain_rule
modality: must
consequence: Quadratic mean (RMS) calculation will produce incorrect results if values are not squared before averaging,
breaking zero-mean volatility calculations for volatility swaps.
stage_ids:
- timeseries_analysis
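# Worked example for finance-C-117 (illustrative values, not taken from the library):
#   values             = 3, 4
#   squares            = 9, 16
#   mean of squares    = (9 + 16) / 2 = 12.5
#   quadratic mean     = sqrt(12.5) ≈ 3.5355
# Averaging first and skipping the squares would give the arithmetic mean, 3.5, instead.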
- id: finance-C-118
when: When computing volatility with assume_zero_mean=True
action: Use population standard deviation formula (N divisor) instead of sample formula (N-1 divisor)
severity: high
kind: domain_rule
modality: must
consequence: Volatility swap payoff calculations will systematically underestimate or overestimate realized variance by
a factor of N/(N-1), leading to incorrect settlement amounts.
stage_ids:
- timeseries_analysis
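# Worked example for finance-C-118 (illustrative returns, zero mean assumed):
#   returns            = 0.01, -0.02, 0.01      (N = 3)
#   sum of squares     = 0.0001 + 0.0004 + 0.0001 = 0.0006
#   N divisor          : variance = 0.0006 / 3 = 0.0002, volatility ≈ 0.01414
#   N-1 divisor        : variance = 0.0006 / 2 = 0.0003, volatility ≈ 0.01732
# The two variances differ by the factor N/(N-1) = 1.5 named in the consequence above.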
- id: finance-C-119
when: When validating Window ramp-up parameter
action: Verify ramp value is non-negative and less than or equal to series length
severity: high
kind: domain_rule
modality: must
consequence: Invalid ramp-up values will cause IndexError exceptions or produce truncated results in rolling window calculations,
breaking financial analytics pipelines.
stage_ids:
- timeseries_analysis
- id: finance-C-120
when: When computing correlation between price or return series
action: Align series indices to use only common dates before correlation calculation
severity: high
kind: architecture_guardrail
modality: must
consequence: Misaligned series will produce NaN correlation values or incorrect correlation estimates, leading to faulty
risk calculations and portfolio optimization errors.
stage_ids:
- timeseries_analysis
- id: finance-C-121
when: When working with technical indicators that require DatetimeIndex
action: Convert series index to DatetimeIndex before passing to technical analysis functions
severity: high
kind: architecture_guardrail
modality: must
consequence: Technical indicators will raise MqValueError when series lacks DatetimeIndex, causing trading signal generation
to fail.
stage_ids:
- timeseries_analysis
- id: finance-C-122
when: When validating user inputs for type parameters
action: Raise MqTypeError when boolean parameters receive non-boolean values
severity: high
kind: domain_rule
modality: must
consequence: Silent type coercion or unexpected behavior will occur if boolean parameters receive non-boolean values,
leading to incorrect model fits or silent failures.
stage_ids:
- timeseries_analysis
- id: finance-C-123
when: When computing annualized volatility from price series
action: Apply proper annualization factor based on data frequency or explicit parameter
severity: high
kind: resource_boundary
modality: must
consequence: Volatility estimates will be incorrectly scaled for aggregation across different data frequencies, causing
risk models to under- or overestimate portfolio risk by factors of 2-12x.
stage_ids:
- timeseries_analysis
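# Worked example for finance-C-123, assuming the usual square-root-of-time scaling for
# uncorrelated returns (the factors 252 and 12 are common conventions, not library output):
#   annualized vol = per-period vol * sqrt(periods per year)
#   daily   vol 1% : 0.01 * sqrt(252) ≈ 0.1587  (15.87%)
#   monthly vol 1% : 0.01 * sqrt(12)  ≈ 0.0346  (3.46%)
# Applying the daily factor to monthly data would overstate volatility by sqrt(252/12) ≈ 4.58x.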
- id: finance-C-125
when: When using the vol_swap_volatility function
action: Pass a Window object with ramp-up equal to n_days minus one to match vol swap conventions
severity: high
kind: operational_lesson
modality: must
consequence: Volatility swap rolling window will include an extra return observation, causing the variance calculation
to include N+1 returns instead of N returns for the payoff calculation.
stage_ids:
- timeseries_analysis
- id: finance-C-126
when: When initializing ExtendedSeries from external data
action: Normalize datetime index to nanosecond precision using _normalize_dtidx function
severity: medium
kind: architecture_guardrail
modality: must
consequence: Series with non-nanosecond datetime precision will cause alignment errors when joining with other financial
time series, leading to incorrect merged datasets.
stage_ids:
- timeseries_analysis
- id: finance-C-127
when: When implementing risk model data retrieval
action: Use the get_data method to combine multiple measures in a single API call
severity: medium
kind: domain_rule
modality: must
consequence: Multiple separate API calls increase latency and may cause rate limiting; combined calls via get_data are
optimized for efficiency
stage_ids:
- risk_models
- id: finance-C-128
when: When accessing risk model ID property
action: Attempt to modify the risk model ID after instantiation
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Risk model ID is immutable with no setter; attempting to modify will raise AttributeError or create inconsistent
state
stage_ids:
- risk_models
- id: finance-C-129
when: When uploading risk model data via partial upload
action: Upload factorData and covarianceMatrix first before asset data on the same date
severity: high
kind: operational_lesson
modality: must
consequence: On partial uploads, newer data replaces existing data; uploading asset data before factor data can leave
model in inconsistent state with missing factor definitions
stage_ids:
- risk_models
- id: finance-C-130
when: When computing factor attribution for an asset
action: Verify the asset has end-of-day price available for the requested date
severity: high
kind: domain_rule
modality: must
consequence: Factor attribution requires spot price to convert factor exposures and returns; missing price data causes
calculation failure with MqValueError
stage_ids:
- risk_models
- id: finance-C-131
when: When computing factor attribution for an asset
action: Verify the asset is covered by the risk model on the specified date
severity: high
kind: domain_rule
modality: must
consequence: Asset must be in the risk model universe on the given date; attempting attribution for uncovered asset raises
MqValueError
stage_ids:
- risk_models
- id: finance-C-132
when: When making API requests to risk model endpoints
action: Handle MqRequestError with status >= 500 as potential timeout requiring retry
severity: high
kind: resource_boundary
modality: must
consequence: Server errors (500-599) may indicate timeouts; without retry logic, data retrieval fails silently or returns
incomplete results
stage_ids:
- risk_models
- id: finance-C-133
when: When making API requests to risk model endpoints
action: Respect rate limiting by handling MqRateLimitedError (status 429) with backoff
severity: high
kind: resource_boundary
modality: must
consequence: Exceeding API rate limits without proper backoff causes request rejection and potential service disruption
stage_ids:
- risk_models
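# Illustration for finance-C-133 (a minimal sketch of exponential backoff, not the library's
# retry logic). RateLimited is a hypothetical stand-in for MqRateLimitedError, and
# call_with_backoff, retries and base_delay are assumed names.
#
#   import time
#
#   class RateLimited(Exception):
#       pass
#
#   def call_with_backoff(fn, retries=4, base_delay=1.0):
#       for attempt in range(retries):
#           try:
#               return fn()
#           except RateLimited:
#               time.sleep(base_delay * 2 ** attempt)   # waits 1s, 2s, 4s, 8s
#       return fn()                                     # last attempt; a further error propagates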
- id: finance-C-134
when: When uploading large risk model datasets
action: Use upload_data with automatic batching when universe size exceeds max_asset_batch_size (default 10000)
severity: medium
kind: resource_boundary
modality: must
consequence: Single large upload may exceed API payload limits causing failure; automatic batching ensures reliable data
transfer
stage_ids:
- risk_models
- id: finance-C-135
when: When using risk model data retrieval methods
action: Claim risk model outputs as predictions of future performance
severity: high
kind: claim_boundary
modality: must_not
consequence: Risk model outputs (factor exposures, attributions) are mathematical decompositions of historical data, not
forward-looking predictions; presenting them as forecasts misleads users about uncertainty
stage_ids:
- risk_models
- id: finance-C-136
when: When using deprecated upload_partial_data method
action: Continue using upload_partial_data instead of upload_data
severity: medium
kind: operational_lesson
modality: must_not
consequence: Deprecated method may be removed in future versions; using upload_data ensures access to automatic batching
and current API behavior
stage_ids:
- risk_models
- id: finance-C-137
when: When uploading asset coverage data to enable UI visibility
action: Upload coverage data within the last 5 days from the risk model calendar
severity: medium
kind: operational_lesson
modality: must
consequence: Coverage data outside 5-day window does not enable model visibility in Marquee UI dropdown for users with
'execute' capabilities
stage_ids:
- risk_models
- id: finance-C-138
when: When computing factor attribution calculations
action: Verify factor risk contributions plus specific risk sum to total risk for correctly attributed portfolios
severity: high
kind: domain_rule
modality: must
consequence: Portfolio factor attribution is mathematically complete only when all factor contributions plus specific
risk equal total risk; violations indicate data quality or calculation errors
stage_ids:
- risk_models
- id: finance-C-139
when: When accessing risk model private attributes
action: Modify private attributes (prefixed with double underscore) of risk model instances
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Private attributes like __id, __name, __type are internal implementation; external modification bypasses
validation and can corrupt model state affecting caching and reuse
stage_ids:
- risk_models
- id: finance-C-140
when: When making HTTP requests to Marquee API
action: Set request timeout to DEFAULT_TIMEOUT (65 seconds) to prevent indefinite blocking
severity: medium
kind: resource_boundary
modality: must
consequence: Requests without timeout may hang indefinitely on network issues; 65-second timeout balances reliability
with allowing complex queries to complete
stage_ids:
- risk_models
- id: finance-C-141
when: When calling ReportJobFuture.result() on a cancelled report job
action: attempt to retrieve results from a cancelled report job
severity: high
kind: domain_rule
modality: must_not
consequence: Attempting to retrieve data from a cancelled report job raises MqValueError with message 'This report job
in status cancelled. Cannot retrieve results.'
stage_ids:
- reporting
- id: finance-C-142
when: When calling ReportJobFuture.result() on a report job in error status
action: attempt to retrieve results from a report job that encountered an error
severity: high
kind: domain_rule
modality: must_not
consequence: Attempting to retrieve data from an errored report job raises MqValueError, preventing invalid downstream
calculations based on incomplete data
stage_ids:
- reporting
- id: finance-C-146
when: When polling for report job completion with default parameters
action: use the default retry configuration of max_retries=10 with sleep_time=10 seconds
severity: medium
kind: resource_boundary
modality: must
consequence: Default timeout is 100 seconds (10 retries * 10 second sleep) before raising timeout error, ensuring reports
have adequate time to complete
stage_ids:
- reporting
- id: finance-C-147
when: When synchronously running a report without is_async flag
action: account for blocking synchronous execution, which polls up to 100 times with a 6-second sleep between attempts
severity: medium
kind: resource_boundary
modality: must
consequence: Synchronous execution polls for completion with 6-second intervals, allowing up to 600 seconds before timeout,
requiring appropriate timeout handling
stage_ids:
- reporting
- id: finance-C-148
when: When getting raw factor risk report results
action: use get_results() method to retrieve raw computational data from the API
severity: medium
kind: architecture_guardrail
modality: must
consequence: get_results() returns raw API response data suitable for programmatic analysis, distinct from Marquee UI
format returned by get_view()
stage_ids:
- reporting
- id: finance-C-149
when: When getting Marquee UI-formatted factor risk report data
action: use get_view() method to retrieve data formatted for Marquee interface display
severity: medium
kind: architecture_guardrail
modality: must
consequence: get_view() returns data structured as displayed in Marquee UI, which is different from raw results in get_results()
stage_ids:
- reporting
- id: finance-C-150
when: When converting reports to PerformanceReport instances
action: cast non-performance reports to PerformanceReport class
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Attempting to convert a non-performance report to PerformanceReport raises MqValueError('This report is not
a performance report.')
stage_ids:
- reporting
- id: finance-C-151
when: When retrieving factor risk table data
action: check for 'table' key in response before accessing table data
severity: high
kind: architecture_guardrail
modality: must
consequence: When API returns warning instead of table data, MqValueError is raised with the warning message, preventing
KeyError exceptions
stage_ids:
- reporting
- id: finance-C-152
when: When using Brinson attribution with different aggregation types
action: validate that arithmetic and geometric aggregation produce mathematically consistent attribution breakdowns
severity: high
kind: operational_lesson
modality: must
consequence: Inconsistent aggregation results across arithmetic vs geometric types would lead to incorrect performance
attribution and misallocated gains/losses
stage_ids:
- reporting
- id: finance-C-153
when: When validating attribution breakdowns against total P&L
action: verify that the individual attribution components (security selection, allocation, interaction) sum to the total portfolio
P&L
severity: high
kind: domain_rule
modality: must
consequence: Attribution components that do not sum to total P&L indicate data integrity issues or calculation errors,
leading to incorrect performance reporting
stage_ids:
- reporting
- id: finance-C-154
when: When reporting backtest or simulated performance results
action: present historical report calculations as indicative of future trading returns or real execution results
severity: high
kind: claim_boundary
modality: must_not
consequence: Performance reports calculate historical analytics on historical positions, which do not guarantee similar
results in live trading due to market impact, slippage, and execution delays
stage_ids:
- reporting
- id: finance-C-155
when: When handling report execution timeouts in production systems
action: treat report jobs stuck in 'waiting' status as a normal condition and ignore them
severity: high
kind: operational_lesson
modality: must_not
consequence: Reports stuck in 'waiting' status indicate a system issue requiring manual intervention; ignoring this leads
to silent failures in reporting pipelines
stage_ids:
- reporting
- id: finance-C-156
when: When deciding between get_results() and get_view() for factor risk data
action: interchange raw results and UI view data without understanding their different purposes
severity: medium
kind: architecture_guardrail
modality: should_not
consequence: Mixing raw computational results with UI-formatted data leads to incorrect data interpretation and potential
calculation errors
stage_ids:
- reporting
- id: finance-C-157
when: When creating PerformanceReport instances from non-Portfolio sources
action: verify position_source_type is explicitly set to PositionSourceType.Portfolio
severity: high
kind: architecture_guardrail
modality: must
consequence: PerformanceReport is designed specifically for portfolio-level analytics; using it with other position source
types leads to incorrect report type assignment
stage_ids:
- reporting
- id: finance-C-158
when: When configuring report data retrieval with return_format parameter
action: specify return_format as ReturnFormat.DATA_FRAME for pandas DataFrame output or ReturnFormat.JSON for dict output
severity: medium
kind: resource_boundary
modality: must
consequence: Incorrect return_format settings lead to type mismatches in downstream code expecting specific data structures
stage_ids:
- reporting
- id: finance-C-159
when: When handling report execution polling loops
action: implement proper exit conditions to prevent infinite loops during report execution
severity: high
kind: operational_lesson
modality: must
consequence: Without proper retry limits and timeout handling, polling loops can hang indefinitely, blocking resources
and causing system deadlocks
stage_ids:
- reporting
- id: finance-C-160
when: When initializing ReportJobFuture for status monitoring
action: 'capture all required job metadata: report_id, job_id, report_type, start_date, and end_date'
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing job metadata prevents proper status polling and result retrieval, leading to incomplete or failed
report monitoring
stage_ids:
- reporting
- id: finance-C-163
when: When calling Entity.get_entitlements() on an entity without entitlements
action: check that the entity has entitlements before accessing them via get_entitlements
severity: high
kind: domain_rule
modality: must
consequence: Entity.get_entitlements() raises ValueError 'This entity does not have entitlements' when entitlements field
is absent, breaking entitlement-based operations
stage_ids:
- entity_management
- id: finance-C-164
when: When using PositionedEntity.get_latest_position_set and related methods
action: pass PositionType enum value (OPEN, CLOSE, or ANY) for position_type parameter
severity: high
kind: domain_rule
modality: must
consequence: Passing invalid position_type causes attribute access errors or API errors when constructing URL query parameters
stage_ids:
- entity_management
- id: finance-C-165
when: When implementing entity lookup via the Entity.get factory method
action: use the factory pattern in Entity.get for all entity type resolution
severity: high
kind: architecture_guardrail
modality: must
consequence: Bypassing Entity.get factory method leads to entity type resolution errors and breaks unified access pattern
across the platform
stage_ids:
- entity_management
- id: finance-C-166
when: When dispatching entity operations in PositionedEntity
action: check positioned_entity_type property before routing to PORTFOLIO or ASSET specific API calls
severity: high
kind: architecture_guardrail
modality: must
consequence: Missing positioned_entity_type check causes AttributeError or wrong API endpoint being called, corrupting
position data retrieval
stage_ids:
- entity_management
- id: finance-C-167
when: When resolving entity type from string to EntityType enum
action: use EntityType(value) conversion only for values in the _entity_to_endpoint mapping
severity: high
kind: architecture_guardrail
modality: must
consequence: EntityType constructor raises ValueError for entity types not in _entity_to_endpoint, breaking entity retrieval
for unsupported types like BACKTEST or HEDGE
stage_ids:
- entity_management
- id: finance-C-168
when: When retrieving positions via PositionedEntity for a PORTFOLIO
action: convert PositionType enum to string via position_type.value before passing to GsPortfolioApi
severity: high
kind: architecture_guardrail
modality: must
consequence: Passing PositionType enum directly to GsPortfolioApi.get_latest_positions causes API parameter type mismatch,
returning empty or incorrect position data
stage_ids:
- entity_management
- id: finance-C-170
when: When using Entity.get for entity type resolution
action: expect Entity.get to resolve entity types not covered by _entity_to_endpoint
severity: high
kind: resource_boundary
modality: must_not
consequence: Entity types like BACKTEST, HEDGE, REPORT, SCENARIO are not in _entity_to_endpoint, causing Entity.get to
fail with KeyError when accessing endpoint mapping
stage_ids:
- entity_management
- id: finance-C-171
when: When using PositionedEntity for position data retrieval
action: expect PositionedEntity to support entity types other than PORTFOLIO or ASSET
severity: high
kind: resource_boundary
modality: must_not
consequence: PositionedEntity raises NotImplementedError for unsupported entity types, causing all position queries to
fail at runtime
stage_ids:
- entity_management
- id: finance-C-172
when: When querying position data with default position_type
action: use PositionType.CLOSE as the default position type for end-of-day positioning analysis
severity: medium
kind: operational_lesson
modality: must
consequence: Using wrong default position_type (e.g., OPEN instead of CLOSE) returns corporate-action-adjusted positions
instead of trading activity positions, causing incorrect analytics
stage_ids:
- entity_management
- id: finance-C-173
when: When using entity ID types for lookup
action: use MQID (Marquee ID) for direct entity retrieval by unique identifier
severity: medium
kind: operational_lesson
modality: must
consequence: Using non-MQID identifier types triggers query-based lookup which may return stale or incorrect results if
multiple entities share similar identifier values
stage_ids:
- entity_management
- id: finance-C-174
when: When implementing position set updates for PORTFOLIO entities
action: check for missing position quantities and call price() on the PositionSet to fill them before calling update_positions
severity: high
kind: operational_lesson
modality: must
consequence: Calling update_positions with position sets containing None quantities without first pricing them causes
attribute errors or null values in position updates
stage_ids:
- entity_management
- id: finance-C-175
when: When resolving entities from analytics references
action: wrap Entity.get calls in try-except block to handle MqRequestError
severity: high
kind: operational_lesson
modality: must
consequence: Entity.get raises MqRequestError for invalid or inaccessible entity IDs, causing analytics pipeline failures
if not properly handled
stage_ids:
- entity_management
- id: finance-C-177
when: When passing DataCoordinate objects from data_ingestion to instrument_modeling
action: populate dataset_id field with a valid Marquee dataset identifier and specify measure as a DataMeasure enum value
severity: high
kind: domain_rule
modality: must
consequence: Without a valid dataset_id and measure, get_series() returns None, causing downstream pricing calculations to fail with
AttributeError
stage_ids:
- data_ingestion
- instrument_modeling
- id: finance-C-179
when: When instrument parameters contain None values requiring resolution
action: invoke resolve() to fill in missing parameters via GS pricing service before passing to pricing_context
severity: high
kind: domain_rule
modality: must
consequence: Unresolved instruments with None parameters cause downstream risk calculations to return ErrorValue or incorrect
market values
stage_ids:
- instrument_modeling
- pricing_context
- id: finance-C-181
when: When passing instruments into a PricingContext
action: enter the PricingContext using a 'with' statement so that __exit__ triggers calculation execution via __calc()
severity: high
kind: architecture_guardrail
modality: must
consequence: Without proper context entry/exit, pending calculations never execute and dollar_price() returns unevaluated
PricingFuture objects
stage_ids:
- instrument_modeling
- pricing_context
- id: finance-C-182
when: When calculating risk measures in pricing_context
action: specify the currency parameter for every currency-sensitive RiskMeasure type (EqDelta, EqGamma, IRDelta, FXDelta,
etc.)
severity: high
kind: domain_rule
modality: must
consequence: Risk measures without explicit currency return values in inconsistent currencies, leading to incorrect P&L
attribution and hedge sizing
stage_ids:
- pricing_context
- risk_analysis
- id: finance-C-183
when: When passing RiskMeasure objects to risk calculation
action: use concrete RiskMeasure subclasses from gs_quant.risk (DollarPrice, IRDelta, FXDelta, etc.) rather than raw measure_type
strings
severity: high
kind: architecture_guardrail
modality: must
consequence: Using string measure types bypasses the RiskMeasure type system causing risk API to reject requests or return
malformed results
stage_ids:
- pricing_context
- risk_analysis
- id: finance-C-184
when: When using backtesting with strategy_risk triggers that evaluate risk_measure thresholds
action: verify risk measures passed to backtest runs match the trigger_level units and aggregation_level specified in
RiskTriggerRequirements
severity: high
kind: operational_lesson
modality: must
consequence: Mismatched risk measure aggregation causes triggers to fire at wrong thresholds, resulting in incorrect hedge
execution timing and excess transaction costs
stage_ids:
- pricing_context
- risk_analysis
- id: finance-C-185
when: When configuring backtest trigger_level thresholds for risk-based triggers
action: use trigger_level values without verifying they align with the risk measure unit (e.g., using 50000 for FXDelta
without knowing if it's in USD or EUR)
severity: high
kind: operational_lesson
modality: must_not
consequence: Trigger fires at wrong risk levels causing premature or delayed hedging, leading to unhedged exposure or
excessive hedging costs
stage_ids:
- risk_analysis
- backtesting
- id: finance-C-186
when: When computing backtest total P&L
action: sum Price + Cumulative Cash + Transaction Costs columns (not just Price) to get accurate strategy performance
severity: medium
kind: domain_rule
modality: must
consequence: Omitting transaction costs underestimates true strategy cost by 1-5% annually, leading to incorrect strategy
comparison and allocation decisions
stage_ids:
- risk_analysis
- backtesting
- id: finance-C-187
when: When passing risk measures to run_backtest
action: include Price risk measure when transaction costs need to be computed, as AddTradeAction uses Price to calculate
proportional transaction costs
severity: high
kind: operational_lesson
modality: must
consequence: Missing Price measure causes ScaledTransactionModel to fail calculating proportional costs, resulting in
zero transaction costs being recorded
stage_ids:
- risk_analysis
- backtesting
- id: finance-C-188
when: When querying market data for analytics processing
action: set DataContext.start_date and DataContext.end_date boundaries to match the required temporal scope before calling
get_data_series()
severity: high
kind: architecture_guardrail
modality: must
consequence: DataContext defaults (last 30 days) cause truncated time series in analytics results, producing incorrect
volatility and correlation calculations
stage_ids:
- data_ingestion
- analytics_processing
- id: finance-C-189
when: When using analytics processors that perform diff or ratio operations on time series
action: validate that input time series are properly aligned with matching indices before passing to DiffProcessor or
similar operators
severity: high
kind: operational_lesson
modality: must
consequence: Misaligned time series indices produce NaN values in diff/ratio results, corrupting downstream correlation
and regression analytics
stage_ids:
- data_ingestion
- analytics_processing
- id: finance-C-190
when: When generating reports using analytics computed via processors
action: verify that BaseProcessor.calculate() completes successfully and check for ErrorValue in results before including
in reports
severity: high
kind: architecture_guardrail
modality: must
consequence: Including ErrorValue results in reports causes downstream aggregation failures and produces misleading dashboard
visualizations
stage_ids:
- analytics_processing
- reporting
- id: finance-C-191
when: When passing factor exposure data from risk_models to risk_analysis
action: verify risk_model_id used in get_risk_model_factor_data() matches the model_id specified in risk_analysis calculations
severity: high
kind: domain_rule
modality: must
consequence: Mismatched risk_model_id causes factor exposures from different models to contaminate risk attribution, producing
inconsistent P&L explain results
stage_ids:
- risk_models
- risk_analysis
- id: finance-C-192
when: When uploading custom risk model data including covariance matrices
action: validate covariance matrix is positive semi-definite before upload to prevent numerical instability in risk calculations
severity: high
kind: resource_boundary
modality: must
consequence: Covariance matrices that are not positive semi-definite cause Cholesky decomposition failures in the risk engine, resulting
in calculation timeouts or NaN outputs
stage_ids:
- risk_models
- risk_analysis
- id: finance-C-193
when: When using backtest equity curves to justify live trading decisions
action: present backtest results as predictions of future returns or as guarantees of live trading results; they are hypothetical
performance only
severity: high
kind: claim_boundary
modality: must_not
consequence: Misrepresenting backtest results as predictive guarantees violates financial industry best practices and
may expose users to unexpected live trading losses
stage_ids:
- risk_analysis
- backtesting
- id: finance-C-194
when: When data_ingestion receives entity identifiers from entity_management
action: normalize identifier types (bbid, ric, Cusip, etc.) using map_identifiers() before constructing DataCoordinate
dimensions
severity: high
kind: operational_lesson
modality: must
consequence: Using unmapped identifiers as dataset dimensions causes query failures and returns empty data for instruments
not matching the identifier type expected by the dataset
stage_ids:
- entity_management
- data_ingestion
- id: finance-C-195
when: When generating reports from entity_management portfolio references
action: validate portfolio_id exists via get_portfolio() before passing to reporting to avoid downstream API errors
severity: high
kind: architecture_guardrail
modality: must
consequence: Invalid portfolio_id causes get_positions() and get_attribution() to raise MqValueError, blocking report
generation entirely
stage_ids:
- entity_management
- reporting
- id: finance-C-197
when: When considering skipping transaction cost modeling in backtests
action: skip transaction cost modeling because the strategy looks simple; costs are required for realistic performance
estimates
severity: high
kind: rationalization_guard
modality: must_not
consequence: Strategies that appear profitable before costs may be loss-making after realistic transaction costs are applied,
leading to poor allocation decisions
stage_ids:
- risk_analysis
- backtesting
- id: finance-C-201
when: When running backtests using the gs_quant backtesting framework
action: Pass a GenericEngine or engine subclass instance to run_backtest() — the engine is required for strategy execution
severity: high
kind: architecture_guardrail
modality: must
consequence: Backtest run_backtest() method requires a non-None engine argument; without it the backtest cannot execute
trades, compute P&L, or generate results
- id: finance-C-202
when: When constructing FX options for pricing or backtesting
action: Set premium=0 explicitly on FXOption, FXBinary, and FXMultiCrossBinary instruments to obtain non-zero DollarPrice
values
severity: high
kind: operational_lesson
modality: must
consequence: Without premium=0, instrument resolution sets a premium that makes DollarPrice zero, causing backtest P&L
to be meaningless (shows ~0 instead of actual option value)
- id: finance-C-203
when: When querying Marquee datasets via Dataset.get_data()
action: Use the correct symbol dimension name for the dataset (e.g., 'bbid' for equities, 'assetId' for GS assets, 'ticker'
for other instruments)
severity: high
kind: operational_lesson
modality: must
consequence: Wrong dimension names cause silent empty DataFrame returns with zero rows, leading to downstream calculations
producing zero or missing results without any error signal
- id: finance-C-204
when: When processing risk calculation results from calc() method calls
action: Call .result() on PricingFuture objects when inside an async PricingContext (is_async=True) to retrieve the actual
computed values
severity: high
kind: operational_lesson
modality: must
consequence: PricingFuture objects returned from async contexts contain unresolved futures; accessing numeric values without
.result() returns the future wrapper instead of the FloatWithInfo or DataFrameWithInfo value, causing TypeError in downstream
arithmetic
- id: finance-C-205
when: When presenting or reporting this system's backtested returns or P&L to users or stakeholders
action: Claim or imply that backtested returns equal expected live trading returns — backtesting does not account for
market impact, financing costs, execution delays, slippage, counterparty credit risk, or liquidity constraints
severity: high
kind: claim_boundary
modality: must_not
consequence: Users or investors make live capital allocation decisions based on inflated backtest returns, leading to
severe underperformance in live trading and potential financial loss when actual execution costs and market conditions
diverge from simulation assumptions
- id: finance-C-206
when: When presenting this system or its outputs to users without Marquee API credentials
action: Claim the system can perform live pricing, real-time risk calculations, or live trading execution — gs_quant requires
Marquee API credentials (client_id and client_secret) available to institutional Goldman Sachs clients, and without
them no pricing or data retrieval operations are possible
severity: high
kind: claim_boundary
modality: must_not
consequence: Users attempt live trading strategies or real-time analytics with an uninitialized session, receiving MqUninitialisedError
or MqAuthenticationError at runtime, leading to failed operations and potential financial losses from missed trading
opportunities
- id: finance-C-208
when: When extending the risk framework with custom RiskMeasure implementations
action: Implement the pricing_context() abstract method in every RiskMeasure subclass to enable context binding for the
pricing and market data parameters
severity: high
kind: architecture_guardrail
modality: must
consequence: RiskMeasure subclasses without pricing_context() fail to bind to PricingContext parameters, causing risk
calculations to use default market data instead of context-specified values, producing incorrect valuations
- id: finance-C-209
when: When processing time series data via ExtendedSeries or Dataset operations
action: Verify that every time series index is a pd.DatetimeIndex; ExtendedSeries and Dataset operations expect datetime-indexed
pandas Series for proper alignment and time-based operations
severity: high
kind: domain_rule
modality: must
consequence: Non-DatetimeIndex causes TypeError or incorrect results in time-based operations, windowing functions, and
pandas resampling operations throughout the timeseries module
- id: finance-C-210
when: When using backtest transaction cost models for live trading preparation
action: Claim or assume that backtest transaction cost models (ConstantTransactionModel, ScaledTransactionModel) accurately
reflect real execution costs — transaction costs in backtests are simplified approximations that do not account for
market liquidity variations, bid-ask spreads at execution time, or order sizing effects
severity: medium
kind: claim_boundary
modality: must_not
consequence: Strategies optimized for backtest transaction cost assumptions underperform live trading when actual execution
costs exceed model parameters, leading to negative alpha and strategy failure
- id: finance-C-211
when: When validating Entity IDs across the system
action: Validate every entity identifier against the EntityType enum; entity IDs must correspond to valid EntityType
values (Asset, Person, Org, etc.) to ensure proper resolution in the GS data catalog
severity: medium
kind: domain_rule
modality: must
consequence: Invalid entity IDs cause downstream data lookups to fail with MqValueError or return incorrect reference
data, breaking instrument resolution and portfolio construction
- id: finance-C-212
when: When setting up RealtimePricingContext intervals
action: Set interval to at least 1 minute (_MIN_INTERVAL) and less than 1 day (_MAX_INTERVAL) — intervals outside this
range raise ValueError
severity: high
kind: resource_boundary
modality: must
consequence: Invalid interval values cause ValueError at RealtimePricingContext construction, blocking intraday pricing
workflows
- id: finance-C-213
when: When implementing regression analysis in the timeseries module
action: Use statsmodels OLS for regression calculations to preserve full statistical inference capabilities (t-stats,
R-squared, confidence intervals)
severity: high
kind: domain_rule
modality: must
consequence: Replacing statsmodels OLS with numpy lstsq or sklearn LinearRegression strips statistical inference capabilities,
making backtest results unreliable for hypothesis testing and confidence estimation
derived_from_bd_id: BD-010
- id: finance-C-214
when: When implementing rolling variance calculations in the statistics module
action: Use ddof=1 (Bessel correction) for rolling variance calculations to provide unbiased sample variance estimation
severity: high
kind: domain_rule
modality: must
consequence: Using population variance (ddof=0) instead of sample variance systematically underestimates variance by factor
(n-1)/n, distorting risk calculations and statistical inference in backtests
derived_from_bd_id: BD-006
- id: finance-C-215
when: When implementing aggregate trigger requirement calculations
action: Use max() function to determine aggregate calc_type — path_dependent > semi_path_dependent > simple — ensuring
highest required complexity level dominates execution ordering
severity: high
kind: domain_rule
modality: must
consequence: Incorrect calc_type selection causes wrong execution ordering in trigger evaluation, potentially using stale
prices or creating race conditions in backtest simulations
derived_from_bd_id: BD-050
- id: finance-C-216
when: When using ContextBaseWithDefault.default_value() in the context module
action: Verify that auto-instantiation behavior matches expected usage — default_value() returns a cached class instance
on first access, not a class descriptor
severity: medium
kind: operational_lesson
modality: should
consequence: Relying on default_value() to return a class descriptor when it actually returns an instantiated object causes
type errors and unexpected behavior in downstream code
derived_from_bd_id: BD-039
- id: finance-C-217
when: When determining default pricing location for unsupported currencies
action: Explicitly provide location parameter when currency is not in supported mapping — _default_pricing_location raises
MqValueError for unsupported currencies
severity: medium
kind: operational_lesson
modality: should
consequence: Using unsupported currencies without explicit location causes MqValueError, potentially crashing batch jobs
or returning incorrect pricing based on wrong market conventions
derived_from_bd_id: BD-056
- id: finance-C-218
when: When using the framework's default transaction cost model for backtesting
action: Verify that default_transaction_cost() returning ConstantTransactionModel(0) matches actual broker fee structure;
for live trading, MUST explicitly configure non-zero transaction costs
severity: high
kind: operational_lesson
modality: must
consequence: Zero transaction cost default causes backtest to overestimate returns by excluding broker commissions, stamp
duties, and slippage; live trading P&L will be significantly lower than backtested
derived_from_bd_id: BD-048
- id: finance-C-219
when: When implementing transaction cost modeling in backtesting
action: Assume default_transaction_cost() provides meaningful cost modeling; it returns ConstantTransactionModel(0), i.e. zero
friction, by default, so realistic backtests need PercentageTransactionModel or another explicitly configured cost model
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Using zero-cost transaction model in backtest systematically overestimates strategy returns, causing live
trading P&L to fall far below backtested results
derived_from_bd_id: BD-048
- id: finance-C-220
when: When implementing or refactoring PricingContext usage in pricing calculations
action: Refactor or remove the nullcontext() fallback in PriceableImpl._pricing_context — when PricingContext is_entered=True,
returning nullcontext() is a mandatory safety mechanism to prevent nested context entry that would corrupt pricing state
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Removing nullcontext fallback allows re-entering an active PricingContext, causing double-applied transformations
or nested contexts with inconsistent state, resulting in silently incorrect pricing calculations
derived_from_bd_id: BD-043
- id: finance-C-222
when: When implementing or refactoring processor calculation logic
action: Remove the recursive parent traversal in BaseProcessor.calculate() — the bottom-up async update propagation through
parent chain is mandatory for correct analytics tree computation
severity: high
kind: domain_rule
modality: must_not
consequence: Removing recursive parent traversal breaks the bottom-up computation contract where child values must be
computed before parent values, causing incomplete or incorrect aggregated results in analytics trees
derived_from_bd_id: BD-055
- id: finance-C-224
when: When implementing or modifying batch query logic in Dataset.get_data_bulk
action: Constrain request_batch_size between 1 and 5 inclusive — this architectural limit prevents overwhelming the data
service while ensuring non-zero throughput
severity: high
kind: architecture_guardrail
modality: must
consequence: Increasing batch size beyond 5 risks overwhelming the data service with too many parallel requests, causing
request failures or throttling that disrupt backtesting workflows
derived_from_bd_id: BD-GAP-010
- id: finance-C-225
when: When implementing or modifying result composition logic
action: Validate market.location equality in ResultInfo.composition_info before composing results — mismatched locations
must raise ValueError, cross-location aggregation requires explicit normalization first
severity: high
kind: domain_rule
modality: must
consequence: Composing results across different market locations without normalization combines incompatible data sources,
conventions, or time zones, producing semantically invalid aggregated results
derived_from_bd_id: BD-058
- id: finance-C-226
when: When implementing GARCH volatility modeling with winsorized return data
action: Use GARCH-t (t-distribution) instead of GARCH with normal distribution when fitting on winsorized returns — normal
GARCH on winsorized data creates double-softening that severely underestimates tail risk
severity: high
kind: domain_rule
modality: must
consequence: GARCH with normal distribution on winsorized data systematically underestimates VaR and CVaR by 15-40%, producing
dangerously optimistic risk estimates that could lead to inadequate hedge ratios in live trading
derived_from_bd_id: BD-064
- id: finance-C-227
when: When implementing autocorrelation computation or modifying rolling_std
action: Use unbiased variance estimator with ddof=1 for autocorrelation confidence intervals per Bartlett's formula
severity: high
kind: domain_rule
modality: must
consequence: Using ddof=0 produces biased variance estimates, causing incorrect confidence intervals for autocorrelation
and potentially leading to false conclusions about serial dependence in financial time series
derived_from_bd_id: BD-013
- id: finance-C-228
when: When fitting SIR compartmental epidemiological models
action: Use Levenberg-Marquardt method (leastsq) via lmfit.minimize for nonlinear least squares fitting with parameter
bounds and uncertainty estimation
severity: high
kind: domain_rule
modality: must
consequence: Using alternative optimizers like Nelder-Mead simplex or differential evolution may fail to converge properly
for ill-conditioned epidemiological Jacobians, producing unreliable parameter estimates
derived_from_bd_id: BD-014
- id: finance-C-229
when: When implementing annualized rolling volatility calculations
action: Use ddof=1 (Bessel correction) for unbiased sample variance and 252-day annualization factor for US equity markets
severity: high
kind: domain_rule
modality: must
consequence: Using ddof=0 produces biased population variance, and using 250 or 365 days instead of 252 creates inconsistent
annualization; accumulated errors distort risk estimates and strategy allocation decisions
derived_from_bd_id: BD-002
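# Illustrative sketch for finance-C-229 (pandas-based; the helper name and the 22-day window are assumptions, not framework API):
#   import numpy as np
#   import pandas as pd
#
#   def annualized_rolling_vol(returns: pd.Series, window: int = 22) -> pd.Series:
#       # ddof=1 applies the Bessel correction; 252 is the US equity trading-day factor
#       return returns.rolling(window).std(ddof=1) * np.sqrt(252)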
- id: finance-C-230
when: When using exponential moving average with default parameters
action: Verify that span=14 (alpha=0.06, beta=0.94) matches the strategy's intended smoothing-lag tradeoff; alternative
spans of 20 or 26 are valid for different market conventions but fundamentally alter indicator behavior
severity: medium
kind: operational_lesson
modality: should
consequence: AI may default to different span values without recognizing that span=14 is the industry standard for short-term
momentum; changing span silently alters smoothing characteristics, causing momentum strategies to trigger on different
signals
derived_from_bd_id: BD-003
- id: finance-C-231
when: When implementing rolling covariance calculations for portfolio optimization
action: Use Bessel correction with ddof=1 for unbiased sample covariance when true means are estimated from rolling window
data
severity: high
kind: domain_rule
modality: must
consequence: Using ddof=0 produces biased covariance estimates, distorting correlation inputs to portfolio optimization
and potentially leading to suboptimal allocation that concentrates risk in incorrectly estimated asset relationships
derived_from_bd_id: BD-007
- id: finance-C-232
when: When implementing or using factory patterns with Action/Trigger/DataSource subclasses
action: Verify that subclasses are defined at module load time (not dynamically created after import), and confirm auto-registration
via __init_subclass__ has occurred before factory lookup
severity: medium
kind: operational_lesson
modality: should
consequence: Dynamically created subclasses after import time won't appear in the __subclasses__ registry, causing factory
lookups to silently fail to find registered subclasses and return incomplete or empty results
derived_from_bd_id: BD-042
- id: finance-C-233
when: When computing annualized volatility or converting between time frequencies in the backtesting framework
action: Use a consistent annualization factor throughout the calculation — either 252 for trading days (US equity) or
365 for calendar days — and verify both annualized volatility (via std * sqrt(N)) and frequency-to-period mapping use
the same factor; do not mix 252 and 365 in the same calculation pipeline
severity: high
kind: operational_lesson
modality: must
consequence: Inconsistent annualization factors (252 vs 365) produce systematically biased Sharpe ratios and risk metrics,
causing incorrect strategy selection based on flawed performance comparisons between different code paths
derived_from_bd_id: BD-059
- id: finance-C-234
when: When computing Information Ratio using rolling z-score normalized returns in the same pipeline
action: Use consistent ddof (degrees of freedom) across both z-score normalization and Information Ratio calculation —
both must use ddof=1 (sample std) or both ddof=0 (population std); verify np.std calls use matching conventions
severity: high
kind: domain_rule
modality: must
consequence: Mismatched ddof creates internal inconsistency where z-score distributions appear to have variance different
from 1.0 when plugged into IR calculation, systematically biasing risk-adjusted performance metrics upward or downward
derived_from_bd_id: BD-060
- id: finance-C-235
when: When implementing statistical computations involving z-score normalization and Information Ratio
action: Audit every np.std() call in the calculation pipeline for its ddof parameter; verify the same convention (0 or 1)
is used throughout, especially when combining z-score normalized returns with IR calculation
severity: medium
kind: operational_lesson
modality: should
consequence: Sample vs population std mismatch causes z-score-based return distributions to have apparent population variance
different from 1.0 when combined with IR calculation using ddof=0, silently biasing risk-adjusted performance metrics
without triggering obvious errors
derived_from_bd_id: BD-060
- id: finance-C-236
when: When implementing or modifying stochastic components in the codebase
action: Assume the framework provides comprehensive random seed management ensuring reproducible results across all stochastic
operations
severity: high
kind: claim_boundary
modality: must_not
consequence: Without full random seed coverage, tests may produce non-deterministic results, making it impossible to reproduce
bugs or verify fixes reliably in production environments
derived_from_bd_id: BD-GAP-022
- id: finance-C-237
when: When setting up test infrastructure or stochastic model initialization
action: Explicitly set random seeds before each stochastic operation using a seed management utility that tracks all
random number generators (NumPy, Python random, framework-specific RNGs) and verify reproducibility via test assertions
severity: high
kind: domain_rule
modality: must
consequence: Failing to set seeds explicitly for each RNG type leads to flaky tests and unreproducible behavior in production
where different RNG implementations may initialize differently
derived_from_bd_id: BD-GAP-022
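# Illustrative sketch for finance-C-237 (standard library and NumPy only; the helper name is hypothetical, and
# framework-specific RNGs would need their own seeding):
#   import random
#   import numpy as np
#
#   def seed_everything(seed: int) -> np.random.Generator:
#       random.seed(seed)                    # Python's built-in RNG
#       np.random.seed(seed)                 # NumPy legacy global RNG
#       return np.random.default_rng(seed)   # explicit Generator to pass to new code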
- id: finance-C-238
when: When implementing return calculations in backtesting or live trading systems
action: Compute cumulative returns using the product of (1+r) values over the return window, not simple arithmetic averaging
severity: high
kind: domain_rule
modality: must
consequence: Using simple averaging for cumulative returns ignores compounding effects; a 10% gain followed by 10% loss
yields 99% of original with compounding but appears as 100% with averaging, causing significant backtest-live discrepancy
derived_from_bd_id: BD-012
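# Illustrative sketch for finance-C-238 (NumPy only; variable names are hypothetical):
#   import numpy as np
#
#   r = np.array([0.10, -0.10])          # periodic simple returns
#   cumulative = np.prod(1.0 + r) - 1.0  # compounded return over the window, here about -1%
#   naive = r.mean()                     # arithmetic mean is zero and hides the compounding loss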
- id: finance-C-239
when: When computing bear market information ratio in performance attribution
action: Filter benchmark returns to periods where benchmark < 0 before computing excess returns and standard deviation;
verify both numerator (excess return) and denominator (tracking error) use the same filtered bear market periods to
avoid inconsistent metric computation
severity: high
kind: domain_rule
modality: must
consequence: Computing excess returns over all periods but std only over bear periods produces an inconsistent information
ratio that mixes filtered and unfiltered data, leading to incorrect performance attribution and potentially misleading
drawdown analysis
derived_from_bd_id: BD-031
- id: finance-C-240
when: When computing bull market information ratio in performance attribution
action: Filter benchmark returns to periods where benchmark > 0 before computing excess returns and standard deviation;
verify both numerator and denominator use the same filtered bull market periods; verify function exists at gs_quant/timeseries/measures_reports.py:1271-1274
as evidence suggests implementation may have drifted
severity: high
kind: domain_rule
modality: must
consequence: Computing information ratio for bull markets without consistent filtering produces misleading alpha attribution,
potentially attributing defensive positioning gains to upside selection skill and causing incorrect strategy assessment
derived_from_bd_id: BD-032
- id: finance-C-241
when: When configuring ExitTradeAction with custom transaction costs in backtesting
action: Explicitly set transaction_cost_exit if it differs from transaction_cost; verify exit costs in __post_init__ match
expected broker fees for closing positions; when entry costs change, re-evaluate whether exit costs should also change
severity: medium
kind: operational_lesson
modality: should
consequence: ExitTradeAction defaults transaction_cost_exit to transaction_cost in __post_init__, silently inheriting
entry cost assumptions for exit trades; if exit fees differ from entry fees (common with wide spreads on closure), backtest
underestimates or overestimates round-trip costs by the differential amount
derived_from_bd_id: BD-051
- id: finance-C-242
when: When implementing residual calculation for epidemic curve fitting using L2 norm minimization
action: Calculate residual as solution - data (not data - solution), ensuring L2 norm minimization treats residuals consistently
with epidemic model formulation
severity: high
kind: domain_rule
modality: must
consequence: Inverting the residual formula to data - solution inverts the optimization direction, causing the curve fitting
algorithm to minimize wrong objective and produce incorrect epidemic parameters
derived_from_bd_id: BD-018
- id: finance-C-243
when: When implementing seasonal adjustment decomposition for time series analysis
action: Select additive adjustment (trend + resid) for series with stable seasonal amplitude, or multiplicative adjustment
(trend * resid) for series where seasonal variation grows proportionally with trend level
severity: high
kind: domain_rule
modality: must
consequence: Using multiplicative decomposition on additive data or vice versa causes systematic misinterpretation of
seasonal patterns, leading to incorrect inflation-adjusted metrics or economic indicators
derived_from_bd_id: BD-020
- id: finance-C-244
when: When implementing volatility calculation using exponentially weighted moving standard deviation
action: Use log returns (ln(price_t / price_t-1)) for exponential volatility calculation to ensure multiplicative returns
compound correctly over the annualization period
severity: high
kind: domain_rule
modality: must
consequence: Using arithmetic returns instead of log returns for exponential volatility introduces compounding error during
annualization, causing volatility estimates to diverge from true annualized values especially in high-volatility regimes
derived_from_bd_id: BD-022
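# Illustrative sketch for finance-C-244 (pandas ewm; the helper name, span=14 and the 252 factor are assumptions for the example):
#   import numpy as np
#   import pandas as pd
#
#   def ewm_annualized_vol(prices: pd.Series, span: int = 14) -> pd.Series:
#       log_ret = np.log(prices / prices.shift(1))            # ln(price_t / price_{t-1})
#       return log_ret.ewm(span=span).std() * np.sqrt(252)    # exponentially weighted std, annualized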
- id: finance-C-245
when: When processing time series with seasonal patterns in backtesting
action: Assume the framework handles seasonal decomposition via statsmodels seasonal_decompose — the evidence shows this
function is not found in gs_quant/timeseries/technicals.py; alternative decomposition must be implemented externally
severity: high
kind: claim_boundary
modality: must_not
consequence: Without verified seasonal decomposition capability, momentum or mean-reversion strategies that rely on seasonally
adjusted data will produce backtest results inconsistent with actual market behavior
derived_from_bd_id: BD-019
- id: finance-C-246
when: When implementing seasonal decomposition for strategy signals or indicator calculation
action: Implement seasonal decomposition externally using statsmodels seasonal_decompose or equivalent library with explicit
additive/multiplicative mode specification matching your data characteristics
severity: high
kind: domain_rule
modality: must
consequence: Strategies relying on seasonally adjusted data without proper decomposition implementation will have inconsistent
signal generation between backtest and live trading environments
derived_from_bd_id: BD-019
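# Illustrative sketch for finance-C-246 (statsmodels; `series` is a hypothetical pandas Series, and period=12 assumes
# monthly data with yearly seasonality):
#   from statsmodels.tsa.seasonal import seasonal_decompose
#
#   result = seasonal_decompose(series, model="additive", period=12)   # or model="multiplicative"
#   adjusted = result.trend + result.resid                              # additive seasonally adjusted series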
- id: finance-C-247
when: When configuring Bollinger Bands parameters for mean-reversion signal detection
action: Verify that Bollinger Bands use exactly 20-period SMA with ±2 standard deviations; parameters outside these values
will change signal frequency and mean-reversion boundaries significantly
severity: medium
kind: operational_lesson
modality: should
consequence: Different Bollinger Band parameters produce materially different mean-reversion signals; using 30-period
or ±1.5σ changes support/resistance boundaries, causing strategies to misidentify entry/exit points and systematically
trade at wrong prices
derived_from_bd_id: BD-021
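# Illustrative sketch for finance-C-247 (pandas; the function name is hypothetical, and using pandas' default sample
# std for the band width is an assumption):
#   import pandas as pd
#
#   def bollinger_bands(close: pd.Series, window: int = 20, width: float = 2.0) -> pd.DataFrame:
#       mid = close.rolling(window).mean()    # 20-period SMA
#       sd = close.rolling(window).std()      # rolling standard deviation
#       return pd.DataFrame({"lower": mid - width * sd, "mid": mid, "upper": mid + width * sd})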
- id: finance-C-248
when: When implementing bear market volatility regime detection in risk calculations
action: Change the conditional standard deviation logic that returns NaN for non-bear periods to a continuous volatility
calculation without explicit regime filtering
severity: high
kind: domain_rule
modality: must_not
consequence: Modifying the conditional std to return continuous values removes regime segmentation, causing bear market
volatility estimates to be diluted by bull market data and risk models to underestimate downside risk during market
crashes
derived_from_bd_id: BD-033
- id: finance-C-249
when: When implementing or refactoring the data preprocessing pipeline that feeds into Bollinger Bands calculations
action: Verify Winsorization (capping at ±2.5σ) is applied AFTER Bollinger Band signal generation, not before — if winsorization
must precede Bollinger calculation, document that ±2σ band signals may be suppressed for truly extreme values that were
clipped
severity: high
kind: architecture_guardrail
modality: must
consequence: Applying winsorization before Bollinger Bands causes truly extreme deviations to be clipped at ±2.5σ, suppressing
±2σ band breakout signals that indicate genuine regime shifts, leading to delayed or missed trades during market turning
points
derived_from_bd_id: BD-061
- id: finance-C-250
when: When configuring RebalanceAction instruments in backtest setup
action: Resolve all priceable instruments before passing them to RebalanceAction; verify priceable.unresolved is False
for each instrument in the rebalance set
severity: high
kind: operational_lesson
modality: must
consequence: Unresolved priceable instruments cause ValueError during __post_init__ validation, breaking backtest initialization;
the error occurs late in setup after data fetching, making diagnosis difficult and wasting computation time
derived_from_bd_id: BD-054
- id: finance-C-251
when: When configuring missing data handling in backtesting pipelines that use rolling product calculations for cumulative
returns
action: Explicitly set missing_data_strategy='interpolate' or equivalent for rolling product calculations to prevent silent
NaN propagation; do not rely on GenericDataSource fail-fast default as rolling products use nanprod which propagates
NaN without raising errors
severity: high
kind: domain_rule
modality: must
consequence: Inconsistent missing data handling causes some data gaps to raise immediate RuntimeError while others produce
NaN values that silently propagate through rolling product calculations, leading to compounding interpolation errors
and strategies that perform differently in live trading than in backtesting
derived_from_bd_id: BD-066
- id: finance-C-252
when: When configuring the default time range for PlotRunner plots and historical data queries
action: Verify that the default time range of '-1y' to '-1b' (1 year to previous business day) matches the intended analysis
window; adjust RelativeDate parameters explicitly if different context periods are required
severity: medium
kind: operational_lesson
modality: should
consequence: Using the default 1-year lookback may include stale historical data for strategies requiring shorter-term
analysis, or miss critical regime changes for long-term strategies; backtest results will be misaligned with actual
strategy intent
derived_from_bd_id: BD-GAP-007
- id: finance-C-253
when: When using ContentApi.get_contents with default parameters
action: 'Verify that the default ordering (order_by={''direction'': OrderBy.DESC, ''field'': ''createdTime''}) and limit=10
matches the expected content retrieval pattern; explicitly specify parameters if ASC ordering or different limits are
needed'
severity: low
kind: operational_lesson
modality: should
consequence: Default DESC ordering returns newest content first; using ASC by mistake would return oldest content, causing
data retrieval patterns that don't match user expectations and potential analysis gaps
derived_from_bd_id: BD-GAP-011
- id: finance-C-254
when: When implementing or modifying EMA crossover signal generation logic
action: Use the direct difference of two EMAs (fast minus slow) as the crossover signal; do NOT introduce a separate signal
line smoothing (e.g., 9-period EMA of the difference) unless explicitly required by the strategy specification
severity: high
kind: domain_rule
modality: must
consequence: Adding signal line smoothing changes crossover timing and reduces sensitivity; strategies calibrated to the
direct difference EMA will produce different entry/exit signals, causing backtest results to diverge from live trading
expectations
derived_from_bd_id: BD-024
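# Illustrative sketch for finance-C-254 (pandas; the function name is hypothetical, and fast=12 / slow=26 are
# placeholder spans, not values from the source):
#   import pandas as pd
#
#   def ema_crossover_signal(close: pd.Series, fast: int = 12, slow: int = 26) -> pd.Series:
#       # direct difference of the two EMAs, with no signal-line smoothing applied
#       return close.ewm(span=fast, adjust=False).mean() - close.ewm(span=slow, adjust=False).mean()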
- id: finance-C-255
when: When implementing autoregressive model estimation for time series forecasting
action: Use OLS (ordinary least squares) via numpy lstsq for AR model estimation — do not substitute Yule-Walker, Burg
algorithm, or MLE without validating that the alternative provides better numerical stability for high lag orders
severity: high
kind: domain_rule
modality: must
consequence: Replacing OLS with Yule-Walker may introduce biased covariance estimates for high lag orders; Burg algorithm
optimizes forward prediction error rather than one-step-ahead accuracy; MLE requires distributional assumptions that
OLS avoids
derived_from_bd_id: BD-025
- id: finance-C-256
when: When implementing stationarity testing for financial time series
action: Use Augmented Dickey-Fuller test with AIC-based automatic lag selection — do not substitute Phillips-Perron or
KPSS without understanding the different null hypotheses and test power characteristics
severity: high
kind: domain_rule
modality: must
consequence: 'ADF tests for unit root (null: non-stationary) while KPSS tests for stationarity (null: stationary) — using
the wrong test direction produces opposite conclusions about mean-reversion validity'
derived_from_bd_id: BD-026
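# Illustrative sketch for finance-C-256 (statsmodels; the helper name and the 0.05 threshold are assumptions):
#   from statsmodels.tsa.stattools import adfuller
#
#   def is_stationary(series, alpha=0.05):
#       stat, pvalue, *_ = adfuller(series, autolag="AIC")   # null hypothesis is a unit root (non-stationary)
#       return pvalue < alpha                                 # rejecting the null suggests stationarity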
- id: finance-C-257
when: When using the framework's default frequency-to-period mapping for annualization
action: Verify that the frequency mapping (daily=365, business=5, weekly=52, monthly=12) matches your strategy's trading
calendar — equity strategies typically use 252 trading days instead of 365 calendar days; using wrong annualization
factors distorts Sharpe ratio and volatility estimates
severity: medium
kind: operational_lesson
modality: should
consequence: Annualizing daily volatility with 365 instead of 252 scales it by sqrt(365/252), roughly 1.20, so annualized volatility
is overstated by about 20%; Sharpe ratios annualized with the same factor are overstated by the same proportion
derived_from_bd_id: BD-023
- id: finance-C-258
when: When initializing GARCH volatility models using the variance targeting estimator
action: Verify the variance targeting estimator scales realized variance by the ratio of unconditional to realized variance
— do not replace with simple EWMA or QMLE without understanding the stabilization effect on GARCH parameter estimation
severity: high
kind: domain_rule
modality: must
consequence: Without variance targeting, GARCH(1,1) initialization may produce unstable omega parameters, leading to explosive
variance forecasts that cause position sizing algorithms to undertrade during high-volatility regimes
derived_from_bd_id: BD-028
- id: finance-C-260
when: When implementing or refactoring cointegration testing logic for pairs trading and spread analysis
action: 'Use Engle-Granger two-step cointegration test: first estimate long-run equilibrium via OLS regression including
constant (trend=''c''), then test residual stationarity using the t-statistic on residuals. Do not substitute Johansen
test, Phillips-Ouliaris, or DOLS without explicit business justification'
severity: high
kind: domain_rule
modality: must
consequence: Using an incorrect cointegration test (e.g., Johansen) for pairs trading causes the algorithm to identify
non-mean-reverting pairs as valid trades, leading to strategies with unpredictable drawdowns and significant capital
loss
derived_from_bd_id: BD-027
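# Illustrative sketch for finance-C-260 (statsmodels coint runs an augmented Engle-Granger two-step test; y and x are
# hypothetical price series):
#   from statsmodels.tsa.stattools import coint
#
#   t_stat, pvalue, crit_values = coint(y, x, trend="c")   # null hypothesis is no cointegration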
- id: finance-C-261
when: When calculating Information Ratio for performance benchmarking and manager selection
action: Calculate Information Ratio as mean(excess) / std(excess) using population standard deviation (np.std with ddof=0).
Do not use sample std (ddof=1) or alternative formulations without explicit business justification
severity: high
kind: domain_rule
modality: must
consequence: Using sample std (ddof=1) instead of population std (ddof=0) inflates the denominator, causing systematically
lower IR estimates that lead to rejection of outperforming managers and flawed benchmark comparisons
derived_from_bd_id: BD-030
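# Illustrative sketch for finance-C-261 (NumPy; the function and argument names are hypothetical):
#   import numpy as np
#
#   def information_ratio(returns, benchmark):
#       excess = np.asarray(returns) - np.asarray(benchmark)
#       return excess.mean() / np.std(excess, ddof=0)   # population std, per the rule above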
- id: finance-C-262
when: When aggregating factor-attributed PnL in multi-factor performance attribution models
action: Sum factor contributions using np.sum along axis=1 within each time period to ensure factor returns sum to total
return and maintain attribution consistency. Do not change axis parameter or aggregate differently without explicit
business justification
severity: high
kind: domain_rule
modality: must
consequence: Changing the axis parameter in factor PnL summation breaks attribution consistency, causing individual factor
contributions to not sum to total return and producing misleading performance decomposition that leads to incorrect
investment decisions
derived_from_bd_id: BD-036
- id: finance-C-263
when: When implementing or refactoring calculation execution order in the backtesting engine
action: 'Enforce CalcType enum execution order: simple→semi_path_dependent→path_dependent. Calculations tagged with higher
complexity type must not execute before lower complexity calculations complete. Do not reorder calculations or bypass
CalcType dependency checks'
severity: high
kind: domain_rule
modality: must
consequence: Violating CalcType execution order causes calculations to run before their dependencies are satisfied, producing
incorrect risk results with systematic errors in Greeks, scenario analysis, and portfolio metrics that lead to bad trading
decisions
derived_from_bd_id: BD-040
- id: finance-C-264
when: When implementing or refactoring Action-Handler mapping in the backtesting engine
action: Use ActionHandler factory pattern (get_action_handler) to map Actions to Handlers. Ensure get_action_handler()
returns a handler type compatible with the action's execution context. Every Action subclass must register with the
factory; do not use direct Action-Handler coupling
severity: high
kind: architecture_guardrail
modality: must
consequence: Bypassing the factory pattern and using direct Action-Handler coupling causes runtime type mismatches, breaks
extensibility, and leads to ActionNotHandled exceptions that halt backtesting execution mid-process
derived_from_bd_id: BD-041
- id: finance-C-265
when: When implementing or refactoring cost model aggregation logic in backtesting
action: Replace TransactionCostModel algebraic operator overloading (+, -, *, /) with method-based aggregation (add_costs,
scale_costs) — maintain the existing operator interface for cost combination
severity: high
kind: architecture_guardrail
modality: must_not
consequence: Switching from operator overloading to method calls breaks the natural composition API used by FixedCostModel,
ScaledCostModel, and AggregateCostModel, causing runtime AttributeError or breaking portfolio-level cost calculations
derived_from_bd_id: BD-GAP-016
- id: finance-C-266
when: When constructing FX options (FXOption, FXBinary, FXMultiCrossBinary) for pricing or backtesting
action: Always explicitly set premium=0 to get the actual option value instead of a zeroed-out DollarPrice
severity: high
kind: domain_rule
modality: must
consequence: Instrument resolution automatically sets a premium that makes DollarPrice zero, producing meaningless P&L
in backtests and confusing results in pricing. The user will always get 0 for the option value.
stage_ids:
- backtesting
- id: finance-C-267
when: When using HistoricalPricingContext to resolve instruments across multiple dates
action: Pass in_place=False to receive a new resolved copy — in_place=True is not supported for historical resolution
severity: high
kind: architecture_guardrail
modality: must
consequence: 'Resolution across multiple dates returns a dict of {date: instrument} pairs, which cannot be written in-place
to a single instrument object.'
- id: finance-C-268
when: When constructing FXMultiCrossBinaryLeg objects used within FXMultiCrossBinary
action: Use OptionType.Binary_Call or OptionType.Binary_Put — do not use OptionType.Call or OptionType.Put which are for
FXBinary
severity: high
kind: domain_rule
modality: must
consequence: Using the wrong OptionType values will cause an error or incorrect option type assignment because FXMultiCrossBinaryLeg
uses a different enum namespace.
stage_ids:
- backtesting
- id: finance-C-269
when: When accessing equities and listed instrument data via the Dataset API
action: Use TREOD (Thomson Reuters End-of-Day) dataset with bbid as the symbol dimension for equity indices, ETFs, futures,
and most listed instruments
severity: medium
kind: resource_boundary
modality: must
consequence: Using the wrong dataset ID or dimension name will silently return an empty DataFrame with no results.
- id: finance-C-270
when: When querying datasets with many assets or long date ranges
action: Break large queries into smaller time batches (e.g. 30-day windows) to avoid API timeouts or rejections
severity: medium
kind: resource_boundary
modality: should
consequence: Very wide queries (many assets × long date range) will time out or be rejected by the API, causing the data
retrieval to fail entirely.
- id: finance-C-271
when: When querying a Dataset with symbol dimensions
action: Pass the correct dimension name (bbid, assetId, ticker, ric) as a kwarg matching the specific dataset's schema
from the Marquee catalog — passing the wrong name silently returns an empty DataFrame
severity: high
kind: operational_lesson
modality: must
consequence: Each dataset has its own symbol dimension names. Passing the wrong dimension name silently returns an empty
DataFrame with no error, leading to downstream data corruption.
- id: finance-C-272
when: When constructing a cross-currency swap with principal_exchange
action: Do not include principal exchanges with dates in the past relative to the PricingContext — they are not ignored
by the Price measure and will corrupt pricing results
severity: medium
kind: operational_lesson
modality: must_not
consequence: Past principal exchanges are not ignored by the Price measure and will affect the pricing calculation, producing
incorrect valuations.
- id: finance-C-274
when: When adding cross-currency swap notional parameters
action: Do not set receiver_amount on IRXccySwap (MTM) — the receiver notional is computed automatically each period;
receiver_amount is only for IRXccySwapFltFlt (non-MTM)
severity: high
kind: domain_rule
modality: must
consequence: IRXccySwap does not have a receiver_amount field. Attempting to set it will raise an error or be silently
ignored.
- id: finance-C-275
when: When building a backtesting strategy with multiple triggers
action: Place entry triggers before hedge triggers in the strategy trigger list — triggers are evaluated in order and
order matters for correct portfolio state
severity: high
kind: architecture_guardrail
modality: must
consequence: If hedge triggers fire before entry triggers, the hedge will be applied to an empty or incorrect portfolio,
producing wrong P&L and risk calculations.
stage_ids:
- backtesting
- id: finance-C-276
when: When building a single trigger with multiple actions
action: Order exit actions before add actions within a single trigger — actions execute sequentially in the order they
are provided
severity: high
kind: architecture_guardrail
modality: must
consequence: If add actions come before exit actions, new positions will be added before old ones are closed, resulting
in incorrect position sizing and P&L attribution.
stage_ids:
- backtesting
- id: finance-C-278
when: When computing total strategy P&L from backtest results
action: Calculate Total as Price + Cumulative Cash + Transaction Costs — these three components sum to the complete strategy
P&L
severity: high
kind: domain_rule
modality: must
consequence: Omitting any of the three components (Price, Cumulative Cash, or Transaction Costs) will produce an incorrect
total P&L figure, leading to wrong performance assessment.
stage_ids:
- backtesting
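# Worked example for finance-C-278 (illustrative numbers only, not taken from a real backtest).
# Assumption: transaction costs are already stored as negative values, so they are added, not subtracted.
#   Price             =  120.0
#   Cumulative Cash   = -100.0
#   Transaction Costs =   -1.5
#   Total             =  120.0 + (-100.0) + (-1.5) = 18.5
# Dropping the Transaction Costs term would report 20.0 and overstate P&L by 1.5.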
- id: finance-C-279
when: When using risk-based transaction cost scaling with ScaledTransactionModel
action: Take the absolute value of the scaling_type measure when computing risk-based costs, using cost = |scaling_type
value| × scaling_level
severity: medium
kind: domain_rule
modality: must
consequence: Risk measures such as Price can be negative for short positions. Without the absolute value, a negative
measure would generate a negative transaction cost (a rebate), incorrectly inflating returns.
stage_ids:
- backtesting
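# Worked example for finance-C-279 (illustrative numbers only; the scaling_level of 0.0001 is hypothetical).
#   scaling_type value (e.g. Price of a short position) = -250000
#   cost with absolute value    = |-250000| × 0.0001 = 25.0
#   cost without absolute value =  -250000  × 0.0001 = -25.0 (a spurious rebate that inflates returns)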
- id: finance-C-280
when: When retrieving historical data state via the Dataset API
action: Use the as_of parameter to get data as it existed at a specific point in time — without as_of, queries return
current data which may include future revisions
severity: high
kind: domain_rule
modality: must
consequence: Without point-in-time semantics, backtests can inadvertently incorporate data that was not available at the
historical date, creating look-ahead bias in backtest results.
- id: finance-C-281
when: When determining the evaluation frequency for a backtest
action: Treat trigger frequency and backtest evaluation frequency as independent parameters; set frequency='1b'
to evaluate a monthly-roll strategy daily and capture intra-month P&L
severity: medium
kind: operational_lesson
modality: must
consequence: Setting evaluation frequency equal to trigger frequency (e.g. '1m' evaluation with '1m' trigger) will miss
intra-month price movements and produce inaccurate daily P&L attribution.
stage_ids:
- backtesting
- id: finance-C-282
when: When a Dataset query unexpectedly returns an empty DataFrame during a backtest
action: Verify session entitlements and confirm the GsSession was initialized with the required scopes (e.g. read_product_data);
empty results without an error often indicate missing access rights
severity: medium
kind: operational_lesson
modality: should
consequence: Missing entitlements cause silent failures where get_data returns an empty DataFrame with no error message,
making it hard to diagnose access issues.
- id: finance-C-283
when: When accessing datasets indexed by datetime (intraday) vs date (EOD)
action: Pass datetime.datetime objects for intraday datasets and datetime.date objects for EOD datasets — mixing these
types will cause query failures or empty results
severity: medium
kind: operational_lesson
modality: must
consequence: Passing the wrong date/time type for the dataset index will silently return empty results or raise a type
error.
- id: finance-C-284
when: When resolving instruments under a specific historical pricing date
action: Use PricingContext(pricing_date=...) context manager with the desired date — the pricing service computes missing
parameters using the market data as of that date
severity: high
kind: architecture_guardrail
modality: must
consequence: Resolving without an explicit PricingContext will use the current date, which introduces look-ahead bias
in historical pricing scenarios.
stage_ids:
- backtesting
- id: finance-C-285
when: When choosing trade_duration for FX option roll strategies
action: Use 'expiration_date' as trade_duration to hold options until their expiration — using a tenor that does not match
the option's actual expiry will cause premature or delayed unwinding
severity: medium
kind: operational_lesson
modality: must
consequence: Setting a fixed tenor trade_duration on an option with a different expiration will either exit before expiry
(losing remaining option value) or hold past expiry (causing errors on non-existent instruments).
stage_ids:
- backtesting
output_validator:
assertions:
- id: OV-01
check_predicate: all(p in inspect.getsource(zvt.factors.algorithm.macd) for p in ['slow=26', 'fast=12', 'n=9'])
failure_message: 'FATAL: MACD params drifted from (fast=12, slow=26, n=9) — SL-08 violation, non-reproducible signals'
business_meaning: Standard MACD parameters are a semantic lock; drift makes results incomparable with industry-standard
indicators and non-reproducible.
source_ids:
- SL-08
- BD-036
- id: OV-02
check_predicate: result.get('total_trades', 0) > 0 or result.get('explicit_zero_trade_ack') is True
failure_message: Zero trades executed — likely missing pre-fetched data (see PC-02) or over-restrictive filters
business_meaning: A backtest with zero trades is not a valid result; either data is missing or the strategy never triggered.
Structural non-emptiness check is insufficient — we need business confirmation.
source_ids:
- SL-01
- finance-C-073
- id: OV-03
check_predicate: result.get('annual_return') is None or abs(float(result['annual_return'])) <= 5.0
failure_message: 'FATAL: |annual_return| > 500% — likely look-ahead bias or data error'
business_meaning: Annual returns exceeding 500% are physically implausible for A-share strategies; indicates look-ahead
bias or corrupt data.
source_ids: []
- id: OV-04
check_predicate: result.get('holding_change_pct') is None or abs(float(result['holding_change_pct'])) <= 1.0
failure_message: 'FATAL: |holding_change_pct| > 100% — physically impossible'
business_meaning: Holding change percentage cannot exceed 100%; violation indicates position accounting error.
source_ids:
- BD-029
- id: OV-05
check_predicate: result.get('max_drawdown') is None or abs(float(result['max_drawdown'])) <= 1.0
failure_message: 'FATAL: |max_drawdown| > 100% — impossible for non-leveraged account'
business_meaning: Maximum drawdown cannot exceed 100% without leverage; violation indicates calculation error or look-ahead
bias.
source_ids: []
- id: OV-06
check_predicate: not (hasattr(result, 'trade_log') and result.trade_log and any(result.trade_log[i].action == 'buy' and
result.trade_log[i+1].action == 'sell' and result.trade_log[i].timestamp == result.trade_log[i+1].timestamp
for i in range(len(result.trade_log)-1)))
failure_message: 'FATAL: buy-before-sell detected in same cycle — SL-01 violation, creates implicit leverage'
business_meaning: SL-01 requires sell() before buy() in each cycle; violation means available_long was not updated before
buying, risking duplicate positions.
source_ids:
- SL-01
scaffold:
validate_py_path: '{workspace}/validate.py'
tail_block: "# === DO NOT MODIFY BELOW THIS LINE ===\nif __name__ == \"__main__\":\n result = run_backtest()\n from\
\ validate import enforce_validation\n enforce_validation(result, output_path=\"{workspace}/result.csv\")\n# ===\
\ END DO NOT MODIFY ==="
enforcement_protocol: 1. Never edit validate.py. 2. Never delete the DO NOT MODIFY tail block from the main script. 3. Never
wrap enforce_validation() in try/except. 4. Never rewrite result write logic — it MUST go through enforce_validation.
5. If validate.py raises ImportError, fix the dependency, do not remove the call.
acceptance:
hard_gates:
- id: G1
check: '{workspace}/result.csv exists AND file size > 0'
on_fail: Strategy did not produce output; check run_backtest() return value and enforce_validation() call
- id: G2
check: '{workspace}/result.csv.validation_passed marker file exists'
on_fail: Validation did not complete; review validate.py output and fix assertion failures
- id: G3
check: 'Main script contains literal: from validate import enforce_validation'
on_fail: Validation chain stripped; re-add the import in the DO NOT MODIFY block
- id: G4
check: 'Main script contains literal: # === DO NOT MODIFY BELOW THIS LINE ==='
on_fail: Validation fence removed; regenerate DO NOT MODIFY tail block
- id: G5
check: 'result.csv has at least 1 row: pandas.read_csv(result.csv).shape[0] >= 1'
on_fail: Empty result; check if trade_log is non-empty and factors generated signals. Confirm PC-02 (k-data exists) passed.
- id: G6
check: 'If MACD strategy: source contains ''slow=26'' AND ''fast=12'' AND ''n=9'' in algorithm call'
on_fail: MACD params drifted from SL-08 lock; restore standard (12, 26, 9)
- id: G7
check: 'For data pipeline tasks: result.csv contains ''entity_id'' and ''timestamp'' fields'
on_fail: Missing required columns; check the Mixin.query_data return schema and call reset_index() on the MultiIndex DataFrame before
writing
- id: G8
check: 'OV-03 passes: abs(annual_return) <= 5.0 (500%)'
on_fail: Physical plausibility check failed; investigate look-ahead bias or data corruption in input kdata
soft_gates:
- id: SG-01
rubric: 'Strategy narrative consistency: user intent aligns with generated strategy.py logic. dim_a: signal direction
(buy/sell) matches intent [1-5, pass>=4]; dim_b: frequency (daily/intraday) aligns [1-5, pass>=4]; dim_c: risk controls
match user intent [1-5, pass>=4].'
- id: SG-02
rubric: 'Factor combination quality. dim_a: no highly correlated factor duplication [1-5, pass>=4]; dim_b: multi-period
alignment correct [1-5, pass>=4]; dim_c: liquidity filter present for A-share [1-5, pass>=4].'
- id: SG-03
rubric: 'Data source selection appropriateness. dim_a: coverage sufficient for target entities [1-5, pass>=4]; dim_b:
provider latency acceptable for strategy frequency [1-5, pass>=4]; dim_c: no unauthorized provider used without credentials
[1-5, pass>=4].'
skill_crystallization:
trigger: all_hard_gates_passed AND user_opt_out_skill_saving != true
output_path_template: '{workspace}/../skills/{slug}.skill'
slug_template: '{blueprint_id_short}-{uc_id_lower}'
captured_fields:
- name
- intent_keywords
- entry_point_script
- validate_script
- fatal_constraints
- spec_locks
- preconditions
- install_recipes
- human_summary_translated
action: 'After all Hard Gates PASS, resolve slug via slug_template using the executed UC, then write the .skill YAML file
at output_path_template. Notify user in their detected locale: ''Skill saved as {slug}.skill — next time say one of {sample_triggers}
from the matched UC to invoke directly.'''
violation_signal: All hard gates passed but no .skill file exists at expected path
skill_file_schema:
name: finance-bp-095 / UC-101
version: v5.3
intent_keywords: []
entry_point: run_backtest
fatal_guards:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-10
- SL-11
- SL-12
spec_locks:
- SL-01
- SL-02
- SL-03
- SL-04
- SL-05
- SL-06
- SL-07
- SL-08
- SL-09
- SL-10
- SL-11
- SL-12
preconditions:
- PC-01
- PC-02
- PC-03
- PC-04
post_install_notice:
trigger: skill_installation_complete
message_template:
positioning: I help you build quant strategies on A-share with ZVT — from data fetch to backtest, one flow.
capability_catalog:
group_strategy:
source: auto_grouped
strategy_reason: no candidate field had 2-7 distinct values; all capabilities collapsed into single group
groups:
- group_id: all
name: All Capabilities
description: ''
emoji: 📦
uc_count: 0
ucs: []
call_to_action: Tell me which one you want to try.
featured_entries:
- uc_id: UC-100
beginner_prompt: Try capability UC-100
auto_selected: true
- uc_id: UC-101
beginner_prompt: Try capability UC-101
auto_selected: true
- uc_id: UC-102
beginner_prompt: Try capability UC-102
auto_selected: true
more_info_hint: Ask me 'what else can you do?' to see the full capability list.
locale_rendering:
instruction: On skill_installation_complete, translate ALL user-facing strings (positioning + capability_catalog.groups[].name
+ capability_catalog.groups[].description + capability_catalog.groups[].ucs[].short_description + call_to_action + featured_entries[].beginner_prompt
+ more_info_hint) into detected user locale per locale_contract. Preserve UC-IDs, group_id, emoji, and sample_triggers
verbatim.
preserve_verbatim:
- UC-IDs
- group_id
- emoji
- sample_triggers
- technical_class_names
enforcement:
action: 'Host agent MUST send composed message to user as the FIRST user-facing response after skill_installation_complete
event. Message MUST contain: positioning, capability_catalog (rendered as markdown tables per group), 3 featured_entries,
call_to_action, and more_info_hint.'
violation_code: PIN-01
violation_signal: First user-facing message post-install does not contain the full capability_catalog (all UCs grouped)
OR skips featured_entries OR skips call_to_action.
human_summary:
persona: Doraemon
what_i_can_do:
tagline: 'I help you build quant strategies on A-share with ZVT, from data fetch to backtest, in one flow. Just tell me
what you want; I''ll write the code, so you don''t have to dig through the docs. (Heads up: ZVT natively supports A-share, HK, and
crypto. US stocks, e.g. stockus_nasdaq_AAPL, are half-baked; don''t bother for serious work.)'
use_cases:
- A-share MACD daily golden-cross backtest with hfq price adjustment from eastmoney
- 'End-to-end ZVT pipeline: FinanceRecorder + GoodCompanyFactor + StockTrader'
- Multi-factor strategy with TargetSelector (AND mode) combining MACD + volume breakout
- Index composition data collection (SZ1000, SZ2000) with EM recorder
- Institutional fund holdings tracker via joinquant_fund_runner pattern
- Custom Transformer + Accumulator factor with per-entity rolling state
- Bollinger Band mean-reversion factor with BollTransformer (window=20, window_dev=2)
what_i_auto_fetch:
- ZVT stage pipeline structure (data_collection → visualization) from LATEST.yaml
- Semantic locks (SL-01 through SL-12) — especially sell-before-buy ordering and MACD params
- Fatal constraints (finance-C-*) relevant to your target strategy type
- 'Default parameters: MACD(12,26,9), hfq adjustment, buy_cost=0.001, base_capital=1M CNY'
- Entity ID format (stock_sh_600000) and DataFrame MultiIndex convention
- Provider-specific recorder class names and required class attributes
what_i_ask_you:
- 'Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage
is thin)'
- 'Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare,
or qmt (broker)?'
- 'Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?'
- 'Time range: start_timestamp and end_timestamp for backtest period'
- 'Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?'
locale_rendering:
instruction: On first user contact, translate all fields above into detected user locale while preserving Doraemon persona
(direct, frank, mildly snarky, knows limits).
preserve_verbatim:
- BD-IDs
- SL-IDs
- UC-IDs
- finance-C-IDs
- class_names
- function_names
- file_paths
- numeric_thresholds