From tradermonty-claude-trading-skills
Guides systematic backtesting of trading strategies with robustness testing, parameter sensitivity analysis, slippage modeling, bias prevention, and result interpretation. Use for strategy development and validation.
npx claudepluginhub joshuarweaver/cascade-business-ops --plugin tradermonty-claude-trading-skills
This skill uses the workspace's default tool permissions.
Systematic approach to backtesting trading strategies based on professional methodology that prioritizes robustness over optimistic results.
Goal: Find strategies that "break the least", not strategies that "profit the most" on paper.
Principle: Add friction, stress test assumptions, and see what survives. If a strategy holds up under pessimistic conditions, it's more likely to work in live trading.
Use this skill when:
Define the edge in one sentence.
Example: "Stocks that gap up >3% on earnings and pull back to previous day's close within first hour provide mean-reversion opportunity."
If you can't articulate the edge clearly, don't proceed to testing.
Define with complete specificity:
Critical: No subjective judgment allowed. Every decision must be rule-based and unambiguous.
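To make "rule-based and unambiguous" concrete, the gap-up hypothesis above can be encoded as explicit boolean conditions. This is a minimal sketch; the field names (open, prev_close, is_earnings_day, low_first_hour) are illustrative assumptions, not a prescribed data schema.

```python
# Hypothetical sketch: the earnings gap-up hypothesis expressed as
# unambiguous rules. Every threshold is explicit; nothing is left to
# subjective judgment.
def entry_signal(bar: dict) -> bool:
    gap_pct = (bar["open"] - bar["prev_close"]) / bar["prev_close"] * 100
    gapped_up = bar["is_earnings_day"] and gap_pct > 3.0
    pulled_back = bar["low_first_hour"] <= bar["prev_close"]
    return gapped_up and pulled_back

# A bar that gaps up 5% on earnings and touches the prior close qualifies.
bar = {"open": 105.0, "prev_close": 100.0,
       "is_earnings_day": True, "low_first_hour": 99.5}
print(entry_signal(bar))  # True
```

If any input to a rule like this cannot be computed mechanically from data, the hypothesis is not yet testable.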
Test over:
Examine initial results for basic viability. If fundamentally broken, iterate on hypothesis.
This is where 80% of testing time should be spent.
Parameter sensitivity:
Execution friction:
Time robustness:
Sample size:
Walk-forward analysis:
Warning signs:
Questions to answer:
Decision criteria:
Use the evaluation script for a structured, quantitative assessment:
python3 skills/backtest-expert/scripts/evaluate_backtest.py \
--total-trades 150 \
--win-rate 62 \
--avg-win-pct 1.8 \
--avg-loss-pct 1.2 \
--max-drawdown-pct 15 \
--years-tested 8 \
--num-parameters 3 \
--slippage-tested \
--output-dir reports/
The script scores across 5 dimensions (Sample Size, Expectancy, Risk Management, Robustness, Execution Realism), detects red flags, and outputs a Deploy/Refine/Abandon verdict.
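For intuition about the Expectancy dimension, the per-trade expectancy implied by the example inputs above can be checked by hand. This is a standard formula, not the script's internal scoring logic:

```python
# Expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss,
# using the same numbers passed to the evaluation script above.
win_rate = 0.62
avg_win_pct = 1.8
avg_loss_pct = 1.2
expectancy = win_rate * avg_win_pct - (1 - win_rate) * avg_loss_pct
print(f"{expectancy:.3f}% per trade")  # 0.660% per trade
```

A positive expectancy is necessary but not sufficient; it must survive the friction and robustness tests below.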
Add friction everywhere:
Rationale: Strategies that survive pessimistic assumptions often outperform in live trading.
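One simple way to add friction is to degrade every gross trade return by pessimistic round-trip costs before judging the edge. The slippage and commission figures below are illustrative assumptions, not recommendations:

```python
# Sketch: apply pessimistic round-trip friction to gross per-trade returns.
slippage_pct = 0.10      # per side, assumed
commission_pct = 0.05    # per side, assumed
friction = 2 * (slippage_pct + commission_pct)  # round trip

gross_returns = [1.8, -1.2, 2.1, -0.9, 1.5]  # % per trade, hypothetical
net_returns = [r - friction for r in gross_returns]
print(sum(net_returns) / len(net_returns))  # mean net return per trade
```

If the edge disappears under assumptions like these, it was likely an artifact of frictionless fills.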
Look for parameter ranges where performance is stable, not optimal values that create performance spikes.
Good: Strategy profitable with stop loss anywhere from 1.5% to 3.0%
Bad: Strategy only works with stop loss at exactly 2.13%
Stable performance indicates genuine edge; narrow optima suggest curve-fitting.
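A parameter sweep makes this check mechanical: sweep the stop loss, then ask whether a wide contiguous range stays near the best result. The backtest() function below is a smooth placeholder standing in for real backtest runs:

```python
# Sketch: check for a performance plateau rather than a single optimum.
def backtest(stop_loss_pct: float) -> float:
    # Placeholder profit curve; replace with your actual backtest runner.
    return 10.0 - abs(stop_loss_pct - 2.25) ** 2

results = {sl / 10: backtest(sl / 10) for sl in range(10, 41, 5)}
best = max(results.values())
stable = [sl for sl, profit in results.items() if profit >= 0.8 * best]
print(stable)  # a wide contiguous range suggests robustness;
               # a single point suggests curve-fitting
```

The 80%-of-best threshold is an assumed cutoff; the principle is that neighbors of the optimum should still be profitable.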
Wrong approach: Study hand-picked "market leaders" that worked
Right approach: Test every stock that met criteria, including those that failed
Selective examples create survivorship bias and overestimate strategy quality.
Intuition: Useful for generating hypotheses
Validation: Must be purely data-driven
Never let attachment to an idea influence interpretation of test results.
Recognize these patterns early to save time:
See references/failed_tests.md for detailed examples and diagnostic framework.
reports/backtest_eval_<timestamp>.json — structured evaluation with per-dimension scores, red flags, and verdict
reports/backtest_eval_<timestamp>.md — human-readable report with dimension table, key metrics, and red flag details
File: references/methodology.md
When to read: For detailed guidance on specific testing techniques.
Contents:
File: references/failed_tests.md
When to read: When strategy fails tests, or learning from past mistakes.
Contents:
Time allocation: Spend 20% generating ideas, 80% trying to break them.
Context-free requirement: If strategy requires "perfect context" to work, it's not robust enough for systematic trading.
Red flag: If backtest results look too good (>90% win rate, minimal drawdowns, perfect timing), audit carefully for look-ahead bias or data issues.
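A common source of too-good results is acting on a signal in the same bar that produced it. The sketch below shows the bug and the standard fix (shifting signals forward one bar); the data is hypothetical:

```python
# Sketch: look-ahead bug and fix. A signal computed from today's close
# cannot be traded at today's close.
closes = [100, 102, 101, 105, 104]
signals = [c > 101 for c in closes]   # known only once each bar closes

# Wrong: acting on signals[i] at bar i uses information not yet available.
# Right: act at bar i+1 (e.g., the next open) on the signal from bar i.
tradable = [False] + signals[:-1]
print(tradable)  # [False, False, True, False, True]
```

Re-running a suspiciously good backtest with signals shifted one bar is a quick diagnostic for this class of bias.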
Tool limitations: Understand your backtesting platform's quirks (interpolation methods, handling of low liquidity, data alignment issues).
Statistical significance: Small edges require large sample sizes to prove; even a 5% edge per trade needs 100+ trades to distinguish from luck.
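A rough rule of thumb for the sample size question: with per-trade edge e and per-trade return volatility s, the t-statistic after n trades is about e * sqrt(n) / s, and t > 2 is a common bar. The numbers below are assumed for illustration:

```python
# Sketch: trades needed for a rough t-statistic of 2 on the mean
# per-trade return, given an assumed edge and per-trade volatility.
import math

def trades_needed(edge: float, std: float, t_target: float = 2.0) -> int:
    return math.ceil((t_target * std / edge) ** 2)

# An edge of 0.5% per trade with 2.5% per-trade volatility (assumed numbers):
print(trades_needed(0.5, 2.5))  # 100 trades to reach t of roughly 2
```

The smaller the edge relative to trade-to-trade noise, the quadratically larger the sample needed, which is why thin edges demand long histories.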
This skill focuses on systematic/quantitative backtesting where:
Discretionary traders study differently—this skill may not apply to setups requiring subjective judgment.