
How to know if your trading backtest is overfit

A flawless equity curve is a warning sign, not a green light. Overfit strategies look brilliant in the past and fall apart live. Here is how to tell the difference — with Monte Carlo, not hope.

Overfitting happens when a strategy is tuned so tightly to historical data that it memorises the past instead of capturing a real edge. The tell is a backtest that looks too clean (a smooth curve and a tiny drawdown), usually produced by dozens of tuned parameters.

Red flags

Many parameters, each finely tuned.
Performance collapses if you nudge a setting slightly.
One historical regime carries the whole result.
A suspiciously straight equity line.
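The second red flag can be checked mechanically. The following is a minimal sketch, not a standard test: it assumes a hypothetical `backtest(lookback)` function that returns one positive score (for example net profit) per parameter setting. The toy scoring functions and the 50% tolerance are illustrative assumptions.

```python
from typing import Callable, Dict


def sensitivity_sweep(
    backtest: Callable[[int], float],  # hypothetical: parameter value -> score
    chosen: int,
    nudges: range = range(-3, 4),
) -> Dict[int, float]:
    """Score the strategy at the chosen setting and at small nudges around it."""
    return {chosen + d: backtest(chosen + d) for d in nudges}


def is_fragile(scores: Dict[int, float], chosen: int, keep: float = 0.5) -> bool:
    """Flag the setting as fragile if any neighbour keeps less than `keep`
    of the chosen setting's score. Assumes the chosen score is positive;
    the 0.5 threshold is an arbitrary example value."""
    best = scores[chosen]
    neighbours = [s for p, s in scores.items() if p != chosen]
    return any(s < keep * best for s in neighbours)


# Toy stand-ins for a real backtest (illustrative only).
def spiky(lookback: int) -> float:
    return 100.0 if lookback == 20 else 5.0    # works at exactly one value


def smooth(lookback: int) -> float:
    return 100.0 - 2.0 * abs(lookback - 20)    # degrades gracefully


fragile_flag = is_fragile(sensitivity_sweep(spiky, 20), 20)
robust_flag = is_fragile(sensitivity_sweep(smooth, 20), 20)
print(fragile_flag, robust_flag)
```

The spiky toy strategy is flagged and the smooth one is not. With a real strategy, the same sweep would be run over each tuned parameter in turn.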

The honest tests

Out-of-sample. Hold back data the strategy never saw; if it only works in-sample, it's overfit.
Monte Carlo resampling. Resample the daily P&L into thousands of alternate sequences (a block bootstrap preserves short-term autocorrelation). Now you see a distribution of outcomes, not one lucky path.
Blow-rate metric. Across those simulated paths, what fraction end in a blow-out under real account rules? Converted to a per-year figure, that is the annualized blow rate, and it is the number that actually matters, far more than a headline return.
Parameter sensitivity. A robust strategy degrades gracefully as you vary settings; a fragile one falls off a cliff.
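The Monte Carlo resampling and blow-rate steps can be sketched in a few lines. This is an illustration under stated assumptions, not any particular vendor's implementation: the daily P&L history is randomly generated, the account rule is modelled as a simple trailing drawdown limit (real account rules vary), and the names `block_bootstrap`, `blows_out` and `blow_rate` are hypothetical.

```python
import random
from typing import List


def block_bootstrap(pnl: List[float], n_days: int, block: int,
                    rng: random.Random) -> List[float]:
    """Build one synthetic P&L path by stitching together randomly chosen
    contiguous blocks of the historical daily P&L."""
    if not 1 <= block <= len(pnl):
        raise ValueError("block must be between 1 and len(pnl)")
    path: List[float] = []
    last_start = len(pnl) - block
    while len(path) < n_days:
        start = rng.randint(0, last_start)       # inclusive on both ends
        path.extend(pnl[start:start + block])
    return path[:n_days]


def blows_out(path: List[float], max_drawdown: float) -> bool:
    """True if running equity ever falls `max_drawdown` below its peak.
    A trailing-drawdown rule is an assumption made for this sketch."""
    equity = peak = 0.0
    for day in path:
        equity += day
        peak = max(peak, equity)
        if peak - equity >= max_drawdown:
            return True
    return False


def blow_rate(pnl: List[float], n_paths: int, n_days: int, block: int,
              max_drawdown: float, seed: int = 0) -> float:
    """Fraction of simulated paths that breach the drawdown limit."""
    rng = random.Random(seed)
    blown = sum(
        blows_out(block_bootstrap(pnl, n_days, block, rng), max_drawdown)
        for _ in range(n_paths)
    )
    return blown / n_paths


# Made-up daily P&L history, for illustration only.
data_rng = random.Random(1)
history = [data_rng.gauss(50.0, 400.0) for _ in range(250)]

rate = blow_rate(history, n_paths=1000, n_days=250, block=5, max_drawdown=2500.0)
print(f"blow rate over the simulated horizon: {rate:.1%}")
```

Note that this returns the blow rate over the whole simulated horizon. Turning it into a per-year figure requires a further convention that the sketch does not choose.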

How Puravida Edge does it

Every strategy is validated on 12 months of empirical data, then resampled into 1,500 Monte Carlo paths over a 3-year horizon with a 5-day block bootstrap. We report percentile outcomes (P25/P50/P75) and an annualized blow rate, not just a best-case return — and we publish the methodology rather than a single hero curve. Full detail on the methodology page; outcomes per portfolio in the Pass Estimator.
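To make the reported numbers concrete, here is a hedged sketch of how percentile outcomes and an annualized blow rate could be computed from simulation results. The inputs are made up, and the annualization formula assumes the same chance of blowing out in every year (a constant hazard). That formula is an assumption for illustration; the text above does not state which convention Puravida Edge uses.

```python
import statistics
from typing import Dict, List

# Settings quoted in the text above.
N_PATHS = 1500
HORIZON_YEARS = 3
BLOCK_DAYS = 5


def percentile_summary(final_pnls: List[float]) -> Dict[str, float]:
    """P25 / P50 / P75 of the terminal P&L across simulated paths."""
    p25, p50, p75 = statistics.quantiles(final_pnls, n=4, method="inclusive")
    return {"P25": p25, "P50": p50, "P75": p75}


def annualize_blow_rate(horizon_rate: float, years: float) -> float:
    """Convert a blow rate over `years` into a per-year rate.
    Assumes a constant yearly blow-out probability (illustrative only)."""
    return 1.0 - (1.0 - horizon_rate) ** (1.0 / years)


# Made-up inputs: one terminal P&L per simulated path, and a 27.1% blow rate
# measured over the full 3-year horizon.
final_pnls = [float(x) for x in range(-500, 1000)]
summary = percentile_summary(final_pnls)
annual = annualize_blow_rate(0.271, HORIZON_YEARS)

print(summary)
print(f"annualized blow rate: {annual:.1%}")
```

Reporting the three percentiles together shows the spread of outcomes, which a single best-case figure hides.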

FAQ

How can I tell if my backtest is overfit?

Watch for too-clean results, many finely tuned parameters, and performance that collapses when you change a setting or test out-of-sample. A real edge degrades gracefully and survives unseen data.

What does Monte Carlo do for a trading strategy?

It resamples your daily P&L into thousands of alternate sequences so you see a distribution of outcomes — including the probability of a blow-out — instead of one historical path.

What's a block bootstrap and why use it?

It resamples in short blocks (e.g. 5 days) rather than single days, preserving short-term autocorrelation so the simulated sequences behave like real markets.
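The effect is easy to see on a toy series. The sketch below uses a made-up P&L history with deliberate winning and losing streaks, then compares the lag-1 autocorrelation after a single-day resample and after a 5-day block resample. The helper names are hypothetical and the data is illustrative only.

```python
import random
from typing import List


def lag1_autocorr(xs: List[float]) -> float:
    """Sample lag-1 autocorrelation of a series."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    return cov / var


def resample(xs: List[float], block: int, rng: random.Random) -> List[float]:
    """Resample `xs` to its original length using contiguous blocks.
    block=1 reduces to the ordinary single-day bootstrap."""
    out: List[float] = []
    while len(out) < len(xs):
        start = rng.randint(0, len(xs) - block)   # inclusive on both ends
        out.extend(xs[start:start + block])
    return out[:len(xs)]


# Hypothetical P&L with streaks: ten winning days, ten losing days, repeated.
history = ([1.0] * 10 + [-1.0] * 10) * 100

rng = random.Random(42)
original_ac = lag1_autocorr(history)
single_day_ac = lag1_autocorr(resample(history, 1, rng))
block_ac = lag1_autocorr(resample(history, 5, rng))
print(round(original_ac, 2), round(single_day_ac, 2), round(block_ac, 2))
```

The single-day resample destroys the streaks, so its autocorrelation falls to roughly zero, while the 5-day blocks keep most of it. Some is still lost at the joins between blocks, which is why block length is a trade-off rather than a free choice.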

Which metric matters most for prop trading?

The annualized blow rate — the share of simulated paths that violate account rules — matters more than headline return, because surviving the drawdown rule is what gets you paid.

Not financial advice. Performance figures referenced are hypothetical, modeled outputs (1,500-path Monte Carlo on a backtest + live sample). Past performance does not guarantee future results. Prop-firm Terms of Service compliance is your responsibility — verify every rule with the firm directly.