Free Tool

Backtest Quality Checker

Paste your backtest metrics. Get a quality score (0–100) and a list of red flags that often indicate overfitting or unreliable results.

Win rate: from your backtest report
Profit factor: gross profit / gross loss
Sharpe ratio: annualized
Max drawdown: peak-to-trough
Sample size: total trade count
Sample period: backtest length
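
If your backtest report does not list these inputs directly, the definitions above (gross profit / gross loss, peak-to-trough) can be sketched as follows. This is a minimal illustration, not the tool's own code: the function names, the per-trade P&L list, and the equity-curve list are all assumptions made for the example.

```python
def win_rate_pct(trade_pnls):
    """Percentage of trades that closed with a positive P&L."""
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return 100.0 * wins / len(trade_pnls)


def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (both taken as positive numbers)."""
    gross_profit = sum(pnl for pnl in trade_pnls if pnl > 0)
    gross_loss = -sum(pnl for pnl in trade_pnls if pnl < 0)
    # With no losing trades the ratio is undefined; report infinity here.
    return gross_profit / gross_loss if gross_loss > 0 else float("inf")


def max_drawdown_pct(equity_curve):
    """Largest peak-to-trough decline, as a percentage of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return 100.0 * worst


# Hypothetical sample data, for illustration only.
pnls = [100, -50, 200, -50]
equity = [1000, 1200, 900, 1100]
print(win_rate_pct(pnls), profit_factor(pnls), max_drawdown_pct(equity))
```

The Sharpe ratio is left out because annualizing it depends on the return frequency of your report.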

How the score is calculated

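| Metric | Healthy range | Suspicious | Weight |
| --- | --- | --- | --- |
| Win Rate | 30–75% | >85% | 20 |
| Profit Factor | 1.3–3.0 | >5 | 20 |
| Sharpe | 0.5–2.0 | >3 | 20 |
| Max DD | 5–20% | <5% or >40% | 20 |
| Sample size | ≥200 trades | <100 | 20 |
| Sample period | ≥24 months | <12 months | 10 |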
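
The "Suspicious" column can be read as a set of red-flag checks. The sketch below applies those thresholds as written. It is an assumption-laden illustration rather than the tool's implementation: the `BacktestMetrics` class and `red_flags` function are hypothetical names, values that fall between the healthy and suspicious ranges are simply not flagged, and no attempt is made to reproduce the weighted 0–100 score, since the page does not say how the weights are combined.

```python
from dataclasses import dataclass


@dataclass
class BacktestMetrics:
    # Hypothetical container; field names are assumptions for this example.
    win_rate_pct: float
    profit_factor: float
    sharpe: float
    max_drawdown_pct: float
    trade_count: int
    period_months: float


def red_flags(m):
    """Return the 'Suspicious' conditions from the table that the metrics hit."""
    flags = []
    if m.win_rate_pct > 85:
        flags.append("win rate above 85%")
    if m.profit_factor > 5:
        flags.append("profit factor above 5")
    if m.sharpe > 3:
        flags.append("Sharpe above 3")
    if m.max_drawdown_pct < 5 or m.max_drawdown_pct > 40:
        flags.append("max drawdown below 5% or above 40%")
    if m.trade_count < 100:
        flags.append("fewer than 100 trades")
    if m.period_months < 12:
        flags.append("less than 12 months of data")
    return flags


# Hypothetical inputs: a very high win rate and profit factor on 250 trades.
example = BacktestMetrics(95, 8, 1.2, 12, 250, 36)
for flag in red_flags(example):
    print(flag)
```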

Scores are heuristics, not verdicts. A high score doesn't guarantee live performance. Nothing does, although proper out-of-sample testing and Monte Carlo stress testing provide much stronger evidence than summary metrics alone. A low score doesn't mean the strategy is worthless; it means the metrics shown raise concerns that need investigation.

Frequently asked questions

What's a good backtest quality score?

Above 80 indicates strong metrics with a reasonable sample size and period. 60–80 means decent, with one or two concerns worth investigating. 40–60 means multiple red flags; investigate before trusting the results. Below 40 suggests the backtest is overfit, rests on too small a sample, or has metrics that raise red flags.
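
The bands above can be written as a small lookup. This is a sketch only: `score_band` is a hypothetical helper, the labels are paraphrases of the answer above rather than the tool's wording, and the treatment of the exact boundaries (80, 60, 40) is an assumption, since the text gives overlapping ranges.

```python
def score_band(score):
    """Map a 0-100 quality score to the interpretation bands described above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return "strong metrics"
    if score >= 60:  # assumption: exactly 80 falls into this band
        return "decent, one or two concerns"
    if score >= 40:  # assumption: exactly 60 is 'decent', exactly 40 lands here
        return "multiple red flags"
    return "likely overfit, undersampled, or flagged"


for s in (85, 72, 45, 20):
    print(s, score_band(s))
```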

Why is a 95% win rate suspicious?

Real systematic strategies rarely sustain win rates above 80% across realistic sample sizes. Win rates near 95% almost always indicate either (a) data mining and parameter overfitting, (b) a strategy that simply hasn't encountered its losing regime yet, or (c) a methodology issue (look-ahead bias, survivorship bias). High win rate alone is not a positive signal — it's a question.

Can a low-score strategy still be profitable live?

Sometimes, especially if the low score is driven by small sample size rather than suspicious metrics. A new strategy with 60 trades and otherwise reasonable metrics scores low because of insufficient sample, but might be perfectly fine to paper-trade forward. A strategy with a profit factor of 8 and 250 trades can still score high but is probably overfit. Use the score as a question prompt, not a verdict.