Hi there,
I'd like to ask what the best practices are for evaluating the quality of a strategy. I suppose that looking purely at APR is not enough.
What I usually also look at is drawdowns, especially in bear market periods: how well is the strategy able to cope with heavy market declines? Profit factor and win rate are the next metrics I follow.
What next?
Thx
Jiri
1.) A good performance metric (or better, a set of them) is one thing.
2.) Use cross-validation, i.e. Insample and Out-of-Sample intervals.
3.) Robustness Tests. These will be handled by one of the next finantic extensions to arrive. Search the forum for "Robustness" to find a PDF as a starter.
1) Yes, but what is that set of performance metrics? There are plenty of these in the backtest results but what do they say about the strategy quality? And how do they work together? Is there any description available?
2) Could you elaborate on that pls?
3) Good, thx for hint :)
QUOTE:
what is that set of performance metrics?
This is an ongoing discussion, everybody seems to have their preferences.
A beginner will look at profit only. After all, you're interested in making as much money as possible.
If slightly advanced, you'll look at annualized profit in percent. This makes backtests comparable that run over different data ranges and with different starting capital.
If more advanced, you'll realize that it is possible to get much higher profits with bigger positions, but this comes with a downside: your risk (measured as drawdown or volatility of profits) will go up by the same factor. So you'll finally switch to a risk/reward ratio. The most popular seems to be the Sharpe ratio (which is calculated on a monthly basis). I prefer the daily Sharpe Ratio because it is more precise. With such a risk/reward ratio your results become independent of position sizes and margins.
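To illustrate why a risk/reward ratio does not depend on position size, here is a minimal Python sketch. It assumes a risk-free rate of zero and that doubling the position size simply doubles every daily return (no compounding or margin costs); the return series and the function name `sharpe_ratio` are made up for the example and are not how Wealth-Lab computes its metric.

```python
import math
import statistics

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of periodic returns (risk-free rate assumed 0)."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)
    return mean / stdev * math.sqrt(periods_per_year)

# Hypothetical daily returns of a strategy, as fractions of equity.
daily = [0.004, -0.002, 0.003, -0.001, 0.005, -0.003, 0.002, 0.001]

# Assumption: twice the position size doubles every daily return.
doubled = [2 * r for r in daily]

sharpe_1x = sharpe_ratio(daily)
sharpe_2x = sharpe_ratio(doubled)

# Profit and volatility both double, so the ratio stays the same.
print(f"1x: {sharpe_1x:.3f}  2x: {sharpe_2x:.3f}")
```

Mean return and standard deviation scale by the same factor, so the ratio cancels it out. That is also why APR and MaxDD still need a separate look.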
BUT: If you look at risk/reward only you'll get fooled by strategies which are otherwise not practical. To avoid this, it is useful to look at APR and MaxDD separately and keep an eye on PositionCount and Exposure.
Let me add a word about MaxDD (Maximum Drawdown):
This important metric depends on a handful of trades in your backtest. If you leave out one or two symbols (which are involved in that drawdown on that date), the MaxDD metric will change a lot; it is not very stable. A better choice is a MaxDrawdown number calculated from a bootstrap: this shuffles all available trades in a clever way to get a realistic/robust estimate of the most probable MaxDrawdown. I called this number "Expected Drawdown" and made it available in the (free) finantic.SysQ extension. (https://www.wealth-lab.com/extension/detail/finantic.SysQ)
I'd suggest you add the resulting risk/reward ratio (called SysQ) to the set of metrics you observe after a backtest.
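The bootstrap idea can be sketched in a few lines of Python. This is not the algorithm used by finantic.SysQ (which is not described here), just a plain resampling-with-replacement illustration. It assumes trades are taken one after another with full equity, and the trade returns, the function names and the choice of the median as the "expected" value are all assumptions for the example.

```python
import random
import statistics

def max_drawdown(trade_returns):
    """Largest peak-to-valley equity loss (as a fraction) for one trade sequence."""
    equity = peak = 1.0
    worst = 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def bootstrap_drawdown(trade_returns, runs=1000, seed=42):
    """Median max drawdown over trade sequences resampled with replacement."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(trade_returns)
    samples = [max_drawdown(rng.choices(trade_returns, k=n)) for _ in range(runs)]
    return statistics.median(samples)

# Hypothetical per-trade returns from a backtest.
trades = [0.03, -0.02, 0.05, -0.04, 0.01, -0.01, 0.02, -0.03, 0.04, -0.02]

single_path_dd = max_drawdown(trades)   # depends on the one historical order
expected_dd = bootstrap_drawdown(trades)  # averaged over many possible orders

print(f"historical MaxDD: {single_path_dd:.2%}, bootstrap estimate: {expected_dd:.2%}")
```

The historical MaxDD comes from a single ordering of the trades, while the bootstrap estimate summarizes many orderings, which is why it reacts less to one or two unlucky trades.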
QUOTE:
2.) Use cross-validation, i.e. Insample and Out-of-Sample intervals.
Whenever you change something in your strategy with the aim of improving it, you do an optimization step (no matter if you use an optimizer or make these changes manually).
With every optimization step, the risk of producing an overoptimized strategy grows (no matter if you use an optimizer or not).
One way to see the effects of overoptimization works as follows:
You divide your backtest data range into two (or more) intervals, say 7 years and 3 years instead of 10 years.
You use the first 7 years to work on your strategy, run backtests, probably use the optimizer and so forth. These 7 years are called the Insample interval.
If you are happy with the results, you run your strategy over the second, 3-year interval, i.e. over "unseen data". These three years are called the Out-of-Sample (OOS) interval.
If the results on Out-of-Sample are not good enough you start all over.
Some Remarks:
1. Don't use the OOS data too often, it will be less OOS every time you look at the results.
2. The whole process is accelerated by using the IS/OS Scorecard (Part of finantic.ScoreCards extension)
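The 7-year/3-year split described above can be sketched like this in Python. The function name `split_in_out_of_sample` and the yearly data are hypothetical; in practice the split would be applied to the bars of the backtest data range (or done for you by the IS/OS Scorecard).

```python
def split_in_out_of_sample(bars, in_sample_fraction=0.7):
    """Split chronologically ordered bars into (in-sample, out-of-sample)."""
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = round(len(bars) * in_sample_fraction)
    # No shuffling: the out-of-sample part must come strictly after the in-sample part.
    return bars[:cut], bars[cut:]

# Hypothetical data: one entry per year of a 10-year backtest range.
years = list(range(2014, 2024))
in_sample, out_of_sample = split_in_out_of_sample(years)

print("develop and optimize on:", in_sample)
print("final check only on:   ", out_of_sample)
```

The split is chronological rather than random, so the out-of-sample years stay "unseen" while you work on the strategy.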
In addition to all the metrics above, you also want to be sure the most recent slope of the equity curve is positive. If your equity isn't increasing over time (i.e. you're not making money over time), then you have a bad match between your stock (or investment instrument) and your strategy.
The Basic ScoreCard has an "EquitySlope" metric to gauge this, but it's looking at the slope (via simple linear regression) over the entire Data Range, not just the most recent slope.
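To show the difference between the slope over the entire range and the most recent slope, here is a small Python sketch using simple linear regression against the bar index. The equity values, the lookback of 4 bars and the function names are assumptions for illustration; this is not the ScoreCard's own code.

```python
def regression_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def recent_slope(equity, lookback):
    """Slope of only the last `lookback` equity values."""
    return regression_slope(equity[-lookback:])

# Hypothetical equity curve: rises for a while, then rolls over.
equity = [100, 104, 109, 115, 120, 124, 123, 121, 118, 116]

overall = regression_slope(equity)
recent = recent_slope(equity, lookback=4)

print(f"overall slope: {overall:.2f} per bar, recent slope: {recent:.2f} per bar")
```

In this made-up curve the slope over the whole range is positive while the recent slope is negative, which is exactly the case a whole-range metric would hide.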
Thank you both for great insights. I'm mostly familiar with the described concepts, it's just to put them all together :)