Machine learning models predicting returns: why most popular performance metrics are misleading and proposal for an efficient metric

Jean Dessain

doi:10.2139/ssrn.3927058

Abstract

Numerous machine learning models have been developed to achieve the ‘real-life’ financial objective of optimising the risk/return profile of investment strategies. In this article: (a) We present and classify the most popular performance metrics used in the 190 articles we analysed, and note that most of these articles devote no attention to the criteria used to compare algorithms. (b) We evaluate the ability of the metrics used in the literature to assess how efficiently algorithms improve investment results. We demonstrate that many of the most popular metrics, such as mean squared error (MSE) or root mean squared error (RMSE), are inappropriate for this purpose, while others, such as accuracy or F1, are merely weak. We explain why risk-adjusted return-based metrics are best-in-class, although they suffer from statistical limitations and do not allow easy comparison of algorithms across assets or over time. (c) We propose a new discriminant metric that measures the efficiency of AI models in optimising the risk-adjusted return, that is statistically more robust, and that can test the stability of models over time and across assets.

Full Text