Persistence in factor-based supervised learning models
- Author
- Guillaume Coqueret
- Subjects
- Economics and Econometrics, Finance, Statistics and Probability, Applied Mathematics, Computer Science Applications, Supervised learning, Autocorrelation, Predictive analytics, Capital asset pricing model, Portfolio, Transaction cost, JEL codes: C45, C53, G11, G12
- Abstract
In this paper, we document the importance of memory in machine learning (ML) models that rely on firm characteristics for asset pricing. We reach three empirical conclusions. First, the pure out-of-sample fit of the models is often poor: we find that most R^2 measures are negative, especially when training samples are short. Second, we show that poor fit does not necessarily matter from an investment standpoint: what actually counts are measures of cross-sectional accuracy, which are seldom reported in the literature. Third, memory is key. The accuracy of the models is highest when both labels and features are highly autocorrelated. Relatedly, we show that investments are most profitable when they are based on models driven by strong persistence. Average realized returns are highest when training samples are large and when the horizon of the predicted variable is long.
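The contrast between the first two conclusions can be made concrete with a small sketch. The following Python snippet is not the paper's code: the data-generating process, function names, and parameter values are illustrative assumptions. It computes a pooled out-of-sample R^2 alongside an average per-date Spearman rank correlation, and shows how predictions that get the cross-sectional ordering roughly right can still produce a deeply negative R^2.

```python
import numpy as np
from scipy.stats import spearmanr


def pooled_oos_r2(y_true, y_pred):
    """Pooled out-of-sample R^2 over all (date, firm) observations.

    Benchmarked against the grand mean of the realized labels, so it is
    negative whenever the predictions fit worse than a constant.
    """
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - sse / sst


def mean_cross_sectional_rank_corr(y_true, y_pred):
    """Average per-date Spearman correlation between predicted and realized
    returns: only the ordering of firms within each date matters, which is
    what drives portfolio sorts."""
    return float(np.mean([spearmanr(t, p)[0] for t, p in zip(y_true, y_pred)]))


# Toy panel: predictions capture the cross-sectional ranking reasonably
# well but are badly scaled and shifted in levels.
rng = np.random.default_rng(0)
n_dates, n_firms = 120, 200
signal = rng.normal(size=(n_dates, n_firms))            # latent firm quality
returns = 0.02 * signal + 0.05 * rng.normal(size=(n_dates, n_firms))
preds = 0.5 * signal + 0.1                              # wrong scale and level

print(f"pooled OOS R^2: {pooled_oos_r2(returns.ravel(), preds.ravel()):.2f}")
print(f"mean cross-sectional rank correlation: "
      f"{mean_cross_sectional_rank_corr(returns, preds):.2f}")
```

On this toy panel the pooled R^2 is strongly negative while the average per-date rank correlation is clearly positive, mirroring the abstract's point that level fit and cross-sectional accuracy can diverge.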
- Published
- 2022