Machine learning and variable selection methods for recovery rates prediction

Supervisors: Vrins, Frédéric ; Gambetti, Paolo
Faculty: Louvain School of Management
Degree label: Master [120] in Business Engineering
Abstract: In a world overwhelmed by large sets of raw data, variable selection has become an indispensable step in data analysis. High-dimensional data sets do not provide information by themselves; the information has to be extracted and processed. This master's thesis explores the use of automated variable selection methods in the context of recovery rate prediction. The results suggest that subsets constructed with feature selection methods yield better forecasts than subsets of bond-specific variables. Moreover, the individual selection methods outperform a "popular variables" approach based on the variables that recur across the individual selection methods. Furthermore, the results show that once a wide selection of macroeconomic features has been added to the bond-specific variables, the choice of predictive model becomes more important than the exact composition of the subset. Our findings reveal that metaheuristic approaches yield slightly better forecasts than other feature selection methods and that artificial neural networks, random forests and least-squares support vector regression are the most promising machine learning methods for bond recovery rate prediction.