
Evaluating decision trees in a predict-and-optimize setting

(2024)

Files

Castro_05732100_2024.pdf
  • Open access
  • Adobe PDF
  • 2.68 MB

Details

Abstract
This thesis studies the performance of decision trees in a predict-and-optimize setting, where a decision tree predicts the unknown input parameters of an optimization problem and decisions are then made by solving the problem with these predictions. Two loss functions are used to train the trees: the mean squared error (MSE) and the Smart Predict-then-Optimize (SPO) loss, and each loss function is combined with two algorithms: greedy decision trees and optimal decision trees (DL8.5). The models under study are therefore MSE trees and SPO trees (SPOTs), along with two DL8.5 variants, DL8.5 with MSE and DL8.5 with SPO, which build optimal decision trees. These models are applied to two optimization problems: the shortest path problem and the traveling salesman problem (TSP). For the TSP, an NP-hard problem of finding the shortest tour, the greedy and DL8.5 trees are evaluated with both exact and heuristic TSP solvers. We conduct several experiments on synthetic workloads with varying degree and noise parameters, in which the trees predict the travel times used by these optimization problems. The results underscore the value of the SPO metric for building decision trees: it yields better-quality decisions and lower model complexity than trees trained to minimize prediction error (MSE). Furthermore, DL8.5 with SPO consistently outperforms the other methods on decision error.
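To make the "decision error" mentioned in the abstract concrete, the following is a minimal sketch of how SPO regret can be computed on a toy shortest-path instance: decisions are taken using predicted costs, and the regret is the extra true cost paid relative to the decision that would have been taken with the true costs. The graph, cost values, and function names are illustrative assumptions, not taken from the thesis.

```python
def solve(paths, costs):
    """Return the path (a list of edge indices) with minimum total cost."""
    return min(paths, key=lambda p: sum(costs[e] for e in p))

def spo_regret(paths, true_costs, pred_costs):
    """Decision error: extra true cost incurred by optimizing on predictions."""
    chosen = solve(paths, pred_costs)    # decision made from the prediction
    optimal = solve(paths, true_costs)   # decision under full information
    return (sum(true_costs[e] for e in chosen)
            - sum(true_costs[e] for e in optimal))

# Toy instance: 4 edges, two source-to-target paths sharing no edges.
paths = [[0, 1], [2, 3]]
true_costs = [1.0, 1.0, 3.0, 0.5]   # path [0,1] truly costs 2.0, [2,3] costs 3.5
pred_costs = [2.0, 2.5, 1.0, 1.0]   # mispredictions make path [2,3] look cheaper
print(spo_regret(paths, true_costs, pred_costs))  # -> 1.5
```

A perfect prediction (or any prediction inducing the same decision) gives zero regret, which is why trees trained on the SPO loss can reach good decisions even with imprecise cost estimates.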