Tennis matches outcome prediction via Low-Rank approaches

(2022)

Files

Massion_13701700_2022.pdf
  • Open access
  • Adobe PDF
  • 3.54 MB

Details

Supervisors
Faculty
Degree label
Abstract
This thesis tackles the problem of predicting the outcome of tennis matches using low-rank approaches. The work postulates the existence of a true winning probability matrix that generates the results of all matches. Three types of approaches are proposed to recover this matrix from the dataset. The first sets up a low-rank matrix completion (LRMC) problem. Several classical LRMC techniques, as well as new ones adapted to this problem, are tested. The second approach solves a maximum a posteriori (MAP) problem on the probability matrix while imposing a low-rank structure. The third, novel formulation uses the well-known Bradley-Terry-Luce (BTL) model to convert the problem of estimating probabilities into one of estimating ratings. This idea reduces the number of constraints and allows one additional feature, such as the tournament in which a match is played, to be included in the model. This third formulation sets up a MAP problem on the ratings, which are constrained to be low-rank, and the winning probabilities are computed afterwards via the BTL formula. The new techniques introduced in this work give similar or better results than classical LRMC techniques for this problem. Finally, an important statement about the winning probabilities is proved. Although one might expect that they should avoid being too large or too small, it turns out that, in order to maximize prediction accuracy, they need to be clipped to zero or one. This implies that, in the MAP framework, any symmetric prior distribution that might be chosen for normalization is useless.
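The third formulation and the clipping result can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the usual logistic form of the BTL formula on log-strength ratings, a hypothetical players-by-tournaments rating matrix built as a rank-2 product, and a reading of accuracy as the expected fraction of correct calls when a winner is predicted with probability q; all names (`btl_win_probability`, `ratings`, `expected_accuracy`) are illustrative.

```python
import numpy as np


def btl_win_probability(rating_i, rating_j):
    """BTL probability that player i beats player j (logistic form, assumed)."""
    return 1.0 / (1.0 + np.exp(-(rating_i - rating_j)))


# Hypothetical low-rank rating matrix: one rating per (player, tournament).
rng = np.random.default_rng(0)
n_players, n_tournaments, rank = 5, 3, 2
U = rng.normal(size=(n_players, rank))
V = rng.normal(size=(n_tournaments, rank))
ratings = U @ V.T  # rank at most 2 by construction


def win_probability_matrix(ratings, tournament):
    """Pairwise winning probabilities for one tournament via the BTL formula."""
    r = ratings[:, tournament]
    return btl_win_probability(r[:, None], r[None, :])


def expected_accuracy(p, q):
    """Expected accuracy when the true probability is p and the prediction is q.

    Assumed definition: the prediction picks player i with probability q.
    The expression is linear in q, so its maximum lies at q = 0 or q = 1.
    """
    return p * q + (1.0 - p) * (1.0 - q)


def clip_to_prediction(P):
    """Clip probabilities to 0 or 1 (ties at exactly 0.5 are sent to 0 here)."""
    return (P > 0.5).astype(float)


P = win_probability_matrix(ratings, tournament=0)
print(np.round(P, 3))
print(clip_to_prediction(P))
```

Under this assumed definition, a true probability of 0.7 gives an expected accuracy of 0.58 when the prediction is left at 0.7 and 0.7 when it is clipped to 1, which matches the direction of the result stated in the abstract.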