Files
Massion_13701700_2022.pdf
Open access - Adobe PDF
- 3.54 MB
Details
- Abstract
- This thesis tackles the problem of predicting the outcome of tennis matches using low-rank approaches. The work postulates the existence of a true winning-probability matrix that generates the results of all matches. Three types of approaches are proposed to recover this matrix from the dataset. The first sets up a low-rank matrix completion (LRMC) problem; several classical LRMC techniques, as well as new ones adapted to this problem, are tested. The second solves a maximum a posteriori (MAP) problem on the probability matrix while imposing a low-rank structure. The third, novel formulation uses the well-known Bradley-Terry-Luce (BTL) model to convert the problem of estimating probabilities into one of estimating ratings. This reduces the number of constraints and allows one additional feature to be included in the model, such as the tournament in which a match is played. This last formulation develops a MAP problem on the ratings, which are constrained to be low-rank, and the winning probabilities are computed afterwards via the BTL formula. The new techniques introduced in this work give similar or better results than classical LRMC techniques for this problem. Finally, an important statement about the winning probabilities is proved: although one might expect that they should avoid being too large or too small, it turns out that, to maximize the prediction accuracy, they need to be clipped to zero or one. This implies that, in the MAP framework, any prior distribution that could be chosen symmetric for normalization is useless.
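As a rough illustration of the BTL step mentioned in the abstract, the sketch below converts two ratings into a winning probability and then rounds that probability to a hard prediction. It is not taken from the thesis: the exponential (logistic) parametrisation of the ratings, the 0.5 threshold, the function names `btl_win_probability` and `clip_to_outcome`, and the example ratings are all assumptions made for illustration.

```python
import math


def btl_win_probability(rating_i: float, rating_j: float) -> float:
    """Probability that player i beats player j under a BTL model.

    Assumes the exponential parametrisation
    p_ij = exp(r_i) / (exp(r_i) + exp(r_j)),
    evaluated as a sigmoid of the rating difference for numerical stability.
    """
    diff = rating_i - rating_j
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def clip_to_outcome(p: float) -> float:
    """Round a winning probability to 0 or 1 (threshold 0.5 is an assumption)."""
    return 1.0 if p >= 0.5 else 0.0


# Hypothetical ratings for two players, chosen only for the example.
ratings = {"A": 1.2, "B": 0.4}

p_ab = btl_win_probability(ratings["A"], ratings["B"])
p_ba = btl_win_probability(ratings["B"], ratings["A"])

print(f"P(A beats B) = {p_ab:.3f}")
print(f"P(B beats A) = {p_ba:.3f}")
print(f"Hard prediction for A: {clip_to_outcome(p_ab)}")
```

Under this parametrisation only the rating difference matters, so each pair of players needs two ratings rather than a separately constrained matrix entry. The rounding helper only shows what "clipped to zero or one" means in practice; it does not reproduce the thesis's proof.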