Files
Bartolomeo_28891700_2022.pdf
Open access - Adobe PDF
- 1.07 MB
Details
- Abstract
- This work gives an overview of missing data. The first part, the statistical part of the work, reviews classical imputation methods for handling missing data (complete case analysis, last observation carried forward, mean imputation, (stochastic) regression imputation, and multiple imputation) and weighs the pros and cons of each method from a statistical point of view. The second part analyses the impact of missing data and of imputation methods on support vector machines, an increasingly popular supervised machine learning method for classification. This impact is measured through numerical simulations in R and Python.
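
As a rough illustration of three of the classical methods named in the abstract, the following minimal Python sketch applies complete case analysis, mean imputation and last observation carried forward to a toy series. It is not taken from the thesis: the data are made up, and the function names (`complete_case`, `mean_imputation`, `locf`) are hypothetical helpers chosen for this example only.

```python
from statistics import mean, variance

# Toy series with missing values marked as None (made-up data, not from the thesis).
observed = [2.0, None, 4.0, 6.0, None]


def complete_case(values):
    """Complete case analysis: keep only the observed entries."""
    return [v for v in values if v is not None]


def mean_imputation(values):
    """Replace each missing entry with the mean of the observed entries."""
    fill = mean(complete_case(values))
    return [fill if v is None else v for v in values]


def locf(values):
    """Last observation carried forward; leading gaps stay missing (None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


cases = complete_case(observed)
imputed = mean_imputation(observed)
carried = locf(observed)

print("complete cases:", cases)
print("mean imputed:  ", imputed)
print("LOCF:          ", carried)

# Compare the sample variance before and after mean imputation.
print("variance (complete cases):", variance(cases))
print("variance (mean imputed):  ", variance(imputed))
```

On this toy series, mean imputation leaves the mean unchanged but shrinks the sample variance, which is one of the statistical drawbacks usually attributed to the method.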