Decoding sound categories in time: an electrophysiological study in sighted humans
Files
Denis_58701700_2023.pdf
Open access - Adobe PDF
- 5.1 MB
Details
- Abstract
- The goal of this master's thesis is to investigate different machine learning procedures for discriminating electroencephalography (EEG) responses to sound stimuli belonging to different living and non-living categories. Thanks to the high temporal resolution of EEG, machine learning algorithms can be used to pinpoint the time (in milliseconds) at which sounds are processed similarly or differently in the brain. It is thus possible to explore how representations of different sound categories emerge and resolve over time, which is of great importance in auditory neuroscience. In this thesis, we explored machine learning techniques for three purposes. First, for decoding: the classification algorithms tried were Support Vector Machines (SVM) with linear and radial basis function kernels and linear discriminant analysis, together with a brief exploration of the SVM C hyperparameter; multiclass classification (in contrast to binary classification) was also tried. Second, for processing the dataset to modify some of its properties (dimension, size, data quality) and observe the impact on results: feature extraction and selection, averaging features across time points, pseudo-trials, and a sliding time window. Third, for evaluating the results with different cross-validation strategies: stratified KFold, leave exemplar out, and leave one trial per exemplar out. This is complemented by a comparison of evaluation metrics, whose choice affects the results and consequently the inferences neuroscientists make about auditory categorisation. The metrics measured are accuracy and Area Under the Curve (AUC). The results suggest that an SVM classifier with appropriate hyperparameters is well suited to this type of study. Regarding cross-validation, the standard stratified KFold appears to be good and reliable. Leave exemplar out could be useful for evaluating the performance of a classifier on previously unseen sounds.
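As an illustration of the time-resolved decoding and cross-validation strategies described above, here is a minimal sketch. It is not the pipeline used in the thesis. It assumes scikit-learn and NumPy, uses synthetic data in place of real EEG epochs, and all function names, array shapes and parameter values are hypothetical. Leave exemplar out is approximated with `LeaveOneGroupOut`, where each group is one sound exemplar. Because a held-out exemplar belongs to a single category, AUC is undefined on that test set, so the sketch scores that strategy with accuracy.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_epochs(n_exemplars=8, n_repeats=10, n_channels=16,
                          n_times=20, onset=8, seed=0):
    """Hypothetical stand-in for epoched EEG, shaped (trials, channels, times)."""
    rng = np.random.default_rng(seed)
    exemplar = np.repeat(np.arange(n_exemplars), n_repeats)  # sound identity per trial
    y = exemplar % 2  # two categories, e.g. living vs non-living
    X = rng.normal(size=(len(y), n_channels, n_times))
    pattern = rng.normal(size=n_channels)
    # Inject a category-dependent spatial pattern from `onset` onwards.
    X[:, :, onset:] += 0.8 * (2 * y - 1)[:, None, None] * pattern[None, :, None]
    return X, y, exemplar


def decode_over_time(X, y, cv, groups=None, scoring="roc_auc"):
    """Cross-validate one linear SVM per time point; return the mean score per time point."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    scores = np.empty(X.shape[2])
    for t in range(X.shape[2]):
        fold_scores = cross_val_score(clf, X[:, :, t], y, groups=groups,
                                      cv=cv, scoring=scoring)
        scores[t] = fold_scores.mean()
    return scores


X, y, exemplar = make_synthetic_epochs()

# Stratified KFold: every fold keeps the category proportions, so AUC is defined.
auc_kfold = decode_over_time(
    X, y, cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))

# Leave exemplar out: each test set holds all trials of one unseen sound.
acc_exemplar = decode_over_time(
    X, y, cv=LeaveOneGroupOut(), groups=exemplar, scoring="accuracy")

print("AUC per time point (stratified KFold):", np.round(auc_kfold, 2))
print("Accuracy per time point (leave exemplar out):", np.round(acc_exemplar, 2))
```

In this toy setting the scores should hover around chance before the simulated onset and rise afterwards. With real data, the per-time-point scores would then be tested for statistical significance against chance level.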
In several scenarios, the results show metrics that are statistically significantly above chance level following the onset of the sound and, sometimes, just after its offset. This work also shows that the use of pseudo-trials (trial averaging) leads to some improvement in both metrics. However, pseudo-trials must be used cautiously: because the dataset size is reduced, the model can be subject to underfitting. Reducing the dimensionality with feature extraction and selection did not significantly change the outcomes but did reduce execution times. Multiclass classification showed interesting results, with statistical significance lasting longer after stimulus onset than with binary classification. Concerning metrics, we also observed that accuracy and AUC values often differed. This short review is offered in the hope that a strategic combination of the tools and techniques discussed here can be designed to significantly improve the metrics obtained in future, similar EEG decoding studies.
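The pseudo-trial idea mentioned above can be sketched as follows. This is an assumption-laden illustration rather than the thesis's implementation: the function name `make_pseudo_trials`, the group size `n_avg`, and the choice to drop leftover trials are all hypothetical, and the data are synthetic. The sketch averages random groups of same-category trials, which lowers the noise level but divides the number of samples by `n_avg`.

```python
import numpy as np


def make_pseudo_trials(X, y, n_avg=4, seed=0):
    """Average random groups of `n_avg` same-class trials into pseudo-trials.

    X has shape (n_trials, n_features) and y has shape (n_trials,).
    Trials that do not fill a complete group are dropped (an assumption
    of this sketch, not necessarily what the thesis does).
    """
    rng = np.random.default_rng(seed)
    X_out, y_out = [], []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        n_groups = len(idx) // n_avg
        groups = idx[: n_groups * n_avg].reshape(n_groups, n_avg)
        # X[groups] has shape (n_groups, n_avg, n_features); average within each group.
        X_out.append(X[groups].mean(axis=1))
        y_out.append(np.full(n_groups, label))
    return np.concatenate(X_out), np.concatenate(y_out)


# Synthetic example: 40 trials per category, 16 features, weak category effect.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 16)) + 0.3 * y[:, None]

X_pseudo, y_pseudo = make_pseudo_trials(X, y, n_avg=4)

print("trials:", X.shape[0], "-> pseudo-trials:", X_pseudo.shape[0])
print("noise std before:", X[y == 0].std(), "after:", X_pseudo[y_pseudo == 0].std())
```

When pseudo-trials are combined with cross-validation, a common precaution is to build them separately within the training and test sets, so that no original trial contributes to both.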