
CESReS - Code Embeddings for a Student Recommendation System

(2024)

Files

Steveny_16971900_2024.pdf
  • Open access
  • Adobe PDF
  • 2.47 MB

Steveny_16971900_2024_Appendix1.zip
  • Open access
  • Unknown
  • 3.06 MB

Details

Abstract
Students need feedback on their work to gauge their level, strengths and shortcomings. However, providing personalized feedback on their misconceptions is difficult because means and resources are limited. Even mechanisms that have proven effective, such as INGInious, require cumbersome and time-consuming work to design specific tests and feedback. The objective of this thesis was to overcome these limits by enabling automatic feedback generation with a machine learning model. To do so, we developed a classification architecture that follows the latest advances in Natural Language Processing. By using code embeddings, i.e. vectors generated from the students' submissions, our system can detect specific misconceptions occurring in code snippets. To control the classes and enable the training of our deep neural network, we developed an approach inspired by DeepBugs: the training instances are mutants of original student submissions, where the injected modifications are representative of a set of 14 misconceptions we selected. Our model obtained F1-score values of up to 72.9% on an evaluation dataset of students' mistakes. We also highlight the limits of our current mutation labelling technique and the improvements to be made in future work. Finally, the program we built around the model can connect to a graphical interface and provide basic insights into the model's decisions through the computation of Integrated Gradients.
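
The mutation idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not list the 14 misconceptions or name the language of the submissions, so the sketch assumes Python submissions and uses one hypothetical misconception (writing `is` where `==` is meant). The names `EqToIs`, `make_mutant` and the label `eq_vs_is` are invented for illustration. It relies on the standard `ast` module and needs Python 3.9 or later for `ast.unparse`.

```python
import ast


class EqToIs(ast.NodeTransformer):
    """Hypothetical mutation: rewrite every `==` comparison as `is`."""

    def __init__(self):
        self.changed = 0  # number of operators rewritten

    def visit_Compare(self, node):
        self.generic_visit(node)  # handle nested comparisons first
        new_ops = []
        for op in node.ops:
            if isinstance(op, ast.Eq):
                new_ops.append(ast.Is())
                self.changed += 1
            else:
                new_ops.append(op)
        node.ops = new_ops
        return node


def make_mutant(source, label="eq_vs_is"):
    """Return (mutated_source, label), or None if the mutation does not apply."""
    tree = ast.parse(source)
    mutator = EqToIs()
    mutated = mutator.visit(tree)
    if mutator.changed == 0:
        return None  # nothing to inject, so no training instance is produced
    ast.fix_missing_locations(mutated)
    return ast.unparse(mutated), label


# A correct submission becomes a labelled training instance.
original = "def is_zero(x):\n    return x == 0\n"
mutant = make_mutant(original)
```

Because the label is known at injection time, each mutant is a ready-made supervised example; a classifier over code embeddings would then be trained on such (code, label) pairs.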