
Deep learning in Automatic Piano Transcription

(2018)

Files

Karioun_19310800_Tihon_90761300_2018.pdf
  • Open access
  • Adobe PDF
  • 3.48 MB

Details

Abstract
Automatic Piano Transcription (APT) is a branch of Music Information Retrieval (MIR) that focuses on transcribing high-quality piano audio recordings into sheet music. In this thesis, we propose and analyse new methods to improve state-of-the-art deep learning approaches, basing our experiments on the Onsets and Frames algorithm defined in [1]. First, we propose the so-called harmonic layer, a convolutional layer specifically designed to take the harmonics of the notes into account. However, traditional networks already capture these harmonics implicitly through their fully connected layers. Although the results of our experiments are not initially promising, the layer might find a better use in a more appropriate architecture. Second, we balance the dataset’s labels. Given the nature of the datasets (piano recordings of traditional and classical pieces), strong imbalances are observed in the total activation time of the different notes. The performance of Onsets and Frames improves slightly when the pitch distribution is balanced during training by repeating sequences. Finally, a simplified model of Onsets and Frames is proposed. At the expense of slightly worse performance, the proposed model runs 6 times faster. It can be used for fast prototyping or as a basis for more complex models. The different models are compared using ROC curves and F1 scores, as well as pitch-wise F1 scores, which reveal some interesting differences in how the different pitches are learned.
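
The pitch-wise F1 score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition used in the thesis, so the code below assumes a frame-level evaluation on binary piano rolls (one row per time frame, one column per pitch); the function name pitchwise_f1 and the toy data are hypothetical and serve only as illustration.

```python
def pitchwise_f1(y_true, y_pred):
    """Frame-level F1 score computed separately for each pitch.

    Assumption (not from the thesis): y_true and y_pred are binary piano
    rolls given as lists of rows, one row per frame, one column per pitch.
    Returns a list with one score per pitch, or None for a pitch that is
    neither active in the reference nor predicted (F1 is undefined there).
    """
    n_pitches = len(y_true[0])
    scores = []
    for p in range(n_pitches):
        tp = fp = fn = 0
        for true_row, pred_row in zip(y_true, y_pred):
            t = bool(true_row[p])
            q = bool(pred_row[p])
            if t and q:
                tp += 1      # pitch active and detected
            elif q:
                fp += 1      # pitch detected but not active
            elif t:
                fn += 1      # pitch active but missed
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else None)
    return scores


# Toy example: 4 frames, 3 pitches.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 0]]
scores = pitchwise_f1(y_true, y_pred)
```

Reporting one score per pitch, rather than a single aggregate, is what makes differences between frequently and rarely played notes visible; undefined pitches are returned as None so that they are not mistaken for a perfect or a zero score.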