
Entity Similarity through word embedding and Named Entity Recognition using word vectors

(2017)

Files

Gusbin_53801200_Vrielynck_80991000_Annexe1.zip
  • UCLouvain restricted access
  • Unknown
  • 4.46 MB

Gusbin_53801200_Vrielynck_80991000_2017.pdf
  • UCLouvain restricted access
  • Adobe PDF
  • 9.58 MB

Gusbin_53801200_Vrielynck_80991000_2017_Erratum.pdf
  • UCLouvain restricted access
  • Adobe PDF
  • 23.16 KB

Details

Abstract
The recently introduced Word2vec and GloVe models are efficient methods for building high-quality word embeddings. In this thesis, we investigated and implemented both models to compute Entity Similarity. We assessed two different approaches. The first treats entities as ordinary words so that they are included directly in the final word embedding. The second uses an already built word embedding and projects the entities into it. A complete data enrichment pipeline was also designed to increase data quality and improve the final results. Currently, the state of the art in Named Entity Recognition uses Conditional Random Fields. We built a word-vector-based multilayer bidirectional Long Short-Term Memory Recurrent Neural Network using the deep learning framework TensorFlow. With only a little feature and architecture engineering, the model achieved near state-of-the-art results. With a few more optimizations explained in the thesis, Recurrent Neural Networks using word vectors could become the next state-of-the-art method.
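As a rough illustration of the second approach (projecting entities into an existing embedding), the sketch below places each entity at the mean of the vectors of words associated with it, then compares entities by cosine similarity. The abstract does not specify how the projection is done, so the averaging step is an assumption, and the vectors, entity names, and function names (`project_entity`, `cosine_similarity`) are hypothetical.

```python
import math

# Toy 3-dimensional word vectors. The values are made up for illustration;
# a real setup would load vectors trained with Word2vec or GloVe.
WORD_VECTORS = {
    "city": [1.0, 0.0, 0.0],
    "capital": [0.8, 0.2, 0.0],
    "river": [0.0, 1.0, 0.0],
    "company": [0.0, 0.0, 1.0],
}


def project_entity(context_words, vectors):
    """Project an entity into the embedding space.

    Assumption: the entity is represented by the mean of the vectors of the
    words that describe it. Words missing from the vocabulary are skipped.
    """
    known = [vectors[w] for w in context_words if w in vectors]
    if not known:
        raise ValueError("no context word is in the vocabulary")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical entities, each described by a few context words.
brussels = project_entity(["capital", "city"], WORD_VECTORS)
paris = project_entity(["city", "capital", "river"], WORD_VECTORS)
acme = project_entity(["company"], WORD_VECTORS)

sim_cities = cosine_similarity(brussels, paris)
sim_city_company = cosine_similarity(brussels, acme)
```

In this toy setup the two city entities end up closer to each other than to the company entity, which is the kind of ranking an Entity Similarity measure is meant to produce.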