Word Embeddings using Canonical Correlation Analysis

(2022)

Files

Ruan_31891400_2022.pdf
  • Open access
  • Adobe PDF
  • 1.26 MB

Details

Supervisors
Faculty
Degree label
Abstract
Human communication as a whole is a complex system composed of sound and vision patterns, specifically constructed to convey meaning between people. This stands in stark contrast to the language spoken by machines. The two systems can be bridged with the aid of word embedding algorithms, which aim to capture the semantic and syntactic meaning of words through real-valued vectors. This thesis is grounded in a rather recent field and aims to shed some light on the mathematical background of a specific statistical approach to word embeddings, canonical correlation analysis, while comparing it to modern neural network models. In particular, we test these models on real-world data consisting of articles related to the Russo-Ukrainian war. The outcomes of these comparisons are examined, as are the limitations of the study and future directions.