Construction of investment signals based on corporate filings and ways to improve textual analysis
Files
Delaunoy_52541400_2018.pdf
Closed access - Adobe PDF - 6.09 MB
Details
- Abstract
- Confidential Thesis. This contribution has one main goal: to replicate and verify the ideas expressed in a confidential paper owned by the Quantitative Equity Department of Candriam Investor Group, a client of the research company that produced the study. The paper, "Text Mining Unstructured Corporate Filing Data", was written by Yin Luo, Managing Director and Vice Chairman of Wolfe Research, and his team. In this research, a detailed profile of companies’ annual reports is constructed via textual analysis. Chapter 1 introduces the unstructured data problem. Chapter 2 describes how the data is retrieved via web scraping and parsing. Chapter 3 then breaks the textual analysis, performed with Natural Language Processing algorithms, down into five different aspects that we call indicators. Chapter 4 presents our performance evaluation method and its results for financial strategies built on the indicators constructed in the previous chapter. Finally, Chapter 5 tries to improve on the previous chapters by extending the textual analysis to completely unstructured data with technologies such as Topic Modeling via Latent Dirichlet Allocation (LDA) and Optical Character Recognition (OCR). Although some changes were made compared to what Yin Luo proposed in his paper, we tried to stay as faithful as possible to the original ideas. Note, however, that the last chapter gathers all the extensions developed during this thesis and goes well beyond the attempt to replicate a systematic profiling of companies’ reports.