Heterogeneous Data in Machine Learning and Multimodal Architectures : use-case applied to tumor segmentation from multiple medical images for radiotherapy
Files
Boulanger_58491500_2022.pdf
Open access - Adobe PDF
- 29.49 MB
Details
- Abstract
- Artificial intelligence is a field in which much effort is invested to allow machines to understand the world around us. Machine learning models learn from data that describe facts and observations, and recent advances in the field keep improving their ability to do so. The vast majority of these models are designed to work with unimodal data, i.e. data coming from a single modality, a specific way of acquiring information. But the world is far from unimodal: we experience it in a multimodal way through sight, hearing, touch, smell, and taste. Multimodal machine learning is a branch that aims to exploit multiple sources of information as we do. The first aim of this thesis is to provide a general point of view on the use of multimodal data in machine learning by introducing five main challenges in this domain, namely fusion, representation, translation, alignment, and co-learning. Concrete examples show that many fields can take advantage of this approach for all kinds of complex applications. The second aim of this thesis is to put the theory into practice by implementing a multimodal approach for a medical segmentation task: segmentation of the gross tumor volume in the head and neck region for radiation therapy treatment, using two types of imaging, magnetic resonance imaging (MRI) and computed tomography (CT). The novel architecture presented is the X-Net, an adaptation of the well-known U-Net. The X-Net showed promising results compared to its unimodal counterpart and served as a proof of concept for its unique design. It also motivated a more advanced architecture, the nnX-Net, and further experiments to analyze the behavior of this network in new scenarios. Finally, several modifications are recommended to maximize the potential of the X-Net.