Files
VanVracem_17351500_2021.pdf
Open access - Adobe PDF
- 20 MB
Details
- Supervisors
- Faculty
- Degree label
- Abstract
- Data visualization is an important tool in data science for quickly capturing the main patterns and trends in large datasets. One such technique, the heatmap, displays a matrix or table in its entirety. This approach assumes that the rows and columns can be reordered to highlight patterns of interest in the analysis. Unfortunately, existing techniques for finding good permutations often fail to produce a meaningful order when the data are very noisy. To address this issue, this thesis introduces a new framework that boosts these methods by integrating them into an iterative process using convolution. The key idea is to temporarily transform the noisy matrix into a simpler model that is easier to reorder. The approach is intended to be as generic as possible: it handles both numerical and binary data while offering satisfactory execution times. We show that it improves the output quality of all tested basic methods, both on synthetic datasets where the pattern is known and on real-world data.
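
The abstract does not state which kernel, which basic reordering method, or which stopping rule the thesis uses, so the following is only a rough sketch of the general idea (smooth the matrix by convolution, reorder the smoothed version, repeat). Everything in it is an assumption made for illustration: the uniform mean filter, the barycenter heuristic standing in for a "basic method", the fixed iteration count, and all function names.

```python
def box_blur(matrix, radius=1):
    """Mean filter (convolution with a uniform kernel), clipped at the borders.
    Assumption: the thesis may use a different kernel."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            total, count = 0.0, 0
            for a in range(max(0, i - radius), min(n_rows, i + radius + 1)):
                for b in range(max(0, j - radius), min(n_cols, j + radius + 1)):
                    total += matrix[a][b]
                    count += 1
            out[i][j] = total / count
    return out


def barycenter_order(matrix):
    """Hypothetical stand-in for a basic reordering method:
    sort rows by the weighted mean of their column indices."""
    def key(i):
        row = matrix[i]
        weight = sum(row)
        return sum(j * v for j, v in enumerate(row)) / weight if weight else 0.0
    return sorted(range(len(matrix)), key=key)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def permute(matrix, row_order, col_order):
    """Return the matrix with rows and columns taken in the given orders."""
    return [[matrix[i][j] for j in col_order] for i in row_order]


def iterative_reorder(matrix, n_iter=5, radius=1):
    """Repeat: apply the current order, smooth, reorder the smoothed matrix.
    The original (noisy) values are never modified; only the orders change."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(n_iter):
        smooth = box_blur(permute(matrix, rows, cols), radius)
        r = barycenter_order(smooth)             # new row order
        c = barycenter_order(transpose(smooth))  # new column order
        rows = [rows[i] for i in r]
        cols = [cols[j] for j in c]
    return rows, cols
```

Calling `iterative_reorder(m)` on a numerical or binary matrix `m` (a list of equal-length lists) returns a row order and a column order, and `permute(m, rows, cols)` yields the reordered matrix for display as a heatmap. The smoothing only guides the choice of permutation, so the displayed values remain those of the original data.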