
Joint optimization of mega-image compression and detection by deep learning

(2023)

Files

Leconte_56021700_2023.pdf
  • Open access
  • Adobe PDF
  • 7.84 MB

Details

Abstract
Context: Object detection is a critical task in computer vision that involves identifying and locating objects of interest within an image or video. It has become increasingly important in recent years because of its wide range of applications in fields such as autonomous driving, surveillance, robotics, and healthcare. Several machine learning methods exist for object detection, not all of them based on neural networks, but the most recent and efficient ones use deep learning. These methods require vast amounts of data to produce the most effective models. Exchanging and storing such large amounts of data also poses significant challenges: it is time-consuming and costly, and the data flow is increasing exponentially. Hence, it is often advisable to compress the data, usually with loss. The challenge then becomes a trade-off between the information lost to compression and the model's detection performance. Finding the best balance requires efficient compression methods that minimize image size while retaining as much information as possible. It is equally essential to develop models that are as robust as possible to this compression. This latter point is the subject of this work.

Material and methods: The experiments are conducted on the DOTA dataset, with YOLOv5 as the deep learning detection algorithm. The images are compressed with the JPEG 2000 coding system. The first part of the experiments concerns optimizing the training of the network by pre-processing the dataset, training on compressed images, and fine-tuning. The second part focuses on applying and analyzing ensemble methods, namely test-time augmentation (TTA) and model ensembling.

Results: Optimizing the training for object detection on compressed images improved performance by 50%. Applying TTA gave a notable performance boost of up to 4%, specifically on small objects and lightly compressed images. With careful model selection, ensembling gave a slight improvement of around 1%. Combining the two methods produced the best performance measured on highly compressed images, with an improvement of around 2%.

Conclusion: Training optimizations and ensemble methods can be combined to enhance the detection of objects by deep CNNs on compressed images. Moreover, recent publications and ongoing research hold promise for further advancements in this area.
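
As a rough illustration of how test-time augmentation (TTA) and model ensembling pool detections, the sketch below merges boxes predicted on an image and on its horizontally flipped copy, then removes duplicates with greedy non-maximum suppression. It is a minimal toy under stated assumptions, not the thesis's implementation: it does not call YOLOv5, and the function names (`iou`, `unflip_box`, `nms`, `merge_tta`), the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are all hypothetical choices made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unflip_box(box, image_width):
    """Map a box predicted on a horizontally flipped image back to the original frame."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def nms(detections, iou_threshold=0.5):
    """Greedy non-maximum suppression over (box, score) pairs, highest score first."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap strongly with a better-scored one.
        if all(iou(box, k[0]) < iou_threshold for k in kept):
            kept.append((box, score))
    return kept


def merge_tta(original_dets, flipped_dets, image_width, iou_threshold=0.5):
    """Pool detections from the original and the flipped pass, then deduplicate."""
    restored = [(unflip_box(b, image_width), s) for b, s in flipped_dets]
    return nms(original_dets + restored, iou_threshold)


# Hypothetical detections on a 100-pixel-wide image.
original = [((10, 10, 30, 30), 0.6)]
flipped = [((70, 10, 90, 30), 0.8), ((5, 50, 15, 60), 0.7)]
merged = merge_tta(original, flipped, image_width=100)
# The flipped pass confirms the first object with a higher score and
# contributes a second object that the original pass missed.
print(merged)
```

The same pooling step applies to model ensembling if the second list of detections comes from a different model rather than from a flipped copy of the image, in which case no coordinate mapping is needed.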