Files
- Ronval_29931800_2023.pdf (Adobe PDF, 5.32 MB, open access)
Abstract
In this master's thesis, we design and test a new approach, called the robust training loop, for performing adversarial training of binarized neural networks. Although deep learning models achieve good scores on various tasks, they are easily fooled by small modifications of their input data. Preventing such unwanted behavior can be a complex task because of the difficulty of interpreting how a model actually works. State-of-the-art solutions for real-valued neural networks generally use gradient information to craft the modification; they then use the disrupted samples in a specific training algorithm, together with a modified loss function. However, each kind of modification is tied to a specific training algorithm and uses the model's gradient directly, which is not available for all types of deep learning models. In this thesis, we focus on binarized neural networks because they can be exactly verified, meaning we can craft the modification by understanding how they perform their task. With these modified samples, we propose an extension of the classical training algorithm, named the robust training loop, to improve the resistance of such models to modifications of their inputs. We believe this algorithm can easily be extended to sample-modification methods other than the one used in this work. Through extensive experiments, we show that our approach indeed improves robustness according to the considered criteria.
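
For context, the sketch below illustrates the classical gradient-based adversarial training loop that the abstract contrasts with its own approach. It is not the thesis's method: the perturbation here uses FGSM (a standard gradient-based attack for real-valued networks), whereas the robust training loop described above derives perturbations from exact verification of binarized networks. All function and parameter names are illustrative assumptions.

```python
# Minimal sketch of classical gradient-based adversarial training
# (FGSM-style), the baseline pattern the thesis extends. Assumptions:
# a real-valued PyTorch model, inputs scaled to [0, 1].
import torch
import torch.nn as nn


def fgsm_perturb(model, loss_fn, x, y, eps):
    """Craft an FGSM adversarial example using the model's gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the
    # valid input range so the perturbed sample stays a legal input.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()


def adversarial_training_epoch(model, loader, loss_fn, optimizer, eps=0.1):
    """One epoch of training on adversarially perturbed samples."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_perturb(model, loss_fn, x, y, eps)
        optimizer.zero_grad()
        # Train on the perturbed samples so the model learns to resist
        # this kind of input modification.
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        optimizer.step()
```

As the abstract notes, this gradient-based pattern ties each perturbation method to a specific training procedure and presupposes a usable gradient, which is exactly the limitation that motivates a verification-based alternative for binarized networks.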