Files
Mouchet_00442101_2024.pdf
Open access - Adobe PDF
- 2.29 MB
Details
- Abstract
- Function clone detection is a crucial task in malware analysis because it avoids repeating the complete analysis of a known function. As the cybersecurity landscape evolves, the increasing complexity of malware outpaces traditional detection techniques. This thesis aims to improve the efficiency of the Symbolic Execution toolchain for Malware Analysis (SEMA) by adding an extended version of SimID as a preprocessing step. SimID is an obfuscation-resilient approach that detects known functions by observing their input-output pairs. By shifting from a pattern-matching methodology to one based on semantic signatures, we intend to improve the robustness of the toolchain in the presence of obfuscation. We show the added value of our extended version of SimID through diverse applications of its different modes. We demonstrate the detection of functions across different compilation and obfuscation contexts using a single known signature, which is an improvement on the current approach. We perform this demonstration on toy binaries, complex programs, and real-world malware. Furthermore, we explore the use of our tool in combination with manual analysis when resources are limited. Finally, we suggest future research directions to address the current limitations of our approach and to improve the effectiveness of both our version of SimID and the SEMA toolchain more generally.
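
The idea of recognising a function by its input-output pairs, rather than by its instruction pattern, can be illustrated with a minimal sketch. This is not the thesis's implementation: SimID operates on functions inside binaries, whereas here plain Python callables stand in for them. The names `PROBE_INPUTS`, `io_signature`, `identify`, and the example functions are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Optional

# Hypothetical fixed probe inputs; a real tool would choose these far more carefully.
PROBE_INPUTS = (0, 1, 2, 7, 255, 1024, 65535)
MASK32 = 0xFFFFFFFF  # assumption: model results as 32-bit register values


def io_signature(func: Callable[[int], int]) -> tuple:
    """Semantic signature: the observed (input, output) pairs of a function."""
    return tuple((x, func(x) & MASK32) for x in PROBE_INPUTS)


def identify(candidate: Callable[[int], int], known: dict) -> Optional[str]:
    """Return the name of the known function whose signature matches, if any."""
    observed = io_signature(candidate)
    for name, signature in known.items():
        if signature == observed:
            return name
    return None


def double_plus_one(x: int) -> int:
    # Reference function whose signature is recorded once.
    return 2 * x + 1


def obfuscated_variant(x: int) -> int:
    # Same semantics, different syntax: a + b == (a ^ b) + 2 * (a & b),
    # applied with a = x and b = x + 1.
    return (x ^ (x + 1)) + 2 * (x & (x + 1))


def unrelated(x: int) -> int:
    return 3 * x


# One known signature, built from the reference function only.
KNOWN = {"double_plus_one": io_signature(double_plus_one)}

if __name__ == "__main__":
    print(identify(obfuscated_variant, KNOWN))
    print(identify(unrelated, KNOWN))
```

A syntactic pattern would not match `obfuscated_variant` against `double_plus_one`, while the input-output comparison does, using a single stored signature. Agreement on a finite set of inputs does not prove equivalence, so a match of this kind is best treated as a candidate to confirm.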