Files
Losicki_51311600_2019.pdf
Details
- Abstract
- The increased availability of large, widely used, open-source software systems has sparked the interest of researchers from different communities. One possible use of this data is to build new development tools. These tools can help programmers analyse a previously unseen codebase, get familiar with a new language, or write idiomatic code. However, how to infer knowledge from source code is still an active research topic. Here, we focus on pattern mining techniques, and on the modifications they require, to extract this knowledge in the form of code idioms. In particular, we explore whether closed tree patterns and tree patterns that compress the data can successfully be used when mining for idioms in source code. This thesis was written in the context of an ongoing research project whose goal is to develop a software tool to assist in legacy software modernisation. We also describe our contributions to that project's development of the mining process.