A Method for Detecting Code Smells with Stacking Ensemble Learning

Document Type: Original Article

Authors

Comprehensive Imam Hossein University, Computer Engineering Faculty, Tehran, Iran

Abstract

As information technology spreads into every area of human life, producing high-quality software has become more important than ever. Various factors reduce the quality of software; one of them is the presence of messy code, also known as code smells. Code smells are structural defects in software programs that often arise from incorrect application of software engineering processes or from insufficient developer experience. To resolve them, they must first be identified and then removed by refactoring the program, so accurate detection methods are of particular importance. Machine learning techniques and algorithms are among the most widely used approaches for identifying code smells. This article therefore presents a method for improving the accuracy of code smell detection that combines ensemble feature selection with stacking ensemble learning. The method is evaluated on six code smells: Feature Envy, Long Method, Data Class, Large Class, Long Parameter List, and Switch Statements. Results from various experiments show a maximum accuracy of 99% for some code smells.
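The combination named in the abstract, ensemble feature selection followed by stacking, can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the synthetic dataset stands in for code metrics labeled smelly or clean, the three feature rankers, the two-of-three voting rule, the base learners, and the logistic-regression meta-learner are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def ensemble_select(X, y, k, seed=0):
    """Return indices of features ranked in the top k by at least 2 of 3 rankers.

    The rankers and the voting threshold are illustrative assumptions.
    """
    forest = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    scores = [
        np.nan_to_num(f_classif(X, y)[0]),              # ANOVA F-statistic
        mutual_info_classif(X, y, random_state=seed),   # mutual information
        forest.feature_importances_,                    # impurity-based importance
    ]
    votes = np.zeros(X.shape[1], dtype=int)
    for s in scores:
        votes[np.argsort(s)[-k:]] += 1                  # one vote per top-k feature
    return np.flatnonzero(votes >= 2)


# Synthetic stand-in for a code-metrics dataset (assumption, not the article's data).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Select features on the training split only, to avoid leaking test information.
selected = ensemble_select(X_train, y_train, k=10)

# Stacking: base learners feed cross-validated predictions to a meta-learner.
stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train[:, selected], y_train)
accuracy = stack.score(X_test[:, selected], y_test)
print(f"selected {len(selected)} features, test accuracy = {accuracy:.3f}")
```

In a setup like this, one such model would presumably be trained per code smell, with each smell treated as its own binary classification problem. The accuracy printed for synthetic data says nothing about the figures reported in the article.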

Keywords