Deep learning based code smell detection

Hui Liu*, Jiahao Jin, Zhifeng Xu, Yanzhen Zou, Yifan Bu, Lu Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

66 Citations (Scopus)

Abstract

Code smells are structures in source code that suggest the possibility of refactoring. Consequently, developers may identify refactoring opportunities by detecting code smells. However, manual identification of code smells is challenging and tedious. To address this, a number of approaches have been proposed to identify code smells automatically or semi-automatically. Most of these approaches rely on manually designed heuristics to map manually selected source code metrics into predictions. However, it is challenging to manually select the best features, and it is also difficult to manually construct the optimal heuristics. In this paper, we therefore propose a novel deep learning based approach to detecting code smells. The key insight is that deep neural networks and advanced deep learning techniques could automatically select features of source code for code smell detection, and could automatically build the complex mapping between such features and predictions. A major challenge for deep learning based smell detection is that deep learning often requires a large amount of labeled training data (to tune the large number of parameters within the employed deep neural network), whereas existing datasets for code smell detection are rather small. To overcome this, we propose an automatic approach to generating labeled training data for the neural network based classifier, which does not require any human intervention. As an initial attempt, we apply the proposed approach to four common and well-known code smells, i.e., feature envy, long method, large class, and misplaced class. Evaluation results on open-source applications suggest that the proposed approach significantly improves on the state of the art.

Original language: English
Pages (from-to): 1811-1837
Number of pages: 27
Journal: IEEE Transactions on Software Engineering
Volume: 47
Issue number: 9
Publication status: Published - 1 Sept 2021

Keywords

  • Software refactoring
  • code smells
  • deep learning
  • identification
  • quality
