TY - GEN
T1 - Deep Learning Based Feature Envy Detection Boosted by Real-World Examples
AU - Liu, Bo
AU - Liu, Hui
AU - Li, Guangjie
AU - Niu, Nan
AU - Xu, Zimao
AU - Wang, Yifan
AU - Xia, Yunni
AU - Zhang, Yuxia
AU - Jiang, Yanjie
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/11/30
Y1 - 2023/11/30
N2 - Feature envy is one of the well-recognized code smells that should be removed by software refactoring. A major challenge in feature envy detection is that traditional approaches are less accurate whereas deep learning-based approaches are suffering from the lack of high-quality large-scale training data. Although existing refactoring detection tools could be employed to discover real-world feature envy examples, the noise (i.e., false positives) within the resulting data could significantly influence the quality of the training data as well as the performance of the models trained on the data. To this end, in this paper, we propose a sequence of heuristic rules and a decision tree-based classifier to filter out false positives reported by state-of-the-art refactoring detection tools. The data after filtering serve as the positive items in the requested training data. From the same subject projects, we randomly select methods that are different from positive items as negative items. With the real-world examples (both positive and negative examples), we design and train a deep learning-based binary model to predict whether a given method should be moved to a potential target class. Different from existing models, it leverages additional features, i.e., coupling between methods and classes (CBMC) and the message passing coupling between methods and classes (MCMC) that have not yet been exploited by existing approaches. Our evaluation results on real-world open-source projects suggest that the proposed approach substantially outperforms the state of the art in feature envy detection, improving precision and recall by 38.5% and 20.8%, respectively.
AB - Feature envy is one of the well-recognized code smells that should be removed by software refactoring. A major challenge in feature envy detection is that traditional approaches are less accurate whereas deep learning-based approaches are suffering from the lack of high-quality large-scale training data. Although existing refactoring detection tools could be employed to discover real-world feature envy examples, the noise (i.e., false positives) within the resulting data could significantly influence the quality of the training data as well as the performance of the models trained on the data. To this end, in this paper, we propose a sequence of heuristic rules and a decision tree-based classifier to filter out false positives reported by state-of-the-art refactoring detection tools. The data after filtering serve as the positive items in the requested training data. From the same subject projects, we randomly select methods that are different from positive items as negative items. With the real-world examples (both positive and negative examples), we design and train a deep learning-based binary model to predict whether a given method should be moved to a potential target class. Different from existing models, it leverages additional features, i.e., coupling between methods and classes (CBMC) and the message passing coupling between methods and classes (MCMC) that have not yet been exploited by existing approaches. Our evaluation results on real-world open-source projects suggest that the proposed approach substantially outperforms the state of the art in feature envy detection, improving precision and recall by 38.5% and 20.8%, respectively.
KW - Code Smells
KW - Feature Envy
KW - Software Refactoring
UR - http://www.scopus.com/inward/record.url?scp=85180552552&partnerID=8YFLogxK
U2 - 10.1145/3611643.3616353
DO - 10.1145/3611643.3616353
M3 - Conference contribution
AN - SCOPUS:85180552552
T3 - ESEC/FSE 2023 - Proceedings of the 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
SP - 908
EP - 920
BT - ESEC/FSE 2023 - Proceedings of the 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
A2 - Chandra, Satish
A2 - Blincoe, Kelly
A2 - Tonella, Paolo
PB - Association for Computing Machinery, Inc
T2 - 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2023
Y2 - 3 December 2023 through 9 December 2023
ER -