TY - JOUR
T1 - SDP-MTF
T2 - A Composite Transfer Learning and Feature Fusion for Cross-Project Software Defect Prediction
AU - Lei, Tianwei
AU - Xue, Jingfeng
AU - Man, Duo
AU - Wang, Yong
AU - Li, Minghui
AU - Kong, Zixiao
N1 - Publisher Copyright: © 2024 by the authors.
PY - 2024/7
Y1 - 2024/7
AB - Software defect prediction is critical for improving software quality and reducing maintenance costs. In recent years, cross-project software defect prediction has garnered significant attention from researchers. This approach leverages transfer learning to apply knowledge from existing projects to new ones, thereby making predictive models more widely applicable. It provides an effective solution for projects with limited historical defect data. Nevertheless, current methodologies face two main challenges: first, inadequate mining of feature information, where code statistical information or semantic information is used in isolation, ignoring the benefits of their integration; second, the substantial feature disparity between different projects, which can make transfer learning ineffective, so that additional effort is needed to narrow this gap and improve precision. To address these challenges, this paper proposes a novel methodology, SDP-MTF (Software Defect Prediction using Multi-stage Transfer learning and Feature fusion), that combines code statistical features, deep semantic features, and multiple feature transfer learning methods to improve predictive performance. The SDP-MTF method was empirically tested on single-source cross-project software defect prediction across six projects from the PROMISE dataset and benchmarked against five baseline algorithms that employ distinct features and transfer methodologies. Our findings indicate that SDP-MTF significantly outperforms the five classical baseline algorithms, improving the F1-Score by 8% to 15.2%, thereby substantively advancing the precision of cross-project software defect prediction.
KW - code statistical features
KW - cross-project software defect prediction
KW - feature fusion
KW - semantic features
UR - http://www.scopus.com/inward/record.url?scp=85198406876&partnerID=8YFLogxK
U2 - 10.3390/electronics13132439
DO - 10.3390/electronics13132439
M3 - Article
AN - SCOPUS:85198406876
SN - 2079-9292
VL - 13
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 13
M1 - 2439
ER -