TY - GEN
T1 - Deep feature learning to quantitative prediction of software defects
AU - Qiao, Lei
AU - Li, Guangjie
AU - Yu, Daohua
AU - Liu, Hui
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/7
Y1 - 2021/7
N2 - Defect prediction forecasts defect proneness or the number of defects contained in software systems. It is frequently employed to efficiently prioritize and allocate the limited testing resources to the modules that are more likely to be defective during the process of software development and maintenance. Consequently, a number of defect prediction approaches have been proposed. Most of the existing approaches on defect prediction regard defect prediction as a classification problem in which programs are classified as buggy or non-buggy. However, identifying the defect proneness of a given software module is not sufficient in practical software testing. The research on predicting the number of defects is limited and the performances of these approaches are constantly being optimized and improved. Therefore, in this paper, we propose a novel approach that leverages a convolutional neural network to predict the number of defects in software systems automatically. First, we preprocess the PROMISE dataset, which involves performing natural logarithm transformation and data normalization. Second, we feed the preprocessed dataset to a specially designed convolutional neural network-based model to predict the number of defects. Third, we rank the software modules according to the corresponding predicted number of defects in descending order. We also evaluate the proposed approach on a well-known dataset by cross-validation. The evaluation results suggest that the proposed approach is both accurate and robust, and it improves the state of the art. On average, it significantly improves the Kendall correlation coefficient by 16% and the fault-percentile-average by 4%.
KW - Convolutional neural network
KW - Deep feature learning
KW - Number of defects
KW - Regression model
KW - Software defect prediction
KW - Software metrics
UR - http://www.scopus.com/inward/record.url?scp=85115870131&partnerID=8YFLogxK
U2 - 10.1109/COMPSAC51774.2021.00204
DO - 10.1109/COMPSAC51774.2021.00204
M3 - Conference contribution
AN - SCOPUS:85115870131
T3 - Proceedings - 2021 IEEE 45th Annual Computers, Software, and Applications Conference, COMPSAC 2021
SP - 1401
EP - 1402
BT - Proceedings - 2021 IEEE 45th Annual Computers, Software, and Applications Conference, COMPSAC 2021
A2 - Chan, W. K.
A2 - Claycomb, Bill
A2 - Takakura, Hiroki
A2 - Yang, Ji-Jiang
A2 - Teranishi, Yuuichi
A2 - Towey, Dave
A2 - Segura, Sergio
A2 - Shahriar, Hossain
A2 - Reisman, Sorel
A2 - Ahamed, Sheikh Iqbal
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 45th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2021
Y2 - 12 July 2021 through 16 July 2021
ER -