Deep learning-based software engineering: progress, challenges, and opportunities

Xiangping Chen; Xing Hu; Yuan Huang; He Jiang; Weixing Ji; Yanjie Jiang; Yanyan Jiang; Bo Liu; Hui Liu; Xiaochen Li; Xiaoli Lian; Guozhu Meng; Xin Peng; Hailong Sun; Lin Shi; Bo Wang; Chong Wang; Jiayi Wang; Tiantian Wang; Jifeng Xuan; Xin Xia; Yibiao Yang; Yixin Yang; Li Zhang; Yuming Zhou; Lu Zhang

doi:10.1007/s11432-023-4127-5

Xiangping Chen^*, Xing Hu^*, Yuan Huang, He Jiang^*, Weixing Ji, Yanjie Jiang^*, Yanyan Jiang^*, Bo Liu, Hui Liu, Xiaochen Li, Xiaoli Lian^*, Guozhu Meng^*, Xin Peng^*, Hailong Sun^*, Lin Shi^*, Bo Wang^*, Chong Wang, Jiayi Wang, Tiantian Wang^*, Jifeng Xuan^*, Xin Xia, Yibiao Yang^*, Yixin Yang, Li Zhang, Yuming Zhou^*, Lu Zhang^*

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Review article › peer-review

4 Citations (Scopus)

Abstract

Researchers have recently achieved significant advances in deep learning techniques, which in turn have substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many studies have also been presented in top conferences and journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on learning techniques, that is, what kind of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as challenges and opportunities in each subarea. To this end, in this study, we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. Such subareas spread out through the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, providing one survey covering as many subareas as possible in software engineering can help future research push forward the frontier of deep learning-based software engineering more systematically.
For each of the selected subareas, we highlight the major advances achieved by applying deep learning techniques with pointers to the available datasets in such a subarea. We also discuss the challenges and opportunities concerning each of the surveyed software engineering subareas.

Original language	English
Article number	111102
Journal	Science China Information Sciences
Volume	68
Issue number	1
DOIs	https://doi.org/10.1007/s11432-023-4127-5
Publication status	Published - Jan 2025

Keywords

deep learning
software artifact representation
software benchmark
software engineering
survey

Access to Document

10.1007/s11432-023-4127-5

Cite this

Chen, X., Hu, X., Huang, Y., Jiang, H., Ji, W., Jiang, Y., Jiang, Y., Liu, B., Liu, H., Li, X., Lian, X., Meng, G., Peng, X., Sun, H., Shi, L., Wang, B., Wang, C., Wang, J., Wang, T., ... Zhang, L. (2025). Deep learning-based software engineering: progress, challenges, and opportunities. Science China Information Sciences, 68(1), Article 111102. https://doi.org/10.1007/s11432-023-4127-5

@article{06aaaebc8b32470599140355f5dbb352,

title = "Deep learning-based software engineering: progress, challenges, and opportunities",

abstract = "Researchers have recently achieved significant advances in deep learning techniques, which in turn has substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many studies have also been presented in top conferences and journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on learning techniques, that is, what kind of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as challenges and opportunities in each subarea. To this end, in this study, we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. Such subareas spread out through the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, providing one survey covering as many subareas as possible in software engineering can help future research push forward the frontier of deep learning-based software engineering more systematically. 
For each of the selected subareas, we highlight the major advances achieved by applying deep learning techniques with pointers to the available datasets in such a subarea. We also discuss the challenges and opportunities concerning each of the surveyed software engineering subareas.",

keywords = "deep learning, software artifact representation, software benchmark, software engineering, survey",

author = "Xiangping Chen and Xing Hu and Yuan Huang and He Jiang and Weixing Ji and Yanjie Jiang and Yanyan Jiang and Bo Liu and Hui Liu and Xiaochen Li and Xiaoli Lian and Guozhu Meng and Xin Peng and Hailong Sun and Lin Shi and Bo Wang and Chong Wang and Jiayi Wang and Tiantian Wang and Jifeng Xuan and Xin Xia and Yibiao Yang and Yixin Yang and Li Zhang and Yuming Zhou and Lu Zhang",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2024.",

year = "2025",

month = jan,

doi = "10.1007/s11432-023-4127-5",

language = "English",

volume = "68",

journal = "Science China Information Sciences",

issn = "1674-733X",

publisher = "Science China Press",

number = "1",

}

Chen, X, Hu, X, Huang, Y, Jiang, H, Ji, W, Jiang, Y, Jiang, Y, Liu, B, Liu, H, Li, X, Lian, X, Meng, G, Peng, X, Sun, H, Shi, L, Wang, B, Wang, C, Wang, J, Wang, T, Xuan, J, Xia, X, Yang, Y, Yang, Y, Zhang, L, Zhou, Y & Zhang, L 2025, 'Deep learning-based software engineering: progress, challenges, and opportunities', Science China Information Sciences, vol. 68, no. 1, 111102. https://doi.org/10.1007/s11432-023-4127-5

TY - JOUR

T1 - Deep learning-based software engineering

T2 - progress, challenges, and opportunities

AU - Chen, Xiangping

AU - Hu, Xing

AU - Huang, Yuan

AU - Jiang, He

AU - Ji, Weixing

AU - Jiang, Yanjie

AU - Jiang, Yanyan

AU - Liu, Bo

AU - Liu, Hui

AU - Li, Xiaochen

AU - Lian, Xiaoli

AU - Meng, Guozhu

AU - Peng, Xin

AU - Sun, Hailong

AU - Shi, Lin

AU - Wang, Bo

AU - Wang, Chong

AU - Wang, Jiayi

AU - Wang, Tiantian

AU - Xuan, Jifeng

AU - Xia, Xin

AU - Yang, Yibiao

AU - Yang, Yixin

AU - Zhang, Li

AU - Zhou, Yuming

AU - Zhang, Lu

PY - 2025/1

Y1 - 2025/1

AB - Researchers have recently achieved significant advances in deep learning techniques, which in turn has substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many studies have also been presented in top conferences and journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on learning techniques, that is, what kind of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as challenges and opportunities in each subarea. To this end, in this study, we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. Such subareas spread out through the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, providing one survey covering as many subareas as possible in software engineering can help future research push forward the frontier of deep learning-based software engineering more systematically. 
For each of the selected subareas, we highlight the major advances achieved by applying deep learning techniques with pointers to the available datasets in such a subarea. We also discuss the challenges and opportunities concerning each of the surveyed software engineering subareas.

KW - deep learning

KW - software artifact representation

KW - software benchmark

KW - software engineering

KW - survey

UR - http://www.scopus.com/inward/record.url?scp=85213975936&partnerID=8YFLogxK

U2 - 10.1007/s11432-023-4127-5

DO - 10.1007/s11432-023-4127-5

M3 - Review article

AN - SCOPUS:85213975936

SN - 1674-733X

VL - 68

JO - Science China Information Sciences

JF - Science China Information Sciences

IS - 1

M1 - 111102

ER -
