TY - JOUR
T1 - A survey on mining stack overflow
T2 - question and answering (Q&A) community
AU - Ahmad, Arshad
AU - Feng, Chong
AU - Ge, Shi
AU - Yousif, Abdallah
N1 - Publisher Copyright: © 2018, Emerald Publishing Limited.
PY - 2018/3/22
Y1 - 2018/3/22
N2 - Purpose: Software developers extensively use Stack Overflow (SO) for knowledge sharing on software development. Thus, software engineering researchers have started mining the structured/unstructured data present in certain software repositories, including the Q&A software developer community SO, with the aim of improving software development. The purpose of this paper is to show how academics/practitioners can benefit from the valuable user-generated content shared on various online social networks, specifically from the Q&A community SO, for software development. Design/methodology/approach: A comprehensive literature review was conducted, and 166 research papers on SO relating to software development, covering the period from the inception of SO until June 2016, were categorized. Findings: Most of the studies revolve around a limited number of software development tasks; approximately 70 percent of the papers used data from millions of posts, applied basic machine learning methods, and conducted semi-automatic investigations and quantitative studies. Thus, future research should focus on overcoming the identified challenges and gaps. Practical implications: The work on SO is classified into two main categories: “SO design and usage” and “SO content applications.” These categories not only give Q&A forum providers insights into the shortcomings in the design and usage of such forums but also provide ways to overcome them in the future. They also enable software developers to exploit such forums for the identified under-utilized tasks of software development. Originality/value: The study is the first of its kind to explore the work on SO about software development and makes an original contribution by presenting a comprehensive review, design/usage shortcomings of Q&A sites, and future research challenges.
AB - Purpose: Software developers extensively use Stack Overflow (SO) for knowledge sharing on software development. Thus, software engineering researchers have started mining the structured/unstructured data present in certain software repositories, including the Q&A software developer community SO, with the aim of improving software development. The purpose of this paper is to show how academics/practitioners can benefit from the valuable user-generated content shared on various online social networks, specifically from the Q&A community SO, for software development. Design/methodology/approach: A comprehensive literature review was conducted, and 166 research papers on SO relating to software development, covering the period from the inception of SO until June 2016, were categorized. Findings: Most of the studies revolve around a limited number of software development tasks; approximately 70 percent of the papers used data from millions of posts, applied basic machine learning methods, and conducted semi-automatic investigations and quantitative studies. Thus, future research should focus on overcoming the identified challenges and gaps. Practical implications: The work on SO is classified into two main categories: “SO design and usage” and “SO content applications.” These categories not only give Q&A forum providers insights into the shortcomings in the design and usage of such forums but also provide ways to overcome them in the future. They also enable software developers to exploit such forums for the identified under-utilized tasks of software development. Originality/value: The study is the first of its kind to explore the work on SO about software development and makes an original contribution by presenting a comprehensive review, design/usage shortcomings of Q&A sites, and future research challenges.
KW - Information retrieval
KW - Mining
KW - Software development
KW - Software repositories
KW - Stack overflow
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85051274733&partnerID=8YFLogxK
U2 - 10.1108/DTA-07-2017-0054
DO - 10.1108/DTA-07-2017-0054
M3 - Review article
AN - SCOPUS:85051274733
SN - 0033-0337
VL - 52
SP - 190
EP - 247
JO - Data Technologies and Applications
JF - Data Technologies and Applications
IS - 2
ER -