Identifying Textual Features of High-Quality Questions: An Empirical Study on Stack Overflow

Qing Mi, Yujin Gao*, Jacky Keung, Yan Xiao, Solomon Mensah

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Background: Stack Overflow (SO) is a programming-specific Q&A website that serves as a valuable repository of software engineering knowledge. For SO members, formulating a good question is the first step towards eliciting satisfactory responses. Aims: To guide SO members on how to formulate a good question, we conduct an empirical study using the publicly available Stack Overflow Data Dump for the period 2008-2016. Method: We first choose 25 features along 5 dimensions to represent the textual characteristics that we are interested in. Using the Boruta algorithm, we then capture all features that are either strongly or weakly relevant to question quality. Results: The results show that the number of tags and the number of code snippets are the most discriminative features, whereas there is only a weak correlation between question quality and the sentiment-related factors. Based on the empirical evidence, we provide useful and usable suggestions to SO members on how to optimize their questions. Conclusions: We consider that our findings will provide SO members with a better understanding of the patterns behind high-quality questions, with the ultimate goal of supporting effective and efficient utilization of Q&A websites.
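
As a rough illustration of the two features the abstract singles out, the following Python sketch counts tags and code snippets for a single question. It is not the authors' extraction code; the abstract does not give their feature definitions. The sketch assumes the tag string uses the `<tag1><tag2>` form found in data dumps of that period and that a code snippet is a `<pre><code>` block in the HTML body; the function names and the sample question are hypothetical.

```python
import re


def count_tags(tags_field: str) -> int:
    """Count tags in a dump-style tag string such as '<python><regex>' (assumed format)."""
    return len(re.findall(r"<([^<>]+)>", tags_field))


def count_code_snippets(body_html: str) -> int:
    """Count <pre><code> blocks in a question body; inline <code> spans are ignored."""
    return len(re.findall(r"<pre[^>]*>\s*<code[^>]*>", body_html, flags=re.IGNORECASE))


# Hypothetical question used only to exercise the two helpers.
example_tags = "<python><regex><string>"
example_body = (
    "<p>How do I split a string?</p>"
    "<pre><code>s.split(',')</code></pre>"
    "<p>I also tried <code>re.split</code>:</p>"
    "<pre><code>re.split(',', s)</code></pre>"
)

features = {
    "num_tags": count_tags(example_tags),
    "num_code_snippets": count_code_snippets(example_body),
}
print(features)
```

Counts like these would be two columns of a per-question feature table; the relevance analysis described in the abstract would then be run over all such columns.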

Original language: English
Title of host publication: Proceedings - 24th Asia-Pacific Software Engineering Conference, APSEC 2017
Editors: Jian Lv, He Zhang, Mike Hinchey, Xiao Liu
Publisher: IEEE Computer Society
Pages: 636-641
Number of pages: 6
ISBN (Electronic): 9781538636817
DOIs
Publication status: Published - 2 Jul 2017
Event: 24th Asia-Pacific Software Engineering Conference, APSEC 2017 - Nanjing, Jiangsu, China
Duration: 4 Dec 2017 - 8 Dec 2017

Publication series

Name: Proceedings - Asia-Pacific Software Engineering Conference, APSEC
Volume: 2017-December
ISSN (Print): 1530-1362

Conference

Conference: 24th Asia-Pacific Software Engineering Conference, APSEC 2017
Country/Territory: China
City: Nanjing, Jiangsu
Period: 4/12/17 - 8/12/17

Keywords

  • Boruta algorithm
  • Q&A website
  • Stack Overflow
  • empirical software engineering
  • textual feature
