TY - JOUR
T1 - Machine learning based success prediction for crowdsourcing software projects
AU - Illahi, Inam
AU - Liu, Hui
AU - Umer, Qasim
AU - Niu, Nan
N1 - Publisher Copyright:
© 2021 Elsevier Inc.
PY - 2021/8
Y1 - 2021/8
N2 - Competitive Crowdsourcing Software Development is an online software development paradigm that promises innovative, cost-effective, and high-quality solutions on time. However, the paradigm is still in its infancy and does not address key challenges such as the low rate of submissions and the high risk of project failure. A significant number of software projects fail to receive a satisfactory solution and end up wasting the time and effort of stakeholders. Therefore, predicting the success of a new software project may help stakeholders decide whether to crowdsource it, saving their time and effort. To this end, this study proposes a novel machine learning based approach to predict the success of a software project on crowdsourcing platforms, in terms of whether the given project will reach completion. First, the textual description and important attributes of software projects are extracted from TopCoder. Next, the description is preprocessed using natural language processing techniques. Then, keywords are identified using a modified keyword ranking algorithm, and each software project is awarded a ranking score. Every software project is modeled as a vector based on the extracted attributes, its identified keywords, and its ranking scores. Using these vectors with their associated solution status, a support vector machine classifier is trained to predict the success of a given software project. Different machine learning classifiers are applied, and the support vector machine yields the highest performance on the given dataset. Finally, the proposed approach is evaluated with historical data of real software projects. The results of hold-out validation suggest that the average precision, recall, and F-measure are up to 94.53%, 99.30%, and 96.85%, respectively.
AB - Competitive Crowdsourcing Software Development is an online software development paradigm that promises innovative, cost-effective, and high-quality solutions on time. However, the paradigm is still in its infancy and does not address key challenges such as the low rate of submissions and the high risk of project failure. A significant number of software projects fail to receive a satisfactory solution and end up wasting the time and effort of stakeholders. Therefore, predicting the success of a new software project may help stakeholders decide whether to crowdsource it, saving their time and effort. To this end, this study proposes a novel machine learning based approach to predict the success of a software project on crowdsourcing platforms, in terms of whether the given project will reach completion. First, the textual description and important attributes of software projects are extracted from TopCoder. Next, the description is preprocessed using natural language processing techniques. Then, keywords are identified using a modified keyword ranking algorithm, and each software project is awarded a ranking score. Every software project is modeled as a vector based on the extracted attributes, its identified keywords, and its ranking scores. Using these vectors with their associated solution status, a support vector machine classifier is trained to predict the success of a given software project. Different machine learning classifiers are applied, and the support vector machine yields the highest performance on the given dataset. Finally, the proposed approach is evaluated with historical data of real software projects. The results of hold-out validation suggest that the average precision, recall, and F-measure are up to 94.53%, 99.30%, and 96.85%, respectively.
KW - Classification
KW - Competitive crowdsourcing
KW - Machine learning
KW - Prediction
KW - Risk
UR - http://www.scopus.com/inward/record.url?scp=85104855804&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2021.110965
DO - 10.1016/j.jss.2021.110965
M3 - Article
AN - SCOPUS:85104855804
SN - 0164-1212
VL - 178
JO - Journal of Systems and Software
JF - Journal of Systems and Software
M1 - 110965
ER -