TY - GEN
T1 - Analysis of Machine Learning Models for Stroke Prediction with Emphasis on Hyperparameter Tuning Techniques
AU - Hasan, Sakib
AU - Islam, Alamgir
AU - Islam, Tanjin
AU - Ma, Hongbin
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - Stroke remains a significant global cause of death and disability, necessitating early and accurate prediction models for prompt intervention. This study compares the performance of Support Vector Machine (SVM) and Random Forest (RF) models to improve stroke prediction. Emphasizing the critical role of hyperparameter tuning in improving model efficiency, it investigates two tuning methods: Grid Search Cross-Validation (GS-CV) and Randomized Search Cross-Validation (RS-CV). Data preprocessing is applied to a dataset from the Medical Clinic of Bangladesh comprising 5,110 patient records. Class imbalance is addressed with the Synthetic Minority Over-sampling Technique (SMOTE). Both models predict well, but SVM tuned with RS-CV is the more accurate, achieving 96% accuracy compared with 92% for RF tuned with GS-CV. These outcomes highlight the importance of choosing appropriate hyperparameter tuning techniques and ML models for stroke prediction. They also suggest potential for use in healthcare settings for early identification and preventive measures. This comparative study contributes to the ongoing discussion of machine learning in medical prediction, focusing on the methodological aspects critical to constructing reliable and effective predictive systems.
KW - Data Preprocessing
KW - Grid Search Cross-Validation
KW - Healthcare Technology
KW - Hyperparameter Tuning
KW - Machine Learning
KW - Random Forest
KW - Randomized Search Cross-Validation
KW - Stroke Prediction
KW - Support Vector Machine
KW - Synthetic Minority Over-sampling Technique (SMOTE)
UR - http://www.scopus.com/inward/record.url?scp=105003902835&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-4756-9_1
DO - 10.1007/978-981-96-4756-9_1
M3 - Conference contribution
AN - SCOPUS:105003902835
SN - 9789819647552
T3 - Communications in Computer and Information Science
SP - 1
EP - 9
BT - Computational Intelligence and Industrial Applications - 11th International Symposium, ISCIIA 2024, Proceedings
A2 - Xin, Bin
A2 - Ma, Hongbin
A2 - She, Jinhua
A2 - Cao, Weihua
PB - Springer Science and Business Media Deutschland GmbH
T2 - 11th International Symposium on Computational Intelligence and Industrial Applications, ISCIIA 2024
Y2 - 1 November 2024 through 5 November 2024
ER -