Predicting the Occurrence of Cerebrovascular Accident in Patients using Machine Learning Technique

PDF (1265KB), PP.36-48

Author(s)

Edward N. Udo 1,2,* Anietie P. Ekong 3 Favour A. Akumute 1

1. Department of Computer Science, University of Uyo, Uyo, Nigeria

2. TETFund Centre of Excellence in Computational Intelligence Research, University of Uyo, Uyo, Nigeria

3. Department of Computer Science, Akwa Ibom State University, Ikot Akpaden, Nigeria

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2025.02.04

Received: 22 Sep. 2024 / Revised: 27 Nov. 2024 / Accepted: 17 Dec. 2024 / Published: 8 Apr. 2025

Index Terms

Prediction Model, Cerebrovascular Accident, Stroke, Machine Learning Algorithms, Classifiers, Oversampling Technique

Abstract

Cerebrovascular disease, commonly known as stroke, is the third leading cause of disability and mortality in the world. In recent years, technological advancements have transformed how information is acquired and how problems are solved in diverse fields of human endeavor, including the medical and healthcare sectors. Machine Learning (ML) and data-driven techniques have gained prominence in problem solving and have been deployed to predict the occurrence of stroke. This work explores the application of supervised machine learning algorithms to stroke prediction, emphasizing the critical need for early prediction to enhance preventive measures. A comprehensive comparison of classification (Support Vector Machine and Random Forest) and regression (Logistic Regression) algorithms was conducted on a binary stroke outcome (stroke or no stroke), using a dataset from the International Stroke Trial database. The Synthetic Minority Oversampling Technique (SMOTE) and K-fold cross-validation were used to address the class imbalance in the dataset. The subsequent model comparison revealed distinct strengths and weaknesses among the three models. Random Forest (RF) achieved the highest accuracy at 89%, while Support Vector Machine (SVM) and Logistic Regression (LR) each achieved 86%. LR demonstrated the most balanced predictive performance, with high precision for stroke cases and reasonable recall for both classes.
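The oversampling and cross-validation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `k_fold_indices`, the neighbour count, and the toy rows are all assumptions made for illustration, and only the Python standard library is used. The sketch shows the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, plus a plain K-fold index split.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority rows (hypothetical helper).

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # fraction of the way from base towards other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy, made-up data: 12 "no stroke" rows and 4 "stroke" rows with two features.
majority = [(float(i), float(i % 3)) for i in range(12)]
minority = [(20.0, 1.0), (21.0, 2.0), (22.0, 1.5), (23.0, 0.5)]

# Generate enough synthetic "stroke" rows to match the majority class size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_rows

# Split the 16 original rows into 4 folds; each fold serves once as the test set.
folds = k_fold_indices(len(majority) + len(minority), k=4)
```

In common practice, oversampling is applied only to the training folds so that synthetic rows do not leak into the fold used for evaluation. Whether the paper follows that ordering is not stated in the abstract.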

Cite This Paper

Edward N. Udo, Anietie P. Ekong, Favour A. Akumute, "Predicting the Occurrence of Cerebrovascular Accident in Patients using Machine Learning Technique", International Journal of Information Technology and Computer Science (IJITCS), Vol.17, No.2, pp.36-48, 2025. DOI:10.5815/ijitcs.2025.02.04
