NIPP: Non-Invasive PCOS Prediction using XG-boost Machine Learning Model

PDF (481KB), PP.82-95


Author(s)

Shikha Prasher 1,*, Leema Nelson 1,*, Manal Gafar 2,3

1. Chitkara University Institute of Engineering & Technology, Chitkara University, Punjab, India

2. Cyber Security and Networks Program, University of East London, European Universities in Egypt (EUE), The New Administrative Capital, Egypt

3. Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2025.01.06

Received: 27 Aug. 2024 / Revised: 13 Oct. 2024 / Accepted: 2 Dec. 2024 / Published: 8 Feb. 2025

Index Terms

Machine Learning, Polycystic Ovary Syndrome, Extreme Gradient Boosting, Numeric Database, Early Detection

Abstract

Polycystic Ovary Syndrome (PCOS) is a common endocrine disorder that affects women of reproductive age, leading to hormonal imbalances and ovarian dysfunction. Early detection and intervention are vital for effective management and prevention of complications. This study compares PCOS prediction using the XGBoost machine learning model against four traditional models: Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF). LR and SVM achieved accuracies of 95% and 96%, respectively, demonstrating strong predictive capabilities. In contrast, DT had a lower accuracy (82%), indicating its limitations in handling the complexity of PCOS data. RF showed competitive performance with 96% accuracy, underscoring the effectiveness of ensemble learning. XGBoost achieved the highest accuracy, 98%, with its parameter configuration. The scale_pos_weight parameter adjusts the weight of the positive class in imbalanced datasets; it addresses underrepresentation by assigning more weight to the minority class, thereby focusing training on it. The gradient boosting framework builds the model incrementally to capture complex feature interactions and dependencies, enhancing accuracy and stability on intricate PCOS datasets. This analysis highlights the importance of advanced machine learning models such as XGBoost for accurate and reliable PCOS prediction. The research advances PCOS prediction, demonstrates the potential of machine learning in healthcare, and clarifies the strengths and limitations of different algorithms on complex medical datasets.
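
The positive-class weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used heuristic of setting XGBoost's scale_pos_weight parameter to the ratio of negative to positive samples. The label vector, the helper name, and the parameter dictionary are hypothetical and are not taken from the paper, whose actual configuration is not given here.

```python
from collections import Counter


def negative_to_positive_ratio(labels):
    """Return n_negative / n_positive for binary labels (1 = PCOS, 0 = no PCOS).

    This ratio is a common starting value for XGBoost's scale_pos_weight;
    it is an assumption here, not the value reported by the authors.
    """
    counts = Counter(labels)
    n_pos, n_neg = counts[1], counts[0]
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    return n_neg / n_pos


# Hypothetical imbalanced label vector: 70 negatives, 30 positives.
y = [0] * 70 + [1] * 30
weight = negative_to_positive_ratio(y)

# Hypothetical parameter dictionary in the form XGBoost accepts.
params = {"objective": "binary:logistic", "scale_pos_weight": weight}
```

With a ratio above 1, errors on the positive (minority) class contribute more to the training loss, which is the effect the abstract describes. In practice the value is usually tuned alongside the other hyperparameters instead of being fixed at this ratio.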

Cite This Paper

Shikha Prasher, Leema Nelson, Manal Gafar, "NIPP: Non-Invasive PCOS Prediction using XG-boost Machine Learning Model", International Journal of Information Technology and Computer Science (IJITCS), Vol.17, No.1, pp.82-95, 2025. DOI: 10.5815/ijitcs.2025.01.06
