International Journal of Education and Management Engineering (IJEME)

ISSN: 2305-3623 (Print), ISSN: 2305-8463 (Online)

Published By: MECS Press

IJEME Vol.10, No.2, Apr. 2020

Evaluation of Data Mining Categorization Algorithms on Aspirates Nucleus Features for Breast Cancer Prediction and Detection

Full Text (PDF, 467KB), PP.28-37



Gajendra Sharma

Index Terms

Data Mining, Breast Cancer, Classification Techniques, Prediction, Diagnosis, WEKA


With advances in technology, Computer-Aided Diagnosis has become a key tool for breast cancer diagnosis, and it is important to increase the accuracy and effectiveness of such systems. Data mining can be applied to the data gathered through these systems for the prediction and prevention of breast cancer. In this research, we compared seven classification algorithms using the WEKA (Waikato Environment for Knowledge Analysis) tool on 569 instances (10 nucleus attributes) of breast cancer aspirate cell data with two classes, Malignant (M) and Benign (B). We also evaluated the influence of each attribute on prediction. All algorithms achieved accuracy above 91%, with the highest value, 94.02%, for random forest. Among the attributes, concave points had the highest predictive power and fractal dimension the lowest.
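
The idea of ranking attributes by predictive power can be sketched in a few lines. The example below is only an illustration under stated assumptions: it scores each attribute by the best accuracy of a single-threshold rule (a decision stump), which is not necessarily the evaluator used in WEKA for this study. The function names and the toy values are hypothetical, invented to show the mechanics, and are not taken from the 569-instance dataset or from the paper's results.

```python
from typing import Dict, List, Sequence, Tuple


def stump_accuracy(values: Sequence[float], labels: Sequence[str]) -> float:
    """Best accuracy of a one-threshold rule on a single attribute.

    For every observed value t, the rule 'value > t => M, else B' is scored,
    together with its mirror image ('value > t => B, else M').
    """
    n = len(values)
    best = 0.0
    for t in sorted(set(values)):
        hits = sum((v > t) == (y == "M") for v, y in zip(values, labels))
        # The mirrored rule is correct exactly where this rule is wrong.
        best = max(best, hits / n, (n - hits) / n)
    return best


def rank_attributes(
    columns: Dict[str, List[float]], labels: Sequence[str]
) -> List[Tuple[str, float]]:
    """Return (attribute, score) pairs sorted from most to least predictive."""
    scores = [(name, stump_accuracy(vals, labels)) for name, vals in columns.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy values, invented for illustration only (NOT real measurements).
toy_labels = ["M", "M", "M", "M", "B", "B", "B", "B"]
toy_columns = {
    "concave_points": [0.15, 0.12, 0.10, 0.09, 0.03, 0.02, 0.05, 0.04],
    "fractal_dimension": [0.060, 0.065, 0.058, 0.070, 0.061, 0.066, 0.059, 0.069],
}

ranking = rank_attributes(toy_columns, toy_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On real data, such a score would normally be estimated with cross-validation rather than on the training instances themselves, since a threshold tuned and scored on the same data is optimistic.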

Cite This Paper

Gajendra Sharma. "Evaluation of Data Mining Categorization Algorithms on Aspirates Nucleus Features for Breast Cancer Prediction and Detection", International Journal of Education and Management Engineering (IJEME), Vol.10, No.2, pp.28-37, 2020. DOI: 10.5815/ijeme.2020.02.04

