O. M. Elzeki

Workplace: Faculty of Computers and Information, Mansoura University, Mansoura, Dakahlia, Egypt

E-mail: omar_m_elzekia@mans.edu.eg

Research Interests: Computer systems and computational processes, Computational Learning Theory, Data Structures and Algorithms

Biography

O. M. Elzeki received his Bachelor’s degree in 2007 from the Computer Science Department, Mansoura University, Egypt, and his Master’s degree from Mansoura University of Computer and Information Systems, Egypt, in 2013. He is now pursuing his Ph.D. degree at Mansoura University of Computer and Information Systems, Egypt. His interests include data science, big data analysis, and machine learning in different computational environments. He was a recipient of the best governorate student award from Dakahlia Governorate, Egypt.

Author Articles
A New Hybrid Genetic and Information Gain Algorithm for Imputing Missing Values in Cancer Genes Datasets

By O. M. Elzeki, M. F. Alrahmawy, Samir Elmougy

DOI: https://doi.org/10.5815/ijisa.2019.12.03, Pub. Date: 8 Dec. 2019

A DNA microarray can represent thousands of genes for studying tumors and genetic diseases in humans. DNA microarray datasets normally contain missing values, which makes handling them a crucial step. This paper presents a new algorithm, named EMII, for imputing missing values in medical datasets. EMII evolutionarily combines Information Gain (IG) and a Genetic Algorithm (GA) to generate the imputed values. Unlike other GA implementations, EMII is column-oriented rather than instance-oriented, which increases the correlation of each column with the class in the same dataset. EMII is evaluated by imputing artificially generated missing values in four standard cancer gene expression datasets (Colon, Leukemia, Lung cancer-Michigan, and Prostate) and comparing the imputed datasets against the original complete datasets. Analysis of the experimental results reveals that the values imputed by EMII were almost the same as the original values and had the same impact on the applied classifiers, whose accuracy was similar to that obtained on the original complete datasets. EMII has a running time of Θ(n²), where n is the total number of columns.
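
As a rough illustration of the Information Gain measure mentioned in the abstract, the sketch below scores how strongly a discrete column relates to the class labels. It is not the EMII algorithm itself: the abstract does not say how IG is combined with the GA, and the function names, the toy labels, and the assumption that expression values have already been discretized are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Information gain of a discrete column with respect to the class labels.

    Assumes the column is already discretized (hypothetical preprocessing step).
    """
    total = len(labels)
    # Group the class labels by the value the column takes in each row.
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class after splitting on the column.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): four samples with a binary class.
labels = ["tumor", "tumor", "normal", "normal"]
informative = ["high", "high", "low", "low"]    # perfectly separates the classes
uninformative = ["high", "low", "high", "low"]  # independent of the class

print(information_gain(informative, labels))    # 1.0
print(information_gain(uninformative, labels))  # 0.0
```

A column with higher information gain is more strongly tied to the class, which is one plausible reading of the "column correlation to the class" that the abstract says EMII increases.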

Other Articles