Samir Elmougy

Work place: Faculty of Computers and Information, Mansoura University, Mansoura, Dakahliya, Egypt

E-mail: mougyc@mans.edu.eg


Research Interests: Software Engineering, Computational Learning Theory, Neural Networks, Computer Architecture and Organization, Computer Networks, Data Structures and Algorithms, Analysis of Algorithms, Models of Computation

Biography

SAMIR ELMOUGY received the Ph.D. degree in Computer Science from the School of Electrical Engineering and Computer Science, Oregon State University, USA. He is currently a Professor and the Chair of the Department of Computer Science, Faculty of Computers and Information, Mansoura University, Egypt. From 2008 to 2014, he was an Assistant Professor at the Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia. He has published over 50 papers in refereed IEEE Transactions and Springer journals, IEEE conferences, and book chapters. His current research interests include error-correcting codes, computer networks, IoT, analysis of algorithms, machine learning, and software engineering.

Author Articles
A New Hybrid Genetic and Information Gain Algorithm for Imputing Missing Values in Cancer Genes Datasets

By O. M. Elzeki, M. F. Alrahmawy, Samir Elmougy

DOI: https://doi.org/10.5815/ijisa.2019.12.03, Pub. Date: 8 Dec. 2019

A DNA microarray can represent thousands of genes for studying tumors and genetic diseases in humans. DNA microarray datasets normally contain missing values, which makes handling them a crucial step. This paper presents a new algorithm, named EMII, for imputing missing values in medical datasets. EMII evolutionarily combines Information Gain (IG) and a Genetic Algorithm (GA) to generate the imputed values. Unlike other GA implementations, EMII is column-oriented rather than instance-oriented, which increases the correlation of each column with the class in the same dataset. EMII is evaluated by imputing artificially generated missing values in four standard cancer gene expression datasets (Colon, Leukemia, Lung cancer-Michigan, and Prostate) and comparing the imputed datasets against the original complete datasets. The experimental results show that the values imputed by EMII were almost the same as the original values and had the same impact on the applied classifiers, whose accuracy was similar to that obtained on the original complete datasets. EMII has a running time of Θ(n²), where n is the total number of columns.
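The abstract does not say how EMII computes or applies Information Gain, so the following is only a minimal sketch of the standard IG definition (class entropy minus the class entropy conditioned on a column). It assumes expression levels have already been discretized; the function names and the toy data are hypothetical illustrations and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Standard information gain of a discrete column with respect to the class.

    Illustrative only: this is the textbook definition, not EMII itself.
    """
    total = len(labels)
    # Group the class labels by the value the column takes in each row.
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average of the class entropy within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: four samples, two discretized gene columns.
labels = ["tumor", "tumor", "normal", "normal"]
gene_a = ["high", "high", "low", "low"]   # separates the two classes perfectly
gene_b = ["high", "low", "high", "low"]   # carries no class information

print(information_gain(gene_a, labels))
print(information_gain(gene_b, labels))
```

On this toy data, a column that separates the classes perfectly receives the full class entropy as its gain, while a column independent of the class receives none. This kind of column-to-class relevance is what a column-oriented method could use, although the paper's exact scoring may differ.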

Other Articles