Workplace: KMBB College of Engineering and Technology, Bhubaneswar, India
E-mail: sujata_dash@yahoo.com
Research Interests: Bioinformatics, Computer systems and computational processes, Data Mining, Data Compression, Data Structures and Algorithms
Biography
Sujata Dash received her Ph.D. degree in Computational Modeling and Simulation from Berhampur University, Orissa, India, in 1995. She is a Professor of Computer Science at KMBB College of Engineering and Technology, Biju Pattnaik University of Technology, and has published more than 50 technical papers in international journals, proceedings of international conferences, and book chapters with reputed publishers. Her current research interests include Data Warehousing and Data Mining, Bioinformatics, Intelligent Agents, Web Data Mining, and Wireless Technology.
By Sujata Dash, Bichitrananda Patra, B.K. Tripathy
DOI: https://doi.org/10.5815/ijieeb.2012.02.07, Pub. Date: 8 Apr. 2012
A major challenge in biomedical studies in recent years has been the classification of gene expression profiles into categories, such as cases and controls. This is done by first training a classifier on a labeled training set containing samples from the two populations, and then using that classifier to predict the labels of new samples. Such predictions have recently been shown to improve diagnosis and treatment selection for several diseases. This procedure is complicated, however, by the high dimensionality of the data. While microarrays can measure the levels of thousands of genes per sample, case-control microarray studies usually involve no more than several dozen samples. Standard classifiers do not work well in these situations, where the number of features (gene expression levels measured in these microarrays) far exceeds the number of samples. Selecting only the features that are most relevant for discriminating between the two categories helps construct better classifiers, in terms of both accuracy and efficiency. This paper compares a dimension reduction technique, namely the Partial Least Squares (PLS) method, with a hybrid feature selection scheme, and evaluates the relative performance of four supervised classification procedures incorporating those methods: Radial Basis Function Network (RBFN), Multilayer Perceptron Network (MLP), Support Vector Machine with a polynomial kernel function (Polynomial-SVM), and Support Vector Machine with an RBF kernel function (RBF-SVM). Experimental results show that the Partial Least Squares (PLS) regression method is an appropriate feature selection method and that the combined use of different classification and feature selection approaches makes it possible to construct high-performance classification models for microarray data.
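The sketch below is a minimal, hypothetical illustration of the kind of pipeline the abstract describes, not the authors' implementation: a high-dimensional gene-expression matrix is reduced with Partial Least Squares and the resulting components are classified with an RBF-kernel SVM. The synthetic data, component count, and SVM hyperparameters are all assumptions chosen only to make the example runnable.

```python
# Hypothetical sketch of PLS dimension reduction followed by RBF-SVM
# classification, using synthetic stand-in data (few samples, many features),
# as is typical of case-control microarray studies.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVC

# Simulate a small case/control study: 60 samples, 2000 "gene" features.
X, y = make_classification(n_samples=60, n_features=2000,
                           n_informative=40, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# PLS is supervised: components are chosen to covary with the class labels,
# so it is fitted on the training split only to avoid information leakage.
pls = PLSRegression(n_components=5)
pls.fit(X_train, y_train)
X_train_pls = pls.transform(X_train)
X_test_pls = pls.transform(X_test)

# RBF-kernel SVM trained on the low-dimensional PLS scores
# (parameter values here are placeholders, not the paper's settings).
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train_pls, y_train)
print("Test accuracy:", clf.score(X_test_pls, y_test))
```

In practice the number of PLS components and the SVM parameters would be tuned by cross-validation on the training data, and the same reduced representation could be fed to the other classifiers mentioned in the abstract (RBFN, MLP, Polynomial-SVM) for comparison.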