Work place: Department of IT, GVP College of Engineering (A), Andhra Pradesh, 530048, India
E-mail: kbmcst1@yahoo.com
Research Interests: Pattern Recognition, Data Mining, Data Structures and Algorithms
Biography
K.B. Madhuri received the M.Tech. degree in Computer Science and Technology from Andhra University in 1999 and the Ph.D. from JNTU, Hyderabad, in 2009. She is presently Professor and Head of the Department of Information Technology at Gayatri Vidya Parishad College of Engineering (A), Visakhapatnam, Andhra Pradesh, India. Her research interests include Data Mining, Pattern Recognition, Data Warehousing and RDBMS. She has guided one Ph.D. scholar and is currently guiding another. She has published research papers in national and international journals. She is a member of IEEE and an associate member of Institute of Engineers (India).
DOI: https://doi.org/10.5815/ijmsc.2019.03.04, Pub. Date: 8 Aug. 2019
In general, subspace clustering algorithms identify an enormous number of subspace clusters, which may include redundant clusters. This paper presents the Dynamic Epsilon based Maximal Subspace Clustering Algorithm (DEMSC), which handles both redundancy and inter-subspace density divergence, a phenomenon in density-based subspace clustering. The proposed algorithm aims to mine maximal, non-redundant subspace clusters. A maximal subspace cluster is a group of similar data objects that share a maximal number of attributes. The DEMSC algorithm consists of four steps. In the first step, data points are assigned random unique positive integers called labels. In the second step, dense units are identified based on the density notion, using the proposed dynamically computed epsilon-radius, specific to each subspace, and the user-specified minimum number of points, τ. In the third step, the labels of the data objects forming each dense unit are summed to compute its signature, which is hashed into a hash table. Finally, if a dense unit of one subspace collides with a dense unit of another subspace in the hash table, then with high probability both dense units exist in the subspace formed by combining the colliding subspaces. With this approach, maximal, non-redundant subspace clusters are identified efficiently, and the algorithm outperforms the existing algorithms in terms of cluster quality and the number of resulting subspace clusters on different benchmark datasets.
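The four steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how epsilon is computed or how dense units are formed, so the sketch takes a per-attribute epsilon as input (the hypothetical parameter `eps_per_dim`) and assumes that a one-dimensional dense unit is a run of sorted values whose consecutive gaps are at most epsilon and that holds at least τ points. All function names are illustrative.

```python
import random
from collections import defaultdict


def assign_labels(n_points, seed=0):
    """Step 1: give every point a random, unique positive integer label."""
    rng = random.Random(seed)
    # Sampling without replacement from a large range guarantees uniqueness.
    return rng.sample(range(1, 10**12), n_points)


def dense_units_1d(values, eps, tau):
    """Step 2 (simplified): group the points of one attribute into dense units.

    Assumption: a dense unit is a run of sorted values whose consecutive
    gaps are at most eps and that contains at least tau points.
    """
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    units, run = [], [order[0]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] <= eps:
            run.append(cur)
        else:
            if len(run) >= tau:
                units.append(frozenset(run))
            run = [cur]
    if len(run) >= tau:
        units.append(frozenset(run))
    return units


def maximal_subspace_clusters(data, eps_per_dim, tau, seed=0):
    """Steps 3 and 4: hash unit signatures and combine colliding subspaces."""
    labels = assign_labels(len(data), seed)
    table = defaultdict(set)  # signature -> attribute indices that produced it
    members = {}              # signature -> point indices of the dense unit
    for dim, eps in enumerate(eps_per_dim):
        column = [row[dim] for row in data]
        for unit in dense_units_1d(column, eps, tau):
            # Step 3: the signature is the sum of the labels in the unit.
            signature = sum(labels[i] for i in unit)
            table[signature].add(dim)
            members[signature] = unit
    # Step 4: equal signatures from different attributes are taken to mean
    # that the same unit exists in the combined subspace (with high probability).
    result = [(tuple(sorted(dims)), sorted(members[sig]))
              for sig, dims in table.items()]
    return sorted(result)
```

For example, with six points that form two tight groups in attributes 0 and 1 but are scattered in attribute 2, calling `maximal_subspace_clusters(data, [0.5, 0.5, 0.5], tau=3)` reports each group once, in the combined subspace `(0, 1)`, rather than once per attribute. Because signatures are sums of random labels, two different units could in principle share a signature, which is why the result holds only with high probability.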
DOI: https://doi.org/10.5815/ijieeb.2017.05.06, Pub. Date: 8 Sep. 2017
In most applications, data in multiple data sources describe the same set of objects, and the analysis has to be carried out with respect to all the data sources. To form clusters in subspaces of the data sources, the data mining task has to find interesting groups of objects jointly supported by the multiple data sources. This paper addresses the problem of mining mutual subspace clusters in multiple sources. The authors propose a partitional model using the k-medoids algorithm to determine k-exclusive subspace clusters and the signature subspaces corresponding to multiple data sources, where k is the number of subspace clusters specified by the user. Compared to the existing algorithm, the proposed algorithm generates mutual subspace clusters in multiple data sources in less time without loss of cluster quality.
By B. Jaya Lakshmi, K.B. Madhuri, M. Shashi
DOI: https://doi.org/10.5815/ijitcs.2017.06.04, Pub. Date: 8 Jun. 2017
Density-based subspace clustering algorithms have gained importance owing to their ability to identify arbitrarily shaped subspace clusters. Density-connected SUBspace CLUstering (SUBCLU) uses two input parameters, epsilon and minpts, whose values are the same in all subspaces, which leads to a significant loss of cluster quality. Two important issues have to be handled. First, cluster densities vary across subspaces, a phenomenon referred to as density divergence. Second, the density of clusters within a subspace may vary due to the data characteristics, a phenomenon referred to as multi-density behavior. To handle these two issues, the authors propose an efficient algorithm for generating subspace clusters by appropriately fixing the input parameter epsilon. Version 1 of the proposed algorithm computes epsilon dynamically for each subspace based on the maximum spread of the data. To handle data that exhibit multi-density behavior, the algorithm is further refined and presented as version 2. The initial value of epsilon is set to half of the value obtained in version 1 for a subspace, and a small step value 'delta' is used to finalize epsilon separately for each cluster through step-wise refinement, forming multiple higher-dimensional subspace clusters. The proposed algorithm is implemented and tested on various benchmark and synthetic datasets. It outperforms SUBCLU in terms of cluster quality and execution time.
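The two epsilon rules can be sketched in Python as follows. The abstract gives neither the exact formula nor the direction of the refinement, so several details here are assumptions: the scaling factor `fraction` is hypothetical, the "maximum spread" is taken to be the largest value range among the attributes of the subspace, and version 2 is assumed to step upward from half of the version 1 value without exceeding it. How a candidate epsilon is accepted for a given cluster is not described in the abstract and is left out.

```python
def epsilon_v1(data, subspace, fraction=0.1):
    """Version 1 (assumed form): epsilon from the maximum spread of a subspace.

    `subspace` is a tuple of attribute indices; `fraction` is a hypothetical
    scaling factor that the abstract does not specify.
    """
    spreads = []
    for dim in subspace:
        column = [row[dim] for row in data]
        spreads.append(max(column) - min(column))
    return fraction * max(spreads)


def epsilon_v2_candidates(eps_v1, delta):
    """Version 2 (assumed form): candidate epsilons for step-wise refinement.

    Starts at half of the version 1 value and increases in steps of delta,
    never exceeding the version 1 value.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    start = eps_v1 / 2
    # The small tolerance guards against floating-point round-off.
    steps = int((eps_v1 - start) / delta + 1e-9)
    return [start + k * delta for k in range(steps + 1)]
```

Under these assumptions, a clustering routine would try the candidates from `epsilon_v2_candidates` in order for each cluster and keep the value at which that cluster stabilizes.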