IJISA Vol. 4, No. 1, 8 Feb. 2012
Data mining, K-means initialization, pattern recognition
The well-known K-means clustering algorithm is sensitive to the choice of initial centroids and may converge to a local minimum of the criterion function. A new algorithm for initializing the K-means clustering algorithm is presented. The proposed procedure for selecting the initial centroids allows K-means to converge to a “better” local minimum. Our results show that refined initial centroids indeed lead to improved solutions. A framework for implementing and testing various clustering algorithms is presented and used to develop and evaluate the algorithm.
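The abstract does not spell out the proposed initialization procedure, so the sketch below is only a generic illustration of the underlying issue: standard Lloyd's K-means started from naive random centroids can settle in different local minima of the sum-of-squared-error (SSE) criterion. The kmeans and random_init functions and the synthetic three-cluster data set are assumptions made here for illustration, not the authors' algorithm.

```python
import numpy as np

def kmeans(X, k, init_centroids, max_iter=100, tol=1e-6):
    """Lloyd's K-means; the final partition depends on init_centroids."""
    centroids = init_centroids.copy()
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.linalg.norm(new_centroids - centroids) < tol:
            break
        centroids = new_centroids
    # Criterion function: sum of squared distances to assigned centroids (SSE).
    sse = ((X - centroids[labels]) ** 2).sum()
    return centroids, labels, sse

def random_init(X, k, rng):
    """Naive initialization: k points drawn at random from the data set."""
    return X[rng.choice(len(X), size=k, replace=False)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: three Gaussian clusters (hypothetical example data).
    X = np.vstack([rng.normal(c, 0.3, size=(100, 2))
                   for c in ([0, 0], [5, 0], [0, 5])])
    # Different random initializations can converge to different local minima
    # of the criterion function, which is the problem a refined initialization
    # is meant to address.
    for seed in range(3):
        init = random_init(X, 3, np.random.default_rng(seed))
        _, _, sse = kmeans(X, 3, init)
        print(f"seed {seed}: SSE = {sse:.2f}")
```

Depending on where the random starting centroids fall, the reported SSE can differ from seed to seed; a refined initialization scheme aims to make the “better” local minimum the typical outcome.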
Mohammed El Agha, Wesam M. Ashour, "Efficient and Fast Initialization Algorithm for K-means Clustering", International Journal of Intelligent Systems and Applications (IJISA), vol. 4, no. 1, pp. 21-31, 2012. DOI: 10.5815/ijisa.2012.01.03