IJMSC Vol. 3, No. 4, 8 Nov. 2017
Map-Reduce, Distributed File System, Association Rule Mining, Cluster, Apriori, Minimum Support, Minimum Confidence
Association rule mining is a data mining technique that identifies patterns useful for decision making by analyzing datasets. Many association rule mining techniques exist for finding relationships among itemsets. The techniques proposed in the literature run on a non-distributed platform, in which the entire dataset is held until all transactions have been processed and the transactions are scanned sequentially. They therefore require more space and more time when large amounts of data are considered. An efficient technique is needed to find association rules in big data sets while minimizing both space and time. This paper therefore aims to improve the efficiency of association rule mining on big transaction databases, in terms of both memory and speed, by processing the database as a distributed file system in the Map-Reduce framework. The proposed method organizes the transactions into clusters and distributes the clusters among many parallel processors on a distributed platform. Because the clusters are processed simultaneously to find itemsets, performance improves in both memory and speed. Frequent itemsets are then discovered using a minimum support threshold. Associations are generated from the frequent itemsets, and finally the interesting rules are selected using a minimum confidence threshold. The proposed method is noticeably more efficient in terms of both memory and speed.
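The pipeline summarized above (partition the transactions, count itemsets per partition, merge the counts, filter by minimum support, then filter rules by minimum confidence) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the round-robin split that stands in for the clustering step, the max_size cap on itemset length, and the sequential loop that stands in for parallel processors are all assumptions made for illustration. The sketch also counts every small itemset directly instead of using Apriori's level-wise candidate pruning.

```python
from collections import Counter
from itertools import combinations


def map_partition(transactions, max_size):
    """Map step: count every itemset of up to max_size items in one partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # sorted so each itemset has one canonical key
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def reduce_counts(partial_counts):
    """Reduce step: sum the per-partition counts into global counts."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def mine_rules(transactions, n_partitions, min_support, min_confidence, max_size=3):
    # Assumption: a round-robin split stands in for the paper's clustering step.
    partitions = [transactions[i::n_partitions] for i in range(n_partitions)]
    # Partitions are mapped one after another here; a real framework would run them in parallel.
    totals = reduce_counts(map_partition(p, max_size) for p in partitions)

    # Keep itemsets whose relative support meets the minimum support threshold.
    n = len(transactions)
    frequent = {s: c for s, c in totals.items() if c / n >= min_support}

    # Generate rules antecedent -> consequent and keep the confident ones.
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                # Every subset of a frequent itemset is itself frequent, so the lookup succeeds.
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    rules.append((antecedent, consequent, confidence))
    return frequent, rules


if __name__ == "__main__":
    baskets = [["bread", "milk"], ["bread", "butter"],
               ["bread", "milk", "butter"], ["milk"]]
    frequent, rules = mine_rules(baskets, n_partitions=2,
                                 min_support=0.5, min_confidence=0.9)
    for antecedent, consequent, confidence in rules:
        print(antecedent, "->", consequent, confidence)
```

With these four toy baskets, a minimum support of 0.5, and a minimum confidence of 0.9, the only rule that survives is butter -> bread, since both baskets containing butter also contain bread.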
R. Akila, K. Mani, "Augmented Apriori by Simulating Map-Reduce", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.3, No.4, pp.52-66, 2017. DOI: 10.5815/ijmsc.2017.04.05