International Journal of Information Engineering and Electronic Business(IJIEEB)
ISSN: 2074-9023 (Print), ISSN: 2074-9031 (Online)
Published By: MECS Press
IJIEEB Vol.9, No.2, Mar. 2017
Map-Reduce based Multiple Sub-Graph Enumeration Using Dominating-Set Graph Partition
Full Text (PDF, 458KB), PP.36-44
The purpose of this paper is to find all the instances of a given set of pattern graphs (sub-graphs) in a large data graph using a single round of Map-Reduce. For the simplest pattern graphs like a triangle and rectangle we propose the solution. This paper enumerates complex pattern graphs using the enumeration of simple pattern graphs. We proposed Dominating set based graph partition, it generates non-overlapped sub-graphs. Each sub-graph is processed by one machine in the cluster. We analyze both the communication cost and the total computational cost. Communication cost is reduced by using Map-Reduce based dominating set graph partition. At the same time Multiple pattern (sub-graphs) graphs can be enumerated with the same communication cost. The proposed method is not always superior to the conventional sub-graph enumeration, but in some cases involving large-scale data where this method wins, including (1) Adjacency list representation of the graph is the input (2) Number of partitions are decided based on the graph size. We experimentally show that our approach decreases significantly the computation cost, communication cost and scales the enumeration process with a large graph database.
Cite This Paper
Fathimabi shaik, RBV Subramanyam, DVLN Somayajulu,"Map-Reduce based Multiple Sub-Graph Enumeration Using Dominating-Set Graph Partition", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.9, No.2, pp.36-44, 2017. DOI: 10.5815/ijieeb.2017.02.05
Alon, Noga, Raphael Yuster, and Uri Zwick. "Finding and counting given length cycles." Algorithmica 17.3 (1997): 209-223. DOI: 10.1007/BF02523189
Kowaluk, Mirosław, Andrzej Lingas, and Eva-Marta Lundell. "Counting and detecting small subgraphs via equations and matrix multiplication." Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms. SIAM, 2011.
Tsourakakis, Charalampos E., et al. "Doulion: counting triangles in massive graphs with a coin." Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009,doi:10.1145/1557019.1557111
Schank, Thomas. "Algorithmic aspects of triangle-based network analysis." Phd in computer science, University Karlsruhe 3 (2007).
Suri, Siddharth, and Sergei Vassilvitskii. "Counting triangles and the curse of the last reducer." Proceedings of the 20th international conference on World wide web. ACM, 2011.DOI: 10.1145/1963405.1963491
Kolda, Tamara G., et al. "Counting triangles in massive graphs with Map-Reduce." SIAM Journal on Scientific Computing 36.5 (2014): S48-S77. DOI: 10.1137/13090729X
Pagh, Rasmus, and Charalampos E. Tsourakakis. "Colorful triangle counting and a Map-Reduce implementation." Information Processing Letters 112.7 (2012): 277-281.
Afrati, Foto N., Dimitris Fotakis, and Jeffrey D. Ullman. "Enumerating subgraph instances using map-reduce." Data Engineering (ICDE), 2013 IEEE 29th International Conference on. IEEE, 2013. DOI: 10.1109/ICDE.2013.6544814
Afrati, Foto N., and Jeffrey D. Ullman. "Optimizing multiway joins in a map-reduce environment." IEEE Transactions on Knowledge and Data Engineering 23.9 (2011): 1282-1298.DOI: 10.1145/1739041.1739056
Lai, Longbin, et al. "Scalable subgraph enumeration in Map-Reduce." Proceedings of the VLDB Endowment 8.10 (2015): 974-985. DOI: 10.14778/2794367.2794368
Chu, Shumo, and James Cheng. "Triangle listing in massive networks and its applications." Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011. DOI: 10.1145/2020408.2020513
Park, Ha-Myung, and Chin-Wan Chung. "An efficient Map-Reduce algorithm for counting triangles in a very large graph." Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM, 2013. DOI: 10.1145/2505515.2505563
Hu, Xiaocheng, Yufei Tao, and Chin-Wan Chung. "Massive graph triangulation." Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM, 2013.DOI:10.1145/2463676.2463704
Sun, Zhao, et al. "Efficient subgraph matching on billion node graphs." Proceedings of the VLDB Endowment 5.9 (2012): 788-799. DOI: 10.14778/2311906.2311907
Viger, Fabien, and Matthieu Latapy. "Efficient and simple generation of random simple connected graphs with prescribed degree sequence." International Computing and Combinatorics Conference. Springer Berlin Heidelberg, 2005. DOI: 10.1007/11533719_45
Zhang, Xiaofei, Lei Chen, and Min Wang. "Efficient multi-way theta-join processing using Map-Reduce." Proceedings of the VLDB Endowment 5.11 (2012): 1184-1195.DOI: 10.14778/2350229.2350238
Zhao, Peixiang, and Jiawei Han. "On graph query optimization in large networks." Proceedings of the VLDB Endowment 3.1-2 (2010): 340-351. DOI: 10.14778/1920841.1920887
Zhao, Zhao, et al. "Subgraph enumeration in large social contact networks using parallel color coding and streaming." 2010 39th International Conference on Parallel Processing. IEEE, 2010.DOI: 10.1109/ICPP.2010.67
Wang, Chaokun, et al. "MapDupReducer: detecting near duplicates over massive datasets." Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, 2010.DOI: 10.1145/1807167.1807296
Chiba, Norishige, and Takao Nishizeki. "Arboricity and subgraph listing algorithms." SIAM Journal on Computing 14.1 (1985): 210-223.
Lin, Jimmy, and Chris Dyer. "Data-intensive text processing with Map-Reduce." Synthesis Lectures on Human Language Technologies3.1(2010):1-170
Hadoop Distributed File System hdfs : http://hadoop.apache.org/hdfs
Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung. "The Google file system." ACM SIGOPS operating systems review. Vol. 37. No. 5. ACM, 2003.DOI: 10.1145/1165389.945450
J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. In OSDI, pages 137–150, 2004. doi: 10.1016/j.procs.2015.07.392
J. Cohen. Graph twiddling in a mapreduce world.Computing in Science and Engineering, 11(4):29–41, 2009.
Vernica, Rares, Michael J. Carey, and Chen Li. "Efficient parallel set-similarity joins using MapReduce." Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, 2010, 10.1145/1807167.1807222
Dilbag Singh,Jaswinder Singh,Amit Chhabra "Failures in Cloud Computing Data Centers in 3-tier Cloud Architecture", IJIEEB, vol.4, no.3, pp.1-8, 2012. DOI: 10.5815/ijieeb.2012.03.01
Jitendra Singh,"Study of Response Time in Cloud Computing", IJIEEB, vol.6, no.5, pp.36-43, 2014. DOI: 10.5815/ijieeb.2014.05.06
Karim Zarour, Nacereddine Zarour,"Data Center Strategy to Increase Medical Information Sharing in Hospital Information Systems", IJIEEB, vol.5, no.1, pp.33-39, 2013. DOI: 10.5815/ijieeb.2013.01.04