International Journal of Information Technology and Computer Science (IJITCS)

ISSN: 2074-9007 (Print), ISSN: 2074-9015 (Online)

Published By: MECS Press

IJITCS Vol.9, No.10, Oct. 2017

An Efficient Framework for Creating Twitter Mart on a Hybrid Cloud

PP. 59-67



Imran Khan, S. Kazim Naqvi, Mansaf Alam, Mohammad Najmud Doja, S. Nasir Aziz Rizvi

Index Terms

Cloud Computing; Big Data; Hadoop; Twitter


The contemporary technological landscape is buzzing with two terms: Big Data and Cloud Computing. Digital data is growing rapidly, from gigabytes (GB) through terabytes (TB) to petabytes (PB), and data management challenges are growing with it. Social networking sites such as Twitter, Facebook, and Google+ generate huge volumes of data on a daily basis. Among them, Twitter stands out as the largest source of publicly available data for research and development. To further research in the fast-emerging area of managing Big Data, we propose a novel framework for Big Data analysis and demonstrate its implementation by creating a 'Twitter Mart', a compilation of subject-specific tweets that addresses some of the challenges faced by industries engaged in analyzing subject-specific data. In this paper, we present algorithms and a holistic model that support storing and retrieving this data efficiently.

Cite This Paper

Imran Khan, S. Kazim Naqvi, Mansaf Alam, Mohammad Najmud Doja, S. Nasir Aziz Rizvi, "An Efficient Framework for Creating Twitter Mart on a Hybrid Cloud", International Journal of Information Technology and Computer Science (IJITCS), Vol.9, No.10, pp.59-67, 2017. DOI: 10.5815/ijitcs.2017.10.06

