IJISA Vol. 13, No. 4, 8 Aug. 2021
Data Mining, Meta Data, Classification, Bayes Classifier, Decision Tree Classifier
Classification is a data mining technique that assigns data of different kinds to particular classes. Social media is an immense platform that allows billions of people to share their thoughts, updates and multimedia information as statuses, photos, videos, links, audio and graphics. Because of this flexibility, the cloud holds an enormous amount of data. Most of the time, this data is complicated to retrieve and to understand; it may contain a lot of noise and is often incomplete. To reduce this complexity, the data stored in the cloud has to be classified with labels, which is feasible through data mining classification techniques. In the present work, we consider a Facebook dataset that holds the metadata of a cosmetics company's Facebook page. Nineteen different metadata fields are used as the main attributes. Of those, the metadata field 'Type' is the target of classification, with four classes: link, status, photo and video. We use two popular families of data mining classifiers, the Bayes classifier and the Decision Tree classifier, each of which contains several classification algorithms. A few algorithms from the Bayes and Decision Tree families were chosen for the experiment and are explained in detail in the present work. The percentage split method is used to divide the dataset into training and testing data, which makes it possible to calculate the classification accuracy and to form the confusion matrix. The accuracy, kappa statistic, root mean squared error, relative absolute error, root relative squared error and confusion matrix of all the algorithms are compared and analyzed in depth to identify the best classifier for labeling the company's Facebook data into the appropriate classes. Knowledge discovery is the ultimate goal of this experiment.
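The evaluation steps named in the abstract (percentage split, confusion matrix, accuracy and kappa statistic) can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the 66% training fraction, the function names and the toy label lists are all assumptions made for the example, and the numbers it produces are not results from the paper.

```python
import random

# The four 'Type' classes named in the abstract.
LABELS = ["link", "status", "photo", "video"]


def percentage_split(rows, train_fraction=0.66, seed=0):
    """Shuffle rows and split them into (train, test) by a fixed fraction.

    The 0.66 default is an assumed value; the abstract does not state the ratio.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted, labels=LABELS):
    """Return matrix[i][j]: instances of true class i predicted as class j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond chance: (observed - expected) / (1 - expected).

    Undefined (division by zero) when expected agreement equals 1.
    """
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical true and predicted labels for eight test posts (toy data only).
toy_actual = ["photo", "photo", "photo", "status", "link", "video", "photo", "status"]
toy_predicted = ["photo", "photo", "status", "status", "link", "photo", "photo", "status"]

toy_matrix = confusion_matrix(toy_actual, toy_predicted)
print(toy_matrix)
print(accuracy(toy_matrix))     # 0.75 for this toy data (6 of 8 correct)
print(cohen_kappa(toy_matrix))  # about 0.61 for this toy data
```

On the toy data, accuracy is 0.75 while kappa is lower, because kappa discounts the agreement that would be expected by chance given how often each class appears among the true and predicted labels.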
Prashant Bhat, Pradnya Malaganve, "Metadata based Classification Techniques for Knowledge Discovery from Facebook Multimedia Database", International Journal of Intelligent Systems and Applications (IJISA), Vol.13, No.4, pp.38-48, 2021. DOI: 10.5815/ijisa.2021.04.04