Threat Modelling and Detection Using Semantic Network for Improving Social Media Safety



Author(s)

Fethi Fkih 1,2,*, Ghadeer Al-Turaif 1

1. Department of Computer Science, College of Computer, Qassim University, Buraydah, Saudi Arabia

2. MARS Research Laboratory LR17ES05, University of Sousse, Tunisia

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2023.01.04

Received: 20 Sep. 2022 / Revised: 25 Oct. 2022 / Accepted: 29 Nov. 2022 / Published: 8 Feb. 2023

Index Terms

Semantic Network, Threat Detection, Social Media Safety, Cybersecurity, Knowledge Modeling

Abstract

Social media gives users a free space to post information, opinions, and feelings, and lets them communicate with each other easily and simultaneously. As a result, threat detection in social media is critical for ensuring users’ safety and preventing suspicious activities such as criminal behavior, hate speech, ethnic conflicts, and terrorist plots. These activities harm community life and cause tension and social unrest both inside and outside cyberspace. Furthermore, with the recent popularity of social networking sites, the number of discussions containing threats is increasing, causing fear at both the individual and the state level. Moreover, social networking service providers do not have complete control over the content that users post. In this paper, we propose a threat detection model for Twitter based on a semantic network. To this end, we designed a threat semantic network, named ThrNet, which is integrated into our proposed threat detection model, called DetThr. We compared the performance of DetThr with a set of well-known Machine Learning algorithms. Results show that DetThr achieves an accuracy of 76%, outperforming the Machine Learning algorithms. Its error rate for classifying threatening tweets as non-threatening (false negatives) is about 29%, while its error rate for classifying non-threatening tweets as threatening (false positives) is about 19%.
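The three figures reported in the abstract (accuracy, false-negative rate, false-positive rate) all derive from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. The class name `ConfusionCounts` and the counts passed to it are hypothetical: they were chosen only so that the resulting rates match the abstract's rounded figures, and they are not taken from the paper's dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix where 'positive' means a threatening tweet."""

    tp: int  # threatening tweets classified as threatening
    fn: int  # threatening tweets classified as non-threatening
    fp: int  # non-threatening tweets classified as threatening
    tn: int  # non-threatening tweets classified as non-threatening

    @property
    def accuracy(self) -> float:
        # Share of all tweets that were classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    @property
    def false_negative_rate(self) -> float:
        # Share of threatening tweets that were missed.
        return self.fn / (self.tp + self.fn)

    @property
    def false_positive_rate(self) -> float:
        # Share of non-threatening tweets that were wrongly flagged.
        return self.fp / (self.fp + self.tn)


# Hypothetical counts (assumption): 100 threatening and 100 non-threatening
# tweets, picked so the rates line up with the abstract's rounded figures.
counts = ConfusionCounts(tp=71, fn=29, fp=19, tn=81)

print(f"accuracy={counts.accuracy:.2f}")
print(f"false_negative_rate={counts.false_negative_rate:.2f}")
print(f"false_positive_rate={counts.false_positive_rate:.2f}")
```

With equally sized classes, accuracy equals 1 - (FNR + FPR) / 2, so a 29% false-negative rate and a 19% false-positive rate would give 76% accuracy. Whether the paper's test set is actually balanced is not stated in the abstract; the equal class sizes here are an assumption of the sketch.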

Cite This Paper

Fethi Fkih, Ghadeer Al-Turaif, "Threat Modelling and Detection Using Semantic Network for Improving Social Media Safety", International Journal of Computer Network and Information Security (IJCNIS), Vol.15, No.1, pp.39-53, 2023. DOI: 10.5815/ijcnis.2023.01.04
