Identifying Dark Web Hidden Services with Novel Image Classes Using CNN and Quantum Transfer Learning

Author(s)

Ashwini Dalvi 1,* Soham Bhoir 2 Akansha Singh 1 Irfan Siddavatam 2 Sunil Bhirud 1

1. Department of Computer Engineering, Veermata Jijabai Technological Institute, Mumbai, 400019, India

2. Department of Information Technology, K J Somaiya College of Engineering, Vidyavihar, 400077, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2025.02.05

Received: 23 Oct. 2023 / Revised: 8 Dec. 2023 / Accepted: 9 Dec. 2024 / Published: 8 Apr. 2025

Index Terms

Dark Web, Image classification, CNN Model, Quantum Transfer Learning, TF-IDF

Abstract

The dark web is a vast and opaque space made up of hidden services. These hidden services contain illegal or offensive content. They cannot be reached through regular search engines or browsers and are accessible only via specific software. The proposed work aims to identify these hidden services by analyzing their associated images and text data. Doing so gives a better understanding of the types of activity on the dark web and the kinds of content available. First, a dark web crawler was developed to collect hidden services. The collected images were then manually classified into four categories: Cards, Devices, Hackers, and Money. Next, the dataset was preprocessed to remove irrelevant images, and a Convolutional Neural Network (CNN) was trained to identify the new dark web image classes. Finally, Quantum Transfer Learning (QTL) was applied to improve the model's performance. The proposed work goes beyond conventional dataset categorization by introducing image classes of dark web hidden services that have not been considered before. The work also examines image data together with the related text to establish a strong correlation between them. The proposed approach provides insight into dark web hidden services by confirming the relationship between the image and text data of each service.
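
The abstract does not say how the text related to each image class is scored, although the index terms list TF-IDF. As a minimal sketch only, assuming that the page text of all services in one image class is pooled into a single document, TF-IDF weights could be computed as below. The sample texts and the helper names `tf_idf` and `top_terms` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# The four image classes named in the abstract.
CLASSES = ["Cards", "Devices", "Hackers", "Money"]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weights, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical pooled page text per image class (illustration only).
pages = {
    "Cards": "buy cloned credit cards with pin cards shipped worldwide",
    "Devices": "buy unlocked phones and laptops shipped worldwide",
    "Hackers": "hire hackers for account recovery and ddos",
    "Money": "buy counterfeit euro bills shipped worldwide",
}

docs = [pages[label].split() for label in CLASSES]
scores = dict(zip(CLASSES, tf_idf(docs)))

for label in CLASSES:
    print(label, top_terms(scores[label]))
```

Terms shared across classes, such as "buy" or "shipped", receive low weights, while class-specific terms rank highest. Under the stated assumption, this is the kind of signal that could be compared against the image label of a service.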

Cite This Paper

Ashwini Dalvi, Soham Bhoir, Akansha Singh, Irfan Siddavatam, Sunil Bhirud, "Identifying Dark Web Hidden Services with Novel Image Classes Using CNN and Quantum Transfer Learning", International Journal of Education and Management Engineering (IJEME), Vol.15, No.2, pp. 44-59, 2025. DOI:10.5815/ijeme.2025.02.05
