IJIEEB Vol. 17, No. 2, 8 Apr. 2025
Anti-Phishing Working Group (APWG), Decision Tree, Random Forest, XGB, Hybrid Machine Learning
The rapid growth of Internet technology has significantly changed online users' experiences, while security concerns have become increasingly pressing. Among these concerns, phishing stands out as a prominent criminal activity that combines social engineering and technical deception to steal a victim's identity data and account information. According to the Anti-Phishing Working Group (APWG), the number of phishing detections increased by 46% in the first quarter of 2018 compared with the fourth quarter of 2017. To counter this, the paper introduces a phishing detection system that uses a hybrid machine learning approach based on URL attributes. It addresses the growing threat of phishing attacks that exploit email manipulation and fake websites to deceive users and steal sensitive data. The study employs a phishing URL dataset of over 11,000 websites, drawn from a reputable repository. After pre-processing, a hybrid machine learning model that combines Decision Tree, Random Forest, and XGB classifiers is used to detect phishing URLs. The proposed approach is evaluated with key metrics: accuracy, precision, recall, F1-score, and specificity. Results show that the proposed method outperforms the other models compared, achieving higher accuracy and efficiency in detecting phishing attacks.
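The combination and evaluation steps described above can be illustrated with a minimal sketch. The abstract does not state how the three classifiers are combined, so a hard majority vote is assumed here; the function names `majority_vote` and `evaluate` and the toy labels are hypothetical and are not taken from the paper. The per-model predictions stand in for the outputs of trained Decision Tree, Random Forest, and XGB models.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by hard (majority) voting.

    Assumption: the paper's hybrid scheme is not specified in the abstract;
    majority voting is used here only as one plausible way to combine models.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def evaluate(y_true, y_pred, positive=1):
    """Compute the five metrics named in the abstract from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels (1 = phishing, 0 = legitimate); illustrative only, not paper data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
tree_pred = [1, 1, 0, 0, 0, 1, 1, 0]    # stand-in for Decision Tree output
forest_pred = [1, 0, 1, 0, 0, 0, 1, 0]  # stand-in for Random Forest output
xgb_pred = [1, 1, 1, 0, 1, 1, 0, 0]     # stand-in for XGB output

hybrid_pred = majority_vote(tree_pred, forest_pred, xgb_pred)
scores = evaluate(y_true, hybrid_pred)
print(hybrid_pred)
print(scores)
```

In this toy case each individual model makes at least one error, and the vote corrects all but one of them, which is the general motivation for combining classifiers. With an odd number of binary voters no ties can occur; a soft vote over predicted probabilities would be an alternative design.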
Pradip M. Paithane, "URLGuard: A Holistic Hybrid Machine Learning Approach for Phishing Detection", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.17, No.2, pp. 95-110, 2025. DOI:10.5815/ijieeb.2025.02.05