IJISA Vol. 9, No. 2, 8 Feb. 2017
PP. 18-24
Cost sensitive learning, fraudulent transactions, Bayes minimum risk
In this paper, we study the impact of the false negative (FN) cost on the performance of cost sensitive learning. The investigation is performed on cost sensitive classifiers built with Bayes minimum risk, taken as one example of a mechanism for making a classifier cost sensitive. We consider a case study in credit card fraud detection, where FN refers to the number of fraudulent transactions that go undetected and are approved as legitimate. Our approach tests the performance of various complex cost sensitive classifiers from different categories, all developed using Bayes minimum risk, at different costs of FN. Our results show that those classifiers behave differently at different costs of FN, including the real transaction amount, the average transaction amount, and a range of random constant costs that are greater or less than the average amount. In general, however, the results show that the lower the cost of FN, the better the classifier performance. This leads to a different conclusion from the one drawn in [1], which states that setting the cost of FN equal to the transaction amount leads to better performance of cost sensitive learning using Bayes minimum risk. The results of this paper are based on the real-life, anonymized, and imbalanced UCSD transactional data set.
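As a rough illustration of the mechanism discussed in the abstract, the sketch below applies a Bayes minimum risk decision to a single transaction: it flags the transaction as fraud when the expected cost of flagging does not exceed the expected cost of approving. The function name `flag_as_fraud`, the default false positive and true positive costs (a fixed administrative cost of 1.0), and the zero true negative cost are assumptions made for illustration only; they are not taken from this paper.

```python
def flag_as_fraud(p_fraud, cost_fn, cost_fp=1.0, cost_tp=1.0, cost_tn=0.0):
    """Bayes minimum risk decision for one transaction (illustrative sketch).

    p_fraud : estimated probability that the transaction is fraudulent
    cost_fn : cost of approving a fraudulent transaction (false negative)
    cost_fp, cost_tp, cost_tn : assumed costs of the other three outcomes
    """
    # Expected cost if we flag the transaction as fraud.
    risk_flag = cost_tp * p_fraud + cost_fp * (1.0 - p_fraud)
    # Expected cost if we approve the transaction as legitimate.
    risk_approve = cost_fn * p_fraud + cost_tn * (1.0 - p_fraud)
    # Choose the action with the lower expected cost (ties go to flagging).
    return risk_flag <= risk_approve


if __name__ == "__main__":
    # Same estimated fraud probability, three hypothetical FN costs.
    for cost_fn in (5.0, 50.0, 500.0):
        print(cost_fn, flag_as_fraud(0.05, cost_fn))
```

With these assumed costs, the same probability estimate is approved at a low FN cost and flagged at a higher one, which is why the choice of FN cost can change classifier behavior.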
Doaa Hassan, "The Impact of False Negative Cost on the Performance of Cost Sensitive Learning: A Case Study in Detecting Fraudulent Transactions", International Journal of Intelligent Systems and Applications (IJISA), Vol.9, No.2, pp.18-24, 2017. DOI:10.5815/ijisa.2017.02.03
[1]Alejandro Correa Bahnsen, Aleksandar Stojanovic, Djamila Aouada, and Björn E. Ottersten. Cost sensitive credit card fraud detection using Bayes minimum risk. In 12th International Conference on Machine Learning and Applications, ICMLA 2013, Miami, FL, USA, December 4-7, 2013, Volume 1, pages 333–338, 2013.
[2]Clifton Phua, Vincent C. S. Lee, Kate Smith-Miles, and Ross W. Gayler. A comprehensive survey of data mining-based fraud detection research. CoRR, abs/1009.6119, 2010.
[3]John Akhilomen. Data mining application for cyber credit-card fraud detection system. In Advances in Data Mining. Applications and Theoretical Aspects - 13th Industrial Conference, ICDM 2013, New York, NY, USA, July 16-21, 2013. Proceedings, pages 218–228, 2013.
[4]Sitaram Patel and Sunita Gond. Supervised machine (SVM) learning for credit card fraud detection. International Journal of Engineering Trends and Technology (IJETT), 8(3), 2014.
[5]Masoumeh Zareapoor and Pourya Shamsolmoali. Application of credit card fraud detection: Based on bagging ensemble classifier. Procedia Computer Science, 48:679–685, 2015.
[6]Precision and recall. Available at: https://en.wikipedia.org/wiki/Precision_and_recall.
[7]D. Hand, C. Whitrow, N. M. Adams, P. Juszczak, and D. Weston. Performance Criteria for Plastic Card Fraud Detection Tools. Journal of the Operational Research Society, 59(7):956–962, 2007.
[8]Yusuf Sahin, Serol Bulkan, and Ekrem Duman. A Cost-Sensitive Decision Tree Approach for Fraud Detection. Expert Syst. Appl., 40(15):5916–5923, 2013.
[9]P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison-Wesley, 2005.
[10]Charles Elkan. The foundations of cost-sensitive learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, IJCAI 2001, Seattle, Washington, USA, August 4-10, 2001, pages 973–978, 2001.
[11]Confusion matrix. Available at: https://en.wikipedia.org/wiki/Confusion_matrix.
[12]Predrag Radivojac. Machine learning lecture notes. February, 2015.
[13]Philip K. Chan and Salvatore J. Stolfo. Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), pages 164–168, 1998.
[14]Richard J. Bolton and David J. Hand. Statistical fraud detection: A review. Statistical Science, 17(3):235–249, 2002.
[15]Weka 3: Data mining software in java. Available at: http://www.cs.waikato.ac.nz/ml/weka/.
[16]Xinjian Guo, Yilong Yin, Cailing Dong, Gongping Yang, and Guangtong Zhou. On the class imbalance problem. In 2008 Fourth International Conference on Natural Computation, volume 4, pages 192–201. IEEE, 2008.
[17]Remco R. Bouckaert, E. Frank, M. Hall, R. Kirkby, P. Reutemann, A. Seewald, and D. Scuse. Weka manual (3.7.1), 2010.
[18]UCSD-FICO data mining contest 2009 data set. https://www.cs.purdue.edu/commugrate/data/credit_card/.
[19]K. R. Seeja and Masoumeh Zareapoor. Fraudminer: A novel credit card fraud detection model based on frequent itemset mining. The Scientific World Journal, 2014:10 pages, 2014.
[20]Minyong Lee, Seunghee Ham, and Qiyi Jiang. E-commerce transaction anomaly classification, 2013.
[21]J. V. Hulse and T. M. Khoshgoftaar. Experimental perspectives on learning from imbalanced data. In International Conference on Machine Learning, 2007, pages 155–164, 2007.
[22]http://weka.wikispaces.com/Primer. September, 2010.
[23]Pedro M. Domingos. Metacost: A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 15-18, 1999, pages 155–164, 1999.
[24]Victor S. Sheng and Charles X. Ling. Thresholding for making classifiers cost-sensitive. In Proceedings, The Twenty-First National Conference on Artificial Intelligence and the Eighteenth Innovative Applications of Artificial Intelligence Conference, July 16-20, 2006, Boston, Massachusetts, USA, pages 476–481, 2006.
[25]Michael J. Pazzani, Christopher J. Merz, Patrick M. Murphy, Kamal M. Ali, Timothy Hume, and Clifford Brunk. Reducing misclassification costs. In Proceedings of the Eleventh International Conference on Machine Learning, Rutgers University, New Brunswick, NJ, USA, July 10-13, 1994, pages 217–225, 1994.
[26]Alejandro Correa Bahnsen. Example dependent cost sensitive classification: Applications in financial risk modeling and marketing analytics. PhD thesis, University of Luxembourg, 2015.
[27]J. Wang, P. Zhao, and S. C. H. Hoi. Cost-Sensitive Online Classification. IEEE Transactions on Knowledge and Data Engineering, 26(10):2425–2438, Oct. 2014.