Work place: Department of Computer Science and Informatics, University of Energy and Natural Resources, Sunyani, Ghana
E-mail: owusu.nyarko-boateng@uenr.edu.gh
Research Interests: Computer systems and computational processes, Artificial Intelligence, Computational Learning Theory, Computer Architecture and Organization, Network Architecture, Network Security, Data Structures and Algorithms
Biography
Owusu Nyarko-Boateng is a PhD candidate in Computer Science and a lecturer at the Department of Computer Science and Informatics, University of Energy and Natural Resources, Ghana. He holds an HND in Electrical & Electronics Engineering, a BSc in Computer Science, a PGDE, and an MSc in Information Technology. He previously worked with MTN Ghana and Huawei Technologies (SA) for over ten (10) years. He has in-depth experience in telecommunications transmission systems, including fiber optic cable deployment for long-haul and short-distance (FTTx) links, and the installation and configuration of 2G BTS, WCDMA (3G), and 4G (LTE) plants. He also managed Huawei DWDM OptiX OSN 8800 and Huawei OptiX OSN 1800 equipment and optical distribution frames (ODF). As an academic and researcher, he has developed a passion for the following research areas: machine learning, artificial intelligence, computer networks and data communications, network security, fiber optics technologies, modelling of transmission systems, 5G and 6G technologies, expert systems, and computational intelligence for data communications.
By Isaac Kofi Nti, Owusu Nyarko-Boateng, Justice Aning
DOI: https://doi.org/10.5815/ijitcs.2021.06.05, Pub. Date: 8 Dec. 2021
The value of k in k-fold cross-validation is an essential element of training machine learning predictive models, because it affects the model's measured performance. A well-chosen k results in better accuracy, while a poorly chosen k can degrade the model's performance. In the literature, the most commonly used values of k are five (5) and ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor from very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies have attempted to investigate the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms: Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN). It was observed that the best value of k and the resulting validation performance differ from one machine learning algorithm to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and area under the curve, at a lower computational cost than k = 10, across most of the machine learning algorithms. We discuss the study outcomes in detail and outline some guidelines for beginners in the machine learning field on selecting the best k value and machine learning algorithm for a given task.
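To make the role of k concrete, the following is a minimal sketch in plain Python of how k-fold cross-validation partitions a dataset, and how the number of model fits and the training-set size change across the k values listed in the abstract. It does not reproduce the authors' experiment: the function names `k_fold_indices` and `fold_summary`, the seed, and the dataset size of 1000 are assumptions made purely for illustration.

```python
import random

K_VALUES = [3, 5, 7, 10, 15, 20]  # the k values listed in the abstract


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def fold_summary(n_samples, k):
    """Return (number of model fits, smallest training-set size) for a given k."""
    folds = list(k_fold_indices(n_samples, k))
    return len(folds), min(len(train) for train, _ in folds)


if __name__ == "__main__":
    n = 1000  # hypothetical dataset size, not taken from the paper
    for k in K_VALUES:
        fits, train_size = fold_summary(n, k)
        print(f"k={k:2d}: {fits} fits, at least {train_size} training samples per fit")
```

Each sample is used for validation exactly once, and a larger k means more model fits, each on a larger share of the data. This is the source of the computational cost difference between, for example, k = 7 and k = 10.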