Index Terms: XGBoost, SVM, Random Forest, Machine Learning, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R2 Score, Mean Absolute Percentage Error (MAPE), Explained Variance Score (EVS)
Abstract: Climate change, a significant and lasting alteration in global weather patterns, is profoundly affecting the stability and predictability of global temperature regimes. As the world grapples with its far-reaching effects, accurate and timely temperature predictions have become pivotal to sectors including agriculture, energy, and public health. Crucially, precise temperature forecasting supports the development of effective climate change mitigation and adaptation strategies. With the advent of machine learning techniques, powerful tools are now available that can learn from vast climatic datasets and deliver improved predictive performance. This study compares three such advanced machine learning models, XGBoost, Support Vector Machine (SVM), and Random Forest, in predicting daily maximum and minimum temperatures using a 45-year dataset from Visakhapatnam airport. Each model was rigorously trained and evaluated on key performance metrics, including training loss, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R2 score, Mean Absolute Percentage Error (MAPE), and Explained Variance Score (EVS). Although no single model dominated across all metrics, SVM and Random Forest showed slightly superior performance on several measures. These findings highlight the potential of machine learning to improve the accuracy of temperature forecasting, and they underscore the importance of selecting a model and performance metrics aligned with the requirements of the task at hand. This research provides a thorough comparative analysis, a rigorous evaluation of the models, and evidence for the significance of careful model selection.
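The evaluation protocol described in the abstract, training each regressor and scoring it on MAE, MSE, RMSE, R2, MAPE, and EVS, can be sketched with scikit-learn. This is a minimal illustration on synthetic temperature-like data, not the paper's pipeline: the features, the synthetic seasonal signal, and all hyperparameters are assumptions, and `GradientBoostingRegressor` stands in for XGBoost so the sketch needs only scikit-learn.

```python
# Hedged sketch of the comparison protocol on synthetic data.
# GradientBoostingRegressor is a stand-in for XGBoost; all settings are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import (mean_absolute_error, mean_squared_error, r2_score,
                             mean_absolute_percentage_error, explained_variance_score)

# Synthetic daily-maximum-temperature data: a seasonal sinusoid plus noise.
rng = np.random.default_rng(42)
n = 1000
day = rng.uniform(0, 365, n)
X = np.column_stack([day, rng.normal(size=n)])          # day-of-year + a nuisance feature
y = 30 + 5 * np.sin(2 * np.pi * day / 365) + rng.normal(scale=1.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Gradient Boosting (XGBoost stand-in)": GradientBoostingRegressor(random_state=0),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVR(C=10.0)),  # SVR needs scaled inputs
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    mse = mean_squared_error(y_te, pred)
    results[name] = {
        "MAE": mean_absolute_error(y_te, pred),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "R2": r2_score(y_te, pred),
        "MAPE": mean_absolute_percentage_error(y_te, pred),
        "EVS": explained_variance_score(y_te, pred),
    }
    print(name, results[name])
```

As in the study, no ranking is implied by a single metric here; the per-model dictionary makes it easy to compare candidates metric by metric before choosing one for the task at hand.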
Deep Karan Singh, Nisha Rawat, "Machine Learning for Weather Forecasting: XGBoost vs SVM vs Random Forest in Predicting Temperature for Visakhapatnam", International Journal of Intelligent Systems and Applications (IJISA), Vol. 15, No. 5, pp. 57-69, 2023. DOI: 10.5815/ijisa.2023.05.05