IJMECS Vol. 17, No. 1, 8 Feb. 2025
Pages: 47-58
Computer Science and Technology, Machine Learning, Clustering Method, Feature Extraction, Panel Data
To address the substantial loss of information and features that occurs when similarity measures are designed for clustering high-dimensional panel data, a new panel data clustering method is proposed: an adaptive clustering method for panel data based on multi-dimensional feature extraction. The method defines five features of each sample ("comprehensive quantity", "absolute quantity", "growth rate", "general trend" and "fluctuation quantity") and weights them to calculate the comprehensive distance between samples. Ward's method is then applied to this distance for clustering. This approach can greatly reduce the loss of effective information. To verify the effectiveness of the method, an empirical cluster analysis was conducted using GDP panel data from 31 regions in China, and the clustering results were compared with those of other clustering models. The experimental results showed that the proposed model was more interpretable and produced better clustering results.
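The pipeline described in the abstract (extract five features, weight them into a comprehensive distance, then cluster with Ward's method) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the feature formulas or the weighting scheme, so the five feature definitions, the equal default weights, and the function names `extract_features` and `cluster_panel` are all assumptions made for this example. It assumes a single-indicator panel (for example, GDP per region per year) stored as an array of shape (samples, periods).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def extract_features(panel):
    """Return one row of five features per sample.

    panel: array of shape (n_samples, n_periods), strictly positive values.
    All five definitions below are placeholder assumptions.
    """
    panel = np.asarray(panel, dtype=float)
    t = np.arange(panel.shape[1])
    comprehensive = panel.sum(axis=1)                       # total over the period
    absolute = panel[:, -1]                                 # level in the last period
    growth = (np.diff(panel, axis=1) / panel[:, :-1]).mean(axis=1)  # mean relative change
    trend = np.polyfit(t, panel.T, 1)[0]                    # least-squares slope per sample
    fluctuation = panel.std(axis=1) / panel.mean(axis=1)    # coefficient of variation
    return np.column_stack([comprehensive, absolute, growth, trend, fluctuation])


def cluster_panel(panel, n_clusters, weights=None):
    """Cluster samples with Ward's method on a weighted feature distance."""
    feats = extract_features(panel)
    # Standardize each feature so that no single scale dominates the distance.
    std = feats.std(axis=0)
    std[std == 0] = 1.0
    z = (feats - feats.mean(axis=0)) / std
    # Equal weights are an assumption; the paper's adaptive weights would go here.
    if weights is None:
        w = np.full(z.shape[1], 1.0 / z.shape[1])
    else:
        w = np.asarray(weights, dtype=float)
    # Scaling by sqrt(w) makes the plain Euclidean distance between rows equal to
    # the weighted distance sqrt(sum_k w_k * (z_ik - z_jk)**2).
    scaled = z * np.sqrt(w)
    tree = linkage(scaled, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


if __name__ == "__main__":
    # Synthetic data: three small slow-growing series, three large fast-growing ones.
    low = [[base * 1.02 ** t for t in range(8)] for base in (100, 105, 110)]
    high = [[base * 1.15 ** t for t in range(8)] for base in (500, 520, 540)]
    print(cluster_panel(np.array(low + high), n_clusters=2))
```

Scaling the standardized features by the square root of the weights keeps the distance Euclidean, which is the setting in which Ward's minimum-variance criterion is well defined. A different weighting scheme can be passed through the `weights` argument without changing the rest of the sketch.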
Xiqin Ao, Mideth Abisado, "Adaptive Clustering Method for Panel Data Based on Multi-dimensional Feature Extraction", International Journal of Modern Education and Computer Science (IJMECS), Vol. 17, No. 1, pp. 47-58, 2025. DOI: 10.5815/ijmecs.2025.01.04