Work place: Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, 79013, Ukraine
E-mail: roman.v.romanchuk@lpnu.ua
Website: https://orcid.org/0009-0004-4352-1073
Research Interests: natural language processing, fake detection, computer vision, telecommunication
Biography
Roman Romanchuk is a postgraduate student at the Department of Information Systems and Networks at Lviv Polytechnic National University and a project manager at TietoEvry LLC, Lviv, Ukraine. His research interests are natural language processing, fake detection, computer vision, and telecommunication.
By Zhengbing Hu, Victoria Vysotska, Lyubomyr Chyrun, Roman Romanchuk, Yuriy Ushenko, Dmytro Uhryn, Cennuo Hu
DOI: https://doi.org/10.5815/ijmecs.2025.02.02, Pub. Date: 8 Apr. 2025
The main goal of the work is to create an intelligent system that uses NLP methods and machine learning algorithms to analyse and classify the authorship of textual content. The following machine learning models were trained and tested on the dataset for English and Ukrainian publications: Support Vector Classifier, Random Forest, Naive Bayes, Logistic Regression and Neural Networks. For English, the accuracy of the models was higher because more text data was available. The results for English fiction publications show that the Neural Networks classifier outperforms the other models in all evaluated metrics, achieving the highest accuracy (0.97), recall (0.96), F1 score (0.98), and precision (0.96). This indicates that Neural Networks are particularly effective in capturing distinctive features of the writing styles of different English authors in scientific and technical texts. For the Ukrainian language, accuracy drops by 5-10% because fewer text corpora are available for training. The results for scientific and technical Ukrainian publications show that the Random Forest classifier outperforms the other models in all evaluated metrics, achieving the highest accuracy (0.88), recall (0.87), F1 score (0.87), and precision (0.87). This indicates that Random Forest is particularly effective in capturing distinctive features of the writing styles of different Ukrainian authors in scientific and technical texts. The other models showed much lower accuracy: Support Vector Classifier (77%), Logistic Regression (73%) and Naive Bayes (70%). The results for Ukrainian fiction publications show that the Random Forest classifier again outperforms the other models in all evaluated metrics, achieving the highest accuracy (0.85), recall (0.84), F1 score (0.84), and precision (0.84). The other models again showed much lower accuracy: Support Vector Classifier (77%), Logistic Regression (73%) and Naive Bayes (70%).
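The four metrics reported in the abstract (accuracy, precision, recall, F1 score) can be illustrated with a minimal Python sketch for a multi-class authorship task. The abstract does not say how per-author scores were averaged, so macro averaging is an assumption here, and the function name `classification_metrics` and the author labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (unweighted mean over classes) is an assumption;
    the source does not state which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly attributed to this author
        else:
            fp[pred] += 1    # wrongly attributed to the predicted author
            fn[truth] += 1   # missed for the true author

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical example: six texts by two authors, one misattribution.
y_true = ["author_a", "author_a", "author_a", "author_b", "author_b", "author_b"]
y_pred = ["author_a", "author_a", "author_b", "author_b", "author_b", "author_b"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For each individual author, the F1 score is the harmonic mean of that author's precision and recall, so it always lies between those two values.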