International Journal of Information Technology and Computer Science (IJITCS)

IJITCS Vol. 17, No. 2, Apr. 2025


Table Of Contents

REGULAR PAPERS

Green AI Practices in Multi-objective Hyperparameter Optimization for Sustainable Machine Learning

By K. Jegadeeswari, R. Rathipriya

DOI: https://doi.org/10.5815/ijitcs.2025.02.01, Pub. Date: 8 Apr. 2025

Hyperparameter tuning is an essential step in optimizing machine learning (ML) models because it improves model performance. However, this improvement demands substantial computational resources and time, so model tuning can significantly raise energy consumption and, consequently, carbon emissions. A framework is therefore needed that treats carbon emissions as a key consideration alongside performance. The paper proposes a novel Sustainable Hyperparameter Optimization (SHPO) framework that uses an optimized multi-objective fitness approach. The framework focuses on ensemble classification models (ECMs), namely Random Forest, ExtraTrees, XGBoost, and AdaBoost. These models are optimized using traditional and advanced techniques such as Grid Search, Hyperopt, and Optuna. The proposed framework tracks carbon emissions during hyperparameter tuning and uses the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), a multi-criteria decision-making (MCDM) method, to rank the hyperparameter sets on both accuracy and carbon emissions. The objective of the multi-objective fitness approach is to find the parameter set with high accuracy and low carbon emissions. The experimental results show that Optuna-based hyperparameter optimization consistently produced low carbon emissions and achieved high predictive accuracy across the majority of benchmark hyperparameter setups for the models.
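
The TOPSIS ranking step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `topsis_scores`, the equal weights, and the three trial values below are hypothetical, chosen only to show how a set of hyperparameter trials might be ranked on accuracy (higher is better) and carbon emissions (lower is better).

```python
import math

def topsis_scores(rows, weights, benefit):
    """Return TOPSIS closeness scores (0..1, higher is better) for each row.

    rows    : list of criterion-value lists, one per alternative
    weights : one weight per criterion
    benefit : one bool per criterion, True if higher values are better
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) or 1.0 for j in range(n_crit)]
    v = [[weights[j] * r[j] / norms[j] for j in range(n_crit)] for r in rows]
    cols = list(zip(*v))
    # Ideal and anti-ideal points depend on whether a criterion is a benefit or a cost.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal solution
        d_neg = math.dist(row, anti)    # distance to the anti-ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.5)
    return scores

# Hypothetical trials: (accuracy, carbon emissions in kg CO2-eq). Illustrative values only.
trials = [(0.95, 0.010), (0.90, 0.002), (0.80, 0.012)]
scores = topsis_scores(trials, weights=[0.5, 0.5], benefit=[True, False])
best_index = max(range(len(trials)), key=scores.__getitem__)
```

With these made-up numbers the second trial ranks first: it gives up a little accuracy for much lower emissions, which is the kind of trade-off a multi-objective ranking is meant to surface.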

Unveiling Autism: Machine Learning-based Autism Spectrum Disorder Detection through MRI Analysis

By Chitta Hrudaya Neeharika, Yeklur Mohammed Riyazuddin

DOI: https://doi.org/10.5815/ijitcs.2025.02.02, Pub. Date: 8 Apr. 2025

Although several studies have been conducted using various methodologies, the prediction of autism features in relation to age groups has not been definitively addressed. Research in neuroscience has demonstrated that intracranial brain volume and the corpus callosum provide crucial information for the identification of autism spectrum disorder (ASD). Based on these findings, this paper presents the Decision Tree-based Autism Prediction System (DT-APS) and the Random Forest-based Autism Prediction System (RF-APS) for automatic ASD identification. These systems are based on machine learning techniques and use characteristics extracted from the corpus callosum and intracranial brain volume. By prioritizing the characteristics with the highest discriminatory power for ASD classification, the proposed approaches, DT-APS and RF-APS, have not only enhanced identification accuracy but also simplified the training of machine learning models. The initial step of this method involves dividing each MRI scan into distinct anatomical areas. These areas are adjacent slices in a single 2D image. Each 2D image is mapped to the curvelet space, and a set of GGD parameters characterizes each of the distinct curvelet sub-bands. The AQ-10 dataset was used to evaluate the proposed model. When tested on both types of datasets, the suggested prediction model demonstrated superior performance compared to alternative approaches in all relevant metrics, including accuracy, specificity, sensitivity, precision, and false positive rate (FPR).

Priorities for the Strategic Development of Ukraine's Cybersecurity Based on the Analysis of Expert Sampling Patterns

By Oleksandr Korystin, Serhii Demediuk, Yaroslav Likhovitskyy, Yuriy Kardashevskyy, Olena Mitina

DOI: https://doi.org/10.5815/ijitcs.2025.02.03, Pub. Date: 8 Apr. 2025

The study assesses the risks of future cyber threats based on expert sampling patterns. One of the key problems of modern cybersecurity is the dynamic nature of threats, which change under the influence of technological progress and socio-economic factors. In this context, the authors consider a methodological approach that involves a multi-level analysis of expert opinions. The main emphasis is placed on taking into account the different points of view, experience and professional activities of experts from the public, private and academic sectors. An important stage of the study is the data cleaning procedure, which forms a representative sample that includes only logically consistent expert responses. The paper focuses on integrating the features of the expert sample patterns. The key differences in threat assessments between groups of experts, depending on their professional role and experience, are identified. This made it possible to formulate comprehensive recommendations for strategic cyber risk management focused on both short-term and long-term priorities. The study makes a significant contribution to understanding the peculiarities of cyber risk assessment through the use of multivariate analysis of expert opinions. The proposed methodology not only improves the quality of forecasts of future cyber threats, but also contributes to the creation of adaptive cybersecurity strategies that take into account the specifics of each sector. The findings of the study emphasize the importance of a multidimensional approach to analyzing cyber threats, taking into account the specifics of each expert group. Integration of assessments and consideration of local peculiarities are key to the development of adaptive and effective cyber defense strategies focused on global and local challenges.

Predicting the Occurrence of Cerebrovascular Accident in Patients using Machine Learning Technique

By Edward N. Udo, Anietie P. Ekong, Favour A. Akumute

DOI: https://doi.org/10.5815/ijitcs.2025.02.04, Pub. Date: 8 Apr. 2025

Cerebrovascular disease, commonly known as stroke, is the third leading cause of disability and mortality in the world. In recent years, technological advancements have transformed the way information is acquired and how problems are solved in diverse fields of human endeavor, including the medical and healthcare sectors. Machine Learning (ML) and data-driven techniques have gained prominence in problem solving and have been deployed to predict the occurrence of stroke. This work explores the application of supervised machine learning algorithms for the prediction of stroke, emphasizing the critical need for early prediction to enhance preventive measures. A comprehensive comparison of classification (Support Vector Machine and Random Forest) and regression (Logistic Regression) algorithms was conducted on binary stroke outcome data (likelihood of stroke versus no stroke), using a dataset from the International Stroke Trial database. The Synthetic Minority Oversampling Technique (SMOTE) and K-fold cross-validation were used to address the class imbalance in the datasets. The subsequent model comparison demonstrated distinct strengths and weaknesses among the three models. Random Forest (RF) exhibited a high accuracy score of 89%, while Support Vector Machine (SVM) and Logistic Regression (LR) each showed 86% accuracy. LR demonstrated the most balanced predictive performance, achieving high precision for stroke cases and reasonable recall for both classes.

RSKD Ensemble Classifier with Stable Ensemble Feature Selection for High Dimensional Low Sample Size Cancer Datasets

By Archana Suhas Vaidya, Dipak V. Patil

DOI: https://doi.org/10.5815/ijitcs.2025.02.05, Pub. Date: 8 Apr. 2025

This study presents the RSKD ensemble classifier, developed with ensemble feature selection techniques, to address high-dimensional, low-sample-size cancer datasets. Ensemble classifiers are advantageous in such scenarios, offering better classification accuracy than traditional methods by combining multiple models. This combination enhances predictive performance on high-dimensional datasets. However, stability—a key factor for consistent performance on unseen data—often involves a tradeoff with accuracy. Ensemble methods, due to their generalization capabilities, exhibit higher stability, with feature selection stability measured using a consistency index, averaging 65–70%.
The RSKD classifier integrates the ensemble feature selection methods SU-R and ChS-R, which enhance feature selection stability and classification accuracy. Its performance was evaluated on seven high-dimensional, low-sample-size datasets and compared against state-of-the-art classifiers, including AdaBoost, GradientBoost, REPTree, asBagging_FSS, SRKNN, MF-GE, and eAdaBoost with DSC. The RSKD ensemble classifier achieved an accuracy improvement of 7.69% to 12.35% over these methods. Among the feature selection approaches, SU-R combined with RSKD outperformed ChS-R, demonstrating superior results in cancer prediction tasks.
The findings of this study underscore the potential of RSKD for achieving generalized, robust performance on challenging datasets. By leveraging ensemble classifiers and ensemble feature selection techniques, researchers can address the inherent difficulties of high-dimensional, low-sample-size datasets, enhancing both accuracy and stability. This work provides a valuable foundation for developing diverse, heterogeneous ensemble approaches for cancer prediction and similar applications.
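
The abstract reports feature selection stability through a consistency index but does not say which variant is used. As a sketch under that caveat, the snippet below assumes Kuncheva's consistency index, one common choice, which compares two equal-sized feature subsets of size k drawn from n features with r features in common: (r·n − k²) / (k·(n − k)). The function names and the example subsets are hypothetical.

```python
from itertools import combinations

def consistency_index(subset_a, subset_b, n_features):
    """Kuncheva's consistency index for two equal-sized feature subsets.

    Returns a value in [-1, 1]; 1 means the two subsets are identical.
    Assumption: this is the index meant by the abstract, which the source does not confirm.
    """
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k:
        raise ValueError("subsets must have the same size")
    if not 0 < k < n_features:
        raise ValueError("subset size must satisfy 0 < k < n_features")
    r = len(a & b)  # number of features selected in both runs
    return (r * n_features - k * k) / (k * (n_features - k))

def mean_pairwise_consistency(subsets, n_features):
    """Average the index over all pairs of subsets (e.g. one subset per resampling run)."""
    pairs = list(combinations(subsets, 2))
    return sum(consistency_index(a, b, n_features) for a, b in pairs) / len(pairs)

# Hypothetical feature subsets selected in three resampling runs out of 10 features.
runs = [[1, 2, 3], [1, 2, 4], [1, 2, 3]]
stability = mean_pairwise_consistency(runs, n_features=10)
```

Averaging over all pairs of runs gives a single stability figure that can be reported as a percentage, which is the form the abstract uses.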

An Integrated Approach using Loss Sensitivity Factor and Whale Optimization Algorithm for Distributed Generation Allocation

By Sakthidasan A., Senthil Kumar M., Jovin Deglus, Sabarish P., Rajakumar P., Sundar R.

DOI: https://doi.org/10.5815/ijitcs.2025.02.06, Pub. Date: 8 Apr. 2025

The incorporation of distributed generation (DG) in radial distribution systems (RDS) has recently garnered much attention. The prime goal of DG integration is to generate power locally and cut down the total power losses (PL) of the RDS to increase overall efficiency. The present work suggests a hybrid optimization approach that integrates the loss sensitivity factor (LSF) with the whale optimization algorithm (WOA) to optimize different categories of DG. The LSF locates the ideal site, and the WOA optimizes the size. The present study optimizes DG units to minimize the total active power losses (APLT) and enhance the bus voltages (BV). The adaptability of the proposed integrated technique is investigated on the small 33-bus and the large 118-bus RDS. The APLT of the 33-bus RDS is reduced from 210.98 kW to 101.3 kW, 124.3 kW, 64.56 kW, and 86.5 kW for Type I, Type II, Type III, and Type IV DG placements, respectively. Correspondingly, the minimum bus voltage (BVmin) is increased from 0.9038 p.u. to 0.9511 p.u., 0.9503 p.u., 0.9608 p.u., and 0.9579 p.u. Likewise, significant PL reduction and bus voltage enhancement are obtained in the 118-bus RDS for three units of Type I and Type III DG placements. Further, the adequacy of the hybrid technique is examined for varying power demand on the IEEE 33-bus RDS. The integrated technique effectively narrows the search space of the optimization problem and helps the WOA find the optimal solution. The simulation outcomes are compared with those of other methods to examine the superiority of the proposed optimization technique.

A Survey of Techniques for Improving Information Retrieval through Query Expansion

By Surabhi Solanki, Seema Verma, Sachin Kumar

DOI: https://doi.org/10.5815/ijitcs.2025.02.07, Pub. Date: 8 Apr. 2025

This paper presents a comprehensive survey of query expansion (QE) techniques in information retrieval (IR). Core techniques, the data sources employed, and the methodologies used in the process of query expansion are discussed. The study highlights four main steps in expanding queries: preprocessing of data sources and term extraction, calculation of weights and ranking of terms, selection of terms, and finally expansion. The most important findings are that effective text normalization and stopword removal provide the necessary foundation for QE, and that introducing contextually relevant terms significantly enhances relevance feedback and thesaurus-based (WordNet) expansion techniques, which experiments conducted over many years have shown to significantly improve retrieval effectiveness. The survey also covers manual query expansion techniques and discusses several automated approaches to improving retrieval effectiveness. By reviewing the related literature and methodologies, this work gives an overview of how query expansion techniques have evolved over time and achieved better results in IR systems. The survey offers a valuable resource for researchers and practitioners in information retrieval, shedding light on the advancements, challenges, and future directions in query expansion research.
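
The four steps named in this abstract (preprocessing and term extraction, weighting and ranking, selection, expansion) can be sketched as a toy pseudo-relevance-feedback expander. This is an illustration only, not a method taken from the survey: the stopword list, the document-frequency weighting, the function names, and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "on"}

def preprocess(text):
    """Step 1a: normalise text (lowercase, tokenise) and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def expand_query(query, feedback_docs, k=2):
    """Expand a query with the k highest-weighted terms from feedback documents."""
    query_terms = preprocess(query)
    # Step 1b: extract candidate terms from the data source (the feedback documents).
    doc_freq = Counter()
    for doc in feedback_docs:
        doc_freq.update(set(preprocess(doc)))  # count each term once per document
    # Step 2: weight terms by document frequency and rank them (ties broken alphabetically).
    candidates = [(t, w) for t, w in doc_freq.items() if t not in query_terms]
    ranked = sorted(candidates, key=lambda tw: (-tw[1], tw[0]))
    # Step 3: select the top-k terms.
    selected = [t for t, _ in ranked[:k]]
    # Step 4: append the selected terms to the original query.
    return query_terms + selected

# Hypothetical feedback documents, e.g. the top results of an initial retrieval run.
docs = [
    "The jaguar is a big cat in the Americas",
    "Jaguar cat speed and hunting",
    "Top speed of the jaguar cat",
]
expanded = expand_query("jaguar speed", docs, k=2)
```

Real systems typically replace the document-frequency weight with a stronger scoring function and may draw candidates from a thesaurus instead of feedback documents, but the four-stage shape stays the same.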
