IJISA Vol. 15, No. 5, Oct. 2023
The operational efficiency of water supply infrastructure has a direct impact on the quantity of potable water available to end users. Water supply infrastructure in a declining operational state is commonplace in rural and some urban centers in developing countries. Maintenance issues result in unabated wastage and supply shortages for users. This work proposes a cost-effective solution to the problem of water distribution losses, using a microcontroller-based digital control method and Machine Learning (ML) to forecast and manage potable water production and system maintenance. The fundamental concept of hydrostatic pressure equilibrium was used to detect and control leakages from pipeline segments. Analysis of the collated data shows a direct linear relationship between water distribution loss and production quantity, and an inverse relationship between Mean Time Between Failures (MTBF) and yearly failure rates; these are the key problem factors affecting water supply efficiency and availability. Results from the prototype system test show a water supply efficiency of 99%, as distribution loss was reduced to 1% by the Line Control Unit (LCU) installed on the prototype pipeline. Using hydrostatic pressure equilibrium as the logic criterion for leak detection and control proved effective for significantly improving the efficiency of the water supply infrastructure.[...]
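The quantities named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pressure tolerance, and the two-sensor layout are all assumptions made for illustration. It assumes a static water column, where the pressure difference between two points equals density times gravity times height difference, and flags a segment when the measured difference departs from that value.

```python
RHO_WATER = 1000.0  # kg/m^3, density of fresh water
G = 9.81            # m/s^2, gravitational acceleration


def expected_static_pressure_diff(height_drop_m):
    """Hydrostatic pressure difference (Pa) across a vertical drop in static water."""
    return RHO_WATER * G * height_drop_m


def segment_leaks(p_upstream_pa, p_downstream_pa, height_drop_m, tolerance_pa=2000.0):
    """Return True when the measured difference departs from equilibrium.

    The tolerance is a hypothetical value, not taken from the paper.
    """
    measured = p_downstream_pa - p_upstream_pa
    expected = expected_static_pressure_diff(height_drop_m)
    return abs(measured - expected) > tolerance_pa


def supply_efficiency(produced_m3, lost_m3):
    """Percentage of produced water that reaches users."""
    return 100.0 * (produced_m3 - lost_m3) / produced_m3


def mtbf_hours(failures_per_year, operating_hours_per_year=8760.0):
    """Mean time between failures; it falls as the yearly failure count rises."""
    return operating_hours_per_year / failures_per_year
```

For example, a 1% distribution loss corresponds to `supply_efficiency(100.0, 1.0)`, and doubling the yearly failure count halves the value returned by `mtbf_hours`, which is the inverse relationship the abstract describes.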
This paper analyses the performance of machine learning models in forecasting the Tehran Stock Exchange's automobile index. Historical daily data from 2018 to 2022 were pre-processed and used to train Linear Regression (LR), Support Vector Regression (SVR), and Random Forest (RF) models. The models were evaluated on mean absolute error, mean squared error, root mean squared error, and R2 score. The results indicate that LR and SVR outperformed RF in predicting automobile stock prices, with LR achieving the lowest error scores. This demonstrates the capability of machine learning techniques to model complex, nonlinear relationships in financial time series data. This pioneering study on a previously unexplored dataset provides empirical evidence that LR and SVR can reliably forecast automobile stock market prices, holding promise for investment applications.[...]
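The four evaluation metrics listed in this abstract have standard definitions, sketched below in plain Python. The function name and the sample values in the usage note are illustrative assumptions; nothing here reproduces the paper's data or results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R2 for paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # R2 compares the residual sum of squares with the spread around the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}
```

Calling `regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])` shows how a single large miss affects each metric differently: MSE and RMSE weight it more heavily than MAE does.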
Data Structures and Algorithms (DSA) is a widely explored domain in computer science. Because it is a crucial topic in software engineering interviews, it should not be taken lightly. Various platforms are available for learning a particular data structure or algorithm, its implementation, and related programming problems. HackerRank, LeetCode, GeeksForGeeks (GFG), and Codeforces are popular platforms that offer vast collections of programming problems for building these skills. However, given the sheer volume of DSA content available, it is challenging for users to identify which problems to focus on after studying a given domain. This work uses a Content-Based Filtering (CBF) recommendation engine to suggest programming questions related to different DSA topics such as arrays, linked lists, trees, and graphs. The recommendations are generated using Natural Language Processing (NLP). The data set consists of approximately 500 problems, each represented by features such as the problem statement, related topics, level of difficulty, and platform link. Standard measures such as cosine similarity, accuracy, precision, recall, and F1-score are used to determine the proportion of correctly recommended problems; each percentage indicates how well the system performed on that measure. The results show that CBF achieves an accuracy of 83%, a precision of 83%, a recall of 80%, and an F1-score of 80%. The recommendation system is deployed in a web application whose user interface also gives access to other features. Together, these form a complete e-learning application to aid prospective software engineers and computer science students. In the future, two more recommendation systems, Collaborative Filtering (CF) and a hybrid system, can be implemented to compare the approaches and decide which is most suitable for the given problem statement.[...]
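Content-based filtering with cosine similarity, as named in this abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the four-problem catalogue, the whitespace tokenizer, and the TF-IDF weighting are all hypothetical choices, since the abstract does not specify how problem text is vectorized.

```python
import math
from collections import Counter

# Hypothetical catalogue: problem title -> descriptive keywords.
PROBLEMS = {
    "Reverse a linked list": "linked list pointers reverse nodes",
    "Detect cycle in linked list": "linked list cycle pointers nodes",
    "Binary tree inorder traversal": "tree traversal recursion nodes",
    "Shortest path in a graph": "graph shortest path bfs",
}


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(solved, docs, top_k=2):
    """Rank the other problems by similarity to the one the user just solved."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[solved], vec), name)
        for name, vec in vectors.items()
        if name != solved
    ]
    scores.sort(reverse=True)
    return [name for _, name in scores[:top_k]]
```

With this toy catalogue, `recommend("Reverse a linked list", PROBLEMS)` ranks the other linked-list problem first because it shares the most weighted terms.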
The detection of outliers in text documents is a highly challenging task, primarily due to the unstructured nature of documents and the curse of dimensionality. A text document outlier is a document whose text deviates from that found in other documents belonging to the same category. Mining text document outliers has wide applications in various domains, including spam email identification, digital libraries, medical archives, enhancing the performance of web search engines, and cleaning corpora used in document classification. To address the issue of dimensionality, it is crucial to employ feature selection techniques that reduce the large number of features without compromising their representativeness of the domain. In this paper, we propose a hybrid density-based approach that incorporates mutual information for text document outlier detection. The proposed approach utilizes normalized mutual information to identify the most distinctive features that characterize the target domain. Subsequently, we customize the well-known density-based local outlier factor algorithm to suit text document datasets. To evaluate the effectiveness of the proposed approach, we conduct experiments on twelve high-dimensional datasets, both synthetic and real. The results demonstrate that the proposed approach consistently outperforms conventional methods, achieving an average improvement of 5.73% in terms of the AUC metric. These findings highlight the gains achieved by combining normalized mutual information with a density-based algorithm, particularly on high-dimensional datasets.[...]
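The feature-selection step named in this abstract scores each term by its normalized mutual information with the category label. A minimal sketch follows, assuming discrete values (for example, term present or absent) and normalization by the geometric mean of the two entropies; several normalizations are in common use, and the abstract does not say which one the authors adopt.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information in bits between two aligned discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def normalized_mutual_information(xs, ys):
    """MI divided by sqrt(H(X) * H(Y)); 0.0 when either variable is constant."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    return mutual_information(xs, ys) / math.sqrt(hx * hy)
```

A term whose presence perfectly tracks the category scores 1.0, while a term that is independent of the category scores 0.0, so ranking terms by this score keeps the most category-specific ones.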
Climate change, a significant and lasting alteration in global weather patterns, is profoundly impacting the stability and predictability of global temperature regimes. As the world continues to grapple with the far-reaching effects of climate change, accurate and timely temperature predictions have become pivotal to various sectors, including agriculture, energy, and public health. Crucially, precise temperature forecasting assists in developing effective climate change mitigation and adaptation strategies. With the advent of machine learning techniques, we now have powerful tools that can learn from vast climatic datasets and provide improved predictive performance. This study compares three such machine learning models, XGBoost, Support Vector Machine (SVM), and Random Forest, in predicting daily maximum and minimum temperatures using a 45-year dataset from Visakhapatnam airport. Each model was trained and evaluated on key performance metrics including training loss, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R2 score, Mean Absolute Percentage Error (MAPE), and Explained Variance Score. Although no single model dominated across all metrics, SVM and Random Forest showed slightly superior performance on several measures. These findings not only highlight the potential of machine learning techniques in enhancing the accuracy of temperature forecasting but also stress the importance of selecting a model and performance metrics aligned with the requirements of the task at hand. This research provides a thorough comparative analysis, conducts a rigorous evaluation of the models, and highlights the significance of model selection.[...]
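Two of the metrics listed in this abstract, MAPE and Explained Variance Score, are less familiar than the squared-error family. The sketch below uses their standard definitions; the function names and the temperatures in the usage note are illustrative assumptions, not values from the study.

```python
def mean_absolute_percentage_error(y_true, y_pred):
    """MAPE in percent; assumes no actual value is zero."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance_score(y_true, y_pred):
    """1 - Var(residuals) / Var(actuals); a constant bias is not penalized."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)
```

For hypothetical daily maxima `[30.0, 32.0, 34.0]` predicted as `[31.0, 33.0, 35.0]`, the explained variance is 1.0 even though every forecast is one degree too high, while MAPE still registers the error. This difference is one reason the choice of metric matters when comparing models.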