Information Technology for Sound Analysis and Recognition in the Metropolis based on Machine Learning Methods

PDF (3141 KB), pp. 40-72


Author(s)

Lyubomyr Chyrun 1, Victoria Vysotska 2,3, Stepan Tchynetskyi 2, Yuriy Ushenko 4,*, Dmytro Uhryn 4

1. Applied Mathematics Department, Faculty of Applied Mathematics and Informatics, Ivan Franko National University of Lviv, Lviv, 79000, Ukraine

2. Department of Information Systems and Networks, Institute of Computer Sciences and Information Technologies, Lviv Polytechnic National University, Lviv, 79013, Ukraine

3. Osnabrück University, Osnabrück, 49076, Germany

4. Department of Computer Science, Educational and Research Institute of Physical, Technical and Computer Sciences, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58012, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2024.06.03

Received: 22 Oct. 2023 / Revised: 15 Mar. 2024 / Accepted: 11 May 2024 / Published: 8 Dec. 2024

Index Terms

Data Augmentation, Intelligent System, Application, Sound Waves, Sound Spectrum, SkLearn, Feature Extraction, Sound Analysis, Machine Learning Methods

Abstract

The goal of designing and implementing an intelligent information system for the recognition and classification of sound signals is to create an effective software-level solution for the analysis, recognition, classification and forecasting of sound signals in metropolises and smart cities using machine learning methods. Such a system can simplify work in many fields: it can help farmers protect their crops from animals; in the military domain it can assist with the identification of weapons and the detection of flying objects such as drones or missiles; in the future it may also estimate the distance to a sound source; and in cities it can support security as a preventive response system that checks, based on sounds, whether everything is in order. It can also make everyday life safer for people with impaired hearing by detecting danger. In the comparison of analogues of the developed product, four analogues were considered: Shazam, sound recognition from Apple, Vocapia, and SoundHound. A comparison table was compiled for these analogues and the product under development, and, after comparing the analogues, a table for evaluating the effects of the development was built. In the system analysis section, audio research materials were reviewed to identify the characteristics that can be used in this design: period, amplitude, and frequency; an article on real-world audio applications is given as an example. A use-case scenario is described using the RUP methodology, and UML diagrams are constructed: a use case diagram, class diagram, activity diagram, sequence diagram, component diagram, and deployment diagram. Sound data analysis was also performed: the sound data were visualised as spectrograms and sound waves, which clearly show that the data differ and can therefore be classified using machine learning methods. An experimental selection of machine learning methods among standard classifiers was carried out to build a sound recognition model. The best method turned out to be SVC, which achieved an accuracy of just over 30 per cent. A neural network was also implemented to improve the obtained results: after training for 100 epochs, the model achieved 97.7% accuracy on the training data and 47.8% accuracy on the test data. This result should be higher, so improving the recognition algorithms, increasing the amount of data, and changing the recognition method need to be considered. Testing of the project was carried out, demonstrating its operation and pointing out shortcomings to be corrected in the future.
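As an illustration of the approach described above, the following is a minimal sketch (not the authors' exact pipeline) of how sound clips can be visualised, reduced to MFCC features with Librosa, and compared across standard SkLearn classifiers, including SVC, which the paper reports as the strongest baseline; the file list, labels, and parameter values are hypothetical placeholders.

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import ExtraTreesClassifier, AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def plot_clip(path):
    # Waveform and log-frequency spectrogram of a single clip (hypothetical path).
    y, sr = librosa.load(path)
    fig, (ax_wave, ax_spec) = plt.subplots(2, 1, figsize=(8, 6))
    librosa.display.waveshow(y, sr=sr, ax=ax_wave)
    S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
    librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="log", ax=ax_spec)
    plt.tight_layout()
    plt.show()

def extract_features(path, n_mfcc=40):
    # Average MFCCs over time to obtain a fixed-length feature vector per clip.
    y, sr = librosa.load(path)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# file_paths and labels are assumed to come from a labelled sound dataset
# such as ESC-50; they are placeholders here, not part of the paper's code.
X = np.array([extract_features(p) for p in file_paths])
y = np.array(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Compare standard classifiers on the held-out split.
models = {
    "SVC": SVC(),
    "ExtraTrees": ExtraTreesClassifier(),
    "AdaBoost": AdaBoostClassifier(),
    "MLP": MLPClassifier(max_iter=500),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))

A deeper model, such as the neural network trained for 100 epochs in the paper, can then be built on the same feature matrix if the baseline accuracy proves insufficient.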

Cite This Paper

Lyubomyr Chyrun, Victoria Vysotska, Stepan Tchynetskyi, Yuriy Ushenko, Dmytro Uhryn, "Information Technology for Sound Analysis and Recognition in the Metropolis based on Machine Learning Methods", International Journal of Intelligent Systems and Applications (IJISA), Vol.16, No.6, pp. 40-72, 2024. DOI:10.5815/ijisa.2024.06.03

Reference

[1]Destroyer birds: how to protect your farm from birds? 2015. Kurkul. URL: https://kurkul.com/blog/80-ptahi-nischivniki-yak-zahistiti-svoye-gospodarstvo-vid-pernatih
[2]HESA Shahed-136. 2022. Militaryfactory. URL: https://www.militaryfactory.com/aircraft/detail.php?aircraft_id=2520
[3]T. Basyuk, A. Vasyliuk, Peculiarities of matching the text and sound components in the Ukrainian language system development, CEUR Workshop Proceedings 3723 (2024) 466-483.
[4]T. Kovaliuk, I. Yurchuk, O. Gurnik, Topological structure of Ukrainian tongue twisters based on speech sound analysis, CEUR Workshop Proceedings 3723 (2024) 328-339.
[5]Peleshchak, R.M., Kuzyk, O.V., Dan'kiv, O.O., The influence of ultrasound on formation of self-organized uniform nanoclusters, Journal of Nano- and Electronic Physics 8(2) (2016) 02014.
[6]Peleshchak, R., Kuzyk, O., Dan'kiv, O., The criteria of formation of InAs quantum dots in the presence of ultrasound, in: International Conference on Nanomaterials: Applications and Properties, NAP, 01NNPT06 (2017).
[7]Peleshchak, R.M., Kuzyk, O.V., Dan'kiv, O.O., The influence of ultrasound on the energy spectrum of electron and hole in InAs/GaAs heterosystem with InAs quantum dots, Journal of Nano- and Electronic Physics 8(4) (2016) 04064.
[8]Altexsoft. 2022. Audio Analysis With Machine Learning: Building AI-Fueled Sound Detection App. URL: https://www.altexsoft.com/blog/audio-analysis/
[9]V. Motyka, Y. Stepaniak, M. Nasalska, V. Vysotska, People's Emotions Analysis while Watching YouTube Videos, CEUR Workshop Proceedings, 3403, 2023, pp. 500–525.
[10]O. Turuta, I. Afanasieva, N. Golian, V. Golian, K. Onyshchenko, Daniil Suvorov, Audio processing methods for speech emotion recognition using machine learning, CEUR Workshop Proceedings 3711 (2024) 75-108.
[11]Audio Deep Learning Made Simple: Sound Classification, Step-by-Step. 2021. Towardsdatascience. URL: https://towardsdatascience.com/audio-deep-learning-made-simple-sound-classification-step-by-step-cebc936bbe5
[12]Sartiukova, O. Markiv, V. Vysotska, I. Shakleina, N. Sokulska, I. Romanets, Remote Voice Control of Computer Based on Convolutional Neural Network, in: Proceedings of the IEEE 12th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Dortmund, Germany, 07-09 September 2023, pp. 1058 – 1064.
[13]Koshovyy, V., Ivantyshyn, O., Mezentsev, V., Rusyn, B., Kalinichenko, M., Influence of active cosmic factors on the dynamics of natural infrasound in the Earth's atmosphere, Romanian Journal of Physics 65(9-10) (2020) 813, pp. 1–10.
[14]Soundproofcow. What Are the Characteristics of a Sound Wave? URL: https://www.soundproofcow.com/characteristics-of-sound-wave/
[15]Vladyslav Tsap, Nataliya Shakhovska, Ivan Sokolovskyi, The Developing of the System for Automatic Audio to Text Conversion, in: CEUR Workshop Proceedings, Vol-2917, 2021, pp. 75-84.
[16]Basystiuk, O., Shakhovska, N., Bilynska, V., ...Shamuratov, O., Kuchkovskiy, V., The developing of the system for automatic audio to text conversion. In: CEUR Workshop Proceedings, 2021, 2824, pp. 1–8
[17]L. Kobylyukh, Z. Rybchak, O. Basystiuk, Analyzing the Accuracy of Speech-to-Text APIs in Transcribing the Ukrainian Language, CEUR Workshop Proceedings, Vol-3396, 2023, 217-227.
[18]K. Tymoshenko, V. Vysotska, O. Kovtun, R. Holoshchuk, S. Holoshchuk, Real-time Ukrainian text recognition and voicing, CEUR Workshop Proceedings, Vol-2870, 2021, pp. 357-387.
[19]Trysnyuk, V., Nagornyi, Y., Smetanin, K., Humeniuk, I., & Uvarova, T. (2020). A method for user authenticating to critical infrastructure objects based on voice message identification. Advanced Information Systems, 4(3), 11–16. https://doi.org/10.20998/2522-9052.2020.3.02
[20]Bisikalo, O., Boivan, O., Khairova, N., Kovtun, O., Kovtun, V., Precision automated phonetic analysis of speech signals for information technology of text-dependent authentication of a person by voice. In: CEUR Workshop Proceedings, 2021, 2853, pp. 276–288
[21]ScienceDirect. 2021. Sound-spectrogram based automatic bird species recognition using MLP classifier. URL: https://www.sciencedirect.com/science/article/abs/pii/S0003682X21001705
[22]Apple. 2022. Shazam turns 20. URL: https://www.apple.com/sa/newsroom/2022/08/shazam-turns-20/
[23]Wayback Machine. 2009. Shazam names that tune. URL: https://web.archive.org/web/20120807220614/http://www.director.co.uk/magazine/2009/11%20December/shazam_63_04.html
[24]Indianexpress. 2021. What is Sound Recognition in iOS 14 and how does it work? URL: https://indianexpress.com/article/technology/mobile-tabs/what-is-sound-recognition-in-ios-14-and-how-does-to-work-7311903/
[25]TopAI.tools. Vocapia. URL: https://topai.tools/t/vocapia
[26]Vocapia. 2023. Speech to Text Software. URL: https://www.vocapia.com/
[27]SoundHoundAI. Voice AI platform. URL: https://www.soundhound.com/voice-ai-products/platform/
[28]Tutorialspoint. (2018). System Analysis and Design – Overview. URL: https://www.tutorialspoint.com/system_analysis_and_design/system_analysis_and_design_overview.htm 
[29]Business Analysts Handbook. User Requirements. URL: https://businessanalyst.fandom.com/wiki/User_Requirements
[30]MEDIUM. 2023. Understand Linear Support Vector Classifier (SVC) In Machine Learning a Classification Algorithm. URL: https://blog.tdg.international/understand-linear-support-vector-classifier-svc-in-maschine-learning-a-classification-algorithm-3deb385f6e7d
[31]MEDIUM. 2022. What? When? How?: ExtraTrees Classifier. URL: https://towardsdatascience.com/what-when-how-extratrees-classifier-c939f905851c
[32]AlmaBetter. AdaBoost Algorithm in Machine Learning. URL: https://www.almabetter.com/bytes/tutorials/data-science/adaboost-algorithm
[33]MEDIUM. 2021. A Multi-layer Perceptron Classifier in Python; Predict Digits from Gray-Scale Images of Hand-Drawn Digits from 0 Through 9. URL: https://medium.com/@polanitzer/a-multi-layer-perceptron-classifier-in-python-predict-digits-from-gray-scale-images-of-hand-drawn-44936176be33
[34]OpenSource. What is Python?. URL: https://opensource.com/resources/python
[35]scikit-learn. Machine Learning in Python. URL: https://scikit-learn.org/stable/
[36]Librosa. Librosa. URL: https://librosa.org/doc/latest/index.html
[37]Matplotlib. Matplotlib: Visualization with Python. URL: https://matplotlib.org/
[38]Seaborn. seaborn: statistical data visualization. URL: https://seaborn.pydata.org/
[39]NumPy. NumPy documentation. URL: https://numpy.org/
[40]Pandas. pandas documentation. URL: https://pandas.pydata.org/
[41]Devopedia. 2021. Audio Feature Extraction. URL: https://devopedia.org/audio-feature-extraction
[42]Krishna Kumar, "Audio classification using ML methods," in M.Tech Artificial Intelligence, REVA Academy for Corporate Excellence - RACE, REVA University, Bengaluru, India.
[43]Find any sound you like. https://freesound.org/
[44]ESC-50: Dataset for Environmental Sound Classification. https://github.com/karolpiczak/ESC-50?tab=readme-ov-file
[45]ESC-50 audio classification. https://github.com/shibuiwilliam/audio_classification_keras/blob/master/esc50_classification.ipynb