An Extensive Review of Feature Extraction Techniques, Challenges and Trends in Automatic Speech Recognition

Author(s)

Vidyashree Kanabur 1,*, Sunil S Harakannanavar 1, Dattaprasad Torse 2

1. Department of Electronics and Communication Engineering, S. G. Balekundri Institute of Technology, Belagavi, India

2. Department of Electronics and Communication Engineering, KLS Gogte Institute of Technology, Belagavi, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2019.05.01

Received: 7 Jan. 2019 / Revised: 17 Jan. 2019 / Accepted: 28 Jan. 2019 / Published: 8 May 2019

Index Terms

Automatic Speech Recognition, Feature Extraction, Acoustics, Phonemes, Pattern Recognition, Artificial Intelligence

Abstract

Speech is the natural mode of communication between humans. Human-to-machine interaction has gained importance over the past few decades, which demands that machines be able to analyze, respond and perform tasks at the same speed as humans. This task is achieved by an Automatic Speech Recognition (ASR) system, which is typically a speech-to-text converter. To identify areas for further research in ASR, one must be aware of the current approaches, the challenges faced by each, and the issues that need to be addressed. Therefore, this paper discusses the human speech production mechanism. The various speech recognition techniques and models are addressed in detail. The performance parameters that measure the accuracy of the system in recognizing the speech signal are described.

Cite This Paper

Vidyashree Kanabur, Sunil S Harakannanavar, Dattaprasad Torse, "An Extensive Review of Feature Extraction Techniques, Challenges and Trends in Automatic Speech Recognition", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 11, No. 5, pp. 1-12, 2019. DOI: 10.5815/ijigsp.2019.05.01
