IJIGSP Vol. 8, No. 9, 8 Sep. 2016
Full Text (PDF, 875KB), PP.17-25
Gender recognition, Hindi, mel-frequency, delta, delta-delta, neural network
Speech recognition technology can be embedded in a variety of real-time applications to improve human-computer interaction. From robotics to health care and aerospace, and from interactive voice response systems to mobile telephony and telematics, it has enhanced human-machine interaction. Gender recognition is an important component of applications that embed speech recognition, because it reduces the computational complexity of subsequent processing. This paper extracts one of the most dominant and most widely researched speech features, the Mel coefficients, together with their first- and second-order derivatives (delta and delta-delta). We extracted 13 values for each of these from a data set of 46 speech samples containing the Hindi vowels (आ, इ, ई, उ, ऊ, ऋ, ए, ऎ, ऒ, ऑ) and used them to train a stacked model combining SVM and neural network classifiers to determine the speaker's gender. The results showed an accuracy of 93.48% after taking the first Mel coefficient into consideration. The purpose of this study was to extract the correct features and to compare performance based on the first Mel coefficient.
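As a rough illustration of the feature layout described above (13 Mel coefficients plus 13 first-order and 13 second-order derivatives per frame), the following minimal sketch computes delta and delta-delta values with a commonly used regression formula. The function names `delta` and `stack_features`, the window size `n=2`, the edge-padding rule, and the toy input are all assumptions made for illustration; the abstract does not state how the authors computed the derivatives.

```python
def delta(frames, n=2):
    """Regression-based delta of per-frame coefficient vectors.

    d[t] = sum_{i=1..n} i * (c[t+i] - c[t-i]) / (2 * sum_{i=1..n} i^2)
    Frames past either end are replaced by the nearest edge frame
    (an assumed padding rule, not taken from the paper).
    """
    if not frames:
        return []
    denom = 2 * sum(i * i for i in range(1, n + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for i in range(1, n + 1):
                ahead = frames[min(t + i, last)][k]
                behind = frames[max(t - i, 0)][k]
                acc += i * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mel):
    """Concatenate coefficients, delta, and delta-delta for every frame."""
    d1 = delta(mel)   # first-order derivative
    d2 = delta(d1)    # second-order derivative (delta of the delta)
    return [c + a + b for c, a, b in zip(mel, d1, d2)]


# Toy input (hypothetical): 5 frames of 13 coefficients, not real speech data.
mel = [[float(t + k) for k in range(13)] for t in range(5)]
features = stack_features(mel)
```

Each row of `features` then holds 39 values per frame, which is the kind of vector that could be passed to the SVM and neural network classifiers.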
Anjali Pahwa, Gaurav Aggarwal, "Speech Feature Extraction for Gender Recognition", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 8, No. 9, pp. 17-25, 2016. DOI: 10.5815/ijigsp.2016.09.03