IJIGSP Vol. 5, No. 9, 8 Jul. 2013
Cover page and Table of Contents: PDF (size: 518KB)
Full Text (PDF, 518KB), PP.14-20
Views: 0 Downloads: 0
Speaker identification, monolingual, crosslingual, multilingual, language parameters, Kannada
In this paper we demonstrate the impact of language parameter variability on mono, cross and multi-lingual speaker identification under limited data condition. The languages considered for the study are English, Hindi and Kannada. The speaker specific features are extracted using multi-taper mel-frequency cepstral coefficients (MFCC) and speaker models are built using Gaussian mixture model (GMM)-universal background model (UBM). The sine-weighted cepstrum estimators (SWCE) with 6 tapers are considered for multi-taper MFCC feature extraction. The mono and cross-lingual experimental results show that the performance of speaker identification trained and/or tested with Kannada language is decreased as compared to other languages. It was observed that a database free from ottakshara, arka and anukaranavyayagalu results a good performance and is almost equal to other languages.
Nagaraja B.G.,H.S. Jayanna,"Kannada Language Parameters for Speaker Identification with The Constraint of Limited Data", IJIGSP, vol.5, no.9, pp.14-20, 2013. DOI: 10.5815/ijigsp.2013.09.03
[1]Reynolds D.A. and Rose R.C., "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models", IEEE Trans. Speech and Audio Processing. 72-83, 1995.
[2]Arjun P.H., "Speaker Recognition in Indian Languages: A Feature Based Approach", Ph.D. dissertation, Indian Institute of Technology, Kharagpur, India, 2005.
[3]Jayanna H.S., "Limited data Speaker Recognition", Ph.D. dissertation, Indian Institute of Technology, Guwahati, India, 2009.
[4]Zhao Jing, Gong Wei-guo and Yang Li-ping., "Mandarin-Sichuan dialect bilingual text-independent speaker verification using GMM", Journal of Computer Applications, vol. 28, no. 3, pp. 792-794, 2008.
[5]Bhattacharjee U. and Sarmah K., "A multilingual speech database for speaker recognition", Proc. IEEE, ISPCC, pp. 15-17, 2012.
[6]Nagaraja B.G. and Jayanna H.S., "Mono and Cross-lingual speaker identification with the constraint of limited data", Proc. IEEE, PRIME, pp. 439-449, 2012.
[7]Kinnunen T, Saeidi R, Sedlak F, Lee K.A, Sandberg J, Hansson-Sandsten M. and Li H., "Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification", IEEE Transaction on Audio, Speech and Language Processing 20, pp. 1990-2001, 2012.
[8]Kinnunen T, Saeidi R, Sandberg J. and Hansson-Sandsten M., "What Else is New Than the Hamming- Window? Robust MFCCs for Speaker Recognition via Multitapering", Proc. Interspeech 2010, pp. 2734-2737, 2010.
[9]Percival D.B. and Walden A.T., "Spectral Analysis for Physical Applications", Cambridge, MA: Cam- bridge Univ. Press, 1993.
[10]Thomson D.J., "Spectrum estimation and harmonic analysis", Proc. IEEE, pp. 1055-1096, 1982.
[11]Reynolds D.A., "Universal Background Models", Encyclopedia of Biometric Recognition, Springer, Journal Article, Feb. 2008.
[12]Geoffrey Durou., "Multilingual text-independent speaker identification", Proc. MIST'99 Workshop, Leusden, Netherlands. pp.115-118, 1999.
[13]Sonali Nag, "Early reading in Kannada: the pace of acquisition of orthographic knowledge and phonemic awareness", Journal of Research in Reading, vol. 30, Issue 1, pp. 7-22, 2007.