Syed Abbas Ali; Anas Khan; Nazia Bashir

Analyzing the Impact of Prosodic Feature (Pitch) on Learning Classifiers for Speech Emotion Corpus

Full Text (PDF, 319KB), PP.54-59


Author(s)

Syed Abbas Ali ¹,*, Anas Khan ², Nazia Bashir ²

1. Department of Computer & Information Systems Engineering, N.E.D University of Engineering & Technology, Karachi, Pakistan

2. Department of Telecommunications Engineering, N.E.D University of Engineering & Technology, Karachi, Pakistan

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2015.02.07

Received: 3 Aug. 2014 / Revised: 20 Sep. 2014 / Accepted: 10 Nov. 2014 / Published: 8 Jan. 2015

Index Terms

Prosodic Features, Learning Classifiers, Speech Emotion, Regional Languages of Pakistan

Abstract

Emotion plays a significant role in human perception and decision making, and prosodic features play a crucial role in recognizing emotion from speech utterances. This paper introduces a speech emotion corpus recorded in the provincial languages of Pakistan (Urdu, Balochi, Pashto, Sindhi and Punjabi) covering four emotions (Anger, Happiness, Neutral and Sad). The objective is to analyze the impact of the prosodic feature pitch on learning classifiers (AdaBoostM1, classification via regression, decision stump, J48), in comparison with the other prosodic features (intensity and formant), in terms of classification accuracy on this corpus. The experimental framework evaluates the four classifiers on the possible combinations of prosodic features, with and without pitch. The experiments show that pitch plays a vital role: it yields significantly higher classification accuracy than the prosodic features excluding pitch. Formant and intensity, whether used individually or in any combination that excludes pitch, give a classification accuracy of approximately 20%, whereas pitch gives around 40%.
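The experimental design described in the abstract (four classifiers, each evaluated on combinations of prosodic features with and without pitch) can be sketched as a simple grid. The following is a minimal illustration only, not the authors' code: it assumes every non-empty subset of the three features is a candidate combination, and the function names (`feature_subsets`, `split_by_pitch`, `experiment_grid`) are hypothetical. The classifier names are used purely as labels; no training is performed.

```python
from itertools import combinations

# Feature names come from the abstract; treating every non-empty subset
# as a candidate combination is an assumption made for illustration.
FEATURES = ("pitch", "intensity", "formant")

# Classifier names as listed in the abstract; used here only as labels.
CLASSIFIERS = ("AdaBoostM1", "ClassificationViaRegression", "DecisionStump", "J48")


def feature_subsets(features):
    """Return every non-empty subset of the features, smallest first."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


def split_by_pitch(subsets):
    """Separate subsets that include pitch from those that exclude it."""
    with_pitch = [s for s in subsets if "pitch" in s]
    without_pitch = [s for s in subsets if "pitch" not in s]
    return with_pitch, without_pitch


def experiment_grid(classifiers, subsets):
    """Pair every classifier with every feature subset."""
    return [(clf, subset) for clf in classifiers for subset in subsets]


subsets = feature_subsets(FEATURES)
with_pitch, without_pitch = split_by_pitch(subsets)
grid = experiment_grid(CLASSIFIERS, subsets)

if __name__ == "__main__":
    print("with pitch:", with_pitch)
    print("without pitch:", without_pitch)
    print("runs:", len(grid))
```

Under these assumptions, each entry of `grid` corresponds to one classifier trained on one feature combination. The comparison reported in the abstract is between the accuracies of the runs in the `with_pitch` group and those in the `without_pitch` group.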

Cite This Paper

Syed Abbas Ali, Anas Khan, Nazia Bashir, "Analyzing the Impact of Prosodic Feature (Pitch) on Learning Classifiers for Speech Emotion Corpus", International Journal of Information Technology and Computer Science (IJITCS), vol. 7, no. 2, pp. 54-59, 2015. DOI: 10.5815/ijitcs.2015.02.07


International Journal of Information Technology and Computer Science (IJITCS)