IJISA Vol. 8, No. 12, 8 Dec. 2016
Wiener filter, Bayesian Estimators, Super Gaussian priors, Nonnegative Matrix Factorization (NMF), Hidden Markov Model (HMM), Phase Processing
Speech enhancement processes a noisy speech signal with the aim of improving its perceived quality, its intelligibility, or both. Because of its wide range of applications in mobile telephony, VoIP, hearing aids, Skype, and speaker recognition, the challenges in speech enhancement have grown over the years. It is particularly challenging to suppress background noise that affects human communication in noisy environments such as airports, road works, traffic, and cars. The objective of this survey paper is to outline the single-channel speech enhancement methodologies used for enhancing a speech signal corrupted by additive background noise, and to discuss the challenges and opportunities of single-channel speech enhancement. The paper focuses mainly on transform-domain techniques and supervised (NMF, HMM) speech enhancement techniques, and provides a framework for developments in speech enhancement methodologies.
Ravi Kumar. K, P.V. Subbaiah, "A Survey on Speech Enhancement Methodologies", International Journal of Intelligent Systems and Applications (IJISA), Vol.8, No.12, pp.37-45, 2016. DOI:10.5815/ijisa.2016.12.05