Multi Band Spectral Subtraction for Speech Enhancement with Different Frequency Spacing Methods and their Effect on Objective Quality Measures

Full Text (PDF, 749KB), PP.54-62

Author(s)

P. Sunitha 1,*, K. Satya Prasad 2

1. Dept. of ECE, JNTUK, India

2. VFSTR, Guntur, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2019.05.06

Received: 31 Aug. 2018 / Revised: 4 Feb. 2019 / Accepted: 19 Mar. 2019 / Published: 8 May 2019

Index Terms

Speech enhancement, Multi Band Spectral Subtraction, Frequency Spacing Methods, Linear, mel, logarithmic, Objective Quality Measures

Abstract

This paper studies Multi Band Spectral Subtraction (MBSS) for speech enhancement using three frequency scales (linear, logarithmic, and mel) to represent the spectrum, and examines their effect on performance measures in the presence of additive non-stationary noise over a range of input SNRs. Because speech is a non-stationary signal, noise affects its spectrum non-uniformly: some frequency components are degraded more severely than others. A common way to recover the original speech from a noisy signal is speech enhancement, which suppresses the background noise. MBSS is one such technique: it divides the noisy speech spectrum into uniformly spaced, non-overlapping frequency bands and performs spectral over-subtraction in each band separately. Performance is evaluated using objective measures: cepstrum distance, log-likelihood ratio, weighted spectral slope distance, segmental SNR, and Perceptual Evaluation of Speech Quality (PESQ).
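To make the band-splitting idea concrete, the following is a minimal Python sketch, not the authors' implementation. It computes non-overlapping band edges under linear, logarithmic, and mel spacing, then applies over-subtraction band by band to one frame of power spectra. Everything here is an assumption for illustration: the function names, the mel mapping 2595 log10(1 + f/700), the piecewise-linear over-subtraction rule and its constants, the spectral floor of 0.002, and the 100 Hz lower edge used so that logarithmic spacing is defined.

```python
import math

def hz_to_mel(f):
    # Common mel mapping; assumed here, the paper's exact formula is not stated.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def band_edges(n_bands, f_lo, f_hi, spacing="linear"):
    """Return n_bands + 1 edge frequencies (Hz) between f_lo and f_hi."""
    steps = [i / n_bands for i in range(n_bands + 1)]
    if spacing == "linear":
        return [f_lo + s * (f_hi - f_lo) for s in steps]
    if spacing == "log":
        if f_lo <= 0.0:
            raise ValueError("log spacing needs f_lo > 0")
        return [f_lo * (f_hi / f_lo) ** s for s in steps]
    if spacing == "mel":
        m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
        return [mel_to_hz(m_lo + s * (m_hi - m_lo)) for s in steps]
    raise ValueError("unknown spacing: " + spacing)

def over_subtraction_factor(snr_db):
    # Illustrative piecewise-linear rule: subtract more when the band SNR is low.
    if snr_db < -5.0:
        return 4.75
    if snr_db > 20.0:
        return 1.0
    return 4.0 - 0.15 * snr_db

def mbss_frame(noisy_pow, noise_pow, edges_hz, bin_hz, floor=0.002):
    """Per-band over-subtraction on one frame of power spectra (lists of floats)."""
    n_bins = len(noisy_pow)
    idx = [min(n_bins, math.ceil(e / bin_hz)) for e in edges_hz]
    idx[0], idx[-1] = 0, n_bins  # assumption: outer bands absorb leftover bins
    out = list(noisy_pow)
    for b in range(len(idx) - 1):
        lo, hi = idx[b], idx[b + 1]
        band_noise = sum(noise_pow[lo:hi])
        if hi <= lo or band_noise <= 0.0:
            continue  # empty band or nothing to subtract
        band_noisy = max(sum(noisy_pow[lo:hi]), 1e-12)
        alpha = over_subtraction_factor(10.0 * math.log10(band_noisy / band_noise))
        for k in range(lo, hi):
            # Spectral floor keeps the estimate positive after over-subtraction.
            out[k] = max(noisy_pow[k] - alpha * noise_pow[k], floor * noisy_pow[k])
    return out

if __name__ == "__main__":
    for spacing in ("linear", "log", "mel"):
        print(spacing, [round(e) for e in band_edges(4, 100.0, 4000.0, spacing)])
```

Because each band gets its own over-subtraction factor from its own SNR estimate, bands where noise dominates are attenuated more heavily than bands where speech dominates. Changing `spacing` only moves the band edges, which is the variable the paper compares across objective measures.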

Cite This Paper

P. Sunitha, K. Satya Prasad, "Multi Band Spectral Subtraction for Speech Enhancement with Different Frequency Spacing Methods and their Effect on Objective Quality Measures", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 11, No. 5, pp. 54-62, 2019. DOI: 10.5815/ijigsp.2019.05.06
