Work place: Dept. of Electronics and Communication Engineering, Basaveshwar Engineering College Bagalkote, Visvesvaraya Technological University, Belagavi-590018, Karnataka, India
E-mail: pnk_bewoor@yahoo.com
Research Interests: Signal Processing, Optical Communication
Biography
Dr. Pandurangarao N. Kulkarni completed his PhD at the Indian Institute of Technology Bombay, India, in 2010. He has twenty-nine years of teaching experience and ten years of research experience. He is currently Professor and Head of the Department of Electronics and Communication Engineering at Basaveshwar Engineering College, Bagalkot, Karnataka, India. His fields of interest include digital communication, digital signal processing and its applications. His research area is speech processing to improve speech perception in sensorineural hearing loss.
By Aparna Chilakawad, Pandurangarao N. Kulkarni
DOI: https://doi.org/10.5815/ijigsp.2024.02.05, Pub. Date: 8 Apr. 2024
For persons with sensorineural hearing loss (SNHL), speech perception diminishes in noisy environments because of masking. The present work aims mainly at improving speech perception in sensorineural hearing-impaired subjects, as there is no known medical treatment for this condition. Speech perception can be improved by reducing the impact of masking. This is accomplished by splitting the speech signal into two parts for binaural dichotic presentation using time-varying comb filters with complementary magnitude responses. Time-varying comb (FIR) filters of order 512, with magnitude responses complementary to each other, are designed using the frequency sampling method to split the speech signal for dichotic presentation. The filters are designed for a sampling frequency of 22 kHz and twenty-two one-third octave bands spanning 0 to 11 kHz. The magnitude responses of the filters are continuously swept with a time shift smaller than the just noticeable difference (JND), so that the ability to detect gaps in the speech signal improves without negating the benefits of the spectral splitting technique. Filter performance is evaluated using objective and subjective measures. The objective evaluation uses Perceptual Evaluation of Speech Quality (PESQ) and spectrographic analysis. The subjective evaluation of speech quality uses the Mean Opinion Score (MOS). The MOS test is conducted on normal-hearing subjects, with white noise added to the study materials at different SNR levels. Speech intelligibility is evaluated with the Modified Rhyme Test (MRT) on normal-hearing subjects as well as persons with bilateral moderate SNHL, again with white noise added to the study materials at different SNR levels. The study materials used for the quality evaluation are the VC syllable /aa-b/ and the vowel /aa/. For the intelligibility evaluation, 300 monosyllabic consonant-vowel-consonant (CVC) words are used.
The outcomes showed an improvement in PESQ values and MOS scores at lower SNR values for processed speech compared with unprocessed speech, and also an improvement in the intelligibility of processed speech in noise for both types of subjects. Thus, the processing enhances the perception of speech in a noisy environment.
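The spectral splitting step in the abstract above can be illustrated with a minimal sketch of one static pair of complementary comb filters designed by frequency sampling. The abstract gives only the order (512), the sampling frequency (22 kHz) and the band count (22), so the band edges, the assignment of alternate bands to the left and right ears, and the function names below are assumptions, and the time-varying sweep of the responses is not modelled.

```python
import numpy as np

FS = 22_000      # sampling frequency in Hz (from the abstract)
ORDER = 512      # filter order, so each filter has ORDER + 1 taps
NUM_BANDS = 22   # number of one-third octave bands (from the abstract)

def band_edges(fs=FS, num_bands=NUM_BANDS):
    """Assumed edges: 0 Hz, then one-third octave steps ending at fs / 2."""
    steps = np.arange(1, num_bands + 1)
    return np.concatenate(([0.0], (fs / 2.0) * 2.0 ** ((steps - num_bands) / 3.0)))

def complementary_comb_pair(fs=FS, order=ORDER, num_bands=NUM_BANDS):
    """Return (h_left, h_right): linear-phase FIR filters whose sampled
    magnitude responses are 1 in alternate bands and sum to 1 everywhere."""
    n_taps = order + 1                       # odd length: Type I linear phase
    k = np.arange(n_taps)
    # Fold DFT bins onto 0..fs/2 so the desired magnitude is symmetric.
    folded = np.minimum(k, n_taps - k) * fs / n_taps
    band = np.searchsorted(band_edges(fs, num_bands), folded, side="right") - 1
    band = np.clip(band, 0, num_bands - 1)
    mag_left = (band % 2 == 0).astype(float)  # assumed: even-indexed bands to the left ear
    mag_right = 1.0 - mag_left                # complementary response
    # Linear phase corresponding to a delay of order / 2 samples.
    phase = np.exp(-1j * np.pi * k * order / n_taps)
    h_left = np.fft.ifft(mag_left * phase).real
    h_right = np.fft.ifft(mag_right * phase).real
    return h_left, h_right

h_left, h_right = complementary_comb_pair()
```

Because the two sampled magnitude responses add up to one at every frequency sample, the sum of the two impulse responses is a unit impulse delayed by 256 samples, so presenting both outputs together loses no spectral content at the sampled frequencies.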
By Jyoti M. Katagi, Pandurangarao N. Kulkarni
DOI: https://doi.org/10.5815/ijigsp.2021.06.04, Pub. Date: 8 Dec. 2021
The ability to locate a sound source is a prime factor in the human auditory system. A sound has different spectral, temporal and intensity characteristics depending on where its source is located. To identify the location of a sound, listeners analyze these characteristics as they arrive from various directions in the horizontal and vertical planes. Individuals with sensorineural hearing loss find it very difficult to understand speech against a noisy background. Binaural hearing is adopted in order to distinguish various sound sources reliably and to increase speech intelligibility in noisy conditions. Diffraction induced by the pinnae, head, shoulders and torso changes the pressure waveform as sound waves travel from the audio source to the listener's eardrums. Two transfer functions, which describe the sound pressures at the listener's right and left eardrums, capture these propagation effects. These spectral changes are recorded by Head Related Transfer Functions (HRTFs). Different hearing aid algorithms need to be studied to measure their effectiveness in improving speech perception, through a series of subjective evaluations involving subjects with sensorineural hearing loss with different loss characteristics under different listening conditions. We investigated the various proposed approaches, weighed their benefits and drawbacks and, most importantly, examined whether and how the perceptual validity of the resulting HRTFs is evaluated. This paper brings out current research efforts on sound source localization ability in hearing aids. These include the use of HRTFs for generating spatial sounds in the elevation and azimuth planes, the evaluation of the effect of monaural and binaural hearing aid algorithms on source localization under different listening conditions for subjects with different hearing losses, and the assessment of localization effectiveness by type of hearing aid.
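As a rough illustration of how a pair of head-related transfer functions is applied, the sketch below filters a mono signal with a left-ear and a right-ear impulse response (the time-domain form of an HRTF). The impulse responses here are toy values chosen for illustration only, a pure delay and attenuation at the right ear, not measured HRTFs, and the function name is hypothetical.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with one impulse response per ear.
    Both impulse responses must have the same length.
    Returns an array of shape (2, len(mono) + len(hrir_left) - 1)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Toy impulse responses (assumed): source on the listener's left, so the
# right ear receives the sound 3 samples later and attenuated to 0.6.
hrir_left = np.zeros(8)
hrir_left[0] = 1.0
hrir_right = np.zeros(8)
hrir_right[3] = 0.6

mono = np.array([1.0, 0.5, -0.25])
stereo = render_binaural(mono, hrir_left, hrir_right)
```

In practice the two impulse responses would come from a measured HRTF set for the desired azimuth and elevation; the convolution step stays the same.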
By Rajani S. Pujar, Pandurangarao N. Kulkarni
DOI: https://doi.org/10.5815/ijigsp.2019.07.06, Pub. Date: 8 Jul. 2019
This paper presents a filter bank summation method that performs spectral splitting of the input signal for binaural dichotic presentation, together with dynamic range compression, coupled with a noise reduction algorithm based on the Wiener filter. This helps to compensate for the effects of spectral masking and reduced dynamic range, and improves speech perception for moderate sensorineural hearing loss in adverse listening conditions. We consider a cascaded structure of a noise reduction technique and Filter Bank Summation (FBS) based amplitude compression and spectral splitting. The Wiener filter produces the enhanced signal by removing unwanted noise. The signal is split into eighteen frequency bands covering 0 to 5 kHz, based on auditory critical bandwidths. To reduce the dynamic range, amplitude compression is carried out using a constant compression factor in each of the bands. Subjective and objective assessments, based on Mean Opinion Score (MOS) and Perceptual Evaluation of Speech Quality (PESQ) scores respectively, are used to test the perceived quality of speech under different Signal-to-Noise Ratio (SNR) conditions. The Vowel Consonant Vowel (VCV) syllable /aba/ and sentences were used as the test material. For the sentence “sky that morning was clear and bright blue”, the listening tests gave MOS scores for processed speech of 4.41, 4.2, 3.96, 3.6, 3.08 and 2.66, compared with unprocessed speech MOS scores of 4.53, 1.21, 1.16, 1.06, 0.8 and 0.483, for SNR values of ∞, +6, +3, 0, -3 and -6 dB respectively. The PESQ values (Left Channel: 2.6192, 2.5355, 2.5646, 2.5513, 2.5221 and 2.4309; Right Channel: 2.5889, 2.3001, 2.3714, 2.4710, 2.3636 and 2.4712) for SNR values of ∞, +6, +3, 0, -3 and -6 dB respectively indicate the improvement in perceived quality across SNR conditions.
To evaluate the intelligibility of the perceived speech, a listening test using the Modified Rhyme Test (MRT) was carried out with hearing-impaired persons (moderate Sensorineural Hearing Loss (SNHL)) in the presence of background noise. The test material consists of 50 sets of monosyllabic words of consonant-vowel-consonant (CVC) form, with six words in each set. Each subject responded to a total of 1800 presentations (300 words × 6 different SNR conditions). Results of the listening tests (using MRT) showed maximum improvements of 27.299%, 23.95%, 24.503%, 23.602% and 23.498% in the speech recognition scores at SNR values of -6, -3, 0, +3 and +6 dB respectively, compared with unprocessed speech recognition scores. Reductions in response times compared with unprocessed speech were observed at lower SNR values. The decreases in response time at SNR values of -6, -3, 0, +3 and +6 dB were 1.581, 1.41, 1.329, 1.279 and 1.01 s respectively, indicating improvement in the intelligibility of the speech at lower SNR values.
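The abstracts above evaluate speech with white noise added at fixed SNR values. A minimal sketch of that mixing step is given below, assuming the SNR is defined as the ratio of average signal power to average noise power over the whole utterance. The papers do not state how the noise was scaled, so this definition and the function names are assumptions, and a sine tone stands in for the speech material.

```python
import numpy as np

def add_white_noise(speech, snr_db, rng=None):
    """Return (noisy, noise), with Gaussian white noise scaled so that
    10 * log10(speech power / noise power) equals snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=float)
    noise = rng.standard_normal(speech.shape)
    speech_power = np.mean(speech ** 2)
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    noise = noise * np.sqrt(target_noise_power / np.mean(noise ** 2))
    return speech + noise, noise

def measured_snr_db(speech, noise):
    """SNR in dB computed from average powers."""
    speech = np.asarray(speech, dtype=float)
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))

# Stand-in test signal: one second of a 440 Hz tone at 22 kHz sampling.
t = np.arange(22_000) / 22_000.0
tone = np.sin(2.0 * np.pi * 440.0 * t)

rng = np.random.default_rng(0)
snr_levels_db = (6, 3, 0, -3, -6)   # finite SNR values listed in the abstract
mixtures = {snr: add_white_noise(tone, snr, rng) for snr in snr_levels_db}
```

The ∞ dB condition in the abstract corresponds to the clean signal with no noise added, so it needs no mixing step.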