Todor Ganchev

Workplace: Dept. of Electrical & Computer Engineering, University of Patras, 26500 Patras, Greece

E-mail: tgachev@upatras.gr

Research Interests: Pattern Recognition, Speech Recognition

Biography

Todor Ganchev is a senior researcher at the University of Patras and an Assistant Professor at the Technical University of Varna. His research interests include speech and audio signal processing, pattern recognition, speaker identification and verification, and bioacoustics.

Author Articles
Integration of Temporal Contextual Information for Robust Acoustic Recognition of Bird Species from Real-Field Data

By Iosif Mporas, Todor Ganchev, Otilia Kocsis, Nikos Fakotakis, Olaf Jahn, Klaus Riede

DOI: https://doi.org/10.5815/ijisa.2013.07.02, Pub. Date: 8 Jun. 2013

We report on the development of an automated acoustic bird recognizer with improved noise robustness, which is part of a long-term project aiming at the establishment of an automated biodiversity monitoring system at the Hymettus Mountain near Athens, Greece. In particular, a typical audio processing strategy, which has proved quite successful in various audio recognition applications, was amended with a simple and effective mechanism for integrating temporal contextual information in the decision-making process. In the present implementation, we integrate temporal contextual information by jointly post-processing the recognition results for a number of preceding and subsequent audio frames. In order to evaluate the usefulness of the proposed scheme on the task of acoustic bird recognition, we experimented with six widely used classifiers and a set of real-field audio recordings for two bird species which are present at the Hymettus Mountain. The highest recognition accuracy obtained on the real-field data was approximately 93%, while experiments with additive noise showed significant robustness in low signal-to-noise ratio setups. In all cases, the integration of temporal contextual information was found to improve the overall accuracy of the recognizer.

Phone Duration Modeling of Affective Speech Using Support Vector Regression

By Alexandros Lazaridis, Iosif Mporas, Todor Ganchev

DOI: https://doi.org/10.5815/ijisa.2012.08.01, Pub. Date: 8 Jul. 2012

In speech synthesis, accurate modeling of prosody is important for producing high-quality synthetic speech. One of the main aspects of prosody is phone duration. Robust phone duration modeling is a prerequisite for synthesizing natural-sounding emotional speech. In this work, ten phone duration models are evaluated. These models belong to well-known and widely used categories of algorithms, such as decision trees, linear regression, lazy-learning algorithms and meta-learning algorithms. Furthermore, we investigate the effectiveness of Support Vector Regression (SVR) in phone duration modeling in the context of emotional speech. The evaluation of the eleven models (the ten above plus SVR) is performed on a Modern Greek emotional speech database which consists of four categories of emotional speech (anger, fear, joy, sadness) plus neutral speech. The experimental results demonstrated that the SVR-based modeling outperforms the other ten models across all four emotion categories. Specifically, the SVR model achieved an average relative reduction of 8% in terms of root mean square error (RMSE) throughout all emotional categories.

Other Articles