Juan Andres Morales-Cordovilla

Learn More
This paper deals with the problem of searching for a suitable window for robust speech recognition in noisy conditions. A set of asymmetric windows, socalled DDRc,w, are proposed which are controlled by two parameters, center c and width w. These windows are derived from the DDR window used in the higher-lag autocorrelation spectrum estimation (HASE) method(More)
In this paper, we address the small vocabulary track (track 1) described in the CHiME 2 challenge dedicated to recognize utterances of a target speaker with small head movements. The utterances are recorded in a reverberant room acoustics corrupted with highly non-stationary noise sources. Such adverse noise scenario imposes a challenge to state-of-the-art(More)
This paper presents a noise estimation technique based on knowledge of pitch information for robust speech recognition. In the first stage the noise is estimated by means of extrapolating the noise from frames where speech is believed to be absent. These frames are detected with a proposed pitch based VAD (Voice Activity Detector). In the second stage the(More)
In this paper, we propose two estimators for the autocorrelation sequence of a periodic signal in additive noise. Both estimators are formulated employing tables which contain all the possible products of sample pairs in a speech signal frame. The first estimator is based on a pitch-synchronous averaging. This estimator is statistically analyzed and we show(More)
This paper provides a description of the preparation, the speakers, the recordings, and the creation of the orthographic transcriptions of the first large scale speech database for Austrian German. It contains approximately 1900 minutes of (read and spontaneous) speech produced by 38 speakers. The corpus consists of three components. First, the Conversation(More)
This paper addresses the problem of distant speech recognition in reverberant noise conditions applying a star-shaped microphone array and missing data techniques. The performance of the system is evaluated over a German database, which has been contaminated with noise of an apartment of the DIRHA (Distant Speech Interaction for Robust Home Applications)(More)
The problem of room localization is to determine where, in a multi-room environment, a person is producing a speech utterance. In our work, we are exploiting the information gained from a network of microphones installed all over a house, where the lack of calibration of the microphone energies creates an additional challenge. This paper compares room(More)
This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a microphone array. We present a prototype system that can segment the utterances in real-time and generate robust ASR results off-line. The segmentation is carried out by a voice activity detector based on deep belief networks, the speaker localization(More)