An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low…
This paper introduces a novel approach to exemplar-based connected digit recognition. The approach is tested for different sizes of the exemplar collection (from 250 to 16,000), different exemplar lengths (from 1 to 50 time frames), and state-labeled versus word-labeled decoding. In addition, we compare the novel method for selecting exemplars, based…
Noise robustness of automatic speech recognition benefits from missing data imputation: prior to recognition, the parts of the spectrogram dominated by noise are replaced by clean speech estimates. Especially at low SNRs, each frame contains at best only a few uncorrupted coefficients. This makes frame-by-frame restoration of corrupted feature vectors…
An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing) prior to decoding, and to replace the missing ones by clean speech estimates. We present a novel method based on techniques from the field of Compressive Sensing to obtain these clean speech…
Acoustic backing-off was recently proposed as an operationalisation of missing feature theory for increased recognition robustness. Acoustic backing-off effectively removes the detrimental influence of outlier values from the local decisions in the Viterbi algorithm without any kind of explicit outlier detection. In the context of connected digit…
The aim of this investigation is to determine to what extent automatic speech recognition may be enhanced if, in addition to the linear compensation accomplished by mean and variance normalisation, a non-linear mismatch reduction technique is applied to the cepstral and energy features, respectively. An additional goal is to determine whether the degree of…
When subglottal pressure signals recorded during normal speech production are spectrally analyzed, the frequency of the first spectral maximum appears to deviate appreciably from the first resonance frequency reported in the literature, which stems from measurements of the acoustic impedance of the subglottal system. It is…
A method is presented for the automatic extraction of voice source parameters from speech. An automatic inverse filtering algorithm is used to obtain an estimate of the glottal flow signal. Subsequently, an LF-model [1] is fitted to the glottal flow signal. In the current article we will focus on the improvement of the automatic fit procedure. To keep track…
An effective way to increase noise robustness in automatic speech recognition is to label the noisy speech features as either reliable or unreliable (‘missing’), and replace (‘impute’) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low…