Hugo Van hamme

Missing data techniques (MDT) have been shown to be an effective method for curing the performance degradation of HMM-based speech recognition systems operating on noisy signals. However, a major drawback of the approach is that MDT requires that the acoustic model be expressed as a mixture of diagonal Gaussians in the log-spectral domain, whereas a higher…
Missing data theory has been applied to the problem of speech recognition in adverse environments. The resulting systems require acoustic models that are expressed in the spectral rather than in the cepstral domain, which leads to loss of accuracy. Cepstral Missing Data Techniques (CMDT) surmount this disadvantage, but require significantly more…
An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low…
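The label-then-impute idea in the abstract above can be illustrated with a minimal sketch. It is not the method of the paper: the function name `impute_frame`, the 3 dB reliability threshold, the use of a per-channel noise estimate to build the mask, and the single clean-speech mean standing in for a parametric model are all assumptions made for illustration. Capping the imputed value at the noisy observation assumes additive noise, under which the clean energy should not exceed the observed energy.

```python
def impute_frame(noisy, noise_est, clean_mean, threshold_db=3.0):
    """Label each channel of one frame and impute the unreliable ones.

    All inputs are per-channel log-energies in dB (hypothetical values).
    A channel is called reliable when the observation exceeds the noise
    estimate by more than threshold_db (an assumed labelling rule).
    """
    mask = []
    imputed = []
    for y, n, mu in zip(noisy, noise_est, clean_mean):
        reliable = (y - n) > threshold_db
        mask.append(reliable)
        if reliable:
            # Reliable channel: keep the observation as it is.
            imputed.append(y)
        else:
            # Unreliable channel: use the clean-speech estimate,
            # but never exceed the noisy observation.
            imputed.append(min(mu, y))
    return mask, imputed


mask, imputed = impute_frame(
    noisy=[60.0, 42.0, 51.0],
    noise_est=[40.0, 40.0, 50.0],
    clean_mean=[58.0, 30.0, 57.0],
)
print(mask)     # [True, False, False]
print(imputed)  # [60.0, 30.0, 51.0]
```

In the third channel the clean-speech mean (57 dB) lies above the observation (51 dB), so the bound is active and the observation itself is used.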
In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called “Histogram of Acoustic Co-occurrence” (HAC). During training, recurring…
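A simplified sketch of how co-occurrence counts turn a variable-length segment into a fixed-dimension vector is given below. It assumes that the acoustic events are already available as integer labels (for example, codebook indices) and that only one time lag is counted; the function name `hac_vector` and both of these choices are hypothetical and are not taken from the paper.

```python
def hac_vector(labels, num_events, lag=1):
    """Count ordered pairs of event labels that occur `lag` frames apart.

    labels: sequence of integers in range(num_events), any length.
    Returns a list of num_events * num_events counts, so the dimension
    depends only on num_events and not on the length of the segment.
    """
    hist = [0] * (num_events * num_events)
    for first, second in zip(labels, labels[lag:]):
        # Pair (first, second) is stored at index first * num_events + second.
        hist[first * num_events + second] += 1
    return hist


short = hac_vector([2, 0], num_events=3)
longer = hac_vector([0, 1, 1, 2, 0, 1], num_events=3)
print(short)   # [0, 0, 0, 0, 0, 0, 1, 0, 0]
print(longer)  # [0, 2, 0, 0, 1, 1, 1, 0, 0]
```

Both segments map to vectors of length 9 even though one has two frames and the other six, which is the fixed-dimension property the abstract refers to.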
Motivated by the success of i-vectors in the field of speaker recognition, this paper proposes a new approach for age estimation from telephone speech patterns based on i-vectors. In this method, each utterance is modeled by its corresponding i-vector. Then, Support Vector Regression (SVR) is applied to estimate the age of speakers. The proposed method is…
In this paper, a new approach for age estimation from speech signals based on i-vectors is proposed. In this method, each utterance is modeled by its corresponding i-vector. Then, a Within-Class Covariance Normalization technique is used for session variability compensation. Finally, a least squares support vector regression (LSSVR) is applied to estimate…
The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a comprehensive study of the potential of subspace filtering to increase the robustness of automatic speech recognisers against stationary additive noise…
We present a technique to automatically discover the (word-sized) phone patterns that are present in speech utterances. These patterns are learnt from a set of phone lattices generated from the utterances. Just like children acquiring language, our system does not have prior information on what the meaningful patterns are. By applying the non-negative…
We describe an algorithm to automatically estimate the voice onset time (VOT) of plosives. The VOT is the time delay between the burst onset and the start of periodicity when it is followed by a voiced sound. Since the VOT is affected by factors like place of articulation and voicing, it can be used for inference of these factors. The algorithm uses the…
During the early stages of language acquisition, young infants face the task of learning a basic vocabulary without the aid of prior linguistic knowledge. It is believed that long-term episodic memory plays an important role in this process. Experiments have shown that infants retain large amounts of very detailed episodic information about the speech they…