John H. L. Hansen

FOR SPEECH ENHANCEMENT ALGORITHMS. John H.L. Hansen and Bryan L. Pellom, Robust Speech Processing Laboratory, Duke University, Box 90291, Durham, NC 27708-0291. Abstract: Much progress has been made in speech enhancement algorithm formulation in recent years. However, while researchers in the …
It is well known that the introduction of acoustic background distortion and the variability resulting from environmentally induced stress cause speech recognition algorithms to fail. In this paper, we discuss SUSAS: a speech database collected for analysis and algorithm formulation of speech recognition in noise and stress. The SUSAS database refers to …
Studies have shown that variability introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help improve the robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators …
In this paper, an improved form of iterative speech enhancement for single channel inputs is formulated. The basis of the procedure is sequential maximum a posteriori estimation of the speech waveform and its all-pole parameters as originally formulated by Lim and Oppenheim, followed by imposition of constraints upon the sequence of speech spectra. The new …
This study proposes a new set of feature parameters based on wavelet packet transform analysis of the speech signal. The new speech features are named subband based cepstral parameters (SBC) and wavelet packet parameters (WPP). The ability of each parameter set to capture speaker identity conveyed in the speech signal is compared to the widely used …
It is well known that the performance of speech recognition algorithms degrades in the presence of adverse environments where a speaker is under stress, emotion, or Lombard effect. This study evaluates the effectiveness of traditional features in recognition of speech under stress and formulates new features which are shown to improve stressed speech …
It is well known that speaker variability caused by accent is one factor that degrades performance of speech recognition algorithms. If knowledge of speaker accent can be estimated accurately, then a modified set of recognition models which addresses speaker accent could be employed to increase recognition accuracy. In this study, the problem of language …
Acoustic feature extraction from speech constitutes a fundamental component of automatic speech recognition (ASR) systems. In this paper, we propose a novel feature extraction algorithm, perceptual-MVDR (PMVDR), which computes cepstral coefficients from the speech signal. This new feature representation is shown to better model the speech spectrum compared …
In this paper, we propose an efficient approach for unsupervised audio stream segmentation and clustering via the Bayesian Information Criterion (BIC). The proposed method extends an earlier formulation by Chen and Gopalakrishnan [1]. In our segmentation formulation, Hotelling's T²-Statistic is used to pre-select candidate segmentation boundaries followed by …
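
To make the BIC-based segmentation idea concrete, here is a minimal sketch of the standard BIC change-point test on a single window. It is not the method of the paper: it assumes a one-dimensional feature stream modeled by a single Gaussian per segment (real systems typically use multivariate cepstral features), and it leaves out the Hotelling's T² pre-selection step entirely. The function names, the `margin` parameter, and the penalty weight `lam` are hypothetical and chosen only for illustration.

```python
import math


def _ml_variance(xs):
    # Maximum-likelihood (biased) variance, floored to avoid log(0).
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-12)


def delta_bic(xs, i, lam=1.0):
    """BIC difference between 'two Gaussians split at i' and 'one Gaussian'.

    Assumption: 1-D features, so the model-complexity penalty
    0.5 * (d + d(d+1)/2) * log(n) reduces to log(n) for d = 1.
    A positive value favors a change point at index i.
    """
    n = len(xs)
    gain = 0.5 * (
        n * math.log(_ml_variance(xs))
        - i * math.log(_ml_variance(xs[:i]))
        - (n - i) * math.log(_ml_variance(xs[i:]))
    )
    return gain - lam * math.log(n)


def best_change_point(xs, margin=5, lam=1.0):
    """Return (index, score) of the best split, or (None, score) if no split wins."""
    scores = {i: delta_bic(xs, i, lam) for i in range(margin, len(xs) - margin + 1)}
    best = max(scores, key=scores.get)
    if scores[best] > 0:
        return best, scores[best]
    return None, scores[best]


if __name__ == "__main__":
    # Synthetic stream: 40 samples near 0 followed by 40 samples near 5.
    stream = [0.0, 0.1] * 20 + [5.0, 5.1] * 20
    print(best_change_point(stream))
```

Evaluating every candidate index this way is what makes plain BIC segmentation expensive on long streams, which is the cost a cheaper pre-selection statistic would be meant to reduce.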