In recent years, significant advances in the field of speaker recognition have resulted in very robust recognition systems. The primary focus of many recent developments has shifted to the problem of recognizing speakers in adverse conditions, e.g., in the presence of noise or reverberation. In this paper, we present the UMD-JHU speaker …
A novel method for speech enhancement based on Convolutive Non-negative Matrix Factorization (CNMF) is presented in this paper. The sparsity of the activation matrix for speech components has already been utilized in NMF-based enhancement methods. However, such methods do not usually take into account prior knowledge about occurrence relations between different …
In many under-resourced languages it is possible to find text, and it is possible to find speech, but transcribed speech suitable for training automatic speech recognition (ASR) is unavailable. In the absence of native transcripts, this paper proposes the use of a probabilistic transcript: a probability mass function over possible phonetic transcripts of the …
A monaural noise-suppression algorithm is proposed that nonlinearly manipulates the spectrotemporal modulations of speech as represented in a model of auditory cortical processing. A distinctive aspect of this approach is its consideration of the non-stationary dynamic behavior of speech that is captured using nonlinear filters, thus achieving excellent …
A new multimodal framework for speech enhancement in noisy environments, based on a model of the human auditory system, is proposed in this paper. Unlike existing engineering architectures, each of which is specifically designed for certain speech sensors (extracted pitch, visual cues, etc.), our proposed model provides the capacity to integrate cues of different type …
A spectrotemporal method based on Mutual Information (MI) is proposed for pitch estimation of voiced speech signals. We use MI as the similarity measure between voiced speech segments and their delayed versions. Instead of measuring linear dependencies, MI measures statistical dependency, which suits the dynamic characteristics of speech signals. Besides, …
Linear-Nonlinear regression models play a fundamental role in characterizing nonlinear systems. In this paper, we propose a method to estimate the linear transform in such models, which is equivalent to a low-dimensional subspace of the input space that is relevant for eliciting a response. The novel aspect of this work is the formulation of the mutual information …
Auditory selective attention plays a central role in the human capacity to reliably process complex sounds in multi-source environments. Stimulus reconstruction has been widely used for the investigation of selective auditory attention using multichannel electroencephalography (EEG). In particular, the influence of attention on sound representations in the …