Joint acoustic and spectral modeling for speech dereverberation using non-negative representations
Exemplar-based techniques, in which noisy speech is decomposed as a linear combination of speech and noise exemplars stored in a dictionary, have been used successfully for speech enhancement in noisy environments. This paper extends the technique to speech dereverberation in noisy environments by means of a non-negative approximation of the noisy reverberant speech in the frequency domain. We propose a novel approach, based on non-negative matrix deconvolution (NMD), that estimates the room impulse response (RIR) together with the speech and noise estimates. In addition, we extend to noisy scenarios an existing technique based on non-negative matrix factorisation (NMF) that performs speech dereverberation in noise-free environments. New estimators for jointly obtaining the RIR and the exemplar weights are presented for both the NMD- and NMF-based formulations. The proposed techniques are evaluated on noise-free and noisy reverberant speech from the CHiME-2 WSJ0 database and are shown to improve speech enhancement in terms of signal-to-distortion ratio (SDR), perceptual evaluation of speech quality (PESQ) and cepstral distance (CD) measures.
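The exemplar-based decomposition that this work builds on can be illustrated with a minimal sketch. It covers only the noisy, non-reverberant case and is not the joint RIR and weight estimator proposed here. Several choices are assumptions made for illustration: the generalised Kullback-Leibler cost, the standard multiplicative update rule, the Wiener-style mask used for reconstruction, and the function names `estimate_weights` and `enhance`, which are hypothetical.

```python
import numpy as np


def estimate_weights(Y, A, n_iter=200, eps=1e-12):
    """Estimate non-negative weights X such that Y is approximately A @ X.

    Y: (n_freq, n_frames) non-negative magnitude spectrogram.
    A: (n_freq, n_atoms) fixed non-negative dictionary of exemplars.
    Assumption: generalised KL cost with multiplicative updates.
    """
    rng = np.random.default_rng(0)
    X = rng.random((A.shape[1], Y.shape[1])) + eps  # strictly positive start
    col_sums = A.sum(axis=0)[:, None]               # A^T 1, shape (n_atoms, 1)
    for _ in range(n_iter):
        ratio = Y / (A @ X + eps)                   # elementwise Y / (A X)
        X *= (A.T @ ratio) / (col_sums + eps)       # keeps X non-negative
    return X


def enhance(Y, A_speech, A_noise, n_iter=200, eps=1e-12):
    """Return a speech magnitude estimate from a noisy spectrogram Y."""
    A = np.hstack([A_speech, A_noise])              # joint speech+noise dictionary
    X = estimate_weights(Y, A, n_iter, eps)
    k = A_speech.shape[1]
    S_hat = A_speech @ X[:k]                        # speech part of the model
    N_hat = A_noise @ X[k:]                         # noise part of the model
    mask = S_hat / (S_hat + N_hat + eps)            # Wiener-style mask in [0, 1]
    return mask * Y
```

In this sketch the dictionary stays fixed and only the weights are updated. Handling reverberation would additionally require an RIR term in the model, which is deliberately left out here.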