In this paper, aiming to utilize independent recording devices as a distributed microphone array, we present a novel method for aligning the recorded signals while localizing the microphones and sources. Unlike a conventional microphone array, signals recorded by independent devices have different time origins, and the microphone positions are generally unknown. In ...
This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using ...
We propose permutation-free frequency-domain blind source separation (BSS) via full-band clustering of the time-frequency (T-F) components based on time-varying signal presence priors. Frequency-domain methods of BSS usually process each frequency bin separately, and therefore necessitate the subsequent alignment of the permutation ambiguity that arises ...
This paper considers acoustic beamforming for noise-robust automatic speech recognition (ASR). A beamformer attenuates background noise by enhancing sound components coming from a direction specified by a steering vector. Hence, accurate steering vector estimation is paramount for successful noise reduction. Recently, a beamforming approach was proposed ...
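As a rough illustration of how a steering vector sets the look direction of a beamformer, the sketch below builds a narrowband delay-and-sum beamformer. It is not the approach of the paper above, whose details are truncated here. The uniform linear array, the far-field plane-wave model, and all names and parameter values (`steering_vector`, `spacing_m`, 8 microphones, 2 kHz) are assumptions made for the example.

```python
import numpy as np

def steering_vector(theta_rad, n_mics, spacing_m, freq_hz, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    m = np.arange(n_mics)
    delays = m * spacing_m * np.sin(theta_rad) / c  # per-microphone delay in seconds
    return np.exp(-2j * np.pi * freq_hz * delays)

def delay_and_sum_weights(steer):
    """Weights that give unit gain toward the direction encoded by `steer`."""
    return steer / len(steer)

def beam_response(weights, steer):
    """Complex gain for a plane wave whose steering vector is `steer`."""
    return np.vdot(weights, steer)  # vdot conjugates its first argument: w^H a

# Hypothetical array: 8 microphones, 4 cm apart, evaluated at 2 kHz.
n_mics, spacing_m, freq_hz = 8, 0.04, 2000.0
look = steering_vector(np.deg2rad(20.0), n_mics, spacing_m, freq_hz)
off = steering_vector(np.deg2rad(-40.0), n_mics, spacing_m, freq_hz)

w = delay_and_sum_weights(look)
gain_look = abs(beam_response(w, look))  # gain toward the look direction
gain_off = abs(beam_response(w, off))    # gain toward an off-axis direction
print(f"look-direction gain: {gain_look:.3f}, off-axis gain: {gain_off:.3f}")
```

Under these assumptions the look direction passes with unit gain while the off-axis plane wave is attenuated. If the steering vector were wrong, the same weights would attenuate the target instead, which is why its estimation matters.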
We consider the estimation of the directions of arrival (DOAs) of target signals in diffuse noise. The state-of-the-art MUltiple SIgnal Classification (MUSIC) algorithm necessitates accurate identification of the signal subspace. In diffuse noise, however, it is difficult to identify it directly from the observed spatial covariance matrix. In our approach, ...
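For readers unfamiliar with MUSIC, here is a minimal sketch of the standard algorithm on simulated data. It assumes a half-wavelength uniform linear array, a single narrowband source, and spatially white noise, so it shows only the baseline subspace step, not the diffuse-noise handling proposed in the paper above. All names and values (`ula_steering`, `music_spectrum`, the 25 degree source) are hypothetical.

```python
import numpy as np

def ula_steering(theta_rad, n_mics, d_over_lambda=0.5):
    """Steering vector of a uniform linear array (assumed geometry)."""
    m = np.arange(n_mics)
    return np.exp(-2j * np.pi * d_over_lambda * m * np.sin(theta_rad))

def music_spectrum(cov, n_sources, grid_rad):
    """MUSIC pseudo-spectrum from a spatial covariance matrix."""
    n_mics = cov.shape[0]
    _, eigvecs = np.linalg.eigh(cov)                # eigenvalues in ascending order
    noise_basis = eigvecs[:, : n_mics - n_sources]  # smallest eigenvalues span the noise subspace
    spectrum = np.empty(len(grid_rad))
    for i, theta in enumerate(grid_rad):
        proj = noise_basis.conj().T @ ula_steering(theta, n_mics)
        spectrum[i] = 1.0 / np.real(np.vdot(proj, proj))
    return spectrum

# Simulated data: one source at 25 degrees in white sensor noise (an assumption).
rng = np.random.default_rng(0)
n_mics, n_snapshots, true_deg = 8, 500, 25.0
a_true = ula_steering(np.deg2rad(true_deg), n_mics)
source = rng.standard_normal(n_snapshots) + 1j * rng.standard_normal(n_snapshots)
noise = 0.1 * (rng.standard_normal((n_mics, n_snapshots))
               + 1j * rng.standard_normal((n_mics, n_snapshots)))
x = np.outer(a_true, source) + noise

cov = x @ x.conj().T / n_snapshots                  # sample spatial covariance
grid_deg = np.arange(-90.0, 90.5, 0.5)
spectrum = music_spectrum(cov, 1, np.deg2rad(grid_deg))
est_deg = grid_deg[np.argmax(spectrum)]
print(f"estimated DOA: {est_deg:.1f} degrees")
```

With white noise the eigendecomposition separates the signal and noise subspaces cleanly. In diffuse noise the noise covariance is not a scaled identity, so this split is no longer reliable, which is the difficulty the abstract points to.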
This paper describes a new design of the Wiener post-filter for diffuse noise suppression. The Wiener post-filter is well known as an effective post-processing step for the minimum variance distortionless response beamformer, and its output is the optimal estimate of the target signal in the minimum mean square error sense. It is essential to accurately ...
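The textbook form of the Wiener gain can be sketched as follows: in each frequency bin, the beamformer output is scaled by the ratio of target power to total power. This is the generic formula only, not the new design described in the paper above. The function name `wiener_gain` and the power spectral density (PSD) values are made up for illustration, and in practice the PSDs would have to be estimated.

```python
import numpy as np

def wiener_gain(target_psd, noise_psd, eps=1e-12):
    """Per-bin Wiener gain: target power over total power (generic textbook form)."""
    target_psd = np.asarray(target_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    return target_psd / np.maximum(target_psd + noise_psd, eps)  # eps guards against 0/0

# Hypothetical PSDs for three frequency bins at the beamformer output.
target_psd = np.array([1.0, 4.0, 0.0])
noise_psd = np.array([1.0, 1.0, 1.0])
gain = wiener_gain(target_psd, noise_psd)

# Applying the real-valued gain to complex beamformer output bins.
beamformer_out = np.array([1.0 + 1.0j, 2.0 - 1.0j, 0.5 + 0.0j])
enhanced = gain * beamformer_out
print(gain)
```

Bins dominated by the target keep most of their energy, while bins containing only noise are driven toward zero. The result is only as good as the PSD estimates fed into the gain.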
This paper summarizes the 2013 community-based Signal Separation Evaluation Campaign (SiSEC 2013). Five speech and music datasets were contributed, including two new datasets: “Two-channel noisy recordings of a moving speaker within a limited area” and “Asynchronous recordings of speech mixtures”. The participants addressed one ...