For a mixture of target speech and noise in anechoic conditions, the ideal binary mask is defined as follows: It selects the time-frequency units where target energy exceeds noise energy by a certain local threshold and cancels the other units. In this study, the definition of the ideal binary mask is extended to reverberant conditions. Given the division...
Ideal binary masking is a signal processing technique that separates a desired signal from a mixture by retaining only the time-frequency units where the signal-to-noise ratio (SNR) exceeds a predetermined threshold. In reverberant conditions there are multiple possible definitions of the ideal binary mask in that one may choose to treat the target early...
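The mask definition given in the two abstracts above can be illustrated with a minimal sketch for the anechoic case. It assumes that the per-unit target and noise energies are already available as arrays of the same shape (in practice they come from a time-frequency decomposition of the premixed signals, which is not shown). The function name `ideal_binary_mask`, the parameter `lc_db`, and the toy energy values are hypothetical and are not taken from the papers.

```python
import numpy as np

def ideal_binary_mask(target_energy, noise_energy, lc_db=0.0):
    """Return 1 for time-frequency units whose local SNR exceeds lc_db, else 0.

    target_energy, noise_energy: non-negative arrays of equal shape
    (frequency channels x time frames). lc_db is the local threshold in dB.
    """
    target_energy = np.asarray(target_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    eps = np.finfo(float).tiny  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((target_energy + eps) / (noise_energy + eps))
    return (local_snr_db > lc_db).astype(np.uint8)

# Toy energies for a 2 x 2 grid of time-frequency units (illustrative only).
target = np.array([[4.0, 1.0],
                   [0.5, 9.0]])
noise = np.array([[1.0, 2.0],
                  [0.5, 1.0]])

mask = ideal_binary_mask(target, noise)              # 0 dB threshold
relaxed = ideal_binary_mask(target, noise, lc_db=-6.0)

# Applying the mask keeps the selected units of the mixture and cancels the rest.
masked_mixture = (target + noise) * mask
```

With a 0 dB threshold, a unit is retained only when the target energy is strictly greater than the noise energy. Lowering `lc_db` retains more units at the cost of passing more noise.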
Sound source localization from a binaural input is a challenging problem, particularly when multiple sources are active simultaneously and reverberation or background noise is present. In this work, we investigate a multi-source localization framework in which monaural source segregation is used as a mechanism to increase the robustness of azimuth...
Monaural musical sound separation has been extensively studied recently. An important problem in separation of pitched musical sounds is the estimation of time–frequency regions where harmonics overlap. In this paper, we propose a sinusoidal modeling-based separation system that can effectively resolve overlapping harmonics. Our strategy is based on the...
USD (Unstructured Scientific Data) is a database system developed at Lawrence Livermore National Laboratory (LLNL) that provides database capabilities required when doing scientific research. USD is implemented in a version of EMACS Lisp that has been extended to include a relational database management system and graphical user interface primitives. It is...
We propose an approach to binaural detection, localization and segregation of speech based on pitch and azimuth cues. We formulate the problem as a search through a multisource state space across time, where each multisource state encodes the number of active sources, and the azimuth and pitch of each active source. A set of multilayer perceptrons are...
Existing binaural approaches to speech segregation place an exclusive burden on cues related to the location of sound sources in space. These approaches can achieve excellent performance in anechoic conditions but degrade rapidly in realistic environments where room reverberation corrupts localization cues. In this paper, we propose to integrate monaural...
Localization of simultaneous sound sources in natural environments with only two microphones is a challenging problem. Reverberation degrades the performance of localization based exclusively on directional cues. We present an approach that integrates monaural and binaural analysis to improve localization of multiple speech sources in noisy and reverberant...
Approaches to binaural and stereo speech segregation have often assumed that localization information can be used as a primary cue to achieve segregation of a target signal. Results produced by these systems degrade significantly in the presence of room reverberation. In this work, we present an alternative framework to achieve localization of groups of...
In mixtures of pitched sounds, the problem of overlapping harmonics poses a significant challenge to monaural musical sound separation systems. In this paper we present a new algorithm for sinusoidal parameter estimation of overlapping harmonics for pitched instruments. Our algorithm is based on the assumptions that harmonics of the same source have...