In this paper we describe a model developed for the analysis of acoustic spectra. Unlike decomposition techniques, which can produce results that are difficult to interpret, this model explicitly models spectra as distributions and extracts sets of additive and semantically useful components that facilitate a variety of applications ranging from source separation, …
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from analysis of non-negative data of arbitrary dimensionality. Our approach employs a probabilistic latent variable model with sparsity constraints. We demonstrate its utility by performing feature extraction in a variety of domains ranging from …
In this paper we describe a methodology for model-based single channel separation of sounds. We present a sparse latent variable model that can learn sounds based on their distribution of time/frequency energy. This model can then be used to extract known types of sounds from mixtures in two scenarios. One being the case where all sound types in the mixture …
This paper presents a family of probabilistic latent variable models that can be used for analysis of nonnegative data. We show that there are strong ties between nonnegative matrix factorization and this family, and provide some straightforward extensions which can help in dealing with shift invariances, higher-order decompositions and sparsity …
An important problem in many fields is the analysis of counts data to extract meaningful latent components. Methods like Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) have been proposed for this purpose. However, they are limited in the number of components they can extract and lack an explicit provision to control the …
Broadcast is an efficient and scalable way of transmitting data to an unlimited number of clients that are listening to a channel. Cyclically broadcasting data over the channel is a basic scheduling technique, which is known as flat scheduling. When multiple channels are available, a data allocation technique is needed to assign data to channels. …
In this paper, we present a process which enables privacy-preserving speech recognition transactions between two parties. We assume one party with private speech data and one party with private speech recognition models. Our goal is to enable these parties to perform a speech recognition task using their data, but without exposing their private information …
Introduction. The achievements of the ear are indeed fabulous. While I am writing, my elder son rattles the fire rake in the stove, the infant babbles contentedly in his baby carriage, the church clock strikes the hour, … In the vibrations of air striking my ear, all these sounds are superimposed into a single extremely complex stream of pressure waves. …
With the recent attention to audio processing in the time-frequency domain, we increasingly encounter the problem of missing data. In this paper we present an approach that allows for imputing missing values in the time-frequency domain of audio signals. The presented approach is able to deal with real-world polyphonic signals by performing imputation even …
We present an algorithm for the separation of multiple speakers from mixed single-channel recordings by latent variable decomposition of the speech spectrogram. We model each magnitude spectral vector in the short-time Fourier transform of a speech signal as the outcome of a discrete random process that generates frequency bin indices. The distribution of …