• $Z(\omega) = |H(\omega)|^2 X(\omega) + N(\omega)$, where $Z(\omega)$ represents the power spectrum of the degraded speech, $X(\omega)$ is the power spectrum of the clean speech, $H(\omega)$ is the transfer function of the linear filter, and $N(\omega)$ is the power spectrum of the additive noise.
• In the log-spectral domain this relation can be expressed as: $z = x + q + \log(1 + e^{n - x - q})$ …
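The log-spectral form follows from taking the logarithm of the power-spectral relation Z(ω) = |H(ω)|² X(ω) + N(ω). A minimal numerical check is sketched below, assuming the lowercase symbols denote natural logarithms of the corresponding power spectra (z = log Z, x = log X, n = log N) and q = log |H|²; this reading of the symbols and the function name `log_spectral_mix` are assumptions for illustration, not taken from the source.

```python
import numpy as np

def log_spectral_mix(x, n, q):
    """Return z = x + q + log(1 + exp(n - x - q)).

    Assumes x, n, q are natural-log power spectra of the clean speech,
    the additive noise, and the squared filter magnitude (hypothetical
    naming). np.logaddexp(a, b) computes log(exp(a) + exp(b)) stably,
    which equals the expression above with a = x + q and b = n.
    """
    return np.logaddexp(x + q, n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(0.1, 10.0, size=8)    # clean-speech power (synthetic)
    N = rng.uniform(0.1, 10.0, size=8)    # noise power (synthetic)
    H2 = rng.uniform(0.5, 2.0, size=8)    # |H|^2 (synthetic)

    Z = H2 * X + N                        # power-spectral relation
    z = log_spectral_mix(np.log(X), np.log(N), np.log(H2))
    print(np.allclose(z, np.log(Z)))
```

Using `np.logaddexp` rather than evaluating `exp` directly avoids overflow when the noise term is much larger than the filtered speech term.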
This paper presents a family of probabilistic latent variable models that can be used for analysis of nonnegative data. We show that there are strong ties between nonnegative matrix factorization and this family, and provide some straightforward extensions which can help in dealing with shift invariances, higher-order decompositions and sparsity …
Sphinx-4 is a flexible, modular and pluggable framework to help foster new innovations in the core research of hidden Markov model (HMM) recognition systems. The design of Sphinx-4 is based on patterns that have emerged from the design of past systems as well as new requirements based on areas that researchers currently want to explore. To exercise this …
Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing noise-corrupted components of spectrographic representations of noisy speech and performing recognition with the remaining reliable components. Conventional classifier-compensation methods modify the …
In this article we have reviewed a wide variety of techniques based on the identification of missing spectral features that have proved effective in reducing the error rates of automatic speech recognition systems. These approaches have been conspicuously effective in ameliorating the effects of transient maskers such as impulsive noise or background music. …
Missing feature methods of noise compensation for speech recognition operate by first identifying components of a spectrographic representation of speech that are considered to be corrupt. Recognition is then performed either using only the remaining reliable components, or the corrupt components are reconstructed prior to recognition. These methods require …
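The first step described above, labelling spectrographic components as reliable or corrupt, can be illustrated with a small sketch. It assumes, hypothetically, that the speech and noise power in each time-frequency cell are known separately (an "oracle" mask) and that a cell counts as reliable when its local SNR exceeds a threshold. The function name, the threshold, and the oracle setup are illustrative assumptions; in practice the mask has to be estimated from the noisy signal alone.

```python
import numpy as np

def reliability_mask(speech_power, noise_power, snr_threshold_db=0.0):
    """Mark time-frequency cells whose local SNR exceeds a threshold.

    Oracle sketch: both inputs are arrays of strictly positive power
    values with the same shape (frequency x time). Returns a boolean
    array that is True where the cell is treated as reliable.
    """
    snr_db = 10.0 * np.log10(speech_power / noise_power)
    return snr_db > snr_threshold_db

if __name__ == "__main__":
    speech = np.array([[4.0, 1.0],
                       [1.0, 9.0]])
    noise = np.array([[1.0, 2.0],
                      [1.0, 1.0]])
    mask = reliability_mask(speech, noise)

    # Keep reliable cells of the noisy spectrogram; flag the rest as missing.
    noisy = speech + noise
    reliable_only = np.where(mask, noisy, np.nan)
    print(mask)
    print(reliable_only)
```

Cells flagged as missing (here `NaN`) would then either be ignored by the recognizer or reconstructed before recognition, the two options the abstract names.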
In this paper we describe a methodology for model-based single channel separation of sounds. We present a sparse latent variable model that can learn sounds based on their distribution of time/frequency energy. This model can then be used to extract known types of sounds from mixtures in two scenarios. One being the case where all sound types in the mixture …
We present a technique for denoising speech using nonnegative matrix factorization (NMF) in combination with statistical speech and noise models. We compare our new technique to standard NMF and to a state-of-the-art Wiener filter implementation and show improvements in speech quality across a range of interfering noise types.
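For orientation, the "standard NMF" baseline mentioned above can be sketched with the widely used multiplicative updates for the generalized Kullback-Leibler divergence. This is only a plain NMF factorization of a nonnegative matrix V ≈ WH, not the paper's method: the statistical speech and noise models are not reproduced here, and the function names, rank, and iteration count are illustrative assumptions.

```python
import numpy as np

def kl_divergence(V, WH):
    """Generalized KL divergence D(V || WH) for strictly positive arrays."""
    return float(np.sum(V * np.log(V / WH) - V + WH))

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (freq x time) as W @ H.

    Uses multiplicative updates for the generalized KL divergence.
    W has shape (freq, rank) and H has shape (rank, time); both stay
    nonnegative because each update multiplies by a nonnegative ratio.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic rank-3 "spectrogram" standing in for real magnitude spectra.
    V = rng.random((20, 3)) @ rng.random((3, 30)) + 1e-6
    W, H = nmf_kl(V, rank=3)
    print(kl_divergence(V, W @ H + 1e-12))
```

In a denoising setting, one common arrangement is to learn separate basis matrices for speech and noise from training data and then estimate only the activations H on the noisy mixture; that arrangement is stated here as general background rather than as the procedure of this paper.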
In this paper we describe a model developed for the analysis of acoustic spectra. Unlike decomposition techniques that can produce difficult-to-interpret results, this model explicitly models spectra as distributions and extracts sets of additive and semantically useful components that facilitate a variety of applications ranging from source separation, …
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from analysis of non-negative data of arbitrary dimensionality. Our approach employs a probabilistic latent variable model with sparsity constraints. We demonstrate its utility by performing feature extraction in a variety of domains ranging from …