Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward …
We propose a novel context-dependent (CD) model for large-vocabulary speech recognition (LVSR) that leverages recent advances in using deep belief networks for phone recognition. We describe a pre-trained deep neural network hidden Markov model (DNN-HMM) hybrid architecture that trains the DNN to produce a distribution over senones (tied triphone states) as …
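As a minimal sketch of the hybrid scoring idea described in the two abstracts above (not code from either paper): a feedforward network maps a window of stacked acoustic frames to posterior probabilities over senones, and dividing by the senone priors gives scaled likelihoods that replace the GMM scores during HMM decoding. The context size, layer widths, and senone count below are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumptions, not values from the papers).
CONTEXT = 11          # frames of acoustic context stacked as input
N_COEFF = 40          # per-frame filterbank / cepstral coefficients
HIDDEN = 2048         # units per hidden layer
N_SENONES = 9000      # tied triphone states (senones)

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Randomly initialised weights stand in for a trained (and pre-trained) network.
sizes = [CONTEXT * N_COEFF, HIDDEN, HIDDEN, HIDDEN, N_SENONES]
weights = [rng.normal(0, 0.01, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def senone_posteriors(frames):
    """Map a window of stacked frames to a posterior distribution over senones."""
    h = frames.reshape(-1)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return softmax(h @ weights[-1] + biases[-1])

# Hybrid decoding uses scaled likelihoods p(x|s) proportional to p(s|x) / p(s),
# where p(s) is the senone prior (here a flat placeholder).
senone_priors = np.full(N_SENONES, 1.0 / N_SENONES)
window = rng.normal(size=(CONTEXT, N_COEFF))
log_likelihoods = np.log(senone_posteriors(window) + 1e-10) - np.log(senone_priors)
```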
Latent semantic models, such as LSA, aim to map a query to its relevant documents at the semantic level, where keyword-based matching often fails. In this study we strive to develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space, where the relevance of a document given a …
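A rough sketch of the kind of deep structured semantic model the abstract describes, under assumed details: a query and a document are each mapped from (hashed) term vectors through nonlinear layers into a shared low-dimensional space, relevance is scored by cosine similarity, and a softmax over candidate documents turns the scores into relevance probabilities. The dimensionalities and the smoothing factor are placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

VOCAB = 3000    # hashed term-vector dimensionality (toy placeholder)
HIDDEN = 300
EMBED = 128     # common low-dimensional semantic space

def make_mlp(sizes):
    return [(rng.normal(0, 0.05, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def project(x, layers):
    """Nonlinear projection of a term vector into the shared semantic space."""
    h = x
    for W, b in layers:
        h = np.tanh(h @ W + b)
    return h

query_net = make_mlp([VOCAB, HIDDEN, EMBED])
doc_net = make_mlp([VOCAB, HIDDEN, EMBED])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))

# Relevance of each candidate document given the query: a softmax over
# (scaled) cosine similarities in the shared space.
query = rng.random(VOCAB)
docs = rng.random((4, VOCAB))
q = project(query, query_net)
scores = np.array([cosine(q, project(d, doc_net)) for d in docs])
gamma = 10.0                      # smoothing factor (assumed value)
relevance = np.exp(gamma * scores) / np.exp(gamma * scores).sum()
```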
This book aims to provide an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen according to the following three criteria: 1) expertise or knowledge of the authors; 2) application areas that have already been transformed by the successful use of deep …
This paper presents a novel approach for automatically generating image descriptions: visual detectors, language models, and multimodal similarity models learned directly from a dataset of image captions. We use multiple instance learning to train visual detectors for words that commonly occur in captions, including many different parts of speech, such as …
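A small sketch of the multiple-instance-learning step mentioned above, with assumed details: per-region word probabilities from a visual detector are pooled with a noisy-OR, so an image is credited with a word if any of its regions supports it. Whether the original system uses exactly this pooling rule is an assumption here.

```python
import numpy as np

def noisy_or_pool(region_word_probs):
    """Combine per-region word probabilities into image-level probabilities.

    region_word_probs: array of shape (n_regions, n_words), each entry the
    probability that a word is evidenced by that region.
    """
    return 1.0 - np.prod(1.0 - region_word_probs, axis=0)

# Toy example: 3 image regions, 4 candidate caption words.
rng = np.random.default_rng(2)
region_scores = rng.uniform(0.0, 0.3, size=(3, 4))
region_scores[1, 2] = 0.9         # one region strongly supports word 2
image_word_probs = noisy_or_pool(region_scores)
```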
This paper presents stacked attention networks (SANs) that learn to answer natural language questions from images. SANs use the semantic representation of a question as a query to search for the regions in an image that are related to the answer. We argue that image question answering (QA) often requires multiple steps of reasoning. Thus, we develop a …
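A minimal sketch of one attention hop of the kind the abstract describes, with assumed feature sizes and scoring function: the question vector is scored against a grid of image-region features, the regions are pooled by the resulting attention weights, and the pooled vector refines the query; stacking two such hops gives the multi-step reasoning the abstract argues for.

```python
import numpy as np

rng = np.random.default_rng(3)

D = 256        # common feature dimensionality (assumption)
REGIONS = 49   # e.g. a 7x7 grid of image-region features

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_step(question_vec, region_feats, W_q, W_v, w_p):
    """One attention hop: score each region against the question, pool the
    regions by the attention weights, and return a refined query vector."""
    h = np.tanh(region_feats @ W_v + question_vec @ W_q)   # (REGIONS, D)
    weights = softmax(h @ w_p)                              # attention over regions
    attended = weights @ region_feats                        # (D,)
    return attended + question_vec                           # refined query

question_vec = rng.normal(size=D)
region_feats = rng.normal(size=(REGIONS, D))
params = [(rng.normal(0, 0.05, (D, D)),
           rng.normal(0, 0.05, (D, D)),
           rng.normal(0, 0.05, D)) for _ in range(2)]

# Two stacked attention hops; the final query would feed a classifier
# over candidate answers.
u = question_vec
for W_q, W_v, w_p in params:
    u = attention_step(u, region_feats, W_q, W_v, w_p)
```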
In this paper we address the problem of the robustness of speech recognition systems in noisy environments. The goal is to estimate the parameters of an HMM that is matched to a noisy environment, given an HMM trained on clean speech and knowledge of the acoustic environment. We propose a method based on a truncated vector Taylor series that approximates the …
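For context, the standard vector-Taylor-series relationship that this kind of method linearizes, stated here from the general VTS literature rather than verbatim from the paper: in the cepstral domain, noisy speech y is a nonlinear function of clean speech x, additive noise n, and channel h, and a first-order expansion around the Gaussian means yields adapted HMM parameters.

```latex
% Mismatch function in the cepstral domain (C is the DCT matrix):
y = x + h + C \log\!\bigl(1 + \exp\bigl(C^{-1}(n - x - h)\bigr)\bigr)
  \;\equiv\; x + h + g(x, n, h)

% Truncated (first-order) vector Taylor series around the means
% (\mu_x, \mu_n, \mu_h), with Jacobian G = \partial y / \partial x:
\mu_y \approx \mu_x + \mu_h + g(\mu_x, \mu_n, \mu_h), \qquad
\Sigma_y \approx G \,\Sigma_x\, G^{\top} + (I - G)\,\Sigma_n\,(I - G)^{\top}
```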
The receptor-interacting protein kinase 1 (RIP1) is essential for the activation of nuclear factor κB (NF-κB) by tumor necrosis factor α (TNFα). Here, we present evidence that TNFα induces the polyubiquitination of RIP1 at Lys-377 and that this polyubiquitination is required for the activation of IκB kinase (IKK) and NF-κB. …
TRAF6 is a signal transducer in the NF-κB pathway that activates IκB kinase (IKK) in response to proinflammatory cytokines. We have purified a heterodimeric protein complex that links TRAF6 to IKK activation. Peptide mass fingerprinting analysis reveals that this complex is composed of the ubiquitin-conjugating enzyme Ubc13 and the Ubc-like protein …