• Publications
Linguistic Regularities in Continuous Space Word Representations
TLDR
The vector-space word representations implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, with each relationship characterized by a relation-specific vector offset.
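A minimal sketch of the vector-offset idea behind this result, using tiny hand-written embeddings rather than trained input-layer weights: an analogy a : b :: c : d is answered by adding the offset b - a to c and taking the nearest remaining word by cosine similarity.

```python
# Illustrative sketch of relation-specific vector offsets; the toy
# embeddings below are assumptions, not trained weights.
import numpy as np

emb = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.66, 0.92]),
    "man":   np.array([0.75, 0.10, 0.08]),
    "woman": np.array([0.74, 0.11, 0.90]),
}

def analogy(a, b, c, emb):
    """Return the word d such that a : b :: c : d, via the offset b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -1.0
    for w, v in emb.items():
        if w in (a, b, c):          # exclude the query words themselves
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "king", "woman", emb))  # expected: "queen"
```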
Syntactic Clustering of the Web
From captions to visual concepts and back
TLDR
This paper uses multiple instance learning to train visual detectors for words that commonly occur in captions, including many different parts of speech such as nouns, verbs, and adjectives, and develops a maximum-entropy language model.
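A minimal sketch of the multiple-instance-learning step under a noisy-OR formulation: an image is treated as a bag of region features, and a caption word is predicted for the image if any region triggers the word's detector. The linear detector, feature sizes, and random inputs are illustrative assumptions, not the paper's model.

```python
# Noisy-OR multiple instance learning sketch; all values are illustrative.
import numpy as np

def region_probs(regions, w, b):
    """Per-region word probabilities from a linear detector + sigmoid."""
    return 1.0 / (1.0 + np.exp(-(regions @ w + b)))

def bag_prob(regions, w, b):
    """Noisy-OR: the word is present if any region triggers the detector."""
    p = region_probs(regions, w, b)
    return 1.0 - np.prod(1.0 - p)

rng = np.random.default_rng(0)
regions = rng.normal(size=(10, 4096))   # e.g. 10 CNN region features per image
w, b = rng.normal(size=4096) * 0.01, 0.0
print(bag_prob(regions, w, b))          # probability the word applies to the image
```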
Context dependent recurrent neural network language model
TLDR
This paper improves the performance of recurrent neural network language models by providing, alongside each word, a contextual real-valued input vector that conveys information about the sentence being modeled; the vector is obtained by performing Latent Dirichlet Allocation on a block of preceding text.
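A minimal sketch of the context-dependent step, assuming toy dimensions and random weights: the recurrent update takes the word embedding, the previous hidden state, and an extra real-valued context vector; the LDA computation over the preceding text is stubbed out with a placeholder.

```python
# Sketch of an RNN LM step with an extra topic-context input; sizes,
# weights, and the LDA stub are illustrative assumptions.
import numpy as np

V, H, E, K = 1000, 64, 32, 10          # vocab, hidden, embedding, topic dims
rng = np.random.default_rng(0)
W_in  = rng.normal(scale=0.1, size=(H, E))   # word embedding -> hidden
W_rec = rng.normal(scale=0.1, size=(H, H))   # hidden -> hidden
W_ctx = rng.normal(scale=0.1, size=(H, K))   # topic vector -> hidden
W_out = rng.normal(scale=0.1, size=(V, H))   # hidden -> vocabulary logits
embed = rng.normal(scale=0.1, size=(V, E))

def lda_topics(preceding_text):
    """Placeholder for LDA over the preceding block of text."""
    t = np.abs(rng.normal(size=K))
    return t / t.sum()                       # probability vector over K topics

def step(word_id, h_prev, context):
    h = np.tanh(W_in @ embed[word_id] + W_rec @ h_prev + W_ctx @ context)
    logits = W_out @ h
    probs = np.exp(logits - logits.max())
    return h, probs / probs.sum()            # next-word distribution

context = lda_topics("preceding sentences ...")
h = np.zeros(H)
for w in [12, 7, 431]:                       # toy word ids
    h, p_next = step(w, h, context)
```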
Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding
TLDR
This paper implements and compares several important RNN architectures, including Elman, Jordan, and hybrid variants; the networks are built with the publicly available Theano neural network toolkit and evaluated on the well-known airline travel information system (ATIS) benchmark.
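A minimal sketch contrasting the two recurrences compared in the paper, with illustrative dimensions and random weights: an Elman network feeds the previous hidden state back into the update, while a Jordan network feeds back the previous output (slot-label) distribution.

```python
# Elman vs. Jordan recurrences for slot filling; all values are illustrative.
import numpy as np

E, H, L = 32, 64, 20                     # embedding, hidden, number of slot labels
rng = np.random.default_rng(0)
Wx, Wh = rng.normal(scale=0.1, size=(H, E)), rng.normal(scale=0.1, size=(H, H))
Wy = rng.normal(scale=0.1, size=(H, L))  # Jordan: previous label distribution -> hidden
Wo = rng.normal(scale=0.1, size=(L, H))

def softmax(z):
    z = np.exp(z - z.max())
    return z / z.sum()

def elman_step(x, h_prev):
    h = np.tanh(Wx @ x + Wh @ h_prev)    # hidden-state feedback
    return h, softmax(Wo @ h)

def jordan_step(x, y_prev):
    h = np.tanh(Wx @ x + Wy @ y_prev)    # output feedback
    return h, softmax(Wo @ h)

words = rng.normal(size=(5, E))          # embeddings for a 5-word utterance
h, y = np.zeros(H), np.zeros(L)
for x in words:
    h, _ = elman_step(x, h)
    _, y = jordan_step(x, y)
```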
An introduction to computational networks and the computational network toolkit (invited talk)
TLDR
The computational network toolkit (CNTK), an implementation of computational networks (CNs) that supports both GPU and CPU, is introduced; the architecture and key components of CNTK, the command-line options for using it, and the network definition and model editing language are described.
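A minimal, generic sketch of the computational-network abstraction the toolkit implements, not CNTK's actual API or network definition language: a directed graph of operation nodes evaluated in dependency order.

```python
# Generic computational-graph sketch; node names and values are illustrative.
import numpy as np

class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs, self.value = op, inputs, None

    def forward(self):
        for n in self.inputs:              # evaluate dependencies first
            if n.value is None:
                n.forward()
        self.value = self.op(*(n.value for n in self.inputs))
        return self.value

def constant(v):
    return Node(lambda: np.asarray(v))

x = constant([[1.0, 2.0]])
W = constant([[0.5, -0.5], [0.25, 0.75]])
b = constant([[0.1, 0.1]])
times = Node(lambda a, w: a @ w, x, W)     # a "times" node
plus  = Node(np.add, times, b)             # a "plus" node
out   = Node(np.tanh, plus)                # a nonlinearity node
print(out.forward())
```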
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning
TLDR
This work introduces Hybrid Code Networks (HCNs), which combine an RNN with domain-specific knowledge encoded as software and system action templates, and considerably reduce the amount of training data required, while retaining the key benefit of inferring a latent representation of dialog state.
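A minimal sketch of one HCN dialog turn, with invented features, templates, and weights: developer-written domain code supplies context features and a mask over legal action templates, and the RNN's action distribution is multiplied by that mask before an action is chosen.

```python
# Hybrid Code Network turn sketch; features, templates, and the mask rule
# are illustrative assumptions, not the paper's task definition.
import numpy as np

TEMPLATES = ["ask_phone_type", "api:place_order", "confirm_order", "anything_else"]
rng = np.random.default_rng(0)
U, C, H = 16, 4, 32                               # utterance, context, hidden dims
Wx = rng.normal(scale=0.1, size=(H, U + C))
Wh = rng.normal(scale=0.1, size=(H, H))
Wo = rng.normal(scale=0.1, size=(len(TEMPLATES), H))

def domain_code(state):
    """Developer-written logic: context features and a mask over legal actions."""
    has_phone = float(state.get("phone_type") is not None)
    context = np.array([has_phone, 1 - has_phone, 0.0, 1.0])
    mask = np.array([1 - has_phone, has_phone, has_phone, 1.0])  # e.g. no ordering before a phone type is known
    return context, mask

def hcn_turn(utterance_feats, state, h_prev):
    context, mask = domain_code(state)
    x = np.concatenate([utterance_feats, context])
    h = np.tanh(Wx @ x + Wh @ h_prev)
    scores = np.exp(Wo @ h) * mask                # zero out illegal templates
    probs = scores / scores.sum()
    return TEMPLATES[int(np.argmax(probs))], h

action, h = hcn_turn(rng.normal(size=U), {"phone_type": None}, np.zeros(H))
print(action)                                     # one of the legal action templates
```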
Spoken language understanding using long short-term memory neural networks
TLDR
This paper investigates using long short-term memory (LSTM) neural networks, which contain input, output, and forget gates and are more advanced than a simple RNN, for the word labeling task, and proposes a regression model on top of the un-normalized LSTM scores to explicitly model output-label dependence.
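A minimal sketch of the labeling setup, assuming illustrative sizes and a simple additive combination rule rather than the paper's exact regression model: an LSTM produces un-normalized slot scores per word, and a linear layer mixes the previous position's scores into the current ones to capture output-label dependence.

```python
# LSTM slot tagger with a score-regression layer; dimensions and the
# combination rule are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMSlotTagger(nn.Module):
    def __init__(self, vocab=1000, emb=64, hidden=100, labels=20):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.score = nn.Linear(hidden, labels)            # un-normalized slot scores
        self.dep = nn.Linear(labels, labels, bias=False)   # regression on previous scores

    def forward(self, word_ids):
        h, _ = self.lstm(self.embed(word_ids))             # (batch, time, hidden)
        raw = self.score(h)                                # per-position scores
        outputs, prev = [], torch.zeros_like(raw[:, 0])
        for t in range(raw.size(1)):                       # mix in the previous step's scores
            cur = raw[:, t] + self.dep(prev)
            outputs.append(cur)
            prev = cur
        return torch.stack(outputs, dim=1)                 # (batch, time, labels)

tagger = LSTMSlotTagger()
scores = tagger(torch.randint(0, 1000, (2, 7)))            # 2 utterances of 7 words
print(scores.shape)                                        # torch.Size([2, 7, 20])
```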
Language Models for Image Captioning: The Quirks and What Works
TLDR
By combining key aspects of the ME and RNN methods, this paper achieves new record performance over previously published results on the benchmark COCO dataset; however, the gains the authors see in BLEU do not translate to human judgments.
A segmental CRF approach to large vocabulary continuous speech recognition
  • G. Zweig, P. Nguyen
  • Computer Science
    IEEE Workshop on Automatic Speech Recognition…
  • 1 December 2009
TLDR
A segmental conditional random field framework for large vocabulary continuous speech recognition is proposed that allows for joint or separate discriminative training of the acoustic and language models.
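A minimal sketch of the segmental idea, with a stand-in segment scorer: rather than scoring frame by frame, whole variable-length segments are scored and the best segmentation is found by dynamic programming; in the actual SCRF the segment score would be a weighted sum of segment-level feature functions tied to acoustic detections and the language model.

```python
# Dynamic programming over segmentations; the segment scorer and the toy
# frames are illustrative stand-ins, not the paper's feature functions.
import numpy as np

def best_segmentation(frames, segment_score, max_len=6):
    """Viterbi-style DP over all ways to cut `frames` into segments."""
    T = len(frames)
    best = np.full(T + 1, -np.inf)
    best[0], back = 0.0, [0] * (T + 1)
    for t in range(1, T + 1):
        for s in range(max(0, t - max_len), t):
            score = best[s] + segment_score(frames[s:t])
            if score > best[t]:
                best[t], back[t] = score, s
    cuts, t = [], T                          # recover segment boundaries
    while t > 0:
        cuts.append((back[t], t))
        t = back[t]
    return best[T], cuts[::-1]

frames = np.random.default_rng(0).normal(size=(12, 3))   # toy acoustic frames
score, cuts = best_segmentation(frames, lambda seg: -abs(len(seg) - 4))  # prefer ~4-frame segments
print(score, cuts)
```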
...