Publications
Effective Approaches to Attention-based Neural Machine Translation
TLDR
Two attention mechanisms are examined: a global approach that always attends to all source words, and a local one that looks at only a subset of source words at a time; both are shown to be effective on the WMT English-German translation tasks in both directions.
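A minimal NumPy sketch of the two mechanisms, under stated assumptions: dot-product scoring, a window half-width D, and a Gaussian with sigma = D/2 favoring positions near the alignment point p_t (which the paper predicts from the decoder state; here it is simply passed in). This is an illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention(h_t, src_states):
    """Attend over ALL source hidden states; src_states is (S, d)."""
    align = softmax(src_states @ h_t)      # one weight per source word
    return align @ src_states              # context vector, shape (d,)

def local_attention(h_t, src_states, p_t, D=2):
    """Attend only to a window of at most 2D+1 positions around p_t,
    down-weighting positions far from p_t with a Gaussian."""
    S = len(src_states)
    lo, hi = max(0, p_t - D), min(S, p_t + D + 1)
    window = src_states[lo:hi]
    align = softmax(window @ h_t)
    gauss = np.exp(-((np.arange(lo, hi) - p_t) ** 2) / (2 * (D / 2) ** 2))
    return (align * gauss) @ window
```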
Addressing the Rare Word Problem in Neural Machine Translation
TLDR
This paper proposes and implements an effective technique to address end-to-end neural machine translation's inability to correctly translate very rare words, and is the first to surpass the best result achieved on a WMT’14 contest task.
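A hedged sketch of the post-processing step behind the technique: each <unk> in the output is assumed to carry a pointer to an aligned source position (recovered from positional annotations in the paper; the alignment map, dictionary, and all names below are illustrative), so the rare word can be translated via dictionary lookup or copied verbatim.

```python
def replace_unks(output_tokens, align, source_tokens, dictionary):
    """Post-process an NMT output: for each <unk>, look up its aligned
    source word in a dictionary, or copy the source word verbatim.
    `align` maps output positions to source positions (assumed given)."""
    fixed = []
    for i, tok in enumerate(output_tokens):
        if tok == "<unk>" and i in align:
            src = source_tokens[align[i]]
            fixed.append(dictionary.get(src, src))   # translate or copy
        else:
            fixed.append(tok)
    return fixed

# e.g. copying the unseen name "Zurich" straight from the source:
print(replace_unks(["das", "ist", "<unk>"], {2: 2},
                   ["this", "is", "Zurich"], {"this": "das", "is": "ist"}))
# -> ['das', 'ist', 'Zurich']
```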
Better Word Representations with Recursive Neural Networks for Morphology
TLDR
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models that supply contextual information for learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
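A toy sketch of the recursive composition, assuming a single shared matrix W and a tanh nonlinearity (the paper trains such compositions jointly with a neural language model; sizes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                         # toy embedding size
W = rng.normal(scale=0.1, size=(d, 2 * d))    # shared composition matrix
b = np.zeros(d)
morpheme_vec = {m: rng.normal(size=d) for m in ["un", "fortun", "ate", "ly"]}

def word_vector(morphemes):
    """Build a word vector by recursively combining morpheme vectors:
    v <- tanh(W [v; m] + b), starting from the stem-most morpheme."""
    v = morpheme_vec[morphemes[0]]
    for m in morphemes[1:]:
        v = np.tanh(W @ np.concatenate([v, morpheme_vec[m]]) + b)
    return v

print(word_vector(["un", "fortun", "ate", "ly"]).round(3))
```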
Bilingual Word Representations with Monolingual Quality in Mind
TLDR
This work proposes a joint model that learns word representations from scratch, using both context co-occurrence information through the monolingual component and meaning-equivalence signals from the bilingual constraint, to learn high-quality bilingual representations efficiently.
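One common way to assemble such a joint objective is sketched below; the squared-distance bilingual term and the weight beta are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def joint_objective(mono_loss_src, mono_loss_tgt, E_src, E_tgt,
                    aligned_pairs, beta=1.0):
    """Combine two monolingual losses (e.g. skip-gram over co-occurrences)
    with a bilingual constraint pulling translation pairs together.
    E_src, E_tgt are embedding matrices; aligned_pairs holds (i, j) indices
    of words known to be translations of each other."""
    bi = sum(np.sum((E_src[i] - E_tgt[j]) ** 2) for i, j in aligned_pairs)
    return mono_loss_src + mono_loss_tgt + beta * bi
```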
When Are Tree Structures Necessary for Deep Learning of Representations?
TLDR
This paper benchmarks recursive neural models against sequential recurrent neural models, enforcing apples-to-apples comparison as much as possible, and introduces a method that allows recurrent models to achieve similar performance: breaking long sentences into clause-like units at punctuation and processing them separately before combining.
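A compact sketch of the clause-splitting trick; the plain tanh RNN, the punctuation set, and averaging as the combiner are assumptions chosen for brevity. The point it illustrates is only that each clause-like unit is encoded separately before combining.

```python
import re
import numpy as np

rng = np.random.default_rng(0)
d = 8
W_h = rng.normal(scale=0.1, size=(d, d))
W_x = rng.normal(scale=0.1, size=(d, d))

def embed(tok):
    """Toy embedding, consistent per token within a run."""
    return np.random.default_rng(abs(hash(tok)) % 2**32).normal(size=d)

def encode(tokens):
    """Plain tanh RNN over one clause; returns the final hidden state."""
    h = np.zeros(d)
    for t in tokens:
        h = np.tanh(W_h @ h + W_x @ embed(t))
    return h

def encode_by_clauses(sentence):
    """Split at punctuation, encode each clause-like unit separately,
    then combine (here: average) the clause vectors."""
    clauses = [c.split() for c in re.split(r"[,;:]", sentence) if c.strip()]
    return np.mean([encode(c) for c in clauses], axis=0)

v = encode_by_clauses("the model is simple, it trains fast, and it works")
```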
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
TLDR
This paper proposes a simple method that improves RNNs' ability to capture long-term dependencies by adding an unsupervised auxiliary loss to the original objective, making truncated backpropagation feasible for long sequences and also improving full BPTT.
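A loose PyTorch sketch of the idea, simplified from the paper: alongside the main next-token loss, a small auxiliary decoder is seeded with the hidden state at a random anchor and asked to reconstruct the preceding segment, so gradient flows to distant positions even when BPTT is truncated. All sizes, names, and the single-anchor sampling are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxLossLM(nn.Module):
    """Language model plus an auxiliary reconstruction decoder (toy sizes)."""
    def __init__(self, vocab, d=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.rnn = nn.LSTM(d, d, batch_first=True)
        self.lm_head = nn.Linear(d, vocab)
        self.aux_dec = nn.LSTM(d, d, batch_first=True)  # rebuilds the past
        self.aux_head = nn.Linear(d, vocab)

def combined_loss(model, x, y, L=4, aux_weight=0.5):
    """Main next-token loss plus the unsupervised auxiliary loss: from the
    hidden state at a random anchor, decode the L tokens preceding it."""
    h, _ = model.rnn(model.emb(x))                      # (B, T, d)
    main = F.cross_entropy(model.lm_head(h).transpose(1, 2), y)
    a = int(torch.randint(L + 1, x.size(1), ()))        # anchor position
    targets = x[:, a - L:a]                             # segment to rebuild
    dec_in = model.emb(x[:, a - L - 1:a - 1])           # teacher forcing
    h0 = h[:, a].unsqueeze(0).contiguous()              # (1, B, d) init state
    out, _ = model.aux_dec(dec_in, (h0, torch.zeros_like(h0)))
    aux = F.cross_entropy(model.aux_head(out).transpose(1, 2), targets)
    return main + aux_weight * aux
```

Here x is a (batch, T) tensor of token ids, y holds the next-token targets, and T must exceed L + 1.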
Learning Distributed Representations for Multilingual Text Sequences
TLDR
This work is similar in spirit to the recent paragraph-vector approach but extends it to the bilingual context so as to efficiently encode meaning-equivalent text sequences of multiple languages in the same semantic space.
Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
TLDR
This work presents Gradient Sign Dropout (GradDrop), a probabilistic masking procedure that samples gradients at an activation layer based on their level of consistency, and discusses how GradDrop reveals links between optimal multi-loss training and gradient stochasticity.
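A NumPy sketch of the masking rule: per element, a sign-purity score P = 0.5 * (1 + sum(grads) / sum(|grads|)) is computed across tasks, a sign is sampled with probability P, and only gradients of that sign pass through. The epsilon and naming below are assumptions; the formula follows the paper's description.

```python
import numpy as np

def grad_drop(task_grads, rng=None):
    """GradDrop sketch: given per-task gradients w.r.t. one activation
    (a list of same-shape arrays), sample a sign per element and keep
    only the gradient components carrying that sign."""
    rng = rng or np.random.default_rng(0)
    G = np.stack(task_grads)                    # (n_tasks, ...)
    P = 0.5 * (1.0 + G.sum(0) / (np.abs(G).sum(0) + 1e-12))
    keep_pos = rng.uniform(size=P.shape) < P    # P near 1: grads agree, positive
    mask = np.where(keep_pos, G > 0, G < 0)     # drop the opposing sign
    return (G * mask).sum(0)

# Two tasks that agree on the first component and conflict on the second:
print(grad_drop([np.array([0.5, 1.0]), np.array([0.5, -1.0])]))
# first component passes intact; second keeps only one sampled sign
```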
The More Extreme Nature of North American Monsoon Precipitation in the Southwestern United States as Revealed by a Historical Climatology of Simulated Severe Weather Events
Abstract
Long-term changes in North American monsoon (NAM) precipitation intensity in the Southwest U.S. are evaluated through the use of convective-permitting model simulations of objectively …
18F-FDG PET/CT appearance of metastatic brachial plexopathy involving epidural space from breast carcinoma.
TLDR
Hypermetabolic activity of the brachial plexopathy has extended superiorly and medially, involving the right lateral aspect of the epidural space at the C5-6 level, consistent with epidural involvement.