Publications
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
TLDR
It is argued that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and this capacity bottleneck is overcome via language-specific components and deepened NMT architectures.
Variational Neural Machine Translation
TLDR
This paper builds a neural posterior approximator conditioned on both the source and the target sides, equips it with a reparameterization technique to estimate the variational lower bound, and shows that the proposed variational neural machine translation achieves significant improvements over vanilla neural machine translation baselines.
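For reference, the variational lower bound optimized in this line of work is the standard conditional ELBO (written here in my own notation, which may differ in detail from the paper's):

$$\log p_\theta(\mathbf{y}\mid\mathbf{x}) \;\ge\; \mathbb{E}_{q_\phi(\mathbf{z}\mid\mathbf{x},\mathbf{y})}\!\left[\log p_\theta(\mathbf{y}\mid\mathbf{x},\mathbf{z})\right] \;-\; \mathrm{KL}\!\left(q_\phi(\mathbf{z}\mid\mathbf{x},\mathbf{y})\,\|\,p_\theta(\mathbf{z}\mid\mathbf{x})\right)$$

where the posterior approximator $q_\phi$ conditions on both source $\mathbf{x}$ and target $\mathbf{y}$, and the reparameterization $\mathbf{z} = \boldsymbol{\mu}_\phi + \boldsymbol{\sigma}_\phi \odot \boldsymbol{\epsilon}$, $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$, keeps the bound differentiable.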
Accelerating Neural Transformer via an Average Attention Network
TLDR
The proposed average attention network is applied to the decoder of the neural Transformer to replace the original target-side self-attention model, enabling the Transformer to decode sentences over four times faster than its original version with almost no loss in training time or translation performance.
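The core of the average attention idea can be sketched in a few lines: each decoder position attends uniformly to the current and all previous positions, which reduces to a cumulative average and can be updated in constant time per step during decoding. The snippet below is a minimal illustration (function name and shapes are my own; the full model additionally uses a feed-forward layer, a gating layer, and residual connections):

```python
import numpy as np

def average_attention(y):
    # y: (seq_len, d_model) decoder-side input representations.
    # g[j] = mean(y[0..j]): uniform weights over the current and all previous
    # positions, so no future information leaks in, and at inference time
    # g[j] can be maintained incrementally from g[j-1].
    cum = np.cumsum(y, axis=0)
    steps = np.arange(1, y.shape[0] + 1).reshape(-1, 1)
    return cum / steps
```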
Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention
TLDR
Results on WMT and IWSLT translation tasks with five translation directions show that deep Transformers with DS-Init and MAtt can substantially outperform their base counterpart in terms of BLEU, while matching the decoding speed of the baseline model thanks to the efficiency improvements of MAtt.
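As a rough sketch of the depth-scaled idea (the exact constants and distribution used in the paper may differ), DS-Init shrinks the usual Xavier-style initialization of the l-th layer by a factor on the order of 1/sqrt(l), so deeper layers start with smaller parameter variance and gradients flow more stably through the residual stack at the start of training:

```python
import numpy as np

def ds_init(d_in, d_out, layer_idx, alpha=1.0, seed=0):
    # Hypothetical helper: Xavier-uniform bound scaled by alpha / sqrt(l) for
    # the l-th layer (1-indexed). Deeper layers get smaller initial variance.
    rng = np.random.default_rng(seed)
    bound = np.sqrt(6.0 / (d_in + d_out)) * alpha / np.sqrt(layer_idx)
    return rng.uniform(-bound, bound, size=(d_in, d_out))
```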
Revisiting Low-Resource Neural Machine Translation: A Case Study
TLDR
It is shown that, without the use of any auxiliary monolingual or multilingual data, an optimized NMT system can outperform PBSMT with far less data than previously claimed.
Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition
TLDR
A Shallow Convolutional Neural Network (SCNN) is proposed, which contains only one hidden layer but is effective for relation recognition; the shallow structure alleviates overfitting, while the convolution and nonlinear operations help preserve the recognition and generalization ability of the model.
Variational Recurrent Neural Machine Translation
TLDR
A novel variational recurrent neural machine translation model is proposed that introduces a series of latent random variables, rather than a single latent variable, to model the translation procedure of a sentence in a generative way.
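Schematically, replacing the single latent variable of variational NMT with one latent variable per target position turns the lower bound into a per-step sum (this is my own schematic rendering in the style of a variational RNN; the exact conditioning structure in the paper may differ):

$$\log p_\theta(\mathbf{y}\mid\mathbf{x}) \;\ge\; \sum_{t=1}^{T}\Bigl(\mathbb{E}_{q_\phi(\mathbf{z}_t\mid\mathbf{x},\mathbf{y}_{\le t},\mathbf{z}_{<t})}\!\left[\log p_\theta(y_t\mid\mathbf{y}_{<t},\mathbf{z}_{\le t},\mathbf{x})\right] - \mathrm{KL}\!\left(q_\phi(\mathbf{z}_t\mid\cdot)\,\|\,p_\theta(\mathbf{z}_t\mid\mathbf{x},\mathbf{y}_{<t},\mathbf{z}_{<t})\right)\Bigr)$$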
Neural Machine Translation with Deep Attention
TLDR
A deep attention model (DeepAtt) is proposed that is capable of automatically determining what should be passed or suppressed from the corresponding encoder layer so as to make the distributed representation appropriate for high-level attention and translation.
A Context-Aware Recurrent Encoder for Neural Machine Translation
TLDR
This paper proposes a novel context-aware recurrent encoder (CAEncoder), as an alternative to the widely-used bidirectional encoder, such that the future and history contexts can be fully incorporated into the learned source representations.