Publications
Modeling Coverage for Neural Machine Translation
TLDR
This paper proposes coverage-based NMT, which maintains a coverage vector to keep track of the attention history and improves both translation quality and alignment quality over standard attention-based NMT.
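The coverage idea summarized above can be illustrated with a small sketch: past attention weights are accumulated into a coverage vector, which then discourages re-attending to source positions that are already covered. This is a minimal illustrative example, not the paper's exact formulation; the function name coverage_attention and the simple subtractive penalty term are assumptions made for the sketch.

```python
# Illustrative sketch (assumed formulation): attention scores are penalized by a
# coverage vector that accumulates past attention weights, so frequently attended
# source positions are gradually down-weighted.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def coverage_attention(scores_per_step, penalty=1.0):
    """scores_per_step: (T_target, T_source) raw alignment scores."""
    t_tgt, t_src = scores_per_step.shape
    coverage = np.zeros(t_src)            # accumulated attention history
    all_weights = []
    for t in range(t_tgt):
        # subtract a penalty proportional to how much each source word is already covered
        adjusted = scores_per_step[t] - penalty * coverage
        weights = softmax(adjusted)
        coverage += weights               # update the coverage vector
        all_weights.append(weights)
    return np.stack(all_weights), coverage

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(4, 6))      # 4 target steps, 6 source words
    weights, coverage = coverage_attention(scores)
    print(weights.round(2))
    print("final coverage:", coverage.round(2))
```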
MobileFaceNets: Efficient CNNs for Accurate Real-time Face Verification on Mobile Devices
TLDR
A class of extremely efficient CNN models, MobileFaceNets, is presented; these models use fewer than 1 million parameters, are specifically tailored for high-accuracy real-time face verification on mobile and embedded devices, and achieve significantly improved efficiency over previous state-of-the-art mobile CNNs.
Improving the Transformer Translation Model with Document-Level Context
TLDR
This work extends the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder, and introduces a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora.
Early Detection of Fake News on Social Media Through Propagation Path Classification with Recurrent and Convolutional Networks
TLDR
The experimental results demonstrate that the proposed models can detect fake news with over 90% accuracy within five minutes after it starts to spread and before it is retweeted 50 times, which is significantly faster than state-of-the-art baselines.
graph2vec: Learning Distributed Representations of Graphs
TLDR
This work proposes a neural embedding framework named graph2vec to learn data-driven distributed representations of arbitrary-sized graphs; the learned representations achieve significant improvements in classification and clustering accuracy over substructure representation learning approaches and are competitive with state-of-the-art graph kernels.
Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention
TLDR
A sentence encoding-based model for recognizing textual entailment that uses the sentence's first-stage representation to attend over the words appearing in the sentence itself, a mechanism called "Inner-Attention" in this paper.
Learning to Remember Translation History with a Continuous Cache
TLDR
This work proposes to augment NMT models with a very lightweight cache-like memory network that stores recent hidden representations as translation history; the probability distribution over generated words is updated online depending on the translation history retrieved from the memory.
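As a rough illustration of the cache-like memory described above, the sketch below stores recent hidden states together with the words generated at those states, turns dot-product similarity to cached states into a distribution over the vocabulary, and mixes it with the model's own distribution. The class CacheMemory, the lookup scheme, and the interpolation weight lam are assumptions made for this sketch, not the paper's actual design.

```python
# Minimal sketch (assumed design) of a cache-like memory for translation history.
import numpy as np
from collections import deque

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class CacheMemory:
    def __init__(self, capacity=100):
        self.keys = deque(maxlen=capacity)     # recent decoder hidden states
        self.values = deque(maxlen=capacity)   # word ids generated at those states

    def store(self, hidden, word_id):
        self.keys.append(hidden)
        self.values.append(word_id)

    def retrieve(self, query, vocab_size):
        """Turn similarity to cached states into a distribution over the vocabulary."""
        probs = np.zeros(vocab_size)
        if not self.keys:
            return probs
        sims = softmax(np.array([query @ k for k in self.keys]))
        for s, w in zip(sims, self.values):
            probs[w] += s
        return probs

def interpolate(model_probs, cache_probs, lam=0.3):
    # mix the NMT model's distribution with the cache distribution
    return (1 - lam) * model_probs + lam * cache_probs

# usage idea: after generating a word, store (decoder_state, word_id); at the next
# step, mix the retrieved cache distribution into the model's softmax output.
```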
Transductive Unbiased Embedding for Zero-Shot Learning
TLDR
This paper proposes a straightforward yet effective method named Quasi-Fully Supervised Learning (QFSL) to alleviate the bias problem in Zero-Shot Learning, which outperforms existing state-of-the-art approaches by a huge margin.
Neural Machine Translation with Reconstruction
TLDR
Experiments show that the proposed framework significantly improves the adequacy of NMT output and achieves superior translation results over state-of-the-art NMT and statistical MT systems.
Asynchronous Bidirectional Decoding for Neural Machine Translation
TLDR
This paper equips the conventional attentional encoder-decoder NMT framework with a backward decoder in order to explore bidirectional decoding for NMT, achieving substantial improvements of 3.14 and 1.38 BLEU points over conventional NMT on two translation tasks.
...