Publications
Learning to rank: from pairwise approach to listwise approach
TLDR
The paper is concerned with learning to rank, the task of constructing a model or function for ranking objects. A sketch of a listwise loss follows below.
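The title contrasts pairwise and listwise formulations; one common listwise formulation scores a whole list of candidates per query and compares the score distribution with the relevance distribution. The sketch below is illustrative only (the loss form, tensor shapes, and toy data are assumptions, not necessarily this paper's exact objective).

```python
# Minimal sketch of a listwise ranking loss: cross entropy between two
# "top-one" probability distributions over the documents of one query.
import torch
import torch.nn.functional as F

def listwise_loss(scores: torch.Tensor, relevance: torch.Tensor) -> torch.Tensor:
    """scores, relevance: (num_queries, num_docs)."""
    p_true = F.softmax(relevance, dim=-1)       # target distribution from labels
    log_p_pred = F.log_softmax(scores, dim=-1)  # model distribution (log space)
    return -(p_true * log_p_pred).sum(dim=-1).mean()

# Toy usage: 2 queries, 4 candidate documents each.
scores = torch.randn(2, 4, requires_grad=True)
relevance = torch.tensor([[2.0, 1.0, 0.0, 0.0], [0.0, 3.0, 1.0, 0.0]])
listwise_loss(scores, relevance).backward()
```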
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TLDR
We propose MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder based language generation tasks; it achieves state-of-the-art accuracy on unsupervised English-French translation.
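The core idea is to mask a contiguous fragment of the source sentence and train the decoder to reconstruct exactly that fragment. The sketch below only illustrates the data preparation step; the token ids, mask id, and masking ratio are assumptions, not the paper's exact settings.

```python
# Hedged sketch of MASS-style span masking for encoder-decoder pretraining.
import random

MASK = 0  # hypothetical id of the [MASK] token

def mass_example(tokens, mask_ratio=0.5):
    """Mask a contiguous span of the source; the decoder target is that span."""
    n = len(tokens)
    span = max(1, int(n * mask_ratio))
    start = random.randint(0, n - span)
    enc_input = tokens[:start] + [MASK] * span + tokens[start + span:]
    dec_target = tokens[start:start + span]   # fragment the decoder must predict
    dec_input = [MASK] + dec_target[:-1]      # shifted-right decoder input
    return enc_input, dec_input, dec_target

enc_in, dec_in, dec_out = mass_example([11, 12, 13, 14, 15, 16, 17, 18])
```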
LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval
TLDR
We have derived the LETOR data from existing data sets widely used in information retrieval, namely the OHSUMED and TREC data.
LETOR: A benchmark collection for research on learning to rank for information retrieval
TLDR
A benchmark collection for research on learning to rank for information retrieval, released by Microsoft Research Asia.
Dual Learning for Machine Translation
TLDR
We develop a dual-learning mechanism that enables an NMT system to learn automatically from unlabeled data through a dual learning game; a schematic round of the game is sketched below.
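The game pairs a forward and a backward translation model on monolingual text: translate, check fluency with a language model, translate back, and check reconstruction. The sketch below is a simplified, hedged illustration; the placeholder models, the 0/1 reconstruction reward, and the mixing weight are assumptions, not the paper's exact training procedure.

```python
# One round of a dual-learning game on a sentence with no reference translation.
def dual_learning_round(x, translate_fwd, translate_bwd, lm_score, alpha=0.5):
    y = translate_fwd(x)                  # A -> B, primal step
    r_lm = lm_score(y)                    # fluency reward from a language model of B
    x_rec = translate_bwd(y)              # B -> A, dual step
    r_rec = float(x_rec == x)             # toy reconstruction reward
    reward = alpha * r_lm + (1 - alpha) * r_rec
    return y, x_rec, reward               # reward would drive policy-gradient updates

# Toy usage with trivial stand-ins for the two NMT systems and the LM.
y, x_rec, r = dual_learning_round(
    "hello world",
    translate_fwd=lambda s: s.upper(),
    translate_bwd=lambda s: s.lower(),
    lm_score=lambda s: 1.0,
)
```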
FastSpeech: Fast, Robust and Controllable Text to Speech
TLDR
In this work, we propose a novel feed-forward network based on the Transformer to generate mel-spectrograms in parallel for TTS; the expansion step that makes parallel generation possible is sketched below.
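Parallel generation hinges on expanding phoneme-level hidden states to frame level using predicted durations, so the whole mel sequence can be produced in one forward pass. The sketch below shows only that expansion; the shapes and duration values are illustrative assumptions.

```python
# Minimal sketch of a length-regulator step for parallel mel-spectrogram generation.
import torch

def length_regulate(hidden: torch.Tensor, durations: torch.Tensor) -> torch.Tensor:
    """hidden: (num_phonemes, dim); durations: (num_phonemes,) integer frame counts."""
    return torch.repeat_interleave(hidden, durations, dim=0)  # (num_frames, dim)

hidden = torch.randn(4, 8)                   # 4 phonemes, 8-dim hidden states
durations = torch.tensor([2, 3, 1, 4])       # predicted frames per phoneme
frames = length_regulate(hidden, durations)  # 10 frame-level states, decoded in parallel
```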
Neural Architecture Optimization
TLDR
In this paper, we propose a simple and efficient method for automatic neural architecture design based on continuous, gradient-based optimization; the optimization step is sketched below.
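The continuous-optimization idea is to encode an architecture into an embedding, move that embedding along the gradient of a performance predictor, and decode it back into a discrete architecture. The tiny linear encoder, predictor, decoder, feature encoding, and step size below are illustrative stand-ins, not the paper's components.

```python
# Hedged sketch of one gradient step in a continuous architecture space.
import torch
import torch.nn as nn

dim = 16
encoder = nn.Linear(32, dim)    # toy: architecture features -> embedding
predictor = nn.Linear(dim, 1)   # toy: embedding -> predicted accuracy
decoder = nn.Linear(dim, 32)    # toy: embedding -> reconstructed architecture features

arch_feats = torch.randn(1, 32)                         # one architecture (toy encoding)
emb = encoder(arch_feats).detach().requires_grad_(True)
predictor(emb).sum().backward()                          # d(predicted accuracy)/d(embedding)
better_emb = emb + 0.1 * emb.grad                        # move toward higher predicted accuracy
new_arch_feats = decoder(better_emb)                     # decode back to an architecture
```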
Introducing LETOR 4.0 Datasets
TLDR
LETOR is a package of benchmark data sets for research on LEarning TO Rank, which contains standard features, relevance judgments, data partitioning, evaluation tools, and several baselines.
Global Ranking Using Continuous Conditional Random Fields
TLDR
This paper studies the global ranking problem using learning-to-rank methods.
Incorporating BERT into Neural Machine Translation
TLDR
We propose a new algorithm, the BERT-fused model, in which BERT is first used to extract representations for an input sequence, and these representations are then fused with each layer of the NMT model's encoder and decoder through attention mechanisms; a sketch of the fusion step follows.
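The fusion can be pictured as each layer running an extra attention over the pretrained representations alongside its ordinary self-attention and mixing the two outputs. The sketch below illustrates that idea only; the dimensions, random "BERT" outputs, and the 0.5 mixing weight are assumptions rather than the exact BERT-fused architecture.

```python
# Illustrative sketch of attention-based fusion inside one encoder layer.
import torch
import torch.nn as nn

d_model = 32
self_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
bert_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

src = torch.randn(2, 10, d_model)        # NMT layer input (batch, src_len, dim)
bert_repr = torch.randn(2, 12, d_model)  # stand-in for fixed pretrained representations

h_self, _ = self_attn(src, src, src)              # ordinary self-attention
h_bert, _ = bert_attn(src, bert_repr, bert_repr)  # attention over the pretrained outputs
fused = 0.5 * (h_self + h_bert)                   # fuse the two streams
```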