Unified Language Model Pre-training for Natural Language Understanding and Generation
A new Unified pre-trained Language Model (UniLM) is presented that can be fine-tuned for both natural language understanding and generation tasks; it compares favorably with BERT on the GLUE benchmark and on the SQuAD 2.0 and CoQA question answering tasks.
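As a rough illustration of the unified modeling idea (a sketch, not the authors' code), the snippet below builds the three kinds of self-attention masks such a model can switch between for understanding, unidirectional generation, and sequence-to-sequence generation; the function names and the boolean "True means may attend" convention are assumptions of this example.

```python
# Minimal sketch: the three self-attention masks a UniLM-style model
# alternates between. mask[i, j] == True means token i may attend to token j.
import numpy as np

def bidirectional_mask(n):
    # NLU-style: every token sees every other token.
    return np.ones((n, n), dtype=bool)

def left_to_right_mask(n):
    # Unidirectional LM: token i sees only tokens 0..i.
    return np.tril(np.ones((n, n), dtype=bool))

def seq2seq_mask(src_len, tgt_len):
    # Generation-style: source tokens attend bidirectionally within the
    # source; target tokens attend to all source tokens and to earlier
    # target tokens only.
    n = src_len + tgt_len
    mask = np.zeros((n, n), dtype=bool)
    mask[:src_len, :src_len] = True                      # source <-> source
    mask[src_len:, :src_len] = True                      # target -> source
    mask[src_len:, src_len:] = np.tril(np.ones((tgt_len, tgt_len), dtype=bool))
    return mask

print(seq2seq_mask(3, 2).astype(int))
```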
Gated Self-Matching Networks for Reading Comprehension and Question Answering
- Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, M. Zhou
- Computer Science · Annual Meeting of the Association for…
- 1 July 2017
Gated self-matching networks for reading-comprehension-style question answering, which aims to answer questions from a given passage, are presented; the model holds first place on the SQuAD leaderboard for both single and ensemble models.
Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification
- Duyu Tang, Furu Wei, Nan Yang, M. Zhou, Ting Liu, Bing Qin
- Computer Science · Annual Meeting of the Association for…
- 1 June 2014
Three neural networks are developed that effectively incorporate supervision from the sentiment polarity of text (e.g., sentences or tweets) into their loss functions, and the performance of SSWE is further improved by concatenating SSWE with an existing feature set.
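To make the loss description above concrete, here is a minimal sketch assuming a margin-based ranking formulation that mixes a context term with a sentiment-polarity term; the function names, the margin value, and the mixing weight `alpha` are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of a combined objective: a context ranking loss and a
# sentiment ranking loss are mixed so the word embeddings encode both signals.
import torch

def hinge_rank_loss(score_true, score_corrupt, margin=1.0):
    # Margin ranking loss: prefer the true ngram/polarity score over a
    # corrupted one by at least `margin`.
    return torch.clamp(margin - score_true + score_corrupt, min=0.0)

def sswe_style_loss(ctx_true, ctx_corrupt, sent_true, sent_corrupt, alpha=0.5):
    # alpha weighs syntactic-context supervision against sentiment-polarity
    # supervision (the value 0.5 is an assumption of this sketch).
    return (alpha * hinge_rank_loss(ctx_true, ctx_corrupt)
            + (1.0 - alpha) * hinge_rank_loss(sent_true, sent_corrupt)).mean()

# Toy usage with random scores for a batch of 4 ngrams.
scores = [torch.randn(4) for _ in range(4)]
print(sswe_style_loss(*scores).item())
```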
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou
- Computer Science · Neural Information Processing Systems
- 25 February 2020
This work presents a simple and effective approach to compress large Transformer (Vaswani et al., 2017) based pre-trained models, termed deep self-attention distillation, and demonstrates that the monolingual model outperforms state-of-the-art baselines across different parameter sizes of student models.
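The distillation idea can be sketched as follows, assuming the student mimics the teacher's last-layer self-attention distributions and value relations via KL divergence; the tensor shapes and function names here are illustrative, not the released MiniLM implementation.

```python
# Minimal sketch of deep self-attention distillation: match teacher and
# student last-layer attention distributions (and value relations) with KL.
import torch
import torch.nn.functional as F

def attention_kl(teacher_attn, student_attn, eps=1e-8):
    # Both tensors: (batch, heads, seq, seq) attention probabilities that
    # already sum to 1 over the last dimension.
    kl = teacher_attn * (torch.log(teacher_attn + eps) - torch.log(student_attn + eps))
    return kl.sum(dim=-1).mean()

def value_relation(values):
    # values: (batch, heads, seq, head_dim); "value relation" here is the
    # softmax of scaled dot-products between value vectors.
    d = values.size(-1)
    return F.softmax(values @ values.transpose(-1, -2) / d ** 0.5, dim=-1)

def minilm_style_loss(t_attn, s_attn, t_values, s_values):
    return attention_kl(t_attn, s_attn) + attention_kl(
        value_relation(t_values), value_relation(s_values))

# Toy shapes: batch=2, heads=4, seq=8, head_dim=16.
t_attn = F.softmax(torch.randn(2, 4, 8, 8), dim=-1)
s_attn = F.softmax(torch.randn(2, 4, 8, 8), dim=-1)
print(minilm_style_loss(t_attn, s_attn,
                        torch.randn(2, 4, 8, 16), torch.randn(2, 4, 8, 16)).item())
```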
Neural Question Generation from Text: A Preliminary Study
- Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, M. Zhou
- Computer Science · Natural Language Processing and Chinese Computing
- 6 April 2017
A preliminary study on neural question generation from text is conducted with the SQuAD dataset, and the experimental results show that the method can produce fluent and diverse questions.
Neural Document Summarization by Jointly Learning to Score and Select Sentences
- Qingyu Zhou, Nan Yang, Furu Wei, Shaohan Huang, M. Zhou, T. Zhao
- Computer Science · Annual Meeting of the Association for…
- 1 July 2018
This paper presents a novel end-to-end neural network framework for extractive document summarization that jointly learns to score and select sentences and significantly outperforms state-of-the-art extractive summarization models.
Selective Encoding for Abstractive Sentence Summarization
- Qingyu Zhou, Nan Yang, Furu Wei, M. Zhou
- Computer Science · Annual Meeting of the Association for…
- 1 April 2017
The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
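A minimal sketch of what a selective gate over encoder states might look like, assuming a sigmoid gate conditioned on a whole-sentence vector that filters each hidden state before decoding; the parameter names and shapes are assumptions of this example, not the authors' code.

```python
# Minimal sketch of selective encoding: filter each encoder hidden state
# with a sigmoid gate computed from a whole-sentence representation.
import torch

def selective_encode(hidden, sent_repr, W, U, b):
    # hidden: (seq, d) encoder states; sent_repr: (d,) sentence vector.
    # Parameter names W, U, b are assumptions for this sketch.
    gate = torch.sigmoid(hidden @ W + sent_repr @ U + b)   # (seq, d)
    return hidden * gate                                    # element-wise filtering

# Toy usage with random parameters (d = 8, seq = 5).
d = 8
h = torch.randn(5, d)
s = torch.randn(d)
print(selective_encode(h, s, torch.randn(d, d), torch.randn(d, d), torch.randn(d)).shape)
```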
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
- Hangbo Bao, Li Dong, H. Hon
- Computer Science · International Conference on Machine Learning
- 28 February 2020
The experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
- Zewen Chi, Li Dong, M. Zhou
- Computer Science · North American Chapter of the Association for…
- 15 July 2020
An information-theoretic framework is presented that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual multi-granularity texts, and a new pre-training task based on contrastive learning is proposed.
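A hedged sketch of the kind of contrastive objective described above: an InfoNCE-style loss over sentence representations of translation pairs with in-batch negatives; the temperature value and function name are assumptions of this example, not the released InfoXLM code.

```python
# Minimal sketch of a cross-lingual InfoNCE objective: each source sentence
# should be most similar to its own translation among in-batch candidates.
import torch
import torch.nn.functional as F

def cross_lingual_contrastive_loss(src_repr, tgt_repr, temperature=0.1):
    # src_repr, tgt_repr: (batch, dim) encodings of parallel sentences;
    # row i of src_repr and row i of tgt_repr are translations of each other.
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature          # (batch, batch) similarities
    labels = torch.arange(src.size(0))            # index of the true translation
    return F.cross_entropy(logits, labels)

# Toy usage with random 8x128 "encodings".
print(cross_lingual_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128)).item())
```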
Sentiment Embeddings with Applications to Sentiment Analysis
- Duyu Tang, Furu Wei, Bing Qin, Nan Yang, Ting Liu, M. Zhou
- Computer Science · IEEE Transactions on Knowledge and Data…
- 1 February 2016
This work develops a number of neural networks with tailored loss functions and applies sentiment embeddings to word-level sentiment analysis, sentence-level sentiment classification, and building sentiment lexicons, with results that consistently outperform context-based embeddings on several benchmark datasets for these tasks.
...
...