Multi-Task Deep Neural Networks for Natural Language Understanding
A Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks, which enables domain adaptation with substantially fewer in-domain labels than pre-trained BERT representations require.
On the Variance of the Adaptive Learning Rate and Beyond
This work identifies a problem with the adaptive learning rate, shows that warmup acts as a variance reduction technique, and proposes RAdam, a new variant of Adam that introduces a term to rectify the variance of the adaptive learning rate.
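The variance rectification described above can be sketched in a few lines; `rectification_term` is a hypothetical helper name, and this is a sketch of the published RAdam formulas rather than the authors' reference implementation:

```python
import math

def rectification_term(t, beta2=0.999):
    """RAdam's variance rectification factor r_t at step t (1-indexed).

    Returns None while the length of the approximated simple moving
    average, rho_t, makes the variance intractable (rho_t <= 4); RAdam
    then falls back to an un-adapted, momentum-style update, which is
    what mimics learning-rate warmup.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0  # maximum SMA length
    beta2_t = beta2 ** t
    rho_t = rho_inf - 2.0 * t * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        return None  # variance intractable: skip the adaptive term
    # Rectify the variance of the adaptive learning rate.
    return math.sqrt(((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
                     / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t))
```

Early steps return None (the implicit warmup phase), and r_t approaches 1 as t grows, recovering plain Adam.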
Unified Language Model Pre-training for Natural Language Understanding and Generation
A new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks that compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks.
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
- Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson
- Computer Science, ACL
- 10 November 2019
This work presents a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder and achieves the new state-of-the-art performance on the Spider leaderboard.
New approaches to H∞ controller designs based on fuzzy observers for T-S fuzzy systems via LMI
Using the LMI technique, it is shown that the regulators, the fuzzy observers, and the H∞ controller designs based on new observers for T-S fuzzy systems are very practical and efficient.
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
- Hao Fu, Chunyuan Li, Xiaodong Liu, Jianfeng Gao, Asli Celikyilmaz, L. Carin
- Computer Science, Mathematics, NAACL
- 25 March 2019
A cyclical annealing schedule is proposed, which simply repeats the process of increasing β multiple times and allows more meaningful latent codes to be learned progressively by leveraging the results of previous learning cycles as warm restarts.
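The schedule can be sketched as a small function of the training step; the name `cyclical_beta` and the default of four cycles with a 0.5 increase ratio are illustrative assumptions, not values fixed by the summary above:

```python
def cyclical_beta(step, total_steps, n_cycles=4, ratio=0.5):
    """KL weight beta at a given training step under a cyclical schedule.

    Each cycle linearly increases beta from 0 to 1 over the first
    `ratio` fraction of the cycle, then holds it at 1 for the rest.
    When a new cycle begins, beta restarts at 0 (the warm restart).
    """
    period = total_steps / n_cycles
    tau = (step % period) / period  # progress within the current cycle, in [0, 1)
    return min(1.0, tau / ratio)
```

Multiplying the KL term of the VAE objective by this weight repeats the annealing from scratch each cycle, so later cycles start from latent codes already shaped by earlier ones.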
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
- Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, T. Zhao
- Computer Science, Mathematics, ACL
- 8 November 2019
A new learning framework for robust and efficient fine-tuning of pre-trained models to attain better generalization performance, which outperforms the state-of-the-art T5 model, the largest pre-trained model with 11 billion parameters, on GLUE.
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
- Yuxian Gu, Robert Tinn, +6 authors Hoifung Poon
- Computer Science, ACM Transactions on Computing for Healthcare
- 31 July 2020
It is shown that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
This paper explores the use of knowledge distillation to improve a Multi-Task Deep Neural Network (MT-DNN) (Liu et al., 2019) for learning text representations across multiple natural language understanding tasks, and shows that the distilled MT-DNN significantly outperforms the original MT-DNN on 7 out of 9 GLUE tasks.
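The core of this kind of distillation, averaging an ensemble of teachers' class probabilities into soft targets and training the student against them, can be sketched as below; the function names are illustrative, and the plain cross-entropy loss is one common formulation rather than the paper's exact objective:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def ensemble_soft_targets(teacher_logits_list):
    """Average the class probabilities of several teachers (one logit
    vector per teacher) into a single soft-target distribution."""
    probs = [softmax(logits) for logits in teacher_logits_list]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

def distillation_loss(student_logits, soft_targets):
    """Cross-entropy between the teachers' soft targets and the
    student's predicted distribution."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(p) for t, p in zip(soft_targets, student_probs))
```

A student whose predictions agree with the averaged teachers incurs a lower loss than one that disagrees, which is what pushes the distilled model toward the ensemble's behavior.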
The fuzzy sets and systems based on AFS structure, EI algebra and EII algebra
- Xiaodong Liu
- Computer Science, Fuzzy Sets Syst.
- 16 April 1998
EI algebra and EII algebra, which are infinite distributive molecular lattices, and the AFS structure, which is a special system in the sense of Graver and Watkins (1977), are defined, establishing a totally new system of fuzzy sets and systems.