CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
- Bill Yuchen Lin, Minghan Shen, Xiang Ren
- Findings of EMNLP
- 14 February 2020
Introduces CommonGen, a constrained text generation task with an associated benchmark dataset that explicitly tests machines for generative commonsense reasoning, and demonstrates that the learned generative commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA by generating additional context.
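The task format (compose a coherent sentence covering a given set of concepts) can be illustrated with a generic sequence-to-sequence model. The sketch below uses a plain HuggingFace T5 checkpoint; the prompt string and the `t5-base` checkpoint are illustrative assumptions rather than the benchmark's official baseline, and an off-the-shelf model would need fine-tuning on the CommonGen training split to produce good outputs.

```python
# Minimal sketch of the CommonGen-style concept-to-text setup.
# Assumptions: plain "t5-base" checkpoint and an ad-hoc prompt format;
# the benchmark's baselines fine-tune seq2seq models on the CommonGen training data.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

concepts = ["dog", "frisbee", "catch", "throw"]               # input: an unordered concept set
prompt = "generate a sentence with: " + " ".join(concepts)    # hypothetical prompt format

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# after fine-tuning, the expected behavior is a sentence covering all concepts,
# e.g. "A man throws a frisbee and his dog catches it."
```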
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
- Bill Yuchen Lin, Xinyue Chen, Jamin Chen, Xiang Ren
- Conference on Empirical Methods in Natural Language Processing
- 4 September 2019
This paper proposes a textual inference framework for answering commonsense questions, which effectively utilizes external, structured commonsense knowledge graphs to perform explainable inferences.
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
- Yanlin Feng, Xinyue Chen, Bill Yuchen Lin, Peifeng Wang, Jun Yan, Xiang Ren
- Conference on Empirical Methods in Natural Language Processing
- 1 May 2020
A novel knowledge-aware approach that equips pre-trained language models (PTLMs) with a multi-hop relational reasoning module, named the multi-hop graph relation network (MHGRN), which performs multi-hop, multi-relational reasoning over subgraphs extracted from external knowledge graphs.
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
- Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
- Conference on Empirical Methods in Natural Language Processing
- 18 April 2021
This paper presents the NLP Few-shot Gym, a repository of 160 diverse few-shot NLP tasks created from open-access NLP datasets and converted to a unified text-to-text format, and reveals that few-shot learning ability on unseen tasks can be improved via an upstream learning stage using a set of seen tasks.
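The unified text-to-text format means every task, whether classification or generation, is cast as mapping an input string to an output string. The following is a minimal sketch of such a conversion for a sentiment-classification example; the prefix and field names are hypothetical and only illustrate the idea, not the repository's actual conversion code.

```python
# Minimal sketch of casting a classification example into a text-to-text pair.
# The prefix and field names are hypothetical; the NLP Few-shot Gym defines its
# own converter for each of its 160 source datasets.
def to_text_to_text(example: dict, task_name: str) -> tuple[str, str]:
    source = f"{task_name} sentence: {example['text']}"   # input string fed to the model
    target = example["label"]                             # output string to be generated
    return source, target

src, tgt = to_text_to_text({"text": "The movie was great.", "label": "positive"},
                           task_name="sentiment")
print(src)  # "sentiment sentence: The movie was great."
print(tgt)  # "positive"
```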
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
- A. Srivastava, Abhinav Rastogi, Uri Shaham
- ArXiv
- 9 June 2022
Evaluation of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters, finds that model performance and calibration both improve with scale but are poor in absolute terms.
Birds Have Four Legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
- Bill Yuchen Lin, Seyeon Lee, Rahul Khanna, Xiang Ren
- Conference on Empirical Methods in Natural Language Processing
- 2 May 2020
An investigation of whether and to what extent one can induce numerical commonsense knowledge from PTLMs, as well as the robustness of this process, finds that this may not work for numerical commonsense knowledge.
Pre-training Text-to-Text Transformers for Concept-centric Common Sense
- Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Bill Yuchen Lin, Xiang Ren
- International Conference on Learning Representations
- 24 October 2020
It is shown that, while only incrementally pre-trained on a relatively small corpus for a few steps, CALM outperforms baseline methods by a consistent margin and is even comparable with some larger PTLMs, which suggests that CALM can serve as a general, plug-and-play method for improving the commonsense reasoning ability of a PTLM.
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
- Bill Yuchen Lin, Dong-Ho Lee, Xiang Ren
- Annual Meeting of the Association for Computational Linguistics
- 16 April 2020
The proposed model, the Trigger Matching Network, jointly learns trigger representations and a soft matching module with self-attention so that it can easily generalize to unseen sentences for tagging, and is significantly more cost-effective than traditional neural NER frameworks.
FedNLP: A Research Platform for Federated Learning in Natural Language Processing
- Bill Yuchen Lin, Chaoyang He, S. Avestimehr
- ArXiv
- 2021
The preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets — opening intriguing and exciting future research directions aimed at developing FL methods suited to NLP tasks.
Multi-channel BiLSTM-CRF Model for Emerging Named Entity Recognition in Social Media
- Bill Yuchen Lin, Frank F. Xu, Zhiyi Luo, Kenny Q. Zhu
- NUT@EMNLP
- 1 September 2017
A novel approach is proposed, which incorporates comprehensive word representations with multi-channel information and Conditional Random Fields (CRF) into a traditional Bidirectional Long Short-Term Memory (BiLSTM) neural network without using any additional hand-crafted features such as gazetteers.
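As a rough illustration of the architecture described above, the sketch below concatenates several word-level input channels (word embeddings plus a character-derived representation), feeds them through a bidirectional LSTM, and projects to per-token tag scores. The dimensions and channel choices are made-up assumptions, and the paper's CRF decoding layer is only indicated in a comment, not implemented.

```python
# Minimal sketch of a multi-channel BiLSTM tagger in PyTorch.
# Dimensions and channels are illustrative assumptions; the paper's model
# additionally places a CRF layer on top of the per-token scores for decoding.
import torch
import torch.nn as nn

class MultiChannelBiLSTMTagger(nn.Module):
    def __init__(self, vocab_size=10000, word_dim=100, char_dim=50,
                 hidden_dim=200, num_tags=9):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)    # channel 1: word embeddings
        self.char_proj = nn.Linear(char_dim, char_dim)        # channel 2: char-level features
        self.bilstm = nn.LSTM(word_dim + char_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.to_tags = nn.Linear(2 * hidden_dim, num_tags)    # emission scores per token

    def forward(self, word_ids, char_feats):
        x = torch.cat([self.word_emb(word_ids), self.char_proj(char_feats)], dim=-1)
        h, _ = self.bilstm(x)
        return self.to_tags(h)  # a CRF layer would decode these emission scores jointly

# toy batch: 2 sentences of 5 tokens, with 50-dim char-level features per token
emissions = MultiChannelBiLSTMTagger()(torch.randint(0, 10000, (2, 5)),
                                        torch.randn(2, 5, 50))
print(emissions.shape)  # torch.Size([2, 5, 9])
```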
...