Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
- Suchin Gururangan, Ana Marasović, Noah A. Smith
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 23 April 2020
It is consistently found that multi-phase adaptive pretraining offers large gains in task performance, and it is shown that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable.
Annotation Artifacts in Natural Language Inference Data
- Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith
- Computer Science · North American Chapter of the Association for Computational Linguistics
- 6 March 2018
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
- Swabha Swayamdipta, Roy Schwartz, Yejin Choi
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 22 September 2020
The results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization; the work also contributes a model-based tool to characterize and diagnose datasets.
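The data map underlying this diagnosis is built from training dynamics: for each example, the mean of the model's probability of the gold label across epochs gives a confidence coordinate, and the standard deviation gives a variability coordinate. A minimal sketch of that computation, where the function name and all probability values are invented for illustration (not taken from the paper's code):

```python
import numpy as np

def data_map_coords(gold_probs_per_epoch):
    """Compute data-map coordinates from training dynamics.

    gold_probs_per_epoch: (epochs, examples) array holding the model's
    probability of the gold label after each epoch (toy values here).
    Returns per-example confidence (mean over epochs) and
    variability (standard deviation over epochs).
    """
    p = np.asarray(gold_probs_per_epoch, dtype=float)
    return p.mean(axis=0), p.std(axis=0)

# Three examples tracked over three epochs (made-up probabilities):
# example 0 is consistently easy, example 1 consistently hard,
# example 2 is ambiguous -- the model flip-flops on it.
probs = [[0.90, 0.20, 0.50],
         [0.95, 0.10, 0.90],
         [0.92, 0.15, 0.20]]
confidence, variability = data_map_coords(probs)
```

Easy examples land at high confidence and low variability, hard ones at low confidence, and ambiguous ones at high variability, which is what makes the map useful for selecting quality over quantity.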
DyNet: The Dynamic Neural Network Toolkit
- Graham Neubig, Chris Dyer, Pengcheng Yin
- Computer Science · arXiv.org
- 15 January 2017
DyNet is a toolkit for implementing neural network models based on dynamic declaration of network structure that has an optimized C++ backend and lightweight graph representation and is designed to allow users to implement their models in a way that is idiomatic in their preferred programming language.
A Dependency Parser for Tweets
- Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, Noah A. Smith
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 1 October 2014
A new dependency parser for English tweets, TWEEBOPARSER, which builds on several contributions: new syntactic annotations for a corpus of tweets, with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data.
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
- Alisa Liu, Maarten Sap, Yejin Choi
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 7 May 2021
This work applies DExperts to language detoxification and sentiment-controlled generation, where it outperforms existing controllable generation methods on both automatic and human evaluations, highlighting the promise of tuning small LMs on text with (un)desirable attributes for efficient decoding-time steering.
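The decoding-time steering idea can be sketched as an ensemble over next-token logits: the base LM's logits are shifted toward an expert tuned on desirable text and away from an anti-expert tuned on undesirable text. The function name, the toy 4-token vocabulary, and every logit value below are assumptions made for illustration:

```python
import numpy as np

def steered_logits(base, expert, anti_expert, alpha=2.0):
    """Shift base next-token logits by the expert/anti-expert difference:
    base + alpha * (expert - anti_expert)."""
    return np.asarray(base) + alpha * (np.asarray(expert) - np.asarray(anti_expert))

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Toy 4-token vocabulary; all logit values are invented.
base        = np.array([1.0, 0.5, 0.2, -0.3])   # base LM
expert      = np.array([1.2, 0.1, 0.3, -0.5])   # tuned on desirable text
anti_expert = np.array([0.2, 0.9, 0.1,  0.4])   # tuned on undesirable text

steered = steered_logits(base, expert, anti_expert)
probs = softmax(steered)
```

Token 1, which the anti-expert favors, ends up suppressed relative to the base distribution, while token 0, which the expert favors, is boosted; only the small expert/anti-expert pair needs tuning, not the base LM.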
Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold
- Swabha Swayamdipta, Sam Thomson, Chris Dyer, Noah A. Smith
- Computer Science · arXiv.org
- 29 June 2017
A new, efficient frame-semantic parser that labels semantic arguments to FrameNet predicates, built using an extension to the segmental RNN that emphasizes recall, achieves competitive performance without any calls to a syntactic parser.
Adversarial Filters of Dataset Biases
- Ronan Le Bras, Swabha Swayamdipta, Yejin Choi
- Computer Science · International Conference on Machine Learning
- 10 February 2020
This work presents extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks.
Transfer Learning in Natural Language Processing
- Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf
- Computer Science · North American Chapter of the Association for Computational Linguistics
- 1 June 2019
An overview of modern transfer learning methods in NLP is presented: how models are pre-trained, what information their learned representations capture, and examples and case studies of how these models can be integrated and adapted in downstream NLP tasks.
The Right Tool for the Job: Matching Model and Instance Complexities
- Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 16 April 2020
This work proposes a modification to contextual representation fine-tuning which allows for an early (and fast) “exit” from neural network calculations for simple instances, and late (and accurate) exit for hard instances during inference.
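The early-exit idea can be sketched as a cascade of classifiers attached to successive layers: at inference, an instance stops at the first layer whose prediction is confident enough, so simple instances cost less compute than hard ones. The threshold value, function name, and per-layer logits below are all hypothetical:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def early_exit_predict(layer_logits, threshold=0.9):
    """Return (prediction, exit_depth): stop at the first layer whose
    classifier's max softmax probability meets the confidence threshold,
    otherwise fall back to the deepest layer's prediction."""
    for depth, logits in enumerate(layer_logits, start=1):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() >= threshold:
            return int(probs.argmax()), depth
    return int(probs.argmax()), depth

# Toy per-layer logits for two instances (values invented):
# a "hard" one where the first layer is unsure but the second is confident,
hard = [[2.0, 0.1], [4.0, 0.0]]
# and an "easy" one the first layer already classifies confidently.
easy = [[5.0, 0.0], [6.0, -1.0]]

pred_hard, depth_hard = early_exit_predict(hard)
pred_easy, depth_easy = early_exit_predict(easy)
```

The easy instance exits after the first layer while the hard one runs deeper, which is the mechanism behind the speed/accuracy trade-off the paper describes.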