MERLOT: Multimodal Neural Script Knowledge Models
This work introduces MERLOT, a model that learns multimodal script knowledge by watching millions of YouTube videos with transcribed speech – in an entirely label-free, self-supervised manner – and achieves state-of-the-art performance on 12 different video QA datasets when finetuned.
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
This work highlights the promise of tuning small LMs on text with (un)desirable attributes for efficient decoding-time steering and applies DExperts to language detoxification and sentiment-controlled generation, where it outperforms existing controllable generation methods on both automatic and human evaluations.
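A minimal sketch of the decoding-time ensemble this summary describes, assuming Hugging Face `transformers`; the GPT-2 checkpoints are stand-ins (in the paper the expert and anti-expert would be small LMs tuned on desirable and undesirable text), and `alpha` is a hypothetical steering weight:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2")
expert = AutoModelForCausalLM.from_pretrained("gpt2")       # stand-in: tune on desirable text
anti_expert = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in: tune on undesirable text

def dexperts_step(input_ids, alpha=2.0):
    # One decoding step: combine logits as z_base + alpha * (z_expert - z_anti), then sample.
    with torch.no_grad():
        z_base = base(input_ids).logits[:, -1, :]
        z_exp = expert(input_ids).logits[:, -1, :]
        z_anti = anti_expert(input_ids).logits[:, -1, :]
    probs = torch.softmax(z_base + alpha * (z_exp - z_anti), dim=-1)
    return torch.multinomial(probs, num_samples=1)

ids = tok("The movie was", return_tensors="pt").input_ids
for _ in range(20):
    ids = torch.cat([ids, dexperts_step(ids)], dim=-1)
print(tok.decode(ids[0]))
```

Because only output logits are combined, the base model is never retrained; the steering strength is controlled entirely by `alpha` at inference time.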
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
It is demonstrated that careful prompt engineering and a separately trained critic model allow us to selectively distill high-quality causal commonsense from GPT-3, a general language model, resulting in a neural commonsense model that surpasses the teacher's commonsense capabilities despite being 100x smaller.
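A hedged sketch of the distillation loop this summary describes: over-generate candidate commonsense knowledge from a large teacher LM, keep only candidates the critic scores highly, and train the smaller student on what survives. `sample_from_teacher`, `critic_score`, and `threshold` are hypothetical placeholders, not the paper's code:

```python
def distill_corpus(prompts, sample_from_teacher, critic_score, threshold=0.9):
    kept = []
    for prompt in prompts:
        # Over-generate candidates from the teacher (e.g. GPT-3 completions)...
        for candidate in sample_from_teacher(prompt, n=10):
            # ...and keep only those the critic judges to be high quality.
            if critic_score(prompt, candidate) >= threshold:
                kept.append((prompt, candidate))
    return kept  # training data for the much smaller student model
```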
Understanding Few-Shot Commonsense Knowledge Models
This work investigates training commonsense knowledge models in a few-shot setting with limited tuples per commonsense relation in the graph and finds that human quality ratings for knowledge produced by a few-shot trained system can come within 6% of those for knowledge produced by fully supervised systems.
NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints
- Ximing Lu, Peter West, Rowan Zellers, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Computer Science, NAACL
- 24 October 2020
This work proposes NeuroLogic Decoding, a simple yet effective algorithm that enables neural language models – supervised or not – to generate fluent text while satisfying complex lexical constraints; the results suggest the limits of large-scale neural networks for fine-grained controllable generation and the promise of inference-time algorithms.
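A drastically simplified sketch in the spirit of NeuroLogic Decoding, not the paper's algorithm: plain beam search whose ranking adds a bonus for each satisfied lexical-constraint clause (the real method handles full CNF constraints with dedicated grouping and pruning). `step_logprobs` is a hypothetical callback returning a dict of next-token log-probabilities for a prefix:

```python
def constrained_beam_search(step_logprobs, vocab, clauses, beam=4, steps=10, bonus=5.0):
    """clauses: list of token sets; a clause is satisfied if any member appears."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for seq, lp in beams:
            logps = step_logprobs(seq)  # dict: token -> log-prob of appending it
            for tok in vocab:
                candidates.append((seq + [tok], lp + logps[tok]))
        # Rank by fluency (log-prob) plus a bonus per satisfied constraint clause.
        def score(item):
            seq, lp = item
            return lp + bonus * sum(1 for c in clauses if c & set(seq))
        beams = sorted(candidates, key=score, reverse=True)[:beam]
    return beams[0][0]
```

For example, `clauses=[{"snow"}, {"ski", "sled"}]` asks the output to contain "snow" and at least one of "ski" or "sled".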
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
MERLOT Reserve is introduced, a model that represents videos jointly over time through a new training objective that learns from audio, subtitles, and video frames; the objective enables out-of-the-box prediction and reveals strong multimodal commonsense understanding.
Analyzing Commonsense Emergence in Few-shot Knowledge Models
The results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to knowledge already encoded during pretraining.
On-the-Fly Controlled Text Generation with Experts and Anti-Experts
This work proposes DExperts: Decoding-time Experts, a decoding-time method for controlled text generation which combines a pretrained language model with “experts” and/or “anti-experts” in an ensemble of language models.
HATNet: An End-to-End Holistic Attention Network for Diagnosis of Breast Biopsy Images
- Sachin Mehta, Ximing Lu, D. Weaver, J. Elmore, Hannaneh Hajishirzi, L. Shapiro
- Computer Science, ArXiv
- 25 July 2020
This paper introduces a novel attention-based network, the Holistic ATtention Network (HATNet), which uses self-attention to encode global information, allowing it to learn representations from clinically relevant tissue structures without any explicit supervision, and outperforms the previous best network, Y-Net.
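As a generic illustration of the one architectural idea the summary names – self-attention over patch embeddings so each patch's representation carries global context – and not HATNet's actual design, a minimal PyTorch sketch with dummy features:

```python
import torch
import torch.nn as nn

embed_dim, num_patches = 256, 64
attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)

patches = torch.randn(1, num_patches, embed_dim)  # dummy per-patch CNN features
context, _ = attn(patches, patches, patches)      # each patch attends to every patch
slide_repr = context.mean(dim=1)                  # pooled input for a diagnosis head
print(slide_repr.shape)                           # torch.Size([1, 256])
```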
Generated Knowledge Prompting for Commonsense Reasoning
This work develops generated knowledge prompting, which consists of generating knowledge from a language model, then providing the knowledge as additional input when answering a question; the method improves the performance of large-scale, state-of-the-art models on four commonsense reasoning tasks.
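A hedged sketch of that two-stage pipeline, with `generate` and `answer` as hypothetical LM wrappers; the majority vote here is a simplification of the paper's confidence-based answer selection:

```python
from collections import Counter

def knowledge_prompted_answer(question, generate, answer, n_knowledge=5):
    # Step 1: elicit knowledge statements conditioned on the question.
    statements = [generate(f"Knowledge about: {question}") for _ in range(n_knowledge)]
    # Step 2: answer once per statement, with the knowledge prepended as context.
    votes = Counter(answer(f"{s}\n{question}") for s in statements)
    # Step 3 (simplification): take the answer best supported across statements.
    return votes.most_common(1)[0][0]
```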