• Corpus ID: 237091588

On the Opportunities and Risks of Foundation Models

@article{bommasani2021opportunities,
  title={On the Opportunities and Risks of Foundation Models},
  author={Rishi Bommasani and Drew A. Hudson and Ehsan Adeli and Russ Altman and Simran Arora and Sydney von Arx and Michael S. Bernstein and Jeannette Bohg and Antoine Bosselut and Emma Brunskill and Erik Brynjolfsson and S. Buch and Dallas Card and Rodrigo Castellon and Niladri S. Chatterji and Annie S. Chen and Kathleen A. Creel and Jared Davis and Dora Demszky and Chris Donahue and Moussa Doumbouya and Esin Durmus and Stefano Ermon and John Etchemendy and Kawin Ethayarajh and Li Fei-Fei and Chelsea Finn and Trevor Gale and Lauren E. Gillespie and Karan Goel and Noah D. Goodman and Shelby Grossman and Neel Guha and Tatsunori Hashimoto and Peter Henderson and John Hewitt and Daniel E. Ho and Jenny Hong and Kyle Hsu and Jing Huang and Thomas F. Icard and Saahil Jain and Dan Jurafsky and Pratyusha Kalluri and Siddharth Karamcheti and Geoff Keeling and Fereshte Khani and O. Khattab and Pang Wei Koh and Mark S. Krass and Ranjay Krishna and Rohith Kuditipudi and Ananya Kumar and Faisal Ladhak and Mina Lee and Tony Lee and Jure Leskovec and Isabelle Levent and Xiang Lisa Li and Xuechen Li and Tengyu Ma and Ali Malik and Christopher D. Manning and Suvir P. Mirchandani and Eric Mitchell and Zanele Munyikwa and Suraj Nair and Avanika Narayan and Deepak Narayanan and Benjamin Newman and Allen Nie and Juan Carlos Niebles and Hamed Nilforoshan and J. F. Nyarko and Giray Ogut and Laurel J. Orr and Isabel Papadimitriou and Joon Sung Park and Chris Piech and Eva Portelance and Christopher Potts and Aditi Raghunathan and Robert Reich and Hongyu Ren and Frieda Rong and Yusuf H. Roohani and Camilo Ruiz and Jack Ryan and Christopher R{\'e} and Dorsa Sadigh and Shiori Sagawa and Keshav Santhanam and Andy Shih and Krishna Parasuram Srinivasan and Alex Tamkin and Rohan Taori and Armin W. Thomas and Florian Tram{\`e}r and Rose E. Wang and William Wang and Bohan Wu and Jiajun Wu and Yuhuai Wu and Sang Michael Xie and Michihiro Yasunaga and Jiaxuan You and Matei A. Zaharia and Michael Zhang and Tianyi Zhang and Xikun Zhang and Yuhui Zhang and Lucia Zheng and Kaitlyn Zhou and Percy Liang},
  journal={ArXiv},
  volume={abs/2108.07258},
  year={2021}
}

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles… 

Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

A novel taxonomy is proposed to better organize the literature on debiasing methods for fairness-aware neural networks in vision and language research; the survey also discusses current challenges, trends, and important future work directions for interested researchers and practitioners.

Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works

This work conducts an interview study, as well as a systematic literature review of 72 system/application papers, to understand how visual artists would adopt LTGMs (large-scale text-to-image generation models) to support their creative work, and provides four design guidelines that future researchers can refer to when building intelligent user interfaces using LTGMs.

Out-of-Distribution Generalization in Algorithmic Reasoning Through Curriculum Learning

This work trains a transformer-based network to learn a simple solution strategy for the popular puzzle game Sudoku, using a 6x6 Sudoku grid rather than the traditional 9x9, which provides sufficient complexity for investigating algorithmic reasoning while offering greater tractability and lower compute requirements.
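As an illustration of the task, a 6x6 Sudoku (with 2x3 boxes) can be validated in a few lines; the grid and the function name below are a hypothetical sketch for exposition, not material from the paper.

```python
# Validity check for a 6x6 Sudoku: every row, every column, and every
# 2x3 box must contain the digits 1-6 exactly once.
def is_valid_6x6(grid):
    target = set(range(1, 7))
    rows = grid
    cols = [[grid[r][c] for r in range(6)] for c in range(6)]
    boxes = [
        [grid[br + r][bc + c] for r in range(2) for c in range(3)]
        for br in range(0, 6, 2)
        for bc in range(0, 6, 3)
    ]
    return all(set(unit) == target for unit in rows + cols + boxes)

solved = [
    [1, 2, 3, 4, 5, 6],
    [4, 5, 6, 1, 2, 3],
    [2, 3, 1, 5, 6, 4],
    [5, 6, 4, 2, 3, 1],
    [3, 1, 2, 6, 4, 5],
    [6, 4, 5, 3, 1, 2],
]
```

The smaller grid keeps the same row/column/box constraint structure as 9x9 Sudoku while shrinking the search space, which is the tractability argument the summary alludes to.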

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks

Analyzed the convergence rates for local and minibatch Random Reshuffling in Federated Learning, and developed a new technique to speed up convergence under heterogeneity. Developed a model to generate…
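Random Reshuffling replaces i.i.d. sampling in SGD with a fresh permutation of the data each epoch, so every example is visited exactly once per pass. A minimal single-worker sketch on a toy least-squares objective (this simplified setup, including the function name and data, is illustrative and not the paper's federated formulation):

```python
import random

def rr_sgd(xs, ys, epochs=50, lr=0.1, seed=0):
    """SGD with Random Reshuffling on f(w) = mean((w*x - y)^2)."""
    rng = random.Random(seed)
    w = 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)   # fresh permutation each epoch
        for i in idx:      # one full pass, sampling without replacement
            grad = 2 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w

# Data generated from y = 3x, so w should converge to 3.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [3 * x for x in xs]
```

Shuffling once per epoch, rather than drawing with replacement, is what distinguishes Random Reshuffling from plain stochastic sampling and is the regime whose convergence rates the summary refers to.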

What do tokens know about their characters and how do they know it?

The mechanisms through which PLMs acquire English-language character information during training are investigated and it is argued that this knowledge is acquired through multiple phenomena, including a systematic relationship between particular characters and particular parts of speech, as well as natural variability in the tokenization of related strings.
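The "natural variability in the tokenization of related strings" can be illustrated with a toy greedy longest-match tokenizer (a hand-rolled stand-in, not the subword algorithm any particular PLM uses): morphologically related words may receive segmentations that expose different character boundaries, so character knowledge cannot come from segmentation alone.

```python
def greedy_tokenize(word, vocab):
    """Greedy longest-match segmentation against a toy subword vocab."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest substring starting at i that is in the vocab.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

vocab = {"walk", "walking", "ing", "s"}
```

Here "walking" stays a single opaque token while "walks" splits as "walk" + "s", so the characters of "walk" are directly visible in one segmentation but must be inferred in the other.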

PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models

This work introduces PEVL, which enhances the pre-training and prompt tuning of VLP models with explicit object position modeling; it reformulates discretized object positions and language in a unified language modeling framework, which facilitates explicit VL alignment during pre-training and enables prompt tuning for various downstream tasks.
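The position-as-token idea can be sketched by discretizing bounding-box coordinates into a fixed vocabulary so that positions interleave with ordinary text tokens; the bin count, token format, and function name below are illustrative assumptions, not PEVL's actual configuration.

```python
def box_to_tokens(box, image_w, image_h, bins=32):
    """Map a bounding box (x0, y0, x1, y1) in pixels to discrete
    position tokens that can be mixed into a language sequence."""
    scales = [image_w, image_h, image_w, image_h]
    tokens = []
    for coord, scale in zip(box, scales):
        b = min(int(coord / scale * bins), bins - 1)  # clamp to last bin
        tokens.append(f"<pos_{b}>")
    return tokens

# e.g. "a dog <pos_16> <pos_8> <pos_24> <pos_24> chasing a ball"
```

Once positions are tokens, the same language-modeling objective and prompting machinery can attend to "where" as easily as to "what", which is the unification the summary describes.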

Should attention be all we need? The epistemic and ethical implications of unification in machine learning

It is argued that many of the arguments in favor of unification in the natural sciences fail to transfer over to the machine learning case, or transfer over only under assumptions that might not hold.

White-box Testing of NLP models with Mask Neuron Coverage

A set of white-box testing methods customized for transformer-based NLP models is proposed, including Mask Neuron Coverage (MNCover), which measures how thoroughly the attention layers in models are exercised during testing.
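A coverage criterion of this general shape counts how many attention positions have been strongly activated at least once across a test suite. The sketch below is a simplified stand-in for MNCover, with a made-up function name, threshold, and attention weights:

```python
def attention_coverage(attn_maps, threshold=0.5):
    """Fraction of attention positions (i, j) whose weight exceeds
    `threshold` in at least one test input. All maps share one shape."""
    rows, cols = len(attn_maps[0]), len(attn_maps[0][0])
    covered = {
        (i, j)
        for attn in attn_maps
        for i, row in enumerate(attn)
        for j, w in enumerate(row)
        if w >= threshold
    }
    return len(covered) / (rows * cols)
```

As with classic neuron coverage for CNNs, a test suite that leaves many attention positions uncovered is likely exercising only a narrow slice of the model's behavior.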

Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?

This work explores alternate and more general forms of goal specification that are expected to be easier for humans to specify and use, such as images obtained from the internet, hand sketches that provide a visual description of the desired task, or simple language descriptions.

Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference

It is shown that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks such as Mini-ImageNet, CIFAR-FS, CDFSL and Meta-Dataset.
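A "simple pipeline" baseline of the kind this line references can be as bare as nearest-centroid classification on frozen backbone features: average each class's support embeddings and assign a query to the closest mean. The 2-D features and labels below are toy values standing in for real embeddings, and the function name is hypothetical.

```python
def nearest_centroid(support, query):
    """support: {class label: list of feature vectors} for one few-shot
    episode. Returns the label whose class-mean feature is nearest to
    `query` in squared Euclidean distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in support.items()
    }
    return min(centroids, key=lambda label: dist2(centroids[label], query))

support = {
    "cat": [[0.0, 1.0], [0.2, 0.8]],
    "dog": [[1.0, 0.0], [0.8, 0.2]],
}
```

The cited result is that, with a strong pretrained feature extractor (external data) plus fine-tuning, even this level of simplicity is competitive on Mini-ImageNet, CIFAR-FS, CDFSL, and Meta-Dataset.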