• Corpus ID: 237091588

On the Opportunities and Risks of Foundation Models

  title={On the Opportunities and Risks of Foundation Models},
  author={Rishi Bommasani and Drew A. Hudson and Ehsan Adeli and Russ Altman and Simran Arora and Sydney von Arx and Michael S. Bernstein and Jeannette Bohg and Antoine Bosselut and Emma Brunskill and Erik Brynjolfsson and S. Buch and D. Card and Rodrigo Castellon and Niladri S. Chatterji and Annie Chen and Kathleen Creel and Jared Davis and Dora Demszky and Chris Donahue and Moussa Doumbouya and Esin Durmus and Stefano Ermon and John Etchemendy and Kawin Ethayarajh and Li Fei-Fei and Chelsea Finn and Trevor Gale and Lauren E. Gillespie and Karan Goel and Noah D. Goodman and Shelby Grossman and Neel Guha and Tatsunori Hashimoto and Peter Henderson and John Hewitt and Daniel E. Ho and Jenny Hong and Kyle Hsu and Jing Huang and Thomas F. Icard and Saahil Jain and Dan Jurafsky and Pratyusha Kalluri and Siddharth Karamcheti and Geoff Keeling and Fereshte Khani and O. Khattab and Pang Wei Koh and Mark S. Krass and Ranjay Krishna and Rohith Kuditipudi and Ananya Kumar and Faisal Ladhak and Mina Lee and Tony Lee and Jure Leskovec and Isabelle Levent and Xiang Lisa Li and Xuechen Li and Tengyu Ma and Ali Malik and Christopher D. Manning and Suvir P. Mirchandani and Eric Mitchell and Zanele Munyikwa and Suraj Nair and Avanika Narayan and Deepak Narayanan and Benjamin Newman and Allen Nie and Juan Carlos Niebles and Hamed Nilforoshan and J. F. Nyarko and Giray Ogut and Laurel Orr and Isabel Papadimitriou and Joon Sung Park and Chris Piech and Eva Portelance and Christopher Potts and Aditi Raghunathan and Robert Reich and Hongyu Ren and Frieda Rong and Yusuf H. Roohani and Camilo Ruiz and Jack Ryan and Christopher R{\'e} and Dorsa Sadigh and Shiori Sagawa and Keshav Santhanam and Andy Shih and Krishna Parasuram Srinivasan and Alex Tamkin and Rohan Taori and Armin W. Thomas and Florian Tram{\`e}r and Rose E. Wang and William Wang and Bohan Wu and Jiajun Wu and Yuhuai Wu and Sang Michael Xie and Michihiro Yasunaga and Jiaxuan You and Matei A. Zaharia and Michael Zhang and Tianyi Zhang and Xikun Zhang and Yuhui Zhang and Lucia Zheng and Kaitlyn Zhou and Percy Liang},
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles… 
Bag-of-Vectors Autoencoders for Unsupervised Conditional Text Generation
This work extends Emb2Emb to Bag-of-Vectors Autoencoders (BoV-AEs), which encode text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models, and proposes regularization techniques that facilitate learning meaningful operations in the latent space.
Controllable Response Generation for Assistive Use-cases
This study shows that keyword control over end-to-end response generation models is powerful and can enable and empower users with degenerative disorders to carry out their day-to-day communication.
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
  • Yichong Xu, Chenguang Zhu, +7 authors Xuedong Huang
  • Computer Science
  • 2021
It is found that the proposed external attention mechanism can significantly improve the performance of existing AI systems, allowing practitioners to easily customize foundation AI models to many diverse downstream applications and increase the democratization of AI systems.
Assemble Foundation Models for Automatic Code Summarization
Automatic code summarization is beneficial to software development and maintenance since it reduces the burden of manual tasks. Currently, artificial intelligence is undergoing a paradigm shift. The…
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
This work proposes to leverage sentiment-carrying discourse markers to generate large-scale weakly-labeled data, which can be used to adapt language models for sentiment analysis, and shows the value of the approach on various benchmark datasets, including the finance domain.
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Can world knowledge learned by large language models (LLMs) be used to act in interactive environments? In this paper, we investigate the possibility of grounding high-level tasks, expressed in…
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
MERLOT Reserve is introduced, a model that represents videos jointly over time – through a new training objective that learns from audio, subtitles, and video frames, which enables out-of-the-box prediction, revealing strong multimodal commonsense understanding.
Machines & Influence: An Information Systems Lens
  • Shashank Yadav
  • 2022
Policymakers face a broader challenge of how to view AI capabilities today and where society stands in terms of those capabilities. This paper surveys AI capabilities and tackles this very issue,…
Neural Circuit Architectural Priors for Embodied Control
Artificial neural networks for simulated motor control and robotics often adopt generic architectures like fully connected MLPs. While general, these tabula rasa architectures rely on large amounts…
Parameter-free Online Test-time Adaptation
Training state-of-the-art vision models has become prohibitively expensive for researchers and practitioners. For the sake of accessibility and resource reuse, it is important to focus on adapting…