• Publications
Deep Reinforcement Learning: A Brief Survey
TLDR
This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor-critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.
Evaluating Large Language Models Trained on Code
TLDR
Repeated sampling from the GPT language model is found to be a surprisingly effective strategy for producing working solutions to difficult prompts, and the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics, are discussed.
A Brief Survey of Deep Reinforcement Learning
TLDR
This survey will cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning.
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
The following organisations are named on the report: Future of Humanity Institute, University of Oxford, Centre for the Study of Existential Risk, University of Cambridge, Center for a New American
Release Strategies and the Social Impacts of Language Models
TLDR
This report discusses OpenAI's work related to the release of its GPT-2 language model, including staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increase.
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
TLDR
This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems.
Limitations and risks of machine ethics
TLDR
This paper critically analyses the prospects for machine ethics, identifying several inherent limitations and suggesting that machine ethics, even if it were to be ‘solved’ at a technical level, would be insufficient to ensure positive social outcomes from intelligent systems.
The Role of Cooperation in Responsible AI Development
In this paper, we argue that competitive pressures could incentivize AI companies to underinvest in ensuring their systems are safe, secure, and have a positive social impact. Ensuring that AI
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
TLDR
The discussion touched on several key areas including: the surprising impact of scale on model capabilities, the difficulty in assessing whether large language models truly understand language, the importance of training models on multiple data modalities, and challenges in aligning model objectives with human values.
Modeling Progress in AI
  • Miles Brundage
  • Business
    AAAI Workshop: AI, Ethics, and Society
  • 18 December 2015
TLDR
This paper suggests ways to account for the relationship between hardware speed increases and algorithmic improvements in AI, the role of human inputs in enabling AI capabilities, and the relationships between different sub-fields of AI.