• Corpus ID: 21193242

On Ensuring that Intelligent Machines Are Well-Behaved

  • Philip S. Thomas, Bruno C. da Silva, Andrew G. Barto, Emma Brunskill
Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve super-human performance on various tasks. To show the viability of this new framework, we use it to create new machine learning algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in our experiments. Our framework for designing machine learning algorithms simplifies the safe…
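The framework described above hinges on a high-confidence safety test. A minimal sketch of that idea (the function names, the [0, 1] sample range, and the threshold are illustrative assumptions, not the authors' implementation): accept a candidate solution only if a one-sided Hoeffding bound on its expected constraint violation clears a user-chosen threshold.

```python
import math

def hoeffding_upper_bound(samples, delta):
    """One-sided (1 - delta)-confidence upper bound on the mean of
    i.i.d. samples bounded in [0, 1], via Hoeffding's inequality."""
    n = len(samples)
    mean = sum(samples) / n
    return mean + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def safety_test(g_samples, threshold, delta=0.05):
    """Accept a candidate solution only if, with confidence 1 - delta,
    its expected constraint violation is at most `threshold`."""
    return hoeffding_upper_bound(g_samples, delta) <= threshold
```

With 100 samples all equal to 0.1 and delta = 0.05, the bound is roughly 0.22, so the test passes at threshold 0.3 but fails at 0.2 — the test refuses to certify what the data cannot support.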

Risks of using naïve approaches to Artificial Intelligence: A case study

A case study applying machine learning to secondary school student grades demonstrates that naive approaches can have unanticipated consequences, generating predictions based on discriminatory factors such as gender or race.

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

This work identifies and formalizes a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems, and proposes an open-source benchmark.

Automatic programming: The open issue?

The genetic programming community is challenged to refocus research towards the objective of automatic programming, and to do so in a manner that embraces a wider perspective encompassing the related fields of, for example, artificial intelligence, machine learning, analytics, optimisation and software engineering.

Commander's Intent: A Dataset and Modeling Approach for Human-AI Task Specification in Strategic Play

A machine learning framework that identifies goals and constraints from unstructured strategy descriptions, together with a novel dataset and an associated data collection protocol that maps language descriptions to the goals and constraints corresponding to specific strategies developed by human participants for the board game Risk.

Risk-Aware Active Inverse Reinforcement Learning

It is shown that risk-aware active learning outperforms standard active IRL approaches on gridworld, simulated driving, and table setting tasks, while also providing a performance-based stopping criterion that allows a robot to know when it has received enough demonstrations to safely perform a task.

Themis: automatically testing software for discrimination

Themis, an automated test suite generator that measures two types of discrimination, including causal relationships between sensitive inputs and program behavior, is presented and evaluated for effectiveness on open-source software.

Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

A sampling method based on Bayesian inverse reinforcement learning is proposed that uses demonstrations to determine practical high-confidence upper bounds on the alpha-worst-case difference in expected return between any evaluation policy and the optimal policy under the expert's unknown reward function.
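Once posterior samples of the performance gap are available, the high-confidence bound reduces to a quantile computation. A toy sketch of that final step (the function name and quantile convention are illustrative assumptions, not the paper's exact procedure):

```python
import math

def alpha_worst_case_bound(gap_samples, alpha=0.95):
    """Empirical alpha-quantile of sampled performance gaps (e.g. drawn
    from a Bayesian IRL posterior over reward functions): with posterior
    probability roughly alpha, the true gap does not exceed this value."""
    ordered = sorted(gap_samples)
    idx = min(len(ordered) - 1, int(math.ceil(alpha * len(ordered))) - 1)
    return ordered[idx]
```

For 100 gap samples taking the values 1 through 100, the 0.95-quantile is 95, so a robot could stop requesting demonstrations once this bound falls below its tolerance.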

Balancing Constraints and Rewards with Meta-Gradient D4PG

This work presents two soft-constrained RL approaches that use meta-gradients to find a good trade-off between maximizing expected return and minimizing constraint violations, and demonstrates their effectiveness by showing that they consistently outperform the baselines across four different MuJoCo domains.

Safe Policy Improvement with Baseline Bootstrapping

This paper adopts the safe policy improvement (SPI) approach, inspired by the knows-what-it-knows paradigm, and develops two computationally efficient bootstrapping algorithms, one value-based and one policy-based, both accompanied by theoretical SPI bounds.
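The bootstrapping idea can be sketched in a few lines (this is an illustrative simplification under assumed names, not the paper's exact algorithms): deviate from the baseline policy only where the batch of data provides enough evidence, and copy the baseline everywhere else.

```python
def spi_with_bootstrapping(state, visit_counts, trained_policy,
                           baseline_policy, n_min=20):
    """Follow the newly trained policy only in states the batch has
    visited at least n_min times; elsewhere bootstrap on (copy) the
    baseline policy, so improvement is only attempted where the data
    supports it."""
    if visit_counts.get(state, 0) >= n_min:
        return trained_policy(state)
    return baseline_policy(state)
```

Raising `n_min` makes the resulting policy more conservative: in the limit it reproduces the baseline exactly, which is where the safety guarantee comes from.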

Efficient Computation of Collision Probabilities for Safe Motion Planning

A key contribution of this paper is the use of a 'convolution trick' to factor the calculation of integrals providing bounds on collision risk, enabling an $O(1)$ computation even in cluttered and complex environments.
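The integrals in question measure how much probability mass of an uncertain robot state overlaps an obstacle. A toy 1-D analogue (the paper works with higher-dimensional poses and precomputes convolutions for O(1) queries; this sketch only illustrates the kind of integral being bounded):

```python
import math

def collision_probability(mu, sigma, lo, hi):
    """Probability that a 1-D Gaussian-distributed robot position
    N(mu, sigma^2) falls inside the obstacle interval [lo, hi],
    computed in closed form via the error function."""
    z = lambda x: (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (math.erf(z(hi)) - math.erf(z(lo)))
```

For a standard normal position and the obstacle interval [-1, 1], this returns about 0.683, the familiar one-sigma mass.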

Concrete Problems in AI Safety

A list of five practical research problems related to accident risk is presented, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process.

Auditing Black-box Models by Obscuring Features

A class of techniques originally developed for the detection and repair of disparate impact in classification models can be used to study the sensitivity of any model with respect to any feature subset, without requiring the black-box model to be retrained.
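The core move is to obscure a feature and observe how predictions shift, which needs only query access to the model. A crude stand-in for the obscuring procedure (replacing the feature with its marginal mean is an assumption made for brevity; the paper's obscuring is more refined):

```python
def audit_feature(model, rows, feature_idx):
    """Obscure one feature by replacing it with that feature's overall
    mean, then report the fraction of predictions that change.
    `model` is any black-box callable row -> label; no retraining."""
    col_mean = sum(r[feature_idx] for r in rows) / len(rows)
    changed = 0
    for r in rows:
        obscured = list(r)
        obscured[feature_idx] = col_mean
        if model(r) != model(obscured):
            changed += 1
    return changed / len(rows)
```

A feature the model ignores scores 0.0; a score near 1.0 flags a feature the model leans on heavily, which is the sensitivity signal an auditor wants.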

Robust Classification for Imprecise Environments

It is shown that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions, and in some cases, the performance of the hybrid actually can surpass that of the best known classifier.

Three naive Bayes approaches for discrimination-free classification

Three approaches for making the naive Bayes classifier discrimination-free are presented: modifying the probability of the decision being positive, training one model for every sensitive attribute value and balancing them, and adding a latent variable to the Bayesian model that represents the unbiased label and optimizing the model parameters for likelihood using expectation maximization.
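The first of the three approaches — modifying the probability of the positive decision — can be sketched as per-group threshold selection (the function name, the grid of candidate thresholds, and the target-rate formulation are illustrative assumptions, not the paper's exact method):

```python
def threshold_for_parity(scores_by_group, target_rate):
    """For each sensitive-attribute group, pick the score threshold
    whose positive-decision rate is closest to a common target rate,
    so all groups receive positive decisions at similar rates."""
    thresholds = {}
    for group, scores in scores_by_group.items():
        best_t, best_gap = 0.0, float("inf")
        for t in [i / 100 for i in range(101)]:  # candidate thresholds
            rate = sum(s >= t for s in scores) / len(scores)
            gap = abs(rate - target_rate)
            if gap < best_gap:
                best_t, best_gap = t, gap
        thresholds[group] = best_t
    return thresholds
```

Each group keeps its own classifier scores; only the decision boundary moves, which is what makes this the least invasive of the three options listed above.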

Classifying without discriminating

  • F. Kamiran, T. Calders
  • Computer Science
    2009 2nd International Conference on Computer, Control and Communication
  • 2009
This paper proposes a new classification scheme for learning unbiased models on biased training data based on massaging the dataset by making the least intrusive modifications which lead to an unbiased dataset and learns a non-discriminating classifier.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Doubly Robust Off-policy Evaluation for Reinforcement Learning

This work extends the so-called doubly robust estimator for bandits to sequential decision-making problems, which gets the best of both worlds: it is guaranteed to be unbiased and has low variance, and as a point estimator it outperforms the most popular importance-sampling estimator and its variants in most cases.
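The one-step (contextual-bandit) estimator being extended combines a reward model with an importance-weighted correction. A minimal sketch, with assumed argument names (`pi` is the target policy's action probability, `q_hat` a learned reward model, `mu_a` the logging policy's probability of the logged action):

```python
def dr_estimate(logs, pi, q_hat, actions):
    """Doubly robust off-policy value estimate for a one-step problem:
    a model-based baseline plus an importance-weighted correction of
    each logged reward. Unbiased if either the importance weights or
    the reward model q_hat is accurate."""
    total = 0.0
    for x, a, r, mu_a in logs:  # context, logged action, reward, mu_a
        baseline = sum(pi(x, b) * q_hat(x, b) for b in actions)
        correction = pi(x, a) / mu_a * (r - q_hat(x, a))
        total += baseline + correction
    return total / len(logs)
```

When `q_hat` is exact the correction term has zero mean and only shrinks variance; when `q_hat` is wrong the importance weights still debias the estimate — hence "doubly robust".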

Reinforcement Learning: A Survey

Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.

Temporal difference learning and TD-Gammon

The domain of complex board games such as Go, chess, checkers, Othello, and backgammon has been widely regarded as an ideal testing ground for exploring a variety of concepts and approaches in artificial intelligence and machine learning.

Learning motor primitives for robotics

  • J. Kober, Jan Peters
  • Computer Science
    2009 IEEE International Conference on Robotics and Automation
  • 2009
It is shown that two new motor skills, i.e., Ball-in-a-Cup and Ball-Paddling, can be learned on a real Barrett WAM robot arm at a pace similar to human learning while achieving a significantly more reliable final performance.