Corpus ID: 21193242

On Ensuring that Intelligent Machines Are Well-Behaved

@article{Thomas2017OnET,
  title={On Ensuring that Intelligent Machines Are Well-Behaved},
  author={P. S. Thomas and B. Silva and A. Barto and Emma Brunskill},
  journal={ArXiv},
  year={2017},
  volume={abs/1708.05448}
}
  • P. S. Thomas, B. Silva, A. Barto, Emma Brunskill
  • Published 2017
  • Computer Science
  • ArXiv
  • Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve super-human performance on various tasks. [...] To show the viability of this new framework, we use it to create new machine learning algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in our experiments. Our framework for designing machine learning algorithms simplifies the safe…
    12 Citations
    Risk-Aware Active Inverse Reinforcement Learning
    • 23
    Automatic programming: The open issue?
    • 5
    Themis: automatically testing software for discrimination
    • 24
    Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning
    • 15
    Balancing Constraints and Rewards with Meta-Gradient D4PG
    • 1
    Safe Policy Improvement with Baseline Bootstrapping
    • 37
    Software fairness
    • 9
