Corpus ID: 221818767

Hidden Incentives for Auto-Induced Distributional Shift

David Krueger, Tegan Maharaj, Jan Leike
Decisions made by machine learning systems have increasing influence on the world, yet it is common for machine learning algorithms to assume that they exert no such influence. An example is the use of the i.i.d. assumption in content recommendation. In fact, the (choice of) content displayed can change users' perceptions and preferences, or even drive them away, causing a shift in the distribution of users. We introduce the term auto-induced distributional shift (ADS) to describe the phenomenon of…
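A toy sketch of ADS, not from the paper (the population sizes, leave probability, and greedy policy are illustrative assumptions): a system that assumes its users are i.i.d. and greedily serves the majority preference can drive away the minority, so the input distribution it faces is one it created itself.

```python
import random

random.seed(0)

# Toy population: each user prefers topic "a" or "b".
users = ["a"] * 510 + ["b"] * 490

def step(users, shown_topic):
    """Users who dislike the shown topic leave with probability 0.2,
    shifting the distribution of remaining users (auto-induced shift)."""
    return [u for u in users if u == shown_topic or random.random() > 0.2]

# A learner that assumes i.i.d. users keeps showing the majority topic...
for _ in range(20):
    majority = max(set(users), key=users.count)
    users = step(users, majority)

# ...and thereby drives the minority away: the distribution of users it
# now sees is no longer the one it was evaluated on initially.
frac_a = users.count("a") / len(users)
print(round(frac_a, 2))
```

After a few iterations nearly all remaining users prefer "a", even though the initial population was almost balanced; the shift was induced by the algorithm's own choices.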
Estimating and Penalizing Preference Shift in Recommender Systems
This work advocates for estimating the preference shifts that would be induced by recommender system policies, explicitly characterizing what unwanted shifts are, and assessing before deployment whether such policies will produce them; it shows that recommender systems optimized to stay in the trust region avoid manipulative behaviors while still generating engagement.
Existence conditions for hidden feedback loops in online recommender systems
It is shown that unbiased additive random noise in user interests does not prevent a feedback loop, and that a non-zero probability of resetting user interests is sufficient to limit the feedback loop.
Designing Recommender Systems to Depolarize
This paper examines algorithmic depolarization interventions with the goal of conflict transformation: not suppressing or eliminating conflict but moving towards more constructive conflict.
Recursively Summarizing Books with Human Feedback
This method combines learning from human feedback with recursive task decomposition: it uses models trained on smaller parts of the task to assist humans in giving feedback on the broader task, and generates sensible summaries of entire books.
Predicting Infectiousness for Proactive Contact Tracing
Methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness based on their contact history and other information are developed, suggesting proactive contact tracing (PCT) could help in safe re-opening and second-wave prevention.
Objective Robustness in Deep Reinforcement Learning
This work provides the first explicit empirical demonstrations of objective robustness failures and argues that this type of failure is critical to address in reinforcement learning.
Unsolved Problems in ML Safety
A new roadmap for ML Safety is provided, presenting four problems ready for research: withstanding hazards (robustness), identifying hazards (monitoring), steering ML systems (alignment), and reducing deployment hazards (systemic safety).


Dataset Shift in Machine Learning
This volume offers an overview of current efforts to deal with dataset and covariate shift, and places dataset shift in relationship to transfer learning, transduction, local learning, active learning, and semi-supervised learning.
Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings
Modeling the agent-environment interaction in graphical models called influence diagrams makes it possible to answer two fundamental questions about an agent's incentives directly from the graph, helping to identify algorithms with problematic incentives and to design algorithms with better ones.
A Research Agenda: Dynamic Models to Defend Against Correlated Attacks
It is argued that machine learning security researchers should also address the problem of relaxing the independence assumption, and that current strategies designed for robustness to distribution shift will not do so.
Reliable Decision Support using Counterfactual Models
This work proposes using a different learning objective that predicts counterfactuals instead of predicting outcomes under an existing action policy as in supervised learning, and introduces the Counterfactual Gaussian Process (CGP) to support decision-making in temporal settings.
Bandit Learning with Positive Externalities
A new algorithm, Balanced Exploration (BE), is developed, which explores arms carefully to avoid suboptimal convergence of arrivals before sufficient evidence is gathered; its optimality is established by showing that no algorithm can perform better.
Practical Bayesian Optimization of Machine Learning Algorithms
This work describes new algorithms that account for the variable cost of learning-algorithm experiments and that can leverage multiple cores for parallel experimentation, showing that the proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
Online Stochastic Optimization under Correlated Bandit Feedback
The high-confidence tree (HCT) algorithm is introduced, a novel anytime X-armed bandit algorithm, and regret bounds are derived matching the performance of state-of-the-art algorithms in terms of the dependency on the number of steps and the near-optimality dimension.
Bandit problems with side observations
An extension of the traditional two-armed bandit problem is considered in which the decision maker has access to some side information before deciding which arm to pull, and the value of this additional information is quantified.
Meta-learners' learning dynamics are unlike learners'
It is shown that, once meta-trained, LSTM meta-learners are not just faster learners than their sample-inefficient deep learning and reinforcement learning counterparts, but actually pursue fundamentally different learning trajectories.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems.
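A minimal sketch of the MAML update on scalar tasks, not from the paper (the learning rates, task set, and quadratic losses are illustrative assumptions): each task takes one inner gradient step from the shared meta-parameter, and the outer update differentiates through that inner step.

```python
# Minimal MAML-style sketch on scalar tasks L_t(theta) = (theta - t)^2.
alpha, beta = 0.1, 0.05        # inner / outer learning rates (illustrative)
theta = 0.0                    # shared meta-parameter
tasks = [-1.0, 0.0, 1.0, 2.0]  # each task's optimum (illustrative)

def grad(theta, t):
    """Gradient of the per-task loss (theta - t)^2."""
    return 2.0 * (theta - t)

for _ in range(100):
    meta_grad = 0.0
    for t in tasks:
        adapted = theta - alpha * grad(theta, t)       # inner adaptation step
        # Chain rule through the inner step: d(adapted)/d(theta) = 1 - 2*alpha
        meta_grad += grad(adapted, t) * (1.0 - 2.0 * alpha)
    theta -= beta * meta_grad / len(tasks)             # outer (meta) update

print(round(theta, 3))
```

In this linear-quadratic toy the meta-parameter converges toward the mean of the task optima, a point from which one gradient step adapts well to any task; the sketch only illustrates the two-level update structure, not the paper's full treatment.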