• Corpus ID: 3562988

Learning to select computations

@article{Lieder2018LearningTS,
  title={Learning to select computations},
  author={Falk Lieder and Frederick Callaway and Sayan Gul and Paul M. Krueger and Thomas L. Griffiths},
  journal={ArXiv},
  year={2018},
  volume={abs/1711.06892}
}
Efficient use of limited computational resources is essential to intelligence. Selecting computations optimally according to rational metareasoning would achieve this, but rational metareasoning is computationally intractable. Inspired by psychology and neuroscience, we propose the first learning algorithm for approximating the optimal selection of computations. We derive a general, sample-efficient reinforcement learning algorithm for learning to select computations from the insight that the… 

Discovering Rational Heuristics for Risky Choice
For computationally limited agents such as humans, perfectly rational decision-making is almost always out of reach. Instead, people may rely on computationally frugal heuristics that usually yield
Have I done enough planning or should I plan more?
TLDR
The results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism that is guided by metacognitive pseudo-rewards that communicate the value of planning.
Doing more with less: meta-reasoning and meta-learning in humans and machines
Improving Human Decision-Making by Discovering Efficient Strategies for Hierarchical Planning
TLDR
This work introduces a cognitively inspired reinforcement learning method that can discover optimal human planning strategies for larger and more complex tasks than was previously possible and demonstrates that teaching people to use those strategies significantly increases their level of resource-rationality in tasks that require planning up to eight steps ahead.
Automatic Discovery of Interpretable Planning Strategies
TLDR
The results of three large behavioral experiments showed that providing the decision rules generated by AI-Interpret as flowcharts significantly improved people’s planning strategies and decisions across three different classes of sequential decision problems.
Imagining the good: An offline tendency to simulate good options even when no decision has to be made
TLDR
Results suggest that people focus their offline cognition on the apparently good, with faster online response times for the options that appeared to have higher values, indicating a pre-computation benefit for these items.
Leveraging Machine Learning to Automatically Derive Robust Planning Strategies from Biased Models of the Environment
TLDR
This work translates strategy discovery methods into an intelligent tutor that automatically discovers and teaches robust planning strategies and significantly improved human decision-making when the model was so biased that conventional cognitive tutors were no longer effective.
Measuring and modelling how people learn how to plan and how people adapt their planning strategies to the structure of the environment
Often we find ourselves in unknown situations where we have to make a decision based on reasoning upon experiences. However, it is still unclear how people choose which pieces of information to take
Fixation patterns in simple choice reflect optimal information sampling
TLDR
The results show that the fixation process during simple choice is influenced dynamically by the value estimates computed during the decision process, in a manner consistent with optimal information sampling.
Deep Active Learning with Adaptive Acquisition
TLDR
This work presents a method to break this vicious circle by defining the acquisition function as a learning predictor and training it by reinforcement feedback collected from each labeling round, and observes that this method always manages to either invent a new superior acquisition function or to adapt itself to the a priori unknown best performing heuristic for each specific data set.

References

SHOWING 1-10 OF 46 REFERENCES
When Does Bounded-Optimal Metareasoning Favor Few Cognitive Systems?
TLDR
It is found that the optimal number of systems depends on the variability of the environment and the costliness of metareasoning, and that when having two systems is optimal, then the first system is fast but error-prone and the second system is slow but accurate.
Selecting Computations: Theory and Applications
TLDR
This paper develops a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing that this is more appropriate than the bandit framework, and derives heuristic approximations in both Bayesian and distribution-free settings and demonstrates their superiority to bandit-based heuristics in one-shot decision problems and in Go.
Strategy Selection as Rational Metareasoning
TLDR
A rational model of strategy selection is developed, based on the theory of rational metareasoning developed in the artificial intelligence literature, that suggests that people gradually learn to make increasingly more rational use of fallible heuristics.
Bayesian Q-Learning
TLDR
This paper extends Watkins' Q-learning by maintaining and propagating probability distributions over the Q-values and establishes the convergence properties of the algorithm, which can exhibit substantial improvements over other well-known model-free exploration strategies.
Algorithm selection by rational metareasoning as a model of human strategy selection
TLDR
Rational metareasoning appears to be a promising framework for reverse-engineering how people choose among cognitive strategies and translating the results into better solutions to the algorithm selection problem.
Habitual control of goal selection in humans
TLDR
This work considers a hierarchical architecture that exploits the computational efficiency of habitual control to select goal states that are subsequently used in planning while preserving the flexibility of planning to achieve those goals.
Computational rationality: A converging paradigm for intelligence in brains, minds, and machines
TLDR
This work charts advances over the past several decades that address challenges of perception and action under uncertainty through the lens of computation to identify decisions with highest expected utility, while taking into consideration the costs of computation in complex real-world problems in which most relevant calculations can only be approximated.
Metareasoning for Planning Under Uncertainty
TLDR
This work formalizes and analyzes the metareasoning problem for Markov Decision Processes (MDPs) and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.
An automatic method for discovering rational heuristics for risky choice
TLDR
The bounded optimal decision process is formalized as the solution to a meta-level Markov decision process whose actions are costly computations and rediscovered well-known heuristic strategies and the conditions under which they are used, as well as novel heuristics.
Principles of Metareasoning