# Learning to select computations

@article{Lieder2018LearningTS,
  title={Learning to select computations},
  author={Falk Lieder and Frederick Callaway and Sayan Gul and Paul M. Krueger and Thomas L. Griffiths},
  journal={ArXiv},
  year={2018},
  volume={abs/1711.06892}
}

Efficient use of limited computational resources is essential to intelligence. Selecting computations optimally according to rational metareasoning would achieve this, but rational metareasoning is computationally intractable. Inspired by psychology and neuroscience, we propose the first learning algorithm for approximating the optimal selection of computations. We derive a general, sample-efficient reinforcement learning algorithm for learning to select computations from the insight that the…

## 18 Citations

Discovering Rational Heuristics for Risky Choice

- Psychology
- 2022

For computationally limited agents such as humans, perfectly rational decision-making is almost always out of reach. Instead, people may rely on computationally frugal heuristics that usually yield…

Have I done enough planning or should I plan more?

- Computer Science
- 2022

The results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism that is guided by metacognitive pseudo-rewards that communicate the value of planning.

Doing more with less: meta-reasoning and meta-learning in humans and machines

- Computer Science
- Current Opinion in Behavioral Sciences
- 2019

Improving Human Decision-Making by Discovering Efficient Strategies for Hierarchical Planning

- Computer Science
- Computational Brain & Behavior
- 2022

This work introduces a cognitively inspired reinforcement learning method that can discover optimal human planning strategies for larger and more complex tasks than was previously possible and demonstrates that teaching people to use those strategies significantly increases their level of resource-rationality in tasks that require planning up to eight steps ahead.

Automatic Discovery of Interpretable Planning Strategies

- Computer Science
- Mach. Learn.
- 2021

The results of three large behavioral experiments showed that providing the decision rules generated by AI-Interpret as flowcharts significantly improved people’s planning strategies and decisions across three different classes of sequential decision problems.

Imagining the good: An offline tendency to simulate good options even when no decision has to be made

- Psychology
- CogSci
- 2019

Results suggest that people focus their offline cognition on the apparently good, with faster online response times for the options that appeared to have higher values, indicating a pre-computation benefit for these items.

Leveraging Machine Learning to Automatically Derive Robust Planning Strategies from Biased Models of the Environment

- Computer Science
- CogSci
- 2020

This work translates strategy discovery methods into an intelligent tutor that automatically discovers and teaches robust planning strategies and significantly improved human decision-making when the model was so biased that conventional cognitive tutors were no longer effective.

Measuring and modelling how people learn how to plan and how people adapt their planning strategies to the structure of the environment

- Psychology
- 2021

Often we find ourselves in unknown situations where we have to make a decision based on reasoning upon experiences. However, it is still unclear how people choose which pieces of information to take…

Fixation patterns in simple choice reflect optimal information sampling

- Economics
- PLoS Comput. Biol.
- 2021

The results show that the fixation process during simple choice is influenced dynamically by the value estimates computed during the decision process, in a manner consistent with optimal information sampling.

Deep Active Learning with Adaptive Acquisition

- Computer Science
- IJCAI
- 2019

This work presents a method to break this vicious circle by defining the acquisition function as a learning predictor and training it by reinforcement feedback collected from each labeling round, and observes that this method always manages to either invent a new superior acquisition function or to adapt itself to the a priori unknown best performing heuristic for each specific data set.

## References

SHOWING 1-10 OF 46 REFERENCES

When Does Bounded-Optimal Metareasoning Favor Few Cognitive Systems?

- Computer Science
- AAAI
- 2017

It is found that the optimal number of systems depends on the variability of the environment and the costliness of metareasoning, and that when having two systems is optimal, then the first system is fast but error-prone and the second system is slow but accurate.

Selecting Computations: Theory and Applications

- Computer Science
- UAI
- 2012

This paper develops a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing that this is more appropriate than the bandit framework, and derives heuristic approximations in both Bayesian and distribution-free settings and demonstrates their superiority to bandit-based heuristics in one-shot decision problems and in Go.

Strategy Selection as Rational Metareasoning

- Psychology, Business
- Psychological Review
- 2017

A rational model of strategy selection is developed, based on the theory of rational metareasoning developed in the artificial intelligence literature, that suggests that people gradually learn to make increasingly more rational use of fallible heuristics.

Bayesian Q-Learning

- Computer Science
- AAAI/IAAI
- 1998

This paper extends Watkins' Q-learning by maintaining and propagating probability distributions over the Q-values and establishes the convergence properties of the algorithm, which can exhibit substantial improvements over other well-known model-free exploration strategies.

Algorithm selection by rational metareasoning as a model of human strategy selection

- Computer Science
- NIPS
- 2014

Rational metareasoning appears to be a promising framework for reverse-engineering how people choose among cognitive strategies and translating the results into better solutions to the algorithm selection problem.

Habitual control of goal selection in humans

- Computer Science
- Proceedings of the National Academy of Sciences
- 2015

This work considers a hierarchical architecture that exploits the computational efficiency of habitual control to select goal states that are subsequently used in planning while preserving the flexibility of planning to achieve those goals.

Computational rationality: A converging paradigm for intelligence in brains, minds, and machines

- Computer Science
- Science
- 2015

This work charts advances over the past several decades that address challenges of perception and action under uncertainty through the lens of computation to identify decisions with highest expected utility, while taking into consideration the costs of computation in complex real-world problems in which most relevant calculations can only be approximated.

Metareasoning for Planning Under Uncertainty

- Computer Science
- IJCAI
- 2015

This work formalizes and analyzes the metareasoning problem for Markov Decision Processes (MDPs) and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.

An automatic method for discovering rational heuristics for risky choice

- Computer Science, Psychology
- CogSci
- 2017

The bounded optimal decision process is formalized as the solution to a meta-level Markov decision process whose actions are costly computations. The method rediscovered well-known heuristic strategies and the conditions under which they are used, as well as novel heuristics.