An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems

Heinke Hihn, Sebastian Gottwald, Daniel A. Braun
2019 IEEE 58th Conference on Decision and Control (CDC)
Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with… 
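The trade-off described in the abstract can be illustrated for a single decision-maker. The sketch below assumes the form commonly used in the information-theoretic bounded rationality literature: maximize E[U] - (1/beta) * KL(p || p0), whose maximizer is p*(a) proportional to p0(a) * exp(beta * U(a)). The utilities, the prior, and the function names are hypothetical, and the truncated abstract does not confirm that this is the exact objective used in the paper.

```python
import math

def bounded_rational_policy(utilities, prior, beta):
    """Return p*(a) proportional to prior(a) * exp(beta * U(a))."""
    # Subtract the max utility before exponentiating for numerical stability.
    u_max = max(utilities)
    weights = [p * math.exp(beta * (u - u_max)) for u, p in zip(utilities, prior)]
    z = sum(weights)
    return [w / z for w in weights]

def kl_divergence(p, q):
    """Relative entropy KL(p || q) in nats; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

utilities = [1.0, 0.5, 0.0]      # hypothetical utilities U(a)
prior = [1 / 3, 1 / 3, 1 / 3]    # hypothetical uniform prior p0(a)

# beta is the assumed resource parameter: larger beta allows more deviation from the prior.
for beta in (0.0, 1.0, 10.0):
    policy = bounded_rational_policy(utilities, prior, beta)
    expected_u = sum(p * u for p, u in zip(policy, utilities))
    cost = kl_divergence(policy, prior)
    print(f"beta={beta:5.1f}  E[U]={expected_u:.3f}  KL={cost:.3f}")
```

With beta = 0 the policy equals the prior and the information cost is zero. As beta grows, the policy concentrates on the highest-utility action and the relative entropy to the prior increases.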

Specialization in Hierarchical Learning Systems

This work devises an information-theoretically motivated on-line learning rule that partitions the problem space into multiple sub-problems, each of which can be solved by an individual expert.

Hierarchical Expert Networks for Meta-Learning

This work proposes a principled information-theoretic model that optimally partitions the underlying problem space so that specialized expert decision-makers solve the resulting sub-problems, and argues that this specialization leads to efficient adaptation to new tasks.

Rationality in current era - A recent survey

  • D. Das
  • Computer Science
  • 2022
This survey reviews recent research on divergent views of rationality and argues that the bounds of bounded rationality will be extended by advances in AI and various other technologies.

Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning

A novel mutual-information regularized actor-critic learning (MIRACLE) algorithm for continuous action spaces that optimizes over the reference marginal policy and can compete with contemporary RL methods.

Variational Inference for Model-Free and Model-Based Reinforcement Learning

This manuscript shows how the apparently different subjects of VI and RL are linked in two fundamental ways: first, the optimization objective of RL to maximize future cumulative rewards can be recovered via a VI objective under a soft policy constraint in both the non-sequential and the sequential setting.

Mixture-of-Variational-Experts for Continual Learning

This work proposes an optimality principle that facilitates a trade-off between learning and forgetting, and introduces a neural network layer for continual learning, called Mixture-of-Variational-Experts (MoVE), that alleviates forgetting while enabling the beneficial transfer of knowledge to new tasks.

Multi-Modal Pain Intensity Assessment Based on Physiological Signals: A Deep Learning Perspective

This work introduces several novel multi-modal deep learning approaches (characterized by specific supervised, as well as self-supervised learning techniques) for the assessment of pain intensity based on measurable bio-physiological data.

Hierarchically Structured Task-Agnostic Continual Learning

This work takes a task-agnostic view of continual learning, develops a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting, and proposes a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting.

A Tutorial on Sparse Gaussian Processes and Variational Inference

This tutorial introduces the basics to readers without prior knowledge of either GPs or VI, presenting a framework in which pseudo-training examples are treated as optimization arguments of the approximate posterior and are identified jointly with the hyperparameters of the generative model.



Bounded Rationality, Abstraction, and Hierarchical Decision-Making: An Information-Theoretic Optimality Principle

This work applies the basic principle of bounded rational decision-making to perception-action systems with multiple information-processing nodes, derives bounded optimal solutions, and formalizes a mathematically unifying optimization principle that could potentially be extended to more complex systems.

Bounded Rational Decision-Making with Adaptive Neural Network Priors

This work investigates generative neural networks as priors that are optimized concurrently with anytime sample-based decision-making processes such as MCMC, and evaluates this approach on toy examples.

Thermodynamics as a theory of decision-making with information-processing costs

  • Pedro A. Ortega, D. Braun
  • Economics
    Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
  • 2013
Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here, we propose a thermodynamically inspired

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

This work provides a generalized value iteration scheme together with a convergence proof that can derive a unified solution from a single generalized variational principle for Markov Decision Problems.

Bounded Rational Decision-Making from Elementary Computations That Reduce Uncertainty

This work introduces the notion of elementary computation based on a fundamental principle for probability transfers that reduce uncertainty, and proves several new results on majorization theory, as well as on entropy and divergence measures.

Systems of Bounded Rational Agents with Information-Theoretic Constraints

The results suggest that hierarchical architectures of specialized units at lower levels that are coordinated by units at higher levels are optimal, given that each unit's information-processing capability is limited and conforms to constraints on complexity costs.

Non-Equilibrium Relations for Bounded Rational Decision-Making in Changing Environments

An abstract model of organisms as decision-makers with limited information-processing resources that trade off between maximization of utility and computational costs measured by a relative entropy is considered, in a similar fashion to thermodynamic systems undergoing isothermal transformations.

Actor-Critic Algorithms

This thesis proposes and studies actor-critic algorithms which combine the above two approaches with simulation to find the best policy among a parameterized class of policies, and proves convergence of the algorithms for problems with general state and decision spaces.

Hierarchical Relative Entropy Policy Search

This work defines the problem of learning sub-policies in continuous state-action spaces as finding a hierarchical policy composed of a high-level gating policy that selects the low-level sub-policies for execution by the agent. The sub-policies are treated as latent variables, which allows the update information to be distributed among them.


This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
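The two conditions named in this summary, a discrete (tabular) representation of the action-values and repeated sampling of every action in every state, can be seen in a minimal sketch of tabular Q-learning. The three-state chain environment, the uniformly random behavior policy, the discount factor, and the constant step size are all assumptions made for illustration. A constant step size is a simplification that is adequate here only because the toy environment is deterministic.

```python
import random

N_STATES, TERMINAL = 3, 2    # hypothetical chain: states 0, 1 and terminal state 2
ACTIONS = (0, 1)             # 0 = move left, 1 = move right
GAMMA, ALPHA = 0.9, 0.5      # assumed discount factor and constant step size

def step(state, action):
    """Deterministic transition; reward 1.0 only on entering the terminal state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward

def q_learning(episodes=2000, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    # One table entry per (state, action) pair: the discrete representation.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            # Random action choice keeps every action sampled in every state.
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            # No bootstrapping from the terminal state.
            best_next = 0.0 if next_state == TERMINAL else max(q[next_state])
            q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
            state = next_state
    return q

if __name__ == "__main__":
    for s, row in enumerate(q_learning()[:TERMINAL]):
        print(f"state {s}: Q(left)={row[0]:.3f}  Q(right)={row[1]:.3f}")
```

For this toy chain the optimal action-values follow directly from the Bellman optimality equation: moving right from state 1 is worth 1, moving right from state 0 is worth 0.9, and both left moves are worth 0.81. The learned table should approach these values.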