# Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction

@article{Shalizi2002InformationBC, title={Information Bottlenecks, Causal States, and Statistical Relevance Bases: How to Represent Relevant Information in Memoryless Transduction}, author={Cosma Rohilla Shalizi and James P. Crutchfield}, journal={Adv. Complex Syst.}, year={2002}, volume={5}, pages={91-96} }

Discovering relevant, but possibly hidden, variables is a key step in constructing useful and predictive theories about the natural world. This brief note explains the connections between three approaches to this problem: the recently introduced information-bottleneck method, the computational mechanics approach to inferring optimal models, and Salmon's statistical relevance basis.
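As a rough illustration of what "representing relevant information" can mean in the memoryless setting, the sketch below groups inputs that induce the same conditional distribution over outputs. It assumes the standard computational-mechanics definition, under which two inputs belong to the same causal state exactly when their conditional output distributions coincide. The function name `causal_states` and the toy channel `cond` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def causal_states(cond):
    """Partition inputs x by their conditional distribution P(Y | X = x).

    `cond` maps each input to a dict {output: probability}.
    Inputs with identical distributions land in the same class.
    """
    classes = defaultdict(list)
    for x, dist in cond.items():
        # Order-independent, hashable key; zero-probability outputs are ignored.
        key = frozenset((y, p) for y, p in dist.items() if p != 0)
        classes[key].append(x)
    return sorted(sorted(xs) for xs in classes.values())


# Hypothetical toy channel: "a" and "b" predict the output identically.
cond = {
    "a": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "b": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "c": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
states = causal_states(cond)
print(states)
```

Exact `Fraction` probabilities are used so that equality of distributions is well defined; with estimated (floating-point) probabilities, some tolerance or statistical test would be needed instead.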

## 37 Citations

### Information Bottleneck Approach to Predictive Inference

- Computer Science, Entropy
- 2014

This paper synthesizes a recent line of work on automated predictive model making inspired by rate-distortion theory, and in particular by the Information Bottleneck method, to explain how this information-theoretic approach provides an intuitive, overarching framework for predictive inference.

### Information Flows in Causal Networks

- Computer Science, Adv. Complex Syst.
- 2008

A notion of causal independence based on intervention is used to define a measure for the strength of a causal effect, called "information flow", which is compared with known information flow measures such as transfer entropy.

### Optimal causal inference: estimating stored information and approximating causal architecture.

- Computer Science, Chaos
- 2010

We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate-distortion theory to use causal shielding--a natural principle of learning. We study…

### Information theory and learning: a physical approach

- Computer Science, ArXiv
- 2000

It is proved that predictive information provides the unique measure for the complexity of dynamics underlying the time series, and that there are classes of models characterized by power-law growth of the predictive information that are qualitatively more complex than any of the systems that have been investigated before.

### Informational and Causal Architecture of Discrete-Time Renewal Processes

- Computer Science, Entropy
- 2015

This work identifies the minimal sufficient statistic for predicting discrete-time renewal processes (the set of causal states), calculates the historical memory capacity required to store those states (statistical complexity), delineates what information is predictable (excess entropy), and decomposes the entropy of a single measurement into that shared with the past, future, or both.

### Predictability, Complexity, and Learning

- Computer Science, Neural Computation
- 2001

It is argued that the divergent part of Ipred(T) provides the unique measure for the complexity of dynamics underlying a time series.

### Learning predictive partitions for continuous feature spaces

- Computer Science, ESANN
- 2014

An unsupervised learning algorithm is proposed that finds discrete partitions of a continuous feature space that are predictive with respect to the future and induces a Markov chain on the data with high mutual information between the current state and the next state.

### Circumventing the Curse of Dimensionality in Prediction: Causal Rate-Distortion for Infinite-Order Markov Processes

- Computer Science, ArXiv
- 2014

This work circumvents the curse of dimensionality in rate-distortion analysis of infinite-order processes by casting predictive rate-distortion objective functions in terms of the forward- and reverse-time causal states of computational mechanics.

### An Information-Theoretic Formalism for Multiscale Structure in Complex Systems

- Computer Science
- 2014

This work develops a general formalism for representing and understanding structure in complex systems, and explores quantitative indices that summarize system structure, providing a new formal basis for the complexity profile and introducing a new index, the "marginal utility of information".

### Predictive Rate-Distortion for Infinite-Order Markov Processes

- Computer Science
- 2016

This work casts predictive rate-distortion objective functions in terms of the forward- and reverse-time causal states of computational mechanics, and shows that the resulting algorithms yield substantial improvements.

## References

### Computational Mechanics: Pattern and Prediction, Structure and Simplicity

- Computer Science, ArXiv
- 1999

It is shown that the causal-state representation, an ε-machine, is the minimal one consistent with accurate prediction, and several results are established on ε-machine optimality and uniqueness and on how ε-machines compare to alternative representations.

### Inferring statistical complexity.

- Computer Science, Physical Review Letters
- 1989

A technique is presented that directly reconstructs minimal equations of motion from the recursive structure of measurement sequences, demonstrating a form of superuniversality that refers only to the entropy and complexity of a data stream.

### Causation, prediction, and search

- Computer Science
- 1993

The authors axiomatize the connection between causal structure and probabilistic independence, explore several varieties of causal indistinguishability, formulate a theory of manipulation, and develop asymptotically reliable procedures for searching over equivalence classes of causal models.

### Statistical explanation & statistical relevance

- Physics
- 1971

According to modern physics, many objectively improbable events actually occur, such as the spontaneous disintegration of radioactive atoms. Because of high levels of improbability, scientists are…

### Thermodynamic depth of causal states: Objective complexity via minimal representations

- Computer Science
- 1999

It is shown that the rate of increase in thermodynamic depth is the system's reverse-time Shannon entropy rate, so depth measures only degrees of macroscopic randomness, not structure, and thus ε-machines are optimally shallow.

### Statistical Explanation and Statistical Relevance

- Geology
- 1981

The ontological difference between the hypothetical frequency interpretations advanced by Kyburg [1974], [1978] and by van Fraassen [1977], [1979] and the single-case dispositional analysis advanced…

### The information bottleneck method

- Computer Science, ArXiv
- 2000

The variational principle provides a surprisingly rich framework for discussing a variety of problems in signal processing and learning, as will be described in detail elsewhere.
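For orientation, the variational principle mentioned here is usually written as the following functional, minimized over stochastic encodings $p(\tilde{x} \mid x)$ of the input $X$ into a compressed variable $\tilde{X}$, with $Y$ the relevant variable. The notation is the common textbook form and is an assumption here, since the summary above does not state it.

$$
\mathcal{L}\left[p(\tilde{x} \mid x)\right] = I(\tilde{X}; X) - \beta \, I(\tilde{X}; Y)
$$

The Lagrange multiplier $\beta \ge 0$ sets the trade-off: small $\beta$ favors compression (low $I(\tilde{X}; X)$), while large $\beta$ favors preserving information about $Y$.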

### Scientific Explanation and the Causal Structure of the World

- Philosophy
- 1984

The philosophical theory of scientific explanation proposed here involves a radically new treatment of causality that accords with the pervasively statistical character of contemporary science.…

### Latent variable models: an introduction to factor, path, and structural analysis

- Psychology
- 1986

This text provides an introduction to a growing area in the social and behavioural sciences - the modelling of systems in which one or more variables are hypothesized but not directly observed.…