Abductive learning of quantized stochastic processes with probabilistic finite automata

  • Ishanu Chattopadhyay, Hod Lipson
  • Published 13 February 2013
  • Computer Science
  • Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
We present an unsupervised learning algorithm (GenESeSS) to infer the causal structure of quantized stochastic processes, defined as stochastic dynamical systems evolving over discrete time and producing quantized observations. Assuming ergodicity and stationarity, GenESeSS infers probabilistic finite state automata models from a sufficiently long observed trace. Our approach is abductive, attempting to infer a simple hypothesis, consistent with observations and modelling framework that… 

Recursive Abduction and the Universality of Physical Laws: A Logical Analysis Based on Case Studies
The paper studies several cases in physics, such as Galilean inertial motion, and presents a logical schema of recursive abduction, from which we can derive the universality of physical
Timesmash: Process-Aware Fast Time Series Clustering and Classification
A suite of algorithms for time series classification and clustering, implemented as the Python package Timesmash, which leverages a subclass of hidden Markov models called Probabilistic Finite-State Automata (PFSA) to first model, in an unsupervised setting, the underlying generative processes of observed data streams; these models then aid automatic physics- or process-aware featurization, enabling subsequent clustering and classification.
Computing entropy rate of symbol sources & a distribution-free limit theorem
A fundamentally new approach to estimating entropy rates is presented, which is demonstrated to converge significantly faster in terms of input data lengths, and is shown to be effective in diverse applications ranging from the estimation of the entropy rate of English texts to the estimation of the complexity of chaotic dynamical systems.
Symbolic analysis-based reduced order Markov modeling of time series data
Data Smashing
Diverse applications are presented, including disambiguation of brainwaves pertaining to epileptic seizures, detection of anomalous cardiac rhythms, and classification of astronomical objects from raw photometry, using the data smashing principle.
Causality Networks
This work presents a new non-parametric test of Granger causality for quantized or symbolic data streams generated by ergodic stationary sources, and makes precise and computes the degree of causal dependence between data streams, without making any restrictive assumptions, linearity or otherwise.
Data Smashing 2.0: Sequence Likelihood (SL) Divergence For Fast Time Series Comparison
A new approach to quantify deviations in the underlying hidden generators of observed data streams is introduced, resulting in a new efficiently computable universal metric for time series, which may be used to measure deviations within a well-defined class of discrete-valued stochastic processes.
Markov Modeling of Time Series via Spectral Analysis for Detection of Combustion Instabilities
This chapter presents a methodology for reduced-order Markov modeling of time-series data based on spectral properties of the stochastic matrix and clustering of directed graphs.
Causality inference between time series data and its applications
The subsequent chapters detail four endeavors in studying causality: financial markets, earthquakes, animal/human brain signals, and the predictivity of data sets; the causal patterns found in the time series can also be used to compress data.
Data smashing: uncovering lurking order in data
It is suggested that data smashing principles may open the door to understanding increasingly complex observations, especially when experts do not know what to look for.


Learning probabilistic automata with variable memory length
It is proved that the proposed algorithm can indeed efficiently learn distributions generated by the authors' more restricted sources, and that the KL-divergence between the distribution generated by the target source and the distribution generated by the hypothesis can be made small with high confidence in polynomial time and sample complexity.
Towards Feasible PAC-Learning of Probabilistic Deterministic Finite Automata
It is proved that indeed this algorithm PAC-learns in a stronger sense than the Clark-Thollard algorithm, and is an attempt to keep the rigorous guarantees of the original one but use sample sizes that are not as astronomical as predicted by the theory.
Structural transformations of probabilistic finite state machines
The binary operations of probabilistic synchronous composition and projective composition, which have applications in symbolic model-based supervisory control and in symbolic pattern recognition problems, are introduced.
PAC-learnability of Probabilistic Deterministic Finite State Automata
It is demonstrated that the class of PDFAs is PAC-learnable using a variant of a standard state-merging algorithm and the Kullback-Leibler divergence as error function.
PAC-Learning of Markov Models with Hidden State
This work proposes a new PAC framework for learning both the topology and the parameters in partially observable Markov models, and learns a Probabilistic Deterministic Finite Automaton (PDFA) which approximates a Hidden Markov Model (HMM) up to some desired degree of accuracy.
On the learnability of discrete distributions
A new model of learning probability distributions from independent draws is introduced, inspired by the popular Probably Approximately Correct (PAC) model for learning boolean functions from labeled examples, in the sense that it emphasizes efficient and approximate learning, and it studies the learnability of restricted classes of target distributions.
Learning deterministic regular grammars from stochastic samples in polynomial time
A class of algorithms that allow for the identification of the structure of the minimal stochastic automaton generating the language is proposed, and it is shown that the time needed grows only linearly with the size of the sample set.
On the Inference of Stochastic Regular Grammars
Automated reverse engineering of nonlinear dynamical systems
This work introduces for the first time a method that can automatically generate symbolic equations for a nonlinear coupled dynamical system directly from time series data, applicable to any system that can be described using sets of ordinary nonlinear differential equations.