# Hilbert Space Embeddings of Hidden Markov Models

@inproceedings{Song2010HilbertSE, title={Hilbert Space Embeddings of Hidden Markov Models}, author={Le Song and Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon and Alex Smola}, booktitle={ICML}, year={2010} }

Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states and are largely restricted to Gaussian and discrete observations. Moreover, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of recent spectral methods. We propose a nonparametric HMM that extends traditional HMMs to structured and non-Gaussian continuous distributions. Furthermore, we derive a…
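The central construction behind this line of work is the RKHS mean embedding: a distribution is represented by its expected feature map, and two distributions are compared via the distance between their embeddings (the maximum mean discrepancy). A minimal sketch with a Gaussian RBF kernel — the function names and the choice of kernel parameter here are illustrative, not taken from the paper:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    # Squared MMD: squared RKHS distance between the empirical mean
    # embeddings of the samples X and Y (biased V-statistic estimate).
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())
```

Comparing samples through their embeddings in this way requires no intermediate density estimation, which is what makes the embedding view attractive for the non-Gaussian, structured observations the paper targets.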

## 215 Citations

Using Regression for Spectral Estimation of HMMs

- Computer Science, SLSP
- 2013

This work introduces a new spectral model for HMM estimation and a corresponding spectral bilinear regression model, and systematically compares them with a variety of competing simplified models, explaining when and why each method gives superior performance.

Regularization of Hidden Markov Models Embedded into Reproducing Kernel Hilbert Space

- Computer Science, Advances in Intelligent Systems and Computing
- 2018

An approach is discussed that extends the application of HMMs to non-Gaussian continuous distributions by embedding the belief about the state into a reproducing kernel Hilbert space (RKHS), and that reduces the tendency to overfit and the computational complexity of the algorithm by means of various regularization techniques, specifically Nyström subsampling.

Spectral dimensionality reduction for HMMs

- Computer Science, ArXiv
- 2012

This work provides a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary.

On learning parametric-output HMMs

- Computer Science, ICML
- 2013

We present a novel approach for learning an HMM whose outputs are distributed according to a parametric family. This is done by *decoupling* the learning task into two steps: first estimating the…

Discriminative spectral learning of hidden markov models for human activity recognition

- Computer Science, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2015

Spectral learning of HMMs, a moment-matching technique free from local maxima, is extended to discriminative HMMs; the resulting method provides the posterior probabilities of the classes without explicitly determining the HMM parameters and is able to deal with missing labels.

Identification of hidden Markov models using spectral learning with likelihood maximization

- Computer Science, 2017 IEEE 56th Annual Conference on Decision and Control (CDC)
- 2017

This paper proposes a two-step procedure that combines spectral learning with a single Newton-like iteration for maximum likelihood estimation, and demonstrates improved statistical performance of the proposed algorithm in numerical simulations.

Spectral Learning of Hidden Markov Models

- 2014

The Hidden Markov Model is the most fundamental model of partially observable uncontrolled systems in Machine Learning. For many years the state of the art was the Baum–Welch algorithm, which…

Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices

- Computer Science, Mathematics, NIPS
- 2016

This paper studies the estimation of an $m$-state hidden Markov model (HMM) under only smoothness assumptions, such as Hölder conditions, on the emission densities, and develops a computationally efficient spectral algorithm for learning nonparametric HMMs.

Spectral estimation of hidden Markov models

- Computer Science
- 2014

It is shown that spectral estimation of hidden Markov models can be factored into two major components: estimation of the hidden state space dynamics and estimation of the observation probability distributions. This leads to extremely flexible estimation procedures that can be tailored precisely to the task of interest.

## References

Showing 1-10 of 28 references

A Spectral Algorithm for Learning Hidden Markov Models

- Computer Science, COLT
- 2009

Reduced-Rank Hidden Markov Models

- Computer Science, AISTATS
- 2010

This paper proves a tighter finite-sample error bound for the case of Reduced-Rank HMMs, i.e., HMMs with low-rank transition matrices, and generalizes the algorithm and bounds to models where multiple observations are needed to disambiguate state, and to models that emit multivariate real-valued observations.

Observable Operator Models for Discrete Stochastic Time Series

- Mathematics, Computer Science, Neural Computation
- 2000

A novel, simple characterization of linearly dependent processes, called observable operator models, is provided, which leads to a constructive learning algorithm for the identification of linearly dependent processes.

Automatic state discovery for unstructured audio scene classification

- Computer Science, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2010

A novel scheme for unstructured audio scene classification is presented that possesses three highly desirable and powerful features: autonomy, scalability, and robustness. The scheme has proven highly effective for building real-world applications and has been integrated into a commercial surveillance system as an event detection component.

Hilbert space embeddings of conditional distributions with applications to dynamical systems

- Computer Science, Mathematics, ICML '09
- 2009

This paper derives a kernel estimate for the conditional embedding and shows its connection to ordinary embeddings, with the aim of obtaining a nonparametric method for modeling dynamical systems in which the belief state of the system is maintained as a conditional embedding.
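The conditional embedding estimate summarized above has a simple finite-sample form: the embedding of P(Y | X = x) is a weighted combination of training feature maps, with weights obtained by a regularized kernel regression. A hedged sketch, where the helper names, kernel, and regularization parameter are our own illustrative choices:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def conditional_mean_weights(Kx, kx_test, lam=1e-4):
    # Weights alpha such that mu_{Y|x} = sum_i alpha_i * phi(y_i), with
    # alpha = (Kx + n * lam * I)^{-1} kx(x): a regularized kernel regression.
    n = Kx.shape[0]
    return np.linalg.solve(Kx + n * lam * np.eye(n), kx_test)
```

Expectations E[f(Y) | X = x] are then approximated by the weighted sum of f over the training outputs; maintaining such a weight vector over time is, roughly, how an embedded HMM propagates its belief state.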

Random Features for Large-Scale Kernel Machines

- Computer Science, NIPS
- 2007

Two sets of random features are explored, with convergence bounds on their ability to approximate various radial basis kernels, and it is shown that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
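The random-feature construction summarized above admits a compact sketch: for a Gaussian RBF kernel, sampling frequencies from the kernel's Fourier transform yields an explicit feature map whose inner products approximate the kernel. The dimension D, seed handling, and function name below are illustrative assumptions:

```python
import numpy as np

def random_fourier_features(X, D=500, gamma=1.0, seed=0):
    # Approximate exp(-gamma * ||x - y||^2) by z(x) @ z(y), where
    # z(x) = sqrt(2/D) * cos(W^T x + b), W ~ N(0, 2*gamma*I), b ~ U[0, 2*pi].
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```

`z(X) @ z(Y).T` then approximates the exact kernel matrix with error shrinking on the order of 1/sqrt(D), so linear methods applied to z can stand in for kernel machines at scale.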

A Hilbert Space Embedding for Distributions

- Mathematics, ALT
- 2007

We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert…

A general regression technique for learning transductions

- Computer Science, ICML
- 2005

A novel and conceptually cleaner formulation of kernel dependency estimation provides a simple framework for estimating the regression coefficients, and an efficient algorithm for computing the pre-image from the regression coefficients extends the applicability of kernel dependency estimation to output sequences.

Injective Hilbert Space Embeddings of Probability Measures

- Computer Science, Mathematics, COLT
- 2008

This work considers more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective, restricting attention to translation-invariant kernels on Euclidean space.

Large Margin Methods for Structured and Interdependent Output Variables

- Computer Science, J. Mach. Learn. Res.
- 2005

This paper proposes to appropriately generalize the well-known notion of a separation margin and derive a corresponding maximum-margin formulation and presents a cutting plane algorithm that solves the optimization problem in polynomial time for a large class of problems.