• Corpus ID: 4693805

# Hilbert Space Embeddings of Hidden Markov Models

@inproceedings{Song2010HilbertSE,
title={Hilbert Space Embeddings of Hidden Markov Models},
author={Le Song and Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon and Alex Smola},
booktitle={ICML},
year={2010}
}
• Published in ICML, 21 June 2010
• Computer Science
Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are limited to discrete latent states, and are largely restricted to Gaussian and discrete observations. Moreover, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that extends traditional HMMs to structured and non-Gaussian continuous distributions. Furthermore, we derive a…
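The core tool behind the abstract's proposal is the kernel mean embedding: a distribution P is represented in a reproducing kernel Hilbert space by μ_P = E[k(x, ·)], estimated from a sample as the average of feature maps, and two distributions are compared via the RKHS distance between their embeddings (the maximum mean discrepancy). The following is a generic illustrative sketch, not code from the paper; the names `rbf`, `mmd2`, `mean_embedding_eval` and all numeric values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(X, Y, sigma=1.0):
    # Gaussian kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mean_embedding_eval(X, t, sigma=1.0):
    # Empirical kernel mean embedding of the sample X, evaluated at a point t:
    # mu_hat(t) = (1/n) * sum_i k(x_i, t)
    return rbf(X, t[None, :], sigma).mean()

def mmd2(X, Y, sigma=1.0):
    # Squared RKHS distance between the empirical mean embeddings of X and Y
    # (the biased MMD^2 estimator, always >= 0).
    return rbf(X, X, sigma).mean() - 2.0 * rbf(X, Y, sigma).mean() + rbf(Y, Y, sigma).mean()

X = rng.normal(0.0, 1.0, size=(500, 2))   # samples from P
Y = rng.normal(0.0, 1.0, size=(500, 2))   # fresh samples from the same P
Z = rng.normal(2.0, 1.0, size=(500, 2))   # samples from a shifted distribution Q
# mmd2(X, Y) stays near 0, while mmd2(X, Z) is clearly larger
```

Because the embedding characterizes the distribution (for characteristic kernels), comparing and updating beliefs can be done entirely through such kernel evaluations, with no density estimation.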
## 215 Citations


Using Regression for Spectral Estimation of HMMs
• Computer Science
SLSP
• 2013
This work introduces a new spectral model for HMM estimation and a corresponding spectral bilinear regression model, and systematically compares them with a variety of competing simplified models, explaining when and why each method gives superior performance.
Regularization of Hidden Markov Models Embedded into Reproducing Kernel Hilbert Space
• Computer Science
Advances in Intelligent Systems and Computing
• 2018
An approach to extending the application of HMMs to non-Gaussian continuous distributions by embedding the belief about the state into a reproducing kernel Hilbert space (RKHS) is discussed; the tendency to overfit and the computational complexity of the algorithm are reduced by means of various regularization techniques, specifically Nyström subsampling.
Spectral dimensionality reduction for HMMs
• Computer Science
ArXiv
• 2012
This work provides a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary.
On learning parametric-output HMMs
• Computer Science
ICML
• 2013
We present a novel approach for learning an HMM whose outputs are distributed according to a parametric family. This is done by *decoupling* the learning task into two steps: first estimating the…
Discriminative spectral learning of hidden Markov models for human activity recognition
• Computer Science
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 2015
Spectral learning of HMMs, a moment-matching learning technique free from local maxima, is extended to discriminative HMMs; the resulting method provides the posterior probabilities of the classes without explicitly determining the HMM parameters, and is able to deal with missing labels.
Identification of hidden Markov models using spectral learning with likelihood maximization
• Computer Science
2017 IEEE 56th Annual Conference on Decision and Control (CDC)
• 2017
This paper proposes a two-step procedure that combines spectral learning with a single Newton-like iteration for maximum likelihood estimation and demonstrates an improved statistical performance using the proposed algorithm in numerical simulations.
Spectral Learning of Hidden Markov Models
• 2014
The Hidden Markov Model is the most fundamental model of partially observable uncontrolled systems in machine learning. The state of the art over the last years was the Baum-Welch algorithm, which…
Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices
• Computer Science, Mathematics
NIPS
• 2016
This paper studies the estimation of an $m$-state hidden Markov model (HMM) under only smoothness assumptions, such as Hölderian conditions, on the emission densities, and develops a computationally efficient spectral algorithm for learning nonparametric HMMs.
Spectral estimation of hidden Markov models
It is shown that spectral estimation of hidden Markov models can be factored into two major components: estimation of the hidden state space dynamics, and estimation of the observation probability distributions. This factorization leads to extremely flexible estimation procedures that can be tailored precisely to the task of interest.
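The spectral estimation theme running through these citations can be made concrete with a minimal observable-operator sketch in the style of the Hsu–Kakade–Zhang algorithm that these papers build on. This is an illustrative assumption-laden example, not code from any cited paper: the HMM (`T`, `O`, `pi`) is hypothetical, and the low-order moments are computed analytically rather than estimated from data, so the recovery is exact when the rank condition holds.

```python
import numpy as np

# Hypothetical 2-state, 3-symbol HMM; all values are illustrative.
# T[i, j] = P(h' = i | h = j); O[x, j] = P(obs = x | h = j); pi = initial distribution.
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])
O = np.array([[0.5, 0.1],
              [0.3, 0.3],
              [0.2, 0.6]])
pi = np.array([0.6, 0.4])
n_obs, m = O.shape

# Exact low-order observation moments (in practice these are estimated from
# singletons, pairs, and triples of consecutive observations).
P1 = O @ pi                                       # P1[x]     = P(x1 = x)
P21 = O @ T @ np.diag(pi) @ O.T                   # P21[y, x] = P(x2 = y, x1 = x)
P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
        for x in range(n_obs)]                    # P3x1[x][z, w] = P(x3 = z, x2 = x, x1 = w)

# Spectral step: project onto the top-m left singular subspace of P21.
U = np.linalg.svd(P21)[0][:, :m]
Q = np.linalg.pinv(U.T @ P21)
b1 = U.T @ P1                        # initial operator state
binf = Q.T @ P1                      # normalization vector
B = [U.T @ P3 @ Q for P3 in P3x1]    # one observable operator per symbol

def spectral_prob(seq):
    """Joint probability P(x1..xt) via the observable-operator recursion."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def forward_prob(seq):
    """Reference joint probability from the classical forward recursion."""
    alpha = O[seq[0]] * pi
    for x in seq[1:]:
        alpha = O[x] * (T @ alpha)
    return float(alpha.sum())
```

With exact moments and a rank-m `P21`, `spectral_prob` agrees with the forward recursion on every sequence, without ever recovering `T` and `O` themselves; this is the sense in which the state dynamics and observation distributions are estimated as separate components.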

## References

Showing 1–10 of 28 references
Reduced-Rank Hidden Markov Models
• Computer Science
AISTATS
• 2010
This paper proves a tighter finite-sample error bound for the case of Reduced-Rank HMMs, i.e., HMMs with low-rank transition matrices, and generalizes the algorithm and bounds to models where multiple observations are needed to disambiguate state, and to models that emit multivariate real-valued observations.
Observable Operator Models for Discrete Stochastic Time Series
• H. Jaeger
• Mathematics, Computer Science
Neural Computation
• 2000
A novel, simple characterization of linearly dependent processes, called observable operator models, is provided, which leads to a constructive learning algorithm for the identification of linearly dependent processes.
Automatic state discovery for unstructured audio scene classification
• Computer Science
2010 IEEE International Conference on Acoustics, Speech and Signal Processing
• 2010
A novel scheme for unstructured audio scene classification that possesses three highly desirable and powerful features: autonomy, scalability, and robustness that has proven to be highly effective for building real-world applications and has been integrated into a commercial surveillance system as an event detection component.
Hilbert space embeddings of conditional distributions with applications to dynamical systems
• Computer Science, Mathematics
ICML '09
• 2009
This paper aims to derive a nonparametric method for modeling dynamical systems in which the belief state of the system is maintained as a conditional embedding; it derives a kernel estimate for the conditional embedding and shows its connection to ordinary embeddings.
Random Features for Large-Scale Kernel Machines
• Computer Science
NIPS
• 2007
Two sets of random features are explored, convergence bounds are provided on their ability to approximate various radial basis kernels, and it is shown that in large-scale classification and regression tasks, linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
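The random-feature construction can be sketched in a few lines: draw random frequencies from the Fourier transform of a shift-invariant kernel and random phases uniformly, and the inner product of the resulting cosine features approximates the kernel value up to Monte Carlo error. A minimal sketch under assumed parameter values (the dimensions, bandwidth, and variable names below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 3, 4000, 1.0   # input dim, number of random features, bandwidth (illustrative)

# Random Fourier features for the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)):
# frequencies are drawn from the kernel's Fourier transform, phases uniformly.
W = rng.normal(scale=1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def z(x):
    # Feature map with E[z(x) @ z(y)] = k(x, y)
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x = rng.normal(size=d)
y = rng.normal(size=d)
exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
approx = float(z(x) @ z(y))
# approx tracks exact up to O(1/sqrt(D)) Monte Carlo error
```

The payoff is that a kernel machine over n points can be replaced by a linear model over D-dimensional explicit features, trading a controlled approximation error for linear-time training.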
A Hilbert Space Embedding for Distributions
• Mathematics
ALT
• 2007
We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space…
A general regression technique for learning transductions
• Computer Science
ICML
• 2005
A novel and conceptually cleaner formulation of kernel dependency estimation provides a simple framework for estimating the regression coefficients, and an efficient algorithm for computing the pre-image from the regression coefficients extends the applicability of kernel dependency estimation to output sequences.
Injective Hilbert Space Embeddings of Probability Measures
• Computer Science, Mathematics
COLT
• 2008
This work considers more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective, restricting attention to translation-invariant kernels on Euclidean space.
Large Margin Methods for Structured and Interdependent Output Variables
• Computer Science
J. Mach. Learn. Res.
• 2005
This paper proposes to appropriately generalize the well-known notion of a separation margin, derives a corresponding maximum-margin formulation, and presents a cutting-plane algorithm that solves the optimization problem in polynomial time for a large class of problems.