# Concentration Bounds for Co-occurrence Matrices of Markov Chains

@article{Qiu2020ConcentrationBF, title={Concentration Bounds for Co-occurrence Matrices of Markov Chains}, author={Jiezhong Qiu and Chi Wang and Ben Liao and Richard Peng and Jie Tang}, journal={ArXiv}, year={2020}, volume={abs/2008.02464} }

Co-occurrence statistics for sequential data are common and important data signals in machine learning, which provide rich correlation and clustering information about the underlying object space. We give the first bound on the convergence rate of estimating the co-occurrence matrix of a regular (aperiodic and irreducible) finite Markov chain from a single random trajectory. Our work is motivated by the analysis of a well-known graph learning algorithm DeepWalk by [Qiu et al. WSDM '18], who…
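To make the estimation problem in the abstract concrete, here is a minimal sketch of computing an empirical co-occurrence matrix from a single trajectory. The excerpt does not define the matrix, so the windowed, symmetrized, uniformly weighted form below is an assumption, and the names `simulate_chain`, `cooccurrence`, and the example transition matrix `P` are hypothetical.

```python
import random


def simulate_chain(P, length, start=0, seed=0):
    """Sample one trajectory from a row-stochastic transition matrix P."""
    rng = random.Random(seed)
    states = list(range(len(P)))
    path = [start]
    for _ in range(length - 1):
        # Draw the next state from the row of the current state.
        path.append(rng.choices(states, weights=P[path[-1]])[0])
    return path


def cooccurrence(path, n_states, window):
    """Empirical co-occurrence matrix of one trajectory.

    Assumed definition: average over start positions i and offsets
    r = 1..window of the symmetrized indicator of (path[i], path[i + r]),
    with uniform weights so that all entries sum to 1.
    """
    n_pos = len(path) - window
    if n_pos <= 0:
        raise ValueError("trajectory must be longer than the window")
    C = [[0.0] * n_states for _ in range(n_states)]
    w = 1.0 / (2 * window * n_pos)
    for i in range(n_pos):
        for r in range(1, window + 1):
            u, v = path[i], path[i + r]
            C[u][v] += w
            C[v][u] += w
    return C


# A small regular (irreducible, aperiodic) chain on three states.
P = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.1, 0.9],
]
path = simulate_chain(P, 5000)
C = cooccurrence(path, n_states=3, window=2)
```

A concentration bound of the kind the abstract describes would control how far `C` is from its stationary expectation as the trajectory length grows; the sketch only computes the estimate and says nothing about the rate.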

## One Citation

Consistency of random-walk based network embedding algorithms

- Computer Science; ArXiv
- 2021

This paper establishes large-sample error bounds and proves consistent community recovery for node2vec/DeepWalk embedding followed by k-means clustering. It suggests using larger window sizes, or equivalently, taking longer random walks, to attain a better convergence rate for the resulting embeddings.

## References

Estimating the Mixing Time of Ergodic Markov Chains

- Mathematics; COLT
- 2019

This work addresses the problem of estimating the mixing time of an arbitrary ergodic finite-state Markov chain from a single trajectory of length $m$ and estimates the pseudo-spectral gap $\gamma_{\mathsf{ps}}$, which allows it to overcome the loss of symmetry and to achieve a polynomial dependence on the minimal stationary probability.

Tail Estimates for Sums of Variables Sampled by a Random Walk

- Mathematics; Combinatorics, Probability and Computing
- 2008

We prove tail estimates for variables of the form $\sum_i f(X_i)$, where $(X_i)_i$ is a sequence of states drawn from a reversible Markov chain, or, equivalently, from a random walk on an undirected graph. The…

Random Walks on Dynamic Graphs: Mixing Times, Hitting Times, and Return Probabilities

- Mathematics; ICALP
- 2019

We establish and generalise several bounds for various random walk quantities including the mixing time and the maximum hitting time. Unlike previous analyses, our derivations are based on rather…

A Chernoff Bound for Random Walks on Expander Graphs

- Computer Science, Mathematics; SIAM J. Comput.
- 1998

The method of taking the sample average from one trajectory is a more efficient estimate of $\pi(A)$ than the standard method of generating independent sample points from several trajectories and improves the algorithms of Jerrum and Sinclair (1989) for approximating the number of perfect matchings in a dense graph.

Chernoff-type bound for finite Markov chains

- Mathematics
- 1998

This paper develops bounds on the distribution function of the empirical mean for irreducible finite-state Markov chains. One approach, explored by D. Gillman, reduces this problem to bounding the…

Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified

- Mathematics; STACS
- 2012

We prove the first Chernoff-Hoeffding bounds for general nonreversible finite-state Markov chains based on the standard $L_1$ (variation distance) mixing-time of the chain. Specifically, consider an…

Efficient Sampling for Gaussian Graphical Models via Spectral Sparsification

- Computer Science, Mathematics; COLT
- 2015

A toolset based on spectral sparsification for a family of fundamental problems involving Gaussian sampling, matrix functionals, and reversible Markov chains is developed, which is expected to strengthen the connection between machine learning and spectral graph theory, two of the most active fields in understanding large data and networks.

A randomness-efficient sampler for matrix-valued functions and applications

- Mathematics, Computer Science; 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05)
- 2005

It is shown that a random walk on an expander approximates the recent Chernoff-like bound for matrix-valued functions of Ahlswede and Winter [2002], in a manner which depends optimally on the spectral gap.

Spectral Sparsification of Random-Walk Matrix Polynomials

- Computer Science, Mathematics; ArXiv
- 2015

The first nearly linear time algorithm for sparsification of a spectral sparsifier of constant-degree random-walk matrix polynomials introduced by Newton's method is developed.

Learning Hidden Markov Models from Pairwise Co-occurrences with Applications to Topic Modeling

- Computer Science; ICML
- 2018

A new algorithm for identifying the transition and emission probabilities of a hidden Markov model (HMM) from the emitted data is presented, and it is demonstrated that topics can be learned with higher quality if documents are modeled as observations of HMMs sharing the same emission probability.