
Hoeffding's lemma for Markov Chains and its applications to statistical learning

@article{Fan2018HoeffdingsLF,
  title={Hoeffding's lemma for Markov Chains and its applications to statistical learning},
  author={Jianqing Fan and Bai Jiang and Qiang Sun},
  journal={arXiv: Statistics Theory},
  year={2018}
}
We establish the counterpart of Hoeffding's lemma for Markov dependent random variables. Specifically, if a stationary Markov chain $\{X_i\}_{i \ge 1}$ with invariant measure $\pi$ admits an $\mathcal{L}_2(\pi)$-spectral gap $1-\lambda$, then for any bounded functions $f_i$ taking values in $[a_i, b_i]$, the sum of $f_i(X_i)$ is sub-Gaussian with variance proxy $\frac{1+\lambda}{1-\lambda} \cdot \sum_i \frac{(b_i-a_i)^2}{4}$. The counterpart of Hoeffding's inequality immediately follows. Our results…
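To make the stated corollary concrete, the sketch below (not from the paper) evaluates the two-sided Hoeffding-type tail bound $\mathbb{P}(|S_n - \mathbb{E}S_n| \ge t) \le 2\exp(-t^2/(2\sigma^2))$ that follows from a sub-Gaussian variance proxy $\sigma^2 = \frac{1+\lambda}{1-\lambda}\sum_i (b_i-a_i)^2/4$, and checks it against a simulated two-state reversible chain. The chain parameters, the choice $f_i(x) = x$, and the threshold $t$ are illustrative assumptions, not values from the paper.

import numpy as np

def hoeffding_markov_bound(t, lam, ranges):
    # Two-sided tail bound implied by a sub-Gaussian variance proxy
    #   sigma^2 = (1 + lam) / (1 - lam) * sum_i (b_i - a_i)^2 / 4
    # (a sketch of the stated corollary, not the paper's exact statement).
    sigma2 = (1.0 + lam) / (1.0 - lam) * sum((b - a) ** 2 / 4.0 for a, b in ranges)
    return 2.0 * np.exp(-t ** 2 / (2.0 * sigma2))

# Illustrative two-state chain on {0, 1} with transition matrix
#   P = [[1 - p, p], [q, 1 - q]],
# which is reversible, has stationary distribution pi = (q, p) / (p + q),
# second eigenvalue 1 - p - q, and hence L2(pi) spectral gap 1 - |1 - p - q|.
p, q = 0.3, 0.2
lam = abs(1.0 - p - q)
pi = np.array([q, p]) / (p + q)
P = np.array([[1.0 - p, p], [q, 1.0 - q]])

n, n_rep, t = 200, 2000, 25.0          # illustrative sizes and threshold
rng = np.random.default_rng(0)

# Empirical tail of S_n = sum_i f(X_i) with f(x) = x (so a_i = 0, b_i = 1),
# started from stationarity so that E[S_n] = n * pi[1].
exceed = 0
for _ in range(n_rep):
    x = rng.choice(2, p=pi)
    s = 0.0
    for _ in range(n):
        s += x
        x = rng.choice(2, p=P[x])
    exceed += abs(s - n * pi[1]) >= t

print("empirical tail  :", exceed / n_rep)
print("Hoeffding bound :", hoeffding_markov_bound(t, lam, [(0.0, 1.0)] * n))

As expected, the empirical exceedance frequency sits below the bound; the gap reflects the usual looseness of Hoeffding-type inequalities relative to the chain's actual asymptotic variance.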
A Hoeffding inequality for Markov chains
  • Shravas Rao
  • Mathematics
    Electronic Communications in Probability
  • 2019
We prove deviation bounds for the random variable $\sum_{i=1}^{n} f_i(Y_i)$ in which $\{Y_i\}_{i=1}^{\infty}$ is a reversible Markov chain with a stationary distribution and state space $[N]$, and
Optimal Chernoff and Hoeffding Bounds for Finite State Markov Chains
This paper develops an optimal Chernoff type bound for the probabilities of large deviations of sums $\sum_{k=1}^n f (X_k)$ where $f$ is a real-valued function and $(X_k)_{k \in \mathbb{Z}_{\ge 0}}$
Concentration and Anti-concentration for Markov Chains
  • 2019
We study tail bounds and small ball probabilities for sums of random variables obtained from a Markov chain. In particular, we consider the following sum $S_n = f_1(Y_1) + \cdots + f_n(Y_n)$ where $\{Y_i\}_{i=1}^{n}$ is
Concentration inequality for U-statistics of order two for uniformly ergodic Markov chains, and applications
We prove a new concentration inequality for U-statistics of order two for uniformly ergodic Markov chains. Working with bounded π-canonical kernels, we show that we can recover the convergence rate
Transport-information inequalities for Markov chains
This paper is the discrete time counterpart of the previous work in the continuous time case by Guillin, Léonard, the second named author and Yao [Probab. Theory Related Fields 144(2009), no. 3-4,
Fast Doubly-Adaptive MCMC to Estimate the Gibbs Partition Function with Weak Mixing Time Bounds
TLDR: A doubly adaptive approach is developed, combining the adaptive cooling schedule with an adaptive MCMC mean estimator, whose number of Markov chain steps adapts dynamically to the underlying chain.
Bernstein's inequality for general Markov chains
We establish Bernstein inequalities for functions of general (general-state-space, not necessarily reversible) Markov chains. These inequalities achieve sharp variance proxies and recover the
Three rates of convergence or separation via U-statistics in a dependent framework
Despite the ubiquity of U-statistics in modern Probability and Statistics, their non-asymptotic analysis in a dependent framework may have been overlooked. In a recent work, a new concentration
Generalized Autoregressive Linear Models for Discrete High-Dimensional Data
TLDR: The main result provides a bound on the mean-squared error of the estimated connectivity tensor as a function of the sparsity and the number of samples, for a class of discrete multivariate AR models, in the high-dimensional regime.

References

(Showing 1-10 of 57 references)
Geometric ergodicity and the spectral gap of non-reversible Markov chains
We argue that the spectral theory of non-reversible Markov chains may often be more effectively cast within the framework of the naturally associated weighted-$L_\infty$ space $L_\infty^V$, instead of
Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified
We prove the first Chernoff-Hoeffding bounds for general nonreversible finite-state Markov chains based on the standard $L_1$ (variation distance) mixing-time of the chain. Specifically, consider an
Exponential bounds and stopping rules for MCMC and general Markov chains
We develop explicit, general bounds for the probability that the empirical sample averages of a function of a Markov chain on a general alphabet will exceed the steady-state mean of that function by
Error Bounds for Approximations of Geometrically Ergodic Markov Chains
A common tool in the practice of Markov Chain Monte Carlo is to use approximating transition kernels to speed up computation when the true kernel is slow to evaluate. A relatively limited set of
Approximations of Geometrically Ergodic Markov Chains
A common tool in the practice of Markov Chain Monte Carlo is to use approximating transition kernels to speed up computation when the desired kernel is slow to evaluate or intractable. A relatively
Relative entropy and exponential deviation bounds for general Markov chains
We develop explicit, general bounds for the probability that the normalized partial sums of a function of a Markov chain on a general alphabet would exceed the steady-state mean of that function by a
Chernoff-type bound for finite Markov chains
This paper develops bounds on the distribution function of the empirical mean for irreducible finite-state Markov chains. One approach, explored by D. Gillman, reduces this problem to bounding the
Concentration inequalities for Markov processes via coupling
We obtain moment and Gaussian bounds for general coordinate-wise Lipschitz functions evaluated along the sample path of a Markov chain. We treat Markov chains on general (possibly unbounded) state
HOEFFDING'S INEQUALITIES FOR GEOMETRICALLY ERGODIC MARKOV CHAINS ON GENERAL STATE SPACE
Let $(X_n)_{n \ge 1}$ be a Markov chain on a general state space with stationary distribution $\pi$ and a spectral gap in the space $L_\pi^2$. In this paper, we prove that the probabilities of large deviations of sums
Measure concentration for a class of random processes
Summary. Let $X = \{X_i\}_{i=-\infty}^{\infty}$ be a stationary random process with a countable alphabet and distribution $q$. Let $q^\infty(\cdot \mid x_{-k}^{0})$ denote the conditional distribution of $X^\infty = (X_1, X_2, \ldots, X_n, \ldots)$ given the $k$-length past: