mraginsky + markov-chains   10

[1111.2687] Ricci curvature of finite Markov chains via convexity of the entropy
We define and study a new notion of Ricci curvature that applies to Markov chains on discrete spaces. This notion relies on geodesic convexity of the entropy and is analogous to the one introduced by Lott, Sturm, and Villani for geodesic measure spaces. In order to apply to the discrete setting, the role of the Wasserstein metric is taken over by a different metric, having the property that continuous time Markov chains are gradient flows of the entropy.
Using this notion of Ricci curvature we prove discrete analogues of fundamental results by Bakry--Emery and Otto--Villani. Furthermore we show that Ricci curvature bounds are preserved under tensorisation. As a special case we obtain the sharp Ricci curvature lower bound for the discrete hypercube.
papers  to-read  markov-chains  probability  measure-concentration 
5 weeks ago by mraginsky
[1201.2256] Empirical Processes of Markov Chains and Dynamical Systems Indexed by Classes of Functions
We study weak convergence of empirical processes of dependent data, indexed by classes of functions. We obtain results that are especially suitable for data arising from dynamical systems and Markov chains, where the Central Limit Theorem for partial sums is commonly derived via the spectral gap technique. Our results apply, e.g. to the empirical process of ergodic torus automorphisms.
papers  to-read  empirical-processes  dynamical-systems  markov-chains  re:adaptive_control_project 
january 2012 by mraginsky
[1111.1977] On Refined Versions of the Azuma-Hoeffding Inequality with Applications in Information Theory
This paper derives some refined versions of the Azuma-Hoeffding inequality for discrete-parameter martingales with uniformly bounded jumps, and it considers some of their potential applications in information theory and related topics. The first part of this paper derives these refined inequalities, followed by a discussion on their relations to some classical results in probability theory. It also considers a geometric interpretation of some of these inequalities, providing an insight on the inter-connections between them. The second part exemplifies the use of these refined inequalities in the context of hypothesis testing, information theory, and communication. The paper is concluded with a discussion on some directions for further research. This work is meant to stimulate the use of some refined versions of the Azuma-Hoeffding inequality in information-theoretic aspects.
papers  to-read  information-theory  measure-concentration  martingales  markov-chains  probability 
january 2012 by mraginsky
[1201.0559] Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified
We prove the first Chernoff-Hoeffding bounds for general nonreversible finite-state Markov chains based on the standard L_1 (variation distance) mixing-time of the chain. Specifically, consider an ergodic Markov chain M and a weight function f: [n] -> [0,1] on the state space [n] of M with mean mu = E_{v <- pi}[f(v)], where pi is the stationary distribution of M. A t-step random walk (v_1,...,v_t) on M starting from the stationary distribution pi has expected total weight E[X] = mu t, where X = sum_{i=1}^t f(v_i). Let T be the L_1 mixing-time of M. We show that the probability of X deviating from its mean by a multiplicative factor of delta, i.e., Pr [ |X - mu t| >= delta mu t ], is at most exp(-Omega(delta^2 mu t / T)) for 0 <= delta <= 1, and exp(-Omega(delta mu t / T)) for delta > 1. In fact, the bounds hold even if the weight functions f_i's for i in [t] are distinct, provided that all of them have the same mean mu.
We also obtain a simplified proof for the Chernoff-Hoeffding bounds based on the spectral expansion lambda of M, which is the square root of the second largest eigenvalue (in absolute value) of M tilde{M}, where tilde{M} is the time-reversal Markov chain of M. We show that the probability Pr [ |X - mu t| >= delta mu t ] is at most exp(-Omega(delta^2 (1-lambda) mu t)) for 0 <= delta <= 1, and exp(-Omega(delta (1-lambda) mu t)) for delta > 1.
Both of our results extend to continuous-time Markov chains, and to the case where the walk starts from an arbitrary distribution x, at a price of a multiplicative factor depending on the distribution x in the concentration bounds.
to-read  papers  markov-chains  measure-concentration  probability 
january 2012 by mraginsky
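The concentration statement in the Chernoff-Hoeffding entry above can be checked empirically on a toy example. The sketch below is not from the paper: the two-state chain `P`, the weight function `f`, and the helper names `total_weight` and `deviation_frequency` are all assumptions made for illustration. It starts a t-step walk from the stationary distribution, computes X = sum_{i=1}^t f(v_i), and estimates how often |X - mu t| >= delta mu t.

```python
import random

# Hypothetical two-state chain (illustrative only, not taken from the paper).
# Row v gives the transition probabilities out of state v.
P = [[0.9, 0.1],
     [0.2, 0.8]]

# Stationary distribution of P; one can verify by hand that PI * P = PI.
PI = [2.0 / 3.0, 1.0 / 3.0]


def f(v):
    """Assumed weight function f: {0, 1} -> [0, 1]; weight 1 on state 1."""
    return float(v)


# Mean weight under the stationary distribution: mu = E_{v <- pi}[f(v)].
MU = sum(PI[v] * f(v) for v in range(2))


def total_weight(t, rng):
    """Run a t-step walk started from PI and return X = sum of f(v_i)."""
    v = 0 if rng.random() < PI[0] else 1
    x = f(v)
    for _ in range(t - 1):
        v = 0 if rng.random() < P[v][0] else 1
        x += f(v)
    return x


def deviation_frequency(t, delta, trials, seed=0):
    """Empirical estimate of Pr[|X - mu t| >= delta mu t] over `trials` walks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if abs(total_weight(t, rng) - MU * t) >= delta * MU * t:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for t in (50, 200, 1000):
        print(t, deviation_frequency(t, delta=0.5, trials=500))
```

For fixed delta, the estimated frequency should shrink quickly as t grows, which is the qualitative behaviour the exp(-Omega(delta^2 mu t / T)) bound describes. The simulation does not estimate the mixing time T or the constants hidden in the Omega notation.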
[1102.5245] Quantitative bounds for Markov chain convergence: Wasserstein and total variation distances
"We present a framework for obtaining explicit bounds on the rate of convergence to equilibrium of a Markov chain on a general state space, with respect to both total variation and Wasserstein distances. For Wasserstein bounds, our main tool is Steinsaltz's convergence theorem for locally contractive random dynamical systems. We describe practical methods for finding Steinsaltz's "drift functions" that prove local contractivity. We then use the idea of "one-shot coupling" to derive criteria that give bounds for total variation distances in terms of Wasserstein distances. Our methods are applied to two examples: a two-component Gibbs sampler for the Normal distribution and a random logistic dynamical system."
papers  to-read  markov-chains  ergodic-theory 
february 2011 by mraginsky
[1010.2894] Markov Chains and Dynamical Systems: The Open System Point of View
"This article presents several results establishing connections between Markov chains and dynamical systems, from the point of view of open systems in physics. We show how all Markov chains can be understood as the information on one component that we get from a dynamical system on a product system, when losing information on the other component. We show that passing from the deterministic dynamics to the random one is characterized by the loss of algebra morphism property; it is also characterized by the loss of reversibility. In the continuous time framework, we show that the solutions of stochastic differential equations are actually deterministic dynamical systems on a particular product space. When losing the information on one component, we recover the usual associated Markov semigroup." Is there anything new there? I've seen numerous constructions of the sort, e.g., in R.F. Streater's "Statistical Dynamics" ...
to-read  papers  markov-chains  dynamical-systems 
january 2011 by mraginsky
