cshalizi + concentration_of_measure 34
[math/0612726] High Dimensional Probability
6 weeks ago by cshalizi
"About forty years ago it was realized by several researchers that the essential features of certain objects of Probability theory, notably Gaussian processes and limit theorems, may be better understood if they are considered in settings that do not impose structures extraneous to the problems at hand. For instance, in the case of sample continuity and boundedness of Gaussian processes, the essential feature is the metric or pseudometric structure induced on the index set by the covariance structure of the process, regardless of what the index set may be. This point of view ultimately led to the Fernique-Talagrand majorizing measure characterization of sample boundedness and continuity of Gaussian processes, thus solving an important problem posed by Kolmogorov. Similarly, separable Banach spaces provided a minimal setting for the law of large numbers, the central limit theorem and the law of the iterated logarithm, and this led to the elucidation of the minimal (necessary and/or sufficient) geometric properties of the space under which different forms of these theorems hold. However, in light of renewed interest in Empirical processes, a subject that has considerably influenced modern Statistics, one had to deal with a non-separable Banach space, namely $mathcal{L}_{infty}$. With separability discarded, the techniques developed for Gaussian processes and for limit theorems and inequalities in separable Banach spaces, together with combinatorial techniques, led to powerful inequalities and limit theorems for sums of independent bounded processes over general index sets, or, in other words, for general empirical processes."
to:NB
empirical_processes
probability
stochastic_processes
high-dimensional_probability
convergence_of_stochastic_processes
concentration_of_measure
6 weeks ago by cshalizi
A Kernel Two-Sample Test
7 weeks ago by cshalizi
"We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests."
in_NB
to_read
hilbert_space
kernel_methods
goodness-of-fit
statistics
concentration_of_measure
probability
two-sample_tests
re:network_differences
7 weeks ago by cshalizi
[1202.2341] Measure concentration through non-Lipschitz observables and functional inequalities
12 weeks ago by cshalizi
"Non-Gaussian concentration estimates are obtained for invariant probability measures of reversible Markov processes. We show that the functional inequalities approach combined with a suitable Lyapunov condition allows us to circumvent the classical Lipschitz assumption of the observables. Our method is general and covers diffusions as well as pure-jump Markov processes on unbounded spaces."
in_NB
concentration_of_measure
markov_models
stochastic_processes
12 weeks ago by cshalizi
Foundations and Trends in Machine Learning
12 weeks ago by cshalizi
"This paper presents some new concentration inequalities for Feynman-Kac particle processes. We analyze different types of stochastic particle models, including particle profile occupation measures, genealogical tree based evolution models, particle free energies, as well as backward Markov chain particle models. We illustrate these results with a series of topics related to computational physics and biology, stochastic optimization, signal processing and Bayesian statistics, and many other probabilistic machine learning algorithms. Special emphasis is given to the stochastic modeling, and to the quantitative performance analysis of a series of advanced Monte Carlo methods, including particle filters, genetic type island models, Markov bridge models, and interacting particle Markov chain Monte Carlo methodologies."
to:NB
stochastic_processes
interacting_particle_systems
concentration_of_measure
particle_filters
12 weeks ago by cshalizi
[1201.3569] Exponential Concentration Inequalities for Additive Functionals of Markov Chains
january 2012 by cshalizi
"Using the renewal approach we prove exponential inequalities for additive functionals and empirical processes of ergodic Markov chains, thus obtaining counterparts of inequalities for sums of independent random variables. The inequalities do not require functions of the chain to be bounded and moreover have all the constants accessible whenever the usual drift condition holds, which is crucial for practical applications e.g. in MCMC algorithms."
in_NB
stochastic_processes
empirical_processes
markov_models
concentration_of_measure
january 2012 by cshalizi
[0811.2769] Quantitative asymptotics of graphical projection pursuit
december 2011 by cshalizi
"There is a result of Diaconis and Freedman which says that, in a limiting sense, for large collections of high-dimensional data most one-dimensional projections of the data are approximately Gaussian. This paper gives quantitative versions of that result. For a set of deterministic vectors ${x_i}_{i=1}^n$ in $R^d$ with $n$ and $d$ fixed, let $thetains^{d-1}$ be a random point of the sphere and let $mu_n^theta$ denote the random measure which puts mass $frac{1}{n}$ at each of the points $inprod{x_1}{theta},...,inprod{x_n}{theta}$. For a fixed bounded Lipschitz test function $f$, $Z$ a standard Gaussian random variable and $sigma^2$ a suitable constant, an explicit bound is derived for the quantity $dsP[|int f dmu_n^theta-E f(sigma Z)|>epsilon]$. A bound is also given for $dsP[d_{BL}(mu_n^theta, N(0,sigma^2))>epsilon]$, where $d_{BL}$ denotes the bounded-Lipschitz distance, which yields a lower bound on the waiting time to finding a non-Gaussian projection of the ${x_i}$ if directions are tried independently and uniformly on $s^{d-1}$."
to:NB
probability
concentration_of_measure
high-dimensional_probability
december 2011 by cshalizi
Non-parametric detection of meaningless distances in high dimensional data - Ata Kabán - Statistics and Computing, Volume 22, Number 2
december 2011 by cshalizi
"Distance concentration is the phenomenon that, in certain conditions, the contrast between the nearest and the farthest neighbouring points vanishes as the data dimensionality increases. It affects high dimensional data processing, analysis, retrieval, and indexing, which all rely on some notion of distance or dissimilarity. Previous work has characterised this phenomenon in the limit of infinite dimensions. However, real data is finite dimensional, and hence the infinite-dimensional characterisation is insufficient. Here we quantify the phenomenon more precisely, for the possibly high but finite dimensional case in a distribution-free manner, by bounding the tails of the probability that distances become meaningless. As an application, we show how this can be used to assess the concentration of a given distance function in some unknown data distribution solely on the basis of an available data sample from it. This can be used to test and detect problematic cases more rigorously than it is currently possible, and we demonstrate the working of this approach on both synthetic data and ten real-world data sets from different domains."
statistics
probability
curse_of_dimensonality
hypothesis_testing
concentration_of_measure
in_NB
high-dimensional_probability
december 2011 by cshalizi
[1111.4073] Multivariate Normal Approximation by Stein's Method: The Concentration Inequality Approach
november 2011 by cshalizi
"The concentration inequality approach for normal approximation by Stein's method is generalized to the multivariate setting. This approach is used to prove a multivariate normal approximation theorem for standardized sums of independent random vectors with an error bound of the order $k^{1/2}gamma$, where $k$ is the dimension of the random vectors and $gamma$ is the sum of absolute third moments of the random vectors."
in_NB
probability
central_limit_theorem
steins_method
concentration_of_measure
november 2011 by cshalizi
[1111.3486] New Concentration Inequalities for Suprema of Empirical Processes
november 2011 by cshalizi
"While effective concentration inequalities for suprema of empirical processes exist under boundedness or strict tail assumptions, no comparable results have been available under considerably weaker assumptions. In this paper, we derive concentration inequalities assuming only low moments for an envelope of the empirical process. These concentration inequalities are beneficial even when the envelope is much larger than the single functions under consideration."
in_NB
concentration_of_measure
probability
empirical_processes
van_de_geer.sara
to_read
november 2011 by cshalizi
[1111.2622] Optimal re-centering bounds, with applications to Rosenthal-type concentration of measure inequalities
november 2011 by cshalizi
"For any nonnegative Borel-measurable function f such that f(x)=0 if and only if x=0, the best constant c_f in the inequality E f(X-E X) leq c_f E f(X) for all random variables X with a finite mean is obtained. Properties of the constant c_f in the case when f=|.|^p are studied. Applications to concentration of measure in the form of Rosenthal-type bounds on the moments of separately Lipschitz functions on product spaces are given."
in_NB
deviation_bounds
probability
concentration_of_measure
november 2011 by cshalizi
A Bernstein type inequality and moderate deviations for weakly dependent sequences
november 2011 by cshalizi
"In this paper we present a Bernstein-type tail inequality for the maximum of partial sums of a weakly dependent sequence of random variables that is not necessarily bounded. The class considered includes geometrically and subgeometrically strongly mixing sequences. The result is then used to derive asymptotic moderate deviation results. Applications are given for classes of Markov chains, iterated Lipschitz models and functions of linear processes with absolutely regular innovations." Also: http://arxiv.org/abs/0902.0582
in_NB
to_read
re:XV_for_mixing
re:your_favorite_dsge_sucks
concentration_of_measure
deviation_bounds
mixing
ergodic_theory
stochastic_processes
moderate_deviations
november 2011 by cshalizi
[1110.6886] PAC-Bayesian Inequalities for Martingales
november 2011 by cshalizi
"We present a set of high-probability inequalities that control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. We also present a comparison inequality that bounds expectation of a convex function of martingale difference type variables by expectation of the same function of independent Bernoulli variables. This inequality is applied to derive a tighter analog of Hoeffding-Azuma inequality." --- For the record: I hate martingales.
to:NB
re:almost_none
learning_theory
concentration_of_measure
martingales
november 2011 by cshalizi
[1110.2529] The Generalization Ability of Online Algorithms for Dependent Data
october 2011 by cshalizi
"We study the generalization performance of arbitrary online learning algorithms trained on samples coming from a dependent source of data. We show that the generalization error of any stable online algorithm concentrates around its regret--an easily computable statistic of the online performance of the algorithm--when the underlying ergodic process is $beta$- or $phi$-mixing. We show high probability error bounds assuming the loss function is convex, and we also establish sharp convergence rates and deviation bounds for strongly convex losses and several linear prediction problems such as linear and logistic regression, least-squares SVM, and boosting on dependent data. In addition, our results have straightforward applications to stochastic optimization with dependent data, and our analysis requires only martingale convergence arguments; we need not rely on more powerful statistical tools such as empirical process theory."
in_NB
learning_theory
individual_sequence_prediction
ergodic_theory
mixing
re:growing_ensemble_project
re:XV_for_mixing
stability_of_learning
concentration_of_measure
have_read
re:your_favorite_dsge_sucks
october 2011 by cshalizi
[1110.2392] A Variant of Azuma's Inequality for Martingales with Subgaussian Tail
october 2011 by cshalizi
"We present a variant of Azuma's concentration inequality for martingales, in which the standard boundedness requirement is replaced by the milder requirement of a subgaussian tail."
to:NB
to_read
deviation_bounds
martingales
concentration_of_measure
via:shivak
october 2011 by cshalizi
[1107.1948] On the concentration properties of Interacting particle processes
july 2011 by cshalizi
"These lecture notes present some new concentration inequalities for Feynman-Kac particle processes. We analyze different types of stochastic particle models, including particle profile occupation measures, genealogical tree based evolution models, particle free energies, as well as backward Markov chain particle models. We illustrate these results with a series of topics related to computational physics and biology, stochastic optimization, signal processing and bayesian statistics, and many other probabilistic machine learning algorithms. Special emphasis is given to the stochastic modeling and the quantitative performance analysis of a series of advanced Monte Carlo methods, including particle filters, genetic type island models, Markov bridge models, interacting particle Markov chain Monte Carlo methodologies."
interacting_particle_systems
concentration_of_measure
stochastic_processes
markov_models
branching_processes
particle_filters
in_NB
del_moral.pierre
july 2011 by cshalizi
[1012.5457] Concentration of the information in data with log-concave distributions
january 2011 by cshalizi
"A concentration property of the functional $-\log f(X)$ is demonstrated, when a random vector $X$ has a log-concave density $f$ on $\R^n$. This concentration property implies in particular an extension of the Shannon-McMillan-Breiman strong ergodic theorem to the class of discrete-time stochastic processes with log-concave marginals."
information_theory
concentration_of_measure
ergodic_theory
in_NB
have_read
january 2011 by cshalizi
Bobkov , Götze : Concentration of empirical distribution functions with applications to non-i.i.d. models
november 2010 by cshalizi
"The concentration of empirical measures is studied for dependent data, whose joint distribution satisfies Poincaré-type or logarithmic Sobolev inequalities. The general concentration results are then applied to spectral empirical distribution functions associated with high-dimensional random matrices."
concentration_of_measure
empirical_processes
stochastic_processes
random_matrix_theory
november 2010 by cshalizi
[0901.0655] Exponential bounds for minimum contrast estimators
may 2010 by cshalizi
"The paper focuses on general properties of parametric minimum contrast estimators. The quality of estimation is measured in terms of the rate function related to the contrast, thus allowing to derive exponential risk bounds invariant with respect to the detailed probabilistic structure of the model. This approach works well for small or moderate samples and covers the case of a misspecified parametric model. Another important feature of the presented bounds is that they may be used in the case when the parametric set is unbounded and non-compact. These bounds do not rely on the entropy or covering numbers and can be easily computed. The most important statistical fact resulting from the exponential bonds is a concentration inequality which claims that minimum contrast estimators concentrate with a large probability on the level set of the rate function. In typical situations, every such set is a root-n neighborhood of the parameter of interest."
large_deviations
concentration_of_measure
estimation
statistics
misspecification
re:almost_none
to_teach:advanced-stochastic-processes
may 2010 by cshalizi
Constructive Proofs of Concentration Bounds
april 2010 by cshalizi
Possibly the most non-probabilistic proof of a probabilistic result I have ever seen. I must say it leaves me with absolutely no intuition whatsoever. Yet might it not be useful for concentration for dependent variables?
concentration_of_measure
theoretical_computer_science
probability
via:shivak
april 2010 by cshalizi
[1003.3852] Transport Inequalities. A Survey
march 2010 by cshalizi
"This is a survey of recent developments in the area of transport inequalities. We investigate their consequences in terms of concentration and deviation inequalities and sketch their links with other functional inequalities and also large deviation theory."
probability
large_deviations
concentration_of_measure
to_read
march 2010 by cshalizi
On Learning with Integral Operators
march 2010 by cshalizi
"A large number of learning algorithms, for example, spectral clustering, kernel Principal Components Analysis and many manifold methods are based on estimating eigenvalues and eigenfunctions of operators defined by a similarity function or a kernel, given empirical data. Thus for the analysis of algorithms, it is an important problem to be able to assess the quality of such approximations. The contribution of our paper is two-fold:
1. We use a technique based on a concentration inequality for Hilbert spaces to provide new much simplified proofs for a number of results in spectral approximation.
2. Using these methods we provide several new results for estimating spectral properties of the graph Laplacian operator extending and strengthening results from von Luxburg et al. (2008)."
manifold_learning
concentration_of_measure
machine_learning
spectral_clustering
1. We use a technique based on a concentration inequality for Hilbert spaces to provide new much simplified proofs for a number of results in spectral approximation.
2. Using these methods we provide several new results for estimating spectral properties of the graph Laplacian operator extending and strengthening results from von Luxburg et al. (2008)."
march 2010 by cshalizi
[1003.0006] Concentration of Additive Functionals for Markov Processes and Applications to Interacting Particle Systems
march 2010 by cshalizi
"e consider additive functionals of Markov processes in continuous time with general (metric) state spaces. We derive concentration bounds for their exponential moments and moments of finite order. Applications include diffusions, interacting particle systems and random walks. In particular, for the symmetric exclusion process we generalize large deviation bounds for occupation times to general local functions. The method is based on coupling estimates and not spectral theory, hence reversibility is not needed. We bound the exponential moments(or the moments of finite order) in terms of a so-called coupled function difference, which in turn is estimated using the generalized coupling time. Along the way we prove a general relation between the contractivity of the semigroup and bounds on the generalized coupling time."
markov_models
concentration_of_measure
coupling
interacting_particle_systems
march 2010 by cshalizi
Arlot, Blanchard, Roquain: Some nonasymptotic results on resampling in high dimension, I: Confidence regions
december 2009 by cshalizi
"We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is supposed to be either Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can possibly be much larger than the number of observations and we focus on a nonasymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches, the first based on a concentration principle (valid for a large class of resampling weights) and the second on a resampled quantile, specifically using Rademacher weights. Several intermediate results established in the approach based on concentration principles are of interest in their own right. We also discuss the question of accuracy when using Monte Carlo approximations of the resampled quantities."
statistics
resampling
bootstrap
cross-validation
confidence_sets
to_read
re:XV_for_mixing
concentration_of_measure
learning_theory
december 2009 by cshalizi
[0912.4480] Consistency of the Maximum Likelihood Estimator for general hidden Markov models
december 2009 by cshalizi
"a parametrized family of general hidden Markov models, where both the observed and unobserved components take values in a complete separable metric space. We prove that the maximum likelihood estimator (MLE) of the parameter is strongly consistent under a rather minimal set of assumptions. As special cases of our main result, we obtain consistency in a large class of nonlinear state space models, as well as general results on linear Gaussian state space models and finite state models. A novel aspect of our approach is an information-theoretic technique for proving identifiability, which does not require an explicit representation for the relative entropy rate. Our method of proof could therefore form a foundation for the investigation of MLE consistency in more general dependent and non-Markovian time series. Also of independent interest is a general concentration inequality for $V$-uniformly ergodic Markov chains."
markov_models
statistics
estimation
concentration_of_measure
information_theory
identifiability
re:AoS_project
re:XV_for_mixing
have_read
december 2009 by cshalizi
related tags
barvinok.alexander ⊕ blanchard.gilles ⊕ bootstrap ⊕ bousquet.olivier ⊕ branching_processes ⊕ central_limit_theorem ⊕ concentration_of_measure ⊖ confidence_sets ⊕ convergence_of_stochastic_processes ⊕ coupling ⊕ cross-validation ⊕ curse_of_dimensonality ⊕ del_moral.pierre ⊕ deviation_bounds ⊕ deviation_inequalities ⊕ empirical_processes ⊕ ergodic_theory ⊕ estimation ⊕ goodness-of-fit ⊕ have_read ⊕ high-dimensional_probability ⊕ hilbert_space ⊕ hypothesis_testing ⊕ identifiability ⊕ individual_sequence_prediction ⊕ information_theory ⊕ interacting_particle_systems ⊕ in_NB ⊕ kernel_methods ⊕ kith_and_kin ⊕ kontorovich.aryeh ⊕ large_deviations ⊕ learning_theory ⊕ lugosi.gabor ⊕ machine_learning ⊕ manifold_learning ⊕ markov_models ⊕ martingales ⊕ massart.pascal ⊕ mathematics ⊕ misspecification ⊕ mixing ⊕ model_selection ⊕ moderate_deviations ⊕ particle_filters ⊕ probability ⊕ raginsky.maxim ⊕ random_matrix_theory ⊕ re:almost_none ⊕ re:AoS_project ⊕ re:growing_ensemble_project ⊕ re:network_differences ⊕ re:XV_for_mixing ⊕ re:your_favorite_dsge_sucks ⊕ resampling ⊕ spectral_clustering ⊕ stability_of_learning ⊕ statistical_mechanics ⊕ statistics ⊕ steins_method ⊕ stochastic_processes ⊕ support_vector_machines ⊕ theoretical_computer_science ⊕ to:blog ⊕ to:NB ⊕ to_read ⊕ to_teach:advanced-stochastic-processes ⊕ two-sample_tests ⊕ van_de_geer.sara ⊕ vc-dimension ⊕ via:ded-maxim ⊕ via:shivak ⊕Copy this bookmark: