cshalizi + confidence_sets 38
[1203.4354] Asymptotic Confidence Sets for General Nonparametric Regression and Classification by Regularized Kernel Methods
7 weeks ago by cshalizi
"Regularized kernel methods such as, e.g., support vector machines and least-squares support vector regression constitute an important class of standard learning algorithms in machine learning. Theoretical investigations concerning asymptotic properties have mainly focused on rates of convergence during the last years but there are only very few and limited (asymptotic) results on statistical inference so far. As this is a serious limitation for their use in mathematical statistics, the goal of the article is to fill this gap. Based on asymptotic normality of many of these methods, the article derives a strongly consistent estimator for the unknown covariance matrix of the limiting normal distribution. In this way, we obtain asymptotically correct confidence sets for $\psi(f_{P,\lambda_0})$ where $f_{P,\lambda_0}$ denotes the minimizer of the regularized risk in the reproducing kernel Hilbert space $H$ and $\psi:H\rightarrow\mathds{R}^m$ is any Hadamard-differentiable functional. Applications include (multivariate) pointwise confidence sets for values of $f_{P,\lambda_0}$ and confidence sets for gradients, integrals, and norms."
to:NB
confidence_sets
kernel_methods
statistics
nonparametrics
regression
classifiers
7 weeks ago by cshalizi
[1203.5422] Distribution Free Prediction Bands
8 weeks ago by cshalizi
"We study distribution free, nonparametric prediction bands with a special focus on their finite sample behavior. First we investigate and develop different notions of finite sample coverage guarantees. Then we give a new prediction band estimator by combining the idea of "conformal prediction" (Vovk et al. 2009) with nonparametric conditional density estimation. The proposed estimator, called COPS (Conformal Optimized Prediction Set), always has finite sample guarantee in a stronger sense than the original conformal prediction estimator. Under regularity conditions the estimator converges to an oracle band at a minimax optimal rate. A fast approximation algorithm and a data driven method for selecting the bandwidth are developed. The method is illustrated first in simulated data. Then, an application shows that the proposed method gives desirable prediction intervals in an automatic way, as compared to the classical linear regression modeling."
to:NB
prediction
statistics
nonparametrics
kith_and_kin
wasserman.larry
lei.jing
heard
confidence_sets
density_estimation
8 weeks ago by cshalizi
Kleijn , van der Vaart : The Bernstein-Von-Mises theorem under misspecification
10 weeks ago by cshalizi
"We prove that the posterior distribution of a parameter in misspecified LAN parametric models can be approximated by a random normal distribution. We derive from this that Bayesian credible sets are not valid confidence sets if the model is misspecified. We obtain the result under conditions that are comparable to those in the well-specified situation: uniform testability against fixed alternatives and sufficient prior mass in neighbourhoods of the point of convergence. The rate of convergence is considered in detail, with special attention for the existence and construction of suitable test sequences. We also give a lemma to exclude testable model subsets which implies a misspecified version of Schwartz’ consistency theorem, establishing weak convergence of the posterior to a measure degenerate at the point at minimal Kullback-Leibler divergence with respect to the true distribution."
to:NB
to_read
bayesian_consistency
statistics
bernstein-von_mises
asymptotics
confidence_sets
van_der_vaart.aad
10 weeks ago by cshalizi
Cai : Minimax and Adaptive Inference in Nonparametric Function Estimation
10 weeks ago by cshalizi
"Since Stein’s 1956 seminal paper, shrinkage has played a fundamental role in both parametric and nonparametric inference. This article discusses minimaxity and adaptive minimaxity in nonparametric function estimation. Three interrelated problems, function estimation under global integrated squared error, estimation under pointwise squared error, and nonparametric confidence intervals, are considered. Shrinkage is pivotal in the development of both the minimax theory and the adaptation theory.
"While the three problems are closely connected and the minimax theories bear some similarities, the adaptation theories are strikingly different. For example, in a sharp contrast to adaptive point estimation, in many common settings there do not exist nonparametric confidence intervals that adapt to the unknown smoothness of the underlying function. A concise account of these theories is given. The connections as well as differences among these problems are discussed and illustrated through examples."
in_NB
statistics
estimation
regression
confidence_sets
nonparametrics
shrinkage
cai.t._tony
minimax
10 weeks ago by cshalizi
[0808.1010] Confidence bands in nonparametric time series regression
12 weeks ago by cshalizi
"We consider nonparametric estimation of mean regression and conditional variance (or volatility) functions in nonlinear stochastic regression models. Simultaneous confidence bands are constructed and the coverage probabilities are shown to be asymptotically correct. The imposed dependence structure allows applications in many linear and nonlinear auto-regressive processes. The results are applied to the S&P 500 Index data."
to:NB
statistics
regression
time_series
confidence_sets
to_teach:undergrad-ADA
12 weeks ago by cshalizi
[1202.4294] Prediction of quantiles by statistical learning and application to GDP forecasting
february 2012 by cshalizi
"In this paper, we tackle the problem of prediction and confidence intervals for time series using a statistical learning approach and quantile loss functions. In a first time, we show that the Gibbs estimator (also known as Exponentially Weighted aggregate) is able to predict as well as the best predictor in a given family for a wide set of loss functions. In particular, using the quantile loss function of Koenker and Bassett (1978), this allows to build confidence intervals. We apply these results to the problem of prediction and confidence regions for the French Gross Domestic Product (GDP) growth, with promising results."
in_NB
to_read
prediction
confidence_sets
learning_theory
re:your_favorite_dsge_sucks
re:growing_ensemble_project
february 2012 by cshalizi
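A minimal sketch of the quantile ("pinball") loss of Koenker and Bassett mentioned in the abstract above, showing that minimizing its empirical average over a sample recovers a sample quantile, so that two such quantiles bracket an interval. This is not the paper's Gibbs estimator; the function names `pinball_loss` and `quantile_by_loss` are hypothetical, and the candidate set is assumed to be the observed values themselves.

```python
def pinball_loss(y, q, tau):
    """Quantile loss at level tau for predicting q when y is observed."""
    diff = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def quantile_by_loss(ys, tau):
    """Return the observed value minimizing the total pinball loss.

    Assumption for illustration: candidates are restricted to the data
    points, which is enough because a minimizer always exists among them.
    """
    return min(sorted(ys), key=lambda q: sum(pinball_loss(y, q, tau) for y in ys))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(quantile_by_loss(data, 0.5))  # level 0.5 picks out the median
    print(quantile_by_loss(data, 0.9))  # a high level picks out an upper quantile
```

Pairing a low-level and a high-level fit (say 0.05 and 0.95) gives the two ends of an interval, which is the sense in which a quantile loss "allows to build confidence intervals".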
[1111.1386] Confidence Estimation in Structured Prediction
november 2011 by cshalizi
"Structured classification tasks such as sequence labeling and dependency parsing have seen much interest by the Natural Language Processing and the machine learning communities. Several online learning algorithms were adapted for structured tasks such as Perceptron, Passive-Aggressive and the recently introduced Confidence-Weighted learning. These online algorithms are easy to implement, fast to train and yield state-of-the-art performance. However, unlike probabilistic models like Hidden Markov Model and Conditional random fields, these methods generate models that output merely a prediction with no additional information regarding confidence in the correctness of the output. In this work we fill the gap proposing few alternatives to compute the confidence in the output of non-probabilistic algorithms. We show how to compute confidence estimates in the prediction such that the confidence reflects the probability that the word is labeled correctly. We then show how to use our methods to detect mislabeled words, trade recall for precision and active learning. We evaluate our methods on four noun-phrase chunking and named entity recognition sequence labeling tasks, and on dependency parsing for 14 languages."
to:NB
machine_learning
confidence_sets
prediction
natural_language_processing
november 2011 by cshalizi
[1111.1418] Efficient Nonparametric Conformal Prediction Regions
november 2011 by cshalizi
Yay, it's out! "We investigate and extend the conformal prediction method due to Vovk, Gammerman and Shafer (2005) to construct nonparametric prediction regions. These regions have guaranteed distribution free, finite sample coverage, without any assumptions on the distribution or the bandwidth. Explicit convergence rates of the loss function are established for such regions under standard regularity conditions. Approximations for simplifying implementation and data driven bandwidth selection methods are also discussed. The theoretical properties of our method are demonstrated through simulations."
in_NB
prediction
statistics
confidence_sets
nonparametrics
kith_and_kin
wasserman.larry
robins.james
have_read
density_estimation
november 2011 by cshalizi
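A minimal sketch of where the finite-sample, distribution-free coverage of conformal prediction comes from. It uses the simpler split (inductive) variant with absolute-residual scores around a median, not the kernel-density-based regions of the paper above; the function name `split_conformal_interval` and its parameters are hypothetical, and the only assumption on the data is exchangeability.

```python
import math
import random
import statistics


def split_conformal_interval(ys, alpha=0.1, seed=0):
    """Prediction interval for one new draw, exchangeable with ys.

    Half the data fixes a center; the other half calibrates the width.
    """
    rng = random.Random(seed)
    ys = list(ys)
    rng.shuffle(ys)
    half = len(ys) // 2
    train, calib = ys[:half], ys[half:]

    center = statistics.median(train)
    # Conformity scores: distance of each calibration point from the center.
    scores = sorted(abs(y - center) for y in calib)

    n = len(scores)
    # Rank giving coverage of at least 1 - alpha for the (n + 1)-th point.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # interval carries the guarantee.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (center - q, center + q)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    lo, hi = split_conformal_interval(sample, alpha=0.1)
    fresh = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    coverage = sum(lo <= y <= hi for y in fresh) / len(fresh)
    print(lo, hi, coverage)
```

The guarantee is marginal, holds for any continuous or discrete distribution, and needs no bandwidth at all; the paper's contribution is to get regions that are also efficient (close to the smallest possible) by building the scores from a density estimate.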
Fraser : Is Bayes Posterior just Quick and Dirty Confidence?
october 2011 by cshalizi
Shorter Fraser: Yes. Yes it is.
Longer Fraser: "Bayes introduced the observed likelihood function to statistical inference and provided a weight function to calibrate the parameter; he also introduced a confidence distribution on the parameter space but did not provide present justifications. Of course the names likelihood and confidence did not appear until much later: Fisher for likelihood and Neyman for confidence. Lindley showed that the Bayes and the confidence results were different when the model was not location. This paper examines the occurrence of true statements from the Bayes approach and from the confidence approach, and shows that the proportion of true statements in the Bayes case depends critically on the presence of linearity in the model; and with departure from this linearity the Bayes approach can be a poor approximation and be seriously misleading. Bayesian integration of weighted likelihood thus provides a first-order linear approximation to confidence, but without linearity can give substantially incorrect results."
The responses are worth reading, especially, of course, Larry's.
in_NB
statistics
estimation
confidence_sets
bayesianism
fraser.d.a.s.
have_read
october 2011 by cshalizi
[1110.2563] Confidence Intervals for Low-Dimensional Parameters With High-Dimensional Data
october 2011 by cshalizi
"The purpose of this paper is to propose methodologies for statistical inference of low-dimensional parameters with high-dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results presented here provide sufficient conditions for the asymptotic normality of the proposed estimators along with a consistent estimator for their finite-dimensional covariance matrices. These sufficient conditions allow the number of variables to far exceed the sample size. The simulation results presented here demonstrate the accuracy of the coverage probability of the proposed confidence intervals, strongly supporting the theoretical results."
to:NB
statistics
confidence_sets
october 2011 by cshalizi
Hybrid confidence regions based on data depth - Lee - 2011 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library
october 2011 by cshalizi
"We consider the general problem of constructing confidence regions for, possibly multi-dimensional, parameters when we have available more than one approach for the construction. These approaches may be motivated by different model assumptions, different levels of approximation, different settings of tuning parameters or different Monte Carlo algorithms. Their effectiveness is often governed by different sets of conditions which are difficult to vindicate in practice. We propose two procedures for constructing hybrid confidence regions which endeavour to integrate all such individual approaches. The procedures employ the concept of data depth to calibrate the confidence region in two different ways, the first rendering its coverage error minimax and the second rendering its coverage error conservative. The resulting region reconciles in many important aspects the discrepancies between the various approaches, and is robust against misspecification of their governing conditions. Theoretical and empirical properties of our procedures are investigated in comparison with those of the constituent individual approaches."
to:NB
statistics
confidence_sets
october 2011 by cshalizi
[1110.1248] An algorithm to compute the power of Monte Carlo tests with guaranteed precision
october 2011 by cshalizi
"This article presents an algorithm that generates an exact (conservative) confidence interval of a specified length and coverage probability for the power of a Monte Carlo test (such as a bootstrap or permutation test). It is the first method that achieves this aim for almost any Monte Carlo test. The existing research on power estimation for Monte Carlo tests has focused on obtaining as accurate a result as possible for a fixed computational effort. However, the methods proposed do not provide any guarantee of precision, in the sense that they cannot report a confidence interval to accompany their estimate of the power. Conversely in this article the computational effort is random. The algorithm operates until a confidence interval can be constructed that meets the requirements of the user, in terms of length and coverage probability. We show that, surprisingly, by generating two more datasets than what might have been assumed to be sufficient, the expected number of steps required by the algorithm is finite in many cases of practical interest. These include, for instance, any situation where the distribution of the p-value is absolutely continuous or if it is discrete with finite support. The algorithm is implemented in the R package simctest."
statistics
hypothesis_testing
confidence_sets
monte_carlo
bootstrap
in_NB
october 2011 by cshalizi
A Perturbation Method for Inference on Regularized Regression Estimates
september 2011 by cshalizi
"Analysis of high-dimensional data often seeks to identify a subset of important features and to assess the effects of these features on outcomes. Traditional statistical inference procedures based on standard regression methods often fail in the presence of high-dimensional features. In recent years, regularization methods have emerged as promising tools for analyzing high-dimensional data. These methods simultaneously select important features and provide stable estimation of their effects. Adaptive LASSO and SCAD, for instance, give consistent and asymptotically normal estimates with oracle properties. However, in finite samples, it remains difficult to obtain interval estimators for the regression parameters. In this article, we propose perturbation resampling-based procedures to approximate the distribution of a general class of penalized parameter estimates. Our proposal, justified by asymptotic theory, provides a simple way to estimate the covariance matrix and confidence regions. Through finite-sample simulations, we verify the ability of this method to give accurate inference and compare it with other widely used standard deviation and confidence interval estimates. We also illustrate our proposals with a dataset used to study the association of HIV drug resistance and a large number of genetic mutations."
in_NB
regression
sparsity
confidence_sets
statistics
bootstrap
september 2011 by cshalizi
Don Fraser’s rejoinder « Xi'an's Og
august 2011 by cshalizi
Do follow the links to the papers. Shorter Fraser: except in very special and simple situations, Bayesian credible sets have demonstrably horrible coverage/confidence properties; that is, the probabilities attached to them do not tell you how often they really contain the true parameter values. In fact, it scarcely seems to make sense to describe those numbers as "probabilities". (I find Robert's response to Fraser's article extremely unconvincing, especially where it descends into pure aesthetics, e.g. saying that Bayes gives you an elegant and unified way of doing inference. Well, so does referring all questions to the I Ching, but does it work?)
bayesianism
estimation
confidence_sets
statistics
in_NB
august 2011 by cshalizi
[1107.0013] Likelihood based observability analysis and confidence intervals for predictions of dynamic models
july 2011 by cshalizi
"Mechanistic dynamic models of biochemical networks such as Ordinary Differential Equations (ODEs) contain unknown parameters like the reaction rate constants and the initial concentrations of the compounds. The large number of parameters as well as their nonlinear impact on the model responses hamper the determination of confidence regions for parameter estimates. At the same time, classical approaches translating the uncertainty of the parameters into confidence intervals for model predictions are hardly feasible. In this article it is shown that a so-called prediction profile likelihood yields reliable confidence intervals for model predictions, despite arbitrarily complex and high-dimensional shapes of the confidence regions for the estimated parameters. Prediction confidence intervals of the dynamic states allow a data-based observability analysis. The approach renders the issue of sampling a high-dimensional parameter space into evaluating one-dimensional prediction spaces."
dynamical_systemss
statistics
statistical_inference_for_stochastic_processes
prediction
confidence_sets
to_read
july 2011 by cshalizi
Leahu : On the Bernstein-von Mises phenomenon in the Gaussian white noise model
may 2011 by cshalizi
"We study the Bernstein-von Mises (BvM) phenomenon, i.e., Bayesian credible sets and frequentist confidence regions for the estimation error coincide asymptotically, for the infinite-dimensional Gaussian white noise model governed by Gaussian prior with diagonal-covariance structure. While in parametric statistics this fact is a consequence of (a particular form of) the BvM Theorem, in the nonparametric setup, however, the BvM Theorem is known to fail even in some, apparently, elementary cases. In the present paper we show that BvM-like statements hold for this model, provided that the parameter space is suitably embedded into the support of the prior. The overall conclusion is that, unlike in the parametric setup, positive results regarding frequentist probability coverage of credible sets can only be obtained if the prior assigns null mass to the parameter space."
statistics
confidence_sets
bayesianism
bernstein-von-mises
to:NB
may 2011 by cshalizi
[0812.2749] Nonparametric inference of a trend using functional data
april 2011 by cshalizi
I guess I've been more or less presuming this was true. (And I'd have been wrong about the form of the simultaneous CI, actually.) Worth trying to work into the final exam for The Kids?
curve_fitting
gaussian_processes
time_series
statistics
nonparametrics
have_read
confidence_sets
to_teach:undergrad-ADA
april 2011 by cshalizi
[1005.2137] A self-normalized approach to confidence interval construction in time series
october 2010 by cshalizi
Revised version of a paper from JRSS-B. "new method to construct confidence intervals for quantities that are associated with a stationary time series, which avoids direct estimation of the asymptotic variances. Unlike the existing tuning-parameter-dependent approaches, our method has the attractive convenience of being free of choosing any user-chosen number or smoothing parameter. The interval is constructed on the basis of an asymptotically distribution-free self-normalized statistic, in which the normalizing matrix is computed using recursive estimates. Under mild conditions, we establish the theoretical validity of our method for a broad class of statistics that are functionals of the empirical distribution of fixed or growing dimension. From a practical point of view, our method is conceptually simple, easy to implement .... Monte-Carlo simulations ... compare the finite sample performance ... with ... normal approximation and the block bootstrap approach."
time_series
confidence_sets
estimation
statistics
october 2010 by cshalizi
Liu, Wu: Simultaneous nonparametric inference of time series
july 2010 by cshalizi
"kernel estimation of marginal densities and regression functions of stationary processes. It is shown that for a wide class of time series, with proper centering and scaling, the maximum deviations of kernel density and regression estimates are asymptotically Gumbel. Our results substantially generalize earlier ones which were obtained under independence or beta mixing assumptions. The asymptotic results can be applied to assess patterns of marginal densities or regression functions via the construction of simultaneous confidence bands for which one can perform goodness-of-fit tests."
time_series
statistical_inference_for_stochastic_processes
kernel_estimators
confidence_sets
july 2010 by cshalizi
"Simultaneous Confidence Bands for Penalized Spline Estimators" - Journal of the American Statistical Association - 105(490):852
july 2010 by cshalizi
"In this article we construct simultaneous confidence bands for a smooth curve using penalized spline estimators. We consider three types of estimation methods: (a) as a standard (fixed effect) nonparametric model, (b) using the mixed-model framework with the spline coefficients as random effects, and (c) a full Bayesian approach. The volume-of-tube formula is applied for the first two methods and compared with Bayesian simultaneous confidence bands from a frequentist perspective. We show that the mixed-model formulation of penalized splines can help obtain, at least approximately, confidence bands with either Bayesian or frequentist properties. Simulations and data analysis support the proposed methods. The R package ConfBands accompanies the article."
splines
confidence_sets
statistics
july 2010 by cshalizi
10-705 Intermediate Statistics, Fall 2009
april 2010 by cshalizi
Larry's version of the typical masters-level course based on Casella and Berger. Note: half of what he covers is not in Casella and Berger. (For example, he starts with VC theory!)
learning_theory
statistics
estimation
hypothesis_testing
prediction
minimax
bootstrap
model_selection
regression
classifiers
confidence_sets
wasserman.larry
kith_and_kin
april 2010 by cshalizi
Giné, Nickl: Confidence bands in density estimation
february 2010 by cshalizi
"Given a sample from some unknown continuous density f : ℝ→ℝ, we construct adaptive confidence bands that are honest for all densities in a “generic” subset of the union of t-Hölder balls, 0
statistics
estimation
confidence_sets
density_estimation
february 2010 by cshalizi
Singh, Xie, Strawderman: Confidence distribution (CD) -- distribution estimator of a parameter
february 2010 by cshalizi
"The notion of confidence distribution (CD), an entirely frequentist concept, is in essence a Neymanian interpretation of Fisher's Fiducial distribution. It contains information related to every kind of frequentist inference. In this article, a CD is viewed as a distribution estimator of a parameter. This leads naturally to consideration of the information contained in CD, comparison of CDs and optimal CDs, and connection of the CD concept to the (profile) likelihood function. A formal development of a multiparameter CD is also presented." Hmmm. Relevant to the phil-of-bayes paper?
statistics
estimation
confidence_sets
bayesianism
to:NB
february 2010 by cshalizi
Arlot, Blanchard, Roquain: Some nonasymptotic results on resampling in high dimension, I: Confidence regions
december 2009 by cshalizi
"We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is supposed to be either Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can possibly be much larger than the number of observations and we focus on a nonasymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches, the first based on a concentration principle (valid for a large class of resampling weights) and the second on a resampled quantile, specifically using Rademacher weights. Several intermediate results established in the approach based on concentration principles are of interest in their own right. We also discuss the question of accuracy when using Monte Carlo approximations of the resampled quantities."
statistics
resampling
bootstrap
cross-validation
confidence_sets
to_read
re:XV_for_mixing
concentration_of_measure
learning_theory
december 2009 by cshalizi
A simple procedure for computing improved prediction intervals for autoregressive models. Paolo Vidoni. 2009; Journal of Time Series Analysis - Wiley InterScience
december 2009 by cshalizi
"construction of prediction intervals for time series models. The estimative or plug-in solution is usually not entirely adequate, since the (conditional) coverage probability may differ substantially from the nominal value. Prediction intervals with improved (conditional) coverage probability can be defined by adjusting the estimative ones, using rather complicated asymptotic procedures or suitable simulation techniques. This article extends to Markov process models a recent result by Vidoni, which defines a relatively simple predictive distribution function, giving improved prediction limits as quantiles"
prediction
time_series
statistics
confidence_sets
to_read
december 2009 by cshalizi
Simultaneous Confidence Intervals for Impulse Response
july 2009 by cshalizi
"Inference about an impulse response is a multiple testing problem with serially correlated coefficient estimates. This paper provides a method to construct simultaneous confidence regions for impulse responses and conditional bands to examine significance levels of individual impulse response coefficients given propagation trajectories. The paper also shows how to constrain a subset of impulse response paths to anchor structural identification and how to formally test the validity of such identifying constraints. Simulation and empirical evidence illustrate the new techniques. A broad summary of asymptotic analytic formulas is provided to make the methods easy to implement with commonly available statistical software."
time_series
systems_identification
confidence_sets
statistics
macroeconomics
re:your_favorite_dsge_sucks
july 2009 by cshalizi
Interval Forecasts and Parameter Uncertainty
june 2009 by cshalizi
Frequentist prediction interval reflecting parameter uncertainty.
prediction
statistics
to:NB
confidence_sets
to_read
hansen.bruce
june 2009 by cshalizi
[0711.1036v2] Confidence Sets Based on Sparse Estimators Are Necessarily Large
may 2009 by cshalizi
"Confidence sets based on sparse estimators are shown to be large compared to more standard confidence sets, demonstrating that sparsity of an estimator comes at a substantial price in terms of the quality of the estimator. The results are set in a general parametric or semiparametric framework."
sparsity
confidence_sets
model_selection
statistics
to:NB
via:shivak
may 2009 by cshalizi