cshalizi + nonparametrics   98

Local polynomial regression for symmetric positive definite matrices - Yuan - 2012 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library
"Local polynomial regression has received extensive attention for the non-parametric estimation of regression functions when both the response and the covariate are in Euclidean space. However, little has been done when the response is in a Riemannian manifold. We develop an intrinsic local polynomial regression estimate for the analysis of symmetric positive definite matrices as responses that lie in a Riemannian manifold with covariate in Euclidean space. The primary motivation and application of the methodology proposed is in computer vision and medical imaging. We examine two commonly used metrics, including the trace metric and the log-Euclidean metric on the space of symmetric positive definite matrices. For each metric, we develop a cross-validation bandwidth selection method, derive the asymptotic bias, variance and normality of the intrinsic local constant and local linear estimators, and compare their asymptotic mean-square errors. Simulation studies are further used to compare the estimators under the two metrics and to examine their finite sample performance. We use our method to detect diagnostic differences between diffusion tensors along fibre tracts in a study of human immunodeficiency virus."
to:NB  variance_estimation  statistics  regression  nonparametrics  kernel_estimators 
6 weeks ago by cshalizi
[math/0612776] Uniform error bounds for smoothing splines
"Almost sure bounds are established on the uniform error of smoothing spline estimators in nonparametric regression with random designs. Some results of Einmahl and Mason (2005) are used to derive uniform error bounds for the approximation of the spline smoother by an ``equivalent'' reproducing kernel regression estimator, as well as for proving uniform error bounds on the reproducing kernel regression estimator itself, uniformly in the smoothing parameter over a wide range. This admits data-driven choices of the smoothing parameter."
to:NB  splines  regression  nonparametrics  statistics  learning_theory 
6 weeks ago by cshalizi
[math/0603130] Nonparametric methods for inference in the presence of instrumental variables
"We suggest two nonparametric approaches, based on kernel methods and orthogonal series to estimating regression functions in the presence of instrumental variables. For the first time in this class of problems, we derive optimal convergence rates, and show that they are attained by particular estimators. In the presence of instrumental variables the relation that identifies the regression function also defines an ill-posed inverse problem, the ``difficulty'' of which depends on eigenvalues of a certain integral operator which is determined by the joint density of endogenous and instrumental variables. We delineate the role played by problem difficulty in determining both the optimal convergence rate and the appropriate choice of smoothing parameter."
to:NB  to_read  regression  statistics  instrumental_variables  nonparametrics  to_teach:undergrad-ADA 
6 weeks ago by cshalizi
The benchden Package: Benchmark Densities for Nonparametric Density Estimation
"This article describes the benchden package which implements a set of 28 example densities for nonparametric density estimation in R. In addition to the usual functions that evaluate the density, distribution and quantile functions or generate random variates, a function designed to be specifically useful for larger simulation studies has been added. After describing the set of densities and the usage of the package, a small toy example of a simulation study conducted using the benchden package is given."
to:NB  computational_statistics  R  density_estimation  nonparametrics  to_teach:undergrad-ADA 
7 weeks ago by cshalizi
[1203.4354] Asymptotic Confidence Sets for General Nonparametric Regression and Classification by Regularized Kernel Methods
"Regularized kernel methods such as, e.g., support vector machines and least-squares support vector regression constitute an important class of standard learning algorithms in machine learning. Theoretical investigations concerning asymptotic properties have manly focused on rates of convergence during the last years but there are only very few and limited (asymptotic) results on statistical inference so far. As this is a serious limitation for their use in mathematical statistics, the goal of the article is to fill this gap. Based on asymptotic normality of many of these methods, the article derives a strongly consistent estimator for the unknown covariance matrix of the limiting normal distribution. In this way, we obtain asymptotically correct confidence sets for $psi(f_{P,lambda_0})$ where $f_{P,lambda_0}$ denotes the minimizer of the regularized risk in the reproducing kernel Hilbert space $H$ and $psi:Hrightarrowmathds{R}^m$ is any Hadamard-differentiable functional. Applications include (multivariate) pointwise confidence sets for values of $f_{P,lambda_0}$ and confidence sets for gradients, integrals, and norms."
to:NB  confidence_sets  kernel_methods  statistics  nonparametrics  regression  classifiers 
7 weeks ago by cshalizi
Taylor & Francis Online :: Bayesian Nonparametric Modeling for Causal Inference - Journal of Computational and Graphical Statistics - Volume 20, Issue 1
"Researchers have long struggled to identify causal effects in nonexperimental settings. Many recently proposed strategies assume ignorability of the treatment assignment mechanism and require fitting two models—one for the assignment mechanism and one for the response surface. This article proposes a strategy that instead focuses on very flexibly modeling just the response surface using a Bayesian nonparametric modeling procedure, Bayesian Additive Regression Trees (BART). BART has several advantages: it is far simpler to use than many recent competitors, requires less guesswork in model fitting, handles a large number of predictors, yields coherent uncertainty intervals, and fluidly handles continuous treatment variables and missing data for the outcome variable. BART also naturally identifies heterogeneous treatment effects. BART produces more accurate estimates of average treatment effects compared to propensity score matching, propensity-weighted estimators, and regression adjustment in the nonlinear simulation situations examined. Further, it is highly competitive in linear settings with the “correct” model, linear regression. Supplemental materials including code and data to replicate simulations and examples from the article as well as methods for population inference are available online."
to:NB  regression  causal_inference  nonparametrics  statistics  hill.jennifer 
8 weeks ago by cshalizi
Bickel, Kleijn: The semiparametric Bernstein–von Mises theorem
"In a smooth semiparametric estimation problem, the marginal posterior for the parameter of interest is expected to be asymptotically normal and satisfy frequentist criteria of optimality if the model is endowed with a suitable prior. It is shown that, under certain straightforward and interpretable conditions, the assertion of Le Cam’s acclaimed, but strictly parametric, Bernstein–von Mises theorem [Univ. California Publ. Statist. 1 (1953) 277–329] holds in the semiparametric situation as well. As a consequence, Bayesian point-estimators achieve efficiency, for example, in the sense of Hájek’s convolution theorem [Z. Wahrsch. Verw. Gebiete 14 (1970) 323–330]. The model is required to satisfy differentiability and metric entropy conditions, while the nuisance prior must assign nonzero mass to certain Kullback–Leibler neighborhoods [Ghosal, Ghosh and van der Vaart Ann. Statist. 28 (2000) 500–531]. In addition, the marginal posterior is required to converge at parametric rate, which appears to be the most stringent condition in examples. The results are applied to estimation of the linear coefficient in partial linear regression, with a Gaussian prior on a smoothness class for the nuisance."
to:NB  statistics  bayesian_consistency  nonparametrics  bickel.peter  bernstein-von_mises 
8 weeks ago by cshalizi
[1203.5422] Distribution Free Prediction Bands
"We study distribution free, nonparametric prediction bands with a special focus on their finite sample behavior. First we investigate and develop different notions of finite sample coverage guarantees. Then we give a new prediction band estimator by combining the idea of "conformal prediction" (Vovk et al. 2009) with nonparametric conditional density estimation. The proposed estimator, called COPS (Conformal Optimized Prediction Set), always has finite sample guarantee in a stronger sense than the original conformal prediction estimator. Under regularity conditions the estimator converges to an oracle band at a minimax optimal rate. A fast approximation algorithm and a data driven method for selecting the bandwidth are developed. The method is illustrated first in simulated data. Then, an application shows that the proposed method gives desirable prediction intervals in an automatic way, as compared to the classical linear regression modeling."
to:NB  prediction  statistics  nonparametrics  kith_and_kin  wasserman.larry  lei.jing  heard  confidence_sets  density_estimation 
8 weeks ago by cshalizi
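The "conformal prediction" idea mentioned in the abstract above can be illustrated with a minimal sketch. This is the simpler split-conformal construction with absolute residuals, not the COPS estimator of the paper; the function name `split_conformal_interval` and its arguments are hypothetical, and the point prediction and calibration residuals are assumed to come from any regression method fitted on a separate part of the data.

```python
import math

def split_conformal_interval(cal_residuals, prediction, alpha):
    """Split-conformal prediction interval around a point prediction.

    cal_residuals: residuals (y - fitted value) on a held-out calibration set
    prediction:    point prediction at the new covariate value
    alpha:         target miscoverage level, e.g. 0.1 for 90% coverage
    """
    n = len(cal_residuals)
    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for a finite interval at this level.
        return (-math.inf, math.inf)
    scores = sorted(abs(r) for r in cal_residuals)
    half_width = scores[k - 1]
    return (prediction - half_width, prediction + half_width)
```

Under exchangeability of the calibration points and the new point, an interval built this way covers the new response with probability at least 1 - alpha in finite samples; its width is constant in the covariate, which is the kind of limitation a conditional-density-based band is meant to address.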
[0803.1628] Component models for large networks
"Being among the easiest ways to find meaningful structure from discrete data, Latent Dirichlet Allocation (LDA) and related component models have been applied widely. They are simple, computationally fast and scalable, interpretable, and admit nonparametric priors. In the currently popular field of network modeling, relatively little work has taken uncertainty of data seriously in the Bayesian sense, and component models have been introduced to the field only recently, by treating each node as a bag of out-going links. We introduce an alternative, interaction component model for communities (ICMc), where the whole network is a bag of links, stemming from different components. The former finds both disassortative and assortative structure, while the alternative assumes assortativity and finds community-like structures like the earlier methods motivated by physics. With Dirichlet Process priors and an efficient implementation the models are highly scalable, as demonstrated with a social network from the Last.fm web site, with 670,000 nodes and 1.89 million links."
in_NB  community_discovery  statistics  nonparametrics  clustering 
10 weeks ago by cshalizi
Cai: Minimax and Adaptive Inference in Nonparametric Function Estimation
"Since Stein’s 1956 seminal paper, shrinkage has played a fundamental role in both parametric and nonparametric inference. This article discusses minimaxity and adaptive minimaxity in nonparametric function estimation. Three interrelated problems, function estimation under global integrated squared error, estimation under pointwise squared error, and nonparametric confidence intervals, are considered. Shrinkage is pivotal in the development of both the minimax theory and the adaptation theory.
"While the three problems are closely connected and the minimax theories bear some similarities, the adaptation theories are strikingly different. For example, in a sharp contrast to adaptive point estimation, in many common settings there do not exist nonparametric confidence intervals that adapt to the unknown smoothness of the underlying function. A concise account of these theories is given. The connections as well as differences among these problems are discussed and illustrated through examples."
in_NB  statistics  estimation  regression  confidence_sets  nonparametrics  shrinkage  cai.t._tony  minimax 
10 weeks ago by cshalizi
[0803.3017] Accelerated convergence for nonparametric regression with coarsened predictors
"We consider nonparametric estimation of a regression function for a situation where precisely measured predictors are used to estimate the regression curve for coarsened, that is, less precise or contaminated predictors. Specifically, while one has available a sample $(W_1,Y_1),...,(W_n,Y_n)$ of independent and identically distributed data, representing observations with precisely measured predictors, where $mathrm{E}(Y_i|W_i)=g(W_i)$, instead of the smooth regression function $g$, the target of interest is another smooth regression function $m$ that pertains to predictors $X_i$ that are noisy versions of the $W_i$. Our target is then the regression function $m(x)=E(Y|X=x)$, where $X$ is a contaminated version of $W$, that is, $X=W+delta$. It is assumed that either the density of the errors is known, or replicated data are available resembling, but not necessarily the same as, the variables $X$. In either case, and under suitable conditions, we obtain $sqrt{n}$-rates of convergence of the proposed estimator and its derivatives, and establish a functional limit theorem. Weak convergence to a Gaussian limit process implies pointwise and uniform confidence intervals and $sqrt{n}$-consistent estimators of extrema and zeros of $m$. It is shown that these results are preserved under more general models in which $X$ is determined by an explanatory variable. Finite sample performance is investigated in simulations and illustrated by a real data example." --- I can think of potential applications...
in_NB  regression  statistics  nonparametrics  error-in-variables 
11 weeks ago by cshalizi
[0803.2963] Consistency of cross validation for comparing regression procedures
"Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property."
to:NB  statistics  to_read  cross-validation  model_selection  nonparametrics  to_teach:undergrad-ADA  re:stacs 
11 weeks ago by cshalizi
[0803.2984] Conditional density estimation in a regression setting
"Regression problems are traditionally analyzed via univariate characteristics like the regression function, scale function and marginal density of regression errors. These characteristics are useful and informative whenever the association between the predictor and the response is relatively simple. More detailed information about the association can be provided by the conditional density of the response given the predictor. For the first time in the literature, this article develops the theory of minimax estimation of the conditional density for regression settings with fixed and random designs of predictors, bounded and unbounded responses and a vast set of anisotropic classes of conditional densities. The study of fixed design regression is of special interest and novelty because the known literature is devoted to the case of random predictors. For the aforementioned models, the paper suggests a universal adaptive estimator which (i) matches performance of an oracle that knows both an underlying model and an estimated conditional density; (ii) is sharp minimax over a vast class of anisotropic conditional densities; (iii) is at least rate minimax when the response is independent of the predictor and thus a bivariate conditional density becomes a univariate density; (iv) is adaptive to an underlying design (fixed or random) of predictors."
in_NB  statistics  nonparametrics  regression  density_estimation  minimax  to_read  to_teach:undergrad-ADA 
11 weeks ago by cshalizi
[0803.2999] Rate-optimal estimation for a general class of nonparametric regression models with unknown link functions
"This paper discusses a nonparametric regression model that naturally generalizes neural network models. The model is based on a finite number of one-dimensional transformations and can be estimated with a one-dimensional rate of convergence. The model contains the generalized additive model with unknown link function as a special case. For this case, it is shown that the additive components and link function can be estimated with the optimal rate by a smoothing spline that is the solution of a penalized least squares criterion."
in_NB  statistics  regression  neural_networks  nonparametrics 
11 weeks ago by cshalizi
[0801.1599] Parametric and nonparametric models and methods in financial econometrics
"Financial econometrics has become an increasingly popular research field. In this paper we review a few parametric and nonparametric models and methods used in this area. After introducing several widely used continuous-time and discrete-time models, we study in detail dependence structures of discrete samples, including Markovian property, hidden Markovian structure, contaminated observations, and random samples. We then discuss several popular parametric and nonparametric estimation methods. To avoid model mis-specification, model validation plays a key role in financial modeling. We discuss several model validation techniques, including pseudo-likelihood ratio test, nonparametric curve regression based test, residuals based test, generalized likelihood ratio test, simultaneous confidence band construction, and density based test. Finally, we briefly touch on tools for studying large sample properties."
to:NB  statistics  econometrics  finance  review_papers  nonparametrics 
11 weeks ago by cshalizi
[1102.3616] Tight conditions for consistent variable selection in high dimensional nonparametric regression
"We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size. The procedure that is provably consistent under these tight conditions is simple and is based on comparing the empirical Fourier coefficients with an appropriately chosen threshold value."
in_NB  regression  variable_selection  nonparametrics  statistics 
february 2012 by cshalizi
[0801.0327] Nonparametric sequential prediction of time series
"Time series prediction covers a vast field of every-day statistical applications in medical, environmental and economic domains. In this paper we develop nonparametric prediction strategies based on the combination of a set of 'experts' and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these nonparametric strategies are more flexible, faster and generally outperform ARMA methods in terms of normalized cumulative prediction error."
in_NB  time_series  nonparametrics  prediction  statistics  to_teach:undergrad-ADA  re:growing_ensemble_project 
february 2012 by cshalizi
Model Selection in Kernel Based Regression using the Influence Function
"Recent results about the robustness of kernel methods involve the analysis of influence functions. By definition the influence function is closely related to leave-one-out criteria. In statistical learning, the latter is often used to assess the generalization of a method. In statistics, the influence function is used in a similar way to analyze the statistical efficiency of a method. Links between both worlds are explored. The influence function is related to the first term of a Taylor expansion. Higher order influence functions are calculated. A recursive relation between these terms is found characterizing the full Taylor expansion. It is shown how to evaluate influence functions at a specific sample distribution to obtain an approximation of the leave-one-out error. A specific implementation is proposed using a L1 loss in the selection of the hyperparameters and a Huber loss in the estimation procedure. The parameter in the Huber loss controlling the degree of robustness is optimized as well. The resulting procedure gives good results, even when outliers are present in the data."
to:NB  statistics  regression  kernel_estimators  model_selection  robustness  nonparametrics  cross-validation 
february 2012 by cshalizi
Non-Parametric Modeling of Partially Ranked Data
"Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational consideration. We explore the use of non-parametric models for partially ranked data and derive computationally efficient procedures for their use for large n. The derivations are largely possible through combinatorial and algebraic manipulations based on the lattice of partial rankings. A bias-variance analysis and an experimental study demonstrate the applicability of the proposed method."
to:NB  statistics  machine_learning  categorical_data  ordinal_data  information_retrieval  nonparametrics  lebanon.guy 
february 2012 by cshalizi
Giné, Nickl: Rates of contraction for posterior distributions in Lr-metrics, 1 ≤ r ≤ ∞
"The frequentist behavior of nonparametric Bayes estimates, more specifically, rates of contraction of the posterior distributions to shrinking Lr-norm neighborhoods, 1 ≤ r ≤ ∞, of the unknown parameter, are studied. A theorem for nonparametric density estimation is proved under general approximation-theoretic assumptions on the prior. The result is applied to a variety of common examples, including Gaussian process, wavelet series, normal mixture and histogram priors. The rates of contraction are minimax-optimal for 1 ≤ r ≤ 2, but deteriorate as r increases beyond 2. In the case of Gaussian nonparametric regression a Gaussian prior is devised for which the posterior contracts at the optimal rate in all Lr-norms, 1 ≤ r ≤ ∞."
in_NB  bayesian_consistency  statistics  nonparametrics  learning_theory  re:bayes_as_evol  density_estimation  regression 
january 2012 by cshalizi
[1201.2334] Universal Estimation of Directed Information
"We propose four approaches to estimating the directed information rate between a pair of jointly stationary ergodic processes with the help of universal probability assignments. The four approaches yield estimators with different merits such as nonnegativity and boundedness. We establish consistency of these estimators in various senses and derive near-optimal rates of convergence in the minimax sense under mild conditions. The estimators carry over directly to estimating other information measures of stationary ergodic processes, such as entropy rate and mutual information rate, and provide alternatives to classical approaches in the existing literature. Guided by the theoretical results, we use context tree weighting as the vehicle for the implementations of the proposed estimators. Experiments on synthetic and real data are presented, demonstrating the potential of the proposed schemes in practice and the efficacy of directed information estimation as a tool for detecting and measuring causality and delay."
in_NB  to_read  information_theory  entropy_estimation  directed_information  stochastic_processes  nonparametrics  statistics  re:AoS_project 
january 2012 by cshalizi
Building Consistent Regression Trees from Complex Sample Data
"In the past several years a wide range of methods for the construction of regression trees and other estimators based on the recursive partitioning of samples have appeared in the statistics literature. Many applications involve data collected through a complex sample design. At present, however, relatively little is known regarding the properties of these methods under complex designs. This article proposes a method for incorporating information about the complex sample design when building a regression tree using a recursive partitioning algorithm. Sufficient conditions are established for asymptotic design L2 consistency of these regression trees as estimators for an arbitrary regression function. The proposed method is illustrated with Occupational Employment Statistics establishment survey data linked to Quarterly Census of Employment and Wage payroll data of the Bureau of Labor Statistics. Performance of the nonparametric estimator is investigated through a simulation study based on this example."
to:NB  regression  prediction_trees  statistics  machine_learning  to_teach:data-mining  nonparametrics 
january 2012 by cshalizi
[0805.4136] Inference for the dark energy equation of state using Type Ia supernova data
"The surprising discovery of an accelerating universe led cosmologists to posit the existence of "dark energy"--a mysterious energy field that permeates the universe. Understanding dark energy has become the central problem of modern cosmology. After describing the scientific background in depth, we formulate the task as a nonlinear inverse problem that expresses the comoving distance function in terms of the dark-energy equation of state. We present two classes of methods for making sharp statistical inferences about the equation of state from observations of Type Ia Supernovae (SNe). First, we derive a technique for testing hypotheses about the equation of state that requires no assumptions about its form and can distinguish among competing theories. Second, we present a framework for computing parametric and nonparametric estimators of the equation of state, with an associated assessment of uncertainty. Using our approach, we evaluate the strength of statistical evidence for various competing models of dark energy. Consistent with current studies, we find that with the available Type Ia SNe data, it is not possible to distinguish statistically among popular dark-energy models, and that, in particular, there is no support in the data for rejecting a cosmological constant. With much more supernova data likely to be available in coming years (e.g., from the DOE/NASA Joint Dark Energy Mission), we address the more interesting question of whether future data sets will have sufficient resolution to distinguish among competing theories."

--- I am biased, because Chris G. and Larry are friends, but this seems to me a model of the modern applied statistics paper: use interesting statistical tools to say something helpful about an important scientific problem on its own terms, rather than distorting the problem until it "looks like a nail".
in_NB  kith_and_kin  cosmology  astronomy  inverse_problems  nonparametrics  estimation  hypothesis_testing  statistics  bootstrap  genovese.christopher  wasserman.larry  have_read 
january 2012 by cshalizi
[1201.0794] Sparse Nonparametric Graphical Models
"We present some nonparametric methods for graphical modeling. In the discrete case, where the data are binary or drawn from a finite alphabet, Markov random fields are already essentially nonparametric, since the cliques can take only a finite number of values. Continuous data are different. The Gaussian graphical model is the standard parametric model for continuous data, but it makes distributional assumptions that are often unrealistic. We discuss two approaches to building more flexible graphical models. One allows arbitrary graphs and a nonparametric extension of the Gaussian; the other uses kernel density estimation and restricts the graphs to trees and forests. Examples of both methods are presented. We also discuss possible future research directions for nonparametric graphical modeling."

(Review/good parts version of previous papers.)
in_NB  kith_and_kin  statistics  machine_learning  graphical_models  nonparametrics  density_estimation  wasserman.larry  liu.han  lafferty.john 
january 2012 by cshalizi
[1112.1838] Non-parametric kernel estimation for symmetric Hawkes processes. Application to high frequency financial data
"We define a numerical method that provides a non-parametric estimation of the kernel shape in symmetric multivariate Hawkes processes. This method relies on second order statistical properties of Hawkes processes that relate the covariance matrix of the process to the kernel matrix. The square root of the correlation function is computed using a minimal phase recovering method. We illustrate our method on some examples and provide an empirical study of the estimation errors. Within this framework, we analyze high frequency financial price data modeled as 1D or 2D Hawkes processes. We find slowly decaying (power-law) kernel shapes suggesting a long memory nature of self-excitation phenomena at the microstructure level of price dynamics."
to:NB  kernel_estimators  time_series  point_processes  nonparametrics  statistics  re:LoB_project 
december 2011 by cshalizi
[1111.6291] Semiparametric Time Series Models with Log-concave Innovations
"We study a class of semiparametric time series models with innovations following a log-concave distribution. We propose a general maximum likelihood framework which allows us to estimate simultaneously the parameters of a model and the density of the innovations. This framework can be easily adapted to many well-known models, including ARMA and GARCH. Furthermore, we show that the estimator under our new framework is consistent in both ARMA and GARCH settings. We demonstrate its finite sample performance via a thorough simulation study and apply it to two real data sets concerning the streamflow of the Hirnant river and the FTSE daily return."
to:NB  statistics  time_series  nonparametrics 
december 2011 by cshalizi
[1111.6230] Convergence of Nonparametric Functional Regression Estimates with Functional Responses
"We consider nonparametric functional regression when both predictors and responses are functions. More specifically, we let $(X_1,Y_1),...,(X_n,Y_n)$ be random elements in $mathcal{F}timesmathcal{H}$ where $mathcal{F}$ is a semi-metric space and $mathcal{H}$ is a separable Hilbert space. Based on a recently introduced notion of weak dependence for functional data, we showed the almost sure convergence rates of both the Nadaraya-Watson estimator and the nearest neighbor estimator, in a unified manner. Several factors, including functional nature of the responses, the assumptions on the functional variables using the Orlicz norm and the desired generality on weakly dependent data, make the theoretical investigations more challenging and interesting."
to:NB  statistics  regression  nonparametrics  functional_data  kernel_estimators 
december 2011 by cshalizi
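For readers unfamiliar with the Nadaraya-Watson estimator named in the abstract above, here is a minimal sketch for the ordinary scalar case. It is a simplification under stated assumptions: real-valued predictors and responses and a Gaussian kernel, rather than the functional predictors, semi-metric and Hilbert-space responses of the paper. The names `gaussian_kernel` and `nadaraya_watson` are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the normalizing constant cancels below."""
    return math.exp(-0.5 * u * u)

def nadaraya_watson(xs, ys, x0, bandwidth):
    """Kernel-weighted average of the responses ys at the query point x0.

    Each observation (x, y) gets weight K((x0 - x) / bandwidth), so points
    near x0 dominate the average and the bandwidth controls the smoothing.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: x0 is too far from the data for this bandwidth.
        raise ValueError("no effective observations near the query point")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

In the functional setting the difference `x0 - x` would be replaced by a semi-metric distance between curves and the average would be taken in the response space, but the weighting scheme has the same form.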
[1111.5989] Large Deviation Results for the Nonparametric Regression Function Estimator on Functional Data
"This paper is devoted to the study of large deviation behaviors in the setting of the estimation of the regression function on functional data. A large deviation principle is stated for a process Zn, defined below, allowing to derive a pointwise large deviation principle for the Nadaraya-Watson-type l-indexed regression function estimator as a by-product. Moreover, a uniform over VC-classes Chernoff type large deviation result is stated for the deviation of the l-indexed regression estimator."
to:NB  regression  nonparametrics  deviation_bounds 
december 2011 by cshalizi
Nonparametric estimation of the link function including variable selection - Gerhard Tutz and Sebastian Petry - Statistics and Computing, Volume 22, Number 2
"Nonparametric methods for the estimation of the link function in generalized linear models are able to avoid bias in the regression parameters. But for the estimation of the link typically the full model, which includes all predictors, has been used. When the number of predictors is large these methods fail since the full model cannot be estimated. In the present article a boosting type method is proposed that simultaneously selects predictors and estimates the link function. The method performs quite well in simulations and real data examples." (The "to teach" tag is conjectural.)
in_NB  regression  variable_selection  statistics  nonparametrics  to_read  to_teach:undergrad-ADA 
december 2011 by cshalizi
[1111.4226] Joint Modeling of Multiple Related Time Series via the Beta Process
"We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our approach is based on the discovery of a set of latent, shared dynamical behaviors. Using a beta process prior, the size of the set and the sharing pattern are both inferred from data. We develop efficient Markov chain Monte Carlo methods based on the Indian buffet process representation of the predictive distribution of the beta process, without relying on a truncated model. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth and death proposals. We examine the benefits of our proposed feature-based model on several synthetic datasets, and also demonstrate promising results on unsupervised segmentation of visual motion capture data."
to:NB  heard_the_talk  time_series  statistics  machine_learning  nonparametrics  fox.emily  jordan.michael_i. 
november 2011 by cshalizi
[1111.1418] Efficient Nonparametric Conformal Prediction Regions
Yay, it's out! "We investigate and extend the conformal prediction method due to Vovk, Gammerman and Shafer (2005) to construct nonparametric prediction regions. These regions have guaranteed distribution free, finite sample coverage, without any assumptions on the distribution or the bandwidth. Explicit convergence rates of the loss function are established for such regions under standard regularity conditions. Approximations for simplifying implementation and data driven bandwidth selection methods are also discussed. The theoretical properties of our method are demonstrated through simulations."
in_NB  prediction  statistics  confidence_sets  nonparametrics  kith_and_kin  wasserman.larry  robins.james  have_read  density_estimation 
november 2011 by cshalizi
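A minimal sketch of the idea in the abstract above, under loud assumptions: it uses the simpler split (hold-out) variant of conformal prediction rather than the paper's own construction, a Gaussian kernel with a hand-picked bandwidth, and simulated standard-normal data. All function names are hypothetical. The point it illustrates is that the coverage guarantee comes from the ranking of calibration scores, not from the density estimate being any good.

```python
import math
import random

def gaussian_kde(train, bandwidth):
    """Return a function y -> Gaussian kernel density estimate built from `train`."""
    norm = len(train) * bandwidth * math.sqrt(2 * math.pi)
    def density(y):
        return sum(math.exp(-0.5 * ((y - x) / bandwidth) ** 2) for x in train) / norm
    return density

def split_conformal_threshold(density, calibration, alpha):
    """Density level t such that {y : density(y) >= t} contains a new exchangeable
    draw with probability at least 1 - alpha."""
    scores = sorted(density(y) for y in calibration)
    k = math.floor(alpha * (len(scores) + 1))  # rank of the cutoff score
    if k < 1:
        return 0.0  # too few calibration points: the region is the whole line
    return scores[k - 1]

def in_region(density, threshold, y):
    """True if y falls inside the prediction region."""
    return density(y) >= threshold

# Simulated data (an assumption for illustration only).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]
train, calibration = data[:200], data[200:]

density = gaussian_kde(train, bandwidth=0.4)
threshold = split_conformal_threshold(density, calibration, alpha=0.1)

# Empirical coverage on fresh draws; the guarantee is marginal, so this
# should land near (not exactly at) 90%.
fresh = [rng.gauss(0.0, 1.0) for _ in range(2000)]
coverage = sum(in_region(density, threshold, y) for y in fresh) / len(fresh)
```

Changing `bandwidth` changes the shape and size of the region but not the validity of the coverage bound, which is the sense in which no assumption on the bandwidth is needed.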
[1111.1120] Parametric inference for stochastic differential equations: a smooth and match approach
"We study the problem of parameter estimation for a univariate discretely observed ergodic diffusion process given as a solution to a stochastic differential equation. The estimation procedure we propose consists of two steps. In the first step, which is referred to as a smoothing step, we smooth the data and construct a nonparametric estimator of the invariant density of the process. In the second step, which is referred to as a matching step, we exploit a characterisation of the invariant density as a solution of a certain ordinary differential equation, replace the invariant density in this equation by its nonparametric estimator from the smoothing step in order to arrive at an intuitively appealing criterion function, and next define our estimator of the parameter of interest as a minimiser of this criterion function. In many interesting examples such an estimator will be computationally less intense than the more conventional estimators obtained through approximation of the likelihood function associated with the observations. Our main result shows that our estimator is $\sqrt{n}$-consistent under suitable conditions. We also discuss a way of improving its asymptotic performance through a one-step Newton-Raphson type procedure."
to:NB  statistical_inference_for_stochastic_processes  stochastic_differential_equations  ergodic_theory  nonparametrics  statistics  estimation 
november 2011 by cshalizi
Science without (parametric) models: the case of bootstrap resampling: SpringerLink - Synthese, Volume 180, Number 1
"Scientific and statistical inferences build heavily on explicit, parametric models, and often with good reasons. However, the limited scope of parametric models and the increasing complexity of the studied systems in modern science raise the risk of model misspecification. Therefore, I examine alternative, data-based inference techniques, such as bootstrap resampling. I argue that their neglect in the philosophical literature is unjustified: they suit some contexts of inquiry much better and use a more direct approach to scientific inference. Moreover, they make more parsimonious assumptions and often replace theoretical understanding and knowledge about mechanisms by careful experimental design. Thus, it is worthwhile to study in detail how nonparametric models serve as inferential engines in science."
in_NB  philosophy_of_science  bootstrap  statistics  modeling  nonparametrics 
october 2011 by cshalizi
Liu, Yang: Parametric or nonparametric? A parametricness index for model selection
"In model selection literature, two classes of criteria perform well asymptotically in different situations: Bayesian information criterion (BIC) (as a representative) is consistent in selection when the true model is finite dimensional (parametric scenario); Akaike’s information criterion (AIC) performs well in an asymptotic efficiency when the true model is infinite dimensional (nonparametric scenario). But there is little work that addresses if it is possible and how to detect the situation that a specific model selection problem is in. In this work, we differentiate the two scenarios theoretically under some conditions. We develop a measure, parametricness index (PI), to assess whether a model selected by a potentially consistent procedure can be practically treated as the true model, which also hints on AIC or BIC is better suited for the data for the goal of estimating the regression function. A consequence is that by switching between AIC and BIC based on the PI, the resulting regression estimator is simultaneously asymptotically efficient for both parametric and nonparametric scenarios. In addition, we systematically investigate the behaviors of PI in simulation and real data and show its usefulness."
to:NB  model_selection  statistics  nonparametrics  information_criteria 
october 2011 by cshalizi
An observation « An Ergodic Walk
"Bayesian nonparametrics is a bit like the Catholic church : there is a fair bit of dogma, mystery, and reliance on countably infinite populations from the developing world."
funny:geeky  statistics  bayesianism  nonparametrics  sarwate.anand 
october 2011 by cshalizi
Phys. Rev. E 84, 046702 (2011): Nonparametric segmentation of nonstationary time series
"The nonstationary evolution of observable quantities in complex systems can frequently be described as a juxtaposition of quasistationary spells. Given that standard theoretical and data analysis approaches usually rely on the assumption of stationarity, it is important to detect in real time series intervals holding that property. With that aim, we introduce a segmentation algorithm based on a fully nonparametric approach. We illustrate its applicability through the analysis of real time series presenting diverse degrees of nonstationarity, thus showing that this segmentation procedure generalizes and allows one to uncover features unresolved by previous proposals based on the discrepancy of low order statistical moments only."
in_NB  statistics  change-point_problem  time_series  nonparametrics  re:growing_ensemble_project 
october 2011 by cshalizi
Information Rates of Nonparametric Gaussian Process Methods
"For these priors the risk, and hence the information criterion, tends to zero for all continuous response functions. However, the rate at which this happens depends on the combination of true response function and Gaussian prior, and is expressible in a certain concentration function. In particular, the results show that for good performance, the regularity of the GP prior should match the regularity of the unknown response function."
nonparametrics  regression  gaussian_processes  to:NB  statistics  van_der_vaart.aad 
july 2011 by cshalizi
[0812.2749] Nonparametric inference of a trend using functional data
I guess I've been more or less presuming this was true.  (And I'd have been wrong about the form of the simultaneous CI, actually.)  Worth trying to work into the final exam for The Kids?
curve_fitting  gaussian_processes  time_series  statistics  nonparametrics  have_read  confidence_sets  to_teach:undergrad-ADA 
april 2011 by cshalizi
Combining Nonparametric and Optimal Linear Time Series Predictions
ARMA model forecasting, supplemented somehow with nonparametric smoothing of the residuals.  (I haven't read beyond the abstract.)
time_series  prediction  statistics  nonparametrics  to_teach:undergrad-ADA 
january 2011 by cshalizi
Levina, Bickel: Texture synthesis and nonparametric resampling of random fields
Found the pre-print, which I'd read in '04, while looking for something else in my office... Note that this is the same shape of mesh that Lindgren and Nordahl advocated for use in 2D information theory, on totally different (I think) grounds.
bootstrap  spatial_statistics  random_fields  statistics  nonparametrics  have_read 
august 2010 by cshalizi
[1002.4802] Gaussian Process Structural Equation Models with Latent Variables
"In a variety of disciplines such as social sciences, psychology, medicine and economics, the recorded data are considered to be noisy measurements of latent variables connected by some causal structure. This corresponds to a family of graphical models known as the structural equation model with latent variables. While linear non-Gaussian variants have been well-studied, inference in nonparametric structural equation models is still underdeveloped. We introduce a sparse Gaussian process parameterization that defines a non-linear structure connecting latent variables, unlike common formulations of Gaussian process latent variable models. An efficient Markov chain Monte Carlo procedure is described. We evaluate the stability of the sampling procedure and the predictive ability of the model compared against the current practice."
statistics  graphical_models  latent_variables  nonparametrics  estimation  heard_the_talk 
february 2010 by cshalizi
[1002.1559] Computational limits to nonparametric estimation for ergodic processes
"A new negative result for nonparametric estimation of ergodic processes is shown. In this paper we restrict the class of estimators to computable functions and study estimation of distribution of ergodic processes with any given accuracy. Then we show a single ergodic process that is inconsistent with all computable estimators." --- But how is, say, the Ornstein-Weiss procedure incomputable? Must read carefully.
stochastic_processes  ergodic_theory  nonparametrics  statistics  statistical_inference_for_stochastic_processes  to_read 
february 2010 by cshalizi
Local additive estimation. Juhyun Park. 2010; JRSS B
"Additive models are popular in high dimensional regression problems owing to their flexibility in model building and optimality in additive function estimation. Moreover, they do not suffer from the so-called curse of dimensionality generally arising in non-parametric regression settings. Less known is the model bias that is incurred from the restriction to the additive class of models. We introduce a new class of estimators that reduces additive model bias, yet preserves some stability of the additive estimator. The new estimator is constructed by localizing the additivity assumption and thus is named the local additive estimator. It follows the spirit of local linear estimation but is shown to be able to relieve partially the dimensionality problem. Implementation can be easily made with any standard software for additive regression. For detailed analysis we explicitly use the smooth backfitting estimator"
additive_models  to_teach:data-mining  nonparametrics  statistics  estimation  regression  to_teach:undergrad-ADA 
january 2010 by cshalizi
[0902.3311] Why minimax is not that pessimistic
Claims that the generic risk for function learning, in some suitable sense, is in fact very close to the worst risk, so minimax is not pessimistic but realistic. English is shaky.
nonparametrics  estimation  statistics  minimax 
january 2010 by cshalizi
Nonparametric Econometrics: A Primer (Racine)
Exclusive focus on kernel methods, using Hayfield and Racine's np package for R.
econometrics  statistics  nonparametrics  racine.jeffrey  have_read 
october 2009 by cshalizi
[0909.0823] Nonparametric "regression" when errors are positioned at end-points
"In economics ... data on markets, productivity and auctions ... natural to centre at an end-point of the error distribution rather than at the distribution's mean. Often ... have an extreme-value character .... examples involving meteorological, record-value and production-frontier data ... We shall discuss nonparametric methods for estimating regression curves in these settings ... features that contrast so starkly with those in better understood problems that they lead to apparent contradictions. For example, merely by centring errors at their end-points rather than their means the problem can change from one with a familiar nonparametric character, where the optimal convergence rate is slower than $n^{-1/2}$, to one in the super-efficient class, where the optimal rate is faster than $n^{-1/2}$."
regression  nonparametrics  statistics 
september 2009 by cshalizi
[0909.0343] Asymptotic equivalence and adaptive estimation for robust nonparametric regression
"Asymptotic equivalence theory developed in the literature so far are only for bounded loss functions. This limits the potential applications of the theory because many commonly used loss functions in statistical inference are unbounded. In this paper we develop asymptotic equivalence results for robust nonparametric regression with unbounded loss functions. The results imply that all the Gaussian nonparametric regression procedures can be robustified in a unified way. A key step in our equivalence argument is to bin the data and then take the median of each bin. ... To illustrate the general principles of the equivalence argument we consider two important nonparametric inference problems: robust estimation of the regression function and the estimation of a quadratic functional. In both cases easily implementable procedures are constructed and are shown to enjoy simultaneously a high degree of robustness and adaptivity"
statistics  regression  nonparametrics  robust_statistics 
september 2009 by cshalizi
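A rough illustration of the "bin the data and then take the median of each bin" step mentioned in the abstract above. This is only that one step, not the paper's full procedure: the test function, the Cauchy noise, the sample size and the number of bins are all assumptions chosen for the demonstration, and `bin_medians` is a hypothetical helper name.

```python
import math
import random
import statistics

def bin_medians(xs, ys, n_bins):
    """Split [min(xs), max(xs)] into equal-width bins and return a list of
    (bin centre, median of the responses in that bin) pairs."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        idx = min(int((x - lo) / width), n_bins - 1)  # x == hi goes in the last bin
        bins[idx].append(y)
    return [(lo + (i + 0.5) * width, statistics.median(b))
            for i, b in enumerate(bins) if b]

def true_curve(x):
    """Regression function used for the simulation (an assumption)."""
    return math.sin(2 * math.pi * x)

# Equispaced design with heavy-tailed (Cauchy) errors, which have no mean.
rng = random.Random(1)
n = 2000
xs = [i / n for i in range(n)]
noise = [0.2 * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
ys = [true_curve(x) + e for x, e in zip(xs, noise)]

medians = bin_medians(xs, ys, n_bins=20)
worst_error = max(abs(m - true_curve(c)) for c, m in medians)
```

The binned medians behave like well-behaved noisy observations of the curve at the bin centres, so an ordinary smoother designed for Gaussian noise can then be run on `medians` instead of on the raw data.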
[0909.0170] Goodness-of-fit problem for errors in nonparametric regression: Distribution free approach
"This paper discusses asymptotically distribution free tests for the classical goodness-of-fit hypothesis of an error distribution in nonparametric regression models. These tests are based on the same martingale transform of the residual empirical process as used in the one sample location model. This transformation eliminates extra randomization due to covariates but not due to the errors, which is intrinsically present in the estimators of the regression function. Thus, tests based on the transformed process have, generally, better power."
goodness-of-fit  regression  statistics  nonparametrics  to_read 
september 2009 by cshalizi
[0908.3856] Self-consistent method for density estimation
Physicists discovering non-parametric density estimation. It's a cute idea, but I am not comfortable with anything which can give me a negative density estimate.
density_estimation  statistics  nonparametrics  have_read 
august 2009 by cshalizi
Challenges for Econometric Model Selection
"Standard econometric model selection methods are based on four fundamental errors in approach: parametric vision, the assumption of a true DGP, evaluation based on fit, and ignoring the impact of model uncertainty on inference. Instead, econometric model selection methods should be based on a semiparametric vision, models should be viewed as approximations, models should be evaluated based on their purpose, and model uncertainty should be incorporated into inference methods. These problems have been examined individually, but not jointly, and my view is that future research into econometric model selection should attempt to address all four issues. "
model_selection  econometrics  statistics  nonparametrics  have_read  hansen.bruce 
june 2009 by cshalizi
Inverse problems as statistics (Evans and Stark, 2001)
"For a statistician, an inverse problem is an inference or estimation problem. The data are finite in number and contain errors, as they do in classical ... problems, and the unknown typically is infinite-dimensional, as it is in nonparametric regression. The additional complication in an inverse problem is that the data are only indirectly related to the unknown. Canonical abstract formulations of statistical estimation problems subsume this complication by allowing probability distributions to be indexed in more-or-less arbitrary ways by parameters, which can be infinite-dimensional. Standard statistical concepts, questions, and considerations such as bias, variance, mean-squared error, identifiability, consistency, efficiency, and various forms of optimality, apply to inverse problems. This article discusses inverse problems as statistical estimation and inference problems, and points to the literature for a variety of techniques and results."
inverse_problems  statistics  nonparametrics  estimation  latent_variables  to_read  to_teach:complexity-and-inference 
june 2009 by cshalizi
