[0812.3141] Choosing a penalty for model selection in heteroscedastic regression
june 2010 by Vaguery
"We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is a function of the dimension of the model, at least for some typical heteroscedastic model selection problems. In particular, Mallows' Cp is suboptimal in this framework. On the contrary, optimal model selection is possible with data-driven penalties such as resampling or $V$-fold penalties. Therefore, it is worth estimating the shape of the penalty from data, even at the price of a higher computational cost. Simulation experiments illustrate the existence of a trade-off between statistical accuracy and computational complexity. As a conclusion, we sketch some rules for choosing a penalty in least-squares regression, depending on what is known about possible variations of the noise-level."
statistics
statistical-tests
linear-regression
meta-optimization
nudge-targets
multiobjective-optimization
pragmatism-it-ain't
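A minimal sketch of the kind of penalty the abstract calls suboptimal under heteroscedastic noise: one common form of Mallows' Cp, where the penalty depends on the model only through its dimension. The function name `mallows_cp` and its arguments are hypothetical and not taken from the paper; `sigma2` is assumed to be a known or separately estimated constant noise variance, which is exactly the assumption that fails when the noise level varies.

```python
def mallows_cp(rss, dim, n, sigma2):
    """Penalized empirical risk with a dimension-only penalty.

    rss    : residual sum of squares of the fitted model
    dim    : number of free parameters (model dimension)
    n      : sample size
    sigma2 : assumed constant noise variance (illustrative assumption)
    """
    empirical_risk = rss / n
    penalty = 2.0 * sigma2 * dim / n  # depends on the model only via dim
    return empirical_risk + penalty


def select_by_cp(candidates, n, sigma2):
    """Return the index of the candidate (rss, dim) pair minimizing Cp."""
    scores = [mallows_cp(rss, dim, n, sigma2) for rss, dim in candidates]
    return min(range(len(scores)), key=scores.__getitem__)
```

For example, `select_by_cp([(10.0, 1), (6.0, 2), (5.9, 5)], n=20, sigma2=0.3)` picks the two-parameter model, because the small drop in residual sum of squares for the five-parameter model does not pay for its larger penalty.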
[1005.4358] On the estimation of the extremal index based on scaling and resampling
may 2010 by Vaguery
"The extremal index parameter theta characterizes the degree of local dependence in the extremes of a stationary time series and has important applications in a number of areas, such as hydrology, telecommunications, finance and environmental studies.…Further, a procedure for the automatic selection of its tuning parameter is developed and different types of confidence intervals that prove useful in practice proposed. The performance of the estimator is examined through simulations, which show its highly competitive behavior. Finally, the estimator is applied to three real data sets of daily crude oil prices, daily returns of the S&P 500 stock index, and high-frequency, intra-day traded volumes of a stock. These applications demonstrate additional diagnostic features of statistical plots based on the new estimator."
statistics
time-series
statistical-tests
nudge-targets
algorithms
extreme-values
[1003.2294] A Simple Lack-of-Fit Test for Regression Models
april 2010 by Vaguery
"A simple test is proposed for examining the correctness of a given completely specified response function against unspecified general alternatives in the context of univariate regression. The usual diagnostic tools based on residuals plots are useful but heuristic. We introduce a formal statistical test supplementing the graphical analysis. Technically, the test statistic is the maximum length of the sequences of ordered (with respect to the covariate) observations that are consecutively overestimated or underestimated by the candidate regression function. Note that the testing procedure can cope with heteroscedastic errors and no replicates. Recursive formulae allowing to calculate the exact distribution of the test statistic under the null hypothesis and under a class of alternative hypotheses are given."
statistics
statistical-tests
modeling
algorithms
things-to-ask-Cosma-about
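The test statistic described in the abstract above (the longest run of covariate-ordered observations that the candidate function consecutively overestimates or underestimates) can be sketched as follows. This is only an illustration of the statistic, not the paper's procedure: the name `longest_sign_run` is hypothetical, treating an exactly zero residual as breaking a run is an assumption, and the recursive formulae for the null distribution are not reproduced here.

```python
import numpy as np


def longest_sign_run(x, y, f):
    """Longest run of same-sign residuals after sorting by the covariate x.

    x, y : 1-D sequences of covariate and response values
    f    : the completely specified candidate regression function
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")      # order observations by covariate
    signs = np.sign(y[order] - f(x[order]))   # +1 underestimated, -1 overestimated

    longest = current = 0
    previous = 0.0
    for s in signs:
        if s == 0:
            current = 0                       # assumption: a zero residual breaks the run
        elif s == previous:
            current += 1
        else:
            current = 1
        previous = s
        longest = max(longest, current)
    return longest
```

A long run relative to the sample size suggests the candidate function is systematically off over some range of the covariate. Only the signs of the residuals enter the statistic, not their magnitudes.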