cshalizi + hierarchical_models 10
Misuse of hierarchical linear models overstates the significance of a reported association between OXTR and prosociality
27 days ago by cshalizi
Going from a p-value of 10^-16 to 0.027 is --- painful. From the lack of a response, I tend to infer that there's no arguing back...
Prediction: the original association will continue to be cited without correction.
bad_data_analysis
hierarchical_models
human_genetics
evisceration
Bootstrapping clustered data - Field - 2007 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library
february 2012 by cshalizi
"Various bootstraps have been proposed for bootstrapping clustered data from one-way arrays. The simulation results in the literature suggest that some of these methods work quite well in practice; the theoretical results are limited and more mixed in their conclusions. For example, McCullagh reached negative conclusions about the use of non-parametric bootstraps for one-way arrays. The purpose of this paper is to extend our understanding of the issues by discussing the effect of different ways of modelling clustered data, the criteria for successful bootstraps used in the literature and extending the theory from functions of the sample mean to include functions of the between and within sums of squares and non-parametric bootstraps to include model-based bootstraps. We determine that the consistency of variance estimates for a bootstrap method depends on the choice of model with the residual bootstrap giving consistency under the transformation model whereas the cluster bootstrap gives consistent estimates under both the transformation and the random-effect model. In addition we note that the criteria based on the distribution of the bootstrap observations are not really useful in assessing consistency."
in_NB
have_read
statistics
bootstrap
to_teach:undergrad-ADA
hierarchical_models
RE-EM Trees: A Data Mining Approach for Longitudinal and Clustered Data
january 2012 by cshalizi
"Longitudinal data refer to the situation where repeated observations are available for each sampled object. Clustered data, where observations are nested in a hierarchical structure within objects (without time necessarily being involved) represent a similar type of situation. Methodologies that take this structure into account allow for the possibilities of systematic differences between objects that are not related to attributes and autocorrelation within objects across time periods. A standard methodology in the statistics literature for this type of data is the mixed effects model, where these differences between objects are represented by so-called “random effects” that are estimated from the data (population-level relationships are termed “fixed effects,” together resulting in a mixed effects model). This paper presents a methodology that combines the structure of mixed effects models for longitudinal and clustered data with the flexibility of tree-based estimation methods. We apply the resulting estimation method, called the RE-EM tree, to pricing in online transactions, showing that the RE-EM tree is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. We also apply it to a smaller data set examining accident fatalities, and show that the RE-EM tree strongly outperforms a tree without random effects while performing comparably to a linear model with random effects. We also perform extensive simulation experiments to show that the estimator improves predictive performance relative to regression trees without random effects and is comparable or superior to using linear models with random effects in more general situations."
to:NB
machine_learning
decision_trees
data_mining
statistics
hierarchical_models
[1201.1980] Misspecifying the Shape of a Random Effects Distribution: Why Getting It Wrong May Not Matter
january 2012 by cshalizi
"Statistical models that include random effects are commonly used to analyze longitudinal and correlated data, often with strong and parametric assumptions about the random effects distribution. There is marked disagreement in the literature as to whether such parametric assumptions are important or innocuous. In the context of generalized linear mixed models used to analyze clustered or longitudinal data, we examine the impact of random effects distribution misspecification on a variety of inferences, including prediction, inference about covariate effects, prediction of random effects and estimation of random effects variances. We describe examples, theoretical calculations and simulations to elucidate situations in which the specification is and is not important. A key conclusion is the large degree of robustness of maximum likelihood for a wide variety of commonly encountered situations."
to:NB
regression
statistics
estimation
hierarchical_models
Bayesian Checking of the Second Levels of Hierarchical Models
july 2009 by cshalizi
In particular see the bit about "pure Bayesian reasoning" in the rejoinder.
statistics
modeling
hierarchical_models
model-checking
bayesianism
re:phil-of-bayes_paper
have_read
Why we (usually) don’t have to worry about multiple comparisons
march 2008 by cshalizi
My initial reaction is one of skepticism, despite my respect for Andy. To work through.
statistics
multiple_comparisons
hierarchical_models
gelman.andrew
yajima.masano
via:arthegall
have_read
hill.jennifer
Data Analysis Using Regression and Multilevel/Hierarchical Models - Gelman and Hill (@Labyrinth)
january 2008 by cshalizi
Maybe the best applied textbook on regression and hierarchical modeling available. Good as an introduction to statistical modeling more generally.
regression
hierarchical_models
statistics
modeling
data_analysis
gelman.andrew
hill.jennifer
books:recommended
Adrian Dobra, University of Washington
october 2007 by cshalizi
in particular papers on model search for high-dimensional graphical models
graphical_models
model_selection
model_search
computational_statistics
hierarchical_models
contingency_tables
statistics