cshalizi + re:g_paper 37
Phys. Rev. Lett. 108, 200601 (2012): Number of Relevant Directions in Principal Component Analysis and Wishart Random Matrices
7 days ago by cshalizi
"We compute analytically, for large N, the probability P(N+,N) that a N×N Wishart random matrix has N+ eigenvalues exceeding a threshold Nζ, including its large deviation tails. This probability plays a benchmark role when performing the principal component analysis of a large empirical data set. We find that P(N+,N)≈exp[-βN2ψζ(N+/N)], where β is the Dyson index of the ensemble and ψζ(κ) is a rate function that we compute explicitly in the full range 0≤κ≤1 and for any ζ. The rate function ψζ(κ) displays a quadratic behavior modulated by a logarithmic singularity close to its minimum κ⋆(ζ). This is shown to be a consequence of a phase transition in an associated Coulomb gas problem. The variance Δ(N) of the number of relevant components is also shown to grow universally (independent of ζ) as Δ(N)∼(βπ2)-1lnN for large N."
to:NB
to_read
principal_components
large_deviations
random_matrices
stochastic_processes
high-dimensional_probability
re:g_paper
phase_transitions
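The quantity in the abstract above can be explored numerically. What follows is a minimal Monte Carlo sketch, not the paper's analytical method: it assumes the real case (Dyson index 1), a square Gaussian data matrix, and the unnormalized Wishart matrix X^T X, so that eigenvalues are of order N and the threshold is N times zeta. The function names and default parameters are illustrative choices, not taken from the paper.

```python
import numpy as np


def count_above_threshold(n, zeta, rng):
    """Count eigenvalues of one n x n real Wishart matrix that exceed n * zeta.

    Assumption: the Wishart matrix is W = X^T X with X an n x n matrix of
    independent standard normal entries.
    """
    x = rng.standard_normal((n, n))
    w = x.T @ x
    eigenvalues = np.linalg.eigvalsh(w)  # w is symmetric, so eigvalsh applies
    return int(np.sum(eigenvalues > n * zeta))


def simulate_counts(n=50, zeta=1.0, trials=200, seed=0):
    """Repeat the count over independent draws and return all counts."""
    rng = np.random.default_rng(seed)
    return np.array([count_above_threshold(n, zeta, rng) for _ in range(trials)])


if __name__ == "__main__":
    counts = simulate_counts()
    print("mean fraction above threshold:", counts.mean() / 50)
    print("variance of the count:", counts.var())
```

Repeating this for several values of n gives a rough empirical check on how slowly the variance of the count grows with matrix size.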
The mystery of missing heritability: Genetic interactions create phantom heritability
january 2012 by cshalizi
"Human genetics has been haunted by the mystery of “missing heritability” of common traits. Although studies have discovered >1,200 variants associated with common diseases and traits, these variants typically appear to explain only a minority of the heritability. The proportion of heritability explained by a set of variants is the ratio of (i) the heritability due to these variants (numerator), estimated directly from their observed effects, to (ii) the total heritability (denominator), inferred indirectly from population data. The prevailing view has been that the explanation for missing heritability lies in the numerator—that is, in as-yet undiscovered variants. While many variants surely remain to be found, we show here that a substantial portion of missing heritability could arise from overestimation of the denominator, creating “phantom heritability.” Specifically, (i) estimates of total heritability implicitly assume the trait involves no genetic interactions (epistasis) among loci; (ii) this assumption is not justified, because models with interactions are also consistent with observable data; and (iii) under such models, the total heritability may be much smaller and thus the proportion of heritability explained much larger. For example, 80% of the currently missing heritability for Crohn's disease could be due to genetic interactions, if the disease involves interaction among three pathways. In short, missing heritability need not directly correspond to missing variants, because current estimates of total heritability may be significantly inflated by genetic interactions. Finally, we describe a method for estimating heritability from isolated populations that is not inflated by genetic interactions."
--- I'm not sure about the validity of their slope-based estimator of narrow heritability, I should ask K.R. about that.
human_genetics
heritability
re:g_paper
i_told_you_so
have_read
in_NB
to:blog
The Effect of Summer Vacation on Achievement Test Scores: A Narrative and Meta-Analytic Review JSTOR: Review of Educational Research, Vol. 66, No. 3 (Autumn, 1996), pp. 227-268
january 2012 by cshalizi
"A review of 39 studies indicated that achievement test scores decline over summer vacation. The results of the 13 most recent studies were combined using meta-analytic procedures. The meta-analysis indicated that the summer loss equaled about one month on a grade-level equivalent scale, or one tenth of a standard deviation relative to spring test scores. The effect of summer break was more detrimental for math than for reading and most detrimental for math computation and spelling. Also, middle-class students appeared to gain on grade-level equivalent reading recognition tests over summer while lower-class students lost on them. There were no moderating effects for student gender or race, but the negative effect of summer did increase with increases in students' grade levels. Suggested explanations for the findings include the differential availability of opportunities to practice different academic material over summer (with reading practice more available than math practice) and differences in the material's susceptibility to memory decay (with fact- and procedure-based knowledge more easily forgotten than conceptual knowledge). The income differences also may be related to differences in opportunities to practice and learn. The results are examined for implications concerning summer school programs and proposals concerning school calendar changes."
to:NB
re:g_paper
mental_testing
education
standardized_testing
[1106.5834] A method for generating realistic correlation matrices
july 2011 by cshalizi
"Simulating sample correlation matrices is important in many areas of statistics. Approaches such as generating normal data and finding their sample correlation matrix or generating random uniform $[-1,1]$ deviates as pairwise correlations both have drawbacks. We develop an algorithm for adding noise, in a highly controlled manner, to general correlation matrices. In many instances, our method yields results which are superior to those obtained by simply simulating normal data. Moreover, we demonstrate how our general algorithm can be tailored to a number of different correlation models. Finally, using our results with an existing clustering algorithm, we show that simulating correlation matrices can help assess statistical methodology."
random_matrix_theory
statistics
re:g_paper
to:NB
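The two baseline approaches named in the abstract above are easy to compare directly. The sketch below is not the paper's noise-adding algorithm; it only illustrates one well-known drawback of filling a matrix with independent uniform pairwise "correlations", namely that the result is generally not positive semidefinite, while the sample correlation matrix of simulated normal data always is (up to rounding). All function names here are hypothetical.

```python
import numpy as np


def uniform_pairwise(d, rng):
    """Symmetric d x d matrix with unit diagonal and Uniform(-1, 1) off-diagonals."""
    r = np.eye(d)
    upper = np.triu_indices(d, k=1)
    r[upper] = rng.uniform(-1.0, 1.0, size=len(upper[0]))
    # Mirror the upper triangle into the lower one; subtract the doubled diagonal.
    return r + r.T - np.eye(d)


def sample_correlation(d, n, rng):
    """Sample correlation matrix of n independent d-dimensional standard normals."""
    data = rng.standard_normal((n, d))
    return np.corrcoef(data, rowvar=False)


def min_eigenvalue(m):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(m)[0])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("uniform entries, min eigenvalue:", min_eigenvalue(uniform_pairwise(20, rng)))
    print("normal data, min eigenvalue:", min_eigenvalue(sample_correlation(20, 200, rng)))
```

A negative smallest eigenvalue means the matrix cannot be the correlation matrix of any random vector.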
Role of test motivation in intelligence testing
april 2011 by cshalizi
Shorter: many people taking pointless tests are not actually motivated to try very hard. Those who are motivated to try hard on pointless tests do better, and are different people in many ways. In other breaking news, snow is cold and water is wet. (To be clear, my "bad data analysis" tag here refers to the IQ-mongers, and not to this paper.)
mental_testing
iq
experimental_psychology
confounding
bad_data_analysis
re:g_paper
to:blog
Sex Differences in Variability in General Intelligence: A New Look at the Old Question
march 2011 by cshalizi
This would make a great mixture-models problem set, if only the data were available, which doesn't seem to be the case.
mental_testing
iq
data_analysis
sex_differences
re:g_paper
Twin Studies in Behavioral Research (Kamin and Goldberger, 2001)
february 2011 by cshalizi
Now that is how you give these idiots the business... The last paragraph is a lovely encapsulation of just how foolish the whole enterprise really is.
heritability
human_genetics
behavioral_genetics
evisceration
bad_data_analysis
re:g_paper
kamin.leon
goldberger.arthur
Parental Guidance and Supervised Learning
january 2011 by cshalizi
They do not mean "supervised learning" the way learning theorists do. "We propose a simple theoretical model of supervised learning that is potentially useful to interpret a number of empirical phenomena relevant to the nature-nurture debate. The model captures a basic trade-off between sheltering the child from the consequences of his mistakes and allowing him to learn from experience. We characterize the optimal parenting policy and its comparative-statics properties. We then show that key features of the optimal policy can be useful to interpret provocative findings from behavioral genetics."
heritability
human_genetics
parenting
social_learning
re:g_paper
to_read
behavioral_genetics
in_NB
Evidence for a Collective Intelligence Factor in the Performance of Human Groups | Science/AAAS
december 2010 by cshalizi
I will give this a fair shot, but the abstract is not promising at all. A great fit to the one-factor model is, after all, precisely what you should expect if there are really an immense number of factors, but your measurement procedures are all crap and depend on random subsets of them. (Perhaps I need to turn http://bactra.org/weblog/523.html into a proper paper after all.)
to_be_shot_after_a_fair_trial
collective_cognition
experimental_psychology
factor_analysis
via:nielsen
re:g_paper
inference_to_latent_objects
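The claim in the note above, that many independent factors measured through random subsets will look like one dominant factor, can be checked with a small simulation. This is a sketch under stated assumptions (a Thomson-style sampling model with independent standard normal abilities, each test score being the sum of a random half of them); the function name and parameter values are hypothetical and are not taken from the paper or the linked post.

```python
import numpy as np


def simulate_sampling_model(n_people=2000, n_abilities=500, n_tests=10,
                            frac=0.5, seed=0):
    """Correlation matrix of test scores when every test taps a random
    subset of many independent abilities (no general factor is built in)."""
    rng = np.random.default_rng(seed)
    # Independent latent abilities, one row per person.
    abilities = rng.standard_normal((n_people, n_abilities))
    # uses[k, j] is True when test j draws on ability k.
    uses = rng.random((n_abilities, n_tests)) < frac
    # Each test score is the sum of the abilities that test happens to use.
    scores = abilities @ uses.astype(float)
    return np.corrcoef(scores, rowvar=False)


def leading_eigenvalue_share(corr):
    """Fraction of total variance captured by the first principal component."""
    eigenvalues = np.linalg.eigvalsh(corr)
    return float(eigenvalues[-1] / eigenvalues.sum())


if __name__ == "__main__":
    corr = simulate_sampling_model()
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    print("smallest pairwise correlation:", off_diagonal.min())
    print("leading eigenvalue share:", leading_eigenvalue_share(corr))
```

In this setup the tests correlate only because their random subsets overlap, yet the correlation matrix shows the all-positive pattern and large first eigenvalue usually read as evidence for a single factor.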
The Great DNA Data Deficit: Are Genes for Disease a Mirage? (Jonathan Latham and Allison Wilson)
december 2010 by cshalizi
ETA: I withdraw my approval, and question my own reading skills. See
http://bayes.wordpress.com/2010/12/14/it-takes-three-to-read-this-stupid-article-marge-two-to-write-it-and-one-to-read-it/
genetics
heritability
human_genetics
genomics
re:g_paper
via:arsyed
link_left_as_a_reminder_to_self
"Revival of test bias research in preemployment testing"
august 2010 by cshalizi
Those studies you ran to show that your standardized tests had no predictive bias? Had no power to detect bias when it exists. Get back to us when you've got sample sizes of 10^5 from the minority groups. HTH. (Application to IQ is left as an exercise to the reader.) --- But oh, those tables are so awful and ugly!
mental_testing
iq
debunking
to:blog
have_read
via:fred_feinberg
re:g_paper
correlational_psychology
Children's educational progress: partitioning family, school and area effects. Jon Rasbash. 2010; Journal of the Royal Statistical Society: Series A (Statistics in Society) - Wiley InterScience
may 2010 by cshalizi
"School effectiveness analyses have largely ignored the role of the family as an important source of variation for children's educational progress. Sibling analyses in developmental psychology and behavioural genetics have largely ignored sources of shared environmental variation beyond the immediate family. We formulate a multilevel cross-classified model that examines variation in children's progress during secondary schooling and partitions this variability into pupil, family, primary school, secondary school, local education authority and residential area. Our results suggest that about 50% of what has been labelled as pupil variation in school effectiveness models is really between-family variation and that about 22% of the total variance is due to shared environments beyond the immediate family." --- Haven't read the paper, could be crap.
statistics
education
variance_components
re:g_paper
A New Lease on Life for Thomson's Bonds Model of Intelligence (Bartholomew, Deary and Lawn, 2009)
august 2009 by cshalizi
I _told_ you so. (Though they are _shockingly_ naive about fMRI and brain organization.)
to:blog
iq
mental_testing
factor_analysis
psychometrics
thomson.godfrey
spearman.charles
latent_variables
re:g_paper
via:moritz-heene
i_told_you_so
[0906.2885] Noisy Independent Factor Analysis Model for Density Estimation and Classification
june 2009 by cshalizi
"We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components are known. We show that the densities of this form can be estimated with a fast rate"
factor_analysis
density_estimation
statistics
to_read
re:g_paper
A Meta-Analysis of Variance Accounted for and Factor Loadings in Exploratory Factor Analysis
may 2009 by cshalizi
Shorter Peterson: Your results look like a factor analysis of pure noise. Have a nice day. (Also, a citation in support of the folk wisdom that factor analysis doesn't work any better as data reduction than simple principal components analysis.)
factor_analysis
statistics
to:NB
to_teach:data-mining
via:moritz-heene
re:g_paper
dimension_reduction
principal_components
to_teach:undergrad-ADA
Sex differences in IQ variability - The differential biology reader
february 2009 by cshalizi
Figures from the Johnson/Carothers/Deary paper, for ease of reference.
iq
sex_differences
re:g_paper
via:arthegall
Improving fluid intelligence with training on working memory — PNAS
january 2009 by cshalizi
Since (to indulge in self-quotation) about the only thing in actual cognitive psychology which correlates well with "g" is working memory capacity, it's not exactly astonishing that memory training improves measured "g". (But it is a non-trivial finding nonetheless because the memory training does transfer across tasks.)
iq
experimental_psychology
via:moritz-heene
re:g_paper
High/Scope Perry Preschool Study Lifetime Effects
march 2008 by cshalizi
"From 1962–1967, at ages 3 and 4, the subjects were randomly divided into a program group that received a high-quality preschool program based on High/Scope's participatory learning approach and a comparison group who received no preschool program. ..."
cognitive_development
experimental_psychology
inequality
re:g_paper
via:moritz-heene
Nisbett on Rushton & Jensen
november 2007 by cshalizi
One of the great psychologists of our time bangs his head against the wall
re:g_paper
nisbett.richard
jensen.arthur
race
iq
mental_testing
rushton.j._philippe
Black Americans Reduce the Racial IQ Gap: Evidence from Standardization Samples (Dickens and Flynn)
november 2007 by cshalizi
approx. 5-6 IQ points between 1972 and 2002
iq
mental_testing
race
re:g_paper
via:jbdelong
flynn.james
dickens.william