cshalizi + have_read   343

[1204.6441] "I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper" -- A Balanced Survey on Election Prediction using Twitter Data
"Predicting X from Twitter is a popular fad within the Twitter research subculture. It seems both appealing and relatively easy. Among such kind of studies, electoral prediction is maybe the most attractive, and at this moment there is a growing body of literature on such a topic. This is not only an interesting research problem but, above all, it is extremely difficult. However, most of the authors seem to be more interested in claiming positive results than in providing sound and reproducible methods. It is also especially worrisome that many recent papers seem to only acknowledge those studies supporting the idea of Twitter predicting elections, instead of conducting a balanced literature review showing both sides of the matter. After reading many of such papers I have decided to write such a survey myself. Hence, in this paper, every study relevant to the matter of electoral prediction using social media is commented. From this review it can be concluded that the predictive power of Twitter regarding elections has been greatly exaggerated, and that hard research problems still lie ahead."
to:NB  social_media  data_mining  prediction  have_read 
24 days ago by cshalizi
Larger than Life: Digital Creatures in a Family of Two-Dimensional Cellular Automata (Evans, 2001)
"We introduce the Larger than Life family of two-dimensional two-state cellular automata that generalize certain nearest neighbor outer totalistic cellular automaton rules to large neighborhoods. We describe linear and quadratic rescalings of John Conway's celebrated Game of Life to these large neighborhood cellular automaton rules and present corresponding generalizations of Life's famous gliders and spaceships. We show that, as is becoming well known for nearest neighbor cellular automaton rules, these ``digital creatures'' are ubiquitous for certain parameter values."

(Meta-comment: jeez, guys, how hard is it to re-direct old URLs? Or at least to have a working search box?)
cellular_automata  conways_life  have_read  to:NB  evans.kellie_m. 
27 days ago by cshalizi
[1201.5871] Null models for network data
"The analysis of datasets taking the form of simple, undirected graphs continues to gain in importance across a variety of disciplines. Two choices of null model, the logistic-linear model and the implicit log-linear model, have come into common use for analyzing such network data, in part because each accounts for the heterogeneity of network node degrees typically observed in practice. Here we show how these both may be viewed as instances of a broader class of null models, with the property that all members of this class give rise to essentially the same likelihood-based estimates of link probabilities in sparse graph regimes. This facilitates likelihood-based computation and inference, and enables practitioners to choose the most appropriate null model from this family based on application context. Comparative model fits for a variety of network datasets demonstrate the practical implications of our results."
in_NB  network_data_analysis  have_read  statistics  estimation  approximation  re:smoothing_adjacency_matrices 
5 weeks ago by cshalizi
This Time, It Is Not Different: The Persistent Concerns of Financial Macroeconomics
"When the Financial Times's Martin Wolf asked former U.S. Treasury Secretary Lawrence Summers what in economics had proved useful in understanding the financial crisis and the recession, Summers answered: “There is a lot about the recent financial crisis in Bagehot...”. “Bagehot” here is Walter Bagehot’s 1873 book, Lombard Street. How is it that a book written 150 years ago is still state-of-the- art in economists’ analysis of episodes like the one that we hope is just about to end? There are three reasons. The first is that modern academic economics has long possessed drives toward analyzing empirical issues that can be successfully treated statistically and theoretical issues that can be successfully modeled on the foundation of individual rationality. But those drives are disabilities in analyzing episodes like major financial crises that come too rarely for statistical tools to have much bite, and for which a major ex post question asked of wealth holders and their portfolios is: “just what were they thinking?”. The second is that even though the causes of financial collapses like the one we saw in 2007-9 are diverse, the transmission mechanism in the form of the flight to liquidity and/or safety in asset holdings and the consequences for the real economy in the freezing-up of the spending flow and its implications have always been very similar since at least the first proper industrial business cycle in 1825. Thus a nineteenth-century author like Walter Bagehot is in no wise at a disadvantage in analyzing the downward financial spiral. The third is that the proposed cures for current financial crises still bear a remarkable family resemblance to those proposed by Walter Bagehot. And so he is remarkably close to the best we can do, even today."
have_read  economics  macroeconomics  finance  financial_crisis_of_2007--  bagehot.walter  delong.brad 
6 weeks ago by cshalizi
[1204.1351] Mathematicians take a stand
"We survey the reasons for the ongoing boycott of the publisher Elsevier. We examine Elsevier's pricing and bundling policies, restrictions on dissemination by authors, and lapses in ethics and peer review, and we conclude with thoughts about the future of mathematical publishing."
to:blog  elsevier  why_oh_why_cant_we_have_a_better_academic_publishing_system  cohn.henry  have_read 
7 weeks ago by cshalizi
Colombo , Maathuis , Kalisch , Richardson : Learning high-dimensional directed acyclic graphs with latent and selection variables
"We consider the problem of learning causal information between random variables in directed acyclic graphs (DAGs) when allowing arbitrarily many latent and selection variables. The FCI (Fast Causal Inference) algorithm has been explicitly designed to infer conditional independence and causal information in such settings. However, FCI is computationally infeasible for large graphs. We therefore propose the new RFCI algorithm, which is much faster than FCI. In some situations the output of RFCI is slightly less informative, in particular with respect to conditional independence information. However, we prove that any causal information in the output of RFCI is correct in the asymptotic limit. We also define a class of graphs on which the outputs of FCI and RFCI are identical. We prove consistency of FCI and RFCI in sparse high-dimensional settings, and demonstrate in simulations that the estimation performances of the algorithms are very similar. All software is implemented in the R-package pcalg."

--- Too complicated to actually teach, but should be mentioned in the lecture notes on causal discovery, along with FCI.
in_NB  have_read  statistics  graphical_models  causal_inference  sparsity  to_teach:undergrad-ADA 
7 weeks ago by cshalizi
[1203.0683] A Method of Moments for Mixture Models and Hidden Markov Models
"Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it also constitutes a viable alternative to EM for practical deployment."

Clever: some mixture models can be characterized by expectations, covariances, and third-order mixed moments, so you just need to estimate tensors up to third order, rather than very high moments of vectors (which are very noisy), and then do some linear algebra. I should probably re-read because I couldn't reproduce this at the board.
in_NB  statistics  estimation  mixture_models  markov_models  state-space_models  have_read 
7 weeks ago by cshalizi
Stock Market Behavior Predicted by Rat Neurons
"We here report for the first time, to the best of our knowledge, rat motor cortex neurons predicting the behavior of the American stock market. We implanted the motor cortex of the brains of rats with silicon electrodes. Using the correlation technique, we monitored the activity of neurons in our rats while simultaneously tracking the activity of stocks in the U.S. stock market."
have_read  to:NB  neuroscience  finance  statistics  prediction  multiple_testing  bad_data_analysis  funny:geeky  funny:malicious  via:mejn  to:blog  to_teach:undergrad-ADA 
8 weeks ago by cshalizi
[1203.2035] A Noether Theorem for Markov Processes
"Noether's theorem links the symmetries of a quantum system with its conserved quantities, and is a cornerstone of quantum mechanics. Here we prove a version of Noether's theorem for Markov processes. In quantum mechanics, an observable commutes with the Hamiltonian if and only if its expected value remains constant in time for every state. For Markov processes that no longer holds, but an observable commutes with the Hamiltonian if and only if both its expected value and standard deviation are constant in time for every state."
--- For "Hamiltonian" of a Markov process, read "generator".
to:NB  stochastic_processes  markov_models  noethers_theorem  baez.john  re:almost_none  have_read 
11 weeks ago by cshalizi
Rainfall and Conflict - Heather Sarsons
"Starting with Miguel, Satyanath, and Sergenti (2004), a large literature has used rainfall variation as an instrument to study the impacts of income shocks on civil war and conáict. These studies argue that in agriculturally-dependent regions, negative rain shocks lower income levels, which in turn incites violence. This identiÖcation strategy relies on the assumption that rainfall shocks a§ect conáict only through their impacts on income. I evaluate this exclusion restriction by identifying districts that are downstream from dams in India. In downstream districts, income is much less sensitive to rainfall áuctuations. However, rain shocks remain equally strong predictors of riot incidence in these districts. These results suggest that rainfall a§ects rioting through a channel other than income and cast doubt on the conclusion that income shocks incite riots."

Cute.
to:NB  have_read  instrumental_variables  causal_inference  statistics  to_teach:undergrad-ADA  sociology  to:blog 
11 weeks ago by cshalizi
[1202.3323] A new look at shifting regret
"We investigate extensions of well-known online learning algorithms such as fixed-share of Herbster and Warmuth (1998) or the methods proposed by Bousquet and Warmuth (2002). These algorithms use weight sharing schemes to perform as well as the best sequence of experts with a limited number of changes. Here we show, with a common, general, and simpler analysis, that weight sharing in fact achieves much more than what it was designed for. We use it to simultaneously prove new shifting regret bounds for online convex optimization on the simplex in terms of the total variation distance as well as new bounds for the related setting of adaptive regret. Finally, we exhibit the first logarithmic shifting bounds for exp-concave loss functions on the simplex."
online_learning  to_read  individual_sequence_prediction  non-stationarity  re:growing_ensemble_project  in_NB  low-regret_learning  have_read 
12 weeks ago by cshalizi
[1202.3775] Kernel-based Conditional Independence Test and Application in Causal Discovery
"Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties."
statistics  kernel_estimators  independence_testing  hypothesis_testing  causal_inference  in_NB  have_read  to:blog  to_teach:undergrad-ADA 
12 weeks ago by cshalizi
[0805.3032] Testing earthquake predictions
"Statistical tests of earthquake predictions require a null hypothesis to model occasional chance successes. To define and quantify `chance success' is knotty. Some null hypotheses ascribe chance to the Earth: Seismicity is modeled as random. The null distribution of the number of successful predictions -- or any other test statistic -- is taken to be its distribution when the fixed set of predictions is applied to random seismicity. Such tests tacitly assume that the predictions do not depend on the observed seismicity. Conditioning on the predictions in this way sets a low hurdle for statistical significance. Consider this scheme: When an earthquake of magnitude 5.5 or greater occurs anywhere in the world, predict that an earthquake at least as large will occur within 21 days and within an epicentral distance of 50 km. We apply this rule to the Harvard centroid-moment-tensor (CMT) catalog for 2000--2004 to generate a set of predictions. The null hypothesis is that earthquake times are exchangeable conditional on their magnitudes and locations and on the predictions--a common ``nonparametric'' assumption in the literature. We generate random seismicity by permuting the times of events in the CMT catalog. We consider an event successfully predicted only if (i) it is predicted and (ii) there is no larger event within 50 km in the previous 21 days. The $P$-value for the observed success rate is $<0.001$: The method successfully predicts about 5% of earthquakes, far better than `chance,' because the predictor exploits the clustering of earthquakes -- occasional foreshocks -- which the null hypothesis lacks. Rather than condition on the predictions and use a stochastic model for seismicity, it is preferable to treat the observed seismicity as fixed, and to compare the success rate of the predictions to the success rate of simple-minded predictions like those just described. If the proffered predictions do no better than a simple scheme, they have little value."
have_read  to:NB  statistics  geology  prediction  earthquakes  to_teach:undergrad-ADA  to_teach:data-mining 
12 weeks ago by cshalizi
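--- A rough sketch of the permutation test described in the earthquake-prediction abstract above, to make the logic concrete. It is deliberately simplified and is not the authors' code: magnitudes are ignored, locations are hypothetical planar coordinates in km, and the function names and defaults (21 days, 50 km) are chosen here to echo the abstract. The "predictor" just declares an alarm after every event; an event counts as a success if some earlier event fell within the window and radius. Shuffling the times over fixed locations destroys space-time clustering, which is why this naive rule can look significant against that null.

```python
import random
from math import hypot

def success_rate(times, locs, window=21.0, radius=50.0):
    """Fraction of events preceded by another event within `window` days and `radius` km."""
    n = len(times)
    hits = 0
    for j in range(n):
        for i in range(n):
            dt = times[j] - times[i]
            dist = hypot(locs[j][0] - locs[i][0], locs[j][1] - locs[i][1])
            if i != j and 0 < dt <= window and dist <= radius:
                hits += 1
                break  # one earlier neighbour is enough to count the event as predicted
    return hits / n

def permutation_p_value(times, locs, n_perm=200, seed=0):
    """Compare the observed success rate with rates after shuffling times across locations."""
    rng = random.Random(seed)
    observed = success_rate(times, locs)
    shuffled = list(times)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # times become exchangeable; locations stay put
        if success_rate(shuffled, locs) >= observed:
            at_least += 1
    # add-one correction so the estimated p-value is never exactly zero
    return observed, (1 + at_least) / (1 + n_perm)
```

The abstract's recommendation runs the other way: hold the catalog fixed and ask whether a proposed method beats this kind of simple-minded rule.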
[0805.3906] Inference for Multivariate Normal Mixtures
"Multivariate normal mixtures provide a flexible model for high-dimensional data. They are widely used in statistical genetics, statistical finance, and other disciplines. Due to the unboundedness of the likelihood function, classical likelihood-based methods, which may have nice practical properties, are inconsistent. In this paper, we recommend a penalized likelihood method for estimating the mixing distribution. We show that the maximum penalized likelihood estimator is strongly consistent when the number of components has a known upper bound. We also explore a convenient EM-algorithm for computing the maximum penalized likelihood estimator. Extensive simulations are conducted to explore the effectiveness and the practical limitations of both the new method and the ratified maximum likelihood estimators. Guidelines are provided based on the simulation results."
have_read  statistics  mixture_models  re:network_model_selection  in_NB 
12 weeks ago by cshalizi
Bootstrapping clustered data - Field - 2007 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library
"Various bootstraps have been proposed for bootstrapping clustered data from one-way arrays. The simulation results in the literature suggest that some of these methods work quite well in practice; the theoretical results are limited and more mixed in their conclusions. For example, McCullagh reached negative conclusions about the use of non-parametric bootstraps for one-way arrays. The purpose of this paper is to extend our understanding of the issues by discussing the effect of different ways of modelling clustered data, the criteria for successful bootstraps used in the literature and extending the theory from functions of the sample mean to include functions of the between and within sums of squares and non-parametric bootstraps to include model-based bootstraps. We determine that the consistency of variance estimates for a bootstrap method depends on the choice of model with the residual bootstrap giving consistency under the transformation model whereas the cluster bootstrap gives consistent estimates under both the transformation and the random-effect model. In addition we note that the criteria based on the distribution of the bootstrap observations are not really useful in assessing consistency."
in_NB  have_read  statistics  bootstrap  to_teach:undergrad-ADA  hierarchical_models 
february 2012 by cshalizi
[1202.4283] Fast rates in learning with dependent observations
"In this paper we tackle the problem of fast rates in time series forecasting from a statistical learning perspective. In a serie of papers (e.g. Meir 2000, Modha and Masry 1998, Alquier and Wintenberger 2012) it is shown that the main tools used in learning theory with iid observations can be extended to the prediction of time series. The main message of these papers is that, given a family of predictors, we are able to build a new predictor that predicts the series as well as the best predictor in the family, up to a remainder of order $1/sqrt{n}$. It is known that this rate cannot be improved in general. In this paper, we show that in the particular case of the least square loss, and under a strong assumption on the time series (phi-mixing) the remainder is actually of order $1/n$. Thus, the optimal rate for iid variables, see e.g. Tsybakov 2003, and individual sequences, see cite{lugosi} is, for the first time, achieved for uniformly mixing processes. We also show that our method is optimal for aggregating sparse linear combinations of predictors."

--- Assumes observations are in the interval [-B,B] and gets a bound which is O(B^3), and so useless for our purposes.
in_NB  learning_theory  mixing  ergodic_theory  re:your_favorite_dsge_sucks  re:XV_for_mixing  have_read 
february 2012 by cshalizi
Is psychological research really as good as medical research? Effect size comparisons between psychology and medicine
"Researchers have looked at comparisons between medical epidemiological research and psychological research using effect size r in an effort to compare relative effects. Often the outcomes of such efforts have demonstrated comparatively low effects for medical epidemiology research in comparison with effect sizes seen in psychology. The conclusion has often been that relatively small effects seen in psychology research are as strong as those found in important epidemiological medical research. The author suggests that many of the calculated effect sizes from medical epidemiological research on which this conclusion has been based are flawed. Specifically, rather than calculating effect sizes for treatment, many results have been for a Treatment Effect × Disease Effect interaction that was irrelevant to the main study hypothesis. A technique for developing a “hypothesis-relevant” effect size r is proposed."
data_analysis  statistics  psychology  epidemiology  evisceration  via:moritz-heene  have_read 
february 2012 by cshalizi
[0810.3023] Iterated Regret Minimization: A More Realistic Solution Concept
"For some well-known games, such as the Traveler's Dilemma or the Centipede Game, traditional game-theoretic solution concepts--and most notably Nash equilibrium--predict outcomes that are not consistent with empirical observations. In this paper, we introduce a new solution concept, iterated regret minimization, which exhibits the same qualitative behavior as that observed in experiments in many games of interest, including Traveler's Dilemma, the Centipede Game, Nash bargaining, and Bertrand competition. As the name suggests, iterated regret minimization involves the iterated deletion of strategies that do not minimize regret."

--- Quite astonishingly, no mention at all of low-regret learning!
game_theory  online_learning  have_read  in_NB  halpern.joseph_y.  re:knightian_uncertainty  low-regret_learning 
february 2012 by cshalizi
[1202.1523] Information Forests
"We describe Information Forests, an approach to classification that generalizes Random Forests by replacing the splitting criterion of non-leaf nodes from a discriminative one -- based on the entropy of the label distribution -- to a generative one -- based on maximizing the information divergence between the class-conditional distributions in the resulting partitions. The basic idea consists of deferring classification until a measure of "classification confidence" is sufficiently high, and instead breaking down the data so as to maximize this measure. In an alternative interpretation, Information Forests attempt to partition the data into subsets that are "as informative as possible" for the purpose of the task, which is to classify the data. Classification confidence, or informative content of the subsets, is quantified by the Information Divergence. Our approach relates to active learning, semi-supervised learning, mixed generative/discriminative learning."

After reading: meh.
have_read  decision_trees  information_theory  classifiers  machine_learning  to_teach:data-mining  re:AoS_project 
february 2012 by cshalizi
Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
"We present a unifying framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This is in response to the question: "what are the implicit statistical assumptions of feature selection criteria based on mutual information?". To answer this, we adopt a different strategy than is usual in the feature selection literature−instead of trying to define a criterion, we derive one, directly from a clearly specified objective function: the conditional likelihood of the training labels. While many hand-designed heuristic criteria try to optimize a definition of feature 'relevancy' and 'redundancy', our approach leads to a probabilistic framework which naturally incorporates these concepts. As a result we can unify the numerous criteria published over the last two decades, and show them to be low-order approximations to the exact (but intractable) optimisation problem. The primary contribution is to show that common heuristics for information based feature selection (including Markov Blanket algorithms as a special case) are approximate iterative maximisers of the conditional likelihood. A large empirical study provides strong evidence to favour certain classes of criteria, in particular those that balance the relative size of the relevancy/redundancy terms. Overall we conclude that the JMI criterion (Yang and Moody, 1999; Meyer et al., 2008) provides the best tradeoff in terms of accuracy, stability, and flexibility with small data samples."
in_NB  information_theory  statistics  variable_selection  model_selection  to_teach:data-mining  to:blog  machine_learning  classifiers  have_read  graphical_models 
february 2012 by cshalizi
[1202.1561] Tree Models for Difference and Change Detection in a Complex Environment
"A new family of tree models is proposed, which we call "differential trees." A differential tree model is constructed from multiple data sets and aims to detect distributional differences between them. The new methodology differs from the existing difference and change detection techniques in its nonparametric nature, model construction from multiple data sets, and applicability to high-dimensional data. Through a detailed study of an arson case in New Zealand, where an individual is known to have been laying vegetation fires within a certain time period, we illustrate how these models can help detect changes in the frequencies of event occurrences and uncover unusual clusters of events in a complex environment."

--- After reading, I think their exposition is needlessly hard to follow, but let me take a stab at it. In an ordinary classification tree, we are interested in the distribution of the class labels Y given the predictors X, i.e., Pr(Y|X), and make splits on X so that (in essence) the conditional entropy H[Y|X] becomes small. This is of course equivalent to making splits so that the divergence of Pr(Y|X) from Pr(Y) is maximized. What they are interested in is not classification but _describing_ how the different classes are distinct, so the relevant distribution is Pr(X|Y), and they want a big divergence between Pr(X) and Pr(X|Y).
to:NB  re:network_differences  statistics  hypothesis_testing  density_estimation  decision_trees  have_read  data_mining  two-sample_tests 
february 2012 by cshalizi
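--- A toy check of the equivalence claimed in the differential-trees note above, on a made-up 2x2 joint table (the numbers are purely illustrative, and this is not the paper's method). With X read as "which side of the split", the drop in entropy H[Y] - H[Y|X], the average divergence of Pr(Y|X) from Pr(Y), and the average divergence of Pr(X|Y) from Pr(X) all come out as the same number, the mutual information. So the two kinds of tree differ in which conditional distribution they describe, not in the quantity being pushed up on a given partition.

```python
from math import log

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * log(q) for q in p if q > 0)

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)

# joint[x][y]: hypothetical probability of landing in leaf x with class label y
joint = [[0.30, 0.10],
         [0.10, 0.50]]
nx, ny = len(joint), len(joint[0])

px = [sum(row) for row in joint]                               # Pr(X)
py = [sum(joint[x][y] for x in range(nx)) for y in range(ny)]  # Pr(Y)

# H[Y] - H[Y|X]: what an ordinary classification tree tries to make large
h_y_given_x = sum(px[x] * entropy([joint[x][y] / px[x] for y in range(ny)])
                  for x in range(nx))
entropy_drop = entropy(py) - h_y_given_x

# Average over leaves of the divergence of Pr(Y|X=x) from Pr(Y)
div_y = sum(px[x] * kl([joint[x][y] / px[x] for y in range(ny)], py)
            for x in range(nx))

# Average over classes of the divergence of Pr(X|Y=y) from Pr(X)
div_x = sum(py[y] * kl([joint[x][y] / py[y] for x in range(nx)], px)
            for y in range(ny))
```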
The Asymmetric Business Cycle
"The business cycle is a fundamental yet elusive concept in macroeconomics. In this paper, we consider the problem of measuring the business cycle. First, we argue for the output-gap view that the business cycle corresponds to transitory deviations in economic activity away from a permanent, or trend, level. Then we investigate the extent to which a general model-based approach to estimating trend and cycle for the U.S. economy leads to measures of the business cycle that reflect models versus the data. We find empirical support for a nonlinear time series model that produces a business cycle measure with an asymmetric shape across NBER expansion and recession phases. Specifically, this business cycle measure suggests that recessions are periods of relatively large and negative transitory fluctuations in output. However, several close competitors to the nonlinear model produce business cycle measures of widely differing shapes and magnitudes. Given this model-based uncertainty, we construct a model-averaged measure of the business cycle. This measure also displays an asymmetric shape and is closely related to other measures of economic slack such as the unemployment rate and capacity utilization."
--- Worthy, but at the same time makes me want to lock them in a room with a copy of Li and Racine's _Nonparametric Econometrics_, or even _The Elements of Statistical Learning_, and not let them out until they understand it.
in_NB  time_series  statistics  economics  macroeconomics  inference_to_latent_objects  re:your_favorite_dsge_sucks  morley.james  have_read  ensemble_methods  model_selection 
february 2012 by cshalizi
On a New Method of Graduation
Whittaker introduces spline smoothing in 1922, complete with the Bayesian derivation. Does not use the word "spline", however --- when did that come in?
in_NB  to_teach:undergrad-ADA  splines  smoothing  regression  statistics  have_read 
january 2012 by cshalizi
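--- For reference, the penalized least-squares problem usually associated with Whittaker's graduation, in modern notation rather than as a transcription of the paper. The symbols are chosen here: $y_i$ are the observations, $z_i$ the graduated values, $\lambda$ the smoothing weight, $\Delta^d$ the $d$-th forward difference, and $D_d$ the $(n-d) \times n$ matrix that takes $d$-th differences.

```latex
\hat{z} \;=\; \arg\min_{z \in \mathbb{R}^n}
  \sum_{i=1}^{n} (y_i - z_i)^2
  \;+\; \lambda \sum_{i=1}^{n-d} \left( \Delta^d z_i \right)^2 ,
\qquad
\hat{z} \;=\; \left( I + \lambda D_d^{\top} D_d \right)^{-1} y .
```

Reading the first sum as a Gaussian log-likelihood and the second as a Gaussian log-prior on the $d$-th differences makes $\hat{z}$ the posterior mode, which is the sense in which the derivation is Bayesian.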
A Method of Handling Curvilinear Correlation for Any Number of Variables (Ezekiel, 1924)
Additive regression models as a general statistical method, complete with a successive-approximation algorithm that's really damn close to modern back-fitting, and a plea for economists to use it. In 1924!
in_NB  to_teach:undergrad-ADA  regression  additive_models  statistics  have_read 
january 2012 by cshalizi
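--- For comparison with the Ezekiel entry above, a minimal sketch of modern back-fitting for an additive model y = alpha + f_1(x_1) + ... + f_p(x_p). This is not Ezekiel's procedure: the smoother is a crude bin average (a regressogram), picked only to keep the example dependency-free, and the names `bin_smooth` and `backfit` are made up here.

```python
def bin_smooth(x, r, n_bins=10):
    """Regressogram smoother: replace each r[i] by the mean of r within x[i]'s bin."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant predictor
    idx = [min(int((xi - lo) / width), n_bins - 1) for xi in x]
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for k, ri in zip(idx, r):
        sums[k] += ri
        counts[k] += 1
    return [sums[k] / counts[k] for k in idx]

def backfit(xs, y, n_iter=20):
    """Cycle over predictors, smoothing the partial residuals against each in turn."""
    n = len(y)
    alpha = sum(y) / n
    fs = [[0.0] * n for _ in xs]  # fs[j][i] is the current estimate of f_j at x_j[i]
    for _ in range(n_iter):
        for j, x in enumerate(xs):
            # partial residuals: remove the intercept and every component except f_j
            partial = [y[i] - alpha - sum(fs[k][i] for k in range(len(xs)) if k != j)
                       for i in range(n)]
            fj = bin_smooth(x, partial)
            mean_fj = sum(fj) / n
            fs[j] = [v - mean_fj for v in fj]  # centre so alpha stays identifiable
    return alpha, fs
```

Fitted values are `alpha + sum(fs[j][i] for j in range(len(xs)))`; swapping `bin_smooth` for a spline or kernel smoother gives the usual textbook version.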
The mystery of missing heritability: Genetic interactions create phantom heritability
"Human genetics has been haunted by the mystery of “missing heritability” of common traits. Although studies have discovered >1,200 variants associated with common diseases and traits, these variants typically appear to explain only a minority of the heritability. The proportion of heritability explained by a set of variants is the ratio of (i) the heritability due to these variants (numerator), estimated directly from their observed effects, to (ii) the total heritability (denominator), inferred indirectly from population data. The prevailing view has been that the explanation for missing heritability lies in the numerator—that is, in as-yet undiscovered variants. While many variants surely remain to be found, we show here that a substantial portion of missing heritability could arise from overestimation of the denominator, creating “phantom heritability.” Specifically, (i) estimates of total heritability implicitly assume the trait involves no genetic interactions (epistasis) among loci; (ii) this assumption is not justified, because models with interactions are also consistent with observable data; and (iii) under such models, the total heritability may be much smaller and thus the proportion of heritability explained much larger. For example, 80% of the currently missing heritability for Crohn's disease could be due to genetic interactions, if the disease involves interaction among three pathways. In short, missing heritability need not directly correspond to missing variants, because current estimates of total heritability may be significantly inflated by genetic interactions. Finally, we describe a method for estimating heritability from isolated populations that is not inflated by genetic interactions."
--- I'm not sure about the validity of their slope-based estimator of narrow heritability; I should ask K.R. about that.
human_genetics  heritability  re:g_paper  i_told_you_so  have_read  in_NB  to:blog 
january 2012 by cshalizi
Collaborative learning in networks
"Complex problems in science, business, and engineering typically require some tradeoff between exploitation of known solutions and exploration for novel ones, where, in many cases, information about known solutions can also disseminate among individual problem solvers through formal or informal networks. Prior research on complex problem solving by collectives has found the counterintuitive result that inefficient networks, meaning networks that disseminate information relatively slowly, can perform better than efficient networks for problems that require extended exploration. In this paper, we report on a series of 256 Web-based experiments in which groups of 16 individuals collectively solved a complex problem and shared information through different communication networks. As expected, we found that collective exploration improved average success over independent exploration because good solutions could diffuse through the network. In contrast to prior work, however, we found that efficient networks outperformed inefficient networks, even in a problem space with qualitative properties thought to favor inefficient networks. We explain this result in terms of individual-level explore-exploit decisions, which we find were influenced by the network structure as well as by strategic considerations and the relative payoff between maxima. We conclude by discussing implications for real-world problem solving and possible extensions."
in_NB  re:do-institutions-evolve  re:democratic_cognition  social_life_of_the_mind  collective_cognition  experimental_psychology  experimental_sociology  social_networks  watts.duncan  mason.winter  have_read  exploration-exploitation 
january 2012 by cshalizi
[0805.4136] Inference for the dark energy equation of state using Type IA supernova data
"The surprising discovery of an accelerating universe led cosmologists to posit the existence of "dark energy"--a mysterious energy field that permeates the universe. Understanding dark energy has become the central problem of modern cosmology. After describing the scientific background in depth, we formulate the task as a nonlinear inverse problem that expresses the comoving distance function in terms of the dark-energy equation of state. We present two classes of methods for making sharp statistical inferences about the equation of state from observations of Type Ia Supernovae (SNe). First, we derive a technique for testing hypotheses about the equation of state that requires no assumptions about its form and can distinguish among competing theories. Second, we present a framework for computing parametric and nonparametric estimators of the equation of state, with an associated assessment of uncertainty. Using our approach, we evaluate the strength of statistical evidence for various competing models of dark energy. Consistent with current studies, we find that with the available Type Ia SNe data, it is not possible to distinguish statistically among popular dark-energy models, and that, in particular, there is no support in the data for rejecting a cosmological constant. With much more supernova data likely to be available in coming years (e.g., from the DOE/NASA Joint Dark Energy Mission), we address the more interesting question of whether future data sets will have sufficient resolution to distinguish among competing theories."

--- I am biased, because Chris G. and Larry are friends, but this seems to me a model of the modern applied statistics paper: use interesting statistical tools to say something helpful about an important scientific problem on its own terms, rather than distorting the problem until it "looks like a nail".
in_NB  kith_and_kin  cosmology  astronomy  inverse_problems  nonparametrics  estimation  hypothesis_testing  statistics  bootstrap  genovese.christopher  wasserman.larry  have_read 
january 2012 by cshalizi
PLoS ONE: Low Pitched Voices Are Perceived as Masculine and Attractive but Do They Predict Semen Quality in Men?
How does anyone read this paper and _not_ think that they were correlating everything they could until they got a "significant" effect?
--- I am very tempted right now to make this a problem set in ADA, but that's just asking for trouble, yes?
practices_relating_to_the_transmission_of_genetic_information  regression  statistics  bad_data_analysis  via:unfogged  have_read  principal_components  to:blog 
december 2011 by cshalizi
Instruments, Randomization, and Learning about Development (Deaton, 2010)
"There is currently much debate about the effectiveness of foreign aid and about what kind of projects can engender economic development. There is skepticism about the ability of econometric analysis to resolve these issues or of development agencies to learn from their own experience. In response, there is increasing use in development economics of randomized controlled trials (RCTs) to accumulate credible knowl- edge of what works, without overreliance on questionable theory or statistical meth- ods. When RCTs are not possible, the proponents of these methods advocate quasi- randomization through instrumental variable (IV) techniques or natural experiments. I argue that many of these applications are unlikely to recover quantities that are use- ful for policy or understanding: two key issues are the misunderstanding of exogeneity and the handling of heterogeneity. I illustrate from the literature on aid and growth. Actual randomization faces similar problems as does quasi-randomization, notwith- standing rhetoric to the contrary. I argue that experiments have no special ability to produce more credible knowledge than other methods, and that actual experiments are frequently subject to practical problems that undermine any claims to statisti- cal or epistemic superiority. I illustrate using prominent experiments in development and elsewhere. As with IV methods, RCT-based evaluation of projects, without guid- ance from an understanding of underlying mechanisms, is unlikely to lead to scientific progress in the understanding of economic development. I welcome recent trends in development experimentation away from the evaluation of projects and toward the evaluation of theoretical mechanisms."
causal_inference  experimental_economics  experimental_sociology  economics  development_economics  social_science_methodology  explanation_by_mechanisms  to_teach:undergrad-ADA  instrumental_variables  have_read  evisceration  in_NB  randomization  to:blog 
december 2011 by cshalizi
Cues of being watched enhance cooperation in a real-world setting
An unusually literal reading of Mencken's "conscience is the little voice that tells us someone might be watching": "We examined the effect of an image of a pair of eyes on contributions to an honesty box used to collect money for drinks in a university coffee room. People paid nearly three times as much for their drinks when eyes were displayed rather than a control image. This finding provides the first evidence from a naturalistic setting of the importance of cues of being watched, and hence reputational concerns, on human cooperative behaviour."
to:NB  have_read  experimental_psychology  evolution_of_cooperation  experimental_economics  to:blog 
december 2011 by cshalizi
High Relatedness Is Necessary and Sufficient to Maintain Multicellularity in Dictyostelium
Cool! "Most complex multicellular organisms develop clonally from a single cell. This should limit conflicts between cell lineages that could threaten the extensive cooperation of cells within multicellular bodies. Cellular composition can be manipulated in the social amoeba Dictyostelium discoideum, which allows us to test and confirm the two key predictions of this theory. Experimental evolution at low relatedness favored cheating mutants that could destroy multicellular development. However, under high relatedness, the forces of mutation and within-individual selection are too small for these destructive cheaters to spread, as shown by a mutation accumulation experiment. Thus, we conclude that the single-cell bottleneck is a powerful stabilizer of cellular cooperation in multicellular organisms."
slime_molds  evolutionary_biology  experimental_biology  evolution_of_cooperation  evo-devo  developmental_biology  major_transitions_of_evolution  have_read  in_NB  to:blog 
december 2011 by cshalizi
[1112.1440] Complex Systems: A Survey
"A complex system is a system composed of many interacting parts, often called agents, which displays collective behavior that does not follow trivially from the behaviors of the individual parts. Examples include condensed matter systems, ecosystems, stock markets and economies, biological evolution, and indeed the whole of human society. Substantial progress has been made in the quantitative understanding of complex systems, particularly since the 1980s, using a combination of basic theory, much of it derived from physics, and computer simulation. The subject is a broad one, drawing on techniques and ideas from a wide range of areas. Here I give a survey of the main themes and methods of complex systems science and an annotated bibliography of resources, ranging from classic papers to recent books and reviews."
in_NB  have_read  complexity  kith_and_kin  newman.mark 
december 2011 by cshalizi
[1112.1047] Network Inference and Biological Dynamics
"Network inference approaches are now widely used in biological applications to probe regulatory relationships between molecular components such as genes or proteins. Many methods have been proposed for this setting, but the connections and differences between their statistical formulations have received less attention. In this paper, we show how a broad class of statistical network inference methods, including a number of existing approaches, can be described in terms of variable selection for the linear model. This reveals some subtle but important differences between the methods, including the treatment of time intervals in discretely observed data. In developing a general formulation, we also explore the relationship between single-cell stochastic dynamics and network inference on averages over cells. This clarifies the link between biochemical networks as they operate at the cellular level and network inference as carried out on data that are averages over populations of cells. We present empirical results, comparing thirty-two network inference methods that are instances of the general formulation we describe, using two published dynamical models. Our investigation sheds light on the applicability and limitations of network inference and provides guidance for practitioners and suggestions for experimental design."
in_NB  have_read  biochemical_networks  network_data_analysis 
december 2011 by cshalizi
[0809.2792] Predicting Abnormal Returns From News Using Text Classification
"We show how text from news articles can be used to predict intraday price movements of financial assets using support vector machines. Multiple kernel learning is used to combine equity returns with text as predictive features to increase classification performance and we develop an analytic center cutting plane method to solve the kernel learning problem efficiently. We observe that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone."
to:NB  have_read  financial_speculation  text_mining 
december 2011 by cshalizi
[1112.0840] On the question of effective sample size in network modeling
"We raise the issue of effective sample size in network graph modeling and inference and illustrate, using simple models and arguments, how this issue can quickly become nontrivial."
in_NB  network_data_analysis  have_read  estimation  statistics  fisher_information  exponential_family_random_graphs  kolaczyk.eric  krivitsky.pavel 
december 2011 by cshalizi
[1111.6201] Learning a Factor Model via Regularized PCA
"We consider the problem of learning a linear factor model with an unknown number of factors. We propose a regularized form of principal component analysis (PCA) and demonstrate through experiments with synthetic and real data the superiority of resulting estimates to those produced by pre-existing factor analysis approaches. We also establish theoretical results that elucidate the manner in which our algorithm corrects biases induced by conventional PCA. An important feature of our algorithm is its computational efficiency, which is close to that of PCA, which enjoys wide use in large part due to its efficiency."
to:NB  factor_analysis  principal_components  statistics  have_read  to_teach:undergrad-ADA  van_roy.benjamin 
december 2011 by cshalizi
Dynamic social networks promote cooperation in experiments with humans
"Human populations are both highly cooperative and highly organized. Human interactions are not random but rather are structured in social networks. Importantly, ties in these networks often are dynamic, changing in response to the behavior of one's social partners. This dynamic structure permits an important form of conditional action that has been explored theoretically but has received little empirical attention: People can respond to the cooperation and defection of those around them by making or breaking network links. Here, we present experimental evidence of the power of using strategic link formation and dissolution, and the network modification it entails, to stabilize cooperation in sizable groups. Our experiments explore large-scale cooperation, where subjects’ cooperative actions are equally beneficial to all those with whom they interact. Consistent with previous research, we find that cooperation decays over time when social networks are shuffled randomly every round or are fixed across all rounds. We also find that, when networks are dynamic but are updated only infrequently, cooperation again fails. However, when subjects can update their network connections frequently, we see a qualitatively different outcome: Cooperation is maintained at a high level through network rewiring. Subjects preferentially break links with defectors and form new links with cooperators, creating an incentive to cooperate and leading to substantial changes in network structure. Our experiments confirm the predictions of a set of evolutionary game theoretic models and demonstrate the important role that dynamic social networks can play in supporting large-scale human cooperation."
to:NB  have_read  experimental_sociology  social_networks  evolution_of_cooperation  christakis.nicholas 
december 2011 by cshalizi
2012 and the End of the World: The Western Roots of the Maya Apocalypse by Matthew Restall - Powell's Books
"Did the Maya really predict that the world would end in December of 2012? If not, how and why has 2012 millenarianism gained such popular appeal? In this deeply knowledgeable book, two leading historians of the Maya answer these questions in a succinct, readable, and accessible style. Matthew Restall and Amara Solari introduce, explain, and ultimately demystify the 2012 phenomenon. They begin by briefly examining the evidence for the prediction of the world's end in ancient Maya texts and images, analyzing precisely what Maya priests did and did not prophesize. The authors then convincingly show how 2012 millenarianism has roots far in time and place from Maya cultural traditions, but in those of medieval and Early Modern Western Europe. Revelatory and myth-busting, while remaining firmly grounded in historical fact, this fascinating book will be essential reading as the countdown to December 21, 2012, begins." --- They're speaking here on Nov. 28th, but I suspect I won't be able to make it.
books:recommended  millenarianism  apocalypticism  maya_civilization  historical_myths  debunking  cultural_appropriation  history_of_ideas  psychoceramics  in_NB  have_read 
november 2011 by cshalizi
Le Cam Made Simple: No-N Asymptotics
"If the log likelihood is approximately quadratic with constant Hessian, then the maximum likelihood estimator (MLE) is approximately normally distributed. No other assumptions are required. We do not need independent and identically distributed data. We do not need the law of large numbers (LLN) or the central limit theorem (CLT). We do not need sample size going to infinity or anything going to infinity.

The theory presented here is a combination of Le Cam style involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style involving derivatives and Fisher information. The main tool is convergence in law of the log likelihood function and its derivatives considered as random elements of a Polish space of continuous functions with the metric of uniform convergence on compact sets. We obtain results for both one-step-Newton estimators and Newton-iterated-to-convergence estimators."
in_NB  have_read  statistics  estimation  geyer.charles  via:ale 
november 2011 by cshalizi
Low Assumptions, High Dimensions
"These days, statisticians often deal with complex, high dimensional datasets. Research- ers in statistics and machine learning have responded by creating many new methods for analyzing high dimensional data. However, many of these new methods depend on strong assumptions. The challenge of bringing low assumption inference to high dimen- sional settings requires new ways to think about the foundations of statistics. Traditional foundational concerns, such as the Bayesian versus frequentist debate, have become less important."
in_NB  foundations_of_statistics  statistics  bayesianism  kith_and_kin  wasserman.larry  have_read 
november 2011 by cshalizi
PLoS ONE: The Small World of Psychopathology
"Background
Mental disorders are highly comorbid: people having one disorder are likely to have another as well. We explain empirical comorbidity patterns based on a network model of psychiatric symptoms, derived from an analysis of symptom overlap in the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV).

Principal Findings
We show that a) half of the symptoms in the DSM-IV network are connected, b) the architecture of these connections conforms to a small world structure, featuring a high degree of clustering but a short average path length, and c) distances between disorders in this structure predict empirical comorbidity rates. Network simulations of Major Depressive Episode and Generalized Anxiety Disorder show that the model faithfully reproduces empirical population statistics for these disorders.

Conclusions
In the network model, mental disorders are inherently complex. This explains the limited successes of genetic, neuroscientific, and etiological approaches to unravel their causes. We outline a psychosystems approach to investigate the structure and dynamics of mental disorders."
to:NB  psychometrics  psychiatry  network_data_analysis  inference_to_latent_objects  borsboom.denny  have_read  to:blog 
november 2011 by cshalizi
Bickel , Chen , Levina : The method of moments and degree distributions for network models
"Probability models on graphs are becoming increasingly important in many applications, but statistical tools for fitting such models are not yet well developed. Here we propose a general method of moments approach that can be used to fit a large class of probability models through empirical counts of certain patterns in a graph. We establish some general asymptotic properties of empirical graph moments and prove consistency of the estimates as the graph size grows for all ranges of the average degree including Omega(1). Additional results are obtained for the important special case of degree distributions."

After reading this, I note that they do not go through even one example of actually estimating anything. I think this is because the inversion from moments to graphons, while mathematically well-defined, is hellish to calculate (and probably very numerically unstable).
network_data_analysis  statistics  estimation  bickel.peter  levina.elizaveta  re:smoothing_adjacency_matrices  in_NB  have_read 
november 2011 by cshalizi
[1111.1418] Efficient Nonparametric Conformal Prediction Regions
Yay, it's out! "We investigate and extend the conformal prediction method due to Vovk, Gammerman and Shafer (2005) to construct nonparametric prediction regions. These regions have guaranteed distribution free, finite sample coverage, without any assumptions on the distribution or the bandwidth. Explicit convergence rates of the loss function are established for such regions under standard regularity conditions. Approximations for simplifying implementation and data driven bandwidth selection methods are also discussed. The theoretical properties of our method are demonstrated through simulations."
in_NB  prediction  statistics  confidence_sets  nonparametrics  kith_and_kin  wasserman.larry  robins.james  have_read  density_estimation 
november 2011 by cshalizi
Words to the Wise: Stock Flow Consistent Modeling of Financial Instability by Stephen Kinsella :: SSRN
Programmatic: "The crisis has exposed the failure of economic models to deal sensibly with endogenously generated crises propagating from the financial sectors to the real economy, and back again. The goal of this paper is to review the method of stock flow consistent modeling to highlight areas in which it is deficient. I argue there is a fruitful research agenda in shoring up these deficiencies. The objective of stock flow modeling should be the ability to practically model unstable macro-economies, and in particular their interactions with the financial sector. These models should provide ‘Words to the Wise’, and until they do, they are just thought experiments."
to:NB  economics  macroeconomics  kinsella.stephen  have_read 
november 2011 by cshalizi
"Dynamic threshold modeling of budget changes"
"A family of models was given to explain how the public budgeting process, as a multi-stage institutional decision making mechanism transforms the stimuli characterized by Gaussian distribution to skew, power law distributions. While the annual change is generally incremental, deviations from this incremental changes are more frequent, than the Gaussian distribution suggests. A set of threshold models, reflecting error-accumulation and friction, was suggested. The three-threshold model seems to be good to describe appropriately the basic statistical features of the data."
have_read  heavy_tails  political_science  via:blyth 
november 2011 by cshalizi
Natural Movies Evoke Spike Trains with Low Spike Time Variability in Cat Primary Visual Cortex
"Neuronal responses in primary visual cortex have been found to be highly variable. This has led to the widespread notion that neuronal responses have to be averaged over large numbers of neurons to obtain suitably invariant responses that can be used to reliably encode or represent external stimuli. However, it is possible that the high variability of neuronal responses may result from the use of simple, artificial stimuli and that the visual cortex may respond differently to dynamic, naturalistic images. To investigate this question, we recorded the responses of primary visual cortical neurons in the anesthetized cat under stimulation with time-varying natural movies. We found that cortical neurons on the whole exhibited a high degree of spike count variability, but a surprisingly low degree of spike time variability. The spike count variability was further reduced when all but the first spike in a burst were removed. We also found that responses exhibiting low spike time variability exhibited low spike count variability, suggesting that rate coding and temporal coding might be more compatible than previously thought. In addition, we found the spike time variability to be significantly lower when stimulated by natural movies as compared with stimulation using drifting gratings. Our results indicate that response variability in primary visual cortex is stimulus dependent and significantly lower than previous measurements have indicated."
in_NB  neuroscience  friday_cat_blogging  to:blog  have_read  neural_coding_and_decoding 
november 2011 by cshalizi
CAKE: Convex Adaptive Kernel Density Estimation
"In this paper we present a generalization of kernel density estimation called Convex Adaptive Kernel Density Estimation (CAKE) that replaces single bandwidth se- lection by a convex aggregation of kernels at all scales, where the convex aggregation is allowed to vary from one training point to another, treating the fundamental problem of heterogeneous smoothness in a novel way. Learning the CAKE estimator given a training set reduces to solving a single con- vex quadratic programming problem. We derive rates of convergence of CAKE like estimator to the true underlying density under smoothness assumptions on the class and show that given a sufficiently large sample the mean squared error of such estimators is optimal in a minimax sense. We also give a risk bound of the CAKE estimator in terms of its empirical risk. We empirically compare CAKE to other density estimators proposed in the statistics literature for handling heterogeneous smoothness on different synthetic and natural distributions. "
to:NB  have_read  density_estimation  ensemble_methods  kernel_estimators  statistics 
november 2011 by cshalizi
Fraser : Is Bayes Posterior just Quick and Dirty Confidence?
Shorter Fraser: Yes. Yes it is.
Longer Fraser: "Bayes introduced the observed likelihood function to statistical inference and provided a weight function to calibrate the parameter; he also introduced a confidence distribution on the parameter space but did not provide present justifications. Of course the names likelihood and confidence did not appear until much later: Fisher for likelihood and Neyman for confidence. Lindley showed that the Bayes and the confidence results were different when the model was not location. This paper examines the occurrence of true statements from the Bayes approach and from the confidence approach, and shows that the proportion of true statements in the Bayes case depends critically on the presence of linearity in the model; and with departure from this linearity the Bayes approach can be a poor approximation and be seriously misleading. Bayesian integration of weighted likelihood thus provides a first-order linear approximation to confidence, but without linearity can give substantially incorrect results."
The responses are worth reading, especially, of course, Larry's.
in_NB  statistics  estimation  confidence_sets  bayesianism  fraser.d.a.s.  have_read 
october 2011 by cshalizi
From Wald to Savage: homo economicus becomes a Bayesian statistician - Munich Personal RePEc Archive
"Bayesian rationality is the paradigm of rational behavior in neoclassical economics. A rational agent in an economic model is one who maximizes her subjective expected utility and consistently revises her beliefs according to Bayes’s rule. The paper raises the question of how, when and why this characterization of rationality came to be endorsed by mainstream economists. Though no definitive answer is provided, it is argued that the question is far from trivial and of great historiographic importance. The story begins with Abraham Wald’s behaviorist approach to statistics and culminates with Leonard J. Savage’s elaboration of subjective expected utility theory in his 1954 classic The Foundations of Statistics. It is the latter’s acknowledged fiasco to achieve its planned goal, the reinterpretation of traditional inferential techniques along subjectivist and behaviorist lines, which raises the puzzle of how a failed project in statistics could turn into such a tremendous hit in economics. A couple of tentative answers are also offered, involving the role of the consistency requirement in neoclassical analysis and the impact of the postwar transformation of US business schools." --- The guess about business schools at the end seems plausible.
in_NB  have_read  re:phil-of-bayes_paper  bayesianism  statistics  decision_theory  economics  history_of_statistics  history_of_economics  wald.abraham  savage.leonard_j.  foundations_of_statistics 
october 2011 by cshalizi
[1110.2529] The Generalization Ability of Online Algorithms for Dependent Data
"We study the generalization performance of arbitrary online learning algorithms trained on samples coming from a dependent source of data. We show that the generalization error of any stable online algorithm concentrates around its regret--an easily computable statistic of the online performance of the algorithm--when the underlying ergodic process is $beta$- or $phi$-mixing. We show high probability error bounds assuming the loss function is convex, and we also establish sharp convergence rates and deviation bounds for strongly convex losses and several linear prediction problems such as linear and logistic regression, least-squares SVM, and boosting on dependent data. In addition, our results have straightforward applications to stochastic optimization with dependent data, and our analysis requires only martingale convergence arguments; we need not rely on more powerful statistical tools such as empirical process theory."
in_NB  learning_theory  individual_sequence_prediction  ergodic_theory  mixing  re:growing_ensemble_project  re:XV_for_mixing  stability_of_learning  concentration_of_measure  have_read  re:your_favorite_dsge_sucks 
october 2011 by cshalizi
A Resampling Technique for Relational Data
Roughly: Fix an integer b. Do snowballing sampling from uniformly-random seeds until each snowball contains b nodes. Try to wire up the peripheral nodes of each snowball in a similarity-preserving way.

This reminds me of the block bootstrap for time series, only the similarity-preserving step seems ugly; blocks are independent in time series. What if we have the peripheral nodes attach randomly to each other, preserving only degree? We'd need to let b grow with n --- when would this give a good approximation to the sampling distribution?
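
A minimal sketch of just the first step above (snowball sampling from uniformly random seeds until each snowball holds b nodes), assuming the network is given as a dict mapping each node to the set of its neighbors. The function names are hypothetical, not from the paper; growing the snowball breadth-first is my assumption; and the rewiring of peripheral nodes is not attempted here.

```python
import random
from collections import deque


def snowball_sample(adj, b, rng):
    """Grow one snowball of up to b nodes, breadth-first, from a uniformly random seed.

    adj: dict mapping node -> set of neighboring nodes (assumed undirected).
    Returns the nodes in the order they were added; the list is shorter than b
    only if the seed's connected component has fewer than b nodes.
    """
    seed = rng.choice(sorted(adj))  # uniformly random seed node
    chosen = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(chosen) < b:
        node = queue.popleft()
        for nbr in sorted(adj[node]):
            if nbr in seen:
                continue
            seen.add(nbr)
            chosen.append(nbr)
            queue.append(nbr)
            if len(chosen) == b:
                break
    return chosen


def snowball_blocks(adj, b, n_blocks, seed=0):
    """Draw n_blocks independent snowballs, each with its own random seed node."""
    rng = random.Random(seed)
    return [snowball_sample(adj, b, rng) for _ in range(n_blocks)]
```

Each returned block plays the role of a block in the block bootstrap; how to join the blocks back into one graph is exactly the open question in the note.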
in_NB  re:XV_for_networks  bootstrap  statistics  relational_learning  neville.jennifer  have_read 
october 2011 by cshalizi
[1104.5617] Learning high-dimensional directed acyclic graphs with latent and selection variables
"We consider the problem of learning causal information between random variables in directed acyclic graph (DAGs) when allowing arbitrarily many latent and selection variables. The FCI algorithm (Spirtes et al., 1999) has been explicitly designed to infer conditional independence and causal information in such settings. However, FCI is computationally infeasible for large graphs. We therefore propose a new algorithm, the RFCI algorithm, which is much faster than FCI. In some situations the output of RFCI is slightly less informative, in particular with respect to conditional independence information. However, we prove that any causal information in the output of RFCI is correct. We also define a class of graphs on which the outputs of FCI and RFCI are identical. We prove consistency of FCI and RFCI in sparse high-dimensional settings, and demonstrate in simulations that the estimation performances of the algorithms are very similar. All software is implemented in the R-package pcalg."
have_read  to_teach:undergrad-ADA  graphical_models  causal_inference  in_NB  kalisch.markus  richardson.thomas_s. 
september 2011 by cshalizi
[1108.0833] Temporal statistical analysis on human article creation patterns
Sadly, in this case fitting crappy power laws to the works of Gene Stanley and Laszlo Barabasi is not an _intentional_ joke.
bad_data_analysis  heavy_tails  barabasi.albert-laszlo  stanley.h._eugene  newman.mark  su.shi  have_read  blogged 
august 2011 by cshalizi
Phys. Rev. Lett. 107, 018102 (2011): Geometric Effects on Complex Network Structure in the Cortex
"It is shown that homogeneous, short-range, two-dimensional (2D) cortical connectivity, without modularity, hierarchy, or other specialized structure, reproduces key observed properties of cortical networks, including low path length, high clustering and modularity index, and apparent hierarchical block-diagonal structure in connection matrices. Geometry strongly influences connection matrices, implying that simple interpretations of connectivity measures as reflecting specialized structure can be misleading: Such apparent structure is seen in strictly uniform, locally connected architectures in 2D. Geometry is thus a proxy for function, modularity, and hierarchy and must be accounted for when structural inferences are made."
neuroscience  networks  network_data_analysis  in_NB  have_read  re:sporns_review  evisceration  to:blog 
july 2011 by cshalizi
[1107.5543] Coevolution of Network Structure and Content
Disappointing.  The content variables are all completely ad hoc (the structure variables are also ad hoc, but traditional), so we really have no idea of what is being found here.  And there is no assessment of uncertainty at all.  And, for the love of Gauss, stop using R^2 like that!
time_series  social_networks  social_media  statistics  adamic.lada  to:NB  have_read  network_data_analysis 
july 2011 by cshalizi
How Useful are Estimated DSGE Model Forecasts? by Rochelle Edge, Refet Gurkaynak :: SSRN
The methodological ideas here are suspect.  It is true that there is not much to predict about an in-control system, and what is happening is largely random and so unpredictable, so that even the true model would show low forecasting ability.  The question however is why we are supposed to think that the DSGE _does_ give us good information about counterfactuals.  If you could show that it had much better predictive performance than baselines like constants or random walks during _out-of-control_ periods, that would be something; but they don't.
re:your_favorite_dsge_sucks  dsges  prediction  economics  macroeconomics  time_series  statistics  in_NB  have_read  to:blog 
july 2011 by cshalizi
[1107.3806] Asymptotics for minimisers of convex processes
From 1993: "By means of two simple convexity arguments we are able to develop a general method for proving consistency and asymptotic normality of estimators that are defined by minimisation of convex criterion functions. This method is then applied to a fair range of different statistical estimation problems, including Cox regression, logistic and Poisson regression, least absolute deviation regression outside model conditions, and pseudo-likelihood estimation for Markov chains. Our paper has two aims. The first is to exposit the method itself, which in many cases, under reasonable regularity conditions, leads to new proofs that are simpler than the traditional proofs. Our second aim is to exploit the method to its limits for logistic regression and Cox regression, where we seek asymptotic results under as weak regularity conditions as possible. For Cox regression in particular we are able to weaken previously published regularity conditions substantially."
statistics  estimation  pollard.david  hjort.nils_lid  empirical_processes  have_read  in_NB 
july 2011 by cshalizi
Becker, 1962: Irrational Behavior and Economic Theory (JSTOR: Journal of Political Economy, Vol. 70, No. 1 (Feb., 1962), pp. 1-13)
This is a genuinely brilliant and important paper.  But Becker shows no sign of realizing just how completely he has just undermined the whole normative side of traditional economics!
economics  bounded_rationality  markets_as_collective_calculating_devices  decision_theory  have_read  to:blog  becker.gary 
july 2011 by cshalizi
Bayesian Checking for Topic Models
"Real document collections do not fit the independence assumptions asserted by most statistical topic models, but how badly do they violate them? We present a Bayesian method for measuring how well a topic model fits a corpus. Our approach is based on posterior predictive checking, a method for diagnosing Bayesian models in user-defined ways. Our method can identify where a topic model fits the data, where it falls short, and in which directions it might be improved."
topic_models  model-checking  blei.david  in_NB  via:ariddell  statistics  machine_learning  information_retrieval  clustering  have_read 
july 2011 by cshalizi
Confirmation in the Cognitive Sciences: The Problematic Case of Bayesian Models
"Bayesian models of human learning are becoming increasingly popular in cognitive science. We argue that their purported confirmation largely relies on a methodology that depends on premises that are inconsistent with the claim that people are Bayesian about learning and inference. Bayesian models in cognitive science derive their appeal from their normative claim that the modeled inference is in some sense rational. Standard accounts of the rationality of Bayesian inference imply predictions that an agent selects the option that maximizes the posterior expected utility. Experimental confirmation of the models, however, has been claimed because of groups of agents that “probability match” the posterior. Probability matching only constitutes support for the Bayesian claim if additional unobvious and untested (but testable) assumptions are invoked. The alternative strategy of weakening the underlying notion of rationality no longer distinguishes the Bayesian model uniquely."
philosophy_of_science  cognitive_science  bayesianism  kith_and_kin  have_read  re:phil-of-bayes_paper  blogged  eberhardt.frederick  danks.david 
july 2011 by cshalizi
Socialist alternatives to capitalism II: Vienna to Santa Fe
Less than convincing, both as socialist argument and as discussion of intellectual history.  (John von Neumann was not, repeat, not, part of the Vienna Circle.  On the other hand, actual full-blown socialist theorists like Otto Neurath _were_.)  No discussion of market socialist traditions, other than passing mentions of Lange et al.
socialism  economics  history_of_economics  foley.duncan  via:?  have_read 
july 2011 by cshalizi
Making and Evaluating Point Forecasts (Gneiting)
"Typically, point forecasting methods are compared and assessed by means of an error measure or scoring function, with the absolute error and the squared error being key examples. The individual scores are averaged over forecast cases, to result in a summary measure of the predictive performance, such as the mean absolute error or the mean squared error. I demonstrate that this common practice can lead to grossly misguided inferences, unless the scoring function and the forecasting task are carefully matched...."
prediction  statistics  calibration  machine_learning  decision_theory  gneiting.tilmann  have_read 
july 2011 by cshalizi
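A small illustration of the matching point in the Gneiting abstract, as a sketch rather than anything taken from the paper: on a skewed sample, the sample mean is the better constant forecast under squared error while the sample median is the better one under absolute error, so ranking the two forecasters depends entirely on which score is used. The data and function names below are made up for illustration.

```python
import statistics

# Hypothetical skewed outcomes (illustrative only, not from the paper).
outcomes = [1, 1, 2, 2, 3, 20]

def mean_abs_error(forecast, ys):
    """Average absolute error of a single constant point forecast."""
    return sum(abs(y - forecast) for y in ys) / len(ys)

def mean_sq_error(forecast, ys):
    """Average squared error of a single constant point forecast."""
    return sum((y - forecast) ** 2 for y in ys) / len(ys)

mean_forecast = statistics.mean(outcomes)      # targets the mean functional
median_forecast = statistics.median(outcomes)  # targets the median functional

for name, f in [("mean", mean_forecast), ("median", median_forecast)]:
    print(name, "MAE:", mean_abs_error(f, outcomes),
          "MSE:", mean_sq_error(f, outcomes))
```

Scoring a forecaster who was asked for a mean by absolute error (or one asked for a median by squared error) would penalize the forecast that is actually correct for its task.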
[1106.2125] Bootstrapping data arrays of arbitrary order
"In this paper we study a bootstrap strategy for estimating the variance of a mean taken over large multifactor crossed random effects data sets. We apply bootstrap reweighting independently to the levels of each factor, giving each observation the product of its factor weights. No exact bootstrap exists for this problem (McCullagh (2000)). We show that the proposed bootstrap is mildly conservative, under sufficient conditions that allow very unbalanced and heteroscedastic inputs. Earlier results for a resampling bootstrap only apply to two factors and are not suitable to online computation. The proposed reweighting approach can be implemented in parallel and online settings. The results for this method apply to any number of factors. The method is illustrated using a 3 factor data set of comment lengths from Facebook."
bootstrap  statistics  eckles.dean  owen.art  have_read  network_data_analysis  re:smoothing_adjacency_matrices  to:blog 
june 2011 by cshalizi
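The reweighting scheme in the bootstrap abstract can be sketched roughly as follows: draw one random weight per level of each factor, give each observation the product of its factor weights, and look at the spread of the reweighted means across replicates. This is only a sketch under assumptions. The choice of weights uniform on {0, 2} (mean 1, variance 1), the function name, and the toy data are illustrative and are not taken from the abstract.

```python
import random
import statistics

def product_weight_bootstrap_variance(values, factors, n_boot=500, seed=0):
    """Estimate the variance of the mean of `values` by product reweighting.

    `factors` is a list of columns, one per factor; factors[k][i] is the
    level of factor k for observation i. Weight distribution is an
    assumption: each level gets weight 0 or 2 with equal probability.
    """
    rng = random.Random(seed)
    boot_means = []
    for _ in range(n_boot):
        # One independent weight per level of each factor.
        level_weights = [
            {level: 2 * rng.randint(0, 1) for level in sorted(set(column))}
            for column in factors
        ]
        total_w = 0.0
        total_wx = 0.0
        for i, x in enumerate(values):
            w = 1
            for column, weights in zip(factors, level_weights):
                w *= weights[column[i]]  # product of the factor weights
            total_w += w
            total_wx += w * x
        if total_w > 0:  # skip replicates where every weight is zero
            boot_means.append(total_wx / total_w)
    return statistics.variance(boot_means)

# Hypothetical two-factor data: (user, item) pairs with a measured value.
users = ["a", "a", "b", "b", "c", "c", "d", "d"]
items = ["x", "y", "x", "z", "y", "z", "x", "y"]
lengths = [12.0, 30.0, 7.0, 45.0, 22.0, 9.0, 15.0, 28.0]

print(product_weight_bootstrap_variance(lengths, [users, items]))
```

Because each replicate only needs per-level weights and running sums, the same computation can be done in a single pass over the data, which is in the spirit of the online and parallel settings the abstract mentions.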