[1205.3845] Forecasting with Historical Data or Process Knowledge under Misspecification: A Comparison
8 days ago by cshalizi
"When faced with the task of forecasting a dynamic system, practitioners often have available historical data, knowledge of the system, or a combination of both. While intuition dictates that perfect knowledge of the system should in theory yield perfect forecasting, often knowledge of the system is only partially known, known up to parameters, or known incorrectly. In contrast, forecasting using previous data without any process knowledge might result in accurate prediction for simple systems, but will fail for highly nonlinear and chaotic systems. In this paper, the authors demonstrate how even in chaotic systems, forecasting with historical data is preferable to using process knowledge if this knowledge exhibits certain forms of misspecification. Through an extensive simulation study, a range of misspecification and forecasting scenarios are examined with the goal of gaining an improved understanding of the circumstances under which forecasting from historical data is to be preferred over using process knowledge."
to:NB
to_read
prediction
time_series
misspecification
re:growing_ensemble_project
8 days ago by cshalizi
Phys. Rev. Lett. 108, 200601 (2012): Number of Relevant Directions in Principal Component Analysis and Wishart Random Matrices
8 days ago by cshalizi
"We compute analytically, for large $N$, the probability $P(N_+,N)$ that a $N\times N$ Wishart random matrix has $N_+$ eigenvalues exceeding a threshold $N\zeta$, including its large deviation tails. This probability plays a benchmark role when performing the principal component analysis of a large empirical data set. We find that $P(N_+,N)\approx\exp[-\beta N^2\psi_\zeta(N_+/N)]$, where $\beta$ is the Dyson index of the ensemble and $\psi_\zeta(\kappa)$ is a rate function that we compute explicitly in the full range $0\le\kappa\le 1$ and for any $\zeta$. The rate function $\psi_\zeta(\kappa)$ displays a quadratic behavior modulated by a logarithmic singularity close to its minimum $\kappa^\star(\zeta)$. This is shown to be a consequence of a phase transition in an associated Coulomb gas problem. The variance $\Delta(N)$ of the number of relevant components is also shown to grow universally (independent of $\zeta$) as $\Delta(N)\sim(\beta\pi^2)^{-1}\ln N$ for large $N$."
to:NB
to_read
principal_components
large_deviations
random_matrices
stochastic_processes
high-dimensional_probability
re:g_paper
phase_transitions
8 days ago by cshalizi
[1205.3703] Generic chaining and the l1-penalty
11 days ago by cshalizi
"We address the choice of the tuning parameter $\lambda$ in $\ell_1$-penalized M-estimation. Our main concern is models which are highly nonlinear, such as the Gaussian mixture model. The number of parameters $p$ is moreover large, possibly larger than the number of observations $n$. The generic chaining technique of Talagrand [2005] is tailored for this problem. It leads to the choice $\lambda \asymp \sqrt{\log p / n}$, as in the standard Lasso procedure (which concerns the linear model and least squares loss)."
to:NB
to_read
statistics
empirical_processes
high-dimensional_statistics
van_de_geer.sara
11 days ago by cshalizi
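A minimal sketch of what the rate in the abstract above means in practice. The abstract fixes only the order of magnitude, so the constant `c` and the function name `lasso_lambda` are assumptions made for illustration, not anything taken from the paper.

```python
import math


def lasso_lambda(n, p, c=1.0):
    """Penalty level of order sqrt(log p / n).

    n: number of observations; p: number of parameters.
    c is a hypothetical constant; the rate says nothing about its value.
    """
    if n <= 0 or p <= 1:
        raise ValueError("need n > 0 and p > 1")
    return c * math.sqrt(math.log(p) / n)


# The penalty shrinks as n grows and increases only logarithmically in p.
for n in (100, 400, 1600):
    print(n, round(lasso_lambda(n, p=1000), 4))
```

Quadrupling n halves the penalty, while even a very large p moves it only slightly.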
Phys. Rev. Lett. 108, 200403 (2012): Time Asymmetry of Probabilities Versus Relativistic Causal Structure: An Arrow of Time
12 days ago by cshalizi
"There is an incompatibility between the symmetries of causal structure in relativity theory and the signaling abilities of probabilistic devices with inputs and outputs: while time reversal in relativity will not introduce the ability to signal between spacelike separated regions, this is not the case for probabilistic devices with spacelike separated input-output pairs. We explicitly describe a nonsignaling device which becomes a perfect signaling device under time reversal, where time reversal can be conceptualized as playing backwards a videotape of an agent manipulating the device. This leads to an arrow of time that is identifiable when studying the correlations of events for spacelike separated regions. Somewhat surprisingly, although the time reversal of Popescu-Rohrlich boxes also allows agents to signal, it does not yield a perfect signaling device. Finally, we realize time reversal using postselection, which could lead to experimental implementation."
to:NB
causality
physics
relativity
arrow_of_time
to_read
12 days ago by cshalizi
Quantitative patterns of stylistic influence in the evolution of literature
13 days ago by cshalizi
"Literature is a form of expression whose temporal structure, both in content and style, provides a historical record of the evolution of culture. In this work we take on a quantitative analysis of literary style and conduct the first large-scale temporal stylometric study of literature by using the vast holdings in the Project Gutenberg Digital Library corpus. We find temporal stylistic localization among authors through the analysis of the similarity structure in feature vectors derived from content-free word usage, nonhomogeneous decay rates of stylistic influence, and an accelerating rate of decay of influence among modern authors. Within a given time period we also find evidence for stylistic coherence with a given literary topic, such that writers in different fields adopt different literary styles. This study gives quantitative support to the notion of a literary “style of a time” with a strong trend toward increasingly contemporaneous stylistic influence."
It'll be interesting to see how they handle the bias induced by selective retention.
to:NB
to_read
literary_history
text_mining
kith_and_kin
rockmore.dan
krakuer.david
13 days ago by cshalizi
Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems
25 days ago by cshalizi
"Extracting knowledge and providing insights into complex mechanisms underlying noisy high-dimensional data sets is of utmost importance in many scientific domains. Statistical modeling has become ubiquitous in the analysis of high-dimensional functional data in search of better understanding of cognition mechanisms, in the exploration of large-scale gene regulatory networks in hope of developing drugs for lethal diseases, and in prediction of volatility in stock market in hope of beating the market. Statistical analysis in these high-dimensional data sets is possible only if an estimation procedure exploits hidden structures underlying data.
"This thesis develops flexible estimation procedures with provable theoretical guarantees for uncovering unknown hidden structures underlying data generating process. Of particular interest are procedures that can be used on high-dimensional data sets where the number of samples n is much smaller than the ambient dimension p. Learning in high-dimensions is difficult due to the curse of dimensionality; however, the special problem structure makes inference possible. Due to its importance for scientific discovery, we put emphasis on consistent structure recovery throughout the thesis. Particular focus is given to two important problems, semi-parametric estimation of networks and feature selection in multi-task learning."
to_read
network_data_analysis
machine_learning
high-dimensional_statistics
kolar.mladen
kith_and_kin
relational_learning
25 days ago by cshalizi
Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies
25 days ago by cshalizi
"We present methods able to predict the presence and strength of conditional and unconditional dependencies (correlations) between two variables Y and Z never jointly measured on the same samples, based on multiple data sets measuring a set of common variables. The algorithms are specializations of prior work on learning causal structures from overlapping variable sets. This problem has also been addressed in the field of statistical matching. The proposed methods are applied to a wide range of domains and are shown to accurately predict the presence of thousands of dependencies. Compared against prototypical statistical matching algorithms and within the scope of our experiments, the proposed algorithms make predictions that are better correlated with the sample estimates of the unknown parameters on test data; this is particularly the case when the number of commonly measured variables is low.
"The enabling idea behind the methods is to induce one or all causal models that are simultaneously consistent with (fit) all available data sets and prior knowledge and reason with them. This allows constraints stemming from causal assumptions (e.g., Causal Markov Condition, Faithfulness) to propagate. Several methods have been developed based on this idea, for which we propose the unifying name Integrative Causal Analysis (INCA). A contrived example is presented demonstrating the theoretical potential to develop more general methods for co-analyzing heterogeneous data sets. The computational experiments with the novel methods provide evidence that causally-inspired assumptions such as Faithfulness often hold to a good degree of approximation in many real systems and could be exploited for statistical inference. Code, scripts, and data are available at www.mensxmachina.org."
to:NB
to_read
causal_inference
graphical_models
to_teach:undergrad-ADA
25 days ago by cshalizi
[1204.6703] Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation
27 days ago by cshalizi
"Topic models can be seen as a generalization of the clustering problem, in that they posit that observations are generated due to multiple latent factors (e.g. the words in each document are generated as a mixture of several active topics, as opposed to just one). This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic probability vectors (the distributions over words for each topic), when only the words are observed and the corresponding topics are hidden.
"We provide a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of mixture models, including the popular latent Dirichlet allocation (LDA) model. For LDA, the procedure correctly recovers both the topic probability vectors and the prior over the topics, using only trigram statistics (i.e. third order moments, which may be estimated with documents containing just three words). The method, termed Excess Correlation Analysis (ECA), is based on a spectral decomposition of low order moments (third and fourth order) via two singular value decompositions (SVDs). Moreover, the algorithm is scalable since the SVD operations are carried out on k by k matrices, where k is the number of latent factors (e.g. the number of topics), rather than in the d-dimensional observed space (typically d >> k)."
That's a really remarkable claim, and I'd tag it to_be_shot_after_a_fair_trial if it weren't being made by genuinely serious people.
in_NB
to_read
latent_variables
topic_models
text_mining
mixture_models
statistics
machine_learning
cool_if_true
spectral_clustering
27 days ago by cshalizi
[1204.6265] Statistical inference for dynamical systems: a review
28 days ago by cshalizi
"The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research."
to:NB
to_read
statistical_inference_for_stochastic_processes
dynamical_systems
statistics
time_series
state-space_models
state-space_reconstruction
pillai.natesh
via:ded-maxim
28 days ago by cshalizi
Graphlets: a Spectral Perspective for Graph Limits
4 weeks ago by cshalizi
"Graphlets give a spectral approach to graph limits for general graph sequences in a framework that unifies previous disparate approaches for dealing with dense graphs and sparse graphs. We will show that the convergence to graphlets under the appropriate spectral distance is equivalent to the convergence using the (normalized) cut distance. We then examine the geometry of graphlets, illustrated by examples of several families of graphlets and, in particular, graphlets with low ranks. We further discuss a number of usages of graphlets, including universal scalable bases, universal embeddings via heat kernels and the preservation of Cheeger cuts."
ETA: This is so not an easy read. I like what I understand, but I definitely have to make another attack on it.
to:NB
to_read
graph_theory
graph_limits
re:smoothing_adjacency_matrices
re:network_differences
chung.fan
via:alessandro
graph_spectra
4 weeks ago by cshalizi
Analytic Thinking Promotes Religious Disbelief
4 weeks ago by cshalizi
"Scientific interest in the cognitive underpinnings of religious belief has grown in recent years. However, to date, little experimental research has focused on the cognitive processes that may promote religious disbelief. The present studies apply a dual-process model of cognitive processing to this problem, testing the hypothesis that analytic processing promotes religious disbelief. Individual differences in the tendency to analytically override initially flawed intuitions in reasoning were associated with increased religious disbelief. Four additional experiments provided evidence of causation, as subtle manipulations known to trigger analytic processing also encouraged religious disbelief. Combined, these studies indicate that analytic processing is one factor (presumably among several) that promotes religious disbelief. Although these findings do not speak directly to conversations about the inherent rationality, value, or truth of religious beliefs, they illuminate one cognitive factor that may influence such discussions."
The part of me which imprinted on _Why I Am Not a Christian_ is chortling. Another part of me, however, is wondering how hard it would be to write "Analytic Thinking Promotes Disbelief in Psychological Studies".
to:NB
to_read
experimental_psychology
cognitive_science
religion
4 weeks ago by cshalizi
"Network Coevolution and Democracy: A Spatial Econometric Approach" by Aya Kachi
4 weeks ago by cshalizi
"Regime transitions are contagious according to the diffusion-of-democracy literature: a country's regime is affected by others' through various predefined networks (e.g. geographical proximity), as well as by the country's own political, economic and social attributes (e.g. GDP levels). My account departs from the existing diffusion theory by allowing for countries' self-selection into peer regime networks based on their democracy levels in the past. For example, a country can form stronger dependency ties with countries that demonstrated similar democracy levels in the past (homophily). In the longitudinal setting, the traditional diffusion mechanism with the presence of self-selection generates the "co-evolutionary dynamic" between country networks and democracy levels. With this recursive feedback process between tie formation and democracy levels, it becomes extremely difficult to evaluate empirically how each country's level of democracy is determined, because we need to distinguish the following three processes statistically. First, country-specific attributes determine the level of democracy as in the earliest democratization studies. Second, other states' democracy levels also predict a country's regime as demonstrated in the conventional diffusion studies. Finally with my theory of endogenous network formation, the seeming diffusion effect is partially a consequence of their self-selection into peer networks. A newer spatial econometric model, an "M-STAR + Co-Evolution" model, is one of the first that allows us to test for all of these three dynamics behind democratization. In my first-cut analysis, I find that all three processes indeed exist."
ETA: It's good to recognize the problem exists, but the model used here does not make it go away, and still fails to identify the influence effect (if one exists).
to:NB
to_read
political_science
network_data_analysis
homophily
contagion
re:critique_of_diffusion
democracy
4 weeks ago by cshalizi
On the Relation Between Encoding and Decoding of Neuronal Spikes
4 weeks ago by cshalizi
"Neural coding is a field of study that concerns how sensory information is represented in the brain by networks of neurons. The link between external stimulus and neural response can be studied from two parallel points of view. The first, neural encoding, refers to the mapping from stimulus to response. It focuses primarily on understanding how neurons respond to a wide variety of stimuli and constructing models that accurately describe the stimulus-response relationship. Neural decoding refers to the reverse mapping, from response to stimulus, where the challenge is to reconstruct a stimulus from the spikes it evokes. Since neuronal response is stochastic, a one-to-one mapping of stimuli into neural responses does not exist, causing a mismatch between the two viewpoints of neural coding. Here we use these two perspectives to investigate the question of what rate coding is, in the simple setting of a single stationary stimulus parameter and a single stationary spike train represented by a renewal process. We show that when rate codes are defined in terms of encoding, that is, the stimulus parameter is mapped onto the mean firing rate, the rate decoder given by spike counts or the sample mean does not always efficiently decode the rate codes, but it can improve efficiency in reading certain rate codes when correlations within a spike train are taken into account."
to:NB
to_read
neural_coding_and_decoding
kith_and_kin
koyama.shinsuke
4 weeks ago by cshalizi
Game-powered machine learning
4 weeks ago by cshalizi
"Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data."
--- This is more than a bit of a stunt, but it points in an interesting direction.
to:NB
to_read
data_mining
collective_cognition
active_learning
tagging
classifiers
re:democratic_cognition
4 weeks ago by cshalizi
Bai, Li: Statistical analysis of factor models of high dimension
6 weeks ago by cshalizi
"This paper considers the maximum likelihood estimation of factor models of high dimension, where the number of variables (N) is comparable with or even greater than the number of observations (T). An inferential theory is developed. We establish not only consistency but also the rate of convergence and the limiting distributions. Five different sets of identification conditions are considered. We show that the distributions of the MLE estimators depend on the identification restrictions. Unlike the principal components approach, the maximum likelihood estimator explicitly allows heteroskedasticities, which are jointly estimated with other parameters. Efficiency of MLE relative to the principal components method is also considered."
to:NB
to_read
factor_analysis
statistics
high-dimensional_statistics
6 weeks ago by cshalizi
[1204.2523] Concept Modeling with Superwords
6 weeks ago by cshalizi
"In information retrieval, a fundamental goal is to transform a document into concepts that are representative of its content. The term "representative" is in itself challenging to define, and various tasks require different granularities of concepts. In this paper, we aim to model concepts that are sparse over the vocabulary, and that flexibly adapt their content based on other relevant semantic information such as textual structure or associated image features. We explore a Bayesian nonparametric model based on nested beta processes that allows for inferring an unknown number of strictly sparse concepts. The resulting model provides an inherently different representation of concepts than a standard LDA (or HDP) based topic model, and allows for direct incorporation of semantic features. We demonstrate the utility of this representation on multilingual blog data and the Congressional Record."
in_NB
to_read
text_mining
topic_models
fox.emily
guestrin.carlos
kith_and_kin
6 weeks ago by cshalizi
[1204.2477] A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models
6 weeks ago by cshalizi
"A simple linear algebraic explanation of the algorithm in "A Spectral Algorithm for Learning Hidden Markov Models" (COLT 2009). Most of the content is in Figure 2; the text just makes everything precise in four nearly-trivial claims."
to:NB
to_read
statistics
markov_models
re:AoS_project
spectral_methods
6 weeks ago by cshalizi
[0802.4363] Estimating the entropy of binary time series: Methodology, some theory and a simulation study
6 weeks ago by cshalizi
"Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator.
"Methodology. Three new entropy estimators are introduced. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters. Theory. We prove that, unlike their earlier versions, the two new LZ-based estimators are consistent for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state HMM with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator. Simulation. All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. Some conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. (iv) The main drawback of the plug-in method is its computational inefficiency."
in_NB
to_read
entropy_estimation
information_theory
time_series
statistics
kontoyiannis.ioannis
re:stacs
6 weeks ago by cshalizi
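For orientation, a rough sketch of a plug-in estimator of the kind the abstract above compares against: take the empirical distribution of overlapping words of length `w`, compute its Shannon entropy, and divide by `w`. The function name, the use of overlapping words, and the toy input are assumptions for illustration; the paper's exact definition may differ in details.

```python
from collections import Counter
import math


def plugin_entropy_rate(bits, w):
    """Plug-in entropy-rate estimate in bits per symbol.

    bits: sequence of 0/1 values; w: word length (a tuning choice).
    Uses overlapping words, which is an assumption of this sketch.
    """
    n_words = len(bits) - w + 1
    if w < 1 or n_words < 1:
        raise ValueError("need 1 <= w <= len(bits)")
    counts = Counter(tuple(bits[i:i + w]) for i in range(n_words))
    h = -sum((c / n_words) * math.log2(c / n_words) for c in counts.values())
    return h / w


# Toy input: a strictly alternating sequence, which is perfectly predictable.
alternating = [i % 2 for i in range(100)]
print(plugin_entropy_rate(alternating, 1))
print(plugin_entropy_rate(alternating, 2))
```

With `w = 1` the alternating sequence looks like a fair coin, and the estimate falls only as `w` grows, a small illustration of how strongly the estimate depends on the word length.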
Cambridge Journals Online - Abstract - KNOWLEDGE, PLANNING, AND MARKETS: A MISSING CHAPTER IN THE SOCIALIST CALCULATION DEBATES
6 weeks ago by cshalizi
"This paper examines the epistemological arguments about markets and planning that emerged in a series of unpublished exchanges between Hayek and Neurath. The exchanges reveal problems for standard accounts of both the socialist calculation debates and logical empiricism. They also raise questions concerning the sources of ignorance and uncertainty in modern economies, and the role of market and non-market organisations in the distribution and coordination of limited knowledge, which remain relevant to contemporary debates in economics. Hayek had argued that Neurath's work exemplified the errors of rationalism that underpinned the socialist project. In response Neurath highlighted assumptions about the limits of reason and predictability that the two theorists shared and attempted to turn those assumptions back against Hayek in a defence of the possibility of socialist planning. The paper critically compares Neurath's and Hayek's criticisms of rationalism and considers how far Neurath is successful in his attempt to employ Hayek's assumptions against Hayek himself."
to:NB
to_read
markets_as_collective_calculating_devices
neurath.otto
hayek.f.a._von
socialist_calculation_debate
history_of_economics
socialism
logical_positivism
6 weeks ago by cshalizi
[math/0603130] Nonparametric methods for inference in the presence of instrumental variables
6 weeks ago by cshalizi
"We suggest two nonparametric approaches, based on kernel methods and orthogonal series, to estimating regression functions in the presence of instrumental variables. For the first time in this class of problems, we derive optimal convergence rates, and show that they are attained by particular estimators. In the presence of instrumental variables the relation that identifies the regression function also defines an ill-posed inverse problem, the “difficulty” of which depends on eigenvalues of a certain integral operator which is determined by the joint density of endogenous and instrumental variables. We delineate the role played by problem difficulty in determining both the optimal convergence rate and the appropriate choice of smoothing parameter."
to:NB
to_read
regression
statistics
instrumental_variables
nonparametrics
to_teach:undergrad-ADA
6 weeks ago by cshalizi
Relative Entropy and Exponential Deviation Bounds for General Markov Chains
6 weeks ago by cshalizi
"We develop explicit, general bounds for the probability that the normalized partial sums of a function of a Markov chain on a general alphabet will exceed the steady-state mean of that function by a given amount. Our bounds combine simple information-theoretic ideas together with techniques from optimization and some fairly elementary tools from analysis. In one direction, we obtain a general bound for the important class of Doeblin chains; this bound is optimal, in the sense that in the special case of independent and identically distributed random variables it essentially reduces to the classical Hoeffding bound. In another direction, motivated by important problems in simulation, we develop a series of bounds in a form which is particularly suited to these problems, and which apply to the more general class of “geometrically ergodic” Markov chains."
to:NB
to_read
deviation_bounds
markov_models
stochastic_processes
via:ded-maxim
meyn.sean
kontoyiannis.ioannis
mixing
information_theory
6 weeks ago by cshalizi
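As a reference point for the abstract above, a small sketch of the classical one-sided Hoeffding bound for i.i.d. variables taking values in [a, b], which is the special case the Doeblin-chain bound is said to reduce to. This is the textbook i.i.d. inequality only, not the paper's Markov-chain bound; the function name and defaults are assumptions.

```python
import math


def hoeffding_bound(n, eps, a=0.0, b=1.0):
    """Upper bound on P(sample mean - true mean >= eps).

    Applies to n i.i.d. variables with values in [a, b]:
    exp(-2 * n * eps**2 / (b - a)**2).
    """
    if n <= 0 or eps < 0 or b <= a:
        raise ValueError("need n > 0, eps >= 0 and a < b")
    return math.exp(-2.0 * n * eps ** 2 / (b - a) ** 2)


# The bound decays exponentially in the sample size n.
for n in (100, 200, 400):
    print(n, hoeffding_bound(n, eps=0.1))
```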
[1204.2003] Directed Information Graphs
6 weeks ago by cshalizi
"We propose two graphical models to represent a concise description of the causal statistical dependence structure between a group of coupled stochastic processes. The first, minimum generative model graphs, is motivated by generative models. The second, directed information graphs, is motivated by Granger causality. We show that under mild assumptions, the graphs are identical. In fact, these are analogous to Bayesian and Markov networks respectively, in terms of Markov blankets and I-map properties. Furthermore, the underlying variable dependence structure is the unique causal Bayesian network. Lastly, we present a method using minimal-dimension statistics to identify the structure when upper bounds on the in-degrees are known. Simulations show the effectiveness of the approach."
to:NB
graphical_models
to_read
re:functional_communities
causality
information_theory
coleman.todd
6 weeks ago by cshalizi
[1203.0697] Learning High-Dimensional Mixtures of Graphical Models
7 weeks ago by cshalizi
"We consider the problem of learning mixtures of discrete graphical models in high dimensions and propose a novel method for estimating the mixture components with provable guarantees. The method proceeds mainly in three stages. In the first stage, it estimates the union of the Markov graphs of the mixture components (referred to as the union graph) via a series of rank tests. It then uses this estimated union graph to compute the mixture components via a spectral decomposition method. The spectral decomposition method was originally proposed for latent class models, and we adapt this method for learning the more general class of graphical model mixtures. In the end, the method produces tree approximations of the mixture components via the Chow-Liu algorithm. Our output is thus a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. When the union graph has sparse node separators, we prove that our method has sample and computational complexities scaling as poly(p, d, r), for an r-component mixture of p-variate graphical models, where d is the cardinality of the sample space of each node variable. We also extend our results to the case when the union graph has sparse local separators, which is a weaker criterion than having sparse exact separators, and when the mixture components are in the regime of correlation decay. The computational and sample complexities of our method for this class are significantly improved, since they involve an upper bound on the cardinality of local separators (as opposed to exact separators). Our results push the realm of tractable model classes for high-dimensional learning, which includes the class of tree mixtures."
in_NB
mixture_models
ensemble_methods
graphical_models
machine_learning
to_read
chow-liu_algorithm
7 weeks ago by cshalizi
[1204.0321] The averaging principle
7 weeks ago by cshalizi
"Typically, models with a heterogeneous property are considerably harder to analyze than the corresponding homogeneous models, in which the heterogeneous property is replaced with its average value. In this study we show that any outcome of a heterogeneous model that satisfies the two properties of emph{differentiability} and emph{interchangibility}, is $O(epsilon^2)$ equivalent to the outcome of the corresponding homogeneous model, where $epsilon$ is the level of heterogeneity. We then use this emph{averaging principle} to obtain new results in queueing theory, game theory (auctions), and social networks (marketing)."
--- The claim in the abstract seems far too general to be true.
to:NB
to_read
macro_from_micro
7 weeks ago by cshalizi
A Kernel Two-Sample Test
7 weeks ago by cshalizi
"We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests."
in_NB
to_read
hilbert_space
kernel_methods
goodness-of-fit
statistics
concentration_of_measure
probability
two-sample_tests
re:network_differences
7 weeks ago by cshalizi
[1203.5351] Activity driven modeling of dynamic networks
8 weeks ago by cshalizi
"Network modeling plays a critical role in identifying statistical regularities and structural principles common to many systems. The large majority of recent modeling approaches are connectivity driven, in the sense that the structural pattern of the network is at the basis of the mechanisms ruling the network formation. Connectivity driven models necessarily provide a time-aggregated representation that may fail to describe the instantaneous and fluctuating dynamics of many networks. We address this challenge by defining the activity potential, a time invariant function characterizing the agents' interactions in real-world networks and constructing an activity driven model capable of encoding the instantaneous time description of the network dynamics. The model provides an explanation of structural features such as the presence of hubs, which simply originate from the heterogeneous activity of agents. Additionally, we find that diffusive processes in highly dynamical networks can be described analytically in terms of the activity potential, allowing a quantitative discussion of the biases induced by the time-aggregated network representation in the analysis of dynamical processes in evolving networks."
to:NB
network_data_analysis
networks
stochastic_processes
markov_models
transaction_networks
to_read
re:stacs
8 weeks ago by cshalizi
[1203.5974] The Concentration and Stability of the Community Detecting Functions on Random Networks
8 weeks ago by cshalizi
"We propose a general form of community detecting functions for finding the communities or the optimal partition of a random network, and examine the concentration and stability of the function values using the bounded difference martingale method. We derive LDP inequalities for both the general case and several specific community detecting functions: modularity, graph bipartitioning and q-Potts community structure. We also discuss the concentration and stability of community detecting functions on different types of random networks: the sparse and non-sparse networks and some examples such as ER and CL networks."
in_NB
to_read
community_discovery
network_data_analysis
statistics
8 weeks ago by cshalizi
[1203.6119] Robustness of Complex Networks: Reaching Consensus Despite Adversaries
8 weeks ago by cshalizi
"We study the problem of reaching consensus in complex networks where each node knows nothing about the overall topology, other than its own neighbors. We assume that there exist a set of malicious or stubborn nodes in the network that do not follow the same dynamics as the rest of the nodes. When the normal nodes act on purely local information, previous work has established that standard graph notions such as connectivity are no longer sufficient to characterize the ability of the non-malicious nodes to reach agreement. Instead, the network must satisfy a property known as robustness. In this paper we investigate the robustness properties of common random graph models for complex networks, including the preferential attachment model, the Erdos-Renyi model, and the geometric random graph model. We show that these models exhibit a thresholding behavior for robustness. In particular, we show that the notions of connectivity and robustness coincide on various random graph models, indicating that purely local knowledge is sufficient when the objective is to reach agreement on an appropriate function of the initial values."
to:NB
to_read
networks
diffusion_of_innovations
re:do-institutions-evolve
8 weeks ago by cshalizi
[1203.6130] Spectral dimensionality reduction for HMMs
8 weeks ago by cshalizi
"Hidden Markov Models (HMMs) can be accurately approximated using co-occurrence frequencies of pairs and triples of observations by using a fast spectral method in contrast to the usual slow methods like EM or Gibbs sampling. We provide a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary. We present an elementary proof giving bounds on the relative accuracy of probability estimates from our model. (Correlaries show our bounds can be weakened to provide either L1 bounds or KL bounds which provide easier direct comparisons to previous work.) Our theorem uses conditions that are checkable from the data, instead of putting conditions on the unobservable Markov transition matrix."
to:NB
to_read
markov_models
statistics
machine_learning
dimension_reduction
re:AoS_project
spectral_clustering
8 weeks ago by cshalizi
[1203.6502] Quantifying causal influences
8 weeks ago by cshalizi
"Common methods of causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution of all these variables, the DAG contains all information about how intervening on one variable would change the distribution of the other n-1 variables. It remains, however, a non-trivial question how to quantify the causal influence of one variable on another one.
Here we propose a measure for causal strength that refers to direct effects and measure the "strength of an arrow" or a set of arrows. It is based on a hypothetical intervention that modifies the joint distribution by cutting the corresponding edge. The causal strength is then the relative entropy distance between the old and the new distribution.
We discuss other measures of causal strength like the average causal effect, transfer entropy and information flow and describe their limitations. We argue that our measure is also more appropriate for time series than the known ones.
Finally, we discuss conceptual problems in defining the strength of indirect effects."
to:NB
to_read
causality
graphical_models
information_theory
statistics
via:ded-maxim
8 weeks ago by cshalizi
Kleijn, van der Vaart: The Bernstein-Von-Mises theorem under misspecification
10 weeks ago by cshalizi
"We prove that the posterior distribution of a parameter in misspecified LAN parametric models can be approximated by a random normal distribution. We derive from this that Bayesian credible sets are not valid confidence sets if the model is misspecified. We obtain the result under conditions that are comparable to those in the well-specified situation: uniform testability against fixed alternatives and sufficient prior mass in neighbourhoods of the point of convergence. The rate of convergence is considered in detail, with special attention for the existence and construction of suitable test sequences. We also give a lemma to exclude testable model subsets which implies a misspecified version of Schwartz’ consistency theorem, establishing weak convergence of the posterior to a measure degenerate at the point at minimal Kullback-Leibler divergence with respect to the true distribution."
to:NB
to_read
bayesian_consistency
statistics
bernstein-von_mises
asymptotics
confidence_sets
van_der_vaart.aad
10 weeks ago by cshalizi
"Neural reuse: A fundamental organizational principle of the brain" (Anderson, 2010)
10 weeks ago by cshalizi
BBS target article.
Abstract: "An emerging class of theories concerning the functional structure of the brain takes the reuse of neural circuitry for various cognitive purposes to be a central organizational principle. According to these theories, it is quite common for neural circuits established for one purpose to be exapted (exploited, recycled, redeployed) during evolution or normal development, and be put to different uses, often without losing their original functions. Neural reuse theories thus differ from the usual understanding of the role of neural plasticity (which is, after all, a kind of reuse) in brain organization along the following lines: According to neural reuse, circuits can continue to acquire new uses after an initial or original function is established; the acquisition of new uses need not involve unusual circumstances such as injury or loss of established function; and the acquisition of a new use need not involve (much) local change to circuit structure (e.g., it might involve only the establishment of functional connections to new neural partners). Thus, neural reuse theories offer a distinct perspective on several topics of general interest, such as: the evolution and development of the brain, including (for instance) the evolutionary-developmental pathway supporting primate tool use and human language; the degree of modularity in brain organization; the degree of localization of cognitive function; and the cortical parcellation problem and the prospects (and proper methods to employ) for function to structure mapping. The idea also has some practical implications in the areas of rehabilitative medicine and machine interface design."
in_NB
to_read
fmri
neuroscience
functional_connectivity
modularity
re:functional_communities
neuropsychology
cognitive_science
10 weeks ago by cshalizi
Kaiser, Lahiri, Nordman: Goodness of fit tests for a class of Markov random field models
10 weeks ago by cshalizi
"This paper develops goodness of fit statistics that can be used to formally assess Markov random field models for spatial data, when the model distributions are discrete or continuous and potentially parametric. Test statistics are formed from generalized spatial residuals which are collected over groups of nonneighboring spatial observations, called concliques. Under a hypothesized Markov model structure, spatial residuals within each conclique are shown to be independent and identically distributed as uniform variables. The information from a series of concliques can be then pooled into goodness of fit statistics. Under some conditions, large sample distributions of these statistics are explicitly derived for testing both simple and composite hypotheses, where the latter involves additional parametric estimation steps. The distributional results are verified through simulation, and a data example illustrates the method for model assessment."
to:NB
to_read
statistics
spatial_statistics
random_fields
goodness-of-fit
hypothesis_testing
re:stacs
markov_models
10 weeks ago by cshalizi
[1203.2268] Friends FTW! Friendship and competition in Halo: Reach
10 weeks ago by cshalizi
"How important are friendships in determining success by individuals and teams in complex competitive environments? By combining a novel data set on the dynamics of millions of ad hoc team-based competitions from the massively multiplayer online first person shooter (MMOFPS) Halo: Reach with ground-truth data on player demographics, play style, psychometrics and friendships derived from an anonymous online survey, we investigate the impact of friendship on performance in such competitive environments. We find that friendships play a fundamental role, leading to both improved individual and team performance---even after controlling for the overall expertise of the team---and increased pro-social behavior. Furthermore, because players structure their in-game activities around opportunities to play with friends, we show that friendships can largely be inferred directly from behavioral time series using common-sense heuristics. Algorithms that leverage the utility of friendships, without needing explicitly labeled (and thus private) data, are thus both possible and will likely improve many aspects of competition prediction and design."
to:NB
kith_and_kin
to_read
social_networks
videogames
networked_life
clauset.aaron
mason.winter
10 weeks ago by cshalizi
[0803.2963] Consistency of cross validation for comparing regression procedures
11 weeks ago by cshalizi
"Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property."
to:NB
statistics
to_read
cross-validation
model_selection
nonparametrics
to_teach:undergrad-ADA
re:stacs
11 weeks ago by cshalizi
[0803.2984] Conditional density estimation in a regression setting
11 weeks ago by cshalizi
"Regression problems are traditionally analyzed via univariate characteristics like the regression function, scale function and marginal density of regression errors. These characteristics are useful and informative whenever the association between the predictor and the response is relatively simple. More detailed information about the association can be provided by the conditional density of the response given the predictor. For the first time in the literature, this article develops the theory of minimax estimation of the conditional density for regression settings with fixed and random designs of predictors, bounded and unbounded responses and a vast set of anisotropic classes of conditional densities. The study of fixed design regression is of special interest and novelty because the known literature is devoted to the case of random predictors. For the aforementioned models, the paper suggests a universal adaptive estimator which (i) matches performance of an oracle that knows both an underlying model and an estimated conditional density; (ii) is sharp minimax over a vast class of anisotropic conditional densities; (iii) is at least rate minimax when the response is independent of the predictor and thus a bivariate conditional density becomes a univariate density; (iv) is adaptive to an underlying design (fixed or random) of predictors."
in_NB
statistics
nonparametrics
regression
density_estimation
minimax
to_read
to_teach:undergrad-ADA
11 weeks ago by cshalizi
Richards, Lee, Schafer, Freeman: Prototype selection for parameter estimation in complex models
11 weeks ago by cshalizi
"Parameter estimation in astrophysics often requires the use of complex physical models. In this paper we study the problem of estimating the parameters that describe star formation history (SFH) in galaxies. Here, high-dimensional spectral data from galaxies are appropriately modeled as linear combinations of physical components, called simple stellar populations (SSPs), plus some nonlinear distortions. Theoretical data for each SSP is produced for a fixed parameter vector via computer modeling. Though the parameters that define each SSP are continuous, optimizing the signal model over a large set of SSPs on a fine parameter grid is computationally infeasible and inefficient. The goal of this study is to estimate the set of parameters that describes the SFH of each galaxy. These target parameters, such as the average ages and chemical compositions of the galaxy’s stellar populations, are derived from the SSP parameters and the component weights in the signal model. Here, we introduce a principled approach of choosing a small basis of SSP prototypes for SFH parameter estimation. The basic idea is to quantize the vector space and effective support of the model components. In addition to greater computational efficiency, we achieve better estimates of the SFH target parameters. In simulations, our proposed quantization method obtains a substantial improvement in estimating the target parameters over the common method of employing a parameter grid. Sparse coding techniques are not appropriate for this problem without proper constraints, while constrained sparse coding methods perform poorly for parameter estimation because their objective is signal reconstruction, not estimation of the target parameters."
to:NB
to_read
statistics
estimation
astronomy
kith_and_kin
lee.ann_b.
schafer.chad
richards.joey
freeman.peter
11 weeks ago by cshalizi
[1203.0738] Avalanche analysis from multi-electrode ensemble recordings in cat, monkey and human cerebral cortex during wakefulness and sleep
11 weeks ago by cshalizi
"Self-organized critical states are found in many natural systems, from earthquakes to forest fires, they have also been found in neural systems, particularly, in neuronal cultures. However, the presence of critical states in the awake brain remains controversial. Here, we compared avalanche analyses performed on different in vivo preparations during wakefulness, slow-wave sleep and REM sleep, in cat parietal cortex (8 electrodes), monkey motor cortex (64/96 electrodes) and human temporal cortex (96 electrodes) in epileptic patients. In neuronal avalanches defined from units (up to 152 single units), the size of avalanches never clearly scaled as power-law, but rather scaled exponentially or displayed intermediate scaling. We also analyzed the dynamics of local field potentials (LFPs) and in particular LFP negative peaks (nLFPs) among the different electrodes (up to 96 sites in temporal cortex or up to 128 sites in adjacent motor and pre-motor cortices). In this case, the avalanches defined from nLFPs displayed power-law scaling in double logarithmic representations, as reported previously in monkey. However, avalanche defined as positive LFP (pLFP) peaks, which are not related to neuronal firing, also displayed apparent power-law scaling. Closer examination of this scaling using the more reliable cumulative distribution function (CDF) and other rigorous statistical measures, did not confirm power-law scaling. The same pattern was seen for cats, monkey and human, as well as for different brain states of wakefulness and sleep. We also tested other alternative distributions. While simple exponentials yielded very good fits of the avalanche dynamics, the "sum of exponentials" provided the best fit to the data. Collectively, these results show no clear evidence for power-law scaling or self-organized critical states in the awake and sleeping brain of mammals, from cat to man."
Impressions from a quick scan: yes, those are not power laws (way too curved), but no, you cannot use R^2 like that --- and in fact we explained why, in that paper you cite. Oy.
to:NB
self-organized_criticality
neuroscience
to_read
heavy_tails
11 weeks ago by cshalizi
[0804.2487] The ergodic decomposition of asymptotically mean stationary random sources
12 weeks ago by cshalizi
"It is demonstrated how to represent asymptotically mean stationary (AMS) random sources with values in standard spaces as mixtures of ergodic AMS sources. This an extension of the well known decomposition of stationary sources which has facilitated the generalization of prominent source coding theorems to arbitrary, not necessarily ergodic, stationary sources. Asymptotic mean stationarity generalizes the definition of stationarity and covers a much larger variety of real-world examples of random sources of practical interest. It is sketched how to obtain source coding and related theorems for arbitrary, not necessarily ergodic, AMS sources, based on the presented ergodic decomposition."
in_NB
ergodic_theory
to_read
re:almost_none
stochastic_processes
12 weeks ago by cshalizi
Global Network Reorganization During Dynamic Adaptations of Bacillus subtilis Metabolism
12 weeks ago by cshalizi
"Adaptation of cells to environmental changes requires dynamic interactions between metabolic and regulatory networks, but studies typically address only one or a few layers of regulation. For nutritional shifts between two preferred carbon sources of Bacillus subtilis, we combined statistical and model-based data analyses of dynamic transcript, protein, and metabolite abundances and promoter activities. Adaptation to malate was rapid and primarily controlled posttranscriptionally compared with the slow, mainly transcriptionally controlled adaptation to glucose that entailed nearly half of the known transcription regulation network. Interactions across multiple levels of regulation were involved in adaptive changes that could also be achieved by controlling single genes. Our analysis suggests that global trade-offs and evolutionary constraints provide incentives to favor complex control programs."
to:NB
to_read
biochemical_networks
adaptive_behavior
experimental_biology
re:network_differences
gene_regulation
12 weeks ago by cshalizi
Periodic stripe formation by a Turing mechanism operating at growth zones in the mammalian palate : Nature Genetics : Nature Publishing Group
12 weeks ago by cshalizi
"We present direct evidence of an activator-inhibitor system in the generation of the regularly spaced transverse ridges of the palate. We show that new ridges, called rugae, that are marked by stripes of expression of Shh (encoding Sonic hedgehog), appear at two growth zones where the space between previously laid rugae increases. However, inter-rugal growth is not absolutely required: new stripes of Shh expression still appeared when growth was inhibited. Furthermore, when a ruga was excised, new Shh expression appeared not at the cut edge but as bifurcating stripes branching from the neighboring stripe of Shh expression, diagnostic of a Turing-type reaction-diffusion mechanism. Genetic and inhibitor experiments identified fibroblast growth factor (FGF) and Shh as components of an activator-inhibitor pair in this system. These findings demonstrate a reaction-diffusion mechanism that is likely to be widely relevant in vertebrate development."
to_read
to:NB
pattern_formation
biology
morphogenesis
reaction-diffusion
turing_mechanism
via:aks
to_teach:complexity-and-inference
re:stacs
experimental_biology
to:blog
12 weeks ago by cshalizi
Higher social class predicts increased unethical behavior
12 weeks ago by cshalizi
"Seven studies using experimental and naturalistic methods reveal that upper-class individuals behave more unethically than lower-class individuals. In studies 1 and 2, upper-class individuals were more likely to break the law while driving, relative to lower-class individuals. In follow-up laboratory studies, upper-class individuals were more likely to exhibit unethical decision-making tendencies (study 3), take valued goods from others (study 4), lie in a negotiation (study 5), cheat to increase their chances of winning a prize (study 6), and endorse unethical behavior at work (study 7) than were lower-class individuals. Mediator and moderator data demonstrated that upper-class individuals’ unethical tendencies are accounted for, in part, by their more favorable attitudes toward greed."
to:NB
to_read
experimental_psychology
moral_psychology
inequality
12 weeks ago by cshalizi
Interactive Diffusion
12 weeks ago by cshalizi
"In this article, the authors focus attention on a poorly understood aspect of contentious politics: the interaction between the transnational diffusion of new forms of protest behavior and police practices in response to them. Studies of diffusion are usually limited to the diffusion of one kind of innovation by one set of actors to another, as in the diffusion of technical innovations from innovators to adopters. But collective action diffusion also produces a parallel and interactive sequence of “public order” reactions. Using the transnational countersummits that emerged around the turn of the century as their source of evidence, the authors focus on the coevolution of protester and police innovations across national boundaries. The authors’ major finding is that the mechanisms that cause protester and police innovations to diffuse are remarkably similar, even though they can combine in different ways at different moments: promotion, the proactive intervention by a sender actor aimed at deliberate diffusion of an innovation; assessment, the analysis of information on past events and their definition as successes or failures, which leads to adaption of the innovation to new sites and situations; and theorization, the location of technical innovations within broader normative and cognitive frameworks. The authors close with a speculative application of their findings to the recent diffusion of protester tactics and regime responses in the Middle East and North Africa."
to:NB
to_read
diffusion_of_innovations
social_movements
arab_spring
re:critique_of_diffusion
via:henry_farrell
12 weeks ago by cshalizi
[1202.3323] A new look at shifting regret
12 weeks ago by cshalizi
"We investigate extensions of well-known online learning algorithms such as fixed-share of Herbster and Warmuth (1998) or the methods proposed by Bousquet and Warmuth (2002). These algorithms use weight sharing schemes to perform as well as the best sequence of experts with a limited number of changes. Here we show, with a common, general, and simpler analysis, that weight sharing in fact achieves much more than what it was designed for. We use it to simultaneously prove new shifting regret bounds for online convex optimization on the simplex in terms of the total variation distance as well as new bounds for the related setting of adaptive regret. Finally, we exhibit the first logarithmic shifting bounds for exp-concave loss functions on the simplex."
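A rough sketch of the weight-sharing idea, assuming the variant of fixed-share that mixes the exponentially updated weights with the uniform distribution. The learning rate `eta`, the share rate `alpha`, and the function name `fixed_share` are illustrative assumptions; the paper analyzes more general sharing schemes than this one.

```python
import numpy as np


def fixed_share(losses, eta=1.0, alpha=0.05):
    """Run a fixed-share forecaster on a (T, N) array of expert losses in [0, 1].

    Returns a (T, N) array whose row t holds the weights used to predict at
    round t, before that round's losses are revealed.
    """
    losses = np.asarray(losses, dtype=float)
    n_rounds, n_experts = losses.shape
    weights = np.full(n_experts, 1.0 / n_experts)
    history = np.empty((n_rounds, n_experts))
    for t in range(n_rounds):
        history[t] = weights
        # Loss update: ordinary exponential weights, then renormalize.
        updated = weights * np.exp(-eta * losses[t])
        updated /= updated.sum()
        # Share update: give every expert a floor of alpha / N so that a
        # previously bad expert can recover quickly after a shift.
        weights = (1.0 - alpha) * updated + alpha / n_experts
    return history


if __name__ == "__main__":
    # Two experts; the better one switches halfway through.
    T = 200
    losses = np.zeros((T, 2))
    losses[: T // 2, 1] = 1.0  # expert 0 is best in the first half
    losses[T // 2 :, 0] = 1.0  # expert 1 is best in the second half
    w = fixed_share(losses)
    print("weights before the shift:", w[T // 2 - 1])
    print("weights at the end:      ", w[-1])
```

Without the share step the weight on the initially bad expert decays geometrically, so recovering after a change takes time proportional to how long that expert was bad; the uniform floor keeps the recovery time short.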
online_learning
to_read
individual_sequence_prediction
non-stationarity
re:growing_ensemble_project
in_NB
low-regret_learning
have_read
12 weeks ago by cshalizi
Universality of Bayesian Predictions
12 weeks ago by cshalizi
"This paper studies the theoretical properties of Bayesian predictions and shows that under minimal conditions we can derive finite sample bounds for the loss incurred using Bayesian predictions under the Kullback-Leibler divergence. In particular, the concept of universality of predictions is discussed and universality is established for Bayesian predictions in a variety of settings. These include predictions under almost arbitrary loss functions, model averaging, predictions in a non-stationary environment and under model misspecification."
in_NB
to_read
statistics
bayesian_consistency
prediction
misspecification
universal_prediction
12 weeks ago by cshalizi
[0809.5032] Identifiability of parameters in latent structure models with many observed variables
12 weeks ago by cshalizi
"While hidden class models of various types arise in many statistical applications, it is often difficult to establish the identifiability of their parameters. Focusing on models in which there is some structure of independence of some of the observed variables conditioned on hidden ones, we demonstrate a general approach for establishing identifiability utilizing algebraic arguments. A theorem of J. Kruskal for a simple latent-class model with finite state space lies at the core of our results, though we apply it to a diverse set of models. These include mixtures of both finite and nonparametric product distributions, hidden Markov models and random graph mixture models, and lead to a number of new results and improvements to old ones. In the parametric setting, this approach indicates that for such models, the classical definition of identifiability is typically too strong. Instead generic identifiability holds, which implies that the set of nonidentifiable parameters has measure zero, so that parameter inference is still meaningful. In particular, this sheds light on the properties of finite mixtures of Bernoulli products, which have been used for decades despite being known to have nonidentifiable parameters. In the nonparametric setting, we again obtain identifiability only when certain restrictions are placed on the distributions that are mixed, but we explicitly describe the conditions."
in_NB
statistics
identifiability
mixture_models
inference_to_latent_objects
re:homophily_and_confounding
to_read
12 weeks ago by cshalizi
[1202.4294] Prediction of quantiles by statistical learning and application to GDP forecasting
february 2012 by cshalizi
"In this paper, we tackle the problem of prediction and confidence intervals for time series using a statistical learning approach and quantile loss functions. In a first time, we show that the Gibbs estimator (also known as Exponentially Weighted aggregate) is able to predict as well as the best predictor in a given family for a wide set of loss functions. In particular, using the quantile loss function of Koenker and Bassett (1978), this allows to build confidence intervals. We apply these results to the problem of prediction and confidence regions for the French Gross Domestic Product (GDP) growth, with promising results."
in_NB
to_read
prediction
confidence_sets
learning_theory
re:your_favorite_dsge_sucks
re:growing_ensemble_project
february 2012 by cshalizi
Henze : A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences
february 2012 by cshalizi
"For independent $d$-variate random samples $X_1, \cdots, X_{n_1}$ i.i.d. $f(x), Y_1, \cdots, Y_{n_2}$ i.i.d. $g(x)$, where the densities $f$ and $g$ are assumed to be continuous a.e., consider the number $T$ of all $k$ nearest neighbor comparisons in which observations and their neighbors belong to the same sample. We show that, if $f = g$ a.e., the limiting (normal) distribution of $T$, as $\min(n_1, n_2) \rightarrow \infty, n_1/(n_1 + n_2) \rightarrow \tau, 0 < \tau < 1$, does not depend on $f$. An omnibus procedure for testing the hypothesis $H_0: f = g$ a.e. is obtained by rejecting $H_0$ for large values of $T$. The result applies to a general distance (generated by a norm on $\mathbb{R}^d$) for determining nearest neighbors, and it generalizes to the multisample situation."
to:NB
to_read
statistics
hypothesis_testing
two-sample_tests
re:AoS_project
february 2012 by cshalizi
“Economic Shocks and Conflict: The (Absence of?) Evidence from Commodity Prices”
february 2012 by cshalizi
"Replication files":
http://www.chrisblattman.com/documents/data/shocks-conflict/Bazzi-Blattman.zip?9d7bd4
to:NB
statistics
to_read
data_analysis
economics
political_economy
war
violence
political_science
blattman.chris
to_teach:undergrad-ADA
february 2012 by cshalizi
[1202.3123] Right-convergence of sparse random graphs
february 2012 by cshalizi
"The paper is devoted to the problem of establishing right-convergence of sparse random graphs. This concerns the convergence of the logarithm of number of homomorphisms from graphs or hyper-graphs $G_N, N\ge 1$ to some target graph $W$. The theory of dense graph convergence, including random dense graphs, is now well understood, but its counterpart for sparse random graphs presents some fundamental difficulties. Phrased in the statistical physics terminology, the issue is the existence of the log-partition function limits, also known as free energy limits, appropriately normalized for the Gibbs distribution associated with $W$. In this paper we prove that the sequence of sparse ER graphs is right-converging when the tensor product associated with the target graph $W$ satisfies certain convexity property. We treat the case of discrete and continuous target graphs $W$. The latter case allows us to prove a special case of Talagrand's recent conjecture (more accurately stated as level III Research Problem 6.7.2 in his recent book), concerning the existence of the limit of the measure of a set obtained from $R^N$ by intersecting it with linearly in $N$ many subsets, generated according to some common probability law.
Our proof is based on the interpolation technique, introduced first by Guerra and Toninelli and developed further in a series of papers. Specifically, Bayati et al establish the right-convergence property for Erdos-Renyi graphs for some special cases of $W$. Most of the results in that paper follow as a special case of our main theorem."
to:NB
to_read
graph_theory
graph_limits
re:smoothing_adjacency_matrices
february 2012 by cshalizi
"Trygve Haavelmo and the Emergence of Causal Calculus" (Judea Pearl, 2011)
february 2012 by cshalizi
"Haavelmo was the first to recognize the capacity of economic models to guide policies. This paper describes some of the barriers that Haavelmo’s ideas have had (and still have) to overcome, and lays out a logical framework for capturing the relationships between theory, data and policy questions. The mathematical tools that emerge from this framework now enable investigators to answer complex policy and counterfactual questions using embarrassingly simple routines, some by mere inspection of the model’s structure. Several such problems are illustrated by examples, including misspecification tests, identification, mediation and introspection."
to:NB
causal_inference
economics
econometrics
haavelmo.trygve
pearl.judea
graphical_models
to_read
february 2012 by cshalizi
Fisher Dynamics in Household Debt: The Case of the U.S. 1929-2011
february 2012 by cshalizi
"We examine the importance of what we term ‘Fisher dynamics’ - the mechanical effects of changes in interest rates, growth rates and inflation rates on debt levels independent of borrowing - for the evolution of household debt in the U.S. over a long time horizon (1929-2011). Adapting a standard decomposition of public debt to household sector debt, we show that these factors have been important in explaining rising debt levels, especially in the past thirty years. We identify and describe three broad regimes in the growth of household debt and several shorter episodes, distinguished by the distinct roles played by Fisher dynamics and borrowing behavior in the evolution of household debt. We then provide some counterfactual trajectories of debt burdens that suggest how important financial changes beginning around 1980 have been in contributing to household debt, independent of any changes in household behavior. Specifically, if average rates of growth, inflation and interest remained the same after 1980 as before 1980, household debt burdens in 2011 would have been roughly the same as they were in the early 1950s, despite the sharp increase in borrowing in the early 2000s. We then discuss the difficulties involved in deleveraging. Under scenarios involving even substantial reductions in household expenditure, returning to debt levels of the 1980s could take decades. If lower private leverage is a condition of acceptable growth, then in the absence of a substantial fall in interest rates relative to growth rates, large-scale debt forgiveness of some form may be unavoidable."
economics
economic_history
mason.joshua_w.
financial_crisis_of_2007--
to_read
february 2012 by cshalizi
[1202.1540] Quantifying the complexity of random Boolean networks
february 2012 by cshalizi
"We study two measures of the complexity of heterogeneous extended systems, taking random Boolean networks as prototypical cases. A measure defined by Shalizi et al. for cellular automata, based on a criterion for optimal statistical prediction [1], does not distinguish between the spatial inhomogeneity of the ordered phase and the dynamical inhomogeneity of the disordered phase. A modification in which complexities of individual nodes are calculated yields vanishing complexity values for networks in the ordered and critical regimes and for highly disordered networks, peaking somewhere in the disordered regime. Individual nodes with high complexity are the ones that pass the most information from the past to the future, a quantity that depends in a nontrivial way on both the Boolean function of a given node and its location within the network."
to:NB
complexity_measures
random_boolean_networks
to_read
february 2012 by cshalizi
Plausibly Exogenous
february 2012 by cshalizi
"Instrumental variable (IV) methods are widely used to identify causal effects in models with endogenous explanatory variables. Often the instrument exclusion restriction that underlies the validity of the usual IV inference is suspect; that is, instruments are only plausibly exogenous. We present practical methods for performing inference while relaxing the exclusion restriction. We illustrate the approaches with empirical examples that examine the effect of 401(k) participation on asset accumulation, price elasticity of demand for margarine, and returns to schooling. We find that inference is informative even with a substantial relaxation of the exclusion restriction in two of the three cases."
to:NB
to_read
causal_inference
regression
statistics
economics
social_science_methodology
instrumental_variables
to_teach:undergrad-ADA
hansen.christian
february 2012 by cshalizi
A Multi-Language Computing Environment for Literate Programming and Reproducible Research
february 2012 by cshalizi
"We present a new computing environment for authoring mixed natural and computer language documents. In this environment a single hierarchically-organized plain text source file may contain a variety of elements such as code in arbitrary programming languages, raw data, links to external resources, project management data, working notes, and text for publication. Code fragments may be executed in situ with graphical, numerical and textual output captured or linked in the file. Export to LATEX, HTML, LATEX beamer, DocBook and other formats permits working reports, presentations and manuscripts for publication to be generated from the file. In addition, functioning pure code files can be automatically extracted from the file. This environment is implemented as an extension to the Emacs text editor and provides a rich set of features for authoring both prose and code, as well as sophisticated project management capabilities."
paper_writing
programming
R
latex
to_read
february 2012 by cshalizi
Social Influence, Binary Decisions and Collective Dynamics
january 2012 by cshalizi
"In this paper we address the general question of how social influence determines collective outcomes for large populations of individuals faced with binary decisions. First, we define conditions under which the behavior of individuals making binary decisions can be described in terms of what we call an influence-response function: a one-dimensional function of the (weighted) number of individuals choosing each of the alternatives. And second, we demonstrate that, under the assumptions of global and anonymous interactions, general knowledge of the influence-response functions is sufficient to compute equilibrium, and even non-equilibrium, properties of the collective dynamics. By enabling us to treat in a consistent manner classes of decisions that have previously been analyzed separately, our framework allows us to find similarities between apparently quite different kinds of decision situations, and conversely to identify important differences between decisions that would otherwise appear very similar."
to:NB
to_read
re:do-institutions-evolve
re:homophily_and_confounding
social_life_of_the_mind
social_influence
herding
watts.duncan
kith_and_kin
january 2012 by cshalizi
[1201.5568] Dynamic trees for streaming and massive data contexts
january 2012 by cshalizi
"Data collection at a massive scale is becoming ubiquitous in a wide variety of settings, from vast offline databases to streaming real-time information. Learning algorithms deployed in such contexts must rely on single-pass inference, where the data history is never revisited. In streaming contexts, learning must also be temporally adaptive to remain up-to-date against unforeseen changes in the data generating mechanism. Although rapidly growing, the online Bayesian inference literature remains challenged by massive data and transient, evolving data streams. Non-parametric modelling techniques can prove particularly ill-suited, as the complexity of the model is allowed to increase with the sample size. In this work, we take steps to overcome these challenges by porting standard streaming techniques, like data discarding and downweighting, into a fully Bayesian framework via the use of informative priors and active learning heuristics. We showcase our methods by augmenting a modern non-parametric modelling framework, dynamic trees, and illustrate its performance on a number of practical examples. The end product is a powerful streaming regression and classification tool, whose performance compares favourably to the state-of-the-art."
to:NB
machine_learning
non-stationarity
statistics
data_mining
to_read
re:growing_ensemble_project
january 2012 by cshalizi
[1201.2334] Universal Estimation of Directed Information
january 2012 by cshalizi
"We propose four approaches to estimating the directed information rate between a pair of jointly stationary ergodic processes with the help of universal probability assignments. The four approaches yield estimators with different merits such as nonnegativity and boundedness. We establish consistency of these estimators in various senses and derive near-optimal rates of convergence in the minimax sense under mild conditions. The estimators carry over directly to estimating other information measures of stationary ergodic processes, such as entropy rate and mutual information rate, and provide alternatives to classical approaches in the existing literature. Guided by the theoretical results, we use context tree weighting as the vehicle for the implementations of the proposed estimators. Experiments on synthetic and real data are presented, demonstrating the potential of the proposed schemes in practice and the efficacy of directed information estimation as a tool for detecting and measuring causality and delay."
in_NB
to_read
information_theory
entropy_estimation
directed_information
stochastic_processes
nonparametrics
statistics
re:AoS_project
january 2012 by cshalizi
The Reductionist Gamble: Open Economy Politics in the Global Economy
january 2012 by cshalizi
"[International political economy] should transition to “third wave” scholarship. This transition is necessary because the approach that dominates current American IPE scholarship, Open Economy Politics (OEP), generates inaccurate knowledge. OEP produces inaccurate knowledge because it studies domestic politics in isolation from international or macro processes. This methodological reductionism is often inappropriate for the phenomena IPE studies because governments inhabit a system. As a result, the political choices that OEP attempts to explain are typically a product of the interplay between domestic politics and macro processes. When OEP omits causally significant macro processes from empirical models, the models yield biased inferences about the domestic political relationships under investigation. Although we tolerated such errors when the gains from OEP were large, these errors are less tolerable now that OEP has matured. Consequently, the field should transition toward research that is non-reductionist (systemic), problem-driven, and pluralistic."
--- I don't see how the issue is _reductionism_ so much as _ignoring interactions_.
to:NB
to_read
re:critique_of_diffusion
social_science_methodology
international_relations
political_economy
via:henry_farrell
january 2012 by cshalizi
Social Movement Organizational Collaboration: Networks of Learning and the Diffusion of Protest Tactics, 1960-1995
january 2012 by cshalizi
"This paper examines the diffusion of protest tactics between social movement organizations (SMOs). Drawing on organizational learning theory, we argue that knowledge about specific tactics diffuses between social movement organizations via their co-engagement in protest events. Using a longitudinal network dataset of organizations and their participation in protest events between 1960 and 1995, we adapt novel methodological techniques for dealing with selection and measurement bias in networks analysis, which comes in two forms—1) the mechanism that renders some organizations more likely to select into collaborations than others, and 2) the notion that tactical diffusion is not a result of collaboration, but rather is an artifact of homophily or some form of indirect learning. We find that collaboration is indeed an important channel of tactical diffusion. We also find that SMOs with broader tactical repertoires are more likely to adopt additional tactics as a result of their collaborations with other SMOs, but only up to a point, beyond which such SMOs are spread too thin. Engaging in more collaborations also makes SMOs both more active transmitters and adopters of novel tactics. Finally, achieving some initial overlap in their respective tactical repertoires facilitates the diffusion of tactics between collaborating SMOs."
-- Andrew and I are cited, but they show no real awareness of the fact that Aral's matching method does nothing about latent homophily, and so their results are still completely exposed to confounding (unless they've got truly well-chosen control variables going into the matching).
to:NB
to_read
sociology
social_movements
diffusion_of_innovations
re:critique_of_diffusion
homophily
january 2012 by cshalizi
Modeling the Change of Paradigm: Non-Bayesian Reactions to Unexpected News (Ortoleva)
january 2012 by cshalizi
"Despite its normative appeal and widespread use, Bayes’ rule has two well-known limitations: first, it does not predict how agents should react to an information to which they assigned probability zero; second, a sizable empirical evidence documents how agents systematically deviate from its prescriptions by overreacting to information to which they assigned a positive but small probability. By replacing Dynamic Consistency with a novel axiom, Dynamic Coherence, we characterize an alternative updating rule that is not subject to these limitations, but at the same time coincides with Bayes’ rule for “normal” events. In particular, we model an agent with a utility function over consequences, a prior over priors ρ, and a threshold. In the first period she chooses the prior that maximizes the prior over priors ρ - à la maximum likelihood. As new information is revealed: if the chosen prior assigns to this information a probability above the threshold, she follows Bayes’ rule and updates it. Otherwise, she goes back to her prior over priors ρ, updates it using Bayes’ rule, and then chooses the new prior that maximizes the updated ρ. We also extend our analysis to the case of ambiguity aversion."
to:NB
to_read
decision_theory
bayesianism
statistics
re:phil-of-bayes_paper
january 2012 by cshalizi
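A rough sketch of the threshold updating rule described in the Ortoleva abstract above, for a finite state space. The names `condition` and `update`, the dictionary representation of priors, and the numbers in the example are all hypothetical. Conditioning the newly chosen prior on the observed event after a "surprise" is an assumption of this sketch; the abstract only says that a new prior is chosen. The case where every prior assigns the event probability zero is not handled.

```python
def condition(prior, event):
    """Bayes' rule: restrict a prior (dict state -> prob) to an event (set of states)."""
    mass = sum(p for s, p in prior.items() if s in event)
    return {s: (p / mass if s in event else 0.0) for s, p in prior.items()}

def update(priors, rho, event, threshold):
    """Return (posterior, index of the prior used) after observing `event`."""
    def prob(i):
        # Probability that prior i assigns to the observed event.
        return sum(p for s, p in priors[i].items() if s in event)

    # First period: pick the prior with the largest weight under rho.
    best = max(range(len(priors)), key=lambda i: rho[i])
    if prob(best) > threshold:
        # "Normal" event: ordinary Bayesian conditioning of the chosen prior.
        return condition(priors[best], event), best
    # Surprise: re-weight rho by each prior's likelihood of the event
    # (normalization does not change the maximizer), then pick the new maximizer.
    weights = [rho[i] * prob(i) for i in range(len(priors))]
    best = max(range(len(priors)), key=lambda i: weights[i])
    # Assumption of this sketch: the new prior is then conditioned on the event.
    return condition(priors[best], event), best

priors = [{"a": 0.98, "b": 0.01, "c": 0.01},
          {"a": 0.10, "b": 0.60, "c": 0.30}]
rho = [0.7, 0.3]
normal_post, normal_idx = update(priors, rho, {"a", "b"}, threshold=0.05)
surprise_post, surprise_idx = update(priors, rho, {"b", "c"}, threshold=0.05)
```

In the example, the event {"a", "b"} is likely under the initially chosen prior, so the agent keeps it; the event {"b", "c"} falls below the threshold, so the agent switches to the second prior.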
Introduction to Online Optimization (Bubeck)
december 2011 by cshalizi
"to_teach" tag: a sudden brainstorm for how to make next year's statistical computing class either unbeatably awesome or an absolute disaster
to:NB
online_learning
regression
individual_sequence_prediction
optimization
machine_learning
learning_theory
via:mraginsky
to_read
to_teach:statcomp
december 2011 by cshalizi
Social selection and peer influence in an online social network
december 2011 by cshalizi
"Disentangling the effects of selection and influence is one of social science's greatest unsolved puzzles: Do people befriend others who are similar to them, or do they become more similar to their friends over time? Recent advances in stochastic actor-based modeling, combined with self-reported data on a popular online social network site, allow us to address this question with a greater degree of precision than has heretofore been possible. Using data on the Facebook activity of a cohort of college students over 4 years, we find that students who share certain tastes in music and in movies, but not in books, are significantly likely to befriend one another. Meanwhile, we find little evidence for the diffusion of tastes among Facebook friends—except for tastes in classical/jazz music. These findings shed light on the mechanisms responsible for observed network homogeneity; provide a statistically rigorous assessment of the coevolution of cultural tastes and social relationships; and suggest important qualifications to our understanding of both homophily and contagion as generic social processes."
It will be interesting to see how they argue this isn't confounded six ways from Sunday.
in_NB
to_read
re:homophily_and_confounding
social_networks
social_influence
homophily
social_media
to_be_shot_after_a_fair_trial
december 2011 by cshalizi
[1112.3914] Robust empirical mean Estimators
december 2011 by cshalizi
"We study robust estimators of the mean of a probability measure $P$, called robust empirical mean estimators. This elementary construction is then used to revisit a problem of aggregation and a problem of estimator selection, extending these methods to not necessarily bounded collections of previous estimators.
We consider then the problem of robust $M$-estimation. We propose a slightly more complicated construction to handle this problem and, as examples of applications, we apply our general approach to least-squares density estimation, to density estimation with Kullback loss and to a non-Gaussian, unbounded, random design and heteroscedastic regression problem.
Finally, we show that our strategy can be used when the data are only assumed to be mixing."
in_NB
to_read
statistics
estimation
statistical_inference_for_stochastic_processes
december 2011 by cshalizi
[1112.3257] Exact Computation of Kullback-Leibler Distance for Hidden Markov Trees and Models
december 2011 by cshalizi
"We suggest new recursive formulas to compute the exact value of the Kullback-Leibler distance (KLD) between two general Hidden Markov Trees (HMTs). For homogeneous HMTs with regular topology, such as homogeneous Hidden Markov Models (HMMs), we obtain a closed-form expression for the KLD when no evidence is given. We generalize our recursive formulas to the case of HMMs conditioned on the observable variables. Our proposed formulas are validated through several numerical examples in which we compare the exact KLD value with Monte Carlo estimations."
to:NB
to_read
re:AoS_project
markov_models
stochastic_processes
information_theory
december 2011 by cshalizi
[1112.3308] Spatial correlations in attribute communities
december 2011 by cshalizi
"Community detection is an important tool for exploring and classifying the properties of large complex networks and should be of great help for spatial networks. Indeed, in addition to their location, nodes in spatial networks can have attributes such as the language for individuals, or any other socio-economical feature that we would like to identify in communities. We discuss in this paper a crucial aspect which was not considered in previous studies which is the possible existence of correlations between space and attributes. Introducing a simple toy model in which both space and node attributes are considered, we discuss the effect of space-attribute correlations on the results of various community detection methods proposed for spatial networks in this paper and in previous studies. When space is irrelevant, our model is equivalent to the stochastic block model which has been shown to display a detectability-non detectability transition. In the regime where space dominates the link formation process, most methods can fail to recover the communities, an effect which is particularly marked when space-attributes correlations are strong. In this latter case, community detection methods which remove the spatial component of the network can miss a large part of the community structure and can lead to incorrect results."
in_NB
to_read
statistics
community_discovery
network_data_analysis
spatial_statistics
december 2011 by cshalizi
An Experimental Study of Homophily in the Adoption of Health Behavior
december 2011 by cshalizi
"How does the composition of a population affect the adoption of health behaviors and innovations? Homophily—similarity of social contacts—can increase dyadic-level influence, but it can also force less healthy individuals to interact primarily with one another, thereby excluding them from interactions with healthier, more influential, early adopters. As a result, an important network-level effect of homophily is that the people who are most in need of a health innovation may be among the least likely to adopt it. Despite the importance of this thesis, confounding factors in observational data have made it difficult to test empirically. We report results from a controlled experimental study on the spread of a health innovation through fixed social networks in which the level of homophily was independently varied. We found that homophily significantly increased overall adoption of a new health behavior, especially among those most in need of it."
in_NB
to_read
social_networks
experimental_sociology
re:homophily_and_confounding
homophily
diffusion_of_innovations
contagion
social_influence
december 2011 by cshalizi
Multistability and Perceptual Inference - Neural Computation - Abstract
december 2011 by cshalizi
"Ambiguous images present a challenge to the visual system: How can uncertainty about the causes of visual inputs be represented when there are multiple equally plausible causes? A Bayesian ideal observer should represent uncertainty in the form of a posterior probability distribution over causes. However, in many real-world situations, computing this distribution is intractable and requires some form of approximation. We argue that the visual system approximates the posterior over underlying causes with a set of samples and that this approximation strategy produces perceptual multistability—stochastic alternation between percepts in consciousness. Under our analysis, multistability arises from a dynamic sample-generating process that explores the posterior through stochastic diffusion, implementing a rational form of approximate Bayesian inference known as Markov chain Monte Carlo (MCMC). We examine in detail the most extensively studied form of multistability, binocular rivalry, showing how a variety of experimental phenomena—gamma-like stochastic switching, patchy percepts, fusion, and traveling waves—can be understood in terms of MCMC sampling over simple graphical models of the underlying perceptual tasks. We conjecture that the stochastic nature of spiking neurons may lend itself to implementing sample-based posterior approximations in the brain."
(Actually, if I was going to try to model this as a Bayesian inference [and why would one _do_ that?], the more natural analogy would seem to be a Berk-style oscillation among equally good, i.e., equally wrong, hypotheses.)
to:NB
to_read
perception
neural_networks
bayesianism
gershman.samuel
vul.edward
tenenbaum.joshua
december 2011 by cshalizi
[1112.1667] Boltzmann's Entropy and Large Deviation Lyapunov Functionals for Closed and Open Macroscopic Systems
december 2011 by cshalizi
"I give a brief overview of the resolution of the apparent problem of reconciling time symmetric microscopic dynamic with time asymmetric equations describing the evolution of macroscopic variables. I then show how the large deviation function of the stationary state of the microscopic system can be used as a Lyapunov function for the macroscopic evolution equations."
to:NB
to_read
statistical_mechanics
non-equilibrium
arrow_of_time
large_deviations
lebowitz.joel
december 2011 by cshalizi
Early Computational Statistics - Journal of Computational and Graphical Statistics - 20(4):811
december 2011 by cshalizi
"I consider the beginnings of computational and empirical statistics, particularly emphasizing the contributions to these by the scientists at Los Alamos National Laboratory during and after World War II. The timeline considered herein begins with preparations for the 1890 U.S. Census and concludes with Tukey’s introduction of the jackknife."
in_NB
to_read
statistics
history_of_mathematics
history_of_statistics
computational_statistics
Hive Plots - Linear Layout for Network Visualization - Visually Interpreting Network Structure and Content Made Possible
december 2011 by cshalizi
Examine carefully. God knows hairballs are not very useful. There's apparently an R package.
in_NB
to_read
network_data_analysis
visual_display_of_quantitative_information
via:dsparks
to_teach:complexity-and-inference
re:stacs
The Ethics of Nudge (Bovens)
december 2011 by cshalizi
"In their recently published book Nudge (2008) Richard H. Thaler and Cass R. Sunstein (T&S) defend a position labelled as ‘libertarian paternalism’. Their thinking appeals to both the right and the left of the political spectrum, as evidenced by the bedfellows they keep on either side of the Atlantic. In the US, they have advised Barack Obama, while, in the UK, they were welcomed with open arms by David Cameron’s camp (Chakrabortty, 2008). I will consider the following questions. What is Nudge? How is it different from social advertisement? Does Nudge induce genuine preference change? Does Nudge build moral character? Is there a moral difference between the use of Nudge as opposed to subliminal images to reach policy objectives? And what are the moral constraints on Nudge?"
to:NB
to_read
moral_philosophy
[1111.5648] Falsification and future performance
december 2011 by cshalizi
"We information-theoretically reformulate two measures of capacity from statistical learning theory: empirical VC-entropy and empirical Rademacher complexity. We show these capacity measures count the number of hypotheses about a dataset that a learning algorithm falsifies when it finds the classifier in its repertoire minimizing empirical risk. It then follows that the future performance of predictors on unseen data is controlled in part by how many hypotheses the learner falsifies. As a corollary we show that empirical VC-entropy quantifies the message length of the true hypothesis in the optimal code of a particular probability distribution, the so-called actual repertoire."
to:NB
to_read
information_theory
learning_theory
falsification
balduzzi.david
Phys. Rev. E 84, 051138 (2011): Anomalous diffusion: Testing ergodicity breaking in experimental data
december 2011 by cshalizi
"Recent advances in single-molecule experiments show that various complex systems display nonergodic behavior. In this paper, we show how to test ergodicity and ergodicity breaking in experimental data. Exploiting the so-called dynamical functional, we introduce a simple test which allows us to verify ergodic properties of a real-life process. The test can be applied to a large family of stationary infinitely divisible processes. We check the performance of the test for various simulated processes and apply it to experimental data describing the motion of mRNA molecules inside live Escherichia coli cells. We show that the data satisfy necessary conditions for mixing and ergodicity. The detailed analysis is presented in the supplementary material."
in_NB
to_read
ergodic_theory
hypothesis_testing
stochastic_processes
statistical_inference_for_stochastic_processes
[1111.6337] Regret Bound by Variation for Online Convex Optimization
december 2011 by cshalizi
"In \citep{Hazan-2008-extract}, the authors showed that the regret of online linear optimization can be bounded by the total variation of the cost vectors. In this paper, we extend this result to general online convex optimization. We first analyze the limitations of the algorithm in \citep{Hazan-2008-extract} when it is applied to online convex optimization. We then present two algorithms for online convex optimization whose regrets are bounded by the variation of cost functions. We finally consider the bandit setting, and present a randomized algorithm for online bandit convex optimization with a variation-based regret bound. We show that the regret bound for online bandit convex optimization is optimal when the variation of cost functions is independent of the number of trials."
in_NB
to_read
re:growing_ensemble_project
learning_theory
individual_sequence_prediction
Prediction-based regularization using data augmented regression - Statistics and Computing, Volume 22, Number 1
december 2011 by cshalizi
"The role of regularization is to control fitted model complexity and variance by penalizing (or constraining) models to be in an area of model space that is deemed reasonable, thus facilitating good predictive performance. This is typically achieved by penalizing a parametric or non-parametric representation of the model. In this paper we advocate instead the use of prior knowledge or expectations about the predictions of models for regularization. This has the twofold advantage of allowing a more intuitive interpretation of penalties and priors and explicitly controlling model extrapolation into relevant regions of the feature space. This second point is especially critical in high-dimensional modeling situations, where the curse of dimensionality implies that new prediction points usually require extrapolation. We demonstrate that prediction-based regularization can, in many cases, be stochastically implemented by simply augmenting the dataset with Monte Carlo pseudo-data. We investigate the range of applicability of this implementation. An asymptotic analysis of the performance of Data Augmented Regression (DAR) in parametric and non-parametric linear regression, and in nearest neighbor regression, clarifies the regularizing behavior of DAR. We apply DAR to simulated and real data, and show that it is able to control the variance of extrapolation, while maintaining, and often improving, predictive accuracy."
in_NB
to_read
statistics
prediction
estimation
hooker.giles
regression
to_teach:undergrad-ADA
to_teach:data-mining
curse_of_dimensionality