cshalizi + to_read   559

[1205.3845] Forecasting with Historical Data or Process Knowledge under Misspecification: A Comparison
"When faced with the task of forecasting a dynamic system, practitioners often have available historical data, knowledge of the system, or a combination of both. While intuition dictates that perfect knowledge of the system should in theory yield perfect forecasting, often knowledge of the system is only partially known, known up to parameters, or known incorrectly. In contrast, forecasting using previous data without any process knowledge might result in accurate prediction for simple systems, but will fail for highly nonlinear and chaotic systems. In this paper, the authors demonstrate how even in chaotic systems, forecasting with historical data is preferable to using process knowledge if this knowledge exhibits certain forms of misspecification. Through an extensive simulation study, a range of misspecification and forecasting scenarios are examined with the goal of gaining an improved understanding of the circumstances under which forecasting from historical data is to be preferred over using process knowledge."
to:NB  to_read  prediction  time_series  misspecification  re:growing_ensemble_project 
8 days ago by cshalizi
Phys. Rev. Lett. 108, 200601 (2012): Number of Relevant Directions in Principal Component Analysis and Wishart Random Matrices
"We compute analytically, for large N, the probability P(N+,N) that a N×N Wishart random matrix has N+ eigenvalues exceeding a threshold Nζ, including its large deviation tails. This probability plays a benchmark role when performing the principal component analysis of a large empirical data set. We find that P(N+,N)≈exp⁡[-βN2ψζ(N+/N)], where β is the Dyson index of the ensemble and ψζ(κ) is a rate function that we compute explicitly in the full range 0≤κ≤1 and for any ζ. The rate function ψζ(κ) displays a quadratic behavior modulated by a logarithmic singularity close to its minimum κ⋆(ζ). This is shown to be a consequence of a phase transition in an associated Coulomb gas problem. The variance Δ(N) of the number of relevant components is also shown to grow universally (independent of ζ) as Δ(N)∼(βπ2)-1ln⁡N for large N."
to:NB  to_read  principal_components  large_deviations  random_matrices  stochastic_processes  high-dimensional_probability  re:g_paper  phase_transitions 
8 days ago by cshalizi
[1205.3703] Generic chaining and the l1-penalty
"We address the choice of the tuning parameter $lambda$ in $ell_1$-penalized M-estimation. Our main concern is models which are highly nonlinear, such as the Gaussian mixture model. The number of parameters $p$ is moreover large, possibly larger than the number of observations $n$. The generic chaining technique of Talagrand[2005] is tailored for this problem. It leads to the choice $lambda asymp sqrt {log p / n}$, as in the standard Lasso procedure (which concerns the linear model and least squares loss)."
to:NB  to_read  statistics  empirical_processes  high-dimensional_statistics  van_de_geer.sara 
11 days ago by cshalizi
Phys. Rev. Lett. 108, 200403 (2012): Time Asymmetry of Probabilities Versus Relativistic Causal Structure: An Arrow of Time
"There is an incompatibility between the symmetries of causal structure in relativity theory and the signaling abilities of probabilistic devices with inputs and outputs: while time reversal in relativity will not introduce the ability to signal between spacelike separated regions, this is not the case for probabilistic devices with spacelike separated input-output pairs. We explicitly describe a nonsignaling device which becomes a perfect signaling device under time reversal, where time reversal can be conceptualized as playing backwards a videotape of an agent manipulating the device. This leads to an arrow of time that is identifiable when studying the correlations of events for spacelike separated regions. Somewhat surprisingly, although the time reversal of Popescu-Rohrlich boxes also allows agents to signal, it does not yield a perfect signaling device. Finally, we realize time reversal using postselection, which could to lead experimental implementation."
to:NB  causality  physics  relativity  arrow_of_time  to_read 
12 days ago by cshalizi
Quantitative patterns of stylistic influence in the evolution of literature
"Literature is a form of expression whose temporal structure, both in content and style, provides a historical record of the evolution of culture. In this work we take on a quantitative analysis of literary style and conduct the first large-scale temporal stylometric study of literature by using the vast holdings in the Project Gutenberg Digital Library corpus. We find temporal stylistic localization among authors through the analysis of the similarity structure in feature vectors derived from content-free word usage, nonhomogeneous decay rates of stylistic influence, and an accelerating rate of decay of influence among modern authors. Within a given time period we also find evidence for stylistic coherence with a given literary topic, such that writers in different fields adopt different literary styles. This study gives quantitative support to the notion of a literary “style of a time” with a strong trend toward increasingly contemporaneous stylistic influence."

It'll be interesting to see how they handle the bias induced by selective retention.
to:NB  to_read  literary_history  text_mining  kith_and_kin  rockmore.dan  krakuer.david 
13 days ago by cshalizi
Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems
"Extracting knowledge and providing insights into complex mechanisms underlying noisy high-dimensional data sets is of utmost importance in many scientific domains. Statistical modeling has become ubiquitous in the analysis of high-dimensional functional data in search of better understanding of cognition mechanisms, in the exploration of large-scale gene regulatory networks in hope of developing drugs for lethal diseases, and in prediction of volatility in stock market in hope of beating the market. Statistical analysis in these high-dimensional data sets is possible only if an estimation procedure exploits hidden structures underlying data.
"This thesis develops flexible estimation procedures with provable theoretical guarantees for uncovering unknown hidden structures underlying data generating process. Of particular interest are procedures that can be used on high-dimensional data sets where the number of samples n much smaller than the ambient dimension p. Learning in high-dimensions is difficult due to the curse of dimensionality, however, the special problem structure makes inference possible. Due to its importance for scientific discovery, we put emphasis on consistent structure recovery throughout the thesis. Particular focus is given to two important problems, semi-parametric estimation of networks and feature selection in multi-task learning."
to_read  network_data_analysis  machine_learning  high-dimensional_statistics  kolar.mladen  kith_and_kin  relational_learning 
25 days ago by cshalizi
Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies
"We present methods able to predict the presence and strength of conditional and unconditional dependencies (correlations) between two variables Y and Z never jointly measured on the same samples, based on multiple data sets measuring a set of common variables. The algorithms are specializations of prior work on learning causal structures from overlapping variable sets. This problem has also been addressed in the field of statistical matching. The proposed methods are applied to a wide range of domains and are shown to accurately predict the presence of thousands of dependencies. Compared against prototypical statistical matching algorithms and within the scope of our experiments, the proposed algorithms make predictions that are better correlated with the sample estimates of the unknown parameters on test data ; this is particularly the case when the number of commonly measured variables is low.
"The enabling idea behind the methods is to induce one or all causal models that are simultaneously consistent with (fit) all available data sets and prior knowledge and reason with them. This allows constraints stemming from causal assumptions (e.g., Causal Markov Condition, Faithfulness) to propagate. Several methods have been developed based on this idea, for which we propose the unifying name Integrative Causal Analysis (INCA). A contrived example is presented demonstrating the theoretical potential to develop more general methods for co-analyzing heterogeneous data sets. The computational experiments with the novel methods provide evidence that causally-inspired assumptions such as Faithfulness often hold to a good degree of approximation in many real systems and could be exploited for statistical inference. Code, scripts, and data are available at www.mensxmachina.org."
to:NB  to_read  causal_inference  graphical_models  to_teach:undergrad-ADA 
25 days ago by cshalizi
[1204.6703] Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation
"Topic models can be seen as a generalization of the clustering problem, in that they posit that observations are generated due to multiple latent factors (e.g. the words in each document are generated as a mixture of several active topics, as opposed to just one). This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic probability vectors (the distributions over words for each topic), when only the words are observed and the corresponding topics are hidden.
"We provide a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of mixture models, including the popular latent Dirichlet allocation (LDA) model. For LDA, the procedure correctly recovers both the topic probability vectors and the prior over the topics, using only trigram statistics (i.e. third order moments, which may be estimated with documents containing just three words). The method, termed Excess Correlation Analysis (ECA), is based on a spectral decomposition of low order moments (third and fourth order) via two singular value decompositions (SVDs). Moreover, the algorithm is scalable since the SVD operations are carried out on k by k matrices, where k is the number of latent factors (e.g. the number of topics), rather than in the d-dimensional observed space (typically d >> k)."

That's a really remarkable claim, and I'd tag it to_be_shot_after_a_fair_trial if it weren't being made by genuinely serious people.
in_NB  to_read  latent_variables  topic_models  text_mining  mixture_models  statistics  machine_learning  cool_if_true  spectral_clustering 
27 days ago by cshalizi
[1204.6265] Statistical inference for dynamical systems: a review
"The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research."
to:NB  to_read  statistical_inference_for_stochastic_processes  dynamical_systems  statistics  time_series  state-space_models  state-space_reconstruction  pillai.natesh  via:ded-maxim 
28 days ago by cshalizi
Graphlets: a Spectral Perspective for Graph Limits
"Graphlets give a spectral approach to graph limits for general graph sequences in a framework that unifies previous disparate approaches for dealing with dense graphs and sparse graphs. We will show that the con- vergence to graphlets under the appropriate spectral distance is equivalent to the convergence using the (normalized) cut distance. We then examine the geometry of graphlets, illustrated by examples of several families of graphlets and, in particular, graphlets with low ranks. We further dis- cuss a number of usages of graphlets, including universal scalable bases, universal embeddings vis heat kernels and the preservation of Cheeger cuts."

ETA: This is so not an easy read. I like what I understand, but I definitely have to make another attack on it.
to:NB  to_read  graph_theory  graph_limits  re:smoothing_adjacency_matrices  re:network_differences  chung.fan  via:alessandro  graph_spectra 
4 weeks ago by cshalizi
Analytic Thinking Promotes Religious Disbelief
"Scientific interest in the cognitive underpinnings of religious belief has grown in recent years. However, to date, little experimental research has focused on the cognitive processes that may promote religious disbelief. The present studies apply a dual-process model of cognitive processing to this problem, testing the hypothesis that analytic processing promotes religious disbelief. Individual differences in the tendency to analytically override initially flawed intuitions in reasoning were associated with increased religious disbelief. Four additional experiments provided evidence of causation, as subtle manipulations known to trigger analytic processing also encouraged religious disbelief. Combined, these studies indicate that analytic processing is one factor (presumably among several) that promotes religious disbelief. Although these findings do not speak directly to conversations about the inherent rationality, value, or truth of religious beliefs, they illuminate one cognitive factor that may influence such discussions."

The part of me which imprinted on _Why I Am Not a Christian_ is chortling. Another part of me, however, is wondering how hard it would be to write "Analytic Thinking Promotes Disbelief in Psychological Studies".
to:NB  to_read  experimental_psychology  cognitive_science  religion 
4 weeks ago by cshalizi
"Network Coevolution and Democracy: A Spatial Econometric Approach" by Aya Kachi
"Regime transitions are contagious according to the diffusion-of-democracy literature: a country's regime is affected by others' through various predefined networks (e.g. geographical proximity), as well as by the country's own political, economic and social attributes (e.g. GDP levels). My account departs from the existing diffusion theory by allowing for countries' self-selection into peer regime networks based on their democracy levels in the past. For example, a country can form stronger dependency ties with countries that demonstrated similar democracy levels in the past (homophily). In the longitudinal setting, the traditional diffusion mechanism with the presence of self-selection generates the "co-evolutionary dynamic" between country networks and democracy levels. With this recursive feedback process between tie formation and democracy levels, it becomes extremely difficult to evaluate empirically how each country's level of democracy is determined, because we need to distinguish the following three processes statistically. First, country-specific attributes determine the level of democracy as in the earliest democratization studies. Second, other states' democracy levels also predict a country's regime as demonstrated in the conventional diffusion studies. Finally with my theory of endogenous network formation, the seeming diffusion effect is partially a consequence of their self-selection into peer networks. A newer spatial econometric model, an "M-STAR + Co-Evolution" model, is one of the first that allows us to test for all of these three dynamics behind democratization. In my first-cut analysis, I find that all three processes indeed exist."

ETA: It's good to recognize the problem exists, but the model used here does not make it go away, and still fails to identify the influence effect (if one exists).
to:NB  to_read  political_science  network_data_analysis  homophily  contagion  re:critique_of_diffusion  democracy 
4 weeks ago by cshalizi
On the Relation Between Encoding and Decoding of Neuronal Spikes
"Neural coding is a field of study that concerns how sensory information is represented in the brain by networks of neurons. The link between external stimulus and neural response can be studied from two parallel points of view. The first, neural encoding, refers to the mapping from stimulus to response. It focuses primarily on understanding how neurons respond to a wide variety of stimuli and constructing models that accurately describe the stimulus-response relationship. Neural decoding refers to the reverse mapping, from response to stimulus, where the challenge is to reconstruct a stimulus from the spikes it evokes. Since neuronal response is stochastic, a one-to-one mapping of stimuli into neural responses does not exist, causing a mismatch between the two viewpoints of neural coding. Here we use these two perspectives to investigate the question of what rate coding is, in the simple setting of a single stationary stimulus parameter and a single stationary spike train represented by a renewal process. We show that when rate codes are defined in terms of encoding, that is, the stimulus parameter is mapped onto the mean firing rate, the rate decoder given by spike counts or the sample mean does not always efficiently decode the rate codes, but it can improve efficiency in reading certain rate codes when correlations within a spike train are taken into account."
to:NB  to_read  neural_coding_and_decoding  kith_and_kin  koyama.shinsuke 
4 weeks ago by cshalizi
Game-powered machine learning
"Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data."

--- This is more than a bit of a stunt, but it points in an interesting direction.
to:NB  to_read  data_mining  collective_cognition  active_learning  tagging  classifiers  re:democratic_cognition 
4 weeks ago by cshalizi
Bai, Li: Statistical analysis of factor models of high dimension
"This paper considers the maximum likelihood estimation of factor models of high dimension, where the number of variables (N) is comparable with or even greater than the number of observations (T). An inferential theory is developed. We establish not only consistency but also the rate of convergence and the limiting distributions. Five different sets of identification conditions are considered. We show that the distributions of the MLE estimators depend on the identification restrictions. Unlike the principal components approach, the maximum likelihood estimator explicitly allows heteroskedasticities, which are jointly estimated with other parameters. Efficiency of MLE relative to the principal components method is also considered."
to:NB  to_read  factor_analysis  statistics  high-dimensional_statistics 
6 weeks ago by cshalizi
[1204.2523] Concept Modeling with Superwords
"In information retrieval, a fundamental goal is to transform a document into concepts that are representative of its content. The term "representative" is in itself challenging to define, and various tasks require different granularities of concepts. In this paper, we aim to model concepts that are sparse over the vocabulary, and that flexibly adapt their content based on other relevant semantic information such as textual structure or associated image features. We explore a Bayesian nonparametric model based on nested beta processes that allows for inferring an unknown number of strictly sparse concepts. The resulting model provides an inherently different representation of concepts than a standard LDA (or HDP) based topic model, and allows for direct incorporation of semantic features. We demonstrate the utility of this representation on multilingual blog data and the Congressional Record."
in_NB  to_read  text_mining  topic_models  fox.emily  guestrin.carlos  kith_and_kin 
6 weeks ago by cshalizi
[1204.2477] A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models
"A simple linear algebraic explanation of the algorithm in "A Spectral Algorithm for Learning Hidden Markov Models" (COLT 2009). Most of the content is in Figure 2; the text just makes everything precise in four nearly-trivial claims."
to:NB  to_read  statistics  markov_models  re:AoS_project  spectral_methods 
6 weeks ago by cshalizi
[0802.4363] Estimating the entropy of binary time series: Methodology, some theory and a simulation study
"Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator.
"**Methodology. Three new entropy estimators are introduced. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters. ** Theory. We prove that, unlike their earlier versions, the two new LZ-based estimators are consistent for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state HMM with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator. ** Simulation. All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. Some conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. (iv) The main drawback of the plug-in method is its computational inefficiency."
in_NB  to_read  entropy_estimation  information_theory  time_series  statistics  kontoyiannis.ioannis  re:stacs 
6 weeks ago by cshalizi
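A rough sketch of the plug-in method mentioned in the entropy-estimation abstract above, assuming one common variant: count overlapping blocks of a fixed length, take the entropy of their empirical distribution, and divide by the block length. The function name `plugin_entropy_rate` and the choice of overlapping blocks are illustrative assumptions here, not details taken from the paper.

```python
import math
from collections import Counter


def plugin_entropy_rate(bits, block_length):
    """Plug-in estimate of the entropy rate, in bits per symbol.

    Assumption: overlapping blocks of `block_length` symbols are counted,
    and the empirical block entropy is divided by the block length.
    """
    n_blocks = len(bits) - block_length + 1
    if block_length < 1 or n_blocks < 1:
        raise ValueError("need 1 <= block_length <= len(bits)")
    # Empirical frequencies of every block that occurs in the series.
    counts = Counter(tuple(bits[i:i + block_length]) for i in range(n_blocks))
    block_entropy = -sum(
        (c / n_blocks) * math.log2(c / n_blocks) for c in counts.values()
    )
    return block_entropy / block_length


if __name__ == "__main__":
    series = [0, 1] * 50
    for k in (1, 2, 4):
        print(k, plugin_entropy_rate(series, k))
```

The estimate is only meaningful when the series is much longer than the number of possible blocks, which is consistent with the abstract's remark that bias is the main source of error.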
Knowledge, Planning, and Markets: A Missing Chapter in the Socialist Calculation Debates (Cambridge Journals Online)
"This paper examines the epistemological arguments about markets and planning that emerged in a series of unpublished exchanges between Hayek and Neurath. The exchanges reveal problems for standard accounts of both the socialist calculation debates and logical empiricism. They also raise questions concerning the sources of ignorance and uncertainty in modern economies, and the role of market and non-market organisations in the distribution and coordination of limited knowledge, which remain relevant to contemporary debates in economics. Hayek had argued that Neurath's work exemplified the errors of rationalism that underpinned the socialist project. In response Neurath highlighted assumptions about the limits of reason and predictability that the two theorists shared and attempted to turn those assumptions back against Hayek in a defence of the possibility of socialist planning. The paper critically compares Neurath's and Hayek's criticisms of rationalism and considers how far Neurath is successful in his attempt to employ Hayek's assumptions against Hayek himself."
to:NB  to_read  markets_as_collective_calculating_devices  neurath.otto  hayek.f.a._von  socialist_calculation_debate  history_of_economics  socialism  logical_positivism 
6 weeks ago by cshalizi
[math/0603130] Nonparametric methods for inference in the presence of instrumental variables
"We suggest two nonparametric approaches, based on kernel methods and orthogonal series to estimating regression functions in the presence of instrumental variables. For the first time in this class of problems, we derive optimal convergence rates, and show that they are attained by particular estimators. In the presence of instrumental variables the relation that identifies the regression function also defines an ill-posed inverse problem, the ``difficulty'' of which depends on eigenvalues of a certain integral operator which is determined by the joint density of endogenous and instrumental variables. We delineate the role played by problem difficulty in determining both the optimal convergence rate and the appropriate choice of smoothing parameter."
to:NB  to_read  regression  statistics  instrumental_variables  nonparametrics  to_teach:undergrad-ADA 
6 weeks ago by cshalizi
Relative Entropy and Exponential Deviation Bounds for General Markov Chains
"We develop explicit, general bounds for the prob- ability that the normalized partial sums of a function of a Markov chain on a general alphabet will exceed the steady-state mean of that function by a given amount. Our bounds combine simple information-theoretic ideas together with techniques from optimization and some fairly elementary tools from analysis. In one direction, we obtain a general bound for the important class of Doeblin chains; this bound is optimal, in the sense that in the special case of independent and identically distributed random variables it essentially reduces to the classical Hoeffding bound. In another direction, motivated by important problems in simulation, we develop a series of bounds in a form which is particularly suited to these problems, and which apply to the more general class of “geometrically ergodic” Markov chains."
to:NB  to_read  deviation_bounds  markov_models  stochastic_processes  via:ded-maxim  meyn.sean  kontoyiannis.ioannis  mixing  information_theory 
6 weeks ago by cshalizi
[1204.2003] Directed Information Graphs
"We propose two graphical models to represent a concise description of the causal statistical dependence structure between a group of coupled stochastic processes. The first, minimum generative model graphs, is motivated by generative models. The second, directed information graphs, is motivated by Granger causality. We show that under mild assumptions, the graphs are identical. In fact, these are analogous to Bayesian and Markov networks respectively, in terms of Markov blankets and I-map properties. Furthermore, the underlying variable dependence structure is the unique causal Bayesian network. Lastly, we present a method using minimal-dimension statistics to identify the structure when upper bounds on the in-degrees are known. Simulations show the effectiveness of the approach."
to:NB  graphical_models  to_read  re:functional_communities  causality  information_theory  coleman.todd 
6 weeks ago by cshalizi
[1203.0697] Learning High-Dimensional Mixtures of Graphical Models
"We consider the problem of learning mixtures of discrete graphical models in high dimensions and propose a novel method for estimating the mixture components with provable guarantees. The method proceeds mainly in three stages. In the first stage, it estimates the union of the Markov graphs of the mixture components (referred to as the union graph) via a series of rank tests. It then uses this estimated union graph to compute the mixture components via a spectral decomposition method. The spectral decomposition method was originally proposed for latent class models, and we adapt this method for learning the more general class of graphical model mixtures. In the end, the method produces tree approximations of the mixture components via the Chow-Liu algorithm. Our output is thus a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. When the union graph has sparse node separators, we prove that our method has sample and computational complexities scaling as poly(p, d, r), for an r-component mixture of p-variate graphical models, where d is the cardinality of the sample space of each node variable. We also extend our results to the case when the union graph has sparse local separators, which is a weaker criterion than having sparse exact separators, and when the mixture components are in the regime of correlation decay. The computational and sample complexities of our method for this class are significantly improved, since they involve an upper bound on the cardinality of local separators (as opposed to exact separators). Our results push the realm of tractable model classes for high-dimensional learning, which includes the class of tree mixtures."
in_NB  mixture_models  ensemble_methods  graphical_models  machine_learning  to_read  chow-liu_algorithm 
7 weeks ago by cshalizi
[1204.0321] The averaging principle
"Typically, models with a heterogeneous property are considerably harder to analyze than the corresponding homogeneous models, in which the heterogeneous property is replaced with its average value. In this study we show that any outcome of a heterogeneous model that satisfies the two properties of emph{differentiability} and emph{interchangibility}, is $O(epsilon^2)$ equivalent to the outcome of the corresponding homogeneous model, where $epsilon$ is the level of heterogeneity. We then use this emph{averaging principle} to obtain new results in queueing theory, game theory (auctions), and social networks (marketing)."

--- The claim in the abstract seems far too general to be true.
to:NB  to_read  macro_from_micro 
7 weeks ago by cshalizi
A Kernel Two-Sample Test
"We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests."
in_NB  to_read  hilbert_space  kernel_methods  goodness-of-fit  statistics  concentration_of_measure  probability  two-sample_tests  re:network_differences 
7 weeks ago by cshalizi
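A minimal sketch of the quadratic-time MMD statistic described in the kernel two-sample abstract above. It assumes a Gaussian kernel with a hand-picked bandwidth and uses the biased (V-statistic) estimate of the squared MMD. The function names and the default bandwidth are hypothetical choices for illustration, and no test threshold is computed.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples xs and ys.

    Averages the kernel over all pairs within and across the two samples,
    so the cost is quadratic in the total sample size.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    rng = random.Random(0)
    sample_a = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
    sample_b = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
    shifted = [[rng.gauss(2.0, 1.0)] for _ in range(100)]
    print("same distribution:   ", mmd_squared_biased(sample_a, sample_b))
    print("shifted distribution:", mmd_squared_biased(sample_a, shifted))
```

Turning this statistic into an actual test still requires a rejection threshold, for which the abstract points to large deviation bounds and the asymptotic distribution.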
[1203.5351] Activity driven modeling of dynamic networks
"Network modeling plays a critical role in identifying statistical regularities and structural principles common to many systems. The large majority of recent modeling approaches are connectivity driven, in the sense that the structural pattern of the network is at the basis of the mechanisms ruling the network formation. Connectivity driven models necessarily provide a time-aggregated representation that may fail to describe the instantaneous and fluctuating dynamics of many networks. We address this challenge by defining the activity potential, a time invariant function characterizing the agents' interactions in real-world networks and constructing an activity driven model capable of encoding the instantaneous time description of the network dynamics. The model provides an explanation of structural features such as the presence of hubs, which simply originate from the heterogeneous activity of agents. Additionally, we find that diffusive processes in highly dynamical networks can be described analytically in terms of the activity potential, allowing a quantitative discussion of the biases induced by the time-aggregated network representation in the analysis of dynamical processes in evolving networks."
to:NB  network_data_analysis  networks  stochastic_processes  markov_models  transaction_networks  to_read  re:stacs 
8 weeks ago by cshalizi
[1203.5974] The Concentration and Stability of the Community Detecting Functions on Random Networks
"We propose a general form of community detecting functions for finding the communities or the optimal partition of a random network, and examine the concentration and stability of the function values using the bounded difference martingale method. We derive LDP inequalities for both the general case and several specific community detecting functions: modularity, graph bipartitioning and q-Potts community structure. We also discuss the concentration and stability of community detecting functions on different types of random networks: the sparse and non-sparse networks and some examples such as ER and CL networks."
in_NB  to_read  community_discovery  network_data_analysis  statistics 
8 weeks ago by cshalizi
[1203.6119] Robustness of Complex Networks: Reaching Consensus Despite Adversaries
"We study the problem of reaching consensus in complex networks where each node knows nothing about the overall topology, other than its own neighbors. We assume that there exists a set of malicious or stubborn nodes in the network that do not follow the same dynamics as the rest of the nodes. When the normal nodes act on purely local information, previous work has established that standard graph notions such as connectivity are no longer sufficient to characterize the ability of the non-malicious nodes to reach agreement. Instead, the network must satisfy a property known as robustness. In this paper we investigate the robustness properties of common random graph models for complex networks, including the preferential attachment model, the Erdos-Renyi model, and the geometric random graph model. We show that these models exhibit a thresholding behavior for robustness. In particular, we show that the notions of connectivity and robustness coincide on various random graph models, indicating that purely local knowledge is sufficient when the objective is to reach agreement on an appropriate function of the initial values."
to:NB  to_read  networks  diffusion_of_innovations  re:do-institutions-evolve 
8 weeks ago by cshalizi
[1203.6130] Spectral dimensionality reduction for HMMs
"Hidden Markov Models (HMMs) can be accurately approximated using co-occurrence frequencies of pairs and triples of observations by using a fast spectral method in contrast to the usual slow methods like EM or Gibbs sampling. We provide a new spectral method which significantly reduces the number of model parameters that need to be estimated, and generates a sample complexity that does not depend on the size of the observation vocabulary. We present an elementary proof giving bounds on the relative accuracy of probability estimates from our model. (Corollaries show our bounds can be weakened to provide either L1 bounds or KL bounds which provide easier direct comparisons to previous work.) Our theorem uses conditions that are checkable from the data, instead of putting conditions on the unobservable Markov transition matrix."
to:NB  to_read  markov_models  statistics  machine_learning  dimension_reduction  re:AoS_project  spectral_clustering 
8 weeks ago by cshalizi
[1203.6502] Quantifying causal influences
"Common methods of causal inference generate directed acyclic graphs (DAGs) that formalize causal relations between n variables. Given the joint distribution of all these variables, the DAG contains all information about how intervening on one variable would change the distribution of the other n-1 variables. It remains, however, a non-trivial question how to quantify the causal influence of one variable on another one.
Here we propose a measure for causal strength that refers to direct effects and measures the "strength of an arrow" or a set of arrows. It is based on a hypothetical intervention that modifies the joint distribution by cutting the corresponding edge. The causal strength is then the relative entropy distance between the old and the new distribution.
We discuss other measures of causal strength like the average causal effect, transfer entropy and information flow and describe their limitations. We argue that our measure is also more appropriate for time series than the known ones.
Finally, we discuss conceptual problems in defining the strength of indirect effects."
to:NB  to_read  causality  graphical_models  information_theory  statistics  via:ded-maxim 
8 weeks ago by cshalizi
Kleijn , van der Vaart : The Bernstein-Von-Mises theorem under misspecification
"We prove that the posterior distribution of a parameter in misspecified LAN parametric models can be approximated by a random normal distribution. We derive from this that Bayesian credible sets are not valid confidence sets if the model is misspecified. We obtain the result under conditions that are comparable to those in the well-specified situation: uniform testability against fixed alternatives and sufficient prior mass in neighbourhoods of the point of convergence. The rate of convergence is considered in detail, with special attention for the existence and construction of suitable test sequences. We also give a lemma to exclude testable model subsets which implies a misspecified version of Schwartz’ consistency theorem, establishing weak convergence of the posterior to a measure degenerate at the point at minimal Kullback-Leibler divergence with respect to the true distribution."
to:NB  to_read  bayesian_consistency  statistics  bernstein-von_mises  asymptotics  confidence_sets  van_der_vaart.aad 
10 weeks ago by cshalizi
"Neural reuse: A fundamental organizational principle of the brain" (Anderson, 2010)
BBS target article.
Abstract: "An emerging class of theories concerning the functional structure of the brain takes the reuse of neural circuitry for various cognitive purposes to be a central organizational principle. According to these theories, it is quite common for neural circuits established for one purpose to be exapted (exploited, recycled, redeployed) during evolution or normal development, and be put to different uses, often without losing their original functions. Neural reuse theories thus differ from the usual understanding of the role of neural plasticity (which is, after all, a kind of reuse) in brain organization along the following lines: According to neural reuse, circuits can continue to acquire new uses after an initial or original function is established; the acquisition of new uses need not involve unusual circumstances such as injury or loss of established function; and the acquisition of a new use need not involve (much) local change to circuit structure (e.g., it might involve only the establishment of functional connections to new neural partners). Thus, neural reuse theories offer a distinct perspective on several topics of general interest, such as: the evolution and development of the brain, including (for instance) the evolutionary-developmental pathway supporting primate tool use and human language; the degree of modularity in brain organization; the degree of localization of cognitive function; and the cortical parcellation problem and the prospects (and proper methods to employ) for function to structure mapping. The idea also has some practical implications in the areas of rehabilitative medicine and machine interface design."
in_NB  to_read  fmri  neuroscience  functional_connectivity  modularity  re:functional_communities  neuropsychology  cognitive_science 
10 weeks ago by cshalizi
Kaiser , Lahiri , Nordman : Goodness of fit tests for a class of Markov random field models
"This paper develops goodness of fit statistics that can be used to formally assess Markov random field models for spatial data, when the model distributions are discrete or continuous and potentially parametric. Test statistics are formed from generalized spatial residuals which are collected over groups of nonneighboring spatial observations, called concliques. Under a hypothesized Markov model structure, spatial residuals within each conclique are shown to be independent and identically distributed as uniform variables. The information from a series of concliques can be then pooled into goodness of fit statistics. Under some conditions, large sample distributions of these statistics are explicitly derived for testing both simple and composite hypotheses, where the latter involves additional parametric estimation steps. The distributional results are verified through simulation, and a data example illustrates the method for model assessment."
to:NB  to_read  statistics  spatial_statistics  random_fields  goodness-of-fit  hypothesis_testing  re:stacs  markov_models 
10 weeks ago by cshalizi
[1203.2268] Friends FTW! Friendship and competition in Halo: Reach
"How important are friendships in determining success by individuals and teams in complex competitive environments? By combining a novel data set on the dynamics of millions of ad hoc team-based competitions from the massively multiplayer online first person shooter (MMOFPS) Halo: Reach with ground-truth data on player demographics, play style, psychometrics and friendships derived from an anonymous online survey, we investigate the impact of friendship on performance in such competitive environments. We find that friendships play a fundamental role, leading to both improved individual and team performance---even after controlling for the overall expertise of the team---and increased pro-social behavior. Furthermore, because players structure their in-game activities around opportunities to play with friends, we show that friendships can largely be inferred directly from behavioral time series using common-sense heuristics. Algorithms that leverage the utility of friendships, without needing explicitly labeled (and thus private) data, are thus both possible and will likely improve many aspects of competition prediction and design."
to:NB  kith_and_kin  to_read  social_networks  videogames  networked_life  clauset.aaron  mason.winter 
10 weeks ago by cshalizi
[0803.2963] Consistency of cross validation for comparing regression procedures
"Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property."
to:NB  statistics  to_read  cross-validation  model_selection  nonparametrics  to_teach:undergrad-ADA  re:stacs 
11 weeks ago by cshalizi
[0803.2984] Conditional density estimation in a regression setting
"Regression problems are traditionally analyzed via univariate characteristics like the regression function, scale function and marginal density of regression errors. These characteristics are useful and informative whenever the association between the predictor and the response is relatively simple. More detailed information about the association can be provided by the conditional density of the response given the predictor. For the first time in the literature, this article develops the theory of minimax estimation of the conditional density for regression settings with fixed and random designs of predictors, bounded and unbounded responses and a vast set of anisotropic classes of conditional densities. The study of fixed design regression is of special interest and novelty because the known literature is devoted to the case of random predictors. For the aforementioned models, the paper suggests a universal adaptive estimator which (i) matches performance of an oracle that knows both an underlying model and an estimated conditional density; (ii) is sharp minimax over a vast class of anisotropic conditional densities; (iii) is at least rate minimax when the response is independent of the predictor and thus a bivariate conditional density becomes a univariate density; (iv) is adaptive to an underlying design (fixed or random) of predictors."
in_NB  statistics  nonparametrics  regression  density_estimation  minimax  to_read  to_teach:undergrad-ADA 
11 weeks ago by cshalizi
Richards , Lee , Schafer , Freeman : Prototype selection for parameter estimation in complex models
"Parameter estimation in astrophysics often requires the use of complex physical models. In this paper we study the problem of estimating the parameters that describe star formation history (SFH) in galaxies. Here, high-dimensional spectral data from galaxies are appropriately modeled as linear combinations of physical components, called simple stellar populations (SSPs), plus some nonlinear distortions. Theoretical data for each SSP is produced for a fixed parameter vector via computer modeling. Though the parameters that define each SSP are continuous, optimizing the signal model over a large set of SSPs on a fine parameter grid is computationally infeasible and inefficient. The goal of this study is to estimate the set of parameters that describes the SFH of each galaxy. These target parameters, such as the average ages and chemical compositions of the galaxy’s stellar populations, are derived from the SSP parameters and the component weights in the signal model. Here, we introduce a principled approach of choosing a small basis of SSP prototypes for SFH parameter estimation. The basic idea is to quantize the vector space and effective support of the model components. In addition to greater computational efficiency, we achieve better estimates of the SFH target parameters. In simulations, our proposed quantization method obtains a substantial improvement in estimating the target parameters over the common method of employing a parameter grid. Sparse coding techniques are not appropriate for this problem without proper constraints, while constrained sparse coding methods perform poorly for parameter estimation because their objective is signal reconstruction, not estimation of the target parameters."
to:NB  to_read  statistics  estimation  astronomy  kith_and_kin  lee.ann_b.  schafer.chad  richards.joey  freeman.peter 
11 weeks ago by cshalizi
[1203.0738] Avalanche analysis from multi-electrode ensemble recordings in cat, monkey and human cerebral cortex during wakefulness and sleep
"Self-organized critical states are found in many natural systems, from earthquakes to forest fires; they have also been found in neural systems, particularly in neuronal cultures. However, the presence of critical states in the awake brain remains controversial. Here, we compared avalanche analyses performed on different in vivo preparations during wakefulness, slow-wave sleep and REM sleep, in cat parietal cortex (8 electrodes), monkey motor cortex (64/96 electrodes) and human temporal cortex (96 electrodes) in epileptic patients. In neuronal avalanches defined from units (up to 152 single units), the size of avalanches never clearly scaled as a power law, but rather scaled exponentially or displayed intermediate scaling. We also analyzed the dynamics of local field potentials (LFPs) and in particular LFP negative peaks (nLFPs) among the different electrodes (up to 96 sites in temporal cortex or up to 128 sites in adjacent motor and pre-motor cortices). In this case, the avalanches defined from nLFPs displayed power-law scaling in double logarithmic representations, as reported previously in monkey. However, avalanches defined as positive LFP (pLFP) peaks, which are not related to neuronal firing, also displayed apparent power-law scaling. Closer examination of this scaling using the more reliable cumulative distribution function (CDF) and other rigorous statistical measures, did not confirm power-law scaling. The same pattern was seen for cats, monkey and human, as well as for different brain states of wakefulness and sleep. We also tested other alternative distributions. While simple exponentials yielded very good fits of the avalanche dynamics, the "sum of exponentials" provided the best fit to the data. Collectively, these results show no clear evidence for power-law scaling or self-organized critical states in the awake and sleeping brain of mammals, from cat to man."

Impressions from a quick scan: yes, those are not power laws (way too curved), but no, you cannot use R^2 like that --- and in fact we explained why, in that paper you cite. Oy.
to:NB  self-organized_criticality  neuroscience  to_read  heavy_tails 
11 weeks ago by cshalizi
[0804.2487] The ergodic decomposition of asymptotically mean stationary random sources
"It is demonstrated how to represent asymptotically mean stationary (AMS) random sources with values in standard spaces as mixtures of ergodic AMS sources. This is an extension of the well known decomposition of stationary sources which has facilitated the generalization of prominent source coding theorems to arbitrary, not necessarily ergodic, stationary sources. Asymptotic mean stationarity generalizes the definition of stationarity and covers a much larger variety of real-world examples of random sources of practical interest. It is sketched how to obtain source coding and related theorems for arbitrary, not necessarily ergodic, AMS sources, based on the presented ergodic decomposition."
in_NB  ergodic_theory  to_read  re:almost_none  stochastic_processes 
12 weeks ago by cshalizi
Global Network Reorganization During Dynamic Adaptations of Bacillus subtilis Metabolism
"Adaptation of cells to environmental changes requires dynamic interactions between metabolic and regulatory networks, but studies typically address only one or a few layers of regulation. For nutritional shifts between two preferred carbon sources of Bacillus subtilis, we combined statistical and model-based data analyses of dynamic transcript, protein, and metabolite abundances and promoter activities. Adaptation to malate was rapid and primarily controlled posttranscriptionally compared with the slow, mainly transcriptionally controlled adaptation to glucose that entailed nearly half of the known transcription regulation network. Interactions across multiple levels of regulation were involved in adaptive changes that could also be achieved by controlling single genes. Our analysis suggests that global trade-offs and evolutionary constraints provide incentives to favor complex control programs."
to:NB  to_read  biochemical_networks  adaptive_behavior  experimental_biology  re:network_differences  gene_regulation 
12 weeks ago by cshalizi
Periodic stripe formation by a Turing mechanism operating at growth zones in the mammalian palate : Nature Genetics : Nature Publishing Group
"We present direct evidence of an activator-inhibitor system in the generation of the regularly spaced transverse ridges of the palate. We show that new ridges, called rugae, that are marked by stripes of expression of Shh (encoding Sonic hedgehog), appear at two growth zones where the space between previously laid rugae increases. However, inter-rugal growth is not absolutely required: new stripes of Shh expression still appeared when growth was inhibited. Furthermore, when a ruga was excised, new Shh expression appeared not at the cut edge but as bifurcating stripes branching from the neighboring stripe of Shh expression, diagnostic of a Turing-type reaction-diffusion mechanism. Genetic and inhibitor experiments identified fibroblast growth factor (FGF) and Shh as components of an activator-inhibitor pair in this system. These findings demonstrate a reaction-diffusion mechanism that is likely to be widely relevant in vertebrate development."
to_read  to:NB  pattern_formation  biology  morphogenesis  reaction-diffusion  turing_mechanism  via:aks  to_teach:complexity-and-inference  re:stacs  experimental_biology  to:blog 
12 weeks ago by cshalizi
Higher social class predicts increased unethical behavior
"Seven studies using experimental and naturalistic methods reveal that upper-class individuals behave more unethically than lower-class individuals. In studies 1 and 2, upper-class individuals were more likely to break the law while driving, relative to lower-class individuals. In follow-up laboratory studies, upper-class individuals were more likely to exhibit unethical decision-making tendencies (study 3), take valued goods from others (study 4), lie in a negotiation (study 5), cheat to increase their chances of winning a prize (study 6), and endorse unethical behavior at work (study 7) than were lower-class individuals. Mediator and moderator data demonstrated that upper-class individuals’ unethical tendencies are accounted for, in part, by their more favorable attitudes toward greed."
to:NB  to_read  experimental_psychology  moral_psychology  inequality 
12 weeks ago by cshalizi
Interactive Diffusion
"In this article, the authors focus attention on a poorly understood aspect of contentious politics: the interaction between the transnational diffusion of new forms of protest behavior and police practices in response to them. Studies of diffusion are usually limited to the diffusion of one kind of innovation by one set of actors to another, as in the diffusion of technical innovations from innovators to adopters. But collective action diffusion also produces a parallel and interactive sequence of “public order” reactions. Using the transnational countersummits that emerged around the turn of the century as their source of evidence, the authors focus on the coevolution of protester and police innovations across national boundaries. The authors’ major finding is that the mechanisms that cause protester and police innovations to diffuse are remarkably similar, even though they can combine in different ways at different moments: promotion, the proactive intervention by a sender actor aimed at deliberate diffusion of an innovation; assessment, the analysis of information on past events and their definition as successes or failures, which leads to adaption of the innovation to new sites and situations; and theorization, the location of technical innovations within broader normative and cognitive frameworks. The authors close with a speculative application of their findings to the recent diffusion of protester tactics and regime responses in the Middle East and North Africa."
to:NB  to_read  diffusion_of_innovations  social_movements  arab_spring  re:critique_of_diffusion  via:henry_farrell 
12 weeks ago by cshalizi
[1202.3323] A new look at shifting regret
"We investigate extensions of well-known online learning algorithms such as fixed-share of Herbster and Warmuth (1998) or the methods proposed by Bousquet and Warmuth (2002). These algorithms use weight sharing schemes to perform as well as the best sequence of experts with a limited number of changes. Here we show, with a common, general, and simpler analysis, that weight sharing in fact achieves much more than what it was designed for. We use it to simultaneously prove new shifting regret bounds for online convex optimization on the simplex in terms of the total variation distance as well as new bounds for the related setting of adaptive regret. Finally, we exhibit the first logarithmic shifting bounds for exp-concave loss functions on the simplex."
online_learning  to_read  individual_sequence_prediction  non-stationarity  re:growing_ensemble_project  in_NB  low-regret_learning  have_read 
12 weeks ago by cshalizi
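
Note on the shifting-regret entry above: a minimal sketch of one round of a fixed-share style update, in the common textbook form where an exponential-weights step is followed by mixing with the uniform distribution. The function name, the learning rate `eta`, and the sharing rate `alpha` are assumptions made for illustration, and this variant is not necessarily the exact scheme analyzed in the paper.

```python
import math


def fixed_share_update(weights, losses, eta=1.0, alpha=0.1):
    """One round of weight sharing over a finite set of experts.

    weights: current probability vector over the experts
    losses:  loss each expert suffered this round
    eta:     learning rate of the exponential-weights step (assumed value)
    alpha:   fraction of mass redistributed uniformly (assumed value)
    """
    if len(weights) != len(losses):
        raise ValueError("weights and losses must have the same length")
    n = len(weights)
    # Exponential-weights step: penalize each expert by its loss.
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    # Sharing step: keep (1 - alpha) of the normalized weight, spread alpha evenly.
    return [(1.0 - alpha) * v / total + alpha / n for v in scaled]
```

Because every expert retains at least `alpha / n` of the mass after each round, an expert that was poor in the past can recover quickly once it becomes the best one, which is what lets such schemes track a changing sequence of experts.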
Universality of Bayesian Predictions
"This paper studies the theoretical properties of Bayesian predictions and shows that under minimal conditions we can derive finite sample bounds for the loss incurred using Bayesian predictions under the Kullback-Leibler divergence. In particular, the concept of universality of predictions is discussed and universality is established for Bayesian predictions in a variety of settings. These include predictions under almost arbitrary loss functions, model averaging, predictions in a non-stationary environment and under model misspecification."
in_NB  to_read  statistics  bayesian_consistency  prediction  misspecification  universal_prediction 
12 weeks ago by cshalizi
[0809.5032] Identifiability of parameters in latent structure models with many observed variables
"While hidden class models of various types arise in many statistical applications, it is often difficult to establish the identifiability of their parameters. Focusing on models in which there is some structure of independence of some of the observed variables conditioned on hidden ones, we demonstrate a general approach for establishing identifiability utilizing algebraic arguments. A theorem of J. Kruskal for a simple latent-class model with finite state space lies at the core of our results, though we apply it to a diverse set of models. These include mixtures of both finite and nonparametric product distributions, hidden Markov models and random graph mixture models, and lead to a number of new results and improvements to old ones. In the parametric setting, this approach indicates that for such models, the classical definition of identifiability is typically too strong. Instead generic identifiability holds, which implies that the set of nonidentifiable parameters has measure zero, so that parameter inference is still meaningful. In particular, this sheds light on the properties of finite mixtures of Bernoulli products, which have been used for decades despite being known to have nonidentifiable parameters. In the nonparametric setting, we again obtain identifiability only when certain restrictions are placed on the distributions that are mixed, but we explicitly describe the conditions."
in_NB  statistics  identifiability  mixture_models  inference_to_latent_objects  re:homophily_and_confounding  to_read 
12 weeks ago by cshalizi
[1202.4294] Prediction of quantiles by statistical learning and application to GDP forecasting
"In this paper, we tackle the problem of prediction and confidence intervals for time series using a statistical learning approach and quantile loss functions. First, we show that the Gibbs estimator (also known as Exponentially Weighted aggregate) is able to predict as well as the best predictor in a given family for a wide set of loss functions. In particular, using the quantile loss function of Koenker and Bassett (1978), this allows us to build confidence intervals. We apply these results to the problem of prediction and confidence regions for the French Gross Domestic Product (GDP) growth, with promising results."
in_NB  to_read  prediction  confidence_sets  learning_theory  re:your_favorite_dsge_sucks  re:growing_ensemble_project 
february 2012 by cshalizi
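
Note on the quantile-prediction entry above: a minimal sketch of the quantile ("pinball") loss of Koenker and Bassett (1978) that the abstract refers to. The function names are assumptions made for illustration; the Gibbs estimator itself is not implemented here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile loss at level tau for a single observation.

    Under-prediction is charged tau per unit and over-prediction is charged
    (1 - tau) per unit, so the expected loss is minimized by the tau-quantile.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_pinball_loss(ys, preds, tau):
    """Average quantile loss over paired observations and predictions."""
    if len(ys) != len(preds):
        raise ValueError("ys and preds must have the same length")
    return sum(pinball_loss(y, p, tau) for y, p in zip(ys, preds)) / len(ys)
```

Minimizing this loss at two levels, say 0.05 and 0.95, gives the lower and upper ends of a nominal 90% prediction interval, which is the sense in which quantile losses lead to interval forecasts.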
Henze : A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences
"For independent $d$-variate random samples $X_1, \cdots, X_{n_1}$ i.i.d. $f(x), Y_1, \cdots, Y_{n_2}$ i.i.d. $g(x)$, where the densities $f$ and $g$ are assumed to be continuous a.e., consider the number $T$ of all $k$ nearest neighbor comparisons in which observations and their neighbors belong to the same sample. We show that, if $f = g$ a.e., the limiting (normal) distribution of $T$, as $\min(n_1, n_2) \rightarrow \infty, n_1/(n_1 + n_2) \rightarrow \tau, 0 < \tau < 1$, does not depend on $f$. An omnibus procedure for testing the hypothesis $H_0: f = g$ a.e. is obtained by rejecting $H_0$ for large values of $T$. The result applies to a general distance (generated by a norm on $\mathbb{R}^d$) for determining nearest neighbors, and it generalizes to the multisample situation."
to:NB  to_read  statistics  hypothesis_testing  two-sample_tests  re:AoS_project 
february 2012 by cshalizi
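
Note on the Henze entry above: a minimal sketch of the count $T$ described in the abstract, the number of $k$-nearest-neighbor comparisons in the pooled sample where a point and its neighbor come from the same sample. The function names and the choice of Euclidean distance are assumptions made for illustration; the limiting normal distribution needed for a rejection threshold is not computed here.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as tuples of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nn_coincidences(xs, ys, k=1, dist=euclidean):
    """Count same-sample pairs among each pooled point's k nearest neighbors."""
    pooled = [(p, 0) for p in xs] + [(p, 1) for p in ys]
    count = 0
    for i, (point, label) in enumerate(pooled):
        # Distances from this point to every other pooled point, nearest first.
        # Ties are broken by index; with continuous densities they have
        # probability zero.
        ranked = sorted(
            (dist(point, other), j)
            for j, (other, _) in enumerate(pooled)
            if j != i
        )
        count += sum(1 for _, j in ranked[:k] if pooled[j][1] == label)
    return count
```

Well-separated samples push the count toward its maximum of `k * (len(xs) + len(ys))`, while samples from the same distribution mix in the pooled neighborhoods and give smaller counts, which is why large values of $T$ argue against $H_0$.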
[1202.3123] Right-convergence of sparse random graphs
"The paper is devoted to the problem of establishing right-convergence of sparse random graphs. This concerns the convergence of the logarithm of number of homomorphisms from graphs or hyper-graphs $G_N, N \ge 1$ to some target graph $W$. The theory of dense graph convergence, including random dense graphs, is now well understood, but its counterpart for sparse random graphs presents some fundamental difficulties. Phrased in the statistical physics terminology, the issue is the existence of the log-partition function limits, also known as free energy limits, appropriately normalized for the Gibbs distribution associated with $W$. In this paper we prove that the sequence of sparse ER graphs is right-converging when the tensor product associated with the target graph $W$ satisfies certain convexity property. We treat the case of discrete and continuous target graphs $W$. The latter case allows us to prove a special case of Talagrand's recent conjecture (more accurately stated as level III Research Problem 6.7.2 in his recent book), concerning the existence of the limit of the measure of a set obtained from $R^N$ by intersecting it with linearly in $N$ many subsets, generated according to some common probability law.
Our proof is based on the interpolation technique, introduced first by Guerra and Toninelli and developed further in a series of papers. Specifically, Bayati et al establish the right-convergence property for Erdos-Renyi graphs for some special cases of $W$. In this paper, most of the results of that paper follow as a special case of our main theorem."
to:NB  to_read  graph_theory  graph_limits  re:smoothing_adjacency_matrices 
february 2012 by cshalizi
"Trygve Haavelmo and the Emergence of Causal Calculus" (Judea Pearl, 2011)
"Haavelmo was the first to recognize the capacity of economic models to guide policies. This paper describes some of the barriers that Haavelmo’s ideas have had (and still have) to overcome, and lays out a logical framework for capturing the relationships between theory, data and policy questions. The mathematical tools that emerge from this framework now enable investigators to answer complex policy and counterfactual questions using embarrassingly simple routines, some by mere inspection of the model’s structure. Several such problems are illustrated by examples, including misspecification tests, identification, mediation and introspection."
to:NB  causal_inference  economics  econometrics  haavelmo.trygve  pearl.judea  graphical_models  to_read 
february 2012 by cshalizi
Fisher Dynamics in Household Debt: The Case of the U.S. 1929-2011
"We examine the importance of what we term ‘Fisher dynamics’ - the mechanical effects of changes in interest rates, growth rates and inflation rates on debt levels independent of borrowing - for the evolution of household debt in the U.S. over a long time horizon (1929-2011). Adapting a standard decomposition of public debt to household sector debt, we show that these factors have been important in explaining rising debt levels, especially in the past thirty years. We identify and describe three broad regimes in the growth of household debt and several shorter episodes, distinguished by the distinct roles played by Fisher dynamics and borrowing behavior in the evolution of household debt. We then provide some counterfactual trajectories of debt burdens that suggest how important financial changes beginning around 1980 have been in contributing to household debt, independent of any changes in household behavior. Specifically, if average rates of growth, inflation and interest remained the same after 1980 as before 1980, household debt burdens in 2011 would have been roughly the same as they were in the early 1950s, despite the sharp increase in borrowing in the early 2000s. We then discuss the difficulties involved in deleveraging. Under scenarios involving even substantial reductions in household expenditure, returning to debt levels of the 1980s could take decades. If lower private leverage is a condition of acceptable growth, then in the absence of a substantial fall in interest rates relative to growth rates, large-scale debt forgiveness of some form may be unavoidable."
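
-- A reading aid, not taken from the paper: the "standard decomposition of public debt" is usually written as the law of motion below. The symbols are my own labels, and the paper's exact accounting may differ: $d_t$ is the debt-to-income ratio, $b_t$ is new borrowing excluding interest payments on existing debt (the primary deficit) relative to income, and $i_t$, $g_t$, $\pi_t$ are the nominal interest rate, real income growth rate and inflation rate.

```latex
% Exact law of motion for the debt-to-income ratio (assumed notation)
d_{t+1} = d_t \, \frac{1 + i_t}{(1 + g_t)(1 + \pi_t)} + b_{t+1}

% First-order approximation for small rates
\Delta d_{t+1} \approx b_{t+1} + (i_t - g_t - \pi_t) \, d_t
```

The $(i_t - g_t - \pi_t)\, d_t$ term is the purely mechanical part: it moves the ratio even when $b_{t+1} = 0$, which is what the abstract means by effects "independent of borrowing".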
economics  economic_history  mason.joshua_w.  financial_crisis_of_2007--  to_read 
february 2012 by cshalizi
[1202.1540] Quantifying the complexity of random Boolean networks
"We study two measures of the complexity of heterogeneous extended systems, taking random Boolean networks as prototypical cases. A measure defined by Shalizi et al. for cellular automata, based on a criterion for optimal statistical prediction [1], does not distinguish between the spatial inhomogeneity of the ordered phase and the dynamical inhomogeneity of the disordered phase. A modification in which complexities of individual nodes are calculated yields vanishing complexity values for networks in the ordered and critical regimes and for highly disordered networks, peaking somewhere in the disordered regime. Individual nodes with high complexity are the ones that pass the most information from the past to the future, a quantity that depends in a nontrivial way on both the Boolean function of a given node and its location within the network."
to:NB  complexity_measures  random_boolean_networks  to_read 
february 2012 by cshalizi
Plausibly Exogenous
"Instrumental variable (IV) methods are widely used to identify causal effects in models with endogenous explanatory variables. Often the instrument exclusion restriction that underlies the validity of the usual IV inference is suspect; that is, instruments are only plausibly exogenous. We present practical methods for performing inference while relaxing the exclusion restriction. We illustrate the approaches with empirical examples that examine the effect of 401(k) participation on asset accumulation, price elasticity of demand for margarine, and returns to schooling. We find that inference is informative even with a substantial relaxation of the exclusion restriction in two of the three cases."
to:NB  to_read  causal_inference  regression  statistics  economics  social_science_methodology  instrumental_variables  to_teach:undergrad-ADA  hansen.christian 
february 2012 by cshalizi
A Multi-Language Computing Environment for Literate Programming and Reproducible Research
"We present a new computing environment for authoring mixed natural and computer language documents. In this environment a single hierarchically-organized plain text source file may contain a variety of elements such as code in arbitrary programming languages, raw data, links to external resources, project management data, working notes, and text for publication. Code fragments may be executed in situ with graphical, numerical and textual output captured or linked in the file. Export to LATEX, HTML, LATEX beamer, DocBook and other formats permits working reports, presentations and manuscripts for publication to be generated from the file. In addition, functioning pure code files can be automatically extracted from the file. This environment is implemented as an extension to the Emacs text editor and provides a rich set of features for authoring both prose and code, as well as sophisticated project management capabilities."
paper_writing  programming  R  latex  to_read 
february 2012 by cshalizi
Social Influence, Binary Decisions and Collective Dynamics
"In this paper we address the general question of how social influence determines collective outcomes for large populations of individuals faced with binary decisions. First, we define conditions under which the behavior of individuals making binary decisions can be described in terms of what we call an influence-response function: a one-dimensional function of the (weighted) number of individuals choosing each of the alternatives. And second, we demonstrate that, under the assumptions of global and anonymous interactions, general knowledge of the influence-response functions is sufficient to compute equilibrium, and even non-equilibrium, properties of the collective dynamics. By enabling us to treat in a consistent manner classes of decisions that have previously been analyzed separately, our framework allows us to find similarities between apparently quite different kinds of decision situations, and conversely to identify important differences between decisions that would otherwise appear very similar."
to:NB  to_read  re:do-institutions-evolve  re:homophily_and_confounding  social_life_of_the_mind  social_influence  herding  watts.duncan  kith_and_kin 
january 2012 by cshalizi
[1201.5568] Dynamic trees for streaming and massive data contexts
"Data collection at a massive scale is becoming ubiquitous in a wide variety of settings, from vast offline databases to streaming real-time information. Learning algorithms deployed in such contexts must rely on single-pass inference, where the data history is never revisited. In streaming contexts, learning must also be temporally adaptive to remain up-to-date against unforeseen changes in the data generating mechanism. Although rapidly growing, the online Bayesian inference literature remains challenged by massive data and transient, evolving data streams. Non-parametric modelling techniques can prove particularly ill-suited, as the complexity of the model is allowed to increase with the sample size. In this work, we take steps to overcome these challenges by porting standard streaming techniques, like data discarding and downweighting, into a fully Bayesian framework via the use of informative priors and active learning heuristics. We showcase our methods by augmenting a modern non-parametric modelling framework, dynamic trees, and illustrate its performance on a number of practical examples. The end product is a powerful streaming regression and classification tool, whose performance compares favourably to the state-of-the-art."
to:NB  machine_learning  non-stationarity  statistics  data_mining  to_read  re:growing_ensemble_project 
january 2012 by cshalizi
[1201.2334] Universal Estimation of Directed Information
"We propose four approaches to estimating the directed information rate between a pair of jointly stationary ergodic processes with the help of universal probability assignments. The four approaches yield estimators with different merits such as nonnegativity and boundedness. We establish consistency of these estimators in various senses and derive near-optimal rates of convergence in the minimax sense under mild conditions. The estimators carry over directly to estimating other information measures of stationary ergodic processes, such as entropy rate and mutual information rate, and provide alternatives to classical approaches in the existing literature. Guided by the theoretical results, we use context tree weighting as the vehicle for the implementations of the proposed estimators. Experiments on synthetic and real data are presented, demonstrating the potential of the proposed schemes in practice and the efficacy of directed information estimation as a tool for detecting and measuring causality and delay."
in_NB  to_read  information_theory  entropy_estimation  directed_information  stochastic_processes  nonparametrics  statistics  re:AoS_project 
january 2012 by cshalizi
The Reductionist Gamble: Open Economy Politics in the Global Economy
"[International political economy] should transition to “third wave” scholarship. This transition is necessary because the approach that dominates current American IPE scholarship, Open Economy Politics (OEP), generates inaccurate knowledge. OEP produces inaccurate knowledge because it studies domestic politics in isolation from international or macro processes. This methodological reductionism is often inappropriate for the phenomena IPE studies because governments inhabit a system. As a result, the political choices that OEP attempts to explain are typically a product of the interplay between domestic politics and macro processes. When OEP omits causally significant macro processes from empirical models, the models yield biased inferences about the domestic political relationships under investigation. Although we tolerated such errors when the gains from OEP were large, these errors are less tolerable now that OEP has matured. Consequently, the field should transition toward research that is non-reductionist (systemic), problem-driven, and pluralistic."

--- I don't see how the issue is _reductionism_ so much as _ignoring interactions_.
to:NB  to_read  re:critique_of_diffusion  social_science_methodology  international_relations  political_economy  via:henry_farrell 
january 2012 by cshalizi
Social Movement Organizational Collaboration: Networks of Learning and the Diffusion of Protest Tactics, 1960-1995
"This paper examines the diffusion of protest tactics between social movement organizations (SMOs). Drawing on organizational learning theory, we argue that knowledge about specific tactics diffuses between social movement organizations via their co-engagement in protest events. Using a longitudinal network dataset of organizations and their participation in protest events between 1960 and 1995, we adapt novel methodological techniques for dealing with selection and measurement bias in networks analysis, which comes in two forms—1) the mechanism that renders some organizations more likely to select into collaborations than others, and 2) the notion that tactical diffusion is not a result of collaboration, but rather is an artifact of homophily or some form of indirect learning. We find that collaboration is indeed an important channel of tactical diffusion. We also find that SMOs with broader tactical repertoires are more likely to adopt additional tactics as a result of their collaborations with other SMOs, but only up to a point, beyond which such SMOs are spread too thin. Engaging in more collaborations also makes SMOs both more active transmitters and adopters of novel tactics. Finally, achieving some initial overlap in their respective tactical repertoires facilitates the diffusion of tactics between collaborating SMOs."

-- Andrew and I are cited, but they show no real awareness of the fact that Aral's matching method does nothing about latent homophily, and so their results are still completely exposed to confounding (unless they've got truly well-chosen control variables going into the matching).
to:NB  to_read  sociology  social_movements  diffusion_of_innovations  re:critique_of_diffusion  homophily 
january 2012 by cshalizi
Modeling the Change of Paradigm: Non-Bayesian Reactions to Unexpected News (Ortoleva)
"Despite its normative appeal and widespread use, Bayes’ rule has two well-known limitations: first, it does not predict how agents should react to information to which they assigned probability zero; second, sizable empirical evidence documents how agents systematically deviate from its prescriptions by overreacting to information to which they assigned a positive but small probability. By replacing Dynamic Consistency with a novel axiom, Dynamic Coherence, we characterize an alternative updating rule that is not subject to these limitations, but at the same time coincides with Bayes’ rule for “normal” events. In particular, we model an agent with a utility function over consequences, a prior over priors ρ, and a threshold. In the first period she chooses the prior that maximizes the prior over priors ρ - à la maximum likelihood. As new information is revealed: if the chosen prior assigns to this information a probability above the threshold, she follows Bayes’ rule and updates it. Otherwise, she goes back to her prior over priors ρ, updates it using Bayes’ rule, and then chooses the new prior that maximizes the updated ρ. We also extend our analysis to the case of ambiguity aversion."
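
-- A minimal sketch of the updating rule as the abstract describes it, for a finite state space and a finite set of candidate priors. Everything named here (`prob`, `bayes`, `threshold_update`, the toy states and numbers) is hypothetical and mine, not the paper's. Conditioning the newly chosen prior on the event after a switch is my reading of the rule; the abstract does not spell that step out.

```python
from typing import Dict, FrozenSet, List, Tuple

Dist = Dict[str, float]  # maps each state to its probability


def prob(p: Dist, event: FrozenSet[str]) -> float:
    """Probability that distribution p assigns to the event."""
    return sum(p.get(s, 0.0) for s in event)


def bayes(p: Dist, event: FrozenSet[str]) -> Dist:
    """Condition p on the event; undefined if the event has probability zero."""
    z = prob(p, event)
    if z == 0.0:
        raise ValueError("cannot condition on a probability-zero event")
    return {s: (q / z if s in event else 0.0) for s, q in p.items()}


def threshold_update(priors: List[Dist], rho: List[float],
                     event: FrozenSet[str], threshold: float) -> Tuple[Dist, bool]:
    """Return (posterior, switched), where switched marks a change of prior."""
    # First period: pick the prior with the highest weight under rho.
    best = max(range(len(priors)), key=lambda i: rho[i])
    if prob(priors[best], event) > threshold:
        # "Normal" event: ordinary Bayesian updating of the chosen prior.
        return bayes(priors[best], event), False
    # Unexpected event: update rho itself by Bayes' rule, then re-pick.
    weights = [r * prob(p, event) for r, p in zip(rho, priors)]
    if sum(weights) == 0.0:
        raise ValueError("no candidate prior gives the event positive probability")
    best = max(range(len(priors)), key=lambda i: weights[i])
    # Assumption: the newly chosen prior is then conditioned on the event.
    return bayes(priors[best], event), True


# Toy example with made-up numbers.
confident = {"a": 0.98, "b": 0.01, "c": 0.01}
diffuse = {"a": 0.2, "b": 0.4, "c": 0.4}
posterior, switched = threshold_update(
    [confident, diffuse], [0.7, 0.3], frozenset({"b", "c"}), threshold=0.05)
print(switched, posterior)
```

In the toy example the favored prior gives the event {b, c} probability 0.02, below the threshold of 0.05, so the agent falls back on ρ and switches to the diffuse prior. An event such as {a, b} would be handled by plain Bayesian conditioning instead.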
to:NB  to_read  decision_theory  bayesianism  statistics  re:phil-of-bayes_paper 
january 2012 by cshalizi
Introduction to Online Optimization (Bubeck)
The "to_teach" tag reflects a sudden brainstorm for how to make next year's statistical computing class either unbeatably awesome or an absolute disaster.
to:NB  online_learning  regression  individual_sequence_prediction  optimization  machine_learning  learning_theory  via:mraginsky  to_read  to_teach:statcomp 
december 2011 by cshalizi
Social selection and peer influence in an online social network
"Disentangling the effects of selection and influence is one of social science's greatest unsolved puzzles: Do people befriend others who are similar to them, or do they become more similar to their friends over time? Recent advances in stochastic actor-based modeling, combined with self-reported data on a popular online social network site, allow us to address this question with a greater degree of precision than has heretofore been possible. Using data on the Facebook activity of a cohort of college students over 4 years, we find that students who share certain tastes in music and in movies, but not in books, are significantly likely to befriend one another. Meanwhile, we find little evidence for the diffusion of tastes among Facebook friends—except for tastes in classical/jazz music. These findings shed light on the mechanisms responsible for observed network homogeneity; provide a statistically rigorous assessment of the coevolution of cultural tastes and social relationships; and suggest important qualifications to our understanding of both homophily and contagion as generic social processes."

It will be interesting to see how they argue this isn't confounded six ways from Sunday.
in_NB  to_read  re:homophily_and_confounding  social_networks  social_influence  homophily  social_media  to_be_shot_after_a_fair_trial 
december 2011 by cshalizi
[1112.3914] Robust empirical mean Estimators
"We study robust estimators of the mean of a probability measure $P$, called robust empirical mean estimators. This elementary construction is then used to revisit a problem of aggregation and a problem of estimator selection, extending these methods to not necessarily bounded collections of previous estimators.
We consider then the problem of robust $M$-estimation. We propose a slightly more complicated construction to handle this problem and, as examples of applications, we apply our general approach to least-squares density estimation, to density estimation with Kullback loss and to a non-Gaussian, unbounded, random design and heteroscedastic regression problem.
Finally, we show that our strategy can be used when the data are only assumed to be mixing."
in_NB  to_read  statistics  estimation  statistical_inference_for_stochastic_processes 
december 2011 by cshalizi
[1112.3257] Exact Computation of Kullback-Leibler Distance for Hidden Markov Trees and Models
"We suggest new recursive formulas to compute the exact value of the Kullback-Leibler distance (KLD) between two general Hidden Markov Trees (HMTs). For homogeneous HMTs with regular topology, such as homogeneous Hidden Markov Models (HMMs), we obtain a closed-form expression for the KLD when no evidence is given. We generalize our recursive formulas to the case of HMMs conditioned on the observable variables. Our proposed formulas are validated through several numerical examples in which we compare the exact KLD value with Monte Carlo estimations."
to:NB  to_read  re:AoS_project  markov_models  stochastic_processes  information_theory 
december 2011 by cshalizi
[1112.3308] Spatial correlations in attribute communities
"Community detection is an important tool for exploring and classifying the properties of large complex networks and should be of great help for spatial networks. Indeed, in addition to their location, nodes in spatial networks can have attributes such as the language for individuals, or any other socio-economical feature that we would like to identify in communities. We discuss in this paper a crucial aspect which was not considered in previous studies which is the possible existence of correlations between space and attributes. Introducing a simple toy model in which both space and node attributes are considered, we discuss the effect of space-attribute correlations on the results of various community detection methods proposed for spatial networks in this paper and in previous studies. When space is irrelevant, our model is equivalent to the stochastic block model which has been shown to display a detectability-non detectability transition. In the regime where space dominates the link formation process, most methods can fail to recover the communities, an effect which is particularly marked when space-attributes correlations are strong. In this latter case, community detection methods which remove the spatial component of the network can miss a large part of the community structure and can lead to incorrect results."
in_NB  to_read  statistics  community_discovery  network_data_analysis  spatial_statistics 
december 2011 by cshalizi
An Experimental Study of Homophily in the Adoption of Health Behavior
"How does the composition of a population affect the adoption of health behaviors and innovations? Homophily—similarity of social contacts—can increase dyadic-level influence, but it can also force less healthy individuals to interact primarily with one another, thereby excluding them from interactions with healthier, more influential, early adopters. As a result, an important network-level effect of homophily is that the people who are most in need of a health innovation may be among the least likely to adopt it. Despite the importance of this thesis, confounding factors in observational data have made it difficult to test empirically. We report results from a controlled experimental study on the spread of a health innovation through fixed social networks in which the level of homophily was independently varied. We found that homophily significantly increased overall adoption of a new health behavior, especially among those most in need of it."
in_NB  to_read  social_networks  experimental_sociology  re:homophily_and_confounding  homophily  diffusion_of_innovations  contagion  social_influence 
december 2011 by cshalizi
Multistability and Perceptual Inference - Neural Computation - Abstract
"Ambiguous images present a challenge to the visual system: How can uncertainty about the causes of visual inputs be represented when there are multiple equally plausible causes? A Bayesian ideal observer should represent uncertainty in the form of a posterior probability distribution over causes. However, in many real-world situations, computing this distribution is intractable and requires some form of approximation. We argue that the visual system approximates the posterior over underlying causes with a set of samples and that this approximation strategy produces perceptual multistability—stochastic alternation between percepts in consciousness. Under our analysis, multistability arises from a dynamic sample-generating process that explores the posterior through stochastic diffusion, implementing a rational form of approximate Bayesian inference known as Markov chain Monte Carlo (MCMC). We examine in detail the most extensively studied form of multistability, binocular rivalry, showing how a variety of experimental phenomena—gamma-like stochastic switching, patchy percepts, fusion, and traveling waves—can be understood in terms of MCMC sampling over simple graphical models of the underlying perceptual tasks. We conjecture that the stochastic nature of spiking neurons may lend itself to implementing sample-based posterior approximations in the brain."

(Actually, if I was going to try to model this as a Bayesian inference [and why would one _do_ that?], the more natural analogy would seem to be a Berk-style oscillation among equally good, i.e., equally wrong, hypotheses.)
to:NB  to_read  perception  neural_networks  bayesianism  gershman.samuel  vul.edward  tenenbaum.joshua 
december 2011 by cshalizi
[1112.1667] Boltzmann's Entropy and Large Deviation Lyapunov Functionals for Closed and Open Macroscopic Systems
"I give a brief overview of the resolution of the apparent problem of reconciling time symmetric microscopic dynamic with time asymmetric equations describing the evolution of macroscopic variables. I then show how the large deviation function of the stationary state of the microscopic system can be used as a Lyapunov function for the macroscopic evolution equations."
to:NB  to_read  statistical_mechanics  non-equilibrium  arrow_of_time  large_deviations  lebowitz.joel 
december 2011 by cshalizi
Early Computational Statistics - Journal of Computational and Graphical Statistics - 20(4):811
"I consider the beginnings of computational and empirical statistics, particularly emphasizing the contributions to these by the scientists at Los Alamos National Laboratory during and after World War II. The timeline considered herein begins with preparations for the 1890 U.S. Census and concludes with Tukey’s introduction of the jackknife."
in_NB  to_read  statistics  history_of_mathematics  history_of_statistics  computational_statistics 
december 2011 by cshalizi
The Ethics of Nudge (Bovens)
"In their recently published book Nudge (2008) Richard H. Thaler and Cass R. Sunstein (T&S) defend a position labelled as ‘libertarian paternalism’. Their thinking appeals to both the right and the left of the political spectrum, as evidenced by the bedfellows they keep on either side of the Atlantic. In the US, they have advised Barack Obama, while, in the UK, they were welcomed with open arms by David Cameron’s camp (Chakrabortty, 2008). I will consider the following questions. What is Nudge? How is it different from social advertisement? Does Nudge induce genuine preference change? Does Nudge build moral character? Is there a moral difference between the use of Nudge as opposed to subliminal images to reach policy objectives? And what are the moral constraints on Nudge?"
to:NB  to_read  moral_philosophy 
december 2011 by cshalizi
[1111.5648] Falsification and future performance
"We information-theoretically reformulate two measures of capacity from statistical learning theory: empirical VC-entropy and empirical Rademacher complexity. We show these capacity measures count the number of hypotheses about a dataset that a learning algorithm falsifies when it finds the classifier in its repertoire minimizing empirical risk. It then follows from that the future performance of predictors on unseen data is controlled in part by how many hypotheses the learner falsifies. As a corollary we show that empirical VC-entropy quantifies the message length of the true hypothesis in the optimal code of a particular probability distribution, the so-called actual repertoire."
to:NB  to_read  information_theory  learning_theory  falsification  balduzzi.david 
december 2011 by cshalizi
Phys. Rev. E 84, 051138 (2011): Anomalous diffusion: Testing ergodicity breaking in experimental data
"Recent advances in single-molecule experiments show that various complex systems display nonergodic behavior. In this paper, we show how to test ergodicity and ergodicity breaking in experimental data. Exploiting the so-called dynamical functional, we introduce a simple test which allows us to verify ergodic properties of a real-life process. The test can be applied to a large family of stationary infinitely divisible processes. We check the performance of the test for various simulated processes and apply it to experimental data describing the motion of mRNA molecules inside live Escherichia coli cells. We show that the data satisfy necessary conditions for mixing and ergodicity. The detailed analysis is presented in the supplementary material."
in_NB  to_read  ergodic_theory  hypothesis_testing  stochastic_processes  statistical_inference_for_stochastic_processes 
december 2011 by cshalizi
[1111.6337] Regret Bound by Variation for Online Convex Optimization
"In \citep{Hazan-2008-extract}, the authors showed that the regret of online linear optimization can be bounded by the total variation of the cost vectors. In this paper, we extend this result to general online convex optimization. We first analyze the limitations of the algorithm in \citep{Hazan-2008-extract} when applied to online convex optimization. We then present two algorithms for online convex optimization whose regrets are bounded by the variation of cost functions. We finally consider the bandit setting, and present a randomized algorithm for online bandit convex optimization with a variation-based regret bound. We show that the regret bound for online bandit convex optimization is optimal when the variation of cost functions is independent of the number of trials."
in_NB  to_read  re:growing_ensemble_project  learning_theory  individual_sequence_prediction 
december 2011 by cshalizi
Prediction-based regularization using data augmented regression - Statistics and Computing, Volume 22, Number 1
"The role of regularization is to control fitted model complexity and variance by penalizing (or constraining) models to be in an area of model space that is deemed reasonable, thus facilitating good predictive performance. This is typically achieved by penalizing a parametric or non-parametric representation of the model. In this paper we advocate instead the use of prior knowledge or expectations about the predictions of models for regularization. This has the twofold advantage of allowing a more intuitive interpretation of penalties and priors and explicitly controlling model extrapolation into relevant regions of the feature space. This second point is especially critical in high-dimensional modeling situations, where the curse of dimensionality implies that new prediction points usually require extrapolation. We demonstrate that prediction-based regularization can, in many cases, be stochastically implemented by simply augmenting the dataset with Monte Carlo pseudo-data. We investigate the range of applicability of this implementation. An asymptotic analysis of the performance of Data Augmented Regression (DAR) in parametric and non-parametric linear regression, and in nearest neighbor regression, clarifies the regularizing behavior of DAR. We apply DAR to simulated and real data, and show that it is able to control the variance of extrapolation, while maintaining, and often improving, predictive accuracy."
in_NB  to_read  statistics  prediction  estimation  hooker.giles  regression  to_teach:undergrad-ADA  to_teach:data-mining  curse_of_dimensionality 
december 2011 by cshalizi
mason.winter  massart.pascal  mathematics  maxwell.james_clerk  meaning_as_location_in_a_system_of_relations  measurement  measure_theory  mechanism_design  memory  meta-analysis  methodology  method_of_moments  meyn.sean  military_industrial_complex  minimax  misspecification  mixing  mixture_models  model-checking  modeling  model_averaging  model_checking  model_discovery  model_selection  model_uncertainty  moderate_deviations  modularity  molecular_dynamics  monte_carlo  moral_hazard  moral_philosophy  moral_psychology  morphogenesis  morvai.gusztav  moulines.eric  multiple_testing  names  natural_language_processing  nearest_neighbors  neat_nonlinear_nonsense  networked_life  networks  network_data_analysis  network_formation  neural_coding_and_decoding  neural_computation  neural_data_analysis  neural_modeling  neural_networks  neurath.otto  neuropsychology  neuroscience  neville.jennifer  newman.mark  nilsson_jacobi.martin  nominate  non-equilibrium  non-stationarity  nonparametrics  nordhaus.william  norton.john  numeracy  observable_operator_models  ocaml  oligarchy  online_learning  optimization  orbanz.peter  organizations  our_decrepit_institutions  owen.art  p-values  pac-bayesian  pakistan  paper_writing  parenting  particle_filters  pattern_formation  pearl.judea  pedagogy  peer_production  perception  percival.daniel  phase_transitions  philosophy_of_science  photos  physics  physics_of_information  pillai.natesh  pittsburgh  poincare_recurrence  point_processes  political_economy  political_networks  political_science  pollard.david  polletta.francesca  polya.george  porter.mason  pragmatics  pre-validation  prediction  prediction_trees  predictive_state_representations  primo.david  principal_components  privatization  probability  probably_approximately_correct  productivity  programming  progressive_forces  psychology  publication_bias  public_policy  quantum_mechanics  R  radev.dragomir  raginsky.maxim  randal.douc  randomization  
random_boolean_networks  random_fields  random_forests  random_matrices  random_matrix_theory  rashid.ahmed  re:aggregating_random_graphs  re:almost_none  re:AoS_project  re:bayes_as_evol  re:critique_of_diffusion  re:democratic_cognition  re:do-institutions-evolve  re:donor_networks  re:friday_cat-blogging  re:functional_communities  re:growing_ensemble_project  re:g_paper  re:homophily_and_confounding  re:knightian_uncertainty  re:naive-semi-supervised  re:network_differences  re:phil-of-bayes_paper  re:sensor-networks-as-social-networks  re:smoothing_adjacency_matrices  re:social-networks-as-sensor-networks  re:stacs  re:what_is_a_macrostate  re:XV_for_mixing  re:XV_for_networks  re:your_favorite_dsge_sucks  re:your_favorite_ergm_sucks  reaction-diffusion  reciprocity  recurrence_times  recursive_estimation  reductionism  regression  regulation  reichenbach.hans  reinforcement_learning  relational_learning  relativity  religion  renormalization  replicator_dynamics  resampling  rhetoric  richards.joey  riedewald.mirek  rigollet.philippe  rinaldo.alessandro  risk_vs_uncertainty  robins.james  robust_statistics  rockmore.dan  romer.paul  rubin.barnett  runciman.w.g.  running_dogs_of_reaction  ryabko.b._ya.  ryabko.daniil  saddle-point_approximation  sandler.mark  schafer.chad  science_studies  scientific_computing  scientific_revolution  self-fulfilling_prophecy  self-organization  self-organized_criticality  semi-supervised_learning  send_a_note  shrinkage  siddiqi.sajid_m.  
signal_processing  signal_transduction  simulation  smoking  smoothing  socialism  socialist_calculation_debate  social_construction  social_contagion  social_influence  social_learning  social_life_of_the_mind  social_media  social_movements  social_networks  social_norms  social_organization  social_psychology  social_science_methodology  sociology  sociology_of_science  song.le  sorokina.daria  soviet-afghan_war  sparsity  spatial_statistics  spectral_clustering  spectral_estimation  spectral_methods  splines  stability_of_learning  state-building  state-space_models  state-space_reconstruction  stationarity  statistical_inference_for_stochastic_processes  statistical_interaction  statistical_mechanics  statistics  stiglitz.joseph  stochastic_approximation  stochastic_differential_equations  stochastic_models  stochastic_processes  stotz.karola  strategic_interaction  strategic_position_in_networks  stress  structured_data  sufficiency  suhay.liz  support_vector_machines  symbolic_dynamics  synchronization  synchronizing_words  tagging  technological_change  technological_unemployment  tenenbaum.joshua  text_mining  theoretical_computer_science  thermodynamics  thermodynamic_formalism  the_continuing_crises  things_that_should_not_be  tibshirani.robert  time_series  tishby.naftali  tkacik.gasper  tkacik.maureen  to:blog  to:NB  topic_models  total_factor_productivity  to_be_shot_after_a_fair_trial  to_read  to_teach:advanced-stochastic-processes  to_teach:complexity-and-inference  to_teach:data-mining  to_teach:statcomp  to_teach:undergrad-ADA  transaction_networks  turing_mechanism  two-sample_tests  unions  universal_prediction  unsupervised_learning  us  us-iraq_war  ussr  us_military  us_politics  utter_stupidity  vagueness  van_der_vaart.aad  van_de_geer.sara  variable-length_markov_models  variable_selection  variational_methods  verdinelli.isa  via:?  
via:aaron_clauset  via:aks  via:alessandro  via:anoopsarkar  via:ariddell  via:arthegall  via:blattman  via:brad_plumer  via:cris_moore  via:ded-maxim  via:djm1107  via:dsparks  via:fionajay  via:gelman  via:guslacerda  via:henry_farrell  via:jhofman  via:joncgoodwin  via:justin  via:kevin_kelly  via:krugman  via:mason  via:matthew_berryman  via:mberryman  via:merriam  via:mindhacks  via:mraginsky  via:nequitans  via:neuroanthropology  via:nikete  via:orzelc  via:paper_I_refereed_and_can't_tell_you_about  via:rjwaldmann  via:rocha  via:santerre  via:shivak  via:spangledrongo  via:spencer-ackerman  via:vaguery  via:wiggins  videogames  violence  vision  visual_display_of_quantitative_information  von_mises.richard  voter_model  vul.edward  wahba.grace  wainwright.martin  waiting_times  war  wasserman.larry  watts.duncan  weiss.benjamin  whats_gone_wrong_with_america  xing.eric  zhang.jiji  zhang.tong  zhu.ji 
