cshalizi + prediction 100
[1205.3845] Forecasting with Historical Data or Process Knowledge under Misspecification: A Comparison
7 days ago by cshalizi
"When faced with the task of forecasting a dynamic system, practitioners often have available historical data, knowledge of the system, or a combination of both. While intuition dictates that perfect knowledge of the system should in theory yield perfect forecasting, often knowledge of the system is only partially known, known up to parameters, or known incorrectly. In contrast, forecasting using previous data without any process knowledge might result in accurate prediction for simple systems, but will fail for highly nonlinear and chaotic systems. In this paper, the authors demonstrate how even in chaotic systems, forecasting with historical data is preferable to using process knowledge if this knowledge exhibits certain forms of misspecification. Through an extensive simulation study, a range of misspecification and forecasting scenarios are examined with the goal of gaining an improved understanding of the circumstances under which forecasting from historical data is to be preferred over using process knowledge."
to:NB
to_read
prediction
time_series
misspecification
re:growing_ensemble_project
7 days ago by cshalizi
[1205.2609] Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?
13 days ago by cshalizi
"Recent theory work has found that a special type of spatial partition tree - called a random projection tree - is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination of theory and experiments, for a broader class of trees that includes k-d trees, dyadic trees, and PCA trees. Our motivation is to get a feel for (i) the kind of intrinsic low dimensional structure that can be empirically verified, (ii) the extent to which a spatial partition can exploit such structure, and (iii) the implications for standard statistical tasks such as regression, vector quantization, and nearest neighbor search."
to:NB
decision_trees
prediction
regression
statistics
dimension_reduction
machine_learning
13 days ago by cshalizi
[1205.1406] Graph Prediction in a Low-Rank and Autoregressive Setting
19 days ago by cshalizi
"We study the problem of prediction for evolving graph data. We formulate the problem as the minimization of a convex objective encouraging sparsity and low-rank of the solution, that reflect natural graph properties. The convex formulation allows to obtain oracle inequalities and efficient solvers. We provide empirical results for our algorithm and comparison with competing methods, and point out two open questions related to compressed sensing and algebra of low-rank and sparse matrices."
to:NB
network_data_analysis
prediction
statistics
low-rank_approximation
19 days ago by cshalizi
Clarke, Clarke: Prediction in several conventional contexts
20 days ago by cshalizi
"We review predictive techniques from several traditional branches of statistics. Starting with prediction based on the normal model and on the empirical distribution function, we proceed to techniques for various forms of regression and classification. Then, we turn to time series, longitudinal data, and survival analysis. Our focus throughout is on the mechanics of prediction more than on the properties of predictors."
(to_teach tags are tentative.)
to:NB
prediction
statistics
classifiers
regression
to_teach:undergrad-ADA
to_teach:data-mining
20 days ago by cshalizi
Ehm, Gneiting: Local proper scoring rules of order two
21 days ago by cshalizi
"Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if it encourages truthful reporting. It is local of order k if the score depends on the predictive density only through its value and the values of its derivatives of order up to k at the realizing event. Complementing fundamental recent work by Parry, Dawid and Lauritzen, we characterize the local proper scoring rules of order 2 relative to a broad class of Lebesgue densities on the real line, using a different approach. In a data example, we use local and nonlocal proper scoring rules to assess statistically postprocessed ensemble weather forecasts."
to:NB
prediction
scoring_rules
statistics
gneiting.tilmann
21 days ago by cshalizi
Dawid, Lauritzen, Parry: Proper local scoring rules on discrete sample spaces
21 days ago by cshalizi
"A scoring rule is a loss function measuring the quality of a quoted probability distribution Q for a random variable X, in the light of the realized outcome x of X; it is proper if the expected score, under any distribution P for X, is minimized by quoting Q = P. Using the fact that any differentiable proper scoring rule on a finite sample space is the gradient of a concave homogeneous function, we consider when such a rule can be local in the sense of depending only on the probabilities quoted for points in a nominated neighborhood of x. Under mild conditions, we characterize such a proper local scoring rule in terms of a collection of homogeneous functions on the cliques of an undirected graph on the space. A useful property of such rules is that the quoted distribution Q need only be known up to a scale factor. Examples of the use of such scoring rules include Besag’s pseudo-likelihood and Hyvärinen’s method of ratio matching."
to:NB
prediction
scoring_rules
statistics
lauritzen.steffen
dawid.philip
21 days ago by cshalizi
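The propriety condition in the abstract above (the expected score under P is minimized by quoting Q = P) can be checked numerically on a small finite sample space. The sketch below is illustrative only: the function names, the three-point distribution `p`, and the candidate quotes are assumptions made for this example, and the "linear" score is included as a standard textbook example of an improper rule rather than anything taken from the paper. Note that the log score depends on the quote only through `q[x]`, which is the simplest case of locality.

```python
import math

def log_score(q, x):
    """Log loss: depends on the quote q only through q[x] (a local rule)."""
    return -math.log(q[x])

def brier_score(q, x):
    """Quadratic loss: squared distance between q and the indicator of x."""
    return sum((q[i] - (1.0 if i == x else 0.0)) ** 2 for i in range(len(q)))

def linear_score(q, x):
    """Improper example: rewards piling probability on the likeliest outcome."""
    return -q[x]

def expected_score(score, p, q):
    """Expected loss of quoting q when outcomes are actually drawn from p."""
    return sum(p[x] * score(q, x) for x in range(len(p)))

def best_quote(score, p, candidates):
    """Candidate quote with the smallest expected loss under p."""
    return min(candidates, key=lambda q: expected_score(score, p, q))

# Hypothetical true distribution and a few alternative quotes.
p = [0.5, 0.3, 0.2]
candidates = [p, [0.6, 0.3, 0.1], [1 / 3, 1 / 3, 1 / 3], [0.9, 0.05, 0.05]]

for name, score in [("log", log_score), ("brier", brier_score), ("linear", linear_score)]:
    print(name, best_quote(score, p, candidates))
```

Among these candidates, the log and quadratic scores pick out the honest quote `p`, while the linear score prefers the overconfident quote; a finite candidate list is of course only a spot check, not a proof of propriety.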
Parry, Dawid, Lauritzen: Proper local scoring rules
21 days ago by cshalizi
"We investigate proper scoring rules for continuous distributions on the real line. It is known that the log score is the only such rule that depends on the quoted density only through its value at the outcome that materializes. Here we allow further dependence on a finite number m of derivatives of the density at the outcome, and describe a large class of such m-local proper scoring rules: these exist for all even m but no odd m. We further show that for m ≥ 2 all such m-local rules can be computed without knowledge of the normalizing constant of the distribution."
to:NB
prediction
scoring_rules
lauritzen.steffen
dawid.philip
statistics
21 days ago by cshalizi
[1204.6441] "I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper" -- A Balanced Survey on Election Prediction using Twitter Data
24 days ago by cshalizi
"Predicting X from Twitter is a popular fad within the Twitter research subculture. It seems both appealing and relatively easy. Among such kind of studies, electoral prediction is maybe the most attractive, and at this moment there is a growing body of literature on such a topic. This is not only an interesting research problem but, above all, it is extremely difficult. However, most of the authors seem to be more interested in claiming positive results than in providing sound and reproducible methods. It is also especially worrisome that many recent papers seem to only acknowledge those studies supporting the idea of Twitter predicting elections, instead of conducting a balanced literature review showing both sides of the matter. After reading many of such papers I have decided to write such a survey myself. Hence, in this paper, every study relevant to the matter of electoral prediction using social media is commented. From this review it can be concluded that the predictive power of Twitter regarding elections has been greatly exaggerated, and that hard research problems still lie ahead."
to:NB
social_media
data_mining
prediction
have_read
24 days ago by cshalizi
Assessing gross domestic product and inflation probability forecasts derived from Bank of England fan charts - Galbraith - 2011 - Journal of the Royal Statistical Society: Series A (Statistics in Society)
6 weeks ago by cshalizi
"Density forecasts, including the pioneering Bank of England ‘fan charts’, are often used to produce forecast probabilities of a particular event. We use the Bank of England's forecast densities to calculate the forecast probability that annual rates of inflation and output growth exceed given thresholds. We subject these implicit probability forecasts to graphical and numerical diagnostic checks. We measure both their calibration and their resolution, providing both statistical and graphical interpretations of the results. The results reinforce earlier evidence on limitations of these forecasts and provide new evidence on their information content and on the relative performance of inflation and gross domestic product growth forecasts. In particular, gross domestic product forecasts show little or no ability to predict periods of low growth beyond the current quarter, in part because of the important role of data revisions."
to:NB
prediction
statistics
calibration
macroeconomics
to_teach:undergrad-ADA
6 weeks ago by cshalizi
Stock Market Behavior Predicted by Rat Neurons
8 weeks ago by cshalizi
"We here report for the first time, to the best of our knowledge, rat motor cortex neurons predicting the behavior of the American stock market. We implanted the motor cortex of the brains of rats with silicon electrodes. Using the correlation technique, we monitored the activity of neurons in our rats while simultaneously tracking the activity of stocks in the U.S. stock market."
have_read
to:NB
neuroscience
finance
statistics
prediction
multiple_testing
bad_data_analysis
funny:geeky
funny:malicious
via:mejn
to:blog
to_teach:undergrad-ADA
8 weeks ago by cshalizi
[1203.5422] Distribution Free Prediction Bands
8 weeks ago by cshalizi
"We study distribution free, nonparametric prediction bands with a special focus on their finite sample behavior. First we investigate and develop different notions of finite sample coverage guarantees. Then we give a new prediction band estimator by combining the idea of "conformal prediction" (Vovk et al. 2009) with nonparametric conditional density estimation. The proposed estimator, called COPS (Conformal Optimized Prediction Set), always has finite sample guarantee in a stronger sense than the original conformal prediction estimator. Under regularity conditions the estimator converges to an oracle band at a minimax optimal rate. A fast approximation algorithm and a data driven method for selecting the bandwidth are developed. The method is illustrated first in simulated data. Then, an application shows that the proposed method gives desirable prediction intervals in an automatic way, as compared to the classical linear regression modeling."
to:NB
prediction
statistics
nonparametrics
kith_and_kin
wasserman.larry
lei.jing
heard
confidence_sets
density_estimation
8 weeks ago by cshalizi
Universality of Bayesian Predictions
12 weeks ago by cshalizi
"This paper studies the theoretical properties of Bayesian predictions and shows that under minimal conditions we can derive finite sample bounds for the loss incurred using Bayesian predictions under the Kullback-Leibler divergence. In particular, the concept of universality of predictions is discussed and universality is established for Bayesian predictions in a variety of settings. These include predictions under almost arbitrary loss functions, model averaging, predictions in a non-stationary environment and under model misspecification."
in_NB
to_read
statistics
bayesian_consistency
prediction
misspecification
universal_prediction
12 weeks ago by cshalizi
[0805.3032] Testing earthquake predictions
12 weeks ago by cshalizi
"Statistical tests of earthquake predictions require a null hypothesis to model occasional chance successes. To define and quantify `chance success' is knotty. Some null hypotheses ascribe chance to the Earth: Seismicity is modeled as random. The null distribution of the number of successful predictions -- or any other test statistic -- is taken to be its distribution when the fixed set of predictions is applied to random seismicity. Such tests tacitly assume that the predictions do not depend on the observed seismicity. Conditioning on the predictions in this way sets a low hurdle for statistical significance. Consider this scheme: When an earthquake of magnitude 5.5 or greater occurs anywhere in the world, predict that an earthquake at least as large will occur within 21 days and within an epicentral distance of 50 km. We apply this rule to the Harvard centroid-moment-tensor (CMT) catalog for 2000--2004 to generate a set of predictions. The null hypothesis is that earthquake times are exchangeable conditional on their magnitudes and locations and on the predictions--a common ``nonparametric'' assumption in the literature. We generate random seismicity by permuting the times of events in the CMT catalog. We consider an event successfully predicted only if (i) it is predicted and (ii) there is no larger event within 50 km in the previous 21 days. The $P$-value for the observed success rate is $<0.001$: The method successfully predicts about 5% of earthquakes, far better than `chance,' because the predictor exploits the clustering of earthquakes -- occasional foreshocks -- which the null hypothesis lacks. Rather than condition on the predictions and use a stochastic model for seismicity, it is preferable to treat the observed seismicity as fixed, and to compare the success rate of the predictions to the success rate of simple-minded predictions like those just described. If the proffered predictions do no better than a simple scheme, they have little value."
have_read
to:NB
statistics
geology
prediction
earthquakes
to_teach:undergrad-ADA
to_teach:data-mining
12 weeks ago by cshalizi
[0801.0327] Nonparametric sequential prediction of time series
february 2012 by cshalizi
"Time series prediction covers a vast field of every-day statistical applications in medical, environmental and economic domains. In this paper we develop nonparametric prediction strategies based on the combination of a set of 'experts' and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these nonparametric strategies are more flexible, faster and generally outperform ARMA methods in terms of normalized cumulative prediction error."
in_NB
time_series
nonparametrics
prediction
statistics
to_teach:undergrad-ADA
re:growing_ensemble_project
february 2012 by cshalizi
[math/0701419] Strategies for prediction under imperfect monitoring
february 2012 by cshalizi
"We propose simple randomized strategies for sequential prediction under imperfect monitoring, that is, when the forecaster does not have access to the past outcomes but rather to a feedback signal. The proposed strategies are consistent in the sense that they achieve, asymptotically, the best possible average reward. It was Rustichini (1999) who first proved the existence of such consistent predictors. The forecasters presented here offer the first constructive proof of consistency. Moreover, the proposed algorithms are computationally efficient. We also establish upper bounds for the rates of convergence. In the case of deterministic feedback, these rates are optimal up to logarithmic terms."
to:NB
prediction
individual_sequence_prediction
learning_in_games
re:growing_ensemble_project
february 2012 by cshalizi
[1202.4294] Prediction of quantiles by statistical learning and application to GDP forecasting
february 2012 by cshalizi
"In this paper, we tackle the problem of prediction and confidence intervals for time series using a statistical learning approach and quantile loss functions. In a first time, we show that the Gibbs estimator (also known as Exponentially Weighted aggregate) is able to predict as well as the best predictor in a given family for a wide set of loss functions. In particular, using the quantile loss function of Koenker and Bassett (1978), this allows to build confidence intervals. We apply these results to the problem of prediction and confidence regions for the French Gross Domestic Product (GDP) growth, with promising results."
in_NB
to_read
prediction
confidence_sets
learning_theory
re:your_favorite_dsge_sucks
re:growing_ensemble_project
february 2012 by cshalizi
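For reference, the quantile ("pinball") loss of Koenker and Bassett mentioned in the abstract above is minimized by the corresponding quantile, which is why a pair of quantile predictions gives an interval. The sketch below shows this only for constant predictions on a made-up sample; it is not the Gibbs estimator from the paper, and the function names and the data `ys` are assumptions for illustration.

```python
def pinball_loss(y, pred, tau):
    """Quantile loss at level tau for a single observation y."""
    diff = y - pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def mean_pinball(ys, pred, tau):
    """Average quantile loss of a constant prediction over a sample."""
    return sum(pinball_loss(y, pred, tau) for y in ys) / len(ys)

def best_constant(ys, tau):
    """Constant minimizing the average pinball loss.

    The loss is piecewise linear in the prediction, so a minimizer can be
    found among the observed values themselves.
    """
    return min(sorted(set(ys)), key=lambda c: mean_pinball(ys, c, tau))

# Hypothetical sample: the integers 0..100.
ys = list(range(101))
lower = best_constant(ys, 0.05)
upper = best_constant(ys, 0.95)
inside = sum(lower <= y <= upper for y in ys) / len(ys)
print(lower, upper, inside)
```

On this sample the 5% and 95% minimizers bracket roughly 90% of the observations, which is the sense in which quantile predictions yield an interval.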
Proving Induction
february 2012 by cshalizi
"The hard problem of induction is to argue without begging the question that inductive inference, applied properly in the proper circumstances, is conducive to truth. A recent theorem seems to show that the hard problem has a deductive solution. The theorem, provable in , states that a predictive function M exists with the following property: whatever world we live in, M correctly predicts the world’s present state given its previous states at all times apart from a well-ordered subset. On the usual model of time a well-ordered subset is small relative to the set of all times. M’s existence therefore seems to provide a solution to the hard problem.
My paper argues for two conclusions. First, the theorem does not solve the hard problem of induction. More positively though, it solves a version of the problem in which the structure of time is given modulo our choice of set theory."
--- Seems prodigiously strange, on first glance. Ask the people downstairs and up the hall what they think of it?
to:NB
induction
set_theory
philosophy_of_science
prediction
february 2012 by cshalizi
Stein: When does the screening effect hold?
january 2012 by cshalizi
"When using optimal linear prediction to interpolate point observations of a mean square continuous stationary spatial process, one often finds that the interpolant mostly depends on those observations located nearest to the predictand. This phenomenon is called the screening effect. However, there are situations in which a screening effect does not hold in a reasonable asymptotic sense, and theoretical support for the screening effect is limited to some rather specialized settings for the observation locations. This paper explores conditions on the observation locations and the process model under which an asymptotic screening effect holds. A series of examples shows the difficulty in formulating a general result, especially for processes with different degrees of smoothness in different directions, which can naturally occur for spatial-temporal processes. These examples lead to a general conjecture and two special cases of this conjecture are proven. The key condition on the process is that its spectral density should change slowly at high frequencies. Models not satisfying this condition of slow high-frequency change should be used with caution."
to:NB
spatial_statistics
smoothing
statistics
prediction
january 2012 by cshalizi
Forecasting Time Series with Complex Seasonal Patterns Using Exponential Smoothing
january 2012 by cshalizi
An innovations state space modeling framework is introduced for forecasting complex seasonal time series such as those with multiple seasonal periods, high-frequency seasonality, non-integer seasonality, and dual-calendar effects. The new framework incorporates Box–Cox transformations, Fourier representations with time varying coefficients, and ARMA error correction. Likelihood evaluation and analytical expressions for point forecasts and interval predictions under the assumption of Gaussian errors are derived, leading to a simple, comprehensive approach to forecasting complex seasonal time series. A key feature of the framework is that it relies on a new method that greatly reduces the computational burden in the maximum likelihood estimation. The modeling framework is useful for a broad range of applications, its versatility being illustrated in three empirical studies. In addition, the proposed trigonometric formulation is presented as a means of decomposing complex seasonal time series, and it is shown that this decomposition leads to the identification and extraction of seasonal components which are otherwise not apparent in the time series plot itself.
to:NB
statistics
time_series
prediction
january 2012 by cshalizi
[1112.6390] Early Warning with Calibrated and Sharper Probabilistic Forecasts
december 2011 by cshalizi
"Given a nonlinear model, a probabilistic forecast may be obtained by Monte Carlo simulations. At a given forecast horizon, Monte Carlo simulations yield sets of discrete forecasts, which can be converted to density forecasts. The resulting density forecasts will inevitably be downgraded by model mis-specification. In order to enhance the quality of the density forecasts, one can mix them with the unconditional density. This paper examines the value of combining conditional density forecasts with the unconditional density. The findings have positive implications for issuing early warnings in different disciplines including economics and meteorology, but UK inflation forecasts are considered as an example." --- Better than conformal predictors?
to:NB
prediction
statistics
ensemble_methods
density_estimation
december 2011 by cshalizi
Clements, Schoenberg, Schorlemmer: Residual analysis methods for space–time point processes with applications to earthquake forecast models in California
december 2011 by cshalizi
"Modern, powerful techniques for the residual analysis of spatial-temporal point process models are reviewed and compared. These methods are applied to California earthquake forecast models used in the Collaboratory for the Study of Earthquake Predictability (CSEP). Assessments of these earthquake forecasting models have previously been performed using simple, low-power means such as the L-test and N-test. We instead propose residual methods based on rescaling, thinning, superposition, weighted K-functions and deviance residuals. Rescaled residuals can be useful for assessing the overall fit of a model, but as with thinning and superposition, rescaling is generally impractical when the conditional intensity λ is volatile. While residual thinning and superposition may be useful for identifying spatial locations where a model fits poorly, these methods have limited power when the modeled conditional intensity assumes extremely low or high values somewhere in the observation region, and this is commonly the case for earthquake forecasting models. A recently proposed hybrid method of thinning and superposition, called super-thinning, is a more powerful alternative. The weighted K-function is powerful for evaluating the degree of clustering or inhibition in a model. Competing models are also compared using pixel-based approaches, such as Pearson residuals and deviance residuals. 
The different residual analysis techniques are demonstrated using the CSEP models and are used to highlight certain deficiencies in the models, such as the overprediction of seismicity in inter-fault zones for the model proposed by Helmstetter, Kagan and Jackson [Seismological Research Letters 78 (2007) 78–86], the underprediction of the model proposed by Kagan, Jackson and Rong [Seismological Research Letters 78 (2007) 94–98] in forecasting seismicity around the Imperial, Laguna Salada, and Panamint clusters, and the underprediction of the model proposed by Shen, Jackson and Kagan [Seismological Research Letters 78 (2007) 116–120] in forecasting seismicity around the Laguna Salada, Baja, and Panamint clusters."
to:NB
point_processes
spatial_statistics
time_series
statistics
model_selection
model-checking
prediction
earthquakes
geology
december 2011 by cshalizi
[1112.1674] Predicting Failures of Point Forecasts
december 2011 by cshalizi
"The predictability of errors in deterministic temperature forecasts is investigated. More precisely, the aim is to issue warnings whenever the differences between forecast and verification exceed a given threshold. The warnings are generated by analyzing the output of an ensemble forecast system in terms of a decision making approach. The quality of the resulting predictions is evaluated by computing receiver operating characteristics, the Brier score, and the Ignorance score. Special emphasis is also given to the question whether rare events are better predictable."
to:NB
prediction
statistics
time_series
dynamical_systems
december 2011 by cshalizi
[1111.6174] Resolving conflicts between statistical methods by probability combination: Application to empirical Bayes analyses of genomic data
december 2011 by cshalizi
"In the typical analysis of a data set, a single method is selected for statistical reporting even when equally applicable methods yield very different results. Examples of equally applicable methods can correspond to those of different ancillary statistics in frequentist inference and of different prior distributions in Bayesian inference. More broadly, choices are made between parametric and nonparametric methods and between frequentist and Bayesian methods.
Rather than choosing a single method, it can be safer, in a game-theoretic sense, to combine those that are equally appropriate in light of the available information. Since methods of combining subjectively assessed probability distributions are not objective enough for that purpose, this paper introduces a method of distribution combination that does not require any assignment of distribution weights. It does so by formalizing a hedging strategy in terms of a game between three players: nature, a statistician combining distributions, and a statistician refusing to combine distributions. The optimal move of the first statistician reduces to the solution of a simpler problem of selecting an estimating distribution that minimizes the Kullback-Leibler loss maximized over the plausible distributions to be combined. The resulting combined distribution is a linear combination of the most extreme of the distributions to be combined that are scientifically plausible. The optimal weights are close enough to each other that no extreme distribution dominates the others.
The new methodology is illustrated by combining conflicting empirical Bayes methodologies in the context of gene expression data analysis."
in_NB
ensemble_methods
statistics
prediction
bickel.david
december 2011 by cshalizi
Quantile regression for longitudinal data based on latent Markov subject-specific parameters - Alessio Farcomeni - Statistics and Computing, Volume 22, Number 1
december 2011 by cshalizi
"We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set."
to:NB
regression
time_series
prediction
markov_models
december 2011 by cshalizi
Prediction-based regularization using data augmented regression - Statistics and Computing, Volume 22, Number 1
december 2011 by cshalizi
"The role of regularization is to control fitted model complexity and variance by penalizing (or constraining) models to be in an area of model space that is deemed reasonable, thus facilitating good predictive performance. This is typically achieved by penalizing a parametric or non-parametric representation of the model. In this paper we advocate instead the use of prior knowledge or expectations about the predictions of models for regularization. This has the twofold advantage of allowing a more intuitive interpretation of penalties and priors and explicitly controlling model extrapolation into relevant regions of the feature space. This second point is especially critical in high-dimensional modeling situations, where the curse of dimensionality implies that new prediction points usually require extrapolation. We demonstrate that prediction-based regularization can, in many cases, be stochastically implemented by simply augmenting the dataset with Monte Carlo pseudo-data. We investigate the range of applicability of this implementation. An asymptotic analysis of the performance of Data Augmented Regression (DAR) in parametric and non-parametric linear regression, and in nearest neighbor regression, clarifies the regularizing behavior of DAR. We apply DAR to simulated and real data, and show that it is able to control the variance of extrapolation, while maintaining, and often improving, predictive accuracy."
in_NB
to_read
statistics
prediction
estimation
hooker.giles
regression
to_teach:undergrad-ADA
to_teach:data-mining
curse_of_dimensionality
december 2011 by cshalizi
Lai, Gross, Shen: Evaluating probability forecasts
november 2011 by cshalizi
"Probability forecasts of events are routinely used in climate predictions, in forecasting default probabilities on bank loans or in estimating the probability of a patient’s positive response to treatment. Scoring rules have long been used to assess the efficacy of the forecast probabilities after observing the occurrence, or nonoccurrence, of the predicted events. We develop herein a statistical theory for scoring rules and propose an alternative approach to the evaluation of probability forecasts. This approach uses loss functions relating the predicted to the actual probabilities of the events and applies martingale theory to exploit the temporal structure between the forecast and the subsequent occurrence or nonoccurrence of the event."
in_NB
statistics
prediction
calibration
to_read
to_teach:undergrad-ADA
november 2011 by cshalizi
[1111.1386] Confidence Estimation in Structured Prediction
november 2011 by cshalizi
"Structured classification tasks such as sequence labeling and dependency parsing have seen much interest by the Natural Language Processing and the machine learning communities. Several online learning algorithms were adapted for structured tasks such as Perceptron, Passive-Aggressive and the recently introduced Confidence-Weighted learning. These online algorithms are easy to implement, fast to train and yield state-of-the-art performance. However, unlike probabilistic models like Hidden Markov Model and Conditional random fields, these methods generate models that output merely a prediction with no additional information regarding confidence in the correctness of the output. In this work we fill the gap proposing few alternatives to compute the confidence in the output of non-probabilistic algorithms. We show how to compute confidence estimates in the prediction such that the confidence reflects the probability that the word is labeled correctly. We then show how to use our methods to detect mislabeled words, trade recall for precision and active learning. We evaluate our methods on four noun-phrase chunking and named entity recognition sequence labeling tasks, and on dependency parsing for 14 languages."
to:NB
machine_learning
confidence_sets
prediction
natural_language_processing
november 2011 by cshalizi
[1111.1418] Efficient Nonparametric Conformal Prediction Regions
november 2011 by cshalizi
Yay, it's out! "We investigate and extend the conformal prediction method due to Vovk, Gammerman and Shafer (2005) to construct nonparametric prediction regions. These regions have guaranteed distribution free, finite sample coverage, without any assumptions on the distribution or the bandwidth. Explicit convergence rates of the loss function are established for such regions under standard regularity conditions. Approximations for simplifying implementation and data driven bandwidth selection methods are also discussed. The theoretical properties of our method are demonstrated through simulations."
in_NB
prediction
statistics
confidence_sets
nonparametrics
kith_and_kin
wasserman.larry
robins.james
have_read
density_estimation
november 2011 by cshalizi
[1110.6416] Adaptive Hedge
october 2011 by cshalizi
"Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods."
to:NB
to_read
re:growing_ensemble_project
online_learning
prediction
grunwald.peter
low-regret_learning
october 2011 by cshalizi
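A minimal sketch of the baseline the abstract above refers to: Hedge (exponential weights) with a fixed learning rate. It does not implement the paper's adaptive rule for setting the rate; the function name, the value `eta=1.0`, and the toy "easy instance" are assumptions made for illustration.

```python
import math


def hedge(loss_rounds, eta):
    """Run Hedge with a fixed learning rate eta.

    loss_rounds: list of per-round loss vectors, one loss in [0, 1] per action.
    Returns (expected cumulative loss of Hedge, cumulative loss of each action).
    """
    n_actions = len(loss_rounds[0])
    cum_losses = [0.0] * n_actions
    alg_loss = 0.0
    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss);
        # subtracting the minimum avoids underflow without changing the ratios.
        best = min(cum_losses)
        weights = [math.exp(-eta * (c - best)) for c in cum_losses]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of playing an action drawn from probs this round.
        alg_loss += sum(p * l for p, l in zip(probs, losses))
        cum_losses = [c + l for c, l in zip(cum_losses, losses)]
    return alg_loss, cum_losses


# Hypothetical easy instance: action 0 is always better than the other two.
rounds = [[0.0, 1.0, 1.0] for _ in range(50)]
alg_loss, cum_losses = hedge(rounds, eta=1.0)
regret = alg_loss - min(cum_losses)
```

On an instance like this, a large fixed rate keeps the regret small, whereas the rate tuned for the worst case would be much smaller and learn more slowly; that gap is the motivation the abstract gives for adapting the rate.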
Universality of Bayesian Predictions
october 2011 by cshalizi
"This paper studies the theoretical properties of Bayesian predictions and shows that under minimal conditions we can derive finite sample bounds for the loss incurred using Bayesian predictions under the Kullback-Leibler divergence. In particular, the concept of universality of predictions is discussed and universality is established for Bayesian predictions in a variety of settings. These include predictions under almost arbitrary loss functions, model averaging, predictions in a non-stationary environment and under model misspecification."
statistics
prediction
universal_prediction
bayesianism
to:NB
to_read
re:bayes_as_evol
october 2011 by cshalizi
How Useful are Estimated DSGE Model Forecasts? by Rochelle Edge, Refet Gurkaynak :: SSRN
july 2011 by cshalizi
The methodological ideas here are suspect. It is true that there is not much to predict about an in-control system, and what is happening is largely random and so unpredictable, so that even the true model would show low forecasting ability. The question however is why we are supposed to think that the DSGE _does_ give us good information about counterfactuals. If you could show that it had much better predictive performance than baselines like constants or random walks during _out-of-control_ periods, that would be something; but they don't.
re:your_favorite_dsge_sucks
dsges
prediction
economics
macroeconomics
time_series
statistics
in_NB
have_read
to:blog
july 2011 by cshalizi
[1107.0013] Likelihood based observability analysis and confidence intervals for predictions of dynamic models
july 2011 by cshalizi
"Mechanistic dynamic models of biochemical networks such as Ordinary Differential Equations (ODEs) contain unknown parameters like the reaction rate constants and the initial concentrations of the compounds. The large number of parameters as well as their nonlinear impact on the model responses hamper the determination of confidence regions for parameter estimates. At the same time, classical approaches translating the uncertainty of the parameters into confidence intervals for model predictions are hardly feasible. In this article it is shown that a so-called prediction profile likelihood yields reliable confidence intervals for model predictions, despite arbitrarily complex and high-dimensional shapes of the confidence regions for the estimated parameters. Prediction confidence intervals of the dynamic states allow a data-based observability analysis. The approach renders the issue of sampling a high-dimensional parameter space into evaluating one-dimensional prediction spaces."
dynamical_systemss
statistics
statistical_inference_for_stochastic_processes
prediction
confidence_sets
to_read
july 2011 by cshalizi
Making and Evaluating Point Forecasts (Gneiting)
july 2011 by cshalizi
"Typically, point forecasting methods are compared and assessed by means of an error measure or scoring function, with the absolute error and the squared error being key examples. The individual scores are averaged over forecast cases, to result in a summary measure of the predictive performance, such as the mean absolute error or the mean squared error. I demonstrate that this common practice can lead to grossly misguided inferences, unless the scoring function and the forecasting task are carefully matched...."
prediction
statistics
calibration
machine_learning
decision_theory
gneiting.tilmann
have_read
july 2011 by cshalizi
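A small worked example of the mismatch the Gneiting abstract warns about, assuming a made-up skewed sample. The mean is the optimal point forecast under squared error and the median under absolute error, so the two scoring functions rank the same pair of forecasts in opposite orders.

```python
from statistics import mean, median


def mean_abs_error(forecast, observations):
    """Average absolute error of a single constant point forecast."""
    return sum(abs(y - forecast) for y in observations) / len(observations)


def mean_sq_error(forecast, observations):
    """Average squared error of a single constant point forecast."""
    return sum((y - forecast) ** 2 for y in observations) / len(observations)


# Hypothetical right-skewed observations.
observations = [1, 1, 2, 3, 13]

forecast_mean = mean(observations)      # 4
forecast_median = median(observations)  # 2

mae_of_mean = mean_abs_error(forecast_mean, observations)
mae_of_median = mean_abs_error(forecast_median, observations)
mse_of_mean = mean_sq_error(forecast_mean, observations)
mse_of_median = mean_sq_error(forecast_median, observations)
```

A forecaster told to report "a point forecast" and then judged by mean absolute error is penalized for reporting the mean, and vice versa for squared error, even though both forecasts are optimal summaries of the same predictive distribution.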
[0711.3856] Forward estimation for ergodic time series
july 2011 by cshalizi
"The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,...,X_n)$ and for which, ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,...,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and moreover, the error tends to zero in probability."
prediction
ergodic_theory
time_series
statistics
morvai.gusztav
weiss.benjamin
july 2011 by cshalizi
An Uncertain World II: Adapt, by Tim Harford - Whimsley
june 2011 by cshalizi
"This sentence shows another failure of the book: a blurring of the line between experimentation (trial-and-error) and decentralization. Throughout most of the book he uses experimentation as a synonym for decentralization (tacit knowledge and all that) and is in favour of both, but sometimes - as here - he separates the two to make his argument fit."
prediction
adaptive_behavior
book_reviews
slee.tom
june 2011 by cshalizi
Scoring the pundits — Crooked Timber
may 2011 by cshalizi
"So, although the development of even rudimentary forms of audit is a great boon to the democratic public (and probably a lot more so than yet another inconclusive study of “media bias” one way or the other), I think it needs to be taken with two caveats. The biggest villain is not the guy who gets it wrong. The people who will cost you money and reputation over the long run are first, the guy who says he’s more certain than he really is, and second, the guy who won’t admit he’s wrong when he knows he is."
prediction
natural_history_of_truthiness
why_oh_why_cant_we_have_a_better_press_corps
dsquared
may 2011 by cshalizi
Statistical Prediction Analysis (Aitchison and Dunsmore, 1980)
may 2011 by cshalizi
Ancient, but I should see if there are examples or simple tools worth stealing for ADA.
books:noted
statistics
prediction
to:NB
to_teach:undergrad-ADA
may 2011 by cshalizi
Fourdrinier , Marchand , Righi , Strawderman : On improved predictive density estimation with parametric constraints
april 2011 by cshalizi
Too Gaussian to be applicable, but perhaps useful to think through.
prediction
information_theory
statistics
to:NB
april 2011 by cshalizi
Efficient probabilistic forecasts for counts - McCabe et al., 2011 - JRSS-B
march 2011 by cshalizi
"Efficient probabilistic forecasts of integer-valued random variables are derived. The optimality is achieved by estimating the forecast distribution non-parametrically over a given broad model class and proving asymptotic (non-parametric) efficiency in that setting. The method is developed within the context of the integer auto-regressive class of models, which is a suitable class for any count data that can be interpreted as a queue, stock, birth-and-death process or branching process. The theoretical proofs of asymptotic efficiency are supplemented by simulation results that demonstrate the overall superiority of the non-parametric estimator relative to a misspecified parametric alternative, in large but finite samples. The method is applied to counts of stock market iceberg orders. A subsampling method is used to assess sampling variation in the full estimated forecast distribution and a proof of its validity is given." (Dunno about the to_teach tags, I haven't read this yet.)
statistics
prediction
density_estimation
time_series
stochastic_processes
branching_processes
to_teach:data-mining
to_teach:undergrad-ADA
march 2011 by cshalizi
Shmueli : To Explain or to Predict?
january 2011 by cshalizi
"Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process."
statistics
prediction
philosophy_of_science
january 2011 by cshalizi
Combining Nonparametric and Optimal Linear Time Series Predictions
january 2011 by cshalizi
ARMA model forecasting, supplemented somehow with nonparametric smoothing of the residuals. (I haven't read beyond the abstract.)
time_series
prediction
statistics
nonparametrics
to_teach:undergrad-ADA
january 2011 by cshalizi
Phys. Rev. E 82, 056206 (2010): Forecasting the evolution of nonlinear and nonstationary systems using recurrence-based local Gaussian process models
november 2010 by cshalizi
"...combining nonparametric Gaussian process (GP) modeling with certain local topological considerations ... for prediction (one-step look ahead) of ... nonlinear and nonstationary dynamics. ... partition ... trajectories into multiple near-stationary segments by aligning the boundaries of the partitions with those of the piecewise affine projections of the underlying dynamic system... alignment is achieved through the consideration of recurrence and other local topological properties ... forecasting in Lorenz system under different levels of induced noise and nonstationarity, synthetic heart-rate signals, and a real-world time-series from an industrial operation known to exhibit highly nonlinear and nonstationary dynamics. ... local Gaussian process can significantly outperform not just classical system identification, neural network and nonparametric models, but also the sequential Bayesian Monte Carlo methods in terms of prediction accuracy and computational speed."
time_series
prediction
non-stationarity
gaussian_processes
re:growing_ensemble_project
to_read
november 2010 by cshalizi
[1010.6202] Sequential Data-Adaptive Bandwidth Selection by Cross-Validation for Nonparametric Prediction
november 2010 by cshalizi
"We consider the problem of bandwidth selection by cross-validation from a sequential point of view in a nonparametric regression model. Having in mind that in applications one often aims at estimation, prediction and change detection simultaneously, we investigate that approach for sequential kernel smoothers in order to base these tasks on a single statistic. We provide uniform weak laws of large numbers and weak consistency results for the cross-validated bandwidth. Extensions to weakly dependent error terms are discussed as well. The errors may be {\alpha}-mixing or L2-near epoch dependent, which guarantees that the uniform convergence of the cross validation sum and the consistency of the cross-validated bandwidth hold true for a large class of time series. The method is illustrated by analyzing photovoltaic data."
cross-validation
prediction
time_series
model_selection
to_read
november 2010 by cshalizi
Lauritzen - "Sufficiency, Prediction, and Extreme Models" (JSTOR: Scandinavian Journal of Statistics, Vol. 1, No. 3 (1974), pp. 128-134)
september 2010 by cshalizi
"A modified concept of sufficiency, relevant in connection with statistical analysis of stochastic processes, is defined and its basic properties investigated. A method of prediction that applies when the probability structure is partly unknown is introduced and the method is shown to possess certain important invariance properties. The concept of an extreme model is defined and its probabilistic and statistical properties discussed. Existence of maximum likelihood estimators and predictors is established under weak regularity assumptions. For technical convenience, only discrete-valued stochastic processes are considered throughout the paper."
sufficiency
statistics
prediction
stochastic_processes
statistical_inference_for_stochastic_processes
have_read
lauritzen.steffen
september 2010 by cshalizi
Wiener: Nonlinear Prediction and Dynamics
august 2010 by cshalizi
"Norbert Wiener really was that smart" dept.: long-term weather forecasting on the basis of deterministic dynamical models impossible because of limited precision observations and self-amplifying processes; but ergodic theory to the rescue for statistical forecasting; reconstruction of dynamical systems from sufficiently long trajectories (up to the ergodic component); linearization of nonlinear problems by projection into a higher-dimensional space; probably more, I'm not done reading it yet.
wiener.norbert
prediction
ergodic_theory
ergodic_decomposition
statistics
time_series
sensitive_dependence_on_initial_conditions
statistical_inference_for_stochastic_processes
series_of_footnotes
to:blog
have_read
august 2010 by cshalizi
"Predictive Likelihood Inference, with Applications" - Butler, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 48, No. 1 (1986), pp. 1-38
july 2010 by cshalizi
"in the predictive setting, all parameters are nuisance parameters": yes!
prediction
likelihood
estimation
statistics
july 2010 by cshalizi
[1006.0475] Prediction with Advice of Unknown Number of Experts
june 2010 by cshalizi
"In the framework of prediction with expert advice, we consider a recently introduced kind of regret bounds: the bounds that depend on the effective instead of nominal number of experts. In contrast to the NormalHedge bound, which mainly depends on the effective number of experts and also weakly depends on the nominal one, we obtain a bound that does not contain the nominal number of experts at all. We use the defensive forecasting method and introduce an application of defensive forecasting to multivalued supermartingales."
prediction
learning_theory
re:growing_ensemble_project
june 2010 by cshalizi
10-705 Intermediate Statistics, Fall 2009
april 2010 by cshalizi
Larry's version of the typical masters-level course based on Casella and Berger. Note: half of what he covers is not in Casella and Berger. (For example, he starts with VC theory!)
learning_theory
statistics
estimation
hypothesis_testing
prediction
minimax
bootstrap
model_selection
regression
classifiers
confidence_sets
wasserman.larry
kith_and_kin
april 2010 by cshalizi
Desiderata for a Predictive Theory of Statistics - Clarke, 2010
march 2010 by cshalizi
"In many contexts the predictive validation of models or their associated prediction strategies is of greater importance than model identification which may be practically impossible. This is particularly so in fields involving complex or high dimensional data where model selection, or more generally predictor selection is the main focus of effort. This paper suggests a unified treatment for predictive analyses based on six `desiderata'. These desiderata are an effort to clarify what criteria a good predictive theory of statistics should satisfy." --- I presume (I haven't read the paper yet) that he means a theory of statistical predictions, and not a theory which tries to predict future developments within statistics.
statistics
prediction
methodology
to_read
data_mining
march 2010 by cshalizi
[1003.1513] On the transductive arguments in statistics
march 2010 by cshalizi
"The paper argues that a part of the current statistical discussion is not based on the standard firm foundations of the field. Among the examples we consider are prediction into the future, semi-supervised classification, and causality inference based on observational data." --- I have read this paper, but do not pretend to understand it. (For instance, I really don't get what he's saying about time series.)
statistics
prediction
causal_inference
time_series
have_read
march 2010 by cshalizi
Reintroducing Prediction to Explanation
february 2010 by cshalizi
"Although prediction has been largely absent from discussions of explanation for the past 40 years, theories of explanation can gain much from a reintroduction. I review the history that divorced prediction from explanation, examine the proliferation of models of explanation that followed, and argue that accounts of explanation have been impoverished by the neglect of prediction. Instead of a revival of the symmetry thesis, I suggest that explanation should be understood as a cognitive tool that assists us in generating new predictions. This view of explanation and prediction clarifies what makes an explanation scientific and why inference to the best explanation makes sense in science."
explanation
prediction
philosophy_of_science
february 2010 by cshalizi
Sequences Project
january 2010 by cshalizi
Page on individual-sequence prediction.
prediction
time_series
machine_learning
learning_theory
universal_prediction
via:?
january 2010 by cshalizi
High skill in low-frequency climate response through fluctuation dissipation theorems despite structural instability — PNAS
january 2010 by cshalizi
Not peer reviewed, so one entertains the gravest suspicions.
climate_change
climatology
fluctuation-response
prediction
to_read
january 2010 by cshalizi
Berti, Crimaldi, Pratelli, Rigo: Rate of convergence of predictive distributions for dependent data
january 2010 by cshalizi
Exchangeable sequences, rather than anything interesting, but worth looking at for ideas? ETA: Eh.
prediction
statistics
empirical_processes
stochastic_processes
have_read
learning_theory
statistical_inference_for_stochastic_processes
january 2010 by cshalizi
Luen, Stark: Testing earthquake predictions
december 2009 by cshalizi
Back-up for the Hough review. Also: might make a good mini-project for the data-mining class, though I'd have to teach about spatio-temporal methods (which I should anyway [but where would the time come?]).
earthquakes
hypothesis_testing
bad_data_analysis
stark.philip
statistics
prediction
have_read
to_teach:data-mining
to_teach:undergrad-ADA
december 2009 by cshalizi
[0912.4883] On Finding Predictors for Arbitrary Families of Processes
december 2009 by cshalizi
"A sequence $x_1,...,x_n,...$ of discrete-valued observations is generated according to some unknown [measure] $\mu$. After observing each outcome, ... give the conditional probabilities of the next observation. ... $\mu$ [is in] an arbitrary but known class $C$ of stochastic process measures. We [want] predictors ... whose conditional probabilities converge (in some sense) to the [true] conditional probabilities if any $\mu\in C$ is chosen to generate the sequence. ... [C]haracteriz[e] the families $C$ for which such predictors exist ... a specific and simple form in which to look for a solution. ... if any predictor works, then there exists a Bayesian predictor, whose prior is discrete, and which works too. .... sufficient and necessary conditions for the existence of a predictor, in terms of topological characterizations of the family $C$, as well as in terms of local behaviour of the measures in $C$, which in some cases lead to procedures for constructing such predictors."
prediction
universal_prediction
stochastic_processes
statistical_inference_for_stochastic_processes
statistics
re:AoS_project
december 2009 by cshalizi
[0903.3620] Reconciling Model Selection and Prediction
december 2009 by cshalizi
"It is known that there is a dichotomy in the performance of model selectors. Those that are consistent (having the "oracle property") do not achieve the asymptotic minimax rate for prediction error. We look at this phenomenon closely, and argue that the set of parameters on which this dichotomy occurs is extreme, even pathological, and should not be considered when evaluating model selectors. We characterize this set, and show that, when such parameters are dismissed from consideration, consistency and asymptotic minimaxity can be attained simultaneously."
model_selection
statistics
minimax
regression
have_read
prediction
december 2009 by cshalizi
A simple procedure for computing improved prediction intervals for autoregressive models. Paolo Vidoni. 2009; Journal of Time Series Analysis - Wiley InterScience
december 2009 by cshalizi
"construction of prediction intervals for time series models. The estimative or plug-in solution is usually not entirely adequate, since the (conditional) coverage probability may differ substantially from the nominal value. Prediction intervals with improved (conditional) coverage probability can be defined by adjusting the estimative ones, using rather complicated asymptotic procedures or suitable simulation techniques. This article extends to Markov process models a recent result by Vidoni, which defines a relatively simple predictive distribution function, giving improved prediction limits as quantiles"
prediction
time_series
statistics
confidence_sets
to_read
december 2009 by cshalizi
Sequential Probability Assignment Via Online Convex Programming Using Exponential Families (Raginsky, Marcia, Silva and Willett)
october 2009 by cshalizi
Today's seminar. Very cool.
have_read
statistics
statistical_inference_for_stochastic_processes
exponential_families
information_theory
prediction
minimax
optimization
to:blog
raginsky.maxim
online_learning
willett.rebecca
low-regret-learning
in_NB
october 2009 by cshalizi
"Symbols as Self-emergent Entities in an Optimization Process of Feature Extraction and Predictions"
september 2009 by cshalizi
Sounds like it should either be interesting or truly horrible.
cognitive_science
neuroscience
symbols_from_dynamics
to:NB
prediction
september 2009 by cshalizi
The Monkey Cage: Forecasting Fallacies?
august 2009 by cshalizi
What we have here, boy, is a failure to calibrate: " 'Around 74% of companies have beat forecasts, versus the long-term average of 61% (emphasis added) and the all-time record of 73%, reached in the first quarter of 2004.' Now I might be missing something here, but if the forecasters were good at their jobs, shouldn’t the long term average of companies beating forecasts be the same as the long term average of companies doing worse than the forecasts?" --- Actually, isn't this compatible with the forecasters minimizing squared error under an asymmetric (but mean zero) noise distribution? (A more plausible explanation, to my mind, has to do with corrupt practices, where the same firms solicit investment-banking business from companies and purport to advise investors on what those companies are worth. But that's my cynicism.)
calibration
prediction
financial_markets
to_teach:data-mining
statistics
august 2009 by cshalizi
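A toy calculation of the point raised in the note above: a forecaster who minimizes squared error reports the mean, and if the noise is mean-zero but skewed, the share of outcomes that "beat" the forecast need not be one half. The two-point noise distribution and the baseline value are invented for illustration and are not fitted to any earnings data.

```python
from fractions import Fraction

# Hypothetical surprise distribution: usually a small upside, rarely a big miss.
surprise = {Fraction(1): Fraction(3, 4), Fraction(-3): Fraction(1, 4)}


def expected_value(dist):
    """Expectation of a finite discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


baseline = Fraction(10)  # hypothetical predictable part of earnings

# The squared-error-optimal point forecast is the mean of (baseline + surprise).
forecast = baseline + expected_value(surprise)

# Probability that realized earnings come in above the forecast.
prob_beat = sum(p for x, p in surprise.items() if baseline + x > forecast)
```

Here the noise has mean zero, yet three quarters of outcomes beat the forecast, so a long-run beat rate above 50% is not by itself evidence of bad forecasting under squared-error loss.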
PhilSci Archive - Deterministic versus indeterministic descriptions: not that different after all?
july 2009 by cshalizi
"The guiding question of this paper is: how similar are deterministic descriptions and indeterministic descriptions from a predictive viewpoint? The deterministic and indeterministic descriptions of concern in this paper are measure-theoretic deterministic systems and stochastic processes, respectively. I will explain intuitively some mathematical results which show that measure-theoretic deterministic systems and stochastic processes give more often the same predictions than one might perhaps have expected, and hence that from a predictive viewpoint these descriptions are quite similar." This needs saying?!?
dynamical_systems
stochastic_processes
prediction
philosophy_of_science
boltzmann_died_for_your_sins
july 2009 by cshalizi
Limits of declustering methods for disentangling exogenous from endogenous events in time series with foreshocks, main shocks, and aftershocks
july 2009 by cshalizi
"Many time series in natural and social sciences can be seen as resulting from an interplay between exogenous influences and an endogenous organization. We use a simple epidemic-type aftershock model of events occurring sequentially, in which future events are influenced (partially triggered) by past events to ask the question of how well can one disentangle the exogenous events from the endogenous ones. We apply both model-dependent and model-independent stochastic declustering methods to reconstruct the tree of ancestry and estimate key parameters. In contrast with previously reported positive results, we have to conclude that declustered catalogs are rather unreliable for the synthetic catalogs that we have investigated, which contains of the order of thousands of events, typical of realistic applications. The estimated rates of exogenous events suffer from large errors. The branching ratio n, quantifying the fraction of events that have been triggered by previous events, is also badly estimated in general from declustered catalogs. We find, however, that the errors tend to be smaller and perhaps acceptable in some cases for small triggering efficiency and branching ratios. The high level of randomness together with the long memory makes the stochastic reconstruction of trees of ancestry and the estimation of the key parameters perhaps intrinsically unreliable for long-memory processes. For shorter memories (larger “bare” Omori exponent), the results improve significantly."
statistics
time_series
branching_processes
in_NB
earthquakes
prediction
inference_to_latent_objects
sornette.didier
long-range_dependence
july 2009 by cshalizi
Why we overestimate the costs of climate change legislation | Grist
june 2009 by cshalizi
Conversely, the demand for Pan Am flights to the moon is much smaller than _very reasonable_ people have expected. This suggests an interesting question for retrospective studies of futurology: what's the variance? Quite conceivably, futurology is right _on average_, but with such a huge spread as to be unusable...
prediction
innovation
technological_change
environmental_management
environmental_policy
cost-benefit_analysis
climate_change
june 2009 by cshalizi