cshalizi + time_series 205
[1205.3845] Forecasting with Historical Data or Process Knowledge under Misspecification: A Comparison
8 days ago by cshalizi
"When faced with the task of forecasting a dynamic system, practitioners often have available historical data, knowledge of the system, or a combination of both. While intuition dictates that perfect knowledge of the system should in theory yield perfect forecasting, often knowledge of the system is only partially known, known up to parameters, or known incorrectly. In contrast, forecasting using previous data without any process knowledge might result in accurate prediction for simple systems, but will fail for highly nonlinear and chaotic systems. In this paper, the authors demonstrate how even in chaotic systems, forecasting with historical data is preferable to using process knowledge if this knowledge exhibits certain forms of misspecification. Through an extensive simulation study, a range of misspecification and forecasting scenarios are examined with the goal of gaining an improved understanding of the circumstances under which forecasting from historical data is to be preferred over using process knowledge."
to:NB
to_read
prediction
time_series
misspecification
re:growing_ensemble_project
Lam, Yao: Factor modeling for high-dimensional time series: Inference for the number of factors
10 days ago by cshalizi
"This paper deals with the factor modeling for high-dimensional time series based on a dimension-reduction viewpoint. Under stationary settings, the inference is simple in the sense that both the number of factors and the factor loadings are estimated in terms of an eigenanalysis for a nonnegative definite matrix, and is therefore applicable when the dimension of time series is on the order of a few thousands. Asymptotic properties of the proposed method are investigated under two settings: (i) the sample size goes to infinity while the dimension of time series is fixed; and (ii) both the sample size and the dimension of time series go to infinity together. In particular, our estimators for zero-eigenvalues enjoy faster convergence (or slower divergence) rates, hence making the estimation for the number of factors easier. In particular, when the sample size and the dimension of time series go to infinity together, the estimators for the eigenvalues are no longer consistent. However, our estimator for the number of the factors, which is based on the ratios of the estimated eigenvalues, still works fine. Furthermore, this estimation shows the so-called “blessing of dimensionality” property in the sense that the performance of the estimation may improve when the dimension of time series increases. A two-step procedure is investigated when the factors are of different degrees of strength. Numerical illustration with both simulated and real data is also reported."
to:NB
dimension_reduction
factor_analysis
time_series
high-dimensional_statistics
inference_to_latent_objects
Wang, Phillips: A specification test for nonlinear nonstationary models
10 days ago by cshalizi
"We provide a limit theory for a general class of kernel smoothed U-statistics that may be used for specification testing in time series regression with nonstationary data. The test framework allows for linear and nonlinear models with endogenous regressors that have autoregressive unit roots or near unit roots. The limit theory for the specification test depends on the self-intersection local time of a Gaussian process. A new weak convergence result is developed for certain partial sums of functions involving nonstationary time series that converges to the intersection local time process. This result is of independent interest and is useful in other applications. Simulations examine the finite sample performance of the test."
to:NB
time_series
non-stationarity
model-checking
statistics
misspecification
Likelihood inference for discriminating between long-memory and change-point models - Yau - 2012 - Journal of Time Series Analysis - Wiley Online Library
13 days ago by cshalizi
"We develop a likelihood ratio (LR) test procedure for discriminating between a short-memory time series with a change-point (CP) and a long-memory (LM) time series. Under the null hypothesis, the time series consists of two segments of short-memory time series with different means and possibly different covariance functions. The location of the shift in the mean is unknown. Under the alternative, the time series has no shift in mean but rather is LM. The LR statistic is defined as the normalized log-ratio of the Whittle likelihood between the CP model and the LM model, which is asymptotically normally distributed under the null. The LR test provides a parametric alternative to the CUSUM test proposed by Berkes et al. (2006). Moreover, the LR test is more general than the CUSUM test in the sense that it is applicable to changes in other marginal or dependence features other than a change-in-mean. We show its good performance in simulations and apply it to two data examples."
to:NB
time_series
change-point_problem
long-range_dependence
statistics
to_teach:undergrad-ADA
hypothesis_testing
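--- For orientation on the baseline the abstract mentions: a minimal sketch of a basic CUSUM change-in-mean statistic. This is not the paper's LR test, and not necessarily the exact form used by Berkes et al.; the function name `cusum_change_point` is hypothetical, and the normalization a formal test would need (a long-run variance estimate) is left out.

```python
def cusum_change_point(x):
    """Return (k_hat, max_abs_cusum) for a single change in mean.

    The CUSUM process is C_k = S_k - (k / n) * S_n, where S_k is the
    partial sum of the first k observations. The estimated change point
    is the k that maximizes |C_k|.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(x)
    partial = 0.0
    best_k, best_val = 0, -1.0
    for k in range(1, n):  # k = number of points in the first segment
        partial += x[k - 1]
        c_k = abs(partial - k * total / n)
        if c_k > best_val:
            best_k, best_val = k, c_k
    return best_k, best_val


# Toy series (an assumption for illustration): mean 0 for 50 steps, then mean 1.
series = [0.0] * 50 + [1.0] * 50
k_hat, stat = cusum_change_point(series)
```

On the toy series the maximizer sits at the true break, after the first 50 observations. A long-memory series can produce similarly large excursions of the CUSUM process, which is why discriminating between the two models is the hard part.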
[1204.6265] Statistical inference for dynamical systems: a review
28 days ago by cshalizi
"The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research."
to:NB
to_read
statistical_inference_for_stochastic_processes
dynamical_systems
statistics
time_series
state-space_models
state-space_reconstruction
pillai.natesh
via:ded-maxim
[1204.3915] Theory and Inference for a Class of Observation-driven Models with Application to Time Series of Counts
5 weeks ago by cshalizi
"This paper studies theory and inference related to a class of time series models that incorporates nonlinear dynamics. It is assumed that the observations follow a one-parameter exponential family of distributions given an accompanying process that evolves as a function of lagged observations. We employ an iterated random function approach and a special coupling technique to show that, under suitable conditions on the parameter space, the conditional mean process is a geometric moment contracting Markov chain and that the observation process is absolutely regular with geometrically decaying coefficients. Moreover the asymptotic theory of the maximum likelihood estimates of the parameters is established under some mild assumptions. These models are applied to two examples; the first is the number of transactions per minute of Ericsson stock and the second is related to return times of extreme events of Goldman Sachs Group stock."
--- Without reading beyond the abstract, I'm guessing chains with complete connections.
to:NB
time_series
markov_models
statistics
Xiao, Wu: Covariance matrix estimation for stationary time series
6 weeks ago by cshalizi
"We obtain a sharp convergence rate for banded covariance matrix estimates of stationary processes. A precise order of magnitude is derived for the spectral radius of sample covariance matrices. We also consider a thresholded covariance matrix estimator that can better characterize sparsity if the true covariance matrix is sparse. As our main tool, we implement Toeplitz's [Math. Ann. 70 (1911) 351–376] idea and relate eigenvalues of covariance matrices to the spectral densities or Fourier transforms of the covariances. We develop a large deviation result for quadratic forms of stationary processes using m-dependence approximation, under the framework of causal representation and physical dependence measures."
to:NB
time_series
statistics
estimation
variance_estimation
[0802.4363] Estimating the entropy of binary time series: Methodology, some theory and a simulation study
6 weeks ago by cshalizi
"Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator.
Methodology. Three new entropy estimators are introduced. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters.
Theory. We prove that, unlike their earlier versions, the two new LZ-based estimators are consistent for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state HMM with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator.
Simulation. All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. Some conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. (iv) The main drawback of the plug-in method is its computational inefficiency."
in_NB
to_read
entropy_estimation
information_theory
time_series
statistics
kontoyiannis.ioannis
re:stacs
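--- Of the estimators compared, the plug-in method is the easiest to write down. A minimal sketch, assuming overlapping blocks and base-2 logarithms; the function name `plugin_entropy_rate` and the toy inputs are illustrative choices, not taken from the paper.

```python
from collections import Counter
from math import log2


def plugin_entropy_rate(bits, block_len):
    """Plug-in estimate of the entropy rate, in bits per symbol.

    Counts overlapping blocks of length `block_len`, computes the Shannon
    entropy of their empirical distribution, and divides by `block_len`.
    """
    n_blocks = len(bits) - block_len + 1
    if block_len < 1 or n_blocks < 1:
        raise ValueError("block_len must be between 1 and len(bits)")
    counts = Counter(tuple(bits[i:i + block_len]) for i in range(n_blocks))
    block_entropy = -sum(
        (c / n_blocks) * log2(c / n_blocks) for c in counts.values()
    )
    return block_entropy / block_len


# Two deterministic toy sequences (assumptions for illustration).
constant = [0] * 1000
alternating = [0, 1] * 500
h_const = plugin_entropy_rate(constant, 3)
h_alt_1 = plugin_entropy_rate(alternating, 1)
h_alt_4 = plugin_entropy_rate(alternating, 4)
```

The alternating sequence is deterministic, so its entropy rate is zero, yet the estimate is 1 bit at block length 1 and still about 0.25 at block length 4. Short blocks miss the dependence, while long blocks need far more data than is usually available, so the block length has to be chosen with care.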
[1203.3037] Expanding the Transfer Entropy to Identify Information Subgraphs in Complex Systems
7 weeks ago by cshalizi
"We propose a formal expansion of the transfer entropy to put in evidence irreducible sets of variables which provide information for the future state of each assigned target. Multiplets characterized by a high value will be associated to informational circuits present in the system, with an informational character (synergetic or redundant) which can be associated to the sign of the contribution. We also present preliminary results on fMRI and EEG data sets."
in_NB
graphical_models
information_theory
community_discovery
time_series
re:functional_communities
[1203.1515] Multiple Change-Point Estimation in Stationary Ergodic Time-Series
7 weeks ago by cshalizi
"The multiple change-point problem is considered in the most general setting, where the only assumption made on the time-series distributions generating the data is that they are stationary ergodic. No modeling, independence or parametric assumptions are made. While the need for such a general setting is dictated by real applications, the problem of change-point estimation becomes a difficult unsupervised learning problem. In this work a novel algorithm for solving this problem is proposed, and it is shown to be asymptotically consistent under the general assumptions considered."
to:NB
change-point_problem
time_series
ergodic_theory
statistics
statistical_inference_for_stochastic_processes
ryabko.daniil
[1203.6898] Long-term stability of sequential Monte Carlo methods under verifiable conditions
8 weeks ago by cshalizi
"This paper discusses particle filtering in general hidden Markov models (HMMs) and presents novel theoretical results on the long-term stability of bootstrap-type particle filters. More specifically, we establish that the asymptotic variance of the Monte Carlo estimates produced by the bootstrap filter is uniformly bounded in time. On the contrary to most previous results of this type, which in general presuppose that the state space of the hidden state process is compact (an assumption that is rarely satisfied in practice), our very mild assumptions are satisfied for a large class of HMMs with possibly non-compact state space. In addition, we derive a similar time uniform bound on the asymptotic Lp error. Importantly, our results hold for misspecified models, i.e. we do not at all assume that the data entering into the particle filter originate from the model governing the dynamics of the particles or not even from an HMM."
to:NB
particle_filters
stochastic_processes
time_series
state_estimation
state-space_models
markov_models
statistics
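--- For reference, the object whose stability is at issue: a minimal sketch of a bootstrap particle filter. The linear-Gaussian model, the parameter values, and the function name `bootstrap_filter` are all assumptions made for illustration; the paper treats general HMMs, including misspecified ones.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, a=0.9, q=1.0, r=1.0, seed=0):
    """Bootstrap particle filter for a toy linear-Gaussian state-space model.

    Assumed model (illustrative only):
        x_t = a * x_{t-1} + q * N(0, 1),   y_t = x_t + r * N(0, 1).
    Returns the sequence of filtered means E[x_t | y_1..y_t].
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, q) for _ in range(n_particles)]
    means = []
    for y in observations:
        # 1. Propagate each particle through the state dynamics.
        particles = [a * x + rng.gauss(0.0, q) for x in particles]
        # 2. Weight by the observation likelihood (log scale for stability).
        log_w = [-0.5 * ((y - x) / r) ** 2 for x in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # 3. Multinomial resampling according to the normalized weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


# Simulate data from the same toy model and run the filter on it.
sim_rng = random.Random(1)
state, states, obs = 0.0, [], []
for _ in range(200):
    state = 0.9 * state + sim_rng.gauss(0.0, 1.0)
    states.append(state)
    obs.append(state + sim_rng.gauss(0.0, 1.0))
filtered = bootstrap_filter(obs)
mse = sum((m - s) ** 2 for m, s in zip(filtered, states)) / len(states)
```

Feeding the filter data from a different process than the one it assumes (change the simulation but not `a`, `q`, `r`) gives the misspecified setting the abstract says the results cover.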
[math/0609514] Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models
8 weeks ago by cshalizi
"This paper concerns the use of sequential Monte Carlo methods (SMC) for smoothing in general state space models. A well-known problem when applying the standard SMC technique in the smoothing mode is that the resampling mechanism introduces degeneracy of the approximation in the path space. However, when performing maximum likelihood estimation via the EM algorithm, all functionals involved are of additive form for a large subclass of models. To cope with the problem in this case, a modification of the standard method (based on a technique proposed by Kitagawa and Sato) is suggested. Our algorithm relies on forgetting properties of the filtering dynamics and the quality of the estimates produced is investigated, both theoretically and via simulations."
to:NB
statistics
time_series
state_estimation
state-space_models
particle_filters
[1203.5673] Effect of Nonstationarity on Models Inferred from Neural Data
8 weeks ago by cshalizi
"Neurons subject to a common non-stationary input may exhibit a correlated firing behavior. Correlations in the statistics of neural spike trains also arise as the effect of interaction between neurons. Here we show that these two situations can be distinguished, with machine learning techniques, provided the data are rich enough. In order to do this, we study the problem of inferring a kinetic Ising model, stationary or nonstationary, from the available data. We apply the inference procedure to two data sets: one from salamander retinal ganglion cells and the other from a realistic computational cortical network model. We show that many aspects of the concerted activity of the salamander retinal neurons can be traced simply to the external input. A model of non-interacting neurons subject to a non-stationary external field outperforms a model with stationary input with couplings between neurons, even accounting for the differences in the number of model parameters. When couplings are added to the non-stationary model, for the retinal data, little is gained: the inferred couplings are generally not significant. Likewise, the distribution of the sizes of sets of neurons that spike simultaneously and the frequency of spike patterns as function of their rank (Zipf plots) are well-explained by an independent-neuron model with time-dependent external input, and adding connections to such a model does not offer significant improvement. For the cortical model data, robust couplings, well correlated with the real connections, can be inferred using the non-stationary model. Adding connections to this model slightly improves the agreement with the data for the probability of synchronous spikes but hardly affects the Zipf plot."
to:NB
neural_data_analysis
statistics
time_series
[1203.5950] Capturing the time-varying drivers of an epidemic using stochastic dynamical systems
8 weeks ago by cshalizi
"Epidemics are often modelled using state-space models based on dynamical systems, observed through partial and noisy data. In this paper we develop stochastic extensions to the popular SEIR model with parameters evolving in time, in order to capture unknown influences of changing behaviors, public interventions, seasonal effects etc. Our models assign diffusion processes for the time-varying parameters, and our inferential procedure is based on the particle Markov Chain Monte Carlo algorithm, suitably adjusted to accommodate the features of this challenging nonlinear stochastic model. The performance of the proposed computational methods is validated on simulated data and the adopted model is applied to the 2009 A/H1N1 pandemic in England. In addition to estimating the trajectories of the effective contact rate, the methodology is applied in real time to provide evidence in related public health decisions."
to:NB
time_series
epidemic_models
state-space_models
statistics
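--- A minimal sketch of the kind of model described: SEIR compartments driven by a contact rate whose logarithm follows a discretized Brownian motion. The Euler scheme, every parameter value, and the name `simulate_seir` are assumptions for illustration; the paper's exact specification and its particle MCMC inference are not reproduced here.

```python
import math
import random


def simulate_seir(n_steps=1000, dt=0.1, pop=1_000_000.0, beta0=0.5,
                  sigma=0.05, incubation_rate=0.5, recovery_rate=0.25, seed=0):
    """Euler simulation of SEIR with a random-walk log contact rate.

    Returns a list of (S, E, I, R, beta) tuples, one per time step.
    """
    rng = random.Random(seed)
    s, e, i, r = pop - 10.0, 0.0, 10.0, 0.0
    log_beta = math.log(beta0)
    path = []
    for _ in range(n_steps):
        # Diffusion step for the time-varying parameter.
        log_beta += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        beta = math.exp(log_beta)
        # Flows between compartments over one time step.
        new_exposed = beta * s * i / pop * dt
        new_infectious = incubation_rate * e * dt
        new_recovered = recovery_rate * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        path.append((s, e, i, r, beta))
    return path


path = simulate_seir()
```

Every flow leaves one compartment and enters another, so the total population is conserved at each step. Inference would treat the `beta` path as a latent state to be recovered from partial, noisy case counts.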
On robust tail index estimation for linear long-memory processes - Beran - 2012 - Journal of Time Series Analysis - Wiley Online Library
8 weeks ago by cshalizi
"We consider robust estimation of the tail index α for linear long-memory processes with i.i.d. innovations $\varepsilon_j$ following a symmetric α-stable law (1 < α < 2) and coefficients $a_j \sim c \cdot j^{-\beta}$. Estimates based on the left and right tail respectively are obtained together with a combined statistic with improved efficiency, and a test statistic comparing both tails. Asymptotic results are derived. Simulations illustrate the finite sample performance."
to:NB
heavy_tails
time_series
statistics
beran.jan
Time-series clustering via quasi U-statistics - Valk - 2012 - Journal of Time Series Analysis - Wiley Online Library
8 weeks ago by cshalizi
"The problem of time-series discrimination and classification is discussed. We propose a novel clustering algorithm based on a class of quasi U-statistics and subgroup decomposition tests. The decomposition may be applied to any concave time-series distance. The resulting test statistics are proven to be asymptotically normal for either i.i.d. or non-identically distributed groups of time-series under mild conditions. We illustrate its empirical performance on a simulation study and a real data analysis. The simulation setup includes stationary vs. stationary and stationary vs. non-stationary cases. The performance of the proposed method is favourably compared with some of the most common clustering measures available."
to:NB
clustering
time_series
statistics
classifiers
[0805.2214] Augmented GARCH sequences: Dependence structure and asymptotics
12 weeks ago by cshalizi
"The augmented GARCH model is a unification of numerous extensions of the popular and widely used ARCH process. It was introduced by Duan and besides ordinary (linear) GARCH processes, it contains exponential GARCH, power GARCH, threshold GARCH, asymmetric GARCH, etc. In this paper, we study the probabilistic structure of augmented $\mathrm{GARCH}(1,1)$ sequences and the asymptotic distribution of various functionals of the process occurring in problems of statistical inference. Instead of using the Markov structure of the model and implied mixing properties, we utilize independence properties of perturbed GARCH sequences to directly reduce their asymptotic behavior to the case of independent random variables. This method applies for a very large class of functionals and eliminates the fairly restrictive moment and smoothness conditions assumed in the earlier theory. In particular, we derive functional CLTs for powers of the augmented GARCH variables, derive the error rate in the CLT and obtain asymptotic results for their empirical processes under nearly optimal conditions."
to:NB
stochastic_processes
time_series
finance
[0805.1179] Autoregressive Process Modeling via the Lasso Procedure
12 weeks ago by cshalizi
"The Lasso is a popular model selection and estimation procedure for linear models that enjoys nice theoretical properties. In this paper, we study the Lasso estimator for fitting autoregressive time series models. We adopt a double asymptotic framework where the maximal lag may increase with the sample size. We derive theoretical results establishing various types of consistency. In particular, we derive conditions under which the Lasso estimator for the autoregressive coefficients is model selection consistent, estimation consistent and prediction consistent. Simulation study results are reported."
to:NB
time_series
statistics
lasso
sparsity
variable_selection
kith_and_kin
heard_the_talk
rinaldo.alessandro
nardi.yuval
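--- A minimal sketch of what fitting an autoregression with the Lasso looks like, using plain coordinate descent on the lagged design matrix. The solver, the penalty scaling, the simulated AR(1) data, and the names `soft_threshold` and `lasso_ar` are assumptions for illustration; the paper's contribution is the consistency theory, not this particular algorithm.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_ar(x, max_lag, lam, n_sweeps=50):
    """Lasso fit of an AR(max_lag) model by cyclic coordinate descent.

    Minimizes (1 / 2n) * sum_t (x_t - sum_j phi_j * x_{t-j})^2
              + lam * sum_j |phi_j|.
    Returns the list [phi_1, ..., phi_max_lag].
    """
    rows = [[x[t - j] for j in range(1, max_lag + 1)]
            for t in range(max_lag, len(x))]
    y = x[max_lag:]
    n = len(y)
    phi = [0.0] * max_lag
    resid = list(y)  # residuals when all coefficients are zero
    col_sq = [sum(row[j] ** 2 for row in rows) / n for j in range(max_lag)]
    for _ in range(n_sweeps):
        for j in range(max_lag):
            if col_sq[j] == 0.0:
                continue
            # Covariance of lag j with the partial residual that excludes lag j.
            rho = sum(row[j] * (res + row[j] * phi[j])
                      for row, res in zip(rows, resid)) / n
            new_phi = soft_threshold(rho, lam) / col_sq[j]
            delta = new_phi - phi[j]
            if delta != 0.0:
                resid = [res - row[j] * delta for row, res in zip(rows, resid)]
                phi[j] = new_phi
    return phi


# Simulated AR(1) with coefficient 0.6, deliberately over-fitted with 5 lags.
rng = random.Random(0)
x = [0.0]
for _ in range(999):
    x.append(0.6 * x[-1] + rng.gauss(0.0, 1.0))
phi_hat = lasso_ar(x, max_lag=5, lam=0.05)
phi_null = lasso_ar(x, max_lag=5, lam=100.0)
```

With a moderate penalty the first coefficient should land near 0.6, shrunk slightly toward zero, and the higher lags should be small or exactly zero. With a very large penalty every coefficient is set to zero.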
[0808.1010] Confidence bands in nonparametric time series regression
12 weeks ago by cshalizi
"We consider nonparametric estimation of mean regression and conditional variance (or volatility) functions in nonlinear stochastic regression models. Simultaneous confidence bands are constructed and the coverage probabilities are shown to be asymptotically correct. The imposed dependence structure allows applications in many linear and nonlinear auto-regressive processes. The results are applied to the S&P 500 Index data."
to:NB
statistics
regression
time_series
confidence_sets
to_teach:undergrad-ADA
[0805.3019] Three months journeying of a Hawaiian monk seal
12 weeks ago by cshalizi
"Hawaiian monk seals (Monachus schauinslandi) are endemic to the Hawaiian Islands and are the most endangered species of marine mammal that lives entirely within the jurisdiction of the United States. The species numbers around 1300 and has been declining owing, among other things, to poor juvenile survival which is evidently related to poor foraging success. Consequently, data have been collected recently on the foraging habitats, movements, and behaviors of monk seals throughout the Northwestern and main Hawaiian Islands. Our work here is directed to exploring a data set located in a relatively shallow offshore submerged bank (Penguin Bank) in our search of a model for a seal's journey. The work ends by fitting a stochastic differential equation (SDE) that mimics some aspects of the behavior of seals by working with location data collected for one seal. The SDE is found by developing a time varying potential function with two points of attraction. The times of location are irregularly spaced and not close together geographically, leading to some difficulties of interpretation. Synthetic plots generated using the model are employed to assess its reasonableness spatially and temporally. One aspect is that the animal stays mainly southwest of Molokai. The work led to the estimation of the lengths and locations of the seal's foraging trips."
to:NB
statistics
stochastic_differential_equations
statistical_inference_for_stochastic_processes
brillinger.david
time_series
[math/0410271] Statistical modeling of causal effects in continuous time
12 weeks ago by cshalizi
"This article studies the estimation of the causal effect of a time-varying treatment on time-to-an-event or on some other continuously distributed outcome. The paper applies to the situation where treatment is repeatedly adapted to time-dependent patient characteristics. The treatment effect cannot be estimated by simply conditioning on these time-dependent patient characteristics, as they may themselves be indications of the treatment effect. This time-dependent confounding is common in observational studies. Robins [(1992) Biometrika 79 321--334, (1998b) Encyclopedia of Biostatistics 6 4372--4389] has proposed the so-called structural nested models to estimate treatment effects in the presence of time-dependent confounding. In this article we provide a conceptual framework and formalization for structural nested models in continuous time. We show that the resulting estimators are consistent and asymptotically normal. Moreover, as conjectured in Robins [(1998b) Encyclopedia of Biostatistics 6 4372--4389], a test for whether treatment affects the outcome of interest can be performed without specifying a model for treatment effect. We illustrate the ideas in this article with an example."
to:NB
statistics
causal_inference
time_series
[0809.1053] An impossibility result for process discrimination
12 weeks ago by cshalizi
"Two series of binary observations $x_1,x_2,\ldots$ and $y_1,y_2,\ldots$ are presented: at each time $n\in\mathbb{N}$ we are given $x_n$ and $y_n$. It is assumed that the sequences are generated independently of each other by two B-processes. We are interested in the question of whether the sequences represent a typical realization of two different processes or of the same one. We demonstrate that this is impossible to decide, in the sense that every discrimination procedure is bound to err with non-negligible frequency when presented with sequences from some B-processes. This contrasts earlier positive results on B-processes, in particular those showing that there are consistent $\bar d$-distance estimates for this class of processes."
to:NB
statistics
time_series
stochastic_processes
ergodic_theory
statistical_inference_for_stochastic_processes
hypothesis_testing
[math/0510311] Adaptive density estimation under dependence
12 weeks ago by cshalizi
"Assume that $(X_t)_{t\in\mathbb{Z}}$ is a real valued time series admitting a common marginal density $f$ with respect to Lebesgue's measure. Donoho et al. (1996) propose a near-minimax method based on thresholding wavelets to estimate $f$ on a compact set in an independent and identically distributed setting. The aim of the present work is to extend these results to general weak dependent contexts. Weak dependence assumptions are expressed as decreasing bounds of covariance terms and are detailed for different examples. The threshold levels in estimators $\widehat f_n$ depend on weak dependence properties of the sequence $(X_t)_{t\in\mathbb{Z}}$ through the constant. If these properties are unknown, we propose cross-validation procedures to get new estimators. These procedures are illustrated via simulations of dynamical systems and non causal infinite moving averages. We also discuss the efficiency of our estimators with respect to the decrease of covariances bounds."
to:NB
statistics
density_estimation
wavelets
time_series
statistical_inference_for_stochastic_processes
[0810.2276] A generalized portmanteau test of independence between two stationary time series
12 weeks ago by cshalizi
"We propose generalized portmanteau-type test statistics in the frequency domain to test independence between two stationary time series. The test statistics are formed analogous to the one in Chen and Deo (2004, Econometric Theory 20, 382-416), who extended the applicability of portmanteau goodness-of-fit test to the long memory case. Under the null hypothesis of independence, the asymptotic standard normal distributions of the proposed statistics are derived under fairly mild conditions. In particular, each time series is allowed to possess short memory, long memory or anti-persistence. A simulation study shows that the tests have reasonable size and power properties."
in_NB
statistics
time_series
hypothesis_testing
independence_testing
[0801.0327] Nonparametric sequential prediction of time series
february 2012 by cshalizi
"Time series prediction covers a vast field of every-day statistical applications in medical, environmental and economic domains. In this paper we develop nonparametric prediction strategies based on the combination of a set of 'experts' and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these nonparametric strategies are more flexible, faster and generally outperform ARMA methods in terms of normalized cumulative prediction error."
in_NB
time_series
nonparametrics
prediction
statistics
to_teach:undergrad-ADA
re:growing_ensemble_project
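--- The "combination of a set of experts" idea in its simplest form: a minimal sketch of an exponentially weighted forecaster under squared loss. The three toy experts, the learning rate `eta`, and the name `exp_weighted_forecasts` are assumptions for illustration; the paper's experts are nonparametric predictors, and its aggregation scheme may differ in detail.

```python
import math


def exp_weighted_forecasts(series, experts, eta=1.0):
    """Combine expert forecasts with exponential weights on cumulative loss.

    `experts` is a list of functions mapping the observed history (a list)
    to a one-step-ahead forecast. Returns (combined_forecasts, final_weights).
    """
    def normalized_weights(losses):
        # Subtract the minimum loss before exponentiating, for stability.
        base = min(losses)
        raw = [math.exp(-eta * (loss - base)) for loss in losses]
        total = sum(raw)
        return [w / total for w in raw]

    cum_loss = [0.0] * len(experts)
    combined = []
    for t in range(1, len(series)):
        history = series[:t]
        preds = [f(history) for f in experts]
        weights = normalized_weights(cum_loss)
        combined.append(sum(w * p for w, p in zip(weights, preds)))
        # The true value is revealed; each expert pays its squared error.
        cum_loss = [loss + (p - series[t]) ** 2
                    for loss, p in zip(cum_loss, preds)]
    return combined, normalized_weights(cum_loss)


def last_value(history):
    return history[-1]


def running_mean(history):
    return sum(history) / len(history)


def linear_extrapolation(history):
    if len(history) < 2:
        return history[-1]
    return 2 * history[-1] - history[-2]


# Toy series (an assumption): a pure linear trend.
series = [float(t) for t in range(30)]
experts = [last_value, running_mean, linear_extrapolation]
combined, final_weights = exp_weighted_forecasts(series, experts)
```

On the trend series the extrapolating expert is exact after its first step, so the weights concentrate on it and the combined forecast tracks the series. No expert has to be correct for the scheme to run; the mixture simply follows whichever has predicted best so far.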
[1201.6211] On the range of validity of the autoregressive sieve bootstrap
february 2012 by cshalizi
"We explore the limits of the autoregressive (AR) sieve bootstrap, and show that its applicability extends well beyond the realm of linear time series as has been previously thought. In particular, for appropriate statistics, the AR-sieve bootstrap is valid for stationary processes possessing a general Wold-type autoregressive representation with respect to a white noise; in essence, this includes all stationary, purely nondeterministic processes, whose spectral density is everywhere positive. Our main theorem provides a simple and effective tool in assessing whether the AR-sieve bootstrap is asymptotically valid in any given situation. In effect, the large-sample distribution of the statistic in question must only depend on the first and second order moments of the process; prominent examples include the sample mean and the spectral density. As a counterexample, we show how the AR-sieve bootstrap is not always valid for the sample autocovariance even when the underlying process is linear."
in_NB
bootstrap
time_series
statistics
stochastic_processes
The Asymmetric Business Cycle
february 2012 by cshalizi
"The business cycle is a fundamental yet elusive concept in macroeconomics. In this paper, we consider the problem of measuring the business cycle. First, we argue for the output-gap view that the business cycle corresponds to transitory deviations in economic activity away from a permanent, or trend, level. Then we investigate the extent to which a general model-based approach to estimating trend and cycle for the U.S. economy leads to measures of the business cycle that reflect models versus the data. We find empirical support for a nonlinear time series model that produces a business cycle measure with an asymmetric shape across NBER expansion and recession phases. Specifically, this business cycle measure suggests that recessions are periods of relatively large and negative transitory fluctuations in output. However, several close competitors to the nonlinear model produce business cycle measures of widely differing shapes and magnitudes. Given this model-based uncertainty, we construct a model-averaged measure of the business cycle. This measure also displays an asymmetric shape and is closely related to other measures of economic slack such as the unemployment rate and capacity utilization."
--- Worthy, but at the same time makes me want to lock them in a room with a copy of Li and Racine's _Nonparametric Econometrics_, or even _The Elements of Statistical Learning_, and not let them out until they understand it.
in_NB
time_series
statistics
economics
macroeconomics
inference_to_latent_objects
re:your_favorite_dsge_sucks
morley.james
have_read
ensemble_methods
model_selection
february 2012 by cshalizi
Improved Predictions of Lynx Trappings Using a Biological Model
january 2012 by cshalizi
Sweet. (Bayesian estimation seems like overkill here however, especially since predictions are just made from point estimates.)
in_NB
have_read
to_teach:undergrad-ADA
to_teach:complexity-and-inference
re:stacs
dynamical_systems
stochastic_processes
statistical_inference_for_stochastic_processes
statistics
time_series
via:gelman
january 2012 by cshalizi
Forecasting Time Series with Complex Seasonal Patterns Using Exponential Smoothing
january 2012 by cshalizi
An innovations state space modeling framework is introduced for forecasting complex seasonal time series such as those with multiple seasonal periods, high-frequency seasonality, non-integer seasonality, and dual-calendar effects. The new framework incorporates Box–Cox transformations, Fourier representations with time varying coefficients, and ARMA error correction. Likelihood evaluation and analytical expressions for point forecasts and interval predictions under the assumption of Gaussian errors are derived, leading to a simple, comprehensive approach to forecasting complex seasonal time series. A key feature of the framework is that it relies on a new method that greatly reduces the computational burden in the maximum likelihood estimation. The modeling framework is useful for a broad range of applications, its versatility being illustrated in three empirical studies. In addition, the proposed trigonometric formulation is presented as a means of decomposing complex seasonal time series, and it is shown that this decomposition leads to the identification and extraction of seasonal components which are otherwise not apparent in the time series plot itself.
to:NB
statistics
time_series
prediction
january 2012 by cshalizi
Quantifying Statistical Interdependence, Part III: N > 2 Point Processes
january 2012 by cshalizi
"Stochastic event synchrony (SES) is a recently proposed family of similarity measures. First, “events” are extracted from the given signals; next, one tries to align events across the different time series. The better the alignment, the more similar the N time series are considered to be. The similarity measures quantify the reliability of the events (the fraction of “nonaligned” events) and the timing precision. So far, SES has been developed for pairs of one-dimensional (Part I) and multidimensional (Part II) point processes. In this letter (Part III), SES is extended from pairs of signals to N > 2 signals. The alignment and SES parameters are again determined through statistical inference, more specifically, by alternating two steps: (1) estimating the SES parameters from a given alignment and (2), with the resulting estimates, refining the alignment. The SES parameters are computed by maximum a posteriori (MAP) estimation (step 1), in analogy to the pairwise case. The alignment (step 2) is solved by linear integer programming. In order to test the robustness and reliability of the proposed N-variate SES method, it is first applied to synthetic data. We show that N-variate SES results in more reliable estimates than bivariate SES. Next N-variate SES is applied to two problems in neuroscience: to quantify the firing reliability of Morris-Lecar neurons and to detect anomalies in EEG synchrony of patients with mild cognitive impairment. Those problems were also considered in Parts I and II, respectively. In both cases, the N-variate SES approach yields a more detailed analysis."
to:NB
neural_data_analysis
point_processes
time_series
january 2012 by cshalizi
Clements , Schoenberg , Schorlemmer : Residual analysis methods for space–time point processes with applications to earthquake forecast models in California
december 2011 by cshalizi
"Modern, powerful techniques for the residual analysis of spatial-temporal point process models are reviewed and compared. These methods are applied to California earthquake forecast models used in the Collaboratory for the Study of Earthquake Predictability (CSEP). Assessments of these earthquake forecasting models have previously been performed using simple, low-power means such as the L-test and N-test. We instead propose residual methods based on rescaling, thinning, superposition, weighted K-functions and deviance residuals. Rescaled residuals can be useful for assessing the overall fit of a model, but as with thinning and superposition, rescaling is generally impractical when the conditional intensity λ is volatile. While residual thinning and superposition may be useful for identifying spatial locations where a model fits poorly, these methods have limited power when the modeled conditional intensity assumes extremely low or high values somewhere in the observation region, and this is commonly the case for earthquake forecasting models. A recently proposed hybrid method of thinning and superposition, called super-thinning, is a more powerful alternative. The weighted K-function is powerful for evaluating the degree of clustering or inhibition in a model. Competing models are also compared using pixel-based approaches, such as Pearson residuals and deviance residuals. 
The different residual analysis techniques are demonstrated using the CSEP models and are used to highlight certain deficiencies in the models, such as the overprediction of seismicity in inter-fault zones for the model proposed by Helmstetter, Kagan and Jackson [Seismological Research Letters 78 (2007) 78–86], the underprediction of the model proposed by Kagan, Jackson and Rong [Seismological Research Letters 78 (2007) 94–98] in forecasting seismicity around the Imperial, Laguna Salada, and Panamint clusters, and the underprediction of the model proposed by Shen, Jackson and Kagan [Seismological Research Letters 78 (2007) 116–120] in forecasting seismicity around the Laguna Salada, Baja, and Panamint clusters."
to:NB
point_processes
spatial_statistics
time_series
statistics
model_selection
model-checking
prediction
earthquakes
geology
december 2011 by cshalizi
Phys. Rev. E 84, 066702 (2011): Nonparametric model reconstruction for stochastic differential equations from discretely observed time-series data
december 2011 by cshalizi
"A scheme is developed for estimating state-dependent drift and diffusion coefficients in a stochastic differential equation from time-series data. The scheme does not require to specify parametric forms for the drift and diffusion coefficients in advance. In order to perform the nonparametric estimation, a maximum likelihood method is combined with a concept based on a kernel density estimation. In order to deal with discrete observation or sparsity of the time-series data, a local linearization method is employed, which enables a fast estimation."
to:NB
statistics
time_series
stochastic_differential_equations
statistical_inference_for_stochastic_processes
december 2011 by cshalizi
[0805.0463] Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions
december 2011 by cshalizi
"We propose a distance between two realizations of a random process where for each realization only sparse and irregularly spaced measurements with additional measurement errors are available. Such data occur commonly in longitudinal studies and online trading data. A distance measure then makes it possible to apply distance-based analysis such as classification, clustering and multidimensional scaling for irregularly sampled longitudinal data. Once a suitable distance measure for sparsely sampled longitudinal trajectories has been found, we apply distance-based clustering methods to eBay online auction data. We identify six distinct clusters of bidding patterns. Each of these bidding patterns is found to be associated with a specific chance to obtain the auctioned item at a reasonable price."
to:NB
statistics
clustering
time_series
december 2011 by cshalizi
[1112.1838] Non-parametric kernel estimation for symmetric Hawkes processes. Application to high frequency financial data
december 2011 by cshalizi
"We define a numerical method that provides a non-parametric estimation of the kernel shape in symmetric multivariate Hawkes processes. This method relies on second order statistical properties of Hawkes processes that relate the covariance matrix of the process to the kernel matrix. The square root of the correlation function is computed using a minimal phase recovering method. We illustrate our method on some examples and provide an empirical study of the estimation errors. Within this framework, we analyze high frequency financial price data modeled as 1D or 2D Hawkes processes. We find slowly decaying (power-law) kernel shapes suggesting a long memory nature of self-excitation phenomena at the microstructure level of price dynamics."
to:NB
kernel_estimators
time_series
point_processes
nonparametrics
statistics
re:LoB_project
december 2011 by cshalizi
[1112.1674] Predicting Failures of Point Forecasts
december 2011 by cshalizi
"The predictability of errors in deterministic temperature forecasts is investigated. More precisely, the aim is to issue warnings whenever the differences between forecast and verification exceed a given threshold. The warnings are generated by analyzing the output of an ensemble forecast system in terms of a decision making approach. The quality of the resulting predictions is evaluated by computing receiver operating characteristics, the Brier score, and the Ignorance score. Special emphasis is also given to the question whether rare events are better predictable."
to:NB
prediction
statistics
time_series
dynamical_systems
december 2011 by cshalizi
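--- A minimal sketch of two of the scores the abstract names, for a binary event such as "forecast error exceeds the threshold". The function names and the base-2 logarithm are my assumptions, not taken from the paper; `p` is the issued probability of the event and `y` is 1 if it happened, 0 otherwise.

```python
import numpy as np

def brier_score(p, y):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((p - y) ** 2)

def ignorance_score(p, y):
    """Mean negative log2-probability assigned to what actually happened."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    # Probability the forecast gave to the realized outcome.
    p_outcome = np.where(y == 1, p, 1.0 - p)
    return -np.mean(np.log2(p_outcome))
```

Lower is better for both. A forecaster who always says 0.5 gets a Brier score of 0.25 and an Ignorance score of 1 bit, whatever happens.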
[1111.6291] Semiparametric Time Series Models with Log-concave Innovations
december 2011 by cshalizi
"We study a class of semiparametric time series models with innovations following a log-concave distribution. We propose a general maximum likelihood framework which allows us to estimate simultaneously the parameters of a model and the density of the innovations. This framework can be easily adapted to many well-known models, including ARMA and GARCH. Furthermore, we show that the estimator under our new framework is consistent in both ARMA and GARCH settings. We demonstrate its finite sample performance via a thorough simulation study and apply it to two real data sets concerning the streamflow of the Hirnant river and the FTSE daily return."
to:NB
statistics
time_series
nonparametrics
december 2011 by cshalizi
[1111.6801] The direct L2 geometric structure on a manifold of probability densities with applications to Filtering
december 2011 by cshalizi
"In this paper we introduce a projection method for the space of probability distributions based on the differential geometric approach to statistics. This method is based on a direct L2 metric as opposed to the usual Hellinger distance and the related Fisher Information metric. We explain how this apparatus can be used for the nonlinear filtering problem, in relationship also to earlier projection methods based on the Fisher metric. Past projection filters focused on the Fisher metric and the exponential families that made the filter correction step exact. In this work we introduce the mixture projection filter, namely the projection filter based on the direct $L^2$ metric and based on a manifold given by a mixture of pre-assigned densities. The resulting prediction step in the filtering problem is described by a linear differential equation, while the correction step can be made exact."
in_NB
filtering
state_estimation
information_geometry
time_series
december 2011 by cshalizi
Yao , Müller , Wang : Functional linear regression analysis for longitudinal data
december 2011 by cshalizi
"We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allows for different patterns regarding the timing of the measurements obtained for predictor and response trajectories. Asymptotic properties for a sample of n subjects are investigated under mild conditions, as n→∞, and we obtain consistent estimation for the regression function. Besides convergence results for the components of functional linear regression, such as the regression parameter function, we construct asymptotic pointwise confidence bands for the predicted trajectories. A functional coefficient of determination as a measure of the variance explained by the functional regression model is introduced, extending the standard R2 to the functional case. The proposed methods are illustrated with a simulation study, longitudinal primary biliary liver cirrhosis data and an analysis of the longitudinal relationship between blood pressure and body mass index."
to:NB
statistics
time_series
regression
functional_data
december 2011 by cshalizi
Quantile regression for longitudinal data based on latent Markov subject-specific parameters Alessio Farcomeni - Statistics and Computing, Volume 22, Number 1
december 2011 by cshalizi
"We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set."
to:NB
regression
time_series
prediction
markov_models
december 2011 by cshalizi
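--- Why an asymmetric Laplace likelihood gives quantile regression, as a sketch: for fixed scale, its negative log-likelihood in the location is the quantile "check" loss plus a constant, so maximizing the one minimizes the other. This covers only that one ingredient of the abstract; the latent Markov random effects and the EM recursions are not attempted here, and the function names are mine.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss: u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def ald_neg_loglik(y, mu, tau, sigma=1.0):
    """Negative log-likelihood of i.i.d. asymmetric Laplace observations.

    Density: tau * (1 - tau) / sigma * exp(-check_loss((y - mu) / sigma, tau)).
    """
    y = np.asarray(y, dtype=float)
    const = -len(y) * np.log(tau * (1.0 - tau) / sigma)
    return const + check_loss((y - mu) / sigma, tau).sum()
```

With `tau = 0.5`, searching over `mu` for the smallest `ald_neg_loglik` picks out the sample median, outliers notwithstanding.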
Phys. Rev. E 84, 056214 (2011): State and parameter estimation using unconstrained optimization
november 2011 by cshalizi
"We present an efficient method for estimating variables and parameters of a given system of ordinary differential equations by adapting the model output to an observed time series from the (physical) process described by the model. The proposed method is based on (unconstrained) nonlinear optimization exploiting the particular structure of the relevant cost function. To illustrate the features and performance of the method, simulations are presented using chaotic time series generated by the Colpitts oscillator, the three-dimensional Hindmarsh-Rose neuron model, and a nine-dimensional extended Rössler system." --- Sounds like Hooker & Ramsay.
to:NB
dynamical_systems
statistics
time_series
estimation
statistical_inference_for_stochastic_processes
november 2011 by cshalizi
[1111.5312] Representations and Ensemble Methods for Dynamic Relational Classification
november 2011 by cshalizi
"Temporal networks are ubiquitous and evolve over time by the addition, deletion, and changing of links, nodes, and attributes. Although many relational datasets contain temporal information, the majority of existing techniques in relational learning focus on static snapshots and ignore the temporal dynamics. We propose a framework for discovering temporal representations of relational data to increase the accuracy of statistical relational learning algorithms. The temporal relational representations serve as a basis for classification, ensembles, and pattern mining in evolving domains. The framework includes (1) selecting the time-varying relational components (links, attributes, nodes), (2) selecting the temporal granularity, (3) predicting the temporal influence of each time-varying relational component, and (4) choosing the weighted relational classifier. Additionally, we propose temporal ensemble methods that exploit the temporal-dimension of relational data. These ensembles outperform traditional and more sophisticated relational ensembles while avoiding the issue of learning the most optimal representation. Finally, the space of temporal-relational models are evaluated using a sample of classifiers. In all cases, the proposed temporal-relational classifiers outperform competing models that ignore the temporal information. The results demonstrate the capability and necessity of the temporal-relational representations for classification, ensembles, and for mining temporal datasets."
in_NB
to_read
relational_learning
network_data_analysis
transaction_networks
neville.jennifer
machine_learning
ensemble_methods
time_series
classifiers
november 2011 by cshalizi
[1111.4226] Joint Modeling of Multiple Related Time Series via the Beta Process
november 2011 by cshalizi
"We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our approach is based on the discovery of a set of latent, shared dynamical behaviors. Using a beta process prior, the size of the set and the sharing pattern are both inferred from data. We develop efficient Markov chain Monte Carlo methods based on the Indian buffet process representation of the predictive distribution of the beta process, without relying on a truncated model. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth and death proposals. We examine the benefits of our proposed feature-based model on several synthetic datasets, and also demonstrate promising results on unsupervised segmentation of visual motion capture data."
to:NB
heard_the_talk
time_series
statistics
machine_learning
nonparametrics
fox.emily
jordan.michael_i.
november 2011 by cshalizi
Kreiss , Paparoditis , Politis : On the range of validity of the autoregressive sieve bootstrap
october 2011 by cshalizi
"We explore the limits of the autoregressive (AR) sieve bootstrap, and show that its applicability extends well beyond the realm of linear time series as has been previously thought. In particular, for appropriate statistics, the AR-sieve bootstrap is valid for stationary processes possessing a general Wold-type autoregressive representation with respect to a white noise; in essence, this includes all stationary, purely nondeterministic processes, whose spectral density is everywhere positive. Our main theorem provides a simple and effective tool in assessing whether the AR-sieve bootstrap is asymptotically valid in any given situation. In effect, the large-sample distribution of the statistic in question must only depend on the first and second order moments of the process; prominent examples include the sample mean and the spectral density. As a counterexample, we show how the AR-sieve bootstrap is not always valid for the sample autocovariance even when the underlying process is linear."
in_NB
time_series
bootstrap
statistics
stochastic_processes
october 2011 by cshalizi
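--- For reference, a bare-bones sketch of an AR-sieve bootstrap for a statistic such as the sample mean: fit an AR(p), resample the centered residuals, regenerate the series, recompute the statistic. The simplifications are mine, not the paper's: p is handed in by the user rather than grown with the sample size, and the fit is plain least squares, which does not guarantee a stationary fitted model.

```python
import numpy as np

def ar_sieve_bootstrap(x, p, stat=np.mean, n_boot=500, burn_in=100, seed=0):
    """Return n_boot bootstrap replicates of stat, generated from a fitted AR(p)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    z = x - mu

    # Least-squares AR(p) fit: z[t] ~ sum_j phi[j] * z[t - 1 - j].
    X = np.column_stack([z[p - 1 - j : n - 1 - j] for j in range(p)])
    y = z[p:]
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Centered residuals play the role of the white-noise innovations.
    resid = y - X @ phi
    resid = resid - resid.mean()

    reps = np.empty(n_boot)
    for b in range(n_boot):
        eps = rng.choice(resid, size=n + burn_in, replace=True)
        zb = np.zeros(n + burn_in)
        for t in range(p, n + burn_in):
            # zb[t - p : t][::-1] is (zb[t-1], ..., zb[t-p]), matching phi's order.
            zb[t] = phi @ zb[t - p : t][::-1] + eps[t]
        # Drop the burn-in, restore the mean, recompute the statistic.
        reps[b] = stat(zb[burn_in:] + mu)
    return reps
```

The spread of the returned replicates is the bootstrap estimate of the statistic's sampling variability. Per the abstract, this should only be trusted for statistics whose limiting distribution depends on first and second moments alone.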
A Measure of Stationarity in Locally Stationary Processes With Applications to Testing - Journal of the American Statistical Association - 106(495):1113
october 2011 by cshalizi
"In this article we investigate the problem of measuring deviations from stationarity in locally stationary time series. Our approach is based on a direct estimate of the L2-distance between the spectral density of the locally stationary process and its best approximation by a spectral density of a stationary process. An explicit expression of the minimal distance is derived, which depends only on integrals of the spectral density of the locally stationary process and its square. These integrals can be estimated directly without estimating the spectral density, and as a consequence, the estimation of the measure of stationarity does not require the specification of a smoothing bandwidth. We show weak convergence of an appropriately standardized version of the statistic to a standard normal distribution. The results are used to construct confidence intervals for the measure of stationarity and to develop a new test for the hypothesis of stationarity. Finally, we investigate the finite sample properties of the resulting confidence intervals and tests by means of a simulation study and illustrate the methodology in two data examples. Parts of the proofs are available online as supplemental material to this article."
in_NB
to_read
statistics
time_series
stationarity
spectral_estimation
october 2011 by cshalizi
Phys. Rev. E 84, 046205 (2011): Nonuniqueness of global modeling and time scaling
october 2011 by cshalizi
"Starting from an observed single time series, it is shown how to reconstruct a global model in the original phase space by using the ansatz library approach. This model is then compared to the underlying dynamical system that describes the initial time series, and the nonuniqueness of the reconstructed model is discussed. This framework is extended by taking an additional time scaling factor in the reconstructed model class under consideration."
to:NB
time_series
state-space_reconstruction
to_read
october 2011 by cshalizi
Phys. Rev. E 84, 046702 (2011): Nonparametric segmentation of nonstationary time series
october 2011 by cshalizi
"The nonstationary evolution of observable quantities in complex systems can frequently be described as a juxtaposition of quasistationary spells. Given that standard theoretical and data analysis approaches usually rely on the assumption of stationarity, it is important to detect in real time series intervals holding that property. With that aim, we introduce a segmentation algorithm based on a fully nonparametric approach. We illustrate its applicability through the analysis of real time series presenting diverse degrees of nonstationarity, thus showing that this segmentation procedure generalizes and allows one to uncover features unresolved by previous proposals based on the discrepancy of low order statistical moments only."
in_NB
statistics
change-point_problem
time_series
nonparametrics
re:growing_ensemble_project
october 2011 by cshalizi
Phys. Rev. Lett. 107, 148501 (2011): Emergence of El Niño as an Autonomous Component in the Climate Network
october 2011 by cshalizi
We construct and analyze a climate network which represents the interdependent structure of the climate in different geographical zones and find that the network responds in a unique way to El Niño events. Analyzing the dynamics of the climate network shows that when El Niño events begin, the El Niño basin partially loses its influence on its surroundings. After typically three months, this influence is restored while the basin loses almost all dependence on its surroundings and becomes autonomous. The formation of an autonomous basin is the missing link to understand the seemingly contradicting phenomena of the afore-noticed weakening of the interdependencies in the climate network during El Niño and the known impact of the anomalies inside the El Niño basin on the global climate system.
climatology
time_series
macro_from_micro
emergence
to:NB
pattern_formation
october 2011 by cshalizi
Reality Checks and Comparisons of Nested Predictive Models - Journal of Business and Economic Statistics - 0(0):1
september 2011 by cshalizi
"This article develops a simple bootstrap method for simulating asymptotic critical values for tests of equal forecast accuracy and encompassing among many nested models. Our method combines elements of fixed regressor and wild bootstraps. We first derive the asymptotic distributions of tests of equal forecast accuracy and encompassing applied to forecasts from multiple models that nest the benchmark model—that is, reality check tests. We then prove the validity of the bootstrap for these tests. Monte Carlo experiments indicate that our proposed bootstrap has better finite-sample size and power than other methods designed for comparison of nonnested models."
statistics
model_checking
model_selection
time_series
bootstrap
to_read
to_teach:undergrad-ADA
encompassing
september 2011 by cshalizi
[1107.5543] Coevolution of Network Structure and Content
july 2011 by cshalizi
Disappointing. The content variables are all completely ad hoc (the structure variables are also ad hoc, but traditional), so we really have no idea of what is being found here. And there is no assessment of uncertainty at all. And, for the love of Gauss, stop using R^2 like that!
time_series
social_networks
social_media
statistics
adamic.lada
to:NB
have_read
network_data_analysis
july 2011 by cshalizi
How Useful are Estimated DSGE Model Forecasts? by Rochelle Edge, Refet Gurkaynak :: SSRN
july 2011 by cshalizi
The methodological ideas here are suspect. It is true that there is not much to predict about an in-control system, and what is happening is largely random and so unpredictable, so that even the true model would show low forecasting ability. The question however is why we are supposed to think that the DSGE _does_ give us good information about counterfactuals. If you could show that it had much better predictive performance than baselines like constants or random walks during _out-of-control_ periods, that would be something; but they don't.
re:your_favorite_dsge_sucks
dsges
prediction
economics
macroeconomics
time_series
statistics
in_NB
have_read
to:blog
july 2011 by cshalizi
[0812.0449] Locally adaptive estimation methods with application to univariate time series
july 2011 by cshalizi
"The paper offers a unified approach to the study of three locally adaptive estimation methods in the context of univariate time series from both theoretical and empirical points of view. A general procedure for the computation of critical values is given. The underlying model encompasses all distributions from the exponential family providing for great flexibility. The procedures are applied to simulated and real financial data distributed according to the Gaussian, volatility, Poisson, exponential and Bernoulli models. Numerical results exhibit a very reasonable performance of the methods."
time_series
statistics
estimation
exponential_families
non-stationarity
to:NB
july 2011 by cshalizi
[0711.3856] Forward estimation for ergodic time series
july 2011 by cshalizi
"The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,...,X_n)$ and for which, ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,...,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and moreover, the error tends to zero in probability."
prediction
ergodic_theory
time_series
statistics
morvai.gusztav
weiss.benjamin
july 2011 by cshalizi
[1104.3073] Feature Matching in Time Series Modelling
april 2011 by cshalizi
"Using a time series model to mimic an observed time series has a long history. However, with regard to this objective, conventional estimation methods for discrete-time dynamical models are frequently found to be wanting. In fact, they are characteristically misguided in at least two respects: (i) assuming that there is a true model; (ii) evaluating the efficacy of the estimation as if the postulated model is true. There are numerous examples of models, when fitted by conventional methods, that fail to capture some of the most basic global features of the data, such as cycles with good matching periods, singularities of spectral density functions (especially at the origin) and others. We argue that the shortcomings need not always be due to the model formulation but the inadequacy of the conventional fitting methods. After all, all models are wrong, but some are useful if they are fitted properly. The practical issue becomes one of how to best fit the model to data. Thus, in the absence of a true model, we prefer an alternative approach to conventional model fitting that typically involves one-step-ahead prediction errors. Our primary aim is to match the joint probability distribution of the observable time series, including long-term features of the dynamics that underpin the data, such as cycles, long memory and others, rather than short-term prediction. For want of a better name, we call this specific aim feature matching."
to:NB
time_series
statistics
model-checking
to_read
april 2011 by cshalizi
[0812.2749] Nonparametric inference of a trend using functional data
april 2011 by cshalizi
I guess I've been more or less presuming this was true. (And I'd have been wrong about the form of the simultaneous CI, actually.) Worth trying to work into the final exam for The Kids?
curve_fitting
gaussian_processes
time_series
statistics
nonparametrics
have_read
confidence_sets
to_teach:undergrad-ADA
april 2011 by cshalizi
Efficient probabilistic forecasts for counts - McCabe et al., 2011 - JRSS-B
march 2011 by cshalizi
"Efficient probabilistic forecasts of integer-valued random variables are derived. The optimality is achieved by estimating the forecast distribution non-parametrically over a given broad model class and proving asymptotic (non-parametric) efficiency in that setting. The method is developed within the context of the integer auto-regressive class of models, which is a suitable class for any count data that can be interpreted as a queue, stock, birth-and-death process or branching process. The theoretical proofs of asymptotic efficiency are supplemented by simulation results that demonstrate the overall superiority of the non-parametric estimator relative to a misspecified parametric alternative, in large but finite samples. The method is applied to counts of stock market iceberg orders. A subsampling method is used to assess sampling variation in the full estimated forecast distribution and a proof of its validity is given." (Dunno about the to_teach tags, I haven't read this yet.)
statistics
prediction
density_estimation
time_series
stochastic_processes
branching_processes
to_teach:data-mining
to_teach:undergrad-ADA
march 2011 by cshalizi
[1101.0673] Autoregressive Kernels For Time Series
january 2011 by cshalizi
Under this kernel, two time series are similar if they lead to similar vector autoregressions...
kernel_methods
time_series
statistics
january 2011 by cshalizi
Combining Nonparametric and Optimal Linear Time Series Predictions
january 2011 by cshalizi
ARMA model forecasting, supplemented somehow with nonparametric smoothing of the residuals. (I haven't read beyond the abstract.)
time_series
prediction
statistics
nonparametrics
to_teach:undergrad-ADA
january 2011 by cshalizi
UnderstandingSociety: Ngram anomalies
december 2010 by cshalizi
It is indeed rather odd-looking.
google_ngrams
time_series
colors
december 2010 by cshalizi
[1012.3795] Estimating Networks With Jumps
december 2010 by cshalizi
"We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying nonstationary time series data, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to i.i.d. data from an invariant model widely considered in current literature of structural estimation. In particular, we consider the scenario in which the model evolves in a piece-wise constant fashion. We propose a procedure that minimizes the so-called TESLA loss (i.e., temporally smoothed L1 regularized regression), which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition."
graphical_models
network_data_analysis
time_series
model_selection
statistics
xing.eric
december 2010 by cshalizi
Rule generation for categorical time series with Markov assumptions
december 2010 by cshalizi
"Several procedures of sequential pattern analysis are designed to detect frequently occurring patterns in a single categorical time series (episode mining). Based on these frequent patterns, rules are generated and evaluated, for example, in terms of their confidence. The confidence value is commonly interpreted as an estimate of a conditional probability, so some kind of stochastic model has to be assumed. The model is identified as a variable length Markov model. With this assumption, the usual confidences are maximum likelihood estimates of the transition probabilities of the Markov model. We discuss possibilities of how to efficiently fit an appropriate model to the data. Based on this model, rules are formulated. It is demonstrated that this new approach generates noticeably less and more reliable rules." --- I should really add some time series stuff to data mining...
data_mining
markov_models
time_series
in_NB
to_teach:data-mining
variable-length_markov_models
december 2010 by cshalizi
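The abstract's remark that rule confidences are maximum likelihood estimates of Markov transition probabilities can be illustrated with plain counting. This sketch assumes a fixed-order chain, not the variable-length model the paper uses, and the function name `transition_estimates` is made up for the example.

```python
from collections import Counter, defaultdict


def transition_estimates(sequence, order=1):
    """MLE of P(next symbol | previous `order` symbols) from a single sequence.

    The estimate for a context -> symbol pair is count(context, symbol) /
    count(context), which is also the usual "confidence" of the rule
    context => symbol.
    """
    counts = defaultdict(Counter)
    for t in range(order, len(sequence)):
        context = tuple(sequence[t - order:t])
        counts[context][sequence[t]] += 1
    return {
        context: {symbol: n / sum(followers.values())
                  for symbol, n in followers.items()}
        for context, followers in counts.items()
    }


rules = transition_estimates("abababac", order=1)
# rules[("a",)] holds the confidences of the rules a => b and a => c.
```

In the sequence above, "a" is followed by "b" three times and by "c" once, so the rule a => b has confidence 0.75 and a => c has confidence 0.25.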
[1011.2998] A compact statistical model of the song syntax in Bengalese finch
november 2010 by cshalizi
"Songs of many songbird species consist of variable sequences of a finite number of syllables. A common approach for characterizing the syntax of these ... sequences is to use transition probabilities between the syllables. This is equivalent to the Markov model, in which each syllable is associated with one state, and the transition probabilities between the states do not depend on the state transition history. ... analyze the song syntax in a Bengalese finch. ... the Markov model fails ... Instead, ... include adaptation of the self-transition probabilities when states are repeatedly revisited ... more than one state to the same syllable. ... Mathematically, the model is a partially observable Markov model with adaptation (POMMA). ... supports the branching chain network hypothesis of how syntax is controlled within the premotor song nucleus HVC ..."
birds
grammar_induction
markov_models
time_series
to_teach:complexity-and-inference
to_read
in_NB
november 2010 by cshalizi
Phys. Rev. E 82, 056206 (2010): Forecasting the evolution of nonlinear and nonstationary systems using recurrence-based local Gaussian process models
november 2010 by cshalizi
"...combining nonparametric Gaussian process (GP) modeling with certain local topological considerations ... for prediction (one-step look ahead) of ... nonlinear and nonstationary dynamics. ... partition ... trajectories into multiple near-stationary segments by aligning the boundaries of the partitions with those of the piecewise affine projections of the underlying dynamic system... alignment is achieved through the consideration of recurrence and other local topological properties ... forecasting in Lorenz system under different levels of induced noise and nonstationarity, synthetic heart-rate signals, and a real-world time-series from an industrial operation known to exhibit highly nonlinear and nonstationary dynamics. ... local Gaussian process can significantly outperform not just classical system identification, neural network and nonparametric models, but also the sequential Bayesian Monte Carlo methods in terms of prediction accuracy and computational speed."
time_series
prediction
non-stationarity
gaussian_processes
re:growing_ensemble_project
to_read
november 2010 by cshalizi