cshalizi + time_series   205

[1205.3845] Forecasting with Historical Data or Process Knowledge under Misspecification: A Comparison
"When faced with the task of forecasting a dynamic system, practitioners often have available historical data, knowledge of the system, or a combination of both. While intuition dictates that perfect knowledge of the system should in theory yield perfect forecasting, often knowledge of the system is only partially known, known up to parameters, or known incorrectly. In contrast, forecasting using previous data without any process knowledge might result in accurate prediction for simple systems, but will fail for highly nonlinear and chaotic systems. In this paper, the authors demonstrate how even in chaotic systems, forecasting with historical data is preferable to using process knowledge if this knowledge exhibits certain forms of misspecification. Through an extensive simulation study, a range of misspecification and forecasting scenarios are examined with the goal of gaining an improved understanding of the circumstances under which forecasting from historical data is to be preferred over using process knowledge."
to:NB  to_read  prediction  time_series  misspecification  re:growing_ensemble_project 
8 days ago by cshalizi
Lam, Yao: Factor modeling for high-dimensional time series: Inference for the number of factors
"This paper deals with the factor modeling for high-dimensional time series based on a dimension-reduction viewpoint. Under stationary settings, the inference is simple in the sense that both the number of factors and the factor loadings are estimated in terms of an eigenanalysis for a nonnegative definite matrix, and is therefore applicable when the dimension of time series is on the order of a few thousands. Asymptotic properties of the proposed method are investigated under two settings: (i) the sample size goes to infinity while the dimension of time series is fixed; and (ii) both the sample size and the dimension of time series go to infinity together. In particular, our estimators for zero-eigenvalues enjoy faster convergence (or slower divergence) rates, hence making the estimation for the number of factors easier. In particular, when the sample size and the dimension of time series go to infinity together, the estimators for the eigenvalues are no longer consistent. However, our estimator for the number of the factors, which is based on the ratios of the estimated eigenvalues, still works fine. Furthermore, this estimation shows the so-called “blessing of dimensionality” property in the sense that the performance of the estimation may improve when the dimension of time series increases. A two-step procedure is investigated when the factors are of different degrees of strength. Numerical illustration with both simulated and real data is also reported."
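--- The abstract names the ingredients (an eigenanalysis of a nonnegative definite matrix, then ratios of estimated eigenvalues) without writing them down. A sketch of how such an estimator is usually set up; the specific form of the matrix, the maximum lag $k_0$ and the search bound $R$ are my assumptions, not taken from the abstract:

```latex
% Assumed setup: \widehat{\Sigma}_y(k) is the lag-k sample autocovariance
% matrix of the observed p-dimensional series; k_0 and R are tuning
% constants chosen by the user.
\[
  \widehat{M} \;=\; \sum_{k=1}^{k_0} \widehat{\Sigma}_y(k)\,\widehat{\Sigma}_y(k)^{\top},
  \qquad
  \hat\lambda_1 \ge \hat\lambda_2 \ge \dots \ge \hat\lambda_p \ge 0
  \ \text{ the eigenvalues of } \widehat{M},
\]
\[
  \hat r \;=\; \arg\min_{1 \le i \le R} \; \frac{\hat\lambda_{i+1}}{\hat\lambda_i}.
\]
```

The ratio drops sharply at the first index where a "factor" eigenvalue is followed by a "zero" eigenvalue, which is why the individual eigenvalue estimates need not be consistent for the ratio rule to work.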
to:NB  dimension_reduction  factor_analysis  time_series  high-dimensional_statistics  inference_to_latent_objects 
10 days ago by cshalizi
Wang, Phillips: A specification test for nonlinear nonstationary models
"We provide a limit theory for a general class of kernel smoothed U-statistics that may be used for specification testing in time series regression with nonstationary data. The test framework allows for linear and nonlinear models with endogenous regressors that have autoregressive unit roots or near unit roots. The limit theory for the specification test depends on the self-intersection local time of a Gaussian process. A new weak convergence result is developed for certain partial sums of functions involving nonstationary time series that converges to the intersection local time process. This result is of independent interest and is useful in other applications. Simulations examine the finite sample performance of the test."
to:NB  time_series  non-stationarity  model-checking  statistics  misspecification 
10 days ago by cshalizi
Likelihood inference for discriminating between long-memory and change-point models - Yau - 2012 - Journal of Time Series Analysis - Wiley Online Library
"We develop a likelihood ratio (LR) test procedure for discriminating between a short-memory time series with a change-point (CP) and a long-memory (LM) time series. Under the null hypothesis, the time series consists of two segments of short-memory time series with different means and possibly different covariance functions. The location of the shift in the mean is unknown. Under the alternative, the time series has no shift in mean but rather is LM. The LR statistic is defined as the normalized log-ratio of the Whittle likelihood between the CP model and the LM model, which is asymptotically normally distributed under the null. The LR test provides a parametric alternative to the CUSUM test proposed by Berkes et al. (2006). Moreover, the LR test is more general than the CUSUM test in the sense that it is applicable to changes in other marginal or dependence features other than a change-in-mean. We show its good performance in simulations and apply it to two data examples."
to:NB  time_series  change-point_problem  long-range_dependence  statistics  to_teach:undergrad-ADA  hypothesis_testing 
13 days ago by cshalizi
[1204.6265] Statistical inference for dynamical systems: a review
"The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research."
to:NB  to_read  statistical_inference_for_stochastic_processes  dynamical_systems  statistics  time_series  state-space_models  state-space_reconstruction  pillai.natesh  via:ded-maxim 
28 days ago by cshalizi
[1204.3915] Theory and Inference for a Class of Observation-driven Models with Application to Time Series of Counts
"This paper studies theory and inference related to a class of time series models that incorporates nonlinear dynamics. It is assumed that the observations follow a one-parameter exponential family of distributions given an accompanying process that evolves as a function of lagged observations. We employ an iterated random function approach and a special coupling technique to show that, under suitable conditions on the parameter space, the conditional mean process is a geometric moment contracting Markov chain and that the observation process is absolutely regular with geometrically decaying coefficients. Moreover the asymptotic theory of the maximum likelihood estimates of the parameters is established under some mild assumptions. These models are applied to two examples; the first is the number of transactions per minute of Ericsson stock and the second is related to return times of extreme events of Goldman Sachs Group stock."

--- Without reading beyond the abstract, I'm guessing chains with complete connections.
to:NB  time_series  markov_models  statistics 
5 weeks ago by cshalizi
Xiao, Wu: Covariance matrix estimation for stationary time series
"We obtain a sharp convergence rate for banded covariance matrix estimates of stationary processes. A precise order of magnitude is derived for spectral radius of sample covariance matrices. We also consider a thresholded covariance matrix estimator that can better characterize sparsity if the true covariance matrix is sparse. As our main tool, we implement Toeplitz [Math. Ann. 70 (1911) 351–376] idea and relate eigenvalues of covariance matrices to the spectral densities or Fourier transforms of the covariances. We develop a large deviation result for quadratic forms of stationary processes using m-dependence approximation, under the framework of causal representation and physical dependence measures."
to:NB  time_series  statistics  estimation  variance_estimation 
6 weeks ago by cshalizi
[0802.4363] Estimating the entropy of binary time series: Methodology, some theory and a simulation study
"Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator.
Methodology. Three new entropy estimators are introduced. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters.
Theory. We prove that, unlike their earlier versions, the two new LZ-based estimators are consistent for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state HMM with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator.
Simulation. All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. Some conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. (iv) The main drawback of the plug-in method is its computational inefficiency."
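--- For reference, a minimal sketch of the plug-in method in its most common form: count words of a fixed length, take the Shannon entropy of the empirical word distribution, and divide by the word length. The function name, the use of overlapping words and the word lengths below are my choices for illustration; the paper's exact variant may differ.

```python
import math
import random
from collections import Counter


def plugin_entropy_rate(bits, word_len):
    """Plug-in estimate of the entropy rate, in bits per symbol.

    Counts overlapping words of length `word_len`, computes the Shannon
    entropy of their empirical distribution, and divides by `word_len`.
    """
    n_words = len(bits) - word_len + 1
    if n_words <= 0:
        raise ValueError("series is shorter than the word length")
    counts = Counter(tuple(bits[i:i + word_len]) for i in range(n_words))
    block_entropy = -sum(
        (c / n_words) * math.log2(c / n_words) for c in counts.values()
    )
    return block_entropy / word_len


# Illustration on synthetic data (seed and lengths are arbitrary).
rng = random.Random(0)
coin = [rng.randint(0, 1) for _ in range(20000)]
print(plugin_entropy_rate(coin, 3))          # close to 1 for a fair coin
print(plugin_entropy_rate([0, 1] * 500, 2))  # about 0.5; the true rate is 0
```

The second call shows the bias in miniature: a perfectly periodic sequence has entropy rate zero, but with words of length 2 the estimate sits near 0.5, and pushing the word length up to remove that bias makes the counts sparse.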
in_NB  to_read  entropy_estimation  information_theory  time_series  statistics  kontoyiannis.ioannis  re:stacs 
6 weeks ago by cshalizi
[1203.3037] Expanding the Transfer Entropy to Identify Information Subgraphs in Complex Systems
"We propose a formal expansion of the transfer entropy to put in evidence irreducible sets of variables which provide information for the future state of each assigned target. Multiplets characterized by an high value will be associated to informational circuits present in the system, with an informational character (synergetic or redundant) which can be associated to the sign of the contribution. We also present preliminary results on fMRI and EEG data sets."
in_NB  graphical_models  information_theory  community_discovery  time_series  re:functional_communities 
7 weeks ago by cshalizi
[1203.1515] Multiple Change-Point Estimation in Stationary Ergodic Time-Series
"The multiple change-point problem is considered in the most general setting, where the only assumption made on the time-series distributions generating the data is that they are stationary ergodic. No modeling, independence or parametric assumptions are made. While the need for such a general setting is dictated by real applications, the problem of change-point estimation becomes a difficult unsupervised learning problem. In this work a novel algorithm for solving this problem is proposed, and it is shown to be asymptotically consistent under the general assumptions considered."
to:NB  change-point_problem  time_series  ergodic_theory  statistics  statistical_inference_for_stochastic_processes  ryabko.daniil 
7 weeks ago by cshalizi
[1203.6898] Long-term stability of sequential Monte Carlo methods under verifiable conditions
"This paper discusses particle filtering in general hidden Markov models (HMMs) and presents novel theoretical results on the long-term stability of bootstrap-type particle filters. More specifically, we establish that the asymptotic variance of the Monte Carlo estimates produced by the bootstrap filter is uniformly bounded in time. On the contrary to most previous results of this type, which in general presuppose that the state space of the hidden state process is compact (an assumption that is rarely satisfied in practice), our very mild assumptions are satisfied for a large class of HMMs with possibly non-compact state space. In addition, we derive a similar time uniform bound on the asymptotic Lp error. Importantly, our results hold for misspecified models, i.e. we do not at all assume that the data entering into the particle filter originate from the model governing the dynamics of the particles or not even from an HMM."
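--- A minimal sketch of a bootstrap particle filter, for orientation. The model here (a Gaussian AR(1) state observed with Gaussian noise), the parameter values, and the function names are all assumptions of mine for illustration; the paper's results concern far more general HMMs, and this sketch does not demonstrate the stability claims.

```python
import math
import random


def simulate_state_space(n, rng, phi=0.9, state_sd=1.0, obs_sd=1.0):
    """Simulate x_t = phi * x_{t-1} + state noise, y_t = x_t + obs noise."""
    x = rng.gauss(0.0, state_sd / math.sqrt(1.0 - phi * phi))
    states, observations = [], []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, state_sd)
        states.append(x)
        observations.append(x + rng.gauss(0.0, obs_sd))
    return states, observations


def bootstrap_filter(observations, n_particles, rng,
                     phi=0.9, state_sd=1.0, obs_sd=1.0):
    """Return the filtered state means, one per observation."""
    stationary_sd = state_sd / math.sqrt(1.0 - phi * phi)
    particles = [rng.gauss(0.0, stationary_sd) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate particles through the state dynamics (the "bootstrap"
        # proposal: no look-ahead at the current observation).
        particles = [phi * x + rng.gauss(0.0, state_sd) for x in particles]
        # Weight by the observation likelihood; subtract the max log-weight
        # before exponentiating to avoid underflow.
        log_w = [-0.5 * ((y - x) / obs_sd) ** 2 for x in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


rng = random.Random(7)
states, observations = simulate_state_space(500, rng)
filtered = bootstrap_filter(observations, 300, rng)
print(filtered[:3])
```

To poke at the misspecification point, one could feed `bootstrap_filter` observations generated with a different `phi` than the one the filter assumes and watch whether the error stays bounded over time.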
to:NB  particle_filters  stochastic_processes  time_series  state_estimation  state-space_models  markov_models  statistics 
8 weeks ago by cshalizi
[math/0609514] Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models
"This paper concerns the use of sequential Monte Carlo methods (SMC) for smoothing in general state space models. A well-known problem when applying the standard SMC technique in the smoothing mode is that the resampling mechanism introduces degeneracy of the approximation in the path space. However, when performing maximum likelihood estimation via the EM algorithm, all functionals involved are of additive form for a large subclass of models. To cope with the problem in this case, a modification of the standard method (based on a technique proposed by Kitagawa and Sato) is suggested. Our algorithm relies on forgetting properties of the filtering dynamics and the quality of the estimates produced is investigated, both theoretically and via simulations."
to:NB  statistics  time_series  state_estimation  state-space_models  particle_filters 
8 weeks ago by cshalizi
[1203.5673] Effect of Nonstationarity on Models Inferred from Neural Data
"Neurons subject to a common non-stationary input may exhibit a correlated firing behavior. Correlations in the statistics of neural spike trains also arise as the effect of interaction between neurons. Here we show that these two situations can be distinguished, with machine learning techniques, provided the data are rich enough. In order to do this, we study the problem of inferring a kinetic Ising model, stationary or nonstationary, from the available data. We apply the inference procedure to two data sets: one from salamander retinal ganglion cells and the other from a realistic computational cortical network model. We show that many aspects of the concerted activity of the salamander retinal neurons can be traced simply to the external input. A model of non-interacting neurons subject to a non-stationary external field outperforms a model with stationary input with couplings between neurons, even accounting for the differences in the number of model parameters. When couplings are added to the non-stationary model, for the retinal data, little is gained: the inferred couplings are generally not significant. Likewise, the distribution of the sizes of sets of neurons that spike simultaneously and the frequency of spike patterns as function of their rank (Zipf plots) are well-explained by an independent-neuron model with time-dependent external input, and adding connections to such a model does not offer significant improvement. For the cortical model data, robust couplings, well correlated with the real connections, can be inferred using the non-stationary model. Adding connections to this model slightly improves the agreement with the data for the probability of synchronous spikes but hardly affects the Zipf plot."
to:NB  neural_data_analysis  statistics  time_series 
8 weeks ago by cshalizi
[1203.5950] Capturing the time-varying drivers of an epidemic using stochastic dynamical systems
"Epidemics are often modelled using state-space models based on dynamical systems, observed through partial and noisy data. In this paper we develop stochastic extensions to the popular SEIR model with parameters evolving in time, in order to capture unknown influences of changing behaviors, public interventions, seasonal effects etc. Our models assign diffusion processes for the time-varying parameters, and our inferential procedure is based on the particle Markov Chain Monte Carlo algorithm, suitably adjusted to accommodate the features of this challenging nonlinear stochastic model. The performance of the proposed computational methods is validated on simulated data and the adopted model is applied to the 2009 A/H1N1 pandemic in England. In addition to estimating the trajectories of the effective contact rate, the methodology is applied in real time to provide evidence in related public health decisions."
to:NB  time_series  epidemic_models  state-space_models  statistics 
8 weeks ago by cshalizi
On robust tail index estimation for linear long-memory processes - Beran - 2012 - Journal of Time Series Analysis - Wiley Online Library
"We consider robust estimation of the tail index α for linear long-memory processes with i.i.d. innovations ε_j following a symmetric α-stable law (1 < α < 2) and coefficients a_j ∼ c·j^{−β}. Estimates based on the left and right tail respectively are obtained together with a combined statistic with improved efficiency, and a test statistic comparing both tails. Asymptotic results are derived. Simulations illustrate the finite sample performance."
to:NB  heavy_tails  time_series  statistics  beran.jan 
8 weeks ago by cshalizi
Time-series clustering via quasi U-statistics - Valk - 2012 - Journal of Time Series Analysis - Wiley Online Library
"The problem of time-series discrimination and classification is discussed. We propose a novel clustering algorithm based on a class of quasi U-statistics and subgroup decomposition tests. The decomposition may be applied to any concave time-series distance. The resulting test statistics are proven to be asymptotically normal for either i.i.d. or non-identically distributed groups of time-series under mild conditions. We illustrate its empirical performance on a simulation study and a real data analysis. The simulation setup includes stationary vs. stationary and stationary vs. non-stationary cases. The performance of the proposed method is favourably compared with some of the most common clustering measures available."
to:NB  clustering  time_series  statistics  classifiers 
8 weeks ago by cshalizi
[0805.2214] Augmented GARCH sequences: Dependence structure and asymptotics
"The augmented GARCH model is a unification of numerous extensions of the popular and widely used ARCH process. It was introduced by Duan and besides ordinary (linear) GARCH processes, it contains exponential GARCH, power GARCH, threshold GARCH, asymmetric GARCH, etc. In this paper, we study the probabilistic structure of augmented $\mathrm{GARCH}(1,1)$ sequences and the asymptotic distribution of various functionals of the process occurring in problems of statistical inference. Instead of using the Markov structure of the model and implied mixing properties, we utilize independence properties of perturbed GARCH sequences to directly reduce their asymptotic behavior to the case of independent random variables. This method applies for a very large class of functionals and eliminates the fairly restrictive moment and smoothness conditions assumed in the earlier theory. In particular, we derive functional CLTs for powers of the augmented GARCH variables, derive the error rate in the CLT and obtain asymptotic results for their empirical processes under nearly optimal conditions."
to:NB  stochastic_processes  time_series  finance 
12 weeks ago by cshalizi
[0805.1179] Autoregressive Process Modeling via the Lasso Procedure
"The Lasso is a popular model selection and estimation procedure for linear models that enjoys nice theoretical properties. In this paper, we study the Lasso estimator for fitting autoregressive time series models. We adopt a double asymptotic framework where the maximal lag may increase with the sample size. We derive theoretical results establishing various types of consistency. In particular, we derive conditions under which the Lasso estimator for the autoregressive coefficients is model selection consistent, estimation consistent and prediction consistent. Simulation study results are reported."
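--- A minimal sketch of what "the Lasso estimator for fitting autoregressive time series models" amounts to: regress each value on its own lags with an L1 penalty. The coordinate-descent solver, the objective scaling (1/2n), the penalty value and all names below are my assumptions for illustration, not the paper's procedure or tuning.

```python
import random


def simulate_ar1(n, phi, rng, burn_in=200):
    """Simulate a Gaussian AR(1) series, discarding a burn-in stretch."""
    x, out = 0.0, []
    for t in range(burn_in + n):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out


def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ar(x, max_lag, penalty, n_sweeps=100):
    """Lasso-penalised least-squares fit of an AR(max_lag) model.

    Minimises (1/2n) * (sum of squared one-step errors)
    + penalty * (sum of |coefficients|) by cyclic coordinate descent.
    Returns the coefficients for lags 1..max_lag.
    """
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    targets = d[max_lag:]
    n = len(targets)
    # lags[j][i] is the lag-(j+1) predictor for targets[i].
    lags = [d[max_lag - 1 - j: len(d) - 1 - j] for j in range(max_lag)]
    scale = [sum(v * v for v in col) / n for col in lags]
    coef = [0.0] * max_lag
    resid = list(targets)
    for _ in range(n_sweeps):
        for j, col in enumerate(lags):
            # Covariance of predictor j with the partial residual
            # (the residual with predictor j's own contribution added back).
            rho = sum(c * r for c, r in zip(col, resid)) / n + scale[j] * coef[j]
            new = soft_threshold(rho, penalty) / scale[j]
            if new != coef[j]:
                delta = new - coef[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                coef[j] = new
    return coef


rng = random.Random(0)
series = simulate_ar1(2000, 0.5, rng)
print(lasso_ar(series, max_lag=5, penalty=0.3))
```

With a penalty this heavy the lag-1 coefficient is shrunk well below the true 0.5, which is the usual trade: the same penalty that zeroes out the spurious lags also biases the surviving one.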
to:NB  time_series  statistics  lasso  sparsity  variable_selection  kith_and_kin  heard_the_talk  rinaldo.alessandro  nardi.yuval 
12 weeks ago by cshalizi
[0808.1010] Confidence bands in nonparametric time series regression
"We consider nonparametric estimation of mean regression and conditional variance (or volatility) functions in nonlinear stochastic regression models. Simultaneous confidence bands are constructed and the coverage probabilities are shown to be asymptotically correct. The imposed dependence structure allows applications in many linear and nonlinear auto-regressive processes. The results are applied to the S&P 500 Index data."
to:NB  statistics  regression  time_series  confidence_sets  to_teach:undergrad-ADA 
12 weeks ago by cshalizi
[0805.3019] Three months journeying of a Hawaiian monk seal
"Hawaiian monk seals (Monachus schauinslandi) are endemic to the Hawaiian Islands and are the most endangered species of marine mammal that lives entirely within the jurisdiction of the United States. The species numbers around 1300 and has been declining owing, among other things, to poor juvenile survival which is evidently related to poor foraging success. Consequently, data have been collected recently on the foraging habitats, movements, and behaviors of monk seals throughout the Northwestern and main Hawaiian Islands. Our work here is directed to exploring a data set located in a relatively shallow offshore submerged bank (Penguin Bank) in our search of a model for a seal's journey. The work ends by fitting a stochastic differential equation (SDE) that mimics some aspects of the behavior of seals by working with location data collected for one seal. The SDE is found by developing a time varying potential function with two points of attraction. The times of location are irregularly spaced and not close together geographically, leading to some difficulties of interpretation. Synthetic plots generated using the model are employed to assess its reasonableness spatially and temporally. One aspect is that the animal stays mainly southwest of Molokai. The work led to the estimation of the lengths and locations of the seal's foraging trips."
to:NB  statistics  stochastic_differential_equations  statistical_inference_for_stochastic_processes  brillinger.david  time_series 
12 weeks ago by cshalizi
[math/0410271] Statistical modeling of causal effects in continuous time
"This article studies the estimation of the causal effect of a time-varying treatment on time-to-an-event or on some other continuously distributed outcome. The paper applies to the situation where treatment is repeatedly adapted to time-dependent patient characteristics. The treatment effect cannot be estimated by simply conditioning on these time-dependent patient characteristics, as they may themselves be indications of the treatment effect. This time-dependent confounding is common in observational studies. Robins [(1992) Biometrika 79 321--334, (1998b) Encyclopedia of Biostatistics 6 4372--4389] has proposed the so-called structural nested models to estimate treatment effects in the presence of time-dependent confounding. In this article we provide a conceptual framework and formalization for structural nested models in continuous time. We show that the resulting estimators are consistent and asymptotically normal. Moreover, as conjectured in Robins [(1998b) Encyclopedia of Biostatistics 6 4372--4389], a test for whether treatment affects the outcome of interest can be performed without specifying a model for treatment effect. We illustrate the ideas in this article with an example."
to:NB  statistics  causal_inference  time_series 
12 weeks ago by cshalizi
[0809.1053] An impossibility result for process discrimination
"Two series of binary observations $x_1,x_2,\dots$ and $y_1,y_2,\dots$ are presented: at each time $n\in\mathbb{N}$ we are given $x_n$ and $y_n$. It is assumed that the sequences are generated independently of each other by two B-processes. We are interested in the question of whether the sequences represent a typical realization of two different processes or of the same one. We demonstrate that this is impossible to decide, in the sense that every discrimination procedure is bound to err with non-negligible frequency when presented with sequences from some B-processes. This contrasts earlier positive results on B-processes, in particular those showing that there are consistent $\bar d$-distance estimates for this class of processes."
to:NB  statistics  time_series  stochastic_processes  ergodic_theory  statistical_inference_for_stochastic_processes  hypothesis_testing 
12 weeks ago by cshalizi
[math/0510311] Adaptive density estimation under dependence
"Assume that $(X_t)_{t\in\mathbb{Z}}$ is a real valued time series admitting a common marginal density $f$ with respect to Lebesgue's measure. Donoho et al. (1996) propose a near-minimax method based on thresholding wavelets to estimate $f$ on a compact set in an independent and identically distributed setting. The aim of the present work is to extend these results to general weak dependent contexts. Weak dependence assumptions are expressed as decreasing bounds of covariance terms and are detailed for different examples. The threshold levels in estimators $\widehat f_n$ depend on weak dependence properties of the sequence $(X_t)_{t\in\mathbb{Z}}$ through the constant. If these properties are unknown, we propose cross-validation procedures to get new estimators. These procedures are illustrated via simulations of dynamical systems and non causal infinite moving averages. We also discuss the efficiency of our estimators with respect to the decrease of covariances bounds."
to:NB  statistics  density_estimation  wavelets  time_series  statistical_inference_for_stochastic_processes 
12 weeks ago by cshalizi
[0810.2276] A generalized portmanteau test of independence between two stationary time series
"We propose generalized portmanteau-type test statistics in the frequency domain to test independence between two stationary time series. The test statistics are formed analogous to the one in Chen and Deo (2004, Econometric Theory 20, 382-416), who extended the applicability of portmanteau goodness-of-fit test to the long memory case. Under the null hypothesis of independence, the asymptotic standard normal distributions of the proposed statistics are derived under fairly mild conditions. In particular, each time series is allowed to possess short memory, long memory or anti-persistence. A simulation study shows that the tests have reasonable size and power properties."
in_NB  statistics  time_series  hypothesis_testing  independence_testing 
12 weeks ago by cshalizi
[0801.0327] Nonparametric sequential prediction of time series
"Time series prediction covers a vast field of every-day statistical applications in medical, environmental and economic domains. In this paper we develop nonparametric prediction strategies based on the combination of a set of 'experts' and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these nonparametric strategies are more flexible, faster and generally outperform ARMA methods in terms of normalized cumulative prediction error."
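--- A minimal sketch of "combination of a set of experts" in its generic form: exponentially weighted averaging under squared loss. The abstract does not say which experts or which weighting scheme the paper uses, so the weighting rule, the learning rate `eta` and the function name are assumptions of mine.

```python
import math


def aggregate_forecasts(expert_preds, outcomes, eta=1.0):
    """Sequentially combine expert forecasts by exponential weighting.

    expert_preds[t][k] is expert k's forecast of outcomes[t], issued before
    outcomes[t] is seen. Returns the combined forecasts and the weights
    that were used in the final round.
    """
    n_experts = len(expert_preds[0])
    cum_loss = [0.0] * n_experts
    weights = [1.0 / n_experts] * n_experts
    combined = []
    for preds, y in zip(expert_preds, outcomes):
        # Weights depend only on losses from earlier rounds; subtracting the
        # smallest loss keeps the exponentials from underflowing.
        best = min(cum_loss)
        raw = [math.exp(-eta * (loss - best)) for loss in cum_loss]
        total = sum(raw)
        weights = [w / total for w in raw]
        combined.append(sum(w * p for w, p in zip(weights, preds)))
        # Reveal the outcome and charge each expert its squared error.
        for k, p in enumerate(preds):
            cum_loss[k] += (p - y) ** 2
    return combined, weights


# Toy run: expert 0 is always right, expert 1 always predicts 0.
outcomes = [1.0] * 50
expert_preds = [[1.0, 0.0] for _ in outcomes]
combined, weights = aggregate_forecasts(expert_preds, outcomes)
print(combined[0], combined[-1], weights)
```

The combined forecast starts as a plain average and shifts its weight toward whichever expert has accumulated the smaller loss, without ever needing to know in advance which one that will be.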
in_NB  time_series  nonparametrics  prediction  statistics  to_teach:undergrad-ADA  re:growing_ensemble_project 
february 2012 by cshalizi
[1201.6211] On the range of validity of the autoregressive sieve bootstrap
"We explore the limits of the autoregressive (AR) sieve bootstrap, and show that its applicability extends well beyond the realm of linear time series as has been previously thought. In particular, for appropriate statistics, the AR-sieve bootstrap is valid for stationary processes possessing a general Wold-type autoregressive representation with respect to a white noise; in essence, this includes all stationary, purely nondeterministic processes, whose spectral density is everywhere positive. Our main theorem provides a simple and effective tool in assessing whether the AR-sieve bootstrap is asymptotically valid in any given situation. In effect, the large-sample distribution of the statistic in question must only depend on the first and second order moments of the process; prominent examples include the sample mean and the spectral density. As a counterexample, we show how the AR-sieve bootstrap is not always valid for the sample autocovariance even when the underlying process is linear."
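--- A minimal sketch of the AR-sieve bootstrap itself, for orientation: fit an AR(p), resample the centred residuals, and regenerate the series from the fitted recursion. Fitting by Yule-Walker via Levinson-Durbin, the fixed order passed in by the caller (a real sieve lets the order grow with the sample size), the burn-in length and all names are my assumptions for illustration.

```python
import random


def simulate_ar1(n, phi, rng, burn_in=200):
    """Simulate a Gaussian AR(1) series, discarding a burn-in stretch."""
    x, out = 0.0, []
    for t in range(burn_in + n):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out


def autocov(x, max_lag):
    """Sample autocovariances at lags 0..max_lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def fit_ar(x, order):
    """Yule-Walker AR(order) coefficients via the Levinson-Durbin recursion."""
    gamma = autocov(x, order)
    phi, err = [], gamma[0]
    for k in range(1, order + 1):
        acc = gamma[k] - sum(phi[j] * gamma[k - 1 - j] for j in range(k - 1))
        refl = acc / err
        phi = [phi[j] - refl * phi[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl * refl
    return phi


def ar_sieve_bootstrap(x, order, rng, burn_in=100):
    """Return one bootstrap replicate of x, of the same length."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    phi = fit_ar(x, order)
    # One-step residuals of the fitted AR model, centred at zero.
    resid = [d[t] - sum(phi[j] * d[t - 1 - j] for j in range(order))
             for t in range(order, n)]
    r_mean = sum(resid) / len(resid)
    resid = [e - r_mean for e in resid]
    # Regenerate the series from the fitted recursion with resampled noise.
    path = [0.0] * order
    for _ in range(burn_in + n):
        shock = rng.choice(resid)
        path.append(sum(phi[j] * path[-1 - j] for j in range(order)) + shock)
    return [mean + v for v in path[-n:]]


rng = random.Random(1)
series = simulate_ar1(2000, 0.6, rng)
boot_means = [sum(rep) / len(rep)
              for rep in (ar_sieve_bootstrap(series, 2, rng) for _ in range(50))]
print(min(boot_means), max(boot_means))
```

The demo bootstraps the sample mean, one of the statistics the abstract names as a case where the procedure is valid; the abstract's counterexample is the sample autocovariance.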
in_NB  bootstrap  time_series  statistics  stochastic_processes 
february 2012 by cshalizi
The Asymmetric Business Cycle
"The business cycle is a fundamental yet elusive concept in macroeconomics. In this paper, we consider the problem of measuring the business cycle. First, we argue for the output-gap view that the business cycle corresponds to transitory deviations in economic activity away from a permanent, or trend, level. Then we investigate the extent to which a general model-based approach to estimating trend and cycle for the U.S. economy leads to measures of the business cycle that reflect models versus the data. We find empirical support for a nonlinear time series model that produces a business cycle measure with an asymmetric shape across NBER expansion and recession phases. Specifically, this business cycle measure suggests that recessions are periods of relatively large and negative transitory fluctuations in output. However, several close competitors to the nonlinear model produce business cycle measures of widely differing shapes and magnitudes. Given this model-based uncertainty, we construct a model-averaged measure of the business cycle. This measure also displays an asymmetric shape and is closely related to other measures of economic slack such as the unemployment rate and capacity utilization."
--- Worthy, but at the same time makes me want to lock them in a room with a copy of Li and Racine's _Nonparametric Econometrics_, or even _The Elements of Statistical Learning_, and not let them out until they understand it.
in_NB  time_series  statistics  economics  macroeconomics  inference_to_latent_objects  re:your_favorite_dsge_sucks  morley.james  have_read  ensemble_methods  model_selection 
february 2012 by cshalizi
Forecasting Time Series with Complex Seasonal Patterns Using Exponential Smoothing
"An innovations state space modeling framework is introduced for forecasting complex seasonal time series such as those with multiple seasonal periods, high-frequency seasonality, non-integer seasonality, and dual-calendar effects. The new framework incorporates Box–Cox transformations, Fourier representations with time varying coefficients, and ARMA error correction. Likelihood evaluation and analytical expressions for point forecasts and interval predictions under the assumption of Gaussian errors are derived, leading to a simple, comprehensive approach to forecasting complex seasonal time series. A key feature of the framework is that it relies on a new method that greatly reduces the computational burden in the maximum likelihood estimation. The modeling framework is useful for a broad range of applications, its versatility being illustrated in three empirical studies. In addition, the proposed trigonometric formulation is presented as a means of decomposing complex seasonal time series, and it is shown that this decomposition leads to the identification and extraction of seasonal components which are otherwise not apparent in the time series plot itself."
to:NB  statistics  time_series  prediction 
january 2012 by cshalizi
Quantifying Statistical Interdependence, Part III: N > 2 Point Processes
"Stochastic event synchrony (SES) is a recently proposed family of similarity measures. First, “events” are extracted from the given signals; next, one tries to align events across the different time series. The better the alignment, the more similar the N time series are considered to be. The similarity measures quantify the reliability of the events (the fraction of “nonaligned” events) and the timing precision. So far, SES has been developed for pairs of one-dimensional (Part I) and multidimensional (Part II) point processes. In this letter (Part III), SES is extended from pairs of signals to N > 2 signals. The alignment and SES parameters are again determined through statistical inference, more specifically, by alternating two steps: (1) estimating the SES parameters from a given alignment and (2), with the resulting estimates, refining the alignment. The SES parameters are computed by maximum a posteriori (MAP) estimation (step 1), in analogy to the pairwise case. The alignment (step 2) is solved by linear integer programming. In order to test the robustness and reliability of the proposed N-variate SES method, it is first applied to synthetic data. We show that N-variate SES results in more reliable estimates than bivariate SES. Next N-variate SES is applied to two problems in neuroscience: to quantify the firing reliability of Morris-Lecar neurons and to detect anomalies in EEG synchrony of patients with mild cognitive impairment. Those problems were also considered in Parts I and II, respectively. In both cases, the N-variate SES approach yields a more detailed analysis."
to:NB  neural_data_analysis  point_processes  time_series 
january 2012 by cshalizi
Clements , Schoenberg , Schorlemmer : Residual analysis methods for space–time point processes with applications to earthquake forecast models in California
"Modern, powerful techniques for the residual analysis of spatial-temporal point process models are reviewed and compared. These methods are applied to California earthquake forecast models used in the Collaboratory for the Study of Earthquake Predictability (CSEP). Assessments of these earthquake forecasting models have previously been performed using simple, low-power means such as the L-test and N-test. We instead propose residual methods based on rescaling, thinning, superposition, weighted K-functions and deviance residuals. Rescaled residuals can be useful for assessing the overall fit of a model, but as with thinning and superposition, rescaling is generally impractical when the conditional intensity λ is volatile. While residual thinning and superposition may be useful for identifying spatial locations where a model fits poorly, these methods have limited power when the modeled conditional intensity assumes extremely low or high values somewhere in the observation region, and this is commonly the case for earthquake forecasting models. A recently proposed hybrid method of thinning and superposition, called super-thinning, is a more powerful alternative. The weighted K-function is powerful for evaluating the degree of clustering or inhibition in a model. Competing models are also compared using pixel-based approaches, such as Pearson residuals and deviance residuals. 
The different residual analysis techniques are demonstrated using the CSEP models and are used to highlight certain deficiencies in the models, such as the overprediction of seismicity in inter-fault zones for the model proposed by Helmstetter, Kagan and Jackson [Seismological Research Letters 78 (2007) 78–86], the underprediction of the model proposed by Kagan, Jackson and Rong [Seismological Research Letters 78 (2007) 94–98] in forecasting seismicity around the Imperial, Laguna Salada, and Panamint clusters, and the underprediction of the model proposed by Shen, Jackson and Kagan [Seismological Research Letters 78 (2007) 116–120] in forecasting seismicity around the Laguna Salada, Baja, and Panamint clusters."
to:NB  point_processes  spatial_statistics  time_series  statistics  model_selection  model-checking  prediction  earthquakes  geology 
december 2011 by cshalizi
Phys. Rev. E 84, 066702 (2011): Nonparametric model reconstruction for stochastic differential equations from discretely observed time-series data
"A scheme is developed for estimating state-dependent drift and diffusion coefficients in a stochastic differential equation from time-series data. The scheme does not require to specify parametric forms for the drift and diffusion coefficients in advance. In order to perform the nonparametric estimation, a maximum likelihood method is combined with a concept based on a kernel density estimation. In order to deal with discrete observation or sparsity of the time-series data, a local linearization method is employed, which enables a fast estimation."
to:NB  statistics  time_series  stochastic_differential_equations  statistical_inference_for_stochastic_processes 
december 2011 by cshalizi
[0805.0463] Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions
"We propose a distance between two realizations of a random process where for each realization only sparse and irregularly spaced measurements with additional measurement errors are available. Such data occur commonly in longitudinal studies and online trading data. A distance measure then makes it possible to apply distance-based analysis such as classification, clustering and multidimensional scaling for irregularly sampled longitudinal data. Once a suitable distance measure for sparsely sampled longitudinal trajectories has been found, we apply distance-based clustering methods to eBay online auction data. We identify six distinct clusters of bidding patterns. Each of these bidding patterns is found to be associated with a specific chance to obtain the auctioned item at a reasonable price."
to:NB  statistics  clustering  time_series 
december 2011 by cshalizi
[1112.1838] Non-parametric kernel estimation for symmetric Hawkes processes. Application to high frequency financial data
"We define a numerical method that provides a non-parametric estimation of the kernel shape in symmetric multivariate Hawkes processes. This method relies on second order statistical properties of Hawkes processes that relate the covariance matrix of the process to the kernel matrix. The square root of the correlation function is computed using a minimal phase recovering method. We illustrate our method on some examples and provide an empirical study of the estimation errors. Within this framework, we analyze high frequency financial price data modeled as 1D or 2D Hawkes processes. We find slowly decaying (power-law) kernel shapes suggesting a long memory nature of self-excitation phenomena at the microstructure level of price dynamics."
to:NB  kernel_estimators  time_series  point_processes  nonparametrics  statistics  re:LoB_project 
december 2011 by cshalizi
[1112.1674] Predicting Failures of Point Forecasts
"The predictability of errors in deterministic temperature forecasts is investigated. More precisely, the aim is to issue warnings whenever the differences between forecast and verification exceed a given threshold. The warnings are generated by analyzing the output of an ensemble forecast system in terms of a decision making approach. The quality of the resulting predictions is evaluated by computing receiver operating characteristics, the Brier score, and the Ignorance score. Special emphasis is also given to the question whether rare events are better predictable."
to:NB  prediction  statistics  time_series  dynamical_systems 
december 2011 by cshalizi
[1111.6291] Semiparametric Time Series Models with Log-concave Innovations
"We study a class of semiparametric time series models with innovations following a log-concave distribution. We propose a general maximum likelihood framework which allows us to estimate simultaneously the parameters of a model and the density of the innovations. This framework can be easily adapted to many well-known models, including ARMA and GARCH. Furthermore, we show that the estimator under our new framework is consistent in both ARMA and GARCH settings. We demonstrate its finite sample performance via a thorough simulation study and apply it to two real data sets concerning the streamflow of the Hirnant river and the FTSE daily return."
to:NB  statistics  time_series  nonparametrics 
december 2011 by cshalizi
[1111.6801] The direct L2 geometric structure on a manifold of probability densities with applications to Filtering
"In this paper we introduce a projection method for the space of probability distributions based on the differential geometric approach to statistics. This method is based on a direct L2 metric as opposed to the usual Hellinger distance and the related Fisher Information metric. We explain how this apparatus can be used for the nonlinear filtering problem, in relationship also to earlier projection methods based on the Fisher metric. Past projection filters focused on the Fisher metric and the exponential families that made the filter correction step exact. In this work we introduce the mixture projection filter, namely the projection filter based on the direct $L^2$ metric and based on a manifold given by a mixture of pre-assigned densities. The resulting prediction step in the filtering problem is described by a linear differential equation, while the correction step can be made exact."
in_NB  filtering  state_estimation  information_geometry  time_series 
december 2011 by cshalizi
Yao , Müller , Wang : Functional linear regression analysis for longitudinal data
"We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allows for different patterns regarding the timing of the measurements obtained for predictor and response trajectories. Asymptotic properties for a sample of n subjects are investigated under mild conditions, as n→∞, and we obtain consistent estimation for the regression function. Besides convergence results for the components of functional linear regression, such as the regression parameter function, we construct asymptotic pointwise confidence bands for the predicted trajectories. A functional coefficient of determination as a measure of the variance explained by the functional regression model is introduced, extending the standard R2 to the functional case. The proposed methods are illustrated with a simulation study, longitudinal primary biliary liver cirrhosis data and an analysis of the longitudinal relationship between blood pressure and body mass index."
to:NB  statistics  time_series  regression  functional_data 
december 2011 by cshalizi
Quantile regression for longitudinal data based on latent Markov subject-specific parameters Alessio Farcomeni - Statistics and Computing, Volume 22, Number 1
"We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set."
to:NB  regression  time_series  prediction  markov_models 
december 2011 by cshalizi
Phys. Rev. E 84, 056214 (2011): State and parameter estimation using unconstrained optimization
"We present an efficient method for estimating variables and parameters of a given system of ordinary differential equations by adapting the model output to an observed time series from the (physical) process described by the model. The proposed method is based on (unconstrained) nonlinear optimization exploiting the particular structure of the relevant cost function. To illustrate the features and performance of the method, simulations are presented using chaotic time series generated by the Colpitts oscillator, the three-dimensional Hindmarsh-Rose neuron model, and a nine-dimensional extended Rössler system." --- Sounds like Hooker & Ramsay.
to:NB  dynamical_systems  statistics  time_series  estimation  statistical_inference_for_stochastic_processes 
november 2011 by cshalizi
[1111.5312] Representations and Ensemble Methods for Dynamic Relational Classification
"Temporal networks are ubiquitous and evolve over time by the addition, deletion, and changing of links, nodes, and attributes. Although many relational datasets contain temporal information, the majority of existing techniques in relational learning focus on static snapshots and ignore the temporal dynamics. We propose a framework for discovering temporal representations of relational data to increase the accuracy of statistical relational learning algorithms. The temporal relational representations serve as a basis for classification, ensembles, and pattern mining in evolving domains. The framework includes (1) selecting the time-varying relational components (links, attributes, nodes), (2) selecting the temporal granularity, (3) predicting the temporal influence of each time-varying relational component, and (4) choosing the weighted relational classifier. Additionally, we propose temporal ensemble methods that exploit the temporal-dimension of relational data. These ensembles outperform traditional and more sophisticated relational ensembles while avoiding the issue of learning the most optimal representation. Finally, the space of temporal-relational models are evaluated using a sample of classifiers. In all cases, the proposed temporal-relational classifiers outperform competing models that ignore the temporal information. The results demonstrate the capability and necessity of the temporal-relational representations for classification, ensembles, and for mining temporal datasets."
in_NB  to_read  relational_learning  network_data_analysis  transaction_networks  neville.jennifer  machine_learning  ensemble_methods  time_series  classifiers 
november 2011 by cshalizi
[1111.4226] Joint Modeling of Multiple Related Time Series via the Beta Process
"We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our approach is based on the discovery of a set of latent, shared dynamical behaviors. Using a beta process prior, the size of the set and the sharing pattern are both inferred from data. We develop efficient Markov chain Monte Carlo methods based on the Indian buffet process representation of the predictive distribution of the beta process, without relying on a truncated model. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth and death proposals. We examine the benefits of our proposed feature-based model on several synthetic datasets, and also demonstrate promising results on unsupervised segmentation of visual motion capture data."
to:NB  heard_the_talk  time_series  statistics  machine_learning  nonparametrics  fox.emily  jordan.michael_i. 
november 2011 by cshalizi
Kreiss , Paparoditis , Politis : On the range of validity of the autoregressive sieve bootstrap
"We explore the limits of the autoregressive (AR) sieve bootstrap, and show that its applicability extends well beyond the realm of linear time series as has been previously thought. In particular, for appropriate statistics, the AR-sieve bootstrap is valid for stationary processes possessing a general Wold-type autoregressive representation with respect to a white noise; in essence, this includes all stationary, purely nondeterministic processes, whose spectral density is everywhere positive. Our main theorem provides a simple and effective tool in assessing whether the AR-sieve bootstrap is asymptotically valid in any given situation. In effect, the large-sample distribution of the statistic in question must only depend on the first and second order moments of the process; prominent examples include the sample mean and the spectral density. As a counterexample, we show how the AR-sieve bootstrap is not always valid for the sample autocovariance even when the underlying process is linear."
in_NB  time_series  bootstrap  statistics  stochastic_processes 
october 2011 by cshalizi
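A minimal sketch of the AR-sieve bootstrap discussed in the Kreiss, Paparoditis and Politis entry above, applied to the sample mean (one of the statistics the abstract names as a valid case). This is not the authors' code. The function names, the Yule-Walker fit via the Levinson-Durbin recursion, the burn-in length, and the fixed order `p` are all illustrative assumptions; in a real sieve the order would grow with the sample size.

```python
import random

def autocov(x, max_lag):
    """Biased sample autocovariances r[0..max_lag]."""
    n = len(x)
    m = sum(x) / n
    return [sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, p):
    """Solve the Yule-Walker equations for AR(p) coefficients (lag 1 first)."""
    phi, v = [], r[0]
    for k in range(1, p + 1):
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / v
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        v *= 1.0 - kappa ** 2
    return phi

def ar_sieve_bootstrap(x, p, n_boot, seed=0, burn=100):
    """Return n_boot pseudo-series generated from a fitted AR(p) model."""
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    phi = levinson_durbin(autocov(x, p), p)
    # Residuals of the fitted autoregression, centred before resampling.
    resid = [xc[t] - sum(phi[j] * xc[t - 1 - j] for j in range(p))
             for t in range(p, n)]
    rbar = sum(resid) / len(resid)
    resid = [e - rbar for e in resid]
    samples = []
    for _ in range(n_boot):
        s = [0.0] * p
        for _ in range(n + burn):
            # Recursion driven by i.i.d. draws from the centred residuals.
            s.append(sum(phi[j] * s[-1 - j] for j in range(p)) + rng.choice(resid))
        samples.append([v + mean for v in s[-n:]])  # drop the burn-in
    return samples

# Hypothetical usage: simulated AR(1) data, bootstrap distribution of the mean.
gen = random.Random(1)
x = [0.0]
for _ in range(299):
    x.append(0.6 * x[-1] + gen.gauss(0, 1))
boot = ar_sieve_bootstrap(x, p=3, n_boot=200)
boot_means = [sum(b) / len(b) for b in boot]
```

The spread of `boot_means` would then serve as a standard-error estimate for the sample mean. As the abstract warns, the same recipe is not always valid for the sample autocovariance.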
A Measure of Stationarity in Locally Stationary Processes With Applications to Testing - Journal of the American Statistical Association - 106(495):1113
"In this article we investigate the problem of measuring deviations from stationarity in locally stationary time series. Our approach is based on a direct estimate of the L2-distance between the spectral density of the locally stationary process and its best approximation by a spectral density of a stationary process. An explicit expression of the minimal distance is derived, which depends only on integrals of the spectral density of the locally stationary process and its square. These integrals can be estimated directly without estimating the spectral density, and as a consequence, the estimation of the measure of stationarity does not require the specification of a smoothing bandwidth. We show weak convergence of an appropriately standardized version of the statistic to a standard normal distribution. The results are used to construct confidence intervals for the measure of stationarity and to develop a new test for the hypothesis of stationarity. Finally, we investigate the finite sample properties of the resulting confidence intervals and tests by means of a simulation study and illustrate the methodology in two data examples. Parts of the proofs are available online as supplemental material to this article."
in_NB  to_read  statistics  time_series  stationarity  spectral_estimation 
october 2011 by cshalizi
Phys. Rev. E 84, 046205 (2011): Nonuniqueness of global modeling and time scaling
"Starting from an observed single time series, it is shown how to reconstruct a global model in the original phase space by using the ansatz library approach. This model is then compared to the underlying dynamical system that describes the initial time series, and the nonuniqueness of the reconstructed model is discussed. This framework is extended by taking an additional time scaling factor in the reconstructed model class under consideration."
to:NB  time_series  state-space_reconstruction  to_read 
october 2011 by cshalizi
Phys. Rev. E 84, 046702 (2011): Nonparametric segmentation of nonstationary time series
"The nonstationary evolution of observable quantities in complex systems can frequently be described as a juxtaposition of quasistationary spells. Given that standard theoretical and data analysis approaches usually rely on the assumption of stationarity, it is important to detect in real time series intervals holding that property. With that aim, we introduce a segmentation algorithm based on a fully nonparametric approach. We illustrate its applicability through the analysis of real time series presenting diverse degrees of nonstationarity, thus showing that this segmentation procedure generalizes and allows one to uncover features unresolved by previous proposals based on the discrepancy of low order statistical moments only."
in_NB  statistics  change-point_problem  time_series  nonparametrics  re:growing_ensemble_project 
october 2011 by cshalizi
Phys. Rev. Lett. 107, 148501 (2011): Emergence of El Niño as an Autonomous Component in the Climate Network
We construct and analyze a climate network which represents the interdependent structure of the climate in different geographical zones and find that the network responds in a unique way to El Niño events. Analyzing the dynamics of the climate network shows that when El Niño events begin, the El Niño basin partially loses its influence on its surroundings. After typically three months, this influence is restored while the basin loses almost all dependence on its surroundings and becomes autonomous. The formation of an autonomous basin is the missing link to understand the seemingly contradicting phenomena of the afore-noticed weakening of the interdependencies in the climate network during El Niño and the known impact of the anomalies inside the El Niño basin on the global climate system.
climatology  time_series  macro_from_micro  emergence  to:NB  pattern_formation 
october 2011 by cshalizi
Reality Checks and Comparisons of Nested Predictive Models - Journal of Business and Economic Statistics - 0(0):1
"This article develops a simple bootstrap method for simulating asymptotic critical values for tests of equal forecast accuracy and encompassing among many nested models. Our method combines elements of fixed regressor and wild bootstraps. We first derive the asymptotic distributions of tests of equal forecast accuracy and encompassing applied to forecasts from multiple models that nest the benchmark model—that is, reality check tests. We then prove the validity of the bootstrap for these tests. Monte Carlo experiments indicate that our proposed bootstrap has better finite-sample size and power than other methods designed for comparison of nonnested models."
statistics  model_checking  model_selection  time_series  bootstrap  to_read  to_teach:undergrad-ADA  encompassing 
september 2011 by cshalizi
[1107.5543] Coevolution of Network Structure and Content
Disappointing.  The content variables are all completely ad hoc (the structure variables are also ad hoc, but traditional), so we really have no idea of what is being found here.  And there is no assessment of uncertainty at all.  And, for the love of Gauss, stop using R^2 like that!
time_series  social_networks  social_media  statistics  adamic.lada  to:NB  have_read  network_data_analysis 
july 2011 by cshalizi
How Useful are Estimated DSGE Model Forecasts? by Rochelle Edge, Refet Gurkaynak :: SSRN
The methodological ideas here are suspect.  It is true that there is not much to predict about an in-control system, and what is happening is largely random and so unpredictable, so that even the true model would show low forecasting ability.  The question however is why we are supposed to think that the DSGE _does_ give us good information about counterfactuals.  If you could show that it had much better predictive performance than baselines like constants or random walks during _out-of-control_ periods, that would be something; but they don't.
re:your_favorite_dsge_sucks  dsges  prediction  economics  macroeconomics  time_series  statistics  in_NB  have_read  to:blog 
july 2011 by cshalizi
[0812.0449] Locally adaptive estimation methods with application to univariate time series
"The paper offers a unified approach to the study of three locally adaptive estimation methods in the context of univariate time series from both theoretical and empirical points of view. A general procedure for the computation of critical values is given. The underlying model encompasses all distributions from the exponential family providing for great flexibility. The procedures are applied to simulated and real financial data distributed according to the Gaussian, volatility, Poisson, exponential and Bernoulli models. Numerical results exhibit a very reasonable performance of the methods."
time_series  statistics  estimation  exponential_families  non-stationarity  to:NB 
july 2011 by cshalizi
[0711.3856] Forward estimation for ergodic time series
"The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,...,X_n)$ and for which, ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,...,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and moreover, the error tends to zero in probability."
prediction  ergodic_theory  time_series  statistics  morvai.gusztav  weiss.benjamin 
july 2011 by cshalizi
[1104.3073] Feature Matching in Time Series Modelling
"Using a time series model to mimic an observed time series has a long history. However, with regard to this objective, conventional estimation methods for discrete-time dynamical models are frequently found to be wanting. In fact, they are characteristically misguided in at least two respects: (i) assuming that there is a true model; (ii) evaluating the efficacy of the estimation as if the postulated model is true. There are numerous examples of models, when fitted by conventional methods, that fail to capture some of the most basic global features of the data, such as cycles with good matching periods, singularities of spectral density functions (especially at the origin) and others. We argue that the shortcomings need not always be due to the model formulation but the inadequacy of the conventional fitting methods. After all, all models are wrong, but some are useful if they are fitted properly. The practical issue becomes one of how to best fit the model to data. Thus, in the absence of a true model, we prefer an alternative approach to conventional model fitting that typically involves one-step-ahead prediction errors. Our primary aim is to match the joint probability distribution of the observable time series, including long-term features of the dynamics that underpin the data, such as cycles, long memory and others, rather than short-term prediction. For want of a better name, we call this specific aim feature matching."
to:NB  time_series  statistics  model-checking  to_read 
april 2011 by cshalizi
[0812.2749] Nonparametric inference of a trend using functional data
I guess I've been more or less presuming this was true.  (And I'd have been wrong about the form of the simultaneous CI, actually.)  Worth trying to work into the final exam for The Kids?
curve_fitting  gaussian_processes  time_series  statistics  nonparametrics  have_read  confidence_sets  to_teach:undergrad-ADA 
april 2011 by cshalizi
Efficient probabilistic forecasts for counts - McCabe et al., 2011 - JRSS-B
"Efficient probabilistic forecasts of integer-valued random variables are derived. The optimality is achieved by estimating the forecast distribution non-parametrically over a given broad model class and proving asymptotic (non-parametric) efficiency in that setting. The method is developed within the context of the integer auto-regressive class of models, which is a suitable class for any count data that can be interpreted as a queue, stock, birth-and-death process or branching process. The theoretical proofs of asymptotic efficiency are supplemented by simulation results that demonstrate the overall superiority of the non-parametric estimator relative to a misspecified parametric alternative, in large but finite samples. The method is applied to counts of stock market iceberg orders. A subsampling method is used to assess sampling variation in the full estimated forecast distribution and a proof of its validity is given."  (Dunno about the to_teach tags, I haven't read this yet.)
statistics  prediction  density_estimation  time_series  stochastic_processes  branching_processes  to_teach:data-mining  to_teach:undergrad-ADA 
march 2011 by cshalizi
[1101.0673] Autoregressive Kernels For Time Series
Under this kernel, two time series are similar if they lead to similar vector autoregressions...
kernel_methods  time_series  statistics 
january 2011 by cshalizi
Combining Nonparametric and Optimal Linear Time Series Predictions
ARMA model forecasting, supplemented somehow with nonparametric smoothing of the residuals.  (I haven't read beyond the abstract.)
time_series  prediction  statistics  nonparametrics  to_teach:undergrad-ADA 
january 2011 by cshalizi
[1012.3795] Estimating Networks With Jumps
"We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying nonstationary time series data, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to i.i.d. data from an invariant model widely considered in current literature of structural estimation. In particular, we consider the scenario in which the model evolves in a piece-wise constant fashion. We propose a procedure that minimizes the so-called TESLA loss (i.e., temporally smoothed L1 regularized regression), which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition."
graphical_models  network_data_analysis  time_series  model_selection  statistics  xing.eric 
december 2010 by cshalizi
Rule generation for categorical time series with Markov assumptions
"Several procedures of sequential pattern analysis are designed to detect frequently occurring patterns in a single categorical time series (episode mining). Based on these frequent patterns, rules are generated and evaluated, for example, in terms of their confidence. The confidence value is commonly interpreted as an estimate of a conditional probability, so some kind of stochastic model has to be assumed. The model is identified as a variable length Markov model. With this assumption, the usual confidences are maximum likelihood estimates of the transition probabilities of the Markov model. We discuss possibilities of how to efficiently fit an appropriate model to the data. Based on this model, rules are formulated. It is demonstrated that this new approach generates noticeably less and more reliable rules." --- I should really add some time series stuff to data mining...
data_mining  markov_models  time_series  in_NB  to_teach:data-mining  variable-length_markov_models 
december 2010 by cshalizi
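The claim in the entry above, that rule confidences coincide with maximum likelihood estimates of Markov transition probabilities, can be illustrated with a short sketch. It assumes a fixed first-order chain rather than the variable-length model of the paper, and the function name and toy sequence are made up for illustration.

```python
from collections import Counter

def transition_confidences(seq):
    """Confidence of each rule 'a => b' as count(a followed by b) / count(a).

    For a first-order Markov chain these ratios are the maximum
    likelihood estimates of the transition probabilities.
    """
    pair_counts = Counter(zip(seq, seq[1:]))
    from_counts = Counter(seq[:-1])  # the last symbol has no successor
    conf = {}
    for (a, b), n in pair_counts.items():
        conf.setdefault(a, {})[b] = n / from_counts[a]
    return conf

# Hypothetical categorical time series.
series = list("abababbab")
conf = transition_confidences(series)
```

Here `conf["b"]["a"]` is the confidence of the rule "b is followed by a". A variable-length model would condition on longer contexts only where the data support them.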
[1011.2998] A compact statistical model of the song syntax in Bengalese finch
"Songs of many songbird species consist of variable sequences of a finite number of syllables. A common approach for characterizing the syntax of these ... sequences is to use transition probabilities between the syllables. This is equivalent to the Markov model, in which each syllable is associated with one state, and the transition probabilities between the states do not depend on the state transition history. ... analyze the song syntax in a Bengalese finch. ... the Markov model fails ... Instead, ... include adaptation of the self-transition probabilities when states are repeatedly revisited ... more than one state to the same syllable. ... Mathematically, the model is a partially observable Markov model with adaptation (POMMA). ... supports the branching chain network hypothesis of how syntax is controlled within the premotor song nucleus HVC ..."
birds  grammar_induction  markov_models  time_series  to_teach:complexity-and-inference  to_read  in_NB 
november 2010 by cshalizi
Phys. Rev. E 82, 056206 (2010): Forecasting the evolution of nonlinear and nonstationary systems using recurrence-based local Gaussian process models
"...combining nonparametric Gaussian process (GP) modeling with certain local topological considerations ... for prediction (one-step look ahead) of ... nonlinear and nonstationary dynamics. ... partition ... trajectories into multiple near-stationary segments by aligning the boundaries of the partitions with those of the piecewise affine projections of the underlying dynamic system... alignment is achieved through the consideration of recurrence and other local topological properties ... forecasting in Lorenz system under different levels of induced noise and nonstationarity, synthetic heart-rate signals, and a real-world time-series from an industrial operation known to exhibit highly nonlinear and nonstationary dynamics. ... local Gaussian process can significantly outperform not just classical system identification, neural network and nonparametric models, but also the sequential Bayesian Monte Carlo methods in terms of prediction accuracy and computational speed."
time_series  prediction  non-stationarity  gaussian_processes  re:growing_ensemble_project  to_read 
november 2010 by cshalizi
« earlier      

